
How Convenience Kills Curiosity


I’ve been thinking about the death of curiosity.

Remember when you had to figure out how a new piece of software worked by poking around its menus?

When finding an obscure fact meant wandering through library stacks, accidentally discovering three unrelated interesting things along the way?

When getting somewhere required unfolding a paper map so massive it would never fold back correctly?

Those friction-filled experiences are nearly extinct now. We’ve optimized them away in the name of convenience, and something important has been lost in the transition.

Type a question, get an answer—ideally without even clicking through to a webpage. The algorithm tries to anticipate exactly what you want, then delivers it with surgical precision. This seems like an unalloyed good until you realize what’s missing: the pathway of discovery, the intellectual side-quests, the context that situates knowledge within a broader landscape.

Every product manager in Silicon Valley has been trained to reduce friction. If it takes three clicks, reduce it to two. If it takes two, try one. If users have to think, it’s considered a failure. The ideal interface is intuitive, immediate, invisible. And it works—until it works too well.

Because in the real world, knowledge is earned through movement. Friction. Ambiguity. The old experience of falling into a stack of books at the library wasn’t efficient, but that was hardly the point. You’d go in looking for one answer and come out with five better questions. That’s how curiosity thrives: in the space between expected and unexpected, between map and territory.

UX optimization reorients the human toward delivery over discovery. You’re no longer hunting; you’re being served. Platforms track your behavior and give you more of what you already like. Spotify doesn’t ask you to browse—it predicts your vibe. It’s all so thoughtful, so personalized, so utterly numbing. When systems get good enough at anticipating your preferences, your preferences shrink. The feedback loop closes. Curiosity flattens.

There’s a concept in behavioral science called the “effort heuristic.” It’s the idea that we tend to value information more if we worked for it. The more effort something requires, the more meaning we assign to the result. When all knowledge is made effortless, it’s treated as disposable. There’s no awe, no investment, no delight in the unexpected—only consumption.

Convenience rewires the mind. It makes learning feel like confirmation. It reduces exploration to retrieval. You end up knowing more but noticing less.

It’s not that user experience designers are trying to kill curiosity. They’re trying to help. But help, at scale, becomes architecture. And architecture shapes cognition. The interfaces we use become metaphors for how we think the world works. If your phone always gives you the answer, you stop asking better questions.

We don’t need more information. We have oceans of it. What we need are tools that reintroduce friction in thoughtful ways. Interfaces that don’t just answer us, but provoke us. Not to make things harder for the sake of it, but to remind us that adult curiosity is not a default state. It must be cultivated. And right now, the culture of convenience is starving it.

Westenberg explores the intersection of technology, systems thinking, and philosophy that shapes our future—without the fluff.

Free readers get powerful ideas. Paid subscribers get more:

  • Exclusive in-depth essays
  • Early access to new work
  • Private discussions and Q&As
  • Future digital products and resources
  • The satisfaction of supporting independent thinking

$5/month or $50/year. No sponsors. No bullshit. Just valuable insight delivered directly.

If it makes you think—we're aligned.

Please consider signing up for a paid monthly or annual membership to support my writing and independent/sovereign publishing.



Read the whole story
mrmarchant
14 hours ago

We did the math on AI’s energy footprint. Here’s the story you haven’t heard.


Structure, Routines, Accountability


A Story

I have a challenge assignment each week for my students. It's optional, something they can work on if they're finished with an assignment early. It's always posted on Google Classroom to save time printing and copying. One week the challenge assignment was this problem from Play With Your Math:

[Image: “Persistence,” problem 22 from Play With Your Math]
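For readers unfamiliar with the term, the puzzle involves (multiplicative) persistence: the number of times you must replace a number with the product of its digits before reaching a single digit. A minimal sketch in Python (the example value is deliberately chosen so it doesn't spoil the puzzle's answer):

```python
def persistence(n: int) -> int:
    """Count the digit-product steps needed to reduce n to a single digit."""
    steps = 0
    while n >= 10:
        product = 1
        for digit in str(n):
            product *= int(digit)
        n = product
        steps += 1
    return steps

# 39 -> 27 -> 14 -> 4, so the persistence of 39 is 3
print(persistence(39))  # → 3
```

Finding a number with a given persistence is then a search through candidates, which is exactly the part worth doing by hand rather than googling.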

One student had finished our quiz early. She was a bright and motivated student — she ended up taking a year of math online over the summer to accelerate in high school. I suggested she look at the challenge assignment. I was wandering the room but kept an eye on her as she opened up the link. She read the problem and, without missing a beat, googled "number with persistence of 4." The answer showed up (I won't spoil it here if you want to play with the problem), she typed it in, and moved on.

Two things strike me about this story. First was how casual she was. She googled the answer without thinking twice, as if she did this all the time. As if there was no purpose in the problem I had posed besides finding an answer and moving on while doing as little thinking as possible. And second, this was one of my more motivated students. If she was googling the answer, what was everyone else doing?

That moment happened three years ago. I'm a bit less naive now. I put a lot more structure in place for my students. I also use internet-connected devices a lot less. And still, I see this type of instinctive googling, instinctive avoidance of thinking, all the time.

Here’s a truth we don’t like to admit: if we put students on internet-connected devices, assign them some tasks, and don’t monitor what they’re doing, most kids will google or ChatGPT the answers. This isn’t a hypothetical. If you wander through some middle school and high school classrooms at a typical school, that’s exactly what you will see. It’s often under the guise of “ownership” or “independence” or “critical thinking.” It’s not every kid cheating, but it’s far too many.

Learning Things Is Good

I'll put a very brief stake in the ground here. I think learning things is good. Math is worth learning. So are lots of other things. I know we have little computers in our pocket that can look up vast amounts of knowledge in seconds. That doesn't mean knowledge is bad. Knowledge is what we think with, and we can only think with knowledge that's in our minds. Maybe I'll write a longer post sometime defending knowing things. If you disagree, feel free to stop reading.

A Headline

That story about my student googling answers came to mind when I saw this headline in New York Magazine:

[Screenshot: a New York Magazine headline about students cheating their way through college]
This is news? Of course they're cheating their way through college. They're already cheating their way through high school. In some tech-heavy places they're cheating their way through middle school.1

There's another truth we don't like to admit in education. We like to think education is about inspiring students, about developing curiosity and an innate love of learning. School should absolutely try to do those things. But the elements of school that drive most learning are more pedestrian. Structure. Routines. Accountability. Structure breaks the complex task of learning down into lots of little manageable chunks that build toward larger goals. Routines get students into positive habits. Accountability communicates that we care about student learning, and we will intervene if students aren’t learning. Those are the drivers of most learning that happens in schools. They're also the solution to lots of challenges in education. Every student has a phone in their pocket? Let's give them some structure, routines, and accountability.2

The Goal

The goal of school is to get students to think. Thinking is what causes learning. There are lots of shortcuts to avoid thinking. Those shortcuts are tempting. They're tempting for everyone. Lots of adults struggle to learn things because they don’t have the structure, routines, and accountability they need to stick with a topic long enough to make progress.

Structure, routines, and accountability aren't hip or progressive. But when it comes to AI, they are the answer. Provide more structure. Don't just say, "write me an essay." Break that assignment into pieces. Handwrite some of them. Engineer more routines. Only use technology at specific times, for specific purposes. Practice what students should do when they’re stuck. Increase accountability. We care about your learning. Because we care about your learning, we are making some changes to make sure you are learning and not outsourcing your thinking to the internet.

None of that is easy. It’s a lot of work for teachers. It’s definitely easier for me as a math teacher, but even in math you might be surprised how often students will google answers without structure, routines, and accountability. It’s also not in the DNA of universities, and I’m not optimistic they will change to adapt to the moment. The specifics are tricky — they will depend on your resources, knowledge, and preferences.

But stepping back, there are some broad ideas I wish we all agreed on:

  • The purpose of school is to get students to think. Thinking is what causes learning.

  • Googling and ChatGPT-ing answers is a way to avoid thinking.

  • Far too many students will avoid thinking if we don’t take steps to stop them.

  • The best ways to get students to think are structure, routines, and accountability.

  • There’s no easy answer for what your structure, routines, and accountability should look like. They’ll work best if you collaborate with the teachers around you. It will take time and effort and trial and error. But the results are worth it.

1. I’ll qualify this with a reminder that not all students are cheating. Not all students are googling the answers whenever they’re unsupervised. But the goal of school is to educate all students, and far too many are cheating.

2. My school is finally (finally!) banning cell phones next year. We had a meeting two weeks ago to hash out the structure, routines, and accountability. It will be a lot of work at first. I’m excited; it will make a huge difference for us.


i want to get off the screen, too


hello.

this is an extension of a post i shared a while ago. it was a short, simple list of things to do instead of being on your phone. nothing urgent or groundbreaking. just a quiet invitation to come back to yourself.

it resonated more than i expected. i think that’s because so many of us are exhausted by constant noise. we’re tired of being pulled in a hundred directions, of filling every still moment with distraction. we’re not necessarily looking for discipline or detox, we’re looking for depth and presence. for something that feels like ours again.

this version goes a little deeper. it’s still gentle, but more layered. it’s not about productivity or self optimization or getting your life together. it’s not about becoming someone new. it’s about remembering how to live in your own life. how to be curious again. how to stretch your mind, tend to your body, take care of your space, and reconnect with something slower and more sacred.

we don’t need to quit our phones or escape the world. but we can choose to step away now and then. we can make space for boredom, for stillness, for focus. we can give ourselves the gift of our own attention.

what follows is a list. not a checklist or a challenge. just a collection of small, intentional things you can do when you want to feel more awake. more grounded and more like yourself.

you don’t have to do all of them. or any of them.
but if even one of them makes you feel more alive or more in touch with the present, then that’s enough.

(this post is free, but if you enjoy this newsletter, consider becoming a paid subscriber and be part of a smaller circle where things feel a little softer, a little more personal. you’ll get early access to my youtube videos and a weekly media consumption roundup filled with articles, video essays, podcasts, and other references to make you smarter. i’d love to have you there)

milk fed is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.


for your mind
things that stretch your intellect, sharpen your thinking, and make you feel awake in the best way.

read a book that challenges you
choose something a little difficult. a classic novel, a philosophical essay, a book with long sentences and big ideas. read slowly. keep a pencil nearby. you don’t have to understand every word. you just have to keep going.
start with: the ethics of ambiguity by simone de beauvoir, the book of disquiet by fernando pessoa, or the death of ivan ilyich by leo tolstoy.

study a topic you’ve always wanted to understand
make it impractical. make it obscure. maybe you start watching lectures on gothic architecture or learning the basics of quantum theory through youtube explainers. use free tools like coursera or open yale courses. let your mind stretch in a direction that doesn’t need to be useful.

learn chess
chess is meditation in strategy form. it teaches focus, pattern recognition, and patience. start with a free app like lichess or chess.com, which both have excellent beginner tutorials. watch a few short videos by gothamchess on youtube if you want to get into it fast. play against the computer. lose a few times. you’re still winning.

annotate poetry
you don’t need to analyze. just sit with the words. underline the lines that feel like bruises or breath. write little notes in the margins. you’re not performing intelligence. you’re just being present.
start with: sappho, rainer maria rilke, or louise glück.

write a personal essay or a letter you’ll never send
choose a moment in your life and narrate it like you're telling the story for the first time. write a letter to your younger self, or to someone you miss, or someone who hurt you. let it be honest and unfinished. burn it or save it. no rules.

do a page of logic puzzles or sudoku with pen and paper
challenge your mind to move differently. sit at a table, make it quiet, treat it like meditation.

learn the basics of a new language
language learning is one of the best ways to gently rewire your brain. it wakes something up. it teaches you to listen differently. start small. write out simple phrases in a notebook. say them out loud, even if your accent is terrible. duolingo is a good place to begin, especially for structure and streaks. if you’re learning mandarin, try red note or skritter for handwriting and tone practice. tandem and hellotalk are great for speaking with real people and building conversation confidence. if you're someone who needs immersion, switch your phone settings to the new language. watch children's shows in that language with subtitles. label objects in your room with sticky notes. let the language live around you.

make a list of questions you’ve never asked anyone and answer them for yourself
questions like: who do you feel safest around, and why? what part of your personality is performative? when was the last time you felt proud of yourself? be brave in the asking. be honest in the answering.

read something that disagrees with you
pick up a book or essay that challenges your beliefs. not to change your mind immediately, but to sharpen your reasoning. underline what makes you pause.

go to a museum alone and bring a notebook
give yourself at least an hour. don’t try to see everything. just choose one room or one piece to sit with. sketch it, describe it, or write down how it makes you feel. pretend you’re a student again. this is your field trip.

create your own mini syllabus
pick a theme: grief, memory, mythology, time, desire. choose three books, two essays, a film, and a podcast. give yourself a week or a month to move through it. keep notes. write a reflection when it’s done. this is your private university. you’re the professor and the student.

read the original source instead of the summary
go to the text. not the tweet about it. not the blog post. not the thinkpiece. read the actual thing people are quoting.
it’s harder, slower, and always worth it.


for your body
things that ground you in sensation and get you out of your head.

practice real self-care
not the kind that gets marketed to you. not the bubble baths, the expensive skincare routines, the luxury candles, though those can be lovely too. real self care is brushing your teeth even when you're depressed. showing up for your doctor’s appointments. drinking water. texting your friend back. getting enough sleep. it's asking what your body actually needs, not what you’ve been told to buy. real self care is also collective. it’s making soup for someone sick. offering to babysit. checking in on the people you love without waiting for them to ask. it's being kind even when it’s inconvenient.

take a walk with no destination, just a jacket and a thermos
walk like you have nowhere to be. listen to the way your feet hit the ground. carry something warm (or cold) to drink. look up.
you’re not exercising. consider this a way to return to the world.

stretch while listening to jazz or rain sounds
choose music that feels like it could stretch with you. move slowly and hold each pose a little longer than you think you should.
let your body take up space without asking permission.

tidy a drawer or shelf like it’s a meditation
not because it’s messy. but because it’s yours. take everything out, wipe it clean, decide what goes back in.

put on a face mask and read one chapter of a book
give your skin some love, but don’t scroll while you wait. lie down. read something that moves slowly. your body is not a project, it’s a place to live in.

practice slow breathing while lying in the grass or on the floor
inhale for four. hold for four, exhale for four, repeat. put a hand on your chest or your stomach. feel what’s real. let the rest go.

light a candle and just sit with it
watch the flame. let the room be quiet. maybe journal or read a book.


for your home
things that make your space feel more lived-in and loved. rooted in ritual, softness, and imperfection.

cook something with intention and eat it at the table
choose a recipe that feels comforting, even if it’s simple. chop slowly. stir gently. let the kitchen smell warm. set the table, even if it’s just for you. light a candle. pour water into a glass. sit down and eat without a screen. this is nourishment. not just for your body, but for the room you live in. cooking makes a house feel alive.

rearrange a corner of your room
don’t redecorate. just shift things around. move a chair, add a scarf to a surface, place a book where it can be seen. this is not about aesthetics, it’s about new energy. a small shift can change everything.

wipe down a surface you always overlook
dust the baseboards, clean the corners of a mirror, wipe a windowsill. not to impress anyone. just to say, this matters. care is often quiet, invisible, and sacred.

make a seasonal altar
gather what feels like now. a bowl of lemons, a feather from a walk, dried rice, a rock that feels heavy in your hand, incense ash from a morning you needed peace. arrange it on a small plate, a windowsill, a bedside table. let it shift with the seasons. let it remind you that time is moving, and you’re moving with it.

open the windows and play a vinyl or old playlist
air out your space like you’re letting something go. open every window and let the wind move through the room. play something familiar, maybe an old playlist, a record with static, a forgotten song from years ago. let the music fill the corners. this is how you shift the energy of a room. no deep clean required. just breath and sound and memory.

dry orange slices or press flowers in a book
slice the oranges thin, lay them on parchment, let them turn golden in the oven or the sun. press flowers between heavy pages and forget about them for a week. these small rituals of preservation are acts of patience. things that force you to wait, and in waiting, to appreciate.

hang something small and beautiful that reminds you to look up
a wind chime, a dried bouquet, a sun catcher, a paper crane. let it dangle. let it move when the air shifts. give yourself a reason to pause mid thought, mid scroll, mid sentence.

do your laundry slowly. fold it like a ritual
choose a quiet morning. sort your clothes with care. fold each piece slowly and smooth it out like you’re offering comfort. laundry can be sacred if you let it be.

make tea with full attention
choose your tea like you’re choosing how you want to feel. something grounding, something bright, something soft. boil the water and stay close. listen to it hum and rise. watch the steam curl into the air. pour it over the leaves and watch them swirl, then settle, then unfurl. this is the moment. not the drinking, not the caffeine, this.

embrace imperfection
cracked bowls, chipped edges, soft clutter. find beauty in what’s worn, incomplete, and real. let your home feel like a poem, not a showroom.


for your inner child
things that are quiet, nostalgic, and not at all “productive.”

color with pencils or crayons, no goal in mind
open a fresh sheet of paper and let your hand wander. don't worry about staying inside the lines. use whatever colors call out to you, and just let the markings unfold like a secret map of your feelings. you can also opt for a coloring book.

reread a childhood favorite or watch a comfort movie
choose that book or film you once knew by heart. allow the familiar words or scenes to transport you to simpler days. remind yourself that there is magic in nostalgia, and sometimes it’s the light of childhood that heals us.

bake cookies and eat the dough
mix flour, sugar, and a sense of whimsy. relish the tactile pleasure of kneading dough, then steal a taste before it’s even in the oven. this is a little rebellion of sweetness. a moment where you honor spontaneity and taste without guilt.

play a board game or puzzle, especially with someone you love
sit down with a jigsaw, a classic board game, or even a simple card game. allow playful competition or collaboration to spark joy. these shared moments remind you that even in adult life, there is room for laughter and wonder.

write a short story with a magical animal in it
allow your imagination to weave a tale about a creature both familiar and fantastical. let your characters be kind, brave, or even a little mischievous. this isn’t about crafting perfection, it's about rediscovering the wild wonder of your imagination.


for your spirit
things that reconnect you to meaning, mystery, and that quiet sacredness that’s hard to name.

journal about a question you’re afraid to answer
what are you avoiding? what don’t you want to name? write it down. then answer it anyway. no one has to read it but you. there’s courage in writing things you’re not ready to say out loud.

take yourself on a solo date to the library, garden, or church
go somewhere still and beautiful. take a book, bring a notebook, don’t rush. notice everything. we don’t always need conversation to feel connected, sometimes presence is enough.

sit in silence for ten minutes. breathe. observe. that’s it.
no music. no guidance. no goals. just sit and notice what it feels like to be in your body, in this moment. silence can be uncomfortable at first, but then it starts to feel like coming home.

tend to something small and alive
water a plant. sweep your doorstep. feed the birds. light a candle for someone you love. care for something that needs you.


a final note

you don’t need to do all of these. or even most. maybe you just pick one. maybe you just think about picking one, and that’s enough.

if you’ve been feeling distracted or scattered, overstimulated or a little hollow around the edges, try trading your screen for something slower. something quieter. something that brings you back to yourself.

your mind deserves your attention. your body does too. and maybe there’s still a small part of you, curious and gentle and waiting, hoping you’ll come outside and play.

okay, that’s all i have for you today.

if you’re not ready to become a paid subscriber and you have the capacity to leave a tip, that would be so appreciated.

i love you.

bye.

(follow ig, tiktok, youtube, pinterest and spotify for more)





Something rotten in AI research

[Embedded video: Eminem’s “The Real Slim Shady” (“This looks like a job for me”)]

Why is it so hard to find trustworthy studies of the impact of AI?

Let’s start with the scandal that unfolded last week. In November 2024, a PhD student at MIT released a pre-print of a major study titled, “Artificial Intelligence, Scientific Discovery, and Product Innovation.” The headline finding was that, at a major unnamed US manufacturing firm, researchers who used AI “discovered 44% more materials, resulting in a 39% increase in patent filings and a 17% rise in downstream product innovation.” The study gained major attention from important media outlets—including The Wall Street Journal, The Atlantic, and Nature—and was hailed as providing rigorous, empirical evidence that scientists using AI could rapidly improve the discovery of new materials.

A huge win for co-intelligence, if you will.

Just one problem: It was bullshit. Pure academic fraud, it seems. In one of the harshest press releases from a major research university you’ll ever read, MIT stated that it has “contacted arXiv to formally request that the paper be withdrawn…MIT has no confidence in the provenance, reliability or validity of the data and has no confidence in the veracity of the research contained in the paper…The author is no longer at MIT.” They concluded, “We award him no points, and may God have mercy on his soul.” (Just kidding.)1

Meanwhile, as this controversy was breaking, a separate mini-scandal developed on an ed-tech email listserv that I’m on, where an academic pointed to a new meta-analysis of ChatGPT use in education, published in Nature’s Humanities and Social Sciences Communications journal. The headline findings:

This study aimed to assess the effectiveness of ChatGPT in improving students’ learning performance, learning perception, and higher-order thinking through a meta-analysis of 51 research studies published between November 2022 and February 2025. The results indicate that ChatGPT has a large positive impact on improving learning performance (g = 0.867) and a moderately positive impact on enhancing learning perception (g = 0.456) and fostering higher-order thinking (g = 0.457).

These are stunning results for an education intervention—trust me when I say an effect size of .867 on student learning is massive. So much so that George Siemens, who writes a newsletter titled “Sensemaking, AI and Learning (SAIL),” immediately argued the findings are “so substantive…that universities need to figure out what cognitive escalation looks like in the future and how metacognition will be taught and emerging literacies, outside of AI literacies, will be addressed.”
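For context on the numbers being quoted: Hedges' g is a standardized mean difference, the gap between treatment and control means divided by a pooled standard deviation, scaled by a small-sample correction. A rough sketch (the function and sample values below are illustrative, not taken from the meta-analysis):

```python
import math

def hedges_g(mean_t: float, mean_c: float, sd_t: float, sd_c: float,
             n_t: int, n_c: int) -> float:
    """Standardized mean difference with Hedges' small-sample correction."""
    # Pooled standard deviation across treatment and control groups
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                          / (n_t + n_c - 2))
    d = (mean_t - mean_c) / pooled_sd             # Cohen's d
    j = 1 - 3 / (4 * (n_t + n_c) - 9)             # small-sample correction factor
    return d * j

# Two classes of 30: treated mean 85 vs control 76, both SD 10.
# d = 0.9 shrinks slightly to g ≈ 0.89 after correction.
print(round(hedges_g(85, 76, 10, 10, 30, 30), 2))  # → 0.89
```

An effect of g = 0.867 would put the average treated student above roughly 80% of the control group, which is rare for any education intervention and is exactly why such results warrant the scrutiny that follows.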

Perhaps you can guess where this is going. Shortly after this meta-analysis was shared, Steve Ritter, the founder and chief scientist at Carnegie Learning, mercifully did a spot check of the underlying studies and found, um, issues. With his permission, I now share his email to the listserv in full (my emphasis added):

I’d like to believe this result, and I was curious about what a meta-analysis of GPT usage would look like, given that there are so many possible usage models. The reported effect sizes are really large.

The first thing I noticed in the paper (Table 3) is that most studies have Ns that look suspiciously like “one class for the experimental condition and one for the control.”

So I picked one of the included papers to examine – Karaman & Goksu (2024) – sorta at random. The meta-analysis classifies Karaman & Goksu as using secondary students (one reason I picked it) and using GPT as an “intelligent learning tool” for “tutoring.”

None of this is correct.

The paper looked at the impact of using GPT to generate lesson plans that were used to teach third-graders (not secondary students). It compared a section taught by one of the authors using these GPT-generated lesson plans to a randomly-selected class taught by a different teacher. They used students as the unit of analysis and found no significant difference. The meta-analysis reports an effect size of 0.07, but I’m not sure where they got that (they gave pre- and posttests, but the main analysis just looked at posttest scores).

I guess this paper technically meets the inclusion criteria for the meta-analysis, but that just shows that their inclusion criteria need to be revisited. Given that one paper I looked at is so clearly misclassified, I’m not sure I trust anything in the meta-analysis.

Another researcher on the listserv then chimed in with the suggestion that “people glance through a few of the other studies that were included here. It's a perfect example of garbage in garbage out.”

So much for cognitive escalation.

[GIF: Ron Burgundy saying “Boy, that escalated quickly”]

This keeps happening. Back in January, an economist at the World Bank wrote a blog post regarding a pilot study of afterschool tutoring in Nigeria involving AI that led to “striking” learning improvements, “the equivalent to nearly two years of typical learning in just six weeks.” Again, these sorts of gains should on their face raise major red flags, but what’s worse is that the study involved human teachers acting “as ‘orchestra conductors’…who guided students in using the LLM, mentoring them, and offering guidance and additional prompting.”

Oh you don’t say! Unsurprisingly, the kids who received this additional “orchestrated support” from teachers performed better on tests than the kids who received nothing at all, but who knows what role AI played. Of course, if we had a formal write-up perhaps we could tease this out, but despite promises that one would be forthcoming, nothing appears to have been published.2

Unfortunately, our story doesn’t stop here.

One study that has garnered a great deal of attention in the AI-in-education space was published on arXiv by researchers at Stanford in October 2024 titled, “Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise.” This effort involved a randomized control trial where some human tutors were given access to AI to support them while tutoring high school students in math (via text-based chatting), whereas other tutors were not. More specifically:

This study is the first randomized controlled trial of a Human-AI system in live tutoring, involving 900 tutors and 1,800 K-12 students from historically under-served communities. Following a preregistered analysis plan, we find that students working with tutors who have access to Tutor CoPilot are 4 percentage points (p.p.) more likely to master topics (p<0.01). Notably, students of lower-rated tutors experienced the greatest benefit, improving mastery by 9 p.p. We find that Tutor CoPilot costs only $20 per-tutor annually…Altogether, our study of Tutor CoPilot demonstrates how Human-AI systems can scale expertise in real-world domains, bridge gaps in skills and create a future where high-quality education is accessible to all students.

This study received a lot of attention, with write-ups in MIT Technology Review, Education Week, The 74, K-12 Dive, and MarkTechPost, among others. I was even quoted in The 74 story, praising the study design and observing that, if its findings held up, it offered promise for future tutoring efforts.

But after closer review, aided by other education researchers, I no longer see that promise.

Before continuing, I want to be crystal clear that there is zero evidence, and I am not suggesting, that any of the researchers involved in this study have committed any form of academic fraud along the lines committed by the PhD student at MIT. Nor am I contending that they undertook a shoddy empirical investigation a la the ChatGPT meta-analysis or World Bank pilot.

However. What I am prepared to say is that the authors of this study presented their findings in a way that deliberately plays up the positive impact of AI, while minimizing the evidence that calls into question the value of using AI to enhance tutoring. And I find this troubling. To understand why requires diving a bit into the research weeds, so please bear with me.

Let’s start with the core finding outlined above—that students who worked with tutors who had access to AI-enhanced tutoring (aka the Tutor CoPilot) were 4 percentage points more likely to master topics overall, and that students with lower-rated human tutors improved mastery by 9 percentage points. These are modest but non-trivial gains that were calculated through the use of “exit tickets,” essentially mini-tests given at the end of tutoring sessions.

When this study first crossed my desk back in October, I found it slightly puzzling that this was the outcome measure the researchers used, given that they also collected data on how students did on their end-of-year summative tests called MAP—more on this momentarily. Nonetheless, I assumed they had reasons for making that research design choice, and hey, it’s their study, they get to choose their measure.

But that is not exactly what happened. As is good practice, the researchers here pre-registered the primary outcome measures they intended to use, which included a healthy dose of survey measures of tutors and students, analysis of language use within the tutoring sessions, and—ahem—the students’ spring MAP scores.

So how did that go? Although this was a math tutoring intervention, students who received AI-enhanced tutoring did worse on the end-of-year MAP math test, though the results were not statistically significant. (Oddly enough, students did slightly better on the MAP reading test, but here too the results were not statistically significant.)

In other words, there was no measurable impact of AI-enhanced tutoring on the primary pre-registered outcome measure of MAP test results. Yet this seemingly important finding is reported on pages 31 and 32 of the 33-page study. In an appendix.

Now, to be completely fair, in their pre-registration the researchers did include a very long list of additional data they planned to collect beyond their primary outcome measures:

Blink and you might miss it, but exit tickets are in there—yet plainly they are not what the study was designed around. In my view, to later feature exit-ticket data because it provides modest evidence of a positive AI effect contravenes the spirit if not the letter of the pre-registration process, which after all is to prevent data hacking to produce novel results.

As a sanity check, I asked two education researchers to independently review this study to see if my concerns were warranted. Here is what the first said (my emphasis added):

If I was reviewing this for a journal, I would say revise and resubmit, and would not accept for publication unless they centered the MAP score as the outcome of interest. That said, I think it's very publishable! But from a policy / practice perspective, I probably wouldn't use Tutor Co-Pilot, given the disappointing MAP results.

My biggest concern has to do with using the exit ticket as the outcome….There are several reasons why the exit tickets are not the appropriate outcome to focus on:

  • Post-session mini-tests (exit tickets) are not mentioned in the pre-registration plan (Spring MAP is in the pre-registration plan).3

  • Unless I missed it, the authors provide no evidence of reliability of exit tickets.

  • Exit tickets are (I think?) a researcher-developed outcome measure. Potentially meaningful differences in effect sizes can be obtained from measures created by study authors, and these measures may not be as informative to policymakers and practitioners as independent measures (from WWC V5 updates summary).

  • If exit tickets are the outcome, it’s not really an Intent to Treat study (causal estimate of impact) because students don’t have exit tickets if they don’t do the tutoring. You can do a real ITT estimate with the MAP data because that would be available for all students in the study, regardless of whether they engaged in tutoring.

And the second (again with my emphasis):

Is the overall pattern of results encouraging? Not obviously, even if taken at face value. On one side of the ledger, positive and statistically significant results for exit tickets. On the less-good side, insignificant negative end-of-year effects, and insignificant, mixed-direction results for participation and surveys. Everything hinges on what you think the exit tickets are telling us…

What they really need is an explanation for the divergence between the exit ticket results and the [end of year] MAP results and that's probably hand-waved away a little too easily….I think it's actually pretty easy to come up with a story where 1) exit tickets are valid measures of math skill in general but 2) the AI component helps students complete exit tickets without learning anything (making the exit tickets a less valid measure of learning).

In fact, their lit review notes some possibilities here (e.g., “[Large-Language Models] often generate bad pedagogical responses, such as giving away answers".)

There are other yellow-to-red flags too. For one thing, the researchers claim their study involved approximately 1,800 students—but that refers to the number of kids who were offered tutoring, not the number who received it. As it turns out, there was a 38% attrition rate over the seven-week intervention, and many students only participated in one or two tutoring sessions. The question of why student participation rates were so low is barely mentioned. Perhaps kids, being humans, do not like to learn by typing back and forth on the computer?
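The attrition problem and the reviewer’s intent-to-treat point are two sides of the same coin, and a toy simulation makes the mechanism concrete. The numbers below are entirely hypothetical (not the study’s data): the simulated tutoring has zero true effect, but weaker students drop out of the treatment arm more often—so an analysis restricted to completers (the only students with exit tickets) manufactures a positive effect, while the intent-to-treat comparison, which uses every randomized student, correctly finds nothing.

```python
import random

random.seed(0)

# Hypothetical simulation: zero true treatment effect, but low-ability
# students assigned to tutoring are more likely to drop out before
# producing an outcome (mimicking exit tickets that only exist for
# students who attended sessions).
students = []
for i in range(10_000):
    ability = random.gauss(0, 1)
    treated = (i % 2 == 0)          # random assignment
    score = ability                  # outcome = ability; no treatment effect
    if treated:
        # weaker treated students drop out far more often
        dropped = random.random() < (0.5 if ability < 0 else 0.1)
    else:
        dropped = False
    students.append((treated, score, dropped))

def mean(xs):
    return sum(xs) / len(xs)

treated_all = [s for t, s, d in students if t]
control_all = [s for t, s, d in students if not t]
treated_completers = [s for t, s, d in students if t and not d]

# Intent-to-treat: compare everyone as randomized. A MAP-style
# end-of-year test exists for all students, so this is feasible.
itt = mean(treated_all) - mean(control_all)

# Completers-only: compare only treated students who stuck around,
# as an exit-ticket-based analysis implicitly does.
completers_only = mean(treated_completers) - mean(control_all)

print(f"ITT estimate:             {itt:+.3f}")   # near zero
print(f"Completers-only estimate: {completers_only:+.3f}")  # spuriously positive
```

With these made-up parameters the completers-only gap comes out clearly positive even though the intervention does nothing, purely because dropout is correlated with ability. That is why an outcome measure available for all randomized students matters.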

For another, the study reports the cost of delivering AI tutoring as only $20 per tutor annually, but that appears to exclude the extensive training that was provided to the human tutors to prepare them to use AI. As one of my anonymous reviewers noted, “the costs should include the tutor training (and the trainer’s time, and any materials or facilities needed for training).” Perhaps not coincidentally, not long after this report was published, the company that provided the AI-enhanced tutoring went out of business.

Ok, so. Is all this just academic sniping? I think not, and here’s why:

Just two weeks ago, I was invited to a conference at Stanford regarding the future of tutoring, one attended by some of the researchers behind this very study. During one session, I listened to a program officer from the Overdeck Family Foundation, a major funder of education research and other efforts related to tutoring, make clear that as federal funding for tutoring dries up, his employer will soon be pushing for tutoring companies to become “more cost efficient” using technology (read: AI). The Overdecks get to choose how to spend their money, of course, but what they do not get to do is claim that there is robust empirical evidence suggesting that AI-enhanced human tutoring is effective. There isn’t.

Research is a public good, and one that’s vital to producing knowledge we can use to inform policy and practice. With Silicon Valley in full AI hype mode, we need independent studies of AI in education that we can trust to fairly evaluate this technology’s impact, positive or negative.

We’re not getting that right now and it’s a problem.


Addendum: I highly recommend this essay by Nick McGrievy, who recently received his PhD in physics from Princeton, on the limits of AI science. As he aptly notes, “be cautious about taking AI research at face value. Most scientists aren’t trying to mislead anyone, but because they face strong incentives to present favorable results, there’s still a risk that you’ll be misled. Moving forward, I would have to be more skeptical, even (or perhaps especially) of high-impact papers with impressive results.”

Addendum #2: After publishing this essay, Mike Kentz pointed me to this impressively comprehensive investigation into AI-in-education studies conducted by Wess Trabelsi, who reports being psychologically scarred by the “incompetence, lack of integrity, and downright confirmed fraud” in this space. It’s bad!


1

For more background on why this study should have raised red flags from the outset, check out this essay from the appropriately named BS Detector. Paul Bruno also shared a thoughtful thread exploring the challenges of sniffing out academic fraud via peer review.

2

Did these obvious methodological flaws stop Ethan Mollick from excitedly sharing the results with his (checks notes) 275,000 followers on LinkedIn? Of course not. And it’s far from the first time Mollick has promoted questionable AI research, as I’ve covered previously.

3

This statement is correct as to primary outcome measures, but as noted exit tickets were technically preregistered.
