One of the classic challenges of teaching math: you teach a skill, students can do it in class, then the next day you ask them to use the skill again and it’s as if they have never seen it before in their lives.
What’s a teacher to do?
Most teachers agree students need practice. But how much practice is enough? Put a bunch of smart, experienced teachers in a room and they can give you a pretty good guess for how much practice students will need, but that guess will never be right for every classroom.
I take an empirical approach: practice until learning is secure in long-term memory.
First, I want to be clear about what I’m talking about. This is a strategy for improving retention for procedural skills. Think multi-digit subtraction, or long division, or solving a two-step equation, or finding a derivative using the chain rule. This isn’t about conceptual understanding, or how to initially introduce those skills. The problem here is that you’re teaching a procedural skill, students can do it one day, but they forget it the next.1
My Approach
I teach the skill day one. I have students practice a bit.
Day two, I ask students to try the skill again to check for understanding. I typically use mini whiteboards for this. No scaffolding, no hints, I just want to see how many kids remember. Then, I’ll model, or we’ll look at an example together, or we’ll scaffold up to the skill with some easier questions. Then practice. This isn’t a whole class period: there’s the check for understanding, then a reminder and a bit of scaffolding, then maybe five minutes of practice.
Day three, we do the same thing again. I start with a check for understanding. The key question: how many kids get it right? Let’s say on day two it was 30%. Day three, it might be 60%. The goal is to get that number a little higher each day.
Wash, rinse, repeat. My goal is to practice until students can reliably retrieve the skill the next day.2
When the success rate is low, I’m likely to choose a whole-class worked example and scaffolding. Once the success rate gets up to 80-90%, I start focusing on those final few students. Most students now know it, so I check in individually with the students who missed the check for understanding, try to figure out how to help them, and then we all practice for five minutes.
The last 10% can be stubborn. If the skill is important, I try to stick with it until we get to 100%. Are some students ready to move on? Sure. But a quick chunk of practice isn’t doing any harm, and that extra practice makes a huge difference for the students who need it most.3
Once we reach 100%, we shift into retrieval practice mode. Instead of five-minute chunks of practice, I’m mixing these questions into a Do Now or mixed practice assignment. Students should keep seeing them regularly, but one or two questions at a time is enough to keep the learning secure in long-term memory.
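For readers who like to see a routine written down, here is a minimal Python sketch of the daily decision described above. The function name, the return labels, and the 0.8 cutoff are illustrative assumptions. The post only says the shift to individual check-ins happens somewhere around 80-90%, so treat the cutoff as a dial, not a rule.

```python
def next_step(correct: int, total: int, individual_cutoff: float = 0.8) -> str:
    """Pick the follow-up to a next-day check for understanding.

    `individual_cutoff` is an assumed threshold, not a number from the post.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    rate = correct / total
    if rate >= 1.0:
        # Everyone retrieved the skill: fold it into Do Nows / mixed practice.
        return "retrieval practice"
    if rate >= individual_cutoff:
        # Most students have it: check in with the few who missed it.
        return "individual check-ins"
    # Success rate is still low: reteach to the whole class.
    return "whole-class worked example"


# Example: 9 of 30 students got the check for understanding right.
print(next_step(9, 30))
```

In every case except the last, the follow-up is still paired with about five minutes of practice; the function only chooses what happens between the check and the practice.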
What’s Happening Under the Hood
I think it’s helpful to have a mental model for what’s happening in my students’ minds during this process.
Cognitive scientists delineate two different types of memory strength: retrieval strength and storage strength.
When you first learn something, retrieval strength is high. If I tell you that 18 x 8 = 144 and then ask you for 18 x 8, you will have high retrieval strength for that math fact. You won’t have high storage strength, and the learning fades quickly. Here’s what that looks like:
If I ask you for the value of π, you can probably tell me it’s about 3.14. You’ve seen it plenty of times before. You can retrieve it from memory, and that memory is durable. Retrieval strength and storage strength are both high. Here’s what that looks like:
Let’s think back to day one. Students can perform a skill. They get a bunch of questions right in a row, but they forget quickly. In that moment students have high retrieval strength, but low storage strength. They can do it in the moment, but it doesn’t quite stick. Here’s a gif to represent this situation. After practicing a bit, retrieval strength is high, but we haven’t added much to storage strength and retrieval strength fades quickly.4
The best way to improve storage strength is counterintuitive: forget a little bit, then practice again. It looks like this:
That’s the purpose of day two, and day three, and so on. Once you’ve practiced something 10 times in a row, the 11th time isn’t adding much. But if you practice 10 times, wait a day, then practice again — that does much more to improve storage strength.
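To make the contrast concrete, here is a rough Python sketch of a toy model in this spirit. It is not the model behind the figures in this post. The exponential decay, the `gain` value, and the starting storage strength are all assumptions chosen only to show the qualitative pattern: practicing again while retrieval strength is still at its peak adds nothing to storage strength, while practicing after some forgetting adds a lot.

```python
import math


def retrieval_after(days: float, storage: float) -> float:
    """Retrieval strength after `days` without practice, starting from 1.0.

    Assumed form: exponential decay that slows as storage strength grows.
    """
    return math.exp(-days / storage)


def practice(retrieval: float, storage: float, gain: float = 2.0) -> float:
    """Return the new storage strength after one practice session.

    Assumed form: the more retrieval has faded, the bigger the gain.
    """
    return storage + gain * (1.0 - retrieval)


def simulate(gaps_in_days, storage: float = 0.5):
    """Practice once after each gap.

    Returns the retrieval strength seen at the start of each session
    (the "check for understanding") and the final storage strength.
    """
    checks = []
    for gap in gaps_in_days:
        r = retrieval_after(gap, storage)
        checks.append(r)
        storage = practice(r, storage)
    return checks, storage


# Ten more repetitions right away versus four sessions a day apart.
massed_checks, massed_storage = simulate([0] * 10)
spaced_checks, spaced_storage = simulate([1] * 4)
print("massed storage:", massed_storage)
print("spaced storage:", spaced_storage)
print("spaced next-day checks:", [round(r, 2) for r in spaced_checks])
```

Under these assumptions the spaced schedule ends with higher storage strength, and each next-day check comes back a little stronger than the one before, which is the pattern the daily success rate is meant to track.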
High storage strength has two main benefits. First, forgetting is slower. Here’s what the model looks like when storage strength is high:
When storage strength is high, learning is more durable.5
Second, relearning is faster. All learning fades. If you don’t use it, you lose it. But if storage strength is high, you can relearn that topic quickly.6
This is the answer to the titular question. How much practice do students need? Enough that they can reliably retrieve the skill the next day. Learning is a change in long-term memory, and the best practical way to measure long-term memory is to wait a day and see what students remember. Continue practicing until students reach that threshold. Retrieval practice continues at a smaller scale after that point, and if retention slips we practice a bit more.7
Where to From Here
I know what some of you are thinking. This whole thing sounds nice, but there’s no time. I can’t spend 5-10 minutes practicing a skill from yesterday or last week every class. I have to move on to the next objective.
I hear you. To take this type of practice seriously requires a bit of instructional redesign. One approach is to set aside the first chunk of class to check for understanding and practice skills from previous days or weeks. That takes time away from your current lesson, but it averages out in the end: you’ll have less time for practice on the day’s objective, but make up for it with more time to practice on future days.
There is another option. This is where I plug my favorite off-the-wall curriculum design idea. Here is a rough sketch of a typical 7th grade sequence of units:
Six big ideas, one after another.
Here is what I did this year:
I took the toughest topics, the ones that students typically have the most trouble with, and stretched them out through the entire year. Multiple strands, taught side by side. Rational numbers means adding, subtracting, multiplying, and dividing, with negatives, fractions, and decimals. Equations and expressions includes a bunch of stuff, but the keys are finding equivalent expressions and solving two-step equations. Every day, we work a little bit on those two topics, as well as one of the other units. Sometimes it’s just a few minutes of practice on one strand. Sometimes it’s introducing a new concept. A bit of progress, every day. This structure provided time and flexibility to give students the spaced, repeated practice I’m describing in this post. It’s been a pain to plan, but it’s also my favorite change I’ve made to my teaching in the last few years.
Whatever approach you choose, you’ll need to accept that, if learning is a change in long-term memory, we can’t measure the success of our teaching in a single lesson. To figure out if students have learned, I need to check for understanding over multiple days and provide more practice when necessary. Many teachers and schools are strongly committed to the idea that every lesson must have exactly one objective, and every part of the lesson must be in service of that objective. I think we should let go of that idea. It simply is not consistent with how humans learn.
1. There’s an interesting question that I elided in the post but that is worth mentioning. If you teach a skill using a typical structure (lesson, practice, move on), some students will retain it. It’s not like that type of teaching results in zero learning. Why can some students retain what they learn without much practice, while others need the type of repeated, spaced practice I’m describing in this post? A helpful mental model here is that the skills we want students to learn are part of webs of knowledge, not isolated islands. Many students in a class walk in already knowing some things about a topic, or knowing a lot about concepts that are closely connected to the topic. When new learning is connected to what students already know, they can lean on the storage strength of that prior learning to boost the new learning. This underscores the importance of making connections in math learning.
It’s tempting to think that, if only we did a really good job creating clear connections from one topic to the next, students wouldn’t need the type of practice I’m describing. Here I can only offer my experience: we should absolutely work to help students make connections between what we teach and what they already know. When I do a good job of this, students need less practice. Still, I have always found that if I want students to retain what I teach, and if I want the largest possible number of students to learn, I need to provide the type of repeated, spaced practice I’m describing in this post.
2. If you’re looking for a way to generate this type of practice, I wrote a post a few weeks ago about an AI tool to generate quick chunks of math practice.
3. The 100% goal bugs some people. It’s really hard to get to 100%, and there are practical barriers: with absences and silly mistakes, 100% can feel like overkill. The goal here is that if I get most of the way but I have two students who I know need more practice, I provide a bit more practice until those two students feel confident with the skill. In too many math classes those final few students watch the class move on before they’re ready, over and over again. Eventually they lose confidence as they feel like they just can’t keep up.
4. If you’d like to play with these models for memory strength, here is a website to explore retrieval strength and storage strength. I wrote a longer post on the topic last year. I’ll emphasize that this is a toy model. There’s a lot that’s not included. I do think it gives a good sense of the basic distinction between retrieval strength and storage strength and is useful for teachers, even if it’s not perfect.
5. Another benefit of high storage strength is fluency: if storage strength is high, retrieving that knowledge puts less of a tax on working memory, freeing up space to think about more complex ideas.
6. Something interesting about this model: researchers think storage strength is something that never declines. I’m not an expert on the details here; I’m drawing on work by Robert and Elizabeth Bjork. You can see Robert Bjork talking about this topic in a video here.
7. Worth noting that, realistically speaking, it’s not possible to do this for every single skill in the curriculum. While I enjoy teaching the triangle inequality theorem, it doesn’t quite matter enough and I won’t go through this whole rigamarole. This is a strategy for the key procedural skills that are most important for students to retain.







