The Story of the Science of Learning

by Justin Skycak (@justinskycak) on January 07, 2024

In terms of improving educational outcomes, science is not where the bottleneck is. The bottleneck is in practice. The science of learning has advanced significantly over the past century, yet the practice of education has barely changed.

This post is part of the book The Math Academy Way (Working Draft, Jan 2024). Suggested citation: Skycak, J., advised by Roberts, J. (2024). The Story of the Science of Learning. In The Math Academy Way (Working Draft, Jan 2024). https://justinmath.com/the-story-of-the-science-of-learning/


The science of learning has advanced significantly over the past century. Numerous effective cognitive learning strategies have been identified and researched extensively since the early to mid-1900s, with key findings being successfully reproduced over and over again.

At a glance, here are some of the highlights:

Active Learning -- students learn more when they are actively performing learning exercises as opposed to passively consuming educational content.
Deliberate Practice -- effective learning feels like a workout with a personal trainer and should center around individualized training activities that are chosen to improve specific aspects of one's performance through repetition and successive refinement.
Mastery Learning -- each individual student needs to demonstrate proficiency on prerequisite topics before moving on to more advanced topics.
Minimizing Cognitive Load -- because our brains can only process small amounts of new information at once, it's critical to break down skills and concepts into tiny steps.
Developing Automaticity -- to free up mental processing power, it's also critical to practice low-level skills enough that they can be carried out without requiring conscious effort.
Layering -- learning is about making connections. The more connections there are to a piece of knowledge, the more ingrained, organized, and deeply understood it is, and the easier it is to recall. The most efficient way to increase the number of connections to existing knowledge is to continue layering on top of it -- that is, continually acquiring new knowledge that exercises prerequisite or component knowledge.
Non-Interference -- conceptually related pieces of knowledge should be spaced out over time so that they are less likely to interfere with each other's recall. New concepts should be taught alongside dissimilar material.
Spaced Repetition (Distributed Practice) -- reviews should be spaced out or distributed over multiple sessions (as opposed to being crammed or massed into a single session) so that memory is not only restored, but also further consolidated into long-term storage, which slows its decay.
Interleaving (Mixed Practice) -- the effectiveness of practice is diminished when a single skill is practiced many times consecutively beyond a minimum effective dose. Review problems should be spread out or interleaved over multiple review assignments that each cover a broad mix of previously-learned topics. In addition to being more efficient, this also helps students match problems with the appropriate solution techniques.
The Testing Effect (Retrieval Practice) -- to maximize the amount by which your memory is extended when solving review problems, it's necessary to avoid looking back at reference material unless you are totally stuck and cannot remember how to proceed. For this reason, it's necessary to test frequently as a part of the learning process itself.
Gamification -- when game-like elements (such as points and leaderboards) are properly integrated into student learning environments, students typically not only learn more and engage more with the content, but also enjoy it more. However, these gamified elements must be aligned with the goals of the course, the motivations of the students, and the context of the educational setting. Further, they need to be resistant to "hacking" behaviors that attempt to bypass learning by exploiting loopholes in the rules of the game.

The Persistence of Tradition

One might expect to find these strategies being leveraged in today’s classrooms to drastically improve the depth, pace, and overall success of student learning. However, the disappointing reality is that the practice of education has barely changed, and in many ways remains in direct opposition to the strategies outlined above.

Classes still march through linear sequences of topics according to a predetermined schedule. Students are tethered to the pace of the class, which means that students who get lost are continually asked to learn new topics despite not having mastered the prerequisites, and students who learn quickly are prevented from learning more advanced concepts that come later in the class schedule or in a higher grade level (even if they have already mastered the prerequisites).
Units of related material are taught in consecutive lessons, which promotes confusion, impedes recall, and places a severe bottleneck on how many topics can be successfully taught simultaneously, thereby creating lots of friction and massively slowing down the learning process.
After learning a topic during class and practicing it on the homework, students forget about it until it's time to study for a test -- and there are only a handful of tests given throughout the entire duration of a course. After the test, students are rarely required to practice the topic again, unless it just happens that some new topic requires them to remember the old one. The end result is that students end up forgetting most of what they learn.
All students are given the same homework and assessments. This creates opportunities for coordinated cheating, a wide-open loophole in the grading system. Many students habitually exploit this loophole to bypass learning and obtain grades that do not reflect their (lack of) knowledge.

As lamented by Weinstein, Madan, & Sumeracki (2018):

"The science of learning has made a considerable contribution to our understanding of effective teaching and learning strategies. However, few instructors outside of the field are privy to this research.

In particular, a review published 10 years ago identified a limited number of study techniques that have received solid evidence from multiple replications testing their effectiveness in and out of the classroom (Pashler et al., 2007).

A recent textbook analysis (Pomerance, Greenberg, & Walsh, 2016) took the six key learning strategies from this report by Pashler and colleagues, and found that very few teacher-training textbooks cover any of these six principles -- and none cover them all, suggesting that these strategies are not systematically making their way into the classroom.

This is the case in spite of multiple recent academic (e.g., Dunlosky et al., 2013) and general audience (e.g., Dunlosky, 2013) publications about these strategies."

Kirschner & Hendrick sum it up as follows (2024, p. 275):

"...[M]ost students, and also many or even most teachers, don't have an accurate picture of the effectiveness of their study approach.

After more than a hundred years of research into learning and memory, there are a few things that we know about good and less good approaches. Since the turn of this century, people have been trying to figure out how to remember as much as possible, how to ensure that we forget as little as possible, and how to do this in as little time as possible.

The reason we have our doubts with respect to teachers is because the findings that have emerged from this research aren't yet included in textbooks for teachers (both in research in the US, as well as in the Netherlands and Flanders; Pomerance, Greenberg, & Walsh, 2016; Surma, Vanhoyweghen, Camp, & Kirschner, 2018)."

As Halpern & Hakel (2003) emphasize more sharply:

"Those outside academia further assume that because we are college faculty, we actually have a reasonable understanding of how people learn and that we apply this knowledge in our teaching. ... It would be reasonable for anyone reading these fine words to assume that the faculty who prepare students to meet these lofty goals must have had considerable academic preparation to equip them for this task. But this seemingly plausible assumption is, for the most part, just plain wrong.

The preparation of virtually every college teacher consists of in-depth study in an academic discipline: chemistry professors study advanced chemistry, historians study historical methods and periods, and so on. Very little, if any, of our formal training addresses topics like adult learning, memory, or transfer of learning.

And these observations are just as applicable to the cognitive, organizational, and educational psychologists who teach topics like principles of learning and performing, or evidence-based decision-making. We have found precious little evidence that content experts in the learning sciences actually apply the principles they teach in their own classrooms. Like virtually all college faculty, they teach the way they were taught.

But, ironically (and embarrassingly), it would be difficult to design an educational model that is more at odds with the findings of current research about human cognition than the one being used today at most colleges and universities.
...
There is a large amount of well-intentioned, feel-good psychobabble about teaching out there that falls apart upon investigation of the validity of its supporting evidence."

These sentiments are also echoed by Rohrer & Hartwig (2020):

"We fear, however, that continued advocacy might fall on deaf ears. ... [E]mpirical evidence is not highly valued by many of the educators who recommend learning methods and train teachers (e.g., Robinson, Levin, Thomas, Pituch, & Vaughn, 2007; Sylvester Dacy, Nihalani, Cestone, & Robinson, 2011). Against this backdrop, it might be difficult to inspire the kind of support for evidence-based interventions like those that sparked the dramatic improvements in Western medicine over the last century. Doing so, we believe, is the most pressing challenge facing learning scientists."

A Common Theme Preventing Adoption

Theme and Examples

So, what happened? Why have these cognitive learning strategies been rejected by the education system? The common theme throughout the literature is that effective cognitive learning strategies often deviate from traditional conventions, which are held in place by convenient misconceptions about learning.

The most obvious example of this theme is active learning.

Traditionally, classes are taught using passive learning: the instructor lectures, and students listen, maybe answering a question here and there. Unsurprisingly, this is not nearly as effective as an active learning class where students spend most of their time actively performing learning exercises.
However, it has been shown (Deslauriers et al., 2019) that even though students in active learning classes learn more, they mistakenly perceive that they learn less. Active learning produces more learning by increasing cognitive activation, but students often mistakenly interpret extra cognitive effort (such as productive struggle and occasional confusion) as an indication that they are not learning as well, when in fact the opposite is true.
Of course, this misconception is a convenient belief for students who want to minimize the amount of effort that they expend during class while still "feeling" as though they are learning (even if it is not really happening). It is also a convenient belief for teachers who enjoy the spotlight and art of lecturing and the "feeling" that their students are learning, do not want to nag students to stay focused during class, and do not suffer repercussions for the reality that is their students' lack of learning.

Another example of this theme is interleaving (mixed practice).

Traditionally, homework assignments focus on a single topic (or group of closely related topics) that is practiced many times consecutively beyond a minimum effective dose. This is not nearly as effective as spreading out or interleaving those problems over multiple review assignments that each cover a broad mix of previously-learned topics, which is more efficient and also helps students learn to match problems with the appropriate solution techniques.
However, it has been shown (see Rohrer, 2009 for a review) that even though interleaving promotes vastly superior retention and generalization, students again mistakenly believe that they are learning less due to the increased cognitive effort. Teachers can be fooled, too, because although interleaving increases performance on cumulative tests, it actually lowers performance on homework (which is otherwise artificially high if students settle into a robotic rhythm of mindlessly applying one type of solution to one type of problem).
Again, this misconception is a convenient belief for students who want to get through homework as quickly and effortlessly as possible while "feeling" as though they are mastering new skills (even if they are unable to consistently reproduce those skills in true assessment situations). It is also a convenient belief for teachers who want to assign good homework grades and "feel" as though these grades represent their students' learning, but don't want to spend extra effort organizing a properly spaced mixed review schedule and fielding a greater number and variety of homework questions from students.
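To make the contrast between blocked and interleaved homework concrete, here is a minimal Python sketch. The topic names, the function names, and the round-robin rule are all assumptions chosen for illustration; this is not a description of how any particular curriculum or system builds its assignments. Both functions hand out the same total number of problems, and only the arrangement differs.

```python
def blocked_assignments(topics, problems_per_topic):
    """Traditional layout: one assignment per topic, all problems back to back."""
    return [[topic] * problems_per_topic for topic in topics]


def interleaved_assignments(topics, problems_per_topic, num_assignments):
    """Mixed layout: deal out one problem per topic at a time, rotating
    through the assignments, so every assignment covers a mix of topics."""
    assignments = [[] for _ in range(num_assignments)]
    for round_index in range(problems_per_topic):
        # Each round contributes exactly one problem from every topic.
        assignments[round_index % num_assignments].extend(topics)
    return assignments


# Hypothetical topics, used only for illustration.
topics = ["fractions", "exponents", "linear equations"]

for assignment in blocked_assignments(topics, problems_per_topic=3):
    print("blocked:    ", assignment)

for assignment in interleaved_assignments(topics, problems_per_topic=3,
                                          num_assignments=3):
    print("interleaved:", assignment)
```

In the blocked layout a student can solve every problem on a page with the same technique, whereas in the interleaved layout each problem first requires deciding which technique applies.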

A similar example can be constructed for every cognitive learning strategy that was mentioned earlier in this chapter. In some way or another, each strategy increases the intensity of effort required from students and/or instructors, and the extra effort is then converted into an outsized gain in learning. However, the extra effort also exposes the reality that students didn’t actually learn as much as they (and their teachers) “felt” they did under less effortful conditions. This reality is inconvenient to students and teachers alike; therefore, it is common to simply believe the illusion of learning and avoid activities that might present evidence to the contrary.

More generally, while “innocent until proven guilty” is a good model for a legal system, “competent until proven incompetent” is a poor model for an educational system. If students are not made to demonstrate measurable learning at each step of the way, until they are able to consistently reproduce learned skills in true assessment situations, then the most likely outcome is that very little learning will happen. Whereas the casualties of the legal system are those who are jailed without just cause, the casualties of the education system are those students who are hopelessly pushed to learn advanced skills despite not having actually mastered the prerequisites. Empowering students requires ensuring their learning, and ensuring learning requires interrogating their knowledge.

Desirable Difficulty vs Illusion of Comprehension

This theme is so well-documented in the literature that it even has a catchy name: a practice condition that makes the task harder, slowing down the learning process yet improving recall and transfer, is known as a desirable difficulty. As summarized by Rohrer (2009):

"A feature that decreases practice performance while increasing test performance has been described by Bjork and his colleagues as a desirable difficulty, and spacing and mixing are two of the most robust ones. As these researchers have noted, students and teachers sometimes avoid desirable difficulties such as spacing and mixing because they falsely believe that features yielding inferior practice performance must also yield inferior learning."

Many types of cognitive learning strategies introduce desirable difficulties – for instance, Bjork & Bjork (2011) list a few more:

"Such desirable difficulties (Bjork, 1994; 2013) include varying the conditions of learning, rather than keeping them constant and predictable; interleaving instruction on separate topics, rather than grouping instruction by topic (called blocking); spacing, rather than massing, study sessions on a given topic; and using tests, rather than presentations, as study events."

As Bjork & Bjork (2023, pp.21-22) elaborate, desirable difficulties make practice more representative of true assessment conditions. Consequently, it is easy for students (and their teachers) to vastly overestimate their knowledge if they do not leverage desirable difficulties during practice, a phenomenon known as the illusion of comprehension:

"A general characteristic of desirable difficulties (such as the spacing or interleaving of study or practice trials) is that they present challenges (i.e., difficulties) for the learner, and hence can even appear to be slowing the rate at which learning is occurring. In contrast, their opposites (such as massing or blocking of study or practice trials) often make performance improve rapidly and can appear to be enhancing learning.

Thus, as either learners or teachers, we are vulnerable to being misled as to whether we or our students are actually learning effectively, and, indeed, we can easily be misled into thinking that these latter types of conditions, such as massing or blocking, are actually better for learning. Such dynamics probably play a major role in why students often report that their most preferred and frequently used types of study activity include activities such as rereading chapters (e.g., Bjork et al., 2013), typically right away after an initial reading. Such activities can provide a sense of familiarity or perceptual fluency that we can interpret as reflecting understanding or comprehension and, thus, produce in us what we have sometimes called an 'illusion of comprehension' (Bjork, 1999; Jacoby et al., 1994).

Similarly, when information comes readily to mind, which frequently is the case in blocked practice, or with no contextual variation in a repeated study or practice setting, we can be led to believe that such immediate access reflects real learning when, in fact, such access is likely to be the product of cues that continue to be present in the unchanging study situation, but that are unlikely to be present at a later time, such as on an exam. As both learners and teachers, we need to be suspicious of conditions of learning, such as massing and blocking, that frequently make performance improve rapidly, but then typically fail to support long-term retention and transfer. To the extent that we interpret current performance as a valid measure of learning, we become susceptible both to mis-judging whether learning has or has not occurred and to preferring poorer conditions of learning over better conditions of learning."

The Educational System Prefers Illusion

As Bjork (1994) explains, the typical teacher is incentivized to maximize the immediate performance and/or happiness of their students, which biases them against introducing desirable difficulties and incentivizes them to promote illusions of comprehension:

"Recent surveys of the relevant research literatures (see, e.g., Christina & Bjork, 1991; Farr, 1987; Reder & Klatzky, 1993; Schmidt & Bjork, 1992) leave no doubt that many of the most effective manipulations of training -- in terms of post-training retention and transfer -- share the property that they introduce difficulties for the learner.
...
If the research picture is so clear, why then are ... nonproductive manipulations such common features of real-world training programs? ... [T]he typical trainer is overexposed, so to speak, to the day-to-day performance and evaluative reactions of his or her trainees. A trainer, in effect, is vulnerable to a type of operant conditioning, where the reinforcing events are improvements in the [immediate] performance and/or happiness of trainees.

Such a conditioning process, over time, can act to shift the trainer toward manipulations that increase the rate of correct responding -- that make the trainee's life easier, so to speak. Doing that, of course, will move the trainer away from introducing the types of desirable difficulties summarized in the preceding section."

What’s more, most educational organizations operate in a way that exacerbates this issue:

"The tendency for instructors to be pushed toward training programs that maximize the performance or evaluative reaction of their trainees during training is exacerbated by certain institutional characteristics that are common in real-world organizations.

First, those responsible for training are often themselves evaluated in terms of the performance and satisfaction of their trainees during training, or at the end of training.

Second, individuals with the day-to-day responsibility for training often do not get a chance to observe the post-training performance of the people they have trained; a trainee's later successes and failures tend to occur in settings that are far removed from the original training environment, and from the trainer himself or herself.

It is also rarely the case that systematic measurements of post-training on-the-job performance are even collected, let alone provided to a trainer as a guide to what manipulations do and do not achieve the post-training goals of training.

And, finally, where refresher or retraining programs exist, they are typically the concern of individuals other than those responsible for the original training."

As a result, these cognitive learning strategies often ruffle the feathers of educational traditionalists, whose immediate response is to lash out against them. Take it directly from John Gilmour Sherman (1992), a professor who implemented evidence-based learning strategies in his own classroom, only to be shut down for no reason other than his superior’s unsupported opinions about how learning works:

"Avoiding a frontal attack, the chairman of the Psychology Department at Georgetown declared by fiat that something on the order of 50% of class time must be devoted to lecturing. By reducing the possibility of self-pacing to zero, this effectively eliminated PSI [Personalized System of Instruction] courses.

He issued this order on the grounds that in the context of lecturing 'it is the clash of intellects in the classroom that informs the student.' No data were presented on this point! The spectacle of purporting to defend scholarship while deciding the merits of instructional methods by assertion is silly.

The troubling aspect of all these cases was that data played no part in the decisions. It is disturbing when one has to wonder whether research on the education process makes any difference."

Ultimately, Sherman’s experiences led him to conclude that

"...[T]he investment in keeping things as they are may be impossible to overcome. ... Improving instruction is the goal, but only in the context of not changing anything that is important to any vested interest. ... [When the role of the teacher] does not conform to what most people think of as teaching; this is a problem and an obstacle to implementation."

This sentiment continues into recent years. As Bjork & Bjork (2023, p. 19) reminisce:

"Having been asked to convey in 'our own words' what we most want students and teachers to know regarding how to apply findings from the science of learning has led us to think back on our efforts to spread the desirable difficulties gospel, so to speak. It verges on laughable that we thought 25 years or so ago that we would simply tell people about certain key findings, and they would then immediately change how they managed their own learning."

Or, as Rohrer & Hartwig (2020) put it bluntly:

"...[T]he success of an intervention depends partly on whether students and teachers are willing to use it. Too often, the classroom is where promising interventions go to die."

Technology Changes Everything

Revival via Technology

It is unfortunate that Sherman and countless other researchers, practitioners, and proponents of evidence-based education are no longer alive to see their life’s work positively transform the practice of education – and especially so for those like Sherman (1992) who eventually came to wonder “whether research on the education process makes any difference.”

However, some did maintain hope that their contributions might one day be revived, once computers advanced far enough to make individualized digital learning environments technologically possible and commercially viable.

Indeed, these cognitive learning strategies are starting to see the light of day in online learning systems (e.g., Math Academy). By learning in an environment that leverages these strategies to their fullest effect and capitalizes on their compounding nature, students now have the opportunity to learn many times more than they would otherwise in a traditional classroom.

Necessity of Technology

In building Math Academy, we discovered something interesting: technology not only lets us circumvent the opposing inertia in the education system, but also helps us leverage these cognitive learning strategies to a degree that would not be feasible for even the most agreeable and hard-working human teacher. While it’s true that a human teacher can reap some benefits of these strategies while maintaining a reasonable workload (and there really is no good excuse for not doing so), technology enables us to leverage these strategies to their full extent and produce even better learning outcomes than a human teacher who uses loose approximations of these strategies as much as humanly possible.

For instance, consider spaced repetition. While some curricula now adopt a spiral approach where material is naturally revisited and further built upon in later textbook chapters and/or grades, this is nowhere near the level of granularity, precision, and individualization that is required to capture the maximum benefit of true spaced repetition. Taken to its fullest extent, spaced repetition requires the instructor to keep track of a repetition schedule for every student for every topic and continually update that schedule based on the student’s performance – and each time a student learns (or reviews) an advanced topic, they’re implicitly reviewing many simpler topics, all of whose repetition schedules need to be adjusted as a result.
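The bookkeeping described above can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than Math Academy's actual algorithm: the prerequisite map, the names `ReviewState`, `record_review`, and `due_topics`, the rule that a passed review doubles the interval while a failed one resets it to one day, and the rule that passing an advanced topic pushes back the due dates of its prerequisites without lengthening their intervals.

```python
from dataclasses import dataclass


@dataclass
class ReviewState:
    interval_days: int = 1  # current gap between reviews (assumed starting value)
    due_day: int = 0        # day on which the next review is due


# Hypothetical prerequisite map: topic -> topics it directly exercises.
PREREQUISITES = {
    "solving linear equations": ["adding fractions", "multiplying integers"],
    "adding fractions": ["multiplying integers"],
    "multiplying integers": [],
}


def all_prerequisites(topic, prerequisites):
    """Return every direct and indirect prerequisite of a topic."""
    seen = set()
    stack = list(prerequisites.get(topic, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(prerequisites.get(current, []))
    return seen


def record_review(schedule, topic, passed, today, prerequisites=PREREQUISITES):
    """Update one student's schedule after they attempt a topic."""
    state = schedule.setdefault(topic, ReviewState())
    # Assumed rule: double the interval on success, reset it on failure.
    state.interval_days = state.interval_days * 2 if passed else 1
    state.due_day = today + state.interval_days
    if passed:
        # Assumed rule: success on an advanced topic counts as an implicit
        # review of its prerequisites, so their due dates are pushed back.
        for prereq in all_prerequisites(topic, prerequisites):
            prereq_state = schedule.setdefault(prereq, ReviewState())
            prereq_state.due_day = max(
                prereq_state.due_day, today + prereq_state.interval_days
            )


def due_topics(schedule, today):
    """List the topics a student should review on the given day."""
    return sorted(t for t, s in schedule.items() if s.due_day <= today)


# One schedule per student; each schedule tracks every topic separately.
schedules = {"student_a": {}, "student_b": {}}
record_review(schedules["student_a"], "multiplying integers", passed=True, today=0)
record_review(schedules["student_a"], "solving linear equations", passed=True, today=1)
print(due_topics(schedules["student_a"], today=3))
```

Even in this toy version, a single answer from a single student touches several schedule entries, which hints at how quickly the bookkeeping grows with more students and topics.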

Of course, this is an inhuman amount of work. In fact, before building our online system, we actually tried performing a loose approximation of spaced repetition manually while teaching in a human-to-human classroom. It turned out that, teaching just two classes with only a handful of students in each class, it took more time and effort than a full-time job to implement a very loose approximation of spaced repetition for the class as a whole – not even personalized to individual students. And that’s just one of many strategies that are necessary for effective teaching!

But just because fully leveraging these cognitive learning strategies requires an inhuman amount of work doesn’t mean that there’s little to gain from it (especially when a century of research has shown that these strategies lead to immense improvements in learning). All it means is that the human teacher is a bottleneck to effective teaching. And what’s always the solution when manual human effort is a bottleneck? Technology.

References

Bjork, E. L., & Bjork, R. A. (2011). Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. Psychology and the real world: Essays illustrating fundamental contributions to society, 2(59-68).

Bjork, E. L., & Bjork, R. A. (2023). Introducing Desirable Difficulties Into Practice and Instruction: Obstacles and Opportunities. In C. Overson, C. M. Hakala, L. L. Kordonowy, & V. A. Benassi (Eds.), In Their Own Words: What Scholars and Teachers Want You to Know About Why and How to Apply the Science of Learning in Your Academic Setting (pp. 111-21). Society for the Teaching of Psychology.

Bjork, R. A. (1994). Memory and metamemory considerations in the training of human beings. In J. Metcalfe and A. Shimamura (Eds.), Metacognition: Knowing about knowing (pp.185-205).

Deslauriers, L., McCarty, L. S., Miller, K., Callaghan, K., & Kestin, G. (2019). Measuring actual learning versus feeling of learning in response to being actively engaged in the classroom. Proceedings of the National Academy of Sciences, 116(39), 19251-19257.

Halpern, D. F., & Hakel, M. D. (2003). Applying the Science of Learning. Change, 37.

Kirschner, P., & Hendrick, C. (2024). How learning happens: Seminal works in educational psychology and what they mean in practice. Routledge.

Rohrer, D. (2009). Research commentary: The effects of spacing and mixing practice problems. Journal for Research in Mathematics Education, 40(1), 4-17.

Rohrer, D., & Hartwig, M. K. (2020). Unanswered questions about spaced interleaved mathematics practice. Journal of Applied Research in Memory and Cognition, 9(4), 433.

Sherman, J. G. (1992). Reflections on PSI: Good news and bad. Journal of Applied Behavior Analysis, 25(1), 59.

Weinstein, Y., Madan, C. R., & Sumeracki, M. A. (2018). Teaching the science of learning. Cognitive research: principles and implications, 3(1), 1-17.
