That may be true for typical universities. But if you’re going to an elite university that is known worldwide for its math program or general STEM prowess (MIT, Caltech, Princeton, UChicago, etc), and all the knowledge you show up with is high school math and AP Calculus, then you’re going to get your ass handed to you.
The problem is that high school math – even the “honors” track – doesn’t accurately reflect the level of background knowledge that successful math majors at these universities typically acquire before being admitted.
We’re talking about the kids who graduate high school having already taken Linear Algebra, Multivariable Calculus, and Introduction to Proofs, and have probably already seen inklings of Real Analysis (e.g., epsilon-delta limit proofs) and Abstract Algebra (e.g., arithmetic within the additive group of integers).
This is such a tiny slice of the population that you’re not going to see them in high school. But they exist, and they’re going to show up in the math-major math classes at these universities.
When the professor is writing furiously at the chalkboard assuming that their students are absorbing the information in real time, these students actually are. Or, at least, they give the appearance of it, because so much of the content (or, at least, the way of thinking about it) is familiar to them.
Regardless, they’re able to keep up, and if you’re not able to do the same (which you probably aren’t if you haven’t been exposed to as much math as they have), then the class is not going to slow down just for you. Not to mention, you’re going to feel dumb, which is going to severely impact your motivation even if you manage to find help outside of class.
It’s common to think that maximum-efficiency learning should feel maximally scaffolded, perfectly smooth/easy the whole way through. While this is more true than not, it misses an important nuance: maximum-efficiency learning should feel just-enough scaffolded that the learning tasks are challenging yet still achievable.
This is more obvious in the context of athletics: maximum-efficiency training involves pushing athletes to the brink of their capabilities. At the beginning of a training session, an athlete undergoing maximum-efficiency training will probably not be confident in their ability to successfully perform the training tasks, but they will end up doing so.
Confidence
When you’re developing skills at peak efficiency, you are maximizing the difficulty of your training tasks subject to the constraint that you end up successfully overcoming those difficulties.
A noteworthy corollary of this is that you are also minimizing your confidence in your ability to complete the training tasks (again subject to the constraint that you end up successfully completing them).
In that view, confidence is more of a “hindsight” thing than an “in-the-moment” thing. If you feel confident while engaging in maximum-efficiency learning, it’s not because the task in front of you seems easy relative to your abilities, but because you’ve been in situations before where tasks felt challenging relative to your abilities but you’ve always managed to come out successful.
Pep Talks
This perspective on confidence occasionally showed up when I was teaching the most advanced high school math/CS sequence in the USA. Sometimes the kids would get a bit intimidated by the coding tasks I was asking them to complete – not because they tried and failed to complete the tasks, but because the tasks just looked very challenging from the outset.
So I’d sometimes have to give a little pep talk beforehand. Something like this:
This is one of the “soft” psychological things that good coaches will do: reminding students that things may feel challenging in the moment, but that’s what it feels like to engage in maximum-efficiency practice. The efficiency is contained within the challenge.
There is one setting in which the conclusions of the paper might make sense to me. It involves tightening the definition of “favorable learning conditions” to the point that it becomes more theoretical than practical, and it doesn’t imply that students actually learn at similar absolute rates, but here it is.
The paper limits its conclusions to the context of “favorable learning conditions,” which it describes as follows:
I wonder if the definition of “favorable learning conditions” also needs to specify (in some more precise way) that the curriculum is sufficiently granular relative to most students’ comfortable “bite sizes” for learning new information, and includes sufficient review relative to their forgetting rates.
Under that definition, it would make more intuitive sense to me that (barring hard cognitive limits) such favorable learning conditions could to some extent factor out cognitive differences, causing learning rates to appear surprisingly regular. A metaphor: “students eat meals of information at similar bite rates when each spoonful fed to them is sufficiently small.”
Though, ceiling effects may confound the picture when the curriculum is too granular or provides too much review relative to the learner’s needs – so perhaps the definition would need to be amended once more to specify that the curriculum’s granularity is equal to the student’s bite size and its rate of review is equal to the student’s rate of forgetting. The amended metaphor: “students eat meals of information at similar bite rates when each spoonful fed to them is sized appropriately relative to the size of their mouth.” (Note that equal bite rates do not imply equal rates of food volume intake.)
This definition of “favorable learning conditions” would also allow for anecdotes / case studies of math becoming hard for different students at different levels, because the following factors affect students differentially as they move up the levels of math:
It would even allow for the concept of soft and hard ceilings on the highest level of math that one can reach:
Most people know that higher-number facts are typically harder than lower-number facts, but the 10s are really easy and the 9s follow a pattern that makes them fairly easy as well.
So, we arrive at the following guess: $8 \times 8$ is the hardest, then $7 \times 8,$ then maybe $7 \times 7$ tied with $6 \times 8.$
This is a decent guess. These are all some of the hardest facts. But when you look at the results, such as here, you may be surprised by the following two observations:
What gives?
There’s actually a cognitive principle behind this: associative interference, the phenomenon that conceptually related pieces of knowledge can interfere with each other’s recall.
For instance, when recalling $4 \times 8,$ related facts like $\mathbf{4} \times 6 = \mathbf{24}$ and $3 \times \mathbf{8} = \mathbf{24}$ interfere with the spreading activation during the recall process and increase the likelihood of the error $4 \times 8 = 24.$
(Spreading activation is a method by which connections between information can be used to recall information in response to a stimulus. The stimulus activates some piece(s) of information, and the activity flows through connections to other pieces of information.)
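To make the mechanism concrete, here is a toy sketch of spreading activation in Python. The associative links and the decay factor are illustrative assumptions of mine, not a fitted cognitive model:

```python
# Toy associative network around the fact 4 x 8 = 32. The links and the
# decay factor are illustrative assumptions, not fitted cognitive data.
links = {
    "4x8": ["32", "4x6", "3x8"],  # the correct answer, plus related facts
    "4x6": ["24"],
    "3x8": ["24"],
}

def spread(stimulus, decay=0.5):
    """Spread activation one and two hops out from a stimulus node,
    summing activation that converges on the same node."""
    activation = {stimulus: 1.0}
    for neighbor in links.get(stimulus, []):
        activation[neighbor] = activation.get(neighbor, 0.0) + decay
        for second in links.get(neighbor, []):
            # "24" receives activation from BOTH 4x6 and 3x8
            activation[second] = activation.get(second, 0.0) + decay * decay
    return activation

print(spread("4x8"))
```

Because activation converges on 24 from two related facts, the wrong answer 24 ends up just as active as the correct answer 32 in this toy model – which is exactly the interference-driven error pattern described above.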
Here’s a diagram that I made to illustrate:
Active Learning
In order for students to have learned something, they need to be able to consistently reproduce that information and use it to solve problems. None of these things happen when students watch a lecture, even if they understand it perfectly. The same reasoning applies to watching videos, reading books, re-reading notes, and all other passive learning techniques. If students don’t actively practice retrieving information from memory, it doesn’t get written to memory. It just falls out of their brain.
Relationship with Cognitive Load
Now, here’s the thing. The goal of active learning is not to blow up a student’s cognitive load. It’s actually the opposite – to get students actively retrieving information from memory, while minimizing their cognitive load.
When a student has a heavy cognitive load, their working memory is running low on processing power, which means that
This is why it’s so important to scaffold instructional material and introduce new material only after prerequisites have been learned. New material needs to be
Issues with Pure Discovery & Radical Constructivist Learning
Unfortunately, that’s where some extremist non-traditional approaches get it wrong: they get students performing activities, but they don’t minimize cognitive load, and students just spend the whole time in a state of cognitive overload, getting nowhere.
Optimally active learning doesn’t mean that students never watch and listen. It just means that students are actively and successfully solving problems as soon as possible following a minimum effective dose of initial explanation, and they spend the vast majority of their time actively and successfully solving problems.
Reality vs Perception of Learning
Finally, there’s one catch: even if students are engaged in optimally active learning, they’re typically not going to perceive it as being optimal. Active learning produces more learning by increasing cognitive activation, but students mistakenly interpret that extra cognitive effort as an indication that they are not learning as well, when in fact the opposite is true.
The important keyword here is “desirable difficulty,” which refers to a practice condition that makes the task harder, slowing down the learning process yet improving recall and transfer.
Active learning creates a desirable difficulty that makes class feel more challenging but improves learning. Passive learning, on the other hand, promotes an illusion of comprehension in which students (and their teachers) overestimate their knowledge because they are not made to exercise it.
Further Reading
Here’s a draft that I’m working on that goes into all this stuff (and more) in way more detail with over 300 references and relevant quotes pulled out of those references.
Imagine signing up for tennis lessons with a personal coach.
When does the learning happen?
It’s not when you pay the coach the money. It’s not when you watch the coach demonstrate a move.
It’s when you actually start doing things that you weren’t able to do before. It’s when you attempt a move, the coach corrects your form, and you attempt the move again with better results.
The learning is the incremental gain in your ability to perform a tangible, reproducible skill. If you’re not getting those gains, you’re not learning.
It’s the same in mathematics.
The keys to effective training in mathematics are the same as the keys to effective training in athletics, music, or any other skill-based domain.
Learning how to solve a new type of equation is totally different from, say, learning some new history about the life of Napoleon.
You’re not just absorbing information – you’re developing skills.
For instance, if you want to rotate 90 degrees clockwise, then you want to take the identity matrix
and turn it into the following:
Now, sometimes you have to tweak the result by transposing it afterwards – we don’t catch this in the above reasoning since the identity matrix is symmetric.
But you can catch that by double-checking that your transformation works in a more general case as follows:
So, if you want to rotate a matrix 90 degrees clockwise, then you use rotate90(X) = transpose(antiDiag * X), where antiDiag is the antidiagonal matrix with 1’s on the anti-diagonal.
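Here is a quick sketch of the rule in Python with NumPy (the function and variable names are my own) to check it against a concrete matrix:

```python
import numpy as np

def rotate90_cw(X):
    """Rotate a matrix 90 degrees clockwise using the rule
    rotate90(X) = transpose(antiDiag * X): left-multiplying by the
    antidiagonal matrix reverses the rows, and the transpose then
    completes the clockwise rotation."""
    anti_diag = np.fliplr(np.eye(X.shape[0], dtype=X.dtype))  # 1's on the anti-diagonal
    return (anti_diag @ X).T

X = np.array([[1, 2],
              [3, 4]])
print(rotate90_cw(X))
# [[3 1]
#  [4 2]]
```

As a sanity check, the result matches NumPy’s built-in clockwise rotation, `np.rot90(X, k=-1)`.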
Unfortunately, but understandably, many educators have come to distrust scientific findings about education as a whole – and this is compounded by an ongoing replication crisis in psychology.
But here’s the thing: Sure, many findings don’t hold up, but also, many findings do.
For instance: we know that actively solving problems produces more learning than passively watching a video/lecture or re-reading notes. This sort of thing has been tested scientifically, numerous times, and it is completely replicable. It might as well be a law of physics at this point. In fact, a highly-cited meta-analysis states, verbatim:
So there you go, that’s one cognitive psychology finding that holds up: active learning beats passive learning.
(To be clear: active learning doesn’t mean that students never watch and listen. It just means that students are actively solving problems as soon as possible following a minimum effective dose of initial explanation, and they spend the vast majority of their time actively solving problems – and by “vast majority” I mean, like, 90%, not 60%.)
Another finding: if you don’t review information, you forget it. You can actually model this precisely, mathematically, using a forgetting curve. I’m not exaggerating when I refer to these things as laws of physics – the only real difference is that we’ve gone up several levels of scale and are dealing with noisier stochastic processes (that also have noisier underlying variables).
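As a sketch, the classic exponential form of the forgetting curve takes only a few lines to model. The exponential form and the stability value below are standard simplifications and illustrative choices; real systems fit these parameters per student and per item:

```python
import math

def retention(days_elapsed, stability):
    """Ebbinghaus-style forgetting curve: R(t) = exp(-t / S).
    Larger stability S means slower forgetting; each successful
    review is typically modeled as increasing S."""
    return math.exp(-days_elapsed / stability)

# Without review, predicted recall probability decays steadily:
for t in [0, 1, 3, 7, 14]:
    print(f"day {t:2d}: {retention(t, stability=3.0):.2f}")
```

The point is not the specific numbers but the shape: recall decays smoothly and predictably in the absence of review, which is what makes review scheduling a tractable quantitative problem.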
Okay, but aren’t these obvious? Yes, but…
Now it’s time to address the elephant in the room: if cognitive psychology has found many effective learning strategies (like mastery learning, spaced repetition, the testing effect, and mixed practice), then why haven’t these learning strategies been implemented large-scale?
Here are a handful of reasons that I’m aware of.
1. Leveraging them (at all) requires additional effort from both teachers and students.
In some way or another, each strategy increases the intensity of effort required from students and/or instructors, and the extra effort is then converted into an outsized gain in learning.
This theme is so well-documented in the literature that it even has a catchy name: a practice condition that makes the task harder, slowing down the learning process yet improving recall and transfer, is known as a desirable difficulty.
Desirable difficulties make practice more representative of true assessment conditions. Consequently, it is easy for students (and their teachers) to vastly overestimate their knowledge if they do not leverage desirable difficulties during practice, a phenomenon known as the illusion of comprehension.
However, the typical teacher is incentivized to maximize the immediate performance and/or happiness of their students, which biases them against introducing desirable difficulties and incentivizes them to promote illusions of comprehension.
Using desirable difficulties exposes the reality that students didn’t actually learn as much as they (and their teachers) “felt” they did under less effortful conditions. This reality is inconvenient to students and teachers alike; therefore, it is common to simply believe the illusion of learning and avoid activities that might present evidence to the contrary.
2. Leveraging cognitive learning strategies to their fullest extent requires an inhuman amount of effort from teachers.
Let’s imagine a classroom where these strategies are being used to their fullest extent.
Why is this an inhuman amount of work?
In the absence of the proper technology, it is impossible for a single human teacher to deliver an optimal learning experience to a classroom of many students with heterogeneous knowledge profiles, who all need to work on different types of problems and receive immediate feedback on each attempt.
3. Most edtech systems do not actually leverage the above findings.
If you pick any edtech system off the shelf and check whether it leverages each of the cognitive learning strategies I’ve described above, you’ll probably be surprised at how few it actually uses. For instance:
Sometimes a system will appear to leverage some finding, but if you look more closely it turns out that this is actually an illusion that is made possible by cutting corners somewhere less obvious. For instance:
Now, I’m not saying that these issues apply to all edtech systems. I do think edtech is the way forward here – optimal teaching is an inhuman amount of work, and technology is needed. Heck, I personally developed all the quantitative software behind one system that properly handles the above challenges. All I’m saying is that you can’t just take these things at face value. Many edtech systems don’t really work from a learning standpoint, just as many psychology findings don’t hold up in replication – but at the same time, some edtech systems do work, shockingly well, just as some cognitive psychology findings do hold up and can be leveraged to massively increase student learning.
4. Even if you leverage the above findings, you still have to hold students accountable for learning.
Suppose you have the Platonic ideal of an edtech system that leverages all the above cognitive learning strategies to their fullest extent.
Can you just put a student on it and expect them to learn? Heck no! That would only work for exceptionally motivated students.
Most students are not motivated to learn the subject material. They need a responsible adult – such as a parent or a teacher – to incentivize them and hold them accountable for their behavior.
I can’t tell you how many times I’ve seen the following situation play out:
In these situations, here’s what needs to happen:
Even if an adult puts a student on an edtech system that is truly optimal, if the adult clocks out and stops holding the student accountable for completing their work every day, then of course the overall learning outcome is going to be worse.
Before ending this post, I want to drive home the point that the cognitive learning strategies discussed here really do connect all the way down to the mechanics of what’s going on in the brain.
The goal of mathematical instruction is to increase the quantity, depth, retrievability, and generalizability of mathematical concepts and skills in the student’s long-term memory (LTM).
At a physical level, that amounts to creating strategic connections between neurons so that the brain can more easily, quickly, accurately, and reliably activate more intricate patterns of neurons. This process is known as consolidation.
Now, here’s the catch: before information can be consolidated into LTM, it has to pass through working memory (WM), which has severely limited capacity. The brain’s working memory capacity (WMC) represents the amount of effort that it can devote to activating neural patterns and persistently maintaining their simultaneous activation, a process known as rehearsal.
Most people can only hold about 7 digits (or, more generally, 4 chunks of coherently grouped items) in mind simultaneously, and only for about 20 seconds. And that assumes they don’t need to perform any mental manipulation of those items – if they do, then fewer items can be held due to competition for limited processing resources.
Limited capacity makes WMC a bottleneck in the transfer of information into LTM. When the cognitive load of a learning task exceeds a student’s WMC, the student experiences cognitive overload and is not able to complete the task. Even if a student does not experience full overload, a heavy load will decrease their performance and slow down their learning in a way that is NOT a desirable difficulty.
Additionally, different students have different WMC, and those with higher WMC are typically going to find it easier to “see the forest for the trees” by learning underlying rules as opposed to memorizing example-specific details. (This is unsurprising given that understanding large-scale patterns requires balancing many concepts simultaneously in WM.)
It’s expected that higher-WMC students will more quickly improve their performance on a learning task over the course of exposure, instruction, and practice on the task. However, once a student learns a task to a sufficient level of performance, the impact of WMC on task performance is diminished because the information processing that’s required to perform the task has been transferred into long-term memory, where it can be recalled by WM without increasing the actual load placed on WM.
So, for each concept or skill you want to teach:
But also, even if you do all the above perfectly, you still have to deal with forgetting. The representations in LTM gradually, over time, decay and become harder to retrieve if they are not used, resulting in forgetting.
The solution to forgetting is review – and not just passively re-ingesting information, but actively retrieving it, unassisted, from LTM. Each time you successfully actively retrieve fuzzy information from LTM, you physically refresh and deepen the corresponding neural representation in your brain. But that doesn’t happen if you just passively re-ingest the information through your senses instead of actively retrieving it from LTM.
I’ve written extensively on this. See the working draft here for more info and hundreds of scientific citations to back it up.
The citations are from a wide variety of researchers, but there’s one researcher in particular who has published a TON of papers relevant to this question/answer in particular, has all (or at least most) of those papers freely available on his personal site, and has a really engaging and “to the point” writing style, so I want to give him a shout-out. His name is Doug Rohrer. You can read his papers here: drohrer.myweb.usf.edu/pubs.htm
Similarly, there are amazing practical guides on retrievalpractice.org that not only describe these learning strategies but also talk about how to leverage them in the classroom. They’re easy reading yet also incredibly informative. Here are some of my favorites:
Another website worth checking out: learningscientists.org
As far as books, check out the following:
Quite a few people have experienced a sort of “intellectual awakening” thanks to LLMs. In school, they weren’t studious, motivated, or even interested in the material.
But once ChatGPT came out, they started talking to it and eventually ended up asking a bunch of “why/how” questions like a young child might do – e.g., is time travel possible? How does the internet work? What is a neural network?
By chatting with ChatGPT about these topics, they developed an interest and a baseline level of surface-level knowledge about various STEM subjects.
These learners used an LLM to spur interest in STEM subjects and acquire some baseline knowledge. That’s great. But what they’re unable to learn from the LLM is the hard, technical skills and the associated concepts.
(There do exist autodidacts who can teach themselves hard, technical skills just given reference material – but they’re not really who we’re talking about here. We’re focusing on the non-autodidacts who have learned some stuff from LLMs that they would not have learned without LLMs. Autodidacts already have all the information they need on libraries and the internet; LLMs are not a game-changing technology for them.)
Let’s consider a particular one of these learners who used ChatGPT to learn about, well, neural networks and LLMs themselves.
Sure, they can talk about how cool LLMs are and they might know that it’s based on the transformer neural network architecture. They might be familiar with some other architectures like a convolutional neural network, and they might know that convolutional neural networks are often used for image processing.
But can they explain the difference between how different neural network architectures are connected up? Probably not.
Can they talk about the tradeoffs between all the choices for different components of the model including activation functions, loss functions, learning rates, regularization methods, etc? No.
Can they code up a neural network from scratch, including implementing the backpropagation algorithm (which requires applying the chain rule from multivariable calculus)? Heck no.
And if they were given a neural network model with some bug in it, could they figure out what’s going wrong and then fix it? No way in hell.
(Just to give a sense of what would be needed to pull this off: not only would the learner need to be able to verify the backpropagation computations, but they would also have to conceptually understand how different features of the model’s output are indicative of various choices – and issues – in the mathematical machinery under the hood. They would likely need to track statistical distributions throughout the model, and who knows, the bug might not even be in the model itself – the bug might stem from an undesirable statistical property of the data on which the model was trained.)
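To give a concrete sense of the gap, here is roughly the smallest version of the “from scratch” exercise in question: a two-layer network trained on XOR with hand-derived chain-rule gradients. The layer sizes, learning rate, and iteration count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: XOR, the classic problem a linear model cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros((1, 8))  # input -> hidden
W2 = rng.normal(size=(8, 1)); b2 = np.zeros((1, 1))  # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)   # hidden activations
    p = sigmoid(h @ W2 + b2)   # predicted probabilities

    # Backward pass: the chain rule, applied layer by layer.
    # With a sigmoid output and cross-entropy loss, dL/d(output logits) = p - y.
    delta2 = p - y
    grad_W2 = h.T @ delta2
    grad_b2 = delta2.sum(axis=0, keepdims=True)
    delta1 = (delta2 @ W2.T) * h * (1 - h)  # propagate through sigmoid's derivative
    grad_W1 = X.T @ delta1
    grad_b1 = delta1.sum(axis=0, keepdims=True)

    # Gradient descent step
    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1

print(np.round(p.ravel()))  # target: the XOR outputs [0. 1. 1. 0.]
```

Even this toy exercise requires deriving each `delta` term with the multivariable chain rule and getting every matrix shape right – which is precisely the kind of skill that chatting with an LLM about neural networks does not build.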
Can’t our learner just ask the LLM to teach them all the stuff above, including exercises on which to practice?
Not really. The problem is that our learner doesn’t know what to ask for. They don’t know where to start their learning journey, and how to build up an understanding.
This is where the analogy between an LLM and a teacher really starts to break down. Sure, in some sense an LLM is like a teacher because it can respond intelligently to a student’s questions. But on the flipside, all of its responses are contingent upon a query from the student.
Think about what an effective teacher does. Do they just stand up at the front of the class and field questions from students? No. They deliver material in a structured, scaffolded manner so that it actually makes sense to students.
Many times, students’ own questions are not even well-posed – and even when they are, they may not be productive to explore fully given the student’s knowledge.
For instance, it’s common for an expert teacher to respond to a student’s question like this:
Think of a gymnastics coach – if a novice signs up for gymnastics lessons and asks the coach to teach them how to do a backflip, does the coach demonstrate the components of a backflip and ask the student to mirror their movements? Heck no!
Chances are, the student can’t even jump high enough off the ground yet. There are numerous component skills, one of which is explosive jumping strength, that need to be built up before the student has any chance of successfully landing a backflip.
The coach knows this, and breaks down the learning process into a scaffolded journey up the hierarchy of these component skills. The coach also determines what constitutes a sufficient level of mastery to advance beyond each skill, which is another thing that students typically struggle with.
LLMs are kind of like human experts. (Not a world-class expert with years of hands-on experience, but more like a book-smart person who is well-read in every subject.)
But if subject expertise were all that it took to be an effective teacher, then we would expect the most renowned mathematicians to be the best math teachers. Is that the case? Heck no!
Every STEM major in college can count off numerous professors who are true experts in their field but whose students do a poor job of learning the material.
Sometimes, this is due to shortcomings in navigation and scaffolding. For instance, I recently tutored a student who took a Real Analysis course from a professor who did not teach from a textbook; there were no class notes, just problem sets in which abstract problems were rarely preceded by simpler cases. I’m told that most of the class was completely lost, but the professor would take the class’s silence as an indication that they fully grasped everything that was said (whereas in actuality, they were so lost that nobody could even pinpoint a specific thing to ask about).
But other times, even if an instructor excels at explanation, poor learning outcomes can still result from neglecting to manage the entire learning process. This is why you can’t actually learn a subject in proper depth just by watching 3Blue1Brown videos.
It may come as a surprise to many that Richard Feynman – widely known as “the great explainer,” one of the greatest lecturers of all time – also belongs in this category. And that’s not just my opinion. That’s coming from Feynman himself.
According to Feynman himself, his classes were a failure for 90% of his students. In his lectures, Feynman did a phenomenal job appealing to intuition and conceptual thinking, making complex physics feel simple and accessible without getting too deep into the math. On the flipside, however – when it came time to solve actual problems on exams, Feynman’s students failed.
Take it from Feynman himself in the preface to his quantum mechanics lectures:
Additionally, while some may view Feynman-style pedagogy as supporting inclusive learning for all students across varying levels of ability, Feynman himself acknowledged that his methods only worked for the top 10% of his students – and he even went as far as to admit that those were the only students he was actually trying to engage with his teaching.
It’s worth noting that, because Feynman taught at Caltech (which is one of the most selective universities in the world, and possibly the most STEM-focused university in the world), the top 10% of Feynman’s students were really the top 1% of students in general (and that’s a conservative estimate).
Many people who have (unsuccessfully) attempted to apply AI to education have focused too much on the “explanation” part and not enough on the “scaffolding” and “management” parts. Yes, for an AI system to be successful in education, it has to be able to explain things clearly – but as we’ve discussed above, that’s only one piece of the puzzle.
Pitfall #1: Over-engineering the “explanation” component.
It’s easy to go on a wild goose chase building an “explanation AI.” There are endless fascinating distractions.
For example, it’s easy to fall in love with – and get lost in – the idea of the AI having conversational dialogue with the student. But conversational dialogue opens a can of worms of complexity, and it turns out to not even be necessary.
You can create extremely clear explanations by having humans hard-code them – the trick is that you just need to break them up into bite-sized pieces and serve each one to the student at just the right time. And you can tie up the feedback loop by having the student solve problems (so their response is essentially whether they got the problem correct or not) – which is something that they need to be doing anyway.
Sure, hard-coding bite-sized explanations can feel tedious for a human, and it requires the effort of a full team of humans over the course of years, and it’s not as “sexy” as an AI program that comes up with its own responses from scratch – but unlike the conversational dialogue approach, it’s actually tractable. It’s not just a pipe dream. It’s a practical solution.
If you’re willing to put in the time and effort, then you can solve the problem using this practical approach and move on to building the other components of the AI system, which are just as important.
Pitfall #2: Cutting corners on the other components.
It’s also tempting to cut corners on the less sexy (but still crucial) parts, scaffolding and managing the learning process.
Pitfall #2.1: Cutting corners on scaffolding.
It’s very expensive to create a course textbook from scratch. But guess what? Textbooks typically aren’t scaffolded enough for students to learn on their own.
If you’re developing an education AI, then you have to increase the granularity of the curriculum by an order of magnitude so that it can be consumed by students in bite-sized pieces.
And guess what happens when you increase the granularity of the curriculum? You also increase the cost to develop it.
So, unless you manage to secure a lot of funding for your education AI system, you’re probably going to have an under-scaffolded curriculum, which means students are probably going to get stuck at various places within it.
Pitfall #2.2: Cutting corners on managing the learning process.
There are a lot of components within managing the learning process: forcing students to solve problems, responding to a student’s struggles, reviewing previously-learned material so that the student doesn’t forget it, and transitioning from problem-solving in easier contexts (e.g. with a worked example to look back at) to harder contexts (e.g. on a timed quiz with no reference material available), just to name a few.
To provide a single example of an area where it’s tempting to cut corners, let’s focus on the need to respond to a student’s struggles. An AI education system will need some sort of remediation protocol for when a student struggles with a task that they are asked to accomplish.
In such cases, it’s tempting to take the easy way out and lower the bar for success, whereas what the AI system should really do is provide remedial practice to shore up the student’s weaknesses (so that the student develops the ability to clear the bar where it’s at).
A particular example of this that has plagued prior AI systems is allowing students to request so many hints that it renders the problem trivial. In that case, of course students will request maximum hints by default, solve the now-trivial problem, and learn little or nothing from it!
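As a sketch of one way to avoid this pitfall (the hint cap, thresholds, and function name here are hypothetical illustrations, not drawn from any particular system), the fix is to cap hints well below the point of triviality and route repeated failure into remedial practice instead of more hints:

```python
# Hypothetical sketch: cap hints below the point of triviality, and
# respond to repeated failure with remedial practice, not more hints.
MAX_HINTS = 2   # assumed cap; never enough to make the problem trivial
MAX_FAILS = 3   # assumed threshold before remediation kicks in

def respond_to_struggle(hints_used, attempts_failed):
    """Decide how the system responds when a student is stuck."""
    if hints_used < MAX_HINTS:
        return "give_hint"                 # a nudge, not the answer
    if attempts_failed < MAX_FAILS:
        return "encourage_retry"           # student must still clear the bar
    return "assign_remedial_practice"      # shore up the underlying weakness

print(respond_to_struggle(hints_used=0, attempts_failed=1))  # give_hint
print(respond_to_struggle(hints_used=2, attempts_failed=5))  # assign_remedial_practice
```

The key design choice is that no branch ever hands over the solution: past the hint cap, the only paths forward are another genuine attempt or remediation.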
Simply put, an effective AI education system has to do a good job of explaining the content AND managing the learning process.
It needs to start out with a minimal dose of explanation (in which intuition and conceptual thinking do have a place) – but then immediately switch over to active problem-solving.
During active problem-solving, students should begin with simple cases and then climb the ladder of difficulty to cover all cases they could reasonably be expected to demonstrate knowledge of on an assessment.
Assessments should be frequent and broad in coverage, and students should be assigned personalized remedial reviews based on what they answered incorrectly.
Students should progress through the curriculum in a personalized and mastery-based manner, only being presented with new topics when they have (as individuals, not just as a group) demonstrated mastery of the prerequisite material.
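As a rough sketch of what this mastery gating amounts to (the topic graph, topic names, and the 90% mastery threshold are all illustrative assumptions): a topic unlocks for an individual student only once every one of its prerequisites has been mastered by that student.

```python
# Illustrative sketch of mastery-based progression: a student may start a
# topic only after demonstrating mastery of every prerequisite topic.
MASTERY_THRESHOLD = 0.9  # assumed cutoff for "demonstrated mastery"

# Hypothetical prerequisite graph: topic -> list of prerequisite topics
prereqs = {
    "limits": [],
    "derivatives": ["limits"],
    "integrals": ["derivatives"],
}

def eligible_topics(mastery_scores):
    """Topics the student may start, given per-topic mastery scores in [0, 1]."""
    def mastered(topic):
        return mastery_scores.get(topic, 0.0) >= MASTERY_THRESHOLD
    return [t for t, reqs in prereqs.items()
            if not mastered(t) and all(mastered(r) for r in reqs)]

# A student who has mastered limits but not yet derivatives:
print(eligible_topics({"limits": 0.95, "derivatives": 0.6}))  # ['derivatives']
```

Note that the gate is evaluated per student: two students in the same cohort can be working on entirely different frontiers of the topic graph.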
And even after a student has learned a topic, they should periodically review it using spaced repetition, a systematic way of reviewing previously-learned material to retain it indefinitely into the future.
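A minimal sketch of the scheduling rule behind spaced repetition (the doubling factor and the reset-to-one-day rule are illustrative; real systems tune these parameters): each successful review pushes the next review further into the future, while a failed review shrinks the interval back down.

```python
# Minimal spaced-repetition sketch: each successful review doubles the gap
# before the next review; a failed review resets the gap to one day.
# (The doubling rule and one-day reset are illustrative, not prescriptive.)
def next_interval(interval_days, passed):
    return interval_days * 2 if passed else 1

interval = 1
for review in range(3):  # three successful reviews in a row
    interval = next_interval(interval, passed=True)
print(interval)  # 8 -> the next review is scheduled 8 days out
```

The exponentially growing intervals are what make indefinite retention affordable: each topic demands less and less review time as it becomes more firmly learned.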
If a student ever struggles, the system should not lower the bar for success on the learning task (e.g., by giving away hints). Rather, it should take actions that are most likely to strengthen a student’s area of weakness and allow them to clear the bar fully and independently on their next attempt.
It’s easy to create a diagonalizable matrix with prescribed eigenvalues: just put those eigenvalues in a diagonal matrix, and then conjugate it by your favorite invertible matrix.
But for each of those eigenvalues, the algebraic multiplicity equals the geometric multiplicity. How do you create a (non-diagonalizable) matrix whose eigenvalues have specified algebraic and geometric multiplicities?
You can use the same method but with a more general Jordan block matrix instead of a strictly diagonal matrix.
EXAMPLE
Setup
For example, say you want a matrix with an eigenvalue $\lambda_1 = 10$ with algebraic multiplicity $a_1=3$ and geometric multiplicity $g_1=2.$
Just set up your Jordan block matrix like this:

$$J = \begin{pmatrix} 10 & 1 & 0 \\ 0 & 10 & 0 \\ 0 & 0 & 10 \end{pmatrix}$$
Verifying algebraic multiplicity
The characteristic polynomial of this matrix is $\det(J - \lambda I) = (10 - \lambda)^3,$ so the algebraic multiplicity of the eigenvalue $\lambda_1=10$ is $a_1 = 3$ as desired.
Verifying geometric multiplicity
The geometric multiplicity of the eigenvalue $\lambda_1=10$ is the dimension of this eigenvalue’s eigenspace, that is, the dimension of the solution space $\{\, v : (J-10I)v = 0 \,\}.$
Writing $v$ in components as $v = \left< v_1, v_2, v_3 \right>,$ the equation $(J-10I)v=0$ becomes

$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \qquad \text{i.e.,} \quad v_2 = 0,$$
which means the solution space consists of vectors $v=\left< v_1, 0, v_3 \right>$ where $v_1$ and $v_3$ are free variables. The two free variables yield a two-dimensional solution space, as desired.
(At this point it’s easy to see intuitively that the off-diagonal $1$ in $J$ forced $v_2=0,$ i.e., each off-diagonal $1$ in a Jordan form matrix decrements the dimension of the corresponding eigenvalue’s eigenspace.)
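This check can also be automated: the geometric multiplicity equals $n - \operatorname{rank}(J - \lambda_1 I)$. Here is a small sketch in Python (the `rank` helper is hypothetical, written over exact fractions so the rank computation has no floating-point issues):

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix via Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue                         # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]      # move pivot row up
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Jordan matrix with eigenvalue 10: one 2x2 block and one 1x1 block
J = [[10, 1, 0],
     [0, 10, 0],
     [0, 0, 10]]
JmI = [[J[i][j] - (10 if i == j else 0) for j in range(3)] for i in range(3)]
print(3 - rank(JmI))  # geometric multiplicity: 2
```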
Conjugating
Again, you can conjugate $J$ by your favorite invertible matrix and retain the desired properties.
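As a sketch of that claim (the particular $P$ below and its hard-coded inverse are arbitrary illustrative choices), conjugation preserves the rank of $A - 10I$, and hence the geometric multiplicity:

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices over exact rationals."""
    return [[sum(Fraction(A[i][k]) * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def rank(M):
    """Rank of a matrix via Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

J = [[10, 1, 0], [0, 10, 0], [0, 0, 10]]
# An arbitrary invertible P, with its inverse hard-coded for simplicity
P    = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
Pinv = [[1, 0, 0], [-1, 1, 0], [0, 0, 1]]

A = matmul(matmul(P, J), Pinv)  # A is similar to J, but no longer triangular
AmI = [[A[i][j] - (10 if i == j else 0) for j in range(3)] for i in range(3)]
print(3 - rank(AmI))  # geometric multiplicity is still 2
```

The characteristic polynomial, and therefore each algebraic multiplicity, is likewise unchanged by conjugation, so $A$ has exactly the prescribed multiplicities.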
CASE OF MULTIPLE DISTINCT EIGENVALUES
If you want to use multiple eigenvalues, just create a Jordan block for each one.
For instance, suppose you want a matrix with
Here’s the corresponding Jordan block matrix: