I almost never quote a blog post in its entirety, but this one from Dan Meyer is so good that I just can’t bear to cut a single word:
Stephanie Simon, reporting for Reuters on inBloom and SXSWedu:
Does Johnny have trouble converting decimals to fractions? The database will have recorded that – and may have recorded as well that he finds textbooks boring, adores animation and plays baseball after school. Personalized learning software can use that data to serve up a tailor-made math lesson, perhaps an animated game that uses baseball statistics to teach decimals.
Three observations:
One, it shouldn’t cost $100 million to figure out that Johnny thinks textbooks are boring.
Two, nowhere in this scenario do we find out why Johnny struggles to convert decimals to fractions. A qualified teacher could resolve that issue in a few minutes with a conversation, a few exercises, and a follow-up assessment. The computer, meanwhile, has a red x where the row labeled “Johnny” intersects the column labeled “Converting Decimals to Fractions.” It struggles to capture conceptual nuance.
Three, “adores” protests a little too much. “Adores” represents the hopes and dreams of the educational technology industry. The purveyors of math educational technology understand that Johnny hates their lecture videos, selected response questions, and behaviorist video games. They hope they can sprinkle some metadata across those experiences — ie. Johnny likes baseball; Johnny adores animation — and transform them.
But our efforts at personalization in math education have led all of our students to the same buffet line. Every station features the same horrible gruel but at its final station you can select your preferred seasoning for that gruel. Paprika, cumin, whatever, it’s yours. It may be the same gruel for Johnny afterwards, but Johnny adores paprika.
Dan captures most of what I was trying to get at with my rant on the big data hype, but much more clearly and succinctly. Points two and three are the most salient here. First of all, the sort of surface-level analysis we can get from applying machine learning techniques to the data we currently have from digital education systems is insufficient to do some of the most important diagnostic work that real human teachers do. Think about the math classes in which you had to show your work on your homework. Why was that important? Because the teacher needed to see not only what you got wrong but why you got it wrong. Teachers generally don’t just say, “You got three out of five problems involving converting decimals to fractions wrong. Go study some more.” They sit down and work through the problems with the student to find the source of the errors. It’s really hard to get computers to do this well, even in highly procedural domains like math. (Forget about, say, literary analysis.) So in the vast majority of cases, we don’t even try to design systems where students show their work. And without that step-by-step data, no fancy algorithm is going to teach Johnny.
Second, if the problem is that your content isn’t what the student needs, no fancy algorithm is going to fix that either. Videos are a prime example. I know of one textbook publisher whose teacher customers report that students won’t watch the publisher’s videos, but they can and do find videos on the same topic on YouTube and share them with each other. Think about that. Video-based pedagogical support is valuable enough to the students that they will expend energy searching for videos and sharing them. But they reject the expensive, carefully crafted videos from the publisher that are served up to them on a silver platter. It’s not that the publisher-supplied videos are necessarily “bad” in the sense that they have poor production values or are unclear or factually inaccurate. But the students have a particular use in mind for the videos. Maybe they’re struggling with a particular homework problem and just need a quick walk-through of a technique so that they can see the step they are missing, for example. If the video doesn’t fit their needs—both utilitarian and aesthetic—then it won’t get used. Serving it up adaptively isn’t going to solve that problem.
That said, it’s worth taking a little time to break down the different types of adaptive learning analytics into a couple of categories and see just what we should and should not reasonably hope to gain from them.
Topological Analytics
The first category of adaptive analytics is what I call “topological” analytics, by which I mean analytics that look at the surface characteristics of whatever data is in the system. They don’t assume any special content tagging, or really any understanding of the content at all, and they don’t require any special tweaks to the user interface of the software. They’re just sifting through whatever data is already in the system and mining it for insight. Within this genus, there are several species of adaptive analytics.
Early Warning Systems
Most people don’t think about early warning systems as being in the same category as adaptive analytics, but if you consider that “adaptive” really just means “adjusting to your personal needs,” then a system like Purdue’s Course Signals is, in fact, adaptive. It sees when a student is in danger of failing or dropping out and sends increasingly urgent and specific suggestions to that student. It does that without “knowing” anything about the content that the student is learning. Rather, it’s looking at things like recency of course login (Are you showing up for class?), discussion board posts (Are you participating in class?), on-time assignment delivery (Are you turning in your work?), and grade book scores (Is your work passing?), as well as longitudinal information that might indicate whether a student is at risk coming into the class. What Purdue has found is that such a system can teach students metacognitive awareness of their progress and productive help-seeking behavior. It won’t help them learn the content better, but it will help them develop better learning skills.
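To give a concrete (and deliberately oversimplified) sense of what that kind of flagging looks like, here is a minimal sketch in Python. The features mirror the signals listed above, but the thresholds and traffic-light buckets are my own illustrative assumptions, not Purdue’s actual model, which is a proper predictive algorithm built on their institutional data.

```python
# A minimal sketch of early-warning flagging, NOT Purdue's actual Course Signals
# model. Feature names, thresholds, and buckets are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class StudentActivity:
    days_since_last_login: int       # Are you showing up for class?
    discussion_posts: int            # Are you participating in class?
    on_time_assignment_rate: float   # 0.0-1.0: Are you turning in your work?
    current_grade: float             # 0.0-1.0: Is your work passing?


def risk_level(s: StudentActivity) -> str:
    """Bucket a student into green/yellow/red based on simple threshold checks."""
    flags = 0
    if s.days_since_last_login > 7:
        flags += 1
    if s.discussion_posts == 0:
        flags += 1
    if s.on_time_assignment_rate < 0.8:
        flags += 1
    if s.current_grade < 0.7:
        flags += 1
    return ["green", "yellow", "yellow", "red", "red"][flags]


print(risk_level(StudentActivity(10, 0, 0.5, 0.65)))  # "red"
```

Note that nothing in this sketch knows anything about decimals, fractions, or any other content; the adaptivity is entirely about nudging the student’s behavior.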
Black Box Analytics
Then there are those systems where you just run machine learning algorithms against a large data set and see what pops up. This is where we see a lot of hocus pocus and promises of fabulous gains without a lot of concrete evidence. (I’m looking at you, Knewton.) But if you think about where these sorts of systems are employed successfully, it will give you a sense of how they can help and what their limits are likely to be. Probably the best examples that I can think of are the recommendation engines that we see from companies like Amazon or Netflix. It’s easy to imagine a system that can tell you, “students who watched this video did better on the test,” or even “students who found this video helpful also found these other videos helpful.” That could be valuable. But there are a few caveats. First, there is vastly more noise in educational data than there is in shopping data. Particularly in face-to-face classes, the system has absolutely no insight into what happens in class. And even in fully online classes, unless they are massive and therefore taught essentially the same way to many students, you’re just going to get a huge amount of variation across many variables that may or may not be relevant and may or may not be independent of each other. It’s tough to find the signal in all that noise. Second, this sort of approach will only work if there is, in fact, a video (or other content element) that makes a difference in learning outcomes in the set of content that students are seeing. If the content is not sufficiently effective to have that kind of an impact (for whatever reason), then machine learning will tell you nothing. And finally, even if you do find content (or behaviors) that are particularly impactful on learning outcomes, the machine can’t tell you why they are impactful. In the best case, it can highlight the effective resources or learning experiences when it finds them, but it can’t ensure that we are crafting impactful resources and learning experiences in the first place. At best, it can point humans to correlations they should look at for clues.
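For what it’s worth, the mechanics of the “students who found this video helpful also found these other videos helpful” idea can be sketched very simply. The data and names below are invented for illustration, and real recommendation engines use far more robust collaborative filtering; none of this addresses the signal-to-noise problem described above.

```python
# A toy sketch of co-occurrence-based recommendation over helpfulness ratings.
# The data is invented; production recommendation engines are far more robust.
from collections import Counter

# student -> set of videos they marked as helpful (hypothetical data)
helpful = {
    "ana":   {"v1", "v2", "v5"},
    "ben":   {"v1", "v2"},
    "chloe": {"v2", "v3"},
    "dev":   {"v1", "v5"},
}


def also_helpful(video: str, top_n: int = 3) -> list[str]:
    """Rank other videos by how often they co-occur with `video`."""
    counts = Counter()
    for videos in helpful.values():
        if video in videos:
            counts.update(videos - {video})
    return [v for v, _ in counts.most_common(top_n)]


print(also_helpful("v1"))  # e.g., ['v2', 'v5']
```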
Zombie Learning Styles Analytics
One example that we hear a lot—and that Dan mentions in his post—is the one about the kid who learns better from videos (or “adores” video). This is hardly new. Research into learning styles has been going on for decades. And guess what? Nobody’s been able to prove that any particular theory of learning styles is true. I think black box advocates latch onto video as an example because it’s easy to see which resources are videos. Since doing good learning analytics is hard, we often do easy learning analytics and pretend that they are good instead. But pretending doesn’t make it so. Is it possible that machine learning will turn up statistically significant evidence that some students learn better with videos (or short videos, or audio with female narrators, or whatever)? Yes, it is. But I will believe it when I see it.
Semantic Analytics
Separate from the topological adaptive analytics are the semantic analytics, where the adaptivity depends on having some understanding of the content. Unsurprisingly, these types of analytics are harder than topological analytics, because they require the content to be tagged in a way that is useful to the analytic engine. But they can be very effective.
Adaptivity by Learning Objective
This is the one that all the publishers are focused on right now. Basically, you form a bunch of what I call golden triangles—learning objectives linked to related learning content and learning assessment activities. The system flags which learning objectives you need to work on based on your assessment results and points you to the related content. Obviously, this can be a time-saver by helping students skip work that they don’t need to bother with. It also can help students who don’t have good metacognitive skills become aware of what they don’t know. And it certainly can help teachers understand which skills their class is struggling with, which students might work well together because they’re on the same level (or because they’re not), and so on. But again, as Dan pointed out in his post, this level of adaptivity won’t actually teach students the concepts they are struggling with. All it will do is focus them on the topics that they need to learn.
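The data structure behind this kind of adaptivity is not the hard part; the hard parts are tagging the content well and having content worth pointing to. Here is a minimal sketch of the triangle, with made-up objective names, assessment items, and mastery threshold; no real publisher’s schema is this simple.

```python
# A minimal sketch of the "golden triangle": learning objectives linked to
# assessment items and related content. Objectives, items, and the mastery
# threshold are illustrative assumptions, not any particular vendor's schema.

OBJECTIVES = {
    "convert-decimals-to-fractions": {
        "items": ["q1", "q2", "q3"],
        "content": ["decimals-video", "fractions-worked-examples"],
    },
    "compare-fractions": {
        "items": ["q4", "q5"],
        "content": ["number-line-applet"],
    },
}

MASTERY_THRESHOLD = 0.8  # assumed cutoff


def study_plan(item_scores: dict[str, float]) -> dict[str, list[str]]:
    """Point the student at content for every objective scored below threshold."""
    plan = {}
    for objective, links in OBJECTIVES.items():
        scores = [item_scores.get(item, 0.0) for item in links["items"]]
        if sum(scores) / len(scores) < MASTERY_THRESHOLD:
            plan[objective] = links["content"]
    return plan


print(study_plan({"q1": 1.0, "q2": 0.0, "q3": 0.0, "q4": 1.0, "q5": 1.0}))
# {'convert-decimals-to-fractions': ['decimals-video', 'fractions-worked-examples']}
```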
“Inner Loop” Adaptivity
I have written about the inner loop before. (I strongly recommend reading Kurt VanLehn’s paper on Intelligent Tutoring Systems for an overview of this and related topics.) Basically, this is the “show your work” monitoring that I talked about at the top of the post. Intelligent Tutoring Systems do what the adaptive-by-learning-objective systems do, but at a micro level, within a problem that the student is trying to solve. (Hence, “outer loop” and “inner loop.”) They flag the step that the student gets wrong and offer hints related to that particular step to help the student get back on track. Carnegie Mellon’s OLI and Carnegie Learning are probably the two most widely known examples of products with inner loop adaptivity. There is decent and growing empirical evidence that these systems work to teach students, but they are time-consuming and therefore expensive to design, they require relatively high levels of skill to build, and we don’t really know how to do them well for knowledge domains that aren’t either highly procedural or highly fact-driven (writing, for example).
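The contrast between the outer loop and the inner loop is easiest to see in a sketch. What follows is a deliberately crude illustration of step-level feedback on a single equation; real intelligent tutoring systems like the ones above model the space of student solution paths (with production rules or constraints) rather than checking against one canned list of steps, and the expected steps and hints here are invented.

```python
# A deliberately crude sketch of inner-loop ("show your work") feedback for one
# problem, 2x + 3 = 11. Real ITSs model solution paths far more richly; the
# expected steps and hints below are invented for illustration.

EXPECTED_STEPS = [
    ("2x = 8", "Subtract 3 from both sides before you divide."),
    ("x = 4", "Divide both sides by 2."),
]


def check_work(student_steps: list[str]) -> str:
    """Return a hint tied to the first step the student gets wrong."""
    for (expected, hint), attempt in zip(EXPECTED_STEPS, student_steps):
        if attempt.replace(" ", "") != expected.replace(" ", ""):
            return f"Check this step: you wrote '{attempt}'. Hint: {hint}"
    return "All steps look good."


print(check_work(["2x = 14", "x = 7"]))
# Check this step: you wrote '2x = 14'. Hint: Subtract 3 from both sides...
```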
Adaptivity by Activity
This is one of the most interesting and least well-developed areas of adaptive analytics. Rather than adapting content, the idea is to adapt activities. There are examples of this for lower-level cognitive domains like memorization and related language-learning skills. What we don’t have yet is a good vocabulary of the kinds of classroom moves that teachers make in order to match them up against different learning contexts and learning outcomes. This is a topic that merits its own post.
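To make “adapting activities” slightly more concrete in the one area where we do have working examples, here is a sketch of Leitner-style spaced repetition, the kind of activity-level adaptation that memorization and vocabulary tools already do. The box intervals are illustrative assumptions.

```python
# A sketch of Leitner-style spaced repetition: the activity (when to review a
# card) adapts to performance. Box intervals are illustrative assumptions.
from dataclasses import dataclass, field
from datetime import date, timedelta

BOX_INTERVALS = [1, 2, 4, 8, 16]  # days until the next review, by box


@dataclass
class Card:
    prompt: str
    answer: str
    box: int = 0
    due: date = field(default_factory=date.today)


def review(card: Card, correct: bool, today: date) -> None:
    """Promote the card on success; send it back to box 0 on failure."""
    card.box = min(card.box + 1, len(BOX_INTERVALS) - 1) if correct else 0
    card.due = today + timedelta(days=BOX_INTERVALS[card.box])


card = Card("der Hund", "the dog")
review(card, correct=True, today=date.today())
print(card.box, card.due)  # box 1, due two days from today
```

Scheduling flashcards is the easy case, though; we don’t yet have anything comparable for richer classroom moves, which is exactly the vocabulary gap I mean.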
Phil @liketeaching says
Cheers for this. I feel a little smarter now I’ve read it!