There’s something that drives me a little crazy when I hear about how someone has learned from Netflix, Amazon, Google, etc., and that they’re going to be the Netflix, Amazon, or Google of education. Actually, there are a few things.
There is a natural tendency to want to leverage the work of others. In the burgeoning space of learning analytics many look to those who have large data sets and algorithms to extract some kind of meaning from those massive stores of information. We should always be looking to leverage when we can! This time, though, our takeaways are limited.
What are Our Outcomes?
Amazon, Netflix, etc., have an outcome. Let’s call it “profit” for lack of a better term. They run data analysis to make suggestions, the way Netflix recommends movies to its users. Why? The goal is to show you something you didn’t think of that you’d like, which increases interaction and loyalty, leading to profit. So long as they get that right even 1 in 100 times, it’s worth their while to run the analysis and make the suggestion. For someone doing data analysis, this is a gift: any data mining algorithm wants something to optimize for. Let’s say in this case it’s “$”.
Amazon can ask its data, “I did X, did they give us more $?” Netflix can ask its data, “I did Y, did I get more $?” Google can ask, “I did Z, did I get more $?” Not only are they measuring the outcome directly (more $) but also the inputs (X, Y, Z – whatever changes they introduce to their products).
We don’t have it that easy in education if we want to do it right (assuming we’re not looking just for more $). Measuring learning is hard. As Michael Feldstein discusses in his post A Taxonomy of Adaptive Analytics Strategies, “Since doing good learning analytics is hard, we often do easy learning analytics and pretend that they are good instead.”
I’ll be the first to say there is good work going on in early warning systems based on click data. The outcome is yes/no: does the student stay? They’re not measuring whether students understand the material, just whether they stick it through. Don’t get me wrong – every student kept in school via an early warning system is an achievement. But it’s not the big promise of learning analytics. It’s an achievement of website analytics very similar to those used by the commerce sites: the more a user stays engaged with their sites, the more profit they generate. The comparisons to those kinds of analytics pretty much end there. Unfortunately for those looking for the easy path, our outcomes are complex and the inputs aren’t actually that obvious either. Let’s talk about the real promise of learning analytics.
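To make concrete just how modest that kind of analysis is, here is a minimal sketch of a click-data early warning model. Everything in it is an assumption for illustration – the feature names, the toy numbers, the use of scikit-learn’s logistic regression – and it predicts only whether a student is likely to stay, nothing about what they understand.

```python
# A minimal sketch of a click-data early warning model (toy data, hypothetical
# features). It predicts only "does the student stay?" -- not what they learned.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [logins in last 2 weeks, pages viewed, assignments submitted]
X = np.array([
    [12, 140, 4],
    [ 1,   8, 0],
    [ 7,  60, 3],
    [ 0,   2, 0],
])
y = np.array([1, 0, 1, 0])  # 1 = persisted, 0 = dropped

model = LogisticRegression().fit(X, y)

# Flag students whose predicted probability of persisting is low
new_student = np.array([[2, 15, 1]])
dropout_risk = 1 - model.predict_proba(new_student)[0, 1]
print(f"Estimated dropout risk: {dropout_risk:.2f}")
```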
We need to be measuring learning the entire time in which students are engaging in learning activities and track how we believe those activities should be shaping the experience.
Explanatory vs. Predictive
How does a good teacher know if a student is struggling with a concept in the classroom? We hope that they recognize signs of difficulty while reviewing practice work, or that the student asks for assistance (feedback, hints). If learning analytics are going to provide useful feedback, then we should be measuring that feedback and those requests for help. A click stream tells me whether a student is using the material, but not why, or what that interaction ought to achieve. A student might skip problems they already know; their lack of answers in a particular part of the course is not itself evidence of a lack of understanding. Similarly, a student can struggle while working very hard to understand a concept; their mere frequency of interaction does not in any way imply instructional success. Knowing only their clicks, or visits, tells us nothing about their intent, when they wanted help, whether they got that help, or what feedback they were given (or should have been given). Consider the following questions one might want to ask:
- How often is the student getting questions right on the first try?
- Do they eventually get them correct?
- How often are they asking for help?
- Do expert teachers rate this skill as generally difficult?
Answering these (and many more such) questions requires semantic data that we need to be collecting and cannot collect with a mere click-stream. When Feldstein refers to Semantic Analytics, he is pointing the finger at the algorithms, but the problem is equally the lack of semantic data for those algorithms to take advantage of. What does the difference in that data look like? This is an example of what I mean, with a rough sketch of such records after the list:
- Click-stream:
- Student x clicked y at time z
- With semantic data, we can store:
- Student x has asked for his second hint on part three of the question “What are the five steps of this program?” and he was told “Recall that you need to identify a base case for your function.” The correct answer will be “line 5”. The question is related to the skill “recursive base case” and is often mistakenly answered as “line 4” due to a common misconception
- Click-stream:
- Student x’ clicked y’ at time z’
- With semantic data we can store:
- Student x’ has now selected “line 5” which is correct. Student was given the feedback “You are correct, line 5 is the base case”. This was her third try, though it is the first question about the skill “recursive base case” she attempted to answer even though it’s the third related question in the material.
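One way to picture that difference is as two kinds of event records. This is only a sketch with hypothetical field names, not any particular system’s format; the point is how much instructional meaning the second record carries that the first cannot.

```python
# A click-stream event: all a raw log can tell us.
click_event = {
    "student": "x",
    "target": "y",  # an opaque element id
    "timestamp": "2013-02-12T10:31:07Z",
}

# A semantic event for the same moment: the instructional meaning is explicit.
semantic_event = {
    "student": "x",
    "action": "hint_requested",
    "hint_number": 2,
    "question": "What are the five steps of this program?",
    "question_part": 3,
    "hint_text": "Recall that you need to identify a base case for your function",
    "correct_answer": "line 5",
    "skill": "recursive base case",
    "common_misconception": "line 4",
    "timestamp": "2013-02-12T10:31:07Z",
}
```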
Not only do we need data about interaction, but we need the content itself to identify these interactions in meaningful ways. If nobody tells the system what the hints and feedback mean, what skills the targeted interaction is meant to address, etc., the algorithms can’t make any reasonable estimations of learning. It’s beyond the capacity of a simple algorithm to identify what these clicks mean without the guidance of better data. We can go further:
- When do students ask for a specific hint and what do we know about the misconception they’re exhibiting?
- Does one hint for a particular question provide enough guidance or are more hints needed?
- After selecting an answer and being told “That’s not quite right, because…” do they then answer correctly?
To answer these kinds of questions you have to have a design process that not only creates these targeted hints and feedback, but also allows the system to semantically record each and every selection students make as they interact with the material. Only then can you ask the questions above of the data you’re collecting with any hope of a meaningful result. If there are no targeted hints that students can ask for, if there is no targeted feedback, if there is no well-designed question, there is no semantic data.
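With events recorded that way, questions like these become simple queries. The sketch below is illustrative only: it assumes a list of semantic records shaped like the hypothetical ones above, ordered by time, and that a wrong submission always triggers the question’s targeted feedback. It answers the last question in the list: after being told “That’s not quite right, because…”, do students then answer correctly?

```python
def recovery_after_feedback(events, question):
    """Of attempts made right after a wrong answer (and its targeted feedback),
    what fraction are correct? Assumes 'events' is ordered by time."""
    recovered, total = 0, 0
    had_feedback = set()  # students whose previous attempt was wrong
    for e in events:
        if e.get("question") != question or e.get("action") != "answer_submitted":
            continue
        student = e["student"]
        if student in had_feedback:
            total += 1
            recovered += 1 if e["correct"] else 0
            had_feedback.discard(student)
        if not e["correct"]:  # wrong answer -> targeted feedback shown
            had_feedback.add(student)
    return recovered / total if total else None

# e.g. recovery_after_feedback(events, "What are the five steps of this program?")
```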
What can we do when we are empowered with this sort of semantic data and analysis? Here are just some examples:
- Provide real-time feedback to teachers about how groups of students and individuals are performing while they learn before summative exams or projects arise
- Provide guidance on where specifically in the course students are struggling
- Use semantic analyses like learning curve analysis to identify areas where content needs to be improved
This last one in particular is important. A lot of the time we are tempted to assume that whatever content we create simply works. This is true even in the traditional classroom where a teacher prepares and delivers a lecture. Was it an effective lecture? We might be able to decide whether students liked the lecture, but we have the same data collection problem as with the click stream: all we can reliably know is that the student received the lecture. In reality we want to know in which areas we are succeeding at imparting knowledge, and accept that not everything is great right at the start. Without semantic data and algorithms you’re forced to assume effective content and that, when the student receives it, it did what it was supposed to (whatever that was) effectively (whatever that means). With so many variables, we must make assertions that can then be verified or debunked by analyses.
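As an example of that last point, here is a rough sketch of a learning curve analysis over the same kind of hypothetical semantic records. For one skill, it computes the error rate at each successive practice opportunity across students; a curve that never falls is a sign the content for that skill is not doing its job. (Field names and record shape are assumptions carried over from the sketches above.)

```python
from collections import defaultdict

def learning_curve(events, skill):
    """Error rate at each successive practice opportunity for one skill.
    Assumes 'events' is ordered by time and tagged with skills."""
    attempts_so_far = defaultdict(int)  # per-student attempt count on this skill
    errors = defaultdict(int)           # wrong answers at each opportunity number
    totals = defaultdict(int)           # all answers at each opportunity number
    for e in events:
        if e.get("skill") != skill or e.get("action") != "answer_submitted":
            continue
        attempts_so_far[e["student"]] += 1
        n = attempts_so_far[e["student"]]
        totals[n] += 1
        if not e["correct"]:
            errors[n] += 1
    return {n: errors[n] / totals[n] for n in sorted(totals)}

# e.g. learning_curve(events, "recursive base case")
# might return {1: 0.6, 2: 0.4, 3: 0.25}: errors falling as practice accumulates.
```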
It’s tempting to try to work around this metric problem by using a summative evaluation as the metric, i.e., if they pass this test, then clearly all the stuff before must have done what we wanted. (And even then, if they don’t pass, there’s no good information on why they didn’t.) This is not much better than saying the SAT is an accurate reflection of an individual’s skills as a whole and of their educational experience up to exam time. We want an approach that utilizes the formative work of students to give us insights. If our materials are working, then the formative work ought to simply give the same results and the summative ought to be perfunctory.
Impact of Errors in Analytics
The ultimate goal of learning analytics must be providing actionable feedback that can be given to students and instructors during the run of a course. This is already possible technologically and pedagogically. As this methodology becomes more readily available, it will become expected. This isn’t a pipe dream. It mandates moving beyond the goal-post metrics of summative exams and click-stream data, which at best let us look back and say “we weren’t entirely successful in supporting students last semester, so we’ll take a guess this semester at improving things for students next semester and see what happens.” We can and ought to do better.
But this has to be done with a high degree of accuracy. If Amazon gives me a recommendation I don’t agree with (which it actually does fairly often for me), is there any real harm? If I buy some gifts for a friend and Amazon uses that to suggest other purchases for me that I don’t want, I don’t leave Amazon because I see a poor recommendation. When they make a particularly good one, their profit metric goes up a tick. No harm, no foul, and so it is worth Amazon’s while to make the occasional poor recommendation in order to capitalize on the good ones.
What if we’re not good at these predictive models? It’s not as neat and tidy as recommendations. If we mistakenly identify a student as at risk for dropping out, the two possible negative effects are that we intervene with someone who is not actually at risk (not a terribly bad thing) or we miss an at-risk student we would otherwise hope to identify (not optimal, but no different from if we weren’t trying).
Now what happens if we tell a student they aren’t achieving learning outcomes when in fact we are wrong about that? The potential for demotivating the student comes at a high cost. This could happen with errors in reporting the other way, as well. If learning analytics inform a student they are succeeding but in fact they are not prepared for their next exam or job, the disservice is just as bad. Getting learning analytics wrong on the learning dimension is a recipe for disaster and must be done carefully and with understanding. Without that semantic ability to understand what is happening, we won’t even know if we’re doing harm to our students by using algorithms to optimize for things we don’t understand.
Summary
The way to take advantage of online learning technologies has to include timely, reliable, prescriptive information for those engaged in the learning experience (students and their instructors), as well as a rich semantic data set that lets learning engineers continually improve those resources. The only way to do this is to have algorithms and data with an explanatory capability that can give guidance to each user group on what to do next. This means developing rich models that encapsulate the intent of online learning content, and well-instrumented learning environments that provide large sets of meaningful data to feed these analyses. Click streams provide retention data, and this is being used successfully. Now we need to recognize that the next step is in fact a harder one to take, but well worth it for everyone involved.
Alfred Essa says
Bill, Great post. Look forward to reading your contributions in eLiterate.
Srujan says
Excellent post Bill!
srinivas vedantam (@sencbull) says
Great post! An eye opener of sorts, not just for the suggestions but also for bringing out the truth.
Peter Shea (@pshea99) says
Terrific post here, thanks for the insights. One quibble – in part you write:
“If we mistakenly identify a student as at risk for dropping out, the two possible negative effects are that we intervene with someone who is not actually at risk (not a terribly bad thing)…”
I’d argue that the impact is far worse and is already a major contributor to dropout, especially in the community college setting, which suffers from attrition (for a variety of reasons) far higher than four-year colleges. I’d argue that part of the problem is tied to “bad” analytics and “bad” interventions tied to these. When “Accuplacer” diagnoses a student as in need of remediation (which already sounds Orwellian when you think about it), the student is frequently condemned to take a non-credit-bearing remedial course that research shows often derails the student from momentum and persistence in college. The current enthusiasm for analytics has the very real potential to amplify this unfortunate state of affairs (at least in the short run).
Michael Feldstein says
Good point about Accuplacer, Peter, but I believe that Bill was specifically referring to retention early warning analytics of the type that say, “Student X hasn’t logged into the LMS in two weeks, has failed two assignments, and is on academic probation–somebody better call her right away.” That’s not the same as using automated assessment for placement purposes, which I agree can be very bad indeed when it goes wrong.
Peter Hess says
I can see (I think) how analytics might play a role, but if generating models and collecting data requires a great deal of effort and thought to get a narrow result (applicable to a few situations, a small cohort of students, or one teacher), I wonder if it’s worth the candle. The attempt to impose technology on education (esp K12) has come (IMO) with huge and mostly unrecognized opportunity costs, while the most widely adopted (and, by that measure, most useful) applications of technology have often emerged organically and haphazardly and largely against the grain of those of us who think about these things in an idealistic or abstract way.
Peter Hess says
To make clearer the meaning behind “applications of technology [for education] have often emerged organically and haphazardly and largely against the grain of those of us who think about these things in an idealistic or abstract way”, some examples:
YouTube
Wikipedia
Google search
The LMS
Facebook (I cringe to say it)
MS Office
Google Docs
iTunes
WordPress
Alfred Essa says
Bill and Michael are to be commended for elevating the discussion on analytics and raising a rich set of themes for us to think about and debate. It’s timely and much needed.
I would like to comment briefly on the theme of semantic analytics. I couldn’t agree more with Bill and Michael that learning analytics will not advance significantly unless we incorporate a rich semantic layer. Michael has also correctly emphasized on numerous occasions that this is a much harder problem than we might think.
I believe that Bill and Michael’s insight is more general and deeper. In developing the Student Success System at Desire2Learn, for example, we have tried to incorporate an extensible semantic layer from the outset as an architectural principle. A core feature set of the Student Success System is an early warning system. We have taken the Signals approach as a starting point but extended it with a rich set of diagnostics and visualizations. The diagnostics and visualizations are in turn enabled by a semantic layer.
Our design approach is described in a paper published as part of the LAK’12 proceedings:
http://dl.acm.org/citation.cfm?id=2330641&preflayout=tabs
In short, we are in complete accord with Bill and Michael that semantics needs to reach wide and deep in any learning analytics solution. It also needs to be a bedrock architectural principle.
different Dave says
It’s also worth noting that the conversion process in ecommerce is almost infinitely more simple than the learning process, in a way that lends itself to data-driven decision making. Amazon has millions of customers who all basically take the same steps when making a purchase, and within a short amount of time. Amazon can theoretically pull a small percentage out and do A/B testing, and have data about the effect on conversion rates within the hour.
Learning and school just don’t work that way. The path from not knowing to expertise is unique for each student, and complicated, and time-consuming. A true A/B test in education would require weeks of prep, work, and evaluation, and permission slips, and even then your data wouldn’t be anywhere near as solid as Amazon’s. It’s unavoidably an apples to oranges situation.
Alfred Essa says
different Dave, you make a good point. However, I don’t think you can generalize that what Amazon is trying to do is easier than what we need to do in Learning. It all depends on what we are trying to measure and what we are trying to predict.
Take a look at the following video by Eric Mazur at Harvard.
The link should take you to the point in the video where Mazur is framing an A/B type test to measure the efficacy of doing demonstrations in a science class. The technical aspects of Mazur’s experiment are not any more complex than what Amazon is doing. The results of Mazur’s simple experiment, however, are quite stunning.
http://youtu.be/aYiI2Hvg5LE
The video is worth watching in its entirety. Mazur has been applying the “flipped classroom” approach for nearly a decade. He backs up his claims with data.
Peter Hess says
I watched the whole Eric Mazur video. Too bad it was truncated at the end. It’s hard to miss the irony that someone crusading against the lecture is such an effective lecturer. Anyway, I think the points Mazur makes about scientific demonstrations and cultural bias, while both valid and important, are very narrow from a learning analytics perspective.
No doubt decision making from data is a good thing, though we shouldn’t miss that data can come in many forms. You can make the following important inferences simply from widespread use of Platform X (if you need a concrete example think of Skype or Google Docs): it serves people’s needs to a significant degree; there is a large base of users who are already capable of using it.
Nate Silver and Philip Tetlock have popularly demonstrated that one person, asking very incisive questions (and the questions are key) using readily available data sets and sophisticated methods of analyzing them, can make very significant predictions. Which is not so different from what Mazur demonstrated.
What worries me, and what I feel like I am seeing, is that learning analytics is being anointed as the magic potion du jour, and the probable result, if that is true, is that it will be overused and carelessly applied, and will divert substantial resources from more conventional and more proven approaches to the problem of improving educational results.
Alfred Essa says
Peter,
I agree with your overall sentiment. It’s also consistent with what Michael has emphasized in a series of recent posts, namely that we shouldn’t peddle “big data” as the solution to every problem in education.
I cited Mazur’s work to support that very point. I also agree with your assessment of Silver and Tetlock. It’s another example of the advantageous use of “small data”. (Note: I believe that a) there is a certain class of problems in education that can only be solved by “big data” and b) the solutions need not be expensive. But we can debate that separately.)
I was intrigued by your final comment and wonder if you can elaborate. Can you say more about what you mean by “conventional and more proven approaches to the problem of improving educational results”?
I think one of the things that’s lacking in the educational community is consensus about what works and what doesn’t work. We need to do a better job of understanding and disseminating good educational practices substantiated by good educational data. Perhaps eLiterate can be one of the forums where we can begin to have that discussion.
Peter Hess says
Hi Al,
Thanks for your thoughtful comments.
Re: Can you say more about what you mean by “conventional and more proven approaches to the problem of improving educational results”?
It was rhetorical overreach, as I didn’t and don’t have specifics in mind. I should have omitted the reference. I think there is a lot of rhetorical overreach, too, in the general discussion of learning analytics: much arm waving and a good measure of unfocused thinking.
In the quest for clarity, Bill Jerome’s distinction between Explanatory v. Predictive is interesting, but I think not quite right. There is too much overlap between what’s explanatory and what’s predictive. My inclination is to borrow from statistics terminology and use the categories “descriptive” and “inferential” to draw what I think is an important dichotomy.
The “early warning systems” Bill talks about, or anything that tracks performance – whether large or fine grained – of a particular individual or group (e.g. a class), falls on the descriptive side. People were doing that sort of analysis long before technology was available to assist with it.
On the inferential side are questions like “What does someone need to know from domain A to be a successful learner in domain B?”; “What mental models lead to successful transfer of learning?” (thanks, Eric Mazur); “How can formative assessment be constructed to achieve desired learning outcomes?”. (Because the questions are hard and my understanding is limited, I’m sure much better examples could be offered.) These all rely on aggregate data and imply experimental design; they would not flag the need for immediate intervention with individuals or groups.
I’m all for the idea that eLiterate be a forum for serious discussion of learning analytics. I hope that it continues.
Michael Feldstein says
We will try our best, Peter. I’m pleased to have Bill as a contributor, and I’m also pleased to have all of you engaged in such a rich conversation.