EdSurge’s Tony Wan is first out of the blocks with an Instructurecon coverage article this year. (Because of my recent change in professional focus, I will not be on the LMS conference circuit this year.) Tony broke some news in his interview with CEO Dan Goldsmith with this tidbit about the forthcoming DIG analytics product:
One example with DIG is around student success and student risk. We can predict, to a pretty high accuracy, what a likely outcome for a student in a course is, even before they set foot in the classroom. Throughout that class, or even at the beginning, we can make recommendations to the teacher or student on things they can do to increase their chances of success.
Instructure CEO Dan Goldsmith
There isn’t a whole lot of detail to go on here, so I don’t want to speculate too much. But the phrase “even before they set foot in the classroom” is a clue as to what this might be. I suspect that the particular functionality he is talking about is what’s known as a “student retention early warning system.”
Or maybe not. Time will tell.
Either way, it provides me with the thin pretext I was looking for to write a post on student retention early warning systems. It seems like a good time to review the history, anatomy, and challenges of the product category, since I haven’t written about them in quite a while and they’ve become something of a fixture. The product category is also a good case study in why a tool that could be tremendously useful in supporting the students who need help the most often fails to live up to either its educational or commercial potential.
The archetype: Purdue Course Signals
The first retention early warning system that I know of was Purdue Course Signals. It was an experiment undertaken by Purdue University to—you guessed it—increase student retention, particularly in the first year of college, when students tend to drop out most often. The leader of the project, John Campbell, and his fellow researchers Kim Arnold and Matthew Pistilli, looked at data from their Student Information System (SIS) as well as the LMS to see if they could predict which students were at risk and then influence their behavior. Their first goal was to prevent those students from dropping courses, but ultimately they wanted to prevent them from dropping out.
They looked at quite a few variables from both systems, but the main results they found are fairly intuitive. On the LMS side, the four biggest predictors they found for students staying in the class (or, conversely, for falling through the cracks) were
- Student logins (i.e., whether they are showing up for class)
- Student assignments (i.e., whether they are turning in their work)
- Student grades (i.e., whether their work is passing)
- Student discussion participation (i.e., whether they are participating in class)
All four of these variables were compared to the class average, because not all instructors were using the LMS in the same way. If, for example, the instructor wasn’t conducting class discussions online, then the fact that a student wasn’t posting on the discussion board wouldn’t be a meaningful indicator.
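To make that concrete, here is a minimal sketch, entirely my own illustration rather than Purdue’s actual algorithm, of how those four signals might be normalized against the class average and combined into a crude flag. All of the names, fields, and thresholds are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StudentActivity:
    logins: int            # LMS logins during the period
    assignments_done: int  # assignments submitted
    grade_pct: float       # current grade as a percentage
    posts: int             # discussion board posts


def activity_signals(student: StudentActivity, classmates: list) -> dict:
    """Express each of the four factors as a ratio to the class average.

    A factor is skipped when the class average is zero, because that means
    the instructor isn't using that LMS feature and the student's silence
    there is not a meaningful indicator.
    """
    signals = {}
    for field in ("logins", "assignments_done", "grade_pct", "posts"):
        class_avg = mean(getattr(s, field) for s in classmates)
        if class_avg == 0:
            continue
        signals[field] = getattr(student, field) / class_avg
    return signals


def simple_flag(signals: dict, threshold: float = 0.5) -> str:
    """Flag the student when two or more observed factors fall well below the class norm."""
    low = [name for name, ratio in signals.items() if ratio < threshold]
    return "at-risk" if len(low) >= 2 else "ok"
```

The normalization step captures the point above: when a class average for a factor is zero, the instructor isn’t using that feature, so a student’s silence there tells the system nothing.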
These are basically the same generic criteria that any instructor would look at to determine whether a student is starting to get into trouble. The system is just more objective and vigilant in applying these criteria than instructors can be at times, particularly in large classes (which are likely to be the norm for many first-year students). The sensitivity with which Course Signals would respond to those factors would be modified by what the system “knew” about the students from their longitudinal data—their prior course grades, their SAT or ACT scores, their biographical and demographic data, and so on. For example, the system would be less “concerned” about an honors student living on campus who doesn’t log in for a week than about a student on academic probation who lives off-campus.
In the latter case, the data used by the system might not normally be accessible, or even legal, for the instructor to look at. For example, a disability could be a student retention risk factor, and there are laws governing the conditions under which faculty can be informed of it. Of course, instructors don’t have to be informed in order for the early warning system to be influenced by the risk factor. One way to think about how this sensitive information could be handled is to compare it to a credit score. There is some composite score that informs the instructor that the student is at increased risk based on a variety of factors, some of which are private to the student. The people who are authorized to see the data can verify that the model works and that there is legitimate reason to be concerned about the student, but the people who are not authorized are only told that the student is considered at-risk.
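As a thought experiment, here is how that “credit score” approach might look in code: restricted longitudinal factors contribute to a composite score, but the instructor only ever sees the banded result, never the underlying inputs. This is an illustrative sketch only; the fields, weights, and bands are invented for this post and come from neither Course Signals nor any commercial product.

```python
from dataclasses import dataclass


@dataclass
class StudentProfile:
    activity_ratio: float   # overall LMS activity vs. class average (visible to instructor)
    prior_gpa: float        # institutional record (restricted)
    on_probation: bool      # restricted
    lives_on_campus: bool   # restricted


def composite_risk(profile: StudentProfile) -> float:
    """Blend visible and restricted factors into a single 0-1 risk score (hypothetical weights)."""
    score = 0.0
    score += 0.5 * max(0.0, 1.0 - profile.activity_ratio)     # low activity raises risk
    score += 0.3 * max(0.0, (3.0 - profile.prior_gpa) / 3.0)  # weaker prior record raises risk
    if profile.on_probation:
        score += 0.15
    if not profile.lives_on_campus:
        score += 0.05
    return min(score, 1.0)


def instructor_view(profile: StudentProfile) -> str:
    """Instructors see only the banded signal, not the restricted inputs behind it."""
    score = composite_risk(profile)
    if score >= 0.6:
        return "red"
    if score >= 0.3:
        return "yellow"
    return "green"
```

Authorized staff could audit the full model against outcomes, while everyone else sees only a red, yellow, or green indicator.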
Already, we are in a bit of an ethical rabbit hole here. Note that this is not caused by the technology. At least in my state, the great Commonwealth of Massachusetts, instructors are not permitted to ask students about their disabilities, even though that knowledge could be very helpful in teaching those students. (I should know whether that’s a Federal law, but I don’t.) Colleges and universities face complicated challenges today, in the analog world, with the tensions between their obligation to protect student privacy and their affirmative obligation to help the students based on what they know about what the students need. And this is exactly the way John Campbell characterized the problem when he talked about it. This is not a “Facebook” problem. It’s a genuine educational ethical dilemma.
Some of you may remember some controversy around the Purdue research. The details matter here. Purdue’s original study, which showed increased course completion and improved course grades, particularly for “C” and “D” students, was never questioned. It still stands. A subsequent study, which purported to show that student gains persisted in subsequent classes, was later called into question. You can read the details of that drama here. (e-Literate played a minor role in that drama by helping to amplify the voices of the people who caught the problem in the research.)
But if you remember the controversy, it’s important to remember three things about it. First, the original course-level research was never called into question. Second, the subsequent finding about persistence was not disproven; rather, the follow-up result was null. We have proof neither for nor against the hypothesis that the Purdue system can produce longer-term effects. And finally, the biggest problem that the controversy exposed was with university IR departments releasing non-peer-reviewed research papers that staff researchers have no power to respond to on their own when the papers get criticized. That’s worth exploring further some other time, but for now, the point is that the process problem was the real story. The controversy didn’t invalidate the fundamental idea behind the software.
Since then
Since then, we’ve seen lots of tinkering with the model on both the LMS and SIS sides of the equation. Predictive models have gotten better. Both Blackboard and D2L have some sort of retention early warning products, as do Hobsons, Civitas, EAB, and HelioCampus, among others. There were some early problems related to a generational shift in data analytics technologies; most LMSs and SISs were originally architected well before the era when systems were expected to provide the kind of high-volume transactional data flows needed to perform near-real-time early warning analytics. Those problems have increasingly been either ironed out or, at least, worked around. So in one sense, this is a relatively mature product category. We have a pretty good sense of what a solution looks like, and there are a number of providers in the market right now with variations on the theme.
In a second sense, the product category hasn’t fundamentally changed since Purdue created Course Signals over a decade ago. We’ve seen incremental improvements to the model, but no fundamental changes to it. Maybe that’s because the Purdue folks pretty much nailed the basic model for a single institution on the first try. What’s left are three challenges that share the common characteristic of becoming harder when converted from an experiment by a single university to a product model supported by a third-party company. At the same time, they fall on different places on the spectrum between being primarily human challenges and primarily technology challenges. The first, the aforementioned privacy dilemma, is mostly a human challenge. It’s a university policy issue that can be supported by software affordances. The second, model tuning, is on the opposite end of the spectrum. It’s all about the software. And the third, which is the last-mile problem of getting from good analytics to actual impact, is somewhere in the messy middle.
Three significant challenges
I’ve already spent some time on the student data privacy challenge specific to these systems, so I won’t spend much more time on it here. The macro issue is that these systems sometimes rely on privacy-sensitive data to determine—with demonstrated accuracy—which students are most likely to need extra attention to make sure they don’t fall through the cracks. This is an academic (and legal) problem that can only be resolved by academic (and legal) stakeholders. The role of the technologists is to make the effectiveness and the privacy consequences of various software settings both clear and clearly in the control of the appropriate stakeholders. In other words, the software should support and enable appropriate policy decisions rather than obscuring or impeding them. At Purdue, where Course Signals was not a product that was purchased but a research initiative that had active, high-level buy-in from academic leadership, these issues could be worked through. But for a company selling the product into as many universities as possible, each with differing levels of sophistication and policy-making capability in this area, the best the vendor can do is build a transparent product and try to educate its customers as best it can. You can lead a horse to water and all that.
On the other end of the human/technology spectrum, there is an open question about the degree to which these systems can be made accurate without individual hand tuning of the algorithms for each institution. Purdue was building a system for exactly one university, so it didn’t face this problem. We don’t have good public data on how well its commercial successors work out of the box. I am not a data scientist, but I have had this question raised by some of the folks whom I trust the most in this field. If significant hand tuning is required, then each installation of the product would need a significant services component, which would raise the cost and make these systems less affordable to the access-oriented institutions that need them the most. This is not a settled question; I would like to see more public proof points that have undergone some form of peer review.
And in the middle, there’s the question of what to do with the predictions in order to produce positive results. Suppose you know which students are more likely to fail the course on Day 1. Suppose your confidence level is high. Maybe not Minority Report-level stuff—although, if I remember the movie correctly, they got a big case wrong, didn’t they?—but pretty accurate. What then? At my recent IMS conference visit, I heard one panelist on learning analytics (depressingly) say, “We’re getting really good at predicting which students are likely to fail, but we’re not getting much better at preventing them from failing.”
Purdue had both a specific theory of action for helping students and good connections among the various program offices that would need to execute that theory of action. Campbell et al. believed, based on prior academic research, that students who struggle academically in their first year of college are likely to be weak in a skill called “help-seeking behavior.” Academically at-risk students often are not good at knowing when they need help or how to get it. Course Signals would send students carefully crafted and increasingly insistent emails urging them to go to the tutoring center, where staff would track which students actually came. The IR department would analyze the results. Over time, the academic IT department that owned the Course Signals system experimented with different email messages, in collaboration with IR, and figured out which ones were the most effective at motivating students to take action and seek help.
Notice two critical features of Purdue’s method. First, they had a theory about student learning—in this case, learning about productive study behaviors—that could be supported or disproven by evidence. Second, they used data science to test a learning intervention that they believed would help students, based on their theory of what was going on inside the students’ heads. This is learning engineering. It also explains why the Purdue folks had reason to hypothesize that the effects of using Course Signals might persist after students stopped using the product. They believed that students might learn the skill from the product. The fact that the experimental design of their follow-up study was flawed doesn’t mean that their hypothesis was a bad one.
When Blackboard built their first version of a retention early warning system—one, it should be noted, that is substantially different from their current product in a number of ways—they didn’t choose Purdue’s theory of change. Instead, they gave the risk information to the instructors and let them decide what to do with it. As have many other designers of these systems. While everybody I know of copied Purdue’s basic analytics design, nobody—at least no commercial product developer that I know of—copied Purdue’s decision to put so much emphasis on student empowerment first. Some of this has started to enter product design in more recent years now that “nudges” have made the leap from behavioral economics into consumer software design. (Fitbit, anyone?) But faculty and administrators remain the primary personas in the design process for many of these products. (For non-software designers, a “persona” is an idealized person that you imagine you’re designing the software for.)
Why? Two reasons. First, students don’t buy enterprise academic software. So however much the companies that design these products may genuinely want to serve students well, their relationship with them is inherently mediated. The second reason is the same as with the previous two challenges in scaling Purdue’s solution. Individual institutions can do things that companies can’t. Purdue was able to foster extensive coordination between academic IT, institutional research, and the tutoring center, even though those three organizations live on completely different branches of the organizational chart in pretty much every college and university that I know. An LMS vendor has no way of compelling such inter-departmental coordination in its customers. The best they can do is give information to a single stakeholder who is most likely to be in a position to take action and hope that person does something. In this case, the instructor.
One could imagine different kinds of vendor relationships with a service component—a consultancy or an OPM, for example—where this kind of coordination would be supported. One could also imagine colleges and universities reorganizing themselves and learning new skills to become better at the sort of cross-functional cooperation for serving students. If academia is going to survive and thrive in the changing environment it finds itself in, both of these possibilities will have to become far more common. The kinds of scaling problems I just described in retention early warning systems are far from unique to that category. Before higher education can develop and apply the new techniques and enabling technologies it needs to serve students more effectively with high ethical standards, we first need to cultivate an academic ecosystem that can make proper use of better tools.
Given a hammer, everything looks pretty frustrating if you don’t have an opposable thumb.
Terry Mulcaire says
So, do you think Instructure’s DIG suggests a move towards “learning engineering”? Instructure touts its goal of “linking the academic and corporate/professional spheres.” Do you think they might be turning into a behavior management company?
I teach at a California Community College; the system uses Canvas as its LMS. I am interested in where Instructure is going.
Michael Feldstein says
Honestly, I don’t have an opinion about DIG yet because I know almost nothing about it. I just used the tidbit dropped in the interview as a launching point for the post that I wanted to write. I certainly think it’s extremely premature to conclude that Instructure is “becoming a behavior management company.”
I think Instructure’s CEO has made a couple of vague comments about an unreleased product. I think he seems enthusiastic about that product and, lacking experience with the unusually high levels of sensitivity in this particular market to that particular kind of product, may have made a comment or two that have been interpreted by some as meaning more than he may have intended to say. I think I will wait and see what the actual product is before forming an opinion about it or about Instructure’s intentions. And I think the company has more work to do in terms of articulating its philosophy and policies regarding the use, value, and necessary guardrails for safe use of such technologies.
Jen Ebbeler says
These claims drive me bonkers. First, as you note, an experienced teacher who has taught a course (or type of course) multiple times can generally predict how students are going to do within the first 3-4 weeks. I would venture to guess that, these days, the single biggest predictive factor is attendance/preparation. Just showing up. But also, what no algorithm can take into account is the fact that students are humans and many of them, especially at public institutions, have very complicated lives. Stuff happens unexpectedly. Sometimes they can get back on track, sometimes they can’t. I can tell you that about 75% can’t, but I couldn’t tell you who will or won’t. It seems totally disconnected from any discernible factors apart from their own resilience (and, sometimes, outside factors like needing to graduate). Canvas has been making claims about predictive analytics for a long time and yet it still can’t even tell me when a student is cheating in an online class. You know what would be a great tool? An algorithm that tells me there’s a high probability that a student is cheating on an online quiz. Then I can go in and look at the specific details of how they took the quiz and see whether they were likely cheating. As it is, with 200+ students, I can only spot check (something I do out of curiosity and not to punish them since I can’t “prove” that they were cheating). There are really clear patterns of how a student works through a graded quiz when they are looking things up, though, and that’s something that an LMS ought to be able to ID.
Michael Feldstein says
As I have written about before, Jen, people often have funny intuitions when we start to speak of risks in quantitative terms. As educators, we have a responsibility to be on the lookout for at-risk students, which is not the same as predicting that they will fail. Describing students’ risk levels in quantitative terms doesn’t change that.
As for the quiz cheating function, I agree that it is both feasible and potentially useful. Imagine the reaction from some educators, though, if a vendor’s product put out a statement like, “There is a 76% chance that this student cheated on this test.” Actually, we have some idea of what that reaction would look like. Just ask Turnitin.
These are tricky waters to navigate for anyone, for sure, and particularly for vendors.
John Fritz says
Hi Michael,
Maybe a minor point, but I’m pretty sure from past conversations with the folks who built Purdue’s Signals that it did not “automatically” (based on its predictive algorithm) send a “signal” intervention directly to students. Like the Blackboard interventions you described later in your post, Signals also “gave the risk information to the instructors and let them decide what to do with it.” Now, to Purdue’s credit, they defined options (e.g., red, yellow or green signals) and urged faculty to then issue one of them, but as I noted in my dissertation, Kim Arnold said they found faculty varied significantly in how they did so:
“Arnold acknowledged one concern: a lack of consistent, effective practice among faculty about if, when and how to flip the signal alerting students they are in jeopardy. This is a key point: Signals does not alert students – it alerts the faculty member who then decides whether to display a red, yellow or green alert. Despite Purdue’s sophisticated predictive risk algorithm, how and when faculty flip the Signals intervention switch has varied at Purdue, as it most likely would at any other institution fortunate enough to have it” (p. 46, umbc.box.com/johnfritzdissertation).
This is based on Kim’s 2010 Educause Article at https://er.educause.edu/articles/2010/3/signals-applying-academic-analytics.
Best,
John
John Fritz, Ph.D.
Associate Vice President, Instructional Technology
UMBC Division of Information Technology
Michael Feldstein says
Thanks for the additional info. (By the way, if you’re interested in getting a primer on the state of LMS analytics research, you should read John’s dissertation. Clearly, I need to go back and review it.)
I wouldn’t call this a small point at all. Even at Purdue, where there was a high level of faculty buy-in and coordination, the final decision on the messages was made by faculty. Was that a political decision or a design decision? I’m guessing both, and both are OK. Faculty do need to be active stewards in the process, even when the nature of their stewardship shifts somewhat. So buy-in is important, and oversight is also important. The trade-off is that Purdue lost some control over their ability to test the conditions under which messages had the maximum impact. So by giving faculty more control, they became less able to give those faculty useful information.
That’s the educational research conundrum in a nutshell. Purdue was able to manage this better than most by working really closely with faculty. But it’s really, really hard to do well.
John Fritz says
Thanks Michael, I completely agree. This is hard work, and Purdue’s Signals was a trailblazer. I’m very grateful to John Campbell, Matt Pistilli and Kim Arnold who’ve also been very supportive of our work. In fact, the variability Purdue found in faculty responsiveness to student success predictions, albeit very understandable, is a big reason why we focused on providing students with our “Check My Activity” dashboard showing how their own LMS activity compared to an anonymous summary of course peers across all their current courses. If faculty also post grades in the LMS, then students can see how active they are compared to peers earning the same, higher or lower grade on any assignment. Here’s a brief (5 min) demo from a few years ago: https://youtu.be/rpU1GdvS_yc
We didn’t attempt or accomplish as much as Purdue, in terms of predictive analytics, but tried to scale direct feedback to students as a nudge to encourage responsibility for their own learning. Alas, some did, some didn’t. Human learning remains a wonderful mystery. 😉 Now, we’re focused more on course design as a learning analytics variable, to perhaps account for variation in student LMS use as a proxy for engagement. Still early days, but we’re seeing some compelling results. If it pans out, I think course re-design could still be the most scalable form of analytics-based intervention any institution could pursue.
Jan Day says
I was struck by what you said in the last paragraph of your blog post about the idea of “…colleges and universities reorganizing themselves and learning new skills to become better at the sort of cross-functional cooperation for servicing students.” This is something I and our team at Starfish whole-heartedly agree with. In our work with higher education institutions we find that they recognize that technology isn’t a silver bullet to improving student outcomes. They know they need to do better at cross-campus collaboration. But leaders at these institutions struggle to overcome long-established cultural norms (including pernicious “that’s not my job” attitudes from some) and inflexible organizational structures.
Over the past year we have asked the executive leadership of nearly 100 HE institutions to assess their own readiness to tackle student success challenges. We have found that, on average, institutions regardless of type (2-year, 4-year, public, private, large, small) generally rate themselves highest (in other words, most ready/prepared) in the area of having goals and aspirations for improving student success. Nearly every institution can point to the lofty goals in their strategic plans.
These same institutions rate themselves least ready in terms of being able to define clear plans to move forward and set milestones and goals to reach their aspirations. Part of the reason for this is that they are stuck in their current organizational state and culture. It can be difficult to break out of that mindset to explore other possibilities. One way we do that is through our implementation process. The combination of consulting and implementation services requires coordination across campus to set Starfish up and reap its benefits. As a result, we see a lot of innovative coordination in our client institutions. For example, we ask institutions to inventory all the interventions they are carrying out with students across their organizations and departments. This allows stakeholders to see in one place all the groups that are serving certain types of students, for what reasons, and at which points in the student life cycle. Oftentimes, participants in our highly collaborative and ongoing workshop sessions tell us that these sessions are the first time the stakeholders in student success at their institution have been in the same room together.
thatchmo says
This is a conversation we are having at our community college. Some faculty are raising the issue of whether outreach to students coming directly from advising/student services adversely affects the relationship faculty have with those students, especially when the context of student performance within a particular course isn’t well understood. Crafting the message carefully could be the difference between insult and appreciation toward the institution, as well as between trust and betrayal between teacher and student.