Thanks to Audrey Watters I just read a new article in Science Magazine, also publicly posted here by Justin Reich, the lead researcher for HarvardX (Harvard’s implementation of edX and its associated research team)1. Justin calls out the limitations of current MOOC research: its focus on A/B testing, on engagement rather than learning, on single-course contexts, and on post hoc analysis without proper course design. While praising the field for making cleansed data available for any type of analysis, his core argument is that we need new approaches, and that these are problems research teams cannot solve alone.
Update: Added link to publicly-available DOCX article.
While the whole article is worth reading, here are quite a few insightful quotes for those who cannot get past the journal paywall.
- Big data sets do not, by virtue of their size, inherently possess answers to interesting questions.
- We have terabytes of data about what students clicked and very little understanding of what changed in their heads.
- It does not require trillions of event logs to demonstrate that effort is correlated with achievement.
- One reason that early MOOC studies have examined engagement or completion statistics is that most MOOCs do not have assessment structures that support robust inferences about learning.
- Distinguishing between engagement and learning is particularly crucial in voluntary online learning settings, because media that provoke confusion and disequilibrium can be productive for learners.
- Boosting motivation in well-designed courses is good, but if a MOOC’s overall pedagogical approach is misguided, then plug-in experiments can accelerate participation in ineffective practices.
- For the first MOOC researchers, getting data cleaned for any analysis was an achievement. In early efforts, following the path of least resistance to produce results is a wise strategy, but it runs the risk of creating path dependencies.
Some e-Literate Context
This article is a welcome statement from one of the leading MOOC researchers, and it connects with some earlier posts and interactions at e-Literate. In June 2014 I wrote a post contrasting the MOOC research results with the approach taken at the University of Phoenix.
Beyond data aggregated over the entire course, the Harvard and MIT edX data provides no insight into learner patterns of behavior over time. Did the discussion forum posts increase or decrease over time, did video access change over time, etc? We don’t know. There is some insight we could obtain by looking at the last transaction event and number of chapters accessed, but the insight would be limited. But learner patterns of behavior can provide real insights, and it is here where the University of Phoenix (UoP) could teach Harvard and MIT some lessons on analytics. [snip]
UoP recognizes the value of learner behavior patterns, which can only be learned by viewing data patterns over time. The student’s behavior in a course is a long-running transaction, with data sets organized around the learner.2
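To make that contrast concrete, here is a minimal sketch (mine, not UoP’s or HarvardX’s) of the difference between course-level aggregates and learner-centered data over time. It assumes a simple event table with learner id, event type, and timestamp columns; the column names and the weekly bucketing are illustrative only.

```python
# Minimal sketch: contrast course-level aggregates with per-learner time series.
# Assumed columns (learner_id, event_type, timestamp) are illustrative, not the
# actual edX schema.
import pandas as pd

events = pd.DataFrame({
    "learner_id": ["a", "a", "a", "b", "b"],
    "event_type": ["forum_post", "video_play", "forum_post",
                   "video_play", "video_play"],
    "timestamp": pd.to_datetime([
        "2014-06-02", "2014-06-10", "2014-06-20",
        "2014-06-03", "2014-06-25",
    ]),
})

# Course-centric view: one total per event type for the entire course run.
course_totals = events.groupby("event_type").size()

# Learner-centric view: the same events treated as a long-running transaction
# per learner, bucketed by week, so trends over time become visible.
per_learner_weekly = (
    events
    .set_index("timestamp")
    .groupby(["learner_id", "event_type"])
    .resample("W")
    .size()
)

print(course_totals)
print(per_learner_weekly)
```

The first view answers “how many forum posts were there in the course?”; the second can answer “did this learner’s forum activity rise or fall as the course went on?”, which is the kind of question the aggregated edX data release cannot address.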
Two days later I wrote a follow-up post based on commenters speculating that Harvard and MIT might have learning data that was just not released.
Granted, I am arguing without definitive proof, but this is a blog post, after all. I base my argument on two points – there is no evidence of HarvardX or MITx pursuing learner-centered long-running data, and I believe there is great difficulty getting non-event or non-aggregate data out of edX, at least in current forms.
Justin Reich replied in the comments, essentially agreeing about the lack of learner-centered long-running data analysis but disagreeing with my arguments on the effect of MOOC architecture and data availability. This comment from June aligns quite well with the current Science Magazine article.
My research presentation was not exhaustive, although generally belies my belief that we need advances in instrumentation and assessment. Fancy manipulations of predictors (from the click stream) may be limited in value if we don’t have good measures of learning, or a rich understanding of the context of data. But I’m super excited, too, about people doing great work with the edX event log data, and it’ll get out.
It is very encouraging to see the HarvardX team pushing to move beyond clicks-as-engagement and get to actual learning analysis.
Additional Notes
Some additional notes:
- I still maintain that the course-centric transactional design of MOOCs (as with most LMSs) plays a role in the current, limited MOOC research analysis. I have spoken to many MOOC researchers who lament the enormous amount of time it takes to parse JSON files to try to recreate patterns for individual learners over time (see the sketch after this list). While I believe that Harvard, MIT, and Stanford have research teams capable of this extraction, a learner-centered system architecture would do wonders to advance the state of the art for learning analytics.
- As mentioned above, I believe that the standard usage of an LMS in online or blended courses leads to many of the same limitations in learning analytics. You could apply many of Justin’s quotes outside of the MOOC world.
- I wish Justin had moved beyond formal journal and conference proceedings in his references and included the results from the MOOC Research Initiative.3 Although not peer-reviewed, several of these reports addressed the deficiencies he identifies, for example by being discipline-specific and even by considering assessment as part of the MOOC design (as opposed to post hoc analysis). These reports do not negate the points made in the Science Magazine article, but they would have been a useful basis for understanding the current state of research.
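For readers curious about the JSON-parsing chore mentioned in the first note above, here is a minimal sketch of re-keying event logs by learner. It assumes edX-style tracking logs stored as one JSON object per line with username, event_type, and time fields; those field names and the consistent timestamp format are my assumptions for illustration, not a documented schema.

```python
# Minimal sketch: re-key raw tracking-log events by learner so each learner's
# activity becomes an ordered timeline. Field names are assumptions.
import json
from collections import defaultdict

def learner_timelines(log_path):
    """Group tracking-log events into a chronological list per learner."""
    timelines = defaultdict(list)
    with open(log_path) as log_file:
        for line in log_file:
            try:
                event = json.loads(line)
            except ValueError:
                continue  # skip malformed lines instead of aborting the run
            user = event.get("username")
            if not user:
                continue  # anonymous or system events
            timelines[user].append((event.get("time"), event.get("event_type")))
    # Assuming consistent ISO-8601 timestamps, string sort orders each
    # learner's events chronologically.
    for user in timelines:
        timelines[user].sort()
    return timelines
```

Even this toy version hints at why researchers describe the work as tedious: the raw logs are organized around events and courses, so every learner-centered question starts with a pass like this one.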
- Note that Science Magazine access requires a subscription or purchase of the individual article. [↩]
- Note that I based this argument on what UoP claims to be producing internally without being able to validate the results. [↩]
- Disclosure – MRI was funded by the Gates Foundation, which is also a sponsor of the next e-Literate TV series. [↩]