Speaking of efficacy and the complexity of measuring it, I had an interesting conversation the other day with Danae Hudson, a professor of Psychology at Missouri State University, about a course redesign effort that she participated in. The initial contact came from a P.R. firm hired by Pearson. Phil and I get a lot of these and turn down most of them. This one interested me for several reasons. First, it was an opportunity to talk directly to a faculty member who was involved in the project. (A tip to all you P.R. folks out there: You will have a better chance of getting our attention when the focus of the call is to put us in direct contact with your customers about what they are achieving.) Second, the project was facilitated by The National Center for Academic Transformation (NCAT). I am a big fan of NCAT’s work, despite the fact that they seem to have an almost pathological urge to sabotage efforts to bring their work the attention that it deserves. Pearson’s interest in the call was that MyPsychLab was an integral part of the course redesign. My interest was to see what I could learn about the interaction between educational technology products and educational practices in delivering educational efficacy.
What I heard tended to confirm my suspicions (and common sense): Educational technology products can produce significant learning gains, but they often do so by supporting changes in classroom practices.
The Goals
Like all NCAT redesign projects, this one has a complete write-up on the NCAT site. The document summarizes the redesign context and goals as follows:
Introductory Psychology is a semester-long, general education course at Missouri State University (MSU). The course falls within the self understanding/social behavioral perspective area of general education and is by far, the most popular choice for students within that area. Each academic year, at least 18 traditional face-to-face sections are offered with a total enrollment of 2,500-2,700 students. The course is lecture-based and typically taught by 65% full-time faculty and 35% adjunct instructors. While there are common general education goals across all sections, each instructor makes individual choices of content and delivery.
Despite being a popular choice among students, Introductory Psychology has traditionally experienced a high DFW rate (approximately 25%). The department wants to find ways to develop a more engaging course that will result in improved student learning outcomes and student satisfaction. Due to the large enrollment and numerous sections offered throughout the year, a significant number of adjunct instructors teach the course, which has contributed to some course drift and grade inflation. Currently, each section of 153 students is taught by one instructor, which significantly limits the type of activities that can be assigned and graded. The vast majority of the final course grade is derived from a series of multiple-choice exams. The goal is to redesign the course to be much more engaging and interactive, with an emphasis on true mastery of the course material.
To sum up: We have a popular Gen Ed course with a high failure and withdrawal rate. Danae also told me that the psychology department had long delivered a formative exam at the beginning of the class, and that they were unhappy with the level of improvement students were showing between the formative and summative exams. The faculty wanted to improve those numbers by making the course “more engaging and interactive, with an emphasis on true mastery of the course material.”
This is typically where we start hearing that teaching effectively is expensive. But NCAT has a strong track record of proving that to be false. It turns out that ineffective teaching methods are usually inefficient as well. Let’s pause and think about the formulation of that last sentence for a moment. It’s not always the case that effective teaching methods are cost-efficient. Of course we know that good seminars with low teacher/student ratios can be very effective but, to adopt the current parlance, “don’t scale.” In that situation, there is a tension between effectiveness and efficiency. But despite appearances, some traditional styles of teaching—most notably the classic ginormous lecture class—are both ineffective and inefficient. Why is that so? For several reasons. First, both the physical plant and the labor structure of the large lecture class limit its ability to scale. If you run out of lecture hall seats, or you run out of TAs, you have exhausted your ability to increase the number of students taught with the faculty that you have. The central innovation of video-based xMOOCs is that they remove this limitation without changing the underlying pedagogical model of the large lecture.

Second, and more fundamentally, cost and effectiveness form a two-way street in education. In my last post, I discussed David Wiley’s argument that the cost of curricular materials impacts effectiveness insofar as cost limits student access to those materials. But it goes the other way too. There is a cost for every student who fails or withdraws from a class and therefore has to retake it. The direct cost is the tuition paid for two classes rather than one—a cost paid by financial aid providers in addition to the student—while the indirect costs include the increased chance that the student will have to stay an extra semester or drop out altogether, as well as the knock-on effect of the student blocking a seat for another student in an enrollment-capped but graduation-required course. NCAT typically doesn’t even look at these indirect costs, yet it is often able to find significant direct cost savings by restructuring courses away from ineffective pedagogical approaches toward more effective ones that also happen to be more scalable. In MSU’s case, they projected that they would be able to lower the direct cost of the course by 17.8% while still achieving the primary goal of increasing effectiveness. The NCAT report notes,
The cost savings will remain in the psychology department and will be used to provide support for the redesigned course in the future, faculty wishing to take on additional course redesign projects and faculty travel to present at conferences related to the scholarship of teaching and learning.
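To make the retake-cost point above concrete, here is a minimal back-of-the-envelope sketch in Python. Only the enrollment range and the roughly 25% DFW rate come from the NCAT write-up quoted earlier; the credit hours and per-credit tuition are hypothetical placeholders, not MSU figures.

```python
# Hypothetical illustration of the direct cost of course retakes.
# Only the enrollment range and the ~25% DFW rate come from the NCAT
# write-up; every other number is a placeholder, not MSU data.

def retake_cost(enrollment, dfw_rate, credit_hours, tuition_per_credit):
    """Tuition paid a second time by (or on behalf of) students who
    fail or withdraw and must repeat the course."""
    repeaters = enrollment * dfw_rate
    return repeaters * credit_hours * tuition_per_credit

if __name__ == "__main__":
    cost = retake_cost(
        enrollment=2600,         # annual enrollment of 2,500-2,700 (NCAT write-up)
        dfw_rate=0.25,           # ~25% DFW rate (NCAT write-up)
        credit_hours=3,          # hypothetical
        tuition_per_credit=200,  # hypothetical per-credit tuition, in dollars
    )
    print(f"Direct retake cost (tuition alone): ${cost:,.0f}")
    # Indirect costs -- extra semesters, attrition, seats blocked for other
    # students -- come on top of this and are much harder to quantify.
```

Even with modest placeholder numbers, a quarter of a 2,600-student enrollment retaking a three-credit course runs well into six figures of tuition before any of the indirect costs are counted.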
But How?
MSU decided to redesign its course around what NCAT calls “the Replacement Model,” which can be thought of as a combination of flipped and hybrid. At this point most people have at least a basic idea of what “hybrid” means, but “flipped” is a lot less clear. The Chronicle of Higher Education recently published a column by Robert Talbert highlighting a group that is trying to establish a definition and best practices around what they call “flipped learning,” which they describe as follows:
Flipped Learning is a pedagogical approach in which direct instruction moves from the group learning space to the individual learning space, and the resulting group space is transformed into a dynamic, interactive learning environment where the educator guides students as they apply concepts and engage creatively in the subject matter.
That’s it in a nutshell: Move direct instruction (i.e., lectures) out of class time so that there can be more direct student interaction time. Which sounds great, but it leads to a frequently asked question: If students have to do all the homework they were doing before plus watch all the lecture videos at home, isn’t that going to dramatically increase the amount of time they have to spend on the class? How can they do all of that work? NCAT’s answer is that you give them back some of that time by making the class “hybrid” in the sense that you reduce their in-class seat time by 50%. That’s why it’s called the “Replacement Model.”
While Danae never used the term “flipped learning,” she did talk about the flipped classroom, and she made it very clear that the point was to increase the amount of class time spent interacting with students and focusing on their particular needs. But the MSU plan called for decreasing class time by 50% while doubling the number of students per class from an average of 153 to 300. How was that supposed to work?
Part of the answer lies in using traditional techniques like group projects, but a lot of it is in using data to provide students with more feedback and fine-tune the classroom experience. This is where Pearson comes in. I wrote a while back that the promise of adaptive learning programs is to transform the economics of tutoring:
The simplest way to think about adaptive learning products in their current state is as tutors. Tutors, in the American usage of the word, provide supplemental instruction and coaching to students on a one-on-one basis. They are not expected to know everything that the instructor knows, but they are good at helping to ensure that the students get the basics right. They might quiz students and give them tips to help them remember key concepts. They might help a student get unstuck on a particular step that he hasn’t quite understood.  And above all, they help each student to figure out exactly where she is doing well and where she still needs help.
Adaptive learning technologies are potentially transformative in that they may be able to change the economics of tutoring. Imagine if every student in your class could have a private tutor, available to them at any time for as long as they need. Imagine further that these tutors work together to give you a daily report of your whole class—who is doing well, who is struggling on which concepts, and what areas are most difficult for the class as a whole. How could such a capability change the way that you teach? What would it enable you to spend less of your class time doing, and what else would it enable you to spend more of your class time doing? How might it impact your students’ preparedness and change the kinds of conversations you could have with them? The answers to these questions are certainly different for every discipline and possibly even for every class. The point is that these technologies can open up a world of new possibilities.
This is exactly how MSU is using MyPsychLab. One of the biggest benefits that Danae cited was being able to walk into a class knowing what students were doing well with and what they were struggling with. This enables her and her colleagues to focus on the topics that those particular students need the most help with in class while simultaneously communicating to the students that their teacher is aware of how they are doing and what they need. Likewise, she said that the students are coming to class more engaged with better questions. MSU also uses clickers in class to augment the feedback loop that they are getting from the homework platform. This certainly was a critical enabler at a class size of 300 and would be useful in a significantly smaller lecture class as well.
Did it work? The results are very positive overall, but mixed:
- On the 30-item comprehensive exam, students in the redesigned sections performed significantly better (84% improvement) compared to the traditional comparison group (54% improvement).
- Students in the redesigned course demonstrated significantly more improvement from pre to post on the 50-item comprehensive exam (62% improvement) compared to the traditional sections (37% improvement).
- Attendance improved substantially in the redesigned section. (Fall 2011 traditional mean percent attendance = 75% versus fall 2012 redesign mean percent attendance = 83%)
- They did not get a statistically significant improvement in the number of failures and withdrawals, which was one of the main goals of the redesign, although they note that “it does appear that the distribution of A’s, B’s, and C’s shifted such that in the redesign, there were more A’s and B’s and fewer C’s compared to the traditional course.”
- In terms of cost reduction, while they fell short of their 17.8% goal, they did achieve a 10% drop in the cost of the course.
Intuitions and Open Questions
The study of the course redesign was intended to measure the overall impact of the effort rather than to research the components of efficacy, which means that we don’t have good data from which to draw strong conclusions on the most interesting questions in this regard. But I’m not afraid to make some guesses, and I asked Danae to do the same with me. To be clear, her first answer to any of the questions I’m going to bring up in this section of the post was consistently along the lines of, “I don’t have data that speaks to that question.” Which is the right answer. I want to be clear that wherever I reference her opinions here, it was in this context and that she was appropriately tentative.
First of all, what role did MyPsychLab play in the improvements? Here we have at least one hard number:
A significant portion of the redesigned course utilized publisher-customized digital learning technology. A correlation was calculated between the students’ online total score of assigned material and the total of five exam scores. This correlation was .68, p < .001 suggesting a strong relationship between the completion of online learning activities and exam performance.
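For readers who want to see what sits behind that single number, here is a minimal sketch of how such a Pearson correlation would be computed. The toy data frame and its column names are invented for illustration; this is not MSU’s dataset or analysis code.

```python
# Minimal sketch of a Pearson correlation between online homework totals
# and exam totals. The data and column names are invented for illustration;
# this is not the MSU dataset or NCAT's analysis code.
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical per-student totals (one row per student).
scores = pd.DataFrame({
    "online_total": [88, 72, 95, 60, 81, 77, 90, 55],
    "exam_total":   [420, 350, 455, 310, 400, 365, 440, 295],
})

r, p = pearsonr(scores["online_total"], scores["exam_total"])
print(f"r = {r:.2f}, p = {p:.3g}")
# An r of .68 with p < .001, as in the report, indicates a strong positive
# association, but it cannot say whether the homework drove the exam gains
# or whether stronger students simply do more of both.
```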
But why? Obviously, practice is part of the equation. Students who do the homework tend to do better in classes in general. I asked Danae what she thought the big drivers were beyond that. She cited the feedback to faculty and student engagement. The product seemed to succeed in getting students engaged, from her perspective. When pressed about the degree to which the adaptive component of the product made a difference, she guessed that it wasn’t as big a factor. “My gut tells me that it is less about the personalization,” she said. But then she added that the personalization may have helped to drive student engagement by making the students feel like the content was tailored to their needs. “I think personalization is the part that appeals to the students.” This raises the question of the degree to which any gains we attribute to an adaptive product may be due to a digital…er…analog of teaching presence, as opposed to the software’s actual ability to adapt to individual student needs and capabilities.
Second, I asked Danae to characterize how much she thinks adopting MyPsychLab would have driven improvements had it been added to the original class before the redesign. Her instinct was that it would not have helped nearly as much, which is my instinct too. We don’t have numbers to separate the impact of the practice from the impact of the tailored instruction that resulted from having the student data in the product. Nor do we know how much student engagement with the product was affected by the fact that it was integrated into the whole course redesign. These would be important questions to answer before we can have a clear and nuanced answer to the question of the product’s “efficacy.” Efficacious under what circumstances?
Finally, I’d like to return to David Wiley’s question about cost as a proxy for access and its impact on efficacy. Danae was traveling and didn’t have access to the course materials cost information when I reached her by email today, but she was confident that the cost had not gone up significantly and thought it might have actually gone down post-redesign. (And no, Pearson did not subsidize the cost of MyPsychLab to the students.) So we have no before/after data from which we can make inferences regarding the impact of cost on student outcomes. But it’s possible that MSU could have had a more significant impact on its DFW rate had the total cost to the students been lower. It’s also worth noting that MSU expected to increase enrollment by 72 students annually but actually saw a decline in enrollment of 126 students, which impacted their ability to deliver decreased costs to the institution. Would they have seen different enrollments had the curricular materials been less expensive? Or free? We don’t know. But this raises the point that efficacy cannot be reduced to one aggregate number. Improving aggregate student test scores and reducing the number of students who fail or withdraw are two different goals, which certainly need to be measured differently and probably need different sorts of interventions to achieve.
Postscript
After this post went live, Danae shared some data with me from the semesters after the NCAT report was published. As it turns out, the course did see a significant reduction in its DFW rates, and enrollments bounced back over the course of several semesters. You can read about the details, as well as possible explanations, here.
Laura Gekeler says
Thank you, Michael. Very helpful.
Eric Gates says
I always enjoy your posts. I do wish that you would do a write-up on a tool that was “born digital,” since these can do so much more than Pearson MyLab products, which are really based on a traditional textbook model. Just a small nit, but there is something of a collective failure of the imagination when we take a book and use it (and outmoded “database innovation”) as a gold-standard model of what is possible. NCAT has many examples, and there’s just a lot going on elsewhere too.
Michael Feldstein says
Thanks for your kind words, Eric. I’m not holding up MyLabs as a “gold standard.” I had access to a teacher, and outcomes data fell into my lap, so I wrote about it. And in any case, I was less interested in writing about MyLabs itself than I was in the teaching practices that it supported. But for the record, I have no particular reason to believe that born-digital products will be more effective than ones that are not. I understand why born-digital is an important characteristic for publishers–a lot of it frankly has to do with production complexity and cost–but I have yet to see evidence that it is a characteristic that generally correlates with educational effectiveness.