Research in Translation: Cultural Limits of Self-Regulated Learning

We are currently facing two civilizational educational challenges. By “civilizational,” I mean that they go beyond country- or region-specific challenges. They are unprecedented in the history of humanity. The challenges I’m talking about are universal access to quality education and universal lifelong learning, both of which you’re almost certainly well aware of. But we don’t talk enough about how unprecedented they are, what we need to learn to do differently to meet them, and how we could go about learning what we need to learn.

One plausible solution to both civilizational challenges is to get a lot better at teaching humans how to get better at learning. Again, this is probably not a new or shocking assertion to you. But the research paper I’m going to describe here, “Eight-minute self-regulation intervention raises educational attainment at scale in individualist but not collectivist cultures” by René F. Kizilcec and Geoffrey L. Cohen, suggests that doing so is going to be particularly hard because the effectiveness of different self-education strategies is heavily mediated by contextual factors like culture.

At the very least, this means that our conversations about trendy approaches like adaptive learning and competency-based education need to become a lot more nuanced than they are right now. We’re not paying enough attention to the things we don’t know yet about the circumstances under which these strategies work. But more profoundly, the findings of the paper raise questions about whether our fundamental approach to educational research is adequate to the task of learning how to meet these grand educational challenges of our age.

The Challenges: Universal and Lifelong Education

The first grand challenge is to give every human on the planet the access and support they need to achieve the highest level of education that suits their individual goals and abilities. No civilization has ever come close to achieving this before, particularly in a large, heterogeneous culture like the United States. And while the defunding of public college and university systems has increased the challenge, I’m not aware of any evidence that we could meet this challenge with our currently structured educational system at any economically feasible funding level.

At the same time, the idea of a “terminal degree” being synonymous with the end of an individual’s education is going away. We’ve heard talk for the past few decades about how “knowledge workers” are the future of the economy and how everyone will need to be “lifelong learners” because the skills and knowledge they need will change quickly and constantly. But now we’re really seeing it show up all over the economy and, as a civilization, we haven’t yet developed strategies to deal with it. For example, as the coal mining industry dies, we don’t know how to help all of those miners learn the skills they need to find new careers. Our existing systems are not adequate for those kinds of large-scale educational transitions. So the second grand civilizational challenge is universal lifelong learning.

These challenges are both partly ones of scale. We know how to help some humans achieve the highest level of education that suits their goals and abilities. We know how to help some miners learn what they need to make career transitions. We just don’t know how to do these things for everyone by growing our current system. Logistically, it’s something we’ve never done before, and economically, it’s not clear that we have the resources to give everybody access to the education they need, never mind at the level of quality that would eliminate any achievement gaps. At least, not with our current system.

Whether consciously or not, most of the high-profile efforts and many of the discussions around how to address these challenges have been heavily influenced by research that has come to be known as Benjamin Bloom’s “two sigma problem.” Before we can understand the full implications of René F. Kizilcec and Geoffrey L. Cohen’s paper, we need to understand the two sigma problem, including some of the limitations of the research and the ways in which it has framed educational reform discussions.

Understanding the Two Sigma Problem

Benjamin Bloom is most famous for three contributions to educational research. The first is Bloom’s Taxonomy, which is not relevant to this post. The second is the mastery learning approach, which we’ll get to shortly. And the third is the two sigma problem.

A “sigma” or “standard deviation” is a statistical concept that has to do with measuring how far from a group’s “average” something is. So it’s a relative concept. Height for an adult human being doesn’t vary nearly as much as, say, height for all primates. So one standard deviation, or sigma, from the average human height will be a smaller increment than one sigma for average primate height. In Bloom’s world, where he is measuring variation in grades in a class, a very rough proxy for one sigma is one letter grade in a final course grade, e.g., the difference between a C and a B.
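To make the “relative” part concrete, here is a minimal Python sketch. The two lists of heights are made-up numbers chosen only for illustration (they are not real measurements), and `sigmas_from_mean` is a hypothetical helper name. The point is that the same raw difference from the average counts as more sigmas in a tightly clustered group than in a widely spread one.

```python
from statistics import mean, pstdev

def sigmas_from_mean(value, group):
    """Return how many standard deviations `value` lies from the group's mean."""
    return (value - mean(group)) / pstdev(group)

# Hypothetical heights in centimeters. Both groups average 172 cm,
# but the second group is spread out far more widely than the first.
narrow_group = [168, 170, 172, 174, 176]
wide_group = [152, 162, 172, 182, 192]

# One sigma is a smaller increment in the tightly clustered group.
print("one sigma, narrow group:", round(pstdev(narrow_group), 2), "cm")
print("one sigma, wide group:  ", round(pstdev(wide_group), 2), "cm")

# The same 176 cm height is therefore "further out" in sigma terms
# when measured against the narrow group.
print("176 cm vs. narrow group:", round(sigmas_from_mean(176, narrow_group), 2), "sigma")
print("176 cm vs. wide group:  ", round(sigmas_from_mean(176, wide_group), 2), "sigma")
```

This uses the population standard deviation (`pstdev`) for simplicity; a real analysis of a sample would more likely use the sample version.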

The popular interpretation of Bloom’s two sigma experimental result is that students who get one-on-one tutors do better by roughly two sigma, or two final course grade letters, than students in a typical class who do not get one-on-one tutoring. But that’s not quite right. Or at least, it’s not the full picture.

To start with, the two-sigma research is built on Bloom’s previous work on mastery learning. You may not be familiar with this term, but if you’re at all engaged with ed tech, then you’ve probably seen traces of its influence. The basic idea of mastery learning is that you break a subject up into small learning objectives that are properly sequenced, and students don’t move on to the next learning objective until they’ve demonstrated “mastery”—often defined in terms like “answering 90% or better of assessment questions correctly.” Bloom and his colleagues found that they could achieve a one-sigma improvement for students who were taught using mastery learning techniques over similar students who were not. Students who were taught using mastery learning and received one-on-one tutoring achieved a two-sigma improvement relative to the control groups.
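The gating rule at the heart of mastery learning is simple enough to sketch in a few lines of Python. This is only an illustration of the idea as described above, not anything from Bloom’s studies or from a particular product: the objective names, the scores, and the `next_objective` function are all hypothetical, and the 0.90 threshold simply mirrors the “90% or better” example.

```python
MASTERY_THRESHOLD = 0.90  # mirrors the "90% or better" example definition

def next_objective(objectives, scores, threshold=MASTERY_THRESHOLD):
    """Return the first objective in the sequence that is not yet mastered.

    `objectives` is an ordered list of learning objectives.
    `scores` maps an objective to the student's best assessment score (0.0 to 1.0).
    Returns None when every objective has been mastered.
    """
    for objective in objectives:
        if scores.get(objective, 0.0) < threshold:
            return objective
    return None

# Hypothetical sequence and student record.
sequence = ["fractions", "decimals", "percentages"]
student_scores = {"fractions": 0.95, "decimals": 0.80}

# The student has mastered fractions but is held at decimals,
# and does not see percentages until decimals reaches the threshold.
print(next_objective(sequence, student_scores))
```

Everything difficult about mastery learning (choosing the objectives, sequencing them, and writing assessments that actually measure mastery) sits outside this loop.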

Textbook publishers love this result because it gives them a direction for product development. They know how to break course subjects down into small, sequenced learning objectives. They also know how to set thresholds on assessments that unlock the next bit of content. Whether educators choose to actually employ mastery learning techniques is out of their control, but at least they can develop products that are friendly to that approach. And if they could somehow automate the mastery learning pedagogical techniques (through adaptive learning, for example), then they could show that their products can improve students’ learning outcomes by two course grades over traditional teaching approaches.

But this approach and Bloom’s “two sigma” framing are littered with caveats (as research tends to be). First, his control groups were school children in traditional courses. If you vary that environment significantly—say, by putting students in a MOOC or teaching adults pursuing non-degree career development knowledge—I’m not aware of robust findings that Bloom’s results still hold. Second, not all subjects lend themselves equally well to being broken up into small, discrete, and straightforwardly measurable learning objectives that can be determinatively sequenced. Third, as far as learning outcomes go, a “letter grade” difference is not a terribly authentic measure of progress.

Lastly, and most relevant to the Kizilcec and Cohen paper, Bloom and his colleagues were never able to pin down exactly what it was about one-on-one tutoring that led to that second sigma of improvement. Bloom himself wrote,

It should be pointed out that the need for corrective work under tutoring is very small.

So what is it about one-on-one tutoring that made such a big difference? Bloom and his colleagues tried to isolate one or two variables that would account for it. They failed.

What if they failed because there aren’t just one or two factors that account for the difference that human tutors make? What if there are many different factors that affect different students to different degrees in different contexts? What would this mean for the whole personalized learning enterprise? Even more broadly, if the impacts of teaching interventions vary dramatically, singly and in combination, across a wide range of contextual factors, what are the implications for making progress in educational research? What does a science of learning look like in a world where isolating a variable in a particular experimental condition doesn’t tell us a whole lot that can be generalized very far?

These are the questions that ultimately interest me in the Kizilcec and Cohen paper. But as with all of these “Research in Translation” articles, we’re going to have to start by unpacking some of the discipline-specific knowledge that underpins the study itself. Some of it will probably be unfamiliar to you. For example, there’s a good chance that you haven’t run across the theory of cultural dimensions that the paper draws upon unless you’re a social psychologist or a sociologist. On the other hand, if you’re reading this blog, there’s a reasonable chance that you know at least a little bit about self-regulated learning. But even there we may find some aspects or implications that you’re not aware of.

In fact, let’s start with self-regulated learning.

Understanding Self-Regulation and Self-Regulated Learning

Kizilcec and Cohen aren’t concerned with mastery learning but rather self-regulated learning. If we forget about the “learning” part for a moment and focus on the “self-regulation” part, the concept is familiar enough in our daily lives.

Suppose that I want to lose 20 pounds. There are any number of strategies that I could employ to try to get myself to where I want to be. Here are a few:

Plan a beach vacation six months from now and think about how I want to look in my bathing suit
Read scary articles about the consequences of being overweight
Promise myself I will buy something I really want if I achieve my goal
Commit to giving money to a cause I hate if I don’t achieve my goal
Do a 20-minute cardio workout four times a week
Increase my fiber intake
Fast 8 hours out of every day
Reduce my sugar and carbohydrate intake
Adopt the cabbage soup diet
Wear ear magnets

Notice that there are two basic kinds of strategies on this list: motivation and implementation. Things that make me want to take action and actions that I can take.

Let’s say I decide that I’m going to try to lose weight by wearing ear magnets. Every morning I put them on, and every day I weigh myself. After a month, I have not lost weight. Being the introspective person that I am, I come to the conclusion that ear magnets are not helping me achieve my goal.

Next, I try reducing my sugar and carbohydrate intake. There’s only one problem: I can’t get myself to stick to the plan. Every morning, I reach for a bagel or muffin for breakfast. Whenever I’m out for dinner, I just can’t stop myself from ordering dessert. A month later, I still haven’t achieved my goal.

So I decide I will promise myself that I will buy that remote controlled submarine drone I’ve been lusting after if I can lose weight by giving up those bagels and desserts. That will be my next experiment. And so on.

That’s self-regulation in a nutshell:

Set a goal
Pick a strategy that might help you achieve your goal
Monitor your progress toward your goal
Reflect on whether your chosen strategy has resulted in satisfactory progress toward your goal
Adjust your strategy (or stick with it) accordingly
Rinse, repeat
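For readers who think in code, the cycle above can be sketched as a loop. This is a toy model under loudly stated assumptions: the `self_regulate` function, the rule “switch strategies after a period with no progress,” and the monthly weight-loss numbers are all invented for illustration, loosely following the ear magnet story.

```python
def self_regulate(goal, strategies, try_strategy, max_rounds=12):
    """Run the set-goal / pick / monitor / reflect / adjust cycle.

    `try_strategy(name)` is a hypothetical callback that returns the progress
    made during one monitoring period while using that strategy.
    """
    progress = 0
    history = []
    remaining = list(strategies)
    current = remaining.pop(0)            # pick a strategy
    for _ in range(max_rounds):
        gained = try_strategy(current)    # act, then monitor progress
        progress += gained
        history.append((current, gained))
        if progress >= goal:              # goal reached, stop
            break
        if gained <= 0 and remaining:     # reflect: no progress, so adjust
            current = remaining.pop(0)
    return progress, history

# Made-up pounds lost per month under each strategy.
monthly_loss = {"ear magnets": 0, "cut sugar": 0, "cut sugar + reward": 5}

total, log = self_regulate(
    goal=20,
    strategies=["ear magnets", "cut sugar", "cut sugar + reward"],
    try_strategy=lambda name: monthly_loss[name],
)
for month, (strategy, lost) in enumerate(log, start=1):
    print(f"month {month}: {strategy}, lost {lost} lb")
```

The sketch makes one thing visible: the loop only works if at least one available strategy produces progress.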

Apply this basic self-regulation approach to learning and you have…wait for it…self-regulated learning!

When I wrote at the top of the post that one plausible answer to our civilizational education challenges is to get a lot better at teaching humans how to get better at learning, that basically means getting a whole lot better at teaching students how to be self-regulated learners.

SRL post-dates Bloom’s work on mastery learning and it reflects a different focus. Mastery learning is primarily focused on the content that is being learned. Any consideration of learner motivation is a means to an end. In contrast, SRL is focused on achieving the learner’s goals more effectively. The goal may very well be to master some coherent set of content. But any consideration of content is a means to an end. Specifically, the learner’s end.

At least in theory. In practice, there are lots of attempts underway, with varying degrees of self-awareness, to marry mastery-based adaptive learning products with SRL techniques. For example, there is an effort at Essex County College in Newark, NJ in which they recruited John Hudesman, a researcher in SRL, to marry the approach with an essentially mastery-based model.

The basic idea is that maybe the SRL feedback loop can help achieve that elusive second sigma. It’s more sophisticated than Bloom’s hunt for the one or two factors that account for one-on-one tutoring’s effect on all students in the sense that each student can individualize and hopefully find for herself the factors that will enable her to reach her goals.

But here, too, there are caveats.

The Limits of SRL

Let’s go back to that weight loss self-regulation problem. Suppose I’ve tried everything. Ear magnets. The cabbage soup diet. Exercise. Weightwatchers. Promises. Threats. Nothing works. In every case, either I can’t stick with the plan or I stick with it but don’t lose weight. Maybe, after many months of frustration, I talk to my doctor, who tells me that a side effect of my prescription medication is weight gain.

Oh.

Now I’m in uncharted territory. Can I take a different medicine? Is there some other way to counterbalance the effect of the medication? Or am I just stuck being 20 pounds overweight?

There are limits to the power of self-regulation, including the power of self-regulated learning. If I’m a single parent working two jobs and driving for Uber in between to make ends meet, then no SRL strategy is going to give me time that I don’t have. If I have a learning disability, then I may need help finding SRL strategies that account for my particular situation.

Here’s the tricky part that gets into the heart of the Kizilcec and Cohen paper: Barriers that impact the effectiveness of SRL strategies might be non-obvious and influenced by things like culture. In particular, the paper examines the differences in the average effectiveness of a couple of different self-regulation strategies for people from individualist cultures versus collectivist cultures. This distinction will take a little unpacking in order to understand how it affects SRL.

Suppose the week that I have a big assignment due, my uncle dies. I wasn’t particularly close to this person, but my extended family is generally close-knit. My relatives would be hurt if I didn’t come. I really value the closeness of my extended family, even if I didn’t really get to know my uncle well, and I care a lot about how they feel. It’s not really an option for me to skip the funeral. Which is halfway across the country. My sense of social obligation to my family makes it impossible for me to employ the self-regulation strategy of setting aside the time I need to complete my schoolwork.

Imagine that your whole life were filled with these sorts of social obligations. Not just an occasional death in the family, but daily demands that you can’t predict and can’t ignore. In your world, there are many people who can ask you to change your plans at any time. Extended family members, neighbors, coworkers, or even strangers can place demands on you and, depending on the specifics, you really can’t say “no.” Because you live in a culture that places a high value on social bonds, and you have learned to place a high value on those bonds as well. The term of art for this kind of a culture is “collectivist,” and it contrasts with an “individualist” culture, where less emphasis is placed on social expectations and more on individual achievement.

At the very least, you will find yourself in a situation similar to that of struggling to lose weight while taking a medication that causes weight gain. Time management strategies don’t work because you’re really not in control of your time. But it may go even deeper than that. If your world is highly unpredictable because you can’t anticipate demands that you can’t reject and that come at you on a daily basis, then your fundamental idea of what it means to “manage time” may have to be different. You can’t just schedule certain nights of the week or reserve X hours to do your homework. You don’t have that power. Or, at least, it wouldn’t occur to you that exercising that power is a viable option, because you care deeply about what your willingness to fulfill your social obligations says about you as a person.

This is exactly the hypothesis that Kizilcec and Cohen wanted to test. They wanted to see whether there are differences in how SRL works for students in collectivist versus individualist cultures.

The stakes are high. Remember those two civilizational challenges: universal and lifelong education. To meet those challenges, we need to provide less traditional and formalized teacher support than the students in Bloom’s control groups got. We just don’t have enough teachers and classrooms to go around. This is a fundamentally different challenge than the one that Bloom was probing with his two sigma experiments. If it turns out that all kinds of factors, some as subtle as the nature of the social ties in the culture you come from, can impact the effectiveness of our ability to teach students how to teach themselves, then how can we possibly meet our unprecedented civilizational goals? How can we tease out all the many possible factors, particularly when they interact with each other? How would we even begin to go about figuring that out in a rigorous, evidence-grounded way?

Hold that thought. We’re going to return to it later in this post. First, though, we have to understand the experiment that Kizilcec and Cohen conducted to test whether this is even a problem. And to do that, we first need to understand one piece of social psychology.

Understanding Cultural Dimensions Theory

Kizilcec and Cohen wanted to figure out a way to test whether, on average, students from individualist cultures benefit more from being taught SRL techniques than students from collectivist cultures. The first thing they needed in order to do that is some way to define individualist versus collectivist cultures in a reasonably rigorous way.

As you might expect, the researchers didn’t just pull the idea of individualist and collectivist cultures out of thin air. There is a body of research literature from which they were drawing. In particular, they drew on the research of a social psychologist named Geert Hofstede. During the late 1960s, Hofstede worked at IBM, where he founded and led the Personnel Research Department. This was at a time when globalization was really beginning to take hold and American-founded companies like IBM were learning how to run divisions in other countries with very different cultures. In the early days, these companies believed that they could train their new international employees on IBM-standard management practices and all would be well. But it soon became apparent that the challenge was more complicated than they thought. People in other countries responded to IBM’s management practices differently.

At about this time, Hofstede stumbled upon a database of 117,000 attitude surveys from IBM employees all over the world. When analyzed for patterns on an individual level, the data were confusing. But when Hofstede grouped employees by nationality and looked for similarities and differences between national groups, some patterns began to emerge.

As I have written about before, I worry that our generally low level of statistical literacy means that many of us are prone to misread statistical results and have little confidence in statistical analysis. So I am making a practice of providing some explanation of the statistical methods used when I write up these Research in Translation posts. In Hofstede’s case, he used a method called “factor analysis.”

We can get a basic sense of the intuition that underlies that method through a common joke. You’ve probably seen lists with titles like “You might be a _________ if…”. The basic idea is that there are funny and non-obvious little traits and experiences that are shared by people of a certain type. When they are not mean, they are often inside jokes. For example, RallyPoint, a site that bills itself as “The Professional Military Network,” has an article entitled, “You might be a veteran if…” Item number five on the list is “You remember laughing at troops who thought 29% APR was good…” I don’t even know what that means, and I certainly wouldn’t think that an attitude about interest rates would be a marker of whether somebody is a veteran.

Factor analysis starts with a collection of seemingly unrelated variables (like answers on an employee attitude survey) and looks for how closely correlated they are. If a group of variables is highly correlated, then it may be because they are all indications of a hidden or “latent” variable.

“Oh, you answered ‘yes’ on eight out of these ten seemingly random questions. That suggests that you might be a veteran.”
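Here is a small Python sketch of the first step of that intuition: checking which survey items move together. It is not factor analysis itself, which goes further and estimates how strongly each item reflects each latent variable. The four questions and the six respondents’ answers are invented for illustration, and the 0.7 cutoff for “strongly related” is an arbitrary assumption.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)

# Hypothetical answers (1 to 5 scale) from six respondents to four questions.
answers = {
    "q1": [1, 2, 3, 4, 5, 5],
    "q2": [1, 2, 2, 4, 5, 4],
    "q3": [2, 1, 3, 5, 4, 5],
    "q4": [3, 5, 1, 3, 2, 4],
}

# Items that rise and fall together may all be reflecting one hidden variable.
strongly_related = [
    (a, b)
    for a, b in combinations(answers, 2)
    if pearson(answers[a], answers[b]) > 0.7  # arbitrary illustrative cutoff
]
print(strongly_related)
```

In this made-up data, the first three questions cluster together while the fourth goes its own way, which is the kind of pattern that would prompt a researcher to ask what the first three have in common.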

Hofstede applied factor analysis to national groups of employees and found evidence of four latent variables, which he calls “cultural dimensions.” (Subsequent research has reproduced Hofstede’s results, and the theory has been refined and expanded.) He called one of the latent variables “Individualism/Collectivism.” It purports to capture the degree to which each national culture has a strong web of the kinds of social obligations and expectations that I described in the previous section. Hofstede has used the cumulative research to create a country-by-country comparative index of his different dimensions, which you can play around with here.

Kizilcec and Cohen used Hofstede’s country index of the Individualism/Collectivism dimension to provide some rigor to their question about cultural differences impacting the effectiveness of SRL techniques.

Understanding the Kizilcec and Cohen Paper

The researchers conducted experiments on two MOOCs. In each case, students were given eight-minute tutorials on two SRL techniques: Mental Contrasting (MC) and Implementation Intentions (II). If you recall the weight loss strategy list from earlier in the post, there were strategies that helped motivate me to do what I needed to do (like promising to buy myself a submarine drone if I meet my weight loss goal) and other strategies to actually accomplish my goal (like wearing ear magnets). MC and II fall into these two respective categories. MC is intended to help students self-motivate by (in the words of the authors)

…vividly elaborating on positive outcomes associated with attaining a goal (e.g., learning a new skill) followed by vividly elaborating on central hindrances in the present that might interfere (e.g., a busy work schedule). By juxtaposing the desired future with current obstacles, MC can strengthen goal commitment and striving. Insofar as the obstacles to goal attainment are seen as surmountable, MC induces a sense that the desired future is within one’s reach, thereby increasing commitment and effortful goal striving.

On the other side,

[t]he II procedure helps people plan how to overcome obstacles and execute goal-directed actions. It encourages people to generate concrete if–then plans. Unlike unstructured planning, an II links a specific situation to a goal-directed action. An example of an II is, “If I feel too tired after work to watch the next lecture, then I will make myself coffee to stay awake.” Forming an II facilitates goal attainment because it increases the likelihood that people will respond efficiently and even automatically to regular obstacles that threaten the completion of their goals.
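One way to picture an implementation intention is as a lookup from a specific, anticipated situation to a pre-decided action. The Python sketch below is my own illustration rather than anything from the paper: the `respond` function is hypothetical, the first plan is the authors’ coffee example, and the second plan is made up.

```python
# Each if-then plan links one anticipated situation to one goal-directed action.
implementation_intentions = {
    "too tired after work to watch the next lecture": "make coffee to stay awake",
    "weekend plans run long": "watch the lecture on Sunday evening",  # made-up plan
}

def respond(situation, plans):
    """Return the pre-decided action for a situation, or None if no plan covers it."""
    return plans.get(situation)

# A planned-for obstacle gets an immediate, almost automatic response.
print(respond("too tired after work to watch the next lecture", implementation_intentions))

# An obstacle nobody anticipated has no entry, so the plan offers no help.
print(respond("a relative needs a ride to the airport tonight", implementation_intentions))
```

The lookup is only as useful as the list of situations is complete, which matters for any learner whose obstacles are hard to predict in advance.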

Using their short MC and II tutorials, the researchers were able to increase MOOC completion by 32% over the control group in the first experiment and 15% in the second—for students from individualist countries. That’s a pretty remarkable result. If we’re trying to achieve these big civilizational goals of providing every human with higher education and lifelong learning advancement, then the possibility that we could increase completion rates in low-facilitation courses by up to 30% with an eight-minute lesson helping students get motivated and focused is pretty huge.

That’s the good news. The bad news is that students from collectivist cultures showed no significant benefit. In fact, using India as a country on the collectivist end of the scale, the researchers found that

relative to US respondents, Indian respondents reported that their social environment was more complex and that they shied away from forming if–then plans. Indian respondents listed more obstacles that could interfere with the goal of achieving a good grade in an online course than US respondents (India median = 4, US median = 3; Kruskal–Wallis χ² = 9.50, P = 0.002). They were also more likely to report that if–then plans oversimplify the complexity and ignore the uncertainty of real-life situations [t(192) = 3.12, P = 0.002, d = 0.45].

In other words, Indian students were more likely to say that the II self-regulation strategy in general was unrealistic. It didn’t account for the complexity in their lives.

There’s a lot more to this study than I’m going to cover here. For example, students from collectivist countries showed some benefits from MC when decoupled from II, which is interesting to think about. Generally speaking, I can’t unpack all the background in these Research in Translation posts and still have room to cover all the nuances of the papers themselves (although one of my goals is to give readers enough background that they can read and understand the papers themselves).

But the headline here is provocative enough. When we think back to Bloom’s failure to find the one or two magic ingredients that tutors add which account for the second sigma of improvement, we can now see that the answer very likely is different from student to student. There is no silver bullet. In fact, there are all kinds of non-obvious factors that influence how well a given educational strategy works for a given student.

So if we still want to address those two civilizational challenges—or even an intermediate challenge along the way, like closing achievement gaps—then Kizilcec and Cohen’s study raises a critical question:

Now what?

Implications

Right about now, some of you are probably thinking, “So after making me read all of that, your grand conclusion is that students are individuals? Thanks for a whole lot of nothing, buddy.” Fair enough. But meeting these big educational challenges requires us to navigate between a rock and a hard place.

On the one hand, we don’t want to fall for easy answers. As a culture, we tend to be a little schizophrenic about our attitudes toward education. Even smart people who believe in their hearts that every student is an individual human with different needs and goals can all too easily slip into over-generalizations and solutionism when the conversation turns from individual students to solving large-scale educational problems.

On the other hand, if we’re committed to solving the big educational challenges, we can’t just shrug our shoulders and say, “It’s too hard. There’s no way to sort out all the factors.” We have to come up with a research approach that accounts for the fractal problem of student differences and how various combinations of those differences affect what works for different students in different learning contexts. The Kizilcec and Cohen paper is one model for what that kind of science could look like. But we need more. A lot more.

In my view, we need what I call “empirical educators” and what Candace Thille calls “citizen scientists.” We need to crowd-source this problem by recruiting front-line classroom educators as field researchers who work in cooperation with researchers trained in learning sciences. For example, Kizilcec and Cohen’s experiments, however cleverly designed, can’t tell us how these results play out in courses that are not MOOCs. Or with students who are first-generation Americans whose families come from collectivist societies. Or whether there are identifiable factors that tell us which students from a collectivist country are most or least likely to be similar to their cultural norm in terms of SRL. Or what other SRL techniques might be more effective given any combination of these variables. The amount of research one can imagine being generated off the results of this one study alone is massive.

We spend a lot of public and private money chasing silver bullets in education. I propose we would be better served by investing that money in providing educators with the training, support, and incentives to participate in the work of advancing the sciences of learning. At the very least, all professional educators should have a certain level of literacy on what we know about education, be able to read and understand the implications of a research paper, and believe that having this knowledge and these skills is a core part of their professional identity. Nobody should still be talking about learning styles, for example.

Some educators may take this a step further and learn how to make the classroom experiments that they intuitively conduct on a regular basis a little more rigorous. And some may actively collaborate with professional researchers or even design their own studies using the disciplinary research tools that they already know from their graduate training.

There won’t be one answer for every educator, any more than there will be one answer for every student. My primary care physician, a country doctor, has a different relationship to medical science than an oncologist working at Memorial Sloan Kettering Cancer Center. But they both believe that having some relationship to medical science is essential to doing their jobs. In education, where the fractal nature of the problems we are trying to understand requires us to run many experiments in many contexts, having all educators see themselves in some sense as citizen scientists is even more critical.

This post is part of our Research in Translation series, which is funded in part by the Bill & Melinda Gates Foundation. The findings and conclusions (or views) contained within are those of the authors and do not necessarily reflect positions or policies of the Bill & Melinda Gates Foundation.