Let’s assume we live in a world in which students are going to use ChatGPT or similar tools on their assignments. (Because we do.) Let’s also assume that when those students start their jobs, they will continue to use ChatGPT or similar tools to complete their jobs. (Because they will.) Is this the end of teaching as we know it? Is this the end of education as we know it? Will we have to accept that robots will think for everyone in the future?
No. In this post, I’m going to show you one easy approach: assume students will use generative AI and incorporate it into the assessments. Keep in mind this is just a sketch using naked ChatGPT. If we add some scaffolding through software code, we can do better. But we can do surprisingly well right now with what we have.
The case study
Suppose I’m teaching a college government class. Here are my goals:
- I want students to be able to apply legal principles correctly.
- I want to generate assignments that require students to employ critical thinking even if they’re using something like ChatGPT.
- I want students to learn to use generative AI appropriately.
Let’s throw in a couple more goals just to make it interesting:
- I want students to demonstrate competencies.
- I want my assignment to support diversity, equity, and inclusion.
Can we do all that?
I cobbled together a DEI guidelines document from some existing ones I found on the web to create something focused on curricular materials generation. I gave the document to ChatGPT along with the following prompt, which is designed to provide assessment questions that you can’t answer by just copying/pasting them into ChatGPT:
You are instructed to generate 3 fictional legal scenarios assessing The following competency:
- Assess whether a case is likely to be heard by the Supreme Court based on an analysis of the Court’s role and selection criteria.
Evaluating Court functions will be consistently required.
For each scenario:
- Involve 2-3 parties and a dispute requiring application of the competency.
- Make it sufficiently complex so direct questions don’t fully demonstrate understanding.
- Generate 1-2 written questions requiring analysis using the competency.
Additionally, construct a rubric table that is appropriate for all generated scenarios to evaluate student responses on:
- Mastery of the competency above.
- Mastery of the competencies below.
Demonstration of the following additional competencies
Critical Thinking Competencies
- Ask probing questions to uncover deeper complexities beneath the surface. Questioning skills are inherently required by the assessment structure.
- Critically evaluate source credibility and potential biases when analyzing information. Assessing the AI responses is crucial.
- Produce sound arguments and reasoning to justify conclusions. Written justifications will necessitate this in all cases.
- Formulate questions and direct dialogue with an AI assistant to extract relevant insights. Strategic probing of the AI agent will be essential.
- Efficiently utilize an AI’s knowledge while evaluating its capabilities and limitations. Critically appraising the AI’s responses is key.
Rubric should include levels of performance for each competency (e.g. novice, developing, proficient). Apply the attached diversity, equity, and inclusion guidelines to the content you generate. Incorporate diverse litigants and equity-relevant issues in the cases you generate.
[Note: I’ve added some formatting to make it more legible for humans. The actual prompt doesn’t have bullets and has fewer paragraph breaks.]
Here’s what ChatGPT gave me back:
Legal Scenario 1: The Case of Interstate Water Rights
Background: The states of Aridonia and Riverbend are locked in a legal battle over the rights to water from the Silver River, which flows from the mountains in Aridonia and through the agricultural heartlands of Riverbend. Aridonia, facing an unprecedented drought, has constructed a series of dams to conserve water, significantly reducing the flow to Riverbend. Riverbend’s economy relies heavily on agriculture, which is now under threat due to the reduced water supply.
- The State of Aridonia (Plaintiff)
- The State of Riverbend (Defendant)
- The Indigenous Tribes of the Silver River Basin (Intervenors) – They claim ancestral rights to the river and are impacted by the actions of both states.
Complexity: The dispute requires an understanding of interstate commerce, environmental law, and indigenous rights.
- Analyze the potential arguments each party could make to justify their claims to the Silver River’s water. How might these arguments impact the likelihood of the Supreme Court electing to hear this case?
- Considering the Supreme Court’s historical approach to interstate disputes and the involvement of indigenous rights, what precedents might influence the Court’s decision to grant certiorari?
Next, I will create the second legal scenario.
Legal Scenario 2: The Case of Digital Privacy and International Data Exchange
Background: GlobalTech Inc., a multinational tech company, is sued by the State of Freedonia on behalf of its citizens for allegedly breaching privacy laws. GlobalTech’s data centers, located in various countries, process and store data from Freedonian users. The lawsuit asserts that GlobalTech’s practices violate Freedonia’s stringent digital privacy laws. However, GlobalTech contends that they are compliant with international agreements on data protection to which Freedonia is a signatory.
- The State of Freedonia (Plaintiff)
- GlobalTech Inc. (Defendant)
- International Privacy Watch (Amicus Curiae) – A global coalition advocating for citizens’ digital rights and privacy.
Complexity: This scenario involves understanding international law, digital privacy, and the implications of cross-border data flow.
- Evaluate the competing interests of national privacy law and international data agreements in the context of this case. How might these factors affect the Supreme Court’s decision to grant a writ of certiorari?
- What are the broader implications for digital privacy and international commerce if the Supreme Court decides to hear this case? Consider historical precedents and current legal trends in your analysis.
Now, I’ll proceed to create the third legal scenario.
Legal Scenario 3: The Case of Health Care Access and Religious Exemptions
Background: A new federal regulation mandates that all employers, regardless of their religious beliefs, must include comprehensive reproductive health services in their employee health insurance plans. FaithFirst, a nationwide chain of craft stores owned by a family with strong religious convictions, files a lawsuit claiming that this mandate violates their religious freedom. The case is further complicated by the intervention of HealAll, a non-profit organization that provides health services to underprivileged populations and argues that exemptions undermine public health efforts.
- FaithFirst (Plaintiff)
- The Federal Government (Defendant)
- HealAll (Intervenor) – Advocating for universal access to health care.
Complexity: This scenario encompasses the interpretation of religious freedom, federal regulatory power, and the right to health care.
- Assess the potential constitutional conflicts at play in this case, particularly the tension between religious liberty and federal health care mandates. How could these conflicts influence the Supreme Court’s inclination to review the case?
- Examine how the Supreme Court’s previous decisions on similar health care and religious freedom cases might be used as precedents in this scenario. What factors might the Court consider in deciding whether to hear this case?
With the three legal scenarios outlined, it’s time to generate a universal rubric that can be applied to assess student responses for all three scenarios. I will incorporate the DEI guidelines into this rubric to ensure the content is sensitive to diversity, equity, and inclusion.
Universal Rubric for Assessing Legal Scenario Responses
The following rubric applies to responses for all three legal scenarios, ensuring that students are assessed on the mastery of core competencies along with critical thinking and AI engagement. The DEI guidelines are integrated to promote inclusive and diverse perspectives in the analysis.
| Competency | Novice (1-2) | Developing (3-4) | Proficient (5-6) | Mastery (7) |
| --- | --- | --- | --- | --- |
| Mastery of Supreme Court Case Assessment | Displays limited insight into the Supreme Court’s selection criteria. Overlooks DEI considerations. | Recognizes basic aspects of the Court’s role and criteria. Considers DEI aspects superficially. | Analyzes the Court’s criteria proficiently, integrating DEI considerations into the analysis. | Provides comprehensive analysis aligned with the Court’s criteria and DEI principles, demonstrating nuanced understanding. |
| Critical Thinking: Questioning | Questions lack depth and do not reflect DEI considerations. | Questions show an emerging understanding of complexities with some DEI awareness. | Probes complex issues effectively with questions that incorporate DEI awareness. | Asks deep, insightful questions that fully embrace DEI complexities within the legal context. |
| Critical Thinking: Source Evaluation | Shows limited ability to evaluate credibility and biases, with minimal DEI consideration. | Begins to critique sources for credibility and bias, showing awareness of DEI. | Evaluates sources critically for credibility and biases with a consistent DEI lens. | Expertly discerns source credibility and biases, thoroughly applying a DEI perspective. |
| Critical Thinking: Argumentation | Constructs basic arguments, lacking DEI perspective and strong justification. | Develops arguments that begin to incorporate DEI considerations with moderate justification. | Presents well-reasoned arguments that effectively integrate DEI considerations. | Articulates complex arguments, richly informed by DEI considerations and robust justification. |
| AI Competency: Question Formulation | Struggles to craft questions that effectively leverage AI insights, with little DEI focus. | Formulates questions that occasionally lead to relevant AI insights and consider DEI. | Consistently crafts questions that elicit useful AI insights, mindful of DEI implications. | Strategically formulates questions for AI that yield comprehensive insights, fully integrating DEI. |
| AI Competency: AI Utilization | Demonstrates limited understanding of AI capabilities and DEI considerations. | Shows basic awareness of AI’s capabilities and DEI considerations in utilization. | Utilizes AI effectively, with a clear understanding of its capabilities and DEI considerations. | Masters AI utilization, skillfully balancing its capabilities with DEI awareness in analysis. |
The performance levels for each competency are:
- Novice: Limited application and understanding
- Developing: Basic to moderate application and understanding with some inconsistencies
- Proficient: Consistent application and understanding with minor gaps
- Mastery: Exceptional application and understanding with insightful integration
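If you record rubric scores in a spreadsheet or gradebook script, the score bands from the rubric's column headers (Novice 1-2, Developing 3-4, Proficient 5-6, Mastery 7) can be turned into a small lookup. This is only a minimal Python sketch: the names `RUBRIC_BANDS` and `level_for_score` and the sample scores are hypothetical and were not part of the prompt or of ChatGPT's output.

```python
# Score bands taken from the rubric's column headers: (lowest, highest, level).
RUBRIC_BANDS = [
    (1, 2, "Novice"),
    (3, 4, "Developing"),
    (5, 6, "Proficient"),
    (7, 7, "Mastery"),
]


def level_for_score(score: int) -> str:
    """Return the performance level whose band contains the score."""
    for low, high, level in RUBRIC_BANDS:
        if low <= score <= high:
            return level
    raise ValueError(f"score must be between 1 and 7, got {score}")


# Hypothetical scores for one student, keyed by rubric competency.
sample_scores = {
    "Critical Thinking: Questioning": 5,
    "AI Competency: AI Utilization": 3,
}
for competency, score in sample_scores.items():
    print(f"{competency}: {score} ({level_for_score(score)})")
```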
These are challenging assignments. You’d need to precede them with lower-level formative assessments to ensure students are getting the foundational concepts and know what kinds of questions to ask the generative AI.
Which is exactly what you should be doing before any challenging competency application assessment anyway.
I asked ChatGPT to generate three examples, but I just as easily could have asked it to generate 300. In other words, you can create a test bank if that’s what you need. Or just generate them on demand every term. You’d want a human expert to tweak the rubric and review each assignment; it’s a bit more complex and error-prone than algorithmic math problem generators.
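One hedged illustration of what "scaffolding through software code" could look like here: a small Python function that fills in the opening of the scenario prompt for any competency and any number of scenarios. The function name `build_assessment_prompt` is hypothetical, the sketch covers only the scenario instructions (not the rubric or DEI instructions), and sending the text to a model is left out because that depends on which service you use.

```python
def build_assessment_prompt(competency: str, n_scenarios: int = 3) -> str:
    """Fill the opening of the scenario prompt with a competency and a count."""
    if n_scenarios < 1:
        raise ValueError("n_scenarios must be at least 1")
    lines = [
        f"You are instructed to generate {n_scenarios} fictional legal "
        "scenarios assessing the following competency:",
        f"- {competency}",
        "For each scenario:",
        "- Involve 2-3 parties and a dispute requiring application of the competency.",
        "- Make it sufficiently complex so direct questions don't fully "
        "demonstrate understanding.",
        "- Generate 1-2 written questions requiring analysis using the competency.",
    ]
    return "\n".join(lines)


# Example: the competency from this post, scaled up to a test bank of 300.
prompt = build_assessment_prompt(
    "Assess whether a case is likely to be heard by the Supreme Court based "
    "on an analysis of the Court's role and selection criteria.",
    n_scenarios=300,
)
print(prompt)
```

A human expert would still review whatever comes back, as noted above; the function only saves retyping the prompt.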
Grading the assignment
The key here is that the assignment students turn in is the ChatGPT transcript. (You can optionally have them submit their final analysis work product separately.) The students are, in effect, showing their work. They can’t use ChatGPT to “cheat” because (1) ChatGPT is part of the assignment, and (2) the assignment is designed such that students can’t just plug in the questions and have the AI give them the answer. Their ability to analyze the problem using the new tool is what you are evaluating.
You could use your generative AI here too as a TA. Give it the assignment and the rubric. Write a prompt asking it to suggest scores and cite evidence from the student’s work. You can decide how heavily you want to lean on the software’s advice, but at least you can get it.
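A grading prompt of that kind could be assembled along these lines. Treat it as a sketch under assumptions: the function name `build_grading_prompt`, the section labels, and the instruction wording are all hypothetical, and I have not tested this wording against any model.

```python
def build_grading_prompt(assignment: str, rubric: str, transcript: str) -> str:
    """Combine the assignment, rubric, and student transcript into one prompt."""
    instructions = (
        "You are assisting an instructor with grading. Suggest a score for "
        "each rubric competency and cite evidence from the student's "
        "transcript for every score. The instructor makes the final decision."
    )
    sections = [
        instructions,
        "ASSIGNMENT:\n" + assignment,
        "RUBRIC:\n" + rubric,
        "STUDENT TRANSCRIPT:\n" + transcript,
    ]
    # Blank lines between sections keep the three inputs visually separate.
    return "\n\n".join(sections)


# Placeholder inputs; in practice these would be the full texts.
print(build_grading_prompt(
    assignment="Legal Scenario 1: The Case of Interstate Water Rights ...",
    rubric="Universal Rubric for Assessing Legal Scenario Responses ...",
    transcript="(the student's ChatGPT transcript goes here)",
))
```

Asking for cited evidence alongside each suggested score makes it easier to check the suggestion against the transcript before deciding how much weight to give it.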
Learning to think like a lawyer (or whatever)
Generative AI does not have to kill critical thinking skills. Quite the opposite. These assignments are much farther up on Bloom’s taxonomy than multiple-choice questions and such. Plus, they get students to show their thought work.
In fact, these scenarios are highly reminiscent of how I use generative AI every day. Here is a sampling of tasks I’ve performed over the last several months using ChatGPT and other generative AI that I probably couldn’t have—and definitely wouldn’t have—performed without them:
- Analyzed the five-year performance of a business based on its tax returns and developed benchmarks to evaluate the quality of its growth
- Cloned a GitHub source code repository, installed Docker and other needed tools on my laptop, and ran the Docker image locally
- Analyzed and hedged the risk to my retirement fund portfolio based on technical and economic indicators
- Wrote the generative AI prompt that is the centerpiece of this post
None of these scenarios were “one and done,” where I asked the question and got the answer I wanted. In all cases, I had to think of the right question, test different variations, ask follow-up questions, and tease out implications using generative AI as a partner. I didn’t have to become an accountant or a business analyst, but I did have to know enough about how both think to ask the right questions, draw inferences from the answers, and then formulate follow-up questions.
To score well on these assessments, a student has to demonstrate both an understanding of the legal principles and the ability to think through complex problems.
Critical thinking competencies
Ethan Mollick, a professor at the Wharton School of Business who writes prolifically and insightfully about generative AI, offered an excellent analogy for how to think about these tools:
The thing about LLMs that make them unintuitive is that analogizing them to having a science fiction AI is less useful than thinking of them as infinite copies of some guy named Steve, a first year grad student who is great at coding & art and is widely-read, but makes up stuff based on what he remembers when he is pressed.
Asking AI to do things an incredibly fast Steve couldn’t do is going to lead to disappointment, but there is a lot of value in Steve-on-demand.
Ethan Mollick’s LinkedIn post
This is a great analogy. When I was analyzing the tax returns of the business, I didn’t have to understand all the line items. But I did have to know how to ask Steve for the important information. Steve doesn’t understand all the intricacies of this business, its context, or my purpose. I could explain these things to him, but he’d still just be Steve. He has limits. I had to ask him the right questions. I had to provide relevant information that wasn’t on the internet and that Steve couldn’t know about. I used Steve the way I would use a good accountant whose help I need to analyze the overall quality of a business.
Coming up with benchmarks to measure the business against its industry was even more challenging because the macroeconomic data I needed was not readily available. I had to gather it from various sources, evaluate the quality of these sources, come up with a relevant metric we could estimate, and apply it to the business in question.
In other words, I had to understand accounting and economics enough to ask an accountant and an economist the right questions and apply their answers to my complex problem. I also had to use critical thinking skills. Steve could help me with these challenges, but I ultimately had to think through the problem to ask Steve for the kind of help he could give me.
When you’re teaching students using a generative AI like ChatGPT, you should be teaching them how to work with Steve. And as bright as Steve may be, your student still has much she can contribute to the team.
Generative AI competencies
Suppose you have a circle of intelligent friends. Steve is brilliant. He has a mind like an engineer, which can be good or bad. Sometimes, he assumes you know more than you do or gives you too short an answer to be helpful. Also, he’s been focused night and day on his dissertation for the last two years and doesn’t know what’s been happening in the real world lately. He’ll do a quick internet search for you if it helps the conversation, but he’s not tuned in.
Your friend Claude thinks like a Classics graduate student. He’s philosophical. He pays close attention to the nuances of your question and tends to give longer answers. He also has a longer attention span. He’s the kind of friend you talk with late into the night about things. He’s somewhat more aware of current events but is also a bit tuned out of the latest happenings. He can be analytical, but he’s more of a word guy than Steve.
Then there’s your friend Anna. Anna Bard. She’s not quite as sharp as either Steve or Claude, but, as an international finance graduate student, she reads everything that’s happening now. If you need to have an in-depth conversation on anything that’s happened in the last two years, Anna is often the person to go to.
Also, since all of these friends are young academics in training, they’re not very good at saying “I don’t know” or “I’m not sure.” They’re supposed to be the smartest people in the room, and they very often are. So they’re sometimes not very aware of their own limitations. All three of my friends have “remembered” studies or other citations that don’t exist.
And each has their quirks. Claude has a strong sense of ethics, which can be good and bad. I once asked him to modify a chapter of an OER book for me. I gave him the front matter so that he could see the Creative Commons licensing was there. He told me he couldn’t do the work unless he could see the whole book to verify that it was ethically OK to modify the content.
I told him, “Claude, that book is 700 pages. Even you don’t have the attention span to read that much.”
He told me, “You’re right. In that case, I’m sorry, but I can’t help you.”
So I took the chapter to Steve, who had no ethical qualms at all but only skimmed the chapter and lost interest about halfway through my project.
When I do my work, I have to figure out which of my AI colleagues can help me and when to trust them. For the business model analysis, Steve answered most of my questions, but I had to get him some information from my friends who haven’t been locked in the library for the past two years. I asked both Anna and Claude. Their answers were somewhat different from each other, but both were well-reasoned. I had to do some of my own Googling to help me synthesize the analyses of my two friends, develop my own opinion, and bring it back to Steve so he could help me finish the work.
For the software project, surprisingly, Steve was useless. He assumed I knew more than I did despite my asking him several times to simplify and slow down. Also, the software had changed since he last looked at it. While he tried to make up for it by saying, “Look for a menu item labeled something like ‘X’ or ‘Y’,” he just couldn’t walk me through it. Anna, on the other hand, did a superb job. She knew the latest versions of all the software. She could adjust when I had trouble and needed some extra explanation. While I wouldn’t have guessed that Anna is the better co-worker for that type of task, I am learning how to get the most out of my team.
For the design of the prompt at the heart of this post, I went to Claude first to think through the nuances of the competency and the task. Then, I brought the summary I created with Claude to Steve, who sharpened it up and constructed the prompt. And yet, it still could use improvement. I can ask my friends for more help, but I will need to think through what to ask them.
My retirement portfolio analysis was 90% Anna’s work since she’s been following the market and economic conditions. I asked Steve to give me a second opinion on bits of her analytic approach. But mostly I relied on Anna.
We often say that we must teach students how to collaborate in teams since they will probably have to collaborate in their jobs. Teaching students how to use generative AI models is an overlapping skill. And it’s only going to get more critical as models proliferate.
I have a model called Mistral running on my laptop right now. That’s right. It’s running locally on my laptop. No internet connection is required. I don’t need to share my data with some big cloud company. And I don’t need to pay for the usage.
My subjective experience is that Mistral is generally more competent than GPT-3 but not as smart as GPT-3.5 Turbo. However, according to one calculation, Mistral is 187 times cheaper to run than GPT-4. It’s also relatively easy and affordable to fine-tune, which is a bit like sending Mistral out to earn a MicroMasters in a particular subject.
Let’s suppose I’m a building site engineer for net-zero buildings in Nova Scotia. I have to know all the details of the building codes at the municipal, township, provincial, and national levels that apply to any given project. Since I’m using new building technologies and techniques, I may have to think through how to get a particular approach accepted by the local building inspector. Or find an alternative approach. And very often, I’ll be out in the field without any internet connection. Mistral may not be as smart at questions about macroeconomics or software development as Steve, Claude, and Anna, but she’s smart enough to help me with my job.
If I were running that construction company, I would hire Mistral over the others and pay for her MicroMasters. So I have to know how to evaluate her against other potential synthetic employees I could employ. Choosing Steve would be like hiring a Harvard-educated remote-working external consultant. That’s not what I need.
Personally speaking, my daily use of generative AI hasn’t made me dumber or lazier. Sure, it’s saved me a lot of work. But it’s also enabled me to do work that was beyond my reach before. It feels a little like when Google first came out. If I’m curious about something, I can explore it instantly, any time I want, and go as deep as I want.
In fact, generative AI has made me a better learner because I’m fearless now. “Can’t” isn’t a viable starting assumption anymore. “Oh, I can’t analyze tax returns.” That answer doesn’t cut it when I have an Ivy League accounting MBA student available to me at all times. I need to know which financial questions to ask and what to do with the answers. But if I don’t at least try to solve a problem that’s bugging me, I feel like I’m copping out. I almost can’t not try to figure it out. The question won’t leave me alone.
Isn’t that what we want learning to feel like all the time?