Multiple Choice Exam Theory: Remote Teaching Edition

(This is a revised edition of a 2012 blog post I did for the Chronicle of Higher Education’s ProfHacker column, which is now in 404 Error Heaven.)

At McGill University, as of this writing, instructors are allowed to give students in remote courses timed multiple-choice exams, so long as the window is flexible.  This means you can’t have everyone take the test at the same time.

This also means you can’t be absolutely sure that students will not seek help. This is a concern for some people. But it doesn’t have to be. “Academic integrity” can mean many different things.

From 2011 to 2014 I developed a system for COMS 210, a 200-student lecture course, that allowed students to seek help and collaborate while taking multiple choice exams.  My results are unscientific, but overall test averages only went up by about 3%.  I was honestly quite surprised. Student complaints about testing, and even test anxiety, seemed to drop precipitously. It required more front-end preparation, but the grading was easy and the learning experience was better.

A couple of caveats are in order here: 1) Students need to adjust to it and may initially really dislike it.  They have good reason to feel that way.  It looks like other multiple choice tests but it’s not, so skills that were well developed in years of standardized testing are rendered irrelevant.

2) In past courses, multiple choice was only one axis of evaluation for the course.  Students must write and synthesize, and they are subject to pop quizzes, which they also dislike (except for a small subset that realizes a side effect is that they keep up with the readings).  On the syllabus, I am completely clear about which evaluation methods are coercive (those I use to make them keep up with the reading and material) and which are creative (where they must analyze, synthesize and make ideas their own).

So, here’s my multiple choice final exam formula.

Step 1: Make it completely open-everything (book/friend/internet), but warn students that they should make a study sheet because they won’t have time to look up everything.  

The advantage of the study sheet method is it allows students to write down anything they have trouble memorizing, but it pushes them to study and synthesize before they get to the moment of the test.

I also advised them to work in groups, but not to centralize study sheet labor, as there would often be wrong answers in the “centrally made” study sheets.

To further reduce student anxiety, I renamed the exams “quizzes” and said we would drop their lowest score (out of 4 of them). Shockingly, I think the renaming made a difference.

Step 2: Rules for the test: for teaching on campus, I told them that if they came to the classroom that day, we would enforce it as a quiet space, but that they could take the exam anywhere they wanted.  Most students selected that option.  So I already know that a system like the one we’d need online can work.

Step 3: Build a unique exam for each student (sort of).  Let’s say you are giving a 50-minute exam with 40 questions.  Each question will address an “objective” on a topic.  Now, you need to write four (+/-) questions for each objective.  Yes, that’s 160 questions for a 40-question test, but I follow a formula (see Step 4).  Using the exam tools in MyCourses, each student then gets a set of 40 questions in a unique order, and with a unique order of answers.  They can phone or text a friend for help, or take it with a friend, but if they get stuck their friend has to actually give them the right answer, rather than saying “the answer to #6 is C,” so there is some learning going on.
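If it helps to see the logic of Step 3 spelled out, here is a rough Python sketch. It is not how MyCourses works under the hood, and I am only guessing at a reasonable way to do it by hand: the function name build_exam, the layout of the question bank, and the seed string are all made up for illustration. The idea is simply to pull one question per objective, shuffle the answers within each question, and then shuffle the questions.

```python
import random

# Hypothetical question bank: 40 objectives, 4 variants each (160 questions).
# "correct" is the index of the right answer in the "answers" list.
bank = {
    f"objective-{n:02d}": [
        {
            "stem": f"Objective {n}, variant {v}",
            "answers": ["right", "true but off-topic", "dismissed position", "converse"],
            "correct": 0,
        }
        for v in range(4)
    ]
    for n in range(40)
}


def build_exam(bank, student_id, seed="quiz-1"):
    """Return one exam for one student (illustrative helper, not a MyCourses API).

    Seeding with the student id makes the exam reproducible, so the same
    student always gets the same form if the exam has to be regenerated.
    """
    rng = random.Random(f"{seed}:{student_id}")
    exam = []
    for objective in sorted(bank):
        question = rng.choice(bank[objective])        # one variant per objective
        order = list(range(len(question["answers"])))
        rng.shuffle(order)                            # unique order of answers
        exam.append({
            "objective": objective,
            "stem": question["stem"],
            "answers": [question["answers"][i] for i in order],
            # new position of the right answer after shuffling
            "correct": order.index(question["correct"]),
        })
    rng.shuffle(exam)                                 # unique order of questions
    return exam


exam = build_exam(bank, "student-001")
```

With four variants per objective and shuffled answers, two students comparing notes can’t lean on question numbers or letters; they have to talk about the actual content.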

NB: I am not providing technical support on how to actually design exams in MyCourses.  Please contact TLS if the online training materials don’t work for you.

Step 4: Eliminate recognition as a factor in the test.  

Most multiple choice questions rely on recognition as the path to the right answer.  You get a question stem, and then four or five answers, one of which will be right.  Often, the right answer is something the student will recognize from the reading, while the wrong answers aren’t.  

But recognition isn’t the kind of thinking we want to test for.  We want to test if the student understands the reading.

The answer to this problem is simple: spend more time writing the wrong answers.

I’ve come up with a formula that works pretty well.  Pretty much all my multiple choice exam questions take this form:

Question stem. This is the “question” part of the question in multiple-choice lingo. The ideal question stem has more words than any of the possible answers and is clearly worded, though I do throw in a negation (“not”) from time to time.

A. Right answer

B. True statement from the same reading or a related reading, but that does not correctly answer the question

C. Argument or position author rehearsed and dismissed; or that appears in another reading that contradicts the right answer.

D. Converse of one of B or C.

From here, you’re basically set, though I often add a 5th option that is “the common sense” answer (since people bring a lot of preconceptions to media studies), or I take the opportunity to crack a dad joke.

Step 5: Give the students practice questions, and explain the system to them.  I hide nothing.  I tell them how I write the questions, why I write them the way I do, and what I expect of them.  I even have them talk about what to write on their sheets of paper.  At the beginning of each class, we would do a multiple choice question reviewing something from the last class. At the time, I used clickers, which we also used for surveys and attendance.

A few other guidelines:

Answers should be as short as possible; most of the detail should be in the question stem

Answers should be of roughly the same length

I never use “all of the above” or “none of the above”

Since we are testing on comprehension of arguments, I always attribute positions to an author (“According to Stuart Hall,”), so it is not a question about reality or what the student thinks, but what the student understands authors to mean.

Exception: I will ask categorical questions, i.e., “According to Terranova, which of the following 4 items would not be an example of ‘free labour’?”
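If you keep your question bank in a file, the guidelines above are mechanical enough to check automatically before you upload anything. Here is a minimal Python sketch of such a check. Everything in it is an assumption on my part: the function name check_question, the rule that “roughly the same length” means the longest answer is at most twice the word count of the shortest, and the crude test for attribution (looking for the phrase “according to” in the stem). Treat it as a first pass, not a substitute for reading your own questions.

```python
BANNED_ANSWERS = ("all of the above", "none of the above")


def word_count(text):
    return len(text.split())


def check_question(stem, answers, max_length_ratio=2.0):
    """Return a list of guideline problems for one question (empty if none).

    max_length_ratio is an assumed threshold for "roughly the same length".
    """
    problems = []
    lengths = [word_count(a) for a in answers]

    # Most of the detail should be in the stem, not in the answers.
    if word_count(stem) <= max(lengths):
        problems.append("stem is not longer than every answer")

    # Answers should be of roughly the same length.
    if max(lengths) > max_length_ratio * min(lengths):
        problems.append("answers differ too much in length")

    # Never use "all of the above" or "none of the above".
    for answer in answers:
        if answer.strip().lower().rstrip(".") in BANNED_ANSWERS:
            problems.append(f"banned answer: {answer!r}")

    # Positions should be attributed to an author (crude heuristic).
    if "according to" not in stem.lower():
        problems.append("stem does not attribute the position to an author")

    return problems


good = check_question(
    "According to the author of this week's reading, which of the following "
    "claims best summarizes the central argument?",
    ["short answer one", "short answer two", "another short answer", "a fourth short answer"],
)
bad = check_question(
    "Which is true?",
    ["all of the above", "a much longer answer that goes on and on for a while", "x", "y"],
)
```

Here good comes back as an empty list, while bad trips all four checks.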

Step 6 (optional):  In 2012, I had students try to write questions themselves.  Over the course of about 10 weeks, I had groups of 18 students write questions that followed the rules above, pertaining to readings or lectures from their assigned week, and post them on the discussion board.  A large number of them were pretty good, so I edited them and added them to my question bank for the final exam.  So for fall 2012, my COMS 210 students wrote about half the questions they were likely to encounter on the final.  If they were exceptionally lucky, their own question might wind up on their own exam (we used 4 different forms for the final).

Here is a link to a copy of that assignment. So long as the rules about timed exams do not change between now and September, I plan on using a variation of it for my fall lecture course: