## Should you start using multiple choice tests?

Pros

1. They can be exceptionally rigorous

As Daisy Christodoulou sets out first here, here and then here, multiple-choice questions can, counter to our lay-person intuition, be very difficult to answer well, requiring truly concrete and in-depth knowledge. Take the following example:

15. How did the Soviet totalitarian system under Stalin differ from that of Hitler and Mussolini?
A. It built up armed forces.
B. It took away human rights.
D. It abolished private land ownership.

It wouldn’t be enough to know a bit about Nazi Germany; you would also have to know something about the regimes of Stalin and Mussolini, and be aware of subtle differences between them. I think I know the answer to the question… but I’m not certain!

If you’re still unconvinced, just consider Tony Gardiner’s UK Mathematics Challenge, which was sat by pupils across the country on Thursday last week:

http://www.ukmt.org.uk/individual-competitions/intermediate-challenge/

Or try searching for just a few example questions from the GMAT:

2. They can be marked quickly, or even instantly by computer, and then provide remarkably granular insight into pupil knowledge and progression

‘But can’t they just guess?’  In terms of a summative assessment of pupil progress, guessing won’t work.  If the questions have five possible answers, then over time they will only achieve 20% through guesswork.  A pupil could guess correctly, presenting the impression that they know a particular item which in fact they do not.  Over time, however, if this question were allowed to recur as part of a long-term assessment strategy, then the correct guess would likely be spotted in the analysis.
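To put a rough number on "guessing won't work", here is a minimal sketch, assuming every question has the same number of options and that a guessing pupil picks uniformly at random. The function name `prob_at_least` and the 20-question test are illustrative assumptions, not taken from any real assessment.

```python
from math import comb


def prob_at_least(n_questions: int, n_options: int, min_correct: int) -> float:
    """Chance that pure guessing gets at least `min_correct` questions right.

    Assumes one correct option per question and independent, uniform guesses,
    so the number of correct answers follows a binomial distribution.
    """
    p = 1 / n_options
    return sum(
        comb(n_questions, k) * p**k * (1 - p) ** (n_questions - k)
        for k in range(min_correct, n_questions + 1)
    )


# A single five-option question: a guess is right one time in five.
single_question = prob_at_least(1, 5, 1)

# A hypothetical 20-question test: chance of guessing at least half correctly.
half_marks = prob_at_least(20, 5, 10)

print(f"{single_question:.0%}")
print(f"{half_marks:.2%}")
```

A single lucky guess is quite likely, but under these assumptions the chance of guessing half of a 20-question test correctly is well under 1%, which is why guesses tend to wash out over repeated assessment.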

As Joe Kirby and Bodil Isaksen also recently pointed out to me, if you allow for more than one correct answer, and don’t say how many correct answers there are, then the probability of correct guessing reduces dramatically. For example, if the question were “Which of these were considered wonders of the ancient world?” and five of the seven were given as possible answers, along with a further five which are incorrect, then the odds of a correct guess plummet to roughly 0.4% (1 in 252) if the pupil knew that there were five correct answers. I’m not even sure right now how to calculate the odds if they don’t have that information. Joe also noted that, as with the example question about totalitarianism, if your ‘wrong options’ are things that sound plausible, like ‘the Colosseum’ for example, then the questions become a further tool for learning as pupils make mistakes and realise that ‘it’s an error to think that the Colosseum was a wonder of the world.’

3. Once created, they can be set automatically

4. They enable pupils to easily leverage the testing effect and distributed practice, thus building memory storage strength throughout the course of their time in education

If multiple-choice assessment were part of an extremely frequent routine, it would benefit from the double whammy of the testing effect (acts of memory recall build the storage strength of that memory) and, if questions are allowed to disappear for a while before they reappear, distributed practice (the process of allowing retrieval strength to dip, so that when the act of recall is attempted the subsequent gain to storage strength is increased).

They can also allow pupils to take greater ownership of their learning. Some teachers are now setting up courses on Memrise, which empower pupils to leverage the testing effect and distributed practice from home.

5. When used as pre-tests, they can prime pupils for what is to come, leading to even greater learning

It’s been shown that testing before learning the material on the test can accelerate future learning of that material.

Multiple-choice tests have an advantage here because they offer an opportunity for pupils to see key words, language, phrases and ideas before they are presented to them in lessons, while still giving pupils the chance to feel like they can kind of do *something*, even if it’s just read then guess, whereas by contrast an open-ended question would quite obviously be impossible to answer.

Suggestions have been made that this pre-lesson priming leads to greater attention in pupils when those same key words and phrases from the pre-test appear in the lesson. Ironically, the research has suggested an advantage to English and the humanities as a result of this, whereas most English and humanities teachers I’ve spoken to have intuitively assumed that pre-testing might be fine for maths but has no place in their subject.

Cons

1. They take a long time to create

2. They require a great deal of expertise and skill to create well

Creating a good set of multiple-choice questions is not an easy thing to do.  Ideally, the incorrect options should be selected with care, and should reveal common misconceptions or tease out fine detail nuances between minimally different concepts (e.g. totalitarianism under different leaders.)  Just take a quick look at Dylan Wiliam’s work here on Hinge Questions to get a sense of how much expertise needs to go into designing these well:

You can see this kind of thinking at work on the UK Maths Challenge papers.  If people have been inexpertly creating multiple-choice questions for many years, it would be no surprise that they now have a bad rep.

3. We value what we measure, and they restrict what we ask pupils to do

Dan Meyer has done some fascinating work analysing the verbs that Khan Academy uses in posing its maths questions.  The most frequent verb is analyse, followed by calculate, both occurring roughly 20 times more often than the verbs estimate and argue.

He then looks at what pupils are asked to produce. Most frequent are ‘a number’ and ‘multiple-choice response,’ roughly 30 times more often than ‘a transformation’ or ‘a two-column proof.’

This kind of analysis is important because we value what we measure. In maths, for example, if people were forever expected to ‘select the right answer’ then perhaps we risk people thinking that mathematics is all about ‘getting the right answer,’ even when questions are as demanding and thought-provoking as those of the UKMT.

So is this a tool of summative assessment, formative assessment, or a tool of learning?

All of the above.

Multiple-choice tests can be a summative tool (see TIMSS and NAEP), a formative tool (via granular question-level analysis) and a tool of learning (via the testing effect and distributed practice, and as this article notes: “the well-replicated finding that tests do not simply act as measurement tools, but also prompt learning in their own right, [are under-appreciated.]” http://learninglab.uchicago.edu/Pre-Testing.html)

I get the sense that, in education over the years, people have tended to adopt an idea and then stick to it.  Con number 3 is largely eliminated if multiple-choice questions are seen not as another panacea, but as another tool in an inevitably complex toolbox.

But should you use them?  Well I don’t.  I don’t only because I do not have enough time to write them, write them well, and write enough of them to leverage the pros.  If you can find a way, though, I would strongly suggest that you add them to your repertoire.  David Thomas managed to do so, as he notes here, through use of Quick Key.  For my part, I’m wondering if I can use Quick Key to have pupils sit half a GCSE paper, and fill in the numbers according to how many marks they scored on each question, so that I can quickly create a QLA of every practice paper they sit…

Ultimately, I’m fascinated by the role multiple-choice questions could play as part of a well-thought-out, long-term education strategy, embraced by a teaching profession as a whole.

Teach First 2011 maths teacher, focussed on curriculum design.

### 16 Responses to Should you start using multiple choice tests?

1. Kris Boulton says:

Reblogged this on The Echo Chamber.

2. cbokhove says:

Great post. As a ‘con’ I would argue that some types of content do not lend themselves to them (or formulated differently: putting them in MC form would test something different), for example ‘What is the solution of the equation x^2=16’. Answering a MC on this seems rather pointless, even if you make up distractors. I however do agree that MC are undervalued for some of the pros you mention, especially quick diagnostic, formative tests, etc.

(btw, typo TIMSS paragraph. Further, TIMSS is not exclusively MC by the way)

• Kris Boulton says:

Thanks for the comment. Indeed, TIMSS is not exclusively MC; neither is the GMAT. I think if anything that speaks further to their being incorporated as part of a broader range of strategies.

For the example you gave, maybe there’s still some utility in MC for that question (computer/quick assessment for example), but you’re right to point out that it can lead to assessing the wrong thing. For example ‘What is the solution to 2x – 9 = 13’ – if you were hoping to test for analytic solving via the balance method, you might instead find that someone correctly answers it by substituting in each of the available options! Nothing wrong with that, but not what you were hoping to assess.

Also in the GMAT and UKMT, part of the strategy for being successful is the quick elimination of answers that obviously cannot be correct, which again, isn’t necessarily what you’re hoping to test for!

3. Nick Daniels says:

Hi Kris, I think this is a really good summary of the main issues here. Regarding your point at the end (that MCQs are useful but too time consuming to write), surely the answer is to use tests written by someone else? Obviously then a teacher faces at least two challenges:
1. Making sure the tests are good quality
2. Making sure the tests are suited to a particular class (i.e. pitched at the right level, not testing them on topics they hadn’t studied etc.)
These are significant, but I don’t think they’re insurmountable.

• Kris Boulton says:

Thanks Nick.

Yes, I agree that having someone else write them is the way to go. But, as with most resources produced by other people, you’re right to say that we then run into those two key problems – are they any good, and are they right for what I’m doing? I guess that’s why I prefer to think more in terms of a cohesive strategy… something that all teachers could pick up and use for their classes, somehow, and be guaranteed of their quality.

4. Nick Daniels says:

The obvious solution (possibly one that exists already) would be to have tests provided alongside a textbook. The fact they were produced by a publisher would give some quality assurance (depending on the publisher, obv), and the textbook link would give you reassurance that they’re pitched at the right level / linked to whatever syllabus you’re following.

• Kris Boulton says:

Pretty much – just doesn’t quite exist yet… gap in the market.

5. Hi Chris. How to calculate those probabilities? For a test in which the student knows there are 5 correct options out of 10 the probability is 1/N where N is the number of ways to choose 5 objects from a set of 10, or 10!/(5!)^2. So 1/N = (1x2x3x4x5)^2/(1x2x…x9x10) = 2x3x4x5/(6x7x8x9x10) = 1/252 ≈ 0.00397 which rounds to 0.4%. If there may be any number of correct options (including 0 or 10) the probability is 1/M where M is the number of subsets of a set of 10 objects, which is 2^10 = 1024. So 1/M = 1/1024 ≈ 0.00098 which rounds to 0.1%.
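The two figures above can be checked with a few lines of Python. This is only a sketch of the same arithmetic, assuming ten options of which five are correct, and a pupil who picks uniformly at random among all the selections open to them; the variable names are illustrative.

```python
from math import comb

n_options = 10  # five genuine wonders plus five plausible distractors
n_correct = 5

# Pupil knows exactly five options are correct: one winning pick out of C(10, 5).
ways_known_count = comb(n_options, n_correct)
p_known_count = 1 / ways_known_count

# Pupil does not know how many are correct: any subset of the ten could be the
# answer (including none or all of them), so there are 2**10 candidates.
ways_unknown_count = 2**n_options
p_unknown_count = 1 / ways_unknown_count

print(ways_known_count, f"{p_known_count:.2%}")      # 252 0.40%
print(ways_unknown_count, f"{p_unknown_count:.2%}")  # 1024 0.10%
```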

Things get much more complicated (or flexible depending on how you look at it) if one wishes to assign partial credit for better wrong answers. For example, what if four are correct out of 5 picked? What if there is a really important option whose omission would betray complete ignorance of a subject? Or a really bad option whose inclusion would do the same? In a simple multiple choice one could penalize wrong answers at a rate that brings the expected score for guessing to 0 (and creates the possibility of a negative score!). Etc.
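The penalty that brings the expected guessing score to zero can be worked out directly. The sketch below assumes a simple multiple-choice question with exactly one correct option, one mark for a right answer, and uniform random guessing; the function names are hypothetical.

```python
from fractions import Fraction


def zero_expectation_penalty(n_options: int) -> Fraction:
    """Marks to deduct per wrong answer so that a random guess averages zero.

    With one mark per correct answer, the penalty works out to 1 / (n_options - 1).
    """
    return Fraction(1, n_options - 1)


def expected_guess_score(n_options: int, penalty: Fraction) -> Fraction:
    """Average marks per question for a pupil guessing uniformly at random."""
    p_right = Fraction(1, n_options)
    return p_right * 1 - (1 - p_right) * penalty


for n in (2, 4, 5):
    penalty = zero_expectation_penalty(n)
    print(n, penalty, expected_guess_score(n, penalty))
```

For true/false questions the penalty is a full mark per wrong answer, and for five options it is a quarter of a mark. Exact fractions are used here so that the zero comes out exactly rather than as a rounding artefact.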

If there are sufficiently many questions the accuracy in judging student mastery goes up rapidly. Contrary to popular wisdom MC does not rule out testing for understanding. In fact I find it very helpful. I teach honours math courses at university and routinely use True/False (essentially MC with only TWO options!) for this precise purpose. I give a series of 20 statements and students must decide, in each case, whether they are true or false. My students know how I make these statements up: I pick common misunderstandings or misinterpretations of the material and state them, in the positive or in the negative, and see whether they can pick out the howlers. Discouragingly, students are poor at this exercise (or I am lousy at teaching them the requisite insight) and — particularly when I try this with non-majors — it is common to see a class average that is below 50% (the mark expected through random guessing). I suspect they would do better if I gave them a huge bank of questions for spaced practice; that’s a thought. Of course the object is not for students merely to consistently get these right, but to gain understanding. I find these questions extremely productive in going over tests afterwards. Students always want to know “why that isn’t true” etc.

6. dodiscimus says:

The problem with calculating the probability like that is that students will have some notion (perhaps erroneous) of how many correct answers there might be, so although the 1/M calculation is nice (I especially like that rounding – are you a closet physicist?) students are not equally likely to guess at 0, 3, 5 or 10 correct answers. Whether this makes it less or more likely they can score by guessing depends on whether the question setter has the same bias. This is just like how students get thrown if a MC test has D as the correct answer five times in a row. Maybe some very sophisticated Bayesian / game theory approach is needed but it’s well beyond my dodgy maths…

However, I digress. I think MC questions are exceptionally useful for all the reasons given and the problem is always time to create them. How much more useful would throwing money at this problem have been than re-writing the NC? What is needed is some sort of central repository that teachers can load their questions up to, and crucially an experienced teacher (or team of them) doing the QA and some intelligent categorisation. It would be handy if such a database allowed direct import into Moodle, Socrative etc. However, I suspect a national collection is not going to happen in the near future but maybe collaboration across a department, local Teaching Schools Alliance, LA HoD forum, or academy chain might be fruitful.

For science teachers, there was a project at the University of York to produce a bank of MC questions to test important misconceptions in physics. These are still available: http://www.york.ac.uk/education/research/cirse/older/epse/resources/ Things like the Keogh and Naylor concept cartoons are good too (http://www.conceptcartoons.com/) and are also a good example of how a simple MCQ can be expanded into a productive discussion.

7. Clifford Palmer says:

Hi Kris, Great post as usual. Have you had a look at the Diagnostic Questions website? I have only just started looking at it, and it has a maths focus at the moment, but they seem to be collating the type of hinge questions you were referring to. http://www.diagnosticquestions.com/Home/About

• Kris Boulton says:

Thanks Clifford. I have seen it before, yes. It’s been the logistics of implementing them effectively that have held me back up until this point… which isn’t to say that I don’t take an ongoing interest, and love the work they’re doing!

8. A great post, Kris. The focus here is on how MC questions can improve learning, but I think there’s an understated benefit to our teaching too. The reason good questions take so long to write is that they force us to properly think about what misconceptions typical students hold.

I would love lesson plan pro-formas to do away with a ‘learning objective’ box at the top and replace it with:
* this is the (single) multiple-choice question I want my students to answer at the end of this lesson
* these are the wrong answers that students with an incomplete understanding might choose
* crucially: this is what I’m going to do in my teaching to tackle those misconceptions.

We started using multiple-choice questions in my maths department a couple of years ago, when we developed our mobile assessment platform called Imvoto. At first it ONLY supported MC questions… this highlighted Kris’s con number 1: it took us 3-4 hours to write good questions for a 50 minute lesson.

Even though in Imvoto we’ve introduced alternative forms of question that are less time consuming to write, MC questions are still the bedrock of my current planning and teaching. I typically have sequences of 5-10 questions of which the first couple are MC. If pupils get those wrong, it’s a quick sign to them and me that we have work to do to close the gaps in their understanding.