By: Justin Mark
My original motivation for this inquiry was to find evidence demonstrating the negatives of multiple-choice testing. I have used multiple-choice testing in past practice, but have always felt a bit guilty for doing so. As a young teacher, I wanted to fit in and follow the practices of the more experienced teachers. The multiple-choice tests I used in the past were often administered during exam week and were required to take at least one hour to complete. Marking turnover for report card reporting in many cases afforded only a single day, so Scantron multiple-choice technology was efficient and allowed for timely reporting.

Now entering my 15th year as an educator, I have long since abandoned the use of multiple-choice summative assessment; in fact, in the past five years I haven't employed a single multiple-choice exam. For the past six years I have taught French 8-12 and History 12 at Cedar Secondary in SD68. My summative assessment is skill-based and consists of an interview, a meaningful and guided written sample that reflects the structured lessons of the semester, and a translation to test students' reading comprehension. My assessment is meant to be meaningful, authentic, and reflective of acquired and developing skills. From the perspective of a student taking an M/C test, I find them quite frustrating, as the questions are often not mutually exclusive and are sometimes ambiguous. I find myself wanting to debate the merits of the question, or argue for multiple interpretations of the answer, during the test.

Currently at Cedar Secondary we are engaged in discussions debating the most appropriate methods of assessment for our students. Our school seems to be divided between those who believe that content-based summative final assessment using multiple choice is best practice and those who feel that more authentic styles of assessment, such as showcases of learning or skill-based assessments, are more meaningful. These discussions have fueled my inquiry.
Original Learning Activity
A sample of a standard multiple-choice test used for summative assessment. These computer-generated questions are supplied through software designed for Glencoe's Bon Voyage 1 program.
Chapitre 8 L'aéroport et l'avion
Link to Multiple-Choice questions - http://www.glencoe.com/qe/qe22.php?qi=501
Before applying a UDL critique to the above multiple-choice assessment, I wanted to review the general merits and criticisms of multiple-choice assessment. Through my research I found a very helpful table that outlined the pros and cons of multiple-choice testing. According to the assessment guidelines at Antelope Valley College, many of the merits of M/C assessment benefit the instructor and not the student. M/C questions often assess surface learning, may be misinterpreted, and in many cases test for lower-level knowledge. In addition, the questions don't allow students to explain or expand on their answers. Random selection and Scantron errors are also common.
Universal Design for Learning Critique:
For this critique I have chosen to employ the "Universal Design for Learning Guidelines" produced by CAST. The Glencoe Bon Voyage multiple-choice questions I have selected for this critique do not meet many of the criteria in the categories of representation, action and expression, and engagement. In my own practice I have witnessed many students "tune out" during extensive M/C assessment sittings; often students are expected to complete content-based exams containing hundreds of multiple-choice questions. Focusing on the category of representation, this form of assessment clearly provides a very limited display of information. In terms of providing multiple means of action and expression, one could argue that a paper-based (Scantron) multiple-choice test is a limited form of assessment. Finally, in terms of providing multiple means of engagement, the M/C test is again extremely limited. Using the UDL guidelines to review the M/C assessment exposes its limitations in all three categories.
My original plan was to replace the M/C summative assessment model with authentic, skill-based assessment pieces (an interview, a guided written sample, and a translation). Initially I was prepared to permanently retire multiple-choice questioning; however, during my research I discovered that if the assessment strategy for multiple choice is flipped from summative to formative, significant and higher-level learning outcomes can be obtained, as demonstrated through examples even at the post-secondary level. For example, at the University of Wollongong in Australia, a series of multiple-choice quizzes posted online is used formatively in a second-year fluid mechanics engineering course to help students build deeper understanding. In a study, students who used these M/C quizzes posted higher final exam results than those who didn't. (Source: http://aaee.com.au/conferences/AAEE2010/PDF/AUTHOR/AE100014.PDF)
Similarly, in her blog post, Heather Wolpert-Gawron, a middle school teacher and author of the book "Tween Crayons and Curfews: Tips for Middle School Teachers," demonstrates how formative applications and reflective practice "Lesson Trails" can yield significant learning outcomes for a diversity of learners in the middle school environment. (Source: http://www.edutopia.org/blog/multiple-choice-assessments-formative-heather-wolpert-gawron)
Ultimately I was convinced that by changing the assessment goal from summative to formative, multiple-choice questioning could still be a valuable tool for student learning. In addition, a formative multiple-choice activity meets more of the criteria on the UDL guidelines rubric.
My goal was to create a lesson guided by the Universal Design for Learning guidelines and following a formative assessment strategy. As a challenge, I wanted to produce a lesson using the same base material as my multiple-choice sample. At its core, those questions focus specifically on the skill of reading comprehension in French. I hypothesized that higher-level learning outcomes could be produced for a wider range of learners with diverse learning styles by changing the delivery method and presentation of the material. My goals for the lesson were:
1) Incorporate technology - Twitter
2) Improve reading comprehension - specifically translation
3) Group problem solving
4) Formative assessment vs Summative Assessment
5) Increased engagement
6) Having students demonstrate their thought process instead of just randomly guessing
7) Having students work towards building class consensus of the correct answer instead of it being supplied instantly by the teacher
Lesson Procedure:
- Using the same question bank as shown in example above.
- A class Twitter stream is created and displayed on the projector through the teacher's computer.
- Students form small groups of 2 or 3. (Group work is optional and depends on the specific learning outcomes; grouping may also help facilitate the in-class lesson for students who do not have access to handheld technology.)
- Students are given one question at a time; the information can be provided electronically, on paper, and/or via the projector.
- Students are asked to translate their sentences and choose the correct answer, then post their response using Twitter.
- Students are given a five-minute window; when finished, they can review the Twitter stream and compare their answers with their peers.
- After five minutes, a class-led discussion builds consensus to determine the best answer.
- Students move to the next question.
- The class works through 8 questions and finishes with a reflection-of-learning piece.
- After 45 minutes, students write a reflection on what they learned, using the following guiding questions: List 10 French words that I will remember after today! Did using Twitter enhance my learning? What was the most challenging aspect of this lesson?
- Students are asked to add five expressions to their personal dictionaries.
Reflection and Closing Statements
The reworked formative multiple-choice activity covers a broader range of the Universal Design for Learning guidelines. Representation is significantly enhanced through the multiple presentations of information and the use of Twitter. Action and expression are increased through the dynamic of translating, posting on Twitter, and building consensus towards the best answer. Engagement is positively affected by the group dynamic, the use of Twitter, the publishing of content, and the reflective components.
A big part of my motivation for this inquiry was to critically examine the function of multiple-choice assessment. The Universal Design for Learning guidelines clearly expose the shallowness of such practice and reconfirm what I already suspected. Hopefully, though, I have demonstrated that multiple-choice questioning does not need to be discarded. By changing the assessment style from summative to formative and "tweaking" the delivery method, there is still significant learning value to be yielded from this traditional tool.