Beyond the Bubble Test: How Performance Assessments Support 21st Century Learning
reviewed by Arlo Kempf - June 22, 2015
Title: Beyond the Bubble Test: How Performance Assessments Support 21st Century Learning
Author(s): Linda Darling-Hammond & Frank Adamson
Publisher: Jossey-Bass Publishers, San Francisco
ISBN: 1118456181, Pages: 464, Year: 2014
As the Common Core State Standards (CCSS) make their way into nearly all realms of classroom teaching and learning in districts and schools across the country, Beyond the Bubble Test: How Performance Assessments Support 21st Century Learning offers a research-rich guide to performance assessments and makes the case for their inclusion at the center of 21st century assessment in place of multiple-choice testing. Linda Darling-Hammond, her co-editor Frank Adamson, and their contributors have assembled the most powerful, detailed, and comprehensive academic case against multiple-choice testing to date, based on deep and wide research from top scholars, much of it conducted by the contributors.
Laying bare the timing, policy, and political implications of the work, Darling-Hammond, Adamson & Toch argue: "The advent of the Common Core State Standards, the Next Generation Science Standards, and the emergence of new accountability systems under federally approved waivers from [No Child Left Behind] provide a potential opportunity to address [the] fundamental misalignment between our aspirations for students and the assessments we use to measure whether they are achieving these goals" (p. 313). Performance assessments are the means and 21st century learning and skills are the ends.
High-stakes multiple-choice (or bubble) testing, as the unofficial centerpiece of NCLB assessment, has been fraught with limitations. Contributors point to: reduced curricular breadth and depth; the potential performance misrepresentation of ELLs and students with disabilities; the over-representation of lower-order thinking and analytical skills; the mismatch between the skills measured by multiple-choice questions and those needed to enter and succeed in post-secondary education; and the persistent gap between U.S. students and their international peers since the implementation of NCLB. In short, bubble testing has failed to teach and assess 21st Century skills, including the ability to synthesize, analyze, and apply new learning to address new problems, design solutions, collaborate effectively and communicate persuasively (p. 260). Indeed, this work, from these authors, should sound the death knell for traditional multiple-choice testing as we know it.
Performance assessment (often made up of smaller performance tasks) offers potential improvements in all of these areas. Understanding accountability beyond taxpayer return on investment, the book extends Darling-Hammond's (2004) notion that good assessment improves learning rather than simply measuring it. Organized in three major parts, the work's diverse and accomplished contributors include senior and junior academics, analysts, researchers with private and public sector assessment backgrounds, and one in-service teacher. Chapters are smoothly and consistently cross-referenced, and Darling-Hammond's voice, one imagines, is heard throughout with a strong narrative through-line (she authors or co-authors six of the book's eleven chapters).
Part One sketches the landscape of performance assessments past, present, and future with a comparative look at various national and subnational contexts. Rich with policy and practice examples, Part Two drills down into the mechanics, concerns, and further possibilities of performance assessments. Part Three looks at the challenges and opportunities of large-scale performance assessments and policy, including a detailed look at the relative associated costs.
Contributors tackle challenging questions using clearly laid-out and extensive research, including frank assessments and considerations of fairness, reliability, and representation that arise around any standardized measurement. Noting that these assessments can typically be twice as expensive as multiple-choice testing, the work suggests a holistic, system-wide cost-benefit analysis alongside smaller recommendations (such as initial instruction versus after-the-fact interventions).
Contributors flesh out the challenges of cost, reliability, and accuracy, noting that consistency in representation may increase with a larger number of standardized performance tasks and assessments, meaning more accurate results are more expensive than potentially less accurate results. On the limits of reliability, Pecheone and Kahl grant that although human scorers evaluating complex student performance are not perfectly calibrated when they begin the process. . . with training and moderation, it is not uncommon to achieve high rates of interrater reliability, often 90% or higher (p. 79). Although far from a panacea for equity and standardized measurement, performance assessment can increase accessibility for ELLs and students with disabilities and decrease the chances of misinterpretation of assessment results for these students.
Among the most fascinating undertakings here are considerations of teachers' roles in the generation, scoring, and revision of performance assessments, which are particularly robust at the state and other subnational levels. The comparative analyses of relevant teacher practice suggest a need for collaborative approaches, guided by calibrated professional judgment, bringing to mind Hargreaves and Fullan's (2012) notion of professional capital. This raises questions, of course, for teacher training, in which the portfolio model is widely used but too infrequently linked to potential future professional practice.
The role of teachers may relate to the integration of ongoing assessment of learning with particular articulations of federal and state standards. Moving away from summative measures to include ongoing assessment may be central to successful teaching, learning, and measurement of 21st century skills. Curriculum-embedded performance components might incorporate locally developed, curriculum-related tasks into the local performance component of accountability testing (p. 87). Applied to CCSS, this could mark an important re-integration of teacher and context knowledge/consideration within high-stakes test-centered educational policy and practice.
Early development of performance assessments was driven by anxieties over "measurement-driven instruction," "tests worth teaching to," and "what you test is what you get" (p. 27). Recognizing these concerns, the volume has outlined measurements worthy of instruction in our increasingly standardized education systems. Taken for granted, and perhaps as inevitable, in this conception is that we need to improve high-stakes standardized measurements rather than challenge the very premise that our schools be guided by them. Figuring out a way to ensure that students construct rather than select their answers may not go far enough for many in addressing the concerns of so many researchers, educators, and parents about educational standardization generally and standardized assessment in particular.
The consideration of computer measurement of scientific inquiry skills among K-12 students (p. 145) brought to mind a recent visit to a Toronto kindergarten class, in which students were outdoors all year long (even through the Canadian winter). Surrounded by little fingers covered in soil, structures built of bark, and cheeks rosy from constant activity, I was reminded that inquiry is human, intrinsically exploratory, and indeed that this beautiful learning (this scientific inquiry) would have to stop in order to assess it. I can muster no excitement at the prospect of moving these moments online and mandating screen time for children (with cheeks aglow for a different reason altogether), all in the name of measurement (even good measurement, conducted through performance assessments which closely mirror the real-life moments in which the measured skills and knowledge will be applied).
Going too far down this road may be to miss the point and to misread the intended audience. This book will not please those opposed to high-stakes standardized assessment outright, but it isn't intended to. Darling-Hammond is more aware of these concerns than many, and this collection is looking in a different direction. It is a best-case scenario for an effective and assessment-rich standardized educational future. Common Core is not going anywhere, and this book suggests we seize the opportunities it presents. If competition, comparison, and job-readiness at the core of assessment are inevitable, we should certainly move to richer tasks embodied in performance assessments that measure a broader array of skills and abilities than previous methods, and may do so more accurately, clarifying demographic variations and improving the overall precision of standardized assessment results.
This important book successfully makes the case for large-scale performance assessment in place of or alongside bubble test measures in the implementation of national standards. So, what's next? Among the most important questions thus raised by the work: a) is the widespread implementation of new high-quality performance assessments likely given the cost proclivities? and b) do the necessary layers of bureaucracy exist that are needed to hammer out nuanced assessment practices?
Much successful performance assessment has come at the subnational level, both currently and historically. Concerns about performance assessment at an increasingly national standards level, and the ability of these assessments to be adapted based on local context and knowledge (including a variety of inputs from teachers), are thus worth close consideration. As one of the most impactful advocates of quality public education, Darling-Hammond is indeed a unique and significant voice in this conversation. Beyond the Bubble Test is worthy of that reputation and should be difficult to ignore as the future of national standards unfolds in the coming years. Indeed, there may be no one better to fight this uphill battle.
Darling-Hammond, L. (2004). Standards, accountability and school reform. Teachers College Record, 106(6), 1047-1085.
Hargreaves, A., & Fullan, M. (2012). Professional capital: Transforming teaching in every school. New York, NY: Teachers College Press.