Reflections on the Gordon Commission

by Edward H. Haertel - 2014

Background: This brief reflection on the work of the Gordon Commission calls out significant themes and implications found in the various papers authored by the commissioners and other scholars, especially those included in this special issue of Teachers College Record.

Purpose: The forward-looking vision of the Gordon Commission is contrasted with contemporary teaching and testing practices to highlight implications for new assessment purposes and methods. It is argued that a new vision of assessment is inseparable from a new vision of teaching and learning. To realize this new vision, some current practices, especially uses of testing to sort and select students and to rank teachers and schools, will need to be greatly attenuated or even abandoned.

Research Design: This is a narrative review expressing the author’s own point of view. No empirical findings are cited.

Conclusions: A conservative reading of the Gordon Commission’s work might suggest that educational assessment tomorrow should function much as it does today, only better. A closer reading, however, suggests a more radical view. Assessment FOR education must break free from the constraints of standardization and consequential comparison.

The Gordon Commission on the Future of Assessment in Education has been a remarkable undertaking. Over the past three years, the 30 commission members, supported by some 50 consultants, have produced an extraordinary body of papers, policy briefs, and reports. The work is fresh, sound, and well-reasoned, offering a new vision of why, what, and how educational testing should be carried out—a vision in which the most important assessments are deeply connected to teaching and learning, and designed to support those processes. This vision places highest priority on assessment FOR education, while still attending to assessment OF education. It is developed in full recognition that learning, including out-of-school learning, will look very different in tomorrow’s digital future from the way it looks today.


The Commissioners set out to envision a new and better future for educational assessment, but their work begins with a cogent critique of contemporary educational testing practice. One consistent theme is the artificiality of the tasks and settings current assessments employ. Today’s testing evolved from methods developed primarily to sort and select, not to support learning. The need for fair comparisons among examinees prompted a concern for standardization, but there is an inevitable tension between standardization and authenticity. As the name implies, standardized testing resolved this tension in favor of standardization. The need to make assessment contexts uniform enforced a separation of testing from teaching and learning. Along with standardization, current testing practice was shaped by demands for efficiency and objectivity. Selected-response questions meet all three of these requirements. They are administered under standardized conditions, easy to score quickly and inexpensively, and carefully designed so that for each question, there is just one correct answer. Demands for efficiency and objectivity, together with outmoded behaviorist ideas about teaching and learning, also led to an overemphasis on low-level factual and procedural knowledge, which are just the sorts of learning outcomes most easily measured using multiple-choice items. In today’s testing, the broad repertoire of skills and habits of mind defining intellective competence is mostly left unexamined.

Viewed from the perspectives of the authors in this journal issue, educational testing today seems a very strange business. We socialize students about how to perform an activity called “taking a test,” which is quite unlike any of their other day-to-day activities, either in school or out of school. Even though assessment tasks are intended to elicit knowledge, skills, and abilities that are valued in their own right or that support future school learning, most of these assessment tasks are themselves highly artificial, contrived activities.

Students taking a test are indeed performing. They are answering questions, solving problems, or writing out their thoughts not for any authentic purpose, but merely to demonstrate, for the teacher or for some more distant audience, that they are capable of doing so. Like a play acted out on a stage, the test performance is a simulation, not the real thing. After rehearsals, referred to as test preparation, the testing event itself is staged. It unfolds with pencils and booklets and answer sheets as props, beginning with the examiner reading some scripted lines. Most importantly, special rules constrain the social context of the activity. Help seeking follows precise rules, and peer interaction is generally forbidden. To borrow the phrase used by Bransford and Schwartz (1999), examinees engage in “sequestered problem solving,” demonstrating what they are able to accomplish not in a “smart workplace” or using “smart tools,” not as participants in groups working together, but instead using a meager set of permissible resources, perhaps a calculator or a translating dictionary, and working in isolation. Some do well at this odd activity, some poorly. For better or worse, these performances can have profound consequences for the test takers themselves, their teachers, and their schools.


The articles in this issue, among other papers and reports produced by the Gordon Commission, point the way toward a very different assessment future. Giving assessment FOR education priority over assessment OF education will mean greater emphasis on classroom-based assessments used to guide instructional decisions and less emphasis on external “drop in from the sky” examinations. Test interpretations will be primarily criterion referenced, showing directly what individual students know or are able to do, rather than norm referenced, showing mostly how they compare to one another. As children come to be seen as creators and users of knowledge, less emphasis will be placed on the teaching and testing of facts and procedures, and greater emphasis will be placed on competence in accessing, evaluating, and using new tools for knowledge representation.

In this new future, classroom assessment will be truly integrated with instruction, based on student pursuits that are educationally useful and intrinsically meaningful in the classroom context. Children will work alone or together on engaging tasks, tackling novel as well as routine sorts of problems, and much of their work will be supported by various technology-mediated systems. As computers and other digital devices increasingly serve as media for both instruction and communication, the old idea of a “scorable record” like a bubble sheet will be completely transformed. Digital technologies will support both data collection and data processing, yielding richer, more detailed portrayals of the processes, as well as the products, of students’ academic work. Data streams generated in the course of ongoing classroom activities may serve as the basis for assessments that today would be prohibitively expensive or simply impossible. Information from multiple sources will be brought together and interpreted in context to support high-level inferences about children’s reasoning and their evolving expertise. Assessment feedback will be generated in real time to guide instructional decision making.


The shift in primary focus from assessment OF education to assessment FOR education will require changes beyond a new vision for educational testing alone. As some of these articles suggest, the promise of the Gordon Commission’s vision for the future may not be attainable unless some basic assumptions about schooling and testing are also challenged. It would be possible to read this body of work as calling mostly for an expansion of formative classroom-based assessment in support of learning and instruction, together with some enhancements of external summative assessments, but still leaving the sorting and selecting functions of summative assessment largely intact. There is scope for much improvement within this frame, but some of these articles go further still. They call into question (1) the appropriateness of tying education and educational assessment to the allocation of other kinds of opportunities in society and (2) the fundamental nature of the “achievement” schooling is designed to foster and educational tests are designed to measure.

The sorting and selecting functions of educational testing are bound up in complex ways with ideas of meritocracy. We take it as a given that educational achievement tests should determine access to further educational opportunities and, thereby, to lucrative high-status occupations and all that greater affluence buys. From this perspective, the ill-defined notion of equality of educational opportunity assumes great importance as a matter of fairness to all. Under this prevailing view, the ideal educational system would afford equal opportunities for everyone, so that variation in learning outcomes was attributable solely to differences in individual hard work and talent. This is a worthy ideal. Many of today’s tests were devised, and in fact have served, to identify talented students who would otherwise have been disenfranchised. Note, however, that there may be a contradiction here. As Varenne (this issue) reminds us, so long as assessments and schooling are used to grant privileges for some and to close doors on opportunities for others, the system will have to be structured so as to create winners and losers. It is, of course, an important and worthy goal to strive for equality in opportunity to learn, but the tacit requirement for unequal outcomes may reinforce the prevailing system under which the education of some children is in fact privileged over that of others. Ironically, tests once created to promote equality of educational opportunity may instead have come to help sustain patterns of inequality.

Educational testing for sorting and selection requires that different examinees’ scores be comparable. As already noted, this implies a requirement for standardization of testing conditions. It also implies a requirement for standardization of targeted learning outcomes. Scores are most easily compared when achievement is regarded as unidimensional, enabling a ranking of all examinees along a single continuum. Sorting and selecting are more complex, though still possible, when achievement is regarded as multidimensional, but still defined in the same way for all examinees. Several articles in this journal issue call for a richer conception of multidimensional learning outcomes. This is clearly a step in the right direction, but might leave intact the prevailing notion that educational achievement can be defined objectively as some body of knowledge, skills, and dispositions existing independent of individual learners, to be acquired by different students to greater or lesser degrees. The paper by Gordon and Campbell (this issue) challenges the positivist epistemology inherent in this view. These authors remind the reader that if knowledge and knowing how, knowing that, and knowing with (Broudy, 1977) are truly contextualized, then learning outcomes can no longer be defined in terms that are context free. Intellective competence encompasses a social dimension. In discourse communities of all kinds, including classrooms, participants continually construct shared knowledge together. Certainly, there are fundamental skills and much knowledge that all or nearly all children should master, but beyond the basics, different patterns of interest, talent, and experience will lead different children to grow and flourish in different ways. This is becoming more and more evident as increasing diversity and mixing of global populations enrich the range of backgrounds and experiences represented in most classrooms. 
A conception of assessment as “diagnostic inquiry, exploratory mediation, and intensive accountable exchange” must capitalize on this diversity and accommodate meaningful variation in the goals different learners pursue.

In summary, a comfortable, conservative reading of the Gordon Commission’s work might suggest that educational assessment tomorrow should function much as it does today, only better. Computers and other digital devices will enable some new sorts of tests and better reporting. Classroom assessments, especially, will improve, but nothing more fundamental will change. A closer reading, however, suggests a more radical view. Assessment FOR education must break free from the constraints of standardization and consequential comparison. It must support children in classroom learning communities as they develop a range of mental capacities that defies uniform rankings or tidy classifications. It must respect students’ different backgrounds, interests, and experiences. It must support grounded inferences from multiple sources of rich information to guide and document the growth of children’s intellective competence as they learn together. Much work remains, but the Gordon Commission has pointed the way.


Bransford, J. D., & Schwartz, D. L. (1999). Rethinking transfer: A simple proposal with multiple implications. In A. Iran-Nejad & P. D. Pearson (Eds.), Review of Research in Education, 24, 61–101. Washington, DC: American Educational Research Association.

Broudy, H. S. (1977). Types of knowledge and purposes of education. In R. C. Anderson, R. J. Spiro, & W. E. Montague (Eds.), Schooling and the acquisition of knowledge (pp. 1–17). Hillsdale, NJ: Erlbaum.

Cite This Article as: Teachers College Record Volume 116 Number 11, 2014, pp. 1–6
https://www.tcrecord.org ID Number: 17654, Date Accessed: 10/24/2021 3:39:42 PM


About the Author
  • Edward Haertel
    Stanford University
    EDWARD HAERTEL is the Jacks Family Professor of Education, Emeritus, at Stanford University, where his research and teaching focus on quantitative research methods, psychometrics, and educational policy. His recent publications include “How is Testing Supposed to Improve Schooling?” (2013, in Measurement: Interdisciplinary Research and Perspectives) and Reliability and Validity of Inferences About Teachers Based on Student Test Scores (14th William H. Angoff Memorial Lecture, 2013).