This article argues that the historical reduction of age at grade level in the 20th century has interacted with test scores that take age into account, resulting in a rise in IQ scores in school populations.
This chapter contrasts the approach to educational evaluation championed in recent educational policy-making with a dialogical epistemology of evaluation. A dialogical epistemology, drawn from the writings of M. M. Bakhtin, enjoins evaluators to consider multiple voices and their relative authority and power in making judgments of the worth or merit of programs. Further, it positions them as participants in policy discussions rather than arbiters of program value whose authority stems from the methods they use.
Framed by the assumptions of ethnomethodology and drawing on methods of conversational analysis, the author analyzed a set of 10 transcripts of Teacher Work Sample scoring conversations to identify patterns in scorer interaction. Interactive rules and strategies are identified and implications and cautions offered for the use of work samples, particularly for high-stakes assessment.
In this article, teacher action research is positioned as a bridge connecting research, practice, and policy—as an important and practical way to engage teachers as consumers of research, as researchers of their own practice, as designers of their own professional development, and as informants to scholars and policy-makers regarding critical issues in the field.
This article analyzes the classroom instruction of an experienced teacher in an elementary school where the principal resisted a movement toward standardization and supported teachers’ autonomy and authority over curriculum and instruction amid high-stakes state-level testing in language arts and mathematics. Examining the teacher’s instructional practice in social studies, a subject not included in state testing but nevertheless impacted by state testing, we demonstrate how specific teaching dilemmas that arose in response to state testing led to a new type of professional practice that we call constrained professionalism.
This project examines the intersection of in-class assessments, student identity, and the construction of teachable moments. By examining a science department’s attempt to use daily practice with assessments as a teaching tool, this study explores students’ use of discourse in relation to the teacher’s use of this approach to teaching.
This article focuses on the public and Catholic school discourse that accompanied the introduction of IQ testing in the early 20th century. It analyzes the nature of the discourse among educational researchers, administrators, and teachers in two parallel educational settings and examines the way that public and Catholic school educators responded to IQ testing.
This article contrasts democracy with corporatocracy, showing that the accountability movement today is rooted more in the latter than the former. Case studies of two teachers explore how democratically minded teachers can navigate accountability pressures in a corporatocratic context, as well as limitations that context places on them.
This study explores the design process through which one urban school district developed and deployed a series of reports to communicate the results of student achievement testing across the district. The research focuses on the district’s efforts to design new programs that would fit coherently into existing initiatives in local schools.
This article analyzes cultural and home language influences in the responses of White, African American, Hispanic, and Haitian American children on paper-and-pencil science assessments. Factors interfering with students’ interpretation of test items and teachers’ interpretation of students’ responses included (1) phonological and semantic features of students’ home languages, (2) students’ cultural beliefs and practices, and (3) “languacultural” features linked to various discursive and textual conventions. The article concludes that science assessments are inherently cultural objects whose content and organization rely on implicit knowledge that different groups of students may not share.
One of the basic problems in relating educational evaluation and educational practice is that the two activities often take place on radically differing time scales. It is not only a matter of aims—that evaluation of local educational practice as conducted by external researchers (or by the use of instruments designed by external researchers, as in the case of formal testing) may be done “summatively” for purposes of external accountability, and so the information collected may not directly inform the local conduct of instruction and school administration. It is also a matter of timeliness, in that whatever information is collected from a local site of practice may not be analyzed and communicated back to the site in time for frontline service providers to do anything about it, that is, in time for teachers to adapt their ongoing instruction in light of the information provided by the assessment.
This chapter aims to introduce several ideas about using evidence from assessment to guide educational decision making. We expect these ideas to be new to many readers, as they reflect the influence of “sociocultural” theories of learning (e.g., Vygotsky, 1986), particularly the theories of “situative” sociocultural theorists (e.g., Greeno & MMAP, 1998). These theories assume that all learning is social change. This contrasts with traditional theories underlying most prior considerations of assessment, which assume that learning is fundamentally about individual (“cognitive”) change.
The enactment of the No Child Left Behind Act (NCLB) has resulted in an unprecedented and very direct connection between high-stakes assessments and instructional practice. Historically, the disassociation between large-scale assessments and classroom practice has been decried, but the current irony is that the influence these tests now have on educational practice has raised even stronger concerns (e.g., Abrams, Pedulla, & Madaus, 2003) stemming from a general narrowing of the curriculum, both in terms of subject areas and in terms of the kinds of skills and understandings that are taught. The cognitive models underlying these assessments have been criticized (Shepard, 2000), evidence is still collected primarily through multiple-choice items, and psychometric models still order students along a single dimension of proficiency.
Large-scale assessments designed to serve as indicators of academic progress in a social context provide invaluable information about the condition of education in America. This unique class of assessments serves as a common yardstick by which the educational progress in states, jurisdictions, and other countries can be compared. Because these assessments serve as monitors across a wide variety of curricula, content standards, and instructional practices, they are uniquely designed and well suited for their task. The focus of this chapter is to define what policymakers need to know to be proficient in this kind of large-scale indicator assessment literacy.
This chapter is a reflection on assessment and the implications and uses of assessments from what will be called a “sociocultural-situated” perspective on language, learning, and mind. By “sociocultural” I mean to indicate the importance of the fact that human beings are givers and takers of meaning and the meanings they give and take can come from no other place than the cultures and social groups within which they act and interact (Gee, 1992, 1996). This is so for much the same reasons Wittgenstein (1958) pointed to in his well-known argument about the impossibility of “private” languages. By “situated” I mean to indicate the importance of the fact that the meanings which humans give and take are always customized to—situated within—actual situations or contexts of use (Gee, 2004, 2005). Humans make meanings that both shape the contexts they are in and are shaped by them (Duranti, 1992).
This article explores the doubting process as an emerging concept in school reform. After introducing the concept of doubt and its importance in educational reform, the article presents the example of a secondary school principal who doubted core pedagogical practices.
This study examines trends of school effects on student achievement by employing three national probability samples of high school seniors: NLS:72, HSB:82, and NELS:92. Our findings indicate that schools matter beyond student background.
The purpose of this study was to examine the special education referral and decision-making process for English language learners (ELLs), with a focus on Child Study Team (CST) meetings and placement conferences/multidisciplinary team meetings.
This article discusses psychometric issues in the assessment of English language learners and examines the validity of classifying ELL students, with a focus on the possibility of misclassifying ELL students as students with learning disabilities.
The authors argue that English language learner (ELL) language assessment policy and poor language tests partly account for ELLs’ disproportionate representation in special education.
Language differences in the United States are largely viewed as problems that schools must remedy. This paradigm has created the pervasive belief that Spanish is a root cause of underachievement for Spanish-speaking English language learners (ELLs). This article examines teacher belief systems with regard to this paradigm.
This article examines the intersection of psychometrics and sociolinguistics in the testing of English language learners (ELLs).
Findings about the implementation of a system for rapidly assessing student progress in math and reading in grades K–12 suggest that this type of system could potentially reduce pressure on teachers resulting from high-stakes testing and the implementation of the No Child Left Behind Act.
The article describes the evaluation of a parent training program sponsored by a major research university. It discusses the challenges of true parent empowerment and educators' resistance to it. It highlights the importance of considering sociocultural contexts in evaluation and points to the potential of social justice evaluation approaches.
The article briefly characterizes a deliberative democratic approach to program evaluation. It then illustrates and assesses the approach in terms of an evaluation of the school choice policy in the Boulder Valley School District, Boulder, Colorado.
In this article, we used a multimethod, multilevel analysis to document the underlying dynamics of specific alternative learning contexts to identify generalizable principles while allowing for local variation.
This article chronicles the development of an “evaluation habit of mind” within a particular professional development context. It does so through an analysis structured according to the three overarching cognitive themes of preconceptions, frameworks, and reflections given in the National Research Council’s synthesized report on how people learn.
This article addresses the construction of "critical friendships" within the practice of one particular program evaluation. It focuses on the evolution of relationships developed during one 2-year program evaluation study that examined a collaborative educational project.
This case study examines the usefulness of the 1994 standards, offered by the Joint Committee on Standards for Educational Evaluation, in monitoring the quality of international evaluations.
Although significant progress in understanding the effects of early childhood interventions has occurred over the last four decades, questions remain about the causal mechanisms of change, who benefits most from which program components, and the reliability of effects for large-scale programs. Examples from the Chicago Longitudinal Study are highlighted to show how confirmatory evaluation can help validate the effects of social interventions. Studies of the Chicago Child-Parent Centers are described to emphasize how the causal criteria of coherence, specificity, and within- and between-study consistency can strengthen causal inference and generalizability.