
Teacher and Administrator Responses to Standards-Based Reform

by Laura M. Desimone - 2013

Background: Since the onset of standards-based reform and its continuation in the form of No Child Left Behind (NCLB) and now Race to the Top, debates have continued about whether such policies foster desirable change in states, districts, schools and classrooms.

Research Question: The study asks: How do state and district administrators, principals, and teachers describe their responses to standards-based reform, in terms of beliefs, understanding, and attitudes, as well as behavior?

Participants: The analysis uses interview data from 32 schools in 10 districts in 5 states. Data are from 60 fourth- and seventh-grade mathematics teachers, 32 principals, 14 district administrators, and 7 state officials.

Research Design: The overarching five-state study identified a representative probability sample of districts for each state. From those districts, two in each state were chosen for the case study, based on size and poverty.

Data Collection and Analysis: Interviews were transcribed, then coded both inductively and deductively to identify patterns and elicit major themes.

Findings: This study provides evidence that standards-based reform has elicited positive change in four areas: attention to struggling learners, teaching to the test, responsibility for student learning, and classroom content and pedagogy. Respondents said that as a result of standards-based reform policies, they focused more on struggling students, tried new approaches to reach them, and increased their expectations for them. Respondents also reported increasing their personal and group responsibility for student learning, while at the same time feeling stress and pressure from the testing regime. Respondents reported worrying that teaching to the test was narrowing the curriculum, but they also praised the system for eliciting needed improvements in the curriculum. Some teachers said the new state standards required them only to change the order of topics they taught, but other teachers said the standards required them to cover different content and to focus more on student understanding and knowledge retention.

Conclusions: The findings here of constructive and positive responses stand in stark contrast to a substantial literature that documents negative reactions to NCLB. One explanation might be that earlier attempts at standards-based reform and accountability, such as those documented in this analysis, were much more closely aligned with the theoretical vision of standards-based reform than were later manifestations as codified under NCLB, which have (a) moved away from local discretion, (b) emphasized rewards and sanctions rather than authority (buy-in), and (c) changed the target from standards to focusing on test results, often exclusively.


Since the onset of standards-based reform and its continuation in the form of No Child Left Behind (NCLB) and now Race to the Top (RTT), debates rage about whether such policies foster desirable change in states, districts, schools, and classrooms (e.g., Darling-Hammond, 2004; Diamond, 2007; Loeb, Knapp, & Elfers, 2008; Louis, Febey, & Schroeder, 2005; Swanson & Stevenson, 2002). In this study, I examine 32 schools in 10 districts chosen for their variation in size and poverty. These districts are in five states, which in the early 2000s were implementing NCLB-like policies of aligned content standards and assessments accompanied by rewards and sanctions based on student performance on state assessments. Through an examination of fourth- and seventh-grade math teachers, their principals, and their state and district administrators, we learn a great deal about how the 10 districts in this study reacted to standards-based reform as implemented in their states, districts, and schools.

The study focuses on math reform because of its centrality to the standards-based reform movement. Student achievement in math is a primary target of education reform (Bybee, 1993; Elmore, Peterson, & McCarthy, 1996) and is used as an international benchmark (e.g., Schmidt, McKnight, & Raizen, 1997; Stevenson & Stigler, 1992; Stigler & Hiebert, 1999). The study focuses on a single subject, mathematics, to avoid the confounding that might occur in a multiple-subject study of how standards and accountability are translated by teachers into the classroom.


Standards-based reform, the foundation of NCLB, has its roots in systemic reform (Clune, 1998; Cohen, 1995; Grant, Peterson, & Shojgreen-Downer, 1996; Knapp, 1997). The theory of standards-based reform posits that improved teaching and learning will result from (a) creating “high-quality” content standards that provide uniform and meaningful learning goals, (b) designing student assessments aligned to those standards, (c) providing a system of supports to help build teachers’ capacity to successfully teach to the “high-quality” standards, and (d) establishing accountability mechanisms to motivate compliance (M. Smith & O’Day, 1991).

Early statewide standards-based reform offered guidance but not directives to localities on how to transform their educational systems to respond to new standards and achievement targets. This study focuses on local implementation variation to help us understand how reform works (see McLaughlin, 1976, 1987, 1990). For example, studying educators’ variation in responses helps us to identify the range of possible reactions and develop and refine hypotheses about cause and effect.

Certainly state and local contexts vary and change continually (Education Week, 2001), and educators’ responses to reform are related to these contextual variables (Spillane & Zeuli, 1999). Educators’ self-reported beliefs, attitudes, and reactions, then, are relevant to our wider understanding of how educators process and react to this type of change. This analysis seeks to examine notions of educator beliefs and actions in light of what theory would predict. The use here of a theoretical/conceptual framework to guide the coding, analysis, and interpretation of results creates the potential to contribute to generalizable knowledge (C. H. Weiss, 1998). In addition, findings are analyzed in light of recent NCLB studies, drawing insights for increased understanding of current educational reform successes and failures.

With this study, I intend to contribute to knowledge of how educators react to the core components of standards-based reform—prescribing content for all students through standards, establishing assessments that reflect the standards, and holding schools and teachers accountable through incentives and sanctions for student assessment results. These components have been a consistent presence in school efforts over the past two decades, and serve as the fundamental building blocks of NCLB and many RTT efforts.


A comprehensive theoretical framework for standards-based reform includes ideas about the role that educators’ beliefs and actions play in the reform’s success. This study draws on this part of the theory of how reform works. Specifically, I focus on theories of how educators’ (a) beliefs, interpretations, and understandings and (b) perceptions of instructional change can help to realize the goals of standards-based reform. The study is based on the idea that educators must accept and understand the reform and, in response, be willing to change their instructional practices. The broad question that drives this analysis is: How do state and district administrators, principals, and teachers describe their responses to standards-based reform? This study focuses on two aspects of this question: As a result of standards-based reform, (a) How do educators describe changes in their beliefs, understanding, and attitudes? and (b) How do educators describe changes in their behavior?

Because they deliver the instruction that standards-based reform targets, teachers are a key connection between policy and practice (Cohen, 1990). Standards-based reform assumes that teachers will not only acknowledge and become familiar with standards, but that they will (a) believe the standards are worthwhile and achievable (Gregoire, 2003; Stecher et al., 2002), (b) recognize how their instruction is inconsistent with the standards (Gill, Ashton, & Algina, 2004), and (c) adjust their instruction to align with standards and assessments in a way that ultimately fosters students’ success (Loeb et al., 2008).

Thus, I view understanding, beliefs, and interpretations of reforms as necessary precursors to changing practice (Louis, Febey, & Schroeder, 2005). Standards-based reform is founded on two theories of action: that teachers may change their instruction because they are inspired to believe their students can achieve at higher levels, and/or because the pressure of the accountability system motivates and challenges them (Nave, Miech, & Mosteller, 2000; Porter et al., 1988). The analysis provides insight about the extent to which change is generated by inspiration, pressure, or both.

In their presentation of a conceptual framework for studying standards-based reform, I. R. Weiss, Knapp, Hollweg, and Burrill (2002) emphasize the need to understand how educators view standards and how teaching and learning have changed. District, principal, and teacher understandings of and reactions to accountability policy can be seen as a necessary condition and a mechanism for eliciting changes in practice that often follow changes in beliefs (Louis, Febey, & Schroeder, 2005). Grounded in these ideas, this study focuses on educators’ roles in reacting to and enacting standards-based reform policy, specifically, their understandings and interpretations of the policy and the behavioral responses they describe.


The analysis yielded four main themes. As a dynamic conception of educators’ response to reform would predict, each of the themes straddled the intersection of the two research questions about beliefs and behavior. The main themes that emerged from the analysis correspond to the following critical issues directly relevant to today’s education environment: (a) the extent to which previously “left behind” students receive classroom support, (b) whether teachers and principals truly feel accountable for student achievement in a way that fosters positive behavior change, (c) how teachers describe “teaching to the test,” and when and if this is good or bad for teachers and students, and (d) the extent to which standards-based reform fosters desirable changes in pedagogy and the content of instruction. I examine each of these issues to see how standards-based reform weaves its way into schools and classrooms, and the extent to which it can foster fundamental change that ultimately improves student achievement.


There is evidence that states with strong accountability programs exhibit greater student achievement gains (Carnoy & Loeb, 2002) and that standards-based instruction is related to student achievement (Hamilton et al., 2003). But some scholars say that even if accountability produces gains, its cost is too high (Darling-Hammond, 2003; McNeil, 2000; Sheldon & Biddle, 1998).  I hope this study contributes to this debate.

Although a thorough review of the literature on standards-based reform is beyond the scope of this manuscript, below I discuss notable findings that apply to each of the four analytic themes, and suggest how the study might contribute to each of these four areas. Further, in the discussion section, I link the findings to the current literature.


One of the most pervasive debates about standards-based reform is whether it fulfills its potential to act as a mechanism to improve learning opportunities for traditionally underachieving students, or whether it instead undermines instruction for these students. One set of arguments emphasizes that high standards, applied uniformly to all students, can be a mechanism for improved opportunity to learn (Porter, 1993; M. Smith & O’Day, 1991). Others who have conducted studies of accountability in particular contexts show how accountability policies can exacerbate inequalities by marginalizing low-performing students (Booher-Jennings, 2005; Diamond & Spillane, 2004; Sandholtz, Ogawa, & Scribner, 2004).  Some research suggests that low-achieving students are at least as well off as they were before standards-based reform/NCLB reforms, and sometimes better off (Gamoran, 2007); or that local context and implementation and interactive complexities account for whether negative consequences occur (e.g., Firestone, 2003; O’Day, 2002). To help sort out these mixed findings, this study examines teachers’, principals’, and districts’ own descriptions of how they see the role of struggling learners in standards-based reform.


Little research has examined how accountability reforms foster a sense of personal accountability among educators that results in behavioral change. Previous work has shown that educators’ interpretations of policy largely determine levels of change and resistance (Gold, 2002; Louis & Dentler, 1988). Standards-based reform theory posits that high-stakes tests will motivate educators to take new steps to improve their students’ learning, such as using more conceptual instructional strategies (National Council on Education Standards and Testing, 1992). But how does such motivation manifest itself? Diamond (2007) suggests that to assume teachers’ motivation in response to increased stakes will lead to positive change is to underestimate the complexity of transforming instruction, even when teachers are trying to change. Largely unexplored is the extent to which educators develop or reject a personal sense of accountability in response to standards-based reform, how they articulate such a sense of accountability, and how they think it impacts their behavior, all of which are examined in this study.


Here the basic debate is how well students are served by instructional reactions to standards-based reform, which often include narrowing the curriculum to respond to tested content and using class time to practice test-taking strategies. One group of scholars suggests that if high-stakes testing is done right, it can lead teachers to change their practice in positive ways (Bishop & Mane, 1999; Borko & Elliott, 1999; Wolf & McIver, 1999) and promote student learning (Hannaway, 2003; Porter, 2000). Others suggest that standards-based reform promotes undesirable “teaching to the test” (Hilliard, 2000), and that narrowing the content taught can be done in counterproductive ways that are not good for teachers or students (Booher-Jennings, 2005; Diamond & Spillane, 2004). “Teaching to the test” can mean a number of things, and it can be good or bad depending on the circumstances (Koretz, 2008; Firestone & Schorr, 2004). For example, narrowing the curriculum might be productive if it allows for more depth than the U.S.’s traditional “mile wide, inch deep” curriculum (Schmidt, McKnight, & Raizen, 1997), but it might be undesirable if it omits important content.

Koretz and colleagues describe several types of “teaching to the test,” which can be desirable or undesirable depending on how they are implemented: (a) teaching more content; (b) working harder; (c) working more effectively; (d) re-allocating time; (e) alignment; (f) coaching (focusing on small details of the test); and (g) cheating (Koretz, 2008; Koretz & Hamilton, 2006; Koretz, McCaffrey, & Hamilton, 2001). Koretz (2008) recently concluded that in most cases we do not know whether higher test scores are the result of score inflation, problematic implementation of one of the seven practices, or real learning gains. To continue to build understanding about what “teaching to the test” means, this analysis examines how educators view their responses to the material to be tested, and how they perceive and interpret their implicit mandate to teach what will be tested.


At the heart of standards-based reform is the goal of improving instruction. Thus, a useful way to study reactions to standards-based reform is to examine what teachers are doing differently—are they changing the topics they teach, how they teach them, or both? The reform assumes that including problem solving, reasoning, and the communication of understanding in the standards and assessments will motivate teachers to develop instructional strategies that foster these skills (NCEST, 1992).

Research looking at instructional outcomes is mixed. Some studies show standards-based reform may lead to more emphasis on didactic pedagogy (Booher-Jennings, 2005; Diamond, 2007; Diamond & Spillane, 2004; Sandholtz et al., 2004), while others find more use of conceptual, problem-solving approaches (Firestone, Camilli, Yurecko, Monfils, & Mayrowetz, 2000; Hamilton et al., 2003; Stecher, Barron, Chun, & Ross, 2000). Still others find no change in instruction (Wong, Anagnostopoulos, Rutledge, & Edwards, 2003), or find that the relative emphasis on didactic or conceptual instruction depends on the teacher’s skill and experience (Achinstein, Ogawa, & Spiegelman, 2004). To provide insight into these mixed findings, I discuss how teachers talk about distinctions between content and pedagogy, in which areas they perceive they are changing, and what they think fostered this change.



This analysis uses a subset of data from a larger evaluation of standards-based reform conducted by the American Institutes for Research (AIR). The overarching study was led by Michael Garet, Rebecca Herman, and me, and included a substantial team of AIR researchers and data collectors (Berger, Desimone, Herman, Garet, & Margolin, 2002; Miller, Herman, Garet, Desimone, & Zhang, 2002)1. For the analysis reported herein, the data were coded by a team I led at Vanderbilt University.

The analysis reported here uses interview data from 32 schools in 10 districts in 5 states2. Data are from 60 teachers, 32 principals, 14 district administrators, and 7 state officials. The five states—States A, B, C, D, and E—were selected by a panel of national experts because they were at the forefront of standards-based reform in the early 2000s. In addition to having publicly available student achievement data, the states (a) emphasized high student achievement for all students; (b) had curriculum frameworks and a statewide assessment program that existed for several years before the study; (c) varied in the degree of alignment of state standards to those articulated by the National Council of Teachers of Mathematics (NCTM, 2000); and (d) varied on the policy instruments used as the primary mechanisms driving reform and accountability.

In Table 1, I draw on a theory developed by Porter (1998; Porter et al., 1988) to describe each state’s policy environment. The theory describes the strength of state policies on the basis of five key attributes, postulating that the stronger a policy is in each of the five dimensions, the more effective implementation will be. The five dimensions are (a) specificity—the extent to which the policy provides clear and detailed guidance; (b) alignment—the extent to which all components of the system are consistent with each other; (c) legitimacy—the degree to which the policy has the support or backing of institutions or individuals; (d) accountability—the rewards and sanctions attached to implementation of the policy; and (e) stability—the extent to which policies remain in place over time. Descriptions of each state’s policy environment were derived from several sources, including state-level interviews with education officials, state documents, department of education websites, and external sources (e.g., AFT, 2001).

Table 1. Description of State Policy Environments

State A
Specificity: Standards for grade bands 4-5 and 6-8; assessments for grades 2-10.
Alignment: Standards based on NCTM; no external alignment review; AFT judged that curriculum and standards needed to be more aligned.
Legitimacy: Parents, students, business and community members involved in standards development.
Accountability: Extra academic support and additional funds for low-performing students; state exam not required for graduation.
Stability: Standards implemented in 1997; assessments implemented in 1996-97 and revised to align with standards in 1999.

State B
Specificity: Standards for grade bands K-4 and 5-8, not mandated; math assessments in grades 4, 6, 8, and 10.
Alignment: Used NCTM and state assessments in developing standards; no external alignment review.
Legitimacy: Parents, community members, and students included on advisory committee that developed standards and curriculum frameworks.
Accountability: Provided assistance to low-scoring students but did not administer school sanctions; low-scoring 4th-grade students required to attend state-funded summer school to be promoted; high school students could earn an advanced diploma by meeting high standards.
Stability: Standards initiated in 1981, with the most recent version disseminated in 1998; assessments implemented in 1985 and revised in 1993 and 2000.

State C
Specificity: Standards for grade bands 3-5 and 6-8; grade-level expectations served as basis for state assessments.
Alignment: Standards strongly influenced by NCTM; no external alignment review; math assessments designed to measure content of state standards; state required publishers to correlate all instructional material with state standards.
Legitimacy: Stakeholders reviewed draft of grade-level expectations.
Accountability: Assistance and sanctions (including permitting students to transfer to other schools); extra funds for academic assistance to low-scoring students; students could be held back if they continued to struggle after receiving extra help, and were required to achieve a passing score on the math assessment to receive a high school diploma.
Stability: New content standards approved in 1996 and expanded to include grade-level expectations in 1999; first state assessment program in 1971; statewide assessments based on the standards developed in 1996 and first administered in 1998.

State D
Specificity: Grade-level standards and assessments.
Alignment: Standards developed to be consistent with NCTM; implemented an external alignment review.
Legitimacy: Teachers worked with state department of education to develop assessments and review standards and curriculum.
Accountability: Assistance and sanctions (e.g., school reconstitution) based on assessment results; promotion in grades 3, 5, and 8 dependent on proficiency score; low-performing students received academic assistance; state funds distributed based on number of students below proficiency.
Stability: Standards and performance objectives implemented in 1994 and revised in 1999; assessments for grades 4, 7, and 10 implemented in 1996-97.

State E
Specificity: Grade-level standards; assessments in grades 4, 7, and 10.
Alignment: Standards developed based on NCTM; implemented an external review; AFT judged that curriculum and state standards needed to be more aligned.
Legitimacy: Teachers included in development of standards and assessments; fairness committee reviewed assessment items for cultural sensitivity.
Accountability: Schools and districts required to report percent of students meeting each standard; academic assistance to students below mastery; funds to districts to develop “extended learning” opportunities for students.
Stability: Grade-level math standards adopted in 1995; assessments piloted in grade 4 in 1996 and implemented in grades 7 and 10 in 1997; assessments revised in 2001.

Within each of the states sampled, both a case study and survey study were conducted. Analysis of the survey data is reported in Berger et al. (2002) and Miller et al. (2002). Approximately 50 districts were selected to receive surveys. Districts were sampled separately within each state, with probability proportional to the number of schools with seventh grades in each district, as indicated in the Common Core of Data. Thus, the sample of districts was a state-representative probability sample. Two districts from the district-level sample were selected from each state to participate in the case study component of the study. For the case study, up to two schools with a fourth grade and up to two schools with a seventh grade, depending on district size, were randomly selected within each district, for a total of 32 schools. Two mathematics teachers per school were selected by the school principal to participate in the case study. This selection process creates a risk that the teacher sample is not representative. Two teachers from every school participated, and there were no reported incidents of teacher refusal to participate. The research team interviewed the principals in all 32 schools. We asked to interview the district administrator who was the most knowledgeable and involved in math standards-based reform in the district. In several cases, when the district indicated that the math program work was distributed across positions, we met with both the district’s math coordinator and the curriculum coordinator, for a total of 14 district interviews. Similarly, in two of the study states, there were multiple actors with key involvement in math reform whom we interviewed, for a total of seven state interviews.
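District sampling with probability proportional to the number of schools with seventh grades can be sketched with a standard systematic PPS draw. This is an illustrative reconstruction, not the study team's actual procedure or code; the district names and the `n_grade7_schools` field are hypothetical.

```python
import random

def pps_sample(districts, k, seed=0):
    """Systematic probability-proportional-to-size (PPS) sampling.

    Each district's chance of selection is proportional to its number
    of schools with a seventh grade (the size measure named in the text).
    """
    rng = random.Random(seed)
    total = sum(d["n_grade7_schools"] for d in districts)
    step = total / k                       # sampling interval
    start = rng.uniform(0, step)           # random start within first interval
    points = [start + i * step for i in range(k)]
    chosen, cum = [], 0
    points_iter = iter(points)
    p = next(points_iter, None)
    for d in districts:
        cum += d["n_grade7_schools"]
        while p is not None and p <= cum:  # selection point falls in this district
            chosen.append(d)
            p = next(points_iter, None)
    return chosen
```

Note that under this scheme a very large district can be hit by more than one selection point; production samplers typically pull such "certainty" units out before drawing.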

To select districts for the case study, a Latin square design was used to ensure variation across districts on size and poverty. Districts were then drawn randomly from each of these size-poverty-state combinations. The final district sample comprised four low-poverty, small districts; two low-poverty, large districts; two high-poverty, small districts; and two high-poverty, large districts. Districts were invited but not required to participate in the case study component of the study. If a district declined, the next district on a randomly generated list was contacted until an acceptance was secured. To ensure confidentiality, I refer to districts by assigning them a number rather than using their real names or pseudonyms.
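The size-by-poverty stratification and the replacement rule for declining districts can be sketched roughly as follows. This is a simplified hypothetical illustration, not the study's exact Latin square procedure, and the `size` and `poverty` field names are invented.

```python
import random

def stratified_case_study_sample(districts, seed=0):
    """Group districts into size-by-poverty strata, randomly order each
    stratum, and take the first district in line, mirroring the rule that
    a decliner is replaced by the next district on a randomly generated list.
    """
    rng = random.Random(seed)
    strata = {}
    for d in districts:
        key = (d["size"], d["poverty"])    # e.g., ("small", "high")
        strata.setdefault(key, []).append(d)
    picks = {}
    for key, members in strata.items():
        rng.shuffle(members)               # random backup order per stratum
        picks[key] = members[0]            # stands in for "first to accept"
    return picks
```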


State case studies, conducted in the spring and summer of 2000, consisted of in-depth telephone interviews with state education administrators and consultants involved with the design, implementation, and administration of state standards and assessments. The district case studies consisted of two site visits—one in the fall of 2000 and the other in the spring of 2001. Data collection included separate individual interviews with district administrators and school principals, and intensive one-on-one interviews with two mathematics teachers in each school. Respondents were told that they were participating in a study of standards-based reform in math, and that their answers were confidential. The research team used structured protocols, which asked open-ended questions about math reform and standards, assessments, professional development, instructional supports and resources, data-informed planning, and school decision-making.


Though it is methodologically challenging to make causal links between state policy and classroom changes, other studies have suggested that state standards-based reform policy does affect what district administrators, principals, and teachers do (Cohen & Hill, 2000; Desimone, Smith, & Frisvold, 2007; Loeb et al., 2008; Swanson & Stevenson, 2002). But limited effects have been found on teacher quality and student achievement (Gamoran, 2007). Several qualitative studies of a handful of schools or a small number of teachers have explored how educators interpret standards-based reform policies. These studies have shown some of the complexities and challenges of integrating accountability reforms into practice (e.g., Borko, Wolf, Simone, & Uchiyama, 2003; Diamond, 2007; Louis, Febey, & Schroeder, 2005). This analysis builds on these intensive studies by using data from a larger, state-representative sample of districts and, within those districts, elementary and middle school principals and mathematics teachers. Thus I am able to explore these ideas on a larger scale than in many other qualitative studies. This study also complements large-scale survey-based NCLB studies, some of which include case studies (e.g., Center on Education Policy, 2006, 2007; Le Floch et al., 2007; Stecher et al., 2008; Taylor, Stecher, O’Day, Naftel, & Le Floch, 2010).

The data analyzed here are qualitative—allowing detailed examination of teacher perceptions and reactions—and, at the same time, on a larger scale than much similar research. Though I cannot explore at the same depth that very small samples allow, a strength of this study is that I can examine whether findings are consistent across multiple settings. Further, I am able to compare state, district, principal, and teacher responses, which is useful given that actors at different levels in the policy system perceive and experience policy differently (Desimone, 2006).

Evidence on how teachers react to new standards legislation is limited (Ingram, Louis, & Schroeder, 2004). The interview data here provide direct evidence of teacher, principal, district, and state descriptions of change in response to standards-based reform. I not only acknowledge but emphasize that self-reported change does not necessarily correspond to enacted change as it might be assessed by a third-party observer (e.g., Cohen, 1990). Thus, I do not draw conclusions about actual changes in behavior. However, the analysis is predicated on the belief that educator talk about reform can be a powerful indicator of change, viewing changing perspectives and beliefs as a precursor to behavioral change (Richardson & Placier, 2001).


Data analysis followed the procedures outlined by Miles and Huberman (1994), Huberman and Miles (1994), Patton (1990), and Coffey and Atkinson (1996). I started with general themes derived from the study’s conceptual framework, which in turn was derived from the literature; I then added more themes and subthemes, as called for by the ongoing analysis of the transcript data. Specifically, the study team used the conceptual framework and research questions to guide the development of the structured interview protocol; this served as the basis for my initial coding framework for interview transcripts (Alexander, 2001). Grounding the coding in the initial framework, I used the constant comparative method to develop the codes (Glaser & Strauss, 1967; Strauss & Corbin, 1998), so that ideas from the transcripts were used to expand and refine the coding system. Through this iterative process, I changed, adapted, and integrated categories or themes (Goetz & LeCompte, 1984). In this way, I was able to interactively identify themes using both the conceptual framework and the transcript data (Emerson, Fretz, & Shaw, 1995; Green, Dixon, & Zaharlock, 2002). Thus I used the data to deductively test the theory of standards-based reform discussed here, as well as to inductively allow other themes and explanations to emerge that were not anticipated by the conceptual framework.

I first identified the basic units of analysis—the responses to open-ended interview questions. Then I identified chunks of text—sometimes sentences and sometimes phrases (Strauss & Corbin, 1998)—that reflected a single theme (Krippendorf, 1980). Educators’ exact words were filed within each category or theme (e.g., Strauss & Corbin, 1998). With a trained team of coders, I then engaged in a coding and recoding process with each individual transcript to establish themes and subthemes, with the codes being refined through multiple iterations of individual analysis, group discussion, and recoding (Ary et al., 2006). I had multiple coders independently sort educators’ statements into thematic categories, then had weekly group meetings to reach consensus on coding categories; individuals would recode as necessary to correspond to the consensus, and the process would begin again.
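The independent-coding step described above is often monitored with simple agreement statistics. The article reports no such statistic, so the sketch below, which computes percent agreement and Cohen's kappa over two hypothetical coders' theme assignments, is purely illustrative.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of statements two coders assigned to the same theme."""
    assert len(codes_a) == len(codes_b) and codes_a
    same = sum(a == b for a, b in zip(codes_a, codes_b))
    return same / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Observed agreement corrected for the agreement expected by chance,
    given each coder's marginal rate of using each theme."""
    n = len(codes_a)
    p_obs = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    p_chance = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / (n * n)
    return (p_obs - p_chance) / (1 - p_chance)
```

For example, two coders who agree on 9 of 10 statements split between two themes reach 0.9 percent agreement, while kappa discounts the half of that agreement expected by chance alone.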

The coding team used negative case analysis, or discrepant data analysis (Ary et al., 2006), searching for evidence that contradicted the main themes and subthemes (Johnson & Christensen, 2000). We did this to guard against either negative or positive bias in our analysis of responses. Then the results of the coding for all transcripts were merged, and all coders reviewed the themes and subthemes for agreement (Eisner, 1998; Lincoln & Guba, 1985).

The design and analysis of the interview data were meant to identify patterns and elicit major themes, rather than to determine firm causal relationships, which are better established with larger-scale, longitudinal data or experimental designs. Thus, in reporting the results, I describe and synthesize overarching themes from multiple respondents across settings. Rather than providing multiple quotes that support the same idea (C. H. Weiss, 1998), I illustrate each overarching theme with key quotes as exemplars (see Atkinson, Coffey, & Delamont, 2003): “a widely used method for describing themes is the presentation of direct quotes from respondents—quotes that lead the reader to understand quickly what it may have taken the researcher months or years to figure out” (Ryan & Bernard, 2003, p. 282). I emphasize that the analysis took an inclusive approach—I report on all major themes and subthemes touched on in our data. I note where a theme reflects a consensus, meaning nearly unanimous agreement among respondents. I also indicate where a minority opposing view was mentioned, and provide a quote to represent it, no matter how few respondents hold that view (Strauss & Corbin, 1990).


In this section, I describe each of the four themes and their sub-themes, illustrated by representative quotes from study respondents. The interpretation and synthesis of the findings are presented mainly in the discussion section, to clearly separate my interpretation from the description of the themes (see Ary et al., 2006).


There was a strong district- and principal-level consensus that standards-based reform efforts in their jurisdictions had brought a new focus on struggling learners. Similarly, teachers indicated that, in response to the new state policy, they were paying more attention to all students when preparing for their state’s assessment, or trying to make their instruction more engaging.

In each of the five study states, district and school leaders articulated their shifting emphasis on struggling learners, as demonstrated by this succinct and powerful statement from a State D state administrator: “[the] accountability program has forced us to look at all students, not just the smart ones.” Further, it seemed clear that for most, the focus on struggling learners and believing all kids can learn was a change from traditional practices, such as teaching to the middle (e.g., Brozo & Hargis, 2003). For example, a State D District 1 principal’s statement—“It is shocking other personnel, but it's my belief that we can . . . reach all students no matter what [their] background or ethnic status”—demonstrates that the idea of having high expectations for struggling learners was a stark change.


Contrary to other research, educational leaders in the study often pointed to accountability testing as a positive influence in their attempt to uncouple teacher expectations from background factors such as race and income. A State C district administrator (District 1) articulates this theme by pointing out that seeing their low-scoring students improve had a momentous effect on decoupling teacher expectations from demographics: “[Before high stakes testing], too many people felt like economic level also meant intellectual level. That definitely isn’t the case, and we’ve proven that.”  Much of the data from educational administrators noted this relationship between test results and changing attitudes. The power of this relationship was characterized by a state-level administrator from State D: “The testing has proved to us that we can in fact teach more children more mathematics, especially those students that we traditionally have not thought could deal with mathematics because of the pressures from testing . . . proving that it can happen.”

Our data abound with similar examples supporting the idea that witnessing student learning is perhaps the most meaningful mechanism for changing educators’ deep-seated beliefs (Herman et al., 2008). Such a dynamic is illustrated by a State C District 1 principal who said,

It’s been a challenge to convince some of our teachers that students are capable of doing algebra in the eighth grade. This is our first year, and our grades have been great. [As a result], the teachers have changed their attitudes.


In addition to describing attitudinal changes, respondents described specific policies that were developed to help lower-achieving students succeed. For example, a state administrator in State D said that “the state appropriated monies for districts to work with children who are not working at grade level”; another state-level administrator described establishing “a whole new work unit department that is focused on closing the gap . . . between students who are meeting standards [and] those who are not.” The superintendent in District 1 in State C described a new program to assist low-scoring students that involved after-school tutoring and a summer program. Similarly, a District 1 principal in State C described how they analyzed their student assessment data to target students for extra attention. The principal said that the school brought in a consultant to analyze the data; “they started looking at the groups of children that fell into the lower quartile and tried to figure out how to move them from that lower quartile up,” and they separated results by race, gender, and socioeconomic status to make sure they weren’t “losing a group.” Similar interventions were described in all of the states and most of the districts in the sample.


Nearly all respondents said they felt accountable for student learning, and most said this had not always been the case before standards-based reform. Such a strong perception of accountability is notable, given that reforms are usually met with resistance, and as a result often do not permeate beliefs and practice (Richardson & Placier, 2001). The change in attitude we observed is best characterized by a teacher in State E, District 1, who said,

When they started using the [state] tests, the meaning [of state standards] became more important. Being responsible for district results is important. Before, we just used textbooks, and whatever the text said to do, that’s what you did.


An elementary school principal in State D, District 3 described how school comparisons in the media made it clear “who is doing better, and that is a lot of pressure.” He said that this pressure resulted in “the upper grades . . . pushing lower grades to prepare younger students to help make goals . . . [and encouraging] everyone trying to work towards the same goal.”

This principal also noted that “the testing makes sure that teachers have and are teaching the state standards.” This theme—that the accountability system ensured certain content was covered—was consistent throughout the interviews. A representative example comes from a State D District 3 administrator, who described how teachers now felt accountable for certain content:

I personally see [testing] as a pretty positive thing in that we are no longer—we don’t have to worry as much about the teacher who goes into the classroom and closes the door and can teach whatever they like.  If I like to do measurement, then I can spend my whole year teaching measurement.  The test is the thing that has caused people to look at the entire program and feel some responsibility about teaching it.


The literature indicates that rewards and sanctions associated with a policy do not produce the deep and meaningful implementation that occurs when teachers and principals buy in to a policy (Desimone, 2002). But research offers little empirical evidence about how our current accountability system may motivate change. This study’s data offer insight into how accountability systems may motivate teachers. One of the middle school teachers (State D, District 1) explained that competition was a motivating factor for her, as was knowing that teaching to the standards would give her students necessary knowledge:

Unfortunately, one of the influences is that we do have this huge test that hangs over us. It is good for me ’cause I’m a very competitive person, and I want my scores to be the best in this school if not in this county. It’s a nice push, ’cause at the end of the school year I get to see results. That’s a motivational thing for me. Also, it’s knowing that these kids have to have so many skills when they go out there.

Another middle school teacher (in State D, District 1) suggested a similar notion of personal accountability linked to the testing regime:

I don’t necessarily think it’s a good thing that we’re so test driven [in State D] . . . but as long as I’m accountable as a teacher, I’m going to make sure I perform as best as I can.

The teachers in the study all expressed similar motivations, indicating that the publication of test scores and the emphasis on particular knowledge acquisition fostered a sense of accountability that was not prevalent before the standards were put in place.


Underpinning this sense of responsibility was consistent mention that the pressure of the testing regime was stressful. An elementary principal in State D, District 1 said, “Some of the morale has dropped because of the pressure of standardized testing.” A middle school principal in State D, District 3 discussed the tension between test stress and the success they were enjoying:

We should be held accountable, but the number of these tests is too high, especially with pilot testing.  We are constantly testing, and I’m concerned that our students are getting fed up, and I know our teachers are as well. No one can argue with the state, because we’re getting results that are higher than the national standard.

Another principal (middle school, State D, District 3) agreed that the tests cause stress and anxiety but put a positive spin on it, indicating that despite the stress, students are able to perform:

During testing time here, everyone is stoic and gray.  I call it the “dentist office feel.”  We are all driven by stress, but you can perform your best under stress because you are prepared.

The message that tests caused stress for both teachers and students was clear in the data, though as the illustrative quotes above show, many saw this stress as productive in that it fostered good achievement outcomes.


There was surprisingly little discussion of students’ roles in the accountability system across respondents in all five of the study states. In one district, however, three teachers commented on student accountability. One middle school teacher (State D, District 3) expressed frustration with the pressure from the test and the challenge of getting students to learn after the testing is over, when “the students know that they don’t have to learn any more this year.” Another teacher in the same district indicated that she thought the accountability system extended to the students:

Each student is held accountable. Not only does this put pressure on me to make sure the student learns the stuff, but also on the student. If they don’t do well on the test, then it is going to make their school look bad.

A third teacher in the same district indicated that students felt a responsibility toward the test, in that there were negative repercussions: “Also, the students are stressed when they know that they don’t know something but that they are going to be tested on it.” Other than these few comments, the effects of standards-based reform on students were not an issue that respondents brought up in response to our many questions about how the standards influenced teaching and learning.


“Teaching to the test” has a negative connotation to many. It denotes such undesirable behaviors as teaching test-taking strategies during valuable classroom time that might otherwise be spent learning important content, and omitting important material to cover only what is addressed by a test that may focus predominately on procedural knowledge (Booher-Jennings, 2006; Cizek, 2001; Crocco & Costigan, 2007; Firestone & Schorr, 2004; Watanabe, 2007). But are there situations where “teaching to the test” might result in desirable instruction? Arguably, “teaching to the test” might be useful if it means targeting the content covered on a test that reflects appropriate subject matter and cognitive domains (Mehrens, Popham, & Ryan, 1998; Popham, 2001; Porter, 2000), or using a more balanced mix of procedural and conceptual/reform-oriented teaching strategies (Firestone & Schorr, 2004). While “teaching to the test” could be desirable if the test domain covered appropriate content, there is little consensus on what “appropriate” content is, and this disagreement prevailed long before accountability and testing policies gained widespread use.

The data here reflect both positive and negative interpretations of “teaching to the test.” One issue, common in the literature, was the extent to which teachers were narrowing the curriculum to focus on the items covered in the test, rather than the wider domain the test represents. A state-level official in State D said, “There is a concern among many educators that we are too much teaching to the test and therefore narrowing the experiences of children in the classroom.” A principal in State D, District 3 indicated that he sees his mandate as teaching to the test, especially given that his job might depend on student test scores:

I think it is foolhardy for a teacher, school or district to teach anything that isn’t on the test . . . I don’t have tenure as a principal.  We are going to teach what we are supposed to teach.  Are we teaching the test?  Yeah, we are.  I know we have the liberty to add to it, but the bottom line is that we have to teach [to it].

Similarly, a teacher in a different school in State D, District 3 explicitly intended to teach what was covered on the state assessment: “I’m told what my students are going to be tested on, so that’s what I teach.”

The phrase “teaching to the test” may have such negative connotations that few want to admit they do it; for example, an elementary school teacher in State D, District 3 denied “teaching to the test,” but she went on to describe how she covered the material on the test and had changed her instruction to respond to the test, including a shift from focusing on students feeling successful to students being successful:

I don’t teach the test. I introduce what I know they’re going to be required to know for the test. You have to change [your instructional approaches]. I’ve been teaching for 27 years, and in that time, I have seen the emphasis shift. Before it was more about making that student feel successful. Now it’s about, “I can pass this.” So I’ve had to change. And it’s been hard for me . . . I’ve had to open my door a little bit and let some new things in.

I found evidence that teachers spent time helping students learn the test’s layout and format. A State D, District 1 middle school teacher said, “Since the [state test] is given in September of the eighth-grade year, I feel a responsibility and obligation to prepare these kids for the test.  I also like to give my students sample [test] questions.” Several teachers in all states similarly indicated they give their students sample questions to prepare for the state test. None of the respondents criticized this practice or complained that it was taking up too much instructional time, as might be expected based on previous research (Diamond & Spillane, 2004; Finnigan & Gross, 2007; Louis, Febey, & Schroeder, 2005).


One powerful theme, as one would expect, was the extent to which standards and/or the curriculum and texts were aligned with the assessments, and how the degree of alignment affected teachers’ instructional responsiveness to the state test. One State D, District 3 principal aptly described the potential of the system and the importance of alignment:

We’re just trying to make sure we’re teaching what we need to be and what the state is assessing. [We] need to make sure it’s all integrated and aligned. It’s silly to have standards and teach to the standards but to have a test that tests something else.

Another District 3 elementary school principal in State D implied that if there is alignment, teaching the standards is the same as teaching to the test:

End-of-grade testing and the state curriculum have driven the teaching of state standards in the past few years. The testing makes sure that teachers have and are teaching the state standards. If the teachers are doing well and making improvements in teaching the standards, then it shows up on the tests.

The common issue of the difference between teaching to the standards and teaching to the assessment was aptly described by a District 3 official in State D:

Sometimes they [administrators] don’t see the correlation between [standards and assessments], and in [our state] . . . I truly believe our state test truly does test what our standards are. If you teach the standards, your students are going to do well on the test. You don’t have to teach the test. You teach the standards. Sometimes administrators see it from the other end. You teach the test. And so I find myself harping on, if you will teach the standard course of study, your kids will do well.  You don’t have to worry about the test.

Issues of misalignment were common, as represented by the comment of a District 2 teacher in State E: “I think the [state test] is leading us more than the standards. We talk about the standards because that sounds nicer, like we are not teaching to the test.”

Even if the standards and assessments are aligned, a test only covers a portion of the standards. If teachers know what will be tested and then narrow the focus of their instruction to only tested topics rather than all the content standards, it is easy to see how “teaching to the test” would be undesirable. However, I found no insights into how teachers balance the inclusiveness of the standards with the narrowness of the test. Though states alternated testing forms each year, teachers indicated they had a good idea what content would be tested.  


A strong theme in the interviews was that the new state assessments emphasized more than just procedural knowledge, which had been a primary focus of previous tests. As a result, “teaching to the test” for many teachers meant expanding the “cognitive demands” (Porter, 2002) required of students from solely memorization and procedures to more advanced thought processes such as application to real-world problems, conjecture, and generalizing. While in research, content is often defined as the intersection of topics and cognitive demands (e.g., memorize math facts), educators’ references to “content” may refer to topics or cognitive demands separately.

Epitomizing the theme of expanding cognitive demand coverage, a lead teacher in State B, District 1 said that teachers in the district had been asked to do state assessment-like activities (also called performance based activities) so that the kids would become more familiar with state assessment questions. She said that math questions on the state assessment were more “realistic, real-world problems,” and that she likes the questions and wants to use them in her instruction.

Similarly, a middle school teacher in State C, District 1 described changes in her instruction that she thought could be explicitly linked to the content of the state assessment, which included higher-level cognitive demands than she previously taught:

I try to not only have for the students’ computational skills, but to have some reasoning behind it—some finding of similarities, some problem-solving strategies similar to what we did today, trying to explain what you do and write things down. Because the testing that they’re doing is not just a computational but an application type of test. I want them to have those types of skills as well.

Another teacher in the same district said that because assessments and standards drove her instruction, she had increased her emphasis on explaining rather than rote procedures.

What I did today was very standard or [state-assessment] based, where they would have to write out. It was kind of a proof that had a little more language arts, a little more sentence structure in it, but what I was trying to get them to show is the steps and the process. And the two longer kinds of questions—they have the “rethink and explain”—are similar to what I made them do today except that is [state-standard] driven.


Another theme that emerged was that test-driven instruction resulted in a tighter coupling of the curriculum and the test, and more content coherence across grades. Typical comments were consistent with those of a principal in State C, District 1, who said that the students’ test scores drive “the way we address math” and that they “look at what is tested in order to determine what we should focus our curriculum on.” He added,

Over the years, we have come to view the [state assessment] scores as very important and as a team effort. Our sixth-grade teachers are as concerned about [state assessment] scores as the seventh- and eighth-grade teachers. The results of the scores help realign the curriculum so that not all grade levels are teaching the same concept but are either building on each other's teachings or teaching different concepts so that the students are getting all of the necessary materials to perform well on the test.  

Several respondents across all states indicated that their state’s test changed what and how they taught across grades, and changed the order of topics to align with the testing and reinforcement necessary for success on the state assessment.


The extent to which “teaching to the test” is desirable also depends on the quality of the test. Virtually none of the respondents commented on the quality of the test’s content. In general, respondents thought that the content covered was appropriate, as reflected by a teacher in State D, District 2 who said, “I don't always agree with teaching to a test, but the test is fair, and you know what’s going to be on the test.” Critiques were instead aimed at the amount of testing or the accountability linked to test results. In fact, when teachers discussed how their content and pedagogy changed (below), it was both implicit and explicit that the math tests demanded more problem solving and conceptual understanding, which to many reflected an improvement over predominantly procedural mathematics tests.


Under standards-based reform, content standards, coupled with aligned textbooks and/or curriculum frameworks or pacing guides, serve as the basis of classroom curriculum, replacing teacher autonomy in deciding what content to teach. Pedagogy is a less-direct target of standards-based reform. How teachers teach is not an explicit component of content standards, except to the extent that standards focus on certain cognitive demands (e.g., memorization vs. conjecture). For example, worksheets and drill may allow children to memorize math facts, but alternative pedagogies such as discussion and group problem analysis may be necessary to foster opportunities for conjecture and real-world applications.

The distinction between content and pedagogy, their integration, and their relative importance have been debated for decades in the teacher education literature (Pellegrino, Baxter, & Glaser, 1999; Porter et al., 1993). The extent to which these debates are important for policy and reform hinges on the extent to which reforms ask teachers to change what they teach, how they teach, or both. Teachers in this study had different perspectives on which aspects of their teaching they were changing as a result of standards-based reform.


Virtually all respondents—district administrators, principals and teachers alike—agreed that they had “free rein” when it came to pedagogy, as an elementary teacher in State A, District 1 phrased it. District administrators made comments such as, “[Teachers] are told which standards to teach. . . . How they teach them is up to the teacher.” Principals’ comments were consistent with those of a State D middle school administrator in District 1, who said, “Teachers are given autonomy to figure out how to teach what they need to meet the standards.” Similarly, teachers did not feel that explicit policies were in place to change their pedagogy; a State C, District 1 elementary school teacher said:

The manner on how you teach the objective/standard is our choice. . . . Actually, the district tells us what we have to teach, but they don’t tell us how we have to teach it. . . . You pretty much come up with however you think it’s going to work for your kids.

Though they agreed that standards-based reform did not explicitly dictate a change in pedagogy, many of the respondents described pedagogical changes that were occurring, and directly traced them to the standards and testing policies recently put in place by their states. A common reason respondents gave for the pedagogical changes was that they thought a shift in pedagogy was important to realize the goals of standards-based reform. For example, principals said that “[we have to] make sure teachers are not just handing out dittos . . . and [we have] to keep teachers creative in the way they deliver instruction” (State D, District 1);

the actual implementation [of content standards] is not such a big deal as the instructional methods to get people who have taught a number of years, who are teaching as they were taught, to try new things . . . to try some of the manipulatives. (State A, District 2)

One State C, District 1 principal indicated progress in moving from an emphasis on traditional lecture-only teaching styles: “Teachers are using hands-on manipulatives more than they used to. Teachers are also doing much better when it comes to having students work in groups or in pairs [rather than rows].”

Teachers also acknowledged pressure to deliver more engaging and effective instruction. A middle school teacher in State A, District 2 said that standards-based reform had made her feel that she needed to “find . . . a way to present the lesson the way students will understand and that’s as easy as possible.” Others talked about the standards movement as if it were directed at pedagogy changes, subscribing to the idea that “the content has not changed; what has changed is how we approach it” (State C, District 2 elementary school teacher). A State B, District 1 lead teacher indicated that he thought pedagogy was the real challenge in the standards-based reform movement. He thought that “any teacher can teach the subjects,” and that the “real difficulty” is getting teachers to change their methods.   

In fact, several respondents described specific reforms aimed at changing pedagogy. One district administrator (State B, District 1) said, “The teachers in each grade were asked to get together to create a document that listed potential classroom practices for each state objective.” A lead teacher in a middle/high school in State B, District 1 said,

The teachers are already calling for this book [of classroom practices] . . . and [we are] hoping that the teachers will walk around with the resource book rather than the textbook. . . . If a teacher looked at her objectives and saw that she needed to teach number sense, the next thing she would do would be to go to the resource book and see how other teachers are teaching number sense.


One of the challenges of standards-based reform for math teachers is moving from a predominantly rote, procedural view of math teaching and learning to include conceptual approaches (Desimone, Smith, Baker, & Ueno, 2005; National Mathematics Advisory Panel, 2008; Schmidt, McKnight, & Raizen, 1997). The idea of fostering a deeper understanding of math and real-world problems is consistent with certain types of math reform (e.g., NCTM, 2000). Though there is a debate about the appropriate balance between procedural and conceptual mathematics instruction (e.g., Loveless, 2001), few argue that math should focus only on memorization and procedural knowledge, as it has traditionally.

I found the language of reform focusing on conceptual learning to be quite common in the five study states. As one State D official said, “When we bring teachers in to start writing and reviewing assessment items, they have a narrow vision of math content and concepts . . . We need to broaden conceptual thoughts of what math is.”  In State E, the same idea was described by a state administrator, who said, “Standards require higher order cognitive demands.”  Similarly, in State C, teachers, principals, and district administrators uniformly applauded the emphasis on higher order and conceptual learning over procedural, rote learning. Critical and conceptual thinking about math were at the forefront of goals for many states. A principal in State C, District 1 described this idea:

The [state assessment] has response questions which require critical thinking on the part of the students. I believe that if you can answer those questions, then you can take the math and apply it to the real world. . . . We also have a big push on communicating math. I don’t think we have ever done that in the past, and we’re now more aware that . . . students need to be able to communicate verbally and in writing.

Communicating understanding, a focal point of the math standards movement (NCTM, 2000), was at the forefront of several teachers’ repertoires. As a State C, District 3 middle school teacher said,

I’ve been trying to stress the importance of communicating what they are doing when students do their work. Because it is generally more important to understand the process rather than arrive at a correct answer . . . I think a way to help them retain information is to get them to communicate their thought processes.

Another elementary teacher in State C, District 1 described how she has changed her instruction to focus more on word problems and explaining answers, activities reflected in the standards, and how she tries different strategies to help understanding:

I’ve changed my instruction toward problem solving and explaining your answers, rather than just simply, “What's 13 plus 12?” “How did we get this answer?” “What did we do first?”

Similarly, other State C teachers stressed how, after the onset of standards, they began teaching “multiple ways to solve for [problems]” (State C, District 1 middle school teacher).  A similar change was occurring in State D—one State D, District 3 elementary teacher said,

I’m really working with them in terms of problem-solving strategies.  I’m not as concerned with the answer as with how the student arrives at the answer.  The process is the important part of solving the equation.  In most schools, we work on word problems, problem solving, and strategies for problem solving.  The state standards drive the course.

In State B, a district administrator discussed the emphasis on “problems in the real world” and having students capable of responding to problems that are not “immediately solvable.” This emphasis was carried through to teachers, who said things like, “We also need to make sure that we are doing the math journals and open-ended problem solving” (State B, District 2 lead teacher). Another State B, District 2 teacher said that

particularly since the results [of the state test] become public information . . . now teachers focus on applications to math problems, rather than just drilling concepts. . . . There has been a general change of philosophy among individual teachers.

In State A, this type of discussion focused on “real-world applications.” A State A, District 2 principal said, “[We want our students to have the] ability to transfer those skills into life skills—grocery shopping, perhaps ratios.  If you needed to figure out how much carpet for your house, you need to figure out area.” Respondents in State E also focused on conceptual instruction. A typical response is illustrated by a State E, District 2 principal, who described the district standards as “applying concepts and procedures, applying to real-life, using reasoning and thinking skills and communication is a big piece. Ultimately we want kids to make the connection between the real world and math.”

Despite the overwhelming emphasis on conceptual learning, the tension between conceptual and procedural emphasis (Loveless, 2001) did surface in the interviews. As one State A, District 2 elementary teacher said: “You can understand how to do something, but if [you] can’t do the computation, what good is it? You have to drill on basic skills because they forget.”

Another State A, District 2 principal expressed the need to have both procedural and conceptual knowledge: “Students should have thinking skills. But [you] have to have basic computation, and [we] have had to spend more time on [computation] but also focus on the process and getting them to think.” Only one respondent touched on the issue of whether conceptual approaches were appropriate for all students (see Loveless, 2001). In State E, District 2, a teacher said she didn’t do much higher order learning because her students “can’t handle it.”


One of the fundamental ideas behind standards-based reform was that the standards would drive the curriculum (M. Smith & O’Day, 1991). In the data, I found evidence to suggest three scenarios that sometimes overlapped and sometimes contradicted each other: (a) the standards definitely drove the curriculum, and teachers felt they had to follow them; (b) the standards did little more than codify or change the order of the curriculum already taught; and (c) the standards were more rigorous than the previous curriculum.

Across states, districts, and schools, principals and teachers felt that “all teachers must teach to the standards” (State D, District 1 principal). This was the one theme that met with unanimous agreement. Teachers across all five states agreed that there was a mandate to follow their state’s standards and that they had no flexibility in the content they taught. One State D, District 3 middle school teacher said,

I don’t feel like I have much of a choice with the content and the standard course of study. . . . As far as being able to vary what I want to teach, that is not an option . . . I just teach the standard course of study, which is from the state and is what we’re tested on.

All the teachers in the study articulated views consistent with the idea that “the content is out of my control. The objectives are given to us by the state, so we teach to the objectives and standards” (State D, District 1 elementary school teacher).

But respondents disagreed about the extent to which state standards required any major changes in the curriculum. Some thought their state’s standards covered different material than their previous curriculum. One State D, District 3 middle school teacher said, “The state always had a standard course of study, but it wasn’t as rigorous.” A State E, District 2 junior high school principal indicated that the standards required teachers to cover new, harder subjects: “We have had to introduce that whole concept [of probability and statistics] to the students and the teachers. [Before the standards], we didn't touch probability and statistics.” In State B, a District 2 middle school principal said that “in terms of content, we are stressing more things like number sense, ratio/proportions, make connections with the real world, use of technology (computers, graphing calculators).”

But in the same states where some respondents reported substantial change, other respondents said that despite the new standards, “the content has not changed in any major way” (State E, District 2 teacher). Several respondents said the real issue was one of form, not substance: the content was simply presented differently than teachers were used to. For example, a principal in State D, District 3 explained, “Well the math standards have changed. There used to be seven, but now there are four. They didn’t drop any, but merged them.” Similarly, a State E, District 2 teacher thought the standards didn’t require much change, even though “some people will present and it sounds like they are turning their whole classroom upside down. When I look through those standards, I don't see that you need to change a whole lot.” A State E, District 2 teacher at a different school said, “Besides, what the standards want done are what you are doing anyway. It’s just a different way of categorizing things.”

A principal in State C, District 1 said, “Even though many of the standards are the same thing teachers have been doing for years, when they hear it in a different language, it’s frightening.” Similarly, a principal in State A, District 3 said, “The curriculum hasn’t changed a lot; we have just re-ordered a lot.”  Teachers in State C often agreed that “the content has not changed. The order in which material is taught has changed because the district as a whole decided on the order of the content topics” (District 1 middle school teacher). Teachers indicated that they had not changed what they covered in class because “[the standards] weren't very far off from what I was doing” (State E, District 2 teacher).


I coded and analyzed the data with the goal of understanding how the early standards-based reform/accountability movements influenced schools and classrooms. The research questions were derived from previous research; neither the interview questions nor the coding rubric was shaped to emphasize positive or negative reactions; and our coding team kept an open mind while identifying key themes. What I found were what most would consider overwhelmingly positive beliefs, perceptions, and self-reports of change.

These findings should be interpreted in the context of several key weaknesses of the study. There are a multitude of complexities that the data do not address, which might change the understanding or interpretation of the findings. I do not analyze how principal and district actions may have affected teachers (e.g., Hamilton, Stecher, Russell, Marsh, & Miles, 2008) or possible unintended consequences for low-achieving, underserved students (Hauser, 1999; Sandholtz et al., 2004). Nor do I have evidence of actual behavior change. Again, I emphasize that beliefs, perceptions, and self-reported change are a necessary precursor to meaningful change (Richardson & Placier, 2001) but not a substitute for it.

Another limit of the study is that it focuses on mathematics only. This is especially salient if teachers react to reform differently depending on what subject they teach. To investigate this possibility, I examined the literature, especially studies that included both mathematics and English Language Arts (ELA) teachers, given that math and English Language Arts/reading are the two main subjects historically tested under NCLB. One hypothesis might be that the student achievement testing that is part of an accountability system is more acceptable to math teachers because math teachers are more likely to be measurement-oriented, and/or to believe that math is more easily codified into discrete, linear skills than are other subjects (CEP, 2009).

I found that little if any research directly distinguishes the attitudes of math and ELA teachers, and the work studying educator reactions to accountability, standards-based reform, and NCLB suggests that math and ELA teachers hold similar attitudes. Reactions to reform have been found to differ based on classroom experience and the number of struggling students, but to be similar across subjects (e.g., Knapp et al., 2004; Loeb et al., 2008). Consistent with this, a survey conducted for Education Week (Measuring Up, 2001) near the time of the study reported here revealed that math and ELA teachers held similar attitudes about a variety of aspects of standards-based reform. For example, 89% of math teachers and 90% of ELA teachers answered yes to a question asking whether they “believe raising academic standards is a move in the right direction”; 31% of math teachers and 30% of ELA teachers agreed that “state tests help them focus on what children really need to know.” Further, I reviewed again the state and national studies of accountability and NCLB discussed earlier in this paper (e.g., Stecher et al., 2008), and did not find any attention to or evidence of distinctive reactions or perceptions based on subject matter.

I do not believe that enough attention has been paid to this distinction between subjects to conclude that there are no differences, but the relevant literature points to other factors (e.g., high minority populations and experience) as affecting perceptions about testing and accountability—not the subject being tested or taught. Investigating subject-matter similarities and differences in reactions to reform is an area for future research.

The four themes of the paper are still fundamental components of educational reform and policy debates today: (a) whether accountability reforms are truly improving instruction for low-achieving students (e.g., Hassel & Hassel, 2010); (b) the extent to which these policy levers are fostering a sense of internal accountability and responsibility among educators (e.g., Weast, 2011); (c) teaching to the test—what it means, and when it is good or bad (Gallagher, 2010); and (d) how teachers are changing what and how they teach (McCann, Jones, & Aronoff, 2010). Thus, while this paper’s findings and discussion focus on challenges that are still directly relevant to today’s education policy environment, there are important differences in current reform efforts. I take these into account in the discussion of the findings, and what we can learn from them about both past and current reform efforts.

While there is substantial literature criticizing NCLB and similar accountability policies (e.g., Diamond, 2007; Linn, 2000), little published research shows positive change fostered by new standards and accountability systems. This analysis offers a wealth of examples of fundamental self-reported change that many believe is the right direction to improve student learning experiences. These data from five states provide evidence that standards-based reform can work to elicit changes in beliefs that can ultimately change the way administrators and teachers treat struggling learners, approach instruction and their responsibility toward students, respond to accountability tests, and choose the content and pedagogy used in the classroom. Below I discuss the key findings, and then analyze them in light of more recent (and negative) reactions to reform.


Consistency Across Levels

One notable meta-finding from this analysis is the consistency in responses across multiple levels: teachers, principals, and district and state administrators. Among the respondents, perceptions about standards, testing, accountability, instruction, and achievement were remarkably similar; this consistency is noteworthy in part because previous research suggests teachers and administrators have differing views on reform (e.g., B. D. Jones & Egley, 2006). However, little research explicitly addresses similarities and differences in perceptions of reform from educators at different levels in the education system. Many of the findings from school reform studies suggest that teachers are generally more critical than administrators of the effects of standards-based reform on their instruction and autonomy, and that principals express frustration at their lack of autonomy, given district mandates (e.g., Berends, Bodilly, & Kirby, 2002). Further, there is some evidence that principals are not as negative as teachers about testing (Ladd & Zelli, 2002). One study that directly compared teachers’, principals’, and district administrators’ perceptions of standards-based reform found that although there was not consistent agreement across levels, and administrators tended to rate several aspects of the system slightly higher than teachers did, there were few examples of strong differences in views (Desimone, 2006).

One hypothesis is that principals, who arguably have a broader perspective and a more vested interest in the overall impact of reform, would hold different views from teachers, whose main interests are their students and their own instruction (B. D. Jones & Egley, 2006). I did find this to be true. In our data, it is evident that administrators took a broader view of the influences of reform on the system as a whole, while teachers focused mainly on effects on their own instruction and their students’ learning. Further, even though B. D. Jones and Egley (2006) reported some differences across levels, they also reported that, as with our data, “many administrators’ and teachers’ perceptions are very similar, and they do agree on many accounts” (p. 770).

Still, it is noteworthy that in this study, administrators, principals, and teachers held such similar views of the reform occurring in their localities. It is not straightforward to definitively explain this agreement. One hypothesis, however, is that divergence in views is fostered by tensions in the system, which had not really manifested themselves at the time of this study. As I argue in more detail later, the early standards-based reform efforts were more consistent with the ideal design of reform, and embodied many desirable components—high-quality standards; local flexibility to innovate; and testing as one of many motivating factors in the system, but not the only one. Perhaps the degradation of standards-based reform to test-based accountability (articulated in more detail later) amplifies differences across levels. For example, as sanctions become more powerful, and flexibility at the local level grows more constrained, teacher resistance may increase. Similarly, as principal innovation is more constrained by district mandates, divergence in perceptions about the reforms may occur between levels. I offer this as a working hypothesis, consistent with other research that suggests that when an education system is functioning well, actors at multiple levels converge in their views and actions, while dysfunctional education systems suffer from conflict and disagreement (e.g., Firestone, Mangin, Martinez, & Polovsky, 2005).


Attention to Struggling Learners

With one exception, all district and school respondents in this study said they focused attention on struggling learners and were committed to the idea that “all students can learn.” Previous research has correlated accountability policies such as testing, rewards, and sanctions with reduced opportunity to learn for disadvantaged populations (Darling-Hammond, 2004). In contrast, I found that the new emphasis on test results seemed to motivate districts and principals to use new approaches to reach struggling students. However, this study falls short of being able to assess the quality of these attempts or the challenges involved (e.g., Loeb et al., 2008). Further, the recently documented focus on students who are near proficiency cut scores (Hamilton et al., 2007; Le Floch et al., 2007; Stecher et al., 2008; Taylor et al., 2010) was not pervasive in this study.

Unlike principals and state and district administrators, teachers did not discuss changing their instruction to focus on struggling learners. It may be that it is difficult for teachers to admit that they had not previously emphasized the learning of all students, which may explain why it was principals and district administrators who were likely to indicate there was a change to focus on struggling learners. Another plausible explanation is that higher-level administrators focused on new programs they put in place to target struggling learners, whereas any changes teachers made to target low achievers could be subsumed under efforts to improve the engagement and test scores of all of their students, which is how teachers tended to talk about their instructional changes. A pessimistic interpretation is that teachers may not have been focusing their instruction on struggling learners.

Certainly, discussion of paying attention to struggling learners is the appropriate “language of reform” in an accountability setting, and previous research has shown that educators sometimes talk the language of reform without understanding it, believing it, or implementing it in a meaningful way (e.g., Cohen, 1990). Analysis of the context and examples provided by those who described their efforts to focus on struggling learners led us to an optimistic interpretation—that seeing formerly low-achieving students respond to reform created buy-in to the idea that children who are traditionally part of low-achieving groups were capable of learning algebra, given appropriate supports and supplemental instruction. Witnessing this success motivated educators to continue altering their programs and practice.

The theme that permeated conversations about the “new” focus on struggling learners was that it was not something teachers were used to doing. More evidence that the policy system was responsible for fostering at least some of the reported change is found in statements that explicitly linked accountability or standards-based reform with the new focus on struggling learners.  This is consistent with recent studies where educators report an increased focus on low-achieving students in response to NCLB mandates (e.g., Stecher et al., 2008; Taylor et al., 2010).

The uncoupling of achievement from background factors such as race and ethnicity was also noteworthy. Several respondents indicated that high post-intervention test scores challenged their previously held notions of the limits of certain student populations. This can be seen as a powerful and remarkable shift, and one very much consistent with the goals of NCLB and standards-based reform. I found evidence of a basic tenet of standards-based reform—make educators accountable for all student learning, and they will pay increased attention to struggling learners, by changing pedagogy, content, or both, and the students will be more successful as a result (M. Smith & O’Day, 1991). Obviously, this study is not designed to let us make cause-and-effect connections between teacher actions and student learning; neither one is measured directly. But it is certainly dramatic when educators admit that their previously held prejudices were wrong, and note the new success they have had with struggling learners.

At least one detractor was not convinced that traditionally low-achieving students could learn at higher rates; this is consistent with evidence from surveys that shows many teachers worry that standards are too difficult for certain students (Stecher et al., 2008). However, I found ample evidence that witnessing results of the various state, district, and school programs targeted to low achievers was a powerful mechanism for changing belief patterns about the potential of struggling learners, and convincing educators that, in the words of the State D administrator quoted earlier, “we can in fact teach more children more mathematics.”


Responsibility for Student Learning

The findings here show that the testing and accountability system had moved schools in the desired direction—toward personal and group responsibility for student learning. On the other hand, the stress, pressure, and lowered morale associated with such a system were also quite palpable. These effects have been documented continually since the late 1990s (M. G. Jones et al., 1999; Koretz, Barron, Mitchell, & Stecher, 1996), though one study found that they had decreased from 2003 to 2006 (Stecher et al., 2008). Principals and teachers expressed feeling accountable in a way they had not before—considering student test scores a direct reflection of the quality of teaching—and said that the feeling of accountability was changing their behavior in specific ways—namely, ensuring that they taught their state’s standards.

Policy and reform theorists debate whether meaningful, long-lasting change can be elicited by the rewards and sanctions associated with an accountability system, rather than the development of belief and buy-in (Datnow, 2000). This study did not document whether an articulated sense of accountability actually resulted in changed behavior, but the data do show that principals and teachers believed they were changing as a result of accountability pressure. The “sense-making” teachers and principals engaged in as part of their interpretation of accountability (Louis, Febey, & Schroeder, 2005) seemed mainly to focus on their responsibility to increase student learning as demonstrated on the state assessment. The key idea was that instead of just doing “whatever the textbook said to do,” teachers and principals had their eyes on test results—that is, on outcomes instead of processes, one of the goals of standards-based reform. This focus on assessment data, and its use in making decisions about instruction and professional development, has been documented in surveys over the last several years (CEP, 2006, 2007; Le Floch et al., 2007; Scott, 2008; Stecher et al., 2008; Taylor et al., 2010).  

Previous research on state policy connects teacher receptivity to reform with professional development experiences (Swanson & Stevenson, 2002). Teachers in our sample did not draw links between their learning experiences and their openness to change; rather, they stressed that behavioral change was necessary to respond to policy mandates. Consistent with other research (e.g., Stecher et al., 2008; Taylor et al., 2010), respondents in this study felt pressure and stress as a result of being held accountable to test results. Some argue that such stress interferes with the teaching and learning process (e.g., Valli, Croninger, & Walters, 2007), while others argue it is necessary to foster improvements (e.g., O’Day, 2002). Most white-collar professions have a similar type of high-stress, bottom-line accountability: lawyers win or lose cases; business professionals make or lose money. These forms of accountability are known to cause substantial stress and anxiety, yet they are rarely called inappropriate or unproductive. It could be argued that elementary and secondary education, to take its rightful place among the professions, desperately needed a form of accountability that the public could observe and understand (Adams, Heywood, & Rothstein, 2009; Stecher & Kirby, 2004). A continuing question is whether accountability carries tangible trade-offs detrimental to students and teachers that would merit exempting K-12 educators from the performance pressures that most other professional sectors experience.

A piece of this puzzle is the effect accountability has on students. This issue was left relatively unexplored by respondents in this study. Instead of discussing how the accountability system of testing was affecting student learning, teachers usually emphasized how it was affecting their own sense of responsibility and feelings of pressure and anxiety.

Teaching to the Test

Respondents in this study worried that teaching to the test was “narrowing the experiences of children in the classroom” (as a State D administrator, quoted earlier, put it), though they also praised test-based accountability for eliciting needed change in the curriculum. This is consistent with mixed findings from research, which offers examples in which teaching to the test did not narrow the curriculum (Stecher & Borko, 2002) and examples in which it did (Stecher & Mitchell, 1995). Similarly, Monfils and colleagues (2004) point to the ambiguity and mixed benefits of test-driven reform, including increased teacher awareness of and interest in reform practices but also the intensification of conventional practice (see also M. G. Jones et al., 1999). Though recent studies have found that test-driven reform has caused a movement away from non-tested subjects (CEP, 2006, 2007; Hamilton et al., 2007; McMurrer, 2007; Stecher, Barron, Chun, & Ross, 2005; Stecher et al., 2008; Taylor et al., 2010), this issue was not discussed by our respondents; likely it was less pervasive at the time of the study.

One of the most powerful themes I found was the critical importance of the alignment of standards and assessments. When such alignment is in place, teachers can teach to the standards and not focus on the test. In contrast, when alignment is absent, teachers may adapt instruction to the assessments rather than the standards, which may undesirably narrow the curriculum (Stecher et al., 2000; Stecher & Chun, 2001). Respondents overall recognized the essential importance of aligning standards and assessments, acknowledging that when the two are aligned, “teaching to the test” can be an appropriate and desired response.

Another theme that stood out from the data here was that teachers said they reacted to reform by going beyond procedural knowledge and memorization to include more conceptual forms of learning, problem solving, and real-world learning. Other studies have shown that high-stakes tests can lead teachers to increase emphasis on problem solving and other desirable practices (Koretz, Stecher, Klein, & McCaffrey, 1994; Stecher, Barron, Kaganoff, & Goodwin, 1998). To many, this movement toward a more balanced approach to math instruction is one of the goals of standards-based reform (NCTM, 2000). At the time of the study, the five participating states had what many judged to be “high-quality” standards and assessments. In contrast, when tests cover material that emphasizes rote learning, a resulting undesirable emphasis on didactic instruction is not a surprise (Diamond, 2007; Resnick, Rothman, Slattery, & Vranek, 2004). This is consistent with other findings that the format of the test (e.g., open ended items or multiple choice, questions that require procedural versus conceptual knowledge) drives practice (Firestone & Schorr, 2004; Hamilton, 2004; McMurrer, 2007; Parke & Lane, 2008; Stecher et al., 1998).

It is certainly possible that the increased pressure of accountability over the last several years has moved states toward more basic state assessments, and away from the “reform-oriented” assessments that were in place at the time of this study. Similarly, while our respondents indicated they did spend time practicing test-taking strategies with their students, none criticized this practice or complained that it was taking up too much instructional time, as we might have expected from other research documenting excessive time spent on such activities (Diamond, 2007).

The data here also indicated that the testing system had encouraged schools to take a more coherent and reasoned approach to the timing of introducing and repeating material. This responds to an endemic problem in education—repetitiveness and disjointed topic coverage across grades, especially in mathematics (Schmidt, McKnight, & Raizen, 1997). The main caveat here, again, is that the study does not document whether changes were actually occurring. Instead, I interpret findings with an emphasis on educator perceptions and beliefs as a precursor to action (Richardson & Placier, 2001); within this framework, I conclude that respondents were genuinely re-thinking their alignment, instruction and sequencing as a result of the testing regime, and perceived that they were thinking more holistically and sequentially.

Thus, to the debate about whether “teaching to the test” is desirable, I hope these findings add the point that such an approach can have both advantages and disadvantages. In some cases, it may cause teachers to break out of old bad habits, “let some new ideas in,” and move toward a more balanced approach to math instruction, as recommended by much research on student learning (National Mathematics Advisory Panel, 2008). Such a change would depend fundamentally on the nature and quality of the test (e.g., whether it demanded more conceptual understanding).

Our data indicated that the state assessments in the five study states pushed toward conceptual and real-world applications. In other cases, teaching to the test may narrow the curriculum in an undesirable way. We do not know whether the “narrowing” is detrimental to students—that would depend on the content covered, and the content omitted. The “mile-wide, inch-deep” U.S. curriculum has frequently been criticized (e.g., Schmidt, McKnight, & Raizen, 1997), so some narrowing of the curriculum may be desirable. Similarly, the time spent preparing for the test and doing sample questions might be a constructive way to enhance student understanding in mathematics, or alternatively, it might be devoid of instructional content (Koretz, 2008). That would depend wholly on the particular teacher’s approach (Lane, Parke, & Stone, 2002; Linn, Baker, & Betebenner, 2002). The main conclusion I draw from this analysis, consistent with previous work, is that “teaching to the test” may in some contexts be productive, and its relative merits depend heavily on the alignment, content, and quality of standards and assessments.


Classroom Content and Pedagogy

I found evidence that some teachers were changing both the content they covered and the pedagogy they used to cover it, while other teachers believed their new state standards required only changing the order of what they were already teaching. Two camps emerged: those who said they were already teaching the material in the standards, just in a different order, and those who said that the standards required changes, for example, emphasizing areas previously not covered, such as measurement and statistics, or communicating understanding. The extent of teacher change would depend on what content a teacher taught before standards-based reform interrupted his or her autonomy; likely some teachers’ pre-standards instruction was more closely aligned with the standards than others’.

Recent studies have documented teachers reporting that their teaching was already consistent with state standards (e.g., Jennings, Swidler, & Koliba, 2005); this may be problematic, reflecting either a misunderstanding of the reform or a non-critical view of their own instruction (Archbald & Porter, 1994; Cohen, 1990; Floden, Porter, Schmidt, Freeman, & Schwille, 1981; Porter, Floden, Freeman, Schmidt, & Schwille, 1988; Spillane, Reiser, & Reimer, 2002).

I found ample evidence of a most basic tenet of standards-based reform, that standards were driving the curriculum (see Loeb et al., 2008). It was surprising that teachers did not talk more about curriculum materials that were used to support their teaching to standards (e.g., Remillard, 2005; Reys, Reys, & Rubenstein, 2010).

What is perhaps most instructive were the discussions about changes in pedagogy, which is not an explicit target of standards-based reform. Changes in pedagogy centered on NCTM-like strategies such as making teaching more engaging, using manipulatives, and emphasizing more conceptual learning strategies. Even teachers who admitted they had changed their instruction acknowledged that there was no explicit policy requiring a change in pedagogy; rather, because teachers were being held accountable for results, they noted a new focus on student understanding rather than simply getting the right answers, and felt pressure to present lessons in ways that students would comprehend and retain the information.

These findings are consistent with previous studies that documented educators as saying standards-based reform promoted better instruction and increased student learning in their schools and districts (e.g., Stecher, Barron, Chun, & Ross, 2006) and recent NCLB evaluations, which show many educators report that NCLB has caused them to improve teaching practice, academic rigor and the focus on student learning (e.g., CEP, 2007; Hamilton et al., 2007; Le Floch et al., 2007; Stecher et al., 2008; Taylor et al., 2010).

Still, a few teachers mentioned the tension between procedural and conceptual learning, indicating the need to, as quoted earlier, “drill on basic skills.” This is consistent with research that indicates low-achieving students may need more drill, or at least assistance, to learn conceptually oriented math (Baxter, Woodward, & Olson, 2001), but inconsistent with findings that conceptual instruction does not hinder procedural learning (Carpenter, Fennema, Peterson, Chiang, & Loef, 1989). Other studies have found that more challenging instruction was differentially administered to higher-achieving students (Sandholtz et al., 2004), whereas teachers in this study, whether in a high- or low-achieving district, mainly reported the same new focus on more conceptual, engaging instruction. This analysis does not explore the extent to which teachers were successful with different types of students (Loeb et al., 2008); thus, I can only draw conclusions about their perceived intentions.

Is the goal of standards-based reform and its follow-on, NCLB, to improve pedagogy or to get teachers to focus on the “right” topics and cognitive demands, as specified in the standards? Or both? Because the policy explicitly refrains from dictating the pedagogy to be used, one might argue that any learning improvements are designed to result from content changes. However, it is clear from these data that most educators felt changing pedagogy went hand in hand with the content changes they were making, fulfilling the spirit of accountability by motivating teachers to change the status quo to improve student learning.

What I found noteworthy was that respondents focused on trying to develop more engaging teaching when there was no explicit policy aimed at such an outcome. Other studies have indicated that high-stakes accountability has resulted in undesirable pedagogical changes (Valli & Buese, 2007) or inconsistent changes (Wong, Anagnostopoulos et al., 2003), and that NCLB teacher quality mandates are generally not related to more conceptual teaching in mathematics (T. Smith, Desimone, & Ueno, 2005). In this study, it seems that the accountability system's focus on test results, combined with leaving the means up to teachers, resulted in a major emphasis on conceptual and engaging teaching, at least as articulated by teachers. Recent studies have also shown that educators report on surveys that NCLB has resulted in teachers improving their classroom practice (Stecher et al., 2008; Taylor et al., 2010). Diamond (2007) found contrasting results from classroom observations and interviews with teachers; he found that teachers more often linked high-stakes testing policies to content, not pedagogy, and concluded that as a result didactic instruction dominated. National and state data show a predominance of didactic instruction (Desimone, Smith, Ueno, & Baker, 2005; Schmidt, McKnight, & Raizen, 1997); thus, a desirable change toward a more balanced approach could still mean more didactic than conceptual instruction.

Further, while respondents often mentioned developing resources to build capacity in engaging instruction, such as lesson plan modules and web-based resources, lack of capacity to change pedagogy did not emerge as a theme, as it has in other research (Loeb et al., 2008; Knapp et al., 2004; Minnici & Hill, 2007; Stecher et al., 2008; Taylor et al., 2010). While I am not arguing that our schools all had sufficient capacity, I do emphasize that teachers readily admitted that they were changing their teaching to be more “engaging” and “motivating,” as if they knew how to do it but simply had not done it previously. One then wonders why these teachers were not seeking to engage students (with the same level of effort) before standards-based reform in their state. One might reason that because it takes more time and energy to develop lessons that engage all students, an incentive, such as accountability to test scores, needed to be in place to motivate (some) teachers to put in the extra time. This is consistent with the theory behind standards-based reform (M. Smith & O’Day, 1991).

These findings are directly relevant to critical unanswered questions about teacher motivation and improvement. Are some teachers really not trying as hard as they might, and policies like accountability and merit pay are needed to apply pressure to motivate them toward improvement? If so, does this “improvement” need to be focused on the content taught, the pedagogy used, or both?


As I pointed out, many of the findings reported here are partly consistent with several more recent studies of NCLB (e.g., Hamilton et al., 2007). Still, the findings of constructive and positive responses stand in stark contrast to a substantial literature that documents negative reactions and outcomes under NCLB. An obvious question is how this study's findings should be viewed in light of those recently documented negative responses.


Analyzing these findings through the theory of standards-based reform and Porter’s policy attributes theory leads to a tentative insight that has potential to help us understand both the strengths and weaknesses of current NCLB implementation: the closer reform implementation comes to theoretical ideals of standards-based reform, the more positive the outcomes.  That is, one could argue that earlier attempts at standards-based reform and accountability, such as those documented in this analysis, were much more closely aligned with the theoretical vision of standards-based reform than were later manifestations as codified under NCLB. Next I discuss this idea and provide examples to illustrate my argument.

Movement Away From School and Teacher Discretion

A central tenet of standards-based reform is allowing local discretion in how to meet new state standards. Early statewide standards-based reform offered guidance but not directives about how to respond. As reported earlier in this study, “teachers are given autonomy to figure out how to teach what they need to meet the standards.” Teachers and principals in the five states reported exploring innovative programs, experimenting with new teaching strategies, trying new curricula, and forging collaborations to respond to the new pressures to teach to standards. They indicated that they had opportunities to innovate and decide for themselves how to change their curriculum or instruction to meet the demands of the new state standards and aligned assessments.

In contrast, recent state and district responses to NCLB mandates often leave little room for curriculum or instructional discretion (Hamilton, Stecher, & Yuan, 2008). Along with the codification of standards-based reform through NCLB came tighter and stricter controls over curriculum (U.S. Department of Education, 2003), constraining the very creativity and innovation that was envisioned as a positive lever in the standards-based reform system.

Imbalance of Policy Attributes

Research suggests that the attributes of power, authority, consistency, specificity, and stability work together in the policy attributes framework, interacting in meaningful ways with one another, and that this give and take is essential to successful implementation (e.g., Desimone, 2002). Similarly, a study using NAEP data and a constructed state policy database found that high levels of specificity and authority were related to student improvements in procedural math knowledge, while high levels of power (accountability) were associated with a small decrease in procedural knowledge, problem-solving ability, and conceptual learning (Desimone, Smith, Hayes, & Frisvold, 2005).

Comparing the data in the study reported here with NCLB studies suggests that the education system may have initially been more balanced across policy attributes, but has since moved away from that balance toward a more uneven emphasis on particular attributes.

For example, much has changed in terms of the interaction of power and authority, two key components of a well-functioning reform system. Educators in our study were responding to increased accountability pressure, but they were not overwhelmed by it, nor did it seem to be the driving force behind their behavior. Educator responses in our study showed that they were also motivated by a fairly high level of authority. Authority, often manifested through teacher buy-in and belief in the reform, has been shown repeatedly to be influential in fostering constructive and meaningful responses to reform mandates (e.g., Datnow, 2000; Desimone, 2002). Most of the respondents said they believed the standards were worthwhile, that their students were learning in important ways, and that teacher instruction was improving. As I reported, most believed that “the testing has proved to us that we can in fact teach more children more mathematics.”

In contrast, descriptions of educator perceptions of later NCLB policies show little evidence that teachers believe the changes being made are good for them or their students (e.g., Ravitch, 2010). Authority, especially teacher buy-in, has become less salient in the policy environment. Instead, rewards and sanctions now seem to dominate as the primary mechanism for fostering change. None of the states in this study was imposing curricula or levying rigorous sanctions like school reconstitution, but sanctions under NCLB have become increasingly severe (Dee & Jacob, 2011).

Alignment among policy instruments has also become a problem. Educators complain that unrealistic pacing guides are not consistent with the goals of meaningful student learning, a stark example of misalignment. And the reported narrowing and degradation of instruction is contrary to many educators’ beliefs about good instruction, causing misalignment between beliefs and practice (e.g., Cochran-Smith & Lytle, 2006; Loeb et al., 2008).

Research also suggests that the targets for educators have become much more specific than they were at the time of our study. Our data suggest that grade-level standards provided a more specific target than did textbooks, and so were thought by many to be an improvement. But the more recent trend of standards-based reform focusing on tested material suggests that having too specific a target provides a perverse incentive to narrow instruction in an undesirable way.

Shift in the Quality of the Target: Standards-Based Reform’s Transformation to Test-Based Accountability

In M. Smith and O’Day’s (1991) vision of standards-based reform, high-quality content standards provide uniform and meaningful learning goals. This seemed to be occurring in our five study states. Educators generally thought the tests were reasonable reflections of what students should know and be able to do. Most agreed with one respondent, who said, “I truly believe our state test truly does test what our standards are.” Teachers reported changing their instruction toward the problem solving and applications reflected in the standards, rather than memorization and drill. Currently, however, in many schools across the country, standards-based accountability has morphed into test-based accountability, thought by many to provide inadequate learning goals.

I theorize, along with Hamilton, Stecher, and Yuan (2008), that current negative reactions to standards-based reform are related to the increased emphasis on test-driven accountability:

The preponderance of research on the impact of testing rather than the impact of standards reflects the emerging realization that “standards-based reform” has largely given way to “test-based reform,” a system in which the test rather than the standards communicates expectations and drives practice. (p. 3)

The version of reform studied here included standards playing an influential role in driving the system, as envisioned by theory. As the role of tests has expanded, so have the inevitable distortions. In many states, the codification of NCLB resulted in the use of tests for “highly consequential decisions at both the school and student levels.” Indeed, much of the criticism of NCLB has focused on how test-based accountability has corrupted instruction.

Nichols and Berliner (2007) suggest that Campbell’s Law—“The more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor”—explains many practices that schools and teachers have engaged in as test scores have taken on new importance.  The expanded role of testing and accountability has been tied to several perverse outcomes that are not evident in this study’s analysis of earlier standards-based reform efforts.

Nichols and Berliner (2007) give many examples of the corruption wrought by test-based accountability, including the degradation of instruction, increasing time spent on test preparation instead of genuine instruction, loss of morale, and the undercutting of professionalism. Similarly, teachers have been documented engaging in “educational triage,” focusing on “bubble kids” who are nearest to achieving proficiency cut scores (Booher-Jennings, 2005), and gaming and cheating on tests (Ravitch, 2010). Such undesirable responses to imperfectly measured outcomes are well-documented reactions in the economics literature (Shepard, Hannaway, & Baker, 2009).


The trend of achievement gains also generally supports the notion that early manifestations of standards-based reform were more productive and constructive than later ones. For example, Shepard et al. (2009) report that though gains have been achieved since standards-based reform began in the 1990s, “the intensification of pressure since NCLB has not produced commensurately great gains” (p. 3). Ultimately, it is nearly impossible to isolate the impact of particular aspects of NCLB, since many states had instituted some form of accountability before NCLB took effect, and the law’s many dimensions make it impossible to identify which mechanisms drive change (Hanushek & Raymond, 2005).

The current analysis, considered in light of other studies of standards-based reform and NCLB, suggests that it is not the theory of standards-based reform, as embodied in NCLB, which is fundamentally responsible for the negative behaviors and outcomes.  Rather, it is the imperfect manifestation and implementation of NCLB—the shift to “test-based accountability” and its accompanying distortions and perverse incentives—which have fostered disappointing and in some cases alarming responses that have failed to fulfill the potential of standards-based reform.  I hypothesize that this may be a central reason that results reported here are more positive than the results of more recent studies of NCLB; we would expect educator practices and attitudes in the context of standards-based reform to be different than in the context of test-based reform.


Given that previous research emphasizes the importance of local contexts (Spillane, 2005), one of this study’s most pervasive findings is surprising: the remarkable consistency across districts, schools, and states. And given characterizations of teachers as resistant to change (Valli et al., 2007) and the rarity of reform-induced major transformations (Fullan, 1993; Tyack & Cuban, 1995), it is striking that educators believed they were making fundamental changes in their beliefs and actions.

This analysis focused on administrator and teacher perceptions of the changes they have made in response to standards-based reform efforts in their respective states. Overwhelmingly, I found that educators described changes in their belief systems as well as changes in practice. Teachers and principals reported changing their ideas about the extent to which struggling learners could achieve at high levels in mathematics, after seeing the results of various interventions put in place to respond to accountability mandates. Respondents articulated a new and keen sense of responsibility for student learning, which they directly traced to the new accountability systems in their states. Their “teaching to the test” in many cases meant more focus on conceptual thinking and problem solving and less attention to memorizing and procedural knowledge. While there was disagreement about whether standards had changed the content of instruction, overwhelmingly teachers said they were changing their instruction to be more engaging to students and to focus on more conceptual instruction. Most of these changes are in a direction consistent with the ideals of standards-based reform. I offer this evidence not to refute other studies that have shed light on the tensions and drawbacks of standards-based reform, but to contribute to the investigation of the myriad complex effects standards-based reform-like policies can have on teaching and learning. This is especially important with the dawn of the Common Core State Standards Initiative (corestandards.org), which will bring more consistency in standards across participating states.

While many questions remain about the tensions and debates inherent in standards-based and NCLB-like reforms, including those funded with Race to the Top money, the findings here suggest that standards and accountability reforms in some contexts can permeate the education system and change educator beliefs in ways that have great potential to improve teaching and increase student learning.


This research was supported in part by the U.S. Department of Education Contract No. 282-98-9920 Task Order No. 19 with the American Institutes for Research and also by a Discovery Grant from Vanderbilt University to the author.  The opinions, findings, and conclusions or recommendations expressed are those of the author and do not represent the views of the U.S. Department of Education or Vanderbilt University. Special thanks to Michael Garet, Rebecca Herman, and the AIR research team for their collaborative work designing and executing the evaluation of which this study is a part; and appreciation to Joy Lesnick, Kailey Spencer, and Dan Stuckey for their contributions to this manuscript.

Notes


1. The design and data collection was funded by the then Planning and Evaluation Service under contract. The coding and analysis was funded by the Discovery Grant program, Vanderbilt University.

2. The overarching study included six states; one state had so few interview respondents it was omitted from the current analysis.

References


Abrams, L. M., Pedulla, J. J., & Madaus, G. F. (2003). Views from the classroom: Teachers’ opinions of statewide testing programs. Theory Into Practice, 42(1), 18–29.

Achinstein, B., Ogawa, R., & Spiegelman, A. (2004). Are we creating separate and unequal tracks of teachers? American Educational Research Journal, 41(3), 557-603.

Adams, S., Heywood, J., & Rothstein, R. (2009). Teachers, performance pay, and accountability: What education should learn from other sectors. Washington, DC: EPI.

Alexander, R. (2001). Culture and pedagogy: International comparisons in primary education. Malden, MA: Blackwell.

American Federation of Teachers. (2001). Making Standards Matter 2001. Washington, DC: Author. Retrieved March 4, 2002, from http://www.aft.org/edissues/standards/MSM2001/downloads/whole.pdf

Archbald, D. A., & Porter, A. C. (1994). Curriculum control and teachers’ perceptions of autonomy and satisfaction. Educational Evaluation and Policy Analysis, 16(1), 21-39.

Ary, D., Jacobs, L. C., Razavieh, A., & Sorensen, C. K. (2006). Qualitative research: Data analysis, rigor, and reporting. In Introduction to research in education (7th ed., pp. 489-536). Belmont, CA: Wadsworth.

Atkinson, P., Coffey, A., & Delamont. S. (2003). Key themes in qualitative research: continuities and change. Walnut Creek, CA: AltaMira Press.

Ball, D. L., & Cohen, D. K. (1996). Reform by the book: What is—or might be—the role of curriculum materials in teacher learning and instructional reform? Educational Researcher, 25(9), 6-8.

Baxter, J., Woodward, J., & Olson, D. (2001). The effects of reform-based mathematics instruction on low achievers in five third-grade classrooms. Elementary School Journal, 101, 529-548.

Belden Russonello & Steward Research and Communication. (2000). Tables for EdWeek Survey-1. Washington, DC: Author.

Berends, M., Bodilly, S., & Kirby, S. N. (2002). Facing the challenges of whole-school reform: New American Schools after a decade. Santa Monica, CA: RAND.

Berger, A., Desimone, L. , Herman, R., Garet, M., & Margolin, J. (2002).  Content of state standards and the alignment of state assessments with state standards.  Washington, DC:  U.S. Department of Education.

Bishop, J. H., & Mane, F. (1999). The New York state reform strategy: The incentive effects of minimum competency exams. Philadelphia, PA: National Center on Education in Inner Cities.

Booher-Jennings, J. (2005). Below the bubble: "Educational triage" and the Texas accountability system. American Educational Research Journal, 42(2), 231-268.

Borko, H., & Elliott, R. (1999). Hands-on pedagogy versus hands-off accountability: Tensions between competing commitments for exemplary math teachers in Kentucky. Phi Delta Kappan, 80, 394–400.

Borko, H., Wolf, S. A., Simone, G., & Uchiyama, K. (2003). Schools in transition: Reform efforts and school capacity in Washington state. Educational Evaluation and Policy Analysis, 25(2), 171-202.

Brozo, W. G., & Hargis, C. H. (2003). Taking seriously the idea of reform: One high school’s efforts to make reading more responsive to all Students. Journal of Adolescent & Adult Literacy, 47(1), 14-23.

Bybee, R. W. (1993). Reforming science education: Social perspectives and personal reflections. New York: Teachers College Press.

Carnoy, M., & Loeb, S. (2002). Does external accountability affect student outcomes? A cross-state analysis. Educational Evaluation and Policy Analysis, 24(4), 305-331.

Carpenter, T. P., Fennema, E., Peterson, P. L., Chiang, C. P., & Loef, M. (1989). Using knowledge of children’s mathematics thinking in classroom teaching: An experimental study. American Educational Research Journal, 26(4), 499–531.

Center on Education Policy. (2006). From the capital to the classroom: Year 4 of the No Child Left Behind Act. Washington, DC: Author.

Center on Education Policy. (2007). Moving beyond identification: Assisting schools in improvement.  Washington, DC: Author.

Center on Education Policy. (2009). Is the emphasis on “proficiency” shortchanging higher- and lower-achieving students? State Test Score Trends Through 2007-08, Part 1. Washington, DC: Author.

Cizek, G. J. (2001). Conjectures on the rise and call of standard setting: An introduction to context and practice. In G. J. Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives (pp. 3-17). Mahwah, NJ: Lawrence Erlbaum.

Clune, W. (1998). Toward a theory of systemic reform: The case of nine NSF statewide systemic initiatives [Research Monograph No. 16]. Madison, WI: National Institute for Science Education.

Cochran-Smith, M., & Lytle, S. L. (2006). Troubling images of teaching in No Child Left Behind. Harvard Educational Review, 76(4), 668-697.

Coffey, A., & Atkinson, P. (1996). Making sense of qualitative data: Complementary research strategies. Thousand Oaks, CA: Sage.

Cohen, D. K. (1990). A revolution in one classroom: The case of Mrs. Oublier. Educational Evaluation and Policy Analysis, 12(3), 327-345.

Cohen, D. K. (1995). What is the system in systemic reform?  Educational Researcher, 24(9), 11-17.

Cohen, D. K., & Hill, H. C. (2000). Instructional policy and classroom performance: The mathematics reform in California. Teachers College Record, 102(2), 294-343.

Crocco, M. S., & Costigan, A. T. (2007). The narrowing of curriculum and pedagogy in the age of accountability: urban educators speak out. Urban Education, 42(6), 512–535.

Darling-Hammond, L. (2003). Standards and assessments: Where we are and what we need. Teachers College Record [Online]. Retrieved from http://www.tcrecord.org/Content.asp?ContentID=11109.

Darling-Hammond, L. (2004). Standards, accountability, and school reform. Teachers College Record, 106(6), 1047-1085.

Datnow, A. (2000). Power and politics in the adoption of whole school reform models. Educational Evaluation and Policy Analysis, 22(4), 357-374.

Dee, T. S., & Jacob, B. (2011). The impact of No Child Left Behind on student achievement. Journal of Policy Analysis and Management, 30(3), 418-446.

Desimone, L. (2000). The role of teachers in urban school reform. Clearinghouse on Urban Education Digest, 154.

Desimone, L. (2002).  How can comprehensive school reform models be successfully implemented?  Review of Educational Research, 72(3), 433-479.

Desimone, L. M. (2006). Consider the source: Response differences among teachers, principals, and districts on survey questions about their education policy environment. Educational Policy, 20(4), 640-676.

Desimone, L., Smith, T., & Frisvold, D. (2007). Is NCLB increasing teacher quality for students in poverty? In A. Gamoran (Ed.), Will standards-based reform in education help close the poverty gap? Washington, DC: Brookings Institution.

Desimone, L. M., Smith, T., Baker, D., & Ueno, K. (2005). The distribution of teaching quality in mathematics: Assessing barriers to the reform of United States mathematics instruction from an international perspective. American Educational Research Journal, 42(3), 501–535.

Desimone, L. M., Smith, T. S., Hayes, S., & Frisvold, D. (2005). Beyond accountability and average math scores: Relating multiple state education policy attributes to changes in student achievement in procedural knowledge, conceptual understanding and problem solving in mathematics. Educational Measurement: Issues and Practice, 24(4), 5-18.

Diamond, J. B. (2007). Where the rubber meets the road: Rethinking the connection between high stakes accountability policy and classroom instruction. Sociology of Education, 80(4), 285-313.

Diamond, J. B., & Spillane, J. P. (2004). High-stakes accountability in urban elementary schools: Challenging or reproducing inequality? Teachers College Record, 106(6), 1145-1176.

Eisenhart, M. (2006). Representing qualitative data. In J. L. Green, G. Camilli, & P. B. Elmore (Eds.), Handbook of complementary methods in education research (pp. 567-581). Mahwah, NJ: Lawrence Erlbaum Associates.

Eisner, E. W. (1998). The enlightened eye. Qualitative inquiry and the enhancement of educational practice. Upper Saddle River, NJ: Prentice Hall.

Elmore, R. F., Peterson, P. L., & McCarthey, S. J. (1996). Restructuring in the classroom: Teaching, learning, and school organization. San Francisco, CA: Jossey-Bass.

Emerson, R., Fretz, R., & Shaw, L. (1995). Writing ethnographic fieldnotes. Chicago, IL: University of Chicago Press.

Finnigan, K. S., & Gross, B. (2007). Do accountability policy sanctions influence teacher motivation? Lessons from Chicago’s low-performing schools. American Educational Research Journal, 44(3), 594-629.

Firestone, W. A. (2003). The governance of teaching and standards-based reform from the 1970s to the new millennium. In M. Hallinan, A. Gamoran, W. Kubitschek, & T. Loveless (Eds.), Stability and change in American education: Structure, process and outcomes. Clinton Corners, NY: Elliot Werner Publications.

Firestone, W., Camilli, G., Yurecko, M., Monfils, L., & Mayrowetz, D. (2000). State standards, socio-fiscal context and opportunity to learn in New Jersey. Educational Policy Analysis Archives, 8(35).

Firestone, W., Mangin, M., Martinez, M., & Polovsky, T. (2005). Leading coherent professional development: A comparison of three districts. Educational Administration Quarterly, 41(3), 413-448.

Firestone, W., Mayrowetz, D., & Fairman, J. (1998). Performance-based assessment and instructional change: The effects of testing in Maine and Maryland. Educational Evaluation and Policy Analysis, 20(2), 95-133.

Firestone, W. A., & Schorr, R., (2004). Introduction. In W. A. Firestone, R. Schorr, & L. F. Monfils (Eds.) The ambiguity of teaching to the test: Standards, assessment and educational reform (pp. 1-17). Mahwah, NJ: Lawrence Erlbaum.

Floden, R. E., Porter, A. C., Schmidt, W. H., Freeman, D. J., & Schwille, J. R. (1981). Responses to curriculum pressures: Policy-capturing study of teacher decisions about content. Journal of Educational Psychology, 73(2), 129-141.

Fullan, M. (1993). Change forces: probing the depth of educational reform. New York: Routledge Falmer.

Gallagher, K. (2010, November). Why I won’t teach to the test. Education Week, 30(12), 29, 36.

Gamoran, A. (Ed.) (2007). Standards-based reform and the poverty gap: Lessons for No Child Left Behind. Washington, DC: Brookings Institution Press.

Gill, M. G., Ashton, P. T., & Algina, J. (2004). Changing preservice teachers’ epistemological beliefs about teaching and learning in mathematics: An intervention study. Contemporary Educational Psychology, 29(2), 164-185.

Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research. Chicago, IL: Aldine.

Goetz, J., & LeCompte, M. (1984). Ethnography and qualitative design in educational research. Orlando, FL: Academic Press.

Gold, B. (2002). Social construction of urban education: New Jersey whole school reform and teachers’ understanding of social class and race. New York: Pace University.

Grant, S. G., Peterson, P. L., & Shojgreen-Downer, A. (1996). Learning to teach mathematics in the context of systemic reform. American Educational Research Journal, 33(2), 509-541.

Green, J., Dixon, C., & Zaharlock, A. (2002). Ethnography as a logic of inquiry. In J. Flood, J. Jensen, D. Lapp, & J. Squire (Eds.), Handbook for methods of research on English language arts teaching (pp. 201-224). Mahwah, NJ: Lawrence Erlbaum Associates.

Greene, J. C., Caracelli, V. J., & Graham, W. F. (1989). Toward a conceptual framework for mixed-method evaluation designs. Educational Evaluation and Policy Analysis, 11(3), 255-274.

Gregoire, M. (2003). Is it a challenge or a threat? A dual-process model of teachers’ cognition and appraisal processes during conceptual change. Educational Psychology Review, 15(2), 147-179.

Hamilton, L. S. (2004). Assessment as a policy tool. Review of Research in Education, 27, 25-68.

Hamilton, L.S., McCaffrey, D., Klein, S.P., Stecher, B.M., Robyn, A., & Bugliari, D. (2003). Studying large-scale reforms of instructional practice: An example from mathematics and science. Educational Evaluation and Policy Analysis, 25, 1-29.

Hamilton, L., & Stecher, B. (2004). Responding effectively to test-based accountability. Phi Delta Kappan, 85(8), 578.

Hamilton, L. S., & Stecher, B. M. (2006). Measuring instructional responses to standards-based accountability. Santa Monica, CA: RAND Corporation.

Hamilton, L. S., Stecher, B. M., Marsh, J. A., McCombs, J. S., Robyn, A., Russell, J., Naftel, S., & Barney H. (2007). Standards-based accountability under No Child Left Behind: Experiences of teachers and administrators in three states. Santa Monica, CA: RAND Corporation.

Hamilton, L. S., Stecher, B. M., & Yuan, K. (2008). Standards-based reform in the United States: History, research, and future directions. Santa Monica, CA: RAND Corporation.

Hamilton, L. S., Stecher, B. M., Russell, J. L., Marsh, J. A. & Miles, J. (2008). Accountability and teaching practices: School-level actions and teacher responses. Research in Sociology of Education, 16, 31-66.

Hannaway, J. (2003). Accountability, assessment, and performance issues: we've come a long way … or have we? In W. L. Boyd & D. Miretzky (Eds.), American Educational Governance on Trial: Change and Challenges, 102nd Yearbook of the National Society for the Study of Education. Chicago IL: National Society for the Study of Education.

Hanushek, E. A., & Raymond, M. E. (2005). Does school accountability lead to improved student performance? Journal of Policy Analysis and Management, 24(2), 297-327.

Hassel, B. C., & Hassel, E. A. (2010). Low-performing schools. Race to the Top: Accelerating College and Career Readiness in States. Washington, DC: Achieve, Inc.

Hauser, R. M. (1999, April). What if we ended social promotion? Education Week, 18(30), 34-36.

Herman, R., Dawson, P., Dee, T., Greene, J., Maynard, R., Redding, S., & Darwin, M. (2008). Turning around chronically low-performing schools: A practice guide (NCEE #2008-4020). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/wwc/practiceguides.

Hilliard, A. G., III. (2000). Excellence in education versus high-stakes standardized testing. Journal of Teacher Education, 51(4), 293-304.

Huberman, A. M., & Miles, M. B. (1994). Data management and analysis methods. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (pp. 428-444). Thousand Oaks, CA: Sage.

Ingram, D., Louis, K. S., & Schroeder, R. G. (2004). Accountability policies and teacher decision making: Barriers to the use of data to improve practice. Teachers College Record, 106(6), 1258–1287.

Jennings, N., Swidler, S., & Koliba, C. (2005). Place-based education in the standards-based reform era—conflict or complement? American Journal of Education, 112(1), 44.

Johnson, R. B., & Christensen, L. B. (2000). Educational research: Quantitative and qualitative approaches. Boston, MA: Allyn and Bacon.

Jones, B. D., & Egley, R. J. (2006). Looking through different lenses: Teachers’ and administrators’ views of accountability. Phi Delta Kappan, 87(10), 767-771.

Jones, M. G., Jones, B. D., Hardin, B., Chapman, L., Yarbrough, T., & Davis, M. (1999). The impact of high-stakes testing on teachers and students in North Carolina. Phi Delta Kappan, 81(3), 199-203.

Klein, S., Hamilton, L., McCaffrey, D., & Stecher, B. (2000). What do test scores in Texas tell us? Santa Monica, CA: RAND.

Knapp, M. (1997). Between systemic reforms and the mathematics and science classroom: The dynamics of innovation, implementation, and professional learning. Review of Educational Research, 67(2), 227–266.

Knapp, M. S., Elfers, A. M., Plecki, M. L., Loeb, H., Boatright, B., & Cabot, N. (2004). Preparing for reform, supporting teachers' work: Surveys of Washington State teachers, 2003-04 school year. Seattle, WA: Center for the Study of Teaching and Policy.

Koretz, D. (2008). Measuring up: What educational testing really tells us. Cambridge, MA: Harvard University Press.

Koretz, D. M., Barron, S., Mitchell, K., & Stecher, B. M. (1996). Perceived effects of the Kentucky Instructional Results Information System (KIRIS). Santa Monica, CA: RAND Corporation.

Koretz, D. M., & Hamilton, L. S. (2006). Testing for accountability in K-12. In R. Brennan (Ed.), Educational Measurement  (pp. 531-578). Westport, CT: Praeger Publishers.

Koretz, D., McCaffrey, D., & Hamilton, L. (2001).  Toward a framework for validating gains under high-stakes conditions (CSE Technical Report 551). Los Angeles, CA: University of California, Center for the Study of Evaluation.

Koretz, D., Stecher, B., Klein, S., & McCaffrey, D. (1994). The Vermont portfolio assessment program: Findings and implications. Educational Measurement: Issues and Practice, 13(3), 5-16.

Krippendorff, K. (1980). Content analysis: An introduction to its methodology. Beverly Hills, CA: Sage.

Ladd, H. F., & Zelli, A. (2002). School-based accountability in North Carolina: The responses of school principals. Educational Administration Quarterly, 38(4), 494-529.

Lane, S., Parke, C. S., & Stone, C. A. (2002). The impact of a state performance-based assessment and accountability program on mathematics instruction and student learning: Evidence from survey data and school performance. Educational Assessment, 8(4), 279-315.

Le Floch, K. C., Martinez, F., O'Day, J., Stecher, B., Taylor, J., & Cook, A. (2007). State and local implementation of the No Child Left Behind Act, Vol. III—accountability under NCLB. Washington, DC: U.S. Department of Education.

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage.

Linn, R. L. (2000). Assessments and accountability. Educational Researcher, 29(2), 4-16.

Linn, R. L., Baker, E. L., & Betebenner, D. W. (2002). Accountability systems: Implications of requirements of the No Child Left Behind Act of 2001. Educational Researcher, 31(6), 3-16.

Loeb, H., Knapp, M. S., & Elfers, A. M. (2008). Teachers’ response to standards-based reform: Probing reform assumptions in Washington State. Education Policy Analysis Archives, 16(8), 1-29.

Louis, K. S., & Dentler, R. A. (1988). Knowledge use and school improvement. Curriculum Inquiry, 18(1), 33-62.

Louis, K. S., Febey, K., & Schroeder, R. (2005). State-mandated accountability in high schools: Teachers' interpretations of a new era. Educational Evaluation and Policy Analysis, 27(2), 177-204.

Loveless, T. (Ed.). (2001). The great curriculum debate: How should we teach reading and math? Washington, DC: Brookings.

McCann, T. M., Jones, A. C., & Aronoff, G. (2010). TRUTHS hidden in plain view. Phi Delta Kappan, 92(2), 65-67.

McCombs, J. S. (2005). Progress in implementing standards, assessments, and the highly qualified teacher provisions of NCLB: Perspectives from California, Georgia, and Pennsylvania. Santa Monica, CA: RAND Corporation.

McLaughlin, M. (1976). Implementation as mutual adaptation: Change in classroom organization. Teachers College Record, 77(3), 339-351.

McLaughlin, M. (1987). Learning from experience: Lessons from policy implementation. Educational Evaluation and Policy Analysis, 9(2), 171-178.

McLaughlin, M. (1990). The Rand Change Agent Study Revisited: Macro Perspectives and Micro Realities. Educational Researcher, 19(9), 11-16.

McMurrer, J. (2007). Choices, changes, and challenges: Curriculum and instruction in the NCLB era. Washington, DC: Center on Education Policy.

McNeil, L. (2000). Contradictions of school reform: Educational costs of standardized testing. New York, NY: Routledge.

Measuring Up. (2002, January). Education Week, 21(16), 26. Retrieved from http://www.edweek.org/ew/articles/2002/01/09/16testbox1.h21.html?qs=Testing+systems+in+most+states+not+ESEA+ready

Mehrens, W. A., Popham, W. J., & Ryan, J. M. (1998). How to prepare students for performance assessments. Educational Measurement: Issues & Practice, 17, 18-22.

Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis (2nd ed.). Thousand Oaks, CA: Sage.

Miller, L., Herman, R., Garet, M., Desimone, L., & Zhang, Y. (2002).  State mathematics standards:  Policies, instructional supports, aligned instruction, and student achievement. Washington, DC: U.S. Department of Education.

Minnici, A., & Hill, D. D. (2007). Educational architects: Do state education agencies have the tools necessary to implement NCLB? Washington, DC: Center on Education Policy.

Monfils, L. F., Firestone, W. A., Hicks, J. E., Martinez, M. C., Schorr, R. Y., & Camilli, G. (2004). Teaching to the test. In W. A. Firestone, R. Schorr, & L. F. Monfils (Eds.), The ambiguity of teaching to the test: Standards, assessment and educational reform (pp. 37-61). Mahwah, NJ: Lawrence Erlbaum.

National Council on Education Standards and Testing. (1992). Raising standards for American education. A report to Congress, the Secretary of Education, the National Education Goals Panel, and the American people. Washington, DC: U.S. Government Printing Office.

National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. Reston, VA: NCTM.

National Mathematics Advisory Panel. (2008). Foundations for success: The final report of the National Mathematics Advisory Panel. Washington, DC: U.S. Department of Education.

Nave, B., Miech, E., & Mosteller, F. (2000). A lapse in standards: Linking standards-based reform with student achievement. Phi Delta Kappan, 82(2), 128-132.

Nichols, S. L., & Berliner, D. C. (2007). Collateral damage: How high-stakes testing corrupts America’s schools. Cambridge, MA: Harvard Education Press.

No Child Left Behind. (2004, August). Education Week. Retrieved from http://www.edweek.org/ew/issues/no-child-left-behind/

O'Day, J. (2002). Complexity, accountability, and school improvement. Harvard Educational Review, 72(3), 293-329.

Parke, C. S., & Lane, S. (2008). Examining alignment between state performance assessment and mathematics classroom activities. Journal of Educational Research, 101(3), 132-147.

Patton, M. Q. (1990). Qualitative evaluation and research methods (2nd ed.). Newbury Park, CA: Sage.

Pellegrino, J. W., Baxter, G. P., & Glaser, R. (1999). Addressing the "two disciplines" problem: Linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education, 24(9), 307-353.

Popham, W. J. (2001). The truth about testing: An educator’s call to action. Alexandria, VA: Association for Supervision and Curriculum Development.

Porter, A. (1993). School delivery standards. Educational Researcher, 22(5), 24-30.

Porter, A. C. (1998). The effects of upgrading policies on high school mathematics and science.  In D. Ravitch (Ed.), Brookings papers on education policy 1998 (pp. 123-172). Washington, DC: Brookings Institution Press.

Porter, A. (2000). Doing high-stakes assessment right. The School Administrator, 11(57), 28-31.

Porter, A. (2002). Measuring the content of instruction: Uses in research and practice. Educational Researcher, 31(7), 3-14.

Porter, A., Floden, R., Freeman, D., Schmidt, W., & Schwille, J. (1988). Content determinants in elementary school mathematics. In D. A. Grouws & T. J. Cooney (Eds.), Perspectives on research on effective mathematical teaching (pp. 96-113). Hillsdale, NJ: Lawrence Erlbaum Associates.

Porter, A., Kirst, M. W., Osthoff, E. J., Smithson, J. L., & Schneider, S. A. (1993). Reform up close: An analysis of high school mathematics and science classrooms. University of Wisconsin-Madison.

Ravitch, D. (2010). The death and life of the great American school system: How testing and choice are undermining education. New York, NY: Basic Books.

Remillard, J. T. (2005). Examining key concepts in research on teachers' use of mathematics curricula. Review of Educational Research, 75(2), 211-246.

Resnick, L. B., Rothman, R., Slattery, J. B., & Vranek, J. (2004). Benchmarking and alignment of standards and testing. Educational Assessment, 9(1-2), 1-27.

Reys, B., Reys, R., & Rubenstein, R. (2010). Mathematics curriculum: Issues, trends, and future directions, 72nd Yearbook. Washington, DC: National Council of Teachers of Mathematics.

Richardson, V., & Placier, P. (2001). Teacher change. In V. Richardson (Ed.), Handbook of research on teaching (pp. 905-947). Washington, DC: American Educational Research Association.

Ryan, G. W., & Bernard, H. R. (2003). Data management and analysis methods. In N. K. Denzin & Y. S. Lincoln (Eds.), Collecting and interpreting qualitative materials (2nd ed.) (pp. 259-309). Thousand Oaks, CA: Sage.

Sandholtz, J. H., Ogawa, R., & Scribner, S. P. (2004). Standards gaps: Unintended consequences of local standards-based reform. Teachers College Record, 106(6), 1177-1202.

Schmidt, W. H., McKnight, C. C., & Raizen, S. A. (1997). A splintered vision: An investigation of U.S. science and mathematics education. Boston, MA: Kluwer Academic Publishers.

Sheldon, K. M., & Biddle, B. J. (1998). Standards, accountability, and school reform: Perils and pitfalls. Teachers College Record, 100, 164-180.

Shepard, L., Hannaway, J., & Baker, E. (2009). Standards, assessments, and accountability: Education policy white paper. Washington, DC: National Academy of Education. Retrieved from http://www.naeducation.org/Standards_Assessments_Accountability_White_Paper.pdf

Smith, T., Desimone, L., & Ueno, K. (2005). "Highly qualified" to do what? The relationship between NCLB teacher quality mandates and the use of reform-oriented instruction in middle school math. Educational Evaluation and Policy Analysis, 27(1), 75-109.

Smith, M., & O’Day, J. (1991a). Systemic school reform. In S. H. Fuhrman & B. Malen (Eds.), The politics of curriculum and testing: The 1990 yearbook of the Politics of Education Association (pp. 233-267). London: The Falmer Press.

Smith, M., & O’Day, J. (1991b). Putting the pieces together: Systemic school reform (CPRE Policy Brief). New Brunswick, NJ: Eagleton Institute of Politics.

Spillane, J. (2005). Primary school leadership practice: How the subject matters. School Leadership & Management, 25(4), 383-397.

Spillane, J., Reiser, B. J., & Reimer, T. (2002). Policy implementation and cognition: Reframing and refocusing on implementation research. Review of Educational Research, 72(3), 387-431.

Spillane, J., & Zeuli, J. (1999). Reform and teaching: Exploring patterns of practice in the context of national and state mathematics reforms. Educational Evaluation and Policy Analysis, 21(1), 1-27.

Stecher, B. M., Barron, S. L., Chun, T. & Ross, K. (2000). The effects of the Washington state education reform on schools and classrooms (CSE Technical Report No. 525). Los Angeles, CA: UCLA National Center for Research on Evaluation, Standards and Student Testing.

Stecher, B., Barron, S., Chun, T., & Ross, K. (2005). The effects of Washington state education reform on schools and classrooms. Santa Monica, CA: RAND Corporation.

Stecher, B. M., Barron, S., Kaganoff, T., & Goodwin, J. (1998). The effects of standards-based assessment on classroom practices: Results of the 1996–97 RAND survey of Kentucky teachers of mathematics and writing (CSE Technical Report No. 482). Los Angeles, CA: UCLA National Center for Research on Evaluation, Standards and Student Testing.

Stecher, B. M., & Borko, H. (2002). Integrating findings from surveys and case studies: Examples from a study of standards-based educational reform. Journal of Education Policy, 17(5), 547-569.

Stecher, B. M., & Chun, T. (2001, April). The effects of the Washington education reform on school and classroom practice, 1999–2000. Paper presented at the annual meeting of the American Educational Research Association, Seattle, WA.

Stecher, B. M., Epstein, S., Hamilton, L. S., Marsh, J. A., Robyn, A., McCombs, J. S., Russell, J., et al. (2008). Pain and gain: Implementing No Child Left Behind in three states, 2004-2006. Santa Monica, CA: RAND Corporation.

Stecher, B. M., & Hamilton, L. S. (2002). Putting theory to the test: Systems of “educational accountability” should be held accountable. RAND Review, 26(1), 17-23.

Stecher, B. M., Hamilton, L. S., Ryan, G. W., Le, V., Williams, V. L., Robyn, A. & Alonzo, A. (2002). Measuring reform-oriented instructional practices in mathematics and science. Santa Monica, CA: RAND Corporation.

Stecher, B. M., & Kirby, S. N. (2004). Organizational improvement and accountability: Lessons for education from other sectors [Monograph]. RAND Corporation. Retrieved 20 July 2010, from http://www.rand.org/pubs/monographs/2004/RAND_MG136.sum.pdf. 

Stecher, B. M., & Mitchell, K. (1995). Portfolio-driven reform: Vermont teachers’ understanding of mathematical problem solving and related changes to classroom practice (RP-539). Santa Monica, CA: RAND Corporation.

Stevenson, H., & Stigler, J. (1992). The learning gap: Why our schools are failing and what we can learn from Japanese and Chinese education. New York, NY: Summit Books.

Stigler, J., & Hiebert, J. (1999). The teaching gap: Best ideas from the world's teachers for improving education in the classroom. New York, NY: The Free Press.

Strauss, A., & Corbin, J. (1998). Basics of qualitative research: Techniques and procedures for developing grounded theory (2nd ed.). Thousand Oaks, CA: Sage.

Swanson, C., & Stevenson, D. (2002). Standards-based reform in practice: Evidence on state policy and classroom instruction from the NAEP state assessments. Educational Evaluation and Policy Analysis, 24(1), 1-27.

Taylor, J., Stecher, B., O’Day, J., Naftel, S., & Le Floch, K. C. (2010). State and local implementation of the No Child Left Behind Act, Vol. IX—Accountability under NCLB: Final report. Washington, DC: U.S. Department of Education.

Tyack, D., & Cuban, L. (1995). Tinkering toward utopia: A century of public school reform. Cambridge, MA: Harvard University Press.

Valli, L., Croninger, R., & Walters, K. (2007). Who (else) is the teacher? Cautionary notes on teacher accountability systems. American Journal of Education, 113, 635–662.

Watanabe, M. (2007). Displaced teacher and state priorities in a high-stakes accountability context. Educational Policy, 21(2), 311–368.

Weast, J. D. (2011). Engage educators in order to achieve the best results for students. Phi Delta Kappan, 92(4), 39-42.

Weiss, C. H. (1998). Evaluation research: Methods for studying programs and policies. Upper Saddle River, NJ: Prentice Hall.

Weiss, C. H. (1998). Evaluation (2nd ed.). Upper Saddle River, NJ: Prentice Hall.

Weiss, I. R., Knapp, M. S., Hollweg, K. S., & Burrill, G. (2002). Investigating the influence of standards: A framework for research in mathematics, science, and technology education. Washington, DC: National Academies Press.

Wolf, S. A., & McIver, M. (1999). When process becomes policy: The paradox of Kentucky state reform for exemplary teachers of writing. Phi Delta Kappan, 80(5), 401-406.

Wong, K. K., Anagnostopoulos, D., Rutledge, S., & Edwards, C. (2003). The challenge of improving instruction in urban high schools: Case studies of the implementation of the Chicago academic standards. Peabody Journal of Education, 78(3), 39-87.

Cite This Article as: Teachers College Record Volume 115 Number 8, 2013, p. 1-53
https://www.tcrecord.org ID Number: 17083

About the Author
  • Laura Desimone
    University of Pennsylvania
    LAURA M. DESIMONE is associate professor of public policy and education at the University of Pennsylvania. She studies the effects of policy on teachers and students; the effects of instruction on student learning; and how professional development interventions are translated to the classroom. Recent publications: Desimone, L. M., & Long, D. (2010). Does conceptual instruction and time spent on mathematics decrease the student achievement gap in early elementary school? Findings from the Early Childhood Longitudinal Study (ECLS). Teachers College Record, 112(12); and Desimone, L., Smith, T., & Frisvold, D. (2010). Survey measures of classroom instruction: Comparing student and teacher reports. Educational Policy, 24(2), 267-329.