Getting College-Ready During State Transition Toward the Common Core State Standards
by Zeyu Xu & Kennan Cepa - 2018
Background: As of 2016, 42 states and the District of Columbia have adopted the Common Core State Standards (CCSS). Tens of millions of students across the country completed high school before their schools were able to fully implement the CCSS. As with previous standards-based reforms, the transition to the CCSS-aligned state education standards has been accompanied by curriculum framework revisions, student assessment redesigns, and school accountability and educator evaluation system overhauls.
Purpose: Even if the new standards may improve student learning once they are fully implemented, the multitude of changes at the early implementation stage of the reform might disrupt student learning in the short run as teachers, schools, and communities acclimate to the new expectations and demands. The goal of this study is not to evaluate the merits and deficiencies of the CCSS per se, but rather to investigate whether college readiness improved among high school students affected by the early stages of the CCSS implementation, and whether students from different backgrounds and types of high schools were affected differently.
Research Design: We focus on three cohorts of eighth-grade students in Kentucky and follow them until the end of the 11th grade, when they took the state-mandated ACT. The three successive cohorts, enrolled in the eighth grade between 2008 and 2010, each experienced a different level of exposure to the CCSS transition. Using ACT scores as proxy measures of college readiness, we estimate cohort fixed-effects models to investigate the transitional impact of standards reform on student performance on the ACT. To gauge the extent to which the implementation of CCSS is directly responsible for any estimated cross-cohort differences in student ACT performance, we conduct additional difference-in-differences analyses and a falsification test.
Data: Our data include the population of three cohorts of eighth graders enrolled in Kentucky public schools between 2008 and 2010. The total analytic sample size is 100,212. The data include student test scores, student background characteristics, and school characteristics.
Findings: In the case of the CCSS transition in Kentucky, our findings suggest that students continued to improve their college readiness, as measured by ACT scores, during the early stages of CCSS implementation. Furthermore, evidence suggests that the positive gains students made during this period accrued to students in both high- and low-poverty schools. However, it is not conclusive that the progress made in student college readiness is necessarily attributable to the new content standards.
Conclusions: As we seek to improve the education of our children through reforms and innovations, policymakers should be mindful about the potential risks of excessive changes. Transition issues during the early stages of major educational changes sometimes lead to short-term effects that are not necessarily indicative of the longer-term effects of a program or intervention. Nevertheless, standards-based reforms are fairly frequent, and each takes multiple years to be fully implemented, affecting millions of students. Therefore, we encourage researchers and policymakers to pay more attention to the importance of transitional impact of educational reforms.
This study provides a first look at college readiness in the early years of the Common Core State Standards (CCSS) implementation. Using longitudinal administrative data from Kentucky, we follow three cohorts of students from eighth grade through 11th grade and find that students exposed to the CCSS, including students in low-poverty schools, had faster learning gains than similar students who were not exposed to the standards. Although it is not conclusive whether cross-cohort improvement was entirely attributable to the CCSS implementation, we find that students gained proficiency in the years immediately before and after the transition. Additionally, we find that student performance in subjects that adopted CCSS-aligned curricula exhibited larger, more immediate improvement than student performance in subjects that did not.
As of 2016, 42 states and the District of Columbia have adopted the Common Core State Standards (CCSS or Common Core). The Common Core standards, sponsored by the National Governors Association and the Council of Chief State School Officers, were developed in 2009 and released in mid-2010 (NGA/CCSSO, 2010). The CCSS represent a cross-state effort to adopt a set of college- and career-ready standards for kindergarten through 12th grade in English language arts/literacy and mathematics.1 The CCSS initiative grew out of concerns that existing state standards were not adequately preparing students with the knowledge and skills needed to compete globally (Porter, McMaken, Hwang, & Yang, 2011), necessitating a clearer set of learning expectations that are consistent across states. The initiative is also thought to offer the potential benefit of allowing for cross-state collaboration when developing teaching materials, common assessment systems, as well as tools to support educators and schools.
Yet the CCSS initiative is not without controversy, and it has become increasingly polarizing.2 Advocates and opponents disagree on many aspects of the CCSS. Key points of contention include the standards themselves, the transparency of the development of these standards, their accompanying standardized tests, the appropriateness of student proficiency levels and their implications on performance gaps between high- and low-poverty students, the financial cost of implementation, the adequacy of implementation supports, as well as the roles played by federal and corporate entities in the development and adoption of CCSS.
As with previous standards-based reforms, the implementation of CCSS-aligned state education standards has been accompanied by curriculum framework revisions, student assessment redesigns, and school accountability and educator evaluation system overhauls (Clune, 1993; Rentner, 2013; Smith & O'Day, 1991). Even if the new standards may improve student learning once they are fully implemented, the multitude of changes at the early implementation stage of the reform might disrupt student learning in the short run as teachers, schools, and communities acclimate to the new expectations and demands (Schmidt & Houang, 2012). Furthermore, schools and districts with more constrained staffing capacity and limited financial resources, such as those serving predominantly low-income students, may face more challenges during the CCSS transition (Logan, Minca, & Adar, 2012; Schmidt & Houang, 2012). Indeed, in a survey of deputy superintendents of education in 40 CCSS states, 34 states reported that finding adequate staff and financial resources to support all of the necessary CCSS implementation activities is a major (22 states) or minor (12 states) challenge (Rentner, 2013).
The concern that students and schools may be overwhelmed by the presence of multiple, concurrent changes to the education system motivated us to investigate student performance trends during the transition to the CCSS. The objective of this study is not to evaluate the merits and deficiencies of the CCSS per se. Instead, we focus on the transitional impact on student learning that could arise from two competing hypotheses. On one hand, students could potentially benefit from the new standards, which some believe hold the promise of overcoming a number of flaws in previous standards-based reforms, such as low-quality content standards (Finn, Petrilli, & Julian, 2006; Hill, 2001), poorly aligned assessments (Polikoff, Porter, & Smithson, 2011), and misplaced incentives that distract attention from lower-performing students (Hamilton, Stecher, Marsh, McCombs, & Robyn, 2007; Taylor, Stecher, O'Day, Naftel, & LeFloch, 2010). On the other hand, regardless of the design and implementation quality of the CCSS, student learning may suffer during the transition years when both students and teachers need to learn and adapt to the new systems. Students may also be adversely affected when human and financial resources are diverted to support the transition. The net transitional impact on student achievement is theoretically ambiguous, and it is a question that deserves empirical attention.
Student learning experiences during policy transition years are sometimes dismissed as transitory and unreliable, and they are certainly not reflective of the efficacy of the reform itself. However, tens of millions3 of students across the country will complete high school before their schools fully implement the CCSS. For those students, their experiences under the incomplete implementation of the CCSS will have a lasting impact on their future life opportunities. Whether college readiness improved among high school students affected by the early stages of the CCSS implementation, and whether students from different backgrounds and types of high schools were affected differently, are important research questions that have yet to be addressed.
This paper starts to fill in this gap by using longitudinal administrative data from Kentucky, the first state to adopt the CCSS. A critical analytic requirement to answer these questions is to have student achievement measures that are comparable before and after the implementation of the CCSS. States that adopted the new standards typically also redesigned their assessments simultaneously. Therefore, any comparisons of student achievement before and after the standards transition using state standardized tests would conflate changes of standards with changes of testing regimes.
As a state that requires all 11th graders to take the ACT, Kentucky provides us with a rare opportunity to overcome this analytic challenge. As we will discuss in more detail later, the ACT is designed to measure what students need to know to be ready for entry-level college-credit courses (ACT, 2008). The Kentucky Department of Education (KDE) and the Council on Postsecondary Education (CPE) define college readiness as the ability for students to access credit-bearing coursework without the need for developmental education or supplemental courses (KDE & CPE, 2010, p. 7). Therefore, the design of the ACT aligns with Kentucky's operating definition of college readiness, which we adopt for the current study.4 Moreover, because the ACT is mandatory in Kentucky and has been since 2007, we can measure the proficiency of all students before and after the implementation of Common Core standards, not just that of students who have already decided to go to college. The mandatory nature of the ACT in Kentucky allows us to avoid the self-selection issue that often biases research findings (Clark, Rothstein, & Schanzenbach, 2009).
The remainder of this paper is organized as follows. In the next section, we describe the transition to the CCSS in Kentucky, followed by a discussion of the theory and evidence of standards-based education reforms. We then describe the data and measures we use in our analyses and outline the empirical research design. Results are reported and discussed in the final two sections of the paper.
COMMON CORE IN KENTUCKY
Kentucky adopted the CCSS in 2010 and started its implementation in the 2011-2012 school year. Before 2011, Kentucky's education standards were the Kentucky Program of Studies (POS), and the 2006 Core Content for Assessment described the particular skills and concepts that would be assessed in each grade under POS. The POS-aligned Kentucky Core Content Test (KCCT) was a series of state tests designed to measure students' learning in reading, math, science, social studies, and writing. Senate Bill 1, enacted by the General Assembly in 2009, directed the Kentucky Department of Education (KDE) to revise state content standards and launched Kentucky's transition toward the CCSS-aligned Kentucky Core Academic Standards (KCAS). Adopted by the Kentucky State Board of Education in June 2010, these new standards were developed jointly by the National Governors Association (NGA) and the Council of Chief State School Officers (CCSSO). Under the KCAS, the ELA and math curriculum frameworks are now aligned with the CCSS, whereas the curriculum frameworks for all other subject areas are carried over from POS.5
Similar to the experiences in many other CCSS states, a plethora of other changes took place in Kentucky in 2011-2012 along with the implementation of the CCSS. First, starting from the 2011-2012 school year, the Kentucky Performance Rating for Educational Progress (K-PREP) tests replaced the KCCT. Students in Grades 3 through 8 are required to take K-PREP in reading, math, science, social studies, and writing. In addition, students started to take K-PREP end-of-course tests for high-school level courses including English II, algebra II, biology, and U.S. history. Second, in 2011-2012, Kentucky started field testing major components of its newly designed teacher evaluation system called the Kentucky Teacher Professional Growth and Effectiveness System.6 The new system evaluated teacher performance based on multiple measures, including student growth, student surveys, and observations by peers and evaluators. Finally, a new school accountability model, Unbridled Learning: College/Career-Readiness for All, took effect in the 2011-2012 school year.7 The new model measures and categorizes school performance based on student achievement in the five content areas, student-achievement growth, measures of student-achievement gap among student subgroups, high school graduation rates, and college- and career-readiness. Since the U.S. Department of Education granted Kentucky a No Child Left Behind (NCLB, 2001) waiver in February 2012, Kentucky can use the Unbridled Learning model to report both state- and federal-level accountability measures.
The breadth of CCSS-related changes is what motivated our concerns of potential disruption to learning among students who spend some or most of their high school careers under the CCSS transition. The depiction here is also intended to reiterate that it is not feasible, nor our goal, to disentangle the impact of one set of changes from the impact of another. Instead, we acknowledge that standards-based educational reforms often lead to cascading changes within schools and districts, and that the focus of this study is on the overall impact of the CCSS transition on student learning.
STANDARDS-BASED EDUCATION REFORMS
Using student test scores as an outcome of interest, education policymakers have sought to improve schools and student learning (Coleman, 1966). A core component of state and federal efforts to improve education has been standards-based education reform. In the 1980s, states implemented minimum standards for student learning, and the 1990s ushered in a national movement toward raising these minimum standards (Swanson & Stevenson, 2002). The NCLB of 2001 focused on getting students to proficiency in math and reading by emphasizing accountability, sanctions, and awards (NCLB of 2001, sec. 2Aiii). Since then, the 2009 Race to the Top (RTTT), the 2011 ESEA Flexibility program, the CCSS, and the 2015 Every Student Succeeds Act (ESSA) all encouraged states to set high standards so that children graduate high school ready for college and careers.
The theory underlying standards-based reforms posits that teaching and learning will improve by (a) creating high-quality content standards and clearly articulated learning goals, (b) designing student assessments aligned to those standards to monitor progress toward achieving the learning goals, and (c) establishing support and incentive systems to facilitate and motivate the adoption of the standards (Smith & O'Day, 1991). The CCSS, the latest example of standards-based reforms, provides a common set of standards for ELA and mathematics "defin[ing] the rigorous skills and knowledge . . . that need to be effectively taught and learned for students to be ready to succeed academically in credit-bearing, college-entry courses."8 Although the CCSS prescribes academic goals, it does not determine specific curricula for states or districts.
Existing content analysis of the CCSS (Beach, 2011; Cobb & Jackson, 2011; Porter et al., 2011) shows that the CCSS requires a modest increase in cognitive skills in math, and a larger increase for ELA, when compared to previous state-level standards. However, differences between existing standards and the CCSS vary across states, and researchers find that Kentucky's previous standards are among the least similar to the CCSS (Schmidt & Houang, 2012). For example, unlike the previous standards, Kentucky's CCSS-aligned standards mandate that each of Kentucky's postsecondary institutions assist in the development of academic standards in reading and mathematics to ensure that the curricula are aligned between high school and college (Winters et al., 2009).
WHAT HAVE WE LEARNED ABOUT STANDARDS-BASED REFORMS?
To date, very little empirical research exists on the extent to which the central goal of the CCSS, improved college- and career-readiness, has been achieved. While student performance on the National Assessment of Educational Progress (NAEP) tests improved after CCSS implementation, the performance gains are small and may not be statistically significant (Loveless, 2016). Furthermore, the correlation between NAEP scores and CCSS implementation is by no means causal.
By comparison, there is a large literature on the impact of previous standards-based reforms on teaching and learning. The most rigorous studies suggest that reforms of this type, when properly implemented and under certain circumstances, could improve student learning and classroom instruction (Carnoy & Loeb, 2002; Dee & Jacob, 2011; Hamilton et al., 2007; Hanushek & Raymond, 2005; Jacob, 2007; Stecher et al., 2008; Taylor et al., 2010). However, research also documents evidence of adverse outcomes of standards-based reforms, such as the narrowing of the curriculum (Clarke et al., 2003), strategic manipulation of the composition of the test-taking student population (Cullen & Reback, 2006; Figlio, 2006; Özek, 2012), diverting attention away from lowest-performing students to students near the proficiency cut score (Booher-Jennings, 2005; Hamilton et al., 2007; Stecher et al., 2008; Taylor et al., 2010), excessive test preparation (Wong, Anagnostopoulos, Rutledge, & Edwards, 2003), and even outright cheating on the test (Jacob & Levitt, 2003; Sass, Apperson, & Bueno, 2015).
A FOCUS ON THE TRANSITIONAL IMPACT ON STUDENT LEARNING
There are very few empirical studies that explicitly analyze the relationship between standards-based education reforms and student achievement as the reforms are being implemented. Educational change takes time and schools may face performance setbacks in the early years (Fullan, 2001). Transition issues during the early stages of major educational changes sometimes lead to short-term effects that are not necessarily indicative of the longer-term effects of a program or intervention (Kane & Staiger, 2002). For example, studies on the implementation of school-level curriculum interventions, comprehensive school reforms, teacher evaluation systems, and accountability systems (Borman, Hewes, Overman, & Brown, 2003; Borman et al., 2007; Dee & Wyckoff, 2013; Ladd, 2007) invariably find that changes in student achievement during the early implementation stages are not always consistent with later results. Understandably, researchers hesitate to make policy decisions due to small sample sizes, minor year-to-year fluctuations, or similar issues related to the early years of implementation (Kane & Staiger, 2002; Kober & Rentner, 2011).
However, this should not lead researchers and policymakers to dismiss the importance of the transitional impact of education reforms. Standards-based reforms are fairly frequent, and each takes multiple years to be fully implemented. As discussed earlier, most states have implemented major educational reforms in response to the 2001 NCLB, the 2009 RTTT, and the 2011 ESEA Flexibility program during the last decade, and they are considering changes again after the passage of the 2015 ESSA. Changes of the curriculum and instructional materials and the need to realign performance expectations with the new standards are a major source of frustration to teachers (Desimone, 2002). Educator commitment to the new standards may also be reduced if teachers and education leaders perceive such reforms as transitory (Ross et al., 1997). How educators react to standards transitions, in turn, will affect the learning experiences of tens of millions of students, which will likely have long-term effects on students (Rouse, Hannaway, Goldhaber, & Figlio, 2013).
For these reasons, this paper focuses on early impacts of statewide standards implementation in Kentucky and its differential effects across schools and student subgroups, while recognizing that such early results should not be the last word on the effectiveness of the CCSS.
RESOURCE CONSTRAINTS AND DIFFERENTIAL IMPACT
An area of constant debate is whether the implementation of CCSS may exacerbate the persisting achievement gaps between disadvantaged students and their more affluent peers (for example, Reardon, 2011). In particular, implementing large-scale education standards reforms like the CCSS is likely to impose additional challenges to resource-constrained schools and students. Local administrators, teachers, principals, and other staff working in high-poverty districts and schools feel generally less prepared to implement the CCSS than their counterparts in low-poverty districts and schools (Brown & Clift, 2010; Finnan, 2014). In addition, compared to low-poverty schools, schools serving more disadvantaged students often have fewer professional development resources and academic supports for students (Regional Equity Assistance Centers, 2013).
On the other hand, standards may reduce inequality if they help schools identify struggling students while also improving instruction for disadvantaged students (Gamoran, 2013). In their meta-analysis of early reform implementation, Borman et al. (2003) find that high-poverty schools experience similar benefits to implementation as low-poverty schools. However, schools' successful implementation of new standards is critical to reducing achievement gaps (Foorman, Kalinowski, & Sexton, 2007). Furthermore, the early years of standards implementation may cause the achievement gap to widen, even if it reduces disparities in the long run.
With diverse student needs, accountability pressure, and resource constraints, the quality, scope, and strategy of standards implementation between high- and low-poverty schools are likely to be different, even though standards may reduce the achievement gap over time. While we cannot test the mechanisms of differential achievement gaps by school and student poverty status, we can examine the extent to which school and student poverty mediates the relationship between educational standards transition and student achievement.
DATA AND MEASURES
The data provided to us by the KDE include detailed records for individual students, school personnel, and student course-taking from school years 2008-2009 through 2012-2013, covering three years before the CCSS and two years post-CCSS. Teachers and students are assigned unique identifiers that can be used to track individuals over time; students and teachers also can be linked to specific classrooms.
Utilizing available student-level data, we can control for background characteristics (e.g., age, gender, race/ethnicity, free or reduced-price lunch [FRPL] eligibility, special education status, and English language learner [ELL] designation), enrollment, and state assessment scores. These state assessment scores are from pre-CCSS exams (i.e., the KCCT) that Kentucky students took at the end of Grades 3 through 8 in reading, mathematics, social studies, and writing.
Beginning in the 2007-2008 school year, all students in Grades 10 and 11 take the PLAN and the ACT, respectively. Both tests are provided by ACT, Inc. The PLAN is administered every September to all incoming 10th-grade students. The ACT, on the other hand, is administered near the end of Grade 11 every March. For both the ACT and the PLAN, our data include composite scores as well as four sub-scores (English, mathematics, reading, and science). The PLAN scores can be used for two purposes: to augment the KCCT to control for student baseline academic achievement, and to facilitate sensitivity analyses that are discussed in the research design section below.
We aggregate individual FRPL eligibility to the school level to examine whether students at high-FRPL and low-FRPL schools have different experiences with new curriculum implementation. School-level poverty context is measured by the percentage of FRPL-eligible students in a school. For students who attended multiple schools between Grades 9 and 11, we use the average FRPL percentage across schools. We define schools in the top fifth of the school poverty distribution in Kentucky (> 55 percent FRPL) as high poverty, and those in the bottom fifth (≤ 35 percent) as low poverty. (See Figure 1 for the distribution of school poverty among Kentucky public high schools.)
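The classification rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the "middle" label for schools between the two cutoffs are assumptions introduced here, and only the 55 and 35 percent thresholds come from the text.

```python
from statistics import mean

# Cutoffs taken from the text; everything else in this sketch is hypothetical.
HIGH_POVERTY_CUTOFF = 55.0  # percent FRPL; schools above this are high poverty
LOW_POVERTY_CUTOFF = 35.0   # percent FRPL; schools at or below this are low poverty


def school_frpl_percent(frpl_flags):
    """Percentage of FRPL-eligible students in one school (flags are booleans)."""
    return 100.0 * sum(frpl_flags) / len(frpl_flags)


def student_poverty_context(school_percents):
    """Average FRPL percentage across the schools a student attended in Grades 9-11."""
    return mean(school_percents)


def classify_poverty(frpl_percent):
    """Map a FRPL percentage to 'high', 'low', or 'middle' (the last label is assumed)."""
    if frpl_percent > HIGH_POVERTY_CUTOFF:
        return "high"
    if frpl_percent <= LOW_POVERTY_CUTOFF:
        return "low"
    return "middle"
```

For a student who attended two schools with 30 and 40 percent FRPL, `student_poverty_context([30.0, 40.0])` averages to 35.0, which `classify_poverty` places in the low-poverty group because the lower cutoff is inclusive.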
Figure 1. Distribution of the Percentage of Students Eligible for Free/Reduced-Price Lunch (FRPL) in School, High Schools, 2009-2013
MEASURING COLLEGE READINESS
As the proportion of students planning to attend college has increased, research has sought to develop an understanding of college readiness (Conley, 2007; Porter & Polikoff, 2011; Roderick, Nagaoka, & Coca, 2009). Ultimately, college readiness measures students' ability to succeed in college (Conley, 2007), and existing literature urges a multidimensional view of college readiness that includes content knowledge and basic skills, core academic skills, non-cognitive skills and norms of performance, and college knowledge (Conley, 2007; Conley, 2010; Roderick et al., 2009). However, such a comprehensive measure is difficult to quantify.
As a result, most colleges continue to rely heavily on standardized achievement tests, such as the ACT, to measure cognitive ability, basic skills, content knowledge, and common academic skills (Roderick et al., 2009, p. 185). The use of tests like the ACT in college admission has relatively strong empirical support. The development of the ACT relies on college faculty's input (Zwick, 2004), and ACT scores are found to predict student grades in the first year of college (Allen, 2013; Allen & Sconing, 2005), as well as students' likelihood to persist in college (Bettinger, Evans, & Pope, 2011). In fact, for some institutions, ACT scores may be the best single predictor of first-year college course performance (Maruyama, 2012).
More importantly, the design of the ACT is aligned with the operating definition of college readiness in Kentucky. As discussed in the introduction, Kentucky considers a student to be college ready if the student is able to take credit-bearing college courses without remediation (KDE & CPE, 2010), which is exactly what the ACT is designed to measure (ACT, Inc., 2008). This is supported by empirical evidence that student ACT scores are highly correlated with remedial course-taking (Bettinger & Long, 2009; Howell, 2011; Martorell & McFarlin, Jr., 2011; Scott-Clayton, Crosta, & Belfield, 2014). For this reason, ACT test scores are a measure of college readiness that many states care about, leading 15 states (including Kentucky) to adopt the ACT as a federal accountability measure of high school performance (Gewertz, 2016).
Finally, for a study like ours, it is critical to have a college-readiness measure for all students. Frequently, measures of college readiness, including ACT test scores, are available only among a select group of students who plan to apply to college (Clark et al., 2009; Goodman, 2013; Roderick et al., 2009; Steelman & Powell, 1996). In such cases, any changes in the average ACT performance could reflect real improvement in college readiness, changes in who elects to take the ACT, or both. But the ACT is mandatory for all students in Kentucky, so we can observe changes in student college readiness that represent the entire high school student population in the state without selection concerns. In fact, this is why Porter and Polikoff (2011), in their review of possible measures of college readiness, recommend the ACT as a good option for measuring college-level academic readiness.
In summary, we caution that the ACT is by no means a comprehensive measure of college readiness. As pointed out in Maruyama (2012), the use of ACT scores alone, particularly when threshold scores are created to dichotomize students into being ready for college or not, leads to imprecise prediction of student success in college. Maruyama (2012) and other researchers (e.g., Roderick et al., 2009) recommend the use of multiple indicators, such as high school course taking and grades, to measure students' ability to succeed in college. However, the ACT's close alignment with Kentucky's operating definition of college readiness, its importance in college admissions, its correlation with college achievement, its relevance to state education accountability, and its mandatory nature in Kentucky make it the best, if imperfect, outcome measure for the current study.
Our analyses focus on three cohorts of eighth-grade students and follow them until the end of the 11th grade (Exhibit 1). For all three cohorts, student academic preparation for high school is measured by the KCCT at the end of eighth grade. At the end of 11th grade, the ACT measures high school students' general educational development and their capability to complete college-level work. Neither the KCCT nor the ACT changed during the study period. Therefore, student performance at both the starting and the end points is measured with the same test instruments for all three cohorts, and is not affected by changing test familiarity.
Exhibit 1. Cross-cohort Comparison of KCAS Exposure: 2007-2008 through 2012-2013
Each of the three cohorts experienced different exposure to the CCSS transition. For this study, exposure refers to the amount of time a student spent in school while the state was implementing CCSS-aligned standards and making related changes to student assessments, the state school accountability system, and the teacher evaluation system. As Exhibit 1 shows, the first cohort of students enrolled in the eighth grade in 2007-2008 and had no exposure to the CCSS transition before sitting for the ACT in 2010-2011. By comparison, the second and third cohorts of eighth-grade students had spent one and two years, respectively, of their high school careers under the CCSS transition before taking the ACT. We take advantage of this cross-cohort variation in student exposure to the CCSS transition and address the following question: For students starting high school at similar performance levels and with similar background characteristics, did more exposure to the transitional period of standards reform predict higher ACT scores in Grade 11?
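The exposure counts described above follow from simple calendar arithmetic, sketched below. This is an illustration under two assumptions that are not from the source: students progress on time from Grade 8 to Grade 11, and the school year in which the ACT is taken counts as a full year of exposure. School years are indexed by their spring calendar year, and the function names are hypothetical.

```python
# First school year under CCSS-aligned standards was 2011-2012 (per the text),
# indexed here by its spring calendar year.
FIRST_CCSS_SPRING = 2012


def act_spring_year(grade8_spring_year):
    """Spring year in which an on-track student sits for the Grade 11 ACT (assumed on-time progression)."""
    return grade8_spring_year + 3


def years_of_exposure(grade8_spring_year):
    """School years spent under the CCSS transition, counted through the ACT year."""
    return max(0, act_spring_year(grade8_spring_year) - FIRST_CCSS_SPRING + 1)


# Eighth-grade cohorts of 2007-2008, 2008-2009, and 2009-2010.
exposure_by_cohort = {year: years_of_exposure(year) for year in (2008, 2009, 2010)}
```

Under these assumptions the three cohorts receive zero, one, and two years of exposure, matching the description in the text.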
To answer this question, we first estimate a cross-cohort model in the following form:

$$Y_i = \beta_0 + \mathbf{A}_i \boldsymbol{\beta}_1 + \beta_2 C2_i + \beta_3 C3_i + \mathbf{X}_i \boldsymbol{\gamma} + \varepsilon_i \qquad (1)$$

Here, student $i$'s ACT composite score $Y_i$ is a function of his or her eighth-grade test scores $\mathbf{A}_i$, student cohort ($C2_i$ and $C3_i$, indicators for the second and third cohorts, with the first cohort as the reference group), and background characteristics $\mathbf{X}_i$. The eighth-grade score vector includes KCCT scores in all four tested subject areas: English, mathematics, social studies, and writing. Student background characteristics include FRPL eligibility, race/ethnicity, ELL status, and special education status. To capture cohort-to-cohort variation in high school readiness, all ACT and KCCT scores are standardized by subject across all years rather than within each year. The coefficients of interest are $\beta_2$ and $\beta_3$, which represent the ACT performance differentials between students affected by CCSS implementation and unaffected students who are otherwise comparable.9
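The choice to standardize scores across all years, rather than within each year, matters because within-year standardization forces every cohort's mean to zero and erases exactly the cross-cohort differences the model is meant to detect. The sketch below illustrates this with made-up scores. The function names, the data, and the use of the population standard deviation are assumptions for illustration only.

```python
from statistics import mean, pstdev


def zscores(values, center, scale):
    """Convert raw scores to z-scores given a mean and a standard deviation."""
    return [(v - center) / scale for v in values]


def standardize_pooled(scores_by_year):
    """Standardize with a single mean and SD computed across all years."""
    pooled = [s for scores in scores_by_year.values() for s in scores]
    center, scale = mean(pooled), pstdev(pooled)
    return {yr: zscores(scores, center, scale) for yr, scores in scores_by_year.items()}


def standardize_within_year(scores_by_year):
    """Standardize each year separately, which removes cross-year mean differences."""
    return {
        yr: zscores(scores, mean(scores), pstdev(scores))
        for yr, scores in scores_by_year.items()
    }


# Hypothetical raw scores for three test years (not from the study).
raw = {2011: [16.0, 18.0, 20.0], 2012: [17.0, 19.0, 21.0], 2013: [18.0, 20.0, 22.0]}

pooled_means = {yr: mean(z) for yr, z in standardize_pooled(raw).items()}
within_means = {yr: mean(z) for yr, z in standardize_within_year(raw).items()}
```

With pooled standardization the yearly means still rise from 2011 to 2013, whereas the within-year means are all zero up to rounding error.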
In order to investigate whether the implementation of CCSS may have differential effects on students and schools facing varying degrees of financial constraints, we estimate the cross-cohort model (equation 1) for students in low and high school-poverty contexts separately. Within each school type, we further split students into those who are eligible for FRPL and those who are not, in order to capture the interplay between individual- and school-level poverty conditions.
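The choice to standardize scores across all years rather than within each year matters for a cross-cohort comparison, and a small sketch may make the reason concrete. The scores below are hypothetical numbers chosen for illustration, not study data, and the helper names are our own. Standardizing within each year forces every cohort mean to zero and so erases exactly the cohort-to-cohort variation the model is meant to detect, whereas pooled standardization preserves it.

```python
from statistics import mean, pstdev

# Hypothetical raw scores keyed by test year (illustrative only, not study data).
scores_by_year = {
    2011: [18.0, 20.0, 22.0],
    2012: [19.0, 21.0, 23.0],
    2013: [20.0, 22.0, 24.0],
}


def standardize_pooled(groups):
    """Z-score every value against the mean and SD of all years combined."""
    pooled = [x for values in groups.values() for x in values]
    mu, sigma = mean(pooled), pstdev(pooled)
    return {year: [(x - mu) / sigma for x in values]
            for year, values in groups.items()}


def standardize_within(groups):
    """Z-score each value against the mean and SD of its own year."""
    result = {}
    for year, values in groups.items():
        mu, sigma = mean(values), pstdev(values)
        result[year] = [(x - mu) / sigma for x in values]
    return result


pooled_means = {year: mean(z) for year, z in standardize_pooled(scores_by_year).items()}
within_means = {year: mean(z) for year, z in standardize_within(scores_by_year).items()}

# Pooled means still rise from year to year; within-year means are all zero.
print("pooled:", pooled_means)
print("within:", within_means)
```

In this toy example the pooled cohort means increase from 2011 to 2013, while the within-year means are identical by construction.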
WHAT MAY EXPLAIN ACT PERFORMANCE TRENDS
To gauge the extent to which the implementation of CCSS is directly responsible for any estimated cross-cohort differences in student ACT performance, we conduct two additional, more nuanced analyses. First, when Kentucky implemented the new state standards, it decided to adopt a revised, CCSS-aligned curriculum framework for English and mathematics (targeted subjects), but carried over the reading and science (untargeted subjects) curricula from the old regime. This allows us to implement a difference-in-differences type of analysis by comparing cross-cohort changes in ACT scores on targeted subjects with cross-cohort changes on untargeted subjects. The ACT performance trends on untargeted subjects serve as our counterfactuals, representing what student ACT performance might have been across all subject areas in the absence of content standards changes. If CCSS-aligned curriculum framework changes did make a difference, we would expect a stronger association between CCSS exposure and student ACT performance on targeted subjects than on untargeted subjects. To test this hypothesis, we estimate the following cross-subject, cross-cohort model:
Instead of using the ACT composite score, this model uses the ACT subject-specific score (student i's score on subject s, where s is English, math, reading, or science) as the dependent variable. Compared to model (1), model (2) adds an indicator variable T for targeted subjects and its interactions with the cohort dummy variables. The coefficients on the cohort dummy variables now represent cross-cohort differences in ACT performance on untargeted subjects (reading and science). The coefficients of interest, those on the interactions between T and the cohort dummy variables, estimate the extent to which cross-cohort progress in student ACT performance on targeted subjects (English and math) differs from that on untargeted subjects. Because the unit of analysis is student-by-subject, the total sample size is inflated by a factor of four. Therefore, we cluster standard error estimates at the student level to take into account cross-subject correlation of scores within individual students.
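A specification consistent with this description of model (2) would take roughly the following form. As before, the notation is assumed for illustration and may not match the original display equation: $Y_{is}$ is student $i$'s standardized ACT score on subject $s$, $T_s$ equals one for targeted subjects (English and math) and zero otherwise, and $\lambda_1$ and $\lambda_2$ denote the difference-in-differences coefficients of interest.

```latex
Y_{is} = \alpha
       + \beta_1 \,\mathrm{Cohort2}_i
       + \beta_2 \,\mathrm{Cohort3}_i
       + \theta \, T_s
       + \lambda_1 \,(T_s \times \mathrm{Cohort2}_i)
       + \lambda_2 \,(T_s \times \mathrm{Cohort3}_i)
       + \mathbf{S}_i' \boldsymbol{\gamma}
       + \mathbf{X}_i' \boldsymbol{\delta}
       + \varepsilon_{is}
```

In this sketch, $\beta_1$ and $\beta_2$ capture cross-cohort changes on untargeted subjects, and the implied change on targeted subjects for a given cohort is the sum of the cohort coefficient and the corresponding interaction coefficient, for example $\beta_1 + \lambda_1$ for Cohort 2.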
One complication in our cross-subject, cross-cohort design is that the CCSS for ELA also aims to raise the literacy standards in history/social studies, science, and technical subjects. The goal is to help students achieve the literacy skills and understandings required for college readiness in multiple disciplines.10 In other words, untargeted subjects, at least in theory, are not completely untouched by the curriculum reform. Insofar as this design feature of the CCSS was implemented authentically, our difference-in-differences coefficients estimate a lower bound on the effect of curriculum reform. However, these ELA standards are not meant to replace content standards in those subject areas, but rather to supplement them. Therefore, even if the revised English curriculum framework benefits student performance in other subject areas, the benefits to those subject areas are likely to be less immediate and pronounced than what we might expect for directly targeted subject areas.
We conduct a second analysis out of the concern that model (1) takes into account only cross-cohort performance differentials at a single point in time. However, between the eighth and the 11th grade, students from the three cohorts could have followed different performance trajectories, either due to unobserved student characteristics or due to education interventions or programs implemented right before the CCSS transition. In other words, cross-cohort improvement in student performance may have started before the implementation of the CCSS, and therefore it should not be attributed to CCSS transition. We test this possibility by creating a pseudo year of change and conducting a falsification test. Because the CCSS was not actually implemented in the pseudo year, we should not detect any cross-cohort differences if the implementation of the CCSS was directly responsible for those differences. Implementing this strategy, however, requires the ACT (or similar tests aligned with the ACT) to be administered to the same students repeatedly. The Kentucky assessment system provides us with a rare opportunity to conduct this falsification test, as it requires all 10th-grade students to take the PLAN tests. The PLAN, often considered the Pre-ACT assessment, helps students understand their college readiness midway through high school and plan accordingly for their remaining high school years. The PLAN scores are highly predictive of student performance on the ACT. In our sample, the correlations between the two test scores range from 0.70 to 0.86.
Because the PLAN is administered at the beginning of the 10th grade every September, none of the three cohorts under investigation had had any meaningful exposure to CCSS implementation by the time they took the PLAN. The timing of the PLAN administration allows us to examine whether students from the three cohorts, otherwise comparable in terms of background characteristics and performance at the start of high school, had already been on different learning trajectories even before the CCSS transition. This analysis is carried out by re-estimating model (1) after replacing the ACT composite scores with the PLAN composite scores, standardized across cohorts.
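The logic of this falsification test can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the estimation used in the study: the data are simulated, the sample size and effect size are hypothetical, and control variables are omitted so that the cohort coefficient reduces to a simple difference in means. If a reform affects only the outcome measured after its launch, the cohort gap should appear on the later test (the ACT) but not on the earlier one (the PLAN).

```python
import random
from statistics import mean

random.seed(0)
N = 20000            # hypothetical number of students per cohort
TRUE_EFFECT = 0.10   # hypothetical post-reform gain, in standard deviation units


def simulate(cohort_exposed):
    """Return (plan, act) score pairs; only the ACT is taken after the reform."""
    rows = []
    for _ in range(N):
        ability = random.gauss(0, 1)
        plan = ability + random.gauss(0, 0.5)   # pre-reform test, never affected
        act = ability + random.gauss(0, 0.5)
        if cohort_exposed:
            act += TRUE_EFFECT                  # reform shifts the later test only
        rows.append((plan, act))
    return rows


baseline = simulate(cohort_exposed=False)
exposed = simulate(cohort_exposed=True)

# Without covariates, the cohort coefficient is a difference in group means.
plan_gap = mean(p for p, _ in exposed) - mean(p for p, _ in baseline)
act_gap = mean(a for _, a in exposed) - mean(a for _, a in baseline)

print(f"placebo (PLAN) gap: {plan_gap:+.3f}")
print(f"outcome (ACT) gap:  {act_gap:+.3f}")
```

A placebo gap close to zero alongside a clear outcome gap is the pattern that would support attributing the difference to the reform; a sizable placebo gap would instead suggest that the cohorts had diverged earlier.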
Descriptive statistics in Table 1 show that students from Cohorts 2 and 3 outperformed Cohort 1 students on ACT composite score by 0.18 and 0.25 points, respectively. These differences are equivalent to about 4% to 5% of a standard deviation (one standard deviation is 4.84 points). To put the magnitude of these differences into context, Lipsey and colleagues (2012) report that the annual achievement gain from Grade 10 to Grade 11 is around 0.15 standard deviations in nationally normed test scores. Therefore, the cross-cohort gains in ACT performance are roughly equivalent to three months of additional learning.
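The conversion behind these figures can be written out as a short calculation. The point gaps, the standard deviation, and the annual gain are taken from the text above; the nine-month school year used to translate the result into months is our assumption for illustration.

```python
ACT_SD_POINTS = 4.84         # one standard deviation of the ACT composite (from the text)
ANNUAL_GAIN_SD = 0.15        # Grade 10 to 11 annual gain, Lipsey et al. (2012), from the text
MONTHS_PER_SCHOOL_YEAR = 9   # assumption: a nine-month school year


def points_to_sd(points, sd=ACT_SD_POINTS):
    """Convert a raw ACT point gap into standard deviation units."""
    return points / sd


def sd_to_months(effect_sd, annual_gain=ANNUAL_GAIN_SD, months=MONTHS_PER_SCHOOL_YEAR):
    """Express an effect size as months of typical learning."""
    return effect_sd / annual_gain * months


for label, gap in [("Cohort 2", 0.18), ("Cohort 3", 0.25)]:
    effect = points_to_sd(gap)
    print(f"{label}: {effect:.3f} SD, about {sd_to_months(effect):.1f} months")
```

Under the nine-month assumption, the two gaps correspond to roughly two to three months of learning, in line with the approximation given above.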
Table 1. Student Performance and Background Characteristics, by Cohort
Note. ** denotes the statistic is significantly different from Cohort 1 at p < 0.05.
It is premature, however, to jump to strong conclusions, as the three cohorts of eighth-grade students also differ in other ways. First, students from the latter two cohorts appear to be more disadvantaged than Cohort 1 students, with higher percentages of students eligible for FRPL (53% and 56% vs. 48%) and slightly higher percentages of minority students (13% vs. 12%). On the other hand, compared with Cohort 1 students who took the ACT prior to the adoption of the CCSS, students in the second and third cohorts started high school with generally higher achievement levels. On eighth-grade math, for instance, students from the latter two cohorts scored 6% of a standard deviation higher than students from the first cohort. On both eighth-grade reading and writing, Cohort 3 students outperformed Cohort 1 students by an even larger margin of about 9% of a standard deviation. Although the eighth-grade performance gap between students in Cohort 2 and Cohort 1 is smaller on these subjects, those differences remain statistically significant.
Table 2 reports cross-cohort changes in student ACT performance for all students and for student subgroups categorized by individual and school poverty circumstances. Before we turn to the key findings, a few ancillary results are worth noting. First, eighth-grade performance in all subject areas is a strong predictor of students' ACT performance, with eighth-grade mathematics test scores being the strongest predictor. Second, students who are black, Hispanic, or with special education needs underperformed their peers on average (Table 2, column 1). However, ELL students performed better than their peers, mostly driven by the strong performance of ELL students who are not eligible for FRPL (Table 2, columns 3 and 5).
Results presented in Table 2 suggest that exposure to CCSS transition is associated with higher ACT composite scores (column 1). Specifically, compared to Cohort 1 students with similar starting academic proficiency and background characteristics, Cohort 2 students scored 3% of a standard deviation higher at the end of the first year of CCSS implementation. After two years of schooling under the CCSS, Cohort 3 students not only outscored Cohort 1 students by 4% of a standard deviation, but also significantly outperformed their Cohort 2 peers.
Table 2. Cross-Cohort Comparisons of ACT Composite Scores, by School Poverty and Student FRPL Eligibility
[Standard errors in parentheses]
Note. The reference cohort took the ACT in the 2010–2011 school year. The reference racial group is White.
*** p < 0.01. ** p < 0.05. * p < 0.1.
In columns 2 through 5 of Table 2, we explore whether there appears to be heterogeneity in the association between exposure to the CCSS transition and ACT performance across student- and school-poverty subgroups. There is some evidence of this heterogeneity. Among students in low-poverty schools, students in both Cohorts 2 and 3 outscored Cohort 1. In other words, all students in low-poverty schools, regardless of individual FRPL eligibility, improved their ACT performance after a single year of exposure to CCSS implementation. By comparison, among students in high-poverty schools, particularly those eligible for FRPL, only Cohort 3 students outperformed their Cohort 1 counterparts, suggesting that it took longer exposure to the CCSS for students in high-poverty schools to demonstrate significant progress in ACT performance.
These findings raise the concern that students in high-poverty schools may have lost ground to students in low-poverty schools in terms of performance growth between the eighth and the 11th grade. One possible reason, as discussed earlier, is that high-poverty schools are generally perceived as less prepared to provide teachers and students with the resources and support required by the standards transition. And opponents of the CCSS often cite the new standards as a potential distraction to ongoing efforts in narrowing the student performance gap between high- and low-poverty students (Rotberg, 2014). However, we cannot pinpoint when such divergence in growth began to emerge. That is, we are uncertain whether students in high-poverty schools started to fall behind their counterparts in low-poverty schools before or after the implementation of the CCSS.
CROSS-SUBJECT CROSS-COHORT ANALYSIS
Next we use the ACT subject area scores to estimate a difference-in-differences type model. These models use cross-cohort differences in student ACT performance on untargeted subjects (subjects that did not receive a curriculum framework overhaul) as the counterfactual, representing how cross-cohort patterns in ACT performance might have looked in the absence of curriculum alignment with the CCSS. If CCSS-aligned content standards are indeed superior to Kentucky's last-generation standards, as claimed by advocates of the CCSS (Carmichael, Martino, Porter-Magee, & Wilson, 2010), we should observe more pronounced cross-cohort improvement in ACT performance on targeted subjects that now have adopted CCSS-aligned curriculum frameworks. This hypothesis is supported by comparisons between Cohort 1 and Cohort 2 students (Table 3). We detect no statistically significant improvement in ACT performance on untargeted subjects (reading and science). The coefficient on "Untargeted subjects, Cohort 2012" is 0.00. By comparison, ACT performance on targeted subjects (math and English) improved after a single year of CCSS implementation, significantly outpacing the cross-cohort student-performance trajectory on untargeted subjects by 5% of a standard deviation (the coefficient on "Targeted subjects, Cohort 2012" is 0.05). Importantly, Cohort 2 students in both high- and low-poverty schools improved significantly on targeted subjects relative to untargeted subjects. The lack of progress in overall ACT performance from Cohort 1 to Cohort 2 in high-poverty schools reported in Table 2 seems to be due to the deteriorating (although statistically insignificant) performance on untargeted subjects, negating the gains students made on targeted subjects.
Table 3. Cross-Subject Cross-Cohort Comparisons of ACT Subject Scores, by School Poverty and Student FRPL Eligibility
[Robust standard errors clustered at the student level in parentheses]
Note. The reference cohort took the ACT in the 2010–2011 school year. The reference racial group is White. Targeted subjects include English and mathematics, for which the KCAS implemented new, CCSS-aligned curricula since 2011–2012. Comparison subjects include science and reading, whose curricula were carried over from the era of Program of Studies, the old state standards before KCAS.
*** p < 0.01. ** p < 0.05. * p < 0.1.
Cross-subject comparisons between Cohorts 1 and 3, however, demonstrated a different pattern. By the end of the second year of the CCSS implementation, Cohort 3 students outscored Cohort 1 students on both targeted and untargeted subjects. On untargeted subjects, student performance improved by 4% of a standard deviation. On targeted subjects, the improvement was smaller (by 2% of a standard deviation) but remained statistically significant (0.04 - 0.02 = 0.02 standard deviations). These patterns were consistently observed for students enrolled in both high- and low-poverty schools. One interpretation of the difference in Cohort 2 and Cohort 3 coefficients is that curriculum changes benefit not only the directly targeted subjects but also other subject areas, albeit in a more tangential way. As discussed earlier, the CCSS-aligned ELA framework is intended to help improve literacy skills required in other subject areas. This design feature implies that student performance on untargeted subjects is likely to benefit from ELA curriculum change, with a lag as improved literacy skills trickle down to these other subjects.
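The way the cohort and interaction coefficients combine can be made explicit with a short calculation. The values below are the ones quoted in the text for the full sample; the Cohort 3 interaction of -0.02 is inferred from the arithmetic reported above rather than read directly from Table 3, and the dictionary keys are our own labels.

```python
# Coefficients quoted in the text, in standard deviation units.
# "cohort" is the cross-cohort change on untargeted subjects;
# "interaction" is the additional change on targeted subjects.
reported = {
    "Cohort 2": {"cohort": 0.00, "interaction": 0.05},
    "Cohort 3": {"cohort": 0.04, "interaction": -0.02},  # interaction inferred from the text
}


def implied_changes(coefs):
    """Return (untargeted, targeted) changes relative to Cohort 1."""
    untargeted = coefs["cohort"]
    targeted = coefs["cohort"] + coefs["interaction"]
    return untargeted, targeted


summary = {name: implied_changes(coefs) for name, coefs in reported.items()}
for name, (untargeted, targeted) in summary.items():
    print(f"{name}: untargeted {untargeted:+.2f} SD, targeted {targeted:+.2f} SD")
```

Read this way, Cohort 2 gained only on targeted subjects, whereas Cohort 3 gained on both, with the larger gain on untargeted subjects.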
CROSS-COHORT DIFFERENCES: WHEN DID THE DIVERGENCE BEGIN?
Starting high school with similar test scores, students from Cohorts 2 and 3 made more progress in terms of academic proficiency than Cohort 1 students by the end of the 11th grade. However, it remains unclear when such cross-cohort divergence began. If students from the three cohorts had been on different performance trajectories prior to the CCSS despite having similar starting performance levels, our findings should not be completely attributed to CCSS implementation. To investigate this possibility, we compare students' 10th-grade PLAN composite scores across cohorts. All three cohorts took the 10th-grade PLAN before the implementation of the CCSS. Therefore, we should expect no cross-cohort differences in 10th-grade scores if CCSS transition was responsible for improved student learning. Indeed, we find no difference in 10th-grade performance between students in Cohorts 1 and 2 (Table 4), lending support to the interpretation that CCSS implementation likely led to improved ACT performance from Cohort 1 to Cohort 2. By comparison, Cohort 3 students outscored Cohort 1 students at the start of the 10th grade by 4% of a standard deviation. That is, there is strong evidence that Cohort 3 students started pulling ahead of comparable Cohort 1 students before the CCSS transition.11
Table 4. Cross-Cohort Comparisons of 10th-grade PLAN Composite Scores, by School Poverty and Student FRPL Eligibility
[Standard errors in parentheses]
Note. The reference cohort took the ACT in the 2010–2011 school year, and the PLAN in the 2009–2010 school year. The reference racial group is White.
*** p < 0.01. ** p < 0.05. * p < 0.1.
Our falsification test appears to have reached contradictory conclusions as to whether we should attribute cross-cohort improvement in ACT performance to CCSS implementation. What we have learned from this exercise is that, between the eighth grade and the start of the 10th grade, students in Cohorts 1 and 2 followed the same learning trajectory, whereas the learning trajectory was steeper for Cohort 3 students. It becomes clear that controlling for student academic proficiency at a single point in time is insufficient to account for important baseline cross-cohort differences. We therefore augment models (1) and (2) by controlling for 10th-grade PLAN scores in addition to the eighth-grade KCCT scores. The augmented models allow us to answer the question: Among students who started high school at similar levels and remained comparable in academic performance at the start of Grade 10, did those in later cohorts outperform those in the first cohort? The augmented models, however, may run the risk of over-controlling: It is possible that schools adjusted their instruction in earlier grades in anticipation that performance expectations in later grades would be different after the standards reform. If that were the case, 10th-grade scores of later cohorts could reflect changes induced by the standards reform; therefore, controlling for those scores would remove part of the transitional impact of the CCSS on student performance.
Table 5 presents estimates based on the augmented models. For both models (1) and (2), adding the PLAN score explains an additional 13% to 18% of the total variation in student ACT scores. Focusing on ACT composite scores, estimates in the top panel of Table 5 show that students from both Cohorts 2 and 3 still significantly outperformed Cohort 1 students. Cohort 2 students scored 2% of a standard deviation higher on average. Interestingly, after controlling for the PLAN score, Cohort 2 students from both high- and low-poverty schools improved their ACT performance relative to their counterparts in Cohort 1, alleviating the concern that recent changes in the school system triggered by the CCSS may have disproportionate adverse effects on students in high-poverty schools.
Table 5. Cross-Subject and Cross-Cohort Comparisons of ACT Scores While Controlling for PLAN, by School Poverty and Student FRPL Eligibility
Note. Standard errors in parentheses in the top panel, and standard errors clustered at the student level in parentheses for the bottom panel. The reference cohort took the ACT in the 2010–2011 school year, and the PLAN in the 2009–2010 school year. The reference racial group is White. Targeted subjects include English and mathematics, for which the KCAS implemented new, CCSS-aligned curricula since 2011–2012. Comparison subjects include science and reading, whose curricula were carried over from the era of Program of Studies, the old state standards before KCAS. Regressions control for student PLAN scores in addition to KCCT scores and the same list of student background characteristics as in earlier tables.
*** p < 0.01. ** p < 0.05. * p < 0.1.
Although Table 2 reports that Cohort 3 students experienced larger cumulative gains between Grades 8 and 11 relative to Cohort 2 students when both are compared to Cohort 1 students, most of the gains accrued to Cohort 3 students had been achieved before the CCSS transition, by the time they started Grade 10. Consequently, once the PLAN score is controlled for, Cohort 3 students outscored Cohort 1 students on the ACT by just 1% of a standard deviation on average. The difference nevertheless remained statistically significant. The results in the top panel of Table 5 indicate that exposure to CCSS transition was correlated with improved college readiness, but a higher dosage of exposure was not necessarily associated with continual improvement in student readiness.
Comparing results reported in Tables 4 and 5, it appears that Cohort 2 students made significant progress in Grades 10 and 11 (from 2010–2011 to 2011–2012), whereas Cohort 3 students made most of the gains in the ninth grade (2010–2011) and continued to improve (at a slower rate) in Grades 10 and 11 (from 2011–2012 to 2012–2013). Although Cohorts 2 and 3 differ in the grades in which progress was observed, both cohorts improved relative to the first cohort during the same time period (that is, in the year immediately before the CCSS implementation and the years after).
The bottom panel in Table 5 reports cross-subject differences in cross-cohort gains in ACT performance after taking into account 10th-grade PLAN subject scores. Similar to results reported in Table 3, by the end of the first year of CCSS transition, there was no statistically significant difference in ACT performance on untargeted subjects between Cohort 1 and Cohort 2 students. On the other hand, ACT scores on targeted subjects improved significantly (0.02 standard deviations) during the same period. Two years into the standards transition, however, ACT performance on both targeted and untargeted subjects improved (and by the same magnitude, since the coefficient on "Targeted subjects, Cohort 2013" is zero). These patterns were largely consistent across student subgroups regardless of school poverty context. The findings appear to confirm that the new math and ELA curriculum framework did make a difference, and that the reformed ELA curriculum might indeed have benefitted non-ELA subjects with some delay.
With education policies increasingly focused on college- and career-ready standards, important changes are being introduced at the state and local levels. The CCSS is one prominent recent example. With the passage of the Every Student Succeeds Act (ESSA) in December 2015, states are likely to continue to engage in an active agenda of education policy experimentation. As we seek to improve the education of our children through reforms and innovations, policymakers should be mindful of the potential risks of excessive changes, which are a source of confusion and frustration among teachers and undermine teachers' commitment to educational reforms. Indeed, the stability of education policy is one of the key determinants of policy success under Porter's policy attributes framework (Porter, 1994).
Our study was motivated by concerns that changes triggered by the transition to CCSS might be disruptive to student learning in the short run, even when those policy changes may benefit students once they are fully implemented. The goal of the study is not only to provide a first look at how student college readiness progressed in the early years of the CCSS implementation in Kentucky, but also to encourage researchers and policymakers to pay more attention to the transitional impact of educational reforms in general. This is a highly pertinent issue, because with the passing of the ESSA, states are preparing to make further changes to systems that were implemented merely five years ago under the CCSS. Kentucky, for example, began the process of drafting a new accountability system in August 2017.12
We hypothesize that implementation of standards-based education reforms may have two diverging effects on student performance. First, implementation may be disruptive to student learning, regardless of how well designed the standards are. The disruptive effect of reform implementation may be more pronounced among students who are more disadvantaged and schools that are more resource-constrained. In this case, as exposure increases, student performance may eventually improve, but only after an initial decrease in test scores. In contrast, the benefits of standards-based reforms may outweigh the negative influence of implementation. In this case, as exposure increases, student performance will improve without any initial disruptions to the upward trend. It is reassuring that, in the case of CCSS transition in Kentucky, our findings support the second hypothesis, and students continued to improve their college readiness, as measured by ACT scores, during the early stages of CCSS implementation. Furthermore, evidence suggests that the positive gains students made during this period accrue to students in both high- and low-poverty schools. In other words, the net effect of CCSS transition appears to be positive for all students.
However, it is not conclusive that the progress made in student college readiness is necessarily attributable to the new content standards. On one hand, we find that students made more progress on subjects directly targeted by CCSS-aligned curriculum than on untargeted subjects after the first year of CCSS implementation, suggesting that student performance may have benefitted from the reformed content standards. Similarly, the fact that student performance on untargeted subjects caught up with student performance on targeted subjects by the end of Year 2 supports the claim that the new CCSS-aligned ELA curriculum will eventually benefit non-ELA subject areas. On the other hand, there is evidence that students made significant progress toward college readiness both in the year immediately before and during the early years of the CCSS implementation, raising questions about the degree to which curriculum changes were directly responsible for observed performance improvement.
While it is unclear what might have changed in the year immediately preceding the launch of the CCSS or whether those changes were CCSS-induced, one speculation is that in anticipation of the upcoming standards reform, some schools and districts might have started the preparation for transition to the CCSS before its official launch. There are anecdotal references to implementation activities starting in 2010, right after Kentucky adopted the CCSS in February 2010 but before the CCSS implementation. A number of other states also reported teaching CCSS-aligned curricula in English and math as early as 2010–2011 (Rentner, 2013). However, those activities were unlikely to generate benefits to student learning both fast enough and widespread enough to be reflected in statewide average test scores.
All in all, Kentucky's CCSS transition does not seem to have adversely affected high school students' academic performance, regardless of their individual or school poverty status. However, there is potentially important variation among states in approaches to standards-based reforms, so Kentucky's experience with the CCSS transition does not necessarily apply to other states. Future research should strive to collect more empirical evidence on the transitional impact of standards-based educational reforms on (a) student performance from other states; (b) students in younger age groups, whose learning experiences may differ from those of high school students; (c) additional student subgroups characterized by their race/ethnicity, English proficiency, and special education needs; and (d) students' outcomes later in life, such as college attendance and completion.
We acknowledge support from the Bill & Melinda Gates Foundation for this study. We thank the Kentucky Department of Education for providing us with the required data. This research has benefitted from the helpful comments of two anonymous reviewers and inputs from Mike Garet, Dan Goldhaber, Angela Minnici, Toni Smith, and Fannie Tseng. Tiffany Chu provided excellent research assistance. Any and all errors are solely the responsibility of the study's authors, and the views expressed are those of the authors and should not be attributed to their institutions, the study's funder, or the agencies supplying data.
1. http://www.corestandards.org/about-the-standards/frequently-asked-questions. Accessed October 29, 2014.
2. See, for instance, discussions in Education Week (2014), Hess and McShane (2014), Marchitello (2014), and Rotberg (2014).
3. Authors' calculation based on three cohorts of projected 12th-grade public school enrollment from Hussar and Bailey (2014).
4. In Kentucky, career-ready standards are separate from college-ready standards, and they are measured using additional criteria such as industry certificates, Kentucky Occupational Skills Standards Assessment (KOSSA), Armed Services Vocational Aptitude Battery (ASVAB), and ACT WorkKeys.
5. See http://education.ky.gov/curriculum/docs/Documents/KCAS%20-%20June%202013.pdf for more details about KCAS.
6. See http://www.kentuckyteacher.org/wp-content/uploads/2012/04/Field-Test-Guide-2-2-12.pdf for more details about the new teacher evaluation system.
7. More details can be found at http://education.ky.gov/comm/ul/Pages/default.aspx.
8. See http://www.corestandards.org/assets/Criteria.pdf. Accessed on March 7, 2018.
9. Another potentially important control variable is high school tracking. Many studies on tracking demonstrate that high school tracks are associated with student test score gains. In our case, whether or not a student follows an academic track may predict his or her ACT performance. Unfortunately, we do not have detailed course-taking information to infer student tracks. However, the omission of high school tracks as a control variable will not have a large impact on our estimates of cohort coefficients unless the proportion of students following various high school tracks has changed significantly across cohorts.
11. We also re-estimated the cross-subject, cross-cohort model presented in Table 3 by replacing ACT subject scores with corresponding PLAN subject scores. Findings are similar to what is reported here for PLAN composite scores: We found no diverging performance trajectories between Cohort 1 and 2 on any subjects by Grade 10. However, Cohort 3 significantly outperformed Cohort 1 on the PLAN on both untargeted and targeted subjects, raising questions about the extent to which ACT performance gains achieved by Cohort 3 on all subjects can be attributed to changes in curriculum frameworks.
12. For more information, visit http://education.ky.gov/comm/Pages/Every-Student-Succeeds-Act-%28ESSA%29.aspx.
Allen, J. (2013). ACT Research Report Series. Updating the ACT College Readiness Benchmarks (ACT
Research Report Series, 2013-6). Washington, DC: American College Testing (ACT), Inc.
Allen, J., & Sconing, J. (2005). Using ACT Assessment scores to set benchmarks for college readiness
(ACT Research Report Series, 2005-3). Washington, DC: American College Testing (ACT), Inc.
American College Testing (ACT). (2008). What kind of interpretations can be made on the basis of ACT
scores? Washington, DC: Author.
American College Testing (ACT). (2010). The alignment of Common Core and ACT's College and Career
Readiness System. Washington, DC: Author.
Beach, R. W. (2011). Issues in analyzing alignment of language arts Common Core Standards with state
standards. Educational Researcher, 40(4), 179–182.
Bettinger, E. P., Evans, B. J., & Pope, D. G. (2011). Improving college performance and retention the
easy way: Unpacking the ACT exam (No. w17119). New York, NY: The National Bureau of Economic
Bettinger, E. P., & Long, B. T. (2009). Addressing the needs of under-prepared students in higher
education: Does college remediation work? The Journal of Human Resources, 44(3), 736–771.
Booher-Jennings, J. (2005). Below the bubble: Educational triage and the Texas accountability system.
American Educational Research Journal, 42(2), 231–268.
Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2003). Comprehensive school reform and
achievement: A meta-analysis. Review of Educational Research, 73(2), 125–230.
Borman, G. D., Slavin, R. E., Cheung, A. C., Chamberlain, A. M., Madden, N. A., & Chambers, B. (2007).
Final reading outcomes of the national randomized field trial of Success for All. American Educational
Research Journal, 44, 701–731.
Brown, A. B., & Clift, J. W. (2010). The unequal effect of adequate yearly progress: Evidence from school
visits. American Educational Research Journal, 47(4), 774–798.
Carmichael, S. B., Martino, G., Porter-Magee, K., & Wilson, W. S. (2010). The state of state standards
and the Common Core in 2010. Washington, DC: Thomas B. Fordham Institute.
Carnoy, M., & Loeb, S. (2002). Does external accountability affect student outcomes? A cross-state
analysis. Educational Evaluation and Policy Analysis, 24, 305–331.
Clark, M., Rothstein, J., & Schanzenbach, D. W. (2009). Selection bias in college admissions test scores.
Economics of Education Review, 28, 295–307.
Clarke, M., Shore, A., Rhoades, K., Abrams, L., Miao, J., & Li, J. (2003). Perceived effects of state
mandated testing programs on teaching and learning: Findings from interviews with educators in low-,
medium- and high-stakes states. Boston, MA: National Board on Educational Testing and Public Policy.
Clune, W. H. (1993). Systemic educational policy: A conceptual framework. In S. H. Fuhrman (Ed.),
Designing coherent educational policy (pp. 125–140). San Francisco, CA: Jossey-Bass.
Cobb, P., & Jackson, K. (2011). Assessing the quality of the Common Core State Standards for
Mathematics. Educational Researcher, 40(4), 183–185.
Coleman, J. S. (1966). Equality of educational opportunity [summary report] (Vol. 2). Washington, DC:
U.S. Department of Health, Education, and Welfare, Office of Education.
Conley, D. T. (2007). Redefining college readiness. Eugene, OR: Educational Policy Improvement Center.
Conley, D. T. (2010). College and career ready: Helping all students succeed beyond high school.
Hoboken, NJ: John Wiley & Sons.
Cullen, J. B., & Reback, R. (2006). Tinkering toward accolades: School gaming under a performance
accountability system (NBER Working Papers Series No. 12286). Cambridge, MA: National Bureau of Economic Research.
Dee, T., & Jacob, B. (2011). The impact of No Child Left Behind on student achievement. Journal of
Policy Analysis and Management, 30, 418–446.
Dee, T., & Wyckoff, J. (2013). Incentives, selection, and teacher performance: Evidence from IMPACT
(NBER Working Paper No. 19529). New York, NY: The National Bureau of Economic Research.
Desimone, L. (2002). How can comprehensive school reform models be successfully implemented?
Review of Educational Research, 72(3), 433–479.
Education Week. (2014). From adoption to practice: Teacher perspectives on the Common Core.
Retrieved from http://www.edweek.org/media/ewrc_teacherscommoncore_2014.pdf on January 14, 2015.
Figlio, D. N. (2006). Testing, crime and punishment. Journal of Public Economics, 90(4–5), 837–851.
Finn, C. E., Petrilli, M. J., & Julian, L. (2006). The state of state standards 2006. Washington, DC:
Thomas B. Fordham Foundation.
Finnan, L. A. (2014). Common Core and other state standards: Superintendents feel optimism, concern
and lack of support. Alexandria, VA: The School Superintendents Association.
Foorman, B. R., Kalinowski, S. J., & Sexton, W. L. (2007). Standards-based educational reform is one
important step toward reducing the achievement gap. In A. Gamoran (Ed.), Standards-based reform and
the poverty gap: Lessons for No Child Left Behind. Washington, DC: Brookings Institution Press.
Fullan, M. (2001). The future of educational change. New Meaning of Educational Change, 3, 267–272.
Gamoran, A. (2013). Educational inequality in the wake of No Child Left Behind. Washington, DC:
Association for Public Policy and Management. Retrieved from
Gewertz, C. (2016, March 23). State testing: An interactive breakdown of 2015-16 plans. Education
Week. Retrieved from http://www.edweek.org/ew/section/multimedia/state-testing-an-interactive
Goodman, S. (2016). Learning from the test: Raising selective college enrollment by providing information. Review of Economics and Statistics, 98(4), 671–684.
Hamilton, L. S., Stecher, B. M., Marsh, J. A., McCombs, J. S., & Robyn, A. (2007). Standards-based
accountability under No Child Left Behind: Experiences of teachers and administrators in three states
(MG-589-NSF). Santa Monica, CA: RAND Corporation.
Hanushek, E. A., & Raymond, M. E. (2005). Does school accountability lead to improved student
performance? Journal of Policy Analysis and Management, 24, 297–327.
Hess, F. M., & McShane, M. Q. (2014). Flying under the radar? Analyzing Common Core media
coverage. Washington, DC: American Enterprise Institute. Retrieved from
January 14, 2015.
Hill, H. C. (2001). Policy is not enough: Language and the interpretation of state standards. American
Educational Research Journal, 38(2), 289–318.
Howell, J. S. (2011). What influences students' need for remediation in college? Evidence from California.
The Journal of Higher Education, 82(3), 292–318.
Hussar, W. J., & Bailey, T. M. (2014). Projections of education statistics to 2022 (NCES 2014-051).
Washington, DC: U.S. Department of Education, National Center for Education Statistics.
Jacob, B. A. (2007). Test-based accountability and student achievement: An investigation of differential
performance on NAEP and state assessments (NBER Working Paper No. 12817). New York, NY: The
National Bureau of Economic Research.
Jacob, B., & Levitt, S. D. (2003). Rotten apples: An investigation of the prevalence and predictors of
teacher cheating. The Quarterly Journal of Economics, 118(3), 843–877.
Kane, T. J., & Staiger, D. O. (2002). The promise and pitfalls of using imprecise school accountability
measures. The Journal of Economic Perspectives, 16(4), 91–114.
Kentucky Department of Education and Kentucky Council on Postsecondary Education. (2010). Unified
strategy for college and career readiness: Senate Bill 1 (2009). Retrieved from
Kober, N., & Rentner, D. S. (2011). Common Core State Standards: Progress and challenges in school
districts' implementation. Washington, DC: Center on Education Policy.
Ladd, H. F. (2007). Holding schools accountable revisited. Presented at APPAM Fall Research Conference: Spencer Foundation Lecture in Education Policy and Management, Washington DC, November 2007. Association for Public Policy Analysis and Management.
Lipsey, M. W., Puzio, K., Yun, C., Hebert, M. A., Steinka-Fry, K., Cole, M. W., . . . Busick, M. D. (2012).
Translating the statistical representation of the effects of education interventions into more readily
interpretable forms (NCSER 2013-3000). Washington, DC: U.S. Department of Education, Institute of
Education Sciences, National Center for Special Education Research. Retrieved from
www.ies.ed.gov/ncser/pubs/20133000/pdf/20133000.pdf on January 26, 2015.
Logan, J. R., Minca, E., & Adar, S. (2012). The geography of inequality: Why separate means unequal in
American public schools. Sociology of Education, 85(3), 287–301.
Loveless, T. (2016). The 2016 Brown Center Report on American Education: How well are American students learning? (Vol. 3, No. 5). Washington, DC: The Brown Center on Education Policy, The Brookings Institution. Retrieved from
Marchitello, M. (2014, September 26). Politics threaten efforts to improve K-12 education. Center for
American Progress. Retrieved from
improve-k-12-education/ on January 14, 2015.
Martorell, P., & McFarlin, I., Jr. (2011). Help or hindrance? The effects of college remediation on
academic and labor market outcomes. The Review of Economics and Statistics, 93(2), 436–454.
Maruyama, G. (2012). Assessing college readiness: Should we be satisfied with ACT or other threshold
scores? Educational Researcher, 41, 252–261.
National Governors Association Center for Best Practices & Council of Chief State School Officers
(NGA/CCSSO). (2010). Common Core State Standards. Retrieved from
No Child Left Behind (NCLB) Act of 2001, Pub. L. No. 107-110, 115 Stat. 1425 (2002).
Özek, U. (2012). One day too late: Mobile students in an era of accountability (Working Paper No. 82).
Washington, DC: National Center for Analysis of Longitudinal Data in Education Research.
Polikoff, M. S., Porter, A. C., & Smithson, J. (2011). How well aligned are state assessments of student achievement with state content standards? American Educational Research Journal, 48(4), 965–995.
Porter, A. C. (1994). National standards and school improvement in the 1990s: Issues and promise.
American Journal of Education, 102(4), 421–449.
Porter, A., McMaken, J., Hwang, J., & Yang, R. (2011). Common Core Standards: The new U.S. intended
curriculum. Educational Researcher, 40, 103–116.
Porter, A. C., & Polikoff, M. S. (2011). Measuring academic readiness for college. Educational Policy,
Reardon, S. F. (2011). The widening academic achievement gap between the rich and the poor: New
evidence and possible explanations. In G. J. Duncan & R. J. Murnane (Eds.), Whither opportunity?:
Rising inequality, schools, and children's life chances (pp. 91–116). New York, NY: Russell Sage Foundation.
Regional Equity Assistance Centers. (2013). How the Common Core must ensure equity by fully
preparing every student for postsecondary success: Recommendations from the Regional Equity
Assistance Centers on implementation of the Common Core State Standards. San Francisco, CA:
Rentner, D. S. (2013). Year 3 of implementing the Common Core State Standards: An overview of states'
progress and challenges. Washington, DC: Georgetown University, Center on Education Policy.
Roderick, M., Nagaoka, J., & Coca, V. (2009). College readiness for all: The challenge for urban high
schools. Future of Children, 19, 185–210.
Ross, S. M., Henry, D., Phillipsen, L., Evans, K., Smith, L., & Buggey, T. (1997). Matching restructuring
programs to schools: Selection, negotiation, and preparation. School Effectiveness and School
Improvement, 8(1), 45–71.
Rotberg, I. C. (2014, October 16). The endless search for silver bullets. Teachers College Record.
Retrieved from http://www.tcrecord.org/Content.asp?ContentId=17723 on October 25, 2014.
Rouse, C. E., Hannaway, J., Goldhaber, D., & Figlio, D. (2013). Feeling the Florida heat? How low
performing schools respond to voucher and accountability pressure. American Economic Journal:
Economic Policy, 5, 251–281.
Sass, T., Apperson, J., & Bueno, C. (2015). The long-run effects of teacher cheating on student
outcomes. Atlanta, GA: Atlanta Public Schools. Retrieved from
Schmidt, W. H., & Houang, R. T. (2012). Curricular coherence and the Common Core State Standards for
Mathematics. Educational Researcher, 41(8), 294–308. http://doi.org/10.3102/0013189X12464517
Scott-Clayton, J., Crosta, P. M., & Belfield, C. R. (2014). Improving the targeting of treatment: Evidence
from college remediation. Educational Evaluation and Policy Analysis, 36(3), 371–393.
Smith, M. S., & O'Day, J. (1991). Systemic school reform. In S. H. Fuhrman & B. Malen (Eds.), The
politics of curriculum and testing: The 1990 yearbook of the Politics of Education Association
(pp. 233–267). London, UK: The Falmer Press.
Stecher, B. M., Epstein, S., Hamilton, L. S., Marsh, J. A., Robyn, A., McCombs, J. S., . . . Naftel, S. (2008). Pain
and gain: Implementing No Child Left Behind in three states, 2004-2006. Santa Monica, CA: RAND Corporation.
Steelman, L. C., & Powell, B. (1996). Bewitched, bothered, and bewildering: The use and misuse of state
SAT and ACT scores. Harvard Educational Review, 66, 27–59.
Swanson, C. B., & Stevenson, D. L. (2002). Standards-based reform in practice: Evidence on state policy
and classroom instruction from the NAEP state assessments. Educational Evaluation and Policy Analysis,
Taylor, J., Stecher, B., O'Day, J., Naftel, S., & LeFloch, K. C. (2010). State and local implementation of
the No Child Left Behind Act, Vol. IX – Accountability under NCLB: Final report. Washington, DC: U.S.
Department of Education.
Winters, K., Williams, D., McGaha, V., Stine, K., Thayer, D., & Westwood, J. Senate Bill 1, 1 SB (2009).
Wong, K. K., Anagnostopoulos, D., Rutledge, S., & Edwards, C. (2003). The challenge of improving
instruction in urban high schools: Case studies of the implementation of the Chicago academic standards.
Peabody Journal of Education, 78(3), 39–87.
Zwick, R. (2004). Rethinking the SAT: The future of standardized testing in university admissions. New York, NY: Routledge.