“Truths” Devoid of Empirical Proof: Underlying Assumptions Surrounding Value-Added Models in Teacher Evaluation


by Jessica Holloway-Libell & Audrey Amrein-Beardsley - June 29, 2015

Despite the overwhelming and research-based concerns regarding value-added models (VAMs), VAM advocates, policymakers, and supporters continue to hold strong to VAMs’ purported, yet still largely theoretical, strengths and potentials. Those advancing VAMs have, more or less, adopted and promoted an agreed-upon, albeit “heroic,” set of assumptions, without independent, peer-reviewed research in support. These “heroic” assumptions transcend promotional, policy, media, and research-based pieces, but they have never been fully investigated, explicated, or made explicit as a set or whole. These assumptions, though often violated, are routinely ignored in order to promote VAM adoption and use, and also to sell for-profits’ and sometimes non-profits’ VAM-based systems to states and districts. The purpose of this study was to make obvious the assumptions that have been made within the VAM narrative and that, accordingly, have often been accepted without challenge. Ultimately, sources for this study included 470 distinctly different written pieces, from both traditional and non-traditional sources. The results of this analysis suggest that the preponderance of sources propagating unfounded assertions are fostering a sort of VAM echo chamber that seems impenetrable by even the most rigorous and trustworthy empirical evidence.

Value-added models (VAMs) are statistical instruments intended to objectively measure the amount of “value” that a teacher “adds” to (or detracts from) student learning and achievement from one school year to the next. Stemming largely from the field of economics (Hanushek, 1971, 1979, 2009, 2011), VAMs have been the source of both academic and public controversy, often causing rifts between teachers and public officials (e.g., the Chicago Teacher Strike of 2012).
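To fix ideas, a generic covariate-adjustment VAM can be sketched as a regression of the following form. This is our own composite of specifications common across the economics literature cited here, not the exact formula of any particular operational model, and the notation is ours:

$$A_{it} = \lambda A_{i,t-1} + X_{it}\beta + \tau_{j(i,t)} + \varepsilon_{it}$$

where $A_{it}$ is student $i$’s test score in year $t$, $X_{it}$ is a vector of student (and sometimes classroom or school) covariates, $\tau_{j(i,t)}$ is the “value added” attributed to the teacher $j$ who taught student $i$ in year $t$, and $\varepsilon_{it}$ is an error term. The assumptions unpacked in this commentary concern what must hold for $\tau_j$ to be interpreted as a causal teacher effect.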


There is also a growing divide between academic scholars who have taken to either side of the VAM debate, with economists often on one side (although not always; see, for example, Rothstein, 2009, 2014) promoting the still largely purported strengths of these models (see, for example, Chetty, Friedman, & Rockoff, 2011, 2014a, 2014b, 2014c; Ehlert, Koedel, Parsons, & Podgursky, 2012; Gordon, Kane, & Staiger, 2006; Kane & Staiger, 2008, 2012; Hanushek, 1971, 2009, 2011), and educational researchers often on the other side, representing the VAM critics (see, for example, Baker, Barton, Darling-Hammond, Haertel, Ladd, Linn, . . . & Shepard, 2010; Corcoran, 2010; Gabriel & Lester, 2013; Graue, Delaney, & Karch, 2013; Hill, Kapitula, & Umland, 2011; Newton, Darling-Hammond, Haertel, & Thomas, 2010; Papay, 2010; Rothstein, 2009, 2010).


Given the continuing momentum of VAM adoption and use, it is reasonable to posit that economists have taken the lead in influencing educational policy in this area, which is not surprising in light of their increasing influence in the public policy arena at large (Fourcade, Ollion, & Algan, 2014; Lazear, 1999).


THE ROLE OF ECONOMICS IN EDUCATIONAL POLICY


Economists work from a different set of theoretical and epistemological assumptions than those of other social scientists (Fourcade et al., 2014). Economists, for example, tend to work from the presupposition that “the properties of 'collectivities'—groups, institutions, societies—can be reduced to statements about the properties of individuals” (Ingham, 1996, pp. 245–246). In contrast, sociologists (and educational researchers) work from the presupposition that individuals (human capital) are part of larger social structures (social capital) that influence the make-up of those individuals (Ingham, 1996). Accordingly, the methodological approaches to understanding human behavior and society differ considerably by discipline. Practitioners of other social science disciplines, such as psychologists, anthropologists, and political scientists, likewise ground their work in entirely different sets of epistemological assumptions.


But here, and in particular, VAMs—the products of economists—are built on assumptions that, while appropriate for an economic analytical framework, create a set of problems for other disciplinary approaches that work from different sets of assumptions. Hence, our purpose in writing this commentary was to explore and (hopefully) help others better understand the core of the VAM debate by (a) identifying some of the key assumptions, or conditions, that have not been explicitly addressed by economics-based value-added calculations, but have nonetheless been implicitly assumed as “true” conditions upon which VAMs and VAM-use can be justified and rationalized; and (b) mapping these assumptions onto the greater VAM literature to interpret the feasibility of such assumptions being “true” given a multi-disciplinary lens.


For the purposes of this paper, we define “assumptions” broadly to include room for all social science disciplines’ definitions, for we find such inclusion not only important but critical for understanding the complexities involved in measuring teacher quality, as well as teacher impacts on growth in student achievement and learning over time.


In addition, we worked from the position that economists have led the work. They have also informed and defined the narratives surrounding VAMs, especially as these narratives relate to the asserted need for VAMs (Chetty et al., 2011, 2014a, 2014b, 2014c; Gordon et al., 2006; Kane & Staiger, 2008, 2012; Rockoff, Staiger, Kane, & Taylor, 2010; Weisberg, Sexton, Mulhern, & Keeling, 2009), the development of VAMs (Hanushek, 1971, 2009, 2011), and the use of VAMs for teacher evaluation and accountability purposes (Chetty et al., 2011, 2014a, 2014b, 2014c; Harris & Weingarten, 2011; Sanders, 2003; Sanders, Saxton, & Horn, 1997).


Accordingly, VAMs and VAM-based policies and practices have been built upon a set of assumptions that are appropriate to the discipline of economics. However, given the complexities inherent in education systems, it might serve our understandings well to apply multidisciplinary approaches to think about not only the capabilities of VAMs, but also the assertions regarding the need for VAMs, the practical application of VAMs, and the potential consequences related to VAM-use. We argue that by looking at the VAM literature in terms of disciplinary assumptions, we might better understand the nature of the debate surrounding VAMs, all the while explicitly unpacking the problems that can arise from depending almost entirely on one disciplinary approach to a problem. To this end, we conducted a content analysis of the VAM literature, defined in both traditional (e.g., peer-reviewed publications) and non-traditional (e.g., news articles) terms, treating each piece as an artifact.


Our goal was to locate and identify the implicit and explicit assumptions that were made across pieces, while situating these assumptions within a multidisciplinary framework. In other words, we did not limit our definition of “assumption” to the strict, statistical sense of the word. Rather, we took the term “assumption” as broadly conceived, to encompass that which is accepted as “truth,” devoid of empirical proof. Sources for this study included 470 distinctly different pieces1 that we read and analyzed, again while deconstructing the assumptions common across sources. In order by volume, the sources that we read and analyzed included: peer-reviewed research studies (29%, n=138); articles published in media outlets (21%, n=98); organization, foundation, business, and think tank research studies (20%, n=95); editor-, self-, or author-reviewed studies (14%, n=66); federal, local, and other promotional materials (10%, n=46); and blog posts (6%, n=27) (see Figure 1).


Figure 1. Types of VAM-based articles read and analyzed for this study.


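Because the original chart is not reproduced here, the same breakdown can be recovered from the counts reported above. A minimal sketch follows; the counts are those reported in the text, and the category labels are abbreviated:

```python
# Counts of the 470 sources analyzed, as reported in the text.
counts = {
    "peer-reviewed research studies": 138,
    "media articles": 98,
    "organization/foundation/think tank studies": 95,
    "editor-, self-, or author-reviewed studies": 66,
    "federal, local, and other promotional materials": 46,
    "blog posts": 27,
}

total = sum(counts.values())
assert total == 470  # matches the N reported in the text

# Print each category with its share of the total, largest first.
for source, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{source}: n={n} ({n / total:.0%})")
```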



We carefully read through each of the sources and, using a constant comparison analytic method (Leech & Onwuegbuzie, 2008), noted when authors made or advanced assumptions, or implicitly accepted potentially necessary pre-conditions as “true,” without acknowledgment, research, or references in support. In other words, we attempted to make sense of the way in which VAMs have been politically and socially accepted, despite the academic contention, by tracking the narrative at the base of the discrepancies: the assumptions or conditions upon which VAMs are possible, necessary, and successful at measuring teacher quality as conceptualized.


After collapsing the 1,226 initial codes into a set of 33 major assumptions, and eventually into four major themes, we mapped these assumptions onto the greater VAM literature to determine the feasibility, practicality, and appropriateness of VAM-use given a multidisciplinary lens, comparing each assumption against the research literature to determine whether it could be supported.
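The shape of that reduction, though not its content, can be sketched as a pair of nested mappings. All labels below are hypothetical stand-ins for illustration only, not entries from our actual codebook:

```python
# Hypothetical stand-ins for the 1,226 initial codes, which were collapsed
# into 33 major assumptions and then grouped under 4 major themes.
code_to_assumption = {
    "teachers_matter_most": "Teachers are the most important factor",
    "teacher_quality_crisis": "Teachers are the most important factor",
    "fire_the_bottom": "Firing ineffective teachers raises scores",
    "tests_measure_learning": "Tests are precise measures of learning",
}

assumption_to_theme = {
    "Teachers are the most important factor": "rationales for VAM adoption",
    "Firing ineffective teachers raises scores": "justifications for implementation",
    "Tests are precise measures of learning": "assumptions about the tests",
}

# Trace each initial code up through its assumption to its theme.
for code, assumption in code_to_assumption.items():
    print(f"{code} -> {assumption} -> {assumption_to_theme[assumption]}")
```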


FINDINGS


We present each assumption as situated within the greater multi-disciplinary VAM literature. We do this to consider VAMs and VAM-use as part of complex schooling systems that can best be understood via multiple approaches instead of a single, economics-based approach. Of particular focus are the 33 assumptions: (a) assumptions used as rationales to justify VAM adoption (see Figure 2), (b) assumptions used as justifications to further advance VAM implementation (see Figure 3), (c) major statistical and methodological assumptions about VAMs (see Figure 4), and (d) assumptions specifically made about the large-scale standardized tests used for value-added calculations (see Figure 5). We call on readers to consider and deliberate these assumptions, as illustrated in the figures.


Figure 2. Assumptions used as rationales to justify VAM adoption.

Assumption: Teachers are the most important factors that impact student learning and achievement.
Multidisciplinary consensus: Teachers are strong school-level factors that influence student learning and achievement, but teachers operate alongside many other school-level factors that also impact student learning and achievement, and these other factors largely conflate determinations about the authentic effects of teachers. Furthermore, all school-level factors put together matter much less than many, again, assume.

Assumption: Good teaching comes from enduring qualities that teachers possess and carry with them from one year to the next, regardless of context.
Multidisciplinary consensus: The teacher effect (i.e., 10–20% of the variance in test scores) is not strong enough to supersede the powers of the aforementioned student-level and out-of-school influences and effects (i.e., 80–90% of the variance in test scores) from one year to the next (see the simulation sketch following this figure).

Assumption: Too many of America’s public school teachers are unqualified, unskilled, lazy, or uninspired, and they are the ones who are hindering educational progress.
Multidisciplinary consensus: Not nearly as many teachers as is often assumed are in fact ineffectual, particularly when “teacher effectiveness” is quantified in normative terms (i.e., below-average effectiveness via comparisons to the mean). Thus, by statistical design, there will always be some teachers who appear relatively less effective simply because they fall on the wrong side of the bell curve.

Assumption: If enough ineffective teachers are fired, test scores will increase.
Multidisciplinary consensus: We have no empirical evidence to suggest that such purgative policies work. As well, ineffective teachers often exit voluntarily anyway, self-selecting themselves out of teaching. Lastly, there is no evidence to support the idea that these teachers, if purged, would be replaced by more effective teachers.

Assumption: More than 1% of America’s public school teachers are ineffective.
Multidisciplinary consensus: We really have no idea how many teachers are (or are not) effective because how one defines teacher effectiveness varies widely.

Assumption: VAMs will improve upon, if not solve, that which is wrong with the faulty evaluation systems traditionally used in education.
Multidisciplinary consensus: There is no strong research evidence to support the accurate identification of teachers using VAMs. In addition, using VAMs in isolation or as the dominant indicator in evaluative systems based on multiple measures is not yet warranted and should not be done or encouraged.

Assumption: If ever our teacher evaluation systems were made stronger (e.g., via the use of VAMs), the barriers preventing teacher termination would still be too substantial and burdensome to permit the terminations assumed necessary.
Multidisciplinary consensus: Even if all barriers and blockades preventing ease in teacher dismissal were removed, whether administrators would fire ineffective teachers is uncertain, in that realities about how many teachers need to be fired are still uncertain.

Assumption: Ineffective school administrators are part of the problem. They do not really care whether good teachers stay or bad teachers leave, and this has been evidenced by their lack of action.
Multidisciplinary consensus: School administrators are not a part of the problem. While ineffective school administrators certainly do exist, school administrators are not co-participants in some alleged plot to maintain average levels of student learning and performance.
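The 10–20% figure in the second entry above is worth making concrete. The following simulation is our own illustration, with the variance shares stipulated rather than estimated; it simply shows what a “teacher share” of that size looks like:

```python
import numpy as np

rng = np.random.default_rng(0)
n_teachers, class_size = 200, 25

# Stipulate that teacher effects account for ~15% of test-score variance
# and that student-level and out-of-school factors account for the rest.
teacher_sd, other_sd = np.sqrt(0.15), np.sqrt(0.85)

teacher_effects = np.repeat(rng.normal(0, teacher_sd, n_teachers), class_size)
scores = teacher_effects + rng.normal(0, other_sd, teacher_effects.size)

share = teacher_effects.var() / scores.var()
print(f"teacher share of score variance: {share:.2f}")  # about 0.15
```

Even under this generous stipulation, roughly 85% of the variation in any given score lies outside the teacher’s control, which is the crux of the consensus position.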



Figure 3. Assumptions used as justifications to further advance VAM implementation.

Assumption: To reform America’s public schools, we must treat educational systems as we would market-based corporations. Educational corporations produce knowledge, the quality of which can be manipulated by objectively measuring knowledge-based outcomes (i.e., via VAMs).
Multidisciplinary consensus: Schools are social institutions that do not operate like mechanistic corporations. Teaching is one of the most complex occupations in existence, and thinking about teacher effects in economic terms is misleading, unsophisticated, disingenuous, and obtuse.

Assumption: If educational goals are informed by VAMs, and the goals are then operationalized using incentives and disincentives, teachers will become more motivated to improve their instruction, and student achievement will subsequently improve.
Multidisciplinary consensus: There is little to no research evidence to support this assumption. Teachers are not altogether motivated to improve student achievement by the potential to be monetarily rewarded for gains in student test scores.

Assumption: The monies to be earned via merit pay and other monetary bonuses are substantial and worthy enough to become the motivators conceptualized.
Multidisciplinary consensus: Pay-for-performance bonuses are not nearly large enough to motivate much of anything, especially considering the cuts in pay and halted salary growth that teachers have experienced at the same time. In addition, for the most part, teachers are already exerting great effort in their classrooms. Maximum levels of teacher effort have been reached more often than is assumed, which also inhibits teachers’ capacities to further advance and continuously demonstrate VAM-based growth over time.

Assumption: Teachers, who for the most part became teachers because they wanted to have an impact on the lives of children, are more motivated by money than by the interests of the students in their classrooms. When teachers are faced with choices between serving their own needs and the needs of others, they will on average choose themselves.
Multidisciplinary consensus: When teachers view such systems as neither accurate nor fair, they see little to no value in the performance-based systems to which seemingly important rewards and sanctions are attached. What is worse, implementing pay-for-performance plans may trigger adverse consequences that well outweigh the intended, yet elusive, benefits of the plans implemented.

Assumption: Teachers can both understand VAM-based information and use it to diagnose needs and improve their instruction; increased levels of student achievement and growth will then come about as a result.
Multidisciplinary consensus: There is no evidence to suggest that the research about using general assessment information generalizes to using VAM-based information in the same ways. There is no evidence that providing teachers increased access to VAM-based information enhances their abilities to understand or use this information in instructionally meaningful or relevant ways, and no evidence that doing so increases levels of student achievement or growth.


Figure 4. Statistical and methodological assumptions about VAMs.

Assumption: Once students’ prior learning and other background variables are statistically controlled for, or factored out, the only thing left in terms of the gains students make on large-scale standardized achievement tests over time can be directly attributed to the teacher and his/her causal effects.
Multidisciplinary consensus: The variables typically available in the databases used to conduct VAM analyses are very limited and, accordingly, cannot be used to work the statistical miracles assumed. Even the richest of variables would not liberate value-added analyses from the confounding effects students’ backgrounds and other risk factors have on academic growth over time.

Assumption: The placement of students into classrooms occurs more or less at random as standard practice, so any effects that are observed among teachers can be reasonably attributed to those teachers, provided effective statistical tools and techniques are used.
Multidisciplinary consensus: The assignment of students to classrooms (and teachers to classrooms, as well as students and teachers to schools) is much farther from random than is often assumed, and this biases VAM estimates and greatly weakens arguments that teachers, not students, were responsible for the effects observed (see the simulation sketch following this figure).

Assumption: The issues and errors caused by non-random placements either cancel each other out or can be cancelled out using complex statistics to make nonrandom effects tolerable if not ignorable.
Multidisciplinary consensus: Even the most complicated statistics cannot, and may never be able to, effectively counter the deleterious issues with VAMs caused by non-random assignment practices.

Assumption: Bright students learn no faster than their less intellectually able classmates, and students with lower aptitudes for learning learn as fast as their higher-aptitude peers.
Multidisciplinary consensus: Learning is neither linear nor consistent. Learning is generally uneven, unstable, and discontinuous, as well as often detected in punctuated spurts. The learning trajectories of groups of students over time are not essentially the same, and they often deviate from the linear, constant, unwavering forms statistically predicted and used.

Assumption: VAMs use large-scale standardized achievement test scores that capture student achievement while students are under the direct tutelage of the same teacher, within the same academic year, from the pre- to post-test occasions (i.e., fall to spring).
Multidisciplinary consensus: In almost every case, the tests used to estimate value-added are administered annually and are meant to measure growth from the spring of year X to the spring of year Y. The summer losses and gains captured over this time account for between one-third and one-half of the achievement gap that is persistently prevalent between high- and low-income students, and they distort VAM-based estimates accordingly.

Assumption: The influences students’ prior teachers have on student learning (i.e., teacher persistence effects) are negligible. If not negligible, they can be statistically controlled for under the assumption that these effects decay quickly and at the same rate over time.
Multidisciplinary consensus: Because the pretest scores used to calculate teacher Y’s value-added are collected in year X, and the value measured from point X to Y includes teacher X’s prior effects, it is highly unlikely that statements can be made that the value teacher Y added to or detracted from his/her students’ learning was entirely due to teacher Y’s efforts.

Assumption: Even though multiple teachers teach and interact with students in multiple ways (as do administrators and others), value-added effects can be attributed solely to the teacher under examination.
Multidisciplinary consensus: Teachers do not work in isolation from other teachers (and administrators and other staff), and student test scores accordingly cannot be attributed to individual teachers in isolation from others’ effects.

Assumption: Peer effects are negligible. The extent to which students academically or socially influence other students in their classrooms or attending their schools is trivial.
Multidisciplinary consensus: Students interact inside and outside of their individual classrooms in academic and social ways, both positively and negatively. As well, they likely boost or detract from one another’s achievement, respectively.

Assumption: VAM analyses and estimates are based on sufficient amounts of data, and any data that are missing are missing at random.
Multidisciplinary consensus: Data are very frequently missing, data are missing more often for high-needs students, and the more years needed to conduct value-added analyses, the more problematic this becomes. Missing data are not missing at random, and this too biases VAM estimates for some versus others.

Assumption: VAM accuracy is defensible across teachers’ classrooms regardless of the sizes of the classes taught in any given year.
Multidisciplinary consensus: Even if no data are missing, the accuracy of all VAM estimates still depends on the size of the samples used to generate teachers’ value-added. Estimates for teachers with fewer students are less accurate than those for the colleagues with more students to whom they might be compared.
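The non-random-assignment entries above lend themselves to a simple demonstration. In the sketch below, which is our own illustration rather than any operational VAM, classrooms differ in an unobserved student growth propensity, and a naive gain-score “VAM” then rewards the sorting more than the teaching:

```python
import numpy as np

rng = np.random.default_rng(1)
n_teachers, class_size = 100, 25

true_effect = rng.normal(0, 0.15, n_teachers)  # true teacher contributions
sorting = rng.normal(0, 0.30, n_teachers)      # unobserved classroom-level sorting

# Each classroom's mean gain mixes teaching, sorting, and sampling noise.
gains = np.array([
    true_effect[j] + sorting[j] + rng.normal(0, 0.8, class_size).mean()
    for j in range(n_teachers)
])

# A naive "VAM" treats the mean classroom gain as the teacher's effect.
print("corr(estimate, true effect):", round(np.corrcoef(gains, true_effect)[0, 1], 2))
print("corr(estimate, sorting):   ", round(np.corrcoef(gains, sorting)[0, 1], 2))
```

With sorting of plausible magnitude, the estimates track who was assigned to the classroom more closely than what the teacher did.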

  


Figure 5. Assumptions made about the tests used for value-added calculations.

Assumption: Because measurements typically yield a mathematical and assumed highly scientific value, an appropriate level of certainty or exactness comes along with the numerical scores that result.
Multidisciplinary consensus: Even though measurement is scientifically based and rooted, measurements function much more like language: they are essentially arbitrary designations that have no inherent value. Rather, their values are constructed given mutual agreement about how to use and interpret the numbers derived.

Assumption: Large-scale standardized achievement tests serve as precise measures of what students know and are able to do.
Multidisciplinary consensus: Tests offer very narrow measures of what students have achieved, and they do not effectively assess students’ depth of knowledge and understanding; their ability to think critically, analytically, or creatively; their ability to solve contextual problems; or even their ability to accomplish authentic, performance-based tasks.

Assumption: Regardless of whether student-level consequences are attached to tests, students inherently want to perform well on large-scale standardized achievement tests.
Multidisciplinary consensus: Students often believe that tests are of no consequence. Inversely, when serious consequences are attached to tests, students might perceive the tests to matter, possibly more than they really do. Either scenario typically yields underestimates or overestimates, respectively, pulling test scores away from the “true” scores of interest.

Assumption: Large-scale standardized achievement tests yield highly objective numbers based on complex statistics and therefore yield “true” scores about student learning.
Multidisciplinary consensus: Statistics are being used to convey authority and intimidate others into accepting VAMs simply because VAMs are based on statistics and their mystique, despite statistics’ gross limitations when used to make presumably highly scientific inferences derived via tests.

Assumption: The subject areas that are tested using large-scale standardized achievement tests matter more than the subject areas not tested.
Multidisciplinary consensus: Sometimes science, and more often social studies, music, physical education, and the arts, are marginalized by the fact that it is very difficult to construct tests to assess student learning in these subject areas. What is tested then often dictates what is taught, implying that the tested subject areas matter more than the others.

Assumption: The items used to construct large-scale standardized achievement tests are included to measure things that students should know and be able to do.
Multidisciplinary consensus: The test items most often included on tests are those that “discriminate” well, but they are not representative of the test items we might value most or that are most often taught by teachers as written into state standards.

Assumption: The large-scale standardized achievement tests being used to measure growth are set on equal-interval scales.
Multidisciplinary consensus: The tests being used are typically not set on equal-interval scales. Even the psychometricians who develop the tests that are used for value-added analyses hesitate to agree to such an assumption.

Assumption: A normal distribution (i.e., a bell curve) of teacher effectiveness exists and can and should be used to norm large-scale standardized achievement test scores for VAM analyses. Half of all of America’s public school teachers are above and the other half are below average.
Multidisciplinary consensus: VAM estimates do not indicate whether a teacher is good or highly effective, but rather whether a teacher is better or worse than other “similar” teachers. Teachers’ value-added scores are not absolute, but relative (see the sketch following this figure).

Assumption: Scores derived via large-scale standardized achievement tests yield strong indicators about what students learn in schools.
Multidisciplinary consensus: Test scores provide weak signals about educational quality and often much stronger signals about what students bring with them every year to the schoolhouse door, including out-of-school variables related to students’ backgrounds, peers, parents, families, neighborhoods, communities, and the like.

Assumption: Large-scale standardized achievement tests are “the best” and most objective measures we have. Because they are readily available and accessible, and despite the fact that they are designed to measure student achievement, not teacher effects, we should use them to make causal distinctions about teachers.
Multidisciplinary consensus: The tests on which VAM analyses are based are inherently flawed for measuring student achievement in and of themselves. Using such faulty measures to make causal statements about teacher (or school/district) effects simply exacerbates the issues (e.g., errors, bias, disparate impact).
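The bell-curve entry above has a purely mechanical consequence that a few lines make plain. In this minimal sketch of our own, the uniform “improvement” is stipulated; because normative scores are relative, improving every teacher equally changes no one’s ranking, and half of all teachers remain “below average” by construction:

```python
import numpy as np

rng = np.random.default_rng(2)
effectiveness = rng.normal(0, 1, 10_000)  # simulated teacher quality

print(f"below the median: {(effectiveness < np.median(effectiveness)).mean():.0%}")

# Now suppose every teacher improves by a full standard deviation:
improved = effectiveness + 1.0
print(f"still below the median: {(improved < np.median(improved)).mean():.0%}")
```

Both lines print 50%: under a normative scheme, “below average” describes a teacher’s rank, not the quality of anyone’s teaching.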



CONCLUSIONS


For this commentary, we attempted to situate the VAM narrative within a multi-disciplinary context by unpacking the conditions upon which VAMs and VAM-use are built. We also attempted to situate these conditions within, and map these conditions onto, the larger value-added literature. Taking such a multi-disciplinary approach, it becomes rather clear that requiring all of these conditions to stand as “true,” as they must for VAMs and VAM-use to be appropriate, is much more demanding than the public narrative surrounding VAMs currently represents.


As illustrated, accepting the many assumptions that must be accepted in order to “truly” measure the isolated impact of teachers on student learning is nearly impossible. Nonetheless, policies across the country require school administrators not only to apply these methods to teacher evaluations, accepting these assumptions rather than rejecting them, but also to attach high-stakes personnel decisions to VAM-derived outcomes.


Lazear (1999) reminds us that it is the intention of the economist to make “simplifying assumptions” (p. 5) in order to sustain a level of statistical rigor. It is reasonable to assume that such work has a certain appeal that might lure politicians and the public into trusting such seemingly “simple” analytical instruments and analyses, largely given the “simple” assumptions that make “simple” sense.


Note


1. To view a full list of these 470 references, please visit an anonymous, APA-formatted listing of them at https://sites.google.com/site/anonymousauthor111/references


References


Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., . . . Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers. (Briefing Paper #278). Retrieved from the Economic Policy Institute website: http://www.epi.org/publications/entry/bp278


Chetty, R., Friedman, J. N., & Rockoff, J. E. (2011, December). The long-term impacts of teachers: Teacher value-added and student outcomes in adulthood. (Working Paper No. 17699). Retrieved from National Bureau of Economic Research website: http://www.nber.org/papers/w17699


Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014a). Measuring the impacts of teachers I: Evaluating bias in teacher value-added estimates. (Working Paper No. 19423). Retrieved from National Bureau of Economic Research website: http://www.nber.org/papers/w19423


Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014b). Measuring the impacts of teachers II: Teacher value-added and student outcomes in adulthood. (Working Paper No. 19424). Retrieved from National Bureau of Economic Research website: http://www.nber.org/papers/w19424


Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014c). Response to Rothstein (2014) on “Revisiting the impacts of teachers.” Unpublished research note. Retrieved from Harvard University website: http://obs.rc.fas.harvard.edu/chetty/Rothstein_response.pdf


Corcoran, S. P. (2010). Can teachers be evaluated by their students’ test scores? Should they be? The use of value-added measures of teacher effectiveness in policy and practice. Providence, RI: Annenberg Institute for School Reform. Retrieved from http://annenberginstitute.org/publication/can-teachers-be-evaluated-their-students%E2%80%99-test-scores-should-they-be-use-value-added-mea


Ehlert, M., Koedel, C., Parsons, E., & Podgursky, M. (2012). Selecting growth measures for school and teacher evaluations. Washington, DC: National Center for Analysis of Longitudinal Data in Education Research (CALDER). Retrieved from: http://www.caldercenter.org/publications/selecting-growth-models-school-and-teacher-evaluations-should-proportionality-matter


Fourcade, M., Ollion, E., & Algan, Y. (2014). The superiority of economists. (Discussion Paper 14/3). Retrieved from the Max Planck Sciences Po Center website: http://www.maxpo.eu/pub/maxpo_dp/maxpodp14-3.pdf


Gabriel, R., & Lester, J. N. (2013). Sentinels guarding the grail: Value-added measurement and the quest for education reform. Education Policy Analysis Archives, 21(9), 1–30. Retrieved from http://epaa.asu.edu/ojs/article/view/1165


Gordon, R., Kane, T. J., & Staiger, D. O. (2006). Identifying effective teachers using performance on the job. (Discussion paper). Retrieved from the Brookings Institution website: http://www.brookings.edu/papers/2006/04education_gordon.aspx


Graue, M. E., Delaney, K. K., & Karch, A. S. (2013). Ecologies of education quality. Education Policy Analysis Archives, 21(8), 1-36.


Hanushek, E. A. (1971). Teacher characteristics and gains in student achievement: Estimation using micro data. The American Economic Review, 61(2), 280–288.


Hanushek, E. A. (1979). Conceptual and empirical issues in estimating educational production functions. Journal of Human Resources, 14(3), 351–388.


Hanushek, E. A. (2009). Teacher deselection. In D. Goldhaber & J. Hannaway (Eds.), Creating a new teaching profession (pp. 165–180). Washington, DC: Urban Institute Press.


Hanushek, E. A. (2011). The economic value of higher teacher quality. Economics of Education Review, 30, 466–479.


Harris, D. N., &  Weingarten, R. (2011). Value-added measures in education: What every educator needs to know. Cambridge, MA: Harvard Education Press.


Hill, H. C., Kapitula, L., & Umland, K. (2011, June). A validity argument approach to evaluating teacher value-added scores. American Educational Research Journal, 48(3), 794–831. doi:10.3102/0002831210387916


Ingham, G. (1996). Some recent changes in the relationship between economics and sociology. Cambridge Journal of Economics, 20(2), 243-275.


Kane, T. J., & Staiger, D. O. (2008). Estimating teacher impacts on student achievement: An experimental evaluation. (Working Paper No. 14607). Retrieved from the National Bureau of Economic Research website: http://www.nber.org/papers/w14607


Kane, T., & Staiger, D. (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. (Research Paper). Retrieved from the Bill & Melinda Gates Foundation website: http://www.metproject.org/downloads/Preliminary_Findings-Research_Paper.pdf


Lazear, E. P. (1999). Economic imperialism. (Working Paper No. 7300). Retrieved from National Bureau of Economic Research website: http://www.nber.org/papers/w7300


Leech, N. L., & Onwuegbuzie, A. J. (2008). Qualitative data analysis: A compendium of techniques for school psychology research and beyond. School Psychology Quarterly, 23, 587–604.


Newton, X., Darling-Hammond, L., Haertel, E., & Thomas, E. (2010). Value-added modeling of teacher effectiveness: An exploration of stability across models and contexts. Education Policy Analysis Archives, 18(23), 1–27. Retrieved from http://epaa.asu.edu/ojs/article/view/810


Papay, J. P. (2010). Different tests, different answers: The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48, 163–193. doi: 10.3102/0002831210362589


Rockoff, J. E., Staiger, D. O., Kane, T. J., & Taylor, E. S. (2010). Information and employee evaluation: Evidence from a randomized intervention in public schools. (Working Paper No. 16240). Retrieved from the National Bureau of Economic Research website: http://www.nber.org/papers/w16240


Rothstein, J. (2010). Teacher quality in educational production: Tracking, decay, and student achievement. Quarterly Journal of Economics, 125(1), 175–214. doi:10.1162/qjec.2010.125.1.175


Sanders, W. L. (2003, April). Beyond “No Child Left Behind.” Paper presented at the Annual Meeting of the American Educational Research Association. Retrieved from: http://www.sas.com/govedu/edu/no-child.pdf


Sanders, W. L., Saxton, A. M., & Horn, S. P. (1997). The Tennessee Value-Added Accountability System: A quantitative, outcomes-based approach to educational assessment. In J. Millman (Ed.), Grading teachers, grading schools: Is student achievement a valid evaluation measure? (pp. 137–162). Thousand Oaks, CA: Corwin Press.


Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect. Education Digest, 75(2), 31–35.




Cite This Article as: Teachers College Record, Date Published: June 29, 2015. https://www.tcrecord.org ID Number: 18008


About the Authors
  • Jessica Holloway-Libell, Arizona State University. JESSICA HOLLOWAY-LIBELL earned her PhD from Mary Lou Fulton Teachers College at Arizona State University.
  • Audrey Amrein-Beardsley, Arizona State University. AUDREY AMREIN-BEARDSLEY, PhD, is an Associate Professor at Mary Lou Fulton Teachers College at Arizona State University.
 
Member Center
In Print
This Month's Issue

Submit
EMAIL

Twitter

RSS