
When Statistical Significance Hides More Than it Reveals


by Jeanne M. Powers & Gene V. Glass - July 02, 2014

Background & Purpose: The What Works Clearinghouse (WWC) recently released a summary of five CREDO studies of charter school outcomes produced between 2009 and 2013. We compared the WWC’s summary, which highlighted the statistical significance of the findings, to the effect sizes reported in the individual reports. We also addressed the findings reported in the original studies that were not summarized in the WWC reports.

Research Design: Analytic essay highlighting the gaps between the findings reported in the summary WWC report, the individual WWC reports for each study, and the original CREDO studies.

Findings: We argue that focusing on statistical significance is potentially misleading. The WWC summary invites the reader to conclude that charter schools had a greater effect on students’ achievement gains than traditional public schools. Comparing across the studies’ effect sizes suggests that the average effect of charter schools on students’ achievement gains is negligible. The WWC reports also do not address the considerable variation in achievement gains within and across subgroups of students and schools.

Conclusion: Summaries generated from research studies should provide an accounting of findings that allows practitioners to assess their practical importance. When these and similar reports are hard to understand and misleading, they run the risk of eroding practitioners’ trust in research and increasing rather than bridging the gulf between research and practice.



INTRODUCTION


Table 1 is a simplified version of a table produced for the What Works Clearinghouse’s (WWC) Review of the Center for Research on Education Outcomes (CREDO) Charter School Studies (WWC, 2014a). The WWC review summarized the individual reviews of five studies examining charter school student outcomes. The original studies were produced by CREDO between 2009 and 2013. The WWC reviews (three quick reviews and two single study reviews) were released about six months after the release of each study. The last of these reviews and the summary were released in January 2014.


Table 1: WWC table summarizing statistically significant findings across five CREDO studies.



Study | Impact on Reading Gains for Charter School Students | Impact on Math Gains for Charter School Students
16 States | - | -
Indiana | + | +
National | + | -
New Jersey | + | +
New York | + | +


Source: Authors’ reproduction of table from WWC Review of the CREDO Charter School Studies (January, 2014a)i


The table is surrounded by technical details about the studies summarized. Yet the power of the table lies not in its fidelity to those details, although it is technically correct; its power lies in its simple visual appeal. In this research note we argue that this table can also provide important lessons about statistical and practical significance.


BACKGROUND


The goal of the WWC is to allow educators to make “evidence-based decisions” by identifying studies that “provide credible and reliable evidence of the effectiveness of a given practice, program, or policy” (WWC, n.d.). These are studies that have been vetted and summarized using protocols for assessing the research designs and findings of studies (WWC, 2014b). These summaries of “what works” are explicitly intended to help people working in schools figure out how to work more effectively with students and their families—these short summaries are presumably easier to access and understand than the full research reports.


Quick reviews provide preliminary reviews of analyses of program or practice effectiveness; single study reviews are more detailed reviews of studies that underwent quick review. Single study reviews “are designed to provide education practitioners and policymakers with timely and objective assessments of the quality of the research evidence from recently released research papers and reports” (WWC, 2012).1 The CREDO studies were reviewed by the WWC because they had received significant media attention (WWC, 2014a). They are also important because they are the largest in scope of the studies of school choice outcomes in the WWC. The 14 other studies of school choice in the WWC tend to focus on specific settings (Milwaukee, the District of Columbia, New York City) or smaller samples of specific types of charter schools (KIPP schools, charter schools run by charter management organizations). Our discussion focuses on the WWC reviews, although we consulted the CREDO studies as needed to clarify our understanding of the reviews (WWC, 2010a, 2010b, 2011, 2013, 2014a, 2014c).


Briefly, the CREDO studies all used the same technique:  charter school students were matched to similar students attending “feeder” traditional public schools2 by grade level, baseline achievement, and demographic characteristics (eligibility for free or reduced price lunch, special education status, gender, and race). The analyses compared the two groups’ achievement gains on standardized reading and math tests between the baseline year and the following year. The main difference between the studies was the scope of the analytic samples. Two were multi-state (a national and a 16 state study), two focused on specific states (Indiana and New Jersey), and the final study was of New York City charter schools. The studies also varied by time frame and the grades included in the analyses.
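The matching design described above can be sketched in a few lines of code. The records, the baseline-score band, and the exact-match rule below are illustrative assumptions for exposition only; they are not CREDO’s actual implementation, which constructed composite comparison records from multiple matched feeder-school students.

```python
from statistics import mean

# Hypothetical student records (scores in standard deviation units).
charter = [
    {"grade": 4, "frl": True, "baseline": 0.10, "gain": 0.05},
    {"grade": 4, "frl": False, "baseline": -0.20, "gain": 0.00},
]
feeder_tps = [
    {"grade": 4, "frl": True, "baseline": 0.12, "gain": 0.03},
    {"grade": 4, "frl": True, "baseline": 0.55, "gain": 0.10},
    {"grade": 4, "frl": False, "baseline": -0.18, "gain": 0.02},
]

def matches(c, t, band=0.10):
    """A TPS student matches a charter student if they share grade and
    demographics and have a baseline score within +/- `band` SD."""
    return (c["grade"] == t["grade"]
            and c["frl"] == t["frl"]
            and abs(c["baseline"] - t["baseline"]) <= band)

# For each charter student, average the gains of all matched TPS
# students, then average the charter-minus-comparison differences.
diffs = []
for c in charter:
    twins = [t["gain"] for t in feeder_tps if matches(c, t)]
    if twins:
        diffs.append(c["gain"] - mean(twins))

avg_effect = mean(diffs)  # positive => charter students gained more
```

In this toy sample the two differences cancel, so the average effect is essentially zero, which is itself a reminder that an overall average can mask offsetting gains and losses across students.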


ANALYSIS


Table 2 provides a summary of the effect sizes reported in the individual WWC reviews. For the purposes of accuracy, in Table 2 we reproduced much of the text from the individual reports and organized it into a chart similar to Table 1. Leave aside for the moment the possibility that all of these comparisons are confounded by a differential regression effect (Campbell & Stanley, 1966); one of the common reasons a student enrolls in a charter school is that they earned poor grades at their traditional public school.3 We paraphrased when necessary to present the findings in table form and to facilitate comparison across the reports.


Table 2:  Effect sizes across CREDO charter school studies


Study

Impact on Reading Gains

Impact on Math Gains

16 States

Charter school students’ reading test score growth was slightly lower than that of traditional public school students. The differences were the equivalent of moving the median student from the 50th to slightly higher than the 49th percentile in a normal distribution.

Charter school students’ math test score growth was slightly lower than that of traditional public school students. The differences were the equivalent of moving the median student from the 50th to the 49th percentile.

Indianaii

Charter school students’ annual reading score growth was 0.05 standard deviations higher than that of traditional public school students, equivalent to moving the median student from the 50th to the 52nd percentile.

Charter school students’ annual math score growth was 0.07 standard deviations higher than traditional public school students’, equivalent to moving the median student from the 50th to the 53rd percentile.

National

Charter school students had annual reading score growth that was 0.01 standard deviations higher than students attending traditional public schools. Due to an extraordinarily large N, this difference was statistically significant.

There was no statistically significant difference between charter school students and traditional public school students in their year-to-year gains in math.

New Jersey

1-year reading achievement gains of students in the charter schools were statistically significantly greater than those of comparison students in traditional public schools, equivalent to moving the median student from the 50th to the 52nd percentile.iii

1-year math achievement gains of students in the charter schools were statistically significantly greater than those of comparison students in traditional public schools, equivalent to moving the median student from the 50th to the 53rd percentile.

New York

Charter school student achievement growth was statistically significantly higher than the achievement growth of comparison students — 0.06 standard deviations higher in reading, equivalent to moving the median student from the 50th to the 52nd percentile.iv

Charter school student achievement growth was statistically significantly higher than the achievement growth of comparison students — 0.12 standard deviations higher in math, equivalent to moving the median student from the 50th to the 55th percentile.

Source:  Authors’ summary from WWC 2010a, 2010b, 2011, 2013, 2014c.


Notes.

i. Retrieved February 21, 2014 from http://ies.ed.gov/ncee/wwc/SingleStudyReview.aspx?sid=220. A “+” indicates that charter school students had greater achievement gains after one year than their traditional public school counterparts. A “-” indicates that traditional public school students made greater achievement gains. The text above the table notes that all of the findings were statistically significant except for the negative math gains observed in the national study.

ii. The results reported here are from WWC’s analysis of CREDO’s first report on Indiana (2011), which covered the years between 2004 and 2008. In a second report covering the years between 2007 and 2011 (CREDO, 2012), the achievement gains for charter school students, while still positive, were smaller (.04 standard deviations higher than traditional public school students in reading and .04 in math).

iii. The information about percentile rank equivalency for New Jersey is drawn from the table in Appendix C. We reworded the information about percentile rank equivalency in the New Jersey and New York reports to be consistent with the information presented in the WWC’s 16 state report (the first of the WWC’s five single study reports).

iv. The WWC report discussed here analyzed CREDO’s January 2010 report on charter school achievement in New York City, which analyzed achievement gains from 2003 to 2008. A more recent report (February 2013) assessed achievement gains from 2007 to 2011 and found that the average achievement gains for charter school students in reading were lower than those reported here (.03 standard deviations higher than traditional public school students), and the average achievement gains for charter school students in math were slightly higher (.14).
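The percentile equivalences quoted throughout Table 2 follow from a single piece of arithmetic: an effect of d standard deviations moves the median comparison student to the Φ(d) quantile of a normal distribution. A minimal sketch (the effect sizes are those reported in the reviews; the function name is ours):

```python
from math import erf, sqrt

def sd_to_percentile(d):
    """Percentile rank of the median comparison student after a shift
    of d standard deviations, via the standard normal CDF Phi(d)."""
    phi = 0.5 * (1.0 + erf(d / sqrt(2.0)))
    return 100.0 * phi

# Effect sizes reported across the WWC reviews:
for d in (0.01, 0.05, 0.06, 0.07, 0.12):
    print(f"d = {d:.2f} -> {sd_to_percentile(d):.1f}th percentile")
```

Running this reproduces the table’s conversions: 0.05 SD lands at roughly the 52nd percentile, 0.07 SD at the 53rd, and 0.12 SD at the 55th.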


Table 2 lacks the immediate visual impact on the reader of Table 1. Table 1 invites the reader to count the number of cells with a “+” and come to the conclusion that on balance, charter schools had a much greater effect on students’ achievement gains than traditional public schools. Yet for more than 30 years it has been clear that counts of statistically significant and non-significant results violate some of the most fundamental properties of statistical hypothesis testing (Hedges & Olkin, 1980). Table 2 suggests that on average, the effect of charter schools on students’ achievement gains, positive or negative, was negligible. The strongest positive results for charter schools were reported for New York City students in mathematics. While positive, these effect sizes are among the lowest of those reported across the 15 studies of school choice in the WWC database that met WWC evidence standards.4 They are also well under the definition of a substantively important finding provided in the glossary of all WWC single study reviews: a substantively important finding “has an effect size of .25 or greater, regardless of statistical significance” (WWC, 2014a).5 By comparison, the typical effect size of a year’s growth in reading achievement at the elementary school level is about 1.0; in math, it is slightly larger (Levin, Glass & Meister, 1986, 1987). The latter figure allows us to interpret more correctly the practical significance of the studies’ findings because we can compare the effect sizes yielded across the studies to an appropriate benchmark, typical yearly achievement growth.
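The national study’s reading result illustrates why statistical significance is a poor proxy for practical importance: with two unit-variance groups of size n, the standard error of a difference in means is roughly sqrt(2/n), so even a negligible d = 0.01 becomes “significant” once n is large enough. A stylized two-sample z test makes the point (the sample sizes below are hypothetical, chosen only for illustration; CREDO’s actual models were more elaborate):

```python
from math import erf, sqrt

def z_and_p(d, n_per_group):
    """Two-sided z test for a mean difference of d SD between two
    unit-variance groups of size n_per_group; SE = sqrt(2/n)."""
    se = sqrt(2.0 / n_per_group)
    z = d / se
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0))))  # two-sided p
    return z, p

# The same negligible effect (d = 0.01 SD) at two sample sizes:
for n in (10_000, 1_000_000):
    z, p = z_and_p(0.01, n)
    print(f"n = {n:>9,}: z = {z:.2f}, p = {p:.4f}")
```

At n = 10,000 per group the 0.01 SD difference is nowhere near significant; at n = 1,000,000 it is significant at any conventional level, even though the practical meaning of the effect is unchanged.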


Finally, while the discussion above focuses on the findings that the WWC reported and assessed, these are a relatively small part of the analyses in the CREDO studies. The WWC reports focus largely, although not exclusively, on the average achievement gains of charter school students compared to traditional public school students across the full sample of students in each study.  However, the CREDO studies also compare achievement gains within subgroups of students by race, special education, and English language learner status, deciles of prior achievement, and the number of years enrolled in charter schools. Other analyses include assessments of charter school and traditional public school students’ achievement gains in different locales (the “local markets” charter schools serve, metropolitan areas, and states depending on the scope of the report). These findings are substantially more complex and nuanced than the summary table might suggest and are only minimally and inconsistently addressed in three of the five WWC reports and not addressed in two.6 Moreover, marked differences in achievement gains within groups of students or schools indicate that simple overall averages are not appropriate for understanding the “effectiveness of a given practice, program, or policy” (WWC, n.d.). In other words, the WWC summary and individual reports are largely silent on a key finding of the CREDO studies—charter school achievement effects vary considerably across settings and subgroups of students—findings that have important implications that practitioners and researchers need to consider.


CONCLUSION


While the table that is arguably the center of the WWC Review of the CREDO Charter School Studies is a technically correct summary of statistical significance, it provides potentially misleading visual cues that may overemphasize and exaggerate the success of charter schools, adding to the myth of their superiority (Thaler & Sunstein, 2009; see also Fischman & Tefera, 2014 and Berliner, Glass & Associates, 2014). This review is unique in the WWC database because it summarizes across multiple single study and quick reviews of studies using the same research design. It is also the starkest representation of an inconsistency in how WWC reports are presented to the public. Half of the 14 other studies of school choice summarized as either quick or single study reviews address the practical significance of the studies’ findings on each review’s web page – the easiest to access source of information about a given study. More detailed reports can be easily downloaded in PDF form. In the latter, the statistical significance of findings and effect sizes are reported in narrative form, much like the information presented in Table 2 above. In the WWC Review of the CREDO Charter School Studies, the effort to summarize across multiple studies resulted in an oversimplification that obscures more than it illuminates. Likewise, the summary report and the individual reviews largely do not address a significant aspect of all of the CREDO analyses—the variations in achievement gains across subgroups of students and different types of charter schools.


If summaries generated from research studies are intended to be useful guides to practitioners, they must provide a consistent and careful accounting of findings that allows them to assess their practical importance. Summaries of research that are hard to understand and misleading run the risk of eroding practitioners’ trust in research and increasing rather than bridging the gulf between research and practice.


Notes


1. The WWC provides four types of review: (a) intervention guides that review analyses of specific interventions; (b) practice guides that are produced by an expert panel and are aimed at providing clear, research-based guidance for practitioners; (c) quick reviews; and (d) single study reviews.

2. Feeder schools are traditional public schools that had students transfer to one of the charter schools in the study sample.

3. All of the reviews provide a similar note of caution about the research design that alludes to the possibility of regression effects. These vary somewhat across the five reports. The most elaborate version states the following:  “[U]nobserved differences between [charter school and traditional public school students] may have existed. For example, charter school students may have been more motivated to do well in school or may have had other unobserved characteristics that influenced student achievement. This means the study’s results do not necessarily isolate the effect of charter schools.” (WWC, 2013) According to the WWC protocol, the highest rating that quasi-experimental studies such as the CREDO charter school studies can receive is “meets WWC group design standards with reservations” because even if groups are equivalent on observed characteristics, there may be important differences between a treatment group and the comparison group on unobserved characteristics that may “introduce bias into an estimate of the effect of the intervention” (WWC, 2014b, pp. 10–11). All five of the CREDO studies met the WWC’s standards with reservations.

4. As the WWC Review of the CREDO Charter School Studies noted, the effect sizes for the CREDO studies are not directly comparable to the effect sizes reported for other studies because the CREDO studies compare achievement gains between charter school students and traditional public school students whereas other studies compare the two groups on different outcomes (e.g., reading and math achievement, high school graduation, and college attendance).

5. While outside the scope of the discussion here, this standard is also problematic because it provides a normative standard for interpreting findings without a clear rationale (Konstantopoulos & Hedges, 2008; see also Lipsey et al., 2012). It is significant here because this is ostensibly the information a practitioner, the target audience for WWC reviews, would have at hand to assess the findings reported in single study reviews.

6. The WWC report of the 16-state study discusses the school-level comparisons (WWC, 2010a), the Indiana report summarizes the decile comparison (WWC, 2011), and the New Jersey report highlights the findings for Newark (WWC, 2013).


References


Berliner, D. C., Glass, G. V & Associates. (2014). 50 myths and lies that threaten America’s public schools: The real crisis in education. NY: Teachers College Press.


Campbell, D. T. & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand-McNally.


Center for Research on Education Outcomes. (2012, December 12). Charter school performance in Indiana.  Retrieved from http://credo.stanford.edu/pdfs/IN_2012_FINAL_20130117nw.pdf


Center for Research on Education Outcomes. (2013, February 20).  Charter school performance in New York City.  Retrieved from http://credo.stanford.edu/documents/NYC_report_2013_FINAL_20130219_000.pdf


Fischman, G. E. & Tefera, A. A. (2014). Qualitative inquiry in an age of educationalese. Education Policy Analysis Archives, 22(7).


Hedges, L. V. & Olkin, I. (1980). Vote-counting methods in research synthesis. Psychological Bulletin, 88(2), 359–369.

Konstantopoulos, S. & Hedges, L. V. (2008). How large an effect can we expect from school reforms? Teachers College Record, 110(8), 1611–1638.


Levin, H. M., Glass, G. V & Meister, G. R. (1986). The political arithmetic of cost-effectiveness analysis. Phi Delta Kappan, 68(1), 69–72.


Levin, H. M., Glass, G. V & Meister, G. R. (1987). Different approaches to improving performance at school. Zeitschrift fur Internationale Erziehungs und Sozial Wissenschaftliche Forschung, 3, 156–176.


Lipsey, M. W., Puzio, K., Yun, C., Hebert, M. A., Steinka-Fry, K., Cole, M. W., Roberts, M., Anthony, K. S., & Busick, M. D. (2012). Translating the statistical representation of the effects of education interventions into more readily interpretable forms (NCSER 2013-3000). Washington, DC: Institute of Education Sciences, U.S. Department of Education.


Thaler, R. H., & Sunstein, C. R. (2009). Nudge: Improving decisions about health, wealth, and happiness. New York: Penguin Books.


What Works Clearinghouse. (n.d.). Topics in education. Retrieved February 23, 2014 from http://ies.ed.gov/ncee/wwc/topics.aspx


What Works Clearinghouse. (2010a, February). WWC quick review of the report “Multiple choice: Charter school performance in 16 States.” Retrieved from http://ies.ed.gov/ncee/wwc/pdf/quick_reviews/charterschools_021710.pdf


What Works Clearinghouse. (2010b, July). WWC review of the report “Charter School Performance in New York.” Retrieved from http://ies.ed.gov/ncee/wwc/pdf/quick_reviews/nyccharter_070710.pdf.


What Works Clearinghouse. (2011, September). WWC review of the report “Charter School Performance in Indiana.” Retrieved from http://ies.ed.gov/ncee/wwc/pdf/quick_reviews/incharter_093011.pdf.


What Works Clearinghouse. (2013, October). WWC review of the report “Charter School Performance in New Jersey.” Retrieved from http://ies.ed.gov/ncee/wwc/pdf/single_study_reviews/wwc_njcharter_100113.pdf


What Works Clearinghouse. (2014a, January). WWC review of the CREDO charter school studies. Retrieved from http://ies.ed.gov/ncee/wwc/SingleStudyReview.aspx?sid=220.


What Works Clearinghouse. (2014b). Procedures and standards handbook, Version 3.0. Retrieved from http://ies.ed.gov/ncee/wwc/pdf/reference_resources/wwc_procedures_v3_0_standards_handbook.pdf


What Works Clearinghouse. (2014c, January). WWC review of the report “National Charter School Study: 2013.” Retrieved from http://ies.ed.gov/ncee/wwc/pdf/single_study_reviews/wwc_ncss_012814.pdf




Cite This Article as: Teachers College Record, Date Published: July 02, 2014. https://www.tcrecord.org ID Number: 17591.

About the Author
  • Jeanne M. Powers
    Mary Lou Fulton Teachers College at Arizona State University
    E-mail Author
    JEANNE M. POWERS is an Associate Professor in the Mary Lou Fulton Teachers College at Arizona State University. A sociologist, her research focuses on school segregation, school choice, and school finance litigation. Her publications include: “From segregation to school finance: The legal context for language rights in the United States” in Review of Research in Education (Volume 38) and Charter schools: From reform imagery to reform reality (Palgrave MacMillan).
  • Gene Glass
    Arizona State University
    E-mail Author
    GENE V GLASS is a Regents' Professor Emeritus from Arizona State University and a Research Professor at the University of Colorado Boulder. His most recent book is David C. Berliner, Gene V Glass & Associates, 50 myths and lies that threaten America's public schools, Teachers College Press, 2014.
 