
Do Student-Level Incentives Increase Student Achievement? A Review of the Effect of Monetary Incentives on Test Performance


by Vi-Nhuan Le - 2020

Background: Policymakers have debated whether test scores represent students’ maximum level of effort, prompting research into whether student-level financial incentives can improve test scores. If cash incentives are shown to improve students’ test performance, there can be questions as to whether test scores obtained in the absence of financial incentives accurately reflect what students know and can do. This can raise concerns as to whether test scores should be used to guide policy decisions.

Purpose: This study used meta-analysis to estimate the effect of student-level incentives on test performance. The study also included a narrative review.

Research Design: Twenty-one studies conducted in the United States and internationally were included in the meta-analysis. Effect sizes were estimated separately for mathematics, reading/language arts, and overall achievement.

Findings: Financial incentives had a significantly positive effect on overall achievement and on mathematics achievement, but no effect on reading/language arts achievement. The narrative review suggested mixed effects with regards to whether treatment estimates could be sustained after the removal of the incentives and whether larger cash payments were associated with stronger program impact. Programs that offered monetary incentives in conjunction with other academic supports tended to show stronger effects than programs that offered incentives alone.

Conclusion: The findings raise questions as to whether policymakers should use scores from low-stakes tests to inform high-stakes policy decisions. The study cautions against using scores from international assessments to rank-order countries’ educational systems or using scores from state achievement tests to sanction schools or award teacher bonuses.



Although providing incentives to students to improve student achievement is a controversial practice, many schools throughout the United States use reward programs as a means of improving student test scores (Prothero, 2017). In a survey of 250 charter school principals from 17 states, Raymond (2008) found that 57% of responding principals indicated that they used incentives with their students as a way of raising student achievement. In recent years, cash incentive programs have gained traction, with districts throughout the country paying students for academic achievement. For example, the Baltimore City Public School district implemented a monetary incentive program that paid 10th and 11th graders who had previously failed one of their state graduation exams up to $110 if the students improved their scores on benchmark assessments (Ash, 2008). Similarly, the Urban Strategies Memphis Hope program awards students $30 for each A, $20 for each B, $10 for each C grade earned on their report card, and $50 for scoring at least 19 on the ACT (Scheie, 2014).


Understanding whether financial incentives can improve student achievement is important for two reasons. First, incentive interventions are less costly to implement than other types of educational reforms that require higher levels of human capital (e.g., class size reductions, teacher training, curricular development, etc.). If the body of literature suggests that monetary incentives have positive effects on test scores, policymakers may want to increase investments in cash incentive programs as a cost-effective strategy to improve student achievement. Second, educators often use test scores to guide policy decisions. An important assumption underlying the test scores is that students are sufficiently motivated to perform well so their scores are an accurate representation of what they know and can do. If research shows that offering financial incentives can improve student scores, then there can be concerns as to whether test scores obtained in the absence of financial incentives are truly indicative of students’ abilities. This would call into question the utility of the test scores to inform policy decisions.


The purpose of this study is to use meta-analysis to synthesize the results across multiple evaluations of student-level incentives in order to estimate the magnitude of the effect of monetary incentives on student test performance. The meta-analysis is supplemented with a narrative review of findings that have important ramifications for the design of future cash incentive programs. The study is guided by five research questions; the first is addressed via meta-analysis, and the latter four through the narrative review:


1. What is the effect of cash incentives on student test performance? Does the effect vary by particular subgroups, including location (i.e., international students versus students in the United States), schooling level (i.e., elementary versus secondary grades), gender, and initial achievement level?

2. Are the effects of incentives on test performance sustained once the incentives are removed?

3. Is there a relationship between the magnitude of program effect and the size of the monetary incentive?

4. Among studies with multiple treatment conditions, what are the features of promising incentive programs?

5. Are there any unintended consequences of implementing incentive programs?


This review is organized as follows. It begins with a discussion of the pros and cons of incentive programs from a motivational perspective. Next, the study describes previous literature that has examined the effect of monetary incentives on test performance. The study then describes the analytic approach used in the study, including the search methods, inclusion criteria, and meta-analytic techniques used to estimate effect sizes. This is followed by the results of the meta-analysis, then the results of the narrative review. The article concludes with implications of the results for future policy and research.


EXPECTANCY-VALUE THEORY AS A RATIONALE FOR PROVIDING MONETARY INCENTIVES TO STUDENTS


Although many different motivational theories have been used as a rationale for incentive programs,1 this study uses the expectancy-value framework adopted in the health and work performance fields (Stolovitch, Clark, & Condly, 2002; White, 2012). According to expectancy-value theory, students’ effort, persistence, and performance on a task depend on their beliefs about their chances of performing well on the task (i.e., expectancy) and on the subjective value they place on the task and its associated rewards (i.e., value) (Eccles et al., 1983; Eccles & Wigfield, 2002; Wigfield & Cambria, 2000). Expectancy-value theorists argue that students may not put forth their best effort on achievement tasks because they place a low value on the tasks and/or have low expectancy of performing well on them (Eccles, 2007). Monetary incentives are primarily intended to increase the value that students place on an achievement task (Levitt, List, Neckermann, & Sadoff, 2016).


From a behavioral economics perspective, students may have low subjective values for achievement tasks because the costs and effort required to perform well on the tasks are upfront and high, but the benefits are delayed and not readily apparent or tangible (Barrow & Rouse, 2016; Levitt, List, Neckermann, & Sadoff, 2016). For example, paying attention in class, completing homework, and engaging in behaviors that lead to school success can bring intangible rewards (such as peer recognition), but a tangible payoff may not be realized until well in the future (Bembenutty, 2008). Monetary rewards can enhance the value that students place on achievement tasks by allowing them to more quickly realize the payoffs of their hard work in a concrete manner (Sadoff, 2014; Wallace, 2009).


MONETARY INCENTIVES AS A DETRIMENT TO INTRINSIC MOTIVATION


Despite the theoretical appeal of expectancy-value theory as a rationale for providing monetary incentives for student achievement, many motivational theorists question the premise that monetary incentives will necessarily increase students’ motivation and resulting performance. Depending on the reward structure, incentives may have the opposite effect and can actually lead to decreased performance (Ryan & Deci, 2000). Because incentives create an explicit link between desired student behaviors that lead to high student achievement (such as studying) and monetary payments, engaging in these behaviors becomes a transactional process (Gallani, 2017). If students attribute their desire to study to a transactional link to a monetary reward as opposed to an inherent interest in the subject matter, they may express lower intrinsic motivation in the task. A meta-analysis conducted by Deci, Koestner, and Ryan (1999) found support for the notion that providing performance-contingent rewards can undermine students’ intrinsic motivation, as the offer of a performance-contingent reward was associated with lower levels of self-reported interest. In a classic experiment, Lepper, Greene, and Nisbett (1973) found that offering an external reward to young children to draw and color pictures resulted in a subsequent decrease in children’s interest in drawing as a free-choice activity, relative to children who had not received a reward. Frey and Goette (1999) found that high school volunteers who were collecting donations for charity put forth more effort when they were not compensated than when a small payment was offered.


Motivational theorists also contend that even if external rewards could improve performance, the positive effect is likely to be fleeting, as students may engage in “temporary compliance” (Kohn, 1993) and decrease their efforts after the removal of the incentives (Gneezy, Meier, & Rey-Biel, 2011; Willingham, 2008). Gallani (2017) examined hand hygiene practices in a hospital and found that hand sanitizing increased during the period when individuals were eligible for incentives, but that they regressed to lower levels of hand sanitizing after the incentives were withdrawn. Visaria, Dehejia, Chao, and Mukhopadhyay (2016) examined an incentive program in India that was designed to improve the school attendance of low-income children. When the incentives were in place, average attendance improved, but the removal of the incentives resulted in even lower attendance among children with initially low baseline attendance. Taken together, this line of research suggests that any positive effect of incentives on student achievement may fade once the incentives are removed.


PREVIOUS REVIEWS OF FINANCIAL INCENTIVES ON STUDENT ACHIEVEMENT


There have been three often-cited reviews of the effect of financial incentive programs on student achievement in field settings. Slavin (2010) highlighted 19 financial incentive programs implemented across developing and developed countries. Using a narrative approach, Slavin concluded that monetary incentives did not have any effects on the graduation rates and achievement of students in developed countries but were weakly positive for students in developing countries. The National Research Council (2011) conducted a narrative review of test-based incentives and concluded that incentives had a relatively weak effect on student achievement. McEwan (2015) conducted a meta-analysis of eight randomized cash incentive studies implemented in developing countries and reported a statistically significant standardized regression coefficient of 0.089.


Although these reviews provide useful information about the potential effect of student-level financial incentives on student achievement, they also underscore the need for more research. Conditional cash transfer programs comprised the vast majority of the studies included in Slavin’s (2010) review. However, conditional cash transfer programs typically incentivize school attendance as opposed to student achievement, so it is not entirely surprising that he found the effect of incentives on student achievement to be weak. Presumably, stronger effects could be observed with incentive programs that explicitly set out to improve student achievement. The National Research Council (2011) review represented an early synthesis of incentives research, and since its publication, numerous other incentive studies have been completed. McEwan’s (2015) study examined student-level incentives in conjunction with teacher-level incentives, rendering it difficult to understand to what extent the reported 0.089 regression coefficient reflected the specific effect of student-level incentives. The present study builds on these prior reviews by increasing the number of student-level incentive studies reviewed and by estimating an effect size specific to student-level incentives.


METHOD


LITERATURE SEARCH PROCEDURES


Three sequential steps were used to search for relevant literature. First, various combinations of the search terms “financial/cash/monetary,” “incentives/rewards/awards/prize,” and “student achievement/test scores/test performance” were entered within Google and six databases representing multiple disciplines: ERIC, PsycInfo, Sociological Abstracts, Dissertation Abstracts, EconLit, and National Bureau of Economic Research. The sources covered both peer-reviewed journal articles and working papers in order to mitigate publication bias. The search included all studies up to December 2017, but because evaluations of student-level financial incentives are a relatively recent development, most studies were published within the last decade. Second, the Social Sciences Citation Index was used to identify studies that cited seminal research in the field. Finally, the bibliographies of all major literature reviews, as well as the bibliographies of the studies that met the inclusion criteria, were examined for any relevant or missing citations.


INCLUSION CRITERIA


To be included in the meta-analysis, the study had to meet several criteria. First, the study must have been a field experiment as opposed to a laboratory experiment. Second, the incentive programs needed to provide students with cash rewards for meeting a specified academic performance threshold. This criterion eliminated the majority of conditional cash transfer reforms because most of these programs incentivized school attendance as opposed to school achievement. Third, the study needed to include standardized test scores as an outcome. This criterion eliminated studies that examined whether incentives increased participation in certain courses, such as Advanced Placement courses. Fourth, the study needed to focus on students in the elementary and secondary grades. This criterion eliminated all of the studies that provided postsecondary scholarship incentives to college students. Finally, research briefs, op-ed pieces, and research narratives were excluded because they were not primary empirical studies.


The literature search yielded 80 research studies, of which 74 were empirical studies that necessitated further review (see Figure 1). The largest group of excluded studies (n = 30) was eliminated because they were conditional cash transfer programs that incentivized school attendance but not student achievement. Another 10 studies were eliminated because they were scholarship programs conducted with college-going students in postsecondary settings. Four studies were eliminated because they did not provide sufficient details that would allow for a conversion of their statistical results (e.g., percentages meeting an academic performance threshold) into the regression estimate metric used in this study. Five studies were eliminated because they did not include test performance as an outcome, and another four studies were eliminated because they were conducted in a laboratory setting. In total, 21 unique studies were included in the analysis.


Figure 1. Exclusion criteria for the meta-analysis



CODING STUDY INFORMATION


Each study was coded for the following information: (a) research design (e.g., randomized versus correlational); (b) sample size; (c) structure of the financial incentive program; (d) study location; (e) achievement outcomes; and (f) regression coefficients and associated standard errors for the treatment and control conditions for each achievement outcome, delineated by subject, gender, and initial achievement level, where relevant.


In some studies, both unconditional and conditional regression estimates were reported. In those instances, the regression coefficients from models that included controls for student- and school-level variables were used because control variables are often used to adjust for imbalances between the treatment and control groups (McEwan, 2015) and can reduce the standard error of the treatment estimate (Duflo, Hanna, & Ryan, 2008).


ACHIEVEMENT OUTCOMES


This study focused on three achievement outcomes: overall achievement, mathematics, and reading/language arts. Overall achievement was defined as the treatment estimate pooled across subjects such as mathematics, reading/language arts, science, and history/geography/social sciences. The analysis also estimated separate effect sizes for mathematics and reading/language arts, where possible.


SUBGROUP ANALYSIS


Separate effect sizes were examined for certain subgroups. The subgroups include location, schooling level, gender, and initial achievement level. These subgroups were chosen because the literature suggests effect sizes may vary by these factors. For example, many of the incentive programs in international settings were conducted in developing countries, which differ markedly from the United States with respect to school resources. In terms of schooling level, it is possible that older children may be more motivated by incentives than younger children because older children may have a better understanding of money and finances. With respect to gender, scholarship studies conducted at the postsecondary level have found that females are more responsive to financial incentives than males (Angrist, Lang, & Oreopoulos, 2009). Finally, it is important to examine whether there are differential effects on students of different achievement levels because previous studies have found that financial rewards can have stronger effects on higher achieving college students than on lower achieving college students (Leuven, Oosterbeek, & van der Klaauw, 2010).


THE REGRESSION COEFFICIENT AS AN EFFECT SIZE INDEX


Following other studies (e.g., Cooper, Robinson, & Patall, 2006; Kim, 2011; McEwan, 2015; Nieminen, Lehtiniemi, Vähäkangas, Huusko, & Rautio, 2013), a standardized regression coefficient was used as the measure of effect size. All but two studies reported a standardized regression coefficient for the effect of financial incentives on test performance. In the two instances in which unstandardized regression coefficients were reported, the estimate was converted to a standardized regression coefficient by dividing the treatment effect and its associated standard error by the pooled standard deviation of the outcome variable (McEwan, 2015).
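The conversion is a simple rescaling of both the coefficient and its standard error; a minimal sketch (the function name is ours, for illustration only):

```python
def standardize(b_unstd, se_unstd, sd_outcome):
    """Rescale an unstandardized treatment coefficient and its standard
    error into standard-deviation units of the outcome, i.e., divide
    both by the pooled SD of the outcome variable (per McEwan, 2015)."""
    return b_unstd / sd_outcome, se_unstd / sd_outcome
```

For example, a 2-point raw gain on a test with a pooled standard deviation of 10 corresponds to a standardized coefficient of 0.20.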


An efficient estimator of the mean of the true effects of the various programs is the weighted average of the observed effect sizes, where the weight is the inverse of the squared standard error (Kim, 2011; Nieminen et al., 2013). Thus, the mean effect size estimate of the standardized regression coefficients, $\bar{b}$, and its associated standard error, $SE(\bar{b})$, are estimated by:

$$\bar{b} = \frac{\sum_{j,k} w_{jk} b_{jk}}{\sum_{j,k} w_{jk}} \quad \text{and} \quad SE(\bar{b}) = \sqrt{\frac{1}{\sum_{j,k} w_{jk}}}, \quad \text{where} \quad w_{jk} = \frac{1}{SE_{jk}^{2} + \tau^{2}} \qquad (1)$$

where $b_{jk}$ is the standardized regression estimate for the effect size in treatment $j$ of study $k$, $SE_{jk}^{2}$ is its squared standard error, and $\tau^{2}$ represents the common between-study variance resulting from random-effects pooling (Borenstein, Hedges, Higgins, & Rothstein, 2009), which is estimated via restricted maximum likelihood (Ringquist, 2013). Effects were estimated using a random-effects model, which assumes that variation in the observed effect sizes stems from both sampling error and random variance.
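The estimator above is a precision-weighted average, which can be sketched in a few lines of Python (a minimal illustration with our own naming; τ² is taken as given here rather than estimated via restricted maximum likelihood as in the study):

```python
import math

def pooled_effect(b, se, tau2=0.0):
    """Random-effects pooled mean of standardized regression coefficients:
    each effect is weighted by w = 1 / (SE^2 + tau^2), and the standard
    error of the pooled mean is sqrt(1 / sum of weights)."""
    w = [1.0 / (s ** 2 + tau2) for s in se]
    b_bar = sum(wi * bi for wi, bi in zip(w, b)) / sum(w)
    se_bar = math.sqrt(1.0 / sum(w))  # SE of the pooled estimate
    return b_bar, se_bar
```

With two effects of 0.10 and 0.30, each with SE = 0.10 and τ² = 0, the pooled mean is 0.20 with SE ≈ 0.071.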


ACCOUNTING FOR DEPENDENT EFFECT SIZE ESTIMATES


A key assumption of meta-analysis is that the treatment effect sizes can be treated as independent (Kim, 2011). However, this assumption is violated when there are multiple outcomes or multiple incentive conditions within the same study (Scammacca, Roberts, & Stuebing, 2014). Multiple outcomes (e.g., mathematics and reading achievement scores) are not independent data points because the effect sizes are based on the same set of students. Similarly, studies can have multiple incentive conditions, and require multiple treatment contrasts relative to the same control group. To account for the non-independence of effect sizes, the study adopted Hedges, Tipton, and Johnson’s (2010) robust variance estimation (RVE) approach. RVE has the advantage of making no assumptions about the covariance structure of the effect size estimates (Tanner-Smith & Tipton, 2014) and yields estimates that are robust to a wide range of within-study intraclass correlations (Scammacca et al., 2014; Wilson, Tanner-Smith, Lipsey, Steinka-Fry, & Morrison, 2013). In addition, the analyses used the small-sample corrections recommended by Tipton (2015) and Tipton and Pustejovsky (2015), which adjust both the residuals and the degrees of freedom of the treatment estimates.
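The core idea behind RVE, a cluster-robust ("sandwich") standard error in which effect sizes from the same study may covary arbitrarily, can be sketched as follows. This is a simplified illustration under stated assumptions: it omits the REML estimation of τ² and the small-sample corrections the study applies, and the names are ours:

```python
import math
from collections import defaultdict

def rve_pooled(b, se, study, tau2=0.0):
    """Pooled mean with a cluster-robust (sandwich) standard error in the
    spirit of Hedges, Tipton, & Johnson (2010): weighted residuals are
    summed within each study before squaring, so dependent effect sizes
    from the same study do not overstate precision."""
    w = [1.0 / (s ** 2 + tau2) for s in se]
    w_total = sum(w)
    b_bar = sum(wi * bi for wi, bi in zip(w, b)) / w_total
    cluster = defaultdict(float)  # per-study sum of weighted residuals
    for wi, bi, sid in zip(w, b, study):
        cluster[sid] += wi * (bi - b_bar)
    robust_var = sum(c ** 2 for c in cluster.values()) / w_total ** 2
    return b_bar, math.sqrt(robust_var)
```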


RESULTS


CHARACTERISTICS OF THE INCLUDED STUDIES


The appendix shows the characteristics of the incentive programs included in this meta-analysis. There was a balance of published and unpublished studies in the analysis, as well as a balance with respect to location and schooling level. All but three studies used a randomized design. Four studies did not report treatment estimates separately by subject area, which meant that these studies could only be used to estimate the effect of incentives on overall achievement.


Analysis was conducted on 21 studies that yielded 103 effect sizes. The relatively large number of effect sizes arose in part because several investigators chose to report results from distinct experiments within the same study. For example, Fryer, Devi, and Holden (2016) described two field experiments conducted in Washington, DC, and Houston that varied on several dimensions, including the academic behaviors that were incentivized, the frequency with which students were provided incentives, the grade levels that participated, the magnitude of the rewards, and the state achievement tests used as outcome measures. In addition, nine studies examined multiple treatment conditions (e.g., individual-level incentives versus team-level incentives versus a control group). As a result, the number of studies included in the meta-analysis was not aligned with the number of experiments that were conducted. Overall, the 21 studies reported on 39 different student-level cash incentive programs.


EFFECTS ON TEST PERFORMANCE


The I², which is a measure of heterogeneity between studies, was approximately 23%, which is indicative of a low level of between-study variability (Higgins, Thompson, Deeks, & Altman, 2003). Table 1 provides the mean estimates, the standard errors of the mean estimates, the associated p values, and the number of studies and effect sizes for each analysis. For overall achievement, the mean estimate (b̄ = 0.062) was significantly positive. There was also a significantly positive effect of monetary incentives on mathematics achievement (b̄ = 0.095), but there was no relationship between student incentives and reading/language arts achievement (b̄ = 0.020).
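For reference, I² is derived from Cochran’s Q statistic; a minimal sketch using fixed-effect weights, with our own naming:

```python
def i_squared(b, se):
    """I-squared (Higgins, Thompson, Deeks, & Altman, 2003): the percentage
    of total variation across effect sizes attributable to between-study
    heterogeneity rather than sampling error, computed from Cochran's Q."""
    w = [1.0 / s ** 2 for s in se]  # fixed-effect (inverse-variance) weights
    b_bar = sum(wi * bi for wi, bi in zip(w, b)) / sum(w)
    q = sum(wi * (bi - b_bar) ** 2 for wi, bi in zip(w, b))  # Cochran's Q
    df = len(b) - 1
    return 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
```

Identical effect sizes yield I² = 0; widely divergent, precisely estimated effects push I² toward 100%.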


Table 1. Effect Sizes of Monetary Incentives on Test Performance

Outcomes                     b̄      SE(b̄)  p value  Studies  Effect Sizes
Overall achievement          0.062  0.018   0.008    21       103
Subject
  Mathematics                0.095  0.029   0.011    15       58
    International            0.125  0.046   0.029    9        32
    U.S.                     0.082  0.035   0.046    7        26
  Reading/language arts      0.020  0.013   0.162    11       33
Location
  International              0.099  0.034   0.017    12       39
  U.S.                       0.044  0.018   0.042    10       64
Schooling level
  Elementary                 0.039  0.022   0.131    10       35
  Secondary                  0.076  0.029   0.036    13       50
Gender
  Males                      0.085  0.022   0.004    14       43
  Females                    0.084  0.026   0.009    14       43
Initial achievement level
  High achievers             0.074  0.023   0.010    14       37
  Low achievers              0.065  0.024   0.026    14       43


SUBGROUP ANALYSIS


Location


For both international and U.S. students, cash incentives had a statistically significant positive effect on overall achievement (b̄ = 0.099 and b̄ = 0.044, respectively) as well as on mathematics achievement (b̄ = 0.125 and b̄ = 0.082, respectively). The effect was stronger for international students than for U.S. students for overall achievement, but the effect did not differ significantly between the two groups for mathematics achievement.


Schooling Level


There was also a statistically significant effect of cash incentives on overall achievement at the secondary grades (b̄ = 0.076), but not at the elementary grades (b̄ = 0.039). However, there was no significant difference in the magnitude of effect between the two schooling levels.


Gender


Financial incentives had a statistically significant effect on overall achievement for both males (b̄ = 0.085) and females (b̄ = 0.084). In addition, the positive effect was equally strong for both genders.


Initial Achievement Levels


The effect of financial incentives for lower achieving students, defined as those whose test performance prior to the implementation of the incentive program was below the median score, was compared to the effect for higher achieving students, defined as those whose initial test performance was above the median score. The effect of incentives on overall achievement was statistically significant and positive for both higher achieving students (b̄ = 0.074) and lower achieving students (b̄ = 0.065). There was also no difference in the magnitude of effect between the two groups.


NARRATIVE REVIEW


In designing cash incentive programs, policymakers need to consider several issues, including whether treatment effects persist after the incentives are removed and whether stronger effects can be observed if larger incentives are provided. In addition, it is important to identify promising features of incentive programs, especially among studies with multiple treatment conditions that facilitate direct comparisons of the effectiveness of different design options. Because studies that examined these issues did not necessarily present the findings within a quantitative framework, a narrative review is presented instead.


EFFECTS AFTER THE REMOVAL OF THE INCENTIVES


The evidence is mixed as to whether achievement gains stemming from the implementation of the incentive programs could be sustained after the removal of the incentives. Kremer, Miguel, and Thornton (2009) found that even one year after the incentive program had ended, the program continued to have a positive effect on test scores, although the effect was weaker than that observed when the program was still in place. They contend that these findings support the notion that the initial test score gains reflected real learning as opposed to cramming or cheating for the test. Levitt, List, and Sadoff (2016) also found the effects of financial incentive programs could persist after the incentive program ended, at least for one additional year. They examined whether financial incentives provided during students’ freshman year in high school were associated with a higher probability of being “on track” to graduate in subsequent years. Treatment effects persisted one year after the incentives had been removed, such that students in the treatment group had a higher probability of being “on track” to graduate when measured at the 10th grade. However, the treatment effects dissipated thereafter, and there were no differences between the treatment and control students’ probabilities of being “on track” to graduate by the time students reached the 11th or 12th grades.


Other studies have confirmed that achievement gains associated with the monetary incentive programs may be short-lived. Bettinger (2012) conducted a multi-year evaluation, where students could be eligible for incentives in one year, but not the next. He found that the achievement gains demonstrated by the incentive recipients in the previous year did not persist into the following year. Similarly, examining the incentive program in Dallas, Fryer (2011) found that a year after the incentives had ended, the treatment estimates had faded and were no longer statistically significant.


EFFECTS OF INCENTIVES AS A FUNCTION OF THE SIZE OF THE CASH PRIZES


The results are also contradictory as to whether the effect of the monetary incentive programs is related to the size of the cash reward. Jackson (2014) did not find a relationship between program effect size and the size of the reward—the number of Advanced Placement tests passed was the same for schools that paid $100 per exam as for schools that paid between $101 and $500. By way of contrast, Fryer, Devi, and Holden (2016) found that achievement gains were greater with larger incentives. They initially paid students $2 per mathematics objective mastered. When they temporarily increased the amount of incentives to $4 and then $6, the rate of learning objectives mastered per week also increased. Namely, when the incentive amount was $2, students mastered an average of 2.32 objectives per week, but when the amount was increased to $4 then to $6, the average number of objectives mastered increased to 2.81 and 5.79, respectively. In a similar vein, Levitt, List, Neckermann, and Sadoff (2016) found that offering a $20 cash prize had a positive effect on test scores, while offering a $10 cash prize did not have an effect. However, this effect appeared to be driven mostly by older students, as younger children responded in similar ways to both the larger and smaller incentives.


COMPARISONS OF MULTIPLE TREATMENT CONDITIONS


Some studies included two or more treatment conditions, allowing for a direct comparison of the effectiveness of different types of incentive programs. Blimpo (2014) studied three types of student incentive structures—incentives to individual students, to teams of students, and to teams of students in a tournament format—and found them all to be equally effective at raising student test scores. Behrman, Parker, Todd, and Wolpin (2012) also studied the effectiveness of three incentive conditions, which differed with respect to the stakeholders being incentivized: individual students; individual teachers; or individual students with groups of teachers and school administrators. They found that providing incentives to individual teachers had no effect on student test scores, but incentives provided to individual students had a positive effect on test scores. The strongest effect was found when individual students, groups of teachers, and school administrators were all eligible for cash rewards. Notably, teachers in this incentive condition reported spending more outside-of-class time helping students prepare for the exam than teachers in the two other incentive conditions.


Li, Han, Rozelle, and Zhang (2014) also found that a multipronged incentive structure was most effective. They studied an individual-level incentive program in which students who posted the largest achievement gains received a cash prize. In a variation on this incentive structure, they also incentivized peer tutoring in addition to test performance, such that a subset of higher achieving students were given contracts to tutor other students in the class. If their tutees were among the highest-gaining students, the tutors would receive the same cash prizes as the tutees. Offering individual student-level incentives had no effect on test scores, but combining incentives for test performance with incentives for peer tutoring showed a positive effect.


Hirshleifer (2017) compared the effectiveness of two incentive conditions, one incentivizing inputs and the other incentivizing outputs. In the inputs condition, students completed a series of interactive learning modules, after which they were administered a cumulative end-of-unit test. While working through the modules, students received immediate feedback on their performance; if they answered incorrectly, they could click a button to see the fully worked-out problem and its correct solution, and then apply that approach to subsequent questions within the module. Students were paid based on the number of items they answered correctly while working through a given module and on the total number of modules mastered. In the outputs condition, students were paid based on the number of items answered correctly on the cumulative end-of-unit test. Hirshleifer (2017) found that on a subsequent non-incentivized test, students in the inputs condition outperformed students in the outputs condition. He hypothesized that incentivizing inputs was more effective than incentivizing outputs because it allowed students to more quickly and directly see the fruits of their efforts.


UNINTENDED CONSEQUENCES OF MONETARY INCENTIVE PROGRAMS


One potential unintended consequence of monetary incentive programs is that they may divert students’ attention to the incentivized subjects at the expense of subjects that are not incentivized. This “substitution effect,” however, may depend on initial achievement level. In their study, Fryer and Holden (2013) paid students based on the number of mathematics objectives mastered. High-achieving treatment students mastered more mathematics objectives, scored higher on the standardized mathematics test, and scored comparably on the standardized reading test, relative to high-achieving control students. In contrast, although low-achieving treatment students mastered more mathematics objectives than low-achieving control students, they scored comparably to low-achieving control students on the standardized mathematics test, and lower on the standardized reading test. Fryer and Holden (2013) noted that although both high- and low-achieving treatment students put in additional effort to obtain the prize (as evidenced by the increase in the number of mathematics objectives mastered), this increased effort came at the expense of the low-achieving students’ reading performance.


DISCUSSION


The results of this study suggest that financial incentives can modestly improve student achievement. There was a positive effect of monetary incentives on overall achievement and on mathematics achievement, although there was no effect for reading/language arts achievement. This finding is consistent with studies suggesting that incentives may be more effective in concrete subjects, such as mathematics, than in conceptual subjects, such as reading/language arts (Rouse, 1998). Incentives were related to the achievement of both international and U.S. students, although the effect was stronger within the international context. There were no differences in the effects of incentives by gender or by initial achievement level, and the effect was significant for secondary grade students but not for elementary grade students. Perhaps by virtue of their better understanding of finances, older students may have found the cash rewards more enticing than younger students did.


IMPLICATIONS FOR THE DESIGN OF FUTURE INCENTIVE PROGRAMS


The modest effect sizes found in this study raise questions as to why incentives did not have a stronger effect. One possibility, raised by Fryer (2011), is that offering financial incentives may increase students’ motivation to perform well, but students may not know what to do to improve their performance, despite their desire to do so. In interviews with students, Fryer (2011) found that students expressed excitement about the possibility of obtaining a cash prize, but when asked about how they could improve their test performance to attain the reward, students could not readily answer. Students responded with general test-taking strategies (e.g., making sure that their answers were entered correctly), as opposed to strategies that would actually improve their learning (e.g., studying harder, completing their homework, asking teachers for help). Students’ lack of understanding about what to do to improve their performance may explain why incentives did not affect their study habits.


In a similar vein, Li et al. (2014) noted that incentives may help motivate students, but unless they are accompanied by additional remediation or academic supports, incentives in and of themselves will not help students learn the material. This may explain why treatment conditions that incentivized peers or teachers to provide extra assistance to students, such as those implemented by Li et al. (2014) and Behrman et al. (2015), showed stronger effects than treatment conditions that simply paid students for test performance. This finding is consistent with Slavin’s (2010) review, which concluded that financial incentives for students worked best when paired with improvements in teaching or other supports.


Fryer (2011) suggested that incentivizing educational inputs (e.g., reading books) as opposed to outputs (e.g., reaching a performance standard on a test) may lead to stronger effects because inputs encourage students to engage in concrete behaviors that can lead to improved performance. By contrast, outputs such as “reach proficient level” are abstract goals and do not offer students guidance about specific steps that will improve their learning. The findings from Hirshleifer’s (2017) study lend credence to this idea, as students who were incentivized on inputs (i.e., the number of modules mastered during a unit) demonstrated better test performance than students who were incentivized on outputs (i.e., the number of items answered correctly on an end-of-unit exam). Because few studies have incentivized educational inputs, more definitive conclusions cannot be drawn. However, future studies should examine whether incentives applied in combination with educational inputs, such as providing immediate corrective feedback to students, prove to be more effective than incentives that merely pay students for the number of correct responses.


The results of this study have implications for the design of future incentive programs. Consistent with findings from laboratory experiments (O’Neil, Abedi, Miyoshi, & Mastergeorge, 2005) as well as from incentive programs conducted at the postsecondary level (Barrow & Rouse, 2013), there was some evidence that offering a larger cash prize may not necessarily lead to a stronger effect than offering a smaller cash prize (Jackson, 2010). The results also suggest that students may engage in substitution and focus on the incentivized subjects to the detriment of non-incentivized subjects (Fryer et al., 2016). Policymakers may therefore want to design incentive programs that cover multiple subjects, or stipulate that, to receive the reward, performance on non-incentivized subjects cannot decline beyond a pre-specified level.


IMPLICATIONS FOR POLICY AND PRACTICE


A key assumption in the interpretation of test scores is that the results accurately demonstrate what students know and can do. If students have not put forth their best effort on the tests because the tests hold no personal consequences for them, then the test results can yield a misleading picture of students’ abilities. This study’s finding that student performance on mostly low-stakes tests could be improved with financial incentives calls into question the practice of making policy decisions based on these types of assessments. For example, the educational systems of countries that are highly ranked on international assessments are often lauded as exemplars that should serve as models for improvement (Grek, 2009). However, there is evidence that students from different countries may not have the same levels of motivation to perform on low-stakes international tests (Zamarro, Hitt, & Mendez, 2016), and that performance differences on these tests reflect differences in motivation as well as differences in ability (Gneezy et al., 2017). This raises concerns about whether the scores from these tests can accurately be used to rank order countries’ educational systems.


Similarly, in the United States, policy decisions are often based on state achievement tests that carry little consequence for students yet have high-stakes consequences for teachers and schools. Results from state achievement tests have been used to dismiss teachers, determine teacher bonuses, and sanction schools for failing to make adequate yearly progress. That students may not put forth maximum effort on these types of low-stakes tests in the absence of a financial reward undermines the utility of these tests as indicators of instructional quality, because the scores may not accurately reflect what students have actually learned (Cole & Osterlind, 2008). Future studies should examine whether using tests that have personal consequences for students would change interpretations about teacher effectiveness or school improvement.


Another important policy question is the magnitude of student-level financial incentive effects relative to other programs designed to improve achievement. Compared with the effect sizes for other interventions such as class size reduction (0.08 to 0.10; Jepsen & Rivkin, 2002) or instructional reforms that involve computers or technology (0.15; McEwan, 2015), the effect sizes for student-level financial incentives are smaller. However, financial incentive programs are relatively inexpensive to implement, especially when compared with other types of reforms that involve substantial investments in human capital (Bettinger, 2012; Blimpo, 2014). Yeh (2010) suggested conducting a cost-effectiveness analysis, in which policymakers examine the relative impact of each intervention per dollar spent, to evaluate different types of interventions and inform policy decisions. Fryer (2011) conducted a cost-effectiveness analysis for the incentive programs included in his study and found that statistically insignificant and weak effect sizes ranging from 0.0006 to 0.016 could yield a 5% return on investment. Similarly, Blimpo (2014) found that the cost of implementing student-level financial incentives was $30 per standard deviation gain in test scores. By way of comparison, Yeh (2010) reported significantly higher costs for interventions that involved substantial human capital. For example, teacher education was estimated to cost just over $700 for a one-quarter standard deviation increase in student achievement, and cross-age tutoring was estimated to cost a little more than $550 for a nearly one standard deviation gain in mathematics test performance.
Thus, although the effect sizes for financial incentives are smaller than those for other educational interventions, the relatively low level of resources needed to implement student-level incentives may mean that financial incentives can be a more cost-effective strategy for improving achievement. It is important to emphasize, however, that the effect sizes associated with financial incentives are not nearly large enough to bring the United States to the levels of the highest-achieving nations (National Research Council, 2011), so policymakers may wish to invest in other reforms that are more costly but also have the potential to yield stronger effects.
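The comparison above can be made concrete with a back-of-the-envelope normalization. The sketch below is my illustration, not an analysis from the study: it converts the reported cost figures (Blimpo, 2014; Yeh, 2010) to dollars per full standard deviation of achievement gain, treating "one quarter" and "nearly one" standard deviation as 0.25 and 1.0, respectively.

```python
# Hedged sketch: normalize reported cost figures to dollars per 1.0 SD of
# test-score gain so that interventions can be compared on a common scale.
def cost_per_sd(cost_usd, sd_gain):
    """Dollars needed per full standard deviation of achievement gain."""
    return cost_usd / sd_gain

# Figures as reported in the text; the SD fractions are approximations.
interventions = {
    "student incentives (Blimpo, 2014)": (30, 1.0),
    "teacher education (Yeh, 2010)": (700, 0.25),
    "cross-age tutoring (Yeh, 2010)": (550, 1.0),
}

for name, (cost, gain) in interventions.items():
    print(f"{name}: ${cost_per_sd(cost, gain):,.0f} per SD gain")
```

On this rough normalization, teacher education works out to $2,800 per standard deviation, underscoring why incentives can look cost-effective despite their smaller effect sizes.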


STUDY LIMITATIONS


There are several limitations to this study. First, the study could not disentangle the impact of student incentives from other concurrent interventions. For example, in the United States, many schools and districts are required to submit continuous improvement plans, which are often accompanied by changes to the curriculum or teachers’ professional development. To the extent that the incentive programs took place at the same time as other ongoing interventions, the estimates in the meta-analysis may be overstated.


Second, it is possible that publication bias may have resulted in inaccurate estimates. Analyses using Duval and Tweedie’s (2000) trim and fill procedure to assess publication bias did not change the conclusion that financial incentives had modestly positive effects on achievement. Similarly, Rosenthal’s (1979) fail-safe N approach indicated that more than 300 additional studies with an average effect size of zero would be required to render the effect on overall achievement statistically insignificant. Although these analyses suggest that publication bias may be minimal, it remains possible that researchers self-censored and did not publicly disseminate manuscripts with null or negative findings, which would bias the estimates upward.
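Rosenthal’s (1979) fail-safe N has a simple closed form, sketched below with hypothetical inputs (the sum of study z-scores is illustrative, not the study’s actual data): it asks how many file-drawer studies averaging zero effect would be needed to drag the combined result below significance.

```python
# Minimal sketch of Rosenthal's (1979) fail-safe N, with hypothetical inputs.
def fail_safe_n(sum_z, k, z_alpha=1.645):
    """Fail-safe N = (sum of study z-scores)^2 / z_alpha^2 - k,
    for k included studies at a one-tailed alpha of .05."""
    return (sum_z ** 2) / (z_alpha ** 2) - k

# Hypothetical example: 21 studies whose z-scores sum to 30 would tolerate
# roughly 312 unpublished null studies before losing significance.
print(round(fail_safe_n(sum_z=30, k=21)))
```

The larger the fail-safe N relative to the number of included studies, the less plausible it is that publication bias alone explains the pooled effect.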


Finally, the relatively small number of studies included in the review means that the conclusions warrant caution. However, research has indicated that as few as two studies can yield meaningful meta-analytic results (Valentine, Pigott, & Rothstein, 2010), and it is common to find meta-analyses conducted on a small number of studies (IntHout, Ioannidis, Borm, & Goeman, 2015). For example, within the Cochrane Database of Systematic Reviews, a repository for thousands of systematic reviews and meta-analyses, the median number of studies per meta-analysis is seven (von Hippel, 2015). Research has also found that evaluating the reliability of meta-analytic results solely on the number of studies included can be misleading, because many meta-analyses include smaller studies that are underpowered, and the inclusion of smaller studies increases between-study heterogeneity (IntHout et al., 2015) and reduces the precision of the meta-analytic estimates (Turner, Bird, & Higgins, 2013). For this reason, some researchers have suggested that meta-analyses ignore estimates from smaller studies and draw conclusions exclusively from larger studies that are sufficiently powered (Kraemer, Gardner, Brooks, & Yesavage, 1998; Stanley, Jarrell, & Doucouliagos, 2010). Notably, the primary studies included in the meta-analysis were able to leverage existing administrative records, and each analysis was conducted on thousands of students. These larger sample sizes lend credence to the reliability of the study conclusions.
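The mechanics behind the small-study concern can be illustrated with a short sketch, using hypothetical effect sizes rather than the study’s data: inverse-variance pooling gives imprecise studies little weight in the combined estimate, yet a discrepant small study can still inflate the I² heterogeneity statistic (Higgins, Thompson, Deeks, & Altman, 2003).

```python
# Hedged illustration (hypothetical effect sizes): fixed-effect inverse-variance
# pooling and the I^2 heterogeneity statistic.
def pooled_and_i2(effects, variances):
    """Return the fixed-effect pooled estimate and I^2 = max(0, (Q - df)/Q) * 100."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, i2

# Two precise, consistent studies: no heterogeneity.
print(pooled_and_i2([0.10, 0.10], [0.01, 0.01]))
# Adding a small, imprecise, discrepant study barely moves the pooled
# estimate but pushes I^2 well above zero.
print(pooled_and_i2([0.10, 0.10, 0.60], [0.01, 0.01, 0.04]))
```

This is why reviewers weigh both the number of studies and their precision when judging how much to trust a pooled estimate.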


Overall, this study suggests that financial incentives can modestly improve student achievement. The findings also raise questions as to whether policymakers should use low-stakes tests, such as state achievement tests or international assessments, to inform high-stakes decisions. More research is needed to better understand the intersection between student achievement and student motivation, and the inferences that can be drawn from tests that are administered in the absence of financial incentives or other personal consequences for students.


Note


1. See Fang and Gerhart (2012) and Johnston and Sniehotta (2010) for a discussion of alternative motivational theories for incentive programs, including self-regulation theory, self-determination theory, and cognitive evaluation theory.


Acknowledgements


This research was supported by funding from NORC’s Working Paper Series. The content or opinions expressed do not necessarily reflect the views of NORC, and any errors remain my own.


References


References marked with an asterisk indicate studies included in the meta-analysis.


Angrist, J., Lang, D., & Oreopoulos, P. (2009). Incentives and services for college achievement: Evidence from a randomized trial. American Economic Journal: Applied Economics, 1, 136–163.


Angrist, J. D., & Lavy, V. (2009). The effects of high stakes school achievement awards: Evidence from a randomized trial. American Economic Review, 99, 301–331.*


Ash, K. (2008, February 13). Promises of money meant to heighten student motivation. Education Week. Retrieved from http://www.edweek.org


Barrera-Osorio, F., & Filmer, D. (2016). Incentivizing schooling for learning: Evidence on the impact of alternative targeting approaches. Journal of Human Resources, 51(2), 461–499.*


Barrow, L., & Rouse, C. E. (2016). Financial incentives and educational investment: The impact of performance-based scholarships on student time use. Education Finance and Policy, 13(4), 419–448.


Behrman, J. R., Parker, S. W., Todd, P. E., & Wolpin, K. I. (2015). Aligning learning incentives of students and teachers: Results from a social experiment in Mexican high schools. Journal of Political Economy, 123(2), 325–364.*


Bembenutty, H. (2008). Academic delay of gratification and expectancy value. Personality and Individual Differences, 44, 193–202.


Berry, J., Kim, H. B., & Son, H. (2017). When student incentives don’t work: Evidence from a field experiment in Malawi. Newark, DE: University of Delaware.*


Berry, J. W. (2015). Child control in education decisions. Journal of Human Resources, 50(4), 1051–1080.*


Bettinger, E. P. (2012). Paying to learn: The effect of financial incentives on elementary school test scores. The Review of Economics and Statistics, 94, 686–698.*


Blimpo, M. P. (2014). Team incentives for education in developing countries: A randomized field experiment in Benin. American Economic Journal: Applied Economics, 6, 90–109.*


Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. Chichester, UK: Wiley.


Burgess, S., Metcalfe, R., & Sadoff, S. (2016). Understanding the response to financial and non-financial incentives in education: Field experimental evidence using high-stakes assessments. Bristol, UK: University of Bristol.*


Cole, J. S., & Osterlind, S. J. (2008). Investigating differences between low- and high-stakes test performance on a general education exam. The Journal of General Education, 57(2), 119–130.


Cooper, H., Robinson, J. C., & Patall, E. A. (2006). Does homework improve academic achievement? A synthesis of research, 1987–2003. Review of Educational Research, 76, 1–62.


Deci, E. L., Koestner, R., & Ryan, R. (1999). A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125, 627–668.


Deci, E. L., Koestner, R., & Ryan, R. M. (2001). Extrinsic rewards and intrinsic motivation in education: Reconsidered once again. Review of Educational Research, 71, 1–27.


Duflo, E., Hanna, R., & Ryan, S. P. (2012). Incentives work: Getting teachers to come to school. American Economic Review, 102, 1241–1278.


Duval, S. J., & Tweedie, R. L. (2000). Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56(2), 455–463.


Eccles, J. S. (2007). Families, schools, and developing achievement-related motivations and engagement. In J. E. Grusec & P. D. Hastings (Eds.), Handbook of socialization (pp. 665–691). New York, NY: The Guilford Press.


Eccles, J. S., Adler, T. F., Futterman, R., Goff, S. B., Kaczala, C. M., Meece, J. L., & Midgley, C. (1983). Expectancies, values, and academic behaviors. In J. T. Spence (Ed.), Achievement and achievement motivation (pp. 75–146). San Francisco, CA: W. H. Freeman.


Eccles, J. S., & Wigfield, A. (2002). Motivational beliefs, values, and goals. Annual Review of Psychology, 53, 109–132.


Fang, M., & Gerhart, B. (2012). Does pay for performance diminish intrinsic interest? The International Journal of Human Resource Management, 26(6), 1176–1196.


Frey, B. S., & Goette, L. (1999). Does pay motivate volunteers? Zürich, CH: Institute for Empirical Research in Economics, Universität Zürich.


Fryer, R. G. (2011). Financial incentives and student achievement: Evidence from randomized trials. The Quarterly Journal of Economics, 126, 1755–1798.*


Fryer, R. G., Devi, T., & Holden, R. T. (2016). Vertical versus horizontal incentives in education: Evidence from randomized trials. Cambridge, MA: Harvard University.*


Gallini, S. (2017). Incentives, peer pressure, and behavior persistence. Cambridge, MA: Harvard Business School.


Gneezy, U., List, A., Livingston, J., Sadoff, S., Qin, X., & Xu, Y. (2017). Measuring success in education: The role of effort on the test itself. National Bureau of Economic Research Working Paper No. w24004. Cambridge, MA: NBER.*


Gneezy, U., Meier, S., & Rey-Biel, P. (2011). When and why incentives (don't) work to modify behavior. The Journal of Economic Perspectives, 25(4), 191–209.


Grek, S. (2009). Governing by numbers: The PISA effect in Europe. Journal of Education Policy, 24, 23–37.


Hedges, L. V., Tipton, E., & Johnson, M. C. (2010). Robust variance estimation in meta-regression with dependent effect size estimates. Research Synthesis Methods, 1, 39–65.


Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman. D. G. (2003). Measuring inconsistency in meta-analyses. BMJ, 327(7414), 557–560.


Hirshleifer, S. R. (2017). Incentives for effort or outputs? A field experiment to improve student performance. Riverside, CA: University of California, Riverside.*


IntHout, J., Ioannidis, J. P. A., Borm, G. F., & Goeman, J. J. (2015). Small studies are more heterogeneous than large ones. Journal of Clinical Epidemiology, 68, 860–869.


Jackson, C. K. (2010). A little now for a lot later. A look at a Texas Advanced Placement incentive program. The Journal of Human Resources, 45, 591–639.*


Jackson, C. K. (2014). Do college-preparatory programs improve long-term outcomes? Economic Inquiry, 52, 72–99.*


Jepsen, C., & Rivkin, S. (2002). Class size reduction, teacher quality, and academic achievement in California elementary public schools. San Francisco: Public Policy Institute of California.


Johnston, M., & Sniehotta, F. (2010). Financial incentives to change patient behavior. Journal of Health Services Research Policy, 15(3), 131–132.


Kim, R.S. (2011). Standardized regression coefficients as indices of effect sizes in meta-analysis. (Unpublished doctoral dissertation). Florida State University, Tallahassee, FL.


Kohn, A. (1993). Why incentive plans cannot work. Harvard Business Review, 71, 54–63.


Kraemer, H. C., Gardner, C., Brooks, J. O., & Yesavage, J. A. (1998) Advantages of excluding underpowered studies in meta-analysis: Inclusionist versus exclusionist viewpoints. Psychological Methods, 3, 23–31.


Kremer, M., Miguel, E., & Thornton, R. (2009). Incentives to learn. The Review of Economics and Statistics, 91, 437–456.*


Lepper, M., Greene, D., & Nisbett, R. (1973). Undermining children’s intrinsic interest with extrinsic reward: A test of the “overjustification” hypothesis. Journal of Personality and Social Psychology, 28, 129–137.


Leuven, E., Oosterbeek, H., & van der Klaauw, B. (2010). The effect of financial rewards on students’ achievement: Evidence from a randomized experiment. Journal of the European Economic Association, 8, 1243–1265.


Levitt, S. D., List, J. A., Neckermann, S., & Sadoff, S. (2016). The behavioralist goes to school: Leveraging behavioral economics to improve educational performance. American Economic Journal: Economic Policy, 8(4), 183–219.*


Levitt, S. D., List, J. A., & Sadoff, S. (2016). The effect of performance-based incentives on educational achievement: evidence from a randomized experiment. National Bureau of Economic Research Working Paper No. w22107. Cambridge, MA: NBER.*


Li, T., Han, L., Rozelle, S., & Zhang, L. (2014). Encouraging classroom peer interactions: Evidence from Chinese migrant schools. Journal of Public Economics, 111, 29–45.*


List, J. A., Livingston, J. A., & Neckermann, S. (2012). Harnessing complementarities in the education production function. Chicago, IL: University of Chicago.*


McEwan, P. J. (2015). Improving learning in primary schools of developing countries: A meta-analysis of randomized experiments. Review of Educational Research, 85(3).



Miller, C., Riccio, J., Verma, N., Nunez, S., Dechausay, N., & Yang, E. (2015). Testing a conditional cash transfer program in the U.S.: The effects of the Family Rewards program in New York City. Journal of Labor Policy, 4, 1–29.*


National Research Council (2011). Incentives and test-based accountability in education. Washington, DC: Author.


Nieminen, P., Lehtiniemi, H., Vähäkangas, K., Huusko, A., & Rautio, A. (2013). Standardised regression coefficient as an effect size index in summarizing findings in epidemiological studies. Epidemiology Biostatistics and Public Health, 10, 1–15.


O’Neil, H.F., Abedi, J., Miyoshi, J., & Mastergeorge, A. (2005). Monetary incentives for low-stakes tests. Educational Assessment, 10, 185–208.


Prothero, A. (2017, October 17). Does paying kids to do well in school actually work? Education Week. Retrieved from http://www.edweek.org.


Raymond, M. (2008). Paying for A’s: An early exploration of student reward and incentive programs in charter schools. Stanford, CA: CREDO.


Ringquist, E.J. (2013). Meta-analysis for public management and policy. San Francisco, CA: Jossey-Bass.


Rosenthal, R. (1979). The ‘file-drawer problem’ and tolerance for null results. Psychological Bulletin, 86, 638–641.


Rouse, C. (1998). Private school vouchers and student achievement: An evaluation of the Milwaukee Parental Choice Program. Quarterly Journal of Economics, 113, 553–602.


Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55, 68–78.


Sadoff, S. (2014). The role of experimentation in education policy. Oxford Review of Economic Policy, 30, 597–620.


Scammacca, N., Roberts, G., & Stuebing, K. K (2014). Meta-analysis with complex research designs: Dealing with dependence from multiple measures and multiple group comparisons. Review of Educational Research, 84(3), 328–364.


Scheie, E. (2014, August 20). Cities experiment with paying poor students for good grades. World Journalism Institute. Retrieved from https://world.wng.org.


Sharma, D. (2010). The impact of financial incentives on academic achievement and household behavior: Evidence from a randomized trial. Columbus, OH: The Ohio State University.*


Slavin, R. E. (2010). Can financial incentives enhance educational outcomes? Evidence from international experiments. Educational Research Review, 5, 68–80.


Stanley, T. D., Jarrell, S. B., & Doucouliagos, H. (2010) Could it be better to discard 90% of the data? A statistical paradox. The American Statistician, 64, 70–77.


Stolovitch, H. D., Clark, R. E., & Condly, S. J. (2002). Incentives, motivation, and workplace performance: Research and best practices. McLean, VA: International Society for Performance Improvement and The Incentive Research Foundation.


Tanner-Smith, E. E., & Tipton, E. (2014). Robust variance estimation with dependent effect sizes: Practical considerations including a software tutorial in Stata and SPSS. Research Synthesis Methods, 5(1), 13–30.


Tipton, E. (2015). Small sample adjustments for robust variance estimation with meta-regression. Psychological Methods, 20(3), 375–393.


Tipton, E. & Pustejovsky, J. (2015). Small-sample adjustments for tests of moderators and model fit using robust variance estimation in meta-regression. Journal of Educational and Behavioral Statistics, 40(6), 604–634.


Turner, R. M., Bird, S. M., & Higgins, J. P. T. (2013) The impact of study size on meta-analyses: Examination of underpowered studies in Cochrane Reviews. PLoS ONE, 8(3), e59202.


Valentine, J. C., Pigott, T. D., & Rothstein, H. R. (2010). How many studies do you need? A primer on statistical power for meta-analysis. Journal of Educational and Behavioral Statistics, 35(2), 215–247.


Visaria, S., Dehejia, R., Chao, M. M., & Mukhopadhyay, A. (2016). Unintended consequences of rewards for student attendance: Results from a field experiment in Indian classrooms. Economics of Education Review, 54, 173–184.


von Hippel, P. T. (2015). The heterogeneity statistic I2 can be biased in small meta-analyses. BMC Medical Research Methodology, 15(35), 1–8.


Wallace, B. D. (2009). Do economic rewards work? District Administration, 45, 24–27.


White, R. (2012). The effectiveness of an incentivized program to increase daily fruit and vegetable dietary intake by low-income, middle-aged women. Unpublished thesis. Mankato, MN: Minnesota State University, Mankato.


Wigfield, A., & Cambria, J. (2000). Expectancy-value theory: Retrospective and prospective. In T. Urdan & S. A. Karabenck (Eds.), The decade ahead: Theoretical perspectives on motivation and achievement advances in motivation and achievement (pp. 35–70). Bingley, UK: Emerald Group Publishing Limited.


Willingham, D. T. (2008). Should learning be its own reward? American Educator, 31, 29–35.


Wilson, S. J., Tanner-Smith, E. E., Lipsey, M. W., Steinka-Fry, K., & Morrison, J. (2011). Dropout prevention and intervention programs: Effects on school completion and dropout among school-aged children and youth. Campbell Systematic Reviews, 8.


Yeh, S. (2010). The cost effectiveness of 22 approaches for raising student achievement. Journal of Education Finance, 36(1), 38–75.


Zamarro, G., Hitt, C., & Mendez, I. (2016). Reexamining international differences in achievement and non-cognitive skills: When students don't care. Little Rock, AR: University of Arkansas.



APPENDIX A

CHARACTERISTICS OF THE STUDIES INCLUDED IN THE META-ANALYSIS



Study Name

Published

Randomized

Location

Grade Levels

Incentive Structure a

Sample Size

Angrist & Lavy (2009)

Yes

Yes

Israel

10th–12th

-- NIS500 for taking any Bagrut component test

-- NIS1,500 for passing the component tests before senior year

-- NIS 6,000 for any senior who received a Bagrut

-- NIS 10,000 to a student who passed all the achievement milestones

-- 8,257 treatment students in 20 schools

-- 8,215 control students in 20 schools

Barrera-Osorio & Filmer (2016)

Yes

Yes

Cambodia

4th–6th

-- Scholarships equivalent to $20 were provided for staying enrolled in school, regular school attendance, and maintaining passing grades

-- Two treatment conditions:

1) Targeted students based on poverty

2) Targeted students based on merit

-- Poverty condition:

431 control and 451 treatment students

-- Merit condition: 474 control and 466 treatment students

-- 52 schools in merit condition, 51 schools in poverty condition, and 104 control schools

Behrman et al. (2015)

Yes

Yes

Mexico

10th–12th

--Three treatment conditions:

1) Students only: incentives range from 2500 to 15,000 pesos

2) Teachers only: incentives range from 0 to 750 pesos

3) Students, teachers, and administrators: A combination of the incentive structures described above; in addition, incentives are provided based on the average performance of the class (for students) and the average performance of other students in the school (for teachers and administrators)

-- 88 total schools: 20 schools in each of the three treatment conditions and 28 control schools

-- 582 students in the control group; 632 students in the students only group; 609 students in the teachers only group; 550 students in the students, teachers, and administrator group

Berry (2015)
Published: Yes; Randomized: No; Location: India; Grade Levels: 1st–3rd

-- Incentives were based on children’s individual mastery of literacy objectives

-- Six treatment groups:

a) Four treatment groups varied along two dimensions: whether the parent or the child received the award, and the form of the award (money, toy, or voucher). The money award consisted of 100 rupees, the toy award consisted of a toy valued at 100 rupees, and the voucher award consisted of a voucher redeemable for a toy

b) The two remaining groups offered the parent a choice between money for themselves or the toy, either upon program announcement or conditional on reaching the goal (referred to as the ex-ante choice and post-ante choice, respectively)

-- 161 students in the control group

-- 150 students in the parent money group

-- 150 students in the child money group

-- 151 students in the child toy group

-- 145 students in the voucher group

-- 151 students in the ex-ante choice

-- 151 students in the post-ante choice

Berry, Kim, & Son (2017)
Published: No; Randomized: Yes; Location: Malawi; Grade Levels: 5th–8th

-- Students were awarded MWK 4500 under one of two incentive conditions:

a) In the standard scholarship condition, scorers in the top 15% were rewarded

b) In the relative scholarship program, students were grouped according to initial score, and rewards were awarded to the top 15% performers in each group

-- 2830 students in 46 classrooms for the standard scholarship condition

-- 2994 students in 43 classrooms for the relative scholarship condition

-- 1562 students in 30 classrooms for the control group

-- 31 schools participated in the study

Bettinger (2012)
Published: Yes; Randomized: Yes; Location: Ohio; Grade Levels: 3rd–6th

-- $15 for scoring at least proficient on each of five subject area tests

-- $20 for scoring advanced or proficient on each of five subject area tests

-- 1527 students total, of whom 893 were treatment students

-- Participants were from 48 grade-school combinations

Blimpo (2014)
Published: Yes; Randomized: Yes; Location: Benin; Grade Levels: 10th

-- Three treatment conditions:

1) Individual incentive: between 5,000 and 20,000 Francs CFA for a passing score and an honors-level score, respectively

2) Team incentive: between 20,000 and 80,000 Francs CFA for a passing score and an honors-level score, respectively

3) Tournament: teams ranking among the top 3 by average score received 320,000 Francs CFA

-- A total of 1476 students, of whom 423 were control students, 347 were in the individual incentive group, 367 were in the team incentive group, and 339 were in the tournament incentive group

-- A total of 100 schools, of which 28 were control schools, 22 were assigned to the individual incentives, 22 were assigned to the team incentives, and 28 were assigned to the tournament incentives

Burgess et al. (2016)
Published: No; Randomized: Yes; Location: U.K.; Grade Levels: 10th–11th

-- Two incentive conditions:

1) Up to £80 per half-term for attendance, behavior, classwork, and homework

2) Nonfinancial incentives included the chance to qualify for a high-value event (e.g., a visit to an amusement park or to the national soccer stadium)

-- Up to 7636 financial incentive students in 27 schools

-- Up to 8836 non-financial incentive students in 27 schools

Fryer (2011)
Published: Yes; Randomized: Yes; Locations/Grade Levels: Dallas (2nd), Chicago (9th), New York (4th and 7th)

-- Dallas: Students were paid $2 per book read, conditional on passing a quiz on the book with at least 80% accuracy

-- Chicago: Students paid $50 for an A, $35 for a B, and $20 for a C for grades obtained in 5 core subjects

-- New York (elementary): Students were given $5 for completing interim tests in mathematics and reading, and earned $25 for a perfect score, for up to $250 per year

-- New York (secondary): Students were given $10 for completing interim tests in mathematics and reading, and earned $50 for a perfect score, for up to $500 per year

-- Dallas: 1777 treatment students in 21 schools; 1941 control students in 21 schools

-- Chicago: 70 schools opted to participate; 3275 treatment students in 20 treatment schools; 4380 control students in the remaining schools

-- New York: 121 schools opted to participate and 63 chosen for the treatment; Treatment group consisted of 3348 4th grade and 4605 7th grade students; Control group consisted of 3234 4th grade and 4696 7th grade students

Fryer, Devi, & Holden (2016) b
Published: No; Randomized: Yes; Locations/Grade Levels: Washington, DC (6th–8th), Houston (5th)

-- Washington, DC: Students were given one point every day for satisfying each of five metrics (attendance, behavior, tests, and two other inputs chosen by the schools). Students were paid $2 per point

-- Houston: Students received $2 per math objective mastered, as indicated by passing a computerized test. Students who mastered 200 objectives received a $100 completion bonus.

-- Parents received $2 for each objective their child mastered and $20 per parent-teacher conference attended

-- Teachers earned $6 per parent-teacher conference held and up to $10,100 in performance bonuses for student achievement on standardized tests

-- Washington: 3377 treatment students in 17 schools and 2485 control students in 17 schools

-- Houston: 1554 treatment students in 25 schools and 1613 control students in 25 schools

Gneezy et al. (2017)
Published: No; Randomized: Yes; Location: U.S. and Shanghai; Grade Levels: 10th

-- Incentives were framed as a loss

-- U.S.: Students were given an envelope with 25 $1 bills, and were informed that for every test question skipped or missed, $1 would be taken away

-- Shanghai: Students were provided an envelope with 89.25 RMB, and 3.6 RMB was taken away for each incorrect or missed test item

-- Two high schools in the U.S. and three high schools in Shanghai participated

-- U.S. sample: 447 students total, 220 in the treatment group and 227 in the control group

-- Shanghai sample: 280 students total, 139 in the treatment group and 141 in the control group

Hirshleifer (2017)
Published: No; Randomized: Yes; Location: India; Grade Levels: 4th–6th

-- Two treatment conditions:

a) Incentivized input condition in which students earned rewards for a combination of reaching mastery and for answering questions correctly as they completed modules

b) Incentivized output condition in which students earned rewards for answering test questions correctly after the test was completed at the end of the unit

-- Under both conditions, students could earn a maximum of 2000 points, or 200 rupees, but students in the input condition needed to answer more questions to earn the same number of points

-- 3218 total students across 45 classrooms

Jackson (2010)
Published: Yes; Randomized: No; Location: Texas; Grade Levels: 11th–12th

-- AP teachers received between $100 and $500 for each AP qualifying score earned by a high school junior or senior enrolled in their course

-- Teachers could also receive discretionary bonuses of up to $1000

-- Students received between $100 and $500 for each qualifying score

-- For the 40 schools that adopted the incentive program, the average enrollment was 1731 students

-- For the 18 schools that served as the control group, the average enrollment was 2068 students

-- Analysis was conducted on 5888 students

Jackson (2014)
Published: Yes; Randomized: No; Location: Texas; Grade Levels: 11th–12th

-- AP teachers received between $100 and $500 for each AP qualifying score earned by a high school junior or senior enrolled in their course

-- Teachers could also receive discretionary bonuses of up to $1000

-- Students received between $100 and $500 for each qualifying score

-- For the 58 schools that adopted the incentive program, the average enrollment was 1778 students from 1993-1999 and 1837 students from 2000-2005

-- For the 1413 schools that served as the control group, the average enrollment was 717 students from 1993-1999 and 752 students from 2000-2005

-- Analysis was conducted on a total of 290,343 students

Kremer et al. (2009)
Published: Yes; Randomized: Yes; Location: Kenya; Grade Levels: 6th

-- Scholarships awarded to the top 15% of grade 6 girls in the treatment schools

-- The scholarship provided each winning girl's school with a grant of US$6.40 (KSh 500) and the recipient herself with a grant of US$12.80 (KSh 1,000)

-- 34 treatment schools and 35 control schools in the first district; 30 treatment schools and 28 control schools in the second district

-- Across the two cohorts, 1077 treatment students and 1029 control students in the first district, and 755 treatment students and 741 control students in the second district

Levitt, List, Neckermann, & Sadoff (2016)
Published: Yes; Randomized: Yes; Location: Chicago; Grade Levels: 2nd–8th, 10th

-- Financial incentives of $10 or $20, or a nonfinancial incentive (i.e., a trophy)

-- Students were paid if they improved on their score from a previous testing session

-- Rewards could be framed as a loss and could either be paid immediately or delayed

-- Study was conducted in three sites:

a) 666 tenth-grade students in one high school randomized at the class level

b) 343 third- through eighth-grade students in 7 elementary and middle schools randomized at the school-grade level

c) 4790 second- through eighth-grade students in 26 elementary/middle schools randomized at the school-grade level

Levitt, List, & Sadoff (2016)
Published: No; Randomized: Yes; Location: Chicago; Grade Levels: 9th

-- Two incentive conditions:

1) Students could earn $50 a month for 8 months over the school year for attendance, behavior, grades, and standardized test scores

2) Parents received the cash reward if their child met the specified performance standards for attendance, behavior, grades, and test scores

-- Incentives were structured either as a fixed rate (i.e., recipients were awarded $50 per month) or as a lottery (i.e., recipients had a 10% chance of winning $500 each month), resulting in four incentive programs

-- Two high schools in a single district

-- 175 control students, 750 total treatment students

-- 186 treatment students assigned to the student fixed, 185 assigned to the parent fixed, 189 to the student lottery, 190 to the parent lottery

Li et al. (2014)
Published: Yes; Randomized: Yes; Location: China; Grade Levels: 3rd–6th

-- Two treatment conditions:

1) Cash payments for test performance for individual students

-- 100 RMB was awarded to the lower achieving student who achieved the greatest increase in test scores between the baseline test and the evaluation test. The second and third place runners-up were promised 50 RMB each

2) Cash incentives for peer tutoring

-- Top students from each class were paired with a lower achieving peer. To encourage peer tutoring, the pair in which the lower achieving student showed the largest test score gains received 100 RMB for each member. Second and third place pairs received 50 RMB for each member

-- 23 schools participated in the study; 11 schools assigned to the peer incentive group, 12 to the individual incentive, and 23 assigned to the control

-- Randomization occurred at the class level; There were 44 peer incentive classrooms, 47 individual incentive classrooms, and 35 control classrooms

-- 1710 students enrolled in the peer incentive, 1789 in the individual incentive, and 1351 in the control group

-- 371 treated students in the peer incentive group and 411 treated students in the individual incentive group

List, Livingston, & Neckermann (2012)
Published: No; Randomized: Yes; Location: Chicago; Grade Levels: 3rd–8th

-- Five treatments: incentive to student only; incentive to tutor only; incentive to parent only; incentive to students and parents only; incentive to students, parents, and tutors

-- Total of $90 in incentives paid in each treatment

-- Student and tutor incentives involved students meeting performance standards for attendance, behavior, grades, and test scores

-- Parent incentives required parents to assist with a tutor-assigned homework assignment

-- Analysis was conducted on 547 students in nine elementary and middle schools

Miller et al. (2015)
Published: Yes; Randomized: Yes; Location: New York City; Grade Levels: 4th, 7th, 9th

-- The Family Rewards program offered families cash rewards for 22 education- and health-related behaviors

-- Students were incentivized for school attendance, obtaining a library card, obtaining a proficient score on the state tests, passing the Regents exams, earning 11+ credits per year, taking the PSAT, and graduating from high school

-- Parents were incentivized for attending parent-teacher meetings, discussing test results with the teacher, and reviewing low-stakes tests

-- Education-related incentives ranged from $25 to $600, depending on the activity/behavior

-- Analysis conducted on 1726 4th grade students and 1670 7th grade students

Sharma (2010)
Published: No; Randomized: Yes; Location: Nepal; Grade Levels: 8th

-- Cash rewards were based on students’ average aggregate scores in each of the two semester exams and the end-of-the-year district level exams

-- Students were paid linearly in proportion to their test scores. If students passed all their courses, they received 5 rupees per percent. If they failed any course, they received 2.5 rupees per percent

-- 33 schools participated, of which 11 were assigned to the treatment group

-- Analysis conducted on 2624 students

Note: A study may have included a separate condition that incentivized teachers or parents. Unless teachers or parents were incentivized in conjunction with students, the effect sizes from these treatment conditions were omitted from the analysis.






Cite This Article as: Teachers College Record Volume 122 Number 3, 2020, p. 1–34
https://www.tcrecord.org ID Number: 23009

About the Author
  • Vi-Nhuan Le
    University of Chicago
    VI-NHUAN LE is a senior research scientist with NORC at the University of Chicago. Her research interests are in early childhood education, program evaluation, and math and science reform. Her recent publication includes “Advanced content coverage at kindergarten: Are there trade-offs between academic achievement and social-emotional skills?” published in American Educational Research Journal.
 