
A Call for Consensus in the Use of Student Socioeconomic Status Measures in Cross-National Research using the Trends in International Mathematics and Science Study (TIMSS)


by Amita Chudgar , Thomas F. Luschei & Loris Fagioli - June 16, 2014

The objectives of this research note are to: (1) illustrate variability in approaches to capture student Socioeconomic Status (SES) in current cross-national educational literature using Trends in International Mathematics and Science Study (TIMSS); (2) demonstrate that the choices researchers make about SES measures have important consequences for their conclusions about relationships between student performance and school resources; and (3) invite a conversation among researchers using cross-national data (especially TIMSS) that will lead to greater consensus as to how to measure student SES in educational research.

TIMSS is the world’s most comprehensive source of multi-year, cross-national data on student performance in science and mathematics. The TIMSS data, which provide nationally representative information on fourth and eighth grade students in dozens of school systems, offer tremendous potential to explore mathematics and science achievement across diverse contexts. From a policy perspective, researchers can use TIMSS data—which also contain information on teachers and other school resources—to investigate school and teacher variables associated with national and cross-national variation in mathematics and science achievement. To do so, researchers must account for the socioeconomic status (SES) of students as accurately as possible, to reduce bias resulting from unmeasured relationships between school resources and student SES (Buchmann, 2002; Coleman, 1966, 1975). Yet educational research often lacks a systematic conceptualization of SES (Harwell & LeBeau, 2010; Sirin, 2005). Although researchers have proposed a “tripartite” measure of family SES comprising parental education, parental occupation, and family income (e.g., Buchmann, 2002; Sirin, 20051), these variables are often missing in classroom-based surveys like TIMSS.


Lack of extensive SES measures in TIMSS and other surveys creates a tension between the desire for a conceptually grounded measure of SES and the realities of what is available in the data. In many cases, this tension is resolved through ad hoc conceptualizations of SES. As Harwell and LeBeau argue, SES often seems to be “defined solely by the variable regarded as capturing this information” (Harwell and LeBeau, 2010, p. 121). This task is further complicated in cross-national educational research, due to “the lack of social class measures that are culturally relevant to the particular society or community being studied” (Fuller & Clarke, 1994, p. 167). Within this backdrop, what guidelines are available to researchers wishing to account for SES in the TIMSS data? Unfortunately, current research offers few clear directions.


The objectives of this research note are to: (a) illustrate variability in approaches to capture student SES in current cross-national educational literature using TIMSS; (b) demonstrate that the choices researchers make about SES measures have important consequences for their conclusions about relationships between student performance and school resources; and (c) invite a conversation among researchers using cross-national data (especially TIMSS) that will lead to greater consensus as to how to measure student SES in educational research. To provide a model for the type of effort needed to reach such a consensus, we discuss the recommendations of a panel convened to address the measurement of student SES in the United States’ National Assessment of Educational Progress (NAEP).


WHY IS CONSENSUS IMPORTANT AND WHAT IS NEEDED TO REACH A CONSENSUS?


Student SES is a critical consideration in virtually all educational research. As Harwell and LeBeau (2010) note, “more than nine decades of research” (p. 120) document a relationship between student SES and student achievement. A task force of the American Psychological Association observed, “socioeconomic factors and social class are fundamental determinants of human functioning across the life span, including development, well-being, and physical and mental health” (APA, 2007, p. 1). The practical importance of accounting for SES is echoed in a widely cited report published by the Economic Policy Institute (Carnoy & Rothstein, 2013). The authors of this report argue that reporting on international differences in student achievement does not adequately address student SES. In their analysis of data from the Programme for International Student Assessment (PISA), the authors find that, after adjusting for social composition and sampling error, the United States increases its rank out of 34 members of the Organization for Economic Cooperation and Development (OECD) from 14th to 4th in reading and from 25th to 10th in math (Carnoy & Rothstein, 2013). In other words, to understand the United States’ relative educational underperformance, we must pay much greater attention to cross-country differences in student SES.2


Yet even in the United States, a country with a long history of systematic collection of education data, measuring SES is difficult. For instance, although a great deal of educational research in the US relies on a single measure of student SES (eligibility for the National School Lunch Program, or NSLP), evidence suggests that this measure may be inadequate and possibly even biased (Harwell & LeBeau, 2010).


The panel convened to address the measurement of student SES in the United States’ NAEP made several important observations that apply to both US and cross-national educational research. First, the panel identified the “Big Three” factors of family income, parental education, and parental occupation, as well as the possibility of including neighborhood and school SES, psychological variables, and more subjective measures, such as students’ own beliefs about their SES (Cowan et al., 2012). Second, addressing the debate over whether to use individual measures of SES or to combine them into a composite measure, the NAEP panel argued in favor of a composite measure because of the “simplicity in reporting and avoiding conflicting stories about relationships to achievement.” Despite the advantages of using individual “constituent components” of SES to help policy makers target resources toward specific interventions (Deaton, 2002; Willms, 2003), the NAEP panel concluded that “the advantages of a composite variable generally outweigh the disadvantages” (Cowan et al., 2012, p. 22).
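The composite logic the NAEP panel favors can be illustrated with a minimal sketch. The indicator names and values below are hypothetical, not NAEP’s actual procedure: each constituent measure is z-standardized across students and the standardized values are averaged into a single composite score.

```python
from statistics import mean, pstdev

def composite_ses(records):
    """Combine constituent SES indicators into one composite score.

    Each record is a dict of numeric indicators (here, hypothetical
    parental education, income, and occupational-prestige scores).
    Each indicator is z-standardized across records, then the
    z-scores are averaged per record.
    """
    keys = records[0].keys()
    stats = {}
    for k in keys:
        vals = [r[k] for r in records]
        stats[k] = (mean(vals), pstdev(vals))
    out = []
    for r in records:
        zs = [(r[k] - stats[k][0]) / stats[k][1] for k in keys]
        out.append(sum(zs) / len(zs))
    return out

# Illustrative students, ordered from higher to lower SES.
students = [
    {"parent_ed": 16, "income": 80, "occupation": 70},
    {"parent_ed": 12, "income": 40, "occupation": 40},
    {"parent_ed": 10, "income": 30, "occupation": 35},
]
scores = composite_ses(students)
```

Because each z-scored column is centered at zero, the composite is also centered at zero, which is one reason composites simplify reporting: every student is located relative to the sample average.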


Finally, the NAEP panel discussed the treatment of missing data in constructing SES measures. Although any researcher must address the potential for bias stemming from the use of data that are not missing at random, “dealing with the issue of missing data may be more critical in the case of composite variables compared to single variables … simply because there are more opportunities for data to be missing” (p. 26). Yet according to the panel, the researcher can address this problem through data imputation, as “there are probably no special problems associated with imputing missing data in the case of computing the SES composite” (p. 26). At the same time, the NAEP panel asserts, “further study is necessary to address missing data issues in SES measurement” (p. 26).


The NAEP panel’s observations regarding selecting variables, combining these variables, and treating missing data all apply to cross-national research. Indeed, scholars working on cross-national research have considered similar issues and their implications for some time. For example, Buchmann (2002) noted that in addition to the above considerations, comparative researchers must straddle the “fine line between sensitivity to local context and the concern for comparability across multiple contexts” (p. 168). Comparative scholars must also consider differences across developed and developing countries (Fuller & Clarke, 1994). Together, these challenges have long presented cross-national educational researchers who work with existing datasets like TIMSS with many complex decisions. Yet despite recognizing these challenges, researchers have few guidelines to follow. In the following section, we discuss how this complexity results in significant diversity in researchers’ approaches to which variables to use to measure SES, how to use them, and how to deal with missing data. We also discuss the implications of these choices.


DIVERSITY IN THE USE OF SES MEASURES IN TIMSS LITERATURE


To demonstrate the variation in approaches used to measure SES, we conducted a review of recent, representative literature using the ERIC database. Our criteria included relevance, recency, empirical content, and quality. To ensure relevance and recency, we searched for articles mentioning “TIMSS” published between 2003 and 2013. Our quality criterion was publication in a peer-reviewed journal indexed with an impact factor in the Thomson Reuters Web of Knowledge Journal Citation Reports, Social Sciences Edition 2011. Limiting the search to peer-reviewed articles resulted in 239 articles, 81 of which appeared in indexed journals. We read each of the 81 articles to determine whether TIMSS data were used and whether the author(s) used SES or home background/context variables in the analyses. Eliminating articles that did not use TIMSS data or SES variables resulted in a final sample of 21 articles from a diverse set of journals. The three authors coded each of these articles in relation to the three key decisions highlighted by the NAEP report: (a) variables used to measure SES, (b) method used to combine these variables, and (c) handling of missing data (see Table 1). In reporting these results we identify the journal and year of publication, but not author names. Our intention is not to identify strengths and weaknesses of specific studies; instead, we aim to illustrate the diversity of approaches to measuring the same construct across a group of recently published articles.


Variables selected: As demonstrated in Table 1, there is little consistency, and only a few points of similarity, in how the authors chose and treated SES variables. The two most common variables were parental education (N=17, 81%) and books in the home (N=16, 76%). For parental education, authors used the highest value of either mother’s or father’s education, a combination of both, or mother’s education only. Ten articles used student-reported measures of the availability of possessions in the home. The most frequently used possessions were a computer (N=9, 43%) and a study desk (N=9, 43%), followed by a dictionary (N=6, 29%), a calculator (N=5, 24%), and an Internet connection, which was used by one study.


The majority of articles also included additional measures of home background as identified by the authors. These included country-specific possessions in the home,3 exposure to the test language at home, immigration status, household size, living with both parents, and, less frequently, parent and student expectations and student aspirations.


Combining SES variables: We also found diversity in how researchers used the variables they selected for their models, ranging from simple to more sophisticated approaches. Most studies included each variable individually (N=14, 67%), while some entered certain variables individually but combined the home resource variables into an index (N=4, 19%). Three articles (14%) created composite measures using either simple summation or more sophisticated methods such as factor analysis or confirmatory factor analysis.


Approaches to missing data: The 21 studies varied considerably in how they treated and discussed missing data. Almost half of the studies (N=10, 48%) did not explicitly mention missing data. Of the articles that did address missing data, five employed listwise deletion and three used mean imputation. The others used regression imputation, Full Information Maximum Likelihood, and/or Expectation Maximization.
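To make the contrast between the two most common explicit strategies concrete, the sketch below applies listwise deletion and mean imputation to the same hypothetical records (the variable names `books` and `parent_ed` are illustrative, not the TIMSS item names):

```python
def listwise_delete(rows):
    """Keep only rows with no missing (None) values."""
    return [r for r in rows if None not in r.values()]

def mean_impute(rows):
    """Replace each missing value with the observed mean of its variable."""
    keys = rows[0].keys()
    means = {}
    for k in keys:
        obs = [r[k] for r in rows if r[k] is not None]
        means[k] = sum(obs) / len(obs)
    return [{k: (r[k] if r[k] is not None else means[k]) for k in keys}
            for r in rows]

rows = [
    {"books": 3, "parent_ed": 16},
    {"books": 5, "parent_ed": None},  # e.g., "I don't know" coded as missing
    {"books": 1, "parent_ed": 12},
    {"books": None, "parent_ed": 14},
]
kept = listwise_delete(rows)   # only fully observed rows survive
filled = mean_impute(rows)     # the full sample size is preserved
```

Listwise deletion halves this toy sample, while mean imputation keeps all four cases at the cost of shrinking the variable’s variance — the trade-off at the heart of the approaches tallied above.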


In our assessment, the variability of approaches we found in recent peer-reviewed literature using TIMSS data stems from a lack of guidelines or “best practices” on how to approach the challenge of accounting for student SES using these data.4 To explore the consequences of this lack of consensus on the results and interpretation of cross-national research, we conducted a simple empirical exercise using the TIMSS data. We discuss our results below.


THE IMPLICATIONS OF SES MEASURE INCLUSION ON MODEL FIT AND SAMPLE SIZE


For this exercise we use the 2007 TIMSS 8th grade data from diverse world regions.5 With Fuller and Clarke’s (1994) critique in mind, we selected two countries each from seven world regions: Norway and Sweden from Scandinavia; Italy and England from Western Europe; the Russian Federation and Romania from Eastern Europe; Oman and Qatar from the Gulf Region; Botswana and Ghana from Sub-Saharan Africa; Japan and South Korea from East Asia; and Thailand and Malaysia from Southeast Asia. We also included the United States. According to CIA World Factbook data (2011), this sample is diverse in terms of both income and inequality. GDP per capita at purchasing power parity ranges from $179,000 in Qatar to $2,500 in Ghana, while the Gini index ranges from 63 in Botswana to 23 in Sweden (a higher number indicates greater income inequality).


We conducted a simple step-wise regression analysis to explore the relative importance of variables commonly used to measure SES. The baseline regression (Regression 1) controls for the child’s age, sex, language spoken at home, access to a computer, time spent on subject homework, an index of student confidence in the subject, an index of the perceived value of the subject, an index of the student’s positive affect towards the subject, and an index of the student’s perception of being safe in school.6 All of these indices are provided in the TIMSS data. Each of the next four models adds home background variables available in TIMSS to measure SES. First, we added the number of books in the home (Books) as a continuous variable (Model 1). In Model 2, we added five separate variables indicating the availability of a study desk, a dictionary, a computer, a calculator, and an Internet connection at home (Common). In Model 3, we added four country-specific items (Specific).7 In the final model (Model 4), we added father’s and mother’s education (ParentEd)8 as two separate continuous variables. We added this variable last due to the large amount of missing data in many countries. In preparing the parental education variables, we coded student responses of “I don’t know” as missing. This is arguably a stringent approach: a child who reports that she “does not know” her parent’s education level is in fact providing some information, but it is not immediately evident how to interpret this information. Our decision to code “I don’t know” responses as missing illustrates the many choices that researchers face in working with TIMSS data.
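The stepwise comparison that follows is judged by adjusted R-squared, which penalizes each added predictor so that a new variable must explain enough additional variance to raise the statistic. A small sketch of the adjustment, with illustrative values rather than actual TIMSS results:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors.

    adj R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical stepwise run: each model adds predictors to the baseline.
n = 4000
steps = [
    ("Baseline", 0.240, 9),               # baseline controls only
    ("+ Books", 0.310, 10),               # add books in the home
    ("+ Common possessions", 0.315, 15),  # add five common possession items
]
adj = [adjusted_r2(r2, k=k, n=n) for _, r2, k in steps]
```

With thousands of observations the penalty is mild, so a variable like books in the home, which raises raw R-squared substantially, also raises the adjusted statistic; a block of five possession items that adds little raw explanatory power may barely move it.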


We employed two approaches to account for missing data, one simple and the other more sophisticated. The simple approach amounted to dropping an observation (listwise deletion, with the first plausible value as the dependent variable) if any data were missing. The more sophisticated approach used multiple imputation (MI) regressions to create five imputed datasets using the command ‘ice’ in Stata (Royston, 2009), which allowed us to use all five plausible values as the dependent variable. Coefficients were aggregated according to Rubin (1987) and Harel (2009). All regressions used the appropriate sample weight, TOTWGT. While the first analysis allows us to examine the impact of missing data on sample size, we use the second analysis to focus on the variance explained by the inclusion or exclusion of different independent variables.
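The aggregation step can be sketched as follows. We ran the actual analysis in Stata; this Python fragment, with illustrative coefficient values rather than our TIMSS estimates, shows Rubin’s (1987) rules for pooling one coefficient across m imputed datasets: the pooled estimate is the mean of the m estimates, and its variance combines the average within-imputation variance with the between-imputation variance inflated by (1 + 1/m).

```python
from math import sqrt
from statistics import mean, variance

def rubin_pool(estimates, variances):
    """Pool one coefficient across m imputed datasets (Rubin, 1987).

    Returns (pooled estimate, pooled standard error), where
    total variance = mean within-imputation variance
                     + (1 + 1/m) * between-imputation variance.
    """
    m = len(estimates)
    qbar = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return qbar, sqrt(total)

# Illustrative coefficient and its variance from m = 5 imputed datasets.
est = [12.1, 11.8, 12.4, 12.0, 11.9]
var = [0.25, 0.24, 0.26, 0.25, 0.25]
b, se = rubin_pool(est, var)
```

The between-imputation term is what makes pooled standard errors honest about imputation uncertainty: if the five datasets disagree sharply, the pooled standard error grows accordingly.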


While we conducted the analysis for both mathematics and science, in the interest of space we present only the results for mathematics, in Tables 2 and 3 and Figures 1 and 2. (The results for science are similar and available upon request.) The figures display the same information as the tables but are easier to read at a glance. In Figure 1, while there are differences in magnitude between estimates generated from listwise deletion and MI, the overall patterns in variance explained hold across both approaches.


As expected, in most countries the addition of SES variables across models increases the explained variance in student math performance. Yet the models seem to explain far more variation in student performance in developed countries like Korea or Norway. In developing countries like Ghana or Botswana, as well as in the non-Western contexts of Qatar and Oman, the same sets of variables explain less overall variation.9 The second bar for Model 1 (books in the home) shows the single largest improvement to the baseline adjusted R-squared. Once again the gains vary across countries, with a vast jump in adjusted R-squared in England compared to much lower gains in Botswana, Ghana, or Qatar. Improvements in the model’s explanatory capacity after the inclusion of common and country-specific possessions are generally lower. With some exceptions, we find the smallest changes in explained variance when we include country-specific items (Model 3, fourth bar). The inclusion of parental education (Model 4, fifth bar) appears to add somewhat greater explanatory power, especially in certain countries. This illustrates the importance of variable choices available to researchers. Although a few variables like the number of books in the home may offer “high-yield” opportunities to understand variation in student performance and to account for home background, these variables are not consistently important across diverse contexts.


The right-hand panels of Table 2 and Figure 2 demonstrate the changes in sample size that would occur if researchers did not carefully address missing data. In such cases, statistical programs would by default drop observations with any missing information (i.e., listwise deletion). Including parental education in this manner (Figure 2, Model 4, fifth bar) leads to a drastic reduction in sample size. As seen in the right-hand panel of Table 2, in Model 4 the reduction in sample size from MI to listwise deletion ranges from 23.5% in South Korea to 70.5% in Sweden. The reduction in sample size between the MI regressions and listwise deletion is also substantial in Models 1 to 3, ranging from 3.8% to 33.8%. Of note, 17 of the 21 articles we reviewed included parental education, but the majority of these 17 either used listwise deletion or provided no discussion of their treatment of missing data. Our results suggest that including parental education in combination with listwise deletion can have considerable missing-data consequences.
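The percent-reduction figures in Table 2 follow directly from comparing each listwise-deletion sample to the complete-information MI sample. A one-line sketch, using the US Model 4 values from Tables 2 and 3:

```python
def pct_reduction(n_full, n_listwise):
    """Percent of the full (multiple-imputation) sample lost to listwise deletion."""
    return 100 * (1 - n_listwise / n_full)

# USA, Model 4: 7319 cases under multiple imputation (Table 3),
# 4395 after listwise deletion once parental education is included (Table 2).
loss = pct_reduction(7319, 4395)
```

The same calculation reproduces every parenthesized percentage in the right-hand panel of Table 2.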


The brief exploration above highlights three important points. First, recent peer-reviewed literature using TIMSS lacks consensus on how SES should be operationalized along at least three important dimensions: variable selection, variable combination, and treatment of missing data. Second, these decisions may have an important impact on the results; in particular, the variables selected to measure SES or account for home background may have varying importance across different contexts. Third, addressing missing data is critically important, depending on the choice of variables. While we have not explored different approaches to combining variables, it is quite likely that researchers’ choices in this area will also influence their results.


CONCLUSION


In the United States, the importance of measuring and accounting for SES in educational research has inspired high-profile efforts to come to consensus for future data collection and analysis (e.g., Cowan et al., 2012). As others before us have also shown, key issues to consider in developing this consensus include which measures of SES to include, whether and how to combine these measures, and how to address missing data. In collecting data for large US and cross-national surveys, relevant agencies may need to carefully consider the first of these three issues when deciding which variables to collect information on. Our review of recent published research using TIMSS data finds that the research community also remains divided in its approach to these three issues. In our assessment, the variability of approaches in working with these data is the result of a lack of consensus or best practices on how to account for student SES in cross-national educational research. Our empirical exercise demonstrates that variability in approaches may lead to variability in results and interpretation. There are some excellent recent examples of scholarship that attend to these issues systematically (e.g., Nonoyama-Tarumi, 2008).10 However, these advances are limited, leading us to call for a renewed conversation among researchers to reach consensus regarding how to approach the measurement of student SES in cross-national research using the TIMSS data.


Table 1. A Review of Recent Peer-reviewed Articles Using TIMSS Data to Illustrate Diversity in Use of SES Measures

| # | Journal title, Year | TIMSS Year & Grade | Variables used to measure SES or home background | Method used to combine variables | Missing data |
|---|---|---|---|---|---|
| 1 | American Journal of Evaluation, 2004 | 95, 12th | ParentEd, Books, Calculator, Computer, Study Desk, Dictionary, Internet, Born in country, Language, Household size, Country-specific items | Variables included individually; home possession index created (sum of 16 items) | No discussion |
| 2 | American Journal of Education, 2011 | 99, 8th | ParentEd, Books, Computer, Dictionary, Car, Recreational possessions | Variables included individually | No discussion |
| 3 | Applied Measurement in Education, 2004 | 95, 8th | Mother's Ed, English speaking at home, Mother's expectations for math | Variables included individually | Listwise deletion |
| 4 | British Educational Research Journal, 2008 | 99, 8th | Mother's Ed, Books, Computer, Study Desk, Living with both parents, Immigrant status (parents and students), Use of test language at home | Variables included individually | Regression imputation |
| 5 | Computers & Education, 2012 | 07, 8th | Books, Test language spoken at home | Variables included individually | No discussion |
| 6 | Economics of Education Review, 2005 | 95, 8th | ParentEd, Books, Immigrant status, Living with both parents | Variables included individually | Regression imputation |
| 7 | Economics of Education Review, 2012 | 95/99/03/07, 8th | ParentEd, Books, Living with both parents, Immigrant status (parents and students), Household size | Unclear (variables included individually) | Listwise deletion |
| 8 | Educational Studies in Mathematics, 2012 | 03, 8th | ParentEd, Books, Student's study aspirations | CFA model with one latent factor for SES | No discussion |
| 9 | Intelligence, 2007 | 03, 4th/8th | ParentEd, Computer, Study Desk | Variables included individually | No discussion |
| 10 | Intelligence, 2008 | 95, 4th/8th; 99, 8th; 03, 4th/8th | Books | Variable included individually | No discussion |
| 11 | International Journal of Science Education, 2010 | 03, 4th | Books, Computer, Study Desk | Variables included individually; home resources combined into index of high, medium, and low | Listwise deletion, mean imputation |
| 12 | International Journal of Science and Mathematics Education, 2008 | 99, 8th | ParentEd, Books, Calculator, Computer, Study Desk, Dictionary, Household size, Test language spoken at home | Variables included individually | No discussion |
| 13 | International Journal of Science and Mathematics Education, 2012 | 03, 8th | ParentEd | Variables included individually | Expectation Maximization method |
| 14 | Journal of Educational Computing Research, 2012 | 07, 8th | ParentEd | Variables included individually | Full Information Maximum Likelihood estimation |
| 15 | Journal of Marriage and Family, 2010 | 95, 8th | ParentEd, Books, Calculator, Computer, Study Desk, Dictionary, Household size, Country-specific items | Variables included individually; home possession index created (average) | No discussion |
| 16 | Journal of Research in Science Teaching, 2007 | 99, 8th | ParentEd, Books, Study Desk | Factor analysis | Mean imputation |
| 17 | Oxford Review of Education, 2006 | 95, 4th | Books | Variable included individually | Mean imputation with dummy for missing |
| 18 | School Effectiveness and School Improvement, 2008 | 99, 8th | ParentEd, Books, Calculator, Computer, Study Desk, Dictionary | Unclear (summative index) | No discussion |
| 19 | School Effectiveness and School Improvement, 2007 | 03, 8th | ParentEd, Books, Language, Educational expectations | Variables included individually | No discussion |
| 20 | Science Education, 2012 | 99/03/07, 8th | ParentEd, Books, Calculator, Computer, Study Desk, Dictionary | Variables included individually; index of home resources created | Listwise deletion and Expectation Maximization |
| 21 | Teachers College Record, 2007 | 95, 8th | ParentEd | Variable included individually | Listwise deletion |


Table 2. Change in Model Adjusted R-Squared and Sample Size by Sequential Inclusion of Student Home Background Information, by Country (Listwise Deletion; Dependent Variable: Math Test Score, First Plausible Value)

The first five columns report adjusted R-squared1; the last five report sample size, with the percent reduction relative to the complete-information multiple-imputation sample in parentheses.

| Country | Baseline R² | Model 1 R² | Model 2 R² | Model 3 R² | Model 4 R² | Baseline N | Model 1 N | Model 2 N | Model 3 N | Model 4 N |
|---|---|---|---|---|---|---|---|---|---|---|
| USA | 0.235 | 0.309 | 0.312 | 0.327 | 0.351 | 6949 (5.1) | 6911 (5.6) | 6858 (6.3) | 6801 (7.1) | 4395 (40.0) |
| Sweden | 0.286 | 0.340 | 0.340 | 0.340 | 0.330 | 4132 (18.9) | 4119 (19.2) | 4085 (19.8) | 4025 (21.0) | 1502 (70.5) |
| Norway | 0.315 | 0.362 | 0.365 | 0.364 | 0.366 | 4166 (9.3) | 4153 (9.5) | 4111 (10.5) | 4087 (11.0) | 1682 (63.4) |
| Italy | 0.214 | 0.256 | 0.268 | 0.272 | 0.276 | 3931 (10.8) | 3931 (10.8) | 3931 (10.8) | 3931 (10.8) | 3271 (25.8) |
| England | 0.212 | 0.344 | 0.350 | 0.349 | + | 3645 (8.3) | 3642 (8.3) | 3613 (9.1) | 3611 (9.1) | + |
| Romania | 0.233 | 0.308 | 0.331 | 0.334 | 0.340 | 3558 (15.1) | 3551 (15.3) | 3323 (20.7) | 3290 (21.5) | 2312 (44.8) |
| Russia | 0.264 | 0.283 | 0.300 | 0.305 | 0.357 | 4187 (6.4) | 4185 (6.4) | 4129 (7.7) | 4095 (8.4) | 2915 (34.8) |
| Japan | 0.265 | 0.302 | 0.312 | 0.312 | 0.347 | 4129 (3.7) | 4125 (3.8) | 4087 (4.6) | 4072 (5.0) | 2595 (39.5) |
| Korea | 0.375 | 0.424 | 0.432 | 0.432 | 0.434 | 3985 (6.0) | 3983 (6.1) | 3974 (6.3) | 3969 (6.4) | 3242 (23.5) |
| Thailand | 0.232 | 0.262 | 0.297 | 0.305 | 0.348 | 4965 (8.3) | 4959 (8.4) | 4853 (10.3) | 4792 (11.5) | 3437 (36.5) |
| Malaysia | 0.273 | 0.306 | 0.343 | 0.347 | 0.345 | 4205 (5.8) | 4198 (6.0) | 4119 (7.8) | 4099 (8.2) | 3275 (26.7) |
| Oman | 0.238 | 0.259 | 0.273 | 0.281 | 0.291 | 3962 (16.6) | 3929 (17.2) | 3685 (22.4) | 3599 (24.2) | 2394 (49.6) |
| Qatar | 0.171 | 0.186 | 0.209 | 0.218 | 0.251 | 5995 (16.3) | 5956 (16.8) | 5711 (20.3) | 5681 (20.7) | 4578 (36.1) |
| Botswana | 0.183 | 0.187 | 0.210 | 0.245 | 0.230 | 3262 (22.5) | 3254 (22.7) | 3032 (27.9) | 2988 (29.0) | 1493 (64.5) |
| Ghana | 0.116 | 0.123 | 0.177 | 0.188 | 0.183 | 4017 (24.1) | 3974 (24.9) | 3585 (32.2) | 3500 (33.8) | 2769 (47.7) |

Note: Model 1: Baseline + Books; Model 2: Baseline + Books + Common; Model 3: Baseline + Books + Common + Specific; Model 4: Baseline + Books + Common + ParentEd

1Adjusted R-squared penalizes a model with more variables, and thus is a more conservative estimate of model fit than R-squared

+Parent education not available for England


Table 3. Change in Model Adjusted R-Squared by Sequential Inclusion of Student Home Background Information, and Sample Size, by Country (Multiple Imputation; Dependent Variable: Math Test Score, Five Plausible Values)

The first five columns report adjusted R-squared1; the final column reports the sample size, which is constant across all models under multiple imputation.

| Country | Baseline R² | Model 1 R² | Model 2 R² | Model 3 R² | Model 4 R² | N (all models) |
|---|---|---|---|---|---|---|
| USA | 0.239 | 0.312 | 0.315 | 0.330 | 0.346 | 7319 |
| Sweden | 0.303 | 0.361 | 0.363 | 0.364 | 0.373 | 5095 |
| Norway | 0.326 | 0.376 | 0.381 | 0.382 | 0.388 | 4591 |
| Italy | 0.224 | 0.263 | 0.280 | 0.287 | 0.294 | 4408 |
| England | 0.215 | 0.346 | 0.361 | 0.362 | + | 3973 |
| Romania | 0.241 | 0.309 | 0.347 | 0.360 | 0.370 | 4191 |
| Russia | 0.267 | 0.291 | 0.309 | 0.313 | 0.343 | 4472 |
| Japan | 0.277 | 0.310 | 0.325 | 0.327 | 0.368 | 4286 |
| Korea | 0.381 | 0.432 | 0.445 | 0.446 | 0.454 | 4240 |
| Thailand | 0.229 | 0.257 | 0.287 | 0.296 | 0.304 | 5412 |
| Malaysia | 0.273 | 0.307 | 0.342 | 0.348 | 0.352 | 4466 |
| Oman | 0.257 | 0.273 | 0.306 | 0.321 | 0.321 | 4748 |
| Qatar | 0.198 | 0.214 | 0.258 | 0.268 | 0.295 | 7162 |
| Botswana | 0.197 | 0.200 | 0.237 | 0.267 | 0.269 | 4207 |
| Ghana | 0.139 | 0.141 | 0.210 | 0.227 | 0.238 | 5290 |

Note: Model 1: Baseline + Books; Model 2: Baseline + Books + Common; Model 3: Baseline + Books + Common + Specific; Model 4: Baseline + Books + Common + ParentEd

1Adjusted R-squared penalizes a model with more variables, and thus is a more conservative estimate of model fit than R-squared

Multiple imputation based on five imputations.

+Parent education not available for England


Figure 1. Change in Adjusted R-Squared across Five Regression Models for Math test-score, Listwise Deletion and Multiple Imputation, by Country.



Notes:

1. Y-axis represents adjusted r-square values
2. The top panel provides results from stepwise regressions where listwise deletion was used to account for missing data.
3. The bottom panel provides results from stepwise regressions where multiple imputation was used to account for missing data.
4. The legend key explains how each stepwise regression added additional explanatory variables to explain variation in math performance.

Figure 2. Changes in Sample Size across Five Regression Models for Math test-score, Listwise Deletion and Multiple Imputation, by Country.




Notes:

1. Y-axis represents sample size
2. The last bar for each country represents the sample size that was available in the stepwise regression framework. This sample size remained constant across all the stepwise regressions.
3. The first five bars for each country exhibit how the sample size changed (declined) in the stepwise regression framework when we used the listwise deletion approach to account for missing data.

Notes


1. Sirin (2005), who covers many of the issues we discuss here, offers an excellent and exhaustive review of related US-focused literature.

2. Carnoy and Rothstein’s primary measure of “social class (home) influences” is the number of books in the home. They also use other measures such as mother’s education and an overall index provided by PISA, but their results remain unchanged (Carnoy & Rothstein, 2013, p. 11).

3. Studies have previously acknowledged the importance of country-specific measures (Fuller & Clarke, 1994; May, 2005; Traynor & Raykov, 2013). These studies show that including country-specific items can be valuable but computationally demanding. Country-specific items can improve the validity and reliability of student-level SES scores if used in conjunction with international anchor items that have similar psychometric characteristics (May, 2005). Further, to account for between-country variation, Item Response Theory or weights might be necessary (May, 2005; Traynor & Raykov, 2013).

4. Other widely used cross-national data such as PISA and the Progress in International Reading Literacy Study (PIRLS) have more extensive home background measures, as the first surveys older children and the second conducts home surveys. PIRLS and PISA data provide pre-prepared indices that could reasonably be used as SES controls. However, in a brief review of literature (not reported here) we found that many studies using these data do not use the available indices. We also found broad patterns similar to the TIMSS-based studies discussed here, with limited attention to variable selection and missing data issues.

5. We use the 8th grade data because they include information on parental education. The 4th grade questionnaire does not ask students about their parents’ education, which, although understandable, presents another challenge for measuring student SES in TIMSS.

6. The US data do not include the school safety index. The science data from Romania, Russia and Sweden do not have information on time spent on subject homework, index of student confidence in the subject, index of the perceived value of the subject, and the index of the student’s positive affect towards the subject.

7. In TIMSS 2007, each country chose up to four specific possession items to include in its survey. England, Malaysia, and Qatar gathered information on only three country-specific variables.

8. The data from England did not include any information on father’s or mother’s education. We did not have a third Western European country in the 8th grade sample.

9. These findings have an interesting parallel with Sirin (2005), who after extensive meta-analysis found that in the US, the SES-academic achievement relationship tends to be smaller for minority students compared to White students.

10. The author uses cultural capital and wealth theories to generate a conceptualization of SES, which is then applied to the PISA data. The author finds that this “multidimensional” measure of SES explains greater variation in student performance across study countries.


Acknowledgements

Chudgar and Luschei are equal authors. The author order was determined by a coin toss.

References

Akiba, M., LeTendre, G. K., & Scribner, J. P. (2007). Teacher quality, opportunity gap, and national achievement in 46 countries. Educational Researcher, 36(7), 369–387.


American Psychological Association [APA]. (2007). Report of the APA task force on socioeconomic status. Washington, DC: American Psychological Association.


Buchmann, C. (2002). Measuring family background in international studies of education: Conceptual issues and methodological challenges. In A. C. Porter & A. Gamoran (Eds.), Methodological advances in cross-national surveys of educational achievement (pp. 150–197). Washington, DC: National Academy Press.


Carnoy, M., & Rothstein, R. (2013, January). What do international tests really show about U.S. student performance? Washington, DC: Economic Policy Institute.


Central Intelligence Agency [CIA]. (2011). The World Factbook. Retrieved from https://www.cia.gov/library/publications/the-world-factbook.


Coleman, J. S. (1966). Equality of educational opportunity. Washington, DC: National Center for Educational Statistics.


Coleman, J. S. (1975). Methods and results in the IEA studies of effects of school on learning. Review of Educational Research, 45, 335–386.


Cowan, C. D., Hauser, R. M., Kominski, R. A., Levin, H. M., Lucas, S. R., Morgan, S. L., Spencer, M. B., & Chapman, C. (2012, November). Improving the measurement of socioeconomic status for the National Assessment of Educational Progress: A theoretical foundation (Recommendations for the National Center for Education Statistics). Washington, DC: National Center for Education Statistics.


Deaton, A. (2002). Policy implications of the gradient of health and wealth. Health Affairs, 21(2), 13–30.


Fuller, B. & Clarke, P. (1994). Raising school effects while ignoring culture? Local conditions and the influence of classroom tools, rules, and pedagogy. Review of Educational Research, 64, 119–157.


Harel, O. (2009). The estimation of R2 and adjusted R2 in incomplete data sets using multiple imputation. Journal of Applied Statistics, 36(10), 1109–1118.


Harwell, M., & LeBeau, B. (2010). Student eligibility for a free lunch as an SES measure in education research. Educational Researcher, 39(2), 120–131.


May, H. (2006). A multilevel Bayesian item response theory method for scaling socioeconomic status in international studies of education. Journal of Educational and Behavioral Statistics, 31(1), 63–79.


Nonoyama-Tarumi, Y. (2008). Cross-national estimates of the effects of family background on student achievement: A sensitivity analysis. International Review of Education, 54(1), 57–82.


Royston, P. (2009). Multiple imputation of missing values: further update of ice, with an emphasis on categorical variables. Stata Journal, 9(3), 466–477.


Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.


Sirin, S. R. (2005). Socioeconomic status and academic achievement: A meta-analytic review of research. Review of Educational Research, 75(3), 417–453.


Traynor, A., & Raykov, T. (2013). Household possessions indices as wealth measures: A validity evaluation. Comparative Education Review, 57(4), 662–688.


Willms, J. D. (2003, February). Ten hypotheses about socioeconomic gradients and community differences in children’s developmental outcomes (0-662-89586-X, RH63-1/560-01-03F). Gatineau, Quebec: Human Resources Development Canada.



Articles reviewed for Table 1


Akiba, M. (2008). Predictors of student fear of school violence: A comparative study of eighth graders in 33 countries. School Effectiveness and School Improvement, 19(1), 51–72.


Ammermuller, A., Heijke, H., & Wößmann, L. (2005). Schooling quality in Eastern Europe: Educational production during transition. Economics of Education Review, 24(5), 579–599.


Aypay, A., Erdogan, M., & Sozer, M. A. (2007). Variation among schools on classroom practices in science based on TIMSS-1999 in Turkey. Journal of Research in Science Teaching, 44(10), 1417–1435.


Cho, I. (2012). The effect of teacher-student gender matching: Evidence from OECD countries. Economics of Education Review, 31(3), 54–67.


Dumay, X., & Dupriez, V. (2007). Accounting for class effect using the TIMSS 2003 eighth-grade database: Net effect of group composition, net effect of class process, and joint effect. School Effectiveness and School Improvement, 18(4), 383–408.


Hansson, A. (2012). The meaning of mathematics instruction in multilingual classrooms: Analyzing the importance of responsibility for learning. Educational Studies in Mathematics, 81(1), 103–125.


Heuveline, P., Yang, H., & Timberlake, J. M. (2010). It takes a village (perhaps a nation): Families, states, and educational achievement. Journal of Marriage and Family, 72(5), 1362–1376.


Ismail, N. A., & Awang, H. (2008). Differentials in mathematics achievement among eighth-grade students in Malaysia. International Journal of Science and Mathematics Education, 6(3), 559–571.


Kaya, S., & Rice, D. C. (2010). Multilevel effects of student and classroom factors on elementary science achievement in five countries. International Journal of Science Education, 32(10), 1337–1363.


Lee, J. (2007). Two worlds of private tutoring: The prevalence and causes of after-school mathematics tutoring in Korea and the United States. Teachers College Record, 109(5), 1207–1234.


Leow, C., Marcus, S., Zanutto, E., & Boruch, R. (2004). Effects of advanced course-taking on math and science achievement: Addressing selection bias using propensity scores. American Journal of Evaluation, 25(4), 461–478.


Luyten, H. (2006). An empirical assessment of the absolute effect of schooling: Regression-discontinuity applied to TIMSS-95. Oxford Review of Education, 32(3), 397–429.


Lynn, R., & Mikk, J. (2007). National differences in intelligence and educational attainment. Intelligence, 35(2), 115–121.


Mohammadpour, E. (2012). A multilevel study on trends in Malaysian secondary school students' science achievement and associated school and student predictors. Science Education, 96(6), 1013–1046.


Park, H., Lawson, D., & Williams, H. E. (2012). Relations between technology, parent education, self-confidence, and academic aspiration of Hispanic immigrant students. Journal of Educational Computing Research, 46(3), 255–265.


Pugh, G., & Telhaj, S. (2008). Faith schools, social capital and academic attainment: Evidence from TIMSS-R mathematics scores in Flemish secondary schools. British Educational Research Journal, 34(2), 235–267.


Rindermann, H. (2008). Relevance of education and intelligence at the national level for the economic welfare of people. Intelligence, 36(2), 127–142.


Rodriguez, M. C. (2004). The role of classroom assessment in student performance on TIMSS. Applied Measurement in Education, 17(1), 1–24.


Schmidt, W. H., Cogan, L. S., Houang, R. T., & McKnight, C. C. (2011). Content coverage differences across Districts/States: A persisting challenge for U.S. education policy. American Journal of Education, 117(3), 399–427.


Wang, Z., Osterlind, S. J., & Bergin, D. A. (2012). Building mathematics achievement models in four countries using TIMSS 2003. International Journal of Science and Mathematics Education, 10(5), 1215–1242.


Wiseman, A. W., & Anderson, E. (2012). ICT-integrated education and national innovation systems in the Gulf Cooperation Council (GCC) countries. Computers & Education, 59(2), 607–618.




Cite This Article as: Teachers College Record, Date Published: June 16, 2014. https://www.tcrecord.org, ID Number: 17564.


About the Author
  • Amita Chudgar
    Michigan State University, College of Education
    E-mail Author
    AMITA CHUDGAR is an associate professor at Michigan State University’s College of Education. Her long-term interests as a scholar focus on ensuring that children and adults in resource-constrained environments have equal access to high-quality learning opportunities irrespective of their backgrounds. Her recent publications include: Chudgar, A., Luschei, T. F., & Zhou, Y. (2013). “Science and mathematics achievement and the importance of classroom composition: Multi-country analysis using TIMSS 2007.” American Journal of Education, 119(2), 295–306; and Luschei, T. F., Chudgar, A., & Rew, W. J. (2013). “Exploring differences in the distribution of teacher qualifications in Mexico and South Korea: Evidence from the Teaching and Learning International Survey.” Teachers College Record, 115(5).
  • Thomas Luschei
    Claremont Graduate University
    E-mail Author
    THOMAS F. LUSCHEI is an associate professor in the School of Educational Studies at Claremont Graduate University. His research uses an international and comparative perspective to study the impact and availability of educational resources, particularly high-quality teachers, among economically disadvantaged children. His recent publications include: Chudgar, A., Luschei, T. F., & Zhou, Y. (2013). “Science and mathematics achievement and the importance of classroom composition: Multi-country analysis using TIMSS 2007.” American Journal of Education, 119(2), 295–306; and Luschei, T. F., Chudgar, A., & Rew, W. J. (2013). “Exploring differences in the distribution of teacher qualifications in Mexico and South Korea: Evidence from the Teaching and Learning International Survey.” Teachers College Record, 115(5).
  • Loris Fagioli
    Claremont Graduate University
    E-mail Author
    LORIS P. FAGIOLI is a postdoctoral scholar in the School of Educational Studies at Claremont Graduate University. His research interests center on questions of stratification of educational opportunities. He studies this stratification in college access and choice, student engagement through social media, international comparative topics in education, and value-added measures of school and teacher accountability. His most recent publication is: Fagioli, L. P. (2014). “A comparison between value-added school estimates and currently used metrics of school accountability in California.” Educational Assessment, Evaluation and Accountability.