Data Sharing to Drive the Improvement of Teacher Preparation Programs

by Kevin C. Bastian, C. Kevin Fortner, Alisa Chapman, M. Jayne Fleener, Ellen McIntyre & Linda A. Patriarca - 2016

Background/Context: Teacher preparation programs (TPPs) face increasing pressure from the federal government, states, and accreditation agencies to improve the quality of their practices and graduates, yet they often do not possess enough data to make evidence-based reforms.

Purpose/Objective: This manuscript has four objectives: (a) to present the strengths and shortcomings of accountability-based TPP evaluation systems; (b) to detail the individual-level data being shared with TPPs at public universities in North Carolina; (c) to describe how data sharing can lead to TPP improvement and the challenges that programs will need to overcome; and (d) to detail how three TPPs are using the data for program improvement.

Setting: North Carolina public schools and schools of education at public universities in North Carolina. Importantly, this individual-level data sharing system can be instituted among TPPs in other states.

Population/Participants/Subjects: Teachers initially prepared by public universities in North Carolina.

Research Design: With individual-level data on program graduates, TPPs can conduct a range of analyses—e.g., regression analyses with program data, primary data collection with interviews, and rubric-based observations—designed to aid program improvement efforts.

Conclusions/Recommendations: Teacher preparation programs and researchers or state education agencies need to establish partnerships to share individual-level data on program graduates with TPPs. This individual-level data sharing would help TPPs to develop systems of continuous improvement by examining whether their preparation practices align with the types of environments in which their graduates teach and how graduates’ preparation experiences predict their characteristics and performance as teachers-of-record. Unlike other initiatives targeted at TPP improvement, individual-level data sharing, and its focus on within-program variability, can benefit TPPs at all levels of performance.


In recent years accreditation agencies and policymakers have initiated efforts to both hold teacher preparation programs (TPPs) accountable for the performance of their graduates and push TPPs to make evidence-based reforms. For example, the newly formed Council for the Accreditation of Educator Preparation (CAEP) requires TPPs to demonstrate the impact of their graduates on student learning, classroom instruction, and employer satisfaction and to institute a system of data analysis and continuous improvement (CAEP, 2013). Likewise, the U.S. Department of Education recently announced plans to rate TPPs based on their graduates’ job placement rates, retention rates, and effectiveness, as well as surveys of their graduates and their graduates’ employers (Rich, 2014). While these efforts correctly recognize teachers’ significant effects on student outcomes (Bill and Melinda Gates Foundation, 2013; Nye, Konstantopoulos, & Hedges, 2004) and the importance of teacher preparation to teacher performance (Boyd, Grossman, Lankford, Loeb, & Wyckoff, 2009; Darling-Hammond, Chung, & Frelow, 2002; Goldhaber, Liddle, & Theobald, 2013; Henry et al., 2014), initiatives to hold TPPs accountable for the performance of their graduates often leave an important question unanswered: With what data can TPPs best make evidence-based reforms?

As detailed in a recent National Academy of Education report, evaluations of TPPs typically serve a primary purpose—accountability, providing information to consumers, or program improvement—and the evaluation data required for one purpose may not be well aligned with the evaluation data required for another purpose (Feuer, Floden, Chudowsky, & Ahn, 2013). Many current TPP evaluations, such as estimating the average value-added of a TPP’s graduates (Gansle, Noell, & Burns, 2012; Goldhaber et al., 2013; Henry, Patterson, Campbell, & Pan, 2013) or rating the quality of a TPP’s inputs (National Council on Teacher Quality, 2014), fall into the accountability and/or consumer information categories. When performed well, these evaluation efforts benchmark the performance of a TPP against a reference category or a set of standards and may direct TPPs to look towards high-performing or highly rated TPPs for program improvement ideas.1 However, even with these types of aggregate evaluation data, TPPs are often driving blind, operating without the level of data necessary to guide evidence-based program improvement (Peck, Singer-Gabella, Sloan, & Lin, 2014).

Instead, to initiate systems of continuous improvement, TPPs and researchers or state-level education agencies need to establish partnerships so that TPPs receive individual-level data on the characteristics, work environments, and performance of their graduates as teachers-of-record. Such data could include teachers’ credentials (e.g., National Board Certification status and licensure exam scores), measures of their employment/teaching context (e.g., percentage of school’s students receiving free or reduced-price lunches, students’ average prior test scores, and the percentage of English-language learners taught), and their outcomes (e.g., value-added estimates, evaluation ratings, and retention). Such data sharing partnerships can go well beyond accountability-based evaluations or federal regulations to help TPPs make evidence-based program improvements by examining whether their preparation practices are aligned with the types of school and classroom environments in which their graduates teach—i.e., does the TPP prepare them for the types of schools they will work in and the types of students they will teach—and by exploring how variation in graduates’ preparation experiences explains variation in the characteristics and performance of those graduates when they become teachers. Individual-level data sharing can help TPPs develop the internal capacity for data analysis, determine what additional data measures they should collect to advance program improvement, and create a coordinated and systemic view of teacher education reform (Cochran-Smith & Boston College Evidence Team [BCET], 2009; Peck & McDonald, 2014). Furthermore, unlike other initiatives targeted at the remediation of low-performing TPPs, individual-level data sharing and its focus on within-program variability can benefit TPPs at all levels of performance.

To illustrate the need for such data sharing with TPPs, we begin by detailing the strengths and shortcomings of accountability-based TPP evaluation systems. Responding to the shortcomings of these evaluation efforts, we then discuss the creation of a data sharing initiative. Specifically, we focus on the individual-level data we are sharing with TPPs at public universities in North Carolina, how these data can lead to program improvement, and the obstacles that data sharing must overcome to achieve its potential. While many states share aggregate-level data on teachers’ work environments and/or performance with TPPs,2 to our knowledge this is the only statewide example of individual-level data sharing with TPPs. Finally, to better understand how TPPs can use individual-level data on their graduates to drive evidence-based decision making, deans from three universities—North Carolina State University, the University of North Carolina Charlotte, and East Carolina University—share how they are using the data sharing initiative to guide program improvement.

Overall, TPPs face strong incentives to improve the quality of their preparation practices and, subsequently, the quality of their graduates. Doing so, however, will require more than accountability-based evaluations of TPPs; as a first step, it will require providing TPPs with the resources—the data—to make evidence-based decisions.


Over the last decade, school districts and states, such as New York City, Louisiana, North Carolina, Tennessee, and Washington, have initiated efforts to estimate teachers’ value-added to student achievement and link teachers’ value-added scores to their preparation (Boyd, Grossman, Lankford, Loeb, & Wyckoff, 2006, 2009; Gansle et al., 2012; Goldhaber et al., 2013; Henry et al., 2014; Tennessee State Board of Education [TSBE], 2012, 2013). At a high level, these efforts have asked whether teachers entering the profession through different routes are more or less effective than their peers with other forms of preparation or certification. For example, work by Boyd et al. (2006) in New York City compared the effectiveness of college-recommended teachers with that of teachers entering New York schools through five additional categories; comparable work in North Carolina compared the effectiveness of teachers prepared at in-state public institutions with that of teachers entering the profession through 10 other portals (Henry et al., 2014). More narrowly, these efforts have focused on graduates of individual TPPs and have asked whether they are more or less effective than graduates of other TPPs. For example, research in Louisiana and Washington indicates that there is a substantial degree of overlap in the value-added effectiveness of TPP graduates but that some programs’ graduates significantly outperform their peers from other programs (Gansle et al., 2012; Goldhaber et al., 2013).

Overall, these accountability-based research efforts provide a broad perspective on the effectiveness of teachers with different forms of preparation and allow individual TPPs to both see the effectiveness of their graduates in aggregate and identify particular grade levels or subject areas in which their graduates are high- or low-performing. Further, these accountability-based evaluations document the significant heterogeneity in the effectiveness of novice teachers with the same type of preparation (route or program), suggesting the need for continued research to help explain that variability (Kane, Rockoff, & Staiger, 2008; Koedel, Parsons, Podgursky, & Ehlert, 2015). The benefits of these accountability-based research efforts to TPPs are twofold. First, these studies show TPPs how they fare on outcomes that are of interest to policymakers and the general public; this accountability may encourage or force TPPs to focus on program improvement. Second, these studies may help TPPs become more aware of and ready to use research evidence to inform program decisions and may make TPP leadership and faculty/staff better consumers of research findings. These benefits, in turn, highlight the key weakness of accountability-based TPP evaluations: the inability of such evaluations to formatively drive TPP reforms. For example, current analyses of TPP effectiveness only identify which programs’ graduates are performing well or poorly; the aggregate-level data do not pinpoint why or suggest changes that programs can make to improve performance (Henry, Patterson, et al., 2013). Therefore, while accountability-based TPP evaluations serve an important role, they are not sufficient to inform program-improvement efforts. Instead, TPPs need access to individual-level data on program graduates to help establish systems of continuous improvement and to make evidence-based reforms.


Accountability pressures from policymakers and practitioners have pushed theories of evidence-based reform into a wide variety of fields and professions in recent years (Achenbach, 2005; Estabrooks, 2007; National Research Council, 2002). Across post-secondary institutions, this movement is exemplified by evidence-based tools, such as the Predictive Analytics Reporting (PAR) framework, that help higher-education institutions predict student success and find effective, evidence-based practices to boost student retention (PAR, 2015). In teacher education, this effort is exemplified by Cochran-Smith’s (2005) challenge to build “chains of evidence” linking teacher-education pedagogy and program design with meaningful candidate learning, and the Teachers for a New Era (TNE) initiative, which sought to achieve significant program reform through a respect for evidence (Fallon, 2006). Building from such initiatives and our own TPP evaluation work within North Carolina’s public university system, data sharing represents an important next step in evidence-based reform by providing TPPs the individual-level data they need to connect measures of candidates’ preparation experiences to their characteristics and performance as teachers.

The logic behind individual-level data sharing relies on two key suppositions. The first supposition is that TPPs significantly impact the performance of their graduates and that this impact can be measured by outcomes, such as value-added estimates or evaluation ratings. While there are methodological concerns with value-added (Corcoran & Goldhaber, 2013), and some have noted the limitations of holding TPPs accountable for the effectiveness of their graduates (Floden, 2012), research shows that graduates of certain preparation programs are more effective than those from other programs and that teacher preparation explains a significant portion of the variation in teacher effectiveness (Boyd et al., 2009; Goldhaber et al., 2013). These findings support individual-level data sharing as a way to explore that variation for program improvement.

The second supposition is that TPPs can take research evidence and turn it into specific programmatic reforms. As detailed in subsequent sections, this task requires TPPs to possess research capacity and an organizational commitment to change. Even with research evidence about which programmatic experiences predict variation in the performance of their graduates, programs will not find it easy to (a) identify when and how those experiences are taught or provided in the TPP; (b) institute reforms designed to address program weaknesses or expand program strengths; (c) create short-term indicators to assess whether reforms are having desired effects; and (d) continue to track the outcomes of graduates to determine whether reforms impact teacher effectiveness. Individual-level data sharing does not solve these challenges for TPPs, but it does provide them with evidence to start the process.

Below, we detail the individual-level data being shared with public universities in North Carolina, the theory of change linking data sharing to program improvement, and the obstacles that may prevent data sharing from improving teacher education.


Data sharing is an initiative designed to stimulate a culture of evidence and program reforms by providing TPPs with individual-level data on their graduates as teachers-of-record.

As researchers with a long-standing research relationship with the TPPs at public universities in North Carolina and with the University of North Carolina General Administration (UNCGA), we are sharing individual-level data with these programs in five broad categories: (a) teacher employment; (b) teacher characteristics; (c) classroom characteristics; (d) school characteristics; and (e) teacher outcomes. Specifically, we are providing TPPs at public universities in North Carolina with separate data files per academic year (currently 2005–06 through 2012–13), with each file containing data on all the individuals who were initially prepared to teach by a given TPP and employed as teachers in that academic year. Furthermore, because teachers can work at more than one school in an academic year, files contain observations for each unique teacher-school combination.

Data for this initiative come from two sources: the North Carolina Department of Public Instruction (DPI) and the UNCGA. The DPI provides student test scores and demographics, classroom rosters, school characteristics, teacher credentials, and salary files as part of our broad research agenda focused on teacher preparation and the evaluation of education programs. With these data we can report, at the level of individual teachers, on teacher employment, teacher characteristics, classroom characteristics, school characteristics, and teacher outcomes. The UNCGA provides files identifying those who have received an education degree and/or completed licensure requirements at in-state public universities. With these data we focus the data sharing on those teachers initially prepared to teach by in-state public universities. Here, we note that one limitation of the current initiative is that we can only provide data for teachers working in North Carolina’s traditional public schools; data sharing does not cover teachers working in private schools, charter schools, or out of state.3 Once we have cleaned and prepared these data to create files for each in-state public university, by school year, we provide these files to the UNCGA, which releases these files to each institution once the institution has submitted a plan for data security and research. Below, we detail the data provided in each of these five categories and briefly consider questions TPPs can ask with such data. Table 1 provides a list of the variables we are providing to TPPs; Appendix Table 1 includes a description of the variables.

Table 1. Individual-Level Data Shared with Public University Teacher Preparation Programs in North Carolina

Employment Status: District and school; Number of pay periods; First pay period; Last pay period; Amount of time worked (full-time equivalency status)

Teacher Characteristics: Teaching experience; Graduate-degree status; National Board certification status; Licensure areas; Licensure basis; Exams taken; Exam scores; Teaching a tested grade/subject area

Classroom Characteristics: Number of classes taught; Average class size; Grade level(s) taught; Subject area(s) taught; Race/ethnicity proportions; Free and reduced-price lunch proportions; Gifted proportion; Disabled proportion; Limited English Proficient proportion; Average days absent; Average prior achievement scores; Average prior achievement level

School Characteristics: School size; Percentage free and reduced-price lunch; Short-term suspension rate; Violent acts rate; Race/ethnicity percentages; Total per-pupil expenditures; Per-pupil expenditures in spending categories (e.g., regular instruction); AYP percentage; State accountability status and growth; Performance composite; Teacher credentials (percentage fully licensed, novice, or holding an advanced degree or NBC); Pupil-to-teacher ratio; Teacher-stay ratio

Teacher Outcomes: Returns to the state’s public schools; Returns to the same school; Teacher value-added estimate (across 10 separate subject areas); Quintile for value-added estimate

Note: We are providing TPPs with separate data files for each academic year (beginning in 2005–06), with each file containing data on all the individuals who were initially prepared to teach by a given TPP and employed as teachers in that academic year. Files contain observations for each unique teacher-school combination.

Teacher Employment Data

The variables in this category include the district and school in which a teacher was employed, the number of pay periods that a teacher worked in a given school, the first and last pay periods that a teacher worked in a given school, and how much—in full-time equivalents (FTE)—a teacher worked in a given school and across all schools. With such data TPPs can know (a) whether and how quickly their graduates secure teaching jobs in a state’s public schools, (b) whether their graduates were hired after the start of the school year or exited teaching during the middle of the year, (c) the nature of the employment (full- or part-time), and (d) which districts and schools hire their graduates and whether their graduates work in close proximity to the TPP.
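Because the files hold one observation per teacher-school combination, questions such as (b) and (c) require collapsing rows to the teacher level first. The following is a minimal sketch in Python of that step, not a description of the actual files: the field names (teacher_id, school, fte, first_pay_period, last_pay_period), the sample rows, and the assumption of 12 pay periods per year are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows from one academic-year file: one row per teacher-school
# combination. Field names and values are illustrative assumptions only.
rows = [
    {"teacher_id": "T01", "school": "S100", "fte": 0.5,
     "first_pay_period": 1, "last_pay_period": 12},
    {"teacher_id": "T01", "school": "S200", "fte": 0.5,
     "first_pay_period": 1, "last_pay_period": 12},
    {"teacher_id": "T02", "school": "S100", "fte": 1.0,
     "first_pay_period": 3, "last_pay_period": 12},
    {"teacher_id": "T03", "school": "S300", "fte": 1.0,
     "first_pay_period": 1, "last_pay_period": 6},
]


def summarize_employment(rows, periods_in_year=12):
    """Collapse teacher-school rows to one employment summary per teacher."""
    by_teacher = defaultdict(list)
    for row in rows:
        by_teacher[row["teacher_id"]].append(row)

    summary = {}
    for teacher_id, teacher_rows in by_teacher.items():
        first = min(r["first_pay_period"] for r in teacher_rows)
        last = max(r["last_pay_period"] for r in teacher_rows)
        summary[teacher_id] = {
            "schools": sorted(r["school"] for r in teacher_rows),
            # FTE across all schools where the teacher worked that year.
            "total_fte": sum(r["fte"] for r in teacher_rows),
            # Hired after the first pay period of the year.
            "late_hire": first > 1,
            # Stopped being paid before the (assumed) final pay period.
            "mid_year_exit": last < periods_in_year,
        }
    return summary


employment = summarize_employment(rows)
for teacher_id in sorted(employment):
    print(teacher_id, employment[teacher_id])
```

A program would replace the sample rows with its own file and adjust periods_in_year to match the pay calendar in use.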

Teacher Characteristics

The variables in this category include a teacher’s level of experience, whether a teacher holds a graduate degree or National Board certification, the licensure areas a teacher holds and the basis (e.g., from an in-state preparation program or a reciprocal license from out of state) for those teaching licenses, the tests (e.g., Praxis II Middle School Mathematics, SAT) a teacher has taken and a teacher’s scores on those exams, and whether a teacher teaches in a tested grade/subject. With such data TPPs can know (a) whether and in what areas their graduates have earned additional teaching licenses after graduation, (b) how well their graduates scored on Praxis licensure exams linked to their teacher preparation, (c) whether their graduates have secured additional credentials (graduate degrees or National Board Certification) after graduation, and (d) which of their graduates face accountability pressure as teachers of a tested grade/subject.

Classroom Characteristics

The variables in this category include the number of classes taught by a teacher in an academic year, the average size of those classes, the subjects and grades taught by a teacher in an academic year, the average prior performance of a teacher’s students on End-of-Grade and/or End-of-Course exams, the average number of days absent for a teacher’s students, and the proportion of a teacher’s students who are white, black, Hispanic, American Indian, qualify for free or reduced-price lunches, currently are or were Limited English Proficient, and receive gifted or exceptional children services. With such data TPPs can know (a) the teaching load of their graduates; (b) whether their graduates are teaching in-field or out-of-field and whether the TPP prepared them to teach in their current subject/grade area(s); (c) whether their graduates instruct low-, average-, or high-performing students; and (d) whether their graduates teach classes with high percentages of students who are minority, economically disadvantaged, non-native English speakers, or exceptional.

School Characteristics

The variables in this category include the number of students enrolled at the school, the urbanicity status of a school, measures of a school’s orderliness (the number of suspensions and violent acts), the racial/ethnicity percentages of a school’s students, the percentage of a school’s students qualifying for free or reduced-price lunch, measures of a school’s academic performance (accountability status and growth and the percentage of students passing standardized exams), total per-pupil expenditures and per-pupil expenditures in key spending categories (e.g., regular instruction, special instruction, and school leadership), and measures of teachers’ persistence and credentials at a school (the proportion of teachers who returned from the previous year and who are fully licensed, novice, Nationally Board Certified, or holding a graduate degree). With such data TPPs can know (a) whether their graduates teach in safe and orderly environments; (b) whether their graduates teach in schools with high percentages of minority or economically disadvantaged students; (c) whether their graduates teach in low-, average-, or high-performing schools; (d) the financial resources available in the schools where their graduates teach; and (e) whether their graduates teach in schools with high turnover and with more or less well-credentialed peers.

Teacher Outcomes

The variables in this category include indicators for whether a teacher returns to the state’s public schools in the following year and the same school in the following year, estimates of individual teacher value-added across ten different subjects/grade-levels—elementary mathematics, reading, and science; middle-grade mathematics, reading, and science; and high-school mathematics, English, science, and social studies4—and the quintile for each value-added estimate. With such data TPPs can know (a) the persistence of the teachers they prepare, (b) how effective their graduates are at promoting student achievement gains, and (c) the relative effectiveness of their graduates compared to peers teaching the same level/subject-area.


Teacher preparation programs can use individual-level data on program graduates to leverage program improvement in three ways: (a) conducting research with shared data and indicators of TPP progress and performance, (b) conducting research with shared data and primary data collected by TPP faculty and staff, and (c) improving the capacity of TPP faculty and staff to conduct research and think strategically about data use. Importantly, TPPs can tailor the use of these individual-level data to their particular needs, components, vision, and questions. With individual-level data sharing, TPPs can engage in theory-driven research to determine whether specific program components and/or a program’s definition of quality teaching are supported by empirical evidence. Beyond theory-driven research, TPPs can also mine the individual-level shared data and their internal data to determine what predicts outcomes of interest. Therefore, unlike other initiatives targeted at the remediation of low-performing TPPs, individual-level data sharing, with its focus on within-program variability, can benefit TPPs at all levels of performance. Below, we further describe the processes connecting individual-level data to program improvement; Figure 1 provides a visual depiction of this theory of change, including how individual-level data can foster cycles of continuous improvement.

Figure 1. How individual-level data sharing can lead to teacher preparation program improvement


Research Studies with TPP Data

Teacher preparation programs collect and store a wide range of data on their teacher candidates. For instance, TPPs typically measure candidates’ pre-TPP academic performance (high school GPA, SAT/ACT scores, Praxis I scores, collegiate GPA); courses taken and the sequence of courses (e.g., number of content courses in all disciplines, number of pedagogy courses); university personnel serving in instructor or advisor roles; ratings across dispositional, student-teaching, and performance-assessment instruments; and responses to program exit surveys. Additionally, TPPs have a host of data on teacher-candidate practicum placements (e.g., characteristics of the cooperating teacher and of the student-teaching placement site). In the absence of individual-level data sharing, TPPs can use all of these data formatively to assess relationships with intermediary outcomes, such as Praxis II licensure exam scores or edTPA ratings. Knowing, for example, that characteristics of the student-teaching placement site or certain courses predict these intermediary outcomes gives TPPs opportunities to create tight feedback loops for programmatic and organizational learning. To further program improvement, TPPs can combine these internal data sources with externally provided, individual-level data on their program graduates to examine whether their graduates’ preparation experiences are aligned with the types of schools and classrooms in which they work and how variation in graduates’ programmatic components or performance predicts variation in their outcomes (e.g., entry into or exit from the workforce, earning advanced credentials, and teacher value-added). This type of summative feedback is particularly important because TPPs are held accountable for the performance of their graduates.

Stylized examples of research with internal TPP data and individual-level shared data include the following:

Through analysis of the shared individual-level data and TPP course syllabi, a program may discover that their recent elementary-grades graduates are teaching in classrooms with many English language learners (ELLs) and that these graduates had few opportunities to learn and practice strategies for teaching these students. In response, the TPP could design and require additional learning segments to provide candidates the knowledge and skills to succeed with ELLs.

After examining the relationships between program data and individual teacher value-added, a TPP may find that, on average, those graduates who received additional hours of instructional coaching during student teaching are more effective than others without such an experience. Importantly, the TPP could also use other indicators in their internal data to examine and rule out competing hypotheses for this result—e.g., that the higher value-added was due to these graduates’ higher GPAs. If the TPP could sufficiently isolate the relationship between more instructional coaching during student teaching and teacher value-added, then the program could feel confident in providing additional instructional coaching to all student teachers.
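The second stylized example, in which a program checks whether the coaching relationship survives once GPA is taken into account, can be sketched as a small regression. The Python code below is an illustration under stated assumptions, not the analysis any program actually ran: the data are synthetic and generated from a known linear relationship so the result can be checked, the variable names (coaching_hours, gpa, value_added) are hypothetical, and the ordinary least squares solver is a plain normal-equations implementation chosen only to keep the example self-contained.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations apply to both sides.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(predictors, outcome):
    """Return least-squares coefficients [intercept, b1, b2, ...]."""
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic graduates (made up for illustration): coaching hours received
# during student teaching and collegiate GPA.
coaching_hours = [0, 5, 10, 0, 10, 5]
gpa = [3.0, 3.2, 3.8, 3.6, 3.1, 3.9]
# Outcome built from a known rule so the fitted coefficients can be verified.
value_added = [-0.9 + 0.02 * h + 0.25 * g for h, g in zip(coaching_hours, gpa)]

# Unadjusted association between coaching and value-added.
naive = ols([[h] for h in coaching_hours], value_added)
# Association after adjusting for GPA, the competing explanation.
adjusted = ols(list(zip(coaching_hours, gpa)), value_added)

print("coaching slope, unadjusted:", round(naive[1], 4))
print("coaching slope, GPA-adjusted:", round(adjusted[1], 4))
```

A coaching coefficient that remains after adjustment is still an association rather than proof of a causal effect; a real analysis would also need standard errors and attention to how graduates were selected into additional coaching.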

Research Studies with Primary Data Collection

In addition to their extant administrative data, TPPs can use the individual-level shared data as the impetus for their own primary data-collection initiatives to better understand the performance and perspectives of their graduates. These primary data-collection efforts could focus on classroom observations (e.g., general protocols, such as the Framework for Teaching [Danielson, 2013], or content-specific protocols, such as Mathematical Quality of Instruction), interviews and focus groups, or teacher surveys. As a stylized example, if a TPP wanted to assess why some of their middle-grades mathematics graduates generated significantly larger student achievement gains than other middle-grades graduates, the TPP could (a) use the individual-level shared data to identify their graduates in the top and bottom quintiles of effectiveness;5 (b) observe those teachers with a classroom observation protocol (in which the observer is blind to the teacher’s prior effectiveness); and (c) administer surveys to examine these graduates’ perceptions of preparation quality (Hill, Kapitula, & Umland, 2011). Analyses of such data may reveal that a TPP’s highly effective middle-grades mathematics graduates better engage their students in meaning-making and mathematical reasoning and more clearly articulate mathematical ideas. With this evidence, TPP faculty and staff could then engage in work to (a) identify when and how these specific features were or were not addressed in the program, (b) institute program reforms to address the concerns, (c) develop/administer instruments to determine whether teacher candidates improve their knowledge and skills in these areas (short-term feedback), and (d) continue to analyze the individual-level shared data to see whether subsequent graduating cohorts are more effective as classroom teachers.
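Steps (a) and (b) of the stylized example, selecting graduates in the top and bottom quintiles and keeping observers blind to effectiveness, could be organized along the following lines. This Python sketch rests on several assumptions: the record layout and identifiers are invented, the quintile field stands in for the value-added quintile variable in the shared files, and the coding of 1 as the lowest quintile and 5 as the highest is assumed rather than documented.

```python
import random

# Hypothetical records; "quintile" stands in for the value-added quintile
# variable in the shared files (assumed coding: 1 = lowest, 5 = highest).
graduates = [
    {"teacher_id": "T01", "subject": "middle_math", "quintile": 5},
    {"teacher_id": "T02", "subject": "middle_math", "quintile": 1},
    {"teacher_id": "T03", "subject": "middle_math", "quintile": 3},
    {"teacher_id": "T04", "subject": "elementary_reading", "quintile": 5},
    {"teacher_id": "T05", "subject": "middle_math", "quintile": 1},
    {"teacher_id": "T06", "subject": "middle_math", "quintile": 5},
]


def build_blinded_sample(records, subject, seed=0):
    """Select top- and bottom-quintile teachers and hide the group label."""
    selected = [r for r in records
                if r["subject"] == subject and r["quintile"] in (1, 5)]
    # Shuffle so the order of visits does not reveal group membership.
    rng = random.Random(seed)
    rng.shuffle(selected)
    # Observers receive identifiers only; the analyst retains the key.
    observer_list = [r["teacher_id"] for r in selected]
    analyst_key = {r["teacher_id"]: r["quintile"] for r in selected}
    return observer_list, analyst_key


observer_list, analyst_key = build_blinded_sample(graduates, "middle_math")
print("Observation schedule (no quintile shown):", observer_list)
```

Keeping the key with someone other than the observers is the design choice that matters here; the observation ratings are joined back to quintiles only after scoring is complete.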

Improving TPP Capacity

Beyond the direct support of research, individual-level data sharing can also lead to program improvement by helping TPP faculty and staff improve their capacity to conduct research and use research evidence strategically. Quite simply, if TPPs are invested in evidence-based reform, the provision of individual-level data should give TPP faculty and staff more opportunities to (a) develop research questions; (b) determine the required data, sample, and analytical methods to answer those questions; (c) interpret results; and (d) consider beneficial programmatic changes in response to research findings. As a result of this capacity-building, TPPs can create and strengthen a culture of evidence and a coordinated, systemic view of TPP reform (Peck & McDonald, 2014).


While individual-level data sharing has the potential to inform evidence-based program improvement, we acknowledge that a variety of research-based and organizational challenges may make it difficult to turn data into appropriate programmatic reforms. Below, we detail these challenges and introduce ways that TPPs can meet these obstacles.

Research-Based Challenges

Unless TPPs possess the internal capacity to conduct rigorous research analyses, they cannot fully leverage individual-level data to make evidence-based reforms. Here, internal capacity starts with TPPs collecting measures of teacher-candidate progress and performance that have predictive validity, that is, measures associated with teachers’ performance after they begin teaching (Henry, Campbell, et al., 2013). This requirement may be problematic for TPPs, because many of the measures they currently collect are meant to determine whether teacher candidates meet a competency threshold rather than to distinguish among levels of teacher-candidate performance and, therefore, do not have the variation needed for analyses. Without such measures, it may be challenging for TPPs to identify programmatic components in need of reform. For TPPs that do not have these measures, however, individual-level data sharing can help determine that current data instruments are not predictive and push these programs to develop and begin using additional measures (Henry, Campbell, et al., 2013).

Beyond data measures, individual-level data sharing requires that TPPs have a robust data-management system that allows them to connect the program-level data they collect on teacher candidates to the characteristics and outcomes data shared by researchers or state education officials. This means that TPPs need (a) a data-management platform that stores measures of teacher-candidate progress and performance over a number of years; (b) a unique identification number for candidates and graduates, to connect separate elements of program-level data to externally provided data; and (c) protocols established to securely handle sensitive information.
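The linkage requirement in (b) can be illustrated with a minimal sketch. It assumes a hypothetical `candidate_id` field that appears in both the program's own records and the externally shared file; the record types, field names, and values below are invented for illustration and do not describe the actual North Carolina data files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgramRecord:
    candidate_id: str            # hypothetical unique ID assigned by the TPP
    cohort_year: int
    student_teaching_score: float


@dataclass(frozen=True)
class SharedRecord:
    candidate_id: str            # the same ID, carried in the shared file
    value_added: float
    school_poverty_pct: float


def link_records(program, shared):
    """Join program records to shared records on candidate_id.

    Returns (linked, unmatched_ids). `linked` is a list of
    (ProgramRecord, SharedRecord) pairs; `unmatched_ids` lists program
    candidates with no shared record, for example graduates who did not
    take a teaching job covered by the shared data.
    """
    shared_by_id = {}
    for rec in shared:
        if rec.candidate_id in shared_by_id:
            # A duplicate ID would make the link ambiguous, so fail loudly.
            raise ValueError(f"duplicate shared record for {rec.candidate_id}")
        shared_by_id[rec.candidate_id] = rec

    linked, unmatched_ids = [], []
    for rec in program:
        match = shared_by_id.get(rec.candidate_id)
        if match is None:
            unmatched_ids.append(rec.candidate_id)
        else:
            linked.append((rec, match))
    return linked, unmatched_ids


# Made-up records, for illustration only.
program = [
    ProgramRecord("A001", 2012, 3.4),
    ProgramRecord("A002", 2012, 2.9),
    ProgramRecord("A003", 2013, 3.8),
]
shared = [
    SharedRecord("A001", 0.12, 64.0),
    SharedRecord("A003", -0.05, 41.5),
]

linked, unmatched_ids = link_records(program, shared)
print(len(linked), unmatched_ids)  # 2 ['A002']
```

Reporting the unmatched IDs alongside the linked pairs makes it easy to check how much of a cohort the shared data actually cover before any analysis begins.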

With such data structures in place, the next capacity concern is the extent of TPP faculty expertise in conducting rigorous research analyses. At many TPPs, the primary focus of faculty is preparing teacher candidates, and only recently has a stronger research focus developed. This means that there may be only a small number of faculty members with the ability and interest required to effectively analyze the shared data, and, as a result, the timeliness and breadth of research may be limited. To address these capacity concerns, TPPs can collaborate with researchers from other schools/departments of their respective institutions, with researchers at other institutions, or with external evaluators and research firms. These partnerships would be particularly valuable when the external researchers possess both methodological capacity and knowledge of teacher preparation. All these collaborations would be aided by governmental or philanthropic financial support that TPPs could direct toward outside researchers or toward hiring research coaches who would work with TPPs to create and improve data systems, develop research agendas, analyze data, and, most importantly, build the internal capacity of TPPs to conduct analyses independently. These research coaches would be particularly important to small TPPs that have less technological and human capacity to manage their internal data and the shared data and to use both data sources for program improvement.

The last research-based challenge concerns the small size of many TPPs and whether there is sufficient statistical power to detect significant differences in outcomes for program graduates. Quite simply, insufficient statistical power may limit the ability of TPPs to make evidence-based reforms because the evidence does not meet a threshold (statistical significance) for taking action. In response to this concern, TPPs can increase sample size by pooling individual-level data from multiple graduating cohorts or, when feasible, pooling data with other TPPs that are conducting similar analyses. More broadly, TPPs can reevaluate standards for what makes research evidence actionable. Correlations or regression coefficients with p values below conventional thresholds (such as 0.05), which limit the likelihood of Type I errors, provide the strongest basis for evidence-based reform; however, to minimize the possibility of Type II errors and respond to findings that suggest a practically significant relationship, TPPs can relax standards for designating research findings as actionable. Correlations or regression coefficients that are practically, but not statistically, significant can inform program improvement. Likewise, qualitative data gathered through focus groups, interviews, or case studies can provide important perspectives on teacher-preparation practices that need to be amended. While there must be continued scrutiny to reduce the likelihood that TPPs make programmatic changes that are not supported by evidence, relaxing standards for action will also reduce the likelihood that TPPs miss out on promising opportunities for reform.
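To make the pooling argument concrete, the sketch below approximates how many graduates are needed to detect a correlation of a given size, and therefore how many cohorts a small program might need to pool. It uses the standard Fisher z approximation for a two-sided test of a correlation; the function names, the default significance level and power, and the cohort size of 40 are illustrative assumptions, not figures from this study.

```python
import math
from statistics import NormalDist


def required_n(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation of size r.

    Uses the Fisher z approximation for a two-sided test:
    n = ((z_{1 - alpha/2} + z_{power}) / atanh(|r|))^2 + 3
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3)


def cohorts_to_pool(r, graduates_per_cohort, alpha=0.05, power=0.80):
    """Number of graduating cohorts to pool to reach required_n.

    graduates_per_cohort should count only graduates with linked
    outcome data (an assumption of this sketch).
    """
    if graduates_per_cohort < 1:
        raise ValueError("graduates_per_cohort must be at least 1")
    return math.ceil(required_n(r, alpha, power) / graduates_per_cohort)


print(required_n(0.2))           # 194
print(cohorts_to_pool(0.2, 40))  # 5
```

Under these assumptions, a program with about 40 linked graduates per year would need roughly five pooled cohorts to detect a modest correlation of 0.2, which illustrates why single-cohort analyses at small TPPs will often fall short of statistical significance.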

Organizational Challenges

Even with the research capacity to leverage individual-level data, TPPs cannot make evidence-based reforms without creating or supporting a “culture of evidence” amongst faculty, supervisors, and staff (Peck, Gallucci, Sloan, & Lippincott, 2009; Peck & McDonald, 2014). Essentially, TPPs have to establish collective values and institutional policies that recognize the importance of individual-level data (acquiring, analyzing, and using it for decision-making) and shift the conception of program reform away from disconnected changes made by single faculty members to coordinated and systemic efforts to improve recruitment/selection, curricular, and clinical practices in response to research evidence (Cochran-Smith & BCET, 2009; Peck, Gallucci, & Sloan, 2010; Peck & McDonald, 2014). For example, if a TPP analyzes the individual-level shared data and finds that graduates scoring higher on certain edTPA rubrics have higher value-added estimates, then TPP faculty and staff must come together to (a) identify when and how the knowledge and skills captured by those edTPA rubrics are taught in the TPP, (b) determine the programmatic differences that exist between those with higher and lower edTPA scores, (c) use this information to institute reforms—often at multiple instances within the program—designed to improve programmatic components, and (d) conduct follow-up research to assess whether changes led to higher edTPA scores and/or higher teacher value-added. Cultivating this culture of evidence and inquiry is not straightforward; it requires TPP faculty and staff to have an interest in program improvement, to view the shared data as valid and relevant to their practice, and to invest considerable time and effort in conducting research and considering the implications of research evidence (Peck & McDonald, 2014). 
Building this culture of evidence may change both the work of TPP faculty and staff and how they view that work: as part of a larger, collective enterprise to improve the preparation of teacher candidates.


In the sections below, College of Education deans from three North Carolina public universities share the research agenda they are pursuing with the individual-level data. While beneficial as stand-alone descriptions of evidence-based program reform efforts, these perspectives can also serve a broader purpose as templates for other TPPs considering programmatic changes. All of the program improvement initiatives described below are in their initial stages; future research needs to assess what TPPs learned from this research and the extent to which these efforts impacted program performance.


STEM (science, technology, engineering, and mathematics) education is an area of emphasis at North Carolina State University (NCSU), and preparing STEM teachers with strong backgrounds in content and pedagogy is central to NCSU’s mission to support a STEM teacher pipeline. Sustaining this pipeline is particularly important because outstanding STEM teachers are a key to preparing and motivating K–12 students to pursue post-secondary STEM opportunities (President’s Council of Advisors on Science and Technology, 2010).

To strengthen STEM teacher-education programs, NCSU is using the individual-level data to conduct drill-down studies examining the school placements of STEM graduates, the courses STEM graduates go on to teach, and the performance of elementary STEM graduates across STEM and non-STEM (English language arts) subjects. Specifically, NCSU is addressing the following sets of questions:

In comparison to state averages and non-STEM graduates, what are the characteristics of the schools in which NCSU’s STEM graduates teach?

What types of courses do NCSU’s STEM graduates teach? Are they teaching advanced courses, such as calculus and physics, or regular courses, such as algebra and biology?

Do NCSU’s elementary-education graduates have higher levels of content knowledge (as measured by licensure exams)? Are NCSU’s elementary-education graduates more effective mathematics and science teachers? Does NCSU’s STEM focus in elementary education compromise the performance of their graduates in English language arts?

The first question helps NCSU better align its coursework and student-teaching placements with the types of students and schools STEM graduates encounter and allows NCSU to create a closer partnership with its most outstanding STEM graduates. The second question assists NCSU in aligning and setting content-area requirements and identifies which STEM graduates—those who were higher or lower performing as teacher candidates—teach a tested grade/subject-area and have value-added data. Finally, NCSU recently created a STEM-focused elementary-education program with high levels of STEM content-area requirements. The final set of questions allows NCSU to know whether this content focus produces graduates with higher levels of content knowledge, graduates who are more effective mathematics and science teachers, and graduates who are also effective in non-STEM subjects. Such data will help NCSU make informed decisions about the direction of its new elementary-education program.


The University of North Carolina Charlotte (UNCC) is a large urban research institution with an explicit mission to prepare teachers for urban environments as well as the surrounding rural and suburban school districts, with a focus on equity, excellence, and engagement with the community. UNCC’s desire to use the individual-level data is driven by studies showing that the value-added of its elementary-education graduates falls in the low or middle range among institutions in the state’s public university system.

To assess whether the college is fulfilling its mission and to understand why UNCC’s elementary-program graduates scored lower than expected, researchers at UNCC are employing the shared individual-level data to answer the following groups of questions:

Are UNCC’s graduates more likely to teach in urban settings than graduates of in-state public universities who do not have the same mission? How long do the teachers in urban settings stay in those settings? How effective are the teachers serving high-poverty populations? How effective are teachers who serve large populations of ELLs?

Are UNCC’s elementary-program graduates’ value-added scores predicted by entry characteristics (high school GPA, SAT scores, disposition)?

How do scores on mathematics and reading content-licensure exams predict UNCC’s elementary graduates’ value-added scores? How do elementary graduates’ course-taking patterns predict value-added scores? How do scores on key assignments during professional preparation predict elementary graduates’ value-added scores?

The first group of questions assesses how well UNCC is addressing its stated mission of preparing professionals for challenging environments. Findings could have implications for reexamining the mission or program components to better meet the mission. The second set of questions assesses the relationship between candidate content knowledge or human capital and resulting student achievement, which could have implications for candidate recruitment and selection. Finally, the third set of questions assesses the relationship among the candidates’ content knowledge, value-added scores, and program features. Answers to these questions will carry key implications for program faculty as they grapple with how to better prepare elementary-education candidates.


In recent years East Carolina University (ECU), a pilot institution for edTPA in the state and a recipient of a U.S. Department of Education Teacher Quality Partnership grant, has made significant efforts to implement evidence-based program reforms and evaluate the efficacy of these program revisions. To further this commitment to continuous improvement, ECU is using the shared individual-level data to pursue the following research questions:

How much variance is in the value-added scores of ECU’s graduates?

What is the relationship between ECU graduates’ value-added scores and the following: entry characteristics (e.g., GPA, test scores); academic major/concentration; the personnel training graduates (e.g., instructors, clinical teachers, university supervisors); the number, type, and length of graduates’ clinical practice opportunities; the number, type, and scores of graduates’ formative and summative program assessments; and the four-year GPA and Praxis II scores of graduates?

What are the patterns in ECU graduates’ attrition and changes in position (e.g., changing grades/subject areas or schools/districts)?

The first question, assessing the variance in graduates’ effectiveness, is a key consideration for ECU (and TPPs generally) because if there is a large spread between the more and less effective program graduates, then ECU must address tough questions, such as: are program assessments valid and reliable, how consistently do faculty and staff monitor and grade candidate knowledge and skills, and how rigorous are the standards for demonstrating basic competency during student teaching? The second set of questions assesses the relationship between graduates’ effectiveness and indicators of candidates’ progress and performance. The goal of these analyses is to find patterns in the data that will inform program innovation. Finally, given that teacher mobility may adversely impact students, schools, and the teachers themselves—due to an inability to establish a collaborative and supportive group of practice, inconsistencies and gaps in induction/mentoring, and the lack of experience teaching a particular grade/subject area—answers to the third question will provide insight into the types of support needed during teachers’ induction period and identify gaps in ECU’s preparation that may be contributing to graduates’ early-career struggles.
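As a rough illustration of the first question, the sketch below summarizes the spread in graduates' value-added by averaging each graduate's estimates across years, computing the variance, and comparing the top and bottom quintiles. The function names, the two-year minimum, and the quintile comparison are illustrative assumptions rather than a description of ECU's analysis.

```python
from statistics import mean, quantiles, variance


def average_value_added(yearly, min_years=2):
    """Average each graduate's annual value-added estimates.

    yearly maps a graduate ID to a list of annual estimates. Graduates
    with fewer than min_years estimates are dropped, since single-year
    estimates are noisier (the threshold is an assumption of this sketch).
    """
    return {
        grad: mean(scores)
        for grad, scores in yearly.items()
        if len(scores) >= min_years
    }


def spread_summary(averages):
    """Variance of averaged value-added plus a top/bottom quintile gap."""
    values = list(averages.values())
    if len(values) < 5:
        raise ValueError("need at least five graduates to form quintiles")
    cuts = quantiles(values, n=5)        # four cut points between quintiles
    low_cut, high_cut = cuts[0], cuts[-1]
    bottom = sorted(g for g, v in averages.items() if v <= low_cut)
    top = sorted(g for g, v in averages.items() if v >= high_cut)
    return {
        "variance": variance(values),    # sample variance
        "bottom_quintile": bottom,
        "top_quintile": top,
        # Guard against an empty tail in very small or heavily tied samples.
        "gap": (mean(averages[g] for g in top)
                - mean(averages[g] for g in bottom)) if top and bottom else None,
    }


# Made-up estimates for ten hypothetical graduates, two years each.
yearly = {f"G{i}": [i / 10, i / 10] for i in range(10)}
summary = spread_summary(average_value_added(yearly))
print(summary["bottom_quintile"], summary["top_quintile"])
```

A large gap between the top and bottom quintiles, relative to the overall variance, would signal the kind of within-program spread that prompts the questions about assessment validity and grading consistency raised above.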


The status quo for many TPPs is either a lack of access to data on program graduates or access to aggregate-level data measures only. These aggregate data are sufficient for program accountability efforts—such as identifying high- and low-performing programs based on graduates’ value-added estimates—and have helped TPPs initiate evidence-based program reforms. However, these data do not provide TPPs with the level of information necessary to pinpoint why graduates of certain programs are effective or what programmatic components should be altered to benefit program performance. With individual-level data on program graduates, TPPs can assess whether graduates’ preparation experiences are aligned with the types of schools and classrooms in which they work and how variation in graduates’ programmatic components predicts teacher outcomes. Without such data, TPPs may implement program revisions with no indication of whether or not those changes will improve the performance of program graduates. In this way, individual-level data sharing represents an improvement on the current status quo in teacher education by helping TPPs move beyond trial-and-error reforms and toward a more coordinated and systemic approach to program improvement. Essentially, individual-level data sharing allows TPPs at all levels of performance an opportunity to analyze their successes and shortcomings and exercise greater agency in the program improvement process. Furthermore, individual-level data sharing represents a generalizable initiative. Although this effort in North Carolina is aided by the public university system and the quality of K–12 data in the state, other states and groups or systems of TPPs can establish similar initiatives. As more states build PK–20 data systems, it may be easier for TPPs to acquire individual-level data on their program graduates.

While sharing individual-level data with TPPs represents a novel contribution to the field of teacher education, there are challenges to its effective use and opportunities for its improvement. These challenges include the capacity of TPPs to conduct rigorous quantitative research and the organizational commitment of TPPs to engage in coordinated and systemic efforts to improve programmatic components in response to research evidence. Given sufficient resources—financial support from government agencies, teacher education groups, and/or philanthropic organizations to hire external researchers and research coaches and build up independent research capacity, the organization of inter-institutional research collaborations, and sustained leadership from deans and department heads of TPPs—we contend that TPPs can overcome these challenges and use shared data to inform program-improvement efforts. Furthermore, we acknowledge that this initial iteration of individual-level data sharing with the public universities of North Carolina can be strengthened by providing more granular measures of teaching, such as evaluation ratings on state teaching standards, so that TPPs can better assess the relationship between programmatic components and teaching practices.

Overall, individual-level data sharing represents a promising initiative to improve the quality of preparation practices and graduates. We call for the establishment of partnerships between TPPs and researchers/state education agencies and the sharing of individual-level data with TPPs.


We wish to thank the University of North Carolina General Administration for its ongoing financial support and the deans and department heads from the colleges, schools, and departments of education at the 15 UNC system institutions engaged in teacher education for their valuable feedback and collaboration.


1. For example, TPPs performing at average or below-average levels, based on the value-added of their graduates, can look to TPPs with highly effective graduates to try to identify and replicate promising preparation practices.

2. In North Carolina, for example, the Department of Public Instruction provides TPPs with aggregated data on the average evaluation ratings and average value-added estimates of their recent program graduates.

3. Across institutions in the public universities of North Carolina, approximately 66 to 80 percent of initially prepared teachers secure teaching jobs in the state’s public schools. This gives us confidence that results from research with shared data are generalizable to the whole TPP.

4. We estimate individual teacher value-added using a three-level (student, teacher, and school) hierarchical linear model (HLM) with a rich set of student, teacher/classroom, and school covariates. In this model the teacher-effectiveness estimate is the random effect from the second (teacher) level of analysis. While there are concerns with individual teacher value-added estimates—the validity and reliability of estimates—research work on simulated and actual student achievement data indicates that the three-level HLM, in comparison to other value-added approaches, accurately ranks and classifies teacher value-added and has high levels of year-to-year reliability (Rose, Henry, & Lauen, 2012).

5. Due to the potential for bias and measurement error in individual teacher value-added estimates, TPPs should use multiple years of student test-score data to identify graduates in the top and bottom quintiles of effectiveness.


Achenbach, T.M. (2005). Advancing assessment of children and adolescents: Commentary on evidence-based assessment of child and adolescent disorders. Journal of Clinical Child and Adolescent Psychology, 34(3), 541–547.

Bill and Melinda Gates Foundation. (2013). Ensuring fair and reliable measures of effective teaching: Culminating findings from the MET project’s three-year study. Retrieved from

Boyd, D., Grossman, P., Lankford, H., Loeb, S., & Wyckoff, J. (2006). How changes in entry requirements alter the teacher workforce and affect student achievement. Education Finance and Policy, 1(2), 176–216.

Boyd, D., Grossman, P., Lankford, H., Loeb, S., & Wyckoff, J. (2009). Teacher preparation and student achievement. Educational Evaluation and Policy Analysis, 31(4), 416–440.

Cochran-Smith, M. (2005). Teacher education and the outcomes trap. Journal of Teacher Education, 56(5), 411–417.

Cochran-Smith, M., & the Boston College Evidence Team. (2009). “Re-culturing” teacher education: Inquiry, evidence, and action. Journal of Teacher Education, 60(5), 458–468.

Corcoran, S. & Goldhaber, D. (2013). Value-added and its uses: Where you stand depends on where you sit. Education Finance and Policy, 8(3), 418–434.

Council for the Accreditation of Educator Preparation. (2013). CAEP Accreditation Standards. Retrieved from

Danielson, C. (2013). The framework for teaching evaluation instrument. Retrieved from

Darling-Hammond, L., Chung, R., & Frelow, F. (2002). Variation in teacher preparation: How well do different pathways prepare teachers to teach? Journal of Teacher Education, 53(4), 286–302.

Estabrooks, C. (2007). A program of research on knowledge translation. Nursing Research, 56(4), 4–6.

Fallon, D. (2006). Improving teacher education through a culture of evidence. Paper presented at the sixth annual meeting of the Teacher Education Accreditation Council, Washington, DC. Retrieved from remarks.pdf

Feuer, M.J., Floden, R.E., Chudowsky, N., & Ahn, J. (2013). Evaluation of teacher preparation programs: Purposes, methods, and policy options. Washington, DC: National Academy of Education.

Floden, R.E. (2012). Teacher value-added as a measure of program quality: Interpret with caution. Journal of Teacher Education, 63(5), 356–360.

Gansle, K.A., Noell, G.H., & Burns, J.M. (2012). Do student achievement outcomes differ across teacher preparation programs? An analysis of teacher education in Louisiana. Journal of Teacher Education, 63(5), 304–317.

Goldhaber, D., Liddle, S., & Theobald, R. (2013). The gateway to the profession: Assessing teacher preparation programs based on student achievement. Economics of Education Review, 34(2), 29–44.

Henry, G.T., Campbell, S.L., Thompson, C.L., Patriarca, L.A., Luterbach, K.J., Lys, D.B., & Covington, V. (2013). The predictive validity of measures of teacher candidate programs and performance: Toward an evidence-based approach to teacher preparation. Journal of Teacher Education, 64(5), 439–453.

Henry, G.T., Patterson, K.M., Campbell, S.L., & Yi, P. (2013). UNC teacher quality research: 2013 teacher preparation program effectiveness report. Education Policy Initiative at Carolina. Retrieved from

Henry, G.T., Purtell, K.M., Bastian, K.C., Fortner, C.K., Thompson, C.L., Campbell, S.L., & Patterson, K.M. (2014). The effects of teacher entry portals on student achievement. Journal of Teacher Education, 65(1), 7–23.

Hill, H.C., Kapitula, L., & Umland, K. (2011). A validity argument approach to evaluating teacher value-added scores. American Educational Research Journal, 48(3), 794–831.

Kane, T., Rockoff, J., & Staiger, D. (2008). What does certification tell us about teacher effectiveness? Evidence from New York City. Economics of Education Review, 27(6), 615–631.

Koedel, C., Parsons, E., Podgursky, M., & Ehlert, M. (2015). Teacher preparation programs and teacher quality: Are there real differences across programs? Education Finance and Policy, 10(4), 508–534.

National Council on Teacher Quality. (2014). 2014 teacher prep review: A review of the nation’s teacher preparation programs. Retrieved from

National Research Council. (2002). Scientific research in education. Washington, DC: National Academies Press.

Nye, B., Konstantopoulos, S., & Hedges, L. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26(3), 237–257.

Peck, C.A., Gallucci, C., Sloan, T., & Lippincott, A. (2009). Organizational learning and program renewal in teacher education: A socio-cultural theory of learning, innovation, and change. Educational Research Review, 4(1), 16–25.

Peck, C.A., Gallucci, C., & Sloan, T. (2010). Negotiating implementation of high-stakes performance assessment policies in teacher education: From compliance to inquiry. Journal of Teacher Education, 61(5), 451–463.

Peck, C.A. & McDonald, M.A. (2014). What is a culture of evidence? How do you get one? And . . . should you want one? Teachers College Record, 116, 1–27.

Peck, C.A., Singer-Gabella, M., Sloan, T., & Lin, S. (2014). Driving blind: Why we need standardized performance assessment in teacher education. Journal of Curriculum and Instruction, 8(1), 8–30.

Predictive Analytics Reporting framework. (2015). Retrieved from

President’s Council of Advisors on Science and Technology (PCAST). (2010). Prepare and Inspire: K–12 Education in Science, Technology, Engineering, and Math (STEM) for America’s Future. Retrieved from

Rich, M. (2014, April 25). Obama administration plans new rules to grade teacher training programs. The New York Times. Retrieved from

Rose, R.A., Henry, G.T., & Lauen, D.L. (2012). Comparing value-added models for estimating teacher effectiveness: Technical briefing. Consortium for Education Research and Evaluation—North Carolina. Retrieved from

Tennessee State Board of Education. (2012). 2012 report card on the effectiveness of teacher training programs. Retrieved from

Tennessee State Board of Education. (2013). 2013 report card on the effectiveness of teacher training programs. Retrieved from


Appendix Table 1. Individual-Level Data Shared with TPPs at NC Public Universities




Cite This Article as: Teachers College Record Volume 118 Number 12, 2016, p. 1-29 ID Number: 21651
