Reconciling Data from Different Sources: Practical Realities of Using Mixed Methods to Identify Effective High School Practices
by Thomas M. Smith, Marisa Cannata & Katherine Taylor Haynes — 2016
Background/Context: Mixed methods research conveys multiple advantages to the study of complex phenomena and large organizations or systems. The benefits derive from drawing on the strengths of qualitative methods to answer questions about how and why a phenomenon occurs and those of quantitative methods to examine how often a phenomenon occurs and to establish generalizable, empirical associations between variables and outcomes. Though the literature offers many strategies, designing mixed methods research can be challenging in large-scale projects when trying to balance reliability, validity, and generalizability. By supporting findings with multiple forms of evidence, mixed methods designs lend greater validity than mono-method ones. However, to draw on the comparative advantages of these two paradigms, researchers must grapple with the challenges of working with more than one method.
Focus of Study: This paper discusses the benefits and challenges of collecting and interpreting mixed methods data in a large scale research and development project. Drawing on existing frameworks, we reflect on our strategies of mixed methods design, data collection, and analysis. We discuss the quandaries faced by researchers when discrepant findings emerge.
Research Design: The data come from a large, mixed methods case study focused on the practices that explain why some high schools in large urban districts are particularly effective at serving low income students, minority students, and English language learners. Undertaken in several phases, the work included sequential and concurrent designs. Incorporating a sequential explanatory design element, we first used quantitative data to identify schools in the district that were more and less effective at improving student achievement in English/language arts, mathematics, and science. We then used a combination of interviews, focus groups, surveys, classroom observations, and district administrative data, in a concurrent design, to try to understand what differentiated the most and least effective schools in the district.
Conclusions: Based on our analyses, we provide examples of when mixed methods data converge, when they diverge but are complementary, and when they diverge and introduce a methodological quandary for researchers who must confront seemingly discrepant findings. In so doing, we discuss the tradeoffs between design decisions and their analytic implications as we confronted them, and we suggest ways to balance the methodological demands of complex research studies. Seemingly discrepant findings, while challenging to reconcile, can, when considered for their potential complementarity, lead to a more complete understanding of the phenomena under study.
Keywords: mixed methods, discrepant findings, school reform, case study
Researchers increasingly recognize the comparative benefits of mixed methods. Mixed methods are particularly useful when studying complex phenomena and large organizations or systems. They are also useful when trying to understand nuanced differences in the role of context, practice, and processes. Mixed methods draw on the strengths of qualitative methods in answering questions about how and why a phenomenon occurs (e.g., the history, context, and the enactment of programs and policies), while also capitalizing on the strengths of quantitative methods to examine how often a phenomenon occurs and establish generalizable, empirical associations between variables and outcomes. The combination of these methods can help us to better answer questions about what, when, and how much a certain phenomenon occurs. Drawing on the comparative advantages of these two paradigms, mixed methods designs lend greater validity than mono-method designs because the findings are supported by multiple forms of evidence.
Mixed methods also have the potential to provide greater understanding of complex phenomena in large organizations as independent research projects verify findings or contradict earlier findings and demand further attention. Indeed, recognizing the potential benefits of mixed methods research, there has been rapid growth in the use of a combination of quantitative and qualitative methods (i.e., what Teddlie and Tashakkori, 2006, and others refer to as quasi-mixed designs). Even the Institute of Education Sciences has called for mixing methods in causal studies (i.e., randomized field trials) in order to better understand the conditions under which an intervention is effective.
With the growth in use of mixed methods, researchers planning to use mixed methods designs could benefit from more examples of real-world applications, focusing on both the benefits and challenges of implementing a mixed methods design. For example, despite progress in describing how to combine qualitative and quantitative data, most of the discussion of triangulation design has focused on using the two forms of data to validate each other. While the role of mixed methods in enhancing the validity of inferences is important, it is only one facet of how mixed methods can be used to build understanding of a phenomenon. Further, focusing on enhanced validity as the primary role of mixed methods can raise unsolvable problems when the data suggest discrepant findings. In this paper we draw upon existing frameworks for mixed methods research, and a large mixed methods case study aimed at identifying the practices of effective schools, to discuss the quandaries faced by researchers when discrepant findings emerge. Our data are organized around three points of interface of the qualitative and quantitative data we collected: points of convergence that provide greater validity to the qualitative findings; points of intended divergence, when we used qualitative and quantitative data to understand different aspects of the same topic; and points of unexplained divergence that caused us to look more deeply at the likely meaning of both forms of data.
Each example illustrates the different ways in which qualitative and quantitative data can be related and what we learned about effective practices in high school from using a mixed methods approach. This paper makes a contribution as it operationalizes the challenges of mixed methods research within the context of a large research project. Much of the guidance on conducting mixed methods research is found in books and articles that focus on the methodology itself. This literature includes many examples of how to apply the methodology, but the space for the examples is necessarily brief and often takes the form of ideal types. This paper, due to the grounding in a large research project intended to identify practices of effective high schools, keeps the research purposes in the foreground while illustrating the methodological issues in mixed methods research.
REVIEW OF MIXED METHODS RESEARCH DESIGN
Methodologists have provided multiple conceptual frameworks for mixed methods designs. Greene et al.'s (1989) framework focuses on five distinct purposes for mixed method evaluations: triangulation, complementarity, development, initiation, and expansion. Other researchers have offered more specific guidance for mixing quantitative and qualitative data, including what Teddlie and Tashakkori refer to as mixed models: studies that have two types of questions, data, and interpretations and that are mixed throughout. As with good research practice in general, these frameworks call for research questions to drive the methodological choices, starting in the design phase. The configuration of the mixed methods design would, ideally, be made a priori, paying attention to the strengths of each design in terms of the research questions posed and the analyses planned. For example, Yoshikawa et al. (2008) advocate for the planned use of integrated methods throughout each stage of a mixed methods study, including an iterative, cumulative approach to analysis, rather than designing the analysis strategy after the data have already been collected.
Although the literature suggests a number of different ways to design mixed methods studies, most designs can be broken down into two main categories: concurrent and sequential designs, as shown in Table 1.
Table 1. Different Approaches to Mixed Method Designs
Adapted from Research Design: Qualitative, Quantitative and Mixed Methods Approaches (Third Edition) by J. W. Creswell (2009). Sage Publications, Inc.
Mixed methods designs are differentiated in three main ways: the timing of when qualitative and quantitative data are collected, the relative weight given to the qualitative and quantitative types of data, and the methods of forging connections or interactions between the various forms of data. Mixed methods studies can be designed to be concurrent (collecting both quantitative and qualitative data at the same time) or sequential (collecting and analyzing either qualitative or quantitative data first before collecting the other type). Results from the first type of data can then inform what is collected in the second type of data, usually by making decisions about sampling, research questions, and instruments. For example, sequential explanatory designs first collect quantitative data to identify overall patterns and then use qualitative data to explain and describe those patterns, whereas in sequential exploratory designs, qualitative data are first collected to inform theory or hypothesis generation that is then tested with quantitative data. Sequential designs can be a particularly powerful means to take full advantage of the strengths of both qualitative and quantitative data. However, project timeframes, and the complexities of accessing research sites and participants, can constrain researchers' ability to capitalize on the benefits of these designs. Sequential designs also implicitly weight one type of data as more important than the other, as the first data collected informs design and content decisions for the second data collection. The relative weight given to qualitative or quantitative data is another key dimension in which mixed methods research designs vary (Creswell & Plano Clark, 2010).
Convergent/triangulation designs place equal weight on both forms of data, while embedded and sequential designs place more weight on one type (which one is emphasized depends on the particular design) and transformative designs can give either equal or unequal weight to the different forms of data (Creswell & Plano Clark, 2010). Such transformations occur when quantitative data are "qualitized" into narrative data that can be analyzed qualitatively or qualitative data are "quantitized" into numerical codes that can be represented quantitatively.
How the qualitative and quantitative data interact when analyzed is another key dimension that differentiates mixed methods designs. Some argue that explicitly integrating the data during analysis is what defines a true mixed methods research design (Creswell & Plano Clark, 2010). Yet how to make this work, particularly in triangulation designs in which researchers seek convergence and corroboration of results from different methods of study of the same phenomenon, remains a challenge, and developing more strategies for examining how data converge is a priority for mixed methods researchers (Creswell & Plano Clark, 2007). Comparatively few authors describe the myriad challenges and decisions they face when designing mixed methods studies or collecting and analyzing the data. However, the many advantages of using the two paradigms must be balanced with the challenges of figuring out when and how in the research process to mix the two.
The prototypical question for mixed methods is the extent to which the results converge (Creswell & Plano Clark, 2010). Convergence and triangulation for greater validity are common terms used in many typologies of mixed methods research (Bryman, 2006; Creswell & Plano Clark, 2010; Greene et al., 1989). Yet this focus on triangulation as the method to ensure convergent findings can constrain many of the advantages that mixed methods can provide. What, for example, are researchers to do when the findings are discrepant or contradictory? When this perennial challenge presents itself, the methodological literature on mixed methods offers researchers limited guidance. Further, the emphasis on mixed methods as a source of convergence or triangulation for greater validity overlooks the value of mixed methods research for taking advantage of the nonoverlapping strengths and weaknesses of qualitative and quantitative methods to push for deeper understanding of complex phenomena (Woolley, 2009). Simply using mixed methods for triangulation requires collecting quantitative and qualitative data on the same concepts (Creswell & Plano Clark, 2010), yet the benefits of using both quantitative and qualitative data are maximized when they capture different types of information: qualitative, focused on understanding how and why, and quantitative, focused on patterns and how much. The goal of using both qualitative and quantitative data is not always to get the same picture of a phenomenon with different methods, but to get a fuller, more nuanced picture using multiple perspectives.
This paper builds on the strengths of mixed methods research by drawing out the complexity of phenomena to paint a more comprehensive picture of what makes high schools effective. Our work suggests that discrepant findings, while challenging to reconcile, can point to a more complete understanding of the phenomena under study. This paper draws upon existing frameworks for mixed methods research, and on a large mixed methods case study aimed at identifying practices of effective schools, to discuss the quandaries faced by researchers when discrepant findings emerge.
BACKGROUND ON THE RESEARCH PROJECT
The data for this paper come from a large mixed methods case study focusing on the practices of effective schools. Specifically, this project was designed to identify the combination of essential components (and the corresponding programs, practices, processes, and policies) that explain why some high schools are particularly effective at serving low income students, minority students, and English language learners.
This work included several phases, including both sequential and concurrent aspects of mixed methods designs. The first phase involved intensive data collection in four high schools in one large, urban district to identify the practices that distinguish higher value-added (HVA) high schools from lower value-added (LVA) schools (see VARC, 2014, for a description of how we estimated value added). Incorporating a sequential explanatory design element (1) to our study, we first used quantitative data to identify schools in the district that were more and less effective at improving student achievement by estimating school-level value-added models based on student achievement data for 9th, 10th, and 11th graders in mathematics, science, and English/language arts. Two schools were selected with relatively higher VA results and two schools were selected with relatively lower ones. We then used a combination of interviews, focus groups, surveys, and classroom observations, in a concurrent design, to try to understand what the HVA schools were doing that contributed to their success and distinguished them from the LVA schools in the same district. Table 2 includes demographic information for the four case study schools. In addition, since all four schools were in the same district, many resources and organizational characteristics were similar, with the notable exception that Valley, one of the LVA schools, had recently been identified as a turnaround school, which provided additional resources and a merit pay incentive for teachers.
Table 2. Demographic Characteristics and Performance Indicators of Case Study High Schools
Note. LVA = lower value-added, HVA = higher value-added. The state accountability rating and graduation rate were the most recent data available at the time of school selection. Demographics represent the composition of the schools at the time of our visits (2011-2012). The value-added ranks are derived from 3 years of data of school-level value-added in math, science, and reading. The most recent year was 2010-2011.
The findings from this mixed methods study defined the "design challenge" that guided a collaborative design process to develop school-based innovations, centered on the distinguishing characteristics of HVA and LVA schools identified in the case study work. Identifying the right distinguishing characteristics, as well as why they work in a particular context, is critical to designing innovations that have a chance of success in other sites. This, combined with the need to draw on the complementarity of methods, led us to use a mixed methods approach to bring the strengths of qualitative and quantitative methods to bear.
METHODOLOGY: DATA COLLECTION DESIGN
Our knowledge of the comparative advantages of mixed methods research convinced us that a mixed methods design was most likely to capture the complex phenomena at the school and district levels, as well as the role that context, programs, practices, processes, and policies played in distinguishing between higher and lower value-added high schools.
Our data collection was guided by a framework of eight essential components that emerge from the literature on effective schools at all levels. This framework emphasizes that it is not the adoption of any individual program or practice that leads to school effectiveness, but the integration and alignment of school processes and structures across these eight components. The essential components include learning-centered leadership, rigorous and aligned curriculum, quality instruction, personalized learning connections, culture of learning and professional behavior, connections to external communities, systemic performance accountability, and systemic use of data.
After using value-added analysis (a quantitative method) to identify the schools for the case studies, we implemented a concurrent mixed methods design. This included analysis of quantitative data from schools across the district (including teacher, student, and parent surveys and administrative data on student discipline, attendance, and course-taking patterns).
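To make the school selection step concrete, a highly simplified school-level value-added estimate can be computed as the mean residual from a regression of current achievement on prior achievement. This is only an illustrative sketch, not the VARC models the project actually used (which condition on many more covariates); the school names and scores below are hypothetical.

```python
# Minimal sketch of a school-level value-added estimate: regress current
# achievement on prior achievement, then average residuals by school.
# Simplified stand-in for the VARC models cited in the text; data are hypothetical.

def fit_simple_ols(x, y):
    """Return (intercept, slope) for y = a + b*x by least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def school_value_added(records):
    """records: list of (school, prior_score, current_score) tuples.
    Returns {school: mean residual}, the school's value-added estimate."""
    a, b = fit_simple_ols([r[1] for r in records], [r[2] for r in records])
    sums, counts = {}, {}
    for school, pre, post in records:
        resid = post - (a + b * pre)          # actual minus predicted score
        sums[school] = sums.get(school, 0.0) + resid
        counts[school] = counts.get(school, 0) + 1
    return {s: sums[s] / counts[s] for s in sums}

# Hypothetical students: one school consistently outperforms prediction.
data = [("Lakeside", 50, 58), ("Lakeside", 60, 67), ("Lakeside", 70, 76),
        ("Mountainside", 50, 52), ("Mountainside", 60, 61), ("Mountainside", 70, 70)]
va = school_value_added(data)
```

In this toy example, `va["Lakeside"]` is positive and `va["Mountainside"]` is negative, which is the kind of contrast used to classify schools as HVA or LVA.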
The design also included the collection of additional data that included both quantitative and qualitative data: classroom observations, individual interviews and focus groups, observation of administrative meetings, and shadowing of students throughout the school day. These were collected in the four case study schools in three different waves during the 2011-2012 school year. Focus groups (with students, teachers, student activity leaders, district parent liaisons) and interviews (with principals, assistant principals, guidance counselors, support personnel, teachers, students, and district personnel) were complemented with the collection of school and district artifacts (see Table 3). Data collection primarily focused on 9th and 10th grade students and teachers in English, mathematics, and science, although we balanced this focus with other data from key staff and a cross-section of the school (e.g., teacher focus groups spanned all grades and subject areas) to gain a comprehensive understanding of our schools. We summarize key aspects of the data collection modes below (see Cannata, Taylor Haynes, & Smith, 2013, for a more detailed description).
Note. Teachers and other school personnel may have participated in more than one type of data collection. For example, some individuals may have been interviewed as both a teacher and in their role as a department head or lead content teacher. Similarly, a teacher may have participated in a general teacher focus group in Wave 1 and then the student activity leader focus group in Wave 3 due to his or her role as an athletic coach.
INTERVIEWS AND FOCUS GROUPS
To identify the combination of practices that make some high schools in this urban district particularly successful, we interviewed all principals, assistant principals, guidance counselors, and deans of instruction (when applicable). The instruments were developed around our conceptual framework of eight essential components of effective schools. Six teachers in each of the mathematics, English/language arts (ELA), and science departments were interviewed (and observed) in each school, for a total of 18 teachers per school. All department heads and content area coaches in the three targeted subjects were interviewed, and a sample of other support personnel were also interviewed.
We conducted three types of focus groups. First, teachers who were not sampled for individual interviews were invited to participate in focus groups. Second, we conducted focus groups with students. Students were selected on the basis of grade and course selection patterns. We focused on students in grades 10-12 because of their familiarity with their high schools. Student focus groups were organized to include one focus group of students taking primarily advanced courses, one of students taking primarily general courses, and one of students enrolled primarily in remedial classes or classes for repeaters. Students were identified based upon the convenience of their schedules, with the goal of having a cross section of students in each focus group that was broadly representative demographically of students within that course selection pattern.
As our initial, post-Wave 1 data analysis had highlighted the important role of student extracurricular activities in engaging students, we also conducted focus groups in Wave 3 with teachers and other adults who supervised these activities to learn more about how they were manifested in the school (a form of sequential exploratory design).
CLASSROOM OBSERVATIONS
We observed and videotaped a total of 274 class periods of ELA, math, and science across the four case study schools. The same teachers who participated in the interviews were also observed. We used an observational tool called the CLASS-S to turn the observational data into a quantitative measure that assesses teacher-student interactions in the classroom. We observed and coded the following domains and dimensions using the CLASS-S framework: emotional support (positive climate, negative climate, teacher sensitivity, regard for adolescent perspectives), classroom organization (behavior management, productivity, instructional learning formats), instructional support (content understanding, analysis and problem solving, quality of feedback, and instructional dialogue), and student engagement.
SURVEYS AND ADMINISTRATIVE DATA
We collaborated with the district to add survey items that measure key study constructs into their annual survey cycle and to obtain survey data from students, teachers, and principals relevant to understanding the processes, programs, and practices that might explain school effectiveness. The surveys were administered by the district across all schools in the district, not just in the four case study high schools. The student survey was administered to 9th, 10th, and 11th grade students in November 2011. A total of 10,827 high school students completed surveys, representing approximately 60% of enrolled students. Response rates in the four case study schools ranged from 55% to 77%. The student survey measured the following constructs: academic engagement, personalization, parent press toward academic achievement, peer support for academic achievement, student sense of belonging, student study habits, school-wide future orientation, school climate, disaster preparation, academic press expectations, academic press challenge, student responsibility-participation, student responsibility-school culture, school safety, bullying, and parent connections.
The teacher survey was administered in January 2012. Five hundred and seventy-seven teachers completed the survey across all high schools, for an overall response rate of 44%. Response rates within the four case study schools varied considerably, ranging from 30% to over 60% of teachers. Principal component factor analyses were performed separately on each of the proposed survey constructs. Constructs included: bullying, data use, efficacy, instructional program coherence, personalization-social, school leader instructional support, teacher-principal trust, teacher-teacher trust, supporting quality instruction, systemic performance accountability, supportive and shared leadership, expectations for postsecondary education, personalization-school action, teacher accountability, teacher outreach to parents, teacher-parent trust, and time to collaborate.
For the teacher and student survey data reported below, scales were created for the factors identified above. School-level means for each scale are reported in each table, with significance levels calculated through ordinary least squares (OLS) regression on dummy variables for our case study schools, with the remaining students or teachers in the district (12 other schools) as the comparison group.
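The dummy-variable comparison described above has a simple interpretation: regressing a survey scale on a 0/1 indicator for one case study school yields an intercept equal to the comparison group's mean and a coefficient equal to that school's mean difference. A minimal sketch, with entirely hypothetical scale scores (the construct name is illustrative only):

```python
# Sketch of an OLS regression on a single school dummy variable. For a
# binary regressor, the intercept equals the comparison-group mean and the
# coefficient equals the school-vs-comparison mean difference. Hypothetical data.

def ols_dummy(dummies, scores):
    """OLS of scores on a 0/1 dummy; returns (intercept, coefficient)."""
    n = len(dummies)
    mx, my = sum(dummies) / n, sum(scores) / n
    b = sum((d - mx) * (s - my) for d, s in zip(dummies, scores)) / \
        sum((d - mx) ** 2 for d in dummies)
    return my - b * mx, b

# Hypothetical teacher-trust scale: 1 = case study school, 0 = rest of district.
dummy = [1, 1, 1, 0, 0, 0, 0]
scores = [4.0, 4.5, 4.1, 3.0, 3.2, 2.8, 3.0]
intercept, coef = ols_dummy(dummy, scores)
# intercept = comparison mean (3.0); coef = school mean - comparison mean (1.2)
```

In practice one would also compute a standard error for the coefficient to obtain the significance levels reported in the tables; that step is omitted here for brevity.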
DATA CODING AND ANALYSIS
For the interviews, focus groups, and observations conducted in our four case study schools, we employed a multi-stage approach to analyze researchers' field notes. Field notes were kept in two forms: personal interaction forms, which were completed by researchers within 24 hours of conducting an interview or moderating a focus group, and school-level analysis forms (SCAFs; Miles & Huberman, 1994), which were completed by the members of each school's research team together during each of the three week-long visits. These served as inputs for generating a cross-school comparison matrix that compared schools across the essential components that guided our work. These three types of documents provided the basis for engaging in the iterative process of refining our instruments and planning our Wave 2, and then Wave 3, field visits. Our analyses were guided by our core research questions: What are the distinguishing characteristics between HVA and LVA schools? How did these differences develop and how are they enacted and supported?
To conduct an in-depth cross-case analysis, a team of 19 people systematically coded the interview and focus group data using NVivo, a software program used for qualitative analysis. We used the analytic technique of explanation building (Yin, 2009) to understand how and why the essential components developed (or did not develop) in our schools. This iterative process involved continuously refining claims about the school as additional evidence was examined. The following guiding principles framed our analytic work: focusing on answering our core research questions; discerning findings that could lead us to a design challenge in the district; establishing a process that is rigorous and systematic and that allows each claim and finding to be tracked back to the underlying data and evidence; and maintaining the essential components as an analytic frame.
To meet these principles, the work was organized into four cases handled by four to six team members for each school. All but one team member had first-hand experience collecting the fieldwork data in that school. The school-based teams were responsible for coding and analyzing all data collected about that school and writing a comprehensive case report. Using an emergent, inductive approach to coding, every member of a school team read through seven to eight key transcripts that were selected in advance to include the SCAFs and comparison tables created after each visit, the principal transcripts, and those from selected teacher and student focus groups. The school team then met to develop an emergent coding framework that was grounded in the data. In addition, we used an a priori coding scheme of our essential components and cross-cutting enabling supports (e.g., goals, trust, locus of control, structures that support or inhibit goals, rigor and academic press, student culture of learning, and student responsibility). The general approach was to look at each school as a system. Our analyses centered on understanding each school in depth, while maintaining a focus on the essential components within each school, as well as additional enabling supports that emerged. School case teams met weekly throughout the summer for about 4 hours each week. In between meetings, team members coded interview and focus group transcripts.
We also held cross-case comparison meetings involving all four school teams every other week for approximately 3 hours. The purpose was two-fold: to ensure that definitions were being applied consistently and reliably across schools in the coding process and to flag emerging findings about each school to begin to make comparisons across schools. Once all interview and focus group data were coded, school case teams developed a narrative of each essential component. Coders worked to provide a thorough, well-supported set of claims about the facilitators and inhibitors of essential components, as well as the practices and policies through which these were enacted. These cases formed the basis of a detailed analytic report and the basis for the comparisons described here. The case reports included both quantitative and qualitative data.
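Consistency of coding across teams, as described above, is often summarized with an agreement statistic such as Cohen's kappa, which corrects raw agreement for chance. The sketch below is a generic illustration of that statistic, not the project's actual reliability procedure; the codes are hypothetical.

```python
# Minimal sketch of Cohen's kappa for two coders assigning one code per
# excerpt: chance-corrected agreement. Codes below are hypothetical labels
# loosely echoing the study's constructs.
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Return Cohen's kappa for two equal-length lists of codes."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Expected agreement if both coders assigned codes independently at
    # their observed marginal rates.
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

a = ["press", "support", "press", "trust", "press", "support"]
b = ["press", "support", "trust", "trust", "press", "press"]
kappa = cohens_kappa(a, b)
```

Values near 1 indicate strong agreement beyond chance; values near 0 indicate agreement no better than chance, signaling that code definitions need to be renegotiated, which is the purpose the biweekly cross-case meetings served.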
TOWARD UNDERSTANDING THE BENEFITS AND CHALLENGES OF OUR DESIGN: EXAMPLES OF DATA TRIANGULATION
To illustrate how we triangulated observation, interview, and survey data across the higher and lower value-added schools, we describe our findings across three main constructs of effective schooling in this district context: student ownership and responsibility, quality instruction, and personalized learning connections. These areas are highlighted because they illustrate three main ways in which we found the qualitative and quantitative data interacted: convergence of data around the main differentiating characteristic of student ownership and responsibility; greater understanding of complex relationships around quality instruction in schools by exploring potential discrepancies in the interview/focus groups and classroom observations; and unresolved discrepant findings around differences in personalized learning connections. The main advantages of triangulating across these different forms of data include: (a) the enhanced validity of our conclusions when they are supported by multiple forms of evidence (as evidenced in the data on student ownership and responsibility) and (b) the ability to compare qualitative data to external benchmarks. We see this second advantage in the instructional support data in our study, as both the interview and focus group data point to qualitative differences between the HVA and LVA high schools, while the observation data, as coded using the CLASS-S instrument, highlight the low absolute level of instructional support in all schools. Further, triangulating across multiple stakeholders, both within and across schools, helped us develop a holistic picture of the four schools within the context of their district, leading us to conjecture how the distinguishing characteristics we identified may have led to differences in value-added achievement. These three components illustrate ways in which mixed methods research can provide mutually supporting information, as well as instances in which the data were contradictory.
STUDENT OWNERSHIP AND RESPONSIBILITY: AN IDEAL CASE OF CONVERGENT FINDINGS
Both the qualitative and quantitative data suggest that what differentiated the two HVA schools from the two LVA schools were practices that helped students take ownership and responsibility for their own academic success. Through interviews and focus groups, the qualitative data indicated that teachers and other adults in the HVA schools scaffolded students' learning of both academic and social behaviors to guide them in assuming ownership and responsibility for their academic success. The HVA schools also developed an integrated system of academic press (the encouragement of students to achieve) and support (resources to foster academic success). This involved promoting self-efficacy by changing students' beliefs and attitudes and engaging them to do challenging academic work. Thus, we considered self-efficacy and engagement (both cognitive and behavioral) to be indicators of student ownership and responsibility, while academic support and press are strategies used to develop student ownership and responsibility. While our data do not permit causal claims, our findings are consistent with the broader literature on student engagement, self-efficacy, and academic press.
It is important to note that while student ownership and responsibility were identified based on quantitative and qualitative measures at the student level, the qualitative data indicated that student ownership and responsibility resulted from concerted school-level efforts. In particular, teacher interviews and focus groups and administrator interviews provided evidence that, in the HVA schools, teachers and other adults in the school scaffolded learning of both academic and social behaviors that guided students in assuming ownership and responsibility for their academic success. Both of our HVA case study schools provided this scaffolding through integrated strategies of academic press and academic support. In some cases, because Valley had made recent improvements after being categorized as a turnaround school, we describe how Valley and the two HVA schools differ from Mountainside. Our qualitative data suggest that both HVA schools had stronger and more systemic practices, policies, and resources to establish an academically rigorous school environment in which students were pressed to achieve and supported in doing so. Indeed, the administrator and teacher interviews and teacher focus groups in one HVA school highlighted how the school focused explicitly on increasing student ownership and responsibility for their learning. The efforts to increase student ownership and responsibility focused on building a culture that holds students accountable for their learning and supports them through systematic but personalized interventions. Lakeside's levers for academic press were the Lakeside Code, a set of expectations for students and teachers; Learning Time, a lunchtime tutorial system; assignment logs, a shared template for students to monitor their progress; and the Intervention Committee, which provided support and procedures to raise the expectations for student success in all classes.
The student focus groups reinforced the importance of these practices: students described the Lakeside Code as the expectations they must adhere to and reported frequent use of assignment logs and tutorials.
The other HVA school, Riverview, also showed evidence of a strong student culture of learning from the student focus groups, at least among the honors students who took the initiative to form study groups, tutor each other, and work collaboratively to master challenging material, often after school. Although this culture of learning was heavily influenced by parental press for high academic standards, the teacher interviews and focus groups provided evidence of concerted strategies to increase student engagement to achieve school-wide rigor. The school established academic press and support by highlighting its success with AP/honors courses to encourage more students to take those courses, with a concerted effort to keep the quality high. This outreach, which was targeted particularly at low-income and minority students, was described in interviews and focus groups as a key lever to provide greater learning opportunities for a broad spectrum of the student population. One teacher illustrated this philosophy when she said the faculty was committed to taking students who are not honors students and making them into honors students.
In contrast, the two LVA schools did not demonstrate a systemic focus on academic press and support. One reported characteristic shared by the LVA schools was a culture of multiple chances, in which students could get several opportunities to make up for failure. While participants reported both positive and negative aspects to this practice, the limited student accountability it fostered supports the premise that academic press is a key difference between HVA and LVA schools. While all four schools provided credit recovery and other opportunities for students to make up failed assignments or courses, Lakeside and Riverview both were able to resolve the tension between supporting students and holding them accountable in ways that did not lower rigor. In contrast, LVA schools had only isolated examples of teachers pressing students and helping them take ownership of their academic success.
Because of the concurrent and exploratory nature of the study design, the surveys were not designed to focus explicitly on student ownership and responsibility. Still, four scales on our student survey capture aspects of student ownership, and two capture academic press. Of the student ownership scales, one focuses on cognitive engagement and three on behavioral engagement. The academic engagement scale captures whether students get bored in class, find the work interesting, look forward to their classes, and work hard to do their best in class. The behavioral engagement measures are study habits, responsibility-participation, and peer support for academic achievement. The study habits measure captures the extent to which students study and do homework. The peer support measure captures whether students and their friends support each other academically by talking about what they did in class, preparing for tests together, helping each other with homework, and similar behaviors. For the academic press expectations scale, students were asked the extent to which they agreed with the following statements: my classes really make me think; my teachers expect me to do my best all the time; and my teachers expect everyone to work hard. In general, students agreed with these statements. The academic press challenges scale included items about the difficulty of class work, tests, and teacher questions and asked how often students felt challenged. Student surveys were administered district-wide to understand students' perceptions.
In general, survey responses indicated stronger student responsibility and engagement at the HVA schools than at LVA schools (see Table 4), though the evidence was not entirely consistent. Scale averages for Riverview were significantly higher than the district average, with the positive difference largest for study habits and participation and narrower for academic engagement. This is consistent with our qualitative finding of a strong student culture of learning at Riverview. At Lakeside, the academic engagement and participation scale averages were significantly higher than district means, but the scales on study habits and peer support for academic achievement were lower. Results for the LVA schools were significantly lower in some areas and significantly higher in others.
Table 4. Student Survey Data on Academic Press and Student Ownership
Note. LVA = lower value-added; HVA = higher value-added. Statistical significance was calculated based on mean comparison tests between each case study school's mean scale rating and the mean from the district's other 12 schools.
*p < .05. **p < .01. ***p < .001.
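The mean-comparison tests reported in the table notes can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code: the function name and data layout are hypothetical, and it assumes student-level scale scores grouped by school, using a Welch t statistic with a large-sample normal approximation for the two-sided p-value.

```python
import math
from statistics import fmean, variance

def school_vs_rest(scores_by_school, target):
    """Compare one school's mean scale score to the pooled mean of the
    remaining schools (the 'mean comparison tests' of Tables 4-10).
    Returns the mean difference, an approximate two-sided p-value, and
    the significance stars used in the tables."""
    target_scores = scores_by_school[target]
    rest_scores = [s for school, xs in scores_by_school.items()
                   if school != target for s in xs]
    diff = fmean(target_scores) - fmean(rest_scores)
    # Welch standard error: no equal-variance assumption across groups.
    se = math.sqrt(variance(target_scores) / len(target_scores)
                   + variance(rest_scores) / len(rest_scores))
    t = diff / se
    # Large-sample normal approximation to the two-sided p-value.
    p = math.erfc(abs(t) / math.sqrt(2))
    stars = "***" if p < .001 else "**" if p < .01 else "*" if p < .05 else ""
    return diff, p, stars
```

With district-wide survey samples of hundreds of students per school, even the small scale differences the authors describe (under a tenth of a standard deviation) can reach statistical significance, which is why the text cautions that significant differences may still be substantively small.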
Table 5. Percent of Students Participating in Select School Programs
Note. LVA = lower value-added; HVA = higher value-added. Statistical significance was calculated based on mean comparison tests between each case study school's mean value and the mean from the district's 12 other schools.
*p < .05. **p < .01. ***p < .001.
Table 6. Student Engagement Measures by School
Note. LVA = lower value-added; HVA = higher value-added. The CLASS-S data come from 603 20-minute segments of classroom observation. Classrooms in English/language arts, mathematics, and science were observed. The observational rating is on a scale of 1–7. The shadowing data come from 1,360 5-minute segments of shadowing students throughout their core subject classes. Tests for statistical significance were computed by comparing the school value to the value of the other three schools combined.
*p < .05. **p < .01. ***p < .001.
One survey result worth noting was that student perceptions of academic press were mixed at the HVA schools. Riverview's scale averages exceeded the district mean for the category, but Lakeside had lower averages. While the differences are small (less than 10% of a standard deviation), they were statistically significant in most cases. Valley also showed some significantly lower scores on the academic press scales.
The student survey also asks whether students participated in credit recovery, tutoring, and preparation for college entrance exams, and responses may help us understand the student perspective on academic press and support. For example, Mountainside students were the most likely to report participating in credit recovery, suggesting less press to do well the first time, whereas the lower participation rates at the HVA schools suggest greater academic press. On the other hand, students in the HVA schools were more likely to participate in PSAT, SAT, and ACT preparation activities, suggesting more school-wide press to attend college. The effectiveness of Lakeside's Learning Time tutoring program is evident in the high percentage of students who receive tutoring.
The CLASS-S observational data also presented evidence on student engagement. Consistent with the qualitative and survey data, the classroom observation data suggested higher student engagement in the two HVA schools and lower student engagement in one LVA school, with the other LVA school near the mean.
Finally, we also examined administrative course-taking data as another indicator of academic press (see Table 7). We hypothesized that schools with a greater climate of academic press would have more students taking advanced courses and passing Advanced Placement (AP) exams. These data supported the finding that Riverview and, to a lesser extent, Lakeside, were more successful in getting students to take advanced courses and exams. Not surprisingly, given the fieldwork, Riverview had the highest percentage of students taking at least one AP course and passing an AP test. The other HVA school, Lakeside, had relatively low AP participation, although slightly more AP course-takers took the test than at Riverview. Both HVA schools had higher AP exam pass rates than the LVA schools. A recent increase at Valley in the percentage of students taking and passing an AP test buttressed findings from fieldwork about recent academic improvements. Enrollment patterns in honors and other advanced courses revealed a few differences among the schools, including mixed results in this area for the LVA schools. Valley experienced a drop in enrollment in advanced courses. Mountainside experienced a decrease in AP-related categories. Riverview showed an increase and had a total of 72% of students taking any advanced course (AP, honors, other advanced course), compared to around 51% in the other three schools.
Table 7. Course-Taking Patterns by Percentage of Students for Most Recent Three Years and Change Over Time
Note. LVA = lower value-added; HVA = higher value-added. These percentages represent the percent of all students in the school, although the availability of AP courses is not even across grades. The data on the most recent three years are an average of 2008–2009, 2009–2010, and 2010–2011. The change over time data reflect changes from 2008–2009 to 2010–2011.
To summarize, the results show an overall pattern of convergence between findings of different indicators, both quantitative and qualitative, of student ownership and responsibility between the HVA and LVA schools. Although there were some instances that appeared to diverge from the other findings, the prevalent pattern is one of differentiating the HVA and LVA schools. Further, many of the potentially discrepant cases were easily explained by the qualitative findings, such as the placement of Valley as a school on a turnaround trajectory and thus often straddling the HVA/LVA line, the high use of tutoring at Lakeside, and the student culture of learning and AP course enrollment at Riverview. Other instances of potentially discrepant findings, such as with the student shadowing data, can be attributed to problems with sample selection and timing of the shadowing data collection.
QUALITY OF INSTRUCTIONAL SUPPORT IN CLASSROOMS: WHEN DIVERGENCE IS NOT BAD
While our most direct measure of instructional quality came from videotaping 9th and 10th grade ELA, mathematics, and science classes and then coding them as quantitative data using a rubric, we also interviewed teachers about their vision of high quality instruction, including any barriers they saw to implementing the practices they described as part of their instructional vision. As noted above, we coded our videotaped observations of classroom instruction using the CLASS-S protocol. To highlight how divergent findings across methods in a mixed methods study are not always an adverse result, we compare quantitative codes for two dimensions of the Instructional Support domain, Content Understanding and Analysis and Problem Solving, to teacher interview data across our four case study schools. While there are other aspects of instruction that we coded for using the CLASS-S, we focus here on these two dimensions of the Instructional Support domain because they highlight aspects of instruction in which the study schools are in the upper end of the midrange of the coding rubric (content understanding) and the lower end of the midrange (analysis and problem solving). The quantitative coding from the CLASS-S suggests that while the HVA schools score higher, on average, than the LVA schools, the differences are relatively small.
Content understanding, as measured by the CLASS-S, focuses on both the depth of lesson content and the approaches used to help students comprehend the framework, key ideas, and procedures. Indicators of content understanding that we coded for include: demonstration of depth of understanding, effective communication of concepts and procedures, attention to background knowledge and misconceptions, and effective transmission of content knowledge and procedures. On average, HVA schools had slightly better average content understanding than LVA schools, although all four schools had average scores in the midrange (4.37 to 4.68) and the differences were not statistically significant. A mid-level score on content understanding could be reflective of cases in which class discussion and materials communicate a few of the essential attributes of concepts/procedures, but examples are limited in scope or not consistently provided.
The Analysis and Problem Solving dimension of the CLASS-S assesses the degree to which the teacher facilitates students' use of higher level thinking skills, such as analysis, problem solving, reasoning, and creating through the application of knowledge and skills. We coded for indicators including opportunities for higher level thinking, problem solving, and metacognition. On average, analysis and problem solving scores across the four case study schools were low (between 2.4 and 3 on the 1–7 scale), and the difference between the HVA and LVA schools amounted to about half of a standard deviation. In other words, while the HVA schools had higher average analysis and problem solving scores than did the LVA schools, all four schools were at the lower end of this measure, suggesting that the typical classroom had few opportunities for students to engage in higher order thinking through inquiry and analysis.
While results from teacher interviews were broadly consistent with the cross-school patterns found in the CLASS-S ratings, they did not suggest how low the observation scores would be on the analysis and problem solving rubric. When asked what they consider to be high quality instruction, multiple stakeholders at the HVA schools named higher order thinking skills. Teachers (i.e., the same ones who were videotaped) mentioned using questioning strategies or problem solving activities (discovery learning, inquiry-based instruction) to reach higher order learning, although most also indicated that doing this well was a continuing struggle. In contrast, teachers in the LVA schools viewed students' lack of background knowledge as the reason they struggle in their classes, rather than attributing it to their instruction. Interview data from the LVA schools suggested a lack of understanding of how to foster higher order thinking skills and what higher order thinking actually looks like in the classroom.
In a complementary manner, the interview data helped us understand some of the perceived barriers that teachers in the LVA schools see as inhibiting their ability to emphasize higher order thinking in their classroom instruction. This information was valuable to our project, as it could inform the design of professional development to help teachers improve these skills. The teacher interviews alone, however, would not have uncovered the relatively low degree to which teachers facilitate students' use of higher level thinking skills, on average, in both HVA and LVA schools. By triangulating the observation and interview data, we gained confidence in the between-school differences that we saw; had we focused our study solely on interviews, we might have overestimated the degree to which teachers in the HVA schools were actually applying elements of quality instruction.
In summary, our CLASS-S coding of videotaped classroom observations across the dimensions of content understanding and analysis and problem solving suggested low overall implementation of strategies to develop higher order thinking skills in classrooms. Although on average the HVA schools performed better on these measures, the results did not match the variation in rhetoric around high quality teaching (e.g., discovery learning, inquiry-based instruction) provided by teachers. The interviews in the LVA schools identified teacher efficacy issues that would be critical to address in any reform focusing on strategies to promote higher order thinking. This is a clear example of where divergent findings based on analyses across different modes of data collection were informative to our project goals, rather than simply raising issues of cross-method validity. Teachers' tendency to cite students' prior knowledge or current behavior as barriers to implementing more rigorous instruction is not a factor that we would have ascertained with an equivalent level of detail through survey or observation data. This analysis of the different data sources underscores the notion that divergent findings are not necessarily problematic, but rather permit complementary discoveries about the phenomena under investigation.
PERSONAL LEARNING CONNECTIONS: A METHODOLOGICAL QUANDARY MADE OF DIVERGENT FINDINGS
The construct of personalized learning connections assesses the strength of connections between students and adults in schools and the degree to which these connections allow teachers to provide more individual attention to their students. A school investing in personalized learning connections would also be developing students' sense of belonging to school (Walker & Greene, 2009). We would expect personalized learning connections to span a continuum from strong and robust interactions that lead to connectedness, to weak or nonexistent interactions that lead to isolation and, potentially, alienation (Nasir, Jones, & McLaughlin, 2011; Crosnoe, Johnson, & Elder, 2004).
Interviews across a wide range of school stakeholders in each case study school suggest that building and sustaining strong adult-student relationships was a priority for fostering student engagement and success; however, actual responses and inferred dispositions about adult-student connections differed across schools. Participants at Lakeside reported extremely positive teacher and student chemistry and described the Lakeside Code and Learning Time as the overarching mechanisms promoting such connections. The Code and Learning Time were consistent with the school's dominant focus on academic responsibility, though students also came to hang out with teachers and get to know them socially. At Riverview, there was evident leadership in the efforts to meet different students' needs. Through interviews, the administration at Riverview reportedly based employment on teachers' commitment to activities that would help develop strong adult-student relationships. This strategy appeared to pay off, as faculty reportedly sponsored many clubs and encouraged student involvement. Riverview interviews and focus groups with adult participants suggested a commitment to the idea that relationships mattered for low-income students, who would go a mile for a teacher.
By contrast, student interviews suggested that the nature of relationships at Riverview followed a school-within-a-school pattern, with students enrolled in honors or AP courses benefiting the most from access to teachers. This pattern was consistent with the principal's reported goal of closing social gaps in the school as well as academic gaps. Although classified as an LVA school, Valley had a concerted focus on addressing social development needs, but only a stated attention to academics. Teachers were expected to do what it takes to develop relationships with their students, including working outside class, and faculty seemed to have solidly bought into this personalization goal. Multiple participants, however, noted low levels of rigor at the school. Some teachers seemed to emphasize developing relationships with students rather than making academic demands. At Mountainside, only some school personnel were intentional and systematic about building and maintaining relationships. The dissolution of the school's mentoring program belied participants' view that relationships with kids were a key aspect of practice. Diminished personalization practices also seemed to be a function of the school's significant teacher turnover rate. The interview data confirmed that the emphasis on personalization varied by value-added status, with Lakeside and Riverview placing more emphasis on concerted interactions between students and adults than Valley and Mountainside, the two LVA schools.
In contrast to the interview data, the teacher survey data tell a less consistent story, not always aligned with the value-added rankings of the four case study schools or with the pattern that emerged from the qualitative data (see Table 9). Personalization-structural support (three items; alpha = .76) includes items from the teacher survey that indicate how often the teacher organizes school supports, such as parent-teacher meetings and referrals to community organizations, for students who are struggling. Lakeside (HVA) was the only one of our case study schools with a substantively different score from the district average (although this difference was not statistically significant). The structural supports identified through the interviews were only weakly validated here by the teacher survey data. Teacher responses to structural support items related to personalization in the other three schools, one HVA and two LVA, were similar to the district average. These findings led us to look more closely at the structural supports for personalization that schools reported having in place, particularly Riverview.
Personalization-extra help (five items; alpha = .76) refers to the extent to which teachers or other school staff provide extra help to students who are struggling. Here, the results generally parallel those suggested by our interview data. Lakeside, Riverview, and Valley all have scores above the district average, while Mountainside scored below the district average, although the differences were not statistically significant. This suggests that the extra help for struggling students in our four case study schools was not very different, at least from the teachers' perspective, from that in other schools in the district.
Personalization-social (five items; alpha = .77) refers to the extent to which teachers report knowing their students personally, such as their academic aspirations, their home life, and who their friends are. There is little alignment between teacher responses to these items and what we saw in our interview data, suggesting that the personalization strategies that came across strongly in interviews with adults at Lakeside and Riverview may not be universally enacted across teachers in these schools.
Another form of triangulation is the comparison of teacher survey responses to student survey responses. The Personalization scale derived from student survey data (see Table 10) includes items that capture how many adults in the school were willing to give extra help with homework, care about students' academic achievement, provide advice about graduation requirements, and help with students' personal problems. The student sense of belonging scale comprises items measuring the extent to which students viewed people in the school as a family, felt like they fit in with the school, and felt that people cared if they were absent. Consistent with teachers' survey responses, there do not appear to be systematic differences between higher and lower value-added schools on student perceptions of the level of personalization (five items; alpha = .88) and student sense of belonging (six items; alpha = .71), although Riverview is lower than the district average on personalization and higher on student sense of belonging. These results are striking in their inconsistency with our interview data, which suggest strong organizational supports in place at Lakeside for personalization of students' academic needs and at Valley for students' need for extra help.
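The internal-consistency figures reported alongside these scales (e.g., alpha = .76) are Cronbach's alpha coefficients. A minimal sketch of the computation is below; the data and function name are ours, not the authors', and it assumes item-level responses aligned by respondent.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a survey scale.
    `items` is a list of per-item response lists: one inner list per
    survey item, with positions aligned by respondent."""
    k = len(items)
    respondents = list(zip(*items))             # rows = respondents
    totals = [sum(row) for row in respondents]  # scale total per respondent
    item_var = sum(pvariance(item) for item in items)
    total_var = pvariance(totals)
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    return (k / (k - 1)) * (1 - item_var / total_var)
```

Items that track a common underlying disposition (as in the personalization scales here) push alpha toward 1; items that vary independently of one another push it toward 0, so values in the .71 to .88 range indicate reasonably coherent scales.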
We then triangulated our coded classroom observation data with our interview and survey data. The personalization construct is most proximally measured by the Emotional Support domain of the CLASS-S rubric, as it is designed to measure the degree to which teachers are organizing their classroom environment to build strong connections with students. Results should parallel results of the personalization-social construct from the teacher survey and the personalization construct from the student survey. The Emotional Support domain measures these characteristics: positive climate, negative climate, teacher sensitivity, and regard for adolescent perspective. In general, HVA schools had higher ratings on the Emotional Support dimensions than LVA schools, with these differences statistically significant in all areas except positive climate.
Table 8. CLASS-S Scores by School
Note. Statistical significance was calculated based on mean comparison tests between each case study school's mean rating and the mean from the other schools combined. *p < .05. **p < .01. ***p < .001.
As coded using the CLASS-S rubric, positive climate reflects the emotional connections and relationships among teachers and students, and the warmth, respect, and enjoyment communicated by verbal and nonverbal interactions. Indicators of this dimension included positive relationships, positive affect, positive communications, and respect. School average scores ranged from 4.4 to 5 (the upper end of the midrange) (see Table 8 above). Consistent with the personalization scales from the student surveys, Lakeside, Riverview, and Valley had similar average positive climate scores, while Mountainside had the lowest ratings. The differences between the HVA and LVA schools were not statistically significant for positive climate.
Table 9. Teacher Survey Data on Personalized Learning Connections
Note. LVA = lower value-added; HVA = higher value-added. Statistical significance was calculated based on mean comparison tests between each case study school's mean scale rating and the mean from the district's other 12 schools.
*p < .05. **p < .01. ***p < .001.
Table 10. Student Survey Data on Personalized Learning Connections
Note. LVA = lower value-added; HVA = higher value-added. Statistical significance was calculated based on mean comparison tests between each case study school's mean scale rating and the mean from the district's other 12 schools.
*p < .05. **p < .01. ***p < .001.
Negative climate encompasses the overall level of negativity among teachers and students in the observed class. This variable has been reverse coded, so a higher score reflects a less negative climate. Indicators of negative climate include negative affect, punitive control, and disrespect. On average, HVA schools had better negative climate ratings than the LVA schools, and this difference was statistically significant. The range in average scores, however, suggests that the typical classroom in any of the case study schools does not have a negative climate.
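Reverse coding a dimension like negative climate is a simple transformation on a bounded rating scale: the new score is (low + high) minus the old score, so on the 1–7 CLASS-S scale a 1 becomes a 7 and vice versa. A minimal illustration (the function name is ours):

```python
def reverse_code(score, low=1, high=7):
    """Reverse-code a rating so a higher value reads as 'better'
    (e.g., a less negative climate on the 1-7 CLASS-S scale)."""
    if not low <= score <= high:
        raise ValueError(f"score {score} outside {low}-{high} scale")
    return low + high - score
```

This keeps all of the tables' dimensions pointing in the same direction, so higher always means a more favorable rating.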
The teacher sensitivity dimension of the CLASS-S codes for teachers' responsiveness to the academic and social/emotional needs and developmental levels of individual students and the entire class, and the way these factors impact students' classroom experiences. Indicators of teacher sensitivity included awareness, responsiveness to academic and social/emotional needs and cues, effectiveness in addressing problems, and student comfort. Although each of the schools scored in the midrange on this dimension, HVA schools had better teacher sensitivity scores than LVA schools, and this difference was statistically significant.
To summarize, across the broad construct of personalized learning connections, we found that the interview, observation, and survey data were divergent. The interview data are consistent with our labeling of the schools as higher and lower value-added. The personalized learning connections constructs related to the quantitative coding of observations using the CLASS-S observation rubric are also, for the most part, consistent with the HVA and LVA rankings of the case study schools, although the average school-level codes fell in a fairly narrow band, in the midrange. The teacher survey data suggest little difference across the four schools and do not differentiate the schools by value-added ranking.
The divergence in data on personalized learning connections presents a methodological quandary: how to reconcile the differences. Viewing each source of data as a piece of the puzzle, rather than just as evidence of cross-method validity, supports a more nuanced understanding of the differences in personalization across the schools. Each data source provides additional, complementary insight into the presence and variation of personalization across our case study schools. Had we collected only two of the three data sources, we would have missed part of the story. For example, if we had examined survey data alone, we would not have identified strong differences between the schools in personalized learning connections. The interview data, while not entirely consistent with the survey or observation data, suggest ways in which each of the schools is working to support the development of personalized learning connections between adults and students. Examples include the differences in the nature of teacher-student relationships at Riverview, characterized by students as differing by student track (e.g., honors and AP versus on-level courses), and the emphasis at Valley on adults addressing students' social development needs over academics.
DISCUSSION AND IMPLICATIONS
Our goal has been to highlight the advantages and challenges of collecting and interpreting mixed methods data in a large scale research and development project. To do so, we have highlighted three different points of interaction of data sources: points of convergence that provide greater validity to the qualitative findings; points of intended divergence when we used qualitative and quantitative data to different ends around the same topic; and points of unexplained divergence that caused us to look more deeply at both forms of data. These examples illustrate the methodological issues, both advantages and disadvantages, that arise when mixing qualitative and quantitative data in the context of a large research project.
The advantages of our having used mixed methods far outweighed the disadvantages. These advantages include increasing the construct validity of our findings when they are supported by multiple forms of evidence; enhancing the trustworthiness of the analysis by providing a fuller, more rounded account; reducing bias; and compensating for the weaknesses of one method through the strengths of another. Another advantage of using multiple methods was our ability to compare qualitative data to an external benchmark.
Perhaps the most important advantage of our using mixed methods, however, is the complementarity of findings derived from different modalities and methods. As Greene et al. (2001) suggest, the use of mixed methods allows researchers to draw upon the complementarity of the qualitative and quantitative paradigms to better understand the social phenomena under investigation. As noted by Gorard and Taylor (2004), this complementarity is far more important than mutual validation, as two or more observations or methods should be expected to yield different results: we cannot use two or more methods to check up on each other in a simple way. Rather, the methods should be complementary, producing different aspects of the reality under investigation that are then put together, or they could be designed to generate dissonance as a stimulus to progress.
We found that the complexities and nuances that emerged from using multiple methods to examine the programs, policies, and practices that differentiated HVA and LVA schools were pivotal in helping us look beyond seemingly simple or obvious reasons for why one school was performing better than another. The multiple methods employed in our study allowed us to see schools from multiple perspectives (e.g., administrators, guidance counselors, teachers, coaches, students) as well as through multiple modes of qualitative and quantitative data collection (i.e., interview, focus group, observation, survey, student assessment, and student shadowing). Although costly and time consuming, mixing methods helped us to ask questions and attempt answers that were more useful to us in the development stage of intervention work than would have been the case had we relied upon a single mode or method of research. By triangulating across data from different sources and modes, we increased the likelihood that we actually identified enduring, rather than transitory, aspects of a school environment as their quality drivers.
The challenges of taking a mixed methods approach became apparent when data collected from multiple perspectives and through multiple modes did not converge. While we sought to design data collection instruments to assess the same underlying constructs, each data source had its own strengths and weaknesses. The surveys were used to collect representative data systematically, helping to improve reliability, although lower-than-desired response rates may have biased our results. In addition, our survey questions had to be determined far in advance of the field work in order to get them cleared by our Institutional Review Board and for our district partners to include them in their survey administration. We relied on items that had previously been shown to be reliable and valid, which is less costly and time consuming than developing items and field testing them for our specific study. Items taken from other studies, however, were not always an exact match for the constructs that were the focus of our investigation. By adding three cycles of interviews to the survey data, we could more easily adapt questions to our constructs and follow up as new ideas and theories emerged. Further, the semistructured interviews allowed the interviewer to probe responses in ways that can help uncover the how and the why behind a program or practice's effectiveness (or lack thereof), rather than just unearthing its existence. Although the self-reports from school stakeholders in interviews and surveys did not always align with what was observed in classrooms, the combined analysis of data from multiple methods, rather than just one, two, or three, allowed us to move beyond a surface understanding of what makes some schools more effective than others.
Creswell and Clark's (2010) typology of triangulating mixed methods data offers a useful way of framing the relative trade-offs that researchers must consider in analyzing their data: (a) the timing of when the qualitative and quantitative data are collected; (b) the relative weight ascribed to one paradigm over the other; and (c) the means by which the data are integrated. Each trade-off presented us with tangible challenges in collecting and analyzing our data. The first challenge was planning the timing for each component of the research design. We sought to design data collection instruments that would assess the same underlying constructs so that there would be consistency across data sources. Each instrument type had its own strengths and weaknesses. The surveys would allow us to collect consistent, representative data across all high schools in the district, not just our four case study schools, which would allow us to compare student and teacher responses in our case study schools to those in the other schools in the district. In addition, the survey questions had to be formulated in advance of our conducting the field work so the district could include them in their survey administration. This meant that we were not able to maximize the potential benefits of using emergent findings based on fieldwork to develop or refine survey items. Yet considerations of time loom large for researchers in deciding what items and scales to use. The same temporal considerations are not as relevant to qualitative data collection, since the methodology lends itself to developing and refining interview questions iteratively. Qualitative instruments are more easily adapted, and follow-up, should there be confusion or a need to dig deeper, is more easily obtained. However, the issue of timing arose again as we were faced with analyzing hundreds of interview and focus group transcripts and observation data in time to inform the next wave of data collection.
Thus our three qualitative data collection waves had the advantage of a sequential design, even if our qualitative and quantitative data collection and analysis had to be concurrent.
Our approach to mixing the multiple methods was to use a coordinated model, for the explicit reason that we needed to accomplish both quantitative and qualitative data collection within a single year so that the identified differences between HVA and LVA schools could inform the design of our collaborative improvement work with our partner district. While we would have valued being able to collect one type of data first and then use the analyses of that data to inform the inquiry and method of the subsequent data collection, both the scale and the tight timeframe of our data collection did not allow it. For example, we could have first collected teacher and student survey data to identify differences between the HVA and LVA schools, and then used the interviews and focus groups with administrators, teachers, and students to understand the origin of these differences and how they work in their particular school contexts. We could also have staged the survey and interview data in the opposite order, using interviews, focus groups, and classroom observations to develop hypotheses about the components and practices that distinguished the HVA and LVA schools, and then using data collected from surveys to test those hypotheses. However, having a single year of intensive data collection simply did not allow for either of these scenarios. Instead, our best option was to choose a concurrent design and use triangulation and complementarity to draw the most out of the data.
Our second challenge was to identify the relative weight we would ascribe to qualitative versus quantitative data, which became salient as we tried to reconcile our seemingly discrepant data. As we learned from the results of the analysis of personalized learning connections, discrepant findings are not always a problem. What one can learn from observing teachers is substantively different from what one learns from interviewing them. What one learns from shadowing students to understand their experience of school is one thing, but those insights are enhanced if one also interviews the students about what was observed through shadowing. What surfaces in one-on-one interviews with core content teachers can be broadened and more fully illuminated by also conducting focus groups with support personnel (e.g., guidance counselors and school drop-out prevention specialists) and teachers who coach or run academically focused after-school activities.
Our third challenge was how to integrate the qualitative and quantitative data. As our description of analyzing the data on personal learning connections demonstrated, the triangulation of research results led to a situation in which the different findings did not neatly fit together; Erzberger and Kelle (2003) indicate that in such instances the complementarity model does not provide a good methodological frame. In these situations, it is most important to do due diligence and check whether the research methods were adequately applied, to rule out the possibility that the inconsistencies between qualitative and quantitative findings are the result of mistakes made in data collection or analysis, or the possibility that the discrepancies stem from the application of inadequate theoretical concepts. Erzberger and Kelle recommend examining the methodology used, including the sampling, research instruments, and data analysis process, to rule out the first possibility. In our case, the low response rate on teacher surveys and the sampling process for students in the shadowing component may have contributed to the divergent findings.
While enhanced validity and credibility of inferences is often the focus of triangulation designs, such a focus limits the potential advantages of collecting data from multiple sources via different methods. When, as in the case of student ownership and responsibility, the findings from different data sources converge to provide a coherent portrait, the issues of triangulation design are less relevant. When findings diverge, however, as was the case with the quality of instructional support in classrooms, researchers must confront questions of how best to reconcile such differences. When findings do not neatly converge or are ostensibly contradictory, as was the case with personal learning connections, researchers are presented with the quandary of how to reconcile such results; the desired complementarity does not materialize in a convergence of findings. To leverage the full benefits of mixed methods, a triangulation design that does not discount one paradigm in deference to another holds the potential to facilitate deeper understanding of a complex phenomenon. Further, the sequential nature of our field visits allowed us to test what we were finding in interviews and focus groups at the beginning of the year with additional interviews and focus groups with additional informants.
Through this paper, we have sought to weigh the strengths and challenges of mixed methods research in drawing out the complexity of phenomena to paint a more comprehensive picture, rather than one more narrowly constrained to searching for convergence in findings. Our work serves as a reminder that seemingly discrepant findings, while challenging to reconcile, can actually lead to a more complete understanding of the phenomena under study when considered for their potential complementarity. Researchers undertaking mixed methods research stand to benefit from additional robust examples of how other researchers have tackled mixing different modalities and methods. Such testing of how to mix methods will help inform researchers of real quandaries and their resulting solutions. We can all benefit from additional studies describing how researchers undertake complex, multimodal, mixed methods research.
This research is funded by the Institute of Education Sciences (R305C10023). The opinions expressed in this article are those of the authors and do not necessarily represent the views of the sponsor.
1. As suggested in Table 1, sequential explanatory designs are those in which quantitative data are collected and analyzed first and the qualitative data are collected and analyzed second and informed by the quantitative data. In such cases, the quantitative data are privileged.
Appleton, J. J., Christenson, S. L., & Furlong, M. J. (2008). Student engagement with school: Critical conceptual and methodological issues of the construct. Psychology in the Schools, 45(5), 369–386.
Archambault, I., Janosz, M., Fallu, J. S., & Pagani, L. S. (2009). Student engagement and its relationship with early high school dropout. Journal of Adolescence, 32(3), 651–670.
Bandura, A. (1997). Self-efficacy: The exercise of control (1st ed.). New York: W. H. Freeman.
Bell, C. A., Gitomer, D. H., McCaffrey, D., Hamre, B., & Pianta, R. C. (2011, April). An argument approach to observation protocol validity. Paper presented at the annual conference of the American Educational Research Association, New Orleans, LA.
Bryk, A. S., Sebring, P. B., Allensworth, E., Luppescu, S., & Easton, J. Q. (2010). Organizing schools for improvement: Lessons from Chicago. Chicago, IL: University of Chicago Press.
Bryman, A. (2006). Integrating quantitative and qualitative research: How is it done? Qualitative Research, 6(1), 97–113.
Burch, P., Heinrich, C., Farrell, C., Good, A., & Stewart, M. (n.d.). Integrating the qualitative and quantitative in education policy research. Working Paper. Retrieved from http://sesiq2.wceruw.org/documents/SESIQ2_%20IntegratingMethods.pdf
Cannata, M., Taylor Haynes, K., & Smith, T. M. (2013). Reaching for rigor: Identifying practices of effective high schools (Research Report, pp. 1–143). Nashville, TN: National Center for Scaling Up Effective Schools.
Cohen, J. (2006). Social, emotional, ethical, and academic education: Creating a climate for learning, participation in democracy, and well-being. Harvard Educational Review, 76(2), 201–237.
Collins, K. M. T., Onwuegbuzie, A. J., & Sutton, I. L. (2006). A model incorporating the rationale and purpose for conducting mixed methods research in special education and beyond. Learning Disabilities: A Contemporary Journal, 4, 67–100.
Creswell, J. W. (2009). Research design: Qualitative, quantitative, and mixed method approaches. Thousand Oaks, CA: Sage Publications.
Creswell, J. W., & Clark, V. L. P. (2010). Designing and conducting mixed methods research (2nd ed.). Thousand Oaks, CA: Sage Publications.
Creswell, J. W., & Plano Clark, V. L. (2007). Designing and conducting mixed methods research. Thousand Oaks, CA: Sage Publications.
Crosnoe, R., Johnson, M. K., & Elder, G. H. (2004). Intergenerational bonding in school: The behavioral and contextual correlates of student-teacher relationships. Sociology of Education, 77(1), 60–81.
Easton, J. Q. (2014, March 5). Welcome to Institute of Education Science Principal Investigators meeting. Retrieved from http://ies.ed.gov/director/pdf/Easton030514.pdf
Erzberger, C., & Kelle, U. (2003). Making inferences in mixed methods: The rule of integration. In A. Tashakkori & C. Teddlie (Eds.), Handbook of mixed methods in social and behavioral research. Thousand Oaks, CA: Sage Publications.
Fredricks, J. A., Blumenfeld, P. C., & Paris, A. H. (2004). School engagement: Potential of the concept, state of the evidence. Review of Educational Research, 74(1), 59–109.
Glaser, B., & Strauss, A. (1967). The discovery of grounded theory. Chicago, IL: Aldine.
Gorard, S., & Taylor, C. (2004). What is "triangulation"? Building Research Capacity, 7, 7–9.
Greene, J. C., Benjamin, L., & Goodyear, L. (2001). The merits of mixing methods in evaluation. Evaluation, 7(1), 25–44. doi:10.1177/13563890122209504
Greene, J. C., Caracelli, V. J., & Graham, W. F. (1989). Toward a conceptual framework for mixed-method evaluation designs. Educational Evaluation and Policy Analysis, 11(3), 255–274. doi:10.3102/01623737011003255
Johnson, R. B., & Onwuegbuzie, A. J. (2004). Mixed methods research: A research paradigm whose time has come. Educational Researcher, 33(7), 14–26. doi:10.3102/0013189X033007014
Johnson, R. B., & Turner, L. A. (2003). Data collection strategies in mixed methods research. In A. Tashakkori & C. Teddlie (Eds.), Handbook of mixed methods in social and behavioral research (pp. 297–319). Thousand Oaks, CA: Sage.
Kane, T. J., & Staiger, D. O. (2012). Gathering feedback on teaching: Combining high-quality observations with student surveys and achievement gains. Seattle, WA: Bill & Melinda Gates Foundation.
Klem, A. M., & Connell, J. P. (2004). Relationships matter: Linking teacher support to student engagement and achievement. Journal of School Health, 74(7), 262–273.
Lee, V. E., & Smith, J. B. (1999). Social support and achievement for young adolescents in Chicago: The role of school academic press. American Educational Research Journal, 36(4), 907–945. doi:10.3102/00028312036004907
Libbey, H. P. (2004). Measuring student relationships to school: Attachment, bonding, connectedness, and engagement. Journal of School Health, 74(7), 274–283.
Marks, H. M. (2000). Student engagement in instructional activity: Patterns in the elementary, middle, and high school years. American Educational Research Journal, 37(1), 153–184.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook (2nd ed.). Thousand Oaks, CA: Sage Publications.
Morse, J. M. (2003). Principles of mixed methods and multimethod research design. In A. Tashakkori & C. Teddlie (Eds.), Handbook of mixed methods in social and behavioral research. Thousand Oaks, CA: Sage Publications.
Morse, J. M. (2010). Simultaneous and sequential qualitative mixed method designs. Qualitative Inquiry. doi:10.1177/1077800410364741
Murnane, R. J., & Willett, J. B. (2010). Methods matter: Improving causal inference in educational and social science research (1st ed.). New York: Oxford University Press.
Murphy, J. F., Elliott, S. N., Goldring, E. B., & Porter, A. C. (2006). Learning-centered leadership: A conceptual foundation. New York, NY: The Wallace Foundation.
Nasir, N. S., Jones, A., & McLaughlin, M. W. (2011). School connectedness for students in low-income urban high schools. Teachers College Record, 113(8), 1755–1793.
Pajares, F., & Urdan, T. C. (2006). Self-efficacy beliefs of adolescents. Charlotte, NC: IAP.
Pianta, R., Hamre, B., & Mintz, S. (2011). Classroom assessment scoring system-secondary manual. Charlottesville, VA: The Center for Advanced Study of Teaching and Learning.
QSR International. (2012). NVivo Qualitative data analysis (Version 10). Melbourne, Australia: QSR International Pty Ltd.
Tashakkori, A., & Teddlie, C. (1998). Mixed methodology: Combining qualitative and quantitative approaches (1st ed.). Thousand Oaks, CA: Sage Publications.
Tashakkori, A., & Teddlie, C. (2003). Handbook of mixed methods in social and behavioral research. Thousand Oaks, CA: SAGE Publications.
Tashakkori, A., & Teddlie, C. B. (2010). SAGE handbook of mixed methods in social & behavioral research (2nd ed.). Thousand Oaks, CA: Sage Publications.
Teddlie, C., & Tashakkori, A. (2006). A general typology of research designs featuring mixed methods. Research in the Schools, 13(1), 12–28.
Walker, C. O., & Greene, B. A. (2009). The relations between student motivational beliefs and cognitive engagement in high school. The Journal of Educational Research, 102(6), 463–472.
Woolley, C. M. (2009). Meeting the mixed methods challenge of integration in a sociological study of structure and agency. Journal of Mixed Methods Research, 3(1), 7–25.
Value Added Research Center (VARC). (2014). Measuring school effectiveness: Technical report on the 2011 value-added model. Nashville, TN: Vanderbilt University, National Center on Scaling Up Effective Schools.
Yin, R. K. (2009). Case study research: Design and methods (4th ed.). Thousand Oaks, CA: Sage.
Yoshikawa, H., Weisner, T. S., Kalil, A., & Way, N. (2008). Mixing qualitative and quantitative research in developmental science: Uses and methodological choices. Developmental Psychology, 44(2), 344–354.