
Effective Schools and Effective Principals: Effective Research?

by Perry A. Zirkel & Scott C. Greenwood - 1987

This article cautions that prescriptive pronouncements for school improvement currently in vogue are not all clearly justified by research on school effectiveness. An overview of the strong principal factor is used as an example. (Source: ERIC)

Pronouncements and policies based on the “effective schools research” put the principal at the top of the agenda for educational reform. The late Ronald Edmonds identified five factors, including “the principal’s leadership and attention to the quality of instruction,”1 that have become what one observer has called “the new catechism of urban school improvement.”2 One government official, writing in his private capacity, has asserted in both popular and scholarly publications that implementation of these five factors is a judicially enforceable legal duty of urban public schools.3 Another government official, then also writing in his private capacity, identified instructional leadership by the principal as one of the “commandments” for bringing about effectiveness in all public schools, not just those in urban areas.4 Finally and most recently, the U.S. Department of Education issued, with great fanfare and an express endorsement from President Reagan,5 a booklet entitled What Works: Research about Teaching and Learning that unequivocally listed “strong instructional leadership” as one of the most important characteristics of effective schools.6

These research-based prescriptions have great allure. They seem to confirm the simple, commonsense notion that the great school principal is indeed principal in making the school great. As a result, this “principal principle” has been given virtual legal legitimacy without regard to limitations in the underlying research.7 Edmonds himself had cautioned: “The point here is to make clear at the outset that no one model explains school effectiveness for the poor or any other social class subset.”8 Similarly, in their review of the early research, Purkey and Smith warned that “blanket acceptance . . . would be dangerous” and that “[we] are suspicious of the ‘great principal’ theory.”9 Pointing to differences between and among the findings and conclusions of the same research, D’Amico cautioned that “as yet, there are no recipes for effective schools.”10 Even in his response to D’Amico, one of these reviewed researchers concluded that “more research is needed before this work could ever hope to meet the standards of a ‘recipe.’”11

The purpose of this article is to demonstrate that the far-flung prescriptive pronouncements currently in vogue, whether as recipes or as frameworks for school improvement, are not at all clearly justified by the research on effective schools. Focusing by way of example on the first factor, the strong principal, this overview will show that these supposed solutions not only have ignored the limitations in and warnings about the early research, but also are not supported by the findings of the more recent research. We offer an up-to-date overview and a cautionary reminder, not a detailed critique or a rival theory.


The early research on effective schools arose largely in response to the Coleman report and similar studies12 that seemed to show that academic achievement was primarily a function of nonschool variables. This first phase of the effective-schools research was reported in various papers from approximately 1970 to 1980. The details of these studies do not bear repeating here; the available reviews adequately explain the limitations.13

First, with regard to sampling, most of these studies were limited to elementary schools, especially in urban areas, and were characterized by small sample size. Second, with regard to instrumentation, the measurement of strong principals was not systematic. Third, with regard to procedure, a substantial segment of the studies used a case-study approach, which is appropriate for exploration rather than generalization, and limited their examination to effective schools only. Even those studies that statistically compared effective schools (positive outliers) with ineffective or average schools (negative outliers) were not consistent in terms of the type of negative outliers, the control of student background differences, and the inclusion of the leadership variable. The findings with respect to the leadership variable were not conclusive, as admitted by Sweeney in his efforts to synthesize them.14 For example, the Maryland study concluded that effective schools had principals who exercised strong instructional leadership, while the Delaware study found that effective schools had principals who emphasized administrative activities. The emphasis on instructional leadership was found in only three of the seven studies reviewed by Purkey and Smith, and in some cases it was attributable to staff members other than the principal. In a review of fifty-nine systematic case studies, McCarthy found the principal’s leadership identified as important to school success in only 27 percent of the studies, with variations between content and process emphasis.15

Finally, and perhaps most importantly, multivariate, longitudinal studies designed to trace causation were virtually nonexistent. Even if the various researchers had agreed on the salient five factors identified by Edmonds, which is not clearly the case, Edmonds warned that these characteristics have been shown only to be correlates, not causes, of improved school achievement.16 A reviewer of the above-mentioned legal proposal pointed out:

As a matter of logic, the five characteristics’ presence in effective schools may be a consequence of other factors that have caused students to perform well. Conversely, the ineffectiveness of schools that do not manifest the five characteristics may not be the result of the failure to implement those characteristics, but of other factors that may have prevented the manifestation of the five characteristics. Ratner’s insistence that causation is established by the correlation between effective schools and the five characteristics thus is logically equivalent to a doctor’s determination that smiling is a cure for cancer because people whose cancers are in remission smile more than those whose cancers are spreading.17

Other reviewers have expressed similar cautions, from both legal and educational perspectives. Rowan, Bossert, and Dwyer expressed cautions as to causation specific to the leadership variable, including the fact that the relationship between student achievement and instructional leadership may be due to effective organizations attracting or molding effective leaders rather than to effective leaders’ bringing about their organizations’ effectiveness. Murphy, Hallinger, and Mitman similarly forewarned against the “white knight” view of educational leadership and the premature application of related research.18


Yudof’s warning that educational policy based on this first phase of the effective-schools research is built on “shifting sands” becomes an unerring prediction in light of the more recent research.19 A second phase (or generation) of studies focusing on the relationship between school and principal effectiveness was reported in the mid-1980s. These specialized studies have not heretofore been included in syntheses of the applicable research base. In general, they are more advanced than the earlier studies in several respects, including provision for comparison schools, control of background variables, and measurement of the principal’s leadership.

Several of these studies were doctoral dissertations. Although research at this level lacks prestige, the care, effort, and, not infrequently, committee oversight behind it are too easily dismissed. Indeed, the quality of published studies and presented papers in the effective-schools literature varies widely and at least overlaps with that of the doctoral dissertations. Although the readily available information is often limited to abstracts, the cumulative weight of these doctoral studies cannot be ignored.

In his doctoral study of middle schools in Missouri, Ayres found that the correlation between principal effectiveness, as measured by the overall scores of the “Audit of Principal Effectiveness,” and student achievement, as measured by gain scores during grades seven and eight on standardized achievement tests, was not statistically significant. Similarly, he found no significant difference in the overall principals’ effectiveness scores between the schools with the highest average student gain scores and those with the lowest average student gain scores. Some subscores of the principals’ effectiveness instrument (e.g., directional leadership, instructional management, and affective involvement), however, did yield significant relationships.20

Mack, in his doctoral study of twenty-eight suburban schools, found that principals’ expectations for and role consistency with teachers were not significantly related to any of the four selected indicators of school effectiveness (student reading achievement scores, student reading attitude scores, percentage of students above grade level in reading, and teacher-perceived school effectiveness). Moreover, although principals on average rated their efforts to assist teachers as moderate, their increased effort was inversely related to teacher-perceived school effectiveness. Finally, what principals actually do, according to teachers and principals in the same school, was inversely related to teacher-perceived school effectiveness.21 Similarly, in her doctoral study of principals in eighteen elementary schools in one urban community, Adie found that these principals’ expectations and their behavior were not significantly related to student reading achievement.22

In his doctoral study of the schools in an urban district, LaMarr found that instructional leadership, as measured by the principals’ self-ratings and their supervisors’ ratings on the “Leadership Behavior Description Questionnaire” and also their 1982-1983 evaluations, was not significantly related to student achievement. Moreover, for the leadership behaviors on the 1982-1984 evaluation instrument, the student achievement in schools with principals’ ratings above the mean was significantly less than the student achievement in schools with principals’ ratings below the mean.23

Three separately designed and conducted dissertations in Pennsylvania, all using statewide achievement test data that were adjusted for differences in student background and various other “condition variables,” reached surprisingly similar results. Martinez-Antonetty found no significant differences between effective and ineffective middle and junior high schools with respect to the instructional leadership, leadership style, and sources of authority of the principals, based on structured interviews with a cross section of each school staff (principal, assistant principal, teachers, and nonprofessionals). Matula found no significant differences between effective and ineffective high schools, as measured by the achievement scores of eleventh-grade students, with respect to principals’ instructional style, as measured by both self-ratings and teacher ratings on Blake and Mouton’s “Managerial Grid.” Landis found no significant difference between effective and ineffective middle schools, as measured by the achievement scores of eighth-grade students, with respect to overall principal leadership, as measured by a comprehensive teacher rating scale. He found significant differences favoring the effective schools for two subscores (goal commitment and decision making) and favoring the ineffective schools for three subscores (monitoring, routines, and resourcefulness). One of the remaining six subscales, which did not yield significant differences, was instructional leadership.24

Moreover, the recent studies are not limited to doctoral dissertations. In a study that is part of the Seattle Effective Schools Project, Andrews, Soder, and Jacoby divided thirty-three elementary schools into three groups—strong leader, average leader, and weak leader—based on the ratings of the principals by the teachers in each school on a comprehensive instructional leadership instrument. Schools in each group were comparable in size, ethnicity, and surrogate socioeconomic status (SES). There were no significant differences among the three groups of schools with respect to mean scores of academic achievement. However, there were significant differences among the three groups of schools with respect to gain scores of academic achievement (over a two-year period), favoring the schools administered by principals who were rated by their teachers as strong instructional leaders. These gain-score differences generally were significant for black and free-lunch students but not for white and paid-lunch students.25

In a study that is part of the Louisiana School Effectiveness Study, Wimpelberg identified nine pairs of “more effective” and “less effective” schools that were geographically, ethnically, and economically representative of that state’s population of public schools. The criterion for selecting more and less effective schools was mean reading achievement scores adjusted for SES and ethnicity. Data were available and were analyzed for six pairs of schools. He found that the Likert-type responses of the principals in the “more effective” schools did not differ significantly from those of the principals in the “less effective” schools for the item “I am able to monitor instruction very closely.” However, in the follow-up responses during the interviews, he reported detecting a difference in “the flavor with which the principals described their involvement in the classroom.”26

Other recent studies form a category of less direct relevance but further demonstrate the mixed array of terminology and results. In a study of eighty-eight California elementary school districts, Glasman found no significant differences between principals designated by their superintendents as “most” and “least” effective with regard to their own perceived sense of efficacy in the use of data on gains in student achievement. In successive studies of elementary and secondary schools, respectively, in a southeastern state, Moody and Amos found that principals in high-achieving elementary schools had significantly higher total mean scores on a group interaction instrument than did principals in low-achieving elementary schools; however, there was no significant difference in the corresponding analysis for secondary school principals, and there was no control for student background characteristics. In their study of nine schools in a school-effectiveness project in an urban district in the Midwest, High and Achilles found mixed results with regard to the difference in “influence-gaining behaviors” between principals in high-achieving schools and those in “other” schools, depending on the source of the ratings (teachers, principals, observers) and the nature of the response options (rank vs. degree). In her doctoral study of nineteen schools in a single suburban district, O’Day found a significant relationship not only between teacher ratings of principal instructional management and positive discrepancy scores (between actual and expected student achievement), but also between self-ratings of principal instructional management and negative discrepancy scores for academic achievement. In a national study of effective secondary schools, as determined by a national panel based on their programs, policies, and practices, Huddle concluded that no one leadership style predominated.
In his dissertation, Grimsley found that the Machiavellian orientation of the principal was not significantly related to school effectiveness, defined as the health of the organization. More peripherally, in her doctoral study Patterson found no significant differences between effective and ineffective principals, as nominated by supervisors in the eighty-eight participating California school districts, with regard to their survey responses to seven common situations related to student achievement.27


On balance, the research evidence against the notion that great principals make great schools appears at least as weighty as the evidence for it. Contrary to some less comprehensive reviews or interpretations, the findings do not show “remarkable consistency.”28 The seeming contradiction in this evidence gives rise to several possible explanations and questions, a substantial sample of which relate directly to effective schools and effective principals, respectively.


What is an “effective school”? The bulk of the studies would seem to agree that academic achievement is the criterion. However, a few of the studies remind us that a “natural systems” theory, which focuses on organizational health or climate, has arisen as an alternative to the “goals-centered” theory, which focuses on student academic achievement. The definitions, measures, and results of effective-schools research often vary according to the theory of the evaluator.29

Even within the goals-centered approach, it is at least arguable that measures of not only cognitive, but also affective, student outcomes should be included. Academic achievement in basic skills may well be a necessary but not sufficient definition of effectiveness. Further, within the cognitive area, it is not at all clear whether the measurement should be norm-referenced or criterion-referenced and whether it should be limited, particularly in the upper grades, to basic skills in reading and math. Even if one accepts the answer of the bulk of the studies that the tests should be criterion-referenced and limited to basic skills, the studies vary as to whether the criterion should be mean scores or gain scores. The latter are used much less frequently but perhaps much more appropriately.30 Similarly, the studies differ as to whether these scores should be analyzed “as is” or with adjustment for condition variables, such as SES. The preferable answer would appear to be the latter, but some studies have not employed this approach.31 Perhaps these researchers were troubled by the not remote possibility within the adjusted-score approach that a disadvantaged school with below-average student achievement could be classified as effective and a highly advantaged school with above-average scores could be in the ineffective or “other” category. 
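The adjusted-score approach described above can be made concrete with a short sketch. Everything below is invented for illustration (the data, the linear model, and the one-standard-deviation cutoff); the only point is that classification turns on the residual from a regression of achievement on a condition variable such as SES, not on the raw mean.

```python
import numpy as np

# Illustrative only: all data and parameters below are invented.
rng = np.random.default_rng(0)
ses = rng.uniform(0.0, 1.0, 50)                      # condition variable
achievement = 40 + 30 * ses + rng.normal(0, 3, 50)   # raw school mean scores

# Fit achievement = a + b * SES by ordinary least squares.
b, a = np.polyfit(ses, achievement, 1)
residual = achievement - (a + b * ses)               # actual minus predicted

# A school is a positive ("effective") outlier if it scores well above
# what its SES predicts -- regardless of its raw mean score.
effective = residual > residual.std()
```

Under such a scheme, a low-SES school can be classified as effective even though its raw mean remains below the sample average, which is exactly the possibility that may have troubled the researchers who declined to adjust.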
Rowan pointed out additional variations and problems, including his finding that “using regression analysis, we found that only 50 percent of the schools identified [by gain scores] as effective in one year remained effective the next year.”32 Rowan, Bossert, and Dwyer also found that “the school-to-school differences in achievement that emerge when extreme outliers are compared are confounded with differences due to random error.”33 Mandeville similarly found methodological problems and inherent inconsistency of improvement, leading to his conclusion that many unsolved problems still exist in the identification of effective schools.34
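The instability Rowan describes is easy to reproduce in spirit with a toy simulation (all numbers below are invented, not drawn from his data): when observed gain scores mix a stable school effect with year-specific random error of comparable or larger size, schools flagged as positive outliers in one year frequently fail to repeat the next.

```python
import numpy as np

# Toy simulation, illustrative only: a stable school effect plus
# year-specific noise that is twice as large.
rng = np.random.default_rng(1)
n = 2000
true_effect = rng.normal(0, 1, n)            # stable component
year1 = true_effect + rng.normal(0, 2, n)    # observed gain scores, year 1
year2 = true_effect + rng.normal(0, 2, n)    # observed gain scores, year 2

# Flag the top 20 percent of schools each year as "effective" outliers.
flagged1 = year1 >= np.percentile(year1, 80)
flagged2 = year2 >= np.percentile(year2, 80)

# Fraction of year-1 "effective" schools that repeat in year 2.
persistence = (flagged1 & flagged2).sum() / flagged1.sum()
```

With noise of this size, well under half of the flagged schools persist from one year to the next, even though nothing about the schools themselves has changed.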

For these and other reasons, Kyle’s reminder bears repeating: “We tend sometimes to forget the obvious: phrases such as ‘effective school’ . . . can become mere incantations masking different perceptions of what effective schools really are.”35


Kyle’s caveat can be fittingly extended to “instructional leadership” and the variations on this term that are used in the research, often without clear differentiation or definition. The dramatic diversity that is masked under the at least rough commonality of terms is revealed by looking at the literature. For example, in a review of articles about the school principalship since the early 1950s, Glasman found that instructional leadership was only one of several value-laden role definitions and that none of these definitions viewed the principal as directly responsible for improving student achievement. As a more recent example, Pitner and Hocevar found support for a multidimensional conception of principal leadership, which does not appear to be adequately measured by relying on teachers as the sole data source.36 The diversity is also revealed by the instruments used in the two generations of studies reviewed herein. The instruments vary significantly in scope, type, and source. Some have a broad scope, looking at leadership generally; others focus on instructional leadership. Some are systematically quantitative or qualitative; others are impressionistic observations or interviews. Some rely on responses of the principal; others use teachers or observers. For example, some of the effective-schools studies and some research in related areas found notable differences between principals’ and teachers’ or supervisors’ perceptions of the principals’ leadership.37 Similarly, Wimpelberg reached different conclusions based on the quantitative and qualitative parts of his instrument.38 Likewise, some studies obtained dramatically different findings depending on the subscale of the instrument.39 Collectively, such differences would seem to partially account for the variance in the results.

Further, rather than consisting of the list of indicators constituting each instrument, instructional leadership may well be multidimensional, involving the interplay of personal traits, leadership styles, management behaviors, and contextual factors.40 For example, ignoring the differences between elementary and secondary schools is at least as problematic for the measurement of instructional leadership as it is for the measurement of academic achievement. Some studies suggest that the elementary-secondary context is a possibly significant intervening variable,41 although it does not appear to be the primary explanation for the variation between the results of the early and recent research.

Even if instructional leadership, or some identifiable subset thereof, is a consistent correlate of effective schools, policymakers should be cautious about their conclusions and actions. First, there is a glaring absence of multivariate, longitudinal research designed for inferences about causation. Second, the ranking of the leadership factor and its interaction with the other factors have not been explored.42 Finally, some research syntheses have concluded that this leadership may be supplied by other members of the school staff.43 In sum, as Cohen concluded: “Research on what principals actually do, and on the consequences . . . for student learning, is still in its infancy.”44

The substantial variances in the conceptualizations of effective schools and effective principals combine to yield curious results. For example, finding two different types of effective schools in the research, Stedman reformulated the unsubstantiated factors, omitting strong principal leadership as one of the seven “supporting characteristics” across the two types.45

In light of the marked limitations of the early research and the mixed results of the more recent research, broad characterizations of such findings as “consistent, persuasive, and fairly stable over time”46 are premature overstatements. Pronouncements in the form of administrative policy frameworks or judicially enforceable duties are, to the significant extent that they purport to rely on this research base, not objectively acceptable. Similarly, academic plans and prescriptions that claim to be “research based”47 need to be more careful about the “noteworthy shortcomings”48 in this base. For example, a brief caveat about the limitations and non-generalizability of a study does not justify broad prescriptions about the pre-service training, selection, and continuing education of principals.49

We argue for balance. Research should not be ignored or abandoned because it is imperfect. Rather, it merits cautious assessment and continued improvement. Similarly, advocacy and activism have their place, but that place is not where the “rhetoric of reform” poses under the “guise of positive science.”50 Finally, policymakers should be wary of creating false expectations for the principals of their schools. Watchwords of “effective” schools or principals are “politically loaded”; they are susceptible to attracting requirements rather than resources.51 Calling for “a new vehicle” rather than “repainting [the] jalopy,” for example, the recent proposal of a high official in the Department of Education includes holding the principal personally accountable for the performance of his or her school, particularly in terms of student achievement. This proposal curiously combines with federal and state recommendations for certification requirements of principals to attract managerial entrepreneurs and executives rather than instructional experts. Either way, if the principal does not “produce” high student achievement, it may well be a reflection of limitations in the “state of the art” of education and of research.52 Whether schools are loosely coupled or not,53 the principal is a full step removed from the complex teacher-student connection, which is itself built on a limited, albeit improving, research base.

We join others, from Edmonds to Sirotnik, who have called for caution, healthy skepticism, and inquiry into the complexity of schooling.54 The advocacy should not be mistaken for, or be preemptive of, the fact finding or the judgment in the trial of the Great Principal—Great School theory.


Cite This Article as: Teachers College Record, Volume 89, Number 2, 1987, pp. 255-267