The Florida Performance Measurement System: A Consideration
by C. J. B. Macmillan & Shirley Pendlebury - 1985
The Florida Performance Measurement System is probably the most extensive attempt in recent years to translate research on teaching into a practical form for use in training, evaluating, and rewarding teachers. However, the FPMS reflects no sense of the values inherent in teaching and misses all the joy of teaching. (Source: ERIC)
It has long been a dream of educationists that the evaluation of teaching be based on something other than the whims of administrators and supervisors of teachers. Everyone knows what that means at its worst: Unthinking, often theoretically myopic bureaucrats take rumors, noise levels, and even the level of the window blinds as crucial indicators of teacher competence. No attention is paid to the careful thought the teacher has put into lessons; no attempt is made to consider the many ways in which teaching can be excellent or even competent. Anything that moves us away from such whimsicality should be an improvement, and something that is based on careful empirical research should be welcomed with glee by anyone concerned with the present state of American schools.
The Florida Performance Measurement System (FPMS) is a direct attempt to make this dream a reality. The product of a coalition of writers working under the chairmanship of B. O. Smith, it is probably the most extensive attempt in recent years to translate research on teaching into a practical form for use in training, evaluating, and rewarding teachers. The most important FPMS document is Domains: Knowledge Base of the Florida Performance Measurement System (henceforth Domains); other titles include Handbook for the Beginning Teacher Program for Florida Teachers; Manual for Coding Teacher Performance on the Summative Observation Instrument, a short-form version of Domains that does not consider the research background; and the all-important Summative Observation Instrument (SOI), used by observers in classrooms.1
We should be greeting this attempt with glee, and yet there is something about these publications that makes one shiver. When we first started looking at the FPMS publications, our first reaction was horror at the expansion of the behavioristic tradition of bureaucratic manipulation of teachers to please the system; our second reaction was horror at what looks like a positivistic, fallacy-ridden version of technicist science. Our third reaction was amusement at the triviality of many of the points made and the ways in which they are made. Whatever might seem of value in the materials published here is lost in the verbiage, hidden behind educationese language, bureaucratic necessities, and misunderstandings of teaching. Hence, we took on this task of reviewing the whole thing before a narrow interpretation of it is encased in concrete.
There is too much to say about this set of documents in a short article like this, or perhaps in a lifetime. We shall try to give a relatively sympathetic critique of the most important work, Domains, in the hope that some good can be salvaged from the FPMS and for the future of American teachers. For this is the rub: There is some good here. Some features of the FPMS could have long-term benefits for the teaching profession and perhaps for the school as an institution (perhaps not for both, but that is another story). It seems to us that the authors of FPMS misunderstand what they are doing.
To get at this, we will take a look first at the purposes of the system, then at its assumed view of science, and finally at the notion of teaching that is assumed here. This rather disjointed affair should be thought of as a start, a place to begin discussion, rather than a complete critique.
THE POINT OF THE FPMS
Suppose we accept this premise: American teachers are not as good as they ought to be. Hardly anyone would disagree with such a statement.
Try a second premise: The presently existing corps of teachers can be improved on the job and the incoming group of teachers can be trained well only if their attention is focused on what constitutes good teaching. This premise needs more qualifications than the first: for example, the low pay of teachers may keep good people from even thinking of entering the profession; the bureaucratic necessities of running a system may get in the way of establishing good teaching in school; or the conflicting purposes of the public, profession, and institution may work against any sort of improvement in the teaching corps. Mere attention to good teaching will not solve the schools' problems.
But the second premise still has a note of truth about it: It is to teaching that we have to look if we want to give advice on how to improve teaching, not to the social class of the children, not to the social problems of the society, not to the pay of the teachers. Rather, we have to examine what individual teachers do, and we have to try to figure out how to evaluate what they are doing and how to get them to improve (however we define improvement). So accept the second premise as qualified: To improve teaching, focus on teaching.
There is a third premise: If you want to find out what effective or good teaching is, you look at what teachers do and correlate their actions with the resulting student achievements. If you do enough correlations of such actions with results, you should come up with a list of things you could recommend that beginning teachers do, or things you could suggest, with good reason, that they not do. We have all heard the maxims that used to be passed on: "Don't smile till Thanksgiving," "Enlist the biggest boy as your aide," and so forth. We are all searching for such things, but we want them to be verified, to be scientifically respectable.
It is hard to fault such a premise. If we reject this idea, we would then be saying something like, "There's no way to learn from experience of teaching, there are no generalizations that can be made about teaching-and-learning that can be supported by observation." To deny the premise would seem to deny the possibility of improvement, except by fluke. All of our experience as teachers and as students of teaching goes against this kind of cop-out. We do, can, and ought to learn from experience, to generalize about teaching as about everything else, to study teaching rigorously and empirically.
The FPMS is an attempt to work out the implications of these three premises. One might attack its source and its ultimate purpose: It comes from the bureaucrats, for the purpose of controlling the schools, for the sake of centralized and unified systematization of the institutions of the state, perhaps at the cost of teacher autonomy. This argument takes us into political discussions of the purposes of schools: how schools should be related to centralized authority, how things might be if we had a different set of relationships among teachers, students, administrators, taxpayers, and so forth.
But accept this as a given: It is unlikely that American schools are going to change much in their basic structure. The FPMS is likely to be a part of the future of American schools in some form or another as long as we have this sort of institution. This might be good reason for rejecting the present school system, but until we reject it, it would be well to see how something like the FPMS could look if it were used decently by people who understood it.
Perhaps we should not argue against it too much anyhow, for there is a central benefit in the basic idea: If we want to improve schools, we want the system to be one in which teachers will be considered important, and not merely for who they are as individuals, for how much they are paid, or even for who is in charge. The central questions will be what they do in teaching (and by teaching). If teachers were really evaluated on this basis, really trained to do the right things, to understand why one ought to do one thing rather than another, we would be a long way toward improvement of schooling. And that is the point of the FPMS.
SCIENTIFIC STATUS OF DOMAINS
The domains in the FPMS are a way of cutting the knowledge base of teaching into manageable bites. They are six areas of activity that the authors, on the basis of their survey and analysis of research, consider crucial to teaching: Instructional Planning, Management of Student Conduct, Organization and Delivery of Instruction, Presentation of Subject Matter, Communication: Verbal and Nonverbal, and Evaluation of Student Progress. Each domain is further subdivided into a set of concepts, each of which has a set of related behavioral indicators, a set of teaching principles, and a survey and analysis of relevant research findings. Most of the findings the authors consider relevant to their purposes are generated from the process-product tradition of research on teaching, which also provides the theoretical underpinnings of the FPMS project.2
A major feature of the key document is the authors' conception of the relationship between the so-called concepts, their behavioral indicators, and empirical research on teaching:
Each domain treats the concepts and indicators of behavior derived from the research that relate to a specific area of teaching activity, provides the principles for effective teaching formulated from these concepts and indicators, and includes research studies that support the concepts, indicators and principles.3
What concerns us here is the claim that the concepts for each domain are derived from and supported by empirical research. The outline the authors give for the domain of Instructional Planning will serve as an example in our critical examination of this claim:
Definition: This domain = preclassroom teacher activities that develop schemata for classroom activities.
This domain consists of the following named concepts:
1.1 Content Coverage
1.2 Utilization of Instructional Materials
1.3 Activity Structure
1.4 Goal Focusing
For each of these concepts there is a list of behavioral indicators that is presumably intended to help observers identify when and whether the required concept has come into play. To ensure uniformity of understanding of the listed indicators, the authors provide a definition of each one.
Here, for example, is the indicator for Identification/Selection of Content, which is the first indicator for the concept of content coverage:
Definition: Identification/selection of content = teacher names one or more skills, concepts, facts, rules, principles, laws, or value statements to be taught during a period of instruction.
Teacher says, "Definition by context will be discussed."
"The fourth day will be Halloween reading math. Now, that's the work they will be doing with me, g, and p; so I need to look and see what work they will be expected to complete at their seats by themselves."5
One critic of research on teaching has pointed out that both quantitative and qualitative research fall into the error of assuming that conceptual clarification is simply a matter of providing precise definitions.6 The authors of Domains have fallen into this error; for them a precise definition is a behavioral one (no matter how tortuous, obvious, or trivial); hence the behavioral indicators for the central concepts and the further behavioral definitions for those indicators.
But there is something more seriously wrong with the way in which the authors think about concepts. Just what is wrong is best indicated by the following questions: What is the supposed status of the concepts listed for each domain? Are they offered as concepts that are constitutive of teaching in much the same way as bidding is constitutive of bridge or checkmating of chess? Or are they simply convenient ways of slicing different aspects of teaching to a more manageable size for the purposes of research or evaluation? Or are they perhaps what the authors consider to be the necessary components of effective teaching (or, in the present example, the components of effective planning)?
Whatever the status of the listed concepts, it is surely mistaken to assume that they could have been derived from empirical research. Rather, they (or some other concepts) were a prerequisite for such research. The authors of Domains fail to recognize that empirical research has neither point nor direction unless the researcher makes prior decisions concerning its focus and scope. Some decisions will depend on other research that has already been undertaken in the field, but others will depend on the researcher's conceptualization of the activity to be studied, in this case teaching. Concept-free observation is an impossibility. Or, to put the point in Kantian terms, percepts without concepts are blind. This general point is at its sharpest when the object of research is an activity. There is no way to identify or sort activities without attending to the agent's intentions, for it is these intentions that provide the formal principle of unity for the many different instances of an activity.
The agent's intentions are particularly important in identifying instances of a polymorphous activity such as teaching. Many of the guises in which teaching comes are characteristic of other activities too. Think of the varied instances in which we would describe an activity as teaching: a mother encouraging her toddler to walk from the safety of one armchair to another, a small boy showing a friend how to play marbles, Jesus Christ telling his parables, Socrates asking his endless questions, a classroom practitioner listening to groups of children trying to work out a problem in mathematics. Yet encouraging, showing, telling, asking, listening (and all the countless other things one does when teaching) are also characteristic of many of our other engagements with people. So when do they fall under the concept of teaching rather than under the concept of selling, or conversing, or acting, or whatever?
The reverse of this is problematic also: surely not everything that goes on in classrooms is properly thought of as teaching; limits must be drawn somewhere. Educationists can and do distinguish between different things that people called teachers do in classrooms: counseling, cleaning up, scolding, cheering children up (and on), and teaching. When are these other things teaching? When are they irrelevant to teaching? An understanding of teaching requires answering just such questions. And they cannot be answered through observational and experimental studies. It is essential to come to grips with what Wittgenstein called the "logical grammar" of teaching and its near conceptual relatives.
The major point is that it is logically impossible to identify an activity as teaching independently of some formal condition of unity for the concept. Yet this is precisely what the authors of Domains assume can be done. Their claim that the concepts of teaching are derived from empirical research implies that a spectator can identify an activity as teaching without taking account of the agent's conception of what he is doing.
Why is it that the authors of the FPMS should be so concerned to have their central concepts derived from and supported by empirical research? The explanation lies, we suspect, in their epistemological commitment to the search for certainty. To claim this kind of relationship between concepts and empirical research is to claim that the concepts are beyond question. They are so because they have been objectively established, scientifically proven, or so the authors would have us believe. Objectivity is obviously a crucial issue in a document whose express purpose is to provide a knowledge base for the evaluation of teachers' performance. What is at fault in the document is not the attempt to provide objective criteria for judging teaching; rather it is the restricted conception of both knowledge and objectivity implicit in the underlying research tradition. The view of knowledge that underpins the FPMS is one that takes empirical evidence as the only secure foundation for knowledge. In this view, beliefs about the central characteristics of teaching, and in particular of effective teaching, are justifiable only insofar as their derivation from empirical research can be displayed; hence the attempt to set up unassailable ties between empirical research, on the one hand, and concepts, behavioral indicators, and teaching principles, on the other. But the attempt is misconceived.
The FPMS assumes that a general specification of effectiveness can be given, that effective teaching can be characterized apart from the subjects being taught. The criteria of effectiveness are the quantifiable effects of teaching on the learners (e.g., higher achievement, increased learning, etc.). Consider a parallel, though; we might say that an effective actor is one who moves the audience to tears or laughter or some other public overflow of feeling. Yet we would surely not commend the quality of an actor's performance if he merely had an effect on the audience without bringing the play alive for them in a way appropriate to the work itself. Any schoolchild knows how easy it is to draw a laugh from Shakespeare's tragedies: just read in a suitably pompous and tragic voice and one has the class rolling in the aisles. This is hardly quality acting despite its effects; how many people laugh and how much is never by itself a fitting measure of the quality of an actor's performance.
In acting, the emphasis on effectiveness can make us forget the importance of the play in determining what will count as good acting. In teaching, the emphasis on effectiveness can make us forget the complex nature of the interaction among teacher, students, and subject matter. To show what counts as good teaching (and what is necessary for training and evaluating good teachers), attention must be paid to this complex interaction, not just to any single aspect of it.
Yet the FPMS seems to recognize the complexity of teaching; indeed, this may be its strongest point. There is no attempt to give an all-encompassing metaphor that hides the complexity under a spurious unity; not for this coalition the attempt to see teaching as molding minds or promoting growth. The defense of the individual items of behavior reported in each of the domains is to be found in its relation to the objectives of teaching (except insofar as these are themselves part of the domain of planning), not in its relation to an over-arching metaphor of teaching.
Instead, what the FPMS emphasizes are the specifics, the little things that teachers do that together (or separately) lead to effective (or successful) teaching: their smiles, their gentleness in criticism, their keeping children on task, and so forth. Furthermore, these are emphasized as specific behaviors, each of which can be isolated from its environing activities and correlated with appropriate student achievement or conduct.
But is there not a controlling metaphor here after all? For what we are given is the equivalent of the analyses that might be used for the technological establishment of procedures on an assembly line, or to be more up-to-date, the programming of a computer. The reason that it reads so weirdly is that each behavior is separated and considered by itself rather than seen as having a place in a larger sequence of actions. It is almost as if the authors had considered the programming of a computer piece by piece, with each move considered by itself, with no attention paid to how it developed from or fed into other steps. So not only is there some question about the technological model itself as a metaphor for teaching; the model is sadly misunderstood, since that model itself suggests linear planning and steps. Such steps are simply missing from the FPMS.
The failing can be seen most obviously when the FPMS goes into practice. Consider the summative observation instrument (SOI). SOI has all the failings of bureaucracy, and it has some additional failings that come from the misunderstanding of what teaching is about.
The SOI manual (Manual for Coding Teacher Performance on the Summative Observation Instrument, henceforth MCTP) provides a set of directions for the use of SOI by observers:
The summative instrument is an observation tool for coding performance as it occurs in the classroom. It is systematic and nonevaluative if the observer is trained to recognize the teacher behaviors listed on the instrument and to record them in the appropriate spaces. The only judgment required of the observer is at the level of whether a particular teacher behavior fits an item on the instrument. It does not require that observer to judge whether the behavior is effective or ineffective.7
But the SOI itself requires the observer to record teacher behaviors in one of two columns: The first is headed "Effective Indicators" and the second "Ineffective Indicators."8 For example, Item 1 has this description for an effective indicator: "Begins instruction promptly," and this for ineffective: "delays." When it gives instructions for tallying observations, MCTP suggests the following:
1. Begins instruction promptly / Delays
a) Tally in the left column if the teacher:
organizes and begins work promptly after the bell rings.
works with the class as a whole and manages interaction so that time is used productively.
b) Tally in the right column if the teacher does one or more of the following:
does not routinize and handle with dispatch such things as roll reports and lunch tickets.
chats with students so that classwork is delayed.
loses time while students go to their lockers for supplies.9
At the end of the observation period, the observer is to summarize the data:
To complete the data collection process the marks are summed after the observation period and a frequency count is posted in the observation summary box. The observation is completed and the data are ready for analysis.10
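The tallying and summing steps quoted above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not anything found in the FPMS materials: the function name `summarize_tallies`, the `(item, column)` format for a mark, and the sample marks are hypothetical. Only the left-column/right-column convention and the frequency count posted after the observation period come from the MCTP passages quoted here.

```python
from collections import Counter

VALID_COLUMNS = ("left", "right")  # left = "effective", right = "ineffective"


def summarize_tallies(marks):
    """Sum an observer's marks into a per-item frequency count.

    `marks` is a sequence of (item_number, column) pairs; this format is
    an assumption made for illustration, not the SOI's actual layout.
    """
    marks = list(marks)
    counts = Counter()
    for item, column in marks:
        if column not in VALID_COLUMNS:
            raise ValueError(f"unknown column: {column!r}")
        counts[(item, column)] += 1
    # One summary row per observed item, with both column totals.
    return {
        item: {col: counts[(item, col)] for col in VALID_COLUMNS}
        for item in sorted({item for item, _ in marks})
    }


# Hypothetical observation: one prompt start, two delays on Item 1.
sample_marks = [(1, "left"), (1, "right"), (1, "right")]
print(summarize_tallies(sample_marks))
# prints {1: {'left': 1, 'right': 2}}
```

Note that nothing in such a count records why the teacher acted as he or she did; the observer's only decision is which column a behavior falls into.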
It should be clear from this summary that the SOI is an evaluative instrument masquerading as an observation instrument. Under the guise of objective (because noninferential) observation, the SOI provides a loaded summary of what the observer has seen. This is even more obvious when specific notes are given concerning some ineffective indicators. For example, in regard to asking questions that call for personal opinions, nonacademic procedural questions, or questions that call for general nonacademic information, all to be tallied as ineffective, there is a note:
These questions may sometimes serve useful or even necessary purposes; however, they should be tallied here since they do not move the class work along academically.11
This note (and similar ones throughout the MCTP) again indicates that there is a particular model of teaching presupposed in the program as a whole. A technological theory of classroom management and teaching is implied in these remarks, thus working against a theory of "messing about,"12 which has considerable plausibility as a mode of learning, if not of teaching.
This should not be surprising, perhaps, since so much of the research on which this program is based was designed specifically to test (and reject) the plausibility of open education styles of pedagogy. The criteria used to determine the effectiveness of different teacher behaviors were almost always limited to scores on standardized achievement tests, overlooking other factors such as enjoyment of the material, continuing interest, and so forth. Putting one theory of teaching into law in this fashion may simplify matters, but it does injustice to the richness of teaching and to the possibilities of other modes of teaching that are effective in different ways.
The educationists' dream will finally become a nightmare, because this way of describing and evaluating teaching puts the whims of a theory into the hands of observers who are not given the appropriate intellectual tools to use in the analysis and evaluation of teaching. One can imagine the unimaginative observer/evaluator who says, "Your performance is not up to standard; just look at all these ineffective indicators." But a teacher might intentionally ignore students, for example, or give no practice before changing topics, for good reasons. Yet ignoring students or their responses is coded only as an ineffective indicator on SOI. It is not clear what could count as a good reason within the FPMS way of evaluating teachers.
The evaluator who treats the pattern of observations not as evidence of competence but rather as a source for questions about the teacher's reasons for acting as he or she does is in a stronger position to consider the teacher's real competence at teaching within a well-worked-out theory (or philosophy) of teaching. To assume that the positive and negative indicators provide anything more than mere evidence is to mistake almost entirely what is going on in the classroom.
This approach to the evaluation of teaching mistakes its own roots and purposes. For what is proposed here is evaluation, criticism of teaching, not mere scientific description and assessment of a set of productive means to predesignated ends.
Consider an analogy. When a drama critic attends a play with a critical eye, he brings to it a tradition within which he considers the present exemplification. He looks at the play as part of a tradition with its own standards for evaluation. His comments attend to the play, the acting, the staging, direction, and so forth. When faced with something remarkably new (as, for example, the first critics of Beckett's Waiting for Godot were), he may be at a loss, but he searches for new criteria within the tradition or develops new ones to fit the drama before his eyes. There are good and bad critics as well as good and bad plays. A bad critic will not be able to see the present object in its context as a play; he will be at a loss as to what to say if it is out of the ordinary.
We do not want to go so far as to say that teaching is like drama. The notion of a script, for example (discussed in the planning domain of the FPMS),13 does not adequately capture the teacher's plans or procedures, for the ways in which scripts and plans control the agent's actions are quite different; other aspects of dramatic presentations (or artistic presentations in general) differ in too many ways from teaching for a strict analogy to hold. But the evaluation of teaching has direct parallels to the evaluation or criticism of art and drama in general. The existence of a set of criteria for evaluation, the presence of a tradition to which the critic might appeal for assessment, the attention to what is there to be observed, are all part of what a critic of teaching must have as a stock in trade much as similar things would be found in the evaluation of any art form.
The FPMS criteria reflect no sense of the values inherent in teaching, of the rational limitations of teaching, of the great arguments that have gone on among the great theorists of education over the millennia. This exposes the greatest danger and weakness of the FPMS: It rejects the tradition of criticism of teaching, the critical standards one might expect to find in such cases. The fundamental notion the program epitomizes is that traditional standards of teaching are not adequate for evaluating teaching in modern schools. Modern criteria, it is assumed, must be grounded in scientific findings, and the criticism of teaching must be as scientific as the research that has led to the new criteria for evaluation.
But no hint of this as an argument is found in the literature, which leads us to suspect that it has not explicitly occurred to its authors or supporters. There is a remarkable lack of attention to historical traditions of teaching: no reference to Plato, to Aristotle's objections to Plato, to Augustine and Aquinas, to Comenius, Rousseau, Locke, Pestalozzi, or Dewey. It is as if teaching has just been invented by the FPMS as a mode of getting people to learn.
Teaching does not have to be reinvented in every age, not even in this one; it must, however, be constantly examined and argued over. For, like any other central aspect of human life, teaching changes its form, its content, its ways of being manifested, in every age. But a dialogue ought to be carried on among teachers, their principals, and the people who control the institutional setting of teaching; it should not be a set of directives handed down by science for the peons in the schools to carry out. The FPMS misses all the joy of teaching, all the intellectual honesty of the arguments over teaching, all the freedom and play, all the humanity of this as one of the central things in human life. What a pity. And what a travesty.