Social Epistemology and the Pragmatics of Assessment

by Kenneth J. Gergen & Ezekiel J. Dixon-Román - 2014

Background/Context: The assessment of students, along with teachers and school systems, has largely taken place within a context of positivist science. An enormous range of scholarship now challenges the positivist paradigm, offering a social epistemological alternative. This alternative invites a re-examination of assessment processes and their policy implications.

Purpose/Objective: After sketching out the social constructionist alternative to positivist epistemology, the research centered on the pragmatics of existing assessment practices, including an analysis of who is helped or harmed by such practices.

Setting: The research extended across a wide range of contemporary educational settings.

Research Design: The research was primarily analytic, drawing from wide-ranging sources in education and allied disciplines.

Conclusions/Recommendations: Among the general outcomes of current assessment practices are the fostering of social division and distrust, the creation of hierarchies of worth, and the diminution of pluralism. Within educational systems we find a sacrifice of curriculum and pedagogy for the production of higher test scores, and the diminution of teacher motivation and engagement. Within communities, there is a disregard for local needs and values, a loss in student motivation, and an increase in family tensions. Possible alternatives to current testing practices, along with recommendations for future policies, are considered.

Educational assessment has come to serve as a critical ingredient in the educational process. From the evaluation of individual student performance to the evaluation of teachers, school systems, and indeed the entire nation's educational system, assessment practices are critical.
They determine not only the personal outcomes for students, teachers, principals, and the families they serve, but also schools, educational curricula and policies, and economic decisions. Continuing deliberation on assessment theory and practice is essential. In an earlier paper (Dixon-Román & Gergen, 2013) we drew critical attention to problematic aspects of measurement theory and practice. We pointed to the increasing challenges to the adequacy and relevance of "one-size fits all measurement" and the absence of any foundational logics to support existing practices. In the present offering, we expand on our critique of the empiricist logic underlying current practices of assessment. As we will explore, while the empiricist project remains dominant in education today, its foundations are essentially defunct. In its place a social epistemology has emerged, a view of knowledge that recognizes the contributions of science in terms of pragmatics as opposed to truth telling. In this context we turn to practices of educational assessment. If evaluations are not telling us the truth, but function pragmatically within educational systems and society, then it is imperative that we examine their practical consequences. Here we find major problems inhering in current assessment practices, the magnitude of which places strong demands on developing alternatives. We complete the paper with preliminary suggestions relevant to such developments.

FROM EMPIRICAL FOUNDATIONS TO SOCIAL EPISTEMOLOGY

During the past several decades of scholarly inquiry, an enormous and far-reaching transformation has taken place in the concept of knowledge, and the attendant concepts of truth, objectivity, and validity. Briefly put, the transformation can be traced in its earliest phases to a number of insoluble conceptual problems inherent in attempts to establish rational foundations for scientific knowledge, and most notably, empiricist accounts of such knowledge.
Included among the insoluble problems are the challenge of matching words to world (Quine, 1960), accounting for the origin of theory (Popper, 1959), and sustaining falsification in light of the infinite plasticity of theory (Duhem, 1914/1954). However, inspired in part by the growing critiques of the dominant orders (government, commerce, law, and military) and the resulting oppression and bloodshed of the past century, a new wave of critical work began to emerge. From this work also emerged an alternative to what Robert Mislevy (1997) now characterizes as a discarded epistemology of knowledge. Three of these critical movements bear particular attention:

IDEOLOGICAL ANALYSIS

Central to the positivist/empiricist movement is the view that empirically grounded descriptions of the world carry no ideological biases. As proposed, properly supported scientific accounts of the world do not reflect the values, moral prescriptions, or religious beliefs of any particular group. This view met an early challenge from Marxist theorists arguing that capitalist economic theory, despite all the research and analysis in its support, was essentially a mystifying means of fortifying the existing class structure. As later proposed in Jürgen Habermas's (1971) influential volume Knowledge and Human Interests, all knowledge seeking privileges certain interests over others, favoring a certain political and economic configuration to the detriment of alternatives. Or more broadly put, scientific descriptions are not mirrors of the world; based on one's particular interests, certain accounts are preferred over others. The scientist essentially observes from a particular perspective or point of view, and this perspective is never value free. Such critique gained additional depth as scholars began to study the rhetoric of scientific accounts (see, for example, Gross, 1996).
One could begin to see how social science terms such as conformity, prejudice, obedience, aggression, altruism, development, mental illness, and intelligence were saturated with value, and how such values would color not only the interpretation of findings, but also the way in which such findings were presented to and used by the public (Gergen, 1973). As scholars such as Emily Martin (1987) began to demonstrate, such colorings were not simply a problem for social sciences, but permeated the natural sciences as well. This early critical work subsequently unleashed a broad and continuing critique of scientific and scholarly accounts in terms of their subtle biases in matters of gender, race, economic class, religion, culture, and more. Whose voices, these critics continue to ask, are being silenced, exploited, or erased? Many critics found their work galvanized by the writings of Michel Foucault (1978, 1980). As Foucault argued, when authoritative claims to knowledge are circulated through the society, they act as invitations to believe. As people embrace these claims, so do they come to act in ways that support them. Or, in Foucault's terms, claims to knowledge function to build and sustain structures of power. Thus, for example, when an authoritative group singles out certain behaviors and calls them indicators of reasoning ability, develops measures that claim to be valid indicators of reasoning, and uses these to grant privileges to certain people and not others, it sustains a position of power in society. More broadly, these critiques raise questions regarding the ideological and social outcomes of all forms of institutionalized assessment (see also McNamara & Roever, 2006).

LINGUISTIC AND LITERARY THEORY

A second major challenge to the empiricist account of knowledge emerged from linguistic and literary theory. To appreciate what is at stake it is useful to consider the early work of Swiss linguist Ferdinand de Saussure (1857–1913).
In his influential volume A Course in General Linguistics, Saussure laid out the rationale for what became the discipline of semiotics. The field of semiotics was conceived as a science of signs, that is, a science focused on the systems by which we communicate. Two of Saussure's ideas are particularly important to the critique that ensued. In contemporary terms, Saussure first proposed that the relationship between words and their referents is ultimately arbitrary. In the simplest case, each of us is assigned a name; this assignment is useful in sustaining a social convention. Yet, there is no inherent reason we could not have been given other names (or no name at all). Saussure's second significant proposal was that words function within rule-governed systems of usage. Put simply, our language (as a sign system) can be described in terms of various rules, such as rules of grammar or syntax. Such views led to a long line of traditional research on grammar, phonemes, syntax, the history of language, and the like. However, the plot thickens when we consider that the empiricist concepts of accuracy, objectivity, and truth all depend on the assumption that certain words correspond to what is the case. On this view, certain utterances are truth-bearing, while others are exaggerated or untrue. If, however, the relationship between words and world is ultimately arbitrary, then in principle any utterance could be used to represent any state of affairs. What privileges any particular arrangement of words as being "true" is simply social convention. In terms of observations, it is no more true to say that objects are propelled to earth by the force of gravity than to say that they are thrust downward by God's will. Thus, when claims are made to "truth," "objectivity," or "accuracy" in reporting, we are only being exposed to "one way of putting things," a way that is privileged by certain groups of people.
When further extended, there were equally unsettling implications of the assumption that language functions as a system in itself. If language use is determined by a logic of its own, then reports on the nature of the world will necessarily be driven by this logic. Or more broadly, whatever one's observations, they will be subservient to the demands of the logic of representation. This line of thinking subsequently led to substantial scholarly study of the ways in which scientific accounts are governed by linguistic devices such as metaphor (e.g., Leary, 1990) and narrative (e.g., Genette, 1980). In the latter case, for example, evolutionary theory is only intelligible by virtue of its drawing from narrative traditions of storytelling (Landau, 1993). Such work has been further innervated by the works of French theorist Jacques Derrida, and particularly his writings on linguistic deconstruction (Derrida, 1976). Although highly complex, many of his influential ideas are relevant to the present critiques of empiricist epistemology. To touch on only one, Derrida pointed out that language meaning depends on a system of differences, or binaries. That is, the meaning of a word depends on a simple split between "the word" and "not the word." The meaning of white thus depends on differentiating it from what is nonwhite (e.g., black). Word meaning depends, then, on differentiating between a presence and an absence, that which is designated by the word against what is not designated. To give an account of the world is thus to speak in terms of presences, what is designated, against a backdrop of absences. In effect, the presences are privileged; they are brought into focus by the words themselves; the absences may only be there by implication or suppressed altogether. Thus, for example, to say that the world is made up of material entities is only sensible by contrast with a hidden binary of the nonmaterial.
Without admitting nonmateriality, the word material loses all meaning. In this sense, science depends for its sense on a suppressed absence. And yet, without admitting a referent for nonmateriality, the declaration that the world is made up of material stands empty.

THE SOCIAL CONSTRUCTION OF SCIENTIFIC KNOWLEDGE

These preceding critiques, emerging in quite separate domains of scholarship, come to a head in the third and perhaps most essential contribution to a viable replacement for empiricist epistemology. The movement here is essentially toward a social epistemology. Its origins may be traced to Mannheim's 1929 volume, Ideology and Utopia. As Mannheim proposed, (1) the scientist's theoretical commitments may usefully be traced to social (as opposed to empirical) origins; (2) scientific groups are often organized around certain theories; (3) theoretical disagreements are therefore issues of group conflict; and (4) what we assume to be scientific knowledge is therefore a byproduct of a social process. This seminal work was followed by a substantial number of influential contributions, including Ludwig Fleck's (1935/1979) Genesis and Development of a Scientific Fact, Peter Winch's (1946) The Idea of a Social Science, George Gurvitch's The Social Frameworks of Knowledge, and Berger and Luckmann's The Social Construction of Reality. However, in terms of the evolution of ideas, the landmark volume is Thomas Kuhn's (1970) The Structure of Scientific Revolutions. Most importantly, this work represented a frontal challenge to the longstanding presumption that scientific knowledge is progressive, that with continued research (testing hypotheses against reality) we come ever closer to the truth. Thus, proposed Kuhn, the shift from a Ptolemaic to a Copernican account of the relation of the earth to the sun is not progress toward truth; nor is the shift from Newtonian theory to quantum mechanics in physics.
Rather, Kuhn proposed, our propositions about the world are embedded within paradigms, roughly networks of interrelated commitments (to a particular theory, conception of a subject matter, methodological practices, and the like). Thus, even our most exacting measurements are only sensible from within the paradigm. A look into a microscope tells you nothing unless you are already informed about the nature of the instrument and what you are supposed to be looking at. What we call progress in the above cases of astronomy and physics is not then movement from a less to a more objectively accurate account of the world. Rather, these cases represent shifts in paradigm, different ways of thinking and observing. In recent decades this social view of science has been buttressed by an enormous body of scholarship centered on the cultural and historical contingency of scientific knowledge. As broadly acknowledged, the philosophical search for foundations of empirical knowledge is now moribund. Rather, summarizing the three critical waves outlined above, we find that scientific knowledge is a byproduct of negotiated agreements among people concerning the nature of the world. Whatever exists makes no fundamental requirements regarding our attempts to describe and explain. But, once we have entered into a particular tradition of understanding, as represented in a shared language, this tradition will provide both direction for and limits on our explanations, descriptions, and observations. Further, all such traditions will be wedded to particular ways of life, which is to say, they will carry certain implicit or explicit values or desired goals. This social constructionist conception of science is not at all fatal to the empirical tradition. Rather, it simply removes the foundations for such a tradition and considers it as one possibility among others. Thus, the primary questions to be asked of any scientific approach are first pragmatic and second valuational.
That is, what is the utility of a given tradition of knowledge, and for whom are these outcomes valuable, or not?

ASSESSMENT AS SOCIAL CONSTRUCTION

In light of the transformation from an empiricist to a constructionist view of knowledge, how are we to understand the practice of educational testing? First, it is important to reiterate a major theme in Part I, namely the lodgment of testing practices in the empiricist paradigm. Within the paradigm the major questions are typically focused on the assessment devices themselves (e.g., validity, reliability, scaling), along with issues of sampling, random errors, and so on. However, from a constructionist standpoint, such questions are highly limited. That is, they issue from the assumptions shared within a particular enclave. And, while there is nothing problematic about such issues within the enclave, a constructionist view opens the door to the full range of stakeholders concerned with education. For the constructionist, the voice of traditional science is simply one voice among many, useful for certain purposes, but not for all. What stands as objective assessment for one may be constructed as prejudice in action for another. Thus, in terms of pragmatics, we must ask about the utility of testing, not only within the enclaves of science and government but also across the spectrum of the population, and most especially within those groups affected by tests and the policies they support. And in terms of values, we must not only ask about the values shared by the scientific and policy-making groups, but also be prepared to absorb the voices of multiple groups across the nation. There is no doubt that traditional measurement practices have been useful for certain groups in terms of providing a vantage point for deliberating about educational standards and policies. More debatable is whether such tests have been successful in rendering the educational system effective in attaining its goals (see, for example, Ravitch, 2010).
When we open the door to the voices of other stakeholders, we find ample room for concern with the outcomes of our practices. In this context, we wish to give voice to a range of existing critiques. Some of these critiques are well known; others are harbored within significant subcultures. However, a broad scan will be useful as a preliminary to a subsequent discussion of alternatives to contemporary testing practices. For purposes of simplification, we focus on three realms of critique, concerned with the impact of testing on cultural ideology, societal structure, and educational practice.

ASSESSMENT AND IDEOLOGY

Practices of testing are far more than technologies for providing information. Rather, they carry with them a range of implicit values or ideologies. Through the broad institutionalizing of testing procedures, there is a shaping of cultural values. Practices of assessment do not so much reflect the nature of the individual as they construct the individual in their terms, and thus shape the cultural milieu. In this light, it is important to consider some of the ideological values that are fortified by many testing practices:

Neoliberalism

Although the term neoliberalism has been used in many ways, one prominent usage denotes an ideological orientation to cultural life in which the metaphor of "the market" is dominant. Thus, a neoliberal view of education emphasizes market-based logics for evaluating the productivity of the educational system. In these terms, the prominent question to be asked about educational systems is whether the investments in the systems are paying off in terms of specifiable outcomes. Measuring devices are thus required to assess these outcomes in a systematic way. This logic is, of course, the dominant driver in national testing of students (see also Porter, 1996; Scott, 1999).
It is this logic, critics propose, that now travels across cultural life, reconstructing relations of trust, care, and nurturance in terms of costs and benefits. People lose intrinsic worth, as do institutions, traditions, rituals, the arts, and so on, as some form of price is attached to each. Many view this orientation as instrumentalist. One begins to think of others in terms of how they can enhance or reduce one's outcomes. Do they help or hurt, are they convenient or frustrating? In important respects, institutional reliance on testing contributes to just such an orientation. Test scores essentially reduce the student to a set of numbers, used by others to make decisions that will shape the student's behavior to their standards. This logic moves from high levels of policy making into the classroom, where teachers learn that whatever their students' particular characteristics, and regardless of their caring relations for them, the students are ultimately test performers whose scores may vitally affect the teachers' own careers. Rather than helping or assisting students for intrinsic reasons, teachers find themselves using students to protect or help themselves. Students become instruments for the teacher's well-being. The marketplace mentality expands its reach.

Individualism

Traditional testing practices are altogether focused on the performance of individuals, chiefly the individual student, but through the agglomeration of test scores, the behavior of individual teachers and school administrators is also evaluated. In part, this focus reinforces the presumption of psychological causes of individual behavior, as described in Part I. More broadly, critics find this focus sustains and supports the longstanding Western value of individualism, prizing, as it does, the autonomy of the individual. There is much to be said about the contribution of this ideology to the development of Western cultural life.
However, within recent decades there has been growing concern with the detrimental implications of individualist ideology for cultural life (see, for example, Bellah, Madsen, Sullivan, Swidler, & Tipton, 1985; Gergen, 2009). At the heart of many such critiques is the divisive impact of individualism, the way in which people come to understand themselves as fundamentally separated from each other. Favored by this separation are tendencies toward selfishness, callousness, and a general suspicion of bonded relationships (e.g., marriage, family, friendships, community). The chief aim of life, in this tradition, becomes one of self-maintenance, with a maximization of self-gain and a minimization of personal discomfort. This focus on the individual is all the more significant in a world in which technologies of communication bring people together to an unprecedented degree, thus placing an increasing demand on capacities for collaboration. The "self-made man" is no longer functional, as complex decisions now require the participation of many voices, many vantage points. As social theorists point out, the shift toward collaborative process also favors shifting from accounts of society that focus on individual units (e.g., the person, the institution) to visions of relational process (e.g., systems, networks, confluences, aggregates, synergies). In this light, testing the performance of an individual student is myopic. Such performance will surely depend in part on the quality of the teaching. But if the student is hungry, resides in a chaotic home environment, and participates in a peer group in which schooling is devalued, his or her performance will also be affected. Or, more systemically, the student's performance is the outcome of a confluence of processes, both within and outside the classroom. Test scores are thus presumed to reflect an individual characteristic, when it is far more reasonable to view them as artifacts of system functioning.
Attention should properly be directed to the larger processes from which failed performances are generated. Ultimately these could include issues of economic and social inequality.

ASSESSMENT AND SOCIETAL WELL-BEING

Although ideology subtly insinuates itself into cultural life, there are various ways in which testing more directly affects patterns of behavior. As Dahler-Larsen (2011) argues, testing practices not only reflect societal beliefs and values, but once set in motion, their effects are reverberating. Related practices of many kinds are affected. In this context it is essential to take account of what many see as problematic "side-effects" of current practices.

Social Division and Distrust

Assessment practices effectively establish a structure of power, with those who administer tests positioned to affect the outcomes of those under evaluation. This four-tiered structure, placing government over school administrators, over teachers, over students, also defines their relationship. Those in the lower echelons will fear those above, and those above will be suspicious of all below. The insertion of testing procedures into this mix fortifies the barriers between these groups, intensifying both distrust and antagonism. This is most obvious in the case of the teacher administering a test to students. The test itself positions the student as "in question," or in effect, constructs the student as one whose worth is not yet established. Yet, when students fail on tests, the teacher's capabilities are now thrown into question. Teachers too become aliens to be judged by school administrators. In turn, with failing schools, the fate of school administrators becomes an issue for judgment. At this point, distrust becomes endemic to the entire system. Ironically, while tests are used in order to generate trust that the system is working properly, the result is more often the reverse.
As students, teachers, and school systems are placed in jeopardy by the tests, the possibility for cheating becomes attractive. This is especially so, as the power structure has already defined relationships in terms of alienation. Thus, as evaluators have found, cheating on tests is widely practiced. Multiple instances of cheating further feed the fires of suspicion. As a case in point, an audit of Pennsylvania state exams recently flagged 38 school districts and 10 charter schools for possible cheating. Following such findings, the governor then proposed a 43% increase in funding for educational assessment. School funding in general remained at its current level. We thus have a spiral in which tests generate the rationale for cheating, which then provides the grounds for more formidable means of testing.

Hierarchies of Worth

From a constructionist perspective, the properties or characteristics attributed to human beings are the outcomes of historically and culturally contingent negotiation. Thus, people have variously been defined as witches, possessing purified souls, or harboring a bipolar disorder, depending on history and culture. Further, these attributions typically reflect group interests or values, with a positive value placed on a purified soul, for example, and a negative value on witchery and mental illness. In this context, critics are concerned with the way in which tests function in the classroom and society to create hierarchies of value, such as distinguishing between the abled and disabled (Dixon-Román, 2010; Gutiérrez & Dixon-Román, 2011). By their mere participation in the school system, students come to learn that they are able or disabled in ways they never imagined. These evaluations will color the way in which they are treated by their teachers and influence, as well, their relations with peers. Some students learn to view themselves as failures and may fear voicing an opinion or indeed, attending school altogether.
Others will come to see themselves as superior to their peers, which further invites distancing and disdain.

Pluralism in Peril

Much has been said in recent decades concerning the pluralist makeup of the society and the necessity of giving voice to the many vibrant minorities making up the whole. Such celebration of pluralism is consistent with the democratic ideals of the nation. Yet, the testing tradition in the United States carries vestiges of an earlier tradition of cultural univocality. Those determining what is to be mastered by students and how this mastery is to be reinforced through standardized tests represent the values and assumptions common to a once-dominant culture. Minority concerns and values are seldom reflected in the design and application of many tests. For various minorities, for example, schools might optimally provide safety and support for their offspring, character training, or practical skills. The significant shift toward home schooling is only one indicator of these specialized concerns for education. In the worst case, one might say that standardized testing, subjecting all to a universal standard, functions to obliterate minority differences. At a minimum, current testing practices function oppositionally to the democratic ideal of a pluralist engagement in building the society's future.

The Loss of the Local

In the same way that standardized testing is unresponsive to the multiple ethnic, religious, and racial enclaves in society, so it is oblivious to the local particularities of school systems. School systems in upper middle class suburbs are treated on a par with schools in lower class urban settings, Hispanic communities in the rural Southwest, and so on. Yet, in each of these settings the educational needs may be quite specific. In one case the investments are in preparing students for college, in another, merely finishing high school safely, and in another, fitting effectively into one's multi-ethnic community, and so on.
In some communities there may be companies seeking graduates with particular kinds of skills, and in others there may be no employment opportunities. For many, the issue of workplace skills is especially important. Here, critics point out that traditional testing practices typically stress knowledge within structured systems (e.g., mathematics, grammar). Yet, it is argued, most decisions in the workplace now depend on one's ability to work with others in highly complex, ambiguous, and ever-changing circumstances. Traditional tests are largely irrelevant to such skills. In effect, the diverse needs and aims of students, teachers, schools, and communities are shoved to the margins by traditional testing practices.

ASSESSMENT AND THE EDUCATIONAL SYSTEM

We finally turn our attention to the effects of testing on the educational process. As pointed out, the extent to which traditional testing has improved the educational process remains in question. However, when we expand the range of voices beyond those concerned primarily with improving test scores, what may be said about the impact of assessment procedures on education? We touch here on four outcomes.

Curriculum and Pedagogy

One of the most widely voiced critiques of institutionalized testing comes from the teaching community. When student test scores become matters of public evaluation, teachers' own capabilities are also implicated. As previously noted, poor student performance can suggest, among many other things, poor teaching performance. And, when one's career is thus placed in jeopardy, it is natural to focus on means of boosting student test scores. For many teachers this means adjusting both curriculum and teaching practices. For example, the teacher may sacrifice concerns with the interest value and local applicability of the materials to be taught and narrow the curriculum to precisely those materials that are relevant to the test.
Rather than fostering class discussion or developing group projects, the teacher focuses on improving students' capabilities in taking tests. As a study by the Center for Education Policy (2005) revealed, demands for accountability generated a major reduction in the time teachers devoted to science, social studies, art, music, and physical education. Or, as commonly put, quality education is sacrificed by "teaching to the test."

Teacher Motivation and Engagement

Teachers often enter the profession because they are drawn to the educational process. They enjoy working with the young and relating to them in ways that are mutually nourishing. Further, they develop ways of teaching they find enjoyable and effective in their particular classroom. As the character of the student body changes over time and new technologies become available, many teachers also like the challenge of developing new teaching practices. Teacher motivation and engagement in the process importantly depend on the teacher's freedom to work creatively within the immediate context. In this sense, the imposition of extraneous requirements placed upon the teacher runs the risk of reducing motivation and engagement. We have already mentioned the ways in which testing requirements invite an instrumental and distrustful orientation toward students. And, when nourishing forms of practice must be sacrificed in order to boost test scores, the teacher is reduced to a "machine part" within a system in which they have very little voice and constrained pedagogic agency.

Student Motivation and Engagement

Self-definition is importantly wedded to social process. Thus, how students come to understand themselves (their worth, abilities, and potentials) can be vitally affected by their school experiences. It is here that a number of preceding discussions come into sharp focus.
As Bandura (1994) has put it, "Educational practices should be gauged not only by the skills and knowledge they impart for present use but also by what they do to children's beliefs about their capabilities, which affects how they approach the future" (p. 76). With the insinuation of testing systems into the educational process, children rapidly begin to see themselves in terms of evaluative hierarchies. A small percentage will come to see themselves as superior to others; the majority will come to view themselves as "just average," and by virtue of scaling requirements, a number will come to see themselves as "inferior." And, as commonly demonstrated, children from minority or impoverished circumstances, along with children for whom English is a second language, are more likely to find themselves defined as undesirables. These negative effects are all the more accentuated by what becomes the instrumentalist orientation of teachers for whom low student scores are a threat to their careers. Closely related to this critique, educators have long been critical of the early tradition of learning through intimidation, whether by the old-fashioned threat of a hickory stick or the more contemporary threat of failure and derision. As many believe, far better outcomes are obtained by tapping into intrinsic motives and collaborative processes. Even in the earliest years, students enter schools with interests and enthusiasms. It is when the school can tap into these proclivities and unite them with its educational aims that student engagement is maximal. Yet, as many see it, the imposition of large-scale high-outcome testing functions regressively. It redefines the classroom as a context of threat and reprisal. An abiding concern with high test performance subverts those educational practices not directly tied to such performance, namely those that are student-centered.
The capacity of the curriculum and the pedagogical practices to absorb or resonate with student interests and motivation is eroded. In effect, teaching to the test comes at a cost of student engagement in the educational process. Schools and teachers become irrelevant and alien. In this context, the otherwise startling dropout rates in inner-city schools should come as little surprise.

Parent-Child Relations

One may properly surmise that the instrumentalist orientation fostered by the use of tests in the school also carries over to family relations. That is, parents' definitions of their children are informed by their relations with the schools. They come to define their children in terms of the educational system and to see their children in terms of institutional categories. Among the most obvious cases are those in which they might be surprised to find that their otherwise happy child is mentally ill with ADHD. More subtly, however, they learn the evaluative hierarchies implied by testing practices and come to judge their children in these terms. In all but a minority of cases they will learn that their children are less than prized. Often parents are enlisted by the teachers as a "support staff" in an effort to boost test scores, thus importing the force of negative judgment into the home, and expanding further the domain of alienation.

ALTERNATIVES TO TRADITIONAL ASSESSMENT PRACTICES

In many respects, the logic of traditional large-scale testing practices is compelling. In a democratic nation we are committed to education for all citizens. We believe that the future well-being of individuals, communities, and the nation as a whole depends on high quality and efficacious educational institutions. While the economic investments we make in education are less than we dedicate to law enforcement and prisons, we do wish to assess the outcomes of our investments. Are these investments truly fulfilling the goals and aims of the nation?
Are there groups of young people left behind by our current practices, and are schools succeeding in the task of educating? In search of standards-based education, we turn to scientifically based testing to provide the answers and to serve as the basis of educational policy and action. Yet, as we have described, the way in which these logics have been placed into practice is perilous. It is not simply that there is no compelling evidence that testing actually contributes to the valued goals. Rather, if we take into account the realities of the multiple stakeholders involved, we find that in many respects our current approach is counterproductive. From within the enclaves of government the mandate for testing is clear enough; for the scientific researcher the logics of test design and application are well understood. However, for those outside these enclaves, current policies and practices are found oppressive and destructive. To summarize our views, at the ideological level our current policies and practices transform human relations into a marketplace; for others, they foster an alienating individualism and invite an instrumentalist orientation to relationships. In terms of societal well-being, otherwise solid testing practices invite distrust and alienation, create divisive hierarchies of worth, undermine pluralistic values, and erode the capacities for local self-direction. In the local school setting, teachers find that standards-based accountability means a narrowing of curriculum and pedagogical practices and a loss of motivation. Students find their self-worth placed in question and little in school that excites their interest. Parents find their nurturing orientation toward their children replaced by a performance-based evaluative posture. Although one could argue for adjusting current policies and practices (cf.
Resnick & Zurawsky, 2005), the emerging world conditions invite us to open fully the range of possibilities for achieving commonly shared goals of education. In this context it is useful to consider innovative initiatives. Although there are many such initiatives extant, we outline here several that are illuminating in their avoidance of hierarchical structures. Each favors pluralism, localism, democratic process, qualitative understanding, and processes of improvement over product. Rather than striving for summative statements about existing conditions, they focus on generating more flourishing futures (Dinesen, 2009).

EMPOWERMENT EVALUATION

Perhaps the most widely known assessment alternative is empowerment evaluation. As developed by David Fetterman (2000; Fetterman & Wandersman, 2004, 2007), the attempt is to shift the site of evaluation from the distant overseer to the local participants. And, rather than simply assessing students and teachers, the attempt is to enable the local community to become self-directing, to deliberate on its activities, set goals for itself, and take necessary actions. Empowerment evaluators, in this case, serve as coaches or facilitators, helping local communities to build ongoing practices of self-evaluation. Such practices may vary considerably from one community to another, but a typical empowerment procedure might take place in several stages, including (1) a process of taking stock in which all members of the school community participate in identifying educational goals and surveying available resources, (2) broad and inclusive discussion of the ideal or dream school to which they might aspire, (3) the establishment of cadres that develop specific plans for moving toward the ideal, and (4) the setting up of administrative committees that implement change. The various groups engaged in implementation also establish yardsticks by which they can evaluate their progress.
Over time educational communities are enabled to chart their future, evaluate their progress, and alter plans and programs on a continuing basis. This is not to say that outside testing procedures are precluded. Rather, standardized tests can provide information helpful in judging local progress. Rather than dictating policy, test scores become adjuncts to local school development.

DIALOGIC EVALUATION

Seeking a more inclusive and democratic orientation to evaluation, a large number of educational specialists have made strides in developing processes of dialogic evaluation (Ryan & DeStefano, 2000; Schwandt, 2005). In brief, dialogic evaluation practices tend to emphasize egalitarian dialogue, equality and justice, multicultural intelligences, dialogic learning, and qualitative analysis as opposed to quantification, all with an eye to broad-scale social transformation. As Greene (2001) has put it, dialogic evaluation refers to "engaged, inclusive and respectful interactions among evaluation stakeholders about their respective stances and values, perspectives and experiences, dreams and hopes, and interpretations of gathered data related to the value and its context" (p. 182). Dialogic evaluation practices have been explored more visibly in the European context than the American. For example, the English National Board for Nursing, Midwifery and Health Visiting emphasizes the use of context-based dialogue, centering on competence in practice, along with portfolios that can be used to reflect, envision, and plan. Relevant work in the European Union has stressed the need for training practitioners in the use of dialogic evaluation. Much as in the case of empowerment evaluation, proper assessment is not simply a matter of reading percentile scores on national norms, but requires the weaving together of multiple voices, values, and potential indicators of efficacy (see also Cousins & Whitmore, 1998, on participatory evaluation).
APPRECIATIVE EVALUATION

Appreciative evaluation practices have emerged largely in the context of large-scale organizations and in program evaluation. However, their relevance to the educational sphere is clearly apparent. Appreciative evaluation practices are lodged within a social constructionist premise that we create our realities through dialogue. Thus, dialogues that center on problems (for example, the poor performance of students, teachers, or school systems) solidify the reality of the problems. And when fortified, this reality will lead to mutual blame, alienation, distrust, disrespect, lowered motivation, and more. Further, this focus is myopic in its search for a "solution to the problem" without taking into account the systemic processes in which it is embedded. The appreciative approach centers discussion on valued actions or performances, that is, what may be prized by the group (see Coghlan, Preskill, & Catsambas, 2003; Preskill & Catsambas, 2006). Narratives of positive performance are shared and collected, and dialogue then turns to the common values represented in these narratives. Reflection on these values then moves to the means of building futures in which these values would be most fully instantiated. What concrete steps would be useful and promising? Groups are then established to monitor progress toward the shared goals. Appreciative practices attempt to bring all stakeholders together in future building. In the educational sphere, this might involve the student, his or her teachers, parents, and possibly others. On a broader scale, the entire school system might participate in what is called an appreciative inquiry summit.

SOCIOCULTURAL/SITUATIVE ASSESSMENT

Within the past two decades new approaches to assessment have drawn from sociocultural and situative theories of learning and knowledge (Moss, Pullin, Gee, Haertel, & Young, 2008).
The sociocultural/situative perspective concerns itself with the affordances (i.e., perceived possibilities within the learning environment) and the learner's effectivities (i.e., the set of capacities for transforming an affordance into action) (Gee, 2008a). Learning and knowledge are not so much understood as inside the head of the learner as embodied in the relational actions and practices taking place in the learning environment. Assessment practices from a sociocultural/situative perspective not only evaluate the actions and practices based on the local community's understandings but also attempt to account for the affordances and effectivities of the learning environment. We see examples of this in Myford and Mislevy's (1996) Advanced Placement studio art portfolio assessment and Gee's (2008b) work on video games and assessment.

These four explorations into alternative ways of thinking about and practicing evaluation are salutary, but at the same time, are only exemplary of a movement of far greater scope. Increasingly educators, program evaluators, performance evaluators in organizations, and organizational change specialists, to name but a few, are turning to more egalitarian, reflective, dialogic, and both relational and context-sensitive forms of gauging qualities and change in human performance. The interested reader might wish to explore, for example, developments in teacher research (Fishman & McCarthy, 2000), educational action research (Noffke & Somekh, 2009), and collaborative management research (Shani, Mohrman, Pasmore, Stymne, & Adler, 2007).

ON THE FUTURE OF ASSESSMENT

The nations of the world offer a wide panoply of approaches to educational testing. One might contrast, for example, the stringent and high-stakes practices of testing in Japan and Korea with the low levels of standardized testing in Finland. By the same token, one may ask about the ways in which these differing practices function in society.
Interestingly, while the strong testing approaches taken in Japan and Korea are associated with high rates of adolescent suicide (along with laws to prevent excessive study), Finnish minimalism coincides with what is regarded to be the most effective educational system in the world. Further exploration of such differences would be useful in expanding deliberation on alternatives to present practices. Yet, as we proposed at the outset of our initial paper, technology-based shifts in global conditions appear to militate against current testing practices in the United States. The attempt to impose a single set of standards on an enormously varied domain of practices fails to recognize the rapidly shifting and endlessly complex landscape of educational desiderata. What is needed in the way of education (the knowledge and the skills required for productive and meaningful participation) is ever changing. To lock in a highly circumscribed set of ideals and to shape the entire educational system by these specifications is to reduce the potential for productive participation in the future. Further, these same technologies of communication also foster multiple affinity groups, enabling them to generate rationales, agendas, and plans of action. In effect, there is an expansion in the array of voices that demand to be heard, voices that are set against otherwise totalizing agendas. In our view, the way must be prepared for more democratic, inclusive, and dialogic contributions to the ways in which local systems of education function. Let us briefly summarize what we see as the major outcomes of our deliberations. In our view, we should move toward modes of evaluation that will support and enable the participation of all the nation's people in building effective educational systems and the flourishing of human capabilities more generally. Assessment should in no way drive these efforts, but should serve ancillary or supportive functions.
Our views support a human development approach to achieving national goals, over and against the reliance on the neoliberal gaze and the economic marketing of education. In what follows, we first attend to what we believe assessment should not be and then shift the focus to what, for us, assessment should become. In planning for the future, we should shift the emphasis away from:

- Employing assessment as an instrument for administrative sorting, selecting, or predicting;
- Assessing internal mental structures such as cognitive or affective knowledge, reasoning, or other mental abilities;
- Using assessment results to establish policies that differentially reward and/or penalize students, teachers, or school administrators; and
- Forming policies that require all educational systems to meet the same standards in terms of student performance.

To be clear, we are not arguing against active and continuing deliberations on the performance adequacy of students, teachers, administrators, or school systems in general. Such reflections can make vital contributions to educational success in rapidly changing and highly differentiated world conditions. Assessment instruments or tests can contribute to such deliberations in significant ways. Thus, we propose that such tests be employed in evaluation practices that:

- Are dialogic in nature and situated primarily in local knowledges and practices;
- Include multiple stakeholders, and importantly, those whose performance is under evaluation;
- Include multiple criteria, reflecting the needs and values of multiple stakeholders;
- Center on processes of continuous improvement of whole systems, including students, teachers, and surrounding communities; and
- Provide for the education of teachers, school administrators, and other relevant stakeholders in dialogue-centered practices of evaluation.

What could this mean for the future of testing and measurement instruments more generally?
In our view the above recommendations would not mean a diminishment, but rather, an enrichment and expansion of such services. With respect to testing, there is good reason to move toward the following:

- Making standardized tests available (as opposed to mandatory) to all educational institutions. Whether and how local school systems or districts employ test scores in their deliberations should be locally determined.
- Radically expanding the kinds of tests available to schools for evaluating students. For example, depending on locale, schools might wish to have tests that would enable them to benchmark students in terms of computer literacy, career fluency, civic and political participation, bilingual capacity, dialogic skills, environmental knowledge, musical aptitude, physical competence, health, and so on.
- Expanding the range of tests available to schools and school districts for evaluating their own development. For example, schools might varyingly wish to benchmark themselves in terms of parental participation, excellence as a learning community, internal collaboration, civic contribution, relationships with business and government, and the like.
- Offering educational services enabling local schools to generate effective practices of participatory evaluation.

In conclusion, if we take into account the increasing development of communication technologies and the resulting shifts in demands and opportunities, it is imperative to explore new ways of practicing evaluation. Along with Nussbaum (2011), we argue here for assessment in the service of creating capabilities as opposed to judging them.

References

Bandura, A. (1994). Self-efficacy. In V. S. Ramachaudran (Ed.), Encyclopedia of human behavior (Vol. 4, pp. 71-81). New York: Academic Press.
Bellah, R. N., Madsen, R., Sullivan, W. M., Swidler, A., & Tipton, S. M. (1985). Habits of the heart: Individuals and commitment in American life. Berkeley: University of California Press.
Center for Education Policy. (2005). From the capital to the classroom: Year 3 of the No Child Left Behind Act. Washington, DC: Center for Education Policy Report.
Coghlan, A. T., Preskill, H., & Catsambas, T. T. (2003). An overview of appreciative inquiry in evaluation. New Directions for Evaluation, 100(Winter).
Cousins, J. B., & Whitmore, E. (1998). Framing participatory evaluation. In E. Whitmore (Ed.), Understanding and practicing participatory evaluation. Hoboken: Jossey Bass.
Dahler-Larsen, P. (2011). The evaluation society. Palo Alto, CA: Stanford Business School Press.
Derrida, J. (1976). Of grammatology. Baltimore, MD: Johns Hopkins University Press.
Dinesen, M. S. (2009). Systemic appreciative evaluation: Developing quality instead of just measuring it. AI Practitioner, 11, 49-56.
Dixon-Román, E. (2010). Inheritance and an economy of difference: The importance of supplementary education. In L. Lin, E. W. Gordon, & H. Varenne (Eds.), Educating comprehensively: Varieties of educational experiences. Lewiston, NY: Edwin Mellen Press.
Dixon-Román, E., & Gergen, K. (2013). Epistemology and measurement: Paradigms and practices I: A critical perspective on the sciences of measurement. Retrieved from http://www.gordoncommission.org/rsc/pdfs/dixonroman_gergen_part1_epistemology_measurement.pdf
Duhem, P. (1954). The aim and structure of physical theory. Princeton: Princeton University Press. (Originally published in 1914)
Fetterman, D. M. (2000). Foundations of empowerment evaluation. Thousand Oaks, CA: Sage.
Fetterman, D. M., & Wandersman, A. (Eds.). (2004). Empowerment evaluation, principles in practice. New York: Guilford.
Fetterman, D. M., & Wandersman, A. (2007). Empowerment evaluation, yesterday, today, and tomorrow. American Journal of Evaluation, 28, 179-198.
Fishman, S., & McCarthy, L. (2000). Unplayed tapes: A personal history of teacher research. New York: Teachers College Press.
Fleck, L. (1979). Genesis and development of scientific fact. Chicago: University of Chicago Press. (Originally published in 1935)
Foucault, M. (1978). The history of sexuality, Vol. 1, an introduction. New York: Pantheon.
Foucault, M. (1980). Power/knowledge. New York: Pantheon.
Gee, J. P. (2008a). A sociocultural perspective on opportunity to learn. In P. A. Moss, D. C. Pullin, J. P. Gee, E. H. Haertel, & L. J. Young (Eds.), Assessment, equity, and opportunity to learn. New York: Cambridge University Press.
Gee, J. P. (2008b). Game-like learning: An example of situated learning and implications for opportunity to learn. In P. A. Moss, D. C. Pullin, J. P. Gee, E. H. Haertel, & L. J. Young (Eds.), Assessment, equity, and opportunity to learn. New York: Cambridge University Press.
Genette, G. (1980). Narrative discourse. Ithaca, NY: Cornell University Press.
Gergen, K. J. (1973). Social psychology as history. Journal of Personality and Social Psychology, 26, 309-320.
Gergen, K. J. (2009). Relational being: Beyond self and community. New York: Oxford University Press.
Greene, J. C. (2001). Dialogue in evaluation, A relational perspective. Evaluation, 7, 181-187.
Gross, A. (1996). The rhetoric of science. Cambridge: Harvard University Press.
Gurvitch, G. (1966). The social frameworks of knowledge. New York: Harper and Row.
Gutiérrez, R., & Dixon-Román, E. (2011). Beyond gap gazing: How can thinking about education comprehensively help us (re)envision mathematics education? In B. Atweh, M. Graven, W. Secada, & P. Valero (Eds.), Mapping equity and quality in mathematics education. New York: Springer.
Habermas, J. (1971). Knowledge and human interest. Boston: Beacon Press.
Kuhn, T. S. (1970). The structure of scientific revolutions (2nd ed.). Chicago: University of Chicago Press.
Landau, M. (1993). Narratives of human evolution. New Haven: Yale University Press.
Leary, D. (1990). Metaphors in the history of psychology. New York: Cambridge University Press.
Martin, E. (1987). The woman in the body: A cultural analysis of reproduction. Boston: Beacon Press.
McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Oxford: Blackwell.
Mislevy, R. J. (1997). Postmodern test theory. In A. Lesgold, M. J. Feuer, & A. M. Black (Eds.), Transitions in work and learning: Implications for assessment (pp. 280-299). Washington, DC: National Academies Press.
Moss, P. A., Pullin, D. C., Gee, J. P., Haertel, E. H., & Young, L. J. (2008). Assessment, equity, and opportunity to learn. New York: Cambridge University Press.
Myford, C. M., & Mislevy, R. J. (1996). Monitoring and improving a portfolio assessment system. CSE Technical Report 402. Los Angeles: National Center for Research on Evaluation, Standards, and Student Testing (CRESST).
Noffke, S., & Somekh, B. (Eds.). (2009). Sage handbook of educational action research. Thousand Oaks, CA: Sage.
Nussbaum, M. C. (2011). Creating capabilities: The human development approach. Cambridge, MA: The Belknap Press.
Popper, K. R. (1959). The logic of scientific discovery. London: Routledge.
Porter, T. M. (1996). Trust in numbers. Princeton: Princeton University Press.
Preskill, H., & Catsambas, T. T. (2006). Reframing evaluation through appreciative inquiry. Thousand Oaks, CA: Sage.
Quine, W. V. O. (1960). Word and object. Cambridge: MIT Press.
Ravitch, D. (2010). The death and life of the great American school system: How testing and choice are undermining education. New York: Basic Books.
Resnick, L., & Zurawsky, C. (2005). Standards-based reform and accountability: Getting back on course. American Educator, Spring, 1-13.
Ryan, K. E., & DeStefano, L. (Eds.). (2000). Evaluation as a democratic process: Promoting inclusion, dialogue, and deliberation. San Francisco: Jossey-Bass.
Schwandt, T. A. (2005). A diagnostic reading of scientifically based research for education. Educational Theory, 55, 285-305.
Scott, J. C. (1999). Seeing like a state: How certain schemes to improve the human condition have failed. New Haven: Yale University Press.
Shani, A. B., Mohrman, S. A., Pasmore, W. A., Stymne, B., & Adler, N. (Eds.). (2007). Handbook of collaborative management research. Thousand Oaks, CA: Sage.
Winch, P. (1946). The idea of a social science. London: Routledge, Kegan, Paul.