The Over-Testing of English Language Learners and Their Teachers
by Jessica Zacher - April 02, 2009
In this commentary I report on some results from three years of ethnographic research in Ms. Romano’s fourth-grade classroom (a pseudonym). Her school, a former Reading First site, serves a ninety percent Latino, high-poverty population, and it requires students to take approximately 35 tests per year. I was in her classroom because I wanted to understand what it was like to teach and be taught in a Reading First school. I found that the testing burdens Ms. Romano and her students shoulder are nearly overwhelming; they are all under siege by a barrage of assessments. I discuss the four main language and language arts tests students take, their links (or disconnects) with instruction, and share some ideas about how to ameliorate the situation in which she and her students find themselves.
Over the past three years, I conducted ethnographic research in Ms. Romano’s fourth-grade classroom (a pseudonym). Her school, a former Reading First site, serves a ninety percent Latino, high-poverty population, and it requires students to take approximately 35 tests per year. Ms. Romano often tells her students that “one of the reasons we take these tests every 6 weeks” (mandatory Reading First tests, which are discussed in more detail later) “is to see if we learned, because they’re giving us money.” In her attempts to explain the politics of testing, she always emphasizes her belief that her students are learning and can excel on tests. I was in her classroom because I wanted to understand what it was like to teach and be taught in a Reading First school. I found that the testing burdens Ms. Romano and her students shoulder are nearly overwhelming. She is a teacher under siege by a barrage of assessments.
As a former teacher who taught before No Child Left Behind was enacted (and hasn’t taught under it), I am sometimes awed by the sheer quantity of tests Ms. Romano gives. How does she do it? Why does she do it? The answers to these questions, which I will get to in a moment, are even more complex when we look closely at the tests involved. Consider the four major language and language arts tests Ms. Romano and her students have to juggle: the high-stakes California Standards Test (CST) and California English Language Development Test (CELDT), the aforementioned Reading First tests, and reading benchmark tests.
First, for the CST, Ms. Romano was supposed to tailor her teaching to her fourth-grade students’ end-of-third-grade CST scores, but because those scores were somewhat out of date on arrival, and more current tests (like reading benchmarks) existed, they seemed to have little to do with classroom instruction. The CST did affect instruction during test preparation time, though; then, she had to spend approximately one month getting her students ready to take their fourth-grade CST and give up other potential language arts activities. During this month, her comments to students were upbeat (she’d say things like, “Okay, we’re going to be as awesome as every other class in every other school”), but to me, she shared the pressures of raising students’ scores. Though the school’s scores had risen dramatically over the past five or so years, in the most recent year they had leveled off, and she and her colleagues felt intense pressure to make scores go up once more.
The second state-mandated assessment that affected Ms. Romano’s teaching was the California English Language Development Test of oral and literate English language skills. As with the CST, students enter the classroom with their prior year’s scores, and, despite any progress they might have made in the intervening months, they keep the label until the end of the following year. This time lag created confusion for Ms. Romano, who was expected to differentiate instruction according to each child’s English language development level. In addition, NCLB requirements pressured the district and teachers to reclassify English language learners as Fluent English Proficient speakers, an added pressure. While Ms. Romano didn’t have to administer the CELDT test, her students’ scores at the end of the year would (according to NCLB) show whether or not her teaching had helped their English fluency.
Third, we can look at the series of five tests required by the Reading First grant. Ms. Romano had to give them at the end of each of five Open Court units of instruction to measure students’ reading comprehension, spelling, and grammar skills based on what each Open Court unit had purportedly taught them. Students with below-grade-level reading abilities (at least 8 in her class) had trouble accessing the Open Court curriculum and tests, but she had to give the tests regardless of how well she thought students would perform. Not only did it take her an average of six hours to give each test (a loss of approximately one week of language arts instruction every six weeks), but Ms. Romano was required to input the answers into a database on her classroom computer linked to the computers of the school’s literacy coach, principal, and the district’s team of Reading First program coordinators.
Fourth, Ms. Romano had to administer at least two reading benchmark comprehension and fluency tests a year per child. Technically, students are benchmarked when the teacher thinks they will pass, and as a result, students in one classroom can have a variety of benchmark levels, from as early as end of 1st grade to end of 4th grade. There are two tests per grade level, linked to the state’s language arts standards. Students can be retained in retention years (1, 3, and 5, for different subjects) if they do not pass a certain benchmark by a certain time, which makes this a very high-stakes test indeed. Ms. Romano often stopped or modified her lessons to give the half-hour-long tests, or sent students to a helper to have them tested; in either situation, students missed lessons while Ms. Romano conducted mandated assessments.
Per the principal’s mandate, students’ reading benchmark scores were publicly known. They were called out in class, posted on a poster in the classroom, and posted in the school hallways. Ms. Romano tried to ameliorate the potential negative effects of this publicity by charting students’ prior progress in green and current successes in red on the classroom poster, but she could not alter the hallway postings or single-handedly change the culture of accountability at her school. Most of her students had never known any other system, and I rarely saw them complain. Ms. Romano, who had also only known this system in her seven-year career, did complain about rising accountability pressures, but she soldiered on.
The most egregious problem, from a teacher’s perspective, is that these different tests provide conflicting data. For example, Ms. Romano had a student named José come into her fourth grade with very low scores on his CST and reading benchmarks (“Far Below Basic” and middle of grade 1, respectively); he was, after three years in the U.S., still a beginning English language learner on the CELDT test. In Ms. Romano’s fourth-grade class, José made two years of progress on his benchmarks, with her considerable help, and moved up two levels on the CST. However, his scores on the Open Court unit tests were 6-4-6 (out of 10) for the last three tests; in other words, he showed little comprehension of the Open Court materials written for fourth graders, though he had made significant progress in reading comprehension on other measures.
Then there’s Carlos, who came into Ms. Romano’s class with a “Proficient” score on his CST (the middle level, passing), yet a half-year behind on his reading benchmarks, at the middle of grade 3. After five years in the U.S., his CELDT score was intermediate. Carlos was the student most likely to explain test-taking strategies to his peers (“that’s how you pass the benchmark” was his trademark phrase), and in one year with Ms. Romano, he moved from middle of grade 3 to middle of grade 5 on his benchmarks. In addition to this success, his last three Open Court unit test scores were 8-8-7. Paradoxically, Carlos’s CST score remained “Proficient” at the end of the year; he showed very little growth by that measure.
What is a teacher supposed to make of these mixed results? It is difficult for Ms. Romano to know which assessments are to be trusted: those that measure a student’s uptake of ideas from instructional materials (Open Court unit tests), those that show progress towards meeting state standards (reading benchmarks), or those that norm students’ skills statewide (the CST). The first two kinds of tests may inform instruction, but the CST is the one that really counts for improving a school’s AYP and API scores.
As I talked with Ms. Romano and her students, I often wondered if I would have loved teaching as much now as I did in the 1990s, when, as a full-day kindergarten teacher, the only test I gave was the Brigance kindergarten readiness screening test. One thing that makes Ms. Romano different from me is that she got her first teaching position, at this school, the year the district adopted Open Court/Reading First. In other words, she has always taught under the pressures of ongoing, high-stakes testing. The professional development she receives in topics like guided reading and differentiating instruction is almost more trouble than it’s worth, because she has built up a storehouse of proven ideas but has little time or freedom to enact them. For instance, she is convinced that formative assessments would help her better tailor her instruction to suit her students’ needs but has found it hard to actually implement them in lieu of, or even in addition to, the required summative assessments. She continues to try to convince her principal of this, though with little success, because the principal is overburdened with high-stakes accountability requirements.
For Ms. Romano, and other newer teachers facing similar assessment burdens, the question you might ask is not “How and why does she continue?” She wishes she had more actual teaching time and fewer tests, but she won’t give up; she has faith in her students and their abilities to succeed despite this system. Instead, we should ask ourselves, “How can we help change these circumstances?” We should argue to policymakers at all levels that teacher time spent administering, grading, and interpreting conflicting tests robs children of authentic learning opportunities. Testing itself isn’t going anywhere, but we can test smarter, instead of more often. In the process of rethinking the kinds of tests we actually need, we may recover children like José and Carlos from this morass of tests. We may also help Ms. Romano recover what is left of her postgraduate training in the use of formative assessments linked directly to instruction, and, in the process, help her and teachers like her find space in which to exercise their professional judgment. For now, there are simply too many tests, for both English language learners and their teachers.