February 6, 2009

UTLA and "benchmark" or "periodic" testing

Last week, United Teachers Los Angeles (UTLA) called for the cessation of every-few-months testing in the district. The district's response: such testing is an important tool for improving student achievement, which it knows because schools that use such testing have had higher annual-test scores than schools that don't.

The flaw in the district's reasoning is left as an exercise for the reader, because I'm more concerned at the moment about what this debate shows about our attitudes towards assessment. UTLA is wrong to attack frequent testing on principle, though I think they may have a good point about this type of assessment. Such periodic assessments may help schools target assistance to students, or they may serve primarily to mimic the state test and encourage teaching to the test (the predictive success of which principals would know from results on the quarterly assessments). Without knowing more about the details, you can't say which is happening, and both phenomena are possible (including in the same school).

What concerns me is the direction in which the machinery of testing is taking formative evaluation. There's a lot of research to suggest that when used to guide instruction, frequent assessment can dramatically change results. There are a number of technical questions about so-called formative assessment (or progress monitoring) that are the domain of researchers in the area: how to create material sufficiently related to key skills or the curriculum, how to create assessments where score movement is both meaningful and sensitive to change, how to gauge appropriate change, how to structure the feedback given to teachers, and so forth. My reading of the literature (which is not complete) is that the most powerful uses of formative assessment require very frequent, very short assessments--on the order of once or twice a week, and about the same length as your typical elementary-school spelling test (i.e., a few minutes at most).

So what do we see as the evolving, bureaucratic version of formative assessment? Long tests taken every few months. That's better than once a year in terms of frequency, but it's still a blunt instrument and it absorbs a large chunk of time. The reason for this preference is obvious: a large, unwieldy school system can organize systematic evaluation and feedback around quarterly tests. That's doable. But organizing around something that's taken weekly and would often require data entry (e.g., a one-minute fluency score for first- and second-graders)? That's a different kettle of fish.

That doesn't mean it's impossible. It's easy if you're a principal willing to devote the right resources. Consider reading fluency, for example. (I'm not saying that fluency is more important than comprehension; I just have enough experience with this to imagine what I'd do as a principal.) Train a paraprofessional to listen to every first- and second-grade student in the school read aloud for one minute a week from a sample reading passage (there are sets of roughly equivalent passages one can purchase for this purpose). Have them enter the data through a Google Docs form, a SurveyMonkey survey, or some other tool that will send the data to a spreadsheet. Get someone to program the results so that you can show data per child with trend lines and sort by grade, classroom, etc. For a few extra lines of code, you could add locally-weighted regression trends to be really fancy, but that's beside the point.
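
To make "get someone to program the results" concrete, here is a minimal sketch in Python with pandas and matplotlib. The specifics are my assumptions, not anything prescribed by the district or this post: the CSV file name (fluency_scores.csv), the column names (date, student, grade, classroom, and wcpm for words correct per minute), the straight-line trend, and the optional locally-weighted trend via statsmodels.

```python
# A sketch of per-child weekly fluency trends, assuming the form feeds a
# spreadsheet exported as CSV with columns: date, student, grade, classroom, wcpm.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

try:
    # Optional: the "few extra lines" for locally-weighted regression trends.
    from statsmodels.nonparametric.smoothers_lowess import lowess
except ImportError:
    lowess = None


def load_scores(path="fluency_scores.csv"):
    """Read the weekly one-minute fluency scores exported from the form."""
    df = pd.read_csv(path, parse_dates=["date"])
    return df.sort_values(["grade", "classroom", "student", "date"])


def plot_student_trends(df, grade=None, classroom=None):
    """One panel per child: weekly scores, a straight trend line, and (if
    statsmodels is installed) a locally-weighted trend."""
    if grade is not None:
        df = df[df["grade"] == grade]
    if classroom is not None:
        df = df[df["classroom"] == classroom]

    students = sorted(df["student"].unique())
    fig, axes = plt.subplots(len(students), 1, sharex=True,
                             figsize=(8, 2 * len(students)), squeeze=False)
    for ax, student in zip(axes.ravel(), students):
        scores = df[df["student"] == student]
        x = scores["date"].map(pd.Timestamp.toordinal)  # dates as numbers
        ax.plot(scores["date"], scores["wcpm"], "o-", label=student)
        if len(scores) >= 2:  # ordinary least-squares trend line
            slope, intercept = np.polyfit(x, scores["wcpm"], 1)
            ax.plot(scores["date"], slope * x + intercept, "--", alpha=0.7)
        if lowess is not None and len(scores) >= 5:  # locally-weighted trend
            smoothed = lowess(scores["wcpm"], x, frac=0.6)
            ax.plot(scores["date"], smoothed[:, 1], ":", alpha=0.7)
        ax.set_ylabel("wcpm")
        ax.legend(loc="upper left", frameon=False)
    axes.ravel()[-1].set_xlabel("week")
    fig.tight_layout()
    return fig


if __name__ == "__main__":
    scores = load_scores()
    plot_student_trends(scores, grade=1).savefig("grade1_trends.png")
```

Sorting and filtering by grade or classroom is just a column filter on the same spreadsheet, so the per-child, per-classroom, and per-grade views come from the same handful of lines.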

Here's the point: this is not rocket science, it does not require a gazillion-dollar software package from TestPublisher Inc., and it's very different from the type of quarterly testing that superintendents are buying into in a big way (including that gazillion-dollar software package from TestPublisher Inc.). It's also very different from the quarterly testing that UTLA is protesting.

So, Ramon Cortines, here's my challenge: can you document that the quarterly-testing regime is better than the weekly-quiz-plus-trends proposal I've outlined above? The second can fit easily into the routines of any school. The second can start conversations EVERY WEEK at a school. The second is MUCH cheaper. It's also less sexy: no giant software packages manipulable from the front office, no instantly-printable pastel-colored graphs that demonstrate what kids were able to do on a test six weeks ago. You'd definitely give up the flashy for the mundane. But prove to me that the flashy is better than the mundane.

Tags: formative assessment, Ramon Cortines, United Teachers Los Angeles
Posted in Accountability Frankenstein on February 6, 2009 12:05 PM