November 22, 2006

More on teaching to the test

Jal Mehta (guest blogger for Eduwonk) continues the blogule discussion of teaching to the test with a lament about the inability to follow through on the New Standards Project's desire for high-quality tests that were not easily susceptible to test prep.

For some reason, this strikes me as very similar to the laments I hear about the death of trolleys in mid-20th century L.A., except that nostalgia for the heady days of the early 90s seems, well, a bit misplaced. For that matter, so is the nostalgia for the Red Cars, which were geared as much to opening up the San Fernando Valley (see a system map from 1910) as to mass transit, and probably more geared towards speculative land development.

There are a few things that are important about the proposal to create demanding performance tests:
  • When tried, performance tests have been expensive, and their psychometric qualities have been controversial, to say the least.
  • Very quickly, states figured out they could 'hybridize' the idea (to use Larry Cuban's expression) by incorporating some performance items in a test that would remain mostly multiple-choice, satisfying the demand for some performance items while lowering the cost and the statistical problems (in the eyes of such officials); here, Florida was a leader.
  • Where it was tried more extensively, it's hard to see how the existence of performance tasks dramatically changed the dynamics of high-stakes systems. Of the state systems that tried performance items, one (California's) was killed for political reasons. The history of Kentucky's KIRIS system gets read in many different ways, but that was a substantial package of reforms, so isolating the test format and other individual characteristics is hard to justify analytically.
  • The proposal for demanding performance assessments demonstrates that focusing on tests puts the cart before the horse. Suppose we established a mandatory history test in Florida that would be essay-based. Take any item in the national history standards (which are essentially essay prompts), and make a student write on that for an hour. (Example: Evaluate how minorities organized to gain access to wartime jobs in WW2 and how they confronted discrimination.) That's meaningful and demanding, and getting students to the point where they could succeed on such a task might require most of a decade in terms of changing the curriculum, textbooks, and history-class routines. But the sequence that Tucker and Resnick suggested would bollix that up: we are so focused on short-term changes in test results that everyone would assume that lousy scores for several years mean that history teachers aren't changing things, even if there's a deliberate effort to change practices.

I'd love to believe the New Standards theory of action, because it's comforting to think that we just have to craft the right test. Nor am I saying that we should be happy with what currently exists! But I'm afraid the New Standards story is a bit of a fairy tale. You can't just tweak the test and expect the rest to follow.

All right: enough procrastinating. Time to get back to the last chapter and describe how we can save the world...

Posted in Accountability Frankenstein on November 22, 2006 1:11 PM