October 29, 2009

Channeling Jerry Bracey on "proficiency": it's political, not scientific

One of the late Jerry Bracey's hobbyhorses was the pretense that the NAEP achievement level labels were scientific. As he argued in 1999: "The standards have generally been the object of scorn and derision from the psychometric community." He was fond of quoting the 1999 report on NAEP proficiency levels, esp. from p. 162: "Standards-based reporting is intended to be useful in communicating student results, but the current process for setting NAEP achievement levels is fundamentally flawed." So when NCES issues a report comparing the implied theta-values of cut scores for proficiency on state assessments to the theta-values of cut scores for proficiency on NAEP, and both Ed Week and the Christian Science Monitor report on the paper with a straight face, we're obviously seeing one place where Bracey's voice is already missing.

I think Jerry perseverated on this issue, to the detriment of a sensible argument about political judgments. The larger, inescapable point is that cut scores are set arbitrarily, and there is no way to avoid that fact. Those who support setting achievement levels hope and pray that they're arbitrary in the sense of arbitration and careful judgment, not in the sense of being capricious. But they are arbitrary, and the labels assigned to them even more so. What we know is that someone who scores at a "proficient" level on NAEP is scoring higher than someone in the "basic" band. That's all we know from those labels: ordinality. Moses did not come down from Mount Sinai with NAEP scores carved in tablets.

So what do we do with the inherently political nature of those labels? As I have argued in Accountability Frankenstein, the caution with which we use the judgments on cut scores should depend on the stakes of their use. If they're used to target resources, that's one thing (resources are going to be targeted in some manner), but the more that someone's job depends on them, the more wary we should be of how we set thresholds. 

Today, however, NAEP labels and cut scores are serving a purely performative purpose: to stigmatize states for their political response to NCLB. I hereby propose that we have the following new labels for NAEP achievement levels:


I think that's in the spirit of the day's report...

Correction: I assumed that NCES was using detailed data from the state assessments to estimate IRT parameters. Silly me. They were using distributional data for linkage. Oops... for me for forgetting the methods from the last such report. I'll let the measurement folks argue about the methods used here. 

Posted in Accountability Frankenstein on October 29, 2009 1:02 PM