January 21, 2009

One step in the right direction... but let's end single "national evaluation" studies

As Stephen Sawchuk notes, the stimulus package requires randomized controlled studies of federally funded performance pay (from the Teacher Incentive Fund, which is receiving $$BIG in the stimulus). From pp. 166-167 of the draft:

Provided further, That a portion of these funds shall also be used for a rigorous national evaluation by the Institute of Education Sciences, utilizing randomized controlled methodology to the extent feasible, that assesses the impact of performance-based teacher and principal compensation systems supported by the funds provided in this Act on teacher and principal recruitment and retention in high-need schools and subjects...

First, kudos to the bill authors (legislative staff) who inserted this language. I'm almost rolling my eyes at the randomized controlled trial language because I thought we'd been through the methodology debate sufficiently to understand that RCT is not a panacea. It is a good option for comparing discrete options (e.g., two different "treatments" that are distinct and clearly defined), but it is extraordinarily hard to arrange in education. There are other legitimate options even for that relatively narrow effectiveness question (e.g., regression discontinuity and propensity score designs), and there are other important analyses to consider, depending on the question and the discipline: economists would ask about cost effectiveness, and the educational equivalent of epidemiologists would ask about the "number needed to treat" or broad population-treatment questions.

But I'm not rolling my eyes because it's the first full day of the new administration, I'm 43, and my eyes might stick that way if I keep doing it. Hmmn. The serious reason I'm not going to quibble too much with that language is threefold: if done correctly, discrete studies will still tell us something; there's the "to the extent feasible" clause; and on principle, it is a good step to require planning for evaluation at the front end.

On the other hand, I think it's a mistake to require "a rigorous national evaluation... that assesses the impact" as in a single analysis. That language has a grammatical problem: it's using the singular when the plural is more appropriate. There should be a single rigorously designed and collected set of data, but it is wrong either to put the analysis in a single group's hands or to frame the question in a singular fashion, as if the answer to any question about the effectiveness of a national program is "yes, it's effective" or "no, it's not effective."

That's the headline for any single, putatively authoritative national evaluation, and if my favorite policy in the whole wide world were performance pay, I would work like the dickens to make sure the questions were framed differently, because it might well turn out (and we should expect the world to work in this way) that the first and second generation of performance-pay plans will largely do squat. "Do squat" is technical language for an average effect size around d=0. Again--if my favorite policy in the whole wide world were performance pay, I'd want to make sure that the questions revolved around differences among programs and pay schemes (including non-performance-pay structures), not just the difference between systems with and without performance pay. And I'd want to make sure that Hawthorne effects were screened out. And all the other things that could subject any such study to criticism within half a day or so. And I'd want the data made available to researchers with different perspectives, so no single person or group could spike the results.
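
To make the d=0 point concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the plan names (`plan_a`, `plan_b`) and the scores are hypothetical, not data from any evaluation, and the effects are exaggerated so the arithmetic is easy to follow. It computes Cohen's d (the standardized mean difference) for two made-up pay plans separately and then for the two lumped together, as a single "does performance pay work?" analysis might.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treated, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treated), len(control)
    s1, s2 = stdev(treated), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical outcome scores (treated group, control group) for two invented plans.
programs = {
    "plan_a": ([6, 7, 8, 9, 10], [4, 5, 6, 7, 8]),  # treated group scores higher
    "plan_b": ([4, 5, 6, 7, 8], [6, 7, 8, 9, 10]),  # treated group scores lower
}

# Effect size for each plan on its own.
per_program = {name: cohens_d(t, c) for name, (t, c) in programs.items()}

# Effect size when both plans are lumped into one national comparison.
all_treated = [x for t, _ in programs.values() for x in t]
all_control = [x for _, c in programs.values() for x in c]
pooled_d = cohens_d(all_treated, all_control)

for name, d in per_program.items():
    print(f"{name}: d = {d:+.2f}")
print(f"lumped together: d = {pooled_d:+.2f}")
```

In this toy case the two plans have effects of equal size and opposite sign, so the lumped comparison comes out at zero even though neither plan "does squat" on its own. Real programs will not cancel this neatly, but the sketch shows why questions about differences among programs can matter more than one overall yes-or-no answer.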

It just so happens that the same diversity of questions and distribution of data would be good from the research community's standpoint, too. It's a shame that the habit in large federal programs is different. If you doubt the wisdom of my advice, seek counsel from those upset about the national evaluation of Reading First.

Tags: federal stimulus, performance pay, research
Posted in Education policy on January 21, 2009 10:54 AM