September 12, 2008

Shared responsibilities III: The next ESEA

Over the summer, Charles Barone challenged me to put up or shut up on NCLB/ESEA. I immediately said that was fair: Accountability Frankenstein's last chapter was general, not specific to federal law. I'm stuck in an airport lounge waiting for a late flight, so I have the occasion to write this now. Because I'm on battery power, I'm going to focus on the test-based accountability provisions rather than other items such as the high-quality teaching provisions. Let me identify what I find valuable in No Child Left Behind:

  • Disaggregation of data
  • Public reporting
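
To make concrete what disaggregation and public reporting mean in practice, here is a minimal sketch in Python; the subgroup labels and scores are invented purely for illustration:

    # Minimal sketch: disaggregating scores by subgroup and reporting each
    # group's results, not just the schoolwide average. All data invented.
    from collections import defaultdict
    from statistics import mean

    # Each record: (subgroup label, scale score). Labels are illustrative only.
    records = [
        ("all_students", 220), ("ELL", 198), ("students_with_disabilities", 190),
        ("all_students", 245), ("ELL", 210), ("all_students", 232),
    ]

    by_group = defaultdict(list)
    for group, score in records:
        by_group[group].append(score)

    # Public reporting: every subgroup appears in the published results.
    for group, scores in sorted(by_group.items()):
        print(f"{group}: n={len(scores)}, mean={mean(scores):.1f}")
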
I think most people who don't have their egos invested in NCLB recognize that its Rube Goldberg proficiency definition has no serious intellectual merit and has been a practical nightmare. Yet there is a policy dynamic that even observers in the peanut gallery like me can recognize: states will game any system, and that gaming undermines their credibility with those inside the Beltway. So there's a solid justification for a continued regulatory regime, if it is sane and recognizable as such by most parents and teachers (i.e., the connotation of "loving hardass" that I meant in a prior post and that some readers have recognized). I'll have to write another entry on why I think David Figlio is wrong and why teachers are not magisters economici, but incentives just don't appear to be doing that much. An appropriate regulatory regime has to make it easier to be a good educator than a bad educator, make it easier for states to support good instruction than to game the system, and be reasonably flexible when the specific regulatory mechanisms clearly need adjusting.

So where do we go from here? I don't think trying to tinker with the proficiency formula makes sense: none of the alternatives look like they'll be that much more rational. What needs more focus is what happens when the data suggest that things are going wrong in a school or system. On that, I think the research community is clear: no one has a damned clue what to do. There are a few turnaround miracles, but they are outliers, and billions of dollars are now being spent on turnaround interventions with scant research support. To be honest, I don't care what screening mechanism is used as long as (a) the screening mechanism is used in that way and that way only: to screen for further investigation and intervention; (b) the screening mechanism has a reasonable shot at identifying a set of schools that a state really does have the capacity to help change -- if 0 schools are identified, that's a problem, but it's also a problem if 75% of schools are identified for a "go shoot the principal today" intervention; and (c) we put more effort and money into changing instruction than into weighing the pig or putting lipstick on it. Never mind that I'm vegetarian; this is a metaphor, folks.

So, to the mechanisms:

  • A "you pick your own damned tool" approach to assessment: States are required to assess students in at least core academic content areas in a rigorous, research-supported manner and use those assessments as screening mechanisms for intervention in schools or districts. Those assessments must be disaggregated publicly, disaggregation must figure somehow into the screening decisions, and state plans must meet a basic sniff test on results: if fewer than 5-10% of schools are identified as needing further investigation, or more than 50%, there's something obviously wrong with the state plan, and it has to be changed. The feds don't mandate whether proficiency or scale scores are used; as far as the feds are concerned, it's a state decision whether to use growth. But a state plan HAS to disaggregate data, that disaggregation HAS to count, and the results HAVE to meet the basic sniff test.
  • A separate filter on top of the basic one to identify serious inequalities in education. I've suggested using the grand-jury process as a way to hold even the wealthiest suburban district to account if it's screwing around with racial/ethnic minorities, English language learners, or students with disabilities. I suspect there are other workable mechanisms, but the bottom line here is the following: independent composition, independent investigatory powers (as far as I'm aware, grand juries have subpoena power in every state), and public reporting.
  • Each state has to have a follow-up process when a school is screened into investigation, whether by the basic tool noted above or through the separate filter on inequality. That follow-up process must address both curriculum content and instructional techniques and include a statewide technical-support process. At the same time, the federal government needs to fund a serious program of research to figure out what works in intervention. We have no clue, dear reader, and most "turnaround consultants" are the educational equivalents of snake-oil peddlers. That shames all of us.
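
To make the sniff test concrete, here is a minimal sketch of the kind of check I have in mind; the function name is mine, and treating 10% (rather than 5%) as the floor is a simplification of the range suggested above:

    # Minimal sketch of the "sniff test" on a state screening plan.
    def plan_passes_sniff_test(identified: int, total_schools: int,
                               floor: float = 0.10, ceiling: float = 0.50) -> bool:
        """True if the share of schools screened for further investigation
        falls in a plausible range; otherwise the plan needs changing."""
        rate = identified / total_schools
        return floor <= rate <= ceiling

    # A state identifying 30 of 1,000 schools (3%) fails, as does one
    # identifying 750 of 1,000 (75%); 200 of 1,000 (20%) passes.
    print(plan_passes_sniff_test(30, 1000))   # False
    print(plan_passes_sniff_test(750, 1000))  # False
    print(plan_passes_sniff_test(200, 1000))  # True
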
The gist here is that we stop worrying about perfecting testing and statistical mechanisms, as long as they are viewed properly as screening devices. Despite the reasoned criticisms of threshold criteria (e.g., proficiency), the problem is not that such criteria exist but that these mostly jerry-built devices are relied upon for the types of judgments that make many of us wince, and that the results fail the common-sense sniff test. As long as the federal government tries to legislate a Rube Goldberg mechanism, it will have little legitimacy, and states will continue to be able to wiggle away from their responsibilities when they're not doing stupid things to schools. (Yes, both can happen at the same time.) Much wiser is to shift onto states the responsibility for making the types of political decisions this involves, as long as the results look and smell reasonable.

Doing so will also allow the federal government to focus on what it's largely ignored for years: no one knows how to improve all schools in trouble (and here I mean the organizational remedies -- there's plenty of research on good instruction). Instead of pretending that we do and enforcing remedies with little basis in research, maybe we should leave that as an open, practical question and... uh... do some research?

Posted in Accountability Frankenstein on September 12, 2008 11:59 AM