December 20, 2008

Study: Self-serving utility of PR can still be outweighed by inherent value of research

Since I'm convinced by Jeff Henig (Spin Cycle) that the process of research can outweigh the distortions of PR spin in the long term, I will give a bit of thanks here to the New Teacher Project's PR folks, who deftly spread a story of TNTP's supposedly great effectiveness in Louisiana, getting word of it into various places and linking to George Noell's project page with the Louisiana Board of Regents. Without TNTP's flacks, I wouldn't have been prodded to go read the latest paper, which is both more and less than what TNTP's press release said.

More than what TNTP said: In a contract with the state, Noell has piloted a multi-level model to evaluate teacher education programs from grades 4-9 student test scores in five content areas (math, reading, language arts, science, and social studies). If you read the last two technical reports, you'll probably be impressed with the care with which he and his team have conducted this: starting with very small samples several years ago, looking to see whether excluding the teacher or school level changes the portion of the variance swallowed by different levels, adding in blocks of variables in sensible ways, conducting preliminary analyses with OLS, checking on the effects of Katrina and Rita, testing whether family background variables (e.g., whether kids are in single-parent homes) add explanatory power and whether the key coefficients of interest (inferred effects for preparation programs) are stable, and so forth. The papers explain why they use a prior-covariate method and treat successive school years (and the schools in separate years) as independent units, rather than embedding the data in a repeated-measures design, in the context of a "there are tradeoffs in research" worldview. This is about two or three cuts above the work I've read from the SAS Institute's value-added group, and Noell is to be credited with working over half a decade to establish a credible program of research.
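For readers who want the flavor of that setup, here is a minimal sketch of a prior-covariate, multi-level specification of the sort described above. This is my illustration, not Noell's actual model: the file and column names (prior_score, program, teacher_id, school_id) are hypothetical, and I'm using Python's statsmodels only to make the structure concrete.

```python
# A minimal sketch of a prior-covariate, multi-level value-added model.
# NOT Noell's actual specification; all column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("student_teacher_links.csv")  # hypothetical linked file

# Fixed effects: prior achievement plus preparation-program indicators.
# Random effects: school intercepts, plus a teacher-within-school
# variance component -- the nesting whose exclusion gets tested above.
model = smf.mixedlm(
    "score ~ prior_score + C(program)",
    data=df,
    groups="school_id",
    vc_formula={"teacher": "0 + C(teacher_id)"},
)
result = model.fit()
print(result.summary())

# A plain OLS run with the same fixed effects, analogous to the
# preliminary OLS analyses mentioned in the technical reports.
ols = smf.ols("score ~ prior_score + C(program)", data=df).fit()
```

Comparing the program coefficients and the variance components across runs like these is roughly what the "does excluding a level change the variance distribution" check amounts to.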

In this cautious vein, Noell's reports lay out the case for differential effects of teacher preparation programs. The latest reports cover only a small portion of the state's programs, because the state has mandated program redesigns this decade, and he chose to report results only on graduates of programs that have completed the redesign process. Since only a handful of programs finished that redesign long enough ago for there to be sufficient graduates with reported scores, this should be viewed as an initial look, though I am guessing that Louisiana's state officials will be happy to keep contracting with Noell until all of the programs come under the microscope.

Less than what TNTP said: Because the alt-cert programs are freestanding, they could finish the redesign earlier, so there are few college-based programs in the hopper for the latest report, and one of them (the University of Louisiana - Monroe master's program) looks like it's sitting pretty. There are also some questions I have about the choices Noell made: it looks like he combined scale scores from different grades in the analysis (judging from the coefficients, I don't think they used z-scores as dependent variables), and there's always the question about matching students to teachers (essentially the fiction that math and science teachers don't help students with the other subject, or that reading and history are unrelated). The last question is testable, but tackling it in a practical sense would be an interesting problem. (Also, A.W., if you read this blog, I think you were wrong when you said Noell had used three different models. I think you were referring to his test of whether excluding either the teacher or school level changed the variance distribution, unless it was his testing of the contribution of family variables. Three different models in a far more interesting sense would be akin to fitting the same specification in HLM software, PROC MIXED, and OLS. Impractical, but it would be great if someone did it. I'll buy you a beer sometime and we can talk about it.)
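To make the z-score point concrete, here's the alternative I have in mind: standardize scores within each grade-year combination before pooling, so scale scores from different tests sit on a common metric. Again, this is my own illustration with hypothetical column names, not anything taken from Noell's reports.

```python
# Standardize within grade and year so pooled scale scores from
# different tests are comparable. Column names are hypothetical.
import pandas as pd

df = pd.read_csv("student_scores.csv")  # hypothetical file

# z-score each scale score against the mean and SD of its grade-year
# cell, then use the z-score as the dependent variable instead.
df["z_score"] = df.groupby(["grade", "year"])["score"].transform(
    lambda s: (s - s.mean()) / s.std()
)
```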

Then there's the question of how much weight to place on test data, but that's a policy question rather than one of research methods. If Louisiana decides that all college-based programs have to be above average, that's an abuse of the research; by definition, some programs will always sit below the mean. But I have no problem if the state uses the data as part of a review decision (e.g., placing programs under considerably more scrutiny where graduates have far, far worse results for kids than other new teachers).

The long view: If my quick read of Noell's work is right, and it's as solid as I think it is early on a Saturday morning, then the practical regulatory question for states will not be whether college-based or freestanding teacher preparation programs are better. Instead, the question will be which specific programs are effective in the areas where appropriate data exist. These days, it is the rare state staffer who will criticize an entire sector of preparation programs; state lawmakers and political appointees are the ones who praise, slam, and open and close doors on those types of options. But staff may well be happy to play an evaluative role for specific programs.

Posted in Education policy on December 20, 2008 9:57 AM