August 5, 2010
"Overcaffeinated value-added enthusiasts" and the public
I've developed a fondness for Rick Hess's phrasing. Whether or not I agree with him, I have to smile when he describes advocates of value-added or bust as "overcaffeinated value-added enthusiasts." This is in the context of the back-and-forth over value-added measures in the District of Columbia teacher firings (see blog entries by Aaron Pallas, Hess, Pallas, and Hess, from which I drew the overcaffeinated term). What we're seeing here is the beginnings of a public dialog over the technical details of value-added measures, whether in DC or here in Florida (see today's St. Pete Times story on two audit reports over Florida measures, plus articles from Jacksonville and Miami over continuing questions).
Or, rather, we're not seeing much of a dialog, more of a he said-she said dynamic. Pallas wrote from what was publicly available (which was as simplistic as what I read about Bill Sanders' techniques in newspapers in 1990s Tennessee), Hess criticized him for insufficient due diligence for an academic blogger, and we're now into round two on who owes whom what on transparency. Florida is slightly different in the actors: most of the critics this summer have been superintendents, worried about whether problems with the underlying test scores or value-added measures will end up shaming their elementary schools (and them) with lower ratings on the state's accountability system. But it's still he said-she said with the auditors saying they found no problems and the superintendents still having reservations.
What we're missing is clear reporting on the technical issues, but don't blame the reporters. In some cases, there is poor planning by state departments of education (or the DC schools, in this summer's news), so there's nothing clear and accurate and easy to communicate. In other cases, as in those jurisdictions using Bill Sanders' techniques, you've got a proprietary model that the public isn't allowed to inspect. And then there's the simple fact that there is no single holy grail of value-added measures, and that the inherent error issues tend to be underplayed because standard errors and measurement error are eyeglaze-inducing even if they're important. So the reporting is a far cry from the sensawunda reporting on scientists who uncovered BP's lowball estimates of the gusher's output: oh, wow, there are pools of oil under the surface? oh, wow, you can estimate flows from the speed of particles in a fluid? Nothing like that exists on value-added or growth measures.
Some part of the situation is inevitable when a technical apparatus becomes a tool of political discussion. I don't mean the partisan politicization of statistics (though that happens) but the fact that even mildly controversial bills do not pass in many legislative bodies unless there is a certain amount of pathos in the debate, and the exaggeration of debate tends to drown out the caveats for anything. There are plenty of very careful statisticians out there who can tell you the issues with value-added or growth measures. They're not quoted in news stories, because no editor is going to let "mixed-model BLUE algorithms tend to swallow dependent-variable variance before you get to the effect measures" appear in a newspaper. So there's a mismatch between the technical issues and the level of discussion. You shouldn't need someone with the skills of Robert Krulwich to report on technical measures affecting public policy, but that's where we are.
That feeds into the dichotomous debate that is dominated by the "let the measures work" and the "it's imperfect, so toss it out" arguments. As I wrote a year ago,
The difficulty in looking coldly at messy and mediocre data generally revolves around the human tendency to prefer impressions of confidence and certainty over uncertainty, even when a rational examination and background knowledge should lead one to recognize the problems in trusting a set of data. One side of that coin is an emphasis on point estimates and firmly-drawn classification lines. The other side is to decide that one should entirely ignore messy and mediocre data because of the flaws. Neither is an appropriate response to the problem.
When the rubber meets the road, you're sometimes going to get the firmly-drawn classification lines in Florida that lead people to nitpick technical details (I wonder how many of the superintendents griping this summer have bonuses tied to school grades), and you're going to get nebulous debates when systems such as IMPACT are not accompanied by technical transparency. This just doesn't work for me, and it shouldn't for you, either.
July 30, 2010
"Pushback" week
It's almost as if Nick Anderson and Ruth Marcus worked at the same paper, because "pushback" appears to be the talking point of the week on education policy. Yesterday, Anderson reports, President Obama "pushed back" against some civil-rights groups' criticism of Race to the Top, and Marcus applauded him when the president "took the opportunity to push back." Oh, wait: they do work for the same paper. Well, at least we know that at the Post, some colleagues talk with each other, unlike the one who fired Dave Weigel last month and the other who hired him this month. Then again, the fools at the Post, Inc., appear to be management and bull-male columnists, not rank-and-file reporters.
There are four major stories that dominated national education news in the past week, at least as far as I was paying attention:
- The drama surrounding the civil-rights group report and non-presser and the two major education speeches this week by Duncan and Obama.
- Continuing problems in trying to attach state aid to federal bills (after the emergency war appropriations, there's the inability to break the small business aid bill, which had jobs money attached).
- Michelle Rhee's plans to fire several hundred teachers based on the IMPACT evaluation system.
- The New York state testing cut-score embarrassment.
Pushback was used in the Post's coverage of the first story, but I think you can say it's a theme for the week. House and Senate members are now in almost open warfare over education jobs riders to bills (possibly extending to the FMAP aid to states on Medicaid, stuck in Congress since early this year). There is debate over how many teachers Rhee is firing and how bad a system IMPACT is. And Joel Klein is twisting himself in knots trying to explain how the mistakes in proficiency rates that he used to puff up his record really aren't a problem and, uh, Lady Gaga shows how good the New York City schools are. I'm half-expecting him to talk about New York's smog swampy beauty, the East River though, doesn't it split the Park Slope from the Palisades? Someone get Bill Shatner to read Joel Klein's ratiocinations!
Some things behind the headlines that seem obvious to this historian:
- Part of the loose (and fragile) coalition criticizing the Obama administration's turnaround policy stems from unions concerned about due process for employees and community-based organizations worried about the closure of public facilities in poor neighborhoods and the role of public employment in providing a leg up to the middle class. That's not new, and it's complicated. The civil-rights group interest in public employees can be salutary (my understanding is that Black teachers were a solid core of local NAACP chapters in the mid-20th century) but sometimes at cross-purposes with other interests: I heard informally from some observers that part of the pushback against the decentralization of Chicago schools in the late 1980s was the role of the central school bureaucracy in providing a leg up into the middle class, and the reduction of the central bureaucracy threatened those positions. Today, the invisible risk is the position of minority teachers' aides and other non-certified employees. My guess is that they've been disproportionately affected by school-system layoffs that try to hold onto classroom teachers.
- I still don't have a clue how much test scores played a role in the firing of DC teachers, and my guess is that you don't, either. IMPACT included test scores, but you'd have to look at the details of individual employees to know whether an individual firing is a case where all the indicators (including the required five observations) pointed in the direction of an incompetent teacher or whether test scores trumped supervisory judgment for any. Normally employers have broad discretion in evaluation systems, but the failure to bargain IMPACT may put the DCPS in some jeopardy of an unfair labor practice finding. (That depends on both the structure of DC collective-bargaining law and the details of what happened with IMPACT and WTU's requests for bargaining.) Double jeopardy for Michelle Rhee: the inclusion of the pseudoscientific "learning styles" in the IMPACT observation system. My guess is that the AFT (the national affiliate for the Washington Teachers Union) can quickly get their hands on well-known psychologists to rip that to shreds for any teachers where the tipping factor was a supervisor's judgment that they didn't cater to student "learning styles."
- Joel Klein's dancing around the cut-score fiasco in New York illustrates once again that the performative setting of cut scores is often a result of the tension between bravado and "reform testosterone," on the one hand, and politically acceptable failure and the political need to game the system, on the other. We'd like to think that cut-score setting is arbitrary in the sense of arbitration, but it's too often arbitrary in the sense of caprice and politics. Two years ago, Jennifer Jennings and I wrote a commentary for Teachers College Record ($$ required) about the dangers of trusting threshold-based proficiency percentages as opposed to central tendencies such as means and medians, with New York City as the object lesson. She's too mature for this, but I have no such reticence with the last week's revelations: nyah nyah nyah, we told you so. And from those of us who warned years ago about the fragility of growth/value-added statistics? same message.
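For anyone who wants the threshold point in concrete terms, here is a minimal sketch in Python. It assumes scores are roughly normally distributed, and every number in it (the scale mean of 650, the standard deviation of 35, the 5-point gain, the two cut scores) is made up for illustration rather than taken from New York's tests. The function names are mine.

```python
from statistics import NormalDist


def percent_above_cut(mean, sd, cut):
    """Percent of students scoring above a cut score, assuming normal scores."""
    return 100.0 * (1.0 - NormalDist(mean, sd).cdf(cut))


def change_in_percent(mean_before, mean_after, sd, cut):
    """Change in percent 'proficient' when the mean shifts and the cut stays put."""
    return (percent_above_cut(mean_after, sd, cut)
            - percent_above_cut(mean_before, sd, cut))


# Hypothetical scale: the same 5-point gain in the mean, two different cut scores.
near_mean = change_in_percent(650, 655, 35, cut=650)
far_from_mean = change_in_percent(650, 655, 35, cut=700)
print(f"cut near the mean: {near_mean:+.1f} points")
print(f"cut far from the mean: {far_from_mean:+.1f} points")
```

Under these assumptions, the identical gain in the mean produces a noticeably larger jump in the proficiency percentage when the cut sits near the middle of the distribution than when it sits out in the tail. Where the line is drawn shapes the headline number.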
Bottom line here for administrators: test-based measures should only be used as a case to fire teachers or administrators where they strongly point in the same direction as observation-based evaluation instruments that are developed with some common sense and with unions, and that excise crap such as learning styles.
July 26, 2010
"Opportunity to learn" revived?
As Ed Week's Michele McNeil is reporting, a coalition of civil rights groups has issued a white paper today through a (new?) organization, the National Opportunity to Learn Campaign. Last night, Diane Ravitch was tweeting her reading of the paper as a gentle but firm rebuke of the Obama administration's approach to accountability. To some extent, I think she's right: the 17-page report briefly referred to the inappropriateness of judging schools and teachers primarily by test scores, but that was a brief reference.
For the longer and more committed passage criticizing policy prejudices towards school closures, I read the argument differently, because of the other arguments in the paper in favor of more money for early childhood education, wraparound care programs, and NCLB's public-school choice provisions and against budget cuts. And then there's the name that's a throwback to early-90s arguments in favor of opportunity to learn standards. To me, that all looks like a straightforward community-civil-rights approach more than an argument against high-stakes testing. In that context, the argument against school closure is an argument against withdrawing resources from a community institution that may be one of the few public facilities in a poor neighborhood.
That also fits with how the coalition's paper addresses Race to the Top: don't withhold resources or programs from poor children. Instead, combine formula grants with conditions. Notably, the paper states that a limited competition is acceptable, suggesting that the constituent organizations would not directly oppose Race to the Top as long as its structure does not permanently replace formula grants in ESEA. I know what others are going to say in response: we have plenty of conditions on federal funding, but the federal government almost never penalizes states for falling down on the job.
To a great extent, the politics of and posturing around education reform are all depressing to me: education reform policies are dwarfed by the state of the country's economy right now. In fact, that's a crucial part of the argument of the Broader, Bolder Approach. So you should maybe focus your efforts on the national economy right now? Or if not the national economy, maybe focusing on states, where the real action is going to happen over the next few years?
I think the coalition is moving about 15 months too late, if the key movers intended to shape federal policy. It's very likely that there won't be more RTTT, there won't be ESEA reauthorization, and there won't be a heck of a lot of things that should be happening from the perspectives of a variety of people on different sides of this debate. I wish I had been wrong a month ago, but it looks more and more that I was right in predicting that David Obey's gambit last month was a stupid gamble instead. I was wrong in guessing that Obey would be frustrating George Miller, but I think I'm right on the general picture. To be clear, it's far from the biggest SNAFU of the Congressional session: that's the too-small size of the stimulus in early 2009 and the failure of the White House to nominate (or recess-appoint) enough Fed governors. But I'm still depressed, and puzzled by the strategic choices.
(One final puzzle is the group's website. The contact information is for the Schott Foundation in Massachusetts, which is consistent with the few blog entries (written by Michael Holzman) and the press-kit stuff. But there are no staff members or individuals listed on the website, just organizations. The whois entry for otlcampaign.org shows that the domain name has existed since sometime in 2009, but it's registered through a proxy, and the Internet Archive has no history of the website (blocked at the site). This is all perfectly legal, but it's odd.)
July 24, 2010
Firings in DC
Andy Rotherham is correct that the termination notices in the DC public schools this week included about a third of the total who had not met licensure standards, and a greater number were rated in the lowest classification in the annual evaluations. Nonetheless, what is newsworthy about the terminations is the public nature of outright firing of a chunk of teachers for nonperformance. It wasn't the firing of a third of the district teachers, but significantly less than 10%. Let's assume a similar number of those given notice of "underperformance" this year either quit or are fired next year. That would be the firing of around 13-16% of the teachers for nonperformance in two years. It's noticeable.
By itself, the number is neither good nor bad, though many will argue the point either way without additional information. I say we wait. First, we wait for the Washington Teachers Union to sort through the information to see if any teachers were fired without the five classroom observations required for the evaluations. The grievance mechanism that exists in the union contract is on procedural grounds, and here we'll see how careful Rhee's bureaucrats have been. Then, we wait to see if there are any examples of firings that don't meet a basic smell test--anyone who had won teaching awards and plaudits but was given low ratings for reasons of favoritism or obviously inappropriate application of student test scores. Either procedural errors or plausible miscarriages of justice are reasonable grounds on which the union will fight for members, and on which it has an ethical obligation to do so.
Nor is that willingness to fight for individual members inconsistent with a union's willingness to try different methods of evaluation. My chapter can and does file grievances when we think an individual's procedural rights were violated in the tenure review process. That says nothing about the standards of review. It says that we'll fight for the integrity of the review process.
July 16, 2010
Gates in Tampa ... no, my daughter's school!
Two chances in one week to provide personal perspective on Gates' philanthropy. Along with a few thousand other AFT delegates, I saw Gates's speech last Saturday. Today's comment comes via the Business Week article on the Gates Foundation's education program. The article is one of the better journalistic portraits of the foundation, including historical perspective by Maris Vinovskis and some technical perspectives from Howard Wainer and Daniel Koretz. And then in the second half, the article quotes some teachers such as JoAnn Parrino and Kathy Jones. I expected the article to quote either Hillsborough superintendent MaryEllen Elia or Hillsborough Classroom Teachers Association president Jean Clements, and then suddenly the focus was on some teachers at Chamberlain High School, where my daughter graduated in the spring. Yes, she had both Parrino and Jones, as well as a few others mentioned indirectly in the article as Daniel Golden followed Hillsborough's Gates project staff into a teacher meeting at the high school.
Both teach AP social studies courses, Parrino with human geography (taken by ninth graders in Chamberlain) and economics (I forget whether it's micro or macro). Jones teaches the world and European history classes. Both have their student admirers within the school. In the article, Parrino is quoted in favor of random classroom visits, and Jones on a different topic, whether there is such a thing as a year-over-year growth measure when the class is a one-year class such as a topical social studies class. And the music teachers apparently scoffed at the notion that their competence can be measured by student performance on an end-of-semester music theory class. Most of the teachers I've met at the school are reasonably thoughtful at the least, and the article begins to touch on their perspectives and skepticism.
What is notable is that none of the discussion Golden reports is the type of "we can't be expected to do great things with poor kids" excuse that's the common straw-man argument by advocates of high stakes testing. Jones is right to be skeptical that there is any competent value-added measure for history, and the band and chorus teachers are absolutely correct that a music-theory class is an awful measure of their competence. Want to know what a Florida band or orchestra or chorus director pushes their students to perform in? Music Performance Assessments, or MPAs. These are juried festivals of school groups, and teachers in Hillsborough take them very seriously. To use music-theory paper exams instead of MPAs is a pedagogical crime. Do you think the Hillsborough High School band director should be judged by how well my son and his fellow sax players know a Neapolitan 6th, or how well they can blend in a performance of "Take the A Train"?
At some point, advocates of using student outcomes as part of teacher evaluation need to get some sense about implementation. Hillsborough is clunking along right now, and it'll need to adjust things on that part of the evaluation system. The rigid "everyone has to be evaluated in the same way even if it makes no sense" system is not viable in the long term. But it's what the mantra of "50% must be on student outcomes" will lead to unless Charlie Barone and others come out in favor of common sense in the use of student outcomes, and that includes telling their friends when they're wrong in a formulaic approach.
July 14, 2010
Fat tails and audit trails in Florida test scores
I'm starting the day behind on a bunch of things, thanks to a week at the AFT convention in Seattle and the beauteous handling of bad weather by Delta. I arrived in Tampa about 23 hours after leaving Seattle, and let's leave it at that.
So I'm a bit behind on the background behind the evolving controversy over test scores in Florida. NCS Pearson was way, way late on releasing scores, and part of the reason was what Florida DOE officials called glitches in the demographic files Pearson had on students, or how test scores are tied to students and then teachers.
I have a sneaking suspicion that's also behind the controversy that's developing, as first the urban and then a bunch of other system superintendents complained that the proportion of elementary students not making adequate progress year-to-year just didn't fit with any sense of reality (on the low side). Head to the St Pete Times for the published stories and blog entries, including new complaints that the organization auditing Pearson's work is a subcontractor of Pearson, but here's the reason why I suspect the demographic files are a good starting point: Florida's "growth" measure is not the mean or median growth year-over-year on some vertical scale, nor is it a regression-based measure of deviation from some version of expected growth. Instead, it is a jerry-built dichotomous variable of whether an individual student made a particular growth benchmark in a year: yes/no.
It's been a few years since I looked at the details of this "growth" definition, but there's some inherent sensitivity in any measure based on thresholds to variability around the relevant threshold. In the case of Florida's growth measure, the vulnerability is going to be less around the construction of a particular scale at a point in an individual test because the measure depends on a student's prior-year score. So the psychometric vulnerability is going to come from two sources: the general characteristics of tests in two years, and the added variability that you get from comparing scores in two years (there's measurement error in both scores, and the measurement error when you compare the scores is going to be greater than the measurement error in either base year or following year).
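The point about comparing two scores can be made concrete with a minimal sketch in Python. It assumes the measurement errors in the two years are independent and roughly normal, which is the usual simplification and not something I have checked against Florida's technical documentation. The standard errors and margins are made-up numbers, and the function names are mine.

```python
import math
from statistics import NormalDist


def sd_of_difference(sd_prior, sd_current):
    """Standard error of (current - prior) when the two errors are independent."""
    return math.sqrt(sd_prior ** 2 + sd_current ** 2)


def prob_missed_benchmark(true_margin, sd_prior, sd_current):
    """Chance that a student whose true gain beats the benchmark by
    `true_margin` points shows an observed gain below the benchmark."""
    sd_diff = sd_of_difference(sd_prior, sd_current)
    return NormalDist(0.0, sd_diff).cdf(-true_margin)


# Hypothetical standard errors of measurement of 10 scale points in each year.
print(round(sd_of_difference(10.0, 10.0), 1))           # larger than either 10
print(round(prob_missed_benchmark(5.0, 10.0, 10.0), 2))  # student just over the line
print(round(prob_missed_benchmark(20.0, 10.0, 10.0), 2)) # student well over the line
```

Under those assumptions the error in the gain is larger than the error in either year's score, and a student who truly cleared the benchmark by a small margin has a substantial chance of being recorded as a "no."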
Since the two-year-variability issue has been a fact of life for this measure for a number of years, I would be surprised if that were the issue. So then the question is whether this year's fourth- or fifth-grade reading test scores have unusual distributions that would cause interesting problems at the thresholds for "making gains" for students who were low-performing in the prior year. A particularly fat tail at the low end might cause that, but that's speculation, and I suspect an obviously fat-tailed distribution would have been picked up by the main auditor, Buros.
But you can have a non-psychometric wrench in the works, because Florida's dichotomous variable is highly sensitive to one other matter: the correct matching of student test scores from year to year. If the student data files were messed up, and student scores from 2009 were matched to the incorrect student scores from 2010, you'd have all sorts of problems with growth. I strongly suspect that's what tipped off problems with the data files earlier in the spring. If the failures were general, you'd have a skewed distribution of the dichotomous growth variable as the lowest-performing students from 2009 would be the most likely to be matched (incorrectly) to higher scores in 2010 and vice versa, so the first clue would be markedly high growth indicators for 2009's low-performing students and markedly low growth indicators for 2009's high-performing students.
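Here is a rough simulation of that general-mismatch scenario, as a sketch rather than a model of Florida's actual files. Everything in it is hypothetical: the score scale, the error sizes, the uniform 10-point true growth, the 5-point benchmark, and the assumption that a botched match amounts to a random shuffle of 2010 scores across students.

```python
import random


def gain_rates_by_prior_group(n_students=20000, benchmark=5.0,
                              mismatch=True, seed=1):
    """Return the share 'making gains' in the bottom and top quarter of
    2009 scorers, with or without scrambled year-to-year matching."""
    rng = random.Random(seed)
    # True achievement plus independent measurement error in each year.
    true = [rng.gauss(300, 40) for _ in range(n_students)]
    score_2009 = [t + rng.gauss(0, 10) for t in true]
    score_2010 = [t + 10 + rng.gauss(0, 10) for t in true]
    if mismatch:
        # Assumption: a general matching failure acts like a random shuffle.
        rng.shuffle(score_2010)

    order = sorted(range(n_students), key=lambda i: score_2009[i])
    quarter = n_students // 4

    def rate(indices):
        made_gain = sum(score_2010[i] - score_2009[i] > benchmark
                        for i in indices)
        return made_gain / len(indices)

    return rate(order[:quarter]), rate(order[-quarter:])


low_ok, high_ok = gain_rates_by_prior_group(mismatch=False)
low_bad, high_bad = gain_rates_by_prior_group(mismatch=True)
print(f"correct match:   low {low_ok:.2f}, high {high_ok:.2f}")
print(f"scrambled match: low {low_bad:.2f}, high {high_bad:.2f}")
```

In this toy setup, scrambling the match pushes the growth indicator sharply up for 2009's lowest scorers and sharply down for its highest scorers, which is the signature described above.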
But that's not what school districts are reporting: they're reporting unusually low growth proportions for low-performing students from 2009. I can think of a few different ways you'd have that after Pearson tried to correct any obvious problems it saw earlier, but that's speculation. What needs to happen is an examination of the physical artifacts from this year for a sample of schools: the booklets, the student demographic sheets, and the score sheets. We're talking about more than a million students tested, but we can start with a sample of schools that the urban-system superintendents are worried about and track the data from beginning to end with a small enough set to see exactly what happened to the satisfaction of local school officials, policymakers, and the general public.
And if Pearson destroyed all physical artifacts so you can't trace the path of data? Cue "expensive lawyer" music...
July 12, 2010
Gates speech at AFT
Originally written Saturday, July 10: I've figured out how to hang this electronic device onto the back of the chair in front of me while my old PDA foldable keyboard is synced and sitting on my lap, so I can write this blog entry in the middle of the AFT session. AFL-CIO President Richard Trumka gave a spirited speech before lunch, and then the floor approved a resolution on teacher evaluation without amendment.
This afternoon, we started with resolutions on community support and career/technical education (CTE) programs. For the most part, the resolutions this afternoon were neither going to be the controversial resolutions nor the controversial part of the afternoon session, which was Bill Gates' appearance at the convention. Very popular was a resolution urging public meetings for the national commission on fiscal responsibility and reform and giving AFT an official position in favor of progressive effective tax policy instead of Social Security benefits cuts that are regressive. As I've written before, a number of people simultaneously want policies that would end in significant layoffs of teachers over 50 and also significantly reduce pension benefits and contributions to public-employee pensions. Evidently, there is some group of self-defined reformers who are in fear that somewhere, someone is enjoying a retirement free from fear of destitution.
The Gates appearance started at 4:15. From what a colleague told me later, he helicoptered over from his island estate. Randi Weingarten at first started speaking from the sheet announcing Innovation Fund awardees and then turned to introducing Gates. She took care to quote from Gates's annual letter at points where he specified opposition to solitary use of test scores to evaluate teachers and supported evaluation as a tool to help most teachers. With a smattering of boos, Weingarten smiled and said, "I thought you guys were leaving," referring to the threats of a boycott by the small dissenting caucus By Any Means Necessary (BAMN). The majority of delegates roared. Later, there were about 25 delegates out of several thousand present who walked out as Gates stood at the podium. So much for the huge boycott of Gates's speech...
Gates started by publicly congratulating AFT for the approval of the resolution on teacher evaluation/development and on steps taken thus far, including the AFT locals who are working with the Gates Foundation on specific programs. He mixed in some misleading statements about "declining" graduation rates (as opposed to stagnation) with some fair statements and a clear statement that teachers must be included in reform. He spent a few moments discussing the failed small-schools initiative. The greatest applause lines came when Gates criticized the existing record of poor administrators' evaluations and when he acknowledged that people who have never taught in a classroom do not understand how difficult teaching can be.
The BAMN protesters then had pretty awful timing, coming back towards the hall shouting protests ... just as Gates said some teachers have challenges with students who are bored or engage in disruptive behavior. The hall erupted in laughter at the irony.
Gates's weakest argument was the individual teacher equivalent of effective-schools rhetoric: see what teachers do when students demonstrate great achievement. It's a high-risk claim, to assert that the development of a teacher evaluation system can also document which a priori behaviors are best. What may be easier is the collection of videos of different teachers, with a broad enough sample that some will turn out to be great teachers. Gates also highlighted two project districts in AFT: Hillsborough, Florida, and Pittsburgh, Pennsylvania. As is common with description of risky projects in early days, the rhetoric was a bit breathless, and I could hear a few oohs and boos in the audience when he mentioned merit pay, Race to the Top, and tying tenure to student achievement.
Gates ended with the obligatory reference to Al Shanker and the need for teacher voice in reform. "Don't give it back, take the risk, and keep it up." "No other union is doing what you are to make this [reform] happen."
Additional thoughts a few days later: Gates got some personal mileage by appearing at AFT. He spoke with a few reporters afterwards, and his appearance generated some newspaper stories at the St. Pete Times and Washington Post that were more about the Gates Foundation than the AFT convention. At AFT, I don't think delegates had their minds changed much by Gates, since they were likely to be aware of what he's done and where he agrees and disagrees with them.
Gates's rhetoric is compartmentalized. In a good part of what he said, teachers were at the center of what he describes as reform, including teacher evaluation. But then the sore-thumb statement popped out about tying due-process protections to student test scores, unmediated by professional judgment. It's as if there's a switch inside his head, where he can talk either about test scores or about better evaluation of teacher practice. Reform rhetoric as a quantum effect? I don't know. But it's poor strategizing and a poor contribution to discussion. One of the wealthiest men in the world should be able to be more sophisticated.
June 19, 2010
What uses of test scores will pass legal muster in teacher evaluations?
Legal considerations on the use of test score derived stats in teacher evaluation: Scott Bauries started an interesting discussion June 2 of value-added measures and teacher evaluations from a legal perspective. It's very important to read the comment thread, as he's challenged on his conclusions by Bruce Baker and Preston Green, especially with regard to disparate-impact claims. Bauries claims that employers need to defend the procedural due process but are probably safer on the substance, regardless of the problems with value-added measures.
Reading the main entry and discussion, I lean strongly towards Bauries's conclusion, with one important caveat (below). My impression of the 2000 G. I. Forum v. Texas Education Agency case on the disparate impact of high-stakes graduation tests, which the state won, was that the plaintiffs were not prepared for the last burden-switching test on disparate impact. My rough impression of disparate-impact claims of illegal discrimination based on the Civil Rights Act: it's a series of penalty kicks/shots in soccer/hockey or maybe the games with alternating possession in overtime. I'm not a lawyer, and this is primarily based on my understanding of Title VI rather than Title VII law, but to the probably-inapt analogy: First, the plaintiffs try to demonstrate that a mechanism such as a test affected a property interest of the plaintiffs and had a disparate impact on one of the protected classes. If the plaintiffs succeed, the defendant tries to demonstrate that the mechanism meets an important interest, was properly constructed and applied, and members of the affected class had a fair chance at succeeding in the mechanism.
So far, we're describing lots of situations that have evolved in the past 25-30 years, especially with high stakes testing. Debra P. v. Turlington established the basic federal expectations in terms of student tests, and as a number of states created a new round of graduation tests in the 1990s, they relied on Debra P. v. Turlington as a guide to meeting the basic questions and getting to the final round all tied up. And this sort of makes sense if you think about the maturity of various mechanisms: you can argue that there is a rational state interest in a certain outcome (an adequate measure of achievement in the case of graduation requirements), and then satisfying the "fair chance at succeeding" is often a question of satisfying a set of criteria rather than perfection and that's often a reflection of the organization's experience and capacity.
The final test is whether there is a better option: could the defendant have feasibly chosen an alternative mechanism that satisfies the same interest with less impact. I've never read all of the materials in the G.I. Forum case, but the following is a key passage in Judge Prado's ruling:
The Plaintiffs were able to show that the policies are debated and debatable among learned people. The Plaintiffs demonstrated that the policies have had an initial and substantial adverse impact on minority students. The Plaintiffs demonstrated that the policies are not perfect. However, the Plaintiffs failed to prove that the policies are unconstitutional, that the adverse impact is avoidable or more significant than the concomitant positive impact, or that other approaches would meet the State's articulated legitimate goals. In the absence of such proof, the State must be allowed to design an educational system that it believes best meets the need of its citizens. (emphasis added)
In the end, the plaintiffs' lawyers in the Texas case were unable to provide a clear alternative to high-stakes testing that they could demonstrate was both feasible (i.e., wouldn't cost an arm and a leg) and would have a lower disparate impact. I'm not too worried about the state interest, since you can usually construct alternative mechanisms that have facial validity and that have roughly the same "noise" as whatever you're arguing against. And the not-an-arm-and-a-leg criterion is tougher to meet if you're arguing for portfolios, since it increases the cost... but it starts from a relatively low base of cost per-pupil. Ultimately, though, it is hard to argue that a prospective alternative would result in a lower disparate impact if it is only prospective and thus you have no evidence whether the protected class you're worrying about would be helped by the alternative.
So in the discussion over at EdJurist, Bauries's clinching argument is really that for all their flaws, value-added measures are going to look reasonable to a judge in that they try to adjust for incoming achievement of students and plaintiffs will have to put forward an alternative with concrete evidence that the alternative does a demonstrably better job at treating teachers fairly. The catch-22: without a working model of alternatives with that record, plaintiffs are going to be sunk on disparate-impact claims.
Bruce Baker has followed up on Bauries with a set of tongue-in-cheek impossible criteria to make the use of value-added measures reasonably fair. I understand the temptation, but he's onto one thing: ultimately, local K-12 unions will have to figure out how to respond. This will include whether they have separate evaluation procedures for the 20% of teachers for whom value-added measures are even possible, how to mix the data, and so forth.
And now for the caveat: a good part of the legal consequences of using student test scores for personnel decisions will depend on how stupid local administrators are in the first jurisdictions to use them, and the first that are challenged. I can imagine districts where administrators are careful to fire experienced teachers only where there is a record of several years of low statistical measures of student achievement and only where that is consistent with low marks in other areas, such as administrator and peer observations. I can also imagine districts where administrators purge teachers based on a single year's worth of data and with no checks of consistency with other sources of information. If the legal tests are in jurisdictions with the first set of practices, they're far more likely to pass muster than if the first cases are for terminations that don't meet a basic smell test of rationality.
June 9, 2010
Get your performance-pay evaluation report bingo cards here
So another few evaluation reports have been released with little evidence of student achievement flowing from performance-pay systems. This is going to sound like a broken record from me, but I don't make too much out of one or two studies in policy research. These studies on systems in Chicago and New York confirm something any historian (or anyone who's read education historians) could have predicted: even if there is some benefit from changing a pay system, it's a darned hard thing to try. This is one of the reasons why I dislike the boutique, closed evaluation tradition in education research: every evaluation collects data, walls it off, and then presents (only) conclusions to the public. When there are millions of dollars being spent through the Teacher Incentive Fund in addition to privately-funded efforts (or any program with an interesting but untested theory of action), there have to be data archives so that other researchers (those not on the original evaluation team) can conduct secondary analyses.
But having put forward these caveats, I'm going to guess that most studies of performance pay are going to show negligible effects on test scores. This may be my inner cynic (okay, not very inner), but the long-term questions on performance-pay policies revolve less around whether the results are consistent with the theory of action proponents hold and more around whether the politics demand something regardless of effects and what is workable from a variety of standpoints.
June 4, 2010
More on so-called "side deals"
Andy Rotherham has responded to my blog entry early this morning. Let me skip for now the question of why he was the sole person quoted in the article and address the local MOUs in Florida on Race to the Top. Rotherham wrote in part, "If these agreements have no bearing on the state's application or implementation then why go through the laborious exercise of crafting them[?]" I wasn't in the room for any of these, but having observed Florida schools for almost 15 years, I can imagine a number of reasons, including distrust by some party in a county in the judgment of FEA President Andy Ford and other participants in the task force about the clause excluding non-mandatory subjects of bargaining from impasse. I said as much in my prior post. I'm not a labor lawyer, and neither is Andy Rotherham, but I do know something about the dynamics within FEA, where there is often a healthy internal debate. The argument that local MOUs are an inherent evasion of the grant is something that requires examination of the actual language at issue.
Now, to my comment about Rotherham's being used as the sole source for Wednesday's story in the St. Petersburg Times. Why did it seem curious to me? Partly it's a matter of sensitivity to these issues on a number of fronts. For more than a year, Rick Hess has been pointing out the potential for all sorts of perception problems with a competitive process that's the result of (enormous) discretionary authority. In April, Liam Goldrick noted that the New Teacher Project was simultaneously advising several states on RTTT and then commenting on the process (something he thought was unwise from an organizational standpoint).
So when I read a story with a single source commenting on the issue, where the source may have had business interests at stake and where the disclosure of that in the story was vague, and then the term "some say" with only that source as documentation, it looked odd. It was a good journalist quoting someone who is open about disclosure in every direct piece of writing of his I've read. And it wasn't the larger point. But I don't remember anyone correcting the impression earlier in the spring that TNTP had been a disinterested observer. I could also have thought of a number of reasons why there weren't more people quoted or more disclosure about Bellwether (Rotherham's firm): maybe Matus got the information late in the day and couldn't reach more than Rotherham; maybe the material was in the submitted story and an editor chopped out additional disclosure; maybe Rotherham didn't have access to the text of the local MOUs or didn't have a copy of the state MOU and was relying on Matus's over-the-phone description of "hey, this looks like it could be different, especially in Hernando." But on the other hand (I think we're on my fourth hand here), the failure to correct the stuff on TNTP is a lapse for education journalism more generally, and it has to stop. So I decided to note what I had observed, call it minor compared with the other issues, and go on. If it looks like I'm being hard on the Times, it's because this local newspaper is one of the top papers in the country on education, and I think I can expect great reporting. But this is a minor error, I meant the observation as such, and I explicitly said so. If I were going to point out that the alleged transparency problem with Florida's application is a distraction until better researched, maybe I need to be consistent and explicitly say my concerns along parallel lines were less important, nu?
My central point was that absent some more solid analysis of what the local MOUs actually meant and whether they conflicted with the application packet, the larger issues were not about procedure but the sustainability of whatever happened with RTTT, assuming it was beneficial. Here's what I wrote this morning:
It's a legitimate question to ask what the right balance is in collective bargaining on the scope of bargaining, on the relative power of the parties, and on state law that can essentially dictate terms and conditions of employment outside bargaining.... It's also a legitimate question to ask about the commitment of parties to reform after the money runs out.... Those issues are still out there, and they're out there whether or not a particular state has an MOU like Florida's.
And here's what Andy Rotherham wrote:
[T]here are two big outstanding questions on RTT that we won't know the answers to for several years: First, how durable will the policy changes be? Will states relax things when the money is gone and/or will "loser" states undo the reforms they put in place in an effort to win? Prize theory is built on the idea that the progress generated in an effort to win is built upon. That idea has not been fully tested yet in the public/political sphere.
Surprise: we agree on the importance of that question! No, it's not really a surprise. It's just that a lot of electrons have lost their lives this week in what thus far looks like a nonstory.
Correcting the facts on so-called RTTT "side deals"
In Wednesday's paper, St. Petersburg Times reporter Ron Matus relied on the sloppy language "some say" to spin a mountain out of a molehill about county-specific MOUs between school boards and local FEA affiliates. In the article as well as two blog entries Wednesday and Thursday, Matus stated that there were a number of counties with local memoranda of understanding (or MOUs) and quoted one individual who said that the existence of what Matus called "side deals" might be a problem for Florida's application. Matus stated that his source Andy Rotherham had helped other states with RTTT applications, but the article did not state whether that was in the context of consultancy contracts (i.e., whether Rotherham's new organization had its reputation and business at stake in competition with Florida's RTTT application).
The omission of any mention of Rotherham's business concern is minor compared with the failure of the Times to look at the content of the side deals and see whether they modified the obligations of local parties vis-a-vis the state Memorandum of Understanding that most districts and unions signed across Florida. Since Ed Week has gotten into the story, albeit without quoting Rotherham, it's important to look at the facts.
First, the issue of impasse. Language from the state MOU (part of the RTTT application):
Only the elements of this MOU which are contained in existing law are subject to the provisions of section 447.403, Florida Statutes. (p. 3)
Explanation: F.S. 447.403 is the part of Florida's public-employee collective-bargaining law that covers impasse. In other words, if there's a part of the MOU that is not already a term and condition of employment under Florida law, it's not susceptible to the impasse procedure. That point is clarified in the attachment the Times education blog noted was part of many counties' documentation.
The Broward side agreement in its entirety:
Any items relating to the RTTT Application or Plan that are unsuccessfully negotiated between the parties specifically for the purpose of applying for or receiving the RTTT grant award will not be subject to the impasse procedures set forth in Chapter 447.
Can someone explain to me why this is any different from the state MOU? But there's a second issue that Ed Week's Michele McNeil discussed: "these side deals also say that any changes successfully negotiated because of Race to the Top will expire once the funding does," and refers to the Hernando County MOU. But on p. 4 of the state MOU, Part IV explicitly states that the state MOU expires "upon the expiration of the grant project period, or upon mutual agreement of the parties, whichever occurs first."
The question one might logically ask is why some counties and locals felt they needed extra language. The FEA had a long weekend discussion with state leaders and local leaders about the state RTTT application this time around, and from news coverage it looks like FEA President Andy Ford was strongly encouraging locals to sign on. The reason why was pretty clear: he had had a seat at the table in the task force Charlie Crist set up in the week after Crist vetoed SB 6. When you've had a hand in crafting changes, you've got a stake in success. In addition, the additional language had taken care of one of the legal concerns of FEA bargaining-support staff, because the MOU from the first application in Florida looked like it might give school boards the ability to impose contracts on matters beyond what is currently in state bargaining law. Unlike in many northern states, Florida school boards have the authority to impose contract terms under impasse for the duration of a fiscal year, but only on mandatory terms and conditions of employment as defined in Florida law.
In January, FEA had cautioned locals not to sign the MOU, and it crafted language for the few locals who wanted to sign (including one large county, Hillsborough). The language FEA crafted for the locals in the first RTTT round? It specifically exempted issues from impasse when the issues were not already in state law. (I don't have the exact wording in front of me, but I am sure an intrepid reporter could ask the Hillsborough press rep for it.) In that case, it's clear that the local MOUs created legal conditions different from what would have been the case with a signed state MOU and no local MOU. So when similar language appeared in the state language, why did some local teachers unions sign essentially redundant local MOUs? Let's just say a generous level of suspicion about the process.
The greatest problem with this coverage of the county-specific MOUs is that it's a distraction from serious issues of reform implementation with RTTT. The issues Matus and McNeil have raised in the context of local MOUs exist with the state MOU. But instead of focusing on the substance, the reporters are focusing on the process issue. It's a legitimate question to ask what the right balance is in collective bargaining on the scope of bargaining, on the relative power of the parties, and on state law that can essentially dictate terms and conditions of employment outside bargaining. In a state like New York, bargaining authority leans more towards unions than in Florida, and likewise state law. Northern states are the ones to have seniority-preference laws that trump the bargaining process, and Florida has had several statutes trying to mandate all sorts of things unions would be very unlikely to agree to in local collective bargaining.
It's also a legitimate question to ask about the commitment of parties to reform after the money runs out. That is one of the critical questions with the DC teachers contract: what happens if/when the billionaires pull out? The billionaires' support of DC along with RTTT present a theory of action all about inertia: if we can just budge districts away from current practices, we'll accomplish long-term structural changes. In contrast, Denver's ProComp was in the context of a permanent funding stream and a political deal with voters: give us the money permanently, and there is a permanent change in compensation practices.
Those issues are still out there, and they're out there whether or not a particular state has an MOU like Florida's. I understand why this reporting on process exists, especially in a rush to print news, but am disappointed that two good reporters have perseverated on an apparent process issue without checking the details of their assumptions (i.e., whether the local so-called "side deals" are substantively different from the state MOU).
Disclosure: I am a former member of the FEA governing board. (I have not corresponded with elected FEA leaders about the reporting on this story, but I want to be open about my former position within my state affiliate.)
Update (9 am EDT): After I posted this earlier in the morning, Valerie Strauss published an entry on the topic in the Washington Post blog she writes, largely repeating what the Times had said. I also corresponded in the last hour with one reporter interested in the story, and one of the empirical questions is whether the local MOUs in Florida are more like Broward (short and redundant) or more like Hernando (which was much longer and with elements McNeil discussed in her blog entry yesterday afternoon). There's also a broader question about state administrative authority. Suppose a superintendent of a district in any state receiving RTTT funds decides she or he isn't going to follow one of the requirements. She or he just didn't put it in writing. Does the state's obligation to eliminate participation and cut off funding for that district change? As I said earlier this morning, the broader and more interesting questions are not really about local MOUs.
May 2, 2010
The theater of basing a majority of evaluation on test scores
Now that SB 6 is dead, a governor's task force on RTTT has come to a compromise in a single day, and there looks to be some direction for teacher evaluation in Florida that's acceptable to Florida's K-12 teachers unions, it's time to take stock of the rhetorical stance SB 6 supporters had that a "majority" of a teacher's evaluation had to depend on student test scores. I've seen this pop up in other states, so it's a common rhetorical stance. Let's get a few things off the table first: this is not based on any research, and the supporters have no clearer idea of what "majority of a teacher's evaluation" might mean than supporters of the "65% solution" had of what spending money in a classroom meant. For that matter, neither did I as a skeptic (about either proposal).
So the "majority on test scores" stance is political, then. That's fine as a minimal statement; almost all decisions about pay structures are political in a broad sense rather than based on research, and to some extent they're reactive. Teacher pay scales became standardized to protect bureaucratic structures from (and sometimes in response to) accusations of corruption, and the single salary schedule is a response historically to gross pay inequity.
I'll go further: I don't think there's a way to avoid political values embedded in pay structures. Once you involve public money and a service most people connect with citizenship (education), you've got politics, however well structured and justified by reference to neutral statements of organizational need. On that level, performance pay is justifiable from the sense of satisfying public perceptions about how teachers should be paid. That was explicit in Denver's ProComp plan: the voters approved higher taxes in return for a performance-pay structure.
The problem with the "majority based on test score" position is twofold. One is the obvious one: it's divisive, and many parents and other community members are offended by the idea. Here, Diane Ravitch spoke for millions when she criticized SB 6. But there's another problem: it obscures the evaluation process rather than clarifying it. By reference to an implied point-based system, it fails to focus on what matters in a teacher evaluation system in terms of either an algorithm or underlying concepts.
I've written a bit about point-based systems, and because the focus of my paper was elsewhere, I didn't have a chance to talk about the limit of point-based scoring systems: it matters not where you can earn points but where you might lose points. I learned this in high school when I was a debater: individual raters have an implied comfortable range for scores, and it's the range of scores that matters, not the total number of points available in different categories. If raters have different effective ceilings as well as ranges (i.e., it is impossible for people to earn perfect scores with some raters, while others commonly hand out full marks), then the raters with the largest ranges of scores exert more power over final results than raters who have a very narrow range.
Similarly, components of any point-based system will have differential impact on final results when they have broader ranges in practice, regardless of the proportion of the scale that derives from individual components. Imagine a teacher evaluation system with 100 points. Suppose 60 points come from student test scores, and the range is restricted for most teachers to between 52 and 60 points. On the other hand, suppose 30 points in this hypothetical evaluation come from direct observation, and the range of scores is between 10 and 30 (and more than a handful of teachers may earn the low score). Which component has the greatest influence on final results? It's the 30-point direct-observation component in this thought experiment, because in this hypothetical example teachers can lose up to 20 points there but only 8 through student test scores, more than twice as many.
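The arithmetic in that thought experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout and variable names are made up for the example, the remaining 10 points of the 100-point scale are ignored because the thought experiment says nothing about them, and "share of the total spread" is a crude stand-in for influence (a real analysis would look at the actual distribution of scores, not just the endpoints).

```python
# Hypothetical 100-point evaluation from the thought experiment.
# Each entry: (nominal maximum, lowest score seen in practice, highest score seen in practice).
# The other 10 points are unspecified in the example and are left out (an assumption).
components = {
    "student test scores": (60, 52, 60),
    "direct observation": (30, 10, 30),
}

TOTAL_POINTS = 100

# Nominal weight: the share of the scale each component appears to control.
nominal_share = {
    name: max_pts / TOTAL_POINTS for name, (max_pts, _, _) in components.items()
}

# Spread in practice: how many points a teacher can actually lose in each component.
spreads = {name: high - low for name, (_, low, high) in components.items()}

# Effective weight (rough proxy): each component's share of the total spread.
total_spread = sum(spreads.values())
effective_share = {name: spread / total_spread for name, spread in spreads.items()}

for name in components:
    print(
        f"{name}: nominal {nominal_share[name]:.0%}, "
        f"points at risk {spreads[name]}, "
        f"share of spread {effective_share[name]:.0%}"
    )
```

On these made-up numbers, the component that is nominally 60 percent of the evaluation accounts for 8 of the 28 points actually in play, and the nominally 30 percent component accounts for the other 20.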
But the "majority of evaluation" rhetoric does more than obscure the real power in point-based systems: it obscures the question of what teachers are responsible for. "Outcomes!" says the supporter. Right, I say: that doesn't say a darned thing about the types of outcomes that will make the difference in evaluation. In Florida, Louisiana, and other states where people have pushed a majority-from-test-scores approach, the push has been to create a mandate and defer the implementation to a regulatory process. That's a nice illusionist's trick if you can get away with it, but the process of implementation always mediates absolutist mandates, and then the legislature is giving up control over what mediates the test scores.
There are three ways I can see that test scores' impact on evaluations would be mediated in any system (and yes, I'm including SB 6 here): ad hoc (i.e., caprice), by reference to student disadvantage (i.e., blame-shifting), or by reference to teacher behaviors in classrooms (i.e., standards of practice). Without any legislative guidance, ad hoc and capricious mediation is likely (probably by the temperament and philosophy of the administrator with the greatest authority over evaluation). More destructive than ad hoc mediation would be blame-shifting: a teacher would be held blameless if someone else/something else (poverty, language, presumed parental neglect, etc.) can be blamed instead. Bad, bad idea.
Of the three options that come to mind tonight, mediating test scores by professional standards of practice seems the most productive. But then that raises the central question: if the use of test scores is inevitably subject to mediation, and the best choice for that mediation is through professional standards of practice, why not base evaluation on professional standards of practice to begin with--for example, to let an evaluation that documents effective practice create the rebuttable presumption of effectiveness?
The answer here is two-fold: one is that there are no agreed-upon standards of practice for teaching more generally, other than by crude and obvious standards (don't beat your students) or by reference to effects (keep your students' attention). The other explanation is that even if there were agreed-upon standards of practice, the process would be sufficiently messy as to irritate the sensibilities of those who advocate the putatively cleaner "majority from test score" approach.
The result is that instead of getting a messy but constructive system based on developing standards of practice, any such system that putatively bases the majority of a teacher's evaluation on test scores is going to get ad hoc or blame-shifting mediation through the back door.
Update: Linda Perlstein noticed the 50% rhetoric and should get credit for the pattern recognition. Consultants' advice? Hmmn... looks like an interlocking-directorate phenomenon (no conspiracy needed).
April 22, 2010
Dorn reviews Ravitch
My review of Diane Ravitch's new book is now up at the Education Review website. I should have finished it a few weeks ago, but the fragmentation of my time this spring has interrupted all sorts of usually-short-term projects, such as book reviews.
If there is one benefit to the delay, it was my ability to watch the sales keep racking up while the book climbed several bestseller lists. At one level, I think, "I wish my book on the topic had sold a tenth as many copies!" But that's silly; I'm glad someone was able to meet the clear need for this book in a way that's been rewarded.
Bottom line of the review: read the book. In writing the review, I made the choice to skip much of the contemporary discussions around the book and focus on Ravitch's historical arguments. As usual (with Ravitch), she writes a highly appealing argument, and it's important to look at the claims dispassionately. I should say that I dearly wish she were correct in her claim that Lynne Cheney's attack on the voluntary national history standards in the 1990s was a primary cause of mediocre curriculum standards and our current policy obsession with high stakes testing. At the time (as a new scholar in the field) I was very upset with Cheney's distortions of the record, and at one level it is attractive to see her in the villain's role. But I think it's more complicated.
April 15, 2010
Misinterpretations of Crist's veto, and where to go next
I suspect that a number of observers will spin Charlie Crist's veto of Senate Bill 6 to the point where the representation doesn't come close to reality. By a quirk of timing, I was in Tallahassee today talking with legislators and staffers in the morning. In other words, I was at Ground Veto. Yep: I came, and Charlie caved. No, that would be a post hoc fallacy, even if his veto message used the same word (overreach) that I used to describe the bill. Wait: he used a hyphen (over-reach). Or maybe I don't own the term, and the idea had been floating around the state for the last few weeks, including in newspaper editorials, and it was one of the options available for a governor vetoing the bill. So I can't claim credit as being the person who killed the bill, though I was one of thousands who contacted Crist in the last week.
In the meantime people are spinning this as the Event that Destroyed Florida Education, or the Victory of the Union(s), or the Resuscitation of Crist's Senate Campaign. Maybe one or all of those labels is true, but I doubt more than one is. (To calculate the probabilities, we need to use quantum spin dynamics, a new field that melds political science with nuclear physics.) Whoa, friends, and maybe you should take a step back. Here are the reasons why Crist vetoed the bill:
- Thousands of Floridians from both major parties contacted Crist to urge a veto.
- His sisters who teach probably told him they hated the bill.
- The Republican legislators and former Governor Bush who were pushing the bill had largely sided against him in the primary against Marco Rubio.
- Crist prefers consensual processes.
Crist's veto kills this particular bill, in this form. It does not signal a victory of teachers unions over performance pay, and it does not mean that the Florida Education Association will oppose either performance pay or alterations in the process leading to due-process protections. In fact, if you're on Facebook and "friends" with Andy Ford (he's a nice guy, and the ironic quotation marks are about FB, not Andy), go ahead and see what anti-SB 6 groups he joined... and which he didn't. If you're a reporter, go ahead and talk with Commissioner Smith and ask him to repeat the first thing Ford said at discussions about Race to the Top.
Where do we go from here? It depends largely on whether the FEA executive cabinet will support Andy Ford in negotiating with other stakeholders and politicians, on what the administrator and school board associations push for, and whether the business groups or the Republican sponsors of SB 6 are willing to negotiate in good faith. Here are some obvious questions that don't correspond with any hypothesized litmus tests:
- Can the key parties agree that a performance-pay framework can exist?
- Can the parties agree that a performance-pay framework cannot force budget cuts to current operations?
- Can the parties agree on a performance-pay framework that addresses student outcomes on a "pass a smell test" basis but does not depend on blue-sky assumptions about assessment for students with disabilities, English language learners, and every subject in the curriculum?
- Can the parties agree that teachers should not automatically receive continuing-contract status (with due process protections) without a more serious evaluation than usually exists (i.e., by default after three years regardless of the scope of evaluation)?
- Can the parties agree on the scope of personnel contracts that can be negotiated at the local level?
- Can the parties agree on what due process protections are workable for experienced teachers who have demonstrated effectiveness in the classroom?
- Can the parties agree on what must be part of teacher evaluations and the range of options for those evaluations?
- Can the parties agree on what constitutes a proof of concept for their pet ideas?
Disclosure: I am a 14-year member of the United Faculty of Florida and thus a member of FEA. I am firmly convinced that if you are a Florida teacher and want a future with no performance pay, and if you somehow persuade your local and state leaders to agree with you, you will be at the policy table... as the meal. I am equally convinced that if you are Jeb Bush or one of his close friends and want a future with no job security for teachers beyond a single year, you will succeed... in turning a great number of people who would otherwise agree with you into political enemies. And if you think that there can either be a future in state education policy with no high-stakes tests or a future in state education policy where there is a quantified high-stakes test for every subject and grade level... well, I'm not legally licensed to give my opinion of that response.
In other words, many of the questions above have yes as an answer, but only if people who would otherwise hold extreme positions are willing to work on problems rather than positions.
April 1, 2010
Hilda Turner and why teachers are skeptical of John Thrasher's motives
In Tampa, there is a five-year-old elementary school named after the late Hilda Turner. The students attending Turner Elementary may not know why it's named after her, or who she was. Most legislators in the capitol probably don't know about her case against the all-white Hillsborough school board in the early 1940s and why the long history of politicized teacher evaluations gives Florida teachers reasons to believe that Senator John Thrasher's bill is an attack on them.
But my friend and colleague Barbara Shircliffe knows, and she reminded me of the case today. She published a history of Tampa's desegregation case a few years ago (The Best of That World), and she's currently researching the history of teacher desegregation in the South. In the early 1940s, teachers across the South faced a split between what the federal courts had decreed and what the reality on the ground was. In 1940, Melvin Alston had won a lawsuit against the Norfolk, Virginia, schools for having separate salary schedules for white and black teachers, because the (federal 4th Circuit) court had ruled that unequal salaries were wrong. (In the decision linked above is the salary schedule that shows high school teachers were paid more than elementary teachers, men in high schools were paid more than women teaching in high school, and white teachers were paid more than black teachers.)
But most school systems didn't change anything until they were sued, and it took quite a spine for a teacher to take on her or his employer. Maybe the teaching shortage of WW2 made a difference. Certainly the fact that black soldiers were bleeding for their country played a role in growing militance (including the "Double V" campaign of the Pittsburgh Courier). Or maybe this sham of an evaluation for Hilda Turner in 1942 kicked her into action (Turner v. Board of Public Instruction, reference exhibit 3). The case quickly became messy and ugly, and I'm going to leave the story of that for my colleague's next book. But this wasn't isolated. Black teachers in Florida were treated unfairly and unequally for decades, often by their white colleagues. It probably wasn't until the mid- and late-1960s that teachers of all races in Florida started working together to address teaching conditions in the schools.
Nor were the types of spurious judgments in that evaluation uncommon. The fact that an annual evaluation was one of the lawsuit exhibits may be a legal quirk (since it was damning evidence of how the system treated black educators). But it also illustrates the controlling way that systems treated all teachers, and that continued for decades. In the 1950s and 1960s, they were subject to attacks by the state's anticommunist legislative committee, run by Charley Johns, which eventually turned to outing gay teachers. (If I remember correctly, current U.S. Rep. Bill Young was a member of that committee when he was a state legislator starting out in politics.) Teachers in general were attacked in 1968 for striking, but gay teachers were the target of another attack in the 1970s by Anita Bryant. In the following decade the state imposed a generic evaluation instrument (the Florida Performance Measurement System), designed before the recognition that there was subject-specific expertise in teaching. And all of that came before the Sunshine State Standards in the mid-1990s, Jeb Bush's A+ accountability program, vouchers, No Child Left Behind, the Bush Recession of 2008, and finally John Thrasher's bill. I can point to a number of events or policies that supported teachers, but the background has always been a recent history of blaming and judging teachers.
Because there has never been a sufficiently well-grounded system of teacher evaluation, the experience of teachers on the ground has been ineffective, useless evaluations... or worse. And what teachers see in Senator Thrasher's bill is the "worse" category. Combined with the elimination of tenure (a topic for another entry), the mandate of a formulaic approach to teacher evaluation is too much for many teachers to swallow. This is not the result of hyperbole on the part of the Florida Education Association. This is the result of Florida's history of education.
(For more on the local context of Turner's actions, see Doris Weatherford's history of women in Tampa, pp. 287-288.)
March 30, 2010
Race to the Top winners and losers
So officially, Delaware and Tennessee won (note, Andy Smarick: I spelled both states and your name correctly). But in the side competition (including brackets and sidebar bets), who won and lost?
Those who predicted political decision-making were wrong. I know Mike Petrilli has wondered if politics has intervened in the reviewing process (and thought the secrecy of reviewer identity was political suicide). When New York, Ohio, and Illinois are frozen out, it's hard to spin the choice of Delaware and Tennessee as political (though Petrilli takes a half-hearted stab at it). Addendum: Rick Hess takes a firmer stab at it, though I think you could take any possible RttT awardee list and fabricate a post hoc "this was all politics" explanation.
Those who predicted a "low bar" in getting money were wrong. In the end, when Arne Duncan said USDOE would give the money to a small number of states, he meant it.
Those who predicted "reforminess" as the secret criterion were wrong. All the cool kids were assuming Florida and Louisiana would win because, well, they're the fair-haired boys this year. Wrong! While stakeholder buy-in (or the lack thereof by Florida's unions) was part of the reason for Florida's fourth-place finish, there were other ways Florida's application lost points, and Michelle Rhee's application for DC fell at the bottom of the Tweet 16.
Here's who won in the side competition: the reviewers. At least at first reading, the reviewers' comments on Florida's application were serious in comparing the application to the scoring guidelines. I'm sure you can quibble with scores here and there, but I think any sane journal editor might be tempted to kill to have this quality of effort from manuscript referees.
Especially in Florida, there's a great deal of second-guessing and spinning after the announcement of results. I'm tempted to pitch in, but I'll decline, at least for today.
March 25, 2010
In better news, bipartisan bill passes Florida Senate reforming high school testing
In addition to Senate Bill 6, the Senate also passed an amended form of Senate Bill 4, which moves the state's high school testing program away from comprehensive exams in 10th and 11th grade and towards end-of-course (EOC) exams. Senators from both parties finally "get it" that the so-called comprehensive science exam was counterproductive, and a well-implemented EOC exam system is significantly better than the one-size-fits-none eleventh-grade test. But that doesn't mean the bill is perfect: FSU physics professor Paul Cottle has been diligent in explaining his concerns with dilatory clauses placed in the bill that eliminate any deadlines for physical-science exams.
It's important to keep in mind that only part of the purpose of these exams is to encourage students to go into STEM fields, though it's important to raise the floor of science courses students take in part to reduce inequalities in access to lab-based courses. The reason for pushing all students to take more math and science courses is that they are going to be adults when they leave, citizens who vote on issues where they should be informed. I want elementary-school teachers to have stronger math and science backgrounds, and so should you. I'd like someone in charge of a venture fund or pension fund to be able to recognize fraudulent science claims without wasting other people's money. And when my oldest nephew finishes his graduate program in astrophysics, I want a ready source of educated readers willing to listen to him talk about microwave interferometers and the early universe.
Okay, maybe the last isn't a public purpose. But the rest is. We all benefit when high school students have a well-rounded academic education not only in "skills" such as reading and arithmetic but in history, literature, math, and science, and moving from the FCAT to EOC exams is the right step.
Florida Senate overreaches on changes to regulation of teaching
Yesterday, the Florida Senate voted for Senate Bill 6, which would dramatically change the structure of teacher evaluation, contracts, pay, and licensure in the state. A few amendments were approved on the floor of the senate, but only three appear substantive, and the largest changes happened in committee, in part to address concerns about constitutionality for the initial bill.
Like the Washington Post's Valerie Strauss, most observers have focused on the evaluation, pay, and contract issues, and that's because the intent of the bill is to eliminate any form of tenure, to reorient evaluation around student test scores, and to eliminate the ability of school boards to pay teachers in part based on experience. For a variety of reasons, legislation such as SB 6 is policy overreaching, and as it has in several other ways in the past decade, Florida has gone far beyond any other state in education policy. In part because it is so hostile towards the Florida Education Association, I suspect that some observers will praise the senate even if this turns out to be horrid policy. That way lies Thrasymachus, and it's not pretty.
SB 6 is overreaching. Instead of reducing the protections of tenure, it eliminates all meaningful due process related to job security. Instead of mandating that student outcome data be a part of teacher evaluation, it requires that test scores form the majority of any teacher evaluation system. Instead of moderating the influence of job experience on pay, it completely prohibits any such factor being used.
As a result of this overreaching, school boards are going to be motivated to work with teachers unions on workarounds for most of these issues. For each area where school boards and union locals agree the state has gone too far, they'll figure out another way to provide for some job security, to moderate the effect of test scores on evaluations, or to create a legally defensible proxy for experience in salary structures and call it performance-based pay. It took me about 10 minutes to come up with a few mechanisms for these issues, and I'm not nearly as clever as highly-motivated union officials and superintendents. But as a result, you're going to see highly variable treatment of teachers across the state, which I don't think is the intent of legislators.
There is only one area where the state has an undisputed right to regulate teaching, either in Florida or elsewhere, and that's in licensure. Regardless of what happens in collective bargaining at the local level, any state can decide who has the right to be licensed as a teacher, and at least at first, the part of SB 6 that is least amenable to mediating influences is in the requirement that teachers demonstrate effectiveness to have their professional certifications renewed. Does that mean that it will be tied closely to test scores? That's what I fear. While there's a substantial academic literature on the problems with using either test scores or growth measures, Daniel Willingham's video remains the clearest short explanation for a lay audience. But I'm sure there's going to be lots of testosterone-laced talk about getting tough on teachers, at least until the State Board of Education has to decide what proportion of experienced teachers it's going to non-renew licenses for... and wait for things like lawsuits and backlash from parents and districts.
I expect that I might find a few additional nuggets of unworkable details in the bill, but that's the big picture. If the Florida House passes SB 6 without substantial changes, there's going to be a great deal of turmoil in schools over the next few years, and until the questions raised by the bill are settled about local bargaining authority and the use of test scores in teacher evaluation, there's going to be a substantial cost of the bill in terms of instability.
March 23, 2010
The sugar-daddy amendment to SB 6
Note (March 25, 2010): This entry was written on March 23, before the Senate adopted the Thrasher/Crist amendment. For my thoughts about the version that passed the senate on March 24, see my entry describing it as overreaching.
Among the amendments to Florida Senate Bill 6 filed today is a short amendment sponsored by John Thrasher (Jacksonville) and Victor Crist (Tampa) to address a concern I raised Saturday (and I assume others have also raised): As originally filed and then approved by state senate committees, Senate Bill 6 would essentially punish the Hillsborough (Tampa) school system for having won a Gates Foundation grant because carving out a portion of teacher evaluation for trained observers would reduce the amount accounted for by student outcomes below the statutory minimum in the bill.
So along comes the bill with a possible solution to this individual problem: a school district can apply to the State Board of Education for an exemption if it's constructed in various ways that match Hillsborough's situation... including the first requirement: "Any school district that received a grant of at least $75 million from a private foundation for the purpose of improving the effectiveness of teachers within the school district may seek an annual exemption..."
In other words, only Hillsborough need apply. If you've got a sugar daddy, you're eligible for the exemption. If you don't, even if you're a school system willing to invest your own money in a similar system meeting all the other requirements, you can kiss any exemption goodbye.
March 20, 2010
Would Florida SB 6 criminalize Gates grant to Hillsborough schools?
Note (March 25, 2010): This entry was written on March 20, about an earlier version of Senate Bill 6. Early this week, the bill was modified to allow Hillsborough to seek an exemption; the amendment was crafted so that no other district could apply, even if they replicated Hillsborough's efforts using local funding. For my thoughts about the version that passed the senate on March 24, see my entry describing it as overreaching.
In the past year, supporters of using student test scores to help evaluate teachers have expressed incredulity when some teachers union officials have been opposed to those moves in states such as California. "We're not even talking about having test scores dominate all evaluation!" has been the tone of such comments, "but student achievement should be one of the important factors."
Whether or not you agree with that position, it's intellectually defensible. This month, though, I suspect DFER members and Obama administration officials are going to do their best to avoid writing or speaking about Florida Senate Bill 6, which takes the approach that student test scores should be an absolute criterion for continuing professional licensure, and undefined "learning gains" should "comprise more than 50 percent of the determination of the classroom teacher's performance" (ll. 1197-1198 of the 3/19/10 version), no matter what subject the teacher is responsible for and whether anything like a value-added measure is technically feasible.
This majority-of-evaluation position is essentially what the state department of education wanted districts and locals to sign off on for Race to the Top, and Commissioner Smith's public support of Senate Bill 6's approach is inconsistent with his earlier claims in December and early January that the department would be flexible about how districts and unions could implement the RTTT MOU. As the head of the Florida superintendents association wrote in a letter to the commissioner, "you and your staff have emphasized flexibility in implementing these elements" (Bill Montford to Eric Smith, January 8, 2010).
In fact, Senate Bill 6 is less flexible than the text of the Memorandum of Understanding on the use of student outcome data for teacher evaluation. Here is the relevant MOU paragraph:
(D)(2)(ii)(1). Utilizes the Department-selected teacher-level student growth measure cited in (D)(2)(i) as the primary factor of the teacher and principal evaluation system. Primary is defined as greater than 50% of the evaluation. However, an LEA that completed renegotiation of its collective bargaining agreement between July 1, 2009, and December 1, 2009, for the purpose of determining a weight for student growth as the primary component of its teacher and principal evaluations, is eligible for this grant as long as the student growth component is at least 40% and is greater than any other single component of the evaluation.
The second sentence beginning with However appears to be framed specifically to allow Hillsborough County to participate; Hillsborough and its teachers union won one of the Gates Foundation multimillion-dollar grants in the fall, and one of the provisions of the grant is to construct teacher evaluation around three components: student data, an administrative review, and observations from a trained classroom instruction evaluator (the last part of the Gates initiative to develop such evaluation expertise). And in the January letter noted above, Montford wrote that all districts should be able to do what Hillsborough and its union had agreed to for the Gates grant.
So what happens if Senate Bill 6 passes? Well, there goes any value of the Gates award in Hillsborough; the arrangement in Hillsborough would violate the law because less than 50% of the teacher evaluation structure will use student outcomes. Is this really what DFER and the Obama administration want? Teachers union and district take a risky step in a joint commitment; state punishes district.
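To make the mismatch concrete, here is a minimal sketch of the two rules as I read them from the MOU paragraph and the SB 6 language quoted above. The function names and the 40/30/30 weights are my own illustrative assumptions, not figures from the bill, the MOU, or the Gates grant; the only thing taken from the text is the structure (more than 50%, or at least 40% and larger than any other single component for a district that renegotiated in the stated window).

```python
def meets_mou(weights, growth_key="student_growth", renegotiated_in_window=False):
    """Check evaluation weights against MOU paragraph (D)(2)(ii)(1) as quoted above.

    weights: dict mapping component name -> share of the evaluation (0.0 to 1.0).
    renegotiated_in_window: True if the district completed renegotiation of its
    contract between July 1 and December 1, 2009 (the "However" sentence).
    """
    growth = weights[growth_key]
    if growth > 0.50:
        return True  # student growth is the "primary" factor
    if renegotiated_in_window:
        others = [w for name, w in weights.items() if name != growth_key]
        # Exception: at least 40% and greater than any other single component.
        return growth >= 0.40 and all(growth > w for w in others)
    return False


def meets_sb6(weights, growth_key="student_growth"):
    """SB 6 as quoted: learning gains must be more than 50%, with no exception."""
    return weights[growth_key] > 0.50


# Hypothetical three-part structure (student data, administrative review,
# trained observer). The specific shares are assumed for illustration only.
hypothetical_district = {
    "student_growth": 0.40,
    "administrative_review": 0.30,
    "trained_observer": 0.30,
}

print("MOU exception:", meets_mou(hypothetical_district, renegotiated_in_window=True))
print("SB 6:", meets_sb6(hypothetical_district))
```

Under those assumed weights, the same district passes the MOU's exception and fails SB 6, which is the whole problem in two function calls.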
Keep in mind that SB 6 is a moving target: on Thursday, a state senate committee changed the bill to eliminate constitutionally-dubious provisions in the original that would have forced local school districts to raise taxes if they didn't do what the bill required and that would tie half of teacher pay to test scores. And thus far there is no House companion. But the teacher-evaluation and licensure components of SB 6 are based on a fantasy of assessment data and state authority that is unrealistic and is a slap in the face of administrators and teachers who are working at the ground level to develop better teacher evaluation systems.
I can't expect Commissioner Smith to acknowledge openly that his public support of SB 6 is a political calculation that he has no choice if he wants to keep his job. His capitulation is sad, since I like Smith and he's done a considerable amount of work in the background to educate members of the state Board of Education and legislators. But those outside Florida are free to criticize overreaching on teacher evaluation proposals, and this is a chance for them to prove that they are not as absolutist as teacher union activists in California and other states claim. So, is anyone from DFER or the Obama administration willing to speak up against the excesses of SB 6?
March 19, 2010
ESEA reauthorization blueprint, the CliffNotes version
I have several meetings today, but I want to write down my thoughts on Duncan's ESEA reauthorization "blueprint" before I forget them. As I wrote over the weekend, Mike Petrilli is reading the substance of the blueprint correctly; the Obama administration is proposing that federal policy walk back a few steps from NCLB's absolutist mechanisms and disentangle the different issues involved in accountability. Petrilli is also correct in seeing a connection between the administration's ESEA reauthorization proposal and the promises by both Duncan and Russlyn Ali to be more aggressive in the department's Office of Civil Rights (OCR). That's essentially the implicit deal the administration is putting out for review by stakeholders: "We won't force states to label the majority of schools as failing, but we will require states to intervene in the worst 5% of schools in each state, and we will be aggressive in monitoring equity issues in other schools."
At least in theory, this fits with my argument in Accountability Frankenstein that schools have three different types of challenges: the challenge of truly mismanaged schools in crisis, the challenge of inequality, and the challenge of making sure the next generation is smarter and wiser than we are. I argued that NCLB tried to address all of those challenges with the same mechanisms, and it looks like the Obama administration is recognizing that they need different policy approaches: requiring states to identify 5% of schools in crisis, using OCR to address inequality, and pushing for common curriculum standards for the next-generation challenge.
That's not saying that the proposed mechanisms are going to work. I am less worried about using testing to screen for schools in crisis than others, but I agree with Diane Ravitch that educational euthanasia is a simplistic response. That doesn't mean that states should allow schools with deep problems to fester but that both states and the federal government need to be much more humble about their ability to "turn around" schools in crisis or even replace them with putatively brand-new schools. It's the proposed four-option turnaround mandate in the blueprint that bears the most resemblance to NCLB's cookie-cutter interventions, and that's a matter of deep concern for me.
Then there is the effective-teachers piece of the blueprint, which is less bureaucratic than NCLB's "highly qualified teacher" approach and the trigger for NEA's and AFT's critical responses to the blueprints (though I think Andy Rotherham is correct that the Obama administration's pushing of a health-care excise tax, abandonment of the Employee Free Choice Act, and passiveness with regard to NLRB appointments is definitely playing a role). The blueprint is very general with regard to its treatment of teacher effectiveness, and it could be consistent either with something like the Toledo peer-review system and Denver's ProComp, or with the problematic Senate Bill 6 in this year's Florida legislature.
The generally positive response to Duncan's presentations this week (especially from rural-state senator Tom Harkin) suggests that Duncan's hit a number of right notes, at least politically. That's not the same as effective policy, and it's a long way from a 40-something-page document to a law.
March 14, 2010
Petrilli nails ESEA reauthorization proposal
After finishing the last entry, I realized I should write something about Friday's USDOE proposal on ESEA reauthorization. But procrastination is sometimes a serendipitous thing, thanks to the Fordham blog: Mike Petrilli's analysis is correct, at least on first approximation. A narrative framework is not statutory language, Duncan's proposal isn't George Miller's, and other Beelzebubs squatting in the filigree, but I had the same general reaction Petrilli did.
I'll write more about ESEA reauthorization later in the week.
March 8, 2010
Sour-grapes agreement
Michael Olneck and Peter Sacks turn petty in letters to the editor about Diane Ravitch that the New York Times printed today. Wow. I agree with Ravitch on a number of things and disagree with her on a number of things, some of which is in our area of expertise (history of education) and some of which falls outside the history of education. But I'm not sure why Sacks in particular is turning on the venom spigot. Well, actually, I do have some hypotheses about general hostility to her I've occasionally seen (as opposed to disagreement): she caricatured the field of history of education in a sloppy late-70s publication sponsored by the National Academy of Education, and along with Patricia Graham she was a woman to get high-status national recognition in the 1970s for her work in education policy at the national level, which heretofore had been a male bastion. (Graham was director of NIE from 1977 to 1979.) The first is a seriously flawed work, but that's several decades in the past, and in any case, a particular work should stand or fall on its own merits. I've never seen the second item discussed or even acknowledged.
There's a related issue here, which is Ravitch's position outside traditional faculty. As far as I'm aware, she's never had a tenure-track or tenured faculty position, and she's one of the few historians who can say that they published their dissertation commercially before receiving the Ph.D. (The Great School Wars was published in 1974; Ravitch received her Ph.D. from Columbia in 1975). For the most part, her books are far more widely read than those of us who have full-time faculty positions, and I think she and Graham are the only historians of education to have held political appointments in the federal government. That's an interesting combination of insider and outsider positions.
When Meier and Ravitch started their joint blog/conversation three years ago, I briefly referred to this history in writing, "Regardless of various professional views of her scholarship, Ravitch is a recognized voice on education policy. There are plenty of people I correspond with who have fewer claims to expertise, so I can either have a snit-fit about that or deal, and at this point, having a snit-fit is darned close to sexism and uber-testosterone in education policy studies." I'm sorry Olneck and Sacks, and especially Sacks, have made a different choice.
For the record, Sacks is factually wrong when he states, "Dr. Ravitch fashioned herself into the Ayn Rand of educational policy and rose to fame as a result of a free-market ideology that came into fashion in George W. Bush's administration." Ravitch's appointment was during the first Bush administration, and whatever you might think of Ravitch's historical arguments in different books, she's a much better writer than Rand.
February 11, 2010
Additional thoughts on performance pay politics
An addendum to my entry earlier this morning: I think that there is a politically-robust rationale for performance-pay policies, but it's not at the level of incentives usually used as the justification. The more plausible rationale for performance-pay policies is at the level of public-sector accountability: most people with jobs do not expect identical salaries or salaries based on a formula, and small variations based on something other than seniority and educational credentials might boost the facial validity of public-sector HR practices.
Note that this is not an argument that business practices are always incentives based (or should be: witness AIG as a disaster stemming from short-term incentives) or even widely varying. In some cases--large law firms, for example--entry-level professionals receive step pay increases in their first few years akin to teachers' step increases. But if I were to ask the head of the Florida Council of 100, Susan Story, whether she'd stop advocating performance pay even if the research consensus in a few years were solidly against its doing anything for student achievement, my guess is that she'd still push for some form of performance pay.
The discourse around this is somewhat similar to other comparisons people make between their lives and public policy: when policies look like you're pushing the cart and someone else paid by public funds isn't, you're less likely to maintain support for it. A friend of mine visited a newspaper columnist some years ago to complain about an article the columnist had written regarding AFDC (the federal welfare program before 1996). Don't you understand the factual errors with all of the myths about welfare? my friend asked. Sure, said the columnist, but you don't understand why public attitudes have changed: as the majority of mothers now have to find their own child-care arrangements while they're working, they're going to be far less sympathetic towards women who aren't willing to work or perceived as not willing to work.
I don't agree with the columnist's thumbnail history of public attitudes towards federal welfare policies or on assumptions that women on welfare have not historically wanted to work. But there is a significant grain of truth there that when the majority of mothers work when their children are young, and they have to find and pay for child care and wrestle with the stress involved in that, those mothers are not going to want to see that they're pushing the cart and others aren't. For similar reasons, those who oppose any performance pay have an uphill road telling people who work in environments with non-step-like pay arrangements that somehow public schools should be arranged differently.
Why the Teacher Incentive Fund and Race to the Top are long-term dead ends for merit-pay advocates
The apparent push in the proposed 2011 Obama budget for an enlarged Teacher Incentive Fund on the heels of Race to the Top makes me think that merit-pay/performance-pay advocates may be spreading their political capital very thin on teacher evaluation. Most advocates of paying teachers in part based on test scores are also advocates of using test scores in part to evaluate teachers more broadly, especially dividing probationary teachers from teachers with a right to due process before dismissal. And they're trying to do both. Smart or stupid? I think it's counterproductive for several reasons:
- The research on benefits of individual-teacher performance pay is limited. Very limited and quite mixed. Putting all your chips on a huge expansion of experimental performance-pay schemes? You may not get what you want, and public evaluations may doom the politics. (Think Reading First, though the allegations of corruption set the stage in that case for death-by-evaluation.)
- Grant programs end. If the expansion of performance-pay programs relies on temporary revenue, then the program may well die along with the extra revenue. Denver's teachers union and district worked together on a long-term political deal: performance pay that teachers helped develop tied to a long-term boost in revenue. That's not the structure of RTTT, TIF, or the Gates Foundation grants.
- Real-life performance-pay bonus budgets are stingy. The best example of that reality is here in Florida, where the state budget for the school-based rewards for test scores has been no greater than $100/student (for a school) since the late 1990s, and while my undergraduate students sometimes enter my classes thinking that a huge amount of school budgets are based on test scores, in reality that's no more than about 1.5-2% of per-pupil expenditures in Florida (and that's for the schools that receive the money). When this money is distributed to staff (sometimes it is, sometimes it isn't), it's in the form of bonuses, not additions to base salaries. The fiscal and political reality is that the only way to permanently boost base salaries substantially based on test scores is to give the money to a tiny fraction of teachers, and that's a recipe for political disaster (and legal problems).
The last point is one I am surprised opponents of performance pay have not raised sufficiently, and here's how I thought someone would have put it by now: Okay, so you want to pay teachers well if their students learn a great deal? Wonderful. So if students perform at a very high level, you're willing to raise taxes to reward teachers for that accomplishment? Liberal advocates of performance pay would probably answer yes if. I don't think fiscal conservatives who are performance-pay advocates have thought through the dilemma on that point very clearly; either the answer is that you're willing to raise taxes or that you have low expectations for schoolchildren.
Eventually, I suspect that advocates of performance pay will have to decide whether they want to put all of their political capital into pay schemes that are fragile or into hiring and retention issues. The proposed ballooning of TIF is a sign that no one in Washington is thinking about the political balance of these issues in the long run.
Disclosure: I'm a member of a higher ed union that has long had a contract with merit pay and considerable differences in pay by rank and discipline. K-12 is a very different world in this regard.
Note: I started this entry on Tuesday, and because I forgot to change the "publish date" (which Movable Type usually sets at the time you started an entry, not published it), it first appeared as if it were published Tuesday. My editing fault, not your faulty memory. Now, your forgetting to read all of my books and articles? That's a different story.
January 9, 2010
Spot temperature:Climate::Test score:____________
I fully expect that within a week (if not yet already) some climate-change skeptic will use the cold wave currently freezing much of the country as an argument that climate change really isn't happening. And every time there's a vicious cold snap in winter or a cooler-than-average summer we get the argument. And some reporter and editor decides to devote part of the ever-shrinking news hole to bad coverage of the issue, while a relative handful of reporters use the question as an opportunity to educate readers about the difference between weather and climate.
Today, I'm sitting in central Florida with more layers on than I usually need in early January. It's colder weather than usual. But we're in a warming climate, because in the long run of decades (or centuries) the current cold wave is just noise, and the trend is towards a warmer atmosphere. "Just noise," you may be thinking through chattering teeth, "tell my heating bill that it's just noise." The current cold wave is nasty for individuals today (and a few days more), but it's temporary.
The variability of weather makes sense to most people because we have enough experience to distinguish between spot temperatures and broader patterns. We know that temperatures have daily and seasonal cycles. But the cyclical nature of weather does not give us enough background to grasp climate change. For that, you need data. A lot of data. A lot of data from a lot of places and times, of different sorts, with a number of experts sifting through it.
And even then you get climate-change conspiracy theorists, including someone who's evidently a hacker.
You can probably guess the logical analogue here: we do not have anywhere near the same density of data on student achievement that we have on climate, and yet we draw bold conclusions about the underlying achievement from a relative paucity of noisy data. As I wrote in August, we need to learn how to make decisions with noisy data. But in terms of broad trends in achievement, it is a bad habit of Americans to equate the latest test scores with long trends.
And that doesn't even touch the question of whether test scores are like temperature readings. Ah, but they are, if you're talking about your and my outside thermometers: placed at different heights, in different conditions (sheltered, out in the open, shade v. sun), different ages of the thermometers (and thus consistency of the readings across the years). I am sure that backyard thermometers in these varied conditions are highly correlated in the sense that when it's colder, they're all colder, and when it's warmer, they're all warmer, and so the correlations across time are likely to be very high. But I wouldn't use them in any scientific research.
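The thermometer point can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the temperatures, the offsets for shade and sun, and the size of the reading noise are all hypothetical numbers chosen for illustration, not data from any real instrument.

```python
import random

def pearson(xs, ys):
    """Pearson correlation computed by hand (no external libraries)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical "true" daily temperatures for one year, in degrees F.
true_temps = [60 + 20 * rng.uniform(-1, 1) for _ in range(365)]

# Two hypothetical backyard thermometers: one reads low in the shade,
# one reads high in the sun, and each has a little random reading error.
shade = [t - 3 + rng.gauss(0, 1) for t in true_temps]
sun = [t + 5 + rng.gauss(0, 1) for t in true_temps]

r = pearson(shade, sun)
mean_gap = sum(s - h for s, h in zip(sun, shade)) / len(sun)

print(f"correlation between the two thermometers: {r:.3f}")
print(f"average gap (sun minus shade): {mean_gap:.1f} degrees")
```

In this toy setup the two sets of readings move together almost perfectly while disagreeing by several degrees on every single day, which is the sense in which a high correlation does not make the readings interchangeable.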
Stay warm, and have whatever hot beverage you like!
December 13, 2009
Turnaround or abandonment in NYC?
The extent of school closings in New York City is becoming evident, and after JD2718's posts on the subject over the past half-week, UFT's Leo Casey provides an overview and alleges an ulterior motive (to create available space for other purposes, not to improve education).
I'm far from NYC and can't speak from close knowledge of the city schools, and I'm still grading student work so I have no time to read extensively. But this is an important story and rolling conflict, and there are a few predictions I'll hazard:
- At least one conservative will commit rampant inconsistency by simultaneously (or nearly simultaneously) weeping over the demise of the DC voucher program and applauding Klein for his bold moves, repeating the double standard on the issue I have described before.
- A small handful of schools may be preserved through fairly heroic efforts, but most of the closures will stand.
- There will be no effective way to hold Tweed responsible for consistency and rationality in its school opening/closing decisions.
In truth, many administrators engage in maneuvers that appear as arbitrary as Klein's closures do, but rarely is it on such a scale or so visible beyond the locality.
December 5, 2009
Are central Florida schools flouting Florida law limiting test-prep?
I have heard from teachers and students in three area districts (Hillsborough, Pinellas, and Hernando counties) that secondary teachers in some subjects are being ordered to spend the first 10 minutes of class suspending the curriculum and teaching material from another class. In the case of two counties (Pinellas and Hernando), I have heard stories that math teachers are being asked to teach 10 minutes of reading--not include word problems in math, which is certainly appropriate, but teach reading (a subject very few of them would have certification in). In one county (Hillsborough), I have heard a report from a student that a high-school anatomy teacher has been asked to spend 10 minutes reviewing other science subjects (and the emphasis appears to be in chemistry), probably to prepare students for the 11th grade FCAT science comprehension exam.
In 2008, the Florida legislature added a section to the existing law on assessment (F.S. 1008.22(4), if you're curious), specifying limits to what schools can do to prepare for tests, specifically
STATEWIDE ASSESSMENT PREPARATION; PROHIBITED ACTIVITIES.--Beginning with the 2008-2009 school year, a district school board shall prohibit each public school from suspending a regular program of curricula for purposes of administering practice tests or engaging in other test-preparation activities for a statewide assessment.
There are a number of exceptions to this prohibition--school districts can distribute sample test books, teach test-taking skills in limited quantities, etc.--but the spirit is clear: schools are not supposed to be engaging in test-prep that is a substitute for instruction. And taking time away from math class to teach reading, or away from anatomy to teach chemistry, looks like it's clearly prohibited.
It's also counterproductive from an administrative standpoint: if you wanted to add reading instruction, why would you ask a math teacher to do it? I should be clear: these are unconfirmed reports rather than documented examples. But if these reports are true, this clearly looks to be an end-run around ordinary curriculum policies requiring a certain amount of instruction in the classes to get more instruction or more test-prep in for high-stakes subjects.
There is one additional legal problem with this practice: there are both state and federal policies about teacher qualifications. I bet it's illegal in a number of respects to assign math teachers to teach reading and then report that everyone instructing in a subject is properly certified.
I have contacted the three districts in question to ask where the policies required by the law are. If you are aware of any specific examples (and I would need the school, date, class, period, and witness for sufficient documentation), please contact me by e-mail (sherman dottish-thingie dorn at-symbol-stuff gmail.com).
November 12, 2009
Race to the Top: review, revise, redux
I am in California this weekend for the Social Science History Association annual meeting, where we get to talk about Maris Vinovskis's book on the last quarter century of school reform, and since one of my copanelists Saturday morning is Jennifer Jennings, I finally get to meet the sociologist-formerly-known-as-Eduwonkette in person, face to face. Because several family members live in Costa Mesa, I also get to enjoy Kean Coffee about 20 miles south of the conference hotel/cruise ship (when the heck did the SSHA officers decide to book the Queen Mary??!).
While the focus of the book panel will be ... well, Maris's book, I'm sure we'll be talking about Obama education policy at some point, including Race to the Top. I was rushing around last night not getting enough done, so I didn't have a chance to do more than casually skim the stuff that's now available on the revised final guidelines. A few initial thoughts:
- Bottom line? No idea. I traveled west and had coffee (see above), so I don't have a bad case of jet lag, but I've been on planes for 7 hours today.
- I very much like the competitive priority on STEM fields. That uses a standard device for focusing grant-writers' minds in USDOE competitions (the bonus points for meeting a competitive priority). (Disclosure: it looks like my state's department of education is following the push a bunch of us have been making about using Race to the Top funds for end of course exams, especially in science.)
- From the list of changes made, it looks like there have been a lot of political calculations made on what changes had to be made to keep stakeholders in the game and what had to stay the same to satisfy policy goals.
- Duncan is not anal retentive enough to make the points add up to a "nice round number." I have a suspicion this is deliberate, and if so I think I know the reason why.
- People who focus on the total potential range of points for each section are missing an important feature of point distributions in scoring systems: it's the actual range and not the potential range that matters on rankings. If the potential range is 58 points from top to bottom on one component but the scoring leaves a real-life range of 10 points, it doesn't matter that the total number of points is 58. It could have been anything from 10 to 58. So what matters is how the reviewing panel looks at everything.
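The last point, about actual versus potential range, can be shown with a toy scoring table. This is a sketch only: the state names, the two components, and every score below are hypothetical, and nothing here reflects the real Race to the Top rubric.

```python
def ranking(applicants, components):
    """Order applicants from highest to lowest total on the given components."""
    totals = {
        name: sum(scores[c] for c in components)
        for name, scores in applicants.items()
    }
    return sorted(totals, key=totals.get, reverse=True)

def actual_spread(applicants, component):
    """Difference between the best and worst score actually awarded."""
    values = [scores[component] for scores in applicants.values()]
    return max(values) - min(values)

# Hypothetical scores. "big" is worth up to 58 points on paper, but reviewers
# only spread applicants across 10 of them; "small" is worth up to 30 on paper
# and reviewers used 25 of them.
applicants = {
    "State A": {"big": 50, "small": 5},
    "State B": {"big": 40, "small": 30},
    "State C": {"big": 45, "small": 18},
}

overall = ranking(applicants, ["big", "small"])
by_big = ranking(applicants, ["big"])
by_small = ranking(applicants, ["small"])

print("spread on big:", actual_spread(applicants, "big"))
print("spread on small:", actual_spread(applicants, "small"))
print("overall ranking:", overall)
print("ranking on big alone:", by_big)
print("ranking on small alone:", by_small)
```

With these made-up numbers the overall order follows the nominally smaller component, because that is where the reviewers actually separated the applicants.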
If we have time, I'll try to persuade Jennings to put on her Eduwonkette cape and save the state where I grew up. But I think California's problems are beyond what even a brilliant sociologist can solve. At least I get to see family members, which is worth the jet lag I'll be fighting in the next week.
October 29, 2009
Channeling Jerry Bracey on "proficiency": it's political, not scientific
One of the late Jerry Bracey's hobbyhorses was the pretense that the NAEP achievement level labels were scientific, as he argued in 1999: "The standards have generally been the object of scorn and derision from the psychometric community." He was fond of quoting the 1999 report on NAEP proficiency levels, esp. from p. 162: "Standards-based reporting is intended to be useful in communicating student results, but the current process for setting NAEP achievement levels is fundamentally flawed." So when NCES issues a report comparing the implied theta-values of cut-scores for proficiency on state assessments to the theta-values of cut scores for proficiency on NAEP and both Ed Week and the Christian Science Monitor report on the paper with a straight face, we're obviously seeing one place where Bracey's voice is already missing.
I think Jerry perseverated on this issue, to the detriment of a sensible argument about political judgments. The larger, inescapable point is that cut scores are set arbitrarily, and there is no way to avoid that fact. Those who support setting achievement levels hope and pray that they're arbitrary in the sense of arbitration and careful judgment, not by being capricious. But they are arbitrary, and even more so the labels assigned them. What we know is that someone who scores at a "proficient" level on NAEP is scoring higher than someone in the "basic" band. That's all we know from those labels: ordinality. Moses did not come down from Mount Sinai with NAEP scores carved in tablets.
So what do we do with the inherently political nature of those labels? As I have argued in Accountability Frankenstein, the caution with which we use the judgments on cut scores should depend on the stakes of their use. If they're used to target resources, that's one thing (resources are going to be targeted in some manner), but the more that someone's job depends on them, the more wary we should be of how we set thresholds.
Today, however, NAEP labels and cut-scores are serving a purely performative act, to stigmatize states for their political response to NCLB. I hereby propose that we have the following new labels for NAEP achievement levels:

I think that's in the spirit of the day's report...
Correction: I assumed that NCES was using detailed data from the state assessments to estimate IRT parameters. Silly me. They were using distributional data for linkage. Oops... for me for forgetting the methods from the last such report. I'll let the measurement folks argue about the methods used here.
October 14, 2009
The comparability fly in the Ouchi/principal-autonomy ointment
Yesterday from a "stakeholders" meeting (I think at the USDOE), Charlie Barone tweets,
Richard Laine of Wallace Foundation: forthcoming Rand study will show [principal] autonomy in hiring a key factor in student achievement.
I've been expecting something like this for a while, not because I'm connected to a RAND insider (I'm not) but because this is the obvious new form of decentralization that would marry the 1980s-90s site-based management fad with new managerial fads in education.
To some extent I am attracted to Bill Ouchi's argument about principal autonomy leading to lower total student load. Ouchi's claim about total student load is essentially one of Ted Sizer's central arguments from Horace's Compromise, that the number of students a teacher sees is a key factor in the ability to push student achievement. But... and here's a fairly important but... Ouchi's work is tantalizing rather than definitive (because it has not been replicated substantially in terms of total student load), and the temptation to manage large urban districts as "portfolios" with quasi-independent school-level management may push a single form of decentralization at the cost of comparability in expenses and access to great teachers.
What the heck do I mean by that? In a sentence, we may not want principals to have complete autonomy in a task where they have relatively weak skills: knowing which novice teachers are going to be great teachers.
Everyone and her or his grandmother is focusing on the problem of where senior teachers work. This is an intellectual sleight of hand if you simultaneously argue that teachers with seniority are taking advantage of contracts with seniority privileges on transfer to avoid schools who need them and also insinuate that experience means nothing. Let me get this straight: we need to prevent experienced teachers from exploiting labor-market choice to move to schools with more comfortable teaching situations because... they're not inherently any better than teachers with only a few years of experience? This is an inconsistency ripe for Jon Stewart-like treatment.
More important than the intellectual sleight of hand is the way that this argument ignores an opportunity for a simple but politically sensitive intervention we could make that could simultaneously improve the lives of poor children and new teachers: create regional new-teacher clearinghouses and matching services. Here's the thought experiment: Far from decentralizing, I think it would be a healthy system for schools to require new teachers go into a large regional market where vacancies for relatively new teachers (e.g., those with fewer than three years of experience) would be balanced with a matching process akin to matching of med-school graduates to residencies. This would require collective bargaining and regional agreements between districts (or changes to statute), but here's the idea:
Brand-new teacher's perspective: A new teacher registers with the regional teaching market clearinghouse, with all of the stuff you'd want applicants to provide. The clearinghouse is directly tied to vacancies in the region, and that would probably include multiple districts in most parts of the country. The clearinghouse matches teachers to jobs for the first year. The teachers and administrators are told, explicitly, "This is a one-year arrangement. In the second year, the teacher is headed to a new school, and the administrator provides an evaluation knowing that the teacher is not coming back to that school until at least two years down the road." And that's what happens. At the end of the first year, the clearinghouse matches jobs to teachers who want to continue teaching and whom the first-year administrators recommend continue. Same with the end of the second year. And the clearinghouse's job is to make sure that by the end of a new teacher's third year, that teacher has worked in multiple settings, with different characteristics of students (at least within the range of the region), in areas of the teacher's documented expertise (i.e., no out-of-field matches).
At the end of Year 3? Open market in the spring, in most places, and administrators wanting to hire on the open market must hire teachers with at least three years' experience -- in other words, teachers for whom there is a record of evaluations from different administrators and for whom there is a record of performance for students in different settings (within the range of the region's student population). Schools are allowed to hire teachers who worked in their schools before... if the now-third-year teacher wants to work there again.
Benefit to teachers: first-year teachers stuck with horrible administrators (or generally toxic environments) know that they'll be moving on if they survive. They'll get experience with multiple settings where they'll be able to demonstrate their chops. At the end of their third year, they'll have some variation in experience with administration to be able to judge people better when applying in an open-market situation. Disadvantage to teachers: if you happen to get lucky and get a great job in Year One, you have to move on.... and let another new teacher get the benefit of that experience.
Benefit to administrators: because new teachers are forced to move on after a year, honest evaluations are less likely to result in social backlashes. When you hire on the open market, you'll know you'll have evaluations and (where this is gathered) other performance data that is from school settings with a range of student populations. Disadvantage: you don't get to hire absolutely new teachers; you get whom you get, and if you were great spotters of talent, or you think you're better than the average principal at spotting good talent, you'll be upset.
(Personally, I think I would prefer this as an administrator: if you've read Moneyball, you know the sabermetricians' rule of thumb: you can predict a baseball player's professional performance from college experience, but someone straight out of high school is just a raw bet without college experience. Why would you want the authority to make hires in a situation where you're almost guaranteed to be a worse judge of talent/skill than any other personnel situation? Then again, I'm sure many principals think of themselves like the [very poorly-predicting] old scouts of baseball, making seat-of-the-pants judgments.)
Advantages for systems: See advantages for administrators above. In addition, you have lower risk with variation in administrators' skills in talent judgment, while principals would still have the autonomy to pick more experienced teachers, after they pick up enough of a record for administrators to see who has more talent. You could also get development of evaluation skills in a regional context without diseconomies of scale. If clearinghouses have to track teachers, they could also be tasked with additional evaluation responsibilities across a region. Advantage for relatively poor systems: you know that wealthier districts will not be able to be as much of a magnet for new teachers, because of regional rotation, and you could push administrators to do what is necessary to convince teachers that they want to return to your district after their initial three-year rotation is done. Disadvantages: there would need to be legal agreements to cover this, and there would be some logistical challenges in identifying vacancies (and making sure those vacancies are reported accurately and promptly) as well as the operation of a clearinghouse. School districts would have to delegate hiring authority for some of their jobs to a regional body, and if school systems really thought that they were hot stuff in terms of talent scouting, that might be hard to swallow. (See above and Moneyball on the egos of baseball scouts and possibly school administrators.) Disadvantage for wealthy districts: poof goes your advantage in recruiting brand-new and relatively-new teachers, because they'll spend some time in your districts but also some time in poorer districts.
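For readers curious what the matching step might look like mechanically, here is a minimal sketch of one year's match using deferred acceptance, the general technique usually described as the basis of residency matching. Everything in it is an assumption made for illustration: the thought experiment above does not specify an algorithm, the teacher and school names are invented, each school has exactly one vacancy, and the rotation rule (no return to the same school the next year) is left out.

```python
def deferred_acceptance(teacher_prefs, school_prefs):
    """Teacher-proposing deferred acceptance with one vacancy per school.

    teacher_prefs: teacher -> list of schools, most preferred first
    school_prefs:  school  -> list of teachers, most preferred first
    Assumes every school ranks every teacher who might list it.
    """
    # Lower number means the school likes that teacher more.
    rank = {
        school: {teacher: i for i, teacher in enumerate(prefs)}
        for school, prefs in school_prefs.items()
    }
    next_choice = {teacher: 0 for teacher in teacher_prefs}
    held = {}                      # school -> teacher it is tentatively holding
    free = list(teacher_prefs)     # teachers without a tentative placement

    while free:
        teacher = free.pop()
        if next_choice[teacher] >= len(teacher_prefs[teacher]):
            continue               # list exhausted; teacher stays unmatched
        school = teacher_prefs[teacher][next_choice[teacher]]
        next_choice[teacher] += 1
        current = held.get(school)
        if current is None:
            held[school] = teacher
        elif rank[school][teacher] < rank[school][current]:
            held[school] = teacher  # school trades up; old teacher is free again
            free.append(current)
        else:
            free.append(teacher)    # rejected; will try the next school

    return {teacher: school for school, teacher in held.items()}

# Hypothetical first-year teachers and vacancies.
teacher_prefs = {
    "Ana": ["North", "South", "East"],
    "Ben": ["North", "East", "South"],
    "Cy": ["South", "North", "East"],
}
school_prefs = {
    "North": ["Ben", "Ana", "Cy"],
    "South": ["Ana", "Cy", "Ben"],
    "East": ["Cy", "Ana", "Ben"],
}

match = deferred_acceptance(teacher_prefs, school_prefs)
print(match)
```

A real clearinghouse would presumably replace the schools' preference lists with whatever regional balancing rule it adopted, which is the part of the thought experiment this sketch does not attempt.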
Now, the payoff in terms of debates about comparability: a regional new-teacher clearinghouse/matching process would instantly equalize a significant part of the teaching staff across a region, because of rotation among jobs and districts. Yes, there would still be an advantage of wealthier districts in attracting teachers with three or more years of experience, but poorer districts would know that they at least have a shot of persuading new teachers that they can make a good career inside a district... if the relatively new teachers have an experience that is supportive.
Remember that this is a thought experiment: I don't know of any places with regional new-teacher clearinghouses/matching services, and I dreamed it up out of whole cloth (plus some inspiration from what happens with med-school students). But I think it points out a structural problem with giving principals entire autonomy: with complete autonomy, there is no balancing out of regional needs. Equality of opportunity would depend entirely on the skills of individual principals, and while principals are extraordinarily important, that's putting a heck of a lot of eggs in a single basket. If you care about making sure that a broad range of students have access to great teachers, there are serious dangers in the Ouchi principal-autonomy approach.
October 8, 2009
First, find me a box of cereal that squirms and drips snot in winter
Congratulations to former Florida Governor Jeb Bush, who knows a critical rule of politics: declare victory whenever you can, no matter whether you were right. I am quite serious about his political acumen: his push of a system that assigned letter grades to schools was ingenious politics. And Bush deserves credit for supporting a research technical assistance center in Florida as well as funding for reading coaches. But Jeb Bush's comments to the Jeb Bush Celebration Conference this week had an interesting quip:
Frankly, if Walmart can track a box of cereal from the manufacturer to the check-out line, schools should be able to track the academic growth of a student from the time they step in the classroom until they graduate.
I am firmly in favor of using longitudinal data, but this comment is cheerleading and not serious discussion. There are significant challenges in the creation, maintenance, and use of longitudinal data systems, and Walmart-style tracking logistics don't touch the greater ones.
October 3, 2009
Child murder, Chicago style
Chicago teacher Deborah Lynch pointed out in a Sun-Times opinion piece yesterday that one of the Chicago schools' "turnaround targets" this fall has been Fenger High School, near the gang fight that led to Derrion Albert's death and the school where she implies many of the combatants attend. (Hat tip/alternative source.)
I am not saying that knowing the kids better could have averted the melee and tragic death of last week, obviously. But trouble had been brewing at the school even before last week. Staff reported a riot the previous week inside the building, involving teachers being hit, and that two different police stations had to be called in to quell the disturbance. Those are the times when the staff members draw on their relationships with kids to urge restraint, to urge calm and peace, to try to talk things out rather than fight things out. Those are the times when a seasoned staff can identify strategies and resources to address and prevent further problems.
Lynch's argument is interesting and plausible. I'd be cautious of taking it at face value, but don't toss it out the window. As far as I am aware, there is nothing either to contradict or to support the claim that the length of time a staff (as a whole) has spent in a school is predictive of the general school environment. I suspect it depends on the staff; experienced good teachers and staff are going to have the types of relationships with students that Lynch describes.
But there is another important limit to Lynch's argument, and I'm thinking about the debate that's usually focused on academics rather than violence: the relationship between schools and the rest of students' lives. I suspect that if George Schmidt is correct, that the police congregated around Fenger rather than following potential combatants, any immediate investigation needs to focus largely on the tactical decisions of the police. It's possible that no matter what happened in the school, the gang fight would have occurred unless police decisions had been different.
The murder of Derrion Albert is representative of one fact: in violent neighborhoods students are usually safer in school than out of school. A skilled set of professionals can make it so kids are safe in school, safe enough to focus on school. And it's much harder to bring peace to a violent neighborhood without involving schools. What happens inside the classroom can change the conversations that happen outside school boundaries, but there are no guarantees. What if Fenger had not been the target of a turnaround effort: would Albert still be alive? I don't know.
Update (October 7): More on MSNBC, and more focused, on the rearrangement of enrollment patterns.
September 2, 2009
"Lake Wobegon" Klein
From pp. 68-69 of Accountability Frankenstein:
The complexity of an accountability system can also help muffle opposition to accountability if it gives a reasonable chance for students or schools to be successful in the system's labeling... the political potential to muffle opposition within a system may be more important than the technical qualities of a system, for schools typically trumpet any positive label on any website, pamphlet, or streetside marquis. All three of these states provide evidence of the capacity for complex systems to muffle dissent. In North Carolina, the majority of schools have received some recognition award in every single year of its accountability system's history. In Florida's system, 13% earned recognition in its first year, 1999, but that proportion rapidly grew, and a majority of schools received recognition awards in each of the years from 2003 to 2006. In California, 47% of California's schools earned statewide recognition in 2002, and two thirds of the schools in the Los Angeles Unified School District earned recognition.
I don't know why anyone would suspect that there is any political convenience involved in having the single letter grades assigned to a whole slew of NYC schools jump to A, but it's not isolated to New York. It's just that New York has overtaken Lake Wobegon as a symbol of overestimation of results. Then again, since Garrison Keillor spends several months a year in New York, maybe it's highly appropriate.
August 30, 2009
Race to the Top comment sausage
A friend of mine from Chicago introduced me to the term link sausage as a blog entry that is not much more than a set of links. Here are links to various comments on Race to the Top (a tiny slice of the well-over-a-thousand comments submitted):
- National Education Association (Word)
- American Federation of Teachers (PDF)
- Learning Disabilities Association of America (PDF)
- New Teacher Center (part 1 and part 2) (both PDF)
- Thomas Kane et al. (PDF)
- Forum on Educational Accountability (Word)
- Charles Barone et al. (PDF)
As I expected, others have started to chime in on the NEA comments. The New York Times took the comments as a sign of obstinance. Former Park Ridge Education Association president Fred Klonsky wrote,
While it seems to me that it is late in coming, the letter from Brilliant is well deserved, and [Sherman] Dorn's comments notwithstanding, I think it reflects the views of the NEA membership. At least among those who have been following the debate.
I think that was my point: the comments reflected the views of a large slice of the NEA membership, but not in a productive fashion, and I fear that on balance it will harm the concrete interests of teachers (both in and out of the NEA) no matter how you want to define those interests.
Note: As Klonsky points out in comments, he's not an ex-president (yet). The error is all mine in sloppy reading of his about page.
August 28, 2009
I'm commenting on Race to the Top, and I want a pony, too!
Impressions of a quick skim of 20 or so comments on the draft Race to the Top regs:
- I couldn't find the national AFT comments anywhere.
- Thus far, the two sets of technical comments by the Learning Disabilities Association of America and the group of academics with Kane, Staiger, and several others (uploaded by Thomas Kane), respectively, earn my "okay, you guys read the regulations and targeted your comments" award. Whether you agree with them or not, the comments were shrewd and focused. (I happen to like most of the comments, which are practical and sensible on the whole.)
- The New Teacher Project signed onto the multi-organization letter that was essentially a vague "okay, we agree with this" note (with the advice for the USDOE to be selective in the first round), and then submitted comments that were, ahem, not nearly as far in the opposite direction as NEA but bewildering in their unbridled confidence in the suggestions made. TNTP staff, please read the comments written by Kane et al. You're smart, and they're smart, and they're much closer to the mark than you were this week. At least you don't come close to winning the second "I'm commenting on Race to the Top, and I want a pony, too!" award (first was to the NEA).
- I think that the California Teachers Association (the NEA affiliate in California) avoided the factual blunder in the NEA comments of asserting that Race to the Top is a mandate. Instead, they asked what states would have to give up in return for the money. In this case, they were deeply, deeply concerned with the threat to federalism embedded in asking that a state be able to link teacher and student records. That would be more plausible if TNTP's comments were enacted, but either the draft regs or Kane et al.'s suggestions are reasonable in an imperfect world.
- One state department of education accidentally sent the USDOE its cover letter to a national organization telling the national organization it was sharing its reg comments, in the place where it was supposed to upload comments. No signs of actual comments on the regs (thus far today). Ouch! I suspect there are similar technical glitches in other places.
I didn't comment. This is the first week of classes, and I'm a firm believer in the biggest bang for my buck (or hour).
August 23, 2009
NEA's comments: righteousness over responsibility to members?
I'm an NEA member, through my membership in the United Faculty of Florida. I'm a skeptic and critic of high-stakes accountability. Wrote a book and a few articles on the topic. And I am astounded at the NEA's comments on the Race to the Top draft regulations. (Hat tip.)
It is one thing to submit a righteous objection to the entire program if you are an individual with no responsibilities but to your conscience and your personal judgment of posterity. It is an entirely different thing when you represent several million teachers and you submit a document that for all intents and purposes appears to have an internal audience inside the NEA. That's nice, in the worst sense of the word "nice," because NEA staff had a responsibility to protect and advance their members' interests, not indulge any of our fantasies. To put it bluntly, on what planet would this regulatory comment have any effect on the final regs?
Let me be clear on my perspective as an NEA member and as an observer of political processes: There are lots of reasonable individual passages within the document, but you don't submit a manifesto when you comment on regs as an organization. You don't submit a manifesto that covers up any potential for effectiveness with what amounts to political poison. And you don't submit a manifesto that undermines your credibility.
Two examples will have to suffice, because there's only so much I can wince at publicly: "we cannot support yet another layer of federal mandates" (from p. 2), or with regard to the creation of statewide longitudinal data systems, opposition to "[i]gnoring states' rights to enact their own laws and constitutions" (p. 24). The problem with these claims (and attendant tone of outrage) is that Race to the Top is not a mandate. Love it or hate it, it's something states must apply for.
There were certainly alternatives available to the NEA, including the following choices:
- Realpolitik: nudge the regs a bit to help state and local affiliates.
- Legal: set up a legal challenge after final publication.
- Abstinence: if you need to make a statement of conscience, declare that "we have serious doubts that this program will substantially help schools and will not participate in the regulatory comment process."
I may be dead wrong about this, and there may be some uber-secret strategy behind this comment, but from where I sit at the end of the summer, it looks like the first major move by the new president of one of my national affiliates has been a bunch of wasted electrons.
August 16, 2009
What "multiple measures" looks like in reality
Friday's Sun-Sentinel article on the new evaluation scale for Florida high schools shows what happens when a state moves away from general-assessment test scores as the end-all and be-all of accountability. In this case, Florida's new scale for high schools rewards schools for graduating more students, especially those who have problems with the state assessments, for enrolling students in challenging courses, for students who succeed in the challenging courses, and for student success in voc-ed certification programs.
How are Broward County schools responding?
At South Broward High School in Hollywood, students will get the chance to take additional AP classes, such as human geography, world history, music theory and macroeconomics, in addition to more traditional offerings such as AP English and biology, said principal Alan Strauss.
They're also ready to better monitor performance of at-risk students and ensure the entire senior class is ready to graduate, Strauss said. "I say overall I would hold myself accountable for grad rate and preparing my kids for college," Strauss said. "I don't find a problem with that. I think that's what my job should be."
Surprise, surprise! A more balanced accountability mechanism leads to planning a more balanced set of programs for students. I can quibble with loads of details on the new scale, but the direction is the right one, and I think we'll know in a few years how this is going. I'll stick my neck out and predict the evidence will be reasonably good (in terms of outcomes). A small step for a single state, a giant step for accountability options.
August 13, 2009
How can we use bad measures in decisionmaking?
I had about 20 minutes of between-events time this morning and used it to catch up on two interesting papers on value-added assessment and teacher evaluation--the Jesse Rothstein piece using North Carolina data and the Koedel-Betts replication-and-more with San Diego data.
Speaking very roughly, Rothstein used a clever falsification test: if the assignment of students to fifth grade is random, then you shouldn't be able to use fifth-grade teachers to predict test-score gains in fourth grade. At least with the set of data he used in North Carolina, you could predict a good chunk of the variation in fourth-grade test gains knowing who the fifth grade teachers were, which means that a central assumption of many value-added models is problematic.
Cory Koedel and Julian Betts's paper replicated and extended the analysis using data from San Diego. They were able to confirm with different data that using a single year's worth of data led to severe problems with the assumption of close-to-random assignment. They also claimed that using more than one year's worth of data smoothed out the problems.
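For readers who want to see the logic of the falsification test rather than take it on faith, here is a minimal sketch with simulated data. Everything in it is an assumption for illustration: the numbers are made up, the "tracked" assignment rule is hypothetical, and the variance-share calculation is a simple stand-in, not the specification Rothstein or Koedel and Betts actually used. The fourth-grade gains are pure noise with no teacher effects, so any "predictive power" of fifth-grade teachers comes entirely from how students were sorted into classes.

```python
import random
from statistics import mean

def share_explained(gains, groups):
    """Between-group share of variance in gains, grouped by teacher."""
    grand = mean(gains)
    total = sum((g - grand) ** 2 for g in gains)
    by_group = {}
    for gain, grp in zip(gains, groups):
        by_group.setdefault(grp, []).append(gain)
    between = sum(len(v) * (mean(v) - grand) ** 2 for v in by_group.values())
    return between / total

rng = random.Random(0)
n_students, n_teachers = 600, 20

# Simulated fourth-grade gains: pure noise, no teacher effects at all.
fourth_grade_gains = [rng.gauss(0, 1) for _ in range(n_students)]

# Scenario 1: students assigned to fifth-grade teachers at random.
random_assignment = [rng.randrange(n_teachers) for _ in range(n_students)]

# Scenario 2 (hypothetical tracking): students ranked on a noisy version
# of their fourth-grade gains, then split into classes in rank order.
noisy_rank = sorted(
    range(n_students),
    key=lambda i: fourth_grade_gains[i] + rng.gauss(0, 1),
)
tracked_assignment = [0] * n_students
for position, student in enumerate(noisy_rank):
    tracked_assignment[student] = position * n_teachers // n_students

random_share = share_explained(fourth_grade_gains, random_assignment)
tracked_share = share_explained(fourth_grade_gains, tracked_assignment)
print(f"random: {random_share:.2f}  tracked: {tracked_share:.2f}")
```

Under random assignment the fifth-grade teacher tells you almost nothing about fourth-grade gains; under tracking it tells you a great deal, even though no teacher in the simulation did anything.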
Apart from the specifics of this new aspect of the value-added measure debate, it pushed my nose once again into the fact that any accountability system has to address the fact of messy data.
Let's face it: we will never have data that are so accurate that we can worry about whether the basis for a measure is cesium or ytterbium. Generally, the rhetoric around accountability systems has been either "well, they're good enough and better than not acting" or "toss out anything with flaws," though we're getting some new approaches, or rather older approaches introduced into national debate, as with the June Broader, Bolder Approach paper and this morning's paper on accountability from the Education Equality Project.
Now that we have the response by the Education Equality Project to the Broader, Bolder Approach on accountability more specifically, we can see the nature of the debate taking shape. Broader, Bolder is pushing testing-and-inspections, while Education Equality is pushing value-added measures. Incidentally, or perhaps not, the EEP report mentioned Diane Ravitch in four paragraphs (the same number of paragraphs I spotted with references to President Obama) while including this backhanded, unfootnoted reference to the Broader, Bolder Approach:
While many of these same advocates criticize both the quality and utility of current math and reading assessments in state accountability systems, they are curiously blithe about the ability of states and districts to create a multi-billion dollar system of trained inspectors--who would be responsible for equitably assessing the nation's 95,000 schools on a regular basis on nearly every dimension of school performance imaginable, no matter how ill-defined.
I find it telling that the Education Equality Project folks couldn't bring themselves to acknowledge the Broader, Bolder Approach openly or the work of others on inspection systems (such as Thomas Wilson). Listen up, EEP folks: Acknowledging the work of others is essentially a requirement for debate these days. Ignoring the work of your intellectual opponents is not the best way to maintain your own credibility. I understand the politics: the references to Ravitch indicate that EEP (and Klein) see her as a much bigger threat than Broader, Bolder. This is a perfect setup for Ravitch's new book, whose title is modeled after Jane Jacobs's fight with Robert Moses. So I don't think in the end that the EEP gang is doing themselves much of a favor by ignoring BBA.
Let's return to the substance: is there a way to think coherently about using mediocre data that exist while acknowledging we need better systems and working towards them? I think the answer is yes, especially if you divide the messiness of test data into separate problems (which are not exhaustive categories but are my first stab at this): problems when data cover a too-small part of what's important in schooling, and problems when the data are of questionable trustworthiness.
Data that cover too little
As Daniel Koretz explains, no test currently in existence can measure everything in the curriculum. The circumscribed nature of any assessment may be tied to the format of a test (a paper and pencil test cannot assess the ability to look through a microscope and identify what's on a slide), to test specifications (which limits what a test measures within a subject), or to subjects covered by a testing system. Some of the options:
- Don't worry. Don't worry about or dismiss the possibility of a narrowed curriculum. Advantage: simple. Easy to spin in a political context. Disadvantage: does not comport with the concerns of millions of parents concerned about a narrowed curriculum.
- Toss. Decide that the negative consequences of accountability outweigh any use of limited-purpose testing. Advantage: simple. Easy to spin in a political context. Disadvantage: does not comport with the concerns of millions of parents concerned about the quality of their children's schooling.
- Supplement. Add more information, either by expanding the testing or by expanding the sources of information. Advantage: easy to justify in the abstract. Disadvantages: requires more spending for assessment purposes, either for testing or for the type of inspection system Wilson and BBA advocate (though inspections are not nearly as expensive as the EEP report claims without a shred of evidence). If the supplementation proposal is for more testing, this will concern some proportion of parents who do not like the extent of testing as it currently exists.
Data that are of questionable trustworthiness
I'm using the term trustworthiness instead of reliability because the latter is a term of art in measurement, and I mean the category to address how accurately a particular measure tells us something about student outcomes or any plausible causal connection to programs or personnel. There are a number of reasons why we would not trust a particular measure to be an accurate picture of what happens in a school, ranging from test conditions or technical problems to test-specification predictability (i.e., teaching to the test over several years) and the global questions of causality.
The debate about value-added measures is part of a longer discussion about the trustworthiness of test scores as an indication of teacher quality and a response to arguments that status indicators are neither a fair nor accurate way to judge teachers who may have very different types of students. What we're learning is a confirmation of what I wrote almost 4 years ago: as Harvey Goldstein would say, growth models are not the Holy Grail of assessment. Since there is no Holy Grail of measurement, how do we use data that we know are of limited trustworthiness (even if we don't know in advance exactly what those limits are)?
- Don't worry. Don't worry about or dismiss the possibility of making the wrong decision from untrustworthy data. Advantage: simple. Easy to spin in a political context. Disadvantage: does not comport with the credibility problems of historical error in testing and the considerable research on the limits of test scores.
- Toss. Decide that the flaws of testing outweigh any use of messy data. Advantage: simple in concept. Easy to spin in a political context. Easy to argue if it's a partial toss justified for technical reasons (e.g., small numbers of students tested). Disadvantage: does not comport with the concerns of millions of parents concerned about the quality of their children's schooling. More difficult in practice if it's a partial toss (i.e., if you toss some data because a student is an English language learner, because of small numbers tested, or for other reasons).
- Make a new model. Growth (value-added) models are the prime example of changing a formula in response to concerns about trustworthiness (in this case, global issues about achievement status measures). Advantage: makes sense in the abstract. Disadvantage: more complicated models can undermine both transparency and understanding, and claims about superiority of different models become more difficult to evaluate as the models become more complex. There ain't no such thing* as a perfect model specification.
- Retest, recalculate, or continue to accumulate data until you have trustworthy data. Treat testing as the equivalent of a blood-pressure measurement: if you suspect that a measurement is not to be trusted, test the student again in a few months/another year. Advantage: can wave hands broadly and talk about "multiple years of data" and refer to some research on multiple years of data. Disadvantage: Retesting/reassessment works best with a certain density of data points, and the critical density will depend on context. This works with some versions of formative assessment, where one questionable datum can be balanced out by longer trends. It's more problematic with annual testing, for a variety of reasons, though that can reduce uncertainties.
- Model the trustworthiness as a formal uncertainty. Decide that information is usable if there is a way to accommodate the mess. Advantage: makes sense in the abstract. Disadvantage: The choices are not easy, and there are consequences of the way of modeling uncertainty you choose: adjusting cut scores/data presentation by measurement/standard errors, using fuzzy-set algorithms, Bayesian reasoning, or political mechanisms to reduce the influence of a specific measure when trustworthiness decreases.
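To make the last option a little more concrete, here is a minimal sketch of the simplest variant in that list, adjusting a classification by the standard error around a score. The function name, the three labels, the scores, and the choice of a 1.96 multiplier are all assumptions for illustration, not any state's actual rule, and this covers only the sampling/measurement-error slice of trustworthiness.

```python
def classify(score, cut_score, standard_error, z=1.96):
    """Label a score relative to a cut score, allowing for measurement error.

    A score counts as "above" or "below" only when the whole band of
    score +/- z * standard_error clears the cut; otherwise the honest
    answer is "indeterminate".
    """
    margin = z * standard_error
    if score - margin > cut_score:
        return "above"
    if score + margin < cut_score:
        return "below"
    return "indeterminate"

# Hypothetical scores against a hypothetical cut score of 300.
for score in (310, 305, 290):
    print(score, classify(score, cut_score=300, standard_error=4))
```

The design choice is to refuse a firm label near the line rather than pretend the point estimate is exact; what to do with the "indeterminate" cases is a policy question the code cannot answer.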
Even if you haven't read Accountability Frankenstein or other entries on this blog, you have probably already sussed out my view that both "don't worry" and "toss" are poor choices in addressing messy data. All other options should be on the table, usable for different circumstances and in different ways. Least explored? The last idea, modeling trustworthiness problems as formal uncertainty. I'm going to part from measurement researchers and say that the modeling should go beyond standard errors and measurement errors, or rather head in a different direction. There is no way to use standard errors or measurement errors to address issues of trustworthiness that go beyond sampling and reliability issues, or to structure a process to balance the inherently value-laden and political issues involved here.
The difficulty in looking coldly at messy and mediocre data generally revolves around the human tendency to prefer impressions of confidence and certainty over uncertainty, even when a rational examination and background knowledge should lead one to recognize the problems in trusting a set of data. One side of that coin is an emphasis on point estimates and firmly-drawn classification lines. The other side is to decide that one should entirely ignore messy and mediocre data because of the flaws. Neither is an appropriate response to the problem.
* A literary reference, not an illiteracism.
August 12, 2009
Belated kudos to Broader, Bolder and to Fordham
In the whirlwind of my obligations this year, my reading has lagged, and I am late in recommending and praising two reports published in the first half of 2009:
- The Broader, Bolder Approach's accountability report, published in late June. This report suggests combining the use of achievement test data and on-site school inspections for school-level accountability. For those who have read Accountability Frankenstein, you'll know that I agree with those ideas. This report addresses the central gap in the original Broader, Bolder manifesto, and I am delighted to have read the proposal.
- In March, the Fordham Institute published a report recommending a scaled approach to accountability when private schools take public dollars. Their proposal is roughly that the more dependent a private school is on public funding, the more the school has to provide data and be accountable in a way similar or parallel to local public schools.
Both are thoughtful, well-reasoned brief arguments, and they move each debate in interesting directions. Whether or not you agree with the conclusions, you'll have things to think about.
Updated: Aaaaargh! Six days later, I realize I've been calling the group the Bolder, Broader Approach instead of the other way around. Dear readers: when I make a stupid error, please point it out as soon as you see it.
Proposed ground rules on teacher evaluation and test discussion
Seeing how too many writers about Race to the Top, tests, and teacher evaluation would have taken actions in the Cuban Missile Crisis that would have led to nuclear war--i.e., seeing the worst in opponents, or maybe seeing posturing as the best path forward for themselves personally or for their positions (sound like the health-care debate-cum-food-fight?)--I am hereby proposing the following ground rules/stipulations:
- The modal forms of teacher evaluation used in K-12 schools are not useful.
- Some aspect of student performance (abstracted from all measurement questions and concerns about flawed tests) should matter in teacher evaluation.
- At least one problem of including student performance in teacher evaluation is how to use messy and flawed data. This comes from the fact that current tests are flawed. Heck, all tests are going to be imperfect and create the dilemma that Diane Ravitch referred to this morning. But plenty of today's tests should embarrass anyone who approved their use.
- Yes, people who disagree with you have used inane arguments, and some of them might even have gotten some provisions through a legislature by logrolling. I know I can say the same about your putative allies. Let's call each other out on those moves, and then move on to the substantive issues. Doing more than calling people out on that at the time (i.e., holding grudges) is playing the game of "your side is dirtier than mine," and you will inevitably lose that game, especially if there's an historian in the room (and in addition to me, there's also Diane Ravitch, Larry Cuban, Maris Vinovskis, and others who can quickly point out where folks have played dirty political pool for decades, though many of us will just call it the standard operating procedure in education politics). See reference above to Cuban Missile Crisis. If Reagan could make an arms-control treaty with Gorbachev, we can all be a little more mature in disagreements.
August 4, 2009
Your personal, homemade commission on tenure and test scores
Sick of finger-pointing in the absence of a New York state commission to study how to use test scores in teacher evaluation (including tenure) decisions? Look no further! In this space, we will be conducting our own homegrown commission over the next three months. No need for the New York Assembly and Senate to act! We'll do it ourselves.
What? you say. You're in Florida. Well, yes, but everyone knows that Florida is just the Southern branch of New York. My father grew up on Flatbush Avenue and graduated from Lincoln High School. He was in New York City for his residency in pediatrics (with an office in Bellevue, but that's another story). The Yankees' spring training home? Eight miles from my house.
And if that doesn't convince you, you should know that Alexander Russo runs his blog on Chicago schools from Long Island. If he can do that, I can run a citizens' commission for New York from here (and then someone in Chicago can run something in Florida).
Apply in comments: name, role in New York education, what you'll bring to the table.
July 27, 2009
Talking turkey on "Race to the Top"
The hoopla surrounding the draft "Race to the Top" guidelines has obscured the long-game strategy involved here. If you think about the structure of the funds--more discretionary money than the U.S. Department of Education has ever had before, competitive grant system, and a set of priorities that the Duncan department has been signaling for six months--there are two guesses I have about the broader goals:
- The double-shot of grants over the next year is intended to be the first of two or three shots of large amounts of discretionary money for the department.
- Duncan's learned about vicarious reinforcement and intends to use it here.
The obvious initial "winners" will be states such as Florida which have a number of the required elements in place and are ready to go on a few payoff projects. But there will also be a few very large states left in the cold (and without that extra funding) after these first two rounds of awards. What if California is one of those states out in the cold? Or New York? There will be local pressure from school boards and administrators on members of Congress to continue feeding money to the department until their states land at least one award.
In the long game, the fact that Race to the Top can't bail California out is not really the issue, and I disagree with Mike Klonsky's assumption that this is an attempt to starve the states into submission. While I think a number of people would have preferred a larger ARRA stimulus fund, I don't think you can claim that the Obama administration has acted at all as if it wants thousands of teachers fired. Far more likely is the ordinary political dynamics of federal programs: no one wants to be without a slice of the pie. For these reasons, if it were legal to place a bet of this kind, I'd give rather interesting odds that California loses out big in the first two swats at Race to the Top money.
And speaking of misdirected Mikes, Mike Antonucci is wrong about the teachers union dynamics in Race to the Top. While my higher-ed local has both the AFT and NEA as affiliates, I'm generally out of the loop on national headquarters stuff, but I can see the writing on the wall: one of the unions may well push in the regulatory process to increase the leverage of state affiliates, not to eliminate the requirement on linkability of teachers to student data. The best thing that the national affiliates can do is help state affiliates' negotiating position with their own state departments of education. If two states' applications are similar, but only one has a letter of support from their state affiliate's (or affiliates') elected officers, both the NEA and AFT need the state with union support in the application to have an advantage. (There are some interesting dynamics here vis-a-vis merged state affiliates, but the larger incentive at the national level is to help all state affiliates.)
July 25, 2009
Temporizing and teasing on tests and teacher evaluation
I still don't have time to expand at length on combining qualitative and quantitative sources of data for teaching evaluation, but given the hoopla surrounding the draft Race to the Top regulations, I should at least provide an update, or rather a bit of a tease for what's developing into a short paper-to-be. In addition to my fairly general understanding of some technical issues, I'm developing the argument that any point-based system for combining professional judgment and test scores needs to avoid fixed weights for the components of the system.
The explanation is not that technical, and I can sketch it here: the benefit of a truly Bayesian approach to using test scores to evaluate teachers is a reciprocal relationship between the decision-making authority of professional judgment and the power of other data (including test scores). A forceful judgment by professionals reduces the power of test scores in such a system, while tepid judgments increase the power of test scores. That is one possible solution to the thorny question of relative weights: if educators are willing to judge their own, test scores are less important (addressing the concerns of teachers unions and many administrators), but if educators are not willing to judge their own, test scores are more important (addressing the concerns of those criticizing the very low proportion of teachers given poor evaluations).
In a point-based system with fixed weights (or fixed percentages of the total) assigned to individual components, you don't have a structure with a reciprocal relationship between the exercise of professional judgment and the authority of test-score data. But I think the dynamic benefits of a Bayesian approach can be created in a point system, as long as the weights are not fixed. I need to think through the potential approaches, but it's possible.
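The reciprocal relationship is easy to see in a few lines of arithmetic. This is a minimal sketch using Bayes' rule in odds form; the two prior probabilities and the likelihood ratio are hypothetical numbers I picked for illustration, and nothing here is a worked-out evaluation system.

```python
def posterior(prior, likelihood_ratio):
    """Update a probability with Bayes' rule in odds form."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical: test data that count 4-to-1 against "this teacher is effective".
likelihood_ratio = 0.25

# Hypothetical priors: a tepid professional judgment and a forceful one.
shifts = {}
for label, prior in [("tepid", 0.60), ("forceful", 0.98)]:
    post = posterior(prior, likelihood_ratio)
    shifts[label] = prior - post
    print(f"{label}: {prior:.2f} -> {post:.2f} (shift {shifts[label]:.2f})")
```

The same test data move the tepid judgment a long way and the forceful judgment only a little, which is the behavior a fixed-weight point system cannot reproduce.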
There: that's the tease.
July 13, 2009
AFT QuEST presentation slides on performance pay
I am not in DC, but I do catch things online: the presentation slides for the AFT QuEST session on performance pay are available, and while Edward Tufte thinks Powerpoint is awful, a stack of straightforward, well-written slides provides a wonderful vicarious outline for those of us who Were Not There.
July 10, 2009
Those evil union supporters who denigrate objective measures...
Quick: who said the following recently?
We do see the incredible power of setting stretch goals. But if you set a goal that's really not within reach, people will just give up on it and you really don't have a goal. We've seen this over and over. I think there's as much talking down of goals around here as there is of actually saying, "You're not thinking big enough."
Oh, this evil denigrator of the value of objective goals. From the text, you might conclude that this person is a teacher union supporter who will die before wanting to break down the firewall between teacher records and student test scores.
Except that the speaker was Wendy Kopp, head of Teach for America and someone who said later in the interview that she is an advocate of using data and setting goals. But there's an important piece here about motivations and goals. No, I don't have answers for the K-12 world, but as I will continue to state until someone proves me wrong, there is something deeply wrong when an historian knows more about the relevant goals and motivation literature than most of the people who advocate setting extremely high goals in education.
Combining qualitative and quantitative evidence for teacher evaluation: What does "predominant" mean?
According to Gotham Schools, former NSVF and current USDOE official Joanne Weiss "said the Obama administration aims to reward states that use student achievement as a 'predominant' part of teacher evaluations with the extra stimulus funds" (emphasis added). I followed up with a USDOE representative, who emphasized after talking with Weiss that she meant a predominant part, not the predominant part of teacher evaluations, and that is how Walz reported the comment. The department representative added that department leaders "consider it illogical to remove student achievement from teacher evaluation, and we want states and districts to remove any existing barriers."
This came on the heels of TNTP's Widget Effect argument and Joan Baratz-Snowden's Fixing Tenure. I know that the political context of Weiss's remarks is to push the Duncan line that New York State's moratorium on the use of test scores in personnel decisions is wrong, and if necessary Weiss will bar New York from the Race to the Top funds if the legislature doesn't get its act in gear. Stand in line, please; I have a feeling a few million New Yorkers have the first dibs on dunking the entire state senate in the Hudson near Albany sometime in late November.
Back to policy, though: the word predominant perked up my ears because the Florida legislature's language has evolved from language involving the dominance of student achievement to quantification. The current language on personnel evaluation is a legacy of language first written in 1999:
The assessment must primarily use data and indicators of improvement in student performance assessed annually as specified in s. 1008.22 and may consider results of peer reviews in evaluating the employee's performance. [emphasis added]
The current performance-pay language in Florida has the Merit Award Program which stipulates that for the purposes of merit pay, achievement data "shall be weighted at not less than 60 percent of the overall evaluation" (F.S. 1012.225(3)(c)).
I need to think about this in some depth, but it strikes me that the Florida legislature mandated one of several options to use in combining quantitative and qualitative judgments of teacher effectiveness, the point system. You can probably come up with other variations that meet the statutory language, but my guess is that almost any real-world implementation would be a linear combination of different subscores, and I will use incredibly technical measurement language to call it the point system of combining different sources of information about teaching effectiveness. But that's not the only one, and I am always troubled when a clunky system is chosen as the default because it is the first option rather than a deliberate decision among options. I understand why a point system is in the bureaucratic and political gravity well, and it may well be that this particular clunky point system is the best option. However, it should be considered in comparison with what other clunky systems might be appropriate.
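For anyone who wants the point system spelled out, here is a minimal sketch of a fixed-weight linear combination. The 60 percent floor comes from the statutory language quoted above; the two subscores, the 0-100 scale, the function name, and the sample numbers are my assumptions for illustration, not anything a district has adopted.

```python
def point_system_score(achievement, judgment, achievement_weight=0.60):
    """Fixed-weight linear combination of two subscores (assumed 0-100 scale)."""
    if not 0.60 <= achievement_weight <= 1.0:
        # The quoted merit-pay language sets a floor of 60 percent.
        raise ValueError("achievement weight must be between 0.60 and 1.0")
    return achievement_weight * achievement + (1 - achievement_weight) * judgment

# Hypothetical teacher: middling test-based subscore, strong evaluator rating.
print(point_system_score(achievement=70, judgment=90))
```

Note what is fixed here: the weight is the same whether the evaluator's judgment was emphatic or lukewarm.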
For example, there is also the holistic review of teacher effectiveness, such as exists in the new Green Dot-UFT collective bargaining agreement teacher evaluation system. There's no specific way that test scores inherently enter the judgment as such, though the implication is that teachers will have to show that they use assessment to shape instructional practices (what's called action research in the document, at the very least).
But those aren't all: a flow-chart is at least theoretically possible, though I do not have a real-life example. Yes, there are process flow-charts such as exist in Denver (and in the Green Dot system), but those are flow-charts essentially describing when and how you schedule meetings, not how you make decisions in a meeting. (Step 1: Can you understand this chart? Yes: read the rest of it while walking to your secretary's desk; no: pretend to read it while walking to your secretary's desk. Step 2a [at secretary's desk]...)
Most theoretical: a Bayesian bump algorithm. I am guessing that there is a high probability that any subjective Bayesian statistician reading this blog will have thought of this idea already, but I'll adjust that guess after some data comes in. Since even well-trained evaluators are making subjective judgments about people, you could treat a principal's or peer's judgment as a prior judgment about the probability that a teacher should be retained/rewarded, given help, or fired. In the Bayesian world, that prior judgment can and should be shifted based on data, to form a posterior estimate of the probabilities of what should be done (you can play with a Bayesian calculator here, in a medical-test context). That adjustment is why I'm calling it a "bump" -- start with a professional assessment on various grounds and allow that to be bumped somewhat by test data, with the magnitude of the bumping depending on the data. Going down this path would involve some interesting studies, and it would probably be working with Bayesian posterior odds (which provide an interesting possible back door to a point system). This is a little out of my league in terms of specific characteristics, but the Bayesian perspective on statistics makes it possible to combine qualitative and quantitative data in a framework that already exists.
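Here is a minimal sketch of the bump, working in log-odds so that each piece of evidence adds to a running total. That additive form is the "back door to a point system" mentioned above. The prior probability and the likelihood ratios are hypothetical numbers for illustration; where real likelihood ratios would come from is exactly the kind of study this path would require.

```python
import math

def bump(prior, likelihood_ratios):
    """Shift a prior probability by a sequence of likelihood ratios.

    Works in log-odds, so each piece of evidence simply adds "points".
    """
    log_odds = math.log(prior / (1 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical: a principal's prior probability that a teacher should be
# retained, bumped by two years of mildly favorable test data.
bumped = bump(0.70, [1.5, 1.5])
print(round(bumped, 3))
```

A likelihood ratio of 1 adds zero points and leaves the professional judgment untouched; ratios far from 1 bump it hard.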
So we have four large categories of ways to combine essentially qualitative and quantitative data. While I am busy reading student work and doing other stuff in the next week, you all have a chance to dive in and describe what you think are strengths and weaknesses of each approach, as well as any additional categories (or disagreements with my classification entirely). After I have a weekend and get other tasks finished, I will return to explain (a) why a Bayesian approach is not only philosophically appropriate but serves the needs of unions, students, and anyone Alexander Russo describes as reformy; (b) why a Bayesian approach is not that different from a point system, at least in theory; and (c) what characteristics you would look for in a point system for teacher evaluation to meet the political interests described in (a).
July 8, 2009
A word to the wise on accountability
Dear fellow Americans who support equal education and are inclined to attack teachers unions when you get frustrated (e.g., Charles Barone and Citizens' Commission on Civil Rights):
- Borg-like rhetoric ("Those who resist the school reform movement are going to find they are on the wrong side of history. They may affect the pace of reform, but not its inexorable direction") is not likely to convince anyone that they're wrong and you're right. It's not even close to the level of Rod Paige's NEA = terrorist remark, but it's still intemperate. And I don't know about you, but the last degree I earned came with a beautiful, shiny rearview mirror, not a crystal ball.
- I'm persuadable that NEA staff and national leaders made some incredibly stupid/venal moves in trying to shift policy in the backrooms of power (which apparently are no longer smoke-filled), that the AFT may have made (fewer) such moves, and that locals and state affiliates of both national affiliates also make stupid/venal moves at varying rates depending on location and internal union politics. But a report that essentially treats policy concerns and backroom politics as identical? It strikes me as shoddy analysis, for several reasons. First, it's scattershot, which undermines the credibility of what probably would be stronger arguments on more narrow grounds. Second, it misunderstands the nature of organizations, assuming that unions have intentions rather than internal politics, agreed-upon positions, strategies, and tactics. Third, if you criticize both regular and backroom politics, you're implicitly committing yourself not to do much politicking on your own part.
Every few years you see a wavelet of attacks on teachers unions, and I am assuming that this is part of a new one. Sometimes it's just a coincidence, and I hope that's the case in the entries linked above... and here.
Addendum: Charles Barone takes me to task on two items; in comments I say he's right on one and wrong on the other, but you'll have to read what he writes rather than my summary.
June 30, 2009
Grading reports that grade states, which have schools that grade
It's now a PR cliche in education wonkery: grade states. Issue grades, and that's a hook for reporters to write stories about the reports, because the reporters at daily metros can say, "[Your state's name here] receives 'F' in think tank report on education." But beyond the PR value of grades, it's facile, which is why I'm surprised Education Sector gave in to this particular venial sin in its report on states' higher-ed accountability policies. C'mon folks: can't you figure out a more substantive way of evaluating states? At the very least, this is so 1990s.
So I'm thinking about developing a report over the next year that grades think-tank reports that issue grades for states on some matter of education, where of course schools have teachers who grade students. Among the standards will be the following:
Clear standards for grades: a year before the report is issued, does the entity that issues the report publish grading standards or criteria?
A - Entity publishes grading standards with sufficient criterion specificity that an outside observer would not be surprised at the grade a state receives the next year. (Note: this is a low bar, not requiring agreement with grades.)
B - Entity publishes standards, but standards are too vague to provide benchmarks for policy progress.
C - Entity has previously published reports issuing grades to states, but changed the standards, or described the project and the areas where states would be graded, but published no standards for those areas.
D - Entity has previously published the existence of the report project, but there is no previous publication of intent to grade states in this area of policy.
F - Report appears out of the blue with no publication of intent in this area.
Okay, folks: where does today's Education Sector report fit? How about Ed Week's annual Quality Counts phonebook? Fordham's reports that issue grades?
And, yes, if I'm serious about this, that implies I have to develop some more grading criteria. After all, it would be most interesting and ironic if I created a report that contained the mechanism by which the report itself could be torn apart. Hint, hint, ...
June 26, 2009
How to steer CYA-oriented bureaucracies, or why NCLB supporters need to think about libel law
Someone at USDOE sent me an invitation to listen to the June 14 phone conference where Arne Duncan explained how disappointed he was in Tennessee, Indiana, and other states with charter caps, let alone states such as Maine with no charter law, and how that disappointment might be reflected in the distribution (or lack of distribution) of "Race to the Top" funds (applications available in October, due in December, with the first round of funding out in February 2010). There are a few details that reporters didn't ask about (Duncan's somewhat surprising statement that a good state charter law would set some barriers for entry rather than establish a "Wild West of charter schools," and the way that small charter schools and charter schools with grade configurations outside state testing programs can stay off the radar for accountability purposes), but I was not surprised that two Tennessee reporters were called on for questions.
But apart from the selection of reporters for questions, the phone presser and other DOE moves made me think about the various uses of power in education-policy federalism. In limited ways, explicit mandates can be effective, if there is a sustained willingness within the USDOE (and esp. OCR) to make painful examples of the nastier school systems that try to evade those mandates. Offering technical assistance is another method, and despite the massive conflict-of-interest problems in Reading First, I agree with one of the researchers in the field who thinks that Reading First did improve primary-grade reading instruction, on balance. (Thumbnail version: hourslong scripts, ugh; explicit instruction in phonemic awareness and some other fluency components, obviously necessary.)
But neither heavyhanded mandates nor technical assistance can do everything, and neither works with the greatest motivation for both defensive and hubris-oriented bureaucracies: risk management. If you are a public school teacher or administrator, my guess is that you can identify some fairly silly action by your district that was motivated almost entirely by CYA motives, and if you can marry those CYA activities to pedagogy, you've been lucky or have a black belt in administrative maneuvering. (If you have such victories, please describe them in comments! Otherwise, we'll all wallow in the shared misery of observing defensive administering and the all-too-frequent ensuing train wreck.)
I think the federal government can shape bureaucratic behavior to the good by using that risk-management impulse and structuring accountability policies around it. And here's the lesson I take from my high-school journalism class in ninth grade 30 years ago: libel law in the U.S. generally recognizes the truth as a positive defense against libel allegations. That seems like a backwards way to frame the legal issue -- after all, isn't it common sense that a publication is libelous only if it's false? -- but the notion of a legal positive defense gives an individual or organization a way to organize behavior that is both professionally appropriate and the basis of a legal defense aligned with professional expectations. Because the truth is a positive defense against libel claims, even an idiotic general counsel for a newspaper or publisher looks to the professionally-appropriate standard: is there documentation that the published work is true?
Sometimes a positive defense is not explicitly part of jurisprudence but evolves as practical guidance for clinical legal work and internal advice for school systems. Observing procedural and professional niceties creates exactly that type of positive defense in special education law. There is nothing in federal special education law to carve out an explicit positive defense for school system behavior, but many articles written by Mitchell Yell over the past few decades constitute a convincing case that school systems now have a de facto positive defense: professional documentation of decisionmaking and scrupulous adherence to procedural requirements are a positive defense against a broad range of allegations by parents of and advocates for students with disabilities.
Yell has argued (persuasively) that due-process hearing officers and judges use procedural adherence and professional documentation as a filter in special education cases.
If a school district can document that it has paid attention to procedural mandates and has met professional standards for documenting decision-making, then hearing officers and judges are extremely reluctant to look at the substantive merits of those decisions. But if a school district has ignored standard procedural expectations that most districts meet, or if a school district has kept no or inadequate documentation of its decision-making rationale, then all bets are off and a hearing officer or judge will be much less likely to defer to the school district on professional judgments.
In essence, Yell implies, school districts can avoid adverse judgments if they pay attention to timelines and other procedural niceties and if they keep teachers and principals on their toes about current "best practices" as well as deadlines, notices, etc. Not all districts are aware of this positive defense, or I suspect that some enterprising special education researchers could make a mint running seminars, "How never to get sued again."
More broadly, I'm beginning to think that the construction of a positive defense against charges of incompetence would be healthy for school systems and state policies. The devil would definitely be in the details, but instead of being frustrated by a consistently observed school system behavior, maybe we should take advantage of that consistency.
June 25, 2009
See-no-knowledge in education policy?
I seem to be reading several "we don't know anything so let's plow ahead" arguments in education think-tankery, from Mike Petrilli's argument that because we don't currently have a solid research base about how to turn schools around, we shouldn't try, to Kevin Carey's consistent argument in Education Sector's blog that because there is no research consensus about predictors of good teaching (and considerable research suggesting that there is not a link between effectiveness and countable items like years of experience beyond the first few or graduate degrees), it makes better sense to let people into teaching and then evaluate their effectiveness.
Fortunately, that's not the approach of the Institute of Education Sciences under John Easton, which has just announced a large research initiative on turning around schools. I suspect that both Petrilli and Carey would acknowledge that research in difficult topics is a good thing and argue that IES initiatives are different from policy, because sometimes you have to make decisions based on the state of knowledge you have, not the ... oh, shoot, there's Donald Rumsfeld phrasing again. But you probably know what I mean: Petrilli and Carey's stances are policy stances based on topic-specific agnosticism, not opposition to research.
But there's a serious question buried here: on big questions of policy, where you have to make choices, and the research is nondirective, how do you make decisions? I think the answer has to be incrementally, to allow research to catch up and influence policy later. If you make a huge political and institutional commitment to a policy path that has no research support and no ethical/legal obligation, then you're committing millions of children and hundreds of thousands of educators to a path that is very hard to change later.
For that reason, while I think Arne Duncan's four-choice speech earlier this week is not based on research, and Petrilli is correct that there is no particular reason to believe that charter schools will somehow rescue the education of students otherwise stuck in horrible circumstances, the policy itself is good largely because it doesn't make hard and fast commitments to a particular path. The good thing about a charter is that it can be revoked, and in states such as Florida where there is a single authorizer for a geographic area (here, the county school boards), authorizers can be reasonably aggressive in shutting down shady or incompetent operations. So I share Petrilli's skepticism, but precisely because I am skeptical of any particular approach to schools in crisis, and because Duncan is being wishy-washy, I will applaud the Secretary for being wishy-washy.
Update: I first used the term "know-nothingism" in the title. Ugh. Bad move for an historian. Petrilli and Carey are not members of the 19th century anti-immigrant party. Mea culpa.
June 18, 2009
The world is complicated, part 752
So the Center for Research on Education Outcomes has a report on charter-school performance, the Center on Education Policy has released a report on student achievement trends, NAEP released art-education data, and the spin has begun. Missing from almost all the reporting: Statements about the extent of peer reviewing for any of these reports. I'm not too worried about the professionalism of these reports, since I know that the Department of Education always has an internal review process, CEP usually asks researchers in the area to review draft reports, and I would be surprised if CREDO did not have a pre-publication review process. However, the failure to report on the extent of peer review is a continuing and glaring omission in the reporting of education research.
In terms of the substance of the reports, I'm up to my eyeballs in prior commitments, but it's clear from the brief reading I have been able to do that the findings for all three reports are more complicated than the spin emanating from many of The Usual Suspects.* That's not news, I know, but I am the King of Things That Are Obvious Once He States Them, and I have a job to do.
* a great name for an a cappella group, if you happen to be starting one up.
June 13, 2009
On graduation rates and auditing state databases
I sympathize with Florida's Deputy Commissioner of Education Jeff Sellers, finding himself defending the state's official graduation rate the week that Education Week published its Swanson-index issue and pointed to Florida as a low-graduation state, using numbers far below the state's official numbers.
Some perspective: Florida's official graduation rate is inflated, but it's still better than Swanson's. Florida's graduation rate does more than Swanson (i.e., does anything) to adjust for student transfers and the fact that ninth-grade enrollment numbers overestimate the number of first-time ninth graders.
Because of Florida's state-level database and the programming/routine that already exists, Florida is much closer to the new federal regulatory definition of a graduation rate than many other states, and Commissioner Eric Smith has been preparing the state board and other interested parties for the likely effect of the change on the official published rate -- i.e., that the rate will be a visible quantum lower than the currently-published rates (and largely for the reasons I have explained in the 2006 paper linked above). So in a few years we'll get an estimate that is closer to a lay understanding of graduation (the proportion of 9th graders who graduate 4, 5, or 6 years later).
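A rough sketch of why those adjustments matter, in Python. This is not Florida's formula, Swanson's index, or the federal regulatory definition; the function names and every number below are hypothetical, chosen only to show the direction of each effect.

```python
def naive_rate(graduates: int, ninth_grade_enrollment: int) -> float:
    """Graduates divided by everyone enrolled in ninth grade four years
    earlier, including students repeating ninth grade."""
    return graduates / ninth_grade_enrollment


def adjusted_cohort_rate(graduates: int, first_time_ninth: int,
                         transfers_in: int, transfers_out: int) -> float:
    """Graduates divided by the first-time ninth-grade cohort, adding
    students who transfer in and removing those coded as transferring out."""
    cohort = first_time_ninth + transfers_in - transfers_out
    return graduates / cohort


# Hypothetical school: 1,000 ninth graders enrolled, 120 of them repeating,
# so 880 are first-time ninth graders. 640 students graduate.
print(f"naive:    {naive_rate(640, 1000):.3f}")
print(f"adjusted: {adjusted_cohort_rate(640, 880, 60, 90):.3f}")

# If 40 of the 90 "transfers out" were really dropouts with a wrong exit
# code, only 50 students should have been removed from the cohort.
print(f"audited:  {adjusted_cohort_rate(640, 880, 60, 50):.3f}")
```

In this made-up case the naive figure understates graduation because of the ninth-grade bulge, while unverified transfer codes push the adjusted figure up, which is why the accuracy of exit codes matters so much.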
The point in the St Pete Times interview where I winced was Sellers's answer to the question of how the state (and the general public) knows that the exit codes entered for a student are accurate: Sellers said that his department conducts an "audit from a data perspective."
That statement is misleading. It is technically true that there is an audit in two senses: each school district is required to check its data for accuracy before sending the data to the state's servers, and the state conducts a search of students reported as withdrawn in one county to see if they entered another county system before labeling them dropouts. But while I have seen reference to checking that the withdrawal codes are correct, I have not seen any evidence that such checks have actually occurred, and I have been unable to find that evidence anywhere on the Florida Department of Education website. That doesn't mean that it doesn't happen, but call me a touch skeptical. Without random checks, there is no guarantee that a 16-year-old coded as a transfer to another school actually was a transfer.
Given Florida's long experience with a state-managed education database, the lack of published audits of this process should caution us about the magic of state databases. They are important, but they need to be done properly. It makes sense to talk about the internal and external checks that should happen as other states construct databases and all states start to conform to the mandated longitudinal graduation rate:
- Districts will need to be the first party to check accuracy, both by preventing mistakes/fraud and by conducting consistency checks--are there any records which claim that a 45-year-old is attending kindergarten, for example? The first is supposed to happen in Florida, and I suspect that counties catch the low-hanging fruit in terms of errors. But the accuracy check on withdrawal codes is the type of check that requires extensive follow-up to document whether a student identified as a transfer did in fact enroll in another school.
- States will also need to conduct accuracy and consistency checks, though a state will necessarily be far less likely than school districts to catch outright fraud in claiming students transferred when they did not.
- States will also have to conduct the cross-checking that Florida currently performs every year and that I describe above: which students move between districts in the same state, but are counted as dropouts because a county only looks at its own students.
- Finally, the auditing of transfer records would be MUCH easier if there is a standard way for school districts and individual schools to request the transfer of a student record and simultaneously use that authenticated request as verification that a transfer code is appropriate.
This is an incomplete list, but it's a start.
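For the programmers in the audience, two of the checks in that list can be sketched in a few lines of Python. The record layout, field names, the "TRANSFER" code, and the age tolerance are all hypothetical; no real state database is being described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StudentRecord:
    student_id: str
    district: str
    age: int
    grade: int  # 0 = kindergarten, 12 = twelfth grade
    withdrawal_code: Optional[str] = None  # None means currently enrolled


def implausible_age(record: StudentRecord, slack: int = 4) -> bool:
    """Consistency check: flag ages far from the usual age for the grade.

    Assumes a typical age of grade + 5; a flag means "look at this record",
    not "this record is wrong".
    """
    return abs(record.age - (record.grade + 5)) > slack


def unverified_transfers(records: List[StudentRecord]) -> List[str]:
    """Cross-district check: students coded as transfers who have no active
    enrollment record in any other district in the same data set."""
    enrolled = {}
    for r in records:
        if r.withdrawal_code is None:
            enrolled.setdefault(r.student_id, set()).add(r.district)
    flagged = []
    for r in records:
        if r.withdrawal_code == "TRANSFER":
            elsewhere = enrolled.get(r.student_id, set()) - {r.district}
            if not elsewhere:
                flagged.append(r.student_id)
    return flagged


records = [
    StudentRecord("A1", "County X", 45, 0),               # 45-year-old in kindergarten
    StudentRecord("B2", "County X", 16, 10, "TRANSFER"),  # shows up in County Y
    StudentRecord("B2", "County Y", 16, 10),
    StudentRecord("C3", "County X", 16, 10, "TRANSFER"),  # shows up nowhere
]
print([r.student_id for r in records if implausible_age(r)])
print(unverified_transfers(records))
```

A flag from the second check does not prove fraud or a dropout: a student may have moved out of state or to a private school, which is exactly the kind of case that needs the follow-up described above.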
June 8, 2009
No one ever accused Arne Duncan of impersonating an education researcher
Hopefully some day we can track kids from pre-school to high-school and from high school to college and college to career. Hopefully we can track good kids to good teachers and good teachers to good colleges of education.
This was an excerpt from a speech Duncan gave today to IES staff about the need to use data warehouses to link individual teachers and test scores and then use that linkage to evaluate teachers (hat tip). Oh, yes, and do it based on research. Some day, Secretary Duncan, but tying an individual teacher to student performance is not something that you can assert is based on research available today. It is more wishful thinking than anything else. The best apparent on-the-ground research of this type with teacher education is nonetheless full of caveats. And that's on a program-level scale, not on the level of the teacher.
I'd accuse Duncan of spouting fuzzy logic, but fuzzy logic (the real stuff, research-wise, using fuzzy sets) may be one tool we use to get out of this dilemma.
June 1, 2009
The Procrustean bed of teacher tests
Mike Petrilli's stab at the Sonia Sotomayor nomination via the Massachusetts teacher tests is a little askew, and I'm surprised he didn't look at an obvious dilemma that's deeper than the politics of a judicial nomination. Several former teachers have sued the state (and Pearson) for what they claim is a discriminatory impact of teacher tests given the disproportionate failure rate of minority teachers. This is the employee side of impact-analysis law that most school lawyers probably know better under the graduation-exam cases in Florida and Texas.
The landmark case here is Debra P. v. Turlington, which led to a number of federal decisions that guide the use of tests that have disparate impact in schools. To wit, tests with disparate impact by protected classes are acceptable if...
- There is a rational state purpose for imposing them (guarantee graduate skills, in the Debra P. case)
- There is sufficient notice to those affected
- Those affected have a reasonable opportunity to learn the material on the test (the key reason for delaying graduation test applications in Florida, where federal judges did not want to hold the victims of segregation responsible for the unconstitutional behavior of schools)
- The application of the test is professionally done (I'm bundling together several separate issues, including the composition of the test, defensible setting of cut scores, multiple opportunities to retake the tests, etc.)
- There is no better way to meet the state's purpose that also reduces the disparate impact.
In the employment context, Petrilli is probably correct that the translation of the first item is essentially whether the test is a reasonable proxy for necessary teacher qualifications. But there is almost no way that anyone engaged in the current debate over teacher qualifications can defend these tests or defend the teachers' lawsuit without having some fairly severe inconsistencies.
Consider first the folks who take the approach that we should not care who enters teaching as long as we measure student achievement and make personnel decisions as a result. Several (whom I will not name to protect the guilty) have accused the Highly Qualified Teacher standards in NCLB of obsessing about inputs (i.e., what teachers know) in contrast to outputs (what students learn). Anyone in this camp should abhor the Massachusetts teacher tests (and all teacher tests) because they continue the "let's look at the teacher qualifications absent the kids" approach, and we should be moving away from proxies for teacher effectiveness.
But the lawyers for the teachers and their supporters are not in much better shape, logic-wise. It is going to be very difficult to knock the legs out from under the state's teacher testing program. They have to argue that the tests are a poor proxy for teacher skill, or that the tests were poorly constructed, or that there is a better option with a reduced disparate impact. If they cannot convince a judge that the tests were constructed and administered unprofessionally, the lawyers are going to be in the uncomfortable spot of arguing that the testing is an inferior proxy for judging teacher quality, in contrast to ... [The conclusion is left as an exercise for the reader.]
Summary: If you are in favor of judging teachers by student learning, then testing content knowledge is a poor proxy by your own arguments. If you are against the content-based testing, then you have to come up with a better standard that will hold up in court. No, I don't think there's a way out of this for anyone with skin in the game, but if there is no summary dismissal and no evidence of rank incompetence in test construction, the fireworks will be interesting to watch.
Texas, South Carolina, Missouri, and Alaska
I know that the reports of the common-standards agreement shepherded by the Council of Chief State School Officers and the National Governors Association describe a few different reasons for why four states have not joined in a standards framework that is probably going to be about as close to a less-is-more approach as one can get in a bureaucratic standards document. Yes, I know Texas has just drafted standards (as has Florida, which is joining), that Missouri is searching for a new state superintendent (my guess is others are as well), that South Carolina has Mark Sanford (which is enough for any state to deal with), and that we haven't heard from Alaska. But here are my imaginary real reasons for why these states have opted out (thus far):
- Other states refused to agree that everyone in the country would have to pronounce Harry Truman's state as mizZURah.
- Texas would have to admit that bidness is not a word.
- South Carolina did not get its way that there would be history standards with the required benchmark, "All six-year-olds will understand that each state is required to have at least one completely nutty elected official at all times, and this is a heritage of the Founders."
- There was a riot, not when Alaska insisted that NAEP math exams all use the Iditarod as an example of measure, rate, and general all-round toughness (other states just wanted to add their own events), but instead fisticuffs broke out when the Alaska rep. insisted that the current accepted size of the Earth was incorrect because if it was as large as most people thought, then you couldn't see Russia from your house.
Unfortunately, I suspect that the truth is far less entertaining. That's okay. We still have Joe Biden and George Will to mangle the facts in an interesting way.
Addendum: Lest anyone think I am making fun of other states, I should be very clear: I grew up in California in the 1970s, and I now live in Florida. That's enough ridiculous states to live in for a lifetime!
May 12, 2009
Should artists know something about money?
It's cringing time for this union activist: "Teaching is an art, not a business," wrote Hans, commenting this evening on a story about a judicial mandate prohibiting a UTLA one-day strike this Friday. That statement is irrelevant in the specific context (teacher layoffs), is a false dichotomy, and is wrong-headed in other ways. Let's start with the literal claim that art is incompatible with business. The daughter of a friend and colleague went to SMU on a dance scholarship. She was smart and after a minor injury decided to get some business training and is now an administrator in an art-related New York nonprofit. Artists and non-profits need people who are passionate about art and can also manage money (ask members of the Florida Orchestra, which I hear is surviving today in this economy because its new executive director is very competent).
Or to take another example, there's a wonderful segment of Stuart Math's documentary on desegregation in Shaker Heights, Ohio, where one of the old-time activists describes a post-WW2 meeting of residents who were trying to figure out how to create a stable housing market, and a business owner said, "You know, we can be liberal and effective, too." And they were, running a neighbor-managed real-estate outfit that was crucial in maintaining a stable, desegregated, prosperous community.
So much for the claim that art can't be business and warm-hearted liberals can't think in terms of getting stuff done. But the whole premise is wrong; I don't think teaching is an art. You can make a good argument that teaching is a craft, but there has to be solid practice at the bottom of it. In addition, anyone who is skeptical of the value of high-stakes testing, as I am, has to have something that's just a tad, a teeny, a tiny bit more astute than a statement that screams, "Just let me do what I want when I'm paid with the public purse." That's nuts, both philosophically and politically.
May 11, 2009
"Governance reform" is not reform
While New York rages over mayoral control, which is all the rage, schools in Pinellas County are headed towards The New Site Based Management, which was the rage in the late 1980s and early 1990s and which Bill Ouchi hopes will be the rage again.
While there are plenty of ways that governance can affect the classroom, I am consistently underwhelmed by the argument that governance reform improves what happens in the classroom. I've seen it all before.
May 5, 2009
Florida could still jump forward on end-of-course exams
The St. Pete Times is reporting that the death of the Florida House bill mandating end-of-course (EOC) exams in high school starting in science is the death of end-of-course exams, at least for this year. I'm not so sure. If I remember correctly, the legislature authorized EOC exams in principle last year, and there is an alternative funding mechanism: stimulus dollars. Embedded in the stimulus bill is section 14006, which is part of the $5 billion discretionary amount given to the U.S. Department of Education. The state's application for state stabilization funds probably satisfies the nominal requirement for Florida to be eligible for a state incentive fund, if the state asks for incentive funds to develop EOC exams. This is precisely the type of project that the state incentive fund is designed for: it would replace the single comprehensive test with a number of tests tied to specific courses. And instead of upsetting science teachers whose subjects (such as physics and earth sciences) were not included in the first round (the filed bill in the House excluded them), the state could develop a full range of EOC exams in science. Seems like an obvious "yes we'll do that" to me.
I could be wrong; there may be legitimate reasons not to apply for state incentive funds to develop EOC exams. What surprises me is that during the legislative session, there was no public discussion I am aware of about the possibility of using federal stimulus dollars to develop EOC exams. I have heard nothing publicly at all about this, yet it's been an obvious possibility, at least to me. Has any reporter asked Commissioner Eric Smith about this? Is there any legislator or legislative aide who has asked about it?
April 6, 2009
One teacher's response to Ron Matus's article
There's been lots of coverage of the Ron Matus story March 29 on firing teachers in Florida, but there's been no follow-up online about the letters to the editor that were printed April 4 (last Saturday), and at this point, I can't even find the letters on the Times website. But I think one needs to be highlighted, because it's from a teacher and makes a few important points:
The premise in the article [by Ron Matus] is that tenure makes it too hard to fire bad teachers, yet the few examples given don't demonstrate that, but rather, simply show inaction on the part of school districts.
If the writer had found districts attempting, but failing, to fire bad teachers, he might have a point. I see this drive to get rid of tenure as an effort to instill fear in teachers and keep them silent. Teachers living in fear for their jobs can't afford to speak out.
Getting rid of tenure (read: due process) might make it easier to dismiss the rare teacher who shouldn't be in the profession. It would also make it easier to dismiss the good teachers--even the great ones, because the great ones are the ones who stand up and advocate for their students, themselves and their profession, and in doing so sometimes step on toes...
John Perry, Tampa
I've known John Perry for a number of years; he's an activist in the Hillsborough Classroom Teachers Association, but I don't think he was when we met. I think Perry's wrong about the order of magnitude of "the rare teacher who shouldn't be in the profession" (emphasis added), but since a good portion of teachers leave the field within a few years, I don't think that there's a shortage of ways to discourage teachers from continuing.
More broadly speaking, I think more sophisticated critics of teachers and their unions understand that administrators are the ones who fail to fire teachers, but Perry's other point is important: while K-12 teachers do not have academic freedom in the same sense that higher-ed faculty do, they're the ones I often hear a certain style of reformers praise for precisely the type of dissent that would be in danger without due process.
So let me phrase the question in the following way: does anyone want administrators to be able to fire teachers summarily after teachers do the following?
- Refuse to change a grade to let an athlete play.
- Complain that the new math textbook series is confusing to new teachers and likely to lead to poor teaching.
- Sign and date a request that a child be evaluated for eligibility for special education services.
- Complain when girls have fewer opportunities than boys.*
As far as I am aware, the only case above for which K-12 teachers are clearly protected when they speak out is the last one, and that's because of a Supreme Court decision stemming from Title IX; I suspect that they are likely to be protected if they push for assessment to gain services for a child, but I don't know of anything as clear-cut as a Supreme Court decision. And I don't see people who are in favor of "tenure reform" rushing to replace workplace due process with greater whistleblower protections.
April 1, 2009
Sharpton paid off? Please tell me this is an April Fool's joke
The New York Daily News is reporting this morning that former NYC Schools Chancellor Harold Levy is involved in a $500,000 payoff set of donations to the Rev. Al Sharpton's organization, with payments beginning shortly after Sharpton and Joel Klein launched the Education Equality Project in June 2008. With friends like Levy,...
In other news, I am hereby announcing my support for the public flogging of teachers whose students' test scores decrease from year to year, my hope that NYC invests an additional $1 billion in the ARIS system, my trust in the market to determine the true worth of schools within a voucherized environment, and my death last Thursday from reading Michel Foucault. In lieu of flowers, my family is asking that donations be made in my name to the John Birch Society, except for my son, who would appreciate iTunes cash cards instead.
Okay, it looks like the DN story is serious. Yikes. That'll take the wind out of the Education Equality Project (EEP) conference starting today. Then again, maybe "eep!" is the reaction of participants and fans of the Klein-Sharpton effort.
March 30, 2009
Seattle will be drier
I spent some time this weekend finishing the first complete draft of a talk I'm giving in Seattle on Thursday. I'm going to be heading there while a few thousand historians are leaving Seattle after the end of the Organization of American Historians meeting. I'm either expecting to find a time machine or I am heading there for a different meeting (Council for Exceptional Children). Last time I was in Seattle, it was wetter and colder than what's forecasted for the middle of this week. We had a drenching rain in Tampa this morning, so things will even out in my personal experience this week, even if not for the world.
I hope my neighbors weren't paying close attention while I was timing the draft. I don't read papers word-for-word, but I wanted to get a sense of how far I'm off on time, so I read it aloud while alternating between the laundry room and the kitchen.
Oh, the topic? Accountability and students with disabilities. I think I know how I'm ending the hour, but the cliffhanger before the third set of commercials is the tough part right now, and I haven't yet decided if Jason's going to live. If he does, I'm going to have to tear up the last act and start fresh. I've given a spoiler, haven't I?
More seriously, this talk is giving me the opportunity and prod to think through some connections between areas of education politics that I mentally put on "percolate": the democratic rationale for public education, tensions between public and private purposes of schooling, and what technocratic mechanisms may be useful for (and in what circumstances). When I get back, I have to think about potential outlets and how to get a potential coauthor to give up enough time to participate (and the value involved in that).
The only serious performance question I have is the extent of corny jokes and how far I can/should push them.
- An RTI Tier 2 intervention plan and a Writ of Mandamus walk into a bar...
- Peter Singer dies and finds himself at the Pearly Gates facing St. Peter: "So your most important goal right now is to avoid pain?" St. Peter begins...
- How many IEP team members does it take to screw in a lightbulb?...
- A rabbi, a minister, and a psychometrist are in a rowboat in the middle of the lake...
Maybe not those jokes.
March 17, 2009
Longitudinal data systems, good; unique teacher linkage, bad
Diane Ravitch's blog entry this morning seriously disparages the value of longitudinal data systems, including the linking of teachers to students, and John Thompson's entry discusses the abuse of data by administrators. Essentially, both Ravitch and Thompson fear the brain-dead or conscious abuse of data to judge teachers out of context. That's also the reason why NYSUT (the New York state joint NEA-AFT affiliate) worked hard to convince the legislature to put a moratorium on using test scores to make tenure decisions; Joel Klein was moving very quickly, and I think UFT and NYSUT had good reason to believe that without the moratorium, there would be substantial abuses of test data in NYC (and elsewhere) in tenure decisions.
My take: longitudinal data systems are a good thing, but linking teachers to students is a much more fragile undertaking.
Florida has a longitudinal data system that began in the early 1990s and has been used for 10 years to judge schools based on test data. Approximately ten years ago, I sat in a windowless room in Tallahassee as a Florida DOE member discussed the new A-plus system and a variety of technical decisions tied to it, and for which he had brought stakeholders and a few yahoos from around the state to give advice. I was one of the unpaid yahoos who had the great joy of flying in tiny airplanes several hundred miles a few times a year to give advice on the matters.
We had so many matters to discuss that one minor conversation was almost overlooked: a state mandate that required that the FDOE link each student to a teacher primarily responsible for reading and math. One state official showed us a draft form and then explained the concerns he had about it: in his view, the state that had tried that a few years earlier (Tennessee) had multiple conceptual difficulties connecting individual teachers to individual students. But they had run roughshod over those concerns, and he anticipated that Florida would do the same.
It wasn't a matter of letting teachers off the hook (this now-retired professional staffer is what I think of as an accountability hawk) but logic and sense. How many physics and chemistry teachers help students understand algebra better? How many history teachers help students with writing or reading? For students receiving special education services in a pull-out system, do you want only the special educator to be responsible for a subject, or do you want both the general-ed classroom teacher and the special educator to have responsibility? This spring, my wife (a math major and special educator) is tutoring a local child in math on weekends or evenings; so who should get credit for how he performed on testing in the last week, his teachers in school or my wife? Today, you can add NCLB supplemental educational services (or after-school tutoring) to the mix.
The larger point: even if you decide to wave away the concerns of Richard Rothstein and others, even if you focus entirely on what happens in academic environments, it is fallacious to link every student performance with a single teacher. If we are providing the appropriate supports for children, then the students with the lowest performance are the ones for whom such unique linkage assumptions are the least justifiable, because they may be receiving academic support from general education classroom teachers, from special educators, from after-school tutors, and maybe mentors or other providers in neighborhood support organizations (such as Geoffrey Canada's). Today, I do not think one can parcel out responsibility without making assumptions that have no basis in empirical research. Those who support individual teacher linkage have the burden to demonstrate otherwise.
March 12, 2009
Joel Klein as DM
John Thompson's blog entry today, God Does Not Play Dice, is in response to Charles Barone's Ed Sector report on value-added or growth models used for high-stakes accountability. (It's on my to-read list along with the IES/Mathematica study on teacher ed programs and various other things.) Thompson describes a number of caveats and then says,
...none of my objections would be major if the model was used for purposes of diagnosis, science, or a "consumers' report." We should pursue social science fearlessly, but we must not play dice with the lives of teachers by evaluating them with some theoretical work in progress.
That plays off Einstein's quip, "God does not play dice," in reference to quantum mechanics. That comment always made me think that if God does not play dice, maybe God forces you to pick up the dice and roll.
And that gave me the image of Joel Klein as Dungeonmaster.
A troll has just entered your classroom. He has a mace, a strength of 11, and 16 hit points.
After the Cafeteria Blob you threw at us, I only have 4 hit points, and I lost my Spitball Blocking spell.
Fight or run away?
Better fight; if I run away, I lose the Memo Spindle.
Better hope you're lucky. You need to roll a 17 to block the mace, 20 to break it.
But you're only giving me a D12!!
This is New York. You're tough enough. Roll.
March 10, 2009
Get Accountability Frankenstein for $10!!
Information Age Publishing is having a ten-year anniversary sale where you can get 10 or more books from their catalog for $10 each. Their authors, editors, and series editors include Gene Glass, Ernie House, Erwin Johanningmeier, Terry Richarson, Tom Popkewitz, Kathy Borman, Kenneth Wong, Jaekyung Lee, Maurice Berube, V.P. Franklin, Carol Camp Yeakey, and many others.
March 2, 2009
Take a breath (if you don't have asthma) and go on
I don't have asthma, but as my head cold morphs into the ordinary misery of seasonal allergies, I realize it's a darned nuisance not to be able to breathe comfortably. With luck I'll shortly be back to normal (or at least for what passes as normal for me), and in times like these, it pays to take a deep breath on receipt of almost any news and criticism. Evidently, my perspective lies somewhere between former Hill staffer and new DFER policy guru Charles Barone and NYC union activist Norm Scott, because I'm getting dished on by both. I'm not going to use the lazy journalist's excuse, "Because both sides are criticizing me, I must be right," in part because I'm not a journalist, in part because it's easily possible to be wrong about multiple things at once, and in part because while I disagree with Barone's and Scott's posts, they (generally) have the guts to say where they disagree with me. Oh, yeah, and they spell my name right. That counts for a lot with me.
Barone criticizes me (and others) for writing too much from an adult's perspective. I've written about that topic before (at length in Accountability Frankenstein and in more digestible chunks in One-Blog Schoolhouse), so let me provide a somewhat different gloss here: I could easily turn my blog over to several guest writers, my children and their friends. I suspect Barone's response to their criticisms of high-stakes testing would be, "Well, I know a little more about the world and your own best interest than you do." That statement would be absolutely right (at least in the first half) and an absolutely adult perspective.
(Incidentally, I agree with his substantive point in his entry that teacher happiness is not the point of either education policy or teacher education. I don't think that you can usually have effective teaching with completely miserable teachers, but I suspect or at least hope Barone would agree with me, and there's plenty of ground between avoiding total misery for teachers and seeing their euphoria as the primary goal of policy.)
Scott criticizes me (and others) for ignoring the fact that Arne Duncan was flawed as head of the Chicago Public Schools. Er, no. I'm fairly sure I'd have disagreed with him on a number of his decisions in the same way that I am fairly confident on where I'll disagree with him on federal education policy. But that open expectation of some disagreement does not mean the Obama administration is evil. Scott asks, "Exactly how much 'context' do these people need?" I'd say 20 years of Republican presidencies divided by 8 years of Bill Clinton. In comparison with Bill Clinton on the whole, Obama is good. And in contrast to the others, he's very, very good. That doesn't mean that I'm going to stay quiet when I think the administration is doing something wrong. It means I do have some perspective. Breathe, folks, breathe. For those who are worried about Arne Duncan, I think you'd do much better to put your energies into worrying about Timothy Geithner instead.
February 25, 2009
On exaggerations in the service of bitterness
Today, Charles Barone indulged in some recriminations about the use of test data to evaluate teachers: "In fact, in many states there is tremendous pressure to pass legislation which assures a firewall-like separation between teachers and student performance. Such laws have already passed in California, New York, and Wisconsin; ..."
But let's examine that claim with regard to New York, about which others such as Kevin Carey and Jennifer Jennings wrote last April. The language:
3012b. Minimum Standards for Tenure Determinations for Teachers.
(a) A superintendent of schools or district superintendent of schools, prior to recommending tenure for a teacher, shall evaluate all relevant factors, including the teacher's effectiveness over the applicable probationary period, or over three years in the case of a regular substitute with a one-year probationary period, in contributing to the successful academic performance of his or her students. When evaluating a teacher for tenure, each school district and board of cooperative educational services shall utilize a process that complies with subdivision (b) of this section.
(b) The process for evaluation of a teacher for tenure shall be consistent with article 14 of the Civil Service Law and shall include a combination of the following minimum standards:
(1) evaluation of the extent to which the teacher successfully utilized analysis of available student performance data (for example: State test results, student work, school-developed assessments, teacher-developed assessments, etc.) and other relevant information (for example: documented health or nutrition concerns, or other student characteristics affecting learning) when providing instruction but the teacher shall not be granted or denied tenure based on student performance data;
(2) peer review by other teachers, as far as practicable; and
(3) an assessment of the teacher's performance by the teacher's building principal or other building administrator in charge of the school or program, which shall consider all the annual professional performance review criteria set forth in section 100.2(o)(2)(iii)(b)(1) of the Regulations of the Commissioner.
The part that was added last spring is in italics, but the rest remains, including clear performance references in bold. How are we supposed to read the combination of "the extent to which the teacher successfully utilized analysis of available student performance data... when providing instruction" together with the ban on granting or denying tenure "based on student performance data"? I'm not a lawyer, but obviously there has to be data for one to judge teachers on how well they use the data. My reading (which I think is plausible) is that one couldn't make a blanket decision based only on test scores, but you could grant or deny tenure based on how well a teacher used the data in adjusting instruction. This latter is pretty close to the best-world scenario of Response to Intervention (RTI) policy, which has a lot of research at least in core areas in elementary schools. In comments on Barone's entry, I wrote,
I think we may be reading the same legal language with very different lenses. To me, the tenure-qualifications language in NY state essentially conforms with RTI -- teachers have to show that they can use data. Those upset with the added language for this year -- which bars a brain-dead statistical formula -- must think it would be as appropriate and also easier to define effectiveness with test scores as what is currently allowed/required by law. Me? I don't think there's anything that's easy here to implement in a fair way, and there ain't yet no Holy Grail. I also suspect that there is no provision in NY law that prohibits the type of analysis of teacher education that Louisiana has been building for the last 5-7 years. Either I'm reading your definition of a firewall too broadly, or I'm misreading NY law.
Here is Barone's response, word-for-word (the bold-faced sentence is my emphasis):
It seems to depend on how you define "brain dead." The data can't be used, thoughtfully or otherwise, to inform tenure decisions. Whether there is a holy grail, or it hasn't been found, remains to be seen. But surely everyone agrees that poor and minority kids are getting the short end of the stick, and data available now can and should be used to help level the playing field for kids while we adults have our fun little debates. I notice you rarely use the word student or child, unless you are quoting me. I think we need to err on the side of the kids for a while even if it makes adults uncomfortable. If we wait for there to be a consensus among academics, today's kindergartners will be collecting Social Security before anything is done. If then.
The "bitterness" in the title of this entry refers to this response. I'm disappointed by Barone's avoidance of the substantive topic by applying a rhetorical litmus test (how often I mention children in my blog), as well as the politician's logic here (something must be done; this is something; so we must do it). But let me get to the point: Barone is misreading the law. Data can be used to inform tenure decisions, and in fact, they must be, because the law requires that part of the tenure decision depend on teacher use of data. No data, no use of data -- no tenure. It may not be Barone's picture of how data informs a personnel decision, but Barone's claim is just plain wrong.
Addendum: In comments, Barone argues that the New York state law is clear and bars use of test data for making tenure decisions. Here's the way to decide it:
1) Does New York law prohibit a district from denying tenure because a teacher refuses to implement Response to Intervention practices?
2) Is Response to Intervention something based on student performance data?
If the answers are "no" and "yes," respectively, I'm right. Any other combination, and Barone is right. Let's try another scenario:
Main office conference room, where the assistant principal is meeting with a new teacher. "Let's look at your students' last quizzes and talk about where they learned the material well, and where you might want to reteach."
The teacher holds up his hand. "Wait a minute. Am I going to be judged based on what I say in this meeting?"
The assistant principal nods her head. "In part, what I'm judging with your effectiveness is how you respond to student needs. C'mon. Let's just look at the quizzes."
"No way. State law forbids the use of student performance data in tenure decisions. I'm talking with my union rep!"
If Barone is right in the global sense, this conversation could really happen. But I don't think it could (or has). When Barone claimed that New York had put a "firewall" between teachers and performance data, I know he was thinking in the narrow sense of "if students perform poorly on standardized tests, then we should be able to deny tenure." But regardless of whether that is a good or bad policy, that's not the only way one can connect teachers and student performance. Expecting teachers to look at student performance and change instruction based on data is a second way, and New York does not bar it. Looking at teacher education and student performance is a third way, and New York does not bar it. Which of those three is good policy is an interesting and debatable question, but what is not debatable is that all three connect teachers to data.
February 20, 2009
Technology and assessment
Education Sector's new report Beyond the Bubble is shorter than I had expected, so I finished it while watching the end of my son's taekwondo class last night. It looks to be a decent summary of the optimistic side of technology-and-assessment literature. Its tone is, "Yes, we can dramatically change and improve assessment with technology that is either just about to come online or that deserves some investment." And I think that for some things, that's absolutely right: an online/computerized science exam could have color images of tissue slides, videos of animal behavior, and so forth. But, while author Bill Tucker bowed his head in the direction of friendly technoskeptic Larry Cuban, there are some flies in the ointment:
- Students with disabilities. This is true for pencil-and-paper tests as well, but when you only have black ink, there are a few other issues you don't have to worry about that on-screen designers have to: red-green color blindness, epilepsy and screen movement, etc. The half-page on universal design is good, and any CFP will need to specify (and budget for) disability/accessibility awareness.
- Code creep. I don't mean internet safety but the fact that programming languages grow up and die. We've gone from Perl to Python, from HTML to XML, and languages and interfaces will continue to evolve. I wonder how many of the cases pointed to in the report are essentially one-off projects that will die at some point because the platform no longer exists. (Any readers remember Infocom's text games?)
- Holy Grail syndrome, also known as a belief in "the leap in cognitive science that will allow perfect, automatic scoring of essays is just around the corner." Same with the great and brilliant analysis of hundreds of microstate data that a single student can generate in a simulation environment. I trust colleagues who work in cognitive psychology to do some great things in the next decade, but this seems a bit utopian. Okay, more than a bit.
All of this doesn't say we shouldn't be engaged in using technology, but maybe we should work along two tracks: encourage the fast, frequent, and flexible for now and also invest in the medium- and long-term projects.
There is something that the paper never addresses: intellectual-property rights. Part of the imprisonment of assessment in an oligopoly is the ownership of assessment materials, backed up by the fear of security problems. (Here's reality for you: the day after a state test is given, assume NO security for that test. None. Despite all the laws. Just give that idea up, folks, unless you believe in the tooth fairy, have never heard of BitTorrent, and don't think college students ever cheat.) I am curious what the positions of various folks are on open-source assessment. I am not entirely sure what it would consist of, or how it would meet adequate technical standards, but it's tough to argue that despite the testing industry's oligopoly status, we should suddenly think that a brand-new investment will erase either the proprietary rights of the major firms or the start-up threshold for the creation of commercially-viable products.
February 6, 2009
Klein compares Bloomberg to Putin
No, he didn't, but at the mayoral-control hearing in Albany, according to the indefatigable Elizabeth Green,
Klein defended himself passionately, arguing that mayoral control is a democratic governance structure, not an authoritarian one, as some members painted it.
The logic here is weak: under that view, a plebiscite dictatorship is democratic because every few years the head honcho could be kicked out of office.
I think there are multiple reasonable approaches to the policy question, such as UFT's "you need two (more) righteous people to save Gotham" proposal of giving the mayor a plurality on the main policymaking body (so the mayor and chancellor would have to convince 2 out of the other 8 members) or something that would give an independent body subpoena authority and the responsibility and right to issue reports on the schools.
But the gist is to inject public accountability beyond the one-person constituency of Joel Klein. I'm a little curious why advocates of mayoral control don't grasp the fundamental irony that you don't create accountability by removing it. There are multiple ways of addressing the messiness of urban politics, but if the appointed chancellor has spent several years ignoring parents, he's getting his natural comeuppance today.
UTLA and "benchmark" or "periodic" testing
Last week, the United Teachers of Los Angeles called for the cessation of every-few-months testing in the district. The response of the district: such testing is an important tool in improving student achievement, which they know because schools with such testing have had annual-test scores higher than schools without such testing.
The flaw in the district's reasoning is left as an exercise for the reader, because I'm more concerned at the moment about what this debate shows about our attitudes towards assessment. UTLA is wrong to attack frequent testing on principle, though I think they may have a good point about this type of assessment. Such periodic assessments may help schools target assistance to students, or they may serve primarily to mimic the state test and encourage teaching to the test (the predictive success of which principals would know by results on the quarterly assessments). Without knowing more about the details, you can't say which is which, and both phenomena are possible (including in the same school).
What concerns me is the direction in which the machinery of testing is taking formative evaluation. There's a lot of research to suggest that when used to guide instruction, frequent assessment can dramatically change results. There are a number of technical questions about so-called formative assessment (or progress monitoring) that are the domains of researchers in the area: how to create material sufficiently related to key skills or the curriculum, how to create assessments where score movement is both meaningful and sensitive to change, how to gauge appropriate change, how to structure the feedback given to teachers, and so forth. My reading of the literature (which is not complete) is that the most powerful uses of formative assessment require very frequent, very short assessments--on the order of once or twice a week, and about the same length as your typical elementary-school spelling test (i.e., a few minutes at most).
So what do we see as the evolving, bureaucratic version of formative assessment? Long tests taken every few months. That's better than once a year in terms of frequency, but it's still a blunt instrument and absorbs a large chunk of time. The reason for this preference is obvious: a large, unwieldy school system can organize systematic evaluation/feedback around quarterly tests. That's doable. But organizing around something that's taken weekly and would often require data entry (e.g., a one-minute fluency score for first- and second-graders)? That's a different kettle of fish.
That doesn't mean it's impossible. It's easy, if you're a principal who's willing to devote the right resources. Consider reading fluency, for example. (I'm not saying that fluency is more important than comprehension. I just have the experience with this to imagine what I'd do as a principal.) Teach a paraprofessional to have every first- and second-grade student in the school read to them one minute a week on a sample reading passage (there are sets of roughly equivalent passages one can purchase for this purpose). Have them enter the data through a Google Docs form, a SurveyMonkey survey, or some other tool that will send the data to a spreadsheet. Get someone to program the results so that you can show data per child with trend lines and sort by grade, classroom, etc. For a few extra lines of code, you could add locally-weighted regression trends to be really fancy, but that's beside the point.
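To give a sense of how small the "get someone to program the results" step could be, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the column layout, the student names, the scores, and the function names are invented, and the trend is a plain straight-line (least-squares) fit per child, not the fancier locally-weighted regression.

```python
from collections import defaultdict

# Hypothetical rows as they might arrive from a form-fed spreadsheet:
# (student, grade, classroom, week number, words read correctly in one minute)
ROWS = [
    ("Ana", 1, "Room 3", 1, 20), ("Ana", 1, "Room 3", 2, 24), ("Ana", 1, "Room 3", 3, 28),
    ("Ben", 1, "Room 3", 1, 30), ("Ben", 1, "Room 3", 2, 30), ("Ben", 1, "Room 3", 3, 30),
    ("Cal", 2, "Room 7", 1, 50), ("Cal", 2, "Room 7", 2, 47), ("Cal", 2, "Room 7", 3, 44),
]


def slope(points):
    """Least-squares slope of score on week: words per minute gained per week."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None  # only one distinct week so far, so no trend to report
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def trends(rows):
    """Return (grade, classroom, student, slope), sorted by grade, classroom, student."""
    by_child = defaultdict(list)
    for student, grade, classroom, week, score in rows:
        by_child[(grade, classroom, student)].append((week, score))
    report = [(g, c, s, slope(pts)) for (g, c, s), pts in by_child.items()]
    return sorted(report, key=lambda r: r[:3])


if __name__ == "__main__":
    for grade, classroom, student, trend in trends(ROWS):
        shown = "n/a" if trend is None else f"{trend:+.1f}"
        print(f"Grade {grade}  {classroom}  {student}: {shown} words/min per week")
```

A flat or falling slope for a child is a prompt for a conversation with the teacher that week, not a verdict on anyone.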
Here's the point: this is not rocket science, this does not require a gazillion-dollar software package from TestPublisher Inc., and it's very different from the type of quarterly testing that superintendents are buying into in a big way (including that gazillion-dollar software package from TestPublisher Inc.). It's very different from the quarterly testing that UTLA is protesting.
So, Ramon Cortines, here's my challenge: can you document that the quarterly-testing regime is better than the weekly-quiz-plus-trends proposal I've outlined above? The second can fit easily into the routines of any school. The second can start conversations EVERY WEEK at a school. The second is MUCH cheaper. It's also less sexy: no giant software packages manipulable from the front office, no instantly-printable pastel-colored graphs that demonstrate what kids were able to do on a test six weeks ago. You'd definitely give up the flashy for the mundane. But prove to me that the flashy is better than the mundane.
February 5, 2009
What personality is your Performance-Pay Attitude? (and other mixed metaphors)
Since other bloggers I read have used various quizzes to spice up their entries, or maybe do something online while they're waiting for a bus, here is the all-purpose Performance-Pay Personality Quiz. Oh, wait: "personality" isn't quite appropriate here. But to mix metaphors, what personality is YOUR attitude towards performance pay?
- Do you think that there is ever a justification for some teachers' being paid more than others?
- 1 point -- A paycheck is performance pay: either pay people a good wage for doing their job, or fire them for not doing it.
- 4 points -- Some differential pay is required to encourage teachers to take hard-to-staff jobs (either by subject or school), and that's more important than merit pay.
- 7 points -- On balance, performance pay would be a good thing, but it's not the most important thing to change in schools.
- 10 points -- Performance pay or bust: I'll throw everything else out the window to get it!
- What's the most important motivation for teachers and administrators?
- 1 point -- They love children; that's their only motivation.
- 2 points -- Personal integrity is a more powerful motivator than salary. Teachers need salaries, but if you can show teachers how to feel better about the job they're doing (including showing them how to do a better job), you can move mountains.
- 3 points -- Money's an important part of the picture. It's not the only thing, and seeing money as the only motivational tool would be foolish public policy, but to ignore it would be wrong.
- 4 points -- There's nothing like money to get people's attention, and teachers are people.
- How important is it for education policy to encourage educators to work together?
- 1 -- Teachers are not islands: rewarding individuals will kill the type of mentoring and sharing that's essential for professional development. Doubt me? Go ask stock-market traders who entered their career recently whether individual rewards encouraged their elders to mentor them... or spend every second on the floor trying to make a buck.
- 2 -- Cooperation is crucial. It's not everything, since all teachers have strengths and weaknesses, and we don't want a school full of Stepford Teachers, but I worry that too much emphasis on individual recognition will discourage teachers from talking to each other, and from any chance that teachers will hold each other accountable.
- 3 -- Teachers' talking in a lounge is like little kids' hugging each other. Often it's wonderful, but you sometimes worry what they're sharing. Individual recognition is pretty important to give credibility to the better and more professional teachers.
- 4 -- Teachers go it alone anyway: recognizing their achievement as individuals is unlikely to harm the type of substantive collaboration that happens rarely.
- What is the right balance between judging teachers based on the professional judgment of peers and using student performance?
- 1 -- Peer judgment: they're the ones who know what good teaching looks like, and what we care about is whether teachers are teaching well.
- 2 -- Er... wouldn't peers be interested in what students are learning? Student performance should be part of the mix, as one springboard for evaluation. But peer judgment should be central.
- 3 -- Student performance should anchor qualitative judgments of teaching. Yes, peers can judge teachers, but student performance should be central.
- 4 -- Skip the peers. What matters is whether students are learning.
- How ready is the technology of testing to use in judging individual teacher and school performance?
- 1 -- When the solid historical record of more than a century shows that people have abused tests in every decade, we should assume that tests will be misused, and it's the burden of high-stakes testing advocates to show otherwise.
- 2 -- Tests are useful, but we're far from being sure that tests tell us what most politicians think they tell us.
- 3 -- They're imperfect, but we need to start using test scores to judge effectiveness now because we can't wait for tests to be perfect to look at performance.
- 4 -- They're just fine, and they have been for years.
- What role should collective bargaining play in education reform?
- 1 -- Collective bargaining is crucial to protecting due process and teacher rights, and if possible to block stupid reforms.
- 2 -- Collective bargaining is crucial to protecting due process and teacher rights, and unions can play an important part of reform.
- 3 -- Collective bargaining is primarily an obstacle to important reform. Where unions will accept reforms, great. Where they won't, federal and state governments have powerful incentives to change the balance of power at the local level.
- 4 -- Federal and state governments should do their best to break unions, because they do nothing good. Break them, circumvent them, discredit them with their bargaining units.
- What should be the ceiling in terms of paying for performance (both the total amount of money and how many teachers should be eligible)?
- 1 -- Arguments in favor of performance pay are a cover for not wanting to pay teachers more. Those who work with children are generally underpaid, and while performance pay looks like it's in "the children's interest," in reality it's another way of being cheap.
- 2 -- Part of my skepticism about performance pay is the assumption that only 10-25% of teachers should receive it. To these brilliant people, I ask: "Okay, suppose there's performance pay and every student meets whatever is your definition of proficiency by 2014. Does that mean you'd be willing to double teacher pay for that result, or is this an education-reform shell game?"
- 3 -- Part of my acceptance of performance pay is looking at the numbers: there are lots of students, and it's almost impossible to staff every classroom with a brilliant and greatly-skilled teacher. So let's pay the great ones the best. "In a perfect world we'd double teacher pay" is another way of saying "never."
- 4 -- Competition is the best way to motivate individuals, and you're going to get little competition if everyone can earn a bonus. Limit performance pay to the top slice of teachers.
Psychometrics-free labels to share with frenemies and colleagues:
7-11: You are Alfie Kohn. You'd really like the testing industry to suffer an ignominious death, and anyone who thinks that using tests will improve schooling is smoking something fairly powerful.
12-16: You are Reg Weaver. You are publicly skeptical of merit pay, you think most designed systems are going to be disasters, but you're also going to hold your nose and support teachers who decide it's in their best interests.
17-23: You are Randi Weingarten. You know that the American public is used to people making more money if they do a better job, but you're skeptical of most performance-pay plans in operation today. You think collective bargaining is the best way to moderate the more idiotic ideas surrounding teacher pay and to protect the legitimate interests of teachers and communities.
24-28: You are Thomas Toch. You're well aware of the flaws of testing and accountability systems, but you think moving in the direction of performance pay is essential, and you will trust that the system can be improved incrementally once it's started in the right direction.
29-34: You are Michelle Rhee. The day that teachers have a starkly uneven pay scale, the day that school districts fire a fifth of their teachers, and the day that unions are decertified around the country will be the day you will not only take up that Newsweek broom again but dance with it a la Fred Astaire.
(Don't like the questions? Fine: make up your own completely unscientific spoof of internet quizzes!)
January 13, 2009
Oversight boondoggle
Last week the Wall Street Journal lambasted Florida Governor Charlie Crist for failing to appeal a ruling that struck down the Florida Schools of Excellence Commission as an unconstitutional infringement on the powers of county school boards in Florida. The legislature wanted to set up the FSEC as a second authorizer of charter schools in case county boards were unfair and refused to let enough charter schools open. This bewildered me because Florida has no statutory cap and there are a few hundred charter schools in the state.
This afternoon, I remembered a blog entry written by St. Pete Times reporters in December: the FSEC has been spending the people's money like it was water, racking up almost half a million dollars in expenses over two fiscal years without authorizing a single charter school that has yet opened its doors.
Isn't the Wall Street Journal supposed to have a conservative fiscal philosophy?
January 12, 2009
Deantidisestablishmentarianism in education policy rhetoric
Joel Klein and Al Sharpton wrote an open letter to Barack Obama and Arne Duncan that appeared this morning in the Wall Street Journal. And I have just a few questions about this:
- How can the sitting chancellor and a long-time civil-rights activist claim to be railing against "the entrenched education establishment" when you could reasonably conclude that they are The Establishment?
- Why do they think that placing a column in the WSJ establishes their anti-establishment street cred? That newspaper isn't exactly an underground pamphlet.
- Isn't Klein the type of guy who already has Arne Duncan's cell number? They're fellow urban superintendents, they've talked at meetings, and you assume he could call Duncan up at any time, and probably get Obama's number as well. So why do they need this open letter--do they feel this deep psychological need to pose as Village Voice rebels with a cause?
Klein and Sharpton are setting up a straw-man opponent. In my masters class in the fall, one of my students argued that accountability is well-entrenched as part of the public-school policy script. Whether you want to use Tyack and Cuban's "grammar of schooling" or Mary Metz's "real school" language, I think there's a case to be made that anyone who claims that accountability is "new" is in denial and as punishment should have to watch three or four consecutive playings of an inane 1980s adolescent-rebellion film.
So someone who is less establishment than Joel Klein would be... anyone? Anyone?
Second thought: For a few years, I've had the suspicion that the public "letter to the next president" was a bit precious (in the pejorative sense). The collections of letters to the president published after the end of an administration are usually drawn from the sample of correspondence from ordinary Americans that the White House staff select for a president to read as a reality check. Even if Klein gets some credit in my book for having a salary far less than what either New York financiers or university presidents are commonly receiving these days, in no way could one call Joel Klein or Al Sharpton "ordinary Americans."
So if Joel Klein gets to write a "letter to the next president," though we all know he could call Obama up with ideas about either antitrust policy (his Clinton-era gig) or education policy (his current gig), then the gloves are off. I'm writing a letter, too! And you know from my loving hardass manifesto that I intend to bring some style to it. So here's the rule for 2009, for all of you: Staid pretentious public letters to the new president are out. Your job is to write the most outlandish letters that tell the truth. Come on: it's going to be the Obama era. You can say it.
One more update: Apparently Margaret Spellings doesn't have Arne Duncan's cell number, either! Or at least she's pretending not to. Isn't it so nice of major papers to devote part of their ever-shrinking news hole to long classified ads from major policy honchos who can't navigate their cell-phone menus? Though I think the following would have been free on Craigslist: "Arne: call me. Margaret." What? The Post may have been joking? Oh, yeah, and that's a good use of newsprint...
December 17, 2008
Okay, it's Arne Duncan. Back to the substance already, willya?
The following is one of those trick questions you should never answer: Was Arne Duncan appointed because he's a cipher/Rorschach test for those with an axe to grind in national education politics, or is he an appointee primarily because of his personal and political connections? In between other tasks, I've been reading the comments flying past at half the speed of light, and after the most sensible and well-grounded supporting piece I've seen yet (disclosure: I'm a sometimes contributor to the blog), I've been reminded of Stephen Carter's response when asked if he ever benefited from affirmative action: so what?
So what if he's a policy cipher? He won't be making decisions by himself, and if anyone has a bully pulpit on education, it's going to be Duncan's boss. What matters is the collective decision-making, including the debate over the hard decisions to be taken with NCLB.
So what if his appointment is far more closely tied to networking than many of the other Cabinet appointees? He'll now be in a far more public and less insulated role than as aide to Paul Vallas or the CPS head serving at the pleasure of Richard Daley. He'll rise or fall on his own merits, at this point.
As I wrote six weeks ago, let's move on to some discussion that is less personality-based.
November 23, 2008
When the news hole shrinks, any mention is a blessing... well, sort of
Adam Emerson used to be the Tampa Tribune's higher-ed reporter. As the Tribune's owner Media General has been laying off reporters and editors left and right over the past year, assignments have shifted, and Emerson now has the K-12 education beat. So when he called me up with the news, it was also to ask about Florida's graduation rate. Basic story: in the last week, the Florida Department of Education released its annual data on graduation. They published two sets of statistics, both including and excluding GEDs from the number of students in each cohort receiving a diploma. They did not publish the alternate rate that they will have to start publishing in a few years, where the students who drop out to take GEDs will still be part of the cohort schools are responsible for. Some progress in transparency is still progress, and as I told Emerson, Florida's education commissioner is smoothly preparing both his board and the public for when the official graduation rate drops because of the change in definition. I suspect he may also be giving signals to the superintendents around the state that they'll no longer be able to hide problems with the dropout-to-adult-GED path or with GEDs.
We talked about this and other topics in a longish phone call, and as I usually do, I wished him well on the story, especially on getting enough space for it. Well, Emerson's story is now published, and in a 130-word story, my name is in there three times. He's a good reporter, and any gap between the published story and the first paragraph above is entirely a matter of the space he had to tell the story. I like seeing my name in print as much as the next yahoo, but yeow, that's a rapidly-shrinking news hole.
November 15, 2008
NCLB music
Bill Wraga, at work a mild-mannered U. of Georgia faculty member, has recently uploaded the latest NCLB/ed reform song I've come across. Some others:
- The Department of Education's NCLB song (NY Times article with partial lyrics; anyone know of a location for the full lyrics or an mp3?)
- Tom Paxton, Not on the Test
- No Child Left Behind? (album)
- Madeleine Begun Kane's Education President Song
October 31, 2008
Happy Halloween, and now read my book!
Charles Barone chose Halloween to point to my proposal for post-NCLB federal accountability policy. For the record, despite what the picture on my website implies, I really look like the hunk of handsomeness that's at the top of Barone's entry (well, on the right side of the picture). I appreciate the link and hope folks will leave a comment on Barone's entry. (Commenting here won't count.)
Federal influence
Mike Petrilli asks one right question: where can the federal government influence behavior, and what are the tradeoffs? I'm especially delighted that the research in question is about desegregation. As I've written before, the argument against top-down reform by David Tyack and Larry Cuban is smart, sensible, detailed, and fits with an enormous amount of historiography... but it doesn't address desegregation. I'm not headed entirely towards Nudge territory, though I much enjoyed the book, and part of the reason is that there is a role for top-down policy imposition. We just have to be very careful about how that power is used.
NCLB regs and graduation rates
A few quick ones this morning, while my brain warms up... So the new NCLB regulations are out. (Or, rather, they were out a few days ago, but I've been putting out fires while in the midst of a cold, and this was a lower priority.) Atlanta Journal-Constitution reporter Laura Diamond asked on Wednesday, Will NCLB changes improve grad rates? The obvious answer is yes and no: yes, the measures mandated by the federal government will be much better than the goat-rodeo world of dropout measures that currently exists, but, no, better measures will not move the world in themselves. After almost two decades of looking at attainment and dropout-prevention and -remediation programs, I am no longer surprised when people look to vocational education, personal counseling, and (these days) credit-recovery programs as solutions to dropping out. They may all be good on a small-scale basis with some students, but I worry when people reinvent the wheel and think they're hot stuff.
September 12, 2008
Shared responsibilities III: The next ESEA
Over the summer, Charles Barone challenged me to put up or shut up on NCLB/ESEA. I immediately said that was fair; Accountability Frankenstein had a last chapter that was general, not specific to federal law. I'm stuck in an airport lounge waiting for a late flight, so I have an occasion to write this now. Because I'm on battery power, I'm going to focus on the test-based accountability provisions rather than other items such as the high-quality teaching provisions. Let me identify what I find valuable in No Child Left Behind:
- Disaggregation of data
- Public reporting
So where do we go from here? I don't think trying to tinker with the proficiency formula makes sense: none of the alternatives look like they'll be that much more rational. What needs more focus is what happens when the data suggest that things are going wrong in a school or system. On that, I think the research community is clear: no one has a damned clue what to do. There are a few turnaround miracles, but these are outliers, and billions of dollars are now being spent on turnaround intervention with scant research support. To be honest, I don't care what screening mechanism is used as long as (a) the screening mechanism is used in that way and in that way only: to screen for further investigation/intervention; (b) the screening mechanism has a reasonable shot of identifying a set of schools where a state really does have the capacity to help change things -- if 0 schools are identified, that's a problem, but it's also a problem if 75% of schools are identified for a "go shoot the principal today" intervention; (c) we put more effort and money into changing instruction than into weighing or putting lipstick on the pig. Never mind that I'm vegetarian; this is a metaphor, folks.
So, to the mechanisms:
- A "you pick your own damned tool" approach to assessment: States are required to assess students in at least core academic content areas in a rigorous, research-supported manner and use those assessments as screening mechanisms for intervention in schools or districts. Those assessments must be disaggregated publicly, disaggregation must figure somehow into the screening decisions, and state plans must meet a basic sniff test on results: if fewer than 5-10% of schools are identified as needing further investigation, or more than 50%, there's something obviously wrong with the state plan, and it has to be changed. The feds don't mandate whether proficiency or scale scores are used; as far as the feds are concerned, it's a state decision whether to use growth. But a state plan HAS to disaggregate data, that disaggregation HAS to count, and the results HAVE to meet the basic sniff test.
- A separate filter on top of the basic one to identify serious inequalities in education. I've suggested using the grand-jury process as a way for even the wealthiest suburban district to be held to account if they're screwing around with racial/ethnic minorities, English language learners, or students with disabilities. I suspect that there are others, but I think a bottom line here is the following: independence of makeup, independent investigatory powers (as far as I'm aware, in all states grand juries have subpoena power), and public reporting.
- Each state has to have a follow-up process when a school is screened into investigation either by the basic tool noted above or through the separate filter on inequality. That follow-up process must address both curriculum content and instructional techniques and have a statewide technical support process. At the same time, the federal government needs to engage in a large set of research to figure out what works in intervention. We have no clue, dear reader, and most "turnaround consultants" are the educational equivalents of snake-oil peddlers. That shames all of us.
Doing so will also allow the federal government to focus on what it's largely ignored for years: no one knows how to improve all schools in trouble (and here I mean the organizational remedies -- there's plenty of research on good instruction). Instead of pretending that we do and enforcing remedies with little basis in research, maybe we should leave that as an open, practical question and... uh... do some research?
September 9, 2008
Cold permutations
First, to provide a minor update on this morning's news items:
- Semi-success on the reserving-time front. I had a lunch meeting and then a 3 pm meeting, and the time in between was too short to do much, so I exchanged one parking sticker for another. Whee. At least my wonderful grad student assisting with the journal did a monster job helping on a long MS, giving my head-cold-affected mind a much easier job going through the next article. I WILL climb on top of this mountain of work. Just not today.
- It's a semi-full-blown cold now. Proof: I should be asleep, and I'm exhausted, but I can't sleep.
I've been trying to wrap my mind around permutation tests and exchangeability for about a week, and I figure that my typical head-cold mentality may be the best shot I can take at it both in terms of the orthogonal way I think at way-too-late-on-a-head-cold evening and also the fact that once I'm up this late and in this state, no student or MS author wants me to be making decisions right now. (For the record, I'm on antihistamines. I know, I know: Never take Benadryl and grade. No. That's not funny, not even in my state of mind.)
A few weeks ago, I was pondering the NYC achievement gap controversy, a debate over the summer that among other things spawned a Teachers College Record commentary by Jennifer Jennings and me (available just to subscribers for now, but to the world in a few weeks). And while the limits on TCR commentaries and op-eds require a fairly narrow argument, I kept thinking about trends and time series data as I looked at the New York City Department of Education's claims. I kept thinking to myself, There has to be something an historian can contribute to this debate that is specific to the way historians think. I'll probably write something at length when I'm more coherent and have some time, but there was an obvious answer that came to mind: to historians, the order of events matters. An argument about causality depends on contingency, which depends on a sequence. (Historians often focus on contingency rather than causality, except when we're playing the counterfactual game. The obvious answer to the question, "What caused Gore's defeat in 2000?" is "everything, or almost everything.") The sequence doesn't prove causality (or contingency), but it's necessary.
That logic is usually not applied in policy. In the case of New York City, as is typical in this type of reform publicity, someone pointed to a time series of data and claimed, "Aha! See this trend? Ignore its tentative nature: it's PROOF that we're on the right track." One obvious problem with the NYC data is the reliance on threshold-passing percentages; that's the focus of the TCR commentary. But the NYC Department of Education made claims about the achievement gap more broadly, and the data is a lot messier than the folks in Tweed would state. Below are three permutations of the "z-scores" of achievement gaps (the differences in Black-White means on the 4th-grade state math tests, scaled to the population's standard deviation). One is the real time series that runs between 2002 and 2008. The other two are permutations. Before you look for the data (it's on p. 13 of the PDF file linked above), see if you can tell the differences among them, and which is the observed order:
| 0.74 0.79 0.73 0.67 0.72 0.67 0.71 | 0.79 0.67 0.72 0.67 0.71 0.74 0.73 | 0.79 0.72 0.71 0.74 0.73 0.67 0.67 |
My professional judgment as an historian is also common sense: if the order of events does not make a discernible difference, even if you ignore measurement error and standard errors, then it's hard to conclude that there's a trend. How to test that is the realm of statistics, and when I explained the issue to my colleagues Jeffrey Kromrey and John Ferron, the answer from them was clear: permutation tests. That's a general family of nonparametric tests of inference that's the formal version of the question I asked: if you jumble up the data in all the possible ways they could be permuted, and if you look at a particular measure of interest (a test statistic), where in the distribution of all permutations does the observed data set fall? In the case of the 4th grade Black-White gap on New York state math tests measured as a z-score, we have 7 points of data, which have 7! = 5040 permutations. If you choose an appropriate test statistic, calculate it for each permutation, and find that the observed time series sits within about 125 of either end of the resulting distribution, then the observed order falls outside the 95% or so of permutations in the middle of the distribution.
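For anyone who wants to play along at home, here is a rough sketch of that idea in Python (not R, and not anything my colleagues handed me). The big assumption is the test statistic: I've used the least-squares slope of the gap against its position in the sequence, which is only one of many possible choices and may not be the right one. The function names `slope` and `permutation_p_value` are made up for illustration. The three orderings are the ones listed above, in the same order, and the code makes no claim about which one is the observed series.

```python
from itertools import permutations
from math import factorial


def slope(series):
    """Least-squares slope of the values against their position 0, 1, 2, ...

    Assumption: slope is used here as the test statistic purely for illustration.
    """
    n = len(series)
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den


def permutation_p_value(series, statistic=slope):
    """Share of all orderings whose |statistic| is at least as large as
    that of the ordering passed in (a two-sided permutation test)."""
    observed = abs(statistic(series))
    tol = 1e-12  # guard against floating-point ties
    count = 0
    total = 0
    for perm in permutations(series):  # positions are treated as distinct
        total += 1
        if abs(statistic(perm)) >= observed - tol:
            count += 1
    return count / total, total


# 2.5% of the 7! = 5040 orderings sit in each tail of the distribution.
tail_size = factorial(7) // 40

# The three orderings from the post; which one is real is left as the puzzle.
orderings = {
    "first": [0.74, 0.79, 0.73, 0.67, 0.72, 0.67, 0.71],
    "second": [0.79, 0.67, 0.72, 0.67, 0.71, 0.74, 0.73],
    "third": [0.79, 0.72, 0.71, 0.74, 0.73, 0.67, 0.67],
}

for name, series in orderings.items():
    p, total = permutation_p_value(series)
    print(f"{name}: slope={slope(series):+.4f}, "
          f"share of {total} orderings at least as extreme={p:.3f}")
```

With only seven points, brute-force enumeration of every ordering is cheap, so there is no need for random sampling of permutations. Swapping in a different statistic just means passing another function as `statistic`.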
No, I haven't had the time or inclination to follow up, learn how to calculate one of the possible test statistics, and figure out how to get the R statistics program to do a permutation test. There are two problems, as I've learned from my colleagues: choosing the right test statistic is a matter of art as well as science; and there may be a problem with exchangeability. As far as I understand it, exchangeability is a less constricting assumption than the standard "independent, identically-drawn" sample assumption in parametric inferential statistics. From what I understand, the practical definition of exchangeability means roughly that you could theoretically exchange all the data points without screwing up the distribution. Again, if I understand correctly, one situation that violates the assumption of exchangeability is in autocorrelated data, i.e., when one data point influences the next one (or the next few). And if there's anything that's likely to be autocorrelated, it's a time series. That's not a serious problem if you're just looking to see if a trend exists at all; for that, autocorrelation is a form of trend (though an artifactual one). But if you're trying to make causal inferences or anything more complicated when there's autocorrelation (i.e., if achievement data levels or trend slopes are different before and after a policy change), I think you have to throw permutation tests out the window.
And that's such a shame, because the concept is still right when extended beyond the question of a trend: if a policy makes a difference, then it should make a difference on which side of the policy change you're sitting. So if you're a clever person with statistics, please provide some ideas in comments for where to go with this or if, as I suspect, the best we can do with permutation tests is ruling out possible trends/autocorrelation.
September 8, 2008
Monday bits
I didn't have time this weekend to write a lengthy, thoughtful post, or even a lengthy and thoughtless piece, so you get bits this morning.
- Reserving Mondays: I've shut off my e-mail for now to get some editing tasks done, and I'll see if I can reserve Mondays for selfish purposes for the entire semester. Wish me luck on this one!
- Honesty: the Palm Beach Post's editorial board approves a draft change in calculating graduation rates in Florida. Kudos to Florida's commissioner of education, Eric Smith, for pushing this. (Disclosure: I've given a few ideas to the state department of ed on options for how to handle graduation in 5, 6 years, etc.)
- Sunday morning grading: I got out to a coffeehouse early yesterday to read my first batch of undergraduate papers. Several brought smiles to my face with great writing, provocative ideas, or both. That's a good sign for the semester.
- Fetishized vs. nonfetishized curricula: I wonder how the history of the Core Knowledge Foundation would have been different if E.D. Hirsch had thought to frame the issue not just as accumulating tiny bits of knowledge (how Herbartian of him!) and instead had framed it as a matter of both a knowledge base in different disciplines and the heuristic frameworks of those disciplines.
- I know I have at least a below-the-radar version of a head cold because I've had moments of earache in the last day, I had less energy over the weekend than I normally do, and I was sure last night that a mashup of Timothy Burke's guide to historical arguments and Atlas Games's Once Upon a Time would make a great introduction to historiography.
September 1, 2008
Shared responsibilities for children II: The loving hardass manifesto
Back in June, I briefly noted the potential political dynamics of the dueling manifestoes associated with the Broader, Bolder Approach to Education and the Education Equality Project, apologized for overplaying that analysis, and wrote an entry to talk broadly about shared responsibilities and education as part of the state. I've promised but have not followed through on my own manifesto, and it's now long past time for that. So, without further ado...
The Loving Hardass Manifesto*
I'm going to cut the shared-responsibility issue in a way that doesn't avoid the hard problems. Essentially, wherever your work touches children's lives, you're responsible for busting your butt without ruining your health or life. Unlike the Education Equality Project manifesto, I do not think that teachers are all-powerful or all-responsible. They're very important and responsible, but not for everything. Unlike the Broader, Bolder Approach, I do not think we can avoid central questions about accountability within school by reference to the other legitimate needs of children outside of schools. Yes, children have lives outside school, but it's acceptable to focus on what happens inside schools for things schools are responsible for. And unlike Barack Obama, I am not going to say that both statements are right. Both statements are partially right. And while I know and admire several people who have signed one or the other statement, I will not sign either one, because both are flawed.
Let me start with the Project crowd. If you're a politician or administrator and believe that everything you've done is perfect, with no regrets, and all the evidence points in your favor, I hope you brought enough to share, because whatever you're smoking, I want to try it. Using only the high-quality evidence that is in your favor (and here I mean David Figlio-quality evidence), you can make a claim that high-stakes accountability leads to modest improvement in outcomes. But that's about it.
If you're a civil-rights activist and think that the best way to improve schools is to lambaste teachers and their representatives, I have a year for you: 1968. And a book: Tyack and Cuban's Tinkering toward Utopia. I have plenty more to suggest, but I figure that's enough.
Let's think about some basic facts: most kids come to school with families they go home to at night. If the children and their teachers are lucky, their families will only have the ordinary neuroses that God or Woody Allen placed there. If the children are unlucky, they'll also deal with poverty, disability, abuse, negligence, or having Paris Hilton as a distant relative. If you're a teacher, you can gripe about the families, but it's probably best not to, for a few reasons:
Your complaining to peers will not improve the parenting of anyone.
We've heard it before, and it wasn't convincing the last time, either.
If you complain about the parents, you will be depriving your students of their internationally-recognized right to be the first to complain to a therapist about how they were brought up. Really: it's in the UN Charter, under "Psychotherapy as an Adolescent," right above the bit about iPods and PlayStations. Go look it up if you doubt me.
I just lied. You may not have caught this, but the 1959 Declaration of the Rights of the Child does not mention the right to criticize parents in therapy or the right to consumer electronics. There isn't a single mention of either Apple or Microsoft, a shameful omission which Bill Gates is working hard to remedy. But until then, children only have the recognized right to things such as health care, food, shelter, the care of parents or other responsible adults, freedom from discrimination, and education.
I don't know if you've noticed this, but as a society we're not doing so well on fulfilling these rights. 600 million Chinese citizens use cell phones, and in a country that is far wealthier, we've still got millions of children without health care. It used to be that American parents would shame their kids into eating everything at dinner by pointing out that children around the world were starving. That makes you wonder what Chinese parents tell their children to shame them. Maybe they say, "Take your vaccination and stop crying: Kids are getting sick in America!"
Since the dueling manifestoes appeared in June, I've been scratching my head. The broader, bolder approach is fine as a statement of broad social policy but it doesn't work in terms of day-to-day accountability. You are responsible for the people who are in your life. When my children have been sick, and I've taken them to their doctors, I've never once been asked, "How are they doing in math?" and then had a doctor refuse to treat my child because they're not yet evaluating double integrals. They treat the kid in front of them the best they can. My father was a pediatrician and allergist who treated both wealthy families from one side of town and working-class families from another part of town. He never complained about the families from one side or the other. He just treated them.
But that doesn't mean my father had absolute responsibility, either. He was expected to be a professional, to keep up with the literature, and to follow standards of medical practice. But there has never been a "Health Care Equality Project" whose primary activities were to take pot-shots at doctors, call them "interests who seek to preserve a failed system," and push to pay doctors by a handful of measures of the health of their patients. My father was never paid by how much his patients weighed that year, or by how many tissues they used because of colds. We already have accounting-driven health care, and I don't know of any doctors or patients who think it's a good idea.
We also don't have ridiculous fads in medicine. Well, we do, but it's generally called the X diet (for various string values of X), or "alternative medicine," for those who think that if you dilute some processed duck liver by 30 or 40 orders of magnitude, your body will react in any way other than, "I'm sorry if you paid for that sugar pill instead of your mortgage, but the best I can do right now is a placebo effect. I hope you like it." In education, we have far more fads. If we had as many fads in medicine as we do in education, people would think that wearing uniforms made you thinner.
So there is something about the dueling manifestoes that just does not seem real to me. It's not that I am immune to their appeal. I want there to be equal education. And I've already written in many places that schooling needs to be thought of in the context of all the state structures that touch kids' lives. But it's still not resonating with me. My generation of the family takes care of these issues collaboratively. My oldest brother has been a lawyer, lobbyist, and think-tank staff member on health-care policy, which takes care of one right. I teach and write about education. The rest of the immediate family's a bunch of layabouts who do nothing other than have jobs and take care of their families, but Stan and I, we're holding our own on this caring-for-children thing, and if your family isn't, don't blame us. We are the Broader, Bolder Approach. But we're both going on diets soon, so that will change.
And for the Broader, Bolder crowd, you know you can do better. As a group, you include a bunch of incredibly well-read, smart researchers. And you're right on putting schooling in a broader context. But you just fell down on the accountability part. That one short paragraph on accountability? Please reread it. Really. You think that was the best you could do? You KNOW what you'd say to a grad student who had that fluff in a dissertation. Revise and resubmit, because I know you can get this up to your usual standards.
And the rest of you in the peanut gallery? Don't think that we can rest on our laurels, either. The folks I'm criticizing at least had the energy and guts to put pen to paper. What have you done to define "bust your butts"?
And, yes, this means that I need to look back at the last chapter of Accountability Frankenstein and see if it needs to be sharper. A commenter some months ago said it was not specific to NCLB, and that's a fair enough point. I wanted the book to be about accountability in general, but if I really know my stuff, I should be able to apply it in specific situations. Want a specific list of changes that should happen with the next reauthorization of ESEA? Coming up this fall...
* While I was drafting this in bits and pieces, I pondered whether to use the term hardass, but since Bob Sutton has written the book The No Asshole Rule and Harry Frankfurt's On Bullshit won a book award, I don't think I'm going that far out on a limb. A loving hardass knows that holding people to standards can be in their best interest. So for everyone who signed one of the manifestoes and thinks I'm nuts here, you're wrong. And in two years, you'll thank me for this.
August 27, 2008
Two interviews to read today
A few shout-outs while I'm still juggling a few hundred tasks the first week of classes:
- Charles Barone's interview of Robin Taylor (Delaware's state person in charge of accountability)
- The Eduwonkette Q&A for Bruce Fuller (coeditor of a new book on accountability)
August 5, 2008
Two brief comments
I promised not to comment on anything during my two-week break, but the NewTalk NCLBfest made me wonder who's missing from this debate. Your observations in the comments are most welcome.
Also, I think I may have alienated my family forever by going against their advice and buying a Sony Reader. Even my technophile son thinks I'm nuts. But the EPAA MS authors will probably appreciate my carrying their stuff with me to various short-reading opportunities.
August 1, 2008
A higher-ed unionist's view of the performance-pay debate
Corey Bunje Bower criticized a Newsweek column by Jonathan Alter and has the following response to Alter's slur against teacher unions:
Perhaps the most ridiculous thing that Alter writes -- and the statement that gives away the ideological underpinnings of his argument if anybody wasn't already aware -- is that unions "still believe that protecting incompetents is more important than educating children." Unions are far from perfect, and this is far from the most inflammatory rhetoric that I've read about them, but it's still sheer and utter nonsense.... Though more polite, it's the intellectual equivalent of calling somebody with whom you disagree a [N]azi or a terrorist.
If I were a union leader, however, I would mull over Alter's final point.... the general idea that unions could view submitting their members to more scrutiny in exchange for higher pay is something on which both sides might find some common ground.
I suppose I qualify as a union leader albeit in higher ed, so I'll take the bait. Disclosure: my faculty union was the one to propose merit pay at the table many years ago, and university faculty are more likely to approve of something called merit pay because there is a tradition of peer review for tenure/promotion. (Our collective bargaining agreement provides for general due process and substantive standards but leaves specific procedures for annual reviews to department votes.) So while I am skeptical of several top-down proposals for/policies encouraging performance pay in K-12, that skepticism comes from the problems I see in them rather than from a visceral opposition to merit pay. As the car ads say, your mileage may vary.
There are two policy issues here: one is how to think about teacher pay and working conditions in general, and the other is the question of collective bargaining at the local level (and the centralization/local question more generally). In Accountability Frankenstein, I wrote about high-stakes accountability advocates' simplistic and often flawed grasp of motivation. To put it briefly, even if we had a Holy Grail measure of "teacher contribution to learning," that wouldn't be a sufficient justification for relying on test scores for teacher pay. No one has the best idea for what works best, and a top-down approach would short-circuit even the most rabid merit-pay advocate's interest in finding out what works, in much the same way that NCLB's proficiency measure aborted alternative ways to examine student achievement (including quantitative measures such as average scale score, medians, percentile splits, etc.). Essentially, those interested in performance pay have to make the policy choice between experimentation and a crusade. So to all 0.379 Capitol Hill staffers and campaign advisors reading this blog, you should be wary of federal mandates: if you mandate the wrong formula, everyone will pay the price for Beltway arrogance, and you'll endanger the political legitimacy of the idea for the long term.
Caution about top-down mandates also fits with the local nature of collective bargaining and the affiliate structure in American unions. Despite what people may claim about the NEA's visceral opposition to merit pay, the big picture is more complicated: locals have negotiated performance pay or merit pay or whatever you want to call it, and the governance structures of both the NEA and the AFT commit the national affiliates to support collective bargaining at the local level. (There are also the merged locals and state affiliates that belong to both national affiliates.) That federal structure means that the NEA and AFT support what local leaders decide in terms of bargaining strategy and the agreements that the parties ratify at the local level. Where local leadership negotiates performance pay, the state and national affiliates support that. And where local leadership decides not to negotiate performance pay, the affiliates support that, too. (See a March 2008 column from NEA Today for an example of recent rhetoric that illustrates this complexity.) The more accurate policy position of both the NEA and AFT is that they oppose top-down mandates of performance pay, including how it is structured. The AFT is not officially skeptical of performance pay, but both national affiliates work with and for the locals. If you believe that either national teachers union can dictate bargaining positions to locals, e-mail me about my deep-discount sale price on the Brooklyn Bridge.
The second question about performance pay is thus the degree to which there should be centralized decision-making in education, and that is true for collective bargaining as well as for other matters of policy. It is not necessarily a matter of offering a grand bargain to Randi Weingarten and Dennis Van Roekel, because the bargain for some segments of a national union may be anathema to others. Let me put forward a pro-performance-pay, pro-union person's pipe-dream proposal that would serve someone's interests as a union leader, and you may understand: If I were a K-12 union leader in Florida, I would definitely listen to a national policy proposal that would tie some incentives for performance pay (bargained at the local level) to the degree to which a state had the following in place:
- Collective-bargaining rights for public employees
- Card-check procedures for certification of public employee unions
- Binding arbitration for first contracts after a certain length of bargaining (say, 6-12 months)
- Fair share in a bargaining unit that is represented by a union
As a result of this pattern, where different circumstances lead to different views of policy by local union leaders, you can have leaders sitting in different places, each of whom has a deserved reputation for being able to craft a deal with administrators, but where they have very different views of policy proposals. Ultimately, someone who wants performance pay in K-12 schools has to understand the fact that national affiliates support locals, and that the needs of locals will vary by state environment.
July 28, 2008
Ocala rethinks high grade-retention rates
In the late 1990s, Florida instituted a requirement that third-graders reach a certain test threshold in reading or be held back in third grade. Now the Marion County schools (which include Ocala) are rethinking grade retention where they can (hat tip), having realized they had several hundred middle-school students who could legally drive.
The research on retention is fairly clear: if you have the choice between holding a student back a grade and praying they somehow improve, on the one hand, and advancing the student a grade and praying that they somehow improve, on the other, the better long-term choice is to promote the student and pray. Then again, my colleague Sister Jerome Leavy would point out that while plenty of Catholic schoolteachers believe in the power of prayer, you gotta do some teaching, and that's a poor way to frame public policy questions. Retention/promotion questions are an administrative distraction from the need to identify children who need help and intervene early.
July 23, 2008
Review of "Accountability Frankenstein"
As far as I'm aware, Teachers College Record recently published the first review of Accountability Frankenstein. From the comments by Dick Schutz, "If you are in any way concerned with the status and future of US el-hi education, you owe it to yourself to read this book." You can read the review to see where he thinks I got things right and wrong.
Crisis rhetoric, attention seeking, and capacity building
Berliner and Biddle's The Manufactured Crisis was the independent reading choice of several students in my summer doctoral course, and as they have been writing comments on the book in the last week, I have been thinking about the split retrospective view of the 1983 A Nation at Risk report, produced by the National Commission on Excellence in Education. The report has been on the receiving end of a tremendous amount of criticism by Berliner, Biddle, Jerry Bracey, and many others.
Of the various criticisms of the report, two stick fairly well: the report was thin on legitimate evidence of a decline in school performance, and the declension story is ahistorical. First, the report relied on a poor evidentiary record, using problematic statistics such as the average annual decline in SAT scale scores from 1964 to 1975, statistics the report's authors claimed were proof of declining standards in schools. (Why this was flawed is left as an exercise for the reader.) Using this evidence, the report claimed that
... the educational foundations of our society are presently being eroded by a rising tide of mediocrity that threatens our very future as a Nation and a people. What was unimaginable a generation ago has begun to occur--others are matching and surpassing our educational attainments.
If an unfriendly foreign power had attempted to impose on America the mediocre educational performance that exists today, we might well have viewed it as an act of war. As it stands, we have allowed this to happen to ourselves. We have even squandered the gains in student achievement made in the wake of the Sputnik challenge. Moreover, we have dismantled essential support systems which helped make those gains possible. We have, in effect, been committing an act of unthinking, unilateral educational disarmament.
Where do I start with the problems here: the war-like rhetoric, the implication that we don't want the rest of the world's education to improve, the bald assertion that there is any solid evidence of student achievement gains post-1958 that can be attributed to Sputnik, or the assumption that if there were low expectations observable in the early 1980s it must have been a decline from previous times instead of a generally anti-intellectual culture?
But 25 years after the report's release, it is easy to poke holes in and fun at the hyperbolic rhetoric. What the last few weeks have brought home for me is the very different perceptions of the report. Berliner, Biddle, Bracey, and other critics are absolutely right that the report is factually and conceptually flawed. And yet there are many people involved with the commission who not only thought they were factually correct, they thought that the report's purpose was to help public schooling. If you read various accounts of the commission's work, it is clear that they thought the report was necessary to build political support for school reforms.
Part of the report's origin lies in the campaign promise of President Ronald Reagan to abolish the federal Department of Education. In this regard, his first Secretary of Education, Terrel Bell, brilliantly outmaneuvered Reagan, and within a few months of the report's release, it was clear that the report had resonated with newspaper editorial boards and state policymakers. Even without it, given the Democratic majority in the House and the presence of several moderate Republicans in the Senate, it was unlikely that Congress would abolish the department. After it, the idea was largely unthinkable.
But the motives of Bell and the commission members were clearly not about saving an administrative apparatus. They were true believers in reform, and if all of the recommendations had been followed, today we would have a much more expansive school system. (The recommendations included 200- or 220-day school calendars and 11-month teacher contracts.) Some of the recommendations were followed, primarily expanding high school course-taking requirements and standardized testing, as well as the experiments in teacher career ladders in several states. But the guts of the implemented recommendations were already in the works or in the air: I remember that California state Senator Gary Hart had been pushing an increase in graduation requirements, a bill that passed in 1983. (This is not the same Gary Hart as the famous one from Colorado.) While I could have graduated from high school in 1983 with one or two semesters of math (I forget which), students in my former high school now must take several years of math. (As others have pointed out, one of the unintended beneficial consequences of raising course-taking requirements was dramatically reducing the gender differences in math and science course taking. Richard Whitmire, take note: Terrel Bell is the villain!)
Lest some people not know or have forgotten, A Nation at Risk was not the only major mid-80s report on public schooling. Others were written from a variety of perspectives: Ernest Boyer's High School, Ted Sizer's Horace's Compromise, Arthur Powell et al.'s The Shopping-Mall High School, and John Goodlad's A Place Called School. All were published in 1983 or 1984. All were earnest. All were more thoughtful than A Nation at Risk. I suspect that if Two Million Minutes had been made and released at the same time (if with different non-U.S. countries and different students), it would have fit into that cache of reform reports very well.
Those other reports did not gain the same attention as A Nation at Risk, and I am not certain that any of the reports dramatically changed the policy options discussed at the state level. Changed course requirements and testing were prominent parts of the discussion before the reports, and they were the primary consequences of state-level reforms in the 1970s and 1980s. What the body of reports did instead was push the idea that schools needed reforming. On that score, I think they succeeded, even if several of the report writers (Sizer and Goodlad) became horrified at the direction of reform policies.
Today, we have a new set of actors making similar claims about the need to reform schools: did you receive the e-mail from Strong American Schools/Ed in '08 that I did yesterday? If you didn't, here's the text:
We are only as strong as our schools, and our schools are failing our children.
Consider:
- Almost 70% of America's eighth-graders do not read at grade level.
- Our 15-year-olds rank 25th in math and 21st in science.
- America showed no improvement in its post-secondary graduation rate between 2000 and 2005.
We know that the nations with the best schools attract the best jobs. If those jobs move to other countries, our economy, our lives and our children will suffer.
For that reason, Strong American Schools launched a new campaign this week to combat the crisis in our public schools.
Click on the image below to view our television advertisement:
Please join us. Tell your governors, your state and national representatives and senators that you want a change for stronger schools.
Make your voice heard.
The ad's rhetoric is definitely in line with A Nation at Risk, down to the tagline: "As our schools go, so goes our country." It's tired rhetoric at this point, and I think it's important to understand why the folks behind Strong American Schools are keeping at it, though they've gained no traction in making education a highly visible part of the presidential campaign thus far: as with the major figures in A Nation at Risk, they are true believers in reform to increase the capacity of regulators.
But Strong American Schools has now become a shadow of A Nation at Risk, itself the least substantive of the mid-1980s reports on American schooling. Instead of making specific claims or recommendations, they're pushing "a change for stronger schools," or rather attention. To do so, they claim a crisis, though this is probably the worst time to claim that weak education is the cause of what Phil Gramm calls our "mental recession": to anyone who looks at the current state of the world, our economic woes are the consequences of the subprime mortgage crisis and energy prices (which themselves are driven by the growing Chinese and Indian economies). In 1983, the economy was out of recession. I just don't think the world will realign itself in the same way as in the 1980s. That doesn't mean that there isn't a tie between education and the economy in the long term, but it's diffuse rather than mechanical.
And there's another question here: is it ethical or even helpful to claim that a long-term problem is an acute crisis, just to gain public attention for an issue? We've gone down this road many times before, and I just don't see where it helps in the long term.
July 21, 2008
The higher-ed split among conservatives
One could probably have predicted today's Inside Higher Ed article describing how several conservative academics criticized the current push for quantitative assessment of higher ed. I didn't, but if you did, give yourself a pat on the back.
The article describes a panel on Friday sponsored by the American Academy of Distance Learning (more about that later) where the former head of Margaret Spellings's Office of Postsecondary Education and the executive director of the National Association of Scholars ripped Spellings and her allies for pushing standardized tests in higher ed to the detriment of liberal arts. According to the article, Diane Auer Jones was more diplomatic than Peter Wood, but both complained that the push for accountability was turning reductionist. In this regard, I think Wood's reported comments are on the money: today, the policy rhetoric on higher education is vocational, and that threatens to make the defense of a liberal-arts education more difficult. He ties it to the push for accountability in higher education, and I've had similar concerns about calls for standardized testing as the primary accountability mechanism for colleges.
The predictability comes in the split among conservatives, one that Wood ties back to a "practical"/"classical" distinction in the late 18th century. The Spellings Commission report ignored fundamental tensions in American higher education, and one interesting feature of the report is the invisibility of the curriculum. The report's rhetoric was tied closely to economics, and I suspect that Jones's resignation in May on a matter of principle was the result of a long-simmering frustration among some conservative academics, not an isolated event. No party or political coalition is monolithic, and I've heard several current and former Capitol Hill staffers from Democratic offices who were far closer to Spellings on higher-ed accountability than either Jones or Wood. And I'm closer to Jones and Wood at least on this issue, though I'm a Democrat.
And now the coda: The building frustration among some conservatives that I'm inferring here may explain why Jones and Wood were willing to use the sponsorship of a proprietary university's president's shadow accreditation office: I've tried to look for the "American Academy of Distance Learning," which seemed to be an odd outfit to sponsor a talk about standardized testing and the liberal arts. I found an American Academy of Distance Learning (or at least a reference to its tax-exempt status) headquartered in Denver, but Dick Bishirjian runs the proprietary Yorktown University, which is in Denver... at the same address as AADL, down to the same suite number. But the media advisory for the panel lists AADL with a Norfolk post office box. Bishirjian also appears to be the president of the American Academy of Privatization, a proponent of "privatization training for public officials." I'm not sure what that means, precisely, but the P.O. box for it is the same as that given in the media advisory for AADL. In other words, it looks like Bishirjian has a mail drop in Norfolk and office space in Denver. That's an amazingly slim infrastructure to run a university and two other organizations... or at least to claim so. A July 10 Denver Post article gives a little more information about Yorktown, at least in relationship to Republican Senate candidate Bob Schaffer, who served on Yorktown's board of trustees for several years. Yorktown apparently has a single graduate program and only a few dozen students. Given the plaudits for Bishirjian by Paul Weyrich earlier this month on David Horowitz's website, it looks like Bishirjian had enormous difficulties gaining accreditation. So... is his sponsorship of the forum for Jones and Wood something that's tied to his proprietary institution's interests? 
I don't know if either Jones or Wood is aware of Bishirjian's background or the disconnect between his proprietary institution's curriculum and their arguments, but this is definitely one of the odder sets of bedfellows I've seen in higher education.
July 17, 2008
Teachers and the public sphere
Partially drafted in Chicago Sunday evening, July 13, and revised July 17:
I'm listening to Susan Ohanian at the moment, talking to a group of about 50 AFT delegates and others. Ohanian is a well-known opponent of NCLB and academic standards and was invited to speak at an event sponsored by the AFT Peace and Freedom Caucus (which should sound familiar to NEA national delegates, who can sign up for an NEA Peace and Freedom Caucus as well). As I've written elsewhere, Ohanian is right in several things and wrong in others. (Go read our books to figure out where we agree and disagree; I like her as a person, and she raises important questions about the purpose of education and high-stakes testing.) But I'm more interested this evening in the audience after she and the other speaker (the leader of an independent teachers union in Puerto Rico) finish. The AFT crowd neither applauded nor booed this morning when Barack Obama talked about merit pay in his live-feed speech to the convention floor. (The crowd went to its feet and cheered loudly when he first appeared and cheered again loudly at the end, and applauded at various points in the 10-minute speech. As Mike Antonucci has noted, it's essentially the same speech he gave to NEA, the one that had NEA California delegates booing, so we have an interesting comparison point.) But since a strong positive reaction followed Ohanian's statement that it was wrong for Obama to claim that teachers are the most important influence on children, I'm fascinated.
Part of the reason I'm fascinated is that I think Ohanian's arguments are inconsistent. Ohanian worried about the statement by Obama that "the single most important factor in determining a child's achievement is not the color of their skin or where they come from; it's not who their parents are or how much money they have. It's who their teacher is." Ohanian argued that this statement is rhetoric that sets up blaming teachers for all sorts of problems they are not responsible for. A few minutes later, she claimed that the real danger of high-stakes accountability was the destruction of children's imaginations and the creation of a compliant workforce. But there's a logical inconsistency here: how can schools create worker robots if they are not powerful in shaping the lives of children?
I worry (and I said towards the end of the event) that Ohanian's criticism undercuts arguments about the importance of the public sphere. You can say that teachers are not crucial to children's lives, but then it's hard to argue that schools should be well-funded. You can say that teachers are not crucial, but then it's hard to argue against all sorts of problematic policy proposals that take authority away from teachers or that position teachers' professional judgment as irrelevant. Ohanian was nodding in acknowledgment at the time, so I think (or I hope) she knows that her impromptu remarks were not consistent with either her deeper views of schooling or those of most teachers.
As it turned out, my initial impression of the crowd was wrong: there was a lively discussion after the speakers finished, with plenty of dissent from Ohanian's arguments. So in one sense, I never had my question answered: what drew some of the delegates to agree with the remarks by Ohanian that concerned me the most?
July 15, 2008
Know what union membership means before you write, Ray
Ray Fisman wrote a laudatory article released Friday by Slate about NYC's P.S. 49 principal Anthony Lombardi, an article with themes remarkably similar to what Robert Kolker wrote for New York Magazine in 2003, even down to quoting Randi Weingarten calling Lombardi a tyrant without crediting Kolker. Fisman links to an Inside Schools page summarizing P.S. 49 data and using Kolker's quotation, again without credit. C'mon, Mr. Fisman: if I can find the source by Googling, why couldn't you? (Given that flaw, I am doubtful of Fisman's claim that Lombardi was ever "at the top of the teachers-union hit list": is there evidence of any such list, or is it just colorful language to cover up a reporter's lassitude?)
But the passage that had me laughing was the following bit of ignorance:
Currently, New York City teachers get their union cards their first day on the job. In theory they're on probation for three years after that, but in practice very few are forced out. Lombardi suggests replacing this system with an apprenticeship program. Rather than requiring teaching degrees (which don't seem to improve value-added all that much), new recruits would have a couple of years of in-school training. There would then come a day of reckoning, when teachers-to-be would face a serious evaluation before securing union membership and a job for life.
Here is a fundamental conflation of tenure and union membership, or union membership with the legal protections of a collective bargaining agreement, or "serious evaluation" with something. I'm not sure where the root of the error lies, but I do know one thing that's true everywhere, as far as I know: union membership does not change your legally recognized rights under a collective bargaining agreement. It does other things that are important (greater chance of gains at the bargaining table through solidarity, access to specific benefits provided by the union beyond CBA protection, etc.), but Fisman just doesn't know what he's talking about here.
And then Joanne Jacobs repeats the error. Wince time...
July 9, 2008
Can reporters raise their game in writing about education research?
I know that I still owe readers the ultimate education platform and the big, hairy erratum I promised last month, but the issue of research vetting has popped up in the education blogule*, and it's something I've been intending to discuss for some time, so it's taking up my pre-10:30-am time today. In brief, Eduwonkette dismisses the new Manhattan Institute report on Florida's high-stakes testing regime as thinktankery, drive-by research with little credibility because it hasn't been vetted by peer review. Later in the day, she modified that to explain why she was willing to promote working papers published through the National Bureau of Economic Research or the RAND Corporation: they have a vetting process for researchers or reports, and their track record is longer. Jay Greene (one of the Manhattan Institute report's authors and a key part of the think tank's stable of writers) replied with probably the best argument against Eduwonkette (or any blogger) in favor of using PR firms for unvetted research: as with blogs, publicizing unvetted reports involves a tradeoff between review and publishing speed, a tradeoff that reporters and other readers are aware of.
Releasing research directly to the public and through the mass media and internet improves the speed and breadth of information available, but it also comes with greater potential for errors. Consumers of this information are generally aware of these trade-offs and assign higher levels of confidence to research as it receives more review, but they appreciate being able to receive more of it sooner with less review.
In other words, caveat lector.
We've been down this road before with blogs in the anonymous Ivan Tribble column in fall 2005, responses such as Timothy Burke's, a second Tribble column, another round of responses such as Miriam Burstein's, and an occasional recurrence of sniping at blogs (or, in the latest case, Laura Blankenship's dismay at continued sniping). I could expand on Ernest Boyer's discussion of why scholarship should be defined broadly, or Michael Berube's discussion of "raw" and "cooked" blogs, but if you're reading this entry, you probably don't need all that. Suffice to say that there is a broad range of purpose and quality of blogging: some blogs such as The Valve or the Volokh Conspiracy have become lively places for academics, while others such as The Panda's Thumb are more of a site for the public intellectual side of academics. These are retrospective judgments that are only possible after many months of consistent writing in each blog.
This retrospective judgment is a post facto evaluation of credibility, an evaluation that is also possible for institutional work. That judgment is what Eduwonkette is referring to when making a distinction between RAND and NBER, on the one hand, and the Manhattan Institute, on the other. Because of previous work she has read, she trusts RAND and NBER papers more. (She's not alone in that judgment of Manhattan Institute work, but I'm less concerned this morning with the specific case than the general principles.)
If an individual researcher needed to rely on a track record to be credible, we'd essentially be stuck in the intellectual equivalent of country clubs: only the invited need apply. That exists to some extent with citation indices such as Web of Science, but it's porous. One of the most important institutional roles of refereed journals and university presses is to lend credibility to new or unknown scholars who do not have a preexisting track record. To a sociologist of knowledge, refereeing serves a filtering purpose to sort out which researchers and claims to knowledge will be able to borrow institutional credibility/prestige.
Online technologies have created some cracks in these institutional arrangements in two ways: reducing the barriers to entry for new credibility-lending arrangements (i.e., online journals such as the Bryn Mawr Classical Review or Education Policy Analysis Archives) and making large banks of disciplinary working papers available for broad access (such as NBER in economics or arXiv in physics). To some extent, as John Willinsky has written, this ends up in an argument over the complex mix of economic models and intellectual principles. But its more serious side also challenges the refereeing process. To wit, in judging a work how much are we to rely on pre-publication reviewing and how much on post-publication evaluation and use?
To some extent, the reworking of intellectual credibility in the internet age will involve judgments of status as well as intellectual merit. To avoid doing so risks the careers of new scholars and status-anxious administrators, which is why Harvard led the way on open-access archiving for "traditional" disciplines and Stanford has led the way on open-access archiving for education, and I would not be surprised at all if Wharton or Chicago leads in an archiving policy for economics/business schools. Older institutions with little status at risk in open-access models might make it safer for institutions lower in the higher-ed hierarchy (or so I hope). (Explaining the phenomenon of anonymous academic blogging is left as an exercise for the reader.)
But the status issue doesn't address the intellectual question. If not for the inevitable issues of status, prestige, credibility, etc., would refereeing serve a purpose? No serious academic believes that publication inherently blesses the ideas in an article or book; publishable is different from influential. Nonetheless, refereeing serves a legitimate human side of academe, the networking side that wants to know which works have influenced others, which are judged classics, ... and which are judged publishable. Knowing that an article has gone through a refereeing process comforts the part of my training and professional judgment that values a community of scholarship with at least semi-coherent heuristics and methods. That community of scholarship can be fooled (witness Michael Bellesiles and the Bancroft Prize), but I still find it of some value.
Beyond the institutional credibility and community-of-scholarship issues, of course we can read individual works on their own merit, and I hope we all do. Professionally-educated researchers have more intellectual tools which we can bring to bear on working papers, think-tank reports, and the like. And that's our advantage over journalists; we know the literature in our area (or should), and we know the standard methodological strengths and weaknesses in the area (or should). On the other hand, journalists are paid to look at work quickly, while I always have competing priorities the day a think-tank report appears.
That gap provides a structural advantage to at least minimally-funded think tanks: they can hire publicists to push reports, and reporters will always be behind the curve in terms of evaluating the reports. More experienced reporters know a part of the relevant literature and some of the more common flaws in research, but the threshold for publication in news is not quality but newsworthiness. As news staffs shrink, individual reporters find that their beats become much larger, time for researching any story shorter, and the news hole chopped up further and further. (News blogs solve the news-hole problem but create one more burden for individual reporters.)
Complicating reporters' lack of time and research background is the limited pool of researchers who carve out time for reporters' calls and who understand their needs. In Florida, I am one of the usual suspects for education policy stories because I call reporters back quickly. While a few of my colleagues disdain reporting or fear being misquoted, the greater divide is cultural: reporters need contacts to respond within hours, not days, and they need something understandable and digestible. If a reporter leaves me a message and e-mails me about a story, I take some time to think about the obvious questions, figure out a way of explaining a technical issue, and try to think about who else the reporter might contact. It takes relatively little time, most of my colleagues could outthink me in this way, and somehow I still get called more than hundreds of other education or history faculty in the state. But enough about me: the larger point is that reporters usually have few contacts who have both the expertise and time to read a report quickly and provide context or evaluation before the reporter's deadline. Education Week reporters have more leeway because of the weekly cycle, but when the goal of a publicist is to place stories in the dailies, they have all the advantages with general reporters or reporters new to the education beat.
In this regard, the Hechinger Institute's workshops provide some important help to reporters, but from everything I have read, the workshops are usually oriented to current topics, providing ideas for stories, general context, and "what's hot" rather than helping reporters respond to press releases. Yet reporters need help from a research perspective that's still geared to their needs. So let me take a stab at what should appear in reporting on any research in education, at least from my idiosyncratic reader's perspective. I'll use the reporter's 5 W's (plus how), split into publication and content issues:
- Publication who: authors' names and institutional affiliations (both employer and publisher) are almost always described.
- Publication what: title of the work and conclusions are also almost always described. Reporters are less successful in describing the research context, or how an article fits into the existing literature. Press releases are rarely challenged on claims of uniqueness or what is new about an article, and think-tank reports are far less likely than refereed articles or books to cite the broadly relevant literature. When reporters call me, they frequently ask me to evaluate the methods or meaning but rarely explicitly ask me, "Is this really new?" My suggested classification: entirely new, replicates or confirms existing research, or is counter to existing research. Reporters could address this problem by asking sources about uniqueness, and editors should demand this.
- Publication when: publication date is usually reported, and occasionally the timing context becomes the story (as when a few federal reports were released on summer Fridays).
- Publication where: rarely relevant to reporters, unless the institutional sponsor or author is local.
- Publication why: Usually left implicit or addressed when quoting the "so what?" answer of a study author. Reporters could explicitly state whether the purpose of a study is to answer fundamental questions (such as basic educational psychology), to address applied ones (as with teaching methods), to influence policy, etc.
- Publication how: Usually described at a superficial level. Reporters leave the question of refereeing as implicit: they will mention a journal or press, but I rarely see an explicit statement that a publication is either peer-reviewed or not peer-reviewed. There is no excuse for reporters to omit this information.
- Content who: the study participants/subjects are often described if there's a coherent data set or number. Reporters are less successful in describing who is excluded from studies, though this should be important to readers and reporters could easily add this information.
- Content what: how a researcher gathered data and broader design parameters are described if simple (e.g., secondary analysis of a data set) or if there is something unique or clever (as with some psychology research). More complex or obscure measures are usually simplified. This problem could be addressed, but it may be more difficult with some studies than with others.
- Content when: if the data is fresh, this is generally reported. Reporters are weaker when describing reports that rely on older data sets. This is a simple issue to address.
- Content where: Usually reported, unless the study setting is masked or an experimental environment.
- Content why: Reporters usually report the researchers' primary explanation of a phenomenon. They rarely write about why the conclusion is superior to alternative explanations, either the researchers' explanations or critics'. The one exception to this superficiality is on research aimed at changing policy; in that realm, reporters have become more adept at probing for other explanations. When writing about non-policy research, reporters can ask more questions about alternative explanations.
- Content how: The details of statistical analyses are rarely described, unless a reporter can find a researcher who is quotable on it, and then the reporting often strikes me as conclusory, quoting the critic rather than explaining the issue in depth. This problem is the most difficult one for reporters to address, both because of limited background knowledge and also because of limited column space for articles.
Let's see how reporters did in covering the new Manhattan Institute report, using the St Petersburg Times (blog), Education Week (blog thus far), and New York Sun (printed). This is a seat-of-the-pants judgment, but I think it shows the strengths and weaknesses of reporting on education research:
| Criterion | Times (blog) | Ed Week (blog) | Sun |
|---|---|---|---|
| Publication | | | |
| Who | Acceptable | Acceptable | Acceptable |
| What | Weak | Acceptable | Weak |
| When | Acceptable | Acceptable | Acceptable |
| Where | N/A | N/A | N/A |
| Why | Implicit only | Implicit only | Implicit only |
| How | Acceptable | Absent | Absent |
| Content | | | |
| Who | Acceptable | Acceptable | Acceptable |
| What | Weak | Weak | Weak |
| When | Acceptable | Acceptable | Acceptable |
| Where | Acceptable | Acceptable | Acceptable |
| Why | Weak | Acceptable | Weak |
| How | Weak | Weak | Weak |
Remarks: I rated the Times and Sun items as weak in "publication what" because there was no attempt to put the conclusions in the broader research context. All pieces implied rather than explicitly stated that the purpose of the report was to influence policy (specifically, to bolster high-stakes accountability policies). Only the Times blog noted that the report was not peer-reviewed. All three had "weak" in "content what" because none of them described the measures (individual student scale scores on science adjusted by standard deviation). Only the Ed Week blog entry mentioned alternative hypotheses. None described the analytical methods in depth.
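For readers wondering what "scale scores adjusted by standard deviation" usually amounts to, here is a minimal sketch of the generic idea: re-express each score as its distance from the group mean in standard-deviation units. This is an assumption about the general technique, not the report's actual procedure, and the scores and the function name `standardize` are hypothetical.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Re-express raw scale scores in standard-deviation units (z-scores)."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation of this group
    if sigma == 0:
        raise ValueError("all scores are identical; cannot standardize")
    return [(s - mu) / sigma for s in scores]


# Hypothetical science scale scores for five students (not real data).
hypothetical_scores = [280, 300, 320, 340, 360]
z_scores = standardize(hypothetical_scores)
```

A sentence of this kind ("a student at 360 scored about 1.4 standard deviations above the group average") is the sort of plain description of a measure that none of the three pieces offered.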
While some parts of reporting on research are hard to improve on a short deadline (especially describing regression discontinuity analysis or evaluating the report without the technical details), the Ed Week blog entry was better than the others in several areas, with the important exception of describing the non-refereed nature of the report. So, education reporters: can you raise your game?
* - Blogule is an anagram of globule and connotes something less global than blogosphere. Or at least I prefer it. Could you please spread it?
July 8, 2008
300 v. 10,000 and the broader discussion of performance pay
A bit more on Obama, performance pay, and the NEA: I commented yesterday about the Mike Antonucci video of Obama's speech to the representative assembly and the light round of boos when he mentioned performance pay (or merit pay or differential pay: take your pick, it doesn't change the substantive matters). Antonucci responds with more about his impression of the response (whether boos or cheers were louder for Obama, for which segments, etc.). I wasn't there, so I'll take his word that I miscounted from the spectacular audio on Youtube. I'm not sure that matters much either for the politics (which is that Obama is popular among teachers, but he and union leaders disagree most about performance pay) or for the substantive policy.
Charles Barone updated his entry on the matter twice, and here's the relevant matter:
I and many of the people who were passing this around are a little more skeptical than Sherman about what is needed to effect the kind of change Obama is talking about. The teacher quality problem is national. And urgent. It requires a national solution, which is frankly long overdue
Here we see what I explain to my undergraduate students: NCLB and education politics more generally have created a vicious circle of distrust. Because of how states respond to NCLB (some of which is pushed by the law and some a matter of state choice), teachers and parents at the local level have an increasingly negative view of NCLB and states. And because of the same choices, national policymakers and the Beltway view states and local actors with even more distrust.
The argument that Problem X "requires a national solution" is more a reflection of this distrust than a result of serious research or policy perspectives about the role of the federal government. (See Manna, McGuinn, DeBray-Pelot, Kaestle, and others on federalism in education policy.) The federal government can do many things, and some things it must do, but federal education law is pretty blunt. It has never been a policy scalpel. And everything we know about performance pay and merit pay is that the details matter a great deal, a situation where federal mandates would be disastrous and eventually undercut any transient support for merit pay.
I know that the details matter from my observations of a cudgel-like mandate in my own state and also from my own experience with merit pay in higher ed: my colleagues generally like merit pay because departments are in control of the procedures and vote on them. Test scores play no role, and support for merit pay would evaporate if any of the K-12 schemes involving those were floated here. The most quantitatively-oriented department chair I know is least confident about evaluations of teaching and most confident on research, for a variety of reasons. Even so, my colleagues also support across-the-board raises (salaries at USF are in the fourth quintile of research-extensive universities, in terms of the national distribution) and compression-inversion remedies.
July 7, 2008
300 booing is somehow more important than 10,000 delegates
Former Hill staffer Charles Barone wrote early this morning that a video of Barack Obama's speech to the NEA Representative Assembly last week was being watched closely by "Congressional staff and education policy folks." Barone highlights a point in the speech where Obama says he is in favor of performance pay and where you can hear some booing in the background. "Pretty striking, booing a plan to give teachers who do more work, attain certain skills, or take tough assignments more money."
Barone is taking that moment far out of context, and so is anyone who draws a similar conclusion: what sounds like several hundred people booing is in a hall of about 10,000 delegates, and the cheers at other moments easily outweighed the booing. Even the laughter at Barack's comment after that moment was far louder. Bargaining performance pay is a hot topic among teacher union officers, and it should be clear that many union leaders are highly skeptical of any and all performance pay plans. I don't want to paper that over. There are plenty of reasons for union officials to be skeptical, given the history of arbitrary administrative evaluations before unionization, pay plans that have been imposed without bargaining, or pressure tactics that can undermine local bargaining. On the other hand, I can think of several locals (including those in the NEA) who have bargained performance pay when they have been part of its development.
In the end, Barone's comment is sad evidence of a Beltway mentality: Hill staffers know best. Neither members of Congress nor local school board members nor union leaders inherently know best. Where that type of arrogance rears its head, it undermines what should be happening: discussion.
(Disclosure: My own faculty union was the first to propose merit pay many years ago in the statewide contract, and of all the locally-derived money at USF for collectively bargained raises since our first local contract in 2004, two thirds has been for merit pay.)
June 13, 2008
I was manifest(o)ly wrong
June 12, 2008
Shared responsibilities for children I
Probably the most important issue is the role of schools in citizenship and the welfare state. Because schooling became closely tied to the rhetoric of citizenship two out of the three times that the franchise expanded dramatically in the past two centuries, we think of education today as a birthright. Primary education became common in the U.S. earlier than in other early-industrializing countries, and as a result education is the primary form of social citizenship in this country. As Hochschild and Scovronick note, we imbue education with many of the same functions that a broader welfare state serves in other industrialized countries: education is supposed to advance economic opportunity, better health, happier lives, and so forth. (The last, most corrupt form of progressive curriculum ideas was called the Life Adjustment movement, and it was the reductio ad absurdum of education as a substitute for broader social citizenship.) So now schools are supposed to do everything from resuscitate the economy to save lives to ... oh, I don't know, cure split ends. There is a legitimate and identifiable human capital consequence to education, but the rhetoric on that is overblown. There is an inevitable temptation to see education as the cure for all ills, and the politics of education is liberally infected with panacea attribution disease. One part of the serious debate over accountability is the precise role of schools, and that is intimately tied to questions about the extent of the American welfare state.
One complication in thinking about education is the fact that elementary and secondary schooling is among the most equally distributed resources in the United States. In the states with the worst inequality in school spending, you'll see maybe two or even three times as much spending for some children as for others. Think about the distribution of other resources: access to health care, housing, transportation. All are distributed less equally than schools, because schooling is part of the democratic state and a right of citizenship by politics and state constitutions. That fact does not excuse educational inequality, but it's something we don't talk about openly or think about clearly.
I think there's a way out from the quagmire I've identified above: schools, other agencies, and families share responsibilities for children. Each is independently responsible for a reasonable but critical role in the lives of young people. Schools are not time machines: they cannot go back and undo what happened or didn't happen in earlier years, nor can they provide health care, clean air, and so forth. Nor can they take over the lives of children. But neither are they or teachers able to use the rest of children's lives as excuses; you take the students you have and move them. Period. The same is true for parents: they're not responsible for teaching their children calculus. But neither are they supposed to sit on their butts when things go wrong in schools, nor is it responsible to neglect their children. Oh, yes, and you're responsible for talking with people in the other roles, too.
There is a crucial advantage of having twin principles (responsibilities for both coordination and independent functioning): It fits with the broad sense of U.S. parents and other adults that both families and schools are responsible for academic achievement. I've pointed out this apparent inconsistency for several years, but in reality it's not an inconsistency. It reflects one reasonable solution to the dilemma: we're all supposed to be responsible.
But there's a sticking point in this grand ideal: given that schools have a serious but limited responsibility, how do we define the scope of that responsibility? Let's assume (for now) that we're concerned primarily with academic achievement. What exactly do we want schools to do? The final issue I want to identify is the series of shortcuts we take when talking about standards, proficiency, expectations, and any synonym you can find for the general concept of what we want children to learn. I have made the following point in Accountability Frankenstein among other places, and no one has even challenged me on it: almost every policy displaces the hard choices about expectations into a different forum. That doesn't mean that I have no expectations for my children or for schools. It just means that the process of turning rhetoric into policy mechanism removes the definition of academic expectations from public debate. Some of us say we want "high standards," but that does not say a single thing except in the politics of symbolism. Reformulating the concept doesn't help: growth models are equally suspect. In short, "proficiency" is a cipher.
Oh, damn: and there you thought I was headed into a Grand Bargain, a reasonable solution to all the fighting over accountability? Unfortunately, I'm an historian, not a Nobel Peace Prize winner. And I have somewhere to be in a few minutes. But do not fear: for those who grumble about the lack of specifics in this week's manifestoes or this entry, just hold on (or read the last chapter in my book, which is available without waiting for the second entry on this topic).
June 10, 2008
Missing out of the action, still
I'm going to be late in responding to this (and other major stories such as Ed Week's grad-rate release last week). I'd give my brief gloss on the topic, but I've already written a book on accountability, and I'm too exhausted right now for pithy comments.
May 28, 2008
The test-prep nightmare
I'm OK with test prep. When standardized tests are well-crafted, as they are in my state, teachers should use tests to shape their classroom instruction. Done thoughtfully, "teaching to the test" is a good idea. But at my school, and others in Houston, we execute test prep so poorly that it ends up hurting students more than it helps them.
The concrete description in the rest of the entry shows what happens in the school where he teaches:
... the sticker exercise told us little about our students' needs...
Mostly, teachers made worksheets with questions only loosely related to each other taken from previous TAKS tests, or, in some cases, from math textbooks that are largely unaligned with the TAKS test. Think panicked college students poring over Cliffs Notes for the wrong novel.
Sometimes, the school made all math teachers work off of the same worksheets, regardless of the fact that they taught different subjects....
Our test prep worksheets aim to review important skills. But oftentimes students have not learned these skills in the first place. And the worksheets don't fix that....
Students choose not to try mostly because they think they have no chance to succeed. That's not their fault. At Hastings, we are far too willing to exchange gimmicky test-prep and other instructional shortcuts for real teaching.
Rosenthal's vision of teaching-to-the-test done right is in line with the argument of Lauren Resnick, if TAKS were such a "good test" (many would disagree), and if that incentive pushed the type of instruction Rosenthal prefers (i.e., good instruction). But that's far too rare.
May 21, 2008
Qualitative data on schools
Yesterday's story in the Washington Post (hat tip) on in-person reviews of schools by external committees is one step in the right direction for accountability: using in-person eyeballs instead of just statistical eyeballs to see what should be done. Rhee sent teams of people into schools she wanted to change. There are some questions I still have after reading the article: why only one- and two-day visits? what did the DC teachers union think of the reviews? what did other stakeholders think? But even if there were flaws with this process, having students, parents, and educators visit schools to provide a snapshot is dramatically different from just looking at test scores and prescribing a cookie-cutter "fix."
(Note: Ken DeRosa pointed out the false dichotomy I had when rushing this entry through yesterday, and I trust this is now more "just.")
May 19, 2008
Political science/political philosophy and education policy
That's probably one logical direction for some good academic work to head in, after the solid work done by Manna, McGuinn, and DeBray (three new scholars: go buy their books!). Education governance is such a complicated mess for some who think about school reform that it's a wonderful place for academics to play.
April 20, 2008
The Indiana Jones response to philosophy-of-research blogging
Kevin Carey has his say on a preponderance-of-evidence standard on policy propositions (in response to an Eduwonkette discussion of growth measures). Skoolboy responds. I wouldn't go all ad-lib-for-convenience on you all if it weren't 11:20 at night, but I'm tired, and since this is a meta-discussion about judging teachers based on test scores, I'll just say this: It already happens (firing educators based on test scores), it's called reconstitution, and the evidence of its success is mediocre at best. We don't need to go all meta- when there's experience at hand... or specific proposals such as New York City's (which Skoolboy points out fails the sniff test of basic algebra).
If anyone were tempted to go meta-, I'd point out that there is no such thing as a monolithic social scientist's frame for policy. Then again, I'm not only an alleged social scientist, I'm a card-carrying member of the Social Science History Association and have a degree in one of those odd number-crunching realms (demography).
April 13, 2008
Legislative rolling and the New York budget language on tenure
One more thought on the New York state budget's language placing a moratorium on using test scores to deny teachers tenure: I'm wondering how much of the ire directed at the legislature and the calumny aimed at NYSUT (the state teachers union affiliate) is about the process of how this happened, i.e., without the "right" people in control or at the table.
I suspect the substance of the language is all about the waiting game going on with the end of Michael Bloomberg's second term as New York mayor. The use of value-added measures as the sole or a primary tenure criterion is now blocked until after Bloomberg is out of office (and after Joel Klein is also likely to be gone as schools chancellor). Whatever decisions are taken after the moratorium ends will be taken by other people, in other political circumstances.
And it's that fact that makes me wonder about the undiscussed process issue. For the last seven and a half years, plenty of players were ignored in education policymaking. That's why the legislature approved mayoral control: to remove large bunches of stakeholders from the decision-making, in hopes that putting power in the hands of one person (Mayor Bloomberg) would aid significant reform. The political regime that followed that decision is something I'll leave to others to describe (and I suspect it would make a great dissertation for someone in the New York area), but the whole point of mayoral control was to remove people from the policymaking process.
So what happened in Albany? According to the critics of the decision who blamed NYSUT, the teachers union used every lobbying trick at their disposal to hide this provision in the budget while it was being drafted/finalized, while others (Bloomberg and allies) were left out of the process. The tone used by DFER head Joe Williams is one of anger and surprise, a "we was robbed" attitude. One informal term for being robbed and beaten up in the process is "being rolled," and that's much the impression I get from the critics of the language, especially the New York Daily News's referring to Albany as in the midst of a "legislative crime wave." No one likes to be rolled politically, but the irony here is that many of those who disapprove of being rolled in Albany haven't said boo about others' being rolled in NYC.
April 9, 2008
There it ain't -- a rap on The Quick and the Ed's knuckles
In The Quick and the Ed today, Kevin Carey boldly overclaims:
The Times is reporting that, at the behest of the teachers unions, last-minute language was snuck into the New York State budget providing that "teacher[s] shall not be granted or denied tenure based on student performance data." There's really not much one can add to that; it's hard to imagine a more unambiguous declaration of the union's total disregard for student learning when its members' jobs are at stake.
I suppose there really isn't much to add except that the Times article clearly states that the provision in question is not a ban but a two-year moratorium. It's hard to imagine a more unambiguous declaration of the union's caution about buying into rash schemes, and it puzzles me why Carey would make such an obvious omission in a way that undercuts his argument. See Eduwonkette for more links.
April 3, 2008
A dozen questions for an official graduation rate
When the OMB clears the draft regs on counting dropouts, we can expect another wave of stories on graduation rates and what they all mean. Sharp reporters and other observers will ask the following questions of the draft regs:
- Does the definition of graduation include or exclude non-standard completion categories such as GEDs and "certificates of completion"?
- How does the definition of graduation handle students with disabilities with a modified curriculum (that is, with an emphasis on functional rather than academic goals)?
- Is the mandatory measure a longitudinal statistic such as the NGA compact or a synthetic measure such as Chris Swanson's Cumulative Promotion Index? (I will assume until proven wrong that it is a longitudinal measure.)
- Regardless of the measure proposed, how many states have data systems that can produce the statistics required?
- How does the measure address transfers, homeschooling, migration, and mortality?
- For the adjustments proposed for transfers, homeschooling, migration, and mortality, are there any requirements that states audit the corresponding codes in their data systems?
- How does the proposed measure handle grade retention (e.g., multiple years in ninth grade)?
- Does the proposed measure forbid a state from using the Florida tactic of calling a dropout a transfer if the dropout immediately enrolled in a GED program?
- How does the proposed measure handle students who graduate in five years?
- Do the proposed regs require that school districts and schools must meet benchmarks in graduation in the same way that they must meet benchmarks with % 'proficient'?
- If there are such required benchmarks, is there any supporting research to suggest that the status or improvement benchmarks are realistic?
- In crafting the draft regs, did the Department of Education consult with more than two of the researchers recognized to have published in the relevant area, such as Chris Swanson, Rob Warren, Melissa Roderick, Russell Rumberger, Bob Hauser, Michelle Fine, or Gary Orfield? I'm an historian, and we're generally trotted out as mantel decorations for such affairs, if at all, but there are plenty of solid researchers in the area who could be consulted. And if you're a reporter, you need to line up a few of those folks to be ready to respond to draft regs.
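For readers who want the arithmetic behind several of these questions, here is a minimal sketch, not anything taken from the draft regs. It assumes a longitudinal rate is on-time graduates divided by an entering cohort adjusted for transfers and mortality, and that a synthetic rate multiplies one year's grade-to-grade promotion ratios by a diploma ratio, which is how Swanson's index is usually described. Every count, class name, and function name below is hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cohort:
    """Hypothetical counts for one entering ninth-grade class."""
    first_time_ninth: int   # students entering ninth grade for the first time
    transfers_in: int       # students who joined the cohort later
    transfers_out: int      # students coded as leaving for another school
    deaths: int
    on_time_graduates: int  # standard diplomas earned within four years


def longitudinal_rate(c: Cohort) -> float:
    """On-time graduates divided by the adjusted cohort (assumed definition)."""
    adjusted = c.first_time_ninth + c.transfers_in - c.transfers_out - c.deaths
    return c.on_time_graduates / adjusted


def recode_dropouts_as_transfers(c: Cohort, n: int) -> Cohort:
    """Move n students from 'dropout' to 'transfer out', shrinking the denominator."""
    return replace(c, transfers_out=c.transfers_out + n)


def synthetic_rate(fall_now: dict, fall_next: dict, diplomas: int) -> float:
    """Product of 9->10, 10->11, 11->12 promotion ratios and a 12->diploma ratio.

    fall_now maps grades 9-12 to this fall's enrollment; fall_next maps grades
    10-12 to next fall's enrollment; diplomas are awarded at the end of this
    school year.
    """
    rate = diplomas / fall_now[12]
    for grade in (9, 10, 11):
        rate *= fall_next[grade + 1] / fall_now[grade]
    return rate


# Hypothetical district: the adjusted cohort is 1000 + 60 - 58 - 2 = 1000.
base = Cohort(first_time_ninth=1000, transfers_in=60, transfers_out=58,
              deaths=2, on_time_graduates=650)
recoded = recode_dropouts_as_transfers(base, 100)

base_rate = longitudinal_rate(base)        # 650 / 1000
recoded_rate = longitudinal_rate(recoded)  # 650 / 900

synthetic = synthetic_rate(
    fall_now={9: 1000, 10: 900, 11: 850, 12: 800},
    fall_next={10: 900, 11: 810, 12: 765},
    diplomas=720,
)
```

In this made-up district no additional student graduates, yet recoding 100 dropouts as transfers lifts the longitudinal rate from 65 percent to about 72 percent. That is why the questions about auditing transfer codes matter as much as the choice of formula.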
April 1, 2008
Gradu[r]ated
So U.S. Secretary of Education Margaret Spellings Announces Department Will Move to a Uniform Graduation Rate, Require Disaggregation of Data (the true title of the press release today announcing imminent-but-not-published draft regs defining a graduation rate and only a few words away from the type of book title that would cure almost any insomnia). And George Miller huffs some that it wasn't bipartisan (hat tip to David Hoff on the Miller statement). So what's the buzz about?
- Spellings is channeling Adlai Stevenson's approach to governance and proudly announcing bold action on issues that are almost consensual and would happen without her intervention.
- Especially for this particular issue, the devil is in the details. Florida has a longitudinal graduation measure, but that doesn't mean it's accurate. If the regulatory language released in draft form would allow Florida to keep doing what it's doing officially, you won't see much in the form of transparency (and at least with two issues, you may see things get worse).
- Spellings is hoping the gravitas and charm of Colin Powell rubs off. Admittedly, Powell hasn't (yet) been on NPR's Wait, wait, ...
Maybe this is more evidence that Spellings will run for elected office in Texas and claim that she created growth measures, differentiated consequences, and airtight graduation rates. At least she's not claiming to have invented the Internet...
March 19, 2008
"Differentiated accountability"
Alexander Russo links to news coverage of the Margaret Spellings announcement yesterday that maybe not all AYP failures are the same. Here's some blog coverage:
- Ed Week reporter David Hoff notes the irony that Spellings made the announcement in a state that was ineligible for the pilot.
- St Pete Times reporter Ron Matus anticipates that Florida politicians will say "we did it first!"
- Jim Horn quotes the FairTest reaction that it's rearranging deck chairs on the Titanic.
- Eduwonk Andy Rotherham says it's a good policy move, but the administration's political context for the announcement ain't pretty.
- NSBA's BoardBuzz calls it "probably a day late and a dollar short."
- Eduflack wonders why Spellings is allergic to the word "flexibility."
Spellings went to growth pilots, waivers (or turning the other cheek) to allow tutoring before choice, and now differing judgments on failure to meet AYP after others talked about the ideas for years. I think Spellings is just channeling Adlai Stevenson, who once quipped that leadership is seeing where the crowd is heading and getting in front of it.
(Does anyone know the exact wording or source for that?)
Florida ed policy and politics
The legislative session is in full swing (or a more colorful noun), and a bunch of things are in the air either in Tallahassee or elsewhere:
1. Both houses of the state legislature are considering bills to change the role of state testing (FCAT), either by adding other information to the labeling of high schools (the senate's approach) or by a compromise bill that discourages test-prep and sets more specific grade-level standards (the proposal in the house).
2. The ACLU sues Palm Beach County for its low high school graduation rate. Superintendent Art Johnson suggests it's the state's fault for not providing enough money (scroll down for "But the superintendent..."). (Disclosure: A 2006 paper of mine is mentioned in both stories.)
3. Something that wasn't covered in my local papers in January: Holmes County administrators have banned students from displaying anything related to gay pride. The ACLU of Florida sued. I suspect this one's a no-brainer in a bench trial: in the majority opinion in Morse v. Frederick, Chief Justice Roberts made a distinction between what he thought of as the political speech of Tinker and the display of "Bong Hits 4 Jesus."
The only interest the Court discerned underlying the school's actions [in Tinker] was the "mere desire to avoid the discomfort and unpleasantness that always accompany an unpopular viewpoint," or "an urgent wish to avoid the controversy which might result from the expression." Tinker, 393 U. S., at 509, 510. That interest was not enough to justify banning "a silent, passive expression of opinion, unaccompanied by any disorder or disturbance." Id., at 508.
I think that reasoning clearly applies in this case.
March 11, 2008
Defending Effective Accountability and Assessment Practices
Saturday, March 29, 2008
10:45-12:15
Hilton Washington
Defending Effective Accountability and Assessment Practices is the title of the session I'm a participant in at the NEA/AFT Higher Education Joint Conference.
From what I understand, the tentatively-slated participants include staff members of two institutional associations as well as us faculty. As soon as I have permission to post those names, I'll do that.
February 28, 2008
Is the blind spot on higher-ed accountability that big?
In all the kerfuffle over the senior theses of Hillary Clinton and Michelle Obama, I hope I am not the only person asking the other question that I think is obvious and to the point: What do the theses tell us about the state of undergraduate education for Princeton and Wellesley students at the time?
Similarly, all those who huff and puff about higher-ed accountability are ignoring a huge source of information on the quality of graduate education: dissertations. Want to know what the expectations of students are really like? Go read what students create, when they know it's going in the library, going to be microfilmed, or going to be available electronically to the world.
February 25, 2008
NCLB and where we sit
In my undergraduate social foundations class, I spend some time explaining the politics of accountability. For the last few years, a critical mass of students (either a majority or a vocal minority) have consistently opposed accountability, taking on the mantle of professionalism, and it's my job to rattle their cages and make them see things using at least one other lens.
I usually explain things in words something like the following:
Views of accountability depend dramatically on where you are. At the classroom level, teachers trust what they do and would like to trust parents but aren't exactly sure. Parents may want to trust teachers, if their children's experiences have generally been decent, or may be entirely untrusting if not. Principals generally trust their own judgment and would like to trust teachers but have a supervisory responsibility (and the level of supervision they exercise will depend rather dramatically on a variety of factors).
Once you get above the level of the school, each level tends to want to impose some accountability on the level below it. For NCLB purposes, the key issue is the state/feds split: in a number of states, officials in the state capitol don't trust local districts and feel that it is their responsibility to regulate the districts, while a number of federal officials are skeptical that states will do the right thing unless there is a federal level of accountability.
NCLB forced states to define a variety of measures and set targets for those measures. At the local level, the state plan is often viewed as onerous, unreasonable, and inflexible. But the state plans are inherently compromises, and so various parties in Washington have looked at the state plans with skepticism.
For example, let's take a look at graduation, which states often defined to mean one minus the proportion of high school students identified as dropouts. That too-easily-falsifiable "dropout rate" is very low in many places, for reasons largely unrelated to the actual proportion of teenagers who graduate from high school, and the official graduation rate if defined as the complement will be wildly inflated.
To local residents and some educators, it looks like the state is hiding a sizable dropout rate, which many view as a consequence of out-of-control accountability systems. That's the type of local or educator-centered view many of you have described.
But you also need to look at it from a federal perspective, from those who see state plans and state commitments with enormous skepticism. To them, what would be the logical conclusion drawn about such graduation rates?
Linda McNeil et al.'s recent article on high-stakes accountability in Texas and Charles Barone's entry today, The Games States Play: Graduation Rates, are Exhibits A and B the next time I have this discussion.
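The complement definition in the classroom script above is easy to put into numbers. What follows is a rough sketch with invented figures (the enrollment, the number of students who actually left, and the number coded as dropouts are all hypothetical), meant only to show that the "official" rate moves with the coding of dropouts rather than with diplomas earned.

```python
def official_graduation_rate(enrolled: int, identified_dropouts: int) -> float:
    """Graduation 'rate' defined as one minus the identified-dropout proportion."""
    return 1 - identified_dropouts / enrolled


# Hypothetical high school: 2,000 students enrolled, 400 of whom left without
# a diploma, but only 60 of whom were ever coded as dropouts.
enrolled = 2000
left_without_diploma = 400
coded_as_dropouts = 60

as_reported = official_graduation_rate(enrolled, coded_as_dropouts)
if_fully_coded = official_graduation_rate(enrolled, left_without_diploma)
```

With these numbers the reported figure is 97 percent, against 80 percent if every leaver had been coded as a dropout, and neither number counts a single diploma. A skeptical federal reader would notice exactly that gap.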
Wrong incentive structure for community colleges/technical training
George R. Boggs and Marlene B. Seltzer describe Washington State's incentive structure designed to encourage community colleges to push completion:
Washington's community and technical colleges will receive extra money for students who earn their first 15 and first 30 college credits, earn their first 5 credits of college-level math, pass a pre-college writing or math course, make significant gains in certain basic skills tests, earn a degree or complete a certificate. Colleges also will be rewarded for students who earn a GED through their programs.
On the one hand, focusing on proximate measures on the way to degrees makes enormous sense, at least if we trust Cliff Adelman's work. On the other hand, I worry that such an incentives structure will affect standards in institutions with weak faculty governance and protection of academic freedom: "We need these students to pass these credits, or we lose money."
Better incentive structure: if public funding plus current tuition is sufficient for an institution's operating expenses (a rather big if, as I'm aware in Florida), keep the hands off the potentially perverse incentives inside the curriculum and give students an incentive to do well by keeping tuition stable for students as long as they make steady progress towards degrees. In other words, tuition stability (or a cap on rising tuition) is guaranteed if students are doing well.
The institutional incentives then can be geared towards summary graduation measures, to some extent. Florida's universities are having their first bite of outcome incentives this year, but the budget cut is swamping the effects of it. (Here's the motivational undermining: You don't starve people and then tell them they can earn a little bit of pin money if they work harder. At this point, at least for the universities, it's a matter of looking to the future and probably a system negotiation about formulae.)
There's a lot more to be said about higher-ed accountability, including Gerald Graff's commentary on assessment and Erin O'Connor's response, but I have to chair a proposal defense in 10 minutes...
Update (2/27): Kevin Carey responds:
I'd like to propose that people be more judicious and precise in their use of the term "perverse incentives" by not applying it to any incentive that could theoretically cause someone to act in bad faith.
I'm not going to split hairs by pointing out the adverb potentially up in the original entry (okay, originally potential and then changed to potentially); if I understand it correctly, Carey's argument is that we should not say something is a perverse incentive unless we can really point to the evidence of strong corrupting influences. In this case, my argument is about the pressures on instructors, not students (something different from what Carey inferred). Are colleges susceptible to such corruption when institutional stakes are tied to individual course grades? The scandals each year tied to athletics (e.g., FSU and tutors who helped athletes cheat) tell me the answer is yes.
Teacher performance-pay distributions in Tampa
Yesterday and today, the St Petersburg Times has been covering the distribution of performance pay among different schools in Hillsborough County (one of the few in Florida where the union and school board agreed to the state's merit-pay provisions). See the main story from yesterday and also a tale of two teachers, a basic Q&A sidebar, and then play around with school-level statistics.
What the Times has documented is that teachers were more likely to receive the bonuses in schools where students are more likely to be from well-off families. The district says they'll tinker with the formula for next year. While I love David Tyack and Larry Cuban's book with tinkering in the title, I'm skeptical that tinkering will work in this case.
February 14, 2008
Helen Ladd's common-sense approach
I'm biased because I've made the same recommendations: In a late January Ed Week commentary I should have pointed to earlier, the Duke University professor says we should be Rethinking the Way We Hold Schools Accountable.
February 12, 2008
On excuses for unintended consequences
Oh, my: I head out of town for a week, and when I get back there's a trail of tears blogs on curriculum narrowing:
- Charles Barone, January 17
- Robert Pondiscio, January 18
- Eduwonkette, January 18
- Eduwonk, January 17-18
- Eduwonkette, February 4
- Eduwonk, February 6
- Ken DeRosa, February 6
- Robert Pondiscio, February 7
- Joanne Jacobs, February 7
- Eduwonkette, February 8
- Eduwonk, February 8
- Charles Barone, February 12
While there is some question about the extent of curriculum narrowing that followed NCLB (see: no causal language there), the basic argument in these entries is over whether NCLB creates incentives to narrow the curriculum and the extent to which the variation in curriculum narrowing shows that schools don't have to narrow the curriculum to do well on tests.
(...except for Eduwonk's red herring about low bars, which essentially is that because states can set relatively low thresholds for proficiency, that eliminates the incentive to narrow curriculum, stuff test-prep into the kids up the wazoo, etc. No economist or behaviorist would accept an argument of "hey, the marginal change required is low, so that doesn't create an incentive for changed behavior." Either would reply that's a question that should be left to evidence, not speculation. I'm not an economist or a behaviorist, but I don't buy the hand-waving about low bars, either. And, as 'kette points out, isn't NCLB supposed to change behavior? You can't simultaneously say NCLB is changing some behavior you like without acknowledging that it has the potential to provoke behavior we don't like.)
If we agree that thousands of schools are making poor decisions in response to the pressure of test-based accountability, then the operative question is, How do we help schools and educators make better decisions? Charles Barone and others suggest we hold up exemplars and say, "Follow them." That's the effective-schools-literature strategy, and we've paddled that boat since the late 1970s without getting where we want, so we know at least that it's not enough. Robert Pondiscio and other core-knowledge or other-curriculum standards folks would say, "Build the curriculum, and they will follow." That's a step towards regulating input more than outcomes, which I suspect will not be politically viable, but I may be wrong. George Miller, Ted Kennedy, and others propose to increase the number of measures used, with legislative language that assumes that AYP can be finely tuned. I don't buy that argument: test-based accountability is a cudgel, not a scalpel. My instinct is to say, Watch the decision-making, but that's because I distrust black-box handwaving, and I know it's hard to operationalize a procedural standard within a test-prep culture.
The meta-political question is deeper and one that I think most people understand in spots if not generally: you either own reform or you lose the reformer label. If you do not acknowledge problems through implementation and own them, you give up a huge chunk of credibility. Whether I agree with them on an issue or not, I give credit to Ed Trust for occasionally identifying problems with implementation and deciding to own the issue (e.g., growth models). They haven't done that with 100%-proficiency goals or test-prep (yet), but it's a healthy dynamic where they have done it. You could say the same with Fordham and curriculum-narrowing (or Diane Ravitch with the same issue plus test-prep). Or Miller and Kennedy and 100% proficiency (though their concrete ideas on those points are Rube-Goldbergesque).
I haven't seen that nearly as much with Barone, Eduwonk, or some others, and the failure to own problems with NCLB ignores the fundamental fact of post-NCLB politics: Parents of public-school children are far more skeptical of test-based accountability than they were 5 years ago. Own the problems or lose control.
February 11, 2008
Probably not what Tallahassee or Beltway policy wonks intended
So some Florida teachers were fired because they were abusing students, letting a classroom get out of hand, not being prepared ... but the state has forced the reinstatement of the teachers because the districts did not rely on test scores to make the personnel decisions.
Can someone explain to me how this makes sense?
February 7, 2008
One more follow-up on Kennedy/Miller endorsement and NCLB politics
Just one more datum on speculation about what the Kennedy and Miller endorsements of Obama mean for NCLB (little, I've said before). Let's suppose for a moment that all this is true, and that the stars are lining up behind Obama from the Democratic Forces for NCLB. If you believe that and the bundling hypothesis about donations to campaigns, and if you know where Bill Gates stands, where do you think the majority of donations from Microsoft employees would be going?
Wrong: Clinton.
February 3, 2008
Matt Miller's fallacy
I must have had a busy month to wait several weeks before correcting the record on Matt Miller's Atlantic article, First, Kill All the School Boards. The real problem, he says, is all of those selfish, parochial school board members and the unions who manipulate them. He paints a romantic picture of Horace Mann, repeats both the truthful and the hoary cliches of the past quarter-century of school reform, and calls for nationalizing education.
To put it briefly, Miller falls into the standard "let's fix the governance structure" fallacy of a certain chunk of education reform wannabes. I just don't buy it. If school-board parochialism were the main problem, then we'd find Hawai'i's schools outdoing the rest of the country because of its unitary system. Or we'd find Southern states outdoing the north because many of them have mostly county systems, in contrast to Northern and Western states with tiny, fragmentary districts. Or New York City's system would be perfect today because of the elimination of the elected school boards through mayoral control. I'm sure that there are governance changes that would matter, but this one? It's bold, provocative, simple, and not very helpful.
Miller refers to a comparative study of education policymaking by economist Ludger Woessmann, and I need to track that down, but I suspect it will support Miller's argument less than he thinks, at least from other writings of Woessmann that I've come across. We'll see. In the meantime, here's a bit of cold water on the everyone-has-national-standards argument, taken from Accountability Frankenstein:
[N]ot all industrialized countries have a national curriculum framework: Spain and Hungary have a common core, but regions have the authority to adjust the core curriculum or add to it. Italy's and Argentina's curriculum planning has become less centralized in the past decade. Australia, Canada, Germany, and Switzerland have federal systems, like that in the U.S., where there is no central curriculum authority (Chisolm, 2005; Gvirtz & Beech, 2004; Jansen, 1999; O'Donnell, 2001). Even among countries with a centralized curriculum, the focus varies widely (Holmes & McLean, 1992). The United States is not out of step with the world, because there is no international consensus on the appropriate control of curriculum and expectations (or standards), let alone the content.
February 2, 2008
Bill Clinton's Ego, redux
I think Leo Casey is wrong about the politics of Bill Clinton's slamming Ted Kennedy. Since I agree with Leo on a large swath of education policy, including the effects of NCLB, I should explain a bit. For the most part, Hillary Clinton and Barack Obama share significant rhetoric on education and quite a bit of fuzziness on the details. They've both said NCLB has serious flaws, but it hasn't been a focus of their campaigns. That's not much of a surprise, because, despite the efforts of Ed in '08, education is not a huge issue in the campaign. (Bill Gates, get in line behind the folks who want a presidential debate around science.)
Over the past few weeks, both George Miller and Ted Kennedy have endorsed Obama. Has Obama said he agrees with Miller and Kennedy about NCLB? No, not to my knowledge. Maybe he did a backroom deal with both of them about reauthorization, but I've already explained why I think that's not the likely reason for both endorsements.
After being chastised for going after Obama directly and crudely in South Carolina, Bill Clinton did his best to undermine the endorsement of a liberal icon, by linking Kennedy to Bush:
No Child Left Behind was supported by George Bush and Senator Ted Kennedy and everybody in between.
Let me make this clear: I don't think Bill Clinton gives a hoot about NCLB right now, but if he can use it to smear Kennedy and undermine that endorsement, he will. To that end, I think Charles Barone's line-by-line response is tangential. The only phrase that Bill Clinton wanted to get out was "George Bush and Senator Ted Kennedy." Yeah, he can spin a policy tale out of that, but that's not the point.
I know that Hillary Clinton freely acknowledges that she cannot carry a tune in a bucket, but in this case, it's Bill Clinton who's tone-deaf.
February 1, 2008
At least Timothy Leary chose to drop out...
I think I understand Leary's choices, or at least the temptation: It's the end of two very tiring days, when I had a chance to talk for a few hours with one of the folks who tore down Florida's old Pork Chop Gang. Short story: an undergraduate I've been mentoring for a few semesters had an internship with the law firm of this Florida political hero, and after e-mailing back and forth, he needed some questions answered about the background of his senior thesis. So he proposed a joint meeting, first scheduled at the law firm and then moved to my office. I was expecting it to go about 90 minutes. It lasted 150 minutes instead. So we got off on various tangents, since he had the personal experience and I had the history, but the student said it was worth it. I had several meetings today (some planned, some impromptu, some deferred). Lots of things delayed, which is my life these days.
But even if deferred for a few days, the new English-language article of EPAA is out: Avoidable Losses: High-Stakes Accountability and the Dropout Crisis. Its authors combined interview work with following students in Texas as they were left behind in 9th grade and then dropped out. This is very difficult work to do, and the findings are provocative. Two stand out for me: that principals know that they are choosing between education and satisfying the test-score gods, and they reluctantly choose to satisfy the gods; and that to students, there is no distinction between accountability and all the practices that alienate many of them from high school. To the students in this Texas school district in the late 1990s and early 2000s, there is a single massive bureaucracy that held them back, denied them opportunities in part to game the system, and never told them that their education was being sacrificed in the name of pressure whose putative goal was to ensure that they were not denied educational opportunities.
Whether you agree with the article's authors or not, I suspect it will be discussed vigorously, which is all to the good. A few years after Jennifer Booher-Jennings' article on triage in Texas, one of the models for NCLB continues to be a focus of criticism and debate.
(No, I've never taken illegal drugs, nor have I ever been tempted to, in reality. But I live on antihistamines when I have a cold...)
Evaluating college teaching
Since my energy is now sapped, I'll address Eduwonkette's four questions from yesterday:
1) How should learning be evaluated in college?
There are two separate questions (what did individual students learn? and what did groups of students learn?), though I think Eduwonkette is asking more about personnel evaluation. The first two can be evaluated using similar questions and data (including student work!), as long as you acknowledge that classroom dynamics can change things quite a bit. Usually, the first question is tied to students' individual grades, and the second is water-cooler (or coffee-urn) talk among colleagues: how was your class in HVN 101 this semester: better than HLL 666 last semester? Faculty rarely get to ask the second question in more systematic ways.
2) Are course evaluations a fair and comprehensive measure of college teaching?
Eduwonkette is either asking a trick question or conflating the end-of-course surveys that students take with either course evaluation or personnel evaluation. Students are evaluating their own experiences throughout a term, so the survey is more a chance for them to express the conclusions they have already reached, in some fashion, at least if the survey items are at least tangentially related to their concerns. Evaluating a course should involve student feedback but also something about what students learned, not just what they felt or expressed. And evaluating faculty as employees involves additional layers involving their contributions to a course, other information and context often unknown to students, let alone research or service assignments.
3) What should universities do with student course evaluations?
See above on my desire to ban evaluation as the term used for student surveys. But to answer the substantive question: they should be written with input from faculty, include an item on how much effort the student expended on the course (for a few reasons), be available to students (except for graduate students, who are students as well as employees and thus should have some privacy protections), and be part of program and personnel evaluations.
4) What are the potential risks/benefits to students and profs of making them public?
When I was a student, I found the comments far more telling than the numbers. But I suspect that this doesn't have to be theoretical or based on anecdote: there have to be institutions where the survey responses are public, and where one could study the consequences. See above on the graduate-student privacy concerns I have.
January 31, 2008
Higher education and the wrong battle
At Education Sector, Kevin Carey (a 4 out of 5 in my book) has an institutionalist lens that is sometimes incisive (4.5 out of 5), sometimes frustrating (2 of 5), and occasionally both. Such as his complaint yesterday about the "Higher Ed Lobby" (my quotation marks, which are probably 1 out of 5 on style). Here's the gist in his complaint about accreditation agency politics:
But accreditation does a terrible job of creating or providing any kind of public, comparable information about institution-level academic quality.
I'd rate that comment as a 3 out of 5, and the post in general a 2.5 (in comparison with Eduwonkette, whose posts are averaging about 4.87 in the last few months). There are multiple arguments layered into that one statement, but let me focus on two:
- Lax accreditation has played a significant role in letting the quality of (undergraduate) instruction be lower than it could be.
- What we need to improve undergraduate instruction is predigested comparisons of quality between institutions.
Thus, yesterday's statement of principles by the Association of American Colleges and Universities and the Council for Higher Education Accreditation is unlikely to satisfy Carey's concerns because it resists the notion that creating quantitative comparisons of student outcomes is a necessary part of the accreditation process. Delving into the broader issue at length requires more energy and time than I have this morning, but I'll put out a few counterclaims:
- As long as millions of parents and students perceive that they are buying a degree from a college, there will be an inevitable tension between credentialism and the "use value" of a college education. In this environment, accreditation has to answer the face-value "does this college provide an opportunity to learn, and is the degree legitimate?" question.
- The most savvy students and parents want more than U.S. News rankings, but they're not going to give a hoot about what irks Carey and me about the rankings. Instead, savvy students and parents want to know what happens in the classroom, the lab, the studio, and the field. A case in point: last year, one teen acquaintance of mine was looking for colleges with performing arts programs. In the end, she was accepted to two schools with outstanding reputations, one with local connections that are unbeatable in this subfield, and the other that's in another region, perfectly reputable, but without those networking opportunities. She had the opportunity for one last visit to each place, and what made the difference was watching students rehearse and perform. There was no faux objectivity. My young friend watched students work and decided that the less-networked place had the better education because there was a pop to the work in one place that just didn't exist in the other.
My friend and her parents (whom I've known for years) cared about comparisons, but not predigested ones. They made their own ranking. Kevin Carey, Charles Miller, and others may want to see predigested measures, but they'll be swimming upstream against credentialism, against the needs of students and families who really do want information about educational quality, and against the professional judgment of faculty. Framing the issue as one of the White Hats against the Higher Ed Lobby does everyone a disservice.
One more thing: Last week I tried an experiment and allowed readers to rate my posts on a 1-5 scale. I tried priming the pump by rating a few of them (no, not all 5's), but no one else participated, and I pulled that option. I guess maybe some people are interested in ratings, but not my blog's readers.
January 30, 2008
Chemistry or test-prep?
In Palm Beach County, high schools are ditching real science for FCAT prep. And I thought the election results were the most depressing news of the morning!
January 29, 2008
Alfie Kohn and Diane Ravitch agree!
This week, the zeitgeist in education news is paying students for test scores, as in the Baltimore Sun article yesterday or the USA Today story, but so-called incentive programs have been in the news before and criticized before: see criticisms of Pizza Hut's "Book It" program or Barry Schwartz's column last July, which scored New York City's initiative to pay students for test scores. While such programs sound good in theory (reward kids for doing well!), they rub a number of people the wrong way, including Elena Silva of Education Sector, Diane Ravitch, Eduwonkette, and even conservative Liam Julian, who criticized such programs last year (though I'm linking to my blog entry because the original column has suffered linkrot). And virtually the whole education world knows about Alfie Kohn's opposition to tangible incentives. So what could possibly bring folks of very different stripes together? After all, as Robert Pondiscio points out, isn't giving one incentive the same as giving any incentive, and isn't all we're doing haggling over the price?
First, a bit of disillusionment: while Kohn and Ravitch both talk about intrinsic rewards, I suspect only one of them will agree with the second half of the reasoning below.
There are two problems with paying students cash for achievement. One is that these programs are not finely calibrated. Whether they reward status achievement (straight As or a certain score on standardized tests) or some sort of growth/effort, there are going to be some rewarded students who did not work hard for the reward and other unrewarded students who probably deserve it. Two consequences flow from that fact. First, students will perceive it as unfair, once the money is doled out. Well, maybe we should be teaching teenagers that "merit pay" isn't always distributed on an equitable basis (see Robert Dreeben's work), but I suspect a program that doesn't pass the adolescent sniff test for fairness will alienate rather than motivate students, with the consequences magnified because of the money stakes. In addition to the fairness issue, there is the research question of whether rewarding students' focused effort and improvement is better or worse than rewarding status. Most program administrators probably make decisions based on seat-of-the-pants judgments rather than the research.
There is a second problem with paying students cash for achievement, and that is the question of the reward itself: will it promote continued effort, or will it be tangential to effort? A case in point from my own experience as a parent, and that of many other parents: you go to the library with your elementary-school child and borrow some books that the child chooses. You all return home. The child reads the book. What is the reward for the child's reading the book? My wife and I didn't think about it at the time in this way, but what our children chose was to return to the library to get more books. The reward was another library trip, which promoted reading. Many math teachers put bonus questions on tests to keep some students occupied when they finish the main questions earlier than others. But the bonus questions also reward completing the test by giving the students more opportunity to challenge themselves. Students of moderate means who work their tails off in high school should be rewarded by an opportunity to attend college at reduced cost (a scholarship), which promotes learning. And so forth.
From this, I'd argue that the more fundamental problem with rewarding achievement with cash is that such rewards do not promote additional learning. While Roland Fryer (the designer of NYC's incentives program) is obviously a very smart new scholar, he is thinking of the rewards from a fairly narrow perspective, assuming that all incentives are fungible and ignoring the post-award uses of rewards. We know that Pizza Hut is engaged in marketing rather than a promotion of reading because it rewards kids with pizza instead of with books. And we'll see appropriate incentives when their use is intimately tied to additional effort.
January 28, 2008
Party trumps policy
Last night, Leo Casey hypothesized on Edwize that Kennedy's endorsement of Obama was related to NCLB. Like Scott Elliott (a reporter with the Dayton Daily News), I'm skeptical. While George Miller and Ted Kennedy have both endorsed Obama and are major figures in NCLB politics, they are also stalwarts in the Democratic caucuses on each side of Capitol Hill, and a significant obligation of such folks is to defend the Congressional majority. The defense of that majority will depend on how well Democratic candidates perform in historically Republican states. As Matthew Yglesias has pointed out, within the Democratic party, Obama is convincing officeholders in Republican-dominated states that he can not only win the White House but help Democratic candidates for lower offices.
That potential contrasts with one of the signal legacies of the (Bill) Clinton administration, a cannibalization of the party by the top of the ticket. While Bill Clinton's fortunes thrived, the Democratic party's did not. I don't think Hillary Clinton is nearly as egotistical as her husband, but downticket potential is probably more important to endorsements than the few inches that separate Clinton and Obama on No Child Left Behind.
January 23, 2008
Value-added, with botulism
Before Kevin Carey proclaims that value-added [method] comes of age, he might want to read the real true facts behind the New York City teacher value-added project, wherein we learn that the city's great statistical experts thought three children were enough of a sample on which to base a teacher evaluation, or maybe the ethical problems with the NYC project, or maybe even my comments on value-added or growth measures in Accountability Frankenstein.
No matter what else you can say about growth measures, NYC's project is about the worst example I can imagine to use if one wanted to push the approach.
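To put the three-children problem in rough numbers, here is a minimal sketch of how uncertainty in an average shrinks with sample size. It assumes a teacher's rating is a simple average of student score gains and that those gains have a standard deviation of 10 points; both are illustrative assumptions, not the actual NYC model or its data.

```python
import math

def standard_error(sd, n):
    """Standard error of a simple mean of n independent observations."""
    return sd / math.sqrt(n)

# Assumed spread of individual student gains (hypothetical, in test-score points).
ASSUMED_SD = 10.0

for n in (3, 25, 100):
    se = standard_error(ASSUMED_SD, n)
    # 1.96 standard errors is the usual approximate 95% margin for a mean.
    print(f"n={n:>3}: standard error ~ {se:.1f}, 95% margin ~ +/-{1.96 * se:.1f}")
```

Under these assumptions, an average based on three students carries a margin several times wider than one based on a full class, which is the core of the complaint about tiny samples.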
Update I: Carey responds in his post:
It might [have methodological problems, in NYC], I don't know, I guess we'll find out. But, per above, methodological issues can be worked out, and anyone who thinks the hysterical reaction to the value-added initiative stems from a deep and abiding concern for statistical integrity is willfully not paying attention.
The claim that "methodological issues can be worked out" is evidence that Carey hasn't read the writings of professional researchers who point out that growth models are no holy grail. I am one of those who have written about the difficulties inherent in growth models, but certainly not the only one.
And my response isn't hysterical; I'm simply disgusted with the latest shenanigans from Tweed. The title comes from a wordplay (when food "comes of age," you don't really want it).
Update II: Best comment in response to Eduwonkette: skoolboy, who writes, "I'd characterize the New York City Department of Education as loving data but hating research."
January 20, 2008
Where does effective reform come from?
Thursday, Andy Rotherham challenged historians of education:
[H]ere's a question for the historians that might help explain why education does careen from one thing to the next. What are the most compelling examples of where the education system has reformed itself in ways that have demonstrably benefited students? Haven't most of the reforms, for good and ill, come from influences on the outside, whether higher ed leaders, business, etc...?
I'm not sure Rotherham was responding to Diane Ravitch's plaintive query fairly (I read Ravitch's argument to be that the content of Michael Bloomberg's and Joel Klein's reform ideas is nonsense), but let me answer the question as best I can. As David Tyack and Larry Cuban point out in Tinkering toward Utopia (1995), we sometimes confuse noise for reform. Well, that's not quite their point: they argue in an early chapter that you have to distinguish between cycles of reform rhetoric and institutional trends. We can't look just at the visible reforms, the ones that have someone shouting from the rooftops about them. In other words, the only reforms that might pop up on Rotherham's radar screen would come either from outside reformers or from the louder inside advocates.
But "the most compelling examples of where the education system has reformed itself" might lie precisely in institutional trends that are tough to identify as coming from a specific set of pressures. I would argue that on the whole, elementary schools treat children much better than they did a century ago: only rare beatings (which provoke outright shaming if they become public), much less physical punishment, and a much higher proportion of teachers who understand better ways of motivating kids. That doesn't mean that everyone is perfect, just much better on the whole than teachers from a few generations ago.
One could make a pretty good case that the consistent rise in NAEP math scores in many states is the result of changing practice. As I've argued before, the National Council of Teachers of Mathematics is not perfect, especially in how it communicates ideas, but my guess is that math instruction is slowly shifting, with more use of manipulatives and other varied repertoires in early grades and also in early childhood settings. Again, nothing is perfect, but as a child I never encountered the easy introduction to graphing that my own son had when he was in preschool in the 1990s. (It involved tasting fruits and vegetables, with children in the class putting up an icon of the food when they liked the taste. The result was a vertical bar chart of preferences by food.) I don't think that came from outside schools.
That doesn't exonerate school officials. I've criticized Tyack and Cuban's incrementalist framework, using desegregation as the obvious counter-example. But that history doesn't quite provide an argument in favor of mapping business rhetoric onto schools. Among other things, there's only one city I know where desegregation was supported by the business community: Charlotte. And where were today's advocates of high-stakes accountability in the 1980s and early 1990s, as Presidents Reagan and Bush were appointing federal judges who eventually undermined and reversed the pressures for desegregation? I think only Miller and Kennedy get credit there, and I can think of several who actively tried to undermine desegregation.
I'm not sure that Rotherham's question is even a relevant one: the fact that we can find a few examples of where outside pressure was absolutely appropriate doesn't mean that it's a panacea. Sometimes the "I'm an outsider" and "reform is inevitable" rhetoric trumps informed judgment. If "I'm a professional; trust me" is fallacious, so is "I'm a businessman; trust me."
January 17, 2008
Ranking creates perverse incentives; ranking of lunchtime and liberal-arts colleges, doubly so
Inside Higher Ed has a great article today, Potemkin Rankings, on how Washington and Jefferson College did everything you'd normally think is right to improve how they look to outsiders and still sank in the U.S. News & World Report rankings. The short story: W&J recruited like crazy to increase the applicant pool and managed to increase selectivity while starting to increase enrollment, hold down the full-price tuition, and still maintain a good faculty-student ratio. Because other liberal-arts colleges increased their endowments and tuition faster, W&J sank in the resources area and thus in the U.S. News ranking.
The problem here is not just with U.S. News. You can find that with almost any system that reduces a complex set of data to a simple ranking. Because the quality of any complex service is never going to be monotonic, there will be inconsistencies in any reductive ranking depending on the relative importance of different factors in the final (reduced) rating. This year, Education Week's Quality Counts report includes a "weight your own factor" feature, where you can re-rate an individual state based on your own idea of how important you find different elements in the Ed Week database. Well, not really: it looks like the mix within an individual subscale remains the same in the summary number, even if you can come up with different subscale scores. And there's no way to see how the rankings might change based on different weights. (I guess the Ed Week editors didn't really want people to look too closely at the rankings, or at how robust/fragile they might be.)
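The fragility of a weighted ranking is easy to see in a toy case. The sketch below uses three made-up colleges with invented subscale scores and two invented weighting schemes; none of the numbers come from U.S. News or Education Week, and the weighted average is only a stand-in for whatever formula a real ranking uses.

```python
# Hypothetical subscale scores (0-100) for three made-up colleges.
scores = {
    "College A": {"selectivity": 90, "resources": 60, "faculty_ratio": 80},
    "College B": {"selectivity": 70, "resources": 95, "faculty_ratio": 70},
    "College C": {"selectivity": 80, "resources": 75, "faculty_ratio": 85},
}

def rank(scores, weights):
    """Order colleges from best to worst by weighted average of subscales."""
    total = sum(weights.values())
    composite = {
        name: sum(weights[k] * v for k, v in subs.items()) / total
        for name, subs in scores.items()
    }
    return sorted(composite, key=composite.get, reverse=True)

# Two hypothetical views of what matters most.
resource_heavy = {"selectivity": 1, "resources": 3, "faculty_ratio": 1}
teaching_heavy = {"selectivity": 1, "resources": 1, "faculty_ratio": 3}

print(rank(scores, resource_heavy))
print(rank(scores, teaching_heavy))
```

With these invented numbers, College B leads when resources count most and falls to last when the faculty-student ratio counts most, even though nothing about the colleges themselves has changed. Rerunning a ranking under several plausible weightings is one simple way to see how robust it is.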
January 8, 2008
Sixth anniversary present for NCLB
So the Sixth Circuit Court of Appeals has revived the 2005 "unfunded mandate" NCLB lawsuit, and here is where things get interesting: the original complaint makes an argument about statutory limits to the power of the purse, tied specifically to NCLB language that lifted mandates that were not paid for. Given the language of the appeals decision, the case is going to be even more interesting on reargument, and with the current composition of the Supreme Court, I refuse to hazard any prediction about ultimate disposition.
But it won't get to the Supreme Court, because NCLB will be rewritten before it gets that far. Here are the real consequences of the lawsuit: If the plaintiffs win at the lower-court level or if the Sixth Circuit steps in for the plaintiffs in a substantive manner (as opposed to the procedural decision this week), that victory would shift the initiative in reauthorization. On the one hand, those critical of NCLB provisions will be able to be patient, in contrast to supporters of most of the current structure. On the other hand, without the pressure ratcheting up on schools, NCLB critics may not have quite as much organizing energy behind their battle, and that energy may shift to those who support most of the status quo.
January 7, 2008
Ted Kennedy and frames: 51 to go
Last Thursday, I recklessly created a set of predictions for major 2008 education stories and in the top item (on NCLB) wrote,
If I were a senior member of an education committee, I'd work throughout the year to establish some consensus that would hold at least reasonably well no matter what the results of the election.
Lo! and behold! Ted Kennedy has fulfilled my prediction in less than a week with today's Washington Post op-ed column. To be honest, that's only in the first week, but I suspect we'll see plenty of such efforts in the next 51.
December 21, 2007
Guesting on Edwize!
I've gone and committed guest blogging over at the UFT blog Edwize. The gist of the argument is that Joel Klein's pulling a Microsoft-like maneuver with accountability.
And he's the guy who prosecuted Microsoft for antitrust violations.
December 8, 2007
Waiting for the criticism of Winerip
Michael Winerip reports tomorrow on a new ETS report by Paul Barton and Richard Coley, The Family: America's Smallest School. Shades of Moynihan's response to Coleman, anyone? (And does anyone else know the reference for that?)
I expect the blogs next week will be full of criticisms, at least of Winerip's reporting if not the report. It'll be interesting to see if there's some substantive discussion along with the criticism.
Update: Charles Barone was first off the blocks on this. I wish he weren't so consistently sarcastic; it distracts from the analytical points he's making about Winerip and ETS, and those points are important, if not as much of a trump as he implies.
December 7, 2007
Whose values would be valued in a neoliberal education world: Michelle Rhee's or Marc Dean Millot's?
Marc Dean Millot explains why he's a critic of DC Chancellor Michelle Rhee (hat tip), and here's the key paragraph:
What I see in Chancellor Rhee's approach, abetted, permitted or endorsed by Mayor Fenty, is 1) insensitivity and arrogance towards others, combined with 2) a reliance on fear to control staff, and 3) a considerable willingness not to apply analogous performance criteria and public criticism to themselves. Managers cannot be harder and harsher with others than they are on themselves and expect support from their staff, respect from their board, or trust from the public. And managers without all three cannot succeed in a turn-around.
There are three points here. One is the immediate and obvious one: Humiliation and denigration are not great motivators, nor is "making an example of" a significant proportion of the people you work with. I don't know Rhee, but this is not the first time I've seen reports of her approach to people being problematic. And Millot is right on the general principle.
The second point is that mayoral control of schools is no panacea and often a fig-leaf reform. As Monday's Washington Post story on the matter indicates, politics don't disappear with mayoral control. And that's why I was disappointed to see the brief mention of David Tyack's One Best System in Wong, Shen, Anagnostopoulos, and Rutledge's new book, The Education Mayor. Tyack showed how governance reformers in the early 20th century claimed to be "taking politics out of school" in changing ward-based urban school boards to nonpartisan boards often appointed by courts or mayors. Wong et al. seriously misread Tyack in claiming that the historical lesson is that we need to keep politics out of school. Tyack documented how the new boards may have been nonpartisan but were certainly political, elitist, highly connected, and contributor