March 8, 2010

Sour-grapes agreement

Michael Olneck and Peter Sacks turn petty in letters to the editor about Diane Ravitch that the New York Times printed today. Wow. I agree with Ravitch on a number of things and disagree with her on a number of things, some of which are in our area of expertise (history of education) and some of which fall outside it. But I'm not sure why Sacks in particular is turning on the venom spigot. Well, actually, I do have some hypotheses about the general hostility to her that I've occasionally seen (as opposed to disagreement): she caricatured the field of history of education in a sloppy late-70s publication sponsored by the National Academy of Education, and along with Patricia Graham she was a woman who got high-status national recognition in the 1970s for her work in education policy at the national level, which until then had been a male bastion. (Graham was director of NIE from 1977 to 1979.) The first is a seriously flawed work, but that's several decades in the past, and in any case, a particular work should stand or fall on its own merits. I've never seen the second item discussed or even acknowledged.

There's a related issue here, which is Ravitch's position outside traditional faculty. As far as I'm aware, she's never had a tenure-track or tenured faculty position, and she's one of the few historians who can say that they published their dissertation commercially before receiving the Ph.D. (The Great School Wars was published in 1974; Ravitch received her Ph.D. from Columbia in 1975.) For the most part, her books are far more widely read than the books of those of us who have full-time faculty positions, and I think she and Graham are the only historians of education to have held political appointments in the federal government. That's an interesting combination of insider and outsider positions.

When Meier and Ravitch started their joint blog/conversation three years ago, I briefly referred to this history in writing, "Regardless of various professional views of her scholarship, Ravitch is a recognized voice on education policy. There are plenty of people I correspond with who have fewer claims to expertise, so I can either have a snit-fit about that or deal, and at this point, having a snit-fit is darned close to sexism and uber-testosterone in education policy studies." I'm sorry Olneck and Sacks, and especially Sacks, have made a different choice.

For the record, Sacks is factually wrong when he states, "Dr. Ravitch fashioned herself into the Ayn Rand of educational policy and rose to fame as a result of a free-market ideology that came into fashion in George W. Bush's administration." Ravitch's appointment was during the first Bush administration, and whatever you might think of Ravitch's historical arguments in different books, she's a much better writer than Rand.

February 11, 2010

Additional thoughts on performance pay politics

An addendum to my entry earlier this morning: I think that there is a politically-robust rationale for performance-pay policies, but it's not at the level of incentives usually used as the justification. The more plausible rationale for performance-pay policies is at the level of public-sector accountability: most people with jobs do not expect identical salaries or salaries based on a formula, and small variations based on something other than seniority and educational credentials might boost the facial validity of public-sector HR practices.

Note that this is not an argument that business practices are always incentive-based (or should be: witness AIG as a disaster stemming from short-term incentives) or even widely varying. In some cases--large law firms, for example--entry-level professionals receive step pay increases in their first few years akin to teachers' step increases. But if I were to ask the head of the Florida Council of 100, Susan Story, whether she'd stop advocating performance pay even if the research consensus in a few years were solidly against its doing anything for student achievement, my guess is that she'd still push for some form of performance pay.

The discourse around this is somewhat similar to other comparisons people make between their lives and public policy: when policies look like you're pushing the cart and someone else paid by public funds isn't, you're less likely to maintain support for it. A friend of mine visited a newspaper columnist some years ago to complain about an article the columnist had written regarding AFDC (the federal welfare program before 1996). Don't you understand the factual errors with all of the myths about welfare? my friend asked. Sure, said the columnist, but you don't understand why public attitudes have changed: as the majority of mothers now have to find their own child-care arrangements while they're working, they're going to be far less sympathetic towards women who aren't willing to work or perceived as not willing to work.

I don't agree with the columnist's thumbnail history of public attitudes towards federal welfare policies or with the assumption that women on welfare have not historically wanted to work. But there is a significant grain of truth there: when the majority of mothers work while their children are young, and they have to find and pay for child care and wrestle with the stress involved in that, those mothers are not going to want to see that they're pushing the cart and others aren't. For similar reasons, those who oppose any performance pay have an uphill road telling people who work in environments with non-step-like pay arrangements that somehow public schools should be arranged differently.

Why the Teacher Incentive Fund and Race to the Top are long-term dead ends for merit-pay advocates

The apparent push in the proposed 2011 Obama budget for an enlarged Teacher Incentive Fund on the heels of Race to the Top makes me think that merit-pay/performance-pay advocates may be spreading their political capital very thin on teacher evaluation. Most advocates of paying teachers in part based on test scores are also advocates of using test scores in part to evaluate teachers more broadly, especially dividing probationary teachers from teachers with a right to due process before dismissal. And they're trying to do both. Smart or stupid? I think it's counterproductive for several reasons:

  1. The research on benefits of individual-teacher performance pay is limited. Very limited and quite mixed. Putting all your chips on a huge expansion of experimental performance-pay schemes? You may not get what you want, and public evaluations may doom the politics. (Think Reading First, though the allegations of corruption set the stage in that case for death-by-evaluation.)
  2. Grant programs end. If the expansion of performance-pay programs relies on temporary revenue, then the program may well die along with the extra revenue. Denver's teachers union and district worked together on a long-term political deal: performance pay that teachers helped develop tied to a long-term boost in revenue. That's not the structure of RTTT, TIF, or the Gates Foundation grants.
  3. Real-life performance-pay bonus budgets are stingy. The best example of that reality is here in Florida, where the state budget for the school-based rewards for test scores has been no greater than $100/student (for a school) since the late 1990s, and while my undergraduate students sometimes enter my classes thinking that a huge share of school budgets is based on test scores, in reality that's no more than about 1.5-2% of per-pupil expenditures in Florida (and that's for the schools that receive the money). When this money is distributed to staff (sometimes it is, sometimes it isn't), it's in the form of bonuses, not additions to base salaries. The fiscal and political reality is that the only way to permanently boost base salaries substantially based on test scores is to give the money to a tiny fraction of teachers, and that's a recipe for political disaster (and legal problems).
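
To make the arithmetic in the third point concrete, here is a rough sketch. The $100/student cap and the 1.5-2% share come from the list above; the school size, the staff count, and the $10,000 raise are hypothetical numbers I've picked purely for illustration, not figures from any real school.

```python
# Figures from the post: a reward capped at $100 per student, which is
# roughly 1.5-2% of per-pupil spending for schools that receive it.
BONUS_PER_STUDENT = 100


def implied_per_pupil_spending(bonus_per_student, share_pct):
    """Per-pupil spending implied if the bonus is share_pct percent of it."""
    return bonus_per_student * 100 / share_pct


low_estimate = implied_per_pupil_spending(BONUS_PER_STUDENT, 2.0)
high_estimate = implied_per_pupil_spending(BONUS_PER_STUDENT, 1.5)

# Hypothetical school (an assumption for illustration, not from the post).
STUDENTS = 600
TEACHERS = 40
MEANINGFUL_RAISE = 10_000  # hypothetical "substantial" boost per teacher

pool = STUDENTS * BONUS_PER_STUDENT          # total reward money for the school
even_share = pool / TEACHERS                 # if every teacher gets an equal cut
teachers_funded = pool // MEANINGFUL_RAISE   # how many substantial raises fit
fraction_funded = teachers_funded / TEACHERS

print(f"Implied per-pupil spending: ${low_estimate:,.0f} to ${high_estimate:,.0f}")
print(f"Reward pool: ${pool:,}; even split: ${even_share:,.0f} per teacher")
print(f"Raises of ${MEANINGFUL_RAISE:,} reach {teachers_funded} of {TEACHERS} "
      f"teachers ({fraction_funded:.0%})")
```

Under these made-up numbers, the same pool is either a modest bonus for everyone or a substantial raise for a small minority of the staff, which is the trade-off the third point describes.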

The last point is one I am surprised opponents of performance pay have not raised sufficiently, and here's how I thought someone would have put it by now: Okay, so you want to pay teachers well if their students learn a great deal? Wonderful. So if students perform at a very high level, you're willing to raise taxes to reward teachers for that accomplishment? Liberal advocates of performance pay would probably answer yes if. I don't think fiscal conservatives who are performance-pay advocates have thought through the dilemma on that point very clearly; either the answer is that you're willing to raise taxes or that you have low expectations for schoolchildren. 

Eventually, I suspect that advocates of performance pay will have to decide whether they want to put all of their political capital into pay schemes that are fragile or into hiring and retention issues. The proposed ballooning of TIF is a sign that no one in Washington is thinking about the political balance of these issues in the long run. 

Disclosure: I'm a member of a higher ed union that has long had a contract with merit pay and considerable differences in pay by rank and discipline. K-12 is a very different world in this regard. 

Note: I started this entry on Tuesday, and because I forgot to change the "publish date" (which Movable Type usually sets at the time you started an entry, not published it), it first appeared as if it were published Tuesday. My editing fault, not your faulty memory. Now, your forgetting to read all of my books and articles? That's a different story.

January 9, 2010

Spot temperature:Climate::Test score:____________

I fully expect that within a week (if not already) some climate-change skeptic will use the cold wave currently freezing much of the country as an argument that climate change really isn't happening. Every time there's a vicious cold snap in winter or a cooler-than-average summer, we get the argument. And some reporter and editor decide to devote part of the ever-shrinking news hole to bad coverage of the issue, while a relative handful of reporters use the question as an opportunity to educate readers about the difference between weather and climate.

Today, I'm sitting in central Florida with more layers on than I usually need in early January. It's colder weather than usual. But we're in a warming climate, because in the long run of decades (or centuries) the current cold wave is just noise, and the trend is towards a warmer atmosphere. "Just noise," you may be thinking through chattering teeth, "tell my heating bill that it's just noise." The current cold wave is nasty for individuals today (and a few days more), but it's temporary.

The variability of weather makes sense to most people because we have enough experience to distinguish between spot temperatures and broader patterns. We know that temperatures have daily and seasonal cycles. But the cyclical nature of weather does not give us enough background to grasp climate change. For that, you need data. A lot of data. A lot of data from a lot of places and times, of different sorts, with a number of experts sifting through it.

And even then you get climate-change conspiracy theorists, including someone who's evidently a hacker.

You can probably guess the logical analogue here: we do not have anywhere near the same density of data on student achievement that we have on climate, and yet we draw bold conclusions about the underlying achievement from a relative paucity of noisy data. As I wrote in August, we need to learn how to make decisions with noisy data. But in terms of broad trends in achievement, it is a bad habit of Americans to equate the latest test scores with long trends. 

And that doesn't even touch the question of whether test scores are like temperature readings. Ah, but they are, if you're talking about your and my outside thermometers: placed at different heights, in different conditions (sheltered, out in the open, shade v. sun), and of different ages (and thus with different consistency of readings across the years). I am sure that backyard thermometers in these varied conditions are highly correlated in the sense that when it's colder, they're all colder, and when it's warmer, they're all warmer, and so the correlations across time are likely to be very high. But I wouldn't use them in any scientific research.
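
For readers who like to see the weather/climate distinction in numbers, here is a minimal simulation sketch. Everything in it is an assumption for illustration (the size of the trend, the size of the noise, the 100-year span, the random seed); it is not real temperature or test-score data. It generates a slowly rising series with much larger year-to-year noise, counts how often one year's reading falls below the previous year's, and then estimates the long-run slope by ordinary least squares.

```python
import random


def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# All parameters below are hypothetical, chosen only to illustrate the point.
TREND_PER_YEAR = 0.02   # small, steady underlying increase
NOISE_SD = 1.0          # year-to-year noise, far larger than the yearly trend
rng = random.Random(0)  # fixed seed so the sketch is repeatable

years = list(range(100))
readings = [TREND_PER_YEAR * y + rng.gauss(0, NOISE_SD) for y in years]

# "Spot" comparisons: how often is this year's reading lower than last year's?
drops = sum(1 for prev, cur in zip(readings, readings[1:]) if cur < prev)

# The long-run view: the fitted slope across the whole series.
slope = least_squares_slope(years, readings)

print(f"Year-over-year drops: {drops} of {len(years) - 1}")
print(f"Estimated trend per year: {slope:.4f} (true value {TREND_PER_YEAR})")
```

The design choice is deliberate: with noise this large relative to the trend, any single year-to-year comparison is close to a coin flip, while the fitted slope over the whole series still points the right way. That is the sense in which the latest reading, or the latest test score, is a poor stand-in for the trend.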

Stay warm, and have whatever hot beverage you like!

December 13, 2009

Turnaround or abandonment in NYC?

The extent of school closings in New York City is becoming evident, and after JD2718's posts on the subject over the past half-week, UFT's Leo Casey provides an overview and alleges an ulterior motive (to create available space for other purposes, not to improve education).

I'm far from NYC and can't speak from close knowledge of the city schools, and I'm still grading student work so I have no time to read extensively. But this is an important story and rolling conflict, and there are a few predictions I'll hazard:

  • At least one conservative will commit rampant inconsistency by simultaneously (or nearly simultaneously) weeping over the demise of the DC voucher program and applauding Klein for his bold moves, repeating the double standard on the issue I have described before.
  • A small handful of schools may be preserved through fairly heroic efforts, but most of the closures will stand.
  • There will be no effective way to hold Tweed responsible for consistency and rationality in its school opening/closing decisions. 

In truth, many administrators engage in maneuvers that appear as arbitrary as Klein's closures do, but rarely is it on such a scale or so visible beyond the locality.

December 5, 2009

Are central Florida schools flouting Florida law limiting test-prep?

I have heard from teachers and students in three area districts (Hillsborough, Pinellas, and Hernando counties) that secondary teachers in some subjects are being ordered to spend the first 10 minutes of class suspending the curriculum and teaching material from another class. In the case of two counties (Pinellas and Hernando), I have heard stories that math teachers are being asked to teach 10 minutes of reading--not include word problems in math, which is certainly appropriate, but teach reading (a subject very few of them would have certification in). In one county (Hillsborough), I have heard a report from a student that a high-school anatomy teacher has been asked to spend 10 minutes reviewing other science subjects (and the emphasis appears to be in chemistry), probably to prepare students for the 11th grade FCAT science comprehension exam.

In 2008, the Florida legislature added a section to the existing law on assessment (F.S. 1008.22(4), if you're curious), specifying limits to what schools can do to prepare for tests, specifically:

STATEWIDE ASSESSMENT PREPARATION; PROHIBITED ACTIVITIES.--Beginning with the 2008-2009 school year, a district school board shall prohibit each public school from suspending a regular program of curricula for purposes of administering practice tests or engaging in other test-preparation activities for a statewide assessment.

There are a number of exceptions to this prohibition--school districts can distribute sample test books, teach test-taking skills in limited quantities, etc.--but the spirit is clear: schools are not supposed to be engaging in test-prep that is a substitute for instruction. And taking time away from math class to teach reading, or away from anatomy to teach chemistry, looks like it's clearly prohibited.

It's also counterproductive from an administrative standpoint: if you wanted to add reading instruction, why would you ask a math teacher to do it? I should be clear: these are unconfirmed reports rather than documented examples. But if these reports are true, this looks like an end run around ordinary curriculum policies, which require a certain amount of instruction in each class, in order to squeeze in more instruction or more test-prep for high-stakes subjects.

There is one additional legal problem with this practice: there are both state and federal policies about teacher qualifications. I bet it's illegal in a number of respects to assign math teachers to teach reading and then report that everyone instructing in a subject is properly certified.

I have contacted the three districts in question to ask where the policies required by the law are. If you are aware of any specific examples (and I would need the school, date, class, period, and witness for sufficient documentation), please contact me by e-mail (sherman dottish-thingie dorn at-symbol-stuff gmail.com).

November 12, 2009

Race to the Top: review, revise, redux

I am in California this weekend for the Social Science History Association annual meeting, where we get to talk about Maris Vinovskis's book on the last quarter century of school reform, and since one of my copanelists Saturday morning is Jennifer Jennings, I finally get to meet the sociologist-formerly-known-as-Eduwonkette in person, face to face. Because several family members live in Costa Mesa, I also get to enjoy Kean Coffee about 20 miles south of the conference hotel/cruise ship (when the heck did the SSHA officers decide to book the Queen Mary??!).

While the focus of the book panel will be ... well, Maris's book, I'm sure we'll be talking about Obama education policy at some point, including Race to the Top. I was rushing around last night not getting enough done, so I didn't have a chance to do more than casually skim the stuff that's now available on the revised final guidelines. A few initial thoughts:

  • Bottom line? No idea. I traveled west and had coffee (see above), so I don't have a bad case of jet lag, but I've been on planes for 7 hours today. 
  • I very much like the competitive priority on STEM fields. That uses a standard device for focusing grant-writers' minds in USDOE competitions (the bonus points for meeting a competitive priority). (Disclosure: it looks like my state's department of education is following the push a bunch of us have been making about using Race to the Top funds for end of course exams, especially in science.)
  • From the list of changes, it looks like a lot of political calculation went into deciding what had to change to keep stakeholders in the game and what had to stay the same to satisfy policy goals.
  • Duncan is not anal retentive enough to make the points add up to a "nice round number." I have a suspicion this is deliberate, and if so I think I know the reason why.
  • People who focus on the total potential range of points for each section are missing an important feature of point distributions in scoring systems: it's the actual range and not the potential range that matters on rankings. If the potential range is 58 points from top to bottom on one component but the scoring leaves a real-life range of 10 points, it doesn't matter that the total number of points is 58. It could have been anything from 10 to 58. So what matters is how the reviewing panel looks at everything.
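
The last bullet's point about potential versus actual range can be shown with a toy example. The 58-point component with a 10-point real-life spread comes from the bullet; the second component, the applicant names, and every individual score are hypothetical, invented only to show the mechanism. This is a sketch of the general idea, not of how Race to the Top reviewers actually score anything.

```python
# Potential (maximum) points per component. The 58 is from the post;
# the 20-point component is a hypothetical addition for contrast.
POTENTIAL = {"big_component": 58, "small_component": 20}

# Hypothetical applicants and scores. Reviewers bunch everyone within
# 10 points on the big component but use the full spread on the small one.
scores = {
    "State A": {"big_component": 50, "small_component": 2},
    "State B": {"big_component": 48, "small_component": 18},
    "State C": {"big_component": 55, "small_component": 5},
    "State D": {"big_component": 45, "small_component": 20},
}


def actual_range(component):
    """Spread between the best and worst awarded score on one component."""
    values = [s[component] for s in scores.values()]
    return max(values) - min(values)


def ranking_by(key_func):
    """Applicant names sorted from highest to lowest by key_func."""
    return sorted(scores, key=key_func, reverse=True)


ranges = {c: actual_range(c) for c in POTENTIAL}
overall = ranking_by(lambda name: sum(scores[name].values()))
by_big = ranking_by(lambda name: scores[name]["big_component"])
by_small = ranking_by(lambda name: scores[name]["small_component"])

print("Potential ranges:", POTENTIAL)
print("Actual ranges:   ", ranges)
print("Overall ranking: ", overall)
print("Ranking on big component only:  ", by_big)
print("Ranking on small component only:", by_small)
```

In this made-up case the component worth 58 points on paper separates applicants by only 10 points, so the overall ranking ends up tracking the nominally smaller component far more closely.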

If we have time, I'll try to persuade Jennings to put on her Eduwonkette cape and save the state where I grew up. But I think California's problems are beyond what even a brilliant sociologist can solve. At least I get to see family members, which is worth the jet lag I'll be fighting in the next week.

October 29, 2009

Channeling Jerry Bracey on "proficiency": it's political, not scientific

One of the late Jerry Bracey's hobbyhorses was the pretense that the NAEP achievement level labels were scientific, as he argued in 1999: "The standards have generally been the object of scorn and derision from the psychometric community." He was fond of quoting the 1999 report on NAEP proficiency levels, esp. from p. 162: "Standards-based reporting is intended to be useful in communicating student results, but the current process for setting NAEP achievement levels is fundamentally flawed." So when NCES issues a report comparing the implied theta-values of cut scores for proficiency on state assessments to the theta-values of cut scores for proficiency on NAEP, and both Ed Week and the Christian Science Monitor report on the paper with a straight face, we're obviously seeing one place where Bracey's voice is already missing.

I think Jerry perseverated on this issue, to the detriment of a sensible argument about political judgments. The larger, inescapable point is that cut scores are set arbitrarily, and there is no way to avoid that fact. Those who support setting achievement levels hope and pray that they're arbitrary in the sense of arbitration and careful judgment, not in the sense of being capricious. But they are arbitrary, and even more so the labels assigned to them. What we know is that someone who scores at a "proficient" level on NAEP is scoring higher than someone in the "basic" band. That's all we know from those labels: ordinality. Moses did not come down from Mount Sinai with NAEP scores carved in tablets.

So what do we do with the inherently political nature of those labels? As I have argued in Accountability Frankenstein, the caution with which we use the judgments on cut scores should depend on the stakes of their use. If they're used to target resources, that's one thing (resources are going to be targeted in some manner), but the more that someone's job depends on them, the more wary we should be of how we set thresholds. 

Today, however, NAEP labels and cut scores are serving a purely performative purpose: to stigmatize states for their political response to NCLB. I hereby propose that we have the following new labels for NAEP achievement levels:

[Image: NAEP-achievement-levels.jpg]

I think that's in the spirit of the day's report...

Correction: I assumed that NCES was using detailed data from the state assessments to estimate IRT parameters. Silly me. They were using distributional data for linkage. Oops on me for forgetting the methods from the last such report. I'll let the measurement folks argue about the methods used here.

October 14, 2009

The comparability fly in the Ouchi/principal-autonomy ointment

Yesterday from a "stakeholders" meeting (I think at the USDOE), Charlie Barone tweets,

Richard Laine of Wallace Foundation: forthcoming Rand study will show [principal] autonomy in hiring a key factor in student achievement.

I've been expecting something like this for a while, not because I'm connected to a RAND insider (I'm not) but because this is the obvious new form of decentralization that would marry the 1980s-90s site-based management fad with new managerial fads in education.

To some extent I am attracted to Bill Ouchi's argument about principal autonomy leading to lower total student load. Ouchi's claim about total student load is essentially one of Ted Sizer's central arguments from Horace's Compromise, that the number of students a teacher sees is a key factor in the ability to push student achievement. But... and here's a fairly important but... Ouchi's work is tantalizing rather than definitive (because it has not been replicated substantially in terms of total student load), and the temptation to manage large urban districts as "portfolios" with quasi-independent school-level management may push a single form of decentralization at the cost of comparability in expenses and access to great teachers.

What the heck do I mean by that? In a sentence, we may not want principals to have complete autonomy in a task where they have relatively weak skills: knowing which novice teachers are going to be great teachers.


Everyone and her or his grandmother is focusing on the problem of where senior teachers work. This is an intellectual sleight of hand if you simultaneously argue that teachers with seniority are taking advantage of contracts with seniority privileges on transfer to avoid schools that need them and also insinuate that experience means nothing. Let me get this straight: we need to prevent experienced teachers from exploiting labor-market choice to move to schools with more comfortable teaching situations because... they're not inherently any better than teachers with only a few years of experience? This is an inconsistency ripe for Jon Stewart-like treatment.

More important than the intellectual sleight of hand is the way that this argument ignores an opportunity for a simple but politically sensitive intervention that could simultaneously improve the lives of poor children and new teachers: create regional new-teacher clearinghouses and matching services. Here's the thought experiment: Far from decentralizing, I think it would be a healthy system for schools to require that new teachers go into a large regional market where vacancies for relatively new teachers (e.g., those with fewer than three years of experience) would be balanced with a matching process akin to matching of med-school graduates to residencies. This would require collective bargaining and regional agreements between districts (or changes to statute), but here's the idea:

Brand-new teacher's perspective: A new teacher registers with the regional teaching market clearinghouse, with all of the stuff you'd want applicants to provide. The clearinghouse is directly tied to vacancies in the region, and that would probably include multiple districts in most parts of the country. The clearinghouse matches teachers to jobs for the first year. The teachers and administrators are told, explicitly, "This is a one-year arrangement. In the second year, the teacher is headed to a new school, and the administrator provides an evaluation knowing that the teacher is not coming back to that school until at least two years down the road." And that's what happens. At the end of the first year, the clearinghouse matches jobs to teachers who want to continue teaching and whom the first-year administrators recommend continue. Same with the end of the second year. And the clearinghouse's job is to make sure that by the end of a new teacher's third year, that teacher has worked in multiple settings, with different characteristics of students (at least within the range of the region), in areas of the teacher's documented expertise (i.e., no out-of-field matches). 

At the end of Year 3? Open market in the spring, in most places, and administrators wanting to hire on the open market must hire teachers with at least three years' experience -- in other words, teachers for whom there is a record of evaluations from different administrators and for whom there is a record of performance for students in different settings (within the range of the region's student population). Schools are allowed to hire teachers who worked in their schools before... if the now-third-year teacher wants to work there again.
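
Since the clearinghouse is modeled loosely on the residency match, here is a minimal sketch of how one year's matching might work, using textbook teacher-proposing deferred acceptance (the Gale-Shapley procedure). The thought experiment above does not specify any algorithm, so everything here is an assumption for illustration: the function name `match_teachers`, the idea that teachers and schools both submit ranked preference lists, one vacancy per school, and the made-up names. The only rule taken from the description above is the rotation constraint, that a teacher is not matched to a school where he or she has already worked.

```python
from collections import deque


def match_teachers(teacher_prefs, school_prefs, past_schools=None):
    """Teacher-proposing deferred acceptance, one vacancy per school.

    teacher_prefs: teacher -> list of schools, most preferred first
    school_prefs:  school -> list of teachers, most preferred first
    past_schools:  teacher -> set of schools already worked in (excluded)
    Returns a dict mapping each matched teacher to a school.
    """
    past_schools = past_schools or {}
    # rank[s][t] is teacher t's position on school s's list (lower is better).
    rank = {s: {t: i for i, t in enumerate(prefs)}
            for s, prefs in school_prefs.items()}
    # Rotation rule: drop schools where the teacher has already worked.
    remaining = {
        t: deque(s for s in prefs if s not in past_schools.get(t, set()))
        for t, prefs in teacher_prefs.items()
    }
    held = {}                        # school -> teacher tentatively holding the job
    unmatched = deque(teacher_prefs)
    while unmatched:
        t = unmatched.popleft()
        if not remaining[t]:
            continue                 # no schools left to try; stays unmatched
        s = remaining[t].popleft()
        if t not in rank.get(s, {}):
            unmatched.append(t)      # school did not rank this teacher
            continue
        current = held.get(s)
        if current is None:
            held[s] = t              # vacancy was open: hold it tentatively
        elif rank[s][t] < rank[s][current]:
            held[s] = t              # school prefers the newcomer
            unmatched.append(current)
        else:
            unmatched.append(t)      # rejected: will try the next school
    return {t: s for s, t in held.items()}


# Hypothetical region with three new teachers and three vacancies.
teacher_prefs = {
    "Ana": ["North", "South", "East"],
    "Ben": ["North", "East", "South"],
    "Cy": ["South", "North", "East"],
}
school_prefs = {
    "North": ["Ben", "Ana", "Cy"],
    "South": ["Ana", "Cy", "Ben"],
    "East": ["Cy", "Ana", "Ben"],
}

year1 = match_teachers(teacher_prefs, school_prefs)
# Year 2: same preferences, but nobody may return to the Year 1 school.
year2 = match_teachers(teacher_prefs, school_prefs,
                       past_schools={t: {s} for t, s in year1.items()})
print("Year 1:", year1)
print("Year 2:", year2)
```

A real clearinghouse would need much more than this (multiple vacancies per school, certification fields so there are no out-of-field matches, and the first-year administrators' recommendations), but the sketch shows that the rotation rule is easy to express as a filter on each teacher's list before the match runs.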

Benefit to teachers: first-year teachers stuck with horrible administrators (or generally toxic environments) know that they'll be moving on if they survive. They'll get experience with multiple settings where they'll be able to demonstrate their chops. At the end of their third year, they'll have some variation in experience with administration to be able to judge people better when applying in an open-market situation. Disadvantage to teachers: if you happen to get lucky and get a great job in Year One, you have to move on.... and let another new teacher get the benefit of that experience.

Benefit to administrators: because new teachers are forced to move on after a year, honest evaluations are less likely to result in social backlashes. When you hire on the open market, you'll know you'll have evaluations and (where this is gathered) other performance data that come from school settings with a range of student populations. Disadvantage: you don't get to hire absolutely new teachers; you get whom you get, and if you are a great spotter of talent, or you think you're better than the average principal at spotting good talent, you'll be upset.

(Personally, I think I would prefer this as an administrator: if you've read Moneyball, you know the sabermetricians' rule of thumb: you can predict a baseball player's professional performance from college experience, but someone straight out of high school is just a raw bet without college experience. Why would you want the authority to make hires in a situation where you're almost guaranteed to be a worse judge of talent/skill than in any other personnel situation? Then again, I'm sure many principals think of themselves like the [very poorly-predicting] old scouts of baseball, making seat-of-the-pants judgments.)

Advantages for systems: See advantages for administrators above. In addition, you have lower risk with variation in administrators' skills in talent judgment, while principals would still have the autonomy to pick more experienced teachers, after they pick up enough of a record for administrators to see who has more talent. You could also get development of evaluation skills in a regional context without diseconomies of scale. If clearinghouses have to track teachers, they could also be tasked with additional evaluation responsibilities across a region. Advantage for relatively poor systems: you know that wealthier districts will not be able to be as much of a magnet for new teachers, because of regional rotation, and you could push administrators to do what is necessary to convince teachers that they want to return to your district after their initial three-year rotation is done. Disadvantages: there would need to be legal agreements to cover this, and there would be some logistical challenges in identifying vacancies (and making sure those vacancies are reported accurately and promptly) as well as the operation of a clearinghouse. School districts would have to delegate hiring authority for some of their jobs to a regional body, and if school systems really thought that they were hot stuff in terms of talent scouting, that might be hard to swallow. (See above and Moneyball on the egos of baseball scouts and possibly school administrators.) Disadvantage for wealthy districts: poof goes your advantage in recruiting brand-new and relatively-new teachers, because they'll spend some time in your districts but also some time in poorer districts.

Now, the payoff in terms of debates about comparability: a regional new-teacher clearinghouse/matching process would instantly equalize a significant part of the teaching staff across a region, because of rotation among jobs and districts. Yes, there would still be an advantage of wealthier districts in attracting teachers with three or more years of experience, but poorer districts would know that they at least have a shot of persuading new teachers that they can make a good career inside a district... if the relatively new teachers have an experience that is supportive. 

Remember that this is a thought experiment: I don't know of any places with regional new-teacher clearinghouses/matching services, and I dreamed it up out of whole cloth (plus some inspiration from what happens with med-school students). But I think it points out a structural problem with giving principals entire autonomy: with complete autonomy, there is no balancing out of regional needs. Equality of opportunity would depend entirely on the skills of individual principals, and while principals are extraordinarily important, that's putting a heck of a lot of eggs in a single basket. If you care about making sure that a broad range of students have access to great teachers, there are serious dangers in the Ouchi principal-autonomy approach.

October 8, 2009

First, find me a box of cereal that squirms and drips snot in winter

Congratulations to former Florida Governor Jeb Bush, who knows a critical rule of politics: declare victory whenever you can, no matter whether you were right. I am quite serious about his political acumen: his push of a system that assigned letter grades to schools was ingenious politics. And Bush deserves credit for supporting a research technical assistance center in Florida as well as funding for reading coaches. But Jeb Bush's comments to the Jeb Bush Celebration Conference this week had an interesting quip:

Frankly, if Walmart can track a box of cereal from the manufacturer to the check-out line, schools should be able to track the academic growth of a student from the time they step in the classroom until they graduate.

I am firmly in favor of using longitudinal data, but this comment is cheerleading and not serious discussion. There are significant challenges in the creation, maintenance, and use of longitudinal data systems, and Walmart-style tracking logistics don't touch the greater ones.

October 3, 2009

Child murder, Chicago style

Chicago teacher Deborah Lynch pointed out in a Sun-Times opinion piece yesterday that one of the Chicago schools' "turnaround targets" this fall has been Fenger High School, near the gang fight that led to Derrion Albert's death and the school where she implies many of the combatants attend. (Hat tip/alternative source.)

I am not saying that knowing the kids better could have averted the melee and tragic death of last week, obviously. But trouble had been brewing at the school even before last week. Staff reported a riot the previous week inside the building, involving teachers being hit, and that two different police stations had to be called in to quell the disturbance. Those are the times when the staff members draw on their relationships with kids to urge restraint, to urge calm and peace, to try to talk things out rather than fight things out. Those are the times when a seasoned staff can identify strategies and resources to address and prevent further problems.

Lynch's argument is interesting and plausible. I'd be cautious of taking it at face value, but don't toss it out the window. As far as I am aware, there is nothing either to contradict or to support the claim that the length of time a staff (as a whole) has spent in a school is predictive of the general school environment. I suspect it depends on the staff; experienced good teachers and staff are going to have the types of relationships with students that Lynch describes.

But there is another important limit to Lynch's argument, and I'm thinking about the debate that's usually focused on academics rather than violence: the relationship between schools and the rest of students' lives. I suspect that if George Schmidt is correct, that the police congregated around Fenger rather than following potential combatants, any immediate investigation needs to focus largely on the tactical decisions of the police. It's possible that no matter what happened in the school, the gang fight would have occurred unless police decisions had been different.

The murder of Derrion Albert is representative of one fact: in violent neighborhoods students are usually safer in school than out of school. A skilled set of professionals can make it so kids are safe in school, safe enough to focus on school. And it's much harder to bring peace to a violent neighborhood without involving schools. What happens inside the classroom can change the conversations that happen outside school boundaries, but there are no guarantees. What if Fenger had not been the target of a turnaround effort: would Albert still be alive? I don't know. 

Update (October 7): More on MSNBC, and more focused, on the rearrangement of enrollment patterns.

September 2, 2009

"Lake Wobegon" Klein

From pp. 68-69 of Accountability Frankenstein:

The complexity of an accountability system can also help muffle opposition to accountability if it gives a reasonable chance for students or schools to be successful in the system's labeling... the political potential to muffle opposition within a system may be more important than the technical qualities of a system, for schools typically trumpet any positive label on any website, pamphlet, or streetside marquis. All three of these states provide evidence of the capacity for complex systems to muffle dissent. In North Carolina, the majority of schools have received some recognition award in every single year of its accountability system's history. In Florida's system, 13% earned recognition in its first year, 1999, but that proportion rapidly grew, and a majority of schools received recognition awards in each of the years from 2003 to 2006. In California, 47% of California's schools earned statewide recognition in 2002, and two thirds of the schools in the Los Angeles Unified School District earned recognition.

I don't know why anyone would suspect that there is any political convenience involved in having the single letter grades assigned to a whole slew of NYC schools jump to A, but it's not isolated to New York. It's just that New York has overtaken Lake Wobegon as a symbol of overestimation of results. Then again, since Garrison Keillor spends several months a year in New York, maybe it's highly appropriate.

August 30, 2009

Race to the Top comment sausage

A friend of mine from Chicago introduced me to the term link sausage as a blog entry that is not much more than a set of links. Here are links to various comments on Race to the Top (a tiny slice of the well-over-a-thousand comments submitted):

As I expected, others have started to chime in on the NEA comments. The New York Times took the comments as a sign of obstinance. Former Park Ridge Education Association president Fred Klonsky wrote,

While it seems to me that it is late in coming, the letter from Brilliant is well deserved, and [Sherman] Dorn's comments notwithstanding, I think it reflects the views of the NEA membership. At least among those who have been following the debate.

I think that was my point: the comments reflected the views of a large slice of the NEA membership, but not in a productive fashion, and I fear that on balance it will harm the concrete interests of teachers (both in and out of the NEA) no matter how you want to define those interests. 

Note: As Klonsky points out in comments, he's not an ex-president (yet). The error is all mine in sloppy reading of his about page.

August 28, 2009

I'm commenting on Race to the Top, and I want a pony, too!

Impressions of a quick skim of 20 or so comments on the draft Race to the Top regs:

  • I couldn't find the national AFT comments anywhere.
  • Thus far, the two sets of technical comments by the Learning Disabilities Association of America and the group of academics with Kane, Staiger, and several others (uploaded by Thomas Kane), respectively, earn my "okay, you guys read the regulations and targeted your comments" award. Whether you agree with them or not, the comments were shrewd and focused. (I happen to like most of the comments, which are practical and sensible on the whole.)
  • The New Teacher Project signed onto the multi-organization letter that was essentially a vague "okay, we agree with this" note (with the advice for the USDOE to be selective in the first round), and then submitted comments that were, ahem, not nearly as far in the opposite direction as the NEA's but bewildering in their unbridled confidence in the suggestions made. TNTP staff, please read the comments written by Kane et al. You're smart, and they're smart, and they're much closer to the mark than you were this week. At least you don't come close to winning the second "I'm commenting on Race to the Top, and I want a pony, too!" award (first was to the NEA). 
  • I think that the California Teachers Association (the NEA affiliate in California) avoided the factual blunder in the NEA comments of asserting that Race to the Top is a mandate. Instead, they asked what states would have to give up in return for the money. In this case, they were deeply, deeply concerned with the threat to federalism embedded in asking that a state be able to link teacher and student records. That would be more plausible if TNTP's comments were enacted, but either the draft regs or Kane et al.'s suggestions are reasonable in an imperfect world.
  • One state department of education accidentally sent the USDOE its cover letter to a national organization telling the national organization it was sharing its reg comments, in the place where it was supposed to upload comments. No signs of actual comments on the regs (thus far today). Ouch! I suspect there are similar technical glitches in other places.

I didn't comment. This is the first week of classes, and I'm a firm believer in the biggest bang for my buck (or hour).

August 23, 2009

NEA's comments: righteousness over responsibility to members?

I'm an NEA member, through my membership in the United Faculty of Florida. I'm a skeptic and critic of high-stakes accountability. Wrote a book and a few articles on the topic. And I am astounded at the NEA's comments on the Race to the Top draft regulations. (Hat tip.)

It is one thing to submit a righteous objection to the entire program if you are an individual with no responsibilities but to your conscience and your personal judgment of posterity. It is an entirely different thing when you represent several million teachers and you submit a document that for all intents and purposes appears to have an internal audience inside the NEA. That's nice, in the worst sense of the word "nice," because NEA staff had a responsibility to protect and advance their members' interests, not indulge any of our fantasies. To put it bluntly, on what planet would this regulatory comment have any effect on the final regs?

Let me be clear on my perspective as an NEA member and as an observer of political processes: There are lots of reasonable individual passages within the document, but you don't submit a manifesto when you comment on regs as an organization. You don't submit a manifesto that covers up any potential for effectiveness with what amounts to political poison. And you don't submit a manifesto that undermines your credibility. 

Two examples will have to suffice, because there's only so much I can wince at publicly: "we cannot support yet another layer of federal mandates" (from p. 2), or with regard to the creation of statewide longitudinal data systems, opposition to "[i]gnoring states' rights to enact their own laws and constitutions" (p. 24). The problem with these claims (and attendant tone of outrage) is that Race to the Top is not a mandate. Love it or hate it, it's something states must apply for. 

There were certainly alternatives available to the NEA, including the following choices:

  • Realpolitik: nudge the regs a bit to help state and local affiliates.
  • Legal: set up a legal challenge after final publication.
  • Abstinence: if you need to make a statement of conscience, declare that "we have serious doubts that this program will substantially help schools and will not participate in the regulatory comment process." 

I may be dead wrong about this, and there may be some uber-secret strategy behind this comment, but from where I sit at the end of the summer, it looks like the first major move by the new president of one of my national affiliates has been a bunch of wasted electrons.

August 16, 2009

What "multiple measures" looks like in reality

Friday's Sun-Sentinel article on the new evaluation scale for Florida high schools shows what happens when a state moves away from general-assessment test scores as the end-all and be-all of accountability. In this case, Florida's new scale for high schools rewards schools for graduating more students, especially those who have problems with the state assessments, for enrolling students in challenging courses, for students who succeed in the challenging courses, and for student success in voc-ed certification programs.

How are Broward County schools responding?

At South Broward High School in Hollywood, students will get the chance to take additional AP classes, such as human geography, world history, music theory and macroeconomics, in addition to more traditional offerings such as AP English and biology, said principal Alan Strauss.

They're also ready to better monitor performance of at-risk students and ensure the entire senior class is ready to graduate, Strauss said. "I say overall I would hold myself accountable for grad rate and preparing my kids for college," Strauss said. "I don't find a problem with that. I think that's what my job should be."

Surprise, surprise! A more balanced accountability mechanism leads to planning a more balanced set of programs for students. I can quibble with loads of details on the new scale, but the direction is the right one, and I think we'll know in a few years how this is going. I'll stick my neck out and predict the evidence will be reasonably good (in terms of outcomes). A small step for a single state, a giant step for accountability options.

August 13, 2009

How can we use bad measures in decisionmaking?

I had about 20 minutes of between-events time this morning and used it to catch up on two interesting papers on value-added assessment and teacher evaluation--the Jesse Rothstein piece using North Carolina data and the Koedel-Betts replication-and-more with San Diego data. 

Speaking very roughly, Rothstein used a clever falsification test: if the assignment of students to fifth-grade teachers is random, then you shouldn't be able to use fifth-grade teachers to predict test-score gains in fourth grade. At least with the set of data he used in North Carolina, you could predict a good chunk of the variation in fourth-grade test gains knowing who the fifth-grade teachers were, which means that a central assumption of many value-added models is problematic.
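
A rough simulation may make the logic of the falsification test concrete. This is a sketch under invented assumptions, not Rothstein's actual model or data: the function names, the `sorting` knob, and the class sizes are all hypothetical. It asks how much of the variance in fourth-grade gains is accounted for by fifth-grade classroom membership, first when assignment is random and then when students are sorted partly on their earlier gains.

```python
import random
from statistics import mean


def share_explained(gains, groups):
    """Share of variance in `gains` accounted for by group means
    (the R^2 of a one-way ANOVA on group membership)."""
    grand = mean(gains)
    total = sum((g - grand) ** 2 for g in gains)
    by_group = {}
    for gain, group in zip(gains, groups):
        by_group.setdefault(group, []).append(gain)
    between = sum(len(v) * (mean(v) - grand) ** 2 for v in by_group.values())
    return between / total


def simulate(n_students=2000, n_teachers=40, sorting=0.0, seed=0):
    """Simulate fourth-grade gains, then assign fifth-grade teachers.

    sorting=0.0 means assignment ignores prior gains (random);
    larger values mean students are grouped more by prior gains.
    All parameters are hypothetical illustration values.
    """
    rng = random.Random(seed)
    gains4 = [rng.gauss(0, 1) for _ in range(n_students)]
    # Sort key mixes the prior gain with pure noise.
    keys = [sorting * g + (1 - sorting) * rng.gauss(0, 1) for g in gains4]
    order = sorted(range(n_students), key=lambda i: keys[i])
    class_size = n_students // n_teachers
    teacher5 = [0] * n_students
    for rank, student in enumerate(order):
        teacher5[student] = min(rank // class_size, n_teachers - 1)
    return share_explained(gains4, teacher5)


if __name__ == "__main__":
    print("random assignment:", round(simulate(sorting=0.0), 3))
    print("sorted assignment:", round(simulate(sorting=0.8), 3))
```

With random assignment the share should sit near zero. Once assignment tracks prior gains, next year's teachers appear to "predict" last year's gains, which is the warning sign the test looks for.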

Cory Koedel and Julian Betts's paper replicated and extended the analysis using data from San Diego. They were able to confirm with different data that using a single year's worth of data led to severe problems with the assumption of close-to-random assignment. They also claimed that using more than one year's worth of data smoothed out the problems.

Apart from the specifics of this new aspect of the value-added measure debate, it pushed my nose once again into the fact that any accountability system has to address the fact of messy data.


Let's face it: we will never have data that are so accurate that we can worry about whether the basis for a measure is cesium or ytterbium. Generally, the rhetoric around accountability systems has been either "well, they're good enough and better than not acting" or "toss out anything with flaws," though we're getting some new approaches, or rather older approaches introduced into national debate, as with the June Broader, Bolder Approach paper and this morning's paper on accountability from the Education Equality Project.

Now that we have the response by the Education Equality Project to the Broader, Bolder Approach on accountability more specifically, we can see the nature of the debate taking shape. Broader, Bolder is pushing testing-and-inspections, while Education Equality is pushing value-added measures. Incidentally, or perhaps not, the EEP report mentioned Diane Ravitch in four paragraphs (the same number of paragraphs I spotted with references to President Obama) while including this backhanded, unfootnoted reference to the Broader, Bolder Approach:

While many of these same advocates criticize both the quality and utility of current math and reading assessments in state accountability systems, they are curiously blithe about the ability of states and districts to create a multi-billion dollar system of trained inspectors--who would be responsible for equitably assessing the nation's 95,000 schools on a regular basis on nearly every dimension of school performance imaginable, no matter how ill-defined.

I find it telling that the Education Equality Project folks couldn't bring themselves to acknowledge the Broader, Bolder Approach openly or the work of others on inspection systems (such as Thomas Wilson). Listen up, EEP folks: Acknowledging the work of others is essentially a requirement for debate these days. Ignoring the work of your intellectual opponents is not the best way to maintain your own credibility. I understand the politics: the references to Ravitch indicate that EEP (and Klein) see her as a much bigger threat than Broader, Bolder. This is a perfect setup for Ravitch's new book, whose title is modeled after Jane Jacobs's fight with Robert Moses. So I don't think in the end that the EEP gang is doing themselves much of a favor by ignoring BBA.

Let's return to the substance: is there a way to think coherently about using mediocre data that exist while acknowledging we need better systems and working towards them? I think the answer is yes, especially if you divide the messiness of test data into separate problems (which are not exhaustive categories but are my first stab at this): problems when data cover a too-small part of what's important in schooling, and problems when the data are of questionable trustworthiness.

Data that cover too little

As Daniel Koretz explains, no test currently in existence can measure everything in the curriculum. The circumscribed nature of any assessment may be tied to the format of a test (a paper and pencil test cannot assess the ability to look through a microscope and identify what's on a slide), to test specifications (which limit what a test measures within a subject), or to subjects covered by a testing system. Some of the options:

  • Don't worry. Don't worry about or dismiss the possibility of a narrowed curriculum. Advantage: simple. Easy to spin in a political context. Disadvantage: does not comport with the concerns of millions of parents concerned about a narrowed curriculum.
  • Toss. Decide that the negative consequences of accountability outweigh any use of limited-purpose testing. Advantage: simple. Easy to spin in a political context. Disadvantage: does not comport with the concerns of millions of parents concerned about the quality of their children's schooling.
  • Supplement. Add more information, either by expanding the testing or by expanding the sources of information. Advantage: easy to justify in the abstract. Disadvantages: requires more spending for assessment purposes, either for testing or for the type of inspection system Wilson and BBA advocate (though inspections are not nearly as expensive as the EEP report claims without a shred of evidence). If the supplementation proposal is for more testing, this will concern some proportion of parents who do not like the extent of testing as it currently exists.

Data that are of questionable trustworthiness

I'm using the term trustworthiness instead of reliability because the latter is a term of art in measurement, and I mean the category to address how accurately a particular measure tells us something about student outcomes or any plausible causal connection to programs or personnel. There are a number of reasons why we would not trust a particular measure to be an accurate picture of what happens in a school, ranging from test conditions or technical problems to test-specification predictability (i.e., teaching to the test over several years) and the global questions of causality.

The debate about value-added measures is part of a longer discussion about the trustworthiness of test scores as an indication of teacher quality and a response to arguments that status indicators are neither a fair nor accurate way to judge teachers who may have very different types of students. What we're learning is a confirmation of what I wrote almost 4 years ago: as Harvey Goldstein would say, growth models are not the Holy Grail of assessment. Since there is no Holy Grail of measurement, how do we use data that we know are of limited trustworthiness (even if we don't know in advance exactly what those limits are)?

  • Don't worry. Don't worry about or dismiss the possibility of making the wrong decision from untrustworthy data. Advantage: simple. Easy to spin in a political context. Disadvantage: does not comport with the credibility problems of historical error in testing and the considerable research on the limits of test scores.
  • Toss. Decide that the flaws of testing outweigh any use of messy data. Advantage: simple in concept. Easy to spin in a political context. Easy to argue if it's a partial toss justified for technical reasons (e.g., small numbers of students tested). Disadvantage: does not comport with the concerns of millions of parents concerned about the quality of their children's schooling. More difficult in practice if it's a partial toss (i.e., if you toss some data because a student is an English language learner, because of small numbers tested, or for other reasons).
  • Make a new model. Growth (value-added) models are the prime example of changing a formula in response to concerns about trustworthiness (in this case, global issues about achievement status measures). Advantage: makes sense in the abstract. Disadvantage: more complicated models can undermine both transparency and understanding, and claims about superiority of different models become more difficult to evaluate as the models become more complex. There ain't no such thing* as a perfect model specification.
  • Retest, recalculate, or continue to accumulate data until you have trustworthy data. Treat testing as the equivalent of a blood-pressure measurement: if you suspect that a measurement is not to be trusted, test the student again in a few months or another year, much as you would take a blood-pressure reading again in a few minutes. Advantage: can wave hands broadly and talk about "multiple years of data" and refer to some research on multiple years of data. Disadvantage: Retesting/reassessment works best with a certain density of data points, and the critical density will depend on context. This works with some versions of formative assessment, where one questionable datum can be balanced out by longer trends. It's more problematic with annual testing, for a variety of reasons, though accumulating more years can still reduce uncertainties. 
  • Model the trustworthiness as a formal uncertainty. Decide that information is usable if there is a way to accommodate the mess. Advantage: makes sense in the abstract. Disadvantage: The choices are not easy, and there are consequences of the way of modeling uncertainty you choose: adjusting cut scores/data presentation by measurement/standard errors, using fuzzy-set algorithms, Bayesian reasoning, or political mechanisms to reduce the influence of a specific measure when trustworthiness decreases.
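
The simplest of the choices named in the last bullet, adjusting cut scores by measurement or standard errors, can be sketched in a few lines. This is only an illustration: the function name, the labels, and every number in it are hypothetical, and it covers only the narrow sampling-and-reliability side of trustworthiness.

```python
def classify(score, cut, sem, z=1.96):
    """Label a score relative to a cut score, leaving an 'uncertain'
    band of z * sem on either side of the cut.

    `sem` stands in for a standard error of measurement; `z` sets how
    wide the band is. Both are assumptions chosen for illustration.
    """
    margin = z * sem
    if score >= cut + margin:
        return "above"
    if score <= cut - margin:
        return "below"
    return "uncertain"


if __name__ == "__main__":
    for score in (290, 305, 310):
        print(score, classify(score, cut=300, sem=4))
```

The point of the extra label is that a score a few points from the line is not treated as if it were firmly on one side of it.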

Even if you haven't read Accountability Frankenstein or other entries on this blog, you have probably already sussed out my view that both "don't worry" and "toss" are poor choices in addressing messy data. All other options should be on the table, usable for different circumstances and in different ways. Least explored? The last idea, modeling trustworthiness problems as formal uncertainty. I'm going to part from measurement researchers and say that the modeling should go beyond standard errors and measurement errors, or rather head in a different direction. There is no way to use standard errors or measurement errors to address issues of trustworthiness that go beyond sampling and reliability issues, or to structure a process to balance the inherently value-laden and political issues involved here. 

The difficulty in looking coldly at messy and mediocre data generally revolves around the human tendency to prefer impressions of confidence and certainty over uncertainty, even when a rational examination and background knowledge should lead one to recognize the problems in trusting a set of data. One side of that coin is an emphasis on point estimates and firmly-drawn classification lines. The other side is to decide that one should entirely ignore messy and mediocre data because of the flaws. Neither is an appropriate response to the problem.

* A literary reference, not an illiteracism.

August 12, 2009

Belated kudos to Broader, Bolder and to Fordham

In the whirlwind of my obligations this year, my reading has lagged, and I am late in recommending and praising two reports published in the first half of 2009:

  • The Broader, Bolder Approach's accountability report, published in late June. This report suggests combining the use of achievement test data and on-site school inspections for school-level accountability. For those who have read Accountability Frankenstein, you'll know that I agree with those ideas. This report addresses the central gap in the original Broader, Bolder manifesto, and I am delighted to have read the proposal.
  • In March, the Fordham Institute published a report recommending a scaled approach to accountability when private schools take public dollars. Their proposal is roughly that the more dependent a private school is on public funding, the more the school has to provide data and be accountable in a way similar or parallel to local public schools.

Both are thoughtful, well-reasoned brief arguments, and they move each debate in interesting directions. Whether or not you agree with the conclusions, you'll have things to think about.

Updated: Aaaaargh! Six days later, I realize I've been calling the group the Bolder, Broader Approach instead of the other way around. Dear readers: when I make a stupid error, please point it out as soon as you see it.

Proposed ground rules on teacher evaluation and test discussion

Seeing how too many writers about Race to the Top, tests, and teacher evaluation would have taken actions in the Cuban Missile Crisis that would have led to nuclear war--i.e., seeing the worst in opponents, or maybe seeing posturing as the best path forward for themselves personally or for their positions (sound like the health-care debate-cum-food-fight?)--I am hereby proposing the following ground rules/stipulations:

  1. The modal forms of teacher evaluation used in K-12 schools are not useful.
  2. Some aspect of student performance (abstracted from all measurement questions and concerns about flawed tests) should matter in teacher evaluation.
  3. At least one problem of including student performance in teacher evaluation is how to use messy and flawed data. This comes from the fact that current tests are flawed. Heck, all tests are going to be imperfect and create the dilemma that Diane Ravitch referred to this morning. But plenty of today's tests should embarrass anyone who approved their use.
  4. Yes, people who disagree with you have used inane arguments, and some of them might even have gotten some provisions through a legislature by logrolling. I know I can say the same about your putative allies. Let's call each other out on those moves, and then move on to the substantive issues. Doing more than calling people out on that at the time (i.e., holding grudges) is playing the game of "your side is dirtier than mine," and you will inevitably lose that game, especially if there's an historian in the room (and in addition to me, there's also Diane Ravitch, Larry Cuban, Maris Vinovskis, and others who can quickly point out where folks have played dirty political pool for decades, though many of us will just call it the standard operating procedure in education politics). See reference above to Cuban Missile Crisis. If Reagan could make an arms-control treaty with Gorbachev, we can all be a little more mature in disagreements.

Anyone who has broken these ground rules or is going to break the ground rules in the near future is currently in a grace period thanks to my staying away from blogging much in the past few weeks. But if I have time in the fall, I'll write a weekly entry on who's doing the best and worst jobs of fighting fairly on this issue.

August 4, 2009

Your personal, homemade commission on tenure and test scores

Sick of finger-pointing in the absence of a New York state commission to study how to use test scores in teacher evaluation (including tenure) decisions? Look no further! In this space, we will be conducting our own homegrown commission over the next three months. No need for the New York Assembly and Senate to act! We'll do it ourselves.

What? you say. You're in Florida. Well, yes, but everyone knows that Florida is just the Southern branch of New York. My father grew up on Flatbush Avenue and graduated from Lincoln High School. He was in New York City for his residency in pediatrics (with an office in Bellevue, but that's another story). The Yankees' spring training home? Eight miles from my house. 

And if that doesn't convince you, you should know that Alexander Russo runs his blog on Chicago schools from Long Island. If he can do that, I can run a citizens' commission for New York from here (and then someone in Chicago can run something in Florida).

Apply in comments: name, role in New York education, what you'll bring to the table.

July 27, 2009

Talking turkey on "Race to the Top"

The hoopla surrounding the draft "Race to the Top" guidelines has obscured the long-game strategy involved here. If you think about the structure of the funds--more discretionary money than the U.S. Department of Education has ever had before, competitive grant system, and a set of priorities that the Duncan department has been signaling for six months--there are two guesses I have about the broader goals:

  1. The double-shot of grants over the next year is intended to be the first of two or three shots of large amounts of discretionary money for the department.
  2. Duncan's learned about vicarious reinforcement and intends to use it here.

The obvious initial "winners" will be states such as Florida which have a number of the required elements in place and are ready to go on a few payoff projects. But there will also be a few very large states left in the cold (and without that extra funding) after these first two rounds of awards. What if California is one of those states out in the cold? Or New York? There will be local pressure from school boards and administrators on members of Congress to continue feeding money to the department until their states land at least one award.

In the long game, the fact that Race to the Top can't bail California out is not really the issue, and I disagree with Mike Klonsky's assumption that this is an attempt to starve the states into submission. While I think a number of people would have preferred a larger ARRA stimulus fund, I don't think you can claim that the Obama administration has acted at all as if it wants thousands of teachers fired. Far more likely are the ordinary political dynamics of federal programs: no one wants to be without a slice of the pie. For these reasons, if it were legal to place a bet of this kind, I'd give rather interesting odds that California loses out big in the first two swats at Race to the Top money. 

And speaking of misdirected Mikes, Mike Antonucci is wrong about the teachers union dynamics in Race to the Top. While my higher-ed local has both the AFT and NEA as affiliates, I'm generally out of the loop on national headquarters stuff, but I can see the writing on the wall: one of the unions may well push in the regulatory process to increase the leverage of state affiliates, not to eliminate the requirement on linkability of teachers to student data. The best thing that the national affiliates can do is help state affiliates' negotiating position with their own state departments of education. If two states' applications are similar, but only one has a letter of support from their state affiliate's (or affiliates') elected officers, both the NEA and AFT need the state with union support in the application to have an advantage. (There are some interesting dynamics here vis-a-vis merged state affiliates, but the larger incentive at the national level is to help all state affiliates.)

July 25, 2009

Temporizing and teasing on tests and teacher evaluation

I still don't have time to expand at length on combining qualitative and quantitative sources of data for teaching evaluation, but given the hoopla surrounding the draft Race to the Top regulations, I should at least provide an update, or rather a bit of a tease for what's developing into a short paper-to-be. In addition to my fairly general understanding of some technical issues, I'm developing the argument that any point-based system for combining professional judgment and test scores needs to avoid fixed weights for the components of the system.

The explanation is not that technical, and I can sketch it here: the benefit of a truly Bayesian approach to using test scores to evaluate teachers is a reciprocal relationship between the decision-making authority of professional judgment and the power of other data (including test scores). A forceful judgment by professionals reduces the power of test scores in such a system, while tepid judgments increase the power of test scores. That is one possible solution to the thorny question of relative weights: if educators are willing to judge their own, test scores are less important (addressing the concerns of teachers unions and many administrators), but if educators are not willing to judge their own, test scores are more important (addressing the concerns of those criticizing the very low proportion of teachers given poor evaluations).

In a point-based system with fixed weights (or fixed percentages of the total) assigned to individual components, you don't have a structure with a reciprocal relationship between the exercise of professional judgment and the authority of test-score data. But I think the dynamic benefits of a Bayesian approach can be created in a point system, as long as the weights are not fixed. I need to think through the potential approaches, but it's possible.
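To make the idea of non-fixed weights concrete, here is a minimal sketch of one possible approach, not a worked-out proposal. Everything in it is an assumption for illustration: the 0-to-1 scales, the `score_weight` and `combined_score` names, the 0.6 ceiling, and the rule that the test-score weight shrinks linearly as the professional judgment moves away from the noncommittal midpoint.

```python
def score_weight(judgment, max_weight=0.6):
    """Weight given to test scores, on an assumed 0-to-1 judgment scale.

    A judgment of 0.5 is treated as tepid (no real verdict); 0.0 or 1.0
    is treated as a forceful verdict. The linear rule is hypothetical.
    """
    if not 0.0 <= judgment <= 1.0:
        raise ValueError("judgment must be between 0 and 1")
    strength = abs(judgment - 0.5) * 2.0   # 0 = tepid, 1 = forceful
    return max_weight * (1.0 - strength)   # forceful judgment -> small weight


def combined_score(judgment, test_signal):
    """Blend judgment and test data with a weight that is not fixed."""
    w = score_weight(judgment)
    return (1.0 - w) * judgment + w * test_signal


# Same weak test signal, two evaluators: one tepid, one forceful.
tepid = combined_score(judgment=0.55, test_signal=0.2)
forceful = combined_score(judgment=0.95, test_signal=0.2)
print(f"tepid judgment:    {tepid:.3f}")
print(f"forceful judgment: {forceful:.3f}")
```

The forceful evaluation barely moves in response to the test data, while the tepid one is pulled well toward it, which is the reciprocal relationship described above.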

There: that's the tease.

July 13, 2009

AFT QuEST presentation slides on performance pay

I am not in DC, but I do catch things online: the presentation slides for the AFT QuEST session on performance pay are available, and while Edward Tufte thinks PowerPoint is awful, a stack of straightforward, well-written slides provides a wonderful vicarious outline for those of us who Were Not There.

July 10, 2009

Those evil union supporters who denigrate objective measures...

Quick: who said the following recently?

We do see the incredible power of setting stretch goals. But if you set a goal that's really not within reach, people will just give up on it and you really don't have a goal. We've seen this over and over. I think there's as much talking down of goals around here as there is of actually saying, "You're not thinking big enough."

Oh, this evil denigrator of the value of objective goals. From the text, you might conclude that this person is a teacher union supporter who will die before wanting to break down the firewall between teacher records and student test scores.

Except that the speaker was Wendy Kopp, head of Teach For America and someone who said later in the interview that she is an advocate of using data and setting goals. But there's an important piece here about motivations and goals. No, I don't have answers for the K-12 world, but as I will continue to state until someone proves me wrong, there is something deeply wrong when an historian knows more about the relevant goals and motivation literature than most of the people who advocate setting extremely high goals in education.

Combining qualitative and quantitative evidence for teacher evaluation: What does "predominant" mean?

According to Gotham Schools, former NSVF and current USDOE official Joanne Weiss "said the Obama administration aims to reward states that use student achievement as a 'predominant' part of teacher evaluations with the extra stimulus funds" (emphasis added). I followed up with a USDOE representative, who emphasized after talking with Weiss that she meant a predominant part, not the predominant part of teacher evaluations, and that is how Walz reported the comment. The department representative added that department leaders "consider it illogical to remove student achievement from teacher evaluation, and we want states and districts to remove any existing barriers."

This came on the heels of TNTP's Widget Effect argument and Joan Baratz-Snowden's Fixing Tenure. I know that the political context of Weiss's remarks is to push the Duncan line that New York State's moratorium on the use of test scores in personnel decisions is wrong, and if necessary Weiss will bar New York from the Race to the Top funds if the legislature doesn't get its act in gear. Stand in line, please; I have a feeling a few million New Yorkers have the first dibs on dunking the entire state senate in the Hudson near Albany sometime in late November.

Back to policy, though: the word predominant perked up my ears because the Florida legislature's language has evolved from language involving the dominance of student achievement to quantification. The current language on personnel evaluation is a legacy of language first written in 1999:

The assessment must primarily use data and indicators of improvement in student performance assessed annually as specified in s. 1008.22 and may consider results of peer reviews in evaluating the employee's performance. [emphasis added]

The current performance-pay language in Florida has the Merit Award Program which stipulates that for the purposes of merit pay, achievement data "shall be weighted at not less than 60 percent of the overall evaluation" (F.S. 1012.225(3)(c)).

I need to think about this in some depth, but it strikes me that the Florida legislature mandated one of several options to use in combining quantitative and qualitative judgments of teacher effectiveness, the point system. You can probably come up with other variations that meet the statutory language, but my guess is that almost all real-world implementations would be linear combinations of different subscores, and I will use incredibly technical measurement language to call it the point system of combining different sources of information about teaching effectiveness. But that's not the only one, and I am always troubled when a clunky system is chosen as the default because it is the first option rather than a deliberate decision among options. I understand why a point system is in the bureaucratic and political gravity well, and it may well be that this particular clunky point system is the best option. However, it should be considered in comparison with what other clunky systems might be appropriate.
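For readers who want to see what a linear combination of subscores looks like, here is a rough sketch. The only number taken from the statute is the 60 percent floor on achievement data; the 0-to-100 subscore scales, the `point_score` name, and the two-component structure are my own simplifying assumptions, not a description of any district's actual plan.

```python
STATUTORY_FLOOR = 0.60  # "not less than 60 percent" for achievement data


def point_score(achievement, judgment, achievement_weight=STATUTORY_FLOOR):
    """Fixed-weight point system: a linear combination of two subscores.

    Both subscores are assumed to sit on a 0-100 scale (hypothetical).
    """
    if not STATUTORY_FLOOR <= achievement_weight <= 1.0:
        raise ValueError("achievement weight must be between 0.60 and 1.0")
    return (achievement_weight * achievement
            + (1.0 - achievement_weight) * judgment)


# A teacher with middling test results and a strong professional review.
overall = point_score(achievement=50, judgment=90)
print(f"overall evaluation: {overall:.1f}")
```

Notice that the weight never responds to how confident the evaluators were, which is what makes this version of a point system clunky.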

For example, there is also the holistic review of teacher effectiveness, such as exists in the new Green Dot-UFT collective bargaining agreement teacher evaluation system. There's no specific way that test scores inherently enter the judgment as such, though the implication is that teachers will have to show that they use assessment to shape instructional practices (what's called action research in the document, at the very least).

But those aren't all: a flow-chart is at least theoretically possible, though I do not have a real-life example. Yes, there are process flow-charts such as exist in Denver (and in the Green Dot system), but those are flow-charts essentially describing when and how you schedule meetings, not how you make decisions in a meeting. (Step 1: Can you understand this chart? Yes: read the rest of it while walking to your secretary's desk; no: pretend to read it while walking to your secretary's desk. Step 2a [at secretary's desk]...)

Most theoretical: a Bayesian bump algorithm. I am guessing that there is a high probability that any subjective Bayesian statistician reading this blog will have thought of this idea already, but I'll adjust that guess after some data comes in. Since even well-trained evaluators are making subjective judgments about people, you could treat a principal's or peer's judgment as a prior judgment about the probability that a teacher should be retained/rewarded, given help, or fired. In the Bayesian world, that prior judgment can and should be shifted based on data, to form a posterior estimate of the probabilities of what should be done (you can play with a Bayesian calculator here, in a medical-test context). That adjustment is why I'm calling it a "bump" -- start with a professional assessment on various grounds and allow that to be bumped somewhat by test data, with the magnitude of the bumping depending on the data. Going down this path would involve some interesting studies, and it would probably be working with Bayesian posterior odds (which provide an interesting possible back door to a point system). This is a little out of my league in terms of specific characteristics, but the Bayesian perspective on statistics makes it possible to combine qualitative and quantitative data in a framework that already exists.
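Here is a minimal sketch of the arithmetic behind such a bump, using the standard odds form of Bayes' rule (posterior odds = prior odds times likelihood ratio). The `bump` and `log_odds` names are mine, and the likelihood ratio of 3 is a made-up number standing in for whatever the test data would actually support; estimating that ratio is the hard part and is not attempted here.

```python
import math


def bump(prior, likelihood_ratio):
    """Shift a prior probability by evidence, using Bayes' rule in odds form.

    prior: the evaluator's probability (strictly between 0 and 1) that
        the teacher should be retained/rewarded.
    likelihood_ratio: how much more likely the observed test data are if
        the teacher is effective than if not (hypothetical input).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood ratio must be positive")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def log_odds(p):
    """Log-odds of a probability; on this scale the bump is simple addition."""
    return math.log(p / (1.0 - p))


# The same favorable test data, applied to a tepid and a forceful judgment.
for prior in (0.5, 0.95):
    posterior = bump(prior, likelihood_ratio=3.0)
    print(f"prior {prior:.2f} -> posterior {posterior:.3f} "
          f"(shift {posterior - prior:+.3f})")
```

On the log-odds scale the update is additive (log posterior odds = log prior odds + log likelihood ratio), which is one way to read the possible back door to a point system: the judgment and the data each contribute points, just not on the probability scale.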

So we have four large categories of ways to combine essentially qualitative and quantitative data. While I am busy reading student work and doing other stuff in the next week, you all have a chance to dive in and describe what you think are strengths and weaknesses of each approach, as well as any additional categories (or disagreements with my classification entirely). After I have a weekend and get other tasks finished, I will return to explain (a) why a Bayesian approach is not only philosophically appropriate but serves the needs of unions, students, and anyone Alexander Russo describes as reformy; (b) why a Bayesian approach is not that different from a point system, at least in theory; and (c) what characteristics you would look for in a point system for teacher evaluation to meet the political interests described in (a).

July 8, 2009

A word to the wise on accountability

Dear fellow Americans who support equal education and are inclined to attack teachers unions when you get frustrated (e.g., Charles Barone and Citizens' Commission on Civil Rights):

  1. Borg-like rhetoric ("Those who resist the school reform movement are going to find they are on the wrong side of history. They may affect the pace of reform, but not its inexorable direction") is not likely to convince anyone that they're wrong and you're right. It's not even close to the level of Rod Paige's NEA = terrorist remark, but it's still intemperate. And I don't know about you, but the last degree I earned came with a beautiful, shiny rearview mirror, not a crystal ball.
  2. I'm persuadable that NEA staff and national leaders made some incredibly stupid/venal moves in trying to shift policy in the backrooms of power (which apparently are no longer smoke-filled), that the AFT may have made (fewer) such moves, and that locals and state affiliates of both national affiliates also make stupid/venal moves at varying rates depending on location and internal union politics. But a report that essentially treats policy concerns and backroom politics as identical? It strikes me as shoddy analysis, for several reasons. First, it's scattershot, which undermines the credibility of what probably would be stronger arguments on more narrow grounds. Second, it misunderstands the nature of organizations, assuming that unions have intentions rather than internal politics, agreed-upon positions, strategies, and tactics. Third, if you criticize both regular and backroom politics, you're implicitly committing yourself not to do much politicking on your own part.

Every few years you see a wavelet of attacks on teachers unions, and I am assuming that this is part of a new one. Sometimes it's just a coincidence, and I hope that's the case in the entries linked above... and here

Addendum: Charles Barone takes me to task on two items; in comments I say he's right on one and wrong on the other, but you'll have to read what he writes rather than my summary.

June 30, 2009

Grading reports that grade states, which have schools that grade

It's now a PR cliche in education wonkery: grade states. Issue grades, and that's a hook for reporters to write stories about the reports, because the reporters at daily metros can say, "[Your state's name here] receives 'F' in think tank report on education." But beyond the PR value of grades, it's facile, which is why I'm surprised Education Sector gave in to this particular venial sin in its report on states' higher-ed accountability policies. C'mon folks: can't you figure out a more substantive way of evaluating states? At the very least, this is so 1990s.

So I'm thinking about developing a report over the next year that grades think-tank reports that issue grades for states on some matter of education, where of course schools have teachers who grade students. Among the standards will be the following:

Clear standards for grades: a year before the report is issued, does the entity that issues the report publish grading standards or criteria?

A - Entity publishes grading standards with sufficient criterion specificity that an outside observer would not be surprised at the grade a state receives the next year. (Note: this is a low bar, not requiring agreement with grades.)

B - Entity publishes standards, but standards are too vague to provide benchmarks for policy progress.

C - Entity has previously published reports issuing grades to states, but changed the standards, or described the project and the areas where states would be graded, but no standards for those areas.

D - Entity has previously published the existence of the report project, but there is no previous publication of intent to grade states in this area of policy.

F - Report appears out of the blue with no publication of intent in this area.

Okay, folks: where does today's Education Sector report fit? How about Ed Week's annual Quality Counts phonebook? Fordham's reports that issue grades?

And, yes, if I'm serious about this, that implies I have to develop some more grading criteria. After all, it would be most interesting and ironic if I created a report that contained the mechanism by which the report itself could be torn apart. Hint, hint, ...

June 26, 2009

How to steer CYA-oriented bureaucracies, or why NCLB supporters need to think about libel law

Someone at USDOE sent me an invitation to listen to the June 14 phone conference where Arne Duncan explained how disappointed he was in Tennessee, Indiana, and other states with charter caps, let alone states such as Maine with no charter law, and how that disappointment might be reflected in the distribution (or lack of distribution) of "Race to the Top" funds (applications available in October, due in December, with the first round of funding out in February 2010). There are a few details that reporters didn't ask about (Duncan's somewhat surprising statement that a good state charter law would set some barriers for entry rather than establish a "Wild West of charter schools," and the way that small charter schools and charter schools with grade configurations outside state testing programs can stay off the radar for accountability purposes), but I was not surprised that two Tennessee reporters were called on for questions.

But apart from the selection of reporters for questions, the phone presser and other DOE moves made me think about the various uses of power in education-policy federalism. In limited ways, explicit mandates can be effective, if there is a sustained willingness within the USDOE (and esp. OCR) to make painful examples of the nastier school systems that try to evade those mandates. Offering technical assistance is another method, and despite the massive conflict-of-interest problems in Reading First, I agree with one of the researchers in the field who thinks that Reading First did improve primary-grade reading instruction, on balance. (Thumbnail version: hourslong scripts, ugh; explicit instruction in phonemic awareness and some other fluency components, obviously necessary.)


But neither heavyhanded mandates nor technical assistance can do everything, and neither works with the greatest motivation for both defensive and hubris-oriented bureaucracies: risk management. If you are a public school teacher or administrator, my guess is that you can identify some fairly silly action by your district that was motivated almost entirely by CYA motives, and if you can marry those CYA activities to pedagogy, you've been lucky or have a black belt in administrative maneuvering. (If you have such victories, please describe them in comments! Otherwise, we'll all wallow in the shared misery of observing defensive administering and the all-too-frequent ensuing train wreck.)

I think the federal government can shape bureaucratic behavior to the good by using that risk management and structuring accountability policies around that. And here's the lesson I take from my high-school journalism class in ninth grade 30 years ago: libel law in the U.S. generally recognizes the truth as a positive defense against libel allegations. That seems like a backwards way to frame the legal issue -- after all, isn't it common sense that a publication is libelous only if it's false? -- but the notion of a legal positive defense gives an individual or organization a way to organize behavior that is both professionally appropriate and the basis of a legal defense aligned with professional expectations. Because the truth is a positive defense against libel claims, even an idiotic general counsel for a newspaper or publisher looks to the professionally-appropriate standard: is there documentation that the published work is true?

Sometimes a positive defense is not explicitly part of jurisprudence but evolves as a practical guidance for clinical legal work and internal advice for school systems. Observing procedural and professional niceties creates exactly that type of positive defense in special education law. There is nothing in federal special education law to carve out an explicit positive defense for school system behavior, but many articles written by Mitchell Yell over the past few decades constitute a convincing case that school systems now have a de facto positive defense: professional documentation of decision-making and scrupulous adherence to procedural requirements are a positive defense against a broad range of allegations by parents of and advocates for students with disabilities.

Yell has argued (persuasively) that due-process hearing officers and judges use procedural adherence and professional documentation as a filter in special education cases. If a school district can document that it has paid attention to procedural mandates and has met professional standards for documenting decision-making, then hearing officers and judges are extremely reluctant to look at the substantive merits of those decisions. But if a school district has ignored standard procedural expectations that most districts meet, or if a school district has kept no or inadequate documentation of its decision-making rationale, then all bets are off and a hearing officer or judge will be much less likely to defer to the school district on professional judgments.

In essence, Yell implies, school districts can avoid adverse judgments if they pay attention to timelines and other procedural niceties and if they keep teachers and principals on their toes about current "best practices" as well as deadlines, notices, etc.  Not all districts are aware of this positive defense, or I suspect that some enterprising special education researchers could make a mint running seminars, "How never to get sued again." 

More broadly, I'm beginning to think that the construction of a positive defense against charges of incompetence would be healthy for school systems and state policies. The devil would definitely be in the details, but instead of being frustrated by a consistently observed school system behavior, maybe we should take advantage of that consistency.

June 25, 2009

See-no-knowledge in education policy?

I seem to be reading several "we don't know anything so let's plow ahead" arguments in education think-tankery, from Mike Petrilli's argument that because we don't currently have a solid research base about how to turn schools around, we shouldn't try, to Kevin Carey's consistent argument in Education Sector's blog that because there is no research consensus about predictors of good teaching (and considerable research suggesting that there is not a link between effectiveness and countable items like years of experience beyond the first few or graduate degrees), it makes better sense to let people into teaching and then evaluate their effectiveness.

Fortunately, that's not the approach of the Institute of Education Sciences under John Easton, which has just announced a large research initiative on turning around schools. I suspect that both Petrilli and Carey would acknowledge that research in difficult topics is a good thing and argue that IES initiatives are different from policy, because sometimes you have to make decisions based on the state of knowledge you have, not the ... oh, shoot, there's Donald Rumsfeld phrasing again. But you probably know what I mean: Petrilli and Carey's stances are policy stances based on topic-specific agnosticism, not opposition to research.

But there's a serious question buried here: on big questions of policy, where you have to make choices, and the research is nondirective, how do you make decisions? I think the answer has to be incrementally, to allow research to catch up and influence policy later. If you make a huge political and institutional commitment to a policy path that has no research support and no ethical/legal obligation, then you're committing millions of children and hundreds of thousands of educators to a path that is very hard to change later. 

For that reason, while I think Arne Duncan's four-choice speech earlier this week is not based on research, and Petrilli is correct that there is no particular reason to believe that charter schools will somehow rescue the education of students otherwise stuck in horrible circumstances, the policy itself is good largely because it doesn't make hard and fast commitments to a particular path. The good thing about a charter is that it can be revoked, and in states such as Florida where there is a single authorizer for a geographic area (here, the county school boards), authorizers can be reasonably aggressive in shutting down shady or incompetent operations. So I share Petrilli's skepticism, but precisely because I am skeptical of any particular approach to schools in crisis, and because Duncan is being wishy-washy, I will applaud the Secretary for being wishy-washy. 

Update: I first used the term "know-nothingism" in the title. Ugh. Bad move for an historian. Petrilli and Carey are not members of the 19th century anti-immigrant party. Mea culpa.

June 18, 2009

The world is complicated, part 752

So the Center for Research on Education Outcomes has a report on charter-school performance, the Center on Education Policy has released a report on student achievement trends, NAEP released art-education data, and the spin has begun. Missing from almost all the reporting: Statements about the extent of peer reviewing for any of these reports. I'm not too worried about the professionalism of these reports,  since I know that the Department of Education always has an internal review process, CEP usually asks researchers in the area to review draft reports, and I would be surprised if CREDO did not have a pre-publication review process. However, the failure to report on the extent of peer review is a continuing and glaring omission in the reporting of education research.

In terms of the substance of the reports, I'm up to my eyeballs in prior commitments, but it's clear from the brief reading I have been able to do that the findings for all three reports are more complicated than the spin emanating from many of The Usual Suspects.* That's not news, I know, but I am the King of Things That Are Obvious Once He States Them, and I have a job to do.

* a great name for an a cappella group, if you happen to be starting one up.

June 13, 2009

On graduation rates and auditing state databases

I sympathize with Florida's Deputy Commissioner of Education Jeff Sellers, finding himself defending the state's official graduation rate the week that Education Week published its Swanson-index issue and pointed to Florida as a low-graduation state, using numbers far below the state's official numbers.

Some perspective: Florida's official graduation rate is inflated, but it's still better than Swanson's. Florida's graduation rate does more than Swanson (i.e., does anything) to adjust for student transfers and the fact that ninth-grade enrollment numbers overestimate the number of first-time ninth graders. 

Because of Florida's state-level database and the programming/routine that already exists, Florida is much closer to the new federal regulatory definition of a graduation rate than many other states, and Commissioner Eric Smith has been preparing the state board and other interested parties for the likely effect of the change on the official published rate -- i.e., that the rate will be a visible quantum lower than the currently-published rates (and largely for the reasons I have explained in the 2006 paper linked above). So in a few years we'll get a closer estimate of graduation from a lay understanding (the proportion of 9th graders who graduate 4, 5, or 6 years later).

The point in the St Pete Times interview where I winced was Sellers's answer to the question of how the state (and the general public) knows that the exit codes entered for a student are accurate: Sellers said that his department conducts an "audit from a data perspective."

That statement is misleading. It is technically true that there is an audit in two senses: each school district is required to check its data for accuracy before sending the data to the state's servers, and the state conducts a search of students reported as withdrawn in one county to see if they entered another county system before labeling them dropouts. But while I have seen reference to checking that the withdrawal codes are correct, I have not seen any evidence that such checks have actually occurred, and I have been unable to find that evidence anywhere on the Florida Department of Education website. That doesn't mean that it doesn't happen, but call me a touch skeptical. Without random checks, there is no guarantee that a 16-year-old coded as a transfer to another school actually was a transfer.

Given Florida's long experience with a state-managed education database, the lack of published audits of this process should caution us about the magic of state databases. They are important, but they need to be done properly. It makes sense to talk about the internal and external checks that should happen as other states construct databases and all states start to conform to the mandated longitudinal graduation rate:

  • Districts will need to be the first party to check accuracy, both by preventing mistakes/fraud and by conducting consistency checks--are there any records which claim that a 45-year-old is attending kindergarten, for example? The first is supposed to happen in Florida, and I suspect that counties catch the low-hanging fruit in terms of errors. But the accuracy check on withdrawal codes is the type of check that requires extensive follow-up to document whether a student identified as a transfer did in fact enroll in another school.
  • States will also need to conduct accuracy and consistency checks, though a state will necessarily be far less likely than school districts to catch outright fraud in claiming students transferred when they did not. 
  • States will also have to conduct the cross-checking that Florida currently performs every year and that I describe above: which students move between districts in the same state, but are counted as dropouts because a county only looks at its own students.
  • Finally, the auditing of transfer records would be MUCH easier if there is a standard way for school districts and individual schools to request the transfer of a student record and simultaneously use that authenticated request as verification that a transfer code is appropriate.

This is an incomplete list, but it's a start.
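As a rough illustration of the first and third checks, here is a sketch of what a consistency check and a cross-district match might look like. The record layout, field names, age ranges, and function names are all hypothetical; real state systems use their own schemas and exit codes, and this says nothing about how Florida's routines actually work.

```python
# Hypothetical plausible age ranges by grade; a real table would cover all grades.
EXPECTED_AGE = {"K": (4, 7), "9": (13, 17)}


def implausible_ages(records):
    """Return records whose age falls outside the expected range for the grade."""
    flagged = []
    for rec in records:
        bounds = EXPECTED_AGE.get(rec["grade"])
        if bounds and not bounds[0] <= rec["age"] <= bounds[1]:
            flagged.append(rec)
    return flagged


def match_withdrawals(withdrawals, enrollments):
    """Split withdrawals by whether the student later shows up in another district."""
    enrolled_in = {}
    for enr in enrollments:
        enrolled_in.setdefault(enr["student_id"], set()).add(enr["district"])
    matched, unmatched = [], []
    for w in withdrawals:
        elsewhere = enrolled_in.get(w["student_id"], set()) - {w["district"]}
        (matched if elsewhere else unmatched).append(w)
    return matched, unmatched


records = [
    {"student_id": 1, "grade": "K", "age": 45},   # the 45-year-old kindergartner
    {"student_id": 2, "grade": "9", "age": 14},
]
withdrawals = [
    {"student_id": 3, "district": "A"},
    {"student_id": 4, "district": "A"},
]
enrollments = [{"student_id": 3, "district": "B"}]

flagged = implausible_ages(records)
matched, unmatched = match_withdrawals(withdrawals, enrollments)
print("implausible ages:", [r["student_id"] for r in flagged])
print("found in another district:", [w["student_id"] for w in matched])
print("still unverified:", [w["student_id"] for w in unmatched])
```

The unverified group is where a real audit would have to start: a match in another district confirms a transfer within the state, but nothing in the data alone confirms a claimed transfer out of state or to a private school.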

June 8, 2009

No one ever accused Arne Duncan of impersonating an education researcher

Hopefully some day we can track kids from pre-school to high-school and from high school to college and college to career. Hopefully we can track good kids to good teachers and good teachers to good colleges of education.

This was an excerpt from a speech Duncan gave today to IES staff about the need to use data warehouses to link individual teachers and test scores and then use that linkage to evaluate teachers (hat tip). Oh, yes, and do it based on research. Some day, Secretary Duncan, but tying an individual teacher to student performance is not something that you can assert is based on research available today. It is more wishful thinking than anything else. The best apparent on-the-ground research of this type with teacher education is nonetheless full of caveats. And that's on a program-level scale, not on the level of the teacher. 

I'd accuse Duncan of spouting fuzzy logic, but fuzzy logic (the real stuff, research-wise, using fuzzy sets) may be one tool we use to get out of this dilemma.

June 1, 2009

The Procrustean bed of teacher tests

Mike Petrilli's stab at the Sonia Sotomayor nomination via the Massachusetts teacher tests is a little askew, and I'm surprised he didn't look at an obvious dilemma that's deeper than the politics of a judicial nomination. Several former teachers have sued the state (and Pearson) for what they claim is a discriminatory impact of teacher tests given the disproportionate failure rate of minority teachers. This is the employee side of impact-analysis law that most school lawyers probably know better under the graduation-exam cases in Florida and Texas.

The landmark case here is Debra P. v. Turlington, which led to a number of federal decisions that guide the use of tests that have disparate impact in schools. To wit, tests with disparate impact by protected classes are acceptable if...

  • There is a rational state purpose for imposing them (guarantee graduate skills, in the Debra P. case)
  • There is sufficient notice to those affected
  • Those affected have a reasonable opportunity to learn the material on the test (the key reason for delaying graduation test applications in Florida, where federal judges did not want to hold the victims of segregation responsible for the unconstitutional behavior of schools)
  • The application of the test is professionally done (I'm bundling together several separate issues, including the composition of the test, defensible setting of cut scores, multiple opportunities to retake the tests, etc.)
  • There is no better way to meet the state's purpose that also reduces the disparate impact.

In the employment context, Petrilli is probably correct that the translation of the first item is essentially whether the test is a reasonable proxy for necessary teacher qualifications. But there is almost no way that anyone engaged in the current debate over teacher qualifications can defend these tests or defend the teachers' lawsuit without having some fairly severe inconsistencies.

Consider first the folks who have the approach that we should not care who enters teaching as long as we measure student achievement and make personnel decisions as a result. Several (whom I will not name to protect the guilty) have accused the Highly Qualified Teacher standards in NCLB of obsessing about inputs (i.e., what teachers know) in contrast to outputs (what students learn). Anyone in this camp should abhor the Massachusetts teacher tests (and all teacher tests) because they continue the "let's look at the teacher qualifications absent the kids" approach, and we should be moving away from proxies for teacher effectiveness.

But the lawyers for the teachers and their supporters are not in much better shape, logic-wise. It is going to be very difficult to knock the legs out from under the state's teacher testing program. They have to argue that the tests are a poor proxy for teacher skill, or that the tests were poorly constructed, or that there is a better option with a reduced disparate impact. If they cannot convince a judge that the tests were constructed and administered unprofessionally, the lawyers are going to be in the uncomfortable spot of arguing that the testing is an inferior proxy for judging teacher quality, in contrast to ... [The conclusion is left as an exercise for the reader.]

Summary: If you are in favor of judging teachers by student learning, then testing content knowledge is a poor proxy by your own arguments. If you are against the content-based testing, then you have to come up with a better standard that will hold up in court. No, I don't think there's a way out of this for anyone with skin in the game, but if there is no summary dismissal and no evidence of rank incompetence in test construction, the fireworks will be interesting to watch.

Texas, South Carolina, Missouri, and Alaska

I know that the reports of the common-standards agreement shepherded by the Council of Chief State School Officers and the National Governors Association describe a few different reasons for why four states have not joined in a standards framework that is probably going to be about as close to a less-is-more approach as one can get in a bureaucratic standards document. Yes, I know Texas has just drafted standards (as has Florida, which is joining), that Missouri is searching for a new state superintendent (my guess is others are as well), that South Carolina has Mark Sanford (which is enough for any state to deal with), and that we haven't heard from Alaska. But here are my imaginary real reasons for why these states have opted out (thus far):

  • Other states refused to agree that everyone in the country would have to pronounce Harry Truman's state as mizZURah.
  • Texas would have to admit that bidness is not a word.
  • South Carolina did not get its way that there would be history standards with the required benchmark, "All six-year-olds will understand that each state is required to have at least one completely nutty elected official at all times, and this is a heritage of the Founders." 
  • There was a riot, not when Alaska insisted that NAEP math exams all use the Iditarod as an example of measure, rate, and general all-round toughness (other states just wanted to add their own events), but instead fisticuffs broke out when the Alaska rep. insisted that the current accepted size of the Earth was incorrect because if it was as large as most people thought, then you couldn't see Russia from your house.

Unfortunately, I suspect that the truth is far less entertaining. That's okay. We still have Joe Biden and George Will to mangle the facts in an interesting way.

Addendum: Lest anyone think I am making fun of other states, I should be very clear: I grew up in California in the 1970s, and I now live in Florida. That's enough ridiculous states to live in for a lifetime!

May 12, 2009

Should artists know something about money?

It's cringing time for this union activist: "Teaching is an art, not a business," wrote Hans, commenting this evening on a story about a judicial mandate prohibiting a UTLA one-day strike this Friday. That statement is irrelevant in the specific context (teacher layoffs), is a false dichotomy, and is wrong-headed in other ways. Let's start with the literal claim that art is incompatible with business. The daughter of a friend and colleague went to SMU on a dance scholarship. She was smart and, after a minor injury, decided to get some business training; she is now an administrator in an art-related New York nonprofit. Artists and nonprofits need people who are passionate about art and can also manage money (ask members of the Florida Orchestra, which I hear is surviving today in this economy because its new executive director is very competent).

Or to take another example, there's a wonderful segment of Stuart Math's documentary on desegregation in Shaker Heights, Ohio, where one of the old-time activists describes a post-WW2 meeting of residents who were trying to figure out how to create a stable housing market, and a business owner said, "You know, we can be liberal and effective, too."  And they were, running a neighbor-managed real-estate outfit that was crucial in maintaining a stable, desegregated, prosperous community.

So much for the claim that art can't be business and warm-hearted liberals can't think in terms of getting stuff done. But the whole premise is wrong; I don't think teaching is an art. You can make a good argument that teaching is a craft, but there has to be solid practice at the bottom of it. In addition, anyone who is skeptical of the value of high-stakes testing, as I am, has to have something that's just a tad, a teeny, a tiny bit more astute than a statement that screams, "Just let me do what I want when I'm paid with the public purse." That's nuts, both philosophically and politically.

May 11, 2009

"Governance reform" is not reform

While New York rages over mayoral control, which is all the rage, schools in Pinellas County are headed towards The New Site Based Management, which was the rage in the late 1980s and early 1990s and which Bill Ouchi hopes will be the rage again.

While there are plenty of ways that governance can affect the classroom, I am consistently underwhelmed by the argument that governance reform improves what happens in the classroom. I've seen it all before.

May 5, 2009

Florida could still jump forward on end-of-course exams

The St. Pete Times is reporting that the death of the Florida House bill mandating end-of-course (EOC) exams in high school starting in science is the death of end-of-course exams, at least for this year. I'm not so sure. If I remember correctly, the legislature authorized EOC exams in principle last year, and there is an alternative funding mechanism: stimulus dollars. Embedded in the stimulus bill is section 14006, which is part of the $5 billion discretionary amount given to the U.S. Department of Education. The state's application for state stabilization funds probably satisfies the nominal requirement for Florida to be eligible for a state incentive fund, if the state asks for incentive funds to develop EOC exams. This is precisely the type of project that the state incentive fund is designed for: it would replace the single comprehensive test with a number of tests tied to specific courses, and instead of upsetting science teachers (such as in physics and earth sciences) whose subjects were not included in the first round (the filed bill in the House excluded them), the state could develop a full range of EOC exams in science. Seems like an obvious "yes we'll do that" to me.

I could be wrong; there may be legitimate reasons not to apply for state incentive funds to develop EOC exams. What surprises me is that during the legislative session, there was no public discussion I am aware of about the possibility of using federal stimulus dollars to develop EOC exams. I have heard nothing publicly at all about this, yet it's been an obvious possibility, at least to me. Has any reporter asked Commissioner Eric Smith about this? Is there any legislator or legislative aide who has asked about it?

April 6, 2009

One teacher's response to Ron Matus's article

There's been lots of coverage of the Ron Matus story March 29 on firing teachers in Florida, but there's been no follow-up online about the letters to the editor that were printed April 4 (last Saturday), and at this point, I can't even find the letters on the Times website. But I think one needs to be highlighted, because it's from a teacher and makes a few important points:

The premise in the article [by Ron Matus] is that tenure makes it too hard to fire bad teachers, yet the few examples given don't demonstrate that, but rather, simply show inaction on the part of school districts.

If the writer had found districts attempting, but failing, to fire bad teachers, he might have a point. I see this drive to get rid of tenure as an effort to instill fear in teachers and keep them silent. Teachers living in fear for their jobs can't afford to speak out.

Getting rid of tenure (read: due process) might make it easier to dismiss the rare teacher who shouldn't be in the profession. It would also make it easier to dismiss the good teachers--even the great ones, because the great ones are the ones who stand up and advocate for their students, themselves and their profession, and in doing so sometimes step on toes...

John Perry, Tampa

I've known John Perry for a number of years; he's an activist in the Hillsborough Classroom Teachers Association, but I don't think he was when we met. I think Perry's wrong about the order of magnitude of "the rare teacher who shouldn't be in the profession" (emphasis added), but since a good portion of teachers leave the field within a few years, I don't think that there's a shortage of ways to discourage teachers from continuing.

More broadly speaking, I think more sophisticated critics of teachers and their unions understand that administrators are the ones who fail to fire teachers, but Perry's other point is important: while K-12 teachers do not have academic freedom in the same sense that higher-ed faculty do, they're the ones I often hear a certain style of reformers praise for precisely the type of dissent that would be in danger without due process.

So let me phrase the question in the following way: does anyone want administrators to be able to fire teachers summarily after teachers do the following?

  • Refuse to change a grade to let an athlete play.
  • Complain that the new math textbook series is confusing to new teachers and likely to lead to poor teaching.
  • Sign and date a request that a child be evaluated for eligibility for special education services.
  • Complain when girls have fewer opportunities than boys.*

As far as I am aware, the only case above for which K-12 teachers are clearly protected when they speak out is the last one, and that's because of a Supreme Court decision stemming from Title IX; I suspect that they are likely to be protected if they push for assessment to gain services for a child, but I don't know of anything as clear-cut as a Supreme Court decision. And I don't see people who are in favor of "tenure reform" rushing to replace workplace due process with greater whistleblower protections.

April 1, 2009

Sharpton paid off? Please tell me this is an April Fool's joke

The New York Daily News is reporting this morning that former NYC Schools Chancellor Harold Levy is involved in a $500,000 payoff set of donations to the Rev. Al Sharpton's organization, with payments beginning shortly after Sharpton and Joel Klein launched the Education Equality Project in June 2008. With friends like Levy,...

In other news, I am hereby announcing my support for the public flogging of teachers whose students' test scores decrease from year to year, my hope that NYC invests an additional $1 billion in the ARIS system, my trust in the market to determine the true worth of schools within a voucherized environment, and my death last Thursday from reading Michel Foucault. In lieu of flowers, my family is asking that donations be made in my name to the John Birch Society, except for my son, who would appreciate iTunes cash cards instead.

Okay, it looks like the DN story is serious. Yikes. That'll take the wind out of the Education Equality Project (EEP) conference starting today. Then again, maybe "eep!" is the reaction of participants and fans of the Klein-Sharpton effort.

March 30, 2009

Seattle will be drier

I spent some time this weekend finishing the first complete draft of a talk I'm giving in Seattle on Thursday. I'm going to be heading there while a few thousand historians are leaving Seattle after the end of the Organization of American Historians meeting. I'm either expecting to find a time machine or I am heading there for a different meeting (Council for Exceptional Children). Last time I was in Seattle, it was wetter and colder than what's forecasted for the middle of this week. We had a drenching rain in Tampa this morning, so things will even out in my personal experience this week, even if not for the world.

I hope my neighbors weren't paying close attention while I was timing the draft. I don't read papers word-for-word, but I wanted to get a sense of how far I'm off on time, so I read it aloud while alternating between the laundry room and the kitchen.

Oh, the topic? Accountability and students with disabilities. I think I know how I'm ending the hour, but the cliffhanger before the third set of commercials is the tough part right now, and I haven't yet decided if Jason's going to live. If he does, I'm going to have to tear up the last act and start fresh. I've given a spoiler, haven't I?

More seriously, this talk is giving me the opportunity and prod to think through some connections between areas of education politics that I mentally put on "percolate": the democratic rationale for public education, tensions between public and private purposes of schooling, and what technocratic mechanisms may be useful for (and in what circumstances). When I get back, I have to think about potential outlets and how to get a potential coauthor to give up enough time to participate (and the value involved in that). 

The only serious performance question I have is the extent of corny jokes and how far I can/should push them.

  • An RTI Tier 2 intervention plan and a Writ of Mandamus walk into a bar...
  • Peter Singer dies and finds himself at the Pearly Gates facing St. Peter: "So your most important goal right now is to avoid pain?" St. Peter begins...
  • How many IEP team members does it take to screw in a lightbulb?...
  • A rabbi, a minister, and a psychometrist are in a rowboat in the middle of the lake...

Maybe not those jokes.

March 17, 2009

Longitudinal data systems, good; unique teacher linkage, bad

Diane Ravitch's blog entry this morning seriously disparages the value of longitudinal data systems, including the linking of teachers to students, and John Thompson's entry discusses the abuse of data by administrators. Essentially, both Ravitch and Thompson fear the brain-dead or conscious abuse of data to judge teachers out of context. That's also the reason why NYSUT (the New York state joint NEA-AFT affiliate) worked hard to convince the legislature to put a moratorium on using test scores to make tenure decisions; Joel Klein was moving very quickly, and I think UFT and NYSUT had good reason to believe that without the moratorium, there would be substantial abuses of test data in NYC (and elsewhere) in tenure decisions. 

My take: longitudinal data systems are a good thing, but linking teachers to students is a much more fragile undertaking.

Florida has a longitudinal data system that began in the early 1990s and has been used for 10 years to judge schools based on test data. Approximately ten years ago, I sat in a windowless room in Tallahassee as a Florida DOE member discussed the new A-plus system and a variety of technical decisions tied to it, and for which he had brought stakeholders and a few yahoos from around the state to give advice. I was one of the unpaid yahoos who had the great joy of flying in tiny airplanes several hundred miles a few times a year to give advice on the matters. 

We had so many matters to discuss that one minor conversation was almost overlooked: a state mandate that required that the FDOE link each student to a teacher primarily responsible for reading and math. One state official showed us a draft form and then explained the concerns he had about it: in his view, the state that had tried that a few years earlier (Tennessee) had multiple conceptual difficulties connecting individual teachers to individual students. But they had run roughshod over those concerns, and he anticipated that Florida would do the same.

It wasn't a matter of letting teachers off the hook (this now-retired professional staffer is what I think of as an accountability hawk) but logic and sense. How many physics and chemistry teachers help students understand algebra better? How many history teachers help students with writing or reading? For students receiving special education services in a pull-out system, do you want only the special educator to be responsible for a subject, or do you want both the general-ed classroom teacher and the special educator to have responsibility? This spring, my wife (a math major and special educator) is tutoring a local child in math on weekends or evenings; so who should get credit for how he performed on testing in the last week, his teachers in school or my wife? Today, you can add NCLB supplemental educational services (or after-school tutoring) to the mix. 

The larger point: even if you decide to wave away the concerns of Richard Rothstein and others, even if you focus entirely on what happens in academic environments, it is fallacious to link every student performance with a single teacher. If we are providing the appropriate supports for children, then the students with the lowest performance are the ones for whom such unique linkage assumptions are the least justifiable, because they may be receiving academic support from general education classroom teachers, from special educators, from after-school tutors, and maybe mentors or other providers in neighborhood support organizations (such as Geoffrey Canada's). Today, I do not think one can parcel out responsibility without making assumptions that have no basis in empirical research. Those who support individual teacher linkage have the burden to demonstrate otherwise.

March 12, 2009

Joel Klein as DM

John Thompson's blog entry today, God Does Not Play Dice, is in response to Charles Barone's Ed Sector report on value-added or growth models used for high-stakes accountability. (It's on my to-read list along with the IES/Mathematica study on teacher ed programs and various other things.) Thompson describes a number of caveats and then says,

...none of my objections would be major if the model was used for purposes of diagnosis, science, or a "consumers' report." We should pursue social science fearlessly, but we must not play dice with the lives of teachers by evaluating them with some theoretical work in progress.

That plays off Einstein's quip, "God does not play dice," in reference to quantum mechanics. That comment always made me think that if God does not play dice, maybe God forces you to pick up the dice and roll.

And that gave me the image of Joel Klein as Dungeonmaster.

A troll has just entered your classroom. He has a mace, a strength of 11, and 16 hit points.

After the Cafeteria Blob you threw at us, I only have 4 hit points, and I lost my Spitball Blocking spell.

Fight or run away?

Better fight; if I run away, I lose the Memo Spindle.

Better hope you're lucky. You need to roll a 17 to block the mace, 20 to break it.

But you're only giving me a D12!!

This is New York. You're tough enough. Roll.

March 10, 2009

Get Accountability Frankenstein for $10!!

Information Age Publishing is having a ten-year anniversary sale where you can get 10 or more books from their catalog for $10 each. Their authors, editors, and series editors include Gene Glass, Ernie House, Erwin Johanningmeier, Terry Richardson, Tom Popkewitz, Kathy Borman, Kenneth Wong, Jaekyung Lee, Maurice Berube, V.P. Franklin, Carol Camp Yeakey, and many others.

March 2, 2009

Take a breath (if you don't have asthma) and go on

I don't have asthma, but as my head cold morphs into the ordinary misery of seasonal allergies, I realize it's a darned nuisance not to be able to breathe comfortably. With luck I'll shortly be back to normal (or at least for what passes as normal for me), and in times like these, it pays to take a deep breath on receipt of almost any news and criticism. Evidently, my perspective lies somewhere between former Hill staffer and new DFER policy guru Charles Barone and NYC union activist Norm Scott, because I'm getting dished on by both. I'm not going to use the lazy journalist's excuse, "Because both sides are criticizing me, I must be right," in part because I'm not a journalist, in part because it's easily possible to be wrong about multiple things at once, and in part because while I disagree with Barone's and Scott's posts, they (generally) have the guts to say where they disagree with me. Oh, yeah, and they spell my name right. That counts for a lot with me.

Barone criticizes me (and others) for writing too much from an adult's perspective. I've written about that topic before (at length in Accountability Frankenstein and in more digestible chunks in One-Blog Schoolhouse), so let me provide a somewhat different gloss here: I could easily turn my blog over to several guest writers, my children and their friends. I suspect Barone's response to their criticisms of high-stakes testing would be, "Well, I know a little more about the world and your own best interest than you do." That statement would be absolutely right (at least in the first half) and an absolutely adult perspective.

(Incidentally, I agree with his substantive point in his entry that teacher happiness is not the point of either education policy or teacher education. I don't think that you can usually have effective teaching with completely miserable teachers, but I suspect or at least hope Barone would agree with me, and there's plenty of ground between avoiding total misery for teachers and seeing their euphoria as the primary goal of policy.)

Scott criticizes me (and others) for ignoring the fact that Arne Duncan was flawed as head of the Chicago Public Schools. Er, no. I'm fairly sure I'd have disagreed with him on a number of his decisions in the same way that I am fairly confident on where I'll disagree with him on federal education policy. But that open expectation of some disagreement does not mean the Obama administration is evil. Scott asks, "Exactly how much 'context' do these people need?" I'd say 20 years of Republican presidencies divided by 8 years of Bill Clinton. In comparison with Bill Clinton on the whole, Obama is good. And in contrast to the others, he's very, very good. That doesn't mean that I'm going to stay quiet when I think the administration is doing something wrong. It means I do have some perspective. Breathe, folks, breathe. For those who are worried about Arne Duncan, I think you'd do much better putting your energies into worrying about Timothy Geithner instead.

February 25, 2009

On exaggerations in the service of bitterness

Today, Charles Barone indulged in some recriminations about the use of test data to evaluate teachers: "In fact, in many states there is tremendous pressure to pass legislation which assures a firewall-like separation between teachers and student performance. Such laws have already passed in California, New York, and Wisconsin; ..."

But let's examine that claim with regard to New York, about which others such as Kevin Carey and Jennifer Jennings wrote last April. The language:

3012b. Minimum Standards for Tenure Determinations for Teachers.

(a) A superintendent of schools or district superintendent of schools, prior to recommending tenure for a teacher, shall evaluate all relevant factors, including the teacher's effectiveness over the applicable probationary period, or over three years in the case of a regular substitute with a one-year probationary period, in contributing to the successful academic performance of his or her students. When evaluating a teacher for tenure, each school district and board of cooperative educational services shall utilize a process that complies with subdivision (b) of this section.

(b) The process for evaluation of a teacher for tenure shall be consistent with article 14 of the Civil Service Law and shall include a combination of the following minimum standards:

(1) evaluation of the extent to which the teacher successfully utilized analysis of available student performance data (for example: State test results, student work, school-developed assessments, teacher-developed assessments, etc.) and other relevant information (for example: documented health or nutrition concerns, or other student characteristics affecting learning) when providing instruction but the teacher shall not be granted or denied tenure based on student performance data;

(2) peer review by other teachers, as far as practicable; and

(3) an assessment of the teacher's performance by the teacher's building principal or other building administrator in charge of the school or program, which shall consider all the annual professional performance review criteria set forth in section 100.2(o)(2)(iii)(b)(1) of the Regulations of the Commissioner.

The part that was added last spring is in italics, but the rest remains, including clear performance references in bold. How are we supposed to read the combination of "the extent to which the teacher successfully utilized analysis of available student performance data... when providing instruction" together with the ban on granting or denying tenure "based on student performance data"? I'm not a lawyer, but obviously there has to be data for one to judge teachers on how well they use the data. My reading (which I think is plausible) is that one couldn't make a blanket decision based only on test scores, but you could grant or deny tenure based on how well a teacher used the data in adjusting instruction. This latter is pretty close to the best-world scenario of Response to Intervention (RTI) policy, which has a lot of research at least in core areas in elementary schools. In comments on Barone's entry, I wrote,

I think we may be reading the same legal language with very different lenses. To me, the tenure-qualifications language in NY state essentially conforms with RTI -- teachers have to show that they can use data. Those upset with the added language for this year -- which bars a brain-dead statistical formula -- must think it would be as appropriate and also easier to define effectiveness with test scores as what is currently allowed/required by law. Me? I don't think there's anything that's easy here to implement in a fair way, and there ain't yet no Holy Grail. I also suspect that there is no provision in NY law that prohibits the type of analysis of teacher education that Louisiana has been building for the last 5-7 years. Either I'm reading your definition of a firewall too broadly, or I'm misreading NY law.

Here is Barone's response, word-for-word (the bold-faced sentence is my emphasis):

It seems to depend on how you define "brain dead." The data can't be used, thoughtfully or otherwise, to inform tenure decisions. Whether there is a holy grail, or it hasn't been found, remains to be seen. But surely everyone agrees that poor and minority kids are getting the short end of the stick, and data available now can and should be used to help level the playing field for kids while we adults have our fun little debates. I notice you rarely use the word student or child, unless you are quoting me. I think we need to err on the side of the kids for a while even if it makes adults uncomfortable. If we wait for there to be a consensus among academics, today's kindergartners will be collecting Social Security before anything is done. If then.

The "bitterness" in the title of this entry refers to this response. I'm disappointed by Barone's avoidance of the substantive topic by applying a rhetorical litmus test (how often I mention children in my blog), as well as the politician's logic here (something must be done; this is something; so we must do it). But let me get to the point: Barone is misreading the law. Data can be used to inform tenure decisions, and in fact, they must be, because the law requires that part of the tenure decision depends on teacher use of data. No data, no use of data -- no tenure. It may not be Barone's picture of how data informs a personnel decision, but Barone's claim is just plain wrong.

Addendum: In comments, Barone argues that the New York state law is clear and bars use of test data for making tenure decisions. Here's the way to decide it:

1) Does New York law prohibit a district from denying tenure because a teacher refuses to implement Response to Intervention practices?

2) Is Response to Intervention something based on student performance data?

If the answers are "no" and "yes," respectively, I'm right. Any other combination, and Barone is right. Let's try another scenario:

Main office conference room, where the assistant principal is meeting with a new teacher. "Let's look at your students' last quizzes and talk about where they learned the material well, and where you might want to reteach."

The teacher holds up his hand. "Wait a minute. Am I going to be judged based on what I say in this meeting?"

The assistant principal nods her head. "In part, what I'm judging in your effectiveness is how you respond to student needs. C'mon. Let's just look at the quizzes."

"No way. State law forbids the use of student performance data in tenure decisions. I'm talking with my union rep!"

If Barone is right in the global sense, this conversation could really happen. But I don't think it could (or has). When Barone claimed that New York had put a "firewall" between teachers and performance data, I know he was thinking in the narrow sense of "if students perform poorly on standardized tests, then we should be able to deny tenure." But regardless of whether that is a good or bad policy, that's not the only way one can connect teachers and student performance. Expecting teachers to look at student performance and change instruction based on data is a second way, and New York does not bar it. Looking at teacher education and student performance is a third way, and New York does not bar it. Which of those three is good policy is an interesting and debatable question, but what is not debatable is that all three connect teachers to data.

February 20, 2009

Technology and assessment

Education Sector's new report Beyond the Bubble is shorter than I had expected, so I finished it while watching the end of my son's tae kwon do class last night. It looks to be a decent summary of the optimistic side of technology-and-assessment literature. Its tone is, "Yes, we can dramatically change and improve assessment with technology that is either just about to come online or that deserves some investment." And I think that for some things, that's absolutely right: an online/computerized science exam could have color images of tissue slides, videos of animal behavior, and so forth. But, while author Bill Tucker bowed his head in the direction of friendly technoskeptic Larry Cuban, there are some flies in the ointment:

  • Students with disabilities. This is true for pencil-and-paper tests as well, but when you only have black ink, there are a few other issues you don't have to worry about that on-screen designers have to: red-green color blindness, epilepsy and screen movement, etc. The half-page on universal design is good, and any CFP will need to specify (and budget for) disability/accessibility awareness.
  • Code creep. I don't mean internet safety but the fact that programming languages grow up and die. We've gone from Perl to Python, from HTML to XML, and languages and interfaces will continue to evolve. I wonder how many of the cases pointed to in the report are essentially one-off projects that will die at some point because the platform no longer exists. (Any readers remember Infocom's text games?)
  • Holy Grail syndrome, also known as a belief in "the leap in cognitive science that will allow perfect, automatic scoring of essays is just around the corner." Same with the great and brilliant analysis of hundreds of microstate data that a single student can generate in a simulation environment. I trust colleagues who work in cognitive psychology to do some great things in the next decade, but this seems a bit utopian. Okay, more than a bit.

All of this doesn't say we shouldn't be engaged in using technology, but maybe we should work along two tracks: encourage the fast, frequent, and flexible for now and also invest in the medium- and long-term projects.

There is something that the paper never addresses: intellectual-property rights. Part of the imprisonment of assessment in an oligopoly is the ownership of assessment materials, backed up by the fear of security problems. (Here's reality for you: the day after a state test is given, assume NO security for that test. None. Despite all the laws. Just give that idea up, folks, unless you believe in the tooth fairy, have never heard of BitTorrent, and don't think college students ever cheat.) I am curious what the positions of various folks are on open-source assessment. I am not entirely sure what it would consist of, or how it would meet adequate technical standards, but it's tough to argue that, despite the testing industry's oligopoly status, a brand-new investment will erase either the proprietary rights of the major firms or the start-up threshold for the creation of commercially viable products.

February 6, 2009

Klein compares Bloomberg to Putin

No, he didn't, but at the mayoral-control hearing in Albany, according to the indefatigable Elizabeth Green,

Klein defended himself passionately, arguing that mayoral control is a democratic governance structure, not an authoritarian one, as some members painted it.

The logic here is weak: under that view, a plebiscite dictatorship is democratic because every few years the head honcho could be kicked out of office. 

I think there are multiple reasonable approaches to the policy question, such as UFT's "you need two (more) righteous people to save Gotham" proposal of giving the mayor a plurality on the main policymaking body (so the mayor and chancellor would have to convince 2 out of the other 8 members) or something that would give an independent body subpoena authority and the responsibility and right to issue reports on the schools.

But the gist is to inject public accountability beyond the one-person constituency of Joel Klein. I'm a little curious why advocates of mayoral control don't grasp the fundamental irony that you don't create accountability by removing it. There are multiple ways of addressing the messiness of urban politics, but if the appointed chancellor has spent several years ignoring parents, he's getting his natural comeuppance today.

UTLA and "benchmark" or "periodic" testing

Last week, the United Teachers of Los Angeles called for the cessation of every-few-months testing in the district. The response of the district: such testing is an important tool in improving student achievement, which they know because schools with such testing have had annual-test scores higher than schools without such testing.

The flaw in the district's reasoning is left as an exercise for the reader, because I'm more concerned at the moment about what this debate shows about our attitudes towards assessment. UTLA is wrong to attack frequent testing on principle, though I think they may have a good point about this type of assessment. Such periodic assessment may help schools target assistance to students, or it may serve primarily to mimic the state test and encourage teaching to the test (the predictive success of which principals would know by results on the quarterly assessments). Without knowing more about the details, you can't say which is which, and both phenomena are possible (including in the same school).

What concerns me is the direction in which the machinery of testing is taking formative evaluation. There's a lot of research to suggest that when used to guide instruction, frequent assessment can dramatically change results. There are a number of technical questions about so-called formative assessment (or progress monitoring) that are the domains of researchers in the area: how to create material sufficiently related to key skills or the curriculum, how to create assessments where score movement is both meaningful and sensitive to change, how to gauge appropriate change, how to structure the feedback given to teachers, and so forth. My reading of the literature (which is not complete) is that the most powerful uses of formative assessment require very frequent, very short assessments--on the order of once or twice a week, and about the same length as your typical elementary-school spelling test (i.e., a few minutes at most). 

So what do we see as the evolving, bureaucratic version of formative assessment? Long tests taken every few months. That's better than once a year in terms of frequency, but it's still a blunt instrument and absorbs a large chunk of time. The reason for this preference is obvious: a large, unwieldy school system can organize systematic evaluation/feedback around quarterly tests. That's doable. But organizing around something that's taken weekly and would often require data entry (e.g., a one-minute fluency score for first- and second-graders)? That's a different kettle of fish.

That doesn't mean it's impossible. It's easy, if you're a principal who's willing to devote the right resources. Consider reading fluency, for example. (I'm not saying that fluency is more important than comprehension. I just have the experience with this to imagine what I'd do as a principal.) Teach a paraprofessional to have every first- and second-grade student in the school read to them one minute a week on a sample reading passage (there are sets of roughly equivalent passages one can purchase for this purpose). Have them enter the data through a Google Docs form, a SurveyMonkey survey, or some other tool that will send the data to a spreadsheet. Get someone to program the results so that you can show data per child with trend lines and sort by grade, classroom, etc. For a few extra lines of code, you could add locally-weighted regression trends to be really fancy, but that's beside the point.
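To give a sense of how small the programming job is, here is a minimal sketch in Python of the per-child trend step. It assumes the form responses have already been exported to rows with hypothetical column names (student, grade, classroom, week, wpm for words read per minute); the names and the sample numbers are invented for illustration, and the trend is a plain least-squares line, not the locally-weighted version.

```python
from collections import defaultdict

def fit_trend(points):
    """Least-squares line through (week, score) pairs; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        # Only one distinct week so far: no trend can be estimated yet.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def trends_by_student(rows):
    """Group weekly scores per child and report the slope, sorted by grade and classroom."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["grade"], row["classroom"], row["student"])
        grouped[key].append((row["week"], row["wpm"]))
    report = []
    for (grade, classroom, student), points in sorted(grouped.items()):
        slope, _ = fit_trend(points)
        report.append({
            "grade": grade,
            "classroom": classroom,
            "student": student,
            "weeks": len(points),
            "wpm_gain_per_week": round(slope, 2),
        })
    return report

# Hypothetical sample data: two first-graders, three weekly one-minute readings each.
sample_rows = [
    {"student": "A", "grade": 1, "classroom": "Room 1", "week": 1, "wpm": 20},
    {"student": "A", "grade": 1, "classroom": "Room 1", "week": 2, "wpm": 24},
    {"student": "A", "grade": 1, "classroom": "Room 1", "week": 3, "wpm": 28},
    {"student": "B", "grade": 1, "classroom": "Room 1", "week": 1, "wpm": 30},
    {"student": "B", "grade": 1, "classroom": "Room 1", "week": 2, "wpm": 30},
    {"student": "B", "grade": 1, "classroom": "Room 1", "week": 3, "wpm": 33},
]

for line in trends_by_student(sample_rows):
    print(line)
```

A flat or negative gain per week is the kind of thing that could start a conversation with a teacher; what counts as adequate growth is a judgment for people who know the reading research, not something this sketch decides.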

Here's the point: this is not rocket science, this does not require a gazillion-dollar software package from TestPublisher Inc., and it's very different from the type of quarterly testing that superintendents are buying into in a big way (including that gazillion-dollar software package from TestPublisher Inc.). It's very different from the quarterly testing that UTLA is protesting.

So, Ramon Cortines, here's my challenge: can you document that the quarterly-testing regime is better than the weekly-quiz-plus-trends proposal I've outlined above? The second can fit easily into the routines of any school. The second can start conversations EVERY WEEK at a school. The second is MUCH cheaper. It's also less sexy: no giant software packages manipulable from the front office, no instantly-printable pastel-colored graphs that demonstrate what kids were able to do on a test six weeks ago. You'd definitely give up the flashy for the mundane. But prove to me that the flashy is better than the mundane.

February 5, 2009

What personality is your Performance-Pay Attitude? (and other mixed metaphors)

Since other bloggers I read have used various quizzes to spice up their entries, or maybe do something online while they're waiting for a bus, here is the all-purpose Performance-Pay Personality Quiz. Oh, wait: "personality" isn't quite appropriate here. But to mix metaphors, what personality is YOUR attitude towards performance pay?

  1. Do you think that there is ever a justification for some teachers' being paid more than others?
    • 1 point -- A paycheck is performance pay: either pay people a good wage for doing their job, or fire them for not doing it.
    • 4 points -- Some differential pay is required to encourage teachers to take hard-to-staff jobs (either by subject or school), and that's more important than merit pay.
    • 7 points -- On balance, performance pay would be a good thing, but it's not the most important thing to change in schools.
    • 10 points -- Performance pay or bust: I'll throw everything else out the window to get it!
  2. What's the most important motivation for teachers and administrators?
    • 1 point -- They love children; that's their only motivation.
    • 2 points -- Personal integrity is a more powerful motivator than salary. Teachers need salaries, but if you can show teachers how to feel better about the job they're doing (including showing them how to do a better job), you can move mountains.
    • 3 points -- Money's an important part of the picture. It's not the only thing, and seeing money as the only motivational tool would be foolish public policy, but to ignore it would be wrong.
    • 4 points -- There's nothing like money to get people's attention, and teachers are people.
  3. How important is it for education policy to encourage educators to work together?
    • 1 -- Teachers are not islands: rewarding individuals will kill the type of mentoring and sharing that's essential for professional development. Doubt me? Go ask stock-market traders who entered their career recently whether individual rewards encouraged their elders to mentor them... or spend every second on the floor trying to make a buck.
    • 2 -- Cooperation is crucial. It's not everything, since all teachers have strengths and weaknesses, and we don't want a school full of Stepford Teachers, but I worry that too much emphasis on individual recognition will discourage teachers from talking to each other, and from any chance that teachers will hold each other accountable.
    • 3 -- Teachers' talking in a lounge is like little kids' hugging each other. Often it's wonderful, but you sometimes worry what they're sharing. Individual recognition is pretty important to give credibility to the better and more professional teachers.
    • 4 -- Teachers go it alone anyway: recognizing their achievement as individuals is unlikely to harm the type of substantive collaboration that happens rarely.
  4. What is the right balance between judging teachers based on the professional judgment of peers and using student performance?
    • 1 -- Peer judgment: they're the ones who know what good teaching looks like, and what we care about is whether teachers are teaching well.
    • 2 -- Er... wouldn't peers be interested in what students are learning? Student performance should be part of the mix, as one springboard for evaluation. But peer judgment should be central.
    • 3 -- Student performance should anchor qualitative judgments of teaching. Yes, peers can judge teachers, but student performance should be central.
    • 4 -- Skip the peers. What matters is whether students are learning.
  5. How ready is the technology of testing to use in judging individual teacher and school performance?
    • 1 -- When the solid historical record of more than a century shows that people have abused tests in every decade, we should assume that tests will be misused, and it's the burden of high-stakes testing advocates to show otherwise.
    • 2 -- Tests are useful, but we're far from being sure that tests tell us what most politicians think they tell us.
    • 3 -- They're imperfect, but we need to start using test scores to judge effectiveness now because we can't wait for tests to be perfect to look at performance.
    • 4 -- They're just fine, and they have been for years.
  6. What role should collective bargaining play in education reform?
    • 1 -- Collective bargaining is crucial to protecting due process and teacher rights, and if possible to block stupid reforms.
    • 2 -- Collective bargaining is crucial to protecting due process and teacher rights, and unions can play an important part of reform.
    • 3 -- Collective bargaining is primarily an obstacle to important reform. Where unions will accept reforms, great. Where they won't, federal and state governments have powerful incentives to change the balance of power at the local level.
    • 4 -- Federal and state governments should do their best to break unions, because they do nothing good. Break them, circumvent them, discredit them with their bargaining units.
  7. What should be the ceiling in terms of paying for performance (both the total amount of money and how many teachers should be eligible)?
    • 1 -- Arguments in favor of performance pay are a cover for not wanting to pay teachers more. Those who work with children are generally underpaid, and while performance pay looks like it's in "the children's interest," in reality it's another way of being cheap.
    • 2 -- Part of my skepticism about performance pay is the assumption that only 10-25% of teachers should receive it. To these brilliant people, I ask: "Okay, suppose there's performance pay and every student meets whatever is your definition of proficiency by 2014. Does that mean you'd be willing to double teacher pay for that result, or is this an education-reform shell game?"
    • 3 -- Part of my acceptance of performance pay is looking at the numbers: there are lots of students, and it's almost impossible to staff every classroom with a brilliant and greatly-skilled teacher. So let's pay the great ones the best. "In a perfect world we'd double teacher pay" is another way of saying "never."
    • 4 -- Competition is the best way to motivate individuals, and you're going to get little competition if everyone can earn a bonus. Limit performance pay to the top slice of teachers.

Psychometrics-free labels to share with frenemies and colleagues:

7-11: You are Alfie Kohn. You'd really like the testing industry to suffer an ignominious death, and anyone who thinks that using tests will improve schooling is smoking something fairly powerful.

12-16: You are Reg Weaver. You are publicly skeptical of merit pay, you think most designed systems are going to be disasters, but you're also going to hold your nose and support teachers who decide it's in their best interests.

17-23: You are Randi Weingarten. You know that the American public is used to people making more money if they do a better job, but you're skeptical of most performance-pay plans in operation today. You think collective bargaining is the best way to moderate the more idiotic ideas surrounding teacher pay and to protect the legitimate interests of teachers and communities.

24-28: You are Thomas Toch. You're well aware of the flaws of testing and accountability systems, but you think moving in the direction of performance pay is essential, and you will trust that the system can be improved incrementally once it's started in the right direction.

29-34: You are Michelle Rhee. The day that teachers have a starkly uneven pay scale, the day that school districts fire a fifth of their teachers, and the day that unions are decertified around the country will be the day you will not only take up that Newsweek broom again but dance with it a la Fred Astaire. 

(Don't like the questions? Fine: make up your own completely unscientific spoof of internet quizzes!)

January 13, 2009

Oversight boondoggle

Last week the Wall Street Journal lambasted Florida Governor Charlie Crist for failing to appeal a ruling that struck down the Florida Schools of Excellence Commission as an unconstitutional infringement on the powers of county school boards in Florida. The legislature wanted to set up the FSEC as a second authorizer of charter schools in case county boards were unfair and refused to let enough charter schools open. This bewildered me because Florida has no statutory cap and there are a few hundred charter schools in the state.

This afternoon, I remembered a blog entry written by St. Pete Times reporters in December: the FSEC has been spending the people's money like it was water, racking up almost half a million dollars in expenses over two fiscal years without authorizing a single charter school that has yet opened its doors. 

Isn't the Wall Street Journal supposed to have a conservative fiscal philosophy?

January 12, 2009

Deantidisestablishmentarianism in education policy rhetoric

Joel Klein and Al Sharpton wrote an open letter to Barack Obama and Arne Duncan that appeared this morning in the Wall Street Journal. And I have just a few questions about this:

  • How can the sitting chancellor and a long-time civil-rights activist claim to be railing against "the entrenched education establishment" when you could reasonably conclude that they are The Establishment?
  • Why do they think that placing a column in the WSJ establishes their anti-establishment street cred? That newspaper isn't exactly an underground pamphlet.
  • Isn't Klein the type of guy who already has Arne Duncan's cell number? They're fellow urban superintendents, they've talked at meetings, and you assume he could call Duncan up at any time, and probably get Obama's number as well. So why do they need this open letter--do they feel this deep psychological need to pose as Village Voice rebels with a cause?

Klein and Sharpton are setting up a straw-man opponent. In my master's class in the fall, one of my students argued that accountability is well-entrenched as part of the public-school policy script. Whether you want to use Tyack and Cuban's "grammar of schooling" or Mary Metz's "real school" language, I think there's a case to be made that anyone who claims that accountability is "new" is in denial and as punishment should have to watch three or four consecutive playings of an inane 1980s adolescent-rebellion film.

So someone who is less establishment than Joel Klein would be... anyone? Anyone?

Second thought: For a few years, I've had the suspicion that the public "letter to the next president" was a bit precious (in the pejorative sense). The collections of letters to the president published after the end of an administration are usually drawn from the sample of correspondence from ordinary Americans that the White House staff select for a president to read as a reality check. Even if Klein gets some credit in my book for having a salary far less than what either New York financiers or university presidents are commonly receiving these days, in no way could one call Joel Klein or Al Sharpton "ordinary Americans."

So if Joel Klein gets to write a "letter to the next president," though we all know he could call Obama up with ideas about either antitrust policy (his Clinton-era gig) or education policy (his current gig), then the gloves are off. I'm writing a letter, too! And you know from my loving hardass manifesto that I intend to bring some style to it. So here's the rule for 2009, for all of you: Staid pretentious public letters to the new president are out. Your job is to write the most outlandish letters that tell the truth. Come on: it's going to be the Obama era. You can say it.

One more update: Apparently Margaret Spellings doesn't have Arne Duncan's cell number, either! Or at least she's pretending not to. Isn't it so nice of major papers to devote part of their ever-shrinking news hole to long classified ads from major policy honchos who can't navigate their cell-phone menus? Though I think the following would have been free on Craigslist: "Arne: call me. Margaret." What? The Post may have been joking? Oh, yeah, and that's a good use of newsprint...

December 17, 2008

Okay, it's Arne Duncan. Back to the substance already, willya?

The following is one of those trick questions you should never answer: Was Arne Duncan appointed because he's a cipher/Rorschach test for those with an axe to grind in national education politics, or is he an appointee primarily because of his personal and political connections? In between other tasks, I've been reading the comments flying past at half the speed of light, and after the most sensible and well-grounded supporting piece I've seen yet (disclosure: I'm a sometimes contributor to the blog), I've been reminded of Stephen Carter's response when asked if he ever benefited from affirmative action: so what?

So what if he's a policy cipher? He won't be making decisions by himself, and if anyone has a bully pulpit on education, it's going to be Duncan's boss. What matters is the collective decision-making, including the debate over the hard decisions to be taken with NCLB. 

So what if his appointment is far more closely tied to networking than many of the other Cabinet appointees? He'll now be in a far more public and less insulated role than as aide to Paul Vallas or the CPS head serving at the pleasure of Richard Daley. He'll rise or fall on his own merits, at this point.

As I wrote six weeks ago, let's move on to some discussion that is less personality-based.

November 23, 2008

When the news hole shrinks, any mention is a blessing... well, sort of

Adam Emerson used to be the Tampa Tribune's higher-ed reporter. As the Tribune's owner Media General has been laying off reporters and editors left and right over the past year, assignments have shifted, and Emerson now has the K-12 education beat. So when he called me up with the news, it was also to ask about Florida's graduation rate. Basic story: in the last week, the Florida Department of Education released its annual data on graduation. They published two sets of statistics, both including and excluding GEDs from the number of students in each cohort receiving a diploma. They did not publish the alternate rate that they will have to start publishing in a few years, where the students who drop out to take GEDs will still be part of the cohort schools are responsible for. Some progress in transparency is still progress, and as I told Emerson, Florida's education commissioner is smoothly preparing both his board and the public for when the official graduation rate drops because of the change in definition. I suspect he may also be giving signals to the superintendents around the state that they'll no longer be able to hide problems with the dropout-to-adult-GED path or with GEDs.

We talked about this and other topics in a longish phone call, and as I usually do, I wished him well on the story, especially on getting enough space for it. Well, Emerson's story is now published, and in a 130-word story, my name is in there three times. He's a good reporter, and any gap between the published story and the first paragraph above is entirely a matter of the space he had to tell the story. I like seeing my name in print as much as the next yahoo, but yeow, that's a rapidly-shrinking news hole.

November 15, 2008

NCLB music

Bill Wraga, at work a mild-mannered U. of Georgia faculty member, has recently uploaded the latest NCLB/ed reform song I've come across. Some others:

This is certainly not the first time that education issues have been set to song. Doggerel is a longstanding tradition among students around the world, and sometimes it's a ritual. (One of the traditions at Bryn Mawr College is the three evenings in the year when most of the undergraduates gather and sing a bunch of songs about campus life, with lyrics in both English and Greek.) Tom Paxton is a wonderful songwriter, but his song is not his best. I'm hoping to find a set of lyrics for ed reform that has a bit of whimsy, is set to "I'll Fly Away," or is written by students.

October 31, 2008

Happy Halloween, and now read my book!

Charles Barone chose Halloween to point to my proposal for post-NCLB federal accountability policy. For the record, despite what the picture on my website implies, I really look like the hunk of handsomeness that's at the top of Barone's entry (well, on the right side of the picture). I appreciate the link and hope folks will leave a comment on Barone's entry. (Commenting here won't count.)

Federal influence

Mike Petrilli asks one right question: where can the federal government influence behavior, and what are the tradeoffs? I'm especially delighted that the research in question is about desegregation. As I've written before, the argument against top-down reform by David Tyack and Larry Cuban is smart, sensible, detailed, and fits with an enormous amount of historiography... but it doesn't address desegregation. I'm not headed entirely towards Nudge territory, though I much enjoyed the book, and part of the reason is that there is a role for top-down policy imposition. We just have to be very careful about how that power is used.

NCLB regs and graduation rates

A few quick ones this morning, while my brain warms up... So the new NCLB regulations are out. (Or, rather, they were out a few days ago, but I've been putting out fires while in the midst of a cold, and this was a lower priority.) Atlanta Journal-Constitution reporter Laura Diamond asked on Wednesday, Will NCLB changes improve grad rates? The obvious answer is yes and no: yes, the measures mandated by the federal government will be much better than the goat-rodeo world of dropout measures that currently exists, but, no, better measures will not move the world in themselves. After almost two decades of looking at attainment and dropout-prevention and -remediation programs, I am no longer surprised when people look to vocational education, personal counseling, and (these days) credit-recovery programs as solutions to dropping out. They may all be good on a small-scale basis with some students, but I worry when people reinvent the wheel and think they're hot stuff.

September 12, 2008

Shared responsibilities III: The next ESEA

Over the summer, Charles Barone challenged me to put up or shut up on NCLB/ESEA. I immediately said that was fair; Accountability Frankenstein had a last chapter that was general, not specific to federal law. I'm stuck in an airport lounge waiting for a late flight, so I have an occasion to write this now. Because I'm on battery power, I'm going to focus on the test-based accountability provisions rather than other items such as the high-quality teaching provisions. Let me identify what I find valuable in No Child Left Behind:

  • Disaggregation of data
  • Public reporting
I think most people who don't have their egos invested in NCLB recognize that its Rube Goldberg proficiency definition has no serious intellectual merit and has been a practical nightmare. Yet there is the policy dynamic that observers in the peanut gallery like me can recognize, which is the practice of states in gaming any system, and the way that such gaming undermines the credibility of states with those inside the Beltway. So there's a solid justification in a continued regulatory regime if it is sane and recognizable as such by most parents and teachers (i.e., the connotation of "loving hardass" that I meant in a prior post and that some readers have recognized). I'll have to write another entry on why I think David Figlio is wrong and why teachers are not magisters economici, but incentives just don't appear to be doing that much. An appropriate regulatory regime has to make it easier to be a good educator than a bad educator, make it easier for states to support good instruction than to game the system, and be reasonably flexible when the specific regulatory mechanisms clearly need adjusting.

So where do we go from here? I don't think trying to tinker with the proficiency formula makes sense: none of the alternatives look like they'll be that much more rational. What needs more focus is what happens when the data suggest that things are going wrong in a school or system. On that, I think the research community is clear: no one has a damned clue what to do. There are a few turnaround miracles, but these are outliers, and billions of dollars are now being spent on turnaround intervention with scant research support. To be honest, I don't care what screening mechanism is used as long as (a) the screening mechanism is used in that way and in that way only: to screen for further investigation/intervention; (b) the screening mechanism has a reasonable shot of identifying a set of schools where a state really does have the capacity to help change things -- if 0 schools are identified, that's a problem, but it's also a problem if 75% of schools are identified for a "go shoot the principal today" intervention; (c) we put more effort and money into changing instruction than in weighing or putting lipstick on the pig. Never mind that I'm vegetarian; this is a metaphor, folks.

So, to the mechanisms:

  • A "you pick your own damned tool" approach to assessment: States are required to assess students in at least core academic content areas in a rigorous, research-supported manner and use those assessments as screening mechanisms for intervention in schools or districts. Those assessments must be disaggregated publicly, disaggregation must figure somehow into the screening decisions, and state plans must meet a basic sniff test on results: if fewer than 5-10% of schools are identified as needing further investigation, or more than 50%, there's something obviously wrong with the state plan, and it has to be changed. The feds don't mandate whether proficiency or scale scores are used; as far as the feds are concerned, it's a state decision whether to use growth. But a state plan HAS to disaggregate data, that disaggregation HAS to count, and the results HAVE to meet the basic sniff test.
  • A separate filter on top of the basic one to identify serious inequalities in education. I've suggested using the grand-jury process as a way for even the wealthiest suburban district to be held to account if they're screwing around with racial/ethnic minorities, English language learners, or students with disabilities. I suspect that there are others, but I think a bottom line here is the following: independence of makeup, independent investigatory powers (as far as I'm aware, in all states grand juries have subpoena power), and public reporting.
  • Each state has to have a follow-up process when a school is screened into investigation either by the basic tool noted above or through the separate filter on inequality. That follow-up process must address both curriculum content and instructional techniques and have a statewide technical support process. At the same time, the federal government needs to engage in a large set of research to figure out what works in intervention. We have no clue, dear reader, and most "turnaround consultants" are the educational equivalents of snake-oil peddlers. That shames all of us.
The gist here is that we stop worrying about perfecting testing and statistical mechanisms as long as they are viewed properly as screening devices. Despite the reasoned criticisms of threshold criteria (e.g., proficiency), the problem is not that they exist but that these mostly jerry-built devices are relied upon for the types of judgments that make many of us wince and that the results fail the common-sense sniff test. As long as the federal government tries to legislate a Rube Goldberg mechanism, it will have little legitimacy, and states will continue to be able to wiggle away from responsibilities when they're not doing stupid things to schools. (Yes, both can happen at the same time.) Much wiser is to shift responsibility onto states for making the types of political decisions that this involves, as long as the results look and smell reasonable.

Doing so will also allow the federal government to focus on what it's largely ignored for years: no one knows how to improve all schools in trouble (and here I mean the organizational remedies -- there's plenty of research on good instruction). Instead of pretending that we do and enforcing remedies with little basis in research, maybe we should leave that as an open, practical question and... uh... do some research?

September 9, 2008

Cold permutations

First, to provide a minor update on this morning's news items:

  • Semi-success on the reserving-time front. I had a lunch meeting and then a 3 pm meeting, and the time in between was too short to do much, so I exchanged one parking sticker for another. Whee. At least my wonderful grad student assisting with the journal did a monster job helping on a long MS, giving my head-cold-affected mind a much easier job going through the next article. I WILL climb on top of this mountain of work. Just not today.
  • It's a semi-full-blown cold now. Proof: I should be asleep, and I'm exhausted, but I can't sleep.


I've been trying to wrap my mind around permutation tests and exchangeability for about a week, and I figure that my typical head-cold mentality may be the best shot I can take at it both in terms of the orthogonal way I think at way-too-late-on-a-head-cold evening and also the fact that once I'm up this late and in this state, no student or MS author wants me to be making decisions right now. (For the record, I'm on antihistamines. I know, I know: Never take Benadryl and grade. No. That's not funny, not even in my state of mind.)

A few weeks ago, I was pondering the NYC achievement gap controversy, a debate over the summer that among other things spawned a Teachers College Record commentary by Jennifer Jennings and me (available just to subscribers for now, but to the world in a few weeks). And while the limits on TCR commentaries and op-eds require a fairly narrow argument, I kept thinking about trends and time series data as I looked at the New York City Department of Education's claims. I kept thinking to myself, There has to be something an historian can contribute to this debate that is specific to the way historians think. I'll probably write something at length when I'm more coherent and have some time, but there was an obvious answer that came to mind: to historians, the order of events matters. An argument about causality depends on contingency which depends on a sequence. (Historians often focus on contingency rather than causality, except when we're playing the counterfactual game. The obvious answer to the question, "What caused Gore's defeat in 2000?" is "everything, or almost everything.") The sequence doesn't prove causality (or contingency), but it's necessary.

That logic is usually not applied in policy. In the case of New York City, as is typical in this type of reform publicity, someone pointed to a time series of data and claimed, "Aha! See this trend? Ignore its tentative nature: it's PROOF that we're on the right track." One obvious problem with the NYC data is the reliance on threshold-passing percentages; that's the focus of the TCR commentary. But the NYC Department of Education made claims about the achievement gap more broadly, and the data is a lot messier than the folks in Tweed would state. Below are three permutations of the "z-scores" of achievement gaps (the differences in Black-White means on the 4th-grade state math tests, scaled to the population's standard deviation). One is the real time series that runs between 2002 and 2008. The other two are permutations. Before you look for the data (it's on p. 13 of the PDF file linked above), see if you can tell the differences among them, and which is the observed order:

0.74  0.79  0.73  0.67  0.72  0.67  0.71
0.79  0.67  0.72  0.67  0.71  0.74  0.73
0.79  0.72  0.71  0.74  0.73  0.67  0.67

My professional judgment as an historian is also common sense: if the order of events does not make a discernible difference, even if you ignore measurement error and standard errors, then it's hard to conclude that there's a trend. How to test that is the realm of statistics, and when I explained the issue to my colleagues Jeffrey Kromrey and John Ferron, the answer from them was clear: permutation tests. That's a general family of nonparametric tests of inference that's the formal version of the question I asked: if you jumble up the data in all the possible ways they could be permuted, and if you look at a particular measure of interest (a test statistic), where in the distribution of all permutations does the observed data set fall? In the case of the 4th grade Black-White gap on New York state math tests measured as a z-score, we have 7 points of data, which have 7! = 5040 permutations. If you choose an appropriate test statistic, compute it for each permutation, and find that the observed time series ranks within about 125 of either end of the resulting distribution (2.5% of 5040 is 126), then the observed order falls outside the middle 95% of permutations.

No, I haven't had the time or inclination to follow up: to learn how to calculate one of the possible test statistics and how to get the R statistics program to do a permutation test. There are two problems, as I've learned from my colleagues: choosing the right test statistic is a matter of art as well as science; and there may be a problem with exchangeability. As far as I understand it, exchangeability is a less constricting assumption than the standard "independent, identically distributed" sample assumption in parametric inferential statistics. From what I understand, the practical definition of exchangeability means roughly that you could theoretically exchange all the data points without screwing up the distribution. Again, if I understand correctly, one situation that violates the assumption of exchangeability is in autocorrelated data, i.e., when one data point influences the next one (or the next few). And if there's anything that's likely to be autocorrelated, it's a time series. That's not a serious problem if you're just looking to see if a trend exists at all; for that, autocorrelation is a form of trend (though an artifactual one). But if you're trying to make causal inferences or anything more complicated when there's autocorrelation (e.g., whether achievement levels or trend slopes differ before and after a policy change), I think you have to throw permutation tests out the window.

And that's such a shame, because the concept is still right when extended beyond the question of a trend: if a policy makes a difference, then it should make a difference on which side of the policy change you're sitting. So if you're a clever person with statistics, please provide some ideas in comments for where to go with this or if, as I suspect, the best we can do with permutation tests is ruling out possible trends/autocorrelation.
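
For anyone who wants to try the simple does-a-trend-exist question, here is a minimal sketch of the permutation test described above. It is written in Python rather than R, and it is an illustration under assumptions, not a finished analysis. It assumes the least-squares slope against year position as the test statistic (one of many possible choices), and the names slope and permutation_p_value are made up for this example. It runs the test on each of the three orderings listed above, in the order they appear, without assuming which one is the observed series. The exchangeability caveat still applies to whatever comes out.

```python
from itertools import permutations


def slope(values):
    """Least-squares slope of the values against their position 0, 1, 2, ..."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den


def permutation_p_value(series, statistic=slope):
    """Two-sided exact permutation p-value.

    Returns the share of all orderings whose |statistic| is at least as
    large as that of the ordering passed in, plus the number of orderings.
    """
    observed = abs(statistic(series))
    orderings = list(permutations(series))  # 7 values -> 7! = 5040 orderings
    tol = 1e-12  # guard against floating-point ties
    extreme = sum(1 for p in orderings if abs(statistic(p)) >= observed - tol)
    return extreme / len(orderings), len(orderings)


# The three orderings from the post; which one is the real 2002-2008
# series is deliberately NOT assumed here.
candidates = {
    "first": [0.74, 0.79, 0.73, 0.67, 0.72, 0.67, 0.71],
    "second": [0.79, 0.67, 0.72, 0.67, 0.71, 0.74, 0.73],
    "third": [0.79, 0.72, 0.71, 0.74, 0.73, 0.67, 0.67],
}

for label, series in candidates.items():
    p, n = permutation_p_value(series)
    print(f"{label}: slope = {slope(series):+.4f}, p = {p:.4f} over {n} orderings")
```

Because 0.67 appears twice, some of the 5040 orderings are identical to each other; that does not change the proportion. A rank-based statistic (Kendall's tau, say) could be passed in place of slope, and picking among such statistics is exactly the art-as-well-as-science part.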

September 8, 2008

Monday bits

I didn't have time this weekend to write a lengthy, thoughtful post, or even a lengthy and thoughtless piece, so you get bits this morning.

  • Reserving Mondays: I've shut off my e-mail for now to get some editing tasks done, and I'll see if I can reserve Mondays for selfish purposes for the entire semester. Wish me luck on this one!
  • Honesty: the Palm Beach Post's editorial board approves a draft change in calculating graduation rates in Florida. Kudos to Florida's commissioner of education, Eric Smith, for pushing this. (Disclosure: I've given a few ideas to the state department of ed on options for how to handle graduation in 5, 6 years, etc.)
  • Sunday morning grading: I got out to a coffeehouse early yesterday to read my first batch of undergraduate papers. Several brought smiles to my face with great writing, provocative ideas, or both. That's a good sign for the semester.
  • Fetishized vs. nonfetishized curricula: I wonder how the history of the Core Knowledge Foundation would have been different if E.D. Hirsch had thought to frame the issue not just as accumulating tiny bits of knowledge (how Herbartian of him!) and instead had framed it as a matter of both a knowledge base in different disciplines and the heuristic frameworks of those disciplines.
  • I know I have at least a below-the-radar version of a head cold because I've had moments of earache in the last day, I had less energy over the weekend than I normally do, and I was sure last night that a mashup of Timothy Burke's guide to historical arguments and Atlas Games's Once Upon a Time would make a great introduction to historiography.

September 1, 2008

Shared responsibilities for children II: The loving hardass manifesto

Back in June, I briefly noted the potential political dynamics of the dueling manifestoes associated with the Broader, Bolder Approach to Education and the Education Equality Project, apologized for overplaying that analysis, and wrote an entry to talk broadly about shared responsibilities and education as part of the state. I've promised but have not followed through on my own manifesto, and it's now long past time for that. So, without further ado...


The Loving Hardass Manifesto*

I'm going to cut the shared-responsibility issue in a way that doesn't avoid the hard problems. Essentially, wherever your work touches children's lives, you're responsible for busting your butt without ruining your health or life. Unlike the Education Equality Project manifesto, I do not think that teachers are all-powerful or all-responsible. They're very important and responsible, but not for everything. Unlike the Broader, Bolder Approach, I do not think we can avoid central questions about accountability within school by reference to the other legitimate needs of children outside of schools. Yes, children have lives outside school, but it's acceptable to focus on what happens inside schools for things schools are responsible for. And unlike Barack Obama, I am not going to say that both statements are right. Both statements are partially right. And while I know and admire several people who have signed one or the other statement, I will not sign either one, because both are flawed.

Let me start with the Project crowd. If you're a politician or administrator and believe that everything you've done is perfect, with no regrets, and all the evidence points in your favor, I hope you brought enough to share, because whatever you're smoking, I want to try it. Using only the high-quality evidence that is in your favor (and here I mean David Figlio-quality evidence), you can make a claim that high-stakes accountability leads to modest improvement in outcomes. But that's about it.

If you're a civil-rights activist and think that the best way to improve schools is to lambaste teachers and their representatives, I have a year for you: 1968. And a book: Tyack and Cuban's Tinkering toward Utopia. I have plenty more to suggest, but I figure that's enough.

But I'm also disappointed by the Broader, Bolder Approach. Everything it says about putting education in the context of broader government programs for children is correct. And yet, if its purpose is to get us to think in a different way about accountability and NCLB, it underwhelms. There's something odd about a statement on school accountability that has precisely one paragraph suggesting vague ways to change how accountability should work within schools.

Let's think about some basic facts: most kids come to school with families they go home to at night. If the children and their teachers are lucky, their families will only have the ordinary neuroses that God or Woody Allen placed there. If the children are unlucky, they'll also deal with poverty, disability, abuse, negligence, or having Paris Hilton as a distant relative. If you're a teacher, you can gripe about the families, but it's probably best not to, for a few reasons:

Your complaining to peers will not improve the parenting of anyone.

We've heard it before, and it wasn't convincing the last time, either.

If you complain about the parents, you will be depriving your students of their internationally-recognized right to be the first to complain to a therapist about how they were brought up. Really: it's in the UN Charter, under "Psychotherapy as an Adolescent," right above the bit about iPods and PlayStations. Go look it up if you doubt me.

I just lied. You may not have caught this, but the 1959 Declaration of the Rights of the Child does not mention the right to criticize parents in therapy or the right to consumer electronics. There isn't a single mention of either Apple or Microsoft, a shameful omission which Bill Gates is working hard to remedy. But until then, children only have the recognized right to things such as health care, food, shelter, the care of parents or other responsible adults, freedom from discrimination, and education.

I don't know if you've noticed this, but as a society we're not doing so well on fulfilling these rights. 600 million Chinese citizens use cell phones, and in a country that is far wealthier, we've still got millions of children without health care. It used to be that American parents would shame their kids into eating everything at dinner by pointing out that children around the world were starving. That makes you wonder what Chinese parents tell their children to shame them. Maybe they say, "Take your vaccination and stop crying: Kids are getting sick in America!"

Since the dueling manifestoes appeared in June, I've been scratching my head. The broader, bolder approach is fine as a statement of broad social policy but it doesn't work in terms of day-to-day accountability. You are responsible for the people who are in your life. When my children have been sick, and I've taken them to their doctors, I've never once been asked, "How are they doing in math?" and then had a doctor refuse to treat my child because they're not yet evaluating double integrals. They treat the kid in front of them the best they can. My father was a pediatrician and allergist who treated both wealthy families from one side of town and working-class families from another part of town. He never complained about the families from one side or the other. He just treated them.

But that doesn't mean my father had absolute responsibility, either. He was expected to be a professional, to keep up with the literature, and to follow standards of medical practice. But there has never been a "Health Care Equality Project" whose primary activities were to take pot-shots at doctors, call them "interests who seek to preserve a failed system," and want to pay doctors by a handful of measures of the health of their patients. My father was never paid by how much his patients weighed that year, or by how many tissues they used because of colds. We already have accounting-driven health care, and I don't know of any doctors or patients who think it's a good idea.

We also don't have ridiculous fads in medicine. Well, we do, but it's generally called the X diet (for various string values of X), or "alternative medicine," for those who think that if you dilute some processed duck liver by 30 or 40 orders of magnitude, your body will react in any way other than, "I'm sorry if you paid for that sugar pill instead of your mortgage, but the best I can do right now is a placebo effect. I hope you like it." In education, we have far more fads. If we had as many fads in medicine as we do in education, people would think that wearing uniforms made you thinner.

So there is something about the dueling manifestoes that just does not seem real to me. It's not that I am immune to their appeal. I want there to be equal education. And I've already written in many places that schooling needs to be thought of in the context of all the state structures that touch kids' lives. But it's still not resonating with me. My generation of the family takes care of these issues collaboratively. My oldest brother has been a lawyer, lobbyist, and think-tank staff member on health-care policy, which takes care of one right. I teach and write about education. The rest of the immediate family's a bunch of layabouts who do nothing other than have jobs and take care of their families, but Stan and I, we're holding our own on this caring-for-children thing, and if your family isn't, don't blame us. We are the Broader, Bolder Approach. But we're both going on diets soon, so that will change.

Back to the central point about responsibility. The hard task that both manifestoes avoid is defining what we really should expect from schools. I don't know: maybe "bust your butts" isn't something people say in polite company. And it's even harder to define in practice. But since the people who signed the Education Equality Project say they're in favor of holding people accountable, here's my charge: go define what "bust your butts" means in ways that are realistic, or fold your tent. I suggest you start by talking with teachers and parents, not among yourselves. This is just one (loving hardass) reader's response, but I know you can do it, or I wouldn't insist on it.

And for the Broader, Bolder crowd, you know you can do better. As a group, you include a bunch of incredibly well-read, smart researchers. And you're right on putting schooling in a broader context. But you just fell down on the accountability part. That one short paragraph on accountability? Please reread it. Really. You think that was the best you could do? You KNOW what you'd say to a grad student who had that fluff in a dissertation. Revise and resubmit, because I know you can get this up to your usual standards.

And the rest of you in the peanut gallery? Don't think that we can rest on our laurels, either. The folks I'm criticizing at least had the energy and guts to put pen to paper. What have you done to define "bust your butts"?

And, yes, this means that I need to look back at the last chapter of Accountability Frankenstein and see if it needs to be sharper. A commenter some months ago said it was not specific to NCLB, and that's a fair enough point. I wanted the book to be about accountability in general, but if I really know my stuff, I should be able to apply it in specific situations. Want a specific list of changes that should happen with the next reauthorization of ESEA? Coming up this fall...

* While I was drafting this in bits and pieces, I pondered whether to use the term hardass, but since Bob Sutton has written the book The No Asshole Rule and Harry Frankfurt's On Bullshit won a book award, I don't think I'm going that far out on a limb. A loving hardass knows that holding people to standards can be in their best interest. So for everyone who signed one of the manifestoes and thinks I'm nuts here, you're wrong. And in two years, you'll thank me for this.

August 27, 2008

Two interviews to read today

A few shout-outs while I'm still juggling a few hundred tasks the first week of classes:

I can now bury my head in my own details, knowing that the education blogule is going strong without me.

August 5, 2008

Two brief comments

I promised not to comment on anything during my two-week break, but the NewTalk NCLBfest made me wonder who's missing from this debate. Your observations in the comments are most welcome.

Also, I think I may have alienated my family forever by going against their advice and buying a Sony Reader. Even my technophile son thinks I'm nuts. But the EPAA MS authors will probably appreciate my carrying their stuff with me to various short-reading opportunities.

August 1, 2008

A higher-ed unionist's view of the performance-pay debate

Corey Bunje Bower criticized a Newsweek column by Jonathan Alter and has the following response to Alter's slur against teacher unions:

Perhaps the most ridiculous thing that Alter writes -- and the statement that gives away the ideological underpinnings of his argument if anybody wasn't already aware -- is that unions "still believe that protecting incompetents is more important than educating children." Unions are far from perfect, and this is far from the most inflammatory rhetoric that I've read about them, but it's still sheer and utter nonsense.... Though more polite, it's the intellectual equivalent of calling somebody with whom you disagree a [N]azi or a terrorist.

If I were a union leader, however, I would mull over Alter's final point.... the general idea that unions could view submitting their members to more scrutiny in exchange for higher pay is something on which both sides might find some common ground.

I suppose I qualify as a union leader albeit in higher ed, so I'll take the bait. Disclosure: my faculty union was the one to propose merit pay at the table many years ago, and university faculty are more likely to approve of something called merit pay because there is a tradition of peer review for tenure/promotion. (Our collective bargaining agreement provides for general due process and substantive standards but leaves specific procedures for annual reviews to department votes.) So while I am skeptical of several top-down proposals for/policies encouraging performance pay in K-12, that skepticism comes from seeing problems with the proposals rather than from a visceral opposition to merit pay. As the car ads say, your mileage may vary.


There are two policy issues here: one is how to think about teacher pay and working conditions in general, and the other is the question of collective bargaining at the local level (and the centralization/local question more generally). In Accountability Frankenstein, I wrote about high-stakes accountability advocates' simplistic and often flawed grasp of motivation. To put it briefly, even if we had a Holy Grail measure of "teacher contribution to learning," that wouldn't be a sufficient justification for relying on test scores for teacher pay. No one yet knows what works best, and a top-down approach would short-circuit even the most rabid merit-pay advocate's interest in finding out what works, in much the same way that NCLB's proficiency measure aborted alternative ways to examine student achievement (including quantitative measures such as average scale score, medians, percentile splits, etc.). Essentially, those interested in performance pay have to make the policy choice between experimentation and a crusade. So to all 0.379 Capitol Hill staffers and campaign advisors reading this blog, you should be wary of federal mandates: if you mandate the wrong formula, everyone will pay the price for Beltway arrogance, and you'll endanger the political legitimacy of the idea for the long term.

Caution about top-down mandates also fits with the local nature of collective bargaining and the affiliate structure in American unions. Despite what people may claim about the NEA's visceral opposition to merit pay, the big picture is more complicated: locals have negotiated performance pay or merit pay or whatever you want to call it, and the governance structures of both the NEA and the AFT commit the national affiliates to support collective bargaining at the local level. (There are also the merged locals and state affiliates that belong to both national affiliates.) That federal structure means that the NEA and AFT support what local leaders decide in terms of bargaining strategy and the agreements that the parties ratify at the local level. Where local leadership negotiates performance pay, the state and national affiliates support that. And where local leadership decides not to negotiate performance pay, the affiliates support that, too. (See a March 2008 column from NEA Today for an example of recent rhetoric that illustrates this complexity.) The more accurate policy position of both the NEA and AFT is that they oppose top-down mandates of performance pay, including how it is structured. The AFT is not officially skeptical of performance pay, but both national affiliates work with and for the locals. If you believe that either national teachers union can dictate bargaining positions to locals, e-mail me about my deep-discount sale price on the Brooklyn Bridge.

The second question about performance pay is thus the degree to which there should be centralized decision-making in education, and that is true for collective bargaining as well as for other matters of policy. It is not necessarily a matter of offering a grand bargain to Randi Weingarten and Dennis Van Roekel, because the bargain for some segments of a national union may be anathema to others. Let me put forward a pro-performance-pay, pro-union person's pipe-dream proposal that would serve someone's interests as a union leader, and you may understand: If I were a K-12 union leader in Florida, I would definitely listen to a national policy proposal that would tie some incentives for performance pay (bargained at the local level) to the degree to which a state had the following in place:

  • Collective-bargaining rights for public employees
  • Card-check procedures for certification of public employee unions
  • Binding arbitration for first contracts after a certain length of bargaining (say, 6-12 months)
  • Fair share in a bargaining unit that is represented by a union
Florida currently has one of those (collective bargaining rights for public employees), but gaining the others would be a pretty good trade in return for negotiating some version of performance pay (assuming it's not something that looks like the awful stuff that Florida has tried in recent years). To someone in a state like Florida, that looks like a possible deal. Framed as an incentive, it doesn't step on constitutional toes, but it gives more options to states that respect unions and collective bargaining. On the other hand, that's an awful deal to a union leader sitting in a state that already has fair share as well as collective bargaining. To someone who is opposed to any performance pay in such a state, that proposal looks closer to an insult than a serious attempt at a grand bargain.

As a result of this pattern, where different circumstances lead to different views of policy by local union leaders, you can have leaders sitting in different places, each of whom has a deserved reputation for being able to craft a deal with administrators, but where they have very different views of policy proposals. Ultimately, someone who wants performance pay in K-12 schools has to understand the fact that national affiliates support locals, and that the needs of locals will vary by state environment.

July 28, 2008

Ocala rethinks high grade-retention rates

In the late 1990s, Florida instituted a requirement that third-graders reach a certain test threshold in reading or be held back in third grade. Now the Marion County school district (which includes Ocala) is rethinking grade retention where it can (hat tip), after officials realized they had several hundred middle-school students who could legally drive.

The research on retention is fairly clear: if you have the choice between holding a student back a grade and praying they somehow improve, on the one hand, and advancing the student a grade and praying that they somehow improve, on the other, the better long-term choice is to promote the student and pray. Then again, my colleague Sister Jerome Leavy would point out that while plenty of Catholic schoolteachers believe in the power of prayer, you gotta do some teaching, and that's a poor way to frame public policy questions. Retention/promotion questions are an administrative distraction from the need to identify children who need help and intervene early.

July 23, 2008

Review of "Accountability Frankenstein"

As far as I'm aware, Teachers College Record recently published the first review of Accountability Frankenstein. From the comments by Dick Schutz, "If you are in any way concerned with the status and future of US el-hi education, you owe it to yourself to read this book." You can read the review to see where he thinks I got things right and wrong.

Crisis rhetoric, attention seeking, and capacity building

Berliner and Biddle's The Manufactured Crisis was the independent reading choice of several students in my summer doctoral course, and as they have been writing comments on the book in the last week, I have been thinking about the split retrospective view of the 1983 A Nation at Risk report, produced by the National Commission on Excellence in Education. The report has been on the receiving end of a tremendous amount of criticism by Berliner, Biddle, Jerry Bracey, and many others.

Of the various criticisms of the report, two stick fairly well: the report was thin on legitimate evidence of a decline in school performance, and the declension story is ahistorical. First, the report relied on a poor evidentiary record, using problematic statistics such as the average annual decline in SAT scale scores from 1964 to 1975, statistics the report's authors claimed were proof of declining standards in schools. (Why this was flawed is left as an exercise for the reader.) Using this evidence, the report claimed that

... the educational foundations of our society are presently being eroded by a rising tide of mediocrity that threatens our very future as a Nation and a people. What was unimaginable a generation ago has begun to occur--others are matching and surpassing our educational attainments.

If an unfriendly foreign power had attempted to impose on America the mediocre educational performance that exists today, we might well have viewed it as an act of war. As it stands, we have allowed this to happen to ourselves. We have even squandered the gains in student achievement made in the wake of the Sputnik challenge. Moreover, we have dismantled essential support systems which helped make those gains possible. We have, in effect, been committing an act of unthinking, unilateral educational disarmament.

Where do I start with the problems here: the war-like rhetoric, the implication that we don't want the rest of the world's education to improve, the bald assertion that there is any solid evidence of student achievement gains post-1958 that can be attributed to Sputnik, or the assumption that if there were low expectations observable in the early 1980s it must have been a decline from previous times instead of a generally anti-intellectual culture?

But 25 years after the report's release, it is easy to poke holes in and fun at the hyperbolic rhetoric. What the last few weeks have brought home for me is the very different perceptions of the report. Berliner, Biddle, Bracey, and other critics are absolutely right that the report is factually and conceptually flawed. And yet there are many people involved with the commission who not only thought they were factually correct, they thought that the report's purpose was to help public schooling. If you read various accounts of the commission's work, it is clear that they thought the report was necessary to build political support for school reforms.

Part of the report's origin lies in the campaign promise of President Ronald Reagan to abolish the federal Department of Education. In this regard, his first Secretary of Education, Terrel Bell, brilliantly outmaneuvered Reagan, and within a few months of the report's release, it was clear that the report had resonated with newspaper editorial boards and state policymakers. Even without it, given the Democratic majority in the House and the presence of several moderate Republicans in the Senate, it was unlikely that Congress would abolish the department. After it, the idea was largely unthinkable.

But the motives of Bell and the commission members were clearly not about saving an administrative apparatus. They were true believers in reform, and if all of the recommendations had been followed, today we would have a much more expansive school system. (The recommendations included 200- or 220-day school calendars and 11-month teacher contracts.) Some of the recommendations were followed, primarily expanding high school course-taking requirements and standardized testing, as well as the experiments in teacher career ladders in several states. But the guts of the implemented recommendations were already in the works or in the air: I remember that California state Senator Gary Hart had been pushing an increase in graduation requirements, a bill that passed in 1983. (This is not the same Gary Hart as the famous one from Colorado.) While I could have graduated from high school in 1983 with one or two semesters of math (I forget which), students in my former high school now must take several years of math. (As others have pointed out, one of the unintended beneficial consequences of raising course-taking requirements was dramatically reducing the gender differences in math and science course taking. Richard Whitmire, take note: Terrel Bell is the villain!)

Lest some people not know or have forgotten, A Nation at Risk was not the only major mid-80s report on public schooling. Others were written from a variety of perspectives: Ernest Boyer's High School, Ted Sizer's Horace's Compromise, Arthur Powell et al.'s The Shopping-Mall High School, and John Goodlad's A Place Called School. All were published in 1983 or 1984. All were earnest. All were more thoughtful than A Nation at Risk. I suspect that if Two Million Minutes had been made and released at the same time (if with different non-U.S. countries and different students), it would have fit into that cache of reform reports very well.

Those other reports did not gain the same attention as A Nation at Risk, and I am not certain that any of the reports dramatically changed the policy options discussed at the state level. Changed course requirements and testing were prominent parts of the discussion before the reports, and they were the primary consequences of state-level reforms in the 1970s and 1980s. What the body of reports did instead was push the idea that schools needed reforming. On that score, I think they succeeded, even if several of the report writers (Sizer and Goodlad) became horrified at the direction of reform policies.

Today, we have a new set of actors making similar claims about the need to reform schools: did you receive the e-mail from Strong American Schools/Ed in '08 that I did yesterday? If you didn't, here's the text:

We are only as strong as our schools, and our schools are failing our children.

Consider:
  • Almost 70% of America's eighth-graders do not read at grade level.
  • Our 15-year-olds rank 25th in math and 21st in science.
  • America showed no improvement in its post-secondary graduation rate between 2000 and 2005.
We know that the nations with the best schools attract the best jobs. If those jobs move to other countries, our economy, our lives and our children will suffer.

For that reason, Strong American Schools launched a new campaign this week to combat the crisis in our public schools.

Click on the image below to view our television advertisement:

Please join us. Tell your governors, your state and national representatives and senators that you want a change for stronger schools.

Make your voice heard.

The ad's rhetoric is definitely in line with A Nation at Risk, down to the tagline: "As our schools go, so goes our country." It's tired rhetoric at this point, and I think it's important to understand why the folks behind Strong American Schools are keeping at it, though they've gained no traction in making education a highly visible part of the presidential campaign thus far: as with the major figures in A Nation at Risk, they are true believers in reform to increase the capacity of regulators.

But Strong American Schools has now become a shadow of A Nation at Risk, itself the least substantive of the mid-1980s reports on American schooling. Instead of making specific claims or recommendations, they're pushing "a change for stronger schools," or rather attention. To do so, they claim a crisis, though this is probably the worst time to claim that weak education is the cause of what Phil Gramm calls our "mental recession": to anyone who looks at the current state of the world, our economic woes are the consequences of the subprime mortgage crisis and energy prices (which themselves are driven by the growing Chinese and Indian economies). In 1983, the economy was out of recession. I just don't think the world will realign itself in the same way as in the 1980s. That doesn't mean that there isn't a tie between education and the economy in the long term, but it's diffuse rather than mechanical.

And there's another question here: is it ethical or even helpful to claim that a long-term problem is an acute crisis, just to gain public attention for an issue? We've gone down this road many times before, and I just don't see where it helps in the long term.

July 21, 2008

The higher-ed split among conservatives

One could probably have predicted today's Inside Higher Ed article describing how several conservative academics criticized the current push for quantitative assessment of higher ed. I didn't, but if you did, give yourself a pat on the back.

The article describes a panel on Friday sponsored by the American Academy of Distance Learning (more about that later) where the former head of Margaret Spellings's Office of Postsecondary Education and the executive director of the National Association of Scholars ripped Spellings and her allies for pushing standardized tests in higher ed to the detriment of liberal arts. According to the article, Diane Auer Jones was more diplomatic than Peter Wood, but both complained that the push for accountability was turning reductionist. In this regard, I think Wood's reported comments are on the money: today, the policy rhetoric on higher education is vocational, and that threatens to make the defense of a liberal-arts education more difficult. He ties it to the push for accountability in higher education, and I've had similar concerns about calls for standardized testing as the primary accountability mechanism for colleges.

The predictability comes in the split among conservatives, one that Wood ties back to a "practical"/"classical" distinction in the late 18th century. The Spellings Commission report ignored fundamental tensions in American higher education, and one interesting feature of the report is the invisibility of the curriculum. The report's rhetoric was tied closely to economics, and I suspect that Jones's resignation in May on a matter of principle was the result of a long-simmering frustration among some conservative academics, not an isolated event. No party or political coalition is monolithic, and I've heard several current and former Capitol Hill staffers from Democratic offices who were far closer to Spellings on higher-ed accountability than either Jones or Wood. And I'm closer to Jones and Wood at least on this issue, though I'm a Democrat.

And now the coda: The building frustration among some conservatives that I'm inferring here may explain why Jones and Wood were willing to use the sponsorship of a proprietary university's president's shadow accreditation office: I've tried to look for the "American Academy of Distance Learning," which seemed to be an odd outfit to sponsor a talk about standardized testing and the liberal arts. I found an American Academy of Distance Learning (or at least a reference to its tax-exempt status) headquartered in Denver, but Dick Bishirjian runs the proprietary Yorktown University, which is in Denver... at the same address as AADL, down to the same suite number. But the media advisory for the panel lists AADL with a Norfolk post office box. Bishirjian also appears to be the president of the American Academy of Privatization, a proponent of "privatization training for public officials." I'm not sure what that means, precisely, but the P.O. box for it is the same as that given in the media advisory for AADL. In other words, it looks like Bishirjian has a mail drop in Norfolk and office space in Denver. That's an amazingly slim infrastructure to run a university and two other organizations... or at least to claim so. A July 10 Denver Post article gives a little more information about Yorktown, at least in relationship to Republican Senate candidate Bob Schaffer, who served on Yorktown's board of trustees for several years. Yorktown apparently has a single graduate program and only a few dozen students. Given the plaudits for Bishirjian by Paul Weyrich earlier this month on David Horowitz's website, it looks like Bishirjian had enormous difficulties gaining accreditation. So... is his sponsorship of the forum for Jones and Wood something that's tied to his proprietary institution's interests? 
I don't know if either Jones or Wood is aware of Bishirjian's background or the disconnect between his proprietary institution's curriculum and their arguments, but this is definitely one of the odder set of bedfellows I've seen in higher education.

July 17, 2008

Teachers and the public sphere

Partially drafted in Chicago Sunday evening, July 13, and revised July 17:

I'm listening to Susan Ohanian at the moment, talking to a group of about 50 AFT delegates and others. Ohanian is a well-known opponent of NCLB and academic standards and was invited to speak at an event sponsored by the AFT Peace and Freedom Caucus (which should sound familiar to NEA national delegates, who can sign up for an NEA Peace and Freedom Caucus as well). As I've written elsewhere, Ohanian is right in several things and wrong in others. (Go read our books to figure out where we agree and disagree; I like her as a person, and she raises important questions about the purpose of education and high-stakes testing.) But I'm more interested this evening in the audience after she and the other speaker (the leader of an independent teachers union in Puerto Rico) finish. The AFT crowd neither applauded nor booed this morning when Barack Obama talked about merit pay in his live-feed speech to the convention floor. (The crowd went to its feet and cheered loudly when he first appeared and cheered again loudly at the end, and applauded at various points in the 10-minute speech. As Mike Antonucci has noted, it's essentially the same speech he gave to NEA, the one that had NEA California delegates booing, so we have an interesting comparison point.) But since a strong positive reaction followed Ohanian's statement that it was wrong for Obama to claim that teachers are the most important influence on children, I'm fascinated.

Part of the reason why I'm fascinated is because I think Ohanian's arguments are inconsistent. Ohanian worried about the statement by Obama that "the single most important factor in determining a child's achievement is not the color of their skin or where they come from; it's not who their parents are or how much money they have. It's who their teacher is." Ohanian argued that this statement is rhetoric that sets up blaming teachers for all sorts of problems they are not responsible for. A few minutes later, she claimed that the real danger of high-stakes accountability was the destruction of children's imaginations and the creation of a compliant workforce. But there's a logical inconsistency here: how can schools create worker robots if they are not powerful in shaping the lives of children?

I worry (and I said towards the end of the event) that Ohanian's criticism undercut arguments about the importance of the public sphere. You can say that teachers are not crucial to children's lives, but then it's hard to argue that schools should be well-funded. You can say that teachers are not crucial, but then it's hard to argue against all sorts of problematic policy proposals that take authority away from teachers or that position teachers' professional judgment as irrelevant. Ohanian was nodding in acknowledgment at the time, so I think (or I hope) she knows that her impromptu remarks were not consistent with either her deeper views of schooling or those of most teachers.

As it turned out, my initial impression of the crowd was wrong: there was a lively discussion after the speakers finished, with plenty of dissent from Ohanian's arguments. So in one sense, I never had my question answered: what drew some of the delegates to agree with the remarks by Ohanian that concerned me the most?

July 15, 2008

Know what union membership means before you write, Ray

Ray Fisman wrote a laudatory article released Friday by Slate about NYC's P.S. 49 principal Anthony Lombardi, an article with themes remarkably similar to what Robert Kolker wrote for New York Magazine in 2003, even down to quoting Randi Weingarten calling Lombardi a tyrant without crediting Kolker. Fisman links to an Inside Schools page summarizing P.S. 49 data and using Kolker's quotation, again without credit. C'mon, Mr. Fisman: if I can find the source by Googling, why couldn't you? (Given that flaw, I am doubtful of Fisman's claim that Lombardi was ever "at the top of the teachers-union hit list": is there evidence of any such list, or is it just colorful language to cover up a reporter's lassitude?)

But the passage that had me laughing was the following bit of ignorance:

Currently, New York City teachers get their union cards their first day on the job. In theory they're on probation for three years after that, but in practice very few are forced out. Lombardi suggests replacing this system with an apprenticeship program. Rather than requiring teaching degrees (which don't seem to improve value-added all that much), new recruits would have a couple of years of in-school training. There would then come a day of reckoning, when teachers-to-be would face a serious evaluation before securing union membership and a job for life.

Here is a fundamental conflation of tenure and union membership, or union membership with the legal protections of a collective bargaining agreement, or "serious evaluation" with something. I'm not sure where the root of the error lies, but I do know one thing that's true everywhere, as far as I know: union membership does not change your legally recognized rights under a collective bargaining agreement. It does other things that are important (greater chance of gains at the bargaining table through solidarity, access to specific benefits provided by the union beyond CBA protection, etc.), but Fisman just doesn't know what he's talking about here.

And then Joanne Jacobs repeats the error. Wince time...

July 9, 2008

Can reporters raise their game in writing about education research?

I know that I still owe readers the ultimate education platform and the big, hairy erratum I promised last month, but the issue of research vetting has popped up in the education blogule*, and it's something I've been intending to discuss for some time, so it's taking up my pre-10:30-am time today. In brief, Eduwonkette dismisses the new Manhattan Institute report on Florida's high-stakes testing regime as thinktankery, drive-by research with little credibility because it hasn't been vetted by peer review. Later in the day, she modified that to explain why she was willing to promote working papers published through the National Bureau of Economic Research or the RAND Corporation: they have a vetting process for researchers or reports, and their track record is longer. Jay Greene (one of the Manhattan Institute report's authors and a key part of the think tank's stable of writers) replied with probably the best argument against eduwonkette (or any blogger) in favor of using PR firms for unvetted research: as with blogs, publicizing unvetted reports involves a tradeoff between review and publishing speed, a tradeoff that reporters and other readers are aware of.

Releasing research directly to the public and through the mass media and internet improves the speed and breadth of information available, but it also comes with greater potential for errors. Consumers of this information are generally aware of these trade-offs and assign higher levels of confidence to research as it receives more review, but they appreciate being able to receive more of it sooner with less review.

In other words, caveat lector.


We've been down this road before with blogs in the anonymous Ivan Tribble column in fall 2005, responses such as Timothy Burke's, a second Tribble column, another round of responses such as Miriam Burstein's, and an occasional recurrence of sniping at blogs (or, in the latest case, Laura Blankenship's dismay at continued sniping). I could expand on Ernest Boyer's discussion of why scholarship should be defined broadly, or Michael Berube's discussion of "raw" and "cooked" blogs, but if you're reading this entry, you probably don't need all that. Suffice it to say that there is a broad range of purpose and quality in blogging: some blogs such as The Valve or the Volokh Conspiracy have become lively places for academics, while others such as The Panda's Thumb are more of a site for the public-intellectual side of academics. These are retrospective judgments that are only possible after many months of consistent writing in each blog.

This retrospective judgment is a post facto evaluation of credibility, an evaluation that is also possible for institutional work. That judgment is what Eduwonkette is referring to when making a distinction between RAND and NBER, on the one hand, and the Manhattan Institute, on the other. Because of previous work she has read, she trusts RAND and NBER papers more. (She's not alone in that judgment of Manhattan Institute work, but I'm less concerned this morning with the specific case than the general principles.)

If an individual researcher needed to rely on a track record to be credible, we'd essentially be stuck in the intellectual equivalent of country clubs: only the invited need apply. That exists to some extent with citation indices such as Web of Science, but it's porous. One of the most important institutional roles of refereed journals and university presses is to lend credibility to new or unknown scholars who do not have a preexisting track record. To a sociologist of knowledge, refereeing serves a filtering purpose to sort out which researchers and claims to knowledge will be able to borrow institutional credibility/prestige.

Online technologies have created some cracks in these institutional arrangements in two ways: reducing the barriers to entry for new credibility-lending arrangements (i.e., online journals such as the Bryn Mawr Classical Review or Education Policy Analysis Archives) and making large banks of disciplinary working papers available for broad access (such as NBER in economics or arXiv in physics). To some extent, as John Willinsky has written, this ends up in an argument over the complex mix of economic models and intellectual principles. But its more serious side also challenges the refereeing process. To wit, in judging a work how much are we to rely on pre-publication reviewing and how much on post-publication evaluation and use?

To some extent, the reworking of intellectual credibility in the internet age will involve judgments of status as well as intellectual merit. To avoid doing so risks the careers of new scholars and status-anxious administrators, which is why Harvard led the way on open-access archiving for "traditional" disciplines and Stanford has led the way on open-access archiving for education, and I would not be surprised at all if Wharton or Chicago leads in an archiving policy for economics/business schools. Older institutions with little status at risk in open-access models might make it safer for institutions lower in the higher-ed hierarchy (or so I hope). (Explaining the phenomenon of anonymous academic blogging is left as an exercise for the reader.)

But the status issue doesn't address the intellectual question. If not for the inevitable issues of status, prestige, credibility, etc., would refereeing serve a purpose? No serious academic believes that publication inherently blesses the ideas in an article or book; publishable is different from influential. Nonetheless, refereeing serves a legitimate human side of academe, the networking side that wants to know which works have influenced others, which are judged classics, ... and which are judged publishable. Knowing that an article has gone through a refereeing process comforts the part of my training and professional judgment that values a community of scholarship with at least semi-coherent heuristics and methods. That community of scholarship can be fooled (witness Michael Bellesiles and the Bancroft Prize), but I still find it of some value.

Beyond the institutional credibility and community-of-scholarship issues, of course we can read individual works on their own merit, and I hope we all do. Professionally-educated researchers have more intellectual tools which we can bring to bear on working papers, think-tank reports, and the like. And that's our advantage over journalists; we know the literature in our area (or should), and we know the standard methodological strengths and weaknesses in the area (or should). On the other hand, journalists are paid to look at work quickly, while I always have competing priorities the day a think-tank report appears.

That gap provides a structural advantage to at least minimally-funded think tanks: they can hire publicists to push reports, and reporters will always be behind the curve in terms of evaluating the reports. More experienced reporters know a part of the relevant literature and some of the more common flaws in research, but the threshold for publication in news is not quality but newsworthiness. As news staffs shrink, individual reporters find that their beats become much larger, time for researching any story shorter, and the news hole chopped up further and further. (News blogs solve the news-hole problem but create one more burden for individual reporters.)

Complicating reporters' lack of time and research background is the limited pool of researchers who carve out time for reporters' calls and who understand their needs. In Florida, I am one of the usual suspects for education policy stories because I call reporters back quickly. While a few of my colleagues disdain reporting or fear being misquoted, the greater divide is cultural: reporters need contacts to respond within hours, not days, and they need something understandable and digestible. If a reporter leaves me a message and e-mails me about a story, I take some time to think about the obvious questions, figure out a way of explaining a technical issue, and try to think about who else the reporter might contact. It takes relatively little time, most of my colleagues could outthink me in this way, and somehow I still get called more than hundreds of other education or history faculty in the state. But enough about me: the larger point is that reporters usually have few contacts who have both the expertise and time to read a report quickly and provide context or evaluation before the reporter's deadline. Education Week reporters have more leeway because of the weekly cycle, but when the goal of a publicist is to place stories in the dailies, they have all the advantages with general reporters or reporters new to the education beat.

In this regard, the Hechinger Institute's workshops provide some important help to reporters, but from everything I have read, the workshops are usually oriented to current topics, providing ideas for stories and general context on "what's hot" rather than helping reporters respond to press releases. Yet reporters need help from a research perspective that's still geared to their needs. So let me take a stab at what should appear in reporting on any research in education, at least from my idiosyncratic reader's perspective. I'll use the reporter's 5 W's, split into publication and methods issues:

  • Publication who: authors' names and institutional affiliations (both employer and publisher) are almost always described.
  • Publication what: title of the work and conclusions are also almost always described. Reporters are less successful in describing the research context, or how an article fits into the existing literature. Press releases are rarely challenged on claims of uniqueness or what is new about an article, and think-tank reports are far less likely than refereed articles or books to cite the broadly relevant literature. When reporters call me, they frequently ask me to evaluate the methods or meaning but rarely explicitly ask me, "Is this really new?" My suggested classification: entirely new, replicates or confirms existing research, or is counter to existing research. Reporters could address this problem by asking sources about uniqueness, and editors should demand this.
  • Publication when: publication date is usually reported, and occasionally the timing context becomes the story (as when a few federal reports were released on summer Fridays).
  • Publication where: rarely relevant to reporters, unless the institutional sponsor or author is local.
  • Publication why: Usually left implicit or addressed when quoting the "so what?" answer of a study author. Reporters could explicitly state whether the purpose of a study is to answer fundamental questions (such as in basic educational psychology), to address applied problems (as with teaching methods), to influence policy, etc.
  • Publication how: Usually described at a superficial level. Reporters leave the question of refereeing as implicit: they will mention a journal or press, but I rarely see an explicit statement that a publication is either peer-reviewed or not peer-reviewed. There is no excuse for reporters to omit this information.
  • Content who: the study participants/subjects are often described if there's a coherent data set or number. Reporters are less successful in describing who is excluded from studies, though this should be important to readers and reporters could easily add this information.
  • Content what: how a researcher gathered data and broader design parameters are described if simple (e.g., secondary analysis of a data set) or if there is something unique or clever (as with some psychology research). More complex or obscure measures are usually simplified. This problem could be addressed, but it may be more difficult with some studies than with others.
  • Content when: if the data is fresh, this is generally reported. Reporters are weaker when describing reports that rely on older data sets. This is a simple issue to address.
  • Content where: Usually reported, unless the study setting is masked or an experimental environment.
  • Content why: Reporters usually report the researchers' primary explanation of a phenomenon. They rarely write about why the conclusion is superior to alternative explanations, either the researchers' explanations or critics'. The one exception to this superficiality is on research aimed at changing policy; in that realm, reporters have become more adept at probing for other explanations. When writing about non-policy research, reporters can ask more questions about alternative explanations.
  • Content how: The details of statistical analyses are rarely described, unless a reporter can find a researcher who is quotable on it, and then the reporting often strikes me as conclusory, quoting the critic rather than explaining the issue in depth. This problem is the most difficult one for reporters to address, both because of limited background knowledge and also because of limited column space for articles.

Let's see how reporters did in covering the new Manhattan Institute report, using the St Petersburg Times (blog), Education Week (blog thus far), and New York Sun (printed). This is a seat-of-the-pants judgment, but I think it shows the strengths and weaknesses of reporting on education research:


Criterion      Times (blog)     Ed Week (blog)   Sun
Publication
  Who          Acceptable       Acceptable       Acceptable
  What         Weak             Acceptable       Weak
  When         Acceptable       Acceptable       Acceptable
  Where        N/A              N/A              N/A
  Why          Implicit only    Implicit only    Implicit only
  How          Acceptable       Absent           Absent
Content
  Who          Acceptable       Acceptable       Acceptable
  What         Weak             Weak             Weak
  When         Acceptable       Acceptable       Acceptable
  Where        Acceptable       Acceptable       Acceptable
  Why          Weak             Acceptable       Weak
  How          Weak             Weak             Weak

Remarks: I rated the Times and Sun items as weak in "publication what" because there was no attempt to put the conclusions in the broader research context. All pieces implied rather than explicitly stated that the purpose of the report was to influence policy (specifically, to bolster high-stakes accountability policies). Only the Times blog noted that the report was not peer-reviewed. All three had "weak" in "content what" because none of them described the measures (individual student scale scores on science adjusted by standard deviation). Only the Ed Week blog entry mentioned alternative hypotheses. None described the analytical methods in depth.

While some parts of reporting on research are hard to improve on a short deadline (especially describing regression discontinuity analysis or evaluating the report without the technical details), the Ed Week blog entry was better than the others in several areas, with the important exception of describing the non-refereed nature of the report. So, education reporters: can you raise your game?

* - Blogule is an anagram of globule and connotes something less global than blogosphere. Or at least I prefer it. Could you please spread it?

July 8, 2008

300 v. 10,000 and the broader discussion of performance pay

A bit more on Obama, performance pay, and the NEA: I commented yesterday about the Mike Antonucci video of Obama's speech to the representative assembly and the light round of boos when he mentioned performance pay (or merit pay or differential pay: take your pick, it doesn't change the substantive matters). Antonucci responds with more about his impression of the response (whether boos or cheers were louder for Obama, for which segments, etc.). I wasn't there, so I'll take his word that I miscounted from the spectacular audio on Youtube. I'm not sure that matters much either for the politics (which is that Obama is popular among teachers, but he and union leaders disagree most about performance pay) or for the substantive policy.

Charles Barone updated his entry on the matter twice, and here's the relevant matter:

I and many of the people who were passing this around are a little more skeptical than Sherman about what is needed to effect the kind of change Obama is talking about. The teacher quality problem is national. And urgent. It requires a national solution, which is frankly long overdue

Here we see what I explain to my undergraduate students: NCLB and education politics more generally have created a vicious circle of distrust. Because of how states respond to NCLB (some of which is pushed by the law and some a matter of state choice), teachers and parents at the local level have an increasingly negative view of NCLB and states. And because of the same choices, national policymakers and the Beltway view states and local actors with even more distrust.

The argument that Problem X "requires a national solution" is more a reflection of this distrust than a result of serious research or policy perspectives about the role of the federal government. (See Manna, McGuinn, DeBray-Pelot, Kaestle, and others on federalism in education policy.) The federal government can do many things, and some things it must do, but federal education law is pretty blunt. It has never been a policy scalpel. And everything we know about performance pay and merit pay says that the details matter a great deal, a situation where federal mandates would be disastrous and eventually undercut any transient support for merit pay.

I know that the details matter from my observations of a cudgel-like mandate in my own state and also from my own experience with merit pay in higher ed: my colleagues generally like merit pay because departments are in control of the procedures and vote on them. Test scores play no role, and support for merit pay would evaporate if any of the K-12 schemes involving those were floated here. The most quantitatively-oriented department chair I know is least confident about evaluations of teaching and most confident on research, for a variety of reasons. Even so, my colleagues also support across-the-board raises (salaries at USF are in the fourth quintile of research-extensive universities, in terms of the national distribution) and compression-inversion remedies.

July 7, 2008

300 booing is somehow more important than 10,000 delegates

Former Hill staffer Charles Barone wrote early this morning that a video of Barack Obama's speech to the NEA Representative Assembly last week was being watched closely by "Congressional staff and education policy folks." Barone highlights a point in the speech where Obama says he is in favor of performance pay and where you can hear some booing in the background. "Pretty striking, booing a plan to give teachers who do more work, attain certain skills, or take tough assignments more money."

Barone is taking that moment far out of context, and so is anyone who draws a similar conclusion: what sounds like several hundred people booing is in a hall of about 10,000 delegates, and the cheers at other moments easily outweighed the booing. Even the laughter at Barack's comment after that moment was far louder. Bargaining performance pay is a hot topic among teacher union officers, and it should be clear that many union leaders are highly skeptical of any and all performance pay plans. I don't want to paper that over. There are plenty of reasons for union officials to be skeptical, given the history of arbitrary administrative evaluations before unionization, pay plans that have been imposed without bargaining, or pressure tactics that can undermine local bargaining. On the other hand, I can think of several locals (including those in the NEA) who have bargained performance pay when they have been part of its development.

In the end, Barone's comment is sad evidence of a Beltway mentality: Hill staffers know best. Neither members of Congress nor local school board members nor union leaders inherently know best. Where that type of arrogance rears its head, it undermines what should be happening: discussion.

(Disclosure: My own faculty union was the first to propose merit pay many years ago in the statewide contract, and of all the locally-derived money at USF for collectively bargained raises since our first local contract in 2004, two thirds has been for merit pay.)

June 13, 2008

I was manifest(o)ly wrong

Several days ago, I echoed Steve Diamond's argument that the dueling manifestoes this week are related to "the battle for the soul of Barack Obama." Larry Mishel took me to task in comments, and I will now publicly apologize, since David Brooks has now made the same point Diamond and I did. In his Manichean spin, Brooks claims that one cannot agree with both manifestoes, and that they represent the status quo camp and the reform camp. But wait: isn't NCLB the status quo, and high-stakes accountability the status quo in many states before that? And how does Brooks' one-or-the-other story jibe with Arne Duncan's being a signatory on both? (And per Eduwonk's offhand remark, do we really need another controversial local superintendent bumped up to Secretary of Education?) Quick, everyone: post sentries at the camp entrances!

June 12, 2008

Shared responsibilities for children I

I had intended to blog about the responsibilities of schools for a few weeks, since Harry Brighouse responded last month to April's Richard Rothstein-Rick Hess(-and-others) debate and Matthew Yglesias responded a week later to Ezra Klein's comments on education and the economy. I've been swamped by other things and am writing this first entry (of two) during a fragment of my day when I can't do anything else productive. (This is the background piece: the Uber Education Manifesto Du Jour With Humor will be the second entry.) But, in any case, this goes back at least a few weeks before this week's manifestoes presented Tuesday and yesterday. Then again, I suppose we should really go back to Richard Rothstein's Class and Schools (2004). Or maybe Berliner and Biddle's The Manufactured Crisis (1995). But that's only the recent lineage. Other ideas that will appear later in this entry come from Jennifer Hochschild and Nathan Scovronick, Michael Katz, Miriam Cohen, and Stephen Provasnik, among other historians and social scientists who have written about education as part of the state for about 40 years or more. Well, that's not quite accurate: the current line of academic writings is 40 years old, but the North American debates they've covered are several hundred years old. In other words, the relative responsibility of schools for academic achievement is not something that's new or newly struggled over. My goal in this entry is to identify three key issues underlying the current (and older) debate.

Probably the most important issue is the role of schools in citizenship and the welfare state. Because schooling became closely tied to the rhetoric of citizenship two out of the three times that the franchise expanded dramatically in the past two centuries, we think of education today as a birthright. Primary education became common in the U.S. earlier than in other early-industrializing countries, and as a result education is the primary form of social citizenship in this country. As Hochschild and Scovronick note, we imbue education with many of the same functions that a broader welfare state serves in other industrialized countries: education is supposed to advance economic opportunity, better health, happier lives, and so forth. (The last, most corrupt form of progressive curriculum ideas was called the Life Adjustment movement, and it was the reductio ad absurdum of education as a substitute for broader social citizenship.) So now schools are supposed to do everything from resuscitate the economy to save lives to ... oh, I don't know, cure split ends. There is a legitimate and identifiable human capital consequence to education, but the rhetoric on that is overblown. There is an inevitable temptation to see education as the cure for all ills, and the politics of education is liberally infected with panacea attribution disease. One part of the serious debate over accountability is the precise role of schools, and that is intimately tied to questions about the extent of the American welfare state.

One complication in thinking about education is the fact that elementary and secondary schooling is among the most equally distributed resources in the United States. In the states with the worst inequality in school spending, you'll see maybe two or even three times as much spending for some children as for others. Think about the distribution of other resources: access to health care, housing, transportation. All are distributed less equally than schools, because schooling is part of the democratic state and a right of citizenship by politics and state constitutions. That fact does not excuse educational inequality, but it's something we don't talk about openly or think about clearly.

I think there's a way out from the quagmire I've identified above: schools, other agencies, and families share responsibilities for children. Each is independently responsible for a reasonable but critical role in the lives of young people. Schools are not time machines: they cannot go back and undo what happened or didn't happen in earlier years, nor can they provide health care, clean air, and so forth. Nor can they take over the lives of children. But neither are they or teachers able to use the rest of children's lives as excuses; you take the students you have and move them. Period. The same is true for parents: they're not responsible for teaching their children calculus. But neither are they supposed to sit on their butts when things go wrong in schools, nor is it responsible to neglect their children. Oh, yes, and you're responsible for talking with people in the other roles, too.

There is a crucial advantage of having twin principles (responsibilities for both coordination and independent functioning): It fits with the broad sense of U.S. parents and other adults that both families and schools are responsible for academic achievement. I've pointed out this apparent inconsistency for several years, but in reality it's not an inconsistency. It reflects one reasonable solution to the dilemma: we're all supposed to be responsible.

But there's a sticking point in this grand ideal: given that schools have a serious but limited responsibility, how do we define the scope of that responsibility? Let's assume (for now) that we're concerned primarily with academic achievement. What exactly do we want schools to do? The final issue I want to identify is the series of shortcuts we take when talking about standards, proficiency, expectations, and any synonym you can find for the general concept of what we want children to learn. I have made the following point in Accountability Frankenstein among other places, and no one has even challenged me on it: almost every policy displaces the hard choices about expectations into a different forum. That doesn't mean that I have no expectations for my children or for schools. It just means that the process of turning rhetoric into policy mechanism removes the definition of academic expectations from public debate. Some of us say we want "high standards," but that does not say a single thing except in the politics of symbolism. Reformulating the concept doesn't help: growth models are equally suspect. In short, "proficiency" is a cipher.

Oh, damn: and there you thought I was headed into a Grand Bargain, a reasonable solution to all the fighting over accountability? Unfortunately, I'm an historian, not a Nobel Peace Prize winner. And I have somewhere to be in a few minutes. But do not fear: for those who grumble about the lack of specifics in this week's manifestoes or this entry, just hold on (or read the last chapter in my book, which is available without waiting for the second entry on this topic).

June 10, 2008

Missing out of the action, still

I'm swamped by work, so I'm afraid I'm going to be missing the party today on A Broader, Bolder Approach to Education, the collaborative statement on education policy headed up by the Economic Policy Institute. Eduwonkette praises the statement. Richard Lee Colvin cautiously praises the emphasis on early childhood education while noting that it is likely to be controversial. Sara Mead's view is highly mixed. Eduwonk and Mike Petrilli are outright cynical.

I'm going to be late in responding to this (and other major stories such as Ed Week's grad-rate release last week). I'd give my brief gloss on the topic, but I've already written a book on accountability, and I'm too exhausted right now for pithy comments.

May 28, 2008

The test-prep nightmare

Over at Ed Sector's blog, former ES intern Danny Rosenthal describes how a test-prep nightmare unfolded in his Texas school. Towards the beginning of the entry, he writes,
I'm OK with test prep. When standardized tests are well-crafted, as they are in my state, teachers should use tests to shape their classroom instruction. Done thoughtfully, "teaching to the test" is a good idea. But at my school, and others in Houston, we execute test prep so poorly that it ends up hurting students more than it helps them.

The concrete description in the rest of the entry shows what happens in the school where he teaches:

... the sticker exercise told us little about our students' needs...

Mostly, teachers made worksheets with questions only loosely related to each other taken from previous TAKS tests, or, in some cases, from math textbooks that are largely unaligned with the TAKS test. Think panicked college students poring over Cliffs Notes for the wrong novel.

Sometimes, the school made all math teachers work off of the same worksheets, regardless of the fact that they taught different subjects....

Our test prep worksheets aim to review important skills. But oftentimes students have not learned these skills in the first place. And the worksheets don't fix that....

Students choose not to try mostly because they think they have no chance to succeed. That's not their fault. At Hastings, we are far too willing to exchange gimmicky test-prep and other instructional shortcuts for real teaching.

Rosenthal's vision of teaching-to-the-test done right is in line with the argument of Lauren Resnick, if TAKS were such a "good test" (many would disagree), and if that incentive pushed the type of instruction Rosenthal prefers (i.e., good instruction). But that's far too rare.

May 21, 2008

Qualitative data on schools

Yesterday's story in the Washington Post (hat tip) on in-person reviews of schools by external committees is one step in the right direction for accountability: using in-person eyeballs instead of just statistical eyeballs to see what should be done. Rhee sent teams of people into schools she wanted to change. There are some questions I still have after reading the article: why only one- and two-day visits? what did the DC teachers union think of the reviews? what did other stakeholders think? But even if there were flaws with this process, having students, parents, and educators visit schools to provide a snapshot is dramatically different from just looking at test scores and prescribing a cookie-cutter "fix."

(Note: Ken DeRosa pointed out the false dichotomy I had when rushing this entry through yesterday, and I trust this is now more "just.")

May 19, 2008

Political science/political philosophy and education policy

I was going to spend some time last night connecting my weekend entry on hubris to the debate over whether a preponderance-of-evidence standard is right for policy, when I discovered that the macrotheoretical gap had already been filled by Leo Casey's point about seeing like a state, not like an educator. I'm expecting two quick-tongued responses today from other bloggers, but I hope that there is more than a fast wit applied to an argument about the way that states behave and how that shapes education policy debate. I didn't use James Scott's book in Accountability Frankenstein, but I easily could have (and probably should have).

That's probably one logical direction for some good academic work to head in, after the solid work done by Manna, McGuinn, and DeBray (three new scholars: go buy their books!). Education governance is such a complicated mess for some who think about school reform that it's a wonderful place for academics to play.

April 20, 2008

The Indiana Jones response to philosophy-of-research blogging

Kevin Carey has his say on a preponderance-of-evidence standard on policy propositions (in response to an Eduwonkette discussion of growth measures). Skoolboy responds. I wouldn't go all ad-lib-for-convenience on you all if it weren't 11:20 at night, but I'm tired, and since this is a meta-discussion about judging teachers based on test scores, I'll just say this: It already happens (firing educators based on test scores), it's called reconstitution, and the evidence of its success is mediocre at best. We don't need to go all meta- when there's experience at hand... or specific proposals such as New York City's (which Skoolboy points out fails the sniff test of basic algebra).

If anyone were tempted to go meta-, I'd point out that there is no such thing as a monolithic social scientist's frame for policy. Then again, I'm not only an alleged social scientist, I'm a card-carrying member of the Social Science History Association and have a degree in one of those odd number-crunching realms (demography).

April 13, 2008

Legislative rolling and the New York budget language on tenure

One more thought on the New York state budget's language placing a moratorium on using test scores to deny teachers tenure: I'm wondering how much of the ire directed at the legislature and the calumny aimed at NYSUT (the state teachers union affiliate) is about the process of how this happened—i.e., without the "right" people in control or at the table.

I suspect the substance of the language is all about the waiting game going on with the end of Michael Bloomberg's second term as New York mayor. The use of value-added measures as the sole or a primary tenure criterion is now blocked until after Bloomberg is out of office (and after Joel Klein is also likely to be gone as schools chancellor). Whatever decisions are taken after the moratorium ends will be taken by other people, in other political circumstances.

And it's that fact that makes me wonder about the undiscussed process issue. For the last seven and a half years, plenty of players were ignored in education policymaking. That's why the legislature approved mayoral control: to remove large bunches of stakeholders from the decision-making, in hopes that putting power in the hands of one person (Mayor Bloomberg) would aid significant reform. The political regime that followed that decision is something I'll leave to others to describe (and I suspect it would make a great dissertation for someone in the New York area), but the whole point of mayoral control was to remove people from the policymaking process.

So what happened in Albany? According to the critics of the decision who blamed NYSUT, the teachers union used every lobbying trick at their disposal to hide this provision in the budget while it was being drafted/finalized, while others (Bloomberg and allies) were left out of the process. The tone used by DFER head Joe Williams is one of anger and surprise, a "we was robbed" attitude. One informal term for being robbed and beaten up in the process is "being rolled," and that's much the impression I get from the critics of the language, especially the New York Daily News's referring to Albany as in the midst of a "legislative crime wave." No one likes to be rolled politically, but the irony here is that many of those who disapprove of being rolled in Albany haven't said boo about others' being rolled in NYC.

April 9, 2008

There it ain't -- a rap on The Quick and the Ed's knuckles

In The Quick and the Ed today, Kevin Carey boldly overclaims:

The Times is reporting that, at the behest of the teachers unions, last-minute language was snuck into the New York State budget providing that "teacher[s] shall not be granted or denied tenure based on student performance data." There's really not much one can add to that; it's hard to imagine a more unambiguous declaration of the union's total disregard for student learning when its members' jobs are at stake.

I suppose there really isn't much to add except that the Times article clearly states that the provision in question is not a ban but a two-year moratorium. It's hard to imagine a more unambiguous declaration of the union's caution about buying into rash schemes, and it puzzles me why Carey would make such an obvious omission in a way that undercuts his argument. See Eduwonkette for more links.

April 3, 2008

A dozen questions for an official graduation rate

When the OMB clears the draft regs on counting dropouts, we can expect another wave of stories on graduation rates and what they all mean. Sharp reporters and other observers will ask the following questions of the draft regs:

  1. Does the definition of graduation include or exclude non-standard completion categories such as GEDs and "certificates of completion"?
  2. How does the definition of graduation handle students with disabilities with a modified curriculum (that is, with an emphasis on functional rather than academic goals)?
  3. Is the mandatory measure a longitudinal statistic such as the NGA compact or a synthetic measure such as Chris Swanson's Cumulative Proportion Index? (I will assume until proven wrong that it is a longitudinal measure.)
  4. Regardless of the measure proposed, how many states have data systems that can produce the statistics required?
  5. How does the measure address transfers, homeschooling, migration, and mortality?
  6. For the adjustments proposed for transfers, homeschooling, migration, and mortality, are there any requirements that states audit the corresponding codes in their data systems?
  7. How does the proposed measure handle grade retention (e.g., multiple years in ninth grade)?
  8. Does the proposed measure forbid a state from using the Florida tactic of calling a dropout a transfer if the dropout immediately enrolled in a GED program?
  9. How does the proposed measure handle students who graduate in five years?
  10. Do the proposed regs require that school districts and schools must meet benchmarks in graduation in the same way that they must meet benchmarks with % 'proficient'?
  11. If there are such required benchmarks, is there any supporting research to suggest that the status or improvement benchmarks are realistic?
  12. In crafting the draft regs, did the Department of Education consult with more than two of the researchers recognized to have published in the relevant area, such as Chris Swanson, Rob Warren, Melissa Roderick, Russell Rumberger, Bob Hauser, Michelle Fine, or Gary Orfield? I'm an historian, and we're generally trotted out as mantel decorations for such affairs, if at all, but there are plenty of solid researchers in the area who could be consulted. And if you're a reporter, you need to line up a few of those folks to be ready to respond to draft regs.
I'm exhausted from a third straight fragmented day, looking forward to a fourth one... but I suspect the above set of questions covers much of the ground on the anticipated draft regs defining an official graduation rate.
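
To make questions 3, 6, and 8 a little more concrete, here is a minimal sketch of the two families of measures. Everything in it is an assumption for illustration: the function names, the simplified formulas, and every number are hypothetical, and the synthetic version only follows the general idea of multiplying grade-to-grade ratios rather than reproducing any published index exactly.

```python
def longitudinal_rate(cohort_size, transfers_out, graduates):
    """Follow one entering cohort and drop coded transfers from the denominator."""
    return graduates / (cohort_size - transfers_out)


def synthetic_rate(enrollment_this_year, enrollment_next_year, diplomas):
    """Multiply grade-to-grade ratios taken from two adjacent years of counts."""
    rate = 1.0
    for grade in (9, 10, 11):
        rate *= enrollment_next_year[grade + 1] / enrollment_this_year[grade]
    # Last step: diplomas awarded relative to twelfth-grade enrollment.
    return rate * diplomas / enrollment_this_year[12]


# Hypothetical cohort: 1,000 entering ninth graders, 630 diplomas.
honest = longitudinal_rate(cohort_size=1000, transfers_out=100, graduates=630)

# Same cohort, but 100 dropouts are recoded as transfers (the question 8 tactic).
recoded = longitudinal_rate(cohort_size=1000, transfers_out=200, graduates=630)

# Hypothetical enrollment counts; a swollen ninth grade (question 7) lowers the first ratio.
this_year = {9: 1200, 10: 1000, 11: 900, 12: 800}
next_year = {10: 960, 11: 900, 12: 810}
synthetic = synthetic_rate(this_year, next_year, diplomas=720)

print(f"longitudinal, honest codes:  {honest:.1%}")
print(f"longitudinal, recoded codes: {recoded:.1%}")
print(f"synthetic:                   {synthetic:.1%}")
```

The point of the sketch is only that the same students can produce quite different "graduation rates" depending on the measure chosen and on how exit codes are audited, which is why the details in the draft regs matter.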

April 1, 2008

Gradu[r]ated

So U.S. Secretary of Education Margaret Spellings Announces Department Will Move to a Uniform Graduation Rate, Require Disaggregation of Data (the true title of the press release today announcing imminent-but-not-published draft regs defining a graduation rate and only a few words away from the type of book title that would cure almost any insomnia). And George Miller huffs some that it wasn't bipartisan (hat tip to David Hoff on the Miller statement). So what's the buzz about?

  1. Spellings is channeling Adlai Stevenson's approach to governance and proudly announcing bold action on issues that are almost consensual and would happen without her intervention.
  2. Especially for this particular issue, the devil is in the details. Florida has a longitudinal graduation measure, but that doesn't mean it's accurate. If the regulatory language released in draft form would allow Florida to keep doing what it's doing officially, you won't see much in the form of transparency (and at least with two issues, you may see things get worse).
  3. Spellings is hoping the gravitas and charm of Colin Powell rubs off. Admittedly, Powell hasn't (yet) been on NPR's Wait, wait, ...

Maybe this is more evidence that Spellings will run for elected office in Texas and claim that she created growth measures, differentiated consequences, and airtight graduation rates. At least she's not claiming to have invented the Internet...

March 19, 2008

"Differentiated accountability"

Alexander Russo links to news coverage of the Margaret Spellings announcement yesterday that maybe not all AYP failures are the same. Here's some blog coverage:

Spellings went to growth pilots, waivers (or turning the other cheek) to allow tutoring before choice, and now differing judgments on failure to meet AYP after others talked about the ideas for years. I think Spellings is just channeling Adlai Stevenson, who once quipped that leadership is seeing where the crowd is heading and getting in front of it.

(Does anyone know the exact wording or source for that?)

Florida ed policy and politics

The legislative session is in full swing (or a more colorful noun), and a bunch of things are in the air either in Tallahassee or elsewhere:

1. Both houses of the state legislature are considering bills to change the role of state testing (FCAT), either by adding other information to the labeling of high schools (the senate's approach) or by a compromise bill that discourages test-prep and sets more specific grade-level standards (the proposal in the house).

2. The ACLU sues Palm Beach County for its low high school graduation rate. Superintendent Art Johnson suggests it's the state's fault for not providing enough money (scroll down for "But the superintendent..."). (Disclosure: A 2006 paper of mine is mentioned in both stories.)

3. Something that wasn't covered in my local papers in January: Holmes County administrators have banned students from displaying anything related to gay pride. The ACLU of Florida sued. I suspect this one's a no-brainer in a bench trial: in the majority opinion in Morse v. Frederick, Chief Justice Roberts made a distinction between what he thought of as the political speech of Tinker and the display of "Bong Hits 4 Jesus."

The only interest the Court discerned underlying the school's actions [in Tinker] was the "mere desire to avoid the discomfort and unpleasantness that always accompany an unpopular viewpoint," or "an urgent wish to avoid the controversy which might result from the expression." Tinker, 393 U. S., at 509, 510. That interest was not enough to justify banning "a silent, passive expression of opinion, unaccompanied by any disorder or disturbance." Id., at 508.

I think that reasoning clearly applies in this case.

March 11, 2008

Defending Effective Accountability and Assessment Practices

Saturday, March 29, 2008
10:45-12:15
Hilton Washington

Defending Effective Accountability and Assessment Practices is the title of the session I'm a participant in at the NEA/AFT Higher Education Joint Conference.

From what I understand, the tentatively-slated participants include staff members of two institutional associations as well as us faculty. As soon as I have permission to post those names, I'll do that.

February 28, 2008

Is the blind spot on higher-ed accountability that big?

In all the kerfuffle over the senior theses of Hillary Clinton and Michelle Obama, I hope I am not the only person asking the other question that I think is obvious and to the point: What do the theses tell us about the state of undergraduate education for Princeton and Wellesley students at the time?

Similarly, all those who huff and puff about higher-ed accountability are ignoring a huge source of information on the quality of graduate education: dissertations. Want to know what the expectations of students are really like? Go read what students create, when they know it's going in the library, going to be microfilmed, or going to be available electronically to the world.

February 25, 2008

NCLB and where we sit

In my undergraduate social foundations class, I spend some time explaining the politics of accountability. For the last few years, a critical mass of students (either a majority or a vocal minority) have consistently opposed accountability, taking on the mantle of professionalism, and it's my job to rattle their cages and make them see things using at least one other lens.

I usually explain things in words something like the following:

Views of accountability depend dramatically on where you are. At the classroom level, teachers trust what they do and would like to trust parents but aren't exactly sure. Parents may want to trust teachers, if their children's experiences have generally been decent, or may be entirely untrusting if not. Principals generally trust their own judgment and would like to trust teachers but have a supervisory responsibility (and the level of supervision they exercise will depend rather dramatically on a variety of factors).

Once you get above the level of the school, each level tends to want to impose some accountability on the level below it. For NCLB purposes, the key issue is the state/feds split: in a number of states, officials in the state capitol don't trust local districts and feel that it is their responsibility to regulate the districts, while a number of federal officials are skeptical that states will do the right thing unless there is a federal level of accountability.

NCLB forced states to define a variety of measures and set targets for those measures. At the local level, the state plan is often viewed as onerous, unreasonable, and inflexible. But the state plans are inherently compromises, and so various parties in Washington have looked at the state plans with skepticism.

For example, let's take a look at graduation, which states often defined to mean one minus the proportion of high school students identified as dropouts. That too-easily-falsifiable "dropout rate" is very low in many places, for reasons largely unrelated to the actual proportion of teenagers who graduate from high school, and the official graduation rate if defined as the complement will be wildly inflated.

To local residents and some educators, it looks like the state is hiding a sizable dropout rate, which many view as a consequence of out-of-control accountability systems. That's the type of local or educator-centered view many of you have described.

But you also need to look at it from a federal perspective, from those who see state plans and state commitments with enormous skepticism. To them, what would be the logical conclusion drawn about such graduation rates?

Linda McNeil et al.'s recent article on high-stakes accountability in Texas and Charles Barone's entry today, The Games States Play: Graduation Rates, are Exhibits A and B the next time I have this discussion.
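
The inflation described in the classroom explanation above can be put in a few lines of arithmetic. This is a minimal sketch with made-up numbers: the function names and the cohort figures are hypothetical, and it simplifies the "one minus the dropout proportion" definition to a single cohort.

```python
def complement_graduation_rate(enrolled, identified_dropouts):
    """Official-style rate: one minus the share of students coded as dropouts."""
    return 1 - identified_dropouts / enrolled


def cohort_graduation_rate(entering_students, diplomas):
    """Share of an entering cohort that actually receives a diploma."""
    return diplomas / entering_students


# Hypothetical cohort: 1,000 students enter, 650 earn diplomas,
# but only 50 of the 350 non-graduates are ever coded as dropouts.
entering = 1000
diplomas = 650
coded_dropouts = 50

official = complement_graduation_rate(entering, coded_dropouts)
actual = cohort_graduation_rate(entering, diplomas)

print(f"official (complement) rate: {official:.0%}")
print(f"cohort diploma rate:        {actual:.0%}")
```

With these invented figures the complement definition reports a rate thirty points higher than the share of students who actually graduate, simply because most non-graduates were never labeled dropouts.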

Wrong incentive structure for community colleges/technical training

George R. Boggs and Marlene B. Seltzer describe Washington State's incentive structure designed to encourage community colleges to push completion:

Washington's community and technical colleges will receive extra money for students who earn their first 15 and first 30 college credits, earn their first 5 credits of college-level math, pass a pre-college writing or math course, make significant gains in certain basic skills tests, earn a degree or complete a certificate. Colleges also will be rewarded for students who earn a GED through their programs.

On the one hand, focusing on proximate measures on the way to degrees makes enormous sense, at least if we trust Cliff Adelman's work. On the other hand, I worry that such an incentives structure will affect standards in institutions with weak faculty governance and protection of academic freedom: "We need these students to pass these credits, or we lose money."

Better incentive structure: if public funding plus current tuition is sufficient for an institution's operating expenses (a rather big if, as I'm aware in Florida), keep the hands off the potentially perverse incentives inside the curriculum and give students an incentive to do well by keeping tuition stable for students as long as they make steady progress towards degrees. In other words, tuition stability (or a cap on rising tuition) is guaranteed if students are doing well.

The institutional incentives then can be geared towards summary graduation measures, to some extent. Florida's universities are having their first bite of outcome incentives this year, but the budget cut is swamping the effects of it. (Here's the motivational undermining: You don't starve people and then tell them they can earn a little bit of pin money if they work harder. At this point, at least for the universities, it's a matter of looking to the future and probably a system negotiation about formulae.)

There's a lot more to be said about higher-ed accountability, including Gerald Graff's commentary on assessment and Erin O'Connor's response, but I have to chair a proposal defense in 10 minutes...

Update (2/27): Kevin Carey responds:

I'd like to propose that people be more judicious and precise in their use of the term "perverse incentives" by not applying it to any incentive that could theoretically cause someone to act in bad faith.

I'm not going to split hairs by pointing out the adverb potentially up in the original entry (okay, originally potential and then changed to potentially); if I understand it correctly, Carey's argument is that we should not say something is a perverse incentive unless we can really point to the evidence of strong corrupting influences. In this case, my argument is about the pressures on instructors, not students (something different from what Carey inferred). Are colleges susceptible to such corruption when institutional stakes are tied to individual course grades? The scandals each year tied to athletics (e.g., FSU and tutors who helped athletes cheat) tell me the answer is yes.

Teacher performance-pay distributions in Tampa

Yesterday and today, the St Petersburg Times has been covering the distribution of performance pay among different schools in Hillsborough County (one of the few in Florida where the union and school board agreed to the state's merit-pay provisions). See the main story from yesterday and also a tale of two teachers, a basic Q&A sidebar, and then play around with school-level statistics.

What the Times has documented is that teachers were more likely to receive the bonuses in schools where students are more likely to be from well-off families. The district says they'll tinker with the formula for next year. While I love David Tyack and Larry Cuban's book with tinkering in the title, I'm skeptical that tinkering will work in this case.

February 14, 2008

Helen Ladd's common-sense approach

I'm biased because I've made the same recommendations: In a late January Ed Week commentary I should have pointed to earlier, the Duke University professor says we should be Rethinking the Way We Hold Schools Accountable.

February 12, 2008

On excuses for unintended consequences

Oh, my: I head out of town for a week, and when I get back there's a trail of tears blogs on curriculum narrowing:

While there is some question about the extent of curriculum narrowing that followed NCLB (see: no causal language there), the basic argument in these entries is over whether NCLB creates incentives to narrow the curriculum and the extent to which the variation in curriculum narrowing shows that schools don't have to narrow the curriculum to do well on tests.

(...except for Eduwonk's red herring about low bars, which essentially is that because states can set relatively low thresholds for proficiency, that eliminates the incentive to narrow curriculum, stuff test-prep into the kids up the wazoo, etc. No economist or behaviorist would accept an argument of "hey, the marginal change required is low, so that doesn't create an incentive for changed behavior." Either would reply that's a question that should be left to evidence, not speculation. I'm not an economist or a behaviorist, but I don't buy the hand-waving about low bars, either. And, as 'kette points out, isn't NCLB supposed to change behavior? You can't simultaneously say NCLB is changing some behavior you like without acknowledging that it has the potential to provoke behavior we don't like.)

If we agree that thousands of schools are making poor decisions in response to the pressure of test-based accountability, then the operative question is, How do we help schools and educators make better decisions? Charles Barone and others suggest we hold up exemplars and say, "Follow them." That's the effective-schools-literature strategy, and we've paddled that boat since the late 1970s without getting where we want, so we know at least that it's not enough. Robert Pondiscio and other core-knowledge or other-curriculum standards folks would say, "Build the curriculum, and they will follow." That's a step towards regulating input more than outcomes, which I suspect will not be politically viable, but I may be wrong. George Miller, Ted Kennedy, and others propose to increase the number of measures used, with legislative language that assumes that AYP can be finely tuned. I don't buy that argument: test-based accountability is a cudgel, not a scalpel. My instinct is to say, Watch the decision-making, but that's because I distrust black-box handwaving, and I know it's hard to operationalize a procedural standard within a test-prep culture.

The meta-political question is deeper and one that I think most people understand in spots if not generally: you either own reform or you lose the reformer label. If you do not acknowledge problems through implementation and own them, you give up a huge chunk of credibility. Whether I agree with them on an issue or not, I give credit to Ed Trust for occasionally identifying problems with implementation and deciding to own the issue (e.g., growth models). They haven't done that with 100%-proficiency goals or test-prep (yet), but it's a healthy dynamic where they have done it. You could say the same with Fordham and curriculum-narrowing (or Diane Ravitch with the same issue plus test-prep). Or Miller and Kennedy and 100% proficiency (though their concrete ideas on those points are Rube-Goldbergesque).

I haven't seen that nearly as much with Barone, Eduwonk, or some others, and the failure to own problems with NCLB ignores the fundamental fact of post-NCLB politics: Parents of public-school children are far more skeptical of test-based accountability than they were 5 years ago. Own the problems or lose control.

February 11, 2008

Probably not what Tallahassee or Beltway policy wonks intended

So some Florida teachers were fired because they were abusing students, letting a classroom get out of hand, not being prepared ... but the state has forced the reinstatement of the teachers because the districts did not rely on test scores to make the personnel decisions.

Can someone explain to me how this makes sense?

February 7, 2008

One more follow-up on Kennedy/Miller endorsement and NCLB politics

Just one more datum on speculation about what the Kennedy and Miller endorsements of Obama mean for NCLB (little, I've said before). Let's suppose for a moment that all this is true, and that the stars are lining up behind Obama from the Democratic Forces for NCLB. If you believe that and the bundling hypothesis about donations to campaigns, and if you know where Bill Gates stands, where do you think the majority of donations from Microsoft employees would be going?

Wrong: Clinton.

February 3, 2008

Matt Miller's fallacy

I must have had a busy month to wait several weeks before correcting the record on Matt Miller's Atlantic article, First, Kill All the School Boards. The real problem, he says, is all of those selfish, parochial school board members and the unions who manipulate them. He paints a romantic picture of Horace Mann, repeats both the truthful and the hoary cliches of the past quarter-century of school reform, and calls for nationalizing education.

To put it briefly, Miller falls into the standard "let's fix the governance structure" fallacy of a certain chunk of education reform wannabes. I just don't buy it. If school-board parochialism were the main problem, then we'd find Hawai'i's schools outdoing the rest of the country because of its unitary system. Or we'd find Southern states outdoing the north because many of them have mostly county systems, in contrast to Northern and Western states with tiny, fragmentary districts. Or New York City's system would be perfect today because of the elimination of the elected school boards through mayoral control. I'm sure that there are governance changes that would matter, but this one? It's bold, provocative, simple, and not very helpful.

Miller refers to a comparative study of education policymaking by economist Ludger Woessmann, and I need to track that down, but I suspect it will support Miller's argument less than he thinks, at least from other writings of Woessmann that I've come across. We'll see.  In the meantime, here's a bit of cold water on the everyone-has-national-standards argument, taken from Accountability Frankenstein:

[N]ot all industrialized countries have a national curriculum framework: Spain and Hungary have a common core, but regions have the authority to adjust the core curriculum or add to it. Italy's and Argentina's curriculum planning has become less centralized in the past decade. Australia, Canada, Germany, and Switzerland have federal systems, like that in the U.S., where there is no central curriculum authority (Chisolm, 2005; Gvirtz & Beech, 2004; Jansen, 1999; O'Donnell, 2001). Even among countries with a centralized curriculum, the focus varies widely (Holmes & McLean, 1992). The United States is not out of step with the world, because there is no international consensus on the appropriate control of curriculum and expectations (or standards), let alone the content.

February 2, 2008

Bill Clinton's Ego, redux

I think Leo Casey is wrong about the politics of Bill Clinton's slamming Ted Kennedy. Since I agree with Leo on a large swath of education policy, including the effects of NCLB, I should explain a bit. For the most part, Hillary Clinton and Barack Obama share significant rhetoric on education and quite a bit of fuzziness on the details. They've both said NCLB has serious flaws, but it hasn't been a focus of their campaigns. That's not much of a surprise, because, despite the efforts of Ed in '08, education is not a huge issue in the campaign. (Bill Gates, get in line behind the folks who want a presidential debate around science.)

Over the past few weeks, both George Miller and Ted Kennedy have endorsed Obama. Has Obama said he agrees with Miller and Kennedy about NCLB? No, not to my knowledge. Maybe he did a backroom deal with both of them about reauthorization, but I've already explained why I think that's not the likely reason for both endorsements.

After being chastised for going after Obama directly and crudely in South Carolina, Bill Clinton did his best to undermine the endorsement of a liberal icon, by linking Kennedy to Bush:

No Child Left Behind was supported by George Bush and Senator Ted Kennedy and everybody in between.

Let me make this clear: I don't think Bill Clinton gives a hoot about NCLB right now, but if he can use it to smear Kennedy and undermine that endorsement, he will. To that end, I think Charles Barone's line-by-line response is tangential. The only phrase that Bill Clinton wanted to get out was "George Bush and Senator Ted Kennedy." Yeah, he can spin a policy tale out of that, but that's not the point.

I know that Hillary Clinton freely acknowledges that she cannot carry a tune in a bucket, but in this case, it's Bill Clinton who's tone-deaf.

February 1, 2008

At least Timothy Leary chose to drop out...

I think I understand Leary's choices, or at least the temptation: It's the end of two very tiring days, when I had a chance to talk for a few hours with one of the folks who tore down Florida's old Pork Chop Gang. Short story: an undergraduate I've been mentoring for a few semesters had an internship with the law firm of this Florida political hero, and after e-mailing back and forth, he needed some questions answered about the background of his senior thesis. So he proposed a joint meeting, first scheduled at the law firm and then moved to my office. I was expecting it to go about 90 minutes. It lasted 150 minutes instead. So we got off on various tangents, since he had the personal experience and I had the history, but the student said it was worth it. I had several meetings today (some planned, some impromptu, some deferred). Lots of things delayed, which is my life these days.

But even if deferred for a few days, the new English-language article of EPAA is out: Avoidable Losses: High-Stakes Accountability and the Dropout Crisis. Its authors combined interview work with following students in Texas as they were left behind in 9th grade and then dropped out. This is very difficult work to do, and the findings are provocative. Two stand out for me: that principals know that they are choosing between education and satisfying the test-score gods, and they reluctantly choose to satisfy the gods; and that to students, there is no distinction between accountability and all the practices that alienate many of them from high school. To the students in this Texas school district in the late 1990s and early 2000s, there was a single massive bureaucracy that held them back, denied them opportunities in part to game the system, and never told them that their education was being sacrificed in the name of pressure whose putative goal was to ensure that they were not denied educational opportunities.

Whether you agree with the article's authors or not, I suspect it will be discussed vigorously, which is all to the good. A few years after Jennifer Booher-Jennings' article on triage in Texas, one of the models for NCLB continues to be a focus of criticism and debate.

(No, I've never taken illegal drugs, nor have I ever been tempted to, in reality. But I live on antihistamines when I have a cold...)

Evaluating college teaching

Since my energy is now sapped, I'll address Eduwonkette's four questions from yesterday:

1) How should learning be evaluated in college?

There are two separate questions (what did individual students learn? and what did groups of students learn?), though I think Eduwonkette is asking more about personnel evaluation. The first two can be evaluated using similar questions and data (including student work!), as long as you acknowledge that classroom dynamics can change things quite a bit. Usually, the first question is tied to students' individual grades, and the second is water-cooler (or coffee-urn) talk among colleagues: how was your class in HVN 101 this semester: better than HLL 666 last semester? Faculty rarely get to ask the second question in more systematic ways.

2) Are course evaluations a fair and comprehensive measure of college teaching?

Eduwonkette is either asking a trick question or conflating the end-of-course surveys that students take with either course evaluation or personnel evaluation. Students are evaluating their own experiences throughout a term, so the survey is more a chance for them to express the conclusions they have already reached, provided the survey items are at least tangentially related to their concerns. Evaluating a course should involve student feedback but also something about what students learned, not just what they felt or expressed. And evaluating faculty as employees adds further layers: their contributions to a course and other information and context often unknown to students, let alone research or service assignments.

3) What should universities do with student course evaluations?

See above on my desire to ban evaluation as the term used for student surveys. But to answer the substantive question: they should be written with input from faculty, include an item on how much effort the student expended on the course (for a few reasons), be available to students (except for graduate students, who are students as well as employees and thus should have some privacy protections), and be part of program and personnel evaluations.

4) What are the potential risks/benefits to students and profs of making them public?

When I was a student, I found the comments far more telling than the numbers. But I suspect that this doesn't have to be theoretical or based on anecdote: there have to be institutions where the survey responses are public, and where one could study the consequences. See above on the graduate-student privacy concerns I have.

January 31, 2008

Higher education and the wrong battle

At Education Sector, Kevin Carey (a 4 out of 5 in my book) has an institutionalist lens that is sometimes incisive (4.5 out of 5), sometimes frustrating (2 of 5), and occasionally both. Such as his complaint yesterday about the "Higher Ed Lobby" (my quotation marks, which are probably 1 out of 5 on style). Here's the gist in his complaint about accreditation agency politics:

But accreditation does a terrible job of creating or providing any kind of public, comparable information about institution-level academic quality.

I'd rate that comment as a 3 out of 5, and the post in general a 2.5 (in comparison with Eduwonkette, whose posts are averaging about 4.87 in the last few months). There are multiple arguments layered into that one statement, but let me focus on two:

  • Lax accreditation has played a significant role in letting the quality of (undergraduate) instruction be lower than it could be.
  • What we need to improve undergraduate instruction is predigested comparisons of quality between institutions.

Thus, yesterday's statement of principles by the Association of American Colleges and Universities and the Council for Higher Education Accreditation is unlikely to satisfy Carey's concerns because it resists the notion that creating quantitative comparisons of student outcomes is a necessary part of the accreditation process. Delving into the broader issue at length requires more energy and time than I have this morning, but I'll put out a few counterclaims:

  • As long as millions of parents and students perceive that they are buying a degree from a college, there will be an inevitable tension between credentialism and the "use value" of a college education. In this environment, accreditation has to answer the face-value "does this college provide an opportunity to learn, and is the degree legitimate?" question.
  • The most savvy students and parents want more than U.S. News rankings, but they're not going to give a hoot about what irks Carey and me about the rankings. Instead, savvy students and parents want to know what happens in the classroom, the lab, the studio, and the field. A case in point: last year, one teen acquaintance of mine was looking for colleges with performing arts programs. In the end, she was accepted to two schools with outstanding reputations, one with local connections that are unbeatable in this subfield, and the other that's in another region, perfectly reputable, but without those networking opportunities. She had the opportunity for one last visit to each place, and what made the difference was watching students rehearse and perform. There was no faux objectivity. My young friend watched students work and decided that the less-networked place had the better education because there was a pop to the work in one place that just didn't exist in the other.

My friend and her parents (whom I've known for years) cared about comparisons, but not predigested ones. They made their own ranking. Kevin Carey, Charles Miller, and others may want to see predigested measures, but they'll be swimming upstream against credentialism, against the needs of students and families who really do want information about educational quality, and against the professional judgment of faculty. Framing the issue as one of the White Hats against the Higher Ed Lobby does everyone a disservice.

One more thing: Last week I tried an experiment and allowed readers to rate my posts on a 1-5 scale. I tried priming the pump by rating a few of them (no, not all 5's), but no one else participated, and I pulled that option. I guess maybe some people are interested in ratings, but not my blog's readers.

January 30, 2008

Chemistry or test-prep?

In Palm Beach County, high schools are ditching real science for FCAT prep. And I thought the election results were the most depressing news of the morning!

January 29, 2008

Alfie Kohn and Diane Ravitch agree!

This week, the zeitgeist in education news is paying students for test scores, as in the Baltimore Sun article yesterday or the USA Today story, but so-called incentive programs have been in the news before and criticized before: See criticisms of Pizza Hut's "Book It" program or Barry Schwartz's column last July, which scored New York City's initiative to pay students for test scores. While such programs sound good in theory (reward kids for doing well!), they rub a number of people the wrong way, including Elena Silva of Education Sector, Diane Ravitch, Eduwonkette, and even conservative Liam Julian, who criticized such programs last year (though I'm linking to my blog entry because the original column has suffered linkrot). And virtually the whole education world knows about Alfie Kohn's opposition to tangible incentives. So what could possibly bring folks of very different stripes together? After all, as Robert Pondiscio points out, isn't giving one incentive the same as giving any incentive, and all we're doing is haggling over the price?

First, a bit of disillusionment: while Kohn and Ravitch both talk about intrinsic rewards, I suspect only one of them will agree with the second half of the reasoning below.

There are two problems with paying students cash for achievement. One is that these programs are not finely calibrated. Whether they reward status achievement (straight As or a certain score on standardized tests) or some sort of growth/effort, there are going to be some rewarded students who did not work hard for the reward and other unrewarded students who probably deserve it. Two consequences flow from that fact. First, students will perceive it as unfair, once the money is doled out. Well, maybe we should be teaching teenagers that "merit pay" isn't always distributed on an equitable basis (see Robert Dreeben's work), but I suspect a program that doesn't pass the adolescent sniff test for fairness will alienate rather than motivate students, with the consequences magnified because of the money stakes. In addition to the fairness issue, there is the research question of whether rewarding students' focused effort and improvement is better or worse than rewarding status. Most program administrators probably make decisions based on seat-of-the-pants judgments rather than the research.

There is a second problem with paying students cash for achievement, and that is the question of the reward itself: will it promote continued effort, or will it be tangential to effort? A case in point from my own experience as a parent, and that of many other parents: you go to the library with your elementary-school child and borrow some books that the child chooses. You all return home. The child reads the book. What is the reward for the child's reading the book? My wife and I didn't think about it at the time in this way, but what our children chose was to return to the library to get more books. The reward was another library trip, which promoted reading. Many math teachers have bonus questions on tests to keep some students occupied when they finish the main questions earlier than others. But the bonus questions also reward completing the test by giving the students more opportunity to challenge themselves. Students of moderate means who work their tails off in high school should be rewarded by an opportunity to attend college at reduced cost (a scholarship), which promotes learning. And so forth.

From this, I'd argue that the more fundamental problem with rewarding achievement with cash is that such rewards do not promote additional learning. While Roland Fryer (the designer of NYC's incentives program) is obviously a very smart new scholar, he is thinking of the rewards from a fairly narrow perspective, assuming that all incentives are fungible and ignoring the post-award uses of rewards. We know that Pizza Hut is engaged in marketing rather than a promotion of reading because it rewards kids with pizza instead of with books. And we'll see appropriate incentives when their use is intimately tied to additional effort.

January 28, 2008

Party trumps policy

Last night, Leo Casey hypothesized on Edwize that Kennedy's endorsement of Obama was related to NCLB. Like Scott Elliott (a reporter with the Dayton Daily News), I'm skeptical. While George Miller and Ted Kennedy have both endorsed Obama and are major figures in NCLB politics, they are also stalwarts in the Democratic caucuses on each side of Capitol Hill, and a significant obligation of such folks is to defend the Congressional majority. The defense of that majority will depend on how well Democratic candidates perform in historically Republican states. As Matthew Yglesias has pointed out, within the Democratic party, Obama is convincing officeholders in Republican-dominated states that he can not only win the White House but help Democratic candidates for lower offices.

That potential contrasts with one of the signal legacies of the (Bill) Clinton administration, a cannibalization of the party by the top of the ticket. While Bill Clinton's fortunes thrived, the Democratic party's did not. I don't think Hillary Clinton is nearly as egotistical as her husband, but downticket potential is probably more important to endorsements than the few inches that separate Clinton and Obama on No Child Left Behind.

January 23, 2008

Value-added, with botulism

Before Kevin Carey proclaims that value-added [method] comes of age, he might want to read the real true facts behind the New York City teacher value-added project, wherein we learn that the city's great statistical experts thought three children were enough of a sample on which to base a teacher evaluation, or maybe the ethical problems with the NYC project, or maybe even my comments on value-added or growth measures in Accountability Frankenstein.

No matter what else you can say about growth measures, NYC's project is about the worst example I can imagine to use if one wanted to push the approach.
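For anyone who wants the arithmetic behind my complaint about three-student samples, here is a minimal sketch in Python. It uses the textbook standard error of a mean, which is not the NYC value-added model, and the 10-point spread in student score gains is a number I invented purely for illustration.

```python
import math

def standard_error(sd, n):
    """Textbook standard error of a mean: spread divided by the square root of n."""
    return sd / math.sqrt(n)

# Assumption for illustration only: student gains vary with a
# standard deviation of 10 points. This is not a figure from NYC.
assumed_sd = 10.0

for n in (3, 30, 300):
    # Larger classes or pooled years shrink the uncertainty around the average.
    print(n, round(standard_error(assumed_sd, n), 2))
```

Under that assumption, the uncertainty around an average based on three students is more than half the size of the student-to-student spread itself, and it takes ten times as many students to cut it by roughly two-thirds.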

Update I: Carey responds in his post:

It might [have methodological problems, in NYC], I don't know, I guess we'll find out. But, per above, methodological issues can be worked out, and anyone who thinks the hysterical reaction to the value-added initiative stems from a deep and abiding concern for statistical integrity is willfully not paying attention.

The claim that "methodological issues can be worked out" is evidence that Carey hasn't read the writings of professional researchers who point out that growth models are no holy grail. I am one of those who have written about the difficulties inherent in growth models, but certainly not the only one.

And my response isn't hysterical; it's simply disgusted with the latest shenanigans from Tweed. The title comes from a wordplay (when food "comes of age," you don't really want it).

Update II: Best comment in response to Eduwonkette: skoolboy, who writes, "I'd characterize the New York City Department of Education as loving data but hating research."

January 20, 2008

Where does effective reform come from?

Thursday, Andy Rotherham challenged historians of education:

[H]ere's a question for the historians that might help explain why education does careen from one thing to the next. What are the most compelling examples of where the education system has reformed itself in ways that have demonstrably benefited students? Haven't most of the reforms, for good and ill, come from influences on the outside, whether higher ed leaders, business, etc...?

I'm not sure Rotherham was responding to Diane Ravitch's plaintive query fairly (I read Ravitch's argument to be that the content of Michael Bloomberg's and Joel Klein's reform ideas is nonsense), but let me answer the question as best I can. As David Tyack and Larry Cuban point out in Tinkering toward Utopia (1995), we sometimes confuse noise for reform.  Well, that's not quite their point: they argue in an early chapter that you have to distinguish between cycles of reform rhetoric and institutional trends. We can't look just at the visible reforms, the ones that have someone shouting from the rooftops about them. In other words, the only reforms that might pop up on Rotherham's radar screen would come either from outside reformers or from the louder inside advocates.

But "the most compelling examples of where the education system has reformed itself" might lie precisely in institutional trends that are tough to identify as coming from a specific set of pressures. I would argue that on the whole, elementary schools treat children much better than they did a century ago: only rare beatings (which provoke outright shaming if they become public), much less physical punishment, and a much higher proportion of teachers who understand better ways of motivating kids. That doesn't mean that everyone is perfect, just much better on the whole than teachers from a few generations ago.

One could make a pretty good case that the consistent rise in NAEP math scores in many states is the result of changing practice. As I've argued before, the National Council of Teachers of Mathematics is not perfect, especially in how it communicates ideas, but my guess is that math instruction is slowly shifting, with more use of manipulatives and other varied repertoires in early grades and also in early childhood settings. Again, nothing is perfect, but as a child I never encountered the easy introduction to graphing that my own son had when he was in preschool in the 1990s. (It involved tasting fruits and vegetables, with children in the class putting up an icon of the food when they liked the taste. The result was a vertical bar chart of preferences by food.) I don't think that came from outside schools.

That doesn't exonerate school officials. I've criticized Tyack and Cuban's incrementalist framework, using desegregation as the obvious counter-example. But that history doesn't quite provide an argument in favor of mapping business rhetoric onto schools. Among other things, there's only one city I know where desegregation was supported by the business community: Charlotte. And where were today's advocates of high-stakes accountability in the 1980s and early 1990s, as Presidents Reagan and Bush were appointing federal judges who eventually undermined and reversed the pressures for desegregation? I think only Miller and Kennedy get credit there, and I can think of several who actively tried to undermine desegregation.

I'm not sure that Rotherham's question is even a relevant one: the fact that we can find a few examples of where outside pressure was absolutely appropriate doesn't mean that it's a panacea. Sometimes the "I'm an outsider" and "reform is inevitable" rhetoric trumps informed judgment. If "I'm a professional; trust me" is fallacious, so is "I'm a businessman; trust me."

January 17, 2008

Ranking creates perverse incentives; ranking of lunchtime and liberal-arts colleges, doubly so

Inside Higher Ed has a great article today, Potemkin Rankings, on how Washington and Jefferson College did everything you'd normally think is right to improve how they look to outsiders and still sank in the U.S. News & World Report rankings. The short story: W&J recruited like crazy to increase the applicant pool and managed to increase selectivity while starting to increase enrollment, hold down the full-price tuition, and still maintain a good faculty-student ratio. Because other liberal-arts colleges increased their endowments and tuition faster, W&J sank in the resources area and thus in the U.S. News ranking.

The problem here is not just with U.S. News. You can find that with almost any system that reduces a complex set of data to a simple ranking. Because the quality of any complex service is never going to be monotonic, there will be inconsistencies in any reductive ranking depending on the relative importance of different factors in the final (reduced) rating. This year, Education Week's Quality Counts report includes a weight your own factor feature, where you can re-rate an individual state based on your own idea of how important you find different elements in the Ed Week database. Well, not really: it looks like the mix within an individual subscale remains the same in the summary number, even if you can come up with different subscale scores. And there's no way to see how the rankings might change based on different weights. (I guess the Ed Week editors didn't really want people to look too closely at the rankings, or at how robust/fragile they might be.)
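To make the weighting point concrete, here is a minimal sketch in Python. The two colleges, their factor scores, and the weights are all invented for illustration; nothing here comes from U.S. News or the Ed Week database, and the weighted average is just one simple way a reductive rating might be built.

```python
# Hypothetical factor scores (0-100) for two made-up colleges; not real data.
scores = {
    "College A": {"selectivity": 90, "resources": 60},
    "College B": {"selectivity": 70, "resources": 85},
}

def composite(factors, weights):
    """Weighted average of factor scores; weights are assumed to sum to 1."""
    return sum(weights[name] * value for name, value in factors.items())

def ranking(weights):
    """Return college names ordered from highest to lowest composite score."""
    return sorted(scores, key=lambda c: composite(scores[c], weights), reverse=True)

# Two equally defensible (and equally arbitrary) sets of weights.
selectivity_heavy = {"selectivity": 0.7, "resources": 0.3}
resources_heavy = {"selectivity": 0.3, "resources": 0.7}

print(ranking(selectivity_heavy))  # College A comes out first
print(ranking(resources_heavy))    # College B comes out first
```

Neither college changed anything about itself between the two print statements. Only the weights moved, which is why I would want to see how robust or fragile a published ranking is before taking it seriously.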

January 8, 2008

Sixth anniversary present for NCLB

So the Sixth Circuit Court of Appeals has revived the 2005 "unfunded mandate" NCLB lawsuit, and here is where things get interesting, because the original complaint is an interesting argument about statutory limits to the power of the purse, tied specifically to NCLB language that lifted mandates that were not paid for. Given the language of the appeals decision, this is going to be a lot more interesting on reargument, and with the current composition of the Supreme Court, I refuse to hazard any prediction about ultimate disposition.

But it won't get to the Supreme Court, because NCLB will be rewritten before it gets that far. Here are the real consequences of the lawsuit: If the plaintiffs win at the lower-court level or if the Sixth Circuit steps in for the plaintiffs in a substantive manner (as opposed to the procedural decision this week), that victory would shift the initiative in reauthorization. On the one hand, those critical of NCLB provisions will be able to be patient, in contrast to supporters of most of the current structure. On the other hand, without the pressure ratcheting up on schools, NCLB critics may not have quite as much organizing energy behind their battle, and that energy may shift to those who support most of the status quo.

January 7, 2008

Ted Kennedy and frames: 51 to go

Last Thursday, I recklessly created a set of predictions for major 2008 education stories and in the top item (on NCLB) wrote,

If I were a senior member of an education committee, I'd work throughout the year to establish some consensus that would hold at least reasonably well no matter what the results of the election.

Lo! and behold! Ted Kennedy has fulfilled my prediction in less than a week with today's Washington Post op-ed column. To be honest, that's only in the first week, but I suspect we'll see plenty of such efforts in the next 51.

December 21, 2007

Guesting on Edwize!

I've gone and committed guest blogging over at the UFT blog Edwize. The gist of the argument is that Joel Klein's pulling a Microsoft-like maneuver with accountability.

And he's the guy who prosecuted Microsoft for antitrust violations.

December 8, 2007

Waiting for the criticism of Winerip

Michael Winerip reports tomorrow on a new ETS report by Paul Barton and Richard Coley, The Family: America's Smallest School. Shades of Moynihan's response to Coleman, anyone? (And does anyone else know the reference for that?)

I expect the blogs next week will be full of criticisms, at least of Winerip's reporting if not the report. It'll be interesting to see if there's some substantive discussion along with the criticism.

Update: Charles Barone was first off the blocks on this. I wish he weren't so consistently sarcastic; it distracts from the analytical points he's making about Winerip and ETS, and those points are important, if not as much of a trump as he implies.

December 7, 2007

Whose values would be valued in a neoliberal education world: Michelle Rhee's or Marc Dean Millot's?

Marc Dean Millot explains why he's a critic of DC Chancellor Michelle Rhee (hat tip), and here's the key paragraph:

What I see in Chancellor Rhee's approach, abetted, permitted or endorsed by Mayor Fenty, is 1) insensitivity and arrogance towards others, combined with 2) a reliance on fear to control staff, and 3) a considerable willingness not to apply analogous performance criteria and public criticism to themselves. Managers cannot be harder and harsher with others than they are on themselves and expect support from their staff, respect from their board, or trust from the public. And managers without all three cannot succeed in a turn-around.

There are three points here. One is the immediate and obvious one: Humiliation and denigration are not great motivators, nor is "making an example of" a significant proportion of the people you work with. I don't know Rhee, but this is not the first time I've seen reports of her approach to people being problematic. And Millot is right on the general principle.


The second point is that mayoral control of schools is no panacea and often a fig-leaf reform. As Monday's Washington Post story on the matter indicates, politics don't disappear with mayoral control. And that's why I was disappointed to see the brief mention of David Tyack's One Best System in Wong, Shen, Anagnostopoulos, and Rutledge's new book, The Education Mayor. Tyack showed how governance reformers in the early 20th century claimed to be "taking politics out of school" in changing ward-based urban school boards to nonpartisan boards often appointed by courts or mayors. Wong et al. seriously misread Tyack in claiming that the historical lesson is that we need to keep politics out of school. Tyack documented how the new boards may have been nonpartisan but were certainly political, elitist, highly connected, and contributors to instead of brakes on bureaucracy. We have seen plenty of the last (continuing bureaucracy) in Chicago and New York City, where mayoral control appears to have changed the address of the bureaucracy instead of the basic facts. Beyond the obscuring of bureaucratic continuation, the arguments in favor of mayoral control contain a romantic view that is all too familiar to historians: change the structure and you can reduce if not eliminate the presumably nasty consequences of education politics. There are at least two fallacies in this romantic view: an unrealistic view of structural change as a panacea, and the blithe assumption that we'd want public education without politics. As long as education is tied to citizenship, politics will inevitably be involved, and that's not a bad thing. (You think Brown v. Board of Education and Title VI of the Civil Rights Act of 1964 weren't political??)

The third point is obvious today but subtler when looking at the long term (or longue durée if you're a devotee of the French Annales school): there is a distinction between policy and approaches to handling people, and you don't know what will win out in the end. You can agree with the policy orientations of people whom you'd never trust (Millot's response to Rhee), and you can see and admire the human qualities of people with whom you have fundamental policy disagreements (me and Mike Huckabee, to take one example; I mean my view of him, not the converse). Often, the historical perspective focuses on the policy issues instead of the person, in part because extant records that focus on personality are often sensationalist instead of subtle. One exception is the record of a few common-school reformers from the early 19th century, whose views on "school management" were an intimate and conscious part of their oeuvre. While one or two of the crankier education historians from the 1970s portrayed Horace Mann and his ilk as 19th century Darth Vaders, top-down class-oriented stealers of democracy, the truth that good historians of various stripes recognize is that a number of class-conscious reformers had a serious argument about the need to be kinder to students. One of the arguments for women as teachers was that they'd be more nurturing. (Sexist? Yes. Motivated by some understanding that beating kids isn't great? Absolutely. Ignores the fact that in the 19th century, women as well as men beat students? You bet.) And Mann is famous for pointing out that Massachusetts teachers regularly beat and humiliated students... and for his argument that such mistreatment was unnecessary and wrong.

That fact notwithstanding, Mann, Henry Barnard, and others still fit into a broad movement of 19th century social reformers who held a set of overlapping traits, which in retrospect we associate with northern Whig parties, the growth of merchant capitalism, concerns about poverty and social disorder, a belief in the ability of the state to address such concerns, and an environmentalist analysis of social problems. When most educational historiography mentions Michael Katz's The Irony of Early School Reform, it is usually in reference to the vote abolishing the high school in Beverly, Massachusetts, but the Beverly story is only the first of three parts. The other two sections emphasize the rise and fall of environmental thinking in the mid-19th century. By the 1870s and 1880s, the optimistic environmentalism from a few decades before had become overshadowed by Social Darwinism and "scientific charity." Katz argued that the early promises of reformatories and other social reforms overpromised and ignored the corrupting influences of institutions and the expenses of running truly beneficial programs. (Disclosure: I'm a Katz student, or I was in grad school.)

Mann's twelve reports are the most interesting body of common-school reform writing to me, in part because there is so much complexity to them. He wanted teachers to be kinder to kids and to use more effective teaching methods. He certainly fit comfortably into the world of early- and mid-19th century Whig reformers, belonging to a temperance society and key in the creation of a state asylum while in the Massachusetts legislature. That reformist attitude was perfectly consistent with the background fear of social disorder. In a letter to a friend, Mann explained his acceptance of the Board of Education secretary position by saying, "Having found the present generation composed of materials almost unmalleable, I am about transferring my efforts to the next. Men are cast-iron; children are wax." Maybe he was influenced by religious riots in Massachusetts in the prior few years, but in any case that fear lasted until his very last report in 1848, which resonated with the news of revolution in Europe and the publication of the Communist Manifesto. We had to have common schooling, Mann said, or else we would have classes bent on mutual conflict:

Now, surely, nothing but Universal Education can counter-work this tendency to the domination of capital and the servility of labor. If one class possesses all the wealth and the education, while the residue of society is ignorant and poor, it matters not by what name the relation between them may be called; the latter, in fact and in truth, will be the servile dependents and subjects of the former.

For students of 19th century history, this should be familiar; it is an echo of the developing free-labor ideology in the North. And as Maris Vinovskis has pointed out, Mann had an approach to education that approximated human capital arguments:

But if education be equably diffused, it will draw property after it, by the strongest of all attractions; for such a thing never did happen, and never can happen, as that an intelligent and practical body of men should be permanently poor. Property and labor, in different classes, are essentially antagonistic; but property and labor, in the same class, are essentially fraternal.

Educate the tykes, and they'll all have some prosperity and a stake in society. But Mann's fear is less about the South than events across the Atlantic:

The people of Massachusetts have, in some degree, appreciated the truth, that the unexampled prosperity of the State,-its comfort, its competence, its general intelligence and virtue,-is attributable to the education, more or less perfect, which all its people have received; but are they sensible of a fact equally important?-namely, that it is to this same education that two thirds of the people are indebted for not being, to-day, the vassals of as severe a tyranny, in the form of capital, as the lower classes of Europe are bound to in the form of brute force.

To Mann, poverty and conflict lurk under the surface of an industrial economy, something that only education can forestall. This was not the naked instrumentalism that Bowles, Gintis, and others claimed in the 1970s, but neither were common-school reformers unconnected to early 19th century industrialization: they were intimately vested in it and saw education's connections to it in multiple ways, including ameliorating social tensions.

In the long run, the more child-friendly views of Mann did not become a part of bureaucratic school culture. As hundreds of my students have pointed out to me over the years, common school reforms were far more successful in changing the structure of schools than in directly affecting the cultural practices inside a classroom. Some things changed, certainly: as other historians (e.g., David Tyack and Larry Cuban) note, chalkboards slowly became institutionalized in school construction, and in the early 1960s, Mann's view of an 'unvarnished' Bible reading instead of sectarian instruction had become the norm. But those were compartmentalized practices, the type of add-on that Larry Cuban has frequently noted is easier for schools to accommodate. (Note: I am dramatically underestimating the issues involved in shifting away from sectarian instruction. Nonetheless, )

One operative question that 1970s and 1980s historians wrestled with is the extent to which the growth of bureaucracy and the decline of early 19th century environmentalism were the consequence of early industrial capitalism. We have a much richer and more complex picture of 19th century school history today, and yet that question remains (or should remain) interesting. The truly large-factory model of education tried in early 19th century cities died as many schools shifted from monitorial schools to smaller, self-contained classes and choral recitation. On the one hand, one could argue that graded elementary schools in many ways mirrored the less-mechanized and smaller factories in the U.S. better than they did some of the much larger factories in England, where monitorial instruction was invented. But the argument that emphasizes the parallel between graded elementary schools and factories overemphasizes the importance of larger cities, when much of early industrialization happened in towns rather than the largest cities.

And that city-town distortion ignores rural places. As Nancy Beadie's recent research uncovers, the building of schools in small towns and rural places may have been as important a part of local economic development in indirect terms as in any human capital effects. The marshaling of local resources for something as simple as church or school buildings required a complex web of economic and social relationships, quasi-private loan networks and reciprocal property relationships that helped incorporate small towns and rural places into a regional economic watershed. ("Watershed" is an unfortunately naturalized metaphor, but I'm not sure there are better alternatives: web and ecology are as inapt.) There's far more to industrialization than building schools, but Beadie's work shows the potential subtlety of schooling's effects and the relationship between economic life and formal education.

And even the subtler views skip some important topics, including the role of mid-19th century higher education, a fuzzily-bordered sector that included institutions called academies, high schools, normal schools, and colleges. And then there's the growth of Sunday schools, and the links between Northern missionary groups and Reconstruction education. So I'm feeling still a bit at sea, wanting a more synthetic interpretive history of 19th century education that wrestles with the bigger economic questions.

What is unquestionable is that Mann's kinder, gentler school didn't survive in the nascent bureaucracy that he helped build. School bureaucracies were easily corrupted into hierarchies that held low expectations for the poorest students. We have the historical example of a structurally-oriented school reformer who still held complex views about what should happen inside the classroom, views that did respect the potential and humanity of children in ways that we should not ignore. Yet his humane vision of schools lost out, at least for most of a century. The structure he imagined did not require humane treatment of its inhabitants.

So today, as we witness another experimental phase in the structure of American education, I read Marc Dean Millot's blogging with both a smile and heartache. Millot writes with passion about treating people with respect. Yet he is in favor of building the same type of structure that Michelle Rhee favors. Whose ways of treating humans would win out in that structure?

November 29, 2007

Wherein we excoriate Everyday Mathematics and also demonstrate the plausibility of letting secondary-grade students use calculators

As Joanne Jacobs notes (hat tip), some of the questions on the NYC-used and Texas-rejected Everyday Mathematics series are just absurd: if math were a color, a food, a type of weather, or a political party, ... oh, wait. We have a mashup: if your political party were a color, what would it be?

I've never seen any of that particular series, but it was mentioned in a comment thread on an entry about communicating math standards (a post from two months ago). I wonder if the most vociferous ideological complaints about Everyday Mathematics are by folks who would disagree with letting kids use calculators on tests. I'm very sympathetic to that argument from one perspective: children should learn fluency in tasks such as multiplication. (We have a copy of Bill Handley's Speed Mathematics book in our house, and I absorbed a few ideas from Jakow Trachtenberg's book when I was a child.)

But at the same time, not having calculators leaves multiple-choice problems vulnerable to testwise strategies. I don't know which states have exams with two- and three-digit multiplication problems, but the following is a fairly easy example of finding the right answer without doing the problem.

Consider an extreme example: 47,583 x 97,621. We know three facts about the answer:

  • The last digit of the answer is 3. (Multiply last digits.)
  • The answer is a multiple of 9. (Cast out nines from the two numbers.)
  • The first digit of the answer is 4. (Estimating 4.7*0.97.)

With that information, I probably don't have to perform any calculations other than addition and single-digit multiplication (1*3, 0*7, and 4*1). 

I wrote all of the above before calling up my computer's calculator. For those who are curious, the answer is 4,645,100,043. That happens to be 9*516,122,227, no remainder.
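For readers who want to check the three facts mechanically, here is a minimal Python sketch. The four answer choices are invented for illustration (no actual test item is being quoted), and the helper names `digit_sum_mod9` and `survives_shortcuts` are my own.

```python
# The two factors from the example above.
X, Y = 47_583, 97_621

def digit_sum_mod9(n):
    """Cast out nines: the digit sum has the same remainder mod 9 as n."""
    return sum(int(d) for d in str(n)) % 9

def survives_shortcuts(choice):
    """Return True if a candidate answer passes all three testwise checks."""
    last_digit_ok = choice % 10 == (X % 10) * (Y % 10) % 10
    nines_ok = digit_sum_mod9(choice) == digit_sum_mod9(X) * digit_sum_mod9(Y) % 9
    first_digit_ok = str(choice)[0] == "4"  # from the rough estimate 4.7 * 0.97
    return last_digit_ok and nines_ok and first_digit_ok

# Hypothetical multiple-choice options (invented; only one is the true product).
choices = [4_645_100_043, 4_645_100_047, 4_645_200_043, 5_645_100_043]
survivors = [c for c in choices if survives_shortcuts(c)]
print(survivors)
```

With this particular set of distractors, only the true product survives, and the script never multiplies anything larger than single digits to find it. A test writer could of course choose distractors that pass all three checks, so this is a sketch of the vulnerability, not a claim about any real exam.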

Are these really the type of skills such tests are designed to measure? I'm not saying the skills are bad to have: estimation is very important, and casting out nines is an excellent check on answers. But there is a rather romantic notion floating around that somehow, if we buckle down and remove calculators from the hands of kids in all situations, men will be real men, women will be real women, and international math and science scores will be real international math and science scores (apologies to Douglas Adams fans).

Somewhere between Everyday Mathematics and macho attitudes towards calculators, there must be sanity.


Addendum/explanation of why casting out nines works as a check on multiplication. Let X=9x+a and Y=9y+b, where x and y are the whole-number quotients and a and b are the ordinary remainders (0 through 8) when you divide X and Y by 9.

X*Y = (9x + a)*(9y + b) = 81xy + 9(ay+bx) + ab. Since the first two terms are multiples of 9, the remainder of X*Y when divided by 9 will be the same as the remainder of ab. This works with any chosen number to divide everything by, but since we normally work in base 10, 9 is the easiest numeral to work with: a number's digit sum leaves the same remainder as the number itself. (If your species generally had Z fingers and therefore used a base Z system, you'd probably be casting out Z-1's.)
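The same argument can be turned into a small checker. The sketch below is mine, not anything standard: `digit_sum` and `check_product` are hypothetical helper names, and the `base` parameter is there only to illustrate the base-Z remark.

```python
def digit_sum(n, base=10):
    """Sum the digits of a nonnegative integer written in the given base."""
    total = 0
    while n:
        n, d = divmod(n, base)
        total += d
    return total

def check_product(x, y, claimed, base=10):
    """Cast out (base - 1)s. False means the claimed product is certainly wrong;
    True only means it has not been ruled out."""
    m = base - 1
    return (digit_sum(x, base) * digit_sum(y, base)) % m == digit_sum(claimed, base) % m

print(check_product(47_583, 97_621, 4_645_100_043))  # the true product passes
print(check_product(47_583, 97_621, 4_645_100_044))  # off by one: caught
print(check_product(47_583, 97_621, 4_645_100_034))  # transposed digits: not caught
```

The last call shows the limit of the method: swapping two digits leaves the digit sum unchanged, so casting out nines is a check on an answer and not a proof of it.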

November 26, 2007

Eduwonkette on NAEP Exemptions

It's not part of her theme this week (exploring Fordham and Ogbu's "acting white" hypothesis), but her post on Lies, Damned Lies, and NAEP Exemptions is still required reading, following up on Elizabeth Green's story in the New York Sun on the large number of exemptions in New York City's urban NAEP testing.

November 11, 2007

Finger-pointing 101

Charles Barone responds to news of the delay of NCLB reauthorization with a lament that (at least in his view) unions are crowing over a political victory. He broadens the field a tiny bit and then engages in a touch of nostalgia for times that never were:

...in the education arena, there was a time when the mantra was that "politics should stop at the schoolhouse door." No one ever reached perfection on that. But it was aspired to or at the very least given lip-service. Now, however, such principles are dismissed with impunity. Politics, campaign contributions, and interpersonal feuds have taken over the entire schoolhouse and are staging a sit-in.

If one defines politics entirely as partisanship in an electioneering context, Barone might be partially right. There are plenty of examples of bipartisan support for various education policies in history. But he might be wrong even in that vein: witness bipartisan support for the College Cost Reduction and Access Act.

As important, though, is the fact that Barone views this issue ahistorically and narrowly. Since the Progressive Era, the cry "get politics out of education" has been a common rhetorical trump card that has often meant "get all the political views except mine out of education." For that reason alone, I am skeptical of various claims on that front.

In this particular context (reauthorization arguments), Barone is engaging in a fairly unsubtle form of finger-pointing: who's to blame for the death of reauthorization? I'm unconvinced that Miller-McKeon was enough of an improvement on virtually any front to rush it through. But beyond the issues, if you really want to point fingers, there are a few complicating factors. First is the distribution of blame: if one wants to call NEA obstinate, one has to explain why Educator Roundtable has rounded on NEA, why Ed Trust doesn't deserve equal blame for appearing equally obstinate, and why Bush doesn't deserve blame for the Department of Ed appointees who allowed cronyism to poison the waters (Neil Bush and COWs, the inadequate control of conflict-of-interest issues with Reading First, etc.).

Even if one wanted to get around the finger-pointing, there remains the fact that the political landscape of accountability has changed: Parents are changing their views of teaching to the test. Any reauthorization that does not address that issue will be politically risky, because most parents really do not want schools turned into test-prep factories (a term Diane Ravitch uses).

November 9, 2007

Janie's mother endorses cliff-diving

"You're too young for make-up, Sweetie. Wait 'til you're sixteen."
"I'm not Janie's mother. I don't do this to be mean."
"If those clothes fit any tighter, you would bust out every seam!"
When did my mother slip inside of me?
--- Brenda Sutton, Mama's Hands

For those of you who truly wanted a test of the famous parental Socratic question--"and if Janie jumped off a cliff, would you do it, too?"--we now have a natural experiment. The University of Wisconsin system has committed to the Voluntary System of Accountability, including standardized testing of learning outcomes (hat tip: Zach Blattner).

The Voluntary System of Accountability is a joint effort by the American Association of State Colleges and Universities and the National Association of State Universities and Land-Grant Colleges to respond to pressures for accountability in higher education. Much of it makes sense except for a rather premature (even nuttily premature) inclusion of standardized testing as a proxy for learning outcomes. Only one of the VSA "learning outcomes" tests has been reviewed by the Buros Mental Measurements Yearbook, and the one that was reviewed (Collegiate Assessment of Academic Proficiency) had a fair assessment ($$) from the standpoint of the VSA:

The validity section of the technical manual is quite brief, and the data provided are not particularly encouraging. There is no information with regard to content validity except the suggestion that each institution should conduct its own content validity assessment.... A major concern regarding content validity of the CAAP relates to the coverage of the CAAP to what is taught in college.... There are skills measures that are certainly important to the social sciences, but the work and tools of the social scientist (hypothesis generation and testing, interpretation of statistical data, the search for alternative explanations of findings, etc.) are fundamentally absent from the assessment.

Less than a few weeks after Miami Dade College's internally-developed portfolio system received positive attention from Margaret Spellings, Wisconsin is essentially drinking the Kool-Aid of poorly-constructed standardized testing as a proxy for accountability. When a young friend of mine had to choose between two schools where she was interested in a performing-arts major, she visited the schools, sat in classes, talked with students, and watched performances. Despite Kevin Carey's desire that she and her family use someone else's ranking to make decisions on college, she used the criterion that made sense: see what students are doing in the field she intends to study. AASCU and NASULGC have made a poor choice, one that risks wasting millions of dollars poured into the companies that produce those tests and that will do little to bring serious accountability to higher education.

November 8, 2007

The Bloomberg-Klein attack on Diane Ravitch

The key clause from Diane Ravitch's reflections on the smear campaign aimed at her recently:

... if they could silence me, I would serve as an example to anyone else who criticized them.

Ravitch is right: as a well-known, respected, and outspoken critic, she is the safest of Klein's critics. A visible attack on her is an attack on all who are more vulnerable. In addition, the sad fact about attempts to intimidate people is that an unsuccessful attack on Ravitch still accomplishes part of the end, by making other critics think twice or three times before opening their mouths.

November 6, 2007

What not to do on pay-for-performance

A new report on pay-for-performance plans (by Joan Baratz-Snowden) was released by the Center for American Progress, and if you strip out all the political and other analysis, here's the gist of the report: We know what not to do on pay for performance. That's important: I'm glad to see my state described as the poster child for ill-advised impositions (we've had several), but Baratz-Snowden's acknowledgment of the thinness of research is reflected in her references, which have only a handful of refereed articles or other similarly-reviewed research papers. That's not her fault: it reflects the simple fact that there is little professional research documenting salutary effects of any pay-for-performance policies (regardless of details). Until we get something on that order, any prescriptions for what to do in a positive sense are foolhardy, let alone inserting any oxymoronic phrase like "proven" strategies into NCLB (from the Miller-McKeon draft language on performance pay). It's a little tough to mandate "proven strategies" on performance pay when there aren't any.

November 4, 2007

NCLB reauthorization dead until 2008

One Nevada newspaper is reporting that the Senate Won't Take Up NCLB this year (hat tip: Michele McLaughlin). This wasn't hard to predict, to be honest. Once we get into 2008, the legislative calendar will become increasingly bogged down with other matters, and while individual legislators (including chairs) may have an incentive to move bills, an increasing number of legislators and advocacy groups will want to wait until after the 2008 elections.

In many ways, the Senate's move may make George Miller's job easier in the House, since the debate becomes more about long-term questions than short-term (and jerry-built) fixes. I'll keep my prediction from 2006: by the end of next year, growth models will look much less like a "fix" than they were at the beginning of this year.

October 30, 2007

Diane Ravitch's disillusionment

From Diane Ravitch's latest entry in Bridging Differences:

Now that the president and the U.S. Department of Education have made it their business to show that federal legislation can and will raise test scores, every release of NAEP data is accompanied by a press statement from the U.S. Secretary of Education that magnifies slight gains as huge achievements. This is troublesome. It is troublesome because the federal government's role as the honest, impartial collector and distributor of information gets corrupted when it acts as a cheerleader. And it is troublesome because it is unrealistic to expect test scores to make major leaps in a few years. When they do, one should suspect chicanery of some kind.

Sharon Nichols and David Berliner make the same point about almost all high-stakes testing in Collateral Damage.

October 19, 2007

On metaphors and people

A few days ago I commented on an Eduwonk entry about Michelle Rhee's wanting more convenient dismissal options for non-unionized central-office staff... and teachers, in part to give some positive reinforcement for the decision to allow comments and in part because there are some interesting ideas in the entry that I wanted to follow up on. (You'll have to go there to see the comments.)

But I looked back at the entry last night, and upon rereading, the last paragraph stuck in my craw:

In the case of D.C., this debate is actually larger than whether Michelle Rhee will be able to fire some people from the central office and some low-performing teachers. It's a proxy for how hard she (and Mayor Fenty) will push on the schools. If they lose this one it's an enormous setback and the wait them out game will start in earnest. If they win, they might not have to fire so many people anyway because it will be a clear signal that business as usual is over. For Rhee, a lot riding on this. Insert your own metaphor here.

While we may think partly in metaphors, I'd prefer to think of debates over the terms and conditions of work in something other than a metaphorical sense. Maybe this is because I like the second formulation of Kant's categorical imperative (the one about treating people as ends and never merely as means), and if so, I'm a softie for unreadable German philosophers. But I don't think either children or adults are metaphorical vehicles. They're people, and we should talk about them as such.

Beyond that, I think Andy Rotherham is mistaken here about the use of power. I've known plenty of people in academe and the K-12 world who have paid far too much attention to symbols of power, from the all-too-important brush-off in person to stressing the importance of a particular goal for ends far beyond what it can possibly mean in reality. Power is also more subtle than the imposition of one's will through forceful means. The principal who inspires and convinces a school's teachers to work their tails off is more powerful than any petty tyrant who might occupy the same office. The true setback in DC would be if Rhee focuses more on acquiring power than on using it wisely.

Addendum: I realized a fast read of this entry may lead readers to erroneously conclude I think Andy Rotherham is into power games. That's not my argument or assumption at all; I suspect that in his own work environment, Andy pays attention to the interpersonal touch and not to imposition of his will on the people who report to him. Maybe the same should be true in school systems...

October 15, 2007

President Bush guarantees irrelevance on NCLB

President Bush has promised a veto of any NCLB reauthorization with significant changes he would interpret as weakening the bill's accountability provisions. The policy influence of this White House continues to recede.

And along with the veto threat, the president decided to misinterpret the concerns many have with teaching to the test:

People say, well, they're just teaching to test. Uh-uh. We're teaching a child to read so they can pass a reading test.

That is the type of petulant rhetoric that ignores a broad current of dissatisfaction with instruction that is largely unproductive by any stretch of the imagination. But his rhetoric is perfectly consistent with the president's general belief that reality has a well-known liberal bias.

Source: President Bush Discusses The Budget

Three shots at graduation rates

Below are three different takes on graduation rates and the Miller-McKeon discussion draft (which includes an elaborate definition of graduation rates and a 2.5% improvement target folded into AYP). This is partly a short description of my reaction to that piece of the discussion draft and partly an experiment in using different multimedia (including Youtube mashups).

Youtube video (straight)

Video with internal object tagging

Video with rebuttal

October 11, 2007

So, um, ... how about writing about MY book?

Kevin Carey wrote yesterday:

At any given moment, there's a limited amount of room in the general consciousness for books about education, and over the past few months a lot of that space has been occupied by Linda Perlstein's new book, Tested. Which, as I explain in my review in this month's Washington Monthly, is too bad.

Fair enough in terms of wishing for different apportionment of air time. Perlstein has the advantage of a mainstream (i.e., large corporate) publisher and publicist. So, Kevin (and anyone else who wishes public attention paid to other materials), why not review some recent books on accountability that are more substantive and analytical?

<whistles and walks away to work on journal editing>

October 5, 2007

Backlash against formative assessment

As reported in the Orlando Sentinel education blog, some educators are worried about the time occupied by tests given throughout the year, tests that school districts hope will track predicted scores on the spring tests in Florida (FCAT):

In plain language there are 8 full student days wasted on these tests. By the time FCAT comes around the students are burned out and I have a strong feeling that they will not be giving 100% on the FCAT. (a correspondent with the reporters)

It's hard to know how to evaluate that claim without knowing more specifics, but there's a fine line between not assessing students enough and wasting time. If you give students a five-minute math quiz every Friday for tracking purposes (apart from any unit tests), that's maybe 10 minutes for test administration a week (at least for students; this doesn't count grading). I think that's reasonable. On the other hand, I wouldn't want to see such quizzes last for 40 minutes every week unless they're very good tests. But then again, in my high school U.S. history class, we wrote an essay a week arguing about the interpretation of the topic of the week. Multiply 45 minutes times the 30-33 weeks that the full curriculum was in force (apart from short weeks and the very start and end of the school year), and that's well over 20 hours of testing in a year on that subject alone. But those were very good tests, as activities in and of themselves.
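For anyone who wants to check the arithmetic, here is a rough sketch using only the figures in the paragraph above. The function name is mine, and applying the same 30-33 week range to the Friday quiz is my assumption for the sake of comparison, not something I measured.

```python
# Back-of-the-envelope yearly assessment time.
# Figures come from the paragraph above; spreading the Friday quiz over the
# same 30-33 weeks is an assumption made only for comparison.
MINUTES_PER_HOUR = 60


def yearly_hours(minutes_per_week, weeks):
    """Convert a weekly time cost in minutes to hours over a number of weeks."""
    return minutes_per_week * weeks / MINUTES_PER_HOUR


# Weekly 45-minute history essay over 30 to 33 weeks.
essay_low = yearly_hours(45, 30)
essay_high = yearly_hours(45, 33)

# Weekly quiz costing about 10 minutes of administration (assumed same weeks).
quiz_low = yearly_hours(10, 30)
quiz_high = yearly_hours(10, 33)

print(f"Essays: {essay_low} to {essay_high} hours a year")
print(f"Quizzes: {quiz_low} to {quiz_high} hours a year")
```

The essays come to between 22.5 and 24.75 hours, which is where "well over 20 hours" comes from; under the assumption above, the quizzes cost roughly a quarter of that.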

The danger of very long tests in multiple-choice formats is that they aren't very good, and the school district employee quoted above may well be right: the sheer volume of such testing can alienate students very quickly. (If you disagree, try filling out your income taxes every month as a formative exercise.) And then the longer-term danger is that such effects can undermine the use of formative assessment even when it does have a light footprint in the classroom.

October 2, 2007

The adults v. children meme of facile ed policy talk, part 375

Ruben Navarrette (hat tip) captures a thumbnail historical myth embodied in the "adults vs. children" theme in accountability talk:

Public schools have, for generations, crafted an environment that caters to the needs and wants of the adults who work in the schools rather than those of the children who attend them.

As Seymour Sarason has observed, children-first rhetoric such as Navarrette voices is actively hostile to reform because it fails to acknowledge some truths about schools as organizations. (Sarason contrasts K-12 schools with higher education, where I work.) Elementary and secondary schools are environments that are about the least adult-friendly you can imagine, outside sweatshops. Where else can adults be vulnerable to being hit by children, be told when they can go to the bathroom, and be told that their own intellectual development does not serve the organization's interests?

Of course schools serve multiple purposes and interests, and yes, one needs to work with that dynamic. But you don't work with the dynamic by setting off one group entirely against another, and that is what Navarrette implies: It's a grudge match, teachers vs. students.

September 25, 2007

NAEP scores out

The National Assessment of Educational Progress (2007) scores are out, and here's a quick response on reading for the country and Florida:

1) The U.S. Department of Education report focuses on feel-good comparisons with 2005, when looking back further gives a different picture. Yes, in fourth grade reading for the country's children, the average scale score has gone up 2 points in the past two years, but the improvement was better in the four years before 2002 (4 points 1998-2002, vs. 2 points 2002-2007). And in eighth grade, the report claims improvement since 2005, but there's been a slight average scale score decline since 2002. In general, fourth-grade reading has been on a gentle upward slope for the past decade while eighth-grade reading is stagnant. In addition, in most areas there has been no closing of the achievement gap since 1992. (The only achievement gap to show a decline either since 1992 or 2005 was the White-Black comparison in fourth-grade reading.) The take-home story today is that the nation's reading achievement provides no clear evidence that No Child Left Behind has dramatically changed elementary and middle-school reading proficiency.

2) Florida's reading achievement is mixed. There appears to be a long-term improvement in reading in fourth grade but stagnant reading scores in eighth grade since 2002. (There was a decline between 2002 and 2005 and then an increase, so the average scale score in 2007 was 1 point below 2002.) There was a slight increase in the proportion of students excluded from testing, but it's hard to know how that might have affected scale scores. Today's report also gives no trend data by population subgroup, so we can tell nothing about changes in achievement gaps in Florida from today's report.

3) If you look at Florida scores by achievement levels, the conclusions you draw depend on which grade and level you pick. Fourth grade: In both the second (proficient) and third (basic) levels, there is a long-term increase in the proportion of students achieving that level, but the second level's upward trend started in 1998, while the third level's upward trend started in 1994.   Eighth grade: There's been stagnation since 2002 no matter which level you examine, after a four-year uptick.

I'd like to get inside the data more, but the NAEP Data Explorer server is now very busy.

September 20, 2007

Vi8gra for ur tests

In response to the growing arguments over the Miller-McKeon Title II proposal (i.e., encouragement of performance pay), Eduwonk (Andy Rotherham) writes

... until education becomes a field that is comfortable with the idea of performance, it's a field that is in some trouble.

This may say something about my spam filter, but that phrase brought up images of all the potential spam about "your test score size," "plez ur schl brd," etc. Of course, neither schools nor teachers would ever spend money on charlatans promising to increase test scores...

Oh, wait.  Yeah.

Well, at least I've got another few lines for an accountability stand-up routine, and all I had to do was have a mild emergency yesterday and be away from my computer all day.

More seriously, in practice these performance-pay plans are complicated and often undercut the claims of proponents that they will reward teachers who work hard in difficult circumstances. Or, as the Orlando Sentinel reported September 9 about the Orange County (Fla.) plan,

...teachers at predominantly white and affluent schools were twice as likely to get a bonus as teachers from schools that are predominantly black and poor.

Finally, I'll repeat this until I'm blue in the face: Everyone else in favor of performance pay on principle or faith, please show me up and read the literature on goal-setting. I don't want to be better read on this than you.

September 14, 2007

The b-word and education politics

Blogger KDeRosa calls George Miller "a Whiny Bitch" (hat tip), which makes me wonder where this performative name-calling came from (see Andy Rotherham's running joke about Rick Hess, though I don't know where that came from, either, and I thought it would be more appropriate to call him Rick "Baby" Hess, since that's the interjection he uses frequently).

Fundamentally, KDeRosa is trying to slap down Miller's rhetoric ($ after today), which in turn is an effort to pivot around Spellings's claims that the Miller-McKeon draft is too wimpy: Miller pointed out that the federal DOE isn't exactly clean on loopholes.

I never knew that education politics was a macho sport. Maybe we can get it on ESPN now, if we can get a little more trash-talking? Or will they be serving diabetes-sized buckets of cheap beer at the next Washington education think-tank gathering?

Take-home message: Any day of the week, I'll take ideas over name-calling. What about you?

Five-Year Plans and Ed Trust flexibility

Trust it to AFT's Michele McLaughlin to find the hidden item in the Ed Trust statement on Miller-McKeon's draft Title I language. Like many others, I had focused on the more belligerent language earlier:

Although the staff draft creates an accountability fig-leaf by preserving the requirement that all students reach proficiency in reading and mathematics by the 2013-14 school year, the heart of the law has been hollowed out.

Sting! But McLaughlin notes the following:

"Additional funding may be included, but money is not the sticking point," says [Ed Trust VP Amy] Wilkins. "The 2013-14 deadline for proficiency is a powerful disincentive to raising standards. If we are going to ask states - and students - to climb a higher mountain, we need to give them more time to get there, and this bill draft does not do that."

McLaughlin correctly notes the hint at flexibility that I (and almost everyone else) missed. In testifying at Monday's hearing, Ed Trust's President Kati Haycock largely ignored Title I to focus on teacher issues. With the exception of data issues, the only pieces of Title I mentioned in her testimony were parts related to which teachers are where.

Hmmn...

I'll ignore the positioning/politicking questions to focus on one thing: There appears to be one less visible supporter for the rigid Five-Year Plannish elements of NCLB.

September 11, 2007

Jack Jennings is right (part 1)

I didn't have my computer for part of the evening, but I did have a way to record my thoughts on Jack Jennings's testimony yesterday at the NCLB reauthorization hearing.

NCLB hearing testimony

I'm trying to find all of the written statements for yesterday's NCLB hearing. Thus far, I have the following:


Correction: All of the testimony is listed on the committee's hearing page, which also includes a video archive of the day. Hat tip: Alexander Russo.

September 10, 2007

Stalemate talk or spin?

Another bit from Alexander Russo today, stemming from an NPR story:

[This is the] first word I've heard of that Spellings is saying she'd rather have the current NCLB than the Miller draft. Saber-rattling? Maybe. But for those who are most worried about multiple measures and all the rest, it's going to be a serious consideration.

I can't believe that Spellings would play that game, because she'll be gone within two years (sooner if she's really looking for a university leadership position). Stalemate will give more time for parents to decide that test-based statistical judgments are a poor idea. Stalemate = greater likelihood of defeat, for Spellings at least. Or stalemate shoves the responsibility for defending the current structure onto Achieve and Education Trust.

Or maybe this is Spellings' way to set up a spin when/if reauthorization doesn't happen this year, much akin to a song a friend of mine wrote: I Meant To Do That.

NCLB reauthorization hearing

Since he's a former Hill staffer, I'd pay attention to  Alexander Russo's comments on today's House NCLB reauthorization hearing.

By having everyone speak, the committee pretty much ensures a certain amount of cacaphony. And by putting Kati Haycock -- one of the draft's most vocal critics -- off in the teacher quality corner, the committee sends a clear message that it doesn't like being called out.

I'm more confident of Russo's first conclusion than his second one: NEA and AFT's representatives are on the teaching/school leadership (not teacher quality) panel. While analyzing a witness list is akin to reading tea leaves a la old-style Kremlinology, maybe that's appropriate for a law whose numerical goals seem awfully Five-Year-Plannish.

Update: podcast available! (Thank my poor ergonomic awareness last week for this one...)

September 8, 2007

NCLB "Shrinklits" spin

I've just finished a substantial detail-oriented task (took about a day or more spread over the past week), and I am just too tired right now to read and talk about the Miller-McKeon discussion draft sequel, esp. Title II. I'm far too tired now to analyze the various spins that people have tried out on the Title I part of the draft, let alone Title II. I'll offer a few Shrinklits versions, and you pick which one you want to use:

  1. It was the best of laws, it was the worst of laws.
  2. All happy reforms are alike; each unhappy reform is unhappy in its own way.
  3. Quickly, word got to the villagers and everybody in the village rushed to the newspaper to see Anansi's school listed under "needs improvement." It was such a shame for Anansi, he ran away and hid in a corner of his room. That is why he is always in the top corner of rooms and why he hides from us.
  4. As someday it may happen that a scapegoat must be found,
    I've got a little list. I've got a little list
    of overblown pol gasbags who might well be DC-bound
    and that never would be missed. They never would be missed!
  5. I am, indeed, mighty world-destroying Discourse,
    Here made part of U.S. Code for destroying the school.
    Even without Statute, none of the teachers here
    Arrayed within the foolish classrooms shall stay in their professions another 37 nanoseconds.
  6. It was a dark and stormy reauthorization.

Any others?

September 4, 2007

Miller-McKeon draft thoughts

So how was your Labor Day? I spent part of mine combing through the NCLB reauthorization discussion draft made available a week ago. (My spouse and I agree that we don't engage in paid work on legal holidays, but we're allowed to do anything that's fun or citizenship oriented. So we're well-acquainted with various loopholes... call it 'gaming the system' if you wish. I called this citizenship, not fun. Yes, I did spend time with my family, with a good book, and in something creative.)

If you want to read my scribbles, you can look at my comments on the Miller-McKeon reauthorization draft (PDF, 12 MB). The first page is my attempt to cross-reference common criticisms of NCLB with pages/sections of the discussion draft that may address those criticisms. The rest are all of the pages of the draft (well, two pages per sheet) with my comments. The file is about 150 pages long, because I didn't scan the sheets I didn't have comments on. Please remember that I was (and still am) not happy with the short turnaround time for comments, so you'll find plenty of snark and a few comments that indicate I need to look up things to check whether the draft has changed language, etc.

I hope to have something more analytical within a day or three, but the first page shows my thinking that this draft attempts to address the vast majority of criticisms in different ways. That statement doesn't say anything about how well the draft addresses criticisms, but with a few notable exceptions, the draft does tackle the well-known gripes.

The exceptions (and these are important):

  1. How NCLB has been followed by the transformation of large numbers of schools into test-prep factories. (This is separate from the issue of curriculum-narrowing.)
  2. The mandate of a limited menu of fundamentally unproven restructuring options (made even more restrictive under the discussion draft).
  3. The failure to hold SES providers accountable in a timely way.
  4. The waste of the 20% set-aside provision for schools in the "needs improvement" category.
  5. The fundamentally arbitrary nature of defining levels of proficiency.

The discussion draft fails to address any of these five criticisms. These are all substantive problems, well-known to anyone who's dealt with NCLB, and the failure to even acknowledge #1 in any way shows how the Beltway conventional wisdom has its head in the sand on test-prep. But despite my somewhat cynical disappointment on these matters, to my surprise, my impression is that the discussion draft provides a reasonable basis for negotiating reauthorization. Of the items listed above, I suspect the only non-negotiable item from the inside-the-Beltway perspective is #4, and I think that is the least important issue to address in the short term (i.e., reauthorization).

August 30, 2007

Parsing Miller/McKeon

Kevin Carey takes the first crack at a moderately-detailed description of the Miller/McKeon discussion draft of NCLB II. From the few sections I've skimmed, I think he's done a good job at description. I agree on some things, disagree on others. The one solid thing I've seen was a definition of a reasonable cohort-plus calculation of graduation rates, though it says something about Washington that this took ... how many pages?  (I forget, and I have to do something else rather urgently tonight, so you can count the pages yourself. I think it's section 1124.)

Having read through Carey's description and quickly skimmed through the assessment/AYP language, my first thought is that this draft is essentially establishing negotiations over what level of failure is politically acceptable.

August 28, 2007

Eight days to read 400+ pages

Time to call BS on George Miller and Buck McKeon: the release of an initial draft of NCLB's reauthorization was accompanied by a letter saying the public would have eight days to comment. I guess that they gave us a day beyond a week to accommodate the Labor Day holiday.

So much for wanting public input.

Update: Andy Rotherham (Eduwonk) points out that the 8-day window isn't a hardship for those inside the Beltway:

First, there has been a lot of opportunity for input so far --and interest groups working on this full time for months -- so this is not really the first cut, second a lot of the language isn't new anyway and it's 400 pages of legislative text, which is different than 400 pages of prose, and third, if people have to read a little over Labor Day that's OK, the staff working on this has worked weekends all summer, one weekend won't kill anyone,...

Okay, so those who have had advance pre-draft drafts and are used to legislative language can skim through this to spot their favorite and hated items. But this still leaves out about 290,000,000 U.S. residents who haven't had those opportunities and don't have that background. My point stands: the 8-day window is a clear indication that this is an inside game.

(And, yes, I'll squeeze in some time to read it, but I'm not doing it on Rehoboth Beach or anywhere else that the Beltway set go for Labor Day Weekend.)

Parents change their minds on teaching to the test

Since 2002, the annual fall release of results from the Phi Delta Kappa/Gallup Poll of public attitudes towards public education has become increasingly focused on NCLB. Today's release (hat tip) is no exception, and my guess is that most reporters will run with the results of the first section on NCLB and accountability.

My nomination for most significant result is from Table 14, asked of those who agreed in a prior question that "standardized tests encourage teachers to 'teach to the test,' that is, concentrate on teaching their students to pass the tests rather than teaching the subject." The majorities answering yes to that first question (in Table 13) haven't changed much between 2003 (when 68% of public-school parents and 64% of adults without children in school said yes, standardized testing encouraged teaching to the test) and 2007 (with 75% and 66% of each group saying testing encouraged teaching to the test).

While a clear majority has always seen testing as encouraging teaching to the test, American adults have changed their mind on whether that is good or not. In 2003, 40% of surveyed parents with children in public schools thought that teaching to the test was a good thing. This fits in well with arguments by David Labaree, Jennifer Hochschild, and Nathan Scovronick that a good part of the appeal of public schooling is to serve private purposes, giving children a leg up in a competitive environment. In that context, it makes enormous sense to value teaching to the test, since many parents understand how college admissions tests are related to access to selective institutions and scholarships. While 58% of public-school parents thought that teaching to the test was a bad idea in 2003, a sizable minority thought it was just fine.

That opinion has changed, dramatically. In the 2007 poll, only 17% of public-school parents thought that teaching to the test was a good thing. Fewer than one-half of one percent had no opinion, and 83% of public-school parents thought that teaching to the test is a bad thing. Adults who did not have children in school also have changed their minds, with 22% of those surveyed this year thinking that teaching to the test is a good thing.
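To put the two shifts side by side, here is a quick sketch of the percentage-point changes between 2003 and 2007, using only the poll figures quoted above. The variable and function names are mine. I leave out the "good thing" figure for adults without children in school because I quote no 2003 number for that group.

```python
# Percentage-point changes between the 2003 and 2007 PDK/Gallup results
# quoted above. Each tuple is (2003 percent, 2007 percent).
says_tests_encourage_teaching_to_test = {
    "public-school parents": (68, 75),
    "adults without children in school": (64, 66),
}

# Public-school parents who called teaching to the test a good thing.
parents_good_thing = (40, 17)


def point_change(pair):
    """Return the change in percentage points from the first year to the second."""
    return pair[1] - pair[0]


for group, pair in says_tests_encourage_teaching_to_test.items():
    print(f"{group}: {point_change(pair):+d} points saying tests encourage it")

print(f"public-school parents: {point_change(parents_good_thing):+d} points calling it a good thing")
```

The share who see testing as encouraging teaching to the test moved by single digits, while the share of parents who approve of it fell by 23 points.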

This question was asked separately from the issue of narrowing the curriculum. While there may be some spillage or confusion of issues, I think the sea change is a warning to advocates of high-stakes test-only accountability: Few parents see benefits in sending their children to test-prep factories. Fix that consequence or see the political foundations of accountability crumble.

August 16, 2007

Multiple issues in multiple measures

In his July 30 statement at the National Press Club, House Education and Labor Committee Chair George Miller said that his plans for reauthorizing the No Child Left Behind Act included the addition of multiple measures, an incantation that has provoked more Sturm und Drang in national education politics than if Rep. Miller had stood at the podium and revealed he was a Visitor from space. While Congress is in recess this month, the politics of reauthorization continue. I'll parse the debate over multiple measures or multiple sources of evidence, and then I'll foolishly predict NCLB politics over the next month or so.

The different issues

Calculating AYP

At one level, the discussion appears entirely to focus on the determination of adequate yearly progress. Add measures and you "let schools off the hook," according to Education Trust (with similar noises from the Chamber of Commerce's Arthur Rothkopf [RealAudio file; hat tip]). No escape hatch, promised Miller when asked. Maybe if you add measures, there are more ways to fail AYP, as one reporter noted at the press conference; not so, said Miller, for we'll figure out some way so that the extra measures only get you over the hump if you're almost there. Since AYP is the largest chunk of NCLB politics, all of the talking points are familiar. In the end, this piece of the debate will get bundled into the most likely package that includes growth measures.

Teaching to the test

As the Forum on Educational Accountability has argued, as well as last week's letter by civil rights groups, narrow measures of learning tend to distort how schools behave in several ways, from narrowing the taught curriculum to teaching test-taking skills and engaging in various forms of triage. One argument in favor of multiple sources of evidence is Lauren Resnick's old one, that a better test is likely to encourage better behavior by schools, both in terms of better assessments and school indicators that penalize schools for triage. To the extent that more input dilutes the incentive for systems to attend to single indicators, that may be true. On the other hand, multiple sources of evidence by themselves will not eliminate the corrupting effect of brain-dead accountability formulas, and to some extent the resolution of the debate over AYP can blunt the effect of multiple sources of evidence. On the third hand, I suspect most of those who support multiple sources of evidence are adults and prefer some improvement over none. Including multiple sources of evidence will not eliminate the deleterious side-effects of high-stakes testing, but it should ameliorate them.

Improving the quality of exams and their cost

Connecticut's NCLB lawsuit is based on the claim that the federal government has not provided enough support for the state to develop its performance-heavy exam for all the required grades. The feds allegedly told Connecticut that it doesn't need to use the performance-heavy exams, claiming that an off-the-shelf commercial test system would work just fine. After investing state money and political capital in the performance exams, Connecticut officials were rather peeved. The Title I Monitor nailed this issue in May, noting that the argument over multiple measures is in part a matter of the quality of assessments and cost. The Monitor also noted a level of denial in the US Department of Education that should be familiar to Bush-watchers:

[A] senior ED staffer acknowledged the benefits of states using varying assessment formats compared to a single test, but challenged the idea that costs and timelines are a barrier to states developing tests with multiple formats.

And the escalation in Iraq is currently providing an environment conducive to the reconciliation of factions. Right. Officials from a variety of states and a number of players in Washington agree that NCLB has essentially stressed if not broken the testing industry's credibility and infrastructure, and the inclusion of multiple measures is part of the negotiations over how much Washington will pay for better assessments.

Reframing accountability

One doesn't have to agree with George Lakoff's version of framing to recognize that the politics of accountability are driven by assumptions about the need for centralization and authoritarian/bureaucratic discipline. These themes are obvious in the dominant inside-the-Beltway narrative about NCLB: We can't trust the states. The best argument for this position is Jennifer Hochschild's thesis in The New American Dilemma (1984), a claim that sometimes we need a non-pluralistic tool to advance democratic aims, a contradiction she saw in desegregation. But we don't have an open debate about this dilemma. We didn't have it about desegregation, and we certainly don't have it about accountability.

Instead of reflecting some honesty about policy dilemmas, the arguments defending No Child Left Behind today are generally at the soundbite level. A common metaphor used by many supporters of NCLB relies on time, such as the Education Trust's organizing an administrators' letter several years ago warning against a thinly veiled attempt to turn back the clock. A step forward is another phrase that the same letter uses to describe NCLB, and Education Trust's response to the Forum on Educational Accountability proposals describes them as a giant step backward. This is an ad hominem metaphor: It says, "Our opponents are Luddites. They are not to be trusted to defend anything except their own narrow and short-sighted interests."

The other language commonly used by NCLB supporters is a simple assertion that they own accountability. Anyone who disagrees with them is against accountability. Together, these bits of accountability language imply that there is one true accountability and that NCLB skeptics like me are apostates or blasphemers. Pardon me, but I don't believe in an accountability millennium. 

To shift the debate away from accountability millennialism, critics of NCLB have to provide a counter-narrative. Both the August 7 civil rights-group letter and the August 13 researchers' letter (or the letter signed by mostly researchers) describe the current NCLB implementation with words such as discourage, narrowed, and fail. In its August 2 recommendations for reauthorization, the Forum on Educational Accountability uses the words build, support, and strengthen. The Forum and August 7 letter also use a single word to describe the best use of assessment: tool. In their recommendations, the Forum and its allies use an architectural metaphor: we need to strengthen the system while keeping it mostly intact. The criticisms directed against multiple-choice statistics aren't part of that story, though I suppose a purist would insist on that, somehow described as undermining foundations, eroding under the foundation, blowing out a window, or somesuch.

I don't know to what extent the debate over multiple measures will shift debate, but it is potentially the most far-reaching of the consequences of the letter.

Where we're headed in the short term

My guess is that Miller's September draft will bless consortia of states that develop assessments with more performance, authorize funding for more (but not all) of that test development if small states work in consortia, and promise to pay for almost all of the infrastructure needed to track student data.

We will also see the true character of high-stakes advocates in Education Trust and the Chamber of Commerce. The Education Trust is now under the greatest pressure of its existence over both growth measures and the issue of multiple measures. In Washington, almost no one gets their way all the time. How people negotiate and handle compromise reveals their true character.

August 7, 2007

A conversation with Doug Christensen

The audio (mp3) of a discussion last Saturday among Nebraska Education Commissioner Doug Christensen, Maryland teacher Ken Bernstein, and me is now available online. The discussion was recorded at a session of YearlyKos in Chicago.

August 6, 2007

Framing NCLB debates

Matthew Yglesias has a point about the details of NEA's No Contractor Left Behind flyer passed out liberally at YearlyKos this weekend. Yglesias notes that the message of the flyer relies on sloppy reasoning and is more sensationalist than sensible.

I'm worried by something else about the flyer: it's irrelevant to NCLB policy debates. As I've argued before, you can agree with the conflict-of-interest argument 100% and decide that the appropriate response is to build in more procedural safeguards against such dealings, not change the structure of NCLB. Fundamentally, it's a waste of NEA's resources to push this, and as a member, I'm ashamed at the poor decision-making.

But I think I understand why NEA staff have still diverted resources to it: it holds a certain appeal for those of us angry with the Bush shenanigans. Mike Klonsky's entry on the matter demonstrates the appeal that the flyer holds for some.

(Incidentally, for those who know of Yglesias's relationship with Sara Mead, this isn't a devious insider plan to discredit the NEA. If I were really devious and wanted NCLB to be reauthorized intact, I'd encourage the NEA to waste even more resources on this nonsense. There are real conflicts of interest, but that's not a wise political focus if you want to change policy structures.)

And now, back to editing a 104-page manuscript for EPAA. It's a good one, but as I've discovered the efficiency of giving suggestions on accepting a manuscript, it's labor-intensive. I need to take breaks from the close reading/editing, and the blog will get the benefit of that.

August 4, 2007

YearlyKos

I'm @ the Midway airport, waiting to board. Good trip, combining friends, touristing, politics, union stuff, academics, and even a bunch of cool new t-shirts. Packed 50 hrs!

I'll blog more extensively when I get home and take care of some other tasks, but I had a good time. The session with Nebraska's commissioner of ed was the most consistently substantive, and an unexpected surprise was listening to a conversation between him and George Lakoff. More later!

P.S. No education question in the first part of the pres. candidates' forum, before I had to leave for the airport.

August 1, 2007

The celebrity-faculty fallacy

As noted in The Gradebook, Stanley Fish has now waded into ($) the Florida higher-ed funding battle.  Like many of our fellow Florida faculty, Fish says we can't simultaneously have great, universal, and really cheap higher education. Yet Kevin Carey has a point: Fish's proposed solution is a search for celebrity faculty:

Five straight years of steadily increased funding, tuition raises and high-profile faculty hires would send a message that something really serious is happening. Ten more years of the same, and it might actually happen.

Fish followed the same formula when Arts and Sciences Dean at UIC. A large part of his modus operandi was symbolic and cultural, but a substantial chunk was trying to snag Big Fish. Fish's fishing spent resources that could have been used to hire and reward wonderful and less-famous faculty.

Florida has tried the Famous Faculty Fishing expedition before, among other things with FSU hiring Nobel Prize winner John Robert Schrieffer, who later killed people while driving. His shenanigans are proof that neither universities nor famous faculty are idiot-proof. There is a point in recruiting famous people, so long as the resources devoted to such efforts do not drain the ability of an institution to reward and retain the vast majority of faculty who neither win Nobel prizes nor write best-sellers.

Florida loses 15% of its faculty every year, essentially serving as a farm league for other regions. Hiring a few famous faculty will not stop that attrition, and if it absorbs too much of the university system's resources, such a concentration of resources will prevent us from holding onto the hundreds of darned good faculty we already have.

Sara Mead: We gotta walk the walk

Last night, former Education Sector staff member Sara Mead wrote an important blog entry as a guest for Eduwonk. She pointed out that teachers unions are not the opponents to a number of reforms, such as charter schools and performance pay, supported by Andy Rotherham (Eduwonk), Joe Williams (Democrats for Education Reform), and others.

Liberal education reformers need to win hearts and minds by engaging with reform-wary lefties and taking their concerns seriously--not just calling them union hacks or accusing them of not caring about kids. We need to engage in honest self-reflection and be willing to make changes in response to valid critiques from the left. We need to avoid allying with "friends" who undermine our credibility as proponents of social justice. We need to make common cause with other progressive advocates for kids--those working on health care, childcare, and juvenile justice, for instance--rather than undermining them.

We'll see whether others share Mead's perspective; I hope so, but I am especially skeptical that Williams will change his standard rhetorical approach.  Whether that cripples the organization he directs is an open question.

July 30, 2007

George Miller press conference

I'm listening right now to California Rep. George Miller's press conference previewing his NCLB reauthorization bill. My first impression is that he's overpromising. (He's also talking too quickly, but his audience is a group of reporters, and the faster he talks, the more questions they can ask.)

Here come the questions, reported here as topics and answers.

  1. Timing on the bill in the House: Miller waffles while saying the goal for passage out of the House is still September.
  2. Who would decide what a "better test" is: Miller says he is responding to claims that state tests are of poor quality and acknowledges the need for funding for such tests. He says that the bill has a provision for K-12-university-business partnerships to create tests that assess "college-ready" or "work-ready" skills and knowledge. In other words, he didn't respond to the question, other than saying he wasn't for national standards/tests.
  3. Other measures' relationships to AYP and growth: Miller talks about a college-prep curriculum, etc.  In response to a follow-up to the waffling, Miller says schools would still have to perform well on reading and math tests, and adding other measures is "not an escape hatch."
  4. Is the bill bipartisan: Miller says yes.
  5. Performance pay and test scores: Miller says some portion "has to be tied to student achievement," and he refers to "a growth model." "We would honor... collective bargaining agreements [and would not] upset those." He says he understands the reservations given the history of merit pay as an "arbitrary system of rewarding friends." He then talks about needing to create careers for teachers that look like other careers where teachers can be rewarded for their efforts, time, etc.
  6. Choice options: Miller says it's under discussion, not resolved, about supplemental education services and public-school choice. He acknowledges the difficulty of providing choice in "jammed" districts and says the bill will reverse the order of interventions. He says he is concerned with the lack of accountability for supplemental education services.
  7. English language learners and testing: He implies that tools exist for assessing the skills of English language learners, including tests in other languages (and he notes that other countries somehow have assessment in non-English languages, because most of them don't have English as the official language). Miller mentions the "p" word (portfolios). Wow.
  8. Administration response: Miller discusses ongoing discussions with US DOE and "talks nice" about his relationship with Secretary Spellings.
  9. Spending issues: Miller says he doesn't know what additional spending is required by the bill. (WHAT???) He then talks variously about "strategic investments," the lag in spending after the first year or two of NCLB, assistance to students who move, formative testing, and then blathers a bit about the need for data systems. "There's no point in going to a growth model if you don't know where your students have been." Then he mentions supplemental appropriations for education, I think gratuitously.
  10. Portfolios? George Miller says he knows that's a minefield and will have to get back to the reporter. (Okay, it only took three questions for someone to follow up.)
  11. If additional data is not "an escape hatch" on accountability, doesn't that just add more ways for schools to fail? Miller says no... and doesn't say anything substantive. He acknowledges such a system is "not easily constructed." In response to an inaudible follow-up, Miller says a student would have to be close on reading or math, and the system would have to be (nonspecifically) complementary.
  12. Is bipartisanship still possible? Miller says "This bill will test that." He then waxes optimistic. Laugh line: "There are no short answers from me. I'm the Joe Biden of education."
  13. Rural districts and flexibility: Miller waffles for a few minutes.
  14. Adding assessments: Miller says that's a state decision. (I don't understand the context for this answer.) Miller then starts talking about teaching to the test (finally!). Miller acknowledges the problem but claims schools have been successful on the narrow measures without narrowing the curriculum. Miller says "more time on task" in reading and math isn't all that education should be, but then repeats Riley's "learn to read, then read to learn" adage.
  15. Specialized support services for health (nurses, psychologists): No.

Later today there's supposed to be a written summary of bill provisions on Miller's website (or maybe the committee website).

July 21, 2007

Reading... but not what you think

Yes, I was at a local bookstore at midnight, getting two copies of Harry Potter and the Deathly Hallows. But I'm in my office this afternoon, reading student papers. I don't get the Harry Potter until I'm done.

In other news, Jeff Solochek reports correctly that I'm now the representative of the Florida Coalition for Assessment Reform on the Florida DOE's advisory committee looking at the FCAT. I've worked with FCAR co-founder Gloria Pipkin before on a few matters, and I was flattered to receive her request. This is an interesting challenge for me, and Gloria and I took a few steps to make sure that the FCAR board was comfortable with my particular take on accountability.

There are a few things that Solochek didn't get correctly. I don't think of myself as an "FCAT critic" but a critic of the current uses of the FCAT. The conflation of the test with the policy is interesting...

The more serious problem is the way that the Gradebook's thumbnail of my portrait is all fuzzy. You can compare it to the image in the top left corner of this page and see what you think. But I understand the need for thumbnails, and I am here providing a slim, 100-by-100 portrait that should accommodate virtually any blog's storage limits:

[Image: Simpsonized Dorn portrait (after Simpsonization)]

Enough silliness.  Back to reading!

July 18, 2007

NCLB identifies wrong target for students with disabilities

Erin Dillon's short piece yesterday, Labeled: The Students Behind NCLB's 'Disabilities' Designation, is a response to criticism of NCLB as unrealistic about the achievement of students with disabilities. Dillon argues that because approximately half of students with disabilities are identified as having learning disabilities, and because of the overrepresentation of minorities in special education, the critics are wrong.  Specifically, she writes that "the majority of special education students have disabilities that do not preclude them from reaching grade-level standards."

There are several issues here:

  • Do schools use special education as an excuse not to educate students identified as having disabilities?
  • Should schools be pushed to educate students with disabilities better?
  • Can students with disabilities reach the proficiency standard identified by states?
  • Is NCLB the best current tool to prod states and schools to educate students with disabilities better?

Dillon's answer to all of those questions is yes, and the clear implication is that the answers are linked: either you think the answers to all of the questions are yes, or you think they are all no.

I disagree with that assumption. More specifically, I'd say yes, yes, sometimes, and no. Let's at least acknowledge the fictitious nature of grade-level standards; in reality, states set arbitrary proficiency thresholds, but we can agree they divide the range of achievement into two ordinal categories. Given that fact, there is no guarantee that such thresholds are plausible for all students, regardless of the help provided. NCLB critics are correct in pointing out that 100% proficiency is an unrealistic standard in itself.

That fact does not mean that schools should be let off the hook, and NCLB's defenders are correct that having different standards for students with disabilities is dangerous. Yet you have to have different standards. And in Accountability Frankenstein, I have acknowledged the implausibility of using the response to formative assessment as a summative tool. 

A plausible way out is to allow students with disabilities to take different grade-level tests under a few conditions:

  • The student then follows the sequence of grade-level tests up each year (so that if a student is taking a 3rd-grade test in 4th grade, it's a 4th-grade test the following year, etc.)
  • There are negotiated limits on the proportions of students allowed to take tests 1 or more years behind grade level
  • There is research to document what proportion of students we should expect to need behind-grade-level tests, with such research informing future limits on such exemptions.

If we are stuck with mediocre to awful annual testing, we should at least do it as sensibly as possible.

Mea culpa: I misread Ms. Dillon's name as Eric. My apologies!

Update: Dillon responds.

July 16, 2007

NCLB reauthorization blog at Ed Week

There's a new Ed Week blog: David Hoff's NCLB: Act II, entirely about reauthorization.

In many ways, this is a wise focus for an education beat blog. Several other blogs have died out when key reporters have moved on, as they typically do after some (relatively short) period of time. Having a blog with a relatively clear criterion for the