February 1, 2010

Grading the "Grades" reports

I'm back from Toronto today--had a great time talking with Canadian faculty, had my head chewed off in a thoroughly polite, Canadian way for one bone-headed error I made in discussion, survived subzero temperatures for a few mornings, and completely failed to enter the Hockey Hall of Fame building--and back in Florida the temperatures are a bit lower than average for this time of year, but discussion of the Ed Week Quality Counts "grades" given to Florida is apparently heating up. So maybe I need to revisit my idea from last summer of grading the grading reports. Last June, I pointed out that professional grading practices generally provide scoring criteria in advance, so that those who are being evaluated will have a chance to... you know... meet the standards. Let me list all of the facets on which I think one can grade such "grade reports" of states and the like:

  • Purpose. Is there a clear public rationale for issuing such a report? How broad or narrow is the public purpose?
  • Scoring criteria. Described in June.
  • Description of sources and analysis. How systematic is the collection of source material (as opposed to anecdotal or convenience sampling)? Is there a clear chain described from collection of data to the application of labels? Is there a discussion of relevant caveats/alternatives?
  • Robustness of sources. Are the sources publicly verifiable or replicable? Are they subject to gaming, falsifiability, or manipulation?
  • Relevance of sources. Is the material relevant to the criteria, and does the "grade report" use the most relevant obtainable information? Is the source information analyzed appropriately to warrant the application of the grade labels?
  • Sponsorship. Are funding sources and potential related interests stated clearly? Is there a separation between the real or likely perceived material interests of sponsors, on the one hand, and editorial control of the project?

My impression is that different types of periodic "grading" exercises have different types of weaknesses. An advocacy organization whose reports rely on anecdotal evidence and give higher grades to states that lean further toward its position might receive lower grades on description of sources and analysis and on sponsorship than in other categories. A news organization that makes millions of dollars by selling a volume ranking colleges and universities using reputational surveys of institution heads and data on institutional wealth is likely to receive low grades on public purpose, robustness of sources, and relevance of sources. A news organization that ranks states on categories that change every year, using no apparent criteria beyond ones that also change every year, is likely to receive its lowest grades in the areas of scoring criteria and description of sources and analysis.

As a faculty member who has assigned thousands of grades to students, where the grades affect student progress towards degrees and financial-aid eligibility, I know from experience that the process of grading is imperfect and in my field depends on judgment rather than objective cut-and-dried methods. That's why I state criteria as early as I can, display model work from prior semesters if possible (with the permission of their creators), answer questions about assignments, look at drafts, structure revision opportunities into a number of courses, and always let students correct me when they document that I have recorded individual assignment grades incorrectly. I know from student complaints about grading in general that they hate being judged on criteria that they feel the evaluator keeps secret, or that are designed to make the evaluator look good, or that serve some purpose other than the ones grading is generally accepted to serve. In other words, if you're going to assign grades, especially if the clear intent is to shame certain entities into changing, you need to take at least a few minutes of care to address common-sense ethical expectations. I'd have far more patience with these publicity-seeking exercises if there were more care evident in the process.

