June 22, 2006

Mishel on Swanson

In correspondence, Larry Mishel sent me material over the last few days. With his permission, I am posting his material below. In the meantime, of course, the USDOE put out its report today on the "Averaged Freshman Graduation Rate", which is also based on CCD figures (hat tip: Andrew Rotherham). Of those, the numbers for Nevada (dropping precipitously in one year) don't look right.

But, to Mishel's comments, below the fold. I'll say this about using grade-based data from the CCD: one of the problems we've discovered this week is how friable that data is (and that's something that Chris Chapman and his coauthors at NCES noted in the AFGR publication today). But the other is a problem Mishel notes with using 9th grade enrollment data, which conflates first-time high school students with those who are repeaters. When I first came back to the quantification of attainment a few years ago, I tried to model retention issues, using both state-provided data (for a few states) and then thinking about it as a stable-population problem. Neither approach was satisfactory. That's why Warren uses 8th grade enrollment. I'm sticking to age-based data where available.

In any case, you can judge the issues for yourself, on the jump.

From Mishel:

The Bulge and Retention

Let me see if I can convince you that retention and the ninth grade bulge are serious problems for Swanson. Recall that his formula iterates declines in enrollment back from diplomas to 9th grade. Every decline in enrollment from year to year counts as dropping out. It is well known (by Jay Greene, who acknowledges it, and by everybody else but Swanson) that enrollment in 9th grade is far above that of eighth grade (hence a ‘bulge’) because a lot of students, especially minorities, are retained in 9th grade; the bulge is about 12-13% overall and 25% for minorities. This means that 9th grade enrollment is far above the count of entering ninth graders, which gives Swanson’s formula the equivalent of a far too large denominator. It is easy to see how large the bias is by just extending his formula back to 8th grade: it shows an eight percentage point higher graduation rate and a twelve percentage point higher minority graduation rate (see our book Table 10, page 64).
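To make the denominator point concrete, here is a minimal sketch using a simple diploma-to-9th-grade ratio rather than Swanson's full formula. The cohort size and diploma count are hypothetical, chosen only for illustration; the 13% bulge is the overall figure quoted above, and treating the bulge as the excess of 9th grade enrollment over entering ninth graders is an assumption of the sketch.

```python
def simple_grad_rate(diplomas, ninth_grade_base):
    """Simple diploma-to-9th-grade ratio (not Swanson's full formula)."""
    return diplomas / ninth_grade_base


# Hypothetical cohort, for illustration only.
entering_ninth_graders = 1000  # first-time 9th graders (assumed)
diplomas = 750                 # diplomas eventually earned (assumed)
bulge = 0.13                   # 9th grade enrollment 13% above entrants

# Retained students inflate the enrollment count used as the denominator.
ninth_grade_enrollment = entering_ninth_graders * (1 + bulge)

rate_on_enrollment = simple_grad_rate(diplomas, ninth_grade_enrollment)
rate_on_entrants = simple_grad_rate(diplomas, entering_ninth_graders)
understatement_points = 100 * (rate_on_entrants - rate_on_enrollment)

print(f"Rate per enrolled 9th grader:   {rate_on_enrollment:.1%}")
print(f"Rate per first-time 9th grader: {rate_on_entrants:.1%}")
print(f"Understatement: {understatement_points:.1f} percentage points")
```

With these made-up inputs the same diplomas produce a visibly lower rate once repeaters are counted in the base; the size of the gap depends entirely on the assumed numbers.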

The Texas example shows how misleading his formula can be. I chose Texas both because it has very high retention rates and because Texas provides data on retention by grade by race. [He attached a file related to retention data in Texas.] The first table (page) basically shows that the ninth grade bulge, that is, the extent to which 9th grade enrollment exceeds 8th grade enrollment, is fully explained by retention (in some states there may be issues of transfers into public schools from private schools).

The following table shows the impact on Texas rates and on the comparison of Texas to the nation. The first column reproduces Swanson’s published rates for 2001, which assume that all ninth grade enrollment is ‘first-time’. The second column uses published Texas data on retention to recompute graduation rates per ‘first-time’ ninth grader (we are employing a simple diploma to 9th grade ratio to keep things simple). These calculations show that Swanson understates graduation rates in Texas by wide margins (13 and 14 percentage points for blacks and Hispanics) and overstates the race/ethnic gaps by 6-8 percentage points (increasing them by more than half their value). These are pretty large errors.
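The mechanism behind an overstated gap can be sketched with the same simple diploma-to-9th-grade ratio. The two groups and all counts below are hypothetical, not the Texas figures; the only point is that when one group has more ninth grade repeaters, ignoring retention lowers its measured rate by more and so widens the measured gap.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    ninth_enrollment: int  # all enrolled 9th graders, repeaters included
    repeaters: int         # 9th graders retained from the prior year
    diplomas: int

    def naive_rate(self) -> float:
        # Treats every enrolled 9th grader as first-time.
        return self.diplomas / self.ninth_enrollment

    def corrected_rate(self) -> float:
        # Uses first-time 9th graders only as the base.
        return self.diplomas / (self.ninth_enrollment - self.repeaters)


# Hypothetical counts, for illustration only.
group_a = Group("A", ninth_enrollment=1000, repeaters=100, diplomas=700)
group_b = Group("B", ninth_enrollment=1000, repeaters=250, diplomas=500)

naive_gap = group_a.naive_rate() - group_b.naive_rate()
corrected_gap = group_a.corrected_rate() - group_b.corrected_rate()

for g in (group_a, group_b):
    print(f"{g.name}: naive {g.naive_rate():.1%}, corrected {g.corrected_rate():.1%}")
print(f"Gap: naive {naive_gap:.1%}, corrected {corrected_gap:.1%}")
```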

Because states and districts vary so much in the extent of retention these errors in Swanson’s formula are larger in some places than others: therefore, Swanson’s measure generates faulty comparisons across jurisdictions. Consider a comparison of Texas to the nation in the last columns. By Swanson’s measure Texas is below the national average but with a corrected measure Texas is substantially above the national average and has smaller race/ethnic gaps (though these national numbers are biased, as well, because of retention, but not as much as Texas).

Table 2. Bias in Swanson Measure from Ignoring Retention in Texas

Columns: Population | Swanson measure* | Corrected for retention (uses first-time 9th graders) | Bias | National average | Texas relative to national average: Swanson measure, Corrected measure
Rows, repeated for each of four population groups: 9th grade enrollment; First-time 9th graders; Graduation rate
[The cell values of this table are not reproduced here.]

* Column Source: Christopher B. Swanson, Who Graduates? Who Doesn’t? A Statistical Portrait of Public High School Graduation, Class of 2001 (Washington, DC: Education Policy Center, The Urban Institute), Table 4.

Swanson versus NYC Longitudinal Data

One way to check on whether Swanson's measure of graduation correctly estimates graduation rates is to compare it to other measures for the same location using the same underlying student records. New York City provides such a possibility. Several newspaper articles point to differing estimates of graduation from the city data, the state data and Swanson.

Our purpose here is to create an apples-to-apples comparison between Swanson and the city school district data. So, since Swanson’s measure of graduation counts all diplomas, no matter when earned, we use a comparable measure from the school district data. To avoid issues of whether to count GEDs or not, we make comparisons with diplomas only and exclude GEDs.

The school district data are based on following individual students through a longitudinal data system. Swanson bases his estimates on enrollment counts in 9th and other grades and counts of diplomas each year.

New York City Longitudinal Data

I start from the fact that NYC reports graduation rates, excluding GEDs, of about 60%, as calculated below. The rate with GEDs would be 7.0 percentage points higher. Swanson reports 39% and Greene, 43% for the class of 2001. That’s a huge difference. It is easy for me to identify ways that Swanson’s and Greene’s estimates inappropriately and artificially lower the measured graduation rate.

One can get the necessary information for constructing the following table from the report for 2001:

Population          N        % Total   % Grand total
Dropouts            19,748   32.0%     25.2%
Other discharges    16,827             21.4%
Grand Total         78,456             100.0%
Source: pages 3 and 5

This table shows where we get our figure from. We take the reported graduates from Figure 1 (page 3) and subtract the GEDs from Table 1 (page 5). This gives us a rate of 60.9%, the longitudinal rate that eliminates GEDs. This rate includes graduation in 3, 4, 5, 6 and 7 years (nearly all are within five years). However, so do Swanson’s diploma counts!

Some people have questioned whether some of the students identified as ‘other discharges’ are really dropouts, but not counted as dropouts in the NYC data. This is the tricky part for school districts in compiling longitudinal graduation rates—they essentially have to determine how many students left their system and which ones should be considered dropouts or legitimate transfers to other districts, etc.

I’m not sure how one can identify how many are falsely labeled a discharge versus a dropout. We can assess the extent of any possible bias by making the extreme assumption that all the ‘discharges’ are dropouts. If so, the graduation rate would be 47.9% according to my calculations in the table. That is still above Greene’s rate of 43% and way above Swanson’s 39%. Yet, the rate must be somewhere between the 47.9% rate and the official 60.9% rate since surely some of the discharges are appropriately classified as such.
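The bounds can be roughly re-derived from the counts in the discharge table. This is a sketch under two assumptions: that the "% Total" column uses the grand total minus other discharges as its base (which the 32.0% dropout share is consistent with), and that the diploma count can be backed out from the rounded 60.9% rate, since the count itself is not in the table. Because of that rounding, the lower bound comes out within about a tenth of a point of the 47.9% quoted, not exactly on it.

```python
# Counts from the discharge table above.
grand_total = 78_456
other_discharges = 16_827
dropouts = 19_748

# Official diploma-only rate quoted in the text (rounded).
official_rate = 0.609

# Assumed base for the official rate: cohort excluding other discharges.
cohort_excl_discharges = grand_total - other_discharges

# Consistency check against the table's two dropout percentages.
dropout_share_total = dropouts / cohort_excl_discharges
dropout_share_grand = dropouts / grand_total

# Approximate diploma count implied by the rounded official rate.
implied_diplomas = official_rate * cohort_excl_discharges

# Extreme assumption: every 'other discharge' is really a dropout,
# so all of them go back into the denominator.
lower_bound = implied_diplomas / grand_total

print(f"Dropout share of total:       {dropout_share_total:.1%}")
print(f"Dropout share of grand total: {dropout_share_grand:.1%}")
print(f"Lower bound:  {lower_bound:.1%}")
print(f"Upper bound:  {official_rate:.1%}")
```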

We can also adjust these data to include special education. Figure 5 says that there are 1,092 (city-wide special ed) who graduated at a 35.5% rate and 4,359 in self-contained classes who had a 38.3% rate. If we include the special education graduates among the graduates and add the total special education enrollment to the grand total we get a graduation rate of 47.2%.
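The special education adjustment can be sketched the same way. As assumptions of this sketch, the general education diploma count is backed out from the rounded 60.9% rate and the counts in the discharge table, and the two special education rates are treated as comparable diploma rates; neither the diploma count nor the special education graduate counts are given directly in the text.

```python
# Counts from the discharge table.
grand_total = 78_456
other_discharges = 16_827

# Diploma count implied by the rounded 60.9% rate (an approximation).
implied_diplomas = 0.609 * (grand_total - other_discharges)

# Special education groups from the text: (students, graduation rate).
special_ed = [
    (1_092, 0.355),  # city-wide special ed
    (4_359, 0.383),  # self-contained classes
]

sped_students = sum(n for n, _ in special_ed)
sped_graduates = sum(n * rate for n, rate in special_ed)

# Add special ed graduates to the numerator and special ed enrollment
# to the grand total, still counting all other discharges as dropouts.
adjusted_rate = (implied_diplomas + sped_graduates) / (grand_total + sped_students)

print(f"Special ed students:  {sped_students:,}")
print(f"Special ed graduates: {sped_graduates:,.0f}")
print(f"Adjusted rate:        {adjusted_rate:.1%}")
```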

This suggests to me that the NYC grad rates are substantially higher than Greene’s and Swanson’s estimates. My calculation is that when one includes all of the special education and assumes all other discharges are dropouts one still finds a grad rate of 47.2%. The graduation rate actually lies somewhere between 47.2% and about 60%. Unfortunately, we do not have the data to make calculations by race and ethnicity. Anyway, a grad rate between 47.2% and about 60% may be nothing to brag about but it still shows that Greene’s and Swanson’s estimates are way off base.

I’m struck by your statement that Swanson’s biggest problem is no migration adjustment. [Sherman here: in correspondence, I explained that my initial impression was that this was the major problem with Detroit. The major problem with Detroit is awful data.] I became skeptical about these population adjustments once I realized that they incorporate new immigrants along with transfers in and out, not by design but because there’s no way to separate out immigrants (correct?). At the national level the population adjustment is only immigration. That’s why I don’t understand why you think a Warren estimate at the national level is at all valid (or at least can be compared to longitudinal data of students starting in 8th grade or some other starting point (NLSY)).
Posted in Research on June 22, 2006 4:08 PM