Zombie stats: “dropout rate” as a case in point

Education has a whole host of statistics that have been unreliable, unnecessary, or off-target for years, and that nonetheless continue to be created, published, and reported on. “Dropout rate” is one of those. It has been around for more than forty years, crafted in the late 1960s when there was no way to find out what proportion of students graduated from individual schools or school districts. Yet some districts said they counted those who left school without plans to return: dropouts, to use a term that had just recently become the dominant way Americans talked about teenagers who left high school before graduation. And because there was visible concern about dropping out, some tried to measure the phenomenon.

The measure was bad, and those who created it probably knew it was bad: divide the number of students that a school says dropped out in a year by the school's total enrollment. At the time, there were few other choices. There are several reasons why this so-called dropout rate is a bad measure:

  • It does not measure what people are interested in, the proportion of teenagers who graduate or its inverse, the proportion of teenagers who never graduate. That is the common-sense question parents, administrators, and the public have. Dropout rate does not measure that. It is sort of like measuring the crude death rate (deaths divided by the population) without attempting to measure life expectancy.
  • Changes in the statistic are often unrelated to the critical measure of interest (what proportion of teenagers are graduating from high school). A declining dropout rate might mean that fewer students are dropping out and thus more are likely to graduate. Hurray! But it might also reflect unrelated increases in high school enrollment (the denominator). Enrollment can increase because of growth in the teenage population (through births more than a decade before, or through migration). It can also increase if schools are failing more ninth-graders, so a higher proportion of students are taking more than four years to finish. Reverse all of this, and you can have an increasing dropout rate without any change in how many teenagers graduate. [1]
  • It is vulnerable to incompetent or corrupt administration: it requires accurately classifying people who leave school. I can create a protocol for schools to use to do this well, which would require that a school classify anyone who has been absent for three weeks as a dropout unless there is a documented reason to classify the person otherwise (e.g., independent confirmation of enrollment in another school), but that was never done in the 1960s. Houston schools clearly falsified those data in the late 1990s, as was discovered after the “Houston Miracle” became one of former President Bush’s talking points in favor of NCLB.
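
To make the denominator problem concrete, here is a minimal sketch in Python. Every number and name in it is hypothetical and chosen only for illustration, and the model is deliberately crude: it assumes a steady state in which each entering class is the same size, and it ignores the way dropouts themselves shrink enrollment. It compares the same school with and without a group of students repeating ninth grade.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchoolYear:
    """A toy steady-state high school (all figures are made up)."""

    entering_cohort: int        # students entering ninth grade each year
    never_graduate: int         # students per cohort who eventually drop out
    ninth_grade_repeaters: int  # students repeating ninth grade this year

    def enrollment(self) -> int:
        # Simplification: four full cohorts plus repeaters; attrition is ignored.
        return 4 * self.entering_cohort + self.ninth_grade_repeaters

    def dropout_rate(self) -> float:
        # The statistic described above: dropouts in a year / total enrollment.
        # In a steady state, one cohort's worth of dropouts leaves each year.
        return self.never_graduate / self.enrollment()

    def cohort_nongraduation_share(self) -> float:
        # The question people actually care about: what share never graduates?
        return self.never_graduate / self.entering_cohort


before = SchoolYear(entering_cohort=300, never_graduate=60, ninth_grade_repeaters=0)
after = SchoolYear(entering_cohort=300, never_graduate=60, ninth_grade_repeaters=75)

for label, year in (("before", before), ("after", after)):
    print(
        f"{label}: dropout rate = {year.dropout_rate():.3f}, "
        f"share never graduating = {year.cohort_nongraduation_share():.3f}"
    )
```

In this sketch the published dropout rate falls once more ninth-graders are held back, while the share of each entering class that never graduates does not move at all.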

Last week, some reporting highlighted decreases in nationwide dropout rates, first picked up by Richard Fry at Pew, then Libby Nelson at Vox, and Alexander Russo. Because a 1988 law required that the federal government publish reports on dropout rates, the Census Bureau began doing so, and has continued to do so; that annual report is the reason for the reporting last week. Now, we have much better data on graduation rates, but we still have reporting on dropout rates because of the machinery of a required federal report, and school districts and states often still publish dropout rates based on (highly flawed) administrative data.

In the case of the federal data, the long-term trends of Census-based dropout measures are highly correlated with the true measure of interest. But dropout rate is no longer necessary to calculate, and it has all sorts of potential for mischief when schools publish it based on administrative data. It is a zombie measure; I wish reporters would stop volunteering their brains for its continued existence.



  1. This slip is also common with crude death rates, which can be affected by the age distribution of a population. Japan has a very high life expectancy. Pakistan has a lower crude death rate than Japan, not because people live longer in Pakistan but because the population is much, much younger.

One response to “Zombie stats: “dropout rate” as a case in point”

  1. Paul Bielawski

    The progress that has been made is that the graduation rates by state, district, and school are much better because there are common administrative definitions that require documentation of transfers, etc. No Child Left Behind required states to use their student databases to set an on-time graduation date for each student and to document transfers. In Michigan we audit “high risk” transfers such as out-of-state transfers and transfers to nonpublic schools. Most states also publish dropout rates. These are not the inverse of the graduation rate because some students stay in school after the expected on-time graduation date. A further point is that most graduation and dropout rates start by setting an expected on-time graduation date based on initial enrollment in ninth grade. Students who do not make it to ninth grade are ignored. Another criticism is that many students, particularly special ed students and ELLs, wind up graduating with more time.

    Most states have been publishing reliable graduation and dropout rates using cohort methodology since the class of 2007 or so. We now have quite a history. The data are audited and are often reported as part of a “school report card.”