Learning from the “failure” of polling

The tie between [Progressive-era] administrative authority and the discourse of special education lay in three connected features: the objects of study in the field, the evangelism of experts embedded in personal and professional networks, and the technical tools that experts and their public partners used in practice. We can call this set a triangle of expertise: objects, experts, and tools. – Artiles, Dorn, & Bal (forthcoming, Review of Research in Education)

After the failure of most polling aggregators this week, I am not all that surprised that some observers of education, whether Harry Boyte, Audrey Watters, or others, have taken it as a warning about the flaws of big data in education. As someone who has written a bit about the history of expertise in education (yes, the block quotation above is a bit of a tease about a future article), I am sympathetic to that skepticism. Yet that is not the only conclusion one can draw, and it is important to consider alternative arguments.

First, the alternatives on the election:

  1. Some folks got it right. Or rather, a compilation of “some folks” got it right. In the spring and summer, Vox Media’s leadership put together an ensemble model of the election, drawing on a set of empirical findings from political science about national elections. That ensemble model predicted a narrow Republican victory back in the summer. Instead of believing the model, however, Vox’s reporters changed their reporting frame, using the model as a baseline for claiming that Trump was underperforming a nominal Republican candidate. Except he wasn’t, really, and the story of that decision (and the initial reactions of Vox’s editorial leadership) is in the most recent episode of “The Weeds,” Trumpocalypse Now.
  2. The polls were collectively wrong by a fairly normal amount. One poll aggregator correctly pegged the uncertainty: Nate Silver’s model ended up giving Trump roughly a 30% chance of winning, and Silver was much closer to reality than Sam Wang, who was still defending his 99%-Clinton-win-probability claim last weekend.1 As Andrew Gelman reported, David Rothschild and Sharad Goel pointed out that election predictions built from polls carry larger errors than the standard errors reported with individual polls would suggest. Polls have multiple potential sources of error (sampling error, but also nonresponse and other systematic biases that are correlated across polls), yet the aggregators generally modeled only the first. A small simulation after this list illustrates the difference.
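
Here is a minimal sketch of that point in Python, with made-up numbers (the margin and error sizes below are illustrative assumptions, not the actual 2016 polling figures): an aggregator that models only sampling noise will report a much smaller chance of an upset than one that also allows for an error shared across all the polls.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions, not real 2016 numbers.
polling_margin = 3.0    # leading candidate ahead by 3 points in the polling average
sampling_sd = 1.5       # error if polls suffered only from sampling noise
systematic_sd = 2.0     # bias shared across polls (nonresponse, turnout models, ...)
n_sims = 100_000

# Model 1: sampling error only (the single level of error many aggregators assumed)
naive = polling_margin + rng.normal(0, sampling_sd, n_sims)

# Model 2: sampling error plus a systematic error common to every poll
full = polling_margin + rng.normal(0, sampling_sd, n_sims) + rng.normal(0, systematic_sd, n_sims)

print("P(upset), sampling error only:  ", (naive < 0).mean())
print("P(upset), with correlated error:", (full < 0).mean())
```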

So what are the applications to education research? A few come to mind:

  1. Ensemble models may be an important tool in policy analysis. MDRC’s new predictive modeling approach may go partway in that direction. But you don’t need to know about things like classification trees (and the forests built by ensembling them) to gain from this; I suspect one can also leverage multiple approaches more simply, though I do not know of much writing on this. (The first sketch after this list shows one simple version.)
  2. We need to chain together different sources of uncertainty when drawing inferences from empirical data. I lived in Florida for 18 years, and everyone who lives through tropical storms and storm forecasting knows that forecasts carry multiple sources of error: uncertainty about the storm’s track (the famous “cone of uncertainty” you will see in news coverage); uncertainty about how large the storm’s effects will be, or how far from the center one needs to worry; uncertainty about the storm’s strength; uncertainty about its timing and speed; and so forth. I have yet to see anyone model this adequately in education, even though we know there are multiple sources of uncertainty around even great research designs. That uncertainty exists nonetheless, and the best treatment I have seen is pretty simplistic, varying point estimates rather than widening the uncertainty around those point estimates. (The second sketch after this list shows what chaining those sources together might look like.)
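
The first sketch: a deliberately simple ensemble in Python, using fabricated data. This is not MDRC’s method; it just averages the predictions of two very different models (a linear regression and a shallow regression tree) to show how “leveraging multiple approaches” can be done without much machinery.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Fabricated data standing in for, say, school-level predictors and an outcome.
X = rng.normal(size=(500, 5))
y = X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = [LinearRegression(), DecisionTreeRegressor(max_depth=4, random_state=0)]
preds = []
for m in models:
    m.fit(X_train, y_train)
    preds.append(m.predict(X_test))

# The "ensemble" is nothing fancier than an unweighted average of the two models.
ensemble_pred = np.mean(preds, axis=0)

for m, p in zip(models, preds):
    print(type(m).__name__, "RMSE:", np.sqrt(np.mean((p - y_test) ** 2)))
print("Ensemble RMSE:", np.sqrt(np.mean((ensemble_pred - y_test) ** 2)))
```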

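The second sketch: propagating several assumed sources of uncertainty around a single effect estimate by simulation, analogous to combining track, size, and intensity errors in a storm forecast. Every number here (the effect size, its standard error, the site-to-site variation, the fidelity range) is an assumption for illustration, not a result from any study.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sims = 100_000

# Hypothetical numbers: an evaluation reports an effect of 0.20 SD with SE 0.05.
effect_hat, se_hat = 0.20, 0.05

# Source 1: estimation error from the study itself
effect = rng.normal(effect_hat, se_hat, n_sims)

# Source 2: site-to-site variation when the program moves to new districts
#           (an assumed SD of 0.08, not something the study reports)
effect += rng.normal(0, 0.08, n_sims)

# Source 3: implementation fidelity, modeled as a multiplier between 0.6 and 1.0
effect *= rng.uniform(0.6, 1.0, n_sims)

print("Reported 95% CI:       ", (effect_hat - 1.96 * se_hat, effect_hat + 1.96 * se_hat))
print("Simulated 95% interval:", tuple(np.percentile(effect, [2.5, 97.5])))
print("P(effect <= 0):        ", (effect <= 0).mean())
```
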
None of this should be surprising: humans have little experience with, and no inherent instinct for, integrating different models or reasoning about multiple sources of uncertainty.


Notes

  1. The poker player beat the guy who puts “PhD” in his Twitter handle. Maybe there’s something about playing poker that gives one a better sense of uncertainty? Yes, I’m picking on Wang a bit…