Nate Silver’s FiveThirtyEight blog at The New York Times really only does one thing: It makes election predictions. But it does this differently than pretty much everyone else, because it aggregates all available polls using a statistical model calibrated with past election data. Silver has his critics among the political class, but to my eye, his blog makes pretty much all other election “horse race” coverage look primitive and uninformed.
FiveThirtyEight has obvious lessons for journalism about data-related topics such as statistics and uncertainty. But I think I also see wider implications for the evolving role of the political journalist. At heart, these changes are about the response of journalism to a world that is increasingly complex and networked.
Silver’s approach has had remarkable success in past elections, correctly predicting the winner in 49 of 50 states in 2008. That doesn’t necessarily mean his model is going to get 2012 right, as Silver will be the first to admit, but there is at least one reason to recommend FiveThirtyEight over other sources: It takes the statistics of polling seriously. Polls are subtle creations, careful extrapolations from a small sample to an entire population. Although the basic theory is centuries old, the details are complex and tricky. See, for example, this lengthy analysis of why Gallup polls consistently favor Romney slightly more than other polls.
Silver understands all of this, and his model accounts for dozens of factors: “house effects” that make particular firms lean in particular ways, the relationships between state and national polls, the effect of economic indicators on election results, post-convention bounces, and lots of other good stuff. Yes, you can talk about all of these factors, but without quantifying them there is no way to know whether the cumulative effect is up or down.
Recently CNN aired a chart that showed one candidate ahead 49 percent to 47 percent, and the commentators were discussing this lead. But up in the corner in small print, the margin of error of the poll was given as 5.5 percent. In other words, the size of the “lead” was smaller than the expected error in the poll result, meaning that the difference was probably meaningless.
Expected error — quantified uncertainty — is the price you pay for polling a national sample instead of asking every person in the country how they’re going to vote. It means that small variations in poll numbers are mostly meaningless “noise,” because those last 5.5 percent are effectively down to a coin toss. In other words, you’d expect the very next poll to show the lead reversing much of the time. This 2 percent difference with a 5.5 percent margin of error would never pass standard statistical tests such as the t-test — so you couldn’t publish the result in a scientific paper, a medical board wouldn’t authorize a new treatment based on such weak evidence, and you certainly wouldn’t want to place a bet.
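To make the arithmetic behind the CNN example concrete, here is a minimal sketch of the kind of back-of-the-envelope check I have in mind. It rests on assumptions that are mine, not CNN’s: that the reported 5.5 percent is a 95 percent margin of error from a simple random sample, and that both candidates’ shares come from the same poll. The function names are made up for illustration, and it uses a normal-approximation z-score rather than a formal t-test.

```python
import math

Z_95 = 1.96  # approximate two-sided 95% critical value of the normal distribution


def implied_sample_size(margin_of_error, z=Z_95):
    """Sample size implied by a reported margin of error.

    Assumes simple random sampling and the worst case p = 0.5,
    where margin = z * sqrt(0.25 / n).
    """
    return (z ** 2) * 0.25 / margin_of_error ** 2


def lead_z_score(p_a, p_b, n):
    """z-score of the lead p_a - p_b when both shares come from one poll of n people.

    Within a single poll the two shares are negatively correlated, so
    Var(p_a - p_b) = (p_a + p_b - (p_a - p_b)**2) / n.
    """
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    return (p_a - p_b) / math.sqrt(variance)


n = implied_sample_size(0.055)   # the 5.5 percent margin from the chart
z = lead_z_score(0.49, 0.47, n)  # the 49-to-47 "lead"
is_significant = abs(z) > Z_95

print(f"implied sample size: {n:.0f}")
print(f"z-score of the lead: {z:.2f}")
print(f"significant at the 95% level: {is_significant}")
```

Under these assumptions the margin implies a sample of a little over 300 people, and the 2-point lead has a z-score well below 1, nowhere near the 1.96 needed for conventional significance.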
So why do journalists spend so much energy talking about a result like this, as if there’s anything at all to learn from such a roll of the dice? One possibility is a widespread misunderstanding of the limitations of statistical methods and how to interpret measures of uncertainty. But I suspect there’s also a deeper cultural force at play here: Journalists are loath to admit that the answer cannot be known. “Unexpected Reversal in Polls” is a great headline; “Magic Eight Ball says ‘Sorry, Ask Again Later’” is a story no one wants to write, or read. To his great credit, Silver never shies away from saying that we don’t yet have enough information to know something, as when he cautioned that we had to wait a few more days to see if the Denver debate really had any effect.
The big data craze notwithstanding, more data isn’t always better. However, in the limited field of statistical sampling, more samples are better. That’s why averaging polls works; in a sense, it combines all of the individuals asked by different pollsters into one imaginary super-poll with a smaller margin of error. This is the idea behind Real Clear Politics’ simple poll averages and FiveThirtyEight’s more sophisticated weighted averages.
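The “imaginary super-poll” idea can be sketched in a few lines. This is only an illustration of naive pooling, not FiveThirtyEight’s actual method, which weights polls and adjusts for house effects and other factors. The poll numbers below are hypothetical, and the sketch assumes each poll is an independent simple random sample of the same population.

```python
import math

Z_95 = 1.96  # approximate two-sided 95% critical value of the normal distribution


def margin_of_error(share, n, z=Z_95):
    """95% margin of error for one candidate's share in a simple random sample of n."""
    return z * math.sqrt(share * (1 - share) / n)


def pool_polls(polls):
    """Combine polls as if every respondent belonged to one large poll.

    polls: list of (candidate_share, sample_size) pairs.
    Returns the sample-size-weighted share and the total sample size.
    """
    total_n = sum(n for _, n in polls)
    pooled_share = sum(share * n for share, n in polls) / total_n
    return pooled_share, total_n


# Hypothetical polls, invented for illustration: (candidate's share, sample size)
polls = [(0.49, 320), (0.47, 800), (0.50, 1000), (0.48, 600)]

pooled_share, total_n = pool_polls(polls)
pooled_moe = margin_of_error(pooled_share, total_n)
individual_moes = [margin_of_error(share, n) for share, n in polls]

for (share, n), moe in zip(polls, individual_moes):
    print(f"poll of {n:4d}: {share:.1%} +/- {moe:.1%}")
print(f"pooled {total_n:4d}: {pooled_share:.1%} +/- {pooled_moe:.1%}")
```

With these made-up numbers, the pooled margin of error comes out under 2 points, smaller than that of any single poll in the list, which is the whole appeal of averaging.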
All well and good, but to average polls together you have to be willing to use other people’s polling data. This is where traditional journalism falls down. We have the ABC-WaPo poll, the AP-GfK poll, the CNN/ORC poll, and then Gallup, Rasmussen, and all the others. FiveThirtyEight shamelessly draws on all of these and more, while individual news outlets like to pretend that their one poll is definitive. This is a disservice to the user.
This situation is not unlike the battles over aggregation and linking in the news industry more generally. Aggregation disrupts business models and makes a hash of brands — but in the long run none of that matters if it also delivers a superior product for the user.
It’s not just statistics. To report well on complicated things, you need specialized knowledge. As Megan Garber put it so well, “While it may still be true that a good magazine — just as a good newspaper — is a nation talking to itself, the standards for that conversation have risen considerably since people have been able to talk to each other on their own.” The traditional generalist education of the journalist is ill suited to meaty topics such as law, science, finance, technology, and medicine. It’s no longer enough to be able to write a good article; on the web, the best is just a click away, and the best on these sorts of subjects is probably being written by someone with the sort of deep knowledge that comes from specialized training.
Silver is a statistician who got into journalism when he began publishing the results of his (initially sabermetric) models; the reverse, a journalist who becomes a statistician when they start modeling polling data, seems like a much longer road.
Journalism today has an obvious shortage of talent in many specialized fields. I’d like the financial press to be asking serious questions about, say, the systemic risks of high-frequency trading, but instead we get barely factual daily market reports that, like most poll coverage, struggle to say something in the face of uncertainty. But then again, most finance reporters have training in neither quantitative finance nor computer science, which probably makes them unqualified for this topic. I suspect that we will see many more specialists brought into journalism to address this sort of problem.
For the last several decades, both in the United States and internationally, “horse race” or “political strategy” coverage of politics has been something like 60 percent or 70 percent of all political journalism. Certainly, it’s important to keep track of who might win an election — but 60 or 70 percent? There are several different arguments that this is way too much.
First, it’s very insider-y, focusing on how the political game is played rather than what sort of information might help voters choose between candidates. Jay Rosen has called this the cult of the savvy. As one friend put it to me: “I wish the news would stop talking about who won the debate and start asking questions about what they said.”
Second, this quantity of horse race coverage is massively wasteful. Given the twin problems of uncertainty and attributing causation, can you really produce all that many words about the daily state of the race? Can you really say anything different from the thousands of other stories on the topic? (Literally thousands; check Google News.) So why not cover something else instead? I find it noteworthy that it was not journalists who crunched the numbers behind Romney’s centerpiece tax plan. That task, really nothing more than a long night with a spreadsheet, fell to think tanks.
Third and finally, FiveThirtyEight has set a new standard for horse race coverage. We should rejoice that this is a higher standard than we had before, and hopefully represents a narrowing of the data gap between politicians and journalists. It’s also a complicated and presumably expensive process. Because there are many assumptions and judgement calls that go into such a complex statistical model, we really do need more than one. (And indeed, there are other models of this type.) But we don’t need one from every newsroom — and anyway, you need to hire a statistician to produce a statistical model. The politics desk of the future might look a lot different than it does today.
Photo of Nate Silver by J.D. Lasica used under a Creative Commons license.