Data, uncertainty, and specialization: What journalism can learn from FiveThirtyEight’s election coverage

Nate Silver’s number-crunching blog is perceived as a threat by some traditional political reporters — but its model has lessons for all journalists.

Nate Silver’s FiveThirtyEight blog at The New York Times really only does one thing: It makes election predictions. But it does this differently than pretty much everyone else, because it aggregates all available polls using a statistical model calibrated with past election data. Silver has his critics among the political class, but to my eye, FiveThirtyEight makes pretty much all other election “horse race” coverage look primitive and uninformed.

FiveThirtyEight has obvious lessons for journalism about data-related topics such as statistics and uncertainty. But I think I also see wider implications for the evolving role of the political journalist. At heart, these changes are about the response of journalism to a world that is increasingly complex and networked.

Data literacy

Silver’s approach has had remarkable success in past elections, correctly predicting the winner in 49 of 50 states in 2008. That doesn’t necessarily mean his model is going to get 2012 right — as Silver will be first to admit — but there is at least one reason to recommend FiveThirtyEight over other sources: It takes the statistics of polling seriously. Polls are subtle creations, careful extrapolations from a small sample to an entire population. Although the basic theory is centuries old, the details are complex and tricky. See, for example, this lengthy analysis of why Gallup polls consistently favor Romney slightly more than other polls.

Silver understands all of this, and his model accounts for dozens of factors: “house effects” that make particular firms lean in particular ways, the relationships between state and national polls, the effect of economic indicators on election results, post-convention bounces, and lots of other good stuff. Yes, you can talk about all of these factors — but without quantifying them there is no way to know whether the cumulative effect is up or down.

Uncertainty

Recently CNN aired a chart that showed one candidate ahead 49 percent to 47 percent, and the commentators were discussing this lead. But up in the corner in small print, the margin of error of the poll was given as 5.5 percent. In other words, the size of the “lead” was smaller than the expected error in the poll result, meaning that the difference was probably meaningless.

Expected error — quantified uncertainty — is the price you pay for polling a national sample instead of asking every person in the country how they’re going to vote. It means that small variations in poll numbers are mostly meaningless “noise,” because differences smaller than the margin of error are effectively down to a coin toss. In other words, you’d expect the very next poll to show the lead reversing much of the time. This 2 percent difference with a 5.5 percent margin of error would never pass standard statistical tests such as the t-test — so you couldn’t publish the result in a scientific paper, a medical board wouldn’t authorize a new treatment based on such weak evidence, and you certainly wouldn’t want to place a bet.
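
To make that concrete, here is a minimal Python sketch of the arithmetic. The sample size is an assumption chosen only so that the margin of error comes out near 5.5 points; it is not the actual size of the CNN poll.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95 percent margin of error for a simple random sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

n = 320                    # hypothetical sample size, chosen to give roughly +/- 5.5 points
moe = margin_of_error(n)
print(f"margin of error: +/- {moe:.1%}")                  # about +/- 5.5%

lead = 0.49 - 0.47         # the 2-point "lead" from the chart
print("lead exceeds the margin of error:", lead > moe)    # False: the gap is within the noise
```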

So why do journalists spend so much energy talking about a result like this, as if there’s anything at all to learn from such a roll of the dice? One possibility is a widespread misunderstanding of the limitations of statistical methods and how to interpret measures of uncertainty. But I suspect there’s also a deeper cultural force at play here: Journalists are loath to admit that the answer cannot be known. “Unexpected Reversal in Polls” is a great headline; “Magic Eight Ball says ‘Sorry, Ask Again Later’” is a story no one wants to write — or read. To his great credit, Silver never shies away from saying that we don’t yet have enough information to know something, as when he cautioned that we had to wait a few more days to see if the Denver debate really had any effect.

Aggregation

The big data craze notwithstanding, more data isn’t always better. However, in the limited field of statistical sampling, more samples are better. That’s why averaging polls works; in a sense, it combines all of the individuals asked by different pollsters into one imaginary super-poll with a smaller margin of error. This is the idea behind Real Clear Politics’ simple poll averages and FiveThirtyEight’s more sophisticated weighted averages.
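
A toy calculation shows the effect. The sample sizes below are hypothetical, and this is simple pooling rather than FiveThirtyEight’s weighted model, but the shrinking margin of error is the point.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95 percent margin of error for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

poll_sizes = [320, 500, 800, 1000]     # hypothetical samples from four different pollsters

for n in poll_sizes:
    print(f"single poll, n={n:4d}: +/- {margin_of_error(n):.1%}")

pooled = sum(poll_sizes)               # treat everyone asked as one imaginary "super-poll"
print(f"pooled poll, n={pooled}: +/- {margin_of_error(pooled):.1%}")   # noticeably smaller
```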

All well and good, but to average polls together you have to be willing to use other people’s polling data. This is where traditional journalism falls down. We have the ABC-WaPo poll, the AP-GfK poll, the CNN/ORC poll, and then Gallup, Rasmussen, and all the others. FiveThirtyEight shamelessly draws on all of these and more — while individual news outlets like to pretend that their one poll is definitive. This is a disservice to the user.

This situation is not unlike the battles over aggregation and linking in the news industry more generally. Aggregation disrupts business models and makes a hash of brands — but in the long run none of that matters if it also delivers a superior product for the user.

Specialization

It’s not just statistics. To report well on complicated things, you need specialized knowledge. As Megan Garber put it so well, “While it may still be true that a good magazine — just as a good newspaper — is a nation talking to itself, the standards for that conversation have risen considerably since people have been able to talk to each other on their own.” The traditional generalist education of the journalist is ill suited to meaty topics such as law, science, finance, technology, and medicine. It’s no longer enough to be able to write a good article; on the web, the best is just a click away, and the best on these sorts of subjects is probably being written by someone with the sort of deep knowledge that comes from specialized training.

Silver is a statistician who got into journalism when he began publishing the results of his (initially sabermetric) models; the reverse, a journalist who becomes a statistician when they start modeling polling data, seems like a much longer road.

Journalism today has an obvious shortage of talent in many specialized fields. I’d like the financial press to be asking serious questions about, say, the systemic risks of high-frequency trading — but instead we get barely factual daily market reports that, like most poll coverage, struggle to say something in the face of uncertainty. But then again, most finance reporters have training in neither quantitative finance nor computer science, which makes them probably unqualified for this topic. I suspect that we will see many more specialists brought into journalism to address this sort of problem.

The role of the political journalist

For the last several decades, both in the United States and internationally, “horse race” or “political strategy” coverage of politics has been something like 60 percent or 70 percent of all political journalism. Certainly, it’s important to keep track of who might win an election — but 60 or 70 percent? There are several different arguments that this is way too much.

First, it’s very insider-y, focusing on how the political game is played rather than what sort of information might help voters choose between candidates. Jay Rosen has called this the cult of the savvy. As one friend put it to me: “I wish the news would stop talking about who won the debate and start asking questions about what they said.”

Second, this quantity of horse race coverage is massively wasteful. Given the thorny problems of uncertainty and attributing causation, can you really produce all that many words about the daily state of the race? Can you really say anything different than the thousands of other stories on the topic? (Literally thousands — check Google News.) So why not cover something else instead? I find it noteworthy that it was not journalists who crunched the numbers behind Romney’s centerpiece tax plan. That task, really nothing more than a long night with a spreadsheet, fell to think tanks.

Third and finally, FiveThirtyEight has set a new standard for horse race coverage. We should rejoice that this is a higher standard than we had before, and hopefully represents a narrowing of the data gap between politicians and journalists. It’s also a complicated and presumably expensive process. Because there are many assumptions and judgement calls that go into such a complex statistical model, we really do need more than one. (And indeed, there are other models of this type.) But we don’t need one from every newsroom — and anyway, you need to hire a statistician to produce a statistical model. The politics desk of the future might look a lot different than it does today.

Photo of Nate Silver by J.D. Lasica used under a Creative Commons license.

                                   
  • http://www.beastsandlunatics.com/ The Unsanctioned Interlocutor

    This article interestingly points right to the challenge of attributing causation and effect via differing perspectives of risk and blame, which is at the heart of journalism’s challenge. This is where rhetoric (as persuasive argument, not empty speech) meets what we like to call “fact” and objectivity in journalism.

    This also points to philosophical problems that the field of journalism is ill-equipped to wrestle with, such as the divides between objectivism and relativism as well as the views offered by a positivistic outlook (data derived from sensory experience: quantifying and measuring), in light of the human desire and tendency to explain and predict… even when we don’t have enough information. It also highlights a certain turning away from interpretation and understanding (hermeneutics), even when that’s what we need more of, in a political sense.

    I believe that in this light Nate Silver’s ability (and willingness) to stand in mystery, doubt and uncertainty when the data doesn’t offer a clear picture is commendable. As Jonathan points out here, journalism doesn’t like ambiguity; it desires objective certainty. That is a human tendency, and certainly an American one, but good science is willing to stand in mystery and not knowing. Politics, however, is not a science, and neither are our ethics.

    As C. Wright Mills once pointed out, in reference to what he critiqued as the Grand Theory of sociology in the work of Talcott Parsons, all too often the positivist outlook gets drunk on syntax and, in the process, remains blind to semantics. In the end, we focus more on rules, arrangements and the predictive analysis of “outcomes” and less on meanings and inferred implications of what was said. In this way, we miss the empirical propositions, in terms of the core inferences that the arguments are making about the nature of reality, and instead focus on which “horse” is gonna win.

  • http://twitter.com/jonathanstray jonathanstray

    Nick Giedner has asked me a good question:

    “How can you speak so glowingly of Nate Silver and his modeling w/o any specifics of his model/method?”

    It’s an important one, so I thought I’d respond here. First, I believe that Nick is correct that Silver has never published a complete mathematical description of his model,  or released the code. So the details are hidden from us, and those details might be important. This does bother me. In general, I believe that statistical prediction models need to be transparent if we’re to trust them. I am especially convinced of this with, for example, financial risk models, the sort of models that failed so badly in 2007.

    However, I think there are also some pretty good reasons to believe that the FiveThirtyEight model is solid. Silver has written a detailed description of the model, which, while it does not get into the exact math, is at least sophisticated and plausible. We also know the model includes many factors such as economic indicators, which  Silver has discussed extensively, and convention bounces, for which he has shown derived curves. So while the code is not available, we can get a pretty good idea of what he is doing from the descriptions of the model and its components in various places. 

    Also, FiveThirtyEight is not the only model out there. There are several others, and also the election trading markets. All of the quantitative models and the markets give Obama about the same chance of winning. So while FiveThirtyEight could be wrong, that would mean all of the other models are quite wrong too, which seems unlikely.

    Nonetheless, the opacity here is a real problem. I can see why Silver and the Times might want to keep the model a secret for competitive reasons. Still, I’d argue that the posts and interpretations on FiveThirtyEight are a significant part of its success, and that it would still be a popular destination even if others were able to replicate the numerical results exactly.

    I would like to see Silver publish his model after the election. He’ll doubtless use a different one for 2014 anyway. Then we could have real statisticians go over it. I also hope that future news organizations will take the data journalist’s creed of Show Your Work more seriously.

  • http://twitter.com/mugwump2 Mark Paul

    A poll with a margin of error of 5.5% showing a candidate with a 49-47 lead is not definitive. But neither is it “probably meaningless.” If you believe that, let’s make a bet. I get all the candidates with 49%; you take all the candidates with 47%. We bet equal amounts on each race. 

  • http://twitter.com/Gwyntaglaw Gwyntaglaw

    “I would like to see Silver publish his model after the election. He’ll doubtless use a different one for 2014 anyway. Then we could have real statisticians go over it.”

    I’m pretty sure Nate Silver is a “real statistician”.  Whether someone likes, respects or agrees with him is irrelevant to that fact.

    The real thing going on with the debate about Silver’s blog is that most people – journalists, commentators and others – really don’t get how mathematics works. This is not unexpected. When the subject comes up, many people refer to their school days, say they never quite understood it all, and change the subject as quickly as they can – sometimes pausing to smear the “geeks” and “nerds” along the way.

    The simplistic view, which is being repeated a great deal at the moment, is to look at a 49-48 poll and say “it’s too close, so it must be a coin-toss”.

    What Silver does (and he’s hardly the only one, but he’s the whipping boy of the moment) is to say: yes, that’s one poll, but we have to read it in conjunction with the myriad of other data.  And when you do that, the data themselves reveal patterns and other information, and those patterns and tendencies can actually be quantified; and that quantification can be expressed as a number.

    And that, in a very small nutshell, is what Silver’s headline percentage is.  It’s a way of quantifying uncertainty.  People who reduce a close poll to a 50-50 coin toss are doing the same thing; it’s just that they are doing it in a very unsophisticated manner.  Silver’s method is more sophisticated.  It’s still dealing with uncertainty – even a 75% chance of victory for Obama still leaves a 1-in-4 chance that Romney will be elected.  That’s a real, definite possibility.  It’s just a better way of looking at ALL the data than just boiling it down to a simple even split.

  • http://twitter.com/rsporter Robert S. Porter

    Nate Silver does real statistics, but that doesn’t make him a statistician. 

  • http://twitter.com/Gwyntaglaw Gwyntaglaw

    “Nate Silver does real statistics, but that doesn’t make him a statistician.”

    “Nate Silver does real mathematics, but that doesn’t make him a mathematician.”

    “Nate Silver does real science, but that doesn’t make him a scientist.”

  • http://twitter.com/jonathanstray jonathanstray

    I didn’t mean to imply that Silver isn’t a “real statistician.” I think the evidence suggests that he is (trained as a statistician, plus a career doing statistics.) I meant we can have a more rigorous look at his work, rather than the indirect inferences we must make now.

  • http://twitter.com/jonathanstray jonathanstray

    All right, let’s work the numbers here.

    I’m going to calculate the odds that 49%-47% with a 5.5% margin of error is a fluke, and that the numbers are actually tied and we’re just seeing statistical noise. 

    To do this, the number we’re interested in is the difference in the two scores, d = p1-p2 or 49% – 47% = 2%. There is some *real* difference, which is what we could get if we counted every voter, and this 2% number we see is different from that real number by random chance, depending on who was randomly asked. 

    The first thing we need is the margin of error not of the poll, but of the difference between the two numbers. Here is an extensive discussion of how to calculate that. When p1 and p2 together account for nearly 100% of the vote (they are 47% + 49% = 96% in this case), the margin of error of the difference is very nearly twice the margin of error of the poll, so 2*5.5% = 11%. Using the more accurate formulas in that link, we can infer a more precise 10.8%. Of course that 5.5% figure is already rounded, so we could be off either way by a small amount, but let’s be conservative (exaggerate the effect of the 2% difference) and say it’s 10.8%.

    All of this so far, and especially the next step, assumes that the polling error follows a normal (Gaussian) distribution. This is the case both in sampling theory and in what we actually see in polling data.

    Now we’re going to say: suppose the race is actually tied. How often will we see a 2% difference in a random poll where the margin of error of the difference is 10.8%? A 95% margin of error is 1.96 standard errors, so the standard error of the difference is 10.8%/1.96 ≈ 5.5%. We standardize to a z-score by dividing 2%/5.5% ≈ 0.36, which means the observed 2% gap is about 0.36 standard errors away from the assumed true mean of 0%, i.e. a tie. To convert that to a probability, we find the area under the two-tailed normal distribution outside of +/-0.36, perhaps by using this calculator.

    I get roughly a 72% chance that we’d see a difference of 2% or greater in a poll where the candidates are actually perfectly tied and the margin of error is 5.5%.

    Now of course a +2% result is better evidence that one candidate is actually leading than that the race is tied… so yes, I’d rather be the candidate who got 49%. And I won’t take your bet.

    But if the candidates are actually tied, we’ll see at least this much difference about 72% of the time. As they say, we cannot reject the null hypothesis. So your odds, while better than even, are only slightly better than even.
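
    A minimal sketch of the same arithmetic in Python, using SciPy’s normal distribution (the margin-of-error figures are the rounded ones above):

```python
from scipy.stats import norm

d = 0.49 - 0.47        # observed gap between the candidates
diff_moe = 0.108       # 95% margin of error of the difference (~2x the poll's 5.5%)
se = diff_moe / 1.96   # standard error of the difference, about 5.5%

z = d / se                                 # about 0.36
p_value = 2 * (1 - norm.cdf(z))            # two-tailed: chance of a gap this big in a true tie
print(f"z = {z:.2f}, p = {p_value:.2f}")   # roughly z = 0.36, p = 0.72
```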
