How to cover 11,250 elections at once: Here’s how The Washington Post’s new computational journalism lab will tackle 2020

“We’re not super-interested in telling a story about one Iowa county using this infrastructure. But we’re making sure our reporters know which county in Iowa to go to if they’re looking for one that has particular characteristics.”

Hold onto your robots: The future of journalism is exceedingly computational. (At least part of it.) And the 2020 U.S. election is a great place to start.

Newsrooms worldwide are trying to infuse their reporting with more data and computational analysis (see: The New York Times’ data training). The engineering-heavy Washington Post newsroom has used bots, automation, and large-scale data in its reporting before, perhaps most notably with Heliograf, its tool for automatically writing stories from structured data.

But now The Washington Post is going a step further, creating a computational political journalism lab — we’ll unpack what that term means in a sec — just in time for the 2020 political campaigns. The R&D lab, orchestrated by director of engineering Jeremy Bowers, will work with Northwestern assistant professor and algorithmic reporting expert Nick Diakopoulos and have between three and six contributors.

The lab’s main outputs will be reporting tools for the newsroom to cover all the 2020 U.S. elections, new storytelling experiences and projects (“what would an election sound like?”), and semi-frequent blog posts sharing what they’re working on and learning. And yes, Bowers knows the scope of the lab is large.

“We have these problems that could use engineering rigor applied to them. In previous years, it’s been hard to embed an engineer in the newsroom,” said Bowers, who returned to the Post this spring after five years working on news apps at The New York Times.

“We’re looking at 11,250 races for 2020, including state legislatures and dog catchers and mosquito commissioners. A lot of these races are important, but we don’t have enough reporters to cover all 11,250 of them the same way we cover the presidential. We need to figure out ways to augment what we’re doing and help readers find things that are most important, then go ahead and throw real human resources at it. What are ways we can algorithmically provide analysis of these areas so we’re making sure we’re getting the broadest coverage available?”

Diakopoulos will be working with Post data scientist Lenny Bronner and a soon-to-be-determined intern, along with other engineering or graphics team members cycling in and out over the fall. (His stint runs just through the fall, while he’s on sabbatical from Northwestern; the lab may be revived in the future as needed.) He and Bowers dreamed up some ideas about what computational political journalism might look like in the lab:

  1. Breaking down the data before the election wave comes: “We have a much broader span of data than just results,” Bowers said. “The lab is going to be working on fingerprinting every county, congressional district, and — if we can get around to it — precinct in America with info that gives us a descriptive understanding of who the voters are, how they’ve previously voted, and how they might vote in the future. One way we might do that is through large-scale analysis of a voter file, but we also have BLS data that helps us get a good understanding of who the folks there are. That’s one level: we could write a story or build a graphic about that.”
  2. Figuring out which counties need extra attention: “The next level is using that to get insights about changes in the voter population and get that closer to our reporters and editors making decisions about where they’d like to go in the run-up to 2020. We’re not super-interested in telling a story about one Iowa county using this infrastructure. But we’re making sure our reporters know which county in Iowa to go to if they’re looking for one that has particular characteristics.” (A rough sketch of this fingerprint-and-match idea follows this list.)
  3. “Then we have fun, silly but serious things: How would we do election results in a physical space? Election results as an audio experience? High-fidelity and low-fidelity election experiences?…If we tend to write a lot of a certain kind of story that follows the narrative of who a candidate is and how they like to talk to voters, it’d be fun for us to dig up every time we’ve written a story that’s shaped like that. One of the ways we can do that is by comparing similarity scores of two stories to each other over time.” (A sketch of that similarity scoring also follows the list.)
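
Taken together, the first two ideas boil down to building a feature vector, a “fingerprint,” per county and then querying it for lookalikes. Below is a rough, hypothetical sketch of what that pipeline could look like in Python; the file names, column names, and Euclidean-distance matching are illustrative assumptions, not the Post’s actual data sources or method.

```python
# Hypothetical sketch: per-county "fingerprints" from merged data sources.
# File names and columns are invented for illustration.
import pandas as pd

# Per-county aggregates, e.g., derived from a voter file and BLS data.
voters = pd.read_csv("voter_file_county_aggregates.csv", dtype={"fips": str})
economy = pd.read_csv("bls_county_employment.csv", dtype={"fips": str})

# Join on the county FIPS code so each row is one county's fingerprint.
fingerprints = voters.merge(economy, on="fips", how="inner")

features = ["turnout_2016", "pct_registered_dem", "unemployment_rate", "median_wage"]

# Standardize each feature so no single column dominates the distance.
z = (fingerprints[features] - fingerprints[features].mean()) / fingerprints[features].std()

def most_similar_counties(target_fips: str, n: int = 5) -> pd.DataFrame:
    """Return the n counties whose fingerprints sit closest to the target's."""
    target = z[fingerprints["fips"] == target_fips].iloc[0]
    distances = ((z - target) ** 2).sum(axis=1) ** 0.5  # Euclidean distance
    ranked = fingerprints.assign(distance=distances).sort_values("distance")
    return ranked[ranked["fips"] != target_fips].head(n)

# An editor hunting for "a county in Iowa with particular characteristics"
# could filter the results to Iowa FIPS codes (those starting with "19").
```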
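The story-shape idea in the third item is a standard text-similarity task; one common approach is comparing TF-IDF vectors with cosine similarity. Here is a minimal sketch using scikit-learn and a toy corpus; the 0.2 threshold is an arbitrary illustration, not a figure the Post has described.

```python
# Hypothetical sketch: flag pairs of stories with similar "shapes"
# using TF-IDF vectors and cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

stories = [
    "Candidate visits diner, talks trade and tariffs with Iowa voters",
    "On the trail, candidate courts union workers over trade policy",
    "School board weighs new budget after levy fails",
]

tfidf = TfidfVectorizer(stop_words="english").fit_transform(stories)

# Entry [i, j] is the cosine similarity between story i and story j.
scores = cosine_similarity(tfidf)

# Flag pairs above an arbitrary threshold as "stories shaped like" each other.
for i in range(len(stories)):
    for j in range(i + 1, len(stories)):
        if scores[i, j] > 0.2:
            print(f"Stories {i} and {j} look alike: {scores[i, j]:.2f}")
```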

“We’re already kicking around a bunch of innovative ideas that would capitalize on automated and algorithmic news production to augment the capacities of journalists and provide unique experiences and information for readers,” Diakopoulos said over email. “We’re also thinking about how computational techniques are changing politics more broadly, and how that in turn may change the way reporters and editors need to cover the elections.”

The looming specter of subpar mainstream media coverage in 2016 helped spur a rethinking of the coverage process, from story choices to technical tools to the way interviews are conducted. And because the lab is removed from the newsroom’s daily deadlines, Bowers hopes the team can decide more efficiently what to give up on and what to push along to full product development.

“I’ve done a lot of elections stuff — it’s a sprint that’s the length of a marathon. You have to run fast for a really long time,” Bowers said. “Primary season lasts forever, and in the general election, you don’t want to do anything that you haven’t tried at least once. It’s a nice pressure-release valve for people who are on the team, to rotate through the lab and do some thinking that isn’t just so focused on what we need to get done today or for the next primary.”

POSTED     July 25, 2019, 1:17 p.m.
SEE MORE ON Aggregation & Discovery