Nieman Foundation at Harvard
July 25, 2019, 1:17 p.m.
Aggregation & Discovery

How to cover 11,250 elections at once: Here’s how The Washington Post’s new computational journalism lab will tackle 2020

“We’re not super-interested in telling a story about one Iowa county using this infrastructure. But we’re making sure our reporters know which county in Iowa to go if they’re looking for one that has particular characteristics.”

Hold onto your robots: The future of journalism is exceedingly computational. (At least part of it.) And the 2020 U.S. election is a great place to start.

Newsrooms worldwide are trying to infuse their reporting with more data and computational analysis (see: The New York Times’ data training). The engineering-heavy Washington Post newsroom has used bots, automation, and large-scale data in its reporting before, perhaps most notably with Heliograf, a tool that automatically writes stories from data.

But now The Washington Post is going a step further, creating a computational political journalism lab — we’ll unpack what that term means in a sec — just in time for the 2020 political campaigns. The R&D lab, orchestrated by director of engineering Jeremy Bowers, will work with Northwestern assistant professor and algorithmic reporting expert Nick Diakopoulos and have between three and six contributors.

The lab’s main outputs will be reporting tools for the newsroom to cover all the 2020 U.S. elections, new storytelling experiences and projects (“what would an election sound like?”), and semi-frequent blog posts sharing what they’re working on and learning. And yes, Bowers knows the scope of the lab is large.

“We have these problems that could use engineering rigor applied to them. In previous years, it’s been hard to embed an engineer in the newsroom,” said Bowers, who returned to the Post this spring after five years working on news apps at The New York Times.

“We’re looking at 11,250 races for 2020, including state legislatures and dog catchers and mosquito commissioners. A lot of these races are important, but we don’t have enough reporters to cover all 11,250 of them the same way we cover the presidential. We need to figure out ways to augment what we’re doing and help readers find things that are most important, then go ahead and throw real human resources at it. What are ways we can algorithmically provide analysis of these areas so we’re making sure we’re getting the broadest coverage available?”

Diakopoulos will be working with Post data scientist Lenny Bronner and a soon-to-be-determined intern, along with other engineering or graphics team members cycling in and out over the fall. (His stint is just for the fall while he’s on sabbatical from Northwestern; the lab may be revived in the future as needed.) He and Bowers dreamed up some ideas about what computational political journalism might look like in the lab:

  1. Breaking down the data before the election wave comes: “We have a much broader span of data than just results,” Bowers said. “The lab is going to be working on fingerprinting every county, congressional district, and — if we can get around to it — precinct in America with info that gives us a descriptive understanding of who the voters are, how they’ve previously voted, and how they might vote in the future. One way we might do that is through large-scale analysis of a voter file, but we also have BLS data that helps us get a good understanding of who the folks there are. That’s one level, we could write a story or build a graphic about that.”
  2. Figuring out which counties need extra attention: “The next level is using that to get insights about changes in the voter population and get that closer to our reporters and editors making decisions about where they’d like to go in the run-up to 2020. We’re not super-interested in telling a story about one Iowa county using this infrastructure. But we’re making sure our reporters know which county in Iowa to go if they’re looking for one that has particular characteristics.”
  3. “Then we have fun, silly but serious things: How would we do election results in a physical space? Election results as an audio experience? High-fidelity and low-fidelity election experiences?…If we tend to write a lot of a certain kind of story that follows the narrative of who a candidate is and how they like to talk to voters, it’d be fun for us to dig up every time we’ve written a story that’s shaped like that. One of the ways we can do that is by comparing similarity scores of two stories to each other over time.”
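The Post hasn’t said how it would compute those story similarity scores, but a minimal sketch of the idea — comparing two stories as bags of words with cosine similarity, a standard text-similarity measure — might look like this. (The function names and sample stories below are illustrative, not the lab’s actual code.)

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase the text and split on anything that isn't a letter.
    cleaned = ''.join(c if c.isalpha() else ' ' for c in text.lower())
    return cleaned.split()

def cosine_similarity(story_a, story_b):
    # Treat each story as a bag of word counts; 1.0 means the two
    # stories use vocabulary in identical proportions, 0.0 means no overlap.
    counts_a, counts_b = Counter(tokenize(story_a)), Counter(tokenize(story_b))
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm = (math.sqrt(sum(n * n for n in counts_a.values()))
            * math.sqrt(sum(n * n for n in counts_b.values())))
    return dot / norm if norm else 0.0

# Two hypothetical campaign-trail stories "shaped like" each other:
story_2016 = "The candidate toured Iowa diners, talking to voters about trade."
story_2020 = "Touring Iowa diners, the candidate talked with voters about trade policy."
print(cosine_similarity(story_2016, story_2020))
```

A newsroom tool would likely weight words by rarity (TF-IDF) or use document embeddings rather than raw counts, but the principle is the same: score every archived story against a template story and surface the ones that repeat the pattern.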

“We’re already kicking around a bunch of innovative ideas that would capitalize on automated and algorithmic news production to augment the capacities of journalists and provide unique experiences and information for readers,” Diakopoulos said over email. “We’re also thinking about how computational techniques are changing politics more broadly, and how that in turn may change the way reporters and editors need to cover the elections.”

The looming specter of subpar 2016 mainstream media coverage helped encourage rethinking the coverage process, from story choices to technical tools to the way interviews are conducted. By being removed from the newsroom’s daily deadlines, Bowers said he hopes the team can think more efficiently about what to give up on and what to push along to full product development.

“I’ve done a lot of elections stuff — it’s a sprint that’s the length of a marathon. You have to run fast for a really long time,” Bowers said. “Primary season lasts forever, and in the general election, you don’t want to do anything that you haven’t tried at least once. It’s a nice pressure-release valve for people who are on the team, to rotate through the lab and do some thinking that isn’t just so focused on what we need to get done today or for the next primary.”
