Nieman Foundation at Harvard
Aug. 22, 2016, 11:01 a.m.
Reporting & Production
LINK: docs.google.com  ➚   |   Posted by: Shan Wang   |   August 22, 2016

When your news organization publishes data stories, does it always publish a “nerd box” alongside them, explaining the methodology behind the analysis and detailing decisions made along the way? Does it publish the complete raw data set, in its naked glory? Or does it publish a cleaned-up version of the data? Or nothing at all?

Christine Zhang, a 2016 Knight-Mozilla OpenNews Fellow based at the Los Angeles Times’ data desk, and Dan Nguyen, who teaches computational journalism at Stanford, want to hear directly from people working in newsrooms about the decisions behind making data public (or not). The (qualitative) survey is here, and tries to get at how data and methodology are shared (GitHub? Jupyter? Google Drive? Dropbox?), and why (Increases authoritativeness? Improves internal workflow? Ensures accuracy of the analysis?).

“Dan and I both have academic and journalism backgrounds. And for us, data journalism seemed to be very much tied to social sciences, and examining data to find stories definitely has parallels with the way that social scientists work with data to write papers and provide conclusions,” Zhang, who was previously a research analyst at Brookings, said. “We started thinking about how in social sciences, peer review is the way people check their work. How do we check our work as data journalists, as people in the newsroom who tell data stories? Our research is about that nerd box, examining the transparency and openness that goes with data stories.” (Zhang recently moderated a SRCCON session with Ariana Giorgi on peer reviewing data stories.)

Part of their research includes a quantitative analysis of GitHub repos from accounts associated with news organizations. ProPublica’s Scott Klein created a bot that tweeted every time a news organization posted a GitHub repo, and Zhang and Nguyen pored over the list of organizations and the people affiliated with them, filtering out repos that aren’t data sources, such as web development frameworks, which might also be posted to GitHub.

“Our goal is essentially to look at general trends in data being put up on GitHub publicly, looking at which organizations are doing it more consistently and which are not, the types of stories that tend to merit that sort of consideration,” Zhang said. (BuzzFeed News, for example, regularly creates GitHub repos for its investigations and data stories.) “This is why we wanted to launch the qualitative survey as well: to get some commentary in addition to the data that we have. I don’t think this can be representative by any means, but we’d like to collect as many survey responses as we can get, to understand also how newsrooms are sharing their data outside of GitHub.”

Photo by JustGrimes used under a Creative Commons license.
