Nieman Foundation at Harvard
Aug. 22, 2016, 11:01 a.m.
Reporting & Production
Link: docs.google.com | Posted by: Shan Wang

When your news organization publishes data stories, does it always publish a “nerd box” alongside them, explaining the methodology behind the analysis and detailing decisions made along the way? Does it publish the complete raw dataset, in all its naked glory? Or a cleaned-up version of the data? Or nothing at all?

Christine Zhang, a 2016 Knight-Mozilla OpenNews Fellow based at the Los Angeles Times’ data desk, and Dan Nguyen, who teaches computational journalism at Stanford, want to hear directly from people working in newsrooms about the decisions behind making data public (or not). Their (qualitative) survey is here, and tries to get at how data and methodology are shared (GitHub? Jupyter? Google Drive? Dropbox?) and why (Does it increase authoritativeness? Improve internal workflow? Ensure the accuracy of the analysis?).

“Dan and I both have academic and journalism backgrounds. And for us, data journalism seemed to be very much tied to social sciences, and examining data to find stories definitely has parallels with the way that social scientists work with data to write papers and provide conclusions,” Zhang, who was previously a research analyst at Brookings, said. “We started thinking about how in social sciences, peer review is the way people check their work. How do we check our work as data journalists, as people in the newsroom who tell data stories? Our research is about that nerd box, examining the transparency and openness that goes with data stories.” (Zhang recently moderated a SRCCON session with Ariana Giorgi on peer reviewing data stories.)

Part of their research includes a quantitative analysis of GitHub repos from news organization-associated accounts. ProPublica’s Scott Klein created a bot that tweeted every time a news organization posted a GitHub repo, and Zhang and Nguyen pored over the list of organizations and the people affiliated with those organizations, filtering out non-data source repos like web development frameworks that might also be posted to GitHub.
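The filtering step described above, separating data/story repos from general web-development code, could be sketched roughly like this. The keywords and heuristics below are purely illustrative assumptions; they are not Zhang and Nguyen’s actual criteria, and a real analysis would involve manual review.

```python
# Hypothetical sketch: given repo metadata (as returned by, e.g., the GitHub
# API), keep repos that look like data/story repos and drop web-dev tooling.
# The keyword list and the "exclude forks" rule are illustrative assumptions.

WEB_DEV_HINTS = {"framework", "plugin", "bootstrap", "theme", "boilerplate", "cms"}

def looks_like_data_repo(repo):
    """Heuristic filter: exclude forks and repos described with web-dev terms."""
    if repo.get("fork"):
        return False
    text = ((repo.get("name") or "") + " " + (repo.get("description") or "")).lower()
    return not any(hint in text for hint in WEB_DEV_HINTS)

repos = [
    {"name": "election-results-2016", "description": "Data and analysis", "fork": False},
    {"name": "news-site-theme", "description": "Bootstrap theme for our CMS", "fork": False},
    {"name": "d3", "description": "Visualization framework", "fork": True},
]

data_repos = [r["name"] for r in repos if looks_like_data_repo(r)]
print(data_repos)  # ['election-results-2016']
```

In practice any keyword heuristic like this would only be a first pass; the article notes that Zhang and Nguyen pored over the list themselves.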

“Our goal is essentially to look at general trends in data being put up on GitHub publicly, looking at which organizations are doing it more consistently and which are not, the types of stories that tend to merit that sort of consideration,” Zhang said. (BuzzFeed News, for example, regularly creates GitHub repos for its investigations and data stories.) “This is why we wanted to launch the qualitative survey as well: to get some commentary in addition to the data that we have. I don’t think this can be representative by any means, but we’d like to collect as many survey responses as we can get, to understand also how newsrooms are sharing their data outside of GitHub.”

Photo by JustGrimes used under a Creative Commons license.
