Nieman Journalism Lab
Pushing to the future of journalism — A project of the Nieman Foundation at Harvard

At #Niemanleaks, a new generation of tools to manage floods of new data

Whether it’s 250,000 State Department cables or the massive spending databases on Recovery.gov, the trend in data has definitely become “more.” That presents journalists with a new problem: How do you understand and explain data when it comes by the gigabyte? At the Nieman Foundation’s one-day conference on secrecy and journalism, presenters from the New York Times, Sunlight Foundation, and more offered solutions — or at least new ways of thinking about the problems.

Think like a scientist

With the massive amounts of primary documents now available, journalists have new opportunities to bring their readers into their investigations, which can lead to better journalism. John Bohannon, a contributing correspondent for Science Magazine, said his background as a scientist was great preparation for investigative reporting. “The best kind of investigative journalism is like the best kind of science,” he said. “You as the investigator don’t ask your readers to take your claims at face value: You give them the evidence you’ve gathered along the way and ask them to look at it with you.”

It’s not a radical idea, but it’s one being embraced in new ways. For Bohannon, it meant embedding with a unit in Afghanistan and methodically gathering first-hand data about civilian deaths, a more direct and reliable indicator than the cheaper and safer method of counting media-reported deaths. He also found his scientific approach was met with more open answers from a military known for tight information control. “Sometimes if you politely ask for information, large powerful organizations will actually give it to you,” he said.

The future will be distributed: BitTorrent, not Napster

Two of the projects discussed, Basetrack and DocumentCloud, invite broader participation in the news process: the former in sourcing and the latter in distribution.

Basetrack, a Knight News Challenge winner, goes beyond the normal embedding process by more actively involving the Marines of First Battalion, Eighth Marine Regiment in reporting their experiences as they deploy overseas. Teru Kuwayama, who leads the project and deployed with the battalion to Afghanistan, said that keeping confidential information that could put lives in danger from being released was essential to building trust and openness with the project. So Basetrack built a “Denial of Information” tool that allowed easy, pre-publication redactions, with the caveat that the fact of those redactions, and the reasons given for them, would be made public. It’s a compromise that promises greater intimacy and a collaborative look at life at war while ensuring the safety of the soldiers.

Fellow News Challenge winner DocumentCloud, on the other hand, distributes the primary documents dug up through traditional investigative journalism, such as historical confidential informant files or flawed electoral ballot designs. Aron Pilhofer, editor of interactive news at The New York Times, said he was unsure about whether journalists would actually use it when his team began working on the project — but since then dozens of organizations have embraced it, happy to take readers along for the ride of the investigative process.

These new ways of distributing reporting were just the beginning, Pilhofer said, with a trend that will likely push today’s marquee whistleblower out of the limelight. “WikiLeaks was very much a funnel going in and very much a funnel going out,” he said. “Distributed is the future.” A new project, called OpenLeaks, will embrace a less centralized model, building technology to allow anonymous leaks without a central organization to be taken out.

Big data’s day is here

The panel also tackled how to digest truly massive data sets. Bill Allison, editorial director of the Sunlight Foundation, detailed how his organization collected information on everything from earmarks to political fundraising parties. Allison said making this data meaningful required context, which could be as simple as mapping already available data or scoring government databases based on understandable criteria.

“We try to make the information easy to use,” he said. But the reach extends beyond the curious constituents who use Sunlight’s tools: hundreds of journalists around the country use them to dig up local stories they might not otherwise have noticed, creating a ripple effect of transparency.

                                   