Nieman Foundation at Harvard
June 22, 2011, 4 p.m.

The News Challenge-winning PANDA Project aims to make research easier in the newsroom

In an ideal world, when news breaks, reporters can fall back on their encyclopedic knowledge of local stories, events, and people to put the news in context. And failing that, they can turn to colleagues who’ve been covering a beat forever and know where to find the right files.

The ideal world doesn’t often match up with the real one, which calls on reporters to employ a mix of intuition, document archives, and Google to shore up their knowledge when writing a lead on a breaking story. And if there’s a place where the real world and the ideal world meet, it’s in the files and data that every newsroom compiles but may not keep accessible.

But with your own newsroom PANDA, that could all change. The PANDA Project, a winner of this year’s Knight News Challenge, is what developer Brian Boyer calls a “newsroom data application,” a tool that helps find context and relationships on the fly. Boyer, the news applications editor at the Chicago Tribune, will lead the project, which plans to create a set of web-based open source tools that will allow any newsroom to set up its own PANDA to analyze data whenever the need arises. (As for that name? “PANDA A News Data Application” is a cheeky, but not too cute, recursive acronym, Boyer hopes.)

The PANDA project’s one-year, $150,000 grant will largely go towards hiring a developer to build the application, along with some assorted contracting work necessary to give the project a nice look and easy-to-understand features. Boyer is working in concert with Investigative Reporters & Editors (where the developer will likely be working, along with an expanded group of data producers), and The Spokane Spokesman-Review, whose online director, Ryan Pitts, is another lead on the project.

As for building the tool, “the first problem is knowledge management,” Boyer told me. “Where to stash all that information you collect. It’s a problem that all businesses have.”

News organizations, almost by their nature, have tons of data, from Census numbers and campaign finance reports to DWI records and housing prices. It’s all information that has proved its usefulness at one point or another. But instead of allowing it to be dumped in a file cabinet or left to die on a forgotten disk drive, PANDA wants to give all that info a home where it can be easily accessed.

The idea for the project stemmed from an update to a similar database at the Tribune, which allows reporters to cross-check names against city records and other information. “We dig up all kinds of datasets,” Boyer remembers thinking at the time; “we could really augment this thing.” But he and his team started thinking more broadly. They figured that, if having a cross-referencing database could work for the Tribune, it could work well for others — particularly smaller newspapers that may not have the resources the Trib enjoys.

One of the cornerstones of the project is Google Refine, a tool launched last year that cleans up datasets filled with irregularities and inconsistencies. One of the added benefits of Google Refine, Boyer said, is that it can help draw relationships across data. “So when a reporter gets a 10,000-row campaign contribution list, they can reconcile it against databases we keep on file to see what things pop up,” Boyer said.
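To make the reconciliation idea concrete, here is a minimal sketch in Python of checking a contribution list against a dataset kept on file. It is not how Google Refine or PANDA works internally; the column names (`donor`, `name`, `source`), the sample rows, and the `normalize` and `reconcile` helpers are all hypothetical, and the matching is a deliberately simple exact match on cleaned-up names.

```python
import csv
import io


def normalize(name):
    """Lowercase a name, replace punctuation with spaces, and sort its parts
    so that 'Smith, John' and 'john smith' compare equal."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(sorted(cleaned.split()))


def reconcile(contributions, reference):
    """Return (contribution_row, reference_row) pairs whose normalized names match."""
    # Index the reference dataset by normalized name for fast lookup.
    index = {}
    for row in reference:
        index.setdefault(normalize(row["name"]), []).append(row)

    matches = []
    for row in contributions:
        for hit in index.get(normalize(row["donor"]), []):
            matches.append((row, hit))
    return matches


# Hypothetical sample data standing in for a contribution list
# and a dataset the newsroom already keeps on file.
CONTRIBUTIONS_CSV = """donor,amount
"Smith, John",500
Jane Doe,250
Alex Rivera,1000
"""

REFERENCE_CSV = """name,source
john smith,city contractors
RIVERA ALEX,lobbyist registry
"""

contributions = list(csv.DictReader(io.StringIO(CONTRIBUTIONS_CSV)))
reference = list(csv.DictReader(io.StringIO(REFERENCE_CSV)))
matches = reconcile(contributions, reference)

for donor_row, ref_row in matches:
    print(donor_row["donor"], donor_row["amount"], "->", ref_row["source"])
```

Sorting the parts of each name is a blunt way to treat "Last, First" and "First Last" as the same person. A real newsroom tool would likely need fuzzier matching, and a reporter would still have to confirm that any two matching names belong to the same individual.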

The initial focus will be on surveying reporters on how they would like a database like PANDA to work in their newsroom. The next step will be trying to find ways to scale the project across newsrooms of varying sizes. One big question to address is how to host the large amounts of data that PANDA will involve. Boyer said a cloud storage option would likely work best, but they don’t have specifics worked out for that part of the project yet.

One thing Boyer already has worked out is that PANDA needs to be accessible in order to succeed. Newsrooms will have to be able to set it up seamlessly and start putting it to use without much instruction or installation hassle. “The goal is to have a system that each news organization can put to their own use,” Boyer said. “You don’t have to have a server administrator set it up for you. I want this to be something an editor can set up for you, not your IT department.”

Panda image used under a Creative Commons license from Jenn and Tony Bot.

PART OF A SERIES     Knight News Challenge 2011