June 22, 2011, 4 p.m.

The News Challenge-winning PANDA Project aims to make research easier in the newsroom

In an ideal world, when news breaks, reporters can fall back on their encyclopedic knowledge of local stories, events, and people to put the news in context. And failing that, they can turn to colleagues who’ve been covering a beat forever and know where to find the right files.

The ideal world doesn’t often match up with the real one, which calls on reporters to employ a mix of intuition, document archives, and Google to shore up their knowledge when writing a lead on a breaking story. And if there’s a place where the real world and the ideal world meet, it’s in the files, the data, that all newsrooms compile but may not have accessible.

But with your own newsroom PANDA, that could all change. The PANDA Project, a winner of this year’s Knight News Challenge, is what developer Brian Boyer calls a “newsroom data application,” a tool that helps find context and relationships on the fly. Boyer, the news applications editor at the Chicago Tribune, will lead the project, which plans to create a set of web-based open source tools that will allow any newsroom to set up its own PANDA to analyze data whenever the need arises. (As for that name? “PANDA A News Data Application” is a recursive acronym that Boyer hopes is cheeky but not too cute.)

The PANDA project’s one-year, $150,000 grant will largely go towards hiring a developer to build the application, along with some assorted contracting work necessary to give the project a nice look and easy-to-understand features. Boyer is working in concert with Investigative Reporters & Editors (where the developer will likely be working, along with an expanded group of data producers), and The Spokane Spokesman-Review, whose online director, Ryan Pitts, is another lead on the project.

As for building the tool, “the first problem is knowledge management,” Boyer told me. “Where to stash all that information you collect. It’s a problem that all businesses have.”

News organizations, almost by their nature, have tons of data, from Census numbers and campaign finance reports to DWI records and housing prices. It’s all information that has proved its usefulness at one point or another. But instead of allowing it to be dumped in a file cabinet or left to die on a forgotten disk drive, PANDA wants to give all that info a home where it can be easily accessed.

The idea for the project stemmed from an update to a similar database at the Tribune, which allows reporters to cross-check names against city records and other information. “We dig up all kinds of datasets,” Boyer remembers thinking at the time; “we could really augment this thing.” But he and his team started thinking more broadly. They figured that, if having a cross-referencing database could work for the Tribune, it could work well for others — particularly smaller newspapers that may not have the resources the Trib enjoys.

One of the cornerstones of the project is Google Refine, a tool launched last year that cleans up datasets filled with irregularities and inconsistencies. One of the added benefits of Google Refine, Boyer said, is that it can help draw relationships across data. “So when a reporter gets a 10,000-row campaign contribution list, they can reconcile it against databases we keep on file to see what things pop up,” Boyer said.
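
The kind of reconciliation Boyer describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google Refine's method or PANDA's code: it swaps in the standard library's difflib for fuzzy name matching, and the function names, the field names ("donor", "name"), the 0.85 cutoff, and the sample rows are all made up for the example.

```python
import difflib


def normalize(name):
    """Lowercase, drop punctuation, and collapse whitespace so near-identical names compare equal."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def reconcile(contributions, records_on_file, cutoff=0.85):
    """Pair each contribution row with the closest-named record on file, if any is close enough."""
    # Index the records on file by normalized name (hypothetical "name" field).
    index = {normalize(record["name"]): record for record in records_on_file}
    matches = []
    for row in contributions:
        # get_close_matches returns up to n candidates scoring at or above the cutoff.
        candidates = difflib.get_close_matches(
            normalize(row["donor"]), list(index), n=1, cutoff=cutoff
        )
        if candidates:
            matches.append((row, index[candidates[0]]))
    return matches


# Invented sample data: a short contribution list and a dataset kept on file.
contributions = [
    {"donor": "Smith, John A.", "amount": 500},
    {"donor": "Jane  Doe", "amount": 250},
    {"donor": "Acme Holdings LLC", "amount": 1000},
]
city_contractors = [
    {"name": "Jane Doe", "source": "city contractors"},
    {"name": "ACME Holdings, L.L.C.", "source": "city contractors"},
]

matches = reconcile(contributions, city_contractors)
for row, record in matches:
    print(row["donor"], "matches", record["name"], "in", record["source"])
```

A real 10,000-row list would need more careful handling of name order, nicknames, and false positives, so a reporter would still want to review every match by hand before drawing conclusions.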

The initial focus will be on surveying reporters on how they would like a database like PANDA to work in their newsroom. The next step will be trying to find ways to scale the project across newsrooms of varying sizes. One big question to address is how to host the large amounts of data that PANDA will involve. Boyer said a cloud storage option would likely work best, but they don’t have specifics worked out for that part of the project yet.

One thing Boyer has already worked out is that PANDA needs to be accessible in order to succeed. Newsrooms will have to be able to set it up seamlessly and start putting it to use without much instruction or installation hassle. “The goal is to have a system that each news organization can put to their own use,” Boyer said. “You don’t have to have a server administrator set it up for you. I want this to be something an editor can set up for you, not your IT department.”

Panda image used under a Creative Commons license from Jenn and Tony Bot.

Part of a series: Knight News Challenge 2011