Nieman Foundation at Harvard
April 1, 2016, 9:30 a.m.
Reporting & Production

What happens to a great open source project when its creators are no longer using the tool themselves?

PANDA, the four-year-old Knight News Challenge-winning newsroom application for storing and analyzing large data sets, still has a respectable community of users, but could now use a new long-term caretaker.

Wanted: One well-liked open source newsroom application seeks one or more journalist-developers — or even an organization — who can take the reins and help build out a bigger community of users.

PANDA (“PANDA A News Data Application,” not to be confused with the Python data analysis library) was a 2011 Knight News Challenge winner. At its core is a data warehouse appliance that gives local news outlets a centralized place to maintain, organize, and analyze data sets, including huge ones like voter registration rolls. (Disclosure: Nieman Lab is also supported by Knight, though not through a News Challenge grant.)

Reporters can upload their own data sets or search existing data and documents. Interested developers can help tweak the code and make use of PANDA’s API to customize the application to their newsrooms’ needs. The tool was designed with local newsrooms in mind, with features like fuzzy search and language translations, led by a group of people who were working in local newsrooms themselves.

Here’s the thing: None of the people from the PANDA project work in local newsrooms anymore. But team members estimate that at least dozens of newsrooms use it, and not just in the U.S. Some are superusers who’ve even built additional features; others use it on a much smaller scale, with a few interested reporters uploading data every once in a while. Still more newsrooms have it installed but haven’t gotten further than that.

PANDA formally debuted at the 2012 NICAR conference, in partnership with Investigative Reporters & Editors. Brian Boyer, then on the news apps team at the Chicago Tribune and now editor of the NPR Visuals team; Joe Germuska, another Tribune developer and now at Northwestern University’s Knight Lab; and Ryan Pitts, then senior editor managing web development at the Spokane Spokesman-Review and now at Knight-Mozilla’s OpenNews, put together the News Challenge proposal. (The Tribune’s intranet for databases inspired PANDA.)

“Plenty of folks use it a lot, and I’m far from ready to call it dead,” Boyer said. “Now we just need to figure out what the next step is. It’s just dumb luck that every single one of us went somewhere where PANDA doesn’t necessarily make sense!”

PANDA could find a new partner and pass the torch, Boyer suggested. Someone at IRE could look after it and offer it as a benefit for members, so the tool would shift to having a central maintainer. A single organization could host it for many. Or PANDA’s original team could pursue another grant to hire a developer to help debug and build new features.

“It’s a little tricky to figure out what to do with it because the overall adoption was uneven, with some places using it a lot, some places installing it and then forgetting, and some never getting over the hurdle in the first place,” Christopher Groskopf, PANDA’s main original developer, said. (Groskopf was a news applications developer at the Chicago Tribune but is now a reporter at Quartz.) “There hasn’t been one single plan forward that has really emerged. Like a lot of open source projects, ours kind of runs by momentum and rough consensus, and without that single bolt-of-lightning idea about how things should be done, things are more in limbo.”

Germuska and Pitts were both at this year’s NICAR conference in Denver, and Germuska convened a lunch for PANDA users to talk about how they’re using the tool and discuss its future. The call was aimed at serious users, but people who were just curious also showed up (including me).

Matt Kiefer first heard about PANDA when it won the News Challenge in 2011, but didn’t work in a newsroom where it made sense for him to install the tool until a couple of months ago, when he started as a data editor at the nonprofit investigative outlet the Chicago Reporter. The editorial team is small — a dozen or so — and juggles plenty of data sets.

“To keep the data sane, to keep the system manageable even in a small newsroom, you need some kind of system with some kind of standard,” Kiefer said. He found PANDA setup simple (“The documentation was very straightforward — I got out of a meeting with my boss late afternoon and had it deployed before I left for the day”). There are still kinks to work out, but he’s hoping his newsroom will be able to use PANDA’s API features to simplify workflow.

“PANDA is a good solution as a data warehouse,” Kiefer said. “Before data’s served its ultimate purpose, whether piped into graphics or mapping software or something else, it can be on file in an organized way for fact-checking. PANDA is where our data could live between research and getting reported.”


The Associated Press got PANDA up and running about a year ago and has been able to adjust it to the newsroom’s needs, including getting to the point of having single sign-ons for newsroom staff, AP data journalist Serdar Tumgoren told me. PANDA allowed developers to shut down some buggy old Ruby on Rails–based news apps, and it allows reporters who work with a lot of data to deal with a functional data warehouse directly. The AP is using it to store huge amounts of voter registration data. Automation editor Justin Myers feeds regularly released data, like labor and economics statistics, into PANDA on schedule.

“We were looking for a way for reporters to get and share data internally, and PANDA fits that bill in a lot of ways,” Tumgoren said. “I’m very happy with a lot of the features. Many of the reporters really seem to love it and use it to do a lot of their research.”

Tumgoren would love a more complex management system that would allow power users working on investigative stories to restrict access to data and documents. But he acknowledges that PANDA, by design, is a “bazaar of open materials” geared toward “making information more discoverable.”

By all accounts, PANDA’s creators have made themselves as available as possible for feedback and requests, despite no longer being officially involved. (There’s also a semi-active Google group.)

“Open source software have ebbs and flows, and they get regular life because people decide it’s worth it,” Tumgoren said. “I appreciate what the team has been doing. They don’t talk about an end, but rather convene folks to see how we can keep this going, as long as people are using it.”

“One of the oldest rules of open source software is that it works when it scratches people’s itches, and then people are motivated to fix it where it falls short,” Germuska said. “This is not something that’s part of our everyday. But if there are people who don’t know how to start but are motivated to install it, we want to help them.”

Steven Acres via Atlas Obscura.
