Nieman Foundation at Harvard
Sept. 28, 2017, 9:32 a.m.
Mobile & Apps

The internet isn’t forever. Is there an effective way to preserve great online interactives and news apps?

“I like to talk about it as reading today’s news on tomorrow’s computer.”

So many pioneering works of digital journalism no longer exist online, or exist only as a shadow of their former selves.

The Guardian’s 2009 coverage of the MP expenses scandal, for instance, which included a massive crowdsourcing effort and hundreds of thousands of documents (a project we wrote up at Nieman Lab): The stories that anchored that coverage are nowhere to be found on theguardian.com.

A lavish online multimedia experience built around a Pulitzer Prize-nominated work, which explored the legacy of a deadly 1961 bus-train collision in Colorado, from the now-defunct Rocky Mountain News: Rescued from internet limbo, thanks to the efforts of the former Rocky Mountain News staffer who reported the series. (Former Nieman Labber Adrienne LaFrance explored the path to resurrection and warned of the ephemerality of the internet in this 2015 piece for the Atlantic. “The Crossing” project is now at its own, stable URL, though someone still needs to maintain it.)

Archive of the Black Hawk Down series, from the Philadelphia Inquirer when it was owned by Knight Ridder. Source.

You get the picture.

“We were told the internet was forever. That was kind of a lie,” Meredith Broussard, currently a professor at the Arthur L. Carter Journalism Institute at New York University, told me. “I’ve been writing for the web since 1996 or so, and all of my early work is gone. It only exists in paper files in archival boxes somewhere in my apartment. Unless somebody is maintaining internet sites, they go away — and somebody needs to be paying the bill for the server.”

Broussard and colleague Katherine Boss, the librarian for journalism, media, culture, and communication at NYU, are developing a workflow and building tools to help organizations preserve their big data journalism projects effectively and efficiently. They are also putting together a scholarly archive of data journalism projects.

“News apps can’t be preserved the same way you preserve the static webpage,” Broussard said. The Internet Archive’s Wayback Machine is dependable for finding a snapshot in time, but a searcher needs to know the time frame of what they’re looking for, and snapshots don’t really capture a complicated, database-driven project or any site with a lot of dynamic links. “The way to capture these is from the backend. You can grab the whole database — all of the images from the server side, and so forth. We’re looking to build server-side tools that will allow for automated, large-scale, long-term archiving of data journalism projects.”
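To make the "capture it from the backend" idea concrete, here is a minimal sketch in Python. It is not the tool Broussard and Boss are building. It assumes the project's database dump and images already sit in one directory on the server, and the function names `build_manifest` and `archive_project` are hypothetical. The sketch bundles that directory into a single archive along with a checksum manifest, so a future archivist can verify that nothing was lost or altered.

```python
import hashlib
import io
import json
import tarfile
from pathlib import Path


def build_manifest(project_dir):
    """List every file under project_dir with its size and SHA-256 checksum."""
    root = Path(project_dir)
    records = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            records.append({
                "path": path.relative_to(root).as_posix(),
                "bytes": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    return records


def archive_project(project_dir, bundle_path):
    """Write a .tar.gz holding the project files plus a manifest.json.

    Assumes bundle_path lies outside project_dir, so the archive
    does not try to include itself.
    """
    manifest = build_manifest(project_dir)
    payload = json.dumps(manifest, indent=2).encode("utf-8")
    with tarfile.open(str(bundle_path), "w:gz") as tar:
        # Store the project under a fixed top-level folder name.
        tar.add(str(project_dir), arcname="project")
        # Add the manifest from memory, without writing a temp file.
        info = tarfile.TarInfo(name="manifest.json")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return manifest
```

A bundle like this only covers files on disk. A live database would first have to be exported to a dump file, and the software needed to run the app would still have to be recorded separately.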


What remains of this impressive USA Today catalog of casualties in the wars of Afghanistan and Iraq. Source.

Boss and Broussard’s first move has been surveying developers and journalists on the tech used to make and store their news apps. The preliminary survey returned a range of technologies, frameworks, and platforms: Flask, Django, Ruby, Node.js, d3, AWS, Heroku, and on and on.

“Nobody’s yet collected data on what we’re trying to learn. Where are all projects being stored? How are they built — are they pulling from external APIs? We’re trying to ask the right questions digital archivists will want the answers to, in order to build these sorts of tools,” Boss said. “There are great projects that are really just being lost — there is currently no way to archive or preserve them. It’s not really technically possible right now. We’re trying to develop new workflows, and not just within libraries, that could be used by anyone.” (There are efforts to catalog and tag all the interactives out there, such as the Interactive News Depot, but those, too, need to be painstakingly updated and maintained.)
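One of Boss's questions, whether a project pulls from external APIs, can be roughed out from the project's own source code. The sketch below is an illustration, not part of the researchers' survey or tooling. The function name `external_hosts` and the example hostnames are hypothetical. It counts the hosts referenced in a block of source text, skipping any that the newsroom itself controls.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Rough pattern for http(s) URLs embedded in HTML, JavaScript, or config text.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def external_hosts(source_text, own_hosts=()):
    """Count references to hosts that are not in own_hosts."""
    counts = Counter()
    for url in URL_PATTERN.findall(source_text):
        host = urlparse(url).hostname
        if host and host not in own_hosts:
            counts[host] += 1
    return counts


sample = (
    'fetch("https://api.example.org/v1/data"); '
    '<img src="http://news.example.com/a.png"> '
    "https://api.example.org/x"
)
print(external_hosts(sample, own_hosts={"news.example.com"}))
```

Each host that turns up is a dependency an archive would need to capture or replace, because the app will break if that outside service disappears.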

What remains of “Ghost Factories,” an investigation on the hazards of forgotten lead factory locations. Source.

It’s an issue Boss and Broussard have been following closely for a while; Broussard even wrote a response in The Atlantic to LaFrance’s story about “The Crossing.” The data journalism community has also been concerned with preserving interactive projects and news apps for years. The Journalism Digital News Archive, part of the Reynolds Journalism Institute, has been convening researchers and journalists to tackle the problem. A recent Twitter thread, featuring some steadfast champions of news app preservation efforts, offers a taste of what some forward-thinking organizations are considering:

(USA Today is aware of and working on the issue, it seems.)

“But in the broader population, there isn’t much awareness that this is a problem,” Boss said. “And in libraries, this is a thing that librarians just recently have started thinking about. It’s on the cutting edge of research in digital archiving.”

In librarian/archivist nerd-speak, Boss explained, there’s “migration,” and then there’s “emulation.” “Migration” is the traditional stuff we might associate with libraries: digitizing print materials, digitizing microfilms, moving VHS to DVDs and then DVDs to Blu-ray and then Blu-ray to streaming media. That process doesn’t make sense for digital “objects” like news apps, which depend on many different types of software and therefore have too many moving parts to migrate. What if, a hundred years out, we’re not even browsing the internet on computers, or at least not the computers we’re familiar with now? What’s needed is a way to capture a data journalism project from the server side and then “emulate” that whole environment on whatever future device is being used to view the project.

“We’re borrowing best practices from other fields, though there are also issues unique to data journalism,” Broussard said. She pointed to efforts in the scientific community like ReproZip, which can help with reproducing and archiving digital scientific experiments, and to the similar challenges facing contemporary art conservators trying to archive something like an interactive video installation or a piece of performance art.

“I like to talk about it as reading today’s news on tomorrow’s computer,” she said.

Wild picture of a faculty member’s laptop screen, by arvind grover, used under a Creative Commons license.
