March 6, 2013, 1:05 p.m.

Data science, commoditized backends, and the need to know code: A roundup of NICAR 2013

The annual gathering of data journalists, stats junkies, and other assorted investigators just wrapped up in Louisville. Here are some of the highlights from Chrys Wu’s annual summary.


The annual Computer Assisted Reporting conference, known colloquially as NICAR, wrapped Sunday. Of all the journalism conferences held throughout the year, this is the only one that focuses specifically on the needs and interests of reporters and editors who work in investigative news and data journalism. It’s a tremendous three and a half days of classes, panels and camaraderie. Attendees come to learn, share, and solve some of the most pressing issues currently facing the industry.

For the last three years, I’ve been collecting NICAR presentations, tutorials, tools, and work samples, because learning and looking through everything presented takes time. This year’s sessions were wide-ranging (including a brand new session on how to host high-traffic news apps), but there were still a few notable themes running through it all.


Many attendees are in the “people should learn to code” camp, and for good reason: With budgets getting squeezed (at the start of one hands-on tutorial, an urgent question launched from the back: “Is it free?”), fee structures changing, and the growing desire to customize the look and interaction of published work, journalists’ need to literally and figuratively own what they make is more important than ever.


This year, there was much heavier emphasis on learning JavaScript, Python, R, and Ruby. Jeremy Bowers (NPR) and Serdar Tumgoren (Washington Post) posted their Fundamentals of Programming in Python materials to GitHub and created a Google Group for class members. Ron Campbell (The Orange County Register) and Christopher Schnaars (USA Today) offered Programming for the Rest of Us to those who wanted to code but were worried about the learning curve. (Though not specifically taught this year, there are also a number of excellent d3.js tutorials, which I’ve collected in the references section of my list.)

That said, lots of data still comes in Excel spreadsheets, and Krista Kjellman Schmidt (ProPublica), Linda Johnson (Lexington Herald-Leader), Denise Malan (Corpus Christi Caller-Times) and MaryJo Webster (St. Paul Pioneer Press) all gave terrific presentations on how to work with them.

We’re sharing best practices

Sometimes it’s hard to tell when “best practices” are coming from experts or pretenders. In the case of NICAR, it’s a pretty safe bet that you’re getting advice from people who’ve tried, tested, and refined their methods.


Dave Cole (Mapbox), John Keefe (WNYC), and Matt Stiles (NPR) shared what works well for mapping. Tasneem Raja (Mother Jones) and Sisi Wei (ProPublica) showed how to make interactives fun. Steve Myers (The Lens) explained workflows for social media discovery, verification, and publication, particularly during breaking news.

Learning from data science


The oft-cited Venn diagram from data scientist Drew Conway describes data science as the melding of hacking skills, math and statistics knowledge, and substantive expertise. It’s the hot term for the combination of skills that more and more industries need.

With that in mind, IRE and the Center for Investigative Reporting data journalism team created a Kaggle competition that asked data scientists to look at campaign finance records. More than a dozen new ways of looking at the data came back, and with them, some insights into how journalists could learn from the various approaches. Chase Davis, who led the CIR team, talked about the results and posted his own code, slides and tipsheets from his four talks on GitHub.

One of the key tools for statistical analysis is R, and two of its most popular ambassadors gave hands-on demonstrations of how to use it. New York Times graphics editor Amanda Cox’s session showed attendees R’s power to generate maps from data that she had painstakingly (and considerately) cleaned beforehand. A sample of her more recent work can be found at NYTimes.com.

Hadley Wickham, statistician and author of several popular R libraries including ggplot2 and plyr, held a daylong workshop that delved into ways to visualize, clean, transform, and model data with R. For many in the course, it was an eye-opening introduction not only to how to use the tool but, more importantly, to how to understand, doubt, and test datasets. Hadley has shared his detailed slides and code, and Sisi Wei shared her class notes.

Making data journalism easier for everyone


There was much discussion of tweaking workflows to make the reporting process more data-journalism friendly. Last year, Balance Media and WNYC introduced Tabletop.js, which allows Google Spreadsheet data to power web interactives. This year, the Chicago Tribune news apps team introduced Tarbell, a Google Docs-driven CMS. Journalists Heather Billings (Chicago Tribune), Jacob Harris (The New York Times), and Al Shaw (ProPublica) spoke about this and other ways of getting news apps and the CMS to live together in their talk Infect the CMS.

Since 2010, NICAR has hosted a lightning talks session. Attendees get to pitch a five-minute talk, and the 10 most popular are presented. After last year’s “Cats Cats Cats” stunt by Aron Pilhofer of The New York Times, it was no wonder this year’s session was packed. This year’s standout moment was Ben Welsh’s five-minute rant (and yes, there’s some cursing) about the five ways coding like a web developer can make you a better investigative developer. The best part? Ben tells coders the five things they need to learn from reporters. Even as the methods and tools change, tried and true reporting skills still matter.

Be excited and keep learning. Visit the complete roundup of NICAR13 tools, slides, and links and dig in.

Cartoon via xkcd.
