Nieman Foundation at Harvard
March 6, 2013, 1:05 p.m.

Data science, commoditized backends, and the need to know code: A roundup of NICAR 2013

The annual gathering of data journalists, stats junkies, and other assorted investigators just wrapped up in Louisville. Here are some of the highlights from Chrys Wu’s annual summary.

The annual Computer Assisted Reporting conference, known colloquially as NICAR, wrapped Sunday. Of all the journalism conferences held throughout the year, this is the only one that focuses specifically on the needs and interests of reporters and editors who work in investigative news and data journalism. It’s a tremendous three and a half days of classes, panels, and camaraderie. Attendees come to learn, share, and solve some of the most pressing issues currently facing the industry.

For the last three years, I’ve been collecting NICAR presentations, tutorials, tools, and work samples, because learning and looking through everything presented takes time. This year’s sessions were wide-ranging (including a brand new session on how to host high-traffic news apps), but there were still a few notable themes running through it all.

DIY

Many attendees are in the “people should learn to code” camp, and for good reason: With budgets getting squeezed (at the start of one hands-on tutorial, an urgent question launched from the back: “Is it free?”), fee structures changing, and the growing desire to customize the look and interaction of published work, journalists’ need to literally and figuratively own what they make is more important than ever.

This year, there was much heavier emphasis on learning JavaScript, Python, R, and Ruby. Jeremy Bowers (NPR) and Serdar Tumgoren (Washington Post) posted their Fundamentals of Programming in Python materials to GitHub and created a Google Group for class members. Ron Campbell (The Orange County Register) and Christopher Schnaars (USA Today) offered Programming for the Rest of Us to those who wanted to code but were worried about the learning curve. (Though not specifically taught this year, there are also a number of excellent d3.js tutorials, which I’ve collected in the references section of my list.)

That said, lots of data still comes in Excel spreadsheets, and Krista Kjellman Schmidt (ProPublica), Linda Johnson (Lexington Herald-Leader), Denise Malan (Corpus Christi Caller-Times), and MaryJo Webster (St. Paul Pioneer Press) all gave terrific presentations on how to work with them.

We’re sharing best practices

Sometimes it’s hard to tell whether “best practices” are coming from experts or pretenders. In the case of NICAR, it’s a pretty safe bet that you’re getting advice from people who’ve tried, tested, and refined their methods.

[Image: xkcd heatmap cartoon]

Dave Cole (Mapbox), John Keefe (WNYC), and Matt Stiles (NPR) shared what works well for mapping. Tasneem Raja (Mother Jones) and Sisi Wei (ProPublica) showed how to make interactives fun. Steve Myers (The Lens) explained workflows for social media discovery, verification, and publication, particularly during breaking news.

Learning from data science

[Image: Drew Conway’s data science Venn diagram]

The oft-cited Venn diagram from data scientist Drew Conway describes data science as the melding of hacking skills, math and statistics knowledge, and substantive expertise. It’s the hot term for the combination of skills that more and more industries need.

With that in mind, IRE and the Center for Investigative Reporting data journalism team created a Kaggle competition that asked data scientists to look at campaign finance records. More than a dozen new ways of looking at the data came back, and with them, some insights into how journalists could learn from the various approaches. Chase Davis, who led the CIR team, talked about the results and posted his own code, slides, and tipsheets from his four talks on GitHub.

One of the key tools for statistical analysis is R, and two of its most popular ambassadors gave hands-on demonstrations of how to use it. New York Times graphics editor Amanda Cox’s session showed attendees R’s power to generate maps from data that she had painstakingly (and considerately) cleaned beforehand. A sample of her more recent work can be found at NYTimes.com.

Hadley Wickham, statistician and author of several popular R libraries including ggplot2 and plyr, held a daylong workshop that delved into ways to visualize, clean, transform, and model data with R. For many in the course, it was an eye-opening introduction not only to how to use the tool but, more importantly, to how to understand, doubt, and test datasets. Hadley has shared his detailed slides and code, and Sisi Wei shared her class notes.
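
To make "doubting and testing" a dataset concrete, here is a minimal sketch of that habit. It is not taken from the workshop, which used R; this version uses plain Python instead. The sample records, the column names, and the audit_rows helper are all hypothetical, invented only for illustration. It flags rows with missing values, negative amounts, or exact duplicates before any analysis begins.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """donor,state,amount
A. Smith,KY,250
B. Jones,KY,-40
A. Smith,KY,250
C. Lee,,1000
"""


def audit_rows(rows):
    """Return row numbers (1-based, header excluded) that raise simple red flags."""
    seen = set()
    problems = {"missing": [], "negative": [], "duplicate": []}
    for row_no, row in enumerate(rows, start=1):
        # An exact repeat of an earlier row may be a double entry.
        key = tuple(row.values())
        if key in seen:
            problems["duplicate"].append(row_no)
        seen.add(key)
        # Any empty field is worth a second look.
        if any(value == "" for value in row.values()):
            problems["missing"].append(row_no)
        # A negative amount may be a refund or a typo; either way, ask.
        if row["amount"] and float(row["amount"]) < 0:
            problems["negative"].append(row_no)
    return problems


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = audit_rows(rows)
print(report)
```

None of these flags proves an error. Each one is a question to take back to the source of the data.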

Making data journalism easier for everyone

There was much discussion of tweaking workflows to make the reporting process more data-journalism friendly. Last year, Balance Media and WNYC introduced Tabletop.js, which allows Google Spreadsheet data to power web interactives. This year, the Chicago Tribune news apps team introduced Tarbell, a Google Docs-driven CMS. Journalists Heather Billings (Chicago Tribune), Jacob Harris (The New York Times), and Al Shaw (ProPublica) spoke about this and other ways of getting news apps and the CMS to live together in their talk Infect the CMS.

Since 2010, NICAR has hosted a lightning talks session. Attendees get to pitch a five-minute talk, and the 10 most popular are presented. After last year’s “Cats Cats Cats” stunt by Aron Pilhofer of The New York Times, it was no wonder this year’s session was packed. This year’s standout moment was Ben Welsh’s five-minute rant (and yes, there’s some cursing) about the five ways coding like a web developer can make you a better investigative developer. The best part? Ben tells coders the five things they need to learn from reporters. Even as the methods and tools change, tried and true reporting skills still matter.

Be excited and keep learning. Visit the complete roundup of NICAR13 tools, slides, and links and dig in.

Cartoon via xkcd.
