The golden age of computer-assisted reporting is at hand

By Mathew Ingram, May 20, 2009  /  11:14 p.m.

Computer-assisted reporting or CAR has been around, well — ever since there were computers. Even when I was in journalism school (which was longer ago than I care to remember), we learned about databases we could search, etc. But the explosion of Web-based tools and ways of sifting through and sharing data has created something approaching a revolution, and the potential benefits for journalism are only just beginning to reveal themselves. If this movement has a patron saint, it is probably Adrian Holovaty, who gained renown for creating the amazing Chicagocrime.org — one of the first Google Maps mashups — and then worked on data-driven features at the Washington Post, followed by his fellowship-financed Everyblock, which aggregates local data about an area.

Another recent example of how data can drive reporting, and how Web-based tools can extend and enhance that reporting, comes from several British newspapers — primarily The Guardian — and their coverage of an emerging expense scandal involving British politicians. One of the really interesting things that The Guardian has done is to publish all of the expense info they have through a laboriously detailed and publicly accessible Google spreadsheet. As Paul Bradshaw points out at the Online Journalism Blog, this structure actually allows reporters (or in fact anyone who is interested in the info) to extract useful data simply by changing the URL. Someone has even created a page where you can run queries on the database with a simple click.
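To make the "changing the URL" idea concrete, here is a minimal sketch of how such a query URL might be assembled in Python. It assumes a publicly shared Google spreadsheet and the present-day Google Visualization query endpoint, which may differ from the URL form Bradshaw described in 2009. The spreadsheet key, the column letters and the helper name `build_query_url` are placeholders invented for illustration, not details of The Guardian's actual sheet.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; not the key of The Guardian's actual spreadsheet.
SHEET_KEY = "EXAMPLE_SPREADSHEET_KEY"


def build_query_url(sheet_key: str, query: str, out: str = "csv") -> str:
    """Return a URL asking a publicly shared Google spreadsheet to run `query`."""
    base = f"https://docs.google.com/spreadsheets/d/{sheet_key}/gviz/tq"
    # `tq` carries the query text; `tqx=out:csv` requests CSV output.
    params = {"tqx": f"out:{out}", "tq": query}
    return f"{base}?{urlencode(params)}"


# Assumed layout: column A holds a name, column C holds a claimed amount.
url = build_query_url(SHEET_KEY, "select A, C where C > 1000 order by C desc")
print(url)
```

Changing only the query string (a different column, a different threshold) yields a different slice of the same published data, which is the point Bradshaw makes: the spreadsheet doubles as a small queryable database.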

There are any number of tools out there that can take the data you get from spreadsheets or databases and do useful things with it, such as organizing it into charts the way Many Eyes Wikified (a spinoff from IBM’s Many Eyes) does. Another source of interesting data-driven mini-apps is Yahoo Pipes, an often-overlooked service that lets you create data mashups of various kinds. I’ve already come across pipes that someone created to map your Twitter followers and strip-mine your Twitter stream for links, and I’m sure there are dozens of others. They are relatively easy to create and can be easily customized to do a variety of things.

Is mapping your Twitter followers journalism? Not really. But these tools can be used for all kinds of journalistic efforts, as The Guardian and others have shown. As Holovaty continually points out, we are just scratching the surface of what is possible with the data underlying much of journalism. That data would be a lot easier to remix, mash up and display in different and interesting ways if newspapers identified, tagged and indexed it while stories were being written, instead of trying to do those things retroactively. When the data that is already being collected is freed up, projects like this (a Holovaty production) suddenly become not just possible but almost easy to generate.
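As a rough illustration of what tagging data at writing time could buy you, here is a small Python sketch. The `Story` schema, the field names and the sample records are all invented for this example; they are not how any particular newsroom stores its stories.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Story:
    """A story saved with its key facts as structured fields (hypothetical schema)."""
    headline: str
    published: date
    place: str
    topics: list[str] = field(default_factory=list)


# Invented sample records, for illustration only.
stories = [
    Story("Council approves budget", date(2009, 5, 18), "Chicago", ["politics", "budget"]),
    Story("Transit fares to rise", date(2009, 5, 19), "Chicago", ["transit", "budget"]),
    Story("School board elects chair", date(2009, 5, 20), "Toronto", ["education"]),
]


def stories_about(items: list[Story], topic: str) -> list[Story]:
    """Return the stories tagged with the given topic."""
    return [s for s in items if topic in s.topics]


def count_by_place(items: list[Story]) -> Counter:
    """Count stories per place, ready to feed a map or chart."""
    return Counter(s.place for s in items)


print([s.headline for s in stories_about(stories, "budget")])
print(count_by_place(stories))
```

Because the place and topics were recorded as fields when each story was filed, filtering and counting take a line or two each. Recovering the same facts from finished article text after the fact would be far more work.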

The Guardian, not surprisingly, is pretty far out in front on this, along with the New York Times, which has also been doing a lot of interesting data-driven things. But while the NYT has an open API for stories and data, only The Guardian offers a *full* API of all the content they publish, as well as a “data store” filled with lots of the data they have accumulated on a whole range of stories (if you’re interested in some tips, there’s a great interview here with Tony “the Data Juggler” Hirst, one of the most active users of The Guardian’s data and APIs). If you’ve got any other great examples of newspapers using data to enhance their journalism, or any useful sites or recommended Yahoo Pipes, please leave links in the comments.

Bonus link:

See Adrian Holovaty’s definitive, two-part answer to the question “is data journalism?”

This entry was written by Mathew Ingram and posted on May 20, 2009 at 11:14 pm.


21 comments:

  1. Murry Shohat at 11:16 am, May 21, 2009

    Using data mining skills with metrics generated by Google, an international team with a journalist in the lead uncovered massive plagiarism, about one million words’ worth. Story here:
    http://knol.google.com/k/murry-shohat/the-curious-copyright-infringement-case/2srzofgvr8kjr/16#

     
  2. Joan at 4:45 pm, May 21, 2009

    Data is valuable to the degree that it tells a story. Throwing a spreadsheet on the web is useful for the purposes of public disclosure, but it is not journalism any more than a reporter publishing his notes (in lieu of writing and editing) would be considered journalism. Using data to effectively inform the public calls for critical analysis, an understanding of proper sourcing and methodology, and the ability to communicate findings in a way that is both succinct and nuanced.

    This is why I find the “Computer-Assisted Reporting” moniker irksome. It implies a certain passivity on the part of those who would make use of it. It marginalizes the process of information gathering and communication that goes into data-driven reporting and essentializes it to a tool: the computer. What if, in the days before computer ubiquity, journalists were regarded not for their investigative reporting ability, but for their remarkable adeptness at making use of a notepad and pen? “You can track information by recording it on paper with ink? You don’t say.” Notepads and pens are used to report the news, sure, but that’s not what it’s all about.

    In the same way, using computers to store data or to do flashy things with numbers is not what it’s all about. The essential value of data-driven reporting is taking a volume of information and helping people to digest and understand it. This may happen with varying degrees of distillation and analysis. In its rawer forms, data-driven reporting may look like EveryBlock or the LA Times Homicide Map, which present broad and minimally edited data for users to explore. In its more distilled forms, it may focus on illuminating specific phenomena such as the NY Times “Geography of a Recession” graphic.

    So is data-driven reporting journalism? Yes and no. There will undoubtedly be some misguided attempts to hop on the data bandwagon without regard to how effectively it tells a story. People will throw numbers up on a website, maybe slap on a flash interface, and then wonder why readers aren’t engaging with it. But then there will be those instances where data-driven reporting is used to show us something about the world in which we live, to elevate us beyond anecdotal experiences to see things from a more global perspective. A golden age of that? Sign me up.

     
  3. Mathew Ingram at 10:47 pm, May 21, 2009

    I couldn’t agree more, Joan — analysis and context and all of those other things are definitely required. That’s why it’s called “computer-assisted” reporting rather than “computerized reporting.” What you do with the data in terms of making it understandable and putting in a meaningful context is the important thing.

     
  4. MichaelJ at 8:45 am, May 22, 2009

    Joan, you say “Data is valuable to the degree that it tells a story.” Data never tells a story. That needs smart people. But the more data points that are available the more nuanced the story can be. And the faster it can be published for the ever changing formations of communities of interest.

    I do agree that “computer assisted” needs revising to capture a different reality.

     
  5. number1 at 2:20 pm, June 1, 2009

    The term computer-assisted means just that. If the context is not put in an organized and understandable manner, readers will not fully comprehend the data. In order for the data to tell a story someone has to come along and organize the data in such a way that it tells a story and makes sense to the viewer.

     
  6. Brant Houston at 12:35 am, June 9, 2009

    The use of digital tools to present data and information on the Web – and make it easy to search and sift through – is a big step forward. And what the Guardian has done is laudable.

    But this post overlooks the long and impressive work in computer-assisted reporting (or precision journalism as practiced, espoused and taught by Nieman fellow Philip Meyer) that has been going on for decades.

    If you take a look at the web site of Investigative Reporters and Editors (http://www.ire.org) and the National Institute for Computer-Assisted Reporting (founded in 1989 and renamed in 1994) and look through its story indexes or the feature Extra!Extra!, you will find numerous stories (in the hundreds if not thousands), both simple and complex, that have been done over the last two decades using data analysis.

    Adrian Holovaty and others do impressive computer-assisted aggregations and presentations of data on the Web, but they do not do data analysis or investigative work or what has been known as computer-assisted reporting.

    For a full appreciation of how computer-assisted reporting and database analysis can contribute to better journalism, take a look at the work of Sarah Cohen and Dan Keating at the Washington Post; Andy Lehren, Janet Roberts, Griff Palmer, Jo Craven McGinty, Tom Torok, Aron Pilhofer and many others at the New York Times; Tom McGinty and Maurice Tamman at the Wall Street Journal; Jennifer LaFleur at ProPublica and David Donald at the Center for Public Integrity; and many, many others both in the U.S. and around the world.

    Brant Houston
    Knight Chair in Investigative Reporting
    University of Illinois
    Author: Computer-Assisted Reporting: A Practical Guide
    Former executive director of Investigative Reporters and Editors and the National Institute for Computer-Assisted Reporting

     

Trackbacks:

  1. The golden age of data journalism? at 11:17 pm, May 20, 2009

    [...] read the rest of this post at the Nieman Journalism Lab) [...]

     
  2. Links in investigative journalism 21_May_09 « The Centre for Investigative Journalism News Blog at 5:10 am, May 21, 2009

    [...] The Golden Age of Computer-Assisted Reporting is at Hand – Nieman Labs [...]

     
  3. Variety of Ways to Have Your Own Satellite TV on PC | diet at 2:53 pm, May 21, 2009

    [...] The golden age of computer-assisted reporting is at hand » Nieman … [...]

     
  4. The golden age of computer-assisted reporting is at hand (Mathew Ingram/Nieman Journalism Lab) | Cars For Sale and More at 8:14 pm, May 21, 2009

    [...] Ingram / Nieman Journalism Lab:The golden age of computer-assisted reporting is at hand  —  Computer-assisted reporting or CAR has been around, well — ever since [...]

     
  5. NYTimes Appoints First Social Media Editor | GroupHelp.NET - Easy everything! at 1:46 pm, May 26, 2009

    [...] (For more Ingram goodness, see his article last week at the Nieman Journalism Lab titled The Golden Age of Computer Assisted Reporting is at Hand.) [...]

     
  6. MLB.com’s iPhone App Could Be a Model For Media Saving Itself | yKvz Blog at 2:14 pm, June 17, 2009

    [...] access for sophisticated ongoing access. Mathew Ingram of the Toronto Globe and Mail says that the Golden Age of Computer Assisted Reporting is at hand. As Paul Bradshaw wrote this week on his OnlineJournalismBlog, every newspaper should have a data [...]

     
  7. MLB.com's iPhone App Could Be a Model For Media Saving Itself - ComponentGear.com Feed - ComponentGear.com at 4:50 pm, June 17, 2009

    [...] access for sophisticated ongoing access. Mathew Ingram of the Toronto Globe and Mail says that the Golden Age of Computer Assisted Reporting is at hand. As Paul Bradshaw wrote this week on his OnlineJournalismBlog, every newspaper should have a data [...]

     
  8. MLB.com’s iPhone App Could Be a Model For Media Saving Itself | Techdare at 1:12 am, June 18, 2009

    [...] access for sophisticated ongoing access. Mathew Ingram of the Toronto Globe and Mail says that the Golden Age of Computer Assisted Reporting is at hand. As Paul Bradshaw wrote this week on his OnlineJournalismBlog, every newspaper should have a data [...]

     
  9. My Site! » MLB.com’s iPhone App Could Be a Model For Media Saving Itself at 3:09 pm, June 18, 2009

    [...] access for sophisticated ongoing access. Mathew Ingram of the Toronto Globe and Mail says that the Golden Age of Computer Assisted Reporting is at hand. As Paul Bradshaw wrote this week on his OnlineJournalismBlog, every newspaper should have a data [...]

     
  10. John Tedesco» Blog Archive » Today’s watchdog blog roundup at 1:05 pm, July 7, 2009

    [...] Nieman Journalism Lab: The Golden Age of computer-assisted reporting is at hand. Share and Enjoy: [...]

     
  11. AMB Album » MLB.com’s iPhone App Could Be a Model For Media Saving Itself at 12:51 am, September 2, 2009

    [...] access for sophisticated ongoing access. Mathew Ingram of the Toronto Globe and Mail says that the Golden Age of Computer Assisted Reporting is at hand. As Paul Bradshaw wrote this week on his OnlineJournalismBlog, every newspaper should have a data [...]

     
  12. The Magic Number « Broadsheet Boutique at 8:35 am, December 7, 2009

    [...] CAR, is not, as Google suggests, “a motor vehicle with four wheels”, it is in fact the process of using data and figures to produce a news story. There is a marginally more developed definition here and a fantastic overview here. [...]

     
  13. Computer Assisted Reporting « My Other Blog at 1:16 pm, December 14, 2009

    [...] Computer Assisted Reporting (CAR) has been around in The States since 1952, delivering raw information on local communities and making it searchable to others. CAR is a way of communicating data effectively and was originally developed from journalists using tools like databases and spreadsheets. It gets others involved in what you’re doing, while inadvertently helping you to cut down your workload. Instead of being the lowly intern who has to weed through 458,832 pages of MPs’ expenses, by using CAR other people can gain access to the material and you can sift through it together. [...]

     
  14. philosophie des journalismus oder: wahrheit im medienzeitalter « miss pia's diary at 5:12 am, January 11, 2010

    [...] I believe that setting up an opposition between either “internet experiments” or “philosophy” is misplaced. The philosophical curriculum he lays out needs to be polished up and modernized. For the question “what is truth?” is in fact highly topical in the multimedia society. But it can no longer be answered with purely philosophical approaches. To distinguish between fact and fiction in the masses of data on the WWW, one also needs technology, a forensic training for dissecting data salad. Journalists are increasingly practicing “data-journalism”. [...]

     
  15. The golden age of computer-assisted reporting is at hand » Nieman Journalism Lab « Computation + Journalism Class at Georgia Tech at 12:48 pm, January 25, 2010

    [...] via The golden age of computer-assisted reporting is at hand » Nieman Journalism Lab. [...]

     
