Nieman Foundation at Harvard
HOME
          
LATEST STORY
Why are digital newsrooms unionizing now? “This generation is tired of hearing that this industry requires martyrdom”
ABOUT                    SUBSCRIBE
May 31, 2017, 9 a.m.
Reporting & Production
LINK: www.apstylebook.com  ➚   |   Posted by: Laura Hazard Owen   |   May 31, 2017

It’s fitting that, in a year when the Panama Papers investigation won the 2017 Pulitzer Prize for explanatory reporting (the entire leaked data set for that investigation totaled 11.5 million documents adding up to 2.6 terabytes), the Associated Press is releasing its updated 2017 Stylebook with a new chapter on data journalism.

“Government agencies, businesses and other organizations alike all communicate in the language of data and statistics,” the AP said. “To cover them, journalists must become conversant in that language as well.”

Here are a few of the AP’s data journalism recommendations:

Get the data in searchable form, if you can. “In a records request for data, be sure to ask for data in an ‘electronic, machine-readable’ format that can be interpreted by standard spreadsheet or database software. The alternative, which is the default for many agencies, is to provide records in paper form or as scans of paper pages, which present an obstacle to analysis.”

Scraping data should be a “last resort.”

Some website operators sanction this practice, and others oppose it. A website with policies limiting or prohibiting scraping often will include them in its terms of service or in a “robots.txt” file, and reporters should take these into account when considering whether to scrape.

Scraping a website can cause its servers to work unusually hard, and in extreme cases, scraping can cause a website to stop working altogether and treat the attempt as a hostile attack. Therefore, follow these precautions:
— Scraping should be seen as a last resort. First try to acquire the desired data by requesting it directly.
— Limit the rate at which the scraper software requests pages in order to avoid causing undue strain on the website’s servers.
— Wherever feasible, identify yourself to the site’s maintainers by adding your contact information to the scraper’s requests via the HTTP headers.

Make sure someone else can reproduce your findings. “If at all possible, an editor or another reporter should attempt to reproduce the results of the analysis and confirm all findings before publication.”

Let your readers see the source data, too.

Where possible, provide the source data for download along with the story or visualization. When distributing data consider the following guidelines:
— The data should be distributed in a machine-readable, widely useable format, such as a spreadsheet.
— The data should be accompanied by thorough documentation that explains data provenance, transformations and alterations, any caveats with the data analysis and a data dictionary.

The updated stylebook also includes entries on fake news, among other things; if you have questions for its editor, you can ask them in a 2:30 p.m. ET Twitter chat with the hashtag #APStyleChat.

Show tags Show comments / Leave a comment
 
Join the 50,000 who get the freshest future-of-journalism news in our daily email.
Why are digital newsrooms unionizing now? “This generation is tired of hearing that this industry requires martyrdom”
“These are professional-class jobs paying working-class wages, and these people have working-class worries about being downsized, laid off, cast aside in a market that is really stripped down.”
The New Humanitarian (no longer an acronymed UN agency) wants to move humanitarian crisis journalism beyond its wonky, depressing roots
“It’s one thing to have been an internal information center for an international NGO. It’s another thing to become a full newsroom, and an independent newsroom at that — it’s not a switch you turn on and off.”
Research from Canada suggests journalists’ creed can withstand government support
Social psychologists have demonstrated a significant link between people’s behavior, their values, and the norms of their milieu, and that they feel rewarded when they act consistently with their beliefs.