Nieman Foundation at Harvard
Sept. 10, 2013, 12:58 p.m.
Posted by: Joshua Benton   |   September 10, 2013

Sarah Marshall at

A tool that helps non-coding journalists scrape data from websites has launched in public beta today. The tool lets you extract data from any website into a spreadsheet simply by mousing over a few rows of information.

Until now the tool, which we reported on back in April, has been available in private developer preview and has been Windows only. It is now also available for Mac and is open to all.

Although the company plans to charge for some services at a later date, there will always be a free option.

The London-based start-up is trying to solve the problem that there is “lots of data on the web, but it’s difficult to get at”, Andrew Fogg, the company’s founder, said in a webinar last week.

Those with the know-how can write a scraper or use an API to get at data, Fogg said. “But imagine if you could turn any website into a spreadsheet or API.”
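For a sense of what "writing a scraper" means in practice, here is a minimal sketch in Python that turns an HTML table into spreadsheet-style CSV using only the standard library. It is not how the tool described above works; the `TableScraper` class, the `html_table_to_csv` function, and the sample table are all hypothetical, made up for illustration, and a real page would usually need fetching and messier parsing.

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample data, standing in for a page you would download.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Count</th></tr>
  <tr><td>Alpha</td><td>10</td></tr>
  <tr><td>Beta</td><td>20</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML))
```

The resulting text can be saved with a `.csv` extension and opened in any spreadsheet program. Even this toy version shows why a point-and-click alternative appeals to non-coders: the scraper has to know the page's structure in advance.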

If learning the basics (and the not-so-basics) of scraping is of interest to you, I can recommend Paul Bradshaw’s Scraping for Journalists.
