June 11, 2013, 1 p.m.

OpenData Latinoamérica: Driving the demand side of data and scraping towards transparency

“If you put five Chileans in a room, they’re probably going to fight each other. So one of the things — we’re not just building tools, we’re also building ways of working together, and making people trust each other.”

Read this story in Spanish here.

“There’s a saying here, and I’ll translate, because it’s very much how we work,” Miguel Paz said to me over a Skype call from Chile. “But that doesn’t mean that it’s illegal. Here, it’s ‘It’s better to ask forgiveness than to ask permission.’”

Paz is a veteran of the digital news business. The saying has to do with his approach to scraping public data from governments that may be slow to share it. He’s also a Knight International Journalism Fellow, the founder of Hacks/Hackers Chile, and a recent Knight News Challenge winner. A few years ago, he founded Poderopedia, a database of Chilean politicians and their many connections to political organizations, government offices, and businesses.

But freeing, organizing, and publishing data in Chile alone is not enough for Paz, which is why his next project, in partnership with Mariano Blejman of Argentina’s Hacks/Hackers network, is aimed at freeing data from across Latin America. Their project is called OpenData Latinoamérica. Paz and Blejman hope to build a centralized home where all regional public data can be stored and shared.

Their mutual connection through Hacks/Hackers is key to the development of OpenData Latinoamérica. The network will make itself available, to whatever extent possible, for troubleshooting and training as the project gets off the ground and civic hackers and media types learn both how to upload data sets and how to make use of the information they find there.

Another key partnership helping make OpenData Latinoamérica possible is with the World Bank Institute’s Global Media Development program, which is run by Craig Hammer. Hammer believes the data age is revolutionizing government, non-government social projects, and how we make decisions about everyday life.

“The question for us, is, What are we gonna do with the data? Data for what? Bridging that space between opening the data and how it translates into improving the quality of people’s lives around the world requires a lot of time and attention,” he says. “That’s really where the World Bank Institute and our programmatic work is focused.”

A model across the Atlantic

Under Hammer, the World Bank helped organize and fund Africa Open Data, a similar project launched by another Knight fellow, Justin Arenstein. “The bank’s own access-to-information policy provides for a really robust opportunity to open its own data,” Hammer says, “and in so doing, provide support to countries across regions to open their own data.”

Africa Open Data is still in beta, but bringing together hackers, journalists, and information in training bootcamps has already led to reform-producing journalism. In a post about the importance of equipping the public for the data age, Hammer tells the story of Irene Choge, a journalist from Kenya who attended a training session hosted by the World Bank in conjunction with Africa Open Data.

She…examined county-level expenditures on education infrastructure — specifically, on the number of toilets per primary school…Funding allocated for children’s toilet facilities had disappeared, resulting in high levels of open defecation (in the same spaces where they played and ate). This increased their risk of contracting cholera, giardiasis, hepatitis, and rotavirus, and accounted for low attendance, in particular among girls, who also had no facilities during their menstruation cycles. The end result: poor student performance on exams…Through Choge’s analysis and story, open data became actionable intelligence. As a result, government is acting: ministry resources are being allocated to correct the toilet deficiency across the most underserved primary schools and to identify the source of the misallocation at the root of the problem.

Hammer calls Africa Open Data a useful “stress test” for OpenData Latinoamérica, but Paz says the database was also a natural next step in a series of frustrations he and Blejman had encountered in their other work.

“Usually, the problem you have is: Everything is cool before the hackathon, and during the hackathon,” says Paz. “But after, it’s like, who are the people who are working on the project? What’s the status of the project? Can I follow the project? Can I be a part of the project?” The solution to this problem ended up being Hackdash, which was actually Blejman’s brainchild — an interface that helps hackers keep abreast of the answers to those questions and thereby shore up the legacy of various projects.

So thinking about ways that international hackers can organize and communicate across the region is nothing new to Paz and Blejman. “One hackathon, we would do something, and another person who didn’t know about that would do something else. So when we saw the Open Data Africa platform, we thought it was a really great idea to do in Latin America,” he says.

Blejman says the contributions of the World Bank have been essential to the founding of OpenData Latinoamérica, especially in organizing the data bootcamps. Hammer says he sees the role of the bank as building a bridge between civic hackers and media. “More than a platform,” he says, it’s “an institution in and of itself to help connect sources of information to government and help transform that data into knowledge and that knowledge into action.”

Giving people the tools to understand the power of data is an important tenet of Hammer’s open data philosophy. He believes the next step for Big Data is global data literacy, which he says is most immediately important for “very specific and arguably strategic public constituencies — journalists, media, civic hackers, and civil society.” Getting institutions, like newspapers, to embrace the importance of data literacy rather than relying on individual interest is just one goal Hammer has in mind.

“I’m not talking about data visualization skills for planet Earth,” he says. “I’m saying, it’s possible — or it should be possible — for anybody that wants to have these skills to have them. If we’re talking about data as the real democratizer — open data as meaningful democratization of information — then it has to be digestible and accessible and consumable by everyone and everybody who wants to access and digest and consume it.”

Increasing the desire of the public for more, freer data is what Hammer calls stoking the demand side. He says it’s great if governments are willingly making information accessible, but for it to be useful, people have to understand its power and seek to unleash it.

“What’s great about OpenData Latinoamérica is it’s in every way a demand-side initiative, where the public is liberating its own data — it’s scraping data, it’s cleaning it,” he says. “Open data is not solely the purview of the government. It’s something that can be inaugurated by public constituencies.”

For example, in Argentina, where the government came late to the open data game, Blejman says he saw a powerful demand for information spring up in hackers and journalists around him. When they saw what other neighboring countries had and what they could do with that information, they demanded the same, and Argentina’s government began to release some of that data.

“We need to think about open data as a service, because no matter how much advocacy from NGOs, people don’t care about ‘open data’ per se,” Paz says. “They care about data because it affects their life, in a good or bad way.”

Another advantage Blejman and Paz had when heading into OpenData Latinoamérica was the existence of Junar, a Chilean software platform founded by Javier Pajaro, who was a frustrated analyst when he decided to embrace open data platforms and help others do the same. Blejman said that, while Africa Open Data opted for CKAN, using a local, Spanish-language company that was already familiar to members of the Hacks/Hackers network has strengthened the project, making it easier to troubleshoot problems as they arise. He also said Junar’s ability to give participating organizations more control fit nicely into their hands-off, crowd-managed vision for future day-to-day operation of the database.

Organizing efforts

Paz and Blejman have high hopes for the stories and growth that will come from OpenData Latinoamérica. “What we expect from these events is for people to start using data, encourage newspapers to organize around data themes, and have the central hub for what they want to consume,” Blejman said.

They hope to one day bring in data from every country in Latin America, but they acknowledge that some will be harder to reach than others. “Usually, the federated governments, it’s harder to get standardized data. So, in a country like Argentina, which is a federated state with different authorities on different levels, it’s harder to get standardized data than in a republic where there’s one state and no federated government,” says Paz. “But then again, in Chile, we have really great open data and open government and transparency laws, but we don’t have great data journalism.” (Chile is a republic.)

Down the road, they’d also like to provide a secure way for anonymous sources to dump data to the site. Paz says in his experience as a news editor, 20–25 percent of scoops come from anonymous tips. But despite developments like The New Yorker’s recent release of Strongbox, OpenData Latinoamérica is still working out a secure method that doesn’t require downloading Tor, but is more secure than email. Blejman also added that, for now, whatever oversight they have over the quality and accuracy of the original data they’re working with is minimal: “At the end, we cannot control the original sources, and we are just trusting the organizations.”

But more than anything, Paz is excited about seeing the beginnings of the stories they’ll be able to tell. He plans to use documents about public purchases made by Chile’s government to build an app that allows citizens to track what their government is spending money on, and which companies are being awarded those contracts.

Another budding story exemplifies the extent to which Paz has taken to heart Craig Hammer’s emphasis on building demand. In Chile, there is currently a significant outcry from students over the rising cost of education. Protests in favor of free education are ongoing. In response, Paz decided to harness this focus, energy, and frustration into a scrape-a-thon (or #scrapaton) to be held June 29 in Santiago. They will focus on scraping data on the owners of universities, companies that contract with universities, and who owns private and subsidized schools.

“There’s a joke that says if you put five gringos — and I don’t mean gringos in a disrespectful way — if you put five U.S. people in a room, they’re probably going to invent a rocket,” says Paz. “If you put five Chileans in a room, they’re probably going to fight each other. So one of the things — we’re not just building tools, we’re also building ways of working together, and making people trust each other.” Blejman added that he hopes the recent release of a Spanish-language version of the Open Data Handbook (El manual de Open Data) will further facilitate collaboration between hackers in various Latin American countries.

With a project of this size and scope, there are also some ambitious designs around measurement. Paz hopes to track how many stories and projects originate with datasets from OpenData Latinoamérica. Craig Hammer wants to quantify the social good of open data, a project he says is already underway via the World Wide Web Foundation’s collaboration with the Open Data for Development Camp.

“If there is a cognizable and evidentiary link between open data and boosting shared prosperity,” Hammer says, “then I think that would be, in many cases, the catalytic moment for open data, and would enable broad recognition of why it’s important and why it’s a worthwhile investment, and broad diffusion of data literacy would really explode.”

Hammer wants people to take ownership of data and realize it can help inform decisions at all levels, even for individuals and families. Once that advantage is made clear to the majority of the population, he says, the demand will kick in, and all kinds of organizations will feel pressured to share their information.

“There’s this visceral sense that data is important, and that it’s good. There’s recognition that opening information and making it broadly accessible is in and of itself a global public good. But it doesn’t stop there, right? That’s not the end,” he says. “That’s the beginning.”

Photo of Santiago student protesters walking as police fire water cannons and tear gas fills the air, Aug. 8, 2012, by AP/Luis Hidalgo.

