Nieman Foundation at Harvard
March 14, 2013, 2:15 p.m.

Last week, we wrote about the latest round of Knight Prototype Fund grantees. One of them was Open Gender Tracking, a team working out the best way to automatically analyze gender bias in news outlets, both in who is writing the news and in who the news is being written about. This week, Source gets into the details of how the team built the GenderTracker and how it works.

Under the hood, OGT is a Ruby application (we use JRuby) that schedules batch jobs to process data from news APIs and static files. Jobs are queued using Redis. Text processing is done with Apache OpenNLP, but it really could be anything. Jobs don’t even have to be written in Ruby.
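To make the queueing idea concrete, here is a minimal sketch in Python rather than the project's Ruby. It is not OGT's code. It assumes jobs are serialized as JSON, which is one way a queue can stay language-neutral, and it uses an in-memory deque as a stand-in for a Redis list. The job type `count_pronouns` and its pronoun-counting logic are hypothetical placeholders for the real OpenNLP processing step.

```python
import json
from collections import deque

# Stand-in for a Redis list (assumption): a real deployment would push to
# and pop from a Redis key instead of this in-memory deque.
queue = deque()


def enqueue(job_type, payload):
    """Serialize a job as JSON so a worker written in any language could read it."""
    queue.append(json.dumps({"type": job_type, "payload": payload}))


def count_pronouns(payload):
    """Hypothetical text-processing step: count gendered pronouns in an article."""
    words = [w.strip(".,;:!?\"'").lower() for w in payload["text"].split()]
    female = sum(w in {"she", "her", "hers"} for w in words)
    male = sum(w in {"he", "him", "his"} for w in words)
    return {"id": payload["id"], "female": female, "male": male}


# Maps a job type name to the function that handles it.
HANDLERS = {"count_pronouns": count_pronouns}


def run_worker():
    """Drain the queue, dispatching each job to its handler."""
    results = []
    while queue:
        job = json.loads(queue.popleft())
        results.append(HANDLERS[job["type"]](job["payload"]))
    return results


enqueue("count_pronouns", {"id": 1, "text": "She said he gave her the report."})
results = run_worker()
print(results)
```

Because each job is just a JSON string naming a job type, the producer and the worker only need to agree on that format, not on a programming language.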

Outputs can be incredibly diverse. For our Global Voices study, we imported Open Gender Tracker into the R statistical software. The Boston Globe project is more interactive, storing results in MongoDB and serving them out to a Backbone.js app that visualizes the results.

Our new Global Name Data repository is another exciting part of this project. We have collected names from the US, UK, and Ireland and are hoping to add Chinese names from my colleague Huan Sun’s research on gender on Sina Weibo. We’re actively looking for name gender datasets from other cultures and would love to add more to this growing list.
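As a rough illustration of how a name dataset like this might be used, the Python sketch below guesses a gender from a first name. It is an assumption about usage, not the project's method: the counts in `NAME_COUNTS` are invented for the example and do not come from the Global Name Data repository, and the function name `guess_gender` and the 0.9 threshold are hypothetical.

```python
# Invented illustrative counts (assumption); NOT taken from the
# Global Name Data repository.
NAME_COUNTS = {
    "mary": {"female": 990, "male": 10},
    "john": {"female": 5, "male": 995},
    "jordan": {"female": 450, "male": 550},
}


def guess_gender(full_name, threshold=0.9):
    """Return 'female', 'male', or 'unknown' based on first-name counts.

    A name is labeled only when at least `threshold` of its recorded
    uses belong to one gender; otherwise it is left as 'unknown'.
    """
    parts = full_name.strip().split()
    if not parts:
        return "unknown"
    counts = NAME_COUNTS.get(parts[0].lower())
    if counts is None:
        return "unknown"
    total = counts["female"] + counts["male"]
    if total == 0:
        return "unknown"
    female_share = counts["female"] / total
    if female_share >= threshold:
        return "female"
    if female_share <= 1 - threshold:
        return "male"
    return "unknown"


for byline in ["Mary Smith", "John Doe", "Jordan Lee", "Huan Sun"]:
    print(byline, "->", guess_gender(byline))
```

Names that are missing from the table or that are used by both genders come back as "unknown" rather than being forced into a category, which is why coverage of more cultures matters for a tool like this.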
