The primary source in the age of mechanical multiplication

“Many stories lie dormant in the vast amounts of data produced by everyday consumers because journalists are still only starting to acquire the large-scale data-wrangling expertise needed to tap them.”

Historians and journalists alike have long prized one source of information above all others: the on-the-record, primary source. They scour the attics for diaries and journals and fly across the country for interviews. But now we have a glut of documentation.

Thanks to social media, millions of people go on the record, publicly, every single day. People sent billions of tweets, Facebook posts, and WhatsApp messages last year. They have expressed wonder 😮, anger 😤, and love ❤️ for social and political issues.

At present, most journalists treat social sources like they would any other — individual anecdotes and single points of contact. But to do so with a handful of tweets and Instagram posts is to ignore the potential of hundreds of millions of others.

Many stories lie dormant in the vast amounts of data produced by everyday consumers because journalists are still only starting to acquire the large-scale data-wrangling expertise needed to tap them. As more and more people conduct their lives online, and as smartphones penetrate previously unconnected regions around the world, this trove of stories is only growing larger.

The kinds of stories journalists can tell using this data are wide ranging. We can reconstruct online encounters in ways more precise than a source may recall from memory. Le Monde, for instance, retraced the journey of Syrian refugees through their WhatsApp messages. At Al Jazeera America, we analyzed and chronicled the evolution of a Hong Kong pro-democracy movement in Facebook chatrooms.

Journalists can try to gain insight into people’s personality and character or hold powerful people accountable. Data scientist David Robinson did a sentiment analysis of Donald Trump’s tweets and found that the tweets Trump wrote himself are much more negative than those his campaign staff posted. My colleague Charlie Warzel and I looked at the links Trump tweeted to explore the news he chooses to circulate, as a proxy for the news he may consume.

Journalists can examine the ways in which technology disadvantages groups by looking at social data. ProPublica’s Julia Angwin and Terry Parris Jr. bought a Facebook ad and established that the social media company allows advertisers to exclude customers based on race, while Vox’s Alvin Chang expanded on ProPublica’s analysis by looking at whether Facebook’s algorithm excludes already disadvantaged populations from being offered opportunities that their more affluent counterparts receive.

When journalists venture into this kind of story mining, I hope that they also continue to discuss the ethics surrounding it, the blurred lines between what is considered public and what is considered private, and the caveats that come with each dataset.

Last but not least, I hope that journalists will dig into social data to gain insights into whom they reach and, perhaps more importantly, whom they do not reach. People live in filtered worlds in which algorithms serve them information that tends to affirm rather than question their political views.

Maybe social data can allow us to understand these bubbles better in an effort to pierce them.

Lam Thuy Vo is a fellow in BuzzFeed’s Open Lab for Journalism, Technology, and the Arts.
