Nieman Foundation at Harvard
Oct. 18, 2012, 3:10 p.m.
Reporting & Production

ProPublica’s Message Machine is figuring out what the Obama campaign knows about me

Crowdsourcing and computer intelligence help reverse-engineer campaign targeting.

Just how much do presidential campaigns know about the voters they’re trying to court? The short answer: More than ever.

But figuring out precisely how the campaigns use your personal data to target you individually is far more complicated, mostly because campaigns won’t say.

So the data junkies at ProPublica set out to reverse-engineer presidential campaign messaging themselves. Here’s how their Message Machine works:

First, ProPublica asked people to fill out a brief demographic profile for ProPublica’s eyes only. (How old are you? How much money do you make? Are you married? Are you Hispanic? What’s your highest level of education? And so on.)

Then, participants simply had to forward to ProPublica every email they received from the Romney or Obama campaign (as well as from high-profile groups like the Democratic Senatorial Campaign Committee). ProPublica’s Message Machine processes the emails using computer-science techniques from the fields of natural language processing and machine learning. (For a more granular look at how ProPublica used these models, check out ProPublica’s blog post.) Based on how the machine clusters emails, the human team at ProPublica can then begin its analysis.
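To make the clustering step concrete, here is a minimal sketch of one simple way to group emails that are variations on the same message. It is not ProPublica's method, which this article does not describe in detail; it swaps in a basic technique (Jaccard similarity over three-word shingles, with greedy grouping). The function names, the 0.5 threshold, and the sample emails in the usage note are all assumptions made for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in a lowercased email body."""
    words = re.findall(r"[a-z0-9$']+", text.lower())
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity between two shingle sets (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_emails(emails, threshold=0.5):
    """Greedily group emails by shingle overlap with each cluster's first member.

    Hypothetical helper: returns a list of clusters, each a list of
    indices into `emails`. The threshold is an illustrative assumption.
    """
    clusters = []
    sets = [shingles(e) for e in emails]
    for i, s in enumerate(sets):
        for cluster in clusters:
            if jaccard(s, sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append([i])
    return clusters
```

Given two made-up fundraising emails that differ only in the dollar amount and a third about volunteering, this sketch would place the first two in one cluster and the third in its own. A human analyst could then compare who received each variant within a cluster.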

ProPublica has already shared some real-time results on emails received. So, for example, you can see that the Message Machine has processed 1,429 different emails from Obama for America since March. (You can also mouse over a graph that breaks down those emails by subject line, to see how many Message Machine participants received a given variation on any campaign message.)
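The subject-line breakdown described above amounts to counting distinct recipients per variation. A minimal sketch, assuming each forwarded email has been reduced to a (participant ID, subject line) pair; that input shape and the function name are hypothetical, not taken from ProPublica's code:

```python
from collections import defaultdict


def count_subject_variants(forwarded):
    """Map each subject line to the number of distinct participants who received it.

    `forwarded` is an iterable of (participant_id, subject) pairs; this
    shape is an assumption made for the example.
    """
    recipients = defaultdict(set)
    for participant_id, subject in forwarded:
        # Strip stray whitespace so trivially different copies count together.
        recipients[subject.strip()].add(participant_id)
    return {subject: len(ids) for subject, ids in recipients.items()}
```

Counting participants with a set, rather than counting raw emails, keeps one person who forwards the same message twice from inflating a variation's total.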

But now ProPublica is unveiling its findings from more complex Message Machine analysis. Today, the site launched individual profiles for participants, so they can begin to see just how campaigns are targeting them.

For now, the site is most confident about what it has learned from Obama for America emails. (The team is actually using a method that the Obama campaign’s head of micro-targeting used, says ProPublica’s Jeff Larson.) The Message Machine team isn’t sure why its Obama findings are stronger. Maybe fewer participants are getting Romney emails in the first place; maybe Romney’s campaign is actually sending out fewer emails; or maybe his campaign is relying more on A/B testing than sophisticated micro-targeting.

Since June, I’ve forwarded more than 300 emails I received from the Obama and Romney campaigns. Here’s what the Message Machine can tell me about how the Obama campaign has targeted me so far:

  • Obama’s campaign has asked me for an average of about $15
  • It has made decisions to send me some emails — or not send them to me — based on my age
  • My ZIP code is a factor to the campaign
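A figure like the average ask in the list above could be derived roughly as follows. This is a sketch under stated assumptions, not how the Message Machine actually computes it: it treats the first dollar figure in each email body as that email's ask and ignores emails that mention no amount. The regular expression and function names are illustrative.

```python
import re
from statistics import mean

# Matches figures such as $15, $1,000, or $5.00 and captures the digits.
DOLLAR_RE = re.compile(r"\$(\d+(?:,\d{3})*(?:\.\d{2})?)")


def ask_amounts(body):
    """Return every dollar figure mentioned in one email body, as floats."""
    return [float(m.replace(",", "")) for m in DOLLAR_RE.findall(body)]


def average_ask(bodies):
    """Average the first dollar figure in each email that mentions one.

    Returns None when no email contains a dollar figure. Treating the
    first figure as the ask is an assumption made for this sketch.
    """
    firsts = [amounts[0] for amounts in map(ask_amounts, bodies) if amounts]
    return mean(firsts) if firsts else None
```

A real pipeline would need to separate the requested donation from other dollar figures, such as fundraising totals quoted in the text, which this sketch does not attempt.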

My personal Message Machine profile also tracks a list of every email I’ve submitted, which I can then compare with different versions of that email that were designed to target other voters.

ProPublica is still in the very early stages of individual analysis, but it has been able to draw some broader conclusions. For one thing, age matters to a campaign. ProPublica finds, for example, that younger voters were more likely to be targeted for campaign contests that involved flying a winner somewhere.

“The other thing we’re finding that’s very, very important is actually where they live — whether they live in a competitive state,” Larson said. “Campaigns will send out emails to folks in noncompetitive states asking them if they want to go to, say, Pennsylvania and campaign.”

Another key finding had to do with donations. Whether a person has donated in the past is one of the main ways the Obama campaign partitions its email lists. “If you’ve donated $3 in the past, they’ll quickly bump you up to a much bigger donation group, sometimes as high as $500 they’ll ask for,” Larson said. “If you were a supporter in 2008, and you donated a whole bunch, they’ll ask you to donate the same amount.”

Because people with higher household incomes tend to donate more, ProPublica found it hard to determine whether high-income voters were targeted because of their income or because of their past donations. “Because there’s that correlation, it’s hard to say whether the campaign knows how much you make versus how much you’re willing to give,” Larson said.

Another uncertainty had to do with gender-related targeting. ProPublica doesn’t yet have enough data to definitively say that gender is a targeting factor. Even though the Message Machine has registered tens of thousands of emails from about 700 participants, the data set is “very sparse,” Larson says. Part of the challenge is that ProPublica needs emails that show people who are “wildly disparate in the way the campaign looks at them.” (On the flip side, when the Message Machine receives a huge number of the same emails, there’s not much to learn.)

“Emails they tend to send out almost to their whole entire lists, they’re not so specific in interest groups or what the campaign knows about folks,” Larson said. “That’s the biggest challenge, and the reason it’s taken us so long to do this.”

Larson says he and his ProPublica colleagues are holding out hope that people will forward enough emails to the Message Machine between now and election day so that they might glean more information about how campaigns are targeting voters.

“It’s still early for us, even though it’s late in the political season,” Larson said. “Hopefully we get more in the next 20 days. These models get better with more data.”

Photo by Michael Connell used under a Creative Commons license.
