News in a disintegrating reality: Tow’s Jonathan Albright on what to do as things crash around us

“The kinds of things that I often see could literally be stopped by one person. I mean: 4chan trending on Google during the Las Vegas shooting? How that even happened, I have no idea, but I do know that one person could have stopped that.”

By Laura Hazard Owen @laurahazardowen Feb. 28, 2018, 11:43 a.m.

It’s less about what we’re doing on Facebook, and more about what’s being done to us.

Jonathan Albright, the research director at the Tow Center for Digital Journalism and a faculty associate at Harvard’s Berkman Klein Center for Internet and Society, isn’t big on studies that try to track how much fake news people have clicked on, or how many outright hoaxes they recall seeing in their feeds. Instead, his research into activity on the biggest platforms on the Internet — Facebook, YouTube, Instagram, and to a lesser extent, Twitter — situates everyday Internet users inside a kind of trap, one they can’t get out of without a great deal of help from those same platforms, which thus far haven’t been eager to tackle the problem.

It’s shadowy, scary, and difficult to pinpoint. I talked to Albright this week about the work he’s doing, which has come to center around pulling whatever data can be pulled from those platforms (almost always without the participation of those companies, and in the case of Facebook usually only through loopholes), analyzing it, releasing the data publicly, and helping journalists make sense of what it means, and then repeating the process.

“It’s getting worse. Since the 2016 election, I’ve come to the realization (horribly, and it’s very depressing) that nothing has gotten better, despite all the rhetoric, all of the money, all of the PR, all of the research. Nothing has really changed with the platforms,” Albright told me. “We’re basically yelling about Russia right now when our technological and communication infrastructure, the ways that we experience reality, the ways we get news, are literally disintegrating around us.”

It’s all horrible and depressing, but it was still fun to talk with Albright, who speaks energetically and urgently and somehow manages not to be a total downer. Or maybe I was just scrambling to find something positive to pick up on. There are glints of light here, but a lot of them come down to the platforms accepting that they’re media companies and hiring people into totally new roles (Albright’s idea: “platform editor”). We’ll see. Our conversation, lightly condensed and edited for clarity, is below.

Laura Hazard Owen: Can you start out by telling me the kind of work you do for Tow? How’d you get into this research? Is it a main part of your job, or is it something that started as a side hobby and has turned into more? What kind of skills do you need for it, and how’d you get interested in it in the first place?

Jonathan Albright: I’ve been doing this category of work, data analytics and looking at flows of information — especially on social media for trending topics and news events — going back years.

But I got into election-related things when I started looking at this more from a manipulation perspective. I did my Ph.D. on news hashtags. I was really interested in social change, things like the Arab Spring, and response to crisis events, like the Japanese earthquake. I was looking at this in the context of responsibility and using social media as a force for change and good.

But I was noticing through the 2016 [U.S. presidential] election — and this also occurred a little bit in the 2014 midterms — the patterns of how candidates were using social media, especially Trump. Things stood out. Twitter would run surveys during the presidential debates, and the Trump train would go crazy and dominate the other candidates. I saw people acting out in ways that weren’t what I expected. I saw polarizing callouts in News Feed. I saw tweets that got out of control.

I started collecting data in 2016, and once the election result kind of sunk in, I took a day or two and then started to dig into it. I focused on misinformation sites that continually spread unreliable, poorly sourced information or hyperpartisan news. I scraped all of those sites and spent hours and hours putting all the URLs together, pulling a network out of it, and really looking at this from a network perspective. I wanted to get a sense of the scale of this and see how the resources were being connected into from a linking perspective. I’d learned that you can’t just look at one platform or one type of communication channel — they’re all linked together.

What I saw in that first study was that YouTube was basically the center of that universe, especially from a resource perspective. So many sites, domains, tweets, and Facebook pages were linking into YouTube: not just YouTube channels or single videos, but previews in tweets or Facebook pages. It was shocking, and it showed something that hadn’t been brought to the conversation from a data perspective. People like Zeynep Tufekci have been on this for a long time, pushing that there are other platforms [to consider] besides Facebook and Twitter, and so in a sense I’m validating this with…not big data, very focused “medium” data.

Then I went back through that same network and scraped all the ad tech off of it — any script that was loading, any tracker, Facebook scripts, cookies, anything going on in the background between the browsers and the cache.

It showed that it’s not just about content. It’s really important to understand this “fake news” and information problem from a tracking perspective, because the tracking is actually how the people are getting the content. The number of people who navigate from a browser to a fake news story is just shockingly low. The vast majority of traffic on Facebook comes directly from the mobile app, and even among people who use the desktop version of Facebook, mobile dominates. So the studies that look at this as people navigating from a URL to look at a news story of their own volition are completely missing the profiling and microtargeting that are happening.

Microtargeting is not just a buzzword; it exists, and in many ways people are being provoked to act out and share and keep spreading this misinformation and disinformation, as kind of a campaign. It’s part of the trolling. It’s targeting specific segments of audiences, hammering them with certain types of news at different times of day, and giving them a reason to act out.

There’s a whole other side of this with monetization as well, and I think other people are doing good work on that, like Craig Silverman. People are contributing [research] in all sorts of different ways, and hopefully it’s going to lead somewhere and none of it will be for naught.

Owen: Who is doing the targeting?

Albright: It really depends on the platform and the news event. Just the extensiveness of the far right around the election: I can’t talk about that right this second, but I can say that, very recently, what I’ve tended to see from a linking perspective and a network perspective is that the left, and even to some degree center-left news organizations and journalists, are really kind of isolated in their own bubble, whereas the right have very much populated most of the social media resources and use YouTube extensively. This study I did over the weekend shows the depth of the content and how much reach they have. I mean, they’re everywhere; it’s almost ubiquitous. They’re ambient in the media information ecosystem. It’s really interesting from a polarization standpoint as well, because self-identified liberals and self-identified conservatives have different patterns in unfriending people and in not friending people who have the opposite of their ideology.

From those initial maps of the ad tech and hyperlink ecosystem of the election-related partisan news realm, I dove into every platform. For example, I did a huge study on YouTube last year. It led me to almost 80,000 fake videos that were being auto-scripted and batch-uploaded to YouTube. They were all keyword-stuffed. Very few of them had even a small number of views, so what these really were was about impact; these were a gaming system. My guess is that they were meant to skew autocomplete or search suggestions in YouTube. It couldn’t have been about monetization, because the videos had very few views and the sheer volume wouldn’t have made sense with YouTube’s business model.

Someone had set up a script that detected social signals off of Twitter. It would go out and scrape related news articles, pull the text back in, and read it out in a computer voice, a Siri-type voice. It would pull images from Google Images, create a slideshow, package that up and wrap it, upload it to YouTube, hashtag it and load it with keywords. There were so many of these and they were going up so fast that as I was pulling data from the YouTube API dozens more would go up.

These things are just…not really explainable with regular logic.

I worked with The Washington Post on a project where I dug into Twitter and got, for the last week leading up to the election, a more or less complete set of Twitter data for a group of hashtags. I found what were arguably the top five most influential bots through that last week, and we found that the top one was not a completely automated account, it was a person.

The Washington Post’s [Craig Timberg] looked around and actually found this person and contacted him and he agreed to an interview at his house. It was just unbelievable. It turns out that this guy was almost 70, almost blind.

[From Timberg’s piece: “Sobieski’s two accounts…tweet more than 1,000 times a day using ‘schedulers’ that work through stacks of his own pre-written posts in repetitive loops. With retweets and other forms of sharing, these posts reach the feeds of millions of other accounts, including those of such conservative luminaries as Fox News’s Sean Hannity, GOP strategist Karl Rove and Sen. Ted Cruz (R-Tex.), according to researcher Jonathan Albright…’Life isn’t fair,’ Sobieski said with a smile. ‘Twitter in a way is like a meritocracy. You rise to the level of your ability….People who succeed are just the people who work hard.'”]

The most dangerous accounts, the most influential accounts, are often accounts that are supplemented with human input, and also a human identity that’s very strong and possibly already established before the elections come in.

I’ve looked at Twitter, YouTube — I’ve obviously looked at Facebook. I approach this from a different angle than other scholars do; I focus on accountability data. My purpose is to get data that otherwise wouldn’t be available, and to try and repackage it and share it so journalists can use it. These are things that are becoming more difficult to understand and write about quickly, and it’s becoming more difficult for journalists to get the kind of data they need. I’m not talking about the number of likes and retweets. I’m talking about how many people a post reaches.

As everything moves onto platforms and into closed walled gardens and apps, it’s becoming more and more difficult to get any type of data to hold accountable institutions that are media companies (and platforms are media companies, there’s no question). We are losing access to the data that would let us understand what is going on and report this to the public.

Owen: A lot of the fake news research that I’ve seen focuses on Twitter. You are not a big fan of that research. So why do we see so much about Twitter and so much less about Facebook?

Albright: Twitter is a huge decoy; it’s completely placed out of context in studies and taken far too seriously by some people doing the reporting. Twitter doesn’t have that many regular monthly active users. Everyone is on Instagram and Facebook.

One of the good things about Twitter, though, and there are a lot of good things about them, is that they have always been very open. Yes, they define the metrics, and yes, people are angry about bot signals and want more data on automated accounts. But Twitter has been very good about providing some kind of accountability.

Facebook basically went the opposite direction. Facebook has never really been open, but you used to be able to capture something like public hashtag data — you could pull Facebook data from people who’d left their privacy open or public and had posted with a hashtag. Facebook closed that and now only shares it with select marketing partners like Samsung. They also shut off access to the social graph. Three years ago, in my classes, I used to have my students pull their own social network graph. That’s no longer possible.

Instagram did the exact same thing. You used to be able to pull hashtags and things like networks from Instagram, but now the only easy data to pull from Instagram is GPS, so if an Instagram post is geotagged, you can pull a post in a perimeter. You can kind of pull hashtags. Other than that, you’d have to build a system from the ground up, and even then, every step you take, you could be violating their Terms of Service and get kicked off.

Twitter and YouTube are some of the only platforms that let you get large amounts of data. YouTube is actually fairly open for now. Every time a study like mine comes out, though, they’re going to consider closing that loophole.

As the access closes, the ability for us to study these platforms — their effects on society, their impact on elections — becomes smaller and smaller.

It’s really concerning. This is the zeitgeist of where politics, news, digital technologies, and algorithms all converge. It’s a huge problem when we can’t even go back and reconstruct. We don’t have the data to make sense of what’s happening and what’s taken place so we can prepare or organize or come up with realistic solutions.

Owen: Are these companies aware of your research?

Albright: They’re hyper-aware. When I put out research in October from an accidental Facebook cache that Facebook had either forgotten or that had been stopped with some kind of order or subpoena that we don’t know about, the CEO of CrowdTangle and one of the leads of their advertising team called me the next day.

Owen: But they weren’t calling you to be like: Wow, this research is really alarming!

Albright: No. They’re calling because it’s a liability for their PR and for their shareholders.

There are clearly amazing, very concerned people working at Facebook. A lot of people work at Facebook specifically for that reason: they think they can affect the world in positive ways and build new tools to enrich people’s lives. But the problem is that with the size and scale and sheer dominance of Facebook as a for-profit corporation, it’s getting to the point where it’s becoming impossible to affect it. [These platforms] are no longer startups that can shift direction.

Often, these companies are open to research partnerships and things, but it’s always on their terms. If you do research with them, you’re dealing with IP issues, you’re signing over the rights to the research. It has to be reviewed completely and vetted by their legal process. They often handpick researchers that help them and help their purpose and help their cause — they maybe throw in some sprinkles of criticism. I understand why they would be hesitant to want to work with people like me.

Owen: Okay, sorry to mention Russia, but how much of this is, like, a Russia problem, and how much of this is coming from inside our country?

Albright: Frankly, I don’t know the answer. Whatever or whoever is behind some of this, though, I’ve chosen to focus on the larger problem, which is the fact that these algorithms, the business model, and the monetization encourage the production and promotion and spread of disinformation.

I mean, I do hold that it’s not okay to come in and try to influence someone’s election; when I look at these YouTube videos, I think: Someone has to be funding this. In the case of the YouTube research, though, I looked at this more from a systems/politics perspective.

We have a problem that’s greater than the one-off abuse of technologies to manipulate elections. This thing is parasitic. It’s growing in size. The last week and a half are some of the worst things I’ve ever seen, just in terms of the trending. YouTube is having to manually go in and take these videos out. YouTube’s search suggestions, especially in the context of fact-checking, are completely counter-productive. I think Russia is a side effect of our larger problems.

Owen: What do the platforms individually need to be doing, and are there things that they all need to be doing? Or should we just burn YouTube down completely?

Albright: YouTube has no competition, right? None. YouTube, in its space, is a monopoly. Dailymotion is tiny, Vimeo is niche. The fact that no one has come up to challenge YouTube is bizarre, but it’s probably because they can’t afford to fend off copyright claims.

We’re being held in the dark data-wise, but equally problematic is that we’re not able to understand how things are being promoted and how they’re reaching people because of algorithms. Everything is an algorithm on top of an algorithm. The search function that I used to pull the videos is an algorithm, and you have a little bit of profiling involved in that. The recommendations are an algorithm, so everything is proprietary and highly secret, because if someone ever found the exact formulas they were using, they could instantly game it. If opaque algorithms continue to exist as a business model, we’re always gonna be chasing effects.

Maybe there needs to be a job called, like, Platform Editor, where someone works to not only stop manipulation but also works across the security team and the content team and in between the different business verticals to ensure the quality and integrity of the platform. That’s a lot of responsibility, but the kinds of things that I often see could literally be stopped by one person. I mean: 4chan trending on Google during the Las Vegas shooting? How that even happened, I have no idea, but I do know that one person could have stopped that. And I do know that a group of people working together — even if it involves deliberation, even if they don’t agree on one specific thing — can often solve problems that appear or are starting to surface because of automation. And I don’t mean, like, contract moderators from India — I mean high-level people. The companies need to invest in human capital as well as technological capital, but that doesn’t align with their business model. The rhetoric exists in their public statements, but we can clearly see that how it’s being implemented isn’t working.

It’s getting worse. Since the 2016 election, I’ve come to the realization — horribly, and it’s very depressing — that nothing has gotten better, despite all the rhetoric, all of the money, all of the PR, all of the research. Since nothing has really changed with the platforms, we can scream about Russia as the structure of our information decays around us. Our technological and communication infrastructure, the ways that we experience reality, the ways we get news, are literally disintegrating.

Owen: Why is it getting worse?

Albright: There are more people online, they’re spending more time online, there’s more content, people are becoming more polarized, algorithms are getting better, the amount of data that platforms have is increasing over time.

I think one of the biggest things that’s missing from political science research is that it usually doesn’t consider the amount of time that people spend online. Between the 2012 election and the 2016 election, smartphone use went up by more than 25 percent. Many people spend all of their waking time somehow connected.

This is where psychology really needs to come in. There’s been very little psychology work done looking at this from an engagement perspective, looking at the effect of seeing things in the News Feed but not clicking out. Very few people actually click out of Facebook. We really need social psychology, we really need humanities work to come in and pick up the really important pieces. What are the effects of someone seeing vile or conspiracy news headlines in their News Feed from their friends all day?

Owen: This is so depressing.

Albright: Sorry.

Owen: No, I mean, I already knew it was a problem, it’s just…

Albright: It’s a huge problem. It’s the biggest problem ever, in my opinion, especially for American culture. Maybe it’s less of a problem for other countries and cultures, but the way our country works is just really susceptible to this. Those Russian statements about how Americans are impressionable and they’re easy to manipulate are largely true. It’s not because Americans are stupid, but because there’s been no effort to get ahead of the curve in terms of technological policy or privacy laws. There’s no protection for Americans or researchers right now. We’re fighting everything.

Snapshots of the network generated from 9,000 “crisis actor” YouTube videos, by Jonathan Albright.

Laura Hazard Owen is the editor of Nieman Lab. You can reach her via email (laura_owen@harvard.edu) or Twitter DM (@laurahazardowen).
