May 3, 2018, 9:34 a.m.
Audience & Social

Facebook and YouTube just got more transparent. What do we see?

Transparency, even in its candor, is a performance, leaving as much unseen as seen.

Social media platforms have been notoriously opaque about how they work. But something may have shifted.

Last week, several social media platforms took significant steps toward greater transparency, particularly around content moderation and data privacy. Facebook published a major revision of its Community Standards, the rules that govern what users are prohibited from posting on the platform. The changes are dramatic, not because the rules shifted much but because Facebook has now spelled out those rules in much, much more detail.

YouTube released its latest transparency report, and for the first time included data on how it handles content moderation, not just government takedown requests. And dozens of platforms alerted their users to updates to their privacy policies this week, in anticipation of Europe’s General Data Protection Regulation (GDPR), which goes into effect May 25.

What can we learn from these gestures of transparency? And what do they mean for the problem of content moderation? I, like many others, have been calling for social media platforms to be more transparent about how content moderation works. So the published internal rules from Facebook and the expanded transparency report from YouTube should be commended. From one vantage point, Facebook’s new guidelines are the next logical step in content moderation on social media platforms. Its rules about sexually explicit material, harassment, real names, and self-harm are already in place; now we need to get down to exactly how to impose them effectively and fairly.

Most of the major platforms have been publishing transparency reports for years, but all have focused exclusively on content takedown requests from governments and corporations; YouTube’s report appears to be the first time that a major platform has systematically reported where flags come from and how they’re responded to, and the company is promising more flagging data in future reports.

But transparency, even in its candor, is a performance, leaving as much unseen as seen. At the same time, the performance itself can be revealing of the larger situation in which we find ourselves. A closer look at Facebook’s new Community Standards and YouTube’s new data reveals more about how content moderation is done, and how committed we have become to this approach to moderation as a project.

Every traffic light is a tombstone

Different platforms articulate their rules in different ways. But all have some statement that offers, more plainly than the legalese of a “Terms of Service,” what that platform expects of users and what it prohibits. Explaining the rules is just one small part of platform moderation. Few users read these Community Standards; many don’t even know they exist. And the rules as stated may or may not have a close correlation with how they’re actually enforced. Still, how they are articulated is of enormous importance. Articulating the rules is the clearest opportunity for a platform to justify its moderation efforts as legitimate. The community guidelines are less an instruction manual than a constitution.

Last week, Facebook spelled out its rules in blunt and sometimes unnerving detail. While it already prohibited “explicit images of sexual intercourse,” now it defines its terms: “mouth or genitals entering or in contact with another person’s genitals or anus, where at least one person’s genitals are nude.” Prohibited sexual fetishes now include “acts that are likely to lead to the death of a person or animal; dismemberment; cannibalism; feces, urine, spit, menstruation, or vomit.”

While some of these specific rules may be theoretical, most are here because Facebook has already encountered and had to remove this kind of content, usually thousands of times. This document is important as a historical compendium of the exceedingly horrifying ends to which some users put social media: “Dehumanizing speech including (but not limited to) reference or comparison to filth, bacteria, disease, or feces…” “Videos that show child abuse, [including] tossing, rotating, or shaking of an infant (too young to stand) by their wrists/ankles, arms/legs, or neck…” “organizations responsible for any of the following: prostitution of others, forced/bonded labor, slavery, or the removal of organs.” Facebook’s new rules are the collected slag heap beneath the shiny promise of Web 2.0.

Flagging is no longer what it used to be

Most platforms turn largely or exclusively to their user base to help identify offensive content and behavior. This usually means a “flagging” mechanism that allows users to alert the platform to objectionable content. Using the users is convenient because it divides this enormous task among many, and puts the task of identifying offensive content right at the point when someone comes into contact with it. Relying on the community grants the platform legitimacy and cover. The flagging mechanism itself clearly signals that the platform is listening to its users and providing avenues for them to express offense or seek help when they’re being harmed.

When YouTube added a flagging mechanism to its videos back in 2005, it was a substantive change to the site. Before allowing users to “flag as inappropriate,” YouTube had only a generic “contact us” email link in the footer of the site. Today, enlisting the crowd to police itself is commonplace across social media platforms and, more broadly, the management of public information resources. It is increasingly seen as a necessary element of platforms, both by regulators who want platforms to be more responsive and by platform managers hoping to avoid stricter regulations.

Flagging has expanded as part of the vocabulary of online interfaces, beyond alerting a platform to offense: platforms let you flag users who you fear are suicidal, or flag news or commentary that peddles falsehoods. What users are being asked to police, and the responsibility attached, is expanding.
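To make that expanding vocabulary concrete, here is a minimal, purely hypothetical sketch of what a flag record might look like inside such a system. The field names and reason categories are illustrative assumptions, not any platform’s actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum, auto


class FlagReason(Enum):
    """Illustrative reasons a user might flag content; the list keeps growing."""
    SEXUAL_CONTENT = auto()
    HARASSMENT = auto()
    HATE_SPEECH = auto()
    SELF_HARM_CONCERN = auto()  # flagging a user you fear is suicidal
    MISINFORMATION = auto()     # flagging news or commentary as false


@dataclass
class Flag:
    """A single user report, queued for review by humans or software."""
    content_id: str
    reporter_id: str
    reason: FlagReason
    created_at: datetime
    note: str = ""              # optional free-text context from the reporter


# Example: a user reports a video as harassment.
report = Flag(
    content_id="video/123",
    reporter_id="user/456",
    reason=FlagReason.HARASSMENT,
    created_at=datetime.now(timezone.utc),
)
```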

On the other hand, flagging is voluntary — which means that the users who deputize themselves to flag content are those most motivated to do so. Platforms often describe flagging as an expression of the community. But are the users who flag representative of the larger user base, and what are the ramifications for the legitimacy of the system if they’re not? Who flags, and why, is hard to know.

YouTube’s latest transparency report tells us a great deal about how much user flags now matter to its content moderation process — and the answer is: not much. Clearly, automated software designed to detect possible violations and “flag” them for review does the majority of the work. In the three-month period between October and December 2017, 8.2 million videos were removed; 80 percent of those removed were flagged by software, 13 percent by trusted flaggers, and only 4 percent by regular users. Strikingly, 75 percent of the videos removed were gone before they’d been viewed even once, which means they simply could not have been flagged by a user.
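Taken at face value, those shares translate into rough counts like the following. This is a back-of-the-envelope sketch, assuming the reported percentages apply straightforwardly to the 8.2 million removals; the small remainder would come from reporting channels not broken out here.

```python
# Rough arithmetic on YouTube's October-December 2017 figures quoted above.
# Assumption: the percentage shares apply directly to the 8.2M total removals.
total_removed = 8_200_000

shares = {
    "automated flagging": 0.80,
    "trusted flaggers": 0.13,
    "regular users": 0.04,
}

for source, share in shares.items():
    print(f"{source}: ~{total_removed * share:,.0f} videos removed")

# ~75 percent of removed videos had not been viewed even once,
# so they could not have been flagged by an ordinary viewer.
print(f"removed before a single view: ~{total_removed * 0.75:,.0f}")
```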

On the other hand, according to this data, YouTube received 9.3 million flags in the same three months, 94 percent from regular users. But those flags led to very few removals. In the report, YouTube is diplomatic about the value of these flags: “user flags are critical to identifying some violative content that needs to be removed, but users also flag lots of benign content, which is why trained reviewers and systems are critical to ensure we only act on videos that violate our policies.”

“Critical” here seems generous. Though more data might clarify the story (how many automated flags did not lead to removals?), it seems reasonable to suggest that flags from users are an extremely noisy resource. It would be tempting to say that they are of little value other than public relations — letting victims of harassment know they are being heard, respecting the community’s input — but it might be worth noting the additional value of these flags: as training data for those automated software tools. Yet, if user flags are so comparatively inaccurate, it may be that the contributions of the trusted flaggers are weighted more heavily in this training.
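To see just how noisy, here is a rough estimate based only on the figures above. It is a heavily simplified sketch: it assumes one flag per removed video, that flags and removals fall in the same reporting window, and that the percentage shares apply cleanly, none of which the report guarantees.

```python
# Back-of-the-envelope estimate of how often a regular user's flag leads to
# a removal, using only the figures quoted above.
# Simplifying assumptions: one flag per removed video, flags and removals
# fall in the same quarter, and the percentage shares apply cleanly.
total_removed = 8_200_000
total_flags = 9_300_000

removals_from_user_flags = total_removed * 0.04   # roughly 330,000 videos
flags_from_regular_users = total_flags * 0.94     # roughly 8,700,000 flags

hit_rate = removals_from_user_flags / flags_from_regular_users
print(f"rough share of user flags that led to a removal: {hit_rate:.1%}")
# Prints roughly 4%, which is why "critical" reads as generous.
```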

Gestures of transparency in the face of criticism

Facebook and YouTube are responding to the growing calls for social media platforms to take greater responsibility for how their systems work — from moderation to data collection to advertising. But Facebook’s rule change may be a response to a more specific critique: that while Facebook has one set of rules for the public, it seemed to have a different set of rules for use internally: the rules its policy team used to judge hard cases, the rules it used to train tens of thousands of human moderators, and the rules it programmed into its AI detection algorithms. This criticism became most pointed when training documents were leaked to The Guardian in 2017 — documents Facebook used to instruct remote content moderation teams and third-party crowdworkers on how to draw the line between, say, a harsh sentiment and a racist screed, between a visible wound and a gruesome one, between reporting on a terrorist strike and celebrating it.

Facebook is describing these new rules as its “internal guidelines.” This is meant to suggest a couple of things. First, we’re being encouraged to believe that this document, in this form, existed behind the scenes all along, standing behind the previous Community Standards, which were written in a more generalized language for the benefit of users. Second, we’re supposed to take the publication of these internal rules as a gesture of transparency: “all right, we’ll show you exactly how we moderate, no more games.” And third, it implies that, going forward, there will be no gap between what the posted rules say and what Facebook moderators do.

The suggestion that these are its internal guidelines may be strictly true, if “internal” means the content policy team at Facebook corporate headquarters in Menlo Park. It makes sense that, behind the broad standards written for users, there were more fully spelled-out versions being used for actual moderation decisions. It is of course hard to know whether this document was already sitting there as some internal moderation bible and Facebook’s team merely decided to publish it, or whether it was newly crafted for the purpose of performing transparency, or whether (more likely) it was assembled out of an existing tangle of rules, guidelines, tip sheets, definitions, consulting documents, and policy drafts that were combined for publication.

However, if the 2017 Guardian documents are any indication, what Facebook gave its larger labor force of content moderators was much more than just detailed rules. Those materials included examples, tests, and hard cases, meant to guide reviewers on how to think about the rules and how to apply them. The challenge of moderation at this scale is not simply to locate the violations and act on them. It is also how to train hundreds or even thousands of people to make these tricky distinctions in the same way, over and over again, across a shocking array of unexpected variations and contexts.

Those examples and test cases are extremely important in helping to calibrate the review process. They are also where content moderation can go horribly wrong; ProPublica’s investigation into Facebook’s moderation of hate speech revealed that, even across just a handful of examples, reviewers were profoundly inconsistent, and Facebook was often unable to explain why specific decisions had been made.
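One way to quantify the kind of inconsistency ProPublica surfaced is a simple agreement measure across reviewers judging the same test cases. The sketch below is a generic illustration with made-up labels, not Facebook’s actual calibration method.

```python
from itertools import combinations

# Hypothetical decisions ("remove" / "keep") by three reviewers on the same
# five test posts -- illustrative data, not drawn from any real audit.
decisions = {
    "reviewer_a": ["remove", "keep", "remove", "keep", "remove"],
    "reviewer_b": ["remove", "remove", "keep", "keep", "remove"],
    "reviewer_c": ["keep", "remove", "remove", "keep", "remove"],
}

def average_pairwise_agreement(labels: dict[str, list[str]]) -> float:
    """Average share of cases on which each pair of reviewers agrees."""
    pairs = list(combinations(labels.values(), 2))
    per_pair = [sum(a == b for a, b in zip(x, y)) / len(x) for x, y in pairs]
    return sum(per_pair) / len(per_pair)

print(f"average pairwise agreement: {average_pairwise_agreement(decisions):.0%}")
# Low agreement on even a handful of cases is a sign that the review
# process is poorly calibrated.
```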

The same approach, only more so

We are in a supremely weird moment.

Legible in Facebook’s community guidelines are the immense challenges involved in overseeing massive, global social media platforms. They are scarred by the controversies that each platform has faced, and the bumpy road that all social media have traveled together over the past decade. They reveal how social media platform administrators try to make sense of and assert their authority over users in the first place.

Apparent in the YouTube transparency report is a reminder of how, even as platforms promise to be responsive to the needs of their users, the mechanics of content moderation are moving away from users — done more on their behalf than at their behest.

And both make clear the central contradiction of moderation that platform creators must attempt to reconcile, but never quite can: If social media platforms were ever intended to embody the freedom of the web, then constraints of any kind run counter to these ideals, and moderation must be constantly disavowed. But if platforms are supposed to promise anything better than the chaos of the open web, then oversight and prohibition are central to that promise.

More transparency is nearly always good, and it’s certainly good in this instance. (Facebook has also promised to expand users’ ability to appeal moderation decisions they disagree with, and that’s also good.) But even as Facebook becomes more open about its moderation, it is deepening its commitment to the same underlying logic of content moderation that platforms have embraced for a decade: a reactive, customer-service approach where the power to judge remains almost exclusively in the hands of the platforms. Even if these guidelines are now 8,000 words long and spell out the rules in much more honest detail, they are still Facebook’s rules, written in the way it chooses; it is Facebook’s judgment when they apply, its decision what the penalty should be, its appeals process.

Mark Zuckerberg himself has said that he feels deeply ambivalent about this approach. In a March interview with Recode, he said:

I feel fundamentally uncomfortable sitting here in California at an office, making content policy decisions for people around the world…things like where is the line on hate speech? I mean, who chose me to be the person that?…I have to, because [I lead Facebook], but I’d rather not.

I share his sense of discomfort, as do millions of others. Zuckerberg’s team in Menlo Park may have just offered us much more transparency about how it defines hate speech, plus a more robust appeals process and a promise to be more responsive to change. But they’re still sitting there “in California at an office, making content policy decisions for people around the world.”

The truth is, we wish platforms could moderate away the offensive and the cruel. We wish they could answer these hard questions for us and let us get on with the fun of sharing jokes, talking politics, and keeping up with those we care about. As users, we demand that they moderate, and that they not moderate too much. But as Roger Silverstone noted, “The media are too important to be left to the media.” But then, to what authority can we even turn? As citizens, perhaps we must begin to be that authority, be the custodians of the custodians.

Perhaps Facebook’s responsibility to the public includes sharing that responsibility with the public—not just the labor, but the judgment. I don’t just mean letting users flag content, which YouTube’s data suggests is both minimal and ineffective on its own. I mean finding ways to craft the rules together, set the priorities together, and judge the hard cases together. Participation comes with its own form of responsibility. We must demand that social media share the tools to govern collectively, not just explain the rules to us.

Tarleton Gillespie is principal researcher at Microsoft Research New England, a member of the Social Media Collective, and an adjunct associate professor in the Department of Communication and Department of Information Science at Cornell University.

A couple of paragraphs of this piece are drawn from his forthcoming book, Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions that Shape Social Media, to be published by Yale University Press.

Photo of a transparent Lego by WRme2 used under a Creative Commons license.
