Nov. 20, 2018, 9:01 a.m.
Audience & Social
LINK: datasociety.net  ➚   |   Posted by: Laura Hazard Owen   |   November 20, 2018

In a week when you might be discussing turkey preferences (free-range heirloom vs. supermarket Butterball), this new Data & Society paper feels appropriate: Robyn Caplan takes a look at approaches to content moderation, examining how 10 different platforms — from Facebook to Vimeo — handle a flow of user content.

Caplan breaks the policies into three categories (which have also been discussed by Tarleton Gillespie in his book Custodians of the Internet). She spoke with representatives from the companies about their approaches and also threaded in remarks that company executives made at the Content Moderation and Removal at Scale conferences earlier this year. Here are some notes about each model.

Artisanal

— “Content moderation is done manually in-house by employees with limited use of automated technologies. Content moderation policy development and enforcement tend to happen in the same space. Platforms such as Vimeo, Medium, Patreon, or Discord operate in this way.”

— These moderation teams are super small and made up of full-time, in-house employees rather than contract workers. Patreon, for instance, has six full-time content moderation employees. Discord has 10, and Medium said its team is between 5 and 7. (Reddit and Vimeo didn’t give firm numbers.)

— Companies using this model stress that moderation is manual, by human beings rather than algorithms. As such, employees are learning over time, policies may change without formalized rules, and:

Representatives also noted the significant organizational costs, such as employee resources to have debates or deep discussions, which must happen when taking on each case on its own, without having a set of formal overarching rules used by larger companies such as Facebook and Google to guide individual decisions. As these companies attempt to scale, and if they become subject to a rule like the NetzDG in Germany, they will have to actively add employees and formalize rules much faster than what was afforded existing major companies such as Facebook and Google.

Community-reliant

— “These platforms — such as Wikimedia and Reddit — rely on a large volunteer base to enforce the content moderation policies set up by a small policy team employed by the larger organization. Subcommittees of volunteers are typically responsible for norm-setting in their own communities on the site.”

— A Reddit representative explained: “Similarly to the U.S. Constitution, states are not allowed to have laws that are in contravention of the Constitution, and subreddits are not allowed to have rules that are in contravention of our site-wide rules.” Reddit didn’t provide exact numbers for how many in-house moderators it has, but said “around 10 percent of the company is dedicated to fighting abusive content on the site, whether that abusive content is bad content posted by users or spam or bots,” with a total company size of about 400.

— The community-reliant model is subject to criticism for relying on unpaid volunteers, and “the relationship between volunteer workers and parent organizations can be quite adversarial, with volunteers, who often have their own vision for the site, pushing back heavily against broad-level rules.”

Industrial

— “These companies are large-scale with global content moderation operations utilizing automated technologies; operationalizing their rules; and maintaining a separation between policy development and enforcement teams. This is most popularly seen with platforms like Facebook or Google.” (The “industrial” description was coined by Tarleton Gillespie.)

— Facebook has said it will have 20,000 people around the world handling content moderation by the end of 2018, “though it is likely this number reflects a large percentage of contract-based, outsourced workers, not contained within the company.”

— These companies usually started with the artisanal model and, by experimenting over time, came to more formalized, inflexible policies.

— The amount of content that needs to be moderated is massive.

According to Nora Puckett, the YouTube representative at the 2018 Content Moderation at Scale Conference in Washington, D.C., in the fourth quarter of 2017, YouTube removed 8.2 million videos from 28 million videos flagged, which included 6.5 million videos flagged by automated means, 1.1 million flagged by trusted users, and 400,000 flagged by regular users. According to that same representative, YouTube has 10,000 workers in their content moderation teams. Twitter, which is dwarfed by behemoths Facebook and Google, still has 330 million monthly users and billions of tweets per week. At the Content Moderation at Scale event, Del Harvey, vice president of trust and safety at Twitter, noted that with this kind of scale, catching 99.9 percent of bad content still means that tens of thousands of problematic tweets remain.
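
As a rough illustration of the arithmetic behind Harvey's point, here is a minimal back-of-envelope sketch; the weekly tweet volume and the share of problematic content are illustrative assumptions, not figures from the paper or from Twitter:

# Back-of-envelope sketch of the moderation-at-scale problem.
# Assumed inputs (not from the paper): roughly 3 billion tweets per week,
# with 1 percent of them violating policy.
tweets_per_week = 3_000_000_000
problematic_share = 0.01
catch_rate = 0.999  # the 99.9 percent catch rate cited above

problematic = tweets_per_week * problematic_share
missed = problematic * (1 - catch_rate)
print(f"{missed:,.0f} problematic tweets slip through each week")
# Under these assumptions: 30,000 per week, i.e. "tens of thousands."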

— Companies using the industrial approach favor consistency across countries and

tend to collapse contexts in favor of establishing global rules that make little sense when applied to content across vastly different cultural and political contexts around the world. This can, at times, have significant negative impact on marginalized groups. Julia Angwin criticized this type of policy practice when Facebook attempted to implement a policy that incorporated notions of intersectionality divorced from existing power arrangements, essentially protecting the hegemonic groups of White and men, but not “Black children.” Her work demonstrated that attempts at universal anti-discrimination rules too often do not account for power differences along racial and gender lines.

The full paper is here.
