Nieman Foundation at Harvard
June 14, 2017, 12:37 p.m.   |   LINK: towcenter.org   |   Posted by: Joseph Lichterman

Journalism is becoming increasingly automated. From the Associated Press using machine learning to write stories to The New York Times’ plans to automate its comment moderation, outlets continue to use artificial intelligence to try to streamline their processes or make them more efficient.

But what are the ethical considerations of AI? How can journalists legally acquire the data they need? What types of data should news orgs be storing? How transparent do outlets need to be about the algorithms they use?

These were some of the questions posed Tuesday at a panel discussion on the ethics of AI-powered journalism products, held by the Tow Center for Digital Journalism and the Brown Institute for Media Innovation at Columbia University.

Tools such as machine learning or natural language processing require vast amounts of data to learn to behave like a human, and Amanda Levendowski, a clinical teaching fellow at NYU’s law school, outlined a series of considerations journalists should weigh when trying to acquire data for these tasks.

“What does it mean for a journalist to obtain data both legally and ethically? Just because data is publicly available does not necessarily mean that it’s legally available, and it certainly doesn’t mean that it’s necessarily ethically available,” she said. “There’s a lot of different questions about what public means — especially online. Does it make a difference if you show it to a large group of people or small group of people? What does it mean when you feel comfortable disclosing personal information on a dating website versus your public Twitter account versus a LinkedIn profile? Or if you choose to make all of those private, what does it mean to disclose that information?”

For example, Levendowski noted that many machine learning algorithms were trained on a cache of 1.6 million emails from Enron that were released by the federal government in the early 2000s. Companies are risk averse, she said, and prefer publicly available datasets such as the Enron emails or Wikipedia, but those datasets can carry biases.

“But when you think about how people use language using a dataset by oil and gas guys in Houston who were convicted of fraud, there are a lot of biases that are going to be baked into that data set that are being handed down and not just imitated by machines, but sometimes amplified because of the scale, or perpetuated, and so much so that now, even though so many machine learning algorithms have been trained or touched by this data set, there are entire research papers dedicated to exploring the gender-race power biases that are baked into this data set.”
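The mechanism Levendowski describes is simple to see in miniature: a model learns word associations from whatever co-occurrence patterns its training text happens to contain. The sketch below uses a tiny, entirely made-up corpus (not the Enron data) to show how a skewed training set hands its skew straight to co-occurrence statistics, the raw material of many language models.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus standing in for a skewed training set:
# here the word "engineer" happens to co-occur only with "he."
corpus = [
    "he is an engineer",
    "he is an engineer",
    "he is a trader",
    "she is an assistant",
]

# Count how often each pair of words appears in the same sentence.
pair_counts = Counter()
for sentence in corpus:
    words = set(sentence.split())
    for a, b in combinations(sorted(words), 2):
        pair_counts[(a, b)] += 1

# Any model built on these counts inherits the skew: "engineer"
# is associated with "he" but never with "she" in this corpus.
print(pair_counts[("engineer", "he")])   # 2
print(pair_counts[("engineer", "she")])  # 0
```

At the scale of millions of real emails the same arithmetic applies, which is why biases in a source corpus can be reproduced, and sometimes amplified, by the systems trained on it.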

The panel also featured John Keefe, the head of Quartz’s bot studio; BuzzFeed data scientist Gilad Lotan; iRobot director of data science Angela Bassa; Slack’s Jerry Talton; Columbia’s Madeleine Clare Elish; and (soon-to-be Northwestern professor) Nick Diakopoulos. The full video of the panel (and the rest of the day’s program) is available here and is embedded above; the panel starts about eight minutes in.
