Nieman Foundation at Harvard
Nov. 13, 2012, 10:30 a.m.

The New York Times’ Chronicle tool explores how language — and journalism — has evolved

When were yuppies supplanted by hipsters in the minds of New York Times reporters? A corpus of Times usage has the answer.

It’s possible The New York Times is using the word “signature” too much. I’ll let Philip Corbett, the paper’s standards editor, explain:

We wrote of the signature evidence of early phase C.T.E., of the Paper Bag Players’ signature oversize props and costumes of cardboard and paper, of a golf course’s signature par-3 hole and of a restaurant’s signature sushi rolls. We said candles of a woman’s signature scent would make a nifty gift.

As the guy in charge of standards, Corbett has to have a keen eye for detail on what appears in the Times every day. But how widespread is this, er, signature problem?

Thanks to a new (internal, alas) tool from the New York Times Co.’s R&D Lab, we have the data to know for sure. (Since 1981, usage of “signature” has increased at the paper, peaking in 2010 when the word appeared in more than 1,500 articles.) Chronicle is a database of articles and story tags from the past 31 years of Times content. The tool makes it possible to see the frequency of use of certain words — but also what people, organizations, or locations are most related to keywords.

“It’s a way of being able to see patterns in our vocabulary — not just in topics in the news, but in language and how we talk about the news,” said Alexis Lloyd, a creative technologist in the R&D Lab.

For instance, using Chronicle, you’ll find that the word “terrorism” was used as a tag in more than 7,000 articles in 2004. We can also see that the tags most closely related to terrorism were, to no one’s surprise, George W. Bush, Saddam Hussein, and Al Qaeda. Chronicle also lets you chart how words have risen and fallen over time by the number of Times articles they have appeared in. The comparison tool, for example, shows that the word “yuppie” has been in decline since the mid-1980s, while “hipster” has shot upward. “We almost stopped using the word ‘decor’ in 2001,” Lloyd said. “[Usage] went up and up and up and stopped. There was an editorial decision that was made.”
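The two views described above, articles per year for a word and the tags that most often accompany a given tag, can be illustrated with a minimal Python sketch. This is not the R&D Lab's code, which the article does not describe. The toy corpus, the function names `articles_per_year` and `related_tags`, and the simple co-occurrence ranking are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy corpus of (year, text, tags). Not Times data.
ARTICLES = [
    (1985, "The yuppie crowd filled the new wine bar.", ["Restaurants"]),
    (1985, "A yuppie guide to condos.", ["Real Estate"]),
    (2010, "A hipster cafe opens in Brooklyn.", ["Restaurants", "Brooklyn"]),
    (2010, "Officials discuss terrorism policy.", ["Terrorism", "Al Qaeda"]),
    (2010, "Hipster bands, and one yuppie, play Williamsburg.", ["Music", "Brooklyn"]),
]


def articles_per_year(word, articles):
    """Count, per year, the articles whose text contains `word` (case-insensitive)."""
    counts = Counter()
    for year, text, _tags in articles:
        # Split on whitespace and strip surrounding punctuation from each token.
        tokens = {t.strip('.,;:!?"\'').lower() for t in text.split()}
        if word.lower() in tokens:
            counts[year] += 1  # count the article once, however often the word appears
    return dict(counts)


def related_tags(tag, articles):
    """Rank other tags by how often they share an article with `tag` (assumed metric)."""
    co_occurrence = Counter()
    for _year, _text, tags in articles:
        if tag in tags:
            co_occurrence.update(t for t in tags if t != tag)
    return co_occurrence.most_common()


print(articles_per_year("yuppie", ARTICLES))
print(articles_per_year("hipster", ARTICLES))
print(related_tags("Terrorism", ARTICLES))
```

Counting articles rather than raw word occurrences matches the way the article reports its figures, such as “signature” appearing in more than 1,500 articles in a year. How Chronicle actually scores related tags is not stated, so raw co-occurrence here is only a stand-in.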

Word choice can be an agonizing but prideful task for reporters. While certain words are necessary for conveying the facts of a story, others allow for a signature touch. But Chronicle wasn’t meant to be used as an adjective monitor. Michael Zimbalist, vice president of R&D for NYT Co., said the paper is trying to find new ways to put its index of articles and taxonomy of story tags to better use. What the Times is sitting on is a mountain of semantic data that opens up many research opportunities, Zimbalist said. “We’re looking at a giant corpus of text we have here and how to process that text as data,” he said.

Chronicle is similar in many ways to Google’s Ngram Viewer, which lets users compare how often phrases appear in the books digitized for the Google Books project. Both projects seek to learn more about the ways language has morphed as cultures have changed. A newspaper represents a constrained body of work to study, Lloyd said, because stories are largely based on current events, but also because newsrooms are subject to regularly updated style guides. “This gives you a particular view into news and culture and history,” she said.

Though the Times has an extensive archive and a rich system of metadata, Chronicle’s data only runs 31 years back — to the 1981 start of the paper’s database of full-text articles. Lloyd said her next challenge is finding a way to extend the corpus deeper into the Times archive.

For now, Chronicle sits in the “not for public use” part of the Times R&D Lab and is available only inside the walls of the Times. While the search tool has potential uses for research, Lloyd said she wants to focus on expanding the corpus to make the results richer.

“I think it can be used for research and reporting,” Lloyd said. “The primary idea is to have it as an internal tool to be able to get those aggregate views and look into trends and patterns you can’t get at any other way.”
