Nov. 13, 2012, 10:30 a.m.

The New York Times’ Chronicle tool explores how language — and journalism — has evolved

When were yuppies supplanted by hipsters in the minds of New York Times reporters? A corpus of Times usage has the answer.

It’s possible The New York Times is using the word “signature” too much. I’ll let Philip Corbett, the paper’s standards editor, explain:

We wrote of the signature evidence of early phase C.T.E., of the Paper Bag Players’ signature oversize props and costumes of cardboard and paper, of a golf course’s signature par-3 hole and of a restaurant’s signature sushi rolls. We said candles of a woman’s signature scent would make a nifty gift.

As the guy in charge of standards, Corbett has to have a keen eye for detail on what appears in the Times every day. But how widespread is this, er, signature problem?

Thanks to a new (internal, alas) tool from the New York Times Co.’s R&D Lab, we have the data to know for sure. (Since 1981, usage of “signature” has increased at the paper, peaking in 2010 when the word appeared in more than 1,500 articles.) Chronicle is a database of articles and story tags from the past 31 years of Times content. The tool makes it possible to see the frequency of use of certain words — but also what people, organizations, or locations are most related to keywords.

“It’s a way of being able to see patterns in our vocabulary — not just in topics in the news, but in language and how we talk about the news,” said Alexis Lloyd, a creative technologist in the R&D Lab.

For instance, Chronicle shows that “terrorism” was used as a tag in more than 7,000 articles in 2004. The tags most closely related to terrorism were, to no one’s surprise, George W. Bush, Saddam Hussein, and Al Qaeda. Chronicle also lets you chart how words have risen and fallen over time by the number of Times articles in which they appear. The comparison tool, for example, shows that the word “yuppie” has been in decline since the mid-1980s, while “hipster” has shot upward. “We almost stopped using the word ‘decor’ in 2001,” Lloyd said. “[Usage] went up and up and up and stopped. There was an editorial decision that was made.”
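For readers curious how counts like these might be produced, here is a minimal sketch in Python. It is not the R&D Lab's code: Chronicle is internal and its schema is not public, so the record layout (`year`, `text`, `tags`), the sample articles, and the function names `articles_per_year` and `related_tags` are all assumptions made for illustration. The sketch counts how many articles per year contain a word, and which other tags most often appear alongside a given tag.

```python
import re
from collections import Counter

# Hypothetical toy records; the real Chronicle data model is not described in the article.
ARTICLES = [
    {"year": 1985, "text": "The yuppie moved downtown.", "tags": ["Real Estate"]},
    {"year": 1985, "text": "A yuppie, then another yuppie.", "tags": ["Fashion"]},
    {"year": 2010, "text": "The hipster opened a cafe.", "tags": ["Brooklyn", "Restaurants"]},
    {"year": 2010, "text": "Terrorism trial begins.", "tags": ["Terrorism", "Al Qaeda"]},
    {"year": 2010, "text": "New terrorism policy.",
     "tags": ["Terrorism", "Al Qaeda", "George W. Bush"]},
]


def articles_per_year(articles, word):
    """Count articles per year whose text contains `word` as a whole word.

    Matching is case-insensitive, and an article counts once no matter
    how many times the word appears in it.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    counts = Counter()
    for article in articles:
        if pattern.search(article["text"]):
            counts[article["year"]] += 1
    return dict(counts)


def related_tags(articles, tag, top_n=3):
    """Return the tags that most often co-occur with `tag`, most frequent first."""
    counts = Counter()
    for article in articles:
        if tag in article["tags"]:
            counts.update(t for t in article["tags"] if t != tag)
    return counts.most_common(top_n)


if __name__ == "__main__":
    print(articles_per_year(ARTICLES, "yuppie"))
    print(articles_per_year(ARTICLES, "hipster"))
    print(related_tags(ARTICLES, "Terrorism"))
```

Counting articles rather than raw occurrences matches the way the story describes the charts ("the number of Times articles they have appeared in"). A production tool would presumably also normalize by the total number of articles published each year, which this sketch leaves out.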

Word choice can be an agonizing but prideful task for reporters. While certain words are necessary for conveying the facts of a story, others allow for a signature touch. But Chronicle wasn’t meant to be used as an adjective monitor. Michael Zimbalist, vice president of R&D for NYT Co., said the paper is trying to find new ways to put its index of articles and taxonomy of story tags to better use. What the Times is sitting on is a mountain of semantic data that opens up many research opportunities, Zimbalist said. “We’re looking at a giant corpus of text we have here and how to process that text as data,” he said.

Chronicle is similar in many ways to Google’s Ngram Viewer, which lets users compare how often phrases appear in the books digitized for the Google Books project. Both projects seek to learn more about the ways language has morphed as cultures have changed. A newspaper represents a constrained body of work to study, Lloyd said, not only because stories are largely based on current events, but also because newsrooms are subject to regularly updated style guides. “This gives you a particular view into news and culture and history,” she said.

Though the Times has an extensive archive and a rich system of metadata, Chronicle’s data only runs 31 years back — to the 1981 start of the paper’s database of full-text articles. Lloyd said her next challenge is finding a way to extend the corpus deeper into the Times archive.

For now, Chronicle sits in the “not for public use” part of the Times R&D Lab and is available only inside the walls of the Times. While the search tool has potential uses for research, Lloyd said she wants to focus on expanding the corpus to make the results richer.

“I think it can be used for research and reporting,” Lloyd said. “The primary idea is to have it as an internal tool to be able to get those aggregate views and look into trends and patterns you can’t get at any other way.”
