Nieman Foundation at Harvard
Jan. 13, 2010, 12:35 p.m.

SpeakerText wants to free all your words from the prison of your videos

There’s a school of thought that says video is the future of information, that rich media is the endpoint of the evolution of text. I don’t know that I buy that, since text still has so many advantages over video: its scannability, its searchability, how much easier it usually is to create and polish. But some of those edges might be temporary, as technology evolves to solve away some of video’s problems.

SpeakerText, a new startup, is trying to become one of those problem solvers by directly tying videos to their corresponding words.

Cofounder Matt Mireles, 29 and an occasional commenter around here, used to dream about being a war correspondent for The New York Times Magazine. But “reading Romenesko and getting depressed” pushed his interest more toward the intersection of journalism and technology.

Here’s his argument: It’s relatively easy for the value of a piece of text content to be shifted from its creator to someone else. Let’s say your news organization breaks big news. What happens next? Other people start writing about your big news: summarizing it, excerpting it, putting their own spin on it. Maybe they link to it too, but a lot of those links don’t get clicked on, and the “credit” in terms of eyeballs ends up spread around a lot of different sites, not just the one doing the original reporting. Or, as Mireles puts it: “Text is easily commoditized.”

But the same isn’t quite as true for video content. It’s a lot harder to satisfyingly summarize a piece of video for a blog post. (Not impossible — harder.) It’s also a lot less excerptable: If you’ve posted an hour-long video, and the juicy stuff is 39 minutes in, it’s not always easy to direct people to that spot without recutting the video.

SpeakerText tries to tackle those problems by linking points in a video with their transcripts, allowing text to be a navigational tool to locate specific points in a video. (You may have seen The New York Times do something similar for major Obama speeches.)

Here’s an example of how SpeakerText works, using a video of the NYT’s Bill Keller we wrote about back in October:

Pressing play will start both the video and the movement of highlighted text down the accompanying transcript. If you click at any point in the transcript, the video should jump to that point. This has obvious use for speeches, lectures, interviews or anything else that combines multimedia with a lot of words.
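
The behavior described above amounts to a two-way lookup between playback time and transcript position. Here is a minimal sketch in Python, assuming each transcript segment carries a start time in seconds. The `Segment` type, the function names, and the sample lines are hypothetical illustrations, not SpeakerText’s actual implementation.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    text: str


def segment_at(segments, seconds):
    """Return the index of the segment being spoken at `seconds` (for highlighting)."""
    starts = [s.start for s in segments]
    # bisect_right finds the first segment that starts after `seconds`;
    # the segment just before it is the one currently playing.
    index = bisect_right(starts, seconds) - 1
    return max(index, 0)


def seek_time(segments, index):
    """Return the time the player should jump to when segment `index` is clicked."""
    return segments[index].start


# Hypothetical sample transcript, sorted by start time.
transcript = [
    Segment(0.0, "Thanks for having me."),
    Segment(4.5, "Let me start with the business side."),
    Segment(12.0, "Then we can talk about devices."),
]
```

As the video plays, a player could call `segment_at` with the current playback time to decide which line to highlight, and call `seek_time` when a reader clicks a line. Both directions depend on the segments being sorted by start time, which is why imprecise time tags show up as highlights that drift from the audio.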

And SpeakerText also allows video to be shared at the quote level. For instance, in that Keller video, the one line lots of people seized on is where he seems to (maybe?) confirm the existence of a new Apple tablet device, calling it “the impending Apple slate.” With the tool, I can link directly to that quote, 8:24 in.
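
Linking to a quote comes down to turning a timestamp like 8:24 into an offset the player can start from. The sketch below is one way to do that in Python. The base URL and the `t` query parameter are assumptions made for illustration; the article does not describe SpeakerText’s real link format.

```python
from urllib.parse import urlencode


def to_seconds(timestamp):
    """Convert "M:SS" or "H:MM:SS" into a whole number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def quote_link(base_url, timestamp):
    """Build a link that asks the player to start at `timestamp`."""
    # The `t` parameter name is an assumption, not a documented SpeakerText API.
    return base_url + "?" + urlencode({"t": to_seconds(timestamp)})
```

With these definitions, `to_seconds("8:24")` gives 504, so a link to the Keller quote would carry an offset of 504 seconds.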

Room for improvement

It’s not perfect. For one thing, it doesn’t yet work with all video providers; we at the Lab use Vimeo as our video player, which doesn’t work with SpeakerText, so I had to reupload the Keller video to YouTube to make it work. The time-tagging isn’t perfectly precise; I had a few tags that seemed to float a few seconds away from the exact moment I tied them to.

As for the transcriptions, you either need to provide them yourself or pay for them to be created by the anonymous armies of Amazon’s Mechanical Turk. I love Mechanical Turk; I use it to do all the video transcriptions on this site. But its transcriptions can be spotty, from misplaced commas and incorrect proper names to varying interpretations of whether every last “um,” “er,” and “uh” should be considered worthy of the permanent record.

But the biggest hassle is that the connections between the transcripts and the video must be made manually, by time-stamping the text. Mireles suggests having an intern do it, but since I’m internless, it was a bit of a slog for me. For an hour-long video, time-stamping at the sentence level would be a big pain.

Mireles told me that technology to automate the time-stamping is available for purchase and is part of the plan as the company moves from bootstrapped startup to investor-fueled one.

Applying the tech to government meetings

And that brings us to SpeakerText’s efforts to raise that money. Mireles is seeking investors, in part with the idea that a future SpeakerText Pro (which would allow a website’s branding to be part of the player) and enterprise-level deals with major video vendors would generate a revenue stream. The technology, if it evolves, would also seem to be a potential purchase for one of the big video platforms.

But the company is also seeking money from the Knight Foundation as part of the 2010 Knight News Challenge. The idea is based on using SpeakerText’s tech to generate sharable and linkable video transcripts of government meetings.

“Who goes to these city council meetings and legislative meetings? Classically, that’s newspaper reporters,” Mireles told me. “They listen to everything and filter out quotes into a story, and that’s the public record. What I’d like to do is create a framework where all government business is easily searchable, quotable, linkable, and sharable.”

Such an idea would obviously require a lot more than SpeakerText’s transcription-tying tech — a whole bunch of cameras, to start — but it’s a worthy vision of how technology could work to open up all the information locked inside video files to the text-reading world. In the meantime, SpeakerText might be a useful tool for online journalists working with word-heavy videos.
