<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: What qualifies as a Spotlight story on Google News? Here&#8217;s a few clues</title>
	<atom:link href="http://www.niemanlab.org/2010/01/what-qualifies-as-a-spotlight-story-on-google-news-heres-a-few-clues/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.niemanlab.org/2010/01/what-qualifies-as-a-spotlight-story-on-google-news-heres-a-few-clues/</link>
	<description>A collaborative effort to figure out the future of journalism. A project of Harvard University.</description>
	<lastBuildDate>Sun, 12 Feb 2012 17:39:00 -0500</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Friday Weekly Reader&#160;&#124;&#160;PressPass</title>
		<link>http://www.niemanlab.org/2010/01/what-qualifies-as-a-spotlight-story-on-google-news-heres-a-few-clues/comment-page-1/#comment-99940</link>
		<dc:creator>Friday Weekly Reader&#160;&#124;&#160;PressPass</dc:creator>
		<pubDate>Mon, 12 Apr 2010 21:27:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.niemanlab.org/?p=11783#comment-99940</guid>
		<description>[...] What qualifies as a Spotlight story on Google News? Here’s a few clues [...]</description>
		<content:encoded><![CDATA[<p>[...] What qualifies as a Spotlight story on Google News? Here’s a few clues [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Stray</title>
		<link>http://www.niemanlab.org/2010/01/what-qualifies-as-a-spotlight-story-on-google-news-heres-a-few-clues/comment-page-1/#comment-69360</link>
		<dc:creator>Jonathan Stray</dc:creator>
		<pubDate>Tue, 12 Jan 2010 11:49:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.niemanlab.org/?p=11783#comment-69360</guid>
		<description>I think you&#039;re probably thinking along the wrong lines in terms of how Google defines &quot;lasting value,&quot; imagining it to be based on editorial selection of key words. 

Google doesn&#039;t do that, which is in my opinion part of their value. They don&#039;t make semantic assumptions (remember, the small Google News team would have to do this for 40 or so languages) and they&#039;re in the business of expressing what everyone else thinks, not voicing their own opinion.

If I were assigned the project of algorithm development for the Spotlight section, I&#039;d gather data like the time distribution of comments and new links. If people are still commenting on a story at the same (high) rate as they were five days ago, it&#039;s probably not a flash-in-the-pan bit of spot news.

The thresholds could even be calibrated automatically by examining the typical distribution of comment/link production for a news story and looking for long-lived outliers.

I&#039;m not saying this is how Google does it, but this is the sort of thing I&#039;d experiment with.

BTW, I suspect your understanding of how article categorization works is similarly off. I would do it by generating word vectors for stories (See http://en.wikipedia.org/wiki/Vector_space_model) and comparing them to content known to be human-categorized in specific ways. Again, this is both category and language neutral.

As a professional computer scientist, I maintain that journalism has a lot to learn from computational linguistics ;)
  
  - Jonathan</description>
		<content:encoded><![CDATA[<p>I think you&#8217;re probably thinking along the wrong lines in terms of how Google defines &#8220;lasting value,&#8221; imagining it to be based on editorial selection of key words. </p>
<p>Google doesn&#8217;t do that, which is in my opinion part of their value. They don&#8217;t make semantic assumptions (remember, the small Google News team would have to do this for 40 or so languages) and they&#8217;re in the business of expressing what everyone else thinks, not voicing their own opinion.</p>
<p>If I were assigned the project of algorithm development for the Spotlight section, I&#8217;d gather data like the time distribution of comments and new links. If people are still commenting on a story at the same (high) rate as they were five days ago, it&#8217;s probably not a flash-in-the-pan bit of spot news.</p>
<p>The thresholds could even be calibrated automatically by examining the typical distribution of comment/link production for a news story and looking for long-lived outliers.</p>
<p>I&#8217;m not saying this is how Google does it, but this is the sort of thing I&#8217;d experiment with.</p>
<p>BTW, I suspect your understanding of how article categorization works is similarly off. I would do it by generating word vectors for stories (See <a href="http://en.wikipedia.org/wiki/Vector_space_model" rel="nofollow">http://en.wikipedia.org/wiki/Vector_space_model</a>) and comparing them to content known to be human-categorized in specific ways. Again, this is both category and language neutral.</p>
<p>As a professional computer scientist, I maintain that journalism has a lot to learn from computational linguistics ;)</p>
<p>  &#8211; Jonathan</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stories you may have missed from January 6th &#171; Radioactive Gavin is Out of Print</title>
		<link>http://www.niemanlab.org/2010/01/what-qualifies-as-a-spotlight-story-on-google-news-heres-a-few-clues/comment-page-1/#comment-68501</link>
		<dc:creator>Stories you may have missed from January 6th &#171; Radioactive Gavin is Out of Print</dc:creator>
		<pubDate>Fri, 08 Jan 2010 05:58:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.niemanlab.org/?p=11783#comment-68501</guid>
		<description>[...] What qualifies as Spotlight story on Google News? from Nieman J- Lab  [...]</description>
		<content:encoded><![CDATA[<p>[...] What qualifies as Spotlight story on Google News? from Nieman J- Lab  [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Korta klipp &#8211; 07 January 2010</title>
		<link>http://www.niemanlab.org/2010/01/what-qualifies-as-a-spotlight-story-on-google-news-heres-a-few-clues/comment-page-1/#comment-68330</link>
		<dc:creator>Korta klipp &#8211; 07 January 2010</dc:creator>
		<pubDate>Thu, 07 Jan 2010 09:00:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.niemanlab.org/?p=11783#comment-68330</guid>
		<description>[...] What qualifies as a Spotlight story on Google News? Here’s a few clues [...]</description>
		<content:encoded><![CDATA[<p>[...] What qualifies as a Spotlight story on Google News? Here’s a few clues [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stories you may have missed from January 6th &#171; Radioactive Gavin is Out of Print</title>
		<link>http://www.niemanlab.org/2010/01/what-qualifies-as-a-spotlight-story-on-google-news-heres-a-few-clues/comment-page-1/#comment-68307</link>
		<dc:creator>Stories you may have missed from January 6th &#171; Radioactive Gavin is Out of Print</dc:creator>
		<pubDate>Thu, 07 Jan 2010 06:08:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.niemanlab.org/?p=11783#comment-68307</guid>
		<description>[...] What qualifies as a Spotlight story on Google News? from Nieman Journalism Lab [...]</description>
		<content:encoded><![CDATA[<p>[...] What qualifies as a Spotlight story on Google News? from Nieman Journalism Lab [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>

