Nieman Foundation at Harvard
Why “Sorry, I don’t know” is sometimes the best answer: The Washington Post’s technology chief on its first AI chatbot
March 6, 2024, 12:42 p.m.
Posted by: Joshua Benton   |   March 6, 2024

Google is one of the most important vectors for online spammers and scammers. The search engine’s dominance means that, if you want to generate traffic — a.k.a. potential marks — you’ll probably need to convince Google’s algorithm to send it your way. And there’s a good chance the resulting traffic will be monetized using Google’s own ad tools, which has a knock-on effect on the rest of the digital ad market. So the tech giant’s responses to new iterations of bad behavior have a big impact on the broader web publishing world.

On Tuesday, Google announced an important set of algorithm changes in an attempt to deal with a wave of AI-generated spam:

In 2022, we began tuning our ranking systems to reduce unhelpful, unoriginal content on Search and keep it at very low levels. We’re bringing what we learned from that work into the March 2024 core update.

This update involves refining some of our core ranking systems to help us better understand if webpages are unhelpful, have a poor user experience or feel like they were created for search engines instead of people. This could include sites created primarily to match very specific search queries.

We believe these updates will reduce the amount of low-quality content on Search and send more traffic to helpful and high-quality sites. Based on our evaluations, we expect that the combination of this update and our previous efforts will collectively reduce low-quality, unoriginal content in search results by 40%.

(Here’s coverage from Wired, The Verge, Ars Technica, Gizmodo, and TechCrunch.)

The update has three prongs, which Google is calling expired domain abuse, scaled content abuse, and site reputation abuse.

Expired domain abuse involves someone buying a domain that Google had previously considered high quality and coasting on that reputation to drive traffic to, er, low-quality material. This is not a new problem; longtime readers might remember when, in 2013, a spammer grabbed the domain name of the Online Journalism Review and filled it with links to their products. Expired news sites are a particularly juicy target; New York Times reporter Lydia DePillis recently mourned The Washington Independent, an early-Obama-era politics site that has been turned into, well, something else.

Site reputation abuse is similar, except rather than taking an entire domain, the spammer manages to post their own low-quality content onto an existing high-quality site, usually by exploiting a security hole in its content management system or taking advantage of some long-forgotten user-generated-content section. Those who control these sites have two months to clear out the bad stuff before Google brings the hammer fully down.

And scaled content abuse is mostly driven by AI. A scammer uses something like ChatGPT to generate oceans of mealy-mouthed content relating to some topic or another in hopes of drawing organic search traffic. Sometimes they’ll pull an existing site’s sitemap to specifically copy its content (in just-altered-enough form) and steal away its traffic. (Some are even dumb enough to brag about their heist online!) The new policy “will allow us to take action on more types of content with little to no value created at scale, like pages that pretend to have answers to popular searches but fail to deliver helpful content,” Google says.

These changes sound on net like good news for legit publishers — any traffic redirected from spam sites is traffic that might instead be directed their way. But while you’re waiting for that organic boost: Renew your domain names, people.

Photo by Hannes Johnson.
