Monday, April 09, 2007

Google fails to stop theft and abuse by spam blogs

Google's search engine lists spam blogs which steal copyright material from legitimate blogs and then generate income by hosting Google Ad-Sense adverts.

The content of my blogs - which is all copyrighted specifically to address this issue - is being repeatedly scraped (stolen) and used on spam blogs. The content of your blog might be suffering the same fate. I'm currently faced with an explosion in the amount of material being stolen. Read on to find out if you've got a problem like mine and what you can do about it.

What's a spam blog (splog)?
Spam blogs, sometimes referred to by the neologism splogs, are artificially created weblog sites which the author uses to promote affiliated websites or to increase the search engine rankings of associated sites. The purpose of a splog can be to increase the PageRank or backlink portfolio of affiliate websites, to artificially inflate paid ad impressions from visitors, and/or use the blog as a link outlet to get new sites indexed. Spam blogs are usually a type of scraper site, where content is often either Inauthentic Text or merely stolen from other websites. These blogs usually contain a high number of links to sites associated with the splog creator which are often disreputable or otherwise useless websites. (Wikipedia)
What sort of fraud or theft is it?
Content from my website/blogs is stolen (scraped) and reposted on a splog which hosts Google Ad-Sense adverts.
  • It's fraud because the income earned by the host spam blog is generated by content created by, and stolen from, legitimate blogs (like yours and mine).
  • It's fraud because all clicks on an advert on a spam blog are charged to the advertiser. The advertiser's name is also being linked to fraudulent activity.
  • It's theft because Google is involved in publishing stolen copyright material by continuing to index the website in its search engine after it has been notified about it.
How did I find out?
I have various phrases associated with my website/blogs set up on Google Alerts - not to be confused with which is nothing to do with Google. Guess who got stitched up on that domain name!

Every time the phrase is used in something indexed by Google I get an e-mail alert. The number of alerts I've had in the last 24 hours which are actually associated with spam blogs is getting quite ridiculous - hence this post.

I very much recommend that you also set up Google Alerts for your blog's name and any similar 'phrases' to see if your content is also being scraped.
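Once an alert points you at a suspect page, you may want to confirm that it really does contain your text rather than just a matching phrase. The following is a minimal sketch of one way to do that, assuming you have already copied the text of your own post and of the suspect page into two strings. It does not use Google Alerts or any Google service; the function names `shingles` and `copied_fraction` and the eight-word window are my own assumptions for illustration.

```python
import re


def shingles(text, size=8):
    """Return the set of overlapping word sequences of length `size` in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # If the text is shorter than `size` words, the range is empty and so is the set.
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=8):
    """Fraction of the original's word sequences that also appear in the suspect text."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)
```

A result close to 1.0 suggests the post has been lifted wholesale, while a result near 0.0 suggests the page merely mentions the same words. Where to draw the line in between is a judgment call.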

How did I report the theft and the abuse?

I found the website for reporting abuse of Google Ad-Sense, located the e-mail address and reported the abuse. The e-mail address is " ". (They call it a 'policy violation'. I call it theft.)

You should report:
  • the names/URLs/posts of websites/blog sites from which content is being scraped
  • if your content is copyright protected and whether there is a notice to this effect on site
  • your concern that income is being generated for thieves committing fraud through the use of Google Ad-Sense adverts on the offending site
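If you have several sites to report, it can help to put the same points together in the same way each time. The following is a rough sketch only: the function name `build_report`, its parameters and the wording of the message are all my own assumptions, not a format that Google asks for.

```python
def build_report(scraped_posts, offending_urls, copyright_notice_on_site):
    """Assemble a plain-text report covering the points listed above."""
    lines = ["Content has been copied from the following posts:"]
    lines += [f"  - {url}" for url in scraped_posts]
    lines.append("It has been republished alongside Google ads on:")
    lines += [f"  - {url}" for url in offending_urls]
    # State whether the copyright notice is actually displayed on the original site.
    notice = "is" if copyright_notice_on_site else "is not"
    lines.append(
        f"The content is copyright protected; a notice to this effect {notice} displayed on the site."
    )
    lines.append(
        "I am concerned that the offending site is generating advertising income from this copied content."
    )
    return "\n".join(lines)
```

The text it returns can be pasted into the e-mail you send, with one list for your own posts and one for the sites that copied them.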
You will get an automated response. This is it.
We've received your email alerting us to a potential policy violation on a site displaying Google ads. Although we're unable to respond to individual reports, we have forwarded your email to our team of specialists for further investigation.

We appreciate your help in maintaining the quality of the AdSense program.

The Google AdSense Team
Dissatisfied with this initial response to a statement of their involvement in a theft of copyrighted material, I wrote again, indicated I was about to start blogging about it, and got this (automated?) response:
Hello Katherine,

Thanks for writing in. I understand your concern about your content being re-posted on other sites. Upon investigation of the sites you've listed, it appears that the blogs in question have since been suspended.

Please know that publishers participating in AdSense may not display Google ads on web pages with content protected by copyright law unless they have the necessary legal rights to display that content. Should you find that your content is being replicated on other sites, rest assured that we will take action as necessary to address the issue.

We appreciate your patience and understanding.


The Google AdSense Team
I've reported a number of further breaches in the last 12 hours and am back to the original automated response.

Why is it a problem?
The problem exists because Google has created the opportunity for it to exist and is not doing enough to close down those who operate in this way. This is not a new problem. I did a little bit of investigation and found more than a few articles about it. Here are links to a couple.
Often, legitimate companies have their advertisements served on questionable sites through redirections designed to "obfuscate the connection between the advertisers and the spammers," the researchers wrote... If those links are clicked, the doorway pages then redirect to other pages, potentially bringing revenue back to its controller via pay-per-click advertising offered by companies such as Google Inc. through its AdSense program.

A responsibility also lies with advertisers to assert greater control over where and how their ads are placed.

"Ultimately, it is advertisers' money that is funding the search spam industry, which is increasingly cluttering the Web with low-quality content and reducing Web users' productivity," they wrote.

How to get Google to do something?
So long as Google's search engine continues to index spam blogs and Google Ad-Sense continues to serve up adverts to spam blogs then:
  • blogs like mine - and yours - will continue to be scraped of content
  • copyright breaches will continue
  • income generated as a result will mean that content authors are being defrauded
  • people advertising using Google Ad-Sense will be defrauded.
I sat and pondered why Google hasn't done anything, and decided that one of the reasons may be that advertisers are not aware of how bad the problem is. If advertisers started pulling their adverts and/or pressurising Google then maybe Google would get it sorted.

So I've decided to start listing all the advertisers appearing on the same page of the spam blogs hosting my stolen content. This at least would alert advertisers to where their adverts were being placed.

Some of the advertisers are small operations - some are individual artists - and some are major companies - including Microsoft! All are potentially being defrauded by the sites hosting my stolen copyrighted content. None appear to be exerting sufficient control over where their adverts appear and all need to check what sort of site their adverts are appearing on.
  1. - an artist's pen and ink artwork
  2. - an architectural illustrator
  3. - buying pencils
  4. - on the spot caricature
  5. - quality pens
  6. - nail art pen
  7. - free trials of Microsoft sharepoint 2007 software
  8. - anabolic steroids book
  9. - training
  10. - stock reports
  11. - Indian blankets
  12. - holidays
  13. - holidays
  14. - Holiday apartments
  15. - Gulf hurricane relief (charity?)
I wonder how long it will take for a representative from Google Ad-Sense to turn up and comment on this post to explain what they are doing to halt this sort of abuse. Or maybe they don't pay attention to what people are saying about them on the Internet?

If you are an artist paying for Ad-Sense to advertise your art - you might want to pause and think about the wisdom of your investment. Your adverts might be turning up on spam blogs and getting listed on blog posts like this...



  1. It gets better - I've now discovered another site and blog post - supposedly about free web hosting - which is scraping content using Google Alerts - and has posted this blog post!!!

    Just how many legal 'own goals' against copyright material on one blog can Google commit?

  2. I have seen the same. I actually found your article with my Google Alerts, which these days are getting worthless because of the number of splogs that just contain the words I am interested in, among other junk.

  3. I've subsequently had a satisfactory reply from Google as to action relating to the blog which generated this blog post.

    Ishai - it's very sad, isn't it?

    However I've just found another site which is publishing my RSS link in full. Admin of that site have been notified - and I'm inclined to think their action to date has been done out of ignorance.

    Should the RSS link not be removed from that website immediately I shall be taking further action on a formal basis.

    To all other would-be scrapers/thieves: please be aware that I have Google Alerts set up for a number of phrases which tell me when you publish my copyrighted material. YOU HAVE BEEN WARNED.

