I have a couple of fairly new sites and have Google Alerts set up so I know when they are indexed or mentioned by another site. I'm constantly getting Google Alerts about my RSS text being indexed by what I'd consider scraper sites. The only thing on these sites is short summaries of many different sites' content. All of these scraper sites are being indexed by Google and use AdSense. When I reported them to AdSense, I got a form letter response saying I have to send them proof and a signed letter stating that these sites are infringing on my copyrights. Yet it is so obvious the sites are stealing content from many other sites, and they even link back to the original article or blog post on my site and the others. I don't mind doing it, but should I even waste my time, since Google doesn't appear to care anyway? And, just as important, should I be concerned about duplicate content?
Absolutely, duplicate content isn't good anymore. But RSS feeds are great if you don't rely solely on them.
You have to see the irony there - you're expecting Google to care when in fact, that's pretty much what Google does - scrape content to sell ads off of.
I keep sending the notifications. I can't help but think a LOT of people are violating the terms of service - or one or two people doing it very well.
It's not content theft if they only take a small extract from each site; that's fair use, or something like that, so you shouldn't be worried. Maybe try doing something along those lines too.
You mean put up sleazy MFA sites so I can cheat the advertisers who support my other sites, and screw up the whole system?
You probably shouldn't bother. If RSS is done properly, it should only be a short summary or intro to an article, rather than the entire article itself. This is how I do mine. I know there are bloggers who put their entire posts in the RSS, but IMO this is a mistake. What incentive does the web user then have to come and visit your sites (and view your lovely ads while they're at it)? Just make sure that all you put in the description field is a short sentence or two, and not the whole shebang. That way you won't have to worry about duplicate content, and your RSS will be working for you, not competing with you. Auto-blogs that present the content of a variety of feeds are proliferating like bunnies at the moment.
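For anyone generating a feed by hand, here is a minimal sketch of the "short sentence or two in the description field" idea. It is only an illustration, not how any particular blog platform does it: the helper names `short_summary` and `build_item`, the two-sentence and 300-character limits, and the example.com URL are all made up for the example, and the sentence splitting is deliberately naive.

```python
import re
import xml.etree.ElementTree as ET


def short_summary(body: str, max_sentences: int = 2, max_chars: int = 300) -> str:
    """Return the first sentence or two of a post (hypothetical helper).

    The result is cut at about max_chars, plus a trailing "..." if truncated.
    """
    # Collapse whitespace so line breaks in the post don't affect splitting.
    text = " ".join(body.split())
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    summary = " ".join(sentences[:max_sentences])
    if len(summary) > max_chars:
        # Cut at the last word boundary that fits and mark the truncation.
        summary = summary[:max_chars].rsplit(" ", 1)[0] + "..."
    return summary


def build_item(title: str, link: str, body: str) -> ET.Element:
    """Build an RSS <item> whose <description> holds only the short summary."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    ET.SubElement(item, "description").text = short_summary(body)
    return item


# Made-up post text and URL, purely for illustration.
post = (
    "Leaky faucets waste a lot of water. Most can be fixed with a new washer. "
    "Here is the full step-by-step guide with photos..."
)
item = build_item("Fixing a leaky faucet", "http://www.example.com/leaky-faucet", post)
print(ET.tostring(item, encoding="unicode"))
```

The full article never goes into the feed, so a scraper that republishes the item only gets the teaser and the link back to your site.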
Google has to legally cover themselves before they ban a site for alleged copyright infringement. Otherwise anyone could make a frivolous charge. Follow Google's AdSense DMCA process and the offender must respond or get banned: http://www.google.com/adsense_dmca.html
I think there's some confusion here. There are some sites that simply grab the description of various sites, or a few lines at random, pretending to be a resource for something they're not, e.g. "Faucets!" (faucets.somethingorother.###) with a few lines from the Chevy Faucet site, a few more from Joe Faucet's blog, etc., out of context and simply taken at random so they can show AdSense ads. Frustrated users presumably see one useful link in the AdSense area and click on it, providing instant revenue to sleazy crooks.