Hi guys. OK, I'm launching a new site that contains thousands of pages with unique content and images. I've spent a lot of time and money creating this content, so I'd like to protect it as well as possible.

I've written some code that looks at the visitor's user agent. If the user agent is not in my list of 'agreed' user agents, I display the page they're trying to see but simply do NOT put anything in the source code. If the user agent IS in my list, they see the same page but with the content in the source file.

Example:
Mozilla/4.0 = display no source code
googlebot = display the source code

Both pages will be 100% the same in every way; it just means I can hide the source code if the visitor is not a search engine spider (in my agreed list).

If someone from Google or Yahoo were to manually look at the pages (say, if someone reported me), would I be looked upon as spamming? Or would this technique be OK since both pages are the same?
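Roughly speaking, the check works along these lines (just a simplified sketch in Python, not my actual code; the ALLOWED_AGENTS list is made up to stand in for my 'agreed' user agents):

ALLOWED_AGENTS = ("googlebot", "slurp", "msnbot")  # made-up stand-in for the 'agreed' list

def render_page(user_agent, content):
    # Agreed crawlers get the real content in the source; everyone else
    # gets the same-looking page with nothing useful in the source.
    ua = (user_agent or "").lower()
    if any(bot in ua for bot in ALLOWED_AGENTS):
        return "<html><body>" + content + "</body></html>"
    return "<html><body></body></html>"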
It's a waste of time; if you gave me the URL, I bet I could scrape your content anyway. People can fake their user agent to appear to be Googlebot, and I've even got a simple script that does exactly that. So personally I would scrap the idea.
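For example, pretending to be Googlebot takes about three lines (a rough illustration using Python's urllib, not the actual tool I mentioned; the URL is a placeholder):

import urllib.request

req = urllib.request.Request(
    "http://example.com/some-page",  # placeholder URL
    headers={"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
)
# Prints your 'protected' source, served as if to a search engine spider.
print(urllib.request.urlopen(req).read().decode("utf-8"))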
Yeah, I know it's not 100%, but it just makes me feel better knowing not every scumbag will be able to steal my stuff. Just wondering if the search engines will class it as a form of cloaking?
So long as you aren't delivering different content to humans and spiders in order to manipulate rankings, you'll be OK. I've used user-agent, browser, and IP detection and delivery methods numerous times without problems.
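If you do go the IP route, one common sanity check is verifying that a visitor claiming to be Googlebot really comes from a Google address with a reverse-plus-forward DNS lookup. A rough sketch (not a drop-in implementation, and not necessarily how you'd have to do it):

import socket

def is_real_googlebot(ip):
    # Reverse-resolve the IP, check the hostname, then forward-confirm it.
    try:
        host = socket.gethostbyaddr(ip)[0]
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        return socket.gethostbyname(host) == ip
    except socket.error:
        return False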
Hi darrens, in my opinion it is not spamming; I think you are going the right way. You should not permit the user agents that you don't want crawling your site, which also keeps your bandwidth use down. Your site probably also has folders like an images folder, bin, and so on; you should keep crawlers out of those as well by listing them in the robots.txt file so they are not crawled.
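For example, the robots.txt could look something like this (the folder names here are only placeholders matching the ones mentioned above):

User-agent: *
Disallow: /images/
Disallow: /bin/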
I don't think it's spamming, but what's to stop a user from just saving your web pages and getting your content that way? You can't prevent that.
It's a stupid thing to do, not to mention a complete waste of time. Rather than worrying about people stealing your content, just be vigilant, and if you do find scrapers stealing your site's content, check their Web hosting provider, and if they're in the United States, send their hosts a DMCA takedown notice (also be sure to send one to each of the four major search engines - Google, Yahoo, MSN, and Ask.com). You'll want to check out these links to learn more about the DMCA:
http://www.smashingmagazine.com/2007/07/07/copyright-explained-i-may-copy-it-right/
http://www.google.com/dmca.html
http://www.cs.cmu.edu/~dst/Terrorism/form-letter.html
http://lorelle.wordpress.com/2006/04/10/what-do-you-do-when-someone-steals-your-content/
http://www.seologic.com/faq/dmca-notifications.php