Hi,

We have a bit of a problem with the indexing of our sites. We have footer links for things like 'send to friend', 'print page', etc., and they are getting indexed by Google, e.g.:

http://www.website.co.uk/about_us/index:sendtofriend.html

When you click on these links the page is usually blank and there's no link back to the home page. I have implemented the following robots.txt to prevent any new sites from being affected:

User-agent: *
Disallow: */index:sendtofriend*

(and so on for the other footer links). However, I recently read that Googlebot will ignore the wildcard unless the User-agent line names Googlebot specifically - is this the case? Sounded like BS to me, but new sites still seem to have the bad pages indexed despite the new robots.txt.

Also, what's the best way to fix sites which are already affected?

Kind regards,
Anthony
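For what it's worth, here is the variant I'm considering rolling out, with a dedicated Googlebot group in case that claim is true. This is just a sketch: the printpage pattern is my guess that the print links follow the same "index:..." naming as the send-to-friend ones, and the trailing * is dropped since the rules are prefix matches anyway:

# Sketch only - assumes print links use an "index:printpage" pattern
User-agent: Googlebot
Disallow: /*index:sendtofriend
Disallow: /*index:printpage

User-agent: *
Disallow: /*index:sendtofriend
Disallow: /*index:printpage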
For your second question: there is an option in Webmaster Tools to remove cached pages, under Crawler access. Also, Google has just posted on the Webmaster Central blog about how to remove unwanted URLs; you can check it out here:

http://googlewebmastercentral.blogspot.com/2010/04/url-removals-explained-part-ii-removing.html
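One thing to bear in mind (as I understand it): the removal tool generally wants the URL already blocked in robots.txt, returning a 404, or marked noindex before it will process the request. If you can edit the template that generates those blank pages, something like this in the head should cover it:

<!-- tells crawlers not to index this page -->
<meta name="robots" content="noindex">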