Hi guys and girls! Does anyone have an idea what I should write in my robots.txt file to prevent Google from spidering a few hundred pages that look like this: http:// domain dot com/?e=a&subcat=Business&l=A (the query-string parameters are the variable parts). I read that this is correct: Disallow: /*? Is that right? Also, if Google has already crawled a few of these pages, is there any way to tell Google via Google Sitemaps to throw them away? Thanks.
Put this in your robots.txt:

User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /

Put that file in your root directory and it should work. Note that this disallows only Googlebot, but on all pages. Hope this helps.
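If you want to sanity-check rules like these before uploading, Python's built-in urllib.robotparser can simulate them for simple (non-wildcard) directives. A quick sketch; the example.com URL is just a stand-in for your own domain:

import urllib.robotparser

rules = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

url = "http://example.com/?e=a&subcat=Business&l=A"
# Googlebot is blocked everywhere, other crawlers are allowed.
print(parser.can_fetch("Googlebot", url))      # False
print(parser.can_fetch("SomeOtherBot", url))   # True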
Google supports the * wildcard. Adding

Disallow: /*subcat=*

will stop Google from accessing any URL that includes subcat=. Likewise, adding

Disallow: /*l=*

will stop Google from accessing URLs that include l=. This works for Google and Yahoo, but not MSN.
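Keep in mind that standard robots.txt matching is prefix-only; * and $ are a Google/Yahoo extension, so urllib.robotparser won't evaluate them correctly. Below is a rough sketch of how Googlebot-style wildcard matching works, just to check which URLs a pattern like /*subcat=* would catch (my own simplified matcher, not Google's actual code):

import re

def googlebot_match(pattern, path):
    """Roughly emulate Googlebot's wildcard matching:
    '*' matches any run of characters, '$' anchors the end of the URL,
    and the pattern only needs to match a prefix of the path."""
    regex = ""
    for ch in pattern:
        if ch == "*":
            regex += ".*"
        elif ch == "$":
            regex += "$"
        else:
            regex += re.escape(ch)
    return re.match(regex, path) is not None

# The query-string URL from the question, plus a normal page for contrast.
urls = ["/?e=a&subcat=Business&l=A", "/about.html"]
for pattern in ("/*subcat=*", "/*?"):
    for url in urls:
        verdict = "blocked" if googlebot_match(pattern, url) else "allowed"
        print(pattern, url, "->", verdict)

Both /*subcat=* and /*? should report the query-string URL as blocked and /about.html as allowed, which matches what the suggestions above are aiming for.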