Is it possible to use a robots.txt file to block search engine spiders from crawling pages with certain parameters? For example, if my site had a page http://mysite.com?custom_pages=1, could I make sure robots don't crawl any URL with the ?custom_pages parameter? Any help much appreciated!
You don't have to list every value (custom_pages=1, custom_pages=2, custom_pages=3, and so on): Disallow rules are prefix matches, and the major engines also support the * wildcard, so a single pattern can cover every value of the parameter. Also, keep in mind what robots.txt actually does: it tells compliant crawlers not to fetch the page, but it doesn't guarantee the URL stays out of the index. If other sites link to a disallowed URL, it can still show up in results; to keep a page out of the index you'd need a noindex meta tag, which only works if the page is allowed to be crawled. BTW: you could also put your custom_pages content in a separate folder and disallow the folder.
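For example, a minimal robots.txt along these lines should cover it, assuming the crawlers you care about honor the * wildcard (Googlebot and Bingbot do, though it wasn't part of the original robots.txt spec, so obscure bots may ignore it):

User-agent: *
# Block any URL whose query string contains the custom_pages parameter
Disallow: /*custom_pages=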
@longcall911 Yeah, that's what I'm after. Some pages will look like duplicate content because they're reachable under different URL structures, so I want to control what gets indexed. Now I just have to find an easy way to track down all the parameters I want blocked across our 1,950,000 pages! Ouch!
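If you can export the URLs (say, from your access logs or a sitemap dump), a short script can list the distinct query parameters for you. A minimal sketch in Python, assuming a hypothetical urls.txt with one URL per line:

import collections
from urllib.parse import urlsplit, parse_qsl

counts = collections.Counter()

with open("urls.txt") as f:  # hypothetical export, one URL per line
    for line in f:
        url = line.strip()
        if not url:
            continue
        # parse_qsl splits the query string into (name, value) pairs
        for name, _ in parse_qsl(urlsplit(url).query):
            counts[name] += 1

# Most common parameter names first, to help decide what to Disallow
for name, n in counts.most_common():
    print(f"{n:8d}  {name}")

From that list you can decide which parameters deserve their own Disallow: /*param= line.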