Hi! I am a bit new to SEO. What's the best way to stop a page from being crawled? I have a page containing ads and affiliate links that I would rather not have crawled until my site has been listed. I know of two possible ways: adding rel="nofollow" to the <a href=> tags in my navigation, or using the robots.txt file. Which is better, and are there alternatives? Also, should I do this at all, or should I delete the page altogether until I am listed? Cheers, Peter
You should add nofollow to links pointing at pages you don't want crawled, and you should also block those pages in robots.txt.
Instead of using nofollow on links (Yahoo and MSN may still follow them), you can add this meta robots tag to the pages you don't want indexed: <meta name="robots" content="noindex,nofollow">
There is absolutely no need for that! Since you know how to use robots.txt, that's all you need. It is the first file search engines look for before they crawl anything else on your site. Just block the individual page(s).
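For example, a robots.txt entry blocking a single page could look like this (a minimal sketch; the filename /affiliate-page.html is just a placeholder for your actual page):

```
User-agent: *
Disallow: /affiliate-page.html
```

The file goes in the root of your site, and the Disallow path is matched against the start of the URL path.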
That last question hints at hiding the page from everything. If you have affiliate links and ads, then the sites providing those links and ads will know about your page regardless of the robots.txt file. Rather than delete the page, place it in a password-protected directory until you are ready. Nothing can get in except you.
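On an Apache server, a password-protected directory is usually set up with basic authentication, roughly like this (a sketch, not a complete recipe; the AuthUserFile path is a placeholder, and the .htpasswd file must be created separately with the htpasswd tool):

```
# .htaccess inside the directory you want to protect
AuthType Basic
AuthName "Private area"
AuthUserFile /home/youruser/.htpasswd
Require valid-user
```

With this in place, both visitors and crawlers are stopped by the login prompt, so nothing in that directory can be indexed.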
Where did you get this? Google can still crawl a site even without a sitemap; sitemaps are just an extra Google tool.
Would this be doable with .htaccess too? Or would that apply server-wide rather than just to an individual page?
Thanks everyone. I know now that robots.txt is the best option. I knew about it, but wasn't sure how to apply it. just-4-teens showed me how. Cheers, and thanks. Pete
The code I posted goes within the <head></head> section of the page, not within the robots.txt file.
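To illustrate the placement, a minimal page skeleton would look like this (the title and body content are placeholders):

```
<html>
<head>
  <title>Example page</title>
  <!-- tells crawlers not to index this page or follow its links -->
  <meta name="robots" content="noindex,nofollow">
</head>
<body>
  ...
</body>
</html>
```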