It tells the search engine spiders what they can and can't look at on your site. Just upload a blank text file (made with Notepad or any plain-text editor) named robots.txt to your root directory and the errors will disappear.
When a robot crawls your site, it looks for the robots.txt file. If it doesn't find one, it assumes it may crawl and index the entire site. Not having a robots.txt file can also create unnecessary 404 errors in your server logs, making it harder to track "real" 404 errors. Assuming you want your entire site indexed and only want to stop those unnecessary 404 errors, you have a couple of options:

1. Upload a blank robots.txt file to the root directory of your domain.
2. Upload a simple robots.txt file to the root directory of your domain.

I have an article on my site that covers the basics of how to create a robots.txt file that may help you get started.

Cricket
Oops! Sorry! I didn't realize you had already answered this. Our posts must have crossed paths.

Cricket
No probs, Cricket. There are lots of reasons you might want to keep spiders away, but the biggest are to keep them away from sensitive data or pages you don't want in a search engine's index, and to give the spider more direction so that it only reads the pages you want listed.
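As a rough sketch of that first case, you could disallow whatever folders hold the sensitive stuff (the folder names below are hypothetical; substitute your own):

# Keep all well-behaved spiders out of these areas
# (/admin/, /cgi-bin/ and /private/ are just example folder names)
User-agent: *
Disallow: /admin/
Disallow: /cgi-bin/
Disallow: /private/

Bear in mind robots.txt is publicly readable and only polite spiders obey it, so it's not a substitute for real security on truly sensitive data.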
Yes. To prevent it spidering your images, scripts, stats, mail, etc. By the way, a basic "spider everything" robots.txt file would look like this:

User-agent: *
Disallow:

Save it as plain text (ASCII/ANSI) and upload it to the ROOT of your site. This translates to "all spiders, please crawl everything (disallow nothing)".
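For completeness, the opposite, telling every well-behaved spider to stay out of the whole site, is just a single slash in the Disallow line:

# Block all spiders from the entire site
User-agent: *
Disallow: /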
Hi guys, what's the command to stop spiders and robots from crawling image files? They eat up bandwidth. Thanks
Under:

User-agent: *

add this line:

Disallow: /images/

substituting the name of your images folder for "images". If your images aren't in a separate folder, add lines like this instead:

Disallow: /image1.gif
Disallow: /image2.gif
Disallow: /image3.jpg
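As a side note, some major crawlers (Googlebot, for instance) also understand wildcard patterns for this, though they're an extension rather than part of the original robots.txt standard, so not every spider will honor them:

# Wildcard syntax understood by Googlebot and some other major crawlers
# (* matches any sequence of characters, $ anchors the end of the URL)
User-agent: Googlebot
Disallow: /*.gif$
Disallow: /*.jpg$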
No. The robots.txt file is a "limiter" for spiders -- it tells them what parts of your site you do not want them to crawl/index. There is nothing you can put in a robots.txt file to increase search engine ranking.