robots.txt is nothing but a plain text file that contains instructions for robots when they visit your site. For example, we can tell robots not to crawl a particular page.
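A minimal sketch of what that looks like (the page path here is just a placeholder, not a real page):

User-agent: *
Disallow: /private-page.html

The first line says the rule applies to all robots, and the second tells them to stay away from that one page.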
The robots.txt file is one of the most important parts of a website: it instructs crawlers what they should visit on the site and what they should ignore.
Most people think robots.txt is only for telling spiders what to access and what not to access. However, there are more things you can do with it:

1. You can point spiders to your sitemaps.
2. You can give them a request rate (e.g. 1/20 means 1 page every 20 seconds).
3. You can tell them between which hours you want them to visit.
4. You can tell them how long to wait before requesting the next page.

If you have a popular website, using these directives in robots.txt can help you save resources, keep the site running faster, and schedule the spiders to crawl your site at a time when it usually has a low number of visitors (night time); see the sketch after this list. But in the end, there are more spiders that ignore robots.txt instructions than there are that obey them.
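Putting those four options together, a robots.txt might look like this. Keep in mind this is just a sketch: Request-rate and Visit-time are non-standard extensions that only some crawlers (such as Yandex) have honored, Crawl-delay is ignored by Google, and the sitemap URL is a placeholder:

User-agent: *
Crawl-delay: 10
# Non-standard: 1 page every 20 seconds
Request-rate: 1/20
# Non-standard: only crawl between 01:00 and 05:00 UTC
Visit-time: 0100-0500
Sitemap: http://www.example.com/sitemap.xml

Since support varies so much, treat these as polite hints to well-behaved crawlers rather than enforced limits.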
Also, I forgot: depending on your server settings, having a robots.txt will actually save you bandwidth. Most crawlers will request robots.txt, whether they are good or evil ones. If you don't have a robots.txt file, they get your 404 page delivered instead; they then know there is no robots.txt and start crawling your site anyway. However, 404 pages are most of the time bigger than a robots.txt, so every time a crawler requests your missing robots.txt you are losing bandwidth. Even if you simply create a file that allows all spiders to crawl everything, you will save bandwidth:

User-agent: *
Disallow:

The two lines above allow all crawlers to crawl everything.
Just did a little test of the above. My 404 page: 3.27 KB; my robots.txt: 0.02 KB. So you can see the savings in terms of bandwidth.