robots.txt files are used to tell search engine spiders to ignore specified files or directories when crawling... One good example is when the content of certain directories might be misleading or irrelevant to how the site is categorized as a whole; in that case you can indicate in the robots.txt file that this part of your site must be excluded from crawling...
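As a minimal sketch of that idea (the directory names here are just placeholders, not from any real site), a robots.txt that excludes a couple of directories from crawling might look like:

```
# Hypothetical example: ask all crawlers to skip two directories
User-agent: *
Disallow: /tmp/
Disallow: /drafts/
```

The `User-agent: *` line applies the rules to all compliant crawlers; each `Disallow` line names a path prefix they should not fetch.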
They should, really. You can also point to your sitemap from it. Very handy to have. Go to http://www.robotstxt.org/ for more information on robots.txt.
It's very hard to imagine any site that wants "all" its web pages indexed in Google, but some people just don't know better. You're better off at least using the robots.txt file to point to your sitemap, since some search engines, like Ask.com, only use this method to locate it. Below is a simple robots.txt entry you can use:

```
User-agent: *
Disallow:
Sitemap: <full www path to sitemap>
```
Not only that, but not having a robots.txt file will congest and pollute your server's error logs pretty darn quick (PDQ) with 404s. So if you don't want to wade through false positives like "no robots.txt file" or "no favicon.ico file", put them in there and be done with it, even if you don't use them.
Over the years I have used a pretty simple robots.txt. As time progresses, I find myself adding to it on occasion to disallow rogue robots.
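A sketch of what such an addition looks like (the bot name here is made up for illustration):

```
# Hypothetical example: block a specific misbehaving crawler entirely
User-agent: BadBot
Disallow: /
```

Worth noting, though, that robots.txt is purely advisory: well-behaved crawlers honor it, but truly rogue bots often ignore it, so server-level blocking may be needed for the worst offenders.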