As far as I know, Google Webmaster Tools analyzes your robots.txt and shows a report in your control panel. Google also expects the sitemap to be named sitemap.xml by default (not strictly; other names are accepted, per the three file types defined there). There is no harm in not having a robots.txt; the request just returns a 404, and nothing says that is bad. Some SEO experts even recommend an empty robots.txt, but there is no real consensus on it. If you have a sitemap, I think you should submit it manually to Google and the other engines, and it is also better to have an HTML sitemap page linked properly from your main page.
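For reference, a minimal robots.txt that allows everything and points crawlers at the sitemap could look like this (example.com and the sitemap location are just placeholders for your own site):

    # Allow all crawlers everywhere; the Sitemap line tells them where the XML sitemap lives
    User-agent: *
    Disallow:

    Sitemap: http://www.example.com/sitemap.xml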
http://www.sitemapdoc.com/Default.aspx is the one I use. I'm not affiliated with them in any way. I just like that it shows you errors, and what they are, before you make your map.
I think it's a good idea to create a robots.txt file for your sites. Every search engine looks for your robots.txt file first, at the root of your site, before crawling anything else.
You don't need a robots.txt file. You only need one if you want to hide some files from search engines.
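For example, a robots.txt that hides a couple of directories from compliant crawlers might look like this (the directory names here are just placeholders):

    # Keep compliant crawlers out of these directories; everything else stays crawlable
    User-agent: *
    Disallow: /admin/
    Disallow: /private/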
Hi, I have articles on both sitemaps and robots.txt on my SEO blog if you want to check them out.
You won't automatically rank #1 for everything, but using a robots.txt file is good search engine practice. Good luck!
Can you elaborate on this? Which bots are bad, and why? And will these bad bots even pay any attention to your robots.txt file? Is there any way to redirect or fool the bad bots instead?
Which bots are bad? I just have mine completely open. Do you mean keeping bots out of certain areas of your site, like admin or photo directories? For the most part I didn't think it mattered.
Always create a robots.txt file, even if you are not blocking any spiders or directories/pages on your site. Robots.txt is also very useful for blocking pages with duplicate content if you are unable to insert a noindex tag on them.
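As a sketch, if a site had printer-friendly duplicates under a /print/ path (a hypothetical layout), you could block just those copies instead of adding noindex tags:

    # Block the duplicate printer-friendly copies; the original pages remain indexable
    User-agent: *
    Disallow: /print/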
A sitemap.xml is great for telling search engine bots which pages you would like indexed, while robots.txt is used to instruct bots where they should and should not go. Of course, a deviant bot can read your robots.txt and intentionally navigate to a file or directory you explicitly told it not to visit, for devious reasons.
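To illustrate the compliant-bot side of this, here is a small Python sketch using the standard library's robotparser, which is roughly what a well-behaved crawler does before fetching a page (the URLs and bot name are placeholders; a deviant bot simply skips this check):

    import urllib.robotparser

    # A well-behaved crawler fetches robots.txt and honors it before crawling
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")
    rp.read()  # downloads and parses the file

    # can_fetch() returns False for paths the file disallows for this user-agent
    print(rp.can_fetch("MyBot", "http://www.example.com/admin/"))      # False if /admin/ is disallowed
    print(rp.can_fetch("MyBot", "http://www.example.com/index.html"))  # True if unrestricted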
You can use the checking tool at http://webtools.live2support.com or just Google it; Google will answer your question.
All of those robots.txt files will work. I also blocked Google from viewing my URL without the www. I did this on the server side, so it forwards all of the URLs to the www version. That way Google won't see a duplicate of the site in two different places.
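For anyone wanting to do the same, here is a sketch of the usual approach on an Apache server via .htaccess (assuming mod_rewrite is enabled; swap in your own domain):

    # Permanently (301) redirect any non-www request to the www version
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]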