evera is correct - a good many rogue crawlers will simply ignore the robots file. However, assuming the robots file is valid (written correctly), the major bots will follow it. Based on your question, you may want to give Google's Webmaster Central area a look; Google has a pretty good robots.txt 'tester' as part of its Webmaster tools. Note that some rogue bots will deliberately crawl the very directories you disallow, and many bots cache the robots file for a period of time (which is generally a good thing, though it means changes may not take effect immediately).
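If you want a quick local check before (or alongside) the Google tester, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` name and the example.com URLs are made up for illustration, and this only tells you how a well-behaved parser reads the file, not what any particular bot will actually do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, user_agent, url):
    """Return True if the given user agent may fetch the URL under these rules."""
    return parser.can_fetch(user_agent, url)


rules = build_parser(ROBOTS_TXT)
for agent, url in [
    ("Googlebot", "http://www.example.com/index.html"),
    ("Googlebot", "http://www.example.com/private/a.html"),
    ("ExampleBot", "http://www.example.com/index.html"),
]:
    print(agent, url, is_allowed(rules, agent, url))
```

A bot that ignores robots.txt never runs logic like this at all, which is why the file can only be treated as a request, not as access control.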
Google, Yahoo and Microsoft have agreed to support the Sitemaps protocol. See http://sitemaps.org/ for details on how a reference in robots.txt is enough to tell a search engine where your sitemap is. That said, I've found it better to use each search engine provider's management interface to test your robots.txt file and to sort out general site submission issues. Google: http://www.google.com/webmasters/ ; Yahoo: https://siteexplorer.search.yahoo.com/ ; Microsoft: http://blogs.msdn.com/livesearch/ar...al-and-an-invitation-to-the-private-beta.aspx
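For reference, the sitemap pointer is a single `Sitemap:` line in robots.txt giving the full URL of the sitemap. Below is a rough sketch of what that looks like and how to confirm it parses, again with Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The example.com URL and the `find_sitemaps` helper are placeholders of mine, not anything from the protocol docs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a Sitemap line; the URL is a placeholder.
ROBOTS_WITH_SITEMAP = """\
Sitemap: http://www.example.com/sitemap.xml

User-agent: *
Disallow: /private/
"""


def find_sitemaps(text):
    """Return the sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


sitemaps = find_sitemaps(ROBOTS_WITH_SITEMAP)
print(sitemaps)
```

The `Sitemap:` line is independent of the `User-agent` groups, so it can go anywhere in the file.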
I don't think all search engine bots respect the robots.txt file, so it's better to password protect any directories you don't want crawled.