I recently read that after submitting an XML sitemap to Google, you should remove it from your site to prevent hackers from accessing more pages. I have no idea if this is total BS or completely valid. Can anyone answer definitively?
cgragg, it depends. A sitemap usually lists all the pages on your server, and you might want to keep some files private: backups, password files, admin-area login pages, etc. A hacker can often find a weak point by examining whatever data he can gather about your site, and yes, sometimes a sitemap offers exactly what he needs to investigate further and eventually break in. But that risk isn't specific to XML sitemaps. Bottom line: an XML sitemap is as safe as anything else on your site, as long as you make sure private or sensitive URLs never appear in it.
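For reference, a minimal "safe" XML sitemap looks something like this (example.com and the page paths are just placeholders) and lists only public, indexable pages:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://example.com/</loc>
      </url>
      <url>
        <loc>http://example.com/products.html</loc>
      </url>
    </urlset>

Nothing in there that a hacker couldn't find by crawling the public site anyway.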
Hmm, I disagree. Say you have a control panel or admin section and you list it in your sitemap; hackers who are targeting you then have more information and more places to look. I would do the following: 1) Exclude folders and files of a sensitive nature from your sitemap. 2) Create a robots.txt file listing the files or folders robots should be denied access to. It won't stop everyone, but it's a good idea to hide things you don't want public; otherwise you're just adding fuel to the fire.
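A minimal robots.txt sketch along those lines (the folder names are just examples):

    User-agent: *
    Disallow: /admin/
    Disallow: /backups/
    Disallow: /private/

Keep in mind this file is only advisory: well-behaved crawlers honor it, but it doesn't actually block anyone.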
You can't delete your sitemap file once you've submitted it to Google; if you do, you'll get errors when you log into your account and lose the ability to view some statistics. Your sitemap should only contain pages you want found in search engines anyway, so technically any "hacker" could find every page in your sitemap.xml with a "site:domain.com" search. I say technically because not all of your sitemap pages may actually get indexed. You could always rename sitemap.xml to anything you like and skip the robots.txt auto-discovery, but it's really not necessary, for the reasons above.
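Auto-discovery just means a Sitemap: line in robots.txt that points crawlers at the file; leave that line out and a renamed sitemap won't be found automatically, so you'd submit its URL directly in your Google account instead. It looks like this (the renamed filename is hypothetical):

    Sitemap: http://example.com/my-renamed-map.xml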
You don't need to delete it; just submit a new one to Google and it will overwrite the one you have there. Ping Google with the new map under the same name and the old one will be replaced.
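If I remember right, the ping is just a GET request to Google's sitemap endpoint with your sitemap's URL-encoded address (example.com is a placeholder here):

    http://www.google.com/ping?sitemap=http%3A%2F%2Fexample.com%2Fsitemap.xml

Hit that in a browser or from a script after updating the file and Google will re-fetch the same sitemap.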
I'm checking all my sitemaps to make sure the secure pages aren't listed, and using robots.txt to block the others. Thanks for the responses and advice.
Using a robots.txt file is fine for keeping the good robots out, but how will you keep humans from reading the robots.txt file itself? They can view it just as easily. To a hacker, an entry in the sitemap and an entry in robots.txt are one and the same: he will check both.
Remember that security by obscurity is no security at all. Always use .htaccess files or other access control mechanisms to restrict access to sensitive content. Even better, only allow access to sensitive content from certain IP addresses. If you're using Apache, this can be easily configured.
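A quick .htaccess sketch of that IP-based restriction (Apache 2.2 style; the IP address is a placeholder):

    # .htaccess in the directory you want to lock down
    Order deny,allow
    Deny from all
    Allow from 203.0.113.42

On Apache 2.4 you'd use "Require ip 203.0.113.42" instead. Unlike a robots.txt entry, this actually refuses the request rather than just asking crawlers to stay away.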