Google Sitemaps: .html or .xml, which one is better? Does every search engine accept XML sitemaps? I've heard Yahoo doesn't accept XML sitemaps. Is that true?
HTML sitemaps are fine for all search bots, whereas XML isn't compatible with all of them. If the site isn't big, an HTML sitemap should do the trick.
Googlebot will still crawl an HTML sitemap. The only real purpose a sitemap has (as far as search engines are concerned) is making the crawl of your site easier and faster for the bots. As far as I remember, GSiteCrawler creates a number of different sitemaps for Google, Yahoo, etc.
Different sitemaps have different purposes, or targets. Google uses its own XML format that no other search engine seems to understand right now, Yahoo accepts a plain txt file of URLs, all engines accept XML in the RSS format, and reportedly they all understand XML in the ROR format. One thing is for certain... ALL search engines understand the HTML format.
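For reference, here is roughly what the two machine-readable formats being discussed look like; the URLs and dates are placeholders, and the namespace is the 0.84 one from Google's own docs:

Code (markup):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2006-01-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>

The txt file Yahoo accepts (commonly named urllist.txt, if I remember right) is even simpler - just one URL per line:

Code (markup):
http://www.example.com/
http://www.example.com/page1.html
http://www.example.com/page2.html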
If you have a highly dynamic site where the content changes a few times a day, use XML for Google, and generate the file programmatically with a script or tool. The changefreq hints let Googlebot know how often the site should be re-crawled. If the site is static or the content changes infrequently, use HTML and txt. If the bots visit often, that means you are giving away a lot of your bandwidth.
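As a rough sketch of generating the file with a script (the URL list, change frequencies, and priorities here are made-up examples, not anything Google prescribes - on a real dynamic site you would pull them from your database):

Code (markup):
# Minimal sitemap generator sketch. The urls list is a placeholder.
urls = [
    ("http://www.example.com/", "daily", "1.0"),
    ("http://www.example.com/news.html", "hourly", "0.8"),
    ("http://www.example.com/about.html", "monthly", "0.5"),
]

f = open("sitemap.xml", "w")
f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
f.write('<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">\n')
for loc, freq, prio in urls:
    f.write("  <url>\n")
    f.write("    <loc>%s</loc>\n" % loc)
    f.write("    <changefreq>%s</changefreq>\n" % freq)
    f.write("    <priority>%s</priority>\n" % prio)
    f.write("  </url>\n")
f.write("</urlset>\n")
f.close()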
Bots visiting a lot means nothing by itself. You want Googlebot and the others to visit at least once a day, or once every two days, and what is valuable is that they crawl all the important pages - that is why we have sitemaps, to guide the bot properly. Mind the cost, though: at 50 KB per page and 1,000 pages, one full crawl is almost 50 MB, so every time a bot crawls all 1,000 pages you give away 50 MB. If the same bot visits two or three times a day, multiply that by the number of bots and the number of days per month and that is your total bandwidth spent just on bots!
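To put numbers on that (the page size, page count, and crawl rates below are assumed figures, not measurements):

Code (markup):
page_kb = 50       # average page size in KB (assumption)
pages   = 1000     # pages in one full crawl (assumption)
crawls  = 3        # full crawls per bot per day (assumption)
bots    = 2        # number of bots hitting the site (assumption)
days    = 30       # days per month

crawl_mb = page_kb * pages / 1024.0         # one full crawl: ~48.8 MB
month_mb = crawl_mb * crawls * bots * days  # ~8789 MB, about 8.6 GB/month
print("Per crawl: %.1f MB, per month: %.1f MB" % (crawl_mb, month_mb))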
I think so, and some big sites use them as well. Check this out: http://spiderbites.about.com/sitemap.htm