I thought you said that if it works for the robots I should be OK. Navigation on my site is simple from a human perspective, so I don't think I really need a sitemap other than to help Google. However, it only seems to have indexed 2 of my pages. Thanks for the reply.
Sitemaps are used as a "guide" for search engine crawlers; they don't mean all your pages will get indexed. A sitemap is simply a source the crawlers can use to find a list of all your valid web pages.
If I don't list all my pages in a sitemap, do those pages still get crawled? I use a PHP news script, so it isn't feasible to list all the news items. But I can list all my categories, etc.?
shawnden and skaterkee: as ssandecki said, a sitemap helps the Google spider find all the pages on your web site, especially those which may not be linked - accidentally or deliberately - within your site. However, a sitemap does NOT guarantee that all the pages listed in it will be indexed. So by putting up a complete sitemap, you are "helping" your site get completely indexed by Google (Yahoo and MS Live also use the sitemap). K
Yes, you don't need to list all the URLs. Google will still find them while crawling your site, unless you use robots.txt to block those specific URLs from crawling/indexing.
Hello... we're trying to locate a decent sitemap generator, but have had no luck after demoing about a dozen. You had listed the tool from AuditMySite.com, which seemed to do the best job of crawling; however, my browser window closes itself when the tool is finished. Also, when I attempted to use this tool a few months back I was able to export a sitemap.xml, but the resulting file had obscure XML tags. According to Google, the correct format uses the <urlset>, <loc>, etc. tags; this particular file used a different notation altogether. Anyway, we've tried installing tools on our server, but things just don't "cooperate". We also tried the CoffeeCup software; this seemed to work well, except it only indexed 414 pages out of the thousands on our site. I know it isn't imperative to have every page included, but that number just seemed rather low to me. At this point I'm getting frustrated trying to find something that actually works. Thanks!
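For reference, a minimal sitemap.xml in the <urlset>/<loc> format Google expects looks roughly like this (the URL and dates are just placeholders, and only <loc> is actually required per entry):
Code (markup):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2008-01-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
If the file a generator produces doesn't look broadly like that, Google's sitemap submission will usually reject it or report errors.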
Thanks for the info. I used freesitemapgenerator.com; it took a couple of days for them to crawl my site, but it worked great.
Glad to hear it. I'm going to go through all the pages in this thread in a few days, update the list of sitemap generating websites, and add more content to the FAQ.
Great FAQ. I have been using the www.auditmypc one for ages now, and IMHO it's the best there is online. Fast, reliable, and great with big sites...
All, Happy to report that the Google sitemap problem (mainly with 1-URL maps) appears to finally have been fixed. K
Hello All, I have submitted the sitemap "garment.fibre2fashion.com/garment.xml". It contains 15,000 URLs, but Webmaster Tools shows me that only 5,000 URLs are indexed. Please help me resolve this error. Thanks in advance, Dipali
It's recommended to only use 5,000 URLs per sitemap. I actually have multiple sitemaps for my website; I split them into files of 2,500 URLs each.
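If you do split your sitemap into several files, you can tie them together with a sitemap index file and submit just that one file in Webmaster Tools. A rough sketch, with placeholder URLs and file names:
Code (markup):
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml</loc>
  </sitemap>
</sitemapindex>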
All this discussion is great for explaining how to create and use a sitemap (I have used one for a long time now). However... If all your pages are reachable starting from your home page, is there actually any point? In other words, unless you have pages that a crawler wouldn't otherwise find, what difference does a sitemap make?
This is a great point. Usually, after a website has been crawled and indexed for a good period and has a good number of backlinks, a sitemap is really not needed, but new up-and-coming sites should always use one.
PaddyL, you are correct as far as your comment goes, but by adding a sitemap you shorten the time it takes Google to find all of your URLs. I use XML sitemaps with ~35,000 URLs in them and use Webmaster Tools to submit them. Within weeks, most of the URLs have been indexed.
Two questions: A) I created an XML sitemap and successfully submitted it to Google. Should I still create a sitemap link on my site to the XML page? B) I plan on using a PHP jump script as I add more products and content to my site. How can I use a robots.txt file so Google won't index the folder containing the PHP script?
A) You can either create a link on your home page or, more usefully, submit directly to Google: http://www.google.com/webmasters/
B) Your robots.txt should be in your website's root directory. The lines you want are
Code (markup):
User-agent: *
Disallow: /php-scripts/
where "php-scripts" would be the path to the scripts. Find more information at http://www.robotstxt.org/