Hello, I have submitted a valid sitemap in Webmaster Tools that lists 5000 pages, but I see that only 111 have been indexed. Does anyone know why not all 5k are in?
Google does not guarantee that it will index all URLs in a sitemap. More details can be found in the Help section of Google Webmaster Tools. See: http://www.google.com/support/webmasters/bin/answer.py?answer=80488&hl=en (Sorry about the text link, but I am not allowed to post live links yet).
Even after as long as a year, you may still not see all of your pages indexed. For most websites, Google will not index every page.
Yeah, there's no assurance that Google will index all of those URLs, especially the deep-link pages on your site.
Could you be more specific? Do you mean that Google has found some errors in your sitemap? Can you post an example of an error you have received in the sitemap errors and warnings section of Google Webmaster Tools? It could be that the server is under very heavy load and cannot process the request fast enough, or it could be down when Google tries to fetch and process your sitemap. Without more information it is pretty hard to say what is going on.
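While waiting for the error details, a quick local sanity check of the sitemap file can rule out the simplest causes. The following is only a minimal sketch, not how Google validates sitemaps: it assumes a plain `<urlset>` sitemap (not a sitemap index file) in the sitemaps.org namespace, and the function name `check_sitemap` and the sample data are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS_PER_FILE = 50000  # per-file URL limit in the sitemaps.org protocol


def check_sitemap(xml_text):
    """Return (url_count, problems) for a sitemap given as an XML string."""
    problems = []
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # A file that does not parse cannot be processed at all.
        return 0, ["not well-formed XML: %s" % exc]

    if root.tag != SITEMAP_NS + "urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")

    # Collect the text of every <loc> element.
    locs = [(loc.text or "").strip() for loc in root.iter(SITEMAP_NS + "loc")]
    for loc in locs:
        if not loc.startswith(("http://", "https://")):
            problems.append("not an absolute URL: %r" % loc)
    if len(locs) > MAX_URLS_PER_FILE:
        problems.append("more than %d URLs in one file" % MAX_URLS_PER_FILE)
    return len(locs), problems


# Hypothetical sample data, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>/relative/page.html</loc></url>
</urlset>"""

if __name__ == "__main__":
    count, problems = check_sitemap(SAMPLE)
    print(count, problems)
```

If the count matches the 5000 pages you expect and no problems are listed, the file itself is probably fine, and the gap is more likely down to Google choosing not to index every URL.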
A sitemap is simply a list of your pages in a single-page layout. Visitors to your site can open the sitemap to find any pages they are interested in viewing. Usually the sitemap is there to benefit the people who visit your site by giving them a quick and easy way to navigate it. A sitemap benefits the search engines as well: it lets the Google robot see how pages, such as those on the fourth and fifth level, fit into your site. If you link to your sitemap page from your homepage, none of the pages listed in your sitemap will be farther from your home page than the third level. This will encourage Google to index your entire site.
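To make the "levels" argument concrete, here is a small sketch that counts how many levels each page sits from the home page, with and without a sitemap page linked from home. The site structure, page names, and the function `click_levels` are all hypothetical, and the home page is assumed to count as level 1.

```python
from collections import deque


def click_levels(links, home):
    """Breadth-first search from the home page.

    Returns {page: level}, counting the home page as level 1.
    """
    levels = {home: 1}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in levels:
                # First time we reach this page: one level below its parent.
                levels[target] = levels[page] + 1
                queue.append(target)
    return levels


# Hypothetical site: a chain of pages, each linking only to the next one.
site = {
    "home": ["a"],
    "a": ["b"],
    "b": ["c"],
    "c": ["d"],
}

# Same site with a sitemap page linked from home and listing every page.
with_sitemap = dict(site)
with_sitemap["home"] = ["a", "sitemap"]
with_sitemap["sitemap"] = ["a", "b", "c", "d"]

if __name__ == "__main__":
    print(max(click_levels(site, "home").values()))          # deepest level: 5
    print(max(click_levels(with_sitemap, "home").values()))  # deepest level: 3
```

In this toy site the deepest page drops from the fifth level to the third once the sitemap page is linked from home, which is the effect described above. Whether a shallower structure actually leads to more pages being indexed is still up to the search engine.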
Submitting a sitemap does not guarantee thorough indexing by any search engine, especially for dynamically generated pages. If you want all your pages indexed, I would recommend building deep backlinks to your inner pages through channels such as social bookmarking. Have a nice day.