I've got some pretty mental sitemap stuff going on: five million URLs split into multiple sitemaps, a sitemap index, and automated weekly updates followed by automated notification of the update to Google, Bing, Ask and Moreover. Who else would you ping or notify? Yes, it's a lot of URLs! We've got nearly five million legal downloads on the site, so it's justified and not just a spam factory of gazillions of pages. When I google site:myurl it returns just over half a million pages most of the time. Whilst that's a huge figure, it's nowhere near the number of pages I've submitted. The maps have been in place about three months now. In other folks' experience, how long has it taken Google to fully ingest and index the content of your sitemaps? Any tips? I know maps alone don't bring results and there are other factors, but what can I do to make the indexing of my content faster? Or am I being impatient, and is it going to take even a supercomputer quite some time to read five million pages? Please don't suggest an off-site deep link campaign; it'd take 100 links x 5 million songs and that's beyond madness! Thanks
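For anyone wanting to reproduce the kind of setup described above, here is a minimal sketch of splitting a large URL list into several sitemap files plus a sitemap index. It assumes the sitemaps.org limit of 50,000 URLs per sitemap file; the file naming (sitemap-1.xml and so on), the base_url parameter and the function names are made up for illustration and are not taken from the poster's site.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit in the sitemaps.org protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return the XML text of one <urlset> sitemap file."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{NS}">\n{entries}</urlset>\n'
    )


def build_index(sitemap_urls):
    """Return the XML text of a <sitemapindex> listing every sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{NS}">\n{entries}</sitemapindex>\n'
    )


def build_all(urls, base_url):
    """Return ({file name: sitemap XML}, index XML) for the whole URL list."""
    files = {}
    for i, part in enumerate(chunk(urls), start=1):
        files[f"sitemap-{i}.xml"] = build_sitemap(part)  # hypothetical naming
    index = build_index([f"{base_url}/{name}" for name in files])
    return files, index
```

With roughly five million URLs this produces about 100 sitemap files and one index that points at all of them; only the index needs to be submitted to the search engines.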
site:myurl doesn't return all the pages indexed by Google. The only way to check is in google.com/webmasters, under Sitemaps. It shows how many URLs are in the sitemap and how many of them are indexed by Google. You can set a custom crawl rate; increase it if your server can handle the extra load, but remember that if Google starts crawling your site more, then other search engines will follow suit, especially the Chinese bots.
Well, GWMT says: Sitemap stats: Total URLs: 5,255,283; Indexed URLs: 170,505. The site: query returns 503,000. Too much variation there! As for Chinese bots, they shouldn't be an issue, as I've got over 2,000 lines of rules in the firewall banning all Chinese traffic. And I'm not changing the crawl rate, as it says not to unless the server is having issues, and that's far from the case. Thanks
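To put the variation in those figures into proportion, here is a small sketch that turns them into coverage percentages. The three numbers are the ones quoted in the post above; the variable and function names are only illustrative.

```python
# Figures quoted in the post above.
total_urls = 5_255_283    # URLs submitted in the sitemaps
indexed_urls = 170_505    # "Indexed URLs" reported by GWMT
site_query = 503_000      # approximate result count of the site: query


def coverage(found, total):
    """Return `found` as a percentage of `total`."""
    return 100.0 * found / total


print(f"GWMT indexed:  {coverage(indexed_urls, total_urls):.1f}% of submitted URLs")
print(f"site: query:   {coverage(site_query, total_urls):.1f}% of submitted URLs")
```

By either measure, well under a tenth of the submitted URLs show up as indexed after three months.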
I find that it is very hard to get all your pages indexed, even if you have a website with only a few hundred pages. Does everyone agree on that?
Yes, Google specifically says they don't guarantee all your pages will be indexed. Get more backlinks and this will help.
Get more backlinks. Get deep backlinks. Don't worry about your sitemap. Is your site's navigation set up correctly? Can you reach all 5 million pages through your breadcrumbs, or from some page on your site that links to the major sections? Sitemaps aren't the be-all and end-all cure for indexed pages. If your site isn't fully indexed, it needs more backlinks or better page navigation.
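As a rough way to think about the navigation question, the sketch below estimates how many clicks from the home page a crawler would need to reach every item if the site were a strict tree of category pages. The figure of 100 links per page is purely an assumption for illustration, not something known about the site in question, and the function name is made up.

```python
def min_click_depth(total_pages, links_per_page):
    """Smallest depth d such that links_per_page ** d >= total_pages.

    Assumes a strict tree: every page links to `links_per_page` children
    and all content pages sit at the deepest level.
    """
    if links_per_page < 2:
        raise ValueError("links_per_page must be at least 2")
    depth, reachable = 0, 1
    while reachable < total_pages:
        reachable *= links_per_page  # pages available one level further down
        depth += 1
    return depth


# Assumed: about five million item pages, 100 links on each navigation page.
print(min_click_depth(5_000_000, 100))
```

Under these assumptions every item page is at least four clicks from the home page, which is why links pointing at the main sections matter more than links to each individual item.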
I asked for suggestions other than deep links, as that would totally take the p with the number of items I've got for sale. As for what you say about navigation, there's always room for improvement, as there is on pretty much any site.
Links are your only answer. You need more of them if you want the full site indexed. It's not our fault you don't want to build a few links. PS: you don't have to link to all 5 million waraz pages, but the main sections would be good.