My site is quite large, with 900-plus folders and almost 1 million pages. I've used GSiteCrawler to crawl each of the 900-plus folders individually and have created a sitemap for each folder. Now I need to combine all these 900 sitemaps into a single master sitemap file, which I will point to from the robots.txt file. Any ideas on how to accomplish this?
sounds easy enough, just put a sitemap index in your root directory. the sitemaps.org protocol allows up to 50,000 sitemaps per index file, so with 900 you're ok. anyway, you can add multiple sitemap indices if you ever go over the limit. instructions are at google. https://www.google.com/webmasters/tools/docs/en/protocol.html
you can add a line of the form Sitemap: your-sitemap-url to your robots.txt file. search engines will find your sitemap when they visit your site
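In case it helps, here is a rough Python sketch of adding that line without duplicating it if it is already there. The function name `add_sitemap_directive` and the example.com URL are made up for illustration; only the `Sitemap:` directive itself comes from the sitemaps.org protocol.

```python
def add_sitemap_directive(robots_text, sitemap_url):
    """Return robots.txt content with a 'Sitemap: <url>' line appended.

    If the same directive is already present, the text is returned unchanged.
    """
    for line in robots_text.splitlines():
        # Split "Sitemap: http://..." at the first colon only,
        # so the colon inside the URL is left alone.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_text
    # Make sure the new directive starts on its own line.
    if robots_text and not robots_text.endswith("\n"):
        robots_text += "\n"
    return robots_text + f"Sitemap: {sitemap_url}\n"


# Hypothetical usage: the URL below is a placeholder, not a real sitemap.
updated = add_sitemap_directive(
    "User-agent: *\nDisallow:",
    "http://www.example.com/sitemap_index.xml",
)
```

The sitemap URL should be the full absolute URL, and the line can go anywhere in robots.txt since it is not tied to a particular User-agent block.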
Thanks for the replies, but I still have one question unanswered: how do I create a sitemap index manually? I have the respective gzipped sitemaps in all the child folders (child sitemaps). Now how do I create a sitemap index for them? Do I manually put in their URLs? Is there any other easy way out?
well in my case i wrote a script which was easy thanks to my mad perl skillz but if you're not up to writing scripts i guess you could get your spouse or miscellaneous dependent relative to do it one by one. no seriously you definitely need to write a script... it would be very short because we're just picking up filenames and last modified dates. what flava you want to use depends on your operating system i guess. if it's unix you could just grep or ls from the command line.
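For anyone who wants a starting point, here is a minimal Python sketch of that kind of script (not the Perl one mentioned above). It assumes every child folder holds a file called `sitemap.xml.gz` and that the folder layout on disk mirrors the URL layout; the function name `build_sitemap_index`, the file name, and the paths in the usage note are all assumptions you would adjust.

```python
import os
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def build_sitemap_index(site_root, base_url, sitemap_name="sitemap.xml.gz"):
    """Return sitemap index XML listing each sitemap file found under site_root."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(site_root):
        dirnames.sort()  # visit folders in a stable, alphabetical order
        if sitemap_name in filenames:
            path = os.path.join(dirpath, sitemap_name)
            # Turn the on-disk path into a URL path relative to the site root.
            rel = os.path.relpath(path, site_root).replace(os.sep, "/")
            # Use the file's last-modified time (UTC) as <lastmod>.
            mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
            entries.append((
                base_url.rstrip("/") + "/" + rel,
                mtime.strftime("%Y-%m-%dT%H:%M:%S+00:00"),
            ))

    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for loc, lastmod in entries:
        lines.append("  <sitemap>")
        lines.append(f"    <loc>{escape(loc)}</loc>")
        lines.append(f"    <lastmod>{lastmod}</lastmod>")
        lines.append("  </sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines) + "\n"
```

You would call it with something like `build_sitemap_index("/var/www/html", "http://www.example.com")` (both values are placeholders) and write the result to a file in the root directory, for example `sitemap_index.xml`. Give the index a different name from the child sitemaps so it does not list itself.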
thanks much Mono - but I managed it with a demo program, "rapid sitemap", from downloads.com. Phew! So I'm done with it now. Problem sorted. Thanks much, friends.
no kidding. they have software for everything these days. congrats, hope the sitemap works and improves your indexing. keep us posted.
Here's the documentation of what I did (I made a post on it): http://www.dailydoseofinternet.com/2007/06/how-to-build-sitemap-for-blogger-and.html Hope it helps you.