I have a website with a huge SQL database, and I tried to generate a sitemap for it with GSiteCrawler. It took about a day to generate, but afterwards I ended up with 4 sitemap files plus 1 sitemap-index file. Should I upload all of these to my server directory, or have I done something wrong? What do you think is the reason for this? Shouldn't I get only one sitemap file and upload that one?
Use the .xml file for Google and the .txt file for Yahoo; the other two are archives... you don't need them for this.
Well, I have 5 XML files: four are sitemap1, sitemap2, ..., and one is sitemap-index. sitemap1-4 are 7 MB each, the sitemap index is 1 KB, and there are also .gz archive files that zip up sitemap1-4.
You can read this article on why that happens (XML sitemaps section). Basically, the XML sitemap has been split into multiple files. You should upload them all.
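For anyone wondering what the index file actually contains: per the sitemaps.org protocol, it's a small XML file that just lists the locations of the individual sitemap files, roughly like this (the domain and file names below are placeholders; each `<loc>` may point at either the plain .xml files or their .gz archives, since search engines accept both):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <!-- one <sitemap> entry per split file -->
    <loc>http://www.example.com/sitemap1.xml.gz</loc>
    <lastmod>2006-10-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml.gz</loc>
  </sitemap>
</sitemapindex>
```

The split happens because the protocol caps each individual sitemap file, so large sites get several sitemaps plus one index tying them together.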
Wow, 5 XML files... I "only" have two, 9 MB and 4 MB. However, these files don't seem to be linked in their code... Do I have to submit them separately to Google? What are gss.xsl and sitemap-index.xml for? The last one seems to link to the two tar.gz archives but NOT to the two actual sitemaps. Oops, EVERA, I think you answered my question... but surely you mean sitemap-index.xml (without the GZ), and there's no need to link the other files?
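On the "do I have to submit them separately" point: one common approach is to upload everything and then reference only the index file from robots.txt, letting crawlers discover the individual sitemaps through it (the path below is a placeholder for your actual file):

```
Sitemap: http://www.example.com/sitemap-index.xml
```

The same single index URL is what you'd submit in Google's webmaster interface, rather than each split sitemap on its own.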
I had this same question too. Because of the link directory within my site, GSiteCrawler goes on for days trying to create a sitemap. Right now there are five submitted to Google, each with 40,001 links and about 9 MB each. Thanks for the helpful answers.
Yeaahhaaa! Just wanted to let you know how it's going on my server. GSiteCrawler has been busy for 9 days straight and produced around 11 zips, each about 9 MB --> over 380,000 links in the sitemaps... great tool. Can it go faster? What can I do to optimize it for quicker spidering?
You can try checking the website and/or asking the GSiteCrawler author how to improve speed. Some obvious factors are the computer, webserver, bandwidth, etc.
I tweaked some settings for the time-out seconds, number of spiders, and logging. It goes faster than lightspeed as we speak. Status: * 10 days * +/- 693,000 links in the sitemaps * over 1,000,000 pages spidered.