I run a free forum hosting service, so I need them to index everything, since all the forums change daily. Right now my XML files just point them to the main index of each forum, but I would like to do more than that. With a limit of 50,000 links and close to 40,000 forums, you can see that I will need either: a) them to raise the 50,000 limit, or b) to submit multiple XML files.
Yeah, it's called a sitemap index file. See Google's Sitemaps page for complete details. It keeps track of all your various 50,000-link files.
I saw one post in the discussion group that said the limit was 5,000 URLs for any given sitemap file. They suggested adding multiple sitemap files for different root directories. So maybe a sitemap per forum.
Q: How big can my Sitemap be? Search engines will not process Sitemaps larger than 10MB (10,485,760 bytes) in length when uncompressed or that contain more than 50,000 URLs. This means that if your site contains more than 50,000 URLs or your Sitemap is bigger than 10MB, you must create multiple Sitemap files and use a Sitemap index file. You should use a Sitemap index file even if you have a small site but plan on growing beyond 50,000 URLs or a filesize of 10MB.
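To make that concrete, a sitemap index file is just a small XML file that lists your individual sitemap files, something like the sketch below (the host name and file names are made-up placeholders, not anything from your site):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- each <sitemap> entry points at one file of up to 50,000 URLs -->
  <sitemap>
    <loc>http://www.example.com/sitemap-1.xml.gz</loc>
    <lastmod>2006-01-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap-2.xml.gz</loc>
    <lastmod>2006-01-01</lastmod>
  </sitemap>
</sitemapindex>
```

You submit just the index file, and the crawler fetches each listed sitemap from it.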
I had the same problem: my sitemap froze. I will ask on the phpbbstyles forum what to change in the code to generate them for multiple categories.
Hmm... I went here: https://www.google.com/webmasters/sitemaps/docs/en/faq.html and didn't see that answer. I haven't used it yet, though, so I don't really know what I'm talking about; just thought I'd try to throw in whatever help I could. Looking again, I did see this on that page, under the question "What is the simplest sitemap I can submit?": "Each sitemap file must have no more than 50,000 URLs." But anyway, back to your original question: why is a sitemap index file not the solution to your problem? It seems like it's exactly what you are looking for. Yes, it requires multiple sitemap files, but that seems like a decent solution to me.
I got everything working from cron and some scripts; one site has about six .gz files, which covers close to 300k links.
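For anyone wanting to do the same, here's a rough sketch of that kind of script (the `write_sitemaps` helper and all file/URL names are my own invention, not anything from the poster's setup): it splits a big URL list into gzipped sitemap files of at most 50,000 URLs each, then writes a sitemap index pointing at them.

```python
import gzip
from pathlib import Path

MAX_URLS = 50000  # per-file limit from the Sitemaps protocol


def write_sitemaps(urls, out_dir, base_url):
    """Split `urls` into gzipped sitemap files plus a sitemap index.

    `base_url` is where the generated files will be served from
    (hypothetical; adjust for your host). Returns the sitemap file names.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = []
    for i in range(0, len(urls), MAX_URLS):
        chunk = urls[i:i + MAX_URLS]
        name = f"sitemap-{i // MAX_URLS + 1}.xml.gz"
        body = ['<?xml version="1.0" encoding="UTF-8"?>',
                '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
        body += [f"  <url><loc>{u}</loc></url>" for u in chunk]
        body.append("</urlset>")
        # write each chunk as a gzip-compressed sitemap file
        with gzip.open(out / name, "wt", encoding="UTF-8") as f:
            f.write("\n".join(body))
        names.append(name)
    # the index file lists every generated sitemap; submit only this file
    index = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    index += [f"  <sitemap><loc>{base_url}/{n}</loc></sitemap>" for n in names]
    index.append("</sitemapindex>")
    (out / "sitemap_index.xml").write_text("\n".join(index), encoding="UTF-8")
    return names
```

Run it from cron and point Google at the generated `sitemap_index.xml`; 300k links would come out as six files this way.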