Sitemap needed for 400,000 URLs How-To

Discussion in 'Google Sitemaps' started by eautocad, Aug 29, 2008.

  1. #1
    Ok, here is the deal. I have about 900 pages in my sitemap right now, but that's just the beginning. I'm estimating that my statistical search engine has about 400,000 urls that it can display. I would like to index all of them.

    How do I do this? Are there any non-server-side programs I can use? Something like a script that I can run from here to generate the information, which I can then enter into the sitemap manually? This would be a great help. I don't want to install anything on my server to do this. I don't see why it can't be done, but I can't find anything to do it.

    Any software recommendations? Related websites? Advice?

    Thanks!:cool:
     
    eautocad, Aug 29, 2008 IP
  2. pubdomainshost.com

    pubdomainshost.com Peon

    #2
    I have been using GSiteCrawler for generating sitemaps for my sites. You may want to check it out (it's a free and excellent tool for generating sitemaps for Yahoo, Google and Live): http://gsitecrawler.com/

    Cheers
    Gs
     
    pubdomainshost.com, Aug 29, 2008 IP
    indyguidedotinfo likes this.
  3. eautocad

    eautocad Peon

    #3
    I went to download.com and tried a bunch; none of them worked. I'll give yours a shot! Thank you!
     
    eautocad, Aug 29, 2008 IP
  4. seohyderabad

    seohyderabad Peon

    #4
    Google allows 50K URLs per sitemap (I saw this 6 months ago when I prepared a sitemap for my article directory, which had 60K+ URLs).

    Break it into 8 or 10 files, and if you have a list of the URLs you can prepare it using Excel too :)
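
    The arithmetic here works out to 400,000 / 50,000 = 8 files at minimum. Below is a minimal Python sketch of the split, assuming the full URL list is already available in memory. The example.com URLs are placeholders, and `chunk_urls` and `urlset_xml` are illustrative names, not part of any tool mentioned in this thread.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file URL limit discussed in this thread


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def urlset_xml(urls):
    """Render one chunk as a sitemap <urlset> document (returned as a string)."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    # escape() turns & < > into XML entities, which sitemap URLs require.
    lines += [f"  <url><loc>{escape(u)}</loc></url>" for u in urls]
    lines.append("</urlset>")
    return "\n".join(lines)


# Placeholder URLs standing in for the real 400,000 (assumption).
urls = [f"http://www.example.com/stats?id={n}&view=full" for n in range(400_000)]
chunks = chunk_urls(urls)
sample = urlset_xml(chunks[0][:2])  # render just two entries as a preview
```

    Each chunk can then be written to its own file, one `urlset_xml(chunk)` per sitemap.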
     
    seohyderabad, Aug 29, 2008 IP
  5. eautocad

    eautocad Peon

    #5
    Good advice, thank you!
     
    eautocad, Aug 29, 2008 IP
  6. websitetools

    websitetools Well-Known Member

    #6
    If you configure A1 Sitemap Generator for large websites, then 400,000 URLs will probably not be a problem (if using a recent version, e.g. 1.7.4). But it can depend on factors outside my control, so you should make sure to test it first. One thing worth mentioning: you can automate the software to, e.g., run for 10 hours at night and then stop, then resume the scan for another 10 hours the next night, and so on. This might be useful to you when scanning such a large website.
     
    websitetools, Aug 29, 2008 IP
  7. gbh1935

    gbh1935 Peon

    #7
    I would think that if your site is database driven, it would be a simpler task to write a little code that generates and updates the sitemap in real time.
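
    A rough Python sketch of that idea, using an in-memory SQLite database so it runs anywhere. The `pages` table, its `slug` and `updated` columns, and the example.com base URL are all assumptions for illustration; the real schema and queries depend entirely on how the site's database is laid out.

```python
import sqlite3
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_from_db(conn, base_url):
    """Build a <urlset> document from rows in a hypothetical `pages` table."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    query = "SELECT slug, updated FROM pages ORDER BY slug"
    for slug, updated in conn.execute(query):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}/{slug}"
        ET.SubElement(url, "lastmod").text = updated  # YYYY-MM-DD string
    return ET.tostring(urlset, encoding="unicode")


# Demo data only; a real site would connect to its existing database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (slug TEXT, updated TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [("report-1", "2008-08-29"), ("report-2", "2008-08-30")],
)
xml_text = sitemap_from_db(conn, "http://www.example.com")
```

    Running something like this on a schedule, or whenever a page is added, keeps the sitemap in step with the database without crawling the site at all.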
     
    gbh1935, Aug 30, 2008 IP
  8. bhaiyaji

    bhaiyaji Banned

    #8
    Site Mapper.info is offering a free tool to create unlimited Google sitemaps. NO LIMITATION!!

    http://sitemapper.info

    This may help you... :D
     
    bhaiyaji, Aug 31, 2008 IP
  9. remember123

    remember123 Peon

    #9
    There are many PHP scripts which would serve your purpose.
     
    remember123, Aug 31, 2008 IP
  10. catanich

    catanich Peon

    #10
    GSiteCrawler is a great tool to use for this.
     
    catanich, Sep 1, 2008 IP
  11. webrickco

    webrickco Active Member

    #11
    400000 links? How much time do you think this will take? I use a standalone sitemap builder downloaded from www.sitemapbuilder.net. It is not very up to date, however; I use it for testing and for performance reasons, since this software is a lot faster than anything online and follows 10 links at a time.

    I tried once to use it for a site that had "only" 50000 links (Google's recommended limit for a sitemap file). It took more than 5 hours to generate, and the file was so big that it was almost impossible to change individual priorities! I would really like to see an online tool capable of dealing with such an amount of data.

    Additionally, given the recommended limits on sitemap files, your sitemap will have to be divided into 8 XML files (50000 records each, with a limit of 10MB per file) and, last but not least, a sitemap index will have to be built! I don't know of any online or standalone generator capable of cutting files into 50000-record subfiles and generating the index automatically. If someone knows of one, please let me know; otherwise, you will have to build it manually.
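
    Building the index by hand is mostly string assembly. A minimal Python sketch, assuming the sub-sitemaps are named `sitemap-1.xml`, `sitemap-2.xml`, and so on under a placeholder example.com base URL (both the naming scheme and the function name `sitemap_index_xml` are hypothetical):

```python
import math
from xml.sax.saxutils import escape


def sitemap_index_xml(base_url, total_urls, per_file=50_000):
    """Return a <sitemapindex> document listing one entry per sub-sitemap."""
    n_files = math.ceil(total_urls / per_file)  # 400,000 / 50,000 -> 8
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for i in range(1, n_files + 1):
        loc = f"{base_url}/sitemap-{i}.xml"  # assumed file naming scheme
        lines.append(f"  <sitemap><loc>{escape(loc)}</loc></sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)


index_xml = sitemap_index_xml("http://www.example.com", 400_000)
```

    Only the index file then needs to be submitted, since it points at every sub-sitemap.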

    Do you really need the search engine to index all your files? Google took 1 month to index 300 articles on my site! How long will it take for 400000?
     
    webrickco, Sep 1, 2008 IP
  12. eautocad

    eautocad Peon

    #12
    Yes, this is it right here... Are there any SQL commands for this?

    Yes, it is a lot of info, and thank you for the replies. The site is pretty big... I need it to be completely indexed, though, so I can get the most traffic. Will being such a huge site get my site a better PageRank?
     
    eautocad, Sep 2, 2008 IP
  13. eautocad

    eautocad Peon

    #13
    An interesting point of information is that Yahoo has indexed 1700 pages and Google has indexed 1050 pages. They were both submitted at the same time! It looks like Yahoo is a bit quicker than Google when it comes to crawling and indexing pages.

    So say this thing has 400k pages, does that make it worth more $$$$?
     
    eautocad, Sep 2, 2008 IP
  14. websitetools

    websitetools Well-Known Member

    #14
    Well, since you ask: A1 Sitemap Generator does this. Put another way: when it generates the XML sitemap files, it will automatically split the output into multiple XML sitemap files if necessary and generate a sitemap index file. (The default "URL limit per sitemap file" is set to 25,000 in A1SG, but it can be raised to the sitemaps protocol maximum of 50,000.)

    It will indeed take some time. But that is not so bad when you can automate software like A1 Sitemap Generator to run only at night, resuming the earlier website scan each day. But yes, 400,000 pages is a lot, and will probably take a couple of days.
     
    websitetools, Sep 2, 2008 IP
    webrickco likes this.
  15. Serghei

    Serghei Well-Known Member

    #15
    GSiteCrawler is the best tool I have ever used.
     
    Serghei, Sep 4, 2008 IP
  16. eautocad

    eautocad Peon

    #16
    I tried to use GSiteCrawler and it didn't work. :(

    Good news: Google has crawled 2500 pages of the 400k. It doubled in one week. :) I've also been getting a lot of single-hit visits that register 0:0:0 and generally cluster around decent visitors who view 4+ pages.

    Is this the Google crawler?
     
    eautocad, Sep 4, 2008 IP
  17. gbh1935

    gbh1935 Peon

    #17
    It will all depend on the layout of your database...
     
    gbh1935, Sep 5, 2008 IP
  18. justinlorder

    justinlorder Peon

    #18
    Yes, this is the right answer.
    I have used that software; it is free and excellent. You only install it on your local computer.
     
    justinlorder, Sep 5, 2008 IP
  19. Ann_India

    Ann_India Peon

    #19
    Well, my vote goes to GSiteCrawler. Online sitemap generators all have some crawl limit.

    As you have 4 lakh (400,000) web pages, you cannot depend on an online sitemap generator unless you purchase their premium service.

    So I would recommend dividing your sitemap by category or section and creating multiple XML sitemaps. Finally, create an index XML sitemap that lists all the sitemaps.
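
    One way to do the per-section split is to group URLs by the first segment of their path. A small Python sketch under that assumption (the example.com URLs and the `group_by_section` name are made up for illustration; a real site may define its sections differently):

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_section(urls):
    """Group URLs by first path segment; top-level pages go under 'root'."""
    sections = defaultdict(list)
    for u in urls:
        parts = [p for p in urlparse(u).path.split("/") if p]
        # /cars/ford -> "cars"; /about or / -> "root"
        key = parts[0] if len(parts) > 1 else "root"
        sections[key].append(u)
    return dict(sections)


# Placeholder URLs (assumption).
urls = [
    "http://www.example.com/cars/ford",
    "http://www.example.com/cars/bmw",
    "http://www.example.com/trucks/volvo",
    "http://www.example.com/about",
]
sections = group_by_section(urls)
```

    Each section then becomes its own sitemap file, although any section with more than 50,000 URLs still has to be split further.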

    Lastly, submit your index XML sitemap and all the section sitemaps. Soon you will find your web pages indexed in Google.

    Hope I have answered your query properly. :)
     
    Ann_India, Sep 6, 2008 IP
  20. demonzmedia

    demonzmedia Guest

    #20
    You could use the Google sitemap generator from Google (oddly enough) -

    google.com/webmasters/tools/docs/en/sitemap-generator.html

    Another (better) way to do it, assuming that your site is powered by a database, would be to write a script which generates a sitemap directly from your database. If you are using a generic script (e.g. phpBB), check the plugins page for that script to see if there is a sitemap generator.
     
    demonzmedia, Sep 10, 2008 IP