I'm really struggling with Google sitemaps and spiders

Discussion in 'Google Sitemaps' started by a87a.com, Sep 23, 2008.

  1. #1
    Hi everybody,

    I have a real problem with Google crawling my forum.

    My sitemap contained 40425 URLs, but only 2573 of them were indexed :mad:

    The sitemap itself did have some problems, like:

    --------

    Errors for URLs in Sitemaps 37
    HTTP errors 0
    Not found 88
    URLs not followed 0
    URLs restricted by robots.txt 0
    Unreachable URLs 14

    --------

    I didn't have a robots.txt,

    so I made one for my site: http://www.a87a.com/robots.txt

    After that the errors decreased day after day.

    But one of my friends advised me to remove the old sitemaps and generate a new one for my website with vBSEO,

    and to block Google's spiders from the archive.

    When I did that,

    everything went wrong :(

    Errors for URLs in Sitemaps 67
    HTTP errors 1
    Not found 93
    URLs not followed 0
    URLs restricted by robots.txt 228
    Unreachable URLs 14

    What can I do about these errors?

    ----------

    Another thing: how can I stop Google's spiders from crawling

    user names and the moderators' forum,

    if the moderators' forum link looks like this: http://a87a.com/vb/f7.html



    Please help.

    ---------
     
    a87a.com, Sep 23, 2008 IP
  2. d tea

    #2
    I am trying to learn the robots / spiders / sitemaps game as well. I tried following the steps in Google Webmaster Tools but never got very far. I hope this helps you more than it did me. Good luck: http://www.google.com/webmasters/start/
     
    d tea, Sep 23, 2008 IP
  3. a87a.com

    #3
    Thank you, dude.

    I think it will be helpful.
     
    a87a.com, Sep 24, 2008 IP
  4. arnabme

    #4
    There are certain things that will help you go a long way.

    1. Check in Google Analytics which pages your visitors actually visit.
    2. Run a site analysis with Xenu (this way you will see which pages on your site have errors).
    3. Try to figure out the problems: links pointing to the wrong pages, or wrong URLs or page names being requested.
    4. Implement a robots.txt file to stop bots from crawling pages you don't want crawled.
    5. Implement a sitemap (XML version). I believe you would have something like sitemap1.xml and sitemap2.xml if you have so many pages.

    I think this will help you a lot.
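
Point 5 can be sketched in a few lines of Python. This is only an illustration, not what vBSEO or any other generator actually does: the function names, the `example.com` base URL, and the file naming are assumptions. The sitemaps.org protocol allows at most 50,000 URLs per sitemap file, so a list of 40,425 URLs still fits in one file; splitting only becomes necessary above that limit, or if you prefer smaller files.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return the XML text of one sitemap file listing the given URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n{entries}</urlset>\n'
    )


def build_index(sitemap_urls):
    """Return the XML text of a sitemap index pointing at each sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}</sitemapindex>\n'
    )


def split_sitemaps(urls, base_url, size=50000):
    """Split urls into sitemap1.xml, sitemap2.xml, ... plus an index file.

    Returns (files, index_xml), where files maps file name to XML text.
    """
    files = {}
    for number, start in enumerate(range(0, len(urls), size), start=1):
        files[f"sitemap{number}.xml"] = build_sitemap(urls[start:start + size])
    index_xml = build_index([f"{base_url}/{name}" for name in files])
    return files, index_xml


# Hypothetical usage: five thread URLs, two per file, so three files result.
urls = [f"http://example.com/vb/t{i}.html" for i in range(5)]
files, index_xml = split_sitemaps(urls, "http://example.com", size=2)
print(sorted(files))
```

Only the index file would then need to be submitted; it points the crawler at each numbered sitemap.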
     
    arnabme, Sep 24, 2008 IP
  5. udayns

    #5
    Hi,

    I think you should create a sitemap.xml; this will help get your site's pages indexed.
     
    udayns, Sep 24, 2008 IP
  6. a87a.com

    #6
    Thanks for the comments :)

    Thank you arnabme and everybody.

    I tried something like this in the robots.txt file:

    Disallow: /~vb/f7.html

    I hope it works :D

    Thank you :)
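
A rule like this can be tested offline before waiting for the next crawl. The sketch below uses Python's standard `urllib.robotparser`; the `User-agent: *` line and the variant without the tilde are assumptions added for comparison, not the site's real robots.txt. Since the forum URL given earlier is http://a87a.com/vb/f7.html, its path is `/vb/f7.html`, and a rule for `/~vb/f7.html` would most likely not match it.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, agent="Googlebot"):
    """Return True if the given robots.txt lines let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


url = "http://a87a.com/vb/f7.html"

# The rule as posted (the User-agent line is assumed).
as_posted = ["User-agent: *", "Disallow: /~vb/f7.html"]

# Hypothetical variant without the tilde, matching the URL's actual path.
without_tilde = ["User-agent: *", "Disallow: /vb/f7.html"]

print(is_allowed(as_posted, url))      # True: the page is still crawlable
print(is_allowed(without_tilde, url))  # False: the page is blocked
```

Note that this only covers the one URL; if the same forum is reachable under other addresses, those would need their own rules.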
     
    a87a.com, Sep 24, 2008 IP
  7. vseo

    #7
    I have never experienced this.
     
    vseo, Sep 25, 2008 IP
  8. websitetools

    #8
    You can "noindex" all the URLs you don't want sitemap generators / Google to crawl. At least A1 Sitemap Generator will obey noindex, robots.txt etc. That might help you get Google to crawl only your important pages.

    Regarding errors in sitemaps... Are you sure those are 404s in the XML sitemaps and not just Google reporting a 404 error for a URL it has followed from another source (e.g. a link from some other website)?
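
One way to answer that question is to compare the reported "Not found" URLs with the URLs actually listed in the XML sitemap. The following is a rough sketch under some assumptions: the sitemap is available as text, the list of reported URLs has been copied out by hand, and all URLs and function names here are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap's XML text."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    }


def classify_not_found(xml_text, not_found):
    """Split reported 404 URLs into those in the sitemap and those from elsewhere."""
    listed = sitemap_urls(xml_text)
    in_sitemap = [u for u in not_found if u in listed]
    elsewhere = [u for u in not_found if u not in listed]
    return in_sitemap, elsewhere


# Hypothetical data: a tiny sitemap and two URLs reported as "Not found".
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/vb/t1.html</loc></url>
  <url><loc>http://example.com/vb/old-thread.html</loc></url>
</urlset>"""
reported = [
    "http://example.com/vb/old-thread.html",
    "http://example.com/vb/typo.html",
]

in_sitemap, elsewhere = classify_not_found(sample, reported)
print(in_sitemap)  # stale entries to remove from the sitemap
print(elsewhere)   # 404s that Google found through some other link
```

URLs in the first list are stale sitemap entries that regenerating the sitemap should clear; URLs in the second list come from links elsewhere and will not go away by editing the sitemap.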
     
    websitetools, Sep 25, 2008 IP
  9. a87a.com

    #9
    Thomas, thank you for the comment.

    I checked my sitemap closely and found some old URLs.

    Thanks.
     
    a87a.com, Sep 26, 2008 IP