Sitemap only 1/4 crawled according to Google Webmaster Tools

Discussion in 'Search Engine Optimization' started by mikerowan, Mar 3, 2010.

  1. #1
    I submitted a new sitemap 2 days ago, and also set up global redirects on all old URLs.

    Any thoughts on why only about a quarter of the sitemap has been crawled so far?

    Also, does anyone out there know of a free crawling program that could replicate a spider and give me more insight?
     
    mikerowan, Mar 3, 2010
  2. pontifixx

    #2
    The crawl you got may have come only from the 'fresh' bot, not the deep-crawl bot.
     
    pontifixx, Mar 3, 2010
  3. rockjone

    #3
    You got the fresh bot on that crawl.
     
    rockjone, Mar 3, 2010
  4. selectsplat

    #4
    A few things to check.

    First, make sure that all pages in your sitemap have a unique title. Pages with identical titles may not always be indexed.

    Second, make sure that each page of your site can only be accessed through 1 URL. For example, visitors should not be able to reach the same product through both of the following URLs:
    www.example.com/product_one
    www.example.com/most_popular/product_one

    Duplicate pages like these may be indexed only once, and sometimes not at all.

    Third, make sure each of the pages in your sitemap can be accessed within a few seconds. If a page takes longer than a few seconds to load, it may not be crawled.

    Fourth, check your robots.txt and noindex tags, and make sure your sitemap doesn't include pages that are blocked by robots.txt or marked noindex.

    Fifth, make sure pages included in the sitemap can be accessed by someone who is not logged in.
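
    If you want to check a few of these yourself, here is a rough Python sketch using only the standard library. It is an illustration, not a real crawler: the function names are made up for this example, it works on sitemap XML, robots.txt text, and page HTML that you have already downloaded, and the title and noindex checks are simple pattern matches rather than a full HTML parse.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.robotparser import RobotFileParser

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def blocked_by_robots(urls, robots_txt, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


def duplicate_titles(pages):
    """Group URLs by <title> and return only titles shared by 2+ URLs.

    `pages` maps URL -> HTML text. Pages with no <title> are grouped
    under the empty string, so missing titles get flagged as well.
    """
    by_title = defaultdict(list)
    for url, html in pages.items():
        match = re.search(r"<title[^>]*>(.*?)</title>", html, re.I | re.S)
        # Collapse whitespace so "Product  One" and " Product One " compare equal.
        title = " ".join(match.group(1).split()) if match else ""
        by_title[title].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


def has_noindex(html):
    """Rough heuristic: is there a robots <meta> tag that mentions noindex?"""
    for tag in re.findall(r"<meta\b[^>]*>", html, re.I):
        lowered = tag.lower()
        if "robots" in lowered and "noindex" in lowered:
            return True
    return False
```

    Feed sitemap_urls() the contents of your sitemap file, then pass the result to blocked_by_robots() along with the text of your robots.txt. Any URL that comes back from that, or that shows up in duplicate_titles() or has_noindex(), is worth fixing or removing from the sitemap.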

    Hope this helps.



     
    selectsplat, Mar 3, 2010