Deep Crawl Issues

Discussion in 'Google' started by wiseone, May 24, 2004.

  1. #1
    Greetings,
    My first post here...

    I am wondering if anyone else has been having issues with deep crawls from Google lately. We have approximately 7000 pages on our site. We add approximately 500 to 600 articles per month. It seems to take Google a long time to pick up these articles... Here is my hypothetical question...

    If you had 10000 articles on your site today, and 5000 of them were indexed in Google, how long would you expect it to take for Google to index the other 5000 articles? One month? Many months? Never?

    We have noticed that Google will index approximately 500 to 600 per month.

    From an internal linking point of view - all articles are accessible by two clicks from the home page.

    Thanks for the help and thoughts. :)
     
    wiseone, May 24, 2004 IP
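The question in post #1 comes down to simple arithmetic, which can be sketched roughly as follows. The function name and the figure of 550 pages per month (the midpoint of the quoted 500 to 600) are assumptions for illustration only; nothing here reflects how Google actually schedules crawls.

```python
import math
from typing import Optional


def months_to_full_index(backlog: int, indexed_per_month: int,
                         added_per_month: int) -> Optional[int]:
    """Months until no unindexed pages remain, or None if the backlog never shrinks."""
    if backlog <= 0:
        return 0
    # Net number of backlog pages cleared each month.
    net = indexed_per_month - added_per_month
    if net <= 0:
        return None
    return math.ceil(backlog / net)


# 5000 unindexed articles, about 550 indexed per month (assumed midpoint).
print(months_to_full_index(5000, 550, 0))    # 10, if no new articles were added
print(months_to_full_index(5000, 550, 550))  # None, new articles arrive as fast as old ones are indexed
```

If the indexing rate only matches the rate at which new articles are published, the backlog never clears, which may be why the delay feels endless.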
  2. digitalpoint

    digitalpoint Overlord of no one Staff

    #2
    It depends on the PageRank of the site... the higher the PageRank, the more often Google will visit and spider stuff. This forum, for example, is spidered pretty much 24/7 (I imagine this thread should be in the index within 24 hours).

    Most of the time, you can see Googlebot spidering here: http://forums.digitalpoint.com/online.php

    What is the PageRank of your site? And how many clicks deep is your deepest page?
     
    digitalpoint, May 24, 2004 IP
  3. Bernard

    Bernard Well-Known Member

    #3
    Are your pages dynamic? How many variables do you have in the URL? I've read many reports from webmasters in your position whose entire site was crawled within days of using mod_rewrite or otherwise changing the URLs to limit or eliminate variables. YMMV.
     
    Bernard, May 24, 2004 IP
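To make the idea in post #3 concrete: mod_rewrite itself is an Apache module configured with rewrite rules, so the Python below is not a mod_rewrite configuration. It is only a sketch of the kind of mapping such rules expose, turning a URL with query variables into a static-looking path. The function name, the `key/value` path scheme, and the `.html` suffix are all hypothetical choices.

```python
import posixpath
from urllib.parse import parse_qsl, quote, urlsplit


def static_style_path(url: str) -> str:
    """Map /article.php?id=42&cat=news to /article/id/42/cat/news.html (hypothetical scheme)."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)
    if not pairs:
        # No variables in the URL, so there is nothing to rewrite.
        return parts.path
    # Drop the script extension, e.g. ".php".
    base, _ext = posixpath.splitext(parts.path)
    # Flatten key/value pairs into path segments, escaping unsafe characters.
    segments = [quote(piece, safe="") for pair in pairs for piece in pair]
    return base + "/" + "/".join(segments) + ".html"


print(static_style_path("http://www.mysite.com/article.php?id=42&cat=news"))
```

The server still has to translate the static-looking path back into the original variables, which is the part the rewrite rules would handle.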
  4. expat

    expat Stranger from a far land

    #4
    Hi,

    I've never hit a ceiling with new pages, but I've only ever fed about 250 max in one day, and under PR6/7 with a site map of PR5 these were indexed within a day.
    But all of these pages had different meta tags and titles.

    So another 5000 pages would probably take me about 8 weeks, as I would feed them at a rate of 100/day with an expanding site map tree and without generic titles / meta tags.
    M
     
    expat, May 24, 2004 IP
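The feeding schedule described in post #4 can be sketched as a simple batching step. The page names below are made up, and the helper is only an illustration of splitting new pages into daily batches, not expat's actual process.

```python
from typing import List


def daily_batches(pages: List[str], per_day: int) -> List[List[str]]:
    """Split the list of new pages into consecutive batches of at most per_day pages."""
    if per_day <= 0:
        raise ValueError("per_day must be positive")
    return [pages[i:i + per_day] for i in range(0, len(pages), per_day)]


# Hypothetical page names standing in for 5000 new articles.
pages = [f"article{n}.shtml" for n in range(1, 5001)]
schedule = daily_batches(pages, 100)

print(len(schedule))                # 50 days of feeding
print(round(len(schedule) / 7, 1))  # 7.1 weeks
```

At 100 pages per day, the raw arithmetic gives about seven weeks, which is in line with the "about 8 weeks" estimate once some slack is allowed.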
  5. wiseone

    wiseone Peon

    #5
    Greetings,
    Thanks for the quick help...

    Our homepage is PR5. It does tend to have activity pretty much every day from Google. However, it seems to be going after older pages most often.

    All of our articles are static webpages. No CGI. All are normal URLs. (ex: http://www.mysite.com/articles/article1.shtml)

    Every article can be reached in two clicks from the homepage. It works as follows:

    - We created A-Z pages. These pages basically contain links to the articles. These pages reside at the root level of our server (ex: http://www.mysite.com/articles_a.shtml).

    - We also index each article with a synopsis page. These are a bit deeper into the site and would take more clicks to get to an individual article.

    Example: Article7100 is located on synopsis index page 214.

    The path to get to Article7100 from the home page is as follows:

    - Click on the digest index
    - Click on Index pages numbers 200-214
    - Click on Index page 214
    - Click on Article 7100

    It would take four clicks to get from homepage to Article7100.

    All of our articles reside no deeper than one directory from our main directory.

    Thanks for the help.
     
    wiseone, May 24, 2004 IP
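The two paths described in post #5 (two clicks via the A-Z pages, four via the synopsis index) can be checked with a breadth-first search over the internal link graph. This is a minimal sketch: the page names are placeholders, and which A-Z page actually links to Article7100 is an assumption.

```python
from collections import deque
from typing import Dict, List


def click_depths(links: Dict[str, List[str]], home: str) -> Dict[str, int]:
    """Return the minimum number of clicks from the home page to every reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical link graph modelled on the description in the post.
synopsis_only = {
    "home": ["digest_index"],
    "digest_index": ["index_200_214"],
    "index_200_214": ["index_214"],
    "index_214": ["article7100"],
}
with_a_to_z = dict(synopsis_only)
with_a_to_z["home"] = ["digest_index", "articles_a"]
with_a_to_z["articles_a"] = ["article7100"]  # assumed A-Z page for this article

print(click_depths(synopsis_only, "home")["article7100"])  # 4
print(click_depths(with_a_to_z, "home")["article7100"])    # 2
```

Running something like this over a real crawl of the site would also show whether any article is missing from the A-Z pages and is therefore only reachable through the deeper path.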
  6. digitalpoint

    digitalpoint Overlord of no one Staff

    #6
    My best suggestion would be to simply get more links. The more links you get, the more important Google will think you are, and the more often it will spider/respider.
     
    digitalpoint, May 24, 2004 IP
  7. wiseone

    wiseone Peon

    #7
    Another note... All articles are unique. They are not duplicate content. They contain unique titles, descriptions, and keyword info.
    Thanks.
     
    wiseone, May 24, 2004 IP
  8. nohaber

    nohaber Well-Known Member

    #8
    wiseone,
    the crawling of a page depends on its PR and the PR of the pages that link to it. You either need a big PR for a page, or links from pages with big PR. Sometimes, second-level and deeper pages may take months to crawl.
    If there aren't too many, you can put links on your home page to third-level pages and rotate them. It's a big problem, because if Google does not crawl, let's say, your third-level pages, it will never get to level 4.
    For example, I have a PR5 homepage and a direct link to my links directory, which links to 70 subcategories. Well, my main links directory got PR4, then PR5, but my links subcategories (level 3) only got crawled after a couple of months. They still haven't "officially" got a rank on the toolbar.
    So even though my links page had a good PR, it only had one link pointing to it. Third-level and deeper pages need time.
     
    nohaber, May 24, 2004 IP
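The rotation suggested in post #8 could look something like the sketch below, which picks a different window of deep pages for the home page each day. The function name, the 10 slots, and the page names are assumptions; the 70 pages mirror the 70 subcategories mentioned in the post.

```python
from datetime import date, timedelta
from typing import List


def links_for_day(deep_pages: List[str], slots: int, day: date) -> List[str]:
    """Pick the window of deep pages to link from the home page on a given day."""
    if not deep_pages or slots <= 0:
        return []
    slots = min(slots, len(deep_pages))
    # Advance the window by `slots` pages per day, wrapping around at the end.
    start = (day.toordinal() * slots) % len(deep_pages)
    return [deep_pages[(start + i) % len(deep_pages)] for i in range(slots)]


# Hypothetical third-level pages, 10 home page slots.
subcategories = [f"links/subcategory{n}.html" for n in range(1, 71)]
first_day = date(2004, 5, 24)
week = [links_for_day(subcategories, 10, first_day + timedelta(days=d)) for d in range(7)]
covered = {page for day_links in week for page in day_links}
print(len(covered))  # 70, every subcategory gets a home page link within a week
```

Because the selection depends only on the date, the home page shows the same links all day, so a spider and a visitor see the same thing.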
  9. sarahk

    sarahk iTamer Staff

    #9
    Try to get more internal linking going on.

    For instance, this page: http://www.goodreturns.co.nz/article.php?ArticleID=976489473 has links on the right-hand side to recent articles, increasing the number of internal links.

    Then it would be good to get those articles referred to from other sites. Say you mention a company: get that company to link to your article. Link out too, to high-ranking sites relevant to your story.

    If you have a search page, then list the "recent searches". It shows the humans what people have been looking for and shows the bots a different collection of pages.
     
    sarahk, May 24, 2004 IP
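A "recent searches" list like the one suggested in post #9 needs little more than a small, de-duplicated, most-recent-first collection. The class below is a hypothetical in-memory sketch; a real site would persist the queries and filter out anything it does not want to publish.

```python
from collections import OrderedDict
from typing import List


class RecentSearches:
    """Keep the most recent distinct search queries, newest first."""

    def __init__(self, limit: int = 10) -> None:
        self._limit = limit
        self._queries: "OrderedDict[str, None]" = OrderedDict()

    def record(self, query: str) -> None:
        # Normalise case and whitespace so near-duplicates collapse into one entry.
        key = " ".join(query.lower().split())
        if not key:
            return
        self._queries.pop(key, None)  # re-adding moves the query to the newest position
        self._queries[key] = None
        while len(self._queries) > self._limit:
            self._queries.popitem(last=False)  # drop the oldest query

    def latest(self) -> List[str]:
        return list(reversed(self._queries))


recent = RecentSearches(limit=2)
for q in ["Mortgage rates", "insurance", "mortgage  rates"]:
    recent.record(q)
print(recent.latest())  # ['mortgage rates', 'insurance']
```

Each entry in the list would then be rendered as a link to its search results page, which is what gives the bots a different collection of pages to follow.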
  10. disgust

    disgust Guest

    #10
    Is it really normal for pages to be added in between big updates?

    We're a PR5/6 (fluctuates) and new pages are only added once a month during a deep crawl.
     
    disgust, May 24, 2004 IP
  11. ZanderXML

    ZanderXML Guest

    #11
    It also depends on what type of URLs you have! If you have static pages, your site will be indexed in 2 weeks even if it has just PR4. If you have dynamic pages, you need good PR to be crawled deeply.

    And it seems Google doesn't crawl dynamic pages with PR3 or less. For details, see my previous post "Static vs Dynamic Pages": http://forums.digitalpoint.com/showpost.php?p=6685&postcount=15
     
    ZanderXML, May 27, 2004 IP
  12. john_loch

    john_loch Rodent Slayer

    #12
    An alternative option is to set up a subdomain with a single indexable page. Rotate your latest articles through this page every 24 hrs, with a small synopsis for each. Let Google see that this page is changing on a daily basis, and ensure FREQUENT linkage to this subdomain throughout your main site, perhaps in your footer etc.

    You **should** also consider offering this page as an RSS feed. This will attract relevant inbound links.

    It has the effect of:
    A. Being treated as a separate domain by Google.
    B. Freshness
    C. Linkage from your main site will propagate PR

    PS: I haven't visited your site, but I suspect from what I've read the above should help :)
     
    john_loch, May 27, 2004 IP
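For the RSS part of post #12, a feed of the latest articles with a short synopsis each can be built with the standard library alone. This is a minimal sketch of an RSS 2.0 document; the subdomain `latest.mysite.com`, the article titles, and the synopses are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def build_rss(title: str, link: str, description: str,
              items: List[Tuple[str, str, str]]) -> str:
    """Build a minimal RSS 2.0 feed; each item is (title, link, synopsis)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements in RSS 2.0.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item_title, item_link, synopsis in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
        ET.SubElement(item, "description").text = synopsis
    return ET.tostring(rss, encoding="unicode")


# Hypothetical subdomain and articles.
feed = build_rss(
    "Latest articles",
    "http://latest.mysite.com/",
    "Articles added in the last 24 hours",
    [
        ("Article 7100", "http://www.mysite.com/articles/article7100.shtml", "Short synopsis."),
        ("Article 7101", "http://www.mysite.com/articles/article7101.shtml", "Short synopsis."),
    ],
)
print(feed)
```

Regenerating the feed and the subdomain page from the same list of latest articles keeps the two in step, so the daily change is visible in both places.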
  13. expat

    expat Stranger from a far land

    #13
    If the crawler can find it via linkage, internally or externally, every page gets crawled, and if it's different it gets deep crawled.

    http://departures-arrivals.com/uk-airport-information.htm?airport=&category=DIR is PR1 and gets crawled daily. I just changed the title the day before yesterday and it's already in with the new title.

    But if dynamic pages don't differ sufficiently in title, description and content, it may well be another story.

    M

    PS I'm quite happy with PR1 for this, as it's an intermediary hop page with no real content
     
    expat, May 27, 2004 IP
  14. Dominic

    Dominic Well-Known Member

    #14
    Hi -

    I would make sure you get some links to pages deeper in your site.

    And ensure you have links from a variety of data centres - i.e. get a link from Europe, Australia / Asia, the US, and so on.

    I would also ensure when you upload the pages you have added or changed to your server that you re-upload all pages that link to your new pages.

    By this I mean - Google will visit your server and be less interested in pages which 'haven't changed / been re-loaded since last spidered' - and will come back more frequently to a page that either regularly has new content or has at least been re-uploaded since its last visit.

    Dominic
     
    Dominic, May 27, 2004 IP