Googlebot encountered an extremely high number of URLs

Discussion in 'Google' started by Martha, Oct 29, 2008.

    Hi. Today I came across a new problem with my website and I hope you'll be able to help me. I'd appreciate even the smallest suggestion, as this is an extremely important website for me.

    Namely, I have a huge site that is a company directory. As you can guess, it lists all the cities in my country, which results in a really large number of pages. Some of the cities don't have any companies yet, so I added the noindex meta tag to those pages. Despite that, Googlebot may still crawl those URLs, and that's probably why I got the message titled "Googlebot encountered an extremely high number of URLs".
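    For reference, the tag I put on the empty city pages looks like this. As far as I understand it, noindex only keeps a page out of the index; Googlebot still has to fetch the URL to read the tag, so it doesn't actually reduce crawling:

    ```html
    <!-- placed in the <head> of each empty city page -->
    <meta name="robots" content="noindex">
    ```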

    I've read that this problem usually appears when the same content is available under different URLs, but I believe that's not the case here. Firstly, I use mod_rewrite, so the URLs don't look as if they were generated automatically. Secondly, session IDs are not added to the URLs. Thirdly, I don't let search result pages be indexed. So I don't believe this problem is connected with duplicate content.
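    To show what I mean by mod_rewrite, my rules look roughly like this (the path and parameter names here are just an example, not my real ones):

    ```apache
    # .htaccess sketch - maps a clean city URL to the real script
    RewriteEngine On
    # e.g. /city/springfield -> /directory.php?city=springfield
    RewriteRule ^city/([a-z-]+)/?$ directory.php?city=$1 [L,QSA]
    ```

    Note that a rewrite like this doesn't remove the old parameterized URL; it only adds a cleaner one, so both forms can respond unless the ugly one is redirected.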

    The message in Webmaster Tools contains a list of example problematic URLs, and most of them are the pages with the noindex meta tag. I suppose that may be one reason, and the second may be that some pages contain only contact details. Unfortunately, not all companies wanted to complete their profiles and write something more about their activity, but I'm doing my best to encourage them to do that.

    My question is: how can I solve this problem, and what steps have helped you in a similar situation? I'm trying to get only the unique URLs indexed, but maybe I should nofollow the links to some of them so that Googlebot doesn't even find them. What do you think about it?
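    To make the nofollow idea concrete, I mean something like this (the URL is just an example):

    ```html
    <!-- link-level hint on internal links to empty city pages -->
    <a href="/city/emptytown" rel="nofollow">Emptytown</a>
    ```

    If that's not enough, blocking whole sections in robots.txt would be stronger, since it stops Googlebot from requesting those URLs at all, but the empty cities on my site don't share a common path, so I'm not sure it's practical for me.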
     
    Martha, Oct 29, 2008 IP