Google Not Consistent With robots.txt

Discussion in 'robots.txt' started by digitalpoint, Mar 22, 2006.

  1. #1
    digitalpoint, Mar 22, 2006 IP
  2. rustybrick

    rustybrick User ID 3

    #2
    It is amazing that Google would crawl pages that it clearly should not.
     
    rustybrick, Mar 22, 2006 IP
  3. alifan

    alifan Peon

    #3
    I would agree with digitalpoint. I have had Google look at link folders that it was not supposed to access.
     
    alifan, Mar 30, 2006 IP
  4. minstrel

    minstrel Illustrious Member

    #4
    Two points:

    1. Google has said previously that they interpret robots.txt a bit more liberally than many spiders, in that they will try to figure out what you "meant" when there are errors (much like MSIE tries to work around coding errors). The reply from Google noted in your blog is correct, I think - the fact that a different bot is doing what you wanted it to do is testimony to Google's ability to read between the lines.

    2. There is a difference between crawling and indexing. Googlebot and other spiders do seem to crawl Disallowed folders and files; this isn't new. That's not necessarily a problem, though. It's only a problem if the content starts showing up in the search indices. If it's really sensitive or private information, it should be password protected.
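
    To make the Disallow point concrete, here is a minimal sketch of how a strictly standard-following parser reads a robots.txt file, using Python's standard-library urllib.robotparser. The rules, paths, and example.com URLs are made up for illustration and are not taken from anyone's site in this thread. It shows only what the file asks well-behaved bots to skip; it says nothing about how Googlebot actually interprets the file, and it does not protect anything from being fetched.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /links/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder domain.
    for url in (
        "http://example.com/private/page.html",
        "http://example.com/links/partners.html",
        "http://example.com/public/index.html",
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", url) else "disallowed"
        print(f"{url}: {verdict}")
```

    Checking your own file this way gives you a baseline for comparing against what shows up in your server logs. Either way, robots.txt is only a request to crawlers, so anything truly private still needs password protection.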
     
    minstrel, Apr 2, 2006 IP
  5. Jean-Luc

    Jean-Luc Peon

    #5
    In your example, Googlebot respected the standard and the validator didn't.

    And, according to Matt Cutts:
    Jean-Luc :confused:
     
    Jean-Luc, Apr 2, 2006 IP