Googlebot downloads .zip files?

Discussion in 'Site & Server Administration' started by sgthayes, May 18, 2006.

  1. #1
    I checked the logs for my fonts site today, and Googlebot has been downloading my fonts (zip files) all night long. I quickly changed robots.txt to disallow the downloads and added a rel="nofollow" attribute to the links.
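
    A rule like this can be sanity-checked before relying on it. The sketch below uses Python's standard urllib.robotparser. The robots.txt content and the /index.html path are assumptions made for illustration, since the thread does not show the real file. A plain path prefix is used because, as far as I know, this parser matches prefixes and does not interpret wildcard patterns such as /*.zip$.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the real file is not shown in the thread.
# The rule is a plain path prefix, which urllib.robotparser can evaluate.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /fonts/
"""


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if the rules above let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The first path comes from the log excerpt; the second is made up.
    for path in ("/fonts/handelgotdbol.zip", "/index.html"):
        print(path, is_allowed("Googlebot", path))
```

    If the pages that list the fonts also live under /fonts/, a prefix rule like this would block them too, so a narrower rule would be needed in that case.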

    Have a look at some of the log entries:
    
    66.249.72.1 - - [19/May/2006:07:34:43 +0200] "GET /fonts/keyboard_light_ssi_light.zip HTTP/1.1" 200 34108 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    66.249.72.1 - - [19/May/2006:07:34:44 +0200] "GET /fonts/handelgotdbol.zip HTTP/1.1" 200 31566 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    
    This has been going on for 7 hours now. Is this normal behaviour? Does Google now index the contents of .zip files?
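
    To see how much is actually being pulled, the log can be summarised with a few lines of Python. This is a rough sketch that assumes the log uses the combined format seen in the two entries above. The function name and the regular expression are my own, and the sample data is just those two lines.

```python
import re

# The two entries quoted in the post, used here as sample input.
LOG_LINES = [
    '66.249.72.1 - - [19/May/2006:07:34:43 +0200] "GET /fonts/keyboard_light_ssi_light.zip HTTP/1.1" 200 34108 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.72.1 - - [19/May/2006:07:34:44 +0200] "GET /fonts/handelgotdbol.zip HTTP/1.1" 200 31566 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

# Assumed layout: ip, two ignored fields, [time], "request", status, size,
# "referer", "user agent".
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarize_zip_hits(lines, agent_token="Googlebot"):
    """Count .zip requests whose user agent contains agent_token.

    Returns (number_of_requests, total_bytes_sent).
    """
    count = 0
    total_bytes = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if agent_token not in match["agent"]:
            continue
        if not match["path"].lower().endswith(".zip"):
            continue
        count += 1
        if match["size"] != "-":
            total_bytes += int(match["size"])
    return count, total_bytes


if __name__ == "__main__":
    hits, size = summarize_zip_hits(LOG_LINES)
    print(f"{hits} zip requests, {size} bytes")
```

    For a real log file, the lines of an open file object could be passed in place of LOG_LINES.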
     
    sgthayes, May 18, 2006
  2. irka

    irka Well-Known Member

    #2
    Googlebot is way too curious IMO. I don't think Google indexes .zip files; that would be a mess!
     
    irka, May 18, 2006
  3. sgthayes

    sgthayes Peon

    #3
    Yes indeed!
    But what possible reason could they have for doing it? Do some people configure their web server to return text or HTML when a .zip file is requested? Even if so, you'd expect Google to be smart enough to stop once it had downloaded a couple of zips and seen that they are all real zip files.
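
    The kind of check I have in mind could look something like the sketch below. It is only an illustration of how a client might tell a real zip from text or HTML served under a .zip URL, and it is not a claim about what Googlebot actually does. The function names and the list of accepted content types are assumptions.

```python
import io
import zipfile

# Assumed set of media types a server might use for zip archives.
ZIP_CONTENT_TYPES = {"application/zip", "application/x-zip-compressed"}


def looks_like_real_zip(content_type: str, body: bytes) -> bool:
    """Return True if the header and the bytes both point to a zip archive."""
    # Drop any parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ZIP_CONTENT_TYPES:
        return False
    # is_zipfile accepts a file-like object and inspects the archive structure.
    return zipfile.is_zipfile(io.BytesIO(body))


def make_sample_zip() -> bytes:
    """Build a tiny in-memory zip so the check can be tried without a server."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("font.txt", "placeholder")
    return buffer.getvalue()


if __name__ == "__main__":
    html_body = b"<html><body>not a zip archive</body></html>"
    print(looks_like_real_zip("application/zip", make_sample_zip()))
    print(looks_like_real_zip("text/html; charset=utf-8", html_body))
```

    A crawler working this way would still have to download each file at least once, which may be part of why the requests keep coming.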
     
    sgthayes, May 18, 2006
  4. irka

    irka Well-Known Member

    #4
    Bah, you know, perhaps I said a very stupid thing... Google is getting smarter every day at indexing videos, pictures and documents; perhaps now they are tackling archive files and trying to find a way to index them...

    On the other hand, I think they want to gather as much information as possible about each site owner :D to know who they are. Imagine a site owner keeping illegal pictures, such as images of child abuse, in an archive file. How would you know that kind of picture is on the site without looking into the archive?
     
    irka, May 18, 2006