Does Googlebot crawl external CSS files?

Discussion in 'Google' started by Monty, Jun 26, 2006.

  1. #1
    Hi,

    My first post, so
    first of all, thank you for your invaluable and helpful knowledge (I've been reading this forum for some time now), and please forgive my English; well, I'm French, you know...

    I just read a news item on a French website (quite renowned in the French webmaster community) that presents some evidence that Googlebot crawls (and perhaps 'understands') external CSS files.

    Here is the link for those who read French:
    -http://www.webrankinfo.com/actualites/200606-google-et-css.htm

    To make it short, this assumption is based on log files showing Googlebot crawling CSS files.

    We already know/believe that Google can detect hidden links in CSS (I think Matt Cutts posted about it some time ago), but has any of you ever noticed evidence of Googlebot crawling/interpreting external CSS files, and what do you think about it?

    Thanks,
     
    Monty, Jun 26, 2006
  2. Endurer

    #2
    Welcome to the community, Monty!

    I haven't experienced it yet; Google hasn't crawled any of my CSS files either. You can always stop spiders from crawling certain files in your directory through robots.txt.
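
    As a rough sketch of how such an exclusion behaves, assuming a hypothetical /style/ directory and example.com URLs (neither comes from this thread), Python's standard urllib.robotparser can report whether a given CSS URL is off-limits to a crawler that honors robots.txt:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps every crawler out of /style/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /style/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com URLs are placeholders for illustration only.
    for url in ("http://example.com/style/main.css",
                "http://example.com/index.html"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

    This only models what a well-behaved crawler should do with the file; it says nothing about whether a particular bot actually respects it.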
     
    Endurer, Jun 26, 2006
  3. ServerUnion

    #3
    ServerUnion, Jun 26, 2006
  4. Monty

    #4
    Thanks for the link, ServerUnion.

    As you and Endurer suggest, it's possible to exclude CSS files from the crawl with robots.txt (though I think I read in some posts that this isn't recommended if there is nothing to hide), but that's not really the point.

    And, actually, I think it matters a little; it's always interesting to understand how Google works, what it can or cannot do.

    Well, at least it interests me, so I ask.
     
    Monty, Jun 26, 2006
  5. hulkster

    #5
    A simple way to figure it out is to pick a frequently crawled page that references a .css file ... and then see if the bots crawl that file too. My experience (I don't do any robots.txt stuff) is NO.
     
    hulkster, Jun 27, 2006
  6. Monty

    #6
    Right, but the thing is that there seems to be some evidence of Googlebot crawling CSS files; here is an excerpt from a log posted on a board:
    crawl-66-249-66-82.googlebot.com 27989 0 - [23/Jun/2006:03:32:20 +0200] "GET /style/corps.css HTTP/1.1" 200 613 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    But it doesn't seem systematic: some other people confirm that the bot has sometimes crawled CSS files, but on the other hand, many people say that they never saw such a thing...
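
    For anyone who wants to check their own logs, here is a minimal Python sketch. It assumes each log entry sits on one line and ends with the quoted request, status, size, referrer and user agent, as in the excerpt above. The function name and the second sample line are made up for illustration.

```python
import re

# Pulls the request path, status code and user agent out of an access-log line.
# Only the quoted request and the trailing fields are matched, so the leading
# columns (host, identity, timestamp) may differ between log formats.
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

SAMPLE_LINES = [
    # Log line quoted in this thread.
    'crawl-66-249-66-82.googlebot.com 27989 0 - [23/Jun/2006:03:32:20 +0200] '
    '"GET /style/corps.css HTTP/1.1" 200 613 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    # Hypothetical browser request, invented for contrast.
    'host.example.com 123 0 - [23/Jun/2006:03:33:00 +0200] '
    '"GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]


def googlebot_css_hits(lines):
    """Return (path, status) for each request to a .css file whose
    user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a line this sketch understands
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if "Googlebot" in match.group("agent") and path.lower().endswith(".css"):
            hits.append((path, int(match.group("status"))))
    return hits


if __name__ == "__main__":
    for path, status in googlebot_css_hits(SAMPLE_LINES):
        print(path, status)
```

    Keep in mind that the user-agent string can be faked, so a hit found this way is only a hint until the requesting IP has been checked.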

    So I was wondering:
    why does Googlebot sometimes crawl external CSS files,
    and why isn't it systematic?
     
    Monty, Jun 27, 2006
  7. adwordaffiliate

    #7
    AFAIK Google does now crawl external CSS files to look for hidden text & links which can be placed within them :)
     
    adwordaffiliate, Jun 28, 2006
  8. Jean-Luc

    #8
    Bonjour Monty,

    With a good stats tool, you can see that Googlebot reads your css files, say once every month. This explains why some people still claim that they never saw it.

    Google does not need to read these files more often, as they rarely change.

    Jean-Luc
     
    Jean-Luc, Jun 28, 2006
  9. ServerUnion

    #9
    I would assume they are checking the CSS to see if it is being used for blackhat SEO.
     
    ServerUnion, Jun 28, 2006
  10. SEOEgghead

    #10
    It doesn't really do it on a regular basis, but according to my logs, Google does look at stylesheets. If you really don't want it to, you can exclude it using robots.txt like everyone said.

    I always thought excluding it could be perceived as a red flag, but I've seen many non-BH sites excluding it as well. Realistically, I always thought that excluding it, assuming that spiders honor robots.txt scrupulously, makes spamming a lot easier ...
     
    SEOEgghead, Jun 28, 2006
  11. hulkster

    #11
    I stand corrected, as I looked at some logs on a website with light traffic going back to March 30th. Googlebot came by a total of 824 times and, lo and behold, there was ONE Googlebot visit that spidered a .css file, on June 22nd ... so this may be a fairly recent thing (?)

    IP address was 66.249.65.10 (which appears to be a legit Google IP address). BTW, I did not see any visits from slurp or msnbot to grab the external .css file ... so maybe they aren't doing this ... or at least not yet.
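
    One common way to check whether an IP really belongs to Google is a reverse DNS lookup followed by a forward lookup of the returned hostname. Below is a minimal Python sketch of that idea. The function name is made up, and the accepted hostname suffixes are an assumption based on the googlebot.com hostname seen in logs; check Google's own guidance for the current list.

```python
import socket


def is_real_googlebot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname):
    """Reverse-resolve ip, check the hostname suffix, then forward-resolve
    the hostname and confirm it maps back to the same ip.

    The reverse/forward parameters default to the real resolvers and can be
    replaced with stubs for offline testing.
    """
    try:
        host = reverse(ip)[0]  # gethostbyaddr returns (hostname, aliases, ips)
    except OSError:
        return False  # no reverse record, or the lookup failed
    # Assumed suffixes; adjust them if Google documents others.
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        return forward(host) == ip
    except OSError:
        return False
```

    Calling is_real_googlebot("66.249.65.10") performs live DNS queries, so the answer depends on the DNS records at the time you run it.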
     
    hulkster, Jun 28, 2006
  12. SEOEgghead

    #12
    Not sure why I get hit so much, but my CSS file gets pulled more than that. It really does make sense to crawl CSS, if you can, to look for stuff like: "left:-2000px;" But if you exclude it, SEs can't touch it anyway. Like I said, though, not being a blackhat, I cannot comment on the viability of employing CSS spam in an excluded CSS file or anything.
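
    To make that concrete, here is a toy Python sketch of the kind of pattern check someone could run over a stylesheet. It is purely illustrative: the property list, the 1000px threshold and the function name are assumptions, and nothing here claims to describe what Google actually does.

```python
import re

# Negative pixel offsets on a few positioning properties, e.g. "left:-2000px"
# or "text-indent: -9999px". The property list is an arbitrary assumption.
OFFSCREEN_RE = re.compile(
    r'(?P<prop>left|top|text-indent|margin-left)\s*:\s*-(?P<px>\d+)px',
    re.IGNORECASE,
)


def suspicious_offsets(css, threshold_px=1000):
    """Return (property, offset) pairs whose negative offset is at least
    threshold_px pixels, i.e. likely to push content off-screen."""
    found = []
    for match in OFFSCREEN_RE.finditer(css):
        pixels = int(match.group("px"))
        if pixels >= threshold_px:
            found.append((match.group("prop").lower(), -pixels))
    return found


if __name__ == "__main__":
    sample = ".hidden { position:absolute; left:-2000px; } .nudge { left:-5px; }"
    print(suspicious_offsets(sample))
```

    A match is not proof of spam, since large negative offsets are also used for ordinary purposes such as image-replacement headings.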
     
    SEOEgghead, Jun 28, 2006
  13. ServerUnion

    #13
    I might be missing the point here, but is there any reason that you wouldn't want Google looking into your client-side includes? It doesn't appear that they are indexed, or has anyone seen otherwise?
     
    ServerUnion, Jun 28, 2006
  14. Monty

    #14
    Do you mean: CSS files don't appear to be indexed?

    Some of them (a few) are:
    http://www.google.com/search?num=100&hl=en&lr=&safe=off&q=filetype%3Acss+style&btnG=Search
     
    Monty, Jun 29, 2006
  15. ServerUnion

    #15
    Maybe those have direct on-page links pointing to them. Just set up robots.txt and forget about it. There's nothing else that can really be done.
     
    ServerUnion, Jul 5, 2006
  16. gianinydw

    #16
    I don't know.
     
    gianinydw, Jul 7, 2006