Hi, this is my first post, so first of all, thank you for your invaluable and helpful knowledge (I've been reading this forum for some time now), and please forgive my English; I'm French, you know... I just read a news item on a French website (quite renowned in the French webmaster community) that presents some evidence that Googlebot crawls (and perhaps 'understands') external CSS files. Here is the link for those who read French: http://www.webrankinfo.com/actualites/200606-google-et-css.htm To make it short, this assumption is based on log files showing Googlebot requesting CSS files. We already know/believe that Google can detect hidden links in CSS (I think Matt Cutts posted about it some time ago), but has any of you ever noticed evidence of Googlebot crawling or interpreting external CSS files, and what do you think about it? Thanks,
Welcome to the community, Monty! I haven't experienced it yet; Google hasn't crawled any of my CSS files either. You can always stop spiders from crawling certain files in your directory through robots.txt.
Does it really matter? As suggested, I would just limit the exposure with robots.txt. Here is another thread that may help you: http://forums.digitalpoint.com/showthread.php?t=122
Thanks for the link, ServerUnion. As you and Endurer suggest, it's possible to exclude the CSS files from the crawl with robots.txt (though I think I read in some posts that it isn't recommended if there is nothing to hide), but that's not really the point. And actually, I think it does matter a little: it's always interesting to understand how Google works, what it can or cannot do. Well, at least it interests me, so I ask.
A simple way to figure it out is to pick a frequently crawled page that references a .css file ... and then see whether the bots crawl that file. My experience (I don't do any robots.txt stuff) is NO.
Right, but the thing is that there seems to be some evidence of Googlebot crawling CSS files. Here is an excerpt from a log posted on a board: crawl-66-249-66-82.googlebot.com 27989 0 - [23/Jun/2006:03:32:20 +0200] "GET /style/corps.css HTTP/1.1" 200 613 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" But it doesn't seem systematic: some other people confirm that the bot has sometimes crawled CSS files, but on the other hand, many people say they have never seen such a thing... So I was wondering: why does Googlebot sometimes crawl external CSS files, and why isn't it systematic?
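For anyone who wants to check their own logs for this, here is a minimal Python sketch that scans access-log lines for .css requests whose user agent mentions Googlebot. The function name `googlebot_css_hits` and the two regular expressions are my own assumptions; they only rely on the quoted request and the trailing quoted user agent seen in the excerpt above, so other log layouts may need adjusting. Keep in mind that a user-agent string can be spoofed, so a match is a hint rather than proof.

```python
import re
from urllib.parse import urlsplit

# Matches the quoted request, e.g. "GET /style/corps.css HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[0-9.]+"')
# Matches the last quoted field on the line, assumed to be the user agent
AGENT_RE = re.compile(r'"(?P<agent>[^"]*)"\s*$')


def googlebot_css_hits(lines):
    """Return the paths of .css requests whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        request = REQUEST_RE.search(line)
        agent = AGENT_RE.search(line)
        if not request or not agent:
            continue  # line does not look like an access-log entry
        # Drop any query string so "/a.css?v=2" still counts as a CSS file
        path = urlsplit(request.group("path")).path
        if path.lower().endswith(".css") and "googlebot" in agent.group("agent").lower():
            hits.append(path)
    return hits


if __name__ == "__main__":
    sample = (
        'crawl-66-249-66-82.googlebot.com 27989 0 - [23/Jun/2006:03:32:20 +0200] '
        '"GET /style/corps.css HTTP/1.1" 200 613 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
    )
    print(googlebot_css_hits([sample]))
```

To run it on a real log, pass the open file object instead of the sample list; the path of that log file depends on your server setup.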
AFAIK Google does now crawl external CSS files to look for hidden text and links that can be set up within them.
Bonjour Monty, With a good stats tool, you can see that Googlebot reads your CSS files, say, once a month. This explains why some people still claim that they have never seen it. Google does not need to read these files more often, as they rarely change. Jean-Luc
It doesn't really do it on a regular basis, but according to my logs, Google does look at stylesheets. If you really don't want it to, you can exclude them using robots.txt like everyone said. I always thought excluding them could be perceived as a red flag, but I've seen many non-BH sites excluding them as well. Realistically, I always thought that excluding them, assuming that spiders honor robots.txt scrupulously, makes spamming a lot easier ...
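To see what such an exclusion would actually cover, here is a small sketch using Python's standard `urllib.robotparser`. The rule set, the `/style/` directory (borrowed from the log excerpt earlier in the thread), the `www.example.com` host, and the helper name `is_allowed` are all illustrative assumptions, not anyone's real configuration. It only shows how a well-behaved crawler would read the rules; it says nothing about whether a given spider honors them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of the stylesheet directory
ROBOTS_TXT = """\
User-agent: *
Disallow: /style/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/style/corps.css",
        "http://www.example.com/index.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Blocking a whole directory is just one option; a narrower rule listing individual files would work the same way with this check.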
I stand corrected: I looked at some logs going back to March 30th on a website with light traffic. Googlebot came by a total of 824 times and, lo and behold, there was ONE Googlebot visit that spidered a .css file, on June 22nd ... so this may be a fairly recent thing (?). The IP address was 66.249.65.10 (which appears to be a legit Google IP address). BTW, I did not see any visits from Slurp or msnbot grabbing the external .css file ... so maybe they aren't doing this ... or at least not yet.
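On the question of whether an IP is a legit Google address: the usual check is a reverse DNS lookup followed by a forward lookup of the returned hostname. Here is a rough Python sketch of that idea. The function names and the list of accepted suffixes are my own assumptions, the lookup needs working DNS, and the result for any particular address can change over time, so treat it as an illustration only.

```python
import socket

# Hostname suffixes assumed to belong to Google's crawlers
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """Return True if hostname ends with one of the assumed Google suffixes."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)


def verify_googlebot_ip(ip_address):
    """Reverse-resolve the IP, check the hostname, then confirm it resolves back."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip_address)
        if not is_google_hostname(hostname):
            return False
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        # Covers failed reverse or forward lookups
        return False
    return ip_address in addresses


if __name__ == "__main__":
    # Hostname taken from the log excerpt earlier in the thread
    print(is_google_hostname("crawl-66-249-66-82.googlebot.com"))
```

The forward lookup matters because anyone can point the reverse DNS of their own IP at a name that merely looks like Google's.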
Not sure why I get hit so much, but my CSS file gets pulled more often than that. It really does make sense to crawl CSS, if you can, to look for stuff like: "left:-2000px;" But if you exclude it, SEs can't touch it anyway. Like I said, though, not being a blackhat, I cannot comment on the viability of employing CSS spam in an excluded CSS file or anything.
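Just to illustrate what "looking for stuff like left:-2000px" could mean in practice, here is a naive Python sketch that flags large negative pixel offsets in a stylesheet. This is purely a guess at one possible heuristic, not a description of what Google does; the property list, the 1000px threshold, and the name `find_offscreen_rules` are all assumptions of mine.

```python
import re

# Properties commonly used to push content off-screen (an assumed, incomplete list).
# The lookbehind stops "left" from matching inside names like "padding-left".
OFFSCREEN_RE = re.compile(
    r"(?<![\w-])(?P<prop>left|top|text-indent|margin-left)\s*:\s*-(?P<value>\d+)px",
    re.IGNORECASE,
)


def find_offscreen_rules(css_text, threshold_px=1000):
    """Return (property, offset) pairs whose negative offset is at least threshold_px."""
    findings = []
    for match in OFFSCREEN_RE.finditer(css_text):
        value = int(match.group("value"))
        if value >= threshold_px:
            findings.append((match.group("prop").lower(), -value))
    return findings


if __name__ == "__main__":
    css = ".hidden { position:absolute; left:-2000px; } .ok { left:-5px; top: 10px; }"
    print(find_offscreen_rules(css))
```

Plenty of legitimate techniques (image replacement, for instance) use the same declarations, so a hit like this would need a human look before calling it spam.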
I might be missing the point here, but is there any reason that you wouldn't want Google looking into your client-side includes? It doesn't appear that they are indexed, or has anyone seen otherwise?
Do you mean that CSS files don't appear to be indexed? Some of them (a few) are: http://www.google.com/search?num=100&hl=en&lr=&safe=off&q=filetype%3Acss+style&btnG=Search
Maybe those have direct on-page links to them. Just set the robots.txt and forget about it. There is nothing more that can really be done.