Interesting thoughts from a poster on another forum. This is something to ponder and may make sense. "If my understanding of Caffeine is correct, it doesn't involve an algo change, but SERPs could still be affected by the fact that more data will be collected and fed into the algo. The additional data will be collected by deeper and more frequent crawling activity. It occurred to me that this could theoretically hurt the rankings of sites that depend on backlinks from high PR pages for most of their incoming link juice, but might help sites that have a lot of backlinks from low PR marginal pages."
I got the info from this post. It is a lot of reading but very interesting. "What is the Caffeine Update? Google File System v2 A couple of years ago at the first Seattle Conference on Scalability, Google’s Jeffrey Dean remarked that the company wanted 100x more scalability. Unsurprising given the rapid growth of the web. But there was more to it than that: GFS – the Google File System was running out of scalability. http://storagemojo.com/2009/08/17/google-file-systems-v2-part-1/ Background Here: http://storagemojo.com/google-file-system-eval-part-i/ "
Another comment that really got me thinking: "Index, as referred to by Google, is NOT the pages (URLs) stored in the database (GFS). Index, as referred to by G, is what they show the end user when they conduct a search... A 'noindex' tag does not prevent your page from being spidered and stored. It prevents it from being shown in the results." So does that mean a noindex tag does not help you in your search positions? It just tells Google whether to show the link or not? Interesting question, isn't it?
So in other words you're saying that no matter what you do to try to prevent Google from having the information, they will still take it anyway but just not make that info available to the rest of the public. Does this theory apply to all other search engines as well?
I think if this is true (and I am not 100% positive, I can only go on what I read), then the new changes may bring Google's SERPs closer to those of Bing or Yahoo. We all know that Google only shows a certain number of backlinks. In theory these are generally the backlinks that carry the most weight in determining a page's rank in Google. So if this theory is true and Google under the new infrastructure actually indexes more pages, then naturally a site with a bunch of backlinks, even low PR links, will start to gain rank based on numbers. Now, as far as the noindex tag goes, remember this is different from a rel nofollow tag. The noindex tag is placed on a page so that Google does not index that page. A nofollow tag is placed on a link so Google does not follow the link. Hopefully that makes sense. If all of this is true then it may very well explain the changes I am beginning to see with our older site from 2003. We do have thousands of low PR links from all over.
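To make the noindex vs. nofollow distinction above concrete, here is a minimal sketch in Python. It only shows where each directive lives in the markup: noindex sits in a page-level meta robots tag, while nofollow sits in the rel attribute of an individual link. The sample HTML, the class name and the function name are made up for illustration, and this says nothing about how Google actually processes either directive.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Collects the page-level noindex flag and per-link nofollow flags."""

    def __init__(self):
        super().__init__()
        self.page_noindex = False   # set by <meta name="robots" content="noindex">
        self.followed_links = []    # links without rel="nofollow"
        self.nofollow_links = []    # links carrying rel="nofollow"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # content may hold several comma-separated directives
            content = (attrs.get("content") or "").lower()
            directives = [d.strip() for d in content.split(",")]
            if "noindex" in directives:
                self.page_noindex = True
        elif tag == "a" and attrs.get("href"):
            # rel may hold several space-separated values
            rel_values = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel_values:
                self.nofollow_links.append(attrs["href"])
            else:
                self.followed_links.append(attrs["href"])


def inspect_page(html):
    """Return (page_noindex, followed_links, nofollow_links) for an HTML string."""
    parser = RobotsDirectiveParser()
    parser.feed(html)
    return parser.page_noindex, parser.followed_links, parser.nofollow_links


# Hypothetical page: hidden from results, with one normal and one nofollow link.
SAMPLE_HTML = """
<html>
  <head><meta name="robots" content="noindex"></head>
  <body>
    <a href="/about.html">About us</a>
    <a href="http://example.com/paid" rel="nofollow">Sponsor</a>
  </body>
</html>
"""

if __name__ == "__main__":
    print(inspect_page(SAMPLE_HTML))
```

Note that the page in the sample can still be fetched by a crawler. The meta tag only asks that it not be shown in results, which is the point the quoted comment was making.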
So to continue, what I am seeing on our older site is a huge increase in ranking, up to the level of, and for some keywords above, many top-notch authority sites. The only explanation I have is that in the Caffeine DC Google has indexed a ton more pages that have backlinks pointing to us. We do very well in Yahoo and Bing. Google traffic dropped like a rock in 2006 during Big Daddy, as it did for many others. Now traffic has been rebounding on the normal SERPs, and on Caffeine we are at amazing levels. I love the boost but would really like to get to the bottom of why. We have been trying to figure out for 3 years why we dropped in the first place and there has been no explanation. However, this new theory does make a lot of sense. I would love to hear from others on this one because it is important to document.
A few weeks ago, Google announced the beta launch of Caffeine, the company's next-generation search infrastructure. At that time, Google said that most of the changes in this update were under the hood and that users wouldn't notice a difference in search results. At its core, Caffeine is basically a major overhaul of the Google File System. There have been some discussions about whether this update will bring any other major changes to page rankings or the importance of certain categories in the search results. Summit Media, a UK-based digital marketing agency, compared search results for 9,000 keywords (PDF) in Caffeine and Google's default ('vanilla') search and, interestingly, didn't find any major differences between the two. http://www.readwriteweb.com/archives/what_really_changed_in_googles_caffeine_update.php
Well, I do not know about Summit Media's results, but I do find HUGE differences between the current index and Caffeine, for my site's main keywords at least. I see an impossible drop for certain keywords in the default index, and in Caffeine almost all the first places on page 1. An example with a pretty competitive keyword on a 3 y/o site, my first one: page 7 in the default index (it used to be between positions 1 and 5 on the 1st page), and currently position 4 in the Caffeine index! However, there are also some keywords which keep the same positions in both indexes (not many!). As for the theory of the many low PR links, I have believed in it since I launched my first site. As long as they are related they are welcome, even the n/a ones. The high PRs are more suitable for those who make sites for link-selling. Hope they will implement the Caffeine index completely very soon, otherwise they will ruin my winter season and I'll have to kill myself
I am thinking that they are implementing bits and pieces of Caffeine all of the time. I am sure that someday soon there will be a major implementation and everyone will be talking about it. I am just hoping this is sooner rather than later. I hate it when Google shakes up the SERPs during the Christmas season, as they often do.
Totally agree, those who rely on the extra seasonal income should not be left to the vagaries of major changes in such an important period. But, hey, that is the beauty of online business.
This really does sound like something Google would do. Google are obsessive about hoarding data - even deleted pages that result in a 404. A front-end cloak (blocking user-agent googlebot) is probably the only way to keep Google from seeing the content of a page.
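A rough sketch of the user-agent block mentioned above, written as a tiny Python WSGI app. This is only an illustration of the idea, not a recommendation: the function names are made up, the check is a plain substring match on the User-Agent header, and that header is trivial to spoof, so it only stops crawlers that identify themselves honestly.

```python
def is_googlebot(user_agent):
    """True if the User-Agent string identifies itself as Googlebot.

    Assumption: a case-insensitive substring match is good enough here.
    """
    return "googlebot" in (user_agent or "").lower()


def app(environ, start_response):
    """Minimal WSGI app: refuse self-identified Googlebot, serve everyone else."""
    if is_googlebot(environ.get("HTTP_USER_AGENT")):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Page content for regular visitors"]


if __name__ == "__main__":
    # Serve locally with the standard library's reference WSGI server.
    from wsgiref.simple_server import make_server

    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```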
What makes this discussion even more interesting is that, all of a sudden, Google has very recently started listing a bunch of pages that were removed 4 years ago as "page not found" in Webmaster Tools. What took them 4 years to discover this? Hmm... the theory is starting to make more sense. Seems as though they are trying to index every possible page they can.
I don't know whether it is the Google Caffeine update or not, but my site is getting more visitors from Google than before.
I have pages in my root directory that aren't linked from anywhere, externally or internally. They have scripts in them that run a MySQL DB entry when they are run and store the user details. Google is crawling those files on my server!
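For the unlinked script files described above, one thing worth checking is whether robots.txt actually disallows them. Below is a small sketch using Python's standard urllib.robotparser. The robots.txt contents, the script name store_user.php and the example.com URLs are all hypothetical. Keep in mind that robots.txt is only a request that well-behaved crawlers honor, and it does not stop anyone from requesting the URL directly, so a script that writes to the database probably should not run on a plain GET in the first place.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks all crawlers to stay away from one script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /store_user.php
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/store_user.php", "http://example.com/index.html"):
        print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```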