OK, before long this is going to happen, and it's already happening. Google knows that its algorithm can easily be manipulated by off-page factors such as buying links, buying shitloads of articles and publishing them everywhere, reciprocal linking, and other manipulative tactics that they recognize as a problem with their algo. So basically, before long, this is how it's going to work, and I believe we are seeing some of it in testing right now!

As you may or may not know, Anna Patterson is the inventor named on many Google patents; she previously worked on search for the Internet Archive, whose collection is bigger than Google's index. Here is what some of the patent says:

"An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. Related phrases and phrase extensions are also identified. Phrases in a query are identified and used to retrieve and rank documents. Phrases are also used to cluster documents in the search results, create document descriptions, and eliminate duplicate documents from the search results, and from the index."

The system is designed to (a) comprehensively identify phrases in a large-scale corpus, (b) index documents according to phrases, (c) search and rank documents in accordance with their phrases, and (d) provide additional clustering and descriptive information about the documents.

3 primary factors of the patent and ranking algo:
1. Identification of phrases and related phrases,
2. Indexing of documents with respect to phrases, and
3. Generation and maintenance of a phrase-based taxonomy.

Good phrases stand out because they:
1. Appear in more than a certain percentage of the documents on the web, and/or
2. Are distinguished in appearance, by being marked up with HTML tags or "other morphological, format, or grammatical markers," and/or
3. Predict other good phrases, rather than being mere sequences of words appearing in the lexicon.

Functional stages of the phrase identification process. These can be broken down into three steps:
1. Collect possible and good phrases, along with frequency and co-occurrence statistics of the phrases,
2. Classify possible phrases as either good or bad phrases based on frequency statistics, and
3. Prune the good phrase list based on a predictive measure derived from co-occurrence statistics.

To identify good phrases, a predictive measure is used which expresses the increased likelihood of one phrase appearing in a document given the presence of another phrase.

The patent describes in detail rankings based upon:
- Contained phrases
- Anchor phrases
- Date range relevance

So to put it bluntly, if your on-page optimization isn't what they are looking for, then you're shit out of luck! No off-page optimization, like buying high-PageRank links, reciprocal linking, or articles, is going to help more than phrase-based LSI (co-occurrence) keywords. Why would they want to do this? Simple: they know there are too many off-page factors that can easily manipulate their algo, so why not just base it on this: if your page sucks and is not related, then your rankings will suck as well. Everyone has already seen some of this happening with the very familiar Google bombing, like searching "miserable failure" and George Bush's site popping up. George Bush no longer shows up, because of the beginning phases of co-occurrence and a more human-based algorithm.
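To make that predictive-measure idea concrete, here's a rough sketch in Python of how a co-occurrence "lift" might be computed over a toy corpus: how much more likely phrase B is to appear in a document given that phrase A appears, compared to B's baseline rate. This is my own illustration of the concept, not code from the patent; the function name, the documents-as-phrase-sets representation, and the threshold are all made up for the example.

```python
from collections import Counter
from itertools import combinations

def predictive_measures(docs, candidate_phrases, threshold=1.0):
    """docs: list of documents, each represented as a set of phrases.
    Returns {(a, b): lift} for ordered pairs where the presence of phrase
    `a` raises the likelihood of phrase `b` by more than `threshold`."""
    n_docs = len(docs)
    doc_freq = Counter()   # how many documents contain each phrase
    pair_freq = Counter()  # how many documents contain both phrases

    for phrases in docs:
        present = phrases & candidate_phrases
        doc_freq.update(present)
        # sorted() gives each unordered pair a single canonical key
        pair_freq.update(combinations(sorted(present), 2))

    measures = {}
    for (a, b), co in pair_freq.items():
        # lift(x -> y) = P(y | x) / P(y); a value above 1 means x
        # predicts y, which is roughly what the patent means by one
        # phrase "predicting" another good phrase.
        for x, y in ((a, b), (b, a)):
            lift = (co / doc_freq[x]) / (doc_freq[y] / n_docs)
            if lift > threshold:
                measures[(x, y)] = lift
    return measures

# Toy corpus: "miserable failure" and "george bush" co-occur more often
# than chance, echoing the Google-bombing example above.
docs = [
    {"miserable failure", "george bush"},
    {"miserable failure", "george bush", "white house"},
    {"white house", "press release"},
    {"miserable failure"},
]
candidates = {"miserable failure", "george bush", "white house"}
print(predictive_measures(docs, candidates))
# {('george bush', 'miserable failure'): 1.33..., ('miserable failure', 'george bush'): 1.33...}
```

The real system would obviously run over billions of documents and use more sophisticated statistics, but the co-occurrence bookkeeping would look something like this.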
It also ties into duplicate content detection, which has obviously been going on for quite some time, but this is getting way more advanced than that. The algo is going human! So my advice is to work on on-page factors, not off-page, because before long, just like the patent says, if your site sucks then your rankings will too! Some people's sites are clearly dropping off the map as we speak. This could be the beginning of "Phrase Rank"!
Thanks for the information. Even if this is adopted soon, it simply means G is weighting content more heavily than before, or phrases over keywords, or relevancy over sheer volume of links. In other words, some long-overdue tweaking to their PR algorithm. Not necessarily a new era for page ranking.
Exactly, inbound links will be almost worthless, because they know people can buy, trade, and sell them. And just because you have a crapload of inbound links with the correct anchor text doesn't mean your site is relevant; it means it's been manipulated. They're going to a more human algorithm. So off-page optimization will not be as effective, or not effective at all, soon!
Contextual links = good business for content-specific websites or blogs. This is a good bit of info. Thanks!
Yes, when I say link relevancy I should have said page relevancy, meaning they are judging the quality of your inbound links by the content on the pages linking to you, not the anchor text. I've always thought that's the way it should work, and that's how I currently trade links. I only trade with sites that match my site in terms of content.
Yes, that's how you should trade anyway, if you want to have a good list of related resources that will provide a great user experience.
Interesting article. Hopefully, if G implements this, we should get better or more relevant results when we search. This change will help those SEOs who know how to optimize on-page content.
It sounds like it's describing Latent Semantic Indexing. It's been said for a while that this is the way it's going to go. This will make site promotion easier and cheaper than it already is.
Yep, LSI and co-occurrence. But it won't be cheaper if you don't know what you are doing and don't have the market intelligence to make critical vertical market analysis drilldowns. The algorithm will be based on proven... I'll leave this out.
Then what's up with the rep, baby! This is not it. I have tons of information about the way Google is going, and little tips and tricks on how you can do vertical market analysis drilldowns, and on how you don't have to settle for the long tail: you can swallow your whole market.
I disagree. Inbound links will still be important, just not "as" important. That patent is not for the whole "search" algo; it is just a part of it. Besides, Google is not going to drastically change its results like that, because it knows people will move elsewhere, to Yahoo etc. And did you know Google only has some 45% of the US market share? It isn't as big as some of us think, and it can't afford to lose any of that. If anything, I think Google will give more weight to well-optimized pages. And it will be matching phrases: the phrases people type against the phrases you have. Sounds much like how it used to be on Yahoo years ago.
You mention this "human" algo thing. You should know that Google has a team of hundreds who manually check websites for what they call spam, meaning any site breaking Google's policies. The queue of pages to be checked is huge and the workers stay busy, at times working full time to check hundreds of pages each day (per person). So this human thing you are mentioning doesn't have any correlation with this patent. I know what you are referring to, but it is not "more" human than an actual human checking the pages that Googlebot (or any other robot) flags.