Latent Semantic Indexing


glickman.mcclain007
Oct 20th 2008, 8:05 pm
I recently wrote a few articles on LSI, latent semantic indexing, and it was quite fascinating. Does anyone know how much Google cares about it? Is it extremely important?

vstar
Oct 20th 2008, 8:19 pm
Although Google's algorithm patent DOES NOT specifically mention LSI, it does describe a similar system.

Here is a paragraph from their patent:

"The system is further adapted to identify phrases that are
related to each other, based on a phrase's ability to predict
the presence of other phrases in a document. More specifically,
a prediction measure is used that relates the actual co-occurrence
rate of two phrases to an expected co-occurrence rate
of the two phrases. Information gain, as the ratio of actual
co-occurrence rate to expected co-occurrence rate, is one such
prediction measure. Two phrases are related where the prediction
measure exceeds a predetermined threshold. In that case, the
second phrase has significant information gain with respect to
the first phrase. Semantically, related phrases will be those
that are commonly used to discuss or describe a given topic or
concept, such as 'President of the United States' and 'White
House.' For a given phrase, the related phrases can be ordered
according to their relevance or significance based on their
respective prediction measures."
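
To make the quoted measure concrete, here is a rough Python sketch of the ratio it describes. This is only an illustration, not Google's implementation: the function name `related_phrases`, the toy documents, and the threshold of 1.5 are all made up, and it assumes the "expected" co-occurrence rate is what you would get if the two phrases appeared independently of each other (the quoted paragraph does not say how the expected rate is computed).

```python
from itertools import combinations


def related_phrases(docs, phrases, threshold=1.5):
    """Return phrase pairs whose actual/expected co-occurrence ratio exceeds threshold.

    Hypothetical helper: documents are plain lowercase strings and a phrase
    "occurs" in a document if it appears as a substring.
    """
    n = len(docs)
    # Fraction of documents that contain each phrase.
    rate = {p: sum(p in d for d in docs) / n for p in phrases}
    related = {}
    for a, b in combinations(phrases, 2):
        # Actual rate: fraction of documents containing both phrases.
        actual = sum(a in d and b in d for d in docs) / n
        # Expected rate, ASSUMING the phrases occur independently.
        expected = rate[a] * rate[b]
        if expected > 0:
            gain = actual / expected
            if gain > threshold:
                related[(a, b)] = gain
    return related


docs = [
    "the president of the united states spoke at the white house",
    "the white house welcomed the president of the united states",
    "stock markets fell sharply today",
    "stock markets rallied after the report",
]
phrases = ["president of the united states", "white house", "stock markets"]

related = related_phrases(docs, phrases)
for pair, gain in related.items():
    print(pair, gain)
```

On this toy data only the 'president of the united states' / 'white house' pair clears the threshold, since the two phrases appear together twice as often as independence would predict.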

SEO_WatchDog
Oct 21st 2008, 12:56 am
Well, LSI is certainly the basis of any search engine; they naturally improve on it further...

seodilip
Oct 21st 2008, 5:56 am
Latent Semantic Indexing (LSI) is an information retrieval method that improves your ability to find relevant information. Using powerful, fully automatic statistical algorithms, LSI can retrieve relevant documents even when they do not share any words with your query: concepts replace keywords to improve retrieval. Latent semantic indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole to see which other documents contain some of those same words. LSI considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant.

Michael
Oct 21st 2008, 1:01 pm
Google does not use LSI because LSI does not give good results on large non-homogeneous document collections.

You will find that those who believe that Google is using LSI do not know what LSI actually is...

Information retrieval experts like Dr Garcia and others have been debunking the 'Google uses LSI' myth (http://irthoughts.wordpress.com/2007/05/03/latest-seo-incoherences-lsi/) for years.

- Michael