Latent Semantic Indexing

Discussion in 'Link Development' started by glickman.mcclain007, Oct 20, 2008.

  1. #1
    I recently wrote a few articles on latent semantic indexing (LSI) and found the topic fascinating. Does anyone know how much Google cares about it? Is it extremely important?
     
    glickman.mcclain007, Oct 20, 2008 IP
  2. vstar

    vstar Well-Known Member

    #2
    Although Google's algorithm patent does NOT specifically mention LSI, it does describe a similar system.

    Here is a paragraph from their patent:

    "The system is further adapted to identify phrases that are
    related to each other, based on a phrase's ability to predict
    the presence of other phrases in a document. More specifically,
    a prediction measure is used that relates the actual co-occurrence
    rate of two phrases to an expected co-occurrence rate
    of the two phrases. Information gain, as the ratio of actual
    co-occurrence rate to expected co-occurrence rate, is one such
    prediction measure. Two phrases are related where the prediction
    measure exceeds a predetermined threshold. In that case, the
    second phrase has significant information gain with respect to
    the first phrase. Semantically, related phrases will be those
    that are commonly used to discuss or describe a given topic or
    concept, such as 'President of the United States' and 'White
    House.' For a given phrase, the related phrases can be ordered
    according to their relevance or significance based on their
    respective prediction measures."
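
    To make the quoted prediction measure a bit more concrete, here is a rough Python sketch of the "actual vs. expected co-occurrence" ratio on a toy document set. The quote does not say how the expected rate is computed, so this assumes it is the product of each phrase's individual document frequency (i.e. the rate you would see if the phrases were independent). The function names, the plain substring matching, and the 1.5 threshold are all made up for illustration and are not taken from the patent.

```python
def cooccurrence_ratio(docs, phrase_a, phrase_b):
    """Ratio of the actual to the expected co-occurrence rate of two phrases."""
    n = len(docs)
    if n == 0:
        return 0.0
    a, b = phrase_a.lower(), phrase_b.lower()
    has_a = [a in d.lower() for d in docs]
    has_b = [b in d.lower() for d in docs]
    # Actual rate: fraction of documents that contain both phrases.
    actual = sum(1 for x, y in zip(has_a, has_b) if x and y) / n
    # Expected rate (assumption): product of the individual rates,
    # i.e. what independence between the two phrases would predict.
    expected = (sum(has_a) / n) * (sum(has_b) / n)
    if expected == 0:
        return 0.0
    return actual / expected


# Hypothetical threshold; the patent only says "predetermined".
RELATED_THRESHOLD = 1.5


def is_related(docs, phrase_a, phrase_b, threshold=RELATED_THRESHOLD):
    """Two phrases count as related when the ratio exceeds the threshold."""
    return cooccurrence_ratio(docs, phrase_a, phrase_b) > threshold


docs = [
    "the president of the united states spoke at the white house",
    "a white house briefing quoted the president of the united states",
    "apple pie recipe with cinnamon",
    "how to bake an apple pie",
]

print(cooccurrence_ratio(docs, "president of the united states", "white house"))  # 2.0
print(is_related(docs, "apple pie", "white house"))  # False
```

    On this toy set the two political phrases co-occur twice as often as independence would predict, so they pass the made-up threshold, while "apple pie" and "white house" never co-occur at all.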
     
    vstar, Oct 20, 2008 IP
  3. SEO_WatchDog

    SEO_WatchDog Well-Known Member

    #3
    Well, LSI is certainly the basis of any search engine; they naturally improve on it further...
     
    SEO_WatchDog, Oct 21, 2008 IP
  4. seodilip

    seodilip Active Member

    #4
    Latent Semantic Indexing (LSI) is an information retrieval method that improves your ability to find relevant information. Using fully automatic statistical algorithms, LSI can retrieve relevant documents even when they share no words with your query: concepts replace keywords to improve retrieval. Latent semantic indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole to see which other documents contain some of those same words. LSI considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant.
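
    For anyone who wants to see the idea in action, below is a minimal sketch of textbook LSI using NumPy: build a term-document count matrix, take a truncated SVD, fold the query into the reduced "concept" space, and rank documents by cosine similarity. The toy corpus, the raw-count weighting, the choice of k=2, and the function names are all assumptions for illustration. This says nothing about what any real search engine does.

```python
import numpy as np

# Toy corpus (made up): two "car" documents and three "baking" documents.
docs = [
    "car engine repair",
    "automobile engine repair",
    "apple pie recipe",
    "cherry pie recipe",
    "pie recipe baking",
]


def build_term_document_matrix(docs):
    """Return a (terms x documents) raw count matrix and a word -> row index map."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0
    return A, index


def lsi_similarities(docs, query, k=2):
    """Cosine similarity between the query and each document in a k-dim concept space."""
    A, index = build_term_document_matrix(docs)
    # Truncated SVD: keep only the k largest singular values and vectors.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T  # V_k has one row per document

    # Build the query's term vector, ignoring words outside the vocabulary.
    q = np.zeros(A.shape[0])
    for w in query.split():
        if w in index:
            q[index[w]] += 1.0

    # Fold the query into the concept space: q_hat = q^T U_k S_k^{-1}.
    q_hat = (q @ U_k) / s_k

    norms = np.linalg.norm(V_k, axis=1) * np.linalg.norm(q_hat)
    norms = np.maximum(norms, 1e-12)  # guard against division by zero
    return (V_k @ q_hat) / norms


sims = lsi_similarities(docs, "car")
for doc, sim in zip(docs, sims):
    print(f"{sim:+.2f}  {doc}")
```

    In this toy corpus the query "car" shares no words with "automobile engine repair", yet that document should score about as high as "car engine repair" because both sit on the same concept axis, while the baking documents score near zero. Real collections are far messier than this, so treat it as a demonstration of the mechanism only.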
     
    seodilip, Oct 21, 2008 IP
  5. Michael

    Michael Raider

    #5

    Google does not use LSI because LSI does not give good results on large non-homogeneous document collections.

    You will find that those who believe that Google is using LSI do not know what LSI actually is...

    Information retrieval experts like Dr Garcia and others have been debunking the 'Google uses LSI' myth for years.

    - Michael

     
    Michael, Oct 21, 2008 IP