Could someone explain what Latent Semantic Indexing (LSI) is, and how it relates to SEO? Please help me.
It's a way in which words can be considered in context, rather than in isolation. The problem with evaluating words in isolation is that it is vulnerable to keyword stuffing. LSI is about giving weight to words used in the way they are naturally used: it takes into account synonyms and words that are often used with each other. How much it is actually used now is a matter of debate, I think, but it is certainly something the major search engines would like to use more in the future, as it gives more of an advantage to genuine content than to keyword tricks or automatically generated content.
In simple terms, it's a word relationship technology that tells a search engine bot what a page is about, so that the bot not only can index it within the right category but can also use the same technology to assign values to backlinks (whether they are in a defined neighborhood). If you have a site related to headaches, then the word Tylenol is a related word for your site. Or if you exchange links with a site that is related to Tylenol, then both sites are related.
...and can someone tell me why we should care about LSI??? Why not Phrase Based Indexing and Retrieval? Why not probabilistic latent semantic analysis? Why not Latent Dirichlet Allocation? How about Hidden Topic Markov Models?? Be careful not to get on the LSI bandwagon too quickly.
It's just the talk that has been going on since the Brandy update, when everyone was thinking Google could be using LSI in their new algo. While no SE is currently known to be using LSI, Google is known to be using some kind of word relationship technology to rank sites based on keywords. For example: the word camera is semantically associated with Canon http://www.google.com/search?hl=en&q=~camera
he he.... bing bing... we have a winner!!! The last we heard, as far as semantic relations and defining concepts go, was Phrase Based Indexing and Retrieval and Probabilistic Learning Models, which are both somewhat related Google patent series. The engineers seemed keen on Hidden Topic Markov Models a while ago. Applied Semantics was purchased, from what we know, for the AdSense program back in 2003 (I think), thus the interest SEOs had in LSI, as it was a technology of that company. BUT the author of the Phrase Based patents, Anna Patterson, was hired by Google long after that, and those patents surfaced in 2007..... If an SEO claims LSI tactics, he is simply telling you that he is inexperienced.
For many specific keyword searches, you will notice Yahoo has millions of results, while Google may have hundreds of thousands. It has only been this way for maybe the past eight or so months. Not long ago, Google would also have had millions of results. That leads me to believe that LSI has at least been started at Google, and will be adopted more heavily by other search engines as they advance to give their users more relevant search.
Dood... did ya even stop long enough to read this (which I posted already) before making such a statement? I am giving you gold here mate... try reading the references first.... can't hurt now can it? It has been shown to most likely not be able to scale mathematically to deal with the loads of RI.... ever think about why they went out and hired Anna 2 years AFTER the purchase of Applied Semantics? Just a coincidence she was building a phrase based algo for her search engine... hmmmmmm Stop..read..think... then decide.
Sure, and feel free to email/pm if you have an interest in technical garble.... it is certainly daunting at times. Or simply read stuff like my blog, Bill Slawski, Michael Martinez... search theory stuff is enlightening once you get the hang of it. Not sure if I shall be back round these parts fer a while... just don't get the time to play on forums as much as I used to. Want real fun? Start thinking user performance metrics... interesting stuff...
I can't wait till they start rating a user based on how fast they can read. Current basic searching is done with ten measly results. When China steps in, with their billions, and ....well... let me stop there. I think Google is going to "web2.0" the whole web. They almost have already. That should be fun... Well, give us a shout when you get back. Thanks!
No, it was not a technology of that company. Applied Semantics patented a proprietary ontology called CIRCA that had nothing to do with LSI. If anyone believes that Google is using LSI then we can safely assume that they don't know what LSI really is.... - Michael
Hello thegypsy, I was not aware of the information about LSI that you posted in those links. Thanks for sharing. LSI is still a big topic for discussion.
Latent semantic indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole, to see which other documents contain some of those same words. LSI considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant. This simple method correlates surprisingly well with how a human being, looking at content, might classify a document collection. Although the LSI algorithm doesn't understand anything about what the words mean, the patterns it notices can make it seem astonishingly intelligent....
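To make the "many words in common = semantically close" idea concrete, here is a rough Python sketch. Note it only does the word-overlap comparison described above (cosine similarity on raw word counts); it is NOT full LSI, which would additionally run a truncated SVD over the whole term-document matrix. The sample documents and the names term_counts and cosine_similarity are made up for illustration.

```python
import math
from collections import Counter


def term_counts(text):
    """Lowercase the text, split on whitespace, and count each word."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    # Only words present in both documents contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical three-document collection (illustration only).
docs = {
    "headache_1": "tylenol relieves headache pain fast",
    "headache_2": "headache pain relief with tylenol",
    "camera_1": "canon camera lens review",
}
vectors = {name: term_counts(text) for name, text in docs.items()}

close = cosine_similarity(vectors["headache_1"], vectors["headache_2"])
distant = cosine_similarity(vectors["headache_1"], vectors["camera_1"])
print("headache_1 vs headache_2:", round(close, 2))
print("headache_1 vs camera_1:", round(distant, 2))
```

The two headache pages share three of their five words, so they score as close, while the camera page shares none and scores zero. Plain overlap like this treats "relieves" and "relief" as unrelated words; picking up that kind of hidden relationship is what the SVD step in real LSI is meant to do.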
Really? Do tell.... Have you even bothered to READ the posts in this thread mate? Can you show us evidence of ANY search engine that employs LSI?
I read your page, where you said "Who cares?" about what Google is doing. You have a good point, especially since people are using social bookmarking to get around a lot these days. Just when will computers get around to understanding sentences?? ~