

Misspellings and synonyms


SEO Guy
May 1st 2004, 10:05 pm
Here is my situation: a rather large software company that uses Google for its internal search function has asked me to help build a "learning" database that will allow results to be returned in search even when the query is not an exact match.

We currently have a database of many of the common misspellings (those typed in often enough to get a lot of attention), but they want a much more extensive list of common misspellings, and also synonyms, so that they can match and offer results much more comprehensively. We are currently looking for any sites, techniques, software and suggestions for harvesting these common misspellings and synonyms, and your help is greatly appreciated. If anyone knows of such resources for misspellings etc., please email me, post or IM.

Moving forward, they want to build a much larger database that incorporates some sort of "learning" function so that we can constantly and dynamically update the database, and they hope to have it all (or mostly) automated. I am thinking of programming it so that once a keyword has been entered more than a threshold number of times it is flagged, but my system would still require manual review of the flagged terms in order to match them up with the appropriate product, and this could be daunting as there are thousands of products. Any thoughts on streamlining this process would be appreciated as well.
Cheers
SEO Guy
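
The threshold idea in the post above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the class name `QueryFlagger`, the default threshold of 3, and the lower-casing of queries are all hypothetical choices, not anything the poster described. It counts unmatched queries and flags a term the first time its count reaches the threshold, so that only flagged terms reach the manual review queue.

```python
from collections import Counter


class QueryFlagger:
    """Counts unmatched search queries and flags those seen often enough."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # hypothetical cut-off; tune to real traffic
        self.counts = Counter()
        self.flagged = set()

    def record(self, query):
        """Record one unmatched query; return True when it first becomes flagged."""
        term = query.strip().lower()  # assumption: case and padding are irrelevant
        self.counts[term] += 1
        if self.counts[term] >= self.threshold and term not in self.flagged:
            self.flagged.add(term)
            return True
        return False

    def review_queue(self):
        """Flagged terms for manual review, most frequent first."""
        return sorted(self.flagged, key=lambda t: (-self.counts[t], t))
```

Sorting the queue by frequency is one possible way to ease the review burden, since the reviewer sees the most common unmatched terms first.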

schlottke
May 1st 2004, 11:13 pm
MSN's search has this capability, and it works pretty well. I'd try to piggy-back on their technology, perhaps writing a script that takes all of the words in the English dictionary for starters and creates misspellings based on nearby keys, addditional letters ;) , and other similar ideas. Any way you do it, it will be a chore.

HTH
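
A script along the lines suggested above might look something like this Python sketch. The keyboard layout (`ROWS`, plain QWERTY with same-row neighbours only) and the function names are assumptions made for illustration. Dropped and swapped letters are included as guesses at the "other similar ideas", not something the post spelled out.

```python
# Assumption: a plain QWERTY layout, considering only same-row neighbours.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def neighbors(ch):
    """Return the keys directly left and right of ch on its keyboard row."""
    for row in ROWS:
        i = row.find(ch)
        if i != -1:
            return [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return []  # character not on the letter rows


def misspellings(word):
    """Generate single-error variants of word."""
    word = word.lower()
    variants = set()
    for i, ch in enumerate(word):
        for n in neighbors(ch):
            variants.add(word[:i] + n + word[i + 1:])  # nearby-key substitution
        variants.add(word[:i] + ch + word[i:])  # additional (doubled) letter
        variants.add(word[:i] + word[i + 1:])  # dropped letter (assumed extra)
        if i + 1 < len(word):
            # swapped adjacent letters (assumed extra)
            variants.add(word[:i] + word[i + 1] + ch + word[i + 2:])
    variants.discard(word)  # a variant identical to the input is not a misspelling
    return variants
```

Some generated variants will themselves be real words ("cat" yields "car", for instance), so the output would still need filtering against the dictionary before it goes into a misspellings database.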

hans
May 1st 2004, 11:42 pm
Another source, BESIDES site owners with their access_log as a source of misspelled words entered to find their site,

is the many spell checkers available on the market.
Whenever a spell-check utility offers a selection of corrected words, it has a matching misspelling in its db,
including spell checkers from the Linux world and browsers, of course!

If that company goes PUBLIC with its name and URL and offers a tool for EASY submission of common website-relevant misspellings (by email?), then I may also submit, and other site owners may as well.

It depends on WHO owns the db/SE and whether inclusion is paid or FREE.

Owlcroft
May 3rd 2004, 7:47 pm
[T]hey want a much more extensive list of common misspellings and also synonyms ... so that they are able to match and offer results much more comprehensively. We are currently looking for all sites, techniques, software and suggestions as to harvesting these common misspellings and synonyms . . . .
Two places you can look for help with such matters are a pair of related but distinct Usenet groups:

alt.usage.english

and

alt.english.usage

Their memberships overlap a fair bit, but each is worth trying. I have seen all sorts of strange language-related data and databases that one or another regular there knew of.

nlopes
May 9th 2004, 3:29 am
One common option is to use PHP with Pspell, which uses a dictionary to make corrections and suggestions.
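
For comparison, the same dictionary-based idea can be sketched in Python using the standard library's `difflib`. This is not Pspell and does not reproduce its behaviour; the function name `suggest`, the word list, and the 0.8 similarity cutoff are all assumptions for illustration.

```python
import difflib


def suggest(word, dictionary, max_suggestions=3, cutoff=0.8):
    """Return close dictionary matches for word, or [] if it is spelled correctly.

    dictionary is any collection of known-good words (assumed lower-case).
    cutoff is difflib's similarity ratio in [0, 1]; 0.8 is an arbitrary choice.
    """
    word = word.lower()
    if word in dictionary:
        return []  # already a known word, nothing to correct
    return difflib.get_close_matches(
        word, dictionary, n=max_suggestions, cutoff=cutoff
    )


# Hypothetical word list; a real system would load a full dictionary
# plus the site's own product names.
words = ["suggestions", "corrections", "dictionary"]
print(suggest("sugestions", words))
```

Each accepted suggestion gives a (misspelling, correction) pair that could be fed back into the misspellings database discussed earlier in the thread.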