"did you mean" - zend search lucene

Discussion in 'PHP' started by ramachandran, Jul 23, 2010.

  1. #1
    How can I implement "did you mean" suggestions for misspelled terms in a search query?

    Does anybody have any idea how to create this?
     
    ramachandran, Jul 23, 2010
  2. Deacalion

    #2
    There are quite a few ways to go about this. Will you base it on a dictionary or a field in your database?

    If it's based on a field in your database (keywords, name, etc.), you could read all those fields, generate the most common misspellings of those words, and place them in a separate table that links each misspelling to the correct spelling. If you want it to work with every word in the dictionary, you could have a table with every English word. When a search is performed, you would query the database for that word; if you get no rows back, it can be classed as a misspelling, so you would work out the Levenshtein distance to the dictionary words and show the closest match.
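
The dictionary approach described above could be sketched roughly as follows. This is only an illustration, not a tested implementation: the function name closestWord, the $maxDistance cutoff, and the in-memory word list (standing in for the database table) are all assumptions. It relies on PHP's built-in levenshtein() function.

```php
<?php
// Hypothetical helper: returns the dictionary word closest to $term,
// or null if $term is already in the dictionary or nothing is close enough.
function closestWord($term, array $dictionary, $maxDistance = 2)
{
    $term = strtolower($term);

    // A known word is treated as correctly spelled: nothing to suggest.
    if (in_array($term, $dictionary, true)) {
        return null;
    }

    $best = null;
    $bestDistance = $maxDistance + 1; // anything further away is ignored

    foreach ($dictionary as $word) {
        $distance = levenshtein($term, $word);
        if ($distance < $bestDistance) {
            $best = $word;
            $bestDistance = $distance;
        }
    }

    return $best;
}

// Stand-in for the dictionary table (assumed sample data).
$dictionary = array('computer', 'commuter', 'compute', 'printer');
$suggestion = closestWord('compputer', $dictionary);
```

With a real dictionary table, comparing against every row would be slow, so it would probably make sense to narrow the candidates first (for example by first letter or word length) before computing distances.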
     
    Deacalion, Jul 23, 2010
  3. ramachandran

    #3

    Thanks for your idea.

    But I need to retrieve the "did you mean" suggestions from the Zend_Search_Lucene index.
     
    ramachandran, Jul 23, 2010
  4. Deacalion

    #4
    Ahh.. I missed that. You could do a find() and count the results. If the count is zero, do a fuzzy search by calling find() again with ~ appended to the term, and order the hits by score.
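
A minimal sketch of that fallback, assuming $index is the object returned by Zend_Search_Lucene::open(). The function name searchWithFuzzyFallback is made up for illustration, and only find() and the ~ query syntax mentioned above are used. As far as I know, find() already returns hits ordered by score by default, so no extra sorting is done here.

```php
<?php
// Hypothetical helper: try the term as typed, then fall back to a fuzzy query.
// $index is assumed to be a Zend_Search_Lucene index; only find() is called.
function searchWithFuzzyFallback($index, $term)
{
    // First attempt: search for the term exactly as the user typed it.
    $hits = $index->find($term);
    if (count($hits) > 0) {
        return $hits;
    }

    // No hits: a trailing ~ turns the term into a fuzzy query.
    return $index->find($term . '~');
}
```

Usage would then look something like $hits = searchWithFuzzyFallback($index, 'compputer'); with $index opened beforehand.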
     
    Deacalion, Jul 23, 2010
  5. ramachandran

    #5
    Thanks again,

    When I use the fuzzy search, it finds both the misspelled and the correctly spelled terms. The search result is an array of Zend_Search_Lucene_Search_QueryHit objects. In the following example, each hit exposes two fields from the corresponding document: title and content.

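    // find() returns an array of Zend_Search_Lucene_Search_QueryHit objects.
    $hits = $index->find($query);

    foreach ($hits as $hit) {
        echo $hit->id;       // document number within the index
        echo $hit->score;    // relevance score of this hit
        echo $hit->title;    // stored "title" field of the document
        echo $hit->content;  // stored "content" field of the document
    }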

    My question is this: I don't want the entire document's values. I need the correctly spelled terms from the Lucene index.

    Example:

    Misspelled query term: compputer

    Did you mean: computer
     
    ramachandran, Jul 23, 2010
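
One possible way to get a term rather than whole documents, sketched under some assumptions: it relies on the index's terms() method returning term objects with public field and text properties (as Zend_Search_Lucene_Index_Term does, as far as I know), and compares each indexed term to the query with PHP's levenshtein(). The function name suggestTerm, the default field 'content', and the $maxDistance cutoff are hypothetical choices for illustration.

```php
<?php
// Hypothetical helper: suggest the indexed term closest to a misspelled one.
// $index is assumed to expose terms(), returning objects with ->field and ->text.
function suggestTerm($index, $misspelled, $field = 'content', $maxDistance = 2)
{
    $misspelled = strtolower($misspelled);
    $best = null;
    $bestDistance = $maxDistance + 1; // terms further away are ignored

    foreach ($index->terms() as $term) {
        // Only look at terms from the field being searched.
        if ($term->field !== $field) {
            continue;
        }

        $distance = levenshtein($misspelled, $term->text);
        if ($distance === 0) {
            return null; // the term is in the index, so nothing to suggest
        }
        if ($distance < $bestDistance) {
            $best = $term->text;
            $bestDistance = $distance;
        }
    }

    return $best;
}
```

For the example above, suggestTerm($index, 'compputer') would be expected to return 'computer' if that term is indexed in the chosen field. Note that terms() loads every term of the index, so on a large index it may be worth caching the term list rather than scanning it on each search.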