Phrase Matching (for similarity?)

Discussion in 'PHP' started by Jexley, Nov 11, 2009.

  1. #1
    Hey kids,
    I've got a tool that does magical SEO things to keyword traffic on your site.

    I'm looking to have about 5-10 keywords in a database table and then when someone plugs something into Google, it'll check how close that something is to the words you've got in your DB.

    Example: You've got "web design perth" in your database, but someone typed in "perth web people that are designers that don't suck".

    I'd like to be able to run that very-long-tailed keyword through a filter that compares it with "web design perth" and tells me whether they're connected (and maybe even to what degree).

    Anybody got anything out there? I'd be happy to pay you in SEO efforts, charm or lollies if need be.

    Smooches.
    -Judd
     
    Jexley, Nov 11, 2009
  2. n3r0x

    #2
    First, I should say this is a fairly complex system you want. Determining the degree to which two strings are connected requires several comparisons, not just one.

    I can only see one way I would do it, though:

    1. Split the search phrase collected from Google into words.
    2. Run a database query that searches the keyword column for each word, padded with leading and trailing spaces so that only whole words match.
    3. Extract the keywords that matched (if you have multiple rows of keywords).
    4. Now for the more complex part: scoring the match.
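
    As a rough sketch of steps 1 and 2 above: the table name `keywords`, the column name `keyword`, and both helper functions are assumptions made up for illustration, and the SQL uses MySQL-style CONCAT(). It only builds the query and its parameters, so it can be handed to whatever database layer is in use.

```php
<?php
// Hypothetical schema: a table `keywords` with a text column `keyword`.
// Both names are assumptions for illustration only.

// Step 1: split the search phrase into lowercase words.
function splitIntoWords(string $phrase): array
{
    $words = preg_split('/\s+/', strtolower(trim($phrase)), -1, PREG_SPLIT_NO_EMPTY);
    // Drop duplicates so each word is only searched for once.
    return array_values(array_unique($words));
}

// Step 2: build a parameterised query that matches any whole word.
// Padding both the column and the word with spaces avoids partial-word hits.
function buildKeywordQuery(array $words): array
{
    if (count($words) === 0) {
        // Nothing to search for: return a query that matches no rows.
        return ['SELECT keyword FROM keywords WHERE 1 = 0', []];
    }
    $clauses = [];
    $params  = [];
    foreach ($words as $word) {
        $clauses[] = "CONCAT(' ', LOWER(keyword), ' ') LIKE ?";
        $params[]  = '% ' . $word . ' %';
    }
    $sql = 'SELECT keyword FROM keywords WHERE ' . implode(' OR ', $clauses);
    return [$sql, $params];
}

[$sql, $params] = buildKeywordQuery(splitIntoWords('perth web people'));
```

    The resulting $sql and $params could then be passed to PDO::prepare() and PDOStatement::execute(). Words containing % or _ would need escaping first, since LIKE treats them as wildcards.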

    That part is easier to show in code than to explain in words:
    
    <?php

    /**
     * Scores how closely a search phrase matches a comma-separated keyword list.
     * Returns 20 for an exact match; otherwise adds points for each adjacent keyword pair.
     */
    function CheckKeyword($Google_String, $Keywords) {
        // Lowercase both inputs so the comparisons below are case-insensitive
        $Google_String = strtolower($Google_String);
        $Keywords = strtolower($Keywords);
        $value = 0;
        if ($Google_String == $Keywords) {
            $value = 20;
        } else {
            // Split the database-stored keywords into an array, trimming stray spaces around commas
            $parts = array_map('trim', explode(",", $Keywords));
            // Each pass compares a keyword with the next one ($iCount + 1),
            // so stop one short of the end to avoid reading past the last index
            $iTotal = count($parts) - 1;
            $iCount = 0;
            while ($iCount < $iTotal) {
                // strpos() !== false is safer than strstr(), whose return value can be falsy (e.g. "0")
                $hasFirst  = strpos($Google_String, $parts[$iCount]) !== false;
                $hasSecond = strpos($Google_String, $parts[$iCount + 1]) !== false;
                // Check whether both current keywords exist in the string
                if ($hasFirst && $hasSecond) {
                    // Check whether the keywords are adjacent inside the string
                    if (strpos($Google_String, $parts[$iCount] . " " . $parts[$iCount + 1]) !== false) {
                        $value += 3;
                    } else {
                        $value += 2;
                    }
                // Check whether either of the keywords exists in the string
                } else if ($hasFirst || $hasSecond) {
                    $value += 1;
                }
                $iCount++;
            }
        }
        return $value;
    }
    ?>
    
    A more advanced algorithm might also need to take word count into consideration, but this should be sufficient to check for similarities.
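
    One possible way to bring word count into it, as a minimal sketch only: score the match as the fraction of the stored keyword's words that also appear in the search phrase. The function name wordOverlap() is made up for illustration, and it assumes the keyword is stored as a space-separated phrase such as "web design perth".

```php
<?php
// Fraction of the stored keyword's words that also occur in the search phrase.
// Returns a float between 0.0 (no shared words) and 1.0 (every keyword word found).
function wordOverlap(string $search, string $keyword): float
{
    // Lowercase, split on whitespace, and drop duplicate words.
    $toWords = function (string $s): array {
        return array_unique(preg_split('/\s+/', strtolower(trim($s)), -1, PREG_SPLIT_NO_EMPTY));
    };
    $searchWords  = $toWords($search);
    $keywordWords = $toWords($keyword);
    if (count($keywordWords) === 0) {
        return 0.0; // Avoid dividing by zero for an empty keyword.
    }
    // Words from the keyword that also appear in the search phrase.
    $shared = array_intersect($keywordWords, $searchWords);
    return count($shared) / count($keywordWords);
}

$score = wordOverlap("perth web people that are designers that don't suck", 'web design perth');
```

    Here "web" and "perth" match but "design" does not (the search phrase only contains "designers"), so two of the three keyword words are found. Exact word matching is the obvious weakness of this approach.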
     
    Last edited: Nov 12, 2009
    n3r0x, Nov 12, 2009
  3. Jexley

    #3
    Thanks for that.

    I've discovered these functions:
    soundex()
    similar_text()
    metaphone()
    levenshtein()

    I think those should take care of me just fine.
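
    For anyone finding this later, here is a minimal sketch of how those four built-in functions could be applied to a pair of words. The wrapper compareWords() and its array keys are made up for illustration. All four functions work on characters or sounds rather than whole words, so they are probably best applied word by word rather than to an entire search phrase.

```php
<?php
// Compare two words with PHP's built-in string similarity functions.
function compareWords(string $a, string $b): array
{
    // similar_text() returns the number of matching characters and
    // writes a 0-100 similarity percentage into $percent.
    $common = similar_text($a, $b, $percent);
    return [
        'levenshtein'    => levenshtein($a, $b),              // edit distance: lower is closer
        'similar_chars'  => $common,                          // count of matching characters
        'similar_pct'    => $percent,                         // similarity as a percentage
        'same_soundex'   => soundex($a) === soundex($b),      // do the soundex keys match?
        'same_metaphone' => metaphone($a) === metaphone($b),  // do the metaphone keys match?
    ];
}

$result = compareWords('design', 'designers');
print_r($result);
```

    For "design" and "designers", the edit distance is 3 (three letters added) and similar_text() finds 6 matching characters, which works out to 80%.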

    Much appreciated!
     
    Jexley, Nov 12, 2009
  4. n3r0x

    #4

    Just edited. I found that I had forgotten to remove some variable initialisation when I converted the code into a function. I also added strtolower(), since strstr() is case-sensitive. I'm sure I'll find a better way to do it later today.
     
    n3r0x, Nov 12, 2009