[PHP] Common Words

Discussion in 'PHP' started by fLICKERR, Mar 27, 2012.

  1. #1
    Search a body of text for the most common words.

    <?php 
    function commonWords($string, $max = null, $file = 'stopwords.txt'){ 
        // Load the stop words, one per line; an unreadable file means "no stop words".
        $lines = is_readable($file) ? file($file, FILE_IGNORE_NEW_LINES) : false;
        $stopWords = array();
        foreach ($lines ?: array() as $line) {
            $stopWords[] = strtolower(trim($line)); // normalize so the lookup below matches
        }
        $string = preg_replace('/\s+/', ' ', $string); // collapse each run of whitespace into a single space
        $string = trim($string); // trim the string 
        $string = preg_replace('/[^a-zA-Z0-9 -]/', '', $string); // keep only alphanumeric characters, plus spaces and dashes
        $string = strtolower($string); // make it lowercase 
    
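        // Each run of letters (dashes allowed inside) is a word; unlike a "followed by a space" test, this also catches the last word.
        preg_match_all('/[a-z]+(?:-[a-z]+)*/', $string, $matchWords);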
        $matchWords = $matchWords[0]; 
        foreach ( $matchWords as $key => $item ) { 
            if ($item == '' || in_array(strtolower($item), $stopWords) || strlen($item) < 3) {
                unset($matchWords[$key]); 
            } 
        } 
        // Count how often each remaining word occurs (the words are already lowercase).
        $wordCountArr = array_count_values($matchWords);
        arsort($wordCountArr); 
        if($max != null){ 
            $final = array_slice($wordCountArr, 0, $max); 
        }else{ 
            $final = array_slice($wordCountArr, 0); 
        } 
        if(count($final) == 0){ 
            $final = explode(' ', $string); 
        } 
        return $final; 
    } 
    
    $str = 'This is a string it has some words and some words are written more than one time. Words are a combination of letters and spaces to make readable text, these form to make sentences, paragraphs, and full bodies of text.';
    print_r(commonWords($str, null)); 
    echo '<br>'; 
    echo '<br>'; 
    print_r(commonWords($str, 10)); 
    ?>
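
    For comparison, here is a minimal sketch of the same idea built on PHP's own word and counting functions. The name topWords and the stop-word array parameter are assumptions made for illustration; they are not part of the script above, which reads its stop words from stopwords.txt.

    ```php
    <?php
    // Hypothetical helper (not from the original script): counts words with
    // PHP built-ins and takes the stop words as an array instead of a file.
    function topWords($text, array $stopWords = array(), $max = null)
    {
        // Lowercase the text, then split it into a list of words.
        $words = str_word_count(strtolower($text), 1);

        // Drop stop words and anything shorter than three characters.
        $words = array_filter($words, function ($w) use ($stopWords) {
            return strlen($w) >= 3 && !in_array($w, $stopWords, true);
        });

        // Tally the remaining words and sort by count, highest first.
        $counts = array_count_values($words);
        arsort($counts);

        // Keep only the top $max entries when a limit is given.
        return $max === null ? $counts : array_slice($counts, 0, $max, true);
    }

    $text = 'The cat saw the dog, then the cat and the dog and the cat slept.';
    print_r(topWords($text, array('the', 'and'), 2)); // cat => 3, dog => 2
    ```

    Passing the stop words as an array keeps the sketch free of file handling, so it can be tried without a stopwords.txt on disk.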

    fLICKERR, Mar 27, 2012 IP
    ROOFIS likes this.
  2. yho_o

    yho_o Well-Known Member

    #2
    Thanks for the share
     
    yho_o, Mar 28, 2012 IP
  3. EricBruggema

    EricBruggema Well-Known Member

    #3
    EricBruggema, Apr 1, 2012 IP
  4. abyssal

    abyssal Guest

    #4
    Thanks dude !
     
    abyssal, Apr 1, 2012 IP
  5. toolsmith

    toolsmith Well-Known Member

    #5
    Very nice example...I'll be adding that one to my toolbox. Thanks for sharing!
     
    toolsmith, Apr 2, 2012 IP
  6. Artuurs

    Artuurs Peon

    #6
    Very nice example
     
    Artuurs, Apr 7, 2012 IP
  7. ROOFIS

    ROOFIS Well-Known Member

    #7
    Great script! Reminds me of the one I made a while back for a client ==> http://pastebin.com/RECSyhMT
    Anyone is welcome to adapt and improve it :)
     
    ROOFIS, Apr 9, 2012 IP