Find matching words (two strings)

Discussion in 'PHP' started by qwikad.com, Nov 18, 2020.

  1. #1
    I've got a file $datafile['spamwords'] that has spam words / phrases in it:
    gambling
    xxx
    milf
    porn
    adult movies
    cbd shops
    etc...

    I want to compare the spamwords file against a string, $text.

    I found this code, but it doesn't really do the job. I don't get why it has this part:

    $a1=explode(' ',$str1);
    $a2=explode(' ',$str2);

    The code does find some spam words in the $text string ($str2), but not all of them, which is strange.

    
    $spamwords = file_get_contents($datafile['spamwords']);   
    $text = "Some text that has spam words in it. Can be whatever text.";
       
    $str1=$spamwords;
    $str2=strtolower($text);
    $a1=explode(' ',$str1);
    $a2=explode(' ',$str2);
    function longenough($word){
        return strlen( $word ) > 3;
    }
    $a1=array_filter($a1,'longenough');
    $a2=array_filter($a2,'longenough');
    $common=array_intersect( $a1, $a2 );
    foreach( $common as $word ){
        $str2=preg_replace( "/($word)/i",'<span style="color:red;font-weight:bold;">$1</span>', $str2 );
    }
    echo $str2;
    
    Thank you.
     
    qwikad.com, Nov 18, 2020
  2. sarahk

    sarahk iTamer Staff

    Messages:
    28,500
    Likes Received:
    4,460
    Best Answers:
    123
    Trophy Points:
    665
    #2
    I just ran it with this, and the only problem I had was when $spamwords had Maecenas instead of maecenas. Can you post an example of where it fails? Send it in a convo if it's NSFW.

    $spamwords = "maecenas vestibulum lorem malesuada"; 
    $text = "Phasellus condimentum pretium lorem, ut suscipit felis. Quisque nulla nibh, tempor ac erat vitae, luctus iaculis ligula. Duis eu dapibus odio. Interdum et malesuada fames ac ante ipsum primis in faucibus. In quis maximus odio. Suspendisse potenti. Maecenas viverra auctor velit nec tempor. Cras non sodales augue. Proin rutrum faucibus ante, malesuada rutrum lectus tincidunt quis. Duis lorem dui, vehicula nec magna vitae, hendrerit ornare eros. Nullam ac leo lobortis, posuere mi ut, iaculis massa. Ut vestibulum mi nec sapien vulputate rhoncus. Sed dignissim consequat justo, quis faucibus metus eleifend vel. Phasellus non mi sem. Donec imperdiet tellus massa, vitae sagittis ex vestibulum blandit.";
    I was in a game's chat recently and they had a swear filter that wouldn't allow "screwed", "s c r e w e d", or "s.c.r.e.w.e.d". I was impressed!
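
    The Maecenas/maecenas mismatch comes from the snippet lowercasing $text but not $spamwords, while array_intersect() compares strings case-sensitively. A minimal sketch, assuming plain ASCII word lists (mb_strtolower() would be the multibyte alternative) and using a shortened version of the sample data above, that lowercases both sides before intersecting:

```php
<?php
// Shortened sample data; only the capital "M" in $spamwords matters here.
$spamwords = "Maecenas vestibulum lorem malesuada";
$text      = "Interdum et malesuada fames. Maecenas viverra auctor velit.";

// Lowercase BOTH strings; the original snippet only lowercased $text.
$a1 = explode(' ', strtolower($spamwords));
$a2 = explode(' ', strtolower($text));

// array_intersect() keeps the values of $a1 that also occur in $a2.
$common = array_values(array_intersect($a1, $a2));

print_r($common);
```

    Words that carry trailing punctuation after the split, such as "fames." here, still will not match their plain form with this approach.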
     
    sarahk, Nov 18, 2020
  3. qwikad.com

    qwikad.com Illustrious Member Affiliate Manager

    Messages:
    7,151
    Likes Received:
    1,656
    Best Answers:
    29
    Trophy Points:
    475
    #3
    I stripped the newlines (since the $spamwords file had them) and it seems to be working now.

    $spamwords = str_replace(array("\n", "\r"), ' ', $spamwords);
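
    An alternative sketch to the str_replace() call, assuming the file may contain Windows line endings, tabs, or blank lines: preg_split() on any run of whitespace builds the word array in one step, with no empty entries. The sample string below stands in for the file contents and is not the real list.

```php
<?php
// Stand-in for file_get_contents($datafile['spamwords']); note the mixed line endings.
$spamwords = "gambling\r\nporn\n\nmilf\txxx\n";

// Split on one or more whitespace characters and drop empty pieces.
$a1 = preg_split('/\s+/', strtolower($spamwords), -1, PREG_SPLIT_NO_EMPTY);

print_r($a1);
```

    Multi-word entries such as "adult movies" are still broken into separate words by any whitespace split, so phrases cannot match as a unit this way.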
     
    qwikad.com, Nov 18, 2020
  4. sarahk

    #4
    That'll do it!
     
    sarahk, Nov 18, 2020
  5. JEET

    JEET Notable Member

    Messages:
    3,825
    Likes Received:
    502
    Best Answers:
    19
    Trophy Points:
    265
    #5
    In your spamwords file you are using the "ENTER" key (a newline) as the word separator, while the two lines below use a space as the separator.

    $a1=explode(' ',$str1);
    $a2=explode(' ',$str2);
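
    To see the effect, here is a small sketch with a made-up three-entry list: when the text contains no spaces, explode(' ', ...) returns the whole text as a single element, so the individual words are never isolated.

```php
<?php
// Three entries separated by newlines, as in the spamwords file.
$list = "gambling\nporn\nmilf";

$bySpace   = explode(' ', $list);   // no spaces in $list, so nothing is split
$byNewline = explode("\n", $list);  // splits on the separator the file really uses

var_dump(count($bySpace));    // int(1)
var_dump(count($byNewline));  // int(3)
```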

    You can make the spamwords file like this:
    file.txt=
    word1 word2 word3 word4 etc etc

    instead of:
    word1
    word2
    word3
    etc
    etc
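
    Another option, sketched here under the assumption that the list stays one entry per line: file() with FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES returns one array element per line, so the file format does not have to change and multi-word phrases stay intact. The temporary file, the sample text, and the helper name findSpam() are made up for illustration. This is a different matching approach from the array_intersect() code above, not a drop-in fix.

```php
<?php
// Hypothetical helper: returns the entries from $entries that occur in $text.
function findSpam(array $entries, string $text): array
{
    $found = [];
    foreach ($entries as $entry) {
        $entry = trim($entry); // also removes a stray "\r" from Windows line endings
        if ($entry === '') {
            continue;
        }
        // preg_quote() escapes regex metacharacters; \b limits matches to whole words.
        $pattern = '/\b' . preg_quote($entry, '/') . '\b/i';
        if (preg_match($pattern, $text) === 1) {
            $found[] = $entry;
        }
    }
    return $found;
}

// Write a throwaway list file, one entry per line, to stand in for the real one.
$path = tempnam(sys_get_temp_dir(), 'spam');
file_put_contents($path, "gambling\nxxx\nadult movies\ncbd shops\n");

// One array element per line, without trailing newlines or blank lines.
$entries = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

$text  = "Best Gambling tips and adult movies, no xxxl shirts.";
$found = findSpam($entries, $text);

print_r($found);
```

    Because of the \b word boundaries, "xxx" does not match inside "xxxl" in this sample, and short entries no longer need to be filtered out by length.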
     
    JEET, Nov 18, 2020
  6. sarahk

    #6
    Eventually, you'll end up with hundreds of spamwords.
    Having them all on the same line would get quite hard to maintain, wouldn't it?

    FWIW I'd be putting them into a database.
     
    sarahk, Nov 18, 2020
  7. JEET

    #7
    Yes @sarahk, having many on one line will be difficult to manage. I was only talking about the code above. It's looking for a space as the delimiter to split the string into an array.
     
    JEET, Nov 19, 2020