Remove duplicate string

Discussion in 'Programming' started by bumbar, Jan 16, 2013.

  1. #1
    Hello,

    I have this script:

    
    $string = "The The the Hello Truck Hello The the Fantastic bear"; 
    $pattern = "/\b([\w'-]+)(\s+\\1)+/i"; 
    $replacement = "$1"; 
    print preg_replace($pattern, $replacement, $string);
    
    PHP:
    This prints: 'The Hello Truck Hello The Fantastic bear'

    The problem is that only the repeated 'The' is removed; the second 'Hello' is still there.

    Any suggestions, please :)
     
    bumbar, Jan 16, 2013
  2. goliath

    #2
    The 'Hello' is not a duplicate string: the two occurrences are not next to each other, so your pattern never sees them as a repeat.

    You can't have a script for this very easily, because it takes some human thought to realize that 'Hello' is not wanted in more than one place, and then the script needs to know which one to remove. I don't even know which of the two you would remove; arguably they should both go somewhere else.

    You can tell it to take every 'Hello', or only the first 'Hello'. Then when you give it 'The Goodbye Truck Goodbye The Fantastic bear' it won't know about 'Goodbye', and you have the same problem again. And sometimes it's good to have the same word twice in a sentence.

    This is not done with a simple regex. It's done with a huge set of rules and code: it's a grammar/syntax checker.
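
    To illustrate why only 'The' was collapsed, here is a minimal sketch that runs the pattern from the first post on two short inputs. The variable names $adjacent and $separated are only for illustration, and the pattern is written in single quotes so the backreference needs no double escaping.

```php
<?php
// Same pattern as in the first post: a word followed by one or more
// whitespace-separated repeats of itself, matched case-insensitively.
$pattern = '/\b([\w\'-]+)(\s+\1)+/i';

// Back-to-back repeats are collapsed to the first occurrence.
$adjacent = preg_replace($pattern, '$1', 'The The the Hello');

// Repeats separated by another word do not match, so nothing changes.
$separated = preg_replace($pattern, '$1', 'Hello Truck Hello');

echo $adjacent, "\n";   // The Hello
echo $separated, "\n";  // Hello Truck Hello
```

    The backreference \1 has to follow the first word directly (after whitespace), which is why 'Hello Truck Hello' passes through untouched.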
     
    goliath, Jan 18, 2013
  3. EricBruggema

    #3
    If you only want unique words to be kept, it's not that hard. You have to:

    explode your string
    loop over all the exploded pieces
    check whether the value already exists in a temporary array; if not, add it to the temporary array and append it to a string.

    Something like this (not tested):
    
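    $string = "The The the Hello Truck Hello The the Fantastic bear";
    $tstring = '';      // result string, built word by word
    $tarray = array();  // words seen so far, stored as keys for a quick isset() lookup
    
    // Note: the lookup is case-sensitive, so "The" and "the" count as different words.
    foreach (explode(" ", $string) as $v)
    {
        if (!isset($tarray[$v]))
        {
            $tarray[$v] = true;
            $tstring .= $v . ' ';
        }
    }
    echo trim($tstring);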
    
    PHP:
    :)
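
    The loop above compares words case-sensitively, so 'The' and 'the' would both survive, whereas the regex in the first post used the /i flag. A possible case-insensitive variant is sketched below. The function name remove_duplicate_words is hypothetical (it is not from this thread), and splitting on any run of whitespace instead of a single space is an assumption about the input.

```php
<?php
// Hypothetical helper, not from the thread: keeps the first occurrence of
// each word and drops every later repeat, ignoring letter case.
function remove_duplicate_words($string)
{
    $seen = array();  // lowercased words that were already kept
    $kept = array();  // words to output, in their original order and spelling

    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces.
    foreach (preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        $key = strtolower($word);
        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $kept[] = $word;
        }
    }

    return implode(' ', $kept);
}

echo remove_duplicate_words("The The the Hello Truck Hello The the Fantastic bear"), "\n";
// prints: The Hello Truck Fantastic bear
```

    Note that this removes every later repeat, wherever it sits in the sentence, so it is only suitable when a word should never appear twice.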
     
    EricBruggema, Jan 18, 2013