Problem with regular expressions for unicode

Discussion in 'PHP' started by s_ruben, Dec 14, 2009.

  1. #1
    Can anybody tell me why this doesn't work for Unicode strings?

    preg_replace('/\b'.$word.'\b/ui', '<span class="bigger">$0</span>', $text);

    The variables $word and $text are Unicode strings.
     
    s_ruben, Dec 14, 2009
  2. xenon2010

    xenon2010 Peon

    Messages:
    237
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Can you post the values of $word and $text?
     
    xenon2010, Dec 14, 2009
  3. s_ruben

    s_ruben Active Member

    Messages:
    735
    Likes Received:
    26
    Best Answers:
    1
    Trophy Points:
    78
    #3
    For example:

    $word = 'имя';
    $text = 'Мое имя Рубен';

    These are Unicode strings in Russian.
     
    s_ruben, Dec 14, 2009
  4. xenon2010

    #4
    Hmm, first you need to save your document as UTF-8
    or put this tag between the HEAD tags of your page:

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

    Then try this sample:
    $word = 'имя';
    $text = 'Мое имя Рубен';
    echo preg_replace('/'.$word.'/i', 'your replacement words', $text);

    Rep me up if that helps :D
     
    xenon2010, Dec 14, 2009
  5. s_ruben

    #5
    My document is utf-8 and the meta tag is put.
    And the code you have written is not what I want. I want to replace that words, which are not in other word-form. For example the word "body" must not be replaced in word "everybody".
     
    s_ruben, Dec 14, 2009 IP
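
One possible way to get the whole-word behaviour described above for Unicode text is to replace \b with lookarounds built on the Unicode properties \p{L} (letters) and \p{N} (numbers). On older PHP/PCRE builds, \b under the /u modifier may still treat only ASCII characters as word characters, which would explain the failure with Cyrillic. The sketch below assumes PHP 7 or later, a PCRE build with Unicode property support, and a UTF-8 source file; highlight_word is a hypothetical helper name, not something from the thread.

```php
<?php
// Hypothetical helper (not from the thread): wrap whole-word occurrences of $word.
function highlight_word(string $word, string $text): string
{
    // Escape regex metacharacters in the word, including the '/' delimiter.
    $quoted = preg_quote($word, '/');

    // Lookarounds stand in for \b: the match must not be directly preceded
    // or followed by a letter, a digit or an underscore. \p{L} and \p{N}
    // are Unicode properties, so Cyrillic letters count as word characters.
    $pattern = '/(?<![\p{L}\p{N}_])' . $quoted . '(?![\p{L}\p{N}_])/ui';

    // preg_replace returns null on error (e.g. invalid UTF-8); fall back to the input.
    return preg_replace($pattern, '<span class="bigger">$0</span>', $text) ?? $text;
}

echo highlight_word('имя', 'Мое имя Рубен'), "\n";
echo highlight_word('body', 'everybody has a body'), "\n";
```

With these inputs, "имя" should be wrapped in the span, and only the standalone "body" should be wrapped while "everybody" is left alone.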
  6. xenon2010

    #6
    Just add spaces, like this:
    echo preg_replace('/ '.$word.' /i', ' blah ', $text);
     
    xenon2010, Dec 14, 2009
  7. s_ruben

    #7
    But that doesn't work for cases like these:
    1. "$word"
    2. $word,
    3. $word.
    ...
     
    s_ruben, Dec 15, 2009
  8. xenon2010

    #8
    You can add whatever you want to the pattern, e.g.:
    preg_replace('/ '.$word.'(\.|,| |)/i', ' blah ', $text);
     
    xenon2010, Dec 15, 2009
  9. s_ruben

    #9
    And that doesn't work for this input:
    $word = "word";
    $text = "word word. word word, word";
    Try it and you will see that several occurrences of "word" are not replaced!

    The code from my first post works perfectly, but not for Unicode strings! :(
     
    s_ruben, Dec 15, 2009
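
The skipped replacements in the example above can probably be explained by the delimiters being part of the match: once a space is consumed as the end of one match, it is no longer available as the start of the next. As a rough sketch (assuming PHP 5.4 or later and PCRE with Unicode property support), the following compares a pattern in the style of post #8, with the dot escaped so it only matches a literal period, against a lookaround pattern that inspects the neighbouring characters without consuming them. The variable names are illustrative only.

```php
<?php
$text = 'word word. word word, word';

// Post #8 style: the leading space and the trailing delimiter belong to the
// match, so they are consumed and cannot begin the following match.
$consuming = '/ word(\.|,| |)/i';

// Lookaround style: neighbouring characters are checked but not consumed.
$lookaround = '/(?<![\p{L}\p{N}_])word(?![\p{L}\p{N}_])/ui';

echo preg_match_all($consuming, $text), "\n";   // 3
echo preg_match_all($lookaround, $text), "\n";  // 5
```

The consuming pattern misses the first "word" (no leading space) and the fourth (its leading space was already used by the previous match), while the lookaround pattern finds all five.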
  10. AlexKey

    AlexKey Peon

    Messages:
    22
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #10
    Try this:
    If it doesn't work, try using the 'iconv' or 'mbstring' PHP extensions.
     
    AlexKey, Dec 15, 2009
  11. s_ruben

    #11
    How do I use 'iconv' or 'mbstring' in my example?
     
    s_ruben, Dec 15, 2009
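
As a hedged illustration of where mbstring might fit: the /u modifier requires valid UTF-8 in both the pattern and the subject, so one use of the extension is to verify the input and convert it when it arrives in another encoding. The sketch below assumes PHP 7 or later with mbstring enabled, a UTF-8 source file, and Windows-1251 as the fallback encoding, which is only a guess for Russian text. to_utf8 is a hypothetical helper, not something suggested in the thread.

```php
<?php
// Hypothetical helper: return $s as UTF-8, converting from $fallback
// (an assumed legacy encoding) when $s is not already valid UTF-8.
function to_utf8(string $s, string $fallback = 'Windows-1251'): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($s, 'UTF-8', $fallback);
}

$cp1251 = "\xE8\xEC\xFF";          // "имя" encoded as Windows-1251
$word   = to_utf8($cp1251);        // converted to UTF-8
$text   = to_utf8('Мое имя Рубен'); // already UTF-8, returned unchanged

var_dump($word === 'имя');
```

The iconv equivalent of the conversion step would be iconv('Windows-1251', 'UTF-8', $s). If the strings are already valid UTF-8, as stated earlier in the thread, neither extension changes anything and the issue lies in the pattern itself.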
  12. s_ruben

    #12
    Can anybody help me? It is very important!
     
    s_ruben, Dec 22, 2009