Decoding text using str_replace, e.g. 'Ã¼' => 'ü' does not work :-(

Discussion in 'PHP' started by Multiplexor, Jan 24, 2010.

  1. #1
    Hello,

    The data in my database is stored with the wrong encoding. I am now trying to replace the affected characters at runtime, but it has no effect. This is the code I am using:

    			$decode_ary = array(
    					'Ã¼' => 'ü',
    					'Ã¤' => 'ä',
    					'ÃŸ' => 'ß',
    					'Ã¶' => 'ö',
    					'Ãœ' => 'Ü',
    					'Ã„' => 'Ä',
    					'Ã©' => 'é',
    					'Ã£' => 'ã',
    					'ü' => 'ü',
    					'ä' => 'ä',
    					'ß' => 'ß',
    					'ö' => 'ö',
    					'Ü' => 'Ü',
    					'Ä' => 'Ä',
    					'é' => 'é',
    					'ã' => 'ã',
    					'©' => '©',
    					'a' => 'A',
    				);
                
                $topic_title = str_ireplace(array_keys($decode_ary), array_values($decode_ary), $topic_title);
    
    Example:
    BaseballschlÃ¤ger --> BAseballschläger

    But the code returns:
    BAseballschlÃ¤ger

    The a is replaced by A (that pair is only there to check that the replace function works at all), but the actual character combination is not replaced. Why?
     
    Multiplexor, Jan 24, 2010
  2. SmallPotatoes

    #2
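    str_replace compares raw bytes and knows nothing about character encodings, so it only finds a match when the search string is byte-for-byte identical to what is stored. Your database may contain multi-byte sequences representing those characters, and they may not be the same bytes as the ones in your PHP source file.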

    Without knowing your database connection encoding and display encoding it's hard to say. As a starting point, you may want to try outputting the strings as fetched from the database byte-by-byte to see what's really in there.
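
    A minimal sketch of that byte-by-byte check. The helper name dump_bytes is made up for illustration, and the sample strings are assumed byte sequences, not data taken from the actual database:

    ```php
    <?php
    // Hypothetical helper (not a PHP built-in): returns the bytes of a string
    // as space-separated hex pairs so the stored encoding becomes visible.
    function dump_bytes(string $s): string
    {
        return implode(' ', str_split(bin2hex($s), 2));
    }

    // 'ä' stored as ISO-8859-1: a single byte.
    echo dump_bytes("\xE4"), "\n";             // e4
    // 'ä' stored as UTF-8: two bytes.
    echo dump_bytes("\xC3\xA4"), "\n";         // c3 a4
    // 'ä' encoded to UTF-8 twice (it displays as "Ã¤"): four bytes.
    echo dump_bytes("\xC3\x83\xC2\xA4"), "\n"; // c3 83 c2 a4
    ```

    Calling dump_bytes($topic_title) on a value fetched from the database and comparing the result with the three patterns above should show which case applies.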
     
    SmallPotatoes, Jan 24, 2010
  3. danx10

    #3
    He could perhaps use preg_replace for that instead.
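
    A sketch of what that could look like, assuming the stored title is valid UTF-8 in which 'ä' was encoded twice and therefore shows up as the two characters "Ã¤". The sample string is an assumption, not the real data. With the /u modifier, preg_replace treats pattern and subject as UTF-8 and returns NULL if the subject is not valid UTF-8:

    ```php
    <?php
    // Assumed sample: "Baseballschläger" with a double-encoded 'ä'.
    $topic_title = "Baseballschl\xC3\x83\xC2\xA4ger";

    // \x{C3} and \x{A4} are the code points U+00C3 ('Ã') and U+00A4 ('¤').
    // The replacement is the UTF-8 byte sequence for 'ä'.
    $fixed = preg_replace('/\x{C3}\x{A4}/u', "\xC3\xA4", $topic_title);

    var_dump($fixed === "Baseballschl\xC3\xA4ger"); // bool(true)
    ```

    Whether this helps depends entirely on the bytes that are actually stored. If the search bytes do not match the data, preg_replace will miss them just as str_replace does.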
     
    danx10, Jan 25, 2010
  4. Multiplexor

    #4
    A very strange thing happens...
    I discovered that with utf8_decode almost all letters are converted correctly, except for ß: instead of Groß, the result is Gro�?. Why is there such an inconsistency?

    utf8_decode($topic_title)
     
    Multiplexor, Jan 27, 2010
  5. SmallPotatoes

    #5
    Maybe the ß character was stored in a different encoding and your browser is being overly helpful by displaying it anyway. Look at the actual bytes being stored and your mystery will be solved.
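
    One possibility that would produce exactly the Gro�? symptom, offered as an assumption and not as a diagnosis of this database: the text was UTF-8, was misread as Windows-1252, and was then saved as UTF-8 again. In that case the second byte of ß (0x9F) became 'Ÿ' (U+0178), which does not exist in ISO-8859-1, the target encoding of utf8_decode. The sketch below simulates this and needs the mbstring extension:

    ```php
    <?php
    // Simulate the assumed damage: the UTF-8 bytes of "Groß" are read as
    // Windows-1252 and encoded to UTF-8 a second time.
    $double = mb_convert_encoding("Gro\xC3\x9F", 'UTF-8', 'Windows-1252');

    // Roughly what utf8_decode() does: convert to ISO-8859-1.
    // 'Ÿ' has no ISO-8859-1 equivalent and becomes '?'.
    $viaLatin1 = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');

    // Converting to Windows-1252 instead maps 'Ÿ' back to byte 0x9F.
    $viaCp1252 = mb_convert_encoding($double, 'Windows-1252', 'UTF-8');

    echo bin2hex($double), "\n";    // 47726fc383c5b8
    echo bin2hex($viaLatin1), "\n"; // 47726fc33f  (shown as "Gro�?")
    echo bin2hex($viaCp1252), "\n"; // 47726fc39f  (valid UTF-8 "Groß")
    ```

    Letters such as ü and ä survive utf8_decode under this assumption because their second bytes (0xBC, 0xA4) map to the same characters in ISO-8859-1 and Windows-1252. Only bytes in the 0x80 to 0x9F range differ between the two, and ß, Ü and Ä fall into it. Note that utf8_decode is deprecated as of PHP 8.2.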
     
    SmallPotatoes, Jan 27, 2010