Grrr... Character encoding

Discussion in 'PHP' started by squishi, Dec 20, 2008.

  1. squishi
    This is so annoying!
    A script that I am working on shows all kinds of weird characters in the output.

    Example:
    ...á...ó... ...ú...í ...ño....é...

    I know this is a character encoding issue and I know the corresponding characters.

    But I don't know what more to do.

    - My page contains a meta tag with UTF-8 encoding.
    - I make sure to utf8_encode($string) on all the output.
    - I run the strings through htmlentities or html_entity_decode, but nothing helps.

    I even tried to replace the weird characters, for example like this:
    PHP:
    $content = str_replace("é", 'é', $content); // é
    But even this did not work. And that is a sign that something is not right with the character encoding.

    What is most upsetting is that this only happens in some strings that I output. In others, all the special characters are showing fine.
    They are all processed in the same way...
     
    squishi, Dec 20, 2008
  2. Yesideez

    What is the source of your data? Where is $content coming from?
     
    Yesideez, Dec 20, 2008
  3. wmtips

    Does your browser detect the character encoding of your pages as UTF-8? If your web server sends a different charset in its HTTP headers, that takes precedence and the meta tag in your HTML is ignored.

    Also make sure you pass 'UTF-8' as the charset to the htmlentities and htmlspecialchars functions. If the charset is not specified and the default is a single-byte one such as ISO-8859-1, htmlentities will break UTF-8 strings.
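
    A minimal sketch of both points, assuming the script builds its own output. The variable names and the sample string are made up for illustration. The "é" is written as explicit UTF-8 bytes so the result does not depend on how the source file is saved:

```php
<?php
// Send the charset in the HTTP header; browsers give it priority over the meta tag.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Hypothetical sample: "café" with the "é" as its two UTF-8 bytes.
$text = "caf\xC3\xA9";

// Treating the UTF-8 bytes as ISO-8859-1 turns each byte into its own entity.
$broken = htmlentities($text, ENT_QUOTES, 'ISO-8859-1');

// Passing the real charset keeps the character intact.
$fixed   = htmlentities($text, ENT_QUOTES, 'UTF-8');
$escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $broken, "\n";  // caf&Atilde;&copy;  (displays as "cafÃ©")
echo $fixed, "\n";   // caf&eacute;
echo $escaped, "\n"; // café (bytes unchanged, nothing to escape here)
```

    The first result is the same "Ã©" pattern as in the example output above, so a wrong or missing charset argument is one possible cause worth ruling out.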
     
    wmtips, Dec 20, 2008
  4. squishi

    I thought I had solved the problem today. I changed the character encoding in my text editor to UTF-8.

    The problem has appeared again, though. I don't understand why some WordPress posts on the same page (!) show those weird characters and some don't!
     
    squishi, Dec 22, 2008
  5. squishi

    Well, with the source code now saved as UTF-8, I am finally able to do some replacements on the page's content.
    So I can replace the characters with their HTML entities...
     
    squishi, Dec 22, 2008