RSS and ï»¿ error - encoding issues GAH!

Discussion in 'PHP' started by asgsoft, Aug 21, 2012.

  1. #1
    Hey guys

    I am working on a custom RSS feed that retrieves data from a MySQL database, and I've run into a little problem with encoding.

    My content has characters like "£", which confuse everything!

    I got the feed to display everything as it should, except for ï»¿ which shows up at the start.

    The start of my code is:

    
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" >
    <?php header('Content-type: text/xml;charset=ISO-8859-1'); 
    
    putenv("TZ=Europe/London");
    ?>
    
    <channel>
    <title>Feed Title</title>
    <description>Feed Desc</description>
    
    PHP:
    Any ideas what this problem is caused by?
     
    asgsoft, Aug 21, 2012 IP
  2. plussy

    plussy Peon

    Messages:
    152
    Likes Received:
    5
    Best Answers:
    9
    Trophy Points:
    0
    #2
    Firstly, you should send any headers right at the beginning, before anything else is sent to the browser:

    
    <?php header('Content-type: text/xml;charset=ISO-8859-1'); 
    
    putenv("TZ=Europe/London");
    ?>
    
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" >
    <channel>
    <title>Feed Title</title>
    <description>Feed Desc</description>
    
    PHP:
    Secondly, you could try wrapping the text in CDATA sections:


    
    <?php header('Content-type: text/xml;charset=ISO-8859-1'); 
    
    putenv("TZ=Europe/London");
    ?>
    
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" >
    <channel>
    <title><![CDATA[Feed Title]]></title>
    <description><![CDATA[Feed Desc]]></description>
    
    PHP:
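
    For what it's worth, here is a minimal sketch of the "headers first" idea taken one step further: build the feed as a string and only echo it after the header has gone out. The helper names xml_text() and build_feed() are made up for illustration, the sample text is invented, and the sketch assumes the file is saved as UTF-8 without a BOM and that the feed text really is UTF-8.

    ```php
    <?php
    // Assumes this file is saved as UTF-8 *without* a BOM and that nothing,
    // not even a blank line, precedes the opening PHP tag.

    // Hypothetical helper: escape text for use inside an XML element.
    function xml_text(string $s): string
    {
        return htmlspecialchars($s, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }

    // Hypothetical helper: build the whole feed as a string so nothing is echoed early.
    function build_feed(string $title, string $description): string
    {
        return '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
            . '<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">' . "\n"
            . "<channel>\n"
            . '<title>' . xml_text($title) . "</title>\n"
            . '<description>' . xml_text($description) . "</description>\n"
            . "</channel>\n"
            . "</rss>\n";
    }

    // Send the header first, then the body. The charset declared here must
    // match the encoding the feed text is actually in.
    if (!headers_sent()) {
        header('Content-Type: text/xml; charset=UTF-8');
    }
    echo build_feed('Feed Title', 'Fish & chips for £5');
    ```

    Escaping with htmlspecialchars() is an alternative to CDATA for plain text; either way, the header call has to come before any output.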
     
    plussy, Aug 21, 2012 IP
  3. asgsoft

    asgsoft Well-Known Member

    Messages:
    1,737
    Likes Received:
    34
    Best Answers:
    0
    Trophy Points:
    160
    #3
    It still has ï»¿ as the very first thing on a new line :S
     
    asgsoft, Aug 21, 2012 IP
  4. plussy

    plussy Peon

    Messages:
    152
    Likes Received:
    5
    Best Answers:
    9
    Trophy Points:
    0
  5. asgsoft

    #5
    Yeah, I tried that when I first started researching the problem, but again to no effect :S
     
    asgsoft, Aug 21, 2012 IP
  6. plussy

    #6
    Then go hardcore.

    Delete the file and create it again.
     
    plussy, Aug 21, 2012 IP
  7. asgsoft

    #7
    It's a charset encoding issue.

    UTF-8 and ISO-8859-1

    Having ISO-8859-1 makes the £ show up properly, while having UTF-8 removes the ï»¿ from the start.
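
    A rough sketch of what is probably going on at the byte level. It assumes the script file starts with the three UTF-8 BOM bytes while the "£" coming out of the database is a single ISO-8859-1 byte; neither has been confirmed here. The helpers read_as_latin1() and is_valid_utf8() are made up for the illustration.

    ```php
    <?php
    // Hypothetical helper: treat every byte as one ISO-8859-1 character and
    // return the equivalent UTF-8 text, i.e. what a reader sees when the
    // bytes are labelled ISO-8859-1.
    function read_as_latin1(string $bytes): string
    {
        $out = '';
        for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
            $b = ord($bytes[$i]);
            $out .= $b < 0x80
                ? chr($b)
                : chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
        }
        return $out;
    }

    // Hypothetical helper: the /u modifier makes PCRE reject invalid UTF-8.
    function is_valid_utf8(string $bytes): bool
    {
        return preg_match('//u', $bytes) === 1;
    }

    $bom         = "\xEF\xBB\xBF"; // UTF-8 byte order mark
    $poundUtf8   = "\xC2\xA3";     // "£" stored as UTF-8
    $poundLatin1 = "\xA3";         // "£" stored as ISO-8859-1

    echo read_as_latin1($bom), "\n";       // ï»¿
    echo read_as_latin1($poundUtf8), "\n"; // Â£
    var_dump(is_valid_utf8($poundLatin1)); // bool(false)
    ```

    So under ISO-8859-1 the BOM bytes become visible, and under UTF-8 a lone A3 byte is not a valid character. Both symptoms point at mixed encodings rather than at one wrong header.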
     
    asgsoft, Aug 21, 2012 IP
  8. plussy

    #8
    can you send me a link to your rss file?
     
    plussy, Aug 21, 2012 IP
    asgsoft likes this.
  9. asgsoft

    #9
    Which part exactly? I am still testing locally
     
    asgsoft, Aug 21, 2012 IP
  10. deathshadow

    deathshadow Acclaimed Member

    Messages:
    9,732
    Likes Received:
    1,999
    Best Answers:
    253
    Trophy Points:
    515
    #10
    Someone set you up the BOM.

    Your editor is set to save as UTF-8 with the Byte Order Mark -- which is what's causing that problem. Load in all your files, and save without the BOM, and you'll be fine.

    Assuming whatever editor you are using has that option -- the good ones, and many of the crappy ones, do.

    http://en.wikipedia.org/wiki/Byte_order_mark

    As the link plussy pointed to explains, the problem exists ENTIRELY because of whatever editor you are using.

    Of course, since you're saving the document as UTF-8, it's also wrong to declare ISO-8859-1 in the header. The encoding you declare has to match the encoding you save in, which, to be frank, you don't seem to be doing. Whatever you say in the header, you have to say in the editor too.

    Funny nobody has asked the obvious question -- just what editor are you using? ... and the other obvious question, this is 2012 not 1997, why are you trying to use ISO-8859?
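
    If the editor makes it hard to tell, a few lines of PHP can check for the mark. This is only a sketch: the function names are invented for illustration, and find_bom_files() looks at *.php files in a single directory, without recursing.

    ```php
    <?php
    // The UTF-8 byte order mark is the three bytes EF BB BF.
    const UTF8_BOM = "\xEF\xBB\xBF";

    // True when the byte string starts with the UTF-8 BOM.
    function has_utf8_bom(string $bytes): bool
    {
        return strncmp($bytes, UTF8_BOM, 3) === 0;
    }

    // Return the string without a leading BOM; other input is returned unchanged.
    function strip_utf8_bom(string $bytes): string
    {
        return has_utf8_bom($bytes) ? substr($bytes, 3) : $bytes;
    }

    // Hypothetical helper: list the *.php files in $dir that start with a BOM.
    function find_bom_files(string $dir): array
    {
        $hits = [];
        foreach (glob(rtrim($dir, '/') . '/*.php') ?: [] as $path) {
            // Read only the first three bytes of each file.
            $head = (string) file_get_contents($path, false, null, 0, 3);
            if (has_utf8_bom($head)) {
                $hits[] = $path;
            }
        }
        return $hits;
    }
    ```

    Any file this reports would need re-saving without the BOM, included files as well as the feed script itself, since an include that starts with a BOM sends those bytes to the browser too.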
     
    deathshadow, Aug 21, 2012 IP
    asgsoft likes this.