SimpleXML &ndash; problem

Discussion in 'PHP' started by Ntech, Aug 8, 2009.

  1. #1
    Hi,

    I'm copying a text element from an XML file into a PHP array, but the &ndash; character is being replaced with this: –

    I tried html_entity_decode() but it had no effect.
     
    Ntech, Aug 8, 2009 IP
  2. HivelocityDD

    #2
    You can try something like this.

    Let $str be the string

    
    $str = "Your original string";
    $str = htmlentities($str);
    $str = html_entity_decode($str);
    
    
    If this does not work, create a function:
    
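    function unhtmlentities($string)
    {
        // replace numeric entities (hex first, then decimal); the /e modifier
        // was removed in PHP 7, and chr() cannot represent code points above 255
        $string = preg_replace_callback('~&#x([0-9a-f]+);~i', function ($m) {
            return mb_chr(hexdec($m[1]), 'UTF-8');
        }, $string);
        $string = preg_replace_callback('~&#([0-9]+);~', function ($m) {
            return mb_chr((int) $m[1], 'UTF-8');
        }, $string);
        // replace literal (named) entities
        $trans_tbl = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');
        $trans_tbl = array_flip($trans_tbl);
        return strtr($string, $trans_tbl);
    }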
    
    and pass the string to it to remove the HTML entities.
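    For comparison, here is a minimal sketch of the same decoding done with the built-in html_entity_decode() and an explicit charset. Before PHP 5.4 the default charset was ISO-8859-1, so naming UTF-8 explicitly may matter on older installs. The sample string is made up for illustration, and whether this helps with the original problem depends on the charset the page is served with.

```php
<?php
// Sketch only: decode named and numeric entities in one call.
// The sample string below is a made-up example, not data from the thread.
$str = 'Caf&eacute; &ndash; &#8220;quoted&#8221; &#x2122;';

// ENT_QUOTES also decodes quote entities; 'UTF-8' is the output encoding.
$decoded = html_entity_decode($str, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n";
```

    The result is a UTF-8 string, so it only displays correctly if the page itself is declared as UTF-8.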
    
    
     
    HivelocityDD, Aug 8, 2009 IP
  3. Ntech

    #3
    Thanks for your help HivelocityDD, but amazingly, none of that worked! :confused:
     
    Ntech, Aug 9, 2009 IP
  4. Ntech

    #4
    This is how I'm copying the XML data into a PHP array:

    $sxe =  simplexml_load_file("gamecat.xml");
    	
    $arrFeeds = array();
    	
    foreach($sxe->xpath('//game') as $item) {
    
    	$row = simplexml_load_string($item->asXML());
    		
    	foreach ($row->xpath('//game') as $v) {
    		if($v){
    			$itemRSS = array (
    				'gameid' => $item->gameid,
    				'name' => unhtmlentities($item->gamename),
    				'desc' => unhtmlentities($item->longdesc),
    				'download' => $item->downloadurl,
    			);
    				
    	
    			array_push($arrFeeds, $itemRSS);	
    		}
    	}
    }
    Could it be I'm doing it wrong, thus causing this problem?

    My code is working fine except for those special characters.

    I don't want to remove the HTML entities; I just want them to show up properly when I display the web page.
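    One way to keep entities in the output is sketched below. SimpleXML hands back element text as UTF-8 with the entities already decoded, so the idea is to cast each element to a string and re-encode it with htmlentities() while building the array. The inline XML is a hypothetical stand-in for gamecat.xml and only two fields are shown; treat it as an illustration under those assumptions, not a drop-in fix.

```php
<?php
// Hypothetical sample document standing in for gamecat.xml.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<games><game><gameid>1</gameid>'
     . '<gamename>Caf&#233; Rush &#8211; Deluxe</gamename></game></games>';

$sxe = simplexml_load_string($xml);

$arrFeeds = array();
foreach ($sxe->xpath('//game') as $item) {
    $arrFeeds[] = array(
        // Cast to string so the array holds text, not SimpleXMLElement objects.
        'gameid' => (string) $item->gameid,
        // Re-encode the UTF-8 text as HTML entities for display.
        'name'   => htmlentities((string) $item->gamename, ENT_QUOTES, 'UTF-8'),
    );
}

echo $arrFeeds[0]['name'], "\n";
```

    Note that htmlentities() also escapes &, < and >, so it suits plain text fields rather than fields that already contain HTML markup.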
     
    Ntech, Aug 9, 2009 IP
  5. Ntech

    #5
    I got it working with a more direct approach.

    In case anyone else has this problem, here's my function:

    function unhtmlentities($str)
    {
    	// Map the special characters found in my XML to their HTML entities
    	$search  = array('–','—','’','é','è','ó','…','®','“','”','™');
    	$replace = array('&ndash;','&mdash;','&rsquo;','&eacute;','&egrave;','&oacute;','&hellip;','&reg;','&ldquo;','&rdquo;','&trade;');
    	$return = str_replace($search, $replace, $str);
    
    	// Drop any remaining non-ASCII bytes
    	$final = '';
    	foreach (str_split($return) as $val) {
    		if (ord($val) < 128) $final .= $val;
    	}
    
    	return $final;
    }
    It's limited to the specific characters that appear in my XML document, so HivelocityDD's function would be the better universal solution. Unfortunately, it just doesn't seem to work for me.
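    A more general variant of the same idea, sketched here as one possibility: instead of a hand-maintained list, encode every non-ASCII character as a numeric entity with mb_encode_numericentity(). This assumes the input is valid UTF-8 and that the mbstring extension is available. The helper name encode_non_ascii is made up for this example.

```php
<?php
// Sketch: turn every non-ASCII character into a decimal numeric entity,
// so no search/replace table has to be maintained by hand.
// Assumes valid UTF-8 input and the mbstring extension.
function encode_non_ascii($str) // hypothetical helper name
{
    // Conversion map: code points 0x80..0x10FFFF, offset 0, mask 0x1FFFFF.
    $map = array(0x80, 0x10FFFF, 0, 0x1FFFFF);
    return mb_encode_numericentity($str, $map, 'UTF-8');
}

echo encode_non_ascii("Caf\u{E9} \u{2013} \u{2122}"), "\n";
```

    ASCII characters, including any markup in the text, pass through unchanged, and nothing is silently dropped.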
     
    Ntech, Aug 9, 2009 IP