I'm editing a script that imports CSV files (affiliate datafeeds) into a WordPress database. It runs into problems when the CSV files contain characters like Â and É.

Basically the script truncates the database entry at these characters, so if I have an entry

    The script cuts Ât the Â and nothing else is added

all that gets added to the database is

    The script cuts

I don't understand the script well enough to change it to accept these characters, so I'm looking to remove or replace them. I can add individual str_replace calls along the lines of:

    $content = str_replace('Â', 'A', $content);
    $content = str_replace('É', 'E', $content);

And it works, but I don't have a list of all of these characters (some are weird, like º, which is meant to have a line under it, and Æ). So I'm hoping there's some nifty bit of PHP to deal with this sort of stuff.

David
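The individual str_replace calls above can be collapsed into one lookup table passed to strtr(). This is only a sketch: the function name replace_accents and the four-entry map are illustrative, not a complete list, and it assumes $content is UTF-8. The keys are written as byte escapes so the result does not depend on how the PHP file itself is saved.

```php
<?php
// Hypothetical helper: swap a few accented characters for plain ASCII.
// Assumes the input string is UTF-8 encoded.
function replace_accents(string $content): string
{
    // Illustrative map only; extend it with whatever the feeds contain.
    $map = [
        "\xC3\x82" => 'A',  // Â (U+00C2)
        "\xC3\x89" => 'E',  // É (U+00C9)
        "\xC3\x86" => 'AE', // Æ (U+00C6)
        "\xC2\xBA" => 'o',  // º (U+00BA, masculine ordinal indicator)
    ];

    // strtr() with an array replaces every key in a single pass.
    return strtr($content, $map);
}
```

If the feed turns out to be Latin-1 rather than UTF-8, the keys would need to be the single-byte forms instead (for example "\xC9" for É).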
It's probably not your code; it's the character encoding in the database. Check that first. You can set it in your my.cnf file (assuming MySQL). I don't remember exactly how, but it's easy enough to google.
I've kind of solved it using:

    $content = preg_replace("/[^\x9\xA\xD\x20-\x7F]/", "", $content);

This removes anything that's not a standard character (I think): it keeps tab, line feed, carriage return and the ASCII range, and strips everything else. It's not ideal, as it deletes rather than replaces the characters that are causing problems.

I almost found an ideal solution:

    $transwpimc = get_html_translation_table(HTML_ENTITIES);
    $encodedwpimc = strtr($content, $transwpimc);
    $content = $encodedwpimc;

This converts the characters to the equivalent character code, but it converts the HTML tags as well, so it does too much.

Looks like I'll have to create a bunch of replaces for each character. I found a list of them all, so it shouldn't be as hard as I thought.

David
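One way to keep the translation-table idea without mangling the tags might be to drop the HTML special characters (<, >, &, quotes) from the table before calling strtr(). A minimal sketch, assuming $content is UTF-8; the function name entities_keep_tags is made up for illustration:

```php
<?php
// Hypothetical helper: turn accented characters into HTML entities
// while leaving markup characters (<, >, &, quotes) untouched.
// Assumes the input string is UTF-8 encoded.
function entities_keep_tags(string $content): string
{
    // Full table: every character that has a named HTML 4.01 entity.
    $all = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Just the characters htmlspecialchars() would convert.
    $special = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Remove the markup characters so existing tags survive.
    $table = array_diff_key($all, $special);

    return strtr($content, $table);
}
```

Characters with no named entity are left as they are, so this only helps if the database problem is limited to characters the table covers.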
Have a look at utf8_encode(). Something like

    $content = utf8_encode($content);

might work out just fine, or you can run a function on your content, maybe the example from that page (consider $str as your $content variable).
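Note that utf8_encode() only converts from ISO-8859-1 and is deprecated as of PHP 8.2, and running it on text that is already UTF-8 double-encodes it. A sketch of a more defensive version using the mbstring extension follows; the name to_utf8 and the default source encoding are assumptions, since the thread never says what encoding the feeds use:

```php
<?php
// Hypothetical helper: make sure a string is UTF-8 before it is stored.
// Assumes the mbstring extension is loaded and that non-UTF-8 input is
// ISO-8859-1 unless another source encoding is passed in.
function to_utf8(string $content, string $sourceEncoding = 'ISO-8859-1'): string
{
    // Already valid UTF-8 (plain ASCII included): leave it alone.
    if (mb_check_encoding($content, 'UTF-8')) {
        return $content;
    }

    // Otherwise reinterpret the bytes as the source encoding.
    return mb_convert_encoding($content, 'UTF-8', $sourceEncoding);
}
```

Usage would be the same as in the post above: $content = to_utf8($content); before the import script writes the row.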