I was trying to use the PHP function substr() with Arabic and Chinese strings, but it never gives the right characters (it shows something like "??? ??? ?????"). The script opens the Google Translate tool to translate a word and hand the result back to me automatically, but when I use substr() to extract the exact translated word, I get "????????...". Can you please help me?
You need to convert the text to UTF-8 in order to use substr(). Try setting your document's charset to UTF-8 by putting this between the head tags: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
I already did that, with no luck. The problem is not in the browser: even when I view the HTML source of the output, I see ���. I also tried using mb_substr() instead of substr(), but that did not work either.
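For reference, here is a minimal sketch of why substr() and mb_substr() behave differently on this kind of text. It assumes the string really is UTF-8 (the sample word and variable names are only for illustration); if the input is in some other encoding, neither call will give sensible results.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is UTF-8.
$text = 'مرحبا بالعالم';

// substr() counts bytes. Each Arabic letter takes 2 bytes in UTF-8,
// so an odd length cuts the third letter in half.
$broken = substr($text, 0, 5);

// mb_substr() counts characters once the encoding is named explicitly.
$first_word = mb_substr($text, 0, 5, 'UTF-8');

var_dump(mb_check_encoding($broken, 'UTF-8'));     // bool(false)
var_dump(mb_check_encoding($first_word, 'UTF-8')); // bool(true)
echo $first_word, "\n";
```

Passing 'UTF-8' as the fourth argument matters: without it mb_substr() falls back to the configured internal encoding, which may not be UTF-8 on older setups.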
This happens because the text you are trying to cut is not in UTF-8, so you need to convert the string to UTF-8 first. To do that, use iconv(). For example, many Arabic sites use the windows-1256 charset, so to convert their Arabic characters to UTF-8 you need iconv().
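A rough sketch of that conversion, assuming the source text really is windows-1256 (check the page's declared charset before relying on this). The legacy bytes here are produced by a round trip from a UTF-8 sample purely so the example runs on its own; the variable names are made up for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is UTF-8.
$utf8_original = 'مرحبا';

// Simulate text from a legacy page by encoding the sample as windows-1256,
// where every Arabic letter is a single byte.
$legacy = iconv('UTF-8', 'WINDOWS-1256', $utf8_original);

// The actual fix: convert the legacy bytes to UTF-8 before cutting.
$utf8 = iconv('WINDOWS-1256', 'UTF-8', $legacy);

// With the encoding named explicitly, mb_substr() cuts on character boundaries.
$first_three = mb_substr($utf8, 0, 3, 'UTF-8');

echo strlen($legacy), ' bytes in windows-1256, ', strlen($utf8), " bytes in UTF-8\n";
echo $first_three, "\n";
```

Note that iconv() returns false when it cannot convert the input, so real code should check the result before passing it on.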
Thank you, Xenon2010. The page that contains the Arabic characters is this one (fetched with file_get_contents()): http://translate.google.com/translate_t?hl=&ie=utf-8&text=welcome&sl=en&tl=ar# But I still haven't found a solution. I also thought about using mb_convert_encoding($html_page, 'HTML-ENTITIES', 'utf-8'); it works with any other string and converts Arabic chars to #01256 and things like that, but it does NOT work with that Google page!
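For anyone trying the entity approach on a current PHP version: the 'HTML-ENTITIES' pseudo-encoding is deprecated as of PHP 8.2, so the sketch below swaps in mb_encode_numericentity() instead. That is a substitution, not what the post above used, and it still only helps if the input is genuinely UTF-8. The sample word and variable names are illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is UTF-8.
$utf8 = 'مرحبا';

// Conversion map: [first code point, last code point, offset, mask].
// This range covers every non-ASCII character and leaves ASCII untouched.
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

// Turn each non-ASCII character into a decimal numeric entity.
$entities = mb_encode_numericentity($utf8, $map, 'UTF-8');
echo $entities, "\n";

// Decoding with the same map gives the original string back.
$round_trip = mb_decode_numericentity($entities, $map, 'UTF-8');
```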
Okay, you need to use cURL instead. Here is your solution; I just made this function for you, and it's easy to use:

function get_content($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the body as a string instead of printing it, so no output buffering is needed.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Leave the response headers out of the returned string.
    curl_setopt($ch, CURLOPT_HEADER, false);
    // Send a browser-like User-Agent.
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)');
    // curl_exec() returns false if the request fails.
    $string = curl_exec($ch);
    curl_close($ch);
    return $string;
}

echo get_content('http://translate.google.com/translate_t?hl=&ie=utf-8&text=welcome&sl=en&tl=ar#');

Now it should work fine. Rep me up.
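Once a page has been fetched, it may still need converting before mb_substr() is safe to use. A possible sketch, assuming the server declares its charset in the Content-Type header; both helper names (charset_from_content_type and to_utf8) are hypothetical and not part of PHP or of the function above.

```php
<?php
// Hypothetical helper: pull the charset out of a Content-Type value such as
// "text/html; charset=windows-1256". Returns null when none is declared.
function charset_from_content_type(string $content_type): ?string
{
    if (preg_match('/charset\s*=\s*"?([A-Za-z0-9._:-]+)/i', $content_type, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Hypothetical helper: normalise a fetched body to UTF-8 using the declared charset.
// Assumption: a missing charset is treated as UTF-8, which may be wrong for some sites.
function to_utf8(string $body, string $content_type): string
{
    $charset = charset_from_content_type($content_type) ?? 'UTF-8';
    if ($charset === 'UTF-8') {
        return $body;
    }
    $converted = iconv($charset, 'UTF-8', $body);
    // Fall back to the untouched body if iconv() cannot convert it.
    return $converted === false ? $body : $converted;
}
```

With cURL, the header value can be read with curl_getinfo($ch, CURLINFO_CONTENT_TYPE) after curl_exec() and before curl_close(), then passed to to_utf8() together with the body.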