Hello everyone, I have an issue with a regex. Here is the text I want to extract from; I need the number 13972496 and the phrase "Important text":
"fTnh",13972496,"some text Important text bla bla bla",50423,"","04204733",416,7279,2075309,"","160345","",0
Here is the regex pattern I tried, but I keep getting error messages. This is the snippet:
$text = '"fTnh",13972496,"some text Important text bla bla bla",50423,"","04204733",416,7279,2075309,"","160345","",0';
$pattern = '"fTnh",'.".*?".','.".*? Important text.*?".',';
preg_match_all($pattern, $text, $out);
Code (markup):
Any ideas how to fix this?
$text = '"fTnh",13972496,"some text Important text bla bla bla",50423,"","04204733",416,7279,2075309,"","160345","",0';
// [^"]* stops at the next quote, so each group stays inside its own field
$pattern = '/^"[^"]*",([0-9]+),"([^"]*)",/';
preg_match($pattern, $text, $matches);
echo $matches[1] . "<br>" . $matches[2];
Code (markup):
I haven't tried it, but it should work.
I'm getting this error:
Warning: preg_match_all() [function.preg-match-all]: Unknown modifier ','
Code (markup):
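A note on that warning: it most likely comes from missing delimiters. The preg_* functions treat the first character of the pattern as the delimiter, so in the pattern from the first post the leading " opens the pattern, the next " closes it, and the , that follows is read as a modifier. Below is a minimal sketch with explicit / delimiters. It assumes the number and the phrase are the two things to capture; the variable names are only illustrative.

```php
<?php
// Sample record from the first post.
$text = '"fTnh",13972496,"some text Important text bla bla bla",50423,"","04204733",416,7279,2075309,"","160345","",0';

// preg_quote() escapes any regex metacharacters in the literal prefix;
// '/' is passed so the delimiter itself would be escaped too.
$prefix = preg_quote('"fTnh",', '/');

// The slashes delimit the pattern.
// Group 1: the number. Group 2: the phrase inside the quoted field.
$pattern = '/' . $prefix . '(\d+),"[^"]*?(Important text)[^"]*"/';

$count = preg_match_all($pattern, $text, $out);

echo $out[1][0], "\n"; // the number after "fTnh",
echo $out[2][0], "\n"; // the phrase found inside the quoted field
```

Any character that is not alphanumeric, a backslash, or whitespace can serve as the delimiter; / is used here only because it does not occur in the data.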
Do you have more examples of the plain string and of what you want to catch? Are you sure that the string always has "fTnh", at the front?
Here is a sample of that file:
"ftnh",14465707,"bla bla text IMPORTANT TEXT bla bla bla",799473,"","065612",28,40,1605573,"","191543","",0, <=== check the differences type1
"ftnh",14471405,"bla bla text IMPORTANT TEXT bla bla bla",1646558,"","155302",46,18,1605573,"","190905","",0, <=== check the differences type1
"ftnh",14443139,"bla bla text IMPORTANT TEXT bla bla bla",1232179,"","y040034",393,171,1148089,"","190851","",0, <=== check the differences type1
"fTnh",14445246,"bla bla text IMPORTANT TEXT bla bla bla",1225476,"","y121418",43,138,1445848,"","183820","",0, <=== check the differences type1
"ftnh",14458810,"bla bla text IMPORTANT TEXT bla bla bla",1417515,"","y220559",38,39,1445848,"","180919","",0, <=== check the differences type1
"ftnh",14382043,"bla bla text IMPORTANT TEXT bla bla bla",1225476,"","23124909",45,72,626153,"","180619","",0, <=== check the differences type1
"fnh",14456171,"bla bla text bla bla bla",1225476,"","y203019",49,107,2300064,"","172710","",0, <=== check the differences type2
"fTnh",13972496,"bla bla text IMPORTANT TEXT bla bla bla",50423,"","04204733",435,7345,1972877,"","172340","",0,
"ftnh",14389035,"bla bla text IMPORTANT TEXT bla bla bla",1225476,"","23180218",42,166,1234430,"","155702","",0,
"ftn",14441432,"bla bla text bla bla bla",2306701,"","y013833",11,15,1769461,"","090825","",0, <=== check the differences type3
PHP:
I want the data from the lines that start with "ftnh", so I need the number that follows and the important text. That's all.
<?php
// Note: the first record still contains forum markup ([color=...]),
// so the pattern below does not match it.
$str = '
"ftnh",14465707,"bla bla text [color="Red"]IMPORTANT TEXT[/color] bla bla bla",799473,"","065612",28,40,1605573,"","191543","",0,
"ftnh",14471405,"bla bla text IMPORTANT TEXT bla bla bla",1646558,"","155302",46,18,1605573,"","190905","",0,
"ftnh",14443139,"bla bla text IMPORTANT TEXT bla bla bla",1232179,"","y040034",393,171,1148089,"","190851","",0,
"fTnh",14445246,"bla bla text IMPORTANT TEXT bla bla bla",1225476,"","y121418",43,138,1445848,"","183820","",0,
"ftnh",14458810,"bla bla text IMPORTANT TEXT bla bla bla",1417515,"","y220559",38,39,1445848,"","180919","",0,
"ftnh",14382043,"bla bla text AM I IMPORTANT bla bla bla",1225476,"","23124909",45,72,626153,"","180619","",0,
"ftnh",14456171,"bla bla text bla bla bla",1225476,"","y203019",49,107,2300064,"","172710","",0,
"fTnh",13972496,"bla bla text CATCH ME I AM IMPORTANT bla bla bla",50423,"","04204733",435,7345,1972877,"","172340","",0,
"ftnh",14389035,"bla bla text IMPORTANT TEXT bla bla bla",1225476,"","23180218",42,166,1234430,"","155702","",0,
"ftn",14441432,"bla bla text bla bla bla",2306701,"","y013833",11,15,1769461,"","090825","",0,
';
// Group 1: the number after ftnh", group 2: the text between the fixed phrases (may be empty)
preg_match_all('/ftnh",([0-9]+),"bla bla text ([a-z0-9 ]+)?bla bla bla/i', $str, $match);
unset($match[0]); // drop the full-match entries, keep only the groups
echo '<pre>';
print_r($match);
echo '</pre>';
?>
PHP:
Output:
Array
(
    [1] => Array
        (
            [0] => 14471405
            [1] => 14443139
            [2] => 14445246
            [3] => 14458810
            [4] => 14382043
            [5] => 14456171
            [6] => 13972496
            [7] => 14389035
        )
    [2] => Array
        (
            [0] => IMPORTANT TEXT
            [1] => IMPORTANT TEXT
            [2] => IMPORTANT TEXT
            [3] => IMPORTANT TEXT
            [4] => AM I IMPORTANT
            [5] =>
            [6] => CATCH ME I AM IMPORTANT
            [7] => IMPORTANT TEXT
        )
)
Code (markup):
Impressive, thank you so much! But I still have a problem. In the real document I have Arabic characters instead of "IMPORTANT TEXT", and I really don't know what to put in place of ([a-z0-9 ]+)?. Do you have any idea how to use foreign characters (Arabic, Turkish, Russian) with regex?
I usually explode the text on "\r\n" (or on just one of those characters), then go through the array with foreach ($lines as $line) {} and explode each $line on ','. Then you have the data separated, and you can use a regex on element [2] of that array (the third field, which holds the text) if you need one.
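The split-by-line approach described above could look roughly like the sketch below. One assumption on my side: str_getcsv() is swapped in for a plain explode(','), because the text field is quoted and could contain commas. The $data sample and the $rows array are illustrative names, not part of the original file.

```php
<?php
// Two sample records from the thread; the second one ("ftn") must be skipped.
$data = <<<'CSV'
"ftnh",14465707,"bla bla text IMPORTANT TEXT bla bla bla",799473,"","065612",28,40,1605573,"","191543","",0,
"ftn",14441432,"bla bla text bla bla bla",2306701,"","y013833",11,15,1769461,"","090825","",0,
CSV;

$rows = [];

// \R matches any line ending (\r\n, \n, or \r).
foreach (preg_split('/\R/', trim($data)) as $line) {
    // str_getcsv() respects the quotes, so commas inside a field are safe.
    $fields = str_getcsv($line, ',', '"', '\\');

    // Keep only records whose first field is "ftnh" (case-insensitive).
    if (strcasecmp($fields[0], 'ftnh') !== 0) {
        continue;
    }

    // Index 1 is the number, index 2 is the quoted text field.
    $rows[] = ['id' => $fields[1], 'text' => $fields[2]];
}

print_r($rows);
```

A regex is then only needed on $rows[...]['text'], which keeps the pattern short.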
Try this (modified regular expression):
preg_match_all('/ftnh",([0-9]+),"bla bla text (.*)?bla bla bla/iU', $str, $match);
PHP:
Then try replacing the "bla bla" text with non-Latin characters (I have not tested this yet).
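For non-Latin text, a variant of the same idea is sketched below. It is an assumption-laden example, not a tested fix for the real site: the record is made up, the Arabic word simply stands in for "IMPORTANT TEXT", and the script file is assumed to be saved as UTF-8. The lowercase u modifier tells PCRE to treat pattern and subject as UTF-8, and the lazy (.*?) avoids having to list the foreign characters in a class.

```php
<?php
// Hypothetical record: the Arabic word stands in for "IMPORTANT TEXT".
$str = '"ftnh",14465707,"bla bla text مسابقة bla bla bla",799473,"","065612",0,';

// (.*?) is lazy, so it stops at the first place the rest of the pattern fits.
// i = case-insensitive, u = treat pattern and subject as UTF-8.
$pattern = '/"ftnh",([0-9]+),"bla bla text (.*?) ?bla bla bla"/iu';

preg_match_all($pattern, $str, $match);

echo $match[1][0], "\n"; // the number
echo $match[2][0], "\n"; // the text between the fixed phrases
```

If a character class is preferred over (.*?), a Unicode script property such as \p{Arabic} or \p{Cyrillic} should also work under the u modifier, though that depends on how PCRE was built on the server.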
Thanks, all, for your help. After banging my head on the keyboard for a while, I finally found a solution for the messed-up Arabic characters. Here is what I did:
$content = get_content($url2); // grab the site's content
$tmpContent = iconv("windows-1256", "utf-8", $content); // convert the grabbed content from windows-1256 to UTF-8
$encoded = utf8_encode($tmpContent); // run it through utf8_encode() once more before matching
$encoded_phrase = utf8_encode('مسابقة'); // encode the phrase the same way so it matches the encoded content
$pattern = '/"ftnh",(.*?),(.*?)' . "($encoded_phrase)" . '(.*?),/'; // captures the number and the text around the encoded phrase
if (preg_match_all($pattern, $encoded, $out, PREG_PATTERN_ORDER)) {
    echo "topics found:<br>";
}
PHP:
PS: I think this is the best and easiest solution for using regex with Arabic sites.
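A possible simplification of the solution above, offered as a sketch only: the extra utf8_encode() pass seems to work because the page and the phrase are both encoded twice in the same way, and utf8_encode() is deprecated as of PHP 8.2. Converting the page to UTF-8 once and adding the u modifier may be enough. The example assumes the iconv extension is available and that the script file is saved as UTF-8; $rawPage is a made-up stand-in for whatever get_content() returns.

```php
<?php
// Simulate a page served as windows-1256 (stand-in for the fetched content).
$phrase  = 'مسابقة';
$rawPage = iconv('UTF-8', 'windows-1256', '"ftnh",14465707,"نص ' . $phrase . ' نص",799473,');

// Convert the page to UTF-8 once; no second encoding pass is applied.
$content = iconv('windows-1256', 'UTF-8', $rawPage);

// preg_quote() guards against regex metacharacters in the phrase.
// [^"]* keeps the match inside the quoted text field.
$pattern = '/"ftnh",(\d+),"[^"]*(' . preg_quote($phrase, '/') . ')[^"]*"/iu';

$found = preg_match_all($pattern, $content, $out, PREG_PATTERN_ORDER);

echo $found, "\n";      // number of matching records
echo $out[1][0], "\n";  // the number after "ftnh",
```

The phrase stays as a normal UTF-8 literal in the script, so the pattern can be read and edited without encoding it first.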