I am using cURL to fetch a Google search results page:

    $filelocation = "http://www.google.com/search?q=cellphone&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $filelocation);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $html = curl_exec($ch);
    curl_close($ch);

Now I want to extract all the sponsored results from $html into PHP variables, like this:

    $title[0] = "Cell phone"; // ad title
    $adurl[0] = "http://www.unfoundation.org/vodafone/index.asp"; // ad URL; it is appended after href=/url?sa or href=/pagead/iclk?sa
    $addescription[0] = "Improving telecommunications to help in times of disaster.";
    $displayurl[0] = "www.UNFoundation.org/vodafone";

I am not able to parse the ad data out of the HTML because I can't write the regular expression for it. I need help writing a regex that extracts the ad data from the HTML, and I can pay for it.

P.S. Google's sponsored results appear at the top or to the right of the natural results.
    <a id=an5 href=/pagead/iclk?sa=l&ai=Bjt-Pnum=8&adurl=http://www.westhost.com/package-compare.html%3FDgoo-gene> $3.95 <b>Web Hosting</b></a></font><br>VPS, Huge Disk Space and Bandwidth!<br> Fall Special ends soon...<br><span class=a>www.westhost.com</span>

    <a id=pa3 href=/url?sa=L&ai=B0MF0&q=http://www.3ix.com/%3Fso onmouseover="return true"> 2GB <b>Web Hosting</b> $1/Rs.40</a><br> <font size=-1><span class=a>www.3ix.in</span>

Only the two types of markup above occur in my document, and I want to extract the following data from them. Example:

    Exact URL: http://www.westhost.com/package-compare.html
    Title: $3.95 Web Hosting
    Description: VPS, Huge Disk Space and Bandwidth! Fall Special ends soon...
    Domain: www.westhost.com

I can sketch the logic, but I can't turn it into an exact regular expression:

    <a id=(an|pa)[0-9] href=/[^&q|&adurl] (&q|&adurl)=$exacturl%[^ ]> $title </a> <span>$Domain </span>$description </font>

I need a regular expression that parses this data out of my HTML. With it I can use preg_match_all to get the data.

P.S. For reference, I got the HTML from this page: http://www.google.com/search?hl=en&q=webhosting&btnG=Google+Search
The exact URL ends at the % sign.

Thanks for any kind of help.
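One possible way to do this with preg_match_all is sketched below. It is only a minimal sketch built from the two sample snippets in this post, not a tested parser for live Google pages. The function name parseSponsoredAds and the array keys are hypothetical. It assumes the anchor id is "an" or "pa" followed by digits, that the attributes are unquoted exactly as in the samples, that the exact URL follows &adurl= or &q= and ends at the first % sign, and that the display domain sits in the next <span class=a>. If the markup differs, the pattern will need adjusting.

```php
<?php
// Hypothetical helper: extracts ad data from markup shaped like the two samples.
function parseSponsoredAds(string $html): array
{
    // Capture groups:
    //   1 = exact URL (everything after &q= or &adurl= up to the first %, space or >)
    //   2 = raw title HTML (contents of the <a> element)
    //   3 = raw description HTML (between </a> and the domain span)
    //   4 = display domain (contents of <span class=a>)
    $pattern = '~<a id=(?:an|pa)\d+ href=/[^ >]*?&(?:q|adurl)=([^% >]+)[^>]*>'
             . '(.*?)</a>(.*?)<span class=a>(.*?)</span>~is';

    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $ads = [];
    foreach ($matches as $m) {
        // Turn <br> into spaces before stripping tags so the lines do not run together.
        $description = preg_replace('~<br\s*/?>~i', ' ', $m[3]);
        $description = trim(preg_replace('~\s+~', ' ', strip_tags($description)));

        $ads[] = [
            'url'         => $m[1],
            'title'       => trim(strip_tags($m[2])),
            'description' => $description,
            'domain'      => trim(strip_tags($m[4])),
        ];
    }
    return $ads;
}

// Usage with the two snippets from this post (nowdoc, so $ signs stay literal).
$sample = <<<'HTML'
<a id=an5 href=/pagead/iclk?sa=l&ai=Bjt-Pnum=8&adurl=http://www.westhost.com/package-compare.html%3FDgoo-gene> $3.95 <b>Web Hosting</b></a></font><br>VPS, Huge Disk Space and Bandwidth!<br> Fall Special ends soon...<br><span class=a>www.westhost.com</span>
<a id=pa3 href=/url?sa=L&ai=B0MF0&q=http://www.3ix.com/%3Fso onmouseover="return true"> 2GB <b>Web Hosting</b> $1/Rs.40</a><br> <font size=-1><span class=a>www.3ix.in</span>
HTML;

$ads = parseSponsoredAds($sample);
print_r($ads);
```

PREG_SET_ORDER groups the captures per ad, so each element of $ads holds one complete result. In the second sample nothing but tags sits between </a> and the span, so its description comes back as an empty string. The pattern depends on the exact markup, so it is likely to break whenever the page layout changes.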