I'm trying to run a function while my loop is running, but I get maximum execution time errors. In this example I'm trying to change the URL 5 times and run a small function in between each change. My final script will insert the result into MySQL, but for now I can't even get it to echo.

    $i = 0; // starts from 0
    while ($i <= 5) {
        $url = "http://www.domain.com/$i";
        echo some_function($url); // placeholder: some function's result for that URL
        $i++; // adds 1 on each loop
    }

Is there a different way I should be doing this?
You can use a for() loop:

    <?php
    for ($i = 0; $i <= 5; $i++) {
        $url = "http://www.domain.com/{$i}";
        // other code will go here...
    }
    ?>

Furthermore, don't define functions inside the loop; only call them there. Define the functions outside the loop, because a function declared inside the loop body triggers a "Cannot redeclare" fatal error on the second iteration. Lastly, although you shouldn't need this, you can use the set_time_limit() function to change the execution time limit and avoid the maximum execution time error.
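As a rough illustration of the advice above, here is a minimal sketch that defines the function once outside the loop, calls it inside the loop, and raises the time limit with set_time_limit(). The describe_url() function is a hypothetical stand-in for whatever work is done per URL, and the 60-second limit is an arbitrary example value.

```php
<?php
// Raise the execution time limit to 60 seconds for this run
// (an arbitrary example value; 0 would mean no limit).
set_time_limit(60);

// Hypothetical stand-in for the real per-URL work.
// It is defined once, outside the loop.
function describe_url($url)
{
    return 'Processed ' . $url;
}

$results = array();
for ($i = 0; $i <= 5; $i++) {
    $url = "http://www.domain.com/{$i}";
    // Only the call happens inside the loop.
    $results[] = describe_url($url);
}

echo implode("\n", $results), "\n";
```

Note that the loop condition `$i <= 5` runs six times (0 through 5), so use `$i < 5` if exactly five URLs are wanted.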
Thanks Dan. I had tried that method yesterday and neither approach worked; today both are working, although if I increase the count to 10 or more it seems to take forever. Basically I want to scrape Facebook. Each URL opens a page with a different person, and I then gather the link name, the person's name, the avatar and the type of person. I will store this info in a database and have a form for the start and finish values of $i. Ideally I'm talking about scraping maybe 50 at a time. I understand that I'm loading a new page on each count and that's the reason it's slow, but could you take a look at my script below and possibly advise a better method?

    <?php
    function opens($url) {
        $ch = curl_init();
        // Note: $useragent is never used; the bot user agent set below is what gets sent.
        $useragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Set curl to return the data instead of printing it to the browser.
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_REFERER, 'http://www.mse360.com/about/bot.php');
        curl_setopt($ch, CURLOPT_USERAGENT, "MSE360 - FredBot (See: http://mse360.com/about/bot.php)");
        curl_setopt($ch, CURLOPT_TIMEOUT, 60);
        $data = curl_exec($ch);
        curl_close($ch);
        return $data;
    }

    function get_linkname($regex, $content) {
        preg_match($regex, $content, $linkname);
        return $linkname[1];
    }

    function get_persons_name($regex, $content) {
        preg_match($regex, $content, $persons_name);
        return $persons_name[2];
    }

    function get_persons_avatar($regex, $content) {
        preg_match($regex, $content, $persons_avatar);
        return $persons_avatar[1];
    }

    function get_persons_type($regex, $content) {
        preg_match($regex, $content, $persons_type);
        return $persons_type[1];
    }

    for ($i = 0; $i <= 5; $i++) {
        $url = "http://www.facebook.com/pages/?browse&ps=87&s={$i}";
        $data = opens($url);
        $linkname = strip_tags(get_linkname('#class\=\"name\"><a href\=\"http\:\/\/www\.facebook\.com\/(.*)\">(.*)<\/a>#isU', $data));
        $persons_name = strip_tags(get_persons_name('#class\=\"name\"><a href\=\"http\:\/\/www\.facebook\.com\/(.*)\">(.*)<\/a>#isU', $data));
        $persons_avatar = strip_tags(get_persons_avatar('#class\=\"img\" src\=\"(.*)\">#isU', $data));
        $persons_avatar = preg_replace('#" />#', '', $persons_avatar);
        $type = get_persons_type('#<dd>(.*)</dd>#isU', $data);
        echo $linkname;
        echo $persons_name;
        echo $persons_avatar;
        echo $type;
    }
    //echo $data;
    ?>
First, you might want to consolidate your regular expression functions into one, adding a third parameter that selects which array index (capture group) to return. Then open your php.ini file, increase the maximum execution time limit (max_execution_time), and run the script from the command line rather than through the browser. You can also print the current value of $i on each iteration; that way you can see it actually running and know how far along it is.
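The consolidation suggested above could look roughly like the sketch below. The name match_group(), the sample markup and the example.com URLs are all hypothetical and only illustrate the idea; real pages would come from the cURL function instead of the $pages array.

```php
<?php
// One helper replaces the four get_* functions:
// $index selects which capture group to return.
function match_group($regex, $content, $index = 1)
{
    if (preg_match($regex, $content, $matches) === 1 && isset($matches[$index])) {
        return $matches[$index];
    }
    return null; // no match, or the requested group does not exist
}

// Hypothetical sample markup standing in for fetched pages.
$pages = array(
    '<div class="name"><a href="http://www.example.com/alice">Alice</a></div>',
    '<div class="name"><a href="http://www.example.com/bob">Bob</a></div>',
);

$regex = '#<a href="http://www\.example\.com/([^"]+)">([^<]+)</a>#i';

$rows = array();
foreach ($pages as $i => $content) {
    // Print the counter so progress is visible while the script runs.
    echo "Processing page {$i}\n";
    $rows[] = array(
        'link' => match_group($regex, $content, 1),
        'name' => match_group($regex, $content, 2),
    );
}
```

Returning null on a failed match, rather than reading an undefined array index, lets the caller skip pages whose markup does not match the expected pattern.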
    <?php
    // Flush output after every echo so progress shows up as the script runs.
    ob_implicit_flush();

    function curl_get_contents($url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        // Note: HTTP_USER_AGENT is only set when the script is requested through a web server.
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 5);
        curl_setopt($ch, CURLOPT_AUTOREFERER, false);
        curl_setopt($ch, CURLOPT_REFERER, $url);
        $buffer = curl_exec($ch);
        curl_close($ch);
        return $buffer;
    }

    function fb_type($content) {
        preg_match('~<dl class="clearfix"><dt>Type:</dt><dd>([^<]+)</dd>~', $content, $a);
        return htmlspecialchars($a[1], ENT_QUOTES);
    }

    // Returns array(link, name).
    function fb_name($content) {
        preg_match('~<div class="name"><a href="(http://www\.facebook\.com/[^"]+)">([^<]+)</a>~', $content, $a);
        return array(
            htmlspecialchars($a[1], ENT_QUOTES),
            htmlspecialchars($a[2], ENT_QUOTES)
        );
    }

    function fb_avatar($content) {
        preg_match('~<div class="pic">.+?<img class="img" src="(http://profile\.ak\.fbcdn\.net/.+?)"~', $content, $a);
        return htmlspecialchars($a[1], ENT_QUOTES);
    }

    for ($i = 0; $i <= 5; $i++) {
        $content = curl_get_contents("http://www.facebook.com/pages/?browse&ps=87&s={$i}");
        $name_arr = fb_name($content);
        $link = $name_arr[0];
        $name = $name_arr[1];
        $avatar = fb_avatar($content);
        $type = fb_type($content);
        echo $link . '<br>';
        echo $name . '<br>';
        echo $avatar . '<br>';
        echo $type . '<br>';
    }
    ?>

It may be better to just use their API: http://developers.facebook.com/docs/ (I'm not familiar with it, as I've never used it, but I'm guessing it would be easier than scraping.)