I have a page that runs many regexes against large files. It's not even finished yet, but it's already hitting the 60-second timeout. Is there a good way of avoiding this? I see many websites show a small "Loading" gif until the script finishes - something like that to keep it alive until it's done?
I can't do that on a hosting server. I need a solution that can manage and split the process up... something along those lines.
Hey, you can do it several ways. If the hosting server does not have safe mode enabled, you should be able to use ini_set('max_execution_time', ###); Otherwise, what you can do is have the script open the file, read 100 lines, stop, then reload and read the next 100 lines (rough sketch below). PM me if you want some more detail.
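Rough idea of the reload approach (untested sketch - the file name, the 'offset' parameter and the script name are just placeholders):

PHP:
// raise the limit if safe mode allows it
ini_set('max_execution_time', 300);   // or set_time_limit(300);

// process 100 lines per request, remembering where we stopped
$offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;
$lines  = file('bigfile.txt');

$chunk = array_slice($lines, $offset, 100);
foreach ($chunk as $line) {
    // ... run your regexes on $line here ...
}

if ($offset + 100 < count($lines)) {
    // not done yet - reload to handle the next 100 lines
    header('Location: process.php?offset=' . ($offset + 100));
    exit;
}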
That would be interesting - and hard as well, I'm sure. I have a set of 10 textboxes with 10 URLs, and my page should extract their titles, metas and lots of other details using regex, and it times out. I'd really like to know if it's possible to run this job across reloads.
60 seconds for extracting data from 10 pages? If you haven't already, I'd suggest optimizing your regexes. If you know you're going to hit the time limit, you can set it to only process X pages at a time - depending on what you're doing, either storing your progress in the query string or using sessions. Alternatively, you could use ticks to run a function periodically that checks if you're close to the timeout and, if so, saves your progress and forces a page refresh. [url="http://uk2.php.net/manual/en/control-structures.declare.php#control-structures.declare.ticks"]See manual[/url].
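With sessions it could look roughly like this (just a sketch - extract_details() stands in for whatever regex work you do per URL):

PHP:
// process a few URLs per request and keep the progress in the session
session_start();

$urls    = $_SESSION['urls'];                 // the 10 URLs from the textboxes
$done    = isset($_SESSION['done']) ? $_SESSION['done'] : 0;
$perLoad = 3;
$stop    = min($done + $perLoad, count($urls));

for ($i = $done; $i < $stop; $i++) {
    $_SESSION['results'][$i] = extract_details($urls[$i]);   // your regex code
}

$_SESSION['done'] = $i;

if ($i < count($urls)) {
    // show the "Loading" message and refresh for the next batch
    echo 'Loading... ' . $i . '/' . count($urls);
    echo '<meta http-equiv="refresh" content="1">';
    exit;
}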
http://www.php.net/get_meta_tags get_meta_tags... Or only fread the first 1kb, or even 500 bytes, which should be enough... Edit: I use a regex to check over 8,300 websites in about 60 seconds.
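Something along these lines (sketch - example.com is just a placeholder URL):

PHP:
// meta tags without any regex at all
$meta = get_meta_tags('http://www.example.com/');   // $meta['description'], $meta['keywords'], ...

// or only read the first 1kb of the page - usually enough for <title> and the metas
$fp   = fopen('http://www.example.com/', 'r');
$head = fread($fp, 1024);
fclose($fp);

if (preg_match('/<title>([^<]+)<\/title>/i', $head, $m)) {
    $title = trim($m[1]);
}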
PHP:
if ( preg_match ( '/<title>([^<]+)<\/title>/i', addSpecialChars ( $row->page_source ), $title ) ) {
    return trim ( $title[1] );
}

This is the only regex used so far. To lower the time taken, I'm now saving the sources to the db and then performing the regex extractions on the stored data, and it still times out. Regex is very slow, and when it's called more than 20 times in various functions and loops it crashes. Damn, this is hard. When I eliminate this regex and set the result to some default, it finishes in no time, so I'm sure this is the buggy one.
I will use only half a kb for the title, but what happens when I expand this and want to extract image alt attributes, metadata, h1, h2, h3, links, keyword occurrences and more? This is a project for school and I need to extract everything from 10 pages.
I don't know. My web app will have to work with 10 pages that I won't choose - someone else will. Right now I'm playing with some that are between 20 and 40kb.
The regex looks fine - I can't think why you'd have any problems with it. Are you sure it's not just getting stuck in a loop somewhere? What does the addSpecialChars function do - or rather, why does it need to be applied to the subject before the regex?
Well... the pages are first saved using htmlspecialchars and then they need to be reverted using addSpecialChars in order for the regex to match the titles. Here's the function itself... nothing much:

PHP:
function addSpecialChars($string, $noQuotes = FALSE) {
    $string = eregi_replace("&amp;", "&", $string);
    if (!$noQuotes)
        $string = eregi_replace("&#039;", "'", $string);
    $string = eregi_replace('&quot;', '"', $string);
    $string = eregi_replace('&nbsp;', ' ', $string);
    $string = eregi_replace('&lt;', '<', $string);
    $string = eregi_replace('&gt;', '>', $string);
    return $string;
}
If possible, you should consider running the script locally on your own computer and then just uploading the results to your server.
The string functions will always be faster than regex (and even then, preg is usually quicker than ereg). If you don't need regex, avoid the regex functions. You could save a bit of time by using str_replace in your addSpecialChars function instead (sketch below). And assuming you're saving the result of addSpecialChars (rather than running it again and again for every preg_match), I can't see any reason why you'd need anywhere near 60 seconds to process 10 pages.
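e.g. something like this (untested - I'm assuming the quotes were stored as &#039;, adjust to whatever htmlspecialchars actually produced on your end):

PHP:
// same behaviour as addSpecialChars, but one str_replace call instead of six eregi_replace calls
function addSpecialChars($string, $noQuotes = FALSE) {
    $search  = array('&amp;', '&quot;', '&nbsp;', '&lt;', '&gt;');
    $replace = array('&',     '"',      ' ',      '<',    '>');

    if (!$noQuotes) {
        $search[]  = '&#039;';
        $replace[] = "'";
    }

    return str_replace($search, $replace, $string);
}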
addSpecialChars was eating it. It slipped past me - I was focusing on the title regex and never realised the bug was somewhere else. It worked well until it was confronted with something bigger in various loops. Now it's OK... let's hope it stays that way until I finish it.