Script time out

Discussion in 'PHP' started by amorph, Jun 27, 2007.

  1. #1
    I have a page that runs many regexes on large files. It's not even finished yet and it's already hitting the 60-second timeout. Is there a good way of avoiding this?
    On many websites I see a small "Loading" gif shown until the script finishes. Something to keep it alive until it's done?
     
    amorph, Jun 27, 2007 IP
  2. ansi

    ansi Well-Known Member

    #2
    set_time_limit(99999999);
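    For context, a minimal sketch of that approach. Passing 0 removes the limit entirely; note the call silently does nothing when safe mode is enabled:

```php
<?php
// Raise or remove the execution time limit for the current request.
// set_time_limit(0) means "no limit"; this call is ignored when PHP
// runs in safe mode, which matters on many shared hosts.
set_time_limit(0);
```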
     
    ansi, Jun 27, 2007 IP
  3. amorph

    amorph Peon

    #3
    I can't do that on a shared hosting server. I need a solution that can manage and split the processing into smaller chunks... something like that.
     
    amorph, Jun 27, 2007 IP
  4. Acecool

    Acecool Peon

    #4
    Hey, you can do it several ways..

    If the hosting server does not have safe mode enabled, you should be able to use ini_set('max_execution_time', ###);

    Otherwise, what you can do is have the script open the file, read 100 lines... pause, then reload the page and read the next 100 lines...

    PM me if you want some more detail
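    A rough sketch of that reload idea, assuming the work can be resumed from a line offset. The helper name readChunk, the file name, and the 100-line chunk size are just placeholders:

```php
<?php
// Split the work into chunks so no single request runs long enough
// to hit the timeout. Each request processes one slice of lines and
// reports the offset the next request should resume from.
function readChunk($lines, $offset, $chunkSize)
{
    $slice = array_slice($lines, $offset, $chunkSize);
    $next  = ($offset + $chunkSize < count($lines))
           ? $offset + $chunkSize
           : null;                     // null: nothing left to do
    return array($slice, $next);
}

// In the page itself it would look something like:
//   $offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;
//   list($slice, $next) = readChunk(file('bigfile.txt'), $offset, 100);
//   foreach ($slice as $line) { /* run the regexes on $line */ }
//   if ($next !== null) {
//       header("Refresh: 1; url={$_SERVER['PHP_SELF']}?offset=$next");
//   }
```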
     
    Acecool, Jun 27, 2007 IP
  5. amorph

    amorph Peon

    #5
    That would be interesting. Hard as well, I'm sure.

    I have a set of 10 textboxes with 10 URLs, and my page should extract their titles, metas and lots of other details using regex, and it times out. I'd really like to know if it's possible to run this job across reloads.
     
    amorph, Jun 27, 2007 IP
  6. rodney88

    rodney88 Guest

    #6
    60 seconds for extracting data from 10 pages? If you haven't already, I'd suggest optimizing your regexes...

    If you know you're going to hit the time limit, you can set it to only process X pages at a time - depending on what you're doing, either storing your progress in the query string or using sessions.

    Alternatively you could use ticks to run a function periodically that checks if you're close to the timeout and, if so, saves your progress and forces a page refresh. [url=http://uk2.php.net/manual/en/control-structures.declare.php#control-structures.declare.ticks]See the manual[/url].
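    A sketch of the ticks idea, assuming the default 60-second limit with a little safety margin. The function names nearTimeout and checkTimeout are made up for the example:

```php
<?php
// Returns true once more than $limit seconds have passed since $started.
function nearTimeout($started, $limit = 50)
{
    return (time() - $started) > $limit;
}

$started = time();

// Tick handler: watches the clock while the main loop does the work.
function checkTimeout()
{
    global $started;
    if (nearTimeout($started)) {
        // Save progress (e.g. to $_SESSION), then force a reload:
        // header("Refresh: 0; url={$_SERVER['PHP_SELF']}"); exit;
    }
}

register_tick_function('checkTimeout');

// declare(ticks=N) makes PHP call the registered function roughly
// every N low-level statements inside this block.
declare(ticks=1000) {
    // ... the regex-heavy processing loop runs here ...
}
```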
     
    rodney88, Jun 27, 2007 IP
  7. Acecool

    Acecool Peon

    #7
    http://www.php.net/get_meta_tags

    get_meta_tags...

    Or only fread() the first 1 KB, or 500 bytes, which should be enough...

    Edit: I use a regex to check over 8300 websites in about 60 seconds..
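    A sketch of the partial-read idea. fetchHead is a made-up helper name; 1 KB is usually enough to cover the title and meta tags near the top of the page:

```php
<?php
// Read only the first $bytes of a page instead of downloading all
// of it. fopen() works on both URLs and local paths.
function fetchHead($url, $bytes = 1024)
{
    $fp = @fopen($url, 'r');
    if (!$fp) {
        return false;
    }
    $head = '';
    // fread() may return short reads on network streams, so loop.
    while (strlen($head) < $bytes && !feof($fp)) {
        $head .= fread($fp, $bytes - strlen($head));
    }
    fclose($fp);
    return $head;
}

// get_meta_tags() can also take a URL directly; it stops parsing
// at </head>, so it never reads the whole document either:
// $tags = get_meta_tags('http://www.example.com/');
```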
     
    Acecool, Jun 27, 2007 IP
  8. amorph

    amorph Peon

    #8
    if ( preg_match ( '/<title>([^<]+)<\/title>/i', addSpecialChars ( $row->page_source ), $title ) )
    {
        return trim ( $title[1] );
    }
    This is the only regex used so far. To lower the time taken I'm now saving the page sources to the database and then performing the regex extractions on the stored data, and it still times out. The regex is very slow, and when it's called more than 20 times across various functions and loops it crashes. Damn, this is hard.

    When I eliminate this regex and set the result to some default, the script finishes in no time, so I'm sure this is the buggy one.
     
    amorph, Jun 27, 2007 IP
  9. Acecool

    Acecool Peon

    #9
    Try using (.*?) between the title tags...

    And only get the first half a KB per site.
     
    Acecool, Jun 27, 2007 IP
  10. amorph

    amorph Peon

    #10
    I will use only the half KB for the title, but what happens when I expand this and want to extract image alt attributes, metadata, h1, h2, h3, links, keyword occurrences and more stuff... This is a project for school and I need to extract everything from 10 pages.
     
    amorph, Jun 27, 2007 IP
  11. gibex

    gibex Active Member

    #11
    How big are those test pages?
     
    gibex, Jun 27, 2007 IP
  12. amorph

    amorph Peon

    #12
    I don't know. My web app will have to work with 10 pages that I won't choose; someone else will. Right now I'm testing with some that are between 20 and 40 KB.
     
    amorph, Jun 27, 2007 IP
  13. rodney88

    rodney88 Guest

    #13
    The regex looks fine - I can't think why you'd have any problems with it. Are you sure it's not just getting stuck in a loop somewhere?

    What does the addSpecialChars function do, or rather, why does it need to be done to the subject for the regex?
     
    rodney88, Jun 28, 2007 IP
  14. amorph

    amorph Peon

    #14
    Well... the pages are first saved using "htmlspecialchars" and then they need to be reverted using "addSpecialChars" in order for the regex to match the titles. Here's the function itself... nothing much...

    function addSpecialChars($string, $noQuotes = FALSE)
    {
        $string = eregi_replace("&amp;", "&", $string);
        if (!$noQuotes) $string = eregi_replace("&#039;", "'", $string);
        $string = eregi_replace('&quot;', '"', $string);
        $string = eregi_replace('&nbsp;', ' ', $string);
        $string = eregi_replace('&lt;', '<', $string);
        $string = eregi_replace('&gt;', '>', $string);
        return $string;
    }
     
    amorph, Jun 28, 2007 IP
  15. KalvinB

    KalvinB Peon

    #15
    If possible, you should consider running the script locally on your own computer and then just upload the results to your server.
     
    KalvinB, Jun 28, 2007 IP
  16. rodney88

    rodney88 Guest

    #16
    The string functions will always be faster than regex (and even then, preg is usually quicker than ereg). If you don't need to use regex, avoid the regex functions.

    You could save a bit of time by using str_replace instead for your addSpecialChars function. And assuming you're saving the result of addSpecialChars (rather than running it again and again for every preg_match), I can't see any reason why you'd need anywhere near 60 seconds for processing 10 pages..
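    A sketch of that suggestion: a single str_replace call with arrays handles all the entities at once, with no regex engine involved. The function name is kept from the thread, and the entity list assumes the same replacements as the original (including &#039; for the single quote):

```php
<?php
// Same behaviour as the eregi_replace version from the thread, but
// with one str_replace call; plain string search is much cheaper
// than running the regex engine six times per page.
function addSpecialChars($string, $noQuotes = FALSE)
{
    $search  = array('&amp;', '&quot;', '&nbsp;', '&lt;', '&gt;');
    $replace = array('&',     '"',      ' ',      '<',    '>');
    if (!$noQuotes) {
        $search[]  = '&#039;';
        $replace[] = "'";
    }
    // Each search term is replaced everywhere in the string, in
    // order, just like the original chain of eregi_replace calls.
    return str_replace($search, $replace, $string);
}
```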
     
    rodney88, Jun 28, 2007 IP
  17. amorph

    amorph Peon

    #17
    addSpecialChars was eating it. It slipped by me. I was focusing on the title regex and never realized the bug was somewhere else. It worked well until it was confronted with something bigger in various loops. Now it's OK... let's hope it stays this way until I finish it.
     
    amorph, Jun 28, 2007 IP