Source code question

Discussion in 'Programming' started by LaPoChE, Jul 21, 2006.

  1. #1
    Hey there.

    I need to pull some information from a number of different websites by reading each site's source code. This is how I'm doing it:

    <cfhttp url="#URL#" method="GET" resolveurl="Yes" throwOnError="Yes"/>
    <cfset sourceCode = htmleditformat(CFHTTP.FileContent)>

    This returns the source code of the URL.

    I run this code in a loop, but it times out before it gets through all the sites. Does anyone know of a better way of doing this? Could it be done using JavaScript?

    thanks
     
    LaPoChE, Jul 21, 2006 IP
  2. woodside

    #2
    Just pass a "?requesttimeout=5000" (or bigger) parameter in the URL. The number is in seconds.
     
    woodside, Jul 21, 2006 IP
  3. LaPoChE

    #3
    I did try passing that, and it still times out after a while.
     
    LaPoChE, Jul 21, 2006 IP
  4. woodside

    #4
    Well, it'll time out when it hits whatever number is set. Do you get the normal ColdFusion timeout error or something else? Set it to a huge number and see what happens.
     
    woodside, Jul 21, 2006 IP
  5. LaPoChE

    #5
    I tried setting the highest timeout possible and still get the same timeout message.
     
    LaPoChE, Jul 21, 2006 IP
  6. tbarr60

    #6
    How many sites is a bunch? If it's under 100 I would create a list of the sites on the page and have each one link to a CFHTTP for that one site. It's a crude method but would get the job done.

    Another approach is to keep an array of the site names in a persistent scope or a database table, and have the page capture one site per load and flag that site as done. At the end of the page, have it reload itself and process the next unprocessed site in the array.
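
    A rough sketch of the one-site-per-page-load idea, under some assumptions: the site list is hard-coded here instead of living in a persistent scope or table, the position is tracked with a hypothetical url.i parameter rather than a "done" flag, and the 30-second cfhttp timeout is an arbitrary value. It reloads with a meta refresh because browsers cap long chains of HTTP redirects.

    ```cfm
    <!--- Hypothetical list of sites; in practice this could come from a database table. --->
    <cfset siteList = "http://www.example.com/,http://www.example.org/">
    <!--- url.i is an illustrative name for the position of the site to fetch on this load. --->
    <cfparam name="url.i" default="1" type="numeric">

    <cfif url.i LTE ListLen(siteList)>
        <cfset site = ListGetAt(siteList, url.i)>
        <!--- Fetch exactly one site per page request; throwOnError is left at its default. --->
        <cfhttp url="#site#" method="GET" resolveurl="Yes" timeout="30"/>
        <!--- Save or parse cfhttp.FileContent here before moving on. --->
        <cfset nextIndex = url.i + 1>
        <cfoutput>
            <meta http-equiv="refresh" content="0;url=#cgi.script_name#?i=#nextIndex#">
            Fetched #url.i# of #ListLen(siteList)#: #HTMLEditFormat(site)#
        </cfoutput>
    <cfelse>
        <cfoutput>All #ListLen(siteList)# sites processed.</cfoutput>
    </cfif>
    ```

    Each page load then only has to stay under the request timeout for a single site, not for the whole list.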
     
    tbarr60, Jul 25, 2006 IP
  7. woodside

    #7
    Where are you putting the "requesttimeout" parameter? In the cfhttp call, or in the URL of the script? It needs to be in the URL of the script, not the cfhttp call.
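
    For what it's worth, on ColdFusion MX and later the page timeout can also be set inside the script itself with cfsetting, which avoids relying on the URL parameter. A minimal sketch; the 600-second value is only an assumption and should be sized to the number of sites in the loop:

    ```cfm
    <!--- Allow this page request up to 600 seconds (an arbitrary example value). --->
    <cfsetting requestTimeout="600">
    ```

    This goes near the top of the page that runs the loop, before the first cfhttp call.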
     
    woodside, Jul 25, 2006 IP
  8. advantage

    #8
    You can try cfxhttp (google it) and do simultaneous requests.
     
    advantage, Jul 27, 2006 IP
  9. datropics

    #9
    Are you getting a request timeout or a connection timeout?

    If you are getting a connection timeout, you may need to adjust your useragent value. Some sites don't accept connections from the "ColdFusion" user agent, which is the default when you use the cfhttp tag, so find out what your browser's user agent is and send that in your HTTP request. Also adjust the timeout of the HTTP request itself.
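
    Something along these lines, as a sketch only: siteUrl is a hypothetical variable holding the address being fetched, the 30-second timeout is an arbitrary value, and cgi.http_user_agent simply passes along the user agent of the browser that requested the page.

    ```cfm
    <!--- siteUrl is an illustrative name; set it to the site being fetched. --->
    <cfset siteUrl = "http://www.example.com/">

    <!--- Send the visiting browser's user agent and give this one request 30 seconds. --->
    <cfhttp url="#siteUrl#" method="GET" resolveurl="Yes"
            useragent="#cgi.http_user_agent#"
            timeout="30"
            throwOnError="Yes"/>
    <cfset sourceCode = HTMLEditFormat(cfhttp.FileContent)>
    ```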

    Is it possible to see the code that you are using to loop through the sites' source code?
     
    datropics, Oct 26, 2006 IP