cURL Remote Download File Size Limiting/Ping Remote URL without Downloading it Fully?

Discussion in 'Programming' started by ColorWP.com, Sep 10, 2008.

  1. #1
    Hello, I require some cURL help.

    I have started a small project: a site for downloading videos from a YouTube-like video sharing site (a small local one).

    The concept behind the videos on that site is this:
    They have around 10 servers for the videos. The video watching pages look like http://site.com/play:12345678 (the number is the unique ID of the video). The download location for the FLV file looks like this:
    http://mediaXX.site.com/s/YY/ZZZZZZZZ.flv
    where:
    XX is the number of the server where this exact video happens to be hosted (e.g. 01, 02, 03 for media01, media02, media03), YY is the first two characters of the unique video ID and ZZZZZZZZ is the full video ID.

    I have already done the part that parses the video watching URL and turns it into this form:
    http://mediaXX.site.com/s/12/12345678.flv, but the server number remains a problem. The only way to get the exact server ID is to loop over all ten servers (media01, media02, ...), request the file from each, and look at the response each one returns. The loop stops as soon as it finds a server that returns a mime/x-flv file and not a mime/xhtml one. The incorrect servers return a small HTML file with a 404 error, and the correct one returns the FLV file.

    I have already figured out how to check the remote files, but I don't want the script to download the file completely (I use cURL): when it reaches the correct server, it downloads the full FLV (usually many megabytes) only to check its MIME type. How can I make cURL download only the first few chunks of the file, just enough to check its MIME type?

    I use DreamHost hosting, so fopen and fsockopen are not an option. Any advice?
     
    ColorWP.com, Sep 10, 2008
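
The URL scheme described in the first post can be sketched as a small helper. This is only an illustration in Python, not the poster's script: the function name `candidate_urls`, the assumption of exactly ten servers numbered 01 to 10, and the two-character `YY` prefix (inferred from the `12/12345678` example) are all assumptions.

```python
from typing import List


def candidate_urls(video_id: str, server_count: int = 10) -> List[str]:
    """Build one candidate FLV URL per media server for a given video ID."""
    # "YY" in the post: assumed to be the first two characters of the ID,
    # as in the example path /s/12/12345678.flv.
    prefix = video_id[:2]
    return [
        f"http://media{n:02d}.site.com/s/{prefix}/{video_id}.flv"
        for n in range(1, server_count + 1)
    ]


if __name__ == "__main__":
    for url in candidate_urls("12345678"):
        print(url)
```

Each URL in the returned list is then a candidate to probe until one server answers with the video.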
  2. Kaizoku

    Kaizoku Well-Known Member

    Messages:
    1,261
    Likes Received:
    20
    Best Answers:
    1
    Trophy Points:
    105
    #2
    Most of these sites use an XML file to feed their Flash players. You need to find out what the XML file is and what its $_GET parameters are.
     
    Kaizoku, Sep 10, 2008
  3. ColorWP.com

    ColorWP.com Notable Member

    Messages:
    3,121
    Likes Received:
    100
    Best Answers:
    1
    Trophy Points:
    270
    #3
    No, they do not in this case. You can see the concept of the video sharing site here: vbox7.com.
     
    ColorWP.com, Sep 11, 2008
  4. Kaizoku

    #4
    Okay, I went through the packets, and it seems you need to use cURL to POST these fields:

    onLoad=%5Btype%20Function%5D (if that doesn't work, use the decoded form [type Function])
    vid=32e47744 (your video ID)

    to this address: http://www.vbox7.com/play/magare.do
    The response is in fact like XML, but in plain-text form.

    You can debug with an HTML file like this:

    
    <form method='post' action='http://www.vbox7.com/play/magare.do'>
    	<input type='hidden' name='onLoad' value='[type Function]'>
    	<input type='text' name='vid'>
    	<input type='submit'>
    </form>
    
    HTML:
     
    Kaizoku, Sep 11, 2008
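
The POST described in post #4 could look roughly like the following Python sketch. The endpoint and the two field names come from the post; the helper names `build_post` and `fetch_video_info`, the timeout, and the UTF-8 decoding of the reply are assumptions, and the format of the response is not shown in the thread, so parsing it is left out.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Address given in post #4.
ENDPOINT = "http://www.vbox7.com/play/magare.do"


def build_post(vid: str) -> Request:
    """Prepare the form POST with the two fields named in the post."""
    fields = {"onLoad": "[type Function]", "vid": vid}
    # quote_via=quote encodes the space as %20 (not "+"), matching the
    # encoded form quoted in the post: %5Btype%20Function%5D.
    body = urlencode(fields, quote_via=quote).encode("ascii")
    return Request(ENDPOINT, data=body, method="POST")


def fetch_video_info(vid: str, timeout: float = 10.0) -> str:
    """Send the POST and return the raw text reply (encoding is assumed)."""
    with urlopen(build_post(vid), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

In PHP's cURL extension the same request would typically be set up with `CURLOPT_POST` and `CURLOPT_POSTFIELDS`.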
  5. happpy

    happpy Well-Known Member

    Messages:
    926
    Likes Received:
    14
    Best Answers:
    0
    Trophy Points:
    120
    #5
    A cURL "header-only" request, cycling through all servers until the right answer comes back, would do the trick.

    If the response is a 404, check the next server; if it is 200 OK, use that URL.
     
    happpy, Sep 11, 2008
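
The "cycle until 200 OK" idea from post #5 can be sketched as follows. This is a hedged Python illustration, not code from the thread: `head_status` and `first_available` are hypothetical names, and the probe is passed in as a parameter so the loop can be tried without touching the network. With PHP's cURL extension, a header-only request is usually made by setting `CURLOPT_NOBODY`.

```python
from typing import Callable, Iterable, Optional
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def head_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Send a HEAD request; return the HTTP status, or None if unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # urllib reports 4xx/5xx responses (such as the 404 pages) as exceptions.
        return err.code
    except OSError:
        # DNS failure, refused connection, or timeout.
        return None


def first_available(
    urls: Iterable[str],
    probe: Callable[[str], Optional[int]] = head_status,
) -> Optional[str]:
    """Return the first URL whose probe answers 200, or None if none does."""
    for url in urls:
        if probe(url) == 200:
            return url
    return None
```

Because the loop returns on the first 200, the remaining servers are never contacted.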
  6. ColorWP.com

    #6
    Yes, that is exactly what I was trying to do, but in my experiments it takes quite some time to check the headers, so I assume it downloads the remote file in order to check its headers, and that is what I am trying to avoid.

    @Kaizoku: Thanks man, it worked that way. Now I just have to figure out how to do the cURL POST and parse the output.
     
    ColorWP.com, Sep 11, 2008
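
If a server handles HEAD requests poorly, one alternative is a GET that asks for only the first bytes and never reads the body. The Python sketch below assumes the media servers honor the HTTP `Range` header, which the thread does not confirm; `range_request` and `sniff_content_type` are hypothetical names. PHP's cURL extension exposes a comparable `CURLOPT_RANGE` option.

```python
from urllib.request import Request, urlopen


def range_request(url: str, max_bytes: int = 1024) -> Request:
    """Build a GET request asking for only the first max_bytes bytes."""
    return Request(url, headers={"Range": f"bytes=0-{max_bytes - 1}"})


def sniff_content_type(url: str, timeout: float = 5.0) -> str:
    """Return the MIME type from the response headers without reading the body."""
    # urlopen returns once the headers have arrived. This function never
    # calls read(), so the body is not consumed before the connection closes.
    # A 404 from a wrong server raises urllib.error.HTTPError here.
    with urlopen(range_request(url), timeout=timeout) as resp:
        return resp.headers.get_content_type()
```

A server that ignores `Range` would still begin sending the whole file, so this limits the transfer only where ranges are supported.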
  7. happpy

    #7
    Then have every file server periodically report its "have-list" to the main site server.
    That saves the time spent checking whether a file is there.
     
    happpy, Sep 12, 2008
  8. ColorWP.com

    #8
    I don't think I understand what you mean. As far as I can tell, you mean that I should store every video ID that users request for download in a database together with its server ID, so that the script first checks whether the video ID is in the database and takes its server ID from there, instead of cycling through the servers?
     
    ColorWP.com, Sep 12, 2008
  9. happpy

    #9
    No. When you tell cURL "headers only", it fetches only the headers.
    Headers are very quick to fetch.

    Try it with curl on the command line:

    curl --head http://google.com > xy.txt && less xy.txt
    
    HTTP/1.1 301 Moved Permanently
    Location: http://www.google.com/
    Content-Type: text/html; charset=UTF-8
    Date: Sun, 14 Sep 2008 20:56:20 GMT
    Expires: Tue, 14 Oct 2008 20:56:20 GMT
    Cache-Control: public, max-age=2592000
    Server: gws
    Content-Length: 219
    
    Code (markup):
     
    happpy, Sep 14, 2008
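
Once a header block like the one above has been captured, pulling out the status code and the Content-Type takes only a few lines. The following Python sketch parses the sample output from post #9; `parse_head` is a hypothetical helper, and it assumes one header per line with no folded continuation lines.

```python
from typing import Dict, Tuple

# Header block copied from the curl --head output in post #9.
SAMPLE = """\
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sun, 14 Sep 2008 20:56:20 GMT
Expires: Tue, 14 Oct 2008 20:56:20 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
"""


def parse_head(raw: str) -> Tuple[int, Dict[str, str]]:
    """Split a raw header block into (status code, lower-cased header dict)."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    # Status line looks like "HTTP/1.1 301 Moved Permanently".
    status = int(lines[0].split()[1])
    headers: Dict[str, str] = {}
    for ln in lines[1:]:
        # Split on the first colon only, so URLs and times in values survive.
        name, _, value = ln.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


if __name__ == "__main__":
    status, headers = parse_head(SAMPLE)
    print(status, headers["content-type"])
```

For the media servers in this thread, the same check would compare the status against 200 and the Content-Type against the expected FLV type.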
  10. Kaizoku

    #10
    For headers, why not use the built-in function get_headers()?
     
    Kaizoku, Sep 14, 2008
  11. happpy

    #11
    Nice, is it new in PHP 5? I did not know about it.
     
    happpy, Sep 14, 2008
  12. Kaizoku

    #12
    Well, it's been in PHP 5 for some time now.
     
    Kaizoku, Sep 14, 2008
  13. ColorWP.com

    #13
    I believe my host (DreamHost) doesn't support that, as it doesn't support fopen or stream_get_headers (or something like that) on remote locations and only supports cURL for this kind of thing. Anyway, I have solved my issue thanks to Kaizoku (rep is coming in a second), and I think this thread can be closed.
     
    ColorWP.com, Sep 16, 2008