Hello, I need some cURL help. I have started a little project: a site for downloading videos from a YouTube-like video sharing site (a small local one).

The videos on that site work like this: there are around ten servers for the videos. A video watching page looks like http://site.com/play:12345678 (12345678 being the unique ID of the video). The download location for the FLV file looks like http://mediaXX.site.com/s/YY/ZZZZZZZZ.flv, where XX is the number of the server this exact video happens to be hosted on (e.g. 01, 02, 03 for media01, media02, media03), YY is the first two characters of the unique video ID, and ZZZZZZZZ is the full video ID.

I have already done the part that parses the video watching URL into http://mediaXX.site.com/s/12/12345678.flv, but the server number remains a problem. The only way to get the exact server ID is to cycle through all ten servers (media01, media02, ...) and check the response each one returns. The cycle stops as soon as it finds a server that returns a video/x-flv file instead of an HTML page: the incorrect servers return a small HTML page with a 404 error, and the correct one returns the FLV file.

I have already figured out how to check the remote files with cURL, but I don't want it to download each file completely, because when it reaches the correct server it would download the full FLV (usually many megabytes) just to check its MIME type. How can I make cURL download only the few bytes needed to check the MIME type? I'm on DreamHost hosting, so fopen and fsockopen are not an option. Any advice?
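For illustration, the URL-building step described above might look like this in PHP (a sketch; the buildFlvUrl helper and the regular expression are mine, and site.com / mediaXX.site.com are the placeholder hosts from the post):

<?php
// Sketch of the URL-building step from the post above.
function buildFlvUrl($playUrl, $serverNum)
{
    // Extract the unique video ID from a URL like http://site.com/play:12345678
    if (!preg_match('~/play:([0-9a-z]+)$~i', $playUrl, $m)) {
        return false;
    }
    $id     = $m[1];                       // full video ID, e.g. 12345678
    $prefix = substr($id, 0, 2);           // first two characters, e.g. 12
    $server = sprintf('%02d', $serverNum); // 1 -> 01, 2 -> 02, ...

    return "http://media{$server}.site.com/s/{$prefix}/{$id}.flv";
}

echo buildFlvUrl('http://site.com/play:12345678', 3);
// prints: http://media03.site.com/s/12/12345678.flv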
Most of these sites use an XML file to feed their Flash players. You need to find out what that XML file is and what its $_GET parameters are.
Okay, I went through the packets, and it seems you need to use cURL to POST these fields to http://www.vbox7.com/play/magare.do:

onLoad=%5Btype%20Function%5D (if that doesn't work, use the decoded form: [type Function])
vid=32e47744 (your video ID)

The response is in effect XML, but in text form. You can debug it with an HTML file like this:

<form method='post' action='http://www.vbox7.com/play/magare.do'>
<input type='hidden' name='onLoad' value='[type Function]'>
<input type='text' name='vid'>
<input type='submit'>
</form>
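A minimal PHP sketch of that POST, for reference (the URL and field names are from the packet capture above; everything else is illustrative):

<?php
// Sketch: POST the onLoad/vid fields to magare.do and capture the response.
$vid = '32e47744'; // your video ID

$ch = curl_init('http://www.vbox7.com/play/magare.do');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
    'onLoad' => '[type Function]', // decoded form of %5Btype%20Function%5D
    'vid'    => $vid,
)));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
$response = curl_exec($ch);
curl_close($ch);

echo $response; // the "XML-like" plain-text answer described above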
a "curl header only request" cycling through all servers until the right answer comes back would do the trick. if 404, check the next, if header is 200 OK, use the url
Yes, that was exactly what I was trying to do, but from my experiments it takes quite some time to check the headers, so I assumed cURL downloads the whole remote file in order to check its headers, which is what I am trying to avoid. @Kaizoku: Thanks man, it worked that way. Now I just have to figure out how to do the cURL POST and parse the output.
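If the magare.do response turns out to be URL-encoded key=value text rather than real XML (an assumption; the post above only calls it "like XML, but in text form"), PHP's parse_str() would split it:

<?php
// Assumption: the response looks like key=value&key2=value2.
// parse_str() splits such a string into an associative array.
$response = 'vid=32e47744&server=03'; // illustrative only, not the real fields
parse_str($response, $fields);
print_r($fields); // Array ( [vid] => 32e47744 [server] => 03 )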
Then let every file server occasionally report its "have list" to the main site server. That way you save the time of checking each server to see whether the file is there.
I don't think I understand what you mean. As far as I can tell, you are suggesting that I insert every video ID the users initiate for download into a database together with its server ID, so the script can check whether a video ID is already in the database and get its server ID from there, instead of cycling through the servers first? Something like the sketch below?
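In other words, a sketch like this (the video_cache table, its columns, and the connection details are hypothetical):

<?php
// Sketch of the caching interpretation above: check a (hypothetical)
// video_cache table first, cycle the servers only on a miss.
$pdo = new PDO('mysql:host=localhost;dbname=videos', 'user', 'pass');

function cachedServer(PDO $pdo, $id)
{
    $stmt = $pdo->prepare('SELECT server FROM video_cache WHERE video_id = ?');
    $stmt->execute(array($id));
    return $stmt->fetchColumn(); // server number, or false on a cache miss
}

function rememberServer(PDO $pdo, $id, $server)
{
    $stmt = $pdo->prepare('INSERT INTO video_cache (video_id, server) VALUES (?, ?)');
    $stmt->execute(array($id, $server));
}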
No. When you tell cURL "headers only", it will fetch only the headers, and headers are very quick to fetch. Try it with curl on the command line:

curl --head http://google.com > xy.txt && less xy.txt

HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sun, 14 Sep 2008 20:56:20 GMT
Expires: Tue, 14 Oct 2008 20:56:20 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
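For completeness, a sketch of the PHP cURL equivalent of that command line (CURLOPT_NOBODY sends a HEAD request, CURLOPT_HEADER includes the headers in the returned string):

<?php
// Sketch: the PHP equivalent of `curl --head http://google.com`.
$ch = curl_init('http://google.com');
curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD: fetch headers only
curl_setopt($ch, CURLOPT_HEADER, true);         // include headers in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return instead of printing
$headers = curl_exec($ch);
curl_close($ch);

echo $headers; // raw response headers, e.g. "HTTP/1.1 301 Moved Permanently ..."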
I believe my host (DreamHost) doesn't support that, as it doesn't allow fopen or stream_get_headers (or something like that) on remote locations and only supports cURL for this kind of thing. Anyway, I have solved my issue thanks to Kaizoku (rep comes in a second), and I consider this thread may be closed.