A way to collect links

Discussion in 'PHP' started by Hades, Aug 15, 2007.

  1. #1
    Basically, I need a program that would collect all Megaupload links from one site and then save them to a .txt file. Is there such a program? If not, is there anyone who can make a program like that, and how much would it cost?

    Note: This isn't an offer to pay someone to build me a script. For now, I'm just asking if anyone knows whether such a program exists.
     
    Hades, Aug 15, 2007
  2. nico_swd

    nico_swd Prominent Member

    Messages:
    4,153
    Likes Received:
    344
    Best Answers:
    18
    Trophy Points:
    375
    #2
    
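    function fetch_megaupload_links($url, $save_as = 'megaupload.txt')
    {
        // Download the page; HTTP URLs need allow_url_fopen to be enabled.
        if (!$source = @file_get_contents($url))
        {
            trigger_error("Unable to open URL at {$url}", E_USER_ERROR);
        }

        // Capture the file ID of every megaupload.com/?d=... link in the page source.
        if (preg_match_all('/megaupload\.com\/\?d=([A-Z0-9]+)/i', $source, $links))
        {
            $urls = array();

            foreach ($links[1] AS $link)
            {
                $urls[] = "http://www.megaupload.com/?d={$link}";
            }

            // Load links saved by earlier runs, creating the file if it is missing.
            if (!$cache = @file($save_as))
            {
                @touch($save_as);
                $cache = array();
            }

            // Stop here if the file cannot be opened for writing.
            if (!$fp = @fopen($save_as, 'w'))
            {
                return false;
            }

            // Merge new and cached links, drop duplicates, and rewrite the file.
            $urls = array_merge($urls, array_map('trim', $cache));
            $stat = @fwrite($fp, implode("\n", array_unique($urls)));
            @fclose($fp);

            return (bool)$stat;
        }

        // No links were found on the page.
        return false;
    }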
    
    
    PHP:

    Usage example:
    
    $url = 'http://bdcomics.bdgamers.net/2007/06/05/el-p-cannibal-ox-company-flow/';
    
    var_dump(fetch_megaupload_links($url));
    
    PHP:
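The merge step at the end of the function can be tried on its own, without fetching anything. This is only a minimal sketch: `merge_links()` is a hypothetical helper name that does not appear in the script above, and the link IDs are made up.

```php
<?php
// Hypothetical helper (not in the original script): combine freshly found
// links with lines already stored in the text file and drop duplicates.
function merge_links(array $found, array $cached)
{
    // Lines read with file() keep their trailing newline, so trim them first.
    $cached = array_map('trim', $cached);

    // Drop empty lines, then remove duplicates and reindex from 0.
    $all = array_filter(array_merge($found, $cached), 'strlen');

    return array_values(array_unique($all));
}

// Made-up sample data: one link is already in the cache.
$found  = array('http://www.megaupload.com/?d=AAAA1111', 'http://www.megaupload.com/?d=BBBB2222');
$cached = array("http://www.megaupload.com/?d=AAAA1111\n", "http://www.megaupload.com/?d=CCCC3333\n");

print_r(merge_links($found, $cached));
```

New links come first and cached ones follow, which mirrors the order of arguments passed to array_merge() in the function.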
     
    nico_swd, Aug 16, 2007
  3. Hades

    Hades Well-Known Member

    Messages:
    1,873
    Likes Received:
    67
    Best Answers:
    0
    Trophy Points:
    150
    #3
    Dude you're awesome. I'm gonna name my son after you. :D

    Though one question: how do I use it? I don't know PHP at all, so I don't know how to make it work.
     
    Hades, Aug 16, 2007
  4. Hades
    #4
    After looking through some tutorials, and assuming I did it right, I placed it on my server, but it doesn't work. It gives me an error on line 10.

    http://www.erosennin.net/animephp/test3.php

    <html>
    <body>
    
    <?php
    
    function fetch_megaupload_links($url, $save_as = 'megaupload.txt')
    {
       
        
        if (preg_match_all('/megaupload\.com\/\?d=([A-Z0-9]+)/i', $source, $links))
        {
            $urls = array();
            
            foreach ($links[1] AS $link)
            {
                $urls[] = "http://www.megaupload.com/?d={$link}";
            }
            
            if (!$cache = @file($save_as))
            {
                @touch($save_as);
                $cache = array();
            }
            
            $fp = @fopen($save_as, 'w');
            $urls = array_merge($urls, array_map('trim', $cache));
            $stat = @fwrite($fp, implode("\n", array_unique($urls)));
            @fclose($fp);
            
            return (bool)$stat;
        }
        
        return false;
    }
    
    $url = 'http://bdcomics.bdgamers.net/2007/06/05/el-p-cannibal-ox-company-flow/';
    
    var_dump(fetch_megaupload_links($url));
    
    ?>
    
    </body>
    </html>
    PHP:
    Can you help?
     
    Hades, Aug 17, 2007
  5. nico_swd
    #5
    Maybe you're using an old version of PHP. Try replacing this:
    
    if (!$source = @file_get_contents($url))
    {
        trigger_error("Unable to open URL at {$url}", E_USER_ERROR);
    }
    
    PHP:
    With:
    
    if (!$fp = fopen($url, 'rb'))
    {
        trigger_error("Unable to open URL at {$url}", E_USER_ERROR);
    }

    $source = '';

    while (!feof($fp))
    {
        $source .= fread($fp, 8192);
    }

    fclose($fp);
    
    
    PHP:
     
    nico_swd, Aug 17, 2007
  6. Hades
    #6
    It gives a new error:

    http://www.erosennin.net/animephp/test6.php
     
    Hades, Aug 17, 2007
  7. nico_swd
    #7
    Seems like allow_url_fopen has been disabled by your server. So you cannot make HTTP requests, UNLESS (our last hope) cURL is installed. Replace the code I just gave you with:

    
    $ch = curl_init($url);
    // Return the page as a string instead of printing it straight to the browser.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    
    $source = curl_exec($ch);
    
    PHP:
    If that doesn't work either, email your host and ask them if allow_url_fopen can be enabled, or if they can install cURL. If not, you'll have to get another host if you need this script.
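The two approaches from this thread can be combined so the script picks whichever one the host allows. What follows is a rough sketch and not the script's actual code: `fetch_source()` is a hypothetical helper name, and the redirect and timeout settings are arbitrary choices. Note that the cURL option is spelled `CURLOPT_RETURNTRANSFER`; without it, curl_exec() prints the page and returns true instead of returning the page as a string.

```php
<?php
// Hypothetical helper (not in the original script): fetch a page with
// whichever transport the server allows. Returns the body or false.
function fetch_source($url)
{
    // First choice: PHP's URL wrappers, if the host has not disabled them.
    if (ini_get('allow_url_fopen'))
    {
        $source = @file_get_contents($url);

        if ($source !== false)
        {
            return $source;
        }
    }

    // Second choice: the cURL extension, if it is installed.
    if (function_exists('curl_init'))
    {
        $ch = curl_init($url);
        // Hand the response back as a string instead of printing it.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Follow redirects and give up after 15 seconds (arbitrary choices).
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 15);

        return curl_exec($ch);
    }

    // Neither transport is available on this server.
    return false;
}

// Quick look at what this server offers.
var_dump(ini_get('allow_url_fopen'), function_exists('curl_init'));
```

Inside fetch_megaupload_links(), the download step would then become `$source = fetch_source($url);` followed by a check for `false`.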
     
    nico_swd, Aug 17, 2007
  8. Hades
    #8
    It worked... at least I think. It took me to the URL, though it didn't copy any links into the folder. I changed to a URL that had many Megaupload links, but it didn't save the file anywhere. I have another host, a VDS, which has cURL installed. I'll try it on that as soon as it restarts (I put in a service request because my proxies are killing it). Hopefully it works.

    http://www.erosennin.net/animephp/test6.php

    That's what it does right now.
     
    Hades, Aug 17, 2007
  9. nico_swd
    #9
    Create a file called megaupload.txt in the same directory. If the script was allowed to, it'd create the file automatically. But try creating it manually. This is where all links will be saved.
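Because the function silences its file calls with `@`, a permissions problem shows up only as bool(false). A small check like the one below can say which case applies. It is a sketch under assumptions: `check_save_target()` is a hypothetical helper, and the messages are made up for illustration.

```php
<?php
// Hypothetical helper (not in the original script): explain why a links file
// can or cannot be written, instead of failing silently behind "@".
function check_save_target($save_as)
{
    if (file_exists($save_as))
    {
        return is_writable($save_as)
            ? 'ok: file exists and is writable'
            : 'error: file exists but is not writable';
    }

    // The file is missing, so PHP needs write access to the directory to create it.
    return is_writable(dirname($save_as))
        ? 'ok: file can be created'
        : 'error: directory is not writable, create the file manually';
}

echo check_save_target(__DIR__ . '/megaupload.txt'), "\n";
```

On many shared hosts a manually created file also has to be made writable for the user the web server runs as, so the first check is worth running again after creating it.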
     
    nico_swd, Aug 17, 2007
  10. Hades
    #10
    Thanks. The second version works on my VDS. I checked in WHM, and allow_url_fopen is on.

    Again, thanks very much :D

    Nick
    +rep
     
    Hades, Aug 17, 2007
  11. Hades
    #11
    Hmm. I checked it on another site, and I think it can't pick up file links that are hidden behind anchor text. For example, if a link looks like this:

    <a href="http://megaupload.com/......">Episode 1</a>

    it won't find it. Is there a way to make it read the link even when it's behind anchor text?

    When it works, it gives the message "bool(true)".

    But right now it's giving the message "bool(false)". The site I set it to definitely has links on it.
     
    Hades, Aug 17, 2007
  12. nico_swd
    #12
    It should work, no matter where the links are. Can you post or PM me the link to the page?
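The pattern runs over the raw page source, so the href attribute is searched like any other text. A short sketch on made-up markup (the file IDs below are invented) shows the same pattern from the function posted earlier matching both a link behind anchor text and a bare URL:

```php
<?php
// The pattern from fetch_megaupload_links(), applied to an inline HTML sample.
$pattern = '/megaupload\.com\/\?d=([A-Z0-9]+)/i';

// Made-up sample markup: one link hidden behind anchor text, one bare URL.
$html = '<a href="http://www.megaupload.com/?d=ABCD1234">Episode 1</a>'
      . "\n" . 'Mirror: http://megaupload.com/?d=wxyz5678';

preg_match_all($pattern, $html, $links);

// $links[1] holds the captured file IDs.
print_r($links[1]);
```

If a page writes the link differently, for example with an extra path segment between the host and `?d=`, this pattern would miss it. That is one possible cause of bool(false), not a confirmed diagnosis for the page in question.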
     
    nico_swd, Aug 17, 2007
  13. Hades
    #13
    Pages I tried collecting links from:

    http://www.anime-sensei.net
    http://www.anime-sensei.net/2006/08/berserk.html

    Page with the code:

    http://surfoxy.com/anime/text8.php

    At first I thought it doesn't go far into subdirectories, so I went to the exact page, but it didn't work there either.

    Update:


    Hmm... this is kind of weird. At first it doesn't work and replies with bool(false), but then if I refresh it a few minutes later, it goes to bool(true).

    Weird...

    One thing though: is it possible for it to go through the whole site and search, instead of just the one page, if I give it just the root domain?
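Searching a whole site means following the site's own links from page to page. One way this might look is sketched below; none of it comes from the script in this thread. `crawl_site()`, its `$fetch` callback and the `$max_pages` limit are hypothetical, relative links are ignored, and the two-page sample site is invented so the sketch runs without a network connection.

```php
<?php
// Hypothetical sketch (not in the original script): visit pages on one host,
// breadth-first, and collect Megaupload links from each page.
// $fetch is any callable that takes a URL and returns HTML or false.
function crawl_site($start, $fetch, $max_pages = 50)
{
    $host    = parse_url($start, PHP_URL_HOST);
    $queue   = array($start);
    $seen    = array($start => true);
    $found   = array();
    $visited = 0;

    while ($queue && $visited < $max_pages)
    {
        $url  = array_shift($queue);
        $html = call_user_func($fetch, $url);
        $visited++;

        if ($html === false)
        {
            continue;
        }

        // Same pattern as fetch_megaupload_links().
        if (preg_match_all('/megaupload\.com\/\?d=([A-Z0-9]+)/i', $html, $m))
        {
            foreach ($m[1] as $id)
            {
                // Array keys de-duplicate the links for free.
                $found["http://www.megaupload.com/?d={$id}"] = true;
            }
        }

        // Queue absolute links that stay on the starting host.
        // Relative links are ignored to keep the sketch short.
        if (preg_match_all('/href=["\'](https?:\/\/[^"\'#]+)/i', $html, $m))
        {
            foreach ($m[1] as $link)
            {
                if (parse_url($link, PHP_URL_HOST) === $host && !isset($seen[$link]))
                {
                    $seen[$link] = true;
                    $queue[]     = $link;
                }
            }
        }
    }

    return array_keys($found);
}

// Made-up two-page site used only to exercise the function.
$pages = array(
    'http://example.test/'  => '<a href="http://example.test/a">A</a>',
    'http://example.test/a' => '<a href="http://www.megaupload.com/?d=ABCD1234">Episode 1</a>',
);

$fetch = function ($url) use ($pages) {
    return isset($pages[$url]) ? $pages[$url] : false;
};

print_r(crawl_site('http://example.test/', $fetch));
```

On a real site the callback would download each page, and the page limit keeps a large site from being requested in full on every run.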

    Oh, and I tried using the $_POST method to make a form so that I can enter the URL on a page, but that didn't work. Is there a way to do it so that I don't always have to edit the page?
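A form version could read the address from `$_POST` and check it before fetching anything. The sketch below is only an illustration under assumptions: `url_from_form()` and the form field name `url` are hypothetical, and the call to fetch_megaupload_links() is left commented out so the snippet runs on its own.

```php
<?php
// Hypothetical helper (not in the original script): pull the target URL out
// of submitted form data and accept only well-formed http/https URLs.
function url_from_form(array $post)
{
    $url = (isset($post['url']) && is_string($post['url'])) ? trim($post['url']) : '';

    if (filter_var($url, FILTER_VALIDATE_URL) === false)
    {
        return null;
    }

    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, array('http', 'https'), true) ? $url : null;
}

$url = url_from_form($_POST);

if ($url !== null)
{
    // fetch_megaupload_links() is the function posted earlier in the thread.
    // var_dump(fetch_megaupload_links($url));
    echo 'Would fetch: ', htmlspecialchars($url), "\n";
}

// The form posts back to the same page; the field name must match 'url'.
echo '<form method="post" action="">'
   . '<input type="text" name="url" size="60">'
   . '<input type="submit" value="Collect links">'
   . '</form>';
```

Restricting the scheme to http and https keeps the form from being pointed at local files on the server.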
     
    Hades, Aug 17, 2007