Need a program

Discussion in 'General Chat' started by rafke, Oct 30, 2006.

  1. #1
    Is there a program to download all the files in an HTML directory?
    Thanks in advance,
    Raf
     
    rafke, Oct 30, 2006 IP
  2. Dudibob

    #2
    From a server?

    Give smartftp.com a go.
     
    Dudibob, Oct 30, 2006 IP
  3. rafke

    #3
    rafke, Oct 30, 2006 IP
  4. Dudibob

    #4
    Oh right. If you don't have FTP access, try saving each page individually and then sorting out the folders afterwards. Without FTP access it's a very, very long and boring job.
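
    If you do end up saving pages one at a time, part of the job can be scripted. Here's a minimal, untested sketch that fetches a single page over HTTP and writes it to disk from Windows Script Host, using the standard MSXML2.XMLHTTP and Scripting.FileSystemObject COM objects. The URL and output path are placeholders, not real addresses.

    // JScript (WSH) -- sketch only; URL and output path are placeholders
    var theURL = "http://www.example.com/somedir/page.html";
    var theFile = "C:\\backup\\page.html";
    
    // synchronous HTTP GET for the page markup
    var xhr = WScript.CreateObject("MSXML2.XMLHTTP");
    xhr.open("GET", theURL, false);
    xhr.send();
    
    if (xhr.status == 200) {
    	// write the returned markup to a text file (true = overwrite)
    	var fso = WScript.CreateObject("Scripting.FileSystemObject");
    	var ts = fso.CreateTextFile(theFile, true);
    	ts.Write(xhr.responseText);
    	ts.Close();
    	WScript.Echo("Saved " + theURL + " to " + theFile);
    } else {
    	WScript.Echo("Request failed: HTTP " + xhr.status);
    }
    Code (markup):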
     
    Dudibob, Oct 30, 2006 IP
  5. rafke

    #5
    Isn't there a program for this?
    I think I saw one a long time ago.
     
    rafke, Oct 30, 2006 IP
  6. revsorg

    #6
    Teleport Pro will grab a whole website, but it can only find files and pages that have links to them.
     
    revsorg, Oct 30, 2006 IP
  7. thes

    #7
    Yeah, it might not be possible if you don't know the filenames or there aren't links to them.
     
    thes, Oct 30, 2006 IP
  8. directorycollector

    #8
    I don't know if this will help, but it might get you off on the right foot.

    This code will crawl a website looking for a particular URL, but it can be modified to save each page it crawls (a sketch of that modification follows the code below).

    // JScript.
    
    /*****************************************************************
     * Poor Man's Backlink Checker.
     *
     * Author: George Asprey
     *
     * Owner of: http://directory.proud-collector.com/ -- Free directory
     *      and: http://www.proud-collector.com/
     *
     * This script requires Internet Explorer, and was tested
     * on WSH 5.6 - XP Media Center with IE 7.0
     *****************************************************************/
    
    function addURL(theURL) {
    	// local flag renamed so it doesn't shadow the function name
    	var shouldAdd = true;
    
    	if (theURL.indexOf(theDomain) == -1)
    		// theURL is an external URL so don't spider it
    		shouldAdd = false;
    	else
    		// make sure theURL wasn't already spidered
    		for (var i = 0; i < theURLs.length; i++)
    			if (theURL == theURLs[i]) {
    				// theURL was already spidered
    				shouldAdd = false;
    				break;
    			}
    
    	if (shouldAdd)
    		// theURL is a new internal link
    		theURLs[theURLs.length] = theURL;
    }
    
    function parsePage(theURL) {
    	IE.Navigate(theURL);
    	while (IE.ReadyState < 4) WScript.Sleep(10);
    
    	// keep theDoc local instead of an implicit global
    	var theDoc = IE.document;
    	while (theDoc.readyState != "complete") WScript.Sleep(10);
    
    	for (var i=0; i<theDoc.links.length; i++) {
    		if (theDoc.links[i].href.indexOf(theSearchDomain) == -1)
    			// this link is not the one we are
    			// looking for so add it to the list
    			addURL(theDoc.links[i].href);
    		else {
    			theSearchDomainFound = true;
    			break;
    		}
    	}
    }
    
    /*****************************************************************
     * Main code for Poor Man's Backlink Checker.
     * This script uses only one thread to check a single
     * domain for a single backlink.
     * Future improvement (Not So Poor Man's Backlink Checker)
     * will utilize multiple threads (or at least I hope so).
     *****************************************************************/
    
    var IE = WScript.CreateObject("internetexplorer.application");
    IE.top = 0;
    IE.left = 0;
    IE.width = 800;
    IE.height = 570;
    IE.visible = true;
    
    var theSearchDomainFound = false;
    var theURLs = new Array();
    var theURLsIndex = 0;
    var maxURLCount = 2000;
    
    // this is the domain where your link should be
    var theDomain = "www.backlinkdomain.com";
    // this is your domain (what we are searching for)
    var theSearchDomain = "www.mydomain.com";
    
    // prime the array with the first URL
    theURLs[0] = "http://" + theDomain + "/";
    
    // start spidering the domain
    do {
    	parsePage(theURLs[theURLsIndex++]);
    } while (theURLsIndex < theURLs.length && theURLsIndex < maxURLCount && theSearchDomainFound == false);
    
    IE.Quit();
    
    if (theSearchDomainFound)
    	WScript.Echo("The domain: " + theSearchDomain + " was found.");
    else
    	WScript.Echo("The domain: " + theSearchDomain + " was not found.");
    
    WScript.Quit();
    Code (markup):
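
    As mentioned above, the script can be modified to save each page it crawls. Here is one untested sketch of that modification: a savePage() helper (a name introduced here, not part of the original script) writes the markup IE has loaded to a numbered file using the standard Scripting.FileSystemObject. The output folder C:\crawl is a placeholder and must already exist.

    // Sketch only: save each crawled page before extracting its links.
    // C:\crawl is a placeholder folder and must already exist.
    var fso = WScript.CreateObject("Scripting.FileSystemObject");
    
    function savePage(theDoc, index) {
    	// documentElement.outerHTML is the markup IE has rendered
    	var ts = fso.CreateTextFile("C:\\crawl\\page" + index + ".html", true);
    	ts.Write(theDoc.documentElement.outerHTML);
    	ts.Close();
    }
    
    // then, inside parsePage(), right after the document has
    // finished loading:
    //     savePage(theDoc, theURLsIndex);
    Code (markup):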
     
    directorycollector, Oct 30, 2006 IP