scraping guide?

dnahosting Active Member

Messages: 385

Likes Received: 9

Best Answers: 0

Trophy Points: 60

#1

I have been looking for a guide to website scraping. I have some of the basics down, but I am having trouble setting the start and stop points: the match runs on to the last </div> on the page, and I want it to stop at the next </div> after the starting <div>, if that makes any sense.

dnahosting, Feb 20, 2007 IP

ErectADirectory Guest

Messages: 656

Likes Received: 65

Best Answers: 0

Trophy Points: 0

#2


Below is a simple function that scrapes a site's Alexa ranking. I hope it points you in the right direction, as it is pretty simple to implement and hack on.
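function get_alexa($url) {
    // Fetch the Alexa details page for the given URL.
    $site = fopen('http://www.alexa.com/data/details/main?url=' . urlencode($url), 'r');
    if ($site === false) {
        return false;
    }
    $total = '';
    while (!feof($site)) {
        $total .= fread($site, 8192);
    }
    fclose($site);
    // U makes (.*) ungreedy, so it stops at the first </span></a>; s lets . match newlines.
    $match_expression = '/for more information about the Alexa Web Information Service\.-->(.*)<\/span><\/a>/Us';
    if (!preg_match($match_expression, $total, $matches)) {
        return false;
    }
    return strip_tags($matches[1]);
}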
PHP:
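
For the start/stop problem in the original question, here is a minimal sketch of the same idea in general form. It assumes the block you want contains no nested <div>; the helper name scrape_between and the sample markup are made up for illustration and are not from any library. A greedy (.*) runs to the last </div>, while the lazy (.*?) stops at the first one after the start marker (the same effect the U modifier has in the function above).

```php
<?php
// Hypothetical helper: returns the text between $start and the first
// occurrence of $stop that follows it, or null when nothing matches.
function scrape_between($html, $start, $stop) {
    // preg_quote escapes the markers (including "/") so they match literally.
    // (.*?) is lazy: it expands only as far as needed to reach $stop.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($stop, '/') . '/s';
    if (preg_match($pattern, $html, $matches)) {
        return $matches[1];
    }
    return null;
}

// Made-up sample markup with two sibling <div> blocks.
$html = '<div class="rank">1,234</div><div class="other">ignore me</div>';

// Greedy: (.*) runs on to the LAST </div> in the string.
preg_match('/<div class="rank">(.*)<\/div>/s', $html, $greedy);
// $greedy[1] is now: 1,234</div><div class="other">ignore me

// Lazy: stops at the first </div> after the start marker.
echo scrape_between($html, '<div class="rank">', '</div>'), "\n"; // prints 1,234
```

If the target <div> can contain other <div> elements, the first </div> is the wrong stop point and a regex like this will cut the content short; an HTML parser is the safer choice in that case.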

ErectADirectory, Feb 20, 2007 IP

dnahosting Active Member

#3

Thanks EAD, I will try it out.

dnahosting, Feb 20, 2007 IP
