Hello, I am trying to extract data from a specific web page. For example, given this page, find the current price of the product: http://www.ocado.com/webshop/product/Ocado-Smoked-Salmon-Long-Sliced/59534011? I am trying to find the best way to do this. The options I know of are:

1) The iMacros extension for Firefox, which allows a script to be run inside Firefox and does not require a server.
2) Perl. I don't know exactly how to do it, but I think it requires a server.
3) PHP. I don't know exactly how to do it, but I think it requires a server.

I would be glad if someone could comment on these methods, mention anything easier, and guide me in the right direction. Many thanks.
It depends on which language you know best. For me, the best way is to use a regular expression (preg_match) in PHP to extract the content.
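A minimal sketch of that approach. The HTML fragment and the class name `nowPrice` are placeholders taken from the product page in the question; in practice you would fetch the real page first (e.g. with file_get_contents or cURL).

```php
<?php
// Hypothetical product-page fragment; normally this would come from
// file_get_contents($url) or a cURL request.
$html = '<span class="nowPrice">3.49</span>';

// Capture the text inside the span with a regular expression.
// The 's' modifier lets '.' match across newlines in real pages.
if (preg_match('%<span class="nowPrice">(.*?)</span>%s', $html, $m)) {
    echo $m[1], "\n"; // prints 3.49
}
?>
```

Note that regexes are brittle against markup changes; if the site reorders attributes or adds whitespace, the pattern needs adjusting, which is why a DOM parser is often the safer choice.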
I prefer Perl (using WWW::Mechanize and an HTML extractor library) to crawl and scrape. With PHP, you can use cURL to crawl and the DOM to scrape.
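A sketch of the PHP side of this answer: cURL for the crawl step and DOMDocument/DOMXPath for the scrape step. The function names and the `nowPrice` class are illustrative; only the demo on an inline snippet is run here, since the fetch depends on the network.

```php
<?php
// Crawl step: fetch a page with cURL (the Perl analogue would be
// WWW::Mechanize's get()).
function fetch_page($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    $html = curl_exec($ch);
    curl_close($ch);
    return $html;
}

// Scrape step: return the text of the first element carrying a given class.
function extract_by_class($html, $class) {
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // '@' suppresses warnings from messy real-world markup
    $xpath = new DOMXPath($dom);
    // Match the class token even when the attribute holds several classes.
    $nodes = $xpath->query(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]"
    );
    return $nodes->length ? trim($nodes->item(0)->textContent) : null;
}

// Demo on an inline fragment (class name matches the Ocado page markup
// quoted in another answer):
$sample = '<p><span class="nowPrice">3.49</span></p>';
echo extract_by_class($sample, 'nowPrice'), "\n"; // prints 3.49
?>
```

The DOM/XPath route survives attribute reordering and whitespace changes that would break a regex-based extractor.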
You can use any programming language. PHP and Perl do not require a server; you can run them locally. It is a good idea to use a library to parse the content, for example http://simplehtmldom.sourceforge.net/. With PHP and Simple HTML DOM, a solution to your problem would look like this:

```php
<?php
require_once('simple_html_dom.php');

$html = file_get_html('http://www.ocado.com/webshop/product/Ocado-Smoked-Salmon-Long-Sliced/59534011?');

$was_price = html_entity_decode(trim($html->find('span.wasPrice', 0)->plaintext), ENT_NOQUOTES, 'UTF-8');
$price     = html_entity_decode(trim($html->find('span.nowPrice', 0)->plaintext), ENT_NOQUOTES, 'UTF-8');

print "Old price $was_price\n";
print "New price $price\n";
?>
```
I use PHP's file_get_contents and then preg_match_all to extract data from certain parts of the page:

```php
$html = file_get_contents("http://example.com");
preg_match_all('%<span class="item">(.*?)</span>%sim', $html, $result);
```