Hi all,

I need a little help with my Amazon price-scraping script. I currently sell over 3,000 items on Amazon and need to stay up to date when prices go down. The script reads my items from a MySQL database, goes to the relevant Amazon page for each one, then scrapes the lowest price. I can then check the prices against my inventory to make sure I'm not too expensive.

The problem is that it only uses the first price scraped. I know where the problem is, but after 6 hours staring at this screen my mind has stopped working. If anyone can help I'd be very grateful.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled Document</title>
</head>
<body>
<?php
// listing script
// connect to the server
mysql_connect( 'localhost','XXXXXXXXX','XXXXXXXXXX' )
    or die( "Error! Could not connect to database: " . mysql_error() );
// select the database
mysql_select_db( XXXXXXXX)
    or die( "Error! Could not select the database: " . mysql_error() );
// retrieve all the rows from the database
$query = "SELECT * FROM `XXXXXXX`";
$results = mysql_query( $query );
// print out the results
if( $results ) {
    while( $contact = mysql_fetch_object( $results ) ) {
        // print out the info
        $id = $contact -> id;
        $sku = $contact -> sku;
        $ASIN = $contact -> ASIN;
?>
<?php
        $url = "http://www.amazon.co.uk/gp/offer-listing/$ASIN/";
        $filepointer = fopen($url,"r");
        if($filepointer){
            while(!feof($filepointer)){
                $buffer = fgets($filepointer);
                $file .= $buffer;
            }
            fclose($filepointer);
        } else {
            die("Could not create a connection to Amazon.co.uk");
        }
?>
<?php
        preg_match("/<span\sclass=\"price\">£(.*)\s/i",$file,$match);
        $result = $match[1];
?>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="5%"><?php echo($id) ?></td>
<td width="23%"><?php echo($sku) ?></td>
<td width="18%"><?php echo($ASIN) ?></td>
<td width="39%"><?php echo($url) ?></td>
<td width="20%"><?php echo($result) ?></td>
</tr>
</table>
<?php
    }
} else {
    die( "Trouble getting info from database: " . mysql_error() );
}
?>
</body>
</html>
If you know where the problem is, start by adding some echo statements to check whether what you suspect is true. Echo out the data and see what your queries are returning, etc. Debug that thing, man!

I usually find that after several hours of beating a dead horse, it's best to step back and think things over; otherwise you just go in circles or get completely off track. Go have a beer (if you are of age!) and try to visualize what you want.
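As a rough sketch of that kind of echo debugging: the `debug_line()` helper and the stand-in rows below are made up for illustration and are not part of the posted script. In the real loop you would print the values coming from `mysql_fetch_object()` and the length of the downloaded page instead.

```php
<?php
// Hypothetical helper (not from the original script): formats one debug line.
function debug_line($label, $value)
{
    // var_export(..., true) returns a readable representation instead of printing it.
    return $label . ' = ' . var_export($value, true) . "\n";
}

// Stand-in rows; in the real script these would come from mysql_fetch_object().
// 'EXAMPLE002' is a placeholder, not a real ASIN.
$rows = array(
    array('id' => 1, 'ASIN' => 'B0001K9W9Y'),
    array('id' => 2, 'ASIN' => 'EXAMPLE002'),
);

foreach ($rows as $row) {
    $url = "http://www.amazon.co.uk/gp/offer-listing/{$row['ASIN']}/";
    // Print each value on every pass so you can see what actually changes.
    echo debug_line('id', $row['id']);
    echo debug_line('url', $url);
}
```

Printing the same few values on every iteration makes it obvious which ones change per product and which ones stay stuck.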
Here's your code. By the way, you should always post code in tags on forums, because it's too hard to read the way you posted it.

[php]
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled Document</title>
</head>
<body>
<?php
// listing script
// connect to the server
mysql_connect( 'localhost','XXXXXXXXX','XXXXXXXXXX' )
    or die( "Error! Could not connect to database: " . mysql_error() );
// select the database
mysql_select_db( XXXXXXXX)
    or die( "Error! Could not select the database: " . mysql_error() );
// retrieve all the rows from the database
$query = "SELECT * FROM `XXXXXXX`";
$results = mysql_query( $query );
// print out the results
if( $results ) {
    while( $contact = mysql_fetch_object( $results ) ) {
        // print out the info
        $id = $contact -> id;
        $sku = $contact -> sku;
        $ASIN = $contact -> ASIN;
        // build the URL once so it can be both fetched and echoed in the table
        $url = "http://www.amazon.co.uk/gp/offer-listing/$ASIN/";
        if( !( $data = file_get_contents( $url ) ) ) {
            die("Could not create a connection to Amazon.co.uk");
        }
        preg_match("/<span class=\"price\">£(.*)\s/i", $data, $match );
        $result = $match[1];
?>
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="5%"> <?php echo($id) ?> </td>
<td width="23%"> <?php echo($sku) ?> </td>
<td width="18%"> <?php echo($ASIN) ?> </td>
<td width="39%"> <?php echo($url) ?> </td>
<td width="20%"> <?php echo($result) ?> </td>
</tr>
</table>
<?php
    }
} else {
    die( "Trouble getting info from database: " . mysql_error() );
}
?>
</body>
</html>
[/php]

Also, you're using $ASIN as a variable name, and asin() is a built-in PHP function. It won't break anything, since variables and functions don't share a namespace, but it's confusing and I'd call it bad practice.
Before I can say what's wrong with the script, can you post a link to a working http://www.amazon.co.uk/gp/offer-listing/$ASIN/ page? My guess is that you should be using preg_match_all if you want more than one result from each page; preg_match stops at the first occurrence.
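To illustrate the difference, here is a minimal sketch run against a made-up HTML fragment. The markup and prices are assumptions for illustration only; the real offer-listing page may look different. The pattern also captures just digits, dots and commas rather than the greedy `(.*)` from the script, which is a swapped-in choice and not what was posted.

```php
<?php
// Made-up fragment standing in for an offer-listing page; the real markup may differ.
$html = '<span class="price">£12.99</span> '
      . '<span class="price">£14.50</span> '
      . '<span class="price">£15.00</span>';

// [0-9.,]+ keeps the capture to the number itself, unlike a greedy (.*).
$pattern = '/<span class="price">£([0-9.,]+)/';

// preg_match() stops at the first occurrence.
preg_match($pattern, $html, $first);

// preg_match_all() collects every occurrence; $all[1] holds the captured groups.
preg_match_all($pattern, $html, $all);

echo "First price: {$first[1]}\n";
echo "All prices: " . implode(', ', $all[1]) . "\n";
```

If only the first (lowest) price on the page is needed, preg_match is enough; preg_match_all only matters when every listed price is wanted.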
Hi krakjoe,

Thanks for the advice, it's always appreciated. Please find a working link below.

http://www.amazon.co.uk/gp/offer-listing/B0001K9W9Y/

I think I'm OK using preg_match, as I only want the first price on the page (I need to know what the lowest marketplace price is). I think the problem is near $match[1]: the array doesn't seem to update when a new product is selected from my database.
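Judging only from the script as first posted, one thing that would produce exactly this symptom is the download buffer rather than $match. preg_match overwrites $match on every call, but $file is built up with `.=` inside the while loop and is never emptied, so each new page gets appended after the earlier ones and the first match is always the first product's price. The version that assigns the result of file_get_contents to $data does not accumulate in this way. A minimal sketch with made-up page bodies instead of live requests (the HTML, the prices and the placeholder ASIN `EXAMPLE002` are assumptions for illustration):

```php
<?php
// Made-up page bodies standing in for what fopen()/fgets() would download.
$pages = array(
    'B0001K9W9Y' => '<span class="price">£12.99 </span>',
    'EXAMPLE002' => '<span class="price">£7.49 </span>',
);
$pattern = '/<span class="price">£([0-9.,]+)/';

// Mirrors the original loop: $file is appended to and never cleared.
$file = '';
$accumulated = array();
foreach ($pages as $asin => $body) {
    $file .= $body;
    preg_match($pattern, $file, $match);
    $accumulated[$asin] = $match[1];
}

// Same loop, but the buffer starts empty for each product.
$fresh = array();
foreach ($pages as $asin => $body) {
    $file = '';
    $file .= $body;
    preg_match($pattern, $file, $match);
    $fresh[$asin] = $match[1];
}

print_r($accumulated); // both entries show the first product's price
print_r($fresh);       // each entry shows its own price
```

If that is what is happening, setting `$file = '';` just before the fopen() call, or switching to file_get_contents, should be enough to make each row show its own price.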