PHP Parsing


fouadz
Oct 1st 2007, 10:46 am
Hi,

I'm trying to parse some information from a webpage, but I get strange results.


$regexxp = "/\<TR CLASS=oddrow><TD ALIGN=CENTER CLASS=grid><\/TD><TD CLASS=grid>(.*?)<\/TD><TD CLASS=grid>(.*?)<\/TD><TD CLASS=grid><IMG SRC=\/images\/.*.gif>(.*)<\/TD><TD><NOBR><A HREF='..\/teams\/.*.html' CLASS=grid>(.*?)<\/A><\/NOBR><\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid><STRONG>(.*?)<\/STRONG><\/TD><\/TR>/i";
$filename = 'debug.txt';
$debfile = fopen($filename,"w");
$webpage = file_get_contents ("http://itv.stats.football365.com/dom/ENG/PR/overview.html");
// str_replace() performs the same literal substitutions; ereg_replace() was removed in PHP 7.
$webpage = str_replace("<!-- OverviewTable -->","",$webpage);
$webpage = str_replace("even","odd",$webpage);
$webpage = str_replace("prom","grid",$webpage);
$webpage = str_replace("rel","grid",$webpage);
$webpage = str_replace("<TD><IMGl","<TD CLASS=grid><IMG",$webpage);
$webpage = str_replace("gif></TD>","gif>=</TD>",$webpage);
$matches = preg_match_all($regexxp,$webpage,$tablefix);


for ($i=0; $i<count($tablefix[0]); $i++) {
print $tablefix[0][$i]."-";
echo "<br>";
}

fwrite($debfile,$webpage);
fclose($debfile);



The result is very bad :mad:


594Everton8413108213-
6115Portsmouth83321512312-
7103Blackburn Rovers733175212-
862Chelsea833278-112-
954Newcastle United632195411-
1082Aston Villa631274310-
1174West Ham United731397210-
13141Birmingham City8224710-38-
14151Sunderland8224813-58-
15132Middlesbrough92251016-68-
16171Fulham81431214-27-
17161Reading8215918-97-


It doesn't make sense :D

Anyway, I would really appreciate it if someone could help me with this one.


Take care

tamen
Oct 1st 2007, 1:45 pm
I haven't looked closely at your code, but whenever I do something similar, I chop the input into smaller pieces.
If you explode the HTML into chunks that each hold only one record, it's much easier to handle with a regexp.
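
A minimal sketch of that idea, not a drop-in fix: the markup in $html below is a made-up, simplified stand-in for the real page (only the two team names and point totals are taken from the output posted above), and the variable names are hypothetical. It first cuts the page into one chunk per row, then runs a small per-cell pattern on each chunk.

```php
<?php
// Made-up sample markup standing in for the downloaded page; the real page has more columns.
$html = "<TABLE>"
      . "<TR CLASS=oddrow><TD CLASS=grid>5</TD><TD CLASS=grid>Everton</TD><TD ALIGN=RIGHT CLASS=grid><STRONG>13</STRONG></TD></TR>"
      . "<TR CLASS=evenrow><TD CLASS=grid>6</TD><TD CLASS=grid>Portsmouth</TD><TD ALIGN=RIGHT CLASS=grid><STRONG>12</STRONG></TD></TR>"
      . "</TABLE>";

// Step 1: cut the page into one chunk per table row.
// explode() is case-sensitive, so this assumes the tags are upper case as in the sample.
$chunks = explode('<TR', $html);
array_shift($chunks); // drop everything before the first row

$rows = array();
foreach ($chunks as $chunk) {
    // Step 2: one small pattern per cell instead of one huge pattern per row.
    preg_match_all('/<TD[^>]*>(.*?)<\/TD>/is', $chunk, $cells);
    // strip_tags() removes nested markup such as <A>, <IMG> or <STRONG>.
    $rows[] = array_map('trim', array_map('strip_tags', $cells[1]));
}

foreach ($rows as $row) {
    print implode(' | ', $row) . "\n";
}
```

Because each cell is matched on its own, a row with slightly different attributes (oddrow vs. evenrow, an extra image) no longer makes the whole row fail to match, so the chain of replacements that normalises the page first may not be needed.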

crazyryan
Oct 1st 2007, 1:50 pm
I can't really help but you might wanna check out this:
http://uk2.php.net/manual/en/ref.curl.php#75126

sdemidko
Oct 2nd 2007, 2:10 pm
The bug is very simple: $tablefix[0] holds the full matches, and the capture groups start at index 1. Try using
$tablefix[1][$i]
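
To illustrate how preg_match_all() lays out its result, here is a small sketch on made-up, simplified markup (three columns only, so the pattern is an assumption and not the one from the first post). With the default ordering, index 0 contains the complete matched text, tags included, which is most likely why the printed rows looked like run-together numbers once the browser rendered them.

```php
<?php
// Made-up, simplified rows; the real page has more columns and attributes.
$html = "<TR><TD>5</TD><TD>Everton</TD><TD>13</TD></TR>"
      . "<TR><TD>6</TD><TD>Portsmouth</TD><TD>12</TD></TR>";

$pattern = '/<TR><TD>(.*?)<\/TD><TD>(.*?)<\/TD><TD>(.*?)<\/TD><\/TR>/i';
preg_match_all($pattern, $html, $tablefix);

// $tablefix[0] holds the full matches (whole <TR>...</TR> strings).
// $tablefix[1], $tablefix[2], $tablefix[3] hold the 1st, 2nd and 3rd capture group.
$lines = array();
for ($i = 0; $i < count($tablefix[0]); $i++) {
    $lines[] = $tablefix[1][$i] . ' ' . $tablefix[2][$i] . ' ' . $tablefix[3][$i];
}
print implode("\n", $lines) . "\n";

// With PREG_SET_ORDER the result is grouped per match instead of per group:
// $sets[0] is array(full match, '5', 'Everton', '13').
preg_match_all($pattern, $html, $sets, PREG_SET_ORDER);
foreach ($sets as $set) {
    print $set[2] . ': ' . $set[3] . " points\n";
}
```

PREG_SET_ORDER is often the more convenient layout for table data, since each element of the result then corresponds to one row.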