PHP Parsing

Discussion in 'PHP' started by fouadz, Oct 1, 2007.

  1. #1
    Hi,

    I'm trying to parse some information from a webpage, but I get strange results.

    
    		$regexxp = "/\<TR CLASS=oddrow><TD ALIGN=CENTER CLASS=grid><\/TD><TD CLASS=grid>(.*?)<\/TD><TD CLASS=grid>(.*?)<\/TD><TD CLASS=grid><IMG SRC=\/images\/.*.gif>(.*)<\/TD><TD><NOBR><A HREF='..\/teams\/.*.html' CLASS=grid>(.*?)<\/A><\/NOBR><\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid>(.*?)<\/TD><TD ALIGN=RIGHT CLASS=grid><STRONG>(.*?)<\/STRONG><\/TD><\/TR>/i";
    	$filename = 'debug.txt';
    	$debfile = fopen($filename,"w");
    	$webpage = file_get_contents ("http://itv.stats.football365.com/dom/ENG/PR/overview.html");
    	$webpage = ereg_replace("<!-- OverviewTable -->","",$webpage);
    	$webpage = ereg_replace("even","odd",$webpage);
    	$webpage = ereg_replace("prom","grid",$webpage);
    	$webpage = ereg_replace("rel","grid",$webpage);
    	$webpage = ereg_replace("<TD><IMGl","<TD CLASS=grid><IMG",$webpage);
    	$webpage = ereg_replace("gif></TD>","gif>=</TD>",$webpage);
    	$matches = preg_match_all($regexxp,$webpage,$tablefix);
    
    
    	for ($i=0; $i<count($tablefix[0]); $i++) {
    		print $tablefix[0][$i]."-";
    		echo "<br>";
    	} 
    
    	fwrite($debfile,$webpage); 
    
    PHP:

    The result is very bad :mad:

    
    594Everton8413108213-
    6115Portsmouth83321512312-
    7103Blackburn Rovers733175212-
    862Chelsea833278-112-
    954Newcastle United632195411-
    1082Aston Villa631274310-
    1174West Ham United731397210-
    13141Birmingham City8224710-38-
    14151Sunderland8224813-58-
    15132Middlesbrough92251016-68-
    16171Fulham81431214-27-
    17161Reading8215918-97-
    
    Code (markup):
    It doesn't make sense :D

    Anyway, I would really appreciate it if someone could help me with this one.


    Take care
     
    fouadz, Oct 1, 2007
  2. tamen

    #2
    I haven't looked closely at your code, but whenever I do something similar, I chop the input into smaller pieces.
    If you explode the HTML into bits that each hold only one record, it's much easier to handle with a regexp.
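
    A minimal sketch of that idea, not the poster's own code: split the page into one chunk per table row, then run a small per-cell pattern on each chunk. The sample markup in $html is made up for illustration and is only assumed to resemble the real page; the names $chunks, $cells and $records are likewise hypothetical.

```php
<?php
// Hypothetical sample markup; the real page's HTML may differ.
$html = "<TABLE>"
      . "<TR CLASS=oddrow><TD CLASS=grid>5</TD><TD CLASS=grid>Everton</TD><TD ALIGN=RIGHT CLASS=grid>13</TD></TR>"
      . "<TR CLASS=evenrow><TD CLASS=grid>6</TD><TD CLASS=grid>Portsmouth</TD><TD ALIGN=RIGHT CLASS=grid>12</TD></TR>"
      . "</TABLE>";

// Split into one chunk per row. The first chunk is whatever precedes
// the first <TR, so it is discarded.
$chunks = preg_split('/<TR\b/i', $html);
array_shift($chunks);

$records = [];
foreach ($chunks as $chunk) {
    // Inside a single row, a short pattern per cell is enough.
    preg_match_all('/<TD[^>]*>(.*?)<\/TD>/is', $chunk, $cells);
    // $cells[1] holds the captured cell contents; strip_tags removes
    // nested markup such as <STRONG> or <A> if any is present.
    $records[] = array_map('strip_tags', $cells[1]);
}

print_r($records);
```

    Because each row is handled on its own, a row with unexpected markup only affects its own record, and there is no need to rewrite class names such as "even" to "odd" beforehand.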
     
    tamen, Oct 1, 2007
  3. crazyryan

    #3
    crazyryan, Oct 1, 2007
  4. sdemidko

    #4
    The bug is very simple: $tablefix[0][$i] holds the whole matched row, tags included, not a single captured field. Try using
    $tablefix[1][$i]
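
    To illustrate how preg_match_all fills its result array, here is a small sketch. The two-column rows in $html are a made-up stand-in for the real table, and the names $m, $teams, $points and $rows are hypothetical. With the default flag (PREG_PATTERN_ORDER), index 0 holds the full matches and indexes 1, 2, ... hold the captured groups.

```php
<?php
// Hypothetical two-column rows standing in for the real table.
$html = "<TR><TD>Everton</TD><TD>13</TD></TR>"
      . "<TR><TD>Portsmouth</TD><TD>12</TD></TR>";

$pattern = '/<TR><TD>(.*?)<\/TD><TD>(.*?)<\/TD><\/TR>/i';

// Default order (PREG_PATTERN_ORDER):
// $m[0] = full matches, $m[1] = first group, $m[2] = second group.
preg_match_all($pattern, $html, $m);
$teams  = $m[1];
$points = $m[2];

// PREG_SET_ORDER groups the results per match instead,
// which suits row-by-row output.
preg_match_all($pattern, $html, $rows, PREG_SET_ORDER);
foreach ($rows as $row) {
    // $row[0] is the whole matched row; $row[1] and $row[2] are the fields.
    echo htmlspecialchars($row[1]) . ' - ' . htmlspecialchars($row[2]) . "<br>\n";
}
```

    Printing $m[0][$i] in a browser shows the matched HTML with its tags rendered, so the cell values appear run together, which would explain output like the lines in the first post.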
     
    sdemidko, Oct 2, 2007