Stripping lines out of a file...

Discussion in 'PHP' started by stma, Mar 5, 2008.

  1. #1
    $lines = file($shufflefile);
    	$mine = $lines[array_rand($lines)] ;
    	$tabbreaks = explode("\t",$mine);
    Code (markup):
    I'm grabbing data from a tab-delimited file with the code above. I end up with three values:

    tabbreaks[0]
    tabbreaks[1]
    tabbreaks[2]

    $tabbreaks[1] contains a URL, and my issue is that some of them point to .js, .css, .jpg, etc. files. I need to discard any line that contains certain values in one of the tabbreaks, so the loop that follows never has to deal with it.

    I've tried a few things and I'm just spinning my wheels. Any suggestions?
     
    stma, Mar 5, 2008 IP
  2. nico_swd

    nico_swd Prominent Member

    Messages:
    4,153
    Likes Received:
    344
    Best Answers:
    18
    Trophy Points:
    375
    #2
    The easiest way would be to discard them inside the loop. (That would also be the fastest in terms of execution time.)
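A minimal sketch of the in-loop approach, assuming the URL sits in the second tab-separated column as described in the first post. The helper name `has_unwanted_extension`, the list of blocked extensions, and the sample lines are all made up for illustration; they are not from the original file.

```php
<?php
// Hypothetical helper: true when the URL's path ends in an extension we want to skip.
function has_unwanted_extension($url, array $blocked = array('js', 'css', 'jpg', 'jpeg', 'png', 'bmp'))
{
    $path = parse_url(trim($url), PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // no path component (or unparseable URL): nothing to block
    }
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, $blocked, true);
}

// Made-up sample lines standing in for file($shufflefile).
$lines = array(
    "http://example.com/a\thttp://example.com/page.html\tok\n",
    "http://example.com/a\thttp://example.com/style.css\tok\n",
    "http://example.com/a\thttp://example.com/app.js?v=2\tok\n",
);

$kept = array();
foreach ($lines as $line) {
    $tabbreaks = explode("\t", rtrim($line, "\r\n"));
    if (!isset($tabbreaks[1]) || has_unwanted_extension($tabbreaks[1])) {
        continue; // discard this line and move on to the next one
    }
    $kept[] = $tabbreaks[1];
}

print_r($kept);
```

Checking the extension of the URL path, rather than the raw line, also catches URLs that carry a query string after the file name.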

    But anyway, can you be more specific about the "certain values"? Which files do you want to allow? .php? .html?

    EDIT:
    Maybe you can do something like this:
    
    function strip_lines(array $line)
    {
    	return !preg_match('~\.(jpe?g|bmp|css|js|png)(\t|[\r\n]|$)~i', $line);
    }
    
    $lines = file($shufflefile);
    $lines = array_filter($lines, 'strip_lines');
    
    $mine = $lines[array_rand($lines)] ;
    $tabbreaks = explode("\t",$mine);
    
    PHP:
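To see what that pattern does and does not catch, here is a small sketch that runs it directly against a few made-up lines (the example.com URLs are placeholders, not data from the real file). It only exercises `preg_match()`; the extension has to be followed by a tab, a line break, or the end of the string for a line to count as a match.

```php
<?php
$pattern = '~\.(jpe?g|bmp|css|js|png)(\t|[\r\n]|$)~i';

// Made-up lines; file() keeps the trailing newline on each one.
$samples = array(
    "a\thttp://example.com/logo.PNG\tok\n",   // extension followed by a tab
    "a\thttp://example.com/site.css\n",        // extension followed by a newline
    "a\thttp://example.com/page.html\tok\n",  // extension not in the list
    "a\thttp://example.com/app.js?v=2\tok\n", // query string follows the extension
);

$results = array();
foreach ($samples as $line) {
    $results[] = preg_match($pattern, $line); // 1 = would be filtered out, 0 = kept
}

print_r($results);
```

Note that the pattern looks at the whole line, so an unwanted extension in any column triggers it, and a URL such as `app.js?v=2` slips through because `?` follows the extension.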
     
    nico_swd, Mar 5, 2008 IP
  3. stma

    stma Peon

    Messages:
    37
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Big help there - thank you!
     
    stma, Mar 5, 2008 IP
  4. stma

    #4
    Well, I'm almost there, but it's not quite working. The output is no different from before: the values basically aren't being scrubbed.

    function strip_lines ($line)
    {
        return !preg_match('~\.(jpe?g|bmp|css|js|png)(\t|[\r\n]|$)~i'), $line);
    }
    function Randompostingtime($shufflefile)  {
        global $change;
        global $name;
        $lines = file($shufflefile);
        $mine = $lines[array_rand($lines)];
        $lines = array_filter($lines, 'strip_lines');
        $tabbreaks = explode("\t",$mine);

        $name = $tabbreaks[3];
        $Link= $tabbreaks[1];
        $tags = get_meta_tags($Link);
        $desc = $tags['description'];
        $desc = mysql_real_escape_string($desc);

        $change = '<p><a href="'.$Link.'">'.$name.'</a></p>
    <p>'.$desc.'</p>';

        echo $change;
    }
    
    while ($counter <= $total) {
    
    Randompostingtime("file.txt");
    
    ## rest of the loop
    
    Code (markup):
     
    stma, Mar 5, 2008 IP
  5. nico_swd

    #5
    This line:
    
    $mine = $lines[array_rand($lines)];
    
    PHP:
    Should be below the array_filter() line, so that the bad lines have already been filtered out when we pick a random line.
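A short sketch of that ordering, using the same pattern from earlier in the thread inside an anonymous function. The sample lines are invented, and the empty-array guard is an addition that the thread does not discuss.

```php
<?php
// Made-up lines standing in for file($shufflefile).
$lines = array(
    "a\thttp://example.com/one.js\tok\tScript\n",
    "a\thttp://example.com/two.html\tok\tPage two\n",
    "a\thttp://example.com/three.css\tok\tStyles\n",
);

// Keep a line only when the thread's pattern does not match it.
$keep = function ($line) {
    return !preg_match('~\.(jpe?g|bmp|css|js|png)(\t|[\r\n]|$)~i', $line);
};

$lines = array_filter($lines, $keep);   // 1. filter first

$link = null;
if (count($lines) > 0) {                // array_rand() needs a non-empty array
    $mine = $lines[array_rand($lines)]; // 2. then pick from the survivors
    $tabbreaks = explode("\t", rtrim($mine, "\r\n"));
    $link = $tabbreaks[1];
}

var_dump($link);
```

`array_filter()` preserves the original keys, which is fine here because `array_rand()` returns one of the keys that still exist.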
     
    nico_swd, Mar 5, 2008 IP
  6. stma

    #6
    Looks like the problem is with the strip_lines function.

    function strip_lines ($line)
    {
    return !preg_match('~\.(jpe?g|bmp|css|js|png)(\t|[\r\n]|$)~i'), $line);
    }

    ## Kicks up an error saying unexpected ',' on the line with the preg_match.



    I did have to change the declaration from strip_lines(array $line) to strip_lines($line), as you see above. None of the variations I tried with array worked: it didn't like (array($line)) or array($line), and kept giving me parse errors.
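For what it's worth, the "unexpected ','" error is consistent with the closing parenthesis that sits right after the pattern string: it ends the `preg_match()` call before `$line` is passed in. Dropping `array` from the parameter is the right move as well, since `array_filter()` hands the callback one line (a string) at a time. A sketch of the function with both points addressed, tested against made-up lines:

```php
<?php
// No type declaration: array_filter() passes each line to the callback as a string.
function strip_lines($line)
{
    // The pattern string is followed directly by ", $line"; there is no ")" in between.
    return !preg_match('~\.(jpe?g|bmp|css|js|png)(\t|[\r\n]|$)~i', $line);
}

// Made-up lines: the first should be dropped, the second kept.
$dropped = strip_lines("a\thttp://example.com/x.js\tok\n");
$kept    = strip_lines("a\thttp://example.com/x.html\tok\n");

var_dump($dropped, $kept);
```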
     
    stma, Mar 5, 2008 IP
  7. nico_swd

    #7
    Can you post an example of the text file you're parsing, so I can try it locally and test it?
     
    nico_swd, Mar 5, 2008 IP
  8. stma

    #8
    stma, Mar 5, 2008 IP
  9. stma

    #9
    OriginPage	LinkToPage	LinkToPageStatus	LinkToPageTitle	OriginPageDate
    http://www.constructionmaterials.home9improvement.com/key/Colorado	http://www.constructionmaterials.home9improvement.com/	ok	Construction Materials
    Code (markup):
    That might work a little better. Apparently quote blocks don't keep the tabs.
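Going by this sample, the file starts with a header row, and a random pick could land on it and yield "LinkToPage" as the link. A hedged sketch of reading the sample by column position follows; it assumes the first line is always the header and that LinkToPage and LinkToPageTitle are the second and fourth columns, as the sample suggests. The `link` and `name` keys are illustrative.

```php
<?php
// The two sample lines from the post above, with tabs written as \t.
$lines = array(
    "OriginPage\tLinkToPage\tLinkToPageStatus\tLinkToPageTitle\tOriginPageDate\n",
    "http://www.constructionmaterials.home9improvement.com/key/Colorado\thttp://www.constructionmaterials.home9improvement.com/\tok\tConstruction Materials\n",
);

array_shift($lines); // assumption: the first line is always the header row

$rows = array();
foreach ($lines as $line) {
    $tabbreaks = explode("\t", rtrim($line, "\r\n"));
    if (count($tabbreaks) < 4) {
        continue; // too few columns to hold both a link and a title
    }
    $rows[] = array('link' => $tabbreaks[1], 'name' => $tabbreaks[3]);
}

print_r($rows);
```

Trimming the line ending before `explode()` keeps the trailing newline that `file()` leaves on each line out of the last column.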
     
    stma, Mar 5, 2008 IP
  10. stma

    #10
    I've spent a couple of hours on this now :) and I'm about to give up and go back to Python or Ruby to do it. Does anyone else see a problem, or a reason this isn't working?
     
    stma, Mar 5, 2008 IP