php string remove paragraphs with question marks

Discussion in 'PHP' started by wormser, Sep 20, 2006.

  1. #1
    I have a feed that I need to clean up. I'm not too experienced with all of the string replace functions so I was hoping you guys might be able to help. Right now I have some paragraphs that need to be removed. The paragraphs have a "????" in them and they are each separated by a <p>. Is there any way to remove a paragraph if it contains the four question marks in a row?
     
    wormser, Sep 20, 2006 IP
  2. wmtips

    wmtips Well-Known Member

    Messages:
    601
    Likes Received:
    70
    Best Answers:
    1
    Trophy Points:
    150
    #2
    Here is the code. Every block that starts with <p>, ends with </p>, and contains "????" is removed from $s:
    
    $s = preg_replace('/<p>.*?\?{4}.*?<\/p>/si','',$s);
    PHP:
     
    wmtips, Sep 21, 2006 IP
  3. matthewk

    matthewk Guest

    Messages:
    265
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Hey wmtips, I've been doing PHP for a while and have learned about preg_replace and other regular expression functions, but I never really understood how they work. Where did you learn how to use them? Thanks.
     
    matthewk, Sep 21, 2006 IP
  4. wmtips

    #4
    Just learn the syntax. One of my favourite tutorials (for .NET, but very similar to PHP) is here. I would also recommend downloading the recently published Regular Expressions Cheat Sheet.
     
    wmtips, Sep 21, 2006 IP
  5. wormser

    wormser Well-Known Member

    Messages:
    112
    Likes Received:
    6
    Best Answers:
    0
    Trophy Points:
    138
    #5
    Thanks for the code wmtips. I can't believe I posted that 5 days ago...feels like 5 hours ago. I'm still having a bit of a problem, here's what's up:
    This displays the third paragraph only. I think it sees the first "<p>" instead of the one closest to the "????" and replaces everything. Any ideas?
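
The behaviour described above can be reproduced with a small sketch. The sample string is hypothetical (it is not the actual feed), and it assumes only the second of three paragraphs contains "????". Because the lazy `.*?` starts at the first `<p>` it finds and is allowed to run across `</p><p>` boundaries, the match stretches from the first paragraph to the end of the second:

```php
<?php
// Hypothetical sample feed: three paragraphs, only the second contains "????".
$s = '<p>First paragraph.</p><p>Second ???? paragraph.</p><p>Third paragraph.</p>';

// The pattern from post #2, unchanged.
$result = preg_replace('/<p>.*?\?{4}.*?<\/p>/si', '', $s);

// The first paragraph is removed along with the second.
echo $result, PHP_EOL; // <p>Third paragraph.</p>
```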
     
    wormser, Sep 25, 2006 IP
  6. wmtips

    #6
    Hmm, yes, I was wrong.
    OK, it's a bit harder. If we assume that paragraphs can contain other tags, the regex becomes:
    $s = preg_replace('/<p>((?!<p>).)*?\?{4}.*?<\/p>/si','', $s);
    PHP:
    If paragraphs do not contain any HTML tags, the regex can be simpler.
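
As a rough sketch of how the corrected pattern behaves, here it is rewritten with the `x` modifier so each part can carry a comment. The sample string and variable names are hypothetical. The last pattern, `$plainOnly`, is only one assumed form of the "simpler" regex for tag-free paragraphs; it was not given in this thread.

```php
<?php
// Hypothetical sample feed: the first paragraph contains a nested tag,
// the second contains the "????" marker.
$s = '<p>First <b>bold</b> paragraph.</p>'
   . '<p>Second ???? paragraph.</p>'
   . '<p>Third paragraph.</p>';

// Same pattern as above, with "~" as the delimiter so "</p>" needs no
// escaping, and the x modifier so whitespace and "#" comments are ignored.
$withTags = '~
    <p>             # opening tag of a candidate paragraph
    ((?!<p>).)*?    # any characters, but never step onto another <p>
    \?{4}           # four literal question marks in a row
    .*?             # rest of the paragraph, as short as possible
    </p>            # closing tag
~six';

$clean = preg_replace($withTags, '', $s);
echo $clean, PHP_EOL;
// <p>First <b>bold</b> paragraph.</p><p>Third paragraph.</p>

// Assumed simpler variant: only valid when paragraphs contain no tags at all,
// because [^<] refuses to cross any "<".
$plainOnly = '~<p>[^<]*\?{4}[^<]*</p>~i';
$cleanPlain = preg_replace($plainOnly, '', $s);
echo $cleanPlain, PHP_EOL;
// <p>First <b>bold</b> paragraph.</p><p>Third paragraph.</p>
```

The negative lookahead `(?!<p>)` is checked before every character, so a match that begins in one paragraph cannot reach a "????" in a later one. Note that the simpler variant would leave a "????" paragraph in place if that paragraph contained an inner tag such as `<b>`.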
     
    wmtips, Sep 25, 2006 IP
  7. wormser

    #7
    Sweetness... Thanks, wmtips. I never would have figured that out.
     
    wormser, Sep 27, 2006 IP