regular expression help

Discussion in 'PHP' started by frankcow, Apr 23, 2007.

  1. #1
    Hey, all you PHP or regular expression gurus. I've got an if statement that I think might be more efficient as a regular expression.

    Basically, I'm checking that the word is at least 5 chars long and that no '<' or '>' is present, to detect whether the word is within an HTML tag. Here's what I have:
    
    if (strlen($word) > 4 && strpos($word, ">") === false && strpos($word, "<") === false)
    
    PHP:
    Any thoughts on the regex I would use?

    Green rep for whoever can answer!
     
    frankcow, Apr 23, 2007 IP
  2. JoshuaGross

    #2
    Not the prettiest, but:

    if (preg_match('#<[^<>]*'.preg_quote($word, '#').'[^<>]*>#i', $original_string))
    PHP:
    That will check for $word within an HTML tag in $original_string.
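    As a rough sketch of how that check could be wrapped up and tried out: wordInsideTag() and the sample string below are made-up names for illustration, not something from this thread. Passing '#' to preg_quote() makes it escape the pattern delimiter too, in case $word ever contains one.

```php
<?php
// Hypothetical helper (not from the thread): true if $word occurs
// between a '<' and the next '>' somewhere in $html.
function wordInsideTag(string $word, string $html): bool
{
    // The second argument tells preg_quote() to escape the '#' delimiter as well.
    $pattern = '#<[^<>]*' . preg_quote($word, '#') . '[^<>]*>#i';
    return preg_match($pattern, $html) === 1;
}

$html = '<a href="index.html">home page</a>';
var_dump(wordInsideTag('index', $html)); // bool(true): "index" sits inside the <a ...> tag
var_dump(wordInsideTag('home', $html));  // bool(false): "home" is text between tags
```

    Note that this only looks for the word between a '<' and the next '>', so it treats anything in angle brackets as a tag.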
     
    JoshuaGross, Apr 23, 2007 IP
  3. nico_swd

    #3
    If the word between <> tags is 5 chars or longer?

    
    if (preg_match('/<[\w]{5,}>/', $word))
    
    PHP:
    Or use this if you want to anchor the match at the start and end of the string, so that $word holds nothing except the tag.

    
    if (preg_match('/^<[\w]{5,}>$/', $word))
    
    PHP:
    \w matches A-Z, a-z, 0-9 and underscores. I think that's all a valid tag needs.
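    To see the difference between the two patterns, here is a small sketch. The variable names and sample strings are just for illustration. Because \w does not match spaces, quotes or slashes, tags with attributes and closing tags will not match either pattern.

```php
<?php
$unanchored = '/<\w{5,}>/';   // a tag may appear anywhere in the string
$anchored   = '/^<\w{5,}>$/'; // the string must consist of the tag alone

$samples = ['<strong>', 'see <strong> here', '<b>', '<a href="x">'];
foreach ($samples as $s) {
    // preg_match() returns 1 on a match and 0 otherwise.
    printf("%-20s anywhere:%d whole:%d\n", $s, preg_match($unanchored, $s), preg_match($anchored, $s));
}
// '<strong>'          matches both patterns
// 'see <strong> here' matches only the unanchored pattern
// '<b>' and '<a href="x">' match neither
```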
     
    nico_swd, Apr 23, 2007 IP
  4. frankcow

    #4
    Sorry, let me explain myself better. I want to exclude words shorter than 5 chars, and also words that may be in or beside tags.

    What I was going to do is explode a paragraph of text at " ", then loop through the array to examine each word. I want to ignore short words and words within or beside tags.

    Is there a better way to do this with regex?
     
    frankcow, Apr 23, 2007 IP
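
    One possible way to do the filtering described in the last post, as a sketch only: it splits on whitespace with preg_split() and then uses plain string functions rather than one big regex. longPlainWords() and the sample text are hypothetical, and the "inside a tag" tracking assumes reasonably well-formed markup where every '<' is eventually closed by a '>'.

```php
<?php
// Hypothetical helper: returns words of at least $minLen characters
// that neither touch a '<' or '>' nor sit inside an open tag.
function longPlainWords(string $text, int $minLen = 5): array
{
    $kept = [];
    $insideTag = false; // true while a '<' has been seen without its closing '>'

    foreach (preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        $hasBracket = strpbrk($word, '<>') !== false;

        // Keep the word only if it is outside a tag, touches no bracket,
        // and is long enough (strlen() counts bytes, not characters).
        if (!$insideTag && !$hasBracket && strlen($word) >= $minLen) {
            $kept[] = $word;
        }

        if ($hasBracket) {
            // Whichever bracket comes last decides the state for the next word.
            $lastOpen  = strrpos($word, '<');
            $lastClose = strrpos($word, '>');
            $insideTag = $lastOpen !== false
                && ($lastClose === false || $lastOpen > $lastClose);
        }
    }
    return $kept;
}

$text = 'Please <a class="button" href="/signup">register</a> before posting anything here';
print_r(longPlainWords($text));
// Keeps: Please, before, posting, anything
```

    Here "register" is dropped because it sits right beside tags, class="button" because it is inside one, and "here" because it is too short. For multibyte text, mb_strlen() would be the safer length check.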