Ideas for a regular expression

Discussion in 'Programming' started by earticles, Sep 25, 2007.

  1. #1
    Hello.

    I need to build a regular expression to do a replacement as follows:

    I have an HTML page. The regular expression should take a statement like this:

    <span onmouseover="_tipon(this)" onmouseout="_tipoff()"><span class="google-src-text" style="direction: ltr; text-align: left">TEXT 1</span>TEXT 2</span>

    from that HTML page and leave only TEXT 2, without any of the other HTML tags in the statement above.

    Could somebody please help me with this problem? I have spent many hours looking for a solution, but nothing has worked. If there is another way to do this, that would be fine too. Any help or ideas are highly appreciated.
     
    earticles, Sep 25, 2007 IP
  2. Jamie18

    Jamie18 Peon

    Messages:
    201
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Do you have any other examples?
     
    Jamie18, Sep 25, 2007 IP
  3. earticles

    earticles Well-Known Member

    Messages:
    933
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    175
    #3
    No, this is the only thing I need to replace in the HTML page. The bold statement above is the piece of HTML that needs to be replaced, as I explained in my previous post.
     
    earticles, Sep 25, 2007 IP
  4. Jamie18

    #4
    You might try two regex replacements: the first to get rid of everything up to and including the first </span>, the second to remove the remaining </span>.

    Something like result = rereplace(htmlstring, "^.*?</span>", "", "one")
    then rereplace(result, "</span>", "", "one")

    Sorry, I had to do it in ColdFusion. That may not be what you need, but maybe it will get you on the right track.
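    The two-step idea above could be sketched in PHP with preg_replace(). This is only a minimal sketch: it assumes $html holds just the snippet from the first post (not a whole page), and the variable names $step1 and $result are made up for illustration.

```php
<?php
// Hypothetical sample input: only the snippet from the first post.
$html = '<span onmouseover="_tipon(this)" onmouseout="_tipoff()">'
      . '<span class="google-src-text" style="direction: ltr; text-align: left">TEXT 1</span>'
      . 'TEXT 2</span>';

// Step 1: drop everything up to and including the first </span>.
// The lazy .*? stops at the first closing tag; the s flag lets . match newlines.
$step1 = preg_replace('~^.*?</span>~s', '', $html, 1);

// Step 2: drop the one remaining closing </span>.
$result = preg_replace('~</span>~', '', $step1, 1);

echo $result;
```

    Because the first pattern is anchored at the start of the string, running this on a full page would also throw away everything before the snippet, so it only suits the case where the snippet has already been isolated.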
     
    Jamie18, Sep 25, 2007 IP
  5. earticles

    #5
    I suppose the regular expression would be the same if I use it in PHP with PHP functions. I am on the road now, so I will try it tonight.
     
    earticles, Sep 25, 2007 IP
  6. earticles

    #6
    I tested it, but it does not work in PHP... Maybe the regexp is not right?
     
    earticles, Sep 25, 2007 IP
  7. krt

    krt Well-Known Member

    Messages:
    829
    Likes Received:
    38
    Best Answers:
    0
    Trophy Points:
    120
    #7
    
    PHP:
    if (preg_match('~</span>(.+?)</span>~', $html, $m)) {
        echo $m[1];
    }
     
    krt, Sep 25, 2007 IP
  8. earticles

    #8
    This works, but it drops all the other content of the HTML file and shows only TEXT 2...
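    If the goal is to keep the rest of the page and only swap each wrapper for its TEXT 2, preg_replace() may be a better fit than preg_match(), since it returns the whole string with the matches replaced. The following is a rough sketch, not a tested solution for the real page: the sample $html and the names $pattern and $clean are assumptions, and it presumes that neither TEXT 1 nor TEXT 2 contains a </span> of its own.

```php
<?php
// Hypothetical page fragment with ordinary content around the wrapper.
$html = '<p>Before. <span onmouseover="_tipon(this)" onmouseout="_tipoff()">'
      . '<span class="google-src-text" style="direction: ltr; text-align: left">TEXT 1</span>'
      . 'TEXT 2</span> After.</p>';

// Outer <span onmouseover=...>, then the inner google-src-text span with TEXT 1,
// then TEXT 2 captured lazily up to the outer closing </span>.
$pattern = '~<span\s+onmouseover="_tipon\(this\)"[^>]*>\s*'
         . '<span\s+class="google-src-text"[^>]*>.*?</span>'
         . '(.*?)</span>~s';

// Replace every wrapper with its captured TEXT 2; the rest of the page is untouched.
$clean = preg_replace($pattern, '$1', $html);

echo $clean;
```

    Matching on the onmouseover and class attributes keeps the pattern from touching unrelated <span> elements elsewhere on the page.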
     
    earticles, Sep 25, 2007 IP
  9. krt

    #9
    Isn't that what you wanted?
    If not, can you clarify what you need?
     
    krt, Sep 25, 2007 IP
  10. earticles

    #10
    I resolved the problem another way, using explode() twice: first on "<span onmouseover=\"_tipon(this)\" onmouseout=\"_tipoff()\"><span class=\"google-src-text\" style=\"direction: ltr; text-align: left\">" and then on </span>, to split the HTML page into pieces. Then I used two for statements:

    // Split the page on the opening tag pair that precedes every "TEXT 1".
    $lines = explode("<span onmouseover=\"_tipon(this)\" onmouseout=\"_tipoff()\"><span class=\"google-src-text\" style=\"direction: ltr; text-align: left\">", $content);
    // Everything before the first wrapper is printed unchanged.
    echo $lines[0];
    $num_lines = count($lines);
    for ($k = 1; $k < $num_lines; $k++) {
        // $line[0] is "TEXT 1"; the remaining pieces are printed without their </span> tags.
        $line = explode("</span>", $lines[$k]);
        $num_line = count($line);
        for ($j = 1; $j < $num_line; $j++) {
            echo $line[$j];
        }
    }

    Now I get the new HTML page without any <span> tags or "TEXT 1" phrases.

    Thank you very much for the help, anyway!
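    The explode() approach above could also be wrapped in a function that returns the cleaned page instead of echoing it. The sketch below is a variation, not the code from this post: the function name strip_source_text() and the sample $page are hypothetical, it needs PHP 7 for the type declarations, and it passes a limit of 3 to the second explode() so that only the two </span> tags belonging to each wrapper are dropped.

```php
<?php
/**
 * Hypothetical helper: remove each wrapper and its "TEXT 1", keeping "TEXT 2".
 */
function strip_source_text(string $content): string
{
    $opening = '<span onmouseover="_tipon(this)" onmouseout="_tipoff()">'
             . '<span class="google-src-text" style="direction: ltr; text-align: left">';

    $chunks = explode($opening, $content);
    $out = array_shift($chunks); // text before the first wrapper, kept unchanged

    foreach ($chunks as $chunk) {
        // Limit 3 gives [TEXT 1, TEXT 2, rest of chunk]; later </span> tags stay intact.
        $parts = explode('</span>', $chunk, 3);
        array_shift($parts);         // discard TEXT 1
        $out .= implode('', $parts); // TEXT 2 followed by the untouched remainder
    }

    return $out;
}

// Hypothetical input containing one wrapper and one unrelated <span>.
$page = 'Intro <span onmouseover="_tipon(this)" onmouseout="_tipoff()">'
      . '<span class="google-src-text" style="direction: ltr; text-align: left">TEXT 1</span>'
      . 'TEXT 2</span> and <span class="x">other</span>.';

echo strip_source_text($page);
```

    Without the limit, every </span> after a wrapper would be removed, including those that close unrelated elements further down the page.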
     
    earticles, Sep 25, 2007 IP