I am right and preg_match is wrong.

Discussion in 'PHP' started by Hade, Sep 16, 2008.

  1. #1
    I'm trying to extract a value from some text (scraped from a web site)
    using a regular expression, and it's not playing ball.

    Here's the text (with the value I want in bold):

    <span id="DetailsCurrentBidValue" class="sectiontitle"><b>GBP 49.99 </b></span>

    Here is my regular expression:

    '#id="DetailsCurrentBidValue".+\s([\d]+\.[\d]+)#i'

    However, when I change my regular expression to this (remove the '\s'):

    '#id="DetailsCurrentBidValue".+([\d]+\.[\d]+)#i'

    It returns '9.99'.
    It should return '49.99'.

    Any ideas? Could this be an encoding issue?
    Thanks!
     
    Hade, Sep 16, 2008
  2. lp1051
    #2
    Scraping other people's sites might be what's wrong here, not preg_match and not you...
    You can try something like this: #id="DetailsCurrentBidValue"[^\d]+([\d]+\.[\d]+)#

    BUT!!! Be careful. As soon as there is a digit in the 'class' attribute, it will fail.
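
To see why dropping '\s' gives 9.99, and why the negated class behaves differently, here is a minimal sketch run against the sample line from post #1. The variable names and the "section2title" markup are made up for illustration only.

```php
<?php
// Sample line copied from post #1.
$html = '<span id="DetailsCurrentBidValue" class="sectiontitle"><b>GBP 49.99 </b></span>';

// Greedy ".+" grabs as much as it can and hands back only what the
// capture group strictly needs: one digit, the dot, and the decimals.
preg_match('#id="DetailsCurrentBidValue".+([\d]+\.[\d]+)#i', $html, $greedy);

// "[^\d]+" cannot consume digits, so the group receives the whole number.
preg_match('#id="DetailsCurrentBidValue"[^\d]+([\d]+\.[\d]+)#', $html, $negated);

// Hypothetical markup with a digit in the class attribute:
// "[^\d]+" stops at the "2", and the pattern no longer matches.
$withDigit = '<span id="DetailsCurrentBidValue" class="section2title"><b>GBP 49.99 </b></span>';
$failed = preg_match('#id="DetailsCurrentBidValue"[^\d]+([\d]+\.[\d]+)#', $withDigit, $unused);

echo $greedy[1], "\n";   // 9.99
echo $negated[1], "\n";  // 49.99
echo $failed, "\n";      // 0
```

So preg_match is doing what the pattern asks: the greedy '.+' eats the '4', and the negated class is only safe while nothing between the id and the price contains a digit.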
     
    lp1051, Sep 16, 2008
  3. Hade
    #3
    Thanks very much.
    For the record, I'm not leeching content. I'm building an eBay sniper.

    I really appreciate your help.

    Page code:
    <span id="DetailsCurrentBidValue" class="sectiontitle"><b>US $44.00 </b></span>

    Tried Regex:
    '#id="DetailsCurrentBidValue".+[^\d]([\d]+\.[\d]+)#i'

    Also Tried:
    '#id="DetailsCurrentBidValue".+\$([\d]+\.[\d]+)#i'
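
For anyone reproducing this, here is a rough sketch that runs both patterns against the line exactly as pasted above. On that pasted text, both appear to capture 44.00. If they fail on the live page, the fetched HTML probably differs from the paste. One possibility, shown as a purely hypothetical variant, is a line break between the id and the price, which '.' will not cross unless the 's' modifier is set. The variable names are illustrative.

```php
<?php
// Sample line copied from post #3.
$line = '<span id="DetailsCurrentBidValue" class="sectiontitle"><b>US $44.00 </b></span>';

$byNonDigit = '#id="DetailsCurrentBidValue".+[^\d]([\d]+\.[\d]+)#i';
$byDollar   = '#id="DetailsCurrentBidValue".+\$([\d]+\.[\d]+)#i';

preg_match($byNonDigit, $line, $m1);
preg_match($byDollar, $line, $m2);
echo $m1[1], "\n"; // 44.00
echo $m2[1], "\n"; // 44.00

// Hypothetical variant: the same markup with a line break before <b>.
$wrapped = '<span id="DetailsCurrentBidValue" class="sectiontitle">' . "\n"
         . '<b>US $44.00 </b></span>';

// Without the "s" modifier, "." does not match a newline, so nothing is found.
$plain = preg_match($byDollar, $wrapped, $m3);

// Appending "s" to the modifiers lets ".+" span the line break.
$dotAll = preg_match($byDollar . 's', $wrapped, $m4);

echo $plain, "\n";  // 0
echo $dotAll, "\n"; // 1
echo $m4[1], "\n";  // 44.00
```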

    Thanks
     
    Hade, Sep 16, 2008