Parsing <h1> and <title> tags from html page using cfhttp

Discussion in 'Programming' started by jbard, Oct 31, 2010.

  1. #1
    Hi

    I am using cfhttp to grab content from some websites, and I would like to extract the contents of the h1 and title tags of each page.

    I investigated using regex, but I don't think it will work if a site has done something like this:
    <h1 class="h1class">content</h1>

    because the attributes in the opening tag would confuse the regex.

    I have seen plenty of PHP scripts that can do something like this but found nothing for ColdFusion. What is the best way for me to go about this? Does anyone know of any scripts or sample code?

    Any help would be most appreciated.
    thanks
     
    jbard, Oct 31, 2010
  2. cfStarlight
    #2
    Sounds like you've just got the wrong regex. IIRC you can do it with back references. I can't remember the syntax offhand. Try asking in a regex forum. I'm sure somebody there would know.
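
    One way to read the back-reference suggestion is to capture the tag contents as a subexpression and pull that group out afterwards. The following is only a sketch under assumptions: the sample markup and the variable names (html, pattern, match, h1Value) are made up for illustration, and in practice html would be cfhttp.FileContent. The same pattern should work for the title tag by swapping the tag name.

    ```cfml
    <!--- Hypothetical sample input; in real use this would be cfhttp.FileContent. --->
    <cfset html = '<html><head><title>My Page</title></head><body><h1 class="h1class">content</h1></body></html>'>

    <!--- [^>]* skips any attributes in the opening tag; (.*?) lazily captures the text. --->
    <cfset pattern = "<h1[^>]*>(.*?)</h1>">

    <!--- With returnsubexpressions = true, REFindNoCase returns a struct of pos/len arrays.
          Element 1 describes the whole match, element 2 the first parenthesized group. --->
    <cfset match = REFindNoCase(pattern, html, 1, true)>

    <!--- pos[1] is 0 when nothing matched; also guard against an empty capture. --->
    <cfif match.pos[1] GT 0 AND match.len[2] GT 0>
        <cfset h1Value = Mid(html, match.pos[2], match.len[2])>
    <cfelse>
        <cfset h1Value = "">
    </cfif>

    <cfoutput>#HTMLEditFormat(h1Value)#</cfoutput>
    ```

    Using the NoCase variant means H1 and h1 are treated the same, which helps with older markup.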
     
    cfStarlight, Nov 1, 2010
  3. fardesi
    #3
    // [^>]* consumes any attributes in the opening tag, so \3 holds only the heading text
    H1Value = REReplaceNoCase(myTextString, "(.*?)(<h1[^>]*>)(.*?)(</h1>)(.*)", "\3", "ALL")

    If you need to find multiple matches, run the regex using ReFind. CFLib.org has ReFindAll available, which may help. You can likely write or find a much better regex than the one I put up above. Try RegEx Coach to help debug the regex.
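
    For the multiple-match case, a rough sketch of the loop might look like this. It is not the CFLib UDF, just a hand-rolled illustration: the sample markup and the names html, pattern, headings, start and match are assumptions. It calls REFindNoCase repeatedly, moving the start position past each match.

    ```cfml
    <!--- Hypothetical sample input; in real use this would be cfhttp.FileContent. --->
    <cfset html = '<h1>First</h1><p>text</p><H1 id="second">Second</H1>'>
    <cfset pattern = "<h1[^>]*>(.*?)</h1>">
    <cfset headings = ArrayNew(1)>
    <cfset start = 1>

    <cfloop condition="start LTE Len(html)">
        <!--- The 4th argument (true) asks for pos/len arrays instead of a single position. --->
        <cfset match = REFindNoCase(pattern, html, start, true)>

        <!--- pos[1] is 0 when there are no more matches. --->
        <cfif match.pos[1] EQ 0>
            <cfbreak>
        </cfif>

        <!--- Element 2 is the captured heading text; skip empty headings. --->
        <cfif match.len[2] GT 0>
            <cfset ArrayAppend(headings, Mid(html, match.pos[2], match.len[2]))>
        </cfif>

        <!--- Resume the search just past the end of the current match. --->
        <cfset start = match.pos[1] + match.len[1]>
    </cfloop>

    <cfoutput>#HTMLEditFormat(ArrayToList(headings, " | "))#</cfoutput>
    ```

    Collecting into an array keeps the headings in document order, and the loop always advances because each whole match is at least as long as an empty h1 element.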
     
    fardesi, Jan 18, 2011
  4. prptl709
    #4
    I don't know which language this tag is used in. Can anyone help me?
     
    prptl709, Feb 25, 2011