I need to extract URLs from a piece of HTML code, but how?

Discussion in 'Programming' started by xthoms, Nov 21, 2010.

  1. #1
    Let's say I have some code like this

    Code:
    <a target="_blank" href="http://link.com">
    abc</a> </p>

    But of course I have a lot more. How can I automatically extract the URLs?
    I know the task is simply to return every value between href=" and "> but I have no idea what code I could use for it.
     
    xthoms, Nov 21, 2010 IP
  2. Deacalion

    Deacalion Peon

    #2
    You could use regular expressions. You will almost certainly come across some HTML that won't fit the regex, though :).

    For fast development (but it might break easily): regular expressions
    For bulletproof code (but it is harder to learn and takes longer to write): XPath or some selector library like phpQuery
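
    To illustrate the more robust route, here is a minimal sketch that uses an HTML parser instead of a regex. The thread does not settle on a language, so Python and its standard-library html.parser module are an assumption here, and the names HrefCollector and extract_urls are made up for the example. It is not XPath or phpQuery, just the same idea of letting a parser find the tags:

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs, and value is None for a bare attribute.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.urls.append(value)


def extract_urls(html):
    """Return all href values of <a> tags in the order they appear."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


# The snippet from the first post, used as sample input.
sample = '<a target="_blank" href="http://link.com">\nabc</a> </p>'
for url in extract_urls(sample):
    print(url)
```

    Because the parser handles the tag syntax, this should keep working when the attributes come in a different order, use single quotes, or are written in upper case, which is where a hand-written pattern tends to break.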


    Edit: sorry, I just came out of the PHP forum and assumed you were talking about PHP.
     
    Deacalion, Nov 21, 2010 IP
  3. xthoms

    xthoms Member

    #3
    Thanks for your reply.
    Well, I just need what I wrote above, because there's only one site I need to be able to do it for.

    It's a site that checks backlinks, and I need the URLs by themselves, like:

    www.link1.com
    www.link2.com
    www.link3.com

    I don't think it's a hard task, as I basically just need somewhere I can paste the code, and it then finds all the values that are in between href=" and "> and echoes them. But I'm no programmer, so it's not that easy.
     
    xthoms, Nov 21, 2010 IP
  4. SterlingS

    SterlingS Greenhorn

    #4
    SterlingS, Nov 21, 2010 IP
  5. xthoms

    xthoms Member

    #5
    Your tool doesn't work :/ but some other guys made a script that works perfectly fine, so I don't need it anymore :)
     
    xthoms, Nov 21, 2010 IP
  6. matessim

    matessim Active Member

    #6
    You can make a scraper in a multitude of languages. You could use Java + regex (the easiest way to search, IMO) to do it easily.
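
    As a rough sketch of the regex idea, here it is in Python rather than Java (an assumption made only to keep the example short; the pattern itself should carry over to Java's java.util.regex largely unchanged). The names HREF_PATTERN and extract_hrefs_regex are invented for the example:

```python
import re

# Matches href="..." or href='...' and captures the text between the quotes.
# This is deliberately simple: it does not handle unquoted values, and it
# would also match attributes that merely end in "href", such as data-href.
HREF_PATTERN = re.compile(r"""href\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)


def extract_hrefs_regex(html):
    """Return every quoted href value found in the given HTML text."""
    return [match.group(2) for match in HREF_PATTERN.finditer(html)]


# The snippet from the first post, used as sample input.
sample = '<a target="_blank" href="http://link.com">\nabc</a> </p>'
for url in extract_hrefs_regex(sample):
    print(url)  # one URL per line, as asked for earlier in the thread
```

    For a single site whose markup is consistent, something this small is probably enough. It is the kind of quick solution that may break on unusual HTML.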
     
    matessim, Nov 22, 2010 IP