Finding a tag in an HTML document

Discussion in 'PHP' started by theblackjacker, Oct 23, 2009.

  1. #1
    Hi!

    I'm using file_get_contents() to get everything (the HTML) from a URL. However, I would like to get only what's in the <h1> tag.

    I have read about and searched for DOMDocument, which seems to be the best way to do this, but I'm not sure exactly how to do it with PHP.

    I have seen some tutorials for JavaScript, but I need to write the content to a database, so I need to use PHP.
     
    theblackjacker, Oct 23, 2009 IP
  2. mastermunj

    #2
    mastermunj, Oct 23, 2009 IP
  3. techbabu

    #3
    Hi,

    You can try this; I hope it helps.


    <?php
    
    $myFile = "myfile.html";
    $fh = fopen($myFile, 'r');
    $htmlData = fread($fh, filesize($myFile));
    fclose($fh);
    
    /* Get all contents from <h1> .... </h1> */
    preg_match_all("/<h1>?.*?<\/h1>/", $htmlData, $matches);
    print_r($matches);
    
    ?>
    PHP:
    The output should look like this:

    Array
    (
        [0] => Array
            (
                [0] => <h1>Chroot Bind FreeBSD</h1>
                [1] => <h1>MySQL on FreeBSD</h1>
                [2] => <h1>10 Best Linux Distro</h1>
                [3] => <h1>Top 4 Virtualization Platforms</h1>
                [4] => <h1>You can access all above information from my blog site </h1>
                [5] => <h1>www.techbabu.com</h1>
            )
    
    )
    Code (markup):
     
    techbabu, Oct 23, 2009 IP
  4. FCM

    #4
    Although the answer above is good, you will have to consider whether the tag has style or class attributes. If it does, a pattern that expects a bare <h1> may not return any results.

    Consider looking for just the opening of the tag (for example <h1 without the closing >) at first; then you can get more into array manipulation.

    Best of luck
     
    FCM, Oct 24, 2009 IP
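
FCM's point about attributes can be sketched as follows. This is a minimal example, not code from the thread: the sample markup in $html is made up, and the pattern is one possible way to tolerate attributes and mixed-case tag names.

```php
<?php
// Made-up sample markup: headings with attributes and mixed-case tag names.
$html = '<h1 class="title">First</h1><H1 style="color:red">Second</H1><h1>Third</h1>';

// <h1\b   the tag name, with a word boundary so <h10> would not match
// [^>]*>  any attributes, up to the closing >
// (.*?)   the heading content, captured lazily
// Flags:  i = case-insensitive, s = let . match newlines as well
preg_match_all('/<h1\b[^>]*>(.*?)<\/h1>/is', $html, $matches);

// $matches[0] holds the full tags, $matches[1] only the captured content.
$headings = $matches[1];
print_r($headings);
```

Because of the capture group, $matches[1] contains just the heading text (or inner markup), which is usually what you want to store in a database.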
  5. theblackjacker

    #5
    I got it working with an external .html page using techbabu's code.

    However, I don't understand why this doesn't work. Shouldn't getting HTML from a website be basically the same as getting it from a file?


    $url = "http://www.example.com";
    $testing = file_get_contents($url);

    /* Get all contents from <h1> .... </h1> */
    preg_match_all("/<h1>?.*?<\/h1>/", $testing, $matches);
    print_r($matches);
     
    theblackjacker, Oct 24, 2009 IP
  6. JAY6390

    #6
    $url = "http://www.example.com";
    $content = file_get_contents($url);
    preg_match_all('%<h1>([^<]+)</h1>%s', $content, $matches);
    echo '<pre>'.print_r($matches, true).'</pre>';
    PHP:
     
    JAY6390, Oct 24, 2009 IP
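
Since the original question asked about DOMDocument, here is a minimal sketch of that approach. It is not taken from any reply in the thread, and the markup in $html is invented for illustration. Note that a pattern such as [^<]+ skips headings that contain nested tags (a link inside the <h1>, for example), whereas the DOM approach reads the text regardless of nesting or attributes.

```php
<?php
// Invented sample page: one heading wraps a link, one is plain.
$html = '<html><body>'
      . '<h1 class="title"><a href="/post">Best 10 Linux Distros</a></h1>'
      . '<p>Some text</p>'
      . '<h1>Second heading</h1>'
      . '</body></html>';

// Real-world HTML is often invalid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Collect the text of every <h1>, ignoring attributes and nested tags.
$headings = [];
foreach ($doc->getElementsByTagName('h1') as $h1) {
    $headings[] = trim($h1->textContent);
}

print_r($headings);
```

To parse a live page, the string returned by file_get_contents($url) could be passed to loadHTML() in place of the sample.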
  7. techbabu

    #7
    Hi,

    It works here. I think your variable $testing is empty, or there is no <h1> in it.


    <?php
    
    $url = "http://www.techbabu.com";
    $testing = file_get_contents($url);
    
    /* Get all contents from <h1> .... </h1> */
    preg_match_all("/<h1>?.*?<\/h1>/", $testing, $matches);
    print_r($matches);
    
    
    PHP:

    Array
    (
        [0] => Array
            (
                [0] => <h1><a href="http://www.techbabu.com/2009/10/best-10-linux-distros/" rel="bookmark">Best 10 Linux Distros</a></h1>
                [1] => <h1><a href="http://www.techbabu.com/2009/10/microsoft-windows-7-launched/" rel="bookmark">Microsoft Windows 7 Launched</a></h1>
                [2] => <h1><a href="http://www.techbabu.com/2009/10/samsung-mobile-phone-t401g/" rel="bookmark">Samsung Mobile Phone – T401G</a></h1>
                [3] => <h1><a href="http://www.techbabu.com/2009/10/motorola-mobile-phone-dext-mb220/" rel="bookmark">Motorola Mobile Phone – DEXT MB220</a></h1>
                [4] => <h1><a href="http://www.techbabu.com/2009/10/samsung-mobile-phone-t939-behold-2/" rel="bookmark">Samsung Mobile Phone – T939 Behold 2</a></h1>
                [5] => <h1><a href="/privacy-policy">Privacy Policy</a> &nbsp; | &nbsp; <a href="/sitemap/">Sitemap</a> &nbsp; | &nbsp; <a href="/contact/">Contact Us</a></h1>
            )
    
    )
    
    Code (markup):
     
    techbabu, Oct 24, 2009 IP
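
techbabu's suggestion that $testing might be empty can be checked directly. The sketch below is one way to do it, under a few assumptions: fetchHtml() is a hypothetical helper name, and a temporary local file stands in for the remote URL so the example runs without network access. file_get_contents() returns false on failure, and fetching remote URLs with it requires the allow_url_fopen setting to be enabled.

```php
<?php
// Hypothetical helper (not from the thread): returns the content,
// or null when nothing usable was fetched.
function fetchHtml(string $source): ?string
{
    // @ suppresses the warning; failure is detected through the return value.
    $content = @file_get_contents($source);
    if ($content === false || trim($content) === '') {
        return null;
    }
    return $content;
}

// Remote URLs only work with file_get_contents() when allow_url_fopen is on.
if (!ini_get('allow_url_fopen')) {
    echo "allow_url_fopen is disabled; remote URLs cannot be fetched this way.\n";
}

// A temporary file stands in for the URL so the sketch runs offline.
$tmp = tempnam(sys_get_temp_dir(), 'h1');
file_put_contents($tmp, '<h1>Hello</h1>');

$ok      = fetchHtml($tmp);                      // content was read
$missing = fetchHtml($tmp . '.does-not-exist');  // source cannot be read
unlink($tmp);

var_dump($ok !== null);       // whether the readable source produced content
var_dump($missing === null);  // whether the unreadable source was detected
```

If the helper returns null for your URL, the regular expression has nothing to search, which would explain an empty $matches array.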