HTML code

Discussion in 'PHP' started by ssimon171078, Dec 22, 2014.

  1. #1
    I have this HTML code:
    <h3 class="lvtitle"><a href="http://www.ebay.com/itm/SEO-Service-First-page-of-Google-TOP-10-in-30-days-/321275961955?pt=LH_DefaultDomain_0&hash=item4acd8a2263"  class="vip" title="Click this link to access SEO Service. First page of Google (TOP-10) in 30 days.">SEO Service. First page of Google (TOP-10) in 30 days.</a>
    I want to extract only the links, like http://www.ebay.com/itm/SEO-Service-First-page-of-Google-TOP-10-in-30-days
    I wrote some code in PHP, but I do not see any links in the output:
    <?php
    // parser of website ebay
    $website="http://www.ebay.com/sch/Web-Computer-Services-/47104/i.html";
    $filename="links_ebay_services_1_1.txt";
    $fd=fopen($filename,"w");
    $content=file_get_contents($website);
    $stripped_file = strip_tags($content, "<a>");
    fwrite($fd,$stripped_file);
    fwrite($fd,"\n");
    fclose($fd);
    ?>

     
    Last edited by a moderator: Dec 22, 2014
    ssimon171078, Dec 22, 2014
  2. Sano000

    #2
    Hi,
    Use preg_match_all to get the links:

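    <?php
    // Download the listing page.
    $content = file_get_contents('http://www.ebay.com/sch/Web-Computer-Services-/47104/i.html');
    // Capture the href of the first link inside each <h3 class="lvtitle">.
    if (preg_match_all('/<h3 class="lvtitle">.+?href="(.+?)"/', $content, $matches)) {
        // $matches[1] holds the captured URLs.
        print_r($matches[1]);
    }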
    You may also want to look at the Symfony DomCrawler component or a similar package.
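For anyone who prefers a DOM parser but does not want to install a package, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. It is not DomCrawler itself. The helper name lvtitle_links is made up for illustration, and the sample markup is a shortened copy of the snippet from the first post. A real page would come from file_get_contents() or cURL.

```php
<?php
// Shortened sample markup from the first post; stands in for a downloaded page.
$content = '<h3 class="lvtitle"><a href="http://www.ebay.com/itm/SEO-Service-First-page-of-Google-TOP-10-in-30-days-/321275961955?pt=LH_DefaultDomain_0&hash=item4acd8a2263" class="vip">SEO Service. First page of Google (TOP-10) in 30 days.</a></h3>';

/**
 * Hypothetical helper: return the href of every <a> found inside
 * an <h3> element whose class list contains "lvtitle".
 */
function lvtitle_links(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string
    }

    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $query = '//h3[contains(concat(" ", normalize-space(@class), " "), " lvtitle ")]//a[@href]';

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

print_r(lvtitle_links($content));
```

Compared with a regular expression, this keeps working when the attributes of the link change order or the markup is split across several lines.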
     
    Sano000, Dec 24, 2014
  3. hilhilginger

    #3
    Adding to the above reply: if fetching URLs with file_get_contents is not enabled by your hosting provider (for example, allow_url_fopen is turned off), it won't give you any results. You can use PHP cURL instead of file_get_contents:
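    // Fetch the page with cURL instead of file_get_contents.
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://www.ebayxxxxxxxxxxx");
    curl_setopt($ch, CURLOPT_REFERER, "http://www.yourdomain.com");
    curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
    curl_setopt($ch, CURLOPT_HEADER, false);          // leave response headers out of the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body as a string instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);            // give up after 10 seconds
    $content = curl_exec($ch);                        // false on failure
    curl_close($ch);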
    
    Then continue with the code from the reply above.
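To tie the two replies together, here is a rough sketch of what could happen to $content once it has been downloaded: extract the hrefs under h3.lvtitle and cut each one down to the short form asked for in the first post. The function name short_item_links is hypothetical, the pattern is a tightened variant of the one in reply #2, and the trimming rule assumes item URLs end in "-/<item id>?<query>", which is only inferred from the single example in this thread.

```php
<?php
/**
 * Hypothetical helper: pull item links out of already-downloaded HTML
 * and drop the trailing "-/<item id>?<query>" part of each one.
 */
function short_item_links(string $content): array
{
    $links = [];
    // Match the href of the <a> that directly follows <h3 class="lvtitle">.
    if (preg_match_all('/<h3 class="lvtitle">\s*<a\s[^>]*?href="([^"]+)"/', $content, $matches)) {
        foreach ($matches[1] as $href) {
            // Assumption: URLs look like .../itm/<title>-/<digits>?<query>
            $links[] = preg_replace('~-?/\d+(\?.*)?$~', '', $href);
        }
    }
    return $links;
}

// Shortened sample from the first post; normally this is the string returned by curl_exec().
$content = '<h3 class="lvtitle"><a href="http://www.ebay.com/itm/SEO-Service-First-page-of-Google-TOP-10-in-30-days-/321275961955?pt=LH_DefaultDomain_0&hash=item4acd8a2263"  class="vip">SEO Service. First page of Google (TOP-10) in 30 days.</a>';

// One link per line; the same string could be written to a file with fwrite().
echo implode("\n", short_item_links($content)), "\n";
```

Remember that curl_exec() returns false on failure, so check $content before passing it to a function like this.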
     
    hilhilginger, Dec 27, 2014