file get contents problem

Discussion in 'PHP' started by ItamarP, Jan 29, 2010.

  1. #1
    hey there

    I'm trying to get the content of a certain website, and I'd like to get the actual link URLs shown on the site after fetching it with file_get_contents($url).
    For example, if the link text "digitalpoint" is shown, I'd like to get the URL http://forums.digitalpoint.com, not the string "digitalpoint".

    I've tried using strip_tags, which gives weird output: "MARGIN-TOP: 0px; MARGIN-LEFT: 0px; DIRECTION: rtl; MARGIN-RIGHT: 0px; FONT-FAMILY: Arial; } TD { FONT-SIZE: 80%; COLOR: #003ca5 } " etc.

    any suggestions?

    itamarp
     
    ItamarP, Jan 29, 2010 IP
  2. new2seoo

    #2
    Let me know if this helps.

    
    
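    $url = 'Url from which you want to get the data.';
    $handle = fopen($url, "rb"); // remote URLs need allow_url_fopen enabled
    $contents = stream_get_contents($handle);
    fclose($handle);
    
    // You will have to perform some replacements according to the text available in $contents.
    
    $contents = str_replace('Search', 'Replace', $contents);
    
    // Non-greedy groups so each <a> tag is matched on its own, even with several links on one line.
    // $matches[1] holds the href values, $matches[2] the link texts.
    $regex_pattern = '/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/is';
    preg_match_all($regex_pattern, $contents, $matches);
    print_r($matches);
    exit;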
    
    
    PHP:
     
    new2seoo, Jan 29, 2010 IP
  3. danx10

    #3
    Revised code:

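    <?php
    $url = 'Url from which you want to get the data.';
    $contents = file_get_contents($url);
    
    // Match http/https URLs; stop at whitespace, quotes and angle brackets
    // so the surrounding HTML does not end up in the match.
    // $matches[0] holds the full URLs, $matches[1] the host names.
    $regex_pattern = '@https?://([^/\s"\'<>]+)[^\s"\'<>]*@i';
    
    preg_match_all($regex_pattern, $contents, $matches);
    // Pass true so print_r() returns the dump instead of printing it before the <pre> tag.
    echo "<pre>" . print_r($matches, true) . "</pre>";
    exit;
    ?>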
    PHP:
     
    Last edited: Jan 29, 2010
    danx10, Jan 29, 2010 IP
  4. ItamarP

    #4
    danx10, thanks, your code is what I need (I've revised the regex section).
    new2seoo, thanks for trying.

    I'm still getting messy output, while "echo matches[1];" returns an error...
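
A minimal sketch of why the quoted statement fails, assuming a pattern shaped like the one in post #3 (the revised regex is not shown in the thread, and the sample markup below is made up for illustration). If the missing `$` is literal, that alone is an error; and even `echo $matches[1];` would print only the word "Array", because every entry of `$matches` is itself an array.

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$contents = '<a href="http://forums.digitalpoint.com">digitalpoint</a> '
          . '<a href="http://www.example.com/page">example</a>';

// One capture group for the host name, as in post #3.
preg_match_all('@https?://([^/\s"\'<>]+)[^\s"\'<>]*@i', $contents, $matches);

// $matches[0] holds the full matches, $matches[1] the first capture group.
// Both are arrays, so walk them with a loop (or dump them with print_r).
foreach ($matches[0] as $i => $link) {
    echo $link, ' (host: ', $matches[1][$i], ")\n";
}
```

Each full URL in `$matches[0]` lines up by index with its host name in `$matches[1]`.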
     
    ItamarP, Jan 29, 2010 IP
  5. new2seoo

    #5
    Let me know if you need any further help.
     
    new2seoo, Jan 29, 2010 IP
  6. JAY6390

    #6
    $url = 'http://www.google.com';
    preg_match_all('%\bhref="\K[^"]+%', file_get_contents($url), $matches);
    PHP:
    Your matches will be in the array $matches[0], so to get the first link use
    $matches[0][0]
    The second use
    $matches[0][1]
    The third
    $matches[0][2]
    and so on
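
The pattern above only catches href values in double quotes. As an alternative that nobody in the thread proposed, here is a rough sketch using PHP's DOM extension instead of a regex. It assumes the dom extension is available, and the sample markup is hypothetical; in practice `$html` would come from `file_get_contents($url)`.

```php
<?php
// Hypothetical sample markup, including a <style> block and a single-quoted href.
$html = '<html><head><style>TD { FONT-SIZE: 80% }</style></head><body>'
      . '<a href="http://forums.digitalpoint.com">digitalpoint</a> '
      . "<a class='x' href='http://www.example.com/'>example</a>"
      . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real pages are rarely valid HTML; collect warnings quietly
$doc->loadHTML($html);
libxml_clear_errors();

// Collect the href attribute of every <a> element that has one.
$links = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    if ($anchor->hasAttribute('href')) {
        $links[] = $anchor->getAttribute('href');
    }
}

print_r($links);
```

Because the parser understands the markup, CSS inside `<style>` never shows up in the result, which is the noise strip_tags left behind in post #1.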
     
    JAY6390, Jan 29, 2010 IP
  7. new2seoo

    #7
    Wow, this is awesome. Can you also write a regex for extracting the URL value from:

    [url=http://www.google.com/]Google.com[/url]
    [url=http://www.google.com]Google.com[/url]
    [url=http://www.gmail.com/]Gmail[/url]
    [url=http://www.gmail.com]Gmail[/url]
    HTML:
    Thanks
     
    Last edited: Jan 29, 2010
    new2seoo, Jan 29, 2010 IP
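
The thread ends without a reply to post #7. One possible sketch, reusing the `\K` approach from post #6 and assuming the BBCode tags always have the plain `[url=...]` form shown in the four sample lines:

```php
<?php
// The four sample lines from post #7.
$text = "[url=http://www.google.com/]Google.com[/url]\n"
      . "[url=http://www.google.com]Google.com[/url]\n"
      . "[url=http://www.gmail.com/]Gmail[/url]\n"
      . "[url=http://www.gmail.com]Gmail[/url]";

// \[url= matches the literal opening tag, \K drops it from the reported match,
// and [^\]]+ takes everything up to the closing bracket.
preg_match_all('%\[url=\K[^\]]+%i', $text, $matches);

print_r($matches[0]);
```

As in post #6, the URLs end up in `$matches[0]`, with or without the trailing slash exactly as written in the input.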