Hey there, I'm trying to get the content of a certain website, and I'd like to extract the actual links shown on the page after calling file_get_contents($url). For example, if "digitalpoint" is shown, I'd like to get the URL http://forums.digitalpoint.com, not the string "digitalpoint". I've tried strip_tags, which gives weird output: "MARGIN-TOP: 0px; MARGIN-LEFT: 0px; DIRECTION: rtl; MARGIN-RIGHT: 0px; FONT-FAMILY: Arial; } TD { FONT-SIZE: 80%; COLOR: #003ca5 } " etc. Any suggestions? itamarp
Let me know if this helps.

    $url = 'Url from which you want to get the data.';
    $handle = fopen($url, "rb");
    $contents = stream_get_contents($handle);
    fclose($handle);
    // You will have to perform some replacements according to the text available in $contents.
    $contents = str_replace('Search', 'Replace', $contents);
    // Non-greedy groups, so each link on a line is matched separately.
    $regex_pattern = "/<a href=\"(.*?)\">(.*?)<\/a>/";
    preg_match_all($regex_pattern, $contents, $matches);
    print_r($matches);
    exit;
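Revised code:

    <?php
    $url = 'Url from which you want to get the data.';
    $contents = file_get_contents($url);
    // Match absolute http/https URLs; group 1 captures the host name.
    $regex_pattern = "@https?://([^/]+)[^\s]*@i";
    preg_match_all($regex_pattern, $contents, $matches);
    // Pass true so print_r() returns the dump instead of printing it directly.
    echo "<pre>" . print_r($matches, true) . "</pre>";
    exit;
    ?>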
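As an alternative to regular expressions, the links could also be read with PHP's bundled DOM extension. This is a swapped-in technique, not what either poster above used, and the sample markup in `$html` is made up so the sketch runs without a network request; in practice it would come from `file_get_contents($url)`.

```php
<?php
// Made-up sample page standing in for file_get_contents($url).
$html = '<html><head><style>TD { FONT-SIZE: 80% }</style></head><body>'
      . '<a href="http://forums.digitalpoint.com">digitalpoint</a> '
      . '<a name="top">no link here</a></body></html>';

// Collect parser warnings about sloppy HTML internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Keep the href of every <a> element that actually has one.
$links = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    if ($anchor->hasAttribute('href')) {
        $links[] = $anchor->getAttribute('href');
    }
}

print_r($links);
```

Because the parser reads the attribute value directly, the inline CSS and the link text never end up in the result.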
danx10, thanks, your code is what I need (I've revised the regex section). new2seo, thanks for trying; I'm still getting messy output, while "echo matches[1];" returns an error...
    $url = 'http://www.google.com';
    preg_match_all('%\bhref="\K[^"]+%', file_get_contents($url), $matches);

Your matches will be in the array $matches[0], so to get the first link use $matches[0][0], the second $matches[0][1], the third $matches[0][2], and so on.
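To illustrate the indexing described above, here is a minimal sketch that runs the same pattern over an inline HTML string and loops over `$matches[0]`. The sample markup in `$html` is an assumption made so the example works offline; it stands in for `file_get_contents($url)`.

```php
<?php
// Made-up sample markup standing in for file_get_contents($url).
$html = '<p><a href="http://forums.digitalpoint.com">digitalpoint</a> '
      . '<a class="x" href="http://www.google.com/">Google</a></p>';

// \K discards everything matched so far, so only the attribute value is kept.
preg_match_all('%\bhref="\K[^"]+%', $html, $matches);

// The pattern has no capture groups, so every full match lands in $matches[0].
foreach ($matches[0] as $i => $link) {
    echo $i, ': ', $link, PHP_EOL;
}
```

Note that `$matches[0]` is itself an array, so it has to be looped over or indexed; passing an array straight to `echo` does not print its elements.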
Wow, this is awesome. Can you also write a regex for extracting the URL value from:

    [url=http://www.google.com/]Google.com[/url]
    [url=http://www.google.com]Google.com[/url]
    [url=http://www.gmail.com/]Gmail[/url]
    [url=http://www.gmail.com]Gmail[/url]

Thanks
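One possible approach for the [url=...] tags in the post above is the same `\K` trick used for `href`. This is only a sketch: it assumes the unquoted form shown in the question (a quoted `[url="..."]` variant would need a different pattern), and the `$bbcode` variable is a hypothetical name for the input text.

```php
<?php
// Sample input taken from the question; $bbcode is a hypothetical variable name.
$bbcode = '[url=http://www.google.com/]Google.com[/url] '
        . '[url=http://www.gmail.com]Gmail[/url]';

// \[url=  matches the literal start of the tag;
// \K      drops it from the reported match;
// [^\]]+  takes everything up to the closing bracket, i.e. the URL.
preg_match_all('%\[url=\K[^\]]+%i', $bbcode, $matches);

print_r($matches[0]);
```

As before, the URLs end up in `$matches[0]`, with or without a trailing slash.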