hi, i have a list of websites, and i need the top level domain from each url. how can i do this? i tried counting the dots or slashes and then using substr(), but i have all kinds of urls:

http://domain.com
www.domain.com
http://domain.com/folder
http://domain.com/folder/index.php
http://subdomain.domain.com
http://www.domain.com
http://www.domain.co.uk
...

i only need the top level domain (.com, .org, .com.au, .co.uk and so on). any ideas...?
hmm... sorry, i think i was wrong earlier. but you can get that with regular expressions. i'm really busy right now, but i think i can make you a little script later that does what you want. good luck
At the moment there are two threads with the exact same question posted; there are some good suggestions in the other thread - http://forums.digitalpoint.com/showthread.php?t=181461
If you need the host name of the server the script itself is running on, it is available as:

    $server = $_SERVER["SERVER_NAME"];

This does not include any trailing directories. For an arbitrary URL, another way to get the host is:

    $domain = "http://www.test.com/test.html";
    list($x, $x, $domain, $x) = explode("/", $domain, 4);
    echo $domain, "\n";

But this generates a PHP Notice when the URL has no trailing slash, because explode() then returns only three elements and the fourth list() variable has nothing to read.
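The notice described above can probably be avoided by letting parse_url() do the splitting instead of explode(). The following is only a rough sketch: get_host() is a made-up helper name, and prepending "http://" to scheme-less input is an assumption about what you want for entries like www.domain.com.

```php
<?php
// Hypothetical helper (not from the thread): returns the host part of a URL,
// or null when parse_url() cannot find one.
function get_host(string $url): ?string
{
    $url = trim($url);
    // parse_url() does not report a 'host' for scheme-less input such as
    // "www.domain.com" (the whole string ends up in 'path'), so add a scheme first.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    // parse_url() returns null for a missing component and false for a badly malformed URL.
    return is_string($host) ? strtolower($host) : null;
}

echo get_host('http://www.test.com/test.html'), "\n"; // www.test.com
echo get_host('www.domain.com'), "\n";                // www.domain.com
```

This only gives you the full host name; picking the top level domain out of it is a separate step.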
thanks everyone, i have it done now. thought there was an easier way or a simple function for this, but i had to do it myself:

    function get_tld($url) {
        // parse_url() only sets 'host' when $url includes a scheme such as http://
        $host = parse_url($url);
        $domain = $host['host'];
        // strip the stray "__" that sometimes showed up in the host
        $domain = str_replace("__", "", $domain);
        // the tld is assumed to sit within the last 7 characters
        $tail = substr($domain, -7);
        // keep everything from the first "." onwards
        $tld = strstr($tail, ".");
        return $tld;
    }

i had used parse_url before, so this was a good start. for some reason i sometimes got results like "domain.com__" from parse_url(), so i had to get rid of the underscores with str_replace. then i take the last 7 chars of this string (because the tld must be within the last 7) and return everything from the first "." onwards - success!
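One thing to watch with the 7-character window: it works for the urls listed in the first post, but a short host such as a.b.com fits entirely into 7 characters, so the function would return ".b.com". A possible alternative is to compare the end of the host against a list of known suffixes. This is only a sketch: get_suffix() is a made-up name, and the default list is a tiny illustrative sample, not a complete one (a real list could be taken from something like the Public Suffix List).

```php
<?php
// Hypothetical helper (not from the thread). $suffixes is an illustrative,
// deliberately incomplete list of known suffixes.
function get_suffix(
    string $host,
    array $suffixes = ['co.uk', 'com.au', 'com', 'org', 'net', 'uk', 'au']
): ?string {
    $labels = explode('.', strtolower(rtrim($host, '.')));
    // Try the longest candidate first: for "www.domain.co.uk" this checks
    // "www.domain.co.uk", then "domain.co.uk", then "co.uk", and stops at the first hit.
    for ($i = 0, $n = count($labels); $i < $n; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            return '.' . $candidate;
        }
    }
    return null; // no known suffix matched
}

echo get_suffix(parse_url('http://www.domain.co.uk', PHP_URL_HOST)), "\n"; // .co.uk
echo get_suffix('a.b.com'), "\n";                                          // .com
```

Checking the longest candidate first is what makes ".co.uk" win over plain ".uk" when both are in the list.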