Extract domain name from host name

Hi,

I need to extract the domain name from a long host name. For example, if I have the string “www.domain.com”, I need to extract just “domain.com”, and if I have “blah.blah.blah.domain.co.uk”, I need to extract “domain.co.uk”. Is there a regular expression which can do this?

Thanks,

James.

How about this example found in the PHP Manual:


$string = "http://abc.acb.php.net/";
preg_match('@^(?:http://)?([^/]+)@i', $string, $matches);
$host = $matches[1];
preg_match('/[^.]+\\.[^.]+$/', $host, $matches);
echo "domain name is: " . $matches[0] . "\n";

Thanks Rajug, but it doesn’t work for “domain.co.uk”. It only works if the domain suffix has a single part (e.g. “com”, “net” etc.)

$rawurl = "http://asdf.sadf.abc.com/adsfkjl/adfs/adfs.html";
$url = parse_url($rawurl);
echo $url['host'];
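
One thing to watch with parse_url(): it only fills in 'host' when the input has a scheme (or starts with //). A bare host name such as the “www.domain.com” from the first post ends up under 'path' instead. A rough sketch of a workaround, where host_of() is a made-up helper name and not a PHP built-in:

```php
<?php
// Hypothetical helper (not part of PHP): returns the host part of a URL
// or of a bare host name, or null if none can be found.
function host_of(string $input): ?string
{
    $host = parse_url($input, PHP_URL_HOST);
    if (!is_string($host)) {
        // No scheme given: retry as a scheme-relative URL ("//host/path")
        $host = parse_url('//' . $input, PHP_URL_HOST);
    }
    return is_string($host) ? $host : null;
}
```

With this, both "http://asdf.sadf.abc.com/adsfkjl/adfs/adfs.html" and a plain "www.domain.com" should give back just the host.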

You cannot do this with a regular expression alone, you need a list of all the top level and second level domains.
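
To illustrate the list-based approach, here is a minimal sketch. The function name registrable_domain() and the tiny suffix array are made up for the example; a real list would be much longer. It assumes PHP 7.4 or later because of the arrow function.

```php
<?php
// Hypothetical helper: returns the last label plus the matching suffix,
// or null when no suffix from the list matches. The list is illustrative.
function registrable_domain(string $host, array $suffixes): ?string
{
    $host = strtolower(rtrim($host, '.'));
    // Try the longest suffixes first so "co.uk" wins over "uk"
    usort($suffixes, fn($a, $b) => strlen($b) - strlen($a));
    foreach ($suffixes as $suffix) {
        $tail = '.' . $suffix;
        if (substr($host, -strlen($tail)) !== $tail) {
            continue;
        }
        // Everything in front of the suffix, e.g. "blah.blah.domain"
        $rest = substr($host, 0, -strlen($tail));
        if ($rest === '' || $rest === false) {
            continue; // nothing left in front of the suffix
        }
        // Keep only the last label of what is left
        $labels = explode('.', $rest);
        return end($labels) . $tail;
    }
    return null;
}

echo registrable_domain('blah.blah.blah.domain.co.uk', ['com', 'net', 'uk', 'co.uk']), "\n";
```

No regular expression is involved here; the suffix list does all the work, which is why it has to be complete for the hosts you expect.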

Hi Jimmy,

Yes, there is a regular expression which can do this (we all know that regexes can solve any problem, right? :slight_smile:) but as crmalibu pointed out, you’d need a list of (first, second, or more level) domains to be a part of that regex. As it happens, there are in the region of 2,500 of those domains to take into consideration (and almost certainly more that I couldn’t find in my all-too-brief search), so it’s entirely possible to construct a monster regex to match what you’re looking for. Just how monstrous the regex gets is up to you, but the one I came up with is over 19,000 characters long!

Do you still want to use a regular expression for this problem? :eye:

While you could use a regular expression for this, why not use the built-in PHP function created just for doing things like this?

Use str_replace :slight_smile: Google it, there are many tutorials. Good luck.

Your TLD list: http://data.iana.org/TLD/tlds-alpha-by-domain.txt

Because it doesn’t do what he’s asking for.

Thank you for all the responses.

Luckily I don’t need to support all possible domain suffixes, only com, co.uk and net.

I’m just really bad at writing regular expressions. I think an alteration of the code shown in post#2 would work… (currently, it works perfectly for .com domains but fails for .co.uk)


$rawurl = "http://asdf.sadf.abc.com/adsfkjl/adfs/adfs.html";
$url = parse_url($rawurl);

$domain = preg_replace('#^(?:.+?\\.)+(.+?\\.(?:co\\.uk|com|net))#', '$1', $url['host']);

echo $domain;

If you need to add support for more TLDs then add them after net in the last set of brackets:


"(?:co\\.uk|com|net|org)"

And remember to escape the dot if you’re adding a second-level domain:


"(?:co\\.uk|com|net|org|ac\\.uk)"

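If the list keeps growing, the escaping can be automated. A rough sketch: extract_domain() is a hypothetical wrapper around the same pattern, with preg_quote() doing the escaping. It also adds a trailing $ anchor, which the pattern above does not have, so the suffix must sit at the very end of the host. PHP 7.4+ assumed.

```php
<?php
// Hypothetical wrapper (name made up for this example): builds the suffix
// alternation from an array instead of typing it into the pattern by hand.
function extract_domain(string $host, array $suffixes): string
{
    // preg_quote() escapes the dot in entries such as "co.uk"
    $quoted = array_map(fn($s) => preg_quote($s, '#'), $suffixes);
    // Same pattern as above plus a trailing $ (an addition, see note)
    $pattern = '#^(?:.+?\.)+(.+?\.(?:' . implode('|', $quoted) . '))$#';
    // Hosts that do not match come back unchanged
    return preg_replace($pattern, '$1', $host) ?? $host;
}

echo extract_domain('blah.blah.blah.domain.co.uk', ['co.uk', 'com', 'net', 'org', 'ac.uk']), "\n";
```

A host with no subdomain, such as "domain.net", does not match the pattern and is simply returned as it was.
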
Perfect! Thank you decowski!! :slight_smile: