Hi guys,
I have a function that I’ve written on my site to validate input given to me by users. The goal of the function is to extract the domain name from a given string.
The problem is that users sometimes give me a long URL, for example:
http://www.somesite.org/this/is/a/long/url.php?method=something&attribute=something_else&document_id=123213
expected response: somesite.org
Sometimes they just give me their base URL, for example:
http://www.somesite.org/
expected response: somesite.org
And sometimes the URL has a subdomain, for example:
http://subdomain1.somesite.org/
expected response: subdomain1.somesite.org
And sometimes they give me what I want, just the domain:
somesite.org
expected response: somesite.org
I also want to allow the user to include a subdomain, but strip out www.
For example:
subdomain1.mysite.com is okay
subdomain1.subdomain2.mysite.com is okay
www.mysite.com is bad, should be: mysite.com
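To make the www rule concrete, here is a minimal sketch of just that step, assuming only a leading "www." label should be removed and everything else left alone. The helper name strip_www is hypothetical and is not part of my function below:

```php
<?php
// Hypothetical helper: remove a single leading "www." label, nothing else.
function strip_www(string $host): string
{
    // The ^ anchor means "www." is only matched at the very start of the host,
    // so labels that merely contain "www" are untouched.
    // Fall back to the original host if preg_replace() reports an error (null).
    return preg_replace('/^www\./i', '', $host) ?? $host;
}

echo strip_www('www.mysite.com'), "\n";                    // mysite.com
echo strip_www('subdomain1.subdomain2.mysite.com'), "\n";  // subdomain1.subdomain2.mysite.com
```

The anchor matters because an unanchored replace would also change a host such as awww.mysite.com.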
The function I have here appears to be working, but it’s pieced together, and I wanted to see if one of the pros out there could take a look and tell me whether there is a more elegant solution.
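function get_domain($url){
    $url = strtolower(trim($url));
    // Add a scheme when none is given so parse_url() can find the host.
    // Checking for "http://" or "https://" avoids misreading hosts like httpd.example.org.
    if(!preg_match('#^https?://#', $url))
    {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if(!is_string($host))
    {
        return false;
    }
    // Strip only a leading "www." so hosts such as awww.example.org are left alone.
    $domain = preg_replace('/^www\./', '', $host);
    if(
        // at least one label followed by an alphabetic TLD
        preg_match("/([0-9a-z-]+\.)?[0-9a-z-]+\.[a-z]{2,7}/", $domain)
        // only letters, digits and inner hyphens in each label
        && preg_match("/^([a-z\d](-*[a-z\d])*)(\.([a-z\d](-*[a-z\d])*))*$/i", $domain)
        // overall length between 1 and 253 characters
        && preg_match("/^.{1,253}$/", $domain)
        // each label between 1 and 63 characters
        && preg_match("/^[^\.]{1,63}(\.[^\.]{1,63})*$/", $domain)
    )
    {
        return $domain;
    }
    return false;
}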