The problem is knowing how many elements of the array make up the actual domain name, and where the subdomains (if any) are. I don’t know much about regex so I can’t think of how to do even simple things with that, but I also can’t think of a way of doing it reliably at all, except for progressively adding an extra element backwards from the end and doing a domain look-up to see if it exists.
I was asking for clarification. Since domain.gov.net is a subdomain of gov.net, gov.net is a possible answer by the criteria provided. And since gov.net is in turn a subdomain of net, net would also be a possible answer by the same criteria.
The OP needs to define what they mean by the difference between a domain and a subdomain, since anything in front of a dot is a subdomain of what comes after the dot.
My understanding is that in terms of getting domains from email addresses:
Anything before and including the @ can be disregarded.
That leaves (going from right to left) Top Level Domain preceded by Second Level Domain …
Then optionally preceded by n Level Domains up to effectively no limit
That is, something like this would be considered valid: name@ab.cd.ef.gh.ij.kl.mn.no.pq.rs.tu.vw.xy.com
So it should be easy enough to not capture the TLD, but I don’t see any easy way to parse out anything more in a way that would apply to all possible addresses, other than to just capture what’s left over after that.
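The right-to-left breakdown above can be sketched roughly like this. It is only an illustration in Python, and the function name split_email_domain is made up for the example. It drops everything up to the @, treats the last label as the TLD, and returns whatever is left over without trying to decide which labels are "the domain" and which are subdomains.

```python
def split_email_domain(address):
    """Return (tld, remaining_labels) for the domain part of an email address.

    Hypothetical helper for illustration; it does no validation at all.
    """
    # Anything before and including the last "@" is disregarded.
    _, _, domain = address.rpartition("@")
    # Split the rest on dots; the labels stay in left-to-right order.
    labels = domain.lower().split(".")
    # The last label is the TLD; everything before it is "what's left over".
    return labels[-1], labels[:-1]


tld, rest = split_email_domain("name@ab.cd.ef.com")
print(tld, rest)
```

Nothing here knows where the registered domain ends, which is exactly the problem being discussed.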
I realised that getting a domain name (without a subdomain) is not possible because the domain extension can be 1, 2 or even 3 segments long. So even if we use some kind of technique, it will be a patch and there will be no guarantee that the result is what I wanted.
It is possible, but not automatically with a generic regex. From what you wrote it seems like there are only two possibilities for the pseudo top-level domain: the main domain is either 2 or 3 segments long, like example.com, domain.co.in, domain.gov.net, etc. I would simply make a list of all possible second-level domains (co.in, gov.net and so on) and check whether the last 2 segments can be found in the list. If they are, the last 3 segments are the main domain you are interested in instead of the standard 2, and then it’s easy to extract the subdomain you want. Even simple explode() functions would do it, no need for a regex.
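A minimal sketch of that list-based check, written in Python with str.split standing in for explode(). The set TWO_SEGMENT_SUFFIXES and the function split_domain are assumptions made up for this example, and the three entries are only placeholders; a real list would have to be much longer and kept up to date.

```python
# Hypothetical, deliberately tiny list of two-segment "extensions".
TWO_SEGMENT_SUFFIXES = {"co.in", "gov.net", "co.uk"}


def split_domain(host):
    """Return (subdomain, main_domain) for a host name such as 'a.domain.co.in'."""
    labels = host.lower().split(".")
    # If the last 2 segments are in the list, the extension is 2 segments long.
    suffix_len = 2 if ".".join(labels[-2:]) in TWO_SEGMENT_SUFFIXES else 1
    # The main domain is the extension plus one more segment to its left.
    main_len = suffix_len + 1
    main_domain = ".".join(labels[-main_len:])
    # Whatever remains on the left is the subdomain (empty string if none).
    subdomain = ".".join(labels[:-main_len])
    return subdomain, main_domain


print(split_domain("mail.domain.co.in"))
print(split_domain("www.example.com"))
```

The result is only as good as the list: any two-segment extension missing from it falls back to the standard 2-segment case.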