Regex To Extract 2nd Level Domains From All TLDs?

Good Day Folks!

  1. Is the following regex OK for extracting top-level domains and 2nd-level domains?

  2. How do I write PHP code that uses that regex?
    Any sample code is welcome.

Use the preg_match function to search text with a regular expression pattern. If you need to extract the matched text, pass a variable as the function's third parameter.


Only matching:

if (preg_match($pattern, $string)) {
    // matches
}

Extracting the matched string(s):

if (preg_match($pattern, $string, $match)) {
    // matches
    print_r($match); // $match is filled in as an array
}
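
To make the contents of $match concrete, here is a small sketch. The pattern and the URL are made up for illustration only; they are not the regex from the question.

```php
<?php
// Hypothetical pattern and input, chosen only to show what $match holds.
$pattern = '~^https?://([^/]+)~i';
$string  = 'https://blog.example.com/path/page.html';

$whole = null;
$host  = null;

if (preg_match($pattern, $string, $match)) {
    // Index 0 is the whole match; index 1 is the first capture group.
    $whole = $match[0];
    $host  = $match[1];
    echo $whole, "\n";
    echo $host, "\n";
}
```

Here $match[0] holds "https://blog.example.com" and $match[1] holds "blog.example.com".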

If you also want the offset of each match within the string, consider passing the PREG_OFFSET_CAPTURE flag as the fourth parameter.
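
A minimal sketch of what that flag changes, again with a made-up pattern and string: each entry of $match becomes a two-element array holding the matched text and its byte offset.

```php
<?php
// Hypothetical pattern and input, for illustration only.
$pattern = '/example\.(com|org)/';
$string  = 'visit www.example.org today';

$text   = null;
$offset = null;

if (preg_match($pattern, $string, $match, PREG_OFFSET_CAPTURE)) {
    // With the flag set, $match[0] is [matched text, byte offset].
    [$text, $offset] = $match[0];
    echo "$text found at offset $offset\n";

    // Capture groups get the same [text, offset] shape.
    [$tld, $tldOffset] = $match[1];
    echo "$tld found at offset $tldOffset\n";
}
```

The offsets count bytes from the start of the string, starting at 0.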

You can read about the full functionality in the PHP manual: PCRE manual

Test it.



Thank you for your willingness to help.
I’m a complete beginner in regex and so any suitable tutorial suggestions for complete beginners are welcome too!

Anyway, as you know, different webpages have different internal and external links all over their pages. No matter what a link looks like, the domain should be extracted. Imagine I'm running a web crawler: it would encounter an unlimited variety of links, some with just a domain, some with a subdomain, and so on.

Note: No matter how many subdomains or domain levels (3rd level, 4th level, etc.) or directories and sub-directories (regardless of depth) a link contains, the 2nd-level domain should be extracted along with its TLD.
From our examples above, the script should extract “” from all the above-mentioned links.
I need an example of the php code too alongside the regex.
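
One possible starting point, offered as a rough sketch rather than a finished answer: let parse_url find the host, then use a regex to keep only the last two dot-separated labels. The function name lastTwoLabels and the sample URLs are invented for this example. The approach assumes a single-label TLD, so a host under a multi-part suffix such as co.uk comes back as "co.uk"; handling those cases would need a suffix list such as the Public Suffix List.

```php
<?php
/**
 * Hypothetical helper: return the last two labels of a URL's host,
 * or null when no such host can be found.
 * Naive by design: it does not know about multi-part suffixes like co.uk.
 */
function lastTwoLabels(string $url): ?string
{
    // parse_url only reports a host when a scheme is present,
    // so add one to bare inputs like "example.com/page".
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }

    // Normalise case and drop a trailing dot, if any.
    $host = strtolower(rtrim($host, '.'));

    // Keep the final two dot-separated labels of the host.
    if (preg_match('/([a-z0-9-]+\.[a-z0-9-]+)$/', $host, $m)) {
        return $m[1];
    }
    return null;
}

// Sample links, made up for illustration.
$links = [
    'https://blog.shop.example.com/a/b?x=1',
    'example.com/page',
    'http://www.example.co.uk/',
];
foreach ($links as $link) {
    echo $link, ' => ', lastTwoLabels($link) ?? '(none)', "\n";
}
```

The first two links give "example.com"; the third gives "co.uk", which shows the limitation mentioned above.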

There doesn’t seem to be any shortage of beginners’ tutorials, if you just look. For example:


Yes, yes. But I learnt the hard way not to just jump into any tutorial found online, as many contain outdated code and some contain bad code too. Hence, link suggestions from intermediate or advanced-level programmers weigh more on my scale.
