['"]? - matches a single quote, a double quote, or no quote at all
[^\s'"<>]* - the href value: zero or more characters that are not whitespace, quotes, or angle brackets
[^<>]* - any additional attributes after href, for example target=_blank class=smth
[\s\S]{1,100}? - link name, 1-100 characters of any kind including newlines; the trailing "?" makes it ungreedy, so it stops at the first closing tag
/i - case insensitive
Actually, it detects none of my tests. We want to match links, not garbage like 'test.php' with the quotes still attached. Your pattern matches 0 of the 5 cases from my previous post.
Let's examine the examples in detail:
<a href=test0.php>this is
test0</a>
0 : 1, does not match
<a href=test1.php class="test">test1</a>
0 : 2, garbage match: test1.php class="test" - this is not a correct link
<a href='test2.php'>test2</a>
0 : 3, garbage match: 'test2.php' - this is not a correct link
<a href="test3.php">test3</a>
0 : 4, garbage match: "test3.php" - this is not a correct link
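For comparison, here is a rough Python sketch that puts the pieces explained above together and runs them against these four examples. The opening part of the pattern (everything up to and including "href=") is not shown in this post, so the prefix below is only my stand-in for it, as are the second optional quote, the closing ">" and the "</a>". The capture groups and the name LINK_RE are also just for illustration.

```python
import re

# Assumption: the prefix `<a\s[^<>]*?href\s*=\s*`, the closing quote, `>` and
# `</a>` are hypothetical glue; only the commented middle pieces come from the
# explanation in this post.
LINK_RE = re.compile(
    r"""<a\s[^<>]*?href\s*=\s*"""   # hypothetical prefix up to "href="
    r"""['"]?"""                     # single quote, double quote, or no quote
    r"""([^\s'"<>]*)"""              # href value, captured without quotes
    r"""['"]?"""                     # optional closing quote (assumed)
    r"""[^<>]*>"""                   # any additional attributes, end of tag
    r"""([\s\S]{1,100}?)"""          # link name, ungreedy, may span lines
    r"""</a>""",
    re.IGNORECASE,                   # same effect as /i
)

html = """<a href=test0.php>this is
test0</a>
<a href=test1.php class="test">test1</a>
<a href='test2.php'>test2</a>
<a href="test3.php">test3</a>"""

# Each result is a (href, link name) pair.
links = LINK_RE.findall(html)
for href, name in links:
    print(repr(href), repr(name))
# 'test0.php' 'this is\ntest0'
# 'test1.php' 'test1'
# 'test2.php' 'test2'
# 'test3.php' 'test3'
```

With this arrangement the quotes and the extra class attribute stay outside the captured href, and the multi-line link name in the first example still matches because [\s\S] also covers newlines.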
His approach is wrong; this is not the way to match links. It wasn't specified? It also wasn't specified that we shouldn't point him in the right direction if he was making wrong assumptions.
He did say in the first post that he wants to parse links, not URLs. And a link is not necessarily just what is inside the href attribute; from an HTML perspective it could be the whole 'a' tag. It depends on what it is needed for.