Thread: Good regex for Link Checking
Jul 19, 2006, 07:28 #1
I'm looking for a good regex to use in a link exchange script that will check whether another website is linking to me.
Searching through the old posts, I found this:
But it leaves some things out. Could anyone review it and solve the remaining problems, so that we can get a really good regex? I think it'll be useful for everybody!
Jul 19, 2006, 07:51 #2
Jul 19, 2006, 08:06 #3
That should do it.
It finds anything starting with "<a", then anything besides a ">" until it hits an "href", any number of spaces, an "=", any number of spaces, single/double or no quotes, the http address of your site (with or without the www) followed by single/double or no quotes, any number of characters besides ">", and the closing ">" of the tag.
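The pattern itself isn't shown above, so here is a rough Python sketch of a regex that follows that description piece by piece. It is only an illustration, not the original poster's pattern: `example.com` stands in for your site, the names `build_link_pattern` and `links_to` are made up, and the optional path after the host and the check that the host name really ends there are additions of mine.

```python
import re


def build_link_pattern(host):
    """Compile a pattern for an <a> tag whose href points at `host`."""
    return re.compile(
        r"<a\b"                 # anything starting with "<a"
        r"[^>]*?\shref"         # anything besides ">" until an "href"
        r"\s*=\s*"              # spaces, "=", spaces
        r"[\"']?"               # single, double, or no quote
        r"https?://(?:www\.)?"  # http(s), with or without the www
        + re.escape(host)
        + r"(?![\w.-])"         # assumption: host must end here
        r"[^\"'\s>]*"           # assumption: allow a path or query
        r"[\"']?"               # closing quote, if any
        r"[^>]*>",              # rest of the tag and the closing ">"
        re.IGNORECASE,
    )


def links_to(html, host="example.com"):
    """Return True if `html` contains a link to `host`."""
    return build_link_pattern(host).search(html) is not None
```

For example, `links_to('<a href=http://example.com>me</a>')` is true, while a link to `example.com.evil.org` is rejected by the host-boundary check.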
Jul 19, 2006, 08:20 #4
Thanks for helping!
One last thing: as you know, rel="nofollow" basically makes your link worthless, so could you include a small check to see whether rel=nofollow is present?
I'm not skilled in regex, but it should be something like [^rel="nofollow"]; I just don't know where to put it.
Jul 19, 2006, 09:53 #5
You could accomplish this with one regex, but it would be a big mess, since PCRE doesn't support variable-length lookbehinds.
So I would use two regexes:
First, to match the tag:
Then, to see if it contains rel="nofollow":
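The two patterns aren't shown here, so this is only a hedged Python sketch of the two-step idea, not the poster's actual code. `example.com` stands in for your site, and `TAG_RE`, `NOFOLLOW_RE` and `followed_links` are made-up names. It doesn't try to cover every odd way of writing attributes.

```python
import re

MY_HOST = "example.com"  # hypothetical placeholder for your own site

# Step 1: match every <a ...> tag whose href points at the site.
TAG_RE = re.compile(
    r"<a\b[^>]*?\shref\s*=\s*[\"']?https?://(?:www\.)?"
    + re.escape(MY_HOST)
    + r"(?![\w.-])[^>]*>",
    re.IGNORECASE,
)

# Step 2: look inside a matched tag for a rel attribute holding "nofollow".
NOFOLLOW_RE = re.compile(
    r"\brel\s*=\s*[\"']?[^\"'>]*\bnofollow\b",
    re.IGNORECASE,
)


def followed_links(html):
    """Return the matching <a> tags that do not carry rel=nofollow."""
    return [tag for tag in TAG_RE.findall(html)
            if not NOFOLLOW_RE.search(tag)]
```

If `followed_links(page)` comes back empty, the page either doesn't link to you or only links with nofollow.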