Invalid urls regular expression


#1

Hi,

I have the regular expression below, which finds valid URLs in a text file.

I want to reverse the logic so that it finds invalid URLs instead.

I am using the Sublime Text editor.

^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$


#2

If you're using that in a PHP script, surely you can just add an else clause to the if clause you use around the preg_match()? If all you want is the ones that don't match, just have nothing inside the if clause, or precede it with a ! to negate the result.
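
The same idea works in any scripting language, not only PHP. As a minimal sketch in Python (the helper name `is_invalid` and the two sample strings are made up for illustration; the pattern is the one from the first post, unchanged):

```python
import re

# The pattern from the first post, copied verbatim.
VALID_URL = re.compile(
    r"^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$"
)

def is_invalid(candidate):
    """Return True when the candidate does NOT match the valid-URL pattern."""
    return not VALID_URL.match(candidate)

# Hypothetical sample strings, for illustration only.
for line in ["example.com", "not a url"]:
    if is_invalid(line):
        print("invalid:", line)
```

Here the negation lives in the surrounding code (`not`), so the regular expression itself does not need to change.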


#3

I am using this regular expression in the Sublime Text editor, which is similar to Notepad++.


#4

Oh sorry, I figured that as you had posted in the PHP board, you were using it from PHP.


#5

I tried putting ! before the expression, but it does not select invalid URLs in Sublime Text:

!^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$


#6

Well, I don't know whether your text editor supports that as a general "not" operator the way the PHP language (and C, and probably others) does. Like I said, I thought you were doing it from PHP, so none of what I said above is relevant.
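
For what it's worth, inside a regular expression a leading `!` is just a literal exclamation mark. The usual way to say "a line that does not match" within the pattern itself is a negative lookahead, `(?!...)`. Below is a minimal sketch in Python rather than in the editor; the names `URL_BODY`, `INVALID_LINE` and `find_invalid_lines` and the sample text are assumptions for illustration.

```python
import re

# Body of the pattern from the first post, without its ^ and $ anchors.
URL_BODY = r"(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+"

# Start of line, NOT followed by a complete valid URL reaching the end of
# the line, then consume the whole (non-empty) line.
INVALID_LINE = re.compile(r"^(?!" + URL_BODY + r"$).+$", re.MULTILINE)

def find_invalid_lines(text):
    # finditer + group(0) is used because the pattern contains a capture
    # group, so findall would return that group instead of the full line.
    return [m.group(0) for m in INVALID_LINE.finditer(text)]

# Hypothetical input, for illustration only.
sample = "example.com\nThe quick brown fox\nhttp://www.site.com:8008\nnot a url"
print(find_invalid_lines(sample))  # ['The quick brown fox', 'not a url']
```

Written out as a single expression, that is `^(?!(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$).+$`. It should behave the same way in any engine that supports lookahead, but I have not tested it in Sublime Text or Notepad++.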


#7

Thanks, droopsnoot, for the support. :slight_smile:


#8

I have moved this topic to General Web Dev since it appears not to be about PHP after all.


#9

That depends on what you consider a URL candidate, since the simple negation of a valid URL is essentially every piece of text that surrounds a valid URL.


#10

Any idea what change to the above regular expression would make it find invalid URLs?


#11

You need to be able to describe what an invalid URL will be like because

will be an invalid URL.

For example, it could be said that the following quote has 4 invalid URLs:

The quick brown fox
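
To make that count concrete, here is a small Python sketch. It assumes that every whitespace-separated word is treated as a URL candidate, which is only one possible definition, and it uses the pattern from the first post unchanged.

```python
import re

# The pattern from the first post, copied verbatim.
VALID_URL = re.compile(
    r"^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$"
)

quote = "The quick brown fox"

# Assumption: each whitespace-separated word counts as one URL candidate.
invalid = [word for word in quote.split() if not VALID_URL.match(word)]
print(len(invalid))  # 4
```

Treat the whole line as one candidate instead and the count drops to 1, which is why the definition of a candidate has to come first.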


#12

Things seem to be getting confused.
Maybe explain what your objective is, how you will use this and for what purpose.

Is this Regex going into any script in any particular language?
Or are you searching a file in the editor for strings that are like URLs, but not actually valid URLs? In which case that is some "fuzzy" logic to deal with.


#13

Hi Sam,

I have a text file with many valid and invalid URLs, like the ones below.

I want to remove the invalid URLs from the text file using the regular expression find-and-replace feature of Notepad++ or Sublime Text.

The regular expression shared in my previous post finds valid URLs. I am looking for a regular expression that finds invalid URLs.

https://www.example.com
http://www.example.com
www.example.com
example.com
http://blog.example.com
http://www.example.com/product
http://www.example.com/products?id=1&page=2
http://www.example.com#up
http://255.255.255.255
255.255.255.255
http://invalid.com/perl.cgi?key= | http://web-site.com/cgi-bin/perl.cgi?key1=value1&key2
http://www.site.com:8008
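
As a rough check of what "invalid" means for this list under the pattern from the first post, here is a Python sketch of the find-and-replace step. It treats each line as one candidate and deletes the lines that fail the pattern; the names `URL_BODY`, `INVALID_LINE`, `urls` and `cleaned` are assumptions for illustration, and editor behaviour may differ from Python's `re` module.

```python
import re

# Body of the pattern from the first post, without its ^ and $ anchors.
URL_BODY = r"(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+"

# A non-empty line that is not a complete valid URL, plus its newline if any.
INVALID_LINE = re.compile(r"^(?!" + URL_BODY + r"$).+$\n?", re.MULTILINE)

urls = """\
https://www.example.com
http://www.example.com
www.example.com
example.com
http://blog.example.com
http://www.example.com/product
http://www.example.com/products?id=1&page=2
http://www.example.com#up
http://255.255.255.255
255.255.255.255
http://invalid.com/perl.cgi?key= | http://web-site.com/cgi-bin/perl.cgi?key1=value1&key2
http://www.site.com:8008
"""

# "Replace with nothing" removes every invalid line from the text.
cleaned = INVALID_LINE.sub("", urls)
print(cleaned)
```

With this pattern, only the line containing ` | ` is removed, because spaces and `|` are not in the allowed character set; the other eleven lines are kept.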
