Regular Expressions in JavaScript
This tutorial will explain how regular expressions are used in JavaScript, with a practical example or two to help the concepts gel. By the end of this article, you’ll probably still be a mere mortal, but you’ll certainly be able to impress at parties with your newly acquired text juggling skills!
Using Regular Expressions in JavaScript
Regular expressions in JavaScript are simple once you understand the basics of regex. You can create a regular expression in JavaScript as follows:
const myRE = /regexp/;
Where regexp is the regular expression code, as described above. For example, the following creates a regular expression that detects the string “JavaScript”:
const myRE = /JavaScript/;
Similarly, the following example matches “banana”, “nababa”, “baba”, “nana”, “ba”, “na”, and others.
const myRE = /^(ba|na)+$/;
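To see which strings that pattern accepts, here is a minimal sketch that runs it against a handful of candidates. It uses the built-in test() method of the regular expression object, which this article doesn't otherwise cover; the candidates list and the results variable are just illustrative names.

```javascript
// Pattern from the text: one or more repetitions of "ba" or "na",
// anchored to the start (^) and the end ($) of the string.
const myRE = /^(ba|na)+$/;

// Illustrative inputs: the first three should match, the rest should not.
const candidates = ["banana", "nababa", "ba", "bananas", "ab", ""];

// test() returns true when the pattern matches, false otherwise.
const results = candidates.map((word) => myRE.test(word));

candidates.forEach((word, i) => {
  console.log(JSON.stringify(word), "->", results[i]);
});
```

The last three fail for different reasons: “bananas” has a trailing “s” that the $ anchor rejects, “ab” is not one of the two alternatives, and the empty string fails because + requires at least one repetition.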
By default, JavaScript regular expressions are case sensitive and only search for the first match in any given string. By adding the g
(for global) and i
(for ignore case) modifiers after the second /
, you can make a regular expression search for all matches in the string and ignore case, respectively. Here are a few example regular expressions. For each, I’ve indicated what portion(s) of the string "test1 Test2 TEST3"
they would match:
RegExp | Match(es): |
---|---|
/Test[0-9]+/ | “Test2” only |
/Test[0-9]+/i | “test1” only |
/Test[0-9]+/gi | “test1”, “Test2”, and “TEST3” |
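The table can be checked with a short sketch like the one below. It relies on the match() method described in the next section, and the variable names are only for illustration.

```javascript
const sample = "test1 Test2 TEST3";

// No modifiers: case sensitive, first match only.
const caseSensitive = sample.match(/Test[0-9]+/);

// i modifier: case is ignored, but still only the first match.
const ignoreCase = sample.match(/Test[0-9]+/i);

// g and i modifiers together: every match, ignoring case.
const globalIgnoreCase = sample.match(/Test[0-9]+/gi);

console.log(caseSensitive[0]);  // "Test2"
console.log(ignoreCase[0]);     // "test1"
console.log(globalIgnoreCase);  // [ 'test1', 'Test2', 'TEST3' ]
```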
Using a regular expression is easy. Every JavaScript variable containing a text string supports three methods (or functions, if you aren’t used to object-oriented terminology) for working with regular expressions: match()
, replace()
, and search()
.
match()
match()
takes a regular expression as a parameter and returns an array of the matching strings found in the string under consideration: every match if the regular expression uses the g modifier, or just the first match (along with any parenthesised portions) if it doesn't. If no matches are discovered, then match()
returns null. Returning to our original example, let’s say that we wanted a function that can check that a string entered by the user as their phone number is of the form (XXX) XXX-XXXX. The following code would do the trick:
function checkPhoneNumber(phoneNo) {
  const phoneRE = /^\(\d\d\d\) \d\d\d-\d\d\d\d$/;
  if (phoneNo.match(phoneRE)) {
    return true;
  } else {
    alert("The phone number entered is invalid!");
    return false;
  }
}
As a first order of business, this function defines a regular expression. Let’s break it down to understand how it works. The regular expression begins with ^
, to indicate that any match must begin at the start of the string. Next is \(
, which will just match the opening parenthesis. We prefixed the character with a backslash to remove its special meaning in regular expression syntax (to mark the start of a set of alternatives for matching). As mentioned previously, \d
is a special code that matches any digit; thus, \d\d\d
matches any three digits. We could have written [0-9][0-9][0-9]
to achieve the same effect, but this is shorter. The rest of the pattern should be pretty self-explanatory. \)
matches the closing parenthesis, the space matches the space that must be left in the phone number, then \d\d\d-\d\d\d\d
matches three digits, followed by a dash, followed by four more digits. Finally, the $
indicates that any match must end at the end of the string.
Incidentally, we could shorten this regular expression to the following, by using another shortcut that we did not mention above. If you can see how this works, you’re a natural!
const phoneRE = /^\(\d{3}\) \d{3}-\d{4}$/;
Our function then checks the value returned by phoneNo.match(phoneRE)
. In other words, it checks whether or not the string contained in phoneNo
matches our regular expression. If a match is detected, match()
returns an array, which JavaScript treats as true
in a condition, and our function returns true
to certify that the string is indeed a phone number. If not, match()
returns null, which is treated as false, so a message is displayed warning of the problem and the function returns false
.
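The following sketch shows what match() hands back in each case, and checks that the long and short forms of the pattern agree. It leaves out the alert() call so that it can run outside a browser; isPhoneNumber and the sample inputs are hypothetical names made up for this illustration.

```javascript
// The two forms of the pattern from the text.
const longForm = /^\(\d\d\d\) \d\d\d-\d\d\d\d$/;
const shortForm = /^\(\d{3}\) \d{3}-\d{4}$/;

// Hypothetical helper: turns the result of match() into a plain boolean.
function isPhoneNumber(phoneNo, pattern) {
  // match() returns an array on success and null on failure.
  return phoneNo.match(pattern) !== null;
}

const inputs = [
  "(123) 456-7890",        // well formed
  "123-456-7890",          // missing parentheses
  "(123) 456-789",         // one digit short
  "(123) 456-7890 ext. 5", // extra text after the number
];

const longResults = inputs.map((s) => isPhoneNumber(s, longForm));
const shortResults = inputs.map((s) => isPhoneNumber(s, shortForm));

console.log(longResults);  // [ true, false, false, false ]
console.log(shortResults); // [ true, false, false, false ]
```

Only the first input passes. The last one is rejected because of the $ at the end of the pattern, which doesn't allow anything to follow the final four digits.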
The most common use for this type of function is in validating user input to a form before allowing it to be submitted. By calling our function in the onSubmit
event handler for the form, we can prevent the form from being submitted if the information entered is not properly formatted. Here’s a simple example demonstrating the use of our checkPhoneNumber()
function:
<form action="...">
<label>Enter phone number (e.g. (123) 456-7890):
<input type="text" name="phone">
</label>
<input type="submit">
</form>
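<script>
  const form = document.querySelector('form');
  form.addEventListener('submit', function (event) {
    // Returning false from a listener added with addEventListener() does not
    // cancel the event, so block the submission explicitly.
    if (!checkPhoneNumber(this.phone.value)) {
      event.preventDefault();
    }
  });
</script>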
The user will be unable to submit this form unless a phone number in the expected format has been entered. Any attempt to submit anything else will produce the error message generated by our checkPhoneNumber()
function.
replace()
As its name would suggest, replace()
lets you replace matches to a given regular expression with some new string. Let’s say you were a spelling nut and wanted to enforce the old adage “I before E, except after C” to correct such misspellings as “acheive” and “cieling”. What we’d need is a function that takes a string and performs two search-and-replace operations. The first would replace “cie” with “cei”.
Here’s the code:
theString = theString.replace(/cie/gi,"cei");
Simple enough, right? The first parameter is the regular expression that we’re searching for (notice that we’ve set it to “ignore case” and to be “global” so that it finds all occurrences, not just the first), and the second parameter is the string that we want to replace any matches with.
The second replacement is a little more complicated. We want to replace “xei” with “xie” where ‘x’ is any letter except ‘c’. The regular expression to detect instances of “xei” is fairly easy to understand:
/[abd-z]ei/gi
This just detects any letter except ‘c’ (‘a’, ‘b’, and ‘d’ to ‘z’ inclusive), followed by “ei”, and does it in a global, case-insensitive manner.
The complexity comes in defining our replacement string. Obviously, we want to replace the match with “xie”, but the difficulty comes in writing the ‘x’. Remember, we have to replace ‘x’ with whatever letter appears in the matching string. To do this, we need to learn a new trick.
Earlier on, I showed you how parentheses could be used to define a set of alternatives in a regular expression (e.g. ^(ba|na)+$
). Well as it turns out, parentheses have another meaning, too. They let us “remember” part of a match, so that we can use it in the replacement string. In this case, we want to remember the portion of the match that corresponds to the [abd-z]
in the regular expression. Thus, we surround it with parentheses:
/([abd-z])ei/gi
Now, when specifying the replacement string, we put $1
where we want to insert the portion of the string corresponding to the parenthesised portion of the regular expression. Thus, the code for performing the required substitutions is as follows:
theString = theString.replace(/([abd-z])ei/gi,"$1ie");
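Here is a small sketch of the $1 substitution at work on a couple of misspelled words. It also shows an equivalent written with a replacer function, a form of replace() that this article doesn't otherwise discuss; the variable names are illustrative only.

```javascript
// The parenthesised [abd-z] is "remembered" as $1.
const eiPattern = /([abd-z])ei/gi;

// $1 in the replacement string stands for whichever letter was matched.
const fixedAchieve = "acheive".replace(eiPattern, "$1ie");
const fixedFriend = "freind".replace(eiPattern, "$1ie");

// Equivalent using a replacer function: it receives the whole match first,
// then the remembered portion as its second argument.
const viaFunction = "freind".replace(eiPattern, (whole, letter) => letter + "ie");

console.log(fixedAchieve); // "achieve"
console.log(fixedFriend);  // "friend"
console.log(viaFunction);  // "friend"
```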
To sum it up, here’s the complete function for performing our auto-correction:
function autoCorrect(theString) {
  theString = theString.replace(/cie/gi, "cei");
  theString = theString.replace(/([abd-z])ei/gi, "$1ie");
  return theString;
}
Before you go and use this function on your page, realize that there are exceptions to the “I before E except after C” rule. Weird, huh?
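To make those exceptions concrete, the sketch below repeats the autoCorrect() function so that it can run on its own, then feeds it a few words. The sample words are my own choices, not part of the original example.

```javascript
// Same function as in the text, repeated so this snippet is self-contained.
function autoCorrect(theString) {
  theString = theString.replace(/cie/gi, "cei");
  theString = theString.replace(/([abd-z])ei/gi, "$1ie");
  return theString;
}

// Misspellings that the rule fixes.
const corrected = ["acheive", "cieling", "recieved"].map(autoCorrect);
console.log(corrected); // [ 'achieve', 'ceiling', 'received' ]

// Correctly spelled words that break the rule, and so get "corrected"
// into misspellings.
const broken = ["weird", "science"].map(autoCorrect);
console.log(broken); // [ 'wierd', 'sceince' ]

// The first replacement always inserts a lowercase "cei", so a capital
// letter at the start of a match is lost.
console.log(autoCorrect("Cieling")); // "ceiling"
```

One possible way around the lost capital, offered only as a suggestion, is to remember the “c” as well, as in replace(/(c)ie/gi, "$1ei"), so that the original letter is put back.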
search()
The search()
function is similar to the well-known indexOf()
function, except it takes a regular expression instead of a string. It then searches the string for the first match to the given regular expression and returns an integer indicating the position in the string (e.g. 0 if the match is at the start of the string, 9 if the match begins with the 10th character in the string). If no match is found, the function returns a value of -1.
const theString = "test1 Test2 TEST3";
theString.search(/Test[0-9]+/); // 6
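A few more calls, sketched below, show how the result changes with the pattern and how search() lines up with indexOf(). The variable names are illustrative.

```javascript
const theString = "test1 Test2 TEST3";

// Case sensitive: the first match is "Test2", which starts at position 6.
const firstExact = theString.search(/Test[0-9]+/);

// Ignoring case: "test1" at the very start of the string now matches.
const firstAnyCase = theString.search(/Test[0-9]+/i);

// All caps: only "TEST3" matches, starting at position 12.
const firstUpper = theString.search(/TEST[0-9]+/);

// No match at all gives -1, just like indexOf().
const missing = theString.search(/Test4/);

// indexOf() finds the same position, but only for a fixed string.
const viaIndexOf = theString.indexOf("Test2");

console.log(firstExact, firstAnyCase, firstUpper, missing, viaIndexOf);
// 6 0 12 -1 6
```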
Summing It Up
Regular expressions are an invaluable tool for verifying user input. By taking advantage of support for regular expressions in JavaScript, that verification can be done as the data is entered, providing a smoother user experience. (Note: server-side validation is still necessary for security, and also to cater for situations where JavaScript is unavailable.)