RegExper: Regular Expressions Explained

What does this code snippet do?…

/^[0-9a-zA-Z]+@[0-9a-zA-Z]+[\.]{1}[0-9a-zA-Z]+[\.]?[0-9a-zA-Z]+$/

Those with several years’ development experience will recognize it as a regular expression. But even the most astute guru will take a few moments to determine that it checks the validity of an email address. Only a superior subset of that group will comprehend that it’s fairly superficial and won’t accept every valid address.
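To see how superficial the check is, here is a minimal sketch that runs the expression against a few addresses. The sample addresses are hypothetical and chosen only for illustration; the pattern itself is copied unchanged from above.

```javascript
// The pattern from the top of the article, unchanged.
const emailPattern = /^[0-9a-zA-Z]+@[0-9a-zA-Z]+[\.]{1}[0-9a-zA-Z]+[\.]?[0-9a-zA-Z]+$/;

// Hypothetical sample addresses (assumptions, not from the article).
const samples = [
  "user@example.com",       // accepted
  "user@example.co.uk",     // accepted: the optional second dot is used
  "first.last@example.com", // rejected: dot before the @
  "user+tag@example.com",   // rejected: plus sign before the @
  "user@my-site.com",       // rejected: hyphen in the domain
];

// Pair each address with the result of testing it against the pattern.
const results = samples.map((address) => [address, emailPattern.test(address)]);

for (const [address, matched] of results) {
  console.log(`${address}: ${matched ? "accepted" : "rejected"}`);
}
```

The last three addresses are perfectly plausible, yet the expression turns them away because it only allows letters and digits on either side of the @.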

Regular expressions are extremely powerful search patterns which can be used for string matching and replacement. They’re supported in the majority of languages including JavaScript, PHP, Perl, Java, C#, Python and Ruby.
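As a rough illustration of matching and replacement in JavaScript, the sketch below uses the standard test(), match() and replace() methods. The sample string is made up for the example.

```javascript
// Hypothetical sample text, invented for this example.
const text = "Order 66 shipped on 2013-05-04";

// Matching: test() reports whether the pattern occurs anywhere in the string.
const hasDigits = /[0-9]+/.test(text);

// Extraction: match() with the g flag returns every run of digits.
const numbers = text.match(/[0-9]+/g);

// Replacement: replace() with the g flag swaps every digit for "#".
const masked = text.replace(/[0-9]/g, "#");

console.log(hasDigits); // true
console.log(numbers);   // [ '66', '2013', '05', '04' ]
console.log(masked);    // Order ## shipped on ####-##-##
```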

Individual rules are normally straightforward, e.g. [a-z] matches a single lowercase letter and c.t matches a three-character string starting with ‘c’ and ending with ‘t’, such as ‘cat’. However, when rules are combined, an indecipherable string of seemingly random codes starts to appear. The one above is relatively simple compared to many you’ll find in the wild.
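The two rules can be tried on their own with a short sketch like the one below. The ^ and $ anchors are an addition for this example, so that each test applies to the whole string rather than to any part of it, and the test strings are hypothetical.

```javascript
// [a-z]: exactly one lowercase letter (anchors added for this example).
const lowercase = /^[a-z]$/;

// c.t: 'c', then any one character except a line break, then 't'.
const cDotT = /^c.t$/;

const lowercaseResults = ["q", "Q", "qq"].map((s) => lowercase.test(s));
const cDotTResults = ["cat", "cut", "c9t", "cart", "ct"].map((s) => cDotT.test(s));

console.log(lowercaseResults); // [ true, false, false ]
console.log(cDotTResults);     // [ true, true, true, false, false ]
```

Note that the dot is not limited to letters, so ‘c9t’ matches just as ‘cat’ does.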

Creating your own regular expressions is difficult enough and many of us resort to using the force. But it’s easy compared to parsing someone else’s code — which is normally written by someone who has an irrational aversion to comments!

Fortunately, Jeff Avallone has created a solution to your regex woes. RegExper transforms meaningless JavaScript-based expressions into a graphical representation:

[Image: RegExper diagram of the email expression above]

Admittedly, you’ll still need a reasonable understanding of pattern matching, but it’s far more evident the expression is analyzing an email address.

Behind the scenes, RegExper is a Ruby application which translates regular expressions into an SVG image. The SVG is embedded in the page, but it should be possible to extract or copy it for use in your own documentation.

If you’d like to make improvements or automate the process, the project is open source and available to download or fork from github.com/javallone/regexper.

RegExper is incredibly clever. While there are plenty of tools to help you devise and test regular expressions, very few allow you to parse or reverse-engineer existing code. I’ve not found any which does it quite so prettily either.

Add RegExper to your toolkit and you’ll be parsing regular expressions with renewed enthusiasm. Probably.

  • Anonymous

    Odd that this is a Ruby server-side application, when all of this could be done with JavaScript in the front-end, potentially with offline support. Definitely neat though. :)

  • Anonymous

    It could be, although creating an image server-side makes it easier to save the resulting file.

  • Anonymous

    Great tool. Though I take issue with this:

    “meaningless JavaScript-based expressions”

    1) They have plenty of meaning to people who are familiar with Regular Expressions

    2) These are POSIX regular expressions and are thus applicable in any environment that uses POSIX regular expressions, and thus are suitable for PHP, Perl, among others.

    • Anonymous

      Sorry Ben – you’ve read that totally out of context!

      I was referring to regular expressions which are difficult to parse, i.e. it’s seemingly meaningless to a developer viewing the code so they use Regexper to aid comprehension. I certainly didn’t mean that JavaScript, regular expressions or JS regex implementation is somehow pointless or flawed.

      And, yes, they are POSIX – although there are subtle differences. Regexper is designed for JavaScript regexes, although it’ll be fine for most PHP PCRE, Perl etc. samples.

  • brothercake

    The image you’ve posted doesn’t explain the regular expression. It doesn’t say that the character patterns must occur one or more times, or that the third dot can occur either once or not at all. Based on that example, it hasn’t given me much faith in the tool.

    • pablo

      Actually it does, check the shape of the start and end points of the branches.

      • brothercake

        The shape of the start and end points? I don’t understand what you mean.

        • Anonymous

          brothercake, notice that there is a branch in the line after each of the [] groups, one of those branches loops back to the left of the group, and the other goes to the next group. The loop indicates that it can be repeated, i.e., one or more.

        • Anonymous

          Though I am seeing that there is a loop around the first “.” too, which is incorrect. There is exactly one. The {1} quantifier is actually redundant here and might be confusing the parser.

          • brothercake

            Right yeah, I didn’t spot those flow lines, but now that you draw my attention to them, they are visually inconsistent. The loop around the first dot implies “one repeat” when it’s actually “one instance” (and as you say, the “{1}” in the expression is redundant anyway).

            I would imagine that a tool like this is primarily aimed at novice programmers, or those who don’t come from a computer science background. And such an audience is not likely to be familiar with flow diagrams, as I wasn’t.

            My personal opinion is that tools like this are rather self-contradictory, because the visualisation of a regular expression is almost as complicated as the original expression. The solution for people who find regex difficult to work with is to take the time to understand them more deeply. I remember when that was me: I’d read dozens of online primers, and none of them made any sense. Then I read the O’Reilly book “Mastering Regular Expressions” and that was what really got through. I would certainly recommend it for everyone’s Top 5 books.

          • Anonymous

            I don’t think it particularly matters whether you’re a novice or an expert – a tool such as Regexper can give you an insight into what a regular expression is doing (even if the results aren’t exactly what you expect). You can identify the email structure almost immediately from the diagram. You can parse the code too, but it’ll take longer.

          • brothercake

            Yeah that’s fair comment, perhaps I was a little harsh.

            I was thinking of it as a tool to help someone learn how to use regex, and I don’t think it’s that. But it could be helpful as a way of visualizing expressions, to make it easier to get a general sense of what they’re doing.

            Now show me a graphical tool where you can create a flow diagram, and it will compile that visualisation into the best possible regex — then I’d be impressed :-)

  • Anonymous

    There is a fork of Regexper at http://www.regexplained.co.uk/ using real-time client side processing.