Regular Expressions 101

I wondered if someone could give me a brief primer on regular expressions. I’ve used regular expressions before, but when trying out regular expressions I find on the Internet, they often don’t work. I think the problem is that there are different kinds of regular expressions that work in different ways with different software programs.

For example, there’s “regular expressions,” and there’s GREP, which I’m not that familiar with. I work a lot with Dreamweaver, which has some regular expressions built into its search-and-replace function (though I seem to have lost them when I upgraded to version CC 2017). I also just bought BBEdit, which has GREP built into it, though I haven’t yet worked with it.

Anyway, I’d like to start by asking what the main kinds of regular expressions are. There are apparently “regular expressions” and “GREP.” Are there any others I should know about?

And which combination of regular expressions and software would be the best choice for modifying content in some files in a folder?

For example, I have an index in an epub that has thousands of page numbers, each preceded by a space and followed by a comma, like this:

Africa 23, 179, 488,
Europe 25, 187,

I want to find a regular expression that will insert a particular hyperlink in a range of values (corresponding to a chapter or page file). For example, I would like to change everything from the page number 87 to 150 to something like this:

<a href="../Text/chapter2#" title="">87</a>,
<a href="../Text/chapter2#" title="">88</a>,
 151, (unchanged, because it's not in the specified range of numerals)

I would later have to go back and manually insert the appropriate anchor values, of course.

To get to my question, what would you recommend as the best tools for doing this? Is GREP better than “regular” regular expressions? Would BBEdit be a better choice than Dreamweaver, or is there another software program that’s better still?

On a similar note, what’s the name of the PHP function that effectively acts like a regular expression when a page is previewed? For example, imagine your static text includes the words “New World.” You don’t want to modify this text, but you would like visitors to see “New World II” instead. You could obviously do this with a str_replace, but is there another PHP function that does this?

Thanks!

(I guess I should have posted that last question in the PHP forum, but I’ll leave it there for now.)

TBH I don’t think there is such a thing as a “brief” primer.
But don’t let that stop you. I don’t think of the “different regex languages” as different languages so much as different “flavors”. That is, all the ones I’ve encountered have much more in common than they have differences.

The first time I tried regex was while writing Perl code. Later came PHP and JavaScript. All three are “Perl flavor regex” (well, PHP also has a POSIX flavor, but I didn’t use it much, preferring the more familiar PCRE). More recently I’ve been working with Discourse code a bit, which uses POSIX.

Anyway, to get to the point, once you learn one, it should be fairly easy to use another. There may be some frustration if one flavor doesn’t have one of your “favorites” (e.g. older JavaScript engines don’t have “look behind”), but for the most part you can find a way to do what you want.
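To illustrate working around a missing feature, here is a minimal sketch in Python (chosen only for illustration; the same idea carries over to PHP or JavaScript). It wraps each page number in square brackets, which is a made-up transformation, first with look-behind and look-ahead, and then with plain capture groups for a flavor that lacks look-behind.

```python
import re

# Sample line taken from the index in the question.
text = "Africa 23, 179, 488,"

# Version 1: look-behind and look-ahead.
# Only the digits are matched; the space before and the comma after
# are checked but not consumed.
with_lookaround = re.sub(r"(?<= )\d+(?=,)", lambda m: f"[{m.group(0)}]", text)

# Version 2: no look-behind needed.
# The space and comma are captured and written back unchanged.
with_groups = re.sub(r"( )(\d+)(,)", r"\1[\2]\3", text)

print(with_lookaround)
print(with_groups)
```

Both versions should produce the same result, so a flavor without look-behind can usually get there with capture groups instead.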

I’m admittedly biased, but I think learning a Perl-style flavor such as PHP’s PCRE or JavaScript regex would be a good place to start.


Thanks. I’ve never even heard of PCRE regex. So you’re saying if I learned how to do something with PCRE, it shouldn’t be too hard to make it work with “regular” regular expressions or GREP? Also, is there a particular software program that’s really helpful for working with PCRE?

AFAIK grep is the grand-daddy of regex that all others are based on.

So if you are more comfortable working from the command line, you could try that.

But you don’t need any special app. A localhost server running PHP or a browser with JavaScript enabled that can open HTML files on your computer is more than enough to get started.
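For the index example in the question, a short script may be easier than a pure search-and-replace pattern, because matching “any number from 87 to 150” with regex alone gets awkward. The sketch below is in Python purely as an illustration (PHP’s preg_replace_callback or a JavaScript replace with a function would follow the same idea). The names `FIRST_PAGE`, `LAST_PAGE`, `LINK_TEMPLATE`, and `link_pages` are made up for this example; the range and link target are copied from the question.

```python
import re

# Hypothetical settings, copied from the example in the question.
FIRST_PAGE = 87
LAST_PAGE = 150
LINK_TEMPLATE = '<a href="../Text/chapter2#" title="">{page}</a>'

# A space, then the page number (captured), then a comma.
PAGE_PATTERN = re.compile(r" (\d+),")


def link_page(match):
    """Return the linked form if the page is in range, else the original text."""
    page = int(match.group(1))
    if FIRST_PAGE <= page <= LAST_PAGE:
        return " " + LINK_TEMPLATE.format(page=page) + ","
    return match.group(0)


def link_pages(text):
    """Apply link_page to every ' <number>,' occurrence in text."""
    return PAGE_PATTERN.sub(link_page, text)


sample = "Africa 23, 87, 150, 151,"
print(link_pages(sample))
```

The regex only finds the numbers; the range check happens in ordinary code, which keeps the pattern simple. The anchor values after the `#` would still need to be filled in by hand, as the question already notes.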

BTW, PCRE stands for Perl Compatible Regular Expressions.

I’m not sure which version/flavor grep uses, but the two are similar enough that it should get you far enough.

I personally own a hard copy of [O’Reilly Regular Expression Pocket Edition](http://shop.oreilly.com/product/mobile/9780596514273.do) and I love that book.

https://regex101.com is a great site for testing your regular expressions and figuring out where your mistakes are.


And here is a tutorial site about Regex.

http://www.regular-expressions.info/


Thanks for all the tips.
