PREG (regex) refresher needed!

codamedia · May 4, 2010, 10:28pm

Regular expressions are not my specialty. I am trying to solve a simple problem, but coming up with a blank.

EXAMPLE:
If the string is:
“The fox jumped <strong>over the road</strong> and landed on the other side.”

Using PREG (not ereg), how would I extract everything between the “strong” tags, including the strong tags themselves? What I want to end up with is:

<strong>over the road</strong>

I will always know the exact START and END of the string (in this case the strong tags). I just need to grab both of those, and everything in between.

force · May 4, 2010, 10:41pm

Something like this would work:

<strong>[\w\s]+</strong>

Or, if you want to use a capture group to get only the text between the tags:

<strong>([\w\s]+)</strong>

Note that \w only matches letters, digits and underscores, and \s only matches whitespace. If you want punctuation marks in addition to that, you’ll have to add them to the character class.
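
To illustrate the point about punctuation, here is a minimal sketch. The test strings are made up for the example, and it uses ~ as the pattern delimiter instead of / purely so the closing tag needs no escaping:

```php
<?php
// Using ~ as the delimiter, so the / in </strong> needs no escaping.
$pattern = '~<strong>[\w\s]+</strong>~';

// Only letters, digits, underscores and whitespace between the tags: match.
$plain = preg_match($pattern, "jumped <strong>over the road</strong> and landed");

// The comma is neither \w nor \s, so the whole pattern fails to match.
$comma = preg_match($pattern, "jumped <strong>over, the road</strong> and landed");

// Adding the punctuation you expect to the character class fixes that.
$fixed = preg_match('~<strong>[\w\s,.!?]+</strong>~', "jumped <strong>over, the road</strong> and landed");

var_dump($plain, $comma, $fixed); // int(1), int(0), int(1)
```

preg_match() returns 1 when the pattern matches and 0 when it does not, which is all this sketch looks at.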

http://php.net/manual/en/function.preg-match.php

I usually use this AIR app to quickly test regular expressions: http://www.gskinner.com/RegExr/ (there’s a link to the desktop version on the bottom right)

rpkamp · May 4, 2010, 10:45pm

Or


<strong>([^<]+)</strong>

Everything except <, seeing as that’s where </strong> starts.
That way you don’t have to specify everything you do want to match (like characters with accents, and so on).
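
A quick sketch of what that buys you, assuming a made-up test string with accents and punctuation (and ~ as the delimiter, so the closing tag needs no escaping):

```php
<?php
// Hypothetical test string with accents and punctuation between the tags.
$str = "Le renard <strong>a sauté, déjà, par-dessus la route!</strong> et voilà.";

// [^<]+ means "one or more characters that are not <", whatever they are.
$found = preg_match('~<strong>([^<]+)</strong>~', $str);
var_dump($found); // int(1)

// An empty pair does not match, because + requires at least one character.
$empty = preg_match('~<strong>([^<]+)</strong>~', "<strong></strong>");
var_dump($empty); // int(0)
```

If empty tags should match too, * instead of + would be the thing to try.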

force · May 4, 2010, 11:01pm

That’s probably a better approach.

But…what if there’s a link tag between the strong tags?

rpkamp · May 4, 2010, 11:05pm

I guess you could use a negative lookahead to look for </strong>, or go for the root of all evil: the ANYTHING atom, (.*) (just make sure to make it lazy though: (.*?)).
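
A rough comparison of those options, just counting matches. The test string is made up, the lookahead pattern is only one possible way to write it, and calling preg_match_all() without a third argument assumes PHP 5.4 or later:

```php
<?php
// Hypothetical string: the first strong pair contains a link tag.
$str = 'A <strong>fox <a href="#">jumped</a></strong> over <strong>the road</strong>.';

// Negated class: cannot cross the <a> tag, so only the second pair is found.
$negated = preg_match_all('~<strong>[^<]+</strong>~', $str);

// Greedy dot: runs to the last </strong>, swallowing both pairs in one match.
$greedy = preg_match_all('~<strong>.*</strong>~', $str);

// Lazy dot: stops at the first </strong> it can, so each pair matches separately.
$lazy = preg_match_all('~<strong>.*?</strong>~', $str);

// Negative lookahead: consume one character at a time, as long as it does not start </strong>.
$lookahead = preg_match_all('~<strong>(?:(?!</strong>).)*</strong>~', $str);

var_dump($negated, $greedy, $lazy, $lookahead); // int(1), int(1), int(2), int(2)
```

The lazy dot and the lookahead find the same two pairs here; the lazy version is simply shorter to write.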

codamedia · May 5, 2010, 2:59am

Thanks everyone. I’ll work with a few of these thoughts and come up with something that works for this project.

codamedia · May 5, 2010, 2:55pm

I’m sorry, but my regex skills are very limited. Still having problems, mainly because I don’t fully understand how preg works.

With the suggestions above, it is returning the boolean response (0 or 1 depending on what I toss in the test string). However, I want the actual “string”.

Using my initial example in this post, how do I get it to extract the string so I can store it in a variable? (Either “<strong>over the road</strong>” or even “over the road” would be fine if that is easier.)

If someone could provide me with a full example I would really appreciate it! Thanks.

rpkamp · May 5, 2010, 3:24pm

Run the following code to see how preg_match works:


$matches = array();
$str = "The fox jumped <strong>over the road</strong> and landed on the other side.";
preg_match('/<strong>([^<]+)</strong>/', $str, $matches);
var_dump($matches);

codamedia · May 5, 2010, 8:44pm

Thanks for the example - it certainly gives me a better idea of how it works.

rpkamp:

Run the following code to see how preg_match works:


$matches = array();
$str = "The fox jumped <strong>over the road</strong> and landed on the other side.";
preg_match('/<strong>([^<]+)</strong>/', $str, $matches);
var_dump($matches);

One problem I found was that this code threw a warning on my system when I tested it:
“Warning: preg_match() [function.preg-match]: Unknown modifier ‘t’ in …”.

The fix was to escape the / in </strong>, like this: <\/strong>
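
For anyone hitting the same warning: the unescaped / closes the pattern early, so PHP tries to read the characters after it as modifiers. Here is a sketch of three ways around that, as far as I can tell. The variable names ($a, $b, $c, $start, $end) are just placeholders for this example:

```php
<?php
$str = "The fox jumped <strong>over the road</strong> and landed on the other side.";

// Option 1: keep / as the delimiter and escape the slash in the closing tag.
preg_match('/<strong>([^<]+)<\/strong>/', $str, $a);

// Option 2: pick a delimiter that does not occur in the pattern.
preg_match('~<strong>([^<]+)</strong>~', $str, $b);

// Option 3: let preg_quote() escape the known start and end strings,
// passing the delimiter as the second argument so it gets escaped too.
$start = preg_quote('<strong>', '/');
$end   = preg_quote('</strong>', '/');
preg_match('/' . $start . '([^<]+)' . $end . '/', $str, $c);

echo $a[0], "\n"; // <strong>over the road</strong>
echo $a[1], "\n"; // over the road
var_dump($a === $b && $b === $c); // bool(true)
```

Index 0 holds the whole match including the tags, and index 1 holds the first capture group, so either version of the string is available.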

rpkamp · May 5, 2010, 10:41pm

Somehow I always manage to forget escaping HTML tags like that when regexing them.

Thanks for pointing it out, and glad you’ve got a better idea of how it works.

stereofrog · May 6, 2010, 10:16am

Actually, what you’re looking for is the “match all non-greedy” construct, dot-star-question mark:


$a = "foo and <strong>bar</strong> and <strong>baz</strong>!";

preg_match_all('~<strong>(.*?)</strong>~', $a, $m);
print_r($m[1]); // prints bar, baz

The ([^<]+) thing won’t work when another tag is nested inside the strong tags, as in strings like "<strong>foo <em>bar</em></strong>".
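
To make the difference concrete, here is a small sketch with a made-up string that nests an <em> tag inside the first strong pair. The last part shows one caveat of the dot that is easy to miss: by default it does not match a newline, so multi-line content needs the s modifier.

```php
<?php
// Hypothetical string with an <em> tag nested inside the first strong pair.
$a = "foo <strong>bar <em>baz</em></strong> and <strong>qux</strong>";

preg_match_all('~<strong>([^<]+)</strong>~', $a, $negated);
print_r($negated[1]); // only "qux": the first pair contains a <

preg_match_all('~<strong>(.*?)</strong>~', $a, $lazy);
print_r($lazy[1]);    // "bar <em>baz</em>" and "qux"

// By default the dot does not match a newline; the s modifier changes that.
$multi = "<strong>line one\nline two</strong>";
$noS   = preg_match('~<strong>(.*?)</strong>~', $multi);
$withS = preg_match('~<strong>(.*?)</strong>~s', $multi);
var_dump($noS, $withS); // int(0), int(1)
```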
