Could someone explain to me in very simple terms what is meant by greedy matching in regexps? Preferably via an example?
Cheers!





Hi.
Greedy means that a quantifier such as * or + matches as much text as it can, so the match runs from the first occurrence of the pattern's start to the last occurrence of its end.
Example: this is the string, and we want to capture the words inside the bold tags.
PHP Code:
$string = 'This is my <b>black</b> cat her <b>name</b> is Furball.';
The bold tags surround the result. Now if you use a "greedy" regexp like:
PHP Code:
preg_match('#<b>(.*)</b>#', $string, $match);
print_r($match);
Since this regexp is "greedy", it matches from the first "<b>" to the last "</b>" in the string, so the captured group is:
Code:
black</b> cat her <b>name
Not what we want. If we add "?" after the quantifier to make the regexp ungreedy, we get the desired result.
PHP Code:
preg_match('#<b>(.*?)</b>#', $string, $match);
print_r($match);
This results in an array whose captured group is "black". preg_match stops after the first match; to get both "black" and "name", call preg_match_all with the same pattern.
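For anyone who wants to run the two patterns side by side, here is a minimal sketch using the sample string from the post above. The variable names $greedy and $lazy are only for illustration, and preg_match_all is swapped in for the lazy case so that every bold word is collected, not just the first.

```php
<?php
// Sample string from the post above.
$string = 'This is my <b>black</b> cat her <b>name</b> is Furball.';

// Greedy: .* grabs as much as possible, then backtracks to the LAST </b>.
preg_match('#<b>(.*)</b>#', $string, $greedy);

// Lazy: .*? grabs as little as possible, stopping at the FIRST </b> it can.
// preg_match_all keeps scanning, so $lazy[1] holds every captured group.
preg_match_all('#<b>(.*?)</b>#', $string, $lazy);

echo $greedy[1], "\n"; // black</b> cat her <b>name
print_r($lazy[1]);     // array with "black" and "name"
```

The only difference between the two patterns is the "?" after the "*".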





Let's take this string for example:
PHP Code:
$string = 'Here is | an example code | of something big | and strange';
Then let's run this command:
PHP Code:
$string = preg_replace('#^(.+)\|#', 'replacement', $string);
When doing echo($string) it would output 'replacement and strange' because it replaces from the start of the string to the last occurrence of |.
This is called greedy.
However, if we ran this command on the original string instead:
PHP Code:
$string = preg_replace('#^(.+?)\|#', 'replacement', $string);
echo($string) would output 'replacement an example code | of something big | and strange'. See how it goes from the start of the string to the first occurrence of |. This is called ungreedy.
To switch from greedy to ungreedy, put ? after the quantifier or use the 'U' modifier.
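A small sketch of the two ways to switch, using the same string. The variable names are made up for illustration. One thing worth knowing about the 'U' modifier in PCRE: it inverts the default, so with 'U' set a plain quantifier is ungreedy and a quantifier followed by ? becomes greedy again.

```php
<?php
$string = 'Here is | an example code | of something big | and strange';

// Default behaviour: .+ is greedy and runs up to the last |.
$greedy = preg_replace('#^(.+)\|#', 'replacement', $string);

// Option 1: a ? after the quantifier makes just that quantifier ungreedy.
$lazy = preg_replace('#^(.+?)\|#', 'replacement', $string);

// Option 2: the U modifier makes every quantifier in the pattern ungreedy.
$viaU = preg_replace('#^(.+)\|#U', 'replacement', $string);

// With U set, adding ? flips the quantifier back to greedy.
$flipped = preg_replace('#^(.+?)\|#U', 'replacement', $string);

echo $greedy, "\n";  // replacement and strange
echo $lazy, "\n";    // replacement an example code | of something big | and strange
echo $viaU, "\n";    // same as $lazy
echo $flipped, "\n"; // same as $greedy
```

Mixing ? and 'U' in one pattern gets confusing quickly, so it is probably easiest to pick one of the two.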
Hope this helps
Edit:
Too slow!
- website





Originally Posted by website
Ah, you're not the only one. Happened to me quite often as well.
thanks! i *think* i get it lol!!!
i will have a look in more depth when i get home... for example, if i did a replace that swapped my custom formatting tags [ b ][ eb ] for html bold tags, and ran it on a string like
[ b ]here[ eb ] are some [ b ]words[ eb ], some of them are in [ b ]bold[ eb ] and some of them are [ b ]not[ eb ]
i would get something like
< b >here[ eb ] are some [ b ]words[ eb ], some of them are in [ b ]bold[ eb ] and some of them are [ b ]not< / b >
is that right?





Yes, if you use a greedy regexp, that will be the result.
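To make that concrete, here is a rough sketch of the tag replacement. It assumes the custom tags are really written [b] and [eb] without the spaces (the spaces in the post above look like they are only there to stop the forum from rendering them), and the patterns and variable names are just one hypothetical way to write it.

```php
<?php
// Assumed input: custom tags written as [b] ... [eb], without spaces.
$text = '[b]here[eb] are some [b]words[eb], some of them are in [b]bold[eb] and some of them are [b]not[eb]';

// Greedy: (.*) stretches from the first [b] to the last [eb],
// so only the outermost pair of tags is converted.
$greedy = preg_replace('#\[b\](.*)\[eb\]#', '<b>$1</b>', $text);

// Ungreedy: (.*?) stops at the nearest [eb], so each pair is converted.
$lazy = preg_replace('#\[b\](.*?)\[eb\]#', '<b>$1</b>', $text);

echo $greedy, "\n";
// <b>here[eb] are some [b]words[eb], some of them are in [b]bold[eb] and some of them are [b]not</b>
echo $lazy, "\n";
// <b>here</b> are some <b>words</b>, some of them are in <b>bold</b> and some of them are <b>not</b>
```

The square brackets are escaped because unescaped [ and ] would start a character class in the pattern.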