Regular Expression Hell

This seems to be the worst documented area of learning PHP imaginable. Are there any decent tutorials about? Most seem to talk about ereg() which is no longer relevant or are relevant but horrifically complex and not designed for people like me with no real understanding of the subject. There must be some kind of decent introduction somewhere?

In the meantime, anyone fancy being a star and writing one to match

100-299:1.25

The numbers can be of any length but must contain only digits. The other characters ( - : . ) have to be exact though.

Until now I’ve always managed to avoid regexes and fumble about with str_replace etc but I give in and I need to learn how to use these scary things! Any help appreciated.

So, like 1005678-333299:7890541.3456725 ?

The other characters ( - : . ) have to be exact though.

What does ‘exact’ mean? Just once each, or just in that order, or…?

I realized it isn’t an amazingly well documented area either, at least compared to other areas, so I learned it the broken way. This should work, but is probably flawed.


$matches = array();

$string = "100-299:1.25";

// matches: 3chardigit-3chardigit:1/2chardigit.2chardigit
preg_match("/\\d{3}-\\d{3}:\\d{1,2}.\\d{2}/", $string, $matches);

echo "<pre>";
print_r($matches);
echo "</pre>";


BTW, when you do learn, I recommend using http://gskinner.com/RegExr/ - Great platform for testing match and replaces!

To me regex is like a whole other language. One of the hardest things to master is to not only get what you want to get, but to also not get what you don’t want to get. E.g. try

<?php
$matches = array();

$string = "100-299:1X25";

// matches: 3chardigit-3chardigit:1/2chardigit.2chardigit
preg_match("/\\d{3}-\\d{3}:\\d{1,2}.\\d{2}/", $string, $matches);

echo "<pre>";
print_r($matches);
echo "</pre>"; 
?>

*simple fix, let me know if you get stuck
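For anyone who does get stuck, here is one possible fix as a rough sketch: escape the dot so it only matches a literal ".", and anchor the pattern with ^ and $ so nothing extra can sit around it. The helper name isValidRange() is made up for this example and is not from the posts above.

```php
<?php
// Hypothetical helper (name invented for this sketch): true only for the shape
// 3 digits, "-", 3 digits, ":", 1-2 digits, ".", 2 digits.
function isValidRange(string $string): bool
{
    // \. matches a literal dot only; ^ and $ reject matches buried in longer strings.
    return preg_match('/^\d{3}-\d{3}:\d{1,2}\.\d{2}$/', $string) === 1;
}

var_dump(isValidRange("100-299:1.25")); // bool(true)
var_dump(isValidRange("100-299:1X25")); // bool(false)
```

Note that $ still tolerates a single trailing newline; add the D modifier or use \z instead if that matters for your input.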

Just throwing a nice-n-simple alternative to regex into the mix. :slight_smile:


function parse($str){
  return sscanf($str, '%d-%d:%f');
}

print_r(
  parse('100-299:1.25')
);

/*
  Array
  (
      [0] => 100
      [1] => 299
      [2] => 1.25
  )
*/

print_r(
  parse('1-6000:3')
);
/*
  Array
  (
      [0] => 1
      [1] => 6000
      [2] => 3
  )
*/
print_r(
  parse('0-1:999.99')
);
/*
  Array
  (
      [0] => 0
      [1] => 1
      [2] => 999.99
  )
*/

If the numbers can be of arbitrary length it should be:


preg_match('~(\\d+)-(\\d+):(\\d+\\.\\d+)~', $var, $matches);

Or, if you want to capture the number in front of the dot separately from the number after the dot:


preg_match('~(\\d+)-(\\d+):(\\d+)\\.(\\d+)~', $var, $matches);
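To show what ends up in $matches, here is a small sketch built on the second pattern. It assumes you want to reject anything that is not exactly in this format, so it adds ^ and $ anchors that the patterns above do not have. parseRange() is a made-up name for illustration.

```php
<?php
// Hypothetical helper: returns the four numbers as strings,
// or null when the input does not fit the format.
function parseRange(string $var): ?array
{
    // Anchors are an addition for this sketch; drop them to search inside longer text.
    if (preg_match('~^(\d+)-(\d+):(\d+)\.(\d+)$~', $var, $matches) !== 1) {
        return null;
    }
    // $matches[0] holds the whole match; keep only the four captured groups.
    return array_slice($matches, 1);
}

print_r(parseRange('100-299:1.25'));
var_dump(parseRange('100-299:1X25')); // NULL
```

The captured values come back as strings, so cast them with (int) or (float) if you need real numbers.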

I like the tutorials from regular-expressions.info. They’re really thorough yet easy to read :slight_smile:

Off Topic:

Look mum, Anthony learned a new trick from Salathe :smiley:

I’m a fan of this for quickly testing stuff.

Off Topic:

Shurrup you.

Brilliant, thanks for the help chaps, that .info site looks pretty useful.

preg_match('~(\d+)-(\d+):(\d+\.\d+)~', $var, $matches);

works a charm.

haha, that’s the saddest regex in all the land…

Just out of curiosity I benchmarked 1,000,000 runs of the regex against 1,000,000 runs of sscanf with $var='100-299:1.25'; the regex takes 3.7137291431427 seconds, while sscanf takes only 1.7851250171661, so it needs only about 48% of the time the regex needs! I like sscanf! :slight_smile:
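The benchmark script itself wasn't posted, so the following is only a guess at how such a comparison could be set up, using microtime(true) around two plain loops. The variable names are invented, and the timings will differ with PHP version and hardware.

```php
<?php
// Micro-benchmark sketch, not the script used for the numbers above.
$var = '100-299:1.25';
$iterations = 1000000;

// Time the regex approach.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    preg_match('~(\d+)-(\d+):(\d+\.\d+)~', $var, $matches);
}
$regexSeconds = microtime(true) - $start;

// Time the sscanf approach on the same input.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $parts = sscanf($var, '%d-%d:%f');
}
$sscanfSeconds = microtime(true) - $start;

printf("preg_match: %.4f s\nsscanf:     %.4f s\n", $regexSeconds, $sscanfSeconds);
```

Keep in mind the two are not doing quite the same job: sscanf hands back typed numbers, while preg_match gives you strings and can be made much stricter about what it accepts.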

Off Topic:

I didn’t say that to bash you, you know that right? Just found it funny is all :slight_smile:

I’m intrigued as to why you both feel that the documentation is lacking. The section of the PHP manual on PCRE goes into a lot of depth, especially describing the syntax to use, and covers more than most folks will ever use in their day-to-day regexing (not a real word)… at least that was what I thought until reading this thread!

Off Topic:

I have been showing sscanf() to Anthony for years (e.g. here and at http://www.sitepoint.com/forums/showpost.php?p=4242754&postcount=4). It’s good to see it being adopted, where appropriate. (:

www.regular-expressions.info is one of the better resources when it comes to an introduction to regexps.
As Salathe already hinted, PHP itself has little to do with regexps and how to construct them. The topic is vast in its own right and is not tied to the manual pages for the PHP functions that use them.

This is a must have for regex,

I’m amazed at what it has done for me.
hth