  1. #1
    SitePoint Addict amy.damnit's Avatar
    Join Date
    Sep 2009
    Posts
    336
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Deciphering Regular Expression?!

    Say, I don't have any experience with Regular Expressions, and was hoping someone kind could help me decipher what the following Regular Expressions from one of my PHP books mean?!

    First:
    if (preg_match ('/^[A-Z \'.-]{2,20}$/i', $trimmed['first_name']))

    Second:

    if (preg_match ('/^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$/', $trimmed['email']))

    Third:
    if (preg_match ('/^\w{4,20}$/', $trimmed['password1']) )

    Wow, those look like Egyptian hieroglyphics?!


    Amy

  2. #2
    SitePoint Guru
    Join Date
    Jun 2006
    Posts
    638
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Code:
    if (preg_match ('/^[A-Z \'.-]{2,20}$/i', $trimmed['first_name']))
    String must contain 2 to 20 characters, each of which is a letter A to Z (upper or lower case, because of the i flag), a space, "'", "." or "-".

    Second:
    Code:
    if (preg_match ('/^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$/', $trimmed['email']))
    Crappy email format: must start with 1 or more WORD characters, dots or hyphens, then an @, then 1 or more WORD characters, dots or hyphens, then a ".", then 2 to 6 letters A to Z (upper or lower case).

    Third:
    Code:
    if (preg_match ('/^\w{4,20}$/', $trimmed['password1']) )
    String must only contain WORD characters, 4 to 20 of them.

    See: http://www.regular-expressions.info/reference.html
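
    To see those three descriptions in action, here is a minimal sketch that wraps each pattern from the book in a small function. The function names are made up for illustration and are not from the book.

```php
<?php
// Hypothetical helper names; only the three patterns come from the book.

function isValidFirstName(string $name): bool
{
    // 2 to 20 characters: letters (either case, via the i flag),
    // space, apostrophe, period or hyphen.
    return preg_match('/^[A-Z \'.-]{2,20}$/i', $name) === 1;
}

function isPlausibleEmail(string $email): bool
{
    // word chars/dots/hyphens, an @, more of the same, a dot, 2 to 6 letters.
    return preg_match('/^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$/', $email) === 1;
}

function isValidPassword(string $password): bool
{
    // 4 to 20 word characters and nothing else.
    return preg_match('/^\w{4,20}$/', $password) === 1;
}

var_dump(isValidFirstName("Mary-Ann O'Neil")); // bool(true)
var_dump(isValidFirstName('A'));               // bool(false): too short
var_dump(isPlausibleEmail('amy@example.com')); // bool(true)
var_dump(isValidPassword('pass word'));        // bool(false): space is not a word character
```

    Comparing the result with === 1 matters a little, because preg_match() returns 0 for no match and false on error.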

  3. #3
    play of mind Ernie1's Avatar
    Join Date
    Sep 2005
    Posts
    1,252
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    my mobile portal
    ghiris.ro

  4. #4
    SitePoint Addict amy.damnit's Avatar
    Join Date
    Sep 2009
    Posts
    336
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Vali View Post
    Code:
    if (preg_match ('/^[A-Z \'.-]{2,20}$/i', $trimmed['first_name']))
    String must contain 2 to 20 characters, each of which is a letter A to Z (upper or lower case, because of the i flag), a space, "'", "." or "-".

    Second:
    Code:
    if (preg_match ('/^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$/', $trimmed['email']))
    Crappy email format: must start with 1 or more WORD characters, dots or hyphens, then an @, then 1 or more WORD characters, dots or hyphens, then a ".", then 2 to 6 letters A to Z (upper or lower case).

    Third:
    Code:
    if (preg_match ('/^\w{4,20}$/', $trimmed['password1']) )
    String must only contain WORD characters, 4 to 20 of them.

    See: http://www.regular-expressions.info/reference.html
    Thanks for the quick deciphering!

    Some questions...

    1.) What is a "Word" character?

    2.) How would you re-write those Regular Expressions to make them better? (You didn't seem to like the e-mail one?!)

    3.) If I used what the book author used, would that work on any web-hosting package?

    Or does that require some special add-on or language?

    4.) What language were those written in?!

    Thanks,


    Amy

  5. #5
    dooby dooby doo silver trophybronze trophy
    spikeZ's Avatar
    Join Date
    Aug 2004
    Location
    Manchester UK
    Posts
    13,788
    Mentioned
    153 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by amy.damnit View Post
    Thanks for the quick deciphering!

    Some questions...



    2.) How would you re-write those Regular Expressions to make them better? (You didn't seem to like the e-mail one?!)

    3.) If I used what the book author used, would that work on any web-hosting package?

    Or does that require some special add-on or language?

    4.) What language were those written in?!

    Thanks,


    Amy
    1.) What is a "Word" character?
    literally a word - represented by the w

    2.) How would you re-write those Regular Expressions to make them better? (You didn't seem to like the e-mail one?!)
    Look at : http://www.regular-expressions.info/...ddy/email.html
    and study the explanation

    3.) If I used what the book author used, would that work on any web-hosting package?

    Pretty much any half decent server


    4.) What language were those written in?!

    see Ernies link
    Mike Swiffin - Community Team Advisor
    Only a woman can read between the lines of a one word answer.....

  6. #6
    @php.net Salathe's Avatar
    Join Date
    Dec 2004
    Location
    Edinburgh
    Posts
    1,396
    Mentioned
    55 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by spikeZ View Post
    1.) What is a "Word" character?
    literally a word - represented by the w
    A word character in PCRE (PCRE is the regular expression "flavour" that you're using with the preg_* functions) is, put simply, an alphanumeric character or underscore: the same as using the character class [a-zA-Z0-9_]

    Off Topic:


    Put less simply, the exact characters matched by the \w escape sequence may be different depending on the locale that PCRE is using (compiled with, or passed to the internal pcre_exec()). In most cases, this is not an issue and \w will match only those characters listed above but it cannot be 100% guaranteed to do so (this is an off-topic so perhaps it's best not to delve into any more detail!).
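
    A quick way to check this on your own server is to run a few single characters through both forms. This is only a sketch; the two function names are hypothetical and the list of probe characters is arbitrary.

```php
<?php
// Hypothetical helpers for illustration; neither name comes from the thread.

function matchesWordEscape(string $char): bool
{
    // A single character matched by the \w escape sequence.
    return preg_match('/^\w$/', $char) === 1;
}

function matchesExplicitClass(string $char): bool
{
    // The same test, written as an explicit character class.
    return preg_match('/^[a-zA-Z0-9_]$/', $char) === 1;
}

// Plain ASCII probes: the two checks are expected to agree on all of these.
foreach (['a', 'Z', '7', '_', '-', ' ', '.'] as $char) {
    printf(
        "%s  \\w: %s  explicit: %s\n",
        var_export($char, true),
        var_export(matchesWordEscape($char), true),
        var_export(matchesExplicitClass($char), true)
    );
}
```

    If the two columns ever disagree for a character you care about, that is the locale effect described above.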
    Salathe
    Software Developer and PHP Manual Author.

  7. #7
    SitePoint Addict amy.damnit's Avatar
    Join Date
    Sep 2009
    Posts
    336
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by spikeZ View Post
    1.) What is a "Word" character?
    literally a word - represented by the w

    2.) How would you re-write those Regular Expressions to make them better? (You didn't seem to like the e-mail one?!)
    Look at : http://www.regular-expressions.info/...ddy/email.html
    and study the explanation

    3.) If I used what the book author used, would that work on any web-hosting package?

    Pretty much any half decent server


    4.) What language were those written in?!

    see Ernies link
    Uh oh... someone is goading me to do a little more research?!

    (Hey, Mike, I only got 2 hours sleep because of a majorly disruptive neighbor last night?!)


    Amy

  8. #8
    SitePoint Addict amy.damnit's Avatar
    Join Date
    Sep 2009
    Posts
    336
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Salathe View Post
    A word character in PCRE (PCRE is the regular expression "flavour" that you're using with the preg_* functions) is, put simply, an alphanumeric character or underscore: the same as using the character class [a-zA-Z0-9_]
    Okay, that is a better explanation!!!

    Off Topic:


    Put less simply, the exact characters matched by the \w escape sequence may be different depending on the locale that PCRE is using (compiled with, or passed to the internal pcre_exec()). In most cases, this is not an issue and \w will match only those characters listed above but it cannot be 100% guaranteed to do so (this is an off-topic so perhaps it's best not to delve into any more detail!).
    Sounds interesting, but probably more than I need to worry about now.

    Thanks,


    Amy

  9. #9
    SitePoint Guru
    Join Date
    Jun 2006
    Posts
    638
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    The email regex I use:
    Code:
    /[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/i
    Matches 99.9% of emails (I think only 1 or 2 in 30 million were not matched).
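
    For anyone who wants to drop that pattern into preg_match(), here is a rough sketch. Two things in it are assumptions on top of the pattern as posted: the ^ and $ anchors, added so the whole string has to be an address rather than merely contain one, and the function name, which is made up. The slash inside the character classes is escaped because / is also the delimiter, and the apostrophe is escaped because the pattern sits in a single-quoted PHP string.

```php
<?php
// Hypothetical wrapper; the anchors (^ and $) are an addition, not part of the posted pattern.
function looksLikeEmail(string $email): bool
{
    $pattern = '/^[a-z0-9!#$%&\'*+\/=?^_`{|}~-]+'         // first chunk of the local part
             . '(?:\.[a-z0-9!#$%&\'*+\/=?^_`{|}~-]+)*'    // further chunks, separated by dots
             . '@'
             . '(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+'   // one or more domain labels, each followed by a dot
             . '[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/i';      // final label, case-insensitive overall

    return preg_match($pattern, $email) === 1;
}

var_dump(looksLikeEmail('amy@example.com'));                   // bool(true)
var_dump(looksLikeEmail('first.last+tag@mail.example.co.uk')); // bool(true)
var_dump(looksLikeEmail('amy@example'));                       // bool(false): no dot in the domain
```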

  10. #10
    SitePoint Enthusiast nrg_alpha's Avatar
    Join Date
    Dec 2008
    Posts
    81
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by amy.damnit View Post
    Okay, that is a better explanation!!!



    Sounds interesting, but probably more than I need to worry about now.

    Thanks,


    Amy
    Depending on your locale, it may (or may not) be an issue.

    I have found that this is an issue for me more often than not. In my locale settings, \w also involves accented characters for example, and in the same breath, \d matches exponents, etc... (not my definition of desirable).

    But while this is 'off-topic', there are solutions. I have responded in this thread that discusses such issues (sorry, it's not a Sitepoint thread).

    So you have the option of either changing your ctype settings, or simply resorting to explicit character classes. One way or the other, turning a blind eye to locale issues just may come back to haunt you with undesirable results should your locale settings not match/capture what you think / expect them to.
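
    Both options can be sketched in a few lines. The function names below are hypothetical, and whether your host lets you change the locale is an assumption you would need to check.

```php
<?php
// Option 1: pin the character-type locale so \w and \d are not influenced
// by locale-specific tables. 'C' is the standard minimal locale.
setlocale(LC_CTYPE, 'C');

// Option 2: spell the classes out, so the locale cannot change what they match.
// Both function names are made up for this example.
function isAsciiPassword(string $password): bool
{
    // Explicit replacement for \w{4,20}.
    return preg_match('/^[A-Za-z0-9_]{4,20}$/', $password) === 1;
}

function isAsciiDigits(string $value): bool
{
    // Explicit replacement for \d+.
    return preg_match('/^[0-9]+$/', $value) === 1;
}

var_dump(isAsciiPassword('secret_1'));       // bool(true)
var_dump(isAsciiPassword("p\xE4ssword"));    // bool(false): byte 0xE4 (a-umlaut in Latin-1) is rejected
var_dump(isAsciiDigits("2\xB2"));            // bool(false): byte 0xB2 (superscript two in Latin-1) is rejected
```

    The explicit classes are more typing, but they behave the same no matter how the server is configured.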

    EDIT - You can learn more about regex with the following links:
    http://www.regular-expressions.info/
    http://weblogtoolscollection.com/regex/regex.php
    http://www.phpfreaks.com/tutorial/re...--basic-syntax

    Surely, there are more resources on the net via Google..

    And if you really want to get on the ball, I recommend this book:
    http://www.amazon.com/Mastering-Regu...9475391&sr=8-1

  11. #11
    SitePoint Enthusiast nrg_alpha's Avatar
    Join Date
    Dec 2008
    Posts
    81
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Vali View Post
    The email regex I use:
    Code:
    /[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/i
    Matches 99.9% of emails (I think only 1 or 2 in 30 million were not matched).
    As far as matching emails is concerned, I would consider having a look at:
    http://www.iamcal.com/publish/articl...parsing_email/

    It offers a (perhaps intimidating yet) comprehensive script for this sort of thing.

  12. #12
    SitePoint Addict amy.damnit's Avatar
    Join Date
    Sep 2009
    Posts
    336
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by nrg_alpha View Post
    As far as matching emails is concerned, I would consider having a look at:
    http://www.iamcal.com/publish/articl...parsing_email/

    It offers a (perhaps intimidating yet) comprehensive script for this sort of thing.
    Thanks for all of the links and thoughts.

    I appreciate your concerns about how certain things could come back to bite me, but also understand that if I don't get this site done by the end of the week, it really won't matter?! (I'm nearly out of time and $$$!)

    That is why I posted here, to get a general sense/check of a moderately good Regular Expression.

    This is such a big topic, I'm really going to bump it down my priority list.

    First I need a site up and working. After that I can go back and tweak things like Regular Expression and improve things time permitting.

    Thanks,


    Amy

  13. #13
    PHP Developer W1LL's Avatar
    Join Date
    Apr 2001
    Location
    Leicester, UK
    Posts
    459
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I tend to use PHP's filter_var() function for validating emails:

    PHP Code:
    filter_var('bob@example.com', FILTER_VALIDATE_EMAIL);
    I used to use regular expressions, but they just seemed very untidy.
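
    In case the return value is not obvious: filter_var() hands back the validated string on success and false on failure, so a strict comparison against false is the safest check. A minimal sketch, with a made-up wrapper name:

```php
<?php
// Hypothetical wrapper around filter_var(); the name is not from the thread.
function isEmailAccordingToPhp(string $email): bool
{
    // FILTER_VALIDATE_EMAIL returns the input string when it validates, false otherwise.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(filter_var('bob@example.com', FILTER_VALIDATE_EMAIL)); // string(15) "bob@example.com"
var_dump(isEmailAccordingToPhp('bob@example.com'));             // bool(true)
var_dump(isEmailAccordingToPhp('not-an-email'));                // bool(false)
```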

  14. #14
    SitePoint Wizard lorenw's Avatar
    Join Date
    Feb 2005
    Location
    was rainy Oregon now sunny Florida
    Posts
    1,094
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Try this
    http://www.addedbytes.com/cheat-sheets

    3rd one down is regex and you may find other cheat sheets helpful.
    What I lack in acuracy I make up for in misteaks

  15. #15
    @php.net Salathe's Avatar
    Join Date
    Dec 2004
    Location
    Edinburgh
    Posts
    1,396
    Mentioned
    55 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by W1LL View Post
    I tend to use PHP's filter_var() function for validating emails:

    PHP Code:
    filter_var('bob@example.com', FILTER_VALIDATE_EMAIL);
    I used to use regular expressions, but they just seemed very untidy.
    In this case, the regex is just swept under the rug (it does look much tidier though!). All that particular filter does behind the scenes is try to match the supplied email against a regular expression. Admittedly, that regular expression has been tested and used by many more people than perhaps one you'll construct on your own. It is also probably best to be aware of precisely what FILTER_VALIDATE_EMAIL will consider valid, or not (e.g. +@0 will validate).
    Salathe
    Software Developer and PHP Manual Author.

