  1. #1
    SitePoint Wizard silver trophy someonewhois's Avatar
    Join Date
    Jan 2002
    Location
    Canada
    Posts
    6,364
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Preg_match to confirm letters/numbers/underscores only?

    Hey, I've had people register with ^, *, and all these other weird characters.

    I want to confirm on registration that it's just letters, numbers, underscores, or dashes.

    How can I do this?

    Regards,
    Someonewhois

  2. #2
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Try this:

    PHP Code:
    if (ereg("^[a-zA-Z0-9_\-]+$", $your_variable_here)) {
        # allowable characters only
    } else {
        # some bad characters found
    }

    This returns true/false for valid characters, though it won't replace any characters. I think the regular expression is okay, as I've used it before, apart from the underscore and dash characters.

    You may need to escape the underscore as well, using \_ for example; I've not tested this.

  3. #3
    SitePoint Zealot
    Join Date
    Feb 2003
    Posts
    156
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I think there is no longer any reason to use ereg, use preg instead.

    In brackets [] every character is literal IIRC, so you should not try to escape the -, instead make sure it is the last character in the class (as it already is in your example). By adding the \ you are also allowing the backslash as a valid character.
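A minimal sketch of the preg version of that check, assuming a hypothetical helper name `is_valid_username` (not from the thread). The hyphen sits last in the class, so it is literal and needs no backslash:

```php
<?php
// Hypothetical helper: true only if $name consists entirely of
// ASCII letters, digits, underscores, or hyphens.
function is_valid_username(string $name): bool
{
    // ^ and $ anchor the match to the whole string; the D modifier stops
    // $ from also matching just before a trailing newline.
    return preg_match('/^[A-Za-z0-9_-]+$/D', $name) === 1;
}

var_dump(is_valid_username('some_one-1'));   // bool(true)
var_dump(is_valid_username('bad^name*'));    // bool(false)
var_dump(is_valid_username('back\\slash'));  // bool(false)
var_dump(is_valid_username("name\n"));       // bool(false)
```

Without the D modifier, the last call would return true, because $ is allowed to match before a final newline.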

  4. #4
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Wasn't sure as I don't allow the - character through my FORMS as this can be used to hack mySQL 8)

  5. #5
    SitePoint Wizard samsm's Avatar
    Join Date
    Nov 2001
    Location
    Atlanta, GA, USA
    Posts
    5,011
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by R. U. Serious
    I think there is no longer any reason to use ereg, use preg instead.

    In brackets [] every character is literal IIRC, so you should not try to escape the -, instead make sure it is the last character in the class (as it already is in your example). By adding the \ you are also allowing the backslash as a valid character.
    Like he said!

    POSIX -> PCRE (Perl) Regex

    I think this would serve: /[\w-]+/
    (\w includes an underscore)

    While you are at it, you could also use {6,12} style syntax instead of + to control the maximum length of the string.
    Using your unpaid time to add free content to SitePoint Pty Ltd's portfolio?

  6. #6
    Sidewalking anode's Avatar
    Join Date
    Mar 2001
    Location
    Philadelphia, US
    Posts
    2,205
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by samsm
    I think this would serve: /[\w_-]+/
    Small nit: "\w" includes underscores, IIRC.
    TuitionFree a free library for the self-taught
    Anode Says... Blogging For Your Pleasure

  7. #7
    ********* Member website's Avatar
    Join Date
    Oct 2002
    Location
    Iceland
    Posts
    1,238
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    wouldn't this work?
    PHP Code:
    preg_match('#^[A-Za-z0-9_-]{3,20}$#s', $string);
    And then for the password
    PHP Code:
    // the # inside the class is escaped because # is also the delimiter
    preg_match('#^[A-Za-z0-9?+*_!\#$%&-]{6,20}$#s', $string);
    ?
    - website

  8. #8
    ********* Member website's Avatar
    Join Date
    Oct 2002
    Location
    Iceland
    Posts
    1,238
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    but would \w include any 'special' letters like the Icelandic ones we have here?
    - website

  9. #9
    SitePoint Wizard samsm's Avatar
    Join Date
    Nov 2001
    Location
    Atlanta, GA, USA
    Posts
    5,011
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by anode
    Small nit: "\w" includes underscores, IIRC.
    Heh... I think I got my edit in before you posted... that occurred to me just after submit.

    PHP Code:
    preg_match('#^[A-Za-z0-9_-]{3,20}$#s', $string);
    Yes, that would work! However, the s modifier is redundant since there are no periods in the regex. Also, \w happens to be a perfect replacement for 2/3 of that.

    \w is [0-9A-Za-z_] ... no funky characters :-)

    Besides the underscore that was on my post for about 10 seconds, I did make another mistake ... failed to put the ^ and $ in so that it only matched if it was the entire string.
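To illustrate why the ^ and $ matter, here is a small sketch comparing the unanchored pattern with an anchored one that also limits the length. The function names and the 3 to 20 range are assumptions for illustration:

```php
<?php
// Unanchored: succeeds as soon as ANY run of allowed characters is found.
function contains_word_chars(string $s): bool
{
    return preg_match('/[\w-]+/', $s) === 1;
}

// Anchored with a length range: the WHOLE string must be 3 to 20
// allowed characters.
function is_word_only(string $s): bool
{
    return preg_match('/^[\w-]{3,20}$/D', $s) === 1;
}

var_dump(contains_word_chars('ok^*!'));   // bool(true), "ok" is enough
var_dump(is_word_only('ok^*!'));          // bool(false)
var_dump(is_word_only('ab'));             // bool(false), too short
var_dump(is_word_only('some_one-1'));     // bool(true)
```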

  10. #10
    ********* Member website's Avatar
    Join Date
    Oct 2002
    Location
    Iceland
    Posts
    1,238
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    ok, preg code version 2
    username:
    PHP Code:
    preg_match('#^[\w-]{3,20}$#', $string);
    password:
    PHP Code:
    // the # inside the class is escaped because # is also the delimiter
    preg_match('#^[\w?+*!\#$%&-]{6,20}$#', $string);
    wouldn't that just be an ideal preg_match for those?
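One thing to watch with a password class that contains #: when the same character is used as the pattern delimiter, it has to be escaped inside the pattern, or a different delimiter chosen. A sketch using ~ as the delimiter, with a hypothetical helper name and the same allowed punctuation as above:

```php
<?php
// Hypothetical helper: 6 to 20 characters drawn from \w plus ? + * ! # $ % & -
function is_valid_password(string $password): bool
{
    // ~ is the delimiter here, so # can appear unescaped in the class.
    // The hyphen is last in the class, so it is literal.
    return preg_match('~^[\w?+*!#$%&-]{6,20}$~D', $password) === 1;
}

var_dump(is_valid_password('abc#12!'));      // bool(true)
var_dump(is_valid_password('short'));        // bool(false), under 6 characters
var_dump(is_valid_password('has space 1'));  // bool(false), space not allowed
```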

    but about the \w
    Quote Originally Posted by The Manual
    \w
    any "word" character
    But are 0-9 and _ then thought of as 'word' characters? And then again, letters like ours are not thought of as part of a word; that is, ehm, a little bit weird I think.

    is \w maybe just an 'alias' for a-zA-Z0-9_ ? as you said ?
    - website

  11. #11
    SitePoint Wizard silver trophy redemption's Avatar
    Join Date
    Sep 2001
    Location
    Singapore
    Posts
    5,269
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Actually, such characters and 0-9 are matched by \w.

  12. #12
    ********* Member website's Avatar
    Join Date
    Oct 2002
    Location
    Iceland
    Posts
    1,238
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    hmm, so that is not very 'internet friendly' to use \w, probably best to limit it to [a-zA-Z0-9_-] I think...
    - website

  13. #13
    SitePoint Wizard silver trophy redemption's Avatar
    Join Date
    Sep 2001
    Location
    Singapore
    Posts
    5,269
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by website
    hmm, so that is not very 'internet friendly' to use \w,
    Why is it not 'Internet friendly'? I thought it would be friendlier to accept such characters, considering the international nature of the Internet. Or am I misunderstanding you?

  14. #14
    SitePoint Wizard samsm's Avatar
    Join Date
    Nov 2001
    Location
    Atlanta, GA, USA
    Posts
    5,011
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by redemption
    Actually, such characters and 0-9 are matched by \w.
    Granted, for 0-9, but as for the "special characters" I just tested and no match. Perhaps I've done something wrong, so here's the test:
    PHP Code:
    $source = ''; // special (non-ASCII) letters under test
    if (preg_match('/\w+/', $source)) {
        echo 'match!';
    }

    No match for me.

  15. #15
    SitePoint Wizard silver trophy redemption's Avatar
    Join Date
    Sep 2001
    Location
    Singapore
    Posts
    5,269
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Funny, because I'd tested this out to confirm too, with this code snippet:
    PHP Code:
    $string = '232';

    if (preg_match('/^\w+$/', $string)) {
        echo 'Match!';
    }

    and it matches. Maybe it's a locale issue.

  16. #16
    SitePoint Wizard silver trophy redemption's Avatar
    Join Date
    Sep 2001
    Location
    Singapore
    Posts
    5,269
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by PHP function reference user contributed notes
    If you want to match all Scandinavian characters (æÆøØåÅöÖäÄ) in addition to those matched by \w, you might want to use this regexp:

    /^[\w\xe6\xc6\xf8\xd8\xe5\xc5\xf6\xd6\xe4\xc4]+$/

    Remember that \w respects the current locale used in PCRE's character tables.
    Yes it is a locale issue.

  17. #17
    ********* Member website's Avatar
    Join Date
    Oct 2002
    Location
    Iceland
    Posts
    1,238
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by redemption
    Why is it not 'Internet friendly'? I thought it would be friendlier to accept such characters, considering the international nature of the Internet. Or am I misunderstanding you?
    Well, for example, you can't have usernames here (I think) with Icelandic letters, nor in emails, and it's bad to have them in file/directory names and many other things. Doesn't it just invite trouble?
    - website

  18. #18
    SitePoint Wizard samsm's Avatar
    Join Date
    Nov 2001
    Location
    Atlanta, GA, USA
    Posts
    5,011
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by redemption
    Yes it is a locale issue.
    I was wondering, thanks for looking it up. That's a good note to keep in mind for scripts that may travel.

    Of course, if you are installing a script on a server with a locale that includes those characters in \w, there is a good chance that you would want to accept those characters.

  19. #19
    ********* Member website's Avatar
    Join Date
    Oct 2002
    Location
    Iceland
    Posts
    1,238
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by redemption
    Yes it is a locale issue.
    And how is it configured?
    - website
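A rough sketch of how the locale comes into play, assuming the script sets it itself with setlocale(). The locale names below are assumptions and may not exist on a given server, in which case setlocale() returns false and nothing changes. The helper names are hypothetical too:

```php
<?php
// \w without the u modifier follows the active LC_CTYPE locale.
function is_word_in_current_locale(string $s): bool
{
    return preg_match('/^\w+$/D', $s) === 1;
}

// Locale-independent alternative: spell the ASCII class out.
function is_ascii_word(string $s): bool
{
    return preg_match('/^[A-Za-z0-9_]+$/D', $s) === 1;
}

// "\xE6" and "\xF0" are single ISO-8859-1 bytes for two non-ASCII letters.
$latin1 = "\xE6\xF0i";

// Assumed locale names; the first one the system knows is used.
$active = setlocale(LC_CTYPE, 'is_IS.ISO-8859-1', 'is_IS');

var_dump($active);                             // locale name, or false
var_dump(is_word_in_current_locale($latin1));  // depends on the locale
var_dump(is_ascii_word($latin1));              // bool(false) under any locale
var_dump(is_ascii_word('abc_123'));            // bool(true)
```

For a script that may travel between servers, the explicit class is the safer choice, since its result does not change with the server's locale.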

