  1. #1
    SitePoint Guru
    Join Date
    Nov 2003
    Location
    Huntsville AL
    Posts
    689
    Mentioned
    4 Post(s)
    Tagged
    0 Thread(s)

    Reg Exp Help - All Upper/Lower case names

    So I have a little registration system and most people take the time to case their names in a "normal" fashion i.e. "Bill Smith". But some like to shout "BILL SMITH" and some are a bit shy "bill smith".

    Looking for an expression that will match if all letters are upper case and another expression for all lower case. Needs to ignore other characters.

    I could then make a reasonable attempt at adjusting the names.

    Thanks in advance.

  2. #2
    SitePoint Wizard Stormrider
    Join Date
    Sep 2006
    Location
    Nottingham, UK
    Posts
    3,133
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    /^([A-Z]+\s[A-Z]+|[a-z]+\s[a-z]+)$/

    Something like that?
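
    A minimal sketch of how the pattern above might be used with preg_match(). The constant and helper names are made up for illustration; only the pattern itself comes from the post.

```php
<?php
// Pattern from the post: exactly two words separated by one whitespace
// character, both all upper case or both all lower case (ASCII letters only).
const SAME_CASE_PAIR = '/^([A-Z]+\s[A-Z]+|[a-z]+\s[a-z]+)$/';

// Hypothetical helper name, not from the thread.
function isSameCasePair(string $name): bool
{
    return preg_match(SAME_CASE_PAIR, $name) === 1;
}

foreach (['BILL SMITH', 'bill smith', 'Bill Smith'] as $name) {
    printf("%-12s %s\n", $name, isSameCasePair($name) ? 'flagged' : 'ok');
}
```

    The first two names are flagged and the normally cased one is not.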

  3. #3
    Non-Member
    Join Date
    Apr 2011
    Location
    no fixed address
    Posts
    851
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by ahundiak View Post
    I could then make a reasonable attempt at adjusting the names.
    You could convert names to lower case and then use ucwords() to capitalise the first letter.

  4. #4
    SitePoint Guru
    Quote Originally Posted by webdev1958 View Post
    You could convert names to lower case and then use ucwords() to capitalise the first letter.
    I could but then McReynolds gets messed up.
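
    The problem can be reproduced with a short sketch of the lower-case-then-ucwords() suggestion. The function name is an assumption for illustration.

```php
<?php
// Lower-case everything, then capitalise the first letter of each word.
function naiveTitleCase(string $name): string
{
    return ucwords(strtolower($name));
}

echo naiveTitleCase('BILL SMITH'), "\n";     // Bill Smith
echo naiveTitleCase('Ann McReynolds'), "\n"; // Ann Mcreynolds
```

    The inner capital in "McReynolds" is lost because ucwords() only touches the first character of each word.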

  5. #5
    Keeper of the SFL StarLion
    Join Date
    Feb 2006
    Location
    Atlanta, GA, USA
    Posts
    3,748
    Mentioned
    71 Post(s)
    Tagged
    0 Thread(s)
    Or just... if ($str == strtoupper($str) || $str == strtolower($str)), and don't make it so hard on yourself?
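
    Wrapped in a function, the comparison might look like the sketch below. The function name is hypothetical, and strict comparison (===) is swapped in for == so that two strings are never compared numerically.

```php
<?php
// True when the string contains no lower-case letters or no upper-case letters.
// Non-letter characters are unaffected by the case functions, so they are ignored.
function isSingleCase(string $str): bool
{
    return $str === strtoupper($str) || $str === strtolower($str);
}

var_dump(isSingleCase('BILL SMITH'));   // bool(true)
var_dump(isSingleCase("bill o'neil"));  // bool(true)
var_dump(isSingleCase('Bill Smith'));   // bool(false)
```

    Note that a string with no letters at all, including the empty string, also returns true here.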
    Never grow up. The instant you do, you lose all ability to imagine great things, for fear of reality crashing in.

  6. #6
    SitePoint Guru
    Quote Originally Posted by Stormrider View Post
    /^([A-Z]+\s[A-Z]+|[a-z]+\s[a-z]+)$/

    Something like that?
    That seems to work. Thanks.

  7. #7
    SitePoint Guru
    Quote Originally Posted by StarLion View Post
    or just... if($str == strtoupper($str) || $str == strtolower($str))
    Yep but I get points for using regular expressions. Not really but my knowledge of how to create expressions is abysmal. Slowly building a working library.

  8. #8
    Keeper of the SFL StarLion
    Tell your teacher to stop rewarding making things more difficult. :P

  9. #9
    SitePoint Guru
    Quote Originally Posted by StarLion View Post
    Tell your teacher to stop rewarding making things more difficult. :P
    Now that I have my expression perhaps we could morph this thread into a discussion of expressions vs functions?

    I like to use expressions when appropriate because:
    1. Compact code - Not always a good thing but as long as the expression is documented with some tests then it's okay.
    2. Easy to reuse - Just a string. Can always use a define() to share it, as opposed to making a custom function which then needs to be managed.
    3. Speed - Pretty low on my list but I do tend to process thousands of names in one request.
    4. Professional - looks like I know what I am doing.
    5. Expressions are more or less standard and can be used in multiple programming languages.
    6. Expressions are also quite common so learning to read them is perhaps a good thing.
    7. Expressions are useful for validation and can avoid the need for custom validation functions.
    8. Teachers might give extra credit?

  10. #10
    Keeper of the SFL StarLion
    Go go SPF for eating my post and not posting it. Anyway, as usual this is the time for me to be contrarian, so:

    #1: Regex's get very complicated very quickly. What you're doing here is an 0.5 on the difficulty scale of 10.
    #2: Predefined functions = no management either? (strtoupper and strtolower are PHP Core Functions.)
    #3: Fairly certain regex is actually slower on a 1-to-1 vs strtoupper, but even then the differences will be so microscopic....
    #4: You came here to ask, so... no you don't? And honestly, using a regex in this capacity to me just makes it look like you don't know the language.
    #5: When was the last time you coded a single webpage in multiple languages? (EDIT: That WASN'T a class assignment)
    #6: Learning is always good. Learning the right place to use the knowledge, better.
    #7: Try regex'ing an email address to the RFC2822 standard real quick.
    #8: Not a reason.

  11. #11
    SitePoint Guru
    Quote Originally Posted by StarLion View Post
    Go go SPF for eating my post and not posting it. Anyway, as usual this is the time for me to be contrarian, so:
    #1: Regex's get very complicated very quickly. What you're doing here is an 0.5 on the difficulty scale of 10.
    Not mine! I know what you mean but .5 is pretty much the limit for me.
    #2: Predefined functions = no management either? (strtoupper and strtolower are PHP Core Functions.)
    Pretty sure if($str == strtoupper($str) || $str == strtolower($str)) is not a core function.
    #3: Fairly certain regex is actually slower on a 1-to-1 vs strtoupper, but even then the differences will be so microscopic....
    And if the goal was to upper case a string then sure. But it's not so not sure of the relevance.
    #4: You came here to ask, so... no you don't? And honestly, using a regex in this capacity to me just makes it look like you don't know the language.
    I asked about an expression and was happy with the answer. I personally think preg_match($exp,$name) is better than two case shifts and two comparison operators but to each their own.
    #5: When was the last time you coded a single webpage in multiple languages? (EDIT: That WASN'T a class assignment)
    Can't really remember the last time I didn't use multiple languages for each webpage. PHP/SQL/JavaScript/Annotations. All support regular expressions in a more or less consistent fashion. Not sure of the relevance though. I use a number of languages during a typical work week.
    #6: Learning is always good. Learning the right place to use the knowledge, better.
    Yep. That is why debating can be useful.
    #7: Try regex'ing an email address to the RFC2822 standard real quick.
    Why?
    /**
    * @Assert\NotBlank()
    * @Assert\Email()
    */
    public function getEmail() { return $this->person->getEmail(); }
    Works fine for me.
    #8: Not a reason.
    It is if your instructor thinks it is.

  12. #12
    Keeper of the SFL StarLion
    Quote Originally Posted by ahundiak View Post
    Not mine! I know what you mean but .5 is pretty much the limit for me.

    Pretty sure if($str == strtoupper($str) || $str == strtolower($str)) is not a core function.
    It's composed entirely of core elements though, none of which require maintaining by a user. IF, ==, and strtoupper/strtolower. If they ever change one of those elements, I'd probably stop using PHP :P

    And if the goal was to upper case a string then sure. But it's not so not sure of the relevance.
    In fact you're right. Executing all of those horribly long and complicated statements saved you... 0.00145 seconds (rounded) of execution time across 5000 records, according to my tests. (PS: If you're regexing more than that at a single time, you're probably doing something wrong.)
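
    A rough sketch of how such a comparison could be timed. The sample data is made up, the function names are hypothetical, and the timings will vary by machine and PHP version, so none are quoted here.

```php
<?php
// Made-up sample data: 5000 names cycling through three casings.
$samples = ['BILL SMITH', 'bill smith', 'Bill Smith'];
$names = [];
for ($i = 0; $i < 5000; $i++) {
    $names[] = $samples[$i % 3];
}

// Count names flagged by the regular expression from post #2.
function countWithRegex(array $names): int
{
    $count = 0;
    foreach ($names as $name) {
        if (preg_match('/^([A-Z]+\s[A-Z]+|[a-z]+\s[a-z]+)$/', $name) === 1) {
            $count++;
        }
    }
    return $count;
}

// Count names flagged by the case-conversion comparison from post #5.
function countWithCaseFunctions(array $names): int
{
    $count = 0;
    foreach ($names as $name) {
        if ($name === strtoupper($name) || $name === strtolower($name)) {
            $count++;
        }
    }
    return $count;
}

$start = microtime(true);
$viaRegex = countWithRegex($names);
$regexSeconds = microtime(true) - $start;

$start = microtime(true);
$viaFunctions = countWithCaseFunctions($names);
$functionSeconds = microtime(true) - $start;

printf("regex:     %d flagged in %.5f s\n", $viaRegex, $regexSeconds);
printf("functions: %d flagged in %.5f s\n", $viaFunctions, $functionSeconds);
```

    On this particular sample both approaches flag the same names; they diverge on inputs such as single-word or hyphenated names.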

    I asked about an expression and was happy with the answer. I personally think preg_match($exp,$name) is better than two case shifts and two comparison operators but to each their own.
    I think that my statement is simpler to read than a -lot- of regex out there. Again, when you get beyond the 0.5 difficulty, this will become more clear to you. Just a question of application.

    Can't really remember the last time I didn't use multiple languages for each webpage. PHP/SQL/JavaScript/Annotations. All support regular expressions in a more or less consistent fashion. Not sure of the relevance though. I use a number of languages during a typical work week.
    At most, you should be validating twice - once at the JavaScript level (which is purely for show, and is subject to bypass, so should never be trusted), and once at the PHP level (which is actually trustworthy, if you do the validation right). So... no, I don't agree with your assessment.
    Why?
    /**
    * @Assert\NotBlank()
    * @Assert\Email()
    */
    public function getEmail() { return $this->person->getEmail(); }
    Works fine for me.
    Which... is a custom function. So... I don't get your point here, except that you're proving my statement true.

    (The general form of a 2822 address is (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]) ). Try reading that lol.
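
    For comparison, a sketch of the function-based route using PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL. The wrapper name is hypothetical, and this is not claimed to cover everything RFC 2822 allows; it only illustrates leaning on a core function instead of a hand-written pattern.

```php
<?php
// Thin wrapper around the built-in filter; returns a plain boolean.
function looksLikeEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(looksLikeEmail('bill.smith@example.com')); // bool(true)
var_dump(looksLikeEmail('not an address'));         // bool(false)
```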

  13. #13
    SitePoint Wizard TheOriginalH
    Join Date
    Aug 2000
    Location
    Thailand
    Posts
    4,810
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Lots of sad faces in a 2822
    ~The Artist Latterly Known as Crazy Hamster~
    922ee590a26bd62eb9b33cf2877a00df
    Currently delving into Django, GIT & CentOS

  14. #14
    Keeper of the SFL StarLion
    Quote Originally Posted by TheOriginalH View Post
    Lots of sad faces in a 2822
    It reflects all the points at which you lose pieces of your soul trying to regex something like that.

  15. #15
    SitePoint Guru
    Quote Originally Posted by StarLion View Post
    It's composed entirely of core elements though, none of which require maintaining by a user. IF, ==, and strtoupper/strtolower. If they ever change one of those elements, I'd probably stop using PHP :P
    I'm probably doing things wrong but when I have a bit of functionality that needs to be reused I tend to make a function even if all the functions inside are core elements.

    In fact you're right. Executing all of those horribly long and complicated statements saved you... 0.00145 seconds (rounded) of execution time across 5000 records, according to my tests. (PS: If you're regexing more than that at a single time, you're probably doing something wrong.)
    You seem fixated on execution time. You really ran a benchmark?
    I am surprised that you consider "preg_match('/^([A-Z]+\s[A-Z]+|[a-z]+\s[a-z]+)$/', $name)" to be long and complicated. Maybe I'm just smarter than most developers but to me it seems pretty basic.

    I think that my statement is simpler to read than a -lot- of regex out there. Again, when you get beyond the 0.5 difficulty, this will become more clear to you. Just a question of application.
    You seem to be saying that "if($str == strtoupper($str) || $str == strtolower($str))" can be used in place of all complicated expressions? I'm not so sure about that. I would think that more complicated expressions would probably require more complicated functions. In which case I don't really understand your point. Are you saying that if I use simple expressions then I will also be forced to use more complicated ones? Is this something the interpreter enforces?

    At most, you should be validating twice - once at the JavaScript level (which is purely for show, and is subject to bypass, so should never be trusted), and once at the PHP level (which is actually trustworthy, if you do the validation right). So... no, I don't agree with your assessment.
    I think I see part of my problem. I use expressions for things other than validation. Is that wrong? Expressions shall only be used for validation? If so then I fear I need to redo quite a bit of my sql code.

    Which... is a custom function. So... I don't get your point here, except that you're proving my statement true.
    For me at least it makes sense to use simple expressions for simple tasks. Truly validating an email is not a simple task. Therefore, I don't use expressions for email. Which is why your demand that I create one is very puzzling. I'm sure that my methodology will change once I gather sufficient experience but for the moment anyways, I try to use the best tool for the job at hand.

  16. #16
    Non-Member
    Quote Originally Posted by ahundiak View Post
    I could but then McReynolds gets messed up.
    ucwords capitalises only the first letter. Obviously you would have to add more code for "special" cases.

    What is normally done is to store names in a database all in either upper or lower case (I normally store all in lower case) and then, after extracting names from the database, format them however you like in the application (not the database) for output to whatever. It's fairly straightforward and not rocket science.

  17. #17
    SitePoint Guru
    Interesting. I have never come across a modern application that always converts names to upper or lower. For my stuff at least, the all upper or all lower names only happen in a tiny number of cases. Probably people with tablets or smart phones that don't like the shift key. And it's just an annoyance more than anything. There are a lot more names like McReynolds than there are upper/lower only names. A lot of special code would be needed.

  18. #18
    Non-Member
    How you store names is up to you.

    You can:

    1) store names as entered by the user

    2) store them in a consistent format like all upper or all lower case.

    Personally, I store names in all lower case and then reformat them to what I need after extracting them from the database and before outputting to wherever.

    If you let users enter names however they like and not reformat it in any way anywhere at all then you are likely to get at least a small number of outputs looking like BilL jONeS.

    I have my own customised php class for reformatting names.
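
    The class itself is not shown in the thread. Purely as an illustration of the store-lower-case, format-on-output idea, a hypothetical formatter with an exceptions table might look like this (class name, method name and exception list are all assumptions):

```php
<?php
// Hypothetical sketch, not the poster's actual class.
final class NameFormatter
{
    /** @var array<string,string> lower-case form => preferred display form */
    private array $exceptions;

    public function __construct(array $exceptions = [])
    {
        // Normalise keys so lookups are case-insensitive.
        $this->exceptions = array_change_key_case($exceptions, CASE_LOWER);
    }

    public function format(string $stored): string
    {
        $words = preg_split('/\s+/', trim($stored), -1, PREG_SPLIT_NO_EMPTY);
        $out = [];
        foreach ($words as $word) {
            $key = strtolower($word);
            // Use the listed spelling if there is one, else capitalise the first letter.
            $out[] = $this->exceptions[$key] ?? ucfirst($key);
        }
        return implode(' ', $out);
    }
}

$formatter = new NameFormatter(['mcreynolds' => 'McReynolds']);
echo $formatter->format('ann mcreynolds'), "\n"; // Ann McReynolds
echo $formatter->format('BilL jONeS'), "\n";     // Bill Jones
```

    The exceptions table is where the "special cases" would have to be maintained by hand.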

  19. #19
    @php.net Salathe
    Join Date
    Dec 2004
    Location
    Edinburgh
    Posts
    1,397
    Mentioned
    63 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by ahundiak View Post
    That seems to work. Thanks.
    How about names like PRINCE (one name), BILLY JOE JNR (more than two names), PETER O'PETERSON (non-A-Z)?
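
    A quick way to check these three names against the pattern from post #2 (the variable names are only for illustration):

```php
<?php
$pattern = '/^([A-Z]+\s[A-Z]+|[a-z]+\s[a-z]+)$/';

$results = [];
foreach (['PRINCE', 'BILLY JOE JNR', "PETER O'PETERSON"] as $name) {
    // preg_match() returns 1 on a match and 0 on no match.
    $results[$name] = preg_match($pattern, $name) === 1;
}

var_dump($results); // every value is false
```

    None of them match: the pattern requires exactly two words made up only of the letters A-Z or a-z.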
    Salathe
    Software Developer and PHP Manual Author.

  20. #20
    SitePoint Guru
    Quote Originally Posted by Salathe View Post
    How about names like PRINCE (one name), BILLY JOE JNR (more than two names), PETER O'PETERSON (non-A-Z)?
    Not exactly sure what your question is. The expression flags all of the above. When I do need to clean up a name I explode on space and then do a ucfirst on each one.
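
    The clean-up step described here might be sketched as follows. The function name is hypothetical, and the strtolower() call is an assumption: without it, ucfirst() would leave an all-upper-case name unchanged.

```php
<?php
// Split on spaces, lower-case, then capitalise the first letter of each part.
function cleanUpName(string $name): string
{
    $parts = explode(' ', strtolower($name));
    return implode(' ', array_map('ucfirst', $parts));
}

echo cleanUpName('BILL SMITH'), "\n";       // Bill Smith
echo cleanUpName("PETER O'PETERSON"), "\n"; // Peter O'peterson
```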

  21. #21
    Keeper of the SFL StarLion
    Quote Originally Posted by ahundiak View Post
    Not exactly sure what your question is. The expression flags all of the above. When I do need to clean up a name I explode on space and then do a ucfirst on each one.
    /^([A-Z]+\s[A-Z]+|[a-z]+\s[a-z]+)$/

    Shouldn't match any of those three examples.

    It also wouldn't match (and neither would the IFs above) "JOE berthera" (all upper one word, all lower other word), though that might be an allowable miss in your opinion.
    Or "JOAN MARTINE-TOURINGTON" or "MICHAEL ST.TOWER"...
    Maybe you should split the names up and match the parts individually?
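
    One possible reading of that suggestion, as a sketch: split on whitespace, look only at the ASCII letters of each part, and flag the name when no part is mixed case. The function names and the exact flagging rule are assumptions, not something stated in the thread.

```php
<?php
// Classify one word by its ASCII letters only; other characters are ignored.
function wordCase(string $word): string
{
    $letters = (string) preg_replace('/[^A-Za-z]/', '', $word);
    if ($letters === '') {
        return 'none';
    }
    if ($letters === strtoupper($letters)) {
        return 'upper';
    }
    if ($letters === strtolower($letters)) {
        return 'lower';
    }
    return 'mixed';
}

// Flag a name when it has letters and no part of it is mixed case.
function needsCleanup(string $name): bool
{
    $sawLetters = false;
    foreach (preg_split('/\s+/', trim($name), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        $case = wordCase($word);
        if ($case === 'mixed') {
            return false;
        }
        if ($case !== 'none') {
            $sawLetters = true;
        }
    }
    return $sawLetters;
}

$examples = ['PRINCE', 'JOE berthera', 'JOAN MARTINE-TOURINGTON', 'MICHAEL ST.TOWER', 'Bill Smith'];
foreach ($examples as $name) {
    printf("%-24s %s\n", $name, needsCleanup($name) ? 'flagged' : 'ok');
}
```

    With this rule only "Bill Smith" is left alone; the other four are flagged.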

