SitePoint Sponsor

User Tag List

Page 1 of 2 12 LastLast
Results 1 to 25 of 39
  1. #1
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Handling telephone numbers

    I'm trying to work out the best way to do this. I'll be dealing with UK telephone numbers which will all be stored in the format: 0xxxxxxx... so no country code.

    On searching about this it seem the consensus is to store telephone numbers as strings but I can't help but think storing as a number may solve some problems while also speeding up searches and reducing storage size!?

    The only thing I'll be doing is storing, searching and displaying those numbers.

    One of the problems is spaces. For example 01234567890 could be entered as 01234 567890 or 01234 567 890. So I'd need to handles that when doing a search.

    If I stored the numbers as an INT then I just need to handle the leading zero.

    What do you think?

  2. #2
    SitePoint Zealot
    Join Date
    Apr 2009
    Location
    Wenatchee, WA
    Posts
    101
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    No expert here, but I'd probably just store them as a string, and strip the spaces (and parentheses/dashes/dots, at least for our numbers), storing just the characters. Figure out how you want to enforce the minimum and/or maximum string length to protect against someone entering the wrong number of digits. Not being that familiar with your phone number formatting, our equivalent would be if some entered less than seven digits (incomplete even for a local number), more than 10 (area code plus local number), or more than seven but less than ten (incomplete area code and/or some portion of the local number).

    I believe your subsequent search queries would need to also strip the spaces, dashes, dots, etc. from the user-inputted string so that it would match the value stored in the table.

    Alternately you could program things to where the value is stored as a formatted string, with spaces, etc. in the appropriate places...

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,215
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    store both the stripped number and the formatted number

    as strings, of course

    searching an index on a 10- or 15-char string is not going to be noticeably slower than searching on an integer index
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ok thanks. So why would storing as a number be a bad idea in this case?

  5. #5
    Programming Since 1978 silver trophybronze trophy felgall's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, NSW, Australia
    Posts
    16,784
    Mentioned
    25 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    Ok thanks. So why would storing as a number be a bad idea in this case?
    Storing as a number doesn't allow for any leading zeros on the front of the phone number.
    Stephen J Chapman

    javascriptexample.net, Book Reviews, follow me on Twitter
    HTML Help, CSS Help, JavaScript Help, PHP/mySQL Help, blog
    <input name="html5" type="text" required pattern="^$">

  6. #6
    SitePoint Addict bimalpoudel's Avatar
    Join Date
    Feb 2009
    Location
    Kathmandu, Nepal
    Posts
    279
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    Ok thanks. So why would storing as a number be a bad idea in this case?
    Using VARCHAR column ensures that it can accept several possibilities of phone number formats as well. Like those with:
    * Leading zeros
    * Spaces
    * Hyphens and braces
    * Even alphabets

    However, you can always filter the user input if you strictly want to follow a format. Otherwise, you can respect the user input (by not saying your phone format is wrong, retype the correct one). Just assume user input is valid, unless it is filled by a spammer.
    Bimal Poudel @ Sanjaal Framework over Smarty Template Engine
    ASKING INTERESTING QUESTIONS ON SITEPOINT FOURM

    Hire for coding support - PHP/MySQL

  7. #7
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That's the thing, if I'm ONLY using numbers I won't have any of those issues except the leading zero which can be added for display purposes.

  8. #8
    Programming Since 1978 silver trophybronze trophy felgall's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, NSW, Australia
    Posts
    16,784
    Mentioned
    25 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    That's the thing, if I'm ONLY using numbers I won't have any of those issues except the leading zero which can be added for display purposes.
    The problem would be that there is no way when a phone number is stored in a number field to tell whether or not it has a leading zero and how many leading zeros it should have - eg. the phone number 000 would be stored simply as zero and you'd have no way to tell that it is supposed to be three zeros.
    Stephen J Chapman

    javascriptexample.net, Book Reviews, follow me on Twitter
    HTML Help, CSS Help, JavaScript Help, PHP/mySQL Help, blog
    <input name="html5" type="text" required pattern="^$">

  9. #9
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by felgall View Post
    The problem would be that there is no way when a phone number is stored in a number field to tell whether or not it has a leading zero and how many leading zeros it should have - eg. the phone number 000 would be stored simply as zero and you'd have no way to tell that it is supposed to be three zeros.
    All UK telephone numbers have one leading zero.

  10. #10
    SitePoint Guru
    Join Date
    Sep 2008
    Posts
    977
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Better imv, to store the leading 0, regardless. So when you display it in your webpage, you can process the dropping of it, if you are to show +44, for example.

    If there was no 0, then would you not have to hard code that process of adding/dropping the 0, in your script?

    bazz

  11. #11
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes baz, but that seems quite trivial and straight forward considering that whichever method I use I will have to manipulate the data for display and search purposes anyway.

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,215
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    no you wont

    see post #3 -- store bofadem
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    r937, I'm not sure storing bofadem! is a good idea, because all of a sudden we have added an extra complexity to data integrity, where any insert and future updates will have to make sure that both fields are holding the correct value.

    Also data will need to be manipulated either way because for example:

    record_field = 0123456789
    formatted_field = 0123 456789
    search_field = 0123 456 789

    The search_field will need to return a match, so there's no getting away from it.

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,215
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    seems you have three columns, not just the two that i suggested
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #15
    SitePoint Addict bimalpoudel's Avatar
    Join Date
    Feb 2009
    Location
    Kathmandu, Nepal
    Posts
    279
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You can deal with single column as well.
    format your searched numbers according to what way you save them in the database.
    Whle showing / printing, format the phone data according to second field. That is just for a display and not stored.

    I think, only one field is enough to hold.
    Bimal Poudel @ Sanjaal Framework over Smarty Template Engine
    ASKING INTERESTING QUESTIONS ON SITEPOINT FOURM

    Hire for coding support - PHP/MySQL

  16. #16
    SitePoint Wizard
    Join Date
    Jul 2003
    Location
    Kent
    Posts
    1,921
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    a telephone number is NOT a number - there is no meaning to the concept of adding, subtracting multiplying or dividing them. A phone number IS a string of characters, from the domain of characters, which just by coincidence all happen to be in the range 0123456789. But they are not numbers. So don't use a numerical field.

    storing as a string gives you options on the way the data is entered that at the present moment you may not have thought about or think you don't need. for example we all split up phone numbers into groups such as 01324 897 465 because this makes reading them and remembering them easier. While at present you think that 01324897465 is easier, at some point in the future you or another user may choose to enter 01324-897-465 or use spaces, which will not be accepted. Your idea is universally considered wrong, so why choose something that everyone says is wrong.

    Now the phone number example you have forgotten about:
    01324 897 465 ext 2305


    think about that one as a number...

  17. #17
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    930
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    I would do it in 1 column - at least at the beginning. Either store the unformatted number (digits only) or formatted (digits + spaces and other extra characters) depending on your needs.

    When using the latter approach use a function while searching that will strip out special characters. In the future, if the database grows large and you need to index the numbers for searches then use another column as r937 suggests. Or do it immediately if you want to be prepared for that already.

    I have a similar situation in a client table but with tax id numbers - they are stored as entered by users (sometimes with dashes and/or spaces). The table now has 12000 rows and doing a search for the number using REPLACE functions is lightning fast - so I'm not even considering the need to add another column just to be able to use index. But it may depend on how much data you have in your rows - for example, if each row stores large amount of data in TEXT or BLOB columns then doing a full table scan over 12000 rows can be time-consuming. My client table has only 12 varchar and 2 numeric columns, and occupies 1.5MB so that is not an issue and I expect it won't be still for quite a long time.

  18. #18
    From space with love silver trophy
    SpacePhoenix's Avatar
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    4,993
    Mentioned
    100 Post(s)
    Tagged
    0 Thread(s)
    If you read this wikipedia article on UK phone numbers you'll see that there is a variety of formats so your probably best of having two fields (or one field if it's just going to be UK numbers only) 1 field would be the 2 digit country code and the other would be a varchar field with any space and non-nuemric characters stripped out.

    On the server-side language part using php as an example, you'd strip any non-numeric characters from the phone number and send that to the database after having first sanitized the number. The database would then be doing a simple comparison of strings.

    Any formatting of the phone number could be done in php basically using a switch-break block.
    Community Team Advisor
    Forum Guidelines: Posting FAQ Signatures FAQ Self Promotion FAQ
    Help the Mods: What's Fluff? Report Fluff/Spam to a Moderator

  19. #19
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    seems you have three columns, not just the two that i suggested
    I should have been a bit clearer in the example - the search_field is a user input field not a database field.

  20. #20
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by bimalpoudel View Post
    You can deal with single column as well.
    format your searched numbers according to what way you save them in the database.
    Whle showing / printing, format the phone data according to second field. That is just for a display and not stored.

    I think, only one field is enough to hold.
    I agree, store one field and format it for storage and display purposes.

  21. #21
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Dr John View Post
    a telephone number is NOT a number - there is no meaning to the concept of adding, subtracting multiplying or dividing them. A phone number IS a string of characters, from the domain of characters, which just by coincidence all happen to be in the range 0123456789. But they are not numbers. So don't use a numerical field.

    storing as a string gives you options on the way the data is entered that at the present moment you may not have thought about or think you don't need. for example we all split up phone numbers into groups such as 01324 897 465 because this makes reading them and remembering them easier. While at present you think that 01324897465 is easier, at some point in the future you or another user may choose to enter 01324-897-465 or use spaces, which will not be accepted. Your idea is universally considered wrong, so why choose something that everyone says is wrong.

    Now the phone number example you have forgotten about:
    01324 897 465 ext 2305


    think about that one as a number...
    I'm not sure I agree. By your definition anything that isn't added, subtracted, multiplied or divided should not be stored in a database as a number. As an example I happen to store status codes or error codes in a database as INTs, these codes are only EVER comprised of numbers but no arithmetic is ever performed on them. Should I change these to CHARs?

    With regards to telephone representation, I've already covered this in posts above. Storing them as strings doesn't solve this issue because as you pointed out the number can be represented as 01234567890 or 01234 567890 or 01234 567 890 or (01234) 567890 or 01234-567-890... which format do you store? Therefore the need to manipulate the input/output cannot be avoided if you wish to handle these cases. Therefore I cannot see any advantage to storing as a string instead of a number in this case.

    This is why I'm questioning the "universally" accepted method of storing as a string. It's important to question these things to have a better understanding of why. This is after all a discussion forum so I want to find out why "everyone" says it's wrong when it seems right to me in this particular situation.

    Telephone extensions aren't used in my case but that would be an easy addition - store them in their own field.

  22. #22
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    I would do it in 1 column - at least at the beginning. Either store the unformatted number (digits only) or formatted (digits + spaces and other extra characters) depending on your needs.

    When using the latter approach use a function while searching that will strip out special characters. In the future, if the database grows large and you need to index the numbers for searches then use another column as r937 suggests. Or do it immediately if you want to be prepared for that already.

    I have a similar situation in a client table but with tax id numbers - they are stored as entered by users (sometimes with dashes and/or spaces). The table now has 12000 rows and doing a search for the number using REPLACE functions is lightning fast - so I'm not even considering the need to add another column just to be able to use index. But it may depend on how much data you have in your rows - for example, if each row stores large amount of data in TEXT or BLOB columns then doing a full table scan over 12000 rows can be time-consuming. My client table has only 12 varchar and 2 numeric columns, and occupies 1.5MB so that is not an issue and I expect it won't be still for quite a long time.
    I thought about storing them as entered but doesn't it make more sense (even in your tax id case) to store them all in a uniform formatted way. This makes searching and displaying them easier because you know what format they are in and you only need to worry about validating / formatting the data entered by the user which is easy enough especially if it's only digits you are interested in.

  23. #23
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by SpacePhoenix View Post
    If you read this wikipedia article on UK phone numbers you'll see that there is a variety of formats so your probably best of having two fields (or one field if it's just going to be UK numbers only) 1 field would be the 2 digit country code and the other would be a varchar field with any space and non-nuemric characters stripped out.

    On the server-side language part using php as an example, you'd strip any non-numeric characters from the phone number and send that to the database after having first sanitized the number. The database would then be doing a simple comparison of strings.

    Any formatting of the phone number could be done in php basically using a switch-break block.
    Thanks, that's actually how I'm currently doing it but I still don't see why it needs to be stored as a string when I'm stripping out "any space and non-nuemric characters"? Granted the leading zero won't be stored but that's not issue (if anything it's a good thing since it's redundant information). The zero and spaces can be later added when displaying the number or stripped out from user input when performing a search.

    This seems to be the best solution. But still willing to hear any valid reasons against doing it this way.

  24. #24
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    930
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    I thought about storing them as entered but doesn't it make more sense (even in your tax id case) to store them all in a uniform formatted way.
    What makes sense in my case doesn't have to make sense in your case. I don't need tax id numbers all in a uniform format, I want to allow people to store them in their preferred format and this is what they get on the invoices from us. If a uniform format makes more sense in your case then use it.

    Quote Originally Posted by Code.Warrior View Post
    But still willing to hear any valid reasons against doing it this way.
    The 'valid' reasons have more to do with logic than any real implementation issues. You seem to have different logical thinking than most other people. While you ask why not use INT, I would ask why use INT - because I don't see any real advantages of INT over VARCHAR and my logic says a phone number is a string of digits and not a number. There's only one 'real' disadvantage of INT I can think of: you cannot use index when doing a search using LIKE '123%'.

    Just try INT, your application will work fine, not a big deal. BTW, INT may be not enough for all numbers, you would need BIGINT or DECIMAL.

  25. #25
    SitePoint Guru
    Join Date
    Sep 2008
    Posts
    977
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    fwiw, my way would be to store it as a set of numbers, with - ( ) etc removed from whatever input the user made. Keeing it a string would enable me to keep the leading 0.

    So when I want to search, I can but when I want to output, say, in a webpage, I can format that consistently stored string, into however I want to display it.

    Doesn't matter (from a sotrage perspective), how someone may input their phone number. IMV they should be stored consistently and formatted for whatever output is required.

    But as has been written several times, each application has its own requirements and then so does the app designer.

    What I would suggest it that the issue shouldn't be discussed on the basis that it requires to store only UK phone numbers. That may be how it works now but, what if the scenario changes? Much better to build it once to work for any requirement relating to phone numbers and then, to use it as needed.

    example of change: coule of years ago, it was recommended to me to store VAT simply as a value of 17.5% (in the case of the UK). ie.


    Code MySQL:
    create table vat_rates
    ( country char(3) not null
    , vat_rate DECIMAL not null
    , primary key (country, vat_rate)

    Because I am so old; I remember when vat was 7.5% and that it ahs changed several times since and until it reached now 20%. with it not being set in concrete, I decided to store it on a date-related basis. And what happened? The rate changed.

    By storing it with the flexibility built in, I was able to change the vat rate from 2010-01-04 well ahead of time so that when the flip-over came, it was painless and pricing was correct throughout.

    OK, that's a bit different from the phone number issue but, I hope it helps with the discussion with regards to future-proofing and scaleability and data management.

    my 2c

    bazz


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •