SitePoint Sponsor

User Tag List

Page 2 of 2 FirstFirst 12
Results 26 to 39 of 39
  1. #26
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    While you ask why not use INT, I would ask why use INT - because I don't see any real advantages of INT over VARCHAR and my logic says a phone number is a string of digits and not a number. There's only one 'real' disadvantage of INT I can think of: you cannot use index when doing a search using LIKE '123%'.
    The whole point of using (BIG)INT instead of VARCHAR is for size and performance benefits. You can't see any real advantages to using INT over VARCHAR? What about:

    * Smaller storage requirements: BIGINT = 8 bytes, VARCHAR(12) = 13 bytes
    * Faster database queries.

  2. #27
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    IBazz: thanks for the example, I agree future-proofing is important though it can be a fine line to tread.

  3. #28
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    * Faster database queries.
    if you can give a quantitative answer to the question "faster by how much?" then you may have a valid point, otherwise it's just another urban database myth

    me, i prefer to use the most appropriate datatype for each element

    unless you are seriously going to find AVG(phonenumber) or SUM(phonenumber) then it is ~not~ numeric data, and should be stored as VARCHAR

    my advice is to store two values, formatted for display and unformatted for search -- but they're both VARCHARs
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #29
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    if you can give a quantitative answer to the question "faster by how much?" then you may have a valid point, otherwise it's just another urban database myth
    I can't give a quantitative answer because I haven't measured it but it's certainly not an urban database myth; it's a fact that queries on (BIG)INT fields are going to be faster than on larger sized (VAR)CHAR fields, especially when indexes are involved. It might not be much faster but it WILL be faster.

    Quote Originally Posted by r937 View Post
    me, i prefer to use the most appropriate datatype for each element

    unless you are seriously going to find AVG(phonenumber) or SUM(phonenumber) then it is ~not~ numeric data, and should be stored as VARCHAR
    It that case please tell me what datatype you would use to store a TCP or UDP port number?

    Quote Originally Posted by r937 View Post
    my advice is to store two values, formatted for display and unformatted for search -- but they're both VARCHARs
    I appreciate the advice, I think I will probably stick with VARCHAR but just have the one unformatted value. Can you explain why you would choose to store both?

  5. #30
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    it's a fact that queries on (BIG)INT fields are going to be faster than on larger sized (VAR)CHAR fields, especially when indexes are involved. It might not be much faster but it WILL be faster.
    oh come on, it's not that hard, find just one authoritative source that backs you up on this

    Quote Originally Posted by Code.Warrior View Post
    It that case please tell me what datatype you would use to store a TCP or UDP port number?
    you'll have to give examles of those numbers because i am not familiar with them

    Quote Originally Posted by Code.Warrior View Post
    Can you explain why you would choose to store both?
    let's say that the phone number you want to store is (937) 555-1212

    i would store two VARCHARs: 9375551212 for searching, and (937) 555-1212 for display

    then when you run a search, use
    Code:
    SELECT display_number
      FROM phonenumbers
     WHERE search_number = REPLACE(REPLACE(REPLACE(REPLACE('$form_number','(',''),')',''),'-',''),' ','')
    this strips out the formatting characters from the user-entered value in the form field, and allows the index on the search_number to be used

    of course, you could also strip the extraneous characters in your application language prior to running the search (e.g. regular expression which ditches everything except digits)

    so if the user enters 937-555-1212 or (937)5551212 or other formats, they will all be found -- efficiently
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #31
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    oh come on, it's not that hard, find just one authoritative source that backs you up on this
    MySQL :: Re: Primary key performance int vs. char

    Quote Originally Posted by r937 View Post
    you'll have to give examles of those numbers because i am not familiar with them
    Really? List of TCP and UDP port numbers - Wikipedia, the free encyclopedia

    What datatype would you use for the port numbers?

    Quote Originally Posted by r937 View Post
    let's say that the phone number you want to store is (937) 555-1212

    i would store two VARCHARs: 9375551212 for searching, and (937) 555-1212 for display
    To me that doesn't seem like a good solution, for the following reasons:

    1. You are using two fields when you could just use one.
    2. On insert you need to ensure the numbers of both fields match.
    3. On updates you need to change the number in two places and also ensure they match.

    With one field (formatted by the application on display) I can avoid the above problems, plus:

    1. If I decide I want to change the display format from (937) 555-1212 to 937-555-1212 I can do that without having to update all the records in the database.
    2. I have the ability to allow the user to decide which format the number is displayed to them.

  7. #32
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    925
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    Even if that is true the difference in speed is so small that it shouldn't be a concern.
    What datatype would you use for the port numbers?
    I don't know what r937 will say but to me they qualify to be int because port numbers are single numeric entities, they are not written as formatted in any way, and - as the table shows - they have certain meaningful ranges, which means that it makes sense to make numeric comparisons and searches using < and >, while it doesn't make sense to search port numbers using LIKE '%...%'. With the phone numbers it's the opposite: numerical searches with < and > make no sense while textual searches with LIKE '%...%' make sense, that's why it's more logical to use a text field for phone numbers. And, if you want to find all phone numbers from a certain area (by first few digits) then your argument about speed is lost completely because no index is used on an INT column with LIKE '...%'.

  8. #33
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    excellent

    Quote Originally Posted by Code.Warrior View Post
    What datatype would you use for the port numbers?
    SMALLINT
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #34
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    With the phone numbers it's the opposite: numerical searches with < and > make no sense while textual searches with LIKE '%...%' make sense, that's why it's more logical to use a text field for phone numbers. And, if you want to find all phone numbers from a certain area (by first few digits) then your argument about speed is lost completely because no index is used on an INT column with LIKE '...%'.
    Search for all numbers in the Oxford (01865) area:

    Code:
    SELECT fields
     FROM table
     WHERE phone_number > 1865000000 AND phone_number < 1865999999

  10. #35
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Code.Warrior
    What datatype would you use for the port numbers?
    Quote Originally Posted by r937 View Post
    SMALLINT
    Quote Originally Posted by r937 View Post
    me, i prefer to use the most appropriate datatype for each element

    unless you are seriously going to find AVG(phonenumber) or SUM(phonenumber) then it is ~not~ numeric data, and should be stored as VARCHAR
    But you wouldn't seriously find AVG(portnumber) or SUM(portnumber).

  11. #36
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    But you wouldn't seriously find AVG(portnumber) or SUM(portnumber).
    i'm weird like that
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #37
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    925
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Code.Warrior View Post
    Search for all numbers in the Oxford (01865) area:

    Code:
    SELECT fields
     FROM table
     WHERE phone_number > 1865000000 AND phone_number < 1865999999
    It's as weird as trying to use
    Code:
    WHERE numeric_field LIKE '1__'
    instead of
    Code:
    WHERE numeric_field BETWEEN 100 AND 199
    Now consider the fact that not all numbers are 10-digit long and your where clause becomes even weirder. I'm not saying it won't work. It's just a clumsy workaround.

  13. #38
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    i'm weird like that
    Yes you are.

  14. #39
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    Now consider the fact that not all numbers are 10-digit long and your where clause becomes even weirder. I'm not saying it won't work. It's just a clumsy workaround.
    It would be very easy to setup the correct query within the application based on the number length.

    But the point was that searches with < and > can make sense.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •