SitePoint Sponsor

User Tag List

Results 1 to 14 of 14
  1. #1
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    varchar vs nvarchar

    Hello!

    I've been recently initiated into the world of UTF-8 encoding and how it can help me create a database with languages such as Spanish. In my database, I'll need to add some mathematical symbols (such as the infinity sign). Though I could just turn them into html entities via php and str_replace, I'd prefer to keep the actual characters in the database. Doing some research, it seems like by changing the datatype from varchar to nvarchar, my database will be able to handle such characters.

    A couple of questions regarding this:

    1) If there's data already in the database and I change the column from varchar to nvarchar, will any of the existing characters be messed up?

    2) It seems like nvarchar offers much more flexibility than varchar. If my application will need to deal with different languages & mathematical symbols, why wouldn't I want to use nvarchar?

    Thanks so much,

    Eric

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,246
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    NVARCHAR uses two bytes for each character, VARCHAR only one

    the only reason not to use NVARCHAR is disk space, which, in the 21st century, hardly matters

    as for converting an existing table, a quick test should be able to answer that

    by the way, searching "difference between VARCHAR and NVARCHAR" or "VARCHAR versus NVARCHAR" turns up dozens and dozens of good articles
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Thanks for the helpful info!

  4. #4
    Non-Member
    Join Date
    Feb 2012
    Posts
    892
    Mentioned
    10 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by kreut View Post
    2) It seems like nvarchar offers much more flexibility than varchar.
    Which DB?

    Oracle has gone further and one can ask for chars not bytes as storage units.
    varchar2(20 char) means you can store 20 characters: 20, 40, 60, 80 bytes, or more.
    varchar2(20) means you can store 20 bytes.
    The maximum length of a varchar2 is 4000 bytes so the field will max out at 4000 bytes - not 4000 characters.

  5. #5
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    mysql: would this be an option here as well?

  6. #6
    Non-Member
    Join Date
    Feb 2012
    Posts
    892
    Mentioned
    10 Post(s)
    Tagged
    0 Thread(s)
    Not sure, but I believe there is a better one.

    In MySQL you can declare the charset and collation even at column level:

    ORACLE: nvarchar(100)
    is
    MySQL: varchar(100) character set UTF8

  7. #7
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Thank you!

  8. #8
    SitePoint Wizard guelphdad's Avatar
    Join Date
    Oct 2003
    Location
    St. Catharines, ON Canada
    Posts
    1,708
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    nvarchar isn't a datatype supported by mysql.

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,246
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by guelphdad View Post
    nvarchar isn't a datatype supported by mysql.
    actually, it is


    Quote Originally Posted by da manual
    Standard SQL defines NCHAR or NATIONAL CHAR as a way to indicate that a CHAR column should use some predefined character set. MySQL 5.5 uses utf8 as this predefined character set. For example, these data type declarations are equivalent:

    CHAR(10) CHARACTER SET utf8
    NATIONAL CHARACTER(10)
    NCHAR(10)

    As are these:

    VARCHAR(10) CHARACTER SET utf8
    NATIONAL VARCHAR(10)
    NCHAR VARCHAR(10)
    NATIONAL CHARACTER VARYING(10)
    NATIONAL CHAR VARYING(10)

    -- http://dev.mysql.com/doc/refman/5.5/...-national.html
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Update: I used the Alter Table command to Modify the columns to use NVARCHAR and after doing so, I was able to add the infinity symbol as I had hoped. (Though I was a bit discouraged at first as it's not one of the drop down PHPMYadmin options)

    Though everything seems to work now, I feel like I have a conceptual whole regarding utf-8 and the whole nvarchar business and lest I dig a hole for myself, I figure that I better ask away. Basically, I first investigated the utf-8 encoding because I had to add some Spanish characters to the database. And, that in fact worked just great without using nvarchar after making sure that my webpage, and database were utf-8. So..why do I need the nvarchar to make mathematical symbols work? Aren't they part of the utf-8 character set?

  11. #11
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,246
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by kreut View Post
    So..why do I need the nvarchar to make mathematical symbols work? Aren't they part of the utf-8 character set?
    you don't need NVARCHAR if you specify utf8 on your VARCHAR -- they're the same thing

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #12
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    still not quite there...

    OK! That's becoming clearer. And doing the change at the column level does fix things. However, it still doesn't explain why the Spanish accented symbols worked but the mathematical symbols didn't. If it helps, originally I didnīt actually specify utf8 on the varchar, I did it at the database level. And, after performing the query:

    Code:
    SELECT default_character_set_name FROM information_schema.SCHEMATA S
    WHERE schema_name = "schemaname";
    I saw that it was in fact utf8 (I also know this because my Spanish accented characters are showing)

    So, what might explain some of the utf8 characters working and others not working?

    (As a side note, when I switch the collation between latin_1 and utf8 on the column in question, the inifinity symbol doesn't show and then does show which confuses me since I thought that collation was simply how entries are ordered.)

    Any additional thoughts would be appreciated.

    -Eric

  13. #13
    Non-Member
    Join Date
    Feb 2012
    Posts
    892
    Mentioned
    10 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by kreut View Post
    I saw that it was in fact utf8 (I also know this because my Spanish accented characters are showing)

    So, what might explain some of the utf8 characters working and others not working?

    (As a side note, when I switch the collation between latin_1 and utf8 on the column in question, the inifinity symbol doesn't show and then does show which confuses me since I thought that collation was simply how entries are ordered.)
    Well, the latin1 charset fully covers the Spanish language. So the Spanish special chars would display even without a utf8 charset. However, the math symbols are not covered in this charset.

    Looking at the "Character Sets and Collations That MySQL Supports" section in MySQL docs: http://dev.mysql.com/doc/refman//5.5...-charsets.html, I don't see any latin1 or utf8 collations, those are charsets.

  14. #14
    SitePoint Evangelist
    Join Date
    Mar 2011
    Location
    Bellingham, WA
    Posts
    450
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Thanks so much for the reply. After more investigation, while I thought my columns and tables were utf8, they actually weren't.
    I've now set the default charset to utf8 at the database, table, and column level and all symbols are working as hoped. I appreciate all who took the time to respond to these questions.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •