SitePoint Sponsor

User Tag List

Results 1 to 7 of 7
  1. #1
    SitePoint Member
    Join Date
    Feb 2005
    Location
    Cape Town, South Africa
    Posts
    13
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Always have a autoincrement 'id' column as primary key in a table?

    Hi,

    I have been working for 2 yrs now...at the same company...and have been 'taught' to have an autoincrement id field (also primary key) in each table i create, then use that to link to other tables etc. I have had no problems using this technique thus far, but have been told its not a good way of doing it.

    Is this a bad way of designing databases, if so how should i be doing it?

    Regards,
    Bjorn

  2. #2
    Feel my RewiredMind KMxRetro's Avatar
    Join Date
    Jan 2001
    Location
    Exeter, Devon, UK
    Posts
    477
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I don't see why it would be any worse than any other way of doing it. Its the way I do it, but I name mine "cust_id" and "order_id" etc, etc.

    If it works, and it doesn't cause problems, I wouldn't worry about it - unless anyone else can put me straight here...

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    a surrogate id is a good idea if (1) the real primary key is composite or large, and (2) the table has child (related) tables

    otherwise, it is not necessary

    [stand well back, the above is sure to draw some comment from relational theorists]

    what's bad is declaring a surrogate key and then forgetting to also declare a unique constraint on the "real" key, resulting in yet another "how do i remove duplicate rows" thread on some forum

    and totally useless is an associative or relationship table with its own surrogate key, as in

    create table bookauthors
    ( id integer not null primary key auto_increment
    , book_id integer not null
    , author_id integer not null
    )

    far better is this --

    create table bookauthors
    ( book_id integer not null
    , author_id integer not null
    , primary key (book_id, author_id)
    )
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937
    a surrogate id is a good idea if (1) the real primary key is composite or large, and (2) the table has child (related) tables
    Well, the second clause should probably read "has or ever will have child tables". Because you never know it's better to omit it and stick to "big primary key" principle.

  5. #5
    SitePoint Guru asterix's Avatar
    Join Date
    Jun 2003
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Practical DBA considerations:

    1) What datatype are you going to use? INT? VARCHAR? GUID if your DB supports it? And why are you choosing exactly that datatype? If you are choosing int, what do you do when it rollsover the largest number available in your DB? If you've never seen that happen, I can assure you that it is *not* fun. If you choose GUID or VARCHAR even, do you know how much extra disk space the index will need over an INT field?

    2) If you choose autoincrement INT as your PK, do you know what the term "disk hotspotting" means? It depends a lot on the frequency of inserts, but it can become a real performance problem in some situations.

    3) Do you know what consequences this choice has on the performance of your index when many inserts are occurring? At some point, every nth insert is going to cause huge b-tree reorganisation, and you can just make coffee whilst waiting

    4) Not all RDBMSs support autoincrementing INT datatypes, notably Oracle. It doesn't happen often, I admit, but you'll have a headache porting your schema to Oracle.

    5) What is your strategy for dealing with the users of the data who - because they constantly see this ID thing everywhere - start treating the autoincrement ID as business critical! "What do you mean that product XYZ no longer has ID of 123!, I can't find it anymore!". Stories abound of DBAs refusing to renumber an autoincrement field because the previously "surrogate" key has taken on a life and destiny all it's own.

    6) Databases in which every table has a PK named "ID" are a nightmare to code. I use the following naming scheme:

    <tablename>
    -----
    <tablename>_id for the PK.

    It makes generating code so much easier, and it makes the PK / FK relationships immediately clear.


    In answer to the OP's question

    I have had no problems using this technique thus far, but have been told its not a good way of doing it.

    Is this a bad way of designing databases, if so how should i be doing it?
    I think that it is a bad way of desgining relational models, yes. But it is often a smart way to design real databases.

  6. #6
    SitePoint Zealot
    Join Date
    Feb 2005
    Location
    UK
    Posts
    121
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    To summarise.
    Yes, use autoinc fields as primary key for all tables that contain real-world data.
    Name them for the table so you dont get accidental joins on 2 columns named 'id'.
    Never reveal them to the outside world - they only have meaning inside the db.
    Integer indexes are much faster than other data types.
    The only exceptions to this would be data that already has a unique integer identifier such as an Invoice number, because autoincs will end up with numbers missing from the sequence and auditors Hate missing invoice nos.

  7. #7
    SitePoint Member
    Join Date
    Feb 2005
    Location
    Cape Town, South Africa
    Posts
    13
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks all for your replies...they have helped alot. I think i will continue using them as i do but just prefix them with the table name. Of course keeping in mind point no.1 from asterix! Although i dont think i will ever need a bigger integer than a BIGINT in MySQL hehehe...

    Thanks again!


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •