SitePoint Sponsor

User Tag List

Results 1 to 4 of 4
  1. #1
    SitePoint Member Ramses's Avatar
    Join Date
    Jul 2001
    Location
    Montréal, Canada
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Alternatives to auto_increment

    [Warning: I'm long-winded. Actual question is in bold, a tad below.]

    Thanks to past rants found in a miriad of threads at Sitepointforums, I came to develop a new way of looking at 'auto_increment' column attributes to generate unique keys. I'd now like to try generating my own primary keys.

    Aside from the many reasons found in threads like this one, I need to make sure that the application I currently develop will be portable and compatible with a handful of RDBMS. 'auto_increment' is a MySQL implementation, and while it's quite a good feature, I'm a control freak. I want the database system to obey me. I therefore want unique ID that mean something to me. I also want to make sure that the DB and my app will work under the pressure of a insert-intensive environment. While I'm not claiming that my site will get millions of visitors a day and each of them writing 100 messages an hour, I prefer coding for strenght and with an open mind, since when my app matures I see a good future for it (and me... and my Swiss banker... )

    Having looked around a bit, I have experimented using a monotonic key generation by basing myself on the way PHPLIB does with it's 'nextid' function. For those unfamiliar with this method, a common table named 'db_sequence' is created with 2 columns (seq_name, nextid). For each table that an autonumber is needed, a row is added to 'db_sequence' with the table name (seq_name) and a blank start-up id (nextid). Now, every time a record is added to a table that requires an autonumber, we first fetch the nextid for that seq_name in the db_sequence table, then update db_sequence with nextid+1, and finally use the nextid for whatever row insertion we wish to do elsewhere. Nice, sweet, elegant and intuitive... but requires table locking.

    But to tell the truth, I've been seduced by this Sybase paper on Surrogate Primary Key Generation, mostly aroused (...intellectually) by the Recycled Series Approach. In short, and from what I understand, the Recycled Series Approach allows for better concurrent use of the db_sequence table when under stress and would keep the keys small. As the doc says :
    In the recycled series approach, each concurrent process has its own conceptual next-key table.
    For example, three concurrent processes can acquire next key values as follows: One process can work on the sequential series from 1 to 1,000,000, the second can work on the series from 1,000,001 to 2,000,000, and the third from 2,000,001 to 3,000,000.
    If a process dies, another picks up the slack instead of opening a new series. And no need to do table-locking since each process is responsible for it's own turf and will not simultaneously do two jobs at once (no collision).

    ...and then there's the Composite Key Approach, which basically means to tag the server's user_id to the precise date and to some other random number. The PHP function uniqid does exactly that I believe. This way we can do away with the extra queries associated with using and maintaining a sequence table. The downside seems to be larger keys, and a slight possibility of creating identical keys when on a multi-processor machine... or when someone readjusts the computer clock.

    Now I'm perplexed as I am not sure which way to go.

    Question:I'm looking for a rock-solid way to generate unique ids to use in tables that require primary keys. Any thoughts?
    Requirements for proper key generation would be the following:
    • Must withstand and perform well under low-to-very-high levels of stress (in load balanced environment or whatever);
    • Should always create unique keys;
    • Keep overhead and impact low;
    • Set and forget, because once in a production environment it would be nightmarish to replace all generated ids to a new system.
    Has anyone experimented with such systems and would like to comment on their experience? Does anyone know how do banks and telcos approach this problem? Should I simply give up and keep on usong auto_increment?
    Last edited by Ramses; Aug 24, 2001 at 14:03.
    SciFlicks.com -- Science Fiction Cinema
    No, I'm not as old as Professor Farnsworth.

  2. #2
    Your Lord and Master, Foamy gold trophy Hierophant's Avatar
    Join Date
    Aug 1999
    Location
    Lancaster, Ca. USA
    Posts
    12,305
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Having worked on Bank software in the mid-90's, I can say they use sequential number generators (a.k.a Auto-Number Fields). However Account numbers are made up from a couple of different fields depending on the bank, hence the dashes. These other fields include regions, branch numbers, check numbers etc.

    You can search on the Internet for algorithms that create unique numbers however unless you have some other fanatical reason not to, I would use the AutoNumber version of MySQL.

    The Auto_Increment is MySQL's terminology. Every database allows an easy way to create unique keys (and since they have referential integrity can actually enforce it). In Sybase and MS-SQL you have Indentity fields. In Oracle you use Sequences (these are powerful in that you can create the forum by which it increments by). Even Access allows you to create AutoNumber fields for this purpose.

    As long as your code is written properly you shouldn't have a problem.
    Wayne Luke
    ------------


  3. #3
    Database Jedi MattR's Avatar
    Join Date
    Jan 2001
    Location
    buried in the database shell (Washington, DC)
    Posts
    1,107
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'd go along with Wayne's advice.. The only time you'd ever see problems with an auto_increment (or the like) would be on very insert-heavy applications (e.g. 30+ inserts per second). I doubt MySQL would even be used in this sort of application (table lock, etc.). If you're really interested then try your hand at some of the Sybase paper's suggestions (they are all very cool; I have had a PDF of that document stored in my 'neat things' folder on my hard drive for quite some time ).

  4. #4
    SitePoint Member Ramses's Avatar
    Join Date
    Jul 2001
    Location
    Montréal, Canada
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks W.Luke and MattR, I guess I'll stop being paranoid and use current features where available! I'll probably end up using the next_id approach for RDBMS where no LAST_INSERT_ID() feature exist and no unique identifiers can be used to pull the latest entry number back and collisions might occur. But for everyday uses, autonumber will still be my best pal...

    While the following is more like a rethorical question, and while I understand MySQL doesn't scale too well, I assume that exotic id number generation might be useful in instances where multiple DB servers run in tandem and where each machine is handling part of the data instead of replicated copies. In any case, I'm nowhere near that bridge yet.
    SciFlicks.com -- Science Fiction Cinema
    No, I'm not as old as Professor Farnsworth.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •