How to Use Unique Indexes in MySQL and Other Databases

Contributing Editor

If you’ve been using databases for a while, you’ve probably set a primary key in most of your tables. A primary key is a unique identifier for each record, e.g.

CREATE TABLE `phone` (
	`id` MEDIUMINT(8) UNSIGNED NOT NULL AUTO_INCREMENT,
	`country` DECIMAL(5,0) UNSIGNED NOT NULL,
	`area` DECIMAL(5,0) UNSIGNED NOT NULL,
	`number` DECIMAL(8,0) UNSIGNED NOT NULL,
	`extension` DECIMAL(5,0) UNSIGNED DEFAULT NULL,
	PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;

In this example, the ‘id’ column is our primary key. When we INSERT telephone numbers, if we don’t specify an id, an AUTO_INCREMENT number will be generated by adding one to the highest existing id.

Presume you’ve added the following data:

id country area number extension
1 1 234 567890 NULL
2 44 9876 54321 42
3 61 3 90908200 NULL

then issue the following INSERT:

INSERT INTO `phone`
(`id`, `country`, `area`, `number`)
(1, 1, 234, 567890);

The database would refuse to add a new record because one already exists with an id of 1. Fortunately, we can omit the id from our INSERT to automatically generate one:

INSERT INTO `phone`
(`country`, `area`, `number`)
(1, 234, 567890);

We now have four records:

id country area number extension
1 1 234 567890 NULL
2 44 9876 54321 42
3 61 3 90908200 NULL
4 1 234 567890 NULL

We can add almost 17 million records before the id goes out of range.

Great — except that records 1 and 4 are identical. What if we want to ensure all phone numbers are unique?

Unique Indexes

Unique indexes work in much the same way as a primary key. Although you can only have one primary key, any number of unique indexes can be created with any number of fields.

In our example, we want to ensure no two records have the same country, area, number and extension. We can do this by altering our table:

ALTER TABLE `phone` 
ADD UNIQUE INDEX `ix_phone` (`country`, `area`, `number`, `extension`);

Note that the index name ‘ix_phone’ is optional. Alternatively, we could re-create our table:

DROP TABLE IF EXISTS `phone`;

CREATE TABLE `phone` (
	`id` MEDIUMINT(8) UNSIGNED NOT NULL AUTO_INCREMENT,
	`country` DECIMAL(5,0) UNSIGNED NOT NULL,
	`area` DECIMAL(5,0) UNSIGNED NOT NULL,
	`number` DECIMAL(8,0) UNSIGNED NOT NULL,
	`extension` DECIMAL(5,0) UNSIGNED DEFAULT NULL,
	PRIMARY KEY (`id`),
	UNIQUE KEY `ix_phone` (`country`, `area`, `number`, `extension`),
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;

Most databases support unique indexes but the SQL syntax may differ.

Let’s try inserting a duplicate record even though we’re not specifying an id:

INSERT INTO `phone`
(`country`, `area`, `number`, `extension`)
(44, 9876, 54321, 42);

The following error will be generated if you’re using MySQL:

Error Code: 1062
Duplicate entry '44-9876-54321-42' for key 'ix_phone'

If you’re using almost any database, you can guarantee your phone records are unique no matter how the data is inserted.

MySQL NULLs

I say almost any database because MySQL has an odd quirk. NULL is treated as a unique value — which is why you cannot use comparisons such as value = NULL and need to use value IS NULL. Unfortunately, this also affects unique indexes and no logic has been implemented to fix it.

We can execute our original INSERT multiple times and a new record will be created each time because the extension field defaults to NULL and is considered to be unique:

INSERT INTO `phone`
(`country`, `area`, `number`)
(1, 234, 567890);

Yes, it’s insane. I’m not aware of the problem in other databases and even MySQL works as expected if you’re using the BDB storage engine. It’s been reported as a MySQL bug but there are no known plans to fix it.

The solution: ensure all fields defined in a unique index cannot be set to NULL. In this example, we could indicate that there’s no extension number by setting a value such as 0 or 99999. Or perhaps we could make the field a signed number and set -1. It’s horrible, but it’ll work.

Despite this issue, unique indexes are useful in many situations and help you to retain data integrity when other programmers and users are not so conscientious!

Free book: Jump Start HTML5 Basics

Grab a free copy of one our latest ebooks! Packed with hints and tips on HTML5's most powerful new features.

  • ElDerecho

    Unique indexes are particularly handy in conjunction with MySQL’s ON DUPLICATE KEY, allowing you to insert a row, or update an existing row with one statement.

  • Ian Simmons

    Thank you for this good and informative article

  • Ian Simmons

    ah this explains why I see a UUID column sometimes. Probably for cases where username or email or some other constant value doesn’t exist or isn’t appropriate for the table. Although I imagine there could be a small bit of overhead generating a UUID for every record.

    • Taylor Ren

      UUID should not be encouraged.

      One point missing in my above paragraphs regarding PK or UI is that:

      The method to calculate the UI must be logical and simple. For example, in my book collection database, I have a table to store my book collections with these fields:

      ID, Title, Author, ISBN, Purchase Date, Place of purchase, Price, Publisher, Version, Decoration, etc.

      ID is the primary key. But as I said, I created another UI to avoid duplicate entry.

      Obviously, title, author are not the good candidates for a UI.

      So? You may think that the ISBN is a good candidate for UI. Not really. I may bought another same book but with a different version or decoration.

      Eventually I chose the combination of the following fields. title+author+version+date of purchase as the UI.

      You see, I even skipped the ISBN into the UI composition.

      The logic is that: title + author normally defines the book but I just distinguish a probable repeated purchase with the date of purchase (as there may be a change in the version).

      • Ian Simmons

        Ah ok, makes sense. Thanks for the extra explanation.

  • http://stephenmorley.org/ Stephen Morley

    With regards to MySQL’s treatment of NULLs in unique indices, I regards this as a useful feature rather than a bug. There are many situations in real life where an object may or may not have some property, but if it does then its value must be unique, and MySQL’s behaviour allows this to be modelled easily.

    Note also that MySQL’s behaviour is in accordance with the SQL standard: “A unique constraint is satisfied if and only if no two rows in a table have the same non-null values in the unique columns”; in other words, the standard only requires rows that don’t have any NULLs in the unique columns to actually be unique.

  • http://www.matthewsetter.com/ Matthew Setter

    @taylorren:disqus I really like your point about having a compound key. I’ve not explored the limitations of MySQLs auto increment too far, but I see that by using a compound key, the probability of having a clash is highly unlikely. Thanks for the extended clarification about what you mean.

    • Taylor Ren

      Thanks. It is always nice to share something useful. I will have a full length article devoted to this scheduled on 30th. Take a look then!

      • http://www.matthewsetter.com/ Matthew Setter

        looking forward to it.

  • http://www.matthewsetter.com/ Matthew Setter

    Craig, thanks for specifically mentioning the point multiple null values and how they’re accepted, except with the BDB engine and that that’s not considered a bug by MySQL support. The bug report’s comments make for some interesting reading. I agree with you that it sure seems insane. If nothing else, inconsistent.