InnoDB foreign keys can cause major performance problems

Although there are some nice features in MySQL there are plenty of areas where it’s rather abysmal. Some of these problems are so bad though that it’s easily and by far the worst database I’ve ever had to use in a production OLTP environment. Hopefully someday this will change as development continues.

This thread will just go into one of the major problem areas:

Foreign Key support is a required feature for any dbms that claims to be relational. So first off MyISAM is far from relational as it does not support this feature among others. InnoDB does support this as well as transactions so it is much closer but still falls rather short.
Is your database relational?

The problem though with InnoDB’s FK support is that it requires an index on referencing columns - the foreign key columns in child tables. Often this is desirable but certainly not always. For medium to large database this seemingly innocuous restriction can quickly lead to intolerable performance problems for a few reasons.

Consider the following simplified example - additional columns like amount, order_date not included to avoid cluttering things.

create table orders (
  order_id int unsigned not null auto_increment primary key,
  part_id smallint unsigned not null,
  type_id tinyint unsigned not null,
  color_id tinyint unsigned not null,
  customer_id int unsigned not null,
  currency_id tinyint unsigned not null,
  foreign key(part_id) references parts(part_id),
  foreign key(type_id) references types(type_id),
  foreign key(color_id) references colors(color_id),
  foreign key(customer_id) references customers(customer_id),
  foreign key(currency_id) references currencies(currency_id)

So here we’ve opted to use surrogate keys (a.k.a. integer “id” columns) to conserve space keeping table rows and the associated indexes as narrow as possible - especially important in MySQL and the primary key ends up in every secondary index page as well.

Now MySQL will create a unique index for the primary key - that’s always desirable. But unfortunately it will also go ahead and create indexes on all 5 foreign key columns too. Some of those might be good to have but not all.

Let say there are 4 colors to choose from and any part can be painted in any of those colors. As for order types perhaps there are three types of orders - lets say cash, credit and check. As for currency perhaps only a few varieties there too - dollars, pounds and euros.

Indexes with low cardinality and a roughly equal distribution of values are bad indexes. These are almost never a good choice to use. Yet with InnoDB you are stuck with them if you want to use foreign keys. It will not let you drop these useless indexes without removing the foreign key that you’d like to keep for checking with the lookup tables to ensure only valid values are placed into the orders table.

Now consider the orders tables growing to about 20 million rows. The index on color_id would return perhaps 5 million rows per color so what good is that? Might as well do a table scan since these rows are scattered through the table anyway. There’s also the time and space required for maintaining these indexes you’ll never use.

Worse, when presented with too many indexes to choose from on large table sometimes the MySQL optimizer will choose one of these garbage indexes to use as the first index to limit the results on. Then you are in for a world of hurt as your query that use to run quickly all of a sudden starts running “forever” inexplicably. I’ve seen the optimizer flake out and do this on tables with roughly 10 million plus rows when the query contains some like where color_id = 2 in the where clause among other criteria that should result in only a few rows matching - yet MySQL sees there’s an index on a FK column and decides to use it even though the rows matching it are numerous. (I’m, on 5.0.45 btw so maybe this has been fixed perhaps). The solution for me was a choice between an optimizer hint or dropping the FK as we’re not going to be upgrading soon. Lame.

Does any other popular database that supports foreign key constraints have such a restriction? I don’t know of one - are indexes required on foreign keys for: Microsoft SQL Server? No. PostgreSQL? No. Oracle? No. Sybase SQL Anywhere? No. There’s a reason why they aren’t required…

With databases containing 100+ tables you’re typically going to have lots of relationships between these tables. It would be great to be able to enforce referential integrity amongst these using declarative foreign keys as opposed to relying on triggers - also poorly implemented in MySQL - but that’s another story. However, you do so at the risk of quickly ending up with a system that becomes unnecessarily bloated and slow as it grows - and from my experience starts to make some rather bizarre optimizer choices.

Will this ever be addressed in upcoming MySQL releases?