Why is this query slow?

Hey guys,

sorry for the bad topic title, but I could not think of anything better. My problem is, that I have a really slow query, that I need help with:

SELECT `Assembly`.`id`, `Assembly`.`created` FROM `assemblies` AS `Assembly` WHERE `Assembly`.`account_id` = 'some_id'; 

There are approx. ~160 000 rows that would match this. The whole table has 500 000 records and the query executes in like 48s! There is an index on ‘account_id’ and one on ‘created’. Unfortunately there is no compound index.

This is the output of EXPLAIN:

| id | select_type | table    | type | possible_keys | key        | key_len | ref   | rows   | Extra       |
| 1  | SIMPLE      | Assembly | ref  | account_id    | account_id | 97      | const | 157122 | Using where |

For other account_id’s that would contain less rows the query runs really fast. Can someone explain to me how mysql processes this query and why it is not using the account_id index? That would be great as it makes our production site really slow (for that particular client).

Thanks a lot in advance.

PS: The table engine is InnoDB.

It isn’t?

Do you really need to retrieve all 160k rows?

It ~is~ using the index, see value “account_id” under the column “key”.

It might be slow because MySQL has a hard time finding everything, or the index is too sparse (you might want to try a OPTIMIZE TABLE <table> – beware, might take a while and make your server slow).

What puzzles is me is why the key_len is 97 for just that one id. What datatype do you use for that id?

@guido2004: Shouldn’t it say “using index” in the “Extra” column if it were using the index?

@Scalio: Can’t use optimize unfortunately. Will try it on the staging machine.

id is char(32).

Well it’s not necessary to get all rows. In fact I only want those where created > something, but that ran even slower.

Even select id from assemblies where account_id = ‘some id’ limit 50; is pretty slow. :frowning:

Okay a few more updates.

With an optimize I could get the query to execute in 1min (8min before).

This one here completes instantly:

select id from assemblies where account_id = ‘some_id’;
If I add any field to the list of retrieved fields, the query becomes horribly slow.

Anybody else having a good solution?

    , created
    account_id = 'some_id'

Just give that a try, it’s near identical, I wonder if the query in the OP having tables qualified when only one table is involved may be confusing MySQL slightly, it may be expecting more then one table to be involved

Something you could do to speed up access is to ensure that all the fields in the table are fixed length. If you change VARCHARs to CHARs and don’t use BOOL or TEXT fields each record will have a fixed size and traversing the table will be a LOT quicker.

If there are TEXT fields or lots of other fields not need in most of the queries you could try moving that data to another table and joining it only when you need that information. But that depends on how you are using the data across the site as to whether that would be better overall rather than just for this query.

no, not “a LOT quicker”

extracting data from a variable length row is infinitesimally faster than extracting data from a fixed length row

however, your suggestion for using CHAR instead of VARCHAR will bloat the size of the row

this means fewer rows will fit onto a single disk block, which means that the query will have to read more disk blocks to retrieve all the same data

and the speed of retrieving data blocks off the disk is humoungously slower than the speed you might have saved in not extracting data from a variable length row

so in my opinion, this idea of using CHAR instead of VARCHAR is self-defeating

this, on the other hand, is an excellent suggestion

TEXT columns are already stored “off row” so you don’t need to worry about them, but separating other infrequently used columns into a related one-to-one table often has a significant effect on performance

why? because the main table is now shorter, which means more rows fit onto a disk block, which means you can read more rows with fewer disk reads


Thanks, I’d not considered the disk blocks themselves.