SitePoint Sponsor |
|
User Tag List
Results 1 to 10 of 10
-
Jun 1, 2002, 01:37 #1
- Join Date
- Jul 2001
- Location
- Missouri
- Posts
- 3,428
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
why doesn't MySQL use the index for sorting?
this sucks. i just realized that MySQL isn't using the index for the ORDER BY when i really wish it would.
examples: these use the index:
Code:mysql> EXPLAIN SELECT * FROM users ORDER BY userid LIMIT 10; +---------+-------+---------------+---------+---------+------+------+-------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+-------+---------------+---------+---------+------+------+-------+ | users | index | NULL | PRIMARY | 2 | NULL | 527 | | +---------+-------+---------------+---------+---------+------+------+-------+ mysql> EXPLAIN SELECT * FROM users ORDER BY userid DESC LIMIT 10; +---------+-------+---------------+---------+---------+------+------+-------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+-------+---------------+---------+---------+------+------+-------+ | users | index | NULL | PRIMARY | 2 | NULL | 527 | | +---------+-------+---------------+---------+---------+------+------+-------+ mysql> EXPLAIN SELECT * FROM users WHERE userid > 100 ORDER BY userid LIMIT 10; +---------+-------+---------------+---------+---------+------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+-------+---------------+---------+---------+------+------+------------+ | users | range | PRIMARY | PRIMARY | 2 | NULL | 527 | where used | +---------+-------+---------------+---------+---------+------+------+------------+ mysql> EXPLAIN SELECT * FROM users WHERE userid < 100 ORDER BY userid LIMIT 10; +---------+-------+---------------+---------+---------+------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+-------+---------------+---------+---------+------+------+------------+ | users | range | PRIMARY | PRIMARY | 2 | NULL | 78 | where used | +---------+-------+---------------+---------+---------+------+------+------------+
Code:mysql> EXPLAIN SELECT * FROM users WHERE userid < 100 ORDER BY userid DESC LIMIT 10; +---------+-------+---------------+---------+---------+------+------+----------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+-------+---------------+---------+---------+------+------+----------------------------+ | users | range | PRIMARY | PRIMARY | 2 | NULL | 78 | where used; Using filesort | +---------+-------+---------------+---------+---------+------+------+----------------------------+ mysql> EXPLAIN SELECT * FROM users WHERE userid > 100 ORDER BY userid DESC LIMIT 10; +---------+-------+---------------+---------+---------+------+------+----------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+-------+---------------+---------+---------+------+------+----------------------------+ | users | range | PRIMARY | PRIMARY | 2 | NULL | 527 | where used; Using filesort | +---------+-------+---------------+---------+---------+------+------+----------------------------+
i could almost understand the last one (which is the kind i want to use the index), but i still don't get why it won't use the index for `ORDER BY key DESC' when you have a range and it does for just `ORDER BY key'.
i guess there's not really an answer, but i wanted to see if anybody had anything to say. would it work in other databases?
oh BTW, kind of a related question. when you want to sort in the order rows were INSERTed, is it OK to use the AUTO_INCREMENT PRIMARY KEY since it should always be "in order?" or is that bad and you should ORDER BY a timestamp, etc.? on a message board, for example, the latest posts should always have the highest ID, so is it OK to ORDER BY that? seems all right to me, since "older" deleted IDs shouldn't be reused. let me know.- Matt** Ignore old signature for now... **
Dr.BB - Highly optimized to be 2-3x faster than the "Big 3."
"Do not enclose numeric values in quotes -- that is very non-standard and will only work on MySQL." - MattR
-
Jun 1, 2002, 02:48 #2
- Join Date
- Jul 2001
- Location
- Missouri
- Posts
- 3,428
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
just looking in the changelog and found this for 4.0.0:
ORDER BY ... DESC can now use keys.
and for 4.0.2:
Use index for ORDER BY in queries of type: SELECT * FROM t WHERE key_part1=1 ORDER BY key_part1 DESC,key_part2 DESC
so that appears to be the answer.why didn't they implement this in 3.23?
still want opinions on sorting by an AUTO_INCREMENT column, though.
-
Jun 1, 2002, 08:11 #3
- Join Date
- Jan 2001
- Location
- buried in the database shell (Washington, DC)
- Posts
- 1,107
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
The sorting works in other DBs just fine. Order by autoincrement is fine as long as you can't edit the datestamp.
Matt - Sybase DBA / PHP fanatic
Sybase/MySQL/Oracle | I don't like MySQL
Download Sybase | DBForums.com - for all your RDBMS talk
-
Jun 1, 2002, 15:12 #4
- Join Date
- Jul 2001
- Location
- Missouri
- Posts
- 3,428
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
i was hoping you'd reply.
Originally posted by MattR
Order by autoincrement is fine as long as you can't edit the datestamp.
-
Jun 1, 2002, 15:57 #5
- Join Date
- Jan 2001
- Location
- buried in the database shell (Washington, DC)
- Posts
- 1,107
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
LOL glad I could help. What Sybase and MS SQL server do depend on whether or not you declare your index as:
CREATE INDEX user_userid ON user( userid )
or
CREATE INDEX user_userid ON user( userid DESC )
For an ORDER BY DESC:
case 1)
The index is actually read in reverse. This isn't as efficient as the index being ordered decending and then being read sequentially forward.
case 2)
Index is decending, read from start to finish.
Sybase and MS SQL Server have the concept of a 'clustered' index (Oracle has an analagous structure called a grouped table). Basically it sorts your data rows by a particular key (or keys) so that the rows do not have to be sorted in memory.
It also improves heavily on BETWEEN clauses or grouping.
For instance, if you were designing a forum system in Syb/MS SQL/Oracle you would do something like this:
Code:CREATE TABLE thread( threadid INT IDENTITY, -- ident is same as auto_increment forumid INT NOT NULL REFERENCES forum( forumid ), lastpost INT NOT NULL ) CREATE CLUSTERED INDEX thread_forumid_lastpost ON ( forumid, lastpost DESC ) CREATE TABLE post( postid INT IDENTITY, posttext, etc. ) CREATE CLUSTERED INDEX post_threadid_postid ON post( threadid, postid )
If you are really interested in index binary tree layout you also save a level of binary tree since the clustered index leaf nodes are the data rows! So your indexes are 1 level smaller which is more efficient to search and store in memory.
The reason why I included postid in the post_threadid_postid index was that without it, there would have to be a sort (because new rows are not necessarily added to the end of the table) to ORDER BY postid ASC.Matt - Sybase DBA / PHP fanatic
Sybase/MySQL/Oracle | I don't like MySQL
Download Sybase | DBForums.com - for all your RDBMS talk
-
Jun 1, 2002, 16:40 #6
- Join Date
- Jul 2001
- Location
- Missouri
- Posts
- 3,428
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
cool. too bad MySQL doesn't support those clustered indexes.
just curious, doesn't that make updates to the table a bit slower since the whole data file has to be rearranged instead of just the index? even so, i guess it's offset by the faster read speed.
-
Jun 1, 2002, 17:19 #7
- Join Date
- Jan 2001
- Location
- buried in the database shell (Washington, DC)
- Posts
- 1,107
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Remember that in the b+ tree you can have many data elements in a node. So the only time the tree needs to adjust to an insert (an update on a clustered key is somewhat like a delete followed by an insert) is when the leaf node is full or the row is too big to fit on the remaining space.
If it is full, then yes there is a small performance hit as the single leaf node is split in two (called page splitting). Pointers are readjusted and life is good. So there are small drawbacks if you insert a lot and have wide rows (or really low rows per page -- which is a configurable option for an index).
I can't recall exactly how much I/O is required to do a page split; it of course depends on how large your leaf nodes are and how many rows there are on a page. The bigger the node and the more rows the more costly it will be to split, although if the rows are tiny (2K or less) it can read many of them in a big chunk and be less costly than you'd think. In short, the cost depends on your situation -- the motto here is know your data.
Typically you make primary keys things which do not change often (username, postid, threadid, etc.), although it does happen.Last edited by MattR; Jun 1, 2002 at 17:22.
Matt - Sybase DBA / PHP fanatic
Sybase/MySQL/Oracle | I don't like MySQL
Download Sybase | DBForums.com - for all your RDBMS talk
-
Jun 1, 2002, 17:25 #8
- Join Date
- Jan 2001
- Location
- buried in the database shell (Washington, DC)
- Posts
- 1,107
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
If you want to know a little bit about Sybase-centric (and MS SQL as well; as of MS 7 they haven't changed the way pages split and I think 2000 is the same as well) splitting works you can go here:
DocsMatt - Sybase DBA / PHP fanatic
Sybase/MySQL/Oracle | I don't like MySQL
Download Sybase | DBForums.com - for all your RDBMS talk
-
Jun 1, 2002, 21:50 #9
- Join Date
- Jul 2001
- Location
- Missouri
- Posts
- 3,428
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
thanks for the info once again. i appreciate it.
-
Jan 21, 2005, 06:11 #10
- Join Date
- Mar 2001
- Posts
- 249
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Does this help with sequential sorting on disk?
http://dev.mysql.com/doc/mysql/en/Index_preloading.html
Bookmarks