SitePoint Sponsor |
|
User Tag List
Results 1 to 8 of 8
Thread: TABLE SCAN and LIMIT
-
May 3, 2008, 02:25 #1
- Join Date
- Mar 2005
- Location
- India
- Posts
- 280
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
TABLE SCAN and LIMIT
I have a table with the following structure
user_id (int PRIMARY AUTO INCREMENT)
username (varchar)
status (tinyint) - possible values 1, 2 and 3
INDEXES are PRIMARY(user_id) and status(status) and table has nearly 100000 records.
Now most of the queries that we run fetches user records on the basis of their status (1, 2 or 3)
E.g. SELECT * FROM users WHERE status='2' ORDER BY user_id LIMIT 10
Now records associated with every status (1, 2 or 3) are nearly one third of 100000.
Now since we have index on status and user_id, when I run this query with EXPLAIN, it shows me the following data
Now since I have mentioned LIMIT 10 with ORDER BY user_id, why is it scanning all the records with status='2'? When ORDER BY column i.e. user_id (PRIMARY) is already indexed, it should have sorted the first 10 rows and returned the data.
At the moment the SELECTs have become slow and I am afraid that it will get worse with more records.
So what should I do with INDEXES so that it only scans 10 rows or whatever I mention with LIMIT.
If I use USE INDEX (PRIMARY) in the query, it removes "Using filesort" however the number of rows that getting scanned increases to 100000.
Any suggestions on how to optimize this particular table?
Thanks.Last edited by viv5; May 3, 2008 at 06:14.
-
May 3, 2008, 03:36 #2
- Join Date
- May 2008
- Posts
- 15
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
What do you really want the ORDER BY clause to do?
You want it to sort all of your users, -and give you then 10 first right?
If so, - then it will need to sort all, nut just 10
-
May 3, 2008, 04:34 #3
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
try an index on (status,user_id), and drop the index on status alone
-
May 3, 2008, 06:29 #4
- Join Date
- Mar 2005
- Location
- India
- Posts
- 280
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Thanks for your comments.
I did try the same query after putting an index on (status, user_id) and by dropping an index on status alone however the results are just the same.
It's scanning 32222 rows in order to display just 10.
Does it work like this only? I mean soon the number of records in my database would become 5 times of what they are now, so it means this query would be scanning 5 * 32222 records at that time. This would make the process very very slow.
If I remove the ORDER BY statement from the query, still the table is scanning all the records (32222) with status=’2’ in order to display just 10.
Why is it so? Is there any way of optimizing this?
-
May 4, 2008, 02:31 #5
- Join Date
- Mar 2005
- Location
- India
- Posts
- 280
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Any Ideas ???
-
May 4, 2008, 05:15 #6
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
possible keys: status_user_id
key: status_user_id
i am not an optimization specialist but it looks to me like it did use the index
-
May 4, 2008, 19:32 #7
- Join Date
- Feb 2004
- Location
- Tampa, FL (US)
- Posts
- 9,854
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
EXPLAIN does not "explain" LIMIT optimizations very well. so it's very likely that only 10 rows are ever examined but EXPLAIN will never say so. see http://www.mysqlperformanceblog.com/...-optimization/
Check out our new Industry News forum!
Keep up-to-date with the latest SP news in the Community Crier
I edit the SitePoint Podcast
-
May 5, 2008, 02:25 #8
- Join Date
- Mar 2005
- Location
- India
- Posts
- 280
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Thank you guys for your comments. I think longneck is correct as EXPLAIN statement may be giving wrong information for rows scanned in this case.
This is what they've mentioned in their article:
One more note about ORDER BY ... LIMIT is - it provides scary explain statements and may end up in slow query log as query which does not use indexes, even if it is quite fast:
Bookmarks