SitePoint Sponsor

User Tag List

Results 1 to 8 of 8
  1. #1
    SitePoint Addict
    Join Date
    Mar 2005
    Location
    India
    Posts
    280
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    TABLE SCAN and LIMIT

    I have a table with the following structure

    user_id (int PRIMARY AUTO INCREMENT)
    username (varchar)
    status (tinyint) - possible values 1, 2 and 3

    INDEXES are PRIMARY(user_id) and status(status) and table has nearly 100000 records.

    Now most of the queries that we run fetches user records on the basis of their status (1, 2 or 3)

    E.g. SELECT * FROM users WHERE status='2' ORDER BY user_id LIMIT 10

    Now records associated with every status (1, 2 or 3) are nearly one third of 100000.

    Now since we have index on status and user_id, when I run this query with EXPLAIN, it shows me the following data



    Now since I have mentioned LIMIT 10 with ORDER BY user_id, why is it scanning all the records with status='2'? When ORDER BY column i.e. user_id (PRIMARY) is already indexed, it should have sorted the first 10 rows and returned the data.

    At the moment the SELECTs have become slow and I am afraid that it will get worse with more records.

    So what should I do with INDEXES so that it only scans 10 rows or whatever I mention with LIMIT.

    If I use USE INDEX (PRIMARY) in the query, it removes "Using filesort" however the number of rows that getting scanned increases to 100000.



    Any suggestions on how to optimize this particular table?


    Thanks.
    Last edited by viv5; May 3, 2008 at 06:14.

  2. #2
    SitePoint Member
    Join Date
    May 2008
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    What do you really want the ORDER BY clause to do?

    You want it to sort all of your users, -and give you then 10 first right?

    If so, - then it will need to sort all, nut just 10

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    try an index on (status,user_id), and drop the index on status alone
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SitePoint Addict
    Join Date
    Mar 2005
    Location
    India
    Posts
    280
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    try an index on (status,user_id), and drop the index on status alone
    Thanks for your comments.

    I did try the same query after putting an index on (status, user_id) and by dropping an index on status alone however the results are just the same.



    It's scanning 32222 rows in order to display just 10.

    Does it work like this only? I mean soon the number of records in my database would become 5 times of what they are now, so it means this query would be scanning 5 * 32222 records at that time. This would make the process very very slow.

    Quote Originally Posted by dadane View Post
    You want it to sort all of your users, -and give you then 10 first right?
    If I remove the ORDER BY statement from the query, still the table is scanning all the records (32222) with status=’2’ in order to display just 10.

    Why is it so? Is there any way of optimizing this?

  5. #5
    SitePoint Addict
    Join Date
    Mar 2005
    Location
    India
    Posts
    280
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Any Ideas ???

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    possible keys: status_user_id
    key: status_user_id

    i am not an optimization specialist but it looks to me like it did use the index
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    EXPLAIN does not "explain" LIMIT optimizations very well. so it's very likely that only 10 rows are ever examined but EXPLAIN will never say so. see http://www.mysqlperformanceblog.com/...-optimization/
    Check out our new Industry News forum!
    Keep up-to-date with the latest SP news in the Community Crier

    I edit the SitePoint Podcast

  8. #8
    SitePoint Addict
    Join Date
    Mar 2005
    Location
    India
    Posts
    280
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by longneck View Post
    EXPLAIN does not "explain" LIMIT optimizations very well. so it's very likely that only 10 rows are ever examined but EXPLAIN will never say so. see http://www.mysqlperformanceblog.com/...-optimization/
    Thank you guys for your comments. I think longneck is correct as EXPLAIN statement may be giving wrong information for rows scanned in this case.

    This is what they've mentioned in their article:
    One more note about ORDER BY ... LIMIT is - it provides scary explain statements and may end up in slow query log as query which does not use indexes, even if it is quite fast:


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •