SitePoint Sponsor

User Tag List

Results 1 to 19 of 19
  1. #1
    SitePoint Wizard
    Join Date
    Oct 2004
    Location
    Newport Beach
    Posts
    1,742
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Why Scan All 38K Rows When I Limit 4?

    I've been using EXPLAIN to better do my queries and spotted this:

    PHP Code:
    EXPLAIN SELECT films.film_titletrailers.trailer_title 
    FROM films
    ,trailers 
    WHERE trailers
    .film_id=films.film_id 
    ORDER BY trailers
    .trailer_id DESC 
    LIMIT 4 
    The result I get is 48K rows must get scanned in the 'trailers' table. Why can't it just do the 4 and stop?

    Thanks
    Ryan
    Upcoming Movies - Movie News. Updated Daily.
    Movie Trailers - Awesome trailer site. Nuff said.

  2. #2
    Programming Since 1978 silver trophybronze trophy felgall's Avatar
    Join Date
    Sep 2005
    Location
    Sydney, NSW, Australia
    Posts
    16,608
    Mentioned
    24 Post(s)
    Tagged
    1 Thread(s)
    Are you missing an index on trailer_id?
    Stephen J Chapman

    javascriptexample.net, Book Reviews, follow me on Twitter
    HTML Help, CSS Help, JavaScript Help, PHP/mySQL Help, blog
    <input name="html5" type="text" required pattern="^$">

  3. #3
    SitePoint Addict svcghost's Avatar
    Join Date
    Oct 2010
    Posts
    288
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Try writing
    Code:
    EXPLAIN SELECT films.film_title, trailers.trailer_title 
    FROM films,trailers 
    WHERE trailers.film_id=films.film_id 
    ORDER BY trailers.trailer_id DESC 
    LIMIT 0, 4 
    Doing so means you are starting at record 0, and grabbing 4 records starting there.
    This should pull records 0, 1, 2, and 3.

  4. #4
    Utopia, Inc. silver trophy
    ScallioXTX's Avatar
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    8,907
    Mentioned
    139 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by casbboy View Post
    The result I get is 48K rows must get scanned in the 'trailers' table. Why can't it just do the 4 and stop?
    Because you have

    Code:
    ORDER BY trailers.trailer_id DESC
    MySQL can't know what the 4 highest numbers are without looking through all the numbers.
    Rémon - Hosting Advisor

    Minimal Bookmarks Tree
    My Google Chrome extension: browsing bookmarks made easy

  5. #5
    SitePoint Addict svcghost's Avatar
    Join Date
    Oct 2010
    Posts
    288
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yeah. Don't you know how many iterations it has to go through to sort stuff? Definitely 48,000 worth if you have a lot of records! Good spot there Scallio

  6. #6
    SitePoint Wizard
    Join Date
    Oct 2004
    Location
    Newport Beach
    Posts
    1,742
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    1 SIMPLE trailers index film_id PRIMARY 4 NULL 45858
    1 SIMPLE films eq_ref PRIMARY PRIMARY 4 trailera_trailers.trailers.film_id 1

    trailer_id is a Primary, which automatically makes it an index, correct?

    So there is no way to limit the scan if I have a limit and order by the primary column desc?

    I know 45K rows isn't exactly horrid in terms of other explains I've seen, but this query is run about 500K times per day, so limiting the scan seems beneficial.

    Cheers
    Ryan
    Upcoming Movies - Movie News. Updated Daily.
    Movie Trailers - Awesome trailer site. Nuff said.

  7. #7
    Non-Member
    Join Date
    Apr 2004
    Location
    Miami, FL, USA
    Posts
    449
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You can limit the scan with a subquery (as I posted in your other thread). Your best bet is to execute this query in two phases.... the first does:

    SELECT MAX(t.id) AS maxid FROM trailers AS t;

    The second subqueries the trailers table and selects "WHERE id > {$maxid} - 10" to limit number of records hashed against your other table.
    Code:
    SELECT f.film_title, tr.trailer_title 
    FROM films AS f
    INNER JOIN (
        SELECT * FROM trailers WHERE id > {$maxid} - 10
    AS tr ON tr.film_id = f.film_id 
    ORDER BY tr.trailer_id DESC 
    LIMIT 0, 4
    10 should give you enough padding to not worry about deleted records or skipped ids.

  8. #8
    SitePoint Wizard
    Join Date
    Oct 2004
    Location
    Newport Beach
    Posts
    1,742
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Nice. I added it and now have:

    PHP Code:
    SELECT trailers.trailer_titlefilms.film_title 
    FROM films
    ,trailers 
    WHERE trailers
    .trailer_id $maxtrailerid 10 
    AND trailers.film_id=films.film_id 
    ORDER BY trailers
    .trailer_id DESC 
    LIMIT 4 
    Doing an Explain, that limited the rows scanned to 64.

    Ryan
    Upcoming Movies - Movie News. Updated Daily.
    Movie Trailers - Awesome trailer site. Nuff said.

  9. #9
    Non-Member
    Join Date
    Apr 2004
    Location
    Miami, FL, USA
    Posts
    449
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Cheers.

  10. #10
    SitePoint Wizard
    Join Date
    Oct 2004
    Location
    Newport Beach
    Posts
    1,742
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    yeah,

    Thanks mucho. I'm trying to see if I can apply a similar solution to my other thread. But looks doubtful.

    Cheers
    Ryan
    Upcoming Movies - Movie News. Updated Daily.
    Movie Trailers - Awesome trailer site. Nuff said.

  11. #11
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by transio View Post
    The second subqueries the trailers table and selects "WHERE id > {$maxid} - 10" to limit number of records hashed against your other table.
    nice technique
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #12
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    927
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by casbboy View Post
    I've been using EXPLAIN to better do my queries and spotted this:

    PHP Code:
    EXPLAIN SELECT films.film_titletrailers.trailer_title 
    FROM films
    ,trailers 
    WHERE trailers
    .film_id=films.film_id 
    ORDER BY trailers
    .trailer_id DESC 
    LIMIT 4 
    The result I get is 48K rows must get scanned in the 'trailers' table. Why can't it just do the 4 and stop?
    I had a similar problem after my web host upgraded from mysql 5.0.91 to 5.5.8. I had a subquery which ended in
    Code:
    ORDER BY primary_key DESC LIMIT 1
    and mysql 5.5 did the full table scan. The solution was to explicitely define which index should be used with
    Code:
    USE INDEX(primary_key)
    . I've heard mysql 5.5 has huge performance improvements but in this case its optimizer turns out to be really dumb!

  13. #13
    SitePoint Wizard
    Join Date
    Oct 2004
    Location
    Newport Beach
    Posts
    1,742
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    How do you add USE to the query? Never done so. Restricted to a version?

    Cheers
    Ryan
    Upcoming Movies - Movie News. Updated Daily.
    Movie Trailers - Awesome trailer site. Nuff said.

  14. #14
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    927
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    You put it right after SELECT ... FROM table. See here. As far as I know its been in mysql for quite some time, at least since 4.1.

  15. #15
    SitePoint Wizard
    Join Date
    Oct 2004
    Location
    Newport Beach
    Posts
    1,742
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'm trying it, but getting an error:

    PHP Code:
    EXPLAIN SELECT films.film_titletrailers.trailer_title
    FROM films
    trailers
    USE INDEX (trailers.trailer_id)
    WHERE trailers.film_id films.film_id
    ORDER BY trailers
    .trailer_id DESC
    LIMIT 4 
    Me confused?

    Ryan
    Upcoming Movies - Movie News. Updated Daily.
    Movie Trailers - Awesome trailer site. Nuff said.

  16. #16
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    927
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    You cannot use table. name in USE INDEX, only the index name (not column name!). Read the first comment on the documentation page on syntax for multiple table queries.

  17. #17
    SitePoint Wizard
    Join Date
    Oct 2004
    Location
    Newport Beach
    Posts
    1,742
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ah.

    Okay, got it to work. But same number rows scanned.

    I'm going to see if I can make this work with other queries hopefully.

    Cheers
    Ryan
    Upcoming Movies - Movie News. Updated Daily.
    Movie Trailers - Awesome trailer site. Nuff said.

  18. #18
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    927
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Yeah, each query is unique so there's no guarantee this method will work in your case. But it's worth trying USE INDEX/FORCE INDEX with various syntaxes. You could try using INNER JOIN - then you could get rid of the WHERE clause - who knows, maybe the optimizer is misled by the WHERE?

    Alternatively, you could fetch the ids of last 4 trailers in a separate query, the index should be guaranteed (I hope...!) to be used:
    Code:
    SELECT trailer_id
    FROM trailers
    ORDER BY trailer_id DESC
    LIMIT 4
    If this is successful you can use IN() clause in the second query without the LIMIT. How optimal that is - I don't know

  19. #19
    SitePoint Wizard
    Join Date
    Oct 2004
    Location
    Newport Beach
    Posts
    1,742
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yeah. I have considered cutting to two queries. The only big catch is I use ORDER BY a ton, and that seams to be the culprit in plenty of my cases.

    Working on it.

    Cheers
    Ryan
    Upcoming Movies - Movie News. Updated Daily.
    Movie Trailers - Awesome trailer site. Nuff said.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •