SitePoint Sponsor

User Tag List

Results 1 to 7 of 7
  1. #1
    SitePoint Guru aamonkey's Avatar
    Join Date
    Sep 2004
    Location
    kansas
    Posts
    953
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    IN() too slow, need help rewriting

    I have the following query:
    Code MySQL:
    SELECT COUNT(VI.id)
    FROM Client_Voucher_Items VI
    LEFT JOIN Client_Vouchers V
        ON VI.voucherId = V.id
    LEFT JOIN Clients C
        ON C.id = V.clientId
    LEFT JOIN Items I
        ON I.id = VI.itemId
    WHERE I.description = 'Glasses'
    AND V.dateVoided IS NULL
    AND V.dateCreated >= '2010-01-01'
    AND V.dateCreated <=  '2010-01-31'
    AND C.id IN (
        SELECT DISTINCT(V.clientId)
        FROM Client_Vouchers V, Clients C
        WHERE C.id = V.clientId
        AND DATE_FORMAT(V.dateCreated, '%Y') - DATE_FORMAT(C.dob, '%Y') - (DATE_FORMAT(V.dateCreated, '00-%m-%d') < DATE_FORMAT(C.dob, '00-%m-%d')) >=  60
        AND V.dateCreated >= '2010-01-01'
        AND V.dateCreated <=  '2010-01-31'
        AND V.dateVoided IS NULL
    )
    Because of the IN() clause (and mysql's horrible optimization of it) this query is taking 10+ seconds to run. I am not proficient enough with sql to rewrite this without the IN() clause. I can post the table structure if it's not clear.

    thank you for any help
    aaron-fisher.com - PHP articles and more

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,262
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    is it possible to have a client voucher item for a client voucher that doesn't exist?

    is it possible to have a client voucher for a client that doesn't exist?

    is it possible to have a client voucher item for an item that doesn't exist?

    i'm going to guess that the answer to all of these questions is no, so your LEFT OUTER JOINs should be rewritten as INNER JOINs

    does that speed things up a little bit?

    we can tackle the IN problem next...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Guru aamonkey's Avatar
    Join Date
    Sep 2004
    Location
    kansas
    Posts
    953
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks r937 - that did shave a second or two off the query - so now I have this:

    Code MySQL:
    SELECT COUNT(VI.id)
    FROM Client_Voucher_Items VI
    INNER JOIN Client_Vouchers V
        ON VI.voucherId = V.id
    INNER JOIN Clients C
        ON C.id = V.clientId
    INNER JOIN Items I
        ON I.id = VI.itemId
    WHERE I.description = 'Glasses'
    AND V.dateVoided IS NULL
    AND V.dateCreated >= '2010-01-01'
    AND V.dateCreated <=  '2010-01-31'
    AND C.id IN (
        SELECT DISTINCT(V.clientId)
        FROM Client_Vouchers V, Clients C
        WHERE C.id = V.clientId
        AND DATE_FORMAT(V.dateCreated, '%Y') - DATE_FORMAT(C.dob, '%Y') - (DATE_FORMAT(V.dateCreated, '00-%m-%d') < DATE_FORMAT(C.dob, '00-%m-%d')) >=  60
        AND V.dateCreated >= '2010-01-01'
        AND V.dateCreated <=  '2010-01-31'
        AND V.dateVoided IS NULL
    )
    aaron-fisher.com - PHP articles and more

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,262
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    okay, we're in great shape

    before rewriting the subquery as a join, i just want to take a moment and confirm that the subquery is returning something different from the main query

    in the main query, you're looking specifically for glasses

    but in the subquery, you seem to be looking for any client of age 60+ who has ~any~ voucher, regardless of whether that voucher contains glasses or not

    is this correct?

    because if it isn't, it sure looks like those WHERE conditions in the subquery could simply be added to the WHERE clause of the outer query
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Guru aamonkey's Avatar
    Join Date
    Sep 2004
    Location
    kansas
    Posts
    953
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I get what you're saying Rudy. The query is supposed to pull the unduplicated number of people over 60 that had vouchers for glasses in a date range, and I see now that I had it written incorrectly.

    I rewrote this as:

    Code MySQL:
    SELECT COUNT(DISTINCT(V.clientId))
    FROM ClientManagement_Client_Voucher_Items VI
    INNER JOIN ClientManagement_Client_Vouchers V
        ON VI.voucherId = V.id
    INNER JOIN ClientManagement_Clients C
        ON C.id = V.clientId
    INNER JOIN SystemSetup_Items I
        ON I.id = VI.itemId
    WHERE I.description = 'Glasses'
    AND V.dateVoided IS NULL
    AND V.dateCreated >= '2010-01-01'
    AND V.dateCreated <=  '2010-12-31'
    AND DATE_FORMAT(V.dateCreated, '%Y') - DATE_FORMAT(C.dob, '%Y') - (DATE_FORMAT(V.dateCreated, '00-%m-%d') < DATE_FORMAT(C.dob, '00-%m-%d')) >=  60

    Does that look right to you? It's yielding slightly different numbers, but I think it's right and what I had previously was wrong (because I was counting voucher items instead of distinct clients).

    Side note - this is one of about 50 subqueries in a massive annual report I'm working on....I'm trying to find the bottlenecks in it and I believe I've been staring at it wayyyyy too long.
    aaron-fisher.com - PHP articles and more

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,262
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by aamonkey View Post
    Does that look right to you?
    certainly looks a lot simpler than what you had before

    on a minor note, i try to write my FROM clause in a logical sequence, one that mirrors how i hope/assume the optimizer will approach it -- starting with the essential or most restrictive table, and joining from there (so as to minimize the number of rows that it retrieves only to throw away)

    since your whole query is about glasses, this is how i would write it...
    Code:
    SELECT COUNT( DISTINCT V.clientId )
      FROM SystemSetup_Items I
    INNER
      JOIN ClientManagement_Client_Voucher_Items VI
        ON VI.itemId = I.id 
    INNER 
      JOIN ClientManagement_Client_Vouchers V    
        ON V.id = VI.voucherId
       AND V.dateVoided IS NULL
       AND V.dateCreated >= '2010-01-01'
       AND V.dateCreated <=  '2010-12-31'
    INNER 
      JOIN ClientManagement_Clients C    
        ON C.id = V.clientId
       AND DATE_FORMAT(V.dateCreated, '%Y') - 
           DATE_FORMAT(C.dob, '%Y') - 
             (DATE_FORMAT(V.dateCreated, '00-%m-%d') < 
              DATE_FORMAT(C.dob, '00-%m-%d')) >=  60
     WHERE I.description = 'Glasses'
    note also that DISTINCT is ~not~ a function, and whatever comes after it does not need to be placed in parentheses
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Guru aamonkey's Avatar
    Join Date
    Sep 2004
    Location
    kansas
    Posts
    953
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I just learned a bunch of stuff

    Thanks for the help and the 2nd pair of eyes Rudy - I appreciate it
    aaron-fisher.com - PHP articles and more


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •