Using Date Range and getting Totals

I’m using the query below is a report that runs incredibly slow (due to the subselect I’m sure). I’ve tried splitting this into 2 separate queries where I find the value for num_orders inside my application loop, however it runs just about as slow. The problem here is my report uses a date range, however within each result, I need to know if a customer had more than 1 total orders where recurring_order = ‘1’ and rec_denied = ‘0’ (which tells the user if they are a NEW recurring order customer or not) outside the date range.

SELECT c.cust_id,
(
    SELECT COUNT(o1.order_customer_id)
    FROM orders AS o1
    LEFT JOIN order_products AS op1 ON o1.order_id = op1.order_id AND op1.recurring_order = '1' AND op1.rec_denied = '0'
    WHERE o1.order_customer_id = c.cust_id
    GROUP BY o1.order_customer_id
) AS num_orders
FROM customers AS c
LEFT JOIN orders AS o ON c.cust_id = o.order_customer_id
LEFT JOIN order_products AS op ON o.order_id = op.order_id AND op.recurring_order = '1' AND op.rec_denied = '0'
WHERE o.order_created_date >= '2010-10-01' AND
    o.order_created_date <= '2010-10-30'
GROUP BY c.cust_id

Any suggestions on optimizing this?

Couple things:

  1. Use JOINs instead of LEFT joins since you’re restricting the values on the order_products table anyways.
  2. Make sure you have indexes defined on the fields you are querying on:[LIST=1]
  3. order_products.recurringorder
  4. order_products.rec_denied
  5. order_products.order_id
  6. orders.order_id
  7. orders.order_customer_id
  8. orders.order_created_date
  9. customers.cust_id

[/LIST]

Ok, I have added indexes and changed LEFT JOINs to JOINs. Is there anything I can do to restrict the number of records found in the subquery. I really only need to know if a customer has 1 order or if they have 2 or more, so if they have 2, I don’t need to know if they have 5 or 500, 2 or more = NOT a new customer. Not sure if that well help at all, but just curious.

you said “more than 1 recurring order … outside the date range

more specific? before the date range? after? either before or after?

in any case, it’s the “more than one” part that means you would have to do a COUNT

if it were “at least 1” instead of “more than one” then you could use EXISTS which is faster

note that with the current subquery, you are not restricting the count to “outside the range” as you should