I’ve developed the following query to show the views and purchases of the products we sell. However, we just implemented the view tracking table so any dates prior to this week don’t have any view data at all. This query works fine as long as there are page views for the products during that period. However, since the older data doesn’t have views data, no data at all is being returned for older queries. How do I return NULL for page views when there are none during that period but there are sales? I assume the problem is due to the ON clause in the left outer join, but I’m not sure how else to do it.
SELECT
SUM( op.products_quantity ) AS num_sold,
SUM( op.final_price * op.products_quantity ) AS total_sales,
p.products_model, pd.products_name,
p.products_id,
visits.visits
FROM orders_products AS op
JOIN products AS p
JOIN products_description AS pd
JOIN orders as o
LEFT OUTER JOIN (
SELECT pv.products_id, count( pv.ip_addr ) as visits
FROM products_visits AS pv
WHERE pv.timestamp
BETWEEN '".$start_date."'
AND '".$end_date."'
GROUP BY products_id
) visits ON visits.products_id=p.products_id
WHERE p.products_id = op.products_id
AND p.products_id = pd.products_id
AND op.orders_id = o.orders_id
AND visits.products_id = p.products_id
AND o.date_purchased BETWEEN '".$start_date."' AND '".$end_date."'
GROUP BY p.products_id
ORDER BY total_sales DESC
the problem is that you have a condition in the WHERE clause which “negates” the left outer join
your ON clause says
ON visits.products_id=p.products_id
if this were all you did, it would be fine, and you would return rows from the p table which had no matching rows in the visits table, and the columns from the visits table would be NULL in those unmatched rows
however, you’ve sabotaged your outer join with this –
WHERE ...
AND visits.products_id = p.products_id
this ~forces~ the visits rows to have a non-NULL value, i.e. to match the p table
in effect, this changes the outer join to an inner join, because trhe unmatched rows are filtered out by your WHERE clause
you need to put the other join conditions into their respective ON clauses as well
here’s your re-written query…
SELECT SUM( op.products_quantity ) AS num_sold
, SUM( op.final_price * op.products_quantity ) AS total_sales
, p.products_model
, pd.products_name
, p.products_id
, visits.visits
FROM orders as o
INNER
JOIN orders_products AS op
ON op.orders_id = o.orders_id
INNER
JOIN products AS p
ON p.products_id = op.products_id
INNER
JOIN products_description AS pd
ON pd.products_id = p.products_id
LEFT OUTER
JOIN ( SELECT products_id
, COUNT(*) AS visits
FROM products_visits
WHERE timestamp BETWEEN '".$start_date."'
AND '".$end_date."'
GROUP
BY products_id ) AS visits
ON visits.products_id = p.products_id
WHERE o.date_purchased BETWEEN '".$start_date."'
AND '".$end_date."'
GROUP
BY p.products_id
ORDER
BY total_sales DESC
note carefully the sequence of tables in the FROM clause, and notice that the only WHERE condition applies to the first table mentioned in the FROM clause – in effect, this is what “drives” the query, i.e. this is the most restricted table so i always write it first, so that the other tables are joined only to the rows you’re interested in
the way you had it, with unqualified joins missing their ON conditions, you’d get a humoungous cross join, after which the various conditions in your WHERE clause would start throwing most joined rows away