SitePoint Sponsor

User Tag List

Results 1 to 5 of 5
  1. #1
    SitePoint Zealot
    Join Date
    Jan 2006
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    outer join query

    I've developed the following query to show the views and purchases of the products we sell. However, we just implemented the view tracking table so any dates prior to this week don't have any view data at all. This query works fine as long as there are page views for the products during that period. However, since the older data doesn't have views data, no data at all is being returned for older queries. How do I return NULL for page views when there are none during that period but there are sales? I assume the problem is due to the ON clause in the left outer join, but I'm not sure how else to do it.

    Code:
    SELECT 
       SUM( op.products_quantity ) AS num_sold, 
       SUM( op.final_price * op.products_quantity ) AS total_sales, 
       p.products_model, pd.products_name, 
       p.products_id,
       visits.visits
    FROM orders_products AS op
       JOIN products AS p
       JOIN products_description AS pd
       JOIN orders as o
       LEFT OUTER JOIN (
          SELECT pv.products_id, count( pv.ip_addr ) as visits
             FROM products_visits AS pv
             WHERE pv.timestamp
             BETWEEN '".$start_date."'
             AND '".$end_date."'
             GROUP BY products_id
       ) visits ON visits.products_id=p.products_id
       WHERE p.products_id = op.products_id
       AND p.products_id = pd.products_id
       AND op.orders_id = o.orders_id
       AND visits.products_id = p.products_id
       AND o.date_purchased BETWEEN '".$start_date."' AND '".$end_date."'
       GROUP BY p.products_id
       ORDER BY total_sales DESC

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by whitemank View Post
    I assume the problem is due to the ON clause in the left outer join
    nope

    the problem is that you have a condition in the WHERE clause which "negates" the left outer join

    your ON clause says
    Code:
    ON visits.products_id=p.products_id
    if this were all you did, it would be fine, and you would return rows from the p table which had no matching rows in the visits table, and the columns from the visits table would be NULL in those unmatched rows

    however, you've sabotaged your outer join with this --
    Code:
       WHERE ...
       AND visits.products_id = p.products_id
    this ~forces~ the visits rows to have a non-NULL value, i.e. to match the p table

    in effect, this changes the outer join to an inner join, because trhe unmatched rows are filtered out by your WHERE clause

    make sense?

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Zealot
    Join Date
    Jan 2006
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks! Yes, I forgot to take the where clause out when I was playing with the query.

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    you need to put the other join conditions into their respective ON clauses as well

    here's your re-written query...
    Code:
    SELECT SUM( op.products_quantity ) AS num_sold
         , SUM( op.final_price * op.products_quantity ) AS total_sales
         , p.products_model
         , pd.products_name
         , p.products_id
         , visits.visits
      FROM orders as o
    INNER
      JOIN orders_products AS op
        ON op.orders_id = o.orders_id
    INNER
      JOIN products AS p
        ON p.products_id = op.products_id
    INNER
      JOIN products_description AS pd
        ON pd.products_id = p.products_id
    LEFT OUTER 
      JOIN ( SELECT products_id
                  , COUNT(*) AS visits
               FROM products_visits
              WHERE timestamp BETWEEN '".$start_date."'
                                  AND '".$end_date."'
             GROUP 
                 BY products_id ) AS visits 
        ON visits.products_id = p.products_id
     WHERE o.date_purchased BETWEEN '".$start_date."' 
                                AND '".$end_date."'
    GROUP 
        BY p.products_id
    ORDER 
        BY total_sales DESC
    note carefully the sequence of tables in the FROM clause, and notice that the only WHERE condition applies to the first table mentioned in the FROM clause -- in effect, this is what "drives" the query, i.e. this is the most restricted table so i always write it first, so that the other tables are joined only to the rows you're interested in

    the way you had it, with unqualified joins missing their ON conditions, you'd get a humoungous cross join, after which the various conditions in your WHERE clause would start throwing most joined rows away

    make sense?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Zealot
    Join Date
    Jan 2006
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes, that makes a lot of sense. Thanks for your help.


Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •