SitePoint Sponsor

User Tag List

Results 1 to 4 of 4
  1. #1
    SitePoint Member
    Join Date
    Feb 2012
    Location
    Lawrence, MA
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Using GROUP BY and MAX in SQL

    hi folks - I was hoping someone (Rudy?) could help me understand what I'm missing here; as I mentioned to Rudy earlier I know enough about SQL and databases in general to be dangerous.

    I have two tables: customers & visits; one customer can have multiple visits; what I'm trying to come up with is a query that will provide me with the latest visit per customer:


    Code:
    mysql> select id, name from customers;
     +-----+-----------------+
     | id  | name            |
     +-----+-----------------+
     | 101 | AAA Electronics |
     | 102 | BBB Electronics |
     | 103 | CCC Electronics |
     | 104 | DDD Electronics |
     +-----+-----------------+
     4 rows in set (0.00 sec)

    Code:
    mysql> select id, date, customer_id from visits order by customer_id;
     +----+---------------------+-------------+
     | id | date                | customer_id |
     +----+---------------------+-------------+
     |  1 | 2012-02-20 00:00:00 |         101 |  <----
     |  2 | 2012-02-01 00:00:00 |         101 |
     |  3 | 2012-02-03 00:00:00 |         101 |
     |  4 | 2012-02-05 00:00:00 |         101 |
     |  5 | 2012-01-07 00:00:00 |         102 |
     |  6 | 2012-01-08 00:00:00 |         102 |
     |  7 | 2012-01-09 00:00:00 |         102 |
     |  8 | 2012-01-25 00:00:00 |         102 |  <----
     |  9 | 2011-12-01 00:00:00 |         103 |
     | 10 | 2011-12-02 00:00:00 |         103 |
     | 11 | 2011-12-25 00:00:00 |         103 |  <----
     | 12 | 2011-12-03 00:00:00 |         103 |
     | 13 | 2011-12-04 00:00:00 |         103 |
     +----+---------------------+-------------+
     13 rows in set (0.00 sec)
    Code:
    mysql>  SELECT c.name, c.id customer_id, v.id visit_id, max(v.date) visit_date
         ->  FROM customers c join VISITS v ON v.customer_id = c.id
         -> GROUP BY c.name;
     +-----------------+-------------+----------+---------------------+
     | name            | customer_id | visit_id | visit_date          |
     +-----------------+-------------+----------+---------------------+
     | AAA Electronics |         101 |        1 | 2012-02-20 00:00:00 |
     | BBB Electronics |         102 |        5 | 2012-01-25 00:00:00 |
     | CCC Electronics |         103 |        9 | 2011-12-25 00:00:00 |
     +-----------------+-------------+----------+---------------------+
     3 rows in set (0.00 sec)

    I seem to be almost there, except that the visit_id isn't getting grouped correctly.

    what I need for the query to produce is this:

    Code:
     +-----------------+-------------+----------+---------------------+
     | name            | customer_id | visit_id | visit_date          |
     +-----------------+-------------+----------+---------------------+
     | AAA Electronics |         101 |        1 | 2012-02-20 00:00:00 |
     | BBB Electronics |         102 |        8 | 2012-01-25 00:00:00 |
     | CCC Electronics |         103 |       11 | 2011-12-25 00:00:00 |
     +-----------------+-------------+----------+---------------------+
     3 rows in set (0.00 sec)

    I'm starting to think I might need two queries with a UNION in there somewhere... thanks.

    -Len

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,021
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by lendalberti View Post
    I seem to be almost there, except that the visit_id isn't getting grouped correctly.
    i seriously question why you would need the visit_id

    but let's pretend there is another column in the visits table, something that might realistically be required, but let's call it "foo"

    so you want the customers along with their latest visit's foo
    Code:
    SELECT customers.id
         , customers.name
         , visits.date
         , visits.foo
      FROM customers
    LEFT OUTER
      JOIN ( SELECT customer_id
                  , MAX(date) AS max_date
               FROM visits
             GROUP
                 BY customer_id ) AS latest
        ON latest.customer_id = customers.id
    LEFT OUTER
      JOIN visits
        ON visits.customer_id = latest.id
       AND visits.date = latest.max_date
    the "latest" subquery determines the latest date for each customer

    then you join only those rows from the visits table to pull the corresponding foo

    notice the join conditions on the last join
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Member
    Join Date
    Feb 2012
    Location
    Lawrence, MA
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    thanks - I actually understood that, so I must be making progress...

    btw - had to fix part of the last join from:

    Code:
    ON visits.customer_id = latest.id
    to:

    Code:
    ON visits.customer_id = latest.customer_id

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,021
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by lendalberti View Post
    btw - had to fix part of the last join
    well spotted!

    ah, the perils of copy/paste ...

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •