Select * with distinct on one field

How can I select all fields with a distinct on only one field of that table? On top of this I’d like to sort so I have the most recent records.

For example (using a table from a previous post):

T1
id | datetime | userid
1 | 2007-03-03 20:20:20 | 1
1 | 2007-03-03 20:20:25 | 1
1 | 2007-03-03 20:20:30 | 1
1 | 2007-03-10 17:15:30 | 2
1 | 2007-03-10 17:15:45 | 2
2 | 2007-03-15 18:34:45 | 1

I would like distinct on field id only, so my result would be * from the last two rows. Sort would be on datetime so result is:

1 | 2007-03-10 17:15:45 | 2
2 | 2007-03-15 18:34:45 | 1

I’m trying to do something like: “Select distinct id (but retrieve *) sort by datetime asc”.

DISTINCT is the wrong solution, especially if you’re using the dreaded, evil “select star”

just to confirm, you want the latest row per id?

Maybe I read this wrong but why not just:


GROUP BY your_field
ORDER BY your_field DESC

[indent]When using this feature, all rows in each group should have the same values for the columns that are ommitted from the GROUP BY part. The server is free to return any value from the group, so the results are indeterminate unless all values are the same

http://dev.mysql.com/doc/refman/5.5/en/group-by-hidden-columns.html[/indent]
that’s why not :slight_smile:

Got you :slight_smile:

I agree about the dreaded *, I only use it for brevity… sorry about.

I did not know that about group by. A thought was to do something nested like:

$query1 = “SELECT distinct id from t1”;
$results1 = mysql_db_query(“mydb”,“$query1”);
while($row1 = mysql_fetch_array($results1)){
IF (isset($row1[‘id’]){
$query2 = “SELECT id, datetime, userid from t1 where id=‘$row1[‘id’]’ order by datetime asc”;
$results2 = mysql_db_query(“mydb”,“$query2”);
while($row = mysql_fetch_array($results2)){
// do something with my results…
}
}
}

but this is so inelegant and requires two database hits. I still might not get the latest record this way, either.

just to confirm, you want the latest row per id?

I think I have something that works when I add MAX():

$query = “SELECT MAX(datetime), id, userid FROM t1 GROUP BY id”;

also, I understand that SELECT must include whatever column is being GROUP BY.

And, yes, latest row per id. Your post came whicl I was posting the last.

so you want the latest/most recent row for each id?

here’s my attempt


select
  id
, date_time
, user_id
from table_name as t1
group by id
order by date_time DESC

If you get the earliest date to show rather than the most recent, use ASC instead of DESC

hth

bazz

ps, I did read post #4.

sorry, no, that’s problematic – the userid is a “hidden” column, see the link i posted

let me ask you this… do you really need the userid? if so, why?

you didn’t get it, then…

:slight_smile:


SELECT 
  id
, date_time
FROM table_name as t1
where date_time = (select max(date_time)
                          from table_name as t2
                          where t2.id = t1.id
                      )
group by id
order by date_time ASC

(not tested)

By my reckoning, that makes the rows returned the same (per id) and so the group by should work. ?

nice one, bazz, you are 98% home :slight_smile:

you’re using a correlated subquery, which works but can be inefficient (it’s a mysql peccadillo, unfortunately), and i would rewrite it with a join instead

you’ll want to remove the (incorrectly named) date column as the secondary sort column in the ORDER BY, since there isn’t more than one row per id, right? :wink:

and then you’ll want to add userid to the SELECT clause, which was the “hidden” column that needed to be seen :slight_smile:

So this be what the OP wants, then?


SELECT
  id
, date_time
, user_id
FROM table_name as t1
join (select id
           , max(date_time)
        from table_name as t2
        )
  on t1.id = t2.id
 and t1.date_time = t2.date_time
GROUP BY id
ORDER BY date_time ASC

your subquery wants a GROUP BY clause, and the t2 alias needs to apply to the subquery

I tested:
$query = “SELECT MAX(datetime), id, userid FROM t1 GROUP BY id”;

It seems to work correctly. I read the page you linked to but I didn’t understand the problem you mentioned. I’ll keep testing…

the operative word being seems

each id has several datetimes, right? so if you GROUP BY id, and use MAX(), you will get the greatest value for each id

however, userid is not in the GROUP BY, so its value can be ~any~ of the userid values for that particular id

it is ~not~ guaranteed to be the same userid value that happens to be on the row that has the max value of datetime

if you’re okay with showing a different userid than the one that has the max value of datetime, then fine

:slight_smile: