Select data from a joined table with the newest date

This is really a continuaton from this previous topic from a few years back:-

But I am updating the project and I’m having difficulty with the new version of this query.
Some background
As in this topic, I’m updating the database to use POINT type spatial data to store lat/lon coordiantes, instead of two FLOAT columns.
Previously I stored the current coorinates in the main seamarks table and recorded changes (moves) of the locations along with the date of the move in the changes table.
As part of the application update I’m thinking I can remove the location data from the main seamarks table and use the latest recorded location from the changes table to get the current location. This will avoid data duplication by only storing in one table and simplify the updating of data when things move. Perviously when marks are moved, I UPDATE the location in the seamarks table and INSERT a new record in changes with the new location and date, referencing the mark with an FK. This was done via a stored proceedure that run both queries. But by omitting the location data from seamarks I only need do the INSERT to changes and I have all the data I need recorded.

The problem
I’m now having difficulty with the query to select seamarks data (multiple rows) while joining the changes table to get the most recent location and date.
This is an updated variation of the query from the other topic:-

SELECT seamarks.id, seamarks.name, ST_Y(changes.loc) AS lat, ST_X(changes.loc) AS lon, MAX(changes.date) AS `date` FROM seamarks
LEFT JOIN changes ON seamarks.id = changes.mid
WHERE seamarks.active = '1' GROUP BY seamarks.id
ORDER BY `date` DESC

The ORDER BY is only there to get the most recent at the top for clarity.
It gives a set of results like I expect, but on closer examination, although I see the most recent date, the postion data is not the most recent.
Here is a result I see:-


Now compare with a straight select from changes:-

We see the first result, a mark with the ID 126 has a recent date of 2024-12-20 and in the first result set the location is lat: 53.72241667 lon: -0.6164
But in the changes table the row with that most recent date for 126 shows the location of lat: 53.71721667 lon: -0.5871 which is the correct current location that I want to fetch.
I have worked out that this is the use of MAX() along with GROUP BY, but I can’t figure out the correct query to get only the most recent row from the JOIN for each mark with the most recent date and location.

To put it a different way, and maybe more simply, here are results from a simplified version of the query, without MAX() and GROUP BY.


I get results with location data that matches the date in the row.
But I get the same marks/IDs duplicated (highlighted in colours) for every time they moved.
I want to filter the results so every ID appears only once, with data from only the most recent change/date it has (marked with red dot).

The problem you’re running into is the multiple records, so you need to filter it a little more in the where clause…

SELECT seamarks.id
     , seamarks.name
     , ST_Y(changes.loc) AS lat
     , ST_X(changes.loc) AS lon
     , changes.date AS `date` 
  FROM seamarks
  LEFT JOIN changes ON seamarks.id = changes.mid
 WHERE seamarks.active = '1'
   AND (changes.date IS NULL OR
        changes.date = (SELECT MAX(c2.date)
                          FROM changes C2
                         WHERE c2.mid = seamarks.id))
 ORDER BY `date` DESC

That’s giving an error near ORDER on line 9, which doesn’t look right, also an odd number of brackets.
I’ve tried a few variations based on that, but not getting the results I want.

Sorry about that. Going too quick and had some syntax errors.

I fixed the post. Try that query to see if you get closer…

That still gives the duplicate IDs.
I tried a few variations before to try and correct it, but I get either every row with duplicates, a single row, no results, or NULL values.
It feels like it should not be too difficult, but I can’t work it out.

Grrr…one more try. Change the MAX in the sub-query to MAX(c2.date)

1 Like

That may have done it, I just need time to verify the results.

So far all looks good.
Thank you.

1 Like

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.