Help with an SQL query

Hope someone can help with this. Its something I understand why doesn’t work as currently is, but wondering if there is a way to return the results I would like.

Its a fairly simple table structure, with a main table ‘lodges’, and a linked table ‘updates’. So any lodge can have multiple updates.

So I can have a simple query:

SELECT DISTINCT lodges.LodgeID, Lodge FROM lodges

Which will list all the lodges.

Or I can use an INNER JOIN to return lodges where there is a matching record in the linked table:

SELECT DISTINCT lodges.LodgeID, Lodge, update_date INNER JOIN lodge_updates ON lodges.LodgeID = lodge_updates.LodgeID

But that will filter the list of lodges to only show lodges that have a record in the update table.

Is it possible to join the tables, but still list all lodges whether they have a record in the update table or not?

The idea is to show a list of all lodges, with an indication of whether or not they have any updates.

Hope that makes sense.

Thank you.

Where is the FROM here…

This should read

SELECT DISTINCT lodges.LodgeID, lodges.Lodge, lodge_updates.update_date FROM Lodges INNER JOIN lodge_updates ON lodges.LodgeID = lodge_updates.LodgeID

Get used to write the table names. It is good practice. Not all the databases are forgiving.

[quote=“johngordon, post:1, topic:176404”]
But that will filter the list of lodges to only show lodges that have a record in the update table.
[/quote]Well, that’s the definition of INNER JOIN :wink:

If you want to list all the lodges whether they have an update or not, you need an outer join and not an inner join.

Depending on the database, it could be OUTER JOIN or LEFT JOIN (since Lodge is the first table to appear)

SELECT DISTINCT lodges.LodgeID, lodges.Lodge, lodge_updates.update_date FROM Lodges OUTER JOIN lodge_updates ON lodges.LodgeID = lodge_updates.LodgeID

or

SELECT DISTINCT lodges.LodgeID, lodges.Lodge, lodge_updates.update_date FROM Lodges LEFT JOIN lodge_updates ON lodges.LodgeID = lodge_updates.LodgeID

Thanks Molona - I did have the FROM - I snipped out a bunch of fields for brevity, and accidentally cut that bit too.

I tried OUTER JOIN and got an error, but seems happy with LEFT JOIN, so thank for that.

The full query has a few joins, and I’ve just spotted it not doing something else I would like so will mark this as resolved and start another. (I never know if its forum etiquette to do that, or just carry on this one.)

If it is related, you can go in this topic. If it is a completely different issue, then you should start another.

I assume, though, that you’re trying to do just a big query so I would think it is related

I guess it is related rather than completely different. Let me just have a look to best explain it - basically to do with the order of results based on values from different tables.

OK - I have a main table, Lodges, and the related table Lodge_Updates. There are two other tables related to the main Lodges table - Countries and Lodge_Likes. Lodge_Likes just has a couple of fields - LodgeIDPL and UserPL - it narrows down the list to a list of Lodges that a particular user has liked.

And from that I would like to return a list of lodges that includes the Date_Updated field from the Lodge_Updates table, where that is the date of the most recent update.

So for example, if I have these records in my Lodges table:

LodgeID, Lodge
1, Lodge A
2, Lodge B
3, Lodge C

And these records in my product updates table:

UpdateID, LodgeID, Update, Date_Updated
1, 1, Update A, 18 April 2015
2, 1, Update B, 1 April 2015
3, 2, Update C, 5 April 2015
4, 2, Update D, 15 April 2015
5, 3, Update E, 10 April 2015
6, 3, Update F, 30 April 2015

I would like to return the following list:

LodgeID, Lodge, Date_Updated
3, Lodge C, 30 April 2015
1, Lodge A, 18 April 2015
2, Lodge B, 15 April 2015

So I nearly have it with:

SELECT DISTINCT UpdateID, Lodges.LodgeID, Lodge, Update, Date_Updated 
FROM Lodges 
INNER JOIN Countries ON Lodges.CountryID = Countries.CountryID 
INNER JOIN Lodge_Likes ON Lodges.LodgeID = Lodge_Likes.LodgeIDPL 
LEFT JOIN Lodge_Updates ON Lodges.LodgeID = Lodge_Updates.LodgeID 
WHERE Lodge_Likes.UserIDPL = 123 
GROUP BY LodgeID 
ORDER BY Date_Updated DESC                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           

But that gives me the list based on the oldest update for any lodge date, not the most recent:

UpdateID, LodgeID, Lodge, Date_Updated
5, 3, Lodge C, 10 April 2015
3, 2, Lodge B, 5 April 2015
2, 1, Lodge A, 1 April 2015

If I change the ORDER BY to:

ORDER BY Date_Updated ASC

That just reverses that order:

UpdateID, LodgeID, Lodge, Date_Updated
2, 1, Lodge A, 1 April 2015
3, 2, Lodge B, 5 April 2015
5, 3, Lodge C, 10 April 2015

Hope that makes sense and is possible somehow.

Thanks again.

i’d like to make a few points, in the hope that someone finds them useful and educational

SELECT DISTINCT lodges.LodgeID, Lodge FROM lodges

presumably LodgeID is the primary key of the lodges table – so DISTINCT is completely redundant

also, when the FROM clause has just one table, you don’t need to qualify the column names (like lodges.LodgeID) – but when the FROM clause has more than one, i strongly urge you to qualify all columns

it’s not a question of some databases being more forgiving – no database will forgive an ambiguous column (unqualified column name which exists in more than one table)

“but can’t i qualify only the ambiguous ones and not the others?”

yes, you could, but that would be most unwise

primarily because you will end up with clumsy, verbose, and ugly column naming conventions

as you’ve discovered, OUTER JOIN is invalid

in fact, it can only be LEFT JOIN or LEFT OUTER JOIN – it’s the OUTER keyword that’s optional, not the direction (LEFT, RIGHT, or FULL)

[quote]So for example, if I have these records in my Lodges table:

And these records in my product updates table:[/quote]whoa, stop right there!

you allow updates to be recorded for lodges that don’t exist? please, do yourself a huge favour and invest some time in learning about primary and foreign keys… and then go implement them

it does, and it is, but it’s not as simple as tossing a GROUP BY clause onto the end of your query

i can walk you through the process, and it begins with writing a small query that gets only the latest date for each lodge – just those two columns

let me know when you have that, and i’ll help you with the next step

Thanks for any pointers. There might be some crossed wires with the updates for lodges that don’t exist thing - that’s certainly not what I’m trying to do. It was probably the language I used though, with the table called ‘Lodge_Updates’.

To give a bit of background…

There is a main table Lodges:
LodgeID (PK), Lodge, and lots of other fields containing various information about each lodge.

There is also a table Users:
UserID (PK), LodgeID (FK), Name, Email etc

When users are logged in they can like a lodge - when they do this a record is added to a table Lodge_Likes:
LikeID (PK), LodgeID (FK), UserID (FK)

From that, when users are logged in they can view a list of lodges they like using:

SELECT Lodges.LodgeID, Lodge 
FROM Lodges 
INNER JOIN Lodge_Likes
ON Lodges.LodgeID = Lodge_Likes.LodgeID

We would like the people who are liking the lodges to be able to view updates to the lodges. But seeing as the people who own the lodges can also log in and update the various fields, that could be anything from a meaningful and significant update to a simple addition of a full stop or correction of a typo. So we wouldn’t want those to be listed as updates.

So the Lodge_Updates table is meant to be where the Lodges can add news type items pertaining to the lodge. So for example they might add a bit of text to a few fields, and an update with a summary of whatever is new.

So that is linked to the main Lodges table and looks like:

UpdateID (PK), LodgeID (FK), Update, Date_Updated

So from there I am just trying to return a list of lodges with the date of the most recent record added to the Lodge_Updates table.

Does that make a bit more sense? The date isn’t really the date the record in the main Lodges table was updated, but rather the date a news item was added to the related Lodge_Updates table.

It may well be that that isn’t the best or most appropriate solution to the problem, but its what made most sense to me. If there is a more elegant way of listing updates that exclude minor updates that wouldn’t really be relevant, then I’m all ears.

i think you missed what my message was about lodges that don’t exist

your example showed lodges A, B, C, and updates for A, B, C, D, E, and F – in other words, updates for lodges D, E, and F which don’t exist in the lodges table

anyhow… :blush:

could you please write a small query that returns two columns – the lodgeID and the latest date – which you can get from the updates table (i.e. no joins yet)

then we’ll get back to your original problem, which i understood the first time :sunny:

In the example there are only three lodges in the Lodges table Lodge A, Lodge B and Lodge C.

Each have two updates in the Lodge_Updates table.

Lodge A - Update A and Update B
Lodge B - Update C and Update D
Lodge C - Update E and Update F

With that simple query I must admit I would never think of getting values from two tables without a JOIN, as I always had in my head that the JOIN was how you filtered the correct values.

Having said that, I came up with:

SELECT Lodges.LodgeID, Lodge_Updates.Date_Updated
FROM lodges, product_updates

Which lists six records for each Lodge which seems counter intuitive, but presumably with the correct further steps it should all come out right.

So hopefully that’s that bit?

[quote=“johngordon, post:10, topic:176404, full:true”]
Lodge A - Update A and Update B
Lodge B - Update C and Update D
Lodge C - Update E and Update F
[/quote]okay, i see where i made an assumption :blush:
you should use different sample values for your lodges and your updates

anyhow, all that stuff doesn’t matter now…

please write a query using the updates table only (no joins yet) to get each lodgeID (hint: GROUP BY) and the latest date for each lodge (hint: MAX)

just those two columns

Sorry - quickest off the top of my head example.

Anyway… was missing the blindingly obvious earlier, but MAX isn’t something I’ve used before so this looks a lot more like it:

SELECT LodgeID, MAX(Date_Updated) 
FROM Lodge_Updates 
GROUP BY LodgeID

perfect :smile:

next step is to join the results of this query to your table(s), to act as a filter ensuring only the latest information per lodge

before we do that, could you indicate which columns you want in the final results? you mentioned a column called Update, as well as UserIDPL… and something about a countries table?

Update doesn’t need to be in this list - it is just a list of lodges that a user has liked, with an indication of the most recent update posted for that lodge. UserIDPL is the UserID field in the Lodge_Likes table.

I now have:

SELECT Lodge_Updates.LodgeID, MAX(Date_Updated), Country 
FROM Lodge_Updates 
INNER JOIN Lodges ON Lodge_Updates.LodgeID = Lodges.LodgeID
INNER JOIN Countries ON Lodges.CountryID = Countries.CountryID
INNER JOIN Lodge_Likes ON Lodges.LodgeID = Lodge_Likes.LodgeIDPL
WHERE Lodge_Likes.UserIDPL = 2888
GROUP BY LodgeID

So the mistake originally was starting with the Lodges table, when it really should have been the Updates table.

Its not quite there, as it that only returns Lodges that have any records in the Updates table, when it should include any Lodges with a record in the Lodge_Likes table. I assume it needs a LEFT JOIN somewhere, but not sure where exactly.

you’ve jumped way ahead of me, and written a query that is semantically invalid (syntactially invalid, too, in all databases other than mysql)

LE HUGE SIGH

what happened to the Update column? presumably that column contains information about the update, that might identify what the update was about

i don’t believe that the Update column contains “just a list of lodges that a user has liked” because it’s in the updates table, not the likes table

maybe take another stab at mentioning which columns you actually want in the results?

Sorry, wires crossed. I didn’t mean the update field is just a list of lodges a user has liked.

I meant the list of lodges returned by the query needs to be a list of lodges that a user has liked, which includes the date of the most recent update for the lodge (or empty if the lodge has not added any updates yet.)

From this list the idea is that you would then click through to a list of all the updates for the lodge you click on.

So the fields needed here are:

Lodges.LodgeID, Lodge, MAX(Date_Updated) and Country

(I meant to include Lodge in the query above.)

If you click on a lodge, the next screen would need:

LodgeID, Lodge, Country, Date_Updated and Update.

Sorry about this but I found the hard way that not all databases accept LEFT or RIGHT.

Going back to an earlier post, when I had:

SELECT DISTINCT UpdateID, Lodges.LodgeID, Lodge, Update, Date_Updated 
FROM Lodges 
INNER JOIN Countries ON Lodges.CountryID = Countries.CountryID 
INNER JOIN Lodge_Likes ON Lodges.LodgeID = Lodge_Likes.LodgeIDPL 
LEFT JOIN Lodge_Updates ON Lodges.LodgeID = Lodge_Updates.LodgeID 
WHERE Lodge_Likes.UserIDPL = 123 
GROUP BY LodgeID 
ORDER BY Date_Updated DESC

That was giving me the records I needed, just with the wrong values for Date_Updated.

I don’t need the DISTINCT or the Update field here.

And if I change Date_Updated to MAX(Date_Updated) I get the right values using:

SELECT Lodges.LodgeID, Lodge, Country, MAX(Date_Updated) 
FROM Lodges 
INNER JOIN Countries ON Lodges.CountryID = Countries.CountryID 
INNER JOIN Lodge_Likes ON Lodges.LodgeID = Lodge_Likes.LodgeIDPL 
LEFT JOIN Lodge_Updates ON Lodges.LodgeID = Lodge_Updates.LodgeID 
WHERE Lodge_Likes.UserIDPL = 123 
GROUP BY LodgeID 
ORDER BY MAX(Date_Updated) DESC

So that returns the results I was looking for, namely:

LodgeID, Lodge, Country, MAX(Date_Updated)
3, Lodge C, Kenya, 30 April 2015
1, Lodge A, Botswana, 18 April 2015
2, Lodge B, Tanzania, 15 April 2015

But from what you were saying there may be issues with the semantics?

yes, indeed

like i said, only mysql even attempts to execute it… the values of any non-aggregate columns are indeterminate… mysql attempts to explain it here: http://dev.mysql.com/doc/refman/5.1/en/group-by-handling.html

say, what about that Update column, you want it or not? because that would be a non-aggregate column

Sorry, I am using MySQL and realise I should have made that distinction in the thread title. That article suggests this is illegal in SQL but legal in MySQL?

I have already said I don’t need the Update field here. This query just needs to return a list of lodges which includes the date of its most recent update.

A visitor will then be able to click on the lodge to view a list of all the updates for that lodge.