GROUP BY and HAVING

I have the following SQL, which works great:

SELECT DISTINCT lodges.LodgeID, Lodge, lodge_activities_it.ActivityID, user_activities_it.ActivityID, Activity,
(SELECT DISTINCT Count(*) 
FROM lodge_activities_it
INNER JOIN user_activities_it ON lodge_activities_it.ActivityID = user_activities_it.ActivityID
WHERE user_activities_it.UserID = $UserID AND lodge_activities_it.LodgeID = lodges.LodgeID
GROUP BY user_activities_it.UserID 
) 
AS nb
FROM lodges 
INNER JOIN lodge_activities_it ON lodges.LodgeID = lodge_activities_it.LodgeID 
INNER JOIN lodge_activities ON lodge_activities_it.ActivityID = lodge_activities.ActivityID
INNER JOIN user_activities_it ON user_activities_it.ActivityID=lodge_activities_it.ActivityID 
WHERE Product_Type = 'Property' AND user_activities_it.UserID = 123 AND Category_ID = 15
GROUP BY lodges.LodgeID

Which lists the lodges and includes the value for nb.

All I am looking to do now is limit the results to exclude lodges where the nb / count is less than 3.

I have been trying to add

HAVING Count (*) > 2 after GROUP BY 

(tried it in both, but without any luck.)

Also tried adding AND nb > 2 to the WHERE clause, but no joy.

Hopefully this is fairly straightforward from this point - if anyone could help out that would be appreciated.

Thank you!

Looks like it might be working by changing the last line to:

GROUP BY lodges.LodgeID HAVING nb > 2

Why didn’t you just add

AND nb < 3

To the where clause?

That was the first thing I tried, and got the error:

Unknown column ‘nb’ in ‘where clause’

I assumed it was because nb was a calculated value, rather than a table field.

Should that work?

[quote=“johngordon, post:4, topic:212325, full:true”]Should that work?
[/quote]only in MS Access

Thanks - I realised this had cropped up previously, and that’s when I learned about HAVING.

This is all at the point where it all seems to be working great, and people are starting to use it.

The problem we now have is that we are noticing performance is really taking a hit, and the pages on the site displaying this stuff are taking ages to load while they wait for the results back from the server.

Doing a bit of searching on Google throws up all sorts of stuff and its difficult to know where to start - do you happen to know any good resources for that side of things that you may have come across that you could recommend?

What does each table do, and what are trying to achieve for this?

The sub query seems to be like it will be your biggest hit, and I’m not sure what you’re trying to accomplish with it.

reformatting for my own understanding –

SELECT DISTINCT lodges.LodgeID , xxxxxx.Lodge , lodge_activities_it.ActivityID , user_activities_it.ActivityID , xxxxxx.Activity , ( SELECT DISTINCT Count(*) FROM lodge_activities_it INNER JOIN user_activities_it ON user_activities_it.ActivityID = lodge_activities_it.ActivityID WHERE user_activities_it.UserID = $UserID AND lodge_activities_it.LodgeID = lodges.LodgeID GROUP BY user_activities_it.UserID ) AS nb FROM lodges INNER JOIN lodge_activities_it ON lodge_activities_it.LodgeID = lodges.LodgeID INNER JOIN lodge_activities ON lodge_activities.ActivityID = lodge_activities_it.ActivityID INNER JOIN user_activities_it ON user_activities_it.ActivityID = lodge_activities_it.ActivityID WHERE xxxxxx.Product_Type = 'Property' AND user_activities_it.UserID = 123 AND xxxxxx.Category_ID = 15 GROUP BY lodges.LodgeID
the subquery does not need the GROUP BY, and also does not need the DISTINCT

the outer query almost certainly does not need the GROUP BY, nor the DISTINCT

try those changes and see if it speeds things up a bit

also, note the places where i’ve used the table prefix “xxxxxx” – you should always use table prefixes on ~all~ columns (even if it’s obvious to you which table a column comes from, it isn’t to me)

if you’re still getting poor performnace, we will next want to see what indexes you’ve declared

1 Like

Thank you - I made those changes, and the hosting company made some tweaks too, and it seems noticeably better.

There is one other query we have causing a slow down, which looks like this:

SELECT DISTINCT trade_users.UserID, Firstname, Lastname, Company, trade_users.User_Level, LoginDate,         
(SELECT Count(DISTINCT UserIDPL, LodgeIDPL) FROM product_likes WHERE UserIDPL = trade_users.UserID) AS connections
FROM trade_users
LEFT JOIN trade_users_last_login ON trade_users_last_login.LoginUsername = trade_users.Username
LEFT JOIN product_likes ON trade_users.UserID = product_likes.UserIDPL 
WHERE trade_users.User_Level = 3 OR trade_users.User_Level = 7
GROUP BY UserID

Which doesn’t seem quite as complex - any suggestions with that one?

the join to product_likes is unnecessary, so the DISTINCT and GROUP BY on the outer query are unnecessary

    SELECT trade_users.UserID
         , trade_users.Firstname
         , trade_users.Lastname
         , trade_users.Company
         , trade_users.User_Level
         , trade_users_last_login.LoginDate
         , ( SELECT COUNT(DISTINCT LodgeIDPL) 
               FROM product_likes 
              WHERE UserIDPL = trade_users.UserID ) AS connections
      FROM trade_users
    LEFT OUTER
      JOIN trade_users_last_login 
        ON trade_users_last_login.LoginUsername = trade_users.Username
     WHERE trade_users.User_Level IN ( 3 , 7 )

the subquery is correlated and correlated subqueries are sometimes slow, so you could try this –

    SELECT trade_users.UserID
         , trade_users.Firstname
         , trade_users.Lastname
         , trade_users.Company
         , trade_users.User_Level
         , trade_users_last_login.LoginDate
         , COALESCE(likes.connections,0) AS connections
      FROM trade_users
    LEFT OUTER
      JOIN trade_users_last_login 
        ON trade_users_last_login.LoginUsername = trade_users.Username
    LEFT OUTER
      JOIN ( SELECT UserIDPL
                  , COUNT(DISTINCT LodgeIDPL) AS connections 
               FROM product_likes 
             GROUP
                 BY UserIDPL ) AS likes
        ON likes.UserIDPL = trade_users.UserID 
     WHERE trade_users.User_Level IN ( 3 , 7 )
1 Like

Thank you so much for that - that has made a huge difference.

The only tweak I had to make was add GROUP BY trade_users.UserID at the end, as it was returning a record for each user login.

[quote=“johngordon, post:11, topic:212325, full:true”]… as it was returning a record for each user login.
[/quote]of course, it’s a one-to-many relationship

perhaps “trade_users_last_login” should have been named “trade_users_many_logins”

anyhow, having added GROUP BY, did you also use the MAX function?

I hadn’t - it looked like it was displaying the most recent login by default.

I tried changing

trade_users_last_login.LoginDate

to

MAX (trade_users_last_login.LoginDate)

but it doesn’t display any login dates.

Am also looking at an equivalent but slightly different query which is listing all the products and their connections - be a good exercise for me to try to rework it based on your changes to this one rather than just ask you.

I might come back if I can’t quite get it though.

it’s indeterminate – there is no default, it just picks one

[quote=“johngordon, post:13, topic:212325, full:true”]… but it doesn’t display any login dates.
[/quote]

likely a php whoopsie… try MAX(trade_users_last_login.LoginDate) AS LoginDate

Thanks - that’s working now.

Looking at the other one I was talking about - it actually seems OK speed wise, so might be OK as it is.

If not, don’t post anything yet, but if you could let me know if it looks OK as it is, I won’t look at changing it:

SELECT DISTINCT lodges.LodgeID
, lodges.Lodge
, lodge_activities.Activity
, lodge_activity_categories.CategoryID
, (SELECT Count(DISTINCT UserIDPL, LodgeIDPL) 
FROM product_likes 
WHERE product_likes.LodgeIDPL = lodges.LodgeID) 
AS connections
FROM lodges
INNER JOIN lodge_activities_it
ON lodges.LodgeID = lodge_activities_it.LodgeID
INNER JOIN lodge_activities
ON lodge_activities_it.ActivityID = lodge_activities.ActivityID
INNER JOIN lodge_activity_categories
ON lodge_activities.Category_ID = lodge_activity_categories.CategoryID 
LEFT JOIN product_likes ON lodges.LodgeID = product_likes.LodgeIDPL 
WHERE lodge_activity_categories.CategoryID = 15 
GROUP BY LodgeID

i can’t read that

remember “xxxxxx” ?

please fix those, and repost

Updated that now.

you missed the xxxxxx’s in the WHERE and GROUP BY clauses, but never mind

let me explain how the GROUP BY clause works – whatever you put in the GROUP BY clause, guarantees that the results will have only one row for each value

so if GROUP BY LodgeID, then there will be only one row for each LodgeID in the results

which makes your DISTINCT redundant… think about it :smile:

okay, so if, as in the previous example, you are applying GROUP BY and/or DISTINCT to your query, it’s likely because you are trying to suppress a one-to-many relationship, when in fact you don’t really want the “many” results, you just want one of them

how many activities can a lodge be involved in? many, because you’re using a many-to-many relationship table (lodge_activities_it)

and yet you seem to want to show only one of them

A lodge can have multiple activities, although the goalposts kind of shifted since that table was created. Activities now covers a wider range of things, including country, which lodges use to create a profile. I should probably rename it something like lodge_profile_keywords.

Although it is one-to-many, this is for a page listing the lodges - and we want the country to appear in the list, which is why I have the line:

WHERE lodge_activity_categories.CategoryID = 15

This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.