Counting rows with WHERE and HAVING while using ONLY_FULL_GROUP_BY

DrQuincy · October 12, 2017, 9:52am

I have this query that is trying to count the number of rows from a filtered search.

SELECT COUNT( `informationItems`.`id`) AS `num`,
       (SELECT GROUP_CONCAT(`categoryID`) FROM `informationListings` WHERE `itemID` = `informationItems`.`id`) AS `categories`
FROM `informationItems` 
JOIN `pages` ON `informationItems`.`page` = `pages`.`id` 
WHERE `pages`.`status` = 'published' AND `dateFiled` <= NOW()
HAVING FIND_IN_SET(1, `categories`);

I am trying to make it work under SQL mode ONLY_FULL_GROUP_BY . I get why it fails — because the results set is indeterminate. Is it possible to get this to work under ONLY_FULL_GROUP_BY mode while still using the sub-query (other than removing the COUNT() and then counting the actual number or rows returned)?

r937 · October 12, 2017, 2:22pm

try this

SELECT pages.id , COUNT(*) AS num , GROUP_CONCAT(categoryID) AS categories FROM pages INNER JOIN informationItems ON informationItems.page = pages.id INNER JOIN informationListings ON informationListings.itemID = informationItems.id WHERE pages.status = 'published' AND pages.dateFiled <= NOW() GROUP BY pages.id HAVING COUNT(CASE WHEN informationItems.categoryID = 1 THEN 'gotcha' ELSE NULL END ) > 0

DrQuincy · October 13, 2017, 9:19am

Thanks, I’ll give it a go.

DrQuincy · October 13, 2017, 9:23am

What is your opinion on using ANY_VALUE() in 5.7? It looks as though MySQL is the only database that offers this so I am assuming it is frowned upon when it comes to SQL standards.

r937 · October 13, 2017, 9:39am

had not heard of it until now

Mittineague · October 13, 2017, 2:00pm

My bet is you go the “dependent on GROUP BY” route
https://dev.mysql.com/doc/refman/5.7/en/upgrading-from-previous-series.html

If it is possible to modify an offending query, do so, either so that nondeterministic nonaggregated columns are functionally dependent on GROUP BY columns, or by referring to nonaggregated columns using ANY_VALUE().

r937 · October 13, 2017, 2:05pm

actually i won’t need to do anything, as i have been writing ANSI-compatible GROUP BY queries for decades

DrQuincy · October 17, 2017, 11:59am

Is this a lot more inefficient that @r937’s version?

SELECT count(*)
FROM (
    {queryGoesHere}
) AS `num`

Will this get a lot slower and use a lot more memory as the data increases or does the optimiser “know” that it is only fetching a number?

The reason I ask is with the code I am refactoring it is much more convenient to do the latter version so I am weighing up how much of it needs to change.

r937 · October 17, 2017, 12:05pm

no, not a lot
my version has counts for each id

DrQuincy · October 17, 2017, 12:11pm

Great, thank you.

I really do owe you a beer!

DrQuincy · October 19, 2017, 11:12am

I have tried a few different queries on a few different tables (even on a MyISAM table) and it seems to me that SELECT COUNT(*) FROM ({subQuery}) is slower than just the subQuery on its own. On my test system the first query is up to 2.5x slower. Even if I remove JOINS it is pretty much the same result.

While does counting in this way add so much overhead?

Mittineague · October 19, 2017, 3:19pm

You may be able to spot the difference by using EXPLAIN

https://dev.mysql.com/doc/refman/5.7/en/explain.html

The EXPLAIN statement provides information about how MySQL executes statements

DrQuincy · October 20, 2017, 2:44pm

Thanks. That appears to be that sub-queries don’t use indexes. I still don’t understand while it is that much slower since there are only a few hundred records in there.

DrQuincy · October 20, 2017, 2:46pm

How would you change your query so that is gives a total count rather than for each ID? I am finding that counting when there is a HAVING clause really slows things down. I have only managed to get it to work with a sub-query. If you don’t use a sub-query it seems to ignore the HAVING.

r937 · October 20, 2017, 3:08pm

could you give examples of both please

DrQuincy · October 23, 2017, 11:49am

I will do.

But at the weekend I read about SQL_CALC_FOUND_ROWS and this seems to be perfect for what I want. I am doing pagination and am wanting the total count and — this does it without having to do the extra SELECT COUNT(*) FROM {subQuery} query.

A lot of people seem to be down on it and quote a decade-old Percona blog but according to the MySQL docs, and more recent anecdotal evidence on Stack Overflow, SQL_CALC_FOUND_ROWS would usually be faster than two queries, so long as you are using LIMIT.

In my case, adding SQL_CALC_FOUND_ROWS seems to make little-to-no difference to the query speed and is twice as fast as the SELECT COUNT(*) FROM {subQuery} query on its own!

What is your experience of this vs COUNT(*) on later versions (5.6+)?

r937 · October 23, 2017, 2:09pm

never needed it, but would’ve used it in a new york minute over a separate count query

system · January 22, 2018, 9:09pm

This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
ONLY_FULL_GROUP_BY and query efficency Databases mysql	7	1937	September 9, 2018
Help with group_concat and count Databases	3	4217	December 3, 2011
COUNT and LISTING in same query Databases	7	751	June 30, 2010
GROUP BY and HAVING Databases	19	2898	April 16, 2016
GROUP BY and then COUNT without a subquery? Databases	5	14262	October 8, 2014

Counting rows with WHERE and HAVING while using ONLY_FULL_GROUP_BY

Related topics