I have this query that is trying to count the number of rows from a filtered search.
SELECT COUNT(`informationItems`.`id`) AS `num`,
(SELECT GROUP_CONCAT(`categoryID`) FROM `informationListings` WHERE `itemID` = `informationItems`.`id`) AS `categories`
FROM `informationItems`
JOIN `pages` ON `informationItems`.`page` = `pages`.`id`
WHERE `pages`.`status` = 'published' AND `dateFiled` <= NOW()
HAVING FIND_IN_SET(1, `categories`);
I am trying to make it work under the ONLY_FULL_GROUP_BY SQL mode. I understand why it fails: the result set is indeterminate. Is it possible to get this to work under ONLY_FULL_GROUP_BY while still using the sub-query (other than removing the COUNT() and then counting the actual number of rows returned)?
SELECT pages.id
, COUNT(*) AS num
, GROUP_CONCAT(categoryID) AS categories
FROM pages
INNER
JOIN informationItems
ON informationItems.page = pages.id
INNER
JOIN informationListings
ON informationListings.itemID = informationItems.id
WHERE pages.status = 'published'
AND pages.dateFiled <= NOW()
GROUP
BY pages.id
HAVING COUNT(CASE WHEN informationListings.categoryID = 1
THEN 'gotcha'
ELSE NULL END ) > 0
What is your opinion on using ANY_VALUE() in 5.7? It looks as though MySQL is the only database that offers it, so I am assuming it is frowned upon as far as the SQL standard goes.
If it is possible to modify an offending query, do so, either so that nondeterministic nonaggregated columns are functionally dependent on GROUP BY columns, or by referring to nonaggregated columns using ANY_VALUE().
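From my reading of that, one way to keep the original query's shape would be to wrap the correlated sub-query in ANY_VALUE() so that the reference to informationItems.id is no longer checked. This is just a sketch and I haven't verified that it actually passes under 5.7:
SELECT COUNT(`informationItems`.`id`) AS `num`,
       ANY_VALUE(
         (SELECT GROUP_CONCAT(`categoryID`)
            FROM `informationListings`
           WHERE `itemID` = `informationItems`.`id`)
       ) AS `categories`
FROM `informationItems`
JOIN `pages` ON `informationItems`.`page` = `pages`.`id`
WHERE `pages`.`status` = 'published' AND `dateFiled` <= NOW()
HAVING FIND_IN_SET(1, `categories`);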
Is this a lot more inefficient than @r937’s version?
SELECT count(*)
FROM (
{queryGoesHere}
) AS `num`
Will this get a lot slower and use a lot more memory as the data increases or does the optimiser “know” that it is only fetching a number?
The reason I ask is that, with the code I am refactoring, it is much more convenient to use the latter version, so I am weighing up how much of it needs to change.
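One quick way to see what the optimiser is doing (nothing definitive, just a sanity check) is to run EXPLAIN on the wrapped version and see whether the sub-query shows up as a materialised step. Something along these lines, using the tables from earlier:
EXPLAIN
SELECT COUNT(*)
  FROM ( SELECT pages.id
           FROM pages
          INNER
           JOIN informationItems
             ON informationItems.page = pages.id
          WHERE pages.status = 'published'
            AND pages.dateFiled <= NOW()
          GROUP
             BY pages.id ) AS t;
-- if select_type shows DERIVED for the inner query, the sub-query result
-- is being materialised before the outer COUNT(*) runs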
I have tried a few different queries on a few different tables (even on a MyISAM table) and it seems to me that SELECT COUNT(*) FROM ({subQuery}) is slower than just the subQuery on its own. On my test system the first query is up to 2.5x slower. Even if I remove the JOINs the result is pretty much the same.
Why does counting in this way add so much overhead?
Thanks. That appears to be because sub-queries don’t use indexes. I still don’t understand why it is that much slower, since there are only a few hundred records in there.
How would you change your query so that it gives a total count rather than a count for each ID? I am finding that counting when there is a HAVING clause really slows things down. I have only managed to get it to work with a sub-query; if I don’t use a sub-query it seems to ignore the HAVING.
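For reference, the only version I have managed to get working is along these lines (just a sketch wrapping your query from above in a derived table; there may well be a better way):
SELECT COUNT(*) AS total
  FROM ( SELECT pages.id
           FROM pages
          INNER
           JOIN informationItems
             ON informationItems.page = pages.id
          INNER
           JOIN informationListings
             ON informationListings.itemID = informationItems.id
          WHERE pages.status = 'published'
            AND pages.dateFiled <= NOW()
          GROUP
             BY pages.id
         HAVING COUNT(CASE WHEN informationListings.categoryID = 1
                           THEN 'gotcha'
                           ELSE NULL END) > 0 ) AS grouped;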
But at the weekend I read about SQL_CALC_FOUND_ROWS and this seems to be perfect for what I want. I am doing pagination and want the total count, and this gives it without having to run the extra SELECT COUNT(*) FROM {subQuery} query.
A lot of people seem to be down on it and quote a decade-old Percona blog post, but according to the MySQL docs, and more recent anecdotal evidence on Stack Overflow, SQL_CALC_FOUND_ROWS is usually faster than running two queries, as long as you are using LIMIT.
In my case, adding SQL_CALC_FOUND_ROWS seems to make little to no difference to the query speed, and it is twice as fast as running the SELECT COUNT(*) FROM {subQuery} query on its own!
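This is roughly the pattern I am using (the LIMIT values are just examples, and the column list is trimmed down):
SELECT SQL_CALC_FOUND_ROWS pages.id
     , GROUP_CONCAT(informationListings.categoryID) AS categories
  FROM pages
 INNER
  JOIN informationItems
    ON informationItems.page = pages.id
 INNER
  JOIN informationListings
    ON informationListings.itemID = informationItems.id
 WHERE pages.status = 'published'
   AND pages.dateFiled <= NOW()
 GROUP
    BY pages.id
 LIMIT 0, 20;
-- run immediately afterwards on the same connection
SELECT FOUND_ROWS() AS total;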
What is your experience of this vs COUNT(*) on later versions (5.6+)?