# Thread: How to optemise the query having group by on two colomun

1. ## How to optemise the query having group by on two colomun

i have one table say A

+------------+--------------+----------------+
| value1 | value 2 | value 3 |
+------------+--------------+----------------+
| 5 | 4adb83c1d71b | 1 |
| 5 | 44fcb93589fe | 2 |
| 5 | 4182b6e2100c | 1 |

| 6 | 4bfd8db46ab1 | 1 |

| 7 | 4b0f87c68808 | 1 |
| 7 | 413fa4646dfe | 2 |
| 7 | 4585b5671b68 | 3 |
| 7 | 4486abc0ddce | 2 |
| 7 | 4b01b048f07c | 5 |
| 7 | 4cebb754ff21 | 2 |
+------------+--------------+----------------+

Now i need an optemise query that will give number of duplicate in value 3 colomun for each value 1 colomun

mean i need a result as

+------------+--------------+----------------+
| value1 | value 3 | count |
+------------+--------------+----------------+
| 5 | 1 | 2 |
| 6 | 1 | 1 |
| 7 | 2 | 3 |
+------------+--------------+----------------+

mean
value 1(5) have value 3(1) repeate 2 time
value 1(6) have value 3(1) repeate 1 time
value 1(7) have value 3(2) repeate 3 time

for this i write a query but that query is not optemise so if have any better solution for query.

select value1,value3,count(value1) as ct from A where value1>0 and value1 != '' group by value1,value2 having ct >1 limit 3;

where value 1 and value 3 are indexed

2. ALTER TABLE a ADD INDEX(value1,value2)

3. select value1,value3,count(value1) as ct from A where value1>0 and value1 != '' group by value1,value3 having ct >1 limit 3;

value 1 and value 3 are already index then also it scan the whole row

4. ALTER TABLE a ADD INDEX(value1,value3)

5. Originally Posted by r937
ALTER TABLE a ADD INDEX(value1,value3)

value1 and value 3 are already index in table so this does not work

6. sure it does

i am not suggesting separate indexes on those columns individually, i am suggesting a single composite index on both columns

7. +----+-------------+---------------------+-------+-----------------------------------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------+-------+-----------------------------------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | A | range | value1_3 | value1_3 | 8 | NULL | 3124145 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+---------------------+-------+-----------------------------------------------------+--------------+---------+------+---------+-----------------------------------------------------------+

so as we see from here value1_3 is index of (value1 and value3)

Still expensive

8. Originally Posted by phpgurullc
Still expensive
compared to what? you're reading three million rows!!!

9. 3 million rows entries are in table A, so when we explain this query it give us this result
+----+-------------+---------------------+-------+-----------------------------------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------+-------+-----------------------------------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | A | range | value1_3 | value1_3 | 8 | NULL | 3124145 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+---------------------+-------+-----------------------------------------------------+--------------+---------+------+---------+-----------------------------------------------------------+

So its scanning the whole table

10. actually, it's not accessing the table at all, just the index

you're using an ORDER BY clause, right?

11. Originally Posted by r937
ALTER TABLE a ADD INDEX(value1,value3)

Originally Posted by r937
actually, it's not accessing the table at all, just the index

you're using an ORDER BY clause, right?
its group by clause

12. so there is no ORDER BY clause?

13. Dear Just go through this query,we are not using order by we are using group by

select value1,value3,count(value1) as ct from A where value1>0 and value1 != '' group by value1,value2 having ct >1 limit 3;

14. Originally Posted by phpgurullc
select value1,value3,count(value1) as ct from A where value1>0 and value1 != '' group by value1,value2 having ct >1 limit 3;
there's your problem... you're grouping by something, but selecting something else

15. Originally Posted by r937
there's your problem... you're grouping by something, but selecting something else
sorry its typo error
select value1,value3,count(value1) as ct from A where value1>0 and value1 != '' group by value1,value3 having ct >1 limit 3;

16. okay, i don't think i can keep dancing around the real issue here with fake table and column names

is there any chance you can show your actual table and actual query?

do a SHOW CREATE TABLE for me please

17. Table Structure is as follows

| gallery_event_image | CREATE TABLE `gallery_event_image` (
`gallery_id` bigint(20) unsigned NOT NULL,
`image_id` varchar(20) NOT NULL,
`image_position` smallint(5) unsigned DEFAULT NULL,
UNIQUE KEY `gallery_id_2` (`gallery_id`,`image_id`),
KEY `gallery_id` (`gallery_id`),
KEY `image_id` (`image_id`),
KEY `image_position` (`image_position`),
KEY `gallery_id_3` (`gallery_id`,`image_position`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 |

and real query

select gallery_id,image_position,count(gallery_id) as ct from gallery_event_image WHERE image_position < 255 and gallery_id!=0 and gallery_id != '' group by gallery_id,image_position having ct >1 order by ct asc limit 500;

18. well, whaddya know... there ~was~ an ORDER BY clause after all

19. Originally Posted by r937
there's your problem... you're grouping by something, but selecting something else
Originally Posted by r937
well, whaddya know... there ~was~ an ORDER BY clause after all

ok, so how can we solve it

20. you can't solve it

if you want the lowest 500 counts, you're going to have to compute all the counts, right?

that's what using temporary and using filesort mean -- each gallery_id/image_position combination is counted, the results are placed into a temporary file, and then this file is sorted into ascending sequence by the count

that's what you asked for, and your query is as optimized as it's going to get

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•