I need help to understand select explain of request with subqueries

With laravel eloquent I got sql (Mysql 8.1) of products with filers on related cities and categories.
In both cases many-to many relations :

SELECT * 
    FROM `ts_products` 
    WHERE `ts_products`.`title` like '%no%'     AND `ts_products`.`status` = 'A'     AND ts_products.published_at >= '2022-04-01'     AND ts_products.published_at < '2022-04-21'     AND 
    EXISTS (  SELECT * 
    FROM `ts_products_cities` 
    WHERE `ts_products`.`id` = `ts_products_cities`.`product_id`     AND 
    EXISTS (  SELECT * 
    FROM `ts_cities` 
    WHERE `ts_products_cities`.`city_id` = `ts_cities`.`id`     AND `ts_cities`.`id` in ('3255', '3739', '4192', '5434', '4678', '3297', '4529', '4036', '3051', '4502')))     AND 
    EXISTS (  SELECT * 
    FROM `ts_products_categories` 
    WHERE `ts_products`.`id` = `ts_products_categories`.`product_id`     AND 
    EXISTS (  SELECT * 
    FROM `ts_categories` 
    WHERE `ts_products_categories`.`category_id` = `ts_categories`.`id`     AND `ts_categories`.`active` = '1')) 
    ORDER BY `regular_price` asc -- 17 rows are read

I got next explain analyze : https://prnt.sc/UpRBzl_AvkoY

  1. What Is “temporary table” ?

  2. What is “cost”? Some internal mysql value ? Can I have any use of it?

  3. What is “Remove duplicate ts_products rows ” ?

  4. What is flow is this request ? I understand some parts of it, but as it has subqueries I have difficulties in understanding it as
    a whole. How correctly to read.

  5. Why “actual time” has range values?

  6. How explain looks : https://prnt.sc/PjTaQkyE601a
    As in all cases keys are used it seemed to me good for performance, but what is type=“range” in first row?

Is there is a way to make performance of this request better ?

Thanks!

Caveat: I had to go look at the documentation to be sure I was giving you the correct answer in a place or two

A temporary table is a construct used by any database product to store data in before it is returned to the user.

This is a bit of a throwback to when providers charged by how hard a database worked, but there is a bit of reality in there as well. The more expensive the cost, the longer the query is going to take or the higher the cost because of needing more powerful machines to run the database on.

What is happening here is each exists will pull the matching rows into the temporary table, THEN because you’re asking for just the values for ts_products, it’s sorting out those duplicates so you’re only getting one value per row.

Sorry. I’m not sure I understand this question.

It’s not a range. The first number is the time to retrieve the first row of data, the second is the time to retrieve ALL the rows of data. A bit confusing, I admit…

It’s indicating that it pulled a group of data, and the key column shows what indexes are being used.

This a blanket statement as there are exceptions, but using IN and/or EXISTS are very expensive to compare data between tables, especially if using indexed keys (Primary and Secondary) and probably shouldn’t be used. A much better approach would be to use JOINS instead because JOINS use indexes much more effectively than INs or EXISTS.

Blanket statement #2 - SELECT * is almost ALWAYS a bad idea. It’s better to select the distinct fields you want to use, both from a performance but also from a readability perspective.

SELECT DISTINCT P.* 
  FROM ts_products P
 INNER JOIN ts_products_cities PC ON PC.product_id = P.id 
 INNER JOIN ts_cities` C ON C.id = PC.city_id AND c.ID IN ('3255', '3739', '4192', '5434', '4678', '3297', '4529', '4036', '3051', '4502')))     AND 
 INNER JOIN ts_products_categories PCAT ON PCAT.product_id = P.id
 INNER JOIN ts_categories CAT ON CAT.id = PCAT.category_id` AND CAT.active = '1'
 WHERE P.title like '%no%'  
   AND P.status = 'A'
   AND P.published_at BETWEEN '2022-04-01' AND '2022-04-21'
 ORDER BY P.regular_price asc
1 Like

i don’t have the answers to your questions regarding the EXPLAIN, i just thought i would format your query for others who might be wondering what’s going on in there

SELECT * 
  FROM ts_products 
 WHERE ts_products.title LIKE '%no%'     
   AND ts_products.status = 'A'     
   AND ts_products.published_at >= '2022-04-01'     
   AND ts_products.published_at  < '2022-04-21'     
   AND EXISTS 
       ( SELECT * 
           FROM ts_products_cities 
          WHERE ts_products_cities.product_id = ts_products.id
            AND EXISTS 
               ( SELECT * 
                   FROM ts_cities 
                  WHERE ts_cities.id = ts_products_cities.city_id
                    AND ts_cities.id IN ('3255','3739','4192','5434','4678'
                                        ,'3297','4529','4036','3051','4502')
               )
       )     
   AND EXISTS 
       ( SELECT * 
           FROM ts_products_categories 
          WHERE ts_products_categories.product_id = ts_products.id
            AND EXISTS 
                ( SELECT * 
                    FROM ts_categories 
                   WHERE ts_categories.id = ts_products_categories.category_id
                     AND ts_categories.active = '1'
                )
       ) 
ORDER 
    BY regular_price ASC

this should make it clear to others what you’re doing

1 Like

blanket statement #3 – when you get humongous numbers of rows out of your FROM clause which cross joins two one-to-many relationships (each product is joined to all of its cities, and then each of those product-city rows is joined to every category the product is in) and then just throw DISTINCT into the SELECT clause, there’s a good chance your performance will suffer

1 Like

Leave it to you to point out when I miss that obvious little bit. :roll_eyes: :lol:

1 Like

Thanks! It is very helpfull!
I want to detalize my question, which was not clear :

What is flow is this request ? I understand some parts of it, but as it has subqueries I
have difficulties in understanding it as a whole. How correctly to read.
https://prnt.sc/CK17k9ARg8NB ?

I’m sorry, but that looks like exactly the same question. If you are looking for instructions on how to read the analyzer, I’m going to again point you to the actual documentation. There is a sample there which it goes through step by step to explain how it is used, and should be able to point you in the right direction.

If you have specific questions after reading the example, then we can try to break it down, but that explanation is pretty clear and concise, so I’m hoping it gives you what you need.

1 Like