SitePoint Sponsor

User Tag List

Results 1 to 17 of 17
  1. #1
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Why are UNIONS faster than JOINS?

    Ok, I realize I'm being very general here and the statement may not be true, but in the database we are working with, UNIONS seems to be much faster than LEFT JOINs when using junction tables.

    Okay, I just realized that what's causing the difference is that when using JOINS I am using ORs which I think are what's causing the speed difference.

    Does anyone know of a better way to write the following code without using UNIONs?:

    Code MySQL:
    SELECT SQL_CALC_FOUND_ROWS DISTINCT dvdpedia.dogTag, dvdpedia.title, dvdpedia.starring, dvdpedia.originalTitle, dvdpedia.theatrical, dvdpedia.media, localeLanguage, MATCH(title) AGAINST('jones' IN BOOLEAN MODE) AS relevance 
    					 	, director
    						FROM dvdpedia 
    						LEFT OUTER JOIN dvd_director2title ON dvdpedia.dogTag = dvd_director2title.dogTag
    						LEFT OUTER JOIN dvd_director ON dvd_director.id = dvd_director2title.pId
    						WHERE MATCH(title) AGAINST('jones' IN BOOLEAN MODE) 
    						or MATCH(director) AGAINST('jones' IN BOOLEAN MODE)
    						or MATCH(starring) AGAINST('jones' IN BOOLEAN MODE)    
    						GROUP BY dogTag						
    						ORDER BY localeLanguage = 'en' DESC, localeLanguage, relevance DESC

    Thanks.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    title, director, and starring and not the types of data fields that are conducive to fulltext indexing

    i doubt any of these three is actually longer than 50 characters, so you should be probably be using LIKE

    secondly, you write LEFT OUTER JOIN but then you also have a condition on the right table director column

    i'm having trouble understanding why you have a many-to-many realtionship table for dvds to directors, and ~not~ one for dvds to "starring" (if i think this is what it looks like, the people starring in the dvd)

    finally, i think your use of DISTINCT is redundant and may also be contributing to poor performance
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    title, director, and starring and not the types of data fields that are conducive to fulltext indexing

    i doubt any of these three is actually longer than 50 characters, so you should be probably be using LIKE
    My understanding is that regardless of the length of the field, a FULLTEXT search is usually faster than a LIKE search, is this not so? Also the advantage of a FULLTEXT search on those fields is that then we can order by Relevance.

    secondly, you write LEFT OUTER JOIN but then you also have a condition on the right table director column
    Sorry I don't understand what you mean here, the LEFT OUTER JOIN is so that we can get listings of movies without directors in case they are searching for titles (this being a keyword search, we don't know if they're searching for titles, directors or actors).

    i'm having trouble understanding why you have a many-to-many realtionship table for dvds to directors, and ~not~ one for dvds to "starring" (if i think this is what it looks like, the people starring in the dvd)
    I do have one for credits as well, it's just that the 6 main actors go into that starring field to make thing easier. Then I have another query that only searches the actors table which has the many-to-many relationship.

    finally, i think your use of DISTINCT is redundant and may also be contributing to poor performance
    Aha, hadn't caught that, great, at least one thing I can optimize
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  4. #4
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    While we are talking about optimization, is there a way to find out if my joins are not using the indexes properly? I have tried using EXPLAIN but that doesn't seem to help. I've also tried turning on the slow query logs and parsing those with mysqldumpslow but although this helps track the time the queries take, it doesn't tell me if the joins are being used properly or where I could be optimizing the queries.

    Let me clarify, EXPLAIN loses me when it reports something like:

    id select_type table type possible_keys key
    1 PRIMARY <derived2> All

    I have no idea how to interpret or optimize this, even where in my query it's referring to.

    Thanks again.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  5. #5
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    writing a simple explanation that will help you learn how to read EXPLAIN output is not easy

    it requires an understanding of the table design (use SHOW CREATE TABLE to include the indexes), the query (source), and the full EXPLAIN output

    taking a single line from the EXPLAIN output isn't enough to go on, i'm afraid
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #6
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Here's a full output of the explain, does this help?

    Code MySQL:
    +----+-------------+---------------------+--------+---------------+------------+---------+------------------------------------+-------+----------------------+
    | id | select_type | table               | type   | possible_keys | key        | key_len | ref                                | rows  | Extra                |
    +----+-------------+---------------------+--------+---------------+------------+---------+------------------------------------+-------+----------------------+
    |  1 | PRIMARY     | dvdpedia            | const  | PRIMARY       | PRIMARY    | 4       | const                              |     1 |                      |
    |  1 | PRIMARY     | dvd_locale          | const  | PRIMARY       | PRIMARY    | 12      | const,const                        |     0 | unique row not found |
    |  1 | PRIMARY     | dvd_studio          | const  | PRIMARY       | PRIMARY    | 4       | const                              |     1 |                      |
    |  1 | PRIMARY     | dvd_pictureFormat   | const  | PRIMARY       | PRIMARY    | 4       | const                              |     1 |                      |
    |  1 | PRIMARY     | dvd_rated           | const  | PRIMARY       | NULL       | NULL    | NULL                               |     1 |                      |
    |  1 | PRIMARY     | dvd_aspectRatio     | const  | PRIMARY       | NULL       | NULL    | NULL                               |     1 |                      |
    |  1 | PRIMARY     | dvd_series          | const  | PRIMARY       | NULL       | NULL    | NULL                               |     1 |                      |
    |  1 | PRIMARY     | dvd_images          | ref    | dogTag        | dogTag     | 4       | const                              |     2 |                      |
    |  1 | PRIMARY     | dvd_sound           | eq_ref | PRIMARY       | PRIMARY    | 4       | const                              |     1 |                      |
    |  1 | PRIMARY     | dvd_genre           | eq_ref | PRIMARY       | PRIMARY    | 4       | const                              |     1 |                      |
    |  1 | PRIMARY     | <derived2>          | ALL    | NULL          | NULL       | NULL    | NULL                               |  8773 |                      |
    |  1 | PRIMARY     | <derived3>          | ALL    | NULL          | NULL       | NULL    | NULL                               | 12580 |                      |
    |  1 | PRIMARY     | <derived4>          | ALL    | NULL          | NULL       | NULL    | NULL                               |  7425 |                      |
    |  1 | PRIMARY     | <derived5>          | ALL    | NULL          | NULL       | NULL    | NULL                               | 10537 |                      |
    |  1 | PRIMARY     | <derived6>          | ALL    | NULL          | NULL       | NULL    | NULL                               |  6044 |                      |
    |  1 | PRIMARY     | <derived7>          | ALL    | NULL          | NULL       | NULL    | NULL                               | 10714 |                      |
    |  7 | DERIVED     | dvd_country2title   | index  | PRIMARY       | dogtag_pid | 8       | NULL                               | 12852 | Using index          |
    |  7 | DERIVED     | country             | eq_ref | PRIMARY       | PRIMARY    | 4       | bruji_data.dvd_country2title.pId   |     1 |                      |
    |  6 | DERIVED     | dvd_subtitles2title | index  | PRIMARY       | dogtag_pid | 8       | NULL                               | 16621 | Using index          |
    |  6 | DERIVED     | languages           | eq_ref | PRIMARY       | PRIMARY    | 4       | bruji_data.dvd_subtitles2title.pId |     1 |                      |
    |  5 | DERIVED     | dvd_languages2title | index  | PRIMARY       | dogtag_pid | 8       | NULL                               | 16957 | Using index          |
    |  5 | DERIVED     | languages           | eq_ref | PRIMARY       | PRIMARY    | 4       | bruji_data.dvd_languages2title.pId |     1 |                      |
    |  4 | DERIVED     | dvd_producer2title  | index  | PRIMARY       | dogtag_pid | 8       | NULL                               | 18071 | Using index          |
    |  4 | DERIVED     | dvd_producer        | eq_ref | PRIMARY       | PRIMARY    | 4       | bruji_data.dvd_producer2title.pId  |     1 |                      |
    |  3 | DERIVED     | dvd_director2title  | index  | PRIMARY       | dogtag_pid | 8       | NULL                               | 15032 | Using index          |
    |  3 | DERIVED     | dvd_director        | eq_ref | PRIMARY       | PRIMARY    | 4       | bruji_data.dvd_director2title.pId  |     1 |                      |
    |  2 | DERIVED     | dvd_writer2title    | index  | PRIMARY       | dogtag_pid | 8       | NULL                               | 16801 | Using index          |
    |  2 | DERIVED     | dvd_writer          | eq_ref | PRIMARY       | PRIMARY    | 4       | bruji_data.dvd_writer2title.pId    |     1 |                      |
    +----+-------------+---------------------+--------+---------------+------------+---------+------------------------------------+-------+----------------------+
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  7. #7
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    it doesn't help me, no, not without the entire query, plus a few days to study it...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #8
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    it doesn't help me, no, not without the entire query, plus a few days to study it...
    Here's the full query:

    Code MySQL:
    EXPLAIN
    SELECT dvdpedia.*
      , dvd_rated.rated
      , d.director
      , w.writer
      , p.producer
      , DATE_FORMAT(dvdpedia.theatrical, '%Y-%m-%d') AS 'theatrical'
      , DATE_FORMAT(dvd_locale.releaseDate, '%Y-%m-%d') AS 'releaseDate'
      , dvd_aspectRatio.aspectRatio
      , l.languages
      , dvd_locale.region
      , dvd_locale.locale
      , dvd_locale.price
      , dvd_locale.videoFormat
      , dvd_images.upc
      , dvd_pictureFormat.pictureFormat
      , dvd_studio.studio
      , dvd_sound.sound
      , dvd_genre.genre
      , dvd_series.series
      , s.subtitles
      , c.country
      FROM dvdpedia 
      LEFT OUTER JOIN dvd_locale ON dvdpedia.dogTag = dvd_locale.dogTag	AND dvd_locale.locale = 'UK'
      LEFT OUTER JOIN dvd_studio ON dvdpedia.studioId = dvd_studio.id
      LEFT OUTER JOIN dvd_pictureFormat ON dvdpedia.pictureFormatId = dvd_pictureFormat.id
      LEFT OUTER JOIN dvd_images ON dvd_images.dogTag = dvdpedia.dogTag  
      LEFT OUTER JOIN dvd_rated ON dvd_rated.id = dvd_locale.ratedId 
      LEFT OUTER JOIN dvd_aspectRatio ON  dvdpedia.aspectRatioId = dvd_aspectRatio.id
      LEFT OUTER JOIN dvd_sound ON dvdpedia.soundId = dvd_sound.id
      LEFT OUTER JOIN dvd_genre ON dvdpedia.genreId = dvd_genre.id		
      LEFT OUTER JOIN dvd_series ON dvdpedia.seriesId = dvd_series.id
      LEFT OUTER JOIN ( SELECT dvd_writer2title.dogTag
      , GROUP_CONCAT(writer SEPARATOR ', ') AS writer
      FROM dvd_writer2title 
      INNER JOIN dvd_writer 
      ON dvd_writer.id = dvd_writer2title.pId
      GROUP BY dvd_writer2title.dogTag ) AS w
      ON w.dogTag = dvdpedia.dogTag		    
      LEFT OUTER JOIN ( SELECT dvd_director2title.dogTag,
      GROUP_CONCAT(director SEPARATOR ', ') AS director
      FROM dvd_director
      INNER JOIN dvd_director2title
      ON dvd_director.id = dvd_director2title.pid
      GROUP BY dvd_director2title.dogTag ) AS d
      ON d.dogTag = dvdpedia.dogTag
      LEFT OUTER JOIN ( SELECT dvd_producer2title.dogTag,
      GROUP_CONCAT(producer SEPARATOR ', ') AS producer
      FROM dvd_producer
      INNER JOIN dvd_producer2title
      ON dvd_producer.id = dvd_producer2title.pid
      GROUP BY dvd_producer2title.dogTag ) AS p
      ON p.dogTag = dvdpedia.dogTag
      LEFT OUTER JOIN (SELECT dvd_languages2title.dogTag, GROUP_CONCAT(languages.languages SEPARATOR ', ' ) AS languages
      FROM languages 
      INNER JOIN  dvd_languages2title 
      ON dvd_languages2title.pId = languages.id
      GROUP BY dvd_languages2title.dogTag) AS l
      ON l.dogTag = dvdpedia.dogTag
      LEFT OUTER JOIN (SELECT dvd_subtitles2title.dogTag, GROUP_CONCAT(languages.languages SEPARATOR ', ' ) AS subtitles
      FROM languages 
      INNER JOIN  dvd_subtitles2title 
      ON dvd_subtitles2title.pId = languages.id
      GROUP BY dvd_subtitles2title.dogTag) AS s
      ON s.dogTag = dvdpedia.dogTag
      LEFT OUTER JOIN (SELECT dvd_country2title.dogTag, GROUP_CONCAT(country SEPARATOR ', ') AS country
      FROM country 
      INNER JOIN dvd_country2title 
      ON country.id = dvd_country2title.pId
      GROUP BY dvd_country2title.dogTag) AS c
      ON c.dogTag = dvdpedia.dogTag
      WHERE dvdpedia.dogTag = '40';
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    wow, what a query

    judging from a cursory glance at the EXPLAIN, it looks okay

    what issues are you having with it?

    i presume this isn't the same issue as the ORs not performing as well as the UNIONs in your original post with the FULLTXT search...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    wow, what a query

    judging from a cursory glance at the EXPLAIN, it looks okay

    what issues are you having with it?

    i presume this isn't the same issue as the ORs not performing as well as the UNIONs in your original post with the FULLTXT search...
    Well it's not issues as such, we are just trying to optimize as much as we can and of course when we see things in EXPLAIN that do not use indexes they seem ripe candidates for optimization. My suspicion is that it's these queries here:
    Code MySQL:
     LEFT OUTER JOIN ( SELECT dvd_writer2title.dogTag
      , GROUP_CONCAT(writer SEPARATOR ', ') AS writer
      FROM dvd_writer2title 
      INNER JOIN dvd_writer 
      ON dvd_writer.id = dvd_writer2title.pId
      GROUP BY dvd_writer2title.dogTag ) AS w
      ON w.dogTag = dvdpedia.dogTag
    causing the no-index situation. We are rewriting them like this now:
    Code MySQL:
    , GROUP_CONCAT(DISTINCT writer SEPARATOR ', ') AS writer
    Then the simple JOIN by itself like the others. This produces a much better EXPLAIN output with all SIMPLE joins and only one empty value in "possible_keys" but the type is "index" there so I think we should be fine no?
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  11. #11
    Non-Member
    Join Date
    Jun 2011
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    A single SELECT will use no more than one index per table.

    A UNION will use no more than one index per SELECT in the union.

    Hence, the latter will make better use of indexes, as seen by the "Using index" in a lot of places in its EXPLAIN.

  12. #12
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by john143 View Post
    A single SELECT will use no more than one index per table.

    A UNION will use no more than one index per SELECT in the union.

    Hence, the latter will make better use of indexes, as seen by the "Using index" in a lot of places in its EXPLAIN.
    So from what you are saying UNIONs by their nature are truly faster than JOINs. However JOINs seem to be the preferred way of associating tables.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  13. #13
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    And I have to say that it really bugs me that a UNION, which is faster, spits out a join of type ALL which is evil according to the docs and yet the equivalent JOIN, which is slower, does not, the join types are all good:

    Code MySQL:
    +----+--------------+------------+----------+------------------------+------------------------+---------+------+------+-----------------------------+
    | id | select_type  | table      | type     | possible_keys          | key                    | key_len | ref  | rows | Extra                       |
    +----+--------------+------------+----------+------------------------+------------------------+---------+------+------+-----------------------------+
    |  1 | PRIMARY      | dvdpedia   | fulltext | title_fulltext         | title_fulltext         | 0       |      |    1 | Using where; Using filesort |
    |  2 | UNION        | dvdpedia   | fulltext | originalTitle_fulltext | originalTitle_fulltext | 0       |      |    1 | Using where; Using filesort |
    | NULL | UNION RESULT | <union1,2> | ALL      | NULL                   | NULL                   | NULL    | NULL | NULL | Using filesort              |
    +----+--------------+------------+----------+------------------------+------------------------+---------+------+------+-----------------------------+
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by pata View Post
    ... spits out a join of type ALL which is evil according to the docs
    i think you misunderstand how a UNION works

    i'm assuming the ALL means that it is simply accepting all rows from the two SELECTs, which is more efficient than trying to detect duplicates

    SELECT ...
    UNION [ ALL | DISTINCT ]
    SELECT ...

    as for the slowness, i believe that's down to the "using filesort" which you forced upon it by your ORDER BY clause
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #15
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    i think you misunderstand how a UNION works
    I probably do. Any good, succinct explanations you can point me to (aside from the Mysql manual)?
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  16. #16
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    you mean the concept? it's really rather simple -- each SELECT retrieves result rows separately from all other SELECTs, and all the result rows are fed into the same result table

    consequently each SELECT must return exactly the same number of columns, and the columns have to be "union compatible" i.e. similar datatypes -- string with strings, numbers with numbers, dates with dates

    i don't know of any resources which explain the EXPLAIN for unions, though
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  17. #17
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes, that bit I know, I was talking about the magic of UNIONs, when you said I misunderstood how it worked, that's what came to my mind.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •