SitePoint Sponsor

User Tag List

Results 1 to 14 of 14
  1. #1
    SitePoint Enthusiast
    Join Date
    Dec 2009
    Posts
    39
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    simple (?) JOIN question

    I have a question how to have two JOINs using the same table.
    The previous sentence probably doesn't make sense, so let me try to explain using an example. Let say, I have two tables

    Code:
    CREATE TABLE users (
       userid INTEGER AUTO_INCREMENT,
       name VARCHAR(36),
       PRIMARY KEY (userid)
    )
    
    CREATE TABLE transactions (
       transactionid INTEGER AUTO_INCREMENT,
       buyerid INTEGER,
       sellerid INTEGER,
       price INTEGER,
       PRIMARY KEY (transactionid)
    )
    Now, when I want buyer information for all transactions where the price was greater than 10, I could retrieve this using the following statement

    Code:
    SELECT 
      users.name AS buyer 
    FROM 
      transactions INNER JOIN users
        ON transactions.buyerid = users.userid
    WHERE
       transactions.price > 10
    But what statement should I use when I also want to retrieve the sellers name with the same query?

  2. #2
    SitePoint Guru
    Join Date
    Jan 2004
    Location
    Uppsala, sverige
    Posts
    698
    Mentioned
    2 Post(s)
    Tagged
    1 Thread(s)
    Code:
    select buyer.name as buyer,
             seller.name as seller
      from users as buyer
      join  transactions 
        on buyer.userid = transactions.buyerid
      join users as seller
        on transaction.sellerid = seller.userid
     where transactions.price > 10

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,322
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    i prefer to write the tables in the FROM clause in a logical sequence
    Code:
    SELECT buyer.name AS buyer
         , seller.name AS seller
      FROM transactions 
    INNER
      JOIN users AS buyer
        ON buyer.userid = transactions.buyerid
    INNER
      JOIN users AS seller
        ON seller.userid = transactions.sellerid
     WHERE transactions.price > 10
    i always try to start the FROM clause with the "driving" table, i.e. the one mentioned in the WHERE clause which has the main restriction, the one which the other tables are joined to
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SitePoint Enthusiast
    Join Date
    Dec 2009
    Posts
    39
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    great, your responses make it all clear.
    I missed the point that I should give the table that is being joined an alias, so that I can join it twice

  5. #5
    SitePoint Enthusiast
    Join Date
    Dec 2009
    Posts
    39
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I actually would like to ask a second question and that is about the efficiency of JOIN statements.

    Let say, the transaction table has many records. And I would like to get the buyer and seller name for a specific transaction (eg transaction 123456). As I have been explained, I can do this in one query:

    Code:
    SELECT 
      buyers.name AS buyer,
      sellers.name AS seller 
    FROM 
      transactions 
        INNER JOIN users AS buyers
          ON transactions.buyerid = buyers.userid
        INNER JOIN users AS sellers
          ON transactions.sellerid = sellers.userid
    WHERE
      transactions.transactionid = 123456
    Now, I would "visualize" this query as a large table being constructed (with buyer and seller names for all transactions) to which the WHERE clause is being applied.

    If the database would indeed process the query like that, it would be very inefficient. In that case it would be better to obtain the buyerid and sellerid, and get the names in separate query.

    Does anyone know if one should take these kind of considerations into account? Or can you expect the database to optimize the query?

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,322
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by rblon View Post
    Does anyone know if one should take these kind of considerations into account?
    yes, i do

    have you ever done an EXPLAIN? which database system is this?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Enthusiast
    Join Date
    Dec 2009
    Posts
    39
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    yes, i do
    You take these considerations into account? Or you know the answer?

    No I have never done an EXPLAIN... I am not running a large database, it is more an academic interest. Hence, the stylized example.

    I am using MySQL.

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,322
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by rblon View Post
    You take these considerations into account? Or you know the answer?
    i know the answer

    Quote Originally Posted by rblon View Post
    I am using MySQL.
    the reason i asked which database you're using is because we have a mysql forum (for mysql, eh) and a databases forum (for everything else), and you didn't post in the mysql forum and you also didn't say which database you're using

    in your query, put the word EXPLAIN in front of the word SELECT, run it, and show us the results
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Enthusiast
    Join Date
    Dec 2009
    Posts
    39
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ok, so I have created the following php script to populate the tables with some random data

    Code:
    $seed = 100;
    srand($seed);
    
    // Fill users table with 1000 users
    for ($i = 0; $i < 1000; $i++) {
      $name = rand(0, 100000);
      $sql = "INSERT INTO users (name) VALUES ('$name')";
      mysql_query($sql, $con);
    }
    
    // Fill transactions table with 200k transactions
    for ($i = 0; $i < 200000; $i++) {
      $buyerid = rand(0, 999);
      while (($sellerid = rand(0, 999)) == $buyerid);
      $price = rand(1, 100);
      $sql = "INSERT INTO transactions (buyerid, sellerid, price) ";
      $sql .= "VALUES ($buyerid, $sellerid, $price)";
      mysql_query($sql, $con);
    }
    Now executing the SELECT statement with two JOINs returns buyer=72396 (corresponding to userid=148) and seller=88733 (corresponding to userid=184) pretty instantaneously.

    And EXPLAIN gives an output as per screenshot.

    Based on the speed of returning the result I would think MySQL is optimizing.

    And, then I would assume any other respectable database system would do the same.
    Attached Images Attached Images

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,322
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    yep, they do
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #11
    SitePoint Enthusiast
    Join Date
    Dec 2009
    Posts
    39
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    ok great, then we can keep it general, and I might be posting in the right forum after all

    Any insights in what EXPLAIN can tell me?

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,322
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by rblon View Post
    Any insights in what EXPLAIN can tell me?
    it can tell you when certain operations are inefficient, e.g. table scans where the "key" column is empty
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SitePoint Enthusiast
    Join Date
    Dec 2009
    Posts
    39
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    but it doesn't tell me that the WHERE clause is executed before the JOINs (as seems to be the case)?

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,322
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    yes, the WHERE clause is applied in step 1 to the transaction table (using its PRIMARY key)

    the "rows" column indicates that 1 row has been retrieved (instead of all of them)

    after that, the other tables are joined
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •