SitePoint Sponsor

User Tag List

Results 1 to 17 of 17
  1. #1
    SitePoint Evangelist
    Join Date
    May 2006
    Posts
    423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)

    Questions on JOINs and foreign keys

    I'm new to using JOINs and foreign keys in MySQL. I think I understand but have a few questions:

    1. Are less queries always better? E.g. You can query a product and then query the category it belongs to in two queries or you can use one in a single JOIN query? Is the latter better? I'm guessing it means less trips to the server if you're using PHP.
    2. Should you always index both columns involved in a foreign key relationship? What if the cardinality on a column is really low?
    3. Are joins faster with a foreign key in place? Or it is for data integrity only?
    4. Is adding 'On INSERT CASCADE' really worth it on an auto-increment ID field? In practice they never really change do they? So is having an index here a waste of resources?
    5. If you have 'ON DELETE RESTRICT' set up the query fails if you try to delete the parent of a child. Other than PHP returning false when you run a DELETE query, is there any other way to check if there's a dependency? From your web application point-of-view a query returning false is ambiguous. Or is there a specific error code for it?
    6. Am I right in saying most FK relations are one-to-many. When you set the key up, do you always do it from the 'one' to the 'many'?
    7. If you want a zero-to-many do you allow NULL on the column on the left table—and if you want a one-to-one you make both columns unique?


    Thanks.

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by DrQuincy View Post
    Are less queries always better?
    in the example you cited, yes, and in general, yes

    there are exceptions, but as a beginner, you won't run across many


    Quote Originally Posted by DrQuincy View Post
    Should you always index both columns involved in a foreign key relationship?
    you might not have to

    i believe mysql now automatically creates foreign key indexes for you


    Quote Originally Posted by DrQuincy View Post
    Are joins faster with a foreign key in place? Or it is for data integrity only?
    technically speaking, it's the indexes that make joins faster, not the logical restriction of values

    yes, foreign keys are for data integrity (the logical restriction of values)


    Quote Originally Posted by DrQuincy View Post
    Is adding 'On INSERT CASCADE' really worth it on an auto-increment ID field?
    i'm not gonna reply to this one because it mixes too many concepts to untangle with a single straight answer


    Quote Originally Posted by DrQuincy View Post
    If you have 'ON DELETE RESTRICT' set up ... is there a specific error code for it?
    i think so, yes

    testing it outside of php will surely reveal what error number it is


    Quote Originally Posted by DrQuincy View Post
    Am I right in saying most FK relations are one-to-many.
    not most -- all


    Quote Originally Posted by DrQuincy View Post
    When you set the key up, do you always do it from the 'one' to the 'many'?
    nope, the other way around


    Quote Originally Posted by DrQuincy View Post
    If you want a zero-to-many do you allow NULL on the column on the left table...
    no, on the many side

    Quote Originally Posted by DrQuincy View Post
    and if you want a one-to-one you make both columns unique?
    this question is beyond the scope of a beginner's concern

    short answer: you may need application code to enforce this one
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    925
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    I'll jump in with my opinions for the few questions that Rudy left unanswered:

    Quote Originally Posted by DrQuincy View Post
    Is adding 'On INSERT CASCADE' really worth it on an auto-increment ID field? In practice they never really change do they? So is having an index here a waste of resources?
    I don't think ON INSERT CASCADE is available in MySQL (unless emulated by triggers) - or are you talking about ON UPDATE CASCADE? If you have a relation then you always need an index no matter if it's CASCADE, RESTRICT or SET NULL. CASCADE will not use up resources unless you perform the update. Personally, I tend to use ON UPDATE CASCADE even on relationships with auto-increment fields - yes, they normally don't change but they may change in some unusual circumstances like me doing some db maintenance, data cleanup or data corrections - then I prefer the db to allow me to change the PK instead of me going throught the hoops to change all the keys that are bound by the ON UPDATE RESTRICT relations. Of course, this will vary depending on the situation but if in doubt I simply allow CASCADE even if I'm not intending to use it. I think about it in another way: do I need the db to forbid changing the PK? If not, then I use CASCADE.

    Quote Originally Posted by DrQuincy View Post
    and if you want a one-to-one you make both columns unique?
    Yes, both unique or primary keys. You can do this but must remember that ON UPDATE/DELETE actions will be one-way only. To make them work in both directions you would need to employ some other mechanisms - I suspect you could do it with triggers.

  4. #4
    SitePoint Evangelist
    Join Date
    May 2006
    Posts
    423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Thanks to you both and sorry I made some mistakes:

    1. I got the many to one the wrong way round.
    2. Yes, I mean ON UPDATE CASCADE.

    Let's say you create a FK index like in SitePoint's very own tutorial:

    Code:
    ALTER TABLE `user`
    ADD CONSTRAINT `FK_course`
    FOREIGN KEY (`course`) REFERENCES `course` (`id`)
    ON UPDATE CASCADE;
    In this example do you refer to user as the child and course as the parent? With regards to this:

    i believe mysql now automatically creates foreign key indexes for you
    Usually the parent table's column will be a primary index anyway. Would it index the user.course automatically? I've been doing it manually.

  5. #5
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by DrQuincy View Post
    In this example do you refer to user as the child and course as the parent?
    yes


    Quote Originally Posted by DrQuincy View Post
    Would it index the user.course automatically?
    yes it would


    Quote Originally Posted by DrQuincy View Post
    I've been doing it manually.
    do a SHOW CREATE TABLE and you might see two indexes on the same column
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #6
    SitePoint Evangelist
    Join Date
    May 2006
    Posts
    423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    yes


    yes it would


    do a SHOW CREATE TABLE and you might see two indexes on the same column
    Thanks for that.

    I can confirm if there is no index it does indeed create one but if there is already one there it just uses that.

  7. #7
    SitePoint Evangelist
    Join Date
    May 2006
    Posts
    423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quick further question on joins.

    Am I right in saying in this situation you wouldn't use joins?

    You have a products table (e.g. item Apples) and a products unit table (e.g. items Granny Smith, Pink Lady, Golden Delicious). The products unit table has a belongsto column that is the parent of products id. Normally I would get the current product details from the database (1 query) and then SELECT name, price FROM productunits WHERE belongsto = productid ORDER BY name ASC (1 query).

    That's better with two queries, right? You could join them but to me it seems neater to do it in two queries rather than fetching the main product ID each time (which would be the same in each row).

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by DrQuincy View Post
    Am I right in saying in this situation you wouldn't use joins?
    no, you are not, sorry -- i definitely would use a join in a single query rather than two separate queries
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    925
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    I wouldn't use joins . The product table may have many other fields apart from ID like name, descriptions, etc. and I wouldn't want to fetch the same data with each item of the unit table. I don't know why Rudy would use a join since it also seems counter-intuitive to me - suppose I get 7 rows from such a join query and want to simply get data from the product table - then I get this data from the first returned row (or second, or third, etc.) and discard the rest because it is duplicated in every row - this doesn't look elegant, does it?

  10. #10
    SitePoint Evangelist
    Join Date
    May 2006
    Posts
    423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    I wouldn't use joins . The product table may have many other fields apart from ID like name, descriptions, etc. and I wouldn't want to fetch the same data with each item of the unit table. I don't know why Rudy would use a join since it also seems counter-intuitive to me - suppose I get 7 rows from such a join query and want to simply get data from the product table - then I get this data from the first returned row (or second, or third, etc.) and discard the rest because it is duplicated in every row - this doesn't look elegant, does it?
    Thanks. Well, that was my thinking. In my mind it just seems odd to get the product name and description with each row.

    How does MySQL work here? If I query the productunits table and join the products and it gets exactly the same name and description each time does MySQL query products per row or just once?

  11. #11
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    lemon juice, you're trolling me, aren't you



    my dear doctor quincy, let's not use made up tables, let's use your actual scenario

    please show your two tables, and the two queries you would run
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #12
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    925
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    lemon juice, you're trolling me, aren't you

    I just couldn't help myself.. .

    I thought I could at least provoke you to offer a short explanation of why you would use a join or why two queries is a bad idea

  13. #13
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    ... offer a short explanation of why you would use a join or why two queries is a bad idea
    overall performance

    short enough?

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  14. #14
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    925
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    overall performance

    short enough?

    Short enough but not convincing enough . I don't think this kind of SELECT with JOIN will be faster (by any significant margin) than two plain SELECTs given the fact that the join will fetch n-times more data unnecessarily because of duplication of the same product fields in each row. At best I'd consider both methods more or less equal in performance.

    I'm also thinking about the issue of fetching the data by a scripting language, which is common on the web. Getting the results from a simple select is easy and logical while trying to extract the product item from the unit items being combined in every single row is slightly convoluted. Not to mention the mess if someone uses a database abstraction layer where each row goes to a separate object of given type...

  15. #15
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    ... trying to extract the product item from the unit items being combined in every single row is slightly convoluted.
    only if you are a very junior programmer

    Quote Originally Posted by Lemon Juice View Post
    Not to mention the mess if someone uses a database abstraction layer where each row goes to a separate object of given type...
    tail wagging the dog, dude

    feel free to run comprehensive benchmarks

    as a general rule of thumb, my advice is rock solid -- a join query is always* better than two separate queries

    * some conditions apply -- see, for example, http://sqllessons.com/sqlhack75.html
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  16. #16
    SitePoint Evangelist
    Join Date
    May 2006
    Posts
    423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    only if you are a very junior programmer

    tail wagging the dog, dude

    feel free to run comprehensive benchmarks

    as a general rule of thumb, my advice is rock solid -- a join query is always* better than two separate queries

    * some conditions apply -- see, for example, http://sqllessons.com/sqlhack75.html
    Just out at the moment so will reply in more detail later but that links seems to suggest the problem is with number of queries rather than joins per se.

    I wonder whether the difference is negligible and the gain of more logical results is worth it. (To use union the column numbers need to match so not always practical.)

    If extra queries are so bad what about mysqli multi query? Only one server trip...

    Useful discussion here for me either way, thanks.

  17. #17
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,048
    Mentioned
    16 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by r937
    as a general rule of thumb, my advice is rock solid -- a join query is always* better than two separate queries
    Taking application programming out of the mix yes. However, once application logic is introduced there are several areas where performance and maintainability can both be gained using multiple queries. With that in mind how rarely as a software engineer would one write a program with only SQL. It would be naive to consider only the performance of a query when building any application. I know this is not a discussion about applications but it is difficult to talk about performance without considering the entire gamut of technology used to deliver an end solution/service.

    Quote Originally Posted by DrQuincy
    Are less queries always better? E.g. You can query a product and then query the category it belongs to in two queries or you can use one in a single JOIN query? Is the latter better? I'm guessing it means less trips to the server if you're using PHP.
    Just as with your example when application logic is introduced that leaves room for caching and other application side techniques. Techniques that can significantly reduce load and increase performance over a large span of requests. Perhaps the initial request to fetch data from the service layer might be a little slower when using multiple queries but requests following are much faster due to caching with the added benefit of producing a much more stable, maintainable, and flexible infrastructure overall.

    Though I would tend to agree that as a beginner when writing plain jane queries try to do things a single query. For example, don't select from one table than loop through the result and for each row run a query to select data. In just about all cases when that is done a single query can be created instead. In more cases than not unless you are using or building your own ORM w/ caching the fewer queries the better.
    The only code I hate more than my own is everyone else's.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •