SitePoint Sponsor

User Tag List

Page 1 of 2 12 LastLast
Results 1 to 25 of 30
  1. #1
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Critique database structure (Has a way to go)

    So I've done a lot with databases but haven't ever really worked on a website where efficiency has to be managed well due to a large amount of traffic. The idea is pretty simple when you get to reading the structure, and I'm willing to answer any questions about any tables.

    In my previous database designs I found myself incorporating a lot of things that really should've had their own tables into other tables. For example I had an articles table that had a tags field with a comma-seperated string for their tags, and no definition of tags in their own table. Which I came to pay for in the long run when I wanted to make the tags part of the site more in-depth I couldn't keep putting massive strain on the database. So this design I tried to keep data seperate from one another but included what I thought was necessary, there are comments at the end of each table to indicate their primary function.

    My biggest concerns are indexes, foreign keys, and MyISAM vs InnoDB, I've been doing a lot of reading on how one is better for a lot more select queries, and another is better for more write transactions. On the premise of the site I would have to say the amount of viewing (select queries) to writing(update/inserts) would be 2:1 perhaps, so twice as many selects to writes.

    However I believe only InnoDB supports foreign keys, so if anyone seems that they are required then that will settle that component of it.

    I appreciate you guys taking the time to look at this lengthy post and table structure and giving feedback.

    Code MySQL:
    CREATE TABLE `theaters` (
    `theaterID` INT( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `companyID` INT( 11 ) NULL ,
    `name` VARCHAR( 100 ) NOT NULL ,
    `description` VARCHAR( 255 ) NULL,
    `mainImage` VARCHAR( 150) NULL,
    `address` VARCHAR( 100 ) NOT NULL ,
    `city` VARCHAR( 50 ) NOT NULL ,
    `state` VARCHAR( 2 ) NOT NULL ,
    `phone` INT( 11 ) NULL COMMENT  'Raw digits of a phone number, dashes can be added through PHP.'
    ) ENGINE = INNODB 
    COMMENT =  'Sole purpose is to hold the main attributes of a movie theater.';
     
    CREATE TABLE `companies` (
    `companyID` INT( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `name` VARCHAR( 50 ) NOT NULL ,
    `description` TEXT NULL
    ) ENGINE = INNODB
    COMMENT =  'Used to store each company''s name and ID for reference of theaters.';
     
    CREATE TABLE `theater_company` (
    `theaterID` INT( 11 ) NOT NULL ,
    `companyID` INT( 11 ) NOT NULL
    ) ENGINE = INNODB 
    COMMENT =  'Stores the relationship of theaters to companies. (1 to Many)';
     
    CREATE TABLE `theater_pricing` (
    `theaterID` INT( 11 ) NOT NULL ,
    `adult` VARCHAR( 5 ) NULL ,
    `senior` VARCHAR( 5 ) NULL ,
    `child` VARCHAR( 5 ) NULL ,
    `3d` VARCHAR( 5 ) NULL ,
    `matinee` VARCHAR( 5 ) NULL
    ) ENGINE = INNODB 
    COMMENT =  'Prices will be stored in a 10.75 format ($10.75).';
     
    CREATE TABLE `theater_reviews` (
    `reviewID` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `theaterID` INT( 11 ) NOT NULL ,
    `overall` FLOAT NOT NULL COMMENT  'Averages the other ratings',
    `reviewContent` VARCHAR( 255 ) NOT NULL COMMENT  'Since it''s mostly rating based, this is a small input field for additional comments.' ,
    `picQuality` TINYINT( 1 ) NULL ,
    `audioQuality` TINYINT( 1 ) NULL ,
    `priceQuality` TINYINT( 1 ) NULL ,
    `staffQuality` TINYINT( 1 ) NULL ,
    `restroomQuality` TINYINT( 1 ) NULL ,
    `stickyFloor` TINYINT( 1 ) NULL ,
    `seatingQuality` TINYINT( 1 ) NULL ,
    `concessionSelection` TINYINT( 1 ) NULL ,
    `concessionPrices` TINYINT( 1 ) NULL ,
    `visitAgain` TINYINT( 1 ) NULL COMMENT  'Yes/No - No=0, Yes=5'
    ) ENGINE = INNODB 
    COMMENT =  'Used to store individual reviews, all attributes are on a 1-5 integer scale.';
     
    CREATE TABLE `theater_comments` (
    `commentID` INT( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `theaterID` INT( 11 ) NOT NULL ,
    `userID` INT( 11 ) NOT NULL COMMENT  'Who posted this comment?',
    `timePosted` INT( 11 ) NOT NULL COMMENT  'PHP Generated timestamp.',
    `title` VARCHAR( 100 ) NOT NULL ,
    `comment` TEXT NOT NULL
    ) ENGINE = INNODB
    COMMENT =  'Used to store comments specifically about a theater.';
     
    CREATE TABLE `theater_likes` (
    `theaterID` INT( 11 ) NOT NULL ,
    `userID` INT( 11 ) NOT NULL ,
    `likes` TINYINT( 1 ) NOT NULL COMMENT  '0 for dislike, 1 for like.'
    ) ENGINE = INNODB
    COMMENT =  'Stores all users votes whether it be like or dislike.';
     
    CREATE TABLE `theater_images` (
    `theaterID` INT( 11 ) NOT NULL ,
    `userID` INT( 11 ) NOT NULL COMMENT  'User that uploaded the image.',
    `imageLocation` VARCHAR( 150 ) NOT NULL COMMENT  'URL to the specific image.'
    ) ENGINE = INNODB 
    COMMENT =  'Used to store the locations of various images of the specific theater ID.';
     
    CREATE TABLE `theater_visits` (
    `theaterID` INT( 11 ) NOT NULL ,
    `userID` INT( 11 ) NOT NULL COMMENT  'What user visited theaterID'
    ) ENGINE = INNODB 
    COMMENT =  'Stores what user has visited what theater.';

  2. #2
    SitePoint Wizard bronze trophy C. Ankerstjerne's Avatar
    Join Date
    Jan 2004
    Location
    The Kingdom of Denmark
    Posts
    2,692
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Either you should remove the companyID field from the theaters table, or you should remove the theater_company table altogether. Which depends on the relationship:
    • If a theater can only ever be connected to one company, then having the companyID field in the theaters is probably the best solution.
    • If a theater can be connected to more than one company, then you should definitely keep the theater_company table, and remove the companyID from the theaters table.
    Christian Ankerstjerne
    <p<strong<abbr/HTML/ 4 teh win</>
    <>In Soviet Russia, website codes you!

  3. #3
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by C. Ankerstjerne View Post
    Either you should remove the companyID field from the theaters table, or you should remove the theater_company table altogether. Which depends on the relationship:
    • If a theater can only ever be connected to one company, then having the companyID field in the theaters is probably the best solution.
    • If a theater can be connected to more than one company, then you should definitely keep the theater_company table, and remove the companyID from the theaters table.
    Thank you very much for the quick feedback, a theater:company is a 1:0 or 1 relationship, so I should remove the theater_company table entirely, and keep the companyID field in theaters as you suggest?

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    what is a company, anyway (i think i have the theatre part figured out)

    why does the comment say that relationship of theaters to companies is one-to-many?
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yeah r937, I actually editted that out of my own personal version because I realized that's wrong, I re-explained the relationship in my most recent post.

    Companies own theaters, like your local theater may be owned by AMC, Regal, etc, and if the companyID is null, it will be referred to as unknown or independent.

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    okay, so theater_company table is not required

    tip: declare your prices as DECIMAL(4,2) so that you can, someday, do AVG(price) and similar calculations
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    okay, so theater_company table is not required

    tip: declare your prices as DECIMAL(4,2) so that you can, someday, do AVG(price) and similar calculations
    Thanks r937, gonna be honest you've helped me in the past and was hoping you would have some input.

  8. #8
    SitePoint Wizard bronze trophy C. Ankerstjerne's Avatar
    Join Date
    Jan 2004
    Location
    The Kingdom of Denmark
    Posts
    2,692
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Just a quick note: If you remove the theater_company table, and stick with the companyID field in the theaters table, how will you handle a case where a theater is owned as a joint-venture between two companies?

    There might be ways to work around this, but work-arounds are seldom pretty, especially when you have loads of data in the tables.
    Christian Ankerstjerne
    <p<strong<abbr/HTML/ 4 teh win</>
    <>In Soviet Russia, website codes you!

  9. #9
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by C. Ankerstjerne View Post
    Just a quick note: If you remove the theater_company table, and stick with the companyID field in the theaters table, how will you handle a case where a theater is owned as a joint-venture between two companies?

    There might be ways to work around this, but work-arounds are seldom pretty, especially when you have loads of data in the tables.
    Thanks for the information C. Ankerstjerne, I considered that and referred to my client who stated a theater will only be owned by one company, though I can see how that is a possible change in the future. I'm unsure what to do as I want it to be as minimal as possible (less confusing for myself) at the time being, while being flexible for the future as you suggest.

    However, to all the others browsing this thread to help me, here is an updated table structure:
    Code MySQL:
    CREATE TABLE `theaters` (
    `theaterID` INT( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `companyID` INT( 11 ) NULL ,
    `name` VARCHAR( 100 ) NOT NULL ,
    `description` VARCHAR( 255 ) NULL,
    `mainImage` VARCHAR( 150) NULL,
    `address` VARCHAR( 100 ) NOT NULL ,
    `city` VARCHAR( 50 ) NOT NULL ,
    `state` VARCHAR( 2 ) NOT NULL ,
    `phone` INT( 11 ) NULL COMMENT  'Raw digits of a phone number, dashes can be added through PHP.'
    ) ENGINE = INNODB 
    COMMENT =  'Sole purpose is to hold the main attributes of a movie theater.';
     
    CREATE TABLE `companies` (
    `companyID` INT( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `name` VARCHAR( 50 ) NOT NULL ,
    `description` TEXT NULL
    ) ENGINE = INNODB
    COMMENT =  'Used to store each company''s name and ID for reference of theaters.';
     
    CREATE TABLE `theater_pricing` (
    `theaterID` INT( 11 ) NOT NULL ,
    `adult` DECIMAL(4,2) NULL ,
    `senior` DECIMAL(4,2) NULL ,
    `child` DECIMAL(4,2) NULL ,
    `3d` DECIMAL(4,2) NULL ,
    `matinee` DECIMAL(4,2) NULL 
    ) ENGINE = INNODB 
    COMMENT =  'Prices will be stored in a 10.75 format ($10.75).';
     
    CREATE TABLE `theater_reviews` (
    `reviewID` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `theaterID` INT( 11 ) NOT NULL ,
    `overall` FLOAT NOT NULL COMMENT  'Averages the other ratings',
    `reviewContent` VARCHAR( 255 ) NOT NULL COMMENT  'Since it''s mostly rating based, this is a small input field for additional comments.' ,
    `picQuality` TINYINT( 1 ) NULL ,
    `audioQuality` TINYINT( 1 ) NULL ,
    `priceQuality` TINYINT( 1 ) NULL ,
    `staffQuality` TINYINT( 1 ) NULL ,
    `restroomQuality` TINYINT( 1 ) NULL ,
    `stickyFloor` TINYINT( 1 ) NULL ,
    `seatingQuality` TINYINT( 1 ) NULL ,
    `concessionSelection` TINYINT( 1 ) NULL ,
    `concessionPrices` TINYINT( 1 ) NULL ,
    `visitAgain` TINYINT( 1 ) NULL COMMENT  'Yes/No - No=0, Yes=5'
    ) ENGINE = INNODB 
    COMMENT =  'Used to store individual reviews, all attributes are on a 1-5 integer scale.';
     
    CREATE TABLE `theater_comments` (
    `commentID` INT( 11 ) NOT NULL AUTO_INCREMENT PRIMARY KEY ,
    `theaterID` INT( 11 ) NOT NULL ,
    `userID` INT( 11 ) NOT NULL COMMENT  'Who posted this comment?',
    `timePosted` INT( 11 ) NOT NULL COMMENT  'PHP Generated timestamp.',
    `title` VARCHAR( 100 ) NOT NULL ,
    `comment` TEXT NOT NULL
    ) ENGINE = INNODB
    COMMENT =  'Used to store comments specifically about a theater.';
     
    CREATE TABLE `theater_likes` (
    `theaterID` INT( 11 ) NOT NULL ,
    `userID` INT( 11 ) NOT NULL ,
    `likes` TINYINT( 1 ) NOT NULL COMMENT  '0 for dislike, 1 for like.'
    ) ENGINE = INNODB
    COMMENT =  'Stores all users votes whether it be like or dislike.';
     
    CREATE TABLE `theater_images` (
    `theaterID` INT( 11 ) NOT NULL ,
    `userID` INT( 11 ) NOT NULL COMMENT  'User that uploaded the image.',
    `imageLocation` VARCHAR( 150 ) NOT NULL COMMENT  'URL to the specific image.'
    ) ENGINE = INNODB 
    COMMENT =  'Used to store the locations of various images of the specific theater ID.';
     
    CREATE TABLE `theater_visits` (
    `theaterID` INT( 11 ) NOT NULL ,
    `userID` INT( 11 ) NOT NULL COMMENT  'What user visited theaterID'
    ) ENGINE = INNODB 
    COMMENT =  'Stores what user has visited what theater.';

  10. #10
    SitePoint Zealot
    Join Date
    Dec 2010
    Posts
    187
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'll just comment a few things that I've noticed, from technical point of view - some might find it as nitpicking.

    InnoDB vs MyISAM - InnoDB.
    It scales better. It doesn't lock down entire table when writing new data to it, so if your website is such that certain tables might experience a lot of traffic - that's why you should avoid MyISAM.
    The only advantage MyISAM has is fulltext index support, but I really haven't seen an application of larger scale that decided to use MyISAM due to that feature since there are engines that do searches way better and faster than you could do with MyISAM + fulltext indexes + your application logic (for example, Sphinx is one of those fulltext searching engines).
    You probably know about transactions already, so there's no point bringing that up

    Second thing is, and this IS nitpicking - how come your integer columns are signed? Especially auto_increment ones - I mean, are you really going to use negative integer values?
    If not, you just wasted yourself a good half of numbers at your disposal - but then again, it would be a few centuries before you ran out of them even when they're signed integers.

    The other thing - if you won't have 2 to the power of 32 of company IDs, you can reduce its data storage type to a smaller integer, which takes up less space, which speeds up the whole deal with reading and displaying everything.

    As I said, it is a bit of nitpicking as other, more experienced people, have covered the bits when it comes to actual db design.
    But if you are worried about performance of the site, have you considered caching mechanisms of any sort so you alleviate database load?

  11. #11
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Second thing is, and this IS nitpicking - how come your integer columns are signed? Especially auto_increment ones - I mean, are you really going to use negative integer values?
    If not, you just wasted yourself a good half of numbers at your disposal - but then again, it would be a few centuries before you ran out of them even when they're signed integers.
    Sorry I don't understand what you mean by signed integer values, I'm assuming you mean the fact that I specified them to 11 characters. I never thought about why or why not to do it, it's just something I've always done.

    The other thing - if you won't have 2 to the power of 32 of company IDs, you can reduce its data storage type to a smaller integer, which takes up less space, which speeds up the whole deal with reading and displaying everything.
    So would you suggest using TINYINT or SMALLINT, and if so how many characters, I can't imagine more than a thousand companies at the absolute most, so maybe TINY or SMALL int would be the way to go.

    As for your remark about caching mechanisms, I'll be blunt in stating I'm not working on a website that's expecting a large amount of users, simply trying to prepare for the future in database structure.

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Zurev View Post
    I'm assuming you mean the fact that I specified them to 11 characters. I never thought about why or why not to do it, it's just something I've always done.
    the number in parentheses does not define how many digits the column can hold

    all integers can hold the same range of numbers, whether you say INT(11) or INT(3) or INT(937)

    his comment about signed integers related to this...
    • INTEGER holds values from -2147483648 to 2147483647, but auto_increments start at 1, so a signed integer auto_increment holds just over 2 billion values

    • UNSIGNED INTEGER holds values from 0 to 4294967295, so since auto_increments start at 1, an unsigned integer auto_increment holds just over 4 billion values, i.e. twice as many


    Quote Originally Posted by Zurev View Post
    So would you suggest using TINYINT or SMALLINT, and if so how many characters, I can't imagine more than a thousand companies at the absolute most, so maybe TINY or SMALL int would be the way to go.
    TINYINT UNSIGNED only goes to 255

    all these ranges are clearly spelled out in da manual

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    the number in parentheses does not define how many digits the column can hold

    all integers can hold the same range of numbers, whether you say INT(11) or INT(3) or INT(937)

    his comment about signed integers related to this...
    • INTEGER holds values from -2147483648 to 2147483647, but auto_increments start at 1, so a signed integer auto_increment holds just over 2 billion values

    • UNSIGNED INTEGER holds values from 0 to 4294967295, so since auto_increments start at 1, an unsigned integer auto_increment holds just over 4 billion values, i.e. twice as many


    TINYINT UNSIGNED only goes to 255

    all these ranges are clearly spelled out in da manual

    r937:
    Your explanation on unsigned integers as opposed to signed was perfect, I always wondered what that column was for. (Learned MySQL via phpmyadmin, not properly learning it ).

    After looking at the manual with integer types, TINYINT maximum of 255 seems to be spot on for the ratings since they max at 5, and I believe SMALLINT should suit the companyIDs as a possible range is 1-65535.

    Now, as I attempt to milk this community for all it's knowledge. As I've been wondering about indices, I know they're already assigned to tables with primary keys, but as for those that don't (like: theater_images,theater_visits), I should assign indices to the theaterID field in these tables since that's most likely what it will be called upon in queries in the code?

    Also, as far as foreign keys go, I hear that they're necessary, however after doing some research I realize the constraint is between two tables only, so honestly I can't see any two specific tables where they would be beneficial, except maybe theaters:theater_comments. Any input on that?

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Zurev View Post
    ...the constraint is between two tables only, so honestly I can't see any two specific tables where they would be beneficial, except maybe theaters:theater_comments. Any input on that?
    you should have foreign keys in almost all your tables

    every table besides the theaters table which has a theaterID should have a foreign key to the theaters table

    for example --
    Code:
    CREATE TABLE theater_likes 
    ( theaterID INTEGER NOT NULL 
    , CONSTRAINT likes_theater_fk
        FOREIGN KEY ( theaterID )
          REFERENCES theaters ( theaterID )
    , userID    INTEGER NOT NULL 
    , CONSTRAINT likes_user_fk
        FOREIGN KEY ( userID )
          REFERENCES users ( userID )
    , PRIMARY KEY ( theaterID , userID )
    , likes TINYINT NOT NULL COMMENT  '0 for dislike, 1 for like.'
    ) ENGINE = INNODB
    COMMENT =  'Stores all users votes whether it be like or dislike.';
    also, this --
    Code:
    CREATE TABLE theaters 
    ( theaterID INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY 
    , companyID INTEGER NULL 
    , CONSTRAINT theater_company_fk
        FOREIGN KEY ( companyID )
          REFERENCES companies ( companyID )
    , name VARCHAR( 100 ) NOT NULL 
    , ...
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #15
    SitePoint Zealot
    Join Date
    Dec 2010
    Posts
    187
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I've got several questions that are in relation to actual performance and not database design itself since it's against the theory and some might not like it.

    I've noticed that certain tables don't have surrogate primary key (auto_increment) but a natural one (composite of userID and theaterID) and so on.
    Also, those tables are defined as InnoDB.

    When using natural key rather than surrogate to uniquely identify the row in the table, InnoDB will create overhead and you lose the inherent ability of InnoDB to quickly find rows based on primary key lookup.
    When primary key is nonsequential, which is the case of natural keys, InnoDB uses 8 bit signed integer for its internal uses, instead of 4 byte one that is usually the case. You can also refer to http://dev.mysql.com/doc/refman/4.1/...dex-types.html for some further insight on inner workings.

  16. #16
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by furicane View Post
    When using natural key rather than surrogate to uniquely identify the row in the table, InnoDB will create overhead and you lose the inherent ability of InnoDB to quickly find rows based on primary key lookup.
    could you please provide a reference for this claim

    the index created for the primary key does ~not~ become unusable just because it isn't declared on an auto_increment

    i suspect you may have misinterpreted something that you read somewhere...


    Quote Originally Posted by furicane View Post
    When primary key is nonsequential, which is the case of natural keys...
    this is nonsense, sorry
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  17. #17
    SitePoint Zealot
    Join Date
    Dec 2010
    Posts
    187
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    First of all, hello Rudy and I don't mean to argue here about things but what I wrote are all facts and of course, I'll provide the sources and I'll try to explain more clearly what I meant since I think I wasn't clear enough.

    Natural key - here being userID + theaterID - aren't sequential in lookup table theater_likes, or to put it more simply - what guarantee is there that user with ID 1 will be the 1st entry in the table and not user with ID 2 and so on so I don't know why it's nonsense in this particular case that the primary key is not sequential but who am I to judge.

    could you please provide a reference for this claim
    http://blog.johnjosephbachir.org/200...-primary-keys/

    I also stand corrected, InnoDB creates overhead if no primary key is specified so I'm sorry for posting false information. However, the blog linked up clears up certain things about how InnoDB works internally and why it's good that each table has its own primary key and foreign keys for join purposes.

  18. #18
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    hello furicane

    you brought up an interesting point of discussion, so i will respond with some thoughts

    the blog you linked to says innodb primary keys should be "sequential"

    however, it also explains why, and the reason is, the pk index is used as a clustering index -- so what the recommendation actually should have said is that inserts should be sequential, i.e. that new rows should have a pk value that sequentially comes in at the top end of the existing range of values, rather than somewhere in the middle, which can cause index reorganizations

    you can not say that a primary key is not sequential, because of course ~every~ primary key is sequential, since any set of values, whether they be numbers, strings, dates, can be sorted into a sequence

    it is, though, important that inserts not cause frequent index reorganizations, and i'm not going to argue with that

    but wait...

    suppose we take this example --
    Code:
    CREATE TABLE theater_likes 
    ( theaterID INTEGER NOT NULL 
    , userID    INTEGER NOT NULL 
    , PRIMARY KEY ( theaterID , userID )
    , likes TINYINT NOT NULL 
    );
    the primary key becomes the clustering index, and there is one index

    however, if we introduce a surrogate key like this --
    Code:
    CREATE TABLE theater_likes 
    ( theater_likes_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT
    , theaterID INTEGER NOT NULL 
    , userID    INTEGER NOT NULL 
    , UNIQUE KEY ( theaterID , userID )
    , likes TINYINT NOT NULL 
    );
    notice the surrogate key forces us to declare an additional key, the UNIQUE key on the pair of columns, which is necessary to ensure that each user can like any theater only once -- data integrity, dontcha know

    so now there is a good primary key index (which guarantees that new inserts come at the end of the clustering index), but there is also a second index!

    furthermore, the pk index will never be utilized in searches!

    i would need to see real world benchmark data before i concede that having two indexes to update instead of one is better

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  19. #19
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    you should have foreign keys in almost all your tables

    every table besides the theaters table which has a theaterID should have a foreign key to the theaters table

    for example --
    Code:
    CREATE TABLE theater_likes 
    ( theaterID INTEGER NOT NULL 
    , CONSTRAINT likes_theater_fk
        FOREIGN KEY ( theaterID )
          REFERENCES theaters ( theaterID )
    , userID    INTEGER NOT NULL 
    , CONSTRAINT likes_user_fk
        FOREIGN KEY ( userID )
          REFERENCES users ( userID )
    , PRIMARY KEY ( theaterID , userID )
    , likes TINYINT NOT NULL COMMENT  '0 for dislike, 1 for like.'
    ) ENGINE = INNODB
    COMMENT =  'Stores all users votes whether it be like or dislike.';
    also, this --
    Code:
    CREATE TABLE theaters 
    ( theaterID INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY 
    , companyID INTEGER NULL 
    , CONSTRAINT theater_company_fk
        FOREIGN KEY ( companyID )
          REFERENCES companies ( companyID )
    , name VARCHAR( 100 ) NOT NULL 
    , ...
    Thanks so much for explains that r937. So I've been reading up on the MySQL documentation and came across this:
    If ON DELETE or ON UPDATE are not specified, the default action is RESTRICT.
    Later restrict is defined as rejecting the action on the parent table:
    Specifying RESTRICT (or NO ACTION) is the same as omitting the ON DELETE or ON UPDATE clause.
    So clearly this is not what I want correct, since if a theater gets deleted, I would want all comments for that theater deleted as well, same with likes, images, comments, etc. So after reading through, I would want
    CASCADE: Delete or update the row from the parent table, and automatically delete or update the matching rows in the child table.
    I believe however, that I would only want on delete cascade, and not on update?

  20. #20
    SitePoint Zealot
    Join Date
    Dec 2010
    Posts
    187
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    First off, I totally worded wrong what I meant since English is my 3rd language so phrasing my thoughts is kinda harder.


    you can not say that a primary key is not sequential, because of course ~every~ primary key is sequential, since any set of values, whether they be numbers, strings, dates, can be sorted into a sequence
    This is what I had in mind - when you use a composite key userID, theaterID you get values such as [1,1], [2,1], [123,1], [3,1] and so on - so yes, they are inserted sequentially but their values do not correspond to the insert order.
    If you have a key such as auto_increment, every next value is sequential - incremented by 1 or whatever is set as the offset in the .cnf file.

    Now on to your (valid) point - yes, surrogate key would require yet another index and no, I'm not suggesting that having 3 indexes beats one composite.

  21. #21
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Zurev View Post
    I believe however, that I would only want on delete cascade, and not on update?
    ON DELETE CASCADE makes sense

    as for ON UPDATE, i would make that CASCADE as well, but you could also make it RESTRICT

    when would you need to cascade an update? you would need it if you ever decided to change the id of a theater from one number to a different number

    why would you do that, you ask? that's a different question, isn't it
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  22. #22
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    So, as I'm wrapping things up and about to make all the appropriate changes to the database structure, I'm just a bit stuck on a small portion of the foreign keys.

    Say I have my theater_reviews, with a foreign key of theaterID referencing the theaters tables theaterID. If I set ON DELETE CASCADE, then when a review is deleted, it's deleted from the theater_reviews (parent table), and any matching rows in the theaters table (child) are also deleted. Now shouldn't that be 0 rows in the theaters table are deleted if a review is deleted? If so, is there any purpose to setting ON DELETE/UPDATE CASCADE on any tables besides theaters-companies?

  23. #23
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Zurev View Post
    Say I have my theater_reviews, with a foreign key of theaterID referencing the theaters tables theaterID. If I set ON DELETE CASCADE, then when a review is deleted, it's deleted from the theater_reviews (parent table), and any matching rows in the theaters table (child) are also deleted.
    i am afraid you have this completely backwards

    when a row in the theater table (the parent table with the primary key) is deleted, then all related rows in the theater_reviews table (the child table, with the foreign key that references the theaters table) are also deleted if ON DELETE CASCADE is set

    if you just delete a review, this has no effect on the parent theater


    Quote Originally Posted by Zurev View Post
    ... is there any purpose to setting ON DELETE/UPDATE CASCADE on any tables besides theaters-companies?
    yes, on all the tables linked to a theater, so that if you delete a theater, everything related to it also gets deleted -- pricing, reviews, comments, likes, images...
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  24. #24
    SitePoint Zealot Zurev's Avatar
    Join Date
    Feb 2009
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ah, see my issue was I was assuming that the table with the constraint referencing another table was considered the parent, where it's the complete opposite, whoops.

    So anytime a table is creating a foreign key it is the child to the table it's referencing, gotcha. (Hope I didn't get that wrong and you have to give me another backwards reaction :P)

  25. #25
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Zurev View Post
    So anytime a table is creating a foreign key it is the child to the table it's referencing, gotcha.
    correct
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •