SitePoint Sponsor

User Tag List

Results 1 to 22 of 22
  1. #1
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Populating tables with last_insert_id() function

    Hi everyone,

    A while back I asked a question about populating foreign key fields in a database, and I was told that people usually use a last_insert_id() function to do this. I understand that this would work if I was populating the database one row at a time (appparently the function only inserts 1 id at a time), but I planned to enter the data all at once using tab delimited .txt files and LOAD DATA INFILE.

    I planned to populate all the tables in this way except the linking tables, eg. garment_to_colour etc. where the foreign key problem exists. Is it true to say that it's not possible to use the last_insert_id() function to populate the linking tables if I insert the data all at once as described above?

    Just wondered if someone could help solve this problem?

    CREATE TABLE `garments` (
    `garment_id` smallint(5) unsigned NOT NULL auto_increment,
    `supplier` varchar(30) NOT NULL,
    `garment_type` varchar(30) NOT NULL,
    `title` varchar(60) NOT NULL,
    `code` varchar(20) NOT NULL,
    `description` varchar(400) NOT NULL,
    `extra_info` varchar(50) default NULL,
    `image` enum('y','n') NOT NULL,
    `swatch_image` enum('y','n') NOT NULL,
    PRIMARY KEY (`garment_id`),
    UNIQUE KEY `supplier` (`supplier`,`garment_type`, `description`,
    `title`,`code`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;


    CREATE TABLE `colours` (
    `colour_id` smallint(5) unsigned NOT NULL auto_increment,
    `colour` varchar(20) NOT NULL,
    PRIMARY KEY (`colour_id`),
    UNIQUE KEY `colour` (`colour`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;


    CREATE TABLE `garment_to_colour` (
    `garment_id` smallint(5) unsigned NOT NULL,
    `colour_id` smallint(5) unsigned NOT NULL,
    PRIMARY KEY (`garment_id`,`colour_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;


    CREATE TABLE `sizes` (
    `size_id` smallint(5) unsigned NOT NULL auto_increment,
    `size` varchar(15) NOT NULL,
    PRIMARY KEY (`size_id`),
    UNIQUE KEY `size` (`size`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;


    CREATE TABLE `garment_to_size` (
    `garment_id` smallint(5) unsigned NOT NULL,
    `size_id` smallint(5) unsigned NOT NULL,
    PRIMARY KEY (`garment_id`,`size_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;


    CREATE TABLE `categories` (
    `cat_id` smallint(5) unsigned NOT NULL auto_increment,
    `category` varchar(30) NOT NULL,
    PRIMARY KEY (`cat_id`),
    UNIQUE KEY `category` (`category`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;


    CREATE TABLE `garment_to_category` (
    `garment_id` smallint(5) unsigned NOT NULL,
    `cat_id` smallint(5) unsigned NOT NULL,
    PRIMARY KEY (`garment_id`,`cat_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by gwh View Post
    Is it true to say that it's not possible to use the last_insert_id() function to populate the linking tables if I insert the data all at once as described above?
    yes
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for the reply,

    Entering one row at a time would probably be okay in my situation since I only have a relatively small number of rows compared to other databases, but I just wondered how really huge databases with lookup tables get populated. Is it just a painstaking process of entering row by row? I just thought there must be a quicker way - especially with ones with hundreds of thousands of rows.

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    to answer your question, i need to ask you a question

    suppose you add a whole mess of garments using LOAD DATA INFILE

    how do you know what colours each garment has? where does that information come from?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    After populating the garments table and colours table, I'd have a list of garments and colours with a unique id. I guess I thought I'd print out the entire lists from both these tables and then fill up the garment_to_colour lookup table manually, ie. insert say colour ids 1, 2 and 5 into three different rows of the colour_id column, then insert the garment id into 3 matching rows of the garment_id column, say 3, 3 and 3 (ie. if 3 was the garment_id record that related to the three colour_ids mentioned).

    Does that sound right? Sorry if I sound like I don't know what I'm doing - truth is I'm just learning as I'm going.

    What do you think?

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by gwh View Post
    After populating the garments table and colours table, I'd have a list of garments and colours with a unique id. I guess I thought I'd print out the entire lists from both these tables and then fill up the garment_to_colour lookup table manually, ie. insert say colour ids 1, 2 and 5 into three different rows of the colour_id column, then insert the garment id into 3 matching rows of the garment_id column, say 3, 3 and 3 (ie. if 3 was the garment_id record that related to the three colour_ids mentioned).

    Does that sound right?
    well, i dunno about right, but it sure sounds tedious if you really plan to do "hundreds of thousands of rows" by hand

    how would knowing the last_insert_id values immediately after loading the garments help this scenario? (hint: it wouldn't)

    now let me suggest a different scenario

    but first, a couple of questions

    can you tell one garment apart from another garment without having to refer to an auto_increment id?

    what makes one colour different from another, again without reference to an auto_increment id?

    now, instead of entering numbers manually, would you be able to say that garment such-and-such is this colour and that colour, while garment so-and-so is some other colour?

    in other words, can you solve this problem without auto_increments? (hint: yes)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yeah you're right about the last_insert_id not helping.

    It's also true that you can tell one garment apart without an auto_increment id but I thought you were supposed to have an id on each table?

    The names of the colours make them different from one another but again I'd be repeating these colours throughout the table since they apply to many different garments.

    When I first started this project I started typing out all the details of each garment in an excel spreadsheet and when I looked down the columns I saw the many repetitions of size, colour, supplier, category etc. so I thought I had to normalise it all. If I put all the columns into one table it'll look exactly like the spreadsheet I had originally - it wouldn't be a proper database right?

    I realise you said that repetition isn't that much of a problem but I read that this was the rule of the first normal form, ie. to do away with the repetition.

    Anyway, these are the answers to your questions - can you tell me what your different scenario would be?

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by gwh View Post
    If I put all the columns into one table it'll look exactly like the spreadsheet I had originally - it wouldn't be a proper database right?
    actually, yes it would

    the following is not repetition --

    Code:
    fname lname  salary
    john smith    1,500
    john brown    1,937
    john black    2,000
    john baker    1,400
    john white    1,225
    just because a bunch of employees all have the same first name, you don't have to eliminate the "repetition" by putting them into a first names table and linking the first names to the employee table with some id number -- because the id number would repeat just as often!!

    on the other hand, the following is the type of repetition that you need to be concerned about for normaliazation --
    Code:
    
    garment colour1 colour2 colour3 colour4 colour5
    shirt   red     green   blue    --      --
    pants   black   --      --      --      --
    skirt   pink    coral   fuscia  peach   --
    with me so far?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I do get the idea in the examples you gave, but just as a further example to clarify, say I had a books table which had columns for book_id, isbn, title, publisher, author and description.

    Isn't it true that publisher has a one to many relationship with the books, ie. a publisher could have published more than one book, therefore a separate publishers table would be needed and a foreign key pub_id column would go into the books table? Also the authors would go into a separate table and then, due to the many-to-many relationship of book to authors, a book_to_author lookup table would be had.

    Sorry to harp on about this - the above example was taken from a book I was reading so I just wondered if you agreed with the database structure of that example - if not I'm very confused.

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    i certainly don't mind going into detail, but we seem to be drifting further away from garments and colours...

    i would change "therefore a separate publishers table would be needed" to "therefore a separate publishers table might be useful"

    let's look at your particular example --
    book_id, isbn, title, publisher, author, description
    let's say that book_id is the primary key, okay? so, if i give you a value of the primary key, can you unambiguously tell me which publisher it is?

    hint: yes

    therefore, as far as the publisher is concerned, this table is properly normalized



    now, if a book has more than one author, then this design would be problematic, and you'd likely end up with the many-to-many structure

    this is similar to garments and colours -- yes, you do need a many-to-many structure here

    however (and this is where i was trying to get to) you do not need to do this with ids

    you can do it with the colour names themselves
    Code:
    garment colour
    shirt   red
    shirt   green
    shirt   blue
    pants   black
    skirt   pink
    skirt   coral
    skirt   fuscia
    skirt   peach
    still with me?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #11
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I can tell which publisher belongs to a particular book record if I have the book_id primary key - I'd just look across the row and see what the value was in publisher_id column. You were saying that a separate publishers table might be useful, but can I assume that it's not absolutely necessary? I mean I could have put the suppliers in a separate table (they have a one to many relationship with the garments also) but I didn't. So I guess I'm asking when do you know if it's necessary or not?

    As you were saying, the scenario between book to author is the same as my scenario between garments and colours - they're both many-to-many. Since I was using the books to authors example in the book I was reading as an example, I thought I needed the lookup table. So are you saying the example in the book re the book-to-author lookup is incorrect? If so, when would you need one of these lookup tables?

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by gwh View Post
    I can tell which publisher belongs to a particular book record if I have the book_id primary key - I'd just look across the row and see what the value was in publisher_id column. You were saying that a separate publishers table might be useful, but can I assume that it's not absolutely necessary?
    that is correct, it isn't, because the publisher depends entirely on the complete primary key -- therefore, at least as far as the publisher column is concerned, it's in 3NF
    Quote Originally Posted by gwh View Post
    So I guess I'm asking when do you know if it's necessary or not?
    by examining the logical dependency of a non-key attribute on the primary key
    Quote Originally Posted by gwh View Post
    So are you saying the example in the book re the book-to-author lookup is incorrect?
    nope, that is the right way to do a many-to-many, just like your garments and colours
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    But in the book, he had the authors in a separate table and then a book-to-author look up table with two columns - book_id and author_id. Sorry I may not have mentioned that above. Therefore he is incorrect right? He didn't put the authors in the books table as you're advising me to do with the colours right?

    One final question and i'll stop bugging you - should I put sizes and cateogries into the garments table also, and do I need the garment_id primary key?

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by gwh View Post
    But in the book, he had the authors in a separate table and then a book-to-author look up table with two columns - book_id and author_id. Sorry I may not have mentioned that above. Therefore he is incorrect right? He didn't put the authors in the books table as you're advising me to do with the colours right?
    okay, we need to clear this up

    the book-to-author many-to-many table is correct, okay?

    i am not telling you to put the colours into the garments table

    have a look at the sample data i gave in post #10

    that's your garment-to-colour many-to-many table, except it isn't using ids, it's using the names

    the names, in that example, are the foreign keys

    make sense now?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #15
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    i'll go a little further -- suppose you have the data that i gave in post #10 and you wanted to use ids

    you would load that data into a temporary table, then do a three-way join, joining the garment names to the garment table, and the colour names to the colours table, matching names to names, then the query would be able to pull out of the garments table and the colours table the corresponding ids, which you could then use to populate your real garment-to-colur table, the one with the two ids as foreign keys

    which is where this thread started, yes?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  16. #16
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    In the sample colour table you gave, you have one column called garment and the other called colour. I understand now that values in the colour column are the foreign keys, but are these foreign keys referencing the colour names from a separate colours table? If so will this table only have the one column, ie. colours and no primary key id?

    Also I got a bit confused with the garment column in this garment to colour look up. In my original garments table, the column called garment_type listed the garments, so does the foreign key column called garments in the garment to colour lookup reference the garment_type column in my garments table? If that's the case, do I no longer need an auto_incrementing garment_id in my original garments table?

    And just to finalise, once the above has been clarified, I assume I'll need the same lookup tables for size and category since these are many to many also right?

    Then as you say getting back to the original question asked, I can populate the garments, colours, sizes and categories tables with LOAD DATA INFILE, then do I use the three-way join you mentioned, even if I don't decide to use ids - I mean is this the answer to the original question, ie. is this how I'd populate the lookup tables?

  17. #17
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by gwh View Post
    In the sample colour table you gave, you have one column called garment and the other called colour. I understand now that values in the colour column are the foreign keys, but are these foreign keys referencing the colour names from a separate colours table? If so will this table only have the one column, ie. colours and no primary key id?
    if i were doing the database design without ids, yes, it would have only one column, but that column would be the primary key

    primary keys do not have to be ids, which i guess is the main point to take away from this little discussion

    this was what i was trying to work towards earlier, with those questions about how you would differentiate one colour from another, one garment from another

    Quote Originally Posted by gwh View Post
    Also I got a bit confused with the garment column in this garment to colour look up. In my original garments table, the column called garment_type listed the garments, so does the foreign key column called garments in the garment to colour lookup reference the garment_type column in my garments table?
    no, it was supposed to be a foreign key to the garment table's primary key -- perhaps the names chosen were a bit too simplistic

    Quote Originally Posted by gwh View Post
    And just to finalise, once the above has been clarified, I assume I'll need the same lookup tables for size and category since these are many to many also right?
    yes, similar

    Quote Originally Posted by gwh View Post
    Then as you say getting back to the original question asked, I can populate the garments, colours, sizes and categories tables with LOAD DATA INFILE, then do I use the three-way join you mentioned, even if I don't decide to use ids - I mean is this the answer to the original question, ie. is this how I'd populate the lookup tables?
    you would have to load the relationship data somehow (this was my earlier question -- where does this data come from?)

    if you're not using ids, but rather using the unique names to identify each row in each table, then you're done

    if you're using ids, then you'd need to do the joins and additional inserts, in order to synch up the right names with the right ids

    i apologize for the long detour we took getting to this point
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  18. #18
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    No need to apologise - I'm really grateful for all your help and have learnt heaps.

    If not using ids is easier and will not break normalisation as you say, then I'll probably use the unique names to identify each row in each table, so when you say "i'm done", do you mean I'll just have to manually insert the relationship data in the lookup tables, ie. maybe entering it via a tab delimited file then do a LOAD DATA INFILE along with all the other tables?

  19. #19
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    yes, that's right, that's how it would work

    although to be honest, i would defo use an id for the garments

    whether i would even have separate colour and size tables (i would prefer not to) would depend on whether i wanted to enforce relational integrity, i.e. if i had an application where someone adding a new garment had to specify size and colour from dropdowns, then i would generate those dropdowns from the separate size and colour tables, but if size and colour were less rigourously controlled, i would probably just have them as text attributes in the garments table

    so with an id for the garment table, you still have a join to do when you upload your size and colour data
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  20. #20
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yeah I was going to use the id in the garments table - just not in the others.

    So just to clarify, when I'm compiling the relationship data between say the colours and the garments, I do this with the garment name, ie. shirt, skirt etc. - not the garment_id primary key. So I have my table looking like your example in post #10 and I use this as the temporary table and then do the Joins to synch up the colours with the right garment ids - right?

    Sorry - I think I've just about got it.

  21. #21
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    don't you have the garment-colour relationship data stored somewhere?

    you said "when i'm compiling the relationship data" which makes me think that you are going through a list of garments and deciding which colours they should be, then entering this information into a file that you will upload

    if that's what you're doing, i would use the garment id straightaway, which you can get by simply running a query to list out the garments after they've been added, so that you can see what id to use
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  22. #22
    SitePoint Addict
    Join Date
    Jun 2005
    Posts
    313
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I don't have this data stored yet - I've yet to establish these relationships. I'll do as you suggested re running the query for the list of garments.

    Finally! I think I'm really starting to understand it all. Thanks so much for all your help!


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •