SitePoint Sponsor

User Tag List

Page 1 of 3 123 LastLast
Results 1 to 25 of 61

Hybrid View

  1. #1
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Question repeated values, how to best handle

    Hello, I am making a movie database and wanted to ask opinions on how to separate the actors from the title tables.

    I was thinking of a title table with duration, comments, etc but then another table for actors and an actorId field in the title table to refer to the actors. However this results in a many to many relationship so I need to build a junction table for that. Pretty standard stuff, but was wondering if anyone had any other ideas on how to better do this.

    Thanks.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by pata View Post
    ...was wondering if anyone had any other ideas on how to better do this.
    junction table

    pretty standard stuff

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    of course now a single select will produce a cartesian product of repeated values for a movie for each actor involved in it when using a junction table.

    Sorry, I'm a bit rusty on many to many relationships, how can I avoid this?
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by pata View Post
    of course now a single select will produce a cartesian product of repeated values for a movie for each actor involved in it when using a junction table.
    no it won't

    if you could do a SHOW CREATE TABLE for each of your tables, i'll show you the query
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    As usual, you come to my rescue, here are the create table commands, I have a couple more lookup tables, but I'm only listing 2 so as not to try your patience

    Code MySQL:
    CREATE TABLE `actor2title` (
      `pId` int(11) NOT NULL DEFAULT '0',
      `titleId` int(11) NOT NULL DEFAULT '0',
      PRIMARY KEY (`pId`,`titleId`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
     
    CREATE TABLE `actors` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(200) DEFAULT NULL,
      `lastName` varchar(200) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
     
    CREATE TABLE `movies` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `title` varchar(255) NOT NULL,
      `studioId` int(11) DEFAULT NULL,
      `genreId` int(11) DEFAULT NULL,
      `ratedId` int(11) DEFAULT NULL,
      `duration` timestamp NULL DEFAULT NULL,
      `theatrical` date DEFAULT NULL,
      `release` date DEFAULT NULL,
      `rating` int(20) DEFAULT NULL,
      `price` decimal(4,2) DEFAULT NULL,
      `soundId` int(11) DEFAULT NULL,
      `countryId` int(11) DEFAULT NULL,
      `awards` text,
      `comments` text,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
     
    CREATE TABLE `writer2dvd` (
      `pId` int(11) NOT NULL DEFAULT '0',
      `titleId` int(11) NOT NULL DEFAULT '0',
      PRIMARY KEY (`pId`,`titleId`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
     
    CREATE TABLE `writers` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(200) DEFAULT NULL,
      `lastName` varchar(200) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    list all the movies for a specific actor --
    Code:
    SELECT movies.id
         , movies.title
      FROM actor2title
    INNER
      JOIN movies
        ON movies.id = actor2title.titleId
     WHERE actor2title.pId = 937
    list all the actors for a specific movie --
    Code:
    SELECT actors.id
         , actors.name
         , actors.lastName
      FROM actor2title
    INNER
      JOIN actors
        ON actors.id = actor2title.pId
     WHERE actor2title.titleId = 21
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That's beautiful!

    But is there a way to get the movie titles and movie info along with all the actors's name with just one query? This is the part that I am missing.


    guelphdad, I have not seen this database, will take a look, thanks.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  8. #8
    SitePoint Wizard guelphdad's Avatar
    Join Date
    Oct 2003
    Location
    St. Catharines, ON Canada
    Posts
    1,708
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Have you looked at the Sakila practice database on the mysql site? It is a movie database. It might give you some ideas as well.

  9. #9
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Neither, I'm looking for a query that will list all movies with their respective actors. For example, Midnight Run with the 4 leading actors would produce 4 rows. I'm trying to find a way to have the movies with their actors so I can list them in a loop.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by pata View Post
    list all movies with their respective actors.
    Code:
    SELECT movies.id
         , movies.title
         , actors.id
         , actors.name
         , actors.lastName
      FROM movies
    INNER
      JOIN actor2title
        ON actor2title.titleId = movies.id 
    INNER
      JOIN actors
        ON actors.id = actor2title.pId
    ORDER
        BY movies.title
         , actors.lastName
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #11
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This query you gave me produces repeated rows for the same movie, one row for each actor.

    However, looking at the Sakila example DB I came up with this query:

    Code MySQL:
    select movies.id, movies.title, group_concat(concat(`actors`.`name`, ' ',`actors`.`lastName`) separator ', ') AS `actors`
    from movies join `actor2title` on (`movies`.`id` = `actor2title`.`titleId`) join `actors` on (`actor2title`.`pId` = `actors`.`id`) group by `movies`.`id`;

    which produces one row for each movie with a comma-separated list of the actors. Seems to work great. The group_concat function is new to me.

    Of course, any other pointer would be greatly appreciated.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    GROUP_CONCAT is awesome

    good choice

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Now I have a related question: I need to build a credits table where I would put the actor's roles that they've played in different movies. Obviously, this is tied to a particular movie, but I have the actors in a different table and use a many to many relationship to get the actor for the movies.

    What would be the most efficient way of dealing with credits? Any ideas? I'd need a solution that doesn't break the query I already have going.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    put the role that an actor played in a movie into the many-to-many table

    CREATE TABLE actor2title
    ( pId INTEGER NOT NULL
    , titleId INTEGER NOT NULL
    , PRIMARY KEY ( pId , titleId )
    , role VARCHAR(99)
    );
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #15
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    put the role that an actor played in a movie into the many-to-many table

    CREATE TABLE actor2title
    ( pId INTEGER NOT NULL
    , titleId INTEGER NOT NULL
    , PRIMARY KEY ( pId , titleId )
    , role VARCHAR(99)
    );
    But the actor will play different roles in different movies, so I would end up with repeated rows for the same actor inside that table???
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  16. #16
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,496
    Mentioned
    163 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by pata View Post
    But the actor will play different roles in different movies, so I would end up with repeated rows for the same actor inside that table???
    Of course, that's the whole point
    An actor can have more than 1 role in his life, and in that case he will have more than 1 row in this table.

  17. #17
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by guido2004 View Post
    Of course, that's the whole point
    An actor can have more than 1 role in his life, and in that case he will have more than 1 row in this table.
    But doesn't that defeat the whole point of normalization of not repeating the same value? Also, it would make searches based on actors a lot slower no?
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  18. #18
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by pata View Post
    But the actor will play different roles in different movies, so I would end up with repeated rows for the same actor inside that table???
    of course!!!

    there will be as many "repeats" for a single actor as there are movies that this actor played in

    they aren't really repeats, of course, because you would not allow the same actor to play the same role in the same movie more than once

    however, this brings up the interesting point that you might want to have the same actor play more than one role in the same movie

    for example...

    Coming To America - Eddie Murphy - Prince Akeem
    Coming To America - Eddie Murphy - Clarence
    Coming To America - Eddie Murphy - Randy Watson
    Coming To America - Eddie Murphy - Saul

    in this case you need to change the table's primary key

    CREATE TABLE actor2title
    ( pId INTEGER NOT NULL
    , titleId INTEGER NOT NULL
    , role VARCHAR(99) NOT NULL
    , PRIMARY KEY ( pId , titleId , role )
    );

    and of course now you need to specify a role for every actor in every movie, since no portion of a primary key may be null

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  19. #19
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Unhappy

    Ok, spoke too soon: This works (sort of) but is now giving me the wrong movie title, it's giving me entry #1 when it should be #2. Here's the whole query:

    Code MySQL:
    SELECT DISTINCT dvdpedia.id, dvdpedia.title, group_concat(DISTINCT concat(`actors`.`name`, ' ',`actors`.`lastName`) ORDER BY actors.lastName separator ', ') AS `actors`, rated.rated, group_concat(DISTINCT concat(writers.name, " ", writers.lastName) separator ', ') AS writers
    from dvdpedia join `actor2title` on (`dvdpedia`.`id` = `actor2title`.`titleId`) join `actors` on (`actor2title`.`pId` = `actors`.`id`) join rated on (dvdpedia.ratedId = rated.id) join writer2dvd on (dvdpedia.id = writer2dvd.titleId) join writers ON (writer2dvd.pId = writers.id) group by `dvdpedia`.`id` ORDER BY dvdpedia.title;

    Hold the phone, through the magic of Google I found the proper use of the DISTINCT keyword. Used at the beginning of the query it doesn't make a difference, used within the group_concat statement, it makes a world of difference and fixes the problem below (phew). I'm just leaving it here in the hopes that it might help other users.

    *sigh*

    Now I have another problem. I just realized that if I add another many to many relationship query to the mix, say the writers for example, and I use the group_concat for writers I get the writers repeated as many times as there are actors:

    George Gallo, George Gallo, George Gallo, George Gallo, George Gallo...

    If I don't use group_concat, I only get the first writer. Not sure how to fix this. The Sakila demo db doesn't seem to have a similar situation for me to get an idea how to do it properly.
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  20. #20
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by pata View Post
    Used at the beginning of the query it doesn't make a difference, used within the group_concat statement, it makes a world of difference and fixes the problem below (phew).
    it does not fix the problem, it merely papers over it

    your query tries to combine multiple one-to-many relationships, with the result that you are getting cross join effects (multiplied results)

    using DISTINCT within the GROUP_CONCAT merely collapses the output, you are still executing a horrendously inefficient query underneath

    does this explanation make sense to you?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  21. #21
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    it does not fix the problem, it merely papers over it

    your query tries to combine multiple one-to-many relationships, with the result that you are getting cross join effects (multiplied results)

    using DISTINCT within the GROUP_CONCAT merely collapses the output, you are still executing a horrendously inefficient query underneath

    does this explanation make sense to you?
    It does, but I don't know how to fix it. Any pointers?
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  22. #22
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,248
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by pata View Post
    Any pointers?
    let's use a portion of your database to illustrate

    one movie can have multiple actors, and one movie can have multiple writers

    try to combine these relationships in a single query, and boom, you get cross join effects

    one thing you could do is collapse one of these relationships in a subquery, so that the subquery produces a single row per movie, with GROUP_CONCAT to collapse multiple values to one (concatenated) value
    Code:
    SELECT dvdpedia.id
         , dvdpedia.title
         , rated.rated
         , w.writers
         , actors.name
         , actors.lastName
      FROM dvdpedia 
    INNER
      JOIN actor2title 
        on actor2title.titleId = dvdpedia.id
    INNER
      JOIN actors 
        on actors.id = actor2title.pId
    INNER
      JOIN rated 
        on rated.id = dvdpedia.ratedId 
    INNER
      JOIN ( SELECT writer2dvd.titleId
                  , GROUP_CONCAT(CONCAT(writers.name
                                      , ' '
                                      , writers.lastName) AS writers
               FROM writer2dvd 
             INNER
               JOIN writers 
                 ON writers.id = writer2dvd.pId
             GROUP 
                 BY writer2dvd.titleId ) AS w
        ON w.titleId = dvdpedia.id
    thus the outer query has only one one-to-many relationship, along with the one-to-one relationship with the subquery

    if you also wanted to collapse the actors into a concatenated value, you would do that in a subquery as well
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  23. #23
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This query IS faster, but produces one row for each actor in a movie, see attached screenshot.
    Attached Images Attached Images
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  24. #24
    SitePoint Evangelist
    Join Date
    May 2002
    Location
    Barcelona, Spain
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yet another question: The above is working beautifully, except that if the movie has a missing entry (say no actors associated yet) the query will not list it. Is there a way to list movies with NULL entries?
    Used to prefer PHP.
    Now I prefer Maya but stuck with PHP
    bruji.com

  25. #25
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,496
    Mentioned
    163 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by pata View Post
    Yet another question: The above is working beautifully, except that if the movie has a missing entry (say no actors associated yet) the query will not list it. Is there a way to list movies with NULL entries?
    Yes, use a LEFT OUTER JOIN instead of an INNER JOIN.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •