SitePoint Sponsor

User Tag List

Results 1 to 23 of 23
  1. #1
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Selecting one through many in a many-to-many relationship

    The situations is this (somewhat simplified):

    I have a MySQL database that contains data about photographs. One table (photos) contains basic data such as year, place etc. Then I have another table (names) that contains names of people in the pictures. And finally there's a table (pic_name) that connects pictures with people, linking photoid with nameid corresponding to photos.id and names.id respectively.

    Let's say that the picture with id 1 (photos.id = 1) is a photo of three people called John, Lisa and Esmeralda having the IDs 1, 2 and 3 in the names table (as well as the names encoded in the first_name column). Then pic_name has three rows connecting photoid = 1 with nameid = 1, 2 and 3.

    Now to my question: How do I construct a query that retrieves some specific info from the photos table, for example place or year, for all photos that contain both John and Lisa (not just either one of them)? That is, how do I get a list of where or when all my pictures containing both John and Lisa (and possibly others such as Esmeralda, but that's irrelevant) were shot?

    Intuitively, it seems like a fairly simple question, but I can't for my life come up with a query that produces the desired result. Any help would be highly appreciated.

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Code:
    SELECT photos.id
         , photos.place
         , photos.year
      FROM ( SELECT pic_name.photoid
               FROM names
             INNER
               JOIN pic_name
                 ON pic_name.nameid = names.id
              WHERE names.first_name IN ( 'John' , 'Lisa' )
             GROUP
                 BY pic_name.photoid
             HAVING COUNT(*) = 2
           ) AS these
    INNER
      JOIN photos
        ON photos.id = these.photoid
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks a lot. This seems like a very good start with fast response time and everything. There's only one slight problem: If I have a picture with two persons called John and no Lisa (or vice versa), I get that as well. If you can fix that too I'll definitely buy your book ... Or maybe I'll do it anyway ...

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by atoroqo View Post
    There's only one slight problem
    let me turn it around and ask you to explain how your application interface will decide which people you want to base the query on... in other words, sticking with your original example, where did John and Lisa come from? how were they chosen?
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    let me turn it around and ask you to explain how your application interface will decide which people you want to base the query on... in other words, sticking with your original example, where did John and Lisa come from? how were they chosen?
    It'll be a public archive where anyone should be able to search for pictures based on various criteria. There already is a database that I have exported parts of in order to migrate to MySQL. The original database uses 4D as its DBMS and the querys work ok, but the whole database is a complete mess structure-wise and there are some other problems with it which is why I'm trying to clean it up and switch to MySQL. I have no access to the server code of the original database, so I have no idea what's going on there. What I can do is to export its contents to an XML file and take it from there in whatever direction I choose to. There will be a web form identical to the original (see link below) and then I'll process the querystring through PHP and send the query to the database from there.

    Here's the web site as it is today: http://bildarkivet.orsa.se/
    John and Lisa wasn't a very good example, but you get at least two hits where the Lisas are represented by Anna-Lisa and Anna Lisa respectively. It's all in Swedish, but the top right field in the middle column ("förnamn") represents "first name". So if you enter "John Lisa" there (without the quotes) you get two hits. (And if you search for "John,Lisa" instead, you get pictures with either John or Lisa.) Similarly, if you write "ko gris" (= "cow pig") in the "Nyckelord" field in the left column (= key words) you get three pictures with both a cow and a pig.

    I hope this explains the context. So far I've managed to build all the queries I need through PHP except the one I'm asking about here. (Sometime in the future I hope I'll be able to transform the name mess into a decent persons table, but that'll take quite a lot of work.)

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    okay, if you are going to allow a search based only on firstnames, then pretty much the only thiing you can do is change HAVING COUNT(*) = N (where N is the number of names entered) to HAVING COUNT(DISTINCT names.first_name) = N
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes, I don't want to remove any of the present functionality although people probably won't search for two first names particularly often. But for keywords this kind of query will probably be more useful. Anyway, thanks a lot again! This made my day (week, month ...)

  8. #8
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    There's still a lot to learn ...

    I have this query:

    Code:
    SELECT photos.id
    FROM (SELECT pic_name.photoid
               FROM names
               INNER JOIN pic_name
               ON pic_name.nameid = names.id
               WHERE names.first_name IN ( 'John' , 'Lisa' )
               GROUP BY pic_name.photoid
               HAVING COUNT(DISTINCT names.first_name) = 2
            ) AS these
    INNER JOIN photos
    ON photos.id = these.photoid
    Now I want a negative filter on the above selection. For example, I might want all pictures with John and Lisa except those with Eric in them as well. Or perhaps all pictures of John and Lisa except the one with photo.id = 1. I had assumed that it would be possible to construct a query selecting what I don't want and then just add it with a "WHERE NOT EXISTS", like so:

    Code:
    WHERE NOT EXISTS (
        SELECT id 
        FROM photos
        WHERE id = 1)
    but that won't return any results whatsoever. Where did my mind wander astray?

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    in general you need either to rewrite the HAVING clause to suit each circumstance...
    Code:
      FROM ( SELECT pic_name.photoid
               FROM names
             INNER 
               JOIN pic_name
                 ON pic_name.nameid = names.id
              WHERE names.first_name IN ( 'John' , 'Lisa' , 'Eric' )
             GROUP 
                 BY pic_name.photoid
             HAVING COUNT(DISTINCT names.first_name) = 2
                AND COUNT(CASE WHEN names.first_name = 'Eric'
                               THEN 'uh oh'
                               ELSE NULL END) = 0  
            ) AS these
    ... or add WHERE conditions to the outer query...
    Code:
    INNER 
      JOIN photos
        ON photos.id = these.photoid
     WHERE photos.id <> 1
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks. At least I did put my erroneous WHERE condition in the right place.

    The first solution seems easily expandable also to cases when pics with more than one person should be excluded, e.g.:

    Code:
    AND COUNT(CASE WHEN names.first_name IN ('eric', 'zoe')
        THEN 'uh oh'
        ELSE NULL END) = 0
    And it should be easy enough to construct by a PHP script.

    But I also tried to write a query with a WHERE condiction to exclude Eric, but I can't seem to come up with a working solution there either. For example

    Code:
    INNER JOIN pic_name as pn
    ON pn.photoid = photos.id
    INNER JOIN names as n
    ON n.id = pn.nameid
    WHERE n.first_name <> 'eric'
    GROUP BY photos.id
    won't give me what I want.

    Maybe I should find myself another hobby .

  11. #11
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Could anyone tell me if I am correct in assuming that the added WHERE condition with a photo id works because the id is a unique, single, non-NULL value, whereas first names are not? I.e. "WHERE n.first_name <> 'eric'" will retrieve rows as long as there are non-Erics in the image as well? Understanding why it doesn't work could be a step towards understanding what will work instead and why.

  12. #12
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ah, well. I'll go with the suggested CASE solution anyway. I came up with the following alternative solution, which seems to be working as it should, although it's a bit slower. But it'll do at my present stage of learning.

    Code:
    WHERE NOT photos.id
    IN (
        SELECT photoid
        FROM pic_name AS pn
        INNER JOIN names AS n
        ON n.id = pn.nameid
        WHERE n.first_name = 'eric'
        )

  13. #13
    SitePoint Enthusiast
    Join Date
    Aug 2011
    Location
    OH, USA
    Posts
    72
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    WHERE NOT photos.id
    IN (
    SELECT photoid
    FROM pic_name AS pn
    INNER JOIN names AS n
    ON n.id = pn.nameid
    WHERE n.first_name = 'eric'
    )
    is not correct
    WHERE photos.id NOT IN
    (
    SELECT photoid
    FROM pic_name AS pn
    INNER JOIN names AS n
    ON n.id = pn.nameid
    WHERE n.first_name = 'eric'
    )
    is correct sentence

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by gk53 View Post
    is not correct
    yes, it is

    did you test it? i did
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #15
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Next problem: LIKE.

    If I simplify the inner query in the above example to this

    Code:
    SELECT x
    FROM y
    WHERE z IN ('a','b','c')
    GROUP BY x.id
    HAVING COUNT(DISTINCT z) = 3
    I could rewrite it as

    Code:
    SELECT x
    FROM y
    WHERE z = 'a' OR z = 'b' OR z = 'c'
    GROUP BY x.id
    HAVING COUNT(DISTINCT z) = 3
    and then introduce a LIKE like this:

    Code:
    SELECT x
    FROM y
    WHERE z = 'a' OR z = 'b' OR z LIKE 'c%'
    GROUP BY x.id
    HAVING COUNT(DISTINCT z) = 3
    But that won't give me what I want (a, b and c), since it will also retrieve b, cd and ce; cd, ce and cf; etc.

    So I tried

    Code:
    SELECT x
    FROM y
    WHERE z = 'a' OR z = 'b' OR z LIKE 'c%'
    GROUP BY x.id
    HAVING COUNT(DISTINCT z) = 3
    AND COUNT(z NOT LIKE 'c%') = 2
    which doesn't work either.

    Nor does

    Code:
    SELECT x
    FROM y
    WHERE z = 'a' OR z = 'b' OR z LIKE 'c%'
    GROUP BY x.id
    HAVING COUNT(DISTINCT z) = 3
    AND COUNT(z = 'a') = 1
    AND COUNT(z = 'b') = 1
    Any suggestions what I should try next?

  16. #16
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by atoroqo View Post
    But that won't give me what I want (a, b and c), since it will also retrieve b, cd and ce; cd, ce and cf; etc.
    then you need count, separately, the different results
    Code:
    HAVING MAX(CASE WHEN z='a' THEN 1 ELSE NULL END) +
           MAX(CASE WHEN z='b' THEN 1 ELSE NULL END) +
           MAX(CASE WHEN z LIKE 'c%' THEN 1 ELSE NULL END) = 3
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  17. #17
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Mr. L. to the rescue ... Thanks.

    Seeing that solution I spontaneously imagined that this would work as well:

    Code:
    HAVING COUNT(CASE WHEN z='a' THEN 1 ELSE NULL END) +
           COUNT(CASE WHEN z='b' THEN 1 ELSE NULL END) +
           COUNT(CASE WHEN z LIKE 'c%' THEN 1 ELSE NULL END) = 3
    But obviously it doesn't. Does anyone see why I don't see why it doesn't work?

    Edit: Or rather: I see why this solution doesn't work but I'm not sure why the MAX solution works.

  18. #18
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by atoroqo View Post
    ... but I'm not sure why the MAX solution works.
    because even if there are seventeen 'a' rows, MAX will ensure that only one 1 is included in the count
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  19. #19
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Sweet.

    And I suppose that the negative version from earlier in the thread takes care of itself automatically by excluding all possible variations of c% leaving only 'a' and 'b' to add up to the distinct 2? Or am I overlooking something?

    Code:
    SELECT x
    FROM y
    WHERE z = 'a' OR z = 'b' OR z LIKE 'c%'
    GROUP BY x.id
    HAVING COUNT(DISTINCT z) = 2
    AND COUNT(
        CASE WHEN z LIKE 'c%' 
        THEN ''
        ELSE NULL END
        ) = 0

  20. #20
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    that works
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  21. #21
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Great. And I finally got around to buying your book, so now it's time to fill in some more gaps.

  22. #22
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,022
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)


    holler when you find the errors
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  23. #23
    SitePoint Enthusiast
    Join Date
    Apr 2013
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Edit: Never mind, my syntax was scewed up.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •