SitePoint Sponsor

User Tag List

Results 1 to 10 of 10
  1. #1
    Don't eat yellow snow spaceman's Avatar
    Join Date
    Mar 2001
    Location
    Melbourne, Australia
    Posts
    1,039
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    mysql question: how to include the record count of a linked table in a SELECT?

    Bit of a vague Subject posting there. Let me explain.

    Example:

    Two tables, 'members' and 'professions'. A member can only have 1 profession, 1 profession 0, 1, or many members.

    I want to show a list of all professions that have 1 or more linked members.

    I could do a SELECT that loops through all the professions. Inside the loop I could do a COUNT(*) of all members with a matching profession, and then reject the profession (ie. don't display it) if I can't find any matching members.

    This is the method (above) that I've used so far, but there must be a more efficient method that builds all this into a single SELECT statement - ie. without the additional overhead of a COUNT(*) inside the loop.

    Any takers?

    Thanks in anticipation.
    Web Design Perth Melbourne .:. Itomic Business Website Solutions
    Drupal Experts .:. Drupalise

  2. #2
    SitePoint Wizard
    Join Date
    Jan 2001
    Location
    Milton Keynes, UK
    Posts
    1,011
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'm not sure about your field names so you'll probably need to adapt this.

    Code:
    SELECT	[ProfessionID]	= P.ProfessionID,
    	[Profession]	= MIN(P.Profession),
    	[MemberCount]	= COUNT(*)
    FROM	Profession P (NOLOCK)
    	INNER JOIN Members M (NOLOCK) ON P.ProfessionID = M.ProfessionID
    GROUP BY P.ProfessionID
    HAVING COUNT(*) > 1

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    shane, please do not take offence, but i need to correct your sql slightly

    you can drop the HAVING clause, because the requirement was "1 or more"

    also, and this is way more subtle, you do not need to take the MIN of P.Profession, because you are grouping on ProfessionID which can be assumed to be unique, so if there's only one P.Profession per ProfessionID, it's gotta be the lowest P.Profession of all the P.Professions with the same ProfessionID!!

    finally, the "best practices" way to do this query -- which i hasten to point out mysql cannot use because it does not support subqueries (wayne luke will be happy to hear me slam mysql) -- allows the optimizer to go from one ProfessionID to the next as soon as it has found the first member for that profession, rather than retrieving all the members for the profession:
    Code:
    SELECT ProfessionID, Profession
      FROM Profession P
     WHERE EXISTS
           ( SELECT ProfessionID
               FROM Member
              WHERE ProfessionID = P.ProfessionID )
    Last edited by r937; Aug 5, 2002 at 09:12.
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    p.s. spaceman, if you want to include the number of members in each profession in your profession list, then use
    Code:
    SELECT P.ProfessionID, Profession
         , count(*) as Members
      FROM Profession P
    INNER 
      JOIN Member
        ON P.ProfessionID = M.ProfessionID
    GROUP
        BY P.ProfessionID, Profession
    rudy
    Last edited by r937; Aug 5, 2002 at 09:11.

  5. #5
    SitePoint Wizard
    Join Date
    Jan 2001
    Location
    Milton Keynes, UK
    Posts
    1,011
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by r937
    shane, please do not take offence, but i need to correct your sql slightly

    you can drop the HAVING clause, because the requirement was "1 or more"

    also, and this is way more subtle, you do not need to take the MIN of P.Profession, because you are grouping on ProfessionID which can be assumed to be unique, so if there's only one P.Profession per ProfessionID, it's gotta be the lowest P.Profession of all the P.Professions with the same ProfessionID!!

    finally, the "best practices" way to do this query -- which i hasten to point out mysql cannot use because it does not support subqueries (wayne luke will be happy to hear me slam mysql) -- allows the optimizer to go from one ProfessionID to the next as soon as it has found the first member for that profession, rather than retrieving all the members for the profession:
    Code:
    SELECT ProfessionID, Profession
      FROM Profession P
     WHERE EXISTS
           ( SELECT ProfessionID
               FROM Member
              WHERE ProfessionID = P.ProfessionID )
    No offence taken rudy.

    I added the HAVING clause because I didn't read spaceman's request properly and thought he wanted Professions that had more than one member. As you said it can simply be dropped.

    As for the use of MIN aggregate, this is just a bad habit on my behalf, as I often use queries where I use MIN as opposed to using correlated subqueries when extracting data from many joined tables. I've found that queries using MIN always out perform queries using correlated subqueries.

    As for use of the subquery I was always led to believe that the use of JOINs was the preference as it gives more choice to the optimizer.

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    As for use of the subquery I was always led to believe that the use of JOINs was the preference as it gives more choice to the optimizer.
    that depends on the query and on which optimizer you're using

    actually, most optimizers simply end up doing a join for a subquery anyway

    however, the EXISTS subquery is an exception, even though it too is executed as a join

    what makes it special is that as soon as the first matching row in the 2nd table is found for a given row in the 1st table, the optimizer can abandon retrieving any more rows of the 2nd table for that row of the 1st table, and move on to the next row of the 1st table

    NOT EXISTS is similar in offering a "quick out", except sometimes (depending on the subquery conditions) all the rows of the 2nd table do have to be examined

    it's moot in mysql anyway
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    Don't eat yellow snow spaceman's Avatar
    Join Date
    Mar 2001
    Location
    Melbourne, Australia
    Posts
    1,039
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks very much to you both. I failed to specify in my original post that I was after a MySQL solution, but I'm pleased and grateful that between you you've given this and more. I tried out the code on my particular application, and it worked 100%. Feels great to get rid of that unnecessary db lookup that I've always previously employed.

    Web Design Perth Melbourne .:. Itomic Business Website Solutions
    Drupal Experts .:. Drupalise

  8. #8
    Don't eat yellow snow spaceman's Avatar
    Join Date
    Mar 2001
    Location
    Melbourne, Australia
    Posts
    1,039
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hmmm. I thought I had this problem licked, but now I can't seem to get the result I'm after in this scenario:

    I've got two tables, 'registrars' and 'domains' - it's a 1:many relationship. RegisterName and DomainName are both unique fields on their respective tables, amd the foreign key 'DomainRegistrarID' sits on the domains table to make the link to the registrars table. All I want to do is to show a list of Registrar names, and the number of Domain names associated with each.

    PHP Code:
    $sql "SELECT RegistrarName, DomainName, Count(*) AS count FROM registrars
    LEFT JOIN domains ON RegistrarID=DomainRegistrarID
    GROUP BY RegistrarName
    ORDER BY RegistrarName
    ...
    echo "
    $RegistrarName$count";
    ... 
    The problem I have is this: not all domains have been assigned (or linked to) a registrar yet, and some of the Registrars on my registrars table have not been assigned/linked to any domains. But, all the Registrars that have no domains linked to them are showing a count of 1 in the above example when they should (IMHO) be showing 0.

    More info: if a domain has not yet been linked to a registrar, then the DomainRegistrarID field is 0. RegistrarID values on the registrar table start from 1.

    So obviously I've got some something wrong in my logic or my syntax - but what? Any takers prepared to point out the error of my ways? I've got a headache.
    Web Design Perth Melbourne .:. Itomic Business Website Solutions
    Drupal Experts .:. Drupalise

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    registrars without domains are listed, as desired, because of the left outer join

    to show the correct count of 0 you need to use count(DomainRegistrarID), which counts non-null values, instead of count(*), which counts rows

    rudy

  10. #10
    Don't eat yellow snow spaceman's Avatar
    Join Date
    Mar 2001
    Location
    Melbourne, Australia
    Posts
    1,039
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Great! Thanks for putting me out of my misery, r937. Much appreciated.
    Web Design Perth Melbourne .:. Itomic Business Website Solutions
    Drupal Experts .:. Drupalise


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •