How to join just one record from a table

I have two tables: boats and crew. I need to select some boats (not a problem) and to join the crew table but I want to pick only one crew member per boat and not just any crew member but one selected by sorting the crew table.

In other words, I need to do a sub select on the crew table that yields at most one record per boat and to join the result to the select from the boats table. How can this be done?

SELECT boats.name AS boat_name
     , boats.port
     , crew.name AS captain_name
  FROM boats
INNER
  JOIN ( SELECT crew.boat_id
              , MAX(crew.hat_size) AS captain
           FROM crew
         GROUP
             BY crew.boat_id ) AS subquery
    ON subquery.boat_id = boats.id
INNER
  JOIN crew
    ON crew.boat_id = subquery.boat_id
   AND crew.hat_size = subquery.captain                 

Thanks but I couldn’t get it to work. “MAX(crew.hat_size) AS captain” doesn’t work :cool:

The crew members have a rank (enum field): Owner, Captain, Crew, Passenger. The “top hat” is picked by sorting:

ORDER BY rank, lname ASC
   LIMIT 1

I would really appreciate if you would comment your proposed code.

dude, the quality of my solution is exactly in line with the quality of your question

and yes, my solution does work, i’ve been writing sql for many, many years

you should really have mentioned the ENUM datatype in your first post

(aside: you should change it if you can… do a search on “ENUM is the spawn of the devil”)

could you kindly do a SHOW CREATE TABLE for the crew table

A badly asked question? Sorry, I’ll work on that.

and yes, my solution does work, i’ve been writing sql for many, many years

I don’t doubt either statement. I said “I could not get it to work.”

you should really have mentioned the ENUM datatype in your first post

(aside: you should change it if you can… do a search on “ENUM is the spawn of the devil”)

Being an atheist, I don’t believe there is a devil. I take it you don’t like ENUM. I did the search and found a good explanation of the shortcomings of ENUM. I’ve been using it for years and never had a problem because I use it in such a way as not to tempt its devilish soul. :smiley:

Change it to what?

could you kindly do a SHOW CREATE TABLE for the crew table

CREATE TABLE `crew` (
  `id` mediumint(9) NOT NULL auto_increment,
  `boat_id` mediumint(9) NOT NULL default '0',
  `fname` varchar(25) NOT NULL default '',
  `iname` varchar(5) NOT NULL default '',
  `lname` varchar(25) NOT NULL default '',
  `rank` enum('own','cap','crw','pass') NOT NULL default 'crw',
  `document` varchar(30) NOT NULL default '',
  `doc_type` enum('id','pass') NOT NULL default 'id',
  `notes` text NOT NULL,
  `date_created` date NOT NULL default '0000-00-00',
  `ip_created` varchar(15) NOT NULL default '',
  `reason_removed` text NOT NULL,
  `date_removed` date NOT NULL default '0000-00-00',
  `ip_removed` varchar(15) NOT NULL default '',
  `date_modified` date NOT NULL default '0000-00-00',
  `ip_modified` varchar(15) NOT NULL default '',
  PRIMARY KEY  (`id`),
  KEY `boat_id` (`boat_id`),
  KEY `lname` (`lname`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;

okay, thanks for that… so, i did some testing

here’s my table –

CREATE TABLE crew
( id TINYINT NOT NULL AUTO_INCREMENT PRIMARY KEY
, fname VARCHAR(25)
, rank ENUM('own','cap','crw','pass') NOT NULL DEFAULT 'crw'
);

now let’s re-acquaint ourselves with ENUM behaviour –

INSERT INTO crew ( fname ) VALUES ( 'Todd' )
;
INSERT INTO crew ( fname , rank ) VALUES
 ( 'Tom' , 'pass' )
,( 'Dick' , 'pass' )
,( 'Harry' , 'crw' )
,( 'Humpty' , 'cap' )
,( 'Dumpty' , 'own' )
;

SELECT *
     , 0+rank as nrank
  FROM crew
;

1 Todd   crw  3
2 Tom    pass 4
3 Dick   pass 4
4 Harry  crw  3
5 Humpty cap  2
6 Dumpty own  1

and vwalah, go back to the solution i gave you and replace MAX(crew.hat_size) with MIN(0+crew.rank)

:slight_smile:

It almost works but not quite:

   SELECT boats.id AS id, yacht_name, yacht_registry, yacht_flag, 
          crew.rank, crew.fname, crew.iname, crew.lname,
     FROM boats
LEFT JOIN (SELECT crew.boat_id, MIN(0+crew.rank) AS captain   
             FROM crew
         GROUP BY crew.boat_id) AS subquery
       ON subquery.boat_id = boats.id
LEFT JOIN crew
       ON crew.boat_id = subquery.boat_id
      AND crew.rank = subquery.captain

I’ve changed the INNER JOINs to LEFT JOINs to list boats without a crew member. But if there is more than one crew member with the same top-rank (2 or more owners, 2 or more captains – could happen), then the boat is listed several times. I only want it listed once. The second boat is listed twice

[B]Yate                Registro       Bandera          Tripulante[/B]
Persuasion          Cayman-123456  Cayman Islands   Capitán: Mr. Pepe
Tribilín's stories  AGSC-123456    Venezuela        Capitán: Nemo Morales
Tribilín's stories  AGSC-123456    Venezuela        Capitán: Mafalda Quino Perez
Pandemonium         Dante          Italy	

How do I tweak the code?

Got it!

DISTINCT(boats.id) AS id
GROUP BY boats.id

come back when you discover why that won’t work :wink:

Rest assured I will but it’s working right now on my test case. Here is the full sql

   SELECT DISTINCT(boats.id) AS id, 
          yacht_name, yacht_registry, yacht_flag, 
          crew.rank, crew.fname, crew.iname, crew.lname,
          subquery.captain
     FROM boats
LEFT JOIN  (SELECT crew.boat_id, MIN(0+crew.rank) AS captain   
              FROM crew
             WHERE date_removed = '0000-00-00'
          GROUP BY crew.boat_id) AS subquery
       ON subquery.boat_id = boats.id
LEFT JOIN crew
       ON crew.boat_id = subquery.boat_id
      AND crew.rank = subquery.captain
      AND crew.date_removed = '0000-00-00'
 GROUP BY boats.id
 ORDER BY yacht_name

BTW, I have a working two query solution to fall back on. Query 1: get the boats, Query 2: get top crew for this boat.

Thanks for your help!

two comments…

  1. DISTINCT is ~not~ a function… placing the first column that comes after the DISTINCT keyword inside parentheses makes no difference, the DISTINCT will still apply to ~all~ columns in the SELECT list

  2. if you GROUP BY boats.id, then the DISTINCT is redundant

what you’re actually doing is forcing mysql to choose one value for each column in the SELECT list for each boat, and this does not guarantee that the fname, lname, etc. will all come from the same row!!

see http://rpbouman.blogspot.ca/2007/05/debunking-group-by-myths.html (a long read, but worth it)

Right!

  1. if you GROUP BY boats.id, then the DISTINCT is redundant

what you’re actually doing is forcing mysql to choose one value for each column in the SELECT list for each boat, and this does not guarantee that the fname, lname, etc. will all come from the same row!!

see http://rpbouman.blogspot.ca/2007/05/debunking-group-by-myths.html (a long read, but worth it)

When SQL gets this complicated a simple php solution is best: skip the duplicate rows:

$boat_id = '';
while ($row = mysqli_fetch_assoc($boats)) {
    if($boat_id == $row['id']) { continue; }  // :rofl:
    $boat_id = $row['id'];
    -
    -
    the rest of the code

I googled how to get just one record in a set or the first record in a set and I could not find a satisfactory answer. It seems that SQL does have a “FIRST” statement but it’s not available in MySQL. As far as I recall, this is the first time I needed such a result in maybe 10 years writing MySQL code. I guess it’s not something that is required often.

I found two workarounds:

1.- use two sql queries where the second one allows me to use “LIMIT 1”, and
2.- ignore the duplicates in the php code following the sql query

While I like to write “good” or “clean” code I’m no purist. There comes a time when a simple workaround is more practical than banging your head against obscure constructs. Just my opinion.

And again, thanks for all the help!

as a general statement, i would have to agree with you completely here

however, i still question the wisdom/appropriateness of your original requirement…

if a boat has two captains, why would you ~not~ want to show them both? if your answer to this is that you never want to see more than one output row per boat, you could use GROUP_CONCAT to show both captains in the same result column

That question is outside the coding realm and in the application design realm. While googling this issue I did run across “GROUP_CONCAT.” I considered using it and decided against because of the consequences to the rest of the page design.

The function of this particular page is to facilitate finding a specific yacht to work on. The page will list all the yachts or a subset of “found” yachts. “Do you remember that yacht from Grenada with the whosawhatsis?” So you list the yachts from Grenada and then link to the most likely candidate. The idea behind listing one name only is to list the “top dog” on the yacht which is the reason for the ordering: “Owner, captain, crew, passenger.” Suppose there is no owner or captain on board, just 12 crew members, GROUP_CONCAT would include all twelve. IMO, this is not “information” but clutter. It’s a compromise and we can agree to disagree about it. :slight_smile:

David Pogue makes a good point: Keep the design as simple as possible, follow the rules but break them when it makes sense to do so. One example he gives is a popup menu of country names, they should be in alphabetical order which would put the USA near the bottom. Intelligent rule breaking puts the USA on top. Really intelligent rule breaking lists the USA twice, once in its proper place and also on top. :wink:

David Pogue: Simplicity sells
http://www.ted.com/talks/david_pogue_says_simplicity_sells.html

okay, i agree to disagree :slight_smile:

you single out one of two captains as “top dog” and the other one will never forgive you

it’s one thing to simplify data presentation, it is another altogether to distort data