Select one per each column MySQL

So here’s an example table:


+-----------
| Name | Category |
| Guy 1 | Small |
| Girl 1 | Small |
| Boy 7 | Big |
| Boy 9 | Huge |
+------------

How would I select just ONE for each repeated category. I only want a list of categories.

I know I could use PHP and foreach to filter out any excess rows with the same category, but that’s dirty.

Something like

select `Category` from `people` limit 1 per `Category`

Even though that makes no sense in terms of MySQL grammar.

It should return:


Row 1: Small
Row 2: Big
Row 3: Huge

Thanks.

One way is:


SELECT DISTINCT Category
FROM Products

The MySQL Forum has experts who can advise on other appropriate ways too.

For example, the 2nd-normal form of database normalisation requires that you have a separate table called Category that contains only those distinct types.

Then, your table of guys boys and girls would have a CategoryId field instead, which contains the numeric identifier of the Category itself.


CATEGORY
+------------
| Id | Name  |
+------------
| 1  | Small |
| 2  | Big   |
| 3  | Huge  |
+------------

PRODUCTS
+---------------------
| Name   | CategoryId |
+---------------------
| Guy 1  | 1          |
| Girl 1 | 1          |
| Boy 7  | 2          |
| Boy 9  | 3          |
+---------------------

Thank you Much appreciated :slight_smile: !! And yes, I agree, however, it’s not my call. I’m working for a pre-made MySQL table on someone else’s website. If it were up to me I would do as you suggested.

sorry, paul, but that’s not true

2NF does not require creating a separate table, and in particular, it does ~not~ mandate the use of a surrogate key

from the wikipedia article:

Specifically: a 1NF table is in 2NF if and only if, given any candidate key and any attribute that is not a constituent of a candidate key, the non-key attribute depends upon the whole of the candidate key rather than just a part of it.
applied to this example, it means that with Name as the key, and Category as the non-key attribute, the value of the Category column in any given row is determined entirely by the value in the Name column

turned around, it says that if i have this table, and you tell me the value of any name (which, since it’s a key, must be unique), therefore i can tell you unequivocably what the value of the category for that name is

again, it’s important to remember that the normal forms in data modelling are completely silent on the use of numeric ids as keys

don’t let anyone tell you that you have to use a numeric id to conform to any normal form

Thank you for coming on. I should avoid making posts such as these while distracted.

Edit:

Actually, which one is it where duplicate entries, such as the category, are separated out to help prevent duplication?

this is, sadly, a common misconception

you are referring to replacing the category name with a numeric id, right?

this does ~not~ prevent duplication

there are just as many duplications of the numbers that replace the other values, as there were duplications of the other values

think about it :slight_smile:

I may need to phrase this differently then.

What are the main benefits that may be enjoyed by making use of a primary key / foreign key relationship with for example, a Category table.

the main benefit is data integrity

note that you can achieve this without a numeric key just as easily – in fact, a bit more easily

That sounds interesting. What resource would recommend for someone wanting to learn more about how to achieve this?

any tutorial which discusses primary and foreign keys

e.g. http://en.wikipedia.org/wiki/Foreign_key