On my continued journey of becoming a more “sophisticated” database developer…
What is the best way to handle “Look-up Values” in MySQL?
In the past, when I had a column/field that had a finite list of valid values, I would create a “Look-up Table” and join that to my main table.
For instance, if I had an “Order” table and one of the fields in that table was “Shipping_State”, I would have created a look-up table called “States” that contained the 50 U.S. States and their corresponding abbreviations. Then I would do a 1-to-M join from “States” table to the “Order” table.
Not sure if that was a good way to do things?!
It seems like there are a few ways you could handle this issue…
1.) Create a “Look-up Table” (as described above) every time you have a situation with needing to define a finite list of choices.
2.) Just create a “Look-up List” in your front-end application (e.g. Combo-Box on a web form) and use that to populate your back-end database tables.
3.) Use the ENUM data-type in MySQL.
4.) Some other way??
What is the best way to approach this, and why is said method better than other ways?
A lookup table is more flexible than using enum. If you use enum and decide to add an extra value then you have to change the table definition. With a lookup table you just create the extra record.
If Lookup Tables are the best way to go (in your opinion), then…
1.) Should you create the Lookup Tables in the database itself, or can they exist in your Front-end Application?
2.) If the Lookup Tables are stored in the database, do you need to “join” them to the tables they reference? That is, do you need to create foreign keys and referential integrity and all that jazz?
If you create the lookup table in the database then your front end can read all the options from there so that the only action required to add an extra option is to add the extra record to the lookup table. If you hard code it in the front end then you have to update every page that uses them when you add one.
I don’t have a particular preference for any one of the alternatives as each has advantages and disadvantages. The main consideration is how static the list of options will be. The more likely it is that the list will change in the future the greater the advantage that using a lookup table in the database gives you since it will minimise the amount of effort to add a new option. If you know for certain that the options will never change then using an enum in the database and hard coding the list in the front end means you have less code to write at the start.
If you are going to use a hard coded list in the front end then you may as well use an enum in the database as that will take up less space in the database since with an enum only the number corresponding to the option in the enum list is stored (so if you ever do need to add extra options to an enum you should add them to the end so that the entire table doesn’t need to be rebuilt).
I don’t have a particular preference for any one of the alternatives as each has advantages and disadvantages. The main consideration is how static the list of options will be. The more likely it is that the list will change in the future the greater the advantage that using a lookup table in the database gives you since it will minimise the amount of effort to add a new option. If you know for certain that the options will never change then using an enum in the database and hard coding the list in the front end means you have less code to write at the start.
Good point.
If you are going to use a hard coded list in the front end then you may as well use an enum in the database as that will take up less space in the database since with an enum only the number corresponding to the option in the enum list is stored (so if you ever do need to add extra options to an enum you should add them to the end so that the entire table doesn’t need to be rebuilt).
If I had an ENUM with:
KEY VALUE
---- =======
0 Up
1 Down
2 Left
3 Right
you are saying that my table would only have Key values??
If that is the case, I do NOT like that.
Even when I have used Look-up Tables in the past, I usually made the Look-up Table’s ID a text value (e.g. Alabama, Alaska, Arkansas, California,…) so that if you ever need to report of the table, you aren’t dependent on needing the Look-up Table and its values.
If you create the lookup table in the database then your front end can read all the options from there so that the only action required to add an extra option is to add the extra record to the lookup table. If you hard code it in the front end then you have to update every page that uses them when you add one.
That isn’t true. String manipulation can be used on the field definition to extract the individual values. I know because I’ve had to resort to it (cries).
Enums aren’t rows in a table its a data type that restricts data to listed values. For example, given a body for a blog you may want to allow users three types of input options. Those input options being plain text, html and code. You could use an ENUM to restrict the data to one of those types. So you enum values for the field could be [text,code,html]. Now if someone enters html1 into the field the table will not accept it considering it doesn’t match any of those values. However, using a varchar anything within the limits could be accepted such as dfgh.
ENUM is just a way to restrict a field to containing only the listed values without using a separate table to store the values individually as rows and use foreign keys instead. ENUM is a much simpler approach with many disadvantages in comparison to using a separate data structure with foreign keys. I don’t necessarily believe they are evil and should be avoided altogether but they should be used in moderation and in respect to change as felgall already mentioned.
Going back to the previous example if the content allowances may change in the future than its probably best to avoid an ENUM. On the other hand if html,code and text will never change than its pretty safe to use an ENUM.
ENUM is just a way to restrict a field to containing only the listed values without using a separate table to store the values individually as rows and use foreign keys instead. ENUM is a much simpler approach with many disadvantages in comparison to using a separate data structure with foreign keys.
And what are those disadvantages?
So if I use “look-up tables” then I do need to create Foreign-Key Constraints based on what you said above, right?
I don’t necessarily believe they are evil and should be avoided altogether but they should be used in moderation and in respect to change as felgall already mentioned.
But is it really any harder to alter an ENUM as opposed to adding more rows into a “look-up table”?
Internally the database would store them as 0,1,2,3 but every reference to them would use Up,Down,Left,Right.
The issue with adding an extra entry in the middle of the list is that the database then has to change its internal value in all the rows that reference those values that are now in a different position in the list.
The benefits of using a lookup table for any list where there is even a potential for the list of options to change far outweighs any benefit that the simplicity of enum provides. Only where there is no possibility at all of the list of values ever changing would it be worth using an enum. Even with your simple example you may need to add forward,back,past,future to the list at some point.
So to clarify on an earlier point, when I’m designing (visually) in Workbench, I need to “litter up” my ERD with Look-up Tables, huh?
More so, I still need to create Foreign Keys and set any Referential Integrity settings (e.g. On Update, On Delete) for all of the Look-up Tables, right?
I know a Forms/UI Developer would certainly care what the values are! And this leads back to an earlier question of “Where should you handle this?”
I can see a Look-up Table in the database for “States”, “Credit Card Type”, “Shipping Type”, and “Address Type”.
But what about “Salutation” (as you already mentioned), or “Order Quantity” (I think I would want people to pick from a list of Integers), or “Year of Birth”, or…
These aren’t as crucial as the first group, yet you were the one who lectured me last month when I asked about where to do error-handling and data checking. (And you were right in that “The database should be the ultimate gatekeeper of what goes in and out!!”)
So to be consistent, shouldn’t most things need to be reduced to some finite list to force data consistency?? (And presumably you might as well put those “lists” in Look-up Tables in the back-end. Then again, who knows?!)
Like almost everything in IT, I guess it is all subjective?! (:
i wouldn’t use a lookup table because i don’t care about the values
to do a proper job of it, yes you do – but only for those lookup tables that you absolutely have to have as lookup tables
But would you handle them in the UI with List Boxes, Combo Boxes, Option Groups, etc.?
You only really need a lookup table for any of those if there is a form field that needs to validate against that list and if the list is going to change over time.
If the list is always going to be the same then it may be more efficient to pull it out of the database and hard code it as a validation routine. Provided that you code it as a separate object/function then you can always come back later and convert it to use a lookup table if you need to.
If the list is always going to be the same then it may be more efficient to pull it out of the database and hard code it as a validation routine. Provided that you code it as a separate object/function then you can always come back later and convert it to use a lookup table if you need to.
That is really good advice all should follow. Something doesn’t need to be stored in the database to be handled inside the data access layer. By placing static lists in the data access layer it is possible to come back later and add more items to the list or even retrieve the data from a table instead. The last thing you should do ever is hard code them into the template. The next developer inline will thank you.
Also, I know this the MySQL forum but XML is a very viable option for some things like configuration. With that said, accepted credit cards could be placed inside the configuration XML file. You could than parse the XML to retrieve the accepted credit cards throug the applications data access layer. This would be not different interface wise than using a static lists and merely returning the results via a data layer method or function.
Here is an example in PHP of what I’m talking about. You could change the implementation and so long as a signature and result set has a same interface everything will be compatible.
class DAOPayment {
public function fetchAcceptedCredCards() {
return array(
array('name'=>'American Express')
,array('name'=>'MasterCard')
,array('name'=>'VISA')
);
}
}
$payment = new DAOPayment();
$creditCards = $payment->fetchAcceptedCredCards();
So if you decide to place the cards in the db down the line you can do so without breaking your application.
class DAOPayment {
public function fetchAcceptedCredCards() {
return query('SELECT name FROM accepted_credit_cards');
}
}
$payment = new DAOPayment();
$creditCards = $payment->fetchAcceptedCredCards();
You could also use a XML file to retrieve the results or any other storage mechanism since the implementation details details have been made irrelevant to the application besides for the single method responsible for the data. So long as the expected format is returned you could even use a web service or anything really.