SitePoint Sponsor |
|
User Tag List
Results 1 to 24 of 24
-
Jun 19, 2009, 11:57 #1
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Storing product attributes/options in DB
Branching off of this thread, what is the ideal way to store products in the database?
On a site I'm doing currently, I have a table 'products' and a table 'product_attributes' - attributes being things a user can choose before purchasing - size, color, style, etc. This setup works fine, but was built with only one attribute per product in mind, and doesn't allow the same attributes to be applied to more than one product.
I was discussing this in the other thread, and noticed that Magento (a php-based shopping cart) uses EAV tables to store their attributes. From previous experience and another poster, I know that's not an ideal solution. I'm wondering, though, what would be the 'best practice' way to store attributes for products, when each product may have any number of attributes, and each attribute many have any number of values?
-
Jun 19, 2009, 12:56 #2
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
"best practice" for storing would have to be EAV -- it is the ultimate in genericness (if that's a word) and a pinnacle of simplicity
however, just storing data usually isn't da whole story, eh
it's getting it back out, specifically in searches, that's hard
picture the user interface for a search on shirts -- you'd want a dropdown for sizes, and another for colours, with maybe an options for all colours, text description search box, and so on
then the web interface would have to translate the form fields into query variables, and the query would have to go against the EAV table and retrieve multiple rows efficiently, and, more than that, combine them in the same logical ways as the user specified in the form...
and now do that with bike parts instead of shirts
as you said, "the setup works fine" but EAV has an exponential increase in difficulty the more sophisticated the retrieval needs to be
did that make sense or do i need to have another coffee?
-
Jun 19, 2009, 13:03 #3
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Ok... so what is the ideal way to store product attributes for retrieval? Preferably in a way that doesn't increase exponentially in difficulty the more products you have
-
Jun 19, 2009, 13:07 #4
- Join Date
- Feb 2004
- Location
- Tampa, FL (US)
- Posts
- 9,854
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
let's use your example of size, color, style, etc., and extend it to a clothing store.
if you want the database to store stock levels of each combination of size, color, style, etc., then you really have no choice and will need one row per each possible combination.
if you don't need stock levels, then you CAN collapse multiple values for an attribute in to a delimited column. however, this only makes sense if you never want to let your users search for those values on their own. for example, if the only time you'll be displaying color is when looking at an item or a list of items but not using color as a search criteria to define the list, then delimited values might be OK.
but if you need to search those values, then you're back to one row per possible value.
frankly, i would almost always go with one row per possible value. if you abstract the products table using a class or set of functions in your application, you only ever have to deal with the complexity once.
i implemented this once and wrote a user interface that made it really easy to manage. the whole project also ended up being much easier to code than i had anticipated, once i wrote the abstraction layer to manage the products table.Last edited by longneck; Jun 19, 2009 at 13:53.
Check out our new Industry News forum!
Keep up-to-date with the latest SP news in the Community Crier
I edit the SitePoint Podcast
-
Jun 19, 2009, 17:10 #5
- Join Date
- Jul 2006
- Location
- Augusta, Georgia, United States
- Posts
- 4,194
- Mentioned
- 17 Post(s)
- Tagged
- 5 Thread(s)
Do the options affect the price or stock?
The major problem I see here is that if there are say 20 options for product how to handle the variant price and stock based on the selection of options. Its much easier just to handle each as a separate product if the options affect the price or stock individually and in different variations. Either way its going to become a mess as least by storing each product individually you only have one level to deal with.The only code I hate more than my own is everyone else's.
-
Jun 20, 2009, 11:45 #6
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
I think that's a situation where most products will have a size, a color, and potentially a style. I'm wondering what to do in situations where the number of attributes vary in addition to the number of values each attribute has.
Originally Posted by oddz
Maybe the answer is a table for each product 'type' (ie, 'jacket' or 'wine bottle' in which each attribute is a column on that table.Last edited by allspiritseve; Jun 20, 2009 at 11:54. Reason: Ooh, 500 posts :)
-
Jun 20, 2009, 11:55 #7
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
so? and this is a problem because... ???
analogy: you know how in an employees table you have many people with the first name "john" -- do you say to yourself "whoa, all that duplication from one person to another with the same first name, maybe we need to design another table which will hold unique instances of every first name, and link everyone to that table?"
no, you don't -- in fact, most developers never, ever think that this is a good idea
-
Jun 20, 2009, 12:07 #8
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
It's a problem because, unlike your employee analogy, each product variation is part of a 'product type' whereas an employee named john is not part of a 'john type'. If one john changes their name, they wouldn't all change their names, whereas if a product's name changed, or a variation's name changed, or a product ended up having different attributes, that would apply to all variations of that product, not just the one changed. I'm no DB expert, but in that situation my instinct is to move the duplication to another table so changing one row applies to all rows that reference that row with a foreign key. The reason, from what I've learned reading forum posts, is in case rows have data that varies (spelling errors, whatever) in this type of situation, they wouldn't get an applied change that referenced them by their value, rather than a foreign key. However, I will defer to your expertise if I am wrong.
-
Jun 20, 2009, 12:34 #9
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
no, you're not wrong
a product type table isn't necessary, although in some applications it might have advantages
"instinct" in database design doesn't always produce the optimal result
-
Jun 20, 2009, 12:49 #10
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Well, I'm sure simpler schemas are fine for simpler sites, I just wanted to know for more complex e-commerce sites what direction I should head in, since I'd heard Magento used EAV and a lot of people didn't seem to agree with that decision.
Exactly, hence two threads on the subject
-
Jun 20, 2009, 16:33 #11
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
OK, so on a simple site we might have a schema like this:
Code:product -------- sku title description size color
Code:shirt ----- sku title description size color wine ----- sku title description vineyard type volume etc...
-
Jun 20, 2009, 16:39 #12
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
a separate table for every different product type?
i wouldn't go there myself...
-
Jun 20, 2009, 16:40 #13
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
-
Jun 20, 2009, 16:44 #14
- Join Date
- May 2003
- Location
- Washington, DC
- Posts
- 10,653
- Mentioned
- 4 Post(s)
- Tagged
- 0 Thread(s)
I'd settle in the middle--columns for common product metadata, then a "payload column" of the specific product details. Personally, I'd use XML to store this in the database. Your product object can then be a bit polymorphic and act as the base products class or a concrete product type which would pull it's additional info from the XML.
Another way to model this is a base products table then a variety of general product type tables to hold attributes for specific types of products. A little less flexible than the above, but a bit easier to search on the database level if you want to go that way.
-
Jun 24, 2009, 15:30 #15
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
@r937 if I have all my products in one table, I could see problems where a client might rename one product to the same name as another, and then any renaming would affect both products (assuming I'm distinguishing one product from another by their name, since two separate products may have the same attributes but should have a different name. Should I have a product_type table with the name on it, and then reference it with a foreing key from the products table?
-
Jun 24, 2009, 21:16 #16
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
-
Jun 24, 2009, 21:24 #17
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Then how would you recommend differentiating two products with the same name and attributes? It's not an ideal situation, but if it can be done, a client will do it. With a product_type table I can have an auto incrementing id to differentiate them. Also, what do you do about columns that are shared between multiple products, such as a product title or description that is not affected by attributes? Do you just duplicate them across all products that need the same title/description?
Edit: I doubt you saw my question in the other thread, so I'll ask again here: do you have some links on where I can read up on when it is appropriate to move columns into another table and when to leave them in the first one? I'd like to better understand the tradeoffs involved in each decision.
-
Jun 25, 2009, 08:41 #18
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
if they have the same name and the same attributes, then they're the same product!
i would, yes
the same way i would just duplicate the first name 'John" across all people in the persons table that have that first name that is not affected by the other attributes
search supertype/subtype -- there are some very comprehensive articles/tutorials out there
-
Jun 25, 2009, 11:16 #19
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Not necessarily. If the user is going through and renaming a couple of products, there could potentially be duplicate names before they're done renaming. And the attributes don't have to be the same, as long as the name field is the one that determines a "Product with variations". Any name clash would result in rows for both "Product with variations" being edited at once.
I don't understand why you keep using that analogy. Multiple product rows have the same title/description, not because of coincidence, but because they're just variations of a single Product. Renaming one title/description should be renaming every title/description.
I didn't mean in this scenario specifically, I just meant in general, when to create new tables and when not to. I'd always understood it to be to remove duplication, but you don't seem to care about duplication.
-
Jun 25, 2009, 13:00 #20
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
because it is ~so~ appropriate
nobody every thinks that an employee table with dozens of duplications of the first name "John" warrants some kind of intervention by way of a FirstNames table with numeric ids
why? well, nobody does this because it would be silly... except it doesn't appear quite as silly in other contexts
if there are 15 different variations of "Kool Tshirt" based on size, colour, etc., then why is it necessary to abstract the description there?
suppose you want to change the description, well, you'd run something like this:Code:UPDATE products SET descr = 'Kool T-Shirt' WHERE descr = 'Kool Tshirt'
as far as duplication is concerned, let's suppose that you did go ahead and abstract the description into a product_type table
then each of the 15 products would have a link to its product_type table, yes? so the description is stored only once, right? well, my point about eliminating duplication is that you haven't eliminated it at all -- the 15 products all have the same numeric FK to the product_type table, so there is still exactly the same amount of duplication
did i explain that okay, or are you still not sure?
-
Jun 25, 2009, 13:18 #21
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
As I've said before, that's because they're not of a 'John type'. Renaming one shouldn't rename the others, but renaming a product 'Kool Tshirt' should rename all variations of that product.
It's taking up space, with no real need. If I wanted certain products to say 'Red Kool Tshirt' and have a separate description for each, then I would see the benefit to duplicating the name/description. The customization is there if you need it. But that's generally not a requirement.
duplicating 15 numbers takes up far less space than 15 names and full page descriptions of a product.
I understand your point, not to eliminate duplication just because there's duplication. I think in this case the duplication is unnecessary, though.
-
Jun 25, 2009, 13:34 #22
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
so it isn't duplication per se that normalization addresses, is it
now let's move on the the pros and cons of abstracting the product description
one of the benefits is that it takes less space (although to be frank, disk space is ~not~ an issue at a few dollers per gigabyte, is it)
can you think of any drawbacks for the product_type (description) table?
-
Jun 25, 2009, 13:41 #23
- Join Date
- Dec 2002
- Location
- Ann Arbor, MI (USA)
- Posts
- 648
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
-
Jun 25, 2009, 13:46 #24
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
Bookmarks