SitePoint Sponsor

User Tag List

Results 1 to 24 of 24

Hybrid View

  1. #1
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Storing product attributes/options in DB

    Branching off of this thread, what is the ideal way to store products in the database?

    On a site I'm doing currently, I have a table 'products' and a table 'product_attributes' - attributes being things a user can choose before purchasing - size, color, style, etc. This setup works fine, but was built with only one attribute per product in mind, and doesn't allow the same attributes to be applied to more than one product.

    I was discussing this in the other thread, and noticed that Magento (a php-based shopping cart) uses EAV tables to store their attributes. From previous experience and another poster, I know that's not an ideal solution. I'm wondering, though, what would be the 'best practice' way to store attributes for products, when each product may have any number of attributes, and each attribute many have any number of values?

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    "best practice" for storing would have to be EAV -- it is the ultimate in genericness (if that's a word) and a pinnacle of simplicity

    however, just storing data usually isn't da whole story, eh

    it's getting it back out, specifically in searches, that's hard



    picture the user interface for a search on shirts -- you'd want a dropdown for sizes, and another for colours, with maybe an options for all colours, text description search box, and so on

    then the web interface would have to translate the form fields into query variables, and the query would have to go against the EAV table and retrieve multiple rows efficiently, and, more than that, combine them in the same logical ways as the user specified in the form...

    and now do that with bike parts instead of shirts

    as you said, "the setup works fine" but EAV has an exponential increase in difficulty the more sophisticated the retrieval needs to be

    did that make sense or do i need to have another coffee?

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ok... so what is the ideal way to store product attributes for retrieval? Preferably in a way that doesn't increase exponentially in difficulty the more products you have

  4. #4
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    let's use your example of size, color, style, etc., and extend it to a clothing store.

    if you want the database to store stock levels of each combination of size, color, style, etc., then you really have no choice and will need one row per each possible combination.

    if you don't need stock levels, then you CAN collapse multiple values for an attribute in to a delimited column. however, this only makes sense if you never want to let your users search for those values on their own. for example, if the only time you'll be displaying color is when looking at an item or a list of items but not using color as a search criteria to define the list, then delimited values might be OK.

    but if you need to search those values, then you're back to one row per possible value.

    frankly, i would almost always go with one row per possible value. if you abstract the products table using a class or set of functions in your application, you only ever have to deal with the complexity once.

    i implemented this once and wrote a user interface that made it really easy to manage. the whole project also ended up being much easier to code than i had anticipated, once i wrote the abstraction layer to manage the products table.
    Last edited by longneck; Jun 19, 2009 at 13:53.
    Check out our new Industry News forum!
    Keep up-to-date with the latest SP news in the Community Crier

    I edit the SitePoint Podcast

  5. #5
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by longneck View Post
    let's use your example of size, color, style, etc., and extend it to a clothing store.
    I think that's a situation where most products will have a size, a color, and potentially a style. I'm wondering what to do in situations where the number of attributes vary in addition to the number of values each attribute has.

    Quote Originally Posted by oddz
    The major problem I see here is that if there are say 20 options for product how to handle the variant price and stock based on the selection of options. Its much easier just to handle each as a separate product if the options affect the price or stock individually and in different variations. Either way its going to become a mess as least by storing each product individually you only have one level to deal with.
    I agree treating each variation as a product makes things simpler, but it also duplicates a ton of columns if only 1 attribute varies between any given two rows. Plus, to a client, a 'Product' is conceptually higher than a variation-- they will, for example, add a new Columbia jacket (product) and then specify the sizes for that shirt (variations/options/attributes on a product).

    Maybe the answer is a table for each product 'type' (ie, 'jacket' or 'wine bottle' in which each attribute is a column on that table.
    Last edited by allspiritseve; Jun 20, 2009 at 11:54. Reason: Ooh, 500 posts :)

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by allspiritseve View Post
    I agree treating each variation as a product makes things simpler, but it also duplicates a ton of columns if only 1 attribute varies between any given two rows.
    so? and this is a problem because... ???

    analogy: you know how in an employees table you have many people with the first name "john" -- do you say to yourself "whoa, all that duplication from one person to another with the same first name, maybe we need to design another table which will hold unique instances of every first name, and link everyone to that table?"

    no, you don't -- in fact, most developers never, ever think that this is a good idea
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    so? and this is a problem because... ???
    It's a problem because, unlike your employee analogy, each product variation is part of a 'product type' whereas an employee named john is not part of a 'john type'. If one john changes their name, they wouldn't all change their names, whereas if a product's name changed, or a variation's name changed, or a product ended up having different attributes, that would apply to all variations of that product, not just the one changed. I'm no DB expert, but in that situation my instinct is to move the duplication to another table so changing one row applies to all rows that reference that row with a foreign key. The reason, from what I've learned reading forum posts, is in case rows have data that varies (spelling errors, whatever) in this type of situation, they wouldn't get an applied change that referenced them by their value, rather than a foreign key. However, I will defer to your expertise if I am wrong.

  8. #8
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,151
    Mentioned
    16 Post(s)
    Tagged
    3 Thread(s)
    Do the options affect the price or stock?

    The major problem I see here is that if there are say 20 options for product how to handle the variant price and stock based on the selection of options. Its much easier just to handle each as a separate product if the options affect the price or stock individually and in different variations. Either way its going to become a mess as least by storing each product individually you only have one level to deal with.
    The only code I hate more than my own is everyone else's.

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    no, you're not wrong

    a product type table isn't necessary, although in some applications it might have advantages

    "instinct" in database design doesn't always produce the optimal result
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    no, you're not wrong

    a product type table isn't necessary, although in some applications it might have advantages
    Well, I'm sure simpler schemas are fine for simpler sites, I just wanted to know for more complex e-commerce sites what direction I should head in, since I'd heard Magento used EAV and a lot of people didn't seem to agree with that decision.

    Quote Originally Posted by r937 View Post
    "instinct" in database design doesn't always produce the optimal result
    Exactly, hence two threads on the subject

  11. #11
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    OK, so on a simple site we might have a schema like this:

    Code:
    product
    --------
    sku
    title
    description
    size
    color
    And then on a more complex site, something like this?
    Code:
    shirt
    -----
    sku
    title
    description
    size
    color
    
    wine
    -----
    sku
    title
    description
    vineyard
    type
    volume
    
    etc...
    Is that where we've arrived?

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    a separate table for every different product type?

    i wouldn't go there myself...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    a separate table for every different product type?

    i wouldn't go there myself...
    Where would you go? A column on products for every unique attribute in the catalog?

  14. #14
    SitePoint Author silver trophybronze trophy
    wwb_99's Avatar
    Join Date
    May 2003
    Location
    Washington, DC
    Posts
    10,633
    Mentioned
    4 Post(s)
    Tagged
    0 Thread(s)
    I'd settle in the middle--columns for common product metadata, then a "payload column" of the specific product details. Personally, I'd use XML to store this in the database. Your product object can then be a bit polymorphic and act as the base products class or a concrete product type which would pull it's additional info from the XML.

    Another way to model this is a base products table then a variety of general product type tables to hold attributes for specific types of products. A little less flexible than the above, but a bit easier to search on the database level if you want to go that way.

  15. #15
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    @r937 if I have all my products in one table, I could see problems where a client might rename one product to the same name as another, and then any renaming would affect both products (assuming I'm distinguishing one product from another by their name, since two separate products may have the same attributes but should have a different name. Should I have a product_type table with the name on it, and then reference it with a foreing key from the products table?

  16. #16
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by allspiritseve View Post
    Should I have a product_type table with the name on it...
    i don't think that would be necessary
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  17. #17
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    i don't think that would be necessary
    Then how would you recommend differentiating two products with the same name and attributes? It's not an ideal situation, but if it can be done, a client will do it. With a product_type table I can have an auto incrementing id to differentiate them. Also, what do you do about columns that are shared between multiple products, such as a product title or description that is not affected by attributes? Do you just duplicate them across all products that need the same title/description?

    Edit: I doubt you saw my question in the other thread, so I'll ask again here: do you have some links on where I can read up on when it is appropriate to move columns into another table and when to leave them in the first one? I'd like to better understand the tradeoffs involved in each decision.

  18. #18
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by allspiritseve View Post
    Then how would you recommend differentiating two products with the same name and attributes?
    if they have the same name and the same attributes, then they're the same product!

    Quote Originally Posted by allspiritseve View Post
    Also, what do you do about columns that are shared between multiple products, such as a product title or description that is not affected by attributes? Do you just duplicate them across all products that need the same title/description?
    i would, yes

    the same way i would just duplicate the first name 'John" across all people in the persons table that have that first name that is not affected by the other attributes

    Quote Originally Posted by allspiritseve View Post
    do you have some links on where I can read up on when it is appropriate to move columns into another table and when to leave them in the first one?
    search supertype/subtype -- there are some very comprehensive articles/tutorials out there
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  19. #19
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    so it isn't duplication per se that normalization addresses, is it

    now let's move on the the pros and cons of abstracting the product description

    one of the benefits is that it takes less space (although to be frank, disk space is ~not~ an issue at a few dollers per gigabyte, is it)

    can you think of any drawbacks for the product_type (description) table?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  20. #20
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    can you think of any drawbacks for the product_type (description) table?
    It requires a join, and doesn't allow you to change individual product names/descriptions separate from the product_type table. Anything else?

  21. #21
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by allspiritseve View Post
    Anything else?
    ain't that enough???

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •