SitePoint Sponsor

User Tag List

Page 3 of 4 FirstFirst 1234 LastLast
Results 51 to 75 of 96
  1. #51
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by SJH View Post
    Oh, and don't be fooled into thinking that more rows = less efficient, or that more rows = lack of flexibility for displaying the data at the front end. If data is stored in an efficient manner, you will be able to display it however you want. Trust me on this
    You can't tell me that matthijsA's schema is more efficient or flexible than mine?

    Quote Originally Posted by SJH View Post
    Wardrop, please explain to me, using your proposed schema, how choosing 'cotton' as the material could increase the price by $2 if the shirt is in medium, and by $3 if large is selected. Not that you'd necessarily need to, but I still think that's another potential shortcoming of your table structure.
    I just noticed that I had cotton repeated twice in my example table. Can you look at that post again and tell me if your question still applies?

    EDIT: Oh wait, I think I just got what you were asking. You're asking me to demonstrate something to you if I'm not mistaken. There's two ways I can interpret that though. You're either saying, how can the total price of the shirt be increased by $2 if medium cotton was chosen, and by $3 if the large cotton combination was chosen, in which case I'd present you with this...

    Code:
    ID	Product_ID	Attribute	Value		Price_Diff
    --	----------	---------	-----		----------
    388	45		size		medium		+$1
    389	45		size		Large		+$2
    390	45		material	Cotton		+$1
    That would make a medium cotton shirt cost $2 more than the base price, and a large cotton shirt cost $3 more than the base price. However, you may be asking, how do you (using my schema), make the cost of selecting cotton, an extra $3, if large has been selected, but only an extra $2 if medium has been selected. So if you were to use the price_diff values for the two sizes in the above example table, it would make the cost of a medium cotton shirt $3, but the $cost of a large cotton shirt, $5. In which case, that would be a minor limitation of my schema I guess, but it could be easily worked around by merging the two separate attributes into one (essentially putting the possible combinations as a value), which you'd probably want to do anyway seeing as though in such a circumstance, both attributes and value sets directly relate to one another. So what you'd end up with is a 'size/material' attribute which would look like the following...

    Code:
    ID	Product_ID	Attribute	Value		Price_Diff
    --	----------	---------	-----		----------
    388	45		size/material	medium/cotton	+$3
    389	45		size/material	large/cotton	+$5
    Not really a limitation when you think about it, as I can't think of (off the top my head), a schema that could achieve such a thing. Being a complex pricing scheme, I imagine it would require a relatively complex schema - probably too complex to worry about implementing unless it was an absolute requirement of the application.
    Last edited by Wardrop; Jun 23, 2009 at 04:46.

  2. #52
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by matthijsA View Post
    I just proposed an alternative schema. I really don't feel like or need to defend it.
    The only reason we had this debate (I guess you could call it), and the only reason you have needed to defend it is because you said when you first proposed it, that it essentially achieved the same thing (which I guess technically it does, albeit in a more impractical, cumbersome way - that is however, if you were trying to achieve the same goal as what I was trying to achieve). At least that's how I interpreted it.

  3. #53
    SitePoint Zealot
    Join Date
    Jun 2004
    Location
    Netherlands
    Posts
    172
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Wardrop View Post
    I don't think there was a stage where I didn't understand your schema and that it was "possible", it was just so damn impractical I thought that maybe you were missing something.
    what about:
    Quote Originally Posted by Wardrop View Post
    This whole time I've been trying to prove to you that your solution didn't support such dynamic pricing, and only supported one price altering attribute (when there could potentially be many). However it's clear now that you weren't trying to achieve such a thing, which makes me ask what relevance your solution has to the original post, as the whole point of this thread I thought, was about linking prices to attributes
    ...
    That's my reason for trying to prove the point I'm trying to prove - your schema does not meet those requirements. Unfortunately it's taken me many posts to try and explain that
    ...
    Your schema does not allow any more than one price changing attribute
    Can you understand that if I read this, I interpret it as meaning: "your schema is not doing what the requirements are, allowing price differences for different attributes"?


    Again, what is "impractical" is something else and depends on other factors. For example, SJH's example, is a situation which my schema allows, without adaptation. Maybe allspiriteve needs that specific flexibility, maybe not

  4. #54
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    i just thought i'd jump in and correct a misunderstanding about normalization

    normalization is ~not~ about removing duplication

    you can have a perfectly good normalized database schema with duplication all over the place

    normalization has to do with the relationship between attributes and keys

    that is all

    now back to your regularly scheduled discussion of the intricacies of EAV models

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #55
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by matthijsA View Post
    Can you understand that if I read this, I interpret it as meaning: "your schema is not doing what the requirements are, allowing price differences for different attributes"?
    Yes, that's how it should be understood. If you're trying to emphasise the point that it is possible to achieve variable pricing based on attributes using your schema, then don't. I mean, it's possible to achieve that with a single products table. For example...

    Code:
    ID	Title					Price_Diff
    --	-----					----------
    388	T-Shirt - Green Large Cotton		$23
    389	T-Shirt - Orange Medium Polyster	$20	
    389	T-Shirt - Orange Small Satin		$24
    Would you look at that, I've got price altering product attributes.

    All of these schemas essentially store the same data. I didn't think this thread was about finding a schema that can achieve the outcome any old way, I was under the impression this thread was about finding the most efficient/reliable/friendly method of doing so.

    Quote Originally Posted by matthijsA View Post
    Again, what is "impractical" is something else and depends on other factors. For example, SJH's example, is a situation which my schema allows, without adaptation. Maybe allspiriteve needs that specific flexibility, maybe not
    SJH's example can also be achieved in that single products table example above. It's of course possible in mine as well as I demonstrated. In all 3 schemas, SJH's example is stored as a combined set of attribute values (with a pre-determined price). You could say mine goes a little further by allowing multiple sets of combined attribute values which mixes it all together (although you could probably claim the same thing about your schema)...

    Code:
    ID	Product_ID	Attribute	Value		Price_Diff
    --	----------	---------	-----		----------
    388	45		size/material	medium/cotton	+$3
    389	45		size/material	large/cotton	+$5
    390	45		colour/finish	green/matte	+$0
    391	45		colour/finish	orange/matte	+$0
    392	45		colour/finish	orange/gloss	+$3
    Anyway, it's bed time for me.

  6. #56
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    i just thought i'd jump in and correct a misunderstanding about normalization
    I think we've moved on

    However, if we ever revisit normalisation, I'll be sure to refer back to your post

  7. #57
    SitePoint Wizard
    Join Date
    Mar 2002
    Location
    Bristol, UK
    Posts
    2,240
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Wardrop
    You can't tell me that matthijsA's schema is more efficient or flexible than mine?
    I'm afraid so

    Quote Originally Posted by Wardrop View Post
    SJH's example can also be achieved in that single products table example above. It's of course possible in mine as well as I demonstrated. In all 3 schemas, SJH's example is stored as a combined set of attribute values (with a pre-determined price). You could say mine goes a little further by allowing multiple sets of combined attribute values which mixes it all together (although you could probably claim the same thing about your schema)...

    Code:
    ID    Product_ID    Attribute    Value        Price_Diff
    --    ----------    ---------    -----        ----------
    388    45        size/material    medium/cotton    +$3
    389    45        size/material    large/cotton    +$5
    390    45        colour/finish    green/matte    +$0
    391    45        colour/finish    orange/matte    +$0
    392    45        colour/finish    orange/gloss    +$3
    Anyway, it's bed time for me.
    This breaks first normal form. What happens if I want to query the database for all shirts that are in medium, or all shirts that are in orange and in cotton? I guess we could use WHERE Value LIKE '%cotton%' or some variant thereof, but that's going to be one hell of a slow query.

  8. #58
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by SJH View Post
    This breaks first normal form. What happens if I want to query the database for all shirts that are in medium, or all shirts that are in orange and in cotton?
    You do have a point, and I'm glad you pointed out that fact. I'll probably think of a response to it after I get some sleep

  9. #59
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ok, I've had some sleep.

    I'll admit, my schema doesn't allow for such information to be stored, but it really doesn't worry me when I think about it. That's about the only minor pitfall of the schema.

    To compare, my schema, and matthijsA's schema store information in a completely different way. In my schema a single product, with multiple attributes (and values), is treated as such (a single product). matthijsA's schema takes the approach that every possible combination of attributes and values, should be it's own product.

    matthijsA's schema would work if every product only had very few attributes and attribute values (3 attributes, each with 3 values would probably be as far as you'd want to go). In other words, it doesn't scale well (takes up a lot of space in the database for complex products, and data entry for such products would be a nightmare to say the least). That would be suitable however for the majority of online stores, but that's the main pitfall.

    My schema on the other hand, scales very well. It's most suited to shops that sell made-to-order products, such as Dell, but can accommodate simple products at the same time (such different colour t-shirts). The only pitfall of my schema (that has be found at least), is that you can't do complex price altering which involves linking to values from separate attributes. For this you need a schema such as
    matthijsA's which stores all the combinations of a product.

    So there you go. Both schema's do essentially the same thing, but both have their pitfalls I guess. From allspiritsteves original post, I got the impression that he wanted a schema that supported scalability in terms of the number of attributes (so it could support customisable made-to-order products). matthijsA's interpreted the original the other way. Who's right? allspiritsteve will have to clarify won't he.

    I'm off to work anyway... again (bloody work).

  10. #60
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Wardrop View Post
    From allspiritsteves original post, I got the impression that he wanted a schema that supported scalability in terms of the number of attributes (so it could support customisable made-to-order products). matthijsA's interpreted the original the other way. Who's right? allspiritsteve will have to clarify won't he.
    Nobody's right or wrong here. However, I'm looking for a schema that is not EAV-based. Both yours and Matthijs' solutions are EAV-based. We've established EAV is extremely slow and hard to search. Thus, I've come to the conclusion EAV is not a 'best practice', but I have yet to hear what a best practice schema is for a complex e-commerce site that has to deal with many and varying attributes.

  11. #61
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by allspiritseve View Post
    Thus, I've come to the conclusion EAV is not a 'best practice'...
    okay, let's move on to the next approach

    do some research on supertype/subtype

    you would have a main products table with as many common attributes as possible -- e.g. description, supplier, price, etc.

    (yes each product has a single price, so if there are 24 size/colour variations in t-shirts, then there are 24 separate products -- this is how the global SKU system works)

    then for products which require additional oddball attributes, another table to hold just those attributes, with a foreign key back to the main products table, a one-to-one relationship for ech different product subtype table
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #62
    SitePoint Wizard Ren's Avatar
    Join Date
    Aug 2003
    Location
    UK
    Posts
    1,060
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    okay, let's move on to the next approach

    do some research on supertype/subtype
    Also known as "Class Table Inheritance"

  13. #63
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    okay, let's move on to the next approach
    That's what I'm talking about!

    So basically a simple site would have just a products table, and then as attribute columns became specialized, move them out into a separate table. Is there a rule of thumb for when to separate columns out into the subtype table?

  14. #64
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by allspiritseve View Post
    Is there a rule of thumb for when to separate columns out into the subtype table?
    well, my particular rule of thumb is whenever the number of specialized columns becomes much larger than the number of common columns

    which, in the number of times i have implemented it, has never happened

    i see nothing wrong with several sets of columns, that are all NULLable, where, depending on the product type, only one set will have any values
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #65
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by allspiritseve View Post
    We've established EAV is extremely slow and hard to search.
    Mind elaborating on that. I believe I asked this earlier in the thread but don't believe I received a response.

    Quote Originally Posted by allspiritseve View Post
    Nobody's right or wrong here.
    Just to clarify, I meant who's interpretation (out of myself and matthijsA) of your original post/requirements was correct. I don't really care about getting an answer though, I was just making a point.

  16. #66
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    you would have a main products table with as many common attributes as possible -- e.g. description, supplier, price, etc.

    Then for products which require additional oddball attributes, another table to hold just those attributes, with a foreign key back to the main products table, a one-to-one relationship for each different product subtype table
    Isn't that essentially what my schema does? The attributes table is completely optional, and a major intention of the products table is to store common attributes as you mentioned.

    I personally don't believe any attributes, which apply only to a minority of products, should be stored as columns in the products table. If you start doing this, you'll never know where to draw the line and what you'll eventually end up with is a schema which stores uncommon attributes in two separate locations. On top of that, if you stop selling products which would normally populate that specialised column, you end up with an extra, completely useless column in your products table, unless you modify the schema of the database. I don't know about you, but modifying the schema whenever as a change the products I sell, isn't something I'd like to be doing.

  17. #67
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    well, my particular rule of thumb is whenever the number of specialized columns becomes much larger than the number of common columns

    which, in the number of times i have implemented it, has never happened

    i see nothing wrong with several sets of columns, that are all NULLable, where, depending on the product type, only one set will have any values
    Geez, might as well go back to my boss's favorite schema: one table for the whole application

    In all seriousness though, do you have some links on where I can read up on when it is appropriate to move columns into another table and when to leave them in the first one? I'd like to better understand the tradeoffs involved in each decision.

  18. #68
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    <ignore this post, it was an accident.>

  19. #69
    SitePoint Zealot
    Join Date
    Jun 2004
    Location
    Netherlands
    Posts
    172
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    What I wonder though: do people have actual experience with the scaling (or not) of different schema's? What kind of numbers are we talking about? With how many products and attributes do things start to get slow? Have people put up benchmarks with dummy products to test this stuff?

    I've been reading up a bit about when databases become the limiting factor and what measures are taken to scale, but the only articles I find are about massive sites like Flickr, Digg, Delicious, etc. Now with millions of users, millions of visits a day, and millions of requests an hour, I can understand some draconian measures are needed to scale the stuff. But I assume that, in this thread, we are not talking about building another Amazon.com, aren't we?

    I still have to find out why an app like Magento is slow (if it is). Maybe they did some weird stuff so that they have loops of 100 queries instead of one single smart join?

  20. #70
    SitePoint Addict Mastodont's Avatar
    Join Date
    Mar 2007
    Location
    Czech Republic
    Posts
    375
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I still have to find out why an app like Magento is slow (if it is)
    http://www.sitepoint.com/forums/showthread.php?t=601503
    http://www.crucialwebhost.com/blog/i...l-query-cache/
    http://www.byte.nl/blog/2008/09/03/speeding-up-magento/
    It seems that Magento needs a lot of server tuning.

  21. #71
    SitePoint Zealot
    Join Date
    Jun 2004
    Location
    Netherlands
    Posts
    172
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Mastodont, thanks for the links. However, those articles still don't explain exactly why Magento is slow. It's something to do with complex code and slow queries. But then we still don't know how much the the EAV db schema itself, independently, is responsible for the slowdown.

    [edit]By the way, I'm fine leaving the EAV subject behind (or split it in a new thread) and exploring other possibilities, if everybody already agrees that it is not a useful pattern in this case. I can do my own research on EAV to find out why it is so slow

  22. #72
    SitePoint Addict Mastodont's Avatar
    Join Date
    Mar 2007
    Location
    Czech Republic
    Posts
    375
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by matthijsA View Post
    However, those articles still don't explain exactly why Magento is slow. It's something to do with complex code and slow queries.
    I do not know Magento and am not involved in any OS shop, but lately I see some interesting presentation:
    http://www.priebsch.de/2009/03/had-a...urce-code-yet/
    So I think that both mentioned sources are valid ...

  23. #73
    SitePoint Zealot
    Join Date
    Jun 2004
    Location
    Netherlands
    Posts
    172
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    @mastodont: please read carefully what I said. I said: yes, magento might be slow. And yes, there is bad code in there. But that's not the question here. The relevant question is:
    "is it the EAV-db schema used in Magento that is causing the slowdown?"

    This question is relevant to the discussion here, because Allspririteve want to know what is considered best practice for designing his database model. Both Wardrop and I brought forward some sort of EAV model (at least my schema looks like that). So the question is, does that schema scale or not?

    In my schema, if you have 10,000 products, with each on average 10 attributes (color, size, weight, etc), you end up having a table products with 10,000 rows. A table attributes with say a few hundred attributes. And a lookup Product2Attribute table with 10x10,000 = 100,000 rows.

  24. #74
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by matthijsA View Post
    What I wonder though: do people have actual experience with the scaling (or not) of different schema's? What kind of numbers are we talking about? With how many products and attributes do things start to get slow? Have people put up benchmarks with dummy products to test this stuff?
    It depends on how the schema stores the data, and how many rows it uses to do so. As I said in my PM to you, storing a single product using your previously proposed schema, which had 10 attributes, each of which had 5 values each, would require 100 millions rows to store just that single product (if anyone doubts this, I've got equations to prove it). Obviously that schema wasn't designed to store products with that many attributes and values, but it's an example of why your question can't be answered as is.

    The two things which can slow a database are size, and the queries which are required to retrieve certain information. The latter being the most common. That's why it's not possible to say that the EAV model is slow, without knowing what data is being stored, and without knowing how it needs to be accessed.

    That's about all I can say in reply to such a general question.

  25. #75
    SitePoint Zealot
    Join Date
    Jun 2004
    Location
    Netherlands
    Posts
    172
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    You are right that in a situation in which each attribute can be combined with any of the other values for other attributes, you get millions of combinations.

    I understand your calculation. But what we are thinking/talking about are different situations. Your calculation is indeed correct for a situation like the computer store in which you have many, many (almost endless) possible combinations of attributes. Each motherboard can have each memory and can have each case, can have each mouse, can have each monitor, etc. Then, indeed you are right, my schema will not scale.

    However, I was more thinking about for example a clothing store. That clothing store has a limited inventory. There are not 100 million different clothing items lying in the store. There might be 10,000 or 100,000 different variations of products in total. That is doable for a database, as far as I know.


Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •