SitePoint Sponsor

User Tag List

Page 4 of 4 FirstFirst 1234
Results 76 to 83 of 83
  1. #76
    SitePoint Member
    Join Date
    Aug 2005
    Posts
    11
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Has anyone successfully implemented the model outlined in this article in MySQL? The first four operations -- add node, delete node, move node, add link -- are pretty straightforward but the last "delete link" operation did me in (MySQL 4.1.2).

  2. #77
    SitePoint Evangelist
    Join Date
    May 2004
    Location
    New Jersey, USA
    Posts
    567
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by i_m_making_a_cms
    Has anyone successfully implemented the model outlined in this article in MySQL? The first four operations -- add node, delete node, move node, add link -- are pretty straightforward but the last "delete link" operation did me in (MySQL 4.1.2).
    No.

    I stumbled on that page a while back, and the early parts convinced me that the author was pretty good at using unicode glyphs, but not so hot at database work.

    If you're implementing trees in a RDBMS, you need to make some hard estimations about how many nodes you'll have, and how deep the tree will be -- will it be wide or thin, short or tall? Once you have this in hand, generate some sample data: just randomly create a tree with a hundred, or a hundred thousand, or a hundred million nodes (honest, it's not that hard to write the code). Then test your algorithm.

    Assuming you're not writing small trees (in which case who cares?) you'll almost immediately fall into issues of indexing and statement optimization. Some of Vadim Tropashko's articles on the subject go into the details of query analysis for Oracle (his employer). His nested intervals approach has considerably better performance for tree reordering operations.

    Tropashko points out, correctly, that nested sets is only really useful for small trees or static large trees-- where little or no reorganization will take place-- because of the high reorg cost. MySQL has weighed in on the subject, with a wanton disregard for performance leading them to suggest a three-way self-join. Boy, I hope their optimizer is up to the task: 1M records x 1M x 1M = 1,000,000,000,000,000,000 combinations. Life is so much easier when the sample size is small.

    (Tropashko goes on to author this article on using Farey Fractions (honest: I can't make this stuff up ) to get equivalent performance without the slow reorganization behavior.)

    Next, give some thought to your operation mix: will you be reorganizing frequently? Is reorg performance critical, even if it is uncommon? After all, adjacency lists and materialized path schemes (which is essentially the theme of the article you cite) are murder for path and subtree queries with large datasets, but they perform quite well on reorganization operations.

    OTOH, if you genuinely need a DAG, you're really in a world of hurt. I very much doubt that you'll get any joy from a single-table solution. Consider Amazon's categorization behavior, versus Yahoo's.

    The Yahoo categories are pretty obviously either implemented as filesystem elements, or implemented using some sort of filesystem emulation: the "Links" are blatant: @Link-to-somewhere-else even looks like the output of 'ls'.

    Amazon, however, lists all the paths through all the categories that a book appears in. Looking up, for a random example, "Joe Celko's Trees and Hierarchies in SQL for Smarties" gives these categories:


    (man I hope the formatter doesn't destroy that...)



    Anyhow, if I were a gambling man I'd bet cash money that the categories tree doesn't change much -- a book may move from one place to another, but the basic categories probably don't get changed that much.

    Likewise, I bet the "shape" of the Sitepoint forums tree doesn't change too much, although articles are constantly being moved from one place to another (if you're writing a cms, as your handle suggests, you can see where I'm going with this).

    So maybe it's a clever idea to manage two tables: one for the hierarchy, and another one for the leaf nodes. And of course there's a third table to link them:


    Code:
         
         SELECT book.title, category.path 
           FROM book, link, category
         WHERE book.title LIKE %joe%celko%
           AND book.id = link.fk_bookid
           AND category.id = link.fk_categoryid

    In this scenario, of course, deleting links is trivial.


    =Austin
    Austin Hastings - Principal Consultant - Longacre, Inc.

    Anything you can do, you can do better.

  3. #78
    SitePoint Member
    Join Date
    Aug 2005
    Posts
    11
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi Austin,

    First of all, thanks for your informative response. I am indeed making (at least planning to make) a cms. I plan to organize site objects in a hierarchical format in the same vein as Zope and Limb CMS. Since the hierarchy will map to the URL path info, paths generally will not exceed 6 levels. Furthermore, I anticipate only the deepest nodes will be fat. Needless to say, queries will far outnumber updates so node retrieval is key. Updates are secondary but must be tolerably quick since they'll show up in the frontend from time to time.

    I've looked into the main tree/relational mapping models and came to the following conclusion:

    1. Adjacency - ugh, recursion
    2. Nested Sets - cms requires too many updates
    3. Nested Intervals - (apparently) suited for RDBMSes that support stored procedures, triggers, etc. Frankly, I'm a bit relieved because I have no hope of understanding this model anyway.
    4. Materialized Path - probably the best route for me to take. It's denormalized however (even worse for DAGs) and I'm suspicious of indices that rely on string parsing.

    That's why I thought I was on to something when I found the article that I cited in the OP. The model looks like a normalized variant of the materialized path, as you mentioned. A glance at the model suggests that path and subtree queries would be quick, even with a million+ nodes; they're simply selects against id or ancestor_id afterall).

    I'm not quite sure where I'm going with this post, but having elaborated a bit on my requirements I was wondering if you could offer any suggestions or advice. You certainly seem to know a lot more about this than me.

    And perhaps I'm just slow but could you please have a quick look at that last "delete link" query in that article and explain to me why there are two link tables (ActorLink, DataLink)? The previous queries were fairly straightforward mainly because the tables and columns actually map to the tables in the diagram, but that last query, specifically the second derived table, has really manhandled me. My tables look something like this:

    data
    --------------------
    id | parent_id

    path
    --------------------
    id | ancestor_id

    link
    --------------------
    from_id | to_id

    Thanks again.

  4. #79
    SitePoint Enthusiast rikdc's Avatar
    Join Date
    Sep 2005
    Location
    Edinburgh
    Posts
    71
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I've hit this problem time and time again, and so far the one which has caused me the least hassle has been as Austin_Hastings suggested, and that's storing the structure in another table.

    The table structure is simple.

    categories( id, category_name, display_order );
    categories_tree ( parent_id, child_id );

  5. #80
    SitePoint Enthusiast
    Join Date
    Sep 2005
    Location
    Edinburgh, UK
    Posts
    51
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by [ArcanE]
    Anyone found a decent solution for moving tree nodes and their children up and down in the menu while maintaining the mptt database integrity yet?
    Same question here without actually going through all the elements of the tree with a recursive function ??

    Any one with good math

    Geza

  6. #81
    SitePoint Addict
    Join Date
    Sep 2002
    Posts
    225
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    So in reading this whole thread and thinking about the benefits of both the adjacency list model and the modified preorder tree traversal model I'm thinking that the (of course depending on your situation) the adjacency list model might be the way to go for creating menus.

    Let me explain. First it is really slow with recursion creating the menus. Why not create the menu once seralize it and throw it into another table, basically caching it. This way if you get a lot of requests for the menu it's one database call and unserialize and poof. Fast right?

    Then there are the updates, inserts, deletes. In an adjacency model those are easy, from what I've heard (well a lot easier than the modified preorder tree tracersal). So after every update, insert, delete just recache and serialize the tree again.

    I'm assuming that your menu is going to get rendered a bunch of times and not updated all that much. If that's the case why not go this route? hm.

  7. #82
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,215
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    if you are thinking of a hierarchy for the navigation menu of a web site, you're not going to go more than 3, maybe 4, surely not 5 or 6 levels deep, right? (if you do, either you have a huge site, or maybe you have an information architecture problem)

    in that case, you do not need recursion, just use left outer self-joins

    see http://sqllessons.com/categories.html
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #83
    SitePoint Enthusiast
    Join Date
    Aug 2007
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I have the following structure:

    PHP Code:
    TB
        
    cat a
        
    cat b
            
    sub-cat b
            
    sub-cat b
            
    sub-cat b
        
    cat c 
    My table structure is 'category_id, name, lft, rgt'.

    The category_id is linked in with my products table.

    Any ideas how I can display my products in the following fashion:

    PHP Code:
    TB
        
    cat a
            
    product from sub category a
        
    cat b
            
    sub-cat bb
                
    product from sub category bb
            
    sub-cat bbb
                
    product from sub category bbb
                
    product from sub category bbb
            
    sub-cat bbbb
        
    cat c
            
    product from sub category c 
    If I'm not being clear enough, let me know.

    Thanks!


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •