SitePoint Sponsor

User Tag List

Results 1 to 9 of 9
  1. #1
    SitePoint Wizard silver trophy Jelena's Avatar
    Join Date
    Feb 2005
    Location
    Universum, 3rd Corner
    Posts
    3,000
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Tags and Database Structure

    I'm in a situation where I need to create database structure to tag items and display various statistics connected to tagging. Basically, the most similar thing I need is something that You Tube is doing, having tags on video's and connecting them with categories as well.

    I read this article: http://www.pui.ch/phred/archives/200...e-schemas.html which shows 3 different ways of storing tags.

    At first, I thought that delicious database structure presented would be easiest and should be good enough. However, I'm concerned about its speed.

    Now, I'm thinking more about normalized solution with 3 tables, items, tags and m:n relation between them with a slight denormalization of putting tags in one field in items table in character form just to have original order once they get inserted from the form field.

    I don't like the fact that I will have to unlink all tags connected with an item once they need to be updated.

    Anyway, I need your opinion. What would you do, which database structure would you choose and why?

    Thank,
    Jelena
    -- Jelena --

  2. #2
    ☆★☆★ silver trophy vgarcia's Avatar
    Join Date
    Jan 2002
    Location
    in transition
    Posts
    21,235
    Mentioned
    1 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by jelena
    Now, I'm thinking more about normalized solution with 3 tables, items, tags and m:n relation between them with a slight denormalization of putting tags in one field in items table in character form just to have original order once they get inserted from the form field.
    I don't see why you have to denormalize like that. If you insert the tags/taggings in order you should be able to get them back the same way with an ORDER BY.

    I'd go with your 3-table structure minus the denormalization since I don't particularly see a good reason for it.

    Off Topic:

    If you're doing this in Ruby on Rails, the acts_as_taggable plugin uses the 3 table structure and it's really easy to add to models.

  3. #3
    SitePoint Wizard silver trophy Jelena's Avatar
    Join Date
    Feb 2005
    Location
    Universum, 3rd Corner
    Posts
    3,000
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by vgarcia
    I don't see why you have to denormalize like that. If you insert the tags/taggings in order you should be able to get them back the same way with an ORDER BY.

    I'd go with your 3-table structure minus the denormalization since I don't particularly see a good reason for it.
    What do you think about counting how many items are connected with tags? Would it be good to have a counting field in tags table? I'm concerned about COUNT function speed if table gets too large.
    Quote Originally Posted by vgarcia
    Off Topic:

    If you're doing this in Ruby on Rails, the acts_as_taggable plugin uses the 3 table structure and it's really easy to add to models.
    Off Topic:


    Thanks. I will remember it for the future reference.
    -- Jelena --

  4. #4
    ☆★☆★ silver trophy vgarcia's Avatar
    Join Date
    Jan 2002
    Location
    in transition
    Posts
    21,235
    Mentioned
    1 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by Jelena
    What do you think about counting how many items are connected with tags? Would it be good to have a counting field in tags table? I'm concerned about COUNT function speed if table gets too large.
    What are the odds of your table getting large right away? If you expect that kind of activity within your first month then go ahead and add the count field, but if this is a new site/app starting from scratch, then don't worry about it until slowdowns actually happen. Get to 3NF now, then denormalize later when you really experience bottlenecks.

  5. #5
    SitePoint Wizard silver trophy Jelena's Avatar
    Join Date
    Feb 2005
    Location
    Universum, 3rd Corner
    Posts
    3,000
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Yes, it's new application and I don't expect much of an activity in the beginning. However, I was more thinking about it what will happen later if speed become an issue.


    And thanks, Vinnie.
    -- Jelena --

  6. #6
    chown linux:users\ /world Hartmann's Avatar
    Join Date
    Aug 2000
    Location
    Houston, TX, USA
    Posts
    6,455
    Mentioned
    11 Post(s)
    Tagged
    0 Thread(s)
    I would go with the three table option, normalized. The guy's "disadvantage" about there being possible orphan tags on a deletion of an item is a little overplayed. If you are using a newer version of MySQL and the InnoDB table type then you can implement ON DELETE CASCADE for things, so the problem goes out the window. So my design would be as follows:

    Code:
    Item
    ------------
    item_id                   
    item_name
    etc.
    
    Tags
    ------------
    tag_id
    tag_name
    etc.
    
    Item_Tags
    --------------
    item_id
    tag_id
    Now that I look at it, the only reason that you would have orphan tags is if you did not write good delete code.

  7. #7
    ☆★☆★ silver trophy vgarcia's Avatar
    Join Date
    Jan 2002
    Location
    in transition
    Posts
    21,235
    Mentioned
    1 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by Hartmann

    Now that I look at it, the only reason that you would have orphan tags is if you did not write good delete code.
    It would require some conditional checking to see if the tag is still linked to some other item though. Deleting the tag itself should delete tag/item references, and deleting items should delete tag/item references too, but wouldn't get rid of tags without that checking.

    Personally I don't see the big deal with orphan tags unless you have tons of them. But in that case, a quick query once in a while can remove them pretty easily:
    Code:
    DELETE FROM tags
    WHERE id NOT IN
    (SELECT DISTINCT tag_id FROM items_tags)

  8. #8
    SitePoint Wizard silver trophy Jelena's Avatar
    Join Date
    Feb 2005
    Location
    Universum, 3rd Corner
    Posts
    3,000
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Yes, that could be easily accomplished with cron job once in a while.

    Thanks both of you for your help. I will go with third solution for sure.

    I was just tempted with easiest one and fulltext search.
    -- Jelena --

  9. #9
    _ silver trophy ses5909's Avatar
    Join Date
    Jul 2003
    Location
    NoVa
    Posts
    5,466
    Mentioned
    1 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by Jelena
    Thanks both of you for your help. I will go with third solution for sure.
    What am I .. chopped liver
    Sara


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •