SitePoint Sponsor

User Tag List

Results 1 to 8 of 8
  1. #1
    SitePoint Addict
    Join Date
    Nov 2009
    Posts
    283
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Serializing or separate tables?

    Hi,

    I am building a simple browser-based text RPG (Role Playing Game) and my first step is to design the database. I have checked some samples and articles about the subject and I wanted to ask your opinions as well.

    Basically, I will have the following tables:

    - users
    - characters
    - items
    - skills
    etc.

    Each character will have some skills and items in the game. I can think of two options for designing the database as below. Can you please tell me which one would you prefer and why?

    1. Create separate tables such as character_items, character_skills?

    Example:

    characters table:
    character_id
    character_name

    items table:
    item_id
    item_name

    character_items table:
    character_id
    item_id

    2. Create a single column for items/skills in the character table and store the info by serializing it.

    characters table:
    character_id
    character_name
    character_items
    character_skills

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    option 1 is by far the better choice

    option 2 violates first normal form and therefore has all the issues involved with that

    for example, if you're searching for something specific, you have to deserialize every string, which means that the search will require a table scan, which means your app won't scale -- the query will get slower and slower the more rows there are
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Addict
    Join Date
    Nov 2009
    Posts
    283
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    option 1 is by far the better choice

    option 2 violates first normal form and therefore has all the issues involved with that

    for example, if you're searching for something specific, you have to deserialize every string, which means that the search will require a table scan, which means your app won't scale -- the query will get slower and slower the more rows there are
    Rudy, thanks for your opinion. After asking the question, I made some more research about normalization and reading your reply, first option makes more sense to me now. Although I am starting very simple, there is no reason why it wouldn't get big in the future if I can create something nice. That's why I am trying to do it right as much as possible from the beginning.

    I also read about denormalization and that it might be considered for really big databases. I guess I shouldn't worry about it at this stage. Besides, there is the database sharding topic. I am not sure if a browser-based text RPG will require this but as a programmer you have to consider all possible scenarios right?

    Theoretically speaking, let's imagine my game reached 1,000,000 players, each player has 2 characters (average) and each character has 20 items (average) in their inventory. Following the first option, character_items table will have 1,000,000 x 2 x 20 = 40,000,000 entries. To read/write data from the database, I guess I will need a lot of joins because there will be many related tables. In such a case, would you still think that the first option is the way to go? Thanks again.

    By the way, don't all searches require a table scan? Or do you mean looking for a key (such as item_id) would be much faster than looking up the content of a cell which contains data such as "item1, item2, item3"

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Nail Yener View Post
    To read/write data from the database, I guess I will need a lot of joins because there will be many related tables. In such a case, would you still think that the first option is the way to go? Thanks again.
    yes
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Addict
    Join Date
    Nov 2009
    Posts
    283
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    yes
    Thank you again Rudy, I added one last paragraph to my last reply if you missed it.

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,017
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by Nail Yener View Post
    By the way, don't all searches require a table scan?
    no, not if the searched column is indexed
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Wizard guelphdad's Avatar
    Join Date
    Oct 2003
    Location
    St. Catharines, ON Canada
    Posts
    1,706
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Nail Yener View Post
    Or do you mean looking for a key (such as item_id) would be much faster than looking up the content of a cell which contains data such as "item1, item2, item3"
    The problem isn't really storing of the data in such a form so much as actually having to work with the data. that is why normalization is the way to go.

    For argument sake let's say you have 10 users and each of them have a minimum of 5 but as many as 20 items as you say.
    with one row per item per user you would have as many as 200 rows rather than 10. BUT it is far simpler with the normalized table when you need to find out which users have a sword, or how many users are carrying a sword and a rope. Even more important when you have a list of say 5 items that a user must have to fight the "boss" and find out if each of the 10 fighters in the party have those five items.

    It becomes more and more complex to do such searches when you have serialized data. And as you say, you might have a million players, so even more so.

  8. #8
    SitePoint Addict
    Join Date
    Nov 2009
    Posts
    283
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by guelphdad View Post
    The problem isn't really storing of the data in such a form so much as actually having to work with the data. that is why normalization is the way to go.

    For argument sake let's say you have 10 users and each of them have a minimum of 5 but as many as 20 items as you say.
    with one row per item per user you would have as many as 200 rows rather than 10. BUT it is far simpler with the normalized table when you need to find out which users have a sword, or how many users are carrying a sword and a rope. Even more important when you have a list of say 5 items that a user must have to fight the "boss" and find out if each of the 10 fighters in the party have those five items.

    It becomes more and more complex to do such searches when you have serialized data. And as you say, you might have a million players, so even more so.
    Your example gives me a better understanding, thank you. I am pretty sure I will have such or even more complex scenarios as I improve the game.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •