SitePoint Sponsor

User Tag List

Results 1 to 6 of 6
  1. #1
    SitePoint Guru glenngould's Avatar
    Join Date
    Nov 2005
    Posts
    662
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    General Tips for Storing User Data - Normalization

    I'm in process of developing a Facebook application. It's kind of a games portal like Mindjolt plus a "non-flash Facebook game" (more such games soon).

    So I store some data in a MySQL database for each user as you would expect.

    I need some tips on how to store data so that;

    • the application works fast,
    • scales fine in the future with thousands (or much more hopefully) of users
    • and it's easy to retrieve/sort user data whenever I need a new type of data in the application (new games, new statistics about existing games etc.).


    So I have a very basic understanding about database normalization but I have some table structures that I know against this principle.

    Hmm, let me shorten this post and give an example so that you can comment on a real case.

    I have a word game in the app. that users make words from letters they have.

    I store all words made, in a table like this:

    Table: WordsMade
    Columns: id, userid, word, wordscore


    So when I need to display a higscores table, I need to select all words a user made from the table, get the sum of wordscores, and then sort users in the order of the scores they made in total.

    Now, here is what I did instead even if this is against normalization (is it?):

    I'm keep the wordsMade table as it is, so that I might be able to re-collect proper data if things go wrong in the future.
    But I add another column to allUsers table (primary table to store username etc.) which holds the total score for each user and updated whenever a user made a new word.

    This way I just sort the allUsers table ordered by the totalScore field. Because it seems to me that it's much faster (is it?) especially if I have thousands of users and many many more made words than that.

    As you can see I'm not experienced with huge-scale databases and how to handle them. I often feel "if I do this now, it'll cause performance issues when I have much more data." and go against normalization.

    Well that's enough for now I guess. Feel free to reply with specific solutions to the example I've described above or some general principles I should consider, or maybe with some resources I need to read about structuring a database.

    Thanks in advance.
    Tweep List adds an avatar menu to Twitter (open source) ---------- Word Stats shows your most used words on Twitter

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    storing a summary calculation (redundantly) does not violate normalization rules

    keeping it up to date is a good idea, because, yes, just sorting the allusers table on the total score is very efficient (especially if you have an index on that column)
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Guru glenngould's Avatar
    Join Date
    Nov 2005
    Posts
    662
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Good to hear that thanks for the reply.

    Suppose I need to add 5 other games and need to store high scores (or such) for those in the future. How would you structure that? By adding more fields (along with a separate table of course like the wordsMade table) to the allUsers table?

    It'd be helpful if you have any articles to link, in your mind about these issues, especially about handling big databases. I already know about your website so that's one of the places I'll look into first.
    Tweep List adds an avatar menu to Twitter (open source) ---------- Word Stats shows your most used words on Twitter

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,015
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by glenngould View Post
    Suppose I need to add 5 other games and need to store high scores (or such) for those in the future.
    instead of adding totals columns to the users table, i'd have a "user_game_totals" table, with a PK of user_id + game_id, and a data column of the running total for that user for that game
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Guru glenngould's Avatar
    Join Date
    Nov 2005
    Posts
    662
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks again, obviously a better solution.
    Tweep List adds an avatar menu to Twitter (open source) ---------- Word Stats shows your most used words on Twitter

  6. #6
    SitePoint Guru glenngould's Avatar
    Join Date
    Nov 2005
    Posts
    662
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I've found this nice article through r937.com:

    Denormalization Patterns

    I think, anybody else having a similar question about the case I've described above should read it.
    Tweep List adds an avatar menu to Twitter (open source) ---------- Word Stats shows your most used words on Twitter


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •