I'm in process of developing a Facebook application. It's kind of a games portal like Mindjolt plus a "non-flash Facebook game" (more such games soon).

So I store some data in a MySQL database for each user as you would expect.

I need some tips on how to store data so that;

  • the application works fast,
  • scales fine in the future with thousands (or much more hopefully) of users
  • and it's easy to retrieve/sort user data whenever I need a new type of data in the application (new games, new statistics about existing games etc.).


So I have a very basic understanding about database normalization but I have some table structures that I know against this principle.

Hmm, let me shorten this post and give an example so that you can comment on a real case.

I have a word game in the app. that users make words from letters they have.

I store all words made, in a table like this:

Table: WordsMade
Columns: id, userid, word, wordscore


So when I need to display a higscores table, I need to select all words a user made from the table, get the sum of wordscores, and then sort users in the order of the scores they made in total.

Now, here is what I did instead even if this is against normalization (is it?):

I'm keep the wordsMade table as it is, so that I might be able to re-collect proper data if things go wrong in the future.
But I add another column to allUsers table (primary table to store username etc.) which holds the total score for each user and updated whenever a user made a new word.

This way I just sort the allUsers table ordered by the totalScore field. Because it seems to me that it's much faster (is it?) especially if I have thousands of users and many many more made words than that.

As you can see I'm not experienced with huge-scale databases and how to handle them. I often feel "if I do this now, it'll cause performance issues when I have much more data." and go against normalization.

Well that's enough for now I guess. Feel free to reply with specific solutions to the example I've described above or some general principles I should consider, or maybe with some resources I need to read about structuring a database.

Thanks in advance.