SitePoint Sponsor

User Tag List

Results 1 to 12 of 12
  1. #1
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Peterborough, Ontario, Canada
    Posts
    316
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    One Large Table vs. lots of smaller tables

    Hi,

    I've got a project where users will receive messages from each other. These messages have to be kept (forever).

    Each user will probably receive 100 messages per month (some will receive 1000 or more).

    There are 1000 users.

    Should I create one massive table (with about 1.2 million rows / year) or should I create 1000 tables (1 for each user)?

    Or should I create 1 table for each year?

    The message only consists of 5 columns.


    Also, I'd like to use cacheing but there will be "blasts" to each user. So that's one reason why I'm thinking a single DB doesn't work (because each time a row is inserted then I lose the cache). Some members won't receive messages for days or weeks, and then they'll get a ton all at once.

    Also I'm using InnoDB engine.

    I'd appreciate suggestions.
    Have I helped you? You could help me.
    Like my business on facebook!
    Text message polls with real-time, live results.

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    use one big table

    properly indexed, there is no performance issue

    what kind of caching did you have in mind? the whole table?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Peterborough, Ontario, Canada
    Posts
    316
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I don't want to cache the whole table (it wouldn't work anyway since there will be added rows all the time). I was thinking that smaller tables could be cached... I'm just starting the thinking on this table setup - so I'm not sure if I'll cache this stuff. A lot of the other tables will get cached as they won't have nearly as much movement.

    I'll probably just use a summary table for this one though and have update it every hour or so... It shouldn't need to be more specific than that.

    I haven't used DBs with millions of rows before - so I wasn't sure what the overhead would be. I didn't want to have a nightmare after the first year or something.
    Have I helped you? You could help me.
    Like my business on facebook!
    Text message polls with real-time, live results.

  4. #4
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Peterborough, Ontario, Canada
    Posts
    316
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I just discovered Triggers!

    I might just use those for keeping the summary tables up to date. I was thinking that I would have to count() that huge table -- and that would take a bit of work (I can't use myisam)... But I could just run it overnights to make sure everything is up to snuff and use triggers to keep real-time stats...

    Edit: Good thing I've moved to MySQL 5 That's probably why I hadn't noticed triggers before... I had briefly heard about them last summer, but I didn't have time to think about it... bad move...
    Have I helped you? You could help me.
    Like my business on facebook!
    Text message polls with real-time, live results.

  5. #5
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    triggers for summary tables is a tried and true method

    don't be too hasty to look for performance solutions like that, though, until you actually need them

    remember what they said about "premature optimization" ...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #6
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Peterborough, Ontario, Canada
    Posts
    316
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Well I need to keep a count() on a table with million(s) of rows. As an innoDB it's gonna have to count each row each time it gets counted. I can set it up to do it with every page refresh - but that's going to be costly and slow! Or I could have a cronjob update a summary table every few minutes. Or I can have triggers update the summary table (when a row gets added it'll just increment the summary table appropriately)... Then I can just have a cronjob recalculate the summary table every couple of hours. I think that this is probably the best way to keep the stats real-time and accurate. Each member can have access to their realtime stats without the overhead of recounting on page loads. Because rows are inserted into the table whether the user's stats is updated or not, caching won't really work because there will be inserts every couple of seconds.
    I'm sure there's a solution with memcache or something, but I don't know anything about those.

    Unless you have a smarter way to keep a table count in this situation?
    Have I helped you? You could help me.
    Like my business on facebook!
    Text message polls with real-time, live results.

  7. #7
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by jeffvdovjak View Post
    Unless you have a smarter way to keep a table count in this situation?
    i'm not convinced that you ~need~ to keep the count at all

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #8
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Peterborough, Ontario, Canada
    Posts
    316
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Realtime Analytics...
    For ease of discussion I said that this DB was for keeping track of messages between users... that's not entirely correct...
    It's for keeping track of advertising campaigns through text messaging. clients get billed on a per message service, so it's important that they know have access to realtime stats... some campaigns are time sensative (e.g. an on-stage event incorporating audience interaction via cell phones)
    Have I helped you? You could help me.
    Like my business on facebook!
    Text message polls with real-time, live results.

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    billing for advertising sounds to me like a great case for "last night's data should be good enough"
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Peterborough, Ontario, Canada
    Posts
    316
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Most the time that would be totally good. Except for time sensative stuff, like polls. They need realtime information for those (you can't wait 20 minutes at talent show, ect, for results). Though perhaps I can have one table with time sensative information, and then leave the rest of the data in the other table.

    Wouldn't I still need summary tables? I'd just be updating them less frequently (and not using triggers). Or would I chache it? (the frontend is built on top of codeigniter which has some caching abilities... I think I can just set a cache with an expiry or something to that effect). I think it just stores the summary in a text file and retrieves those results rather than bothering the database...


    But a trigger would add minimal overhead and allow for realtime. The user wouldn't see lag at all

    (as the user sends a text message, then it gets sent to our modem, and then it gets pushed to our api... even if it added a half second-maybe even a few seconds-no body would notice as the lags between sending and receiving texts between phones are sometimes already 10-15 seconds (or longer depending on carrier,ect) and always at least a full second).

    Thoughts? Should I avoid the trigger, and should I use a different method than summary tables?
    Have I helped you? You could help me.
    Like my business on facebook!
    Text message polls with real-time, live results.

  11. #11
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    front end? cache? modem? api? one table and the other table? summary tables? triggers? codeigniter? text file?

    looks like i don't know nearly enough about your requirements to offer any more advice than i already have, sorry
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #12
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Peterborough, Ontario, Canada
    Posts
    316
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    that's okay. thanks for your help though! I've done some reading around some of the things you've said... so hopefully before we plug away on parts of this project I can make some decisions! we aren't working on this portion of the database yet
    Have I helped you? You could help me.
    Like my business on facebook!
    Text message polls with real-time, live results.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •