SitePoint Sponsor

User Tag List

Results 1 to 7 of 7
  1. #1
    SitePoint Zealot
    Join Date
    Feb 2006
    Posts
    168
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Will a huge DB table bog down my site?

    I am going to be making a feature to show people on their return visits to my website things that would be of interest to them judged by their previous site interests.

    To do this I was planing on giving them a unique id that would be stored in a cookie and then record the 10-15 most important recent website activities in a tracking table. Then I would use a cron_job to clean up old tracking info.

    But still you figure 50k visitors a day and say half of them have cookies turned on thats still like 250 thousand rows a day with them expiring after 3-4 weeks, it ads up to a big table.

    So two questions. Is MySQL fast enough that it can retrieve from a table that size without a sweat or will it feel it?

    And would it be better to list all of the tracking information in one row by separating the info with a comma or something and then parsing it into an array once retrieved? That would mean only the 25k rows a day.

    Thanks for any insight,

    Chris

  2. #2
    SitePoint Addict agentforte's Avatar
    Join Date
    May 2007
    Location
    Toronto, ON, Canada
    Posts
    213
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    effiecient design...

    Hello,

    I am not sure about server performance, but I am thinking that you might be able to get away with far less rows.
    Also, I am not sure if you want the information to always be recent.

    some suggestions:

    1) let the users choose to turn this feature on (so people that do not care for it do not use extra resources)

    2) Have a separate table with one field for each activity plus one field to have the user ID, and one to have the date.

    3) Add one every time the user uses each activity. That user can then see how many times each activity was performed.

    4) have a button that lets the user clear the activity history, so they can start fresh ( make sure to save the date it was cleared so they know how long it has been recording since the last time they cleared it)

    this prevents a cron job, and stores far less information

    I hope this helps.

    -Frank

  3. #3
    SitePoint Enthusiast jcarouth's Avatar
    Join Date
    Sep 2006
    Location
    College Station, TX
    Posts
    48
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    MySQL is more than capable of handling data in the Millions of rows. i have several applications in production at this capacity level and there are no problems. the main thing to ensure is that the queries that retrieve the information from your tables are efficient and that they are using indexes when querying these tables.

    the approach of adding the related items in comma form is a bad idea. it is very denormalized and as you noted adds a new overhead of having to dissect the information after you retrieve it.

  4. #4
    SitePoint Zealot
    Join Date
    Feb 2006
    Posts
    168
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks a lot. That answers my question.

  5. #5
    SitePoint Zealot
    Join Date
    Feb 2006
    Posts
    168
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Actually one more thing. I will only be querying the DB from time to time mainly on the homepage. However it will be getting inserts on nearly every page.

    Would you still recommend indexing?

  6. #6
    AdSpeed.com Son Nguyen's Avatar
    Join Date
    Aug 2000
    Location
    Silicon Valley
    Posts
    2,241
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Create separate summary tables (mostly read, with indices) from the log tables (mostly write, might not need too many indices). And yes, MySQL is quite capable of handling significant loads
    - Son Nguyen
    AdSpeed.com - Ad Serving and Ad Management Made Easy

  7. #7
    SitePoint Zealot
    Join Date
    Feb 2006
    Posts
    168
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    So then would you say to create for instance one table with all of the tracking information--the page, the importance of the page, when they visited, user_id, etc--with no index's but setting the id as a primary key?

    And then after adding that information to it, add the primary key `id` from the first table, and the `user_id` to a separate table that indexes the `user_id`?

    I am trying to make everything on the site as optimized as possible and this table will be of much larger scale then I am used to. So I really appreciate the advice from you guys.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •