SitePoint Sponsor

User Tag List

Results 1 to 19 of 19
  1. #1
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    speed up your site with this caching technique

    ok, so here's the scrivvy. i'm developing this online calendar to help me organize my life and hopefully help other people organize theirs. it's all object oriented in its code and stays very current with the database throughout the page load. 2 days ago, if a user looked at a month view full of events there could be between 4000-5000+ queries to mysql!

    obviously, this is very bad, so i set out to reduce the number of queries to my database... i noticed that lots of these queries were the same since the same objects were just querying the database to stay current as the page loaded.

    i started to cache values inside of my objects instead of have the getter functions delegate to mysql every time. this helped reduce the number of queries to around 3000ish, but i needed even fewer!

    after reading some comments on php.net, i decided to try and cache the database results. since all of my mysql_queries already went through MyClass:mysql_query(), this was relatively straightforward...

    i had already developed a small utility HashTable class, with get, put and reset funcitons, so i put this to use too. it's basically a wrapper class for an basic php array. it stores some data at an index and lets you get the data at an index. it also allows clearIndex() and reset() functions...

    i decided to cache the result of every SELECT query to mysql, and clear my cache if any other query is used (DELETE, UPDATE, etc). i came up with this solution to cache results:

    Code:
    class DbClassExample{
      private $_mysql_link;
      private $_mysql_cache;
    
      public function __construct(){
        $this->_mysql_link = false;
        $this->_mysql_cache;
      }
    
      public function mysql_query($sql){
    	$sql = trim($sql);
    	$this->_mysql_link = mysql_connect($this->HOST(), 
                                                $this->ADMIN(),
                                                $this->PASS());
    	mysql_select_db($this->DATABASENAME(),$this->_mysql_link);
    	// check the cache
    	if($this->_mysql_cache->get($sql)){
                    // it's in the cache, so just return it,
                    // but not before resetting it!
    		$result = $this->_mysql_cache->get($sql);
    		if(mysql_num_rows($result)){
    		      mysql_data_seek($result, 0);
    		}
    	}else{
                    // the result is not in the cache,
                    // so it's business as usual
    		$result = mysql_query($sql, $this->_mysql_link);
    		if(mysql_error()){
    		      throw new DatabaseException(mysql_error());
    		}
    		if(strpos($sql, "SELECT") === 0){
    		      $this->_mysql_cache->put($sql, $result);
    		}else{
                          // our cache might be out of sync,
                          // so clear it out
    		      $this->_mysql_cache->reset();
    		}
    	}
    	return $result;
      }
    
      private function HOST(){ ... }
      private function ADMIN(){ ... }
      private function PASS(){ ... }
      private function DATABASENAME(){ ... }
    
    }
    now, instead of 3000+ queries, i'm getting ~600 max, which is great!

    i ran benchmarks with and without the cache in place. i timed how long my page took to build using the getrusage() function. i ran 20 page loads each way and the threw out the top and bottom values. with the cache a busy month view loads in .48s and without the cache it loads in .72s! the cache is helping me load my month view 66% faster!

    i hope this helps others out there who are trying to tune their sites and applications.

    i'm still gonna work to cache values inside my other objects so i don't have to query as often. 600 still seems like alot, even for an app that does alot. but it's alot better than 3000!

    any comments are most welcome. i hope i've helped somebody else too!
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.

  2. #2
    SitePoint Evangelist ghurtado's Avatar
    Join Date
    Sep 2003
    Location
    Wixom, Michigan
    Posts
    591
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    What I *really* would like to know is how you managed to require 3000 queries to mysql in one single page view (or 600 for that matter)
    Garcia

  3. #3
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i wouldn't mind knowing that myself, heh.

    for a start, the page has about 7-10 calendars, each loads about 10 events, and each event loads it's 7 fields. that's 700 (10*10*7) right there, without caching
    plus i check user permissions for calendars to make sure no body's trying to see too much.

    to make matters worse, lots of the objects i use don't cache their results. so if, for instance, i didn't cache the start time of an event, and ask for it 3 times for each event during the load, then that means i go to sql 2100 (700 * 3) times... yikes!

    i've been going through my code the past week trying to cache all i can, without compromising data intregrity and always staying exactly current to what's in sql.

    gotta love refactoring
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.

  4. #4
    SitePoint Addict pointbeing's Avatar
    Join Date
    Jun 2004
    Location
    London, UK
    Posts
    227
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    May I take the liberty of recommending that you grab a copy of PofEAA and read up on the Object-Relational mapping patterns? There's a lot to take in but I think you'll benefit from it

  5. #5
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    thanks, i might look into that

    i think the big problem is that most of the code i'm workin with is 4 years old (when i was just starting to learn OO) and cared more about data correctness than efficiency. since my pages still loaded in < 1s i didn't notice how inefficient i really am.

    but here i go making excuses.
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.

  6. #6
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Looking at your script example you posted it kind of reminds me of a pre- deisign pattern era- version on the Registry?

    Look at Harry's site, www.phppatterns.com for more on the pattern. As to how you've manage to generate 3000 plus queries on the one page I don't know

    Maybe your just mad?

  7. #7
    SitePoint Evangelist ghurtado's Avatar
    Join Date
    Sep 2003
    Location
    Wixom, Michigan
    Posts
    591
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Sounds like the DB schema could use some refactoring if you need to generate a separate query for every field of every event of every calendar. You could probably decrease the number of queries dramatically with a few JOINs.
    Garcia

  8. #8
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    yeah, i'm a fan of www.phppatterns.com. in fact, my entire gui is built with the Visitor pattern (it's quite a bit cleaner than my database )

    (btw, i don't query for every field of every event of every calendar. i'm pretty sure i used to, but that's been fixed it was just an example of how i can get so many queries)

    i agree that my entire DB layer needs to be refactored/written. when i started this project 4 years ago i had no notion of making a DB layer. now i'm trying to figure out the best way to retrofit a DB layer in...

    any ideas on how to abstract out the DB altogether? my long term goal is to support multiple database types (right now i'm mysql only). i've briefly looked at the PEAR DB. i'm not sure what other options i have. any other ideas?

    also, what's the difference between PEAR DB, DBA, and DBA_Relational?
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.

  9. #9
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'm in the process of moving from my own abstraction to Creole

    any ideas on how to abstract out the DB altogether?
    For PHP, try looking at Propel, of which you need Creole as well - the reason I'm on the move you see

    For both, look at http://www.phpdb.org/ for more information.

  10. #10
    Non-Member coo_t2's Avatar
    Join Date
    Feb 2003
    Location
    Dog Street
    Posts
    1,819
    Mentioned
    1 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by awulf

    any ideas on how to abstract out the DB altogether? my long term goal is to support multiple database types (right now i'm mysql only). i've briefly looked at the PEAR DB. i'm not sure what other options i have. any other ideas?
    A simple way that I've been doing it lately is to use object that I call DAOs(not sure if they really are in the strictest sense but I think they are) to access the DB.
    For each type of data source I would just append a _datasourceType at the end of the class name.

    Here's an example:

    PHP Code:
      class TrackerDao_Mysql {
      
          function 
    TrackerDao_Mysql($cookieName$dbConn)
          {}
      
          function 
    add()
          {}
      
          function 
    getHistory($limit=5)
          {}
      
          function 
    setConverted()
          {}
      } 
    I've never needed anything other the mysql class, but if I ever did I would just write another class for another database type or xml file or whatever and use the same interface, I'd just have to change the line(s) where the object is instantiated, and even that could probably be abstracted if you wanted to.
    In PHP5 these classes would probably implement a common interface.

    I'm sure the folks in this forum have better ways that are better for scalability but this works for me.

    --ed

  11. #11
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Dr Livingston
    For PHP, try looking at Propel, of which you need Creole as well - the reason I'm on the move you see

    For both, look at http://www.phpdb.org/ for more information.
    i'm looking over the documentation for propel, and this thing looks pretty sweet.

    i might have one problem with a propel type system.
    each time i add a calendar, i have to add a table to the database. this is b/c calendars can support custom fields, and each calendar table might have completely different columns than another... so i can't lump all calendar data together in one table.

    but this also means that the number of tables in my database is never constant. it looks like propel requires xml of the database schema and then generates classes one time... but this is impossible for me since i never know exactly what the database layout looks like...
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.

  12. #12
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    maybe i'll need to write my own. somewhere inbetween coo_t2's suggestion and propel...
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.

  13. #13
    simple tester McGruff's Avatar
    Join Date
    Sep 2003
    Location
    Glasgow
    Posts
    1,690
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by pointbeing
    May I take the liberty of recommending that you grab a copy of PofEAA and read up on the Object-Relational mapping patterns? There's a lot to take in but I think you'll benefit from it
    Absolutely essential reading. Years of experience carefully explained. I was almost drooling when I bought my own copy.

  14. #14
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Seems like a cool idea, and it could even work on top of other caching systems.

  15. #15
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    a quick update... my 600+ queries is down to 276 work work work.
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.

  16. #16
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    But has the speed improved?

  17. #17
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    well, i remembered that on my last calculations i had divided by 20 even though i had thrown out the top two numbers ! i should have divided by 18... so i fixed the old numbers by multiplying the old stat by 20/18 to adjust it.

    so here's my adjusted stats:
    old with 3000+: .8s / page
    old with ~600: .53s / page
    new with ~250: .40s / page

    looks like it's getting better
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.

  18. #18
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes, it's getting a lot better. Are you using JOINs now?

  19. #19
    SitePoint Enthusiast
    Join Date
    Dec 2004
    Location
    texas
    Posts
    88
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    actually no, i haven't changed any of my actual SQL, i've only changed how i cache the objects that call the sql. i used to load an object, let it call SQL and initialize, then get some info from it and throw it away. i'd reload it just about everytime i needed the info... now i cache just about all my objects and only reload if i absolutely need to, which is very rare.
    free online calendar: http://inversiondesigns.com
    includes multiple calendars/users/groups, task manager,
    email and sms reminders, permissions (sharing), etc.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •