  1. #1
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Optimization & Caching Solutions, Anyone?

    We've been thinking on and off about some sort of optimisation/caching solution for our flagship production site.

    It's not that the site receives a massive number of page impressions; it's just that many pages feature hefty database queries (we're on MySQL 3.23.xx, so no subqueries here), and customers on 56k+ connections using old machines (pre-2001 Macs, for example) often moan about the speed of the service. (Oh good - at least the data quality is OK then!)

    I know about adding indexes to tables with many rows (our biggest table has something like 15-20,000), although I've read that while this can speed up queries, it also adds to overall server memory/CPU load. (That said, we're due to transfer the site and its LAMP setup to a very fast machine indeed... *rubs hands*)

    I'd welcome your opinions, experiences and thoughts on caching mechanisms and code/DB/server optimisation solutions (of which Zend Optimizer is, I think, one of the latter?).

    Some of this is a little OT for sure, but I reckon it's best not to post the same topic in multiple forums.

    Thank you all in advance for your thoughts.

  2. #2
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Hi...

    Indexes use slightly more space on disk and make updates slower, because the indexes have to be updated along with the data. In general, though, indexes are a good thing, and their absence could be the root of your problem (20,000 rows is small fry). CPU load will be much reduced if the DB doesn't have to scan the data linearly.

    yours, Marcus

    p.s. I see you are in London. Have you visited the PHP London group?
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  3. #3
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by lastcraft
    p.s. I see you are in London. Have you visited the PHP London group?
    Heh not yet - I've been thinking about it though. (really)

    Cheers for your post; depending on other replies, I'll investigate indexes.

  4. #4
    SitePoint Member
    Join Date
    Dec 2004
    Location
    USA
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This is my first post to the forum, but my suggestion is very simple. I am the programmer behind a commercial CMS, chaseMe CMS, which is one of the fastest-loading thanks to its caching system. Here is all you need to do. If you have access to .htaccess, do something like this.

    In your .htaccess, add:
    php_value auto_prepend_file /home/yourhttpd/cachestart.php
    php_value auto_append_file /home/yourhttpd/cachestop.php

    This allows gzip compression and caching at the same time. All you have to do in cachestart.php is check whether the cached file already exists: if it does, load it and exit; if not, let the page generate. The second file writes the output to the cache file if needed. It's pretty simple using ob_start(). Email me if you want the code for all of the files.
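
    A minimal sketch of what those two files might look like, assuming a cache directory of /home/yourhttpd/cache (this is not the actual chaseMe CMS code, just the general idea):

    Code:
    <?php
    // cachestart.php -- hypothetical sketch, not the actual chaseMe CMS code.
    // Serve a cached copy of the requested URL if one exists; otherwise start buffering.
    $cache_file = '/home/yourhttpd/cache/' . md5($_SERVER['REQUEST_URI']) . '.html';
    if (file_exists($cache_file)) {
        readfile($cache_file);    // send the stored page
        exit;                     // exit() also skips the auto_append file
    }
    ob_start('ob_gzhandler');     // buffer (and gzip) the page as it is generated
    ?>
    Code:
    <?php
    // cachestop.php -- hypothetical sketch.
    // Write the buffered output to the cache file for the next request.
    $cache_file = '/home/yourhttpd/cache/' . md5($_SERVER['REQUEST_URI']) . '.html';
    $fp = fopen($cache_file, 'w');
    fwrite($fp, ob_get_contents());  // ob_get_contents() returns the raw (uncompressed) buffer
    fclose($fp);
    ob_end_flush();                  // send the page to the current visitor
    ?>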

    -Chase-
    Sr Programmer - Access News Group
    Program Staff Manager - DenverAccess.com
    Email: chasecmiller@gmail.com

  5. #5
    SitePoint Wizard
    Join Date
    Aug 2004
    Location
    California
    Posts
    1,672
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If you have not indexed the columns in your data tables, then doing so will give you an immediate improvement. In general, any column used in a WHERE clause should probably have an index; otherwise the database must scan every row for every query. Not good.
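
    For example (the table and column names here are only illustrative):

    Code:
    -- Hypothetical example: index the column used in the WHERE clause.
    ALTER TABLE orders ADD INDEX idx_customer (customer_id);

    -- This query can now use the index instead of scanning every row.
    SELECT * FROM orders WHERE customer_id = 42;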

  6. #6
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Code:
    php_value auto_prepend_file /home/yourhttpd/cachestart.php
    php_value auto_append_file /home/yourhttpd/cachestop.php
    I've never been a fan of this approach, as it removes control from the developer in a couple of ways:

    1) you may not have access to the .htaccess file to change it
    2) the caching depends on server configuration rather than on the script itself

    I agree a page should be cached, and compressing the output is worth doing too, as it lightens the load on the server when a page gets a lot of hits.

    But 'having' to resort to an .htaccess file to achieve this?
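
    If you would rather keep that control in the scripts themselves, a rough sketch of the same idea without .htaccess (the file paths are hypothetical) is simply:

    Code:
    <?php
    // Hypothetical sketch: same caching/compression idea, but the page itself
    // decides to use it, rather than an .htaccess auto_prepend/auto_append rule.
    require '/home/yourhttpd/cachestart.php';  // serve cached copy or start buffering

    // ... normal page code goes here ...

    require '/home/yourhttpd/cachestop.php';   // write the buffer out to the cache file
    ?>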

  7. #7
    Resident Java Hater
    Join Date
    Jul 2004
    Location
    Gerodieville Central, UK
    Posts
    446
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    As everyone says, indexes will make a huge difference. As arborint says, you should normally index anything in the WHERE clause, plus any other fields used as criteria (fields appearing in GROUP BY, ORDER BY, etc.).

    What you should do is find the main 'typical' SELECT queries that run on more or less every page request and put EXPLAIN in front of them. This returns a result set describing MySQL's query plan, one row for each table the query touches and the criteria used against it. You should index most or all of the fields listed on each row. When you do, you will see keys appear under the 'possible_keys' column when you re-run the EXPLAIN, which means MySQL is using the indexes. You should also see the 'rows' column drop a lot (before, it will probably be equal to the number of rows in the table you are querying, whereas with indexes it will be reduced to roughly the number of rows matching your criteria).
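
    For example (table and column names are hypothetical):

    Code:
    -- Check the plan for one of the typical queries.
    EXPLAIN SELECT * FROM articles WHERE category_id = 7 ORDER BY published_at;

    -- Without an index, possible_keys is NULL and rows is close to the table size.
    -- After adding one, possible_keys lists it and rows drops to roughly the
    -- number of matching records when the EXPLAIN is re-run.
    ALTER TABLE articles ADD INDEX idx_category (category_id);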

    There's a load more I could waffle on about regarding how to use the results from EXPLAIN to optimise your DB.

    20,000 rows is small on DB scales (I'm dealing with a site with 1.8 million records in one table, and that is still only medium-sized as far as DBs go). Even so, you will find the average row lookup drops from about 20,000 comparisons to about 15 (since 2^14 < 20,000 < 2^15).

    You may also want to use something like Xdebug to profile your code and find out where the slow points are in the PHP itself. If you can move some logic from PHP into SQL (e.g. using ORDER BY instead of sorting in PHP), do so.

    The other thing is to use Turck MMCache or something similar. This will speed your site up a lot if the same scripts are run on every request, and even more if you have a lot of small include files. This is especially true if you use a template system that generates PHP code (e.g. WACT, Smarty, etc.).

    It's hard to pinpoint exactly where you should optimise, so I can't give any more specific info, but it seems your DB is certainly where you'll get the most benefit.

  8. #8
    SitePoint Addict
    Join Date
    Apr 2002
    Posts
    330
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by phptek
    I'd welcome your opinions, experiences and thoughts, with regard to caching mechanisms and code/db/server optimisaton solutions (of which The Zend Optimizer I think is one of the latter??)
    If your long queries are used to present content that does not necessarily need to be updated in real time, you can use a content cache solution that stores the generated content (HTML pages or something else) for a reasonable period, so you avoid repeating those long queries during that period.

    Personally, I use this cache class to store in cache files the HTML content of the pages that result from queries that take a long time to compute.

    The class lets me set a lifetime in seconds or an expiry date and time; after that, when the cached content is accessed it is considered invalid, forcing regeneration of the content and consequently re-running the queries needed to generate it.

    This class has proven to be very robust as I use it to cache most of the pages of the PHP Classes site that are frequently accessed and take several database queries to generate.

    The top charts page is the one that takes the most time to generate (about 30 minutes, once a day), as it computes all sorts of charts that require traversing millions of database rows.

    The cache class uses file locking to prevent the corruption that simultaneous accesses to a cache file could otherwise cause while an expired cache entry is being regenerated.
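
    Not the actual PHP Classes cache class, but a minimal sketch of the same idea (a file cache with a lifetime and flock() locking; all names here are made up):

    Code:
    <?php
    // Minimal sketch of a file cache with a lifetime and locking -- not the real
    // cache class from PHP Classes, just the general technique it describes.
    function cached_page($cache_file, $lifetime, $generate)
    {
        if (file_exists($cache_file) && filemtime($cache_file) > time() - $lifetime) {
            readfile($cache_file);        // still fresh: serve the stored copy
            return;
        }
        $fp = fopen($cache_file, 'a+');   // create the file if it doesn't exist yet
        flock($fp, LOCK_EX);              // only one process regenerates at a time
        ob_start();
        $generate();                      // run the expensive queries, build the HTML
        $html = ob_get_clean();
        ftruncate($fp, 0);                // replace the stale copy
        fwrite($fp, $html);
        flock($fp, LOCK_UN);
        fclose($fp);
        echo $html;
    }
    ?>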
    Manuel Lemos

    Metastorage - Data object relational mapping layer generator
    PHP Classes - Free ready to use OOP components in PHP

  9. #9
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thank you all for your pointers - it seems obvious that adding indexes to our largest tables (at the very least) will yield some sort of performance increase.

    I will talk with our sysadmin and point him to this thread.

    Thanks once again

  10. #10
    SitePoint Zealot
    Join Date
    Jun 2004
    Location
    Bogota
    Posts
    101
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Run a benchmark, though, or else you won't know for sure. Every time we hit performance issues, our benchmarks and profiling end up mercilessly tearing apart our hypotheses.



    HTH,
    Andres
    If I have wings, why am I walking?

  11. #11
    SitePoint Enthusiast
    Join Date
    Jun 2004
    Location
    New Jersey
    Posts
    64
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Turck MMCache


  12. #12
    SitePoint Addict Viral's Avatar
    Join Date
    Nov 2001
    Location
    Washington DC
    Posts
    294
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    One of the best improvements you can make is to upgrade to MySQL 4 and turn on query caching. That one step alone will GREATLY improve the performance of your website, especially if you have redundant queries that return the same information for every visitor (building your navigation, for example). We saw a HUGE drop in database load and an incredible speed increase in web page delivery.

    Read more about it at http://dev.mysql.com/doc/mysql/en/Query_Cache.html
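
    If the server's my.cnf is under your control, enabling it is just a couple of settings (the size here is only an example):

    Code:
    # my.cnf -- example values only
    [mysqld]
    query_cache_type = 1        # cache query results unless SQL_NO_CACHE is used
    query_cache_size = 32M      # memory set aside for cached result sets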

    --Viral
    A computer without Windows is like a chocolate cake without mustard.

  13. #13
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    One word for you: "benchmarking" - I did just this. I set a microtime() value (in PHP) at the start of the script and again at the end, and the difference between the two was the "measure" for the benchmark I performed.

    One particular script with a DB-intensive query was taking 12.43s to render without indexes. I then added indexes to the columns referenced in the WHERE clauses (FKs and display on/off "switches") and was absolutely amazed at the speed improvement: down from 12.43s to 0.33s!! (I took the mean of 5 page refreshes over 10 minutes on the same static data set.)
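
    For anyone who wants to repeat the measurement, a rough sketch of that kind of timing (nothing more sophisticated than this):

    Code:
    <?php
    // Rough sketch of the microtime() benchmark described above.
    $start = microtime(true);   // microtime(true) needs PHP 5; on PHP 4, parse the string form

    // ... run the page / the DB-intensive query here ...

    $elapsed = microtime(true) - $start;
    echo 'Rendered in ' . number_format($elapsed, 2) . 's';
    ?>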

    Thanks again to you all for your suggestions.

