Results 1 to 19 of 19
  1. #1
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406

    Caching technique

    Hi guys, I have a problem in the CMS I'm currently writing.

    I am using a custom, self-written template technique and I'm puzzled on how to use it in combination with caching. I currently have so-called "template bits", which are either database entries or flat files that contain HTML. The flat files might also contain PHP.

    Now, what happens is that the template bits get parsed and put together into a single page. I would like to cache these pages, so that they do not have to be dynamically generated every time the page is called.

    My question for you is about the naming scheme. How could I name the cache files to make sure their names are unique and retrievable? The template parser should know when a cached version of a page is necessary, but I am unsure how to implement this check.

    Do you have any suggestions on what the naming scheme for cache files could be? I was thinking of something along the lines of everything after the ? symbol, but I am worried about duplicate names for pages with different content.

    Any suggestions are appreciated.

    - Peter

  2. #2
    SitePoint Enthusiast
    Join Date
    Jul 2004
    Location
    In front of my computer
    Posts
    96
    Here is an idea:
    Serialize all the variables that are passed to the template, take the template's name, and create a hash from all this information.

    Code:
    $filename = md5(serialize($this->attributes) . $filename) . '.php';
    That way you can guarantee:
    - A different cache file is used if you have different variables for the template
    - A different cache file is used if you have the same variables for different templates
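A minimal sketch of the two guarantees listed above, assuming a hypothetical helper named `template_cache_name()` and made-up template names and variables. Sorting the variables by key is an extra assumption, added so that the same variables assigned in a different order still map to the same file.

```php
<?php
// Hypothetical helper: derive a cache file name from a template name and its variables.
function template_cache_name(string $template, array $vars): string
{
    // Assumption: sort by key so the order in which variables were assigned does not matter.
    ksort($vars);
    return md5(serialize($vars) . $template) . '.php';
}

$a = template_cache_name('article.tpl', ['id' => 7, 'lang' => 'en']);
$b = template_cache_name('article.tpl', ['lang' => 'en', 'id' => 7]);
$c = template_cache_name('article.tpl', ['id' => 8, 'lang' => 'en']);
$d = template_cache_name('sidebar.tpl', ['id' => 7, 'lang' => 'en']);

var_dump($a === $b); // same template, same variables
var_dump($a === $c); // same template, different variables
var_dump($a === $d); // different template, same variables
```

Note that the variables still have to be collected before the name can be computed, which is the cost raised in the next reply.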
    participate to the best Php Wiki
    my blog ...

  3. #3
    eschew sesquipedalians silver trophy sweatje's Avatar
    Join Date
    Jun 2003
    Location
    Iowa, USA
    Posts
    3,749
    zimba, how would you recreate the name later? If you have to recreate all the attributes, then that is as expensive as just rendering the object in the first place, so why bother to cache?
    Jason Sweat ZCE - jsweat_php@yahoo.com
    Book: PHP Patterns
    Good Stuff: SimpleTest PHPUnit FireFox ADOdb YUI
    Detestable (adjective): software that isn't testable.

  4. #4
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    Well, that looks OK at first sight, but what sweatje says is true. There is no point in caching a page when all the variables need to be looked at anyway. What I need is for the parser to recognize immediately whether or not a page is already cached, without looking at all the variables. Otherwise, it would only be doing extra work.

    I appreciate the time you take to help me, though.

    - Peter

  5. #5
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    OK, I think I found something that will work:

    PHP Code:
    $page = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
    $cachefile = $cachedir . md5($page) . '.' . $cacheext;
    It seems that that will do
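For illustration, the lookup that this naming scheme makes possible might look like the sketch below. The function names, the demo directory and the `html` extension are assumptions, not Kilroy's actual code. Host and URI are passed in as arguments so the snippet also runs from the command line.

```php
<?php
// Hypothetical helpers; names, directory and extension are illustrative only.
function url_cache_file(string $host, string $uri, string $cachedir, string $cacheext): string
{
    $page = 'http://' . $host . $uri;
    return rtrim($cachedir, '/') . '/' . md5($page) . '.' . $cacheext;
}

// Returns the cached HTML, or null when the page has not been cached yet.
function fetch_cached_page(string $file): ?string
{
    if (!is_file($file)) {
        return null;
    }
    $html = file_get_contents($file);
    return $html === false ? null : $html;
}

$cachedir = sys_get_temp_dir() . '/cms_cache_demo';
if (!is_dir($cachedir)) {
    mkdir($cachedir, 0777, true);
}

$file = url_cache_file('example.com', '/index.php?page=about', $cachedir, 'html');
if (is_file($file)) {
    unlink($file);                           // start the demo with a cold cache
}
$miss = fetch_cached_page($file);            // null: nothing stored yet
file_put_contents($file, '<h1>About</h1>');  // what the parser would store after rendering
$hit = fetch_cached_page($file);             // now read straight from disk
```

The name is computed from the request alone, so no template variables have to be gathered before checking for a cached copy.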

  6. #6
    SitePoint Enthusiast
    Join Date
    Jul 2004
    Location
    In front of my computer
    Posts
    96
    For PHP "templates", the time it takes to calculate the hash value should be about the same as parsing the templates (not benchmarked, though).
    In other cases, where generation takes more time, it could still be useful.

    Another idea: if all the parameters are passed to the main template, which includes the "bits", you could insert all the sub-templates into the main template and cache the result as a single file instead of many. The name can be the "View" you used, or simply the main template name.
    Variables would not be cached, only the template aggregation.

    Another idea: to cache the variables, you need to know their dependencies. For example, the $username variable would depend on the user/session. Some variables, like $priceOfDollar, depend on the time, so a timeout can be set until the next update. And constants never change, by definition.
    Although I'm not sure that doing all the work for "dependencies" would speed things up, I'm sure it would slow your development down.

    Finally, what I would try is to implement two levels of caching. The first would be the template + constants aggregation. The second would use the session in some way and act as a per-user cache.
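One possible reading of the two-level idea, sketched only at the level of cache keys. The function names, key prefixes and view name are hypothetical, and a plain session ID string is assumed to stand in for "the session in some way".

```php
<?php
// Hypothetical key builders for the two cache levels described above.

// Level 1: shared by all visitors; depends only on the view (templates + constants).
function shared_cache_key(string $view): string
{
    return 'tpl_' . md5($view);
}

// Level 2: per-user output; additionally depends on the session identifier.
function user_cache_key(string $view, string $sessionId): string
{
    return 'usr_' . md5($view . '|' . $sessionId);
}

$shared = shared_cache_key('article/show');
$alice  = user_cache_key('article/show', 'session-of-alice');
$bob    = user_cache_key('article/show', 'session-of-bob');

var_dump($shared === shared_cache_key('article/show')); // one shared entry per view
var_dump($alice === $bob);                              // separate entries per user
```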

    Hope it helps that I throw ideas in the air like that
    participate to the best Php Wiki
    my blog ...

  7. #7
    Resident Java Hater
    Join Date
    Jul 2004
    Location
    Gerodieville Central, UK
    Posts
    446
    I normally make an MD5 hash based on the filename of the template (or the PK of the template in the database) once a template is compiled to PHP. I wouldn't cache the variables. I would just send the variables to the template each time and let the compiled template run; it will normally be quicker.

  8. #8
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    Thanks guys, those are some great suggestions. I will surely post my cache class when it is finished. In the meantime, keep the suggestions coming

  9. #9
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Or you could use the filename for the cache name, use stat() to determine how old the cached version is, and just set a reasonable timeout value.
    Take a look at Cache_Lite.
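A rough sketch of the timeout check, using filemtime() rather than a full stat() call. The helper name `cache_is_fresh()` and the demo file are assumptions; this is not how Cache_Lite itself is implemented.

```php
<?php
// Hypothetical helper: a cache file is fresh if it exists and is younger than $ttl seconds.
function cache_is_fresh(string $file, int $ttl, ?int $now = null): bool
{
    clearstatcache(true, $file);           // PHP caches stat results within a request
    if (!is_file($file)) {
        return false;
    }
    $now = $now ?? time();
    return ($now - filemtime($file)) < $ttl;
}

$file = sys_get_temp_dir() . '/cache_fresh_demo.html';
file_put_contents($file, '<p>cached</p>');

touch($file, time() - 30);                 // pretend it was written 30 seconds ago
$fresh = cache_is_fresh($file, 60);        // checked against a 60-second timeout
$stale = cache_is_fresh($file, 10);        // checked against a 10-second timeout
unlink($file);
```

A time-based check like this needs no knowledge of what the page depends on, at the price of serving content that may be up to `$ttl` seconds out of date.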

  10. #10
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    I did take a look at Cache_Lite and it helped me a lot. Thanks

  11. #11
    SitePoint Addict toggg's Avatar
    Join Date
    Jan 2005
    Location
    Auvergne/France
    Posts
    253
    Hi,
    For unique file name, did you try tempnam() -- Create file with unique file name from PHP ?
    or uniqid() -- Generate a unique ID ?
    +
    bertrand Gugger toggg.com linux, PHP, Auvergne/France open source

  12. #12
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    toggg, those functions are great, but how would I be able to retrieve the correct cache file later? I assume it will not generate the same unique file name!

  13. #13
    eschew sesquipedalians silver trophy sweatje's Avatar
    Join Date
    Jun 2003
    Location
    Iowa, USA
    Posts
    3,749
    Kilroy,

    I don't think there is a magic bullet here. Whenever I implement caching, I find there are a few characteristics (usually values in $_REQUEST or $_SESSION) which make the item unique. I usually hash these specific keys with md5() and use the result as the cache identifier. When I request the item again, I know the set of keys which make the cached item unique, so I create the hash and then check whether it is already cached.
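The approach described above could be sketched as follows. The helper `cache_id()` and the sample keys are hypothetical, and plain arrays stand in for $_REQUEST and $_SESSION so the snippet runs from the command line.

```php
<?php
// Hypothetical helper: hash only the values that make the cached item unique.
function cache_id(array $keys, array $source): string
{
    $parts = [];
    foreach ($keys as $key) {
        // Missing keys are recorded as null so "absent" and "empty string" stay distinct.
        $parts[$key] = $source[$key] ?? null;
    }
    return md5(serialize($parts));
}

// Stand-ins for $_REQUEST and $_SESSION.
$request = ['page' => 'news', 'id' => '42', 'utm_source' => 'mail'];
$session = ['lang' => 'nl', 'cart' => [1, 2, 3]];

$id1 = cache_id(['page', 'id', 'lang'], $request + $session);

$request['utm_source'] = 'feed';           // an irrelevant value changes
$id2 = cache_id(['page', 'id', 'lang'], $request + $session);

$request['id'] = '43';                     // a relevant value changes
$id3 = cache_id(['page', 'id', 'lang'], $request + $session);

var_dump($id1 === $id2, $id1 === $id3);
```

Only the listed keys are read, so the identifier can be rebuilt on the next request without rendering anything.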

    HTH

  14. #14
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    sweatje,

    I'm now using the following, which works great:

    PHP Code:
    // [snip]
    // Use the script path, plus the query string if there is one, as the cache key.
    if ($_SERVER['QUERY_STRING']) {
        $page_name = $_SERVER['SCRIPT_FILENAME'] . '?' . $_SERVER['QUERY_STRING'];
    } else {
        $page_name = $_SERVER['SCRIPT_FILENAME'];
    }

    try {
        $this->cache->get($page_name);
    } catch (CacheException $ex) {
        // Cache miss: store the generated page, then output it.
        // [snip]
        $this->cache->put($page_name, $this->page);
        echo $this->page;
    }
    // [snip]
    Thanks for the suggestions,

    - Peter
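Kilroy's cache class is not shown in the thread. The sketch below is one guess at what a file-backed version could look like, assuming PHP 7.4 or later. Only the names `CacheException`, `get()` and `put()` come from the post; the class name `FileCache`, the `.cache` extension and the demo values are hypothetical.

```php
<?php
// Hypothetical sketch: only CacheException, get() and put() are named in the post.
class CacheException extends Exception {}

class FileCache
{
    private string $dir;

    public function __construct(string $dir)
    {
        if (!is_dir($dir) && !mkdir($dir, 0777, true)) {
            throw new CacheException("Cannot create cache directory: $dir");
        }
        $this->dir = rtrim($dir, '/');
    }

    private function path(string $name): string
    {
        return $this->dir . '/' . md5($name) . '.cache';
    }

    // Returns the cached content or throws CacheException on a miss.
    public function get(string $name): string
    {
        $file = $this->path($name);
        if (!is_file($file)) {
            throw new CacheException("Not cached: $name");
        }
        $content = file_get_contents($file);
        if ($content === false) {
            throw new CacheException("Unreadable cache file for: $name");
        }
        return $content;
    }

    public function put(string $name, string $content): void
    {
        // LOCK_EX guards against two requests writing the same file at once.
        if (file_put_contents($this->path($name), $content, LOCK_EX) === false) {
            throw new CacheException("Cannot write cache file for: $name");
        }
    }
}

$cache = new FileCache(sys_get_temp_dir() . '/cms_cache_class_demo');
$name  = '/var/www/index.php?page=' . uniqid();   // unique, so the first get() misses

try {
    $page = $cache->get($name);
} catch (CacheException $ex) {
    $page = '<h1>Rendered page</h1>';              // stand-in for the real rendering step
    $cache->put($name, $page);
}
$again = $cache->get($name);                       // the second lookup is a hit
```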

  15. #15
    SitePoint Addict toggg's Avatar
    Join Date
    Jan 2005
    Location
    Auvergne/France
    Posts
    253
    Quote Originally Posted by Kilroy
    toggg, those functions are great, but how would I be able to retrieve the correct cache file later? I assume it will not generate the same unique file name!
    Right, Kilroy, I did not read your question fully.
    These functions are better suited to cases where you need to save data on a per-client basis. In that case, the generated ID is exchanged via cookies or hidden fields in order to retrieve the data for that client.

    Anyway, your ID construction seems logical, insofar as the pages depend solely on the QUERY_STRING. I mean, it would not be correct if the page content depended on something else, e.g. the date or some SQL result.
    You, as the designer, know what they depend on, so it's your decision.

    BTW, does your system garbage-collect stale cache files?
    +
    bertrand Gugger toggg.com linux, PHP, Auvergne/France open source

  16. #16
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    I am in the process of creating a garbage collector, yes. Thanks for the tip anyway.

    Well, the pages will be static for quite a while and they do depend on the query string. However, I will purge a cache file if the page gets updated, so visitors will immediately see the updated page.

    I have been thinking about it and it seems that this will be very hard to do with what I currently have. I mean, how would I be able to tell which pages hold the content that has been modified and therefore need their cache purged?

    Therefore, I decided that every time a part of the site is modified, I will purge the entire cache. This will not allow for dynamic scripting in the CMS, but this will come later. I will have to implement some sort of check for dynamic scripts and then make sure those pages are not cached. This, however, is something I will not look at yet.

    If you have another solution, that would be great. I've been meaning to look at other CMS's to see how they implement caching but I haven't been able to find out their techniques. If you know any, please share them.

    - Peter
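The "purge the entire cache on any modification" decision could be sketched like this. The helper `purge_cache()`, the `.cache` extension and the demo directory are assumptions made for the example.

```php
<?php
// Hypothetical helper: delete every cache file in $dir and return how many were removed.
function purge_cache(string $dir, string $ext = 'cache'): int
{
    $removed = 0;
    $files = glob(rtrim($dir, '/') . '/*.' . $ext) ?: [];
    foreach ($files as $file) {
        if (is_file($file) && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

$dir = sys_get_temp_dir() . '/cms_purge_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/a.cache', 'page a');
file_put_contents($dir . '/b.cache', 'page b');
file_put_contents($dir . '/keep.txt', 'not a cache file');

$removed   = purge_cache($dir);            // would be called from the CMS "save" action
$remaining = glob($dir . '/*');
```

Purging everything sidesteps the question of which pages contain the modified content, at the cost of regenerating every page after each edit.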

  17. #17
    SitePoint Addict toggg's Avatar
    Join Date
    Jan 2005
    Location
    Auvergne/France
    Posts
    253
    Hi Kilroy,
    Why not make your cache ID depend on the script's modification date?
    Something like:
    PHP Code:
    $page_name = $_SERVER['SCRIPT_FILENAME'];
    // Append the script's modification time so an edited script yields a new cache ID.
    $page_name .= filemtime($page_name);
    if ($_SERVER['QUERY_STRING']) {
        $page_name .= '?' . $_SERVER['QUERY_STRING'];
    }

    +
    bertrand Gugger toggg.com linux, PHP, Auvergne/France open source

  18. #18
    If it aint Dutch it aint much Kilroy's Avatar
    Join Date
    Oct 2003
    Location
    The Netherlands
    Posts
    406
    Yes, but the thing is, I wouldn't know which file in the cache to retrieve later! That is the major problem. I am currently using the filemtime() function, though (to check if a cache file is outdated), but I could not use it in the way you suggested. The reason for this is that SCRIPT_FILENAME points to my index.php, which will (if correctly used) never be modified unless the system is upgraded.

  19. #19
    SitePoint Addict toggg's Avatar
    Join Date
    Jan 2005
    Location
    Auvergne/France
    Posts
    253
    OK,
    That should be the modification date of the included subpages, but then you have a set of signatures to check, which is certainly more complex.
    Anyway, the cache system could work out, before and after the page is loaded, which files were included or required, using get_included_files(). With this list, establish the set of signatures to check. That seems complicated, but should be easy to code.
    I still think it's better to check the modification date of each include than to actually include them.

    Another way could be to act at the source of the update. I mean, have a system (like make) which would always touch (set the modification date of) index.php each time a subpage is modified. Then the check comes down to index.php's modification date.
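A sketch of the first variant, comparing the cache file's modification time with that of every file the page was built from. The helper name and the demo files are hypothetical. In a real page the list of source files could be recorded from get_included_files() at the time the page is rendered.

```php
<?php
// Hypothetical helper: the cache is current only if no source file is newer than it.
function cache_is_current(string $cacheFile, array $sourceFiles): bool
{
    clearstatcache();                      // PHP caches stat results within a request
    if (!is_file($cacheFile)) {
        return false;
    }
    $cachedAt = filemtime($cacheFile);
    foreach ($sourceFiles as $source) {
        if (!is_file($source) || filemtime($source) > $cachedAt) {
            return false;
        }
    }
    return true;
}

$dir = sys_get_temp_dir() . '/cms_deps_demo_' . uniqid();
mkdir($dir);
$header = $dir . '/header.tpl';
$body   = $dir . '/body.tpl';
$cache  = $dir . '/page.cache';

$now = time();
file_put_contents($header, '<header>');
file_put_contents($body, '<main>');
file_put_contents($cache, '<header><main>');
touch($header, $now - 100);
touch($body, $now - 100);
touch($cache, $now - 50);                  // cache written after both template bits

$before = cache_is_current($cache, [$header, $body]);
touch($body, $now);                        // a template bit is edited afterwards
$after = cache_is_current($cache, [$header, $body]);
```

The second variant (touching index.php whenever a subpage changes) would reduce `$sourceFiles` to a single entry.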
    +
    bertrand Gugger toggg.com linux, PHP, Auvergne/France open source

