  1. #1
    SitePoint Enthusiast
    Join Date
    Sep 2005
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Tracking variable changes

    I'm trying to track all changes made to a PHP variable. The variable can be an object or array.

    For example it looks something like:

    Code:
    $object = array('a', 'b');
    This object is then persisted to storage using an object-cache, so that it is available when the PHP script runs again.

    So when the script runs the second time, or another script runs and modifies that object, I want those modifications to be tracked, either as they are being done, or in one go after the script executes.

    eg:

    Code:
    $object[] = 'c';

    I would like to know that 'c' was added to the object.

    Now the actual code looks something like this:

    Code:
    $storage = new Storage();
    $storage->object = array('a', 'b');

    second load:

    Code:
    $storage = new Storage();
    
    var_dump($storage->object); // array('a', 'b')
    
    $storage->object[] = 'c';
    What I want to know is that 'c' was pushed into $storage->object so in the class "Storage" I can set that value to persistent storage.

    I have tried a few methods that work but have downsides.

    1) Wrap all objects in a class "Storable" which tracks changes to the object

    The class "Storable" just saves the actual data object as a property, and then provides __get() and __set() methods to access it. When a member/property of the object is modified or added, the "Storable" class notes this.
    When a property is accessed, __get() on the Storable class returns the property wrapped in another Storable instance, so that changes to it are tracked as well, recursively for each new level.

    The problem is that the objects are no longer native data types, and thus you cannot run array functions on arrays.

    eg:

    Code:
    $storage = new Storage();
    
    var_dump($storage->object); // array('a', 'b')
    
    array_push($storage->object, 'c'); // fails
    So instead we'd have to implement these array functions as methods of Storable.

    eg:

    Code:
    $storage = new Storage();
    
    var_dump($storage->object); // array('a', 'b')
    
    $storage->object->push('c');
    This is all good, but I'd like to know if it's possible to somehow use native functions, to reduce the overhead of the library I'm developing, while still tracking changes so that they can be written to persistent storage.

    2) Forget about tracking changes, and just update whole object structures

    This is the simplest method of keeping the objects in the program synchronized with the objects actually stored in the object-cache (which can be on a different machine).

    However, it means whole structures, like an array with 1000 indexes, have to be sent through a socket to the object-cache when a single index changes.

    3) Keep a mirror of the object locally

    I've also tried cloning the object, and keeping a clone object untouched. Then when all processing is done by the PHP script, compare the clone to the modified object recursively, and submitting changed properties back to the object-cache.

    This however requires that the whole object be downloaded in order to use it.
    It also requires that the object take up twice as much memory, since it is cloned.
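
    The recursive comparison in method 3 could look roughly like the sketch below. It assumes PHP 8 and that the data is made of plain nested arrays; diffArrays() and its "/"-joined path format are hypothetical names, not part of any library.

```php
<?php
// Hypothetical helper: report what changed between a snapshot and the
// current value. Keys of the result are "/"-joined paths.
function diffArrays(array $old, array $new, string $prefix = ''): array
{
    $changes = ['added' => [], 'changed' => [], 'removed' => []];

    foreach ($new as $key => $value) {
        $path = $prefix . $key;
        if (!array_key_exists($key, $old)) {
            $changes['added'][$path] = $value;
        } elseif (is_array($value) && is_array($old[$key])) {
            // Recurse into nested arrays and merge the nested result
            $nested = diffArrays($old[$key], $value, $path . '/');
            foreach ($nested as $type => $entries) {
                $changes[$type] += $entries;
            }
        } elseif ($value !== $old[$key]) {
            $changes['changed'][$path] = $value;
        }
    }

    foreach ($old as $key => $value) {
        if (!array_key_exists($key, $new)) {
            $changes['removed'][$prefix . $key] = $value;
        }
    }

    return $changes;
}

$snapshot = ['a', 'b', 'meta' => ['views' => 1]];
$current  = $snapshot;              // the untouched mirror stays in $snapshot
$current[] = 'c';
$current['meta']['views'] = 2;

$diff = diffArrays($snapshot, $current);
```

    Only the entries in $diff would then need to be sent to the object-cache, rather than the whole structure.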

    ---

    I know this is pretty vague, but there is quite a bit of code involved. If anyone wants to see the code I can post it, or put it up on an open SVN repo. The project is open source but I haven't set up a public repository yet.
    Fiji Web Design - Enterprise Web Design

  2. #2
    Theoretical Physics Student bronze trophy Jake Arkinstall's Avatar
    Join Date
    May 2006
    Location
    Lancaster University, UK
    Posts
    7,062
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    I'd use the first method, and make the handling class implement the Iterator and ArrayAccess interfaces, allowing basic array functionality - I'm not sure about array_pop or functions such as that.

    If not, write custom functions to handle that stuff as you said.

    Of course you'd be sacrificing efficiency (remember, this is on a minuscule scale, a matter of milliseconds), but you'd be increasing functionality.
    Jake Arkinstall
    "Sometimes you don't need to reinvent the wheel;
    Sometimes its enough to make that wheel more rounded"-Molona

  3. #3
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,147
    Mentioned
    16 Post(s)
    Tagged
    3 Thread(s)
    Implement ArrayAccess and Countable as arkinstall suggested, and sacrifice the ability to use certain array-specific functions. You can then define methods in the class to handle the array-specific operations such as array_pop(), sort(), etc. One common function that you will no longer be able to use, and should be aware of, is empty(): it will not return the expected result for such an object. Too bad there isn't an interface for empty() the way Countable exists for count(). However, if the class implements Countable, you can work around that by testing count() == 0 or count() != 0 instead of calling empty().
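
    A minimal sketch of such a wrapper, assuming PHP 8.0 or later. TrackedArray and changedKeys() are hypothetical names used for illustration; they are not taken from the library discussed in this thread.

```php
<?php
// Hypothetical wrapper: behaves like an array for [] access, count() and
// foreach, and records which keys were modified.
class TrackedArray implements ArrayAccess, Countable, IteratorAggregate
{
    private array $data;
    private array $dirty = [];   // key => true for every modified key

    public function __construct(array $data = [])
    {
        $this->data = $data;
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->data[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->data[$offset] ?? null;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {              // $list[] = $value
            $this->data[] = $value;
            $offset = array_key_last($this->data);
        } else {
            $this->data[$offset] = $value;
        }
        $this->dirty[$offset] = true;
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->data[$offset]);
        $this->dirty[$offset] = true;
    }

    public function count(): int
    {
        return count($this->data);
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->data);
    }

    public function changedKeys(): array
    {
        return array_keys($this->dirty);
    }
}

$list = new TrackedArray(['a', 'b']);
$list[] = 'c';                       // recorded as a change to key 2
$isEmpty = count($list) === 0;       // use count() here, not empty($list)
```

    Native functions such as array_push() or sort() still will not accept this object, so those operations would have to be added as methods.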

  4. #4
    SitePoint Enthusiast
    Join Date
    Sep 2005
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by arkinstall View Post
    I'd use the first method, and make the handling class implement the Iterator and ArrayAccess interfaces, allowing basic array functionality - I'm not sure about array_pop or functions such as that.

    If not, write custom functions to handle that stuff as you said.

    Of course you'd be sacrificing efficiency (remember, this is on a minuscule scale, a matter of milliseconds), but you'd be increasing functionality.
    Thanks guys, I think that's the way to go.

    I've come across something really annoying.

    If you have an object with __get() and __set(), then once you make a copy of or a reference to one of its properties, __get() and __set() are no longer called for changes made through that copy or reference.


    eg:
    Code PHP:
    $obj = new Storable();
    $obj->greeting = 'hello'; // $obj->__set() is called
     
    $test =& $obj->greeting;
    $test = 'bye'; // $obj->__set() is not called

    Can this be worked around?

  5. #5
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,147
    Mentioned
    16 Post(s)
    Tagged
    3 Thread(s)
    You can't do that.

    The only possible way to achieve something like that would be to return another object that fires some type of event when it's changed. Then you could have the Storable object update when that happens. Yet it all seems like more work than it's worth.
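
    A rough sketch of that idea, assuming PHP 7.4 or later. TrackedValue, its get()/set() methods and the public $changed list are hypothetical; this Storable is a simplified stand-in, not the class from the original post.

```php
<?php
// Hypothetical value holder that notifies its owner whenever it is changed.
class TrackedValue
{
    private $value;
    private $onChange;   // callable invoked with the new value

    public function __construct($value, callable $onChange)
    {
        $this->value = $value;
        $this->onChange = $onChange;
    }

    public function get()
    {
        return $this->value;
    }

    public function set($value): void
    {
        $this->value = $value;
        ($this->onChange)($value);
    }
}

class Storable
{
    private array $values = [];
    public array $changed = [];   // names of properties modified so far

    public function __set(string $name, $value): void
    {
        $this->values[$name] = new TrackedValue($value, function () use ($name) {
            $this->changed[$name] = true;
        });
        $this->changed[$name] = true;
    }

    public function __get(string $name)
    {
        // Returns an object handle, so no PHP reference (=&) is needed
        return $this->values[$name] ?? null;
    }
}

$obj = new Storable();
$obj->greeting = 'hello';        // __set() runs
$obj->changed = [];              // pretend the state was just persisted

$test = $obj->greeting;          // plain assignment copies the handle
$test->set('bye');               // the owner is still notified
```

    Note that writing $test = 'bye' would still only rebind the local variable; the change has to go through a method call on the returned object.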

  6. #6
    SitePoint Enthusiast
    Join Date
    Sep 2005
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by oddz View Post
    You can't do that.

    The only possible way to achieve something like that would be to return another object that fires some type of event when it's changed. Then you could have the Storable object update when that happens. Yet it all seems like more work than it's worth.
    I'm actually wrapping every property into Storable:

    Code PHP:
    $obj = new Storable();
    $obj->greeting = 'hello'; // $obj->__set() is called
     
    $test =& $obj->greeting;
     
    var_dump($test); // object(Storable)#3 (1) { ["data:private"]=>  &string(5) "hello" } 
     
    $test = 'bye'; // $obj->__set() is not called

    I think I might look into __destruct(). I'll get back on that.

  7. #7
    SitePoint Guru
    Join Date
    May 2005
    Location
    Finland
    Posts
    608
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by digital-ether View Post
    3) Keep a mirror of the object locally

    I've also tried cloning the object, and keeping a clone object untouched. Then when all processing is done by the PHP script, compare the clone to the modified object recursively, and submitting changed properties back to the object-cache.

    This however requires that the whole object be downloaded in order to use it.
    It also requires that the object take up twice as much memory, since it is cloned.
    As I understand it, Doctrine's view on this is that the overhead is negligible since PHP applies copy-on-write. If the objects don't change, a copy never gets made. Therefore, they keep copies and compare the objects' states after script completion to what they were at hydration.

    You should check out Doctrine 2's implementation of change tracking. It's pretty non-trivial by now, but with some amount of study it might gain you some insight: http://svn.doctrine-project.org/trun...UnitOfWork.php

  8. #8
    SitePoint Enthusiast
    Join Date
    Sep 2005
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Ezku View Post
    As I understand it, Doctrine's view on this is that the overhead is negligible since PHP applies copy-on-write. If the objects don't change, a copy never gets made. Therefore, they keep copies and compare the objects' states after script completion to what they were at hydration.

    You should check out Doctrine 2's implementation of change tracking. It's pretty non-trivial by now, but with some amount of study it might gain you some insight: http://svn.doctrine-project.org/trun...UnitOfWork.php
    Thanks, this has given me some good insight on the problem.

  9. #9
    SitePoint Wizard Ren's Avatar
    Join Date
    Aug 2003
    Location
    UK
    Posts
    1,060
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Simplest method of determining if an object has changed between two points in an application is something like this, I think.

    PHP Code:
    <?php

    class ChangeTracker
    {
        protected $storage;

        function __construct()
        {
            // Maps each tracked object to a clone taken when tracking began
            $this->storage = new SplObjectStorage();
        }

        function track($object)
        {
            $this->storage->attach($object, clone $object);
        }

        function hasChanged($object)
        {
            if ($this->storage->contains($object))
                // Loose comparison: same class and equal property values
                return $this->storage[$object] != $object;

            throw new Exception('Untracked object');
        }
    }

    $tracker = new ChangeTracker();

    class A
    {
        public $foo = 'foo';

        function mutator() { $this->foo = 'method'; }
        function accesor() { return $this->foo; }
    }

    $a0 = new A();
    $a1 = new A();
    $a2 = new A();
    $a3 = new A();
    $a4 = new A();

    $tracker->track($a0);
    $tracker->track($a1);
    $tracker->track($a2);
    $tracker->track($a3);
    $tracker->track($a4);

    $a1->accesor();         // read only
    $a2->foo = 'bar';       // direct property write
    $a3->mutator();         // write through a method

    $i = &$a4->foo;         // write through a reference
    $i = 'indirect';

    echo 'a0 ', ($tracker->hasChanged($a0) ? 'changed' : 'unchanged'), "\n";
    echo 'a1 ', ($tracker->hasChanged($a1) ? 'changed' : 'unchanged'), "\n";
    echo 'a2 ', ($tracker->hasChanged($a2) ? 'changed' : 'unchanged'), "\n";
    echo 'a3 ', ($tracker->hasChanged($a3) ? 'changed' : 'unchanged'), "\n";
    echo 'a4 ', ($tracker->hasChanged($a4) ? 'changed' : 'unchanged'), "\n";

  10. #10
    SitePoint Guru
    Join Date
    Jun 2006
    Posts
    638
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ren's post is the best way to do this for one HTTP request.

    But if you plan to write to the cache every time the object changes, that will be WAY too slow (slower than having no cache at all...).

    I suggest you use your objects normally and cache their data in the destructor (if you want the caching to be done automatically), or cache them manually via some call ( $obj->cache(); ).

    The way I have this done in my system:
    PHP Code:
    # Load object ID 5 from cache, or DB then cache the result
    $obj = new Obj(5);

    # change the object (not the cache)
    $obj->foo = 'bar';

    # save the object to the DB and cache
    $obj->save(); 
    Then, I have a second way to do it:
    PHP Code:
    # Load object ID 5 from cache, or DB then cache the result
    $obj = new pObj(5);

    # change the object (not the cache)
    $obj->foo = 'bar';

    # called automatically when the HTTP request finishes / script ends.
    # This will save the object to the cache.
    #
    # $obj->__destruct();
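
    The destructor-based variant could be sketched as follows, assuming PHP 7.4 or later. CachedRecord and FakeCache are hypothetical stand-ins (a static array plays the role of the object-cache); they are not the Obj/pObj classes from the post above.

```php
<?php
// Hypothetical stand-in for the object-cache: a static array.
class FakeCache
{
    public static array $store = [];
}

// Hypothetical record class: keeps a dirty flag and writes itself back
// once, either on an explicit save() or when it is destroyed.
class CachedRecord
{
    private int $id;
    private array $data;
    private bool $dirty = false;

    public function __construct(int $id)
    {
        $this->id = $id;
        $this->data = FakeCache::$store[$id] ?? [];
    }

    public function __get(string $name)
    {
        return $this->data[$name] ?? null;
    }

    public function __set(string $name, $value): void
    {
        $this->data[$name] = $value;
        $this->dirty = true;
    }

    public function save(): void
    {
        if ($this->dirty) {
            FakeCache::$store[$this->id] = $this->data;
            $this->dirty = false;
        }
    }

    public function __destruct()
    {
        $this->save();   // one write per object, not one per change
    }
}

$obj = new CachedRecord(5);
$obj->foo = 'bar';
$obj->foo = 'baz';      // still only one write will happen
unset($obj);            // last reference gone: __destruct() saves
```

    The dirty flag means an object that was only read never touches the cache at all.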


  11. #11
    SitePoint Enthusiast
    Join Date
    Sep 2005
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    From my tests it looks like clone() copies the object value. There is no copy-on-write.

    My test is just using memory_get_usage() after each clone(). Is there a way to get copy on write with PHP5+ objects?

  12. #12
    SitePoint Wizard Ren's Avatar
    Join Date
    Aug 2003
    Location
    UK
    Posts
    1,060
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by digital-ether View Post
    From my tests it looks like clone() copies the object value. There is no copy-on-write.

    My test is just using memory_get_usage() after each clone(). Is there a way to get copy on write with PHP5+ objects?
    Yeah, I thought clone did create a copy.

    I am sceptical of Doctrine's view as expressed by Ezku.

    Did you try cloning an object that has something like a 5 MB string in one of its properties?

  13. #13
    SitePoint Enthusiast
    Join Date
    Sep 2005
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Ren View Post
    Yeah, I thought clone did create a copy.

    I am sceptical of Doctrine's view as expressed by Ezku.

    Did you try cloning an object that has something like a 5 MB string in one of its properties?
    Ok, I was wrong. clone() does create a new object with the properties being copy on write.

    This was apparent after testing with a 1MB string as a property.

    So Doctrine's view is spot on.
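
    A small test along these lines might look like the sketch below. The Blob class is hypothetical, and the exact byte counts depend on the PHP version and memory allocator, so only the relative sizes are meaningful.

```php
<?php
class Blob
{
    public $data;
}

$original = new Blob();
$original->data = str_repeat('x', 1024 * 1024);   // about 1 MB, built at runtime

$before = memory_get_usage();
$copy = clone $original;                 // shallow copy: shares the string
$afterClone = memory_get_usage();

$copy->data .= 'y';                      // first write forces a real copy
$afterWrite = memory_get_usage();

printf("clone cost: %d bytes\n", $afterClone - $before);
printf("write cost: %d bytes\n", $afterWrite - $afterClone);
```

    The clone itself should cost far less than the string it appears to duplicate; the full copy is only paid for once one side is modified.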

  14. #14
    SitePoint Wizard Ren's Avatar
    Join Date
    Aug 2003
    Location
    UK
    Posts
    1,060
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by digital-ether View Post
    Ok, I was wrong. clone() does create a new object with the properties being copy on write.

    This was apparent after testing with a 1MB string as a property.

    So Doctrine's view is spot on.
    Ah cool.

    So clone just allocates some memory for bookkeeping, rather than a complete new copy.

    What if the class implements the __clone magic?

  15. #15
    SitePoint Member romanb's Avatar
    Join Date
    May 2006
    Location
    Berlin
    Posts
    22
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Also, Doctrine 2 does not even clone the object, it simply keeps the array with the object data that was fetched from the database around while at the same time injecting the values into the objects. Copy-on-write makes this a very efficient thing in terms of memory usage. Only when a value in an object gets changed an actual copy of the original value is made. You can easily test this behavior in a small PHP script.

    Good to know that cloning objects takes some advantage of copy-on-write, though.

  16. #16
    SitePoint Addict webaddictz's Avatar
    Join Date
    Feb 2006
    Location
    Netherlands
    Posts
    295
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by digital-ether View Post
    This however requires that the whole object be downloaded in order to use it. It also requires that the object take up twice as much memory, since it is cloned.
    If you only want to check *if* something was changed in the object, rather than what exactly was changed, you might create a checksum just after instantiating the object, one way or another, and store that checksum. When the time comes to save the object, check whether the checksum still matches; if not, something has changed. This has the benefit that you don't have to store the object twice; you only have to store the checksum.
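
    A minimal sketch of the checksum idea, assuming PHP 7.2 or later and an object that can be serialized (no closures or resources). stateChecksum() and the Settings class are hypothetical names used for illustration.

```php
<?php
// Hypothetical helper: fingerprint an object's current state.
function stateChecksum(object $object): string
{
    return md5(serialize($object));
}

class Settings
{
    public $theme = 'light';
    public $items = ['a', 'b'];
}

$settings = new Settings();
$checksum = stateChecksum($settings);    // store this, not a second copy

$unchanged = stateChecksum($settings) === $checksum;

$settings->items[] = 'c';

$hasChanged = stateChecksum($settings) !== $checksum;
```

    The trade-off is CPU time for serializing and hashing the whole object, and no information about which property changed.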
    Yes, I blog, too.

