  1. #51
    SitePoint Wizard DougBTX's Avatar
    Join Date
    Nov 2001
    Location
    Bath, UK
    Posts
    2,498
    Quote Originally Posted by kyberfabrikken
    It surely will, but having a hash per attribute gives the added benefit of knowing which attributes to update, thus making it possible to fine-tune the queries to update only those.
    The hash solution was so that we wouldn't have to track individual changes. The update query would be less efficient as it would include all the fields, but checking to see if an object has been updated would be more efficient and simpler, as we could compare just two hashes rather than lists of properties. I can't see the benefit in comparing two lists of hashes vs comparing two lists of properties, because you would have to have the list of properties to generate the list of hashes in the first place! To get the hash of an object, you don't need to worry about its properties at all; you could just use kyberfabrikken's hashing code.
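A minimal sketch of the two approaches being compared here, assuming a plain entity with public properties. `Person`, `objectHash()`, `attributeHashes()` and `changedAttributes()` are hypothetical names for illustration, not code from this thread.

```php
<?php
// Hypothetical entity used only for illustration.
class Person {
    public $name = 'Ada';
    public $email = 'ada@example.com';
}

// One hash for the whole object: cheap to compare, but it only tells us
// *that* something changed, so the UPDATE has to include every column.
function objectHash($entity) {
    return md5(serialize(get_object_vars($entity)));
}

// One hash per attribute: more to store, but it tells us *which*
// attributes changed, so the UPDATE can be limited to those columns.
function attributeHashes($entity) {
    $hashes = array();
    foreach (get_object_vars($entity) as $name => $value) {
        $hashes[$name] = md5(serialize($value));
    }
    return $hashes;
}

// Returns the names of the attributes whose hash differs from the snapshot.
function changedAttributes($oldHashes, $entity) {
    $changed = array();
    foreach (attributeHashes($entity) as $name => $hash) {
        if (!isset($oldHashes[$name]) || $oldHashes[$name] !== $hash) {
            $changed[] = $name;
        }
    }
    return $changed;
}

$person = new Person();
$whole = objectHash($person);
$perAttribute = attributeHashes($person);

$person->email = 'ada@example.org';

$isDirty = ($whole !== objectHash($person));                // true
$dirtyColumns = changedAttributes($perAttribute, $person);  // array('email')
```

Either way the list of properties has to be read to build the snapshot; the difference is only in how much is kept around afterwards.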

    Douglas
    Hello World

  2. #52
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    I can't see the benefit in comparing two lists of hashes vs comparing two lists of properties, because you would have to have the list of properties to generate the list of hashes in the first place!
    Memory usage. Whenever the value of the attribute exceeds 32 chars, the hash would take up less space.

    I tend to agree with you to some degree though. In most setups, it wouldn't hurt to put a few redundant columns into the update query, and it would make the implementation simpler. Actually, the resources saved by not having to maintain a hash of each attribute, plus a simpler comparison, are probably worth it.

  3. #53
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Hi...

    Quote Originally Posted by kyberfabrikken
    Memory-usage. Whenever the value of the attribute exceeds 32 chars, the hash would take up less space.
    I don't think that this is significant. I like the idea of going...
    PHP Code:
    class IdentityMap {
        ...

        function set(&$entity) {
            $class = get_class($entity);
            if (! ($id = $entity->getId())) {
                // No id yet: queue the entity so saveUnsaved() can insert it.
                $this->_new[] = &$entity;
                return;
            }
            $this->_map["$class:$id"] = &$entity;
            $this->_snaps["$class:$id"] = $this->_getSnap($entity);
        }

        function _getSnap($entity) {
            return md5(serialize($entity));
        }

        function saveUnsaved(&$saver) {
            // Insert the new entities first; they get an id and move into the map.
            for ($i = 0; $i < count($this->_new); $i++) {
                $saver->save($this->_new[$i]);
                $this->set($this->_new[$i]);
            }
            unset($this->_new);
            $this->_new = array();

            // Then save every mapped entity whose snapshot no longer matches.
            foreach ($this->_snaps as $index => $snap) {
                if ($snap == $this->_getSnap($this->_map[$index])) {
                    continue;
                }
                $saver->save($this->_map[$index]);
                $this->set($this->_map[$index]);
            }
        }
    }

    The code could be tidied, but it's still kinda' neat.

    yours, Marcus
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  4. #54
    SitePoint Wizard DougBTX's Avatar
    Join Date
    Nov 2001
    Location
    Bath, UK
    Posts
    2,498
    Is there any merit in what richard dot prangnell at ntlworld dot com says at http://www.php.net/serialise?
    Quote Originally Posted by richard dot prangnell at ntlworld dot com
    Although the serialise function has its proper uses, it is a relatively slow process. I try to avoid using it wherever possible in the interests of performance. You can store text records in a database as plain text even when it contains PHP variables. The variable values are lost, naturally; but if they can easily be re-attached to retrieved records, zillions of clock cycles can be saved simply by using the str_replace() function:

    if($convert)
    {
        $mainContent = str_replace('$fixPath', $fixPath, $mainContent);
        $mainContent = str_replace('$theme', $theme, $mainContent);
    }

    The above snippet is used in a CMS project. $fixPath contains something like '.' or '..' to prepend relative paths (allowing the record to be used by scripts located in different parts of the directory hierarchy) and $theme inserts the name of the users custom page rendering scheme, which obviously would be undefined at record storage time anyway.
    Hello World

  5. #55
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Hi...

    Quote Originally Posted by DougBTX
    Is there any merit in what richard dot prangnell at ntlworld dot com says at http://www.php.net/serialise?
    Not if the values are lost, but this triggered another thought. serialize() will go hunting for references, which are likely to be other persistent entities. Replace serialize($entity) with something more sane.

    More sane would be to use implode(get_object_vars($entity)), but there will be problems with arrays if your persistence layer supports these.

    yours, Marcus

  6. #56
    SitePoint Enthusiast hantu's Avatar
    Join Date
    Oct 2004
    Location
    Berlin
    Posts
    54
    @lastcraft

    Hi,

    after I was convinced of the necessity of an Identity Map here, I coded one that I haven't tested yet. It looks a lot like yours; I used <classname>:<id> as the identifier too.


    My question is:

    When would saveUnsaved() be called? I guess at the end of a transaction.


    Another question regarding the Unit Of Work:

    If I understand it correctly, Fowler says to record all changes in a Unit of Work, and write these changes to the database when the current transaction is committed.

    There might be a problem with this:

    Say you need to insert two objects having a 1:n association, so object 1 might have a foreign key referencing object 2.

    That means object 2 needs to be inserted before object 1 (some databases might check foreign keys only at the end of a transaction, but I think not all do).

    Correct me if I'm wrong!

    Fowler recommends a topological sort of the objects before inserting them to overcome this problem. This is not impossible to do, but it's certainly a fair amount of work. Has anybody tried this yet?
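The topological ordering mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Unit of Work is assumed to already know, for each pending object, which other pending objects it references by foreign key. `insertOrder()` and the `Class:id` style keys are hypothetical, not taken from Fowler or from any library.

```php
<?php
// $dependsOn maps each pending object's key to the keys it references
// through foreign keys. Returns the keys ordered so that every referenced
// object comes before the object that refers to it (depth-first sort).
function insertOrder(array $dependsOn) {
    $sorted = array();
    $state = array(); // key => 'visiting' | 'done'

    $visit = function ($key) use (&$visit, &$sorted, &$state, $dependsOn) {
        if (isset($state[$key])) {
            if ($state[$key] === 'visiting') {
                // We came back to a key that is still being processed.
                throw new RuntimeException("Circular foreign key dependency at $key");
            }
            return; // already placed in the result
        }
        $state[$key] = 'visiting';
        $targets = isset($dependsOn[$key]) ? $dependsOn[$key] : array();
        foreach ($targets as $target) {
            $visit($target); // referenced rows go first
        }
        $state[$key] = 'done';
        $sorted[] = $key;
    };

    foreach (array_keys($dependsOn) as $key) {
        $visit($key);
    }
    return $sorted;
}

// Hypothetical example: an order line references an order,
// which in turn references a customer.
$sorted = insertOrder(array(
    'OrderLine:new1' => array('Order:new1'),
    'Order:new1'     => array('Customer:new1'),
    'Customer:new1'  => array(),
));
// $sorted is array('Customer:new1', 'Order:new1', 'OrderLine:new1')
```

A cycle (two new rows that reference each other) cannot be ordered this way, which is why the sketch throws; such cases would need a nullable foreign key and a follow-up update, or deferred constraint checking where the database supports it.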

  7. #57
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    More sane would be to use implode(get_object_vars($entity)), but there will be problems with arrays if your persistence layer supports these.
    I don't like the idea of implode(get_object_vars($entity)) -- it's just too imprecise. Say you have this :
    PHP Code:
    $obj1->a = "foo";
    $obj1->b = "bar";

    $obj2->a = "foob";
    $obj2->b = "ar";
    Now they compare equal ... not good.
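A small sketch of that collision and one possible way around it, assuming only scalar attributes matter. `naiveFingerprint()` and `taggedFingerprint()` are hypothetical names; the length-prefix format is just one option, not something proposed in this thread.

```php
<?php
// Naive fingerprint: values glued together with no separator and no keys.
function naiveFingerprint($entity) {
    return md5(implode('', get_object_vars($entity)));
}

// Hypothetical alternative: prefix every value with its attribute name and
// its length, so "foo"+"bar" and "foob"+"ar" no longer build the same string.
function taggedFingerprint($entity) {
    $parts = '';
    foreach (get_object_vars($entity) as $name => $value) {
        if (is_scalar($value)) {
            $value = (string) $value;
            $parts .= $name . ':' . strlen($value) . ':' . $value . ';';
        }
    }
    return md5($parts);
}

$obj1 = new stdClass();
$obj1->a = 'foo';
$obj1->b = 'bar';

$obj2 = new stdClass();
$obj2->a = 'foob';
$obj2->b = 'ar';

// Both objects collapse to "foobar", so the naive version cannot tell them apart.
$naiveCollides = (naiveFingerprint($obj1) === naiveFingerprint($obj2));    // true

// "a:3:foo;b:3:bar;" differs from "a:4:foob;b:2:ar;".
$taggedCollides = (taggedFingerprint($obj1) === taggedFingerprint($obj2)); // false
```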

    Is there any merit in what richard dot prangnell at ntlworld dot com says at http://www.php.net/serialise?
    I wrote this little test to measure the speed differences. There are four tests.
    test1 -- A custom function. Works recursively with arrays and keeps track of each attribute, instead of just the object.
    test2 -- Collapses the object into a string much like serialize, and also works with arrays.
    Both 1 and 2 ignore objects, which is a simple way to protect against redundant references.
    test3 -- Based on serialize
    test4 -- Based on get_object_vars

    As expected, test2 is slower than test3, which is slower than test4. Test1 is the slowest of course, but it adds extra functionality, so it's not directly comparable to the others. I included it to show how little performance is lost going from test2 to test1 in exchange for that extra functionality.
    The figures on my machine are :
    test 1 :0.0748281478882
    test 2 :0.0536389350891
    test 3 :0.0327990055084
    test 4 :0.0277180671692

    However, contrary to richard's claim, serialize (test3) doesn't seem to lag far behind get_object_vars (test4). This may be because the expensive part of serialize is in fact the reflection, which would suggest that skipping it (by passing in an array of the attributes to work on) would boost performance. This would also solve the issue with redundant references effectively.
    To test this, I wrote a fifth test :
    test 5 :0.0342090129852
    As can be seen, it's almost as fast as using serialize, but does not have the inherent troubles associated with serialize (redundancy). Overall the best choice, I'd say.

    PHP Code:
    define('SCHNAPS_MODE_MD5', 0);
    define('SCHNAPS_MODE_SCALAR', 1);
    define('SCHNAPS_MODE_ARRAY', 2);

    // Compares a snapshot built by make_schnaps() against the current state.
    function compare_schnaps($hashes, $obj) {
        if (is_object($obj)) {
            $attributes = get_object_vars($obj);
        } else if (is_array($obj)) {
            $attributes = $obj;
        }
        foreach ($hashes as $attribute => $hash) {
            switch ($hash[0]) {
                case SCHNAPS_MODE_MD5 :
                    $compare = md5($attributes[$attribute]) == $hash[1];
                    break;
                case SCHNAPS_MODE_SCALAR :
                    $compare = $attributes[$attribute] == $hash[1];
                    break;
                case SCHNAPS_MODE_ARRAY :
                    // Snapshot first, current value second, as in the signature.
                    $compare = compare_schnaps($hash[1], $attributes[$attribute]);
                    break;
            }
            if (!$compare) return false;
        }
        return true;
    }

    // Builds a per-attribute snapshot; long strings are stored as md5 hashes.
    function make_schnaps($obj) {
        $hashes = array();
        if (is_object($obj)) {
            $attributes = get_object_vars($obj);
        } else if (is_array($obj)) {
            $attributes = $obj;
        }
        foreach ($attributes as $attribute => $value) {
            if (is_array($value)) {
                $hashes[$attribute] = array(SCHNAPS_MODE_ARRAY, make_schnaps($value));
            } else if (is_string($value) && strlen($value) > 32) {
                $hashes[$attribute] = array(SCHNAPS_MODE_MD5, md5($value));
            } else if (is_scalar($value)) {
                $hashes[$attribute] = array(SCHNAPS_MODE_SCALAR, $value);
            }
        }
        return $hashes;
    }

    // Collapses scalars and nested arrays into one string; objects are skipped.
    function collapse($obj) {
        $properties = "";
        if (is_object($obj)) {
            $attributes = get_object_vars($obj);
        } else if (is_array($obj)) {
            $attributes = $obj;
        }
        foreach ($attributes as $attribute => $value) {
            if (is_array($value)) {
                $properties .= $attribute . ":" . collapse($value);
            } else if (is_scalar($value)) {
                $properties .= $attribute . ":" . $value;
            }
        }
        return $properties;
    }

    // Same idea, but the caller names the attributes, so no reflection is needed.
    function collapse_no_reflexion($obj, $attributes) {
        $properties = "";
        foreach ($attributes as $attribute) {
            $value = $obj->$attribute;
            if (is_scalar($value)) {
                $properties .= $attribute . ":" . $value;
            }
        }
        return $properties;
    }

    class someclass
    {
        public $abe;
        public $foobar;

        function __construct()
        {
            $this->abe = 1003;
            $this->foobar = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum";
        }
    }

    $obj = new someclass();

    function test1($obj) {
        $hash = make_schnaps($obj);
        $bool = compare_schnaps($hash, $obj);
    }

    function test2($obj) {
        $hash = md5(collapse($obj));
        $bool = $hash == md5(collapse($obj));
    }

    function test3($obj) {
        $hash = md5(serialize($obj));
        $bool = $hash == md5(serialize($obj));
    }

    function test4($obj) {
        $hash = md5(implode(',', get_object_vars($obj)));
        $bool = $hash == md5(implode(',', get_object_vars($obj)));
    }

    function test5($obj) {
        $hash = md5(collapse_no_reflexion($obj, array('abe', 'foobar')));
        $bool = $hash == md5(collapse_no_reflexion($obj, array('abe', 'foobar')));
    }

    function getmicrotime() {
        list($usec, $sec) = explode(" ", microtime());
        return ((float)$usec + (float)$sec);
    }

    $t = getmicrotime();
    for ($i = 0; $i < 1000; ++$i)
        test1($obj);
    echo "test 1 :" . (getmicrotime() - $t) . "<br>";

    $t = getmicrotime();
    for ($i = 0; $i < 1000; ++$i)
        test2($obj);
    echo "test 2 :" . (getmicrotime() - $t) . "<br>";

    $t = getmicrotime();
    for ($i = 0; $i < 1000; ++$i)
        test3($obj);
    echo "test 3 :" . (getmicrotime() - $t) . "<br>";

    $t = getmicrotime();
    for ($i = 0; $i < 1000; ++$i)
        test4($obj);
    echo "test 4 :" . (getmicrotime() - $t) . "<br>";

    $t = getmicrotime();
    for ($i = 0; $i < 1000; ++$i)
        test5($obj);
    echo "test 5 :" . (getmicrotime() - $t) . "<br>";

  8. #58
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Hi...

    Quote Originally Posted by kyberfabrikken
    Now they compare equal ... not good.
    Actually it's a hash. It's the implode bit I got wrong. Once you make one mistake you don't have to wait long for an even bigger one.

    The problem with serialize is that you will capture collection objects. You already have these in the identity map. I think get_object_vars() can be made to work, it just requires someone with more than half an ounce of coding ability...

    Interesting test though.

    yours, Marcus

  9. #59
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Hi...

    Quote Originally Posted by hantu
    When would saveUnsaved() be called? I guess at the end of a transaction.
    The version above can be called multiple times (I hope). The version in Changes is called at the end and actually dispatches the commit() message.

    Quote Originally Posted by hantu
    There might be a problem within this:

    Say you need to insert two objects having a 1:n association, so object 1 might have a foreign key referencing object 2.

    That means object 2 needs to be inserted before object 1 (some databases might check foreign keys only at the end of a transaction but I think not all do).

    Correct me if I'm wrong!
    In Changes I got around this by committing new objects first. I had a class called DeferredId that acted as a placeholder. I haven't tried anything else.
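The Changes code is not shown in this thread, so the following is only a guess at what such a placeholder could look like, not the actual implementation. The `DeferredId` body, `resolve()`, `Customer` and `assignId()` are all hypothetical; the only thing taken from the thread is that entities expose `getId()`.

```php
<?php
// Hypothetical placeholder: stands in for a foreign key value whose
// target row has not been inserted yet and therefore has no id.
class DeferredId {
    private $entity;

    public function __construct($entity) {
        $this->entity = $entity;
    }

    // Resolves to the real id once the referenced entity has been saved.
    public function resolve() {
        $id = $this->entity->getId();
        if ($id === null) {
            throw new LogicException('Referenced entity has not been saved yet');
        }
        return $id;
    }
}

// Hypothetical entity whose id is assigned by the database on insert.
class Customer {
    private $id = null;

    public function getId() {
        return $this->id;
    }

    public function assignId($id) {
        $this->id = $id;
    }
}

$customer = new Customer();
$customerId = new DeferredId($customer); // an order could hold this instead of an integer

$customer->assignId(42);                 // simulates the INSERT of the new customer
$resolved = $customerId->resolve();      // 42
```

With new objects committed first, every placeholder can be resolved by the time the rows that reference them are written.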

    yours, Marcus

