  1. #1
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Memory usage reduced with short attribute names ?!?

    Can anybody explain why memory usage seems better when using shorter attribute names?

    I have the following test class:
    PHP Code:
    $loopCount = 1024 * 512;

    error_reporting(E_ALL);

    class testClass
    {
        public $veryLongAttributeName0 = NULL;
        public $veryLongAttributeName1 = NULL;
        public $veryLongAttributeName2 = NULL;
        public $veryLongAttributeName3 = NULL;
        public $veryLongAttributeName4 = NULL;
        public $veryLongAttributeName5 = NULL;
        public $veryLongAttributeName6 = NULL;
        public $veryLongAttributeName7 = NULL;
        public $veryLongAttributeName8 = NULL;
        public $veryLongAttributeName9 = NULL;

        function __construct($libraryType = null, $fileID = null)
        {
        }

        function _destructor()
        {
        } // function destructor

    }    //    class testClass

    $callStartTime = microtime(true);

    $testArray = array();
    for ($i = 0; $i < $loopCount; ++$i) {
        $testArray[] = new testClass();
    }

    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;
    echo '<br />Call time to instantiate '.$loopCount.' objects of testClass was '.sprintf('%.4f',$callTime)." seconds<br />\n";

    echo date('H:i:s').' Peak memory usage: '.(memory_get_peak_usage(true) / 1024 / 1024).' MB<br />';
    which produces the following result:
    Code:
    Call time to instantiate 524288 objects of testClass was 3.4530 seconds
    09:34:08 Peak memory usage: 494.75 MB
    If I simply change the attribute names
    PHP Code:
        public $veryLongAttributeName0 = NULL;
        public $veryLongAttributeName1 = NULL;
        public $veryLongAttributeName2 = NULL;
        public $veryLongAttributeName3 = NULL;
        public $veryLongAttributeName4 = NULL;
        public $veryLongAttributeName5 = NULL;
        public $veryLongAttributeName6 = NULL;
        public $veryLongAttributeName7 = NULL;
        public $veryLongAttributeName8 = NULL;
        public $veryLongAttributeName9 = NULL;
    to
    PHP Code:
        public $v0 = NULL;
        public $v1 = NULL;
        public $v2 = NULL;
        public $v3 = NULL;
        public $v4 = NULL;
        public $v5 = NULL;
        public $v6 = NULL;
        public $v7 = NULL;
        public $v8 = NULL;
        public $v9 = NULL;
    I get the following result:
    Code:
    Call time to instantiate 524288 objects of testClass was 3.2260 seconds
    09:37:46 Peak memory usage: 374.75 MB
    It appears to run fractionally faster (although that's harder to determine), but it uses significantly less memory (494.75 MB reduced to 374.75 MB).
    That shouldn't be right... should it? Even in a semi-compiled language such as PHP, the memory usage (and possibly speed of execution) shouldn't be affected by the length of an attribute name.
    ---
    Development Projects:
    PHPExcel
    PHPPowerPoint

  2. #2
    SitePoint Guru
    Join Date
    Oct 2006
    Location
    Queensland, Australia
    Posts
    852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    PHP has to store full name references in memory as that's just how scripting languages work. I just did a quick calculation, and found that the variable names alone (all 5,000,000 of them) at 20 characters long roughly equal 100 MB, compared to variable names of 2 characters, which consume less than 10 MB. There would of course be additional overheads which would further amplify the difference.
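    A rough back-of-envelope check of that estimate, using only the figures posted in this thread (524288 objects, 10 attributes each, the two attribute names, and the 494.75 MB and 374.75 MB peaks). It assumes one byte per character and knows nothing about PHP's internals, so treat it as a sketch rather than an explanation:

```php
<?php
// Figures taken from the first post; nothing here inspects PHP internals.
$objects    = 1024 * 512;  // 524288 instances
$attributes = 10;          // attributes per instance
$names      = $objects * $attributes;

// Assumption: one byte per character, no per-name overhead.
$longLength  = strlen('veryLongAttributeName0');
$shortLength = strlen('v0');

$mb      = 1024 * 1024;
$longMb  = $names * $longLength / $mb;   // raw bytes of the long names, in MB
$shortMb = $names * $shortLength / $mb;  // raw bytes of the short names, in MB

// Difference actually reported between the two runs.
$observedDiffMb = 494.75 - 374.75;
$bytesPerName   = $observedDiffMb * $mb / $names;

printf("Long names:  %d MB\n", $longMb);
printf("Short names: %d MB\n", $shortMb);
printf("Observed difference: %.2f MB (%.1f bytes per attribute per instance)\n",
    $observedDiffMb, $bytesPerName);
```

    On those figures the raw name bytes come to 110 MB versus 10 MB, so they would account for about 100 MB of the 120 MB gap. The observed gap works out at 24 bytes per attribute per instance rather than the 20 characters saved, which would be consistent with some per-name overhead, though that part is a guess.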

    Of course, any such test like yours is completely meaningless, as even the most complex applications would be hard-pressed to reach 10,000 variables (or pointers), let alone 5,000,000. It's simply not worth worrying about. I'm sure the money you save as a result of faster development due to easier coding and debugging as a result of more descriptive variables, will more than cover the cost of the extra couple of megabytes needed to run the code.

  3. #3
    SitePoint Enthusiast
    Join Date
    Jun 2004
    Location
    London
    Posts
    66
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I just ran this script, adjusted from your example, and I am only seeing a change in the execution time.
    PHP Code:
    <?php
    error_reporting(E_ALL);

    $loopCount = 1024 * 512;
    //$loopCount = 1024;

    class testClass
    {
        public $veryLongAttributeName0 = NULL;
        public $veryLongAttributeName1 = NULL;
        public $veryLongAttributeName2 = NULL;
        public $veryLongAttributeName3 = NULL;
        public $veryLongAttributeName4 = NULL;
        public $veryLongAttributeName5 = NULL;
        public $veryLongAttributeName6 = NULL;
        public $veryLongAttributeName7 = NULL;
        public $veryLongAttributeName8 = NULL;
        public $veryLongAttributeName9 = NULL;
    }

    class testClass1
    {
        public $v0 = NULL;
        public $v1 = NULL;
        public $v2 = NULL;
        public $v3 = NULL;
        public $v4 = NULL;
        public $v5 = NULL;
        public $v6 = NULL;
        public $v7 = NULL;
        public $v8 = NULL;
        public $v9 = NULL;
    }

    $callStartTime = microtime(true);
    $testArray = array();
    for ($i = 0; $i < $loopCount; ++$i) {
        $testArray[] = new testClass();
    }
    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;

    echo 'Call time to instantiate '.$loopCount.' objects of testClass was '.sprintf('%.4f',$callTime)." seconds<br />\n";
    echo date('H:i:s').' Peak memory usage: '.(memory_get_peak_usage(true) / 1024 / 1024)." MB<br />\n";

    // ---

    $callStartTime = microtime(true);
    $testArray = array();
    for ($i = 0; $i < $loopCount; ++$i) {
        $testArray[] = new testClass1();
    }
    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;

    echo 'Call time to instantiate '.$loopCount.' objects of testClass1 was '.sprintf('%.4f',$callTime)." seconds<br />\n";
    echo date('H:i:s').' Peak memory usage: '.(memory_get_peak_usage(true) / 1024 / 1024)." MB<br />\n";
    Running cli env
    Code:
    Call time to instantiate 524288 objects of testClass was 2.2142 seconds<br />
    11:16:02 Peak memory usage: 470.75 MB<br />
    Call time to instantiate 524288 objects of testClass1 was 2.8307 seconds<br />
    11:16:05 Peak memory usage: 470.75 MB<br />
    Running using mod apache
    Code:
    Call time to instantiate 524288 objects of testClass was 2.4203 seconds
    11:18:57 Peak memory usage: 470.75 MB
    Call time to instantiate 524288 objects of testClass1 was 3.2689 seconds
    11:19:00 Peak memory usage: 470.75 MB
    Tests were conducted on a CentOS machine using PHP 5.2.10.

    MMmmm
    David Stevens, create-inspire
    PHP London, www.phplondon.org

  4. #4
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Wardrop View Post
    PHP has to store full name references in memory as that's just how scripting languages work.
    But why are the names stored against every instance of the class? If I were writing a compiler (even one that compiled to bytecode rather than a .exe), I'd maintain the attribute names against the definition of the class, not against every instance.
    Those rare cases where the actual names are needed within the bytecode (error handling, serialize(), etc) should be capable of cross referencing an instance (with references rather than names) against the single class definition (with names).

    Quote Originally Posted by Wardrop View Post
    Of course, any such test like yours is completely meaningless, as even the most complex applications would be hard-pressed to reach 10,000 variables (or pointers), let alone 5,000,000. It's simply not worth worrying about. I'm sure the money you save as a result of faster development due to easier coding and debugging as a result of more descriptive variables, will more than cover the cost of the extra couple of megabytes needed to run the code.
    But it isn't specifically variables in procedural code that concern me; typically, variable scope restricts those to one copy of each.
    My concern is OOP, where you may have several instances of a class, each with its own attributes.
    In a class with a couple of dozen attributes, that is instantiated a few dozen times, those extra bytes can very easily add up to significant amounts of memory.


    Perhaps I'm a real exception.
    PHPExcel has an instantiated object for every cell in every worksheet in a workbook. With large Excel files, that can easily hit several million instantiated objects... and yes, we do hit memory problems with large files that we've been working hard to alleviate.
    Now it looks like we need to figure out some way round this problem as well... but using short, meaningless attribute names goes against everything I've ever been taught about good coding practices.

  5. #5
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by davro View Post
    I just ran this script, adjusted from your example, and I am only seeing a change in the execution time.
    If you're running both tests as a single script, then you can't rely on memory_get_peak_usage(), because it returns the highest memory usage for the entire script, not just the usage at the point where you make the call. That matters all the more here because you run the test that uses the most memory first.
    You'd need to use memory_get_usage(), and make sure to unset $testArray and wait for garbage collection to remove it from memory before executing the second test.
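    A minimal sketch of that approach, so both classes can be compared within one script. The class names LongNames and ShortNames and the helper measureInstances() are made up for illustration, the attribute lists are cut down to three each, and gc_collect_cycles() needs PHP 5.3 or later (so it would not run as-is on the 5.2.10 box mentioned above):

```php
<?php
// Hypothetical stand-ins for the two test classes in this thread.
class LongNames
{
    public $veryLongAttributeName0 = null;
    public $veryLongAttributeName1 = null;
    public $veryLongAttributeName2 = null;
}

class ShortNames
{
    public $v0 = null;
    public $v1 = null;
    public $v2 = null;
}

/**
 * Returns the growth in memory_get_usage() (in bytes) while $count
 * instances of $className are held, then releases them again.
 */
function measureInstances($className, $count)
{
    $before = memory_get_usage();

    $instances = array();
    for ($i = 0; $i < $count; ++$i) {
        $instances[] = new $className();
    }

    $after = memory_get_usage();

    unset($instances);    // drop the only reference to the objects
    gc_collect_cycles();  // PHP 5.3+; harmless when there are no cycles

    return $after - $before;
}

// Far fewer objects than the 524288 used in the thread, to keep this quick.
$count      = 10000;
$longBytes  = measureInstances('LongNames', $count);
$shortBytes = measureInstances('ShortNames', $count);

echo 'LongNames:  ', $longBytes, " bytes\n";
echo 'ShortNames: ', $shortBytes, " bytes\n";
```

    Because each measurement is a before/after difference rather than a peak, the order in which the two classes are tested should matter much less. The actual numbers will depend on the PHP version, so none are quoted here.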

  6. #6
    SitePoint Enthusiast
    Join Date
    Jun 2004
    Location
    London
    Posts
    66
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Nice point Mark, well made.

    I have separated out the two scripts.

    test_class_names.php
    PHP Code:
    <?php
    error_reporting(E_ALL);

    $loopCount = 1024 * 512;

    class testClass
    {
        public $veryLongAttributeName0 = NULL;
        public $veryLongAttributeName1 = NULL;
        public $veryLongAttributeName2 = NULL;
        public $veryLongAttributeName3 = NULL;
        public $veryLongAttributeName4 = NULL;
        public $veryLongAttributeName5 = NULL;
        public $veryLongAttributeName6 = NULL;
        public $veryLongAttributeName7 = NULL;
        public $veryLongAttributeName8 = NULL;
        public $veryLongAttributeName9 = NULL;
    }

    $callStartTime = microtime(true);
    $testArray = array();
    for ($i = 0; $i < $loopCount; ++$i) {
        $testArray[] = new testClass();
    }
    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;

    echo 'Call time to instantiate '.$loopCount.' objects of testClass was '.sprintf('%.4f',$callTime)." seconds<br />\n";
    echo date('H:i:s').' Peak memory usage: '.(memory_get_peak_usage(true) / 1024 / 1024)." MB<br />\n";
    test_class_names1.php
    PHP Code:
    <?php
    error_reporting(E_ALL);

    $loopCount = 1024 * 512;

    class testClass
    {
        public $v0 = NULL;
        public $v1 = NULL;
        public $v2 = NULL;
        public $v3 = NULL;
        public $v4 = NULL;
        public $v5 = NULL;
        public $v6 = NULL;
        public $v7 = NULL;
        public $v8 = NULL;
        public $v9 = NULL;
    }

    $callStartTime = microtime(true);
    $testArray = array();
    for ($i = 0; $i < $loopCount; ++$i) {
        $testArray[] = new testClass();
    }
    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;

    echo 'Call time to instantiate '.$loopCount.' objects of testClass was '.sprintf('%.4f',$callTime)." seconds<br />\n";
    echo date('H:i:s').' Peak memory usage: '.(memory_get_peak_usage(true) / 1024 / 1024)." MB<br />\n";
    Code:
    php test_class_names.php && php test_class_names1.php
    
    Call time to instantiate 524288 objects of testClass was 2.2189 seconds<br />
    13:49:00 Peak memory usage: 470.75 MB<br />
    Call time to instantiate 524288 objects of testClass was 2.0302 seconds<br />
    13:49:03 Peak memory usage: 370.75 MB<br />
    Shocking...

  7. #7
    <?php while(!sleep()){code();} G.Schuster's Avatar
    Join Date
    Mar 2007
    Location
    Germany
    Posts
    428
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Just thinking out loud...what about using magic getter/setter methods?
    PHP Code:
    class FooBar {
         private $_data = array();

         public function __set($name, $value) {
              switch($name) {
                   case 'longVariableName1':
                        $this->_data[0] = $value;
                        break;
                   case 'longVariableName2':
                        $this->_data[1] = $value;
                        break;
                   // ...
              }
         }

         public function __get($name) {
              switch($name) {
                   case 'longVariableName1':
                        return $this->_data[0];
                        break;
                   case 'longVariableName2':
                        return $this->_data[1];
                        break;
                   // ...
              }
         }
    }

    That should at least reduce the amount of memory used to store the variable names.
    Untested! So please don't freak out if I'm wrong.

  8. #8
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by G.Schuster View Post
    Just thinking out loud...what about using magic getter/setter methods?

    That should at least reduce the amount of memory used to store the variable names.
    Untested! So please don't freak out if I'm wrong.
    We can't use that for public attributes without breaking the published API, but the idea of an array of numerically indexed values has potential for private attributes. A single array is only one variable name, and use of class constants in the code might help keep any additional memory requirements in the methods down to a minimum, though we'd have to do some fairly extensive testing.
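    A minimal sketch of what that might look like for private attributes, assuming a made-up ExampleCell class (this is not PHPExcel's actual cell class, and the constant and method names are invented for illustration). The class constants live once on the class, while each instance carries only the single $_data array:

```php
<?php
// Hypothetical class; names and fields are illustrative only.
class ExampleCell
{
    // Class constants are stored once per class, not once per instance.
    const VALUE     = 0;
    const DATA_TYPE = 1;

    // The only per-instance attribute: a numerically indexed array.
    private $_data = array();

    public function setValue($value)
    {
        $this->_data[self::VALUE] = $value;
    }

    public function getValue()
    {
        // Unset slots read back as null, like an attribute defaulting to NULL.
        return isset($this->_data[self::VALUE]) ? $this->_data[self::VALUE] : null;
    }

    public function setDataType($dataType)
    {
        $this->_data[self::DATA_TYPE] = $dataType;
    }

    public function getDataType()
    {
        return isset($this->_data[self::DATA_TYPE]) ? $this->_data[self::DATA_TYPE] : null;
    }
}

$cell = new ExampleCell();
$cell->setValue(42);
$cell->setDataType('n');
echo $cell->getValue(), ' ', $cell->getDataType(), "\n";
```

    The methods stay readable because they refer to self::VALUE rather than a bare 0. Whether this actually saves memory for a given class would still need measuring, as noted above.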

    Thanks for the suggestion.

  9. #9
    <?php while(!sleep()){code();} G.Schuster's Avatar
    Join Date
    Mar 2007
    Location
    Germany
    Posts
    428
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Well, this even works for public attributes as long as they are not defined.
    As for the API I don't see any real problems.
    OK, code completion will not work in all editors but I think it's OK as long as the decrease in memory consumption helps speed up the apps.

  10. #10
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by G.Schuster View Post
    Well, this even works for public attributes as long as they are not defined.
    As for the API I don't see any real problems.
    The problem with the published API is for users who have built working applications using PHPExcel... it could force them to change their own code when they upgraded to a version that had changed public attributes that they were accessing.

    Fortunately, we're looking at other ways round the memory issues... specifically cell data caching, so we'd only ever have a single instance of the cell class actively in memory at any given moment.

  11. #11
    <?php while(!sleep()){code();} G.Schuster's Avatar
    Join Date
    Mar 2007
    Location
    Germany
    Posts
    428
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    There's no change in the API!
    You can still use the public properties as if they were defined - __get(), __set(), __isset() and __unset() do the job for you!

  12. #12
    <?php while(!sleep()){code();} G.Schuster's Avatar
    Join Date
    Mar 2007
    Location
    Germany
    Posts
    428
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    OK, as this triggered me a little I tested it - with great results.
    Code:
    Call time to instantiate 524288 objects of testClass was 1.2642 seconds
    23:50:43 Peak memory usage: 163 MB
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9
    PHP Code:
    <?php
    $loopCount = 1024 * 512;
    error_reporting(E_ALL);

    class testClass {
          private $_data = array();

          function __construct($libraryType = null, $fileID = null) {
          }

          function _destructor() {
          }

          public function __set($name, $value) {
                switch($name) {
                      case 'longVariableName0':
                            $this->_data[0] = $value;
                            break;
                      case 'longVariableName1':
                            $this->_data[1] = $value;
                            break;
                      case 'longVariableName2':
                            $this->_data[2] = $value;
                            break;
                      case 'longVariableName3':
                            $this->_data[3] = $value;
                            break;
                      case 'longVariableName4':
                            $this->_data[4] = $value;
                            break;
                      case 'longVariableName5':
                            $this->_data[5] = $value;
                            break;
                      case 'longVariableName6':
                            $this->_data[6] = $value;
                            break;
                      case 'longVariableName7':
                            $this->_data[7] = $value;
                            break;
                      case 'longVariableName8':
                            $this->_data[8] = $value;
                            break;
                      case 'longVariableName9':
                            $this->_data[9] = $value;
                            break;
                }
          }

          public function __get($name) {
                switch($name) {
                      case 'longVariableName0':
                            return $this->_data[0];
                            break;
                      case 'longVariableName1':
                            return $this->_data[1];
                            break;
                      case 'longVariableName2':
                            return $this->_data[2];
                            break;
                      case 'longVariableName3':
                            return $this->_data[3];
                            break;
                      case 'longVariableName4':
                            return $this->_data[4];
                            break;
                      case 'longVariableName5':
                            return $this->_data[5];
                            break;
                      case 'longVariableName6':
                            return $this->_data[6];
                            break;
                      case 'longVariableName7':
                            return $this->_data[7];
                            break;
                      case 'longVariableName8':
                            return $this->_data[8];
                            break;
                      case 'longVariableName9':
                            return $this->_data[9];
                            break;
                }
          }
    }

    $callStartTime = microtime(true);
    $testArray = array();
    for ($i = 0; $i < $loopCount; ++$i) {
          $testArray[] = new testClass();
    }

    $callEndTime = microtime(true);
    $callTime = $callEndTime - $callStartTime;
    echo '<br />Call time to instantiate '.$loopCount.' objects of testClass was '.sprintf('%.4f',$callTime)." seconds<br />\n";
    echo date('H:i:s').' Peak memory usage: '.(memory_get_peak_usage(true) / 1024 / 1024).' MB<br />';

    $testArray[0]->longVariableName0 = 0;
    $testArray[0]->longVariableName1 = 1;
    $testArray[0]->longVariableName2 = 2;
    $testArray[0]->longVariableName3 = 3;
    $testArray[0]->longVariableName4 = 4;
    $testArray[0]->longVariableName5 = 5;
    $testArray[0]->longVariableName6 = 6;
    $testArray[0]->longVariableName7 = 7;
    $testArray[0]->longVariableName8 = 8;
    $testArray[0]->longVariableName9 = 9;

    echo $testArray[0]->longVariableName0.'<br />';
    echo $testArray[0]->longVariableName1.'<br />';
    echo $testArray[0]->longVariableName2.'<br />';
    echo $testArray[0]->longVariableName3.'<br />';
    echo $testArray[0]->longVariableName4.'<br />';
    echo $testArray[0]->longVariableName5.'<br />';
    echo $testArray[0]->longVariableName6.'<br />';
    echo $testArray[0]->longVariableName7.'<br />';
    echo $testArray[0]->longVariableName8.'<br />';
    echo $testArray[0]->longVariableName9.'<br />';

    ?>
    As you can see the "public" variables are still accessible without any workaround.

  13. #13
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    With your suggestion, I came up with:
    PHP Code:
    class testClass
    {
        private static $_propertyList = array( 'longVariableName0',
                                               'longVariableName1',
                                               'longVariableName2',
                                               'longVariableName3',
                                               'longVariableName4',
                                               'longVariableName5',
                                               'longVariableName6',
                                               'longVariableName7',
                                               'longVariableName8',
                                               'longVariableName9'
                                             );

        private $_data = array();


        public function __set($name, $value) {
            // Static properties are accessed as self::$name
            $key = array_search($name, self::$_propertyList);
            if ($key !== false) {
                $this->_data[$key] = $value;
            }
        }

        public function __get($name) {
            $key = array_search($name, self::$_propertyList);
            if ($key !== false) {
                return $this->_data[$key];
            }
        }

        function __construct()
        {
        }

        function _destructor()
        {
        } // function destructor

    }    //    class testClass
    This returns identical memory usage to your version using switch. I also tried declaring $_propertyList as a normal class attribute (not static), which used 36 MB more memory than when it's declared as static: it would seem that static attributes aren't replicated across each instance.
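    For completeness, a sketch that also covers __isset() and __unset(), which were mentioned earlier in the thread but not shown. The class name CompactRecord is invented, and this variant swaps array_search() for a static name => index map looked up with isset(); that is a substitution for illustration, not what was measured above:

```php
<?php
// Hypothetical class; property names follow the thread's test script.
class CompactRecord
{
    // Name => index map, declared static so it exists once per class.
    private static $_propertyIndex = array(
        'longVariableName0' => 0,
        'longVariableName1' => 1,
        'longVariableName2' => 2,
    );

    // Per-instance values, keyed by the numeric index from the map.
    private $_data = array();

    public function __set($name, $value)
    {
        // Unknown names are silently ignored, as in the version above.
        if (isset(self::$_propertyIndex[$name])) {
            $this->_data[self::$_propertyIndex[$name]] = $value;
        }
    }

    public function __get($name)
    {
        if (isset(self::$_propertyIndex[$name])) {
            $key = self::$_propertyIndex[$name];
            return isset($this->_data[$key]) ? $this->_data[$key] : null;
        }
        return null;
    }

    public function __isset($name)
    {
        return isset(self::$_propertyIndex[$name])
            && isset($this->_data[self::$_propertyIndex[$name]]);
    }

    public function __unset($name)
    {
        if (isset(self::$_propertyIndex[$name])) {
            unset($this->_data[self::$_propertyIndex[$name]]);
        }
    }
}

$record = new CompactRecord();
$record->longVariableName1 = 'hello';
var_dump(isset($record->longVariableName1)); // bool(true)
unset($record->longVariableName1);
var_dump(isset($record->longVariableName1)); // bool(false)
```

    One difference from real public attributes: like PHP's own isset(), this __isset() reports false for a property that has been set to null.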

  14. #14
    <?php while(!sleep()){code();} G.Schuster's Avatar
    Join Date
    Mar 2007
    Location
    Germany
    Posts
    428
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    [edit] Oh sorry, it's definitely too late... I misunderstood your use of the static property, so my text is irrelevant.

    At least these changes saved you around 73% of memory!
    That's a really, really huge gain, I think.

  15. #15
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by G.Schuster View Post
    [edit] Oh sorry, it's definitely too late... I misunderstood your use of the static property, so my text is irrelevant.
    No apologies necessary... your suggestion has allowed us to make significant improvements to the overall performance (speed and memory footprint) of PHPExcel.
    Compared with my original script (running on the same server)
    Standard long names:
    Call time to instantiate 524288 objects of testClass was 3.1759 seconds
    09:48:34 Peak memory usage: 494.75 MB
    Using magic getters/setters
    Call time to instantiate 524288 objects of testClass was 1.8602 seconds
    09:48:05 Peak memory usage: 150.75 MB
    We can apply this technique to many of the classes within the library, which should allow us to handle workbooks up to 3 times the size that we can now, without any changes being required by developers who are using the library.

    Quote Originally Posted by G.Schuster View Post
    At least these changes saved you around 73% of memory!
    That's a really, really huge gain, I think.
    It's an incredible gain, and we're really grateful to everybody on SitePoint and other forums who has helped us explain the cause of the problem, and provided us with a solution that not only gives us the ability to handle significantly larger volumes of data, but to do so with improved speed as well.

