Object equivalence after serialize.
I ran into a weird wrinkle: ArrayObject and array() behave differently when serialized. Below is the full test script:
$a = array(1,2,3,4,5);
next($a); next($a); // advance the internal pointer to the third element
echo current($a)."\n"; // returned 3, as expected.
$b = unserialize(serialize($a));
echo current($b)."\n"; // returned 1 - unexpected.
echo $a === $b ? 'true' : 'false'; // returns true, apparently the pointer isn't considered when arrays are compared.
$c = new ArrayObject(array(1,2,3,4,5));
next($c); next($c); // advance the pointer, as with the plain array
echo current($c)."\n"; // returned 3, again as expected.
$d = unserialize(serialize($c));
echo current($c)."\n"; // returns 3, not 1 as with a true array.
echo $c == $d ? 'true' : 'false'; // returns true. The objects are equivalent, but the pointer has been kept.
I have a class in my program called "ReadOnlyArray" which is the foundation of several classes that implement array-like behavior. I've used this instead of ArrayObject because its storage is protected, not private, so the child classes are free to work with it as needed. When I don't need that, I've used ArrayObject on the presumption that it will be faster.
But this is troublesome. As we can see from the test, if you serialize then unserialize an array, its internal pointer resets. Meanwhile, if you serialize then unserialize an ArrayObject, the internal pointer stays.
From some recent coding I know that if an object forgets its pointer across serialization, the original and the copy are not going to pass an equivalence check. Comparing arrays, by contrast, doesn't consider the internal pointer.
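To make the difference concrete, here is a minimal sketch of the two comparisons. PointerDemo is a made-up class for illustration only (it is not my ReadOnlyArray); it just assumes the iteration position lives in a protected property, which is enough for == to see it.

```php
<?php
// Arrays: the internal pointer plays no part in the comparison.
$a = array(1, 2, 3, 4, 5);
$b = $a;      // a copy whose pointer is still at the first element
next($a);
next($a);     // $a now points at 3
$arraysEqual = ($a === $b);   // true

// Hypothetical class: the position is ordinary object state.
class PointerDemo
{
    protected $storage;
    protected $index = 0;     // iteration position

    public function __construct(array $storage)
    {
        $this->storage = $storage;
    }

    public function advance()
    {
        $this->index++;
    }
}

$x = new PointerDemo(array(1, 2, 3, 4, 5));
$y = new PointerDemo(array(1, 2, 3, 4, 5));
$equalBefore = ($x == $y);    // true: same class, same property values
$x->advance();
$equalAfter = ($x == $y);     // false: the protected $index now differs
```

The == operator on objects compares every property, whatever its visibility, so any iteration state kept in a property leaks into the equivalence test.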
All this leads to a question I need help with. Which is the best thing for me to do with my own framework's arrays? Which is most troublesome?
1. Calling serialize resets the pointer. THE GOOD: The object will behave like an array(). THE POTENTIALLY DISASTROUS: Asking an object to serialize itself in the middle of a foreach loop will make the loop eternal. This can be avoided though, and at the moment I can't see a reason why you'd want to call serialize($this) during a loop over the object's properties.
2. Store the pointer. THE GOOD: Object equivalence is maintained before and after serialize. THE SLIGHTLY BAD: Remembering the pointer can cause unexpected behavior when the object is iterated using the older while/next() approach. Foreach should be fine because it calls reset() at the start.
3. Discard the pointer. THE GOOD: This is array()'s behavior. THE BAD: Object equivalence is potentially lost. The only way to secure an object equivalence test is to reset both objects before testing.
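For what it's worth, here is a rough sketch of what the "discard the pointer" option could look like. ReadOnlyArraySketch and its equals() method are hypothetical names invented for this example, not my actual class. It uses __sleep to leave the position out of the serialized data; on PHP 7.4 and later, __serialize would be another way to do the same thing.

```php
<?php
// Hypothetical stand-in for an array-like class; all names are illustrative.
class ReadOnlyArraySketch
{
    protected $storage = array();
    protected $index = 0;     // iteration position

    public function __construct(array $storage)
    {
        $this->storage = array_values($storage);
    }

    public function current()
    {
        return isset($this->storage[$this->index])
            ? $this->storage[$this->index]
            : null;
    }

    public function next()
    {
        $this->index++;
    }

    // Serialize only the data. On unserialize, $index falls back to
    // its declared default of 0, so the position is discarded.
    public function __sleep()
    {
        return array('storage');
    }

    // Compare contents only, ignoring the iteration state.
    public function equals(ReadOnlyArraySketch $other)
    {
        return $this->storage === $other->storage;
    }
}

$c = new ReadOnlyArraySketch(array(1, 2, 3, 4, 5));
$c->next();
$c->next();
$d = unserialize(serialize($c));

$currentC     = $c->current();   // 3: the original keeps its position
$currentD     = $d->current();   // 1: the copy starts over, like array()
$looseEqual   = ($c == $d);      // false: $index differs
$contentEqual = $c->equals($d);  // true: the data is identical
```

A dedicated equals() method sidesteps the need to reset both objects before comparing, at the cost of callers having to use it instead of ==.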
The PHP team chose #2 for ArrayObject. I'm leaning towards #3 because holding the pointer works fine for straightforward unit tests, but it doesn't completely cover the problem of object equivalence being affected by the object's iteration state. I think the problem of having an array object behave differently from array() is larger than the problem of the equivalence test.
The solution, in my opinion, lies in PHP itself. Simply put: protected and private properties should not be part of equivalence tests unless exposed through __get. Until that happens, though, it's a choice of evils. What would you choose and why?
And my mind isn't made up so by all means let me know of other concerns.