Using SPL Iterators, Part 1
When I first came across the term iteration and saw the overwhelming list of classes related to it in the SPL, I was taken aback. It seemed maybe iteration was too complex for me to grasp. I soon realized it was just a fancy word for something we programmers do all the time.
If you use PHP, you’ve most likely used arrays. And if you’ve used arrays, then you’ve almost certainly looped through their elements. Look through any code and you’ll almost certainly find a foreach loop. Yes, iteration is just the process of traversing a list of values. An iterator, then, is an object that traverses a list, be it an array, a directory listing, or even a database result set.
In the first part of this two-part series I’ll introduce you to iteration and show how you can take advantage of some of the built-in classes from the Standard PHP Library (SPL). SPL comes with a large number of iterators, and using them can make your code more efficient and, in most cases, more readable.
Why and When to Use SPL Iterators
As you will see, iterating iterator objects is basically the same as iterating arrays, and so many people wonder if it wouldn’t be easier to just stick with arrays in the first place. However, the real benefit of iterators shows through when traversing a large amount of data or anything more complex than a simple array.
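A foreach loop that iterates by value works on a copy of the array passed to it. PHP’s copy-on-write handling often avoids physically duplicating the data, but if the array is modified while it is being traversed, or if you have to build the whole array in memory just to loop over it, the cost can be significant for large data sets. SPL iterators encapsulate the list and expose one element at a time, which can make them considerably more memory-efficient in those cases.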
When creating data providers, iterators are a great construct because they allow you to lazy load your data. Lazy loading here simply means retrieving the required data only if and when it is needed. You can also manipulate (filter, transform, etc.) the data you are working on before giving it to the user.
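As a rough sketch of the lazy filtering idea, the snippet below chains two SPL iterators over an ArrayIterator. The 1,000-element array is only a stand-in for a larger data source (it is built eagerly here), and the $inspected counter is a name made up for this illustration; the point is that the filter callback runs only as values are requested.

```php
<?php
// Count how many elements the filter actually inspects.
$inspected = 0;

// Stand-in data source: 1,000 integers (an assumption for this sketch).
$source = new ArrayIterator(range(1, 1000));

// Filter: keep even numbers only. The callback runs one element at a time,
// and only when the consumer asks for the next value.
$even = new CallbackFilterIterator($source, function ($n) use (&$inspected) {
    $inspected++;
    return $n % 2 === 0;
});

// Consumer: take the first three matches and stop.
$firstThree = iterator_to_array(new LimitIterator($even, 0, 3), false);

echo implode(", ", $firstThree) . PHP_EOL;
echo "Elements inspected: " . $inspected . PHP_EOL;
```

Only a handful of the 1,000 elements are ever passed to the callback, because LimitIterator stops pulling values once it has its three results.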
The decision to use iterators is always at your discretion, however. Iterators have numerous benefits, but in some cases (as with small arrays) they can add unwanted overhead. Your preferred style and their suitability in the given situation are both factors you should consider.
Iterating Arrays
The first iterator I’d like to introduce you to is ArrayIterator. The constructor accepts an array as a parameter, and the class provides methods that can be used to iterate through it. Here’s an example:
<?php
// an array (using the short array syntax introduced in PHP 5.4)
$arr = ["sitepoint", "phpmaster", "buildmobile", "rubysource",
    "designfestival", "cloudspring"];
// create a new ArrayIterator and pass in the array
$iter = new ArrayIterator($arr);
// loop through the object
foreach ($iter as $key => $value) {
    echo $key . ": " . $value . "<br>";
}
The output of the above code is:
0: sitepoint
1: phpmaster
2: buildmobile
3: rubysource
4: designfestival
5: cloudspring
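The iteration methods that foreach relies on can also be called by hand. The following is a minimal sketch using a shortened version of the array; the $lines variable is just a name chosen for this illustration.

```php
<?php
$iter = new ArrayIterator(["sitepoint", "phpmaster", "buildmobile"]);

$lines = [];
// roughly the sequence of calls a foreach loop makes on an iterator
for ($iter->rewind(); $iter->valid(); $iter->next()) {
    $lines[] = $iter->key() . ": " . $iter->current();
}

echo implode(PHP_EOL, $lines) . PHP_EOL;

// ArrayIterator is also Countable
echo count($iter) . " items" . PHP_EOL;
```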
Usually, however, you will use ArrayObject, a class that allows you to work with objects as if they were arrays in certain contexts, instead of using ArrayIterator directly. ArrayObject automatically creates an ArrayIterator for you when you use it in a foreach loop or call ArrayObject::getIterator() directly.
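A short sketch of that relationship, using a shortened list of site names (the $seen variable is a name invented for this example):

```php
<?php
$sites = new ArrayObject(["sitepoint", "phpmaster", "buildmobile"]);

// array-style access works on the object
$sites[] = "rubysource";

// foreach asks the ArrayObject for an iterator behind the scenes
$seen = [];
foreach ($sites as $key => $value) {
    $seen[] = $key . ": " . $value;
}

// the same kind of iterator can be requested explicitly
$iter = $sites->getIterator();

echo get_class($iter) . PHP_EOL;
echo implode(PHP_EOL, $seen) . PHP_EOL;
```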
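Please note that while ArrayObject and ArrayIterator behave like arrays in this context, they are still objects; trying to use built-in array functions like sort() and array_keys() on them will fail. Use the sorting methods the objects provide themselves (such as asort() and ksort()), or convert them to a real array first with getArrayCopy() or iterator_to_array().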
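Here is a small sketch of two ways around that limitation. The keys and values are made up for the example.

```php
<?php
$iter = new ArrayIterator(["b" => 2, "a" => 1, "c" => 3]);

// Option 1: copy the contents into a real array, then use array functions
$keys = array_keys($iter->getArrayCopy());

// Option 2: use the sorting methods the object provides itself
$iter->ksort();
$sortedKeys = array_keys(iterator_to_array($iter));

echo implode(", ", $keys) . PHP_EOL;
echo implode(", ", $sortedKeys) . PHP_EOL;
```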
The use of ArrayIterator is straightforward, but limited to single-dimensional arrays. Sometimes you’ll have a multidimensional array and you’ll want to iterate through the nested arrays recursively. In this case you can use RecursiveArrayIterator.
One common approach is to nest foreach loops or to create a recursive function which checks all items of a multidimensional array. For example:
<?php
// a multidimensional array
$arr = [
    ["sitepoint", "phpmaster"],
    ["buildmobile", "rubysource"],
    ["designfestival", "cloudspring"],
    "not an array"
];
// loop through the array
foreach ($arr as $key => $value) {
    // check for arrays
    if (is_array($value)) {
        foreach ($value as $k => $v) {
            echo $k . ": " . $v . "<br>";
        }
    }
    else {
        echo $key . ": " . $value . "<br>";
    }
}
The output of the above code is:
0: sitepoint
1: phpmaster
0: buildmobile
1: rubysource
0: designfestival
1: cloudspring
3: not an array
A more elegant approach makes use of RecursiveArrayIterator.
<?php
...
$iter = new RecursiveArrayIterator($arr);
// loop through the object
// we need to create a RecursiveIteratorIterator instance
foreach (new RecursiveIteratorIterator($iter) as $key => $value) {
    echo $key . ": " . $value . "<br>";
}
The output is the same as the previous example.
Note that you need to create an instance of RecursiveIteratorIterator and pass it the RecursiveArrayIterator object here; otherwise all you would get are the entries of the root array, with the nested arrays left unexpanded (and a ton of notices, depending on your settings).
RecursiveArrayIterator is the right tool for multidimensional arrays because it can tell you whether the current entry has children (hasChildren()) and hand you an iterator for them (getChildren()), but it leaves the actual descent up to you. RecursiveIteratorIterator is a decorator which does this for you. It takes the RecursiveArrayIterator, iterates over it, and descends into any entry that has children (and so on). Essentially, it “flattens” the RecursiveArrayIterator. You can call RecursiveIteratorIterator::getDepth() to keep track of the current depth of iteration. Be careful with RecursiveArrayIterator and RecursiveIteratorIterator if you want objects returned as values, though; by default objects are treated as having children and will therefore be iterated themselves. Passing the RecursiveArrayIterator::CHILD_ARRAYS_ONLY flag to the RecursiveArrayIterator constructor restricts the descent to arrays.
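The sketch below illustrates both points: getDepth() is used to indent entries by nesting level, and a stdClass object (chosen purely for illustration) shows how objects are handled with and without CHILD_ARRAYS_ONLY. The array is a slightly deeper variation of the one used earlier.

```php
<?php
$arr = [
    ["sitepoint", "phpmaster"],
    ["buildmobile", ["rubysource"]],
    "not an array"
];

$flat = new RecursiveIteratorIterator(new RecursiveArrayIterator($arr));
$lines = [];
foreach ($flat as $key => $value) {
    // indent each entry by two spaces per nesting level
    $lines[] = str_repeat("  ", $flat->getDepth()) . $key . ": " . $value;
}
echo implode(PHP_EOL, $lines) . PHP_EOL;

// an object stored as a value in the array
$site = new stdClass();
$site->name = "sitepoint";

// default behaviour: the object's public properties are iterated
$default = new RecursiveIteratorIterator(new RecursiveArrayIterator([$site]));
$defaultValues = iterator_to_array($default, false);

// CHILD_ARRAYS_ONLY: the object itself is returned as a value
$arraysOnly = new RecursiveIteratorIterator(
    new RecursiveArrayIterator([$site], RecursiveArrayIterator::CHILD_ARRAYS_ONLY)
);
$objectValues = iterator_to_array($arraysOnly, false);

echo gettype($defaultValues[0]) . " vs " . gettype($objectValues[0]) . PHP_EOL;
```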
Iterating Directory Listings
You will undoubtedly need to traverse a directory and its files at some point, and there are various ways of accomplishing this with PHP’s built-in functions, such as scandir() or glob(). But you can also use DirectoryIterator. In its simplest form, DirectoryIterator is quite powerful, but it can also be subclassed and enhanced.
Here’s an example of iterating a directory with DirectoryIterator:
<?php
// create new DirectoryIterator object
$dir = new DirectoryIterator("/my/directory/path");
// loop through the directory listing
foreach ($dir as $item) {
    echo $item . "<br>";
}
The output obviously will depend on the path you specify and what the directory’s contents are. For instance:
.
..
api
index.php
lib
workspace
Don’t forget that with DirectoryIterator, as well as many of the other SPL iterators, you have the added benefit of using exceptions to handle any errors.
<?php
try {
    $dir = new DirectoryIterator("/non/existent/path");
    foreach ($dir as $item) {
        echo $item . "<br>";
    }
}
catch (Exception $e) {
    echo get_class($e) . ": " . $e->getMessage();
}
The output resembles the following; the exact message depends on your platform and PHP version:
UnexpectedValueException: DirectoryIterator::__construct(/non/existent/path,/non/existent/path): The system cannot find the file specified. (code: 2)
With a host of other methods like DirectoryIterator::isDot(), DirectoryIterator::getType() and DirectoryIterator::getSize(), pretty much all of your basic directory information needs are covered. You can even combine DirectoryIterator with FilterIterator or RegexIterator to return files matching specific criteria. For example:
<?php
class FileExtensionFilter extends FilterIterator
{
    // whitelist of file extensions
    protected $ext = ["php", "txt"];

    // an abstract method which must be implemented in subclass
    // (the bool return type needs PHP 7+ and matches the declaration in PHP 8.1+)
    public function accept(): bool {
        // current() is the DirectoryIterator entry being examined
        return in_array($this->current()->getExtension(), $this->ext);
    }
}
// create a new iterator
$dir = new FileExtensionFilter(new DirectoryIterator("./"));
...
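For comparison, here is a sketch of the informational methods and of the RegexIterator route. It builds a throwaway directory in the system temp folder so it can run anywhere; the directory name, file names and file contents are all made up for this illustration.

```php
<?php
// Build a throwaway directory so the example is self-contained
// (all names below are assumptions for this sketch).
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid("spl_demo_");
mkdir($dir);
$names = ["index.php", "notes.txt", "logo.png"];
foreach ($names as $name) {
    file_put_contents($dir . DIRECTORY_SEPARATOR . $name, "demo");
}

// Inspect each entry, skipping "." and ".."
$info = [];
foreach (new DirectoryIterator($dir) as $item) {
    if ($item->isDot()) {
        continue;
    }
    $info[$item->getFilename()] = $item->getType() . ", " . $item->getSize() . " bytes";
}
ksort($info);

// Keep only entries whose name ends in .php or .txt.
// RegexIterator matches the pattern against each entry cast to a string,
// which for DirectoryIterator is the file name.
$matches = [];
$filtered = new RegexIterator(new DirectoryIterator($dir), '/\.(php|txt)$/i');
foreach ($filtered as $item) {
    $matches[] = $item->getFilename();
}
sort($matches);

foreach ($info as $name => $details) {
    echo $name . ": " . $details . PHP_EOL;
}
echo "Matched: " . implode(", ", $matches) . PHP_EOL;

// clean up the throwaway directory
foreach ($names as $name) {
    unlink($dir . DIRECTORY_SEPARATOR . $name);
}
rmdir($dir);
```

RegexIterator is convenient when the criterion is a pattern on the name; a FilterIterator subclass is the better fit when the test needs arbitrary logic, such as checking sizes or modification times.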
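SPL also provides RecursiveDirectoryIterator, which can be used in the same way as RecursiveArrayIterator. A function that traverses directories recursively will usually be littered with conditional checks for valid directories and files, and RecursiveDirectoryIterator can do much of that work for you, resulting in cleaner code. There is one caveat, however. When it is wrapped in a RecursiveIteratorIterator in the default mode (RecursiveIteratorIterator::LEAVES_ONLY), directories themselves are not returned, only the entries inside them; a directory tree that contains many subdirectories but no files will therefore yield no files at all (much like how Git ignores empty directories). Pass RecursiveIteratorIterator::SELF_FIRST or RecursiveIteratorIterator::CHILD_FIRST as the second constructor argument if you need the directories as well.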
<?php
// create new RecursiveDirectoryIterator object
$iter = new RecursiveDirectoryIterator("/my/directory/path");
// loop through the directory listing
// we need to create a RecursiveIteratorIterator instance
foreach (new RecursiveIteratorIterator($iter) as $item) {
    echo $item . "<br>";
}
My output resembles:
./api/.htaccess
./api/index.php
./index.php
...
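To see how the iteration mode affects whether directories show up, the following sketch builds a small throwaway tree and lists it twice. The tree layout, the listPaths() helper and the variable names are all hypothetical, introduced only for this example; FilesystemIterator::SKIP_DOTS is passed so that the "." and ".." entries are left out.

```php
<?php
// Throwaway tree (all names are made up for illustration):
//   <root>/api/index.php
//   <root>/empty/
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid("spl_tree_");
mkdir($root . DIRECTORY_SEPARATOR . "api", 0777, true);
mkdir($root . DIRECTORY_SEPARATOR . "empty");
file_put_contents(
    $root . DIRECTORY_SEPARATOR . "api" . DIRECTORY_SEPARATOR . "index.php",
    "demo"
);

// Hypothetical helper: collect paths relative to $root, using forward slashes
function listPaths($root, $mode)
{
    $dirIter = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);
    $paths = [];
    foreach (new RecursiveIteratorIterator($dirIter, $mode) as $item) {
        $relative = substr($item->getPathname(), strlen($root) + 1);
        $paths[] = str_replace("\\", "/", $relative);
    }
    sort($paths);
    return $paths;
}

// default mode: only leaves (files) are returned
$leavesOnly = listPaths($root, RecursiveIteratorIterator::LEAVES_ONLY);
// SELF_FIRST: each directory is returned before its contents
$selfFirst = listPaths($root, RecursiveIteratorIterator::SELF_FIRST);

echo "LEAVES_ONLY: " . implode(", ", $leavesOnly) . PHP_EOL;
echo "SELF_FIRST: " . implode(", ", $selfFirst) . PHP_EOL;

// Clean up: CHILD_FIRST visits a directory's contents before the directory itself
$cleanup = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
    RecursiveIteratorIterator::CHILD_FIRST
);
foreach ($cleanup as $item) {
    if ($item->isDir()) {
        rmdir($item->getPathname());
    } else {
        unlink($item->getPathname());
    }
}
rmdir($root);
```

The empty directory appears only in the SELF_FIRST listing, which is the behaviour to keep in mind when a directory seems to be missing from the results.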
Summary
Hopefully you now realize that iteration isn’t the complex beast I first thought it was, and that it’s something we do every day as programmers. In this article I’ve introduced iteration and some of the classes that SPL provides to make iterating easier and more robust. Of course I’ve only dealt with a very small sampling of the available classes; SPL provides many more and I urge you to take a look at them.
SPL is a “standard” library. Sometimes you may find the classes too general and they may not always do what you need. In such cases you can easily extend the classes to add your own functionality or tweak existing functionality as needed. In the next part of this series I’ll show you how to use SPL interfaces to make your very own custom classes that can be traversed and accessed like arrays.