Using SPL Iterators, Part 1

    Stefan Froelich

    When I first came across the term iteration and saw the overwhelming list of classes related to it in the SPL, I was taken aback. It seemed maybe iteration was too complex for me to grasp. I soon realized it was just a fancy word for something we programmers do all the time.

    If you use PHP, you’ve most likely used arrays. And if you’ve used arrays, then you’ve most definitely looped through their elements. Look through any code and almost certainly you’ll find a foreach loop. Yes, iteration is just the process of traversing a list of values. An iterator, then, is an object that traverses a list, be it an array, a directory listing, or even a database result set.

    In the first part of this two-part series I’ll introduce you to iteration and how you can take advantage of some of the built-in classes from the Standard PHP Library (SPL). SPL comes with a large number of iterators, and using them in your code can make your code more efficient and in most cases, more readable.

    Why and When to Use SPL Iterators

    As you will see, iterating iterator objects is basically the same as iterating arrays, and so many people wonder if it wouldn’t be easier to just stick with using arrays in the first place. However, the real benefit of iterators shows through when traversing a large amount of data or anything more complex than a simple array.

    When you iterate by value, the foreach loop works on a copy of the array passed to it. PHP's copy-on-write behavior usually keeps that copy cheap, but if you are processing a large amount of data, having a large array held in memory in full, and possibly duplicated, might be undesirable. SPL iterators encapsulate the list and expose one element at a time, which can make them far more efficient when the underlying data does not have to be loaded all at once.

    When creating data providers, iterators are a great construct as they allow you to lazy load your data. Lazy loading here is simply retrieving the required data only if and when it is needed. You can also manipulate (filter, transform, etc.) the data you are working on before giving it to the user.
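
    As a minimal sketch of lazy loading, the hypothetical LazyRecords class below (an illustrative name, not part of SPL) implements the Iterator interface and only loads a record when the loop asks for it. The loader callback stands in for an expensive lookup such as a database query, and the typed method signatures assume PHP 8.

    ```php
    <?php
    // Hypothetical data provider; LazyRecords is not an SPL class.
    class LazyRecords implements Iterator
    {
        private $ids;
        private $loader;
        private $position = 0;
        public $loadCount = 0; // how many records have actually been loaded

        public function __construct(array $ids, callable $loader) {
            $this->ids = array_values($ids);
            $this->loader = $loader;
        }

        public function rewind(): void { $this->position = 0; }
        public function valid(): bool { return isset($this->ids[$this->position]); }
        public function next(): void { $this->position++; }
        public function key(): mixed { return $this->ids[$this->position]; }

        // the record is only loaded here, when the loop reaches it
        public function current(): mixed {
            $this->loadCount++;
            return call_user_func($this->loader, $this->ids[$this->position]);
        }
    }

    $records = new LazyRecords([10, 20, 30], function ($id) {
        return "record #" . $id; // stand-in for an expensive lookup
    });

    foreach ($records as $id => $record) {
        echo $id . ": " . $record . "<br>";
        if ($id === 20) {
            break; // record 30 is never loaded
        }
    }
    ```

    Because the loop stops early, the loader runs only for the first two records; an array-based provider would have had to load all three up front.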

    Whether to use iterators is ultimately up to you. They have numerous benefits, but in some cases (as with small arrays) they can add unwanted overhead. Your preferred style and their suitability in the given situation are both factors you should consider.

    Iterating Arrays

    This first iterator I’d like to introduce you to is ArrayIterator. The constructor accepts an array for a parameter and provides methods that can be used to iterate through it. Here’s an example:

    <?php
    // an array (using PHP 5.4's new shorthand notation)
    $arr = ["sitepoint", "phpmaster", "buildmobile", "rubysource",
        "designfestival", "cloudspring"];
    
    // create a new ArrayIterator and pass in the array
    $iter = new ArrayIterator($arr);
    
    // loop through the object
    foreach ($iter as $key => $value) {
        echo $key . ": " . $value . "<br>";
    }

    The output of the above code is:

    0: sitepoint
    1: phpmaster
    2: buildmobile
    3: rubysource
    4: designfestival
    5: cloudspring

    Usually, however, you will use ArrayObject instead of using ArrayIterator directly. ArrayObject is a class that allows you to work with objects as if they were arrays in certain contexts, and it automatically creates an ArrayIterator for you when you use a foreach loop or call ArrayObject::getIterator() directly.

    Please note that while ArrayObject and ArrayIterator behave like arrays in this context, they are still objects; trying to use built-in array functions like sort() and array_keys() on them will fail dismally.
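
    The sketch below shows one way to work around this, using the same sample data as above: rely on the sorting methods that ArrayObject provides itself, or copy the contents into a real array with iterator_to_array() before calling the array functions.

    ```php
    <?php
    $sites = new ArrayObject(["sitepoint", "phpmaster", "buildmobile"]);

    // array-style access works on the object
    $sites[] = "rubysource";
    echo count($sites) . "<br>"; // 4

    // foreach asks ArrayObject for an ArrayIterator behind the scenes
    $iter = $sites->getIterator();
    echo get_class($iter) . "<br>"; // ArrayIterator

    // use the object's own sorting method instead of sort()
    $sites->asort();

    // or copy the contents into a real array for the array functions
    $plain = iterator_to_array($sites);
    print_r(array_keys($plain));
    ```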

    The use of ArrayIterator is straightforward, but limited to single-dimensional arrays. Sometimes you’ll have a multidimensional array and you’ll want to iterate through the nested arrays recursively. In this case you can use RecursiveArrayIterator.

    One common approach is to nest foreach loops or to create a recursive function which checks all items of a multidimensional array. For example:

    <?php
    // a multidimensional array
    $arr = [
        ["sitepoint", "phpmaster"],
        ["buildmobile", "rubysource"],
        ["designfestival", "cloudspring"],
        "not an array"
    ];
    
    // loop through the array
    foreach ($arr as $key => $value) {
        // check for arrays
        if (is_array($value)) {
            foreach ($value as $k => $v) {
                echo $k . ": " . $v . "<br>";
            }
        }
        else {
            echo $key . ": " . $value . "<br>";
        }
    }

    The output of the above code is:

    0: sitepoint
    1: phpmaster
    0: buildmobile
    1: rubysource
    0: designfestival
    1: cloudspring
    3: not an array

    A more elegant approach makes use of RecursiveArrayIterator.

    <?php
    ...
    $iter = new RecursiveArrayIterator($arr);
    
    // loop through the object
    // we need to create a RecursiveIteratorIterator instance
    foreach(new RecursiveIteratorIterator($iter) as $key => $value) {
        echo $key . ": " . $value . "<br>";
    }

    The output is the same as the previous example.

    Note that you need to create an instance of RecursiveIteratorIterator and pass it the RecursiveArrayIterator object here or else all you would get would be the values in the root array (and a ton of notices depending on your settings).

    RecursiveArrayIterator is the right tool for multidimensional arrays because it can tell you whether the current entry has children and give you an iterator for them, but it leaves the actual descent up to you. RecursiveIteratorIterator is a decorator which does this for you. It takes the RecursiveArrayIterator, iterates over it, and descends into any entry that has children (and so on). Essentially, it “flattens” the RecursiveArrayIterator. You can call RecursiveIteratorIterator::getDepth() to keep track of the current depth of iteration. Be careful with RecursiveArrayIterator and RecursiveIteratorIterator, though, if your array contains objects that you want returned as values; by default objects are treated as having children and will therefore be iterated as well.
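
    Here is a small sketch of both points, assuming the RecursiveArrayIterator::CHILD_ARRAYS_ONLY flag behaves as described in the PHP manual (only arrays, not objects, are treated as having children). The sample object and the depth/key label format are made up for the illustration.

    ```php
    <?php
    $arr = [
        ["sitepoint", "phpmaster"],
        "not an array",
        (object) ["name" => "cloudspring"], // an object we want back as one value
    ];

    // CHILD_ARRAYS_ONLY tells RecursiveArrayIterator to descend into arrays only
    $iter = new RecursiveIteratorIterator(
        new RecursiveArrayIterator($arr, RecursiveArrayIterator::CHILD_ARRAYS_ONLY)
    );

    $seen = [];
    foreach ($iter as $key => $value) {
        // objects are shown by class name so they can be echoed safely
        $label = is_object($value) ? get_class($value) : $value;
        $seen[] = $iter->getDepth() . "/" . $key . ": " . $label;
    }

    echo implode("<br>", $seen);
    ```

    Each line starts with the depth at which the value was found, so entries from the nested array are reported at depth 1 and the root entries at depth 0.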

    Iterating Directory Listings

    Sooner or later you will need to traverse a directory and its files, and PHP already provides built-in functions for this, such as scandir() and glob(). But you can also use DirectoryIterator. Even in its simplest form DirectoryIterator is quite powerful, and it can also be subclassed and enhanced.

    Here’s an example of iterating a directory with DirectoryIterator:

    <?php
    // create new DirectoryIterator object
    $dir = new DirectoryIterator("/my/directory/path");
    
    // loop through the directory listing
    foreach ($dir as $item) {
        echo $item . "<br>";
    }

    The output obviously will depend on the path you specify and what the directory’s contents are. For instance:

    .
    ..
    api
    index.php
    lib
    workspace

    Don’t forget that with DirectoryIterator, as well as many of the other SPL iterators, you have the added benefit of using exceptions to handle any errors.

    <?php
    try {
        $dir = new DirectoryIterator("/non/existent/path");
        foreach ($dir as $item) {
            echo $item . "<br>";
        }
    }
    catch (Exception $e) {
        echo get_class($e) . ": " . $e->getMessage();
    }
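
    The output will resemble the following (the exact message depends on your platform and PHP version):

    UnexpectedValueException: DirectoryIterator::__construct(/non/existent/path,/non/existent/path): The system cannot find the file specified. (code: 2)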

    With a host of other methods like DirectoryIterator::isDot(), DirectoryIterator::getType() and DirectoryIterator::getSize(), pretty much all of your basic directory information needs are covered. You can even combine DirectoryIterator with FilterIterator or RegexIterator to return files matching specific criteria. For example:

    <?php
    class FileExtensionFilter extends FilterIterator
    {
        // whitelist of file extensions
        protected $ext = ["php", "txt"];
    
        // an abstract method which must be implemented in subclass
        public function accept(): bool {
            // current() is the DirectoryIterator entry being examined
            return in_array($this->current()->getExtension(), $this->ext, true);
        }
    }
    
    //create a new iterator
    $dir = new FileExtensionFilter(new DirectoryIterator("./"));
    ...
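
    The inspection methods and the RegexIterator combination mentioned above can be sketched as follows. To keep the example self-contained it builds a throwaway directory first; the directory and file names are made up, and the pattern is just one possible way to match a file extension.

    ```php
    <?php
    // build a throwaway directory (all names here are made up for this sketch)
    $path = sys_get_temp_dir() . "/spl_demo_" . uniqid();
    mkdir($path);
    mkdir($path . "/lib");
    file_put_contents($path . "/index.php", "<?php");
    file_put_contents($path . "/notes.txt", "hello world");

    $sizes = [];
    foreach (new DirectoryIterator($path) as $item) {
        // skip the "." and ".." entries
        if ($item->isDot()) {
            continue;
        }
        echo $item->getFilename() . " (" . $item->getType() . ")<br>";
        if ($item->isFile()) {
            $sizes[$item->getFilename()] = $item->getSize();
        }
    }
    ksort($sizes);

    // RegexIterator matches the pattern against each entry's file name
    $phpFiles = [];
    foreach (new RegexIterator(new DirectoryIterator($path), '/\.php$/') as $item) {
        $phpFiles[] = $item->getFilename();
    }

    // clean up the throwaway directory
    unlink($path . "/index.php");
    unlink($path . "/notes.txt");
    rmdir($path . "/lib");
    rmdir($path);
    ```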

    SPL also provides RecursiveDirectoryIterator which can be used in the same way as RecursiveArrayIterator. A function that traverses directories recursively will usually be littered with conditional checks for valid directories and files, and RecursiveDirectoryIterator can do much of the work for you, resulting in cleaner code. There is one caveat, however. In its default mode (RecursiveIteratorIterator::LEAVES_ONLY), RecursiveIteratorIterator returns only leaf entries, so the directories themselves are not listed; if a directory contains many subdirectories but no files, you will not get any files back for it (much like how Git behaves). Pass RecursiveIteratorIterator::SELF_FIRST as the second constructor argument if you need the directories as well.

    <?php
    // create new RecursiveDirectoryIterator object
    $iter = new RecursiveDirectoryIterator("/my/directory/path");
    
    // loop through the directory listing
    // we need to create a RecursiveIteratorIterator instance
    foreach (new RecursiveIteratorIterator($iter) as $item) {
        echo $item . "<br>";
    }

    My output resembles:

    ./api/.htaccess
    ./api/index.php
    ./index.php
    ...
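
    To see the caveat about empty directories in action, here is a rough sketch that compares the default mode with RecursiveIteratorIterator::SELF_FIRST. It builds a throwaway tree with made-up names and passes FilesystemIterator::SKIP_DOTS so that the "." and ".." entries are left out.

    ```php
    <?php
    // throwaway tree: one file plus an empty subdirectory (names are made up)
    $root = sys_get_temp_dir() . "/spl_tree_" . uniqid();
    mkdir($root . "/empty", 0777, true);
    file_put_contents($root . "/index.php", "<?php");

    $flags = FilesystemIterator::SKIP_DOTS;

    // default mode (LEAVES_ONLY): the empty directory is not listed
    $leaves = [];
    $iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, $flags)
    );
    foreach ($iter as $item) {
        $leaves[] = $item->getFilename();
    }

    // SELF_FIRST: each directory is returned before its contents
    $all = [];
    $iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, $flags),
        RecursiveIteratorIterator::SELF_FIRST
    );
    foreach ($iter as $item) {
        $all[] = $item->getFilename();
    }
    sort($all);

    // clean up the throwaway tree
    unlink($root . "/index.php");
    rmdir($root . "/empty");
    rmdir($root);
    ```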

    Summary

    Hopefully you now realize that iteration isn’t a complex beast like I first thought, and that it’s something we do every day as programmers. In this article I’ve introduced iteration and some of the classes that SPL provides to make iterating easier and more robust. Of course I’ve only dealt with a very small sampling of the available classes; SPL provides many, many more and I urge you to take a look at them.

    SPL is a “standard” library. Sometimes you may find the classes too general and they may not always do what you need. In such cases you can easily extend the classes to add your own functionality or tweak existing functionality as needed. In the next part of this series I’ll show you how to use SPL interfaces to make your very own custom classes that can be traversed and accessed like arrays.
