Using YAML in Your PHP Projects

Test fixtures, configuration files, and log files all need to be both human and machine readable. YAML (YAML Ain’t Markup Language) is a less-verbose data serialization format than XML and has become a popular format among software developers mainly because it is human-readable. YAML files are simply text files that contain data written according to YAML syntax rules and usually have a .yml file extension. In this article, you will learn the basics of YAML and how you can integrate a PHP YAML parser into your projects.

Understanding YAML Syntax

YAML supports advanced features like references and custom data types, but as a PHP developer, most of the time you’ll be interested in how YAML represents enumerated arrays (sequences in YAML terminology) and associative arrays (mappings).

The following is how to represent an enumerated array in YAML:

- 2
- "William O'Neil"
- false

Each element of the array is presented after a hyphen and a space. Its syntax for representing values is similar to that of PHP (quoting strings, etc.).

The above is equivalent to the following PHP:

<?php
array(2, "William O'Neil", false);

Generally, each element will appear on its own line in YAML, but enumerated arrays can also be expressed on a single line using brackets:

[ 2, "William O'Neil", false ]

The following code shows how to represent an associative array in YAML:

id:       2
name:     "William O'Neil"
isActive: false

First the element’s key is stated followed by a colon and one or more spaces, and then the value is stated. Having just one space after the colon is sufficient, but you can use more spaces for the sake of better readability if you like.

The equivalent PHP array of the above YAML is:

<?php
array("id" => 2, "name" => "William O'Neil", "isActive" => false);

And similar to enumerated arrays, you can express associative arrays on a single line using braces:

{ id: 2, name: "William O'Neil", isActive: false }

With one or more spaces for indentation, you can represent a multi-dimensional array like so:

author:
  0: { id: 1, name: "Brad Taylor", isActive: true }
  1: { id: 2, name: "William O'Neil", isActive: false }

Note that although the second-level arrays are enumerated arrays, I have used the mapping syntax (colons) rather than the sequence syntax (hyphens) for clarity.

The above YAML block is equivalent to the following PHP:

<?php
array(
    "author" => array(
        0 => array("id" => 1, "name" => "Brad Taylor", "isActive" => true),
        1 => array("id" => 2, "name" => "William O'Neil", "isActive" => false)
    )
);
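
On the PHP side, the two notations describe the same array, because PHP does not distinguish an array written with explicit keys 0 and 1 from one written as a plain list. The short check below is only a sketch of that point; it uses no YAML parser, and the variable names are made up for the illustration.

```php
<?php
// Explicit integer keys, mirroring the "0:" / "1:" mapping notation.
$withKeys = array(
    0 => array("id" => 1, "name" => "Brad Taylor", "isActive" => true),
    1 => array("id" => 2, "name" => "William O'Neil", "isActive" => false),
);

// Plain list, mirroring the hyphen (sequence) notation.
$asList = array(
    array("id" => 1, "name" => "Brad Taylor", "isActive" => true),
    array("id" => 2, "name" => "William O'Neil", "isActive" => false),
);

// PHP assigns the keys 0 and 1 automatically, so both arrays are identical.
var_dump($withKeys === $asList);

// Nested values are reached by chaining keys.
echo $withKeys[1]["name"], PHP_EOL;
```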

YAML also allows representing a collection of data elements in the same document without requiring a root node. The following example is the contents of article.yml which shows several multi-dimensional arrays in the same file.

author:
  0: { id: 1, name: "Brad Taylor", isActive: true }
  1: { id: 2, name: "William O'Neil", isActive: false }

category:
  0: { id: 1, name: "PHP" }
  1: { id: 2, name: "YAML" }
  2: { id: 3, name: "XML" }

article:
  0:
    id:      1
    title:   "How to Use YAML in Your Next PHP Project"
    content: >
               YAML is a less-verbose data serialization format.
               It stands for "YAML Ain't Markup Language".
               YAML has been a popular data serialization format among
               software developers mainly because it's human-readable.
    author:  1
    status:  2

articleCategory:
  0: { articleId: 1, categoryId: 1 }
  1: { articleId: 1, categoryId: 2 }

While most of YAML’s syntax is intuitive and easy to remember, there is one important rule to which you should pay attention: indentation must be done with one or more spaces; tabs are not allowed. You can configure your IDE to insert spaces instead of tabs when you press the Tab key, a common configuration among software developers that ensures code is properly indented and displayed when it’s viewed in other editors.
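
If you want to catch stray tabs before a file ever reaches the parser, a rough check along these lines may help. It is only a sketch: findTabIndentedLines() is a hypothetical helper, not part of any YAML library, and it looks only at leading whitespace, so it can also flag a tab at the start of a multi-line text block.

```php
<?php
/**
 * Returns the 1-based numbers of lines whose leading whitespace contains a tab.
 * Hypothetical helper for illustration only.
 */
function findTabIndentedLines($yamlText)
{
    $offending = array();
    $lines = preg_split('/\r\n|\r|\n/', $yamlText);

    foreach ($lines as $index => $line) {
        // Optional leading spaces followed by a tab means tab indentation.
        if (preg_match('/^[ ]*\t/', $line)) {
            $offending[] = $index + 1;
        }
    }

    return $offending;
}

// The third line of this sample is indented with a tab.
$sample = "author:\n  id: 1\n\tname: \"Brad Taylor\"\n";
print_r(findTabIndentedLines($sample));
```

Tabs that appear after the first non-space character of a line are ignored, since only indentation is being checked.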

You can learn the more complex features and syntax that YAML supports by reading the official documentation, the Symfony reference, or Wikipedia.

YAML Need Not Be an Alternative to XML

If you’re researching YAML with your favorite search engine, you will undoubtedly find discussions on “YAML vs XML”, and when you first experience YAML, you will naturally tend to prefer it over XML, mainly because it’s easier to read and write. However, YAML should be another tool in your developer arsenal and need not be an alternative to XML. Here are some advantages of YAML and XML.

Advantages of YAML

  • Less-verbose, easy to compose, and more readable
  • Need not have tree structure with a single parent node

Advantages of XML

  • More built-in PHP support compared to YAML
  • XML has been the de facto standard for inter-application communication and is widely recognized
  • XML tags can have attributes providing more information about the enclosed data

Despite its verbosity, XML is more readable and maintainable than YAML’s indentation-based hierarchy when elements are deeply nested.

Considering the advantages of both languages, YAML seems to be more suitable for collections of different data sets and when humans are also among the data consumers.

Choosing a PHP YAML Parser

A YAML parser is expected to provide two functions: some sort of load function that converts YAML into an array, and a dump function that converts an array into YAML.

Currently PHP’s YAML parser is available as a PECL extension and is not bundled with PHP. Alternatively, there are parsers written in pure PHP which would be slightly slower compared to the PECL extension.
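
As a rough illustration of the load and dump pair, the sketch below uses the PECL extension’s yaml_parse() and yaml_emit() functions, and only when the extension is actually loaded. The sample string and variable names are assumptions made for the example.

```php
<?php
$yaml = "id: 2\nname: \"William O'Neil\"\nisActive: false\n";

if (function_exists("yaml_parse") && function_exists("yaml_emit")) {
    // Load: YAML text in, PHP array out.
    $data = yaml_parse($yaml);
    var_dump($data["isActive"]);

    // Dump: PHP array in, YAML text out.
    echo yaml_emit($data);
} else {
    echo "The PECL yaml extension is not loaded; nothing to demonstrate.", PHP_EOL;
}
```

The pure PHP parsers expose the same two operations under their own class and method names.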

The following are a few YAML parsers available for PHP:

PECL extension

  • Is not bundled with PHP
  • Will need root access to the server to install

Symfony 1.4 YAML Component

  • Implemented in PHP
  • Will work in PHP 5.2.4+
  • Need to extract from Symfony framework

Symfony 2 YAML Component

  • Implemented in PHP
  • Will work in PHP 5.3.2+

SPYC

  • Implemented in PHP
  • Will work in PHP 5+

My preferred choice is the Symfony 1.4 YAML Component because of its portability (it works with PHP 5.2.4+ versions) and maturity (Symfony 1.4 is a well established PHP framework). Once you’ve extracted the YAML component from the Symfony archive, YAML classes are available under lib/yaml. The static methods load() and dump() are available with the sfYaml class.

Editor Note Oct 28 2012: The accompanying code on GitHub has been updated to use Composer for obtaining the PHPUnit and Symfony 1.4 YAML Component dependencies.

Integrating a PHP YAML Parser into Your Project

Whenever you integrate a third-party class or library into your PHP project, it’s good practice to create a wrapper and a test suite. This lets you later replace the third-party library with minimal changes to your project code (project code should refer only to the wrapper) and with the assurance that the change won’t break any functionality (the test suite will tell you).

The following is the test case (YamlParserTest.php) I created for my wrapper class (YamlParser.php). You need some knowledge of PHPUnit to run and maintain the test case. You can add more tests if you’d like, for example for wrong file names and file extensions other than .yml, and other tests based on the scenarios you encounter in your project.

<?php
require_once "YamlParser.php";

class YamlParserTest extends PHPUnit_Framework_TestCase
{
    private $yamlParser;

    public function setUp() {
        $this->yamlParser = new YamlParser();
    }

    public function testMainArrayKeys() {
        $parsedYaml    = $this->yamlParser->load("article.yml");
        $mainArrayKeys = array_keys($parsedYaml);
        $expectedKeys  = array("author", "category", "article", "articleCategory");

        $this->assertEquals($expectedKeys, $mainArrayKeys);

    }

    public function testSecondLevelElement() {
        $parsedYaml    = $this->yamlParser->load("article.yml");
        $actualArticle = $parsedYaml["article"][0];
        $title         = "How to Use YAML in Your Next PHP Project";
        $content = "YAML is a less-verbose data serialization format. "
                 . "It stands for \"YAML Ain't Markup Language\". "
                 . "YAML has been a popular data serialization format among "
                 . "software developers mainly because it's human-readable.\n";

        $expectedArticle = array("id" => 1, "title" => $title, "content" => $content, "author" => 1, "status" => 2);

        $this->assertEquals($expectedArticle, $actualArticle);
    }

    /**
     * @expectedException YamlParserException
     */
    public function testExceptionForWrongSyntax() {
        $this->yamlParser->load("wrong-syntax.yml");
    }
}

And here is the wrapper class:

<?php
require_once "yaml/sfYaml.php";

class YamlParser
{
    public function load($filePath) {
        try {
            return sfYaml::load($filePath);
        }
        catch (Exception $e) {
            throw new YamlParserException(
                $e->getMessage(), $e->getCode(), $e);
        }
    }

    public function dump($array) {
        try {
            return sfYaml::dump($array);
        }
        catch (Exception $e) {
            throw new YamlParserException(
                $e->getMessage(), $e->getCode(), $e);
        }
    }
}

class YamlParserException extends Exception
{
    public function __construct($message = "", $code = 0, $previous = NULL) {
        if (version_compare(PHP_VERSION, "5.3.0") < 0) {
            parent::__construct($message, $code);
        }
        else {
            parent::__construct($message, $code, $previous);
        }
    }
}
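
The third constructor argument is what keeps the original library exception reachable. Below is a minimal sketch of how calling code might inspect it. The exception class is repeated from above so the snippet runs on its own, and failingLoad() is a hypothetical stand-in for a parser call that fails; it is not part of sfYaml.

```php
<?php
// Repeated from the wrapper file so that this snippet runs on its own.
class YamlParserException extends Exception
{
    public function __construct($message = "", $code = 0, $previous = null) {
        if (version_compare(PHP_VERSION, "5.3.0") < 0) {
            parent::__construct($message, $code);
        }
        else {
            parent::__construct($message, $code, $previous);
        }
    }
}

// Hypothetical stand-in for a parser call that fails on bad input.
function failingLoad($filePath) {
    throw new InvalidArgumentException("Unable to parse " . $filePath);
}

try {
    try {
        failingLoad("wrong-syntax.yml");
    }
    catch (Exception $e) {
        // The same wrapping step that the YamlParser methods perform.
        throw new YamlParserException($e->getMessage(), $e->getCode(), $e);
    }
}
catch (YamlParserException $wrapped) {
    echo get_class($wrapped), ": ", $wrapped->getMessage(), PHP_EOL;
    // On PHP 5.3+ the original exception is still available for logging.
    echo "caused by ", get_class($wrapped->getPrevious()), PHP_EOL;
}
```

Project code then needs to catch only YamlParserException, whichever parser sits behind the wrapper.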

Summary

So now you know what YAML is, how to represent PHP arrays in YAML, and how to integrate a PHP YAML parser into your projects. By spending a little more time with YAML syntax, you will be able to grasp the powerful features it offers. You may also consider exploring the Symfony 1.4 and 2 frameworks, which use YAML extensively. And if you’re interested in playing with the code from this article, it’s available on GitHub.

  • jozef

    i am sorry, but every time i read something like “here is one important rule to which you should pay attention. Indentation must be done …” i stay away of such IT technologies (Python and company). Please don’t start a flame war, it’s my 20 years old IT subjective sense and feelings, or maybe i am too influenced by C language.

    • Dennis

      So you don’t use Make?

    • http://www.phpknowhow.com Gayanath Jayarathne

      I am not sure what your main concern is. I can agree with you to a certain extent, since I also think using indentation for representing a hierarchy can be a bit tricky when there are many levels. I have mentioned that in the article as well. However, I have used considerably large YAML data fixtures.

      I think using spaces instead of tab for tab key is a common practice which is even mentioned in some coding standards.

  • zumba

    A small typo in the first yaml example – it does not have the “2” as its first element.

    • http://zaemis.blogspot.com Timothy Boronczyk

      Thanks! I’ve fixed the example.

    • http://www.phpknowhow.com Gayanath Jayarathne

      @zumba & @Tim
      Thanks a lot for the info and correction.

  • Viktor

    Hi Gayanath Jayarathne
    Good article, well written.

    • http://www.phpknowhow.com Gayanath Jayarathne

      Hi Viktor,
      Thanks a lot for the feedback. Glad you enjoyed the article.

  • http://crazy4groovy.blogspot.com Crazy4Groovy

    Personally, I would recommend going for YAML for config files, and sticking with XML/JSON for data transfer formats.
    Love the article – thanks!

    • http://www.phpknowhow.com Gayanath Jayarathne

      I can agree with you. In addition to config files, YAML is good for data fixtures as well.

  • Oras

    Nice article, easy to read and understand. I learned the basics YAML through it.
    Thanks!