A Look at Hack, the PHP Replacement in HHVM

This entry is part 2 of 2 in the series HHVM and Hack


You can use the previously created Vagrant box to run the code snippets from this article.

Why types?

In the first part of this series we saw that HACK is statically typed. Taken literally, this means that you must provide types for all the variables in your application. As a reminder, PHP is dynamically typed: you never need to declare the type of a variable, although you can use type hinting for function arguments.

But wait, does that mean that you have to provide types for every single variable of your application? Not exactly, and we are going to see the details.

Facebook’s codebase is composed of hundreds of millions of lines of code and adding types everywhere before they could switch to HACK would have been a real burden. So they’ve come up with “gradual typing”: HACK expects types in “strict” mode only. In non-strict mode, types are only taken into account where they are present.

Entering the strict mode is as easy as switching the HACK start tag from <?hh to <?hh // strict.

Even in strict mode you do not need to annotate all the variables. That’s because HACK is smart enough to infer local variable types. Type annotations are only ever required for class properties, function arguments and return values. I would still recommend annotating local variables whenever it makes your code easier to understand.

Let’s look at an example:

<?hh // strict

require "/vagrant/www/xhp/php-lib/init.php";

// ...

function add(int $a, int $b): int {
    return $a + $b;
}

// ERROR(calling "add()" on l.17) : Argument 2 passed to add() must be an
// instance of int, string given
echo <p>add(1, "a") = {add(1, "a")}</p>;

// ERROR(calling "add()" on l.22) : Argument 2 passed to add() must be an
// instance of int, string given
function add_array(array<int> $a): int {
    return array_reduce($a, "add", 0);
}

echo <p>add_array([1, "a"]) = {add_array([1, "a"])}</p>;

The sample code for this section is located at www/type-checker/index.php and you can see its output by pointing your browser to http://localhost:8080/type-checker/.

The first error message is not surprising: calling add(1, "a") generates an error because add() expects the second argument to be an integer.

The second error message is more unexpected: the error is not generated by calling add_array([1, "a"]). It’s actually the call to add(1, "a") inside of add_array() which generates the error! One could have expected that passing [1, "a"] would trigger an error because it’s not an array<int>.

The thing is that the HHVM runtime check is sparse in order not to impact performance: it doesn’t iterate over the content of arrays and objects. At this point you would probably question the usefulness of the HACK type system! But don’t worry, there is an easy answer, the “type checker”: it catches any type mismatch, including the one from the previous example. Don’t look for it in the HHVM repository, it hasn’t been released by Facebook yet.
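To make the sparse runtime check more tangible, here is a minimal sketch in plain PHP 7+ rather than HACK. It is only an analogy: PHP has no array<int>, so the plain array hint stands in for it, and describe_add_array() is a hypothetical helper of mine, not part of the sample code.

```php
<?php
// Plain PHP 7+ analogy of the HACK sample above (PHP has no array<int> type).

function add(int $a, int $b): int
{
    return $a + $b;
}

// The "array" hint only checks that the argument is an array,
// it never looks at the elements.
function add_array(array $numbers): int
{
    return array_reduce($numbers, 'add', 0);
}

// Hypothetical helper: shows that the failure surfaces while reducing,
// not when add_array() receives its argument.
function describe_add_array(array $numbers): string
{
    try {
        return 'sum = ' . add_array($numbers);
    } catch (TypeError $e) {
        // Raised when add() receives the string element.
        return 'type error inside add_array';
    }
}
```

Calling describe_add_array([1, "a"]) gets past the argument check of add_array() and only fails once add() is handed the string, which mirrors the behavior described above.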

The type checker is implemented as a server that watches your files for changes. Whenever it detects a change, it scans the modified file together with its dependencies for errors. The errors are reported in real time so that you do not even have to run the code. It has been designed to work really fast, even at FB’s scale.

You should now be convinced that the type system works great, but what are the benefits? It allows catching developer errors in real time and producing more efficient code. A PHP add() function would first have to check the types of $a and $b (i.e. string, null, …), possibly convert them to numbers, and only then perform the addition. With HACK, the add() function above adds two non-null integers, which is a very fast operation in assembly language (as generated by the HHVM JIT).
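The difference can be illustrated with a small plain PHP sketch. The function names add_dynamic() and add_typed() are made up for this illustration, and the snippet says nothing about what the HHVM JIT actually emits; it only shows the conversions an untyped addition has to be ready for.

```php
<?php
// Untyped version: the engine has to inspect both operands at run time
// and convert them before it can add them.
function add_dynamic($a, $b)
{
    return $a + $b;
}

// Typed version: both operands are known to be integers on entry,
// so no inspection or conversion is needed inside the function.
function add_typed(int $a, int $b): int
{
    return $a + $b;
}
```

add_dynamic('1', 2) has to convert the numeric string before adding, and add_dynamic(1.5, 2) has to switch to floating point arithmetic, whereas add_typed() only ever sees two integers.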

If, as a developer, you are already using PHP type hinting and PHPDoc annotations, switching to the strict mode should be a no-brainer. Your code will become safer and faster – note that some existing QA tools, like Scrutinizer, already use type inference to check your code, though they do not work in real time.

If you use PHP mostly because of its dynamically typed nature then you probably want to stick to the non-strict mode.

User attributes

The use of annotations has dramatically increased in the PHP world in recent years. For those who are not familiar with annotations, they are metadata you can add to classes, interfaces, traits, variables and functions/methods.

The Doctrine ORM was probably one of the first PHP projects to make extensive use of annotations. Below is an example of a model configuration from the Doctrine documentation:

<?php
/** @Entity */
class Message
{
    /** @Column(type="integer") */
    private $id;
    /** @Column(length=140) */
    private $text;
    /** @Column(type="datetime", name="posted_at") */
    private $postedAt;
}

PHP, unlike many other languages, has no built-in support for annotations. However, the Doctrine annotation library is widely used to extract metadata from Docblocks. An RFC proposing built-in support for annotations in PHP was declined back in 2011.
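To give an idea of what extracting metadata from Docblocks involves, here is a deliberately naive sketch built on the reflection API. It is not how the Doctrine annotation library is implemented: read_column_annotations() is a hypothetical helper and the regular expression only recognizes the @Column(...) form used above.

```php
<?php
/** @Entity */
class Message
{
    /** @Column(type="integer") */
    private $id;
    /** @Column(length=140) */
    private $text;
}

// Hypothetical helper: returns the raw content of each @Column(...)
// annotation, keyed by property name.
function read_column_annotations(string $class): array
{
    $columns = [];
    $rc = new ReflectionClass($class);
    foreach ($rc->getProperties() as $property) {
        // getDocComment() returns false when the property has no Docblock.
        $doc = $property->getDocComment();
        if ($doc !== false && preg_match('/@Column\(([^)]*)\)/', $doc, $match)) {
            $columns[$property->getName()] = $match[1];
        }
    }
    return $columns;
}
```

A real library would then parse the captured string (for example type="integer") into structured options; this sketch stops at the raw text.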

User attributes are Facebook’s implementation of annotations. They are enclosed in <<>> and their syntax differs a little from Doctrine annotations:

<?hh

require "/vagrant/www/xhp/php-lib/init.php";

<< UA('klass', ['type' => 'class']) >>
class klass {
    protected $prop;

    << UA(['type' => 'function']) >>
    public function funktion(<< Argument >> $arg) {
    }
}

$rc = new ReflectionClass('klass');
$rm = $rc->getMethod('funktion');
$ra = $rm->getParameters()[0];

// On class
// array ( 'UA' => array ( 0 => 'klass', 1 => array ( 'type' => 'class', ), ), )
// On method
// array ( 'UA' => array ( 0 => array ( 'type' => 'function', ), ), )
// On argument
// array ( 'Argument' => array ( ), )
echo <div><h1>User annotations</h1>
    <h2>On class</h2><p>{var_export($rc->getAttributes(), true)}</p>
    <h2>On method</h2><p>{var_export($rm->getAttributes(), true)}</p>
    <h2>On argument</h2><p>{var_export($ra->getAttributes(), true)}</p></div>;

You should note that the user attributes are, unsurprisingly, accessed through the reflection API. Also note that support for annotating class properties is still to be implemented.

The sample code for this section is located at www/attributes/index.php and you can see its output by pointing your browser to http://localhost:8080/attributes/.

XHP

By now you should have a foretaste of what XHP is, as we have been using it since the first code example of this article. Let me quote Facebook for a more complete definition: “XHP is a PHP extension which augments the syntax of the language such that XML document fragments become valid PHP expressions.” Note that XHP is available as a PHP extension and that HHVM supports it natively.

With XHP, you can use <h1>{$hello}</h1> where you would have used "<h1>$hello</h1>" with vanilla PHP. While the previous example is trivial, XHP has more to offer:

  • it validates your markup so that you cannot write invalid HTML – think missing closing tags, typos in attribute names, …
  • it provides some level of contextual escaping – as the engine is aware of what you are rendering, it can escape HTML and attribute values appropriately to prevent XSS attacks,
  • you can write your own tags by extending or wrapping existing tags.
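To see what the escaping saves you from writing by hand, here is a rough equivalent in vanilla PHP. This is only a sketch under my own assumptions: render_element() is a hypothetical helper, not an XHP API, and it applies the same htmlspecialchars() call to text and attribute values, whereas XHP knows the context of each value.

```php
<?php
// Hypothetical helper, not part of XHP: builds a single element and
// escapes every attribute value as well as the text content.
function render_element(string $tag, array $attributes, string $text): string
{
    $html = '<' . $tag;
    foreach ($attributes as $name => $value) {
        // ENT_QUOTES also escapes quotes, which matters inside attributes.
        $html .= ' ' . $name . '="'
            . htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8') . '"';
    }
    return $html . '>'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8')
        . '</' . $tag . '>';
}
```

With vanilla PHP every call site has to remember to go through such a helper; with XHP the markup is an expression, so the engine can take care of the escaping for you.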

Let’s look at an example:

<?hh

require "/vagrant/www/xhp/php-lib/init.php";

$examples = [
    'hello'        => 'Hello HHVM / HACK',
    'promotion'    => 'Constructor argument promotion',
    'collections'  => 'Collections',
    'types'        => 'Types and Generics',
    'type-checker' => 'The type checker',
];

// The XHP validation should be disabled for better performance in production
//:x:base::$ENABLE_VALIDATION = false;

class :tuto:examples extends :x:element {
    // examples, current are required attributes
    attribute array examples @required;
    attribute string current @required;

    // forbid to explicitly add children
    children empty;

    protected function render() {
        $select = <select onchange="window.location.href=window.location.pathname + '?ex=' + this.value"/>;
        foreach ($this->getAttribute('examples') as $name => $label) {
            $selected = $name === $this->getAttribute('current');
            $child = <option selected={$selected} value={$name}>{$label}</option>;
            $select->appendChild($child);
        }
        return $select;
    }
}

$folder = preg_replace('/[^-_a-z0-9]/', '', isset($_GET['ex']) ? $_GET['ex'] : '');

function getTheCode($folder) {
    // ...
}

echo <html>
    <head><title>"XHP generated index"</title></head>
    <body>
        <tuto:examples examples={$examples} current={$folder} />
        {getTheCode($folder)}
    </body></html>;

The full sample code for this section is located at www/hhxhp/index.php and you can see its output by pointing your browser to http://localhost:8080/hhxhp/.

In this example we start by defining a custom <tuto:examples> tag that renders a <select> tag. This is done by declaring a class named :tuto:examples. Our custom tag requires two attributes, examples and current, but is not allowed to have children (children empty;).

As we are extending the base :x:element, we should override the render() method to return our custom markup as XHP.

Facebook uses the XHP language as the foundation for its UI library which might eventually get open sourced as well.

Asynchronous code execution

I had plans to write a section about asynchronous code execution after having seen some tests in the HHVM repo. However, I was not able to come up with a working example. It might be due to my limited understanding of the topic or to the fact that Facebook has not released all the related code yet. I might write about this once Facebook releases some documentation.

Other features

There are a lot of things about the HHVM ecosystem that were not covered by this article, both because I had to make choices on what to include and because Facebook has not released all the code and documentation yet.

A few things that are worth mentioning are the recent support for FastCGI and the integrated debugger.

Facebook has also showcased “FBIDE”, a web based IDE featuring auto-completion, syntax highlighting, collaborative editing and more. We could expect it to be available at some later time.

External resources

You can find more information in some talks and slides from the Facebook team that I have used to prepare this article. I first heard of HACK by listening to the “taking PHP seriously” talk from Keith Adams and another great talk from Julien Verlaguet. Sara Golemon’s nice slides were also really helpful to me.

Conclusion

Facebook is committed to providing feature parity with PHP for HHVM. By the end of last year, HHVM was already able to pass 98.5% of the unit tests for 20+ of the most popular PHP frameworks. The situation has slightly improved since then.

As of today the HHVM executes PHP code faster than PHP while consuming less memory. That will be a significant advantage in favor of HHVM when the parity is eventually achieved. On top of that you can start introducing HACK to gain even more performance and improve code safety with the help of the type checker – remember you don’t have to convert your whole code base at once thanks to the gradual typing and the fact that HACK and PHP are inter-operable.

A few months from now, we can expect more documentation and tooling from Facebook. You could even help by contributing to the project on GitHub; there is also a bounty program in place.

One of the problems reported by the PHP community which is probably a major obstacle for adoption is the lack of support for PECL extensions. To mitigate this, Facebook has a tool that can automatically compile PHP extensions for the HHVM target; the success rate is far from 100% though. The other thing that could help here is that developing an extension for HHVM is much easier than developing for PHP.

The fact that HHVM is backed by Facebook alone, and the need to sign a CLA before contributing to HHVM seem troublesome to others.

I do personally think that a fair amount of competition is a great thing for the future of PHP.

To conclude, I would like to thank the Facebook team for the amazing job they’ve done and for having open-sourced it. If you would like to see more SitePoint articles on HHVM and HACK in the future, do not hesitate to suggest topics by adding a comment below.



  • Ian Onvlee

    I don’t like facebook and anything that comes with it. They are internet criminals and hopefully they will soon face more severe lawsuits than they already do. Apart from that, their website is chaotic and buggy and their so-called codebase is worse than that of any amateur coder.

    • Asad Hasan

      Please keep things from an engineering perspective.

      • Ian Onvlee

        Engineering perspective? I think we don’t need another copycat of PHP with another name, doing the same stuff with slightly different syntax, if it doesn’t surpass or extend PHP. Would you learn Esperanto to communicate with the English, Dutch or Philippinos? They understand plain English! Esperanto was brilliant, but a natural old language already successfully made its way into international waters a very long time ago. Praising Facebook now is the same as making WordPress look like the greatest invention on earth. Both are terribly heavy inadequate fatso’s, chaotic and laughable from a web developers point of view: two steps back, one step forward. But don’t let me spoil the fun guys. It’s just an opinion, based on 30 years of experience.

        • http://www.bitfalls.com/ Bruno Skvorc

          You sound like you’re just looking for a fight. You didn’t even read the article in full, or study Hack, did you? It *does* extend and surpass PHP, by a LOT. You also have no idea about Facebook’s back end code. Their front end is laughable, yes, but you can’t speak for the back end when you’ve never seen it.

          • Ian Onvlee

            Oh, I guess we can agree to disagree, sensei Bruno. You haven’t seen the back end of FaceBook either. But I smell a rat when there is one, and opening a bearpit is not my line of business. As the Chinese say, don’t look at what’s in the cup but what’s not in it. Have fun, my friend, hehe.

          • Guest

            github.com/facebook/

  • kamazee

    Well, let’s face the fact: PHP in its current state is spoiled. There’s barely a way to improve it significantly without breaking compatibility.
    So there are two ways to go:
    1. Leave things as they are. Let PHP take one small step at a time, but generally keep compatibility (well, breaking it really sloooooooooooooooowly).
    2. Cut the shit and continue doing what FB is doing right now at somewhat faster pace. And we’ll end up with something PHP-ish: not exactly PHP, but a significantly improved version that has breaking API changes but offers a lot of new features. Should it still be PHP? Well, not really. However, I’d like to go and try the new language/platform they are about to create.

    • Edd Turtle

      I’d definitely vote in favour of a big jump forward under a new name. Shared hosting services are (quite rightly) always updating their PHP versions – which a dramatic change would break. A big company and strong development behind HACK could be the way forward.

  • Taylor Ren

    Well, like my comment on the 1st article said: I would love to see a road map before I can really take a look at a “new” thing, or I will have a strong motivation/obligation to do so (and knowing not doing so is wrong).

    And one more thought derived from an old but tricky question: I have a ship and keep on repairing it: replacing its engine, is that ship still my old ship? replacing the propeller, is that ship still my old ship?…

    Is there a “point” when I can no longer call my ship “my old ship” anymore?

    After saying all these, I do agree that PHP now is somehow very difficult to re-engineering. But I can’t agree one of the reasons quoted for such difficulty: too many legacy codes.

    This somehow shows the reluctance to change to a more modern way. We the programmers, in general, are to be blamed. Well, some because no sufficient exposure to the new technology and some just prefer the old plain way.

    The road map must be clear:

    After we have XHP, what is next? Will we eventually end up with a new language with similar PHP grammar but a totally different thing? After the review on this, we can make an earlier decision: am I ready/willing to go along this path?

    • Victor Berchet

      Facebook has released some plans for the future on http://www.hhvm.com/blog/3743/hhvm-the-next-six-months

      • http://www.bitfalls.com/ Bruno Skvorc

        That’s delightfully ambitious!

      • Taylor Ren

        Thanks for the sharing, @victorberchet:disqus

        Look forward to the 100% completion of Unit Test on Symfony!

  • http://www.bitfalls.com/ Bruno Skvorc

    Awesome stuff, will definitely dissect.

  • http://asciiville.com/ asciiville

    I’m definitely intrigued about an alternative PHP implementation. It never hurts to kick the tires and take it for a spin. Even in the Java ecosystem, there are number of JVM implementations as well as domain specific languages on top of the JVMs, so why not for PHP too?

  • http://www.github.com/emidelgo Emiliano del Gobbo

    Victor, thank you for the credit ;)

  • http://www.emilianodelgobbo.com/ Emiliano del Gobbo

    Facebook Team, released all the HACK Documentation. http://hacklang.org/