Proc_Open: Communicate with the Outside World

There are many ways we can interact with other applications from PHP and share data: web services, message queuing systems, sockets, temporary files, exec(), and so on. Today I’d like to show you one approach in particular, proc_open(). The function spawns a new command with open file pointers that can be used to send and receive data, which gives you interprocess communication (IPC).

What’s a Pipe?

To understand how proc_open() and the spawned process send and receive data, you need to know what a pipe is.

The Unix philosophy led early developers to write many small programs, each with very specific functionality. These programs shared plain text as a common format, and users could “chain them together” for greater functionality. The output from one command became the input for the next. The virtual channels that let the data flow between commands became known as pipes.

If you’ve ever worked in a Unix shell then there’s a good chance you’ve used pipes, perhaps even without realizing it. For example:

$ mysql -u dbuser -p test < mydata.sql

Here the mysql utility is invoked and the shell connects its standard input to mydata.sql, so the contents of the file are fed into mysql as if you were typing them at its prompt from your keyboard. Strictly speaking, < is a redirection that attaches the file directly; the same thing written with a true pipe is cat mydata.sql | mysql -u dbuser -p test, where the | symbol joins the output of cat to the input of mysql.

There are two types of pipes: anonymous and named. An anonymous pipe is ad hoc, exists only as long as the processes using it are running, and is destroyed once it is no longer needed. The | in a shell pipeline creates an anonymous pipe. A named pipe, on the other hand, is given a name and can last indefinitely. It’s created with a special command (mkfifo on most Unix systems) and appears as a special file in the filesystem.

Regardless of the type, an important property of any pipe is that it’s a FIFO (first in, first out) structure. This means the data written into a pipe first by one process is the first data read out from the pipe by the other process.

Introducing proc_open()

The PHP function proc_open() executes a command, much like exec() does, but with the added ability to direct input and output streams through pipes. It accepts some optional arguments, but the mandatory arguments are:

  • a command to execute.
  • an array that describes the pipes to be used.
  • an array reference that will later be populated with references to the pipes’ endpoints so you can send/receive data.

The optional arguments are used for tweaking the execution environment in which the command spawns. I won’t discuss them here, but you can find more information in the PHP manual.

Aside from the command to execute, I would say the descriptor array that defines the pipes is the most important argument to pay attention to. The documentation explains it as an “indexed array where the key represents the descriptor number and the value represents how PHP will pass that descriptor to the child process,” but what exactly does that mean?

The three main data streams for a well-behaved Unix process are STDIN (standard input), STDOUT (standard output), and STDERR (standard error). That is, there’s a stream for incoming data, one for outgoing data, and a second outgoing stream for error and informational messages. STDIN has traditionally been represented by the integer 0, STDOUT by 1, and STDERR by 2. So, the definition with key 0 will be used to set up the input stream, 1 for the output stream, and 2 for the error stream.

Each definition can take one of two forms: an open stream resource (such as a file opened with fopen()), or an array that describes what the descriptor is connected to. For an anonymous pipe, the first element of the array is the string “pipe” and the second is “r” or “w”, depending on whether the child process will read from the pipe or write to it. To connect the descriptor to a file on disk instead (a regular file, or a named pipe that already exists in the filesystem), the array holds the string “file”, the filename, and then an fopen()-style mode such as “r”, “w”, or “a”.
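To make the different forms concrete, here is a small sketch that uses all three in one descriptor array. It is only an illustration under a few assumptions of mine: the child process is the PHP command-line binary itself (PHP_BINARY) so nothing else needs to be installed, the two temp files are throwaway names created just for the demo, and the system is Unix-like.

```php
<?php
// Assumptions for this sketch: PHP_BINARY is the PHP CLI, and the two
// temp files below exist only for the demonstration.
$outFile = tempnam(sys_get_temp_dir(), 'out');
$errFile = tempnam(sys_get_temp_dir(), 'err');
$out     = fopen($outFile, 'w');

$desc = array(
    0 => array('pipe', 'r'),           // anonymous pipe the child reads from
    1 => $out,                         // an already-open stream resource
    2 => array('file', $errFile, 'a'), // a file PHP opens in append mode
);

// the child writes one short message to each of its output streams
$code = "fwrite(STDOUT, 'to stdout'); fwrite(STDERR, 'to stderr');";
$cmd  = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code);

$proc = proc_open($cmd, $desc, $pipes);
if ($proc === false) {
    throw new RuntimeException('could not start the child process');
}

// only the descriptor defined as a pipe is handed back to us
$pipeKeys = array_keys($pipes);

fclose($pipes[0]);
proc_close($proc); // waits for the child to finish
fclose($out);

$stdoutText = file_get_contents($outFile);
$stderrText = file_get_contents($errFile);
unlink($outFile);
unlink($errFile);

echo $stdoutText, ' / ', $stderrText, PHP_EOL;
```

The child’s output ends up in the two files rather than coming back through a pipe, which is handy when all you want from STDERR is a log.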

Once called, proc_open() populates the array passed as the third argument with stream resources for your end of each descriptor that was defined as a pipe, indexed by descriptor number. Descriptors connected to a file or to an existing resource get no entry. The elements can be treated as normal file handles and they work with file and stream functions like fwrite(), fread(), stream_get_contents(), etc.
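Because the pipe endpoints behave like ordinary streams, you can hold a line-by-line conversation with the child using fwrite() and fgets(). The sketch below assumes a Unix-like system and, as a stand-in for a real external program, uses the PHP command-line binary (PHP_BINARY) running a one-line script that upper-cases whatever it reads; that child script is my own invention for the demo.

```php
<?php
// Assumed stand-in child: the PHP CLI echoing each input line in upper case.
$code = 'while (($line = fgets(STDIN)) !== false) { fwrite(STDOUT, strtoupper($line)); }';
$cmd  = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code);

$desc = array(
    0 => array('pipe', 'r'), // child reads its STDIN from us
    1 => array('pipe', 'w'), // child writes its STDOUT to us
);

$proc = proc_open($cmd, $desc, $pipes);
if ($proc === false) {
    throw new RuntimeException('could not start the child process');
}

$replies = array();
foreach (array('first', 'second', 'third') as $word) {
    fwrite($pipes[0], $word . "\n");      // send one line...
    $replies[] = rtrim(fgets($pipes[1])); // ...and wait for the answer
}

fclose($pipes[0]); // end of input: the child's loop finishes
fclose($pipes[1]);
proc_close($proc);

echo implode(', ', $replies), PHP_EOL;
```

The replies come back in the order the lines were sent, which is the FIFO behaviour of pipes at work. Note that this kind of back-and-forth only works when the child flushes its output after each line; a program that buffers its output would leave fgets() waiting.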

When you’re done interacting with the external command, it’s important to clean up after yourself. You can close the pipes (with fclose()) and the process resource (with proc_close()).

Depending on how your target command/process behaves, you may need to close the STDIN connection before it begins its work (so it knows not to expect any more input). You should also close the STDOUT and STDERR connections before calling proc_close(); the child process may be unable to exit while those pipes are still open, which would leave proc_close() waiting on it.
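The order of those steps can be sketched as follows. As an assumption for the demo, the child is the PHP command-line binary (PHP_BINARY) running a made-up one-liner that counts the bytes it receives and then exits with status 3; the example also assumes a Unix-like system and that proc_get_status() has not been called on the process.

```php
<?php
// Assumed stand-in child: count the bytes on STDIN, print the count, exit 3.
$code = 'echo strlen(file_get_contents(\'php://stdin\')); exit(3);';
$cmd  = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code);

$desc = array(
    0 => array('pipe', 'r'),
    1 => array('pipe', 'w'),
    2 => array('pipe', 'w'),
);

$proc = proc_open($cmd, $desc, $pipes);
if ($proc === false) {
    throw new RuntimeException('could not start the child process');
}

// 1. send the input, then close STDIN so the child sees end-of-file
fwrite($pipes[0], 'abcde');
fclose($pipes[0]);

// 2. drain both output pipes
$out = stream_get_contents($pipes[1]);
$err = stream_get_contents($pipes[2]);

// 3. close the output pipes before closing the process
fclose($pipes[1]);
fclose($pipes[2]);

// 4. proc_close() waits for the child and returns its exit code
$exit = proc_close($proc);

echo "output: $out, exit code: $exit", PHP_EOL;
```

Checking the value returned by proc_close() is a cheap way to find out whether the command succeeded.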

A Practical Example: Converting Wiki Markup

That covers how the pieces fit together, so now let’s apply proc_open() to a real task and see how easy it is to use.

Suppose we need to convert a chunk of text with wiki markup to HTML for display in a user’s browser. We’re using the Nyctergatis Markup Engine (NME) to perform the conversion, but since it’s a compiled C binary we need a way to fire up nme when it’s needed and a way to pass input and receive output.

<?php
// descriptor array
$desc = array(
    0 => array('pipe', 'r'), // 0 is STDIN for process
    1 => array('pipe', 'w'), // 1 is STDOUT for process
    2 => array('file', '/tmp/error-output.txt', 'a') // 2 is STDERR for process
);

// command to invoke markup engine
$cmd = "nme --strictcreole --autourllink --body --xref";

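// spawn the process; proc_open() returns false on failure
$p = proc_open($cmd, $desc, $pipes);
if ($p === false) {
    throw new RuntimeException('Unable to start the markup engine');
}

// send the wiki content (assumed to already be in $content) as
// input to the markup engine and then close the input pipe so
// the engine knows not to expect more input and can start processing
fwrite($pipes[0], $content);
fclose($pipes[0]);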

// read the output from the engine
$html = stream_get_contents($pipes[1]);

// all done! Clean up. Only descriptors defined as pipes appear in
// $pipes; descriptor 2 goes to a file, so there is no $pipes[2]
fclose($pipes[1]);
proc_close($p);

First the descriptor array is laid out, with 0 (STDIN) as an anonymous pipe that will be readable by the markup engine, 1 (STDOUT) as an anonymous pipe writable by the engine, and 2 (STDERR) redirecting any error messages to an error log file.

The “r” and “w” on the pipe definitions might seem counterintuitive at first, but keep in mind they are the channels that the engine will be using and so are configured from its perspective. We write to the read pipe because the engine will be reading the data from it. We read from the write pipe because the engine has written data to it.
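If you find yourself doing this in more than one place, the whole exchange fits in a small function. The run_filter() helper below is hypothetical (it is not part of PHP or NME), and the demo command at the bottom uses the PHP command-line binary (PHP_BINARY) on a Unix-like system as a stand-in for nme. Treat it as a sketch for small payloads: with large amounts of data, writing all the input before reading, or reading STDOUT to the end before touching STDERR, can stall once a pipe buffer fills up.

```php
<?php
/**
 * Hypothetical helper: run $cmd, feed it $input, and return what it
 * wrote to STDOUT and STDERR along with its exit code.
 */
function run_filter($cmd, $input)
{
    $desc = array(
        0 => array('pipe', 'r'), // the child READS here, so we write
        1 => array('pipe', 'w'), // the child WRITES here, so we read
        2 => array('pipe', 'w'), // same again for error output
    );

    $proc = proc_open($cmd, $desc, $pipes);
    if ($proc === false) {
        throw new RuntimeException("unable to start: $cmd");
    }

    fwrite($pipes[0], $input);
    fclose($pipes[0]);

    $stdout = stream_get_contents($pipes[1]);
    $stderr = stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);

    return array(
        'stdout' => $stdout,
        'stderr' => $stderr,
        'exit'   => proc_close($proc),
    );
}

// Assumed stand-in for the markup engine: the PHP CLI reversing its input.
$code   = 'echo strrev(file_get_contents(\'php://stdin\'));';
$cmd    = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code);
$result = run_filter($cmd, 'wiki');

echo $result['stdout'], PHP_EOL;
```

With the markup engine installed, the call would simply become run_filter($cmd, $content) using the nme command line from the example above.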

Conclusion

There are many ways to interact with external processes; some may be better than using proc_open() given how and what you need to work with, or proc_open() might just be what the doctor ordered for your situation. Of course you’ll implement what makes sense, but now you’ll know how to use this powerful function if you need to!

I’ve placed some example code on GitHub that simulates a bare-bones wiki using NME, just like in the example above. Feel free to clone it if you’re interested in playing and exploring further.


  • http://www.adeveloper.org Hossein Baghayi

    Good article which introduces the proc_open function (and something like pipes in PHP which I did not know about). It may come in handy sometime.

  • http://www.mhlavac.net Martin Hlaváč

    Can the same result be achieved by using exec() with Linux pipes? Even though this function looks great, I have never used it; I have always used exec() instead.

    • http://zaemis.blogspot.com Timothy Boronczyk

      exec() and proc_open() are similar, but the key difference is proc_open()’s ability to communicate through the pipes. You would use exec() when you just need to execute another command/process. At best you can capture its output, but you’re not able to provide much input. That’s fine if that’s all you need to do. But if you find yourself writing data to a temp file and redirecting it to the command as part of your exec() (for example exec('./some-command < input.txt')) then you should be using proc_open() instead.