Proc_Open: Communicate with the Outside World
There are many ways we can interact with other applications from PHP and share data: web services, message queuing systems, sockets, temporary files, exec(), and so on. Today I’d like to show you one approach in particular, proc_open(). The function spawns a new command with open file pointers that can be used to send and receive data, which gives you interprocess communication (IPC).
What’s a Pipe?
To understand how proc_open() and the process send and receive data, you need to know what a pipe is.
The Unix philosophy led early developers to write many small programs, each with very specific functionality. These programs shared plain text as a common format, and users could “chain them together” for greater functionality. The output from one command became the input for the next. The virtual channels that let the data flow between commands became known as pipes.
If you’ve ever worked in a Unix shell then there’s a good chance you’ve used pipes, perhaps even without realizing it. For example:
$ mysql -u dbuser -p test < mydata.sql
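The mysql utility is invoked and the shell connects the contents of the mydata.sql file to its standard input, so they are fed into mysql as if you were typing directly at its prompt from your keyboard. (Strictly speaking, < redirects input from a file rather than creating a pipe; the pipe equivalent would be cat mydata.sql | mysql -u dbuser -p test.)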
There are two types of pipes: anonymous and named. An anonymous pipe is ad hoc, created only for as long as the process is running and destroyed once it is no longer needed. The | operator in a command such as cat mydata.sql | mysql -u dbuser -p test creates an anonymous pipe. A named pipe, on the other hand, is given a name and can last indefinitely. It’s created using special commands and often appears as a special file in the filesystem.
Regardless of the type, an important property of any pipe is that it’s a FIFO (first in, first out) structure. This means the data written into a pipe first by one process is the first data read out from the pipe by the other process.
The PHP function proc_open() executes a command, much like exec() does, but with the added ability to direct input and output streams through pipes. It accepts some optional arguments, but the mandatory arguments are:
- a command to execute.
- an array that describes the pipes to be used.
- an array reference that will later be populated with references to the pipes’ endpoints so you can send/receive data.
The optional arguments are used for tweaking the execution environment in which the command spawns. I won’t discuss them here, but you can find more information in the PHP manual.
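To make the three mandatory arguments concrete, here is a minimal sketch. It assumes PHP 7.4 or later (for the array form of the command) and a Unix-like CLI environment, and it uses the running PHP binary itself as a stand-in child process. The child script and the variable names are purely illustrative.

```php
<?php
// Argument 1: the command. The current PHP binary runs a tiny script that
// upper-cases whatever it reads on STDIN (a hypothetical stand-in for any
// external program). -n skips php.ini so the child's output stays predictable.
$cmd = [PHP_BINARY, '-n', '-r', 'echo strtoupper(file_get_contents("php://stdin"));'];

// Argument 2: the descriptor array describing the pipes.
$desc = [
    0 => ['pipe', 'r'], // the child reads its STDIN from this pipe
    1 => ['pipe', 'w'], // the child writes its STDOUT to this pipe
];

// Argument 3: $pipes is filled by proc_open() with our ends of the pipes.
$process = proc_open($cmd, $desc, $pipes);
if (!is_resource($process)) {
    throw new RuntimeException('Unable to start the child process');
}

fwrite($pipes[0], 'hello');
fclose($pipes[0]); // signal end of input

$output = stream_get_contents($pipes[1]);
fclose($pipes[1]);

$exitCode = proc_close($process);
echo $output, PHP_EOL;
```

Because descriptor 2 is not listed, the child simply inherits the parent’s STDERR in this sketch.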
Aside from the command to execute, I would say the descriptor array that defines the pipes is the most important argument to pay attention to. The documentation explains it as an “indexed array where the key represents the descriptor number and the value represents how PHP will pass that descriptor to the child process,” but what exactly does that mean?
The three main data streams for a well-behaved Unix process are STDIN (standard input), STDOUT (standard output), and STDERR (standard error). That is, there’s a stream for incoming data, one for outgoing data, and a second outgoing stream for errors and informational messages. STDIN has traditionally been represented by the integer 0, STDOUT by 1, and STDERR by 2. So, the definition with key 0 will be used to set up the input stream, 1 for the output stream, and 2 for the error stream.
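A short sketch shows the two outgoing streams staying separate. As an assumption for illustration, the child is again the PHP binary itself (PHP 7.4+ for the array command), running a made-up one-liner that writes one word to STDOUT and another to STDERR.

```php
<?php
// Hypothetical child: "result" goes to STDOUT, "warning" goes to STDERR.
$cmd = [
    PHP_BINARY, '-n', '-r',
    'echo "result"; file_put_contents("php://stderr", "warning");',
];

$desc = [
    0 => ['pipe', 'r'], // descriptor 0: STDIN
    1 => ['pipe', 'w'], // descriptor 1: STDOUT
    2 => ['pipe', 'w'], // descriptor 2: STDERR
];

$process = proc_open($cmd, $desc, $pipes);
if (!is_resource($process)) {
    throw new RuntimeException('Unable to start the child process');
}

fclose($pipes[0]); // this child needs no input

// Reading one stream to the end before the other is fine for tiny outputs;
// a child that writes a lot to both could block, so treat this as a sketch.
$stdout = stream_get_contents($pipes[1]);
$stderr = stream_get_contents($pipes[2]);

fclose($pipes[1]);
fclose($pipes[2]);
proc_close($process);

echo "STDOUT: $stdout", PHP_EOL;
echo "STDERR: $stderr", PHP_EOL;
```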
These definitions themselves can take one of two forms: either an open file resource or an array that describes the stream. For an anonymous pipe, the first element of the descriptive array is the string “pipe” and the second is “r” or “w”, depending on whether the child process will read from the pipe or write to it. For a file (which can also be the path of a named pipe), the descriptive array holds the string “file”, the filename, and then a mode such as “r”, “w”, or “a”.
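The sketch below puts the different definition forms side by side. The temporary file names and the child one-liner are assumptions made up for the illustration, and the array command requires PHP 7.4 or later.

```php
<?php
// Prepare an input file and an error-log path in the system temp directory.
$inputPath = tempnam(sys_get_temp_dir(), 'in_');
$errorPath = tempnam(sys_get_temp_dir(), 'err_');
file_put_contents($inputPath, 'from a file');

$desc = [
    0 => fopen($inputPath, 'r'),    // an already-open file resource
    1 => ['pipe', 'w'],             // an anonymous pipe the child writes to
    2 => ['file', $errorPath, 'a'], // a file that proc_open() opens for appending
];

// Hypothetical child: upper-cases whatever arrives on its STDIN.
$cmd = [PHP_BINARY, '-n', '-r', 'echo strtoupper(file_get_contents("php://stdin"));'];

$process = proc_open($cmd, $desc, $pipes);
if (!is_resource($process)) {
    throw new RuntimeException('Unable to start the child process');
}

// Only descriptors declared as 'pipe' get an entry in $pipes.
$pipeKeys = array_keys($pipes);

$output = stream_get_contents($pipes[1]);
fclose($pipes[1]);
fclose($desc[0]);
proc_close($process);

// Remove the temporary files again.
unlink($inputPath);
unlink($errorPath);

echo $output, PHP_EOL;
```

Note that the file resource and the “file” entry never show up in the pipes array, so there is nothing to close for them on that side.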
proc_open() populates the array passed by reference as the third parameter with stream resources for your end of each pipe. These elements can be treated as normal file handles, and they work with file and stream functions like fwrite() and stream_get_contents().
When you’re done interacting with the external command, it’s important to clean up after yourself. You can close the pipes (with fclose()) and the process resource (with proc_close()).
Depending on how your target command/process behaves, you may need to close the STDIN connection before it begins its work (so it knows not to expect any more input). And, you should close the STDOUT and STDERR connections before closing the process, or it might hang while it waits for everything to be clean before shutting down.
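The order of operations described above can be sketched as follows. The child here is a made-up PHP one-liner that waits for all of its input, reports the number of bytes it received, and exits with status 3, so you can also see the exit status come back from proc_close(). The array command assumes PHP 7.4 or later.

```php
<?php
// Hypothetical child: reads STDIN until EOF, prints the byte count, exits with 3.
$cmd = [
    PHP_BINARY, '-n', '-r',
    'echo strlen(file_get_contents("php://stdin")); exit(3);',
];

$desc = [
    0 => ['pipe', 'r'],
    1 => ['pipe', 'w'],
    2 => ['pipe', 'w'],
];

$process = proc_open($cmd, $desc, $pipes);
if (!is_resource($process)) {
    throw new RuntimeException('Unable to start the child process');
}

// 1. Send the input, then close STDIN so the child sees end-of-file.
fwrite($pipes[0], 'abcde');
fclose($pipes[0]);

// 2. Drain the child's output streams.
$stdout = stream_get_contents($pipes[1]);
$stderr = stream_get_contents($pipes[2]);

// 3. Close the remaining pipes before closing the process.
fclose($pipes[1]);
fclose($pipes[2]);

// 4. Wait for the child to finish and collect its exit status.
$exitCode = proc_close($process);

echo "bytes seen by child: $stdout, exit code: $exitCode", PHP_EOL;
```

If step 1 skipped the fclose(), this particular child would keep waiting for more input and step 2 would never return.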
A Practical Example: Converting Wiki Markup
So far I’ve only talked about how things work, but I haven’t shown you an example using proc_open() to spawn and communicate with an external process yet. So, let’s see how easy it is to use.
Suppose we need to convert a chunk of text with wiki markup to HTML for display in a user’s browser. We’re using the Nyctergatis Markup Engine (NME) to perform the conversion, but since it’s a compiled C binary we need a way to fire up nme when it’s needed and a way to pass input and receive output.
<?php
// descriptor array
$desc = array(
    0 => array('pipe', 'r'), // 0 is STDIN for process
    1 => array('pipe', 'w'), // 1 is STDOUT for process
    2 => array('file', '/tmp/error-output.txt', 'a') // 2 is STDERR for process
);

// command to invoke markup engine
$cmd = "nme --strictcreole --autourllink --body --xref";

// spawn the process
$p = proc_open($cmd, $desc, $pipes);
if (!is_resource($p)) {
    die('Unable to start the markup engine');
}

// send the wiki content as input to the markup engine
// and then close the input pipe so the engine knows
// not to expect more input and can start processing
// ($content is assumed to hold the wiki markup)
fwrite($pipes[0], $content);
fclose($pipes[0]);

// read the output from the engine
$html = stream_get_contents($pipes[1]);

// all done! Clean up. STDERR goes to a file, so only
// descriptors 0 and 1 have entries in $pipes
fclose($pipes[1]);
proc_close($p);
First the descriptor array is laid out, with 0 (STDIN) as an anonymous pipe that will be readable by the markup engine, 1 (STDOUT) as an anonymous pipe writable by the engine, and 2 (STDERR) redirecting any error messages to an error log file.
The “r” and “w” on the pipe definitions might seem counterintuitive at first, but keep in mind they are the channels that the engine will be using and so are configured from its perspective. We write to the read pipe because the engine will be reading the data from it. We read from the write pipe because the engine has written data to it.
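If you want to see this mirror-image arrangement for yourself, one way is to ask PHP for the mode of your end of each pipe with stream_get_meta_data(). This is only a sketch: the child is a hypothetical PHP one-liner that echoes its input back (PHP 7.4+ for the array command), and I’d expect the reported modes to be the opposite of those in the descriptor array, though the exact mode strings may vary by platform.

```php
<?php
$desc = [
    0 => ['pipe', 'r'], // the child READS from descriptor 0
    1 => ['pipe', 'w'], // the child WRITES to descriptor 1
];

// Hypothetical child: copies its STDIN to its STDOUT unchanged.
$cmd = [PHP_BINARY, '-n', '-r', 'echo file_get_contents("php://stdin");'];

$process = proc_open($cmd, $desc, $pipes);
if (!is_resource($process)) {
    throw new RuntimeException('Unable to start the child process');
}

// Ask PHP how our ends of the pipes were opened.
$ourStdinMode  = stream_get_meta_data($pipes[0])['mode'];
$ourStdoutMode = stream_get_meta_data($pipes[1])['mode'];
printf("child 'r' pipe, our end opened as: %s\n", $ourStdinMode);
printf("child 'w' pipe, our end opened as: %s\n", $ourStdoutMode);

// We write to the pipe the child reads from...
fwrite($pipes[0], 'ping');
fclose($pipes[0]);

// ...and read from the pipe the child writes to.
$reply = stream_get_contents($pipes[1]);
fclose($pipes[1]);
proc_close($process);

echo $reply, PHP_EOL;
```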
There are many ways to interact with external processes. Some may be better than proc_open() given how and what you need to work with, or proc_open() might be just what the doctor ordered for your situation. Of course you’ll implement what makes sense, but now you’ll know how to use this powerful function if you need to!
I’ve placed some example code on GitHub that simulates a bare-bones wiki using NME, just like in the example above. Feel free to clone it if you’re interested in playing and exploring further.