Dynamic global functions in PHP
Like many others, I prefer to use procedural PHP as a template language. While PHP’s syntax makes it a practical choice for this, there is a problem with embedding dynamic content. Most PHP applications produce HTML output, so with this technique you end up writing <?php echo htmlspecialchars($foo);?> a lot. Or you forget it, and leave your application open to all sorts of nasty XSS attacks.
Apart from the annoyance of superfluous typing, there is a danger of getting lazy, seeing that <?php echo $foo;?> is remarkably shorter to type. In some situations, it won’t manifest itself as a problem either, since some types of content never contain HTML special characters (numbers, for example). This is particularly nasty, because errors in the view layer are notoriously hard to track down, and unlike SQL injection (a similar problem) the consequences tend to hurt the users of a site, rather than the site directly.
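To make the difference concrete, here is a minimal sketch. The contents of $foo are made up for illustration, and passing ENT_QUOTES and 'UTF-8' explicitly is an assumption about the page rather than something the snippets above require.

```php
<?php
// Hypothetical user-supplied value containing markup.
$foo = '<script>alert("xss")</script>';

// Unescaped: echoing this into a page would let the browser run the script.
$raw = $foo;

// Escaped: HTML special characters become entities and render as plain text.
// ENT_QUOTES and 'UTF-8' are illustrative choices, not taken from the article.
$safe = htmlspecialchars($foo, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// prints: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

A value such as 42 would come out identical either way, which is exactly why the missing call tends to go unnoticed.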
KISS
Recently, I had a look at some code written for CakePHP. My eye caught a function e, which is shorthand for echo. A single-letter regular function is undoubtedly the simplest way to extend PHP’s syntax. Thinking about it, it’s fairly obvious, but it just never occurred to me.
Well, the CakePHP developers made a mistake, as it should have been shorthand for echo htmlspecialchars. Nonetheless, the syntax works well. So I began using a globally defined function which looks a bit like this:
function e($string) {
    echo htmlspecialchars($string);
}
And it works too. It saves me typing and keeps me from forgetting to escape output, because printing a string unescaped now takes more work than printing it escaped. Simple, but powerful.
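In a template, the helper might be used roughly as sketched below. The render_comment function and its markup are hypothetical, and the output buffering is only there so the rendered result can be inspected as a string; the explicit ENT_QUOTES and 'UTF-8' arguments are likewise an assumption.

```php
<?php
// The one-letter helper, guarded so the sketch does not clash with an existing e().
if (!function_exists('e')) {
    function e($string) {
        echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
    }
}

// Hypothetical template wrapper: renders one comment and returns the HTML.
function render_comment($author, $body) {
    ob_start();
    ?>
<p class="comment"><strong><?php e($author); ?></strong>: <?php e($body); ?></p>
<?php
    return ob_get_clean();
}
```

Calling render_comment('Bob', '<b>hi</b>') yields a paragraph in which the body appears as &lt;b&gt;hi&lt;/b&gt;, so the markup is shown as text instead of being interpreted.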
The clash
There is a problem though. Since this is such a good name for a function, chances are that someone else will use it for something different, or perhaps even for the same thing. I already know of at least one potential name clash, since I got the idea from CakePHP.
The usual way of dealing with this would be namespaces, but alas, PHP doesn’t have namespaces (yet), and the common workaround of pseudo-namespaces (e.g. prefixing names) doesn’t work here, since a longer name would defeat the purpose of the function in the first place.
There’s another problem too. In the few cases where we aren’t rendering HTML/XML output, we don’t want to escape strings for embedding in HTML/XML; instead, we want to escape them for embedding in that target language. Even in HTML, it may be necessary to escape strings with htmlentities rather than htmlspecialchars if the text encoding isn’t ISO-8859-1, or to convert the string to UTF-8 if the template is in UTF-8.
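As a rough illustration of why one fixed escaping rule is not enough, the sketch below escapes the same value for three different targets. The function names are hypothetical, and the JSON_HEX_* flags assume PHP 5.3 or later.

```php
<?php
// Hypothetical helpers: one escaping rule per output target.

// For HTML or XML text and attribute values.
function escape_html($s) {
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

// For a single component of a URL (path segment or query value).
function escape_url($s) {
    return rawurlencode($s);
}

// For a JavaScript string literal embedded in a page.
function escape_js($s) {
    return json_encode($s, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
}

$value = 'Tom & "Jerry"';

echo escape_html($value), "\n"; // Tom &amp; &quot;Jerry&quot;
echo escape_url($value), "\n";  // Tom%20%26%20%22Jerry%22
echo escape_js($value), "\n";   // "Tom \u0026 \u0022Jerry\u0022"
```

Each result is only safe in its own context, so a hard-wired e() that always calls htmlspecialchars would be wrong for two of the three.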
Making the static dynamic
The problem with all this is that the function is static. That’s the nature of global functions in PHP. Other interpreted languages allow us to redefine functions at runtime, but no such luck with PHP (well, strictly speaking, runkit allows it, but no one in their right mind would use it in a production environment).
There is, however, a loophole. Using a callback, we can delegate to a dynamically defined handler:
if (!function_exists('e')) {
    function e($args) {
        $args = func_get_args();
        return call_user_func_array($GLOBALS['_global_function_handler_e'], $args);
    }
}
Much better. Now I can define my own output handler as:
$GLOBALS['_global_function_handler_e'] = 'my_global_function_handler_e';
function my_global_function_handler_e($string) {
    echo htmlspecialchars($string);
}
And CakePHP can use:
$GLOBALS['_global_function_handler_e'] = 'cakephp_global_function_handler_e';
function cakephp_global_function_handler_e($string) {
    echo htmlspecialchars($string);
}
We’re still using a global symbol, but at least it’s dynamic, rather than static.
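Because the handler is just a value in $GLOBALS, it can also be swapped temporarily, for instance while rendering a plain-text e-mail, and then put back. The sketch below assumes two hypothetical handlers, html_handler_e and text_handler_e, and a hypothetical helper render_with; none of these names come from CakePHP or any other framework.

```php
<?php
// The delegating e() from above, guarded against redefinition.
if (!function_exists('e')) {
    function e($args) {
        $args = func_get_args();
        return call_user_func_array($GLOBALS['_global_function_handler_e'], $args);
    }
}

// Hypothetical handlers for two output targets.
function html_handler_e($string) {
    echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
}
function text_handler_e($string) {
    echo $string; // plain text needs no HTML escaping
}

// Default handler for the application (an assumption of this sketch).
$GLOBALS['_global_function_handler_e'] = 'html_handler_e';

// Hypothetical helper: output $value through e() using $handler,
// capture the result, then restore whichever handler was active before.
function render_with($handler, $value) {
    $previous = $GLOBALS['_global_function_handler_e'];
    $GLOBALS['_global_function_handler_e'] = $handler;
    ob_start();
    e($value);
    $out = ob_get_clean();
    $GLOBALS['_global_function_handler_e'] = $previous;
    return $out;
}
```

With this in place, render_with('text_handler_e', 'a < b') returns the string untouched, while render_with('html_handler_e', 'a < b') returns a &lt; b, and the default handler is back in effect after either call.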
So here’s a plea to framework writers in the PHP world: if we could all agree to do this for any function defined in the global scope that has an obvious risk of name clashes, I think we would all be better off.
Just a humble suggestion, of course.