Programming - - By Maarten Manders

Zend_Filter Reviewed, Blacklist / Whitelist Filters

I like Zend Framework’s Zend_Filter class. It’s basically a set of methods for validating untrusted data. Although arguably the two most important checks, isEmail() and isUri(), are still missing (the latter can be worked around with Zend_Uri), the whole thing already looks promising. Here are a few thoughts on the package:

  • Remove isGreaterThan() and isLessThan(). That’s what we have the “<” and “>” operators for. I can understand the designer’s intention to deliver a complete set of tests, but these just bloat both Zend_Filter’s and the user’s code. There is no isEqualTo(), either.
  • isDate() looks like a stub. This should be replaced by something more sophisticated.
  • Clean up the code of isHostname().
  • The method name isRegex() makes me think that it checks whether the argument is a valid regular expression. Since pattern matching is a special way of filtering anyway, I’d just abandon the “is” prefix and call it match().
  • I don’t know whether isName() works accurately for exotic names. Besides, it can easily be left out, since that’s a job for whitelist filtering. See below.
  • International support for isPhone(). I can deliver a Swiss implementation for it, just let me know. By the way, apply self::getDigits() to the input instead of checking it with ctype_digit().
  • Let’s add three more class methods to Zend_Filter. The first one escapes a string for safe use in regular expressions:

public static function getRegexEscaped($input) {
  $output = '';
  // Encode every byte as \xHH so it is taken literally inside a pattern
  for($i = 0; $i < strlen($input); $i++) {
    $output .= '\x'.bin2hex($input[$i]);
  }
  return $output;
}

  • The next one filters a string through a character whitelist:

public static function getWhitelisted($input, $allowed_chars = '', $allow_alpha = true, $allow_numeric = true) {
  // Negated character class: everything not on the whitelist gets removed
  $regex = '%[^'.($allow_alpha ? '[:alpha:]' : '').($allow_numeric ? '\d' : '').self::getRegexEscaped($allowed_chars).']%';
  return preg_replace($regex, '', $input);
}

  • Where there’s whitelisting, there should be blacklisting, too. On second thought, though, this one would be better implemented with str_replace().

public static function getBlacklisted($input, $forbidden_chars) {
  // Character class of all forbidden characters; every match gets removed
  $regex = '%['.self::getRegexEscaped($forbidden_chars).']%';
  return preg_replace($regex, '', $input);
}
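
The str_replace() variant mentioned above could look roughly like this. It is only a minimal sketch and not part of Zend_Filter: the function name stripForbiddenChars() is made up for illustration, and it works byte by byte, so it assumes the blacklist contains single-byte characters only.

```php
<?php
// Hypothetical standalone helper (not part of Zend_Filter): removes every
// byte listed in $forbiddenChars from $input without using a regex.
function stripForbiddenChars(string $input, string $forbiddenChars): string
{
    if ($forbiddenChars === '') {
        return $input; // empty blacklist: nothing to strip
    }
    // str_split() turns the blacklist into an array of single bytes;
    // str_replace() then replaces each of them with the empty string.
    return str_replace(str_split($forbiddenChars), '', $input);
}

echo stripForbiddenChars('<b>bold</b>', '<>/'), "\n"; // prints "bboldb"
```

Since no pattern is built, nothing has to be escaped here, which is the main argument for str_replace() over the regex version.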

For example, we can use the more flexible whitelisting method instead of Zend_Filter::isName().

/* We only allow letters, spaces and dashes in names */
$name = Zend_Filter::getWhitelisted($name, " -", true, false);
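
To try the call above without the framework, here is a self-contained sketch. The class name CharFilter is a hypothetical stand-in for Zend_Filter, the two methods are my reading of the proposals in this post, and the guard for a completely empty whitelist is an addition of mine.

```php
<?php
// Hypothetical stand-in for the proposed Zend_Filter additions, so the
// example runs without the framework.
final class CharFilter
{
    // Encodes every byte as \xHH so it is literal inside a PCRE pattern.
    public static function getRegexEscaped(string $input): string
    {
        $output = '';
        for ($i = 0, $n = strlen($input); $i < $n; $i++) {
            $output .= '\x' . bin2hex($input[$i]);
        }
        return $output;
    }

    // Removes every character that is not on the whitelist.
    public static function getWhitelisted(
        string $input,
        string $allowedChars = '',
        bool $allowAlpha = true,
        bool $allowNumeric = true
    ): string {
        $class = ($allowAlpha ? '[:alpha:]' : '')
            . ($allowNumeric ? '\d' : '')
            . self::getRegexEscaped($allowedChars);
        if ($class === '') {
            return ''; // empty whitelist: nothing may pass
        }
        return preg_replace('%[^' . $class . ']%', '', $input);
    }
}

echo CharFilter::getRegexEscaped(' -'), "\n"; // prints "\x20\x2d"

// Letters, spaces and dashes only; the apostrophe and the digit are dropped.
echo CharFilter::getWhitelisted("Anne-Marie O'Neil 3rd", ' -', true, false), "\n";
// prints "Anne-Marie ONeil rd"
```

As far as I know, [:alpha:] only matches ASCII letters here unless a locale is set or the pattern is made Unicode-aware, so accented names would lose characters with this sketch.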