By Toby Osbourn

Input Validation Using Filter Functions


I’d like to start off this article by thanking you for making it even this far. I’m fully aware that “Input Validation Using Filter Functions” isn’t exactly the sexiest article title in the world!

Filter functions in PHP might not be sexy, but they can improve the stability, security, and even maintainability of your code if you learn how to use them correctly.

In this article I’ll explain why input validation is important, why using PHP’s built-in functions for performing input validation is important, and then throw together some examples (namely using filter_input() and filter_var()), discuss some potential pitfalls, and finish with a nice, juicy call to action. Sound good? Let’s go!

Why Input Validation is Important

Input validation is one of the most important things you can do to ensure code security because input is often the one thing about your application you cannot directly control. Because you cannot control it, you cannot trust it.

Unfortunately, as programmers we often write things thinking only of how we want them to work. We don’t consider how someone else might want to make them work – either out of curiosity, ignorance, or malice.

I am not going to go into too much detail about the trouble you can get into if you do not validate user input; there’s a really good article on this very site called PHP Security: Cross-Site Scripting Attacks if you want to read up on it. But I will say that validating your input is the first step to ensuring that the code you have written will be executed as intended.

Maybe you are coming to PHP from another language and you might be thinking, “this was never an issue before so why should I care?” Validation is an issue because PHP is loosely typed. This makes PHP great for some things, but it can make things like data validation a little bit trickier because you can pretty much pass anything to anything.

Why Using Built-in Methods is Important

To make validation a little bit easier, from PHP 5.2.0 onward we can use the filter_input() and filter_var() functions. I’ll talk about them in more detail soon, but first I want to talk about why we should be using PHP-provided functionality instead of relying on our own methods or third-party tools.

When you roll your own validation methods, you generally fall into the same trap that you can fall into when designing other functionality: you think about the edge cases you want to think about, not necessarily all of the different vectors that could be used to disguise certain input. Another issue is, if you are anything like me, the first 10 minutes of any code review dealing with hand-rolled validation code is spent tutting because the programmer didn’t do exactly what you would have done. This can lead to programmers spending more time learning the codebase and reading internal documentation, time that could instead be spent coding.

Some people don’t roll their own, but instead opt for a third-party solution. There are some good ones out there, and in the past I have used OWASP ESAPI for some extra validation. These are perhaps better than the hand-rolled solutions because more eyes have looked over them, but then you have the issue of introducing third-party code into your project. Again, this increases time spent learning a codebase and reading additional documentation instead of coding.

For these reasons, using native functions is better. Moreover, because such functions are baked into the language, we have one place to go for all of the documentation. New developers will have a greater chance of knowing what the code is and how best to use it, and the code will be easier to support as a result.

Hopefully by now I have you convinced that validation is important, and that it would be a good idea to use PHP functions to help you achieve your validation needs. If you are not convinced, leave a comment and let’s discuss it.

Some Examples

The filter_input() function was introduced in PHP 5.2.0 and allows you to get an external variable by name and filter it. This is incredibly useful when dealing with $_GET and $_POST data.

Let’s take as an example a simple page that reads a value passed in from the URL and handles it. We know this value should be an integer between 15 and 20.
One way of doing this would be something like:

<?php
if (isset($_GET["value"])) {
    $value = $_GET["value"];
}
else {
    $value = false;
}
if (is_numeric($value) && ($value >= 15 && $value <= 20)) {
    // run my code
}
else {
    // handle the issue
}

This is a really basic example and already we are writing more lines than I would like to see.

First, because we can’t be sure $_GET["value"] is set, the code performs an appropriate check so that the script doesn’t fall over.

Next is the fact that $value is now a “dirty” variable because it has been directly assigned from a $_GET value. We would need to take care not to use $value anywhere else in the code in case we break anything.

Then there is the issue that 16.0 is valid because is_numeric() okays it.

And finally, we have an issue with the fact that the if statement is a bit of a mouthful to take in and is an extra bit of logic to work through when you are tracing through the code.
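
The is_numeric() loophole is easy to confirm. The snippet below is a minimal sketch that wraps the hand-rolled check in a hypothetical helper, passesHandRolledCheck() (the name is not part of the original example), and feeds it a few sample strings of the kind a query string would deliver:

```php
<?php
// Hypothetical helper: the same test as the if statement above,
// wrapped in a function so it can be called with sample input.
function passesHandRolledCheck($value) {
    return is_numeric($value) && ($value >= 15 && $value <= 20);
}

// Scalar query string values arrive as strings, so the samples are strings too.
var_dump(passesHandRolledCheck("16"));    // bool(true)
var_dump(passesHandRolledCheck("16.0"));  // bool(true), although it is not an integer
var_dump(passesHandRolledCheck("1.6e1")); // bool(true), scientific notation for 16
var_dump(passesHandRolledCheck("abc"));   // bool(false)
```

Anything is_numeric() recognises as a number that falls inside the range gets through, whether or not it is written as an integer.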

Compare the above example now to this:

<?php
$value = filter_input(INPUT_GET, "value", FILTER_VALIDATE_INT,
    array("options" => array("min_range" => 15, "max_range" => 20)));
if ($value) {
    // run my code
}
else {
    // handle the issue
}

Doesn’t that make you feel warm and fuzzy?

filter_input() handles the $_GET value not being set, so you don’t have to stress over whether the script is receiving the correct information or not.

You also don’t have to worry about $value being dirty because it has been validated before it has been assigned.

Note now that 16.0 is no longer valid.

And finally, our logic is no longer complicated. It’s just a quick check for a truthy value (filter_input() will return false if the validation fails and null if $_GET["value"] wasn’t set).
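
Because filter_input() reads from the request itself, it is awkward to exercise outside a web server. The sketch below therefore uses filter_var(), which accepts the same filter and options, to show what the range check does with a few sample strings. The helper name checkRange() is an assumption made for illustration:

```php
<?php
// Hypothetical helper: applies the same filter and options as the
// filter_input() call above, but to a value we pass in directly.
function checkRange($raw) {
    return filter_var($raw, FILTER_VALIDATE_INT,
        array("options" => array("min_range" => 15, "max_range" => 20)));
}

var_dump(checkRange("16"));   // int(16), validated and converted to an integer
var_dump(checkRange("16.0")); // bool(false), not written as an integer
var_dump(checkRange("21"));   // bool(false), outside the range
var_dump(checkRange("abc"));  // bool(false)
```

The truthy check is safe here only because 0 is outside the allowed range. When 0 is a legitimate value, or when a missing parameter needs different handling from an invalid one, compare the result with === false and === null instead.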

Obviously in a real world setting you could extract the array out into a variable stored in a configuration file somewhere so things can get changed without even needing to go into business logic. Gorgeous!
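
One way to do that is sketched below with a hypothetical $config array. The key names are assumptions, and in practice the array would be loaded from wherever your project keeps its configuration:

```php
<?php
// Hypothetical configuration entry: the filter and its options are
// kept together under the name of the input they apply to.
$config = array(
    "value" => array(
        "filter"  => FILTER_VALIDATE_INT,
        "options" => array("options" => array("min_range" => 15, "max_range" => 20)),
    ),
);

// The business logic only looks the rule up by name.
// filter_var() is used so the sketch runs without a web request;
// filter_input(INPUT_GET, "value", ...) takes the same two arguments.
$rule  = $config["value"];
$value = filter_var("18", $rule["filter"], $rule["options"]);

var_dump($value); // int(18)
```

Changing the allowed range then means editing the configuration entry and nothing else.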

Now you might be thinking that this might be useful for simple scripts that grab a couple of $_GET or $_POST variables, but what about for use inside of functions or classes? Luckily we have filter_var() for that.

The filter_var() function was introduced at the same time as filter_input() and does much the same thing.

<?php
// This is a sample function, do not use this to actually email,
// that would be silly.
function emailUser($email) {
    mail($email, "Here is my email", "Some Content");
}

The danger here is that there is nothing to stop the mail() function from attempting to send an email to literally any value that could be stored in $email. This could lead to emails not getting sent or, in a worst case scenario, to input getting in that uses the function for malicious intent.

I have seen people do a check on the result of mail(), which is fine to see if the function completed successfully, but by the time a value is returned the damage is done.

Something like this is much more sane:

<?php
// This is a sample function, do not use this to actually email,
// that would be silly.
function emailUser($email) {
    $email = filter_var($email, FILTER_VALIDATE_EMAIL);
    if ($email !== false) {
        mail($email, "Here is my email", "Some Content");
    }
    else {
        // handle the issue: invalid email address
    }
}
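
To get a feel for what the filter lets through, the sketch below runs a few sample strings (all made up for illustration) through FILTER_VALIDATE_EMAIL. Exact behaviour can vary between PHP versions, so treat it as something to run on your own installation rather than a definitive list:

```php
<?php
$samples = array(
    "user@example.com",
    "user+tag@example.com",
    "not-an-email",
    "user@example.com\r\nBcc: someone@example.com", // header injection attempt
);

foreach ($samples as $sample) {
    $result = filter_var($sample, FILTER_VALIDATE_EMAIL);
    // json_encode() makes the line breaks in the last sample visible.
    echo json_encode($sample), " => ",
        ($result === false ? "rejected" : "accepted"), "\n";
}
```

The last sample is the kind of value that could cause trouble if it reached mail() unchecked, which is why the validation happens before the call and not after it.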

The problem with a lot of examples, the above included, is that they are basic. You might be thinking that filter_var() or filter_input() can’t be used for anything other than basic checking. The fine folks who introduced these functions considered that and provided a filter called FILTER_CALLBACK that you can pass to either function.

FILTER_CALLBACK allows you to pass in a function you have created that will accept as the input the variable being filtered – this is where you can start to have a lot of fun because you can start applying your own business logic to your filtering.
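
As a minimal sketch, suppose the business rule is that an order reference must look like ORD- followed by digits. The rule, the function name normalizeOrderReference(), and the sample values are all hypothetical. The only part taken from PHP itself is that FILTER_CALLBACK expects the callback under the "options" key:

```php
<?php
// Hypothetical business rule: accept "ORD-" followed by one or more
// digits (in any letter case) and return it upper-cased; otherwise
// return false, mirroring what the built-in validation filters do.
function normalizeOrderReference($value) {
    $value = strtoupper(trim($value));
    if (preg_match('/^ORD-[0-9]+$/D', $value) === 1) {
        return $value;
    }
    return false;
}

$good = filter_var(" ord-123 ", FILTER_CALLBACK,
    array("options" => "normalizeOrderReference"));
$bad = filter_var("123-ord", FILTER_CALLBACK,
    array("options" => "normalizeOrderReference"));

var_dump($good); // string(7) "ORD-123"
var_dump($bad);  // bool(false)
```

Returning false for bad input is a design choice here, not a requirement. It simply lets the calling code treat the callback like any other validation filter.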

Some Potential Pitfalls

These functions are pretty great, and they allow you to do some really powerful filtering, which as we have discussed can help improve the security and reliability of your code. There are some potential drawbacks, however, and I would be remiss if I didn’t point them out.

The main pitfall is that the functions are only as good as the filters you apply with them. Take the last example using email validation: how FILTER_VALIDATE_EMAIL handles email addresses has changed between 5.2.14 and 5.3.3, and even assuming all your applications run on the same version of PHP, there are technically valid email addresses that you might not expect. Be sure you know about the filters you are using.

The second pitfall is that people think that if they put in some filters then their code is secure. Filtering your variables goes some way to helping, but it doesn’t make your code 100% safe from abuse. I would love to talk more about this, but that is out of the scope of this article and my word count is already pretty high!

Conclusion

Hopefully you have found this introduction to input validation in PHP useful. And now, time for a call to action!

I want you to take one function in your code, just one, and see what happens to it when you pass in different data types and different values. Then I want you to apply some of the filtering methods discussed here and see if there is a difference in how your code performs. I would love to know how you got on in the comments.

Image via Chance Agrella / Freerangestock.com
