Key Takeaways
- Never trust foreign input in your application. It’s crucial to filter any data incorporated into your applications to prevent potential vectors for attackers to inject code.
- Two main types of data filtering in PHP are validation and sanitization. Validation ensures the foreign input is what we expect it to be, while sanitization removes illegal or unsafe characters from foreign input.
- PHP provides an array of filters for both validation and sanitization. These filters can be applied using the filter_var() and filter_input() functions, making your PHP applications more secure and reliable.
In this article, we’ll look at why it’s so important to filter anything that’s incorporated into our applications. In particular, we’ll look at how to validate and sanitize foreign data in PHP.
Never (ever!) trust foreign input in your application. That’s one of the most important lessons to learn for anyone developing a web application.
Foreign input can be anything: $_GET and $_POST form data, parts of the HTTP request body, or even some values in the $_SERVER superglobal. Cookies, session values, and uploaded or downloaded files are also considered foreign input.
Every time we process, output, include or concatenate foreign data into our code, there’s a potential vector for attackers to inject code into our application (the so-called injection attacks). Because of this, we need to make sure every piece of foreign data is properly filtered so it can be safely incorporated into the application.
When it comes to filtering, there are two main types: validation and sanitization.
Validation
Validation ensures that foreign input is what we expect it to be. For example, if we’re expecting an email address, the input should follow the ********@*****.*** format, and we can check it with the FILTER_VALIDATE_EMAIL filter. Or, if we’re expecting a Boolean, we can use PHP’s FILTER_VALIDATE_BOOL filter.
Among the most useful filters are FILTER_VALIDATE_BOOL, FILTER_VALIDATE_INT, and FILTER_VALIDATE_FLOAT for basic types, and FILTER_VALIDATE_EMAIL and FILTER_VALIDATE_DOMAIN for email addresses and domain names respectively.
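As a rough illustration of how these validation filters behave, the sketch below runs a few hard-coded strings through filter_var(). The sample values and variable names are made up for this example. The code uses FILTER_VALIDATE_BOOLEAN, the older name of the Boolean filter, because the FILTER_VALIDATE_BOOL alias is only available from PHP 8.0.

```php
<?php
// Validation filters return the (converted) value on success and false on failure.
$age      = filter_var("42", FILTER_VALIDATE_INT);        // int 42
$price    = filter_var("3.14", FILTER_VALIDATE_FLOAT);    // float 3.14
$notAnInt = filter_var("forty-two", FILTER_VALIDATE_INT); // false

// Options narrow what counts as valid; here only 1..100 is accepted.
$percent = filter_var("150", FILTER_VALIDATE_INT, [
    "options" => ["min_range" => 1, "max_range" => 100],
]); // false, because 150 is out of range

// FILTER_NULL_ON_FAILURE lets us tell a real "false" apart from "not a Boolean at all".
$optIn   = filter_var("yes", FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);   // true
$optOut  = filter_var("off", FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);   // false
$unclear = filter_var("maybe", FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE); // null

// FILTER_FLAG_HOSTNAME makes the domain filter enforce hostname rules.
$host = filter_var("example.com", FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME); // "example.com"

var_dump($age, $price, $notAnInt, $percent, $optIn, $optOut, $unclear, $host);
```

Because a failed validation returns false, it’s safest to compare results with === rather than relying on truthiness: a valid integer 0 or a valid Boolean false would otherwise look like a failure.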
Another very important filter is FILTER_VALIDATE_REGEXP, which allows us to validate input against a regular expression. With this filter, we can create our own custom filters by changing the regular expression we’re matching against.
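Here is a minimal sketch of a custom filter built on FILTER_VALIDATE_REGEXP. The validate_username() helper and its rule (3 to 16 lowercase letters, digits, or underscores) are assumptions made up for this example, not something PHP provides.

```php
<?php
/**
 * Hypothetical helper: returns the username if it matches the rule, false otherwise.
 */
function validate_username($input)
{
    return filter_var($input, FILTER_VALIDATE_REGEXP, [
        // \A and \z anchor the whole string, so a trailing newline is rejected too.
        "options" => ["regexp" => '/\A[a-z0-9_]{3,16}\z/'],
    ]);
}

var_dump(validate_username("claudio_r"));  // the string "claudio_r"
var_dump(validate_username("no spaces!")); // bool(false)
```

Changing the pattern is all it takes to turn this into a different rule, such as a postal code or a product SKU.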
All the available validation filters are listed on the “Validate filters” page of the PHP manual.
Sanitization
Sanitization is the process of removing illegal or unsafe characters from foreign input.
The best example of this is when we sanitize database inputs before inserting them into a raw SQL query.
Again, some of the most useful sanitization filters are the ones for basic types, like FILTER_SANITIZE_SPECIAL_CHARS and FILTER_SANITIZE_NUMBER_INT, but also FILTER_SANITIZE_URL and FILTER_SANITIZE_EMAIL to sanitize URLs and emails. The older FILTER_SANITIZE_STRING filter is deprecated as of PHP 8.1, so it’s best avoided in new code.
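To give a feel for what sanitization does, the sketch below applies four sanitization filters (FILTER_SANITIZE_NUMBER_INT, FILTER_SANITIZE_EMAIL, FILTER_SANITIZE_URL, and FILTER_SANITIZE_SPECIAL_CHARS) to hard-coded sample strings. The inputs are invented for illustration.

```php
<?php
// Keeps only digits and the + and - signs. The result is still a string.
$quantity = filter_var("12 apples", FILTER_SANITIZE_NUMBER_INT); // "12"

// Removes characters that are not allowed in an email address, such as parentheses.
$email = filter_var("(bogus@example.org)", FILTER_SANITIZE_EMAIL); // "bogus@example.org"

// Removes characters that are not allowed in a URL, such as spaces.
$url = filter_var("https://example.com/a b", FILTER_SANITIZE_URL); // "https://example.com/ab"

// HTML-encodes <, >, &, and quotes, so the markup is displayed instead of interpreted.
$html = filter_var("<b>Tom & Jerry</b>", FILTER_SANITIZE_SPECIAL_CHARS);

var_dump($quantity, $email, $url, $html);
```

Note that sanitizing doesn’t make a value valid: a sanitized email can still be malformed, so it’s common to sanitize first and validate the result afterwards.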
All PHP sanitization filters are listed on the “Sanitize filters” page of the PHP manual.
filter_var() and filter_input()
Now that we know PHP has an entire selection of filters available, we need to know how to use them.
Filters are applied via the filter_var() and filter_input() functions.
The filter_var() function applies a specified filter to a variable. It takes the value to filter, the filter to apply, and an optional array of options. For example, if we’re trying to validate an email address, we can use this:
<?php
// The address must be a quoted string, and the statement ends with a semicolon.
$email = "your.email@sitepoint.com";
if ( filter_var( $email, FILTER_VALIDATE_EMAIL ) ) {
    echo "This email is valid";
}
If the goal was to sanitize a string, we could use this:
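<?php
$string = "<h1>Hello World</h1>";
// FILTER_SANITIZE_STRING is deprecated as of PHP 8.1, so this uses
// FILTER_SANITIZE_SPECIAL_CHARS, which HTML-encodes < and > instead of stripping tags.
$sanitized_string = filter_var( $string, FILTER_SANITIZE_SPECIAL_CHARS );
echo $sanitized_string;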
The filter_input() function fetches an external variable by name and filters it.
It works just like the filter_var() function, but it takes the type of input (INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV), the name of the variable to filter, and the filter. Optionally, it can also take an array of options.
Once again, if we want to check whether the external input variable “email” is being sent via GET to our application and holds a valid address, we can use this:
<?php
if ( filter_input( INPUT_GET, "email", FILTER_VALIDATE_EMAIL ) ) {
echo "The email is being sent and is valid.";
}
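The check above treats a missing parameter and an invalid one the same way. filter_input() actually distinguishes them: it returns null when the variable isn’t set and false when the filter fails. A minimal sketch of handling all three outcomes follows; describe_email_result() is a hypothetical helper written for this example.

```php
<?php
/**
 * Hypothetical helper: turns the three possible filter_input() outcomes into a message.
 */
function describe_email_result($result)
{
    if ($result === null) {
        return "No email parameter was sent.";
    }
    if ($result === false) {
        return "An email parameter was sent, but it is not a valid address.";
    }
    // Escape on output, even though the value passed validation.
    return "Valid email: " . htmlspecialchars($result, ENT_QUOTES);
}

// In a web request such as /page.php?email=..., this reads the GET parameter.
$result = filter_input(INPUT_GET, "email", FILTER_VALIDATE_EMAIL);
echo describe_email_result($result), PHP_EOL;
```

When run from the command line there is no GET data, so the first branch is the one that fires.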
Conclusion
And these are the basics of data filtering in PHP. Other techniques might be used to filter foreign data, like applying regex, but the techniques we’ve seen in this article are more than enough for most use cases.
Make sure you understand the difference between validation and sanitization and how to use the filter functions. With this knowledge, your PHP applications will be more reliable and secure!
Frequently Asked Questions (FAQs) about PHP Data Filtering
What is the importance of PHP data filtering?
PHP data filtering is crucial for ensuring the security and integrity of your web applications. It helps in validating and sanitizing user input, which is a common source of security vulnerabilities like SQL injection and cross-site scripting (XSS). By filtering data, you can prevent malicious or erroneous data from causing harm to your application or database. It also helps in maintaining data consistency and reliability, which is essential for the smooth operation of your application.
How does PHP data filtering work?
PHP data filtering works by applying specific filter functions to the data. These functions can validate data (check if the data meets certain criteria) or sanitize data (remove any illegal or unsafe characters from the data). PHP provides a variety of built-in filter functions that you can use, such as filter_var(), filter_input(), and filter_has_var().
What are some common PHP filter functions and how are they used?
Some common PHP filter functions include filter_var(), filter_input(), filter_has_var(), and filter_id(). The filter_var() function filters a variable with a specified filter. The filter_input() function gets a specific external variable by name and optionally filters it. The filter_has_var() function checks if a variable of a specified input type exists and the filter_id() function returns the filter ID belonging to a named filter.
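As a small sketch of the lookup functions mentioned above, the snippet below uses filter_list() to get the supported filter names and filter_id() to resolve one of them. The name "validate_email" is the one PHP reports for FILTER_VALIDATE_EMAIL; the variable names are made up for the example.

```php
<?php
// filter_list() returns the names of all filters supported by this PHP build.
$names = filter_list();
$hasEmailFilter = in_array("validate_email", $names, true);

// filter_id() maps a filter name to the numeric ID behind the FILTER_* constant.
$emailFilterId = filter_id("validate_email");
$sameAsConstant = ($emailFilterId === FILTER_VALIDATE_EMAIL);

// The ID can then be passed to filter_var() like any FILTER_* constant.
$checked = filter_var("test@example.com", $emailFilterId);

var_dump($hasEmailFilter, $sameAsConstant, $checked);
```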
How can I use the filter_var() function in PHP?
The filter_var() function in PHP is used to filter a variable with a specified filter. It takes the value to be filtered, the filter to apply, and, optionally, an array of options or a bitmask of flags. For example, to check whether a variable holds a valid email address, you can use the FILTER_VALIDATE_EMAIL filter with the filter_var() function like this:
$email = "test@example.com";
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
    echo("This ($email) is a valid email address.");
} else {
    echo("This ($email) is not a valid email address.");
}
What is the difference between validating and sanitizing data in PHP?
Validating data in PHP involves checking if the data meets certain criteria. For example, you might validate an email address to ensure it’s in the correct format. On the other hand, sanitizing data involves removing or replacing any illegal or unsafe characters from the data. This is often done to prevent security vulnerabilities like SQL injection and cross-site scripting (XSS).
How can I sanitize user input in PHP?
You can sanitize user input in PHP using the filter_input() function with a sanitization filter. FILTER_SANITIZE_STRING was long used for this, but it is deprecated as of PHP 8.1. Instead, to sanitize a GET input for a username, you can use the FILTER_SANITIZE_SPECIAL_CHARS filter, which HTML-encodes special characters, like this:
$username = filter_input(INPUT_GET, 'username', FILTER_SANITIZE_SPECIAL_CHARS);
What are some common PHP filter flags and how are they used?
PHP filter flags are used to add extra constraints or modifiers to the filtering process. Some common flags include FILTER_FLAG_STRIP_LOW, FILTER_FLAG_STRIP_HIGH, FILTER_FLAG_ENCODE_LOW, and FILTER_FLAG_ENCODE_HIGH. For example, the FILTER_FLAG_STRIP_LOW flag strips characters that have ASCII value below 32 from the input.
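As an illustration of the strip flags, the sketch below pairs them with FILTER_UNSAFE_RAW, a filter that leaves the string untouched unless flags tell it otherwise. The sample strings are made up for the example.

```php
<?php
// Contains a bell character (ASCII 7) and a trailing newline (ASCII 10).
$raw = "Hello\x07 World\n";

// FILTER_FLAG_STRIP_LOW removes every character with an ASCII value below 32.
$clean = filter_var($raw, FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW); // "Hello World"

// Flags can be combined with the bitwise OR operator. Here the two bytes of the
// UTF-8 "é" (above 127) and the tab (below 32) are both removed.
$ascii = filter_var(
    "Caf\xC3\xA9\t",
    FILTER_UNSAFE_RAW,
    FILTER_FLAG_STRIP_LOW | FILTER_FLAG_STRIP_HIGH
); // "Caf"

var_dump($clean, $ascii);
```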
How can I use filter flags in PHP?
You can use filter flags in PHP by passing them as the options argument, which is the third parameter of filter_var() and the fourth parameter of filter_input(). For example, to filter an IP address and allow only IPv4 addresses, you can use the FILTER_FLAG_IPV4 flag like this:
$ip = "127.0.0.1";
if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4)) {
    echo("This ($ip) is a valid IPv4 address.");
} else {
    echo("This ($ip) is not a valid IPv4 address.");
}
What is the filter_has_var() function in PHP?
The filter_has_var() function in PHP is used to check if a variable of a specified input type exists. It takes two parameters: the type of input and the name of the variable. If the variable exists, it returns TRUE; otherwise, it returns FALSE.
How can I use the filter_has_var() function in PHP?
You can use the filter_has_var() function in PHP to check if a certain input variable exists. For example, to check if a POST variable named 'username' exists, you can do this:
if (filter_has_var(INPUT_POST, 'username')) {
    echo("The 'username' variable exists.");
} else {
    echo("The 'username' variable does not exist.");
}
Cláudio Ribeiro is a software developer, traveler, and writer from Lisbon. He's the author of the book An IDE Called Vim. When he is not developing some cool feature at Kununu he is probably backpacking somewhere in the world or messing with some obscure framework.