PHP Validation

Can someone show me the best way to securely validate form data in PHP?

My main question is not about validating the data itself but how to implement validation securely. In most of the examples I have come across, people tend to assign the form data into a variable and then begin validation. My concern is what if the data has been manipulated to begin with. Would it not be better to validate the data in an if statement before assigning the data to a variable for further use?

For example, say you have a textarea field and someone puts a SQL query into the textarea. When you assign the textarea value to a variable, could the SQL query not then be executed?

Thanks,

Your suspicions are correct: you should run your validation through filter_input for the $_POST array. As for the textarea scenario, you should not put input into a variable and then tack it into a query; you should use prepared statements. Links below as well…

https://www.php.net/manual/en/function.filter-input.php

https://www.php.net/manual/en/book.pdo.php
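
As a rough illustration of the prepared-statement advice, here is a minimal sketch. The helper name saveMessage, the table messages, and the in-memory SQLite database are assumptions made only so the example is self-contained (it also assumes the pdo_sqlite extension is available); a real application would use its own connection and schema.

```php
<?php
// Hypothetical helper: stores a message using a prepared statement.
// The table name `messages` and its `body` column are assumptions for this sketch.
function saveMessage(PDO $pdo, string $message): void
{
    // The placeholder keeps the value separate from the SQL text,
    // so the value is never parsed as SQL.
    $stmt = $pdo->prepare('INSERT INTO messages (body) VALUES (:body)');
    $stmt->execute([':body' => $message]);
}

// In-memory SQLite is used only to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)');

// In a real request this would come from the form,
// e.g. filter_input(INPUT_POST, 'message').
$hostile = "x'); DROP TABLE messages; --";
saveMessage($pdo, $hostile);

// The hostile text is stored as plain data; the table still exists.
$stored = $pdo->query('SELECT body FROM messages')->fetchColumn();
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM messages')->fetchColumn();
```

The point of the sketch is that the submitted text ends up in the table exactly as typed, rather than changing what the query does.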

You validate data to make sure it meets the business needs/requirements of your application, to decide if you are even going to use the data. This has nothing directly to do with security. Security is accomplished by how you use the data.

Your post method form processing code should -

  1. detect if a post method form was submitted.
  2. keep the form data as a set, in a php array variable, then reference elements in this array throughout the rest of the code.
  3. trim all the input data at once, mainly so that you can detect if a value was all white-space characters. after you do item #2 on this list, you can accomplish this with one single line of code.
  4. other than trimming the data, do not modify, i.e. filter, sanitize, … the data as this changes the meaning of the data. if the data is not valid, let the user know what was wrong with it, let them fix the problem, and resubmit the data.
  5. validate the now trimmed data, storing user/validation errors in an array using the field name as the main array index.
  6. after the end of the validation logic, if there are no user/validation errors (the array holding the user/validation errors is empty), use the trimmed data.
  7. after using the data, if there are no user/validation errors, in case using the data can produce additional errors, redirect to the exact same url of the current page to cause a get request for that page. This will prevent the browser from trying to resubmit the form data should that page get reloaded or browsed away from and back to.
  8. if there are errors at item #7, continue on to display the html document, display any errors, redisplay the form, populating the field values with any existing data so that the user doesn’t need to keep reentering values over and over upon an error.
  9. any dynamic value you output in a html context should have htmlentities() applied to it, right before outputting it, to help prevent cross site scripting.
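
The steps above could be sketched roughly as follows. The function name validateContactForm, the field names, and the error messages are hypothetical, and the one-line trim assumes every submitted field is a plain string (not an array field).

```php
<?php
// Hypothetical field names and rules; adjust to the form's real requirements.
function validateContactForm(array $input): array
{
    // Items 2 and 3: keep the data as a set and trim every value at once.
    // Assumes all fields are strings.
    $data = array_map('trim', $input);

    // Item 5: validate, storing errors keyed by field name.
    $errors = [];
    if (($data['name'] ?? '') === '') {
        $errors['name'] = 'Name is required.';
    }
    if (!filter_var($data['email'] ?? '', FILTER_VALIDATE_EMAIL)) {
        $errors['email'] = 'A valid email address is required.';
    }

    return [$data, $errors];
}

// Item 1: only run the processing code for a post request.
if (($_SERVER['REQUEST_METHOD'] ?? '') === 'POST') {
    [$data, $errors] = validateContactForm($_POST);

    if ($errors === []) {
        // Item 6: use the trimmed data here (save it, send it, ...).
        // Item 7: redirect to the same URL so a reload does not resubmit the form.
        header('Location: ' . $_SERVER['REQUEST_URI']);
        exit;
    }
    // Item 8: otherwise fall through and redisplay the form,
    // using $errors and $data (item 9: escape values when outputting them).
}
```

Keeping the validation in a function that takes an array makes it easy to try with sample data outside a real request.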

I’m no expert in this field by any means, and I might be reading the question incorrectly, but I can’t see how just the simple action of assigning a post variable into another variable would cause an SQL query to be executed. Regardless of what’s inside the text box, your PHP code would need to connect to a database and then explicitly execute the contents of the text box, which would be a crazy thing to do without first sanitising the contents in some way.

If that’s actually possible, perhaps someone with more knowledge - i.e. anyone at all - could explain how that might work.

It wouldn’t be possible. A string, by itself, does nothing. You’d have to expose it to an executing function in order for it to do anything - and pretty much all of those carry great big warning labels whenever we mention them.

It’s why the refrain is always parameterize your queries, and why we never advocate for things like exec or other system functions.

Now that said, I think what the OP is getting at with that particular sentence is the simple SQL injection attack angle. When they say “to a variable”, they mean either a variable interpolated into the existing SQL query text (bad), or a bound parameter of it (less bad).

Validation isn’t just for fighting against injection on the intake, but on the output as well; you don’t want the user to be able to inject, say, a <script> tag into a database entry that will be output to every user of the site - because the browser will probably run the JavaScript contained within.

To clarify, though this has been answered, doing this:-

$message = $_POST['message'] ;

In itself is not dangerous.
SQL injection would only be a problem if you subsequently did something like:-

$sql = $db->query("INSERT INTO table (name, email, phone, message) VALUES ($name, $email, $phone, $message)");

Apart from SQL, it could also be a problem if you did:-

echo $message ;

So…

$message = $_POST['message'] ;

Isn’t dangerous on its own, but I only mention it because I see this a lot:-

$name = $_POST['name'] ;
$email = $_POST['email'] ;
$phone = $_POST['phone'] ;
$message = $_POST['message'] ;

Which while not dangerous, is pointless.
If you look at:-

echo $_POST['message'] ;

and:-

$message = $_POST['message'] ;
echo $message ;

Both do exactly the same thing; the only difference is that one takes twice as many lines of code.
Only make a new variable if you are going to do something useful in the process, even if it is just to trim the string.

$message = trim($_POST['message'] );

Which is one of the steps @mabismad describes.

Getting back to ‘How to validate’. That is very much dependent on what you consider to be valid for a given input. What will you accept? What would you reject?

If in the textarea this was the content:

$sql = $db->query("INSERT INTO table (name, email, phone, message) VALUES ($name, $email, $phone, $message) ;

You are saying assigning this to a variable would not in itself cause any issues? Let’s say they included the other aspects of triggering a Query? As in, the variables, the query execute() function.

Assigning the variable $name does nothing.
Using the variable $name is dangerous.

Particularly, using them in a SQL query in this non-parameterized way is especially dangerous.
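
To make the difference concrete, here is a small sketch that only builds strings and runs no query at all. The hostile value and the users table are made up for illustration.

```php
<?php
// A hostile value, as it might arrive in $_POST['name'] (made up for this sketch).
$name = "x'); DROP TABLE users; --";

// Assignment: the value is just a string of characters. Nothing runs.
$copy = $name;

// Interpolation: the value becomes part of the SQL text itself,
// so it can change what the statement means.
$unsafeSql = "INSERT INTO users (name) VALUES ('$name')";

// Parameterized: the SQL text is fixed; the value would be passed
// separately when the prepared statement is executed.
$safeSql = 'INSERT INTO users (name) VALUES (?)';

echo $unsafeSql, PHP_EOL;
// prints: INSERT INTO users (name) VALUES ('x'); DROP TABLE users; --')
```

The printed text shows why the interpolated version is the dangerous one: the submitted value has closed the original statement and added a second one.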

I understand. But it seems that you should probably do what you can before you use the variable regardless. It seems that validating the data and sanitizing them at the same time if possible is optimal for security. At least that’s what I’m gathering. Probably being a tad ridiculous but hey, better safe than sorry right?

Again, when validating it’s up to you to decide ‘what is valid’.
For something like an email address, that’s quite simple, as there is a filter for that.

filter_var($email, FILTER_VALIDATE_EMAIL)

While it has been correctly stated that validation and sanitisation are two different things, in many cases such as this one, what is valid is clean.
So you may:-

$errors = []; // start with no recorded errors
$email = trim($_POST['email'] ?? '');
if(!filter_var($email, FILTER_VALIDATE_EMAIL)){ $errors[] = "You must supply a valid email address!";}

This is part of a system that builds an array of validation errors.
At the end you can:-

if(count($errors)){
    // There are problems, don't use any submitted data, send the user back to the form to try again
    // Display the recorded errors so they know what to fix
}
else{
     // No errors, proceed using the submitted data,
     // but still be cautious about how you save/print it
}

However, when it comes to inputs like textareas, what is valid is a much more grey area and for you to decide. You may just limit the length; that won’t stop an attempt at SQL/script injection, but if treated properly it shouldn’t be a threat.
This means if you do save to a database, do use prepared statements.
If you echo/print to the browser, either directly from the form script or when you later retrieve submitted data from the database, escape the output.

echo htmlspecialchars($message);
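
For a sense of what the escaping does, here is a minimal sketch with a made-up message. The ENT_QUOTES flag and the UTF-8 encoding are assumptions about the page, not something stated above.

```php
<?php
// Made-up hostile message for illustration.
$message = '<script>alert("hi")</script>';

// ENT_QUOTES also converts single quotes; UTF-8 is assumed as the page encoding.
$safe = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

echo $safe;
// prints: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

The browser then displays the tag as text instead of running it.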

So I’m using Fetch. I send the data to my PHP script and process the form data. When I return the JSON response I use echo. If something slipped through the cracks, what is the best way to mitigate XSS when adding the data back to the DOM? Any suggestions?

You’ve already gotten the suggestion :wink:

I don’t quite understand @m_hutley.

When I return the JSON and the values are now in the JS file, I loop through the JSON typically and then create and append elements. Are you saying the

htmlentities()

can be added to the value inside the JS file? Example:

var text = d.createElement("P");
text.innerText = htmlentities(MyValidatedValueVariabefromPHP);
dom.append(text);

UPDATE:

Just checked htmlentities() is a PHP function. I can test it but would this mess up a JSON? I build a lot of JSON then echo them back to my JS. I’ve had a hard time figuring out how to sanitize the JSON as a whole when echoing back.

Additionally, once the JSON data is back into the JS file and I have extracted the data and adding to the DOM, is there a go to way to handle the data when appending to the DOM to prevent XSS?

A JSON string ignores the characters inside it until it reaches another ". Rendering special characters into their HTML entity codes turns each of them into either &anamehere; or &#anumberhere;, neither of which contains the double quote character.
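
A small sketch of that point, using a made-up value: escaping with htmlentities() before json_encode() does not break the JSON, and json_encode() also escapes raw double quotes on its own. Whether to escape on the server or only when inserting into the DOM is a separate design choice that this sketch does not settle.

```php
<?php
// Made-up value containing both HTML and double quotes.
$value = '<b>"quoted"</b>';

// Escape for HTML first, then encode as JSON.
// ENT_QUOTES and UTF-8 are assumptions for this sketch.
$escaped = htmlentities($value, ENT_QUOTES, 'UTF-8');
$json = json_encode(['message' => $escaped]);

// json_encode() escapes raw double quotes itself,
// so unescaped input also produces valid JSON.
$rawJson = json_encode(['message' => $value]);

// Decoding gives back exactly what was encoded in both cases.
$decoded = json_decode($json, true);
$rawDecoded = json_decode($rawJson, true);
```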

Could you elaborate?

Someone fills out my form, and I then take those values and send them to my php script via Fetch. I run validation and sanitize the values, then I echo back a JSON. Do I need to add anything to the echo statement around the JSON variables? Additionally, what do I use when adding the values back to the DOM in javascript that I received from my echo?

I seem to have a slight disconnect here on this one issue. It seems a lot of people process the form data with PHP where the PHP script is located in the same HTML file as the form. So I can see how htmlentities() might help in preventing XSS attacks because you can echo the data back directly with PHP, but the data is sent back to my JS file and I really don’t want to have PHP tags in my JS file. So is there a JS solution to this, or will htmlentities() strip any XSS from the JSON without making my JSON invalid?

This one question does not have a lot of data online.

Typically, what would be in that? Data you’ve retrieved from a database based on the user-supplied form variables, or a status report on whether the update worked, or a mixture? My thought is that you’d use htmlentities() on the data before you encode it into a JSON object to return it to your JS code.

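A very good method is to use textContent instead of innerHTML. A value assigned to textContent is inserted as literal text and is not parsed as HTML, so any tags in it that could lead to attacks are displayed as text rather than executed.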
