Best practice when dealing with $_POST results

Hi,

When handling data returned in $_POST from a form in PHP, which of the following is the better practice, or are both equally good?

  1. Create variables to be used in the ensuing PHP, e.g.
     $name=$_POST['name'];
  2. Simply use, e.g.
     $_POST['name']
     when required in the subsequent PHP?

Thank you.

The answer is…it depends.

  • If you’re going to access it once within a page/method, then accessing $_POST['name'] directly will be fine.
  • If you’re going to access it multiple times within a page/method, then putting it into a local variable will be slightly more efficient from a performance standpoint (in reality, negligible unless the site is very busy), and it has the added bonus of giving you one place to change if the form field name changes later on.

PHP doesn’t actually copy unchanging values from one variable to another. It uses a reference to the original variable, with reference counting to let it know when all references in the code have been destroyed, so that it can free up memory. If you modify the value in a variable, the original reference is destroyed (this actually removes an entry in the internal symbol table, reduces the reference count to the original variable by one, and creates a new entry in the internal symbol table), and you get a new variable with the now modified value in it.

One thing you should almost always do to input data is trim it, mainly so that you can detect if all white-space characters were entered. This would modify the value, so you would get a trimmed working copy in the internal variable.

However, creating discrete scalar variables for every field, either by writing out a line of code for each one or by magically creating them using extract or variable-variables, indicates that you are not treating the submitted data as a set, where you operate on each member in the same or a similar way. You are instead writing out discrete code each time you use a variable. By keeping the data as a set, in an array variable, you can trim it in one operation, dynamically validate it, dynamically process it, and, when using a template to produce output, simply supply an array of data to the template as its input. Store the trimmed result in an internal array variable and use its elements throughout the rest of the code, so that the unaltered data in $_POST remains available to any other part of your application if needed.


All well-known frameworks abstract the request using a class. This method provides greater flexibility than accessing the request directly through the native API. This approach is used across the board in many different frameworks, not just in PHP.


I think one thing that has been grossly missed in the answers (even though mabismad slightly hints at it) is that you should never trust the values in $_POST or $_GET. Thus you should be doing some validation, and often data transformations, on the values in those arrays to make them “clean”. This often necessitates putting them into a variable first, cleaning the variable, and then using the variable. In very few situations should you ever be using $_POST directly, especially when building query strings or the like. That would lead to things like SQL injection. Side note: check out the function filter_input.

You should also, as good practice, never try to alter the value itself in the $_POST array. Leave those as they are and you have a source of truth when it comes to what was actually submitted.

I often use the first method you show, putting the array values into a variable (usually through some form of sanitization method at the same time). This means, as mabis said, trimming it but also validating it. Is it an integer in the correct range? Is it an empty string? Is it a string that is too long or that contains harmful script tags? You should always be sanitizing and validating your inputs from these arrays.

I hope this advice helps. :slight_smile:


Citation please. (I am excluding the method reference specifically)

The OP’s example is a “variable for nothing”. It does nothing whatsoever except pollute the code base and the GLOBAL space by adding yet another GLOBAL variable for data that already exists. The original $_POST variable does not magically disappear. NOTHING has changed, therefore there is no reason to create ANOTHER variable of what is ALREADY a variable. Yes, $_POST['name'] IS a variable already.

Per the Manual of what a variable is…

Variables in PHP are represented by a dollar sign followed by the name of the variable.

More specifically, it is a Predefined Variable. A variable all the same.
https://www.php.net/manual/en/reserved.variables.php

If you were to perform some action on it such as trimming it, then sure, create a new variable since it has been potentially modified. In that usage it is an easy flag when reading the code to know there may have been a change to the original value.

Code Proof…

<?php
$name=$_POST['name'] = 'Bob';
echo $name . '<br>';
echo $_POST['name']. '<br>';
var_dump($GLOBALS);

Result:
Bob
Bob
array(9) { ["_GET"]=> array(0) { } ["_POST"]=> array(1) { ["name"]=> string(3) "Bob" }


I’d have to go back and find the most recent source that has studied this, but it’s pretty consistent across the programming languages.

Technically, $_POST is an array of variables. $_POST["name"] is a named array element.

If you copy the array value to a local variable, it’s going to allocate the memory location so it can be directly accessed. If $name is referenced, it goes directly to that memory location. If $_POST["name"] is referenced, the array lookup has to occur each time before the memory location is accessed. Granted, that lookup is quick with modern memory and processors, and will only become noticeable over large scales of transactions, but the lookup MUST occur first. Hence the slight efficiency gain when the array element needs to be accessed multiple times.

The multiple times being the key. A once and done reference wouldn’t add any benefit. But if you have a method where the element would be accessed 10-20 times, it makes a difference. Granted refactoring should be looked at if that many references are needed, but I’ve seen it many times.
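If anyone wants to measure this rather than argue about it, here is a rough micro-benchmark sketch. It assumes PHP 7.3 or later (for hrtime) and fills $_POST by hand so it runs from the command line; the field name and the iteration count are arbitrary choices, and the numbers will vary by machine and PHP version.

```php
<?php
// Rough micro-benchmark sketch. $_POST is filled by hand so the script can
// run from the command line; the field name and loop count are arbitrary.
$_POST['name'] = 'Bob';
$iterations = 1000000;

// Time repeated reads of the array element.
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $length = strlen($_POST['name']);
}
$arrayNs = hrtime(true) - $start;

// Time repeated reads of a local variable holding the same value.
$name = $_POST['name'];
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $length = strlen($name);
}
$localNs = hrtime(true) - $start;

printf("array element:  %.1f ms\n", $arrayNs / 1e6);
printf("local variable: %.1f ms\n", $localNs / 1e6);
```

Treat the output as a hint only. A loop like this measures the engine's overhead, not a real page, so any difference it shows may not matter in practice.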


I’m going to disagree about cleaning, sanitizing, and filtering values, then proceeding to use them. Other than trimming data, you should not alter data. You should validate that it meets the needs of your application. Then use it safely, in whatever context it gets used in. For database queries, use prepared queries. In an HTML context, apply htmlentities. In an email header context, validate that it is only and exactly a single permitted value.

Here’s a real-life example of why not to alter values and blindly use the result. Some paid forum software had a security hole a number of years ago, where someone could make a valid, working email address, similar to an administrator’s email (which is commonly posted for contact purposes), at the same mail host, such as gmail, but with characters in it that the filtering removed. This allowed someone to create an email address and register on the forum, where the same filtering was not applied, and then perform a password recovery for the administrator’s account. A copy of all the user data was taken. So, yes, this was the result of inconsistent coding on a large project (whoever added the password recovery did something that wasn’t done in the same way, or at all, in the registration), but the point is: don’t alter values. If the code had just validated the email address and set up a message for the visitor as to what was wrong with the value, the security breach wouldn’t have occurred, because the invalid value would never have been used.

The same thing occurs with people’s names that contain things like apostrophes. If you alter the value and use it, you have changed their stored name, perhaps preventing some portion of the code using that name from matching other information. If your code produced a validation error instead, either the logging of such errors or the person making contact and reporting the bug would let you know about the problem.
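A minimal sketch of the "leave the value alone, handle each context separately" idea from this post. The helper names saveName and e, and the users table with its name column, are made up for illustration; the PDO connection is assumed to exist elsewhere.

```php
<?php
// Hypothetical helper: stores a name with a prepared statement.
// The table and column names are assumptions for illustration.
function saveName(PDO $pdo, string $name): void
{
    // The value is bound as a parameter, never concatenated into the SQL.
    $stmt = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
    $stmt->execute([':name' => $name]);
}

// Hypothetical helper: escapes a value for an HTML context at output time.
function e(string $value): string
{
    return htmlentities($value, ENT_QUOTES, 'UTF-8');
}

// The value itself keeps its apostrophe; only the HTML output is escaped.
$name = "O'Brien <b>";
echo '<p>Hello, ' . e($name) . '</p>', "\n";
```

The stored name stays exactly as submitted, so it still matches other records, and the escaping happens only where the HTML is produced.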


Interested to see this.

Agreed.

What I would suggest to the OP is to trim the ENTIRE post array at once, then assign the result to a new variable since it is potentially changed, then do whatever.

$trimmedPost = yourTrimFunction($_POST);
// Empty validations if required
if (empty($trimmedPost['name'])){
}
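One possible shape for yourTrimFunction, as a sketch only. It assumes the form may contain nested arrays (fields named like tags[]), which is why it recurses; $_POST is filled by hand here so the snippet runs on its own.

```php
<?php
// One possible yourTrimFunction: trims every string in the array,
// recursing into nested arrays (e.g. fields named "tags[]").
function yourTrimFunction(array $data): array
{
    return array_map(function ($value) {
        if (is_array($value)) {
            return yourTrimFunction($value);
        }
        // Non-string values are passed through untouched.
        return is_string($value) ? trim($value) : $value;
    }, $data);
}

// $_POST is filled by hand here so the sketch runs on its own.
$_POST = ['name' => '  Bob ', 'tags' => [' a', 'b ']];
$trimmedPost = yourTrimFunction($_POST);

// $_POST itself is left as submitted; only the copy is trimmed.
if (empty($trimmedPost['name'])) {
    echo "Name is required\n";
}
```

One thing to watch: empty() also treats the string '0' as empty, so comparing against '' may be the safer check for fields where '0' is a legitimate value.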

I completely agree.

EDIT:

Agreed, which is why I specifically excluded methods from my comment.


PHP uses copy-on-write. So long as you don’t modify the new variable it is a memory reference to the array element, not a copy. Imagine what would happen if it did copy and you stuffed a complex object in an array and then referenced it. Performance would be horrible.

Arrays in PHP are implemented using hash tables. Lookups are constant time on average, O(1).
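A small sketch that makes the copy-on-write behaviour visible with memory_get_usage. The one-million-character value is an arbitrary size chosen so the difference stands out, and $_POST is filled by hand; the exact byte counts depend on the PHP version.

```php
<?php
// $_POST is filled by hand with a large value so the effect is measurable.
$_POST['name'] = str_repeat('x', 1000000);

$before = memory_get_usage();
$name = $_POST['name'];          // no copy yet: both names share one string
$afterAssign = memory_get_usage();

$name .= '!';                    // the write forces a separate copy
$afterWrite = memory_get_usage();

printf("after assignment: +%d bytes\n", $afterAssign - $before);
printf("after write:      +%d bytes\n", $afterWrite - $afterAssign);
```

The first figure should be tiny compared with the second, since the string is only duplicated once $name is modified. The value in $_POST is unaffected by the write.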


Whatever you choose, please try to keep $_POST out of classes and methods. If a class or a method needs a POST value, pass it along as a parameter.

That way the code is more open to change. For example, if you decide to add a JSON API, you can simply extract the values from the HTTP request and pass them to the class/method instead, without having to change it.

Also, methods and classes that don’t rely on some global state are easier to test. $_POST is relatively easy, but I find it easier to just always apply that principle.
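To illustrate, a minimal sketch with a made-up Greeter class. The class only ever sees plain parameters, so the same code serves a form post and a JSON body; the JSON string is hard-coded here where a real script would read the request body.

```php
<?php
// Hypothetical class: it receives plain values and never reads $_POST.
final class Greeter
{
    public function greet(string $name): string
    {
        return 'Hello, ' . $name . '!';
    }
}

$greeter = new Greeter();

// Form submission: the calling code pulls the value out of $_POST.
$_POST = ['name' => 'Bob'];   // filled by hand so the sketch runs on its own
echo $greeter->greet($_POST['name'] ?? ''), "\n";

// JSON API: the same class works with a decoded request body.
$body = json_decode('{"name": "Alice"}', true);
echo $greeter->greet($body['name'] ?? ''), "\n";
```

A test can now call greet() with any string, without having to set up a superglobal first.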


I would agree that simply doing $name = $_POST['name']; is a waste of time, as you are doing nothing.
At least do: $name = trim($_POST['name']); or better $name = myValidationFunction($_POST['name']);

With post data from a form, I tend to have a pre-built array of indexes I expect to be in the post data. That way you can safely foreach through your data.

foreach($myArray as $input) {
   // A missing field is passed as null instead of raising an undefined index notice
   $formData[$input] = myValidationFunction($_POST[$input] ?? null);
}

If you simply did:-

foreach($_POST as $input => $value){...

People can inject any new input they want, or exclude ones you do want.

My example is a simplification, as I would probably have more data in the array to specify what data I expected to be in each input, so validation knows what to consider valid and what’s not.
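A sketch of what the fuller version might look like. The rule keys (required, maxLength), the function name validateForm, and the error messages are all invented for illustration; a real rule set would likely carry more detail per field.

```php
<?php
// Hypothetical rule set: only these fields are read; anything else is ignored.
$expected = [
    'name'  => ['required' => true,  'maxLength' => 50],
    'email' => ['required' => true,  'maxLength' => 254],
    'note'  => ['required' => false, 'maxLength' => 200],
];

// Returns [data, errors]; values are trimmed but otherwise unaltered.
function validateForm(array $expected, array $post): array
{
    $data = [];
    $errors = [];
    foreach ($expected as $field => $rule) {
        $raw = $post[$field] ?? '';
        // Non-string input (e.g. an injected array) is treated as missing.
        $value = is_string($raw) ? trim($raw) : '';
        if ($value === '') {
            if ($rule['required']) {
                $errors[$field] = 'This field is required.';
            }
        } elseif (strlen($value) > $rule['maxLength']) {
            // strlen counts bytes, not characters.
            $errors[$field] = 'This value is too long.';
        }
        $data[$field] = $value;
    }
    return [$data, $errors];
}

// "admin" is not in the list, so it never reaches $data.
[$data, $errors] = validateForm($expected, ['name' => ' Bob ', 'admin' => '1']);
print_r($data);
print_r($errors);
```

Because the loop runs over the expected fields and not over the submitted ones, extra inputs are dropped and missing required inputs are reported.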


That is what I do as well. I have a “Whitelist” array of allowed/expected indexes and another array of “required” indexes and then loop over those in my validation class.


Guys, filter_input is there for a reason. This looks like it is going to get hotly debated (not sure why other than some misguided opinions), but to the OP I assure you that you don’t want to be using your $_POST and $_GET freely. Validate them for sure, sanitize them where appropriate. If the app expects an integer in the range of 1-100 and the user supplies 101, maybe the best option is just to set it to 100. Maybe you need to encode them for a link or something.

I recommend looking at the examples for filter_var, filter_input, and filter_input_array for plenty of examples of why you would want to sanitize values coming in. Just make sure you validate along with your sanitization.
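For reference, a short sketch of the validation filters. It uses filter_var on a hand-made array so it runs from the command line; filter_input(INPUT_POST, 'age', ...) takes the same filter and options against the real request. The field names and ranges are made up.

```php
<?php
// filter_var is used so the sketch runs from the command line;
// filter_input(INPUT_POST, 'age', ...) accepts the same filter and options.
$submitted = ['age' => '42', 'email' => 'bob@example.com', 'qty' => '101'];

// Returns the integer, or false when out of range or not an integer.
$age = filter_var($submitted['age'], FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 1, 'max_range' => 120],
]);

// Returns the address, or false when it is not well formed.
$email = filter_var($submitted['email'], FILTER_VALIDATE_EMAIL);

// 101 is outside 1-100, so this validation returns false.
$qty = filter_var($submitted['qty'], FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 1, 'max_range' => 100],
]);

var_dump($age, $email, $qty);
```

Whether a false result should become an error message or be replaced with a nearby allowed value is the judgment call being debated in this thread; the filter only tells you that the value did not pass.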

I will leave it at that. :wink:


Thank you everyone for your input. I think I got what I needed in there!


You don’t get away that easy, lol. :grin:

I fear we may be straying from the OP of whether to create variables for nothing or not but…

Validate Input, Escape Output. Very simple.

I define “sanitizing” data as changing data. You don’t change incoming data, you just don’t. You deal with it appropriately on output, based on the context, at which point it is OK to change it (sanitize).
