Code layout and variable questions

Hello.

I have a quick question regarding php code layout and also variables.

Does it make any difference whatsoever if I choose to lay out my code in sections like this? Note that this is just some example random code.


<?php
session_start();
if (isset($_SESSION['username'])) {
$username = htmlspecialchars($_SESSION['username']);
}
?>
<?php
require_once $_SERVER['DOCUMENT_ROOT'] . '/includes/connection.php';
some code here etc
?>
<?php
echo '$username';
?>

Do the php tags after the first set make no difference at all, and are purely an aesthetic choice to clump together certain parts of the code if I chose to do so?

Would the connection.php still be ‘set’ in the later php tag sets or would I have to call it for each? Somehow, I doubt that.

Also, regarding variables. If I use, like in the above example, htmlspecialchars on the $username variable… whenever I use the $username variable later in my code, I can just use it as $username, and not have to ‘re-do’ htmlspecialchars, correct? Once it is set, it is set until I change it in that particular page. Even if it were done like this:


<?php
session_start();
if (isset($_SESSION['username'])) {
$username = htmlspecialchars($_SESSION['username']);
}
?>
<?php
include hiuser.php
?>
<?php // this is hiuser.php
echo '$username';
?>

Thank you for your input, php masters of sitepoint.

Ok, thank you. That really makes sense. I think I have been double escaping a few things and you have helped me to see that.

I have to sleep now, but tomorrow I might post an example to show what I’ve been doing wrong, and then post my ‘fixed’ version of it to make sure that I know how to solve this once and for all.

Thanks again.

And for finality sake, this is what I ended up with. Added a few variables to make sure I am doing it right, and am unsure about the $item_date filter/escape. But everything else should be right now…

<?php
session_start();
$username = filter_var($_SESSION['username'], FILTER_SANITIZE_STRING);
if (empty($username)) {
    header('Location: /website/login.php');
}
 
$item_submit = isset($_POST['item_submit']);
$user_id = filter_var($_SESSION['user_id'], FILTER_SANITIZE_NUMBER_INT);
$item_name = filter_input(INPUT_POST, 'item_name', FILTER_SANITISE_STRING);
$item_category = filter_input(INPUT_POST, 'item_category', FILTER_SANITISE_STRING);
$item_subcategory = filter_input(INPUT_POST, 'item_subcategory', FILTER_SANITISE_STRING);
$item_date = filter_input(INPUT_POST, 'item_date', FILTER_SANITIZE_NUMBER_INT); // is this right for DATE?
 
require_once $_SERVER['DOCUMENT_ROOT'] . '/includes/database_connection.php';
require_once $_SERVER['DOCUMENT_ROOT'] . '/includes/functions.php';
 
if ($item_submit) {
    $sql = sprintf('INSERT INTO items (user_id, item_name, item_category, item_subcategory, item_date) VALUES (%d, "%s", "%s", %d)',
        intval($user_id),
        mysql_real_escape_string($item_name),
		mysql_real_escape_string($item_category),
		mysql_real_escape_string($item_subcategory),
		intval($item_date) // likewise, is this right for date?
    );
    $result = mysqli_query($link, $sql);
        if (!$result) {
            $error = 'DATABASE ERROR.';
            include 'error.php';
            exit();
        } else {
            header('Location: /website/items.php?user=' . urlencode($username));
        }
    }
}
?>

My amateur opinion on your questions:

  1. Do the php tags after the first set make no difference at all, and are purely an aesthetic choice to clump together certain parts of the code if I chose to do so?

You can do that, but there might be some efficiency issues from doing that. I never open and close like that. If you want to separate your code sections, perhaps includes would be a better choice, but that might have efficiency issues as well. Typically I separate sections with white space, i.e.:


<?php

session_start();

if (isset($_SESSION['username'])) {

   $username = htmlspecialchars($_SESSION['username']);

}

require_once $_SERVER['DOCUMENT_ROOT'] . '/includes/connection.php';

some code here etc

echo $username;

?>

Also note, as a general readability practice, when you have a block of code inside something like an if or while statement, you should indent a few characters to help identify what part of the code is inside that if or while statement.

  1. Would the connection.php still be ‘set’ in the later php tag sets or would I have to call it for each? Somehow, I doubt that.

It would still be set. Closing the php tags doesn’t destroy any variables or connections.

  1. If I use, like in the above example, htmlspecialchars on the $username variable… whenever I use the $username variable later in my code, I can just use it as $username, and not have to ‘re-do’ htmlspecialchars, correct?

That much is true. However, you should know that php doesn’t parse variables within single quotes. If you want, for some reason, to enclose your variables within quotes, use double quotes. If you prefer single quotes to enclose output, then you need to break them for variables, i.e.:

Works:
echo 'Hello there ’ . $username;
echo “Hello there $username”;

Doesn’t work:

echo ‘Hello there $username’;

For some reason, I cannot edit. I seem to be wrong about efficiency. It doesn’t seem to impact the speed of the script to open and close php. However, it may make it harder to read or find a missing closing brace.

The usual reason to use multiple blocks is to permit the use of html between them. Trivial example:

<?php 
session_start(); 
if (isset($_SESSION['username'])) { 
$username = htmlspecialchars($_SESSION['username']); 
} 
?> 
<html>
<head>
<?php 
require_once $_SERVER['DOCUMENT_ROOT'] . '/includes/connection.php'; 
some code here etc 
?> 
</head>
<body>
<?php 
echo '$username'; 
?>
</body>

Your question about escaping a variable is a harder one to explain though.

htmspecialchars() is a means of ‘escaping’ (protecting) a variable destined for display in and amongst html on a webpage.

mysql_real_escape_string() or using PDO/Mysqli prepared statements is a means of escaping a variable destined for storage in a mysql database.

If you take the following programme flow:

a) get a variable from a user
b) display it to a the user in a webpage
c) store it in a database

If you escape it at a) and then re-escape it AGAIN at c) you will end up with data in your database which will not match your expectations and can be the cause of very subtle bugs.

Moving away from the trivial example, lets say you are looking at a variable on a 200 line script which you have not worked on for a few months now.


$username

Now ask yourself, has that already been escaped?

If you insist upon pre-escaping variables maybe in an effort to save time, then at least name them:


$escaped_username

Eugh!

So my answer is no, don’t do that as a knee-jerk reaction, it will cause you problems later down the road. Generally escape data as you display it so that your code is implicit.

If you are reading from a database, and you know that all the variables are to be output onto a page, then yes, consider doing it.

Ok thank you Cute-Tink and Cups. That really clears things up for me.

Cups, I will not pre-escape variables like that from now on. What you said makes far more sense to me, but I have one question regarding your theoretical situation of:

I think I understand this, but let me come up with a slightly more detailed example to make sure that I do understand, and/or get some further tips.

Let’s say I have a form that will insert some data into the database. I have dumbed my example code down to shorten my example.


<?php
session_start(); 
if (isset($_SESSION['username'])) {
     $username = $_SESSION['username']; // A
}
if (isset($_POST['form_submit'])) {
     $username = mysqli_real_escape_string($link, $username); // C
     "INSERT INTO database WHERE username = '$username'";
}
?>
<HTML begins>
<?php
     echo "Welcome htmlspecialchars($username)"; // B
?>
<html form goes here with 'form_submit' button>
</HTML ends>

Sorry about the purposely dumb code, but the point of it is so that I can understand your ABC example. Would my example cause problems:

I am not escaping at ‘A’, but am escaping with htmlspecialchars at ‘B’ when I display the username in a welcome message, and then escaping at ‘C’ with mysqli_real_escape_string if/when the user submits the form.

You nearly got it right mate.


if (isset($_POST['form_submit'])) {
// if you are using mysqli - but then again you should
// ideally be using prepared statements which do the escaping
// for you automatically
      "INSERT INTO database (username) values ( '" . mysqli_real_escape_string($username) ."');  //C
}

Also, bear in mind this scenario:


<?php

// you fetch something from your mysql table
// but you are not sure where it came from, all 
// you know (trust) is that you did escape it ready
// for storing as mysql data

// "select username from mydatabase WHERE id= 1";
// mysql_fetch_array etc

}
?>
<HTML begins>
<?php
     echo "Welcome htmlspecialchars(". $row[0]['username'] .")"; // B
?>
</HTML ends>

Filter Input Escape Output is the rule (FIEO).

You may have protected mysql from sql injection attacks when you saved the data, but now you need to protect users of your html from XSS attacks as you display it. (protect = escape).

I wish I’d been bright enough to ask questions such as this when starting out. :wink:

So:

<?php
session_start(); 
if (isset($_SESSION['username'])) {
     $username = $_SESSION['username'];
}
if (isset($_POST['form_submit'])) {
     "INSERT INTO database (one, two, three) values ('one', 'two', 'three') WHERE username = ( '" . mysqli_real_escape_string($username) ."');
}
?>
<HTML begins>
<?php
     echo "Welcome htmlspecialchars($username)";
?>
<html form goes here with 'form_submit' button>
</HTML ends>

Would be right for the first example?

But if I was also inserting other data generated by the form only (had not been previously set or escaped), it would be ok to do this?

$variable1 = mysqli_real_escape_string($link, $variable);
$variable2 = mysqli_real_escape_string($link, $variable);
$variable3 = mysqli_real_escape_string($link, $variable);
"INSERT INTO database (one, two, three) values ('$variable1', '$variable2', '$variable3') WHERE username = ( '" . mysqli_real_escape_string($username) ."');

Finally, let me ask a really “Hi, I’m a newbie” question too. What is the difference between doing (assuming $username had been htmlspecialchar’d for output like in my first example). Just so I understand the changes I am making.

$username = mysqli_real_escape_string($link, $username);
"INSERT INTO database (one, two, three) values ('one', 'two', 'three') WHERE username = '$username'");

and

"INSERT INTO database (one, two, three) values ('one', 'two', 'three') WHERE username = ( '" . mysqli_real_escape_string($username) ."');

I hope that post wasn’t too much of a mess.

Exactly right.

Sorry, my mistake, I am getting confused with mysql_real_escape_string() - I must admit I am not familiar with mysqli, having moved directly to using PDO when PHP5.1 came out. Maybe someone else can help you with that. I hope I have not lead you astray on that …

Not a problem Cups, you have been a truly excellent help. :slight_smile:

Which part do I still need to confirm, though? So that I know where I’m up to. :smiley:

Judging by what I read in the manual about mysqli function (which I am not conversant with)

This code:


$username = mysqli_real_escape_string($link, $username);
"INSERT INTO database (one, two, three) values ('one', 'two', 'three') WHERE username = '$username'");  

Is correct,

whereas this code:


"INSERT INTO database (one, two, three) 
values ('one', 'two', 'three') 
WHERE username = ( '" . mysqli_real_escape_string($username) ."');  

Is incorrect (or if the i was taken off, it would be true for plain mysql_* use).

Sorry if I have confused you on this score.

Thanks for taking the time to help me out, and to clarify, Cups. I appreciate it.

So basically I’m doing everything right? And it does not matter to switch between different escape methods on the same variable throughout my script, as long as I always use the right one for the right job, like in this example:

<?php
session_start(); 
if (isset($_SESSION['username'])) {
     $username = $_SESSION['username'];
}
if (isset($_POST['form_submit'])) {
     $username = mysqli_real_escape_string($link, $username); // mysqli_real_escape_string for input into database
     "INSERT INTO database WHERE username = '$username'";
}
?>
<HTML begins>
<?php
     echo "Welcome htmlspecialchars($username)"; // htmlspecialchars for output to the page
?>
<html form goes here with 'form_submit' button>
</HTML ends>

Am I the only one who noticed this bit:


<?php 
     echo "Welcome htmlspecialchars($username)"; // htmlspecialchars for output to the page 
?>

It won’t call htmlspecialchars, it’d need to be:


echo "Welcome ".htmlspecialchars($username);
// Or if you wanted to continue the string..
echo "Welcome ".htmlspecialchars($username).", how are you today?";

Oops, thank you Zurev! :slight_smile:

I’ve actually been doing it like this:


<?php
     echo 'Welcome';
     echo htmlspecialchars($username);
     echo '.';
?>

So I would need to change line 2 to this?


echo .htmlspecialchars($username).;

No, line 2 is fine that way.

That last code example won’t work, because PHP will think that you’ve missed out something for the full steps to concatenate with.

You can use either the first example from above, or combine it in to one statement with:


echo 'Welcome' . htmlspecialchars($username) . '.';

Or if you prefer, you can have the one statement spread across multiple lines:


echo 'Welcome'
    . htmlspecialchars($username)
    . '.';


echo 'Welcome' .
    htmlspecialchars($username) .
    '.';

With procedural mysqli, I prefer to not have the variable embedded within the string, which PHP then has to decode:


$username = mysqli_real_escape_string($link, $username);
$result = mysqli_query("INSERT INTO database WHERE username = '$username'"); 

Instead, I prefer to concatenate the variable to the string, so that it’s visually easier to see what’s going on:


$username = mysqli_real_escape_string($link, $username);
$result = mysqli_query('INSERT INTO database WHERE username = "' . $username . '"'); 

I might even go so far as to use sprintf:


$sql = sprintf(
    'INSERT INTO database WHERE username = "%s"',
    mysqli_real_escape_string($link, $username)
);
$result = mysqli_query($sql);

However, if you’re going to go that far, you might as well use prepared statements instead.
All escaping issues are then automatically handled when you bind a parameter to the statement.


if ($stmt = mysqli_prepare($link, 'INSERT INTO database WHERE username = "?"')) {
    mysqli_stmt_bind_param($stmt, "s", $username);
    mysqli_stmt_execute($stmt);
    mysqli_stmt_bind_result($stmt, $result);
    mysqli_stmt_fetch($stmt);
}

Thank you Paul. :slight_smile:

I’ll be truthful, I don’t know the difference there due to me never having used concatenation much, or like that specifically, but I am reading about it right now.

Does anyone have any input on my “escaping-variable-different-ways-throughout-code” question (post #11)? Once I confirm that one way or another, I think my lessons for today are complete! :smiley:

You are all a great help.

Last year I summarised input/sanitize/output/escape issues in this Handling Input and Output post in the [url=“http://www.sitepoint.com/forums/php-34/php-mysql-coding-tips-456441.html”]coding tips sticky thread.