Escape on input or output

Hello,

I searched Google for an answer to this question but couldn’t find a proper one. Can anyone tell me whether I should escape data on input or on output, and why?

Best regards,
@marklenon95

Neither. You must escape a value when you change its context (and only at the point where you actually do the context change).

e.g. when you want to use a value in an SQL statement, you need to escape the value for SQL when you put it in the statement.
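As a rough sketch of that SQL case: one common way to handle the SQL context is a prepared statement, where the value is bound at the point it meets the statement. This assumes the pdo_sqlite extension is available; the `users` table and the `$username` value are made up for illustration.

```php
<?php
// Assumption: pdo_sqlite is installed; the table is hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)');

// Raw value containing characters that are special in SQL.
$username = "Robert'); DROP TABLE users;--";

// The value is bound as a parameter exactly where it enters the SQL context.
$stmt = $pdo->prepare('INSERT INTO users (username) VALUES (:username)');
$stmt->execute([':username' => $username]);

// Reading it back yields the original, unmodified value.
$stored = $pdo->query('SELECT username FROM users')->fetchColumn();
```

The variable itself is never altered; only its representation inside the SQL statement is handled.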


I go by the rule “never trust user input”, which to me also means never trusting it on output. If you are using prepared statements properly, then what users send to the database should be harmless to the database. So the real thing you have to guard against is output to the screen or some other form of output. If you want to be pretty sure you’re OK, using a framework or a PHP template engine such as Twig (https://twig.symfony.com/) will help in that regard.


I go with the old mantra of validate & sanitise input, escape output.
In the context of databases, if using prepared statements for user input (which you should be) there is no need to escape as it will happen within the process.
But it is still a good idea to check that the data is what you were expecting (validate) and is stripped of any unwanted elements/code (sanitise).
Then still escape the data on output to cover the possibility of a breach of the data.

$username = htmlspecialchars($row['username']) ;

It’s not a big deal to just do that to be on the safe side.

If you have a framework or your own custom functions to help with this, that’s great.
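For the “validate” part, a minimal sketch using PHP’s built-in `filter_var()`. The field values and the 0 to 150 range are assumptions chosen only for illustration.

```php
<?php
// Hypothetical incoming value, e.g. from a form field.
$rawAge = '42';

// Validate: is it an integer within the expected range?
// filter_var() returns the integer on success and false on failure.
$age = filter_var($rawAge, FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 150],
]);

// A value that is not what was expected is rejected, not "repaired".
$bad = filter_var('42; DROP TABLE users', FILTER_VALIDATE_INT);
```

Validation like this answers “is the data what I expected?”; it does not replace escaping at the point of output.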

You can call it nitpicking, but I find the below two corrections essential

if using prepared statements for user input

is solid nonsense. Prepared statements are not for user input, nor for any input at all. They are for SQL.
Every time you are going to add a variable to the SQL query, it should be done via prepared statement. Whereas any input source or data origin or any similar matter is essentially irrelevant and should never be taken into consideration.

Also,

$username = htmlspecialchars($row['username']) ;

means you are escaping a variable in the scope of, let’s say, a “controller”. Whereas it should be done in the scope of a “view”. You cannot know what would be the use of this variable - it can be stored back in the database, or sent via json or XML. By prematurely escaping it you may spoil the data.
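To illustrate how premature escaping can spoil the data, here is a small sketch. The `$row` contents are hypothetical, and the JSON response is just one example of a second output context.

```php
<?php
// Hypothetical row as it might come back from a database.
$row = ['username' => 'Tom & "Jerry"'];

// Escaped early, in controller scope, on the assumption it is only for HTML.
$username = htmlspecialchars($row['username'], ENT_QUOTES, 'UTF-8');

// Later the same variable is reused for a JSON response...
$json = json_encode(['username' => $username]);

// ...and the client receives HTML entities instead of the real name.
$received = json_decode($json, true)['username'];
```

Here `$received` is `Tom &amp; &quot;Jerry&quot;`, which is no longer the stored username.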

In other words, the above statement contradicts the correct statement from @Dormilich (emphasis mine):

So the correct example would be

<?= htmlspecialchars($row['username']) ?>

or, better,

{{ row.username }}

using Twig as an example of a template system which performs escaping automatically


So escape only in view context.
Thanks guys, you really helped me, I appreciate it.

Perhaps the context is not clear from the small snippet. Yes, right or wrong, I’m escaping within the controller script, but in preparation for output in the html template. The idea being it keeps more php out of the html.

Why not? I’m writing the script, I do with that variable exactly what I intend to: output it to the html.

I possibly could have worded that better, but I think it is clear what I meant: when using prepared statements to insert user-supplied data into a database. I’m not saying that is the only use of prepared statements, but it is one where they are particularly important.

No. You escape it when you change the value’s context. Whether that is a view (HTML) or SQL or whatever doesn’t matter.

Well, for the home page featuring your cats - yes.
But in professional development, most likely you’re not the only one writing the code.
And even as the sole developer, you’ll be doing double work, decoding the value back whenever the output context needs to change. This is why we developed the separation of concerns, and in 2017 you seldom find a controller that escapes data for HTML output.

What is important is to use prepared statements for any data. By stressing user input, you’re planting a mine in your application with your own hands. It is not a matter of wording. It’s a matter of literally thousands of noobish scripts that don’t use prepared statements when writing data back into the database because it’s no longer “from user input”. You’re a sort of mentor here, so you should be aware of the effect your words have on the minds of inexperienced readers.
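A small sketch of that point, assuming pdo_sqlite is available and using two made-up tables: the value written back comes from the database rather than from the user, and it is still bound through a prepared statement.

```php
<?php
// Assumption: pdo_sqlite is installed; both tables are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (username TEXT)');
$pdo->exec('CREATE TABLE audit_log (username TEXT)');

$insert = $pdo->prepare('INSERT INTO users (username) VALUES (?)');
$insert->execute(["O'Brien"]);

// The value now comes from the database, not from the user...
$fromDb = $pdo->query('SELECT username FROM users')->fetchColumn();

// ...and it is still bound as a parameter when it goes back into SQL.
$log = $pdo->prepare('INSERT INTO audit_log (username) VALUES (?)');
$log->execute([$fromDb]);

$logged = $pdo->query('SELECT username FROM audit_log')->fetchColumn();
```

Concatenating `$fromDb` into the second query would break on the apostrophe, even though no user input is involved at that point.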

[quote=“SamA74, post:7, topic:269044”]
I’m escaping within the controller script, but in preparation for output in the html template.[/quote]
To me escaping within controller is conceptually wrong because the controller cannot know how the view will use the data. Sure, often it will be used as part of html but it can also be used for javascript, or JSON, or it can even be displayed as a text/plain page or be inserted into html comments. Therefore, escaping belongs to the view because only the view knows the target context.
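As a sketch of why only the view can pick the right escaping, here is one raw value prepared for three different target contexts. The `$comment` value is hypothetical, and the inline JavaScript case assumes the result is placed inside a `<script>` block.

```php
<?php
// Hypothetical raw value handed to the view by the controller.
$comment = "</script><b>Tom & Jerry's</b>";

// HTML body context: entity-encode the HTML-significant characters.
$asHtml = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// Inline JavaScript context: a JSON string literal with <, >, &, ' and "
// written as \uXXXX escapes so it cannot close the surrounding <script>.
$asJs = json_encode(
    $comment,
    JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
);

// text/plain context: the raw value is already correct.
$asText = $comment;
```

Each context needs a different transformation of the same data, so a single early `htmlspecialchars()` call cannot serve all of them.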

I think the idea is to first keep business logic out of html, not php. Escaping is view logic and belongs to the view - whether you use php or some template syntax for that. There is nothing wrong in using php in templates as long as it is used for display logic.


Interesting. So with my intention of separating concerns, you (both) say I’m putting concerns on the wrong side of the fence, as it were.

Of course it will only be ‘pre-escaped’ in situations where the data absolutely is for html output and nothing else. It just seems neater that way to my mind.
OK that line of code was maybe a bad example of escaping on output, as it was not directly output.

It all depends how you define a concern. Language should not be a concern separator because language is just a tool that we use - and we use whatever language we find is available, convenient, useful, etc. Sure, it often happens that in a system with well defined concerns most php code will be separate from html and if you use a template language then all of php will be separate but it’s just a side effect. The view is a layer and it should only be concerned with display logic just like you have other layers that should be concerned with their own stuff. What languages are used in each layer doesn’t really matter for separation of concerns. Some people use php classes and OOP as part of their views to manage display logic and it’s perfectly fine.

How do you know the data is absolutely for HTML output? If, when writing a controller, you decide that certain data is for HTML and prepare it accordingly, you are putting view logic into the controller, so in effect the controller deals with something it should not be responsible for. The view logic leaks into the controller and the concerns become blurred.

When designing any part of an application, I’ve found it helps to imagine that I’m actually writing separate parts that do not know anything about each other apart from requiring specific input and returning specific output (API). I imagine the view (HTML, templates, etc.) is written by a different programmer and I have no idea what the view will do with the data it gets; my responsibility is only to provide the data. Then these parts become more separated from each other, it’s easier to replace one part (module) with another, and it all becomes more reusable. For example, my controllers are suitable not only for HTML pages; they can also serve all other kinds of page formats, documents, etc., because they are format agnostic. The same goes for models.
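A minimal sketch of that separation, with all function names invented for illustration: the controller returns raw data, and each view applies the encoding for its own format.

```php
<?php
// Hypothetical controller: returns raw data, knows nothing about the format.
function showUserController(array $row): array
{
    return ['username' => $row['username']];
}

// Hypothetical HTML view: escapes for HTML because it knows its own context.
function renderHtml(array $data): string
{
    return '<h1>' . htmlspecialchars($data['username'], ENT_QUOTES, 'UTF-8') . '</h1>';
}

// Hypothetical JSON view: encodes for JSON instead.
function renderJson(array $data): string
{
    return json_encode($data);
}

$data = showUserController(['username' => 'Tom & Jerry']);
$html = renderHtml($data); // <h1>Tom &amp; Jerry</h1>
$json = renderJson($data); // {"username":"Tom & Jerry"}
```

The same controller output feeds both views unchanged, which is what makes it format agnostic.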

