Escaping content with htmlspecialchars (and escaping in programming in general)

A beginner’s question: as I understand it, the PHP function htmlspecialchars builds special output from HTML code (sometimes quite long output from a very short piece of HTML).

Is that special output what is being “escaped” from the HTML?

Is this a general concept for languages that work together with HTML?

Thank you!!!

There are two times you need to worry about escaping, though prepared statements have pretty much taken care of the first, if you take advantage and use them!

First, when you take user-supplied input and save it. This usually means in a database, though if you’re writing to a file you also need to consider it.

Second, whenever you output the saved input.

For example, say I’m writing last names to a CSV file that uses single quote marks as the enclosure, e.g.

'Smith', 'Jones', 'Skywalker', 'Fett'

What happens when Paul O’Brien enters his name? The apostrophe is read as the closing quote mark, so the value is cut short:

'Smith', 'Jones', 'Skywalker', 'Fett', 'Paul O'
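
One way around the O’Brien problem is to escape the enclosure character before writing the value. The sketch below assumes the file treats a doubled single quote as a literal one, which is a common CSV convention but not the only one; `quoteCsvField` is a made-up helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: wraps a value in single quotes and doubles any
// single quote inside it, so a reader can tell data from the enclosure.
function quoteCsvField(string $value): string
{
    return "'" . str_replace("'", "''", $value) . "'";
}

$lastNames = ['Smith', 'Jones', "O'Brien"];
$line = implode(', ', array_map('quoteCsvField', $lastNames));

echo $line, PHP_EOL; // 'Smith', 'Jones', 'O''Brien'
```

In real code PHP’s built-in fputcsv, which takes the enclosure as an argument and does the quoting itself, is probably the better choice.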

Say I have a forum that displays examples of HTML code and someone submits
<script>alert("Ha hah")</script>

If it wasn’t escaped there would be a few people not laughing.

Take a look at this post in view-source and you’ll see

<code>&lt;script&gt;alert("Ha hah")&lt;/script&gt;</code>
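
In PHP that conversion is roughly what htmlspecialchars does at output time. This is only a sketch: `escapeForHtml` is a made-up wrapper name, and the choice of ENT_QUOTES and UTF-8 is an assumption about what suits a typical page.

```php
<?php
// Hypothetical wrapper; htmlspecialchars() itself is the PHP built-in.
function escapeForHtml(string $text): string
{
    // ENT_QUOTES also converts single and double quotes, which matters
    // when the value ends up inside an HTML attribute.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$submitted = '<script>alert("Ha hah")</script>';

echo '<code>', escapeForHtml($submitted), '</code>', PHP_EOL;
// <code>&lt;script&gt;alert(&quot;Ha hah&quot;)&lt;/script&gt;</code>
```

With these flags the double quotes come out as &quot;, whereas the forum left them alone. Either way the browser displays the text instead of running it.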

It is a very general concept.

Escaping is always an output operation: it examines data that is about to be mixed in with code and converts any character that could be misinterpreted as code into something that means the same thing but will not be misinterpreted.

In each case the escaping is specific to the type of code you are mixing the data with. For example, with HTML, two characters that can be valid in your data but which would be misinterpreted as part of the HTML code are < and &, so these need to be escaped to &lt; and &amp; respectively so that they display correctly within the HTML.
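
To illustrate how the escaping depends on where the data is going, here is a small sketch that sends the same made-up value into three different contexts using PHP built-ins. The sample string and variable names are only for illustration.

```php
<?php
$data = "O'Brien & Sons <b>";

// Going into an HTML page: <, >, & and quotes become entities.
$forHtml = htmlspecialchars($data, ENT_QUOTES, 'UTF-8');

// Going into a URL: anything that is not URL-safe becomes %XX.
$forUrl = rawurlencode($data);

// Going into JSON: the value is wrapped in double quotes and
// only characters special to JSON are escaped.
$forJson = json_encode($data);

echo $forHtml, PHP_EOL; // O&#039;Brien &amp; Sons &lt;b&gt;
echo $forUrl, PHP_EOL;  // O%27Brien%20%26%20Sons%20%3Cb%3E
echo $forJson, PHP_EOL; // "O'Brien & Sons <b>"
```

The same three characters are harmless in one context and dangerous in another, which is why there is no single "escape everything" function.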

Most of the time a way is provided to keep the code and data separate, so the data never needs escaping to stop it being misinterpreted as code. Escaping only needs to be done when they can’t be kept separate.
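
Prepared statements are the usual example of keeping the two separate. A minimal sketch, assuming the pdo_sqlite extension is available; the in-memory database and the `members` table are made up so the example stands on its own.

```php
<?php
// Assumption: pdo_sqlite is installed. The table name is hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE members (last_name TEXT)');

$lastName = "O'Brien"; // would break a hand-built, single-quoted SQL string

// The SQL (code) contains only a placeholder; the value (data) is
// handed over separately, so no escaping is needed.
$insert = $pdo->prepare('INSERT INTO members (last_name) VALUES (:last_name)');
$insert->execute([':last_name' => $lastName]);

$stored = $pdo->query('SELECT last_name FROM members')->fetchColumn();
echo $stored, PHP_EOL; // O'Brien
```

The name still needs escaping when it is later printed into a page, since at that point the data and the HTML can’t be kept apart.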
