I understand what htmlentities() does and that it can increase security, but I don’t want to use it indiscriminately. My dilemma is as follows -
I have a project that accepts user input via a form. The form data is analysed and searched for occurrences of particular strings. Whilst the data displays OK in a browser, the converted text is interspersed with the additional characters of the entity encoding, which makes analysing it difficult.
My questions are -
- If submitted text is not converted using htmlentities(), is it only a threat if you click on it?
- Does htmlentities() have any benefit in preventing SQL injection? i.e. can I save it unconverted and just convert it before I display it?
- If input is not converted using htmlentities(), is there any threat if it is never saved to a database and never displayed in a browser?
Sorry if this sounds a bit dumb, but I like to understand as fully as possible
Actually, I have just realised I can use html_entity_decode() during analysis to convert back temporarily, but I’d still be interested in hearing comments / feedback on the 3 questions above.
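For anyone following along, the decode-for-analysis idea could look roughly like the sketch below. The stored text and the search terms are made up for illustration, and it assumes the text was encoded with htmlentities() before being saved.

```php
<?php
// Hypothetical text that was saved already entity-encoded.
$stored = 'Tom &amp; Jerry &lt;3 PHP';

// Work on a temporary decoded copy; the stored value is left untouched.
$plain = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

// Hypothetical strings to look for in the submitted text.
$needles = ['Tom & Jerry', '<3', 'Python'];

$found = [];
foreach ($needles as $needle) {
    // strpos() returns false when the needle is absent, hence the strict comparison.
    $found[$needle] = strpos($plain, $needle) !== false;
}
```

Searching the encoded text directly would miss 'Tom & Jerry', because the stored form contains '&amp;' rather than '&'.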
I will try to answer…
Encoding or decoding HTML entities is not used to prevent SQL injection. For that you have prepared statements. So you can (and should) save the submitted text to the database without any changes.
You need to HTML-encode your text before you output it to the browser to prevent XSS (cross-site scripting) attacks.
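As a rough sketch of the prepared-statement approach: the example below assumes PDO with the SQLite driver is available and uses an in-memory database, a made-up comments table, and a made-up hostile input. None of these names come from the original project.

```php
<?php
// Assumption: the pdo_sqlite extension is installed; an in-memory database keeps this self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

// Hypothetical hostile input, stored exactly as submitted (no htmlentities()).
$input = "Robert'); DROP TABLE comments;--";

// The placeholder keeps the value separate from the SQL text, so it is never parsed as SQL.
$insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$insert->execute([':body' => $input]);

// Read it back: the table still exists and holds the unchanged text.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();
$count = (int) $pdo->query('SELECT COUNT(*) FROM comments')->fetchColumn();
```

The text comes back byte for byte as it was entered, which is what makes later analysis straightforward.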
Let’s assume a user puts the following in an input field
And you save it to your database.
If you now load this text from the database and insert it unencoded into the DOM of your website, the browser will execute the script. Of course, the script can do anything, not only alert an attack.
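A minimal sketch of encoding at output time follows. The payload is a hypothetical stand-in (the exact input from the example is not reproduced here), and the variable names are made up.

```php
<?php
// Hypothetical payload, as it might come back unchanged from the database.
$fromDb = "<script>alert('attack')</script>";

// Encode only at the moment of output. ENT_QUOTES also converts single and double quotes.
$safe = htmlentities($fromDb, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// The browser now shows the text literally instead of running it.
$html = '<p>' . $safe . '</p>';
```

With these flags the angle brackets become &lt; and &gt; and the single quotes become &#039;, so nothing in the paragraph is parsed as a tag.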
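The catch is that the right handling depends on where the text goes. If you set it as the value of an input field through the DOM (for example for editing), you should not encode it, because the field shows the value literally. But if you write it into the HTML markup itself, whether as plain text or as an attribute value, you need to encode it.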
understood and clarified - thanks
You mean htmlentities(), yeah?
You mean display it in a browser, yeah?
thanks for your effort and time - much appreciated
Same here. There are several ways to display text in the browser, and depending on which one is used, the text needs different content to get a script executed (I just gave one possible example).
Excellent! Thanks so much, that cleared up so much misunderstanding and I learned a lot as well - cheers
Can I please ask an off-topic question, seeing your title is Mentor? I asked a question about developing an HTML5 pattern using regex but I am only getting replies about the PHP part. Is there a better place or a better way to ask, please?
But do you understand what its primary purpose is?
Sometimes we need to show HTML in a web page, such as the following.
If we put that HTML in a web page without using HTML entities, the browser will interpret it as markup, the same as all the other HTML on the page. So we can use HTML entities, as in the following, which the browser converts back to the HTML text we want shown.
Using HTML entities for security purposes is a secondary purpose. I suggest not relying on it alone for security.
Perhaps you need to use an HTML parser to audit for dangerous elements, especially links and scripts. However, the Cross Site Scripting Prevention page of the OWASP Cheat Sheet Series seems to have good advice.
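To illustrate that primary purpose, here is a small sketch with a made-up snippet of markup. It is only a hypothetical example of the idea, not a reconstruction of any particular page.

```php
<?php
// Hypothetical markup we want the reader to see as text, not as a working link.
$markup = '<a href="page.html">A link</a>';

// Convert the special characters to entities before placing the text in the page.
$shown = htmlentities($markup, ENT_QUOTES, 'UTF-8');

// Decoding the entities gives the original markup back, which is in effect
// what the browser does when it displays them.
$roundTrip = html_entity_decode($shown, ENT_QUOTES, 'UTF-8');
```

Here $shown holds &lt;a href=&quot;page.html&quot;&gt;A link&lt;/a&gt;, which the browser renders as the visible text of the tag.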
Thank you for expanding. Most helpful. Cheers.