Hi all, I'm trying to gather information on best practices for storing HTML from arbitrary user input in the database, bearing in mind XSS attacks and any other problems that might come up.
Regarding XSS attacks, a simple way I can think of is to define a list of allowed tags that excludes 'script' and sanitize the content like this:
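// strip_tags() returns the cleaned string; the array form of the allow-list needs PHP 7.4+
$html = strip_tags($html, ['a', 'div', 'span', 'etc...']);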
That still doesn't cover exploits through event-handler attributes such as 'onclick', though. Those could perhaps be removed with a regular expression, although I'm hesitant about that approach and cannot think of a better one.
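For the sake of discussion, one direction might be to parse the markup and filter attributes against an allow-list instead of pattern-matching them. The following is only a rough sketch under my own assumptions, not a vetted sanitizer: the function name sanitize_html, its parameters, and the URL policy in the regular expression are all hypothetical, and it does not try to handle things like 'style' attributes.

```php
<?php
// Hypothetical helper (illustration only): keep allow-listed tags and attributes, drop the rest.
function sanitize_html(string $html, array $allowedTags, array $allowedAttrs): string
{
    // Pass 1: remove tags that are not on the allow-list (array form needs PHP 7.4+).
    $html = strip_tags($html, $allowedTags);

    // Pass 2: parse what is left so each attribute can be inspected.
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy user markup
    $doc->loadHTML(
        '<!DOCTYPE html><html><head>'
        . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '</head><body>' . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    foreach ($body->getElementsByTagName('*') as $element) {
        // Copy the attributes first; removing from the live map while iterating can skip entries.
        foreach (iterator_to_array($element->attributes) as $attr) {
            $name = strtolower($attr->nodeName);
            $isUrl = in_array($name, ['href', 'src'], true);
            // Assumed URL policy: absolute http(s), mailto, root-relative paths, or fragments.
            $urlOk = !$isUrl
                || preg_match('~^(?:https?://|mailto:|/|#)~i', $attr->nodeValue) === 1;
            if (!in_array($name, $allowedAttrs, true) || !$urlOk) {
                $element->removeAttribute($attr->nodeName);
            }
        }
    }

    // Serialize only the children of <body>, i.e. the user's fragment.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Called as sanitize_html($input, ['a', 'div', 'span'], ['href', 'title']), this should drop 'onclick' and any 'href' that starts with 'javascript:', since neither passes the allow-list checks. Whether that is enough in practice is exactly what I'm unsure about.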
There are surely other exploits, pitfalls, and better approaches that I'm missing, which is why I'm asking here.
I'd very much appreciate your input.