Hi,
I have some code that cleans up an HTML document before doing additional processing. One of the cleanup steps is to remove all JavaScript event attributes (onclick, onblur, etc.) from the HTML tags. I have the following code, but it breaks when the JavaScript inside the attribute contains an escaped quote (\"): the match stops at that quote instead of at the closing quote of the attribute. I'm not so great with regular expressions, so I'm not sure how to make the pattern skip over the \" sequence. Any help on how to make this regex better would be appreciated!
// Strip double-quoted on* event attributes (pattern split across lines for readability).
$html = preg_replace(
    '#(onabort|onactivate|onafterprint|onafterupdate|onbeforeactivate|onbeforecopy'
    . '|onbeforecut|onbeforedeactivate|onbeforeeditfocus|onbeforepaste|onbeforeprint'
    . '|onbeforeunload|onbeforeupdate|onblur|onbounce|oncellchange|onchange|onclick'
    . '|oncontextmenu|oncontrolselect|oncopy|oncut|ondataavailable|ondatasetchanged'
    . '|ondatasetcomplete|ondblclick|ondeactivate|ondrag|ondragdrop|ondragend|ondragenter'
    . '|ondragleave|ondragover|ondragstart|ondrop|onerror|onerrorupdate|onfilterupdate'
    . '|onfinish|onfocus|onfocusin|onfocusout|onhelp|onkeydown|onkeypress|onkeyup'
    . '|onlayoutcomplete|onload|onlosecapture|onmousedown|onmouseenter|onmouseleave'
    . '|onmousemove|onmouseout|onmouseover|onmouseup|onmousewheel|onmove|onmoveend'
    . '|onmovestart|onpaste|onpropertychange|onreadystatechange|onreset|onresize'
    . '|onresizeend|onresizestart|onrowexit|onrowsdelete|onrowsinserted|onscroll|onselect'
    . '|onselectionchange|onselectstart|onstart|onstop|onsubmit|onunload)\\s*=\\s*".*?"#is',
    '',
    $html
);
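One direction that might work, as a rough sketch rather than a tested fix: replace the lazy ".*?" with a value pattern in which a backslash always consumes the character after it, so that \" cannot end the match. The function name strip_event_attributes and its $events parameter are made up for this illustration, and the sketch assumes the attribute values are always double-quoted and really do use backslash-escaped quotes.

```php
<?php
// Hypothetical helper (not part of the original code): removes double-quoted
// on* event attributes, tolerating \" inside the attribute value.
function strip_event_attributes(string $html, array $events): string
{
    // Build the alternation from the event names, e.g. onclick|onblur.
    $names = implode('|', array_map(function ($event) {
        return preg_quote($event, '#');
    }, $events));

    // "(?:[^"\\]|\\.)*" matches a double-quoted value where a backslash
    // always takes the next character with it, so \" does not close the value.
    // The leading \s also removes the whitespace in front of the attribute.
    $pattern = '#\s(?:' . $names . ')\s*=\s*"(?:[^"\\\\]|\\\\.)*"#is';

    return preg_replace($pattern, '', $html);
}

// Sample input containing an escaped quote inside the handler.
$sample = '<a href="#" onclick="alert(\"hi\")">x</a>';
echo strip_event_attributes($sample, ['onclick', 'onblur']), "\n";
// prints: <a href="#">x</a>
```

The sketch takes the event names as an array only to keep the example short; the full list from the pattern above could be passed in the same way. Single-quoted and unquoted attribute values are not handled here.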
Thanks!