Regex to replace ampersands

Basically, I need to replace ampersands (&) with their escaped value (&amp;).

I need a regex that skips ampersands that are already escaped or that begin other entities, and that ignores anything inside CDATA tags so JavaScript is left alone.

I found this:

But it does not work for PHP’s preg_replace.

Can anyone help convert it?

If your “CDATA” is only the stuff inside <script> tags, why not first rule out anything inside the script tags, then look for all the &'s that are left: set a variable to a regex that matches a bare ampersand, and replace them?

It just might be easier to find <script> tags than to go through possibly triple-commented CDATA tags.
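
Here is a rough sketch of that idea, written in Python rather than PHP. It assumes the only regions to protect are whole <script>...</script> elements, and the names SCRIPT_BLOCK, BARE_AMP and escape_ampersands are made up for illustration. The pattern uses only a negative lookahead and non-capturing groups, which PCRE also supports, so it should carry over to preg_replace, but I have not tested that.

```python
import re

# Assumption: the protected regions are whole <script>...</script> elements.
# The capturing group makes re.split keep the script blocks in its result.
SCRIPT_BLOCK = re.compile(
    r"(<script\b[^>]*>.*?</script\s*>)",
    re.IGNORECASE | re.DOTALL,
)

# An ampersand that does NOT start a named (&amp;), decimal (&#38;)
# or hexadecimal (&#x26;) entity.
BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_ampersands(html: str) -> str:
    """Escape bare ampersands everywhere except inside script elements."""
    parts = SCRIPT_BLOCK.split(html)
    # With one capturing group, even indexes hold the text outside
    # script elements and odd indexes hold the script elements themselves.
    for i in range(0, len(parts), 2):
        parts[i] = BARE_AMP.sub("&amp;", parts[i])
    return "".join(parts)


if __name__ == "__main__":
    sample = "Tom & Jerry &amp; co <script>if (a && b) {}</script> x & y"
    print(escape_ampersands(sample))
```

Splitting first and escaping only the outside pieces keeps the ampersand regex simple, since it never has to know about script content at all.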

Disclaimer: I don’t know a single thing about PHP, but that guy’s regex solution looked brittle at best.