If your “CDATA” only ever lives inside <script> tags, why not first rule out everything inside the script tags, then go after the &'s that are left: match them with the regex
&(?!amp;)
and replace each match with &amp;?
It just might be easier to find <script> tags than to wade through possibly triple-commented CDATA sections.
Disclaimer: I don’t know a single thing about PHP, but that guy’s regex solution looked brittle at best.
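Since I can't vouch for the PHP side, here is a rough sketch of the idea in Python instead. It is only an illustration under some assumptions: the function name `escape_ampersands_outside_scripts` and the `SCRIPT_BLOCK` pattern are made up for this example, and it assumes the script blocks are well-formed `<script>...</script>` pairs with no nested or commented-out closing tags.

```python
import re

# Matches one whole <script ...>...</script> block, case-insensitively and
# across newlines. The capturing group makes re.split() keep the blocks.
# (Hypothetical pattern for this sketch; assumes well-formed script tags.)
SCRIPT_BLOCK = re.compile(
    r"(<script\b[^>]*>.*?</script\s*>)",
    re.IGNORECASE | re.DOTALL,
)

# The regex from the post: an "&" that is not already the start of "&amp;".
BARE_AMP = re.compile(r"&(?!amp;)")


def escape_ampersands_outside_scripts(html: str) -> str:
    """Replace bare '&' with '&amp;' everywhere except inside script blocks."""
    parts = SCRIPT_BLOCK.split(html)
    # With a single capturing group, even indices hold the text between
    # script blocks and odd indices hold the script blocks themselves.
    for i in range(0, len(parts), 2):
        parts[i] = BARE_AMP.sub("&amp;", parts[i])
    return "".join(parts)


sample = "a & b &amp; c <script>if (x && y) {}</script> d & e"
print(escape_ampersands_outside_scripts(sample))
```

One thing to watch: as written, `&(?!amp;)` also matches the `&` in other entities such as `&lt;`, so those would get double-escaped. If the markup contains them, the lookahead would need to be widened to skip them too.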