Regex alternative

Hi guys,

I was wondering if you would have a better alternative to this code segment:

$code = preg_replace('/(<[^>]+ ?src=")([^"]*)/ie', 'self::_sanitize("$1", "$2")', $code);

Basically, it parses an entire HTML page and uses a regular expression to find any links (src attributes), then it "sanitizes" each link by replacing it with a different location, based on various conditions. For example, if the path is under the "images" directory, it substitutes an alternative location and then puts the tag back together in the text output. I know the preg_replace function is heavy, so I was wondering if you would recommend another way to do it.
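One side note on the line above: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7, so on a current PHP version the same replacement has to go through preg_replace_callback. Here is a minimal sketch of that, assuming a hypothetical sanitize_src() that stands in for self::_sanitize() (its real rules are not shown in this thread) and an invented static.example.com host:

```php
<?php
// Hypothetical stand-in for self::_sanitize(); the real rules are not shown in this thread.
// Assumption: paths under "images/" move to another host, everything else is kept.
function sanitize_src(string $prefix, string $src): string
{
    if (strpos($src, 'images/') === 0) {
        $src = 'https://static.example.com/' . $src;
    }
    return $prefix . $src;
}

function rewrite_src_with_regex(string $code): string
{
    // Same pattern as above, minus the removed /e modifier.
    $result = preg_replace_callback(
        '/(<[^>]+ ?src=")([^"]*)/i',
        function (array $m): string {
            // $m[1] is the tag up to src=", $m[2] is the attribute value.
            return sanitize_src($m[1], $m[2]);
        },
        $code
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $code;
}

echo rewrite_src_with_regex('<img src="images/buttons/ip.gif" alt="">'), "\n";
```

This only keeps the existing approach working. It does not make it any lighter, and it still assumes double-quoted src attributes.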

Example of the $1 output from the above regex:
string(10) "<img src=""
string(10) "<img src=""
string(10) "<img src=""
string(10) "<img src=""

Example of the $2 output from the above regex:
string(25) "images/buttons/report.gif"
string(21) "images/buttons/ip.gif"
string(30) "images/statusicon/post_old.gif"
string(34) "images/statusicon/user_offline.gif"

Thanks for your help.

Pretty easy and quick with DOMDocument:


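// Parse the markup into a DOM tree instead of matching it with a regex.
$doc = new DOMDocument();
$doc->loadHTML($code);

// Rewrite the src of every <img> element.
foreach ($doc->getElementsByTagName('img') as $img) {
    // Read the current value with getAttribute(); setAttribute() needs two arguments.
    // Note: _sanitize() receives only the URL here, so it should return just the new src value.
    $newSrc = self::_sanitize($img->getAttribute('src'));
    $img->setAttribute('src', $newSrc);
}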

$html = $doc->saveHTML();
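
The loop above only touches img tags, while the original regex matched any tag with a src attribute (script, iframe and so on). If that matters, a DOMXPath query can select them all. This is a rough, self-contained sketch, assuming a hypothetical sanitize_url() in place of self::_sanitize() and an invented static.example.com host:

```php
<?php
// Hypothetical stand-in for self::_sanitize(); the real rules are not shown in this thread.
// Assumption: paths under "images/" move to another host, everything else is kept.
function sanitize_url(string $src): string
{
    return strpos($src, 'images/') === 0
        ? 'https://static.example.com/' . $src
        : $src;
}

function rewrite_src_with_dom(string $code): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them for imperfect markup.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($code);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select every element that carries a src attribute, not only <img>.
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//*[@src]') as $node) {
        $node->setAttribute('src', sanitize_url($node->getAttribute('src')));
    }

    return $doc->saveHTML();
}

echo rewrite_src_with_dom('<p><img src="images/buttons/report.gif" alt="report"></p>');
```

Keep in mind that when loadHTML() is given a fragment, saveHTML() returns a full document, with a doctype and html/body wrappers around it.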

Thank you!