I have two similar functions that I’m working on, but I can’t refine their regexes enough to stop them matching URLs and text that are already inside HTML links.
First - Embed YouTube Videos Automatically
    protected function _embedVideos($str) {
        // Keep replacing until no YouTube URL is left in the string.
        while (preg_match('#(http://www.youtube.com)?/(v([-|~_0-9A-Za-z]+)|watch\\?v\\=([-|~_0-9A-Za-z]+)&?.*?)#i', $str, $matches, PREG_OFFSET_CAPTURE)) {
            $position = $matches[0][1];
            $length = strlen($matches[0][0]);
            // Group 4 is the video id captured from the watch?v= form.
            $id = $matches[4][0];
            $replacement = sprintf(
                '<object width="425" height="350">
                <param name="movie" value="http://www.youtube.com/v/%s"></param>
                <param name="wmode" value="transparent"></param>
                <embed src="http://www.youtube.com/v/%s" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"></embed>
                </object>',
                $id,
                $id
            );
            // Swap the matched URL for the embed markup.
            $str = substr_replace($str, $replacement, $position, $length);
        }
        return $str;
    }
This one works pretty well, but if the YouTube URL is already wrapped in a link to itself, it breaks: the embed code gets applied twice and breaks up the HTML. I know I need to use a lookbehind/lookahead, but I haven’t had much luck with it.
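To make the failure case concrete, here is a minimal sketch of one direction that might avoid it without lookbehind: match whole `<a>...</a>` elements and bare URLs in a single `preg_replace_callback` pass, so a linked URL is consumed together with its anchor and the inserted markup is never rescanned. The names `embedMarkup` and `embedVideosOnce` are hypothetical, the embed markup is shortened, and only the `watch?v=` form is handled. Treat it as an illustration under those assumptions, not a finished fix.

```php
<?php
// Hypothetical helper: a shortened stand-in for the full <object> markup.
function embedMarkup($id) {
    return sprintf('<embed src="http://www.youtube.com/v/%s"></embed>', $id);
}

// Hypothetical standalone variant of _embedVideos (watch?v= URLs only).
function embedVideosOnce($str) {
    $url = 'http://www\.youtube\.com/watch\?v=([-_0-9A-Za-z]+)[^\s<"]*';
    // Alternative 1 consumes a whole <a>...</a> element,
    // alternative 2 matches a bare URL outside of any anchor.
    $pattern = '#<a\b[^>]*>.*?</a>|' . $url . '#is';
    return preg_replace_callback($pattern, function ($m) use ($url) {
        if (isset($m[1])) {
            // Bare URL: group 1 holds the video id.
            return embedMarkup($m[1]);
        }
        // Whole anchor: embed once if its href is a YouTube URL.
        if (preg_match('#href="' . $url . '"#i', $m[0], $inner)) {
            return embedMarkup($inner[1]);
        }
        // Any other link is left untouched.
        return $m[0];
    }, $str);
}
```

Note that in this sketch a linked video is replaced by the embed as a whole, so the original link text is dropped.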
Second - Auto Linker
I have another function that links titles from our database to their relevant pages. It sorts the objects by name length, longest first, so nothing gets applied twice, but the regex has a similar problem: it gets applied again when the text is already inside an HTML link.
It’s supposed to match whole words only and, of course, skip text that is already linked. I pulled this regex off the net a while ago, so if it sucks, that’s probably why.
    protected function _replacePhone($phone, $text) {
        // Whole-word match on the name; the lookaheads are meant to skip tags and linked text.
        $regex = '#(?!<.*?)(?!<a)(\\b'.preg_quote($phone['name'], '#').'\\b)(?!<\\/a>)(?![^<>]*?>)#i';
        $matches = null;
        preg_match_all($regex, $text, $matches);
        // Keep one entry per distinct spelling (the match above is case-insensitive).
        $matches = array_unique($matches[0]);
        foreach ($matches as $match) {
            $pattern = '#(?!<.*?)(?!<a)(\\b'.preg_quote($match, '#').'\\b)(?!<\\/a>)(?![^<>]*?>)#';
            $replacement = '<a href="/' . $phone['wp_slug'] . '">' . $match . '</a>';
            // Link only the first occurrence of each spelling.
            $text = preg_replace($pattern, $replacement, $text, 1);
        }
        return $text;
    }
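For comparison, here is a minimal sketch of an alternative to lookarounds that I think might work: split the text into tags and plain-text chunks with `preg_split`, track whether the current chunk sits inside an `<a>` element, and only link the name in text outside tags and anchors. The function name `linkNameOutsideTags` and its parameters are hypothetical, and the sketch assumes only the first eligible occurrence should be linked and that the markup has no nested anchors.

```php
<?php
// Hypothetical standalone variant of _replacePhone.
function linkNameOutsideTags($name, $slug, $text) {
    // With PREG_SPLIT_DELIM_CAPTURE, even indexes hold text and odd indexes hold tags.
    $chunks = preg_split('#(<[^>]*>)#', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $word = '#\b' . preg_quote($name, '#') . '\b#i';
    $insideLink = false;
    $done = false;
    foreach ($chunks as $i => $chunk) {
        if ($i % 2 === 1) {
            // A tag: never modified, but note when an <a> element opens or closes.
            if (preg_match('#^<a\b#i', $chunk)) {
                $insideLink = true;
            } elseif (preg_match('#^</a\s*>#i', $chunk)) {
                $insideLink = false;
            }
            continue;
        }
        if ($insideLink || $done) {
            continue; // skip already-linked text and anything after the first link
        }
        // Link the first whole-word occurrence, keeping its original case.
        $chunks[$i] = preg_replace_callback($word, function ($m) use ($slug) {
            return '<a href="/' . $slug . '">' . $m[0] . '</a>';
        }, $chunk, 1, $count);
        $done = $count > 0;
    }
    return implode('', $chunks);
}
```

Because tags are never passed to the replacement, attribute values such as `alt` text are left alone, and text inside an existing `<a>` is skipped by the `$insideLink` flag rather than by the regex.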
Any help would be appreciated.
Cheers