Regex Help

I'm working on two similar functions, but I can't polish their regexes enough to stop them matching URLs and text that are already part of an HTML link.

First - Embed YouTube Videos Automatically

    protected function _embedVideos($str) {
       while (preg_match('#(http://www.youtube.com)?/(v([-|~_0-9A-Za-z]+)|watch\\?v\\=([-|~_0-9A-Za-z]+)&?.*?)#i', $str, $matches, PREG_OFFSET_CAPTURE)) {           
            $position = $matches[0][1];
            $length   = strlen($matches[0][0]);
            
            $id = $matches[4][0];
            
            $replacement = sprintf(
                '<object width="425" height="350">
                    <param name="movie" value="http://www.youtube.com/v/%s"></param>
                    <param name="wmode" value="transparent"></param>
                    <embed src="http://www.youtube.com/v/%s" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"></embed>
                </object>',
                $id, 
                $id
            );
            
           $str = substr_replace($str, $replacement, $position, $length);
       }
       
       return $str;
    }

This one works pretty well, but it breaks if the YouTube URL is itself a link (wrapped in an anchor tag that points at the same URL): the embed code gets applied twice and breaks up the HTML. I know I need a lookbehind/lookahead, but I haven't had much luck with it.

Two - Auto Linker

I have another that links titles from our database to their relevant pages. It sorts the objects by longest name first so nothing gets applied twice, but the regex has a similar problem: it gets applied again when the text is already inside an HTML link.

It’s just supposed to match whole words only, and of course, not already linked text. I pulled this regex off the net a while ago so if it sucks, that’s probably why.

    protected function _replacePhone($phone, $text) {
        $regex = '#(?!<.*?)(?!<a)(\\b'.preg_quote($phone['name'], '#').'\\b)(?!<\\/a>)(?![^<>]*?>)#i';
        
        $matches = null;
        preg_match_all($regex, $text, $matches);
        
        $matches = array_unique($matches[0]);
        
        foreach ($matches as $match) {
            $pattern = '#(?!<.*?)(?!<a)(\\b'.preg_quote($match, '#').'\\b)(?!<\\/a>)(?![^<>]*?>)#';
            $replacement = '<a href="/' . $phone['wp_slug'] . '">' . $match . '</a>';
            $text = preg_replace($pattern, $replacement, $text, 1);
        }
        
        return $text;  
    }

Any help would be appreciated.

Cheers

Hmmm, interesting. I think that will work for #1, but not #2, since it actually iteratively inserts links.

Thanks for your response, Anthony.

Just quickly, why not remove all links first, filter, then re-insert 'em?
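
Something along these lines, as a rough sketch rather than a drop-in fix. `filterOutsideLinks` and the placeholder token format are names I made up for illustration, and I'm assuming the links are plain, non-nested `<a>...</a>` elements. It swaps each existing link for a token, runs whatever filter you pass in, then swaps the links back:

```php
<?php
// Hypothetical helper (not part of the code above): hides existing links
// behind placeholder tokens, runs a filter on what is left, then restores them.
function filterOutsideLinks(string $html, callable $filter): string
{
    $links = [];

    // Step 1: replace every <a ...>...</a> element with a unique token.
    // NUL bytes delimit the token so it is unlikely to collide with real text.
    $masked = preg_replace_callback(
        '#<a\b[^>]*>.*?</a>#is',
        function (array $m) use (&$links): string {
            $token = "\x00LINK" . count($links) . "\x00";
            $links[$token] = $m[0];
            return $token;
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; leave input untouched.
    if ($masked === null) {
        return $html;
    }

    // Step 2: run the filter on text that no longer contains any links.
    $filtered = $filter($masked);

    // Step 3: put the original links back in place of their tokens.
    return $links ? strtr($filtered, $links) : $filtered;
}

// Usage: only the unlinked "Nexus One" gets wrapped.
$text = 'The Nexus One beats <a href="/nexus-one">Nexus One</a> rumours.';
$out = filterOutsideLinks($text, function (string $s): string {
    return preg_replace('#\bNexus One\b#i', '<a href="/nexus-one">$0</a>', $s);
});
```

Since the masking happens on every call, links inserted by an earlier pass get hidden as well, so this may also cover the iterative case in #2 if you call it once per title. It doesn't protect URLs sitting in other attributes (an `img` `src`, say), so you'd need to extend the masking pattern for those.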