  1. #1
    SitePoint Enthusiast [Az]'s Avatar
    Join Date
    Aug 2003
    Location
    UK
    Posts
    98

    Smilies being parsed in URLs

    I'm trying to modify a smilie parser so that smilie codes appearing inside a URL are not replaced, since that breaks the URL. I've got the following:

    PHP Code:
    foreach ($this->smilieSearchArray as $key => $value)
    {
        $parsedText = preg_replace('#(?<!&amp|&quot|&lt|&gt|&copy)' . preg_quote($value, '#') . '#s', $this->smilieReplaceArray[$key], $text);
    }

    This works fine if I replace $value in the preg_replace call with some hard-coded text: the smilie is then shown. With $value, however, the smilie text is not replaced. Both the search and replace arrays are correct; I can echo them and see that they are set up correctly.

    It's probably something obvious that I'm missing, but I just can't see it.
    Last edited by [Az]; May 25, 2004 at 06:15.
    HEXUS Webmaster

  2. #2
    SitePoint Enthusiast [Az]'s Avatar
    bump

  3. #3
    SitePoint Enthusiast [Az]'s Avatar
    I found the bug in the script. As expected, it was a stupid error: I was overwriting $parsedText with a result computed from the original $text on every pass through the loop, so only the last smilie's replacement survived. Stupid, really.
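
A minimal sketch of that fix, assuming the loop from the first post. The function name `parseSmilies`, the two sample arrays, and the image file names are hypothetical stand-ins for the class properties; the only real change is that each pass works on the previous result instead of the original text.

```php
<?php
// Hypothetical stand-ins for $this->smilieSearchArray / $this->smilieReplaceArray.
$smilieSearchArray  = [':)', ':o'];
$smilieReplaceArray = [
    '<img src="smile.gif" alt="smile" />',
    '<img src="oh.gif" alt="oh" />',
];

function parseSmilies(string $text, array $search, array $replace): string
{
    // Start from the original text once, outside the loop.
    $parsedText = $text;
    foreach ($search as $key => $value) {
        // Feed the previous result back in so earlier replacements survive.
        $parsedText = preg_replace(
            '#(?<!&amp|&quot|&lt|&gt|&copy)' . preg_quote($value, '#') . '#s',
            $replace[$key],
            $parsedText
        );
    }
    return $parsedText;
}

echo parseSmilies('hi :) and :o', $smilieSearchArray, $smilieReplaceArray), "\n";
```

With the original loop, only the `:o` replacement would have been kept here, because each pass started again from the unmodified text.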

    Anyway, I now have an issue where it parses smilies inside an <a href> tag that contains a JavaScript command, e.g. <a href=javascript:openWindow. The parser picks up the :o in the command and messes up the URL. I've tried to get to the bottom of it, but no luck. Can anyone point me in the right direction to make sure it excludes any strings inside an HTML tag?

  4. #4
    SitePoint Enthusiast [Az]'s Avatar
    In fact, it does exactly what this forum did to my previous post, where the :o in javascript:openWindow was swallowed as a smilie.

  5. #5
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    The simplest solution would be:

    PHP Code:
    $test=<<<EEE
    <a href='javascript:open'>foo :o bar</a>foo :o bar<img alt="foo :o bar">:o
    EEE;

    $test=preg_replace("~(>[^<]*)(:o)~","$1{smile}",$test); 
    This code assumes well-formed XML, though; e.g., it does not handle < or > inside a tag, as in <a href="javascript:1>2">.
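
One thing to watch with the pattern above: because `[^<]*` is greedy, it replaces only the last `:o` in each run of text, and it skips text that comes before the first tag. A possible variant, under the same well-formedness assumption, splits the markup into tags and text and replaces only in the text pieces. The function name `replaceOutsideTags` is made up for this sketch.

```php
<?php
// Replace a smilie code only in text between tags, never inside a tag.
// Assumes well-formed markup with no literal < or > inside attribute values.
function replaceOutsideTags(string $html, string $code, string $replacement): string
{
    // Split into tags and text; the capture group keeps the tags in the result.
    $parts = preg_split('~(<[^>]*>)~', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        // With a single capture group, even indexes are text and odd indexes are tags.
        if ($i % 2 === 0) {
            $parts[$i] = str_replace($code, $replacement, $part);
        }
    }
    return implode('', $parts);
}

$test = "<a href='javascript:open'>foo :o bar</a>foo :o bar<img alt=\"foo :o bar\">:o";
echo replaceOutsideTags($test, ':o', '{smile}'), "\n";
```

Like the one-line regex, this still breaks on something like `<a href="javascript:1>2">`, since the tag pattern stops at the first `>`.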

    I don't think a "real-life" HTML parser can be built with regexps. You need something like a SAX parser for this.
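
As a rough illustration of the parser route, here is a sketch that uses PHP's DOM extension rather than a SAX parser (a swapped-in technique, and it assumes the extension is installed). It touches text nodes only, so attribute values should be left alone. The function name `replaceInTextNodes` and the wrapping `<div>` are choices made for this example.

```php
<?php
// Replace a smilie code in text nodes only, using a real HTML parser.
// Assumes the DOM extension is available; the replacement is inserted as plain text.
function replaceInTextNodes(string $html, string $code, string $replacement): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them for sloppy markup.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so its contents can be found again after parsing.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//text()') as $node) {
        $node->nodeValue = str_replace($code, $replacement, $node->nodeValue);
    }

    // Serialize only the children of the wrapper div.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo replaceInTextNodes('<a href="javascript:open">x :o</a>', ':o', '{smile}'), "\n";
```

Setting `nodeValue` inserts plain text, so turning the code into an actual `<img>` element would mean creating element nodes (for example with `createElement`) instead of doing a string replacement.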

  6. #6
    SitePoint Enthusiast [Az]'s Avatar
    Seems to work like a charm. I'm going to see if I can find normal circumstances under which it breaks, but it's already better than what I had. In general I aim for XHTML 1.1 Strict, so the code should be well-formed (famous last words).

    Thanks, your help is much appreciated.

