  1. #1
    SitePoint Addict
    Join Date
    Oct 2004
    Location
    USA, Saratoga Springs, NY
    Posts
    296

    REGEX Problem...completely baffled. HELP!

    Okay, so here's the situation...(theoretical, I haven't actually made the interface yet)

    I'm using a textarea field to let me create content and markup on the fly for a webpage. I use the good ol' time-tested approach of calling nl2br() on the text after pulling it from the database, just before displaying it on the page. However, what if I want to put more advanced HTML in the textarea field, such as another textarea, or a PRE section to show off some code?

    Currently, that means nl2br() will break it! The output looks like the following (rendered HTML; assume the content between the "<textarea>" tags is displayed inside a real textarea field):
    Time for a simple PHP example script:
    <textarea code="php"><br />
    <?php<br />
    //old faithful...<br />
    echo 'Hello World!' . "\n";<br />
    ?><br />
    </textarea>

    There's our famous Hello World example, written in PHP.
    Instead of calling nl2br() on everything, I'm looking for a way to determine whether I'm NOT inside a textarea, pre, or some other predefined tag (BBCode later on, but for this example we'll stick with pre and textarea) and, only in that case, apply an nl2br-type transformation.
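To make the problem concrete, here is a minimal sketch of what nl2br() does to stored text that contains a textarea. The variable names and the sample text are made up for illustration.

```php
<?php
// Hypothetical content as it might come back from the database.
$stored = "Time for a simple PHP example script:\n"
        . "<textarea>\n"
        . "echo 'Hello World!';\n"
        . "</textarea>";

// nl2br() inserts <br /> before every newline, including the ones
// inside the textarea, which is exactly the unwanted behaviour.
$rendered = nl2br($stored);

echo $rendered;
```

The `<br />` tags that land between `<textarea>` and `</textarea>` show up as literal text in the rendered field.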

    The nl2br transformation I have looks something like the following:
    Code:
    '/(\x0D\x0A|\x0A|\x0D)/'
    I may go back and alter it; it was merely for testing purposes to start out with.
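The line-break alternation in that pattern can stand in for nl2br() by itself. The following is only a sketch of that idea; `regex_nl2br` is a hypothetical helper name, and it covers the `\r\n`, `\n`, and `\r` cases only.

```php
<?php
// Regex stand-in for nl2br(): prefix each line break with <br />.
// \r\n is listed first so a Windows line ending counts as one break, not two.
function regex_nl2br(string $text): string
{
    // $0 in the replacement re-inserts the matched line break unchanged.
    return preg_replace('/\r\n|\n|\r/', '<br />$0', $text);
}

$sample = "one\r\ntwo\nthree\rfour";
echo regex_nl2br($sample);
```

Having the transformation as a regex matters here because it can later be combined with a pattern that recognises the protected tags.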

    I started to tackle the textarea situation, but only got so far...
    Code:
    '/[<\[](textarea|pre|iframe).*?[>\]].*?[<\[]\/\\1[>\]]/'
    I'm afraid I'm overthinking the problem. I'm currently looking at lookaheads, lookbehinds, negatives of each of those, and expression conditionals. I'm just unable to wrap my head around it in a REGEX manner even though I'm pretty sure it can be done.

    ...help?
    They say, "Practice makes perfect," yet they also say, "Nobody's perfect". I don't get it.

  2. #2
    php_daemon
    Join Date
    Mar 2006
    Posts
    5,284
    A quick idea:
    PHP Code:
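    function replace($matches) {
        // $matches[1] = opening tag, [2] = tag name, [3] = inner content, [4] = closing tag
        $str = str_replace("\r", "", $matches[3]);
        // Hide the inner newlines from nl2br() behind a placeholder
        return $matches[1] . str_replace("\n", "<imnotnl/>", $str) . $matches[4];
    }

    // The "s" modifier lets "." match the newlines inside the protected block
    $tmp = preg_replace_callback('/([<\[](textarea|pre|iframe).*?[>\]])(.*?)([<\[]\/\\2[>\]])/s', 'replace', $string);
    $tmp = nl2br($tmp);
    $final = str_replace("<imnotnl/>", "\n", $tmp);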
    Saul

  3. #3
    SitePoint Wizard stereofrog
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Although it's possible to construct one single regexp for this, the so-called "isolation" technique often leads to cleaner and simpler code. The working example:

    PHP Code:
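    // Called with a pattern: replace every match in $src with a placeholder.
    // Called as the callback (no $regexp): $src is the match array; stash the match.
    function isolate($src, $regexp = NULL) {
        if ($regexp) return preg_replace_callback($regexp, 'isolate', $src);
        global $_buf;
        $_buf[] = $src[0];
        return "\001" . (count($_buf) - 1);
    }

    // Swap each placeholder back for the text stashed in $_buf.
    function restore($text) {
        global $_buf;
        // The original used preg_replace() with the /e modifier, which was removed in PHP 7
        return preg_replace_callback('~\001(\d+)~', function ($m) use ($_buf) {
            return $_buf[$m[1]];
        }, $text);
    }

    $source = "hello
    world
    <textarea>
    php
    code
    </textarea>
    foo
    bar";

    $t = isolate($source, '~<(textarea|pre).*?>.*?</\1>~si');
    $t = nl2br($t);
    $t = restore($t);

    header("Content-Type: text/plain");
    echo $t;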
    Feel free to ask if you need more clarification.
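The same isolate, transform, restore sequence can also be written without the global buffer. This is a sketch under my own assumptions, not the poster's code: `nl2br_outside` is a hypothetical name, the stash lives in a local array captured by closures, and the placeholder is closed with a second `\x01` so that a digit directly after a protected block cannot be mistaken for part of the index.

```php
<?php
// Hypothetical helper: apply nl2br() everywhere except inside <textarea> / <pre>.
function nl2br_outside(string $text): string
{
    $stash = [];

    // 1. Isolate: replace each protected block with "\x01<index>\x01".
    $text = preg_replace_callback(
        '~<(textarea|pre)\b[^>]*>.*?</\1>~si',
        function ($m) use (&$stash) {
            $stash[] = $m[0];
            return "\x01" . (count($stash) - 1) . "\x01";
        },
        $text
    );

    // 2. Transform: the placeholders contain no newlines, so nl2br() leaves them alone.
    $text = nl2br($text);

    // 3. Restore: put the stashed blocks back.
    return preg_replace_callback(
        '~\x01(\d+)\x01~',
        function ($m) use ($stash) {
            return $stash[(int) $m[1]];
        },
        $text
    );
}

$source = "hello\nworld\n<textarea>\nphp\ncode\n</textarea>\nfoo\nbar";
echo nl2br_outside($source);
```

This version assumes the input never contains the `\x01` control character itself; text that might contain it would need a different placeholder.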

  4. #4
    SitePoint Addict
    Join Date
    Oct 2004
    Location
    USA, Saratoga Springs, NY
    Posts
    296
    I'll take a look at both of these possible solutions over the weekend. Thank you very much for the help! I liked the challenge of trying to do this with one regexp, if for nothing else, just to figure out how to do it; wish I still could. Regardless, these look like viable solutions - thanks!

  5. #5
    SitePoint Addict
    Join Date
    Oct 2004
    Location
    USA, Saratoga Springs, NY
    Posts
    296
    @php_daemon: I understand what your technique was trying to accomplish after looking it over. Not a bad idea, actually. It didn't work out-of-the-box though and stereofrog's did, so I'm going with his for the sheer fact that I'm sure it'll fit in as-is to the problem at hand, where I might have to adapt yours a bit more than simply getting it to work as it should. Thanks for the help!

    @stereofrog: That worked rather well. I understand how it works and no explanation is necessary. Storing it in an array for callback later is an interesting idea. Are there any articles you know that discuss this type of stuff in more detail? Also, out of curiosity, do you know of any communities based around REGEXP type stuff? I'd still like to learn how to do this all in one shot in case it'll come in handy some day -- it might even be a little less resource intensive on the server if I had a lot of data to parse. Thanks very much for the help!
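For the "all in one shot" version, one option is PCRE's backtracking control verbs. The sketch below assumes the PCRE library bundled with your PHP supports `(*SKIP)` and `(*F)`; `nl2br_skip_blocks` is a hypothetical name. The first alternative matches a whole protected block and then deliberately fails, which makes the engine jump past the block. Only line breaks outside such blocks reach the second alternative.

```php
<?php
// Hypothetical helper: add <br /> before line breaks, except inside <textarea> or <pre>.
function nl2br_skip_blocks(string $text): string
{
    // Alternative 1: consume a protected block, then discard the match and skip past it.
    // Alternative 2: match a line break (\r\n first so it counts as one break).
    $pattern = '~<(textarea|pre)\b[^>]*>.*?</\1>(*SKIP)(*F)|\r\n|\n|\r~si';

    // $0 re-inserts the matched line break after the <br />.
    return preg_replace($pattern, '<br />$0', $text);
}

$source = "hello\n<pre>\ncode\n</pre>\nbar";
echo nl2br_skip_blocks($source);
```

Whether this is actually cheaper than the isolation approach would need measuring on real data. An unclosed `<pre>` or `<textarea>` is treated as ordinary text here, so its line breaks still get a `<br />`.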

  6. #6
    SitePoint Wizard stereofrog
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    I must admit Friedl's book is the only thing I've ever read about regexps...

