Thread: regex issue

  1. #1
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147

    regex issue

    I am trying to use preg_replace but my regex ability isn't the best.

    I've got to remove all newlines from a string unless the line that follows starts with a number followed by a comma, e.g. "1234,".

    I came up with
    Code:
    $newstring = preg_replace('/(?:\n|\r\n)(?:[^0-9]+[^,])/', ' ', $oldstring);
    Which seemed to work until I noticed a new line that starts with "123 Kg".

    What am I doing wrong?

  2. #2
    SitePoint Wizard lorenw's Avatar
    Join Date
    Feb 2005
    Location
    was rainy Oregon now sunny Florida
    Posts
    1,103
    You could try something like:

    Code:
    // ereg() was removed in PHP 7, so preg_match() is used here instead
    if (!preg_match('/^[0-9].*,$/s', $oldstring)) {
        $newstring = str_replace("\n", '', $oldstring);
    }
    Untested, but it's an idea for an approach.

    It looks for a number at the beginning and a comma at the end.

    hth
    cheers
    Lorenw

  3. #3
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Hi lorenw, thanks for the suggestion, but the string will have multiple newlines, some of which need to be removed and others not (i.e. not when the next line begins with a number and a comma), so I don't think that method would work.

  4. #4
    Chessplayer kleineme's Avatar
    Join Date
    Apr 2004
    Location
    Germany
    Posts
    608
    Are you sure that you don't lose any data? When I use your regex, I do:

    PHP Code:
    $text = '
    1234,
    khjkh,
    oh0w8,
    123 kg,
    453,
    swf,
    eraz,
    564aw,
    989879,
    s4wefe,
    awtw
    ';

    //$regex = '/(?:\n|\r\n)(?:[^0-9]+[^,])/';
    $regex = '~\r?\n[^0-9]+[^,]~';  // which is the same as yours, only more concise
    $repl = preg_replace($regex, " ", $text);
    echo "<pre>" . $text . "</pre>";
    echo "<pre>" . $repl . "</pre>";
    Output:
    Code:
    1234,
    khjkh,
    oh0w8,
    123 kg,
    453,
    swf,
    eraz,
    564aw,
    989879,
    s4wefe,
    awtw
    
    1234, w8,
    123 kg,
    453, 64aw,
    989879, wefe,
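    The lost characters in that output come from what each match consumes. As a rough sketch (same sample text, written here with explicit \n escapes, which is an assumption about the line endings), preg_match_all() can list the matches:

```php
<?php
// Same sample text as above, written with explicit \n escapes.
$text = "\n1234,\nkhjkh,\noh0w8,\n123 kg,\n453,\nswf,\neraz,\n564aw,\n989879,\ns4wefe,\nawtw\n";

// [^0-9]+ also matches commas and newlines, so one match can span several lines,
// and the trailing [^,] always consumes one extra character (often a digit).
preg_match_all('~\r?\n[^0-9]+[^,]~', $text, $matches);

foreach ($matches[0] as $match) {
    // json_encode() makes the newlines inside each match visible
    echo json_encode($match), "\n";
}
```

    The first match is "\nkhjkh,\noh0", which is why "khjkh," and "oh0" vanish and only "w8," is left after "1234,".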
    To solve your problem you could use a negative lookahead:

    PHP Code:
    $text = '
    1234,
    khjkh,
    oh0w8,
    123 kg,
    453,
    swf,
    eraz,
    564aw,
    989879,
    s4wefe,
    awtw
    ';

    $regex = '~\r?\n(?!\d+,)~';
    $repl = preg_replace($regex, " ", $text);
    echo "<pre>" . $text . "</pre>";
    echo "<pre>" . $repl . "</pre>";
    Output:
    Code:
    1234,
    khjkh,
    oh0w8,
    123 kg,
    453,
    swf,
    eraz,
    564aw,
    989879,
    s4wefe,
    awtw
    
    1234, khjkh, oh0w8, 123 kg,
    453, swf, eraz, 564aw,
    989879, s4wefe, awtw
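    For reuse, the lookahead pattern could be wrapped in a small function. This is only a sketch: the function name joinContinuationLines and the sample string are made up for illustration and are not from the thread.

```php
<?php
// Hypothetical helper name, chosen for this example only.
function joinContinuationLines(string $text): string
{
    // Replace a line break (\n or \r\n) with a space, unless the next line
    // begins with one or more digits immediately followed by a comma.
    return preg_replace('~\r?\n(?!\d+,)~', ' ', $text);
}

// Illustrative input, including a line that starts with "123 Kg".
$sample = "1234,\nfirst part\nsecond part\n123 Kg\n5678,\nnext record";
echo joinContinuationLines($sample);
```

    Because the lookahead requires the comma directly after the digits, the break before "123 Kg" is removed while the break before "5678," is kept, so the sample comes out as two lines: "1234, first part second part 123 Kg" and "5678, next record".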
    Never ascribe to malice,
    that which can be explained by incompetence.
    Your code should not look unmaintainable, just be that way.

  5. #5
    SitePoint Zealot
    Join Date
    Apr 2006
    Posts
    147
    Thanks kleineme, your regex did the trick. Is ~ just another alternative to using / at the beginning and end?

  6. #6
    Chessplayer kleineme's Avatar
    Join Date
    Apr 2004
    Location
    Germany
    Posts
    608
    Yes, you have to use a delimiter, and "~" is my favorite one. If you use "/" and you have to match slashes (e.g. in a URL), then you would have to escape every occurrence of "/" within the regex, which makes the expression less legible. Of course, if you have to match lots of tildes, then you should use another delimiter.
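    A small comparison of the two delimiters, as a sketch. The URL and both patterns are invented for illustration.

```php
<?php
// Made-up URL for the comparison.
$url = 'http://example.com/forum/thread';

// With "/" as the delimiter, every literal slash in the pattern must be escaped.
$withSlashes = preg_match('/^http:\/\/[^\/]+\/forum\//', $url);

// With "~" as the delimiter, the same pattern needs no escaping.
$withTildes = preg_match('~^http://[^/]+/forum/~', $url);

var_dump($withSlashes === 1, $withTildes === 1);
```

    Both calls match the same thing; only the readability of the pattern differs.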

