  1. #1
    SitePoint Member
    Join Date
    Oct 2009
    Posts
    11

    Please help my regex for preg_match_all

    Hello, I'm writing a regex to use in preg_match_all. It works with preg_match, but I can't get it right for preg_match_all. The pattern is [T-xxx-yyy], with xxx as digits (any length) and yyy as letters (any length). For example, [T-123-OK] or [T-8291-CANCEL] will match the pattern.

    Here's my code.

    PHP Code:
    $regex = "/(.*)([T-[0-9]+-[A-Za-z]+)(.*)/";
    $string = "[T-1223-DONE][T-381-CANCEL][T-547-DELETE]";
    echo "<pre>";
    $matches = array();
    preg_match_all($regex, $string, $matches);
    print_r($matches);
    echo "</pre>";
    It returns this array.

    Code:
        [1] => Array
            (
                [0] => [T-1223-DONE][T-381-CANCEL]
            )
    
        [2] => Array
            (
                [0] => [T-547-DELETE]
            )
    
        [3] => Array
            (
                [0] => 
            )
    
    )
    What I want is something like this:

    Code:
    Array
    (
        [0] => Array
            (
                [0] => [T-1223-DONE]
                [1] => [T-381-CANCEL]
                [2] => [T-547-DELETE]
            )
    )
    Also, how do I make my regex handle multi-line input?
    Please help, I'm very new to this regex thing. Thanks in advance!

  2. #2
    Theoretical Physics Student bronze trophy Jake Arkinstall's Avatar
    Join Date
    May 2006
    Location
    Lancaster University, UK
    Posts
    7,062
    The problem was mainly that your regex was very 'greedy'. Removing the unnecessary wildcards and using the '+?' combo fixes it:

    PHP Code:
    $regex = "/\[T-(\d+?)-([A-Za-z]+)\]/";
    $string = "[T-1223-DONE][T-381-CANCEL][T-547-DELETE]";
    echo "<pre>";
    $matches = array();
    preg_match_all($regex, $string, $matches);
    print_r($matches);
    echo "</pre>";
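    As for multiline, the pattern has no ^ or $ anchors, so it already matches across lines. The m flag only changes how ^ and $ behave, so adding it is harmless but not required: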
    PHP Code:
    $regex = "/\[T-(\d+?)-([A-Za-z]+)\]/m";
    $string = "[T-1223-DONE][T-381-CANCEL][T-547-DELETE]";
    echo "<pre>";
    $matches = array();
    preg_match_all($regex, $string, $matches);
    print_r($matches);
    echo "</pre>";
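    A minimal sketch of the multi-line point, assuming the input simply contains newline characters. The sample text and the names $plain and $multiline are made up for illustration. Since the pattern has no ^ or $ anchors, both calls should find the same tags:

```php
<?php
// Sample input spanning several lines (illustrative, not from the original post).
$string = "[T-1223-DONE] first line\nsecond line [T-381-CANCEL]\n[T-547-DELETE]";

// Same pattern, once without and once with the m modifier.
preg_match_all('/\[T-(\d+?)-([A-Za-z]+)\]/', $string, $plain);
preg_match_all('/\[T-(\d+?)-([A-Za-z]+)\]/m', $string, $multiline);

// $plain[0] holds the full matches, one per tag, regardless of line breaks.
print_r($plain[0]);

// The m modifier only affects ^ and $, so the two results are identical here.
var_dump($plain === $multiline);
```

    The m modifier would start to matter only if the pattern were anchored, for example to require a tag at the start of each line.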
    Jake Arkinstall
    "Sometimes you don't need to reinvent the wheel;
    Sometimes its enough to make that wheel more rounded"-Molona

  3. #3
    SitePoint Wizard
    Join Date
    Nov 2005
    Posts
    1,191
    I don't think greediness as such is the issue; rather, it's the .* at the beginning and end, and not escaping the [
    PHP Code:
    $regex = "~\[T-\d+-[A-Z]+\]~";
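    To see why the surrounding (.*) groups get in the way, here is a rough sketch comparing the two patterns on the string from the first post. The variable names $wrapped, $bare and $count are only for illustration:

```php
<?php
$string = "[T-1223-DONE][T-381-CANCEL][T-547-DELETE]";

// With (.*) on both sides, a single match consumes the whole string:
// the leading (.*) is greedy and swallows the first two tags.
$count = preg_match_all('/(.*)(\[T-[0-9]+-[A-Za-z]+\])(.*)/', $string, $wrapped);
echo $count, "\n";      // number of matches found
print_r($wrapped[1]);   // what the leading (.*) captured
print_r($wrapped[2]);   // the only tag left for the middle group

// Without the wildcards, each tag is a separate match.
preg_match_all('~\[T-\d+-[A-Z]+\]~', $string, $bare);
print_r($bare[0]);
```

    Because the first match already covers the entire string, preg_match_all has nothing left to scan, which is why only one tag shows up in the middle group.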

  4. #4
    SitePoint Member
    Join Date
    Oct 2009
    Posts
    11
    Um, sorry, I made a mistake in the original regex. I retyped it in a hurry instead of copying it from my PHP file, hence the unescaped [. What I actually used:
    PHP Code:
    $regex = "/(.*)(\[T-[0-9]+-[A-Za-z]+\])(.*)/";
    Anyway, I tried Jake's code and it works. I'm curious about the
    Code:
    (\d+?)
    Why do we need the "?"?

    hash, I tried your code on multi-line input and it only returns the last match of each line.

    Thanks Jake & hash!

  5. #5
    SitePoint Wizard
    Join Date
    Nov 2005
    Posts
    1,191
    Hmm, works for me
    PHP Code:
    $regex = "~\[T-\d+-[A-Z]+\]~";
    $string = "asdasd[T-111-DONE]ewrew[T-222-CANCEL]asdasd[T-333-DELETE]asdasd
    asdasd[T-444-DONE]ewrew[T-555-CANCEL]asdasd[T-666-DELETE]asdasd
    asdfasdfasd[T-777-NEWLINE]ewr";
    echo "<pre>";
    $matches = array();
    preg_match_all($regex, $string, $matches);
    print_r($matches);
    echo "</pre>";
    /*
    Array
    (
        [0] => Array
            (
                [0] => [T-111-DONE]
                [1] => [T-222-CANCEL]
                [2] => [T-333-DELETE]
                [3] => [T-444-DONE]
                [4] => [T-555-CANCEL]
                [5] => [T-666-DELETE]
                [6] => [T-777-NEWLINE]
            )
    )
    */ 
    The ? makes the quantifier non-greedy, e.g.
    PHP Code:
    $str = '<p>para 1</p><p>para2</p>';
    echo preg_replace('~<p>.+</p>~', ' -para- ', $str); // -para- even though there are 2
    echo '<br>';
    echo preg_replace('~<p>.+?</p>~', ' -para- ', $str); // -para- -para-
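    For the (\d+?) question specifically, a small sketch suggests the ? makes no difference in this particular pattern: the digits must be followed by a literal -, so the lazy and greedy quantifiers end up capturing the same text. The names $lazy and $greedy are illustrative only:

```php
<?php
$string = "[T-1223-DONE][T-381-CANCEL][T-547-DELETE]";

// Lazy quantifier, as in Jake's pattern.
preg_match_all('/\[T-(\d+?)-([A-Za-z]+)\]/', $string, $lazy);

// Greedy quantifier, otherwise the same pattern.
preg_match_all('/\[T-(\d+)-([A-Za-z]+)\]/', $string, $greedy);

print_r($lazy[1]);            // the captured numbers
var_dump($lazy === $greedy);  // both variants return the same arrays
```

    The lazy form matters in cases like the .+ example above, where the quantified part could otherwise run past the next closing delimiter.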

  6. #6
    SitePoint Member
    Join Date
    Oct 2009
    Posts
    11
    Hmm, I'm not sure why, but I tested it with the code below and got only two matches. Anyway, thanks for your help! Really great community: I asked this question on another board yesterday and I haven't had any replies yet!

    PHP Code:
    $regex = "~\[T-\d+-[A-Z]+\]~";
    $matches = array();
    $text13 = "MY TASK [T-122-done] is DONE,
    MY TASK [T-134-DONE] is DONE,
    MY TASK [T-253-Done] is DONE,
    MY TASK [T-321-Done] is DONE,
    MY TASK [T-654-DONE] is DONE";
    preg_match_all($regex, $text13, $matches, PREG_PATTERN_ORDER);
    echo "<pre>";
    print_r($matches);
    echo "</pre>";
    /*
    Array
    (
        [0] => Array
            (
                [0] => [T-134-DONE]
                [1] => [T-654-DONE]
            )

    )
    */ 

  7. #7
    @php.net Salathe's Avatar
    Join Date
    Dec 2004
    Location
    Edinburgh
    Posts
    1,396
    With the code in your last post, the "yyy" part only matches uppercase letters, but some of your tags include lowercase letters ("Done"). Either change the character class ("[…]") to allow lowercase letters or make the entire regular expression case-insensitive by using the "i" modifier ("…~i").
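    Both options could look roughly like this. It is a sketch using a shortened version of the text from the previous post; the names $byClass and $byFlag are made up for illustration:

```php
<?php
$text = "MY TASK [T-122-done] is DONE,\nMY TASK [T-134-DONE] is DONE,\nMY TASK [T-253-Done] is DONE";

// Option 1: widen the character class to include lowercase letters.
preg_match_all('~\[T-\d+-[A-Za-z]+\]~', $text, $byClass);

// Option 2: keep [A-Z] and make the whole pattern case-insensitive.
preg_match_all('~\[T-\d+-[A-Z]+\]~i', $text, $byFlag);

print_r($byClass[0]);
print_r($byFlag[0]);
```

    One difference worth keeping in mind: with the i modifier the leading T also matches a lowercase t, whereas widening the character class leaves the rest of the pattern case-sensitive.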
    Salathe
    Software Developer and PHP Manual Author.

