  1. #1
    SitePoint Member
    Join Date
    Oct 2009
    Posts
    2

    Multiple lines in Perl regex

    Hi, I am trying to write a parser that extracts the required text from an m-script.

    The m-script contains different kinds of declarations, e.g.:

    1: par_1.1_anything = min_value;

    2: any_parameter = [ 0 2 3 5;
    6 9 min_v_1 10;
    12 15 30 35];

    3: diff.name = [ 5
    min
    16
    max];

    4: not_req = [ 11 13 15;
    19 24 30;
    31 33 39];

    I want to extract every declaration, single-line as well as multi-line, that has an alphabetic character anywhere between the "=" and the end of the declaration (";" or "];").

    I am using while(<>) so that the script reads any input file, then print if $_ =~ /regular expression/; and I save the output in an output file.

    This works for single-line declarations, but I am stuck on the multi-line ones.
    I want something that, on seeing "[", searches the following lines for any character in [a-zA-Z] until it finds "];", and prints the whole declaration if it found such a character in between.

    Please help me with this.

    Thanks a lot.

  2. #2
    SitePoint Member
    Join Date
    Oct 2009
    Location
    Melbourne VIC
    Posts
    2
    The option "s" to a regexp will tell the parser to match multi-line inputs. So,

    if($_ =~ /regular expression/s)

    should do the trick for you.

  3. #3
    SitePοint Troll disgracian's Avatar
    Join Date
    Aug 2006
    Location
    Samsara
    Posts
    451
    The 's' option is actually single-line; use 'm' for multi-line.

    's' will cause the '.' character to match newlines in addition to everything else. 'm' will cause the ^ and $ anchors to match the beginning and end of lines instead of the entire string.

    's' and 'm' can be used together in some circumstances too.
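
    The difference between the two flags can be checked with a small sketch. The sample string and the variable names below are made up for illustration; only the /s and /m behaviour is what the post describes.

```perl
use strict;
use warnings;

# Made-up sample: a declaration spread over three lines.
my $text = "alpha = [ 1\nbeta\n2 ];\n";

# Without /s the dot stops at a newline, so "[" ... "]" cannot be bridged.
my $plain  = $text =~ /\[.*\]/  ? 1 : 0;
# With /s the dot also matches "\n", so the match can span lines.
my $dotall = $text =~ /\[.*\]/s ? 1 : 0;

# Without /m, ^ anchors only at the very start of the string.
my $start_only = $text =~ /^beta$/  ? 1 : 0;
# With /m, ^ and $ also anchor at every embedded line boundary.
my $per_line   = $text =~ /^beta$/m ? 1 : 0;

print "plain=$plain dotall=$dotall start_only=$start_only per_line=$per_line\n";
```

    Only the /s and /m variants match here, which is why the two flags are often combined when a pattern needs both line anchors and a dot that crosses lines.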

    Cheers,
    D.

  4. #4
    SitePoint Member
    Join Date
    Oct 2009
    Posts
    2

    Hi again,

    I need a little more help, please.

    How can I split a string such as

    rpm_max3=[0,20,x_min;3,25,min/2;6,35,max];

    I want to split it and keep two counters, one for the comma "," and one for the semicolon ";". The output should look like this:

    rpm_max3 (0,0) = (0);
    rpm_max3 (0,1) = (20);
    rpm_max3 (0,2) = (x_min);
    rpm_max3 (1,0) = (3);
    rpm_max3 (1,1) = (25);
    rpm_max3 (1,2) = (min/2);
    rpm_max3 (2,0) = (6);
    rpm_max3 (2,1) = (35);
    rpm_max3 (2,2) = (max);

    That is: $1 (counter 1, which counts semicolons, counter 2, which counts commas) = (content, which will be $ something)


    Thanks a lot for your help.

    Keep rocking,

    Arsalan

  5. #5
    SitePοint Troll disgracian's Avatar
    Join Date
    Aug 2006
    Location
    Samsara
    Posts
    451
    Split on the outermost grouping, in your case the semi-colon, and work your way inwards. You can store the results in a multi-dimensional array or some other structure of your preference.
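
    One way to read that advice, as a rough sketch: split on ";" for the rows, then on "," inside each row, counting both as you go. The helper name expand_matrix and the pattern used to pull out the name and the bracketed body are assumptions for illustration, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: expand "name=[a,b;c,d];" into one line per element.
sub expand_matrix {
    my ($decl) = @_;

    # Capture the name and the text between "[" and "]".
    my ($name, $body) = $decl =~ /^\s*([\w.]+)\s*=\s*\[(.*?)\]\s*;?\s*$/s
        or return;

    my @out;
    my $row = 0;
    for my $row_text (split /;/, $body) {          # outer grouping: rows
        my $col = 0;
        for my $cell (split /,/, $row_text) {      # inner grouping: columns
            $cell =~ s/^\s+|\s+$//g;               # trim surrounding whitespace
            push @out, "$name ($row,$col) = ($cell);";
            $col++;
        }
        $row++;
    }
    return @out;
}

print "$_\n" for expand_matrix('rpm_max3=[0,20,x_min;3,25,min/2;6,35,max];');
```

    The same nested loop could fill a two-dimensional array ($matrix[$row][$col] = $cell) instead of building strings, if the values are needed later.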

    Cheers,
    D.

  6. #6
    SitePoint Zealot Bompa's Avatar
    Join Date
    Feb 2008
    Posts
    179
    Originally posted by disgracian:
        The 's' option is actually single-line; use 'm' for multi-line.

        's' will cause the '.' character to match newlines in addition to everything else. 'm' will cause the ^ and $ anchors to match the beginning and end of lines instead of the entire string.

    The 's' flag means to treat the input string as a single line, but only with regard to the . wildcard character.

    The 'm' flag operates on the ^ and $ anchors.

    However, what if the regex does not use a . or a ^ or a $?

    One other option would be to set the input record separator, $/, to undef, so that the whole input is read as a single string:

    $/ = undef;   # slurp mode: <> now returns the whole input as one string
    # read the input and apply the regex here
    $/ = "\n";    # restore the default input record separator
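
    Applied to the question in the first post, the slurp idea might look like the sketch below. It reads from an in-memory filehandle so it runs on its own; the sample text, the variable names, and the declaration pattern are assumptions, and the pattern only handles the simple "name = value;" and "name = [ ... ];" shapes shown in the thread. Using local $/ inside a do block restores the separator automatically.

```perl
use strict;
use warnings;

# Sample input standing in for the m-script from the first post.
my $script = <<'END';
par_1.1_anything = min_value;
any_parameter = [ 0 2 3 5;
6 9 min_v_1 10;
12 15 30 35];
not_req = [ 11 13 15;
19 24 30;
31 33 39];
END

# In-memory filehandle; with a real file you would open its path instead.
open my $fh, '<', \$script or die "cannot open in-memory file: $!";
my $content = do { local $/; <$fh> };   # $/ is undef only inside this block
close $fh;

# Walk over every declaration and keep those whose value contains a letter.
my @wanted;
while ($content =~ /^([\w.]+\s*=\s*(\[[^\]]*\]|[^;\[]*);)/mg) {
    my ($decl, $value) = ($1, $2);
    push @wanted, $decl if $value =~ /[a-zA-Z]/;
}

print "$_\n---\n" for @wanted;
```

    With this sample, the single-line declaration and any_parameter are kept, while not_req is dropped because its value contains only numbers.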

    Bompa

  7. #7
    SitePοint Troll disgracian's Avatar
    Join Date
    Aug 2006
    Location
    Samsara
    Posts
    451
    Apologies for sounding blunt, Bompa, but what exactly was the point of your post? It looks like you were correcting me, but saying exactly the same thing.

    Cheers,
    D.

  8. #8
    SitePoint Zealot Bompa's Avatar
    Join Date
    Feb 2008
    Posts
    179
    Originally posted by disgracian:
        Apologies for sounding blunt, Bompa, but what exactly was the point of your post? It looks like you were correcting me, but saying exactly the same thing.

        Cheers,
        D.

    Hi Disg,

    It's ok to be blunt. I need to walk carefully when correcting others' code; usually they know more than I do.

    My point was to restate what the s and m flags do, then pose a question.

    I don't know what is in the regex for this thread since we are just using
    =~ /regular expression/ in examples, but since those two flags operate ONLY
    on the . ^ and $, I am asking what if none of those special characters are
    within the regex?

    What if we want to match "your account is active", but that phrase is split
    over several lines in the input string?

    your account<br />
    is active

    In my understanding, the s and m flags do not help here, am I off?

    Maybe I'm just having a senior moment.

    Bompa

  9. #9
    SitePοint Troll disgracian's Avatar
    Join Date
    Aug 2006
    Location
    Samsara
    Posts
    451
    If the phrase we're looking to match is separated by newlines then that's not too bad, because it's all just whitespace. You could simply specify /your\s+account\s+is\s+active/. No need for any switches at all, because \s matches any whitespace, including newlines. You could even use \W (any non-word character, i.e. anything other than a letter, digit, or underscore).
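
    A quick sketch of that claim, using a made-up input string in which the phrase is broken by a newline and extra spaces:

```perl
use strict;
use warnings;

# Made-up input: the phrase is split across two lines.
my $input = "Status: your account\nis   active\n";

# \s+ matches any run of whitespace, including "\n", so no /s or /m is needed.
my $found = $input =~ /your\s+account\s+is\s+active/ ? 1 : 0;

# A literal space in the pattern does not match the newline.
my $literal = $input =~ /your account is active/ ? 1 : 0;

print "found=$found literal=$literal\n";
```

    Note that \s+ covers whitespace only. If the input contained a literal <br /> tag between the words, as in the example in the previous post, the pattern would need to allow for that text as well.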

    Cheers,
    D.

  10. #10
    SitePoint Member
    Join Date
    Oct 2009
    Posts
    4
    The three operators:

    m//   match (the m is optional when / is the delimiter)
    tr/// transliterate (character by character; it takes character lists, not a regex)
    s///  substitute

    "Meta characters" and their meaning

    \ escapes the metacharacter that follows it
    ^ match at the beginning of the string (of a line if /m)
    $ match at the end of the string (of a line if /m)
    | logical or (alternation)
    . match any single character except newline (any character at all if /s)

    Quantifiers

    ? match 0 or 1 times
    * match 0 or more times
    + match 1 or more times

    Modifiers

    c Do not reset search position on a failed match when /g is in effect.
    g Match globally, i.e., find all occurrences.
    i Do case-insensitive pattern matching.
    m Treat string as multiple lines (^ and $ match at line boundaries).
    o Compile pattern only once.
    s Treat string as single line (. also matches newline).
    x Use extended regular expressions (whitespace and comments allowed in the pattern).

    Basic examples of matching

    # True when the variable $text contains the text 'string'.
    if ($text =~ /string/) { ... }

    This would execute if the variable $text contained

    'This is a string of text'
    'The phaser left a blastring'

    but not

    'String is an important part...'

    For this last example to match we add an i (ignore case).

    # True when $text contains 'string' in any letter case.
    if ($text =~ /string/i) { ... }

    You can also test whether a pattern doesn't match a string with

    # True when $text does NOT contain 'string'.
    if ($text !~ /string/) { ... }
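
    The three tests above can be run against the sample sentences with a short script. This is only a sketch; the array names are chosen for illustration.

```perl
use strict;
use warnings;

my @samples = (
    'This is a string of text',
    'The phaser left a blastring',
    'String is an important part...',
);

my @rows;
for my $text (@samples) {
    my $sensitive   = $text =~ /string/  ? 1 : 0;   # case-sensitive match
    my $insensitive = $text =~ /string/i ? 1 : 0;   # /i ignores letter case
    my $negated     = $text !~ /string/  ? 1 : 0;   # true when there is NO match
    push @rows, [$sensitive, $insensitive, $negated];
    print "$sensitive $insensitive $negated  $text\n";
}
```

    Only the third sentence fails the case-sensitive test, because its 'String' starts with a capital letter; it is also the only one for which the !~ test is true.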

