  1. #1
    SitePoint Enthusiast
    Join Date
    Feb 2007
    Posts
    28
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Problem with regular Expression

    Hi,

    Using a Perl regular expression, I need to read the text inside an XML tag, such as <title>this is test</title>. I used the code below for this.

    $file_content =~ m/<title.*>(.*)<\/title>/g;
    $pageTitle = $1;

    It works fine when the tag contains only text on a single line, but it does not work for the tags below.

    <title> House
    hold Water Using Products </title>

    <title> House <br/>
    hold Water Using Products </title>

    <title> <bold>House hold Water Using Products </bold></title>

    Can anyone please help me?

    Thanks in advance,
    Priya

  2. #2
    messing with my mind fristi's Avatar
    Join Date
    Feb 2009
    Posts
    292
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi,

    By default, "." doesn't match newline characters. You need to add the s modifier
    or change the expression:

    $file_content =~ m/<title.*?>([\s\S]*?)<\/title>/g;

    Untested, but it should work
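
    To see the newline behaviour in isolation, here is a minimal sketch. The sample string and variable names are made up for illustration; the only difference between the two matches is the s modifier.

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample input: a title that spans two lines.
    my $text = "<title> House\nhold Water Using Products </title>";

    # Without /s, "." stops at the newline, so the match fails.
    my $without_s = $text =~ m/<title>(.*?)<\/title>/ ? 1 : 0;

    # With /s, "." also matches the newline, so the match succeeds.
    my $with_s = $text =~ m/<title>(.*?)<\/title>/s ? 1 : 0;

    print "without /s: $without_s\n";
    print "with /s:    $with_s\n";
    ```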

  3. #3
    SitePoint Member
    Join Date
    Feb 2009
    Posts
    5
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Try with:

    $file_content =~ m/<title.*?>(.*?)<\/title>/gsi;

    s - single-line mode: "." also matches newlines
    i - ignore case

    I've added "?" after ".*" because quantifiers in Perl regexes are greedy by default.
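
    As a rough sketch of how this could be checked against the three examples from the first post: extract_title is a hypothetical helper name, and [^>]* is my own substitution for ".*?" inside the opening tag (an assumption, not part of the suggestion above), so that any attributes are skipped without running past the ">".

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: returns the raw content of the first <title>
    # element, or undef if there is none.
    sub extract_title {
        my ($content) = @_;
        # [^>]* skips any attributes on the opening tag.
        # (.*?) is non-greedy, so it stops at the first </title>.
        # /s lets "." match newlines; /i ignores case.
        return $content =~ m/<title[^>]*>(.*?)<\/title>/si ? $1 : undef;
    }

    # The three problem cases from the question.
    my @samples = (
        "<title> House\nhold Water Using Products </title>",
        "<title> House <br/>\nhold Water Using Products </title>",
        "<title> <bold>House hold Water Using Products </bold></title>",
    );

    for my $sample (@samples) {
        my $title = extract_title($sample);
        print defined $title ? "[$title]\n" : "no title found\n";
    }
    ```

    Note that the capture keeps inner tags such as <br/> and <bold> as well as the surrounding whitespace; removing those would be a separate step.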

