Results 1 to 3 of 3
  1. #1
    SitePoint Zealot
    Join Date
    Aug 2003
    Location
    everywhere
    Posts
    179

    Pattern not matching

    I seem to be having an issue with what should be a very simple regular expression. This is a small test of what I've currently got. It's basically trying to match <th>Result:</th> along with the newlines and tab that follow it. If I take the \\s* out at the end this works, but then it only matches <th>Result:</th>, which isn't what I want.

    Eventually I want to match the class name and the content within the <td>, but I currently cannot because of this issue.

    Code Java:
    String contents = "<th>Result:</th>\n\n" +
    "\t<td colspan=\"2\" class=\"valid\">\n\n" +
    "abcdef\n\n" + "</td>";
     
    Pattern p = Pattern.compile( "(?im)<th>result:</th>\\s*" );
    Matcher matcher = p.matcher( contents );
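
For the eventual goal described above (the class name plus the content of the <td>), one possible shape is a pattern with two capturing groups. This is only a sketch under assumptions: the class name ResultCellDemo and the method extract are hypothetical, the class attribute is assumed to be double-quoted, and regex-based HTML parsing is fragile beyond small, predictable snippets like this one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResultCellDemo {
    // (?i) ignores case; (?s) lets '.' cross line breaks inside the cell.
    // Group 1 = value of the class attribute, group 2 = cell content
    // with surrounding whitespace excluded.
    private static final Pattern RESULT_CELL = Pattern.compile(
            "(?is)<th>result:</th>\\s*"
            + "<td[^>]*\\bclass=\"([^\"]*)\"[^>]*>"
            + "\\s*(.*?)\\s*</td>");

    // Hypothetical helper: returns {className, content}, or null if not found.
    public static String[] extract(String html) {
        Matcher m = RESULT_CELL.matcher(html);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String contents = "<th>Result:</th>\n\n"
                + "\t<td colspan=\"2\" class=\"valid\">\n\n"
                + "abcdef\n\n" + "</td>";
        String[] cell = extract(contents);
        System.out.println(cell == null ? "no match" : cell[0] + " / " + cell[1]);
    }
}
```

The lazy (.*?) stops at the first </td>, so nested or repeated cells would need a different approach.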

  2. #2
    SitePoint Wizard rushiku
    Join Date
    Dec 2003
    Location
    A van down by the river
    Posts
    2,056
    Pattern p = Pattern.compile( "(?im)<th>result:</th>\\s*" ); // this defines the pattern

    Matcher matcher = p.matcher( contents ); // this tells the matcher which text to search; nothing has been matched yet

    matcher.find(); // this tells the matcher to find the next match. Without it, the matcher never does anything, which can be misinterpreted as the pattern not working (it isn't working, because you never told it to get to work)
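
Put together, the three steps might look like the following minimal sketch. The class name FindDemo and the helper firstMatch are made up for illustration; the pattern and the test string are the ones from the first post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Hypothetical helper: returns the matched text, or null if nothing matches.
    public static String firstMatch(String contents) {
        Pattern p = Pattern.compile("(?im)<th>result:</th>\\s*");
        Matcher matcher = p.matcher(contents);
        // Nothing is searched until find() (or matches()/lookingAt()) is called.
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String contents = "<th>Result:</th>\n\n"
                + "\t<td colspan=\"2\" class=\"valid\">\n\n"
                + "abcdef\n\n" + "</td>";
        String match = firstMatch(contents);
        System.out.println(match == null
                ? "no match"
                : "matched " + match.length() + " characters");
    }
}
```

Here matcher.group() returns the whole match, so the trailing \\s* shows up as the two newlines and the tab that follow the closing </th>.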

    I'm curious though, what is (?im) for? Didn't see it in the docs...
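
For reference, the java.util.regex.Pattern docs list (?i) and (?m) under embedded flag expressions: i turns on CASE_INSENSITIVE and m turns on MULTILINE, and the two can be combined as (?im). The sketch below (FlagDemo is a made-up class name) compares the inline form with the flags argument of Pattern.compile. MULTILINE only changes how ^ and $ behave, so it should make no difference for this particular pattern; \\s matches newlines either way.

```java
import java.util.regex.Pattern;

class FlagDemo {
    // Inline form: flags are switched on inside the pattern text itself.
    public static final Pattern INLINE =
            Pattern.compile("(?im)<th>result:</th>\\s*");

    // Same flags passed as the second argument to Pattern.compile.
    public static final Pattern WITH_FLAGS =
            Pattern.compile("<th>result:</th>\\s*",
                    Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    public static void main(String[] args) {
        String contents = "<th>Result:</th>\n\n\t<td>";
        // Both forms should find the header despite the capital R.
        System.out.println(INLINE.matcher(contents).find());
        System.out.println(WITH_FLAGS.matcher(contents).find());
    }
}
```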

  3. #3
    SitePoint Troll disgracian
    Join Date
    Aug 2006
    Location
    Samsara
    Posts
    451
    It looks a lot like a syntax error to me. If it's a lookbehind, it should be (?<=im); if it's a non-capturing group (which wouldn't make much sense in this context), it should be (?:im). If they are supposed to be literal parentheses, with the opening parenthesis being optional, then they should be escaped.

    Cheers,
    D.
