  1. #1
    SitePoint Evangelist CyberFuture's Avatar
    Join Date
    May 2001
    Location
    San Diego, CA
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Need Help with preg_match()

    I'm new to working with regular expressions, so please be patient with me. I'm trying to grab the content that is located between two tables.
    PHP Code:
    //First table
    <table bgColor=#ffffff border=0 cellspacing=1 width="100%" cellpadding=0>
    <tr bgcolor="#FFE680" vAlign=middle>
    <td align="middle" valign="top" width="12%"><b>
    <font size=2 face="Arial, Helvetica, sans-serif">Item Title</font></b><br></td></tr></font></table>

    //Stuff I want to grab
    A whole lot of tables are located here.

    //last table
    <table width="100%" border="0" cellspacing="0" cellpadding="0"><tr>
    <td align="right" height="25"><font face="Arial, Helvetica, sans-serif" size="2"><a href="#">
    top of page</a></font></td>
    </tr>
    </table>
    Here's the code:
    PHP Code:
    preg_match("/>Item Title.?</table>(.*?)<table width=\"100%\" border/", $page, $chunk);
    $page holds the result of an fread of a text file that contains all the tables. The first and last tables shown above are not the first and last tables in the text file. There are many tables before and after them.

    Regarding the first table, there is another table before it that's almost identical. The only difference is that the Item Title is in a field by itself. The other table has more text in that field, located before Item Title.

    On the last table, what makes it unique from the tables between it and the first table is that after the table width, the next attribute is the border. In the other tables it's cellpadding.

    When I run this code in a script, I get the following error:
    Warning: Unknown modifier 't' in /path/to/test.php on line 6

  2. #2
    SitePoint Zealot
    Join Date
    Mar 2002
    Location
    Perth, Australia
    Posts
    157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You need to escape the / 's in your regular expression, since a / is the delimiter of the pattern

    PHP Code:
    preg_match("/>Item Title.?<\\/table>(.*?)<table width=\"100%\" border/", $page, $chunk);
    Paul Davey
    webmaster for Whitford Church of Christ
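
    To illustrate the delimiter point, here is a minimal sketch. The `$sample` string is a made-up one-line stand-in for the real page, and the `.*?` after Item Title is used only so the example matches on its own. It compares escaping the `/` with switching to a delimiter (`#`) that doesn't occur in the pattern.

```php
<?php
// Hypothetical one-line sample, not the real page contents.
$sample = '<td>Item Title</td></table><p>wanted</p><table width="100%" border="0">';

// Option 1: keep / as the delimiter and escape the / in </table>.
$withSlash = '/Item Title.*?<\/table>(.*?)<table width="100%" border/';

// Option 2: use a delimiter that does not occur in the pattern, such as #.
$withHash = '#Item Title.*?</table>(.*?)<table width="100%" border#';

preg_match($withSlash, $sample, $a);
preg_match($withHash, $sample, $b);

// Both patterns capture the same text in group 1.
var_dump($a[1] === $b[1]); // bool(true)
```

    With an unescaped `/`, PHP treats everything after the second `/` as modifiers, which is why the warning names the `t` of `table`.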

  3. #3
    SitePoint Evangelist CyberFuture's Avatar
    Join Date
    May 2001
    Location
    San Diego, CA
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi bobbymac, thanks for your response. I tried a slight modification of your script:
    PHP Code:
    preg_match("/>Item.?<\/table>(.*?)<table width=\"100%\" border/", $page, $chunk);
    And now I get no errors, but $chunk is a completely empty array. My understanding is that $chunk[0] should be equal to $page, but even it's empty. Any ideas on what's happening?

  4. #4
    SitePoint Zealot
    Join Date
    Mar 2002
    Location
    Perth, Australia
    Posts
    157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    The . won't match any \n characters, so you could try using the /s modifier. I would keep the ? after the .* though: * already matches zero or more, and the ? makes it lazy, so the match stops at the first possible ending instead of the last.

    It's getting late down here.
    Paul Davey
    webmaster for Whitford Church of Christ
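
    A small sketch of what the /s modifier changes, using a made-up `$sample` string rather than the real page. It also shows what the `?` after `.*` does: it makes the quantifier lazy, so dropping it changes which text is captured.

```php
<?php
// Hypothetical multi-line sample with two <a>...</a> blocks.
$sample = "<a>\nfirst\n</a><a>\nsecond\n</a>";

// Without /s the dot stops at newlines, so there is no match at all.
$noDotAll = preg_match('/<a>(.*?)<\/a>/', $sample, $m1);

// With /s the dot also matches newlines; the lazy .*? stops at the first </a>.
$lazy = preg_match('/<a>(.*?)<\/a>/s', $sample, $m2);

// With /s and a greedy .* the match runs on to the last </a>.
$greedy = preg_match('/<a>(.*)<\/a>/s', $sample, $m3);

var_dump($noDotAll); // int(0)
var_dump($m2[1]);    // "\nfirst\n"
var_dump($m3[1]);    // "\nfirst\n</a><a>\nsecond\n"
```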

  5. #5
    SitePoint Evangelist CyberFuture's Avatar
    Join Date
    May 2001
    Location
    San Diego, CA
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    How do I use the /s modifier? Does it go before or after the . or somewhere else?

  6. #6
    SitePoint Zealot
    Join Date
    Mar 2002
    Location
    Perth, Australia
    Posts
    157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It goes after the last / of the pattern

    ie: preg_match("/pattern/s", $page, $chunk);

    $chunk[0] will contain the text matched by the full pattern (not $page, and not just the parentheses), so if it is empty, then nothing is getting matched.

    Can you try it on something simpler, get it working, and then just build it up?
    Paul Davey
    webmaster for Whitford Church of Christ
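
    Building the pattern up one piece at a time might look like the sketch below. The `$page` string is a hypothetical cut-down sample, and the variable names are only for illustration. Checking the return value of preg_match() at each step shows where the pattern stops matching.

```php
<?php
// Hypothetical cut-down sample standing in for the real page.
$page = "<b>Item Title</b></table>\nwanted\n<table width=\"100%\" border=\"0\">";

// Step 1: the simplest piece on its own.
$step1 = preg_match('/Item Title/', $page, $chunk1);

// Step 2: extend the pattern to the closing </table>.
$step2 = preg_match('/Item Title.*?<\/table>/s', $page, $chunk2);

// For comparison: .? matches at most one character, so it cannot
// bridge the markup between "Item Title" and "</table>".
$broken = preg_match('/Item Title.?<\/table>/s', $page, $chunk3);

var_dump($step1, $chunk1[0]); // int(1), "Item Title"
var_dump($step2, $chunk2[0]); // int(1), "Item Title</b></table>"
var_dump($broken, $chunk3);   // int(0), empty array
```

    When preg_match() returns 0, the matches array is left empty, which is consistent with the empty $chunk reported earlier in the thread.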

  7. #7
    SitePoint Evangelist CyberFuture's Avatar
    Join Date
    May 2001
    Location
    San Diego, CA
    Posts
    434
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks bobbymac for your help and the suggestion of working smaller. It helped me identify where the code was breaking down.

    OK I kinda got it working, but with one small problem. Here's the code:
    PHP Code:
    preg_match("/>Item Title.*?<\/table>(.*?)<table width=\"100%\" border/s", $page, $chunk);
    The problem is $chunk[0] starts at >Item Title..... instead of the table right after the table that Item Title is in. Strangely, $chunk[0] ends where it should, right before <table width=\"100%\" border....

    Anybody know what's going on here?

  8. #8
    SitePoint Zealot
    Join Date
    Mar 2002
    Location
    Perth, Australia
    Posts
    157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    $chunk[0] will contain everything that is matched by the pattern between the / and / characters, which in your case includes the first > . Maybe you should enclose the bit you want in parentheses, such as:

    PHP Code:
    preg_match("/>(Item Title.*?)<\\/table>(.*?)<table width=\"100%\" border/s", $page, $chunk);
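    In this case, $chunk[1] holds the text from Item Title up to the closing </table>, and the content between the two tables is in $chunk[2].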
    Paul Davey
    webmaster for Whitford Church of Christ
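
    To see which index holds what, here is a hedged sketch that runs a two-group pattern against a made-up `$page` string (the real file will differ). $chunk[0] is the whole match, and each pair of parentheses gets the next index in order.

```php
<?php
// Hypothetical sample: a first table, the wanted content, then the last table.
$page = "<td>Item Title</font></td></table>\n"
      . "<table>rows</table>\n"
      . "<table width=\"100%\" border=\"0\">";

$pattern = '/>(Item Title.*?)<\/table>(.*?)<table width="100%" border/s';
$found = preg_match($pattern, $page, $chunk);

var_dump($found);    // int(1)
var_dump($chunk[1]); // "Item Title</font></td>"
var_dump($chunk[2]); // "\n<table>rows</table>\n", the text between the two tables
```

    With a single pair of parentheses around the middle `(.*?)`, the same text would be in $chunk[1] instead.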

