  1. #1
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,777
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)

    Problem with ^/ in mod_rewrite

    This mod_rewrite is not behaving as expected...
    Code:
    #RewriteRule ([^/]+)/$ articles/index-section.php?section=$1 [L]

    When I go to this URL...
    Code:
    http://local.debbie/finance/economy/
    ...then why does section='economy'?


    The whole point of saying [^/] was to make Apache grab finance and disregard everything after the /.


    Also, is it correct that you generally want to place my GENERIC mod_rewrites before more SPECIFIC ones?


    My website has "Sections" and "Subsections", and so I figured that my mod_rewrites would go in this order...

    Code:
    # SHOW SECTION INDEX
    #-------------------------------------------------------------------------------
    #PRETTY:		finance/
    #UGLY:			articles/index-section.php?section=finance
    
    #Rewrite only if the request is not pointing to a real file.
    RewriteCond %{REQUEST_FILENAME} !-f
    
    #Match any kind of Section.  PHP will decide if it's valid or not.
    RewriteRule ([^/]+)/$ articles/index-section.php?section=$1 [L]
    
    
    # SHOW SUBSECTION INDEX
    #-------------------------------------------------------------------------------
    #PRETTY:	finance/tax-season/
    #UGLY:		articles/index-subsection.php?section=finance&subsection=tax-season
    
    #Rewrite only if the request is not pointing to a real file.
    RewriteCond %{REQUEST_FILENAME} !-f
    
    #Match any kind of Section and Subsection.  PHP will decide if it's valid or not.
    RewriteRule (.+)/(.+)/$ articles/index-subsection.php?section=$1&subsection=$2 [L]
    Sincerely,


    Debbie

  2. #2
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Join Date
    Jul 2009
    Posts
    1,278
    Mentioned
    18 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by DoubleDee View Post
    This mod_rewrite is not behaving as expected...
    Code:
    #RewriteRule ([^/]+)/$ articles/index-section.php?section=$1 [L]

    When I go to this URL...
    Code:
    http://local.debbie/finance/economy/
    ...then why does section='economy'?


    The whole point of saying [^/] was to make Apache grab finance and disregard everything after the /.
    In that case, you probably want a "beginning of string" anchor instead of the "end of string" anchor that you currently have.

    RewriteRule ^([^/]+)/ articles/index-section.php?section=$1 [L]
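
    The difference between the two anchors is easy to check outside Apache. The sketch below uses Python's re module as a stand-in for the PCRE engine behind mod_rewrite; the two patterns are the ones from this thread, while the variable names are only illustrative. It assumes the per-directory (.htaccess) form of the path, which has no leading slash.

```python
import re

# Path as mod_rewrite would see it in .htaccess context (assumption:
# no leading slash).
path = "finance/economy/"

# Original pattern: anchored only at the end, so the engine is free to
# start matching at the last path segment.
end_anchored = re.search(r"([^/]+)/$", path)

# Suggested pattern: anchored at the start, so the match has to begin
# with the first path segment.
start_anchored = re.search(r"^([^/]+)/", path)

print(end_anchored.group(1))    # economy
print(start_anchored.group(1))  # finance
```

    With only the trailing $ in place, the earliest position from which "one or more non-slashes, a slash, end of string" can succeed is the start of the last segment, which is why section came out as 'economy'.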

    Quote Originally Posted by DoubleDee View Post
    Also, is it correct that you generally want to place my GENERIC mod_rewrites before more SPECIFIC ones?
    Nope, it's the other way around: you want the specific ones first. Rewrite rules are processed in order, so if the generic ones are applied first, the specific ones might never get an opportunity to run.
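
    The ordering point can be sketched as a simplified first-match-wins model in Python. This is not how Apache is implemented: it ignores RewriteCond and the fact that per-directory rules are run again on the rewritten URL. The helper first_match and the start-anchored subsection pattern are hypothetical, made up for the illustration.

```python
import re

def first_match(rules, path):
    """Return the target of the first rule whose pattern matches, roughly
    like a list of RewriteRules that each stop processing with [L]."""
    for pattern, target in rules:
        if re.search(pattern, path):
            return target
    return None

# Hypothetical rule pair: a generic section rule and a more specific
# subsection rule (patterns assumed for this sketch).
generic = (r"^([^/]+)/", "index-section.php")
specific = (r"^([^/]+)/([^/]+)/$", "index-subsection.php")

path = "finance/tax-season/"

print(first_match([generic, specific], path))  # index-section.php
print(first_match([specific, generic], path))  # index-subsection.php
```

    With the generic rule first, the subsection URL never reaches the subsection rule. With the specific rule first, a plain "finance/" still falls through to the generic one.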
    "First make it work. Then make it better."

  3. #3
    SitePoint Wizard DoubleDee's Avatar
    Okay, Jeff, since you are the king of mod_rewrites and regexes, here is an advanced question dealing with what I truly want...

    If a URL looks like this...
    Code:
    local.debbie/finance/markets/brazil-seeks-higher-power-auction-rate-to-spur-use-of-coal
    ...then the ARTICLE mod_rewrite (1st one) should kick in.


    If the URL looks like any of these...
    Code:
    local.debbie/finance/markets/by-date/desc/3/
    local.debbie/finance/markets/by-date/desc/3
    local.debbie/finance/markets/by-date/desc/
    local.debbie/finance/markets/by-date/desc
    local.debbie/finance/markets/by-date/
    local.debbie/finance/markets/by-date
    local.debbie/finance/markets/
    ...then the SUBSECTION mod_rewrite (2nd one) should kick in.


    Why all of those combinations?

    Because if the URL doesn't point to a properly formed Article, then I want the SUBSECTION mod_rewrite to catch things, and pass it on to my "index-subsection.php" script which will either...

    a.) Display Articles sorted in the way requested

    b.) Take the malformed URL (e.g. "local.debbie/finance/markets/by-date") and redirect to a default URL (e.g. "local.debbie/finance/markets/by-date/desc/1")


    I almost have things working, but cannot figure out the ARTICLE mod_rewrite...
    Code:
    # SHOW ARTICLE
    
    #Rewrite only if the request is not pointing to a real file.
    RewriteCond %{REQUEST_FILENAME} !-f
    
    #Match any kind of Section, Subsection and Article.  PHP will decide if it's valid or not.
    RewriteRule ^([^/]+)/([^/]+)/(?:(?!by-date).)*$ articles/article.php?section=$1&subsection=$2&article=$3 [L]
    
    
    
    # SHOW SUBSECTION INDEX
    
    #Rewrite only if the request is not pointing to a real file.
    RewriteCond %{REQUEST_FILENAME} !-f
    
    #Match any Message-View, Sort-Name, Sort-Direction, Page combo.  PHP will decide if they are valid.
    RewriteRule ^([^/]+)/([^/]+)/((by-date)/?)?(([^/]+)/?)?(([^/]+)/?)?$ articles/index-subsection.php?section=$1&subsection=$2&sortname=$4&sortdir=$6&page=$8 [L]
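
    To see which groups of the SUBSECTION pattern feed sortname, sortdir and page, the pattern can be run against the URL variants listed above. This is a sketch using Python's re module as a stand-in for Apache's regex engine; the helper name subsection_params is hypothetical, and the paths are written without the host or leading slash.

```python
import re

# Subsection pattern copied from the rule above.
SUBSECTION = re.compile(
    r"^([^/]+)/([^/]+)/((by-date)/?)?(([^/]+)/?)?(([^/]+)/?)?$"
)

def subsection_params(path):
    """Return (section, subsection, sortname, sortdir, page) or None."""
    m = SUBSECTION.match(path)
    if m is None:
        return None
    # Groups 3, 5 and 7 only exist to wrap the optional trailing slash,
    # so the substitution reads groups 4, 6 and 8 instead.
    return m.group(1, 2, 4, 6, 8)

for path in (
    "finance/markets/by-date/desc/3/",
    "finance/markets/by-date/desc",
    "finance/markets/by-date",
    "finance/markets/",
):
    print(path, "->", subsection_params(path))
```

    Missing parts come back as None in Python; in the rewrite they would simply be empty strings. Note that an article slug also matches this pattern (it lands in the sortdir slot), which is why the ARTICLE rule has to be tried first.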

    Any help would be much appreciated!!

    Sincerely,


    Debbie

  4. #4
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Quote Originally Posted by DoubleDee View Post
    #Match any kind of Section, Subsection and Article. PHP will decide if it's valid or not.
    RewriteRule ^([^/]+)/([^/]+)/(?:(?!by-date).)*$ articles/article.php?section=$1&subsection=$2&article=$3 [L]
    It looks like the article name isn't getting captured. (?:) is a non-capturing group. And I changed the quantifier * to + so that there would need to be at least one character of the article name. You may need to do this:

    RewriteRule ^([^/]+)/([^/]+)/((?!by-date).+)$ articles/article.php?section=$1&subsection=$2&article=$3 [L]
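
    The capturing difference can be confirmed by counting groups. The sketch below uses Python's re module as a stand-in for Apache's regex engine; both patterns are copied from this post, and the slug "sorted-by-date" is a made-up example.

```python
import re

# Pattern from the quoted rule: the third part is a non-capturing group.
original = re.compile(r"^([^/]+)/([^/]+)/(?:(?!by-date).)*$")
# Pattern from the suggested rule: the third part is captured.
revised = re.compile(r"^([^/]+)/([^/]+)/((?!by-date).+)$")

path = "finance/markets/brazil-seeks-higher-power-auction-rate"

print(original.groups)               # 2, so there is nothing for $3
print(revised.groups)                # 3
print(revised.match(path).group(3))  # the article slug

# * versus +: only the original accepts an empty article name.
print(bool(original.match("finance/markets/")))  # True
print(bool(revised.match("finance/markets/")))   # False
```

    One side effect worth knowing: the original tested for "by-date" before every character, so it rejected that text anywhere in the third part, whereas the revised pattern only tests once, at the start. A slug such as "sorted-by-date" is rejected by the first pattern and accepted by the second.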
    "First make it work. Then make it better."

  5. #5
    SitePoint Wizard DoubleDee's Avatar
    Quote Originally Posted by Jeff Mott View Post
    It looks like the article name isn't getting captured. (?:) is a non-capturing group. And I changed the quantifier * to + so that there would need to be at least one character of the article name. You may need to do this:

    RewriteRule ^([^/]+)/([^/]+)/((?!by-date).+)$ articles/article.php?section=$1&subsection=$2&article=$3 [L]
    I came up with this one on my own, and it behaves exactly as I want...
    Code:
    RewriteRule ^([^/]+)/([^/]+)/(?:(?!(by-date|by-title)))([^/]+)$ articles/article.php?section=$1&subsection=$2&article=$4 [L]
    But is the ?: necessary?


    Debbie

  6. #6
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Nope, as written, the ?: is unnecessary. In fact, the set of parentheses the ?: is associated with is also unnecessary, as is the inner-most set of parentheses around by-date|by-title.
    "First make it work. Then make it better."

  7. #7
    SitePoint Wizard DoubleDee's Avatar
    Quote Originally Posted by Jeff Mott View Post
    Nope, as written, the ?: is unnecessary. In fact, the set of parentheses the ?: is associated with is also unnecessary, as is the inner-most set of parentheses around by-date|by-title.
    I used ?: because it is a "passive, non-capturing group" and since I don't need to capture "by-date|by-title" it seemed to be the way to go...

    Can you help me understand the difference between my code snippet...
    Code:
    	(?:(?!(by-date|by-title)))([^/]+)

    And your code...
    Code:
    ((?!by-date).+)
    Sincerely,


    Debbie

  8. #8
    SitePoint Wizard DoubleDee's Avatar
    BTW, during testing I see my mod_rewrite isn't quite right yet.

    Here is the problematic snippet...

    Code:
    (?:(?!(by-date|by-title)))([^/]+)

    What it should do is this...

    IF the 3rd part of the URL is *NOT* either "by-date" OR "by-title", THEN assume the value is an Article slug and go to "article.php". ELSE (the value is one of those two) it is a valid Sort-Name, so drop down to the Subsection mod_rewrite and ultimately go to "index-subsection.php".


    If my URL looks like this...
    Code:
    www.debbie.com/finance/economy/postage-meters-can-save-you-money
    ...then I go to "article.php" which is correct


    If my URL looks like this...
    Code:
    www.debbie.com/finance/economy/by-date
    ...then I go to "index-subsection.php" which is correct


    But if my URL looks like this...
    Code:
    www.debbie.com/finance/economy/by-date2
    ...then I go to "index-subsection.php" which is WRONG!!!


    Not sure what is wrong with my regex?!

    Sincerely,


    Debbie

  9. #9
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Quote Originally Posted by DoubleDee View Post
    I used ?: because it is a "passive, non-capturing group" and since I don't need to capture "by-date|by-title" it seemed to be the way to go...
    Basically that's right. The reason I said it was unnecessary is that your passive parentheses are wrapping something that's already wrapped in parentheses.

    before => (?:(?!(by-date|by-title)))([^/]+)

    after => (?!(by-date|by-title))([^/]+)


    Then you also have another set of parentheses -- the capturing kind -- around by-date|by-title, which is why in your substitution URL, you had to skip $3 and instead use $4.

    before => (?!(by-date|by-title))([^/]+)

    after => (?!by-date|by-title)([^/]+)


    Now there aren't any extra parentheses left.
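
    The three stages can be compared side by side. This sketch uses Python's re module as a stand-in for Apache's regex engine; the shared prefix and the labels in the dictionary are only there for the illustration.

```python
import re

# Section and subsection part shared by all three variants.
prefix = r"^([^/]+)/([^/]+)/"
variants = {
    "original": prefix + r"(?:(?!(by-date|by-title)))([^/]+)$",
    "step 1": prefix + r"(?!(by-date|by-title))([^/]+)$",
    "step 2": prefix + r"(?!by-date|by-title)([^/]+)$",
}

path = "finance/economy/postage-meters-can-save-you-money"

for name, pattern in variants.items():
    compiled = re.compile(pattern)
    m = compiled.match(path)
    # The slug is always the highest-numbered group; only its number changes.
    print(name, compiled.groups, m.group(compiled.groups))
```

    All three accept and reject the same paths. The only visible change is the group count, 4 for the first two and 3 for the last, which is why the substitution can go back to $3.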
    "First make it work. Then make it better."

  10. #10
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Quote Originally Posted by DoubleDee View Post
    But if my URL looks like this...
    Code:
    www.debbie.com/finance/economy/by-date2
    ...then I go to "index-subsection.php" which is WRONG!!!


    Not sure what is wrong with my regex?!
    Ahh, yes. We'll have to check that "by-date" is followed by either a slash or the end of the string to make sure there isn't anything else in the path segment, like a "2".

    ^([^/]+)/([^/]+)/(?!(?:by-date|by-title)(?:/|$))([^/]+)$
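
    A quick check of the old and new look-ahead against the three URLs from the question, again using Python's re module as a stand-in for Apache's regex engine. The names loose and strict are just labels for this sketch.

```python
import re

# Look-ahead without a boundary check: rejects anything starting with
# "by-date" or "by-title".
loose = re.compile(r"^([^/]+)/([^/]+)/(?!by-date|by-title)([^/]+)$")
# Look-ahead with the boundary check: rejects only the exact segment.
strict = re.compile(
    r"^([^/]+)/([^/]+)/(?!(?:by-date|by-title)(?:/|$))([^/]+)$"
)

for slug in ("postage-meters-can-save-you-money", "by-date", "by-date2"):
    path = "finance/economy/" + slug
    print(slug, bool(loose.match(path)), bool(strict.match(path)))
```

    Only "by-date2" changes between the two: the loose version refuses it, while the strict version treats it as an article slug and still refuses the exact "by-date".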
    "First make it work. Then make it better."

  11. #11
    SitePoint Wizard DoubleDee's Avatar
    Quote Originally Posted by Jeff Mott View Post
    Ahh, yes. We'll have to check that "by-date" is followed by either a slash or the end of the string to make sure there isn't anything else in the path segment, like a "2".

    ^([^/]+)/([^/]+)/(?!(?:by-date|by-title)(?:/|$))([^/]+)$
    It looks like that fixed my problem - THANKS!!


    This is some hard-core stuff!!


    If you don't mind, let me try and put your code into plain English to see if I understand what is going on...

    New mod_rewrite
    Code:
    RewriteRule ^([^/]+)/([^/]+)/(?!(?:by-date|by-title)(?:/|$))([^/]+)$ articles/article.php?section=$1&subsection=$2&article=$3 [L]

    mod_rewrite explained
    Code:
    ^			Start of Regex
    
    ([^/]+)			One or more of anything but a Slash
    
    /			Required Slash
    
    ([^/]+)			One or more of anything but a Slash
    
    /			Required Slash
    
    (?!			Forward Negative Assertion??
    			Look forward but do not capture
    			IF Not Match THEN continue, ELSE fail
    
    (?:by-date|by-title)	Forward Assertion??
    			Look forward but do not capture
    			IF Match of either THEN continue, ELSE fail
    
    (?:/|$))		Forward Assertion??
    			Look forward but do not capture
    			IF Match of either THEN continue, ELSE fail
    
    ([^/]+)			One or more of anything but a Slash
    
    $			End of Regex

    mod_rewrite explained in prose...
    Look for a 1st variable without a slash in it, then a slash, then a 2nd variable without a slash, then a slash, then IF the 3rd variable is NOT either "by-date" followed by a slash or end of string OR is NOT "by-title" followed by a slash or end of string THEN look for the 3rd variable without a slash, and if all of these conditions are true, then goto "article.php" otherwise drop through to the SUBSECTION mod_rewrite.


    Does that sound correct??

    Sincerely,


    Debbie

  12. #12
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Quote Originally Posted by DoubleDee View Post
    (?:by-date|by-title) Forward Assertion??

    ...

    (?:/|$)) Forward Assertion??
    A non-capturing group.

    Quote Originally Posted by DoubleDee View Post
    Look for a 1st variable without a slash in it, then a slash, then a 2nd variable without a slash, then a slash, then IF the 3rd variable is NOT either "by-date" followed by a slash or end of string OR is NOT "by-title" followed by a slash or end of string THEN look for the 3rd variable without a slash, and if all of these conditions are true, then goto "article.php" otherwise drop through to the SUBSECTION mod_rewrite.
    By and large, yes. The only minor difference is that "by-date" followed by a slash or end of string isn't the 3rd variable. The by-date|by-title stuff is not enclosed in capturing parentheses, so it isn't a variable at all. But the "One or more of anything but a Slash" following it is enclosed in capturing parentheses, which makes that the 3rd variable.
    "First make it work. Then make it better."

  13. #13
    SitePoint Wizard DoubleDee's Avatar
    Quote Originally Posted by Jeff Mott View Post
    A non-capturing group.
    What is the difference between a "non-capturing group" and an assertion?

    To me they sound like one and the same, and that may explain my confusion with all of this...


    Debbie

  14. #14
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Quote Originally Posted by DoubleDee View Post
    What is the difference between a "non-capturing group" and an assertion?

    To me they sound like one and the same, and that may explain my confusion with all of this...


    Debbie
    The assertions -- positive and negative look-ahead, and positive and negative look-behind -- are zero-width patterns. What that means is, they peek at the characters around them, but they don't advance the match position.

    A non-capturing group, on the other hand, matches normally. Its sole purpose is to provide grouping behavior. For example...

    abc* => matches "a" then "b" then 0 or more "c"

    (?:abc)* => matches 0 or more "abc"

    They work almost exactly like regular capturing parentheses. The only difference is that () will remember for later the contents of that group, and (?:) won't.
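
    The distinction shows up directly in where a match ends and in what gets remembered. A minimal sketch in Python's re module, which uses the same syntax for these constructs; the sample strings are arbitrary.

```python
import re

text = "abcabc"

lookahead = re.match(r"(?=abc)", text)  # assertion: peeks, consumes nothing
group = re.match(r"(?:abc)", text)      # non-capturing group: consumes
capture = re.match(r"(abc)", text)      # capturing group: consumes and remembers

print(lookahead.end())   # 0
print(group.end())       # 3
print(group.groups())    # ()
print(capture.groups())  # ('abc',)

# Grouping changes what the quantifier applies to.
print(re.match(r"abc*", "abccc").group())   # abccc
print(re.match(r"(?:abc)*", text).group())  # abcabc
```

    The look-ahead succeeds without moving past position 0, while both kinds of parentheses move the match forward by three characters; only the capturing pair leaves something behind for $1.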
    "First make it work. Then make it better."

