Complicated Regex

I’d like a bit of help with my regex below.

Here are the URL scenarios it needs to match:

  1. http://domain.tld/var1,value1/var2,value2/ passes /var1,value1/var2,value2/ into $vars as $1 ( WORKS )

  2. http://domain.tld/pageslug/ passes pageslug into $page_slug as $1 ( WORKS )

  3. http://domain.tld/pageslug/var1,value1/var2,value2/ passes pageslug into $page_slug as $1 and /var1,value1/var2,value2/ into $vars as $2 ( BROKEN )

So any help with #3 would be much appreciated.


RewriteEngine on
RewriteBase /

RewriteRule ^(.+,.+)$ index.php?vars=$1 [QSA,NC,L]
RewriteRule ^([A-Za-z0-9-]+)/*(.+,.+)*/*?$ index.php?page_slug=$1&vars=$2/ [QSA,NC,L]

Have you thought about passing the whole thing to PHP then parsing it there? I find it much simpler and easier to manage. :slight_smile:

I wish that were an option… I can’t really pass the entire query string back to PHP, since there are a bunch of other conditions that the site needs to handle, such as an AJAX call.

PLUS I’ve spent so much time on this that I’d really feel a sense of defeat to give up :frowning:

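The problem is that the 3rd URL:

http://domain.tld/pageslug/var1,value1/var2,value2/

is also matched by the first RewriteRule:

RewriteRule ^(.+,.+)$ index.php?vars=$1 [QSA,NC,L]

The first .+ is greedy and also matches slashes, so it takes “pageslug/var1,value1/var2”, then there’s a comma, and the second .+ matches “value2/”.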
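To see the overlap concretely, here is a small sketch that runs the first rule's pattern against the path of the 3rd URL. It uses Python's re module as a stand-in for the regex engine behind mod_rewrite (an assumption; the two agree on simple patterns like this one), and the path is written without the leading slash, as a rule in .htaccess would see it.

```python
import re

# Pattern copied from the first RewriteRule. The second pattern only adds
# a capture group on each side of the comma so the split becomes visible.
FIRST_RULE = re.compile(r"^(.+,.+)$")
FIRST_RULE_SPLIT = re.compile(r"^(.+),(.+)$")

path = "pageslug/var1,value1/var2,value2/"

whole = FIRST_RULE.match(path)
split = FIRST_RULE_SPLIT.match(path)

# The rule matches, so the whole path (page slug included) ends up in vars.
print(whole.group(1))
# The two sides of the comma that the greedy engine settled on.
print(split.groups())
```

Wherever the engine places the comma, the rule as a whole matches, so the second rule never gets a chance to see this URL.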

The trick is to make the first rule more specific, such that only atoms containing a comma can come through.

PS. Are you using Redirect, RedirectMatch or any other mod_alias directives? If not, ditch the RewriteBase /, you don’t need it.

Thank you “ScallioXTX”…

So how can I exclude “/” in the first expression below…? I know I can do a range but not sure how to exclude / but require (alpha char),(alpha char)


RewriteRule ^(.+,.+)$ index.php?vars=$1 [QSA,NC,L]

You don’t need to exclude the slash, you need to make a different pattern altogether.

I’ll say it in words, you make the regex okay? :slight_smile:

Match any character one or more times, then match a comma, then match any character one or more times, then optionally match a slash, and match all that one or more times

Does that help?
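Taken literally, that description can be sketched as the pattern below, checked here with Python's re module as a stand-in for mod_rewrite's regex engine (an assumption). Note that “any character” also covers the slash, so a literal translation still accepts the 3rd URL; rule order or a narrower character class has to deal with that.

```python
import re

# "any character one or more times" -> .+
# "a comma"                          -> ,
# "optionally a slash"               -> /?
# "all that one or more times"       -> ( ... )+
LITERAL = re.compile(r"^((.+,.+/?)+)$")

paths = (
    "var1,value1/var2,value2/",           # scenario 1
    "pageslug/",                          # scenario 2
    "pageslug/var1,value1/var2,value2/",  # scenario 3
)

for path in paths:
    print(path, "->", bool(LITERAL.match(path)))
```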

QnD,

If you’ll check Tim Berners-Lee, et al’s geeky (BNF) description of Uniform Resource Identifiers (URI): Generic Syntax at http://www.ietf.org/rfc/rfc2396.txt, you’ll see (under section 2.2, Reserved Characters) that the comma is a Reserved Character and, thus, should not be used in a URL (at least not without special handling).

Regards,

DK

For anyone that found this useful…

The solution was


RewriteRule ^([A-Za-z0-9-]+)/((.+),(.+)/*)+$ index.php?page_slug=$1&vars=$2/ [QSA,NC,L]
RewriteRule ^((.+),(.+)/*)+$ index.php?vars=$1 [QSA,NC,L]

@dklynn

I’ve used this scheme for URLs for quite a few years on commercial apps without incident, and have also seen quite a few other commercial sites do the same. I’m not saying you aren’t technically correct, but I have yet to see where it actually causes a problem. I’d love to see a situation where it would cause an issue.

Your solution is not able to match a pageslug only, so it doesn’t adhere to your own second requirement.
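A quick way to check this is to run the two posted patterns against the three scenarios. This is only a sketch: it uses Python's re module in place of Apache (an assumption), paths are written without the leading slash, and first_match is a hypothetical helper that mimics the [L] flag by stopping at the first rule that matches.

```python
import re

# The two patterns from the posted solution, in the same order.
SOLUTION = [
    ("page_slug + vars", re.compile(r"^([A-Za-z0-9-]+)/((.+),(.+)/*)+$")),
    ("vars only", re.compile(r"^((.+),(.+)/*)+$")),
]

def first_match(path):
    """Return the label of the first rule that matches, or None."""
    for label, pattern in SOLUTION:
        if pattern.match(path):
            return label
    return None

paths = (
    "var1,value1/var2,value2/",           # scenario 1
    "pageslug/",                          # scenario 2
    "pageslug/var1,value1/var2,value2/",  # scenario 3
)

for path in paths:
    print(path, "->", first_match(path))
```

Both patterns require at least one comma, which is why a bare page slug falls through.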

Also, if you want to match something optionally, use ? instead of *

? = zero or one time

* = zero or more times

That is, with /* the following is also possible: ///////

With /? there either is a slash, or there isn’t.
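The difference between the two quantifiers is easy to try out. The sketch below uses Python's re module rather than Apache (an assumption), and the pageslug prefix is just an example string.

```python
import re

ZERO_OR_MORE = re.compile(r"^pageslug/*$")  # any number of trailing slashes
ZERO_OR_ONE = re.compile(r"^pageslug/?$")   # at most one trailing slash

for path in ("pageslug", "pageslug/", "pageslug///////"):
    print(path,
          "| /* matches:", bool(ZERO_OR_MORE.match(path)),
          "| /? matches:", bool(ZERO_OR_ONE.match(path)))
```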

Lastly, why not replace those .+ with [A-Za-z0-9-]+ ?
That’s what I actually indicated in my first post in this thread.
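Putting those suggestions together, one possible shape for the two rules is sketched below. The patterns are an assumption based on the advice above, not rules anyone in the thread posted, and they are exercised with Python's re module rather than Apache itself. The helper name route and the PAIR constant are hypothetical.

```python
import re

# One "name,value" pair, with no slash or comma allowed inside either part.
PAIR = r"[A-Za-z0-9-]+,[A-Za-z0-9-]+"

# Scenario 1: one or more pairs, each optionally followed by a single slash.
VARS_ONLY = re.compile(rf"^((?:{PAIR}/?)+)$")

# Scenarios 2 and 3: a page slug, optionally followed by a slash and pairs.
SLUG = re.compile(rf"^([A-Za-z0-9-]+)(?:/((?:{PAIR}/?)*))?$")

def route(path):
    """Mimic the two rules in order; return the query values or None."""
    m = VARS_ONLY.match(path)
    if m:
        return {"vars": m.group(1)}
    m = SLUG.match(path)
    if m:
        # group(2) is None when the slug has no trailing slash at all.
        return {"page_slug": m.group(1), "vars": m.group(2) or ""}
    return None

for path in ("var1,value1/var2,value2/",
             "pageslug/",
             "pageslug/var1,value1/var2,value2/",
             "pageslug///////"):
    print(path, "->", route(path))
```

Because the pair pattern cannot contain a slash, the vars-only rule no longer swallows URLs that start with a page slug, so the order of the two rules stops mattering. The same patterns would still need to be tried in an actual RewriteRule before relying on them.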