.htaccess redirection of directories

I have a client’s Joomla site that I’m reworking. Previously, the link structure was this:

site.com/section/category/article.html

We decided to change this to just /category/article.html

In order for the old links to still work (links in email newsletters, etc., that used the old format), I created the following RewriteRule:

RewriteRule ^section/category/(.*)$ http://www.site.com/category/$1 [R=301,L]

The problem is that some of the sections/categories have the same name. Example:

The section is named Anesthesia, the category is named Anesthesia, and then the article starts out with the word Anesthesia.

RewriteRule ^anesthesia/anesthesia/(.*)$ http://www.site.com/anesthesia/$1 [R=301,L]

The problem arises if you go to a link like this:
site.com/anesthesia/anesthesia-modes-under-attack.html

It thinks that the anesthesia in the article name is actually part of the section/category rule and ends up endlessly looping. I’m guessing I just structured the rule incorrectly. If you access the link in the old style, say from an old email newsletter sent before we revamped the structure (site.com/anesthesia/anesthesia/anesthesia-modes-under-attack.html), it resolves. But if you try it using the new link structure, which is what the site uses now (site.com/anesthesia/anesthesia-modes-under-attack.html), it is unable to resolve and spits out an error.
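One thing worth checking: as posted, the pattern requires a slash right after the second "anesthesia", so it should not match anesthesia/anesthesia-modes-under-attack.html at all. If the rule actually in the file is looser than the one shown here, the new-style URL could be caught as well. A minimal sketch of a stricter version, assuming the .htaccess file sits in the document root and that article aliases never contain a slash (both are assumptions, not something confirmed in this thread):

```apache
RewriteEngine On

# Old format only: the pattern demands "anesthesia/anesthesia/" including the
# trailing slash, and [^/]+ limits the article name to a single path segment.
# A new-style path such as anesthesia/anesthesia-modes-under-attack.html has a
# hyphen where the slash is required, so this rule leaves it alone.
RewriteRule ^anesthesia/anesthesia/([^/]+\.html)$ http://www.site.com/anesthesia/$1 [R=301,L]
```

This would need to sit above Joomla’s own rewrite block so the redirect is issued before Joomla sees the request.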

For now I’ve resorted to putting an extra letter or digit in the article URL alias (1anesthesia-modes-under-attack.html) but I’d like to, if possible, create a rule or rules that works for everything.

Any help with this would be appreciated.

alex,

Did you try my code (BEFORE Joomla’s code)? That should have done the trick for you.

One other possibility is to use your 404 script to READ {THE_REQUEST} to determine the original URI then use the POWER of PHP to make the redirection for you. It’s a “rich man’s mod_rewrite (RewriteMap)” but that could work, too.

Regards,

DK

Hey David,

Again, thanks for your response. So here’s the deal. They’ve been using SH404SEF to create their URLs. It has the option to show both section and category in the URL, which is what they did for a long time. However, 90% of their sections and categories are the same (anesthesia/anesthesia, endoscopy/endoscopy), which creates redundancy. The people who created this site had no real idea how to create a clean section/category structure and it just became a rat’s nest. We trimmed down the sections to just three and moved all the old categories into these three new buckets. We don’t really have a need to show the new sections in the URL since they serve no real purpose other than internal organization. Functionally, keeping the section in there doesn’t matter other than making a longer URL, but they wanted the URLs to be prettier, shorter and hopefully easier to remember. Not a big deal really, we just changed the setting to only show categories in the URL.

The problem is that before the switch we had 10,000+ articles with SEF links that used the old structure {section}/{category}/article.html and lots of bookmarks that pointed to those articles using that old structure. None of those links resolved once we updated. They just spat out 404 errors, so I figured creating a few RewriteRules would make it easy to point the old link structure to the new one.

So that, in a nutshell, is why I’m doing this. The problem arose with how I wrote the rules and my limited knowledge of mod_rewrite.

Your assumptions are correct on the syntax of the directories using lowercase and dashes. I’ll read your tutorial and try this out.

Thanks again.

Alex

alex,

I wrote the tutorial linked in my signature after a year or so of answering the same questions time and again. However, it’s been said to be a GREAT help to mod_rewrite noobies so go have a read.

Okay, you’re merely dropping the {section} pseudo directory out of the URI. I’d ask WHY (because that will upset Joomla - and all other CMS’s) but that wouldn’t help, would it?

Assuming that the {section} was not used in the redirect OR by Joomla to pick the correct db record, then it becomes simple: Test for not file and not directory, then redirect if there is more than one pseudo directory in the path (NOTE: If you can say EXACTLY what you want, it’s easy to convert that to regex code.)

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z]+)/([a-z]+)/([-a-z]+)\.html$ $2/$3.html [R=301,L]

Please note that I’ve assumed that your pseudo directories are ONLY lowercase characters and that the “title” is lowercase, too, with -'s replacing spaces (I prefer _'s as -'s can be in a title while you’ll not see _'s in any text).
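The thread elsewhere mentions a section named news-analysis, which contains a hyphen and so would not match [a-z]+. A hedged variant of the same rule that also allows hyphens and digits in every segment is sketched below. The widened character classes and the leading slash on the substitution are assumptions added here, not part of the rule as originally posted; the leading slash is meant to keep the redirect target independent of any RewriteBase setting.

```apache
RewriteEngine On

# Leave requests for real files and directories untouched.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# $1 = section (dropped), $2 = category, $3 = article alias.
# Exactly three segments are required, so a two-segment new-style path
# such as anesthesia/anesthesia-modes-under-attack.html is never redirected.
RewriteRule ^([-a-z0-9]+)/([-a-z0-9]+)/([-a-z0-9]+)\.html$ /$2/$3.html [R=301,L]
```

Because the pattern counts path segments rather than naming each section, one rule should cover every section/category pair, provided no legitimate three-segment .html URLs remain on the site.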

Thanks, I assumed that the section and category were variables - but it does help to have that confirmed.

Regards,

DK

Hey DK,

Sorry – you’ve lost me on this. I’m a bit of an Apache noob. Just gleaned what little I know from Googling.

I’ve determined my revised format. It’s going from /section/category/article.html to just /category/article.html. The problem is there are a TON of links that still use the old format, which is why I needed to do the RewriteRules in the first place: I didn’t want bookmarks to old articles to stop working.

All the links generated by Joomla now output as just /category/article.html. What seems to be the issue is when the section/category had the same name and then the article starts with the same name as the section and category. Other stories that don’t start with the word “Anesthesia” work fine in either the section/category format or just the category format.

Update:
Sorry, the first RewriteRule was an example. I don’t actually use RewriteRule ^section/category/(.*)$ http://www.site.com/category/$1 [R=301,L]. That was just an example of the structure. In reality I have a number of different section/category pairs (news-analysis/latest, for example). Maybe I should’ve put quotes around section/category because those aren’t the names of actual directories.

alexlr,

First, WELCOME to SitePoint’s Apache forum!

Now, what you need to do is determine a revised format for your links then stick with it. What you’ve shown is TWO formats (which must be handled by different mod_rewrite block statements). If you insist on a single (block) statement, then use an optional atom to allow for an optional directory in your path. After all, it’s all about regex.

Regards,

DK