How to redirect based on url content?

Hi,

I'm planning a site rewrite (from static HTML to WordPress) and a host move for an existing site.

I need to redirect most urls from the ‘old’ site to an equivalent on the ‘new’ site e.g.

old: www.olddomain.com/training/variablepage.html

redirect to

new: http://newdomain/training

so, I need to redirect anything containing /training/ AND .html in the url to the new domain/page for training

a few other pages I need to do this with but the principle is the same … pretty much boils down to:

figure out the part between the slashes after the www.olddomain.com
and, if the page name ends in .html
then redirect to a pre-defined page on the new site.

Now I’m a complete dope with this stuff so, can anyone help?

Thanks.

A good point to start is to read this article as that contains everything you need.

If you have any specific questions after that feel free to ask :slight_smile:

Hi,

Thanks for the links … spent some time reading through them and come up with the following:


# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress


<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^(training)/([a-zA-Z_]+).html$ http://mydomain.com/category/training [L]
</IfModule>

Unfortunately, the stuff I added - after #END Wordpress - that I hoped would redirect any url which contains training/anypage.html to mydomain.com/category/training doesn’t work as I hoped.

I tried typing www.mydomain/training/newpage.html (a page which doesn’t exist on my site) and hoped it would redirect to the specified page … but …
I get the page not found message.

Anyone tell me what I’ve done wrong?

wf,

Oh, where to begin?

Okay, the first thing deserves a Standard Rant (not personal - I just have a script which writes this BECAUSE I’ve seen this nonsense too many times - and TWICE in your code!):

[standard rant #4][indent]The definition of an idiot is someone who repeatedly does the same thing expecting a different result. Asking Apache to confirm the existence of ANY module with an <IfModule> … </IfModule> wrapper is the same thing in the webmaster world. DON’T BE AN IDIOT! If you don’t know whether a module is enabled, run the test ONCE then REMOVE the wrapper as it is EXTREMELY wasteful of Apache’s resources (and should NEVER be allowed on a shared server).[/indent][/standard rant #4]

Next, doesn’t index.php exist as a file? If so, then the WP code doesn’t need the RewriteRule which matches index.php.

If you have an Alias (mod_alias) directive, the RewriteBase is designed to accommodate that. You don’t, so you don’t need the RewriteBase directive, either.

Finally, the reason your code NEVER gets matched is that the WP code is designed to redirect EVERYTHING … well, at least one character but NOT the empty URI (link to domain without file specified - an error in the WP code).

[indent][aside]All your RewriteRule does is add the category subdirectory and remove the slash and filename which follows training - true? Why not give Apache a break and let it know that training is still a directory (add a trailing slash, preferably with the DirectoryIndex file, too)?

OR is this supposed to be “caught” by WP and redirected to index.php? If that’s the case, remove the pink code below.[/aside][/indent]

Anyway, move that RewriteRule BEFORE the WP block then PLEASE realize that WP will STILL redirect training - with or without the /{DirectoryIndex} to WP’s index because you NEED the old !index\.php to be !^category/training so that WP won’t steal your training link(s). Try the following (AFTER removing the red code and adding the blue):

[COLOR="RoyalBlue"]RewriteEngine on
RewriteRule ^training/[a-zA-Z_]+\.html$ category/training [L][/COLOR]
# Escape the dot character if you mean that it MUST be the dot character
# This didn't seem to need absolute redirection
# Is training a file?  Directory?  If directory, 
#   you should add a / and the name of the DirectoryIndex file

# BEGIN WordPress
[COLOR="Red"]<IfModule mod_rewrite.c>
# Repeatedly tests server - slows it significantly!
RewriteEngine On
# Already on from above
RewriteBase /[/COLOR]
[COLOR="Magenta"]RewriteCond %{REQUEST_URI} !^/category/training[/COLOR]
# See aside's question above
# prevents "hijacking" of category/training by WP
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .[COLOR="RoyalBlue"]?[/COLOR] [COLOR="Red"]/[/COLOR]index.php [L]
# makes single character optional so will function for domain-only request
# removed leading / in redirection to prevent Apache from looking to root first
[COLOR="Red"]</IfModule>[/COLOR]
# repeatedly tests server - slows Apache considerably for no reason

# END WordPress

[COLOR="Red"]<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^(training)/([a-zA-Z_]+).html$ http://mydomain.com/category/training [L]
</IfModule>[/COLOR]
# moved

IMHO, another read of the tutorial Article to pick up the details of the changes I’ve made to your code above might be helpful.

Regards,

DK

Looks like you’re on the right track, but there’s still quite a lot to be done.

Let's start with the WordPress code:


# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

This code is as unoptimized as can be.
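First of all, RewriteBase tells mod_rewrite which URL prefix to use for relative substitutions in .htaccess files, which mainly matters when the directory is reached through something like an Alias directive from mod_alias. Since you don’t have any of those, drop the “RewriteBase /” line – you don’t need it.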

Secondly, remove the leading / in /index.php in the last RewriteRule. With that slash in place Apache will first look at the physical hard drive root to find the file only to find out it isn’t there and then look in the website. Removing the / in front eliminates this extra step, making the rule faster.

Third, the line “RewriteRule ^index\.php$ - [L]” is complete and utter nonsense. Drop it.

Last but not least, get rid of the <IfModule mod_rewrite.c></IfModule>
Once you’ve assured that mod_rewrite is enabled, it makes no sense whatsoever to ask Apache about it each and every time – it puts an unnecessary strain on Apache.

All in all, that first part should be:


# BEGIN WordPress
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]

Now, as for your own code:


<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^(training)/([a-zA-Z_]+).html$ http://mydomain.com/category/training [L]
</IfModule>

Also remove the <IfModule mod_rewrite.c>…</IfModule> wrapper. RewriteEngine was already enabled in the WordPress section, so there’s no need to enable it again; you can remove that line too.

The rule in itself looks pretty okay, but you don’t need the parentheses you’ve put in there. Parentheses are used to create backreferences which can later be used in the right hand side of the rule. Since you don’t use them, creating backreferences adds unnecessary overhead.
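
For contrast, here is a minimal sketch of a case where the parentheses would earn their keep. It assumes several old sections map to same-named categories on the new site; the section name "courses" is invented for illustration and does not come from the original post.

```apache
# Hypothetical example: "courses" is an invented second section name.
# The group (training|courses) is captured and reused as $1 on the right-hand side,
# so training/foo.html goes to category/training and courses/foo.html to category/courses.
RewriteRule ^(training|courses)/[a-zA-Z_]+\.html$ http://mydomain.com/category/$1 [L]
```

If every old section ends up on one fixed page, the capture is not needed and the plain pattern without parentheses is enough.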

Also, the . has a special meaning in regular expressions, namely “any character”. Since you want to literally match a dot, you need to escape it (put a \ in front of it).

So, that would be:


RewriteRule ^training/[a-zA-Z_]+\.html$ http://mydomain.com/category/training [L]
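
To make the effect of escaping the dot concrete, the sketch below lists how the two patterns treat a few request paths. The paths are made up for illustration.

```apache
# Unescaped dot: "." stands for any single character.
#   training/intro.html   -> matches
#   training/introxhtml   -> also matches (probably not intended)
# RewriteRule ^training/[a-zA-Z_]+.html$ http://mydomain.com/category/training [L]

# Escaped dot: "\." only matches a literal dot.
#   training/intro.html   -> matches
#   training/introxhtml   -> no match
#   training/intro-2.html -> no match, because "-" and digits are outside [a-zA-Z_]
RewriteRule ^training/[a-zA-Z_]+\.html$ http://mydomain.com/category/training [L]
```

The last case is worth checking against the real page names on the old site: if any contain digits or hyphens, the character class would need widening.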

Okay, so now we have:


# BEGIN WordPress
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
RewriteRule ^training/[a-zA-Z_]+\.html$ http://mydomain.com/category/training [L]

The thing is, this still won’t work. The problem is that you need to take the order in which rules are fired into account.
If I now request /training/anything.html, the first RewriteRule (“RewriteRule . index.php [L]”) will fire (assuming there is no actual file by that name), redirecting the request to index.php and the second RewriteRule will never even have a chance to fire.

We could say that the second RewriteRule is more specific and the first RewriteRule is an “if all else fails” kind of thing.

The solution is to swap the rules:


RewriteEngine On

# Redirect training/anything.html to category/training
RewriteRule ^training/[a-zA-Z_]+\.html$ http://mydomain.com/category/training [L]

# Redirect everything to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]

What this now does is first rewrite training/anything.html to category/training and then a new rewrite round will start (due to the [L] flag). In that round the first RewriteRule doesn’t match, but the second one does, forwarding the new request to index.php, which is exactly what you want :slight_smile:

Does that make sense?
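
Since the original question was about moving from an old domain to a new one, here is a hedged variant for the old host's .htaccess. Adding the R=301 flag makes mod_rewrite send a permanent external redirect to the browser. The host name newdomain.example is a placeholder, and the looser pattern [^/]+ is an assumption: it also accepts digits and hyphens in page names, which may or may not be wanted.

```apache
RewriteEngine On

# Placeholder target host; substitute the real new domain.
# [^/]+ matches any page name without a slash (assumption: all .html pages
# directly under training/ should be redirected).
# R=301 sends a permanent redirect; L stops further rule processing for this request.
RewriteRule ^training/[^/]+\.html$ http://newdomain.example/category/training [R=301,L]
```

A 301 tells browsers and search engines that the move is permanent, so it is usually worth testing with a temporary R=302 first, because browsers cache 301 responses aggressively.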

DK - thanks for putting your exasperation aside and taking the time and effort to reply in such a detailed manner.

ScallioXTX - Thanks for your input. Your solution works in exactly the way I need.