External 301 redirect when folder doesn't exist

I have the following in my .htaccess file:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(accommodation\.php)?\?id=([0-9]+)(#[^\ ]*)?\ HTTP/
RewriteRule ^(accommodation\.php)?$ http://www.example.com/%2? [R=301,L]

RewriteRule ^([0-9]+)$ accommodation.php?id=$1 [L]

The above works fine and produces the desired result (using id 89 as an example):
http://www.example.com/89 instead of the old http://www.example.com/accommodation.php?id=89
and the browser redirects work fine too!

Rather than just having the URL say http://www.example.com/89, we want to insert the type of property into the URL, something like: http://www.example.com/Apartment/89

NEW REWRITE RULE:
The new rewrite rule has been changed to:
RewriteRule ^([A-Z][a-z]+)/([0-9]+)$ accommodation.php?accommodation_type=$1&id=$2 [L]

The accommodation type comes from the ‘accommodation_type’ field in the database.
The new rewrite rule above works fine for internal links within the site, returning: http://www.example.com/Apartment/89
or
http://www.example.com/House/89
or
whatever the accommodation_type is in the database.

The PHP code for links within the site is like this: <a href='".$row["accommodation_type"]."/".$row["id"]."'>

But I can not get the external 301 redirect part - this…

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(accommodation\.php)?\?id=([0-9]+)(#[^\ ]*)?\ HTTP/
RewriteRule ^(accommodation\.php)?$ http://www.example.com/%2? [R=301,L]

…to work!

It seems as though something needs to be added between example.com/ and %2? in the redirect target.
Is Apache seeing the ‘accommodation_type’ as a folder that doesn’t exist?

Thanks!



@zorro20 see http://datakoncepts.com/seo#example-12, “Redirect TO New Format”

David,

The solution outlined in this post also doesn’t loop. I’ll give a minimal example to illustrate. Instead of “apartment” I’ll use “house” as it’s easier to type :slight_smile:

The goal

  1. Internally rewrite /house-(\d+) to /house.php?id=$1
  2. 301 redirect /house.php?id=(\d+) to /house-$1

The .htaccess


RewriteEngine On
# Rule number 1
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /house\.php\?id=(\d+)
RewriteRule ^house\.php$ http://www.example.com/house-%2? [R=301,L]
# Rule number 2
RewriteRule ^house-(\d+) /house.php?id=$1

Verification

Case 1. Requesting /house-1
The request for /house-1 does not match rule number 1, since house.php is not in {THE_REQUEST}, but it does match rule number 2, so it is internally rewritten to /house.php?id=1
In the next iteration rule number 1 still doesn’t match: {THE_REQUEST} is still the original request line for /house-1, so /house.php is not in it. Rule number 2 doesn’t match here either, because the {REQUEST_URI} is now /house.php and no longer /house-1
Since neither rule matches, the rewriting is done and /house.php?id=1 is served under the URL /house-1

Case 2. Requesting /house.php?id=1
This request does match the first rule, so the browser is redirected to /house-1 and from there this case is the same as case 1.

So, both directions work, because of the use of {THE_REQUEST}
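
To make the two cases above easier to follow, here is a small Python sketch that imitates the two rules. It is only a model of the reasoning, not of Apache itself: the `process` function and the pattern names are made up for illustration, and the loop simply re-runs the rules after an internal rewrite, the way the walkthrough describes.

```python
import re

# Rule 1: external redirect, tested against the original request line.
RULE1_COND = re.compile(r"^(GET|POST|HEAD) /house\.php\?id=(\d+)")
RULE1_PATH = re.compile(r"^house\.php$")
# Rule 2: internal rewrite, tested against the current (possibly rewritten) path.
RULE2_PATH = re.compile(r"^house-(\d+)")


def process(request_line):
    """Return ("redirect", url) or ("serve", path_with_query)."""
    _method, target, _protocol = request_line.split(" ")
    path, _, query = target.partition("?")
    path = path.lstrip("/")
    # Re-run the rules after every internal rewrite; the range is only a
    # guard against endless loops in this sketch.
    for _ in range(10):
        cond = RULE1_COND.search(request_line)  # the request line never changes
        if cond and RULE1_PATH.search(path):
            return ("redirect", "/house-" + cond.group(2))
        rewrite = RULE2_PATH.search(path)
        if rewrite:
            path, query = "house.php", "id=" + rewrite.group(1)
            continue  # rewritten internally: run the rules again
        return ("serve", "/" + path + ("?" + query if query else ""))
    raise RuntimeError("rewrite loop")
```

Calling `process("GET /house-1 HTTP/1.1")` follows case 1 and ends in a "serve" result, while `process("GET /house.php?id=1 HTTP/1.1")` follows case 2 and ends in a "redirect" result.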

:slight_smile:

Remon,

{THE_REQUEST} does NOT change ({REQUEST_URI} changes with the redirection). Think about it again 'cause #1 will not redirect on house-1 but, because {THE_REQUEST} does not change, it’ll not redirect on house.php (as the {REQUEST_URI} either).

Regards,

DK

Thanks again Scallio!

Your .htaccess code is fine as it is.
As I understand it, DK was under the impression that you wanted to rewrite /accommodation-(number) to /accommodation.php?id=(number), whereas you want it the other way around: remove the /accommodation.php?id=(number) URLs from Google and replace them with /accommodation-(number) links using 301 redirects :slight_smile:

True. Once you’re there, though, how are you going to serve acc-no? This is the same question that a member posed years ago: I initially told her that she couldn’t LOOP around and serve a script she was redirecting away from but came up with a solution for her (it’s in my signature’s tutorial).

Regards,

DK

What’s happening Scallio? Am I right to leave the .htaccess code as it is? Not really sure what DK means!

David,

:slight_smile:

{THE_REQUEST} changes when you do a 301 redirect.
At least it does on my server :smiley:
I’ve implemented the .htaccess above on my intranet website as follows:


RewriteEngine On

RewriteCond %{THE_REQUEST} ^(GET|POST)\ /index\.php\?module=([\w\-]+)
RewriteRule .? /%2? [L,R=301]

RewriteRule ^password-tools$ /index.php?module=password-tools [L]

And it works like a charm.
Requesting /index.php?module=password-tools redirects to /password-tools and shows the correct page
Requesting /password-tools also works

(Apache 2.2.1 @ Win7)

z20,

Wait a minute? Do you have the cart in front of the horse? YOU must create the links the way that you want them to be viewed then redirect to something which can be served by Apache.

Saying that, all you need is your one liner:

RewriteRule ^[a-z]+-([0-9]+)$ accommodation.php?id=$1 [L]

Anything else is simply overkill and can only destroy the effect you’re after. If this is placed before a CMS block of mod_rewrite, that block should contain the -f exclusion so that’s the end of it.

BTW, why the insistence on using {THE_REQUEST}?

Regards,

DK

Thanks for that Scallio!
Yes it’s a shame the (5|32|65) bit wouldn’t work as it is less code.
Thanks again for the reply info.

Hi, thanks for that Scallio but it doesn’t work - just tried it! Using this…

RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ accomodation.php\?id=(5|32|65)
RewriteRule .? /house-%2? [L,R=301]

(used all three id numbers as they are all houses)

Still using my old code at the moment although I have changed the A-Z part to GET POST HEAD…

RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /(accommodation\.php)?\?id=5\ HTTP/
RewriteRule ^accommodation\.php$ http://www.example.com/house-5? [R=301,L]

Just out of curiosity, what’s the difference between using [A-Z]+ and (GET|POST|HEAD)?
Also, what’s the difference between using {THE_REQUEST} and, say, {REQUEST_URI} or {REQUEST_FILENAME}, and why wouldn’t the latter two work for me?

Wow, that was a tough one!

I found out the following works:


RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /accommodation\.php\?id=(5)\ HTTP/
RewriteRule .? /apartment-%2? [L,R=301]

RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /accommodation\.php\?id=(32)\ HTTP/
RewriteRule .? /house-%2? [L,R=301]

RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /accommodation\.php\?id=(65)\ HTTP/
RewriteRule .? /house-%2? [L,R=301]

You need the ? at the end of e.g. “/house-%2?” to remove the query string. Otherwise you end up redirecting to e.g. /house-32?id=32

And I changed [A-Z]+ to (GET|POST|HEAD) because other request methods don’t make a lot of sense here :slight_smile:

%2 is the id (5, or 32, or 65) matched in THE_REQUEST. The cool thing about doing it this way is that you can combine rules this way:


RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /accommodation\.php\?id=(32|65)\ HTTP/
RewriteRule .? /house-%2? [L,R=301]

This works for both accommodation.php?id=32 and accommodation.php?id=65
Just add them all together separated by | : 1|2|3|4|5 etc
Don’t start with a | , and don’t end with a |, only put the | between numbers.
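
As a rough check of how the alternation and the %2 back-reference behave, here is a Python sketch of the same pattern. The `redirect_target` helper is hypothetical, and Python's `re` module stands in for Apache's regex engine, which is an assumption that holds for a pattern this simple. The pattern ends with ` HTTP/` so that an id such as 321 is not mistaken for 32.

```python
import re

# Same shape as the RewriteCond pattern: method, space, path, query, protocol.
COND = re.compile(r"^(GET|POST|HEAD) /accommodation\.php\?id=(32|65) HTTP/")


def redirect_target(request_line):
    """Return the friendly URL to redirect to, or None if the rule does not apply."""
    match = COND.search(request_line)
    if match is None:
        return None
    # group(2) plays the role of %2; the trailing "?" in the RewriteRule
    # corresponds to dropping the query string from the target here.
    return "/house-" + match.group(2)
```

With this, `redirect_target("GET /accommodation.php?id=65 HTTP/1.1")` gives the house URL for 65, while a request for id=5 or id=321 gives `None`.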

Hope that helps! :slight_smile:

Hi there just to clarify things a little I am trying to write friendly urls.

The current urls (and the ones indexed by google) look like this…
http://www.example.com/accommodation.php?id=5 (5 being an example)

I was trying to achieve…
http://www.example.com/apartment-5
or
http://www.example.com/house-5 (whatever the property type was in the database)

I have achieved this internally within the site by changing all references to:
/accommodation.php?id=(accommodation row id in database)
to…
/(accommodation_type in database)-(accommodation row id in database)

And using this internal RewriteRule:
RewriteRule ^[a-z]+-([0-9]+)$ accommodation.php?id=$1 [L]
Which produces the desired results.
The browser address bar says…
http://www.example.com/(accommodation_type)-(accom number) and returns the right page.

Having got everything working OK internally, my main concern was getting penalized for duplicate content. For example, I thought that Google may see:
http://www.example.com/accommodation.php?id=5
and
http://www.example.com/apartment-5
as two copies of the same page (duplicate content).

As there are only 3 properties at the moment in this section I am working on I have done the following:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(accommodation\.php)?\?id=5\ HTTP/
RewriteRule ^accommodation\.php$ http://www.example.com/apartment-5? [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(accommodation\.php)?\?id=32\ HTTP/
RewriteRule ^accommodation\.php$ http://www.example.com/house-32? [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(accommodation\.php)?\?id=65\ HTTP/
RewriteRule ^accommodation\.php$ http://www.example.com/house-65? [R=301,L]

The 3 above are indexed in Google as http://www.example.com/accommodation.php?id=(whatever), and the RewriteCond and RewriteRule pairs above tell Google that:
accommodation.php?id=5
accommodation.php?id=32
accommodation.php?id=65

have now moved to http://www.example.com/(accom_type)-(number) the friendly url

I suppose once Google has reindexed the 3 properties with the new friendly URLs, these rules could be removed, and any further properties added will have friendly URLs from the start.

What I was trying to do, which I now know is near impossible, was to do it all with just 1 RewriteCond and 1 RewriteRule instead of 3. Because the original URL contains only the id number, not the accommodation type, and there is more than one type of accommodation, a single rule has no way of telling Google (or any other external link) which property type accommodation.php?id=(number) belongs to.
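
Put differently, the redirect target depends on a lookup from id to type, and that information is not in the old URL. A minimal Python sketch of that lookup, using the three ids from this thread as stand-in data (the `ACCOMMODATION_TYPES` table and the `friendly_url` helper are hypothetical; on the real site the type lives in the accommodation_type field):

```python
# Stand-in for the accommodation_type column, keyed by row id.
ACCOMMODATION_TYPES = {5: "apartment", 32: "house", 65: "house"}


def friendly_url(accommodation_id):
    """Build /(type)-(id) for a known id, or return None for an unknown one."""
    kind = ACCOMMODATION_TYPES.get(accommodation_id)
    if kind is None:
        return None
    return f"/{kind}-{accommodation_id}"
```

The three RewriteCond/RewriteRule pairs are this same table written out by hand, one entry per property.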

PLEASE NOTE:
I also tried {REQUEST_URI} and {REQUEST_FILENAME} as suggested, but they would not work.
When I clicked the indexed link in Google using these two, it went back to the unfriendly URL /accommodation.php?id=(number). Not sure why.

Thanks again everyone for all your help - I am learning a lot!



Strange, worked perfectly for me (Apache 2.2.1 @ Win7)

Well, at least you’ve got something that works :slight_smile:

[A-Z]+ matches any run of capital letters, so it also accepts request methods your pages will never serve. HTTP defines more methods than GET, POST and HEAD (PUT, DELETE and OPTIONS, for example), but they are hardly ever used for ordinary pages and not something your pages would support, so I left them out. Basically, changing [A-Z]+ to (GET|POST|HEAD) prevents the .htaccess from redirecting requests that use any other method.

{REQUEST_FILENAME} and {REQUEST_URI} both won’t work for you because they don’t contain the query string. {THE_REQUEST} does.
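
To see where the query string sits, here is a small Python sketch that splits a request line of the kind {THE_REQUEST} holds. The `split_request_line` helper is made up for illustration; the point is only that the path part (what {REQUEST_URI} is matched against) and the query string are separate pieces of the same line.

```python
def split_request_line(the_request):
    """Split 'METHOD /path?query PROTOCOL' into its four parts."""
    method, target, protocol = the_request.split(" ", 2)
    path, _, query = target.partition("?")
    return {"method": method, "path": path, "query": query, "protocol": protocol}
```

For "GET /accommodation.php?id=5 HTTP/1.1" the path part is /accommodation.php and the id only appears in the query part, so a pattern tested against the path alone can never see it.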

Remon,

Aw, you should have left the deleted post there - good for a giggle, ya know!

Okay, I stand corrected. There is very little information about {THE_REQUEST} at apache.org (only the format of the string that variable returns and their example, ‘(e.g., “GET /index.html HTTP/1.1”),’ which does NOT show the query string - with the exception of one example in a bug report which DOES show the query string). Point #1: The query string of the request IS included in {THE_REQUEST}.

Point #2: When there is no usable information, the testing ScallioXTX did is the only way to learn how mod_rewrite handles unusual situations. :tup:

Point #3: The R=301 flag makes the browser issue a new request, so {THE_REQUEST} is updated.

Finally, I still contend that {THE_REQUEST} is a horrible Apache variable to use because of all the extraneous information contained therein. Fortunately, the apache.org examples (mostly in the bug reports) show that the method is generally ignored (don’t use the start anchor - go directly to the /path/file?query_string) as is the HTTP protocol version (omit the end anchor, too).

IMHO, this is a good example of trying to make a simple task difficult but, as ScallioXTX has demonstrated, it works (i.e., DON’T fix it!).

Thanks, Remon, I learned something this time.

Regards,

DK

Yes, the initial RewriteRule (the internal one) works, but the external RewriteCond and RewriteRule that let search engines know the URL has permanently changed don’t.
This bit…

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(accommodation\.php)?\?id=([0-9]+)(#[^\ ]*)?\ HTTP/
RewriteRule ^(accommodation\.php)?$ http://www.example.com/%2? [R=301,L]



Z,

Just use the RewriteRule as it’s got everything you need.

Regards,

DK

z,

From my reading of your code, all you want to do is examine the {QUERY_STRING} variable for the digital contents of id. Then, if the {REQUEST_URI} is blank or accommodation.php, redirect to the value of id. Because the value of id is one or more digits, it’s unlikely that Apache will be able to serve ANYTHING, i.e., this makes no sense at all to me (in the same category as your insistence on using {THE_REQUEST} rather than the more appropriate {QUERY_STRING}). If this WERE to work, it would have to be followed by (after removing the Last flag) another RewriteRule to match the value of id then redirected to a script to handle that request, i.e., accommodation2.php?id=$1.
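
For comparison, here is a Python sketch of what a redirect keyed on {QUERY_STRING} would see next to one keyed on {THE_REQUEST}. It assumes, as described elsewhere in this thread, that the rules are evaluated again after an internal rewrite, at which point the current path and query string are those of accommodation.php?id=5 while the request line is still the one the browser sent. Both helper functions are hypothetical.

```python
import re

QUERY_COND = re.compile(r"^id=([0-9]+)$")
REQUEST_COND = re.compile(r"^[A-Z]+ /(accommodation\.php)?\?id=([0-9]+) HTTP/")


def redirects_by_query_string(path, query_string):
    """Hypothetical redirect rule keyed on the current path and query string."""
    return path in ("", "accommodation.php") and bool(QUERY_COND.search(query_string))


def redirects_by_the_request(path, request_line):
    """Redirect rule keyed on the original request line, as in the thread."""
    return path in ("", "accommodation.php") and bool(REQUEST_COND.search(request_line))
```

After a visitor's request for /house-5 has been rewritten internally, the query-string version fires and would send the browser back to /house-5 again, while the request-line version stays quiet because the browser never asked for accommodation.php.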

Whew, I’m getting dizzy just trying to guess what your motivation for all this really is and how to get mod_rewrite to do crazy things!

Regards,

DK