Mod rewrite with parsed files

I am having trouble with mod rewrite.

Currently, the .htaccess has a server-parsed handler in it and it works fine. All pages have PHP code running in them to display ads on the site pages, so I only have to change one file, not every page.
Anyway, it is a static HTML site with .htm extensions, which I don’t want to show. The pages are aged and well established with links, so I have to keep the “.htm” files.

How would I do that with this code?


Options +Includes
# -FrontPage-


IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
AuthName www.example.net
AuthUserFile /home/lots/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/lots/public_html/_vti_pvt/service.grp
AddHandler server-parsed .htm

I entered this corrected code, but it didn’t work or change anything.


Options +Includes
# -FrontPage-


IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
AuthName www.example.net
AuthUserFile /home/lots/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/lots/public_html/_vti_pvt/service.grp
AddHandler server-parsed .htm
# Externally redirect direct client requests for URLs ending in .htm to extensionless URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.]+\.htm([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.htm$ http://www.example.net/$1 [R=301,L]
# Internally rewrite extensionless URLs to add ".htm" if a corresponding html file exists
RewriteCond %{REQUEST_FILENAME}.htm -f
RewriteRule ^(([^/]+/)*[^.]+)$ /$1.htm [L]

2nd Question.
Since it is on a shared hosting account, will it affect the five other hosted sites, which are also .htm but on their own domains? I ask because it did about a year ago, so I reverted to the old “if it’s not broke” thinking and left it as is.
Actually, I also want those sites to NOT display the .htm extension.

Please explain it like you would to a beginning student 1/3 of the way thru the course. I’ve been fighting this like an MMA fighter for a while and cannot come out on top.

Thanks

Thanks for replying.

Unfortunately, I tried what you have above and the main domain’s interior page now looks like:

 http://www.example.net/home/domain/public_html/partners 

I didn’t check the other domains since the main one did not come out right.

lukkas,

Okay, here we go!

First, you are trying to generate a loop, i.e., remove .htm extensions from files which can only be accessed with the .htm extension. Don’t panic, it can be done (and the code is shown in my signature’s tutorial).

The second problem you have is that you’re allowing “addon domains” to be accessed via the “main domain.” This can be handled easily in one of two ways: redirect them to their own domain (which I’ll do for simplicity), or use a mod_rewrite block statement to check whether the addon domains are being requested via the main domain and Skip the correct number of RewriteRules to avoid processing the “loopy code.”
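
For the Skip alternative, a minimal sketch might look like the following. It assumes the addon domains live in subdirectories named site2/ and site3/ (hypothetical names) and that exactly two RewriteRules follow the skip rule; S=2 has to be adjusted to however many rules actually need bypassing.

```apache
RewriteEngine On

# Assumption: site2/ and site3/ are the addon domains' subdirectories.
# When the requested path starts with one of them, skip the next two
# RewriteRules (a skipped rule's RewriteCond lines are skipped with it).
RewriteRule ^(site2|site3)(/|$) - [S=2]

# Rule 1 of 2: the external redirect that removes .htm goes here
# Rule 2 of 2: the internal rewrite that adds .htm back goes here
```

The "-" substitution leaves the URL untouched, so the rule only acts as a jump over the rules that follow it.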

Before I start, let me comment on your current .htaccess:


Options +FollowSymLinks

AddHandler server-parsed .htm
RewriteEngine on
#
## Internally rewrite extensionless file requests to .html files ##
#
# If the requested URI does not contain a period in the final path-part
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
    You missed the mark as there is no requirement
    for a / followed by the dot character then extension.
    I would have used (to force at least two letters
    following a dot character) !(\.[a-z]{2,})$

    On the other hand, that's not needed with the
    -f and -d checks so - DELETE.
# and if it does not exist as a directory
RewriteCond %{REQUEST_FILENAME} !-d
# and if it does not exist as a file
RewriteCond %{REQUEST_FILENAME} !-f
# then add .html to get the actual filename
RewriteRule (.*) /$1.htm [L]
#
#
## Externally redirect clients directly requesting .htm page URIs to extensionless URIs
## If client request header contains html file extension
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^.]+\.)+htm\ HTTP
    If all you're looking for is the .htm (or is it .html?),
    then all you should have, er, all you need is \.htm
# externally redirect to extensionless URI
RewriteRule ^(.+)\.htm$ http://www.example.net/$1 [R=301,L]

Okay, while I loathe the use of {THE_REQUEST}, you’ve found a different way than I did to get loopy without looping! My solution would have been:

RewriteEngine on

# Redirect addon domain requests via the main domain to their own domains
RewriteRule ^site2(/.*)$ http://www.site2.com$1 [R=301,L]
RewriteRule ^site3(/.*)$ http://www.site3.com$1 [R=301,L]

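# Redirect to NEW format (extensionless)
# The leading slash keeps the target a URL-path; a relative target in
# .htaccess without RewriteBase redirects to the filesystem path instead
RewriteCond %{QUERY_STRING} !marker
RewriteRule ^([a-z]+)\.htm$ /$1? [R=301,L]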

# Redirect back to "usable link" (with .htm extension)
# Add "marker" to existing query string to prevent looping
RewriteRule ^([-a-z]+)$ $1.htm?marker [L]

Where I added a “marker” in a hidden query string, you merely used the existing {THE_REQUEST} variable to detect the original state of the request. Despite the pain and suffering of dealing with {THE_REQUEST}, you don’t need anything more than the existence of .htm in {THE_REQUEST}, the way I merely needed the existence of the marker in the query string. You get top marks for using something already available!
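
For comparison, here is a minimal sketch of the {THE_REQUEST} variant with both halves in place. It assumes the rules sit in the .htaccess of the main domain's DocumentRoot and that every page is a real .htm file on disk; treat it as a starting point under those assumptions, not as tested code for this site.

```apache
RewriteEngine On

# External 301: only when the client itself asked for a URL ending in .htm.
# THE_REQUEST holds the original request line (e.g. "GET /page.htm HTTP/1.1")
# and is not changed by internal rewrites, so this rule cannot re-trigger.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/([^?\s]+)\.htm[?\s]
RewriteRule \.htm$ /%1 [R=301,L]

# Internal rewrite: extensionless URL -> matching .htm file, if one exists
RewriteCond %{REQUEST_FILENAME}.htm -f
RewriteRule ^(.+)$ /$1.htm [L]
```

Without the second rule, the redirect lands on a URL with no file behind it, which shows up as a 404 on interior pages.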

Regards,

DK

Hi dklynn,

I want to remove the “.htm” extension from all pages, originally built with FrontPage.

But if someone clicks a link on the web that still has the extension, it should still go to the original page, minus the extension.
I need to maintain working links from the SERPs; in essence, have a 301 redirect. Now, I just read it will be affected.

I got 1 step further this morning w/ the following code:


Options +FollowSymLinks

AddHandler server-parsed .htm
RewriteEngine on
#
## Internally rewrite extensionless file requests to .html files ##
#
# If the requested URI does not contain a period in the final path-part
RewriteCond %{REQUEST_URI} !(\.[^./]+)$
# and if it does not exist as a directory
RewriteCond %{REQUEST_FILENAME} !-d
# and if it does not exist as a file
RewriteCond %{REQUEST_FILENAME} !-f
# then add .html to get the actual filename
RewriteRule (.*) /$1.htm [L]
#
#
## Externally redirect clients directly requesting .htm page URIs to extensionless URIs
## If client request header contains html file extension
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^.]+\.)+htm\ HTTP
# externally redirect to extensionless URI
RewriteRule ^(.+)\.htm$ http://www.example.net/$1 [R=301,L]

It did everything it was supposed to for the main site, whereas the add-on domain sites (fully independent, with their own .htaccess files) appeared correct on their respective home pages, but their interior pages appeared as

www.example.net/site2/page1.htm  
www.example.net/site3/page1.htm  

and so on…

Well, I got it working. A man who can halfway fish is happy now. EDIT below: not fixed.

Thanks

OOOOPS!! Big problem: the domains under this one are now showing

 www.example.com/2nddomain/ 

when I plug them in. What a pain.

The host has Apache 2.2 and PHP 5.2.
So far, it just ignores the rewrite condition and only parses .htm as PHP.

It does work for the home page: when a user enters example.net/index.htm, it goes to example.net.

“It does work: for internal pages it removes the .htm extension, but then I get a 404 page for those internal pages.”

using


Options +FollowSymLinks


<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>

AddHandler server-parsed .htm
#
# Externally redirect direct client requests for URLs
# with .htm extensions to new extensionless URLs
RewriteEngine On
# Redirect to remove /index.htm files
RewriteCond %{THE_REQUEST} \ /(.+/)?index\.htm(\?.*)?\  [NC]
RewriteRule ^(.+/)?index\.htm$ /%1 [NC,R=301,L]
# Redirect to remove /.htm files
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(.*)\.htm\ HTTP/ [NC]
RewriteRule .+ http://www.example.net/%1  [R=301,L]



I was using this for internal pages

# Externally redirect direct client requests for URLs ending in .htm to extensionless URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*[^.]+\.htm([?#][^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*[^.]+)\.htm$ http://www.example.net/$1 [R=301,L]

but it does not work.


lukkas,

If you will provide me with a clear statement of your intent (e.g., “internally redirect ALL pagename.htm files where pagename is all lowercase AND pagename.htm does not exist to pagename.php”), I’ll show you the code I’d use. Please note that, if you’ve got multiple domains sharing this .htaccess (mod_rewrite code), then I need to know which domain this is applicable to. Be assured that my code will NOT use {THE_REQUEST} (for the reasons stated above)!

Regards,

DK

lukkas,

Which ONE (mod_rewrite block statement) did you try and what is your test URI?

Regards,

DK

I did read through it but am still confused.

Hey, I know FP is a no-no & should be taken to a roadside in Afghanistan. Luckily, I haven’t used it since ’04, but the backlinks are “.htm”.

I don’t think the code is loopy, because it handles the index.htm page correctly, and once it gets to domain.com/page1.htm it does take off the extension but then goes to a 404 error. But I am no expert in this area by any means.

And is there a backwards-compatible way so that, when I finally get this to work, people can simply type in domain.com/page and not get a 404 page or error?

Thanks

Hi DK,

Thanks for the response and for helping me with this dilemma. I agree with you: if the code is too much, useless and heavy, throw the debris out.

So, if they are all .htm pages that are then parsed as PHP, should the rewrite map .php extensions or .htm extensions to extensionless? (Yes, I used FrontPage in 2001 to make this site.)

Or is the easy solution to drop the server-parsed handling of .htm pages as PHP and just use the rewrite? There are only 50 pages or so.

But for a larger site (which I do have, in its own directory) this is a problem.

2.) Yes, the add-on domains are in their own directories in the public_html folder and have their own .htaccess files, so they will not be affected by the main domain, right?

Hi lukkas!

The first code block has nothing to do with redirections. Instead, it hides your .htaccess (and M$-type) file(s) from directory listings (that is all IndexIgnore does), sets up <Limit> blocks which should already be in the httpd.conf, establishes a login for you and requires the server to parse .htm files (which is how you are getting the PHP in your .htm files parsed). I will ignore this “debris” in answering your questions (AND assume that you’re using an Apache 2.x server despite your M$ code).

Question 1: This is essentially identical to the “loopy” problem discussed in my signature’s tutorial (under “Redirect TO New Format”). The problem is that you’re redirecting away from what Apache can serve to something that it can’t (extensionless filename) then redirecting back. Obviously, that sets up a loop which Apache 2 will break (after 10 cycles) - Apache 1.x required a restart! I’ll let you read the answer, code and explanation, and come back with questions rather than repeat it here (and suffer carpal tunnel syndrome).
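
To make the loop concrete, here is a sketch of the naive pair of rules that produces it. It is an illustration of what not to deploy, not code taken from this thread. The first rule sends /page.htm to /page, the second maps /page back to /page.htm, and because .htaccess rules are run again after an internal rewrite, the first rule then matches once more.

```apache
RewriteEngine On

# Loopy on purpose: do not use as-is.
# Step 1: externally redirect /page.htm to /page
RewriteRule ^(.+)\.htm$ /$1 [R=301,L]

# Step 2: internally rewrite /page back to /page.htm ...
RewriteRule ^([^.]+)$ /$1.htm [L]
# ... after which the rules run again, Step 1 matches the rewritten
# /page.htm, and the client is redirected to /page once more.
```

Breaking the cycle means giving Step 1 some way to tell a .htm request that came from the client apart from one that Step 2 produced.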

HOWEVER, I MUST comment on your use of {THE_REQUEST}. That is such a convoluted string which requires parsing to get to the meat of the thing (the {REQUEST_URI} string) that, to me, it makes no sense to EVER use it - it’s just more trouble than it’s worth ESPECIALLY when there are more useful variables provided by the kind folks at apache.org.

Question 2: That depends. If all the domains share the same DocumentRoot and this “loopy” code is employed there, yes. If not, no. If this code is employed on the main domain and the addon domains are in subdirectories, they will ONLY be affected if addressed via the main domain. Otherwise, their addon domain access will bypass the main domain’s DocumentRoot (and this code) so it’ll have to be repeated in each DocumentRoot. It’s just a matter of physical location and the “entry point” for each domain (site).

Regards,

DK

lukkas,

On my website, I don’t “pretend” to have htm(l) pages. The pagename(s) are merely extensionless and are redirected (via mod_rewrite) to the pagename.php version.

FrontPage? Oh, well, Notepad (or EditPad) isn’t better?

Did you even read the section about the “loopy” redirects?

Did you read my comments about using {THE_REQUEST}?

Okay, as above, the addon domains will NOT be affected by the main domain’s .htaccess (because the domain requests bypass the main domain’s DocumentRoot).

mod_rewrite does not “ignore” anything. If the code is structured properly, it’ll work properly. What I’d said above was that it looked like your code was “loopy.” That means that mod_rewrite will make a few cycles of redirection (back and forth) until it decides it’s in a loop, and then it’ll stop. Read the mod_rewrite log.
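
On Apache 2.2, the rewrite log is switched on with RewriteLog and RewriteLogLevel. Both are only allowed in the server config or a <VirtualHost>, not in .htaccess, so on shared hosting you would likely have to ask the host or reproduce the setup on a local server. The log path below is an assumption.

```apache
# httpd.conf or <VirtualHost> only (Apache 2.2); not valid in .htaccess
# Hypothetical log path: point it somewhere the server can write
RewriteLog "/var/log/apache2/rewrite.log"
# 0 = off; higher levels are more verbose and slow the server down
RewriteLogLevel 3

# Apache 2.4 dropped both directives; the equivalent there is:
# LogLevel alert rewrite:trace3
```

Each logged line shows the pattern being applied and the URL before and after it, which makes a back-and-forth between two rules easy to spot.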

Regards,

DK