  1. #1
    SitePoint Zealot mjkovis's Avatar
    Join Date
    May 2009
    Location
    St. Louis, MO
    Posts
    106
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Remove index.php & File Extensions - .htaccess

    Having issues editing my .htaccess file to remove both the index file along with file extensions.

    Example folder structure below:

    _public_html
    __about
    ___index.php
    ___contact-us
    ____index.php
    __information
    ___index.php
    ___example-page.php

    My idea is to rewrite the index.php in a directory to the root of the folder that it is located in (NOT THE ROOT DIRECTORY). Such as:
    Code:
    http://www.domain.com/about/index.php to http://www.domain.com/about/
    Also, I would like to remove the .php extensions from other pages that I have in that folder. Such as:
    Code:
    http://www.domain.com/information/example-page.php to http://www.domain.com/example-page
    Current .htaccess Rewrite below:
    Code:
    # DISABLE Directory Browsing
    Options -Indexes
    
    # SET Canonical URL and Remove index.php
    RewriteEngine on
    RewriteRule ^([a-z]+/)?index\.php$ http://www.domain.com/$1 [R=301,L]
    RewriteRule ^(sub-directory-one/|sub-directory-two/)?index\.php$ http://www.domain.com/$1 [R=301,L]
    RewriteCond %{HTTP_HOST} !^(www|subdomain)\.domain\.com [NC]
    RewriteRule .? http://www.domain.com%{REQUEST_URI} [R=301,L]
    The line below was added in because I was having issues with Rewriting the index.php file in folders that had hyphens in the name to separate words.
    Code:
    RewriteRule ^(sub-directory-one/|sub-directory-two/)?index\.php$ http://www.domain.com/$1 [R=301,L]
    What do I add to this or change so that I can get the correct result? One line of code I was using was causing a 404 error when trying to redirect a directory with an index.php file in it.

    Thanks in advance!

  2. #2
    Utopia, Inc. silver trophy
    ScallioXTX's Avatar
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    9,039
    Mentioned
    152 Post(s)
    Tagged
    2 Thread(s)
    You're getting yourself into trouble here because index.php is the DirectoryIndex. Even when index.php does not appear in the URL in the browser, it is still the requested file as far as Apache is concerned. The rule therefore fires both when index.php is in the browser's URL and after you've removed it, causing an infinite loop.
    What you want to do here is use %{THE_REQUEST} to see if "index.php" is actually in the URL the browser requested, and only remove it if that's the case. Otherwise, leave it alone.
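
    A minimal sketch of that check, assuming the rules sit in the document-root .htaccess and the site answers on www.domain.com as in the first post. The exact patterns are illustrative and have not been tested against this site:

    ```apache
    RewriteEngine on

    # THE_REQUEST holds the raw request line sent by the browser, e.g.
    # "GET /about/index.php HTTP/1.1". It is not altered by DirectoryIndex
    # or by earlier rewrites, so this condition is true only when the
    # browser itself asked for index.php.
    RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^?\ ]*/)?index\.php[?\ ]
    # $1 is the directory part ("about/" or empty for the document root).
    RewriteRule ^((?:[^/]+/)*)index\.php$ http://www.domain.com/$1 [R=301,L]
    ```

    A request for /about/ is served by index.php internally, but its request line never contains "index.php", so the condition fails and no second redirect is issued.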
    Rémon - Hosting Advisor

    Minimal Bookmarks Tree
    My Google Chrome extension: browsing bookmarks made easy

  3. #3
    Certified Ethical Hacker silver trophybronze trophy dklynn's Avatar
    Join Date
    Feb 2002
    Location
    Auckland
    Posts
    14,645
    Mentioned
    19 Post(s)
    Tagged
    3 Thread(s)
    MJ,

    You're also looking at mod_rewrite BackA$$ward! YOU create the new format (extensionless) and then direct mod_rewrite to add the file extension so a file can actually be served.

    IMHO, you need a good tutorial. Try the one linked in my signature as it's helped members here for years.

    Regards,

    DK
    David K. Lynn - Data Koncepts is a long-time WebHostingBuzz (US/UK)
    Client and (unpaid) WHB Ambassador
    mod_rewrite Tutorial Article (setup, config, test & write
    mod_rewrite regex w/sample code) and Code Generator

  4. #4
    SitePoint Zealot mjkovis's Avatar
    Appreciate the responses guys, but now I am confused by your responses...

    The current .htaccess that I have (which is listed in the original post) already REMOVES the index.php file from the URL. That works properly. In fact, dklynn actually helped me get that configuration corrected a long time ago. I have also looked through your tutorial MANY, MANY times, but when I add the line to create extensionless URLs as the new format, before I try to remove the index.php file, I get the problems.

    So... I have tried adding:
    Code:
    RewriteRule ^([a-z]+)$ $1.php [L]
    This would go before my rule to remove the index.php file and then causes a 404 error. I'm stumped on what I am missing. My knowledge of Apache is limited, but I wouldn't be asking if I couldn't figure it out myself.

    Honestly, if I am getting myself into too much trouble, I will just keep using my current configuration and continue to remove the index.php from the URL. That way I still get a clean URL; I will just have to build my folder and site structure accordingly.

    Thanks.

  5. #5
    Certified Ethical Hacker silver trophybronze trophy dklynn's Avatar
    MJ,

    There are two issues at play here:

    1. Using extensionless URIs.

    Your code is allowing you to use lowercase letters (without checking for whether that file exists with a .php file extension - it would be a smart thing to check before redirecting) which is fine. I have that setup on my website but have extended that for a client (http://wilderness-wally.com) so he can use a lot of special characters, too, to allow the use of his article (page) titles as URIs.
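
    A minimal sketch of that existence check, assuming the rule lives in a per-directory .htaccess. The character class (lowercase letters plus the hyphen) is an assumption chosen to cover names like example-page:

    ```apache
    RewriteEngine on

    # Internally map an extensionless request onto its .php file, but only
    # when that file really exists on disk. Without the -f test, a request
    # for a directory name such as "about" would also be rewritten to
    # about.php, which does not exist, and the visitor would get a 404.
    RewriteCond %{REQUEST_FILENAME}.php -f
    RewriteRule ^([a-z-]+)$ $1.php [L]
    ```

    The rewritten name contains a dot, so it cannot match the pattern a second time and the rule does not loop on its own output.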

    Where you have introduced a problem is with this requirement from your first post:
    Also, I would like to remove the .php extensions from other pages that I have in that folder. Such as:
    Code:
    http://www.domain.com/information/example-page.php to http://www.domain.com/example-page
    That will cause your code to loop and generate warning messages when Apache gets dizzy.

    I've addressed the "redirection to new format" in the signature's tutorial so I'll ask you to read it again (and again) before asking how to prevent Apache's dizzy spells.


    2. Removing the DirectoryIndex filename from the displayed URI.


    Personally, I'd not worry about that. In fact, I believe it is an abuse of the server to ask Apache to go determine which of the DirectoryIndex files it can find first.

    With that out of my system, I suspect that it's a function under the control of your host. You can, of course, match index\.php and redirect to {domain name} but the requested file will, ultimately, BE index.php. What you must include in your mod_rewrite is the same %{IS_SUBREQ} false of the thread below yours (alexandruc's). If already redirected, that should preclude a further redirection for the same thing (DirectoryIndex).
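
    A sketch of how that condition could be attached to the index.php redirect. Whether the DirectoryIndex lookup is seen as a sub-request depends on the server version and configuration, so treat this as something to test rather than a guaranteed fix:

    ```apache
    RewriteEngine on

    # Only redirect on the browser's own request; skip the rule while
    # Apache is processing an internal sub-request.
    RewriteCond %{IS_SUBREQ} false
    RewriteRule ^((?:[^/]+/)*)index\.php$ /$1 [R=301,L]

    # The NS (nosubreq) flag is the documented shorthand for the same test:
    # RewriteRule ^((?:[^/]+/)*)index\.php$ /$1 [R=301,L,NS]
    ```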


    Regards,

    DK

  6. #6
    SitePoint Zealot mjkovis's Avatar
    Quote Originally Posted by dklynn View Post
    I've addressed the "redirection to new format" in the signature's tutorial so I'll ask you to read it again (and again) before asking how to prevent Apache's dizzy spells.
    DK,

    Ok. This all has made me very dizzy indeed! I did go read your tutorial again and it still confused me, but I have found the answer!
    Code:
    Options -Indexes +FollowSymlinks -MultiViews
    
    RewriteEngine on
    
    DirectoryIndex index.php
    
    # REDIRECT Force requests for named index files to drop the index file filename, and force non-www to avoid redirect loop
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]*/)*index\.(html?|php[45]?)(\?[^\ ]*)?\ HTTP/
    RewriteRule ^(([^/]*/)*)index\.(html?|php[45]?)$ http://example.com/$1 [R=301,L]
    
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+)/([^\.]+)\.php\ HTTP/
    RewriteRule ^([a-zA-Z0-9_-]+)/([^.]+)\.php$ http://example.com/$1/$2 [R=301,L]
    
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+)/([^/]+)/([^\.]+)\.php\ HTTP/
    RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([^.]+)\.php$ http://example.com/$1/$2/$3 [R=301,L]
    
    # REDIRECT www to non-www
    RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
    RewriteRule .? http://example.com%{REQUEST_URI} [R=301,L]
    
    # REWRITE url to filepath
    RewriteRule ^([a-zA-Z0-9_-]+)/([^/.]+)$ /$1/$2.php [L]
    RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([^/.]+)$ /$1/$2/$3.php [L]
    This does exactly what I want it to. If my file structure goes deeper than 2 folders I can add more rules.
    Code:
    http://www.example.com REDIRECTS to http://example.com
    
    http://example.com/index.php REDIRECTS to http://example.com
    
    http://example.com/directory/index.php REDIRECTS to http://example.com/directory/
    
    http://example.com/directory/file.php REDIRECTS to http://example.com/directory/file
    And so on and so forth. It works with hyphenated URLs as well. This is being tested on my test domain.

    Thoughts?

  7. #7
    Certified Ethical Hacker silver trophybronze trophy dklynn's Avatar
    MJ,

    Well, as you've discovered, %{THE_REQUEST} is another way to get at what %{IS_SUBREQ} false does, but it's a PITA to deal with! I wouldn't do that, as %{IS_SUBREQ} is far simpler.

    [rant #1]
    The use of "lazy regex," specifically the EVERYTHING atom, (.*), and its close relatives, is the NUMBER ONE coding error of newbies BECAUSE it is "greedy." Unless you provide an "exit" from your redirection, you will ALWAYS end up in a loop!
    [/rant #1]

    Oh, Goodie! I got to use Rant #1 again! Why not encapsulate the path in an optional atom and be done with it?
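
    One way the optional-atom idea might look, as a sketch only: it assumes example.com as in the posted code, assumes these rules are placed after the index-file redirect (otherwise index.php would be caught here too), and adds an existence check that the posted rules do not have:

    ```apache
    RewriteEngine on

    # One redirect for any directory depth: the group ([^/?\ ]+/)* matches
    # zero or more "directory/" segments, so separate one-level and
    # two-level blocks are not needed.
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/?\ ]+/)*[^/.?\ ]+\.php[?\ ]
    RewriteRule ^((?:[^/]+/)*[^/.]+)\.php$ http://example.com/$1 [R=301,L]

    # One internal rewrite back to the file, again for any depth, applied
    # only when the matching .php file exists.
    RewriteCond %{REQUEST_FILENAME}.php -f
    RewriteRule ^((?:[a-zA-Z0-9_-]+/)*[a-zA-Z0-9_-]+)$ /$1.php [L]
    ```

    The inner group is non-capturing, so $1 holds the whole path (for example directory/sub/file) rather than only the last directory segment.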

    Is THIS domain your example.com? If it is, why are you using external absolute redirections (and why aren't you dealing with the force non-www at the start)?

    The first mod_rewrite block checks that the original request was for an index file (index.htm, index.html, index.php, index.php4 or index.php5) and then redirects it to example.com, stripping the index filename.

    The second mod_rewrite block redirects any subdirectory level one file with a .php extension to the filename without the php extension.

    The third mod_rewrite block does the same with the second level subdirectory.

    The fourth mod_rewrite block strips www from the domain.

    The final mod_rewrite block adds the .php file extension for both first and second level subdirectories.

    You asked for thoughts: OMG! Too convoluted for me with using {THE_REQUEST}, with dealing with different subdirectory levels (without combining those blocks) by stripping file extensions then adding them back in the final block! Why not use the code I provided in the tutorial (far simpler)?

    Ultimately, the only factor that really matters is: If it works, don't fix it!

    Regards,

    DK

  8. #8
    SitePoint Zealot mjkovis's Avatar
    DK,

    Quote Originally Posted by dklynn View Post
    Why not encapsulate the path in an optional atom and be done with it?
    I believe I was trying to do that and for whatever reason, it wasn't working properly. Maybe I wrote the line incorrectly (which is most likely the case). Either way, that is what I wanted, but I still cannot find the solution to taking out those extra blocks of code.

    Quote Originally Posted by dklynn View Post
    Is THIS domain your example.com? If it is, why are you using external absolute redirections (and why aren't you dealing with the force non-www at the start)?
    Should the force to WWW or non-WWW be the first block after I turn the Rewrite Engine on?

    Quote Originally Posted by dklynn View Post
    Why not use the code I provided in the tutorial (far simpler)?
    First off, it confused the heck out of me (newbie here). Then, I kept receiving an Internal Server 500 error document while trying to utilize it...

    Quote Originally Posted by dklynn View Post
    Ultimately, the only factor that really matters is: If it works, don't fix it!
    Exactly. BUT, I want to learn the proper way and that is why I am looking for additional thoughts and suggestions to help me understand what I am doing wrong and why. Then ultimately, how to fix it.

    Other than your tutorial, do you have any other solid resources to help my learning process along?

    Thanks again for your help and feedback dklynn!

  9. #9
    Certified Ethical Hacker silver trophybronze trophy dklynn's Avatar
    MJ,

    Responses embedded (indented) below.

    Quote Originally Posted by mjkovis View Post
    DK,

    I believe I was trying to do that and for whatever reason, it wasn't working properly. Maybe I wrote the line incorrectly (which is most likely the case). Either way, that is what I wanted, but I still cannot find the solution to taking out those extra blocks of code.

        Code:
        RewriteRule ^(([^/]*/)*) ...
        # can more easily be coded as
        RewriteRule ^(.*) ...
        # or, better yet, without anything preceding the index\.
        Just a suggestion, of course.

    Should the force to WWW or non-WWW be the first block after I turn the Rewrite Engine on?

        Because that will be universally applied (unless you change things later), that is what I would do.

    First off, it confused the heck out of me (newbie here). Then, I kept receiving an Internal Server 500 error document while trying to utilize it...

        I included the explanation in the tutorial but it is non-trivial:
        Code:
        RewriteEngine on
        RewriteCond %{HTTP_HOST}/s%{HTTPS} ^www\.([^/]+)/((s)on|s.*)$ [NC]
        RewriteRule .? http%3://%1%{REQUEST_URI} [R=301,L]
        RewriteCond:
        %{HTTP_HOST} will always be available.
        /s is a dummy string (a literal slash followed by a literal s) which is to be matched (or not) within the following regex.
        %{HTTPS} will be either 'on' or 'off'.

        RewriteCond regex:
        Assuming the request arrived with www, www will be matched.
        ([^/]+) will be the domain including the TLD (extension), captured as %1.
        ((s)on|s.*) will attempt to match either (s)on or s followed by anything. If %{HTTPS} is 'on', '(s)on' will be matched; otherwise s with anything will be matched. The key here is that we're only interested in the (s) of (s)on, which is captured as %3 and then used in the redirection:
        Code:
        RewriteRule .? http%3://%1%{REQUEST_URI}
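
        To make the captures concrete, here is a worked trace of the same two lines for three hypothetical requests. The hostnames and the /page path are assumptions for illustration, and the trace assumes %{HTTPS} expands to "on" or "off":

        ```apache
        # Request: http://www.example.com/page   (HTTPS is "off")
        #   test string: www.example.com/soff
        #   %1 = example.com, %3 = (empty)  ->  http://example.com/page
        #
        # Request: https://www.example.com/page  (HTTPS is "on")
        #   test string: www.example.com/son
        #   %1 = example.com, %3 = s        ->  https://example.com/page
        #
        # Request: http://example.com/page       (no www)
        #   test string: example.com/soff
        #   the pattern requires a leading "www.", so the condition fails
        #   and no redirect is issued
        RewriteCond %{HTTP_HOST}/s%{HTTPS} ^www\.([^/]+)/((s)on|s.*)$ [NC]
        RewriteRule .? http%3://%1%{REQUEST_URI} [R=301,L]
        ```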


    Exactly. BUT, I want to learn the proper way and that is why I am looking for additional thoughts and suggestions to help me understand what I am doing wrong and why. Then ultimately, how to fix it.

    Other than your tutorial, do you have any other solid resources to help my learning process along?

        I'd included a couple of "references" at the bottom of that LONG tutorial. The only thing I can do is warn you that there are a lot of people out there with tutorials which I consider garbage as they seem to rely on (.*) which is the leading cause of problems that newbies have with mod_rewrite.

    Thanks again for your help and feedback dklynn!

        You're very welcome!

    Regards,

    DK

