  1. #1
    SitePoint Addict
    Join Date
    Sep 2008
    Posts
    341
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Please improve my .htaccess / mod rewrite code

    So far I have this:

    It redirects all requests (if needed) to https://www.domain.com/page/var1/var2/ etc. and rewrites so index.php receives all requests.

    Code:
    RewriteEngine On
    
    # required on my server
    RewriteBase /
    
    # redirect to full correct url if missing trailing slash
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_URI} !(.*)/$
    RewriteRule ^(.*)$ https://www.domain.co.uk/$1/ [R=301,L]
    
    # redirect to full correct url if not complete
    RewriteCond %{HTTP_HOST} ^domain.co.uk [NC,OR]
    # the next line may be specific to my server
    RewriteCond %{ENV:HTTPS} !on [NC]
    RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R=301,L]
    
    # use index.php for all requests
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . index.php [L,NS]
    ...but I could do with help on two things:

    (1) I'm guessing the code could be written better. It works perfectly, but I'm sure it could be condensed.

    (2) I also need to add in where if they only go to domain.com (and no page / url vars etc. selected) that it redirects to domain.com/home/. How would I do this without just duplicating another chunk of code again?

  2. #2
    dklynn, Certified Ethical Hacker
    Join Date
    Feb 2002
    Location
    Auckland
    Posts
    14,653
    Mentioned
    19 Post(s)
    Tagged
    3 Thread(s)
    JS153,

    Comments embedded in your code:
    Code:
    RewriteEngine On
    
    # required on my server
    RewriteBase /
    # BS - that's designed to UNDO mod_alias redirections so mod_rewrite can work on a request.
    # While it shouldn't hurt, it's incorrect
    
    # redirect to full correct url if missing trailing slash
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_URI} !(.*)/$
    RewriteRule ^(.*)$ https://www.domain.co.uk/$1/ [R=301,L]
    # IMHO, it's EXTREMELY bad technique to add a trailing / to anything other than a directory request.
    # Why in the world would you go out of your way to do this?
    
    # redirect to full correct url if not complete
    RewriteCond %{HTTP_HOST} ^domain.co.uk [NC,OR]
    # the next line may be specific to my server
    RewriteCond %{ENV:HTTPS} !on [NC]
    RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R=301,L]
    # Is there a requirement to use https? This seems to be another very silly thing to do
    
    # use index.php for all requests
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule .? index.php [L,NS]
    # First, . requires a character while .? allows just the domain request
    # NS (not for internal sub-requests).
    
    Quote Originally Posted by Apache.org
    This flag forces the rewrite engine to skip a rewrite rule if the current request is an internal sub-request. For instance, sub-requests occur internally in Apache when mod_include tries to find out information about possible directory default files (index.xxx). On sub-requests it is not always useful, and can even cause errors, if the complete set of rules are applied. Use this flag to exclude some rules.
    # Is this really appropriate?
    Quote Originally Posted by js153
    ...but I could do with help on two things:

    (1) I'm guessing the code could be written better. It works perfectly, but I'm sure it could be condensed.

    Really?

    (2) I also need to add in where if they only go to domain.com (and no page / url vars etc. selected) that it redirects to domain.com/home/. How would I do this without just duplicating another chunk of code again?

    Code:
    RewriteRule ^$ home/ [R=301,L]
    I'll let you place that in the mess above.
    Regards,

    DK
    David K. Lynn - Data Koncepts is a long-time WebHostingBuzz (US/UK)
    Client and (unpaid) WHB Ambassador
    mod_rewrite Tutorial Article (setup, config, test & write
    mod_rewrite regex w/sample code) and Code Generator

  3. #3
    SitePoint Addict
    Join Date
    Sep 2008
    Posts
    341
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for posting, but your comments are a little odd.

    (1) As for using "RewriteBase /" then all I've done is follow this: http://www.rackspace.com/knowledge_c...ing_on_my_site. If they're wrong, then surely it's not a big deal anyway is it? Are they wrong about their own server?

    (2) The trailing slash helps prevent duplicate content for SEO (http://www.seroundtable.com/archives/022083.html). As you control all links on your site, then you just ensure all links include the trailing slash, which means htaccess rarely needs to redirect it. If any come externally then .htaccess does the redirect, which helps prevent duplicate content. Try going to apple.com/ipad and you'll see they do exactly this.

    If you're saying that my code will add a trailing slash to every url, and it's only needed for a directory, then that is exactly the kind of thing I need help on. That said, all links will be in the format domain.com/page/var/var2/var3/ etc. anyway.

    (3) HTTPS, yes there is a need to use all-https on the site. The site involves content where its security is paramount. Many sites are going https-only these days, such as odesk.com. By going to odesk.com you will see the https redirect.

    (4) I ask for help improving the code, and you seem to spend ages commenting on it, but haven't offered a better way to write the code. I understand nobody has to help on a forum, and I respect that, but you did take the time to comment.

    (5) You say, "I'll let you place that in the mess above". Well, obviously I can't, that's why I was asking. Thanks for providing the line of code though, I just need to know if it needs to be merged in with other parts of the code. Again, you say it's a mess, so that is why I'm asking for help re-writing it.

    (6) Also, you've highlighted ENV and [NC] in this:
    Code:
    # the next line may be specific to my server
    RewriteCond %{ENV:HTTPS} !on [NC]
    ...is there a problem, as again that's come from my server owners specifically (http://www.rackspace.com/knowledge_c...on_my_PHP_site)?

    I would still appreciate help from anyone who could spare the time. I can't / won't change the things dklynn has commented on though as they are important, but I do need help improving and condensing the code.

    Thanks again for taking the time though dklynn. It has at least made me keen to get additional help and make additional checks before I sign this off.

  4. #4
    dklynn, Certified Ethical Hacker
    Join Date
    Feb 2002
    Location
    Auckland
    Posts
    14,653
    Mentioned
    19 Post(s)
    Tagged
    3 Thread(s)
    js,

    Rackspace controls their own servers and I have no idea what they're doing with them, however, I stand by my comments.

    Quote Originally Posted by johnsmith153
    Thanks for posting, but your comments are a little odd.

    They're generally odd - well, pedantic. I seek perfection and comment to help others learn.

    (1) As for using "RewriteBase /" then all I've done is follow this: http://www.rackspace.com/knowledge_c...ing_on_my_site. If they're wrong, then surely it's not a big deal anyway is it? Are they wrong about their own server?

    That's Rackspace's problem, as that's what Apache says about RewriteBase. Please ask Rackspace why they'd recommend something like that.

    (2) The trailing slash helps prevent duplicate content for SEO (http://www.seroundtable.com/archives/022083.html). As you control all links on your site, then you just ensure all links include the trailing slash, which means htaccess rarely needs to redirect it. If any come externally then .htaccess does the redirect, which helps prevent duplicate content. Try going to apple.com/ipad and you'll see they do exactly this.

    apple.com/ipad? Where's the trailing /?

    If you're saying that my code will add a trailing slash to every url, and it's only needed for a directory, then that is exactly the kind of thing I need help on. That said, all links will be in the format domain.com/page/var/var2/var3/ etc. anyway.

    The domain.com/page/var/var2/var3/ indicates a mod_rewrite is supposed to grab those variables and redirect to a handler file. The series of "pseudo directories," though, tell the visitors' browsers to request support files from different level directories which causes a problem for relative links. If you were to use the above where page={serveable script}, then you must have Options MultiViews enabled - I find MultiViews to be a major PITA as you must understand that the servable script does not require the file extension to be served by Apache so you must carefully watch your choice of file/directory names.

    (3) HTTPS, yes there is a need to use all-https on the site. The site involves content where its security is paramount. Many sites are going https-only these days, such as odesk.com. By going to odesk.com you will see the https redirect.

    That's a choice for you to make. The simple fact that someone else is doing it isn't sufficient reason for me to believe that everyone should be doing it. HTTPS adds load on the server and increases bandwidth - but those penalties won't hurt a low traffic site, so, if you have the Secure Server certificate and desire to go HTTPS, then by all means, do it.

    (4) I ask for help improving the code, and you seem to spend ages commenting on it, but haven't offered a better way to write the code. I understand nobody has to help on a forum, and I respect that, but you did take the time to comment.

    Other than the recommended deletions (the "why in the worlds"), the only change I recommended was removing the ENV: from the Apache variable %{HTTPS}, altering the single character requirement to making it optional and using %{REQUEST_URI} instead of $1. As requested ...

    (5) You say, "I'll let you place that in the mess above". Well, obviously I can't, that's why I was asking. Thanks for providing the line of code though, I just need to know if it needs to be merged in with other parts of the code. Again, you say it's a mess, so that is why I'm asking for help re-writing it.

    That was a challenge for you ... to see whether you understand mod_rewrite or just want someone to script for you. Anyway, I'd put it anywhere (my preference is immediately before) the "send everything to index.php" block.
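
    For anyone reading along, a minimal sketch of that placement. It assumes the file is the .htaccess in the document root, and it reuses the rule lines already posted in this thread; nothing here has been tested on Rackspace.

    ```apache
    # Sketch only: a request for the bare domain has an empty path
    # in .htaccess context, so ^$ matches it and nothing else.
    RewriteRule ^$ home/ [R=301,L]

    # use index.php for all requests
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule .? index.php [L,NS]
    ```

    Because the https/www redirect block sits above both rules, a request for the bare domain should already be on the canonical host by the time the /home/ redirect is evaluated.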

    (6) Also, you've highlighted ENV and [NC] in this:
    Code:
    # the next line may be specific to my server
    RewriteCond %{ENV:HTTPS} !on [NC]
    ...is there a problem, as again that's come from my server owners specifically (http://www.rackspace.com/knowledge_c...on_my_PHP_site)?

    I can't comment about Rackspace on this one either. {HTTPS} is an Apache variable while you must set the environmental variable to be able to use {ENV:HTTPS} - unless they perform that setting with non-standard coding. I can't imagine why they'd do that either so ... well, ask Rackspace why they're using an environmental variable setting, how it gets set and why they're not simply using {HTTPS}.
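
    As a rough illustration of the difference, the redirect block written against the standard variable might look like the sketch below. It assumes the server itself terminates SSL so that %{HTTPS} is populated; if SSL is handled by something in front of Apache, %{HTTPS} may always read "off" and this version could loop, which may be why a host documents its own variable instead.

    ```apache
    # Sketch only, under the assumption stated above.
    # Dots in the host name are escaped so they match literally.
    RewriteCond %{HTTP_HOST} ^domain\.co\.uk [NC,OR]
    RewriteCond %{HTTPS} !=on
    # %{REQUEST_URI} replaces the captured $1, so no capture group is needed
    RewriteRule .? https://www.domain.co.uk%{REQUEST_URI} [R=301,L]
    ```

    Note that %{REQUEST_URI} already starts with a slash, so there is no slash between the host name and the variable.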

    I would still appreciate help from anyone who could spare the time. I can't / won't change the things dklynn has commented on though as they are important, but I do need help improving and condensing the code.

    That's your choice, sir.

    Thanks again for taking the time though dklynn. It has at least made me keen to get additional help and make additional checks before I sign this off.
    Regards,

    DK

  5. #5
    SitePoint Addict
    Join Date
    Sep 2008
    Posts
    341
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks again for the response. I do appreciate the time you've taken.

    For some reason you've mistaken my post for someone who thinks they're an expert at .htaccess / mod-rewrite or whatever the best name for this process is. I am simply someone trying to get a website to do what I want it to. If I had posted "I'm brilliant, look at my great .htaccess code", then your responses would make 100% sense. As for Rackspace vs you then I really couldn't care less. At this stage I'd say you probably know more than anyone over there. I don't need to start questioning their advice when it does nothing wrong.

    You twice mention for me to ask Rackspace why they do that and this, but again I really couldn't care. I'm no expert, I just needed help. If Rackspace are wrong then great, but I may as well include the "RewriteBase /" considering they do recommend it and I can't see it harming anything. Rackspace told me to do it, it seems to work, so what's the problem?

    About the apple.com/ipad thing, my aim is that someone goes to domain.com/page and it redirects to domain.com/page/ - this is what apple.com/ipad does. If my .htaccess code doesn't do this then that is where I am mistaken.

    Redirecting all "domain.com/page" requests to "domain.com/page/" with a 301 redirect ensures that Google indexes both as the same page, preventing duplicate content issues. You say "where's the trailing slash on apple.com/ipad"; well, go to the link and you'll see Apple redirect (301) and append the trailing slash. Understanding your knowledge of SEO would help me. Mine's probably 2/10. Do you have any skills in SEO (if you do then I need to look at this, but if you know nothing about SEO then this would probably explain it)?

    Thanks for the advice regarding "domain.com/page/var/var2/var3" and the "series of 'pseudo directories'" thing - I actually ensure all links are absolute anyway which I think resolves this. I've never had a problem actually with this, but I'm guessing a lot of people do.

    Also, does my code add a trailing slash to every url? If it adds the trailing slash like apple.com does then that is my aim, but if it's adding it to requests it shouldn't then I need to change that.
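
    For reference, the first block as posted redirects every request that does not map to an existing file and does not already end in a slash, so a mistyped asset URL such as /missing.css would also gain a slash. A hedged sketch of one way to narrow it, assuming page URLs never end in something that looks like a file extension (that assumption is not from this thread):

    ```apache
    # Sketch only: skip existing files, URLs already ending in "/",
    # and (assumption) anything whose last segment ends in ".ext".
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_URI} !/$
    RewriteCond %{REQUEST_URI} !\.[A-Za-z0-9]+$
    RewriteRule ^(.*)$ https://www.domain.co.uk/$1/ [R=301,L]
    ```

    With this variant /page still redirects to /page/, while a request for a missing file with an extension falls through to the later rules instead of being redirected.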

    Also, I posted 10 lines of code in what I would call three chunks. First it does the trailing slash, then it does the https://www.domain.com/page/var/ redirect and then it ensures index.php is used on all requests. Can these 10 lines be condensed by keeping all the features? I was expecting someone to say, "that can be done in 4 lines" or that kind of thing. If you think that the fact I have gone with those decisions is stupid, then that's one thing, but it's whether my stupid decisions are coded correctly that I wanted help on.

    I'm thinking now that you think the code is reasonable (not perfect though I'm sure), but it's the decision to listen to Rackspace and go with the SEO stuff you are questioning. I first thought you suggested the whole thing should be re-done.

    In fact, the bit about "I'll let you place that in the mess above" - I expected it to be merged in with the rest, but it looks like that's not possible. So if I just need to add that as another line of code then I can easily do that. I just thought there'd be a better way to compress it. As an example, the way I've separately entered the trailing slash redirect and the redirect all to https://www.domain.com/page/var etc. - I expected those two chunks to easily be condensable into one (say 2/3 lines for the lot).

    You also say, "That was a challenge for you ... to see whether you understand mod_rewrite or just want someone to script for you." - no, neither actually and I think this post shows that I don't understand mod_rewrite and I think it shows I haven't just expected someone to script for me.

    If the only problem with my code is that I listened to Rackspace advice, followed SEO advice regarding trailing slash and am using https across the whole site then I actually think the code is ok. Unless there's something else I've missed?

    I really think you have misunderstood the idea for my post. My post was aimed to say "this is my code, please help me improve" and nothing else.

    I know nothing, you know everything. What more can I say!

  6. #6
    ScallioXTX, Utopia, Inc.
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    9,067
    Mentioned
    153 Post(s)
    Tagged
    2 Thread(s)
    Alright, short and sweet: apart from maybe removing the RewriteBase your code can't be condensed any further than it is now.

    And I agree with everything David has said so far.
    Rémon - Hosting Advisor

