mod_rewrite to remove file extension, add trailing slash, remove www and redirect to 404 if no file/directory is available

…衆ロ難τιáo~ submitted on 2019-11-29 08:43:06

I've had problems getting ErrorDocument to work reliably with rewrite errors, so I prefer to handle invalid pages correctly in my rewrite cascade. I've tried to cover a full range of test vectors with this and didn't find any gaps.

Some general points:

  • You need to use the DOCUMENT_ROOT environment variable in this. Unfortunately, if you use a shared hosting service then this isn't set up correctly during rewrite execution, so hosting providers set up a shadow variable to do the same job. Mine uses DOCUMENT_ROOT_REAL, but I've also come across PHP_DOCUMENT_ROOT. Run phpinfo() to find out what to use for your service.
  • There's a debug info rule that you can trim as long as you replace DOCROOT appropriately
  • You can't always use %{REQUEST_FILENAME} where you'd expect to. This is because if the URI maps to DOCROOT/somePathThatExists/name/theRest then the %{REQUEST_FILENAME} is set to DOCROOT/somePathThatExists/name rather than the full pattern equivalent to the rule match string.
  • This is "Per Directory", so rule patterns have no leading slash, and we need to remember that the rewrite engine loops over the .htaccess file until a pass produces no match.
  • This processes all valid combinations and at the very end redirects to the 404.php which I assume sets the 404 Status as well as displaying the error page.
  • It will currently decode someValidScript.php/otherRubbish in the SEO fashion, but extra logic can pick this one up as well.

So here is the .htaccess fragment:

Options -Indexes -MultiViews
AcceptPathInfo Off

RewriteEngine On
RewriteBase   /

## Looping stop.  Not needed from Apache 2.3.9 onward, which introduces the [END] flag
RewriteCond %{ENV:REDIRECT_END}  =1
RewriteRule ^                    -                       [L,NS]

## External (301) redirections ##

RewriteRule ^ - [E=DOCROOT:%{ENV:DOCUMENT_ROOT_REAL},E=URI:%{REQUEST_URI},E=REQFN:%{REQUEST_FILENAME},E=FILENAME:%{SCRIPT_FILENAME}]

# redirect from HTTP://www to non-www
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_HOST}        ^www\.(.+)$ [NC]
RewriteRule ^                   http://%1%{REQUEST_URI}  [R=301,L]

# remove php file extension on GETs (no point in /[^?\s]+\.php as rule pattern requires this)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_METHOD}   =GET
RewriteRule (.*)\.php$          $1/                      [L,R=301]

# add trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*[^/]$            $0/                      [L,R=301]

# terminate if file exists.  Note this match may be after internal redirect.
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^                   -                        [L,E=END:1]

# terminate if directory index.php exists.  Note this match may be after internal redirect.
RewriteCond %{REQUEST_FILENAME}    -d
RewriteCond %{ENV:DOCROOT}/$1/index.php    -f
RewriteRule ^(.*)(/?)$             $1/index.php          [L,NS,E=END:1]

# resolve urls to matching php files 
RewriteCond %{ENV:DOCROOT}/$1.php  -f
RewriteRule ^(.*?)/?$              $1.php                [L,NS,E=END:1]

# Anything else redirect to the 404 script.  This one does have the leading /

RewriteRule ^                      /404.php              [L,NS,E=END:1]
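
If the server runs Apache 2.3.9 or later, the looping stop and the E=END:1 markers can probably be dropped in favour of the [END] flag mentioned in the first comment. A minimal, untested sketch of the last four rules under that assumption; it still relies on the DOCROOT variable set by the debug info rule above:

```apache
# Assumes Apache 2.3.9+: [END] stops the current pass like [L] and also
# prevents any further per-directory rewrite passes for this request,
# so no REDIRECT_END check is needed.

# terminate if file exists
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^                   -                        [END]

# terminate if directory index.php exists
RewriteCond %{REQUEST_FILENAME}    -d
RewriteCond %{ENV:DOCROOT}/$1/index.php    -f
RewriteRule ^(.*)(/?)$             $1/index.php          [END,NS]

# resolve urls to matching php files
RewriteCond %{ENV:DOCROOT}/$1.php  -f
RewriteRule ^(.*?)/?$              $1.php                [END,NS]

# anything else goes to the 404 script
RewriteRule ^                      /404.php              [END,NS]
```
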

Enjoy :-)

You'll probably want to check whether the php file exists before adding the trailing slash.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^.*[^/]$ /$0/ [L,R=301]

or if you really want a trailing slash for all 404 pages (so /images/error.jpg will become /images/error.jpg/, which I think is weird):

RewriteCond %{ENV:REDIRECT_STATUS} !200
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*[^/]$ /$0/ [L,R=301]

I came up with this:

DirectorySlash Off
RewriteEngine on
Options +FollowSymlinks

ErrorDocument 404 /404.php

#if it's www
#  redirect to non-www.
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
   RewriteRule ^ http://%1%{REQUEST_URI} [L,R=301,QSA]

#else if it has slash at the end, and it's not a directory
#  serve the appropriate php
RewriteCond %{ENV:REDIRECT_STATUS} ^$ 
RewriteCond %{REQUEST_FILENAME} !-d
   RewriteRule ^(.*)/$ /$1.php [L,QSA]

#else if it's an existing file, and it's not php or html
#   serve the content without rewrite
RewriteCond %{ENV:REDIRECT_STATUS} ^$ 
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{REQUEST_URI} !\.(php|html?)$
   RewriteRule ^ - [L,QSA]

#else
#  strip php/html extension, force slash
RewriteCond %{ENV:REDIRECT_STATUS} ^$ 
   RewriteRule ^(.*?)((\.php)|(\.html?))?/?$ /$1/ [L,NC,R=301,QSA]

Certainly not very elegant (env:redirect_status is quite a hack), but it passes my modest tests. Unfortunately I can't test the www redirection, as I'm on localhost and have no real access to a server, but that part should work too.

As you can see, I used the ErrorDocument directive to specify the error page, and the DirectorySlash Off directive to make sure Apache doesn't interfere with the slash-appending fun. I also used the QSA (Query String Append) flag, which appends the query string to the request so that it's not lost. It looks kind of silly after the trailing slash, but anyhow.
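
On the QSA flag: as far as the mod_rewrite documentation describes it, the original query string is passed through unchanged whenever the substitution contains no query string of its own, so QSA only matters when the substitution adds one. A small sketch of the difference; the index.php?page= target is a hypothetical example, not part of the rules above, and the two rules are alternatives rather than something to use together:

```apache
# Substitution has no "?": the incoming query string is kept automatically,
# so /about/?x=1 is served by /about.php?x=1 with or without QSA.
RewriteRule ^(.*)/$ /$1.php [L]

# Substitution has its own query string (hypothetical target): without QSA
# the incoming one would be replaced; with QSA, /about/?x=1 becomes
# /index.php?page=about&x=1.
RewriteRule ^(.*)/$ /index.php?page=$1 [L,QSA]
```
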

Otherwise it's pretty straightforward, and I think the comments explain it pretty well. Let me know if you run into any trouble with it.

  1. Create a folder under the root of the domain
  2. Place a .htaccess file in that folder containing RewriteRule ^$ index.php
  3. Parse the URL in index.php
  4. With PHP code you can then strip parts of the URL or the file extension as required
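
Note that the pattern ^$ in step 2 only matches a request for the folder itself. A minimal sketch of a variant that hands every request not matching an existing file or directory to index.php, assuming that is the intent of steps 3 and 4 (the !-f and !-d conditions are an addition, not part of the original steps):

```apache
RewriteEngine On

# Leave real files and directories (images, CSS, ...) alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Everything else goes to index.php in this folder, which can then read
# the requested path from $_SERVER['REQUEST_URI'] and parse it
RewriteRule ^ index.php [L]
```
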