Question
I have reviewed the many questions posted here related to .htaccess, apache, mod-rewrite and regex, but I'm just not getting it. I tried a few different things, but either I am overcomplicating things or making beginner mistakes. Regardless, I've been at it for a few days now and have completely scrambled things somewhere, as the 10,000 404s per day are showing.
My site
I have a WordPress site which contains over 23,000 posts broken down into just over 1200 categories. The site features streaming video files, industry news, show reviews, movies, phpbb forums, etc. and is structured like this:
- site / base categories (0 and a-z) / sub categories (series name) / posts (episode name.html) for all streaming media episodes
- site / movies / post title.html for all streaming movies
- site / news / posttitle.html
- site / reviews / posttitle.html
- site / page.html for assorted pages
- site / forums
Permalink structure is /%category%/%postname%.html
I am using the Yoast WordPress SEO plugin and have the option to append a trailing slash enabled for directories and categories.
Here is the current .htaccess:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
My examples
From our old site structure we have many inbound links of the form "/episode title/". These are wrong. We need these incoming links to redirect to "/watch-anime/single character (letter, number or symbol)/series title/episode title.html"
/one-piece-episode-528/
should be
/watch-anime/o/one-piece/one-piece-episode-528.html
A mistake I made caused this problem: "/watch-anime/letter/series title/episode title/" needs to become "/watch-anime/letter/series title/episode title.html". So we need to remove the trailing slash from single posts and add .html
/watch-anime/w/welcome-to-the-nhk/welcome-to-the-nhk-episode-14/
should be
/watch-anime/w/welcome-to-the-nhk/welcome-to-the-nhk-episode-14.html
The same mistake, combined with the old site structure issue, caused this problem: "/episode title.html" needs to be "/watch-anime/letter/series title/episode title.html"
/one-piece-episode-528.html
needs to be
/watch-anime/o/one-piece/one-piece-episode-528.html
As you can see, I've made a mess of things between migrating the site's post structure and my attempts to fix it. I am now asking for any help you can provide in getting a proper .htaccess file that will take care of these 301 redirects.
Thanks for any assistance you can provide!
Answer 1:
The RewriteMap directive cannot be declared in .htaccess files (it is only allowed in server config and virtual host context), so here is my solution for a virtual host, which should work.
Create a RewriteMap file. See the RewriteMap documentation for more information. This is a very simple text file where each line holds the wrong URL without the '/', then at least one space, then the right URL, like this:
one-piece-episode-528 /watch-anime/o/one-piece/one-piece-episode-528.html
dexter-season-6-episode-1 /watch-interesting-stuff/d/dexter/dexter-season-6-episode-1.html
breaking-bad-full-season-3 /watch-interesting-stuff/b/breaking-bad/breaking-bad-full-season-3.html
and so on.
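With over 23,000 posts, writing the map by hand is impractical. Below is a minimal sketch of how the text file could be generated, assuming you can export the list of canonical post paths (for example from the WordPress database). The function names `slug_of` and `build_map_lines` are hypothetical, and the key is simply the last path segment with `.html` removed:

```python
from posixpath import basename


def slug_of(canonical_path):
    """Return the last path segment without a trailing '.html'."""
    name = basename(canonical_path)
    return name[:-len(".html")] if name.endswith(".html") else name


def build_map_lines(canonical_paths):
    """Build 'key target' lines in the plain-text RewriteMap format.

    Raises ValueError if two different targets share the same key,
    because a map can hold only one value per key.
    """
    seen = {}
    for path in canonical_paths:
        slug = slug_of(path)
        if slug in seen and seen[slug] != path:
            raise ValueError("duplicate key %r: %s vs %s" % (slug, seen[slug], path))
        seen[slug] = path
    return ["%s %s" % (slug, target) for slug, target in sorted(seen.items())]


if __name__ == "__main__":
    # Example input; in practice this list would come from your own export.
    paths = [
        "/watch-anime/o/one-piece/one-piece-episode-528.html",
        "/watch-anime/w/welcome-to-the-nhk/welcome-to-the-nhk-episode-14.html",
    ]
    print("\n".join(build_map_lines(paths)))
```

Redirect the output to mapanime.txt. The duplicate check matters because two episodes with the same slug in different series cannot both be keys in the same map.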
Convert this text file into a DBM hash map. For example:
httxt2dbm -i mapanime.txt -o mapanime.map
Now declare it in your vhost:
RewriteMap mapanime \
dbm:/pathtofile/mapanime.map
So all in all your vhost should look like:
<VirtualHost *>
RewriteEngine On
RewriteMap mapanime \
dbm:/pathtofile/mapanime.map
# don't touch the URL, but look up its last segment in mapanime
# and store the result (or "notfound") in VARANIME:
RewriteRule /([^/]*)/$ - [NC,E=VARANIME:${mapanime:$1|notfound}]
# if VARANIME is not empty *and*
# VARANIME is different from "notfound"
# (note the "!" which negates the pattern):
RewriteCond %{ENV:VARANIME} !^(notfound|)$
# then redirect it to the right URL:
# QSA = query string append
# R = redirect, 301 = permanent redirect
# L = last = don't go further
RewriteRule . %{ENV:VARANIME} [QSA,R=301,L]
</VirtualHost>
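To check a map against sample URLs before touching the server, the intended behaviour of the rules (redirect only when the lookup returns something other than empty or "notfound") can be modelled offline. This is only a rough sketch, not a replacement for testing on Apache itself; `redirect_target` and `url_map` are hypothetical names, and the dictionary stands in for the DBM map:

```python
import re

# Same pattern as in the RewriteRule: the last path segment of a URL
# that ends with a slash. It is not anchored at the start of the path.
SEGMENT_RE = re.compile(r"/([^/]*)/$")


def redirect_target(path, url_map):
    """Return the redirect target for path, or None if no redirect applies."""
    match = SEGMENT_RE.search(path)
    if match is None:
        return None  # URL does not end with "/segment/": rule does not fire
    value = url_map.get(match.group(1), "notfound")
    if value in ("", "notfound"):
        return None  # lookup failed: leave the request to WordPress
    return value


if __name__ == "__main__":
    url_map = {
        "one-piece-episode-528":
            "/watch-anime/o/one-piece/one-piece-episode-528.html",
    }
    for sample in ("/one-piece-episode-528/",
                   "/one-piece-episode-528.html",
                   "/forums/"):
        print(sample, "->", redirect_target(sample, url_map))
```

Because the pattern is not anchored, a request such as /watch-anime/w/welcome-to-the-nhk/welcome-to-the-nhk-episode-14/ is also looked up by its last segment, so the same map covers the second case in the question. URLs of the form /episode-title.html have no trailing slash and are not matched by this pattern; they would need an additional rule.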
Hope this helps.
I don't see a simpler solution, but I'm pretty sure this one will work.
If it doesn't work: read my usual "two hints", and add the rewrite log in your question.
Two hints:
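Please try to use the RewriteLog directive: it helps you to track down such problems. Note that RewriteLog and RewriteLogLevel exist only up to Apache 2.2; in Apache 2.4 they were replaced by the LogLevel directive (for example, LogLevel alert rewrite:trace3).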
# Trace:
# (!) file gets big quickly, remove in prod environments:
RewriteLog "/web/logs/mywebsite.rewrite.log"
RewriteLogLevel 9
RewriteEngine On
My favorite tool for checking regular expressions:
http://www.quanetic.com/Regex (Apache 2.x mod_rewrite uses PCRE, so choose preg(PCRE) rather than ereg(POSIX))
Source: https://stackoverflow.com/questions/8610507/htaccess-mod-rewrite-regex-apache-confusion-results-in-10k-404s-per-day