Question
On my website I have the following category URL structure:
/category.php?id=6
(id=6 is the internet category)
My SEO friendly url is like:
/category/6/internet/
The problem is that a category can be accessed in either form, and because of that I'm getting duplicate content on Google.
So, I'm wondering how can I fix that.
Should I disallow any URLs containing ? in robots.txt? If so, how can I properly set that up?
Should I set up a "Moved Permanently" (301) redirect in .htaccess? If so, how can I properly set that up?
My current .htaccess rule for categories is:
RewriteRule ^category/([^/]*)/([^/]*)/$ category.php?id=$1&name=$2 [L]
Answer 1:
You just need to set the canonical link tag in the head section of your pages.
see http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
and http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394
It will look something like
<link rel="canonical" href="http://www.example.com/category/6/internet/"/>
on the category 6 page
You could also do a 301 redirect for the category.php URLs in your .htaccess. Note that the pattern in a RewriteRule never sees the query string, so you have to capture id and name with a RewriteCond instead. Matching against %{THE_REQUEST} (the original request line) also prevents the redirect from looping with your existing internal rewrite:
RewriteCond %{THE_REQUEST} \s/category\.php\?id=([^&]+)&name=([^\s&]+)
RewriteRule ^category\.php$ /category/%1/%2/? [R=301,L]
(The trailing ? in the substitution discards the original query string so it isn't re-appended to the redirect target.)
If you didn't want to go the route of rewrite rules, you could put the following code at the top of category.php (the original answer said config.php, but the script being requested here is category.php):
if (preg_match('/^\/category\.php/', $_SERVER['REQUEST_URI'])) {
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: /category/" . rawurlencode($_GET['id']) . "/" . rawurlencode($_GET['name']) . "/");
    die();
}
Either way is up to you, but if I were you I would use the rewrite-rule option to redirect to the SEO-friendly URL.
Answer 2:
Or get rid of the non-SEO URL entirely:
- always generate the SEO URL (you should do that anyway)
- in category.php, check whether $_SERVER['REQUEST_URI'] is the SEO form and, if not, redirect to it
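A minimal sketch of that check at the top of category.php might look like this; the SEO URL format is taken from the rewrite rule in the question, while the variable names and the use of rawurlencode are assumptions:

```php
<?php
// Top of category.php: force the canonical SEO URL.
$id   = isset($_GET['id'])   ? $_GET['id']   : '';
$name = isset($_GET['name']) ? $_GET['name'] : '';

// The canonical form produced by the RewriteRule in the question.
$seoUrl = '/category/' . rawurlencode($id) . '/' . rawurlencode($name) . '/';

// Compare only the path actually requested; after the internal rewrite,
// REQUEST_URI still holds the URL the visitor typed.
$requestedPath = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if ($requestedPath !== $seoUrl) {
    // Visitor came in via /category.php?id=...&name=... (or some other form).
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: ' . $seoUrl);
    exit;
}
```

Comparing paths rather than the full URI avoids a redirect loop: a request for /category/6/internet/ is internally rewritten to category.php, but its REQUEST_URI path already matches the SEO form, so it is served directly.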
Answer 3:
I would suggest using a canonical link in the document head to ensure Google uses the correct URL; see Google's documentation on rel="canonical".
It's really easy to implement: just put this in the HEAD section of the page (Google recommends using an absolute URL here):
<link rel="canonical" href="http://www.example.com/your/url"/>
Google treats the canonical link similarly to a 301 redirect, which means you won't have any duplicate-content issues. It also means most of the link juice gets passed on (reportedly between 90% and 99%). If you used robots.txt or .htaccess to block the duplicate instead, the blocked page would lose all its SEO value.
Just make sure you do this on every page, as it's a page-specific rule. Pointing every canonical at the domain root would effectively 301 all your pages to the home page.
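One way to keep the tag page-specific is to generate it from the current request. This is a sketch; the www.example.com host and the id/name parameter names (taken from the question's rewrite rule) are assumptions:

```php
<?php
// Inside the <head> of category.php: emit a canonical link for this category.
$canonical = 'http://www.example.com/category/'
           . rawurlencode($_GET['id']) . '/'
           . rawurlencode($_GET['name']) . '/';

echo '<link rel="canonical" href="' . htmlspecialchars($canonical) . '"/>';
```

Building the href from the request parameters ensures every category page points at its own SEO URL rather than a shared one.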
Source: https://stackoverflow.com/questions/12827183/duplicated-content-on-google-htaccess-or-robots-txt