SEO Canonical URL in Greek characters

Submitted by 人盡茶涼 on 2021-01-28 12:03:47

Question


I have a URL that includes Greek letters:

http://www.mydomanain.com/gr/τιτλος-σελιδας/20/

I am using $_SERVER['REQUEST_URI'] to insert the value into the canonical link in my page head, like this:

<link rel="canonical" href="http://www.mydomanain.com<?php echo $_SERVER['REQUEST_URI']; ?>" />

The problem is that when I view the page source, the URL is displayed with characters like ...CE%B3%CE%B3%CE%B5%CE%BB..., but when I click on it, the link is displayed as it should be.

Will this cause any penalty from search engines?


Answer 1:


No, this is the correct behaviour. Any character in a URL can appear in the page source either in its human-readable form or in percent-encoded form, and the encoded form can be decoded back using the relevant character encoding (UTF-8 here). When the link is clicked, the encoded value is sent to the server, which translates it back to its human-readable form.

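It is common to encode characters that may cause issues in URLs; a space (%20) is a typical example (see an ASCII table). The %xx syntax is the hexadecimal value of a single byte, so a character that takes several bytes in UTF-8 becomes several %xx groups. The Greek letter γ, for instance, is %CE%B3.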
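As a small illustration of the %xx mapping described above, the sketch below encodes a single Greek letter and decodes the fragment quoted in the question. It assumes the PHP source file itself is saved as UTF-8; the variable names are only for illustration.

```php
<?php
// Percent-encoding operates on the bytes of the UTF-8 representation.
// Assumption: this file is saved as UTF-8.
$letter = 'γ'; // U+03B3, two bytes in UTF-8: 0xCE 0xB3

$encoded = rawurlencode($letter);
echo $encoded, "\n"; // %CE%B3

// Decoding the fragment from the question restores the Greek text.
$decoded = rawurldecode('%CE%B3%CE%B3%CE%B5%CE%BB');
echo $decoded, "\n"; // γγελ
```

Both forms therefore describe the same bytes, which is why the browser can show the readable version while the source holds the encoded one.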

Search engines will be aware of this and interpret the characters correctly.

When sending the HTML to the browser, ensure that the character set specified by the server matches the one your HTML actually uses. Search engines will also look for this to correctly decode the HTML. The correct way to do this is via HTTP response headers. In PHP these are set with header():

// Change utf-8 to a different encoding if your pages use one
header('Content-Type: text/html; charset=utf-8');



Answer 2:


URLs can only consist of a limited subset of ASCII characters. You cannot in fact use "Greek characters" in a URL. All characters outside this limited ASCII range must be percent-encoded.

Now, browsers do two things:

  1. If they encounter URLs in your HTML which fall outside this rule, i.e. which contain unencoded non-ASCII characters, the browser will helpfully encode them for you before sending off the request to your server.
  2. For some (unambiguous) characters, the browser will display them in their decoded form in the address bar, to enhance the UX.

So, yeah, all is good. In fact, you should be percent-encoding your URLs yourself if they aren't already.
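One way to percent-encode the path yourself is sketched below. It is a minimal example, not a definitive implementation: canonical_path() is a hypothetical helper name, the input is assumed to be a UTF-8 path without a query string, and each segment is decoded first so that a path the browser has already encoded is not encoded twice.

```php
<?php
// Hypothetical helper: percent-encode every path segment, keeping "/" as-is.
// rawurldecode() runs first so already-encoded input is not double-encoded.
function canonical_path(string $path): string
{
    $segments = explode('/', $path);
    $encoded = array_map(function (string $segment): string {
        return rawurlencode(rawurldecode($segment));
    }, $segments);

    return implode('/', $encoded);
}

$raw   = '/gr/τιτλος-σελιδας/20/';   // path from the question
$once  = canonical_path($raw);
$twice = canonical_path($once);      // encoding again changes nothing

echo $once, "\n";
var_dump($once === $twice); // bool(true)
```

In a page, the input could come from parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH), and the result should still go through htmlspecialchars() before being written into the href attribute.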



Source: https://stackoverflow.com/questions/34332307/seo-canonical-url-in-greek-characters
