Do search engines respect the HTTP header field “Content-Location”?

亡梦爱人 提交于 2020-01-12 04:32:10

问题


I was wondering whether search engines respect the HTTP header field Content-Location.

This could be useful, for example, when you want to remove the session ID argument out of the URL:

GET /foo/bar?sid=0123456789 HTTP/1.1
Host: example.com
…

HTTP/1.1 200 OK
Content-Location: http://example.com/foo/bar
…

Clarification:
I don’t want to redirect the request, as removing the session ID would lead to a completely different request and thus probably also a different response. I just want to state that the enclosed response is also available under its “main URL”.

Maybe my example was not a good representation of the intent of my question. So please take a look at What is the purpose of the HTTP header field “Content-Location”?.


回答1:


I think Google just announced the answer to my question: the canonical link relation for declaring the canonical URL.

Maile Ohye from Google wrote:

MickeyC said...
You should have used the Content-Location header instead, as per:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
"14.14 Content-Location"

@MikeyC: Yes, from a theoretical standpoint that makes sense and we certainly considered it. A few points, however, led us to choose :

  1. Our data showed that the "Content-Location" header is configured improperly on many web sites. Sometimes webmasters provide long, ugly URLs that aren’t even duplicates -- it's probably unintentional. They're likely unaware that their webserver is even sending the Content-Location header.

    It would've been extremely time consuming to contact site owners to clean up the Content-Location issues throughout the web. We realized that if we started with a clean slate, we could provide the functionality more quickly. With Microsoft and Yahoo! on-board to support this format, webmasters need to only learn one syntax.

  2. Often webmasters have difficulty configuring their web server headers, but can more easily change their HTML. rel="canonical" seemed like a friendly attribute.

http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html?showComment=1234714860000#c8376597054104610625




回答2:


Most decent crawlers do follow Content-Location. So, yes, search engines respect the Content-Location header, although that is no guarantee that the URL having the sid parameter will not be on the results page.




回答3:


In 2009 Google started looking at URIs qualified as rel=canonical in the response body.

Looks like since 2011, links formatted as per RFC5988 are also parsed from the header field Link:. It is also clearly mentioned in the Webmaster Tools FAQ as a valid option.

Guess this is the most up-to-date way of providing search engines some extra hypermedia breadcrumbs to follow - thus allow keeping you to keep them out of the response body when you don't actually need to serve it as content.




回答4:


In addition to using 'Location' rather than 'Content-Location' use the proper HTTP status code in your response depending on your reason for redirect. Search engines tend to favor permanent redirect (301) status vs temporary (302) status.




回答5:


Try the "Location:" header instead.



来源:https://stackoverflow.com/questions/438413/do-search-engines-respect-the-http-header-field-content-location

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!