Missing Host header in HTTP requests from the requests Python library

被刻印的时光 ゝ submitted on 2020-12-31 15:26:52

Question


Where is the HTTP/1.1 mandatory Host header field in HTTP request messages generated by the requests Python library?

import requests

response = requests.get("https://www.google.com/")
print(response.request.headers)

Outputs this:

{'User-Agent': 'python-requests/2.22.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}


Answer 1:


requests does not add the Host header to the request by default. If it is not passed explicitly, the decision is delegated to the underlying http module.

See this section of http/client.py:

(If a 'Host' header is explicitly provided in requests.get, then skip_host is True.)

    if self._http_vsn == 11:
        # Issue some standard headers for better HTTP/1.1 compliance

        if not skip_host:
            # this header is issued *only* for HTTP/1.1
            # connections. more specifically, this means it is
            # only issued when the client uses the new
            # HTTPConnection() class. backwards-compat clients
            # will be using HTTP/1.0 and those clients may be
            # issuing this header themselves. we should NOT issue
            # it twice; some web servers (such as Apache) barf
            # when they see two Host: headers

            # If we need a non-standard port,include it in the
            # header.  If the request is going through a proxy,
            # but the host of the actual URL, not the host of the
            # proxy.

            netloc = ''
            if url.startswith('http'):
                nil, netloc, nil, nil, nil = urlsplit(url)

            if netloc:
                try:
                    netloc_enc = netloc.encode("ascii")
                except UnicodeEncodeError:
                    netloc_enc = netloc.encode("idna")
                self.putheader('Host', netloc_enc)
            else:
                if self._tunnel_host:
                    host = self._tunnel_host
                    port = self._tunnel_port
                else:
                    host = self.host
                    port = self.port

                try:
                    host_enc = host.encode("ascii")
                except UnicodeEncodeError:
                    host_enc = host.encode("idna")

                # As per RFC 273, IPv6 address should be wrapped with []
                # when used as Host header

                if host.find(':') >= 0:
                    host_enc = b'[' + host_enc + b']'

                if port == self.default_port:
                    self.putheader('Host', host_enc)
                else:
                    host_enc = host_enc.decode("ascii")
                    self.putheader('Host', "%s:%s" % (host_enc, port)) 
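
The fallback branch above (no netloc in the URL) can be condensed into a small standalone function. This is only a sketch of those rules, not the library's code: host_header_value is a hypothetical name, the proxy tunnel case is left out, and the default port is assumed to be 80 (plain HTTP).

```python
def host_header_value(host, port, default_port=80):
    """Hypothetical helper mirroring the fallback branch quoted above."""
    try:
        host_enc = host.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII host names are sent in their IDNA form.
        host_enc = host.encode("idna")

    # IPv6 literals contain ':' and are wrapped in [] in the Host header.
    if ":" in host:
        host_enc = b"[" + host_enc + b"]"

    value = host_enc.decode("ascii")
    # The port is appended only when it differs from the default port.
    if port != default_port:
        value = "%s:%s" % (value, port)
    return value


print(host_header_value("httpbin.org", 80))     # httpbin.org
print(host_header_value("httpbin.org", 8080))   # httpbin.org:8080
print(host_header_value("::1", 8080))           # [::1]:8080
print(host_header_value("bücher.example", 80))  # IDNA-encoded host name
```

The point to take away is that the value depends only on the connection's host and port, which is why the caller does not have to supply it.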

As a result, the 'Host' header does not appear when we inspect the headers that requests prepared for the server.

If we send a request to http://httpbin.org/get and print the response, we can see that the Host header was indeed sent.

import requests

response = requests.get("http://httpbin.org/get")
print('Response from httpbin/get')
print(response.json())
print()
print('response.request.headers')
print(response.request.headers)

Outputs this:

Response from httpbin/get
{'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 
 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.20.0'},
 'origin': 'XXXXXX', 'url': 'https://httpbin.org/get'}

response.request.headers
{'User-Agent': 'python-requests/2.20.0', 'Accept-Encoding': 'gzip, deflate', 
 'Accept': '*/*', 'Connection': 'keep-alive'}
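
The same behaviour can be checked without a network by driving http.client directly and recording the bytes it would write. This is a rough sketch rather than what requests does internally (requests goes through urllib3, which builds on http.client). RecordingConnection and raw_request_head are hypothetical names, and overriding send() is just a convenient way to avoid opening a socket.

```python
import http.client


class RecordingConnection(http.client.HTTPConnection):
    """Hypothetical helper: collects outgoing bytes instead of opening a socket."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sent = b""

    def send(self, data):
        # http.client passes the assembled request line and headers to send().
        self.sent += data


def raw_request_head(host, port=None, headers=None):
    """Return the request line and headers http.client would send."""
    conn = RecordingConnection(host, port)
    conn.request("GET", "/get", headers=headers or {})
    return conn.sent.decode("ascii")


# No Host header supplied by the caller: http.client adds one itself.
print(raw_request_head("httpbin.org"))

# Host supplied explicitly: skip_host kicks in, so it is sent as given, once.
print(raw_request_head("httpbin.org", headers={"Host": "example.test"}))
```

In the first printout a Host: httpbin.org line appears even though the caller never set it; in the second only the caller's value is present.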


Source: https://stackoverflow.com/questions/57770557/missing-host-header-in-http-requests-from-the-requests-python-library
