Host header missing from HTTP requests made by the Python requests library
Where is the mandatory HTTP/1.1 Host header field in HTTP request messages generated by the requests Python library?
import requests
response = requests.get("https://www.google.com/")
print(response.request.headers)
This outputs the following:
{'User-Agent': 'python-requests/2.22.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
By default, requests does not add the Host header to the request itself. If the header is not explicitly provided, the decision is delegated to the underlying http.client module from the standard library, which requests reaches through urllib3.
See http/client.py (skip_host is True if a 'Host' header is explicitly provided in requests.get):
if self._http_vsn == 11:
    # Issue some standard headers for better HTTP/1.1 compliance

    if not skip_host:
        # this header is issued *only* for HTTP/1.1
        # connections. more specifically, this means it is
        # only issued when the client uses the new
        # HTTPConnection() class. backwards-compat clients
        # will be using HTTP/1.0 and those clients may be
        # issuing this header themselves. we should NOT issue
        # it twice; some web servers (such as Apache) barf
        # when they see two Host: headers

        # If we need a non-standard port,include it in the
        # header. If the request is going through a proxy,
        # but the host of the actual URL, not the host of the
        # proxy.

        netloc = ''
        if url.startswith('http'):
            nil, netloc, nil, nil, nil = urlsplit(url)

        if netloc:
            try:
                netloc_enc = netloc.encode("ascii")
            except UnicodeEncodeError:
                netloc_enc = netloc.encode("idna")
            self.putheader('Host', netloc_enc)
        else:
            if self._tunnel_host:
                host = self._tunnel_host
                port = self._tunnel_port
            else:
                host = self.host
                port = self.port

            try:
                host_enc = host.encode("ascii")
            except UnicodeEncodeError:
                host_enc = host.encode("idna")

            # As per RFC 273, IPv6 address should be wrapped with []
            # when used as Host header

            if host.find(':') >= 0:
                host_enc = b'[' + host_enc + b']'

            if port == self.default_port:
                self.putheader('Host', host_enc)
            else:
                host_enc = host_enc.decode("ascii")
                self.putheader('Host', "%s:%s" % (host_enc, port))
As a result, we do not see the 'Host' header when inspecting response.request.headers. Those are the headers that requests prepared, while http.client adds 'Host' later, when the request is actually written out.
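The branch quoted above can be exercised without any network traffic. The following is only a minimal sketch using the standard library: RecordingConnection and host_header_for are hypothetical names introduced for this illustration, and example.com is a placeholder host. It relies on putrequest() writing to an in-memory buffer rather than opening a socket.

```python
import http.client


class RecordingConnection(http.client.HTTPConnection):
    """Hypothetical helper: records every header passed to putheader()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.recorded = {}

    def putheader(self, header, *values):
        # http.client may pass the value as bytes or str; normalise to str.
        self.recorded[header] = ", ".join(
            v.decode("ascii") if isinstance(v, bytes) else str(v) for v in values
        )
        super().putheader(header, *values)


def host_header_for(host, port=None, skip_host=False):
    """Return the Host value http.client would emit, or None if it emits none."""
    conn = RecordingConnection(host, port)
    # putrequest() only fills an in-memory buffer; no connection is made here.
    conn.putrequest("GET", "/", skip_host=skip_host)
    return conn.recorded.get("Host")


print(host_header_for("example.com"))                  # example.com
print(host_header_for("example.com", 8080))            # example.com:8080
print(host_header_for("example.com", skip_host=True))  # None
```

The third call mirrors what happens when the caller supplies a 'Host' header explicitly: skip_host is True, so http.client leaves the header to the caller.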
If we send a request to http://httpbin.org/get and print the response, we can see that the Host header was indeed sent.
import requests
response = requests.get("http://httpbin.org/get")
print('Response from httpbin/get')
print(response.json())
print()
print('response.request.headers')
print(response.request.headers)
Output:
Response from httpbin/get
{'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate',
'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.20.0'},
'origin': 'XXXXXX', 'url': 'https://httpbin.org/get'}
response.request.headers
{'User-Agent': 'python-requests/2.20.0', 'Accept-Encoding': 'gzip, deflate',
'Accept': '*/*', 'Connection': 'keep-alive'}
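The same check can be reproduced locally, without httpbin or requests. This is a rough sketch under stated assumptions: EchoHeadersHandler and fetch_echoed_headers are hypothetical names, the server is a throwaway stand-in for httpbin bound to 127.0.0.1 on a free port, and the client uses http.client directly, since that is the layer that adds the header.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHeadersHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for httpbin: replies with the request headers as JSON."""

    def do_GET(self):
        body = json.dumps({"headers": dict(self.headers.items())}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default per-request log line


def fetch_echoed_headers():
    """Start the echo server, send one GET with no caller-supplied headers, return (port, headers)."""
    server = HTTPServer(("127.0.0.1", 0), EchoHeadersHandler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/get")  # the caller passes no headers at all
        echoed = json.loads(conn.getresponse().read().decode("utf-8"))["headers"]
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return port, echoed


port, echoed = fetch_echoed_headers()
print(echoed)
```

Although the caller passed no headers, the server should report a Host entry of the form 127.0.0.1:<port>, because the port is not the default one for HTTP.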