Requests - get content-type/size without fetching the whole page/content


悲&欢浪女 2021-02-07 12:35

I have a simple website crawler. It works fine, but sometimes it gets stuck on large content such as ISO images, .exe files, and other large downloads. Guessing content-type using

4 Answers

Happy的楠姐 (OP)

2021-02-07 13:16
Yes.

You can use the Session.head method to create HEAD requests:
```
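# HEAD returns the status line and headers only; the body is never transferred.
response = session.head(url, timeout=self.pageOpenTimeout, headers=customHeaders)
# .get() returns None instead of raising KeyError when the header is absent.
contentType = response.headers.get('content-type')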
```
A HEAD request is similar to a GET request, except that the server does not send the message body.

Here is a quote from Wikipedia:

HEAD Asks for the response identical to the one that would correspond to a GET request, but without the response body. This is useful for retrieving meta-information written in response headers, without having to transport the entire content.
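
To show how this could fit into a crawler, here is a minimal sketch that sends a HEAD request first and only approves a full download when the headers look acceptable. The names `parse_metadata`, `should_fetch`, `MAX_BYTES`, and `ALLOWED_TYPES` are hypothetical and not part of Requests; the size limit and type list are arbitrary assumptions. It also assumes the server answers HEAD sensibly, which not every server does, and Content-Length may be missing.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical limits for illustration; tune them for your own crawler.
MAX_BYTES = 5 * 1024 * 1024
ALLOWED_TYPES = ("text/html", "application/xhtml+xml")


def parse_metadata(headers: Mapping[str, str]) -> Tuple[Optional[str], Optional[int]]:
    """Return (media_type, size_in_bytes); either value may be None."""
    # Lower-case the keys so the helper also works with a plain dict.
    lowered = {key.lower(): value for key, value in headers.items()}
    content_type = lowered.get("content-type")
    # Drop parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower() if content_type else None
    try:
        size = int(lowered["content-length"])
    except (KeyError, ValueError):
        size = None  # header absent or malformed
    return media_type, size


def should_fetch(session, url, timeout=10.0):
    """Send a HEAD request and decide whether a full GET looks worthwhile."""
    # Session.head does not follow redirects by default, so opt in explicitly.
    response = session.head(url, timeout=timeout, allow_redirects=True)
    media_type, size = parse_metadata(response.headers)
    if media_type not in ALLOWED_TYPES:
        return False  # unknown or unwanted type (including a missing header)
    # An unknown size is allowed through; a known size must be under the limit.
    return size is None or size <= MAX_BYTES


# Usage (needs the third-party `requests` package and network access):
#   import requests
#   with requests.Session() as session:
#       if should_fetch(session, "https://example.com/"):
#           page = session.get("https://example.com/", timeout=10.0)
```

Passing the session in as a parameter lets the crawler reuse one connection pool for both the HEAD and the later GET. Because a missing Content-Length is let through here, a stricter crawler might prefer to skip such URLs instead.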