Learning the requests module
Before use
- pip install requests
Sending GET and POST requests and getting the response
- response = requests.get(url)
- response = requests.post(url, data={...})  # data is a dict holding the request body
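To make the two calls above more concrete, here is a minimal sketch that builds a GET and a POST request without sending them, so that the method, URL and encoded body can be inspected. The URL and the form fields are made up for illustration; `requests.Request(...).prepare()` is used only so the example runs offline.

```python
import requests

# Hypothetical endpoint, used only for illustration.
url = "https://example.com/login"

# prepare() builds the request without sending it, so it can be inspected offline.
get_request = requests.Request("GET", url).prepare()
post_request = requests.Request("POST", url, data={"user": "alice", "lang": "zh"}).prepare()

print(get_request.method, get_request.url)
# A dict passed as data is form-encoded into the request body.
print(post_request.method, post_request.body)
print(post_request.headers["Content-Type"])
```

Calling `requests.get(url)` or `requests.post(url, data=...)` prepares the request in the same way and then sends it.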
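Attributes and methods of response
- response.text
  - Often comes out garbled; when it does, set response.encoding = "utf-8" before reading it
- response.content.decode()
  - Converts the binary response body into a str (UTF-8 by default)
- response.request.url  # URL the request was sent to
- response.url  # URL of the response
- response.request.headers  # request headers
- response.headers  # response headers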
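The attributes listed above can be tried out without touching a real website. The sketch below is only an illustration under stated assumptions: it starts a throwaway HTTP server on 127.0.0.1 (the `DemoHandler` class and the page text are invented for this example) that returns UTF-8 bytes without declaring a charset, then reads the response in the ways described.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

PAGE_TEXT = "你好, requests"


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local page: UTF-8 bytes, no charset in Content-Type."""

    def do_GET(self):
        body = PAGE_TEXT.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

response = requests.get(url, timeout=5)

print(response.encoding)          # encoding that requests guessed from the headers
print(response.content.decode())  # decode the raw bytes ourselves (UTF-8 by default)

response.encoding = "utf-8"       # tell requests which encoding .text should use
print(response.text)

print(response.request.url)       # URL the request was sent to
print(response.url)               # URL of the response
print(response.request.headers)   # request headers
print(response.headers)           # response headers

server.shutdown()
server.server_close()
```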
The right way to get the page source
(Try the following three in order; in most cases one of them returns the correctly decoded string)
- response.content.decode()
- response.content.decode("gbk")
- response.text
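The "try them in order" advice can be wrapped in a small helper. This is a sketch, not the author's code: `decode_body` is a hypothetical name, it works on raw bytes such as `response.content`, and its last resort (replacing undecodable bytes) is an assumption that stands in for falling back to `response.text`.

```python
def decode_body(content: bytes) -> str:
    """Try UTF-8 first, then GBK; as a last resort, replace undecodable bytes."""
    for encoding in ("utf-8", "gbk"):
        try:
            return content.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Assumption: a lossy UTF-8 decode is acceptable when both attempts fail.
    return content.decode("utf-8", errors="replace")


# The same text encoded two ways, standing in for response.content.
utf8_bytes = "中文网页".encode("utf-8")
gbk_bytes = "中文网页".encode("gbk")

print(decode_body(utf8_bytes))
print(decode_body(gbk_bytes))
```

With a real response, the call would be `decode_body(response.content)`.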
Sending requests with headers
- Purpose: imitate a browser so the server does not block the request and returns the same content a browser would get
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36",
    "Referer": "https://www.baidu.com/",
}
response = requests.get(url, headers=headers)
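A short sketch of why the headers matter: by default requests announces itself as `python-requests/<version>`, which servers can easily tell apart from a browser. The example below inspects the headers that would be sent, without sending anything; the URL is hypothetical and the User-Agent string is shortened for readability.

```python
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # shortened for the example
    "Referer": "https://www.baidu.com/",
}

# Without custom headers, requests identifies itself as python-requests/<version>.
print(requests.utils.default_user_agent())

# prepare_request() merges the session's default headers with ours, without sending.
session = requests.Session()
prepared = session.prepare_request(
    requests.Request("GET", "https://example.com/", headers=headers)  # hypothetical URL
)
print(prepared.headers["User-Agent"])
print(prepared.headers["Referer"])
```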
Using the timeout parameter
- requests.get(url, headers=headers, timeout=3)  # raises an exception if the server has not responded within 3 seconds
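When the timeout expires, requests raises `requests.exceptions.Timeout`, so the call is usually wrapped in a try/except. The helper below is a minimal sketch; the name `fetch` and the choice to return `None` on timeout are assumptions made for illustration.

```python
import requests


def fetch(url, headers=None, timeout=3):
    """Return the decoded body, or None if the server does not answer in time."""
    try:
        response = requests.get(url, headers=headers, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both connect timeouts and read timeouts.
        return None
    return response.content.decode()
```

A caller can then check for `None` and decide whether to retry or skip the URL.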
Learning the retrying module
- pip install retrying
import requests
from retrying import retry
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36"}
@retry(stop_max_attempt_number=3)  # try up to 3 times; the exception is raised only if all 3 attempts fail
def _parse_url(url):
    print("*" * 100)  # printed once per attempt, so retries are visible
    response = requests.get(url, headers=headers, timeout=5)
    return response.content.decode()
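Handling cookie-related requests
- Carry the cookie directly when requesting the URL
  - Put the cookie in headers
    headers = {"User-Agent":"......", "Cookie":"......"}
  - Pass a cookie dict to the cookies parameter
    requests.get(url, cookies=cookie_dict)
- Send a POST request first to obtain the cookie, then request the post-login page with that cookie
  session = requests.session()  # create a session; it offers the same request methods as requests
  session.post(url, data=data, headers=headers)  # send the POST request; the session automatically stores the cookies the server sets
  session.get(url)  # sent with the cookies saved earlier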
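Both approaches end up sending a `Cookie` request header. The sketch below shows this offline by preparing requests instead of sending them; the URL and the `sessionid` value are made up, and the cookie is placed into the session by hand to stand in for one that a login response would normally set.

```python
import requests

url = "https://example.com/profile"  # hypothetical URL

# Approach 1: a cookie dict passed through the cookies parameter.
with_dict = requests.Request("GET", url, cookies={"sessionid": "abc123"}).prepare()
print(with_dict.headers["Cookie"])

# Approach 2: a session stores cookies in session.cookies and attaches them
# to every later request made through that session.
session = requests.Session()
session.cookies.set("sessionid", "abc123")  # stands in for a cookie set by a login response
from_session = session.prepare_request(requests.Request("GET", url))
print(from_session.headers["Cookie"])
```

In real use, `session.post(login_url, data=...)` fills `session.cookies` from the server's `Set-Cookie` headers, so nothing has to be set by hand.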
Source: CSDN
Author: SEHDY
Link: https://blog.csdn.net/weixin_42384816/article/details/104610470