What's the meaning of request.headers.setdefault() in Scrapy?

Submitted by 巧了我就是萌 on 2020-01-02 23:26:16

Question


I want to write a custom UserAgentMiddleware for Scrapy. I came across request.headers.setdefault('User-Agent', ua), but I don't understand what it does, and I couldn't find the method in either the Scrapy or the requests documentation.

Where can I find an explanation of it?


Answer 1:


headers behaves like a dictionary (Scrapy's Headers class is a case-insensitive dict subclass), so setdefault sets a value for a key only if that key isn't already present. The method itself is documented as dict.setdefault in the Python standard library documentation.

In other words, the middleware sets the User-Agent by default only if you haven't already set one in the spider.
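As a minimal sketch of that behavior, the snippet below uses a plain dict as a stand-in for request.headers (an assumption for illustration; Scrapy's real Headers object is case-insensitive and stores values as bytes). The agent strings are made up.

```python
# Plain dict standing in for request.headers (illustrative assumption).
headers = {}

# The key is absent, so the value is stored and returned.
first = headers.setdefault("User-Agent", "default-agent")

# The key is now present, so the existing value is kept and returned;
# "other-agent" is ignored.
second = headers.setdefault("User-Agent", "other-agent")

print(first, second, headers)
```

Both calls return the value that ends up stored under the key, which is why the second call does not replace the first one.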

You can set something like this in your spider code:

...
# Set the header explicitly in the spider before yielding the request
request.headers['User-Agent'] = 'My Custom User Agent'
yield request

This means that when the request reaches the middleware, the User-Agent won't be overridden or changed.

Any other middleware (or other component) that runs before this one can also set the User-Agent, and this code won't change it, because setdefault respects a value that was set earlier.
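A rough sketch of such a middleware might look like the following. The class name CustomUserAgentMiddleware and the default agent string are hypothetical, and the requests here are simple stand-in objects with a plain-dict headers attribute rather than real scrapy.Request instances, so the snippet runs without Scrapy installed. In a real project the class would be registered under DOWNLOADER_MIDDLEWARES in the settings.

```python
from types import SimpleNamespace


class CustomUserAgentMiddleware:
    """Hypothetical downloader middleware; name and default UA are illustrative."""

    def __init__(self, user_agent="my-default-agent/1.0"):
        self.user_agent = user_agent

    def process_request(self, request, spider):
        # Fill in User-Agent only when nothing earlier has set it.
        request.headers.setdefault("User-Agent", self.user_agent)
        # Returning None tells Scrapy to keep processing the request.
        return None


# Stand-ins for requests (assumption: plain dict headers, not Scrapy's Headers).
untouched = SimpleNamespace(headers={})
preset = SimpleNamespace(headers={"User-Agent": "My Custom User Agent"})

mw = CustomUserAgentMiddleware()
mw.process_request(untouched, spider=None)
mw.process_request(preset, spider=None)

print(untouched.headers)  # gets the middleware default
print(preset.headers)     # keeps the value set by the spider
```

The request that already carried a User-Agent leaves the middleware unchanged, while the one without it receives the default.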



Source: https://stackoverflow.com/questions/48050573/whats-the-meaning-of-request-headers-setdefault-in-scrapy
