Question
I want to set a custom UserAgentMiddleware in Scrapy, but I don't understand what request.headers.setdefault('User-Agent', ua) does. I couldn't find the method in either the Scrapy or the requests documentation.
Where can I find an explanation of it?
Answer 1:
headers behaves like a normal dictionary, so setdefault sets a value for a key only if that key isn't already present.
In practice, this means the middleware sets the User-Agent by default, and only if you haven't already set one in your spider.
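As a quick illustration of that behaviour, here is a minimal sketch using a plain Python dict. It is only an assumption that this mirrors what happens on request.headers closely enough for the purpose of the question; the agent strings are made up for the example.

```python
# Plain dict standing in for request.headers (illustrative only).
headers = {}

# Key is missing, so setdefault stores the value.
headers.setdefault("User-Agent", "default-agent")
first = headers["User-Agent"]

# Key is already present, so setdefault leaves it untouched.
headers.setdefault("User-Agent", "other-agent")
second = headers["User-Agent"]

print(first)   # default-agent
print(second)  # default-agent
```

setdefault is a standard method of Python's built-in dict, which is why it is not described in the Scrapy or requests documentation.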
You can set something like this in your spider code:
...
request.headers['User-Agent'] = 'My Custom User Agent'
yield request
This means that when the request reaches the middleware, the User-Agent won't be overridden or changed.
Any other middleware (or other component) that runs before this one can also set the User-Agent, and this code won't change it, because setdefault respects a value that was set earlier.
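To make the "respects what was set earlier" point concrete, here is a rough sketch of a custom middleware. To stay runnable without Scrapy installed, it uses a hypothetical StubRequest class whose headers attribute is a plain dict; the class name CustomUserAgentMiddleware and the default agent string are also just illustrative. In a real project the request would be a Scrapy Request, and its headers object may differ in details (for example in how it stores values).

```python
class StubRequest:
    """Hypothetical stand-in for a Scrapy request; it only carries headers."""

    def __init__(self, headers=None):
        self.headers = dict(headers or {})


class CustomUserAgentMiddleware:
    """Sketch of a downloader middleware that supplies a fallback User-Agent."""

    def __init__(self, user_agent="my-crawler/1.0"):  # illustrative default
        self.user_agent = user_agent

    def process_request(self, request, spider):
        # Fill in User-Agent only when nothing earlier has set it.
        request.headers.setdefault("User-Agent", self.user_agent)
        return None  # None tells Scrapy to keep processing the request


mw = CustomUserAgentMiddleware()

plain = StubRequest()  # no User-Agent set by the spider
preset = StubRequest({"User-Agent": "My Custom User Agent"})  # set by the spider

mw.process_request(plain, spider=None)
mw.process_request(preset, spider=None)

print(plain.headers["User-Agent"])   # my-crawler/1.0
print(preset.headers["User-Agent"])  # My Custom User Agent
```

If the middleware should always win instead, plain item assignment (request.headers['User-Agent'] = ...) would overwrite whatever was set before.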
Source: https://stackoverflow.com/questions/48050573/whats-the-meaning-of-request-headers-setdefault-in-scrapy