Scrapy: constructing non-duplicative list of absolute paths from relative paths

我在风中等你 2021-01-24 06:18

Question: How do I use Scrapy to build a de-duplicated list of absolute URLs from the relative paths found in the src attribute of img tags?

2 answers
  •  长发绾君心
    2021-01-24 06:59

    What about:

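    # MyItem is assumed to be a scrapy.Item subclass with a 'url' field.
    def url_join(self, response):
        item = MyItem()
        relative_urls = response.xpath('//img/@src').extract()
        # Resolve each src against the page URL; dict.fromkeys drops
        # duplicates while keeping first-seen order.
        absolute_urls = (response.urljoin(link) for link in relative_urls)
        item['url'] = list(dict.fromkeys(absolute_urls))
        # Yield one item holding the whole de-duplicated list.
        yield item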
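
To try the resolve-then-deduplicate step outside a spider, here is a minimal sketch using only the standard library. It assumes that `urllib.parse.urljoin` is a fair stand-in for `response.urljoin` (which, as far as I know, resolves against the response URL or the page's base URL). The helper name `absolute_unique`, the page URL, and the src values are all hypothetical and only for illustration.

```python
from urllib.parse import urljoin


def absolute_unique(base_url, srcs):
    """Resolve each src against base_url and drop duplicates,
    keeping the order in which URLs were first seen."""
    seen = set()
    result = []
    for src in srcs:
        absolute = urljoin(base_url, src.strip())
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Hypothetical page URL and src values, for illustration only.
page = "https://example.com/gallery/index.html"
srcs = [
    "img/a.png",
    "/img/b.png",
    "img/a.png",                              # exact repeat
    "https://example.com/gallery/img/a.png",  # same image, already absolute
]
print(absolute_unique(page, srcs))
```

Deduplicating after resolving, not before, matters here: a relative path and an absolute URL can point at the same image, and only the resolved forms compare equal.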
    
