Scrapy Return Multiple Items

Posted by 有些话、适合烂在心里 on 2019-12-04 12:35:41

I was looking for a solution to the same problem, and this is what I found: create a new ItemLoader for each quote selector and yield the loaded item inside the loop.

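from scrapy.loader import ItemLoader

def parse(self, response):
    # Build one loader per quote so that each quote becomes its own item.
    for selector in response.xpath("//*[@class='quote']"):
        l = ItemLoader(item=FirstSpiderItem(), selector=selector)
        l.add_xpath('text', './/span[@class="text"]/text()')
        # The leading dot keeps the query relative to this quote; a bare
        # '//small' would match every author on the page.
        l.add_xpath('author', './/small[@class="author"]/text()')
        l.add_xpath('tags', './/meta[@class="keywords"]/@content')
        yield l.load_item()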

    next_page = response.xpath(".//li[@class='next']/a/@href").extract_first()
    if next_page is not None:
        yield response.follow(next_page, callback=self.parse)

To remove quotation marks from the text, you can use an output processor in items.py.

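import scrapy
# Newer Scrapy releases expose this as itemloaders.processors.
from scrapy.loader.processors import MapCompose

def replace_quotes(text):
    # Strip the curly quotation marks that wrap each quote.
    for c in ['“', '”']:
        text = text.replace(c, "")
    return text

class FirstSpiderItem(scrapy.Item):
    # The processor belongs on the field that holds the quote text.
    text = scrapy.Field(output_processor=MapCompose(replace_quotes))
    author = scrapy.Field()
    tags = scrapy.Field()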
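
To see what the processor does to the collected values, here is a minimal sketch that runs without Scrapy. The sample strings are made up for illustration, and the list comprehension only imitates the way MapCompose applies a function to each value in turn; it is not Scrapy's own implementation.

```python
def replace_quotes(text):
    """Strip the curly quotation marks that wrap each quote."""
    for c in ['“', '”']:
        text = text.replace(c, "")
    return text

# Hypothetical sample values, standing in for what the loader collects.
values = ['“Be yourself.”', 'no quotes here']

# MapCompose calls the function once per collected value; the list
# comprehension mimics that behaviour for this illustration.
cleaned = [replace_quotes(v) for v in values]
print(cleaned)
```

Note that the result is still a list, so a field processed this way holds a list of strings. If a single string is preferred, Scrapy's TakeFirst processor is one option to look at.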

Please let me know whether it was helpful.

Give this a try. It will give you all the data that you wanted to scrape.

import scrapy

class QuotesSpider(scrapy.Spider):

    name = 'quotes'
    start_urls = ['http://quotes.toscrape.com/']

    def parse(self, response):
        for quote in response.xpath("//*[@class='quote']"):
            text = quote.xpath(".//span[@class='text']/text()").extract_first()
            author = quote.xpath(".//small[@class='author']/text()").extract_first()
            tags = quote.xpath(".//meta[@class='keywords']/@content").extract_first()
            yield {"Text": text, "Author": author, "Tags": tags}

        next_page = response.xpath(".//li[@class='next']/a/@href").extract_first()
        if next_page:
            next_page_url = response.urljoin(next_page)
            yield scrapy.Request(next_page_url)
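
The href taken from the "next" link is relative, which is why the spider joins it with the page URL before requesting it. The sketch below shows that kind of resolution with the standard library's urljoin; the href value '/page/2/' is an assumed example rather than something taken from the answer, and response.urljoin may additionally account for the page's base URL.

```python
from urllib.parse import urljoin

# URL of the page currently being parsed (the spider's start URL).
page_url = 'http://quotes.toscrape.com/'

# Assumed example of a relative href found in the "next" link.
href = '/page/2/'

# Resolve the relative link against the page URL to get an absolute URL.
next_page_url = urljoin(page_url, href)
print(next_page_url)
```

In the first answer, response.follow accepts the relative href directly, so the explicit join is not needed there.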