CSV files are empty even though items are scraped from the site

Submitted by 无人久伴 on 2020-01-17 20:04:09

Question


My requirement is to dump scraped items into two different CSV files. I am able to scrape the data, but the CSV files are empty. Could anyone please help with this?

Below is the code for the pipeline.py file and console logs:

Code for pipeline.py:

# -*- coding: utf-8 -*-



# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html
from scrapy.exporters import CsvItemExporter
from scrapy import signals
from pydispatch import dispatcher



def item_type(item):
    # The CSV file names are used (imported) from the scrapy spider.
    return type(item)



class SecfilingsPipeline(object):
    fileNamesCsv = ['NonDerivatives','Derivatives']

    def __init__(self):
        self.files = {}
        self.exporters = {}
        dispatcher.connect(self.spider_opened, signal=signals.spider_opened)
        dispatcher.connect(self.spider_closed, signal=signals.spider_closed)

    def spider_opened(self, spider):
        self.files = dict([ (name, open(name+'.csv','wb')) for name in self.fileNamesCsv ])
        for name in self.fileNamesCsv:
            self.exporters[name] = CsvItemExporter(self.files[name])




            if name == "NonDerivatives":
                print("File Name 1" + name)
                self.exporters[name].fields_to_export = ['TitleofSecurity','TransactionDate','TransactionCode','Amount','SecuritiesAcquirednDisposed','AmountOfSecurityOwned','OwnershipForm']
                self.exporters[name].start_exporting()



            if name == "Derivatives":
                print("File Name 2" + name)
                self.exporters[name].fields_to_export = ['TitleofDerivativeSecurity','TransactionDate','TransactionCode','SecuritiesAcquired','SecuritiesDisposed','TitleOfSecurity','Amount','AmountOfSecurityOwned','OwnershipForm']
                self.exporters[name].start_exporting()



    def spider_closed(self, spider):
        [e.finish_exporting() for e in self.exporters.values()]
        [f.close() for f in self.files.values()]


    def process_item(self, item, spider):
        typesItem = item_type(item)
        if typesItem in set(self.fileNamesCsv):
            self.exporters[typesItem].export_item(item)
        return item

I have also enabled the pipeline in settings.py (the ITEM_PIPELINES setting).
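
One thing that may be worth checking in the pipeline above: `item_type` returns `type(item)`, which is a class object, while `fileNamesCsv` holds plain strings, so the membership test in `process_item` cannot match and `export_item` is never called. The sketch below reproduces that comparison in isolation. The two item classes are hypothetical stand-ins (the real item definitions are not shown in the question), and it assumes the item class names are the same as the CSV file names.

```python
# Hypothetical stand-ins for the spider's item classes; the real ones are
# not shown in the question. Assumption: their class names match the CSV names.
class NonDerivatives(dict):
    pass


class Derivatives(dict):
    pass


file_names_csv = ['NonDerivatives', 'Derivatives']


def item_type(item):
    # Same as in the question: returns the class object itself.
    return type(item)


def item_type_name(item):
    # Alternative: returns the class name as a string.
    return type(item).__name__


item = NonDerivatives(TransactionCode='P')

# A class object is never equal to a string, so this check fails.
matches_class = item_type(item) in set(file_names_csv)

# The class name is a string and can match an entry in the list.
matches_name = item_type_name(item) in set(file_names_csv)

print(matches_class, matches_name)
```

If the class names do match in the real project, returning `type(item).__name__` from `item_type` would give `process_item` a key that actually exists in `self.exporters`.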

Source: https://stackoverflow.com/questions/59387321/csv-files-are-empty-even-items-are-scraped-from-site
