Scrapy: overwrite JSON files instead of appending to them

长情又很酷 2020-12-28 10:57

Is there a way to make Scrapy overwrite the output file instead of appending to it?

Example:

scrapy crawl myspider -o "/path/to/json/my.json" -t json
scrapy cra         


        
6 answers
  •  时光取名叫无心
    2020-12-28 11:24

    This is an old, well-known "problem" with Scrapy. Every time you start a crawl and do not want to keep the results of previous runs, you have to delete the file first. The rationale is that you may crawl different sites, or the same site at different times, so silently overwriting the file could destroy results you have already gathered.

    One solution is to write your own item pipeline that opens the target file in 'w' (write) mode instead of 'a' (append) mode.

    To see how to write such a pipeline, look at the docs: http://doc.scrapy.org/en/latest/topics/item-pipeline.html#writing-your-own-item-pipeline (specifically for JSON exports: http://doc.scrapy.org/en/latest/topics/item-pipeline.html#write-items-to-a-json-file)
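
The pipeline idea above could look roughly like the following. This is a minimal sketch, not the code from the linked docs: the class name `OverwriteJsonPipeline` and the output path `my.json` are assumptions chosen for illustration, and it writes a JSON array rather than one object per line.

```python
import json


class OverwriteJsonPipeline:
    """Write items to a JSON file that is truncated at the start of every crawl."""

    # Hypothetical output path; change it to suit your project.
    path = "my.json"

    def open_spider(self, spider):
        # 'w' truncates any existing file; 'a' would append to it instead.
        self.file = open(self.path, "w", encoding="utf-8")
        self.file.write("[")
        self.first = True

    def close_spider(self, spider):
        # Close the JSON array so the file is valid JSON even with zero items.
        self.file.write("\n]\n")
        self.file.close()

    def process_item(self, item, spider):
        # dict(item) works for plain dicts and for scrapy.Item instances.
        separator = "\n" if self.first else ",\n"
        self.first = False
        self.file.write(separator + json.dumps(dict(item), ensure_ascii=False))
        return item
```

To use it, register the class in `ITEM_PIPELINES` in your project's `settings.py` (for example `{"myproject.pipelines.OverwriteJsonPipeline": 300}`, where `myproject.pipelines` is a placeholder module path) and run the spider without `-o`. Note that Scrapy 2.4 and later also provide an `-O` command-line option that overwrites the output file, so a custom pipeline is mainly useful on older versions.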
