Should I create pipeline to save files with scrapy?

旧时难觅i 2020-12-13 08:12

I need to save a file (.pdf) but I'm unsure how to do it. I need to save .pdfs and store them in such a way that they are organized in directories much like they are stor

3 answers
  •  爱一瞬间的悲伤
    2020-12-13 08:26

    It's a perfect tool for the job. In Scrapy, spiders transform web pages into structured data (items). Pipelines are postprocessors, but they use the same asynchronous infrastructure as spiders, so they are well suited to fetching media files.

    In your case, you'd first extract the locations of the PDFs in the spider, fetch them in a pipeline, and have another pipeline save the items.
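
    The fetching step could be sketched roughly as follows, assuming Scrapy's built-in FilesPipeline does the downloading and that the spider yields items with a file_urls list. The names pdf_path_from_url and MirroredPdfPipeline, the myproject module path, and the downloads folder are made up for illustration; they are not from the question.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def pdf_path_from_url(url: str) -> str:
    """Map a PDF URL to a relative path that mirrors the site's directories."""
    parsed = urlparse(url)
    # Drop empty, "." and ".." segments so the result stays inside FILES_STORE.
    parts = [p for p in unquote(parsed.path).split("/") if p not in ("", ".", "..")]
    if not parts:
        parts = ["index.pdf"]  # assumed fallback name for a bare domain URL
    return str(PurePosixPath(parsed.netloc, *parts))


try:
    from scrapy.pipelines.files import FilesPipeline
except ImportError:
    # Scrapy is not installed; the helper above still works on its own.
    FilesPipeline = object


class MirroredPdfPipeline(FilesPipeline):
    """Hypothetical pipeline: stores each download under its original URL path."""

    def file_path(self, request, response=None, info=None, *, item=None):
        # The returned path is relative to the FILES_STORE setting.
        return pdf_path_from_url(request.url)


# Assumed settings.py entries (module path and folder are placeholders):
# ITEM_PIPELINES = {"myproject.pipelines.MirroredPdfPipeline": 1}
# FILES_STORE = "downloads"
#
# In the spider, yield the PDF locations under the default "file_urls" key:
# yield {"file_urls": [response.urljoin(href)]}
```

    With this sketch, a PDF at https://example.com/docs/2020/report.pdf would be stored as example.com/docs/2020/report.pdf under the FILES_STORE folder. A second pipeline with a higher order number could then save the item metadata.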
