Should I create a pipeline to save files with Scrapy?

旧时难觅i 2020-12-13 08:12

I need to save a file (.pdf), but I'm unsure how to do it. I need to save .pdfs and store them in such a way that they are organized in directories much like they are stor

3 answers

爱一瞬间的悲伤 (original poster)

2020-12-13 08:26

It's a perfect tool for the job. The way Scrapy works is that you have spiders that transform web pages into structured data (items). Pipelines are postprocessors, but they use the same asynchronous infrastructure as spiders, so they are well suited to fetching media files.

In your case, you'd first extract the locations of the PDFs in the spider, fetch them in one pipeline, and have another pipeline save the items.
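A rough sketch of the fetching step, assuming the goal is to mirror each PDF's URL path on disk. It builds on Scrapy's built-in FilesPipeline, which downloads the URLs listed in an item's file_urls field into the directory named by the FILES_STORE setting. The helper pdf_path_for_url, the class MirroredPdfPipeline, and the module path myproject.pipelines are hypothetical names invented for illustration, not part of Scrapy.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse

try:
    from scrapy.pipelines.files import FilesPipeline
except ImportError:
    # Fallback so the path helper can be tried without Scrapy installed.
    FilesPipeline = object


def pdf_path_for_url(url):
    """Hypothetical helper: map a file URL to a relative path that
    mirrors the host and directory layout of the URL."""
    parsed = urlparse(url)
    # Drop empty, "." and ".." segments so the result stays inside FILES_STORE.
    parts = [p for p in unquote(parsed.path).split("/") if p not in ("", ".", "..")]
    if not parts:
        parts = ["index.pdf"]  # assumed default name for a bare directory URL
    return str(PurePosixPath(parsed.netloc, *parts))


class MirroredPdfPipeline(FilesPipeline):
    """Hypothetical pipeline: stores each downloaded file under a path
    derived from its URL instead of the default hash-based name."""

    def file_path(self, request, response=None, info=None, *, item=None):
        return pdf_path_for_url(request.url)


# Assumed settings.py entries (project and module names are placeholders):
# ITEM_PIPELINES = {"myproject.pipelines.MirroredPdfPipeline": 1}
# FILES_STORE = "downloads"
#
# In the spider, yield items that list the PDF links, for example:
# yield {"file_urls": [response.urljoin(href)]}
```

With this in place, the spider only collects the PDF links. The pipeline handles the downloads and records the results in each item's files field, so a later pipeline can save the items.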
