Scrapy image download how to use custom filename


眼角桃花 2020-11-28 07:27

For my Scrapy project I'm currently using the ImagesPipeline. The downloaded images are stored with a SHA1 hash of their URLs as the file names.

How can I s
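For context, the default naming described in the question can be approximated with the standard library alone. This is a minimal sketch, assuming the pipeline hashes the request URL with SHA1 and stores the result as a `.jpg` under `full/`; the helper name `default_image_path` is hypothetical, and the exact format may differ between Scrapy versions.

```python
import hashlib


def default_image_path(url):
    """Approximate the default ImagesPipeline path: full/<sha1 of URL>.jpg.

    Hypothetical helper for illustration; it is not part of Scrapy.
    """
    image_guid = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return f"full/{image_guid}.jpg"


if __name__ == "__main__":
    # Example URL is made up; any image URL produces a 40-character hex name.
    print(default_image_path("http://example.com/cat.png"))
```

The file name is derived only from the URL, which is why it carries no information about the item it belongs to.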

6 answers

星月不相逢 (original poster)

2020-11-28 07:50

I did a quick and dirty hack for that. In my case, I stored the title of each image in my feed, and each item had only one entry in image_urls, so I wrote the following script. It renames the image files in the /images/full/ directory to the corresponding title from the item feed, which I had exported as JSON.

    import json
    import os

    # Directory where ImagesPipeline stored the downloads, and the exported item feed.
    img_dir = os.path.join(os.getcwd(), 'images', 'full')
    item_file = os.path.join(os.getcwd(), 'data.json')

    with open(item_file, 'r') as item_json:
        items = json.load(item_json)

    for item in items:
        if len(item['images']) > 0:
            # Only the first image of each item is renamed.
            cur_file = item['images'][0]['path'].split('/')[-1]
            cur_format = cur_file.split('.')[-1]
            new_title = item['title'] + '.%s' % cur_format
            file_path = os.path.join(img_dir, cur_file)
            os.rename(file_path, os.path.join(img_dir, new_title))

It's nasty and not recommended, but it is a naive alternative approach.
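If you do go the rename route, a slightly more defensive variant might look like the sketch below. It assumes the same feed layout as above (a JSON list of items with `title` and `images[0]['path']`); the function name `rename_images`, the sanitizing rule, and the skip-on-conflict behavior are my own assumptions, not anything Scrapy provides.

```python
import json
import os
import re


def rename_images(img_dir, feed_path):
    """Rename each item's first downloaded image to a sanitized form of its title.

    Hypothetical helper: returns a list of (old_name, new_name) pairs.
    """
    with open(feed_path, "r", encoding="utf-8") as feed:
        items = json.load(feed)

    renamed = []
    for item in items:
        images = item.get("images") or []
        if not images:
            continue
        cur_file = os.path.basename(images[0]["path"])
        ext = os.path.splitext(cur_file)[1]
        # Replace characters that are unsafe in file names (assumed rule).
        safe_title = re.sub(r'[\\/:*?"<>|]+', "_", item["title"]).strip() or "untitled"
        src = os.path.join(img_dir, cur_file)
        dst = os.path.join(img_dir, safe_title + ext)
        # Skip missing sources and never overwrite an existing file.
        if not os.path.exists(src) or os.path.exists(dst):
            continue
        os.rename(src, dst)
        renamed.append((cur_file, safe_title + ext))
    return renamed
```

Calling `rename_images(os.path.join(os.getcwd(), 'images', 'full'), 'data.json')` would do the same job as the script above, while leaving files alone when two items share a title.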
