For my Scrapy project I'm currently using the ImagesPipeline. The downloaded images are stored with a SHA1 hash of their URLs as the file names.
How can I s
I did a quick, nasty hack for that. In my case, I stored each image's title in my feed, and each item had only one entry in image_urls, so I wrote the following script. It renames the image files in the /images/full/ directory to the corresponding title from the item feed, which I had exported as JSON.
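import json
import os

# Full-size images live in the 'full' subdirectory of the image store (./images here).
img_dir = os.path.join(os.getcwd(), 'images', 'full')
# JSON feed exported by the spider; each item carries 'title' and 'images'.
feed_path = os.path.join(os.getcwd(), 'data.json')

with open(feed_path, 'r') as item_json:
    items = json.load(item_json)

for item in items:
    # Skip items that have no downloaded image.
    if len(item['images']) > 0:
        # 'path' is relative to the image store; keep only the file name.
        cur_file = item['images'][0]['path'].split('/')[-1]
        cur_format = cur_file.split('.')[-1]
        new_title = item['title'] + '.%s' % cur_format
        file_path = os.path.join(img_dir, cur_file)
        os.rename(file_path, os.path.join(img_dir, new_title))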
It's nasty and not recommended, but it works as a naive alternative approach.
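One reason a rename like this is fragile: a title can contain characters that are not allowed in file names (such as / or :), and two items can share the same title, in which case a later rename can overwrite an earlier file or fail. Below is a minimal sketch of how the new name could be cleaned up before renaming. The helpers safe_file_name and unique_file_name are hypothetical names of my own, not part of Scrapy, and the set of characters replaced is an assumption about what common file systems reject.

```python
import os
import re


def safe_file_name(title, extension, max_length=100):
    """Turn an item title into a file name that common file systems accept."""
    # Replace characters rejected by Windows or POSIX paths, plus control characters.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', '_', title)
    # Collapse whitespace runs, cap the length, and trim trailing dots and spaces.
    cleaned = re.sub(r'\s+', ' ', cleaned)[:max_length].strip(' .')
    # Fall back to a placeholder if nothing usable is left.
    return '%s.%s' % (cleaned or 'untitled', extension)


def unique_file_name(name, taken):
    """Append a counter to the name until it is not in the set of used names."""
    stem, ext = os.path.splitext(name)
    candidate, counter = name, 1
    while candidate in taken:
        candidate = '%s_%d%s' % (stem, counter, ext)
        counter += 1
    taken.add(candidate)
    return candidate
```

In a rename loop, you would build the name with safe_file_name(item['title'], cur_format), pass it through unique_file_name with a set shared across all items, and hand the result to os.rename.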