For my Scrapy project I'm currently using the ImagesPipeline. The downloaded images are stored with a SHA1 hash of their URLs as the file names.
How can I s
In Scrapy 0.12 I solved it with something like this:
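# Import paths as used in Scrapy 0.12
from scrapy.contrib.pipeline.images import ImagesPipeline
from scrapy.http import Request


class MyImagesPipeline(ImagesPipeline):

    # Name the full-size download after the last URL path segment
    def image_key(self, url):
        image_guid = url.split('/')[-1]
        return 'full/%s.jpg' % (image_guid)

    # Name the thumbnail version, prefixed with the thumbnail id
    def thumb_key(self, url, thumb_id):
        image_guid = thumb_id + url.split('/')[-1]
        return 'thumbs/%s/%s.jpg' % (thumb_id, image_guid)

    # item['images'] is expected to hold a single image URL
    def get_media_requests(self, item, info):
        yield Request(item['images'])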
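
To make the difference between the two naming schemes concrete, here is a minimal standalone sketch that runs without Scrapy. The helper names (`sha1_image_key`, `basename_image_key`, `basename_thumb_key`) and the example URL are made up for illustration; the first helper only approximates the default hash-based naming described above, and the other two mirror the `image_key` and `thumb_key` overrides.

```python
import hashlib


def sha1_image_key(url):
    # Hypothetical helper: hash-based name, approximating the default
    # behaviour described above (SHA1 hex digest of the URL).
    return 'full/%s.jpg' % hashlib.sha1(url.encode('utf-8')).hexdigest()


def basename_image_key(url):
    # Same logic as image_key above: use the last URL path segment.
    return 'full/%s.jpg' % url.split('/')[-1]


def basename_thumb_key(url, thumb_id):
    # Same logic as thumb_key above: prefix the segment with the thumb id.
    image_guid = thumb_id + url.split('/')[-1]
    return 'thumbs/%s/%s.jpg' % (thumb_id, image_guid)


url = 'http://example.com/images/photo123'  # assumed example URL
print(sha1_image_key(url))               # full/<40 hex characters>.jpg
print(basename_image_key(url))           # full/photo123.jpg
print(basename_thumb_key(url, 'small'))  # thumbs/small/smallphoto123.jpg
```

Note that with this scheme a URL that already ends in `.jpg` produces a name like `photo.jpg.jpg`, and two different URLs with the same last segment map to the same file.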