I've been working with Scrapy but have run into a bit of a problem.
DjangoItem has a save method to persist items using the Django ORM. This is
I think it could be done more simply with:
    class DjangoSavePipeline(object):
        def process_item(self, item, spider):
            try:
                product = Product.objects.get(myunique_id=item['myunique_id'])
                # Already exists, just update it
                instance = item.save(commit=False)
                instance.pk = product.pk
            except Product.DoesNotExist:
                pass
            item.save()
            return item
This assumes your Django model has some unique ID that comes from the scraped data, such as a product ID (the myunique_id field here), and that the model is called Product.
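If it helps to see why copying the primary key makes the save an update rather than a new row, here is a rough toy sketch of the idea in plain Python. None of it is Django or Scrapy code: FakeTable, get_by_unique_id and upsert are hypothetical stand-ins I made up for illustration, and the only name taken from the pipeline is the myunique_id field.

```python
# Toy stand-ins (hypothetical, not Django or Scrapy classes) that mimic how
# saving with an existing primary key overwrites a row instead of adding one.


class FakeTable:
    """In-memory table keyed by primary key."""

    def __init__(self):
        self.rows = {}
        self._next_pk = 1

    def save(self, record):
        # No pk yet: allocate one, which amounts to an INSERT.
        if record.get("pk") is None:
            record["pk"] = self._next_pk
            self._next_pk += 1
        # Storing under an existing pk replaces that row, like an UPDATE.
        self.rows[record["pk"]] = dict(record)
        return record

    def get_by_unique_id(self, unique_id):
        # Linear scan is fine for a toy; a real table would use an index.
        for row in self.rows.values():
            if row["myunique_id"] == unique_id:
                return row
        return None


def upsert(table, scraped):
    """Reuse the existing row's pk when the unique id is already stored."""
    record = dict(scraped, pk=None)
    existing = table.get_by_unique_id(scraped["myunique_id"])
    if existing is not None:
        record["pk"] = existing["pk"]
    return table.save(record)


table = FakeTable()
upsert(table, {"myunique_id": "A1", "price": 10})
upsert(table, {"myunique_id": "A1", "price": 12})  # second crawl, same product
upsert(table, {"myunique_id": "B2", "price": 5})
```

After the three calls the table holds two rows, not three: the second crawl of A1 reused the first row's key and overwrote its price. The pipeline relies on the same effect when it sets instance.pk before calling item.save().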