I need to ignore duplicate inserts when using insert_many with pymongo, where the duplicates are based on the index. I've seen this question asked on Stack Overflow, but I h
Adding more to Neil's solution.
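Passing ordered=False lets insert_many keep going with the remaining documents after a duplicate key error, instead of stopping at the first failure. bypass_document_validation=True only skips the collection's schema validation; it is not what lets the insert continue past duplicates.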
from pymongo import MongoClient, errors

DB_CLIENT = MongoClient()
MY_DB = DB_CLIENT['my_db']
TEST_COLL = MY_DB.dup_test_coll

doc_list = [
    {
        "_id": "82aced0eeab2467c93d04a9f72bf91e1",
        "name": "shakeel"
    },
    {
        "_id": "82aced0eeab2467c93d04a9f72bf91e1",  # duplicate error: 11000
        "name": "shakeel"
    },
    {
        "_id": "fab9816677774ca6ab6d86fc7b40dc62",  # this new doc gets inserted
        "name": "abc"
    }
]

try:
    # inserts new documents even on error
    TEST_COLL.insert_many(doc_list, ordered=False, bypass_document_validation=True)
except errors.BulkWriteError as e:
    print(f"Articles bulk insertion error {e}")

    # keep only the write errors that are not duplicate key errors (code 11000)
    panic_list = [x for x in e.details['writeErrors'] if x['code'] != 11000]
    if panic_list:
        print(f"these are not duplicate errors {panic_list}")
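The filtering step in the except block can also be pulled out into a small helper, so it can be checked without a running MongoDB server. This is only a minimal sketch: `split_write_errors` and `DUPLICATE_KEY_CODE` are hypothetical names, and `sample_details` is a hand-written dict that mimics the shape of `BulkWriteError.details`, not real server output.

```python
DUPLICATE_KEY_CODE = 11000  # MongoDB's duplicate key error code


def split_write_errors(details):
    """Split the 'writeErrors' list into (duplicates, others).

    `details` is expected to look like BulkWriteError.details,
    i.e. a dict with an optional 'writeErrors' list of dicts.
    """
    duplicates, others = [], []
    for err in details.get("writeErrors", []):
        if err.get("code") == DUPLICATE_KEY_CODE:
            duplicates.append(err)
        else:
            others.append(err)
    return duplicates, others


# Hand-written stand-in for e.details (assumption, for illustration only).
sample_details = {
    "nInserted": 2,
    "writeErrors": [
        {"index": 1, "code": 11000, "errmsg": "duplicate key"},
        {"index": 3, "code": 121, "errmsg": "some other failure"},
    ],
}

duplicates, others = split_write_errors(sample_details)
print(f"ignored {len(duplicates)} duplicate(s)")
if others:
    print(f"these are not duplicate errors {others}")
```

Inside the except block above, this would be called as `split_write_errors(e.details)`; re-raising when `others` is non-empty is one way to avoid silently swallowing errors that are not duplicates.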
And since we are talking about duplicates, it's worth checking this solution as well.