How to Ignore Duplicate Key Errors Safely Using insert_many

忘掉有多难 2020-11-27 20:24

I need to ignore duplicate inserts when using insert_many with pymongo, where the duplicates are based on the index. I've seen this question asked on Stack Overflow, but I h

2 answers
  •  小蘑菇 (original poster)
    2020-11-27 21:09

    You can handle this by inspecting the BulkWriteError that insert_many raises. The exception is an object with several properties; the interesting part is its details attribute:

    from pymongo import MongoClient
    from pymongo.errors import BulkWriteError

    client = MongoClient()
    db = client.test

    collection = db.duptest

    # The second document repeats the _id of the first
    docs = [{'_id': 1}, {'_id': 1}, {'_id': 2}]

    try:
        # ordered=False keeps inserting after an individual document fails
        result = collection.insert_many(docs, ordered=False)
    except BulkWriteError as e:
        print(e.details['writeErrors'])
    

    On a first run, this will give the list of errors under e.details['writeErrors']:

    [
      { 
        'index': 1,
        'code': 11000, 
        'errmsg': u'E11000 duplicate key error collection: test.duptest index: _id_ dup key: { : 1 }', 
        'op': {'_id': 1}
      }
    ]
    

    On a second run, you see three errors because all items existed:

    [
      {
        "index": 0,
        "code": 11000,
        "errmsg": "E11000 duplicate key error collection: test.duptest index: _id_ dup key: { : 1 }",
        "op": {"_id": 1}
      },
      {
        "index": 1,
        "code": 11000,
        "errmsg": "E11000 duplicate key error collection: test.duptest index: _id_ dup key: { : 1 }",
        "op": {"_id": 1}
      },
      {
        "index": 2,
        "code": 11000,
        "errmsg": "E11000 duplicate key error collection: test.duptest index: _id_ dup key: { : 2 }",
        "op": {"_id": 2}
      }
    ]
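
Because the insert ran with ordered=False, every document that does not appear in writeErrors was written. A minimal sketch of recovering that list from the "index" field, assuming you still have the same docs list that was passed to insert_many. The helper name inserted_documents is hypothetical and not part of pymongo:

```python
def inserted_documents(docs, write_errors):
    """Return the documents written by an unordered insert_many.

    Assumes ordered=False, so every document without a write error
    was attempted and inserted.
    """
    # Positions (in the original docs list) of the documents that failed
    failed = {err["index"] for err in write_errors}
    return [doc for i, doc in enumerate(docs) if i not in failed]


docs = [{"_id": 1}, {"_id": 1}, {"_id": 2}]
# Shape of e.details['writeErrors'] from the first run (errmsg omitted)
first_run_errors = [{"index": 1, "code": 11000, "op": {"_id": 1}}]

print(inserted_documents(docs, first_run_errors))
# prints [{'_id': 1}, {'_id': 2}]
```

On the second run all three indexes are reported, so the same call returns an empty list.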
    

    So all you need to do is filter out the entries with "code": 11000 and only "panic" when anything else remains:

    # Runs inside the except block, where e is the BulkWriteError
    panic = [x for x in e.details['writeErrors'] if x['code'] != 11000]

    if len(panic) > 0:
        print("really panic")
    

    That gives you a mechanism for ignoring the duplicate key errors while still paying attention to anything that is actually a problem.
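
Putting the pieces together, one possible shape for a reusable helper is sketched below. The function names split_write_errors and insert_ignoring_duplicates are hypothetical; the sketch assumes that re-raising the original BulkWriteError is an acceptable way to "panic" and that only writeErrors need checking (details can also carry writeConcernErrors, which this sketch ignores):

```python
DUPLICATE_KEY = 11000  # MongoDB error code for a duplicate key


def split_write_errors(details):
    """Split BulkWriteError.details into (duplicates, other_errors)."""
    errors = details.get("writeErrors", [])
    duplicates = [err for err in errors if err.get("code") == DUPLICATE_KEY]
    others = [err for err in errors if err.get("code") != DUPLICATE_KEY]
    return duplicates, others


def insert_ignoring_duplicates(collection, docs):
    """Insert docs unordered; return how many were actually written.

    Duplicate key errors are swallowed, any other write error is re-raised.
    """
    # Imported here so split_write_errors stays usable without pymongo
    from pymongo.errors import BulkWriteError

    try:
        result = collection.insert_many(docs, ordered=False)
        return len(result.inserted_ids)
    except BulkWriteError as e:
        _, others = split_write_errors(e.details)
        if others:
            raise  # something other than a duplicate key went wrong
        return e.details["nInserted"]


# The classification step can be checked without a running server:
sample = {
    "nInserted": 1,
    "writeErrors": [
        {"index": 1, "code": 11000, "op": {"_id": 1}},
        {"index": 2, "code": 121, "op": {"_id": 2}},
    ],
}
duplicates, others = split_write_errors(sample)
print(len(duplicates), len(others))
# prints 1 1
```

With the collection from the answer, insert_ignoring_duplicates(collection, docs) would be the call site; the sample dict above only mimics the shape of e.details and is not real server output.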
