MongoDB query to remove duplicate documents from a collection

Posted by 给你一囗甜甜゛ on 2019-12-12 01:22:55

Question


I take input from a search box and insert it into MongoDB as a document using a regular insert query. For example, the word "cancer" is stored in a collection in the following format, with a unique "_id":

{
  "_id": {
    "$oid": "553862fa49aa20a608ee2b7b"
  },
  "0": "c",
  "1": "a",
  "2": "n",
  "3": "c",
  "4": "e",
  "5": "r"
}

Each document stores a single word in the format shown above, and I have many such documents. I want to remove the duplicate documents from the collection, but I cannot figure out how. How can I do this?


Answer 1:


An easy solution in the mongo shell:

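use your_db
// List every positional key, starting at '0', until you reach the maximum expected letter count ('5' here is only an example).
// Building the unique index with dropDups deletes the duplicate documents.
db.your_collection.createIndex({'0': 1, '1': 1, '2': 1, '3': 1, '4': 1, '5': 1}, {unique: true, dropDups: true, sparse: true, name: 'dropdups'})
// The index was only needed for the cleanup, so drop it afterwards.
db.your_collection.dropIndex('dropdups')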

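Notes:

  • If you have many documents, expect this procedure to take a very long time.
  • Be careful: this removes documents in place, so it is better to clone your collection first and try it there.
  • The dropDups option was removed in MongoDB 3.0, so this approach only works on earlier versions.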
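
For servers where building a unique index cannot drop duplicates, the same cleanup can be done on the client side. The following is a minimal sketch, not the answerer's method: it assumes the collection is small enough to load into memory, and the helper names `wordKey` and `findDuplicateIds` and the `sample` data are hypothetical, introduced only for illustration. It keeps the first document seen for each word and collects the "_id" values of the rest.

```javascript
// Build a comparison key from every field except _id, so that two documents
// holding the same letters in the same positions produce the same key.
function wordKey(doc) {
  const keys = Object.keys(doc).filter((k) => k !== '_id').sort();
  return JSON.stringify(keys.map((k) => [k, doc[k]]));
}

// Return the _id of every document whose key has already been seen.
// The first occurrence of each word is kept.
function findDuplicateIds(docs) {
  const seen = new Set();
  const duplicateIds = [];
  for (const doc of docs) {
    const key = wordKey(doc);
    if (seen.has(key)) {
      duplicateIds.push(doc._id);
    } else {
      seen.add(key);
    }
  }
  return duplicateIds;
}

// In-memory sample shaped like the documents in the question (made-up _id values).
const sample = [
  { _id: 'a', '0': 'c', '1': 'a', '2': 'n' },
  { _id: 'b', '0': 't', '1': 'a', '2': 'n' },
  { _id: 'c', '0': 'c', '1': 'a', '2': 'n' },
];

const duplicateIds = findDuplicateIds(sample);
console.log(duplicateIds); // [ 'c' ]
```

In the shell, the helper could be fed with `db.your_collection.find().toArray()` and the returned ids passed to `db.your_collection.deleteMany({_id: {$in: ids}})`. As with the index approach, try it on a copy of the collection first.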


Source: https://stackoverflow.com/questions/29818667/mongodb-query-to-remove-duplicate-documents-from-a-collection
