I have a collection with 9 million records. I am currently using the following script to update the entire collection:
simple_update.js
db.my
I won't recommend using {multi: true} for a larger data set, because it is less configurable.
A better way using bulk insert.
Bulk operation is really helpful for scheduler tasks. Say you have to delete data older that 6 months daily. Use bulk operation. Its fast and won't slow down server. The CPU, memory usage is not noticeable when you do insert, delete or update over a billion documents. I found {multi:true} slowing down the server when you are dealing with million+ documents(require more research in this.)
See a sample below. It's a js shell script, can run it in server as a node program as well.(use npm module shelljs or similar to achieve this)
update mongo to 3.2+
The normal way of updating multiple unique document is
let counter = 0;
db.myCol.find({}).sort({$natural:1}).limit(1000000).forEach(function(document){
counter++;
document.test_value = "just testing" + counter
db.myCol.save(document)
});
It took 310-315 seconds when I tried. That's more than 5 minutes for updating a million documents.
My collection includes 100 million+ documents, so speed may differ for others.
The same using bulk insert is
let counter = 0;
// magic no.- depends on your hardware and document size. - my document size is around 1.5kb-2kb
// performance reduces when this limit is not in 1500-2500 range.
// try different range and find fastest bulk limit for your document size or take an average.
let limitNo = 2222;
let bulk = db.myCol.initializeUnorderedBulkOp();
let noOfDocsToProcess = 1000000;
db.myCol.find({}).sort({$natural:1}).limit(noOfDocsToProcess).forEach(function(document){
counter++;
noOfDocsToProcess --;
limitNo--;
bulk.find({_id:document._id}).update({$set:{test_value : "just testing .. " + counter}});
if(limitNo === 0 || noOfDocsToProcess === 0){
bulk.execute();
bulk = db.myCol.initializeUnorderedBulkOp();
limitNo = 2222;
}
});
The best time was 8972 millis. So in average it took only 10 seconds to update a million documents. 30 times faster than old way.
Put the code in a .js file and execute as mongo shell script.
If someone found a better way, please update. Lets use mongo in a faster way.