I have a collection with 9 million records. I am currently using the following script to update the entire collection:
simple_update.js
db.my
Starting in MongoDB 4.2, db.collection.update() accepts an aggregation pipeline, which allows a field to be created or updated from the value of another field. This kind of update can therefore run entirely server-side:
// { Y: 456, X: 3 }
// { Y: 3452, X: 2 }
db.collection.update(
  {},
  [{ $set: { pid: {
    $sum: [ 2571, { $multiply: [ -1, "$Y" ] }, { $multiply: [ 2572, "$X" ] } ]
  }}}],
  { multi: true }
)
// { Y: 456, X: 3, pid: 9831 }
// { Y: 3452, X: 2, pid: 4263 }
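As a quick sanity check on the expected pid values, the arithmetic of the $sum expression can be mirrored in plain JavaScript without touching the database. This is only a sketch: computePid is a hypothetical helper name introduced here, and the two sample documents are the ones from the comments in the update() call.

```javascript
// Plain-JavaScript mirror of the pipeline expression; no database involved.
// computePid is a hypothetical helper name used only for this check.
function computePid(doc) {
  // Same three terms as the $sum array: 2571, -1 * Y, 2572 * X
  return 2571 + (-1 * doc.Y) + (2572 * doc.X);
}

// The two sample documents from the update() example.
const samples = [
  { Y: 456, X: 3 },
  { Y: 3452, X: 2 },
];

for (const doc of samples) {
  console.log({ ...doc, pid: computePid(doc) });
}
// { Y: 456, X: 3, pid: 9831 }
// { Y: 3452, X: 2, pid: 4263 }
```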
The first argument to update(), {}, is the filter that selects which documents to update (all documents in this case).
The second argument, [{ $set: { pid: ... } }], is the update aggregation pipeline (note the square brackets, which signal a pipeline rather than a classic update document). $set is an aggregation stage introduced in MongoDB 4.2 and an alias of $addFields. Note how pid is computed directly from the values of X ($X) and Y ($Y) in the same document.
Don't forget the third argument, { multi: true }; without it, only the first matching document is updated.
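On MongoDB 4.2 or later, db.collection.updateMany() also accepts an aggregation pipeline and updates every matching document, so the multi option is not needed. A minimal sketch, assuming the same filter and pipeline as the update() call; applyPidUpdate is a hypothetical wrapper name, and coll is assumed to be a collection handle such as db.collection in the shell:

```javascript
// Same filter and pipeline as in the update() call.
const filter = {};
const pipeline = [
  { $set: { pid: {
    $sum: [ 2571, { $multiply: [ -1, "$Y" ] }, { $multiply: [ 2572, "$X" ] } ]
  }}}
];

// applyPidUpdate is a hypothetical wrapper; `coll` is assumed to be a
// collection handle exposing updateMany (for example db.collection).
function applyPidUpdate(coll) {
  // updateMany applies the pipeline to every document matching the filter,
  // so no { multi: true } option is passed.
  return coll.updateMany(filter, pipeline);
}

// In the shell: applyPidUpdate(db.collection)
```

Keeping the filter and pipeline in named variables makes it easy to narrow the filter first and try the update on a few documents before running it over the whole collection.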