Using UUIDs instead of ObjectIDs in MongoDB

误落风尘 2021-01-30 03:32

We are migrating a database from MySQL to MongoDB for performance reasons and considering what to use for IDs of the MongoDB documents. We are debating between using ObjectIDs,

5 answers
  •  不要未来只要你来
    2021-01-30 04:35

    I think this is a great idea and so does Mongo; they list UUIDs as one of the common options for the _id field.

    Considerations:

    • Performance -- As other answers mention, benchmarks show that UUIDs slow down inserts. In the worst case measured (going from 10M to 20M docs in a collection) they're roughly 2-3x slower -- the difference between inserting 2,000 (UUID) and 7,500 (ObjectID) docs per second. This is a large difference, but its significance depends entirely on your use case. Will you be batch inserting millions of docs at a time? For most apps I've built, the common case is inserting individual documents. In that test the difference is much smaller (6,250 vs. 7,500; ~20%). In that case the ID type is simply not the limiting factor.
    • Portability -- Other DBs certainly do tend to have good UUID support so portability would be improved. Alternatively, since UUIDs are larger (more bits) it is possible to repack an ObjectID into the "shape" of a UUID. This approach isn't as nice as direct portability but it does give you a path forward.
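
As a rough illustration of the "repacking" idea above, here is a minimal Python sketch that uses only the standard library. The helper names (`objectid_to_uuid`, `uuid_to_objectid`) and the zero-padding layout are assumptions made up for this example, not a MongoDB or driver convention, and the result does not set the RFC 4122 version/variant bits.

```python
import uuid

PADDING = b"\x00" * 4  # assumed filler for the 4 bytes an ObjectID lacks


def objectid_to_uuid(oid_hex: str) -> uuid.UUID:
    """Pack a 12-byte ObjectID (24 hex characters) into a 16-byte UUID shape."""
    raw = bytes.fromhex(oid_hex)
    if len(raw) != 12:
        raise ValueError("an ObjectID is exactly 12 bytes (24 hex characters)")
    # Hypothetical layout: ObjectID bytes first, zero padding last.
    return uuid.UUID(bytes=raw + PADDING)


def uuid_to_objectid(u: uuid.UUID) -> str:
    """Recover the ObjectID hex string from a UUID built by objectid_to_uuid."""
    if u.bytes[12:] != PADDING:
        raise ValueError("this UUID was not produced by objectid_to_uuid")
    return u.bytes[:12].hex()


oid = "507f1f77bcf86cd799439011"
packed = objectid_to_uuid(oid)
print(packed)                    # 507f1f77-bcf8-6cd7-9943-901100000000
print(uuid_to_objectid(packed))  # 507f1f77bcf86cd799439011
```

Because the ObjectID bytes come first, the repacked values keep the same byte order as the original IDs, which may or may not matter for the target database.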

    Counter to some of the other answers:

    • UUIDs have native support -- You can use the UUID() function in the Mongo Shell exactly the same way you'd use ObjectId(): to convert a string into the equivalent BSON object.
    • UUIDs are not especially large -- They're 128 bits compared to ObjectIDs, which are 96 bits. (They should be encoded using binary subtype 0x04.)
    • UUIDs can include a timestamp -- Specifically, UUIDv1 encodes a timestamp with 60 bits of precision, compared to 32 bits in ObjectIDs. This is over 6 orders of magnitude more precision: 100-nanosecond ticks instead of seconds. It can actually be a decent way of storing create timestamps with more accuracy than Mongo/JS Date objects support, however...
      • The built-in UUID() function only generates v4 (random) UUIDs so, to leverage this, you'd need to lean on your app or Mongo driver for ID creation.
      • Unlike ObjectIDs, because of the way UUIDs are chunked, the timestamp doesn't give you a natural order. This can be good or bad depending on your use case.
      • Including timestamps in your IDs is often a Bad Idea. You end up leaking the created time of documents anywhere an ID is exposed. To make matters worse, v1 UUIDs also encode a unique identifier for the machine they're generated on, which can expose additional information about your infrastructure (e.g. number of servers). Of course ObjectIDs also encode a timestamp, so this is partly true for them too.
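
The timestamp points above can be checked with a short Python sketch using only the standard library. The helper names (`uuid1_created_at`, `objectid_created_at`, `make_uuid1`) and the node value `0x0123456789AB` are hypothetical and exist only for this illustration. It shows how the creation time can be read back from either kind of ID, that a v1 UUID also exposes a node identifier, and that v1 UUIDs do not sort chronologically by their bytes.

```python
import uuid
from datetime import datetime, timedelta, timezone

# UUIDv1 timestamps count 100-nanosecond ticks since 1582-10-15 (RFC 4122).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_created_at(u: uuid.UUID) -> datetime:
    """Read the creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry this timestamp")
    # datetime resolves microseconds, so the finest tick digit is dropped.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


def objectid_created_at(oid_hex: str) -> datetime:
    """The first 4 bytes of an ObjectID are seconds since the Unix epoch."""
    return datetime.fromtimestamp(int(oid_hex[:8], 16), tz=timezone.utc)


def make_uuid1(ticks: int, node: int = 0x0123456789AB) -> uuid.UUID:
    """Build a v1 UUID from a known tick count (hypothetical test helper)."""
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version 1
    # 0x80 sets the RFC 4122 variant; the clock sequence is left at zero.
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, node))


# Anyone holding an ID can recover when the document was created.
print(objectid_created_at("507f1f77bcf86cd799439011"))  # 2012-10-17 21:13:27+00:00
print(uuid1_created_at(uuid.uuid1()))

# The low 32 bits of the timestamp come first in the byte layout, so a later
# UUID can sort before an earlier one.
earlier = make_uuid1(0xFFFFFFFF)
later = make_uuid1(0x100000000)
print(earlier.time < later.time)    # True
print(earlier.bytes < later.bytes)  # False

# The node field is visible too.
print(hex(later.node))  # 0x123456789ab
```

If chronological ordering is needed, sorting on the extracted `time` value rather than on the raw ID is one option.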
