Even if journaling is on, is there still a chance to lose writes in MongoDB?
\"By default, the greatest extent of lost writes, i.e., those not made to the journal,
Posting a new answer to clean this up. I performed tests and read the source code again and I'm sure the irritation comes from an unfortunate sentence in the write concern documentation. With journaling enabled and j:true write concern, the write is durable, and there is no mysterious window for data loss.
Even if journaling is on, is there still a chance to lose writes in MongoDB?
Yes, because the durability also depends on the individual operations write concern.
"By default, the greatest extent of lost writes, i.e., those not made to the journal, are those made in the last 100 milliseconds."
This is from Manage Journaling, which indicates you could lose writes made since the last time the journal was flushed to disk.
That is correct. The journal is flushed by a separate thread asynchronously, so you can lose everything since the last flush.
If I want more durability, "To force mongod to commit to the journal more frequently, you can specify
j:true. When a write operation withj:trueis pending, mongod will reducejournalCommitIntervalto a third of the set value."
This irritated me, too. Here's what it means:
When you send a write operation with j:true, it doesn't trigger the disk flush immediately, and not on the network thread. That makes sense, because there could be dozens of applications talking to the same mongod instance. If every application were to use journaling a lot, the db would be very slow because it's fsyncing all the time.
Instead, what happens is that the 'durability thread' will take all pending journal commits and flush them to disk. The thread is implemented like this (comments mine):
sleepmillis(oneThird); //dur.cpp, line 801
for( unsigned i = 1; i <= 2; i++ ) {
// break, if any j:true write is pending
if( commitJob._notify.nWaiting() )
break;
// or the number of bytes is greater than some threshold
if( commitJob.bytes() > UncommittedBytesLimit / 2 )
break;
// otherwise, sleep another third
sleepmillis(oneThird);
}
// fsync all pending writes
durThreadGroupCommit();
So a pending j:true operation will cause the journal commit thread to commit earlier than it normally would, and it will commit all pending writes to the journal, including those that don't have j:true set.
Even in this case, it looks like flushing the journal to disk is asynchronous so there is still a chance to lose writes. Am I missing something about how to guarantee that writes are not lost?
The write (or the getLastError command) with a j:true journaled write concern will wait for the durability thread to finish syncing, so there's no risk of data loss (as far as the OS and hardware guarantee that).
The sentence "However, there is a window between journal commits when the write operation is not fully durable" probably refers to a mongod running with journaling enabled that accepts a write that does NOT use the j:true write concern. In that case, there's a chance of the write getting lost since the last journal commit.
I filed a docs bug report for this.