MongoDB as a Time Series Database

星月不相逢 2020-12-02 11:21

I'm trying to use MongoDB as a time series database and was wondering if anyone could suggest how best to set it up for that scenario.

The time series data is ver

2 answers
  •  醉话见心
    2020-12-02 12:04

    Obviously this is an old question, but I came across it while researching MongoDB for time series data. I thought it might be worth sharing the following approach: allocate complete documents in advance and perform update operations instead of new insert operations. Note, this approach was documented here and here.

    Imagine you are storing a value every second. Consider the following document structure:

    {
      timestamp: ISODate("2013-10-10T23:06:37.000Z"),
      type: "spot_EURUSD",
      value: 1.2345
    },
    {
      timestamp: ISODate("2013-10-10T23:06:38.000Z"),
      type: "spot_EURUSD",
      value: 1.2346
    }
    

    This is comparable to a standard relational approach. In this case, you produce one document per value recorded, which causes a lot of insert operations. We can do better. Consider the following:

    {
      timestamp_minute: ISODate("2013-10-10T23:06:00.000Z"),
      type: "spot_EURUSD",
      values: {
        0: 1.2345,
        …  
        37: 1.2346,
        38: 1.2347,
        … 
        59: 1.2343
      }
    }
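
    As a rough sketch of how a single sample could be written into a per-minute document like this one, the filter can select the document by its minute and type, and `$set` with dot notation can target one slot. The helper names `minuteStart` and `sampleUpdate` are hypothetical and not part of the original answer.

```javascript
// Hypothetical helpers; names are illustrative, not from the original answer.

// Truncate a Date to the start of its minute (UTC).
function minuteStart(ts) {
  const d = new Date(ts.getTime());
  d.setUTCSeconds(0, 0);
  return d;
}

// Build the filter and update for one sample. The filter selects the
// per-minute document; $set with dot notation writes a single slot,
// keyed by the second within the minute.
function sampleUpdate(type, ts, value) {
  return {
    filter: { timestamp_minute: minuteStart(ts), type: type },
    update: { $set: { ["values." + ts.getUTCSeconds()]: value } },
  };
}

const op = sampleUpdate(
  "spot_EURUSD",
  new Date("2013-10-10T23:06:37.000Z"),
  1.2346
);
console.log(JSON.stringify(op));
```

    With a driver or the shell, the pair would then be handed to something like `updateOne(op.filter, op.update)` on the collection.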
    

    Now we can write one document and then perform 59 updates. This is much better because updates are atomic, individual writes are smaller, and there are other performance and concurrency benefits. But what if we wanted to store an entire day of per-minute values in one document, and not just a single minute? In a flat document, this would require walking along up to 1440 entries to get the last value. To improve on this, we can extend further to the following:

    {
      timestamp_day: ISODate("2013-10-10T00:00:00.000Z"),
      type: "spot_EURUSD",
      values: {
        0: { 0: 1.2343, 1: 1.2343, …, 59: 1.2343},
        1: { 0: 1.2343, 1: 1.2343, …, 59: 1.2343},
        …,
        22: { 0: 1.2343, 1: 1.2343, …, 59: 1.2343},
        23: { 0: 1.2343, 1: 1.2343, …, 59: 1.2343}
      }
    }
    

    Using this nested approach, we now have to walk at most 24 + 60 = 84 entries to reach the very last value of the day.
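
    The 24 + 60 bound can be illustrated with a small sketch that scans the hours backward and then the minutes of the first hour it finds. It assumes a sparse document in which an hour key is present only when it holds at least one value; `lastValue` is a hypothetical helper, not something from the original answer.

```javascript
// Hypothetical sketch: find the most recent value in a nested day document
// by scanning hours backward, then the minutes of that hour backward.
// Assumes an hour key exists only if it contains at least one value.
function lastValue(doc) {
  let steps = 0; // number of entries inspected
  for (let h = 23; h >= 0; h--) {
    steps++;
    const hour = doc.values[h];
    if (hour === undefined) continue;
    for (let m = 59; m >= 0; m--) {
      steps++;
      if (hour[m] !== undefined) {
        return { hour: h, minute: m, value: hour[m], steps: steps };
      }
    }
  }
  return null; // no values recorded yet
}

// Worst case: the only value sits in hour 0, minute 0.
const worstCase = { values: { 0: { 0: 1.2343 } } };
const found = lastValue(worstCase);
console.log(found.steps); // 24 hour checks + 60 minute checks = 84
```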

    If we build the documents in advance with every value slot filled with padding, we can be sure that the document will not change size and therefore will not be moved.
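
    A minimal sketch of building the padded `values` sub-document for one day is shown here. The helper name `paddedDayValues` is hypothetical, and the padding value `0` is an assumption, since the original does not say which placeholder to use.

```javascript
// Hypothetical helper: build the nested values map for one day,
// 24 hours x 60 minutes, with every slot pre-filled with `padding`.
function paddedDayValues(padding) {
  const values = {};
  for (let h = 0; h < 24; h++) {
    values[h] = {};
    for (let m = 0; m < 60; m++) {
      values[h][m] = padding;
    }
  }
  return values;
}

const values = paddedDayValues(0); // 0 is an assumed placeholder

// Count the pre-allocated slots across all hours.
const slotCount = Object.keys(values).reduce(
  (n, h) => n + Object.keys(values[h]).length,
  0
);
console.log(slotCount);
```

    The document would be inserted once with this map in place, and each later sample would overwrite one slot through a path of the form `values.<hour>.<minute>`.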
