MongoDB group by hour

慢半拍i 2020-12-03 18:25

I save tweets to mongo DB:

 twit.stream('statuses/filter', {'track': ['animal']}, function(stream) {
    stream.on('data', function(data) {
        c         


        
3 Answers
  • 2020-12-03 18:58

    Lalit's answer did not work for me; it kept giving me zeroes. Instead I did:

    db.tweets.aggregate(
     { "$project": {
          "y":{"$year":"$created_at"},
          "m":{"$month":"$created_at"},
          "d":{"$dayOfMonth":"$created_at"},
          "h":{"$hour":"$created_at"},
          "tweet":1 }
     },
     { "$group":{ 
           "_id": { "year":"$y","month":"$m","day":"$d","hour":"$h"},
           'count':{$sum:1} 
       }
     })
    

    The only difference is the accumulator: 'count':{$sum:1} counts one per document instead of summing the tweet field.

    This might help someone new to Mongo like me.
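
    To illustrate why the two accumulators behave differently, here is a minimal plain-JavaScript emulation (no MongoDB involved). It assumes the tweet field holds text; the sample documents and the helper names sumOne and sumField are made up for this sketch. MongoDB's $sum ignores non-numeric and missing values, which is why summing such a field yields 0.

```javascript
// Plain-JavaScript emulation of the two accumulators; no MongoDB needed.
// The sample documents are invented for illustration only.
const tweets = [
  { tweet: 'a cat video', created_at: new Date('2020-12-03T18:05:00Z') },
  { tweet: 'dog pictures', created_at: new Date('2020-12-03T18:40:00Z') },
  { tweet: 'more animals', created_at: new Date('2020-12-03T19:10:00Z') },
];

// Emulates { $sum: 1 }: adds 1 per document, i.e. counts documents.
function sumOne(docs) {
  return docs.reduce((total) => total + 1, 0);
}

// Emulates { $sum: "$field" }: non-numeric or missing values are ignored,
// so a string field contributes nothing to the total.
function sumField(docs, field) {
  return docs.reduce(
    (total, doc) => (typeof doc[field] === 'number' ? total + doc[field] : total),
    0
  );
}

console.log(sumOne(tweets));            // 3
console.log(sumField(tweets, 'tweet')); // 0
```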

  • 2020-12-03 19:11

    There should be no need for a $project stage here, since the date operators can be used directly in the $group stage when defining the grouping _id. This saves an extra pass over the entire collection just to compute the projected fields.

    Also, you are only counting, so simply use { "$sum" : 1 }. Summing a field that did not exist was the problem that resulted in 0:

        $this->collection->aggregate(array(
            array(
                '$group' => array(
                    "_id" => array( 
                        "y" => array( '$year' => '$created_at' ),
                        "m" => array( '$month' => '$created_at' ),
                        "d" => array( '$dayOfMonth' => '$created_at' ),
                        "h" => array( '$hour' => '$created_at' ),
                    ),
                    "total" => array( '$sum' => 1 ),
                ),
            )
        ));
    

    If anything, add a $match stage at the start of the pipeline to filter by date. If one day of output is acceptable, you only need $hour in the grouping, and you reduce the working set size, which makes the query faster. That is probably what you want to do anyway.
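
    The $match suggestion above could look roughly like the following sketch. buildHourlyPipeline is a hypothetical helper, the date is an arbitrary example, and the $sort stage is an addition for readable output. It assumes created_at is stored as a BSON date; note that $hour reports hours in UTC.

```javascript
// Builds a pipeline that counts documents per hour for a single UTC day.
// buildHourlyPipeline is a hypothetical helper, not part of any driver.
function buildHourlyPipeline(dayStart) {
  const dayEnd = new Date(dayStart.getTime() + 24 * 60 * 60 * 1000);
  return [
    // Filter first so the later stages only see one day of documents.
    { $match: { created_at: { $gte: dayStart, $lt: dayEnd } } },
    // Within a single day, the hour alone identifies each bucket.
    { $group: { _id: { $hour: '$created_at' }, total: { $sum: 1 } } },
    // Order the buckets by hour.
    { $sort: { _id: 1 } },
  ];
}

const pipeline = buildHourlyPipeline(new Date('2020-12-03T00:00:00Z'));
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell this could be run as: db.tweets.aggregate(pipeline)
```

    Filtering before grouping also lets MongoDB use an index on created_at, if one exists.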

  • 2020-12-03 19:14

    Here is how you can group using the aggregation framework directly in the mongo console:

    db.tweets.aggregate(
     { "$project": {
          "y":{"$year":"$created_at"},
          "m":{"$month":"$created_at"},
          "d":{"$dayOfMonth":"$created_at"},
          "h":{"$hour":"$created_at"},
          "tweet":1 }
     },
     { "$group":{ 
           "_id": { "year":"$y","month":"$m","day":"$d","hour":"$h"},
           "total":{ "$sum": "$tweet"}
       }
     })
    

    For more options you can look here: http://docs.mongodb.org/manual/reference/operator/aggregation-date/

    You will also need to find the appropriate way of using the aggregation framework from whichever programming language you are using.
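
    Since the question streams tweets from Node.js, one possible way to call the aggregation from there is sketched below. countTweetsPerHour is a hypothetical helper that expects a collection object from the official mongodb driver; the connection details in the trailing comment (URI, database name) are placeholders, not taken from the question. It counts documents with { $sum: 1 } rather than summing a field.

```javascript
// Hypothetical helper: groups documents by year/month/day/hour of created_at
// and counts them. `collection` is expected to be a mongodb driver Collection.
async function countTweetsPerHour(collection) {
  const pipeline = [
    {
      $group: {
        _id: {
          year: { $year: '$created_at' },
          month: { $month: '$created_at' },
          day: { $dayOfMonth: '$created_at' },
          hour: { $hour: '$created_at' },
        },
        // Count one per document instead of summing a field.
        total: { $sum: 1 },
      },
    },
  ];
  // aggregate() returns a cursor; toArray() resolves to the result documents.
  return collection.aggregate(pipeline).toArray();
}

// Possible usage (placeholder URI and database name):
//   const { MongoClient } = require('mongodb');
//   const client = new MongoClient('mongodb://localhost:27017');
//   await client.connect();
//   const rows = await countTweetsPerHour(client.db('test').collection('tweets'));
```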
