MongoDB: groupby subdocument and count + add total count

Submitted by 痞子三分冷 on 2019-12-23 17:42:02

Question


What I want to achieve

Suppose I have the following document, with a "views" array of subdocuments:

{
    "id":1,
    "url":"mysite.com",
    "views": 
     [
       {"ip":"1.1.1.1","date":"01-01-2015"},
       {"ip":"2.2.2.2","date":"01-01-2015"},
       {"ip":"1.1.1.1","date":"01-01-2015"},
       {"ip":"1.1.1.1","date":"01-01-2015"}
     ]
}

I want to count:

  1. how many views there are for each distinct "ip" value
  2. and also count the total of subdocuments in "views"

if possible in the same query, to achieve the following result:

[  
  {  
    "_id":"2.2.2.2",
    "count":1
  },
  {  
    "_id":"1.1.1.1",
    "count":3
  },
  {  
    "_id":"total",
    "count":4
  }
]

What I have achieved so far

Using the MongoDB Aggregation Framework, I have managed to achieve point 1 as follows:

db.collection.aggregate([
    {
        "$unwind": "$views"
    },
    {
        "$group": {
            "_id": "$views.ip",
            "count": {
                "$sum": 1
            }
        }
    }
])

which returns:

[  
  {  
    "_id":"2.2.2.2",
    "count":1
  },
  {  
    "_id":"1.1.1.1",
    "count":3
  }
]

I also want the returned array to include this extra document:

{  
  "_id":"total",
  "count":4
}

That would give me the result described above, but I am stuck and have not been able to do it.


Answer 1:


This is not possible within the same aggregation pipeline using only the $unwind and $group stages shown here. The pipeline processes documents as a stream: each stage transforms the documents passing through it and does not need to produce one output document for every input document; some stages generate new documents, others filter documents out. In the scenario above, adding another $group step to get both the grouped IP counts and the total count would produce a different result from the one you are after:

db.collection.aggregate([
    {
        "$unwind": "$views"
    },
    {
        "$group": {
            "_id": "$views.ip",
            "count": {
                "$sum": 1
            }
        }
    },
    {
        "$group": {
            "_id": null,
            "total": {
                "$sum": "$count"
            }
        }
    }
])

This returns only the total count, because $group consumes all of its input documents (here, the documents holding the per-IP counts) and outputs one document per distinct group. With "_id": null, every input document falls into the same group, so the extra stage collapses the per-IP documents into a single total document.
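To see why the second $group leaves only the total, the two stages can be emulated in plain JavaScript. This is only a rough illustration of the data flow, not how MongoDB executes a pipeline; `unwindViews` and `groupSum` are hypothetical helper names made up for this sketch, and the input is the sample document from the question.

```javascript
// Sample document from the question.
const doc = {
    id: 1,
    url: "mysite.com",
    views: [
        { ip: "1.1.1.1", date: "01-01-2015" },
        { ip: "2.2.2.2", date: "01-01-2015" },
        { ip: "1.1.1.1", date: "01-01-2015" },
        { ip: "1.1.1.1", date: "01-01-2015" }
    ]
};

// Emulates {"$unwind": "$views"}: one output document per array element.
function unwindViews(docs) {
    return docs.flatMap(d => d.views.map(v => ({ ...d, views: v })));
}

// Emulates a $group with a $sum accumulator: keyFn picks the group _id,
// valueFn picks the amount added to the accumulator stored in outField.
function groupSum(docs, keyFn, valueFn, outField) {
    const acc = new Map();
    for (const d of docs) {
        const key = keyFn(d);
        acc.set(key, (acc.get(key) || 0) + valueFn(d));
    }
    return [...acc].map(([key, sum]) => ({ _id: key, [outField]: sum }));
}

// First $group: count views per IP.
const ipCounts = groupSum(unwindViews([doc]), d => d.views.ip, () => 1, "count");

// Second $group with _id null: every per-IP document lands in one group.
const totalOnly = groupSum(ipCounts, () => null, d => d.count, "total");

console.log(ipCounts);  // one document per distinct IP
console.log(totalOnly); // a single document; the per-IP documents are gone
```

The second call receives only the per-IP documents and returns one document for the single null group, which is why the per-IP counts do not appear in the final output.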

However, you can get the total count as an extra field within each grouped document in the final result. The following example accomplishes this with an initial $project pipeline stage that computes the total via the $size operator:

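db.collection.aggregate([
    {
        // Keep the views array and record its length before unwinding
        "$project": {
            "views": 1,
            "views_size": { "$size": "$views" }
        }
    },
    {
        // Emit one document per element of the views array
        "$unwind": "$views"
    },
    {
        // Count views per IP and carry the array length along as "total"
        "$group": {
            "_id": "$views.ip",
            "count": {
                "$sum": 1
            },
            "total": { "$first": "$views_size" }
        }
    }
])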

Sample Output

[  
  {  
    "_id": "2.2.2.2",
    "count": 1,
    "total": 4
  },
  {  
    "_id": "1.1.1.1",
    "count": 3,
    "total": 4
  }
]
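If the exact shape from the question is needed (a "total" entry alongside the per-IP entries), one option is to append that entry in application code after the aggregation returns. The sketch below is plain JavaScript with no driver calls; `appendTotal` is a hypothetical helper name, and the input array is the per-IP result shown in the question.

```javascript
// Hypothetical helper: takes the per-IP results of the $unwind/$group
// pipeline and appends a {"_id": "total"} entry that sums all counts.
function appendTotal(groups) {
    const total = groups.reduce((sum, g) => sum + g.count, 0);
    return [...groups, { _id: "total", count: total }];
}

// Per-IP counts as returned by the pipeline in the question.
const perIp = [
    { _id: "2.2.2.2", count: 1 },
    { _id: "1.1.1.1", count: 3 }
];

const result = appendTotal(perIp);
console.log(JSON.stringify(result, null, 2));
```

This keeps the pipeline unchanged and assumes the whole per-IP result fits comfortably in application memory.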


Source: https://stackoverflow.com/questions/35178741/mongodb-groupby-subdocument-and-count-add-total-count
