Max and group by in Mongodb

Submitted by 情到浓时终转凉″ on 2021-01-28 04:06:32

Question


We are migrating from SQL Server to MongoDB. I have a collection containing the fields TFN and Impressions. I need to translate a SQL query to MongoDB, but I got stuck.

The scenario: I need to select the top 5 impressions from the collection, grouped by TFN.

Select Top 5 a.TFN, a.MaxImpression as MaxCount from ( 
  Select TFN, Max(Impressions) MaxImpression 
  from tblData 
  Where TFN in (Select TFN From @tmpTFNList) and TrendDate between @StartDate AND @EndDate
  Group by TFN 
  ) a

This is the query in SQL Server. I need to achieve the same result in MongoDB. So far I have gone through MongoDB's aggregate and group functions, but I have not been able to produce the same output as the SQL query.

Note: I cannot work out how to combine the Max clause with Group by in MongoDB.

Here is the implementation I have tried:

db.getCollection("_core.data").aggregate([
       { 
           $match: 
           {
               $and: [
                   {
                       "TFN": 
                       {
                           $in: tfns 

                       }

                   } ,
                   { 
                       "TrendDate": 
                       {
                           $gte : 20170421,
                           $lte: 20170421

                       }
                   }]
           }
        }, 
        {
            $group: 
            {
               _id:"Impressions", 
               Impression: {
                   $max : "$Impressions"
               }
            }  
        }
    ])

Secondly, I tried:

db.getCollection("_core.adwordsPull.static").group({
    key: { TFN: 1,  Impressions: 1 },
    cond: { TFN:  {
                               $in: tfns 

                           },
                       { 
                           "TrendDate": 
                           {
                               $gte : 20170421,
                               $lte: 20170421

                           }
                       } },
    reduce: function( curr, result ) {

                result.total += curr.Impression;
             },
    initial: { total : 0 }
})

What is wrong with these approaches, and how can I correct them?

Edit 1: Sample Data

TFN Impression  TrendDate
84251456    12  20170424
84251456    15  20170424
84251456    18  20170424
84251456    19  20170424
84251456    22  20170424
84251456    23  20170423
84251456    24  20170423

84251455    25  20170423
84251455    30  20170423
84251455    35  20170424
84251455    24  20170423
84251455    22  20170423
84251455    21  20170424
84251455    22  20170424

Expected Output :

TFN  MaxCount
84251456    22
84251455    35

Answer 1:


To achieve the desired result, start by breaking down the SQL query, beginning with the subquery:

Select *
from tblData 
Where TFN in (Select TFN From @tmpTFNList) and TrendDate between @StartDate AND @EndDate

The equivalent MongoDB query follows:

db.getCollection("_core.data").aggregate([
    {
        "$match": {
            "TFN": { "$in": tmpTFNList },
            "TrendDate": {
                "$gte": startDate,
                "$lte": endDate
            }
        }
    }
])

The $group aggregate equivalent of

Select TFN, Max(Impressions) MaxImpression 
from tblData 
Where TFN in (Select TFN From @tmpTFNList) and TrendDate between @StartDate AND @EndDate
Group by TFN 

follows:

db.getCollection("_core.data").aggregate([
    {
        "$match": {
            "TFN": { "$in": tmpTFNList },
            "TrendDate": {
                "$gte": startDate,
                "$lte": endDate
            }
        }
    },
    {
        "$group": {
            "_id": "$TFN",
            "MaxImpression": { "$max": "$Impression" }
        }
    }
])
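
To see what $match followed by $group with $max computes, here is a minimal plain JavaScript sketch that mimics the two stages on the sample data from the question, without a MongoDB server. The values given to tmpTFNList, startDate and endDate are assumptions chosen to cover the whole sample; the original leaves them undefined.

```javascript
// Sample rows from the question (field names as in the sample data).
const docs = [
  { TFN: 84251456, Impression: 12, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 15, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 18, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 19, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 22, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 23, TrendDate: 20170423 },
  { TFN: 84251456, Impression: 24, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 25, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 30, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 35, TrendDate: 20170424 },
  { TFN: 84251455, Impression: 24, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 22, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 21, TrendDate: 20170424 },
  { TFN: 84251455, Impression: 22, TrendDate: 20170424 },
];

// Assumed filter values (not given in the original).
const tmpTFNList = [84251455, 84251456];
const startDate = 20170423;
const endDate = 20170424;

// $match: TFN in list AND TrendDate between startDate and endDate.
const matched = docs.filter(
  (d) =>
    tmpTFNList.includes(d.TFN) &&
    d.TrendDate >= startDate &&
    d.TrendDate <= endDate
);

// $group: one entry per TFN, keeping the largest Impression ($max).
const maxByTfn = new Map();
for (const d of matched) {
  const current = maxByTfn.get(d.TFN);
  if (current === undefined || d.Impression > current) {
    maxByTfn.set(d.TFN, d.Impression);
  }
}
const grouped = [...maxByTfn].map(([tfn, max]) => ({
  _id: tfn,
  MaxImpression: max,
}));

console.log(grouped);
```

With these assumed filter values, the maximum for TFN 84251456 is 24 rather than the 22 in the question's expected output, because 24 occurs on the earlier date 20170423.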

The top 5 query

Select Top 5 a.TFN, a.MaxImpression as MaxCount from ( 
    Select TFN, Max(Impressions) MaxImpression 
    from tblData 
    Where TFN in (Select TFN From @tmpTFNList) 
        and TrendDate between @StartDate AND @EndDate
    Group by TFN 
) a

can be expressed with the $limit stage, with the field selection done in a $project stage:

db.getCollection("_core.data").aggregate([
    { /* WHERE TFN in list AND TrendDate between DATES */
        "$match": {
            "TFN": { "$in": tmpTFNList },
            "TrendDate": {
                "$gte": startDate,
                "$lte": endDate
            }
        }
    },
    { /* GROUP BY TFN */
        "$group": {
            "_id": "$TFN",
            "MaxImpression": { "$max": "$Impression" }
        }
    },
    { "$limit": 5 }, /* TOP 5 */
    { /* SELECT a.MaxImpression as MaxCount */
        "$project": {
            "TFN": "$_id",
            "_id": 0,
            "MaxCount": "$MaxImpression"
        }
    }
])
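
One caveat: SQL Server's Top 5 without an ORDER BY and $limit without a preceding $sort both return five groups in no guaranteed order. If "top 5" is meant as the five largest maxima (an assumption; the question does not say so explicitly), a { "$sort": { "MaxImpression": -1 } } stage would go between $group and $limit. The plain JavaScript sketch below mimics $sort, $limit and $project on hypothetical $group output; the TFN values and maxima are made up for illustration.

```javascript
// Hypothetical output of the $group stage: one document per TFN.
const groupedOutput = [
  { _id: 1001, MaxImpression: 12 },
  { _id: 1002, MaxImpression: 40 },
  { _id: 1003, MaxImpression: 7 },
  { _id: 1004, MaxImpression: 33 },
  { _id: 1005, MaxImpression: 25 },
  { _id: 1006, MaxImpression: 18 },
];

// $sort: { MaxImpression: -1 } puts the largest maxima first.
const sortedGroups = [...groupedOutput].sort(
  (a, b) => b.MaxImpression - a.MaxImpression
);

// $limit: 5 keeps the first five documents of the sorted stream.
const limited = sortedGroups.slice(0, 5);

// $project: rename _id to TFN and MaxImpression to MaxCount, drop _id.
const projected = limited.map((d) => ({
  TFN: d._id,
  MaxCount: d.MaxImpression,
}));

console.log(projected);
```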

UPDATE

To get the desired result from the sample in this edit, you need an extra $sort pipeline stage before the $group stage, where you sort the documents by the TrendDate and Impression fields, both in descending order.

You then use the $first accumulator operator within the $group pipeline stage. Because the documents reach $group in sorted order, $first picks, for each TFN, the highest impression on its most recent TrendDate, which is what the expected output shows.

Consider running the revised aggregate operation as:

db.getCollection('collection').aggregate([
    { 
        "$match": {
            "TFN": { "$in": tmpTFNList },
            "TrendDate": {
                "$gte": startDate,
                "$lte": endDate
            }
        }
    },
    { "$sort": { "TrendDate": -1, "Impression": -1 } },
    {  
        "$group": {
            "_id": "$TFN",
            "MaxImpression": { "$first": "$Impression" }
        }
    },
    { "$limit": 5 }, 
    {   
        "$project": {
            "TFN": "$_id",
            "_id": 0,
            "MaxCount": "$MaxImpression"
        }
    }
])

Sample Output

/* 1 */
{
    "TFN" : 84251456,
    "MaxCount" : 22
}

/* 2 */
{
    "TFN" : 84251455,
    "MaxCount" : 35
}
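
As a cross-check of the sample output, here is a minimal plain JavaScript sketch that mimics the $sort, $group with $first, $limit and $project stages on the sample data from the question. It runs without MongoDB and skips the $match stage, on the assumption that the filter covers every sample row.

```javascript
// Sample rows from the question (field names as in the sample data).
const rows = [
  { TFN: 84251456, Impression: 12, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 15, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 18, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 19, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 22, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 23, TrendDate: 20170423 },
  { TFN: 84251456, Impression: 24, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 25, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 30, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 35, TrendDate: 20170424 },
  { TFN: 84251455, Impression: 24, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 22, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 21, TrendDate: 20170424 },
  { TFN: 84251455, Impression: 22, TrendDate: 20170424 },
];

// $sort: TrendDate descending, then Impression descending.
const sortedRows = [...rows].sort(
  (a, b) => b.TrendDate - a.TrendDate || b.Impression - a.Impression
);

// $group with $first: keep the first Impression seen for each TFN.
const firstByTfn = new Map();
for (const r of sortedRows) {
  if (!firstByTfn.has(r.TFN)) {
    firstByTfn.set(r.TFN, r.Impression);
  }
}

// $limit: 5, then $project to the TFN / MaxCount shape.
const result = [...firstByTfn]
  .slice(0, 5)
  .map(([tfn, impression]) => ({ TFN: tfn, MaxCount: impression }));

console.log(result);
```

The values match the sample output: 22 for TFN 84251456 and 35 for TFN 84251455, both taken from the latest date, 20170424.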


Source: https://stackoverflow.com/questions/43585164/max-and-group-by-in-mongodb
