Question
We are migrating from SQL Server to MongoDB. I have a collection containing the fields TFN and Impressions. I need to translate a SQL query into a MongoDB query, but I got stuck.
The scenario: I need to select the top 5 impressions from the collection, grouped by TFN.
Select Top 5 a.TFN, a.MaxImpression as MaxCount from (
    Select TFN, Max(Impressions) MaxImpression
    from tblData
    Where TFN in (Select TFN From @tmpTFNList) and TrendDate between @StartDate AND @EndDate
    Group by TFN
) a
This is the query in SQL Server. I need to achieve the same result in MongoDB. So far I have gone through MongoDB's aggregate and group functions, but I could not reproduce the output of the SQL query.
Note: I am unable to see how the Max clause and Group by fit together in MongoDB.
Here is the implementation I have tried:
db.getCollection("_core.data").aggregate([
    {
        $match: {
            $and: [
                { "TFN": { $in: tfns } },
                { "TrendDate": { $gte: 20170421, $lte: 20170421 } }
            ]
        }
    },
    {
        $group: {
            _id: "Impressions",
            Impression: { $max: "$Impressions" }
        }
    }
])
Secondly, I tried:
db.getCollection("_core.adwordsPull.static").group({
    key: { TFN: 1, Impressions: 1 },
    cond: {
        TFN: { $in: tfns },
        {
            "TrendDate": { $gte: 20170421, $lte: 20170421 }
        }
    },
    reduce: function (curr, result) {
        result.total += curr.Impression;
    },
    initial: { total: 0 }
})
What is wrong with these approaches, and how can I correct them?
Edit 1: Sample Data
TFN        Impression   TrendDate
84251456   12           20170424
84251456   15           20170424
84251456   18           20170424
84251456   19           20170424
84251456   22           20170424
84251456   23           20170423
84251456   24           20170423
84251455   25           20170423
84251455   30           20170423
84251455   35           20170424
84251455   24           20170423
84251455   22           20170423
84251455   21           20170424
84251455   22           20170424
Expected Output :
TFN        MaxCount
84251456   22
84251455   35
Answer 1:
To achieve the desired result, start by breaking down the SQL query, beginning with the subquery:
Select *
from tblData
Where TFN in (Select TFN From @tmpTFNList) and TrendDate between @StartDate AND @EndDate
The equivalent mongo query follows:
db.getCollection("_core.data").aggregate([
    {
        "$match": {
            "TFN": { "$in": tmpTFNList },
            "TrendDate": {
                "$gte": startDate,
                "$lte": endDate
            }
        }
    }
])
The $group aggregate equivalent of
Select TFN, Max(Impressions) MaxImpression
from tblData
Where TFN in (Select TFN From @tmpTFNList) and TrendDate between @StartDate AND @EndDate
Group by TFN
follows
db.getCollection("_core.data").aggregate([
    {
        "$match": {
            "TFN": { "$in": tmpTFNList },
            "TrendDate": {
                "$gte": startDate,
                "$lte": endDate
            }
        }
    },
    {
        "$group": {
            "_id": "$TFN",
            "MaxImpression": { "$max": "$Impression" }
        }
    }
])
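To make the effect of the `_id` expression in `$group` concrete, here is a minimal plain-JavaScript sketch that mimics `$group` with `$max` in memory on the sample rows from the question. It does not talk to MongoDB, the helper `groupMax` is hypothetical, and the field name `Impression` follows the sample data (adjust it if your documents store `Impressions`). It also shows why the first attempt in the question returns a single document: `_id: "Impressions"` has no `$` prefix, so it is a constant rather than a field path.

```javascript
// Sample rows copied from the question's "Edit 1".
const sampleDocs = [
  { TFN: 84251456, Impression: 12, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 15, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 18, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 19, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 22, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 23, TrendDate: 20170423 },
  { TFN: 84251456, Impression: 24, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 25, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 30, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 35, TrendDate: 20170424 },
  { TFN: 84251455, Impression: 24, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 22, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 21, TrendDate: 20170424 },
  { TFN: 84251455, Impression: 22, TrendDate: 20170424 },
];

// Hypothetical helper that mimics
// { $group: { _id: <key>, MaxImpression: { $max: "$Impression" } } }.
// keyOf decides which group each document belongs to.
function groupMax(rows, keyOf) {
  const groups = new Map();
  for (const row of rows) {
    const key = keyOf(row);
    const current = groups.get(key);
    if (current === undefined || row.Impression > current) {
      groups.set(key, row.Impression);
    }
  }
  return [...groups].map(([_id, MaxImpression]) => ({ _id, MaxImpression }));
}

// _id: "$TFN" is a field path: one group per distinct TFN value.
const perTfn = groupMax(sampleDocs, (row) => row.TFN);

// _id: "Impressions" is a constant string: every row lands in the same group.
const singleGroup = groupMax(sampleDocs, () => "Impressions");

console.log(perTfn);
console.log(singleGroup);
```

On the sample rows, the per-TFN grouping yields 24 for 84251456, which is its overall maximum across both dates.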
The top 5 query
Select Top 5 a.TFN, a.MaxImpression as MaxCount from (
    Select TFN, Max(Impressions) MaxImpression
    from tblData
    Where TFN in (Select TFN From @tmpTFNList)
    and TrendDate between @StartDate AND @EndDate
    Group by TFN
) a
is made possible with the $limit stage, and the field selection with the $project stage:
db.getCollection("_core.data").aggregate([
    { /* WHERE TFN in list AND TrendDate between DATES */
        "$match": {
            "TFN": { "$in": tmpTFNList },
            "TrendDate": {
                "$gte": startDate,
                "$lte": endDate
            }
        }
    },
    { /* GROUP BY TFN */
        "$group": {
            "_id": "$TFN",
            "MaxImpression": { "$max": "$Impression" }
        }
    },
    { "$limit": 5 }, /* TOP 5 */
    { /* SELECT a.MaxImpression as MaxCount */
        "$project": {
            "TFN": "$_id",
            "_id": 0,
            "MaxCount": "$MaxImpression"
        }
    }
])
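One caveat: `$group` does not guarantee the order of its output, so `$limit: 5` on its own returns an arbitrary five groups, much like `TOP 5` without `ORDER BY` in SQL Server. If what is wanted is the five TFNs with the largest maximum, a `$sort` stage on the grouped value can go before `$limit`. The following is a sketch under that assumption; the function `buildTopPipeline` and its parameter names are hypothetical, and the code only builds the pipeline array.

```javascript
// Hypothetical helper: builds the same pipeline as above, plus a $sort stage
// so that "top n" means the n groups with the largest MaxImpression.
function buildTopPipeline(tfnList, startDate, endDate, n) {
  return [
    // WHERE TFN in list AND TrendDate between dates
    { $match: { TFN: { $in: tfnList }, TrendDate: { $gte: startDate, $lte: endDate } } },
    // GROUP BY TFN, MAX(Impression)
    { $group: { _id: "$TFN", MaxImpression: { $max: "$Impression" } } },
    // Assumption: order by the grouped maximum, largest first
    { $sort: { MaxImpression: -1 } },
    // TOP n
    { $limit: n },
    // SELECT TFN, MaxImpression AS MaxCount
    { $project: { _id: 0, TFN: "$_id", MaxCount: "$MaxImpression" } },
  ];
}

const pipeline = buildTopPipeline([84251456, 84251455], 20170423, 20170424, 5);
console.log(JSON.stringify(pipeline, null, 2));

// In the mongo shell the array would be passed to aggregate, for example:
// db.getCollection("_core.data").aggregate(pipeline)
```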
UPDATE
To get the desired result from the sample data in the question's edit, you need an extra $sort stage before the $group stage, in which you sort the documents by the TrendDate and Impression fields, both in descending order.
You then use the $first accumulator operator within the $group stage. Because the documents reach $group as an ordered stream, $first returns, for each TFN, the highest impression on its most recent TrendDate. That is what the expected output shows: 22 for 84251456, rather than its overall maximum of 24.
Consider running the revised aggregate operation as:
db.getCollection('collection').aggregate([
    {
        "$match": {
            "TFN": { "$in": tmpTFNList },
            "TrendDate": {
                "$gte": startDate,
                "$lte": endDate
            }
        }
    },
    { "$sort": { "TrendDate": -1, "Impression": -1 } },
    {
        "$group": {
            "_id": "$TFN",
            "MaxImpression": { "$first": "$Impression" }
        }
    },
    { "$limit": 5 },
    {
        "$project": {
            "TFN": "$_id",
            "_id": 0,
            "MaxCount": "$MaxImpression"
        }
    }
])
Sample Output
/* 1 */
{
    "TFN" : 84251456,
    "MaxCount" : 22
}
/* 2 */
{
    "TFN" : 84251455,
    "MaxCount" : 35
}
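As a sanity check of the `$sort` plus `$first` combination, the same logic can be mimicked in plain JavaScript on the sample rows from the question. This is only an in-memory sketch and does not query MongoDB; the variable names are hypothetical.

```javascript
// Sample rows copied from the question's "Edit 1".
const rows = [
  { TFN: 84251456, Impression: 12, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 15, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 18, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 19, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 22, TrendDate: 20170424 },
  { TFN: 84251456, Impression: 23, TrendDate: 20170423 },
  { TFN: 84251456, Impression: 24, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 25, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 30, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 35, TrendDate: 20170424 },
  { TFN: 84251455, Impression: 24, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 22, TrendDate: 20170423 },
  { TFN: 84251455, Impression: 21, TrendDate: 20170424 },
  { TFN: 84251455, Impression: 22, TrendDate: 20170424 },
];

// Mimics { $sort: { TrendDate: -1, Impression: -1 } } on a copy of the rows.
const sorted = [...rows].sort(
  (a, b) => b.TrendDate - a.TrendDate || b.Impression - a.Impression
);

// Mimics { $group: { _id: "$TFN", MaxImpression: { $first: "$Impression" } } }:
// only the first row encountered for each TFN is kept.
const firstPerTfn = new Map();
for (const row of sorted) {
  if (!firstPerTfn.has(row.TFN)) {
    firstPerTfn.set(row.TFN, row.Impression);
  }
}

// Mimics the final $project stage.
const result = [...firstPerTfn].map(([TFN, MaxCount]) => ({ TFN, MaxCount }));
console.log(result);
```

The two resulting values match the expected output in the question (22 and 35); the order of the two documents is not significant.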
Source: https://stackoverflow.com/questions/43585164/max-and-group-by-in-mongodb