Question
I have to create a check for this use case:
Duplicate payment check
• Same amount sent to the same account number within the last 7 days, across all transactions.
I haven't used MongoDB much; this would have been easier for me to write in SQL.
This is what I am trying, without the 7-day part:
db.transactiondetails.aggregate({$group: {"_id":{"account_number":"$account_number","amount":"$amount"},"count": { $sum: 1 }}})
This gives me something like the following:
{ "_id" : { "account_number" : "xxxxxxxy", "amount" : 19760 }, "count" : 2 }
{ "_id" : { "account_number" : "xxxxzzzz", "amount" : 20140 }, "count" : 2 }
...
I have created_at and updated_at, which are date fields. I am using updated_at to detect duplicates. For example:
"created_at" : ISODate("2019-01-07T15:40:53.683Z"),
"updated_at" : ISODate("2019-01-09T06:48:44.839Z"),
In SQL we can create groups of 7 days: each date is a start date, and we need to find the duplicates in the 7 days that follow it.
In other words, I need to find duplicates within rolling 7-day windows.
Any help on how to go about this would be appreciated.
Answer 1:
Check if this meets your requirements:
Explanation
- We sort the documents by account_number, amount and updated_at (I assume you have indexes). The sorted order is needed to iterate over the array in the next steps.
- We group by account_number + amount and push the documents into two arrays (data, tmp).
- We $unwind (flatten) the tmp array and, for each item i, calculate how many days have passed between it and each later item i+1 ... n.
- We count how many duplicates fall within 7 days of each date.
- We skip all results where count = 0.
db.transactiondetails.aggregate([
  {
    $sort: {
      account_number: 1,
      amount: 1,
      updated_at: 1
    }
  },
  {
    $group: {
      "_id": {
        "account_number": "$account_number",
        "amount": "$amount"
      },
      "data": {
        $push: "$$ROOT"
      },
      "tmp": {
        $push: "$$ROOT"
      }
    }
  },
  {
    $unwind: "$tmp"
  },
  {
    $project: {
      _id: {
        account_number: "$_id.account_number",
        amount: "$_id.amount",
        updated_at: "$tmp.updated_at"
      },
      data: {
        $map: {
          input: {
            $slice: [
              "$data",
              {
                $add: [
                  {
                    $indexOfArray: ["$data", "$tmp"]
                  },
                  1
                ]
              },
              {
                $size: "$data"
              }
            ]
          },
          in: {
            "_id": "$$this._id",
            "account_number": "$$this.account_number",
            "amount": "$$this.amount",
            "created_at": "$$this.created_at",
            "updated_at": "$$this.updated_at",
            "days": {
              $divide: [
                {
                  $subtract: ["$$this.updated_at", "$tmp.updated_at"]
                },
                {
                  $multiply: [24, 60, 60, 1000]
                }
              ]
            }
          }
        }
      }
    }
  },
  {
    $project: {
      count: {
        $size: {
          $filter: {
            input: "$data",
            cond: {
              $lte: ["$$this.days", 7]
            }
          }
        }
      }
    }
  },
  {
    $match: {
      "count": {
        $gt: 0
      }
    }
  }
])
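To sanity-check what the pipeline returns, it can help to mirror the same logic in plain JavaScript on a few hand-made documents. The following is only a sketch: the function name findDuplicatePayments, the windowDays parameter and the sample data are made up for illustration and are not part of the original answer. It assumes each document has account_number, amount and a Date in updated_at.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: mirrors the pipeline's stages in memory.
function findDuplicatePayments(transactions, windowDays = 7) {
  // Group by account_number + amount (the $group stage).
  const groups = new Map();
  for (const tx of transactions) {
    const key = JSON.stringify([tx.account_number, tx.amount]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(tx);
  }

  const results = [];
  for (const group of groups.values()) {
    // Order each group by updated_at (the $sort stage).
    group.sort((a, b) => a.updated_at - b.updated_at);
    group.forEach((tx, i) => {
      // Count later transactions at most windowDays after this one
      // (the $slice + $map + $filter + $size steps).
      const count = group
        .slice(i + 1)
        .filter((later) => (later.updated_at - tx.updated_at) / MS_PER_DAY <= windowDays)
        .length;
      // Keep only entries that have at least one duplicate (the final $match).
      if (count > 0) {
        results.push({
          account_number: tx.account_number,
          amount: tx.amount,
          updated_at: tx.updated_at,
          count,
        });
      }
    });
  }
  return results;
}

// Made-up sample data: only the first two "A1" payments are within 7 days of each other.
const sample = [
  { account_number: "A1", amount: 100, updated_at: new Date("2019-01-01T00:00:00Z") },
  { account_number: "A1", amount: 100, updated_at: new Date("2019-01-05T00:00:00Z") },
  { account_number: "A1", amount: 100, updated_at: new Date("2019-01-20T00:00:00Z") },
  { account_number: "B2", amount: 100, updated_at: new Date("2019-01-02T00:00:00Z") },
];
const duplicates = findDuplicatePayments(sample);
console.log(duplicates);
```

With this sample, only the "A1" payment dated 2019-01-01 is reported, with a count of 1: the 2019-01-05 payment falls inside its 7-day window, while the 2019-01-20 payment does not.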
Source: https://stackoverflow.com/questions/60371210/mongodb-aggregate-find-duplicate-records-within-7-days