MongoDB data deduplication

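A MongoDB collection contains many documents that share the same id (here, the sfdcId field). To deduplicate them and keep only one document per id, the following mongo shell script can be used: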

db.outboundCustomer.aggregate([{
    $match: {
        status: 'notsent'
    }
}, {
    $group: {
        _id: "$sfdcId",
        count: {
            $sum: 1
        },
        dups: {
            $addToSet: '$_id'
        }
    }
}, {
    $match: {
        count: {
            $gt: 1
        }
    }
}], {
    allowDiskUse: true
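}).forEach(function(doc) {
    // Drop the first _id from the array so that one document per sfdcId is kept.
    doc.dups.shift();
    // Delete every remaining duplicate (deleteMany replaces the deprecated remove).
    db.outboundCustomer.deleteMany({
        _id: {
            $in: doc.dups
        }
    });
})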

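$match filters documents; here it matches status = 'notsent'.
$group groups documents; here they are grouped by the sfdcId field.
$addToSet collects values into an array without repeats; here it gathers the _id of every document in the group.
The second $match keeps only the groups whose count is greater than 1, that is, the sfdcId values that occur more than once.
Apart from the dups array, the pipeline is equivalent to the following SQL: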

select count(1),sfdcid from outboundCustomer 
where status = 'notsent'
group by sfdcid
having count(1)>1
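
To make the pipeline's semantics concrete, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code and only approximates what the stages compute; the function name findDuplicateGroups and the sampleDocs data are hypothetical and exist only for illustration.

```javascript
// In-memory approximation of the aggregation pipeline (illustration only).
function findDuplicateGroups(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // First $match: keep only documents with status 'notsent'.
    if (doc.status !== 'notsent') continue;
    // $group: one bucket per sfdcId value.
    if (!groups.has(doc.sfdcId)) {
      groups.set(doc.sfdcId, { _id: doc.sfdcId, count: 0, dups: [] });
    }
    const group = groups.get(doc.sfdcId);
    group.count += 1; // $sum: 1
    // $addToSet: add the document's _id only if it is not already present.
    if (!group.dups.includes(doc._id)) group.dups.push(doc._id);
  }
  // Second $match: keep only the groups that contain more than one document.
  return [...groups.values()].filter((group) => group.count > 1);
}

// Hypothetical sample data.
const sampleDocs = [
  { _id: 1, sfdcId: 'A', status: 'notsent' },
  { _id: 2, sfdcId: 'A', status: 'notsent' },
  { _id: 3, sfdcId: 'B', status: 'notsent' },
  { _id: 4, sfdcId: 'A', status: 'sent' },
];
console.log(findDuplicateGroups(sampleDocs));
```

In this sample only sfdcId 'A' is reported: document 4 is excluded by the status filter, and 'B' occurs just once.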

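The statements inside .forEach delete the duplicate documents while keeping exactly one per sfdcId.
doc.dups.shift() removes the first _id from the array, so that document is excluded from the delete and survives.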
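
The effect of shift() can be sketched in plain JavaScript as follows. The helper name splitKeepAndDelete is hypothetical; unlike the script above, it works on a copy so the input array is left untouched. Note that MongoDB does not specify the order of elements produced by $addToSet, so which document ends up being kept is effectively arbitrary.

```javascript
// Split a group's _id list into the one id to keep and the ids to delete.
function splitKeepAndDelete(dups) {
  const toDelete = dups.slice();  // copy, so the caller's array is not modified
  const kept = toDelete.shift();  // shift() removes and returns the first element
  return { kept, toDelete };
}

const result = splitKeepAndDelete([10, 20, 30]);
console.log(result.kept, result.toDelete);
```

If a specific document must survive (for example the oldest one), one option is to sort the array before calling shift(), assuming the _id values sort in the desired order.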

Source: CSDN

Author: Lawliet_Wei

Link: https://blog.csdn.net/Lawliet_Wei/article/details/104478044

