Question
I have access logs like the ones below stored in a MongoDB instance:
Time Service Latency
[27/08/2013:11:19:22 +0000] "POST Service A HTTP/1.1" 403
[27/08/2013:11:19:24 +0000] "POST Service B HTTP/1.1" 1022
[27/08/2013:11:22:10 +0000] "POST Service A HTTP/1.1" 455
Is there an analytics function like Oracle's PERCENTILE_DISC to calculate percentiles?
I would like to calculate latency percentiles over a period of time.
Answer 1:
There still appears to be no native way to calculate percentiles, but by combining a few aggregation operators you can get the same result.
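db.items.aggregate([
    # Collect the values for each (league, base, type) group.
    {'$group': {
        '_id': {
            'league': '$league',
            'base': '$base',
            'type': '$type'
        },
        'value': {'$push': '$chaosequiv'}
    }},
    # Flatten, sort ascending, then rebuild each group's array in sorted order.
    {'$unwind': '$value'},
    {'$sort': {'value': 1}},
    {'$group': {'_id': '$_id', 'value': {'$push': '$value'}}},
    # Pick the element at index floor(0.25 * size), i.e. the 25th percentile.
    {'$project': {
        '_id': 1,
        'value': {'$arrayElemAt': ['$value', {'$floor': {'$multiply': [0.25, {'$size': '$value'}]}}]}
    }}
], allowDiskUse=True)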
Note that I originally wrote this code in pymongo for a problem that needed to group on three fields in the first $group stage, so it may be more complex than necessary for a single field. I would write a solution specific to this question, but I don't think the question gives enough specific information.
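To make the index rule in the pipeline above concrete, here is a minimal pure-Python sketch of what the stages compute, applied to data shaped like the question's logs. It does not connect to MongoDB. The field names `service` and `latency` and the helper `percentile_by_group` are assumptions for illustration; the question does not state its actual schema.

```python
import math
from collections import defaultdict


def percentile_by_group(docs, group_field, value_field, p):
    """Mimic the pipeline: group, sort ascending, pick index floor(p * n)."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])

    result = {}
    for key, values in groups.items():
        values.sort()
        index = math.floor(p * len(values))
        # $arrayElemAt yields no value when the index is out of range
        # (for example p = 1.0); this sketch returns None in that case.
        result[key] = values[index] if index < len(values) else None
    return result


# Hypothetical documents shaped like the question's log lines.
logs = [
    {"service": "A", "latency": 403},
    {"service": "B", "latency": 1022},
    {"service": "A", "latency": 455},
]

print(percentile_by_group(logs, "service", "latency", 0.25))
# prints {'A': 403, 'B': 1022}
```

With only a handful of values per group the result is coarse, since the rule always returns one of the stored values rather than interpolating between them.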
Answer 2:
Starting in MongoDB 4.4, the $group stage has a new aggregation operator, $accumulator, which allows custom accumulation of documents as they are grouped, via user-defined JavaScript functions.
Thus, in order to find the 20th percentile:
// { "a" : 25, "b" : 12 }
// { "a" : 89, "b" : 73 }
// { "a" : 25, "b" : 7 }
// { "a" : 25, "b" : 17 }
// { "a" : 89, "b" : 14 }
// { "a" : 89, "b" : 17 }
// { "a" : 25, "b" : 24 }
// { "a" : 25, "b" : 15 }
// { "a" : 25, "b" : 22 }
// { "a" : 25, "b" : 94 }
db.collection.aggregate([
  { $group: {
    _id: "$a",
    percentile: {
      $accumulator: {
        accumulateArgs: ["$b"],
        init: function() { return []; },
        accumulate: function(bs, b) { return bs.concat(b); },
        merge: function(bs1, bs2) { return bs1.concat(bs2); },
        finalize: function(bs) {
          bs.sort(function(a, b) { return a - b });
          return bs[Math.floor(bs.length*.2) + 1];
        },
        lang: "js"
      }
    }
  }}
])
// { "_id" : 89, "percentile" : 17 }
// { "_id" : 25, "percentile" : 15 }
The accumulator:
- accumulates on the field b (accumulateArgs)
- is initialised to an empty array (init)
- accumulates b items in an array (accumulate and merge)
- and finally performs the percentile calculation on b items (finalize)
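The four accumulator steps listed above can be traced outside the database. The following is a minimal Python sketch that mirrors the answer's init, accumulate, merge and finalize functions and reruns them on the ten sample documents. The `run` helper and the split into two partial states are assumptions made only to exercise merge; they say nothing about how MongoDB actually partitions the work. The index rule `floor(n * p) + 1` is copied as-is from the answer, and the sketch returns None where the JavaScript version would return undefined for an out-of-range index.

```python
import math


def init():
    return []


def accumulate(state, b):
    return state + [b]


def merge(state1, state2):
    return state1 + state2


def finalize(state, p=0.2):
    ordered = sorted(state)
    # Same index rule as the answer's finalize: floor(n * p) + 1.
    index = math.floor(len(ordered) * p) + 1
    return ordered[index] if index < len(ordered) else None


def run(docs, p=0.2):
    """Hypothetical driver: build two partial states per group, then merge."""
    half = len(docs) // 2
    partials = []
    for chunk in (docs[:half], docs[half:]):
        states = {}
        for doc in chunk:
            state = states.get(doc["a"], init())
            states[doc["a"]] = accumulate(state, doc["b"])
        partials.append(states)

    merged = {}
    for states in partials:
        for key, state in states.items():
            merged[key] = merge(merged.get(key, init()), state)
    return {key: finalize(state, p) for key, state in merged.items()}


docs = [
    {"a": 25, "b": 12}, {"a": 89, "b": 73}, {"a": 25, "b": 7},
    {"a": 25, "b": 17}, {"a": 89, "b": 14}, {"a": 89, "b": 17},
    {"a": 25, "b": 24}, {"a": 25, "b": 15}, {"a": 25, "b": 22},
    {"a": 25, "b": 94},
]

print(run(docs))
# prints {25: 15, 89: 17}
```

The results match the output shown in the answer. Because of the `+ 1`, the index runs past the end of the array for percentiles close to 100, so a bounds check like the one in `finalize` here may be worth adding if high percentiles are needed.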
Source: https://stackoverflow.com/questions/18484571/how-to-calculate-the-percentile