Question
I have a class that dispatches Success and Failure events, and I need to maintain a statistic for that class: the ratio of failures to the total number of events over the last X seconds.
I was thinking of using a circular linked list and appending a success or failure node for each event, then counting the failure nodes against the total nodes in the list, but this has two major drawbacks:
- I need to constantly scale the list size up/down to account for the "last X seconds" requirement (the number of events per second can change)
- I need to constantly loop over the list and count all the events (potentially expensive as I will probably have 100s of such events per second)
Does anyone know of another way to compute average values from a list of samples received in the last X seconds?
Answer 1:
You should use a sampling frequency (à la MRTG). Say you only need one-second precision and want to maintain the average for the past minute: you will have a fixed table of 60 entries referring to the past 60 seconds (including the present one). Also maintain a global entry that covers the whole table.
Each entry consists of an average value and a number of events. Every entry starts at 0 for both values.
When you receive a new event, you update the current entry and the global entry as follows:
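average = ((number * average) + value) / (number + 1)
number = number + 1
Here value is 1 for a failure and 0 for a success, so average is the failure ratio of the entry.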
At each sampling interval you remove the oldest entry from the global entry (if global.number equals oldest.number, the window is now empty, so set both global values to 0 instead of dividing):
global.average = ((global.number * global.average) - (oldest.number * oldest.average)) / (global.number - oldest.number)
global.number = global.number - oldest.number
And you reset the oldest entry to 0 and start using it as the current one.
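A minimal sketch of this fixed-table scheme, under a few assumptions: the class and method names are hypothetical, timestamps are passed in explicitly as seconds and never go backwards, and each entry stores raw failure and event counts instead of a running average (the ratio is the same, but subtracting the oldest entry needs no division).

```python
class BucketedFailureRate:
    """Failure ratio over the last `window` seconds at one-second resolution."""

    def __init__(self, window=60):
        self.window = window
        self.failures = [0] * window   # per-second failure counts
        self.totals = [0] * window     # per-second event counts
        self.global_failures = 0       # sums over the whole table
        self.global_total = 0
        self.current_second = None     # second the current entry refers to

    def _advance(self, second):
        """Recycle every entry whose second has fallen out of the window."""
        if self.current_second is None:
            self.current_second = second
            return
        # At most `window` entries need clearing, however long the gap was.
        steps = min(second - self.current_second, self.window)
        for offset in range(1, steps + 1):
            slot = (self.current_second + offset) % self.window
            self.global_failures -= self.failures[slot]
            self.global_total -= self.totals[slot]
            self.failures[slot] = 0
            self.totals[slot] = 0
        self.current_second = max(self.current_second, second)

    def record(self, timestamp, failed):
        """Count one event in the current entry and in the global entry."""
        self._advance(int(timestamp))
        slot = self.current_second % self.window
        self.totals[slot] += 1
        self.global_total += 1
        if failed:
            self.failures[slot] += 1
            self.global_failures += 1

    def failure_rate(self, now):
        """Failures divided by total events in the window (0.0 if empty)."""
        self._advance(int(now))
        if self.global_total == 0:
            return 0.0
        return self.global_failures / self.global_total
```

Memory is fixed at two integers per second of window, and each event costs a constant amount of work regardless of the event rate.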
Answer 2:
You could use a queue, which would allow you to add new events to the end of the queue, and remove expired events from the start of the queue, assuming that events are added in chronological order. In Java, for example, you could use a LinkedList or an ArrayDeque, both of which implement the Queue interface.
If events are not added in chronological order, then a priority queue could be used. Elements would be ordered by their timestamps, and the highest-priority element (i.e. the next element for removal) would be the one with the smallest timestamp. In Java, this data structure is provided by PriorityQueue.
Instead of counting the events periodically, we can just keep two counters, one for the total number of events, and the other for the number of successful events. These counters will be updated whenever we add or remove events from the queue.
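The queue-plus-counters idea could look roughly like the sketch below. It is written in Python with collections.deque standing in for the Java queue classes mentioned above; the class name, method names, and the choice to pass timestamps in explicitly are assumptions made for illustration, and events are assumed to arrive in chronological order.

```python
from collections import deque


class SlidingWindowCounter:
    """Keeps every event of the last `window_seconds` plus two running counters."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()   # (timestamp, succeeded), oldest first
        self.total = 0          # events currently in the queue
        self.successes = 0      # successful events currently in the queue

    def _expire(self, now):
        """Drop events older than the window and adjust the counters."""
        while self.events and now - self.events[0][0] > self.window:
            _, succeeded = self.events.popleft()
            self.total -= 1
            if succeeded:
                self.successes -= 1

    def add(self, timestamp, succeeded):
        """Append a new event at the end of the queue."""
        self.events.append((timestamp, succeeded))
        self.total += 1
        if succeeded:
            self.successes += 1
        self._expire(timestamp)

    def failure_rate(self, now):
        """Failures divided by total events in the window (0.0 if empty)."""
        self._expire(now)
        if self.total == 0:
            return 0.0
        return (self.total - self.successes) / self.total
```

Each event is appended once and removed once, so the cost per event is constant on average, but memory grows with the number of events inside the window.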
Answer 3:
Keep your events in a queue. Just append to the end and remove all the events that are too old from the front. This will at least eliminate problem 1.
Answer 4:
For these kinds of samplers there's usually one extra thing you specify: the sampler resolution.
In your case, going by your description, the sampler resolution can be either 1 second or 1 tick.
If the resolution you want for the sampler is 1 second, then here's an algorithm that might work well enough.
- Create a linked list. Each node holds [timestamp, Success count, Failure count, previousNode].
- Store a reference to the last node of the list as lastNode and a reference to the first node as firstNode (lastNode is the tail of the list, and firstNode is the most recently added node, the head).
- Hold in two global variables, gSuccess and gFail, the sums of successes and failures in the last X seconds.
When a new event is received:
Compare the event timestamp with the firstNode timestamp.
IF (firstNode == nil OR eventTimestamp.TotalSeconds > firstNode.TotalSeconds)
- Add a new node at the beginning of the list (insert before firstNode) with Success and Failure counts of 0.
- firstNode.Previous = newNode (if the list was empty, set lastNode = newNode instead)
- firstNode = newNode
END IF
- Increment firstNode.Success or firstNode.Failure by 1
- Increment gSuccess or gFail by 1
After each event is added, run REMOVE_EXPIRED_NODES:
WHILE (lastNode != nil AND currentTime.TotalSeconds - lastNode.TotalSeconds > X)
- gSuccess -= lastNode.Success (subtract the Success count of the node being removed)
- gFail -= lastNode.Failure (subtract the Failure count of the node being removed)
- Remove lastNode (lastNode = lastNode.Previous)
END WHILE
Reading gFail and gSuccess should always be preceded by REMOVE_EXPIRED_NODES.
The advantages of this approach:
- The global Fail and Success counters are not recomputed from all events; they are incremented as events are added and decremented when nodes older than X seconds are removed from the list.
- It uses a sampler resolution of 1 second instead of storing a list of all events (which might be hundreds per second), so regardless of the number of events the list sees on average 2 operations per second (1 add, 1 remove).
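The per-second node scheme from this answer might be sketched as follows. This is an illustration under assumptions, not the answerer's code: a collections.deque replaces the hand-rolled linked list (its left end plays the role of lastNode, its right end firstNode), the class and method names are hypothetical, and timestamps are passed in as seconds in chronological order.

```python
from collections import deque


class PerSecondSampler:
    """One node per second that saw events, plus global success/failure sums."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.nodes = deque()   # [second, successes, failures], oldest on the left
        self.g_success = 0     # successes within the window
        self.g_fail = 0        # failures within the window

    def remove_expired_nodes(self, now):
        """Pop nodes older than the window and subtract their counts."""
        current = int(now)
        while self.nodes and current - self.nodes[0][0] > self.window:
            _, successes, failures = self.nodes.popleft()
            self.g_success -= successes
            self.g_fail -= failures

    def add_event(self, timestamp, succeeded):
        """Count the event in the newest node, creating one for a new second."""
        second = int(timestamp)
        if not self.nodes or second > self.nodes[-1][0]:
            self.nodes.append([second, 0, 0])
        newest = self.nodes[-1]
        if succeeded:
            newest[1] += 1
            self.g_success += 1
        else:
            newest[2] += 1
            self.g_fail += 1
        self.remove_expired_nodes(timestamp)

    def counts(self, now):
        """Return (successes, failures) after discarding expired nodes."""
        self.remove_expired_nodes(now)
        return self.g_success, self.g_fail
```

Unlike a fixed table, this keeps nodes only for seconds in which something happened, so an idle period costs no memory.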
Answer 5:
How specific are your requirements? If you're allowed to think outside the box a little, a simple Geiger-counter algorithm, a.k.a. an infinite impulse response (IIR) digital filter, computes a moving "average" (depending on how you define "average"), has a minimal memory footprint, and takes only a few lines of code.
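One possible reading of this suggestion is an exponentially decayed pair of counters, sketched below. The answer does not spell out a formula, so everything here is an assumption: the class name, the time_constant parameter (roughly how many seconds of history dominate the result), and the choice to decay by elapsed time instead of per sample.

```python
import math


class DecayingFailureRate:
    """Exponentially weighted failure ratio; older events count for less."""

    def __init__(self, time_constant):
        self.tau = time_constant   # seconds after which a weight drops to 1/e
        self.failures = 0.0        # decayed failure count
        self.total = 0.0           # decayed event count
        self.last = None           # timestamp of the last update

    def _decay(self, now):
        """Shrink both counters according to the time elapsed since `last`."""
        if self.last is not None and now > self.last:
            factor = math.exp(-(now - self.last) / self.tau)
            self.failures *= factor
            self.total *= factor
        if self.last is None or now > self.last:
            self.last = now

    def record(self, timestamp, failed):
        """Decay the history, then add the new event with weight 1."""
        self._decay(timestamp)
        self.total += 1.0
        if failed:
            self.failures += 1.0

    def failure_rate(self, now):
        """Weighted failures divided by weighted total (0.0 if empty)."""
        self._decay(now)
        return self.failures / self.total if self.total > 0 else 0.0
```

This stores three numbers in total, but it has no hard cutoff at X seconds: old events fade out gradually, so it only fits if that definition of "average" is acceptable.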
Answer 6:
It would be more effective to maintain two separate lists, one for the successes and one for the failures. New entries are always appended at the end of the list (i.e. it is sorted by increasing timestamps).
Now when you want to get the number of successes/failures in the last n seconds, you create a timestamp for now() - n and walk the lists. Once you find a timestamp that is greater than this value, you can eliminate all elements before the current one. The length of the list then gives you the number of successes or failures.
If you need to optimise, see if it is more effective to sort the list by decreasing timestamp (i.e. prepending new values) and work the list until you find an element that has a timestamp smaller than your comparison value. Discard this and all following members.
It is hard to say beforehand which scenario will be more effective, so you will have to try it. OTOH if it works well enough, there is no reason to optimise ;-).
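A small sketch of the two-list variant with increasing timestamps, under stated assumptions: the names are hypothetical, timestamps are passed in explicitly, and a binary search from the standard bisect module replaces the linear walk described above (the lists are already sorted, so the result is the same).

```python
from bisect import bisect_right


class TwoListCounter:
    """Separate, chronologically sorted timestamp lists for each outcome."""

    def __init__(self):
        self.successes = []   # success timestamps, ascending
        self.failures = []    # failure timestamps, ascending

    def record(self, timestamp, succeeded):
        """Append the timestamp to the list matching the outcome."""
        (self.successes if succeeded else self.failures).append(timestamp)

    @staticmethod
    def _trim(timestamps, cutoff):
        """Delete every timestamp that is not greater than `cutoff`."""
        keep_from = bisect_right(timestamps, cutoff)
        del timestamps[:keep_from]

    def counts(self, now, n):
        """Return (successes, failures) seen in the last `n` seconds."""
        cutoff = now - n
        self._trim(self.successes, cutoff)
        self._trim(self.failures, cutoff)
        return len(self.successes), len(self.failures)
```

Deleting from the front of a Python list copies the remaining elements, so for very high event rates a deque with repeated popleft calls may be the better fit.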
Source: https://stackoverflow.com/questions/556155/sample-the-average-of-values-received-in-last-x-seconds