Question
I want a function that can take a series and a set of bins, and basically round up to the nearest bin. For example:
my_series = [ 1, 1.5, 2, 2.3, 2.6, 3]
def my_function(my_series, bins):
    ...
my_function(my_series, bins=[1,2,3])
> [1,2,2,3,3,3]
This seems very close to what NumPy's digitize is intended to do, but it produces the wrong values (asterisks mark the wrong values):
np.digitize(my_series, bins= [1,2,3], right=False)
> [1, 1*, 2, 2*, 2*, 3]
The reason why it's wrong is clear from the documentation:
Each index i returned is such that bins[i-1] <= x < bins[i] if bins is monotonically increasing, or bins[i-1] > x >= bins[i] if bins is monotonically decreasing. If values in x are beyond the bounds of bins, 0 or len(bins) is returned as appropriate. If right is True, then the right bin is closed so that the index i is such that bins[i-1] < x <= bins[i] or bins[i-1] >= x > bins[i] if bins is monotonically increasing or decreasing, respectively.
I can kind of get closer to what I want if I enter in the values decreasing and set "right" to True...
np.digitize(my_series, bins= [3,2,1], right=True)
> [3, 2, 2, 1, 1, 1]
but then I'll have to think of a way of methodically swapping the lowest number assignment (1) with the highest number assignment (3). It's simple when there are just 3 bins, but it will get hairier as the number of bins grows. There must be a more elegant way of doing all this.
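The reversal described above may be simpler than it looks: with decreasing bins and right=True, the returned indices behave like 1-based positions into the decreasing list, so subtracting one and indexing recovers the bin values. A minimal sketch, assuming every value lies within the range of the bins (bins_desc, idx and rounded_up are names introduced here for illustration):

```python
import numpy as np

my_series = [1, 1.5, 2, 2.3, 2.6, 3]
bins_desc = np.array([3, 2, 1])  # illustrative name: bins in decreasing order

# For decreasing bins with right=True, index i satisfies
# bins_desc[i-1] >= x > bins_desc[i].
idx = np.digitize(my_series, bins_desc, right=True)

# bins_desc[i-1] is therefore the smallest bin that is >= x.
# Assumption: no value exceeds bins_desc[0]. Such a value would give i == 0,
# and index -1 would silently wrap around to the last (smallest) bin.
rounded_up = bins_desc[idx - 1]
print(rounded_up.tolist())  # [1, 2, 2, 3, 3, 3]
```

This only completes the approach from the question; using increasing bins directly avoids the off-by-one step.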
Answer 1:
We can simply use np.digitize with its right option set to True to get the indices, and then bring in np.take to extract the corresponding elements of bins, like so:
np.take(bins,np.digitize(a,bins,right=True))
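The one-liner above can be wrapped into the function the question asks for. The sketch below is one possible wrapping, not a definitive implementation: the name round_up_to_bin and the choice to raise ValueError for values above the largest bin are assumptions made here (without the check, np.take would raise an IndexError for such values, since np.digitize returns len(bins) for them).

```python
import numpy as np

def round_up_to_bin(values, bins):
    """Map each value to the smallest bin edge that is >= the value."""
    values = np.asarray(values)
    bins = np.asarray(bins)  # assumed to be sorted in increasing order
    # right=True: index i satisfies bins[i-1] < x <= bins[i]
    idx = np.digitize(values, bins, right=True)
    # digitize returns len(bins) for values beyond the last bin
    if np.any(idx == len(bins)):
        raise ValueError("some values are larger than the largest bin")
    return np.take(bins, idx)

result = round_up_to_bin([1, 1.5, 2, 2.3, 2.6, 3], [1, 2, 3])
print(result.tolist())  # [1, 2, 2, 3, 3, 3]
```

Values below the smallest bin get index 0 and are therefore rounded up to the first bin.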
Answer 2:
I believe np.searchsorted will do what you want:
Find the indices into a sorted array a such that, if the corresponding elements in v were inserted before the indices, the order of a would be preserved.
In [1]: my_series = [1, 1.5, 2, 2.3, 2.6, 3]
In [2]: bins = [1,2,3]
In [3]: import numpy as np
In [4]: [bins[k] for k in np.searchsorted(bins, my_series)]
Out[4]: [1, 2, 2, 3, 3, 3]
(As of numpy 1.10.0, digitize is implemented in terms of searchsorted.)
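The list comprehension can also be written with array indexing, which avoids the Python-level loop. The sketch below assumes bins is sorted in increasing order. It also checks that searchsorted with side="left" gives the same indices as digitize with right=True for this input. The clamping with np.clip at the end is an optional design choice introduced here, not part of the original answer: it maps values above the largest bin to that bin instead of raising an IndexError.

```python
import numpy as np

my_series = np.array([1, 1.5, 2, 2.3, 2.6, 3])
bins = np.array([1, 2, 3])  # must be sorted in increasing order

# side="left": index i satisfies bins[i-1] < x <= bins[i]
idx = np.searchsorted(bins, my_series, side="left")
same_as_digitize = np.array_equal(idx, np.digitize(my_series, bins, right=True))

rounded_up = bins[idx]
print(rounded_up.tolist())  # [1, 2, 2, 3, 3, 3]

# Optional (an assumption about the desired behavior): clamp out-of-range
# indices so that values above the largest bin map to the largest bin.
outliers = np.array([0.5, 3.7])
clamped_idx = np.clip(np.searchsorted(bins, outliers), 0, len(bins) - 1)
clamped = bins[clamped_idx]
print(clamped.tolist())  # [1, 3]
```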
Answer 3:
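Another way would be to round each value up with np.ceil and pick the bin nearest to the result. This works here because the bins are consecutive integers; with other bin edges it can pick the wrong bin: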
In [25]: def find_nearest(array,value):
    ...:     idx = (np.abs(array-np.ceil(value))).argmin()
    ...:     return array[idx]
    ...:
In [26]: my_series = np.array([ 1, 1.5, 2, 2.3, 2.6, 3])
In [27]: bins = [1, 2, 3]
In [28]: [find_nearest(bins, x) for x in my_series]
Out[28]: [1, 2, 2, 3, 3, 3]
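For bins that are not consecutive integers, one possible variant of this idea drops np.ceil and instead looks for the smallest non-negative difference between each bin and the value. This is a sketch under stated assumptions, not the answer author's method: the name round_up_nearest and the ValueError for values above every bin are introduced here for illustration.

```python
import numpy as np

def round_up_nearest(value, bins):
    """Return the smallest bin that is >= value (bins need not be integers)."""
    bins = np.asarray(bins)
    diffs = bins - value
    # Bins below the value are excluded by giving them an infinite distance.
    masked = np.where(diffs >= 0, diffs, np.inf)
    idx = masked.argmin()
    if np.isinf(masked[idx]):
        raise ValueError("value is larger than every bin")
    return bins[idx]

my_series = [1, 1.5, 2, 2.3, 2.6, 3]
print([round_up_nearest(x, [1, 2, 3]) for x in my_series])

# With a non-integer bin edge, the ceil-based version would return 3 for 2.3,
# whereas this variant returns 2.5.
print(round_up_nearest(2.3, [1, 2.5, 3]))
```

Like the original, this does a linear scan of the bins for every value, so the searchsorted and digitize approaches are likely to scale better for long inputs.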
Source: https://stackoverflow.com/questions/39382594/python-assigning-values-in-a-list-to-bins-by-rounding-up