Question
I have two dataframes, let's say df and map_dum. Here is the df.
>>> print(df)
sales
0 5
1 10
2 9
3 7
4 1
5 1
6 -1
7 2
8 9
9 8
10 1
11 3
12 10
13 -2
14 8
15 5
16 9
17 6
18 10
19 -1
20 5
21 3
And here is map_dum.
>>> print(map_dum)
class more_than_or_equal_to less_than
0 -1 -1000 0
1 1 0 2
2 2 2 4
3 3 4 6
4 4 6 8
5 5 8 10
6 6 10 1000
My goal is to add a new column, class, to df. To do so, I have to check which range in map_dum each value in df['sales'] falls into. For example, the first value in df['sales'] is 5, which satisfies 4 <= 5 < 6, so its class is 3. The final output should look like this.
>>> print(df)
sales class
0 5 3
1 10 6
2 9 5
3 7 4
4 1 1
5 1 1
6 -1 -1
7 2 2
8 9 5
9 8 5
10 1 1
11 3 2
12 10 6
13 -2 -1
14 8 5
15 5 3
16 9 5
17 6 4
18 10 6
19 -1 -1
20 5 3
21 3 2
Currently I am using apply to solve this, but it is very slow because my dataset is very large.
def add_class(sales, mapping, lower_limit, upper_limit):
    # Pick the class of the first row whose [lower, upper) range contains sales.
    result = mapping.loc[((mapping[lower_limit]<=sales)&(mapping[upper_limit]>sales)), 'class'].tolist()[0]
    return result
df['class'] = df['sales'].apply(lambda sales: add_class(sales, map_dum, 'more_than_or_equal_to', 'less_than'))
So performance matters in my case. Is there another way to add the class column to df that still satisfies these criteria, such as a vectorized solution? Thanks for any help!
Answer 1:
I think you need pd.cut:
bins = [-1000, 0, 2, 4, 6, 8, 10, 1000]
labels=[-1,1,2,3,4,5,6]
df['class'] = pd.cut(df['sales'], bins=bins, labels=labels, right=False)
print (df)
sales class
0 5 3
1 10 6
2 9 5
3 7 4
4 1 1
5 1 1
6 -1 -1
7 2 2
8 9 5
9 8 5
10 1 1
11 3 2
12 10 6
13 -2 -1
14 8 5
15 5 3
16 9 5
17 6 4
18 10 6
19 -1 -1
20 5 3
21 3 2
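The right=False argument matters here because the ranges in map_dum include their lower bound and exclude their upper bound. The following is a minimal sketch of that behaviour on a few boundary values; the sample values in sales are chosen for illustration and are not the question's full data.

```python
import pandas as pd

# Illustrative boundary values (not the full data from the question).
sales = pd.Series([0, 2, 10, -1])
bins = [-1000, 0, 2, 4, 6, 8, 10, 1000]
labels = [-1, 1, 2, 3, 4, 5, 6]

# right=False: each interval is [lower, upper), so the lower edge is included.
left_closed = pd.cut(sales, bins=bins, labels=labels, right=False)

# Default (right=True): each interval is (lower, upper], so the upper edge is included.
right_closed = pd.cut(sales, bins=bins, labels=labels)

print(left_closed.tolist())   # [1, 2, 6, -1]
print(right_closed.tolist())  # [-1, 1, 5, -1]
```

With the default setting, a value that sits exactly on a bin edge (0, 2, 10) lands one class lower than map_dum specifies, which is why right=False is needed to match the more_than_or_equal_to / less_than columns.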
To build the bins and labels dynamically from map_dum, use:
bins = [map_dum['more_than_or_equal_to'].iat[0]] + map_dum['less_than'].tolist()
labels= map_dum['class']
df['class'] = pd.cut(df['sales'], bins=bins, labels=labels, right=False)
print (df)
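For reference, here is a self-contained sketch that rebuilds both frames from the question, derives the bins from map_dum, and checks the result against the original apply-based lookup. It assumes the rows of map_dum are sorted and contiguous (each less_than equals the next row's more_than_or_equal_to); the contiguity check and the int64 cast are additions for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"sales": [5, 10, 9, 7, 1, 1, -1, 2, 9, 8, 1,
                             3, 10, -2, 8, 5, 9, 6, 10, -1, 5, 3]})
map_dum = pd.DataFrame({
    "class": [-1, 1, 2, 3, 4, 5, 6],
    "more_than_or_equal_to": [-1000, 0, 2, 4, 6, 8, 10],
    "less_than": [0, 2, 4, 6, 8, 10, 1000],
})

lower = map_dum["more_than_or_equal_to"]
upper = map_dum["less_than"]

# Assumption: ranges are sorted and contiguous, otherwise a single list of
# bin edges cannot represent them.
assert (lower.iloc[1:].to_numpy() == upper.iloc[:-1].to_numpy()).all()

bins = [lower.iat[0]] + upper.tolist()
labels = map_dum["class"].tolist()

# pd.cut returns a categorical column; cast to int64 to get plain integers.
df["class"] = pd.cut(df["sales"], bins=bins, labels=labels,
                     right=False).astype("int64")

# Row-wise lookup from the question, used here only as a reference result.
def add_class(sales, mapping, lower_limit, upper_limit):
    mask = (mapping[lower_limit] <= sales) & (mapping[upper_limit] > sales)
    return mapping.loc[mask, "class"].tolist()[0]

expected = df["sales"].apply(
    add_class, args=(map_dum, "more_than_or_equal_to", "less_than"))

print(df["class"].tolist() == expected.tolist())  # True
```

Note that pd.cut yields NaN for values outside the outermost bin edges, in which case the integer cast would fail; with the -1000 and 1000 limits from map_dum, that does not occur for this data.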
Source: https://stackoverflow.com/questions/42649224/pandas-alternate-way-to-add-new-column-with-lot-of-conditions-other-than-apply