Set values based on df.query?

℡╲_俬逩灬. Submitted on 2021-02-08 13:14:22

Question


I'd like to set the value of a column based on a query. I could probably use .where to accomplish this, but the criteria for .query are strings, which are easier for me to maintain, especially when the criteria become complex.

import numpy as np
import pandas as pd

np.random.seed(51723)
n = 10  # number of rows
df = pd.DataFrame(np.random.rand(n, 3), columns=list('abc'))

I'd like to make a new column, d, and set the value to 1 where these criteria are met:

criteria = '(a < b) & (b < c)'

Among other things, I've tried:

df['d'] = np.nan
df.query(criteria).loc[:,'d'] = 1

But that seems to do nothing except raise a SettingWithCopyWarning, even though I'm using .loc.

And passing inplace like this:

df.query(criteria, inplace=True).loc[:,'d'] = 1

Gives AttributeError: 'NoneType' object has no attribute 'loc'


Answer 1:


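As far as I know, df.query() returns a new DataFrame, so assigning to its result does not modify df. Instead, evaluate the criteria with df.eval() to get a boolean mask and assign through df.loc: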

In [146]: df.loc[df.eval(criteria), 'd'] = 1

In [147]: df
Out[147]:
          a         b         c    d
0  0.175155  0.221811  0.808175  1.0
1  0.069033  0.484528  0.841618  1.0
2  0.174685  0.648299  0.904037  1.0
3  0.292404  0.423220  0.897146  1.0
4  0.169869  0.395967  0.590083  1.0
5  0.574394  0.804917  0.746797  NaN
6  0.642173  0.252437  0.847172  NaN
7  0.073629  0.821715  0.859776  1.0
8  0.999789  0.833708  0.230418  NaN
9  0.028163  0.666961  0.582713  NaN
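
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes n = 10 rows (matching the output above); the extra column e and the df.query(criteria).index variant are only illustrations and not part of the original answer. The index variant also assumes the index labels are unique.

```python
import numpy as np
import pandas as pd

np.random.seed(51723)
n = 10  # assumption: 10 rows, as in the output shown above
df = pd.DataFrame(np.random.rand(n, 3), columns=list('abc'))

criteria = '(a < b) & (b < c)'

# df.eval() evaluates the string against df's columns and returns a
# boolean Series aligned with df's index (no filtered copy is made).
mask = df.eval(criteria)

# Assign through df.loc on the original frame, so df itself is modified.
df['d'] = np.nan
df.loc[mask, 'd'] = 1

# Illustrative alternative (hypothetical column 'e'): take the index labels
# of the rows that query() selects and use them with df.loc.
# This assumes the index labels are unique.
df['e'] = np.nan
df.loc[df.query(criteria).index, 'e'] = 1

print(df)
```

Both columns end up with 1 where the criteria hold and NaN elsewhere. The eval version avoids building the intermediate filtered DataFrame, so it is probably the simpler choice when only the mask is needed.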


Source: https://stackoverflow.com/questions/45470219/set-values-based-on-df-query
