Pandas overwrite values in column selectively based on condition from another column

▼魔方 西西 提交于 2019-12-11 06:34:11

问题


I have a dataframe in pandas with four columns. The data consists of strings. Sample:

          A                  B                C      D
0         2          asicdsada          v:cVccv      u
1         4     ascccaiiidncll     v:cVccv:ccvc      u
2         9                sca              V:c      u
3        11               lkss             v:cv      u
4        13              lcoao            v:ccv      u
5        14           wuduakkk         V:ccvcv:      u

I want to replace the string 'u' in Col D with the string 'a' if Col C in that row contains the substring 'V' (case sensitive). Desired outcome:

          A                  B                C      D
0         2          asicdsada          v:cVccv      a
1         4     ascccaiiidncll     v:cVccv:ccvc      a
2         9                sca              V:c      a
3        11               lkss             v:cv      u
4        13              lcoao            v:ccv      u
5        14           wuduakkk         V:ccvcv:      a

I prefer to overwrite the value already in Column D, rather than assign two different values, because I'd like to selectively overwrite some of these values again later, under different conditions.

It seems like this should have a simple solution, but I cannot figure it out, and haven't been able to find a fully applicable solution in other answered questions.

df.ix[1]["D"] = "a"

changes an individual value.

df.ix[:]["C"].str.contains("V")

returns a series of booleans, but I am not sure what to do with it. I have tried many many combinations of .loc, apply, contains, re.search, and for loops, and I get either errors or replace every value in column D. I'm a novice with pandas/python so it's hard to know whether my syntax, methods, or conceptualization of what I even need to do are off (probably all of the above).


回答1:


As you've already tried, use str.contains to get a boolean Series, and then use .loc to say "change these rows and the D column". For example:

In [5]: df.loc[df["C"].str.contains("V"), "D"] = "a"

In [6]: df
Out[6]: 
    A               B             C  D
0   2       asicdsada       v:cVccv  a
1   4  ascccaiiidncll  v:cVccv:ccvc  a
2   9             sca           V:c  a
3  11            lkss          v:cv  u
4  13           lcoao         v:ccv  u
5  14        wuduakkk      V:ccvcv:  a

(Avoid using .ix -- it's officially deprecated now.)



来源:https://stackoverflow.com/questions/44596521/pandas-overwrite-values-in-column-selectively-based-on-condition-from-another-co

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!