Comparing strings in a column and creating a corresponding new column in Python

Submitted by 泄露秘密 on 2019-12-11 05:24:20

Question


I have a data frame as shown below. I need to compare a column in the data frame against a set of strings and create a new column from the result.

DataFrame:

col_1
AB_SUMI
AK_SUMI
SB_LIMA
SB_SUMI
XY_SUMI

If 'AB', 'AK', or 'SB' is present in col_1, the new column should contain the matching value; otherwise it should contain '*'.

Expected output:

col_1      new_col
AB_SUMI     AB
AK_SUMI     AK
SB_LIMA     SB
SB_SUMI     SB
XY_SUMI     *

I tried the code below, but it did not work.

list=['AB','AK','AB']

for item in list:
    if df['col1'].str.contains(item).any():
        df['new']=item

Please help me with this. Thanks in advance.


Answer 1:


You can use str.extract with a regex built by joining the list with | (or), then replace the resulting NaN values with fillna:

L= ['AB','AK','SB']
a = '(' + '|'.join(L) + ')'
print (a)
(AB|AK|SB)

df['new'] = df.col_1.str.extract(a, expand=False).fillna('*')
print (df)
     col_1 new
0  AB_SUMI  AB
1  AK_SUMI  AK
2  SB_LIMA  SB
3  SB_SUMI  SB
4  XY_SUMI   *
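
For reference, here is a self-contained sketch of the same idea that can be run as-is. The names `prefixes` and `pattern` are illustrative and not from the answer. It makes two assumptions beyond the answer: the codes are expected at the start of the value (hence the `^` anchor), and the codes may contain regex metacharacters (hence `re.escape`). Drop the `^` to match a code anywhere in the string, as the answer above does.

```python
import re

import pandas as pd

df = pd.DataFrame({"col_1": ["AB_SUMI", "AK_SUMI", "SB_LIMA", "SB_SUMI", "XY_SUMI"]})
prefixes = ["AB", "AK", "SB"]  # illustrative name for the list of codes

# Build "^(AB|AK|SB)": one capture group, anchored at the start (an assumption).
# re.escape keeps any regex metacharacters in the codes literal.
pattern = "^(" + "|".join(map(re.escape, prefixes)) + ")"

# With a single capture group and expand=False, str.extract returns a Series;
# rows without a match are NaN, which fillna replaces with '*'.
df["new_col"] = df["col_1"].str.extract(pattern, expand=False).fillna("*")
print(df)
```

If two codes could match the same value, the regex alternation picks whichever matches first at the earliest position, so the order of `prefixes` can matter.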



Answer 2:


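A fun approach using NumPy: build one boolean mask per code, then take an object-dtype dot product with the codes, so that a True entry contributes the code and a False entry contributes an empty string.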

import numpy as np

L = 'AB AK SB'.split()

c = df.col_1.values.astype(str)
# np.char.find is the public name for np.core.defchararray.find; it returns -1 when s is absent
f = lambda x, s : np.char.find(x, s) >= 0
df.assign(new=np.stack([f(c, i) for i in L]).astype(object).T.dot(np.reshape(L, (-1, 1)))).replace('', '*')

     col_1 new
0  AB_SUMI  AB
1  AK_SUMI  AK
2  SB_LIMA  SB
3  SB_SUMI  SB
4  XY_SUMI   *
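
To see why the dot product yields strings, the sketch below spells out the same arithmetic in plain Python. It is an illustration of the idea, not the answer's code, and the names `codes`, `values`, `masks` and `labels` are made up for this example. It relies on Python's string repetition, where `True * 'AB'` is `'AB'` and `False * 'AB'` is `''`.

```python
import numpy as np

codes = ["AB", "AK", "SB"]
values = np.array(["AB_SUMI", "AK_SUMI", "SB_LIMA", "SB_SUMI", "XY_SUMI"])

# One boolean row per code: True where the value contains that code.
masks = np.stack([np.char.find(values, code) >= 0 for code in codes])

# For each value, "multiply" each code by its hit flag and concatenate the
# pieces, which is what the object-dtype dot product does with * and +.
# An all-False row gives '', which is replaced by '*'.
labels = [
    "".join(bool(hit) * code for hit, code in zip(row, codes)) or "*"
    for row in masks.T
]
print(labels)
```

One consequence of this design: if a value contained more than one code, the pieces would be concatenated (for example 'ABAK') rather than a single code being chosen.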


Source: https://stackoverflow.com/questions/42883267/comparing-string-in-a-column-and-creating-respective-a-new-column-in-python
