Question
I have a DataFrame df:

name                      Value
Sri is a cricketer        Sri,is
Ram player                Ram
Ravi is a singer          is
cricket and foot is ball  and,is,foot

and a list:

my_list = ["is", "foot"]

I am trying to split df["Value"] on commas and move each value that exists in my_list into a new column. My expected output is:

name                      Value  my_list
Sri is a cricketer        Sri    is
Ram player                Ram
Ravi is a singer                 is
cricket and foot is ball  and    is,foot

Please help me achieve this. Thanks in advance.
Answer 1:
Use str.findall with str.join:
my_list=["is", "foot"]
df['my_list'] = df['Value'].str.findall('(' + '|'.join(my_list) + ')').str.join(',')
print(df)
   name                      Value        my_list
0  Sri is a cricketer        Sri,is       is
1  Ram player                Ram
2  Ravi is a singer          is           is
3  cricket and foot is ball  and,is,foot  is,foot
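One caveat with the pattern above is that it matches substrings, so an entry such as "is" would also be found inside a longer value like "this". A minimal sketch of a stricter variant follows, assuming the values are always separated by plain commas with no surrounding spaces. The lookaround anchors and the use of re.escape are additions for illustration, not part of the original answer.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "name": ["Sri is a cricketer", "Ram player",
             "Ravi is a singer", "cricket and foot is ball"],
    "Value": ["Sri,is", "Ram", "is", "and,is,foot"],
})
my_list = ["is", "foot"]

# Match an entry only when it is a whole comma-separated value:
# it must start at the string start or right after a comma, and
# end at the string end or right before a comma.
# re.escape protects entries that contain regex metacharacters.
alternatives = "|".join(map(re.escape, my_list))
pattern = r"(?:^|(?<=,))(?:" + alternatives + r")(?=,|$)"

# No capturing group is used, so findall returns the full matches.
df["my_list"] = df["Value"].str.findall(pattern).str.join(",")
print(df)
```

With the sample data this yields the same my_list column as above, while a value such as "this" or "isle" would no longer count as a match.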
Another solution uses split and the intersection of sets:
my_list=["is", "foot"]
df['my_list']=df['Value'].str.split(',').apply(lambda x: set(x) & set(my_list)).str.join(',')
print(df)
   name                      Value        my_list
0  Sri is a cricketer        Sri,is       is
1  Ram player                Ram
2  Ravi is a singer          is           is
3  cricket and foot is ball  and,is,foot  is,foot
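Python sets do not guarantee iteration order, so the joined result of a set intersection may come out as "foot,is" rather than "is,foot". If the order in Value matters, a list-based filter is one option. This is a sketch under that assumption. The names wanted and parts are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sri is a cricketer", "Ram player",
             "Ravi is a singer", "cricket and foot is ball"],
    "Value": ["Sri,is", "Ram", "is", "and,is,foot"],
})
my_list = ["is", "foot"]

# A set gives fast membership tests; the output order still
# follows the order of the items inside each Value string.
wanted = set(my_list)

parts = df["Value"].str.split(",")
df["my_list"] = parts.apply(
    lambda items: ",".join(item for item in items if item in wanted)
)
print(df)
```

Unlike the set version, this also keeps duplicates, so "is,is" would stay "is,is".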
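Finally, to also remove the matched values from Value (on pandas 2.0 and later, pass regex=True, because str.replace no longer treats the pattern as a regular expression by default):
df['Value'] = (df['Value'].str.replace('(' + '|,'.join(my_list) + ')', '', regex=True)
                          .str.replace('[,]{2,}', ',', regex=True)
                          .str.strip(','))
print(df)
   name                      Value  my_list
0  Sri is a cricketer        Sri    is
1  Ram player                Ram
2  Ravi is a singer                 is
3  cricket and foot is ball  and    is,foot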
Or, using set intersection for my_list and set difference for Value:
my_list=["is", "foot"]
s1 = df['Value'].str.split(',')
df['my_list'] = s1.apply(lambda x: set(x) & set(my_list)).str.join(',')
df['Value'] = s1.apply(lambda x: set(x) - set(my_list)).str.join(',')
print(df)
   name                      Value  my_list
0  Sri is a cricketer        Sri    is
1  Ram player                Ram
2  Ravi is a singer                 is
3  cricket and foot is ball  and    is,foot
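The two-column split can also be done in a single pass over each string, which keeps the original item order in both columns. The following is only a sketch: the helper partition is hypothetical and not part of pandas or of the original answer, and it assumes every Value is a non-null string.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sri is a cricketer", "Ram player",
             "Ravi is a singer", "cricket and foot is ball"],
    "Value": ["Sri,is", "Ram", "is", "and,is,foot"],
})
my_list = ["is", "foot"]
wanted = set(my_list)


def partition(value, wanted):
    """Hypothetical helper: split a comma-separated string into
    (items not in wanted, items in wanted), keeping their order."""
    kept, moved = [], []
    for item in value.split(","):
        (moved if item in wanted else kept).append(item)
    return ",".join(kept), ",".join(moved)


# Compute both halves once per row, then assign the two columns.
pairs = [partition(value, wanted) for value in df["Value"]]
df["Value"] = [kept for kept, _ in pairs]
df["my_list"] = [moved for _, moved in pairs]
print(df)
```

Rows with missing values would need to be handled first, for example with fillna(""), because value.split fails on NaN.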
Source: https://stackoverflow.com/questions/47449545/how-to-split-values-in-a-datacolumn-and-adding-it-to-a-new-column-with-a-conditi