Searching whether any word is present in a column of another DataFrame using Python

Posted by 做~自己de王妃 on 2019-12-13 09:42:31

Question


Hi, I have two DataFrames like the ones below:

 DF1

 Alpha   |  Numeric  |  Special

 and     |  1        |   @
 or      |  2        |   $
         |  3        |   &  
         |  4        |     
         |  5        |     

and

DF2 with single column

Content      |

boy or girl  |
school @ morn|

I want to check whether any column of DF1 contains any of the keywords that appear in the Content column of DF2. The output should be a new DataFrame listing the names of the matching columns:

 output_DF

 output_column|
 Alpha        |
 Special      |

Could someone help me with this?


Answer 1:


Here is one approach, though it is not very elegant.

import pandas as pd

df1 = pd.DataFrame([[['and', 'or'],['1', '2','3','4','5'],['@', '$','&']]],columns=['Alpha','Numeric','Special'])
print(df1)
       Alpha          Numeric    Special
0  [and, or]  [1, 2, 3, 4, 5]  [@, $, &]

df2 = pd.DataFrame([[['boy', 'or','girl']],[['school', '@','morn']]],columns=['Content'])    
print(df2)
             Content
0    [boy, or, girl]
1  [school, @, morn]

First, flatten the df2 data into a single list:

df2list=[x for row in df2['Content'].tolist() for x in row]
print(df2list)
['boy', 'or', 'girl', 'school', '@', 'morn']

Then intersect the values of each df1 column with df2list:

containlistname = []
for columnsname in df1.columns:
    # Flatten the lists stored in this column into one list of values
    df1list = [x for row in df1[columnsname].tolist() for x in row]
    # Keep the column name if it shares at least one value with df2list
    if set(df1list).intersection(df2list):
        containlistname.append(columnsname)
output_DF = pd.DataFrame(containlistname, columns=['output_column'])

Finally, print the result:

print(output_DF)
  output_column
0         Alpha
1       Special
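
The approach above stores a whole list in each cell. As a rough sketch of the same set-membership idea applied to the flat layout from the question, the following assumes pandas 0.25 or later (for `Series.explode`), assumes the keywords in Content are separated by whitespace, and stores the Numeric values as strings. The names `tokens` and `matched` are only illustrative.

```python
import pandas as pd

# DF1 as laid out in the question: columns of unequal length, padded with NaN.
df1 = pd.DataFrame({
    "Alpha": pd.Series(["and", "or"]),
    "Numeric": pd.Series(["1", "2", "3", "4", "5"]),
    "Special": pd.Series(["@", "$", "&"]),
})
df2 = pd.DataFrame({"Content": ["boy or girl", "school @ morn"]})

# Every whitespace-separated token that appears anywhere in Content.
tokens = set(df2["Content"].str.split().explode())

# Keep the df1 columns that share at least one non-missing value with the tokens.
matched = [col for col in df1.columns if df1[col].dropna().isin(tokens).any()]

output_DF = pd.DataFrame({"output_column": matched})
print(output_DF)
```

Keeping the Numeric values as strings matters here: a token split out of Content is always a string, so an integer `3` in df1 would never compare equal to the token `'3'`.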



Answer 2:


You could apply the Series.isin() method for each column in df1 and then return the column names for which there are any occurrences:

import pandas as pd

d = {'Alpha' :['and', 'or'],'Numeric':[1, 2,3,4,5],'Special':['@', '$','&']}
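# Wrap each list in a Series so columns of unequal length are padded with NaN
# (d.items() replaces the Python 2-only d.iteritems())
df1 = pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in d.items() ]))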

df2 = pd.DataFrame({'Content' :['boy or girl','school @ morn']})    

check = lambda r:[c for c in df1.columns if df1[c].dropna().isin(r).any()]
df3 = pd.DataFrame({'output_column' : df2["Content"].str.split(' ').apply(check)})

This results in:

  output_column
0       [Alpha]
1     [Special]
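
The result above holds one list per row of Content, whereas the question asks for a single column of names. One possible way to collapse it, sketched here under the assumption that pandas 0.25 or later is available (for `Series.explode`), is shown below; the per-row frame is rebuilt by hand so that the snippet runs on its own.

```python
import pandas as pd

# Per-row lists in the same shape as the result shown above (rebuilt by hand here).
df3 = pd.DataFrame({"output_column": [["Alpha"], ["Special"]]})

# One row per matched name; rows with empty lists become NaN and are dropped,
# and names matched by several Content rows are kept only once.
flat = df3["output_column"].explode().dropna().drop_duplicates()

output_DF = flat.reset_index(drop=True).to_frame()
print(output_DF)
```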


Source: https://stackoverflow.com/questions/45055007/searching-if-anyone-of-word-is-present-in-the-another-column-of-a-dataframe-or-i
