I have a pandas DataFrame like this:
word_list
['nuclear','election','usa','baseball']
['football','united','thriller']
['marvels','hollywood','s
You can flatten the dictionary of lists first, then look up each word with .get, using 'miscellaneous' as the default for unmatched values. Convert the results to a set to keep only unique categories, then join them into a single string:
movies = ['spiderman', 'marvels', 'thriller']
sports = ['baseball', 'hockey', 'football']
politics = ['election', 'china', 'usa']
d = {'movies': movies, 'sports': sports, 'politics': politics}
# invert the dictionary of lists: each word maps to the name of its list
d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}
# look up every word, fall back to 'miscellaneous', de-duplicate, join
f = lambda x: ','.join(set([d1.get(y, 'miscellaneous') for y in x]))
df['matched_list_names'] = df['word_list'].apply(f)
print(df)
                                 word_list             matched_list_names
0       [nuclear, election, usa, baseball]  politics,miscellaneous,sports
1             [football, united, thriller]    miscellaneous,sports,movies
2  [marvels, hollywood, spiderman, budget]           miscellaneous,movies
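One caveat: a set has no defined order, so the order of the category names inside each string can differ between runs. If a stable order matters, a minimal sketch is to de-duplicate with dict.fromkeys instead, which keeps the first-seen order (Python 3.7+). The helper name categories_in_order is hypothetical, not part of pandas or of the solution above:

```python
movies = ['spiderman', 'marvels', 'thriller']
sports = ['baseball', 'hockey', 'football']
politics = ['election', 'china', 'usa']
d = {'movies': movies, 'sports': sports, 'politics': politics}

# word -> list name, same flattening as above
d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}

def categories_in_order(words, lookup, default='miscellaneous'):
    """Hypothetical helper: unique category names in first-seen order."""
    labels = (lookup.get(w, default) for w in words)
    # dict.fromkeys drops duplicates while preserving insertion order
    return ','.join(dict.fromkeys(labels))

print(categories_in_order(['nuclear', 'election', 'usa', 'baseball'], d1))
# miscellaneous,politics,sports
```

It can be applied the same way, with df['word_list'].apply(lambda x: categories_in_order(x, d1)).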
Similar solution with list comprehension:
df['matched_list_names'] = [','.join(set([d1.get(y, 'miscellaneous') for y in x]))
                            for x in df['word_list']]
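If you prefer to avoid a Python-level function per row, one possible alternative (a sketch, not benchmarked here) is to explode the lists into one word per row, map the words through the flattened dictionary, and group back by the original index. The sample frame below is rebuilt from the first two rows of the question, and sorted() is my own addition to make the output order deterministic:

```python
import pandas as pd

# sample data assumed from the first two rows of the question
df = pd.DataFrame({'word_list': [['nuclear', 'election', 'usa', 'baseball'],
                                 ['football', 'united', 'thriller']]})

d = {'movies': ['spiderman', 'marvels', 'thriller'],
     'sports': ['baseball', 'hockey', 'football'],
     'politics': ['election', 'china', 'usa']}
d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}

# one word per row; the original row index is repeated for each word
words = df['word_list'].explode()
# unmatched words become NaN after map, then get the default label
labels = words.map(d1).fillna('miscellaneous')
# collapse back to one string per original row, sorted for a stable order
df['matched_list_names'] = (labels.groupby(level=0)
                                  .agg(lambda s: ','.join(sorted(set(s)))))
print(df['matched_list_names'].tolist())
# ['miscellaneous,politics,sports', 'miscellaneous,movies,sports']
```

Note that an empty list explodes to NaN, so with this variant such a row ends up as 'miscellaneous' rather than an empty string.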