Pandas find super string in one Series from another Series

霸气de小男生 submitted on 2020-01-05 08:41:54

Question


This does not necessarily need to be done in pandas, but it would be nice if it could be.

Say I have a list or Series of strings:

['XXY8779','0060-19','McChicken','456728']

And I have another list or Series that contains substrings of the originals, like so:

['60-19','Chicken','8779','1124231','92871','johnson']

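For each string in the first list, I want to know whether it matches any entry in the second list, so the result would be something like: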

[True, True, True, False]

I'm looking for a match that is something like:

^[a-zA-Z0-9.,$;]+ < matching string in other list >

In other words, a string that starts with one or more arbitrary characters, with the remainder matching one of the strings in my other list exactly.

Does anyone have any ideas on the best way to accomplish this?

Thanks!


Answer 1:


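Use str.contains

'|'.join(s2) builds a single regular expression in which | acts as a logical OR between the substrings. str.contains interprets its pattern as a regex by default, so the result is True wherever any of the substrings occurs in the value.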

s1 = pd.Series(['XXY8779', '0060-19', 'McChicken', '456728'])

s2 = ['60-19', 'Chicken', '8779', '1124231', '92871', 'johnson']

s1.str.contains('|'.join(s2))

0     True
1     True
2     True
3    False
dtype: bool
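
Note that the plain join treats each substring as a regex and matches it anywhere in the value, not only at the end. The following is a minimal sketch of a stricter variant, assuming the substrings should be taken literally and must appear at the end of the value as the question describes. The name pattern and the use of re.escape are additions for illustration, not part of the original answer.

```python
import re

import pandas as pd

s1 = pd.Series(['XXY8779', '0060-19', 'McChicken', '456728'])
s2 = ['60-19', 'Chicken', '8779', '1124231', '92871', 'johnson']

# Escape each substring so characters such as '.' or '$' are matched literally,
# then anchor the whole alternation to the end of the string with '$'.
pattern = '(?:' + '|'.join(re.escape(s) for s in s2) + ')$'

result = s1.str.contains(pattern)
print(result.tolist())
```

The non-capturing group keeps the $ anchor applied to every alternative rather than only the last one.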



Answer 2:


Since the match is always at the end of the string, you can use the built-in str.endswith together with any, which short-circuits on the first match. s1 and s2 are just your lists above (this also works if they are pd.Series).

[any(i.endswith(j) for j in s2) for i in s1]
# [True, True, True, False]

You can then convert it to a series with pd.Series or just use that list as a mask as-is.
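
As a possible simplification, the built-in str.endswith also accepts a tuple of suffixes, which removes the inner loop. The sketch below assumes plain Python lists. The names suffixes, mask, and strict_mask are illustrative, and the strict variant is one reading of the question's requirement that at least one character precede the match.

```python
s1 = ['XXY8779', '0060-19', 'McChicken', '456728']
s2 = ['60-19', 'Chicken', '8779', '1124231', '92871', 'johnson']

# str.endswith accepts a tuple and returns True if any suffix matches.
suffixes = tuple(s2)
mask = [value.endswith(suffixes) for value in s1]
print(mask)

# Stricter variant: also require at least one character before the suffix,
# so a value that is exactly equal to a suffix does not count as a match.
strict_mask = [
    any(value.endswith(suffix) and len(value) > len(suffix) for suffix in s2)
    for value in s1
]
print(strict_mask)
```

Either list can be passed to pd.Series or used directly as a boolean mask, as described above.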



Source: https://stackoverflow.com/questions/51085069/pandas-find-super-string-in-one-series-from-another-series
