Counting appearances of multiple substrings in a cell pandas

六眼飞鱼酱① submitted on 2019-12-12 04:56:57

Question


I have a column that contains rather lengthy strings. Each string may or may not contain certain substrings, such as 'H 07', 'H 06' or 'F 13'. I would like to count the appearances of these substrings and add the results to a new cell. The original cell value is

df.iloc[0,0]    
'rfgergerggr H 07 jgjg gjgj H 06 gjhgj  H 06 '

The result of the procedure should be a new cell with

df.iloc[0,1]
{'H 07':1, 'H 06':2}

I imagine this could be done with the help of str.contains, but I am looking for about 50 different substrings and cannot think of a good way to search for all of them. I also think a complex lambda could solve my problem, but I do not know how to build it.

So far I have tried str.contains, but it only shows whether the substring is present; I do not get the count. Also, to find all 50 substrings I am interested in, I would have to call str.contains once per substring. I think there should be a better way of doing that.


Answer 1:


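Something like:

substrs = [...]  # the substrings to look for, e.g. 'H 07', 'H 06', 'F 13'

def f(cell_value):
    # count each substring with str.count and keep only the non-zero counts
    return {k: v for k, v in ((s, cell_value.count(s)) for s in substrs) if v}

df.column.apply(f)  # returns a Series holding one dict per row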
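For a version that runs end to end, here is a minimal sketch of the same idea applied to the sample string from the question. The column names `text` and `counts`, the second sample row, and the function name `count_substrings` are assumptions made for illustration; the substring list only holds the three examples the question mentions, not the full set of about 50. It also assumes every cell is a string, since `str.count` is not available on missing values.

```python
import pandas as pd

# Hypothetical frame: the column name "text" and the second row are assumptions
df = pd.DataFrame(
    {"text": ["rfgergerggr H 07 jgjg gjgj H 06 gjhgj  H 06 ", "no codes here"]}
)

# Only the three substrings named in the question; the real list would be longer
substrs = ["H 07", "H 06", "F 13"]


def count_substrings(cell_value):
    # str.count returns the number of non-overlapping occurrences of a substring
    counts = ((s, cell_value.count(s)) for s in substrs)
    # Keep only the substrings that actually occur in this cell
    return {s: n for s, n in counts if n}


# Store the per-row dictionaries in a new column next to the original text
df["counts"] = df["text"].apply(count_substrings)

print(df.iloc[0, 1])  # {'H 07': 1, 'H 06': 2}
```

Rows that contain none of the substrings end up with an empty dict. Because the substring list is scanned once per cell, adding more substrings only means extending `substrs`, not adding more calls.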


Source: https://stackoverflow.com/questions/24700814/counting-appearances-of-multiple-substrings-in-a-cell-pandas
