Pandas: Selecting rows based on value counts of a particular column

故里飘歌 2020-12-09 11:55

What's the simplest way of selecting all rows from a pandas DataFrame whose sym occurs exactly twice in the entire table? For example, in the table below, I would like to se

2 answers

庸人自扰 (original poster)

2020-12-09 12:09

You can use map, which should be faster than using groupby and transform:

df[df['sym'].map(df['sym'].value_counts()) == 2]
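To see what the one-liner does step by step, here is a minimal sketch on made-up data. The question's original table is not shown, so the DataFrame below, its `price` column, and the intermediate names `counts` and `per_row` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data; not the table from the question.
df = pd.DataFrame({
    "sym": ["a", "b", "b", "c", "c", "c", "d", "d"],
    "price": [1, 2, 3, 4, 5, 6, 7, 8],
})

# Series indexed by sym value, holding how often each value occurs.
counts = df["sym"].value_counts()

# Look up each row's sym in counts, giving a per-row occurrence count
# aligned with df's index.
per_row = df["sym"].map(counts)

# Boolean mask keeps only rows whose sym occurs exactly twice.
result = df[per_row == 2]
print(result)
```

With this data, only the rows for "b" and "d" remain, since "a" occurs once and "c" three times.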

For example, timing both approaches:

%%timeit
df[df['sym'].map(df['sym'].value_counts()) == 2]
Out[1]:
1.83 ms ± 23.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
df[df.groupby("sym")["sym"].transform('size') == 2]
Out[2]:
2.08 ms ± 41.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
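The two timed expressions should select the same rows, so the choice between them is mainly about speed. A small sketch that checks this on made-up data follows; the DataFrame and the `qty` column are assumptions, and the timings above came from the answerer's own data, which is not shown.

```python
import pandas as pd

# Hypothetical sample data; not the table used for the timings above.
df = pd.DataFrame({"sym": list("abbcccdd"), "qty": range(8)})

# Approach 1: map each sym to its overall count.
via_map = df[df["sym"].map(df["sym"].value_counts()) == 2]

# Approach 2: broadcast each group's size back onto its rows.
via_transform = df[df.groupby("sym")["sym"].transform("size") == 2]

# Both filters keep the same rows in the same order.
same = via_map.equals(via_transform)
print(same)
```

Relative timings can differ with the number of rows and the number of distinct sym values, so it is worth re-running %%timeit on your own data.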
