I often need to filter a pandas DataFrame df by df[df['col_name']=='string_value'], and I want to speed up the row selection operation, is there a qui
Somewhat surprisingly, working with the .values array instead of the Series is much faster for me:
>>> time df = mul_df(3000, 2000, 3).reset_index()
CPU times: user 5.96 s, sys: 0.81 s, total: 6.78 s
Wall time: 6.78 s
>>> timeit df[df["STK_ID"] == "A0003"]
1 loops, best of 3: 841 ms per loop
>>> timeit df[df["STK_ID"].values == "A0003"]
1 loops, best of 3: 210 ms per loop
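
mul_df is a helper that isn't defined in this post, so here is a rough, self-contained sketch for reproducing the comparison on your own machine. The frame below is made-up stand-in data (the row count, the "value" column, and the select_series / select_values helper names are my assumptions, not part of the original benchmark); only the "STK_ID" column and the "A0003" key come from the timings above. The size of the gap, and possibly whether there is one at all, will depend on your pandas version and the column dtype, so treat the printed numbers as indicative only.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame built by mul_df(...).reset_index().
n_rows = 200_000
rng = np.random.default_rng(0)
stock_ids = [f"A{i:04d}" for i in range(1, 101)]  # "A0001" .. "A0100"
df = pd.DataFrame({
    "STK_ID": rng.choice(stock_ids, size=n_rows),
    "value": rng.standard_normal(n_rows),
})


def select_series(frame, key):
    # Boolean mask built by comparing the Series itself.
    return frame[frame["STK_ID"] == key]


def select_values(frame, key):
    # Boolean mask built by comparing the underlying array.
    return frame[frame["STK_ID"].values == key]


by_series = select_series(df, "A0003")
by_values = select_values(df, "A0003")

# Both approaches should select exactly the same rows.
assert by_series.equals(by_values)

t_series = timeit.timeit(lambda: select_series(df, "A0003"), number=5)
t_values = timeit.timeit(lambda: select_values(df, "A0003"), number=5)
print(f"Series comparison:  {t_series / 5 * 1000:.2f} ms per call")
print(f".values comparison: {t_values / 5 * 1000:.2f} ms per call")
```

Both versions return the same rows; the only difference is whether the mask is computed through the Series or directly on the array, so it is worth timing both on your real data before switching.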