I often need to filter a pandas DataFrame df by df[df['col_name']=='string_value'], and I want to speed up the row selection operation, is there a qui
Depending on what you want to do with the selection afterwards, and on whether you need to make several selections of this kind, groupby can also make things faster (at least in this example).
Even if you only need to select the rows for a single string_value, it is a little faster (but not by much):
In [11]: %timeit df[df['STK_ID']=='A0003']
1 loops, best of 3: 626 ms per loop
In [12]: %timeit df.groupby("STK_ID").get_group("A0003")
1 loops, best of 3: 459 ms per loop
But subsequent calls on the same GroupBy object will be very fast (e.g. to select the rows for other string_values):
In [25]: grouped = df.groupby("STK_ID")
In [26]: %timeit grouped.get_group("A0003")
1 loops, best of 3: 333 us per loop
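
To make the pattern concrete, here is a minimal self-contained sketch of building the GroupBy object once and reusing it for lookups. The sample data is made up for illustration (only the column name STK_ID and the key "A0003" come from the session above), and the timings you get will depend on your data and pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the values and sizes are assumptions,
# only the STK_ID column name is taken from the example above.
rng = np.random.default_rng(0)
ids = [f"A{i:04d}" for i in range(100)]
df = pd.DataFrame({
    "STK_ID": rng.choice(ids, size=10_000),
    "value": rng.random(10_000),
})

# Build the GroupBy object once ...
grouped = df.groupby("STK_ID")

# ... then reuse it for every lookup instead of re-scanning the column.
by_group = grouped.get_group("A0003")

# The plain boolean-mask selection, for comparison.
by_mask = df[df["STK_ID"] == "A0003"]

# Check that both approaches select the same rows
# (sorted by index so the comparison does not depend on row order).
same_rows = by_group.sort_index().equals(by_mask.sort_index())
print(same_rows)
```

This only pays off when the same GroupBy object is queried several times; for a one-off selection the boolean mask is simpler.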