I created a dataframe using the following:
df = pd.DataFrame(np.random.rand(10, 3), columns=['alp1', 'alp2', 'bet1'])
I'd like to ge
option 1
Full numpy + pd.DataFrame
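# Boolean mask: True for column names that contain 'alp'.
# np.char.find is the public alias of np.core.defchararray.find.
m = np.char.find(df.columns.values.astype(str), 'alp') >= 0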
pd.DataFrame(df.values[:, m], df.index, df.columns[m])
       alp1      alp2
0  0.819189  0.356867
1  0.900406  0.968947
2  0.201382  0.658768
3  0.700727  0.946509
4  0.176423  0.290426
5  0.132773  0.378251
6  0.749374  0.983251
7  0.768689  0.415869
8  0.292140  0.457596
9  0.214937  0.976780
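To make the mask step concrete, here is a minimal sketch of what the intermediate arrays look like. The fixed seed and the names `rng`, `cols` and `pos` are assumptions added for illustration; they are not part of the original snippet.

```python
import numpy as np
import pandas as pd

# Same shape and column names as the frame above; the seed is an assumption
# added only so that repeated runs produce the same frame.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 3)), columns=['alp1', 'alp2', 'bet1'])

cols = df.columns.values.astype(str)  # column names as a fixed-width string array
pos = np.char.find(cols, 'alp')       # index of the first match per name, -1 if absent
m = pos >= 0                          # True where 'alp' occurs in the name

print(pos)  # [ 0  0 -1]
print(m)    # [ True  True False]
```

Because `find` returns -1 for names without a match, comparing with `>= 0` turns the positions into a boolean mask that can index both `df.values` and `df.columns`.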
option 2
numpy + loc
# Same boolean mask as in option 1, built with the public np.char.find alias.
m = np.char.find(df.columns.values.astype(str), 'alp') >= 0
df.loc[:, m]
       alp1      alp2
0  0.819189  0.356867
1  0.900406  0.968947
2  0.201382  0.658768
3  0.700727  0.946509
4  0.176423  0.290426
5  0.132773  0.378251
6  0.749374  0.983251
7  0.768689  0.415869
8  0.292140  0.457596
9  0.214937  0.976780
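Both options should select the same columns. The sketch below checks that on a reproducible frame; the seed and the names `opt1` and `opt2` are assumptions introduced only for this check.

```python
import numpy as np
import pandas as pd

# Reproducible stand-in for the frame above (the seed is an assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 3)), columns=['alp1', 'alp2', 'bet1'])

# Boolean mask over the column names.
m = np.char.find(df.columns.values.astype(str), 'alp') >= 0

# Option 1: rebuild a frame from the raw ndarray, the index and the kept labels.
opt1 = pd.DataFrame(df.values[:, m], df.index, df.columns[m])

# Option 2: let .loc apply the boolean mask along the column axis.
opt2 = df.loc[:, m]

print(list(opt1.columns))  # ['alp1', 'alp2']
pd.testing.assert_frame_equal(opt1, opt2)  # raises if the two results differ
```

One difference worth keeping in mind: `df.values` collapses the frame into a single ndarray, so on a frame with mixed column dtypes option 1 would upcast to a common dtype, while `.loc` keeps each column's dtype.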
timing
numpy is faster
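No measurements are shown here, so the following is only a rough sketch of how the comparison could be reproduced with `timeit`. The text does not say what the NumPy versions are compared against; `df.filter(like='alp')` is used as an assumed pandas-only baseline, and the function names are hypothetical. Results will vary with machine, library versions and frame size.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 3), columns=['alp1', 'alp2', 'bet1'])

def option_1():
    # Full numpy + pd.DataFrame
    m = np.char.find(df.columns.values.astype(str), 'alp') >= 0
    return pd.DataFrame(df.values[:, m], df.index, df.columns[m])

def option_2():
    # numpy + loc
    m = np.char.find(df.columns.values.astype(str), 'alp') >= 0
    return df.loc[:, m]

def pandas_filter():
    # Assumed baseline for comparison; not taken from the text above.
    return df.filter(like='alp')

timings = {}
for fn in (option_1, option_2, pandas_filter):
    # Best of 3 runs of 200 calls each, reported as seconds per call.
    timings[fn.__name__] = min(timeit.repeat(fn, number=200, repeat=3)) / 200
    print(f"{fn.__name__}: {timings[fn.__name__] * 1e6:.1f} µs per call")
```

Taking the minimum over several repeats reduces noise from other processes; on a frame this small the numbers mostly reflect fixed per-call overhead rather than the cost of the string search.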