I was struggling this afternoon to find a way of selecting few columns of my Pandas DataFrame, by checking the occurrence of a certain pattern in their name (label?).
<I think df.keys().tolist()
is the thing you're searching for.
A tiny example:
from pandas import DataFrame as df
d = df({'somename': [1,2,3], 'othername': [4,5,6]})
names = d.keys().tolist()
for n in names:
print n
print type(n)
Output:
othername
type 'str'
somename
type 'str'
Then with the strings you got, you can do any string operation you want.
Select column by partial string, can simply be done, via:
df.filter(like='hello') # select columns which contain the word hello
And to select rows by partial string match, you can pass axis=0 to filter:
df.filter(like='hello', axis=0)
Your solution using map
is very good. If you really want to use str.contains, it is possible to convert Index objects to Series (which have the str.contains
method):
In [1]: df
Out[1]:
x y z
0 0 0 0
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 5
6 6 6 6
7 7 7 7
8 8 8 8
9 9 9 9
In [2]: df.columns.to_series().str.contains('x')
Out[2]:
x True
y False
z False
dtype: bool
In [3]: df[df.columns[df.columns.to_series().str.contains('x')]]
Out[3]:
x
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
UPDATE I just read your last paragraph. From the documentation, str.contains
allows you to pass a regex by default (str.contains('^myregex')
)