I have a DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({'foo.aa': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
'foo.fighters
Another way to select the desired entries is to use map:
df.loc[(df == 1).any(axis=1), df.columns.map(lambda x: x.startswith('foo'))]
which gives you the columns whose names start with foo, restricted to the rows that contain a 1:
   foo.aa  foo.bars  foo.fighters  foo.fox  foo.manchu
0     1.0         0             0        2          NA
1     2.1         0             1        4           0
2     NaN         0           NaN        1           0
5     6.8         1             0        5           0
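As a minimal end-to-end sketch of this selection: the frame below is hypothetical (it is not the data from the question), and wrapping the column mask in np.asarray is a precaution added here so that the mask is a plain Boolean array whatever type map returns.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only, not the frame from the question.
df = pd.DataFrame({
    "bar.baz": [5.0, 6.0, 7.0],
    "foo.aa": [1.0, np.nan, 4.7],
    "foo.fox": [2, 3, 5],
    "nas.foo": [0, 0, 1],
})

# Row mask: True for rows that hold a 1 in any column.
rows = (df == 1).any(axis=1)

# Column mask: True for column names that start with "foo".
# np.asarray is an added precaution to get a plain Boolean array.
cols = np.asarray(df.columns.map(lambda x: x.startswith("foo")), dtype=bool)

result = df.loc[rows, cols]
print(result)
```

Note that the row test looks at every column. In this sketch row 2 is kept because of the 1 in nas.foo, even though that column is then dropped by the column mask.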
The row selection is done by
(df == 1).any(axis=1)
as in @ajcr's answer, which gives you:
0 True
1 True
2 True
3 False
4 False
5 True
dtype: bool
meaning that rows 3 and 4 do not contain a 1 and won't be selected.
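To see how the row mask is built, here is a small sketch on a hypothetical two-column frame (the names a and b are made up for illustration). It shows the element-wise comparison first and then the reduction across each row:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [0, 1, 2],
})

# Element-wise comparison; a NaN never compares equal to 1, so it yields False.
hits = df == 1

# Reduce across the columns: True where at least one cell in the row matched.
mask = hits.any(axis=1)

print(hits)
print(mask)
```

Row 1 is selected because of the 1 in column b, while its NaN in column a simply counts as no match.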
The selection of the columns is done using Boolean indexing like this:
df.columns.map(lambda x: x.startswith('foo'))
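In the example above this returns (depending on the pandas version, the same values may come back as an Index rather than a NumPy array)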
array([False, True, True, True, True, True, False], dtype=bool)
So, if a column name does not start with foo, False is returned and the column is therefore not selected. Note that nas.foo contains foo but does not start with it, so it is excluded as well.
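The column mask can also be built with the vectorized string accessor instead of map. This is a sketch of an alternative, not what the answer above uses, and the column names are hypothetical ones modelled on the output shown:

```python
import pandas as pd

# Hypothetical column names modelled on the ones in the output above.
columns = pd.Index(["bar.baz", "foo.aa", "foo.fox", "nas.foo"])

# The approach from the answer: apply str.startswith to each name via map.
via_map = columns.map(lambda x: x.startswith("foo"))

# Alternative: the .str accessor applies startswith to every name at once.
via_str = columns.str.startswith("foo")

print(list(via_map))
print(list(via_str))
```

Both give the same Boolean values, so either can be passed as the column indexer to df.loc.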
If you just want to return all rows that contain a 1, with all of their columns, as your desired output suggests, you can simply do
df.loc[(df == 1).any(axis=1)]
which returns
   bar.baz  foo.aa  foo.bars  foo.fighters  foo.fox  foo.manchu  nas.foo
0      5.0     1.0         0             0        2          NA       NA
1      5.0     2.1         0             1        4           0        0
2      6.0     NaN         0           NaN        1           0        1
5      6.8     6.8         1             0        5           0        0