How do I retrieve the number of columns in a Pandas data frame?

后端未结

关注

 9  2272

心在旅途 2021-01-29 18:06

How do you programmatically retrieve the number of columns in a pandas dataframe? I was hoping for something like:

df.num_columns

9条回答

渐次进展 (楼主)

2021-01-29 18:54

In order to include the number of row index "columns" in your total shape I would personally add together the number of columns df.columns.size with the attribute pd.Index.nlevels/pd.MultiIndex.nlevels:

Set up dummy data

import pandas as pd

flat_index = pd.Index([0, 1, 2])
multi_index = pd.MultiIndex.from_tuples([("a", 1), ("a", 2), ("b", 1), names=["letter", "id"])

columns = ["cat", "dog", "fish"]

data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat_df = pd.DataFrame(data, index=flat_index, columns=columns)
multi_df = pd.DataFrame(data, index=multi_index, columns=columns)

# Show data
# -----------------
# 3 columns, 4 including the index
print(flat_df)
    cat  dog  fish
id                
0     1    2     3
1     4    5     6
2     7    8     9

# -----------------
# 3 columns, 5 including the index
print(multi_df)
           cat  dog  fish
letter id                
a      1     1    2     3
       2     4    5     6
b      1     7    8     9

Writing our process as a function:

def total_ncols(df, include_index=False):
    ncols = df.columns.size
    if include_index is True:
        ncols += df.index.nlevels
    return ncols

print("Ignore the index:")
print(total_ncols(flat_df), total_ncols(multi_df))

print("Include the index:")
print(total_ncols(flat_df, include_index=True), total_ncols(multi_df, include_index=True))

This prints:

Ignore the index:
3 3

Include the index:
4 5

If you want to only include the number of indices if the index is a pd.MultiIndex, then you can throw in an isinstance check in the defined function.

As an alternative, you could use df.reset_index().columns.size to achieve the same result, but this won't be as performant since we're temporarily inserting new columns into the index and making a new index before getting the number of columns.

0 讨论(0)

查看其它9个回答