问题
The following is example of data I have in excel sheet.
A B C
1 2 3
4 5 6
I am trying to get the columns name using the following code:
p1 = list(df1t.columns.values)
the output is like this
[A, B, C, 'Unnamed: 3', 'unnamed 4', 'unnamed 5', .....]
I checked the excel sheet, there is only three columns named A, B, and C. Other columns are blank. Any suggestion?
回答1:
There is problem some cells are not empty but contains some whitespaces.
If need columns names with filtering Unnamed:
cols = [col for col in df if not col.startswith('Unnamed:')]
print (cols)
['A', 'B', 'C']
Sample with file:
df = pd.read_excel('https://dl.dropboxusercontent.com/u/84444599/file_unnamed_cols.xlsx')
print (df)
A B C Unnamed: 3 Unnamed: 4 Unnamed: 5 Unnamed: 6 Unnamed: 7
0 4.0 6.0 8.0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN
cols = [col for col in df if not col.startswith('Unnamed:')]
print (cols)
['A', 'B', 'C']
Another solution:
cols = df.columns[~df.columns.str.startswith('Unnamed:')]
print (cols)
Index(['A', 'B', 'C'], dtype='object')
And for return all columns by cols use:
print (df[cols])
A B C
0 4.0 6.0 8.0
1 NaN NaN NaN
2 NaN NaN NaN
3 NaN NaN NaN
And if necessary remove all NaNs rows:
print (df[cols].dropna(how='all'))
A B C
0 4.0 6.0 8.0
来源:https://stackoverflow.com/questions/44734613/pandas-returning-the-unnamed-columns