Is there any way to remove column and row numbers from DataFrame.from_dict?

Question

I have a problem with a dataframe I build from a dictionary: pandas labels my rows and columns with numbers. Here's my code:

a = dict()
dfList = [x for x in df['Marka'].tolist() if str(x) != 'nan']
dfSet = set(dfList)
dfList123 = list(dfSet)
for i in range(len(dfList123)):
    number = dfList.count(dfList123[i])
    a[dfList123[i]]=number
sorted_by_value = sorted(a.items(), key=lambda kv: kv[1], reverse=True)
dataframe=pd.DataFrame.from_dict(sorted_by_value)
print(dataframe)

I've tried to rename the columns like this: dataframe=pd.DataFrame.from_dict(sorted_by_value, orient='index', columns=['A', 'B', 'C']), but it gives me an error:

AttributeError: 'list' object has no attribute 'values'

Is there any way to fix it?

Edit: Here's the first part of my data frame:

                     0     1
0                   VW  1383
1                 AUDI  1053
2                VOLVO   789
3                  BMW   749
4                 OPEL   621
5        MERCEDES BENZ   593
...

The first row and the first column (the numeric labels) are exactly what I need to remove or rename.

Answer 1:

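By sorting the dict_items object (a.items()), you have created a list of (key, value) tuples. You can check this with type(sorted_by_value). As your printed output shows, the plain pd.DataFrame.from_dict(sorted_by_value) call still produces a dataframe, just with default numeric labels. The call with orient='index' is the one that fails: in that mode from_dict expects a dictionary, which has a values() method, but it receives a list instead.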

Probably the smallest fix you can make to the code is to replace the line:

dataframe=pd.DataFrame.from_dict(sorted_by_value)

with:

dataframe = pd.DataFrame(dict(sorted_by_value), index=[0])

(The index=[0] argument is required here because pd.DataFrame expects a dictionary in the form {'key1': [list1, of, values], 'key2': [list2, of, values]}, whereas dict(sorted_by_value) has the form {'key1': value1, 'key2': value2}, i.e. scalar values rather than lists.)
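To make the index=[0] point concrete, here is a minimal sketch. The three (brand, count) pairs are a hypothetical stand-in for the real sorted_by_value, taken from the output shown in the question. Note that this produces a single row with one column per brand, which may not be the layout you are after.

```python
import pandas as pd

# Hypothetical sample standing in for the asker's sorted (brand, count) pairs.
sorted_by_value = [("VW", 1383), ("AUDI", 1053), ("VOLVO", 789)]

# dict() gives {'VW': 1383, ...}: scalar values, not lists of values.
counts = dict(sorted_by_value)

# With only scalar values, pandas cannot infer row labels and raises ValueError.
try:
    pd.DataFrame(counts)
except ValueError as exc:
    print("without index:", exc)

# Supplying a one-element index yields a single-row dataframe.
wide = pd.DataFrame(counts, index=[0])
print(wide)
```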

Another option is to use pd.DataFrame(sorted_by_value) to generate a dataframe directly from the sorted items, although you may need to tweak sorted_by_value or the result to get the desired dataframe format.
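A rough sketch of that second option, again with hypothetical sample pairs. The column name "Count" is my own choice for illustration; "Marka" is borrowed from the column used in the question.

```python
import pandas as pd

# Hypothetical sample standing in for the asker's sorted (brand, count) pairs.
sorted_by_value = [("VW", 1383), ("AUDI", 1053), ("VOLVO", 789)]

# Each tuple becomes one row; columns= replaces the default 0/1 headers.
dataframe = pd.DataFrame(sorted_by_value, columns=["Marka", "Count"])

# The row labels still exist on the object, but can be hidden when printing.
print(dataframe.to_string(index=False))
```

This keeps the one-row-per-brand layout from the question and only changes the column headers.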

Alternatively, look at collections.OrderedDict (see its documentation) to avoid sorting to a list and then converting back to a dictionary.
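One possible reading of the OrderedDict suggestion is sketched below; the sample counts and the "Count" column name are assumptions for illustration. Because the result is still a dictionary, from_dict with orient='index' accepts it, and the brands become the row labels. Since Python 3.7 a plain dict also preserves insertion order, so dict(...) would behave the same way here.

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical unsorted counts, standing in for the dictionary `a` in the question.
a = {"AUDI": 1053, "VW": 1383, "VOLVO": 789}

# Keep the sorted order while staying a dictionary.
ordered = OrderedDict(sorted(a.items(), key=lambda kv: kv[1], reverse=True))

# Keys become the index; the single value column gets an explicit name.
dataframe = pd.DataFrame.from_dict(ordered, orient="index", columns=["Count"])
print(dataframe)
```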

Edit

Regarding naming of the columns and the index: without seeing the data and the desired result, it's difficult to give specific advice. The options above will remove the error and allow you to create a dataframe, the columns of which can then be renamed using dataframe.columns = [list, of, column, headings]. For the index, look at pd.DataFrame.set_index() (which takes the column to use as the index, plus drop=True to remove that column from the data) and pd.DataFrame.reset_index() in the pandas docs.
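As a sketch of how those pieces fit together, assuming the two-column layout shown in the question's edit (the sample rows and the "Count" heading are illustrative assumptions):

```python
import pandas as pd

# Hypothetical rows mirroring the first lines of the asker's printed dataframe.
dataframe = pd.DataFrame([("VW", 1383), ("AUDI", 1053), ("VOLVO", 789)])

# Replace the default 0/1 column labels.
dataframe.columns = ["Marka", "Count"]

# Use the brand column as the row labels instead of 0, 1, 2, ...
indexed = dataframe.set_index("Marka", drop=True)
print(indexed)

# reset_index() moves the labels back into a column and restores 0, 1, 2, ...
restored = indexed.reset_index()
print(restored)
```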

Answer 2:

`index` and `columns` are properties of your dataframe

As long as len(df.index) > 0 and len(df.columns) > 0, i.e. your dataframe has nonzero rows and nonzero columns, you cannot get rid of the labels from your pd.DataFrame object. Whether the dataframe is constructed from a dictionary, or otherwise, is irrelevant.

What you can do is remove them from a representation of your dataframe, with output either as a Python str object or a CSV file. Here's a minimal example:

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

print(df)
#    0  1  2
# 0  1  2  3
# 1  4  5  6

# output to string without index or headers
print(df.to_string(index=False, header=False))
# 1  2  3
# 4  5  6

# output to csv without index or headers
df.to_csv('file.csv', index=False, header=False)
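If you want to check what the CSV would contain without writing a file, one option is to call to_csv without a path, in which case it returns the text as a string. A small sketch using the same toy dataframe:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False, header=False)
print(csv_text)
```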

Source: https://stackoverflow.com/questions/54065097/is-there-any-way-to-remove-column-and-rows-numbers-from-dataframe-from-dict

Tags

python