Question
I have a problem with a dataframe that I build from a dictionary: pandas labels my rows and columns with numbers. Here's my code:
a = dict()
dfList = [x for x in df['Marka'].tolist() if str(x) != 'nan']
dfSet = set(dfList)
dfList123 = list(dfSet)
for i in range(len(dfList123)):
    number = dfList.count(dfList123[i])
    a[dfList123[i]] = number
sorted_by_value = sorted(a.items(), key=lambda kv: kv[1], reverse=True)
dataframe=pd.DataFrame.from_dict(sorted_by_value)
print(dataframe)
I've tried to rename columns like this:
dataframe=pd.DataFrame.from_dict(sorted_by_value, orient='index', columns=['A', 'B', 'C'])
but it gives me an error:
AttributeError: 'list' object has no attribute 'values'
Is there any way to fix it?
Edit: Here's the first part of my data frame:
0 1
0 VW 1383
1 AUDI 1053
2 VOLVO 789
3 BMW 749
4 OPEL 621
5 MERCEDES BENZ 593
...
The numeric labels in the first row and first column are exactly what I need to remove or rename.
Answer 1:
By sorting the dict_items object (a.items()), you have created a list. You can check this with type(sorted_by_value). Then, when you try to use the pd.DataFrame.from_dict() method, it fails because it is expecting a dictionary, which has 'values', but instead receives a list.
Probably the smallest fix you can make to the code is to replace the line:
dataframe=pd.DataFrame.from_dict(sorted_by_value)
with:
dataframe = pd.DataFrame(dict(sorted_by_value), index=[0])
(The index=[0] argument is required here because pd.DataFrame expects a dictionary to be in the form {'key1': [list1, of, values], 'key2': [list2, of, values]} but instead sorted_by_value is converted to the form {'key1': value1, 'key2': value2}.)
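To illustrate what the index=[0] version produces, here is a minimal sketch. The dictionary a holds hypothetical counts standing in for the asker's data. Note that the result is a single row with one column per key, which may not be the layout you want.

```python
import pandas as pd

# Hypothetical counts standing in for the dictionary `a` from the question.
a = {"AUDI": 1053, "VW": 1383, "VOLVO": 789}
sorted_by_value = sorted(a.items(), key=lambda kv: kv[1], reverse=True)

# dict() keeps the sorted order (Python 3.7+); index=[0] labels the single row.
dataframe = pd.DataFrame(dict(sorted_by_value), index=[0])
print(dataframe)
```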
Another option is to use pd.DataFrame(sorted_by_value) to generate a dataframe directly from the sorted items, although you may need to tweak sorted_by_value or the result to get the desired dataframe format.
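As a rough sketch of that option, the list of (key, value) tuples can be passed straight to the constructor together with column names. The sample tuples come from the output shown in the question; the column name "Count" is an assumption chosen for illustration.

```python
import pandas as pd

# Sample of the sorted (brand, count) pairs shown in the question.
sorted_by_value = [("VW", 1383), ("AUDI", 1053), ("VOLVO", 789)]

# Each tuple becomes one row; `columns` replaces the default 0/1 labels.
# "Count" is an assumed column name, not one taken from the original post.
dataframe = pd.DataFrame(sorted_by_value, columns=["Marka", "Count"])
print(dataframe)
```

The rows still carry a default integer index (0, 1, 2, ...), since a dataframe always has row labels.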
Alternatively, look at collections.OrderedDict (described in the Python standard library documentation) to avoid sorting to a list and then converting back to a dictionary.
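One possible way to use OrderedDict here, sketched under the assumption that the brands should become the row labels: wrap the sorted items in an OrderedDict and pass it to from_dict with orient='index'. The counts in a are hypothetical and the column name "Count" is an assumption.

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical counts standing in for the dictionary `a` from the question.
a = {"AUDI": 1053, "VW": 1383, "VOLVO": 789}

# The OrderedDict remembers the order in which the sorted pairs were inserted.
ordered = OrderedDict(sorted(a.items(), key=lambda kv: kv[1], reverse=True))

# orient="index" turns the keys into row labels; "Count" is an assumed name.
dataframe = pd.DataFrame.from_dict(ordered, orient="index", columns=["Count"])
print(dataframe)
```

Because from_dict receives a real mapping here, the call that raised the AttributeError in the question has the input type it expects.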
Edit
Regarding naming of columns and the index, without seeing the data/desired result it's difficult to give specific advice. The options above will remove the error and allow you to create a dataframe, the columns of which can then be renamed using dataframe.columns = [list, of, column, headings]. For the index, look at pd.DataFrame.set_index(drop=True) and pd.DataFrame.reset_index() in the pandas documentation.
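A short sketch of renaming after construction and then moving a column into the index might look like this. The data mirrors the output posted in the question; the names "Marka" and "Count" are assumptions for illustration.

```python
import pandas as pd

# Dataframe with default 0/1 column labels, as in the question's output.
dataframe = pd.DataFrame([("VW", 1383), ("AUDI", 1053), ("VOLVO", 789)])

# Replace the numeric column labels with descriptive (assumed) names.
dataframe.columns = ["Marka", "Count"]

# Use the brand column as the row labels instead of 0, 1, 2, ...
dataframe = dataframe.set_index("Marka")
print(dataframe)

# reset_index() moves the brands back into a regular column.
restored = dataframe.reset_index()
```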
Answer 2:
index and columns are properties of your dataframe
As long as len(df.index) > 0 and len(df.columns) > 0, i.e. your dataframe has nonzero rows and nonzero columns, you cannot get rid of the labels from your pd.DataFrame object. Whether the dataframe is constructed from a dictionary, or otherwise, is irrelevant.
What you can do is remove them from a representation of your dataframe, with output either as a Python str object or a CSV file. Here's a minimal example:
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
print(df)
# 0 1 2
# 0 1 2 3
# 1 4 5 6
# output to string without index or headers
print(df.to_string(index=False, header=False))
# 1 2 3
# 4 5 6
# output to csv without index or headers
df.to_csv('file.csv', index=False, header=False)
Source: https://stackoverflow.com/questions/54065097/is-there-any-way-to-remove-column-and-rows-numbers-from-dataframe-from-dict