Question
I'm looking to turn a regular DataFrame into a nested DataFrame, and then convert that nested DataFrame into a dictionary.
After cleaning my dataset in Pandas, here is what it looks like in the DataFrame:
Input: df.head(5)
Output:
reviewerName title reviewerRatings
0 Charles Harry Potter Book Seven News:... 3.0
1 Katherine Harry Potter Boxed Set, Books... 5.0
2 Lora Harry Potter and the Sorcerer... 5.0
3 Cait Harry Potter and the Half-Blo... 5.0
4 Diane Harry Potter and the Order of... 5.0
Next, I checked to see the number of unique reviewerNames in my dataset:
Input: len(df['reviewerName'].unique())
Output: 66130
Now I'm trying to find a way to use all 66130 unique reviewerName values as the keys of a new nested DataFrame, with "title" and "reviewerRatings" forming a second layer of key:value pairs under each reviewer.
When I checked how many rows the first unique value has, I got this:
Input: df[df['reviewerName'] == 'Charles G']
Output:
reviewerName title reviewerRatings
0 Charles Harry Potter Book Seven News:... 3.0
19156 Charles Harry Potter and the Half-Blo... 3.5
19156 Charles Harry Potter and the Order of... 4.0
I'm hoping to manipulate the DataFrame so it can look somewhat like this as an output:
title reviewerRatings
Charles Harry Potter Book Seven News:... 3.0
Harry Potter and the Half-Blo... 3.5
Harry Potter and the Order of... 4.0
Katherine Harry Potter Boxed Set, Books... 5.0
Harry Potter and the Half-Blo... 2.5
Harry Potter and the Order of... 5.0
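A layout like the one above resembles a DataFrame with a two-level (hierarchical) index. The following is only a minimal sketch of one way to get there, using set_index with reviewerName as the outer level and title as the inner level. The sample rows and shortened titles are hypothetical stand-ins for the real dataset, and the variable name nested is made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Katherine"],
    "title": ["Book A", "Book B", "Book C", "Book A"],
    "reviewerRatings": [3.0, 5.0, 3.5, 2.5],
})

# Move reviewerName and title into a two-level index, then sort so
# that all rows for one reviewer sit together
nested = df.set_index(["reviewerName", "title"]).sort_index()

# All reviews by one reviewer, indexed by title
charles_reviews = nested.loc["Charles"]

print(nested)
```

When printed, pandas shows each outer-level label only once per group, which gives the indented look of the desired output.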
I tried to separate the three columns (reviewerName, title, reviewerRatings) and then concatenate them back together, but had no luck, as shown below:
Input:
p1 = df[['reviewerName']]
p2 = df[['title']]
p3 = df[['reviewerRatings']]
concatenated = pd.concat([p1,p2,p3], keys=list[p1.unqiue])
concatenated
Output:
AttributeError Traceback (most recent call last)
<ipython-input-106-5a6be8c1a3ba> in <module>()
----> 1 concatenated = pd.concat([p1,p2,p3], keys=list[p1.unqiue])
2 concatenated
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
4370 if self._info_axis._can_hold_identifiers_and_holds_name(name):
4371 return self[name]
-> 4372 return object.__getattribute__(self, name)
4373
4374 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'unqiue'
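For context on the traceback, two things appear to be going on. The name is misspelled (unqiue instead of unique), and df[['reviewerName']] with double brackets returns a DataFrame, which has no unique method even when spelled correctly. Selecting with single brackets returns a Series, which does. The sketch below illustrates the difference on hypothetical sample data; the variable names are invented for the example.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset
df = pd.DataFrame({"reviewerName": ["Charles", "Katherine", "Charles"]})

as_frame = df[["reviewerName"]]   # double brackets: a one-column DataFrame
as_series = df["reviewerName"]    # single brackets: a Series

# A DataFrame has no `unique` method, but a Series does
frame_has_unique = hasattr(as_frame, "unique")
names = list(as_series.unique())

# Note the parentheses: list(...) is a call, list[...] is not
print(frame_has_unique, names)
```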
I also looked through the Pandas documentation without success, so I'm hoping someone here can help.
Once I have the desired output, I'd like to convert the nested DataFrame into a nested dictionary.
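As a rough sketch of what that nested dictionary could look like, one option is to group by reviewerName and build a title-to-rating mapping for each group. This assumes each reviewer rated a given title at most once, since a repeated title would overwrite the earlier rating. The sample data and the name nested_dict are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Katherine"],
    "title": ["Book A", "Book B", "Book C", "Book A"],
    "reviewerRatings": [3.0, 5.0, 3.5, 2.5],
})

# Outer key: reviewer name; inner dict: title -> rating.
# Assumes (reviewerName, title) pairs are unique.
nested_dict = {
    name: dict(zip(group["title"], group["reviewerRatings"]))
    for name, group in df.groupby("reviewerName")
}

print(nested_dict["Charles"])
```

Looking up nested_dict["Charles"]["Book C"] then returns that reviewer's rating for that title.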
Thanks!
Source: https://stackoverflow.com/questions/54209548/filter-all-unique-items-in-column1-as-a-key-along-with-column2-and-column3-as-k