Use all unique items in column1 as keys, with column2 and column3 as key:value pairs for the values, in a nested DataFrame built from a regular DataFrame

Question

I'm looking to turn a regular DataFrame into a nested DataFrame, then finally convert the nested DataFrame to a dictionary.

After cleansing my dataset in Pandas, here is what it looks like in the DataFrame:

Input: df.head(5)

Output:

    reviewerName    title                               reviewerRatings
0   Charles         Harry Potter Book Seven News:...    3.0
1   Katherine       Harry Potter Boxed Set, Books...    5.0
2   Lora            Harry Potter and the Sorcerer...    5.0
3   Cait            Harry Potter and the Half-Blo...    5.0
4   Diane           Harry Potter and the Order of...    5.0

Next, I checked to see the number of unique reviewerNames in my dataset:

Input: len(df['reviewerName'].unique())

Output: 66130
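For reference, here is a minimal sketch of the same count on a tiny made-up frame (the sample rows are hypothetical, not the real data). It shows that `Series.nunique()` should give the same number as `len(...unique())`, at least when the column has no missing values:

```python
import pandas as pd

# Hypothetical sample standing in for the real dataset.
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Lora"],
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "reviewerRatings": [3.0, 5.0, 3.5, 5.0],
})

# Series.unique() returns an array of the distinct values.
n_via_unique = len(df["reviewerName"].unique())

# Series.nunique() counts distinct values directly (NaN is excluded by default).
n_via_nunique = df["reviewerName"].nunique()

print(n_via_unique, n_via_nunique)
```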

Now I'm trying to find a way to use all 66130 unique reviewerName values as the keys of a new nested DataFrame, with "title" and "reviewerRatings" forming a second layer of key:value pairs under each key.

When I checked how many rows the first unique value has, I got this:

Input: df[df['reviewerName'] == 'Charles G']

Output:

      reviewerName                               title   reviewerRatings
0          Charles    Harry Potter Book Seven News:...               3.0
19156      Charles    Harry Potter and the Half-Blo...               3.5
19156      Charles    Harry Potter and the Order of...               4.0

I'm hoping to manipulate the DataFrame so it can look somewhat like this as an output:

           title                                reviewerRatings
Charles    Harry Potter Book Seven News:...     3.0
           Harry Potter and the Half-Blo...     3.5
           Harry Potter and the Order of...     4.0
Katherine  Harry Potter Boxed Set, Books...     5.0
           Harry Potter and the Half-Blo...     2.5
           Harry Potter and the Order of...     5.0

I tried to separate the three columns (reviewerName, title, reviewerRatings) and then concatenate them back together, but had no luck, as shown below:

Input:

p1 = df[['reviewerName']]
p2 = df[['title']]
p3 = df[['reviewerRatings']]
concatenated = pd.concat([p1,p2,p3], keys=list[p1.unqiue])
concatenated

Output:

AttributeError                            Traceback (most recent call last)
<ipython-input-106-5a6be8c1a3ba> in <module>()
----> 1 concatenated = pd.concat([p1,p2,p3], keys=list[p1.unqiue])
      2 concatenated

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373 
   4374     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'unqiue'
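As far as I can tell, a few separate things go wrong in that line: the method name is misspelled (`unqiue` instead of `unique`), `df[['reviewerName']]` with double brackets gives a DataFrame while `unique()` is a Series method, and `list[...]` subscripts `list` instead of calling it. A minimal sketch of the difference, on hypothetical sample rows:

```python
import pandas as pd

# Hypothetical sample rows, not the real data.
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles"],
    "title": ["Book A", "Book B", "Book C"],
    "reviewerRatings": [3.0, 5.0, 3.5],
})

p1 = df[["reviewerName"]]  # double brackets: a one-column DataFrame
s1 = df["reviewerName"]    # single brackets: a Series

# unique() is called on the Series, and list(...) uses parentheses.
names = list(s1.unique())

print(type(p1).__name__, type(s1).__name__)
print(names)
```

Even with those fixes, `keys=` in `pd.concat` labels each object being concatenated (three here), not each row, so it probably would not produce one group per reviewer.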

I also looked through the Pandas documentation with no luck, so I'm hoping someone here can help.

Once I have the desired output, I'd like to convert the nested DataFrame into a nested dictionary.
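For the dictionary step, one possible shape is `{reviewerName: {title: rating}}`. The sketch below builds it with `groupby` on hypothetical sample rows; the target shape is my assumption about what "nested dictionary" means here:

```python
import pandas as pd

# Hypothetical sample rows; titles are shortened placeholders.
df = pd.DataFrame({
    "reviewerName": ["Charles", "Katherine", "Charles", "Charles"],
    "title": ["Book Seven News", "Boxed Set", "Half-Blood Prince",
              "Order of the Phoenix"],
    "reviewerRatings": [3.0, 5.0, 3.5, 4.0],
})

# One outer key per reviewer; each inner dict maps title -> rating.
nested_dict = {
    name: dict(zip(group["title"], group["reviewerRatings"]))
    for name, group in df.groupby("reviewerName")
}

print(nested_dict["Charles"])
```

If a reviewer rated the same title twice, the later rating would overwrite the earlier one in the inner dict, so duplicates would need to be handled first.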

Thanks!

Source: https://stackoverflow.com/questions/54209548/filter-all-unique-items-in-column1-as-a-key-along-with-column2-and-column3-as-k

Tags

python

pandas

dictionary

dataframe

nested