Pandas pivot table ValueError: Index contains duplicate entries, cannot reshape

死守一世寂寞 2021-01-03 03:37

I have a dataframe as shown below (top 3 rows):

Sample_Name Sample_ID   Sample_Type IS  Component_Name  IS_Name Component_Group_Name    Outlier_Reasons Actua         


        
2 Answers
  •  悲&欢浪女
    2021-01-03 04:24

    You can use groupby() and unstack() to get around the error you're seeing with pivot().

    Here's some example data, with a few edge cases added and some column values removed or substituted to keep it a minimal, complete, verifiable example (MCVE):

    # df
          Sample_Name  Sample_ID     IS Component_Name Calculated_Concentration Outlier_Reasons
    Index
    1             foo        NaN   True              x                      NaN             NaN
    1             foo        NaN   True              y                      NaN             NaN
    2             foo        NaN  False              z                125.92766             NaN
    2             bar        NaN  False              x                     1.00             NaN
    2             bar        NaN  False              y                     2.00             NaN
    2             bar        NaN  False              z                      NaN             NaN
    
    (df.groupby(['Sample_Name','Component_Name'])
       .Calculated_Concentration
       .first()
       .unstack()
    )
    

    Output:

    Component_Name    x   y          z
    Sample_Name                       
    bar             1.0 2.0        NaN
    foo             NaN NaN  125.92766
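
    For anyone who wants to reproduce this end to end, here is a rough, self-contained sketch. The data is made up for illustration and is not the asker's actual dataframe. In particular, the extra bar/x row is an assumption added so that one (Sample_Name, Component_Name) pair occurs twice, which is the situation that makes pivot() raise. The names keys, dupes, pivot_failed and wide are just illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row repeats the (bar, x) pair on purpose.
df = pd.DataFrame(
    {
        "Sample_Name": ["foo", "foo", "foo", "bar", "bar", "bar", "bar"],
        "Component_Name": ["x", "y", "z", "x", "y", "z", "x"],
        "Calculated_Concentration": [np.nan, np.nan, 125.92766, 1.0, 2.0, np.nan, 3.0],
    }
)

keys = ["Sample_Name", "Component_Name"]

# Rows whose (index, columns) pair occurs more than once; pivot() has
# no rule for choosing between them, so it refuses to reshape.
dupes = df[df.duplicated(subset=keys, keep=False)]
print(dupes)

try:
    df.pivot(
        index="Sample_Name",
        columns="Component_Name",
        values="Calculated_Concentration",
    )
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print(err)

# groupby() collapses each duplicated pair to a single value first
# (here: the first non-null one), so unstack() has a unique index.
wide = df.groupby(keys)["Calculated_Concentration"].first().unstack()
print(wide)
```

    Note that first() is only one possible choice: it silently keeps the first non-null value per pair and drops the rest. If the duplicates carry different measurements, an aggregation such as mean() or max() may be more appropriate, depending on what the data represents.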
    
