Pandas combining sparse columns in dataframe

陌清茗 2021-01-18 03:44

I am using Python and pandas for data analysis. My data is sparsely distributed across different columns, like the following:

| id | col1a | col1b | col2a | col2b | col3a         


        
5 Answers
  •  自闭症患者
    2021-01-18 04:27

    Thanks to @CeliusStingher for providing the code for the dataframe.

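    One suggestion is to set id as the index, then rebuild the column names from their parts: split each name on its digit, use the digit as a group label, and join the remaining text back together. Assign the result as a MultiIndex on the columns, then stack to get the final result: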

    import pandas as pd

    # set id as the index
    df = df.set_index("id")

    # split each column name on its digit, e.g. "col1a" -> ["col", "1", "a"],
    # then build (name, group) pairs: ("cola", "g1"), ("colb", "g1"), ...
    # and assign them back to the columns as a MultiIndex
    df.columns = pd.MultiIndex.from_tuples(
        [("".join((first, last)), f"g{second}")
         for first, second, last in df.columns.str.split(r"(\d)")],
        names=[None, "group"],
    )

    # stack the "group" level into the index to get the result
    df.stack()
    
    
               cola  colb
    id group
    1  g1      11.0  12.0
    2  g2      21.0  86.0
    3  g1      22.0  87.0
    4  g3     545.0  32.0
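
For comparison, here is a self-contained sketch of a different route to the same shape, using melt, str.extract and pivot instead of a MultiIndex and stack. The sample frame is an assumption: the original dataframe code is not shown here, so the values are reconstructed from the output above, and the col3b column and the names tidy, parts, group and field are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the original post): each id has
# values in only one numbered column pair; every other cell is NaN.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "col1a": [11, np.nan, 22, np.nan],
        "col1b": [12, np.nan, 87, np.nan],
        "col2a": [np.nan, 21, np.nan, np.nan],
        "col2b": [np.nan, 86, np.nan, np.nan],
        "col3a": [np.nan, np.nan, np.nan, 545],
        "col3b": [np.nan, np.nan, np.nan, 32],
    }
)

# Reshape to long form: one row per (id, column), dropping the empty cells.
tidy = df.melt(id_vars="id", var_name="column").dropna(subset=["value"])

# Split names such as "col1a" into a group number ("1") and a suffix ("a").
parts = tidy["column"].str.extract(r"^col(?P<num>\d+)(?P<suffix>[a-z]+)$")
tidy = tidy.assign(group="g" + parts["num"], field="col" + parts["suffix"])

# Pivot the suffixes back into columns, giving one row per (id, group).
result = tidy.pivot(
    index=["id", "group"], columns="field", values="value"
).rename_axis(columns=None)

print(result)
```

Dropping the NaN cells right after melt keeps only the populated (id, group) pairs, so no all-empty rows need to be filtered out afterwards. Passing a list to index in pivot needs a reasonably recent pandas release.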
    
