Python Pandas: How to split a sorted dictionary in a column of a dataframe

[愿得一人] 2021-01-15 01:45

I have a DataFrame like this:

id  asn      orgs
0   3320    {'Deutsche Telekom AG': 2288}
1   47886   {'Joyent': 16, 'Equinix (Netherlands) B.V.': 7}
2         


        
2 Answers
  •  一个人的身影
    2021-01-15 02:09

    This should work:

    In [1]: import pandas as pd  
    In [2]: import operator
    In [3]: df = pd.DataFrame({ 'id' : [0,1,2,3],
       ...:                      'asn' : [3320, 47886, 47601, 33438],
       ...:                      'orgs' : [{'Deutsche Telekom AG': 2288}, {'Joyent': 16, 'Equinix (Netherlands) B.V.': 7}, {'fusion services': 1024, 'GCE Global Maritime':16859}, {'Highwinds Network Group': 893}]
       ...:                    })
    
    In [4]: df.orgs, df['value'] = zip(*df.orgs.apply(lambda x : sorted(x.items(),key=operator.itemgetter(1),reverse=True)[0]))
    
    In [5]: df
    Out[5]:
         asn  id                     orgs  value
    0   3320   0      Deutsche Telekom AG   2288
    1  47886   1                   Joyent     16
    2  47601   2      GCE Global Maritime  16859
    3  33438   3  Highwinds Network Group    893
    

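    For each row, the lambda sorts the dictionary items by count in descending order and keeps the first (name, count) pair. zip(*...) then transposes those pairs into a tuple of names and a tuple of counts, which I assigned to df.orgs and df['value'].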
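
The transposition step can be seen in isolation with plain Python. This is only a minimal sketch: the `rows` list is illustrative data copied from the frame above, and the variable names `pairs`, `names` and `counts` are my own.

```python
import operator

# The dictionaries from the first three rows of the 'orgs' column above.
rows = [
    {'Deutsche Telekom AG': 2288},
    {'Joyent': 16, 'Equinix (Netherlands) B.V.': 7},
    {'fusion services': 1024, 'GCE Global Maritime': 16859},
]

# For each dict, sort the items by count (descending) and keep the first pair.
pairs = [
    sorted(d.items(), key=operator.itemgetter(1), reverse=True)[0]
    for d in rows
]

# zip(*pairs) turns a list of (name, count) pairs into two tuples:
# one holding all names, one holding all counts.
names, counts = zip(*pairs)
print(names)
print(counts)
```

Because `names` and `counts` each have one entry per row, they can be assigned directly to two DataFrame columns.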

    To handle empty dictionaries, fall back to a pair of empty strings so that indexing [0] does not fail:

    In [3]: df = pd.DataFrame({ 'id' : [0,1,2,3],
       ...:                      'asn' : [3320, 47886, 47601, 33438],
       ...:                      'orgs' : [{'Deutsche Telekom AG': 2288}, {'Joyent': 16, 'Equinix (Netherlands) B.V.': 7}, {'fusion services': 1024, 'GCE Global Maritime':16859}, {}]
       ...:                    })
    In [4]: df.orgs.apply(lambda x : sorted(x.items(),key=operator.itemgetter(1),reverse=True)[0] if len(x) else ('',''))
    Out[4]:
    0     (Deutsche Telekom AG, 2288)
    1                    (Joyent, 16)
    2    (GCE Global Maritime, 16859)
    3                            (, )
    Name: orgs, dtype: object
    
    In [5]: df.orgs, df['value'] = zip(*df.orgs.apply(lambda x : sorted(x.items(),key=operator.itemgetter(1),reverse=True)[0] if len(x) else ('','')))
    
    In [6]: df
    Out[6]:
         asn  id                 orgs  value
    0   3320   0  Deutsche Telekom AG   2288
    1  47886   1               Joyent     16
    2  47601   2  GCE Global Maritime  16859
    3  33438   3
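
The same empty-dictionary handling can also be written with a named helper instead of a long lambda. This is a sketch of an alternative, not the code from the answer above: `top_org` is a hypothetical helper name, and it uses `max` rather than a full sort, which should pick the same pair since only the largest count is needed.

```python
import operator

import pandas as pd

df = pd.DataFrame({
    'id': [0, 1, 2, 3],
    'asn': [3320, 47886, 47601, 33438],
    'orgs': [
        {'Deutsche Telekom AG': 2288},
        {'Joyent': 16, 'Equinix (Netherlands) B.V.': 7},
        {'fusion services': 1024, 'GCE Global Maritime': 16859},
        {},
    ],
})


def top_org(counts):
    """Hypothetical helper: return the (name, count) pair with the
    largest count, or ('', '') when the dictionary is empty."""
    if not counts:
        return ('', '')
    return max(counts.items(), key=operator.itemgetter(1))


# One (name, count) pair per row, then transpose into two columns.
pairs = df['orgs'].apply(top_org)
df['orgs'], df['value'] = zip(*pairs)
print(df)
```

The empty strings keep the column lengths aligned; if missing values are preferred, the fallback pair could be changed to `(None, None)`.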
    
