Python Pandas: How to split a sorted dictionary in a column of a dataframe

[愿得一人] 2021-01-15 01:45

I have a dataFrame like this:

id  asn     orgs
0   3320    {'Deutsche Telekom AG': 2288}
1   47886   {'Joyent': 16, 'Equinix (Netherlands) B.V.': 7}
2         


        
2 Answers
  •  [愿得一人]
    2021-01-15 01:51

    Another approach is to define a function that calls min on the dict (with key=x.get, so it returns the key with the smallest value) and returns a Series, so you can assign the result to multiple columns (function body taken from @Alex Martelli's answer):

    In [17]:
    
    import pandas as pd

    def func(x):
        # key whose value is the smallest in the dict
        k = min(x, key=x.get)
        return pd.Series([k, x[k]])

    df[['orgs', 'value']] = df['orgs'].apply(func)
    df
    
    Out[17]:
         asn  id                        orgs  value
    0   3320   0         Deutsche Telekom AG   2288
    1  47886   1  Equinix (Netherlands) B.V.      7
    2  47601   2             fusion services   1024
    3  33438   3     Highwinds Network Group    893
    

    EDIT

    If your data has empty dicts, you can just test the len:

    In [34]:
    
    df = pd.DataFrame({'id':[0,1,2,3,4],
                       'asn':[3320,47886,47601,33438,56],
                       'orgs':[{'Deutsche Telekom AG': 2288},
                               {'Joyent': 16, 'Equinix (Netherlands) B.V.': 7},
                               {'fusion services': 1024, 'GCE Global Maritime':16859},
                               {'Highwinds Network Group': 893},{}]})
    df
    
    Out[34]:
         asn  id                                               orgs
    0   3320   0                      {'Deutsche Telekom AG': 2288}
    1  47886   1    {'Equinix (Netherlands) B.V.': 7, 'Joyent': 16}
    2  47601   2  {'GCE Global Maritime': 16859, 'fusion service...
    3  33438   3                   {'Highwinds Network Group': 893}
    4     56   4                                                 {}
    
    In [36]:
    
    import numpy as np

    def func(x):
        # an empty dict has no minimum, so return NaN for both columns
        if len(x) > 0:
            k = min(x, key=x.get)
            return pd.Series([k, x[k]])
        return pd.Series([np.nan, np.nan])
    
    df[['orgs', 'value']] = df['orgs'].apply(func)
    df
    
    Out[36]:
         asn  id                        orgs  value
    0   3320   0         Deutsche Telekom AG   2288
    1  47886   1  Equinix (Netherlands) B.V.      7
    2  47601   2             fusion services   1024
    3  33438   3     Highwinds Network Group    893
    4     56   4                         NaN    NaN
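
For anyone who wants to run the steps above outside an interactive session, here is a minimal self-contained sketch of the same approach. The names `min_entry` and `result` are made up for this illustration and are not from the original answer. It uses `np.nan` because the `np.NaN` alias was removed in NumPy 2.0, and it writes to a copy so the original `orgs` column of dicts is left untouched.

```python
import numpy as np
import pandas as pd

# Sample data from the answer above; the last row holds an empty dict.
df = pd.DataFrame({
    'id': [0, 1, 2, 3, 4],
    'asn': [3320, 47886, 47601, 33438, 56],
    'orgs': [{'Deutsche Telekom AG': 2288},
             {'Joyent': 16, 'Equinix (Netherlands) B.V.': 7},
             {'fusion services': 1024, 'GCE Global Maritime': 16859},
             {'Highwinds Network Group': 893},
             {}],
})

def min_entry(d):
    """Hypothetical helper: key and value of the smallest entry, NaN if d is empty."""
    if d:
        k = min(d, key=d.get)  # key whose value is smallest
        return pd.Series([k, d[k]])
    return pd.Series([np.nan, np.nan])

# Work on a copy so df['orgs'] still contains the original dicts.
result = df.copy()
result[['orgs', 'value']] = df['orgs'].apply(min_entry)
print(result)
```

Swapping `min` for `max` in the helper would pick the entry with the largest value instead, if that is what you actually need.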
    
