Convert a Pandas DataFrame to a dictionary

悲哀的现实 2020-11-22 08:07

I have a DataFrame with four columns. I want to convert this DataFrame to a Python dictionary. I want the elements of the first column to be the keys and the elements of o

7 answers
  •  小蘑菇 (original poster)
    2020-11-22 08:34

    For my use (node names with xy positions), I found @user4179775's answer to be the most helpful / intuitive:

    import pandas as pd
    
    df = pd.read_csv('glycolysis_nodes_xy.tsv', sep='\t')
    
    df.head()
        nodes    x    y
    0  c00033  146  958
    1  c00031  601  195
    ...
    
    # node name -> [x, y] list
    xy_dict_list = {i: [a, b] for i, a, b in zip(df.nodes, df.x, df.y)}
    
    xy_dict_list
    {'c00022': [483, 868],
     'c00024': [146, 868],
     ... }
    
    # node name -> (x, y) tuple
    xy_dict_tuples = {i: (a, b) for i, a, b in zip(df.nodes, df.x, df.y)}
    
    xy_dict_tuples
    {'c00022': (483, 868),
     'c00024': (146, 868),
     ... }
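
    The zip-based construction above can be tried without the TSV file. What follows is a minimal sketch that assumes a small in-memory DataFrame holding only the two rows shown by df.head(); the remaining rows of the original file are not reproduced here.

```python
import pandas as pd

# Stand-in for glycolysis_nodes_xy.tsv (assumption: only the two rows
# visible in df.head() above are included)
df = pd.DataFrame({
    "nodes": ["c00033", "c00031"],
    "x": [146, 601],
    "y": [958, 195],
})

# Walk the three columns in parallel and key each entry by node name
xy_dict_list = {node: [x, y] for node, x, y in zip(df["nodes"], df["x"], df["y"])}
xy_dict_tuples = {node: (x, y) for node, x, y in zip(df["nodes"], df["x"], df["y"])}

print(xy_dict_list)
print(xy_dict_tuples)
```

    Bracket access (df["nodes"]) is used here instead of attribute access (df.nodes) because it also works for column names that are not valid Python identifiers.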
    

    Addendum

    I later returned to this issue, for other, but related, work. Here is an approach that more closely mirrors the [excellent] accepted answer.

    node_df = pd.read_csv('node_prop-glycolysis_tca-from_pg.tsv', sep='\t')
    
    node_df.head()
       node  kegg_id kegg_cid            name  wt  vis
    0  22    22       c00022   pyruvate        1   1
    1  24    24       c00024   acetyl-CoA      1   1
    ...
    

    Convert Pandas dataframe to a [list], {dict}, {dict of {dict}}, ...

    Per accepted answer:

    node_df.set_index('kegg_cid').T.to_dict('list')
    
    {'c00022': [22, 22, 'pyruvate', 1, 1],
     'c00024': [24, 24, 'acetyl-CoA', 1, 1],
     ... }
    
    node_df.set_index('kegg_cid').T.to_dict('dict')
    
    {'c00022': {'kegg_id': 22, 'name': 'pyruvate', 'node': 22, 'vis': 1, 'wt': 1},
     'c00024': {'kegg_id': 24, 'name': 'acetyl-CoA', 'node': 24, 'vis': 1, 'wt': 1},
     ... }
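
    As a possible alternative to transposing, DataFrame.to_dict also accepts orient='index', which keys the outer dictionary by the index labels directly. The sketch below is not the accepted answer's method, only an equivalent route; it assumes a two-row stand-in for node_df built from the rows shown by node_df.head().

```python
import pandas as pd

# Stand-in for the TSV file (assumption: only the two rows shown above)
node_df = pd.DataFrame({
    "node": [22, 24],
    "kegg_id": [22, 24],
    "kegg_cid": ["c00022", "c00024"],
    "name": ["pyruvate", "acetyl-CoA"],
    "wt": [1, 1],
    "vis": [1, 1],
})

indexed = node_df.set_index("kegg_cid")

# Transpose-based form, as in the accepted answer
via_transpose = indexed.T.to_dict("dict")

# Same nesting without the transpose: outer keys are the index labels
via_index = indexed.to_dict(orient="index")

# Dict of lists: pair each index label with its row values
as_lists = dict(zip(indexed.index, indexed.values.tolist()))

print(via_index)
print(as_lists)
```

    Skipping the transpose avoids building an intermediate frame whose columns all hold mixed types.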
    

    In my case, I wanted to do the same thing but with selected columns from the Pandas dataframe, so I needed to slice the columns. There are two approaches.

    1. Directly:

    (see: Convert pandas to dictionary defining the columns used for the key values)

    node_df.set_index('kegg_cid')[['name', 'wt', 'vis']].T.to_dict('dict')
    
    {'c00022': {'name': 'pyruvate', 'vis': 1, 'wt': 1},
     'c00024': {'name': 'acetyl-CoA', 'vis': 1, 'wt': 1},
     ... }
    
    2. "Indirectly": first, slice the desired columns/data from the Pandas dataframe (again, two approaches),

    node_df_sliced = node_df[['kegg_cid', 'name', 'wt', 'vis']]
    

    or

    node_df_sliced2 = node_df.loc[:, ['kegg_cid', 'name', 'wt', 'vis']]
    

    which can then be used to create a dictionary of dictionaries:

    node_df_sliced.set_index('kegg_cid').T.to_dict('dict')
    
    {'c00022': {'name': 'pyruvate', 'vis': 1, 'wt': 1},
     'c00024': {'name': 'acetyl-CoA', 'vis': 1, 'wt': 1},
     ... }
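
    The direct and indirect routes can be compared on a small stand-in for node_df, assumed here to hold only the two rows shown earlier. The sketch also checks that the key column is unique before converting, since a dictionary keeps only one entry per key; that check and the variable name wanted are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Stand-in for the TSV file (assumption: only the two rows shown above)
node_df = pd.DataFrame({
    "node": [22, 24],
    "kegg_id": [22, 24],
    "kegg_cid": ["c00022", "c00024"],
    "name": ["pyruvate", "acetyl-CoA"],
    "wt": [1, 1],
    "vis": [1, 1],
})

wanted = ["name", "wt", "vis"]  # hypothetical name for the selected columns

# Duplicate kegg_cid values would collide as dictionary keys, so check first
if not node_df["kegg_cid"].is_unique:
    raise ValueError("kegg_cid contains duplicates; cannot use it as a key")

# 1. Directly: set the index, then select columns
direct = node_df.set_index("kegg_cid")[wanted].T.to_dict("dict")

# 2. Indirectly: slice first (key column included), then set the index
node_df_sliced = node_df.loc[:, ["kegg_cid"] + wanted]
indirect = node_df_sliced.set_index("kegg_cid").T.to_dict("dict")

print(direct == indirect)
```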
    
