The orient parameter of pandas to_dict() explained

Submitted by 不羁岁月 on 2020-02-25 21:21:06

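The to_dict method in pandas converts a DataFrame into a dictionary (or, for one layout, a list of dictionaries).
The orient parameter selects the output layout. This article covers six values: 'dict' (the default), 'list', 'series', 'split', 'records', and 'index'. Each is described in turn below.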

0. Original data

import pandas as pd

data = pd.DataFrame([['3rd', 31.194181, 'UNKNOWN', 'UNKNOWN', 'male'],
                     ['1rd', 31.194181, 'Cherbourg', 'Paris, France', 'female'],
                     ['3rd', 31.194181, 'UNKNOWN', 'UNKNOWN', 'male'],
                     ['3rd', 32.000000, 'Southampton', 'Foresvik, Norway Portland, ND', 'male'],
                     ['3rd', 31.194181, 'UNKNOWN', 'UNKNOWN', 'male'],
                     ['2rd', 41.000000, 'Cherbourg', 'New York, NY', 'male'],
                     ['2rd', 48.000000, 'Southampton', 'Somerset / Bernardsville, NJ', 'female'],
                     ['3rd', 26.000000, 'Southampton', 'UNKNOWN', 'male'],
                     ['3rd', 19.000000, 'Southampton', 'England', 'male'],
                     ['2rd', 31.194181, 'Southampton', 'Petworth, Sussex', 'male']],
                    index=['1086', '12', '1036', '833', '1108', '562', '437', '663', '669', '507'],
                    columns=['pclass', 'age', 'embarked', 'home.dest', 'sex'])
print(data)

Output:

     pclass        age     embarked                      home.dest     sex
1086    3rd  31.194181      UNKNOWN                        UNKNOWN    male
12      1rd  31.194181    Cherbourg                  Paris, France  female
1036    3rd  31.194181      UNKNOWN                        UNKNOWN    male
833     3rd  32.000000  Southampton  Foresvik, Norway Portland, ND    male
1108    3rd  31.194181      UNKNOWN                        UNKNOWN    male
562     2rd  41.000000    Cherbourg                   New York, NY    male
437     2rd  48.000000  Southampton   Somerset / Bernardsville, NJ  female
663     3rd  26.000000  Southampton                        UNKNOWN    male
669     3rd  19.000000  Southampton                        England    male
507     2rd  31.194181  Southampton               Petworth, Sussex    male

1. orient='dict': a two-level nested dictionary (column names are the outer keys, index labels are the inner keys)

Resulting structure: {column -> {index -> value}}
data_dict = data.to_dict(orient='dict')
print(data_dict)

Output:

{
  'pclass': {
    '1086': '3rd',
    '12': '1rd',
    '1036': '3rd',
    '833': '3rd',
    '1108': '3rd',
    '562': '2rd',
    '437': '2rd',
    '663': '3rd',
    '669': '3rd',
    '507': '2rd'
  },
  'age': {
    '1086': 31.194181,
    '12': 31.194181,
    '1036': 31.194181,
    '833': 32.0,
    '1108': 31.194181,
    '562': 41.0,
    '437': 48.0,
    '663': 26.0,
    '669': 19.0,
    '507': 31.194181
  },
  'embarked': {
    '1086': 'UNKNOWN',
    '12': 'Cherbourg',
    '1036': 'UNKNOWN',
    '833': 'Southampton',
    '1108': 'UNKNOWN',
    '562': 'Cherbourg',
    '437': 'Southampton',
    '663': 'Southampton',
    '669': 'Southampton',
    '507': 'Southampton'
  },
  'home.dest': {
    '1086': 'UNKNOWN',
    '12': 'Paris, France',
    '1036': 'UNKNOWN',
    '833': 'Foresvik, Norway Portland, ND',
    '1108': 'UNKNOWN',
    '562': 'New York, NY',
    '437': 'Somerset / Bernardsville, NJ',
    '663': 'UNKNOWN',
    '669': 'England',
    '507': 'Petworth, Sussex'
  },
  'sex': {
    '1086': 'male',
    '12': 'female',
    '1036': 'male',
    '833': 'male',
    '1108': 'male',
    '562': 'male',
    '437': 'female',
    '663': 'male',
    '669': 'male',
    '507': 'male'
  }
}
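
To make the nesting concrete, here is a minimal sketch of reading one value out of the 'dict' layout and passing the result back to the DataFrame constructor. The two-row frame `df` is a hypothetical stand-in that reuses two of the column names above; it is not the full data set.

```python
import pandas as pd

# Hypothetical two-row stand-in for the data above
df = pd.DataFrame(
    {"pclass": ["3rd", "1rd"], "age": [31.194181, 32.0]},
    index=["1086", "12"],
)

nested = df.to_dict(orient="dict")

# Outer key is the column name, inner key is the index label
age_of_12 = nested["age"]["12"]
print(age_of_12)  # 32.0

# The nested dictionary can be handed back to the constructor
rebuilt = pd.DataFrame(nested)
print(rebuilt.loc["12", "pclass"])  # 1rd
```

Because the inner keys are index labels, this layout suits lookups of the form "column first, then row".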

2. orient='list': column names are still the outer keys, but each value is a plain list instead of an inner dictionary (the index is not included)

Resulting structure: {column -> [values]}
data_dict = data.to_dict(orient='list')
print(data_dict)

Output:

{
  'pclass': [
    '3rd',
    '1rd',
    '3rd',
    '3rd',
    '3rd',
    '2rd',
    '2rd',
    '3rd',
    '3rd',
    '2rd'
  ],
  'age': [
    31.194181,
    31.194181,
    31.194181,
    32.0,
    31.194181,
    41.0,
    48.0,
    26.0,
    19.0,
    31.194181
  ],
  'embarked': [
    'UNKNOWN',
    'Cherbourg',
    'UNKNOWN',
    'Southampton',
    'UNKNOWN',
    'Cherbourg',
    'Southampton',
    'Southampton',
    'Southampton',
    'Southampton'
  ],
  'home.dest': [
    'UNKNOWN',
    'Paris, France',
    'UNKNOWN',
    'Foresvik, Norway Portland, ND',
    'UNKNOWN',
    'New York, NY',
    'Somerset / Bernardsville, NJ',
    'UNKNOWN',
    'England',
    'Petworth, Sussex'
  ],
  'sex': [
    'male',
    'female',
    'male',
    'male',
    'male',
    'male',
    'female',
    'male',
    'male',
    'male'
  ]
}
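
Since the 'list' layout drops the index, a round trip loses the row labels. The sketch below shows that loss and one possible workaround (moving the index into an ordinary column with reset_index before converting). The small frame `df` is a hypothetical stand-in, not the data set above.

```python
import pandas as pd

# Hypothetical two-row stand-in for the data above
df = pd.DataFrame(
    {"pclass": ["3rd", "1rd"], "age": [31.194181, 32.0]},
    index=["1086", "12"],
)

columns_as_lists = df.to_dict(orient="list")
print(columns_as_lists)

# Rebuilding from the lists yields a default integer index, not '1086' and '12'
rebuilt = pd.DataFrame(columns_as_lists)
print(list(rebuilt.index))  # [0, 1]

# One way to keep the labels: turn the index into a regular column first.
# An unnamed index becomes a column called 'index'.
with_labels = df.reset_index().to_dict(orient="list")
print(with_labels["index"])  # ['1086', '12']
```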

3. orient='series'

Resulting structure: {column -> Series(values)}. Each value is a pandas Series, so it keeps the index, the dtype, and the column name.
data_dict = data.to_dict(orient='series')
print(data_dict)

Output:

{'pclass': 
	1086    3rd
	12      1rd
	1036    3rd
	833     3rd
	1108    3rd
	562     2rd
	437     2rd
	663     3rd
	669     3rd
	507     2rd
Name: pclass, dtype: object, 
'age': 
	1086    31.194181
	12      31.194181
	1036    31.194181
	833     32.000000
	1108    31.194181
	562     41.000000
	437     48.000000
	663     26.000000
	669     19.000000
	507     31.194181
Name: age, dtype: float64, 
'embarked': 
	1086        UNKNOWN
	12        Cherbourg
	1036        UNKNOWN
	833     Southampton
	1108        UNKNOWN
	562       Cherbourg
	437     Southampton
	663     Southampton
	669     Southampton
	507     Southampton
Name: embarked, dtype: object, 
'home.dest': 
	1086                          UNKNOWN
	12                      Paris, France
	1036                          UNKNOWN
	833     Foresvik, Norway Portland, ND
	1108                          UNKNOWN
	562                      New York, NY
	437      Somerset / Bernardsville, NJ
	663                           UNKNOWN
	669                           England
	507                  Petworth, Sussex
Name: home.dest, dtype: object, 
'sex': 
	1086      male
	12      female
	1036      male
	833       male
	1108      male
	562       male
	437     female
	663       male
	669       male
	507       male
Name: sex, dtype: object}
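
The practical difference from the 'dict' layout is that each value is still a Series, so label-based access and numeric methods remain available. A minimal sketch, using a hypothetical two-row frame `df` rather than the data set above:

```python
import pandas as pd

# Hypothetical two-row stand-in for the data above
df = pd.DataFrame(
    {"pclass": ["3rd", "1rd"], "age": [31.194181, 32.0]},
    index=["1086", "12"],
)

series_by_column = df.to_dict(orient="series")
age = series_by_column["age"]

print(type(age).__name__)  # Series
print(age.name)            # age
print(age.dtype)           # float64
print(age.loc["12"])       # label-based lookup still works
print(age.max())           # so do Series methods
```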

4. orient='split'

Resulting structure: {'index' -> [index], 'columns' -> [columns], 'data' -> [[values], [values], ...]}, where data holds one list per row. The result has three parts: index, columns, and data.
data_dict = data.to_dict(orient='split')
print(data_dict)

Output:

{
  'index': [
    '1086',
    '12',
    '1036',
    '833',
    '1108',
    '562',
    '437',
    '663',
    '669',
    '507'
  ],
  'columns': [
    'pclass',
    'age',
    'embarked',
    'home.dest',
    'sex'
  ],
  'data': [
    [
      '3rd',
      31.194181,
      'UNKNOWN',
      'UNKNOWN',
      'male'
    ],
    [
      '1rd',
      31.194181,
      'Cherbourg',
      'Paris, France',
      'female'
    ],
    [
      '3rd',
      31.194181,
      'UNKNOWN',
      'UNKNOWN',
      'male'
    ],
    [
      '3rd',
      32.0,
      'Southampton',
      'Foresvik, Norway Portland, ND',
      'male'
    ],
    [
      '3rd',
      31.194181,
      'UNKNOWN',
      'UNKNOWN',
      'male'
    ],
    [
      '2rd',
      41.0,
      'Cherbourg',
      'New York, NY',
      'male'
    ],
    [
      '2rd',
      48.0,
      'Southampton',
      'Somerset / Bernardsville, NJ',
      'female'
    ],
    [
      '3rd',
      26.0,
      'Southampton',
      'UNKNOWN',
      'male'
    ],
    [
      '3rd',
      19.0,
      'Southampton',
      'England',
      'male'
    ],
    [
      '2rd',
      31.194181,
      'Southampton',
      'Petworth, Sussex',
      'male'
    ]
  ]
}
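
The three keys of the 'split' layout happen to match the keyword arguments of the DataFrame constructor, so the frame can be rebuilt directly from them. A minimal sketch with a hypothetical two-row frame `df`:

```python
import pandas as pd

# Hypothetical two-row stand-in for the data above
df = pd.DataFrame(
    {"pclass": ["3rd", "1rd"], "age": [31.194181, 32.0]},
    index=["1086", "12"],
)

parts = df.to_dict(orient="split")
print(parts["columns"])  # ['pclass', 'age']
print(parts["data"][1])  # ['1rd', 32.0]

# Pass the three parts back explicitly...
rebuilt = pd.DataFrame(
    data=parts["data"], index=parts["index"], columns=parts["columns"]
)

# ...or unpack the dictionary, since its keys are 'index', 'columns', 'data'
rebuilt_unpacked = pd.DataFrame(**parts)

print(rebuilt.loc["12", "age"])
```

Labels are stored once rather than repeated for every value, which keeps this layout compact for wide or long frames.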

5. orient='records' (the index is not included)

Resulting structure: [{column -> value}, ..., {column -> value}]. This is a list with one dictionary per row; the index is not included.
data_dict = data.to_dict(orient='records')
print(data_dict)

Output:

[
  {
    'pclass': '3rd',
    'age': 31.194181,
    'embarked': 'UNKNOWN',
    'home.dest': 'UNKNOWN',
    'sex': 'male'
  },
  {
    'pclass': '1rd',
    'age': 31.194181,
    'embarked': 'Cherbourg',
    'home.dest': 'Paris, France',
    'sex': 'female'
  },
  {
    'pclass': '3rd',
    'age': 31.194181,
    'embarked': 'UNKNOWN',
    'home.dest': 'UNKNOWN',
    'sex': 'male'
  },
  {
    'pclass': '3rd',
    'age': 32.0,
    'embarked': 'Southampton',
    'home.dest': 'Foresvik, Norway Portland, ND',
    'sex': 'male'
  },
  {
    'pclass': '3rd',
    'age': 31.194181,
    'embarked': 'UNKNOWN',
    'home.dest': 'UNKNOWN',
    'sex': 'male'
  },
  {
    'pclass': '2rd',
    'age': 41.0,
    'embarked': 'Cherbourg',
    'home.dest': 'New York, NY',
    'sex': 'male'
  },
  {
    'pclass': '2rd',
    'age': 48.0,
    'embarked': 'Southampton',
    'home.dest': 'Somerset / Bernardsville, NJ',
    'sex': 'female'
  },
  {
    'pclass': '3rd',
    'age': 26.0,
    'embarked': 'Southampton',
    'home.dest': 'UNKNOWN',
    'sex': 'male'
  },
  {
    'pclass': '3rd',
    'age': 19.0,
    'embarked': 'Southampton',
    'home.dest': 'England',
    'sex': 'male'
  },
  {
    'pclass': '2rd',
    'age': 31.194181,
    'embarked': 'Southampton',
    'home.dest': 'Petworth, Sussex',
    'sex': 'male'
  }
]
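
A list of per-row dictionaries is convenient for iterating row by row or for serializing to JSON. The sketch below assumes string and float columns only and uses a hypothetical two-row frame `df`; other dtypes may need extra handling before json.dumps accepts them.

```python
import json

import pandas as pd

# Hypothetical two-row stand-in for the data above
df = pd.DataFrame(
    {"pclass": ["3rd", "1rd"], "age": [31.194181, 32.0]},
    index=["1086", "12"],
)

records = df.to_dict(orient="records")

# Each element is one row, keyed by column name
for row in records:
    print(row["pclass"], row["age"])

# The list serializes to a JSON array of objects
payload = json.dumps(records)
print(payload)

# The same list can be fed back to the constructor (with a default index)
rebuilt = pd.DataFrame(records)
print(len(rebuilt))  # 2
```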

6. orient='index'

Resulting structure: {index -> {column -> value}}. This is again one dictionary per row, but each row is keyed by its index label.
data_dict = data.to_dict(orient='index')
print(data_dict)

Output:

{
  '1086': {
    'pclass': '3rd',
    'age': 31.194181,
    'embarked': 'UNKNOWN',
    'home.dest': 'UNKNOWN',
    'sex': 'male'
  },
  '12': {
    'pclass': '1rd',
    'age': 31.194181,
    'embarked': 'Cherbourg',
    'home.dest': 'Paris, France',
    'sex': 'female'
  },
  '1036': {
    'pclass': '3rd',
    'age': 31.194181,
    'embarked': 'UNKNOWN',
    'home.dest': 'UNKNOWN',
    'sex': 'male'
  },
  '833': {
    'pclass': '3rd',
    'age': 32.0,
    'embarked': 'Southampton',
    'home.dest': 'Foresvik, Norway Portland, ND',
    'sex': 'male'
  },
  '1108': {
    'pclass': '3rd',
    'age': 31.194181,
    'embarked': 'UNKNOWN',
    'home.dest': 'UNKNOWN',
    'sex': 'male'
  },
  '562': {
    'pclass': '2rd',
    'age': 41.0,
    'embarked': 'Cherbourg',
    'home.dest': 'New York, NY',
    'sex': 'male'
  },
  '437': {
    'pclass': '2rd',
    'age': 48.0,
    'embarked': 'Southampton',
    'home.dest': 'Somerset / Bernardsville, NJ',
    'sex': 'female'
  },
  '663': {
    'pclass': '3rd',
    'age': 26.0,
    'embarked': 'Southampton',
    'home.dest': 'UNKNOWN',
    'sex': 'male'
  },
  '669': {
    'pclass': '3rd',
    'age': 19.0,
    'embarked': 'Southampton',
    'home.dest': 'England',
    'sex': 'male'
  },
  '507': {
    'pclass': '2rd',
    'age': 31.194181,
    'embarked': 'Southampton',
    'home.dest': 'Petworth, Sussex',
    'sex': 'male'
  }
}
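
The 'index' layout is the row-first mirror of the 'dict' layout: look up a row by its label, then a field by column name. The sketch below also rebuilds a frame with pd.DataFrame.from_dict and orient='index'. The two-row frame `df` is a hypothetical stand-in for the data above.

```python
import pandas as pd

# Hypothetical two-row stand-in for the data above
df = pd.DataFrame(
    {"pclass": ["3rd", "1rd"], "age": [31.194181, 32.0]},
    index=["1086", "12"],
)

by_label = df.to_dict(orient="index")

# Outer key is the index label, inner key is the column name
row_12 = by_label["12"]
print(row_12)  # {'pclass': '1rd', 'age': 32.0}

# from_dict with orient='index' treats the outer keys as row labels
rebuilt = pd.DataFrame.from_dict(by_label, orient="index")
print(rebuilt.loc["1086", "pclass"])  # 3rd
```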

References:
https://blog.csdn.net/m0_37804518/article/details/78444110
