In pandas how to read csv files with lists in a column?

轮回少年 2021-01-16 09:37

I have a CSV file in which some columns contain lists, like this:

df = pd.DataFrame({'a':[['ID1','ID2','ID3'],['ID1','ID4'],[]],'b':[[8.6,1.3,2.5],


        
1 answer
  • 2021-01-16 10:02

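    To do this, you can use the converters parameter of pd.read_csv (see the read_csv documentation). It takes a dict that maps a column name to a function, and that function is applied to every raw cell string in the column.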

    Using your example,

    'ID of the project'  'postcode'    'city'       'len of the lists in the last 3 columns'  'ids of other projects'   'distance from initial project'  'jetlag from initial project'
     object                int          string       int                                       list of strings           list of floats                   list of ints
    

    it could be done in this way:

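    import ast
    import pandas as pd

    # Each list cell arrives as text such as "['ID1', 'ID2']".
    # ast.literal_eval turns that text back into a Python list and,
    # unlike eval, only accepts literals.
    conv = {'ids of other projects': ast.literal_eval,
            'distance from initial project': ast.literal_eval,
            'jetlag from initial project': ast.literal_eval}

    df = pd.read_csv('your_file.csv', converters=conv)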
    

    You have to name each column that needs the conversion explicitly, but with only three list columns this should not be a problem in your case.
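
    To see the effect end to end, here is a minimal self-contained sketch. The sample rows, the in-memory file, and the parse_list helper are assumptions made up for illustration (they are not from the question). The helper is only there because ast.literal_eval raises an error on an empty cell.

```python
import ast
import io

import pandas as pd

# Hypothetical sample data standing in for 'your_file.csv'.
# List cells are quoted so the commas inside them are not read as separators.
csv_text = (
    "ID of the project,ids of other projects,distance from initial project\n"
    "P1,\"['ID1', 'ID2', 'ID3']\",\"[8.6, 1.3, 2.5]\"\n"
    "P2,\"['ID1', 'ID4']\",\"[4.0, 0.7]\"\n"
    "P3,[],[]\n"
    "P4,,\n"
)


def parse_list(cell):
    """Hypothetical helper: parse a list literal, treating an empty cell as []."""
    cell = cell.strip()
    return ast.literal_eval(cell) if cell else []


# Without converters the list columns come back as plain strings.
raw = pd.read_csv(io.StringIO(csv_text))
raw_cell = raw.loc[0, "ids of other projects"]

# With converters each cell is parsed while the file is being read.
conv = {
    "ids of other projects": parse_list,
    "distance from initial project": parse_list,
}
df = pd.read_csv(io.StringIO(csv_text), converters=conv)
parsed_cell = df.loc[0, "ids of other projects"]

print(type(raw_cell).__name__, type(parsed_cell).__name__)
```

    If your file never has empty cells in those columns, passing ast.literal_eval directly is enough.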

    The converter function will be applied during your csv import, and if your file gets too large, you can always read the csv in chunks.
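
    Reading in chunks could look roughly like the sketch below. The sample data and the chunk size of 2 are assumptions chosen so the example runs on its own; for a real file you would pass the path and a much larger chunksize.

```python
import ast
import io

import pandas as pd

# Hypothetical sample data; replace io.StringIO(csv_text) with your file path.
csv_text = (
    "ID of the project,jetlag from initial project\n"
    "P1,\"[0, 1, 2]\"\n"
    "P2,\"[3]\"\n"
    "P3,[]\n"
    "P4,\"[1, 1]\"\n"
)
col = "jetlag from initial project"
conv = {col: ast.literal_eval}

chunks = []
total_items = 0
# With chunksize set, read_csv returns an iterator of DataFrames
# and the converters are applied to each chunk as it is read.
for chunk in pd.read_csv(io.StringIO(csv_text), converters=conv, chunksize=2):
    total_items += int(chunk[col].map(len).sum())  # per-chunk processing
    chunks.append(chunk)

# Only concatenate if the full result fits in memory.
df = pd.concat(chunks, ignore_index=True)
print(len(chunks), total_items, len(df))
```

    Doing the per-chunk work inside the loop and keeping only the aggregated result is what keeps memory use low; the final concat is optional.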
