Read CSV into a DataFrame with varying row lengths using pandas

孤城傲影 2020-12-03 22:32

So I have a CSV that looks a bit like this:

1 | 01-01-2019 | 724
2 | 01-01-2019 | 233 | 436
3 | 01-01-2019 | 345
4 | 01-01-2019 | 803 | 933 | 943 | 923 | 954         


        
6 answers
  •  天命终不由人
    2020-12-03 23:14

    Reading the data as a fixed-width file with read_fwf should work:

    from io import StringIO
    import pandas as pd
    
    s = '''1  01-01-2019  724
    2  01-01-2019  233  436
    3  01-01-2019  345
    4  01-01-2019  803  933  943  923  954
    5  01-01-2019  454'''
    
    
    pd.read_fwf(StringIO(s), header=None)
    
       0           1    2      3      4      5      6
    0  1  01-01-2019  724    NaN    NaN    NaN    NaN
    1  2  01-01-2019  233  436.0    NaN    NaN    NaN
    2  3  01-01-2019  345    NaN    NaN    NaN    NaN
    3  4  01-01-2019  803  933.0  943.0  923.0  954.0
    4  5  01-01-2019  454    NaN    NaN    NaN    NaN
    

    Or, for the pipe-separated form, pass the delimiter parameter:

    s = '''1 | 01-01-2019 | 724
    2 | 01-01-2019 | 233 | 436
    3 | 01-01-2019 | 345
    4 | 01-01-2019 | 803 | 933 | 943 | 923 | 954
    5 | 01-01-2019 | 454'''
    
    
    pd.read_fwf(StringIO(s), header=None, delimiter='|')
    
       0             1    2      3      4      5      6
    0  1   01-01-2019   724    NaN    NaN    NaN    NaN
    1  2   01-01-2019   233  436.0    NaN    NaN    NaN
    2  3   01-01-2019   345    NaN    NaN    NaN    NaN
    3  4   01-01-2019   803  933.0  943.0  923.0  954.0
    4  5   01-01-2019   454    NaN    NaN    NaN    NaN
    

    Note that for your actual file you would not use StringIO. Just pass the file path instead: pd.read_fwf('data.csv', delimiter='|', header=None)
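
    To try the file-path variant end to end, here is a minimal sketch. It assumes the file is pipe-separated and column-aligned like the sample in the question; the temporary directory and the name data.csv are only placeholders for illustration. Because the text column may keep the spaces around each pipe, the sketch also strips column 1 afterwards, which is an extra step not taken from the answer above.

```python
import os
import tempfile

import pandas as pd

# Sample rows copied from the question (pipe-separated, ragged row lengths).
sample = """1 | 01-01-2019 | 724
2 | 01-01-2019 | 233 | 436
3 | 01-01-2019 | 345
4 | 01-01-2019 | 803 | 933 | 943 | 923 | 954
5 | 01-01-2019 | 454"""

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; replace with the path to your real file.
    path = os.path.join(tmp, "data.csv")
    with open(path, "w") as fh:
        fh.write(sample)

    # Same call as in the answer, but with a path instead of StringIO.
    df = pd.read_fwf(path, delimiter="|", header=None)

# Assumption: the date column may carry the padding spaces around the pipes,
# so normalise it to plain strings without surrounding whitespace.
df[1] = df[1].astype(str).str.strip()

print(df.shape)
print(df)
```

    Short rows are padded with NaN, so the frame is as wide as the longest row. read_fwf infers the column boundaries from the first rows of the file (the infer_nrows parameter, 100 by default), so this relies on the columns lining up as they do in the sample.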
