Defining data types during CSV file import based on column index in pandas

夕颜 2020-12-21 09:36

I need to import a CSV file that has 300+ columns. Only the first column needs to be specified as a category; the rest of the columns should be float.

3 Answers
  •  萌比男神i
    2020-12-21 09:57

    Read the file twice: the first time to get all the column names, the second time specifying dtype when reading.

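    import numpy as np
    import pandas as pd

    # Create a small sample CSV to demonstrate with.
    df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
    df.to_csv('tmp.csv', index=False)

    path = 'tmp.csv'

    # First read: load the file only to collect the column names.
    df = pd.read_csv(path)
    type_dict = {}

    # 'A' is the first column here: it becomes a category, the rest float32.
    for key in df.columns:
        if key == 'A':
            type_dict[key] = 'category'
        else:
            type_dict[key] = np.float32

    # Second read: apply the dtypes while parsing.
    df = pd.read_csv(path, dtype=type_dict)
    print(df.dtypes)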
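
Since the question asks about selecting by column index rather than by name, the same two-pass idea can be sketched positionally. This is a minimal sketch, not the answerer's code: the in-memory `csv_text` sample and the name `dtype_map` are made up for illustration, `float64` is assumed where the answer above uses `np.float32`, and `nrows=0` is used so that the first pass parses only the header row instead of the whole file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real 300+ column file.
csv_text = "label,x1,x2,x3\na,1,2,3\nb,4,5,6\na,7,8,9\n"

# First pass: nrows=0 parses only the header row, so no data rows are loaded.
columns = pd.read_csv(io.StringIO(csv_text), nrows=0).columns

# Build the dtype mapping by position: column 0 is categorical, the rest float.
dtype_map = {name: "float64" for name in columns[1:]}
dtype_map[columns[0]] = "category"

# Second pass: read the full data with the dtypes applied while parsing.
df = pd.read_csv(io.StringIO(csv_text), dtype=dtype_map)
print(df.dtypes)
```

With a real file, the two `io.StringIO(csv_text)` arguments would be replaced by the file path. Because the mapping is keyed on `columns[0]` and `columns[1:]`, no column name has to be hard-coded.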
    
