Python/ Pandas CSV Parsing

南方客 2021-01-28 02:34

I used the JotForm Configurable List widget to collect data, but I am having trouble parsing the resulting data correctly. When I use

testdf = pd.read_csv("TestLoad.c

2 Answers
  •  天命终不由人
    2021-01-28 03:00

    I used a regex separator with the python engine so that I could specify multiple separators. I then used the usecols parameter to select which columns of the csv file end up in the dataframe. The header is not read from the file, and I skipped the first row since it doesn't contain any data. I read the first and second sets of records into two dataframes and then concatenated them.

    import pandas as pd

    # A regex separator requires the python engine; skiprows=1 drops the data-less first row.
    a = pd.read_csv('sample.csv', sep=',|:|;', skiprows=1, usecols=(0, 2, 4, 6, 14), header=None, engine='python')
    b = pd.read_csv('sample.csv', sep=',|:|;', skiprows=1, usecols=(0, 8, 10, 12, 14), header=None, engine='python')
    a.columns = ['Date', 'First', 'Last', 'School', 'Type']
    b.columns = ['Date', 'First', 'Last', 'School', 'Type']
    # Stack the second set of records below the first.
    final_data = pd.concat([a, b], axis=0)
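
For anyone who wants to try the approach without the original file, here is a minimal self-contained sketch. The file layout is not shown in the question, so the `SAMPLE` text is a made-up guess, chosen only so that each data row splits into 15 fields and the column positions used above line up. The helper name `read_record_set` is also hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample: the real file layout is not shown in the question.
# Each data row splits into 15 fields on ',', ':' or ';'.
SAMPLE = """Date,Information,Type
2015-12-06,First:Tom;Last:Smith;School:MCAA;First:Tammy;Last:Smith;School:MCAA;,New
2015-12-06,First:Jim;Last:Jones;School:MCAA;First:Jane;Last:Jones;School:MCAA;,New
"""

COLUMNS = ["Date", "First", "Last", "School", "Type"]


def read_record_set(text, usecols):
    """Read one set of columns from the sample text into a labelled frame."""
    frame = pd.read_csv(
        io.StringIO(text),
        sep=",|:|;",        # regex alternation: comma, colon, or semicolon
        engine="python",    # regex separators need the python engine
        skiprows=1,         # drop the first row, which holds no data
        header=None,        # columns get integer labels 0..14
        usecols=usecols,
    )
    frame.columns = COLUMNS
    return frame


first = read_record_set(SAMPLE, [0, 2, 4, 6, 14])
second = read_record_set(SAMPLE, [0, 8, 10, 12, 14])
combined = pd.concat([first, second], axis=0)
print(combined)
```

Note that a regex separator splits on every match and does not honour quoting, so a comma, colon, or semicolon inside a quoted value would also be treated as a delimiter.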
    

    If you need the order preserved, so that the second name from each row appears right below the first, you can sort by the index. I use mergesort because it is a stable sort: rows with the same index keep their concatenation order, which ensures that the first Information record (the left-hand set of columns) stays above the second one (the right-hand set).

    >>> final_data.sort_index(kind='mergesort', inplace=True)
    >>> final_data
            Date        First  Last     School  Type
    0   "2015-12-06"    Tom    Smith    MCAA    "New"
    0   "2015-12-06"    Tammy  Smith    MCAA    "New"
    1   "2015-12-06"    Jim    Jones    MCAA    "New"
    1   "2015-12-06"    Jane   Jones    MCAA    "New"
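
To see the stable-sort behaviour in isolation, here is a small sketch that uses hand-built frames in place of the csv data. The values mirror the output above. The final `reset_index(drop=True)` step is an optional extra, not part of the original answer, for cases where the duplicated index labels are unwanted.

```python
import pandas as pd

# Hand-built stand-ins for the two frames; values mirror the output above.
a = pd.DataFrame({"First": ["Tom", "Jim"], "Last": ["Smith", "Jones"]})
b = pd.DataFrame({"First": ["Tammy", "Jane"], "Last": ["Smith", "Jones"]})

stacked = pd.concat([a, b], axis=0)  # index is 0, 1, 0, 1

# A stable sort keeps rows with equal index labels in their concat order,
# so each row from `a` stays above the matching row from `b`.
interleaved = stacked.sort_index(kind="mergesort")

# Optional: replace the duplicated labels with a fresh 0..n-1 index.
renumbered = interleaved.reset_index(drop=True)
print(renumbered)
```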
    

    Edit: Included the second set of records in the data and changed the axis to 0.
