Python - Select different row values from csv and combine them in new csv

Submitted by 流过昼夜 on 2020-03-26 04:04:58

Question


I have a CSV file containing hourly wave-condition data, along with data from measurements taken at certain times. For each measurement, I want to select the wave conditions from 6 hours before the measurement together with the measurement results, and export that to a new CSV file covering all the measurements.

The code below selects the right rows for one measurement:

import pandas as pd
import matplotlib.dates as mdates
from datetime import datetime

# csv holds the path to the input file
df = pd.read_csv(csv, header=None, names=['survey', 'time', 'tides', 'mwp', 'swh', 'mwd', 'data1', 'data2', 'data3', 'data4', 'data5'])
xp = [datetime.strptime(d, "%d/%m/%YT%H:%M") for d in df['time']]

xs = mdates.date2num(xp)
date = mdates.DateFormatter("%d/%m/%Y\n%H:%M")

# Survey times (when each measurement was taken)
survey01 = "26/03/2019T14:00"
survey02 = "10/04/2019T14:00"
survey03 = "11/04/2019T15:00"
survey04 = "01/05/2019T09:00"

# Select row data: waves 6 hours before the survey, measurements at the survey
selected_survey = df.loc[df["time"].eq(survey01)].index[0]
wave = selected_survey - 6  # rows are hourly, so 6 rows back is 6 hours earlier
result_wave = df.loc[wave, ['survey', 'time', 'tides', 'mwp', 'swh', 'mwd']]
meas = selected_survey
result_meas = df.loc[meas, ['data1', 'data2', 'data3', 'data4', 'data5']]

# Join them together
joined_list = []
joined_list.extend(result_wave)
joined_list.extend(result_meas)
print(joined_list)

#Export to csv
data = pd.DataFrame(list(zip(*[joined_list]))).add_prefix('Survey1')
data.to_csv('Waves.csv', index=False)
print(data)

This should be done for all the measurements (20+ in total), with the results combined in one CSV file.

How do I do this for all of them and export the result to a single CSV file, with one row per survey?

survey 1  26/03/2019T08:00  1.2 9.34    0.509   1.080  25.5  18.4  31.64    27.3    24.2
survey 2  10/04/2019T08:00  1.1 8.06    1.232   1.155  24.64 19.46 31.844   28.83   25.357
survey 3  ...

Or is there an easier way of getting the right data in a csv file?


Answer 1:


I wasn't able to comprehend the code completely. However, as discussed in the comments, you can use apply() to get the required results.

def process_data(i):
    # Row of the survey taken at time i
    selected_survey = df.loc[df["time"].eq(i)].index[0]
    # Rows are hourly, so 6 rows back is 6 hours before the survey
    wave = selected_survey - 6
    result_wave = df.loc[wave, ['survey', 'time', 'tides', 'mwp', 'swh', 'mwd']]
    result_meas = df.loc[selected_survey, ['data1', 'data2', 'data3', 'data4', 'data5']]

    joined_list = []
    joined_list.extend(result_wave)
    joined_list.extend(result_meas)
    return joined_list

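# Apply only to the survey times, not to every hourly row
surveys = pd.Series([survey01, survey02, survey03, survey04])
joined_list = surveys.apply(process_data)

survey_index_list = [f'survey{i}' for i in range(1, len(joined_list) + 1)]
data = pd.DataFrame(joined_list.tolist(), index=survey_index_list)
print(data)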
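
As a possible alternative to counting rows, here is a minimal self-contained sketch that looks rows up by timestamp instead, so it does not depend on the default integer index. It assumes every survey time, and the time 6 hours before it, appears exactly once in the time column. The helper name combine_surveys, the constants, and the demo values are made up for illustration and are not from the original post.

```python
import pandas as pd

WAVE_COLS = ["survey", "time", "tides", "mwp", "swh", "mwd"]
MEAS_COLS = ["data1", "data2", "data3", "data4", "data5"]
TIME_FORMAT = "%d/%m/%YT%H:%M"


def combine_surveys(df, survey_times, hours_before=6):
    """Hypothetical helper: one output row per survey, holding the wave
    columns from `hours_before` hours earlier plus the measurement
    columns recorded at the survey time itself."""
    # Index the rows by parsed timestamp so they can be looked up by time.
    stamps = pd.to_datetime(df["time"], format=TIME_FORMAT).rename("stamp")
    by_time = df.set_index(stamps)

    survey_stamps = pd.to_datetime(list(survey_times), format=TIME_FORMAT)
    wave_stamps = survey_stamps - pd.Timedelta(hours=hours_before)

    # .loc raises KeyError if any requested timestamp is missing.
    wave = by_time.loc[wave_stamps, WAVE_COLS].reset_index(drop=True)
    meas = by_time.loc[survey_stamps, MEAS_COLS].reset_index(drop=True)
    return pd.concat([wave, meas], axis=1)


# Made-up hourly demo data, 06:00 to 15:00 on 26/03/2019.
start = pd.Timestamp("2019-03-26 06:00")
hours = range(10)
demo = pd.DataFrame({
    "survey": "survey 1",
    "time": [(start + pd.Timedelta(hours=i)).strftime(TIME_FORMAT) for i in hours],
    "tides": [i for i in hours],
    "mwp": [10 + i for i in hours],
    "swh": [20 + i for i in hours],
    "mwd": [30 + i for i in hours],
    "data1": [100 + i for i in hours],
    "data2": [200 + i for i in hours],
    "data3": [300 + i for i in hours],
    "data4": [400 + i for i in hours],
    "data5": [500 + i for i in hours],
})

result = combine_surveys(demo, ["26/03/2019T14:00"])
# Passing a file name such as 'Waves.csv' would write to disk instead.
csv_text = result.to_csv(index=False)
print(csv_text)
```

With the real data, the list passed as survey_times would hold all 20+ survey timestamps, and the returned frame would contain one row per survey.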


Source: https://stackoverflow.com/questions/60484795/python-select-different-row-values-from-csv-and-combine-them-in-new-csv
