Merge csv Files with TimeStamps

Submitted by 六月ゝ 毕业季﹏ on 2021-02-10 20:40:44

Question


Data File 1:

data_20150801.csv

Time                Header  Header  Header  Header 
2015-08-01 07:00    14.4    14.4    14.4    68                              
2015-08-01 07:01    14.4    14.4    14.4    68  

Data File 2:

data2_20150801.csv

Time                Header   Header
2015-08-01 00:00    90       12312
2015-08-01 00:01    232      13213
......
2015-08-01 07:00    1000    1500
2015-08-01 07:01    2312    1245
2015-08-01 07:02    1232    1232
2015-08-01 07:03    1231    1232

I'd like to merge those two .csv files to get a file that looks like this:

Time                Header  Header  Header  Header Header   Header
2015-08-01 07:00    14.4    14.4    14.4    68     1000     1500

So basically I need to copy the rows from data2_ and insert them at the matching time points in data_. I tried doing it manually in Notepad++, but the problem is that sometimes data2_ has no entry for a given minute, so I would have to find each missing time step and skip it by hand.

I have done a few things in Python, but I'm still a beginner and don't know how to start tackling a problem like this.

I'm using a Mac, and I found that the cat command can combine the .csv files in a folder into one csv file. Is there a way to do this line by line while preserving the timestamps?


Answer 1:


You could use Python Pandas to do this quite easily, though it is probably overengineering:

import pandas as pd

# Read both files; the first row of each file holds the column names.
d_one = pd.read_csv('data.csv', sep=',', header=0)
d_two = pd.read_csv('data2.csv', sep=',', header=0)
# Keep only rows whose timestamp appears in both files.
# 'Time' is the timestamp column in the sample data; rename it if yours differs.
d_three = pd.merge(d_one, d_two, on='Time')
d_three.to_csv('output.csv', sep=',')

I haven't had the chance to test this code, but it should do what you want. You may need to change the separator from commas to tabs, depending on the file.
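The question mentions that data2_ sometimes has no entry for a given minute. As a rough sketch of how a Pandas merge behaves in that case, the example below uses small in-memory stand-ins for the two files. It assumes comma-separated data and uses made-up unique column names (A to D, X, Y) in place of the repeated "Header" labels. With how="left", every row of the first file is kept and the columns from the second file are left empty where the timestamp is missing; the default (how="inner") would drop such rows instead.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two files; column names A..D, X, Y are invented.
data_one = io.StringIO(
    "Time,A,B,C,D\n"
    "2015-08-01 07:00,14.4,14.4,14.4,68\n"
    "2015-08-01 07:01,14.4,14.4,14.4,68\n"
)
# The 07:01 row is deliberately left out to simulate a missing minute.
data_two = io.StringIO(
    "Time,X,Y\n"
    "2015-08-01 00:00,90,12312\n"
    "2015-08-01 07:00,1000,1500\n"
)

d_one = pd.read_csv(data_one, parse_dates=["Time"])
d_two = pd.read_csv(data_two, parse_dates=["Time"])

# how="left" keeps every row of d_one; timestamps absent from d_two get NaN.
merged = pd.merge(d_one, d_two, on="Time", how="left")
print(merged.to_csv(index=False))
```

Whether a left or an inner join is the better fit depends on whether rows without a match in data2_ should appear in the output at all.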




Answer 2:


Not being a Python expert, I would use two dictionaries, with the date-time stamp as the key and a list of the other columns as the value.

Load one file into one dictionary, and the other file into the other. Then it's pretty simple to merge the two dictionaries using keys that are the same in both.

As for reading the files, there is a standard csv module that you can use.
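The dictionary approach could be sketched as follows. This is only one possible reading of the idea: the function names load_rows and merge_on_time are invented for illustration, the data is assumed to be comma-separated, and in-memory strings stand in for the real files. Only timestamps present in both files are kept, in the order of the first file.

```python
import csv
import io


def load_rows(handle):
    """Return (header, {timestamp: list of the remaining columns})."""
    reader = csv.reader(handle)
    header = next(reader)
    rows = {row[0]: row[1:] for row in reader if row}
    return header, rows


def merge_on_time(handle_one, handle_two):
    """Join the rows of two CSV files on the timestamp in the first column."""
    header_one, rows_one = load_rows(handle_one)
    header_two, rows_two = load_rows(handle_two)
    merged = [header_one + header_two[1:]]
    for stamp, values in rows_one.items():
        # Skip timestamps that have no entry in the second file.
        if stamp in rows_two:
            merged.append([stamp] + values + rows_two[stamp])
    return merged


# Hypothetical stand-ins for the two files; 07:01 is missing from the second.
file_one = io.StringIO(
    "Time,Header,Header,Header,Header\n"
    "2015-08-01 07:00,14.4,14.4,14.4,68\n"
    "2015-08-01 07:01,14.4,14.4,14.4,68\n"
)
file_two = io.StringIO(
    "Time,Header,Header\n"
    "2015-08-01 00:00,90,12312\n"
    "2015-08-01 07:00,1000,1500\n"
    "2015-08-01 07:02,1232,1232\n"
)

merged = merge_on_time(file_one, file_two)

# Write the result as CSV text; a real script would write to a file instead.
output = io.StringIO()
csv.writer(output).writerows(merged)
print(output.getvalue())
```

With real files, the io.StringIO objects would be replaced by handles from open(path, newline=""), which is how the csv module documentation recommends opening CSV files.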




Answer 3:


Building on the Pandas solution above, I would add index=False to the to_csv call, so that it becomes:

d_three.to_csv('output.csv',sep=',', index=False)

This will remove the index column.



Source: https://stackoverflow.com/questions/32717819/merge-csv-files-with-timestamps
