How to merge two files in python

依然范特西╮ submitted on 2019-12-08 07:16:25

Question


I have two tab-delimited CSV files (with headers) that I need to merge in Python.

Also, in the merged file I want to add a column at the end that identifies the source file: although both files have the same format, they contain different data that I need to separate later on. So I want a column called 'source' on each line of output, which is 0 for file1 and 1 for file2.

I have gotten as far as using the csv module, but writerow adds an additional newline character between each line it writes, and this code doesn't write anything from file2. What am I doing wrong here? Also, how do I add the extra 'source' column to the line object?

import os, csv

path1 = os.path.abspath("../data/file1.txt")
path2 = os.path.abspath("../data/file2.txt")
merged_path = os.path.abspath('../data/output.txt')

# merge the two files for further processing
merged_file = csv.writer(open(merged_path, 'a'), delimiter = '\t')

#file1
fg = csv.reader(open(path1, 'r'), delimiter = '\t')

for line in fg:
    if line[7] != '\N':
        merged_file.writerow(line) 

#file2
bg = csv.reader(open(path2, 'r'), delimiter = '\t')

for line in bg:
    if line[16] != '\N':
        merged_file.writerow(line) 

Answer 1:


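I prefer to use DictWriter for this. Also, your code doesn't work because, in Python 2, the csv module requires files to be opened in binary mode ('rb' for reading, 'ab' or 'wb' for writing); opening them in text mode is the usual cause of the extra blank lines. In Python 3, open the files in text mode with newline='' instead.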

import os, csv

path1 = os.path.abspath("../data/file1.txt")
path2 = os.path.abspath("../data/file2.txt")
merged_path = os.path.abspath('../data/output.txt')

#file1
fg = csv.DictReader(open(path1, 'rb'), delimiter = '\t')

fieldnames = fg.fieldnames
fieldnames.append('source')
# merge the two files for further processing
merged_file = csv.DictWriter(open(merged_path, 'ab'), delimiter = '\t', fieldnames=fieldnames)
merged_file.writeheader()

for row in fg:
    row['source'] = os.path.basename(path1)
    merged_file.writerow(row)

#file2
bg = csv.DictReader(open(path2, 'rb'), delimiter = '\t')

for row in bg:
    row['source'] = os.path.basename(path2)
    merged_file.writerow(row)
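
The code above targets Python 2 and tags each row with the file name. For Python 3, and for the 0/1 'source' values and the '\N' filters from the question, a minimal sketch might look like the following. It uses plain csv.reader/csv.writer rather than DictWriter so that the question's column indexes (7 and 16) can be reused. The function name merge_with_source and its (path, filter_index) argument shape are hypothetical, chosen only for illustration, and the sketch assumes both files share the same header.

```python
import csv

NULL = r"\N"  # raw string: a bare '\N' is a syntax error in Python 3


def merge_with_source(inputs, merged_path):
    """Merge tab-delimited files that share a header into merged_path.

    inputs is a list of (path, filter_index) pairs. Rows whose column at
    filter_index equals NULL are skipped. Each row written gets an extra
    'source' column holding the position of its file in inputs (0, 1, ...).
    """
    header_written = False
    # newline='' stops the csv module's line endings from being translated
    # again, which is what produces blank lines between rows on Windows.
    with open(merged_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        for source, (path, filter_index) in enumerate(inputs):
            with open(path, newline="") as src:
                reader = csv.reader(src, delimiter="\t")
                header = next(reader, None)
                if header is None:
                    continue  # empty file: nothing to merge
                if not header_written:
                    writer.writerow(header + ["source"])
                    header_written = True
                for row in reader:
                    if row[filter_index] != NULL:
                        writer.writerow(row + [source])
```

With the paths from the question, the call would be merge_with_source([(path1, 7), (path2, 16)], merged_path). Note that this sketch opens the output with 'w' rather than 'a', so rerunning it replaces the merged file instead of appending a second header and a second copy of the rows.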


Source: https://stackoverflow.com/questions/9211923/how-to-merge-two-files-in-python
