Compare files line by line to see if they are the same, if so output them

一世执手 submitted on 2019-12-06 15:58:30

What I got from the clarification:

  • file1 and file2 are in the same format, where each line looks like

    {32 char hex key}|{text1}|{text2}|{text3}
    
  • the files are sorted in ascending order by key

  • for each key that appears in both file1 and file2, you want merged output, so each line looks like

    {32 char hex key}|{text11}|{text12}|{text13}|{text21}|{text22}|{text23}
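
To make the target format concrete, here is a minimal sketch of merging one pair of lines that share a key. The function name `merge_lines` and the sample key and text values are made up for illustration; they are not from the question.

```python
def merge_lines(line1, line2):
    """Merge two 'key|t1|t2|t3' lines that share the same key."""
    fields1 = line1.rstrip("\n").split("|")
    fields2 = line2.rstrip("\n").split("|")
    if fields1[0] != fields2[0]:
        raise ValueError("keys differ")
    # keep the key once, then file1's text fields, then file2's
    return "|".join(fields1 + fields2[1:])


key = "0123456789abcdef0123456789abcdef"  # hypothetical 32-char hex key
print(merge_lines(key + "|a|b|c\n", key + "|x|y|z\n"))
```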
    

You basically want the collisions from a merge sort:

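import csv

def getnext(reader, key=lambda row: int(row[0], 16)):
    # return (numeric key, row); raises StopIteration at end of file
    row = next(reader)
    return key(row), row

# Python 3: open in text mode with newline='' so the csv module
# handles line endings itself
with open('file1.dat', newline='') as inf1, \
     open('file2.dat', newline='') as inf2, \
     open('merged.dat', 'w', newline='') as outf:
    a = csv.reader(inf1, delimiter='|')
    b = csv.reader(inf2, delimiter='|')
    res = csv.writer(outf, delimiter='|')

    try:
        a_key, a_row = getnext(a)
        b_key, b_row = getnext(b)
        while True:
            if a_key < b_key:
                a_key, a_row = getnext(a)
            elif b_key < a_key:
                b_key, b_row = getnext(b)
            else:
                # same key in both files: write the merged line, then
                # advance both readers so the match is written only once
                res.writerow(a_row + b_row[1:])
                a_key, a_row = getnext(a)
                b_key, b_row = getnext(b)
    except StopIteration:
        # reached the end of an input file
        pass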
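
The same merge idea can be tried without any files. The following is a rough sketch, assuming rows are lists whose first element is a hex key and that both inputs are sorted by that key. The generator name `matching_rows` and the shortened sample keys are hypothetical and only there to keep the example readable.

```python
def matching_rows(rows_a, rows_b, key=lambda row: int(row[0], 16)):
    """Yield merged rows for keys that appear in both sorted inputs."""
    it_a, it_b = iter(rows_a), iter(rows_b)
    try:
        row_a, row_b = next(it_a), next(it_b)
        while True:
            ka, kb = key(row_a), key(row_b)
            if ka < kb:
                row_a = next(it_a)      # a is behind, advance a
            elif kb < ka:
                row_b = next(it_b)      # b is behind, advance b
            else:
                yield row_a + row_b[1:]  # same key: merge, then advance both
                row_a, row_b = next(it_a), next(it_b)
    except StopIteration:
        return  # one input is exhausted, so no further matches are possible


file1_rows = [["0a", "x1", "x2", "x3"], ["0b", "y1", "y2", "y3"], ["1f", "z1", "z2", "z3"]]
file2_rows = [["0b", "p1", "p2", "p3"], ["10", "q1", "q2", "q3"], ["1f", "r1", "r2", "r3"]]
merged = list(matching_rows(file1_rows, file2_rows))
for row in merged:
    print("|".join(row))
```

Only one row from each input is held at a time, which is what makes this approach suitable for large sorted files.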

I still have no idea what you are trying to communicate by 'as well as other file1[x] where x can be any index from the line'.

With this method you compare the files line by line, so you don't have to hold them in memory, which matters when the files are huge. Note that it pairs lines by position and stops at the end of the shorter file.

with open('file1.txt') as f1, open('file2.txt') as f2, open('file3.txt','w') as f3:
    for x, y in zip(f1, f2): 
        if x == y:
            f3.write(x)
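
A small sketch of how this positional comparison behaves, using in-memory text streams in place of real files. The helper name `copy_equal_lines` and the sample contents are assumptions for illustration only.

```python
import io


def copy_equal_lines(f1, f2, out):
    """Write lines that are identical at the same position in f1 and f2."""
    count = 0
    for x, y in zip(f1, f2):  # zip stops when the shorter input ends
        if x == y:
            out.write(x)
            count += 1
    return count


f1 = io.StringIO("a\nb\nc\nd\n")
f2 = io.StringIO("a\nx\nc\n")
out = io.StringIO()
written = copy_equal_lines(f1, f2, out)
print(written, repr(out.getvalue()))
```

Because lines are paired by position, an extra line inserted near the top of one file makes every later line compare as different, so this only suits files that are expected to line up exactly.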

Comparing the lines of two files that start at a specified offset:

fp1 = open("file1.txt", "r")
fp2 = open("file2.txt", "r")

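# seek() takes an offset into the file, not a line number; in text mode
# use 0 or a value previously returned by tell()
index = 0
fp1.seek(index)
fp2.seek(index)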

line1 = fp1.readline()
line2 = fp2.readline()

if line1 == line2:
    print(line1)

fp1.close()
fp2.close()
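
If "index" is meant as a line number rather than a file offset (an assumption, since the snippet does not say), seek() is not the right tool. A minimal sketch of one alternative using `itertools.islice` follows; the helper name `line_at` is hypothetical, and in-memory streams stand in for the two files.

```python
import io
from itertools import islice


def line_at(f, line_number):
    """Return the line at 0-based line_number, or None if f is shorter."""
    return next(islice(f, line_number, line_number + 1), None)


f1 = io.StringIO("a\nb\nc\n")
f2 = io.StringIO("x\nb\nz\n")
line1, line2 = line_at(f1, 1), line_at(f2, 1)
if line1 is not None and line1 == line2:
    print(line1, end="")
```

This still reads every line before the requested one, but it keeps only one line in memory at a time.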

Comparing two files line by line and printing every pair of lines that does not match:

fp1 = open("file1.txt", "r")
fp2 = open("file2.txt", "r")

line1, line2 = fp1.readline(), fp2.readline()

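while line1 and line2:
    if line1 != line2:
        print("Mismatch.\n1: %s\n2: %s" % (line1, line2))
    # read the next pair of lines; without this the loop never ends
    line1, line2 = fp1.readline(), fp2.readline()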

fp1.close()
fp2.close()
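
A variation on the mismatch report that also records line numbers and notices when one file is longer than the other. This is a sketch rather than part of the original answer: the function name `find_mismatches` is made up, and lists of strings stand in for open files.

```python
from itertools import zip_longest


def find_mismatches(lines1, lines2):
    """Return (line_number, line1, line2) for every position that differs."""
    mismatches = []
    pairs = zip_longest(lines1, lines2)  # pads the shorter input with None
    for number, (l1, l2) in enumerate(pairs, start=1):
        if l1 != l2:
            mismatches.append((number, l1, l2))
    return mismatches


result = find_mismatches(["a\n", "b\n"], ["a\n", "c\n", "d\n"])
for number, l1, l2 in result:
    print("Mismatch at line %d.\n1: %r\n2: %r" % (number, l1, l2))
```

Both arguments can be open file objects as well, since iterating over a file yields its lines one at a time.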