Python: writing a program to iterate over a CSV file, match a field, and save the result in a different data file

Question

I am trying to write a program that does the following:

Specify a field from a record in a CSV file called data, and a field from a record in a CSV file called log.

Compare the two fields at the same position in data and in log. If they match on the same line, write the record from log to a new file called result. If the field does not match the record at that position in log, move to the next record in log and keep comparing until a matching record is found, then save that record in result. After that, reset the index of the log file, go to the next line in the data file, and repeat the check until the end of the data file is reached.

This is what I was able to do, but I am stuck:

import csv
def main():

    datafile_csv = open('data.txt')
    logfile_csv = open('log.txt')
    row_data = []
    row_log = []
    row_log_temp = []
    index_data = 1
    index_log = 1
    index_log_temp = index_log
    counter = 0
    data = ''
    datareader = ''
    logreader = ''
    log = ''
#   row = 0
    logfile_len = sum (1 for lines in open('log.txt'))
    with open('resultfile.csv','w') as csvfile:
        out_write = csv.writer(csvfile,  delimiter=',',quotechar='"')
        with open('data.txt','r') as (data):
            row_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            row_data = next(data)
            print(row_data)
            with open ('log.txt','r') as (log):
                row_log = next(log)
                print(row_log)
                while counter != logfile_len:
                    comp_data = row_data[index_data:]
                    comp_log = row_log[index_log:]
                    comp_data = comp_data.strip('"')
                    comp_log = comp_log.strip('"')
                    print(row_data[1])
                    print(comp_data)
                    print(comp_log)
                    if comp_data != comp_log:
                        while comp_data != comp_log:
                            row_log = next(log)
                            comp_log = row_log[index_log]
                        out_write.writerow(row_log)
                        row_data = next(data)
                    else : 
                        out_write.writerow(row_log)
                        row_data = next(data)
                    log.seek(0)
                    counter +=1

The problems I have are the following:

I cannot convert the data line to a string properly, and I cannot compare the fields correctly.

Also, I need to be able to reset the pointer in the log file, but seek does not seem to be working.
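
Both symptoms can be reproduced in isolation. The sketch below is a minimal illustration, not the asker's program: it assumes the log file holds one record per line and uses io.StringIO as a stand-in for the opened file. It shows that next() on a file object returns a raw string, so indexing picks out single characters, while csv.reader returns a list of fields. It also shows that seek(0) on the file object rewinds it, after which a fresh reader starts again from the first record.

```python
import csv
import io

# Sample content from the question, assumed to be one record per line.
LOG_TEXT = '"test1","test2","test3"\n"4","5","6"\n"1","2","3"\n'

log = io.StringIO(LOG_TEXT)   # stands in for open('log.txt', newline='')

raw_line = next(log)          # a plain string, quotes and newline included
first_char = raw_line[1]      # indexing a string yields a single character

log.seek(0)                   # rewind the underlying file object
reader = csv.reader(log)      # build a fresh reader after rewinding
header = next(reader)         # a list of fields with the quotes removed
second_field = header[1]      # indexing a list yields a whole field

print(repr(first_char), header, repr(second_field))
```

Comparing header[1] from one file with the same index from the other then compares whole fields rather than characters or string slices.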

This is the content of the data file:

"test1","test2","test3"
"1","2","3"
"4","5","6"

This is the content of the log file:

"test1","test2","test3"
"4","5","6"
"1","2","3"

This is what the interpreter returns:

t "test1","test2","test3"

t test1","test2","test3"

test1","test2","test3"

1 1","2","3"

test1","test2","test3"

Traceback (most recent call last):
  File "H:/test.py", line 100, in <module>
    main()
  File "H:/test.py", line 40, in main
    comp_log = row_log[index_log]
IndexError: string index out of range

Thank you very much for the help

Regards

Danilo

Answer 1:

This joins the two files on two keys, the row number and a specific column (not defined in the question), and returns the result limited to the columns of the left (first) file.

import petl

log = petl.fromcsv('log.txt').addrownumbers()  # Load the csv/txt file into a petl table and add a 'row' number column
log_columns = len(petl.header(log))  # Number of columns in the log table, including 'row'
data = petl.fromcsv('data.txt').addrownumbers()  # Load the csv/txt file into a petl table and add a 'row' number column
joined_files = petl.join(log, data, key=['row', 'SpecificField'])  # Join on row number and a specific field ('SpecificField' is a placeholder name)
joined_files = petl.cut(joined_files, *range(1, log_columns))  # Keep only the log columns, dropping 'row' and the columns from the right table
petl.tocsv(joined_files, 'resultfile.csv')  # Write the result to a csv file

log.txt

data.txt

resultfile.csv
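
For comparison, the search described in the question can also be sketched with the standard csv module alone. This is a sketch under assumptions, not the method of the answer above: the function name match_records and the field_index parameter are hypothetical, the files are assumed to hold one record per line, and io.StringIO stands in for the real files. The log is read into a list once, so restarting the scan for each data record needs no seek(). Unlike a join keyed on the row number, the match here is on the chosen field only.

```python
import csv
import io


def match_records(data_file, log_file, out_file, field_index=0):
    """For each data record, write the first log record whose field matches.

    Returns the number of records written. Hypothetical helper for illustration.
    """
    # Read the whole log once; skip rows too short to hold the chosen field.
    log_rows = [row for row in csv.reader(log_file) if len(row) > field_index]
    writer = csv.writer(out_file, quoting=csv.QUOTE_ALL)
    written = 0
    for data_row in csv.reader(data_file):
        if len(data_row) <= field_index:
            continue  # blank or short line in the data file
        for log_row in log_rows:  # the scan restarts from the top every time
            if log_row[field_index] == data_row[field_index]:
                writer.writerow(log_row)
                written += 1
                break  # stop at the first match, then move to the next data row
    return written


# Sample content from the question, assumed to be one record per line.
data = io.StringIO('"test1","test2","test3"\n"1","2","3"\n"4","5","6"\n')
log = io.StringIO('"test1","test2","test3"\n"4","5","6"\n"1","2","3"\n')
result = io.StringIO()

count = match_records(data, log, result, field_index=0)
print(count)
print(result.getvalue())
```

With real files, the three StringIO objects would be replaced by open('data.txt', newline=''), open('log.txt', newline='') and open('resultfile.csv', 'w', newline=''). Reading the log into memory is a trade-off that suits small files; for a large log, a dictionary keyed on the field would avoid rescanning.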

Also, do not forget to pip install petl (version used for this example):