How to print out specific rows/lines in a text file based on a condition (greater than or less than)

Question

I am trying to write a program that prints only the rows/lines of a text file in which one value exceeds another value on the same line. For example, this is a small part of the text file:

01,test1,202,290,A,290
02,test2,303,730,A,0
03,test3,404,180,N,180

The program should select all lines that have 'A' in them and, of those, only the lines where the 4th column (290 in the first line) is greater than the 6th column (290 in the first line), and then print them. So, for the text file above, the Python program should print only this line:

02,test2,303,730,A,0

The best I can do is simply print all lines that have 'A' in them by using:

F = open("TEST.txt").read()
for line in F.split():
    if 'A' in line:
        Column = line.split(',')

However, this only selects the lines with 'A' in them. When I try to also filter on whether the 4th column is greater than the 6th column, I get various errors. Can somebody please help me with this problem?

Answer 1:

The csv library will parse the file into rows for you. You should also never compare numbers as strings, because strings are compared lexicographically, which gives incorrect output. In addition, using in would match the A in "Apple" or anywhere else it appears in the line, not just an exact match. If you want to check for an exact match in a particular column, then do exactly that:

In [8]: cat test.txt
01,test1,202,290,A,290
02,test2,303,730,A,0
03,test3,404,180,N,180

In [9]: from csv import reader

In [10]: for row in reader(open("test.txt")):
   ....:     if row[4] == "A" and float(row[3]) > float(row[5]):
   ....:         print(row)
   ....:
['02', 'test2', '303', '730', 'A', '0']

Why comparing numbers as strings is a bad idea:

In [11]: "2" > "1234"
Out[11]: True

In [12]: float("2") > float("1234")
Out[12]: False
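Outside an IPython session, the same check could be wrapped in a small function. This is only a sketch: matching_rows and its flag parameter are names made up for illustration, and it assumes the 4th and 6th columns always hold numbers. It takes any iterable of lines, so the sample rows are supplied inline here.

```python
import csv

def matching_rows(lines, flag="A"):
    """Yield rows whose 5th column equals `flag` and whose 4th column exceeds the 6th."""
    for row in csv.reader(lines):
        if len(row) < 6:
            continue  # skip blank or short lines
        # Assumption: columns 4 and 6 are numeric; float() raises ValueError otherwise
        if row[4] == flag and float(row[3]) > float(row[5]):
            yield row

sample = [
    "01,test1,202,290,A,290",
    "02,test2,303,730,A,0",
    "03,test3,404,180,N,180",
]

for row in matching_rows(sample):
    print(",".join(row))  # prints 02,test2,303,730,A,0
```

To read from a real file, pass the file object instead of the list, for example inside with open("test.txt", newline="") as f.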

Answer 2:

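You can try the code below (filename is the path to your data file). The values are converted with float() so that they are compared as numbers rather than as strings:

for line in open(filename):
    if 'A' in line:
        Column = line.strip().split(',')
        # Require an exact 'A' in the 5th column and compare numerically
        if Column[4] == 'A' and float(Column[3]) > float(Column[5]):
            print(Column)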

Answer 3:

Try the following code:

from __future__ import print_function

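def condition(cols):
    # Compare the 4th and 6th columns as numbers, not as strings
    return cols[4] == 'A' and float(cols[3]) > float(cols[5])

with open('data.txt', 'r') as f:
    data = f.readlines()

# A plain loop is clearer than a list comprehension used only for its side effects
for line in data:
    if condition(line.strip().split(',')):
        print(line.strip())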

You can put any logical filtering condition in the "condition" function.
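To illustrate swapping conditions, here is a rough sketch. The helper filter_lines and the second predicate n_or_less are hypothetical names invented for this example, and the sample rows are given inline instead of being read from data.txt.

```python
def a_and_greater(cols):
    # 5th column is exactly 'A' and the 4th column exceeds the 6th (numeric comparison)
    return cols[4] == 'A' and float(cols[3]) > float(cols[5])

def n_or_less(cols):
    # Hypothetical alternative: 5th column is 'N', or the 4th column is less than the 6th
    return cols[4] == 'N' or float(cols[3]) < float(cols[5])

def filter_lines(lines, condition):
    """Return the stripped, non-empty lines for which condition(columns) is true."""
    result = []
    for line in lines:
        line = line.strip()
        if line and condition(line.split(',')):
            result.append(line)
    return result

sample = [
    "01,test1,202,290,A,290\n",
    "02,test2,303,730,A,0\n",
    "03,test3,404,180,N,180\n",
]

print(filter_lines(sample, a_and_greater))  # ['02,test2,303,730,A,0']
print(filter_lines(sample, n_or_less))      # ['03,test3,404,180,N,180']
```

Passing the predicate as an argument keeps the file-reading loop unchanged while the filtering rule varies.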

Answer 4:

I think you should definitely take a look at pandas.

It will make everything much easier:

from __future__ import print_function
import pandas as pd

df = pd.read_csv('data.txt', names=['col1','col2','col3','col4','col5','col6'])
print('Given data-set')
print(df)

df['diff'] = df['col4'] - df['col6']
flt = df[(df.col5 == 'A') & (df.col4 > df.col6)]
print('Filtered data-set')
print(flt)

#print(df.sum(axis=0, numeric_only=True))
print('sum(col6) = %d' % (df.sum(axis=0, numeric_only=True)['col6']))

Output:

Given data-set
   col1   col2  col3  col4 col5  col6
0     1  test1   202   290    A   290
1     2  test2   303   730    A     0
2     3  test3   404   180    N   180
Filtered data-set
   col1   col2  col3  col4 col5  col6  diff
1     2  test2   303   730    A     0   730
sum(col6) = 470

Source: https://stackoverflow.com/questions/35381799/how-to-print-out-specific-rows-lines-in-a-text-file-based-on-a-condition-greate

Tags

python

python-3.5