Question
I am trying to write a program that prints the specific rows/lines of a text file in which one value exceeds another value on the same line. For example, this is a small part of the text file:
01,test1,202,290,A,290
02,test2,303,730,A,0
03,test3,404,180,N,180
The program should select all lines that have 'A' in them, but only those where the 4th column (290 in the first line) is greater than the 6th column (also 290 in the first line), and then print them. So, for the text file above, the Python program should print only this line:
02,test2,303,730,A,0
The best I can do is simply print all lines that have 'A' in them by using:
F = open("TEST.txt").read()
for line in F.split():
    if 'A' in line:
        Column=line.split(',')
However, this only selects the lines with 'A' in them. When I attempt to filter on whether the 4th column is greater than the 6th column, I get various errors. Can somebody please help me with this problem?
Answer 1:
The csv library will parse the file into rows for you. You should also never compare numbers as strings, because they are compared lexicographically and will give you incorrect output. Also, testing 'A' in line would match the A in "Apple" or anywhere else it appears, not just an exact match. If you want to check for an exact match in a particular column, then do exactly that:
In [8]: cat test.txt
01,test1,202,290,A,290
02,test2,303,730,A,0
03,test3,404,180,N,180
In [9]: from csv import reader
In [10]: for row in reader(open("test.txt")):
   ....:     if row[4] == "A" and float(row[3]) > float(row[5]):
   ....:         print(row)
   ....:
['02', 'test2', '303', '730', 'A', '0']
Why comparing numbers as strings is a bad idea:
In [11]: "2" > "1234"
Out[11]: True
In [12]: float("2") > float("1234")
Out[12]: False
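For use outside an interactive session, the same idea can be sketched as a small script. This is only an illustration: the function name rows_where_col4_exceeds_col6, the flag parameter, and the decision to skip short or non-numeric rows are assumptions, not part of the question. csv.reader accepts any iterable of lines, so an open file object can be passed in place of the sample list.

```python
import csv


def rows_where_col4_exceeds_col6(lines, flag="A"):
    """Yield parsed rows whose 5th field equals `flag` and whose
    4th field is numerically greater than the 6th field."""
    for row in csv.reader(lines):
        if len(row) < 6:
            continue  # assumption: silently skip blank or short lines
        try:
            fourth, sixth = float(row[3]), float(row[5])
        except ValueError:
            continue  # assumption: silently skip non-numeric values
        if row[4] == flag and fourth > sixth:
            yield row


# The sample data from the question, one string per line.
sample = [
    "01,test1,202,290,A,290",
    "02,test2,303,730,A,0",
    "03,test3,404,180,N,180",
]

matches = list(rows_where_col4_exceeds_col6(sample))
for row in matches:
    print(",".join(row))  # prints 02,test2,303,730,A,0
```

To read from a file instead, something like `with open("test.txt", newline="") as f:` followed by a loop over `rows_where_col4_exceeds_col6(f)` should behave the same way.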
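Answer 2:
You can try the code below:
# filename is the path to your text file
for line in open(filename):
    Column = line.strip().split(',')
    # Match 'A' in the 5th column only, and compare the values as numbers, not strings
    if Column[4] == 'A' and int(Column[3]) > int(Column[5]):
        print(Column)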
Answer 3:
Try the following code:
from __future__ import print_function

def condition(cols):
    # Compare the 4th and 6th columns as numbers, not as strings
    return cols[4] == 'A' and float(cols[3]) > float(cols[5])

with open('data.txt', 'r') as f:
    for line in f:
        line = line.strip()  # drop the trailing newline
        if line and condition(line.split(',')):
            print(line)
You can set any logical filtering conditions in the "condition" function.
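To illustrate swapping the filtering condition, here is a minimal sketch in which the condition is passed in as an argument. The names filter_lines, a_and_greater, and not_a_or_equal are hypothetical and chosen only for this example; the sample data is the one from the question.

```python
def filter_lines(lines, condition):
    """Return the stripped lines whose comma-separated fields satisfy `condition`."""
    kept = []
    for line in lines:
        line = line.strip()
        if line and condition(line.split(',')):
            kept.append(line)
    return kept


def a_and_greater(cols):
    # The condition from the question: 'A' flag and 4th column > 6th column.
    return cols[4] == 'A' and float(cols[3]) > float(cols[5])


def not_a_or_equal(cols):
    # A different, purely illustrative condition.
    return cols[4] != 'A' or float(cols[3]) == float(cols[5])


sample = [
    "01,test1,202,290,A,290\n",
    "02,test2,303,730,A,0\n",
    "03,test3,404,180,N,180\n",
]

print(filter_lines(sample, a_and_greater))
print(filter_lines(sample, not_a_or_equal))
```

Only the function passed as the second argument changes between the two calls; the file-reading and splitting logic stays the same.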
Answer 4:
I think you should definitely take a look at pandas.
It will make everything much easier:
from __future__ import print_function
import pandas as pd
df = pd.read_csv('data.txt', names=['col1','col2','col3','col4','col5','col6'])
print('Given data-set')
print(df)
df['diff'] = df['col4'] - df['col6']
flt = df[(df.col5 == 'A') & (df.col4 > df.col6)]
print('Filtered data-set')
print(flt)
#print(df.sum(axis=0, numeric_only=True))
print('sum(col6) = %d' % (df.sum(axis=0, numeric_only=True)['col6']))
Output:
Given data-set
   col1   col2  col3  col4 col5  col6
0     1  test1   202   290    A   290
1     2  test2   303   730    A     0
2     3  test3   404   180    N   180
Filtered data-set
   col1   col2  col3  col4 col5  col6  diff
1     2  test2   303   730    A     0   730
sum(col6) = 470
Source: https://stackoverflow.com/questions/35381799/how-to-print-out-specific-rows-lines-in-a-text-file-based-on-a-condition-greate