Question
I have code that checks whether a number (called item in my code) falls within the domain range. If it does, the result is printed to the output file; if it does not, it should be printed only once.
Question: How do I make sure that a number outside the domain range is printed only one time? (I used a True/False flag, but this doesn't work: when the flag is False, several duplicates are printed. In the code below, I am not sure how to make it print a number that is not in the domain range once instead of multiple times.)
for item in lookup[uniprotID]:
    for varain in wholelookup[uniprotID]:
        for names in wholeline[uniprotID]:
            statement = False
            if re.search(r'\d+', varain).group(0) == item and start <= int(item) <= end:
                result = str(int(item) - start + 1)
                if varain in names.split(' '):
                    statement = True
                    print(">{0} | at position {1} | start= {2}, end= {3} | description: {4} | {5}".format(uniprotID, result, start, end, varain, names))
            if statement == True:
                print(''.join(makeList[start-1:end]))
Answer 1:
Something based on this might work for you:
import sys

already_seen = set()
for line in sys.stdin:
    # Write each distinct line only the first time it appears.
    if line not in already_seen:
        already_seen.add(line)
        sys.stdout.write(line)
Note that if your files are large, you could end up consuming a lot of memory doing this, because every distinct line is kept in the set. If so, look into anydbm (renamed dbm in Python 3) or a Bloom filter.
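For the large-file case, here is a minimal sketch of the same idea with the seen-set kept on disk through the standard-library dbm module (the Python 3 name for anydbm). The function name write_unique and the db_path parameter are made up for illustration, and this is only one possible way to do it, not a tuned solution:

```python
import dbm


def write_unique(lines, out, db_path):
    """Write each distinct line to `out` once, tracking seen lines on disk."""
    # Flag "n" always creates a new, empty database at db_path.
    with dbm.open(db_path, "n") as seen:
        for line in lines:
            key = line.encode("utf-8")  # dbm stores keys as bytes
            if key not in seen:
                seen[key] = b"1"  # the value is a dummy; only the key matters
                out.write(line)
```

It could be called as write_unique(sys.stdin, sys.stdout, "seen.db"). Every lookup now goes to disk, so this trades speed for memory and is probably only worth it when the set really does not fit in RAM.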
Answer 2:
Store the values that are not in the range.
stored_prints = {}  # create this once, before the loops

if not (start <= int(item) <= end):
    try:
        stored_prints[item] += 1  # seen before: bump the count
    except KeyError:
        stored_prints[item] = 1   # first occurrence
print(stored_prints)
You will have to format it and adapt it to your needs, but this should do what you want, if I understood your question correctly.
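As a rough sketch of how the counting idea might be fitted to the question, the snippet below uses collections.Counter in place of the try/except and emits a message only when an out-of-range item is counted for the first time. The function name report_out_of_range and the message text are assumptions for illustration; items is taken to be a list of numeric strings, as item appears to be in the question:

```python
from collections import Counter


def report_out_of_range(items, start, end):
    """Return (messages, counts) for items that fall outside start..end."""
    counts = Counter()
    messages = []
    for item in items:
        if not (start <= int(item) <= end):
            counts[item] += 1
            # Report an item only on its first out-of-range occurrence.
            if counts[item] == 1:
                messages.append("{0} is outside {1}-{2}".format(item, start, end))
    return messages, counts


messages, counts = report_out_of_range(["5", "12", "12", "7", "30", "12"], 4, 10)
for message in messages:
    print(message)
```

Here "12" occurs three times but is reported once, and counts still records how often each out-of-range item was seen, in case that is useful for the output file.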
Source: https://stackoverflow.com/questions/11638659/how-to-prevent-duplicate-text-in-the-output-file-while-using-for-loop