Getting the intersection of two lists in Python

Submitted by 北城余情 on 2019-12-12 01:55:42

Question


I have two lists of genes that I'm analyzing. Essentially, I want to sort the elements of these lists in the same way as a Venn diagram: elements that occur only in list 1 go into one list, those that occur only in list 2 go into another, and those that occur in both go into a third.

My code so far:

from Identify_Gene import Retrieve_Data #custom class
import argparse
import os

#enable use from command line
parser = argparse.ArgumentParser(description='''\n\nFind the intersection between two lists of genes\n ''')
parser.add_argument('filename1',help='first list of genes to compare')
parser.add_argument('filename2',help='second list of genes to compare')
parser.add_argument('--output_path',help='provide an output filename')
args = parser.parse_args()

os.chdir(args.output_path)

a = Retrieve_Data() # custom class, simply produces a python list
list1 = a.parse_gene_list(args.filename1)
list2 = a.parse_gene_list(args.filename2)

intersection = []
list1_only = []
list2_only = []
if len(list1)>len(list2):
    for i in range(0,len(list1)):
        if list1[i] in list2:
            intersection.append(list1[i])
        else:
            list1_only.append(list1[i])
    for i in range(0,len(list2)):
        if list2[i] not in list1:
            list2_only.append(list2[i])
else:
    for i in range(0,len(list2)):
        if list2[i] in list1:
            intersection.append(list2[i])
        else:
            list2_only.append(list2[i])
    for i in range(0,len(list1)):
        if list1[i] not in list2:
            list1_only.append(list2[i])




filenames = {}
filenames['filename1'] = 'list1_only.txt'
filenames['filename2'] = 'list2_only.txt'
filenames['intersection'] = 'intersection.txt'                

with open(filenames['filename1'],'w') as f:
    for i in range(0,len(list1_only)):
        f.write(list1_only[i]+'\n')

with open(filenames['filename2'],'w') as f:
    for i in range(0,len(list2_only)):
        f.write(list2_only[i]+'\n')

with open(filenames['intersection'],'w') as f:
    for i in range(0,len(intersection)):
        f.write(intersection[i]+'\n')

This program currently gives me two identical lists as list1_only and list2_only, when they should be mutually exclusive. The intersection file it produces is different, though I don't feel it can be trusted, since the other two lists are not behaving as expected.

I have been informed (since posting this question) that this operation can easily be done with the Python 'Sets' module; however, for educational purposes, I'd still quite like to fix this program.
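
For reference, the set-based approach mentioned above might look roughly like the sketch below. It uses Python's built-in set type rather than a separate module; the function name venn_partition and the gene names are made up for illustration. It also assumes that duplicates and the original ordering do not matter, since sets discard both.

```python
def venn_partition(list1, list2):
    """Split two lists into (only in list1, only in list2, in both) using sets."""
    set1, set2 = set(list1), set(list2)
    # Set difference gives the "only" groups; & gives the overlap.
    # sorted() is used only to make the output order deterministic.
    return sorted(set1 - set2), sorted(set2 - set1), sorted(set1 & set2)


# Hypothetical gene names, for illustration only
only1, only2, both = venn_partition(["BRCA1", "TP53", "EGFR"], ["TP53", "MYC"])
print(only1)  # ['BRCA1', 'EGFR']
print(only2)  # ['MYC']
print(both)   # ['TP53']
```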


Answer 1:


There is a bug in how list1_only is built: the final loop of the else branch appends elements from the wrong list.

In the section:

for i in range(0,len(list1)):
    if list1[i] not in list2:
        list1_only.append(list2[i])

the last line should be:

        list1_only.append(list1[i])
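
As a side note, the comparison of the two list lengths does not appear to be needed, since the same membership tests work whichever list is longer. A minimal sketch of the corrected logic with that branch removed could look like this; the function name split_lists is hypothetical and not part of the original program, and list1 and list2 are assumed to be plain Python lists of strings.

```python
def split_lists(list1, list2):
    """Return (list1_only, list2_only, intersection), keeping input order."""
    intersection, list1_only, list2_only = [], [], []
    # Walk list1 once: each element is either shared or unique to list1.
    for gene in list1:
        if gene in list2:
            intersection.append(gene)
        else:
            list1_only.append(gene)  # append from list1, not list2
    # Walk list2 once: keep only the elements that never appear in list1.
    for gene in list2:
        if gene not in list1:
            list2_only.append(gene)
    return list1_only, list2_only, intersection
```

Iterating over the elements directly, rather than over range(0, len(...)), also removes the index variable that made the original mix-up possible.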



Answer 2:


You might also want to check out this handy website:

http://jura.wi.mit.edu/bioc/tools/compare.php



Source: https://stackoverflow.com/questions/27006622/getting-intersection-of-two-lists-in-python
