Python: Multiple Consensus sequences

拟墨画扇 提交于 2019-12-11 04:42:35

问题


starting from a list of dna sequences, I must have in return all the possible consensus (the resulting sequence with the highest nucleotide frequency in each position) sequences. If in some positions the nucleotides have the same highest frequency, I must obtain all possible combinations with the highest frequency. I also must have in return the profile matrix ( a matrix with the frequencies of each nucleotide for each sequence).

This is my code so far (but it returns only one consensus sequence):

seqList = ['TTCAAGCT','TGGCAACT','TTGGATCT','TAGCAACC','TTGGAACT','ATGCCATT','ATGGCACT']
n = len(seqList[0])
profile = { 'T':[0]*n,'G':[0]*n ,'C':[0]*n,'A':[0]*n }

for seq in seqList:

    for i, char in enumerate(seq):
        profile[char][i] += 1



consensus = ""
for i in range(n):
    max_count = 0
    max_nt = 'x'
    for nt in "ACGT":
        if profile[nt][i] > max_count:
            max_count = profile[nt][i]
            max_nt = nt
    consensus += max_nt
print(consensus)
for key, value in profile.items():
     print(key,':', " ".join([str(x) for x in value] ))

TTGCAACT
C : 0 0 1 3 2 0 6 1
A : 2 1 0 1 5 5 0 0
G : 0 1 6 3 0 1 0 0
T : 5 5 0 0 0 1 1 6

(As you can see, in position four, C and G have the same highest score, it means I must obtain two consensus sequences)

Is it possible to modify this code to obtain all the possible sequences, or could you explain me the logic (the pseudocode) how to obtain the right result?

Thank you very much in advance!


回答1:


I'm sure there are better ways but this is a simple one:

bestseqs = [[]]
for i in range(n):
    d = {N:profile[N][i] for N in ['T','G','C','A']}
    m = max(d.values())
    l = [N for N in ['T','G','C','A'] if d[N] == m]
    bestseqs = [ s+[N] for N in l for s in bestseqs ]

for s in bestseqs:
    print(''.join(s))

# output:
ATGGAACT
ATGCAACT


来源:https://stackoverflow.com/questions/38586800/python-multiple-consensus-sequences

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!