Translation DNA to Protein

别来无恙 posted on 2019-12-03 09:22:02

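Your problem stems from the line

if cds[n:n+3] in codontable == True:

Python chains comparison operators, so this is evaluated as (cds[n:n+3] in codontable) and (codontable == True). A dictionary never compares equal to True, so the condition is always False and you never append to proteinsequence. Just remove the == True portion, like so:

if cds[n:n+3] in codontable: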

and you will get the protein sequence. Also, make sure to return proteinsequence in translate_dna().
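A quick way to check both forms side by side is the sketch below. The two-entry codontable is a made-up stand-in, not your real table, and the variable names chained and plain are only for illustration.

```python
# Hypothetical two-entry table, only for this demonstration
codontable = {'ATG': 'M', 'GCC': 'A'}
codon = 'ATG'

# `in` and `==` are both comparison operators, so Python chains them:
# (codon in codontable) and (codontable == True)
chained = codon in codontable == True

# Plain membership test, which is what was intended
plain = codon in codontable

print(chained)  # False
print(plain)    # True
```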

There is one more problem in your code: stop = sequencestart.find('TAA') ignores the open reading frame, so it can match a TAA that straddles two codons instead of a real in-frame stop codon. In the code below I split the sequence into triplets and use itertools.takewhile to handle that, but it can be done with loops as well:

from itertools import takewhile

def translate_dna(sequence, codontable, stop_codons=('TAA', 'TGA', 'TAG')):
    start = sequence.find('ATG')
    if start == -1:
        # No start codon, so there is nothing to translate
        return ''

    # Take the sequence from the first start codon
    trimmed_sequence = sequence[start:]

    # Split it into triplets
    codons = [trimmed_sequence[i:i+3] for i in range(0, len(trimmed_sequence), 3)]

    # Take all complete codons up to the first in-frame stop codon
    coding_sequence = takewhile(lambda x: x not in stop_codons and len(x) == 3, codons)

    # Translate and join into a string
    protein_sequence = ''.join([codontable[codon] for codon in coding_sequence])

    # The trailing "_" marks the stop codon; this assumes the sequence always contains one
    return "{0}_".format(protein_sequence)
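For comparison, here is a rough sketch of the loop-based variant mentioned above. The name translate_dna_loop and the two-entry codontable are hypothetical, chosen only so the example runs on its own. Unlike the version above it does not append "_" and returns an empty string when there is no start codon.

```python
def translate_dna_loop(sequence, codontable, stop_codons=('TAA', 'TGA', 'TAG')):
    start = sequence.find('ATG')
    if start == -1:
        # No start codon found
        return ''

    protein = []
    # Step through the sequence three bases at a time, staying in frame
    # with the start codon; the range bound skips a trailing partial codon
    for i in range(start, len(sequence) - 2, 3):
        codon = sequence[i:i+3]
        if codon in stop_codons:
            break
        # Raises KeyError for codons missing from the table
        protein.append(codontable[codon])
    return ''.join(protein)


# Hypothetical minimal table covering only the codons used below
codontable = {'ATG': 'M', 'ATA': 'I'}

# 'TAA' occurs at index 4, but it straddles the codons ATA and ATG,
# so it is not an in-frame stop; the real stop is the final TGA
sequence = 'ATGATAATGTGA'
print(sequence.find('TAA'))                      # 4
print(translate_dna_loop(sequence, codontable))  # MIM
```

Slicing at the position returned by find('TAA') would cut this sequence in the middle of the second codon, which is the reading-frame problem described above.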