AttributeError: 'list' object has no attribute 'SeqRecord' - while trying to slice multiple sequences with Biopython>SeqIO from fasta file

Submitted by 不羁岁月 on 2021-02-11 17:52:31

Question


I am trying to generate N- and C-terminal slices of varying length (1, 2, 3, 4, 5, 6, 7). Before I get there, though, I am having problems just reading in my FASTA file. I was following the 'Random subsequences' tutorial from https://biopython.org/wiki/SeqIO, but in that example there is only one sequence, so maybe that is where I went wrong. Below are the code, the example sequences, and my errors. Any help would be much appreciated. I am clearly out of my depth. It looks like others have come across a lot of similar problems, so I imagine I am doing something stupid because I do not fully understand the SeqRecord structure. Thanks!

Two example sequences in my file domains.fasta:

>GA98
TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTLKDEIKTFTVTE
>GB98
TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTYKDEIKTFTVTE

My code that is not working:

from Bio import SeqIO
from Bio.SeqRecord import SeqRecord


# Load data:
domains = list(SeqIO.parse("domains.fa",'fasta'))

#set up receiving arrays
home=[]
num=1

#slice data
for i in range(0, 6):
    num = num+1
    domain = domains
    seq_n = domains.seq[0:num]
    seq_c = domains.seq[len(domain)-num:len(domain)]
    name = domains.id
    record_d = SeqRecord(domain,'%s' % (name), '', '')
    home.append(record_d)
    record_n = SeqRecord(seq_n,'%s_n_%i' % (name,num), '', '')
    home.append(record_n)
    record_c = SeqRecord(seq_c,'%s_c_%i' % (name,num), '', '')
    home.append(record_c)
SeqIO.write(home, "domains_variants.fasta", "fasta")

The error I get is:

Traceback (most recent call last):
  File "~/fasta_nc_sequences.py", line 20, in <module>
    seq_n = domains.seq[0:num]
AttributeError: 'list' object has no attribute 'SeqRecord'

When I print out 'domains = list(SeqIO.parse("domains.fa",'fasta'))' I get this:

[SeqRecord(seq=Seq('TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTLKDEIKTFTVTE', SingleLetterAlphabet()), id='GA98', name='GA98', description='GA98', dbxrefs=[]), SeqRecord(seq=Seq('TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTYKDEIKTFTVTE', SingleLetterAlphabet()), id='GB98', name='GB98', description='GB98', dbxrefs=[])]

I am not sure why I cannot access what is inside the SeqRecord. Maybe it is because I wrapped SeqIO.parse in a list; before I did that, I was getting a different error:

AttributeError: 'generator' object has no attribute 'seq'

Answer 1:


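I was working one level too low in my for loop, so I was not iterating through the sequences: the loop ran over the slice lengths but never over the records in the list. There was also a problem accessing the C-terminal sequence, because the slice used the length of the list instead of the length of the sequence. Now the code works.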

from Bio import SeqIO
from Bio.SeqRecord import SeqRecord

# Load data: list() turns the SeqIO.parse generator into a list of SeqRecord objects
domains = list(SeqIO.parse("examples/data/domains.fa", "fasta"))

# Set up the receiving list
home = []

# Subset data: loop over each record, not over the list as a whole
for record in domains:
    name = record.id
    # Keep the full-length domain as the first entry for this record
    home.append(SeqRecord(record.seq, id=name, name="", description=""))
    for num in range(1, 7):
        seq_n = record.seq[:num]   # first num residues (N terminus)
        seq_c = record.seq[-num:]  # last num residues (C terminus)
        home.append(SeqRecord(seq_n, id="%s_n_%i" % (name, num), name="", description=""))
        home.append(SeqRecord(seq_c, id="%s_c_%i" % (name, num), name="", description=""))
SeqIO.write(home, "domains_variants.fasta", "fasta")
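
The two fixes in this answer (iterating over the records rather than the list, and slicing relative to the sequence rather than the list) can be sketched without Biopython. This is only a minimal illustration, not the code from the answer: `Record` and `terminal_slices` are hypothetical stand-ins introduced here, and plain strings are used in place of `Seq` objects, which accept the same slice syntax.

```python
from collections import namedtuple

# Hypothetical stand-in for Bio.SeqRecord.SeqRecord; only .id and .seq are modelled.
Record = namedtuple("Record", ["id", "seq"])

records = [
    Record("GA98", "TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTLKDEIKTFTVTE"),
    Record("GB98", "TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTYKDEIKTFTVTE"),
]

# A list has no .seq attribute; only its elements do.
try:
    records.seq
except AttributeError as err:
    list_error = str(err)


def terminal_slices(seq, max_len):
    """Return (n_terminus, c_terminus) pairs for lengths 1..max_len."""
    return [(seq[:n], seq[-n:]) for n in range(1, max_len + 1)]


variants = {}
for record in records:  # iterate over the records, not the list itself
    for n_term, c_term in terminal_slices(record.seq, 6):
        variants["%s_n_%i" % (record.id, len(n_term))] = n_term
        variants["%s_c_%i" % (record.id, len(c_term))] = c_term

print(list_error)
print(variants["GA98_n_3"], variants["GA98_c_3"])
```

Negative slicing (`seq[-n:]`) avoids the `len(...)` arithmetic that went wrong in the question, but it only works for n of at least 1, since `seq[-0:]` returns the whole sequence.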


Source: https://stackoverflow.com/questions/60144261/attributeerror-list-object-has-no-attribute-seqrecord-while-trying-to-sli
