Sort rps-blast results by position of the hit

Submitted by 夏天 on 2019-12-24 14:12:56

Question


I'm just getting started with Biopython and have a question about parsing results. I followed a tutorial, and this is the code I used:

from Bio.Blast import NCBIXML

# Open the rps-blast XML output and print every HSP of every hit.
with open("/Users/jcastrof/blast/pruebarpsb.xml") as handle:
    for record in NCBIXML.parse(handle):
        if record.alignments:
            print("Query: %s..." % record.query[:60])
            for align in record.alignments:
                for hsp in align.hsps:
                    print(" %s HSP,e=%f, from position %i to %i"
                          % (align.hit_id, hsp.expect, hsp.query_start, hsp.query_end))

Part of the result obtained is:

 gnl|CDD|225858 HSP,e=0.000000, from position 32 to 1118
 gnl|CDD|225858 HSP,e=0.000000, from position 1775 to 2671
 gnl|CDD|214836 HSP,e=0.000000, from position 37 to 458
 gnl|CDD|214836 HSP,e=0.000000, from position 1775 to 2192
 gnl|CDD|214838 HSP,e=0.000000, from position 567 to 850

And what I want to do is to sort that result by position of the hit (Hsp_hit-from), like this:

 gnl|CDD|225858 HSP,e=0.000000, from position 32 to 1118
 gnl|CDD|214836 HSP,e=0.000000, from position 37 to 458
 gnl|CDD|214838 HSP,e=0.000000, from position 567 to 850
 gnl|CDD|225858 HSP,e=0.000000, from position 1775 to 2671
 gnl|CDD|214836 HSP,e=0.000000, from position 1775 to 2192

The file I am parsing is rps-blast output in *.xml format. Any suggestions on how to proceed?

Thanks!


Answer 1:


The HSPs list is just a Python list, and can be sorted as usual. Try:

align.hsps.sort(key=lambda hsp: hsp.query_start)

However, you are dealing with a nested list (each match has a list of HSPs), and you want to sort over all of them. Here making your own list might be best - something like this:

for record in ...:
    print("Query: %s..." % record.query[:60])
    # The outer loop (alignments) must come first in the generator expression,
    # otherwise align is not yet defined when align.hsps is evaluated.
    hits = sorted((hsp.query_start, hsp.query_end, hsp.expect, align.hit_id)
                  for align in record.alignments for hsp in align.hsps)
    for q_start, q_end, expect, hit_id in hits:
        print(" %s HSP,e=%f, from position %i to %i"
              % (hit_id, expect, q_start, q_end))
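
One detail worth noting: sorting plain tuples breaks ties on query_start by comparing query_end next, so the two HSPs starting at 1775 would come out as 2192 before 2671, which is not the order shown in the question. A minimal sketch of an alternative is below, sorting on query_start only and relying on Python's stable sort to keep tied HSPs in their original order. The Hsp and Alignment namedtuples and the hsps_by_query_start helper are hypothetical stand-ins so the example runs without Biopython or the XML file; the e-values are placeholders, since the thread only shows them rounded to 0.000000.

```python
from collections import namedtuple

# Hypothetical stand-ins for Biopython's alignment and HSP objects; only the
# attributes used in this thread are modelled.
Hsp = namedtuple("Hsp", "query_start query_end expect")
Alignment = namedtuple("Alignment", "hit_id hsps")


def hsps_by_query_start(alignments):
    """Return (hit_id, hsp) pairs from all alignments, ordered by query_start."""
    # Flatten the nested structure: one pair per HSP, across all hits.
    pairs = [(align.hit_id, hsp) for align in alignments for hsp in align.hsps]
    # sorted() is stable, so HSPs sharing a query_start keep their input order.
    return sorted(pairs, key=lambda pair: pair[1].query_start)


# Positions copied from the question's output; e-values are placeholders.
alignments = [
    Alignment("gnl|CDD|225858", [Hsp(32, 1118, 0.0), Hsp(1775, 2671, 0.0)]),
    Alignment("gnl|CDD|214836", [Hsp(37, 458, 0.0), Hsp(1775, 2192, 0.0)]),
    Alignment("gnl|CDD|214838", [Hsp(567, 850, 0.0)]),
]

for hit_id, hsp in hsps_by_query_start(alignments):
    print(" %s HSP,e=%f, from position %i to %i"
          % (hit_id, hsp.expect, hsp.query_start, hsp.query_end))
```

With real data, the same helper should work on record.alignments directly, since it only reads hit_id, hsps, and the three HSP attributes used above.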

Peter



Source: https://stackoverflow.com/questions/16070195/sort-rps-blast-results-by-position-of-the-hit
