Improving code design of DNA alignment degapping
This is a question regarding a more efficient code design: Assume three aligned DNA sequences (seq1, seq2 and seq3; they are each strings) that represent two genes (gene1 and gene2). Start and stop positions of these genes are known relative to the aligned DNA sequences. # Input align = {"seq1":"ATGCATGC", # In seq1, gene1 and gene2 are of equal length "seq2":"AT----GC", "seq3":"A--CA--C"} annos = {"seq1":{"gene1":[0,3], "gene2":[4,7]}, "seq2":{"gene1":[0,3], "gene2":[4,7]}, "seq3":{"gene1":[0,3], "gene2":[4,7]}} I wish to remove the gaps (i.e., dashes) from the alignment and maintain the