Consolidate similar patterns into single consensus pattern
Question

In the previous post I did not state my question clearly, so I would like to start a new topic here. I have the following items:

- a sorted list of 59,000 protein patterns (ranging from 3 characters, e.g. "FFK", to 152 characters long);
- some long protein sequences, which serve as my reference.

I am going to match these patterns against my reference and find the locations where the matches occur. (My friend helped write a script for that.)

    import sys
    import re
    from itertools import chain, izip
    #
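
To illustrate the matching step described above, here is a minimal sketch. It is not my friend's script, which is only partly shown here. It assumes every pattern is a literal string rather than a regular expression, and the function name `find_pattern_locations` and the toy data are hypothetical, chosen only for the example.

```python
import re


def find_pattern_locations(patterns, reference):
    """Return {pattern: [(start, end), ...]} for each pattern found in reference.

    Positions are 0-based and `end` is exclusive, like Python slices.
    """
    locations = {}
    for pattern in patterns:
        # Assumption: patterns are literal strings, so re.escape is applied.
        # The zero-width lookahead reports overlapping occurrences as well.
        regex = re.compile("(?=" + re.escape(pattern) + ")")
        hits = [(m.start(), m.start() + len(pattern))
                for m in regex.finditer(reference)]
        if hits:
            locations[pattern] = hits
    return locations


# Hypothetical toy data, not taken from the real pattern list or reference.
reference = "MFFKAFFKFFK"
patterns = ["FFK", "KAF", "WWW"]
locations = find_pattern_locations(patterns, reference)
```

With 59,000 patterns, compiling and scanning one pattern at a time like this may be slow on long references, so treat it as a way to pin down the expected output rather than as a tuned solution.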