How to add variable error to regex fuzzy search. Python

Submitted by 泄露秘密 on 2019-12-10 19:37:55

Question


import regex,re


sequence = 'aaaaaaaaaaaabbbbbbbbbbbbcccccccccccc' #being searched
query = 'aaabbbbbbbbbbbbccc' #100% coverage
query_1 = 'aaaabbbbbbbbcbbbcccc' #95% coverage
query_2 = 'aaabbbbcbbbbbcbccc' #90% coverage

threshold = .95
error = len(query_1) - (len(query_1)*threshold) #for query_1 errors must be <= 1

print regex.search(query_1 + '{e<={}}'.format(error),sequence).group(0)

I'm trying to add a parameter to a regex search so that it only matches if at least a certain percentage of the query is found in the sequence being searched.

For example, with a threshold of at least 95% coverage, it should match for query_1 but not for query_2.


Answer 1:


Using the regex module:

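import regex
sequence = 'aaaaaaaaaaaabbbbbbbbbbbbcccccccccccc' #being searched
query = 'aaabbbbbbbbbbbbccc' #100% coverage
query_1 = 'aaaabbbbbbbbcbbbcccc' #95% coverage
query_2 = 'aaabbbbcbbbbbcbccc' #90% coverage
threshold = 0.97 #after int() truncation this allows 0 errors for all three queries; 0.95 allows 1 error for query_1
queries = (query, query_1, query_2)
for q in queries:
    #maximum number of errors allowed for a query of this length
    error = int(len(q) - (len(q)*threshold))
    #{e<=N} accepts a match with at most N errors (insertions, deletions, substitutions)
    m = regex.search(r'(%s){e<=%d}'%(q,error), sequence)
    print('match' if m else 'nomatch')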
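
The snippet in the question fails before any search runs, for two reasons: `str.format` treats the single braces in `'{e<={}}'` as replacement fields, and the computed error is a float rather than an integer. The following is a minimal sketch of one way to build the pattern string with `str.format` instead of `%` formatting. The helper names `max_errors` and `fuzzy_pattern` are hypothetical, and the small epsilon is an assumption added to guard against floating-point noise. Only the standard library is used here; the resulting string is meant for the third-party `regex` module.

```python
import math
import re


def max_errors(query_len, threshold):
    """Largest whole number of errors that keeps coverage >= threshold."""
    # The epsilon guards against float noise, e.g. 20 * (1 - 0.90)
    # evaluating to slightly less than 2.
    return math.floor(query_len * (1 - threshold) + 1e-9)


def fuzzy_pattern(query, threshold):
    """Build a fuzzy pattern string for the third-party `regex` module."""
    budget = max_errors(len(query), threshold)
    # Doubled braces are literal braces in str.format;
    # the two empty {} fields take the escaped query and the error budget.
    return "(?:{}){{e<={}}}".format(re.escape(query), budget)


query_1 = 'aaaabbbbbbbbcbbbcccc'
print(fuzzy_pattern(query_1, 0.95))
```

The returned string can then be passed to the search, for example `regex.search(fuzzy_pattern(q, 0.95), sequence)`. Escaping the query is not needed for the letter-only queries above, but it keeps the pattern valid if a query ever contains regex metacharacters.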


Source: https://stackoverflow.com/questions/17436761/how-to-add-variable-error-to-regex-fuzzy-search-python
