Search for string allowing for one mismatch in any location of the string


闹比i 2020-11-30 02:45

I am working with DNA sequences of length 25 (see examples below). I have a list of 230,000 and need to look for each sequence in the entire genome (toxoplasma gondii parasi

13 answers

天涯浪人 (original poster)

2020-11-30 02:59

This hints at the longest common subsequence problem. The problem with a plain string-similarity score here is that you need to test each sequence against one long continuous string (the genome); so if you compare one of your 25-character sequences to that continuous string, you'll get a very low similarity.

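If you compute the longest common subsequence between one of your 25-character sequences and the continuous string, you'll know the sequence is in the string when the length of the common part equals the length of the sequence (the match has to be contiguous for this to mean the sequence actually occurs there).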
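A minimal sketch of that length comparison, in Python. One assumption to flag: the classic longest common subsequence allows gaps, so this sketch swaps in the contiguous variant (longest common substring), since only a gap-free match of full length shows the sequence is really present. The function names and the toy sequences are made up for illustration.

```python
def longest_common_substring_len(query: str, text: str) -> int:
    """Length of the longest contiguous run shared by query and text."""
    # prev[j] = length of the common run ending at query[j-1] and the
    # previous character of text (standard rolling-row DP).
    prev = [0] * (len(query) + 1)
    best = 0
    for ch in text:
        curr = [0] * (len(query) + 1)
        for j, q in enumerate(query, start=1):
            if q == ch:
                # Extend the run that ended one character earlier in both.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def occurs_exactly(query: str, text: str) -> bool:
    """True when the whole query appears in text without gaps."""
    return longest_common_substring_len(query, text) == len(query)


# Toy data (hypothetical), standing in for a 25-mer and the genome.
genome = "ACGTACGTTAGC"
print(occurs_exactly("CGTTA", genome))
print(occurs_exactly("CGTAA", genome))
```

This costs time proportional to len(query) * len(text) for every query, so with 230,000 sequences against a whole genome it is probably too slow as written. Also, a shorter common length only tells you the sequence is not present exactly; it does not by itself tell you whether it is present with one mismatch.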
