Fastest Way To Find Mismatch Positions Between Two Strings of the Same Length

后悔当初 2020-12-28 22:08

I have millions of pairs of strings, each pair of the same length, and for every pair I want to find the positions at which the two strings differ.

For example for each $str1 a

9 answers
  •  独厮守ぢ
    2020-12-28 22:55

    Those look like gene sequences. If the strings are all 8 characters long and the alphabet of possible codes is ( A, C, G, T ), you might consider transforming the data before processing it. Four letters in eight positions give only 4^8 = 65536 possible strings, so each string fits in a 16-bit integer and you can specialise your implementation around that.

    For example, you could write a function that takes an 8-character string and maps it to an integer. Memoize it so that repeated strings are cheap to convert. Next, write a comparison function that, given two integers, tells you how they differ. You would call this in a suitable looping construct, guarded by a numeric equality test like unless ( $a == $b ) before calling the comparison: a short circuit for identical codes, if you will.
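The idea above can be sketched roughly as follows. This is a minimal illustration in Python rather than Perl, and it is not the answerer's own code: the 2-bit code assigned to each base, the names `encode` and `mismatch_positions`, and the 0-based position numbering are all assumptions made for the example. It packs each 8-character string into a 16-bit integer, memoizes the conversion, and XORs two codes to find the positions that differ.

```python
from functools import lru_cache

# Assumed 2-bit code per base; any fixed assignment would work.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
LENGTH = 8  # assumed fixed string length


@lru_cache(maxsize=None)  # memoize: at most 4**8 = 65536 distinct inputs
def encode(seq):
    """Pack an 8-character A/C/G/T string into a 16-bit integer."""
    if len(seq) != LENGTH:
        raise ValueError(f"expected {LENGTH} characters, got {len(seq)}")
    value = 0
    for ch in seq:
        value = (value << 2) | BASE_CODE[ch]  # KeyError on any other letter
    return value


def mismatch_positions(a, b):
    """Return the 0-based positions at which two encoded strings differ."""
    if a == b:  # short circuit for identical codes
        return []
    diff = a ^ b  # a non-zero 2-bit group marks a mismatching position
    positions = []
    for i in range(LENGTH):
        shift = 2 * (LENGTH - 1 - i)  # position 0 sits in the top two bits
        if (diff >> shift) & 0b11:
            positions.append(i)
    return positions


pairs = [("ACGTACGT", "ACGTACGT"), ("ACGTACGT", "ACCTACGA")]
for s1, s2 in pairs:
    print(s1, s2, mismatch_positions(encode(s1), encode(s2)))
```

For the two sample pairs this prints an empty list and then `[2, 7]`. Because the XOR of two codes is itself one of only 65536 values, the loop in `mismatch_positions` could also be replaced by a precomputed lookup table indexed by `a ^ b`, which is one way to read "specialise your implementation".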
