Count letter differences of two strings

后端未结


This is the behaviour I want:

a: IGADKYFHARGNYDAA
c: KGADKYFHARGNYEAA
2 difference(s).


11 answers

醉酒成梦

2020-12-09 17:18

Python has the excellent difflib module, which should provide the needed functionality.

Here's sample usage from the documentation:

>>> import difflib  # available since Python 2.1
>>> s = difflib.SequenceMatcher(lambda x: x == " ",
...                     "private Thread currentThread;",
...                     "private volatile Thread currentThread;")
>>> for block in s.get_matching_blocks():
...     print("a[%d] and b[%d] match for %d elements" % block)
a[0] and b[0] match for 8 elements
a[8] and b[17] match for 21 elements
a[29] and b[38] match for 0 elements
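
The documentation sample above only lists the matching blocks. To get an actual count out of SequenceMatcher, one option is to add up the sizes of the non-equal opcodes. This is a minimal sketch and not part of the original answer: the helper name count_differences is hypothetical, and the result is an alignment-based count rather than a strict position-by-position comparison.

```python
import difflib


def count_differences(a, b):
    """Count differing characters using SequenceMatcher opcodes."""
    matcher = difflib.SequenceMatcher(None, a, b)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replaced, deleted, or inserted run counts as its longer side.
            total += max(i2 - i1, j2 - j1)
    return total


a = "IGADKYFHARGNYDAA"
c = "KGADKYFHARGNYEAA"
print("%d difference(s)." % count_differences(a, c))
```

For the two strings from the question this reports the two substitutions; for strings of different lengths, inserted or deleted characters are counted as well.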


半阙折子戏

2020-12-09 17:20

def diff_letters(a, b):
    # Count positions where the letters differ; zip stops at the shorter string.
    return sum(x != y for x, y in zip(a, b))


说谎

2020-12-09 17:26
I haven't seen anyone use the reduce function, so I'll include a piece of code I've been using:
```
from functools import reduce  # required on Python 3, where reduce is no longer a builtin
reduce(lambda x, y: x + 1 if y[0] != y[1] else x, zip(source, target), 0)
```
which will give you the number of differing characters in source and target
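
As a usage sketch, the same fold can be wrapped in a function and run on the strings from the question. The function name count_with_reduce is hypothetical and not from the original answer; because zip stops at the shorter string, any length difference is not counted here.

```python
from functools import reduce


def count_with_reduce(source, target):
    # Fold over the character pairs, adding 1 to the accumulator per mismatch
    # (a bool adds as 0 or 1).
    return reduce(lambda acc, pair: acc + (pair[0] != pair[1]),
                  zip(source, target), 0)


source = "IGADKYFHARGNYDAA"
target = "KGADKYFHARGNYEAA"
print("%d difference(s)." % count_with_reduce(source, target))
```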
失恋的感觉

2020-12-09 17:27
With difflib.ndiff you can do this in a one-liner that's still somewhat comprehensible:
```
>>> import difflib
>>> a = 'IGADKYFHARGNYDAA'
>>> c = 'KGADKYFHARGNYEAA'
>>> sum(i[0] != ' ' for i in difflib.ndiff(a, c)) // 2
2
```
(sum works here because True == 1 and False == 0.)

The following makes it clear what's happening and why the division by 2 is needed: each substitution shows up twice, once as a - line and once as a + line.
```
>>> [i for i in difflib.ndiff(a,c)]
['- I',
 '+ K',
 '  G',
 '  A',
 '  D',
 '  K',
 '  Y',
 '  F',
 '  H',
 '  A',
 '  R',
 '  G',
 '  N',
 '  Y',
 '- D',
 '+ E',
 '  A',
 '  A']
```
ndiff also accepts strings of different lengths, but an unpaired insertion or deletion then shows up only once, so halving the count is no longer exact.
佛祖请我去吃肉

2020-12-09 17:30
The Theory
1. Iterate over both strings simultaneously and compare the characters.
2. Build the result as a new string, appending a | character for a match or a space for a mismatch. Also increment an integer counter, starting from zero, for each differing character.
3. Output the result.
Implementation

You can use the built-in zip function to iterate over both strings simultaneously. On Python 2, itertools.izip is a lazy alternative that performs a little better on huge inputs; on Python 3, zip is already lazy and izip no longer exists. If the strings are not the same length, iteration covers only the length of the shorter one. In that case, you can fill up the rest of the result with the no-match character.
```
def compare(string1, string2, no_match_c=' ', match_c='|'):
    # Make string1 the shorter of the two so the tail can be padded below.
    if len(string2) < len(string1):
        string1, string2 = string2, string1
    result = ''
    n_diff = 0
    # zip stops at the end of the shorter string.
    for c1, c2 in zip(string1, string2):
        if c1 == c2:
            result += match_c
        else:
            result += no_match_c
            n_diff += 1
    # Every extra character of the longer string counts as a difference.
    delta = len(string2) - len(string1)
    result += delta * no_match_c
    n_diff += delta
    return (result, n_diff)
```
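
As an alternative to padding by hand, the standard library's itertools.zip_longest (named izip_longest on Python 2) pads the shorter string during iteration. This is a minimal sketch assuming Python 3; the function name compare_padded is hypothetical and not part of the original answer.

```python
from itertools import zip_longest


def compare_padded(string1, string2, no_match_c=' ', match_c='|'):
    """Return (marker string, number of differences) for two strings."""
    markers = []
    n_diff = 0
    # fillvalue=None never equals a character, so padding counts as a mismatch.
    for c1, c2 in zip_longest(string1, string2, fillvalue=None):
        if c1 == c2:
            markers.append(match_c)
        else:
            markers.append(no_match_c)
            n_diff += 1
    return ''.join(markers), n_diff


result, n_diff = compare_padded('IGADKYFHARGNYDAA', 'KGADKYFHARGNYEAAXX',
                                no_match_c='_')
print(result)
print("%d difference(s)." % n_diff)
```

Collecting the markers in a list and joining once avoids repeated string concatenation, and no swap of the two arguments is needed because the padding handles either order.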
Example

Here's a simple test, with slightly different inputs than in your example above. Note that I have used an underscore to indicate non-matching characters, to better demonstrate how the resulting string is expanded to the size of the longer string.
```
def main():
    string1 = 'IGADKYFHARGNYDAA AWOOH'
    string2 = 'KGADKYFHARGNYEAA  W'
    result, n_diff = compare(string1, string2, no_match_c='_')

    # print() with a single argument works on both Python 2 and Python 3.
    print("%d difference(s)." % n_diff)
    print(string1)
    print(result)
    print(string2)

main()
```
Output:
```
niklas@saphire:~/Desktop$ python foo.py 
6 difference(s).
IGADKYFHARGNYDAA AWOOH
_||||||||||||_|||_|___
KGADKYFHARGNYEAA  W
```

执笔经年

2020-12-09 17:32

a = "IGADKYFHARGNYDAA"
b = "KGADKYFHARGNYEAAXXX"
match_pattern = zip(a, b)  # pairs of letters at each index
difference = sum(1 for e in match_pattern if e[0] != e[1])  # count pairs whose letters differ
difference = difference + abs(len(a) - len(b))  # if the strings differ in length, add the length difference
