I\'m trying to find some sort of a good, fuzzy string matching algorithm. Direct matching doesn\'t work for me — this isn\'t too good because unless my strings are a 100% si
Take a look at this python library, which SeatGeek open-sourced yesterday. Obviously most of these kinds of problems are very context dependent, but it might help you.
from fuzzywuzzy import fuzz
s1 = "the quick brown fox"
s2 = "the quick brown fox jumped over the lazy dog"
s3 = "the fast fox jumped over the hard-working dog"
fuzz.partial_ratio(s1, s2)
> 100
fuzz.token_set_ratio(s2, s3)
> 73
SeatGeek website
and Github repo