Levenshtein distance: how to better handle words swapping positions?
Question

I've had some success comparing strings using the PHP levenshtein function. However, when two strings contain substrings that have swapped positions, the algorithm counts those as entirely new substrings.

For example:

    levenshtein("The quick brown fox", "brown quick The fox"); // 10 differences

is treated as having less in common than:

    levenshtein("The quick brown fox", "The quiet swine flu"); // 9 differences

I'd prefer an algorithm that recognizes the first two as more similar. How could