String similarity in PHP: a Levenshtein-like function for long strings

闹比i 2020-12-09 12:56

The levenshtein function in PHP only works on strings with a maximum length of 255 characters. What are good alternatives for computing a similarity score between sentences in PHP?

3 answers
  •  情歌与酒
    2020-12-09 13:31

    The Levenshtein algorithm has a time complexity of O(n*m), where n and m are the lengths of the two input strings. That is expensive: comparing two 10,000-character strings already means on the order of 100 million cell updates, so computing the distance for long strings will take a long time.
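    If the length limit is the only obstacle, the O(n*m) dynamic program can be written by hand. The following is a minimal sketch, not a drop-in replacement for the built-in: levenshtein_long and levenshtein_similarity are hypothetical helper names, the comparison is byte-based (so multibyte UTF-8 characters count as several edits), and all edit costs are assumed to be 1. It keeps only two rows of the table, so memory stays proportional to the shorter dimension while time remains O(n*m).

```php
<?php
// Hypothetical helper (not a PHP built-in): Levenshtein distance with no
// length limit, using two rows of the dynamic-programming table.
function levenshtein_long(string $a, string $b): int
{
    $n = strlen($a);
    $m = strlen($b);
    if ($n === 0) {
        return $m;
    }
    if ($m === 0) {
        return $n;
    }

    // $prev[$j] = distance between the first $i-1 bytes of $a
    // and the first $j bytes of $b.
    $prev = range(0, $m);
    for ($i = 1; $i <= $n; $i++) {
        $curr = [$i];
        for ($j = 1; $j <= $m; $j++) {
            $cost = ($a[$i - 1] === $b[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,         // deletion
                $curr[$j - 1] + 1,     // insertion
                $prev[$j - 1] + $cost  // substitution (or match)
            );
        }
        $prev = $curr;
    }
    return $prev[$m];
}

// Hypothetical helper: normalise the distance to a score in [0, 1],
// where 1.0 means the strings are identical.
function levenshtein_similarity(string $a, string $b): float
{
    $max = max(strlen($a), strlen($b));
    return $max === 0 ? 1.0 : 1.0 - levenshtein_long($a, $b) / $max;
}
```

    Because this runs in PHP userland rather than in C, it will be noticeably slower than the built-in on the inputs the built-in accepts. For sentences, splitting into words first and comparing word arrays is one way to shrink n and m.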

    For whole sentences, you might want to use a diff algorithm instead. See, for example: Highlight the difference between two strings in PHP

    That said, PHP also provides the similar_text function, which has an even worse complexity (O(max(n,m)**3)) but does accept longer strings.
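    As a short illustration of similar_text: it returns the number of matching characters and, through its optional third by-reference parameter, a similarity percentage. The wrapper name similarity_percent below is hypothetical and is only there to make the call easy to reuse.

```php
<?php
// Hypothetical wrapper around the built-in similar_text().
// Returns the similarity of the two strings as a percentage (0 to 100).
function similarity_percent(string $a, string $b): float
{
    // The third argument is filled in by reference with the percentage.
    similar_text($a, $b, $percent);
    return (float) $percent;
}

// Inputs longer than 255 characters are accepted.
$long1 = str_repeat('a', 300);
$long2 = str_repeat('a', 300);
$score = similarity_percent($long1, $long2);
```

    Keep the cubic worst case in mind: this is fine for sentences, but it is a poor fit for comparing whole documents.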
