How to modify Levenshtein algorithm, to know if it inserted, deleted, or substituted a character?

江枫思渺然 提交于 2019-12-23 12:25:29

问题


So I am trying to devise a spin off of the Levenshtein algorithm, where I keep track of what transformations I did in the string(inserted a, or substitute a for b).

Example:

Basically, say I am computing the edit distance of "bbd" and "bcd"

The edit distance will be 1 and the transformation will be "substitude b for c"

Question: How would I approach this problem since the implementations i've seen do not concern themselves with knowing what kind of operation it was but only the total cost?


回答1:


You can use this module - there is an editops function there, which returns a list with the operations needed to transform one string to another.

Example:

Levenshtein.editops("FBBDE", "BCDASD")
[('delete', 0, 0), ('replace', 2, 1), ('insert', 4, 3), ('insert', 4, 4), ('replace', 4, 5)]

From the docs:

Find sequence of edit operations transforming one string to another.

editops(source_string, destination_string) editops(edit_operations, source_length, destination_length)

The result is a list of triples (operation, spos, dpos), where operation is one of equal',replace', insert', ordelete'; spos and dpos are position of characters in the first (source) and the second (destination) strings. These are operations on signle characters. In fact the returned list doesn't contain the equal', but all the related functions accept both lists with and without equal's.



来源:https://stackoverflow.com/questions/24190003/how-to-modify-levenshtein-algorithm-to-know-if-it-inserted-deleted-or-substit

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!