Edit distance recursive algorithm — Skiena

执笔经年 2020-12-30 02:37

I'm reading The Algorithm Design Manual by Steven Skiena, and I'm on the dynamic programming chapter. He has some example code for edit distance and uses some functions w

6 answers
  •  执念已碎
    2020-12-30 02:51

    This is likely a non-issue for the OP by now, but I'll write down my understanding of the text.

    /**
     * Returns the cost of a substitution (match) operation:
     * 0 if the characters are equal, 1 otherwise.
     */
    int match(char c, char d)
    {
      if (c == d) return 0;
      else return 1;
    }
    
    /**
     * Returns the cost of an insert/delete operation
     * (assumed constant, whatever the character is).
     */
    int indel(char c)
    {
      return 1;
    }
    

    The edit distance is essentially the minimum number of modifications required to transform a given string into another reference string. The modifications, as you know, can be the following.

    1. Substitution (replace a single character)
    2. Insert (insert a single character into the string)
    3. Delete (delete a single character from the string)

    Now, quoting the book:

    Properly posing the question of string similarity requires us to set the cost of each of these string transform operations. Assigning each operation an equal cost of 1 defines the edit distance between two strings.

    So that establishes that each of the three modifications has the same constant cost of 1. Note that this is a weight counted toward the distance, not a running time.
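    To make the unit costs concrete, here is a minimal sketch that totals the cost of an edit script. The function name `script_cost` and the one-letter encoding ('M' match, 'S' substitute, 'I' insert, 'D' delete) are assumptions made for this illustration; they are not from the book.

```c
#include <stddef.h>

/*
 * Hypothetical helper: total cost of an edit script under unit costs.
 * Encoding (an assumption for this sketch):
 *   'M' = characters match (cost 0)
 *   'S' = substitute, 'I' = insert, 'D' = delete (cost 1 each)
 */
int script_cost(const char *ops)
{
    int total = 0;
    for (size_t k = 0; ops[k] != '\0'; k++) {
        if (ops[k] != 'M') {
            total += 1;  /* every real modification costs exactly 1 */
        }
    }
    return total;
}
```

    For example, one way to turn "kitten" into "sitting" is to substitute k with s, keep "itt", substitute e with i, keep n, and insert g. Encoded as "SMMMSMI", that script costs 3.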

    But how do we know where to modify?

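    We don't know up front. Instead, we work backwards from the end of the strings, one character at a time, and try all three options for the last position:

    1. Substitution (or a free match): pay match() for the two last characters, then solve the rest with both strings one character shorter
    2. Delete: pay indel() for the last character of the given string, then solve the rest with the given string one character shorter
    3. Insert: pay indel() for the last character of the reference string, then solve the rest with the reference string one character shorter

    Finally, once we have these three totals, we return the minimum of them.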
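    The three options can be sketched as a plain recursion. This is my own minimal version, not the book's listing: the names `edit_distance`, `edit_distance_prefix` and `min3` are assumptions, it indexes strings from 0 (so the last character of a prefix of length i is s[i - 1]), and it has no memoization, so it takes exponential time and only suits short strings.

```c
#include <string.h>

/* Unit costs, mirroring the helper functions in the answer. */
static int match(char c, char d) { return (c == d) ? 0 : 1; }
static int indel(char c) { (void)c; return 1; }

static int min3(int a, int b, int c)
{
    int m = (a < b) ? a : b;
    return (m < c) ? m : c;
}

/*
 * Cost of turning the first i characters of s into the first j
 * characters of t. Plain recursion: exponential time, no table.
 */
static int edit_distance_prefix(const char *s, const char *t, int i, int j)
{
    if (i == 0) return j * indel(' ');  /* nothing left in s: insert j characters */
    if (j == 0) return i * indel(' ');  /* nothing left in t: delete i characters */

    /* Option 1: match or substitute the two last characters. */
    int sub_cost = edit_distance_prefix(s, t, i - 1, j - 1)
                 + match(s[i - 1], t[j - 1]);
    /* Option 2: delete the last character of s. */
    int del_cost = edit_distance_prefix(s, t, i - 1, j) + indel(s[i - 1]);
    /* Option 3: insert the last character of t. */
    int ins_cost = edit_distance_prefix(s, t, i, j - 1) + indel(t[j - 1]);

    return min3(sub_cost, del_cost, ins_cost);
}

/* Convenience wrapper (hypothetical name): distance between whole strings. */
int edit_distance(const char *s, const char *t)
{
    return edit_distance_prefix(s, t, (int)strlen(s), (int)strlen(t));
}
```

    Calling edit_distance("kitten", "sitting") returns 3 (two substitutions and one insert). Storing the result for each (i, j) pair in a table instead of recomputing it is what turns this recursion into the dynamic programming version the chapter builds toward.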
