How can I compute a difference between two strings?

前端 未结 2 747
难免孤独
难免孤独 2020-12-30 08:32

I want to create a function in Delphi that computes different levels of two strings. If two strings are equal (ignoring case), then it should return 0, but if they are not e

相关标签:
2条回答
  • 2020-12-30 08:51

    Fast and compact implementation.

    About 3 times as fast as smasher's implementation with normal strings. More than 100 times as fast if one of the strings is empty.

    Smasher's function is case insensitive though, which can be useful as well.

    function LevenshteinDistance(const s, t: string): integer;inline;
    var
      d   : array of array of integer;
      n, m, i, j : integer;
    begin
      n := length(s);
      m := length(t);
      if n = 0 then Exit(m);
      if m = 0 then Exit(n);
    
      SetLength(d, n + 1, m + 1);
      for i := 0 to n do d[i, 0] := i;
      for j := 0 to m do d[0, j] := j;
    
      for i := 1 to n do
        for j := 1 to m do
          d[i, j] := Min(Min(d[i-1, j]+1, d[i,j-1]+1), d[i-1,j-1]+Integer(s[i] <> t[j]));
    
      Result := d[n, m];
    end;
    

    Note: the inline directive reduces the execution time to less than 70% on my machine, but only for the win32 target platform. If you compile to 64bits (Delphi XE2), inlining actually makes it a tiny bit slower.

    0 讨论(0)
  • 2020-12-30 08:57

    What you want is known as the Levenshtein distance (the minimum number of edits to transform one string into the other, where an edit is either a character insertion, character deletion or character substitution). The wikipedia site has a pseudo code implementation.

    Delphi implementation:

    function LevenshteinDistance(String1 : String; String2 : String) : Integer;
    
    var
      Length1, Length2      : Integer;
      WorkMatrix            : array of array of Integer;
      I, J                  : Integer;
      Cost                  : Integer;
      Val1, Val2, Val3      : Integer;
    
    begin
    String1 := TCharacter.ToUpper (String1);
    String2 := TCharacter.ToUpper (String2);
    Length1 := Length (String1);
    Length2 := Length (String2);
    SetLength (WorkMatrix, Length1+1, Length2+1);
    for I := 0 to Length1 do
      WorkMatrix [I, 0] := I;
    for J := 0 to Length2 do
      WorkMatrix [0, J] := J;
    for I := 1 to Length1 do
      for J := 1 to Length2 do
        begin
        if (String1 [I] = String2 [J]) then
          Cost := 0
        else
          Cost := 1;
        Val1 := WorkMatrix [I-1, J] + 1;
        Val2 := WorkMatrix [I, J-1] + 1;
        Val3 := WorkMatrix[I-1, J-1] + Cost;
        if (Val1 < Val2) then
          if (Val1 < Val3) then
            WorkMatrix [I, J] := Val1
          else
            WorkMatrix [I, J] := Val3
        else
          if (Val2 < Val3) then
            WorkMatrix [I, J] := Val2
          else
            WorkMatrix [I, J] := Val3;
        end;
    Result := WorkMatrix [Length1, Length2];
    end;
    
    0 讨论(0)
提交回复
热议问题