Getting the closest string match

难免孤独 2020-11-22 10:57

I need a way to compare multiple strings to a test string and return the one that most closely resembles it:

TEST STRING: THE BROWN FOX JUMPED OVER THE RED COW

13 answers
  •  不要未来只要你来
    2020-11-22 11:16

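    Here is a sample in C# that computes the Levenshtein (edit) distance between the test string and each choice. The smaller the distance, the closer the match.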

    using System;

    public static class Program
    {
        public static void Main()
        {
            Console.WriteLine("Hello World " + LevenshteinDistance("Hello", "World"));
            Console.WriteLine("Choice A " + LevenshteinDistance("THE BROWN FOX JUMPED OVER THE RED COW", "THE RED COW JUMPED OVER THE GREEN CHICKEN"));
            Console.WriteLine("Choice B " + LevenshteinDistance("THE BROWN FOX JUMPED OVER THE RED COW", "THE RED COW JUMPED OVER THE RED COW"));
            Console.WriteLine("Choice C " + LevenshteinDistance("THE BROWN FOX JUMPED OVER THE RED COW", "THE RED FOX JUMPED OVER THE BROWN COW"));
        }

        // Returns the minimum number of single-character insertions,
        // deletions, and substitutions needed to turn a into b.
        public static int LevenshteinDistance(string a, string b)
        {
            var rowLen = a.Length;
            var colLen = b.Length;
            var maxLen = Math.Max(rowLen, colLen);

            // Step 1: if either string is empty, the distance is the other's length
            if (rowLen == 0 || colLen == 0)
            {
                return maxLen;
            }

            // Two vectors stand in for the full matrix: previous and current column
            var v0 = new int[rowLen + 1];
            var v1 = new int[rowLen + 1];

            // Step 2: initialize the first vector (v0[0] is already 0)
            for (var i = 1; i <= rowLen; i++)
            {
                v0[i] = i;
            }

            // Step 3: for each column (character of b)
            for (var j = 1; j <= colLen; j++)
            {
                // Set the 0'th element to the column number
                v1[0] = j;

                // Step 4: for each row (character of a)
                for (var i = 1; i <= rowLen; i++)
                {
                    // Step 5: substitution is free when the characters match
                    var cost = (a[i - 1] == b[j - 1]) ? 0 : 1;

                    // Step 6: take the cheapest of the three possible edits
                    v1[i] = Math.Min(v0[i] + 1, Math.Min(v1[i - 1] + 1, v0[i - 1] + cost));
                }

                // Swap the vectors so v0 holds the column just computed
                var vTmp = v0;
                v0 = v1;
                v1 = vTmp;
            }

            // Step 7: the vectors were swapped one last time at the end of the
            // last loop, so the result is in v0 rather than in v1
            return v0[rowLen];
        }
    }
    

    The output is:

    Hello World 4
    Choice A 15
    Choice B 6
    Choice C 8
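
    The sample prints a distance per choice but leaves picking the winner to the reader. A minimal sketch of that last step could look like the following. The names `ClosestMatch`, `Levenshtein` and `FindClosest` are made up for illustration and are not part of the original answer. The sketch assumes a case-sensitive comparison and that the first candidate wins on a tie.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the names are illustrative only.
public static class ClosestMatch
{
    // Two-row Levenshtein distance, using the same recurrence as the sample.
    public static int Levenshtein(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (var j = 0; j <= b.Length; j++)
        {
            prev[j] = j;
        }

        for (var i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (var j = 1; j <= b.Length; j++)
            {
                var cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(prev[j] + 1, curr[j - 1] + 1), prev[j - 1] + cost);
            }

            // Swap rows so prev always holds the row just computed.
            var tmp = prev;
            prev = curr;
            curr = tmp;
        }

        return prev[b.Length];
    }

    // Returns the candidate with the smallest distance to target.
    // Assumption: ties go to the first candidate; returns null if there are none.
    public static string FindClosest(string target, IEnumerable<string> candidates)
    {
        string best = null;
        var bestDistance = int.MaxValue;

        foreach (var candidate in candidates)
        {
            var distance = Levenshtein(target, candidate);
            if (distance < bestDistance)
            {
                bestDistance = distance;
                best = candidate;
            }
        }

        return best;
    }
}
```

    Calling `ClosestMatch.FindClosest` with the test string and the three choices from the sample should return Choice B, since its distance of 6 is the smallest in the output above.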
    
