We need to combine 3 columns in a database by concatenation. However, the 3 columns may contain overlapping parts and the parts should not be duplicated. For example,
<
Most of the other answers have focused on constant-factor optimizations, but it's also possible to do asymptotically better. Look at your algorithm: it's O(N^2). This seems like a problem that can be solved much faster than that!
Consider Knuth Morris Pratt. It keeps track of the maximum amount of substring we have matched so far throughout. That means it knows how much of S1 has been matched at the end of S2, and that's the value we're looking for! Just modify the algorithm to continue instead of returning when it matches the substring early on, and have it return the amount matched instead of 0 at the end.
That gives you an O(n) algorithm. Nice!
int OverlappedStringLength(string s1, string s2) {
//Trim s1 so it isn't longer than s2
if (s1.Length > s2.Length) s1 = s1.Substring(s1.Length - s2.Length);
int[] T = ComputeBackTrackTable(s2); //O(n)
int m = 0;
int i = 0;
while (m + i < s1.Length) {
if (s2[i] == s1[m + i]) {
i += 1;
//<-- removed the return case here, because |s1| <= |s2|
} else {
m += i - T[i];
if (i > 0) i = T[i];
}
}
return i; //<-- changed the return here to return characters matched
}
int[] ComputeBackTrackTable(string s) {
var T = new int[s.Length];
int cnd = 0;
T[0] = -1;
T[1] = 0;
int pos = 2;
while (pos < s.Length) {
if (s[pos - 1] == s[cnd]) {
T[pos] = cnd + 1;
pos += 1;
cnd += 1;
} else if (cnd > 0) {
cnd = T[cnd];
} else {
T[pos] = 0;
pos += 1;
}
}
return T;
}
OverlappedStringLength("abcdef", "defghl") returns 3