Longest Common Substring using Recursion and DP

Posted by 时光毁灭记忆、已成空白 on 2019-12-12 02:32:38

Question


I'm trying to find the Longest Common Substring of two strings using recursion and DP. Please note that I'm not referring to the Longest Common Subsequence. So, if the two strings were

String s1 = "abcdf"; String s2 = "bzcdf";
then the Longest Common Substring is "cdf" (not "bcdf").
In other words, the matching characters have to be contiguous.

I am trying to do this using recursion and backtracking. The problem is that with a recursion like the one below, the +1 is added in a frame higher up in the call stack, which cannot know whether the characters still to come are contiguous. So, going by the example above, "bcdf" would be the answer.

public class ThisIsLongestCommonSubsequence_NotSubstring {

    public static void main(String[] args) {
        String s1 = "abcdgh";
        String s2 = "abefgh";
        System.out.println(fun(s1, s1.length() - 1, s2, s2.length() - 1));
    }

    // Length of the longest common subsequence of s1[0..i] and s2[0..j].
    static int fun(String s1, int i, String s2, int j) {
        if (i == -1 || j == -1)
            return 0;

        int ret = 0;
        if (s1.charAt(i) == s2.charAt(j))
            ret = fun(s1, i - 1, s2, j - 1) + 1;
        else
            ret = max(fun(s1, i - 1, s2, j), fun(s1, i, s2, j - 1));

        return ret;
    }

    static int max(int a, int b) {
        return a > b ? a : b;
    }
}

For now, the code below is what I have come up with. Note how I reset the count to 0 every time I find a mismatch. I keep track of the number of matching characters in a parameter called count, and record the highest value reached at any point in a variable called maxcount.

public class LongestContinuousSubstringGlobalvariable {

    static int maxcount = 0;

    public static void main(String[] args) {
        String s1 = "abcdghijl";
        String s2 = "abefghijk";

        fun(s1, s2, s1.length() - 1, s2.length() - 1, 0);
        System.out.println("maxcount == " + maxcount);
    }

    static void fun(String s1, String s2, int i, int j, int count) {
        if (i == -1 || j == -1)
            return;

        if (s1.charAt(i) == s2.charAt(j)) {
            // match: extend the current run along the diagonal
            if (count + 1 > maxcount)
                maxcount = count + 1;
            fun(s1, s2, i - 1, j - 1, count + 1);
        } else {
            // mismatch: restart the count on both neighbouring subproblems
            fun(s1, s2, i - 1, j, 0);
            fun(s1, s2, i, j - 1, 0);
        }
    }
}

This works for the inputs above. However, there are a couple of things I don't like about my code:

  1. The use of a global variable (static int maxcount) to compare across frames.
  2. I don't think this is real dynamic programming or backtracking, since the lower frame is not returning its output to a higher frame, which then decides what to do with it.

Please give me your input on how I can achieve this without the global variable, and using backtracking.

PS: I am aware of other approaches to the problem, like keeping a matrix and doing something like

M[i][j] = M[i-1][j-1] + 1 if (s1[i] == s2[j])

The objective is not just to solve the problem, but to find an elegant recursive/backtracking solution.
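
The matrix recurrence mentioned in the PS can be sketched as follows. This is a minimal illustration, not code from the thread: the class name LongestCommonSubstringTable, the method name and the one-based table layout (row and column 0 stand for the empty prefix) are assumptions made for the example.

```java
class LongestCommonSubstringTable {

    // Returns one longest common substring of s1 and s2 (the first found on ties).
    static String longestCommonSubstring(String s1, String s2) {
        // m[i][j] = length of the common run ending at s1[i-1] and s2[j-1];
        // row 0 and column 0 stay 0 and stand for the empty prefix.
        int[][] m = new int[s1.length() + 1][s2.length() + 1];
        int best = 0;     // length of the best run seen so far
        int endInS1 = 0;  // exclusive end index of that run in s1

        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    m[i][j] = m[i - 1][j - 1] + 1;
                    if (m[i][j] > best) {
                        best = m[i][j];
                        endInS1 = i;
                    }
                }
                // on a mismatch m[i][j] keeps its default value 0
            }
        }
        return s1.substring(endInS1 - best, endInS1);
    }

    public static void main(String[] args) {
        System.out.println(longestCommonSubstring("abcdf", "bzcdf")); // prints cdf
    }
}
```

Each cell depends only on its upper-left neighbour, which is why a mismatch resets the run to 0 instead of carrying a value over as the subsequence recursion does.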


Answer 1:


It could probably be done in Prolog. The following is the code I put together with help from these posts: "Foreach not working in Prolog", http://obvcode.blogspot.in/2008/11/working-with-strings-in-prolog.html and "How do I find the longest list in a list of lists?"

myrun(S1, S2):-
    writeln("-------- codes of first string ---------"),
    string_codes(S1, C1list),
    writeln(C1list),

    writeln("-------- codes of second string ---------"),
    string_codes(S2, C2list),
    writeln(C2list),

    writeln("--------- substrings of first --------"),
    findall(X, sublist(X, C1list), L),   
    writeln(L),

    writeln("--------- substrings of second --------"),
    findall(X, sublist(X, C2list), M),
    writeln(M),

    writeln("------ codes of common substrings -------"),
    intersection(L,M, Outl),
    writeln(Outl), 

    writeln("--------- common strings in one line -------"),
    maplist(string_codes, Sl, Outl), 
    writeln(Sl),
    writeln("------ common strings one by one -------"),
    maplist(writeln, Sl),

    writeln("------ find longest -------"),
    longest(Outl, LongestL),
    writeln(LongestL),
    string_codes(LongestS, LongestL),
    writeln(LongestS).

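% sublist(S, L): S is a contiguous sublist of L (a prefix of some suffix).
sublist(S, L) :-
  append(_, L2, L),
  append(S, _, L2).

% longest(Lists, Longest): Longest is a longest list in Lists.
longest([L], L) :-
   !.
longest([H|T], H) :- 
   length(H, N),
   longest(T, X),
   length(X, M),
   N > M,
   !.
longest([_|T], X) :-
   longest(T, X),
   !.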

Running it shows all the steps: it converts the strings to character codes, builds all possible substrings of both, finds the ones they have in common, and lists them:

?- myrun("abcdf", "bzcdf").
-------- codes of first string ---------
[97,98,99,100,102]
-------- codes of second string ---------
[98,122,99,100,102]
--------- substrings of first --------
[[],[97],[97,98],[97,98,99],[97,98,99,100],[97,98,99,100,102],[],[98],[98,99],[98,99,100],[98,99,100,102],[],[99],[99,100],[99,100,102],[],[100],[100,102],[],[102],[]]
--------- substrings of second --------
[[],[98],[98,122],[98,122,99],[98,122,99,100],[98,122,99,100,102],[],[122],[122,99],[122,99,100],[122,99,100,102],[],[99],[99,100],[99,100,102],[],[100],[100,102],[],[102],[]]
------ codes of common substrings -------
[[],[],[98],[],[99],[99,100],[99,100,102],[],[100],[100,102],[],[102],[]]
--------- common strings in one line -------
[,,b,,c,cd,cdf,,d,df,,f,]
------ common strings one by one -------


b

c
cd
cdf

d
df

f

------ find longest -------
[99,100,102]
cdf
true.

Ignore the 'true' at the end.

If the explanatory parts are removed, the program is much shorter:
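
For readers who do not use Prolog, the same idea (enumerate every substring of both strings, intersect the two collections, keep the longest) might look like this in Java. It is only a rough sketch: the class and method names are invented for the example, and a set is used, so duplicates such as the repeated empty lists in the trace above collapse into one entry.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class LongestCommonSubstringBruteForce {

    // All contiguous substrings of s, including the empty string
    // (plays the role of sublist/2 in the Prolog program).
    static Set<String> substrings(String s) {
        Set<String> out = new LinkedHashSet<>();
        out.add("");
        for (int start = 0; start < s.length(); start++) {
            for (int end = start + 1; end <= s.length(); end++) {
                out.add(s.substring(start, end));
            }
        }
        return out;
    }

    static String longestCommonSubstring(String s1, String s2) {
        Set<String> common = substrings(s1);
        common.retainAll(substrings(s2)); // keep only substrings present in both
        String longest = "";
        for (String candidate : common) {
            if (candidate.length() > longest.length()) {
                longest = candidate;
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        System.out.println(longestCommonSubstring("abcdf", "bzcdf")); // prints cdf
    }
}
```

Like the Prolog version, this builds a quadratic number of substrings per input, so it is suited to short strings only.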

myrun(S1, S2):-
    string_codes(S1, C1list),
    string_codes(S2, C2list),
    findall(X, sublist(X, C1list), L),    
    findall(X, sublist(X, C2list), M),
    intersection(L,M, Outl),
    longest(Outl, LongestL),
    string_codes(LongestS, LongestL),
    writeln(LongestS).

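% sublist(S, L): S is a contiguous sublist of L (a prefix of some suffix).
sublist(S, L) :-
  append(_, L2, L),
  append(S, _, L2).

% longest(Lists, Longest): Longest is a longest list in Lists.
longest([L], L) :-
   !.
longest([H|T], H) :- 
   length(H, N),
   longest(T, X),
   length(X, M),
   N > M,
   !.
longest([_|T], X) :-
   longest(T, X),
   !.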


?- myrun("abcdf", "bzcdf").
cdf
true.
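
Coming back to the Java recursion in the question, one possible way to drop the global variable is to let every frame return the best run it has seen, so the caller decides what to do with it. The sketch below is an assumption-laden illustration, not the asker's or the answerer's code: the names longest, longestMemo and runEndingAt are invented. Unlike the question's version, it also tries the two "skip" branches after a match; without them a pair such as "xab" and "abb" would report 1 instead of 2. A memoised variant is included because the plain recursion takes exponential time.

```java
import java.util.Arrays;

class LongestCommonSubstringRecursive {

    // Same parameters as the question's fun(...), but the result is returned.
    // count is the length of the run that ends just after positions i and j.
    static int longest(String s1, String s2, int i, int j, int count) {
        if (i == -1 || j == -1) {
            return count;
        }
        int best = count;
        if (s1.charAt(i) == s2.charAt(j)) {
            // extend the current run along the diagonal
            best = longest(s1, s2, i - 1, j - 1, count + 1);
        }
        // always try restarting the run in both neighbouring subproblems
        int skipS1 = longest(s1, s2, i - 1, j, 0);
        int skipS2 = longest(s1, s2, i, j - 1, 0);
        return Math.max(best, Math.max(skipS1, skipS2));
    }

    // Top-down DP: run[i][j] caches the common run ending at s1[i] and s2[j].
    static int longestMemo(String s1, String s2) {
        int[][] run = new int[s1.length()][s2.length()];
        for (int[] row : run) {
            Arrays.fill(row, -1); // -1 marks "not computed yet"
        }
        int best = 0;
        for (int i = 0; i < s1.length(); i++) {
            for (int j = 0; j < s2.length(); j++) {
                best = Math.max(best, runEndingAt(s1, s2, i, j, run));
            }
        }
        return best;
    }

    static int runEndingAt(String s1, String s2, int i, int j, int[][] run) {
        if (i < 0 || j < 0) {
            return 0;
        }
        if (run[i][j] == -1) {
            run[i][j] = s1.charAt(i) == s2.charAt(j)
                    ? 1 + runEndingAt(s1, s2, i - 1, j - 1, run)
                    : 0;
        }
        return run[i][j];
    }

    public static void main(String[] args) {
        String s1 = "abcdghijl";
        String s2 = "abefghijk";
        System.out.println(longest(s1, s2, s1.length() - 1, s2.length() - 1, 0)); // prints 4
        System.out.println(longestMemo(s1, s2));                                  // prints 4
    }
}
```

The first method cannot be memoised on (i, j) alone because its result also depends on count, which is why the cached variant stores the run ending at each cell and takes the maximum outside the recursion.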


Source: https://stackoverflow.com/questions/38261865/longest-common-substring-using-recursion-and-dp
