Longest common contiguous subsequence - algorithm

Posted by ↘锁芯ラ on 2019-12-07 18:09:06

Question


My question is simple: is there an O(n) algorithm for finding the longest common contiguous subsequence of two sequences A and B? I searched for it, but all the results were about the LCS problem, which is not what I'm looking for.

Note: if you are willing to give any sample code, you are more than welcome to do so, but please, if you can, in C or C++.

Edit: Here is an example:

A: { a, b, a, b, b, b, a }
B: { a, d, b, b, b, c, n }
longest common contiguous subsequence: { b, b, b }

Answer 1:


Yes, you can do this in linear time. One way is by building suffix trees for both the pattern and the text and computing their intersection. I can't think of a way to do this without involving suffix trees or suffix arrays, though.
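
As a rough illustration of the linear-time idea, here is a minimal sketch that uses a suffix automaton, a close relative of the suffix tree, rather than the suffix-tree intersection described above. The names SuffixAutomaton and longest_common_substring are made up for this example. It assumes the sequences are std::string values; the std::map transitions add a logarithmic factor in the alphabet size, which a fixed-size array per state would avoid for small alphabets.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: suffix automaton of one string.
struct SuffixAutomaton {
    struct State {
        int len = 0;                 // length of the longest substring in this state
        int link = -1;               // suffix link
        std::map<char, int> next;    // outgoing transitions
    };
    std::vector<State> st;
    int last = 0;

    explicit SuffixAutomaton(const std::string& s) {
        st.reserve(2 * s.size() + 1);
        st.emplace_back();           // state 0 is the initial state
        for (char c : s) extend(c);
    }

    void extend(char c) {
        int cur = static_cast<int>(st.size());
        st.emplace_back();
        st[cur].len = st[last].len + 1;
        int p = last;
        while (p != -1 && !st[p].next.count(c)) {
            st[p].next[c] = cur;
            p = st[p].link;
        }
        if (p == -1) {
            st[cur].link = 0;
        } else {
            int q = st[p].next[c];
            if (st[p].len + 1 == st[q].len) {
                st[cur].link = q;
            } else {
                // Split q: the clone keeps q's transitions but a shorter length.
                int clone = static_cast<int>(st.size());
                State copy = st[q];
                copy.len = st[p].len + 1;
                st.push_back(copy);
                while (p != -1 && st[p].next.count(c) && st[p].next[c] == q) {
                    st[p].next[c] = clone;
                    p = st[p].link;
                }
                st[q].link = clone;
                st[cur].link = clone;
            }
        }
        last = cur;
    }
};

// Walks b through the automaton of a, tracking the longest match ending at each position.
std::string longest_common_substring(const std::string& a, const std::string& b) {
    SuffixAutomaton sa(a);
    int v = 0, l = 0, best = 0, best_end = 0;
    for (int i = 0; i < static_cast<int>(b.size()); ++i) {
        char c = b[i];
        // Shorten the current match until it can be extended by c.
        while (v != 0 && !sa.st[v].next.count(c)) {
            v = sa.st[v].link;
            l = sa.st[v].len;
        }
        auto it = sa.st[v].next.find(c);
        if (it != sa.st[v].next.end()) {
            v = it->second;
            ++l;
        }
        if (l > best) {
            best = l;
            best_end = i + 1;
        }
    }
    assert(best_end >= best);        // sanity check on the recorded window
    return b.substr(best_end - best, best);
}
```

Calling longest_common_substring("ababbba", "adbbbcn") on the sequences from the question should return "bbb".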




Answer 2:


This is what you are looking for:

KMP algorithm C implementation

The basic theory:

  1. How to find Longest Common Substring using C++

  2. http://en.wikipedia.org/wiki/Longest_common_substring_problem




Answer 3:


I am not sure whether an O(n) algorithm exists. Here is an O(n*n) dynamic programming solution that may be helpful. Let lcs_con[i][j] be the length of the longest common contiguous subsequence that ends with element A_i of array A and element B_j of array B. Then we get the equations below:

lcs_con[i][j]=0 if i==0 or j==0
lcs_con[i][j]=0 if A_i != B_j
lcs_con[i][j]=lcs_con[i-1][j-1]+1 if A_i==B_j

We can record the maximum of lcs_con[i][j] and its ending index during the calculation to recover the longest common contiguous subsequence. The code is below:

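#include <cstdio>    // getchar
#include <cstring>   // memset
#include <iostream>

using namespace std;


int main()
{
    char A[7]={'a','b','a','b','b','b','a'};
    char B[7]={'a','d','b','b','b','c','n'};

    // lcs_con[i][j]: length of the longest common run ending at A[i-1] and B[j-1]
    int lcs_con[8][8];
    memset(lcs_con,0,sizeof(lcs_con));

    int result=0;   // best length found so far (0 means no common element)
    int x=0;        // 1-based end position of the best run in A
    int y=0;        // 1-based end position of the best run in B (recorded, not printed)

    for(int i=1;i<=7;++i)
      for(int j=1;j<=7;++j)
      {
          if(A[i-1]==B[j-1])lcs_con[i][j]=lcs_con[i-1][j-1]+1;
          else lcs_con[i][j]=0;

          if(lcs_con[i][j]>result)
          {
               result=lcs_con[i][j];
               x=i;
               y=j;
          }
      }

   if(result==0)cout<<"There is no common contiguous subsequence"<<endl;
   else
   {
       cout<<"The longest common contiguous subsequence is:"<<endl;
       // the run occupies A[x-result] .. A[x-1]
       for(int i=x-result;i<x;i++)cout<<A[i];
       cout<<endl;
   }

   getchar();   // wait for a key press so the console window stays open

   return 0;
}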

Hope it helps!
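
Since each row of lcs_con depends only on the previous row, the table can be reduced to two rows. The following is a minimal sketch of that variant, assuming the sequences are std::string values; the function name lcs_contiguous is made up for this example. The running time is still O(|A|*|B|), but the extra memory drops to O(|B|).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: same recurrence as lcs_con, keeping only two rows.
std::string lcs_contiguous(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1, 0);  // row i-1 of the table
    std::vector<int> cur(b.size() + 1, 0);   // row i of the table
    int best = 0;                            // best length found so far
    std::size_t end_in_a = 0;                // 1-based end position of the best run in a

    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1 : 0;
            if (cur[j] > best) {
                best = cur[j];
                end_in_a = i;
            }
        }
        std::swap(prev, cur);                // column 0 is never written, so it stays 0
    }
    assert(end_in_a >= static_cast<std::size_t>(best));
    return a.substr(end_in_a - best, best);
}
```

With the sequences from the question, lcs_contiguous("ababbba", "adbbbcn") should return "bbb", and two strings with no common element give an empty string.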



Source: https://stackoverflow.com/questions/14032903/longest-common-contiguous-subsequence-algorithm
