All Common Substrings Between Two Strings

星月不相逢 2021-01-02 09:42

I am working on C# to find all the common substrings between two strings. For instance, if the input is: S1 = "need asssitance with email", S2 = "email assistance needed"

3 Answers
  •  旧时难觅i
    2021-01-02 10:00

    Use Set-Intersections

    Start with a routine that finds all possible substrings of a string. Here it is in Python; translating it to C# is an 'exercise for the reader':

    def allSubstr(instring):
      # Collect every substring of instring, plus the empty string, in a set.
      retset = set()
      retset.add(instring)
      totlen = len(instring)
      for thislen in range(0, totlen):
        for startpos in range(0, totlen):
          # A slice that runs past the end is just a shorter substring,
          # and the set discards the duplicates.
          addStr = instring[startpos:startpos+thislen]
          retset.add(addStr)
      print("retset total: %s" % sorted(retset))
      return retset
    
    set1 = allSubstr('abcdefg')
    set2 = allSubstr('cdef')
    print(sorted(set1.intersection(set2)))
    

    Here's the output (sorted so that the order is reproducible):

    >>> set1 = allSubstr('abcdefg')
    retset total: ['', 'a', 'ab', 'abc', 'abcd', 'abcde', 'abcdef', 'abcdefg', 'b', 'bc', 'bcd', 'bcde', 'bcdef', 'bcdefg', 'c', 'cd', 'cde', 'cdef', 'cdefg', 'd', 'de', 'def', 'defg', 'e', 'ef', 'efg', 'f', 'fg', 'g']
    >>> set2 = allSubstr('cdef')
    retset total: ['', 'c', 'cd', 'cde', 'cdef', 'd', 'de', 'def', 'e', 'ef', 'f']
    >>> 
    >>> print(sorted(set1.intersection(set2)))
    ['', 'c', 'cd', 'cde', 'cdef', 'd', 'de', 'def', 'e', 'ef', 'f']
    

    You're probably not interested in the empty string or in substrings of length 1, but you can always add a minimum-length check before the set.add() call.
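
The length limit mentioned above could look something like this. It is only a minimal Python 3 sketch, not the code from the answer: the names all_substrings and common_substrings and the min_len parameter are assumptions made up for illustration.

```python
def all_substrings(text, min_len=2):
    """Return the set of substrings of text with at least min_len characters."""
    n = len(text)
    # end starts at start + min_len, so shorter slices are never generated.
    return {
        text[start:end]
        for start in range(n)
        for end in range(start + min_len, n + 1)
    }


def common_substrings(a, b, min_len=2):
    """Substrings of at least min_len characters that occur in both a and b."""
    return all_substrings(a, min_len) & all_substrings(b, min_len)


print(sorted(common_substrings('abcdefg', 'cdef')))
# ['cd', 'cde', 'cdef', 'de', 'def', 'ef']
```

A string of length n has up to n(n+1)/2 non-empty substrings, so building both sets is only practical for fairly short inputs.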
