finding all distinct substring of a string

前端 未结 6 1770
借酒劲吻你
借酒劲吻你 2021-01-14 21:41

hello guys i was given homework problem where it asks me to find all distinct substring of a string. I have implemented a method which will tell you all the substrings of s

6条回答
  •  日久生厌
    2021-01-14 22:19

    I followed this link.Acknowledged the content from similar answer in quora

    The solution consists of constructing the suffix array and then finding the number of distinct substrings based on the Longest Common Prefixes.

    One key observation here is that:

    If you look through the prefixes of each suffix of a string, you have covered all substrings of that string.

    Let us take an example: BANANA

    Suffixes are: 0) BANANA 1) ANANA 2) NANA 3) ANA 4) NA 5) A

    It would be a lot easier to go through the prefixes if we sort the above set of suffixes, as we can skip the repeated prefixes easily.

    Sorted set of suffixes: 5) A 3) ANA 1) ANANA 0) BANANA 4) NA 2) NANA

    From now on,

    LCP = Longest Common Prefix of 2 strings.

    Initialize

    ans = length(first suffix) = length("A") = 1.

    Now consider the consecutive pairs of suffixes, i.e, [A, ANA], [ANA, ANANA], [ANANA, BANANA], etc. from the above set of sorted suffixes.

    We can see that, LCP("A", "ANA") = "A".

    All characters that are not part of the common prefix contribute to a distinct substring. In the above case, they are 'N' and 'A'. So they should be added to ans.

    So we have, 1 2 ans += length("ANA") - LCP("A", "ANA") ans = ans + 3 - 1 = ans + 2 = 3

    Do the same for the next pair of consecutive suffixes: ["ANA", "ANANA"]

    1 2 3 4 LCP("ANA", "ANANA") = "ANA". ans += length("ANANA") - length(LCP) => ans = ans + 5 - 3 => ans = 3 + 2 = 5.

    Similarly, we have:

    1 2 LCP("ANANA", "BANANA") = 0 ans = ans + length("BANANA") - 0 = 11

    1 2 LCP("BANANA", "NA") = 0 ans = ans + length("NA") - 0 = 13

    1 2 LCP("NA", "NANA") = 2 ans = ans + length("NANA") - 2 = 15

    Hence the number of distinct substrings for the string "BANANA" = 15.

提交回复
热议问题