How to find all the unique substrings of a very long string?

无人及你 2020-12-21 16:13

I have a very long string, and I want to find all of its unique substrings. I tried writing code that uses a Python set to store all the substrings

2 answers
  •  醉酒成梦
    2020-12-21 17:04

    If you really need all the substrings in memory, you can try building a suffix tree. Tries are not exotic data structures, so good implementations are probably available for a mainstream language like Python, and a trie can be used to implement a suffix tree. Marisa-Trie is reported to have good memory usage.

    1. Create an empty trie.
    2. For each n in [0, len(s)], add the suffix of length n to the trie.
    3. Every path starting at the root of the trie spells a substring of the string, no path spells anything that is not a substring, and each substring corresponds to exactly one path, so the paths enumerate the unique substrings.
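
The steps above can be sketched in plain Python. This is only a minimal illustration, not a memory-tuned implementation: it uses nested dicts instead of Marisa-Trie, and the function names (`build_suffix_trie`, `count_unique_substrings`, `iter_unique_substrings`) are made up for this example. Note that an uncompressed trie of all suffixes can still hold on the order of len(s)^2 nodes in the worst case, so a very long string may need a compressed suffix tree or a similar structure instead.

```python
def build_suffix_trie(s):
    """Insert every suffix of s into a nested-dict trie and return the root."""
    root = {}
    for start in range(len(s)):
        node = root
        # Walk (and extend) the trie along the suffix s[start:].
        for i in range(start, len(s)):
            node = node.setdefault(s[i], {})
    return root


def count_unique_substrings(root):
    """Count non-root nodes; each one is a distinct non-empty substring."""
    count = 0
    stack = [root]
    while stack:
        node = stack.pop()
        count += len(node)
        stack.extend(node.values())
    return count


def iter_unique_substrings(root):
    """Yield each distinct non-empty substring once, one path at a time."""
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for ch, child in node.items():
            yield prefix + ch
            stack.append((child, prefix + ch))


trie = build_suffix_trie("banana")
print(count_unique_substrings(trie))  # 15
```

Iterating with `iter_unique_substrings` avoids building one big set of strings; if only the number of unique substrings is needed, `count_unique_substrings` never materializes them at all.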
