Efficient Data Structure For Substring Search?

广开言路 2020-12-16 03:13

Assume I have a set of strings S and a query string q. I want to know if any member of S is a substring of q. (For the purpose of this question, substring includes equality.)

3 Answers
  • 2020-12-16 03:27

    If the query string q is much shorter than the total length of the candidate substrings (the members of S), your best option is to build a suffix tree from q and then search it for each candidate. This is linear in the length of q plus the total length of the candidates. No algorithm can have better complexity, since you have to read all of the input at least once. In the opposite case, where q is longer than the total length of the candidates, your best option is Aho-Corasick.

    Hope this helps.
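
As a rough illustration of the "index the query string, then look up each candidate" idea from the answer above, here is a minimal sketch. It is not a suffix tree: it uses a suffix automaton, a related structure that accepts exactly the substrings of the indexed string and is shorter to write. The function names `build_suffix_automaton` and `any_member_is_substring` are made up for this example.

```python
def build_suffix_automaton(text):
    """Build a suffix automaton of `text` and return its transition table.

    nxt[state] maps a character to the next state; state 0 is the start.
    Every substring of `text` corresponds to a path starting at state 0.
    """
    nxt = [{}]      # transitions per state
    link = [-1]     # suffix link per state
    length = [0]    # length of the longest substring ending in each state
    last = 0
    for ch in text:
        cur = len(nxt)
        nxt.append({})
        link.append(-1)
        length.append(length[last] + 1)
        p = last
        while p != -1 and ch not in nxt[p]:
            nxt[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            target = nxt[p][ch]
            if length[p] + 1 == length[target]:
                link[cur] = target
            else:
                # Split `target` by cloning it with a shorter length.
                clone = len(nxt)
                nxt.append(dict(nxt[target]))
                link.append(link[target])
                length.append(length[p] + 1)
                while p != -1 and nxt[p].get(ch) == target:
                    nxt[p][ch] = clone
                    p = link[p]
                link[target] = clone
                link[cur] = clone
        last = cur
    return nxt


def any_member_is_substring(members, query):
    """Return True if at least one string in `members` occurs in `query`."""
    nxt = build_suffix_automaton(query)
    for member in members:
        state = 0
        for ch in member:
            state = nxt[state].get(ch)
            if state is None:
                break          # fell off the automaton: not a substring
        else:
            return True        # consumed the whole member without failing
    return False
```

Building the index costs time proportional to the length of the query (for a fixed alphabet), and each candidate is then checked in time proportional to its own length, which matches the bound stated in the answer.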

  • 2020-12-16 03:42

    I think the Aho-Corasick algorithm does what you want. Another solution, which is very simple to implement, is the Karp-Rabin algorithm.
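
For readers who have not seen Aho-Corasick, the following is a minimal sketch of how it could be applied to this question: build a trie over the members of S, add failure links with a breadth-first pass, then scan q once. The names `build_automaton` and `contains_any` are hypothetical, and the code favors brevity over speed.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie transitions, failure links, and terminal flags."""
    goto = [{}]          # goto[state][char] -> child state
    fail = [0]           # failure link per state
    terminal = [False]   # True if some pattern ends at (or is a suffix of) the state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                terminal.append(False)
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        terminal[state] = True

    # Breadth-first pass: depth-1 states keep fail = 0, deeper states
    # derive their failure link from their parent's.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # A state is also terminal if its failure target is terminal.
            terminal[child] = terminal[child] or terminal[fail[child]]
            queue.append(child)
    return goto, fail, terminal


def contains_any(patterns, query):
    """Return True if any string in `patterns` is a substring of `query`."""
    goto, fail, terminal = build_automaton(patterns)
    if terminal[0]:
        return True      # an empty pattern matches trivially
    state = 0
    for ch in query:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if terminal[state]:
            return True
    return False
```

If the same set S is queried many times, the automaton would normally be built once and reused; this sketch rebuilds it on every call only to keep the example short.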

  • 2020-12-16 03:49

    Create a regular expression .*(S1|S2|...|Sn).* and construct its minimal DFA.

    Run your query string through the DFA.
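
A quick way to try the regular-expression idea is sketched below, with one important caveat: Python's `re` module is a backtracking matcher and does not construct a minimal DFA, so this only illustrates the alternation pattern, not the performance of a real DFA. The helper name `any_member_in` is made up for this example. Because `search` scans for a match anywhere in the string, the leading and trailing `.*` are not needed here.

```python
import re


def any_member_in(members, query):
    """Return True if any string in `members` occurs in `query`."""
    if not members:
        return False     # an empty alternation would match everything
    # re.escape keeps characters such as "." or "|" in a member literal.
    pattern = re.compile("|".join(re.escape(m) for m in members))
    return pattern.search(query) is not None
```

For an actual DFA, a regex engine that guarantees linear-time matching would be the closer fit to what the answer describes.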
