Which structure provides the best performance: a trie (prefix tree), a suffix tree, or a suffix array? Are there other similar structures? What are good Java implementations?
Trie vs Suffix tree
Both data structures give very fast lookup: search time is proportional to the length of the query word, i.e. O(m) time, where m is the length of the query word.
This means that for a query word of 10 characters, we need at most 10 steps to find it.
Trie: A tree for storing strings in which there is one node for every common prefix. The strings are stored in extra leaf nodes.
Suffix tree: A compact representation of a trie corresponding to the suffixes of a given string where all nodes with one child are merged with their parents.
Both definitions are from the Dictionary of Algorithms and Data Structures.
Generally, a trie is used to index dictionary words (a lexicon) or any set of strings, for example D = {abcd, abcdd, bxcdf, ..., zzzz}.
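To make the dictionary case concrete, here is a minimal trie sketch in Java. The class and method names (Trie, insert, contains, startsWith) are my own and not from any library, and instead of the extra leaf nodes mentioned in the definition it marks the end of a word with a boolean flag, which is a common simplification.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative trie; names are hypothetical, not taken from a library.
class Trie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true if a stored string ends at this node
    }

    private final Node root = new Node();

    // Insert a word, creating one node per character where needed: O(m).
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    // True only if exactly this word was inserted.
    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    // True if at least one stored word starts with the given prefix.
    boolean startsWith(String prefix) {
        return find(prefix) != null;
    }

    // Walk down from the root, one step per character of s.
    private Node find(String s) {
        Node node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            node = node.children.get(s.charAt(i));
        }
        return node;
    }
}
```

After inserting abcd, abcdd and bxcdf, contains("abcd") is true, contains("abc") is false (it is only a prefix), and startsWith("abc") is true. Each call walks at most m nodes for a query of length m.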
A suffix tree is used to index a text by applying the same data structure, the trie, to all suffixes of the text. For T = abcdabcg, the suffixes of T are {abcdabcg, bcdabcg, cdabcg, dabcg, abcg, bcg, cg, g}.
Now it looks like a set of strings, and we build a trie over this set (all suffixes of T).
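The idea of "a trie over all suffixes" can be sketched as below. Note that this is a plain, uncompressed suffix trie, not a real suffix tree: it does not merge single-child nodes, so it can use O(n^2) time and space. It is only meant to show why every substring of the text becomes a path from the root. The names SuffixTrie and containsSubstring are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Naive suffix trie (uncompressed); illustrative only, not a linear-time suffix tree.
class SuffixTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    // Insert every suffix text[start..] into the trie.
    SuffixTrie(String text) {
        for (int start = 0; start < text.length(); start++) {
            Node node = root;
            for (int i = start; i < text.length(); i++) {
                node = node.children.computeIfAbsent(text.charAt(i), c -> new Node());
            }
        }
    }

    // A pattern occurs in the text iff it is a prefix of some suffix,
    // i.e. iff it can be followed from the root: O(m) steps.
    boolean containsSubstring(String pattern) {
        Node node = root;
        for (int i = 0; i < pattern.length(); i++) {
            node = node.children.get(pattern.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return true;
    }
}
```

For the text abcdabcg, containsSubstring("dab") is true and containsSubstring("abd") is false.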
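Both data structures can be constructed in linear time and space, O(n). For the suffix tree this relies on merging single-child nodes, as in the definition above, and on a linear-time construction algorithm such as Ukkonen's; a plain uncompressed trie over all suffixes can take O(n^2).
For a dictionary (a set of strings), n is the total number of characters in all the words; for a text, n is the length of the text.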
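Suffix array: a technique for representing a suffix tree in less space. It is an array of the starting positions of all suffixes of a string, sorted in lexicographic order of the suffixes.
It is slower than a suffix tree at search time: a plain binary search over the array takes O(m log n) rather than O(m).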
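A rough Java sketch of a suffix array with binary search follows. It builds the array by simply sorting the suffixes, which is easy to read but far from the fastest construction, so treat it as an illustration rather than a production implementation. The names SuffixArray and indexOf are my own.

```java
import java.util.Arrays;

// Illustrative suffix array; construction by plain sorting is simple but not linear-time.
class SuffixArray {
    private final String text;
    private final Integer[] starts; // suffix start positions in lexicographic order

    SuffixArray(String text) {
        this.text = text;
        starts = new Integer[text.length()];
        for (int i = 0; i < starts.length; i++) {
            starts[i] = i;
        }
        // Sort start positions by the suffix that begins at each position.
        Arrays.sort(starts, (a, b) -> text.substring(a).compareTo(text.substring(b)));
    }

    // Returns the start of some occurrence of pattern, or -1 if there is none.
    // Binary search: O(log n) comparisons, each costing up to O(m).
    int indexOf(String pattern) {
        int lo = 0;
        int hi = starts.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int start = starts[mid];
            int end = Math.min(text.length(), start + pattern.length());
            // Compare the pattern with the first m characters of this suffix.
            int cmp = text.substring(start, end).compareTo(pattern);
            if (cmp == 0) {
                return start;
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1;
    }
}
```

For the text abcdabcg the sorted start positions are [0, 4, 1, 5, 2, 6, 3, 7], so indexOf("dab") returns 3 and indexOf("abd") returns -1. For a pattern that occurs more than once, such as "bc", it returns one of the matching positions, not necessarily the first.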
For more information, see Wikipedia; there is a good article on this topic.