How to create a simple prefix index in Java?

青春壹個敷衍的年華 submitted on 2019-11-29 10:56:47

If you need to efficiently find prefixes of strings, use a Trie, a data structure designed precisely for that purpose:

A trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string.

Two links with sample implementations.

A long time ago I put a simple Trie implementation here:

http://code.google.com/p/triebag/source/browse/trunk/src/triebag/tries/SimpleTrie.java

However, this is not a compact Trie: it creates one node per character. Creating a compact one is a bit trickier.
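As a rough illustration of the non-compact, one-node-per-character approach, here is a minimal sketch. It is not the code behind the link above; the class name SimplePrefixTrie and the method names add, containsPrefixOf and hasKeyWithPrefix are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal non-compact trie: one node per character. All names here are illustrative. */
final class SimplePrefixTrie {

    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a stored key ends exactly at this node
    }

    private final Node root = new Node();

    /** Stores a key, creating one node per character along its path. */
    void add(String key) {
        Node node = root;
        for (int i = 0; i < key.length(); i++) {
            node = node.children.computeIfAbsent(key.charAt(i), c -> new Node());
        }
        node.terminal = true;
    }

    /** Returns true if any stored key is a prefix of the given text. */
    boolean containsPrefixOf(String text) {
        Node node = root;
        for (int i = 0; i < text.length(); i++) {
            if (node.terminal) {
                return true; // a stored key ended before the text did
            }
            node = node.children.get(text.charAt(i));
            if (node == null) {
                return false; // no stored key continues with this character
            }
        }
        return node.terminal; // the text itself is a stored key
    }

    /** Returns true if at least one stored key starts with the given prefix. */
    boolean hasKeyWithPrefix(String prefix) {
        Node node = root;
        for (int i = 0; i < prefix.length(); i++) {
            node = node.children.get(prefix.charAt(i));
            if (node == null) {
                return false;
            }
        }
        // Non-root nodes exist only because some key passes through them;
        // the root needs an explicit check so an empty trie reports false.
        return node != root || root.terminal || !root.children.isEmpty();
    }
}
```

Both lookups walk at most one node per character of the argument, so their cost depends on the length of the string being tested rather than on the number of stored keys.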

The Regexp implementation java.util.regex.Pattern can efficiently handle prefixes:

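import java.util.regex.Pattern;

// Join all prefixes into a single alternation: p1|p2|...
// `prefixes` is any Iterable<String> holding the prefixes to match.
StringBuilder buffer = new StringBuilder();
for (String prefix : prefixes) {
    if (buffer.length() > 0)
        buffer.append("|");
    buffer.append(prefix); // appended as-is, so it is interpreted as regex syntax
}
// Anchor the alternation at the start of the input.
Pattern prefixPattern = Pattern.compile("^(" + buffer + ")");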

You can then test a string against all prefixes at once:

boolean containsPrefix = prefixPattern.matcher(stringToTest).find();

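Note: for simplicity, the prefix strings are not escaped. Regexp metacharacters such as [, ], \, ., *, +, ?, $, ^, (, ), {, } and | have to be prefixed by \. Alternatively, pass each prefix through Pattern.quote() before appending it, so that it is treated as a literal.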
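A minimal sketch of the escaped variant, using Pattern.quote() so that each prefix is matched literally. The helper class PrefixPatterns and its method names are hypothetical, and the sketch assumes a non-empty list of prefixes (an empty list would produce a pattern that matches every input).

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical helper: builds an anchored alternation of literal prefixes. */
final class PrefixPatterns {

    private PrefixPatterns() {}

    /** Compiles a pattern that matches input starting with any of the given prefixes. */
    static Pattern compile(List<String> prefixes) {
        String alternation = prefixes.stream()
                .map(Pattern::quote)              // wrap each prefix so it is taken literally
                .collect(Collectors.joining("|"));
        return Pattern.compile("^(?:" + alternation + ")");
    }

    /** Returns true if the text starts with one of the compiled prefixes. */
    static boolean startsWithAny(Pattern prefixPattern, String text) {
        return prefixPattern.matcher(text).find();
    }
}
```

With this version a prefix such as "a.b" only matches input that really starts with "a.b", whereas the unescaped pattern would also accept "aXb".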
