Case Insensitive Ternary Search Tree

Posted by 魔方 西西 on 2019-12-06 04:03:21

One of the key factors that makes it difficult for my current Ternary Search Tree to support case-insensitive search is that the underlying data structure is a one-to-one mapping: each key holds exactly one value. Consider the following test code:

public void testPut() {
    System.out.println("put");
    Name name0 = new Name("abc");
    Name name1 = new Name("abc");
    TernarySearchTree<Name> instance = new TernarySearchTree<Name>();
    instance.put(name0.toString(), name0);
    instance.put(name1.toString(), name1);
    assertEquals(2, instance.matchPrefix("a").size()); // fail here. Result is 1
}

My current short-term solution is to wrap the whole TernarySearchTree in a TSTSearchEngine. TSTSearchEngine is composed of:

(1) A TernarySearchTree, which provides the UPPER-CASE key used to look up the map.

(2) A String-To-ArrayList map.

Here is what happens when I run:

TSTSearchEngine<Name> engine = new TSTSearchEngine<Name>();
engine.put(name0); // name0 is new Name("Abc");
engine.put(name1); // name1 is new Name("aBc");

(1) name0.toString() is converted to UPPER-CASE ("ABC") and inserted into the TernarySearchTree. "ABC" is both the key and the value in the TernarySearchTree.

(2) "ABC" is used as the key into the map, and name0 is added to the array list stored under it.

(3) name1.toString() is converted to UPPER-CASE ("ABC") and inserted into the TernarySearchTree. Again, "ABC" is both the key and the value in the TernarySearchTree.

(4) "ABC" is used as the key into the map, and name1 is added to the same array list.

When I then call

engine.searchAll("a");

(1) TernarySearchTree will return "ABC".

(2) "ABC" will be used as the key to access the map. The map will return an array list containing name0 and name1.

This solution works. For the sample code, see Sample Code for New TSTSearchEngine.
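The flow above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual TSTSearchEngine: the class name SearchEngineSketch is hypothetical, and a java.util.TreeSet stands in for the TernarySearchTree so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch of the two-structure wrapper described above.
// Assumption: a TreeSet stands in for the TernarySearchTree.
class SearchEngineSketch<V> {
    // Stand-in for the tree: holds each UPPER-CASE key once.
    private final TreeSet<String> keys = new TreeSet<>();
    // UPPER-CASE key -> every value that was inserted under that key.
    private final Map<String, List<V>> values = new HashMap<>();

    void put(V value) {
        String key = value.toString().toUpperCase(Locale.ROOT);
        keys.add(key);
        values.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    List<V> searchAll(String prefix) {
        String p = prefix.toUpperCase(Locale.ROOT);
        List<V> result = new ArrayList<>();
        // First pass: walk the sorted keys starting at the prefix.
        for (String key : keys.tailSet(p)) {
            if (!key.startsWith(p)) {
                break; // keys sharing a prefix are contiguous in sorted order
            }
            // Second pass: fetch all values stored under the matching key.
            result.addAll(values.get(key));
        }
        return result;
    }
}
```

With "Abc" and "aBc" inserted, searchAll("a") finds the single key "ABC" and then returns both values from the map, which is the two-pass behaviour described above.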

However, this may not be an efficient solution, as it requires two passes of searching. I found that there is an implementation in C++: C++ Implementation of Case Insensitive Ternary Search Tree. Hence, it may be possible to port the C++ code over to Java.

I haven't used a TST before, but isn't this as simple as lower or uppercasing your keys, both during storage and during lookup? From your code snippet it looks like that should work.

I've implemented a ternary search tree; you can have a look at http://kunalekawde.blogspot.com/2010/05/radix-patricia-and-ternary.html. As for case insensitivity, either map lower-case and upper-case letters to one hex/decimal value, or check the value while getting the prefix.
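One way to read the first suggestion is sketched below: fold every character to a single case inside the tree's comparisons, and let each node keep a list of values so that keys differing only in case share a node. This is a minimal sketch, not the code from the linked post or from the C++ implementation mentioned in the question. The names CaseInsensitiveTst and fold are hypothetical, and it assumes that per-character upper-casing is sufficient (no locale-specific rules).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a ternary search tree that folds case while comparing
// characters and keeps a list of values per key, so one traversal is enough.
class CaseInsensitiveTst<V> {
    private static final class Node<V> {
        final char ch;                            // stored already folded to upper case
        Node<V> lo, eq, hi;
        final List<V> values = new ArrayList<>(); // all values whose key ends at this node
        Node(char ch) { this.ch = ch; }
    }

    private Node<V> root;

    // Assumption: per-character upper-casing is an adequate case fold.
    private static char fold(char c) { return Character.toUpperCase(c); }

    void put(String key, V value) {
        if (key.isEmpty()) throw new IllegalArgumentException("empty key");
        root = put(root, key, 0, value);
    }

    private Node<V> put(Node<V> n, String key, int i, V value) {
        char c = fold(key.charAt(i));
        if (n == null) n = new Node<>(c);
        if (c < n.ch) n.lo = put(n.lo, key, i, value);
        else if (c > n.ch) n.hi = put(n.hi, key, i, value);
        else if (i < key.length() - 1) n.eq = put(n.eq, key, i + 1, value);
        else n.values.add(value);                 // same folded key: append, do not replace
        return n;
    }

    List<V> matchPrefix(String prefix) {
        List<V> out = new ArrayList<>();
        if (prefix.isEmpty()) { collect(root, out); return out; }
        Node<V> n = root;
        int i = 0;
        while (n != null) {
            char c = fold(prefix.charAt(i));
            if (c < n.ch) n = n.lo;
            else if (c > n.ch) n = n.hi;
            else if (i < prefix.length() - 1) { n = n.eq; i++; }
            else {
                out.addAll(n.values);             // keys equal to the prefix
                collect(n.eq, out);               // keys that extend the prefix
                break;
            }
        }
        return out;
    }

    // Walks a subtree and gathers every stored value.
    private void collect(Node<V> n, List<V> out) {
        if (n == null) return;
        collect(n.lo, out);
        out.addAll(n.values);
        collect(n.eq, out);
        collect(n.hi, out);
    }
}
```

Because the values live in the tree nodes, a prefix search needs no second lookup in a separate map. The trade-off in this sketch is that the original casing of each key is not stored in the tree; it survives only inside the values themselves.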
