Custom full-text index stored in Cassandra

Posted by ╄→尐↘猪︶ㄣ on 2020-01-02 09:38:29

Question


I'm using Cassandra as my database and I need full-text search capability. I'm aware of Apache Solr, Apache Cassandra, and DSE Search.

However, I do not want to use costly, proprietary software (DSE Search). I also do not want to use Apache Solr, because I don't want to deal with HA, sharding, and redundancy for it. Cassandra already handles HA, sharding, and redundancy well, so I would like to store my full-text index in the existing Cassandra DB.

So what I'm looking for is something that will break down a string into its indexable parts. For example:

String input = "I like apples and bananas.";

String[] tokens = makeTokenIndex(input);

// tokens = {"I", "like", "apples", "bananas", "apple", "banana"}

Obviously I could split strings on spaces and use the words as index keys, but I'm looking for something smarter than that: something that can handle plurals, find the root of a word, and so on.
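
To make the desired behaviour concrete, here is a minimal sketch of what a makeTokenIndex could look like in plain Java. It is only an illustration under stated assumptions: the class name TokenIndexSketch, the tiny stop-word list, and the naiveStem suffix rules are all made up for this example, and the suffix stripping is far cruder than a real stemmer (for instance, the kind of analysis Lucene's analyzers provide). It also lower-cases tokens, which the example above does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class TokenIndexSketch {

    // Tiny illustrative stop-word list (an assumption, not a standard list).
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "or", "the"));

    // Splits on non-letters, drops stop words, and emits each lower-cased
    // word together with a crudely de-pluralized form of it.
    static List<String> makeTokenIndex(String input) {
        // LinkedHashSet removes duplicates while keeping insertion order.
        Set<String> tokens = new LinkedHashSet<>();
        for (String word : input.split("[^\\p{L}]+")) {
            if (word.isEmpty()) {
                continue; // a leading separator produces an empty first element
            }
            String lower = word.toLowerCase(Locale.ROOT);
            if (STOP_WORDS.contains(lower)) {
                continue;
            }
            tokens.add(lower);
            tokens.add(naiveStem(lower));
        }
        return new ArrayList<>(tokens);
    }

    // Hypothetical, deliberately naive plural stripping; NOT a real stemmer.
    static String naiveStem(String word) {
        if (word.length() > 4 && word.endsWith("ies")) {
            return word.substring(0, word.length() - 3) + "y"; // "berries" -> "berry"
        }
        if (word.length() > 3 && word.endsWith("s") && !word.endsWith("ss")) {
            return word.substring(0, word.length() - 1); // "apples" -> "apple"
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(makeTokenIndex("I like apples and bananas."));
    }
}
```

Each returned token could then serve as a key in a Cassandra table that maps tokens to document IDs. The naive rules will mangle some words (for example "this" becomes "thi"), which is exactly the kind of case a proper stemmer is meant to handle.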

Would modifying Apache Lucene be the best solution for this, or is there another option?


Answer 1:


I've not used Cassandra, but I think you're talking about using a Cassandra implementation of Lucene's Directory interface. Lucene uses a Directory to interact with a storage mechanism.
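
To illustrate the idea only: Lucene writes its index as a set of named files, and the Directory is the abstraction through which those files are created, read, listed, and deleted. The sketch below is not Lucene's actual API. IndexFileStoreSketch, FileStore, and InMemoryStore are hypothetical names, and the in-memory map stands in for what would, by assumption, be a Cassandra table keyed by file name.

```java
import java.util.Map;
import java.util.TreeMap;

class IndexFileStoreSketch {

    // Hypothetical storage abstraction, loosely modeled on the role a
    // Lucene Directory plays; these are NOT Lucene's method signatures.
    interface FileStore {
        void writeFile(String name, byte[] contents);
        byte[] readFile(String name);
        void deleteFile(String name);
        String[] listAll();
    }

    // In-memory stand-in. A Cassandra-backed version would translate each
    // call into a query against a table keyed by file name (assumption).
    static class InMemoryStore implements FileStore {
        private final Map<String, byte[]> files = new TreeMap<>();

        @Override
        public void writeFile(String name, byte[] contents) {
            files.put(name, contents.clone()); // defensive copy
        }

        @Override
        public byte[] readFile(String name) {
            byte[] data = files.get(name);
            if (data == null) {
                throw new IllegalArgumentException("no such file: " + name);
            }
            return data.clone();
        }

        @Override
        public void deleteFile(String name) {
            files.remove(name);
        }

        @Override
        public String[] listAll() {
            // TreeMap keeps the names sorted.
            return files.keySet().toArray(new String[0]);
        }
    }

    public static void main(String[] args) {
        FileStore store = new InMemoryStore();
        store.writeFile("segments_1", new byte[] {1, 2, 3});
        System.out.println(String.join(",", store.listAll()));
    }
}
```

The point of the abstraction is that the indexing and search code only ever talks to the store interface, so swapping the local filesystem for Cassandra would not require changing the analysis or query logic.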

I found a couple of projects that might help:

  • lucene-on-cassandra
  • Solandra

I can't speak with experience about either one, though.



Source: https://stackoverflow.com/questions/20552046/custom-full-text-index-stored-in-cassandra
