Instant searching in petabytes of data

心在旅途 2021-01-01 07:56

I need to search over a petabyte of data in CSV-format files. After indexing with Lucene, the index is roughly double the size of the original files. Is it possible to
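
For context, a minimal sketch of feeding CSV rows into Lucene with its standard Java API. The paths, the field names (`id`, `body`), and the naive comma split are placeholders, not anything from the question; the point is that indexing fields with `Field.Store.NO` keeps the raw text out of the index, which is usually the first lever to pull when the index outgrows the source data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class CsvIndexer {
    public static void main(String[] args) throws IOException {
        // Hypothetical paths; substitute your own index directory and CSV file.
        try (FSDirectory dir = FSDirectory.open(Paths.get("/tmp/csv-index"));
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
             BufferedReader reader = Files.newBufferedReader(Paths.get("/data/rows.csv"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] cols = line.split(",");  // naive split; real CSV needs a proper parser
                Document doc = new Document();
                // Store only the key needed to find the row again...
                doc.add(new StringField("id", cols[0], Field.Store.YES));
                // ...and index the searchable text WITHOUT storing it,
                // so the index holds postings, not a second copy of the data.
                doc.add(new TextField("body", cols[1], Field.Store.NO));
                writer.addDocument(doc);
            }
        }
    }
}
```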

3 Answers
  •  一整个雨季
    2021-01-01 08:10

    Hadoop and MapReduce are built around a batch-processing model. You're not going to get instant response times out of them; that's simply not what the tools are designed for. You might be able to speed up your indexing with Hadoop, but it won't do what you want for querying.

    Take a look at Lucandra, a Cassandra-based back end for Lucene. Cassandra is another distributed data store, developed at Facebook if I recall correctly, and designed for faster access in a more query-oriented access model than Hadoop.
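
    To illustrate the query-oriented side, here is a minimal sketch of searching a plain Lucene index with the standard Java API. As I understand Lucandra, its pitch is to back this same Lucene API with Cassandra storage rather than local disk, so the application-level query code stays essentially the same. The index path, field name, and query string below are assumptions carried over from the indexing sketch above.

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class CsvSearcher {
    public static void main(String[] args) throws Exception {
        // Open the index built by the indexing sketch (hypothetical path).
        try (DirectoryReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("/tmp/csv-index")))) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Parse a free-text query against the "body" field.
            Query query = new QueryParser("body", new StandardAnalyzer()).parse("instant search");
            TopDocs hits = searcher.search(query, 10);  // top 10 matches
            for (ScoreDoc hit : hits.scoreDocs) {
                // Print the stored "id" so the matching CSV row can be retrieved.
                System.out.println(searcher.doc(hit.doc).get("id"));
            }
        }
    }
}
```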
