Instant searching in petabytes of data

心在旅途 2021-01-01 07:56

I need to search over a petabyte of data in CSV-format files. After indexing with Lucene, the index is roughly double the size of the original files. Is it possible to
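
For context, a minimal sketch of feeding CSV rows into Lucene with its standard Java API. The paths, the field names (`id`, `body`), and the naive comma split are placeholders, not anything from the question; the point is that indexing fields with `Field.Store.NO` keeps the raw text out of the index, which is usually the first lever to pull when the index outgrows the source data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class CsvIndexer {
    public static void main(String[] args) throws IOException {
        // Hypothetical paths; substitute your own index directory and CSV file.
        try (FSDirectory dir = FSDirectory.open(Paths.get("/tmp/csv-index"));
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
             BufferedReader reader = Files.newBufferedReader(Paths.get("/data/rows.csv"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] cols = line.split(",");  // naive split; real CSV needs a proper parser
                Document doc = new Document();
                // Store only the key needed to find the row again...
                doc.add(new StringField("id", cols[0], Field.Store.YES));
                // ...and index the searchable text WITHOUT storing it,
                // so the index holds postings, not a second copy of the data.
                doc.add(new TextField("body", cols[1], Field.Store.NO));
                writer.addDocument(doc);
            }
        }
    }
}
```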

3 Answers
  •  一整个雨季
    2021-01-01 08:10

    Hadoop and MapReduce are built around a batch-processing model. You're not going to get instant response times out of them; that's simply not what the tools are designed for. You might be able to speed up your indexing with Hadoop, but it won't do what you want for querying.

    Take a look at Lucandra, a Cassandra-based back end for Lucene. Cassandra is another distributed data store, developed at Facebook if I recall correctly, and designed for faster access in a more query-oriented access model than Hadoop.
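
    To illustrate the query-oriented side, here is a minimal sketch of searching a plain Lucene index with the standard Java API. As I understand Lucandra, its pitch is to back this same Lucene API with Cassandra storage rather than local disk, so the application-level query code stays essentially the same. The index path, field name, and query string below are assumptions carried over from the indexing sketch above.

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class CsvSearcher {
    public static void main(String[] args) throws Exception {
        // Open the index built by the indexing sketch (hypothetical path).
        try (DirectoryReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("/tmp/csv-index")))) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Parse a free-text query against the "body" field.
            Query query = new QueryParser("body", new StandardAnalyzer()).parse("instant search");
            TopDocs hits = searcher.search(query, 10);  // top 10 matches
            for (ScoreDoc hit : hits.scoreDocs) {
                // Print the stored "id" so the matching CSV row can be retrieved.
                System.out.println(searcher.doc(hit.doc).get("id"));
            }
        }
    }
}
```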
