Searching over documents stored in Hadoop - which tool to use?
Question: I'm lost among Hadoop, HBase, Lucene, Carrot2, Cloudera, Tika, ZooKeeper, Solr, Katta, Cascading, POI... When you read about one of them, you can be fairly sure that each of the other tools is going to be mentioned. I don't expect you to explain every tool to me, certainly not. If you could help me narrow this set down for my particular scenario, that would be great. So far I'm not sure which of the above will fit, and it looks like (as always) there is more than one way of doing what needs to be done. The