Question
I'm looking for a simple integration path between Elasticsearch and Apache Storm. Support for this is included in the elasticsearch-hadoop library, but that pulls in a large set of dependencies on the Hadoop stack, from Hive to Cascading, that I simply don't need. Has anyone succeeded in this integration without bringing in elasticsearch-hadoop? Thanks.
Answer 1:
In my project we're using the RabbitMQ river to index the Storm output. It's a very efficient and convenient way to write to Elasticsearch. You basically put the messages on the queue and the river does the rest. If something gets stuck, the data is simply buffered on the queue.
So I would say: use the river approach for writing and the Elasticsearch Java API for reading, as Kit Menke suggests. Another option for reading is the Jest client, which we've found works well and which offers an async API based on Apache HttpAsyncClient. Note that we don't read from Elasticsearch inside the Storm topology, only in separate services.
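For the write path described above, here is a minimal sketch of how a bolt could build the message body it puts on the queue. It assumes the RabbitMQ river consumes messages in the Elasticsearch bulk format (one action line followed by one source line, each newline-terminated). The class name `RiverBulkMessage`, its helper methods, the index and type names, and the `elasticsearch` exchange and routing key in the comment are all illustrative assumptions, not something taken from our project.

```java
import java.nio.charset.StandardCharsets;

class RiverBulkMessage {

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    // A real topology would more likely use a JSON library for this.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds one bulk "index" action: an action line, then the document source.
    // sourceJson must be a single-line JSON document (no embedded newlines).
    static String indexAction(String index, String type, String id, String sourceJson) {
        String action = "{\"index\":{\"_index\":" + quote(index)
                + ",\"_type\":" + quote(type)
                + ",\"_id\":" + quote(id) + "}}";
        return action + "\n" + sourceJson + "\n";
    }

    // Concatenates several actions into one message body for the queue.
    static byte[] toMessageBody(String... actions) {
        return String.join("", actions).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String action = indexAction("storm-output", "event", "1",
                "{\"word\":\"storm\",\"count\":3}");
        System.out.print(action);
        // Inside a bolt's execute(), the bytes could then be published with the
        // RabbitMQ Java client, for example (exchange and routing key assumed):
        // channel.basicPublish("elasticsearch", "elasticsearch", null, toMessageBody(action));
    }
}
```

Batching several actions into one message keeps the number of publishes down, and because the queue holds the messages until the river consumes them, the bolt doesn't need its own retry logic for a slow Elasticsearch cluster.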
Source: https://stackoverflow.com/questions/26750821/elasticsearch-storm-integration-methods