Elasticsearch vs Cassandra vs Elasticsearch with Cassandra

佛祖请我去吃肉 2020-11-29 15:03

I am learning NoSQL and looking at different options for one of my client's requirements. I have gone through various resources before putting up this question (a person w

8 Answers
  • 2020-11-29 15:45

    One of our applications uses data that is stored in both Cassandra and ElasticSearch. We use Cassandra to access those records whenever we can, and have data duplicated into query tables designed to serve specific application-side requests. For a more liberal search than our query tables allow, ElasticSearch does the job nicely.

    We have asked ourselves that same question: "Why don't we just get everything from ElasticSearch?"

    The answer is that ElasticSearch was designed to be a search engine, not a persistent data store. Sometimes ElasticSearch loses writes. Schema changes are difficult to make in ElasticSearch without blowing everything away and reloading. For that reason, I have written jobs that keep ElasticSearch in sync with our Cassandra cluster. There was also a fairly recent discussion on Quora about this topic that yielded similar points.
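
As a rough illustration of what such a sync job has to decide (this is not the actual job described above), here is a minimal sketch in Python. It assumes the datastore is the source of truth and models both sides as plain in-memory dictionaries keyed by document ID. The names `plan_sync` and `apply_sync` are hypothetical; a real job would read from Cassandra and write to ElasticSearch through their client libraries.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def plan_sync(source: Dict[str, Record],
              index: Dict[str, Record]) -> Tuple[List[str], List[str]]:
    """Return (ids_to_index, ids_to_delete) that would make index match source."""
    # Anything missing from the index, or different from the source, is (re)indexed.
    to_index = sorted(
        doc_id for doc_id, rec in source.items()
        if index.get(doc_id) != rec
    )
    # Anything in the index that no longer exists in the source is removed.
    to_delete = sorted(doc_id for doc_id in index if doc_id not in source)
    return to_index, to_delete


def apply_sync(source: Dict[str, Record],
               index: Dict[str, Record]) -> Tuple[int, int]:
    """Apply the plan to the in-memory index and return (indexed, deleted) counts."""
    to_index, to_delete = plan_sync(source, index)
    for doc_id in to_index:
        index[doc_id] = dict(source[doc_id])  # copy so the two sides stay independent
    for doc_id in to_delete:
        del index[doc_id]
    return len(to_index), len(to_delete)
```

Because the plan is derived only from the current state of both sides, running it again after a successful sync produces nothing to do, which makes it safe to schedule repeatedly.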

    That being said, ElasticSearch works great as a search engine. And Cassandra works great as a scalable, high-performance datastore. But querying data is different from searching for data. There are times that we need one or the other, and a combination of the two works well for our application. It may (or it may not) work well for yours.
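
To make the querying-versus-searching distinction concrete, here is a toy sketch in Python. It is an assumption-laden illustration, not how either product works internally: `query_by_key` stands in for a primary-key lookup, and `search` stands in for full-text search using a tiny inverted index with a naive term-overlap score. All names and the sample rows are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample data: primary key -> text column.
rows = {
    1: "Cassandra is a scalable datastore",
    2: "Elasticsearch is a search engine",
    3: "A search engine ranks matching documents",
}


def query_by_key(key):
    """Querying: you already know the key and get exactly that row (or None)."""
    return rows.get(key)


def tokenize(text):
    """Lowercase and split into alphanumeric terms (a very crude analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Inverted index: term -> set of row keys that contain it.
inverted = defaultdict(set)
for key, text in rows.items():
    for term in tokenize(text):
        inverted[term].add(key)


def search(text):
    """Searching: rank rows by how many distinct query terms they contain."""
    scores = defaultdict(int)
    for term in set(tokenize(text)):
        for key in inverted.get(term, ()):
            scores[key] += 1
    # Highest score first; ties broken by key for a stable order.
    return sorted(scores, key=lambda k: (-scores[k], k))
```

The lookup needs the key up front and returns one row; the search needs only some words and returns a ranked list, which is the part a query table cannot easily provide.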

    As for analytics, I have had some success in using the Cassandra Spark connector, to serve more complex OLAP queries. Hope that helps.

    Edit 20200421

    I've written a newer answer to a similar question:

    ElasticSearch vs. ElasticSearch+Cassandra

  • 2020-11-29 15:52

    Cassandra is great at retrieving data by ID. I don't know much about secondary index performance, but I doubt it's as fast as Elasticsearch. Certainly Elasticsearch wins when it comes to full text search functionality (text analysis, relevancy scoring, etc).

    Cassandra wins on update performance, too. Elasticsearch supports updates, but an update is really a reindex + soft delete in an atomic operation.
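
The "reindex + soft delete" behaviour can be pictured with a toy model. The Python sketch below is only an illustration of the idea under simplifying assumptions (a single append-only list instead of real index segments, no concurrency, no merging); the class `ToyIndex` and its methods are hypothetical and are not Elasticsearch APIs.

```python
class ToyIndex:
    """Append-only store where an update writes a new copy and tombstones the old one."""

    def __init__(self):
        self.docs = []        # append-only list of (doc_id, source) entries
        self.live = {}        # doc_id -> position of the current version
        self.deleted = set()  # positions that are soft-deleted but still stored

    def index(self, doc_id, source):
        """Write a full document; any previous version is only marked deleted."""
        old = self.live.get(doc_id)
        if old is not None:
            self.deleted.add(old)
        self.docs.append((doc_id, dict(source)))
        self.live[doc_id] = len(self.docs) - 1

    def update(self, doc_id, partial):
        """Partial update: read the current version, merge, then reindex the whole document."""
        pos = self.live[doc_id]
        merged = {**self.docs[pos][1], **partial}
        self.index(doc_id, merged)

    def get(self, doc_id):
        """Return the current version of a document."""
        return self.docs[self.live[doc_id]][1]
```

After one `index` call and one `update` call for the same ID, the store holds two entries, one of them tombstoned. That extra write and later cleanup is the cost that makes heavy update workloads more expensive than in-place writes.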

    Cassandra has a very nice replication model (if you need to be extra fail-safe). Elasticsearch is OK, too; I'm not in the camp that says ES is particularly unreliable (it has issues sometimes, like all software).

    Elasticsearch also has aggregations for real-time analytics. And because searches are so fast, analytics on a subset of data will be fast, too.
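
For a sense of what "analytics on a subset of data" looks like, here is a hedged sketch in Python that builds the JSON body of a search request combining a filter with aggregations. The keys `size`, `query`, `term`, `aggs`, `terms`, and `avg` are part of the Elasticsearch query DSL; the field names `status`, `country`, and `total`, the aggregation names, and the helper `build_agg_request` are made up for illustration.

```python
import json


def build_agg_request(status: str) -> dict:
    """Build a search body: filter to a subset, then bucket and average within it."""
    return {
        "size": 0,  # return no hits, only aggregation results
        "query": {"term": {"status": status}},  # restrict to the subset of interest
        "aggs": {
            "per_country": {
                "terms": {"field": "country"},  # one bucket per distinct country
                "aggs": {
                    "avg_total": {"avg": {"field": "total"}}  # metric inside each bucket
                },
            }
        },
    }


if __name__ == "__main__":
    # The printed JSON is what would be sent as the body of a search request.
    print(json.dumps(build_agg_request("shipped"), indent=2))
```

The query narrows the document set first, and the aggregations run only over the matching documents, which is why a fast search tends to mean fast analytics on that subset.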

    If your requirements are satisfied well enough by one of them (like here it seems like ES would work well), I would just use one. If you have requirements from both worlds, then you can either:

    • use one of them and work around the downsides. For example, you may be able to handle many updates with Elasticsearch, but with more shards and more hardware
    • use both and make sure they're in sync