Running Search workload and Cassandra workload on the same physical node

Posted by 孤街浪徒 on 2019-12-22 13:58:31

Question


Can't seem to find the answer to this obvious question.

We have 6 servers currently configured as "Search" workload running DSE.

My question is: Is it possible to run Search (Solr) and Cassandra on the same physical box? (Not) Possible / (Not) Recommended?

I'm very confused by the fact that we are currently running all nodes as Solr nodes and I'm still able to use them as Cassandra nodes (real-time queries), so technically they are both?

The "Services /Best Practice" tells me that: "Please replace the current search nodes that have vnodes enabled with nodes without vnodes."

Our ideal situation would be: (a) use all 6 servers as Cassandra storage (plus real-time queries), and (b) use 1 or 2 of the SAME servers as Solr Search.

The only documentation I've found that somewhat resembles what we want is http://www.datastax.com/documentation/datastax_enterprise/4.6/datastax_enterprise/deploy/deployWkLdSep.html but, as far as I understand, it still says that I need to physically split the load, meaning dedicate 4 servers to Cassandra and 2 nodes to Solr/Search?

Can anyone explain/suggest anything?

Thank you!


Answer 1:


DSE Search - C* and Solr on the Same node:

As Rock Brain mentioned, DSE Search runs Solr and Cassandra on the same node. More specifically, it runs them in the same JVM, which has heap implications: the recommendation is to raise your heap to 14 GB rather than the 8 GB used for C*-only nodes.

As RB also mentioned, CPU consumption will be greater with Solr. However, I often see Search DCs with fewer, beefier nodes than the C* DCs. Again, this depends on your workload and how much data you're indexing.

Note (DSE Search performance tip): The main rule of thumb for performance is to try to fit all your DSE Search indexes in the OS page cache, so you may need more RAM than for a Cassandra-only node to get optimal performance.
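As a rough way to apply that rule of thumb, the sketch below compares the on-disk size of an index directory with the RAM left over once the JVM heap is subtracted. It is only an estimate under stated assumptions: the function names are made up for illustration, the index directory you pass in depends on your install, and the real page cache also competes with other processes on the box.

```python
import os

GIB = 1024 ** 3


def directory_size_bytes(path):
    """Sum the sizes of all regular files under `path` (hypothetical index dir)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def fits_in_page_cache(index_bytes, total_ram_bytes, heap_bytes, other_bytes=0):
    """True if the index is no larger than RAM minus heap minus other usage.

    This is a back-of-the-envelope check, not a measurement of the page cache.
    """
    available = total_ram_bytes - heap_bytes - other_bytes
    return index_bytes <= available


if __name__ == "__main__":
    # Assumed numbers: 64 GiB box, 14 GiB heap, 40 GiB of search indexes.
    print(fits_in_page_cache(40 * GIB, 64 * GIB, 14 * GIB))
```

If the check fails for your numbers, that is a hint that the node may need more RAM or that the indexes should be spread over more nodes.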

DSE Search and Workload Isolation:

You will find in the DataStax docs that we recommend running separate data centers for your Cassandra workloads and for your search or analytics workloads. This prevents search-driven contention from affecting your Cassandra ingestion.

The reason behind this recommendation is that many DSE customers have very tight, microsecond-level SLAs and very large workloads. You can get away with running search and C* on the same nodes (same DC) if you have looser SLAs and smaller workloads. Your best bet is to run a proof of concept with your workload on your hardware and see how it performs.

Can I activate DSE Search on just 2 of my 6 DSE nodes?

Not really. You most likely want to turn on search on your whole DC or not at all, for the following reasons:

  1. The DSESimpleSnitch will automatically split them up into separate DCs, so you'd have to use another snitch.
  2. You will get "cannot find endpoints" errors on your Solr DCs if there aren't enough nodes with the right copies of your data. Remember, Cassandra is still responsible for replication, and the Solr core on each node will only index the data that is on that node.
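To see why a small subset of search nodes can be missing data, here is a deliberately simplified model: a ring of single-token nodes where each range is stored on its primary node plus the next `rf - 1` nodes clockwise. This is an illustrative assumption, not how DSE actually places replicas across data centers, and the function names are invented for the sketch. It only shows which ranges would have no replica on any search-enabled node.

```python
def replicas_for_range(primary, node_count, rf):
    """Nodes holding the range owned by `primary` in the simplified ring model."""
    return {(primary + k) % node_count for k in range(rf)}


def uncovered_ranges(search_nodes, node_count, rf):
    """Ranges (identified by primary node index) with no replica on a search node."""
    search_nodes = set(search_nodes)
    return [
        primary
        for primary in range(node_count)
        if not (replicas_for_range(primary, node_count, rf) & search_nodes)
    ]


if __name__ == "__main__":
    # 6 nodes, replication factor 3, search enabled only on nodes 0 and 1.
    print(uncovered_ranges({0, 1}, node_count=6, rf=3))
    # Search enabled everywhere: nothing is left uncovered.
    print(uncovered_ranges(range(6), node_count=6, rf=3))
```

In this model, any range that comes back as uncovered has no local copy for a Solr core to index, which is the situation the second point in the list describes.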

Turn on search on all 6, but feel free to direct C* queries at all of them and search queries at only 2 if you want. I'm not sure why you would want to, though: you'll clearly see in OpsCenter that those 2 nodes are under higher load.

Remember that, as of DSE 4.6, you can run Search queries right from CQL.
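As I understand the DSE documentation, Search queries from CQL are expressed through a `solr_query` pseudo-column in the `WHERE` clause. The sketch below only builds such a statement as a string; the keyspace, table, and field names are hypothetical, the helper itself is not part of any driver, and you should check the exact syntax against the docs for your DSE version before relying on it.

```python
def solr_cql(keyspace, table, solr_expr, limit=None):
    """Build a CQL statement that passes `solr_expr` via the solr_query column.

    Single quotes in the Solr expression are doubled, which is how CQL
    escapes them inside a string literal.
    """
    escaped = solr_expr.replace("'", "''")
    stmt = f"SELECT * FROM {keyspace}.{table} WHERE solr_query = '{escaped}'"
    if limit is not None:
        stmt += f" LIMIT {int(limit)}"
    return stmt


if __name__ == "__main__":
    # Hypothetical keyspace/table/field, for illustration only.
    print(solr_cql("demo", "users", "name:smith", limit=10))
```

The resulting string can then be sent with whichever CQL client you already use (cqlsh or a driver session), against any search-enabled node.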

Vnodes vs. Non Vnodes for DSE Search

Regarding your question in the comment above: vnodes are not recommended for DSE Search because you will incur a performance hit. Before 4.6 it was a large hit, around 300%, but as of 4.6 it's only about a 30% performance hit for Search queries. The larger the number of vnodes (num_tokens), the larger the hit.

You can run vnodes on one DC and single tokens on the other DC. DSE will, by default, run single tokens.
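If you go with single tokens, each node needs its own `initial_token`. A minimal sketch for computing evenly spaced tokens is below; it assumes the Murmur3Partitioner (token range -2^63 to 2^63 - 1) and a single data center, and the function name is just for illustration. Multi-DC layouts typically offset the tokens per DC, which this does not handle.

```python
def single_token_assignments(node_count):
    """Evenly spaced initial tokens for `node_count` single-token nodes.

    Assumes Murmur3Partitioner; other partitioners use a different range.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = 2 ** 64 // node_count
    return [i * step - 2 ** 63 for i in range(node_count)]


if __name__ == "__main__":
    # One token per node for a 6-node data center.
    for node, token in enumerate(single_token_assignments(6)):
        print(f"node {node}: initial_token = {token}")
```

Each value would go into the `initial_token` setting of the corresponding node, with vnodes left disabled on those nodes.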




Answer 2:


Is it possible to run Search (Solr) and Cassandra on the same physical box? (Not) Possible / (Not) Recommended?

Yes, this is how DSE Search works: Cassandra and Solr run in the same process, with the full functionality of both available.

Solr uses more CPU than Cassandra, so you will want more Solr nodes than dedicated Cassandra nodes. You will set up separate Cassandra and Solr data centers to divide the workload types.



Source: https://stackoverflow.com/questions/28331235/running-search-workload-and-cassandra-workload-on-the-same-physical-node
