What is the batch limit in Cassandra?

Chris Lohfink

I would recommend not increasing the cap, and instead splitting the work into multiple requests. Putting everything into one giant request will significantly impact the coordinator. Having everything in one partition can improve throughput at some batch sizes by reducing latency, but batches are never meant to be used to improve performance. So trying to optimize for maximum throughput with different batch sizes will depend largely on your use case, schema, and nodes, and will require specific testing, since there's generally a cliff at the size where throughput starts to degrade.
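As a rough illustration of that splitting, here is a minimal sketch using the DataStax Python driver (cassandra-driver); the keyspace, the matches(id, data) schema, and the chunk size of 100 are assumptions for the example, not values from the answer:

from cassandra.cluster import Cluster
from cassandra.query import BatchStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')  # hypothetical keyspace

# Prepared statement against a hypothetical matches(id, data) table
insert = session.prepare("INSERT INTO matches (id, data) VALUES (?, ?)")

def insert_in_chunks(rows, chunk_size=100):
    # Split rows into small batches so no single batch
    # approaches batch_size_fail_threshold_in_kb
    for i in range(0, len(rows), chunk_size):
        batch = BatchStatement()
        for row in rows[i:i + chunk_size]:
            batch.add(insert, row)
        session.execute(batch)

If the rows target many different partitions, individual asynchronous inserts (session.execute_async) are often a better fit than batching at all.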

There is a

# Fail any batch exceeding this value. 50kb (10x warn threshold) by default.
batch_size_fail_threshold_in_kb: 50

option in your cassandra.yaml to increase it, but be sure to test to make sure you're actually helping and not hurting your throughput.

Looking at the Cassandra logs, you'll be able to spot things like:

ERROR 19:54:13 Batch for [matches] is of size 103.072KiB, exceeding specified threshold of 50.000KiB by 53.072KiB. (see batch_size_fail_threshold_in_kb)

I fixed this issue by changing the CHUNKSIZE to a lower value (for example, 1): https://docs.datastax.com/en/cql/3.1/cql/cql_reference/copy_r.html

COPY mytable FROM 'mybackup' WITH CHUNKSIZE = 1;

The operation is much slower, but at least it works now.
