Why is my Cassandra throughput not improving when I add nodes?

囚心锁ツ 2021-01-14 12:45

This is a newbie question. I have tried to do my homework, but I am stuck trying to understand how Cassandra scales linearly as advertised. When I run against a single cassan

3 answers
  •  半阙折子戏
    2021-01-14 13:05

    Do the inserts within your batch share the same partition key (tableId)? If they do not, each insert in the batch with a distinct partition key is treated as a separate mutation on the Cassandra node that handles your request (the coordinator), and that node has to forward each mutation to the replicas responsible for it. As your cluster grows, this can actually degrade performance, because more nodes need to be contacted to complete a single batch.
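    To put a rough number on that fan-out, here is a minimal sketch under simplifying assumptions that are mine, not the original poster's: every partition key hashes to one node uniformly at random, the replication factor is 1, and token skew is ignored. The function name is hypothetical. It estimates how many distinct nodes a coordinator has to contact for one multi-partition batch.

```python
def expected_nodes_contacted(num_nodes: int, partitions_in_batch: int) -> float:
    """Expected number of distinct nodes that own at least one partition in a batch.

    Assumption (illustrative only): each partition key lands on one of
    num_nodes uniformly at random, with replication factor 1.
    """
    # Probability that a given node owns none of the partitions in the batch.
    miss = (1 - 1 / num_nodes) ** partitions_in_batch
    return num_nodes * (1 - miss)


if __name__ == "__main__":
    # A batch touching 50 distinct partitions, on clusters of growing size.
    for n in (1, 3, 6, 12):
        print(n, round(expected_nodes_contacted(n, 50), 2))
```

    Under this model a batch with many distinct partition keys ends up touching nearly every node, so adding nodes adds coordination work per batch instead of spreading it out.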

    If you keep each batch to a single partition, or do not use batches at all, you should see performance improve as you add nodes. Search for 'Batch Loading without the Batch' for a good reference on how to optimize this.
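    One way to keep batches single-partition is to group the rows by partition key on the client before sending them. The sketch below is plain Python with no driver calls; the row layout `(tableId, rowId, payload)` and the function name are assumptions for illustration. Each yielded group could then be sent as one batch (for example a `BatchStatement` in the DataStax Python driver), or the rows could be sent individually with asynchronous writes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical row layout: (tableId, rowId, payload); tableId is the partition key.
Row = Tuple[str, int, str]


def single_partition_batches(
    rows: Iterable[Row], max_batch_size: int = 50
) -> Iterator[List[Row]]:
    """Yield lists of rows in which every row shares the same partition key."""
    by_partition: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        by_partition[row[0]].append(row)

    for group in by_partition.values():
        # Split large partitions so no single batch grows without bound.
        for start in range(0, len(group), max_batch_size):
            yield group[start:start + max_batch_size]


if __name__ == "__main__":
    sample = [("t1", 1, "a"), ("t2", 1, "b"), ("t1", 2, "c")]
    for batch in single_partition_batches(sample):
        print(batch)
```

    The `max_batch_size` default of 50 is arbitrary; a sensible value depends on row size and the cluster's batch size thresholds.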

    As for losing performance with a lower replication factor: when you reduce the replication factor, each node holds a smaller share of the cluster's data, so the node handling your request can serve less of it locally when the request is spread across many partition keys.
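    As a back-of-the-envelope illustration of that last point, assuming data is spread evenly across nodes (a simplification, and the function name is hypothetical), the fraction of partitions any one node holds is roughly the replication factor divided by the node count:

```python
def local_share(num_nodes: int, replication_factor: int) -> float:
    """Approximate fraction of all partitions for which a given node is a replica.

    Assumes an even token distribution; the effective replication factor
    cannot exceed the number of nodes.
    """
    return min(replication_factor, num_nodes) / num_nodes


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(f"6 nodes, RF={rf}: {local_share(6, rf):.0%} of partitions are local")
```

    With 6 nodes, dropping from RF=3 to RF=1 takes a node from holding about half of the partitions to about a sixth, so a request spanning many partition keys needs more hops to other nodes.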
