Kafka consumer - what's the relation of consumer processes and threads with topic partitions

被撕碎了的回忆 2021-02-02 03:48

I have been working with Kafka lately and have a bit of confusion regarding the consumers under a consumer group. The center of the confusion is whether to implement consumers as

3 Answers
  •  情深已故
    2021-02-02 04:33

    The main design decision between running multiple consumer instances with the same group id and running a single consumer instance is resiliency. For example, if you have a single consumer instance with two threads and its machine goes down, you lose all consumers. If you have two separate consumer instances with the same group id, each on a different host, the group can survive the failure of one host. Ideally each instance in this example should have two threads, so that if one host goes down the surviving instance uses its dormant thread to take over the other partition. In general it is desirable to have more threads than partitions across the group to cover this case.

    1. You can run each consumer instance on a different host. A single consumer instance for a given group id only ever runs on a single host, because it manages all of its threads in a single runtime environment.
    2. Kafka has an algorithm that determines which threads in which consumer instances read each topic partition. It tries to distribute the partitions evenly and in a resilient fashion. When a consumer instance fails, threads in the other instances take over its partitions.
    3. This refers to a single thread in the consumer instance. If there are more threads than partitions, the extra threads remain dormant until other threads fail, which is what provides the resiliency.
    4. The preference relates to resilience. With multiple consumer instances set up with the same group id, I can run on multiple hosts, making my application tolerant to failure.
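
To make the dormant-thread idea concrete, here is a minimal sketch in Python. It is a simplified model of range-style assignment (sort the threads, hand out contiguous partition ranges), not Kafka's actual code, and it does not talk to a broker. The function name `assign` and the thread names such as `hostA-0` are hypothetical, chosen only for illustration.

```python
def assign(partitions, threads):
    """Simplified range-style assignment: threads are sorted by name and
    each gets a contiguous slice of partitions; extra threads get none."""
    threads = sorted(threads)
    base, extra = divmod(len(partitions), len(threads))
    assignment = {}
    start = 0
    for i, thread in enumerate(threads):
        # The first `extra` threads receive one partition more than the rest.
        count = base + (1 if i < extra else 0)
        assignment[thread] = partitions[start:start + count]
        start += count
    return assignment


# Assumed setup: two hosts, two threads each, one topic with two partitions.
threads = ["hostA-0", "hostA-1", "hostB-0", "hostB-1"]
partitions = [0, 1]

before = assign(partitions, threads)

# hostA goes down: the group rebalances over the surviving threads only.
survivors = [t for t in threads if not t.startswith("hostA")]
after = assign(partitions, survivors)

print(before)
print(after)
```

In this model two of the four threads own no partition before the failure. After `hostA` disappears, the previously idle threads on `hostB` pick up both partitions, which is the resiliency argument made above. Which threads start out idle depends on how the thread names sort, so the initial split across hosts may differ in a real deployment.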
