Kafka message loss can be addressed from two sides: the Producer and the Consumer.
1. Producer
The request.required.acks setting (named acks in the newer Java producer) controls whether the Producer waits for an acknowledgement from Kafka after sending a message. It takes three values (a configuration sketch follows the list):
0: the Producer never waits for an acknowledgement from the broker and keeps pushing data. This gives the lowest latency but the weakest durability guarantee; data will be lost when a server fails.
1: the Producer waits for an acknowledgement that the partition leader has received the data before pushing more. Durability is better, but any messages written to the leader that have not yet been replicated are lost if that leader dies.
-1 (all): the Producer waits until all in-sync replicas have received the data before pushing more. This gives the greatest durability, yet it still does not completely eliminate loss: the in-sync replica set may, in rare cases, shrink to just the leader, so there is effectively no follower to sync and the data dies with the leader. To ensure that some minimum number of replicas (typically a majority) receive each write, set the topic-level min.insync.replicas and provision enough brokers/replicas to satisfy it; see the Replication section of the Kafka design documentation for a deeper discussion.
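As a rough illustration, here is a minimal Java producer configured for the durable end of this spectrum. The topic name, key/value, and bootstrap address are placeholders; acks=all is the new-client equivalent of request.required.acks=-1, and min.insync.replicas appears only as a comment because it is a topic/broker-level setting, not a producer one:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class DurableProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // Wait until every in-sync replica has the record before acknowledging.
        props.put("acks", "all");
        // Retry transient broker errors instead of silently dropping the record.
        props.put("retries", 3);

        // Note: full durability also needs a topic-level setting, e.g.
        //   min.insync.replicas=2  (with replication factor >= 2)
        // so "all in-sync replicas" can never silently mean "just the leader".

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("demo-topic", "key", "value");
            // get() blocks until the broker acknowledges (or the send fails),
            // so durability errors surface to the caller instead of vanishing.
            producer.send(record).get();
        }
    }
}
```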
2. Consumer
Disable automatic offset commits (set enable.auto.commit to false). The partition offset records how far the corresponding Consumer Group has consumed within that partition, effectively marking which records have already been processed.
If the Consumer commits the offset manually only after a batch of records has been fully processed, a crash mid-processing cannot cause message loss: the uncommitted records are simply redelivered after a restart. (This is at-least-once delivery, so the processing logic should tolerate duplicates.) A sketch of the pattern follows.
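A minimal sketch of manual committing, assuming the same placeholder topic and bootstrap address as above and a hypothetical group id demo-group:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", "demo-group");              // placeholder group id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        // Turn off auto-commit so offsets advance only when we say so.
        props.put("enable.auto.commit", "false");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // A crash here leaves the offset uncommitted, so these
                    // records are redelivered on restart rather than lost.
                    process(record);
                }
                // Commit only after the whole batch has been processed.
                consumer.commitSync();
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}
```

The trade-off of committing per batch rather than per record is fewer commit round-trips at the cost of a larger redelivery window after a crash.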