Alert Management Threshold with Monitoring for Kafka/Confluent JMX metrics

强颜欢笑 submitted on 2019-12-25 03:27:31

Question


I am building an alert monitoring tool for Kafka.

I understand that for some metrics the threshold depends on application data. I am only interested in the metrics and threshold values that will help me measure lag and determine whether any scaling is required.

So far I can do the following:

  • Enable JMX on the Kafka broker.
  • Fetch JMX metrics using a JMX Java client or JConsole.
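For context, the fetch step can be sketched roughly as follows. This is only a minimal sketch: the class name `JmxMetricReader`, the helper names, and the host/port `localhost:9999` are assumptions for illustration (the port is whatever `JMX_PORT` the broker was started with). The MBean name is the usual broker name for `UnderReplicatedPartitions`, whose value is exposed through the `Value` attribute.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class JmxMetricReader {

    // Builds the standard RMI-based JMX service URL for a host and port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Reads a single attribute from an MBean over an existing connection.
    static Object readAttribute(MBeanServerConnection conn,
                                String mbeanName,
                                String attribute) throws Exception {
        return conn.getAttribute(new ObjectName(mbeanName), attribute);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the broker exposes JMX on localhost:9999.
        JMXServiceURL url = new JMXServiceURL(serviceUrl("localhost", 9999));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            Object value = readAttribute(
                conn,
                "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
                "Value");
            System.out.println("UnderReplicatedPartitions = " + value);
        }
    }
}
```

Keeping `readAttribute` separate from the connection setup means the same helper can be pointed at any `MBeanServerConnection`, including a local one while testing.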

Next, I researched and found many metrics, but none came with complete thresholds (e.g. a specific value, a pattern such as increasing or decreasing, or some calculation) on which I could base my alerting logic.

A few examples follow:

UnderReplicatedPartitions - alert if the value is greater than 0.
records-lag-max - alert if the value increases over time.
OfflinePartitionsCount - alert if the value is greater than 0.
ActiveControllerCount - alert if the value is anything other than 1.
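To make the kind of logic I have in mind concrete, here is a rough sketch of these four rules as plain Java checks. The class and method names are hypothetical. The "increases over time" rule is interpreted here as "every sample in the window is larger than the previous one", which is only one possible reading. Note also that `ActiveControllerCount` is reported per broker (1 on the controller, 0 elsewhere), so the check below assumes the values have already been summed across the cluster.

```java
import java.util.List;

class KafkaAlertRules {

    // UnderReplicatedPartitions: alert on any value above zero.
    static boolean underReplicatedAlert(long value) {
        return value > 0;
    }

    // OfflinePartitionsCount: alert on any value above zero.
    static boolean offlinePartitionsAlert(long value) {
        return value > 0;
    }

    // ActiveControllerCount: alert unless the cluster-wide sum is exactly 1.
    // Assumption: the caller has summed the per-broker values.
    static boolean activeControllerAlert(long clusterSum) {
        return clusterSum != 1;
    }

    // records-lag-max: alert when the window holds at least minSamples
    // samples and each sample is strictly larger than the one before it.
    static boolean lagIncreasingAlert(List<Double> samples, int minSamples) {
        if (samples.size() < 2 || samples.size() < minSamples) {
            return false; // not enough data to call it a trend
        }
        for (int i = 1; i < samples.size(); i++) {
            if (samples.get(i) <= samples.get(i - 1)) {
                return false; // lag dipped or stayed flat at least once
            }
        }
        return true;
    }
}
```

A strictly increasing window is deliberately conservative. A single dip resets the rule, so a noisy consumer may need a longer window or a slope-based check instead.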

Source: https://stackoverflow.com/questions/52479009/alert-management-threshold-with-monitoring-for-kafka-confluent-jmx-metrics
