Monitoring Apache Spark with Prometheus


I have read that Spark does not have Prometheus as one of its pre-packaged sinks. So I found this post on how to monitor Apache Spark with Prometheus.

But I found it difficult to follow.

2 Answers
  • 2020-12-30 03:38

    I have followed the GitHub README and it worked for me (the original blog post assumes that you use the Banzai Cloud fork, as they expected their PR to be accepted upstream). They have since externalized the sink into a standalone project (https://github.com/banzaicloud/spark-metrics), and I used that to make it work with Spark 2.3.

    Alternatively, Prometheus can scrape metrics through JMX, in which case you don't need the sink at all. The Banzai Cloud folks wrote a post about how they use JMX for Kafka, but the same approach works for any JVM.

    So basically you have two options:

    • use the sink, or

    • go through JMX.

    They open-sourced both options; a sketch of the sink configuration is shown below.
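
    To illustrate the sink option, here is a minimal conf/metrics.properties sketch based on the spark-metrics README. The sink class package and the Pushgateway address are assumptions to verify against the version you use; unlike the JMX approach, this sink pushes metrics to a Prometheus Pushgateway instead of exposing an HTTP endpoint itself.

        # conf/metrics.properties -- enable the Banzai Cloud Prometheus sink for all instances
        # (class package and property keys follow the spark-metrics README; verify for your version)
        *.sink.prometheus.class=org.apache.spark.banzaicloud.metrics.sink.PrometheusSink
        # Pushgateway the sink pushes metrics to (assumed local default)
        *.sink.prometheus.pushgateway-address-protocol=http
        *.sink.prometheus.pushgateway-address=127.0.0.1:9091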

  • 2020-12-30 03:43

    There are a few ways to monitor Apache Spark with Prometheus.

    One of them is JmxSink + jmx-exporter.

    Preparations

    • Uncomment *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink in spark/conf/metrics.properties (see the sketch after this list)
    • Download the jmx-exporter agent JAR from prometheus/jmx_exporter
    • Download an example jmx-exporter config file, e.g. spark.yml
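
    After the first step, the relevant part of conf/metrics.properties looks like the sketch below. The JvmSource lines are optional additions taken from Spark's own metrics.properties.template; they expose JVM metrics alongside Spark's metrics.

        # conf/metrics.properties
        # Enable the JmxSink for all instances (driver, executors, master, worker, ...)
        *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
        # Optional: also publish JVM metrics (lines from metrics.properties.template)
        driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
        executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource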

    Use it in spark-shell or spark-submit

    In the following command, jmx_prometheus_javaagent-0.3.1.jar and spark.yml are the files downloaded in the previous steps; the names and paths may need to be adjusted accordingly.

    bin/spark-shell --conf "spark.driver.extraJavaOptions=-javaagent:jmx_prometheus_javaagent-0.3.1.jar=8080:spark.yml" 
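
    If you don't have the downloaded spark.yml at hand, a minimal catch-all jmx-exporter config works as a stand-in; it passes every JMX attribute through unchanged, whereas real configs usually add rewrite rules:

        # spark.yml -- minimal jmx-exporter config (assumed stand-in for the downloaded example)
        lowercaseOutputName: true
        rules:
          - pattern: ".*"

    Note that spark.driver.extraJavaOptions only instruments the driver; to scrape executors as well, attach the agent via spark.executor.extraJavaOptions too (each executor JVM then needs its own port).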
    

    Access it

    Once it is running, the metrics are available at localhost:8080/metrics.
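
    A quick way to verify from another terminal (the output is Prometheus text format; actual metric names depend on your spark.yml rules):

        curl -s http://localhost:8080/metrics | head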

    Next

    Prometheus can then be configured to scrape the metrics from jmx-exporter, as in the sketch below.
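
    A minimal scrape job in prometheus.yml might look like this; the job name and target are assumptions matching the command above:

        # prometheus.yml -- scrape the jmx-exporter endpoint exposed on the Spark driver
        scrape_configs:
          - job_name: 'spark'
            static_configs:
              - targets: ['localhost:8080']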

    NOTE: The static target above only covers a single driver; in a cluster environment the discovery part has to be handled properly, e.g. with one of Prometheus's service-discovery mechanisms.
