monitoring | 易学教程

Adding two values in Prometheus

Read more about Adding two values in Prometheus

Question: We need to add the results of two queries in Prometheus. The expression is: (probe_ssl_earliest_cert_expiry{job="SSL-expiry"} - time() < 86400 * 738 )*1000 + (node_time_seconds*1000) but the query returns no data. Answer 1: You will get an empty result if the label sets of the two sides do not match. For a binary operation vector1 <op> vector2 between two instant vectors, Prometheus by default pairs each element of vector1 with the element of vector2 that has exactly the same label set, and drops any element that has no match.
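The matching rule in the answer can be illustrated with a small Python simulation. This is only a sketch of the idea, not how Prometheus is implemented: the label values (`example.com:443`, `host1:9100`, `job="node"`), the sample values, and the helper names `labels`, `add_vectors`, and `add_vectors_on` are all hypothetical. It shows why the sum is empty when the two sides carry different labels, and how restricting the match to a chosen set of labels (what `on(...)` does in PromQL) produces a result.

```python
from typing import Dict, FrozenSet, Tuple

# A label set is modelled as a frozenset of (name, value) pairs.
# The metric name is left out, since it does not take part in matching.
Labels = FrozenSet[Tuple[str, str]]
Vector = Dict[Labels, float]


def labels(**kv: str) -> Labels:
    """Build a hashable label set from keyword arguments."""
    return frozenset(kv.items())


def add_vectors(left: Vector, right: Vector) -> Vector:
    """Default matching: keep only label sets present on both sides."""
    return {ls: v + right[ls] for ls, v in left.items() if ls in right}


def add_vectors_on(left: Vector, right: Vector, on: Tuple[str, ...]) -> Vector:
    """Match only on the labels named in `on`; the result keeps just those labels.

    Simplification: assumes at most one series per key on each side.
    """
    def key(ls: Labels) -> Labels:
        return frozenset((k, v) for k, v in ls if k in on)

    right_by_key = {key(ls): v for ls, v in right.items()}
    return {
        key(ls): v + right_by_key[key(ls)]
        for ls, v in left.items()
        if key(ls) in right_by_key
    }


# Hypothetical series: the two metrics come from different jobs and targets.
ssl = {labels(job="SSL-expiry", instance="example.com:443"): 1000.0}
node = {labels(job="node", instance="host1:9100"): 2000.0}

print(add_vectors(ssl, node))            # {} : no label set matches exactly
print(add_vectors_on(ssl, node, on=()))  # matching on no labels pairs the two series
```

Matching on an empty label list only makes sense when each side has a single series; with several series per side, the match has to be restricted to labels that both metrics actually share.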

Too Many Files Open while monitoring file changes

Read more about Too Many Files Open while monitoring file changes

Source: https://stackoverflow.com/questions/61452583/too-many-files-open-while-monitoring-file-changes

Measuring load per database in Postgres using 'active' processes in pg_stat_activity?

Read more about Measuring load per database in Postgres using 'active' processes in pg_stat_activity?

Source: https://stackoverflow.com/questions/47336725/measuring-load-per-database-in-postgres-using-active-processes-in-pg-stat-acti

Monitoring and alerting on pod status or restart with Google Container Engine (GKE) and Stackdriver

Read more about Monitoring and alerting on pod status or restart with Google Container Engine (GKE) and Stackdriver

Question: Is there a way to monitor the pod status and restart count of pods running in a GKE cluster with Stackdriver? While I can see CPU, memory and disk usage metrics for all pods in Stackdriver, there seems to be no way of getting metrics about crashing pods or pods in a replica set being restarted due to crashes. I'm using a Kubernetes replica set to manage the pods, hence they are respawned and created with a new name when they crash. As far as I can tell the metrics in Stackdriver appear by pod

Can't display data with log-based metric

Read more about Can't display data with log-based metric

Question: I’m struggling to create a chart in Stackdriver Monitoring with a log-based metric. My metric is a counter with no unit by default. Logs are available for my log-based metric, but when I create a chart with my metric, it says no data is available for the... . Here’s my metric, which does work (called isOperatorAllowed): resource.type="container" resource.labels.namespace_id="default" jsonPayload.message="CaseForOperator flags" logName="projects/PROJECT-ID/logs/app" jsonPayload

How to monitor consumer lag in Kafka via JMX?

Read more about How to monitor consumer lag in Kafka via JMX?

Question: I have a Kafka setup that includes a JMX exporter to Prometheus. I'm looking for a metric that gives the offset lag per topic and groupid. I'm running Kafka 2.2.0. Some resources online point to a metric called kafka.consumer , but I have no such metric in my setup. From my jmxterminal: $>domains #following domains are available JMImplementation com.sun.management java.lang java.nio java.util.logging jdk.management.jfr kafka kafka.cluster kafka.controller kafka.coordinator.group kafka

Grafana: Panel with time of last result

Read more about Grafana: Panel with time of last result

Question: I have an Elasticsearch instance that receives logs from multiple backup routines. I'd like to query ES for these logs from Grafana and set up a panel that shows the time of the last run for each backup. Ideally I would also like to color the value when the elapsed time exceeds a certain threshold. The idea is to have a display that shows, for instance, green if a certain backup has been completed in the last 24 hours, and red if it hasn't. How would I do this in Grafana