Excerpted from: Stackdriver tips and tricks: Understanding metrics and building charts
Understanding the Google Stackdriver metrics model
To build useful charts, it helps to understand how the Stackdriver metrics model works under the hood. Knowing the model makes it easier to configure charts in Stackdriver Metrics Explorer and custom dashboards.
All metrics are made up of two things:
1. A metric descriptor that defines what the metric is and what resource it applies to; for example, a "CPU usage" metric that applies to the "Compute Engine instance" resource type. The metric descriptor also defines a set of labels that serve as identifiers or metadata for the system writing the metric. For example, our disk write operations metric has a label called "device_name" that identifies which disk a data point is associated with.
2. A time series: a set of points, each a combination of [time, value, labels/resource], written against the metric descriptor.
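The two parts described above can be pictured as a small data model. The following is a minimal sketch, not the Stackdriver API: the class names, field names, and the example strings (`disk/write_ops_count`, `gce_instance`, `instance_id`) are assumptions chosen for illustration; only the "device_name" label comes from the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class MetricDescriptor:
    """What the metric is, which resource type it applies to, and its label keys."""
    name: str
    resource_type: str
    label_keys: tuple


@dataclass(frozen=True)
class Point:
    """One [time, value] sample; labels and resource live on the series."""
    time: datetime
    value: float


@dataclass
class TimeSeries:
    """Points written against one descriptor for one label/resource combination."""
    descriptor: MetricDescriptor
    resource: dict
    labels: dict
    points: list = field(default_factory=list)

    def __post_init__(self):
        # Every series must supply exactly the labels its descriptor declares.
        if set(self.labels) != set(self.descriptor.label_keys):
            raise ValueError("labels do not match the metric descriptor")

    def write(self, time, value):
        self.points.append(Point(time, value))


# Hypothetical descriptor for a disk write operations metric.
disk_writes = MetricDescriptor(
    name="disk/write_ops_count",
    resource_type="gce_instance",
    label_keys=("device_name",),
)

# One time series per distinct (resource, labels) combination.
series = TimeSeries(
    descriptor=disk_writes,
    resource={"instance_id": "example-instance"},
    labels={"device_name": "disk-1"},
)
series.write(datetime(2018, 1, 1, 0, 0, tzinfo=timezone.utc), 42)
series.write(datetime(2018, 1, 1, 0, 1, tzinfo=timezone.utc), 57)
```

In this sketch a second disk on the same instance would get its own `TimeSeries` with a different "device_name" value, which is how a single metric descriptor fans out into several lines on a chart.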
Stackdriver metrics are one of three kinds: gauge, delta, and cumulative.
1. A gauge metric measures a value at a particular point in time; for example, “CPU utilization” for a Compute Engine instance, or “Instance count” for an App Engine app. A chart of CPU utilization will have points showing, as expected, the CPU utilization at that moment in time.
2. A delta metric measures the change in a value over a sample period, often one minute. An example is the “Backend request count” for a load balancer. A chart of the backend request count will have one point for each minute, showing how many requests hit the load balancer in that minute.
3. A cumulative metric measures a value that constantly increases, such as “sent bytes count” for Firebase. Cumulative metrics are never drawn directly in practice; you always use aligners (discussed below) to turn them into gauge or delta metrics first. If you could draw the raw data for “sent bytes count,” you would see a line that only ever rises as the total number of sent bytes grows without bound.
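To make the relationship between the three kinds concrete, here is a minimal sketch of turning cumulative samples into per-interval deltas and per-second rates. It is a simplification under stated assumptions: the function names and sample data are hypothetical, samples are assumed to be sorted and evenly spaced, and real aligners also deal with alignment periods and counter resets.

```python
def to_deltas(cumulative):
    """Turn (timestamp_seconds, running_total) samples into per-interval changes.

    Each output point is stamped with the end of its interval.
    """
    return [
        (t1, v1 - v0)
        for (t0, v0), (t1, v1) in zip(cumulative, cumulative[1:])
    ]


def to_rates(cumulative):
    """Turn cumulative samples into a per-second rate for each interval."""
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(cumulative, cumulative[1:])
    ]


# Hypothetical "sent bytes count" samples taken one minute apart.
sent_bytes = [(0, 1000), (60, 1600), (120, 1900), (180, 3100)]

deltas = to_deltas(sent_bytes)  # [(60, 600), (120, 300), (180, 1200)]
rates = to_rates(sent_bytes)    # [(60, 10.0), (120, 5.0), (180, 20.0)]
```

The raw totals only ever climb, which tells you little at a glance; the deltas and rates show that traffic dipped in the second minute and peaked in the third.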
Finally, there are two metric value types: numeric and distribution.
1. A numeric metric consists of streams that have numeric values for specific points in time. All of the examples in the previous section are numeric-valued metrics.
2. A distribution metric consists of (usually) one stream that has an array of “buckets” at each point in time; these buckets are used to draw a heatmap. Examples are “Backend latency” for a load balancer or “Execution times” for BigQuery.
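One way to picture a single point of a distribution metric is as a histogram: instead of one number, the point holds a count per bucket. The sketch below assumes explicit bucket boundaries and made-up latency values; the function name and the boundaries are illustrative, not taken from Stackdriver.

```python
from bisect import bisect_right


def bucket_counts(values, bounds):
    """Count how many values fall into each bucket defined by sorted bounds.

    With bounds [b0, b1, ..., bn] the buckets are
    (-inf, b0), [b0, b1), ..., [bn, +inf), so there are len(bounds) + 1 counts.
    """
    counts = [0] * (len(bounds) + 1)
    for v in values:
        counts[bisect_right(bounds, v)] += 1
    return counts


# Hypothetical backend latencies (ms) observed during one sample period.
latencies_ms = [3, 12, 48, 75, 75, 220, 900]
bounds_ms = [10, 50, 100, 500]

point = bucket_counts(latencies_ms, bounds_ms)  # [1, 2, 2, 1, 1]
```

Stacking one such array per point in time, with color encoding the count in each bucket, gives the heatmap view described above.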