Configure Metrics Sets
HyperShift creates ServiceMonitor resources in each control plane namespace that allow a Prometheus stack to scrape metrics from the control planes. ServiceMonitors use metrics relabelings to define which metrics are included or excluded from a particular component (etcd, Kube API server, etc) The number of metrics produced by control planes has a direct impact on resource requirements of the monitoring stack scraping them.
Instead of producing a fixed number of metrics that apply to all situations, HyperShift allows configuration of a "metrics set" that identifies a set of metrics to produce per control plane.
The following metrics sets are supported:
Telemetry- metrics needed for telemetry. This is the default and the smallest set of metrics.
SRE- Configurable metrics set, intended to include necessary metrics to produce alerts and allow troubleshooting of control plane components.
All- all the metrics produced by standalone OCP control plane components.
The metrics set is configured by setting the
METRICS_SET environment variable in the HyperShift
oc set env -n hypershift deployment/operator METRICS_SET=All
Configuring the SRE Metrics Set
When the SRE metrics set is specified, the HyperShift operator looks for a ConfigMap named
sre-metric-set with a single key:
config. The value of the
config key should contain a set
of RelabelConfigs organized by control plane component. An example of this configuration can be
support/metrics/testdata/sreconfig.yaml in this repository.
The following components can be specified: