etcd/Documentation/etcd-mixin
Clayton Coleman 322c38e169 Documentation/etcd-mixin: Fix etcdHighNumberOfLeaderChanges (#11448)
The `etcdHighNumberOfLeaderChanges` alert had a copy-and-paste
error when it was converted from docs to mixin in #10244: we moved
from "increase over 15m > 3" to "rate over 15m > 3", which is not
the same (rate is measured per second, so it should have been
"rate over 15m > (3 / 60 / 15)").  As part of fixing that, we
need to capture when Prometheus starts or when new etcd clusters
are first scraped with a high leader change count, i.e. if you start
a new etcd cluster and at the moment Prometheus first scrapes it you
are already at 5 leader changes, we should fire on that transition.

This alert is also now more responsive, so if you get a quick
burst of 3 leader changes we'll alert within 5m rather than 15m.
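
As a worked check of the threshold arithmetic in the commit message above (a sketch only; the subscripted names are shorthand for the PromQL `increase` and `rate` functions over a 15-minute window, not identifiers from the mixin):

```latex
\[
\mathrm{increase}_{15\mathrm{m}} = \mathrm{rate}_{15\mathrm{m}} \times (15 \times 60\,\mathrm{s})
\quad\Longrightarrow\quad
\mathrm{increase}_{15\mathrm{m}} > 3
\iff
\mathrm{rate}_{15\mathrm{m}} > \frac{3}{15 \times 60} = \frac{3}{900} \approx 0.0033\ \mathrm{s}^{-1}
\]
```

Read the other way, the mistaken "rate over 15m > 3" threshold would have required more than 3 × 900 = 2700 leader changes within 15 minutes before firing.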
2019-12-13 16:00:11 -08:00
README.md Documentation/etcd-mixin: Fix EtcdInsufficientMembers alerting 2018-10-15 19:23:43 +02:00
mixin.libsonnet Documentation/etcd-mixin: Fix etcdHighNumberOfLeaderChanges (#11448) 2019-12-13 16:00:11 -08:00
test.yaml Documentation/etcd-mixin: Fix etcdHighNumberOfLeaderChanges (#11448) 2019-12-13 16:00:11 -08:00

README.md

Prometheus Monitoring Mixin for etcd

NOTE: This project is in alpha stage. Flags, configuration, behaviour, and design may change significantly in future releases.

A set of customisable Prometheus alerts for etcd.

Instructions for use are the same as for the kubernetes-mixin.

Background

  • For more information about monitoring mixins, see this design doc.

Testing alerts

Make sure jsonnet, gojsontoyaml, and promtool are installed.

First compile the mixin to a YAML file, which promtool will read:

jsonnet -e '(import "mixin.libsonnet").prometheusAlerts' | gojsontoyaml > mixin.yaml

Then run the unit test:

promtool test rules test.yaml
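
The two steps above can be chained into one helper. This is a minimal sketch, assuming a POSIX shell and that it is run from the directory containing mixin.libsonnet and test.yaml; the `require_tool` and `run_mixin_tests` functions are hypothetical names introduced here and are not part of the mixin.

```shell
# Sketch only: require_tool and run_mixin_tests are hypothetical helpers,
# not something shipped with etcd-mixin.

# Succeed if the named tool is on PATH; otherwise print a message to
# stderr and return non-zero.
require_tool() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "missing required tool: $1" >&2
        return 1
    }
}

# Compile the mixin to mixin.yaml, then run the promtool unit tests.
run_mixin_tests() {
    for tool in jsonnet gojsontoyaml promtool; do
        require_tool "$tool" || return 1
    done

    # Same two commands as in the README steps above.
    jsonnet -e '(import "mixin.libsonnet").prometheusAlerts' | gojsontoyaml > mixin.yaml || return 1
    promtool test rules test.yaml
}
```

After sourcing these definitions, calling `run_mixin_tests` stops early with a message naming the first missing tool, rather than failing partway through the pipeline.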