The default window for the etcdMembersDown network failure rate function was
recently changed to 1 minute. While this helps detect an etcd recovery more
quickly, it depends on scrape intervals of <= 15s to collect sufficient data
points for the rate function. In practice, an interval of >= 30s is more
typical, which causes the rate function to be less accurate.
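As a rough illustration of the reasoning above, the sketch below estimates how many samples fall inside a rate window for a given scrape interval. The helper name `samples_in_window` is hypothetical and the integer division is only an approximation; Prometheus' `rate()` needs at least two samples in the window, so a window holding only about two leaves no margin for a missed scrape.

```shell
# Hypothetical helper: approximate number of scraped samples inside a
# rate() window, given the window and scrape interval in seconds.
samples_in_window() {
  window_s=$1
  interval_s=$2
  echo $(( window_s / interval_s ))
}

# Compare the 1m, 2m and 3m windows at 15s and 30s scrape intervals.
for window in 60 120 180; do
  for interval in 15 30; do
    echo "window=${window}s interval=${interval}s samples~$(samples_in_window "$window" "$interval")"
  done
done
```

With a 30s interval, the 1m window holds about two samples, while the 2m window holds about four.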
This patch increases the window to 2m, which is a compromise between the
original value of 3m and the 1m change introduced with
Files in this directory:
- README.md
- mixin.libsonnet
- test.yaml
README.md
Prometheus Monitoring Mixin for etcd
NOTE: This project is alpha stage. Flags, configuration, behaviour and design may change significantly in following releases.
A set of customisable Prometheus alerts for etcd.
Instructions for use are the same as the kubernetes-mixin.
Background
- For more information about monitoring mixins, see this design doc.
Testing alerts
Make sure to have jsonnet and gojsontoyaml installed.
First compile the mixin to a YAML file, which promtool will read:
jsonnet -e '(import "mixin.libsonnet").prometheusAlerts' | gojsontoyaml > mixin.yaml
Then run the unit test:
promtool test rules test.yaml
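The two steps above can be combined into one small script. This is a minimal sketch, assuming it is run from the directory containing mixin.libsonnet and test.yaml; the function names `missing_tools` and `run_mixin_tests` are hypothetical and not part of the project.

```shell
# Hypothetical helper: print every named tool that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Hypothetical wrapper: compile the mixin, then run the promtool unit tests.
run_mixin_tests() {
  missing=$(missing_tools jsonnet gojsontoyaml promtool)
  if [ -n "$missing" ]; then
    echo "missing tools: $missing" >&2
    return 1
  fi
  # In POSIX sh the pipeline status is that of gojsontoyaml, the last command.
  jsonnet -e '(import "mixin.libsonnet").prometheusAlerts' | gojsontoyaml > mixin.yaml || return 1
  promtool test rules test.yaml
}
```

Calling `run_mixin_tests` after sourcing the script returns a non-zero status if a tool is missing or if promtool reports a failing test.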