Documentation/etcd-mixin: Fix EtcdInsufficientMembers alerting

Currently the EtcdInsufficientMembers alert fires when more than (X/2)-1
instances are unavailable, where X is the number of instances. This changes it
to fire at the correct limit, when more than (X-1)/2 instances are unavailable.
$value now contains the number of available instances instead of unavailable
ones. Also adds a unit test for the EtcdInsufficientMembers alert.
release-3.4
Christian Beneke 2018-10-03 19:08:36 +02:00
parent dac8c6fcc0
commit c75ba98f81
3 changed files with 50 additions and 1 deletions
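
The threshold arithmetic in the commit message can be checked with a small sketch. The function names below are hypothetical and are not part of the mixin; the two rule functions only mirror the comparisons in the old and new alert expressions, and `has_quorum` assumes the usual majority definition of quorum (more than half of the members available).

```python
def old_rule_fires(total: int, unavailable: int) -> bool:
    """Old expression: count(up == 0) > count(up) / 2 - 1."""
    # count(up == 0) yields no series when nothing is down,
    # so the old rule cannot fire in that case.
    if unavailable == 0:
        return False
    return unavailable > total / 2 - 1


def new_rule_fires(total: int, unavailable: int) -> bool:
    """New expression: sum(up == bool 1) < (count(up) + 1) / 2."""
    available = total - unavailable
    return available < (total + 1) / 2


def has_quorum(total: int, unavailable: int) -> bool:
    """Assumed majority quorum: more than half of the members are available."""
    return (total - unavailable) >= total // 2 + 1


def first_firing(rule, total: int):
    """Smallest number of unavailable members for which the rule fires."""
    for down in range(total + 1):
        if rule(total, down):
            return down
    return None


for total in (3, 5):
    print(total, first_firing(old_rule_fires, total), first_firing(new_rule_fires, total))
```

For a three-member cluster the old comparison already fires with one member down, while the new one fires only once two members are down, which is the point at which a majority is lost.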

@@ -9,3 +9,17 @@ Instructions for use are the same as the [kubernetes-mixin](https://github.com/k
## Background
* For more information about monitoring mixins, see this [design doc](https://docs.google.com/document/d/1A9xvzwqnFVSOZ5fD3blKODXfsat5fg6ZhnKu9LK3lB4/edit#).
## Testing alerts
Make sure to have [jsonnet](https://jsonnet.org/) and [gojsontoyaml](https://github.com/brancz/gojsontoyaml) installed.
First, compile the mixin to a YAML file, which promtool will read:
```
jsonnet -e '(import "mixin.libsonnet").prometheusAlerts' | gojsontoyaml > mixin.yaml
```
Then run the unit test:
```
promtool test rules test.yaml
```

@@ -11,7 +11,7 @@
{
alert: 'EtcdInsufficientMembers',
expr: |||
-count(up{%(etcd_selector)s} == 0) by (job) > (count(up{%(etcd_selector)s}) by (job) / 2 - 1)
+sum(up{%(etcd_selector)s} == bool 1) by (job) < ((count(up{%(etcd_selector)s}) by (job) + 1) / 2)
||| % $._config,
'for': '3m',
labels: {

@@ -0,0 +1,35 @@
rule_files:
  - mixin.yaml

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      - series: 'up{job="etcd",instance="10.10.10.0"}'
        values: '1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0'
      - series: 'up{job="etcd",instance="10.10.10.1"}'
        values: '1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0'
      - series: 'up{job="etcd",instance="10.10.10.2"}'
        values: '1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0'
    alert_rule_test:
      - eval_time: 3m
        alertname: EtcdInsufficientMembers
      - eval_time: 7m
        alertname: EtcdInsufficientMembers
      - eval_time: 11m
        alertname: EtcdInsufficientMembers
        exp_alerts:
          - exp_labels:
              job: etcd
              severity: critical
            exp_annotations:
              message: 'Etcd cluster "etcd": insufficient members (1).'
      - eval_time: 15m
        alertname: EtcdInsufficientMembers
        exp_alerts:
          - exp_labels:
              job: etcd
              severity: critical
            exp_annotations:
              message: 'Etcd cluster "etcd": insufficient members (0).'
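
The expectations in this test can be traced by hand with a short simulation. This is a simplified model and not how promtool evaluates rules: it assumes one evaluation per minute and treats the alert as firing once the condition has held for at least the `for` duration of 3m. The names `SERIES`, `FOR_MINUTES` and `simulate` are hypothetical.

```python
# Input series copied from the test file, keyed by instance.
SERIES = {
    "10.10.10.0": "1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0",
    "10.10.10.1": "1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0",
    "10.10.10.2": "1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0",
}
FOR_MINUTES = 3  # mirrors 'for': '3m' in the alert rule


def simulate(series, for_minutes):
    """Return (minute, available, firing) for each one-minute evaluation step."""
    samples = [[int(v) for v in values.split()] for values in series.values()]
    total = len(samples)
    active_since = None  # minute at which the condition first became true
    timeline = []
    for minute, column in enumerate(zip(*samples)):
        available = sum(column)
        if available < (total + 1) / 2:
            if active_since is None:
                active_since = minute
            # Assumption: firing once the condition has held for >= for_minutes.
            firing = minute - active_since >= for_minutes
        else:
            active_since = None
            firing = False
        timeline.append((minute, available, firing))
    return timeline


timeline = simulate(SERIES, FOR_MINUTES)
for minute in (3, 7, 11, 15):
    _, available, firing = timeline[minute]
    print(f"{minute}m: available={available} firing={firing}")
```

Under these assumptions the alert is silent at 3m and 7m (three and two members available), and fires at 11m and 15m with one and zero members available, matching the values in the expected messages.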