Debugging etcd

Diagnosing issues in a distributed application is hard. etcd provides several debug features to help. Enable them with the -trace=* CLI flag or the trace=* config option.

Logging

Log verbosity can be raised to its maximum with either the -vvv CLI flag or the very_very_verbose=true config option.

Logs are written to stdout; no other logging destination is supported.

Metrics

etcd itself can generate a set of metrics. These metrics represent many different internal data points that can be helpful when debugging etcd servers.

Metrics reference

Each individual metric name is prefixed with etcd.<NAME>, where <NAME> is the configured name of the etcd server.

  • timer.appendentries.handle: amount of time a peer takes to process an AppendEntriesRequest, measured by the peer itself
  • timer.peer.<PEER>.heartbeat: amount of time a heartbeat operation to peer <PEER> takes, measured by the leader that initiated it
  • timer.command.<COMMAND>: amount of time a given command takes to be processed through the local server's raft state machine. This does not include time spent waiting on locks.

Fetching metrics over HTTP

Once tracing has been enabled on a given etcd server, all metric data is available at the server's /debug/metrics HTTP endpoint (e.g. http://127.0.0.1:4001/debug/metrics). An HTTP GET request against this endpoint returns the current state of all metrics in the etcd server.
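The GET request described above can be scripted. The following is a minimal Go sketch, assuming a server with tracing enabled is listening on 127.0.0.1:4001. The helper names metricsURL and fetchMetrics are hypothetical, and because the response format is not specified here, the body is printed as-is rather than parsed.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// metricsURL builds the metrics endpoint URL for an etcd address
// such as "127.0.0.1:4001". (Hypothetical helper, not part of etcd.)
func metricsURL(addr string) string {
	return "http://" + addr + "/debug/metrics"
}

// fetchMetrics issues an HTTP GET and returns the raw response body.
func fetchMetrics(client *http.Client, url string) ([]byte, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s from %s", resp.Status, url)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// A short timeout keeps the tool from hanging on an unresponsive server.
	client := &http.Client{Timeout: 5 * time.Second}
	body, err := fetchMetrics(client, metricsURL("127.0.0.1:4001"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetch failed:", err)
		return
	}
	os.Stdout.Write(body)
}
```

A non-200 status most likely means tracing is not enabled on the target server.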

Sending metrics to Graphite

etcd supports Graphite's Carbon plaintext protocol, a TCP wire protocol designed for shipping metric data to an aggregator. To send metrics to a Graphite endpoint using this protocol, use the -graphite-host CLI flag or the graphite_host config option (e.g. graphite_host=172.17.0.19:2003).
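When checking what arrives at the Graphite side, it helps to know the shape of the data. In the Carbon plaintext protocol each data point is one line of the form "<metric path> <value> <unix timestamp>". The Go sketch below formats and parses such a line. The type and function names are hypothetical, and the metric path in main is a made-up example, not a name etcd is known to emit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// carbonSample is one data point in Carbon's plaintext format.
type carbonSample struct {
	Path      string  // dotted metric path
	Value     float64 // metric value
	Timestamp int64   // seconds since the Unix epoch
}

// formatCarbonLine renders a sample as "<path> <value> <timestamp>\n".
func formatCarbonLine(s carbonSample) string {
	value := strconv.FormatFloat(s.Value, 'f', -1, 64)
	return fmt.Sprintf("%s %s %d\n", s.Path, value, s.Timestamp)
}

// parseCarbonLine splits one plaintext line back into its three fields.
func parseCarbonLine(line string) (carbonSample, error) {
	fields := strings.Fields(line)
	if len(fields) != 3 {
		return carbonSample{}, fmt.Errorf("want 3 fields, got %d in %q", len(fields), line)
	}
	value, err := strconv.ParseFloat(fields[1], 64)
	if err != nil {
		return carbonSample{}, fmt.Errorf("bad value %q: %v", fields[1], err)
	}
	ts, err := strconv.ParseInt(fields[2], 10, 64)
	if err != nil {
		return carbonSample{}, fmt.Errorf("bad timestamp %q: %v", fields[2], err)
	}
	return carbonSample{Path: fields[0], Value: value, Timestamp: ts}, nil
}

func main() {
	// Hypothetical metric path, used for illustration only.
	line := formatCarbonLine(carbonSample{
		Path:      "etcd.node1.example",
		Value:     1.5,
		Timestamp: 1400000000,
	})
	fmt.Print(line)

	sample, err := parseCarbonLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", sample)
}
```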

See an example graphite deploy script.

Generating additional metrics with Collectd

Collectd gathers metrics from the host running etcd. While these aren't metrics generated by etcd itself, it can be invaluable to compare etcd's view of the world to that of a separate process running next to etcd.

See an example collectd deploy script.

Profiling

etcd exposes profiling information from the Go pprof package over HTTP. The basic browsable interface is served by etcd at the /debug/pprof HTTP endpoint (e.g. http://127.0.0.1:4001/debug/pprof). For more information on using profiling tools, see http://blog.golang.org/profiling-go-programs.

NOTE: In the following examples, the local ./bin/etcd binary must be identical to the etcd binary running on the server you are profiling (same git hash, architecture, platform, etc.). Otherwise pprof cannot resolve symbols correctly.

Heap memory profile

go tool pprof ./bin/etcd http://127.0.0.1:4001/debug/pprof/heap

CPU profile

go tool pprof ./bin/etcd http://127.0.0.1:4001/debug/pprof/profile

Blocked goroutine profile

go tool pprof ./bin/etcd http://127.0.0.1:4001/debug/pprof/block
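The commands above fetch a profile and open it in one step. When the machine running pprof cannot reach the server, a profile can be saved to a file first and inspected later. The following Go sketch does this under two assumptions: the endpoints are served by Go's standard net/http/pprof handlers, and the CPU profile endpoint therefore accepts a seconds query parameter. The helper names profileURL and saveProfile and the output file name cpu.prof are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// profileURL builds a pprof endpoint URL for the given profile kind
// ("heap", "profile", "block"). A positive seconds value is appended
// as a query parameter, which the standard CPU profile handler reads
// as the sampling duration.
func profileURL(addr, kind string, seconds int) string {
	u := "http://" + addr + "/debug/pprof/" + kind
	if seconds > 0 {
		u += fmt.Sprintf("?seconds=%d", seconds)
	}
	return u
}

// saveProfile downloads the profile at url and writes it to path.
func saveProfile(client *http.Client, url, path string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s from %s", resp.Status, url)
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	// The client timeout must exceed the CPU sampling window,
	// because the server responds only after sampling finishes.
	client := &http.Client{Timeout: 60 * time.Second}
	url := profileURL("127.0.0.1:4001", "profile", 30)
	if err := saveProfile(client, url, "cpu.prof"); err != nil {
		fmt.Fprintln(os.Stderr, "download failed:", err)
		return
	}
	fmt.Println("wrote cpu.prof")
}
```

The saved file can then be opened with go tool pprof ./bin/etcd cpu.prof, subject to the same matching-binary requirement noted above.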