version: bump up to 3.2.8

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
Documentation/op-guide: remove grafana demo link
2017-09-26 02:41:18 +09:00 · 2017-09-26 02:40:59 +09:00 · 2017-09-20 08:11:02 +09:00 · 2017-09-14 04:42:06 +09:00 · 2017-09-14 04:41:58 +09:00 · 2017-09-08 13:28:55 -07:00
7 changed files with 95 additions and 36 deletions
--- a/Documentation/op-guide/monitoring.md
+++ b/Documentation/op-guide/monitoring.md
@@ -1,6 +1,50 @@
 # Monitoring etcd

-Each etcd server exports metrics under the `/metrics` path on its client port.
+Each etcd server provides local monitoring information on its client port through HTTP endpoints. The monitoring data is useful for both system health checking and cluster debugging.
+
+## Debug endpoint
+
+If `--debug` is set, the etcd server exports debugging information on its client port under the `/debug` path. Take care when setting `--debug`, since there will be degraded performance and verbose logging.
+
+The `/debug/pprof` endpoint is the standard Go runtime profiling endpoint. This can be used to profile CPU, heap, mutex, and goroutine utilization. For example, here `go tool pprof` gets the top 10 functions where etcd spends its time:
+
+```sh
+$ go tool pprof http://localhost:2379/debug/pprof/profile
+Fetching profile from http://localhost:2379/debug/pprof/profile
+Please wait... (30s)
+Saved profile in /home/etcd/pprof/pprof.etcd.localhost:2379.samples.cpu.001.pb.gz
+Entering interactive mode (type "help" for commands)
+(pprof) top10
+310ms of 480ms total (64.58%)
+Showing top 10 nodes out of 157 (cum >= 10ms)
+    flat  flat%   sum%        cum   cum%
+   130ms 27.08% 27.08%      130ms 27.08%  runtime.futex
+    70ms 14.58% 41.67%       70ms 14.58%  syscall.Syscall
+    20ms  4.17% 45.83%       20ms  4.17%  github.com/coreos/etcd/cmd/vendor/golang.org/x/net/http2/hpack.huffmanDecode
+    20ms  4.17% 50.00%       30ms  6.25%  runtime.pcvalue
+    20ms  4.17% 54.17%       50ms 10.42%  runtime.schedule
+    10ms  2.08% 56.25%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).AuthInfoFromCtx
+    10ms  2.08% 58.33%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).Lead
+    10ms  2.08% 60.42%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/wait.(*timeList).Trigger
+    10ms  2.08% 62.50%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/github.com/prometheus/client_golang/prometheus.(*MetricVec).hashLabelValues
+    10ms  2.08% 64.58%       10ms  2.08%  github.com/coreos/etcd/cmd/vendor/golang.org/x/net/http2.(*Framer).WriteHeaders
+```
+
+The `/debug/requests` endpoint gives gRPC traces and performance statistics through a web browser. For example, here is a `Range` request for the key `abc`:
+
+```
+When	Elapsed (s)
+2017/08/18 17:34:51.999317 	0.000244 	/etcdserverpb.KV/Range
+17:34:51.999382 	 .    65 	... RPC: from 127.0.0.1:47204 deadline:4.999377747s
+17:34:51.999395 	 .    13 	... recv: key:"abc"
+17:34:51.999499 	 .   104 	... OK
+17:34:51.999535 	 .    36 	... sent: header:<cluster_id:14841639068965178418 member_id:10276657743932975437 revision:15 raft_term:17 > kvs:<key:"abc" create_revision:6 mod_revision:14 version:9 value:"asda" > count:1
+```
+
+## Metrics endpoint
+
+Each etcd server exports metrics under the `/metrics` path on its client port and optionally on interfaces given by `--listen-metrics-urls`.
+>>>>>>> 607d0762e... Documentation/op-guide: remove grafana demo link

 The metrics can be fetched with `curl`:

@@ -75,8 +119,6 @@ Access: proxy

 Then import the default [etcd dashboard template][template] and customize. For instance, if Prometheus data source name is `my-etcd`, the `datasource` field values in JSON also need to be `my-etcd`.

-See the [demo][demo].
-
 Sample dashboard:

 ![](./etcd-sample-grafana.png)
@@ -85,4 +127,3 @@ Sample dashboard:
 [prometheus]: https://prometheus.io/
 [grafana]: http://grafana.org/
 [template]: ./grafana.json
-[demo]: http://dash.etcd.io/dashboard/db/test-etcd
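
Review note on the monitoring.md change: the metrics endpoint described above can also be read programmatically instead of with `curl`. The following is a minimal sketch, not part of this change set. It assumes a member serving plain HTTP on `http://localhost:2379`, and the helper names `filterMetricLines` and `fetchMetrics` are hypothetical. It keeps only the sample lines whose metric name starts with a given prefix.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// filterMetricLines returns the sample lines of a Prometheus text-format body
// whose metric name starts with prefix. Comment lines ("# HELP", "# TYPE")
// are skipped. This helper is illustrative and not part of etcd.
func filterMetricLines(body, prefix string) []string {
	var out []string
	for _, line := range strings.Split(body, "\n") {
		if strings.HasPrefix(line, "#") || !strings.HasPrefix(line, prefix) {
			continue
		}
		out = append(out, line)
	}
	return out
}

// fetchMetrics issues GET <baseURL>/metrics and filters the response body.
func fetchMetrics(baseURL, prefix string) ([]string, error) {
	resp, err := http.Get(strings.TrimRight(baseURL, "/") + "/metrics")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return filterMetricLines(string(data), prefix), nil
}

func main() {
	// Assumption: an etcd member is listening on the default client URL.
	lines, err := fetchMetrics("http://localhost:2379", "etcd_server_")
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

If `--listen-metrics-urls` is set, the same request should work against one of those URLs in place of the client URL.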
--- a/Documentation/platforms/aws.md
+++ b/Documentation/platforms/aws.md
@@ -6,7 +6,7 @@ This guide assumes operational knowledge of Amazon Web Services (AWS), specifica

 As a critical building block for distributed systems it is crucial to perform adequate capacity planning in order to support the intended cluster workload. As a highly available and strongly consistent data store increasing the number of nodes in an etcd cluster will generally affect performance adversely. This makes sense intuitively, as more nodes means more members for the leader to coordinate state across. The most direct way to increase throughput and decrease latency of an etcd cluster is allocate more disk I/O, network I/O, CPU, and memory to cluster members. In the event it is impossible to temporarily divert incoming requests to the cluster, scaling the EC2 instances which comprise the etcd cluster members one at a time may improve performance. It is, however, best to avoid bottlenecks through capacity planning.

-The etcd team has produced a [hardware recommendation guide]( ../op-guide/hardware.md) which is very useful for “ballparking” how many nodes and what instance type are necessary for a cluster.
+The etcd team has produced a [hardware recommendation guide](../op-guide/hardware.md) which is very useful for “ballparking” how many nodes and what instance type are necessary for a cluster.

 AWS provides a service for creating groups of EC2 instances which are dynamically sized to match load on the instances. Using an Auto Scaling Group ([ASG](http://docs.aws.amazon.com/autoscaling/latest/userguide/AutoScalingGroup.html)) to dynamically scale an etcd cluster is not recommended for several reasons including:

--- a/client/client.go
+++ b/client/client.go
@@ -372,12 +372,7 @@ func (c *httpClusterClient) Do(ctx context.Context, act httpAction) (*http.Respo
 			if err == context.Canceled || err == context.DeadlineExceeded {
 				return nil, nil, err
 			}
-			if isOneShot {
-				return nil, nil, err
-			}
-			continue
-		}
-		if resp.StatusCode/100 == 5 {
+		} else if resp.StatusCode/100 == 5 {
 			switch resp.StatusCode {
 			case http.StatusInternalServerError, http.StatusServiceUnavailable:
 				// TODO: make sure this is a no leader response
@@ -385,10 +380,16 @@ func (c *httpClusterClient) Do(ctx context.Context, act httpAction) (*http.Respo
 			default:
 				cerr.Errors = append(cerr.Errors, fmt.Errorf("client: etcd member %s returns server error [%s]", eps[k].String(), http.StatusText(resp.StatusCode)))
 			}
-			if isOneShot {
-				return nil, nil, cerr.Errors[0]
+			err = cerr.Errors[0]
+		}
+		if err != nil {
+			if !isOneShot {
+				continue
 			}
-			continue
+			c.Lock()
+			c.pinned = (k + 1) % leps
+			c.Unlock()
+			return nil, nil, err
 		}
 		if k != pinned {
 			c.Lock()
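
Review note on the client.go change: the reworked control flow treats a transport error and a 5xx response the same way. In normal mode the loop moves on to the next endpoint; in one-shot mode it returns the error but first advances the pinned endpoint, so the caller's next attempt goes elsewhere. The sketch below models only that flow, under simplifying assumptions: `attempt` and `doWithFailover` are hypothetical names, context cancellation is omitted, and the aggregated cluster error is replaced by a plain error.

```go
package main

import (
	"errors"
	"fmt"
)

// attempt is a hypothetical stand-in for one HTTP round trip to endpoint k.
// It returns the response status code or a transport error.
type attempt func(k int) (status int, err error)

// doWithFailover tries up to n endpoints starting at pinned. A transport
// error or a 5xx status counts as a failure. In normal mode a failure moves
// on to the next endpoint; in one-shot mode it returns the error and reports
// (k+1)%n as the new pinned endpoint.
func doWithFailover(try attempt, pinned, n int, oneShot bool) (status, newPinned int, err error) {
	for i := pinned; i < pinned+n; i++ {
		k := i % n
		status, err = try(k)
		if err == nil && status/100 == 5 {
			err = fmt.Errorf("endpoint %d returned server error %d", k, status)
		}
		if err != nil {
			if !oneShot {
				continue
			}
			return 0, (k + 1) % n, err
		}
		return status, k, nil
	}
	return 0, pinned, errors.New("all endpoints failed")
}

func main() {
	// Endpoint 0 answers 502 and endpoint 1 answers 418, mirroring the
	// test case added in client_test.go.
	try := func(k int) (int, error) {
		if k == 0 {
			return 502, nil
		}
		return 418, nil
	}
	_, p, err := doWithFailover(try, 0, 2, true)
	fmt.Println("one-shot:", p, err)
	s, p, err := doWithFailover(try, 0, 2, false)
	fmt.Println("normal:", s, p, err)
}
```

In the one-shot case the call fails, yet the pinned index moves from 0 to 1, which matches the `wantPinned: 1` expectation in the new test.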
--- a/client/client_test.go
+++ b/client/client_test.go
@@ -16,6 +16,7 @@ package client

 import (
 	"errors"
+	"fmt"
 	"io"
 	"io/ioutil"
 	"math/rand"
@@ -304,7 +305,9 @@ func TestHTTPClusterClientDo(t *testing.T) {
 	fakeErr := errors.New("fake!")
 	fakeURL := url.URL{}
 	tests := []struct {
-		client     *httpClusterClient
+		client *httpClusterClient
+		ctx    context.Context
+
 		wantCode   int
 		wantErr    error
 		wantPinned int
@@ -395,10 +398,30 @@ func TestHTTPClusterClientDo(t *testing.T) {
 			wantCode:   http.StatusTeapot,
 			wantPinned: 1,
 		},
+
+		// 500-level errors cause one-shot Do to fall through to the next endpoint
+		{
+			client: &httpClusterClient{
+				endpoints: []url.URL{fakeURL, fakeURL},
+				clientFactory: newStaticHTTPClientFactory(
+					[]staticHTTPResponse{
+						{resp: http.Response{StatusCode: http.StatusBadGateway}},
+						{resp: http.Response{StatusCode: http.StatusTeapot}},
+					},
+				),
+				rand: rand.New(rand.NewSource(0)),
+			},
+			ctx:        context.WithValue(context.Background(), &oneShotCtxValue, &oneShotCtxValue),
+			wantErr:    fmt.Errorf("client: etcd member  returns server error [Bad Gateway]"),
+			wantPinned: 1,
+		},
 	}

 	for i, tt := range tests {
-		resp, _, err := tt.client.Do(context.Background(), nil)
+		if tt.ctx == nil {
+			tt.ctx = context.Background()
+		}
+		resp, _, err := tt.client.Do(tt.ctx, nil)
 		if !reflect.DeepEqual(tt.wantErr, err) {
 			t.Errorf("#%d: got err=%v, want=%v", i, err, tt.wantErr)
 			continue
@@ -407,11 +430,9 @@ func TestHTTPClusterClientDo(t *testing.T) {
 		if resp == nil {
 			if tt.wantCode != 0 {
 				t.Errorf("#%d: resp is nil, want=%d", i, tt.wantCode)
+				continue
 			}
-			continue
-		}
-
-		if resp.StatusCode != tt.wantCode {
+		} else if resp.StatusCode != tt.wantCode {
 			t.Errorf("#%d: resp code=%d, want=%d", i, resp.StatusCode, tt.wantCode)
 			continue
 		}
--- a/e2e/ctl_v3_kv_test.go
+++ b/e2e/ctl_v3_kv_test.go
@@ -198,21 +198,15 @@ func getRevTest(cx ctlCtx) {
 }

 func getKeysOnlyTest(cx ctlCtx) {
-	var (
-		kvs = []kv{{"key1", "val1"}}
-	)
-	for i := range kvs {
-		if err := ctlV3Put(cx, kvs[i].key, kvs[i].val, ""); err != nil {
-			cx.t.Fatalf("getKeysOnlyTest #%d: ctlV3Put error (%v)", i, err)
-		}
+	if err := ctlV3Put(cx, "key", "val", ""); err != nil {
+		cx.t.Fatal(err)
 	}
-
-	cmdArgs := append(cx.PrefixArgs(), "get")
-	cmdArgs = append(cmdArgs, []string{"--prefix", "--keys-only", "key"}...)
-
-	err := spawnWithExpects(cmdArgs, []string{"key1", ""}...)
-	if err != nil {
-		cx.t.Fatalf("getKeysOnlyTest : error (%v)", err)
+	cmdArgs := append(cx.PrefixArgs(), []string{"get", "--keys-only", "key"}...)
+	if err := spawnWithExpect(cmdArgs, "key"); err != nil {
+		cx.t.Fatal(err)
+	}
+	if err := spawnWithExpects(cmdArgs, "val"); err == nil {
+		cx.t.Fatalf("got value but passed --keys-only")
 	}
 }

--- a/proxy/grpcproxy/kv.go
+++ b/proxy/grpcproxy/kv.go
@@ -189,7 +189,9 @@ func RangeRequestToOp(r *pb.RangeRequest) clientv3.Op {
 	if r.CountOnly {
 		opts = append(opts, clientv3.WithCountOnly())
 	}
-
+	if r.KeysOnly {
+		opts = append(opts, clientv3.WithKeysOnly())
+	}
 	if r.Serializable {
 		opts = append(opts, clientv3.WithSerializable())
 	}
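
Review note on the kv.go change: the proxy rebuilds a client operation from the incoming request, so any request flag without a matching branch is silently dropped. The sketch below illustrates that translation pattern with stand-in types only. `rangeFlags` and `optionNames` are hypothetical, and the option names are plain strings rather than real `clientv3` options.

```go
package main

import "fmt"

// rangeFlags is a hypothetical stand-in for the boolean fields of a range
// request that the proxy has to forward.
type rangeFlags struct {
	CountOnly    bool
	KeysOnly     bool
	Serializable bool
}

// optionNames lists, in order, the client options that correspond to the
// flags set on r. A flag without a branch here would never reach the server.
func optionNames(r rangeFlags) []string {
	var opts []string
	if r.CountOnly {
		opts = append(opts, "WithCountOnly")
	}
	if r.KeysOnly {
		opts = append(opts, "WithKeysOnly")
	}
	if r.Serializable {
		opts = append(opts, "WithSerializable")
	}
	return opts
}

func main() {
	fmt.Println(optionNames(rangeFlags{KeysOnly: true, Serializable: true}))
}
```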
--- a/version/version.go
+++ b/version/version.go
@@ -26,7 +26,7 @@ import (
 var (
 	// MinClusterVersion is the min cluster version this etcd binary is compatible with.
 	MinClusterVersion = "3.0.0"
-	Version           = "3.2.7"
+	Version           = "3.2.8"
 	APIVersion        = "unknown"

 	// Git SHA Value will be set during build
Author	SHA1	Message	Date
Gyu-Ho Lee	e211fb6de3	version: bump up to 3.2.8 Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-09-26 02:41:18 +09:00
Gyu-Ho Lee	fb7e274309	Documentation/op-guide: remove grafana demo link The dashboard was removed during Tectonic migration in AWS, while the Grafana still runs in GCP. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-09-26 02:40:59 +09:00
beth wright	4a61fcf42d	docs: remove link-breaking space	2017-09-20 08:11:02 +09:00
Anthony Romano	4c8fa30dda	e2e: test no value is returned in TestCtlV3GetKeysOnly Test was checking key name is returned, but was not correctly checking no value is returned.	2017-09-14 04:42:06 +09:00
Anthony Romano	01c4f35b30	grpcproxy: respect KeysOnly flag Fixes #8478	2017-09-14 04:41:58 +09:00
Anthony Romano	15e9510d2c	client: fail over to next endpoint on oneshot failure Fixes #8515	2017-09-08 13:28:55 -07:00
Gyu-Ho Lee	09b7fd4975	version: bump up to 3.2.7+git Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-09-01 14:03:26 -07:00