Compare commits

..

617 Commits

Author SHA1 Message Date
Benjamin Wang fee612d900
Merge pull request #16020 from tjungblu/putauthshort_3.4
[3.4] Early exit auth check on lease puts
2023-06-21 11:06:17 +01:00
Benjamin Wang d897e4f555
Merge pull request #16047 from kkkkun/cp-14457-to-3.4
[3.4] etcdserver: fix corruption check when server has just been compacted
2023-06-19 09:36:34 +01:00
Benjamin Wang a8d4009a94
Merge pull request #16089 from jmhbnz/release-3.4
[3.4] Backport .github/workflows: Read .go-version as a step and not separate workflow
2023-06-19 09:35:49 +01:00
James Blair f0a1499ce9
Backport .github/workflows: Read .go-version as a step and not separate workflow.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-06-16 20:45:14 +12:00
Thomas Jungblut afa0167538 Add first unit test for authApplierV3
This contains a slight refactoring to expose enough information
to write meaningful tests for auth applier v3.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2023-06-16 10:08:47 +02:00
kkkkun bce0d0b799 etcdserver: fix corruption check when server has just been compacted
Signed-off-by: kkkkun <scuzk373x@gmail.com>
2023-06-11 22:01:36 +08:00
Benjamin Wang ca4a717def
Merge pull request #16038 from daljitdokal/release-3.4
[3.4] Backport updating go to latest patch release 1.19.10
2023-06-10 20:36:04 +08:00
Daljit Singh 7b7140bd51 [3.4] Backport updating go to latest patch release 1.19.10
Signed-off-by: Daljit Singh <daljit.dokal@yahoo.co.nz>
2023-06-09 10:21:27 +12:00
Thomas Jungblut 96d0831770 Early exit auth check on lease puts
Mitigates #15993 by not checking each key individually for permission
when auth is entirely disabled or admin user is calling the method.

Backport of #16005

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2023-06-06 11:45:28 +02:00
Benjamin Wang a603c07989 bump version to 3.4.26
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-05-12 09:40:47 +08:00
Benjamin Wang 3f78c423b5
Merge pull request #15814 from mitake/backport-15656-3.4
Backport 15656 to release-3.4
2023-05-10 08:16:41 +08:00
Benjamin Wang 2db96e817f
Merge pull request #15861 from serathius/go-version-release-3.4
[release-3.4] Move go version to dedicated .go-version file
2023-05-10 04:50:42 +08:00
Marek Siarkowicz 6796a50397 Move go version to dedicated .go-version file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-09 14:56:34 +02:00
Hitoshi Mitake c62b5db79d tests: e2e and integration test for timetolive
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Co-authored-by: Benjamin Wang <wachao@vmware.com>
2023-05-08 22:54:54 +09:00
Hitoshi Mitake 71e85e9ded etcdserver: protect lease timetilive with auth
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Co-authored-by: Benjamin Wang <wachao@vmware.com>
2023-05-08 22:54:54 +09:00
Benjamin Wang 27d362ae94
Merge pull request #15823 from jmhbnz/release-3.4-backport
[3.4] Backport updating go to latest patch release 1.19.9
2023-05-05 08:16:53 +08:00
James Blair 9925f90161
Backport go update to latest patch release 1.19.9.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-05-04 15:20:32 +12:00
James Blair 2ce1c37160
Backport centralising go version for actions workflows.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-05-04 15:19:39 +12:00
Benjamin Wang 392144d73a
Merge pull request #15788 from sharathsivakumar/release-3.4
[3.4] server: backport 15743, improved description of --initial-cluster-state
2023-04-27 04:12:48 +08:00
sharathsivakumar 7fa519fa24
server: backport 15743, improved description of --initial-cluster-state
Signed-off-by: sharathsivakumar <mailssr9@gmail.com>
2023-04-26 17:08:29 +02:00
Benjamin Wang 94593e63d4
Merge pull request #15715 from ahrtr/fix_release_20230414
[3.4] fix release.sh: git_assert_branch_in_sync not exist in 3.4
2023-04-14 15:19:34 +08:00
Benjamin Wang 46c6ea552e fix release.sh: git_assert_branch_in_sync not exist in 3.4
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-14 14:48:34 +08:00
Benjamin Wang bc19b67f16 bump version to 3.4.25
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-14 14:21:15 +08:00
Benjamin Wang b56268ac48
Merge pull request #15677 from ahrtr/jwt_panic_3.4_20230410
[3.4] etcdserver: verify field 'username' and 'revision' present when decoding a JWT token
2023-04-11 18:44:43 +08:00
Marek Siarkowicz 1d759fc8bd
Merge pull request #15697 from ahrtr/3.4_request_progress_20230411
[3.4] etcdserver: guarantee order of requested progress notification
2023-04-11 10:49:21 +02:00
Benjamin Wang 90e4d04c8e etcdserver: guarantee order of requested progress notification
Progress notifications requested using ProgressRequest were sent
directly using the ctrlStream, which means that they could race
against watch responses in the watchStream.

This would especially happen when the stream was not synced - e.g. if
you requested a progress notification on a freshly created unsynced
watcher, the notification would typically arrive indicating a revision
for which not all watch responses had been sent.

This changes the behaviour so that v3rpc always goes through the watch
stream, using a new RequestProgressAll function that closely matches
the behaviour of the v3rpc code - i.e.

1. Generate a message with WatchId -1, indicating the revision for
   *all* watchers in the stream

2. Guarantee that a response is (eventually) sent

The latter might require us to defer the response until all watchers
are synced, which is likely as it should be. Note that we do *not*
guarantee that the number of progress notifications matches the number
of requests, only that eventually at least one gets sent.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-11 12:47:09 +08:00
Benjamin Wang 881147f5d8
Merge pull request #15681 from jmhbnz/release-3.4
[3.4] Backport fix for all docker images showing amd64 architecture
2023-04-10 19:31:43 +08:00
James Blair 8f0a8a1271
Backport fix for all docker images showing amd64 architecture.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-04-10 22:43:10 +12:00
Benjamin Wang abdc3cc41f
Merge pull request #15609 from pchan/automated-cherry-pick-of-#15505-upstream-release-3.4
[3.4] Add testing of etcd in local image in release workflow
2023-04-10 16:37:01 +08:00
Prasad Chandrasekaran 4a826042f1 scripts: Add testing of etcd in local image in release workflow.
Signed-off-by: Prasad Chandrasekaran <prasadc@vmware.com>
Co-authored-by: Benjamin Wang <wachao@vmware.com>
2023-04-10 13:25:57 +05:30
Benjamin Wang b000f15049 etcdserver: verify field 'username' and 'revision' present when decoding a JWT token
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-10 08:26:12 +08:00
Marek Siarkowicz 4b91b6d800
Merge pull request #15662 from ahrtr/backport_15447_3.4_20230407
[3.4] etcdserver: set zap logging to wsproxy
2023-04-07 10:55:55 +02:00
Benjamin Wang b48cf63488
Merge pull request #15655 from mitake/3.4-backport-15648
[3.4] backport 15648
2023-04-07 16:49:24 +08:00
Benjamin Wang b364b48475 etcdserver: set zap logging to wsproxy
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-07 13:38:25 +08:00
Benjamin Wang 3618ab4b07 security: remove password after authenticating the user
fix https://nvd.nist.gov/vuln/detail/CVE-2021-28235

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-06 22:42:29 +09:00
Benjamin Wang 1f746597ea test: add an e2e test to reproduce https://nvd.nist.gov/vuln/detail/CVE-2021-28235
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-06 22:17:20 +09:00
Benjamin Wang 584576d672
Merge pull request #15652 from ahrtr/bump_go_20230406_3.4
[3.4] Bump golang to 1.19.8 to fix CVEs
2023-04-06 15:48:41 +08:00
Benjamin Wang 78a898a903 bump golang to 1.19.8 to fix CVEs
https://groups.google.com/g/golang-announce/c/Xdv6JL9ENs8/m/OV40vnafAwAJ

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-06 14:31:15 +08:00
Benjamin Wang ab64d49a13
Merge pull request #15621 from mitake/3.4-backport-15294
[3.4] backport 15294
2023-04-05 08:25:01 +08:00
Hitoshi Mitake 442de314a2 server/auth: disallow creating empty permission ranges
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Co-authored-by: Benjamin Wang <wachao@vmware.com>
2023-04-04 21:41:04 +09:00
J. David Lowe cee78aca75 etcdserver: don't attempt to grant nil permission to a role
Prevent etcd from crashing when given a bad grant payload, e.g.:

$ curl -d '{"name": "foo"}' http://localhost:2379/v3/auth/role/add
{"header":{"cluster_id":"14841639068965178418", ...
$ curl -d '{"name": "foo"}' http://localhost:2379/v3/auth/role/grant
curl: (52) Empty reply from server

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
Signed-off-by: J. David Lowe <j.david.lowe@gmail.com>
2023-04-04 21:40:54 +09:00
Marek Siarkowicz a1a37492f5
Merge pull request #15620 from serathius/separate-grpc-server-3.4
[3.4] Separate grpc server
2023-04-04 09:48:45 +02:00
Marek Siarkowicz 47d4ff2e36 server: Fix defer function closure escape
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 16:11:05 +02:00
Marek Siarkowicz 75675cd464 tests: Test separate http port connection multiplexing
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 16:11:05 +02:00
Marek Siarkowicz 8dc1244179 server: Add --listen-client-http-urls flag to allow running grpc server separate from http server
Difference in load configuration for watch delay tests show how huge the
impact is. Even with random write scheduler grpc under http
server can only handle 500 KB with 2 seconds delay. On the other hand,
separate grpc server easily hits 10, 100 or even 1000 MB within 100 miliseconds.

Priority write scheduler that was used in most previous releases
is far worse than random one.

Tests configured to only 5 MB to avoid flakes and taking too long to fill
etcd.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 16:11:03 +02:00
Marek Siarkowicz dd0bc66478 server: Pick one address that all grpc gateways connect to
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 15:47:35 +02:00
Marek Siarkowicz a4ac849ec1 server: Extract resolveUrl helper function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 15:43:46 +02:00
Marek Siarkowicz 66704b4c59 server: Separate client listener grouping from serving
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 15:43:35 +02:00
Marek Siarkowicz 6de105e89b refactor: Use proper variable names for urls
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 15:35:10 +02:00
Benjamin Wang 9ba5c28404
Merge pull request #15590 from ArkaSaha30/backport-14860-3-4
[3.4] Backport cherry-pick of #14860: Trigger release in current branch for github workflow case
2023-03-31 17:58:25 +08:00
Benjamin Wang 49d05f88c3
[3.4] Backport cherry-pick of #14860: Trigger release in current branch for github workflow case
Signed-off-by: ArkaSaha30 <arkasaha30@gmail.com>
2023-03-31 10:29:09 +05:30
Marek Siarkowicz f9a4a471a0
Merge pull request #15560 from serathius/test-cmux-3.4
[3.4] Test cmux
2023-03-30 15:55:24 +02:00
Marek Siarkowicz 7d62b4d64a tests: Add v2 API to connection multiplexing test
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 14:51:39 +02:00
Marek Siarkowicz 7bb5f1f58c tests: Add connection muiltiplexer testing
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 14:51:25 +02:00
Marek Siarkowicz c4a0bac555 tests: Backport tls for etcdctl
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 14:49:04 +02:00
Marek Siarkowicz ec9221f42a tests: Backport etcdctl
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 14:49:04 +02:00
Marek Siarkowicz 9e912ba3ed tests: Extract e2e test utils
Consider creating generic testutils for both e2e and integration tests.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 14:49:02 +02:00
Marek Siarkowicz 063d3ceed6 tests: Allow specifying http version in curl
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 14:48:28 +02:00
Marek Siarkowicz cee9d4c0f1 tests: Refactor newClient args
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 14:48:28 +02:00
Marek Siarkowicz 1bafc86b42 tests: Refactor CURLPrefixArgs
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 14:48:28 +02:00
Benjamin Wang bf22b350b0
Merge pull request #15584 from mitake/follow-up-for-15542
[3.4] etcdserver: keep server side change of 14548
2023-03-30 06:34:08 +08:00
Hitoshi Mitake 01c0d8b309 etcdserver: keep server side change of 14548
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2023-03-28 21:43:17 +09:00
Marek Siarkowicz 2b189d8638
Merge pull request #15562 from serathius/fix-e2e
tests: Avoid testing package root tests in e2e
2023-03-28 13:53:49 +02:00
Marek Siarkowicz 3f6429d702 tests: Avoid testing package root tests in e2e
Changes invocation from `go test -timeout 30m -v -cpu 1,2,4 '' -v
--count 1 go.etcd.io/etcd/tests/e2e` to `go test -timeout 30m -v -cpu 1,2,4 -v --count 1 go.etcd.io/etcd/tests/e2e` (removes '').
Those braces caused tests to also run in root package.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-28 11:07:34 +02:00
Marek Siarkowicz 63c7a038eb
Merge pull request #15555 from serathius/run-e2e
Run e2e tests in CI
2023-03-27 13:38:53 +02:00
Marek Siarkowicz 73f152e61e Run e2e tests in CI
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-27 12:12:36 +02:00
Marek Siarkowicz e0fcb9e637
Merge pull request #15504 from fuweid/fix-15487
[3.4] fix: enable strict mode for CI
2023-03-23 12:41:08 +01:00
Benjamin Wang 82de82ee80
Merge pull request #15486 from jmhbnz/release-3.4
[3.4] Backport tls 1.3 support
2023-03-23 15:25:17 +08:00
Wei Fu 3fc5fbeaa0 fix: enable strict mode for CI
fixes: #15487

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-22 17:55:58 +08:00
Benjamin Wang 284c312fd4
Merge pull request #15518 from fuweid/cp-15509-to-3.4
[3.4] server/embed: fix data race when start insecure grpc
2023-03-22 12:10:01 +08:00
Benjamin Wang 336ac78ebe
Merge pull request #15542 from mitake/revert-14548-v2
[3.4] Revert 14548
2023-03-22 06:19:30 +08:00
Hitoshi Mitake be808bde23 Revert "tests: a test case for watch with auth token expiration"
This reverts commit 91365174b3.

Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2023-03-21 22:13:27 +09:00
Hitoshi Mitake c8f890cde1 Revert "*: handle auth invalid token and old revision errors in watch"
This reverts commit 0c6e466024.

Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2023-03-21 22:13:17 +09:00
Benjamin Wang 46ae7ebd96
Merge pull request #15520 from serathius/fix-issue15271-3.4
[v3.4] Fix issue15271
2023-03-21 06:39:25 +08:00
Marek Siarkowicz 29ecfc0185 server: Test watch restore
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-20 16:06:10 +01:00
Bogdan Kanivets 8160d9aea5 mvcc: update minRev when watcher stays synced
Problem: during restore in watchableStore.Restore, synced watchers are moved to unsynced.
minRev will be behind since it's not updated when watcher stays synced.

Solution: update minRev

fixes: https://github.com/etcd-io/etcd/issues/15271
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-20 16:04:49 +01:00
Wei Fu 303519c7b8 server/embed: fix data race when start insecure grpc
There are two goroutines accessing the `gs` grpc server var. Before
insecure `gs` server start, the `gs` can be changed to secure server and
then the client will fail to connect to etcd with insecure request. It
is data-race. We should use argument for reference in the new goroutine.

fix: #15495

Signed-off-by: Wei Fu <fuweid89@gmail.com>
(cherry picked from commit a9988e2625)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-20 21:33:16 +08:00
James Blair d8f7cfe28d
Backport tls 1.3 support.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-03-16 21:46:17 +13:00
Marek Siarkowicz 2eabc0bc70
Merge pull request #15482 from ahrtr/3.4_gomod_cleanup_20230315
[3.4] cleanup the go.mod & go.sum files
2023-03-15 09:17:41 +01:00
Benjamin Wang 7c6b0882fd cleanup the go.mod & go.sum files
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-03-15 07:11:33 +08:00
Marek Siarkowicz 08a42e65a8
Merge pull request #15478 from serathius/watch-random-scheduler-3.4
Watch random scheduler 3.4
2023-03-14 11:32:20 +01:00
Marek Siarkowicz 60e381aaa9 server: Switch back to random scheduler to improve resilience to watch starvation
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-14 10:33:15 +01:00
Marek Siarkowicz e818b5fac8 test: Test etcd watch stream starvation under high read response load when sharing the same connection
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-14 10:33:06 +01:00
Marek Siarkowicz 6025355ce0 tests: Allow configuring progress notify interval in e2e tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-14 10:26:43 +01:00
Benjamin Wang 4cdb91db15
Merge pull request #15429 from jmhbnz/release-3.4-backport
[3.4] Backport update to latest go 1.19.7 release
2023-03-08 19:07:44 +08:00
James Blair 51ea1c0abe
Updated go to 1.19.7.
Mitigates CVE-2023-24532.

Signed-off-by: James Blair <mail@jamesblair.net>
2023-03-08 22:46:34 +13:00
Piotr Tabor 20eee55557
Merge pull request #15333 from jmhbnz/release-3.4
[3.4] Backport bump to go 1.19.6 and golang.org/x/net to v0.7.0
2023-03-03 11:11:04 +01:00
James Blair a91bacf567
Formatted source code for go 1.19.6.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-02-20 12:44:14 +13:00
James Blair 7318f5dd0c
Bump golang.org/x/net to v0.7.0 to address CVE GO-2023-1571.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-02-20 11:41:25 +13:00
James Blair 9570978e93
Bump to go 1.19.6
Signed-off-by: James Blair <mail@jamesblair.net>
2023-02-20 11:41:01 +13:00
Benjamin Wang 6d1bfe4f99 bump version to 3.4.24
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-02-16 09:39:00 +08:00
Benjamin Wang 9c81b86e90 test: enhance the test case TestV3WatchProgressOnMemberRestart
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-02-10 21:03:53 +08:00
Benjamin Wang ed529ab0e5 clientv3: correct the nextRev on receving progress notification response
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-02-10 16:47:56 +08:00
James Blair d32dceb8a6 Fix regression in timestamp resolution
Historic capnslog timestamps are in microsecond resolution. We need to match that when we migrate to the zap logger.

Signed-off-by: James Blair <mail@jamesblair.net>
2023-02-10 04:35:24 +08:00
Marek Siarkowicz fb7a8973bd
Merge pull request #15265 from ahrtr/3.4_walSync_failpoint_20230209
[3.4] etctserver: add failpoints walBeforeSync and walAfterSync
2023-02-09 09:10:19 +01:00
Benjamin Wang 109873dcb6 etctserver: add failpoints walBeforeSync and walAfterSync
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-02-09 07:06:46 +08:00
Benjamin Wang b4e3ed72e3 bump bbolt to v1.3.7 for release-3.4
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-02-02 03:47:21 +08:00
Wilson Wang 2f8158650f server: set multiple concurrentReadTx instances share one txReadBuffer.
(cherry picked from commit 9c82e8c72b)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-30 11:43:19 +08:00
kidsan c5347cb0c6 netutil: consistently format ipv6 addresses
This formats ipv6 addresses to ensure they can be compared safely

Signed-off-by: kidsan <8798449+Kidsan@users.noreply.github.com>
2023-01-27 06:49:26 +08:00
Iavael d2fc8dbeeb docker: remove nsswitch.conf
Signed-off-by: Iavael <905853+iavael@users.noreply.github.com>
2023-01-25 02:45:52 +08:00
Benjamin Wang e4b154231c
Merge pull request #15137 from fuweid/backport-11990-to-3.4
[3.4] mvcc: push down RangeOptions.limit argv into index tree to reduce memory overhead
2023-01-20 06:23:32 +08:00
Wei Fu 931cf9a814 mvcc: update ut for Revisions/CountRevisions
It is kind of backport from etcd-io#14124.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-18 10:18:57 +08:00
Marek Siarkowicz 1246c52d04 etcdserver: Fix invalid count returned on Range with Limit
(cherry picked from commit 182aef6e6b)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-18 10:02:10 +08:00
tangcong d48f7ad7c1 mvcc: push down RangeOptions.limit argv into index tree
(cherry picked from commit 26c930f27d)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-18 10:01:20 +08:00
Benjamin Wang a1d1af5774
Merge pull request #15099 from fuweid/backport-11771-11743-pr-to-3.4
[3.4] mvcc: reduce count-only range overhead
2023-01-18 08:48:29 +08:00
Piotr Tabor 4be8c0e5a5
Merge pull request #15097 from ahrtr/3.4_promote_non_exist_id_20230113
[3.4] etcdserver: return membership.ErrIDNotFound when the memberID not found
2023-01-17 09:15:02 +01:00
Benjamin Wang 00b31512a1 etcdserver: return membership.ErrIDNotFound when the memberID not found
Backport https://github.com/etcd-io/etcd/pull/15095 to 3.4.

When promoting a learner, we need to wait until the leader's applied ID
catches up to the commitId. Afterwards, check whether the learner ID
exist or not, and return `membership.ErrIDNotFound` directly in the API
if the member ID not found, to avoid the request being unnecessarily
delivered to raft.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-01-17 06:27:31 +08:00
Wei Fu 10c080dc5e mvcc: Add ut for Revisions/CountRevisions
It is kind of backport from #14124.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-16 15:15:34 +08:00
tangcong 2070f55aab e2e: add getCountOnlyTest testcase
(cherry picked from commit 3594ab94cf)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-13 16:33:56 +08:00
tangcong 00a005c300 mvcc: reduce count-only range overhead
(cherry picked from commit 730f3f1d78)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-13 16:32:35 +08:00
mlmhl 841f3bd2be etcdctl: support query count only of specified prefix
(cherry picked from commit aa7b056a77)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-13 16:31:23 +08:00
Benjamin Wang a577940b4e
Merge pull request #15088 from fuweid/3.4-fix-flaky-testcase
[3.4] grpc-gateway: update version to v1.11.0
2023-01-13 10:39:28 +08:00
Wei Fu c320f75a15 grpc-gateway: update version to v1.11.0
The issue is caused by hand-crafted protobuf message. The runtime.errorBody
defines two protobuf fields with same number. We need to upgrade the
version to fix it. Otherwise, the client side won't receive any errors
from server side because of panic.

```
mismatching field: runtime.errorBody.error, want runtime.errorBody.message
```

It can fix the cases

PASSES="build grpcproxy" CPU=4 RACE=true ./test -run TestV3CurlLeaseRevokeNoTLS

The original error is like:

```
v3_curl_lease_test.go:109: testV3CurlLeaseRevoke: prefix (/v3) endpoint (/kv/lease/revoke): error (read /dev/ptmx: input/output error (expected "etcdserver: requested lease not found", got ["curl: (52) Empty reply from server\r\n"])), wanted etcdserver: requested lease not found
    v3_curl_lease_test.go:109: testV3CurlLeaseRevoke: prefix (/v3beta) endpoint (/kv/lease/revoke): error (read /dev/ptmx: input/output error (expected "etcdserver: requested lease not found", got ["curl: (52) Empty reply from server\r\n"])), wanted etcdserver: requested lease not found
```

The `Empty reply from server` is caused by panic and server recover it
but it doesn't have chance to reply to client.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-01-12 17:06:00 +08:00
Benjamin Wang 46511ab96e
Merge pull request #15042 from ahrtr/update_nsswitch_3.4
[3.4] Update nsswitch.conf for 3.4
2022-12-24 07:13:34 +08:00
Benjamin Wang 58c2f5f228 update nsswitch.conf for 3.4
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-23 20:31:45 +08:00
Benjamin Wang 283e447df5
Merge pull request #15038 from ahrtr/remove_busybox_3.4_20221223
3.4: remove the dependency on busybox
2022-12-23 19:27:41 +08:00
Benjamin Wang 8aace73c77 3.4: remove the dependency on busybox
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-23 18:43:44 +08:00
Benjamin Wang c8b7831967 bump version to 3.4.23
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-21 14:11:16 +08:00
Benjamin Wang 8119eb3951
Merge pull request #15019 from ahrtr/deps_3.4_20221219
[3.4] Security: address HIGH Vulnerabilities
2022-12-19 19:33:56 +08:00
Benjamin Wang 5413ce46dc bump go version to 1.17.3
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-19 18:34:04 +08:00
Benjamin Wang 86479c5ba9 deps: bump golang.org/x/net to v0.4.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-19 17:43:43 +08:00
Benjamin Wang 68a55439e1 deps: bump golang.org/x/net to 0.0.0-20220906165146-f3363e06e74c to address CVE CVE-2021-44716 and CVE-2022-27664
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-19 16:34:06 +08:00
Benjamin Wang 40566d943a deps: bump github.com/prometheus/client_golang to 1.11.1 to address CVE CVE-2022-21698
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-19 16:32:23 +08:00
Benjamin Wang fcb048dd67 deps: bump github.com/gogo/protobuf to 1.3.2 to address CVE CVE-2021-3121
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-19 16:30:53 +08:00
Benjamin Wang f318a39998
Merge pull request #15017 from ahrtr/use_distroless_3.4_20221219
[3.4] Security: use distroless base image to address critical Vulnerabilities
2022-12-19 16:23:30 +08:00
Benjamin Wang c1bec6bd97 security: use distroless base image to address critical Vulnerabilities
Command:
trivy image --severity CRITICAL gcr.io/etcd-development/etcd:v3.4.22  -f json -o 3.4.22_image_critical.json

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-19 08:04:47 +08:00
Benjamin Wang 9d37e7626a
Merge pull request #15011 from MukulKolpe/specify_branch_release-3.4
fix: specify the branch name of release-3.4 in the workflow
2022-12-17 18:09:47 +08:00
Mukul Kolpe fb07cf843a fix: specify the branch name of release-3.4 in the workflow
Signed-off-by: Mukul Kolpe <mukulkolpe45@gmail.com>
2022-12-17 14:40:24 +05:30
Benjamin Wang e03c62d5e7
Merge pull request #15007 from ArkaSaha30/trivy-release-3-4
Add trivy nightly scan for `release-3.4`
2022-12-16 13:59:40 +08:00
ArkaSaha30 7450bcfc49
Add trivy nightly scan for release-3.4
Signed-off-by: ArkaSaha30 <arkasaha30@gmail.com>
2022-12-16 11:06:58 +05:30
Benjamin Wang 593711848e
Merge pull request #14900 from ahrtr/fix_readyonly_txn_panic_3.4_20221206
[3.4] etcdserver: fix nil pointer panic for readonly txn
2022-12-06 19:25:12 +08:00
Benjamin Wang acca4fa93e etcdserver: fix nil pointer panic for readonly txn
Backporting https://github.com/etcd-io/etcd/pull/14895

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-06 18:09:47 +08:00
Benjamin Wang c619e2705e
Merge pull request #14853 from ahrtr/remove_memberid_alarm_3.4_20221125
[3.4] etcdserver: intentionally set the memberID as 0 in corruption alarm
2022-11-25 17:01:02 +08:00
Benjamin Wang 2f4f7328d0 etcdserver: intentionally set the memberID as 0 in corruption alarm
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-25 15:58:23 +08:00
Benjamin Wang f4bf538781
Merge pull request #14792 from ahrtr/auth_3.4_20221117
[3.4] clientv3: do not refresh token when users use CommonName based authentication
2022-11-17 18:08:11 +08:00
Benjamin Wang 90585e03a0 test: add test case to cover the CommonName based authentication
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-17 09:12:13 +08:00
Benjamin Wang 8b4405b276 test: add certificate with root CommonName
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-17 08:34:59 +08:00
Benjamin Wang 8ca42a7ae4 clientv3: do not refresh token when using TLS CommonName based authentication
When users use the TLS CommonName based authentication, the
authTokenBundle is always nil. But it's possible for the clients
to get `rpctypes.ErrAuthOldRevision` response when the clients
concurrently modify auth data (e.g, addUser, deleteUser etc.).
In this case, there is no need to refresh the token; instead the
clients just need to retry the operations (e.g. Put, Delete etc).

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-17 08:32:35 +08:00
Benjamin Wang 1f054980bc Bump version to 3.4.22
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-02 08:08:33 +08:00
Benjamin Wang c9cf4db813
Merge pull request #14675 from cenkalti/release-3.4
server: add more context to panic message
2022-11-02 07:56:50 +08:00
Cenk Alti 7a4a3ad8db
server: add more context to panic message
Signed-off-by: Cenk Alti <cenkalti@gmail.com>
2022-11-01 18:59:17 -04:00
Benjamin Wang 7c1499d3bb
Merge pull request #14649 from mitake/test-authrecover-3.4
[3.4] server: add a unit test case for authStore.Reocver() with empty rangePermCache
2022-10-29 13:11:36 +08:00
Hitoshi Mitake b7a23311e6 etcdserver: call refreshRangePermCache on Recover() in AuthStore
Signed-off-by: Oleg Guba <oleg@dropbox.com>
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-29 13:55:06 +09:00
Hitoshi Mitake 0b3ff06868 server: add a unit test case for authStore.Reocver() with empty rangePermCache
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-29 13:27:53 +09:00
Benjamin Wang ce1630f68f
Merge pull request #14601 from dusk125/release-3.4
Backport #14500 to 3.4
2022-10-27 14:21:22 +08:00
Allen Ray 9254f8f05b Release-3.4: server/etcdmain: add configurable cipher list to gRPC proxy listener
Signed-off-by: Allen Ray <alray@redhat.com>
2022-10-19 16:02:13 -04:00
Benjamin Wang b058374fbd
Merge pull request #14594 from ZoeShaw101/fix-watch-test-issue-3.4
Backport #14591 to 3.4.
2022-10-17 05:25:50 +08:00
王霄霄 dcebdf7958 Backport #14591 to 3.4.
Signed-off-by: 王霄霄 1141195807@qq.com
Signed-off-by: 王霄霄 <1141195807@qq.com>
2022-10-16 21:18:53 +08:00
Benjamin Wang 5b764d8771
Merge pull request #14581 from tomari/tomari/watch-backoff-for-3.4
[3.4] client/v3: Add backoff before retry when watch stream returns unavailable
2022-10-13 07:23:02 +08:00
Hisanobu Tomari 7b7fbbf8b8 client/v3: Add backoff before retry when watch stream returns unavailable
The client retries connection without backoff when the server is gone
after the watch stream is established. This results in high CPU usage
in the client process. This change introduces backoff when the stream is
failed and unavailable.

Signed-off-by: Hisanobu Tomari <posco.grubb@gmail.com>
2022-10-13 05:26:31 +09:00
Sahdev Zala 429fcb98ab
Merge pull request #14579 from ahrtr/wal_log_3.4
[3.4] etcdserver: added more debug log for the purgeFile goroutine
2022-10-12 11:34:33 -04:00
Benjamin Wang 1d7639f796 etcdserver: added more debug log for the purgeFile goroutine
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-10-12 19:39:20 +08:00
Benjamin Wang 5b3ac7da6b
Merge pull request #14577 from pchan/acp3.4
Cherry pick of #13224
2022-10-12 17:58:26 +08:00
Sergey Kacheev 5381dafaae netutil: make a `raw` URL comparison part of the urlsEqual function
Signed-off-by: Prasad Chandrasekaran <prasadc@vmware.com>
2022-10-12 15:07:46 +05:30
Sergey Kacheev 90e7e254ae Apply suggestions from code review
Co-authored-by: Lili Cosic <cosiclili@gmail.com>
Signed-off-by: Prasad Chandrasekaran <prasadc@vmware.com>
2022-10-12 15:07:46 +05:30
Sergey Kacheev abb019a51e netutil: add url comparison without resolver to URLStringsEqual
If one of the nodes in the cluster has lost a dns record,
restarting the second node will break it.
This PR makes an attempt to add a comparison without using a resolver,
which allows to protect cluster from dns errors and does not break
the current logic of comparing urls in the URLStringsEqual function.
You can read more in the issue #7798

Fixes #7798

Signed-off-by: Prasad Chandrasekaran <prasadc@vmware.com>
2022-10-12 15:07:46 +05:30
Hitoshi Mitake 57a27de189
Merge pull request #14562 from kafuu-chino/3.4-backport-14296
*: avoid closing a watch with ID 0 incorrectly
2022-10-10 22:48:53 +09:00
Kafuu Chino ed10ca13f4 *: avoid closing a watch with ID 0 incorrectly
Signed-off-by: Kafuu Chino <KafuuChinoQ@gmail.com>

add test

1

1

1
2022-10-10 19:54:58 +08:00
Benjamin Wang de11726a8a
Merge pull request #14548 from mitake/3.4-backport-14322
Backport PR 14322 to release-3.4
2022-10-05 05:50:43 +08:00
Hitoshi Mitake 91365174b3 tests: a test case for watch with auth token expiration
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-04 22:55:36 +09:00
Hitoshi Mitake 0c6e466024 *: handle auth invalid token and old revision errors in watch
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-04 22:49:06 +09:00
Marek Siarkowicz d0a732f96d
Merge pull request #14530 from ahrtr/memberid_alarm
etcdserver: fix memberID equals to zero in corruption alarm
2022-09-28 09:30:10 +02:00
Benjamin Wang 29911e9a5b etcdserver: fix memberID equals to zero in corruption alarm
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-28 11:01:26 +08:00
Benjamin Wang 85b640cee7 Bump version to 3.4.21
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-15 08:46:22 +08:00
Marek Siarkowicz 1a05326fae
Merge pull request #14442 from ahrtr/fix_TestV3AuthRestartMember
[release-3.4] Fix the flaky test TestV3AuthRestartMember
2022-09-09 09:57:24 +02:00
Benjamin Wang b8bea91f22 fix the flaky test TestV3AuthRestartMember
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-09 09:37:25 +08:00
Benjamin Wang 6730ed8477
Merge pull request #14410 from vivekpatani/release-3.4
[release-3.4] server,test: refresh cache on each NewAuthStore
2022-09-09 09:34:32 +08:00
Benjamin Wang a55a9f5e07
Merge pull request #14441 from tjungblu/bz_1918413_3.4_upstream
[release-3.4] etcdctl: fix move-leader for multiple endpoints
2022-09-09 09:26:40 +08:00
Thomas Jungblut 86bc0a25c4 etcdctl: fix move-leader for multiple endpoints
Due to a duplicate call of clientConfigFromCmd, the move-leader command
would fail with "conflicting environment variable is shadowed by corresponding command-line flag".
Also in scenarios where no command-line flag was supplied.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2022-09-08 15:51:19 +02:00
Benjamin Wang dd743eea81
Merge pull request #14439 from vsvastey/usr/vsvastey/open-with-max-index-test-fix-3.4
[release-3.4] testing: fix TestOpenWithMaxIndex cleanup
2022-09-08 17:00:20 +08:00
Vladimir Sokolov 1ed5dfc20e testing: fix TestOpenWithMaxIndex cleanup
A WAL object was closed by defer, however the WAL was rewritten afterwards,
so defer closed already closed WAL but not the new one. It caused a data
race between writing file and cleaning up a temporary test directory,
which led to a non-deterministic bug.

Fixes #14332

Signed-off-by: Vladimir Sokolov <vsvastey@gmail.com>
2022-09-08 10:49:47 +03:00
Benjamin Wang b2b7b9d535
Merge pull request #14423 from serathius/one_member_data_loss_raft_3_4
[release-3.4] fix the potential data loss for clusters with only one member
2022-09-06 03:29:45 +08:00
Benjamin Wang 119e4dda19 fix the potential data loss for clusters with only one member
For a cluster with only one member, the raft always send identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually partially) the
applying workflow.

When the client receives the response, it doesn't mean etcd has already
successfully saved the data, including BoltDB and WAL, because:
   1. etcd commits the boltDB transaction periodically instead of on each request;
   2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, it may run into a situation of data loss when the etcd crashes
immediately after responding to the client and before the boltDB and WAL
successfully save the data to disk.
Note that this issue can only happen for clusters with only one member.

For clusters with multiple members, it isn't an issue, because etcd will
not commit & apply the data before it being replicated to majority members.
When the client receives the response, it means the data must have been applied.
It further means the data must have been committed.
Note: for clusters with multiple members, the raft will never send identical
unstable entries and committed entries to etcdserver.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-05 14:15:47 +02:00
Benjamin Wang 9d5ae56764
Merge pull request #14420 from vsvastey/usr/vsvastey/nil-logger
etcdserver: nil-logger issue fix for version 3.4
2022-09-05 14:53:08 +08:00
Vladimir Sokolov 38342e88da etcdserver: nil-logger issue fix for version 3.4
In v3.5 it is assumed that the logger should not be nil, however it is
still a case in v3.4. The PR targeted to v3.5 was backported to 3.4 and
that's why it's possible to get panic on nil logger in 3.4. This commit
fixed this issue.

Fixes #14402

Signed-off-by: Vladimir Sokolov <vsvastey@gmail.com>
2022-09-03 04:34:03 +03:00
vivekpatani c0ef7d52e0 server,test: refresh cache on each NewAuthStore
- permissions were incorrectly loaded on restarts.
- #14355
- Backport of https://github.com/etcd-io/etcd/pull/14358

Signed-off-by: vivekpatani <9080894+vivekpatani@users.noreply.github.com>
2022-08-31 13:08:11 -07:00
Benjamin Wang 1e2682301c Bump version to 3.4.20
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-06 05:27:01 +08:00
Sahdev Zala ee366151c6
Merge pull request #14290 from ahrtr/3.4_no_prevkv_for_create
[3.4] Do not get previous K/V for create event
2022-08-01 08:39:19 -04:00
Benjamin Wang 095bbfc4ed lock down the version of shadow to v0.1.11
The latest vesion v0.1.12 was just released On Jul 27, 2022,
and it is causing issue (see below) on the govet check,

```
govet_shadow' started at Sun Jul 31 23:23:27 PDT 2022
go get: upgraded golang.org/x/net v0.0.0-20211112202133-69e39bad7dc2 => v0.0.0-20220722155237-a158d28d115b
go get: upgraded golang.org/x/sys v0.0.0-20211019181941-9d821ace8654 => v0.0.0-20220722155257-8c9f86f7a55f
go get: upgraded golang.org/x/tools v0.0.0-20190524140312-2c0ae7006135 => v0.1.12
/root/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-prometheus@v1.2.0/client_metrics.go:7:2: missing go.sum entry for module providing package golang.org/x/net/context (imported by go.etcd.io/etcd/etcdserver/etcdserverpb); to add:
	go get go.etcd.io/etcd/etcdserver/etcdserverpb
/root/go/pkg/mod/google.golang.org/grpc@v1.26.0/internal/transport/controlbuf.go:28:2: missing go.sum entry for module providing package golang.org/x/net/http2 (imported by go.etcd.io/etcd/embed); to add:
	go get go.etcd.io/etcd/embed
/root/go/pkg/mod/google.golang.org/grpc@v1.26.0/internal/transport/controlbuf.go:29:2: missing go.sum entry for module providing package golang.org/x/net/http2/hpack (imported by github.com/soheilhy/cmux); to add:
	go get github.com/soheilhy/cmux@v0.1.4
/root/go/pkg/mod/google.golang.org/grpc@v1.26.0/server.go:36:2: missing go.sum entry for module providing package golang.org/x/net/trace (imported by go.etcd.io/etcd/embed); to add:
	go get go.etcd.io/etcd/embed
```

It isn't good to always to use the latest version. Instead, we should
lock down the version, and v0.1.11 was confirmed to be working.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-01 15:11:49 +08:00
Benjamin Wang cc1b0e6a44 do not get previous K/V for create event
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-01 13:11:46 +08:00
Benjamin Wang 314dcbf6f5
Merge pull request #14274 from lavacat/release-3.4-fix-TestRoundRobinBalancedResolvableFailoverFromServerFail
[3.4] clientv3/balancer: fixed flaky TestRoundRobinBalancedResolvableFailoverFromServerFail
2022-07-27 04:59:38 +08:00
Bogdan Kanivets 6f483a649e clientv3/balancer: fixed flaky TestRoundRobinBalancedResolvableFailoverFromServerFail
- ignore "transport is closing" error during connections warmup after stopping one peer.

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-07-26 08:06:59 -07:00
Benjamin Wang ce539a960c
Merge pull request #14279 from SimFG/mvcc-race
[3.4] clientv3/mvcc: fixed DATA RACE
2022-07-26 23:01:34 +08:00
SimFG 04e5e5516e [3.4] clientv3/mvcc: fixed DATA RACE between mvcc.(*store).setupMetricsReporter and mvcc.(*store).restore
Signed-off-by: SimFG <1142838399@qq.com>
2022-07-26 21:38:23 +08:00
Benjamin Wang 2c778eebf7
Merge pull request #14269 from ahrtr/3.4_resend_readindex
[3.4] etcdserver: resend ReadIndex request on empty apply request
2022-07-25 16:53:06 +08:00
Benjamin Wang f53db9b246 etcdserver: resend ReadIndex request on empty apply request
Backport https://github.com/etcd-io/etcd/pull/12795 to 3.4

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-25 09:21:31 +08:00
Benjamin Wang e2b36f8879
Merge pull request #14253 from serathius/checkpoints-fix-3.4
[3.4] Checkpoints fix 3.4
2022-07-22 16:56:17 +08:00
Benjamin Wang de2e8ccc78
Merge pull request #14258 from ahrtr/3.4_postphone_read_index
[3.4] raft: postpone MsgReadIndex until first commit in the term
2022-07-22 16:46:32 +08:00
Marek Siarkowicz 783e99cbfe Fix lease checkpointing tests by forcing a snapshot
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-22 10:28:44 +02:00
Marek Siarkowicz 8f4735dfd4 server: Require either cluster version v3.6 or --experimental-enable-lease-checkpoint-persist to persist lease remainingTTL
To avoid inconsistant behavior during cluster upgrade we are feature
gating persistance behind cluster version. This should ensure that
all cluster members are upgraded to v3.6 before changing behavior.

To allow backporting this fix to v3.5 we are also introducing flag
--experimental-enable-lease-checkpoint-persist that will allow for
smooth upgrade in v3.5 clusters with this feature enabled.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-22 10:28:29 +02:00
Benjamin Wang 9c9148c4cd raft: postpone MsgReadIndex until first commit in the term
Backport https://github.com/etcd-io/etcd/pull/12762 to 3.4

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-22 13:56:27 +08:00
Benjamin Wang f18d074866
Merge pull request #14254 from ramses/backport-13435
[3.4] Backport: non mutating requests pass through quotaKVServer when NOSPACE
2022-07-22 09:27:00 +08:00
Benjamin Wang aca5cd1717
Merge pull request #14246 from vivekpatani/release-3.4
[3.4] etcdserver,pkg: remove temp files in snap dir when etcdserver starting
2022-07-22 09:14:23 +08:00
vivekpatani e4deb09c9e etcdserver,pkg: remove temp files in snap dir when etcdserver starting
- Backporting: https://github.com/etcd-io/etcd/pull/12846
- Reference: https://github.com/etcd-io/etcd/issues/14232

Signed-off-by: vivekpatani <9080894+vivekpatani@users.noreply.github.com>
2022-07-21 15:50:27 -07:00
Chao Chen 96f69dee47 Backport: non mutating requests pass through quotaKVServer when NOSPACE
This is a backport of https://github.com/etcd-io/etcd/pull/13435 and is
part of the work for 3.4.20
https://github.com/etcd-io/etcd/issues/14232.

The original change had a second commit that modifies a changelog file.
The 3.4 branch does not include any changelog file, so that part was not
cherry-picked.

Local Testing:

- `make build`
- `make test`

Both succeed.

Signed-off-by: Ramsés Morales <ramses@gmail.com>
2022-07-21 15:06:09 -07:00
Michał Jasionowski 8d83691d53 etcdserver,integration: Store remaining TTL on checkpoint
To extend lease checkpointing mechanism to cases when the whole etcd
cluster is restarted.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-21 17:35:21 +02:00
Michał Jasionowski a30aba8fc2 lease,integration: add checkpoint scheduling after leader change
Current checkpointing mechanism is buggy. New checkpoints for any lease
are scheduled only until the first leader change. Added fix for that
and a test that will check it.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-21 17:35:15 +02:00
Benjamin Wang 7ee7029c08
Merge pull request #14251 from ahrtr/3.4_maxstream
[3.4] Support configuring MaxConcurrentStreams for http2
2022-07-21 17:43:15 +08:00
Benjamin Wang 6071b1c523 Support configuring MaxConcurrentStreams for http2
Backport https://github.com/etcd-io/etcd/pull/14219 to 3.4

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-21 14:25:29 +08:00
Benjamin Wang 40ccb8b454
Merge pull request #14240 from chaochn47/cherry-pick-12335
[3.4] etcdserver: add more detailed traces on linearized reading
2022-07-21 03:32:02 +08:00
Chao Chen 864006b72d print out applied index as uint64
Signed-off-by: Chao Chen <chaochn@amazon.com>
2022-07-20 12:07:51 -07:00
Pierre Zemb 3f9fba9112 etcdserver: add more detailed traces on linearized reading
To improve debuggability of `agreement among raft nodes before
linearized reading`, we added some tracing inside
`linearizableReadLoop`.

This will allow us to know the timing of `s.r.ReadIndex` vs
`s.applyWait.Wait(rs.Index)`.

Signed-off-by: Chao Chen <chaochn@amazon.com>
2022-07-20 12:07:51 -07:00
Benjamin Wang fc76e90cf2
Merge pull request #14230 from mitake/perm-cache-lock-3.4
server/auth: protect rangePermCache with a RW lock
2022-07-20 18:51:54 +08:00
Benjamin Wang 3ea12d352e
Merge pull request #14241 from vivekpatani/release-3.4
clientv3: fix isOptsWithFromKey/isOptsWithPrefix
2022-07-20 18:51:24 +08:00
Benjamin Wang 6313502fb4
Merge pull request #14239 from chaochn47/backport-13676
backport 3.5: #13676 load all leases from backend
2022-07-20 18:50:46 +08:00
Benjamin Wang b0e1aaef69
Merge pull request #14236 from chrisayoub/release-3.4
[release-3.4] clientv3: filter learners members during autosync
2022-07-20 12:49:31 +08:00
Chris Ayoub 36a76e8531 clientv3: filter learners members during autosync
This change is to ensure that all members returned during the client's
AutoSync are started and are not learners, which are not valid
etcd members to make requests to.

Signed-off-by: Chris Ayoub <cayoub@hubspot.com>
2022-07-20 00:04:03 -04:00
vivekpatani 4fef7fcb90 clientv3: fix isOptsWithFromKey/isOptsWithPrefix
- Addressing: https://github.com/etcd-io/etcd/issues/13332
- Backporting: https://github.com/etcd-io/etcd/pull/13334

Signed-off-by: vivekpatani <9080894+vivekpatani@users.noreply.github.com>
2022-07-19 17:20:56 -07:00
Chao Chen fd51434b54 backport 3.5: #13676 load all leases from backend
Signed-off-by: Chao Chen <chaochn@amazon.com>
2022-07-19 16:08:01 -07:00
Benjamin Wang d58a0c0434
Merge pull request #14177 from ahrtr/3.4_lease_renew_linearizable
[3.4] Support linearizable renew lease for 3.4
2022-07-19 16:39:00 +08:00
Hitoshi Mitake ecd91da40d server/auth: protect rangePermCache with a RW lock
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-07-19 15:51:48 +09:00
Benjamin Wang 07d2b1d626 support linearizable renew lease for 3.4
Cherry pick https://github.com/etcd-io/etcd/pull/13932 to 3.4.

When etcdserver receives a LeaseRenew request, it may be still in
progress of processing the LeaseGrantRequest on exact the same
leaseID. Accordingly it may return a TTL=0 to client due to the
leaseID not found error. So the leader should wait for the appliedID
to be available before processing client requests.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-19 13:34:55 +08:00
Benjamin Wang 4636a5fab4 Bump version to 3.4.19
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-12 16:18:45 +08:00
Benjamin Wang 06561ae4bf
Merge pull request #14210 from ahrtr/fix_release_script
[3.4] Fix pipeline failure for release test
2022-07-12 16:06:33 +08:00
Benjamin Wang be0ce4f15b fix pipeline failure for release test
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-12 08:31:59 +08:00
Benjamin Wang d3dfc9b796
Merge pull request #14204 from lavacat/release-3.4-balancer-tests
clientv3/balance: fixed flaky balancer tests
2022-07-12 06:14:35 +08:00
Bogdan Kanivets 185f203528 clientv3/balance: fixed flaky balancer tests
- added verification step to indirectly verify that all peers are in balancer subconn list

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-07-11 14:43:58 -07:00
Benjamin Wang 7de53273dd
Merge pull request #14205 from ahrtr/3.4_release_script
[3.4] Update release scripts for release-3.4
2022-07-11 20:06:06 +08:00
Benjamin Wang 6cc9416ae5 backport release test to 3.4
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-11 19:47:08 +08:00
Benjamin Wang e6b3d97712 Update release scripts for release-3.4
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-11 16:06:32 +08:00
Marek Siarkowicz 852ac37bc0
Merge pull request #14200 from ahrtr/3.4_pipeline_race
set RACE as true for linux-amd64-unit and linux-amd64-grpcproxy
2022-07-08 10:23:21 +02:00
Benjamin Wang 8c1c5fefdb set RACE as true for linux-amd64-unit and linux-amd64-grpcproxy
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-08 08:37:31 +08:00
Marek Siarkowicz 0c6063fa82
Merge pull request #14192 from ahrtr/3.4_bump_yaml
[3.4] Bump gopkg.in/yaml.v2 v2.2.2 -> v2.4.0 due to: CVE-2019-11254
2022-07-05 14:32:09 +02:00
Benjamin Wang 860dc149b2 Bump gopkg.in/yaml.v2 v2.2.8 -> v2.4.0 due to: CVE-2019-11254
Cherry pick https://github.com/etcd-io/etcd/pull/13616 to 3.4.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-05 06:26:06 +08:00
Marek Siarkowicz f0256eeec9
Merge pull request #14179 from lavacat/release-3.4-crypto
[backport 3.4] Update golang.org/x/crypto to latest
2022-07-04 11:57:58 +02:00
Bogdan Kanivets 576a798bf9 [backport 3.4] Update golang.org/x/crypto to latest
Update crypto to address CVE-2022-27191.

The CVE fix is added in 0.0.0-20220315160706-3147a52a75dd but this
change updates to latest.

Backport of https://github.com/etcd-io/etcd/pull/13996

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-06-30 23:08:13 -07:00
Benjamin Wang bae61786fc
Merge pull request #14183 from ahrtr/3.4_pipeline_issues_20220630
[3.4] Fix pipeline failures in 3.4
2022-07-01 05:36:29 +08:00
Benjamin Wang 8160e9ebe2 disable test cases on certificate-based authentication which isn't supported by gRPC proxy.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-30 14:11:54 +08:00
Benjamin Wang 5b3f269159 replace all 3.4 certificates and keys with the files from 3.5
Fix the following error in integration pipeline,
```
=== RUN   TestTLSReloadCopy
    v3_grpc_test.go:1754: tls: failed to find any PEM data in key input
    v3_grpc_test.go:1754: tls: private key does not match public key
    v3_grpc_test.go:1754: tls: private key does not match public key
    v3_grpc_test.go:1754: tls: private key does not match public key
```

Refer to https://github.com/etcd-io/etcd/runs/7123775361?check_suite_focus=true

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-30 13:21:48 +08:00
Benjamin Wang bb9113097a fix test failure in TestCtlV3WatchClientTLS
Also refer to the following commit in 3.5,
093282f5ea

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-30 10:19:03 +08:00
Benjamin Wang f169e5dcba
Merge pull request #14151 from ahrtr/3.4_skip_TestWatchRequestProgress_proxy
[3.4] Skip WatchRequestProgress test in grpc-proxy mode.
2022-06-29 05:40:05 +08:00
Benjamin Wang 6958ee8ff2 Skip WatchRequestProgress test in grpc-proxy mode.
We shouldn't fail the grpc-server (completely) by a not implemented RPC.
Failing whole server by remote request is anti-pattern and security
risk.

Refer to https://github.com/etcd-io/etcd/runs/7034342964?check_suite_focus=true#step:5:2284

```
=== RUN   TestWatchRequestProgress/1-watcher
panic: not implemented
goroutine 83024 [running]:
go.etcd.io/etcd/proxy/grpcproxy.(*watchProxyStream).recvLoop(0xc009232f00, 0x4a73e1, 0xc00e2406e0)
	/home/runner/work/etcd/etcd/proxy/grpcproxy/watch.go:265 +0xbf2
go.etcd.io/etcd/proxy/grpcproxy.(*watchProxy).Watch.func1(0xc0038a3bc0, 0xc009232f00)
	/home/runner/work/etcd/etcd/proxy/grpcproxy/watch.go:125 +0x70
created by go.etcd.io/etcd/proxy/grpcproxy.(*watchProxy).Watch
	/home/runner/work/etcd/etcd/proxy/grpcproxy/watch.go:123 +0x73b
FAIL	go.etcd.io/etcd/clientv3/integration	222.813s
FAIL
```

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-29 05:12:43 +08:00
Marek Siarkowicz f1c59dcfac
Merge pull request #14170 from ahrtr/3.4_proxy_fix_20220628
Fix deadlock in 'go test -tags cluster_proxy -v ./integration/... ./client'
2022-06-28 17:56:44 +02:00
Benjamin Wang 1c9fa07cd7 Fix deadlock in 'go test -tags cluster_proxy -v ./integration/... ./clientv3/...'
Cherry pick https://github.com/etcd-io/etcd/pull/12319 to 3.4.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-28 13:44:47 +08:00
Benjamin Wang 4e88cce06c
Merge pull request #14168 from lavacat/release-3.4-TestGetToken
[backport 3.4] clientv3/integration: Reduce flakines of TestGetTokenWithoutAuth
2022-06-28 04:35:17 +08:00
Bogdan Kanivets 2d99b341ad [backport 3.4] clientv3/integration: Reduce flakines of TestGetTokenWithoutAuth
backport from branch-3.5:
https://github.com/etcd-io/etcd/pull/12200/

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-06-27 11:31:16 -07:00
Marek Siarkowicz 17fc680454
Merge pull request #14150 from ahrtr/lease_revoke_race_3.4
[3.4] Backport two lease related bug fixes to 3.4
2022-06-24 11:27:09 +02:00
Benjamin Wang f036529b5d Backport two lease related bug fixes to 3.4
The first bug fix is to resolve the race condition between goroutine
and channel on the same leases to be revoked. It's a classic mistake
in using Golang channel + goroutine. Please refer to
https://go.dev/doc/effective_go#channels

The second bug fix is to resolve the issue that etcd lessor may
continue to schedule checkpoint after stepping down the leader role.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-24 09:09:40 +08:00
Benjamin Wang 953376e666
Merge pull request #14136 from ahrtr/3.4_pipeline_issues
[3.4] Fix all the pipeline failues for release 3.4
2022-06-23 04:54:42 +08:00
Benjamin Wang 1abf085cfb fix all the pipeline failues for release 3.4
Items resolved:
1. fix the vet error: possible misuse of reflect.SliceHeader;
2. fix the vet error: call to (*T).Fatal from a non-test goroutine;
3. bump package golang.org/x/crypto, net and sys;
4. bump boltdb from 1.3.3 to 1.3.6;
5. remove the vendor directory;
6. remove go 1.12.17 and 1.15.15, add go 1.16.15 into pipeline;
7. bump go version to 1.16 in go.mod;
8. fix the issue: compile: version go1.16.15 does not match go tool version go1.17.11,
   refer to https://github.com/actions/setup-go/issues/107;
9. fix data race on compactMainRev and watcherGauge;
10. fix test failure for TestLeasingTxnOwnerGet in cluster_proxy mode.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-22 05:28:45 +08:00
Benjamin Wang c2c9e7de01
Merge pull request #14075 from lavacat/release-3.4-go1.15.15-tests
tests: fixing dependencies that brake tests in go.1.15.15
2022-05-31 05:52:21 +08:00
Bogdan Kanivets ceed023f7c tests: fixing dependencies that brake tests in go.1.15.15
- retry_interceptor_test causes:
clientv3/naming/grpc.go:25:2: module google.golang.org/grpc@latest found (v1.46.0),
but does not contain package google.golang.org/grpc/naming
https://github.com/etcd-io/etcd/issues/12124
2022-05-30 12:08:47 -07:00
Benjamin Wang 5505d7a95b
Merge pull request #13206 from cfz/cherry-pick-#13172-r34
[backport 3.4]: server/auth: enable tokenProvider if recoved store enables auth
2022-05-07 06:59:33 +08:00
Piotr Tabor 76147c9c79
Merge pull request #13999 from mitake/backport-13308-to-3.4
Backport PR 13308 to release 3.4
2022-05-06 13:03:05 +02:00
cfz 23e79dbf19
[backport 3.4]: server/auth: enable tokenProvider if recoved store enables auth
this is a manual backport of #13172
2022-05-06 12:26:55 +08:00
Hitoshi Mitake 757a8e8f5b *: implement a retry logic for auth old revision in the client 2022-04-29 23:46:24 +09:00
Ashish Ranjan 9bbdeb4a64 client/v3: refresh the token when ErrUserEmpty is received while retrying
To fix a bug in the retry logic caused when the auth token is cleared after receiving `ErrInvalidAuthToken` from the server and the subsequent call to `getToken` also fails due to some reason (eg. context deadline exceeded).
This leaves the client without a token and the retry will continue to fail with `ErrUserEmpty` unless the token is refreshed.
2022-04-29 23:43:36 +09:00
Marek Siarkowicz c50b7260cc
Merge pull request #13713 from lavacat/defrag-bopts-fix-3.4
mvcc/backend: restore original bolt db options after defrag
2022-02-18 10:54:21 +01:00
Bogdan Kanivets d30a4fbf0c mvcc/backend: restore original bolt db options after defrag
Problem: Defrag was implemented before custom bolt options were added.
Currently defrag doesn't restore backend options.
For example BackendFreelistType will be unset after defrag.

Solution: save bolt db options and use them in defrag.
2022-02-17 15:33:05 -08:00
richkun a905430d27
embed: only log stream error with debug level (#13656)
Co-authored-by: tangcong <tangcong506@gmail.com>
2022-01-30 12:24:22 -08:00
Sam Batschelet 161bf7e7be
Merge pull request #13475 from chaochn47/backport-release-3.4
backport 3.4 from #13467 exclude the same alarm type activated by multiple peers
2021-11-13 22:10:38 -05:00
Chao Chen 04d47a93f9 backport from #13467 exclude the same alarm type activated by multiple peers 2021-11-12 14:17:14 -08:00
Sam Batschelet 72d3e382e7 version: 3.4.18
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-10-15 09:47:08 -04:00
Piotr Tabor eb9cee9ee3
Merge pull request #13397 from geetasg/release-3.4
storage/backend: Add a gauge to indicate if defrag is active (backport)
2021-10-07 19:08:31 +02:00
Geeta Gharpure 85abf6e46d storage/backend: Add a gauge to indicate if defrag is active (backport from 3.6) 2021-10-06 11:04:47 -07:00
Piotr Tabor 1eac258f58
Merge pull request #13385 from hexfusion/cp-13376-release-3.4
[release-3.4] Dockerfile: bump debian bullseye-20210927
2021-10-04 08:40:32 +02:00
Sam Batschelet 91da298560 Dockerfile: bump debian bullseye-20210927
fixes: CVE-2021-3711, CVE-2021-35942, CVE-2019-9893

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-10-04 00:32:23 -04:00
Sam Batschelet 19e2e70e4f version: 3.4.17
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-10-03 22:30:27 -04:00
Sam Batschelet 8ea187e2cf
Merge pull request #13378 from ysksuzuki/replace-jwt-go
Replace github.com/dgrijalva/jwt-go with github.com/golang-jwt/jwt
2021-10-03 21:48:32 -04:00
Yusuke Suzuki e63d058247 test: update go to 1.15.15
Update go to 1.15.15 which is the latest of 1.15 because linux-amd64-fmt fails with go 1.15.13.

Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
2021-10-02 10:04:22 +09:00
Yusuke Suzuki 1558ede7f8 go.mod,go.sum: Replace github.com/dgrijalva/jwt-go with github.com/golang-jwt/jwt
github.com/dgrijalva/jwt-go has CVE https://github.com/advisories/GHSA-w73w-5m7g-f7qc
and is already archived. etcd v3.4 should use a community maintained fork
github.com/golang-jwt/jwt which provides the fixed version of the CVE.

Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
2021-10-02 10:01:52 +09:00
Sam Batschelet 41061e56ad
Merge pull request #13139 from hexfusion/bp-12727
[release-3.4]: ClientV3: Ordering: Fix TestEndpointSwitchResolvesViolation test
2021-06-24 10:38:10 -04:00
Sam Batschelet 501d8f01ea [release-3.4]: ClientV3: Ordering: Fix TestEndpointSwitchResolvesViolation test
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-06-23 21:26:55 -04:00
Sam Batschelet 38669a0709
Merge pull request #13137 from hexfusion/track-modules
vendor: track vendor/modules.txt
2021-06-23 14:52:10 -04:00
Sam Batschelet 7489911d51
Merge pull request #13135 from serathius/actions-3.4
Migrate PR testing from travis to GitHub actions
2021-06-23 14:03:37 -04:00
Sam Batschelet 15b7954d03 vendor: track vendor/modules.txt
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-06-23 13:56:39 -04:00
Marek Siarkowicz 6cc1345a0b Migrate PR testing from travis to GitHub actions 2021-06-23 18:25:29 +02:00
Sam Batschelet 589a6993b8
Merge pull request #13101 from mrueg/backport-12864
[backport 3.4]  fix check datascale command for https endpoints
2021-06-14 08:19:22 -04:00
Saeid Bostandoust 4bacd21e20 fix check datascale command for https endpoints 2021-06-11 11:58:20 +02:00
Sam Batschelet 0ecc337028
Merge pull request #13100 from tangcong/automated-cherry-pick-of-#13077-origin-release-3.4
[backport 3.4] embed: unlimit the recv msg size of grpc-gateway
2021-06-10 21:29:34 -04:00
spacewander 628fa1818e embed: unlimit the recv msg size of grpc-gateway
Ensure the client which access etcd via grpc-gateway won't
be limited by the MaxCallRecvMsgSize. Here we choose the same
default value of etcdcli as grpc-gateway's MaxCallRecvMsgSize.

Fix https://github.com/etcd-io/etcd/issues/12576
2021-06-11 08:07:28 +08:00
Gyuho Lee d19fbe541b version: 3.4.16
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2021-05-12 01:52:43 +00:00
Piotr Tabor 6bbc85827b
Merge pull request #12917 from chaochn47/2021-05-03-backport-#12880
Backport-3.4 exclude alarms from health check conditionally
2021-05-06 10:21:09 +02:00
Chao Chen dbde4f2d5e Backport-3.4 exclude alarms from health check conditionally 2021-05-04 10:37:12 -07:00
Gyuho Lee 15715dcf1a
Merge pull request #12902 from MakDon/release-3.4
[Backport-3.4] etcdserver/mvcc: update trace.Step condition
2021-04-28 11:05:35 -07:00
makdon 963d3b9369 etcdserver/mvcc: update trace.Step condition
backport PR #12894 to release-3.4
2021-04-28 11:35:49 +08:00
Piotr Tabor ba829044f5
Merge pull request #12888 from chaochn47/2021-04-22-cherry-pick-12871
Backport-3.4 etcdserver/util.go: reduce memory when logging range requests
2021-04-23 00:45:20 +02:00
Chao Chen c4eb81af99 Backport-3.4 etcdserver/util.go: reduce memory when logging range requests 2021-04-22 15:07:44 -07:00
Piotr Tabor ceafa1b33e
Merge pull request #12882 from lilic/bump-go-12
.travis,Makefile,functional: Bump go 1.12 version to v1.12.17
2021-04-20 23:33:23 +02:00
Lili Cosic 5890bc8bd6 .travis,Makefile,functional: Bump go 1.12 version to v1.12.17
This version was already used to build the release v3.4.15.
2021-04-20 14:00:44 +02:00
Piotr Tabor c274aa5ea4
Merge pull request #12849 from lilic/test-go-1-15
[release-3.4]: .travis.yml: Test with go v1.15.11
2021-04-19 18:17:05 +02:00
Piotr Tabor 276ee962ec integration: Fix 'go test --tags cluster_proxy --timeout=30m -v ./integration/...'
grpc proxy opens additional 2 watching channels. The metric is shared
between etcd-server & grpc_proxy, so all assertions on number of open
watch channels need to take in consideration the additional "2"
channels.
2021-04-19 16:41:28 +02:00
Lili Cosic 8d1b8335e3 pkg/tlsutil: Adjust cipher suites for go 1.12
Cherry-pick of 60e44286fa from master branch does not work due to
missing `tls.CipherSuites()` function. We work around by using go build
tags for both the building and tests.
2021-04-19 11:49:13 +02:00
Piotr Tabor c3f447a698 Fix pkg/tlsutil (test) to not fail on 386.
In fact this commit rewrites the functionality to use upstream list of
ciphers instead of checking whether the lists are in sync using ast
analysis.
2021-04-19 11:49:13 +02:00
Lili Cosic 85e037d9c6 bill-of-materials.json: Update golang.org/x/sys 2021-04-19 11:49:13 +02:00
Lili Cosic a1691be1bd .travis,test: Turn race off in Travis for go version 1.15
Currently with race it fails, we can enable this at a later point.
2021-04-19 11:49:13 +02:00
Vimal K df35086b6a integration : fix TestTLSClientCipherSuitesMismatch in go1.13
In go1.13, the TLS13 is enablled by default, and as per go1.13 release notes :
TLS 1.3 cipher suites are not configurable. All supported cipher suites are safe,
and if PreferServerCipherSuites is set in Config the preference order is based
on the available hardware.

Fixing the test case for go1.13 by limiting the TLS version to TLS12
2021-04-19 11:18:14 +02:00
Lili Cosic eeefd614c8 vendor: Run go mod vendor 2021-04-19 11:18:14 +02:00
Lili Cosic 4276c33026 go.mod,go.sum: Bump github.com/creack/pty that includes patch
This patch is needed due to go 1.15 erroring on:

"Setctty set but Ctty not valid in child".
2021-04-19 11:18:13 +02:00
Lili Cosic cfc08e5f06 go.mod,go.sum: Comply with go v1.15 2021-04-19 11:18:13 +02:00
Lili Cosic 0b7e4184e8 etcdserver,wal: Convert int to string using rune() 2021-04-19 11:18:13 +02:00
Lili Cosic 35bd924596 integration,raft,tests: Comply with go v1.15 gofmt 2021-04-19 11:18:13 +02:00
Lili Cosic 62596faeed .travis.yml: Test with go v1.15.11
Currently in CI the tests are only run with go v1.12, this adds also go
v1.15.11.

Excludes certain variants for v1.15.
2021-04-19 11:18:13 +02:00
Piotr Tabor b7e5f5bc12
Merge pull request #12839 from lilic/fix-go-version
[release-3.4]: Pin go version in go.mod to 1.12
2021-04-07 17:52:05 +02:00
Lili Cosic 91bed2e01f pkpkg/testutil/leak.go: Allowlist created by testing.runTests.func1 2021-04-07 17:20:52 +02:00
Lili Cosic b19eb0f339 vendor: Run go mod vendor 2021-04-07 15:25:32 +02:00
Lili Cosic 8557cb29ba go.sum, go.mod: Run go mod tidy with go 1.12 2021-04-07 15:25:08 +02:00
Lili Cosic ef415e3fe1 go.mod: Pin go to 1.12 version
As go 1.12.2 is what is tested in CI as well as recommended to be built
with 1.12.2 we should also pin to this in the go directive version.
2021-04-07 15:21:42 +02:00
Sam Batschelet 82eae9227c
Merge pull request #12803 from cwedgwood/metrics-3.4
etcdserver: fix incorrect metrics generated when clients cancel watches
2021-04-01 08:17:37 -04:00
Chris Wedgwood 656dc63eab etcdserver: fix incorrect metrics generated when clients cancel watches
Manual cherry-pick of 9571325fe8 for
release-3.4.
2021-03-31 22:59:29 -07:00
Piotr Tabor 30799c97be
Merge pull request #12815 from dbavatar/release-3.4-peervalidation
etcdserver: Fix PeerURL validation
2021-03-30 12:54:32 +02:00
Piotr Tabor 16fe9a89ff
Merge pull request #12816 from cwedgwood/3.4-relax-gate-timeout
integration: relax leader timeout from 3s to 4s
2021-03-30 12:53:27 +02:00
Chris Wedgwood c499d9b047 integration: relax leader timeout from 3s to 4s
The integration jobs fail with timeouts slightly over 3s, increase
this marginally so false failures are less prevalent.
2021-03-29 10:17:44 -07:00
Piotr Tabor 2702f9e5f2
Merge pull request #12751 from cwedgwood/nofsyncdowrite
When using --unsafe-no-fsync still write out the data
2021-03-07 11:52:33 +01:00
Chris Wedgwood 94634fc258 etcdserver: when using --unsafe-no-fsync write data
There are situations where we don't wish to fsync but we do want to
write the data.

Typically this occurs in clusters where fsync latency (often the
result of firmware) transiently spikes.  For Kubernetes clusters this
causes (many) elections which have knock-on effects such that the API
server will transiently fail causing other components fail in turn.

By writing the data (buffered and asynchronously flushed, so in most
situations the write is fast) and avoiding the fsync we no longer
trigger this situation and opportunistically write out the data.

Anecdotally:
  Because the fsync is missing there is the argument that certain
  types of failure events will cause data corruption or loss, in
  testing this wasn't seen.  If this was to occur the expectation is
  the member can be readded to a cluster or worst-case restored from a
  robust persisted snapshot.

  The etcd members are deployed across isolated racks with different
  power feeds.  An instantaneous failure of all of them simultaneously
  is unlikely.

  Testing was usually of the form:
   * create (Kubernetes) etcd write-churn by creating replicasets of
     some 1000s of pods
   * break/fail the leader

  Failure testing included:
   * hard node power-off events
   * disk removal
   * orderly reboots/shutdown

  In all cases when the node recovered it was able to rejoin the
  cluster and synchronize.
2021-03-05 10:09:52 -08:00
Sam Batschelet afd6d8a40d
Merge pull request #12740 from hexfusion/cp-12448--release-3.4
Manual cherry pick of #12448 on release 3.4
2021-03-03 13:37:20 -05:00
Sam Batschelet 9aeabe447d server: Added config parameter experimental-warning-apply-duration
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-03-03 12:14:30 -05:00
Gyuho Lee aa7126864d version: 3.4.15
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2021-02-26 22:08:24 +00:00
Gyuho Lee 3be9460ddc
Merge pull request #12679 from chaochn47/backport_3.4_#12677
[Backport-3.4] etcdserver/api/etcdhttp: log successful etcd server side health check in debug level
2021-02-09 15:01:19 -08:00
Chao Chen f27ef4d343 [Backport-3.4] etcdserver/api/etcdhttp: log successful etcd server side health check in debug level
ref. #12677
ref. 0b9cfa8677
2021-02-08 21:44:44 -08:00
Piotr Tabor a1c5f59b59
Merge pull request #12402 from vitalif/release-3.4
etcdserver: Fix 64 KB websocket notification message limit
2021-02-03 09:19:21 +01:00
Vitaliy Filippov a40f14d92c etcdserver: Fix 64 KB websocket notification message limit
This fixes etcd being unable to send any message longer than 64 KB as
a notification over the websocket. This was because the older version
of grpc-websocket-proxy was used and WithMaxRespBodyBufferSize option
wasn't set.
2021-01-30 00:37:02 +03:00
Sam Batschelet d51c6c689b
Merge pull request #12645 from hexfusion/bump-dep
vendor: bump gorilla/websocket
2021-01-23 13:49:45 -05:00
Sam Batschelet becc228c5a vendor: bump gorilla/websocket
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-01-23 11:20:53 -05:00
Piotr Tabor 0880605772
Merge pull request #12551 from kolyshkin/3.4-fix-lock
[3.4 backport] pkg/fileutil: fix F_OFD_ constants
2021-01-15 23:16:49 +01:00
Kir Kolyshkin bea35fd2c6 pkg/fileutil: fix F_OFD_ constants
Use golang.org/x/sys/unix for F_OFD_* constants.

This fixes the issue that F_OFD_GETLK was defined incorrectly,
resulting in bugs such as https://github.com/moby/moby/issues/31182

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-12-14 10:42:13 -08:00
Gyuho Lee 8a03d2e961 version: 3.4.14
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-11-25 11:31:52 -08:00
Gyuho Lee a4b43b388d pkg/netutil: remove unused "iptables" wrapper
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-11-25 11:31:17 -08:00
Gyuho Lee e3b29b66a4 tools/etcd-dump-metrics: validate exec cmd args
To prevent arbitrary command invocations.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-11-25 11:30:31 -08:00
Gyuho Lee eb0fb0e799
Merge pull request #12356 from cfc4n/automated-cherry-pick-of-#12264-upstream-release-3.4
Automated cherry pick of #12264
2020-10-12 14:26:50 -07:00
CFC4N 40b71074e8
clientv3: get AuthToken automatically when clientConn is ready.
fixes: #11954
2020-09-30 17:14:22 +08:00
Jingyi Hu 7e2d426ec0
Merge pull request #12299 from galal-hussein/fix_panic_34
[Backport 3.4] etcdserver: add ConfChangeAddLearnerNode to the list of config changes
2020-09-15 09:04:18 -07:00
galal-hussein 3019246742 etcdserver: add ConfChangeAddLearnerNode to the list of config changes
To fix a panic that happens when trying to get ids of etcd members in
force new cluster mode, the issue happen if the cluster previously had
etcd learner nodes added to the cluster

Fixes #12285
2020-09-14 17:50:57 +02:00
Joe Betz dd1b699fc4
Merge pull request #12280 from jingyih/automated-cherry-pick-of-#12271-upstream-release-3.4
Automated cherry pick of #12271 on release 3.4
2020-09-10 11:07:54 -07:00
jingyih f44aaf8248 integration: add flag WatchProgressNotifyInterval in integration test 2020-09-09 12:39:42 -07:00
Gyuho Lee ae9734ed27 version: 3.4.13
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-24 12:11:28 -07:00
Gyuho Lee 781bde75e2
Merge pull request #12250 from spzala/automated-cherry-pick-of-#12242-upstream-release-3.4
Automated cherry pick of #12242
2020-08-24 12:05:03 -07:00
Sahdev P. Zala d5ebbbceb8 pkg: file stat warning
Provide warning and doc instead of enforcing file permission.
2020-08-24 11:21:29 -04:00
Sam Batschelet 7cd5872656
Merge pull request #12244 from hexfusion/automated-cherry-pick-of-#12243-upstream-release-3.4
Automated cherry pick of #12243 on release 3.4
2020-08-21 11:24:21 -04:00
Sam Batschelet 46a0a44f95 Automated cherry pick of #12243 on release 3.4
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2020-08-21 10:14:07 -04:00
Gyuho Lee 17cef6e3e9 version: 3.4.12
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-19 09:56:24 -07:00
Gyuho Lee c07cba001b
Merge pull request #12239 from liggitt/slow-v2-panic-3.4
[3.4] etcdserver: Avoid panics logging slow v2 requests in integration tests
2020-08-19 09:55:08 -07:00
Jordan Liggitt b8878eac45 etcdserver: Avoid panics logging slow v2 requests in integration tests 2020-08-19 11:30:39 -04:00
Gyuho Lee e71e0c5c88
Merge pull request #12226 from jingyih/fix_backport_PR12216
*: add plog logging to the backport of PR12216
2020-08-18 08:48:09 -07:00
Gyuho Lee bc44e367c3 version: 3.4.11
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-18 08:46:13 -07:00
Gyuho Lee 299e0f17aa Revert "etcdserver/api/v3rpc: "MemberList" never return non-empty ClientURLs"
This reverts commit 0372cfc7ab.
2020-08-18 08:45:38 -07:00
jingyih 75d5e78d1f *: fix backport of PR12216
Fix bugs introduced in commit c60dabf
2020-08-16 15:01:18 +08:00
jingyih c60dabf2f3 *: add experimental flag for watch notify interval
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-15 10:24:25 -07:00
Gyuho Lee 8a4afdbcc2
Merge pull request #12189 from jingyih/automated-cherry-pick-of-#11452-#12187-upstream-release-3.4
Automated cherry pick of #11452 #12187 on release 3.4
2020-08-13 21:38:05 -07:00
jingyih 6fcab5af9f clientv3: remove excessive watch cancel logging 2020-08-13 13:51:21 +08:00
Gyuho Lee 008074187c etcdserver: add OS level FD metrics
Similar counts are exposed via Prometheus.
This adds the one that are perceived by etcd server.

e.g.

os_fd_limit 120000
os_fd_used 14
process_cpu_seconds_total 0.31
process_max_fds 120000
process_open_fds 17

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-12 18:38:35 -07:00
Gyuho Lee cf558ee8b7 pkg/runtime: optimize FDUsage by removing sort
No need sort when we just want the counts.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-12 18:38:17 -07:00
Ted Yu e800c62eca clientv3: log warning in case of error sending request 2020-07-30 22:54:35 +08:00
Gyuho Lee 0372cfc7ab etcdserver/api/v3rpc: "MemberList" never return non-empty ClientURLs
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

cr https://code.amazon.com/reviews/CR-29712724
2020-07-16 16:29:51 -07:00
Gyuho Lee 18dfb9cca3 version: 3.4.10
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-07-16 15:16:20 -07:00
Jingyi Hu 7b8270416d
Merge pull request #12106 from bart0sh/PR001-cherry-pick-change-protobuf-field-type-from-int-to-int64
etcdserver: change protobuf field type from int to int64 (#12000)
2020-07-16 00:45:16 +08:00
Sahdev Zala a2c37485dd
Merge pull request #12127 from spzala/automated-cherry-pick-of-#12012-upstream-release-3.4
Automated cherry pick of #12012
2020-07-13 10:53:52 -04:00
Hitoshi Mitake 67bfc310f0 Documentation: note on data encryption 2020-07-13 09:50:30 -04:00
Yuchen Zhou ed28c768a3 etcdserver: change protobuf field type from int to int64 (#12000) 2020-07-08 10:21:10 +03:00
Gyuho Lee d3a702a09d
Merge pull request #12112 from spzala/automated-cherry-pick-of-#12018-upstream-release-3.4
Automated cherry pick of #12018
2020-07-07 10:32:18 -07:00
Sahdev P. Zala 319331192e pkg: consider umask when use MkdirAll
os.MkdirAll creates directory before umask so make sure that a desired
permission is set after creating a directory with MkdirAll. Use the
existing TouchDirAll function which checks for permission if dir is already
exist and when create a new dir.
2020-07-07 11:46:31 -04:00
Gyuho Lee 2acdf88406
Merge pull request #12076 from cfc4n/automated-cherry-pick-of-#11987-upstream-release-3.4
Automated cherry pick of #11987
2020-07-06 13:01:24 -07:00
Gyuho Lee a8454e453f
Merge pull request #12089 from tangcong/automated-cherry-pick-of-#11997-origin-release-3.4
Automated cherry pick of #11997
2020-07-06 13:00:56 -07:00
Gyuho Lee 32583af167
Merge pull request #12101 from tangcong/automated-cherry-pick-of-#12100-origin-release-3.4
Automated cherry pick of #12100
2020-07-06 11:47:24 -07:00
Sahdev Zala 85cc4deae6
Merge pull request #12103 from spzala/automated-cherry-pick-of-#12092-upstream-release-3.4
Automated cherry pick of #12092
2020-07-05 11:45:30 -04:00
Hitoshi Mitake 7dec4c412c etcdmain: let grpc proxy warn about insecure-skip-tls-verify 2020-07-01 18:25:29 -04:00
tangcong a4667f596a etcdmain: fix shadow error 2020-07-01 13:36:48 +08:00
tangcong 0207d1df66 pkg/fileutil: print desired file permission in error log 2020-06-29 09:59:19 +08:00
Gyuho Lee 99e893d285
Merge pull request #12074 from cfc4n/automated-cherry-pick-of-#12005-upstream-release-3.4
Automated cherry pick of #12005
2020-06-26 11:30:07 -07:00
Gyuho Lee d5dec731db
Merge pull request #12077 from cfc4n/automated-cherry-pick-of-#11980-upstream-release-3.4
Automated cherry pick of #11980
2020-06-26 11:29:16 -07:00
Gyuho Lee 81a2edc365
Merge pull request #12081 from spzala/automated-cherry-pick-of-#11945-upstream-release-3.4
Automated cherry pick of #11945
2020-06-26 11:28:34 -07:00
Changxin Miao e5424fc474 pkg: Fix dir permission check on Windows 2020-06-25 20:20:55 -04:00
cfc4n 4488595e05 auth: Customize simpleTokenTTL settings.
see https://github.com/etcd-io/etcd/issues/11978 for more detail.
2020-06-25 19:58:26 +08:00
cfc4n 7b99863e02 mvcc: chanLen 1024 is to biger,and it used more memory. 128 seems to be enough. Sometimes the consumption speed is more than the production speed.
See https://github.com/etcd-io/etcd/issues/11906 for more detail.
2020-06-25 19:53:13 +08:00
cfc4n 490c6139ac auth: return incorrect result 'ErrUserNotFound' when client request without username or username was empty.
Fiexs https://github.com/etcd-io/etcd/issues/12004 .
2020-06-25 19:48:36 +08:00
Gyuho Lee 31e49a4df3
Merge pull request #12048 from spzala/automated-cherry-pick-of-#11793-upstream-release-3.4
Automated cherry pick of #11793
2020-06-24 20:42:26 -07:00
Gyuho Lee 83fc96df0c
Merge pull request #12055 from tangcong/automated-cherry-pick-of-#11850-origin-release-3.4
Automated cherry pick of #11850
2020-06-24 20:41:44 -07:00
Gyuho Lee 45192cf62b
Merge pull request #12064 from cfc4n/automated-cherry-pick-of-#11986-upstream-release-3.4
Automated cherry pick of #11986
2020-06-24 20:40:37 -07:00
Gyuho Lee 1a1281005c
Merge pull request #12070 from spzala/automated-cherry-pick-of-#12060-upstream-release-3.4
Automated cherry pick of #12060
2020-06-24 20:39:33 -07:00
Gyuho Lee a4f42948e8
Merge pull request #12072 from tangcong/automated-cherry-pick-of-#12066-origin-release-3.4
Automated cherry pick of #12066
2020-06-24 20:39:15 -07:00
Gyuho Lee 2212a84adb
Merge pull request #12034 from spzala/automated-cherry-pick-of-#11798-upstream-release-3.4
Automated cherry pick of #11798
2020-06-24 20:38:46 -07:00
tangcong e42d7b5248 etcdmain: fix shadow error 2020-06-25 06:40:33 +08:00
Xiang Li b86bb615ff doc: add TLS related warnings 2020-06-24 16:39:35 -04:00
cfc4n ee963470f4 etcdserver:FDUsage set ticker to 10 minute from 5 seconds. This ticker will check File Descriptor Requirements ,and count all fds in used. And recorded some logs when in used >= limit/5*4. Just recorded message. If fds was more than 10K,It's low performance due to FDUsage() works. So need to increase it.
see https://github.com/etcd-io/etcd/issues/11969 for more detail.
2020-06-24 13:28:40 +08:00
Jack Kleeman 36452a1c1d clientv3: cancel watches proactively on client context cancellation
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes #9416
Heavy inspiration drawn from the solutions proposed there.
2020-06-23 19:50:21 +08:00
Gyuho Lee 4571e528f4 wal: check out of range slice in "ReadAll", "decoder"
wal: add slice bound checks in decoder

CHANGELOG-3.5: add wal slice bound check
CHANGELOG-3.5: add "decodeRecord"

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-06-22 11:57:43 -04:00
Gyuho Lee 37ac22205b
Merge pull request #12035 from spzala/automated-cherry-pick-of-#11787-upstream-release-3.4
Automated cherry pick of #11787
2020-06-21 19:22:54 -07:00
Gyuho Lee 493f15c156
Merge pull request #12037 from spzala/automated-cherry-pick-of-#11807-upstream-release-3.4
Automated cherry pick of #11807
2020-06-21 19:20:42 -07:00
Gyuho Lee c8b3c6f54c
Merge pull request #12041 from spzala/automated-cherry-pick-of-#11795-upstream-release-3.4
Automated cherry pick of #11795
2020-06-21 19:20:26 -07:00
Gyuho Lee 368ff75a10
Merge pull request #12039 from spzala/automated-cherry-pick-of-#11845-upstream-release-3.4
Automated cherry pick of #11845
2020-06-21 19:20:04 -07:00
Gyuho Lee 7adbfa1144
Merge pull request #12038 from spzala/automated-cherry-pick-of-#11608-upstream-release-3.4
Automated cherry pick of #11608
2020-06-21 19:19:50 -07:00
Gyuho Lee e151faf3cc
Merge pull request #12040 from spzala/automated-cherry-pick-of-#11796-upstream-release-3.4
Automated cherry pick of #11796
2020-06-21 19:19:31 -07:00
Gyuho Lee 8292fd5051
Merge pull request #12042 from spzala/automated-cherry-pick-of-#11818-upstream-release-3.4
Automated cherry pick of #11818
2020-06-21 19:19:17 -07:00
Gyuho Lee c37245ed4b
Merge pull request #12043 from spzala/automated-cherry-pick-of-#11830-upstream-release-3.4
Automated cherry pick of #11830
2020-06-21 19:18:51 -07:00
Gyuho Lee 6dab8aff66
Merge pull request #12044 from spzala/automated-cherry-pick-of-#11841-upstream-release-3.4
Automated cherry pick of #11841
2020-06-21 19:18:35 -07:00
Hitoshi Mitake c69efda350 etcdctl, etcdmain: warn about --insecure-skip-tls-verify options 2020-06-21 19:23:06 -04:00
Hitoshi Mitake 3d8e9a323d Documentation: note on the policy of insecure by default 2020-06-21 19:21:05 -04:00
Hitoshi Mitake 963b242846 etcdserver: don't let InternalAuthenticateRequest have password 2020-06-21 19:18:18 -04:00
Hitoshi Mitake 6f011ce524 auth: a new error code for the case of password auth against no password user 2020-06-21 19:12:55 -04:00
Hitoshi Mitake 36f8dee003 Documentation: note on password strength 2020-06-21 19:08:39 -04:00
Xiang Li 47001f28bd etcdmain: best effort detection of self pointing in tcp proxy 2020-06-21 18:12:24 -04:00
Sahdev P. Zala 9a24f73f7b Discovery: do not allow passing negative cluster size
When an etcd instance attempts to perform service discovery, if a
cluster size with negative value  is provided, the etcd instance
will panic without recovery because of
2020-06-21 18:00:35 -04:00
Sahdev P. Zala 7d1cf64049 wal: fix panic when decoder not set
Handle the related panic and clarify doc.
2020-06-21 16:21:34 -04:00
Sahdev P. Zala 05c441f92f embed: fix compaction runtime err
Handle negative value input which currently gives a runtime error.
2020-06-20 20:58:18 -04:00
Sahdev P. Zala 434f7e83f0 pkg: check file stats
modify file util.
2020-06-20 16:29:47 -04:00
Gyuho Lee 91b1a9182a
Merge pull request #11977 from jpbetz/automated-cherry-pick-of-#11946-release-3.4
Automated cherry pick of #11946
2020-06-05 12:46:33 -07:00
David Crawshaw 78f67988aa
etcdserver, et al: add --unsafe-no-fsync flag
This makes it possible to run an etcd node for testing and development
without placing lots of load on the file system.

Fixes #11930.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-06-04 20:19:28 -07:00
Gyuho Lee 54ba958911 version: 3.4.9
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-20 16:28:29 -07:00
Gyuho Lee 609e844f86
Merge pull request #11811 from wswcfan/automated-cherry-pick-of-#11735-origin-release-3.4
Automated cherry pick of #11735 on release-3.4
2020-05-20 15:57:48 -07:00
tangcong 166b4473fa wal: add TestValidSnapshotEntriesAfterPurgeWal testcase
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-20 11:38:06 -07:00
tangcong ed231df7c0 wal: fix crc mismatch crash bug 2020-05-20 11:37:04 -07:00
Gyuho Lee cfe37de6c0 rafthttp: log snapshot download duration
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-20 11:37:01 -07:00
Gyuho Lee 0de2b1f860 version: 3.4.8
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-18 11:40:23 -07:00
Gyuho Lee a668adba78 rafthttp: improve snapshot send logging
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-18 11:39:24 -07:00
Gyuho Lee 9bad82fee5 *: make sure snapshot save downloads SHA256 checksum
ref. https://github.com/etcd-io/etcd/pull/11896

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-17 17:38:42 -07:00
Gyuho Lee f1ea03a7c8 etcdserver/api/snap: exclude orphaned defragmentation files in snapNames
ref. https://github.com/etcd-io/etcd/pull/11900

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-17 14:21:02 -07:00
Ted Yu 4079deadb4 etcdserver: continue releasing snap db in case of error
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-17 14:15:26 -07:00
Viacheslav Biriukov 87fc3c9e57 etcdserver,wal: fix inconsistencies in WAL and snapshot
etcdserver/*, wal/*: changes to snapshots and wal logic
etcdserver/*: changes to snapshots and wal logic to fix #10219
etcdserver/*, wal/*: add Sync method
etcdserver/*, wal/*: find valid snapshots by cross checking snap files and wal snap entries
etcdserver/*, wal/*:Add comments, clean up error messages and tests
etcdserver/*, wal/*: Remove orphaned .snap.db files during Release

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-05-15 08:40:09 -07:00
Gary Belvin e048e166ab
cherry pick of #11564 (#11880)
* clientv3: fix grpc-go(v1.27.0) incompatible changes to balancer/resolver.

* vendor: upgrade gRPC Go to v1.24.0

Picking up some performance improvements and bug fixes.

https://github.com/grpc/grpc-go/releases/tag/v1.24.0

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

* vendor: update gRPC Go to v1.26.0 (#11522)

* GO111MODULE=on go mod vendor

* GO111MODULE=on go mod vendor go 1.14

Bump travis 2

Co-authored-by: EDDYCJY <313687982@qq.com>
Co-authored-by: Gyuho Lee <leegyuho@amazon.com>
Co-authored-by: Yuchen Zhou <yczhou@google.com>
2020-05-13 10:12:58 -07:00
Gyuho Lee 2333c727a2
Merge pull request #11855 from tangcong/automated-cherry-pick-of-#11817-origin-release-3.4
Automated cherry pick of #11817 on release-3.4
2020-05-07 20:00:26 -07:00
tangcong aa75e90ac8 mvcc: fix deadlock bug 2020-05-08 09:56:23 +08:00
shawwang f18976f4b8 auth: optimize lock scope for CheckPassword
to improve authentication performance in concurrent scenarios when enable auth and using authentication based password
2020-04-25 18:36:18 +08:00
Jingyi Hu f1eca4e1fa
Merge pull request #11752 from tangcong/automated-cherry-pick-of-#11652-#11670-#11710-origin-release-3.4
Automated cherry pick of #11652 #11670 #11710
2020-04-10 23:21:45 +08:00
tangcong b733b22712 auth: ensure RoleGrantPermission is compatible with older versions 2020-04-09 09:33:40 +08:00
tangcong eb80716532 etcdserver: print warn log when failed to apply request 2020-04-09 09:33:40 +08:00
tangcong e2abd97659 auth: cleanup saveConsistentIndex in NewAuthStore 2020-04-09 09:33:40 +08:00
tangcong 716821b9b5 auth: print warning log when error is ErrAuthOldRevision 2020-04-09 09:33:40 +08:00
shawwang 63116ffdb4 auth: add new metric 'etcd_debugging_auth_revision' 2020-04-09 09:33:40 +08:00
shawwang b3d54def77 tools/etcd-dump-db: add auth decoder, optimize print format 2020-04-09 09:33:40 +08:00
tangcong 347c8dac3b *: fix auth revision corruption bug 2020-04-09 09:33:36 +08:00
Sahdev Zala e2ae6013a4
Merge pull request #11757 from jingyih/automated-cherry-pick-of-#11754-upstream-release-3.4
Automated cherry pick of #11754 on release-3.4
2020-04-06 11:09:26 -04:00
Changxin Miao 9c8554573f etcdserver: watch stream got closed once one request is not permitted (#11708) 2020-04-06 07:06:57 -07:00
Gyuho Lee e694b7bb08 version: 3.4.7
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-04-01 10:46:54 -07:00
Gyuho Lee e99399d0dc wal: add "etcd_wal_writes_bytes_total"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-04-01 09:30:09 -07:00
Gyuho Lee b68f8ff31d pkg/ioutil: add "FlushN"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-04-01 09:29:59 -07:00
Gyuho Lee 857dffa386
Merge pull request #11734 from jingyih/automated-cherry-pick-of-#11330-upstream-release-3.4
Cherry pick of #11330 on release-3.4
2020-03-31 14:58:59 -07:00
jingyih 5f17aa2c8b test: auto detect branch when finding merge base 2020-03-31 10:59:44 -07:00
shenjiangc 89b10cf967 mvcc/kvstore:when the number key-value is greater than one million, compact take too long and blocks other requests 2020-03-30 08:21:38 -07:00
Gyuho Lee bdc9bc1d81 version: 3.4.6
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-03-29 12:47:27 -07:00
tangcong b0bdaaa449 lease: fix memory leak in LeaseGrant when node is follower 2020-03-29 12:47:14 -07:00
Gyuho Lee e784ba73c2 version: 3.4.5
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-03-18 17:24:42 -07:00
Gyuho Lee 35dc623a98 words: whitelist "racey"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-03-18 17:24:15 -07:00
Gyuho Lee 130342152a Revert "version: 3.4.5"
This reverts commit 0dc5d577fc.
2020-03-18 17:17:19 -07:00
Gyuho Lee fc93fbf9de words: whitelist "hasleader"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-03-18 17:17:04 -07:00
Gyuho Lee 0dc5d577fc version: 3.4.5
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-03-18 17:17:04 -07:00
Gyuho Lee e63db56cc9 etcdserver/api/v3rpc: handle api version metadata, add metrics
ref.
https://github.com/etcd-io/etcd/pull/11687

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-03-18 17:17:04 -07:00
Gyuho Lee 1471e12108 clientv3: embed api version in metadata
ref.
https://github.com/etcd-io/etcd/pull/11687

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

clientv3: fix racy writes to context key

=== RUN   TestWatchOverlapContextCancel

==================

WARNING: DATA RACE

Write at 0x00c42110dd40 by goroutine 99:

  runtime.mapassign()

      /usr/local/go/src/runtime/hashmap.go:485 +0x0

  github.com/coreos/etcd/clientv3.metadataSet()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/ctx.go:61 +0x8c

  github.com/coreos/etcd/clientv3.withVersion()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/ctx.go:47 +0x137

  github.com/coreos/etcd/clientv3.newStreamClientInterceptor.func1()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/client.go:309 +0x81

  google.golang.org/grpc.NewClientStream()

      /go/src/github.com/coreos/etcd/gopath/src/google.golang.org/grpc/stream.go:101 +0x10e

  github.com/coreos/etcd/etcdserver/etcdserverpb.(*watchClient).Watch()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:3193 +0xe9

  github.com/coreos/etcd/clientv3.(*watchGrpcStream).openWatchClient()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/watch.go:788 +0x143

  github.com/coreos/etcd/clientv3.(*watchGrpcStream).newWatchClient()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/watch.go:700 +0x5c3

  github.com/coreos/etcd/clientv3.(*watchGrpcStream).run()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/watch.go:431 +0x12b

Previous read at 0x00c42110dd40 by goroutine 130:

  reflect.maplen()

      /usr/local/go/src/runtime/hashmap.go:1165 +0x0

  reflect.Value.MapKeys()

      /usr/local/go/src/reflect/value.go:1090 +0x43b

  fmt.(*pp).printValue()

      /usr/local/go/src/fmt/print.go:741 +0x1885

  fmt.(*pp).printArg()

      /usr/local/go/src/fmt/print.go:682 +0x1b1

  fmt.(*pp).doPrintf()

      /usr/local/go/src/fmt/print.go:998 +0x1cad

  fmt.Sprintf()

      /usr/local/go/src/fmt/print.go:196 +0x77

  github.com/coreos/etcd/clientv3.streamKeyFromCtx()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/watch.go:825 +0xc8

  github.com/coreos/etcd/clientv3.(*watcher).Watch()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/watch.go:265 +0x426

  github.com/coreos/etcd/clientv3/integration.testWatchOverlapContextCancel.func1()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:959 +0x23e

Goroutine 99 (running) created at:

  github.com/coreos/etcd/clientv3.(*watcher).newWatcherGrpcStream()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/watch.go:236 +0x59d

  github.com/coreos/etcd/clientv3.(*watcher).Watch()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/watch.go:278 +0xbb6

  github.com/coreos/etcd/clientv3/integration.testWatchOverlapContextCancel.func1()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:959 +0x23e

Goroutine 130 (running) created at:

  github.com/coreos/etcd/clientv3/integration.testWatchOverlapContextCancel()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:979 +0x76d

  github.com/coreos/etcd/clientv3/integration.TestWatchOverlapContextCancel()

      /go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/clientv3/integration/watch_test.go:922 +0x44

  testing.tRunner()

      /usr/local/go/src/testing/testing.go:657 +0x107

==================

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-03-18 17:17:00 -07:00
Gyuho Lee 0b9cfa8677 etcdserver/api/etcdhttp: log server-side /health checks
ref.
https://github.com/etcd-io/etcd/pull/11704

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-03-18 16:29:24 -07:00
Sam Batschelet b66c53ff5f proxy/grpcproxy: add return on error for metrics handler
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2020-03-16 12:06:01 -04:00
Sahdev Zala 8f6c3f4d09
Merge pull request #11664 from jingyih/automated-cherry-pick-of-#11638-upstream-release-3.4
Automated cherry pick of #11638 on release-3.4
2020-03-11 19:26:11 -04:00
jingyih 379d01a0d2 etcdctl: fix member add command
Use members information from member add response, which is
guaranteed to be up to date.
2020-02-29 07:18:11 -08:00
Gyuho Lee c65a9e2dd1 version: 3.4.4
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-02-24 13:14:02 -08:00
Jingyi Hu 7862f6ed2c
Merge pull request #11644 from jingyih/automated-cherry-pick-of-#11640-upstream-release-3.4
Automated cherry pick of #11640 on release-3.4
2020-02-23 14:42:34 +08:00
Rafael Fernández López 257319fb18 etcdserver: fix quorum calculation when promoting a learner member
When promoting a learner member we should not count already a voting
member, but take only into account the number of existing voting
members and their current status (started, unstarted) when taking the
decision whether a learner member can be promoted.

Before this change, it was impossible to grow from a quorum N to a N+1
through promoting a learning member.

Fixes: #11633
2020-02-21 23:14:55 -08:00
Jingyi Hu cdb2dc11b8
Merge pull request #11636 from YoyinZyc/automated-cherry-pick-of-#11621-upstream-release-3.4
Automated cherry pick of #11621 to release-3.4
2020-02-20 13:04:33 +08:00
jingyih 770674e4a6 etcdserver: corruption check via http
During corruption check, get peer's hashKV via http call.
2020-02-18 14:12:19 -08:00
Jingyi Hu c10168f718
Merge pull request #11631 from jingyih/automated-cherry-pick-of-#11630-upstream-release-3.4
Automated cherry pick of #11630 to release-3.4
2020-02-16 08:35:23 +08:00
jingyih 94673a6ba4 mvcc/backend: check for nil boltOpenOptions
Check if boltOpenOptions is nil before use it.
2020-02-15 00:18:26 -08:00
Joe Betz a1bf5574fc
Merge pull request #11622 from jpbetz/automated-cherry-pick-of-#11613-origin-release-3.4
Automated cherry pick of #11613 to release-3.4
2020-02-13 14:45:38 -08:00
Joe Betz 6d646c442a
mvcc/backend: Delete orphaned db.tmp files before defrag 2020-02-13 12:26:54 -08:00
Jingyi Hu 1226686cf3
Merge pull request #11588 from jingyih/automated-cherry-pick-of-#11586-upstream-release-3.4
Automated cherry pick of #11586 on release 3.4
2020-02-04 19:45:59 -08:00
jingyih 50e12328ac auth: correct logging level 2020-02-04 05:38:58 -08:00
Jingyi Hu 0dc78a144b
Merge pull request #11439 from YoyinZyc/automated-cherry-pick-of-#11418-upstream-release-3.4
Automated cherry pick of #11418 to release 3.4
2019-12-11 14:41:06 -08:00
yoyinzyc 7cf32c262c e2e: test curl auth on onoption user 2019-12-10 12:53:10 -08:00
yoyinzyc 4a9247a47e auth: fix NoPassWord check when add user 2019-12-10 12:53:10 -08:00
Jingyi Hu ac63c2fbd0
Merge pull request #11415 from YoyinZyc/automated-cherry-pick-of-#11413-upstream-release-3.4
Automated cherry pick of #11413 to release-3.4
2019-12-02 14:51:47 -08:00
yoyinzyc ae5bd3c268 auth: fix user.Options nil pointer 2019-12-02 14:44:15 -08:00
Jingyi Hu 94e46ba0d7
Merge pull request #11403 from jingyih/automated-cherry-pick-of-#11400-upstream-release-3.4
Automated cherry pick of #11400 on release 3.4
2019-11-27 13:28:34 -08:00
宇慕 8c10973820 mvcc/kvstore:fixcompactbug 2019-11-27 13:07:47 -08:00
Wenjia 1af0b51537
Merge pull request #11393 from jingyih/automated-cherry-pick-of-#11374-upstream-release-3.4
Automated cherry pick of #11374 on release 3.4
2019-11-26 15:02:27 -08:00
yoyinzyc f4669c3b62 mvcc: update to "etcd_debugging_mvcc_total_put_size_in_bytes" 2019-11-26 14:03:07 -08:00
yoyinzyc 55c3476abc mvcc: add "etcd_mvcc_put_size_in_bytes" to monitor the throughput of put request. 2019-11-26 14:03:07 -08:00
Jingyi Hu b66203c0a1
Merge pull request #11299 from jingyih/automated-cherry-pick-of-#10468-upstream-release-3.4
Automated cherry pick of #10468 on release-3.4
2019-11-05 18:34:22 -08:00
Gyuho Lee 4388404f56 clientv3: fix retry/streamer error message
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-10-31 10:09:18 -07:00
Gyuho Lee a447d51f23
Merge pull request #11312 from jingyih/automated-cherry-pick-of-#11308-upstream-release-3.4
Automated cherry pick of #11308 on release-3.4
2019-10-31 10:08:34 -07:00
Jingyi Hu 4f3c81d81d etcdserver: wait purge file loop during shutdown
To prevent the purge file loop from accidentally acquiring the file lock
and remove the files during server shutdowm.
2019-10-30 16:04:41 -07:00
Jingyi Hu 478da3bf24 integration: disable TestV3AuthOldRevConcurrent
Disable TestV3AuthOldRevConcurrent for now. See
https://github.com/etcd-io/etcd/pull/10468#issuecomment-463253361
2019-10-28 15:03:44 -07:00
Jingyi Hu d6b30e43cd etcdserver: remove auth validation loop
Remove auth validation loop in v3_server.raftRequest(). Re-validation
when error ErrAuthOldRevision occurs should be handled on client side.
2019-10-28 15:03:44 -07:00
Gyuho Lee 1e98c9642e scripts/release: list GPG key only when tagging is needed
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-10-23 11:13:21 -07:00
Gyuho Lee 3cf2f69b57 version: 3.4.3
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-10-23 10:11:46 -07:00
Gyuho Lee d617055284 *: use Go 1.12.12
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-10-23 10:11:02 -07:00
Gyuho Lee 84db9b0878
Merge pull request #11252 from YoyinZyc/automated-cherry-pick-of-#11247-origin-release-3.4
Automated cherry pick of #11247
2019-10-18 10:30:30 -07:00
Gyuho Lee 6cf418ff6d
Merge pull request #11275 from YoyinZyc/stream-support-3.4
rafthttp: add 3.4 stream type
2019-10-18 10:25:53 -07:00
yoyinzyc 97e68cf4e7 rafthttp: add 3.4 stream type 2019-10-17 14:33:53 -07:00
Gyuho Lee 90556d550d
Merge pull request #11269 from jingyih/automated-cherry-pick-of-#11265-upstream-release-3.4
Automated cherry pick of #11265 on release 3.4
2019-10-16 16:50:06 -07:00
Jingyi Hu a00abf5f2a etcdserver: strip patch version in metrics
Strip patch version in cluster version metrics during node restart.
2019-10-16 16:29:53 -07:00
Gyuho Lee b3329ebcd2
Merge pull request #11255 from jingyih/automated-cherry-pick-of-#11233-#11254-upstream-release-3.4
Automated cherry pick of #11233 #11254 on release 3.4
2019-10-15 11:06:09 -07:00
Jingyi Hu b67862c0a6 etcdserver: strip patch version in cluster version
Strip patch version in cluster version metrics.
2019-10-14 17:37:49 -07:00
Jingyi Hu 6a699b6b7f etcdserver: unset old cluster version in metrics 2019-10-14 17:35:10 -07:00
Joe Betz bb5ba14aac Add version, tag and branch checks to release script 2019-10-14 12:55:17 -07:00
Gyuho Lee c3dc994567
Merge pull request #11243 from YoyinZyc/automated-cherry-pick-of-#11237-origin-release-3.4
Automated cherry pick of #11237
2019-10-11 12:38:10 -07:00
Yuchen Zhou f3fbed5b72
Merge branch 'release-3.4' into automated-cherry-pick-of-#11237-origin-release-3.4 2019-10-11 11:17:03 -07:00
yoyinzyc e2547907c5 scripts: avoid release builds on darwin machine. 2019-10-11 11:12:30 -07:00
Gyuho Lee 14c239030f
Merge pull request #11235 from YoyinZyc/automated-cherry-pick-of-#11234-origin-release-3.4
Automated cherry pick of #11234
2019-10-10 16:28:59 -07:00
yoyinzyc 7b67e8a5c5 scripts: fix read failure prompt in release; use https for git clone. 2019-10-10 16:20:17 -07:00
Joe Betz bbe86b066c
version: 3.4.2 2019-10-09 15:26:52 -07:00
Gyuho Lee 2c36cab87d
Merge pull request #11223 from YoyinZyc/automated-cherry-pick-of-#11179-origin-release-3.4
Automated cherry pick of #11179
2019-10-09 13:28:04 -07:00
yoyinzyc 480d5510f9 etcdserver: trace compaction request; add return parameter 'trace' to applierV3.Compaction() mvcc: trace compaction request; add input parameter 'trace' to KV.Compact() 2019-10-09 12:40:12 -07:00
yoyinzyc 9245518363 etcdserver: trace raft requests. 2019-10-09 12:40:12 -07:00
yoyinzyc daa432cfa7 etcdserver: add put request steps. mvcc: add put request steps; add trace to KV.Write() as input parameter. 2019-10-09 12:40:12 -07:00
yoyinzyc 8717327697 pkg: use zap logger to format the structure log output. 2019-10-09 12:40:12 -07:00
yoyinzyc 4f1bbff888 pkg: add field to record additional detail of trace; add stepThreshold to reduce log volume. 2019-10-09 12:40:12 -07:00
yoyinzyc 28bb8037d9 pkg: create package traceutil for tracing. mvcc: add tracing steps:range from the in-memory index tree; range from boltdb. etcdserver: add tracing steps: agreement among raft nodes before linerized reading; authentication; filter and sort kv pairs; assemble the response. 2019-10-09 12:40:12 -07:00
Joe Betz 03b5e7229b
Merge pull request #11213 from jpbetz/automated-cherry-pick-of-#11211-origin-release-3.4
Automated cherry pick of #11211
2019-10-08 18:47:06 -07:00
Joe Betz 0781c0327d
clientv3: Replace endpoint.ParseHostPort with net.SplitHostPort to fix IPv6 client endpoints 2019-10-08 18:27:03 -07:00
Joe Betz 99774d8ed4
Merge pull request #11214 from jpbetz/automated-cherry-pick-of-#11184-origin-release-3.4
Automated cherry pick of #11184
2019-10-08 17:35:02 -07:00
Joe Betz c454344f14
clientv3: Set authority used in cert checks to host of endpoint 2019-10-08 15:35:27 -07:00
Gyuho Lee dae0a72a42
Merge pull request #11200 from jingyih/automated-cherry-pick-of-#11194-origin-release-3.4
Automated cherry pick of #11194 on release-3.4
2019-10-03 16:03:23 -07:00
Gyuho Lee c91a6bf14f tests/e2e: fix metrics tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-10-03 16:02:39 -07:00
Jingyi Hu b7ff97f54e etcdctl: fix member add command 2019-10-03 13:52:22 -07:00
Gyuho Lee d08bb07d6d scripts/build-binary: fix darwin tar commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-28 11:39:04 -07:00
Gyuho Lee 3a736a81e8 scripts/release: fix SHA256SUMS command
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-17 14:12:18 -07:00
Gyuho Lee a14579fbfb version: 3.4.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-17 13:53:25 -07:00
Gyuho Lee ade66a5722 scripts/release: fix docker push command
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-17 13:53:12 -07:00
Guangming Wang 67cc70926d integration: fix bug in for loop, make it break properly 2019-09-17 13:30:12 -07:00
Debabrata Banerjee 3b8f812955 etcdserver: Fix PeerURL validation
In case of URLs that are synonyms, the current lexicographic sorting
and compare of the URLs fails with frustrating errors. Make sure to do
a full comparison between every set of PeerURLs before failing.

Fixes #11013
2019-09-16 11:49:58 -04:00
Gyuho Lee 21dcadc83c
Merge pull request #11148 from spzala/automated-cherry-pick-of-#11147-upstream-release-3.4
Automated cherry pick of #11147
2019-09-13 11:12:41 -07:00
chris c7c379e52e embed: expose ZapLoggerBuilder
This exposes the ZapLoggerBuilder in the embed.Config to allow for
custom loggers to be defined and used by embedded etcd.

Fixes #11144
2019-09-13 14:09:54 -04:00
Gyuho Lee 9ed5f76dc0 vendor: upgrade to gRPC v1.23.1
https://github.com/grpc/grpc-go/releases/tag/v1.23.1

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-11 14:54:24 -07:00
Jingyi Hu 994865c89e
Merge pull request #11133 from jingyih/automated-cherry-pick-of-#11126-origin-release-3.4
Automated cherry pick of #11126 on release-3.4
2019-09-07 00:03:37 -07:00
Jingyi Hu ccbbb2f8d6 mvcc: add store revision metrics
Add experimental metrics etcd_debugging_mvcc_current_revision and
etcd_debugging_mvcc_compact_revision.
2019-09-06 17:03:21 -07:00
zhangjianweibj d5f79adc9c etcdserver: remove dup percentage sign in log 2019-09-04 22:03:49 -07:00
Gyuho Lee 8b053b0f44 embed: fix secure server logging message
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-09-03 09:43:08 -07:00
Manuel Rüger 11980f8165 scripts/release: Apply shellcheck findings
I run https://github.com/koalaman/shellcheck/ over scripts/* and fixed
the findings it returned.

Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2019-09-03 09:42:35 -07:00
Brandon Philips 41d4e2b276 scripts/release: rename SHA256SUM to SHA256SUMS
These files are commonly called SHA256SUMS and with this change rget
works for v3.4.0 as well.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-30 13:35:40 -07:00
Gyuho Lee 898bd1351f version: 3.4.0
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-30 08:09:55 -07:00
Gyuho Lee d04d96c9ac tests/e2e: run metrics test again
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-30 08:09:32 -07:00
keepCaim 21edf98fdb Documentation:fix clerical error 2019-08-30 08:08:47 -07:00
Carlos de Paula a4f7c65ef8 vendor: x/sys and x/net to support building on Risc-V
Signed-off-by: Carlos de Paula <me@carlosedp.com>
2019-08-29 14:03:59 -07:00
Gyuho Lee c3a9eec843 scripts/release: fix sha256sum
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-29 09:38:57 -07:00
Gyuho Lee e5528acf57 version: 3.4.0-rc.4
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-29 08:53:10 -07:00
Gyuho Lee 9977550ae9
Merge pull request #11091 from hexfusion/automated-cherry-pick-of-#11087-upstream-release-3.4
Automated cherry pick of #11087 on release 3.4
2019-08-29 08:39:11 -07:00
Sam Batschelet 4d7a6e2755 scripts/release: add sha256sum summary of release assets
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-08-29 11:33:16 +00:00
vimalk78 5e8757c3c5 Documentation: Add section headers to etcd Learner
In the Background section, the document describes various challenges for cluster membership change.
Added section header for each case described for better readability.
2019-08-27 10:18:34 -07:00
Gyuho Lee 012e38fef3 version: 3.4.0-rc.3
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-27 09:50:54 -07:00
Gyuho Lee 41a2cfa122 pkg/logutil: change to "MergeOutputPaths"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-27 09:50:26 -07:00
Gyuho Lee 9f8a1edf38 embed: fix "--log-outputs" setup without "stderr"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-27 09:50:17 -07:00
Wine93 165ba72593 raft/log_test: fixed wrong index 2019-08-26 12:37:07 -07:00
Wine93 9c850ccef0 raft: fixed some typos and simplify minor logic 2019-08-26 12:37:02 -07:00
Raphael Westphal 61d6efda4c etcdserver: add check for nil options 2019-08-26 10:48:20 -07:00
Gyuho Lee b76f149c35 tests/e2e: skip metrics tests for now
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-26 00:02:48 -07:00
vimalk78 5e33bb1a95 Documentation: snapshot can be requested from one etcd node only
Updated Snapshot section of demo.md to reflect that snapsot can be requested only from one etcd node at a time.

Fixes : #10855
2019-08-25 23:40:25 -07:00
vimalk78 83bf125d93 clientv3: add nil checks in Close()
Added nil checks in Close() for Watcher and Lease fields
Added test case
2019-08-25 23:40:05 -07:00
Gyuho Lee d23af41bca tests/e2e: remove string replace for v3.4.0-rc.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-23 01:14:42 -07:00
Gyuho Lee 67d0c21bb0 version: 3.4.0-rc.2
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-23 00:37:01 -07:00
nilsocket 18a077d3d3 raft : Write compact if statements 2019-08-23 00:36:44 -07:00
Xiang Li fb6d870e89
Merge pull request #11072 from jingyih/automated-cherry-pick-of-#11069-origin-release-3.4
Automated cherry pick of #11069 on release-3.4
2019-08-23 06:57:12 +08:00
Jingyi Hu e00224f87e integration: fix TestKVPutError
Give backend quota enough overhead.
2019-08-22 13:33:19 -07:00
Wenjia 2af1caf1a5 functional test: fix typo in agent log
Fix typo in functional test agent log to avoid debugging confusion.
2019-08-20 15:23:13 -07:00
Gyuho Lee 0777eab766 Documentation/upgrades: special upgrade guides for >= 3.3.14
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-16 16:19:22 -07:00
Jingyi Hu 0ecc0d0542 etcdmain: update help message
Add experimental-peer-skip-client-san-verification flag description to
help message. Add default values.
2019-08-16 16:07:06 -07:00
Tobias Schottdorf 982a8c9bc3 rafttest: print Ready before processing it
It was confusing to see the effects of the Ready (i.e. log messages)
printed before the Ready itself.
2019-08-16 08:10:17 -07:00
Tobias Schottdorf b8e3e4e7cb raft: fix a test file name 2019-08-16 08:10:07 -07:00
Tobias Schottdorf 4090edfb5b raft: document problem with leader self-removal
When a leader removes itself, it will retain its leadership but not
accept new proposals, making the range effectively stuck until manual
intervention triggers a campaign event.

This commit documents the behavior. It does not correct it yet.
2019-08-16 08:09:56 -07:00
Tobias Schottdorf 078caccce5 raft: add a batch of interaction-driven conf change tests
Verifiy the behavior in various v1 and v2 conf change operations.
This also includes various fixups, notably it adds protection
against transitioning in and out of new configs when this is not
permissible.

There are more threads to pull, but those are left for future commits.
2019-08-16 08:09:44 -07:00
Tobias Schottdorf d177b7f6b4 raft: proactively probe newly added followers
When the leader applied a new configuration that added voters, it would
not immediately probe these voters, delaying when they would be caught
up.

I noticed this while writing an interaction-driven test, which has now
been cleaned up and completed.
2019-08-16 08:09:33 -07:00
Tobias Schottdorf 2c1a1d8c32 rafttest: add _breakpoint directive
It is a helper case to attach a debugger to when a problem needs
to be investigated in a longer test file. In such a case, add the
following stanza immediately before the interesting behavior starts:

_breakpoint:
----
ok

and set a breakpoint on the _breakpoint case.
2019-08-16 08:09:23 -07:00
Tobias Schottdorf 0fc108428e raft: initialize new Progress at LastIndex, not LastIndex+1
Initializing at LastIndex+1 meant that new peers would not be probed
immediately when they appeared in the leader's config, which delays
their getting caught up.
2019-08-16 08:09:11 -07:00
Tobias Schottdorf df489e7a2c raft/rafttest: fix stabilize handler
It was bailing out too early.
2019-08-16 08:08:28 -07:00
Gyuho Lee f13a5102ec tests/e2e: fix version matching
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 14:46:19 -07:00
Gyuho Lee c9465f51d2 *: use Go 1.12.9
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 14:40:46 -07:00
Gyuho Lee 8f85f0dc26 version: 3.4.0-rc.1
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 13:45:25 -07:00
Gyuho Lee 0161e72d8d mvcc: keep 64-bit alignment in "store" struct
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 13:31:52 -07:00
Gyuho Lee 1691eec2db clientv3/integration: fix "mvcc.NewStore" call
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 13:31:46 -07:00
Joe Betz 1e213b7ab6 *: Add experimental-compaction-batch-limit flag
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-15 13:31:39 -07:00
Joe Betz b30c1eb2c8 mvcc: Optimize compaction for short commit pauses 2019-08-15 13:29:28 -07:00
Gyuho Lee a0be90f450 Documentation/upgrades: update
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-14 17:01:19 -07:00
Gyuho Lee 8110a96f69 scripts/release: clean up minor tag docker commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 22:01:10 -07:00
Gyuho Lee 8e05c73fa7 Makefile: explicit about GOOS in docker-test builds
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 16:57:22 -07:00
Gyuho Lee 970ca9fa43 Documentation/upgrades: highlight "--enable-v2=false"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 15:32:46 -07:00
Gyuho Lee a481ee809f vendor: update "net/http2" to latest
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 14:44:59 -07:00
Gyuho Lee 4d06d3b498 vendor: upgrade grpc-go to 1.23.0
https://github.com/grpc/grpc-go/releases/tag/v1.23.0

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 14:44:53 -07:00
Gyuho Lee 98462b52d1 *: use Go 1.12.8
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 12:56:11 -07:00
Gyuho Lee 2a8d09b83b clientv3: use Endpoints(), fix context creation
If overwritten, the previous context should be canceled first.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 12:43:49 -07:00
Gyuho Lee 49c6e87f74 version: 3.4.0-pre
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-13 12:43:40 -07:00
Gyuho Lee 84ed0f7f87 version: 3.4.0-rc.0
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 10:06:34 -07:00
Gyuho Lee 52d34298ab scripts: remove ".aci" commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 10:06:24 -07:00
Gyuho Lee 9c1d2eaee4 scripts/release: fix version check commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 09:59:24 -07:00
Gyuho Lee 547631a492 scripts: fix build docker commands, add more logging
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 09:50:21 -07:00
Gyuho Lee 802e01a0d8 *: remove "acbuild"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 09:50:21 -07:00
Gyuho Lee 1dff1c869f scripts/release: fix "yq" command
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 09:50:18 -07:00
Tobias Schottdorf ac6b604bb8 raft/rafttest: introduce datadriven testing
It has often been tedious to test the interactions between multi-member
Raft groups, especially when many steps were required to reach a certain
scenario. Often, this boilerplate was as boring as it is hard to write
and hard to maintain, making it attractive to resort to shortcuts
whenever possible, which in turn tended to undercut how meaningful and
maintainable the tests ended up being - that is, if the tests were even
written, which sometimes they weren't.

This change introduces a datadriven framework specifically for testing
deterministically the interaction between multiple members of a raft group
with the goal of reducing the friction for writing these tests to near
zero.

In the near term, this will be used to add thorough testing for joint
consensus (which is already available today, but wildly undertested),
but just converting an existing test into this framework has shown that
the concise representation and built-in inspection of log messages
highlights unexpected behavior much more readily than the previous unit
tests did (the test in question is `snapshot_succeed_via_app_resp`; the
reader is invited to compare the old and new version of it).

The main building block is `InteractionEnv`, which holds on to the state
of the whole system and exposes various relevant methods for
manipulating it, including but not limited to adding nodes, delivering
and dropping messages, and proposing configuration changes. All of this
is extensible so that in the future I hope to use it to explore the
phenomena discussed in

https://github.com/etcd-io/etcd/issues/7625#issuecomment-488798263

which requires injecting appropriate "crash points" in the Ready
handling loop. Discussions of the "what if X happened in state Y"
can quickly be made concrete by "scripting up an interaction test".

Additionally, this framework is intentionally not kept internal to the
raft package.. Though this is in its infancy, a goal is that it should
be possible for a suite of interaction tests to allow applications to
validate that their Storage implementation behaves accordingly, simply
by running a raft-provided interaction suite against their Storage.
2019-08-12 08:10:29 -07:00
Tobias Schottdorf 69c97cdc8f vendor: bump datadriven
Picks up some fixes for papercuts.
2019-08-12 08:10:19 -07:00
ethan faa71d89d4 cleanup: correct summary message in put.go 2019-08-12 08:07:33 -07:00
Gyuho Lee 64c16779c0 tests/e2e: pass "rc.0"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-12 01:46:58 -07:00
Hanaasagi 8ff71c52db test: fix metric name typo 2019-08-09 13:24:27 -07:00
Tobias Schottdorf dbe5198c45 raft: fix restoring joint configurations
While writing interaction tests for joint configuration changes, I
realized that this wasn't working yet - restoring had no notion of
the joint configuration and was simply dropping it on the floor.

This commit introduces a helper `confchange.Restore` which takes
a `ConfState` and initializes a `Tracker` from it.

This is then used both in `(*raft).restore` as well as in `newRaft`.
2019-08-09 11:18:40 -07:00
Tobias Schottdorf 39d0f4e53c confchange: clean up unnecessary block 2019-08-09 11:18:30 -07:00
nilsocket a8b4213ec0 raft : `newRaft()` does check for validity of `Config` 2019-08-09 11:18:06 -07:00
Tobias Schottdorf a945379ce4 raft/tracker: visit Progress in stable order
This is helpful for upcoming testing work which allows datadriven
testing of the interaction of multiple nodes. This testing requires
determinism to work correctly.
2019-08-09 08:39:52 -07:00
Tobias Schottdorf 7a50cd7074 raft/auorum: remove unused type 2019-08-09 08:39:44 -07:00
Gyuho Lee f786b6ba16 etcdserver: add "etcd_server_snapshot_apply_in_progress_total"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 14:02:13 -07:00
Gyuho Lee 1c8ab76333 integration: test snapshot inflights metrics
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 14:01:54 -07:00
Gyuho Lee abdb7ca17b etcdserver/api: add "etcd_network_snapshot_send_inflights_total", "etcd_network_snapshot_receive_inflights_total"
Useful for deciding when to terminate the unhealthy follower.
If the follower is receiving a leader snapshot, operator may wait.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 14:01:45 -07:00
Zeming YU 629cb7aa5e agent: fix a data race and deadlock
add 1-size buffer for `errc`  to avoid deadlock of child goroutine
add a local variable to a void data race in `err`
when `case <-stream.Context().Done():` is taken
2019-08-08 12:23:08 -07:00
Gyuho Lee 89e102365d Documentation/op-guide: update runtime configuration
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 09:25:29 -07:00
Tobias Schottdorf 9018b3dc4d raft: let learners vote
It turns out that that learners must be allowed to cast votes.

This seems counter- intuitive but is necessary in the situation in which
a learner has been promoted (i.e. is now a voter) but has not learned
about this yet.

For example, consider a group in which id=1 is a learner and id=2 and
id=3 are voters. A configuration change promoting 1 can be committed on
the quorum `{2,3}` without the config change being appended to the
learner's log. If the leader (say 2) fails, there are de facto two
voters remaining. Only 3 can win an election (due to its log containing
all committed entries), but to do so it will need 1 to vote. But 1
considers itself a learner and will continue to do so until 3 has
stepped up as leader, replicates the conf change to 1, and 1 applies it.

Ultimately, by receiving a request to vote, the learner realizes that
the candidate believes it to be a voter, and that it should act
accordingly. The candidate's config may be stale, too; but in that case
it won't win the election, at least in the absence of the bug discussed
in:
https://github.com/etcd-io/etcd/issues/7625#issuecomment-488798263.
2019-08-08 09:10:21 -07:00
Gyuho Lee b9bea9def7 functional/agent: copy file, instead of renaming
To retain failure logs in CI testing.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 09:09:39 -07:00
Gyuho Lee d2675c13f4 functional/rpcpb: make client log less verbose
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 09:09:34 -07:00
Gyuho Lee 8230536171 functional.yaml: try lower snapshot count for flaky tests, error threshold
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 09:09:29 -07:00
lzhfromustc 524278c187 pkg/types: Avoid potential double lock of tsafeSet.
(tsafeSet).Sub and (tsafeSet).Equals can cause double lock bug if ts and other is pointing the same variable

gofmt the code and add some comments
2019-08-07 16:02:24 -07:00
Gyuho Lee 29cdc9abfc test: output etcd server logs when functional tests fail
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-07 10:16:44 -07:00
Zeming YU a6a9a71b6a integration: fix a data race about `err`
don't share `err` between goroutines
2019-08-06 16:15:27 -07:00
Gyuho Lee 8c8f6f4b01 mvcc: fix typo in test
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-06 15:09:55 -07:00
Zeming YU b6cfaf883b v3rpc: fix a typo `err`
don't read return value in child goroutine which causes data race.
2019-08-06 15:09:47 -07:00
Gyuho Lee b522281a98 stream: Prevent panic when newAttemptLocked fails to get a transport for the new attempt
Testing https://github.com/grpc/grpc-go/pull/2958

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-06 15:09:42 -07:00
Gyuho Lee a78793e6bf vendor: update gRPC to latest
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-06 15:09:37 -07:00
Gyuho Lee e09528aa06
Merge pull request #10988 from wenjiaswe/automated-cherry-pick-of-#10987-upstream-release-3.4
Automated cherry pick of #10987
2019-08-05 23:31:33 -07:00
Wenjia Zhang cb4507d15b functional:update go.etcd.io/etcd link and go image registry for functional test 2019-08-05 23:28:45 -07:00
Gyuho Lee 4cead3c25c
Merge pull request #10986 from wenjiaswe/automated-cherry-pick-of-#10985-upstream-release-3.4
Automated cherry pick of #10985
2019-08-05 22:45:31 -07:00
Wenjia 3ac41644cc functional test: Update functional README.md 2019-08-05 22:12:50 -07:00
Gyuho Lee 0564743c9b CHANGELOG: remove from release branch
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:39:18 -07:00
Gyuho Lee 9d927afead Documentation/upgrades: highlight "grpc.ErrClientConnClosing"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:38:51 -07:00
Gyuho Lee 5d19b96341 proxy/grpcproxy: deprecate "grpc.ErrClientConnClosing"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:38:44 -07:00
Gyuho Lee faa1d9d206 functional: deprecate "grpc.ErrClientConnClosing"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:38:35 -07:00
Gyuho Lee ab1db0dfd8 clientv3: deprecate "grpc.ErrClientConnClosing"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 14:38:27 -07:00
Gyuho Lee 1c312cefbd functional: use Go 1.12.7 as default
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 12:40:50 -07:00
Gyuho Lee b4fcaad87d pkg/adt: remove TODO
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-05 00:25:02 -07:00
Gyuho Lee 3468505e38 clientv3: document "WithBlock" dial option
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:53:02 -07:00
Gyuho Lee a2d68dd389 travis: do not allow CPU 4 test failures
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:34:31 -07:00
Gyuho Lee c6e9699960 travis: do not run coverage, tip tests in v3.4
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:33:13 -07:00
Gyuho Lee b05dfeb15e scripts/release: remove acbuild commands
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:21:51 -07:00
xkey bb7df24af4 pkg/adt: fix interval tree black-height property based on rbtree
Author: xkey <xk33430@ly.com>
ref. https://github.com/etcd-io/etcd/pull/10978

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-04 23:15:09 -07:00
Gyuho Lee 9ff86fe516 tests/e2e: skip release tests until release candidate
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-03 00:09:10 -07:00
Gyuho Lee bc9a54beae tests/e2e: fix upgrade, metrics tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-02 15:58:25 -07:00
Gyuho Lee df1d3f7c6e functional: remove "embed" support in tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-02 15:58:21 -07:00
Gyuho Lee 14053ba7f7 etcdserver/api: enable 3.4 capability
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-02 15:24:40 -07:00
2240 changed files with 263581 additions and 211125 deletions

View File

@ -1,23 +0,0 @@
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/go
{
"name": "Go",
// Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
"image": "mcr.microsoft.com/devcontainers/go:1.19-bullseye",
// Features to add to the dev container. More info: https://containers.dev/features.
"features": {
"ghcr.io/devcontainers/features/docker-in-docker:2": {},
"ghcr.io/devcontainers/features/github-cli:1": {}
},
// Use 'forwardPorts' to make a list of ports inside the container available locally.
"forwardPorts": [2379, 2380],
// Use 'postCreateCommand' to run commands after the container is created.
"postCreateCommand": "make build"
// Configure tool-specific properties.
// "customizations": {},
}

2
.github/ISSUE_TEMPLATE.md vendored Normal file
View File

@ -0,0 +1,2 @@
Please read https://github.com/etcd-io/etcd/blob/master/Documentation/reporting_bugs.md.

View File

@ -1,102 +0,0 @@
---
name: Bug Report
description: Report a bug encountered while operating etcd
labels:
- type/bug
body:
- type: checkboxes
id: confirmations
attributes:
label: Bug report criteria
description: Please confirm this bug report meets the following criteria.
options:
- label: This bug report is not security related, security issues should be disclosed privately via security@etcd.io.
- label: This is not a support request, support requests should be raised in the etcd [discussion forums](https://github.com/etcd-io/etcd/discussions).
- label: You have read the etcd [bug reporting guidelines](https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/reporting_bugs.md).
- label: Existing open issues along with etcd [frequently asked questions](https://etcd.io/docs/latest/faq) have been checked and this is not a duplicate.
- type: markdown
attributes:
value: |
Please fill the form below and provide as much information as possible.
Not doing so may result in your bug not being addressed in a timely manner.
- type: textarea
id: problem
attributes:
label: What happened?
validations:
required: true
- type: textarea
id: expected
attributes:
label: What did you expect to happen?
validations:
required: true
- type: textarea
id: repro
attributes:
label: How can we reproduce it (as minimally and precisely as possible)?
validations:
required: true
- type: textarea
id: additional
attributes:
label: Anything else we need to know?
- type: textarea
id: etcdVersion
attributes:
label: Etcd version (please run commands below)
value: |
<details>
```console
$ etcd --version
# paste output here
$ etcdctl version
# paste output here
```
</details>
validations:
required: true
- type: textarea
id: config
attributes:
label: Etcd configuration (command line flags or environment variables)
value: |
<details>
# paste your configuration here
</details>
- type: textarea
id: etcdDebugInformation
attributes:
label: Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
value: |
<details>
```console
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
```
</details>
- type: textarea
id: logs
attributes:
label: Relevant log output
description: Please copy and paste any relevant log output. This will be automatically formatted into code, so no need for backticks.
render: Shell

View File

@ -1,6 +0,0 @@
---
blank_issues_enabled: false
contact_links:
- name: Question
url: https://github.com/etcd-io/etcd/discussions
about: Question relating to Etcd

View File

@ -1,19 +0,0 @@
---
name: Feature request
description: Provide idea for a new feature
labels:
- type/feature
body:
- type: textarea
id: feature
attributes:
label: What would you like to be added?
validations:
required: true
- type: textarea
id: rationale
attributes:
label: Why is this needed?
validations:
required: true

View File

@ -1,31 +0,0 @@
---
name: Membership nomination
description: Nominate new etcd members
labels:
- area/community
body:
- type: textarea
id: feature
attributes:
label: Who would you like to nominate?
validations:
required: true
- id: requirements
type: checkboxes
attributes:
label: Requirements
options:
- label: I have reviewed the [community membership guidelines](https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/community-membership.md)
required: true
- label: The members are actively contributing to 1 or more etcd subprojects
required: true
- label: The members are being sponsored by two current reviewers or a current maintainer.
required: true
- type: textarea
id: rationale
attributes:
label: How do the new members meet the regular active contribution requirements?
validations:
required: true

View File

@ -1,34 +0,0 @@
---
name: Flaking Test
description: Report flaky tests
labels:
- type/flake
body:
- type: textarea
id: workflows
attributes:
label: Which github workflows are flaking?
validations:
required: true
- type: textarea
id: tests
attributes:
label: Which tests are flaking?
validations:
required: true
- type: input
id: link
attributes:
label: Github Action link
- type: textarea
id: reason
attributes:
label: Reason for failure (if possible)
- type: textarea
id: additional
attributes:
label: Anything else we need to know?

View File

@ -1,2 +1,2 @@
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.
Please read https://github.com/etcd-io/etcd/blob/master/CONTRIBUTING.md#contribution-flow.

2
.github/SECURITY.md vendored
View File

@ -1,2 +1,2 @@
Please read https://github.com/etcd-io/etcd/blob/main/security/README.md.
Please read https://github.com/etcd-io/etcd/blob/master/security/README.md.

View File

@ -1,21 +0,0 @@
---
version: 2
updates:
- package-ecosystem: github-actions
directory: /
schedule:
interval: weekly
- package-ecosystem: gomod
directory: /
schedule:
interval: weekly
allow:
- dependency-type: all
- package-ecosystem: gomod
directory: /tools/mod # Not linked from /go.mod
schedule:
interval: weekly
allow:
- dependency-type: all

56
.github/stale.yml vendored
View File

@ -1,56 +0,0 @@
---
# Configuration for probot-stale - https://github.com/probot/stale
# Number of days of inactivity before an Issue or Pull Request becomes stale
daysUntilStale: 90
# Number of days of inactivity before an Issue or Pull Request with the stale label is closed.
# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale.
daysUntilClose: 21
# Only issues or pull requests with all of these labels are check if stale. Defaults to `[]` (disabled)
onlyLabels: []
# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable
exemptLabels:
- "stage/tracked"
# Set to true to ignore issues in a project (defaults to false)
exemptProjects: false
# Set to true to ignore issues in a milestone (defaults to false)
exemptMilestones: false
# Set to true to ignore issues with an assignee (defaults to false)
exemptAssignees: false
# Label to use when marking as stale
staleLabel: stale
# Comment to post when marking as stale. Set to `false` to disable
markComment: This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
# Comment to post when removing the stale label.
# unmarkComment: >
# Your comment here.
# Comment to post when closing a stale Issue or Pull Request.
# closeComment: >
# Your comment here.
# Limit the number of actions per hour, from 1-30. Default is 30
limitPerRun: 30
# Limit to only `issues` or `pulls`
# only: issues
# Optionally, specify configuration settings that are specific to just 'issues' or 'pulls':
# pulls:
# daysUntilStale: 30
# markComment: >
# This pull request has been automatically marked as stale because it has not had
# recent activity. It will be closed if no further activity occurs. Thank you
# for your contributions.
# issues:
# exemptLabels:
# - confirmed

View File

@ -1,67 +0,0 @@
---
name: Build
on: [push, pull_request]
permissions: read-all
jobs:
build:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
target:
- linux-amd64
- linux-386
- darwin-amd64
- darwin-arm64
- windows-amd64
- linux-arm
- linux-arm64
- linux-ppc64le
- linux-s390x
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
echo "${TARGET}"
case "${TARGET}" in
linux-amd64)
GOOS=linux GOARCH=amd64 make build
;;
linux-386)
GOOS=linux GOARCH=386 make build
;;
darwin-amd64)
GOOS=darwin GOARCH=amd64 make build
;;
darwin-arm64)
GOOS=darwin GOARCH=arm64 make build
;;
windows-amd64)
GOOS=windows GOARCH=amd64 make build
;;
linux-arm)
GOOS=linux GOARCH=arm make build
;;
linux-arm64)
GOOS=linux GOARCH=arm64 make build
;;
linux-ppc64le)
GOOS=linux GOARCH=ppc64le make build
;;
linux-s390x)
GOOS=linux GOARCH=s390x make build
;;
*)
echo "Failed to find target"
exit 1
;;
esac

View File

@ -1,55 +0,0 @@
---
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"
on:
push:
branches: [main, release-3.4, release-3.5, release-3.6]
pull_request:
# The branches below must be a subset of the branches above
branches: [main]
schedule:
- cron: '20 14 * * 5'
permissions: read-all
jobs:
analyze:
name: Analyze
runs-on: ubuntu-latest
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python' ]
# Learn more:
# https://docs.github.com/en/free-pro-team@latest/github/finding-security-vulnerabilities-and-errors-in-your-code/configuring-code-scanning#changing-the-languages-that-are-analyzed
language: ['go']
steps:
- name: Checkout repository
uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@6c089f53dd51dc3fc7e599c3cb5356453a52ca9e # v2.20.0
with:
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# queries: ./path/to/local/query, your-org/your-repo/queries@main
languages: ${{ matrix.language }}
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@6c089f53dd51dc3fc7e599c3cb5356453a52ca9e # v2.20.0
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@6c089f53dd51dc3fc7e599c3cb5356453a52ca9e # v2.20.0

View File

@ -1,18 +0,0 @@
---
name: Test contrib/mixin
on: [push, pull_request]
permissions: read-all
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- run: |
set -euo pipefail
make -C contrib/mixin tools test

View File

@ -1,32 +0,0 @@
---
name: Coverage
on: [push]
permissions: read-all
jobs:
coverage:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
target:
- linux-amd64-coverage
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
mkdir "${TARGET}"
case "${TARGET}" in
linux-amd64-coverage)
GOARCH=amd64 ./scripts/codecov_upload.sh
;;
*)
echo "Failed to find target"
exit 1
;;
esac

View File

@ -1,45 +0,0 @@
---
name: E2E-arm64
on:
schedule:
- cron: '0 1 * * *' # runs daily at 1am.
permissions: read-all
jobs:
test:
# this is to prevent the job to run at forked projects
if: github.repository == 'etcd-io/etcd'
runs-on: [self-hosted, Linux, ARM64]
container: golang:1.19-bullseye
defaults:
run:
shell: bash
strategy:
fail-fast: true
matrix:
target:
- linux-arm64-e2e
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
# https://github.com/actions/checkout/issues/1169
- run: git config --system --add safe.directory '*'
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
go clean -testcache
echo "${TARGET}"
case "${TARGET}" in
linux-arm64-e2e)
GOOS=linux GOARCH=arm64 CPU=4 EXPECT_DEBUG=true RACE=true make test-e2e-release
;;
*)
echo "Failed to find target"
exit 1
;;
esac

View File

@ -1,39 +0,0 @@
---
name: E2E
on: [push, pull_request]
permissions: read-all
jobs:
test:
runs-on: ubuntu-latest
strategy:
fail-fast: true
matrix:
target:
- linux-amd64-e2e
- linux-386-e2e
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
go clean -testcache
echo "${TARGET}"
case "${TARGET}" in
linux-amd64-e2e)
VERBOSE=1 GOOS=linux GOARCH=amd64 CPU=4 EXPECT_DEBUG=true RACE=true make test-e2e-release
;;
linux-386-e2e)
VERBOSE=1 GOOS=linux GOARCH=386 CPU=4 EXPECT_DEBUG=true RACE=true make test-e2e
;;
*)
echo "Failed to find target"
exit 1
;;
esac

View File

@ -1,26 +0,0 @@
---
name: Fuzzing v3rpc
on: [push, pull_request]
permissions: read-all
jobs:
fuzzing:
runs-on: ubuntu-latest
strategy:
fail-fast: false
env:
TARGET_PATH: ./server/etcdserver/api/v3rpc
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- run: |
set -euo pipefail
GOARCH=amd64 CPU=4 make fuzz
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
if: failure()
with:
path: "${{env.TARGET_PATH}}/testdata/fuzz/**/*"

View File

@ -1,19 +0,0 @@
---
name: Go Vulnerability Checker
on: [push, pull_request]
permissions: read-all
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- run: date
- run: |
set -euo pipefail
go install golang.org/x/vuln/cmd/govulncheck@latest && govulncheck ./...

View File

@ -1,38 +0,0 @@
---
name: grpcProxy-tests
on: [push, pull_request]
permissions: read-all
jobs:
test:
runs-on: ubuntu-latest
strategy:
fail-fast: true
matrix:
target:
- linux-amd64-grpcproxy-integration
- linux-amd64-grpcproxy-e2e
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
echo "${TARGET}"
case "${TARGET}" in
linux-amd64-grpcproxy-integration)
GOOS=linux GOARCH=amd64 CPU=4 RACE=true make test-grpcproxy-integration
;;
linux-amd64-grpcproxy-e2e)
GOOS=linux GOARCH=amd64 CPU=4 RACE=true make test-grpcproxy-e2e
;;
*)
echo "Failed to find target"
exit 1
;;
esac

View File

@ -1,23 +0,0 @@
---
name: Measure Test Flakiness
on:
schedule:
- cron: "0 0 * * 0" # run every Sunday at midnight
permissions: read-all
jobs:
measure-test-flakiness:
name: Measure Test Flakiness
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
./scripts/measure-test-flakiness.sh
make bin/etcd-test-analyzer
bin/etcd-test-analyzer run -token $GITHUB_TOKEN -max-age=168h -workflow Tests -branch main

View File

@ -1,15 +1,13 @@
---
name: Release
on: [push, pull_request]
permissions: read-all
jobs:
main:
release:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- uses: actions/checkout@v2
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
- uses: actions/setup-go@v2
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: release
@ -28,7 +26,7 @@ jobs:
Name-Email: github-action@etcd.io
Expire-Date: 0
EOF
DRY_RUN=true ./scripts/release.sh --no-upload --no-docker-push --in-place 3.6.99
DRY_RUN=true ./scripts/release.sh --no-upload --no-docker-push --in-place 3.4.99
- name: test-image
run: |
VERSION=3.6.99 ./scripts/test_images.sh
VERSION=3.4.99 ./scripts/test_images.sh

View File

@ -1,39 +0,0 @@
---
name: Robustness Nightly
permissions: read-all
on:
# schedules always run against the main branch, hence we have to create separate jobs
# with individual checkout actions for each of the active release branches
schedule:
- cron: '25 9 * * *' # runs every day at 09:25 UTC
jobs:
main:
# GHA has a maximum amount of 6h execution time, we try to get done within 3h
uses: ./.github/workflows/robustness-template.yaml
with:
etcdBranch: main
count: 100
testTimeout: 200m
artifactName: main
main-arm64:
uses: ./.github/workflows/robustness-template-arm64.yaml
with:
etcdBranch: main
count: 100
testTimeout: 200m
artifactName: main-arm64
runs-on: "['self-hosted', 'Linux', 'ARM64']"
release-35:
uses: ./.github/workflows/robustness-template.yaml
with:
etcdBranch: release-3.5
count: 100
testTimeout: 200m
artifactName: release-35
release-34:
uses: ./.github/workflows/robustness-template.yaml
with:
etcdBranch: release-3.4
count: 100
testTimeout: 200m
artifactName: release-34

View File

@ -1,72 +0,0 @@
---
name: Reusable Robustness Workflow
on:
workflow_call:
inputs:
etcdBranch:
required: true
type: string
count:
required: true
type: number
testTimeout:
required: false
type: string
default: '30m'
artifactName:
required: true
type: string
runs-on:
required: false
type: string
default: "['ubuntu-latest']"
permissions: read-all
jobs:
test:
timeout-minutes: 210
runs-on: ${{ fromJson(inputs.runs-on) }}
container: golang:1.19-bullseye
defaults:
run:
shell: bash
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
# https://github.com/actions/checkout/issues/1169
- run: git config --system --add safe.directory '*'
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: test-robustness
env:
ETCD_BRANCH: "${{ inputs.etcdBranch }}"
run: |
set -euo pipefail
go clean -testcache
# Use --failfast to avoid overriding report generated by failed test
GO_TEST_FLAGS="-v --count ${{ inputs.count }} --timeout ${{ inputs.testTimeout }} --failfast --run TestRobustness"
case "${ETCD_BRANCH}" in
release-3.5)
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness-release-3.5
;;
release-3.4)
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness-release-3.4
;;
main)
make gofail-enable
make build
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness
;;
*)
echo "Failed to find target ${ETCD_BRANCH}"
exit 1
;;
esac
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce
if: always()
with:
name: ${{ inputs.artifactName }}
path: /tmp/results/*

View File

@ -1,65 +0,0 @@
---
name: Reusable Robustness Workflow
on:
workflow_call:
inputs:
etcdBranch:
required: true
type: string
count:
required: true
type: number
testTimeout:
required: false
type: string
default: '30m'
artifactName:
required: true
type: string
runs-on:
required: false
type: string
default: "['ubuntu-latest']"
permissions: read-all
jobs:
test:
timeout-minutes: 210
runs-on: ${{ fromJson(inputs.runs-on) }}
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: test-robustness
env:
ETCD_BRANCH: "${{ inputs.etcdBranch }}"
run: |
set -euo pipefail
go clean -testcache
# Use --failfast to avoid overriding report generated by failed test
GO_TEST_FLAGS="-v --count ${{ inputs.count }} --timeout ${{ inputs.testTimeout }} --failfast --run TestRobustness"
case "${ETCD_BRANCH}" in
release-3.5)
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness-release-3.5
;;
release-3.4)
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness-release-3.4
;;
main)
make gofail-enable
make build
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness
;;
*)
echo "Failed to find target ${ETCD_BRANCH}"
exit 1
;;
esac
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce
if: always()
with:
name: ${{ inputs.artifactName }}
path: /tmp/results/*

View File

@ -1,12 +0,0 @@
---
name: Robustness
on: [push, pull_request]
permissions: read-all
jobs:
main:
uses: ./.github/workflows/robustness-template.yaml
with:
etcdBranch: main
count: 15
testTimeout: 30m
artifactName: main

View File

@ -1,55 +0,0 @@
---
name: Scorecards supply-chain security
on:
# Only the default branch is supported.
branch_protection_rule:
schedule:
- cron: '45 1 * * 0'
push:
branches: ["main"]
# Declare default permissions as read only.
permissions: read-all
jobs:
analysis:
name: Scorecards analysis
runs-on: ubuntu-latest
permissions:
# Needed to upload the results to code-scanning dashboard.
security-events: write
# Used to receive a badge.
id-token: write
steps:
- name: "Checkout code"
uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # tag=v3.0.0
with:
persist-credentials: false
- name: "Run analysis"
uses: ossf/scorecard-action@80e868c13c90f172d68d1f4501dee99e2479f7af # tag=v2.1.3
with:
results_file: results.sarif
results_format: sarif
# Publish the results for public repositories to enable scorecard badges. For more details, see
# https://github.com/ossf/scorecard-action#publishing-results.
# For private repositories, `publish_results` will automatically be set to `false`, regardless
# of the value entered here.
publish_results: true
# Upload the results as artifacts (optional). Commenting out will disable uploads of run results in SARIF
# format to the repository Actions tab.
- name: "Upload artifact"
uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # tag=v3.0.0
with:
name: SARIF file
path: results.sarif
retention-days: 5
# Upload the results to GitHub's code scanning dashboard.
- name: "Upload to code-scanning"
uses: github/codeql-action/upload-sarif@6c089f53dd51dc3fc7e599c3cb5356453a52ca9e # tag=v1.0.26
with:
sarif_file: results.sarif

View File

@ -1,32 +0,0 @@
---
name: Static Analysis
on: [push, pull_request]
permissions: read-all
jobs:
run:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: golangci-lint
uses: golangci/golangci-lint-action@639cd343e1d3b897ff35927a75193d57cfcba299 # v3.6.0
with:
version: v1.49.0
args: --config tools/.golangci.yaml
- name: protoc
uses: arduino/setup-protoc@149f6c87b92550901b26acd1632e11c3662e381f # v1.3.0
with:
version: '3.14.0'
repo-token: ${{ secrets.GITHUB_TOKEN }}
- run: |
set -euo pipefail
make verify
- run: |
set -euo pipefail
make fix

View File

@ -1,62 +0,0 @@
---
name: Tests-arm64
on:
schedule:
- cron: '30 1 * * *' # runs daily at 1:30 am.
permissions: read-all
jobs:
test:
# this is to prevent the job to run at forked projects
if: github.repository == 'etcd-io/etcd'
runs-on: [self-hosted, Linux, ARM64]
container: golang:1.19-bullseye
defaults:
run:
shell: bash
strategy:
fail-fast: false
matrix:
target:
- linux-arm64-integration-1-cpu
- linux-arm64-integration-2-cpu
- linux-arm64-integration-4-cpu
- linux-arm64-unit-4-cpu-race
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
# https://github.com/actions/checkout/issues/1169
- run: git config --system --add safe.directory '*'
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
go clean -testcache
mkdir "${TARGET}"
export JUNIT_REPORT_DIR=$(realpath ${TARGET})
case "${TARGET}" in
linux-arm64-integration-1-cpu)
GOOS=linux GOARCH=arm64 CPU=1 make test-integration
;;
linux-arm64-integration-2-cpu)
GOOS=linux GOARCH=arm64 CPU=2 make test-integration
;;
linux-arm64-integration-4-cpu)
GOOS=linux GOARCH=arm64 CPU=4 make test-integration
;;
linux-arm64-unit-4-cpu-race)
GOOS=linux GOARCH=arm64 CPU=4 RACE=true GO_TEST_FLAGS='-p=2' make test-unit
;;
*)
echo "Failed to find target"
exit 1
;;
esac
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
if: always()
with:
path: ./**/junit_*.xml

View File

@ -1,7 +1,5 @@
---
name: Tests
on: [push, pull_request]
permissions: read-all
jobs:
test:
runs-on: ubuntu-latest
@ -9,48 +7,72 @@ jobs:
fail-fast: false
matrix:
target:
- linux-amd64-fmt
- linux-amd64-integration-1-cpu
- linux-amd64-integration-2-cpu
- linux-amd64-integration-4-cpu
- linux-amd64-functional
- linux-amd64-unit-4-cpu-race
- linux-386-unit-1-cpu
- all-build
- linux-amd64-grpcproxy
- linux-amd64-e2e
- linux-386-unit
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- uses: actions/checkout@v2
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
- uses: actions/setup-go@v2
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- run: date
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
go clean -testcache
mkdir "${TARGET}"
export JUNIT_REPORT_DIR=$(realpath ${TARGET})
go version
echo ${GOROOT}
echo "${TARGET}"
case "${TARGET}" in
linux-amd64-fmt)
GOARCH=amd64 PASSES='fmt bom dep' ./test
;;
linux-amd64-integration-1-cpu)
GOOS=linux GOARCH=amd64 CPU=1 make test-integration
GOARCH=amd64 CPU=1 PASSES='integration' RACE='false' ./test
;;
linux-amd64-integration-2-cpu)
GOOS=linux GOARCH=amd64 CPU=2 make test-integration
GOARCH=amd64 CPU=2 PASSES='integration' RACE='false' ./test
;;
linux-amd64-integration-4-cpu)
GOOS=linux GOARCH=amd64 CPU=4 make test-integration
GOARCH=amd64 CPU=4 PASSES='integration' RACE='false' ./test
;;
linux-amd64-functional)
./build && GOARCH=amd64 PASSES='functional' ./test
;;
linux-amd64-unit-4-cpu-race)
GOOS=linux GOARCH=amd64 CPU=4 RACE=true GO_TEST_FLAGS='-p=2' make test-unit
GOARCH=amd64 PASSES='unit' RACE='true' CPU='4' ./test -p=2
;;
linux-386-unit-1-cpu)
GOOS=linux GOARCH=386 CPU=1 GO_TEST_FLAGS='-p=4' make test-unit
all-build)
GOARCH=amd64 PASSES='build' ./test
GOARCH=386 PASSES='build' ./test
GO_BUILD_FLAGS='-v' GOOS=darwin GOARCH=amd64 ./build
GO_BUILD_FLAGS='-v' GOOS=windows GOARCH=amd64 ./build
GO_BUILD_FLAGS='-v' GOARCH=arm ./build
GO_BUILD_FLAGS='-v' GOARCH=arm64 ./build
GO_BUILD_FLAGS='-v' GOARCH=ppc64le ./build
GO_BUILD_FLAGS='-v' GOARCH=s390x ./build
;;
linux-amd64-grpcproxy)
PASSES='build grpcproxy' CPU='4' RACE='true' ./test
;;
linux-amd64-e2e)
GOARCH=amd64 PASSES='build release e2e' ./test
;;
linux-386-unit)
GOARCH=386 PASSES='unit' ./test
;;
*)
echo "Failed to find target"
exit 1
;;
esac
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
if: always()
with:
path: ./**/junit_*.xml

View File

@ -0,0 +1,37 @@
name: Trivy Nightly Scan
on:
schedule:
- cron: '0 2 * * *' # run at 2 AM UTC
permissions: read-all
jobs:
nightly-scan:
name: Trivy Scan nightly
strategy:
fail-fast: false
matrix:
# maintain the versions of etcd that need to be actively
# security scanned
versions: [v3.4.22]
permissions:
security-events: write # for github/codeql-action/upload-sarif to upload SARIF results
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8 # v3.1.0
with:
ref: release-3.4
- name: Run Trivy vulnerability scanner
uses: aquasecurity/trivy-action@9ab158e8597f3b310480b9a69402b419bc03dbd5 # master
with:
image-ref: 'gcr.io/etcd-development/etcd:${{ matrix.versions }}'
severity: 'CRITICAL,HIGH'
format: 'template'
template: '@/contrib/sarif.tpl'
output: 'trivy-results-3-4.sarif'
- name: Upload Trivy scan results to GitHub Security tab
uses: github/codeql-action/upload-sarif@a669cc5936cc5e1b6a362ec1ff9e410dc570d190 # v2.1.36
with:
sarif_file: 'trivy-results-3-4.sarif'

38
.gitignore vendored
View File

@ -7,31 +7,31 @@
/bin
*.etcd
*.log
*.swp
/etcd
/hack/insta-discovery/.env
*.coverprofile
*.test
hack/tls-setup/certs
.idea
*.iml
/contrib/mixin/manifests
/contrib/raftexample/raftexample
/contrib/raftexample/raftexample-*
/vendor
/tests/e2e/default.proxy
*.tmp
*.bak
.gobincache/
.DS_Store
/Documentation/dev-guide/api_reference_v3.md
/Documentation/dev-guide/api_concurrency_reference_v3.md
/tools/etcd-dump-db/etcd-dump-db
/tools/etcd-dump-logs/etcd-dump-logs
/tools/etcd-dump-metrics/etcd-dump-metrics
/tools/local-tester/bridge/bridge
/tools/proto-annotations/proto-annotations
/tools/benchmark/benchmark
/out
/etcd-dump-logs
# TODO: use dep prune
# https://github.com/golang/dep/issues/120#issuecomment-306518546
vendor/**/*
!vendor/**/
!vendor/**/*.go
!vendor/**/*.c
!vendor/**/*.cpp
!vendor/**/*.s
!vendor/**/COPYING*
!vendor/**/PATENTS*
!vendor/**/NOTICE*
!vendor/**/Licence*
!vendor/**/License*
!vendor/**/LICENCE*
!vendor/**/LICENSE*
!vendor/modules.txt
vendor/**/*_test.go
*.bak

113
.words Normal file
View File

@ -0,0 +1,113 @@
DefaultMaxRequestBytes
ErrCodeEnhanceYourCalm
ErrTimeout
GoAway
KeepAlive
Keepalive
MiB
ResourceExhausted
RPC
RPCs
parsedTarget
SRV
WithRequireLeader
InfoLevel
args
backoff
blackhole
blackholed
cancelable
cancelation
cluster_proxy
defragment
defragmenting
deleter
dev
/dev/null
dev/null
errClientDisconnected
etcd
gRPC
goroutine
goroutines
healthcheck
hostname
iff
inflight
keepalive
keepalives
hasleader
racey
keyspace
linearization
liveness
linearized
localhost
mutex
prefetching
protobuf
prometheus
rafthttp
repin
rpc
serializable
statusError
teardown
too_many_pings
transactional
uncontended
unprefixed
unlisting
nondeterministically
atomics
transferee
Balancer
lexicographically
lexically
accessors
unbuffered
nils
reconnection
mutators
ConsistentIndexGetter
OutputWALDir
WAL
consistentIndex
todo
saveWALAndSnap
SHA
subconns
nop
SubConns
DNS
passthrough
ccBalancerWrapper
rebalanced
addrConns
subConn
TestBalancerDoNotBlockOnClose
middleware
clusterName
jitter
FIXME
retriable
github
retriable
jitter
WithBackoff
BackoffLinearWithJitter
jitter
WithDialer
WithMax
ServerStreams
BidiStreams
transientFailure
BackoffFunc
CallOptions
PermitWithoutStream
__lostleader
ErrConnClosing
unfreed
grpcAddr
clientURLs

View File

@ -1,16 +0,0 @@
<hr>
## [v2.3.8](https://github.com/etcd-io/etcd/releases/tag/v2.3.8) (2017-02-17)
See [code changes](https://github.com/etcd-io/etcd/compare/v2.3.7...v2.3.8).
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>

View File

@ -1,291 +0,0 @@
<hr>
## [v3.0.16](https://github.com/etcd-io/etcd/releases/tag/v3.0.16) (2016-11-13)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.15...v3.0.16) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.4*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.15](https://github.com/etcd-io/etcd/releases/tag/v3.0.15) (2016-11-11)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.14...v3.0.15) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Fixed
- Fix cancel watch request with wrong range end.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.14](https://github.com/etcd-io/etcd/releases/tag/v3.0.14) (2016-11-04)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.13...v3.0.14) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Added
- v3 `etcdctl migrate` command now supports `--no-ttl` flag to discard keys on transform.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.13](https://github.com/etcd-io/etcd/releases/tag/v3.0.13) (2016-10-24)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.12...v3.0.13) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.12](https://github.com/etcd-io/etcd/releases/tag/v3.0.12) (2016-10-07)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.11...v3.0.12) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.11](https://github.com/etcd-io/etcd/releases/tag/v3.0.11) (2016-10-07)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.10...v3.0.11) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Added
- Server returns previous key-value (optional)
- `clientv3.WithPrevKV` option
- v3 etcdctl `put,watch,del --prev-kv` flag
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.10](https://github.com/etcd-io/etcd/releases/tag/v3.0.10) (2016-09-23)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.9...v3.0.10) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.9](https://github.com/etcd-io/etcd/releases/tag/v3.0.9) (2016-09-15)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.8...v3.0.9) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Added
- Warn on domain names on listen URLs (v3.2 will reject domain names).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.8](https://github.com/etcd-io/etcd/releases/tag/v3.0.8) (2016-09-09)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.7...v3.0.8) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- Allow only IP addresses in listen URLs (domain names are rejected).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.7](https://github.com/etcd-io/etcd/releases/tag/v3.0.7) (2016-08-31)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.6...v3.0.7) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- SRV records only allow A records (RFC 2052).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.6](https://github.com/etcd-io/etcd/releases/tag/v3.0.6) (2016-08-19)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.5...v3.0.6) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.5](https://github.com/etcd-io/etcd/releases/tag/v3.0.5) (2016-08-19)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.4...v3.0.5) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- SRV records (e.g., infra1.example.com) must match the discovery domain (i.e., example.com) if no custom certificate authority is given.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.4](https://github.com/etcd-io/etcd/releases/tag/v3.0.4) (2016-07-27)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.3...v3.0.4) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Added
- v2 `etcdctl ls` command now supports `--output=json`.
- Add /var/lib/etcd directory to etcd official Docker image.
### Other
- v2 auth can now use common name from TLS certificate when `--client-cert-auth` is enabled.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.3](https://github.com/etcd-io/etcd/releases/tag/v3.0.3) (2016-07-15)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.2...v3.0.3) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- Revert Dockerfile to use `CMD`, instead of `ENTRYPOINT`, to support `etcdctl` run.
- Docker commands for v3.0.2 won't work without specifying executable binary paths.
- v3 etcdctl default endpoints are now `127.0.0.1:2379`.
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.2](https://github.com/etcd-io/etcd/releases/tag/v3.0.2) (2016-07-08)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.1...v3.0.2) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- Dockerfile uses `ENTRYPOINT`, instead of `CMD`, to run etcd without binary path specified.
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.1](https://github.com/etcd-io/etcd/releases/tag/v3.0.1) (2016-07-01)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.0...v3.0.1) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.0](https://github.com/etcd-io/etcd/releases/tag/v3.0.0) (2016-06-30)
See [code changes](https://github.com/etcd-io/etcd/compare/v2.3.0...v3.0.0) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
<hr>

View File

@ -1,574 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.0](https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.0.md).
<hr>
## [v3.1.21](https://github.com/etcd-io/etcd/releases/tag/v3.1.21) (2019-TBD)
### etcdctl v3
- [Strip out insecure endpoints from DNS SRV records when using discovery](https://github.com/etcd-io/etcd/pull/10443) with etcdctl v2
- Add [`etcdctl endpoint health --write-out` support](https://github.com/etcd-io/etcd/pull/9540).
- Previously, [`etcdctl endpoint health --write-out json` did not work](https://github.com/etcd-io/etcd/issues/9532).
- The command output is changed. Previously, if endpoint is unreachable, the command output is
"\<endpoint\> is unhealthy: failed to connect: \<error message\>". This change unified the error message, all error types
now have the same output "\<endpoint\> is unhealthy: failed to commit proposal: \<error message\>".
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Fix bug where [db_compaction_total_duration_milliseconds metric incorrectly measured duration as 0](https://github.com/etcd-io/etcd/pull/10646).
<hr>
## [v3.1.20](https://github.com/etcd-io/etcd/releases/tag/v3.1.20) (2018-10-10)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.19...v3.1.20) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Improved
- Improve ["became inactive" warning log](https://github.com/etcd-io/etcd/pull/10024), which indicates message send to a peer failed.
- Improve [read index wait timeout warning log](https://github.com/etcd-io/etcd/pull/10026), which indicates that local node might have slow network.
- Add [gRPC interceptor for debugging logs](https://github.com/etcd-io/etcd/pull/9990); enable `etcd --debug` flag to see per-request debug information.
- Add [consistency check in snapshot status](https://github.com/etcd-io/etcd/pull/10109). If consistency check on snapshot file fails, `snapshot status` returns `"snapshot file integrity check failed..."` error.
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Improve [`etcd_network_peer_round_trip_time_seconds`](https://github.com/etcd-io/etcd/pull/10155) Prometheus metric to track leader heartbeats.
- Previously, it only samples the TCP connection for snapshot messages.
- Display all registered [gRPC metrics at start](https://github.com/etcd-io/etcd/pull/10034).
- Add [`etcd_snap_db_fsync_duration_seconds_count`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_snap_db_save_total_duration_seconds_bucket`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_send_success`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_send_failures`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_send_total_duration_seconds`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_receive_success`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_receive_failures`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_receive_total_duration_seconds`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_server_id`](https://github.com/etcd-io/etcd/pull/9998) Prometheus metric.
- Add [`etcd_server_health_success`](https://github.com/etcd-io/etcd/pull/10156) Prometheus metric.
- Add [`etcd_server_health_failures`](https://github.com/etcd-io/etcd/pull/10156) Prometheus metric.
- Add [`etcd_server_read_indexes_failed_total`](https://github.com/etcd-io/etcd/pull/10094) Prometheus metric.
### client v3
- Fix logic on [release lock key if cancelled](https://github.com/etcd-io/etcd/pull/10153) in `clientv3/concurrency` package.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.19](https://github.com/etcd-io/etcd/releases/tag/v3.1.19) (2018-07-24)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.18...v3.1.19) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Improved
- Improve [Raft Read Index timeout warning messages](https://github.com/etcd-io/etcd/pull/9897).
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_go_version`](https://github.com/etcd-io/etcd/pull/9957) Prometheus metric.
- Add [`etcd_server_slow_read_indexes_total`](https://github.com/etcd-io/etcd/pull/9897) Prometheus metric.
- Add [`etcd_server_quota_backend_bytes`](https://github.com/etcd-io/etcd/pull/9820) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
- Add [`etcd_mvcc_db_total_size_in_bytes`](https://github.com/etcd-io/etcd/pull/9819) Prometheus metric.
- In addition to [`etcd_debugging_mvcc_db_total_size_in_bytes`](https://github.com/etcd-io/etcd/pull/9819).
- Add [`etcd_mvcc_db_total_size_in_use_in_bytes`](https://github.com/etcd-io/etcd/pull/9256) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means future DB size if defragment operation is complete.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
### client v3
- Fix [lease keepalive interval updates when response queue is full](https://github.com/etcd-io/etcd/pull/9952).
- If `<-chan *clientv3LeaseKeepAliveResponse` from `clientv3.Lease.KeepAlive` was never consumed or channel is full, client was [sending keepalive request every 500ms](https://github.com/etcd-io/etcd/issues/9911) instead of expected rate of every "TTL / 3" duration.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.18](https://github.com/etcd-io/etcd/releases/tag/v3.1.18) (2018-06-15)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.17...v3.1.18) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_version`](https://github.com/etcd-io/etcd/pull/8960) Prometheus metric.
- To replace [Kubernetes `etcd-version-monitor`](https://github.com/etcd-io/etcd/issues/8948).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.17](https://github.com/etcd-io/etcd/releases/tag/v3.1.17) (2018-06-06)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.16...v3.1.17) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fix [v3 snapshot recovery](https://github.com/etcd-io/etcd/issues/7628).
- A follower receives a leader snapshot to be persisted as a `[SNAPSHOT-INDEX].snap.db` file on disk.
- Now, server [ensures that the incoming snapshot be persisted on disk before loading it](https://github.com/etcd-io/etcd/pull/7876).
- Otherwise, index mismatch happens and triggers server-side panic (e.g. newer WAL entry with outdated snapshot index).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.16](https://github.com/etcd-io/etcd/releases/tag/v3.1.16) (2018-05-31)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.15...v3.1.16) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fix [`mvcc` server panic from restore operation](https://github.com/etcd-io/etcd/pull/9775).
- Let's assume that a watcher had been requested with a future revision X and sent to node A that became network-partitioned thereafter. Meanwhile, cluster makes progress. Then when the partition gets removed, the leader sends a snapshot to node A. Previously if the snapshot's latest revision is still lower than the watch revision X, **etcd server panicked** during snapshot restore operation.
- Now, this server-side panic has been fixed.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.15](https://github.com/etcd-io/etcd/releases/tag/v3.1.15) (2018-05-09)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.14...v3.1.15) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Purge old [`*.snap.db` snapshot files](https://github.com/etcd-io/etcd/pull/7967).
- Previously, etcd did not respect `--max-snapshots` flag to purge old `*.snap.db` files.
- Now, etcd purges old `*.snap.db` files to keep maximum `--max-snapshots` number of files on disk.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.14](https://github.com/etcd-io/etcd/releases/tag/v3.1.14) (2018-04-24)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.13...v3.1.14) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_is_leader`](https://github.com/etcd-io/etcd/pull/9587) Prometheus metric.
### etcd server
- Add [`--initial-election-tick-advance`](https://github.com/etcd-io/etcd/pull/9591) flag to configure initial election tick fast-forward.
- By default, `--initial-election-tick-advance=true`, then local member fast-forwards election ticks to speed up "initial" leader election trigger.
- This benefits the case of larger election ticks. For instance, cross datacenter deployment may require longer election timeout of 10-second. If true, local node does not need wait up to 10-second. Instead, forwards its election ticks to 8-second, and have only 2-second left before leader election.
- Major assumptions are that: cluster has no active leader thus advancing ticks enables faster leader election. Or cluster already has an established leader, and rejoining follower is likely to receive heartbeats from the leader after tick advance and before election timeout.
- However, when network from leader to rejoining follower is congested, and the follower does not receive leader heartbeat within left election ticks, disruptive election has to happen thus affecting cluster availabilities.
- Now, this can be disabled by setting `--initial-election-tick-advance=false`.
- Disabling this would slow down initial bootstrap process for cross datacenter deployments. Make tradeoffs by configuring `--initial-election-tick-advance` at the cost of slow initial bootstrap.
- If single-node, it advances ticks regardless.
- Address [disruptive rejoining follower node](https://github.com/etcd-io/etcd/issues/9333).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.13](https://github.com/etcd-io/etcd/releases/tag/v3.1.13) (2018-03-29)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.12...v3.1.13) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Improved
- Adjust [election timeout on server restart](https://github.com/etcd-io/etcd/pull/9415) to reduce [disruptive rejoining servers](https://github.com/etcd-io/etcd/issues/9333).
- Previously, etcd fast-forwards election ticks on server start, with only one tick left for leader election. This is to speed up start phase, without having to wait until all election ticks elapse. Advancing election ticks is useful for cross datacenter deployments with larger election timeouts. However, it was affecting cluster availability if the last tick elapses before leader contacts the restarted node.
- Now, when etcd restarts, it adjusts election ticks with more than one tick left, thus more time for leader to prevent disruptive restart.
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add missing [`etcd_network_peer_sent_failures_total` count](https://github.com/etcd-io/etcd/pull/9437).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.12](https://github.com/etcd-io/etcd/releases/tag/v3.1.12) (2018-03-08)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.11...v3.1.12) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fix [`mvcc` "unsynced" watcher restore operation](https://github.com/etcd-io/etcd/pull/9297).
- "unsynced" watcher is watcher that needs to be in sync with events that have happened.
- That is, "unsynced" watcher is the slow watcher that was requested on old revision.
- "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
- Which possibly causes [missing events from "unsynced" watchers](https://github.com/etcd-io/etcd/issues/9086).
- A node gets network partitioned with a watcher on a future revision, and falls behind receiving a leader snapshot after partition gets removed. When applying this snapshot, etcd watch storage moves current synced watchers to unsynced since sync watchers might have become stale during network partition. And reset synced watcher group to restart watcher routines. Previously, there was a bug when moving from synced watcher group to unsynced, thus client would miss events when the watcher was requested to the network-partitioned node.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.11](https://github.com/etcd-io/etcd/releases/tag/v3.1.11) (2017-11-28)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.10...v3.1.11) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- [#8411](https://github.com/etcd-io/etcd/issues/8411),[#8806](https://github.com/etcd-io/etcd/pull/8806) backport "mvcc: sending events after restore"
- [#8009](https://github.com/etcd-io/etcd/issues/8009),[#8902](https://github.com/etcd-io/etcd/pull/8902) backport coreos/bbolt v1.3.1-coreos.5
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.10](https://github.com/etcd-io/etcd/releases/tag/v3.1.10) (2017-07-14)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.9...v3.1.10) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Added
- Tag docker images with minor versions.
- e.g. `docker pull quay.io/coreos/etcd:v3.1` to fetch latest v3.1 versions.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
- Fix panic on `net/http.CloseNotify`
<hr>
## [v3.1.9](https://github.com/etcd-io/etcd/releases/tag/v3.1.9) (2017-06-09)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.8...v3.1.9) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Allow v2 snapshot over 512MB.
### Go
- Compile with [*Go 1.7.6*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.8](https://github.com/etcd-io/etcd/releases/tag/v3.1.8) (2017-05-19)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.7...v3.1.8) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.7](https://github.com/etcd-io/etcd/releases/tag/v3.1.7) (2017-04-28)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.6...v3.1.7) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.6](https://github.com/etcd-io/etcd/releases/tag/v3.1.6) (2017-04-19)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.5...v3.1.6) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fill in Auth API response header.
- Remove auth check in Status API.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.5](https://github.com/etcd-io/etcd/releases/tag/v3.1.5) (2017-03-27)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.4...v3.1.5) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fix raft memory leak issue.
- Fix Windows file path issues.
### Other
- Add `/etc/nsswitch.conf` file to alpine-based Docker image.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.4](https://github.com/etcd-io/etcd/releases/tag/v3.1.4) (2017-03-22)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.3...v3.1.4) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.3](https://github.com/etcd-io/etcd/releases/tag/v3.1.3) (2017-03-10)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.2...v3.1.3) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd gateway
- Fix `etcd gateway` schema handling in DNS discovery.
- Fix sd_notify behaviors in `gateway`, `grpc-proxy`.
### gRPC Proxy
- Fix sd_notify behaviors in `gateway`, `grpc-proxy`.
### Other
- Use machine default host when advertise URLs are default values(`localhost:2379,2380`) AND if listen URL is `0.0.0.0`.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.2](https://github.com/etcd-io/etcd/releases/tag/v3.1.2) (2017-02-24)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.1...v3.1.2) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd gateway
- Fix `etcd gateway` with multiple endpoints.
### Other
- Use IPv4 default host, by default (when IPv4 and IPv6 are available).
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.1](https://github.com/etcd-io/etcd/releases/tag/v3.1.1) (2017-02-17)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.0...v3.1.1) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.0](https://github.com/etcd-io/etcd/releases/tag/v3.1.0) (2017-01-20)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.0...v3.1.0) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Improved
- Faster linearizable reads (implements Raft [read-index](https://github.com/etcd-io/etcd/pull/6212)).
- v3 authentication API is now stable.
### Breaking Changes
- Deprecated following gRPC metrics in favor of [go-grpc-prometheus](https://github.com/grpc-ecosystem/go-grpc-prometheus).
- `etcd_grpc_requests_total`
- `etcd_grpc_requests_failed_total`
- `etcd_grpc_active_streams`
- `etcd_grpc_unary_requests_duration_seconds`
### Dependency
- Upgrade [`github.com/ugorji/go/codec`](https://github.com/ugorji/go) to [**`ugorji/go@9c7f9b7`**](https://github.com/ugorji/go/commit/9c7f9b7a2bc3a520f7c7b30b34b7f85f47fe27b6), and [regenerate v2 `client`](https://github.com/etcd-io/etcd/pull/6945).
### Security, Authentication
See [security doc](https://etcd.io/docs/latest/op-guide/security/) for more details.
- SRV records (e.g., infra1.example.com) must match the discovery domain (i.e., example.com) if no custom certificate authority is given.
- `TLSConfig.ServerName` is ignored with user-provided certificates for backwards compatibility; to be deprecated.
- For example, `etcd --discovery-srv=example.com` will only authenticate peers/clients when the provided certs have root domain `example.com` as an entry in Subject Alternative Name (SAN) field.
### etcd server
- Automatic leadership transfer when leader steps down.
- etcd flags
- `--strict-reconfig-check` flag is set by default.
- Add `--log-output` flag.
- Add `--metrics` flag.
- etcd uses default route IP if advertise URL is not given.
- Cluster rejects removing members if quorum will be lost.
- Discovery now has upper limit for waiting on retries.
- Warn on binding listeners through domain names; to be deprecated.
- v3.0 and v3.1 with `--auto-compaction-retention=10` run periodic compaction on v3 key-value store for every 10-hour.
- Compactor only supports periodic compaction.
- Compactor records latest revisions every 5-minute, until it reaches the first compaction period (e.g. 10-hour).
- In order to retain key-value history of last compaction period, it uses the last revision that was fetched before compaction period, from the revision records that were collected every 5-minute.
- When `--auto-compaction-retention=10`, compactor uses revision 100 for compact revision where revision 100 is the latest revision fetched from 10 hours ago.
- If compaction succeeds or requested revision has already been compacted, it resets period timer and starts over with new historical revision records (e.g. restart revision collect and compact for the next 10-hour period).
- If compaction fails, it retries in 5 minutes.
### client v3
- Add `SetEndpoints` method; update endpoints at runtime.
- Add `Sync` method; auto-update endpoints at runtime.
- Add `Lease TimeToLive` API; fetch lease information.
- replace Config.Logger field with global logger.
- Get API responses are sorted in ascending order by default.
### etcdctl v3
- Add `lease timetolive` command.
- Add `--print-value-only` flag to get command.
- Add `--dest-prefix` flag to make-mirror command.
- `get` command responses are sorted in ascending order by default.
### gRPC Proxy
- Experimental gRPC proxy feature.
### Other
- `recipes` now conform to sessions defined in `clientv3/concurrency`.
- ACI has symlinks to `/usr/local/bin/etcd*`.
### Go
- Compile with [*Go 1.7.4*](https://golang.org/doc/devel/release.html#go1.7).
<hr>

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -1,508 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.4](https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.4.md).
<hr>
## v3.5.10 (tbd)
### etcd server
- Fix [corruption check may get a `ErrCompacted` error when server has just been compacted](https://github.com/etcd-io/etcd/pull/16048)
- Improve [Lease put performance for the case that auth is disabled or the user is admin](https://github.com/etcd-io/etcd/pull/16019)
### etcd grpc-proxy
- Fix [Memberlist results not updated when proxy node down](https://github.com/etcd-io/etcd/pull/15907).
<hr>
## v3.5.9 (2023-05-11)
### etcd server
- Fix [LeaseTimeToLive API may return keys to clients which have no read permission on the keys](https://github.com/etcd-io/etcd/pull/15815).
### Dependencies
- Compile binaries using [go 1.19.9](https://github.com/etcd-io/etcd/pull/15822).
<hr>
## v3.5.8 (2023-04-13)
### etcd server
- Add [`etcd --tls-min-version --tls-max-version`](https://github.com/etcd-io/etcd/pull/15483) to enable support for TLS 1.3.
- Add [`etcd --listen-client-http-urls`](https://github.com/etcd-io/etcd/pull/15589) flag to support separating http server from grpc one, thus giving full immunity to [watch stream starvation under high read load](https://github.com/etcd-io/etcd/issues/15402).
- Change [http2 frame scheduler to random algorithm](https://github.com/etcd-io/etcd/pull/15452)
- Fix [Watch response traveling back in time when reconnecting member downloads snapshot from the leader](https://github.com/etcd-io/etcd/pull/15515)
- Fix [race when starting both secure & insecure gRPC servers on the same address](https://github.com/etcd-io/etcd/pull/15517)
- Fix [server/auth: disallow creating empty permission ranges](https://github.com/etcd-io/etcd/pull/15619)
- Fix [aligning zap log timestamp resolution to microseconds](https://github.com/etcd-io/etcd/pull/15240). Etcd now uses zap timestamp format: `2006-01-02T15:04:05.999999Z0700` (microsecond instead of milliseconds precision).
- Fix [wsproxy did not print log in JSON format](https://github.com/etcd-io/etcd/pull/15661).
- Fix [CVE-2021-28235](https://nvd.nist.gov/vuln/detail/CVE-2021-28235) by [clearing password after authenticating the user](https://github.com/etcd-io/etcd/pull/15653).
- Fix [etcdserver may panic when parsing a JWT token without username or revision](https://github.com/etcd-io/etcd/pull/15676).
- Fix [Requested watcher progress notifications are not synchronised with stream](https://github.com/etcd-io/etcd/pull/15695).
### Package `netutil`
- Fix [consistently format IPv6 addresses for comparison](https://github.com/etcd-io/etcd/pull/15187).
### Package `clientv3`
- Fix [etcd might send duplicated events to watch clients](https://github.com/etcd-io/etcd/pull/15274).
### Dependencies
- Recommend [Go 1.19+](https://github.com/etcd-io/etcd/pull/15337).
- Compile binaries using [go to 1.19.8](https://github.com/etcd-io/etcd/pull/15651)
- Upgrade [golang.org/x/net to v0.7.0](https://github.com/etcd-io/etcd/pull/15337)
- Upgrade [bbolt to v1.3.7](https://github.com/etcd-io/etcd/pull/15222).
### Docker image
- [Remove nsswitch.conf from docker image](https://github.com/etcd-io/etcd/pull/15161)
- Fix [etcd docker images all tagged with amd64 architecture](https://github.com/etcd-io/etcd/pull/15612)
<hr>
## v3.5.7 (2023-01-20)
### etcd server
- Fix [Remove memberID from data corrupt alarm](https://github.com/etcd-io/etcd/pull/14852).
- Fix [Allow non mutating requests pass through quotaKVServer when NOSPACE](https://github.com/etcd-io/etcd/pull/14884).
- Fix [nil pointer panic for readonly txn due to nil response](https://github.com/etcd-io/etcd/pull/14899).
- Fix [The last record which was partially synced to disk isn't automatically repaired](https://github.com/etcd-io/etcd/pull/15069).
- Fix [etcdserver might promote a non-started learner](https://github.com/etcd-io/etcd/pull/15096).
### Package `clientv3`
- Reverted the fix to [auth invalid token and old revision errors in watch](https://github.com/etcd-io/etcd/pull/14995).
### Dependencies
- Recommend [Go 1.17+](https://github.com/etcd-io/etcd/pull/15019).
- Compile binaries using [Go 1.17.13](https://github.com/etcd-io/etcd/pull/15019)
- Bumped [some dependencies](https://github.com/etcd-io/etcd/pull/15018) to address some HIGH Vulnerabilities.
### Docker image
- Use [distroless base image](https://github.com/etcd-io/etcd/pull/15016) to address critical Vulnerabilities.
- Updated [base image from base-debian11 to static-debian11 and removed dependency on busybox](https://github.com/etcd-io/etcd/pull/15037).
<hr>
## v3.5.6 (2022-11-21)
### etcd server
- Fix [auth invalid token and old revision errors in watch](https://github.com/etcd-io/etcd/pull/14547)
- Fix [avoid closing a watch with ID 0 incorrectly](https://github.com/etcd-io/etcd/pull/14563)
- Fix [auth: fix data consistency issue caused by recovery from snapshot](https://github.com/etcd-io/etcd/pull/14648)
- Fix [revision might be inconsistency between members when etcd crashes during processing defragmentation operation](https://github.com/etcd-io/etcd/pull/14733)
- Fix [timestamp in inconsistent format](https://github.com/etcd-io/etcd/pull/14799)
- Fix [Failed resolving host due to lost DNS record](https://github.com/etcd-io/etcd/pull/14573)
### Package `clientv3`
- Fix [Add backoff before retry when watch stream returns unavailable](https://github.com/etcd-io/etcd/pull/14582).
- Fix [stack overflow error in double barrier](https://github.com/etcd-io/etcd/pull/14658)
- Fix [Refreshing token on CommonName based authentication causes segmentation violation in client](https://github.com/etcd-io/etcd/pull/14790).
### etcd grpc-proxy
- Add [`etcd grpc-proxy start --listen-cipher-suites`](https://github.com/etcd-io/etcd/pull/14500) flag to support adding configurable cipher list.
<hr>
## v3.5.5 (2022-09-15)
### Deprecations
- Deprecated [SetKeepAlive and SetKeepAlivePeriod in limitListenerConn](https://github.com/etcd-io/etcd/pull/14366).
### Package `clientv3`
- Fix [do not overwrite authTokenBundle on dial](https://github.com/etcd-io/etcd/pull/14132).
- Fix [IsOptsWithPrefix returns false even if WithPrefix() is included](https://github.com/etcd-io/etcd/pull/14187).
### etcd server
- [Build official darwin/arm64 artifacts](https://github.com/etcd-io/etcd/pull/14436).
- Add [`etcd --max-concurrent-streams`](https://github.com/etcd-io/etcd/pull/14219) flag to configure the max concurrent streams each client can open at a time, and defaults to math.MaxUint32.
- Add [`etcd --experimental-compact-hash-check-enabled --experimental-compact-hash-check-time`](https://github.com/etcd-io/etcd/issues/14039) flags to support enabling reliable corruption detection on compacted revisions.
- Fix [unexpected error during txn](https://github.com/etcd-io/etcd/issues/14110).
- Fix [lease leak issue due to tokenProvider isn't enabled when restoring auth store from a snapshot](https://github.com/etcd-io/etcd/pull/13205).
- Fix [the race condition between goroutine and channel on the same leases to be revoked](https://github.com/etcd-io/etcd/pull/14087).
- Fix [lessor may continue to schedule checkpoint after stepping down leader role](https://github.com/etcd-io/etcd/pull/14087).
- Fix [Restrict the max size of each WAL entry to the remaining size of the WAL file](https://github.com/etcd-io/etcd/pull/14127).
- Fix [Protect rangePermCache with a RW lock correctly](https://github.com/etcd-io/etcd/pull/14227)
- Fix [memberID equals zero in corruption alarm](https://github.com/etcd-io/etcd/pull/14272)
- Fix [Durability API guarantee broken in single node cluster](https://github.com/etcd-io/etcd/pull/14424)
- Fix [etcd fails to start after performing alarm list operation and then power off/on](https://github.com/etcd-io/etcd/pull/14429)
- Fix [authentication data not loaded on member startup](https://github.com/etcd-io/etcd/pull/14409)
### etcdctl v3
- Fix [etcdctl move-leader may fail for multiple endpoints](https://github.com/etcd-io/etcd/pull/14434)
### Other
- [Bump golang.org/x/crypto to latest version](https://github.com/etcd-io/etcd/pull/13996) to address [CVE-2022-27191](https://github.com/advisories/GHSA-8c26-wmh5-6g9v).
- [Bump OpenTelemetry to 1.0.1 and gRPC to 1.41.0](https://github.com/etcd-io/etcd/pull/14312).
<hr>
## v3.5.4 (2022-04-24)
### etcd server
- Fix [etcd panic on startup (auth enabled)](https://github.com/etcd-io/etcd/pull/13946)
### package `client/pkg/v3`
- [Revert the change of trimming the trailing dot from SRV.Target](https://github.com/etcd-io/etcd/pull/13950) returned by DNS lookup
<hr>
## v3.5.3 (2022-04-13)
### etcd server
- Fix [Provide a better liveness probe for when etcd runs as a Kubernetes pod](https://github.com/etcd-io/etcd/pull/13706)
- Fix [inconsistent log format](https://github.com/etcd-io/etcd/pull/13864)
- Fix [Inconsistent revision and data occurs](https://github.com/etcd-io/etcd/pull/13908)
- Fix [Etcdserver is still in progress of processing LeaseGrantRequest when it receives a LeaseKeepAliveRequest on the same leaseID](https://github.com/etcd-io/etcd/pull/13932)
- Fix [consistent_index coming from snapshot is overwritten by the old local value](https://github.com/etcd-io/etcd/pull/13933)
- [Update container base image snapshot](https://github.com/etcd-io/etcd/pull/13862)
- Fix [Defrag unsets backend options](https://github.com/etcd-io/etcd/pull/13701).
### package `client/pkg/v3`
- [Trim the suffix dot from the target](https://github.com/etcd-io/etcd/pull/13714) in SRV records returned by DNS lookup
### etcdctl v3
- [Always print the raft_term in decimal](https://github.com/etcd-io/etcd/pull/13727) when displaying member list in json.
<hr>
## [v3.5.2](https://github.com/etcd-io/etcd/releases/tag/v3.5.2) (2022-02-01)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.5.1...v3.5.2) and [v3.5 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_5/) for any breaking changes.
### etcd server
- Fix [exclude the same alarm type activated by multiple peers](https://github.com/etcd-io/etcd/pull/13476).
- Add [`etcd --experimental-enable-lease-checkpoint-persist`](https://github.com/etcd-io/etcd/pull/13508) flag to enable checkpoint persisting.
- Fix [Lease checkpoints don't prevent to reset ttl on leader change](https://github.com/etcd-io/etcd/pull/13508), requires enabling checkpoint persisting.
- Fix [assertion failed due to tx closed when recovering v3 backend from a snapshot db](https://github.com/etcd-io/etcd/pull/13501)
- Fix [segmentation violation(SIGSEGV) error due to premature unlocking of watchableStore](https://github.com/etcd-io/etcd/pull/13541)
<hr>
## [v3.5.1](https://github.com/etcd-io/etcd/releases/tag/v3.5.1) (2021-10-15)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0...v3.5.1) and [v3.5 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_5/) for any breaking changes.
### etcd server
- Fix [self-signed-cert-validity parameter cannot be specified in the config file](https://github.com/etcd-io/etcd/pull/13237).
- Fix [ensure that cluster members stored in v2store and backend are in sync](https://github.com/etcd-io/etcd/pull/13348)
### etcd client
- [Fix etcd client sends invalid :authority header](https://github.com/etcd-io/etcd/issues/13192)
### package clientv3
- Endpoints self identify now as `etcd-endpoints://{id}/{authority}` where authority is based on first endpoint passed, for example `etcd-endpoints://0xc0009d8540/localhost:2079`
### Other
- Updated [base image](https://github.com/etcd-io/etcd/pull/13386) from `debian:buster-v1.4.0` to `debian:bullseye-20210927` to fix the following critical CVEs:
- [CVE-2021-3711](https://nvd.nist.gov/vuln/detail/CVE-2021-3711): miscalculation of a buffer size in openssl's SM2 decryption
- [CVE-2021-35942](https://nvd.nist.gov/vuln/detail/CVE-2021-35942): integer overflow flaw in glibc
- [CVE-2019-9893](https://nvd.nist.gov/vuln/detail/CVE-2019-9893): incorrect syscall argument generation in libseccomp
- [CVE-2021-36159](https://nvd.nist.gov/vuln/detail/CVE-2021-36159): libfetch in apk-tools mishandles numeric strings in FTP and HTTP protocols to allow out of bound reads.
<hr>
## v3.5.0 (2021-06)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.4.0...v3.5.0) and [v3.5 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_5/) for any breaking changes.
- [v3.5.0](https://github.com/etcd-io/etcd/releases/tag/v3.5.0) (2021 TBD), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-rc.1...v3.5.0).
- [v3.5.0-rc.1](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-rc.1) (2021-06-10), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-rc.0...v3.5.0-rc.1).
- [v3.5.0-rc.0](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-rc.0) (2021-06-04), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-beta.4...v3.5.0-rc.0).
- [v3.5.0-beta.4](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-beta.4) (2021-05-26), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-beta.3...v3.5.0-beta.4).
- [v3.5.0-beta.3](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-beta.3) (2021-05-18), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-beta.2...v3.5.0-beta.3).
- [v3.5.0-beta.2](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-beta.2) (2021-05-18), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-beta.1...v3.5.0-beta.2).
- [v3.5.0-beta.1](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-beta.1) (2021-05-18), see [code changes](https://github.com/etcd-io/etcd/compare/v3.4.0...v3.5.0-beta.1).
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.5 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_5/).**
### Breaking Changes
- `go.etcd.io/etcd` Go packages have moved to `go.etcd.io/etcd/{api,pkg,raft,client,etcdctl,server,raft,tests}/v3` to follow the [Go modules](https://github.com/golang/go/wiki/Modules) conventions
- `go.etcd.io/clientv3/snapshot` SnapshotManager class have moved to `go.etcd.io/clientv3/etcdctl`.
The method `snapshot.Save` to download a snapshot from the remote server was preserved in 'go.etcd.io/clientv3/snapshot`.
- `go.etcd.io/client' package got migrated to 'go.etcd.io/client/v2'.
- Changed behavior of clientv3 API [MemberList](https://github.com/etcd-io/etcd/pull/11639).
- Previously, it is directly served with server's local data, which could be stale.
- Now, it is served with linearizable guarantee. If the server is disconnected from quorum, `MemberList` call will fail.
- [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) only supports [`/v3`](TODO) endpoint.
- Deprecated [`/v3beta`](https://github.com/etcd-io/etcd/pull/9298).
- `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` doesn't work in v3.5. Use `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- **`etcd --experimental-enable-v2v3` flag remains experimental and to be deprecated.**
- v2 storage emulation feature will be deprecated in the next release.
- etcd 3.5 is the last version that supports V2 API. Flags `--enable-v2` and `--experimental-enable-v2v3` [are now deprecated](https://github.com/etcd-io/etcd/pull/12940) and will be removed in etcd v3.6 release.
- **`etcd --experimental-backend-bbolt-freelist-type` flag has been deprecated.** Use **`etcd --backend-bbolt-freelist-type`** instead. The default type is hashmap and it is stable now.
- **`etcd --debug` flag has been deprecated.** Use **`etcd --log-level=debug`** instead.
- Remove [`embed.Config.Debug`](https://github.com/etcd-io/etcd/pull/10947).
- **`etcd --log-output` flag has been deprecated.** Use **`etcd --log-outputs`** instead.
- **`etcd --logger=zap --log-outputs=stderr`** is now the default.
- **`etcd --logger=capnslog` flag value has been deprecated.**
- **`etcd --logger=zap --log-outputs=default` flag value is not supported.**.
- Use `etcd --logger=zap --log-outputs=stderr`.
- Or, use `etcd --logger=zap --log-outputs=systemd/journal` to send logs to the local systemd journal.
- Previously, if etcd parent process ID (PPID) is 1 (e.g. run with systemd), `etcd --logger=capnslog --log-outputs=default` redirects server logs to local systemd journal. And if write to journald fails, it writes to `os.Stderr` as a fallback.
- However, even with PPID 1, it can fail to dial systemd journal (e.g. run embedded etcd with Docker container). Then, [every single log write will fail](https://github.com/etcd-io/etcd/pull/9729) and fall back to `os.Stderr`, which is inefficient.
- To avoid this problem, systemd journal logging must be configured manually.
- **`etcd --log-outputs=stderr`** is now the default.
- **`etcd --log-package-levels` flag for `capnslog` has been deprecated.** Now, **`etcd --logger=zap --log-outputs=stderr`** is the default.
- **`[CLIENT-URL]/config/local/log` endpoint has been deprecated, as is `etcd --log-package-levels` flag.**
- `curl http://127.0.0.1:2379/config/local/log -XPUT -d '{"Level":"DEBUG"}'` won't work.
- Please use `etcd --logger=zap --log-outputs=stderr` instead.
- Deprecated `etcd_debugging_mvcc_db_total_size_in_bytes` Prometheus metric. Use `etcd_mvcc_db_total_size_in_bytes` instead.
- Deprecated `etcd_debugging_mvcc_put_total` Prometheus metric. Use `etcd_mvcc_put_total` instead.
- Deprecated `etcd_debugging_mvcc_delete_total` Prometheus metric. Use `etcd_mvcc_delete_total` instead.
- Deprecated `etcd_debugging_mvcc_txn_total` Prometheus metric. Use `etcd_mvcc_txn_total` instead.
- Deprecated `etcd_debugging_mvcc_range_total` Prometheus metric. Use `etcd_mvcc_range_total` instead.
- Main branch `/version` outputs `3.5.0-pre`, instead of `3.4.0+git`.
- Changed `proxy` package function signature to [support structured logger](https://github.com/etcd-io/etcd/pull/11614).
- Previously, `NewClusterProxy(c *clientv3.Client, advaddr string, prefix string) (pb.ClusterServer, <-chan struct{})`, now `NewClusterProxy(lg *zap.Logger, c *clientv3.Client, advaddr string, prefix string) (pb.ClusterServer, <-chan struct{})`.
- Previously, `Register(c *clientv3.Client, prefix string, addr string, ttl int)`, now `Register(lg *zap.Logger, c *clientv3.Client, prefix string, addr string, ttl int) <-chan struct{}`.
- Previously, `NewHandler(t *http.Transport, urlsFunc GetProxyURLs, failureWait time.Duration, refreshInterval time.Duration) http.Handler`, now `NewHandler(lg *zap.Logger, t *http.Transport, urlsFunc GetProxyURLs, failureWait time.Duration, refreshInterval time.Duration) http.Handler`.
- Changed `pkg/flags` function signature to [support structured logger](https://github.com/etcd-io/etcd/pull/11616).
- Previously, `SetFlagsFromEnv(prefix string, fs *flag.FlagSet) error`, now `SetFlagsFromEnv(lg *zap.Logger, prefix string, fs *flag.FlagSet) error`.
- Previously, `SetPflagsFromEnv(prefix string, fs *pflag.FlagSet) error`, now `SetPflagsFromEnv(lg *zap.Logger, prefix string, fs *pflag.FlagSet) error`.
- ClientV3 supports [grpc resolver API](https://github.com/etcd-io/etcd/blob/main/client/v3/naming/resolver/resolver.go).
- Endpoints can be managed using [endpoints.Manager](https://github.com/etcd-io/etcd/blob/main/client/v3/naming/endpoints/endpoints.go)
- Previously supported [GRPCResolver was decomissioned](https://github.com/etcd-io/etcd/pull/12675). Use [resolver](https://github.com/etcd-io/etcd/blob/main/client/v3/naming/resolver/resolver.go) instead.
- Turned on [--pre-vote by default](https://github.com/etcd-io/etcd/pull/12770). Should prevent disrupting RAFT leader by an individual member.
- [ETCD_CLIENT_DEBUG env](https://github.com/etcd-io/etcd/pull/12786): Now supports log levels (debug, info, warn, error, dpanic, panic, fatal). Only when set, overrides application-wide grpc logging settings.
- [Embed Etcd.Close()](https://github.com/etcd-io/etcd/pull/12828) needs to called exactly once and closes Etcd.Err() stream.
- [Embed Etcd does not override global/grpc logger](https://github.com/etcd-io/etcd/pull/12861) be default any longer. If desired, please call `embed.Config::SetupGlobalLoggers()` explicitly.
- [Embed Etcd custom logger should be configured using simpler builder `NewZapLoggerBuilder`](https://github.com/etcd-io/etcd/pull/12973).
- Client errors of `context cancelled` or `context deadline exceeded` are exposed as `codes.Canceled` and `codes.DeadlineExceeded`, instead of `codes.Unknown`.
### Storage format changes
- [WAL log's snapshots persists raftpb.ConfState](https://github.com/etcd-io/etcd/pull/12735)
- [Backend persists raftpb.ConfState](https://github.com/etcd-io/etcd/pull/12962) in the `meta` bucket `confState` key.
- [Backend persists applied term](https://github.com/etcd-io/etcd/pull/) in the `meta` bucket.
- Backend persists `downgrade` in the `cluster` bucket
### Security
- Add [`TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256` and `TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256` to `etcd --cipher-suites`](https://github.com/etcd-io/etcd/pull/11864).
- Changed [the format of WAL entries related to auth for not keeping password as a plain text](https://github.com/etcd-io/etcd/pull/11943).
- Add third party [Security Audit Report](https://github.com/etcd-io/etcd/pull/12201).
- A [log warning](https://github.com/etcd-io/etcd/pull/12242) is added when etcd uses any existing directory that has a permission different than 700 on Linux and 777 on Windows.
- Add optional [`ClientCertFile` and `ClientKeyFile`](https://github.com/etcd-io/etcd/pull/12705) options for peer and client tls configuration when split certificates are used.
### Metrics, Monitoring
See [List of metrics](https://etcd.io/docs/latest/metrics/) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Deprecated `etcd_debugging_mvcc_db_total_size_in_bytes` Prometheus metric. Use `etcd_mvcc_db_total_size_in_bytes` instead.
- Deprecated `etcd_debugging_mvcc_put_total` Prometheus metric. Use `etcd_mvcc_put_total` instead.
- Deprecated `etcd_debugging_mvcc_delete_total` Prometheus metric. Use `etcd_mvcc_delete_total` instead.
- Deprecated `etcd_debugging_mvcc_txn_total` Prometheus metric. Use `etcd_mvcc_txn_total` instead.
- Deprecated `etcd_debugging_mvcc_range_total` Prometheus metric. Use `etcd_mvcc_range_total` instead.
- Add [`etcd_debugging_mvcc_current_revision`](https://github.com/etcd-io/etcd/pull/11126) Prometheus metric.
- Add [`etcd_debugging_mvcc_compact_revision`](https://github.com/etcd-io/etcd/pull/11126) Prometheus metric.
- Change [`etcd_cluster_version`](https://github.com/etcd-io/etcd/pull/11254) Prometheus metrics to include only major and minor version.
- Add [`etcd_debugging_mvcc_total_put_size_in_bytes`](https://github.com/etcd-io/etcd/pull/11374) Prometheus metric.
- Add [`etcd_server_client_requests_total` with `"type"` and `"client_api_version"` labels](https://github.com/etcd-io/etcd/pull/11687).
- Add [`etcd_wal_write_bytes_total`](https://github.com/etcd-io/etcd/pull/11738).
- Add [`etcd_debugging_auth_revision`](https://github.com/etcd-io/etcd/commit/f14d2a087f7b0fd6f7980b95b5e0b945109c95f3).
- Add [`os_fd_used` and `os_fd_limit` to monitor current OS file descriptors](https://github.com/etcd-io/etcd/pull/12214).
- Add [`etcd_disk_defrag_inflight`](https://github.com/etcd-io/etcd/pull/13395).
### etcd server
- Add [don't attempt to grant nil permission to a role](https://github.com/etcd-io/etcd/pull/13086).
- Add [don't activate alarms w/missing AlarmType](https://github.com/etcd-io/etcd/pull/13084).
- Add [`TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256` and `TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256` to `etcd --cipher-suites`](https://github.com/etcd-io/etcd/pull/11864).
- Automatically [create parent directory if it does not exist](https://github.com/etcd-io/etcd/pull/9626) (fix [issue#9609](https://github.com/etcd-io/etcd/issues/9609)).
- v4.0 will configure `etcd --enable-v2=true --enable-v2v3=/aaa` to enable v2 API server that is backed by **v3 storage**.
- [`etcd --backend-bbolt-freelist-type`] flag is now stable.
- `etcd --experimental-backend-bbolt-freelist-type` has been deprecated.
- Support [downgrade API](https://github.com/etcd-io/etcd/pull/11715).
- Deprecate v2 apply on cluster version. [Use v3 request to set cluster version and recover cluster version from v3 backend](https://github.com/etcd-io/etcd/pull/11427).
- [Use v2 api to update cluster version to support mixed version cluster during upgrade](https://github.com/etcd-io/etcd/pull/12988).
- [Fix corruption bug in defrag](https://github.com/etcd-io/etcd/pull/11613).
- Fix [quorum protection logic when promoting a learner](https://github.com/etcd-io/etcd/pull/11640).
- Improve [peer corruption checker](https://github.com/etcd-io/etcd/pull/11621) to work when peer mTLS is enabled.
- Log [`[CLIENT-PORT]/health` check in server side](https://github.com/etcd-io/etcd/pull/11704).
- Log [successful etcd server-side health check in debug level](https://github.com/etcd-io/etcd/pull/12677).
- Improve [compaction performance when latest index is greater than 1-million](https://github.com/etcd-io/etcd/pull/11734).
- [Refactor consistentindex](https://github.com/etcd-io/etcd/pull/11699).
- [Add log when etcdserver failed to apply command](https://github.com/etcd-io/etcd/pull/11670).
- Improve [count-only range performance](https://github.com/etcd-io/etcd/pull/11771).
- Remove [redundant storage restore operation to shorten the startup time](https://github.com/etcd-io/etcd/pull/11779).
- With 40 million key test data,it can shorten the startup time from 5 min to 2.5 min.
- [Fix deadlock bug in mvcc](https://github.com/etcd-io/etcd/pull/11817).
- Fix [inconsistency between WAL and server snapshot](https://github.com/etcd-io/etcd/pull/11888).
- Previously, server restore fails if it had crashed after persisting raft hard state but before saving snapshot.
- See https://github.com/etcd-io/etcd/issues/10219 for more.
- Add [missing CRC checksum check in WAL validate method otherwise causes panic](https://github.com/etcd-io/etcd/pull/11924).
- See https://github.com/etcd-io/etcd/issues/11918.
- Improve logging around snapshot send and receive.
- [Push down RangeOptions.limit argv into index tree to reduce memory overhead](https://github.com/etcd-io/etcd/pull/11990).
- Add [reason field for /health response](https://github.com/etcd-io/etcd/pull/11983).
- Add [exclude alarms from health check conditionally](https://github.com/etcd-io/etcd/pull/12880).
- Add [`etcd --unsafe-no-fsync`](https://github.com/etcd-io/etcd/pull/11946) flag.
- Setting the flag disables all uses of fsync, which is unsafe and will cause data loss. This flag makes it possible to run an etcd node for testing and development without placing lots of load on the file system.
- Add [`etcd --auth-token-ttl`](https://github.com/etcd-io/etcd/pull/11980) flag to customize `simpleTokenTTL` settings.
- Improve [`runtime.FDUsage` call pattern to reduce objects malloc of Memory Usage and CPU Usage](https://github.com/etcd-io/etcd/pull/11986).
- Improve [mvcc.watchResponse channel Memory Usage](https://github.com/etcd-io/etcd/pull/11987).
- Log [expensive request info in UnaryInterceptor](https://github.com/etcd-io/etcd/pull/12086).
- [Fix invalid Go type in etcdserverpb](https://github.com/etcd-io/etcd/pull/12000).
- [Improve healthcheck by using v3 range request and its corresponding timeout](https://github.com/etcd-io/etcd/pull/12195).
- Add [`etcd --experimental-watch-progress-notify-interval`](https://github.com/etcd-io/etcd/pull/12216) flag to make watch progress notify interval configurable.
- Fix [server panic in slow writes warnings](https://github.com/etcd-io/etcd/issues/12197).
- Fixed via [PR#12238](https://github.com/etcd-io/etcd/pull/12238).
- [Fix server panic](https://github.com/etcd-io/etcd/pull/12288) when force-new-cluster flag is enabled in a cluster which had learner node.
- Add [`etcd --self-signed-cert-validity`](https://github.com/etcd-io/etcd/pull/12429) flag to support setting certificate expiration time.
- Notice, certificates generated by etcd are valid for 1 year by default when specifying the auto-tls or peer-auto-tls option.
- Add [`etcd --experimental-warning-apply-duration`](https://github.com/etcd-io/etcd/pull/12448) flag which allows apply duration threshold to be configurable.
- Add [`etcd --experimental-memory-mlock`](https://github.com/etcd-io/etcd/pull/TODO) flag which prevents etcd memory pages to be swapped out.
- Add [`etcd --socket-reuse-port`](https://github.com/etcd-io/etcd/pull/12702) flag
- Setting this flag enables `SO_REUSEPORT` which allows rebind of a port already in use. User should take caution when using this flag to ensure flock is properly enforced.
- Add [`etcd --socket-reuse-address`](https://github.com/etcd-io/etcd/pull/12702) flag
- Setting this flag enables `SO_REUSEADDR` which allows binding to an address in `TIME_WAIT` state, improving etcd restart time.
- Reduce [around 30% memory allocation by logging range response size without marshal](https://github.com/etcd-io/etcd/pull/12871).
- `ETCD_VERIFY="all"` environment triggers [additional verification of consistency](https://github.com/etcd-io/etcd/pull/12901) of etcd data-dir files.
- Add [`etcd --enable-log-rotation`](https://github.com/etcd-io/etcd/pull/12774) boolean flag which enables log rotation if true.
- Add [`etcd --log-rotation-config-json`](https://github.com/etcd-io/etcd/pull/12774) flag which allows passthrough of JSON config to configure log rotation for a file output target.
- Add experimental distributed tracing boolean flag [`--experimental-enable-distributed-tracing`](https://github.com/etcd-io/etcd/pull/12919) which enables tracing.
- Add [`etcd --experimental-distributed-tracing-address`](https://github.com/etcd-io/etcd/pull/12919) string flag which allows configuring the OpenTelemetry collector address.
- Add [`etcd --experimental-distributed-tracing-service-name`](https://github.com/etcd-io/etcd/pull/12919) string flag which allows changing the default "etcd" service name.
- Add [`etcd --experimental-distributed-tracing-instance-id`](https://github.com/etcd-io/etcd/pull/12919) string flag which configures an instance ID, which must be unique per etcd instance.
- Add [`--experimental-bootstrap-defrag-threshold-megabytes`](https://github.com/etcd-io/etcd/pull/12941) which configures a threshold for the unused db size and etcdserver will automatically perform defragmentation on bootstrap when it exceeds this value. The functionality is disabled if the value is 0.
### Package `runtime`
- Optimize [`runtime.FDUsage` by removing unnecessary sorting](https://github.com/etcd-io/etcd/pull/12214).
### Package `embed`
- Remove [`embed.Config.Debug`](https://github.com/etcd-io/etcd/pull/10947).
- Use `embed.Config.LogLevel` instead.
- Add [`embed.Config.ZapLoggerBuilder`](https://github.com/etcd-io/etcd/pull/11147) to allow creating a custom zap logger.
- Replace [global `*zap.Logger` with etcd server logger object](https://github.com/etcd-io/etcd/pull/12212).
- Add [`embed.Config.EnableLogRotation`](https://github.com/etcd-io/etcd/pull/12774) which enables log rotation if true.
- Add [`embed.Config.LogRotationConfigJSON`](https://github.com/etcd-io/etcd/pull/12774) to allow passthrough of JSON config to configure log rotation for a file output target.
- Add [`embed.Config.ExperimentalEnableDistributedTracing`](https://github.com/etcd-io/etcd/pull/12919) which enables experimental distributed tracing if true.
- Add [`embed.Config.ExperimentalDistributedTracingAddress`](https://github.com/etcd-io/etcd/pull/12919) which allows overriding default collector address.
- Add [`embed.Config.ExperimentalDistributedTracingServiceName`](https://github.com/etcd-io/etcd/pull/12919) which allows overriding default "etcd" service name.
- Add [`embed.Config.ExperimentalDistributedTracingServiceInstanceID`](https://github.com/etcd-io/etcd/pull/12919) which allows configuring an instance ID, which must be uniquer per etcd instance.
### Package `clientv3`
- Remove [excessive watch cancel logging messages](https://github.com/etcd-io/etcd/pull/12187).
- See [kubernetes/kubernetes#93450](https://github.com/kubernetes/kubernetes/issues/93450).
- Add [`TryLock`](https://github.com/etcd-io/etcd/pull/11104) method to `clientv3/concurrency/Mutex`. A non-blocking method on `Mutex` which does not wait to get lock on the Mutex, returns immediately if Mutex is locked by another session.
- Fix [client balancer failover against multiple endpoints](https://github.com/etcd-io/etcd/pull/11184).
- Fix [`"kube-apiserver: failover on multi-member etcd cluster fails certificate check on DNS mismatch"`](https://github.com/kubernetes/kubernetes/issues/83028).
- Fix [IPv6 endpoint parsing in client](https://github.com/etcd-io/etcd/pull/11211).
- Fix ["1.16: etcd client does not parse IPv6 addresses correctly when members are joining" (kubernetes#83550)](https://github.com/kubernetes/kubernetes/issues/83550).
- Fix [errors caused by grpc changing balancer/resolver API](https://github.com/etcd-io/etcd/pull/11564). This change is compatible with grpc >= [v1.26.0](https://github.com/grpc/grpc-go/releases/tag/v1.26.0), but is not compatible with < v1.26.0 version.
- Use [ServerName as the authority](https://github.com/etcd-io/etcd/pull/11574) after bumping to grpc v1.26.0. Remove workaround in [#11184](https://github.com/etcd-io/etcd/pull/11184).
- Fix [`"hasleader"` metadata embedding](https://github.com/etcd-io/etcd/pull/11687).
- Previously, `clientv3.WithRequireLeader(ctx)` was overwriting existing context keys.
- Fix [watch leak caused by lazy cancellation](https://github.com/etcd-io/etcd/pull/11850). When clients cancel their watches, a cancel request will now be immediately sent to the server instead of waiting for the next watch event.
- Make sure [save snapshot downloads checksum for integrity checks](https://github.com/etcd-io/etcd/pull/11896).
- Fix [auth token invalid after watch reconnects](https://github.com/etcd-io/etcd/pull/12264). Get AuthToken automatically when clientConn is ready.
- Improve [clientv3:get AuthToken gracefully without extra connection](https://github.com/etcd-io/etcd/pull/12165).
- Changed [clientv3 dialing code](https://github.com/etcd-io/etcd/pull/12671) to use grpc resolver API instead of custom balancer.
- Endpoints self identify now as `etcd-endpoints://{id}/#initially={list of endpoints}` e.g. `etcd-endpoints://0xc0009d8540/#initially=[localhost:2079]`
- Make sure [save snapshot downloads checksum for integrity checks](https://github.com/etcd-io/etcd/pull/11896).
### Package `lease`
- Fix [memory leak in follower nodes](https://github.com/etcd-io/etcd/pull/11731).
- https://github.com/etcd-io/etcd/issues/11495
- https://github.com/etcd-io/etcd/issues/11730
- Make sure [grant/revoke won't be applied repeatedly after restarting etcd](https://github.com/etcd-io/etcd/pull/11935).
### Package `wal`
- Add [`etcd_wal_write_bytes_total`](https://github.com/etcd-io/etcd/pull/11738).
- Handle [out-of-range slice bound in `ReadAll` and entry limit in `decodeRecord`](https://github.com/etcd-io/etcd/pull/11793).
### etcdctl v3
- Fix `etcdctl member add` command to prevent potential timeout. ([PR#11194](https://github.com/etcd-io/etcd/pull/11194) and [PR#11638](https://github.com/etcd-io/etcd/pull/11638))
- Add [`etcdctl watch --progress-notify`](https://github.com/etcd-io/etcd/pull/11462) flag.
- Add [`etcdctl auth status`](https://github.com/etcd-io/etcd/pull/11536) command to check if authentication is enabled
- Add [`etcdctl get --count-only`](https://github.com/etcd-io/etcd/pull/11743) flag for output type `fields`.
- Add [`etcdctl member list -w=json --hex`](https://github.com/etcd-io/etcd/pull/11812) flag to print memberListResponse in hex format json.
- Changed [`etcdctl lock <lockname> exec-command`](https://github.com/etcd-io/etcd/pull/12829) to return exit code of exec-command.
- [New tool: `etcdutl`](https://github.com/etcd-io/etcd/pull/12971) incorporated functionality of: `etcdctl snapshot status|restore`, `etcdctl backup`, `etcdctl defrag --data-dir ...`.
- [ETCDCTL_API=3 `etcdctl migrate`](https://github.com/etcd-io/etcd/pull/12971) has been decommissioned. Use etcd <=v3.4 to restore v2 storage.
### gRPC gateway
- [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) only supports [`/v3`](TODO) endpoint.
- Deprecated [`/v3beta`](https://github.com/etcd-io/etcd/pull/9298).
- `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` does work in v3.5. Use `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- Set [`enable-grpc-gateway`](https://github.com/etcd-io/etcd/pull/12297) flag to true when using a config file to keep the defaults the same as the command line configuration.
### gRPC Proxy
- Fix [`panic on error`](https://github.com/etcd-io/etcd/pull/11694) for metrics handler.
- Add [gRPC keepalive related flags](https://github.com/etcd-io/etcd/pull/11711) `grpc-keepalive-min-time`, `grpc-keepalive-interval` and `grpc-keepalive-timeout`.
- [Fix grpc watch proxy hangs when failed to cancel a watcher](https://github.com/etcd-io/etcd/pull/12030) .
- Add [metrics handler for grpcproxy self](https://github.com/etcd-io/etcd/pull/12107).
- Add [health handler for grpcproxy self](https://github.com/etcd-io/etcd/pull/12114).
### Auth
- Fix [NoPassword check when adding user through GRPC gateway](https://github.com/etcd-io/etcd/pull/11418) ([issue#11414](https://github.com/etcd-io/etcd/issues/11414))
- Fix bug where [some auth related messages are logged at wrong level](https://github.com/etcd-io/etcd/pull/11586)
- [Fix a data corruption bug by saving consistent index](https://github.com/etcd-io/etcd/pull/11652).
- [Improve checkPassword performance](https://github.com/etcd-io/etcd/pull/11735).
- [Add authRevision field in AuthStatus](https://github.com/etcd-io/etcd/pull/11659).
- Fix [a bug of not refreshing expired tokens](https://github.com/etcd-io/etcd/pull/13308).
-
### API
- Add [`/v3/auth/status`](https://github.com/etcd-io/etcd/pull/11536) endpoint to check if authentication is enabled
- [Add `Linearizable` field to `etcdserverpb.MemberListRequest`](https://github.com/etcd-io/etcd/pull/11639).
- [Learner support Snapshot RPC](https://github.com/etcd-io/etcd/pull/12890/).
### Package `netutil`
- Remove [`netutil.DropPort/RecoverPort/SetLatency/RemoveLatency`](https://github.com/etcd-io/etcd/pull/12491).
- These are not used anymore. They were only used for older versions of functional testing.
- Removed to adhere to best security practices, minimize arbitrary shell invocation.
### `tools/etcd-dump-metrics`
- Implement [input validation to prevent arbitrary shell invocation](https://github.com/etcd-io/etcd/pull/12491).
### Dependency
- Upgrade [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases) from [**`v1.23.0`**](https://github.com/grpc/grpc-go/releases/tag/v1.23.0) to [**`v1.37.0`**](https://github.com/grpc/grpc-go/releases/tag/v1.37.0).
- Upgrade [`go.uber.org/zap`](https://github.com/uber-go/zap/releases) from [**`v1.14.1`**](https://github.com/uber-go/zap/releases/tag/v1.14.1) to [**`v1.16.0`**](https://github.com/uber-go/zap/releases/tag/v1.16.0).
### Platforms
- etcd now [officially supports `arm64`](https://github.com/etcd-io/etcd/pull/12929).
- See https://github.com/etcd-io/etcd/pull/12928 for adding automated tests with `arm64` EC2 instances (Graviton 2).
- See https://github.com/etcd-io/website/pull/273 for new platform support tier policies.
### Release
- Add s390x build support ([PR#11548](https://github.com/etcd-io/etcd/pull/11548) and [PR#11358](https://github.com/etcd-io/etcd/pull/11358))
### Go
- Require [*Go 1.16+*](https://github.com/etcd-io/etcd/pull/11110).
- Compile with [*Go 1.16+*](https://golang.org/doc/devel/release.html#go1.16)
- etcd uses [go modules](https://github.com/etcd-io/etcd/pull/12279) (instead of vendor dir) to track dependencies.
### Project Governance
- The etcd team has added, a well defined and openly discussed, project [governance](https://github.com/etcd-io/etcd/pull/11175).
<hr>

View File

@ -1,103 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.5](https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.5.md).
<hr>
## v3.6.0 (TBD)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0...v3.6.0).
### Breaking Changes
- `etcd` will no longer start on data dir created by newer versions (for example etcd v3.6 will not run on v3.7+ data dir). To downgrade data dir please check out `etcdutl migrate` command.
- `etcd` doesn't support serving client requests on the peer listen endpoints (--listen-peer-urls). See [pull/13565](https://github.com/etcd-io/etcd/pull/13565).
- `etcdctl` will sleep(2s) in case of range delete without `--range` flag. See [pull/13747](https://github.com/etcd-io/etcd/pull/13747)
- Applications which depend on etcd v3.6 packages must be built with go version >= v1.18.
### Deprecations
- Deprecated [V2 discovery](https://etcd.io/docs/v3.5/dev-internal/discovery_protocol/).
- Deprecated [SetKeepAlive and SetKeepAlivePeriod in limitListenerConn](https://github.com/etcd-io/etcd/pull/14356).
- Removed [etcdctl defrag --data-dir](https://github.com/etcd-io/etcd/pull/13793).
- Removed [etcdctl snapshot status](https://github.com/etcd-io/etcd/pull/13809).
- Removed [etcdctl snapshot restore](https://github.com/etcd-io/etcd/pull/13809).
- Removed [etcdutl snapshot save](https://github.com/etcd-io/etcd/pull/13809).
### etcdctl v3
- Add command to generate [shell completion](https://github.com/etcd-io/etcd/pull/13133).
- When print endpoint status, [show db size in use](https://github.com/etcd-io/etcd/pull/13639)
- [Always print the raft_term in decimal](https://github.com/etcd-io/etcd/pull/13711) when displaying member list in json.
- [Add one more field `storageVersion`](https://github.com/etcd-io/etcd/pull/13773) into the response of command `etcdctl endpoint status`.
- Add [`--max-txn-ops`](https://github.com/etcd-io/etcd/pull/14340) flag to make-mirror command.
- Add [`--consistency`](https://github.com/etcd-io/etcd/pull/15261) flag to member list command.
- Display [field `hash_revision`](https://github.com/etcd-io/etcd/pull/14812) for `etcdctl endpoint hash` command.
### etcdutl v3
- Add command to generate [shell completion](https://github.com/etcd-io/etcd/pull/13142).
- Add `migrate` command for downgrading/upgrading etcd data dir files.
### Package `clientv3`
- [Support serializable `MemberList` operation](https://github.com/etcd-io/etcd/pull/15261).
### Package `server`
- Package `mvcc` was moved to `storage/mvcc`
- Package `mvcc/backend` was moved to `storage/backend`
- Package `mvcc/buckets` was moved to `storage/schema`
- Package `wal` was moved to `storage/wal`
- Package `datadir` was moved to `storage/datadir`
### Package `raft`
- Send empty `MsgApp` when entry in-flight limits are exceeded. See [pull/14633](https://github.com/etcd-io/etcd/pull/14633).
- Add [MaxInflightBytes](https://github.com/etcd-io/etcd/pull/14624) setting in `raft.Config` for better flow control of entries.
- [Decouple raft from etcd](https://github.com/etcd-io/etcd/issues/14713). Migrated raft to a separate [repository](https://github.com/etcd-io/raft), and renamed raft module to `go.etcd.io/raft/v3`.
### etcd server
- Add [`etcd --log-format`](https://github.com/etcd-io/etcd/pull/13339) flag to support log format.
- Add [`etcd --experimental-max-learners`](https://github.com/etcd-io/etcd/pull/13377) flag to allow configuration of learner max membership.
- Add [`etcd --experimental-enable-lease-checkpoint-persist`](https://github.com/etcd-io/etcd/pull/13508) flag to handle upgrade from v3.5.2 clusters with this feature enabled.
- Add [`etcdctl make-mirror --rev`](https://github.com/etcd-io/etcd/pull/13519) flag to support incremental mirror.
- Add [`etcd --experimental-wait-cluster-ready-timeout`](https://github.com/etcd-io/etcd/pull/13525) flag to wait for cluster to be ready before serving client requests.
- Add [v3 discovery](https://github.com/etcd-io/etcd/pull/13635) to bootstrap a new etcd cluster.
- Add [field `storage`](https://github.com/etcd-io/etcd/pull/13772) into the response body of endpoint `/version`.
- Add [`etcd --max-concurrent-streams`](https://github.com/etcd-io/etcd/pull/14169) flag to configure the max concurrent streams each client can open at a time, and defaults to math.MaxUint32.
- Add [`etcd grpc-proxy --experimental-enable-grpc-logging`](https://github.com/etcd-io/etcd/pull/14266) flag to logging all grpc requests and responses.
- Add [`etcd --experimental-compact-hash-check-enabled --experimental-compact-hash-check-time`](https://github.com/etcd-io/etcd/issues/14039) flags to support enabling reliable corruption detection on compacted revisions.
- Add [Protection on maintenance request when auth is enabled](https://github.com/etcd-io/etcd/pull/14663).
- Graduated [`--experimental-warning-unary-request-duration` to `--warning-unary-request-duration`](https://github.com/etcd-io/etcd/pull/14414). Note the experimental flag is deprecated and will be decommissioned in v3.7.
- Add [field `hash_revision` into `HashKVResponse`](https://github.com/etcd-io/etcd/pull/14537).
- Add [`etcd --experimental-snapshot-catch-up-entries`](https://github.com/etcd-io/etcd/pull/15033) flag to configure number of entries for a slow follower to catch up after compacting the the raft storage entries and defaults to 5k.
- Decreased [`--snapshot-count` default value from 100,000 to 10,000](https://github.com/etcd-io/etcd/pull/15408)
- Add [`etcd --tls-min-version --tls-max-version`](https://github.com/etcd-io/etcd/pull/15156) to enable support for TLS 1.3.
### etcd grpc-proxy
- Add [`etcd grpc-proxy start --endpoints-auto-sync-interval`](https://github.com/etcd-io/etcd/pull/14354) flag to enable and configure interval of auto sync of endpoints with server.
- Add [`etcd grpc-proxy start --listen-cipher-suites`](https://github.com/etcd-io/etcd/pull/14308) flag to support adding configurable cipher list.
### tools/benchmark
- [Add etcd client autoSync flag](https://github.com/etcd-io/etcd/pull/13416)
### Metrics, Monitoring
See [List of metrics](https://etcd.io/docs/latest/metrics/) for all metrics per release.
- Add [`etcd_disk_defrag_inflight`](https://github.com/etcd-io/etcd/pull/13371).
- Add [`etcd_debugging_server_alarms`](https://github.com/etcd-io/etcd/pull/14276).
### Go
- Require [Go 1.19+](https://github.com/etcd-io/etcd/pull/14463).
- Compile with [Go 1.19+](https://golang.org/doc/devel/release.html#go1.19). Please refer to [gc-guide](https://go.dev/doc/gc-guide) to configure `GOGC` and `GOMEMLIMIT` properly.
### Other
- Use Distroless as base image to make the image less vulnerable and reduce image size.
<hr>

View File

@ -1,44 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.x](https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.x.md).
<hr>
## v4.0.0 (TBD)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0...v4.0.0) and [v4.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_4_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v4.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_4_0/).**
### Breaking Changes
- [Secure etcd by default](https://github.com/etcd-io/etcd/issues/9475)?
- Deprecate [`etcd --proxy*`](TODO) flags; **no more v2 proxy**.
- Deprecate [v2 storage backend](https://github.com/etcd-io/etcd/issues/9232); **no more v2 store**.
- v2 API is still supported via [v2 emulation](TODO).
- Deprecate [`etcdctl backup`](TODO) command.
- `clientv3.Client.KeepAlive(ctx context.Context, id LeaseID) (<-chan *LeaseKeepAliveResponse, error)` is now [`clientv4.Client.KeepAlive(ctx context.Context, id LeaseID) <-chan *LeaseKeepAliveResponse`](TODO).
- Similar to `Watch`, [`KeepAlive` does not return errors](https://github.com/etcd-io/etcd/issues/7488).
- If there's an unknown server error, kill all open channels and create a new stream on the next `KeepAlive` call.
- Rename `github.com/coreos/client` to `github.com/coreos/clientv2`.
- [`etcd --experimental-initial-corrupt-check`](TODO) has been deprecated.
- Use [`etcd --initial-corrupt-check`](TODO) instead.
- [`etcd --experimental-corrupt-check-time`](TODO) has been deprecated.
- Use [`etcd --corrupt-check-time`](TODO) instead.
- Enable TLS 1.13, deprecate TLS cipher suites.
### etcd server
- [`etcd --initial-corrupt-check`](TODO) flag is now stable (`etcd --experimental-initial-corrupt-check` has been deprecated).
- `etcd --initial-corrupt-check=true` by default, to check cluster database hashes before serving client/peer traffic.
- [`etcd --corrupt-check-time`](TODO) flag is now stable (`etcd --experimental-corrupt-check-time` has been deprecated).
- `etcd --corrupt-check-time=12h` by default, to check cluster database hashes for every 12-hour.
- Enable TLS 1.13, deprecate TLS cipher suites.
### Go
- Require [*Go 2*](https://blog.golang.org/go2draft).
<hr>

View File

@ -1,21 +0,0 @@
# Change logs
## Production recommendation
The minimum recommended etcd versions to run in **production** are v3.4.22+ and v3.5.6+. Refer to the [versioning policy](https://etcd.io/docs/v3.5/op-guide/versioning/) for more details.
### v3.5 data corruption issue
Running etcd v3.5.2, v3.5.1 and v3.5.0 under high load can cause a data corruption issue.
If etcd process is killed, occasionally some committed transactions are not reflected on all the members.
Recommendation is to upgrade to v3.5.4+.
If you have encountered data corruption, please follow instructions on https://etcd.io/docs/v3.5/op-guide/data_corruption/.
## Change log rules
1. Each patch release only includes changes against previous patch release.
For example, the change log of v3.5.5 should only include items which are new to v3.5.4.
2. For the first release (e.g. 3.4.0, 3.5.0, 3.6.0, 4.0.0 etc.) for each minor or major
version, it only includes changes which are new to the first release of previous minor
or major version. For example, v3.5.0 should only include items which are new to v3.4.0,
and v3.6.0 should only include items which are new to v3.5.0.

View File

@ -1,121 +1,46 @@
# How to contribute
etcd is Apache 2.0 licensed and accepts contributions via GitHub pull requests.
This document outlines basics of contributing to etcd.
etcd is Apache 2.0 licensed and accepts contributions via GitHub pull requests. This document outlines some of the conventions on commit message formatting, contact points for developers, and other resources to help get contributions into etcd.
# Email and chat
- Email: [etcd-dev](https://groups.google.com/forum/?hl=en#!forum/etcd-dev)
- IRC: #[etcd](irc://irc.freenode.org:6667/#etcd) IRC channel on freenode.org
## Getting started
- Fork the repository on GitHub
- Read the README.md for build instructions
## Reporting bugs and creating issues
Reporting bugs is one of the best ways to contribute. However, a good bug report has some very specific qualities, so please read over our short document on [reporting bugs](https://github.com/etcd-io/etcd/blob/master/Documentation/reporting_bugs.md) before submitting a bug report. This document might contain links to known issues, another good reason to take a look there before reporting a bug.
## Contribution flow
This is a rough outline of what a contributor's workflow looks like:
* [Find something to work on](#Find-something-to-work-on)
* [Setup development environment](#Setup-development-environment)
* [Implement your change](#Implement-your-change)
* [Commit your change](#Commit-your-change)
* [Create a pull request](#Create-a-pull-request)
* [Get your pull request reviewed](#Get-your-pull-request-reviewed)
If you have any questions about, please reach out using one of the methods listed in [contact].
- Create a topic branch from where to base the contribution. This is usually master.
- Make commits of logical units.
- Make sure commit messages are in the proper format (see below).
- Push changes in a topic branch to a personal fork of the repository.
- Submit a pull request to etcd-io/etcd.
- The PR must receive a LGTM from two maintainers found in the MAINTAINERS file.
[contact]: ./README.md#Contact
Thanks for contributing!
## Learn more about etcd
### Code style
Before making a change please look through resources below to learn more about etcd and tools used for development.
The coding style suggested by the Golang community is used in etcd. See the [style doc](https://github.com/golang/go/wiki/CodeReviewComments) for details.
* Please learn about [Git](https://github.com/git-guides) version control system used in etcd.
* Read the [etcd learning resources](https://etcd.io/docs/v3.5/learning/)
* Read the [etcd community membership](/Documentation/contributor-guide/community-membership.md)
* Watch [etcd deep dive](https://www.youtube.com/watch?v=D2pm6ufIt98&t=927s)
* Watch [etcd code walk through](https://www.youtube.com/watch?v=H3XaSF6wF7w)
Please follow this style to make etcd easy to review, maintain and develop.
## Find something to work on
### Format of the commit message
All the work in etcd project is tracked in [github issue tracker].
Issues should be properly labeled making it easy to find something for you.
We follow a rough convention for commit messages that is designed to answer two
questions: what changed and why. The subject line should feature the what and
the body of the commit should describe the why.
Depending on your interest and experience you should check different labels:
* If you are just starting, check issues labeled with [good first issue].
* When you feel more conformable in your contributions, checkout [help wanted].
* Advanced contributors can try to help with issues labeled [priority/important] covering most relevant work at the time.
If any of aforementioned labels don't have unassigned issues, please [contact] one of the [maintainers] asking to triage more issues.
[github issue tracker]: https://github.com/etcd-io/etcd/issues
[good first issue]: https://github.com/etcd-io/etcd/labels/good%20first%20issue
[help wanted]: https://github.com/etcd-io/etcd/labels/help%20wanted
[maintainers]: https://github.com/etcd-io/etcd/blob/main/MAINTAINERS
[priority/important]: https://github.com/etcd-io/etcd/labels/priority%2Fimportant
## Setup development environment
The etcd project supports two options for development:
1. Manually setup local environment.
2. Automatically setup [devcontainer](https://containers.dev).
For both options the only supported architecture is `linux-amd64`. Bug reports for other environments will generally be ignored. Supporting new environments requires introduction of proper tests and mainter support that is currently lacking in the etcd project.
If you would like etcd to support your preferred environment you can [file an issue].
### Option 1 - Manually setup local environment
This is the original etcd development environment, is most supported and is backwards compatible for development of older etcd versions.
Follow the steps below to setup the environment:
- [Clone the repository](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository)
- Install Go by following [installation](https://go.dev/doc/install). Please check minimal go version in [go.mod file](./go.mod#L3).
- Install build tools (`make`):
- For debian based distributions you can run `sudo apt-get install build-essential`
- Verify that everything is installed by running `make build`
Note: `make build` runs with `-v`. Other build flags can be added through env `GO_BUILD_FLAGS`, **if required**. Eg.,
```console
GO_BUILD_FLAGS="-buildmode=pie" make build
```
### Option 2 - Automatically setup devcontainer
This is a more recently added environmnent that aims to make it faster for new contributors to get started with etcd. This option is supported for etcd versions 3.6 onwards.
This option can be [used locally](https://code.visualstudio.com/docs/devcontainers/tutorial) on a system running Visual Studio Code and Docker, or in a remote cloud based [Codespaces](https://github.com/features/codespaces) environment.
To get started, create a codespace for this repository by clicking this 👇
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=11225014)
A codespace will open in a web-based version of Visual Studio Code. The [dev container](.devcontainer/devcontainer.json) is fully configured with software needed for this project.
**Note**: Dev containers is an open spec which is supported by [GitHub Codespaces](https://github.com/codespaces) and [other tools](https://containers.dev/supporting).
[file an issue]: https://github.com/etcd-io/etcd/issues/new/choose
## Implement your change
etcd code should follow coding style suggested by the Golang community.
See the [style doc](https://github.com/golang/go/wiki/CodeReviewComments) for details.
Please ensure that your change passes static analysis (requires [golangci-lint](https://golangci-lint.run/usage/install/)):
- `make verify` to verify if all checks pass.
- `make verify-*` to verify a single check, for example `make verify-bom` to verify if bill-of-materials.json file is up-to-date.
- `make fix` to fix all checks.
- `make fix-*` to fix a single checks, for example `make fix-bom` to update bill-of-materials.json.
Please ensure that your change passes tests.
- `make test-unit` to run unit tests.
- `make test-integration` to run integration tests.
- `make test-e2e` to run e2e tests.
All changes are expected to come with unit test.
All new features are expected to have either e2e or integration tests.
## Commit your change
etcd follows a rough convention for commit messages:
* First line:
* Should start name of package (for example `etcdserver`, `etcdctl`) followed by `:` character.
* Describe the `what` behind the change
* Optionally author might provide the `why` behind the change in the main commit message body.
* Last line should be `Signed-off-by: firstname lastname <email@example.com>` (can be automatically generate by providing `--signoff` to git commit command).
Example of commit message:
```
etcdserver: add grpc interceptor to log info on incoming requests
@ -125,24 +50,44 @@ remote client info, request content (with value field redacted), request
handling latency, response size, etc. Uses zap logger if available,
otherwise uses capnslog.
Signed-off-by: FirstName LastName <github@github.com>
Fixes #38
```
## Create a pull request
The format can be described more formally as follows:
Please follow [making a pull request](https://docs.github.com/en/get-started/quickstart/contributing-to-projects#making-a-pull-request) guide.
```
<package>: <what changed>
<BLANK LINE>
<why this change was made>
<BLANK LINE>
<footer>
```
If you are still working on the pull request, you can convert it to draft by clicking `Convert to draft` link just below list of reviewers.
The first line is the subject and should be no longer than 70 characters, the second
line is always blank, and other lines should be wrapped at 80 characters. This allows
the message to be easier to read on GitHub as well as in various git tools.
Multiple small PRs are preferred over single large ones (>500 lines of code).
### Pull request across multiple files and packages
## Get your pull request reviewed
If multiple files in a package are changed in a pull request for example:
Before requesting review please ensure that all GitHub checks were successful.
It might happen that some unrelated tests on your PR are failing, due to their flakiness.
In such cases please [file an issue] to deflake the problematic test and ask one of [maintainers] to rerun the tests.
```
etcdserver/config.go
etcdserver/corrupt.go
```
If all checks were successful feel free to reach out for review from people that were involved in the original discussion or [maintainers].
Depending on complexity of the PR it might require between 1 and 2 maintainers to approve your change before merging.
At the end of the review process if multiple commits exist for a single package they
should be squashed/rebased into a single commit before being merged.
Thanks for contributing!
```
etcdserver: <what changed>
[..]
```
If a pull request spans many packages these commits should be squashed/rebased into a single
commit using message with a more generic `*:` prefix.
```
*: <what changed>
[..]
```

View File

@ -1,14 +0,0 @@
ARG ARCH=amd64
FROM --platform=linux/${ARCH} gcr.io/distroless/static-debian11
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/
ADD etcdutl /usr/local/bin/
WORKDIR /var/etcd/
WORKDIR /var/lib/etcd/
EXPOSE 2379 2380
# Define default command.
CMD ["/usr/local/bin/etcd"]

12
Dockerfile-release Normal file
View File

@ -0,0 +1,12 @@
FROM --platform=linux/amd64 gcr.io/distroless/static-debian11
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/
WORKDIR /var/etcd/
WORKDIR /var/lib/etcd/
EXPOSE 2379 2380
# Define default command.
CMD ["/usr/local/bin/etcd"]

12
Dockerfile-release.arm64 Normal file
View File

@ -0,0 +1,12 @@
FROM --platform=linux/arm64 gcr.io/distroless/static-debian11
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/
WORKDIR /var/etcd/
WORKDIR /var/lib/etcd/
EXPOSE 2379 2380
# Define default command.
CMD ["/usr/local/bin/etcd"]

View File

@ -0,0 +1,12 @@
FROM --platform=linux/ppc64le gcr.io/distroless/static-debian11
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/
WORKDIR /var/etcd/
WORKDIR /var/lib/etcd/
EXPOSE 2379 2380
# Define default command.
CMD ["/usr/local/bin/etcd"]

View File

@ -1,4 +0,0 @@
This directory includes etcd project internal documentation for new and existing contributors.
For user and developer documentation please go to [etcd.io](https://etcd.io/),
which is developed in [website](https://github.com/etcd-io/website/) repo.

1
Documentation/README.md Symbolic link
View File

@ -0,0 +1 @@
docs.md

View File

@ -0,0 +1,18 @@
# Benchmarks
etcd benchmarks will be published regularly and tracked for each release below:
- [etcd v2.1.0-alpha][2.1]
- [etcd v2.2.0-rc][2.2]
- [etcd v3 demo][3.0]
# Memory Usage Benchmarks
It records expected memory usage in different scenarios.
- [etcd v2.2.0-rc][2.2-mem]
[2.1]: etcd-2-1-0-alpha-benchmarks.md
[2.2]: etcd-2-2-0-rc-benchmarks.md
[2.2-mem]: etcd-2-2-0-rc-memory-benchmarks.md
[3.0]: etcd-3-demo-benchmarks.md

View File

@ -0,0 +1,56 @@
---
title: Benchmarking etcd v2.1.0
---
## Physical machines
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
- etcd version 2.1.0 alpha
## etcd Cluster
3 etcd members, each runs on a single machine
## Testing
Bootstrap another machine and use the [hey HTTP benchmark tool][hey] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
## Performance
### reading one single key
| key size in bytes | number of clients | target etcd server | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|----------|---------------|
| 64 | 1 | leader only | 1534 | 0.7 |
| 64 | 64 | leader only | 10125 | 9.1 |
| 64 | 256 | leader only | 13892 | 27.1 |
| 256 | 1 | leader only | 1530 | 0.8 |
| 256 | 64 | leader only | 10106 | 10.1 |
| 256 | 256 | leader only | 14667 | 27.0 |
| 64 | 64 | all servers | 24200 | 3.9 |
| 64 | 256 | all servers | 33300 | 11.8 |
| 256 | 64 | all servers | 24800 | 3.9 |
| 256 | 256 | all servers | 33000 | 11.5 |
### writing one single key
| key size in bytes | number of clients | target etcd server | write QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|-----------|---------------|
| 64 | 1 | leader only | 60 | 21.4 |
| 64 | 64 | leader only | 1742 | 46.8 |
| 64 | 256 | leader only | 3982 | 90.5 |
| 256 | 1 | leader only | 58 | 20.3 |
| 256 | 64 | leader only | 1770 | 47.8 |
| 256 | 256 | leader only | 4157 | 105.3 |
| 64 | 64 | all servers | 1028 | 123.4 |
| 64 | 256 | all servers | 3260 | 123.8 |
| 256 | 64 | all servers | 1033 | 121.5 |
| 256 | 256 | all servers | 3061 | 119.3 |
[hey]: https://github.com/rakyll/hey
[hack-benchmark]: https://github.com/coreos/etcd/tree/master/hack/benchmark

View File

@ -0,0 +1,71 @@
---
title: Benchmarking etcd v2.2.0
---
## Physical Machines
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted as etcd data directory
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
## etcd Cluster
3 etcd 2.2.0 members, each runs on a single machine.
Detailed versions:
```
etcd Version: 2.2.0
Git SHA: e4561dd
Go Version: go1.5
Go OS/Arch: linux/amd64
```
## Testing
Bootstrap another machine, outside of the etcd cluster, and run the [`hey` HTTP benchmark tool](https://github.com/rakyll/hey) with a connection reuse patch to send requests to each etcd cluster member. See the [benchmark instructions](../../hack/benchmark/) for the patch and the steps to reproduce our procedures.
The performance is calculated through results of 100 benchmark rounds.
## Performance
### Single Key Read Performance
| key size in bytes | number of clients | target etcd server | average read QPS | read QPS stddev | average 90th Percentile Latency (ms) | latency stddev |
|-------------------|-------------------|--------------------|------------------|-----------------|--------------------------------------|----------------|
| 64 | 1 | leader only | 2303 | 200 | 0.49 | 0.06 |
| 64 | 64 | leader only | 15048 | 685 | 7.60 | 0.46 |
| 64 | 256 | leader only | 14508 | 434 | 29.76 | 1.05 |
| 256 | 1 | leader only | 2162 | 214 | 0.52 | 0.06 |
| 256 | 64 | leader only | 14789 | 792 | 7.69| 0.48 |
| 256 | 256 | leader only | 14424 | 512 | 29.92 | 1.42 |
| 64 | 64 | all servers | 45752 | 2048 | 2.47 | 0.14 |
| 64 | 256 | all servers | 46592 | 1273 | 10.14 | 0.59 |
| 256 | 64 | all servers | 45332 | 1847 | 2.48| 0.12 |
| 256 | 256 | all servers | 46485 | 1340 | 10.18 | 0.74 |
### Single Key Write Performance
| key size in bytes | number of clients | target etcd server | average write QPS | write QPS stddev | average 90th Percentile Latency (ms) | latency stddev |
|-------------------|-------------------|--------------------|------------------|-----------------|--------------------------------------|----------------|
| 64 | 1 | leader only | 55 | 4 | 24.51 | 13.26 |
| 64 | 64 | leader only | 2139 | 125 | 35.23 | 3.40 |
| 64 | 256 | leader only | 4581 | 581 | 70.53 | 10.22 |
| 256 | 1 | leader only | 56 | 4 | 22.37| 4.33 |
| 256 | 64 | leader only | 2052 | 151 | 36.83 | 4.20 |
| 256 | 256 | leader only | 4442 | 560 | 71.59 | 10.03 |
| 64 | 64 | all servers | 1625 | 85 | 58.51 | 5.14 |
| 64 | 256 | all servers | 4461 | 298 | 89.47 | 36.48 |
| 256 | 64 | all servers | 1599 | 94 | 60.11| 6.43 |
| 256 | 256 | all servers | 4315 | 193 | 88.98 | 7.01 |
## Performance Changes
- Because etcd now records metrics for each API call, read QPS performance seems to see a minor decrease in most scenarios. This minimal performance impact was judged a reasonable investment for the breadth of monitoring and debugging information returned.
- Write QPS to cluster leaders seems to be increased by a small margin. This is because the main loop and entry apply loops were decoupled in the etcd raft logic, eliminating several blocks between them.
- Write QPS to all members seems to be increased by a significant margin, because followers now receive the latest commit index sooner, and commit proposals more quickly.

View File

@ -0,0 +1,76 @@
---
title: Benchmarking etcd v2.2.0-rc
---
## Physical machine
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
## etcd Cluster
3 etcd 2.2.0-rc members, each runs on a single machine.
Detailed versions:
```
etcd Version: 2.2.0-alpha.1+git
Git SHA: 59a5a7e
Go Version: go1.4.2
Go OS/Arch: linux/amd64
```
Also, we use 3 etcd 2.1.0 alpha-stage members to form cluster to get base performance. etcd's commit head is at [c7146bd5][c7146bd5], which is the same as the one that we use in [etcd 2.1 benchmark][etcd-2.1-benchmark].
## Testing
Bootstrap another machine and use the [hey HTTP benchmark tool][hey] to send requests to each etcd member. Check the [benchmark hacking guide][hack-benchmark] for detailed instructions.
## Performance
### reading one single key
| key size in bytes | number of clients | target etcd server | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|----------|---------------|
| 64 | 1 | leader only | 2804 (-5%) | 0.4 (+0%) |
| 64 | 64 | leader only | 17816 (+0%) | 5.7 (-6%) |
| 64 | 256 | leader only | 18667 (-6%) | 20.4 (+2%) |
| 256 | 1 | leader only | 2181 (-15%) | 0.5 (+25%) |
| 256 | 64 | leader only | 17435 (-7%) | 6.0 (+9%) |
| 256 | 256 | leader only | 18180 (-8%) | 21.3 (+3%) |
| 64 | 64 | all servers | 46965 (-4%) | 2.1 (+0%) |
| 64 | 256 | all servers | 55286 (-6%) | 7.4 (+6%) |
| 256 | 64 | all servers | 46603 (-6%) | 2.1 (+5%) |
| 256 | 256 | all servers | 55291 (-6%) | 7.3 (+4%) |
### writing one single key
| key size in bytes | number of clients | target etcd server | write QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|-----------|---------------|
| 64 | 1 | leader only | 76 (+22%) | 19.4 (-15%) |
| 64 | 64 | leader only | 2461 (+45%) | 31.8 (-32%) |
| 64 | 256 | leader only | 4275 (+1%) | 69.6 (-10%) |
| 256 | 1 | leader only | 64 (+20%) | 16.7 (-30%) |
| 256 | 64 | leader only | 2385 (+30%) | 31.5 (-19%) |
| 256 | 256 | leader only | 4353 (-3%) | 74.0 (+9%) |
| 64 | 64 | all servers | 2005 (+81%) | 49.8 (-55%) |
| 64 | 256 | all servers | 4868 (+35%) | 81.5 (-40%) |
| 256 | 64 | all servers | 1925 (+72%) | 47.7 (-59%) |
| 256 | 256 | all servers | 4975 (+36%) | 70.3 (-36%) |
### performance changes explanation
- read QPS in most scenarios is decreased by 5~8%. The reason is that etcd records store metrics for each store operation. The metrics is important for monitoring and debugging, so this is acceptable.
- write QPS to leader is increased by 20~30%. This is because we decouple raft main loop and entry apply loop, which avoids them blocking each other.
- write QPS to all servers is increased by 30~80% because follower could receive latest commit index earlier and commit proposals faster.
[hey]: https://github.com/rakyll/hey
[c7146bd5]: https://github.com/coreos/etcd/commits/c7146bd5f2c73716091262edc638401bb8229144
[etcd-2.1-benchmark]: etcd-2-1-0-alpha-benchmarks.md
[hack-benchmark]: ../../hack/benchmark/

View File

@ -0,0 +1,51 @@
---
title: Benchmarking etcd v2.2.0-rc-memory
---
## Physical machine
GCE n1-standard-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 7.5 GB memory
- 2x CPUs
## etcd
```
etcd Version: 2.2.0-rc.0+git
Git SHA: 103cb5c
Go Version: go1.5
Go OS/Arch: linux/amd64
```
## Testing
Start 3-member etcd cluster, each of which uses 2 cores.
The length of key name is always 64 bytes, which is a reasonable length of average key bytes.
## Memory Maximal Usage
- etcd may use maximal memory if one follower is dead and the leader keeps sending snapshots.
- `max RSS` is the maximal memory usage recorded in 3 runs.
| value bytes | key number | data size(MB) | max RSS(MB) | max RSS/data rate on leader |
|-------------|-------------|---------------|-------------|-----------------------------|
| 128 | 50000 | 6 | 433 | 72x |
| 128 | 100000 | 12 | 659 | 54x |
| 128 | 200000 | 24 | 1466 | 61x |
| 1024 | 50000 | 48 | 1253 | 26x |
| 1024 | 100000 | 96 | 2344 | 24x |
| 1024 | 200000 | 192 | 4361 | 22x |
## Data Size Threshold
- When etcd reaches data size threshold, it may trigger leader election easily and drop part of proposals.
- For most cases, the etcd cluster should work smoothly if it doesn't hit the threshold. If it doesn't work well due to insufficient resources, decrease its data size.
| value bytes | key number limitation | suggested data size threshold(MB) | consumed RSS(MB) |
|-------------|-----------------------|-----------------------------------|------------------|
| 128 | 400K | 48 | 2400 |
| 1024 | 300K | 292 | 6500 |

View File

@ -0,0 +1,46 @@
---
title: Benchmarking etcd v3
---
## Physical machines
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
- etcd version 2.2.0
## etcd Cluster
1 etcd member running in v3 demo mode
## Testing
Use [etcd v3 benchmark tool][etcd-v3-benchmark].
## Performance
### reading one single key
| key size in bytes | number of clients | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|----------|---------------|
| 256 | 1 | 2716 | 0.4 |
| 256 | 64 | 16623 | 6.1 |
| 256 | 256 | 16622 | 21.7 |
The performance is nearly the same as the one with empty server handler.
### reading one single key after putting
| key size in bytes | number of clients | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|----------|---------------|
| 256 | 1 | 2269 | 0.5 |
| 256 | 64 | 13582 | 8.6 |
| 256 | 256 | 13262 | 47.5 |
The performance with empty server handler is not affected by one put. So the
performance downgrade should be caused by storage package.
[etcd-v3-benchmark]: ../../tools/benchmark/

View File

@ -0,0 +1,79 @@
---
title: Watch Memory Usage Benchmark
---
*NOTE*: The watch features are under active development, and their memory usage may change as that development progresses. We do not expect it to significantly increase beyond the figures stated below.
A primary goal of etcd is supporting a very large number of watchers doing a massively large amount of watching. etcd aims to support O(10k) clients, O(100K) watch streams (O(10) streams per client) and O(10M) total watchings (O(100) watching per stream). The memory consumed by each individual watching accounts for the largest portion of etcd's overall usage, and is therefore the focus of current and future optimizations.
Three related components of etcd watch consume physical memory: each `grpc.Conn`, each watch stream, and each instance of the watching activity. `grpc.Conn` maintains the actual TCP connection and other gRPC connection state. Each `grpc.Conn` consumes O(10kb) of memory, and might have multiple watch streams attached.
Each watch stream is an independent HTTP2 connection which consumes another O(10kb) of memory.
Multiple watchings might share one watch stream.
Watching is the actual struct that tracks the changes on the key-value store. Each watching should only consume < O(1kb).
```
+-------+
| watch |
+---------> | foo |
| +-------+
+------+-----+
| stream |
+--------------> | |
| +------+-----+ +-------+
| | | watch |
| +---------> | bar |
+-----+------+ +-------+
| | +------------+
| conn +-------> | stream |
| | | |
+-----+------+ +------------+
|
|
|
| +------------+
+--------------> | stream |
| |
+------------+
```
The theoretical memory consumption of watch can be approximated with the formula:
`memory = c1 * number_of_conn + c2 * avg_number_of_stream_per_conn + c3 * avg_number_of_watch_stream`
## Testing Environment
etcd version
- git head https://github.com/coreos/etcd/commit/185097ffaa627b909007e772c175e8fefac17af3
GCE n1-standard-2 machine type
- 7.5 GB memory
- 2x CPUs
## Overall memory usage
The overall memory usage captures how much [RSS][rss] etcd consumes with the client watchers. While the result may vary by as much as 10%, it is still meaningful, since the goal is to learn about the rough memory usage and the pattern of allocations.
With the benchmark result, we can calculate roughly that `c1 = 17kb`, `c2 = 18kb` and `c3 = 350bytes`. So each additional client connection consumes 17kb of memory and each additional stream consumes 18kb of memory, and each additional watching only cause 350bytes. A single etcd server can maintain millions of watchings with a few GB of memory in normal case.
| clients | streams per client | watchings per stream | total watching | memory usage |
|---------|---------|-----------|----------------|--------------|
| 1k | 1 | 1 | 1k | 50MB |
| 2k | 1 | 1 | 2k | 90MB |
| 5k | 1 | 1 | 5k | 200MB |
| 1k | 10 | 1 | 10k | 217MB |
| 2k | 10 | 1 | 20k | 417MB |
| 5k | 10 | 1 | 50k | 980MB |
| 1k | 50 | 1 | 50k | 1001MB |
| 2k | 50 | 1 | 100k | 1960MB |
| 5k | 50 | 1 | 250k | 4700MB |
| 1k | 50 | 10 | 500k | 1171MB |
| 2k | 50 | 10 | 1M | 2371MB |
| 5k | 50 | 10 | 2.5M | 5710MB |
| 1k | 50 | 100 | 5M | 2380MB |
| 2k | 50 | 100 | 10M | 4672MB |
| 5k | 50 | 100 | 25M | *OOM* |
[rss]: https://en.wikipedia.org/wiki/Resident_set_size

View File

@ -0,0 +1,100 @@
---
title: Storage Memory Usage Benchmark
---
<!---todo: link storage to storage design doc-->
Two components of etcd storage consume physical memory. The etcd process allocates an *in-memory index* to speed key lookup. The process's *page cache*, managed by the operating system, stores recently-accessed data from disk for quick re-use.
The in-memory index holds all the keys in a [B-tree][btree] data structure, along with pointers to the on-disk data (the values). Each key in the B-tree may contain multiple pointers, pointing to different versions of its values. The theoretical memory consumption of the in-memory index can hence be approximated with the formula:
`N * (c1 + avg_key_size) + N * (avg_versions_of_key) * (c2 + size_of_pointer)`
where `c1` is the key metadata overhead and `c2` is the version metadata overhead.
The graph shows the detailed structure of the in-memory index B-tree.
```
In mem index
+------------+
| key || ... |
+--------------+ | || |
| | +------------+
| | | v1 || ... |
| disk <----------------| || | Tree Node
| | +------------+
| | | v2 || ... |
| <----------------+ || |
| | +------------+
+--------------+ +-----+ | | |
| | | | |
| +------------+
|
|
^
------+
| ... |
| |
+-----+
| ... | Tree Node
| |
+-----+
| ... |
| |
------+
```
[Page cache memory][pagecache] is managed by the operating system and is not covered in detail in this document.
## Testing Environment
etcd version
- git head https://github.com/coreos/etcd/commit/776e9fb7be7eee5e6b58ab977c8887b4fe4d48db
GCE n1-standard-2 machine type
- 7.5 GB memory
- 2x CPUs
## In-memory index memory usage
In this test, we only benchmark the memory usage of the in-memory index. The goal is to find `c1` and `c2` mentioned above and to understand the hard limit of memory consumption of the storage.
We calculate the memory usage consumption via the Go runtime.ReadMemStats. We calculate the total allocated bytes difference before creating the index and after creating the index. It cannot perfectly reflect the memory usage of the in-memory index itself but can show the rough consumption pattern.
| N | versions | key size | memory usage |
|------|----------|----------|--------------|
| 100K | 1 | 64bytes | 22MB |
| 100K | 5 | 64bytes | 39MB |
| 1M | 1 | 64bytes | 218MB |
| 1M | 5 | 64bytes | 432MB |
| 100K | 1 | 256bytes | 41MB |
| 100K | 5 | 256bytes | 65MB |
| 1M | 1 | 256bytes | 409MB |
| 1M | 5 | 256bytes | 506MB |
Based on the result, we can calculate `c1=120bytes`, `c2=30bytes`. We only need two sets of data to calculate `c1` and `c2`, since they are the only unknown variable in the formula. The `c1=120bytes` and `c2=30bytes` are the average value of the 4 sets of `c1` and `c2` we calculated. The key metadata overhead is still relatively nontrivial (50%) for small key-value pairs. However, this is a significant improvement over the old store, which had at least 1000% overhead.
## Overall memory usage
The overall memory usage captures how much RSS etcd consumes with the storage. The value size should have very little impact on the overall memory usage of etcd, since we keep values on disk and only retain hot values in memory, managed by the OS page cache.
| N | versions | key size | value size | memory usage |
|------|----------|----------|------------|--------------|
| 100K | 1 | 64bytes | 256bytes | 40MB |
| 100K | 5 | 64bytes | 256bytes | 89MB |
| 1M | 1 | 64bytes | 256bytes | 470MB |
| 1M | 5 | 64bytes | 256bytes | 880MB |
| 100K | 1 | 64bytes | 1KB | 102MB |
| 100K | 5 | 64bytes | 1KB | 164MB |
| 1M | 1 | 64bytes | 1KB | 587MB |
| 1M | 5 | 64bytes | 1KB | 836MB |
Based on the result, we know the value size does not significantly impact the memory consumption. There is some minor increase due to more data held in the OS page cache.
[btree]: https://en.wikipedia.org/wiki/B-tree
[pagecache]: https://en.wikipedia.org/wiki/Page_cache

View File

@ -0,0 +1,28 @@
---
title: Branch management
---
## Guide
* New development occurs on the [master branch][master].
* Master branch should always have a green build!
* Backwards-compatible bug fixes should target the master branch and subsequently be ported to stable branches.
* Once the master branch is ready for release, it will be tagged and become the new stable branch.
The etcd team has adopted a *rolling release model* and supports two stable versions of etcd.
### Master branch
The `master` branch is our development branch. All new features land here first.
To try new and experimental features, pull `master` and play with it. Note that `master` may not be stable because new features may introduce bugs.
Before the release of the next stable version, feature PRs will be frozen. A [release manager](./dev-internal/release.md#release-management) will be assigned to major/minor version and will lead the etcd community in test, bug-fix and documentation of the release for one to two weeks.
### Stable branches
All branches with prefix `release-` are considered _stable_ branches.
After every minor release (http://semver.org/), we will have a new stable branch for that release, managed by a [patch release manager](./dev-internal/release.md#release-management). We will keep fixing the backwards-compatible bugs for the latest two stable releases. A _patch_ release to each supported release branch, incorporating any bug fixes, will be once every two weeks, given any patches.
[master]: https://github.com/coreos/etcd/tree/master

View File

@ -1,27 +0,0 @@
# Branch management
## Guide
* New development occurs on the [main branch][main].
* Main branch should always have a green build!
* Backwards-compatible bug fixes should target the main branch and subsequently be ported to stable branches.
* Once the main branch is ready for release, it will be tagged and become the new stable branch.
The etcd team has adopted a *rolling release model* and supports two stable versions of etcd.
### Main branch
The `main` branch is our development branch. All new features land here first.
To try new and experimental features, pull `main` and play with it. Note that `main` may not be stable because new features may introduce bugs.
Before the release of the next stable version, feature PRs will be frozen. A [release manager](./release.md/#release-management) will be assigned to major/minor version and will lead the etcd community in test, bug-fix and documentation of the release for one to two weeks.
### Stable branches
All branches with prefix `release-` are considered _stable_ branches.
After every minor release ([semver.org](https://semver.org/)), we will have a new stable branch for that release, managed by a [patch release manager](./release.md/#release-management). We will keep fixing the backwards-compatible bugs for the latest two stable releases. A _patch_ release to each supported release branch, incorporating any bug fixes, will be once every two weeks, given any patches.
[main]: https://github.com/etcd-io/etcd/tree/main

View File

@ -1,168 +0,0 @@
# Community membership
This doc outlines the various responsibilities of contributor roles in etcd.
| Role | Responsibilities | Requirements | Defined by |
|------------|----------------------------------------------|---------------------------------------------------------------|--------------------------------------|
| Member | Active contributor in the community | Sponsored by 2 reviewers and multiple contributions | etcd GitHub org member |
| Reviewer | Review contributions from other members | History of review and authorship | [MAINTAINERS] file reviewer entry |
| Maintainer | Set direction and priorities for the project | Demonstrated responsibility and excellent technical judgement | [MAINTAINERS] file maintainers entry |
## New contributors
New contributors should be welcomed to the community by existing members,
helped with PR workflow, and directed to relevant documentation and
communication channels.
## Established community members
Established community members are expected to demonstrate their adherence to the
principles in this document, familiarity with project organization, roles,
policies, procedures, conventions, etc., and technical and/or writing ability.
Role-specific expectations, responsibilities, and requirements are enumerated
below.
## Member
Members are continuously active contributors in the community. They can have
issues and PRs assigned to them. Members are expected to remain active
contributors to the community.
**Defined by:** Member of the etcd GitHub organization.
### Requirements
- Enabled [two-factor authentication] on their GitHub account
- Have made multiple contributions to the project or community. Contribution may include, but is not limited to:
- Authoring or reviewing PRs on GitHub. At least one PR must be **merged**.
- Filing or commenting on issues on GitHub
- Contributing to community discussions (e.g. meetings, Slack, email discussion
forums, Stack Overflow)
- Subscribed to [etcd-dev@googlegroups.com]
- Have read the [contributor guide]
- Sponsored by one active maintainer or two reviewers.
- Sponsors must be from multiple member companies to demonstrate integration across community.
- With no objections from other maintainers
- Open a [membership nomination] issue against the etcd-io/etcd repo
- Ensure your sponsors are @mentioned on the issue
- Make sure that the list of contributions included is representative of your work on the project.
- Members can be removed by a supermajority of the maintainers or can resign by notifying
the maintainers.
### Responsibilities and privileges
- Responsive to issues and PRs assigned to them
- Granted "triage access" to etcd project
- Active owner of code they have contributed (unless ownership is explicitly transferred)
- Code is well tested
- Tests consistently pass
- Addresses bugs or issues discovered after code is accepted
**Note:** members who frequently contribute code are expected to proactively
perform code reviews and work towards becoming a *reviewer*.
## Reviewers
Reviewers are contributors who have demonstrated greater skill in
reviewing the code from other contributors. They are knowledgeable about both
the codebase and software engineering principles. Their LGTM counts towards
merging a code change into the project. A reviewer is generally on the ladder towards
maintainership.
**Defined by:** *reviewers* entry in the [MAINTAINERS] file.
### Requirements
- member for at least 3 months.
- Primary reviewer for at least 5 PRs to the codebase.
- Reviewed or contributed at least 20 substantial PRs to the codebase.
- Knowledgeable about the codebase.
- Sponsored by two active maintainers.
- Sponsors must be from multiple member companies to demonstrate integration across community.
- With no objections from other maintainers
- Reviewers can be removed by a supermajority of the maintainers or can resign by notifying
the maintainers.
### Responsibilities and privileges
- Code reviewer status may be a precondition to accepting large code contributions
- Responsible for project quality control via code reviews
- Focus on code quality and correctness, including testing and factoring
- May also review for more holistic issues, but not a requirement
- Expected to be responsive to review requests
- Assigned PRs to review related to area of expertise
- Assigned test bugs related to area of expertise
- Granted "triage access" to etcd project
## Maintainers
Maintainers are first and foremost contributors that have shown they
are committed to the long term success of a project. Maintainership is about building
trust with the current maintainers and being a person that they can
depend on to make decisions in the best interest of the project in a consistent manner.
**Defined by:** *maintainers* entry in the [MAINTAINERS] file.
### Requirements
- Deep understanding of the technical goals and direction of the project
- Deep understanding of the technical domain of the project
- Sustained contributions to design and direction by doing all of:
- Authoring and reviewing proposals
- Initiating, contributing and resolving discussions (emails, GitHub issues, meetings)
- Identifying subtle or complex issues in designs and implementation PRs
- Directly contributed to the project through implementation and / or review
- Sponsored by two active maintainers and elected by supermajority
- Sponsors must be from multiple member companies to demonstrate integration across community.
- To become a maintainer send an email with your candidacy to [etcd-maintainers-private@googlegroups.com]
- Ensure your sponsors are @mentioned on the email
- Include a list of contributions representative of your work on the project.
- Existing maintainers vote will privately and respond to the email with either acceptance or with feedback for suggested improvement.
- With your membership approved you are expected to:
- Open a PR and add an entry to the [MAINTAINERS] file
- Subscribe to [etcd-maintainers@googlegroups.com] and [etcd-maintainers-private@googlegroups.com]
- Request to join to [etcd-maintainer teams of etcd organization of GitHub](https://github.com/orgs/etcd-io/teams/maintainers-etcd)
- Request to join to the private slack channel for etcd maintainers on [kubernetes slack](http://slack.kubernetes.io/)
- Request access to etcd-development GCP project where we publish releases
- Request access to passwords shared between maintainers
### Responsibilities and privileges
- Make and approve technical design decisions
- Set technical direction and priorities
- Define milestones and releases
- Mentor and guide reviewers, and contributors to the project.
- Participate when called upon in the [security disclosure and release process]
- Ensure continued health of the project
- Adequate test coverage to confidently release
- Tests are passing reliably (i.e. not flaky) and are fixed when they fail
- Ensure a healthy process for discussion and decision making is in place.
- Work with other maintainers to maintain the project's overall health and success holistically
### Retiring
Life priorities, interests, and passions can change. Maintainers can retire and
move to [emeritus maintainers]. If a maintainer needs to step down, they should
inform other maintainers, if possible, help find someone to pick up the related
work. At the very least, ensure the related work can be continued. Afterward
they can remove themselves from list of existing maintainers.
If a maintainer has not been performing their duties for period of 12 months,
they can be removed by other maintainers. In that case inactive maintainer will
be first notified via an email. If situation doesn't improve, they will be
removed. If an emeritus maintainer wants to regain an active role, they can do
so by renewing their contributions. Active maintainers should welcome such a move.
Retiring of other maintainers or regaining the status should require approval
of at least two active maintainers.
## Acknowledgements
Contributor roles and responsibilities were written based on [Kubernetes community membership]
[MAINTAINERS]: /MAINTAINERS
[contributor guide]: /CONTRIBUTING.md
[membership nomination]:https://github.com/etcd-io/etcd/issues/new?assignees=&labels=area%2Fcommunity&template=membership-request.yml
[Kubernetes community membership]: https://github.com/kubernetes/community/blob/master/community-membership.md
[emeritus maintainers]: /README.md#etcd-emeritus-maintainers
[security disclosure and release process]: /security/README.md
[two-factor authentication]: https://docs.github.com/en/authentication/securing-your-account-with-two-factor-authentication-2fa/about-two-factor-authentication

View File

@ -1,102 +0,0 @@
Dependency management
======
# Table of Contents
- **[Main branch](#main-branch)**
- [Dependencies used in workflows](#dependencies-used-in-workflows)
- [Bumping order](#bumping-order)
- [Steps to bump a dependency](#steps-to-bump-a-dependency)
- [Indirect dependencies](#indirect-dependencies)
- [About gRPC](#about-grpc)
- [Rotation worksheet](#rotation-worksheet)
- **[Stable branches](#stable-branches)**
# Main branch
The dependabot is enabled & [configured](https://github.com/etcd-io/etcd/blob/main/.github/dependabot.yml) to
manage dependencies for etcd `main` branch. But dependabot doesn't work well for multi-module repository like `etcd`,
see [dependabot-core/issues/6678](https://github.com/dependabot/dependabot-core/issues/6678).
Usually human intervention is required each time when dependabot automatically opens some PRs to bump dependencies.
Please see guidance below.
## Dependencies used in workflows
The PRs which automatically bump dependencies (see examples below) used in workflows are fine, and can be approved & merged directly as long as all checks are successful.
- [build(deps): bump github/codeql-action from 2.2.11 to 2.2.12](https://github.com/etcd-io/etcd/pull/15736)
- [build(deps): bump actions/checkout from 3.5.0 to 3.5.2](https://github.com/etcd-io/etcd/pull/15735)
- [build(deps): bump ossf/scorecard-action from 2.1.2 to 2.1.3](https://github.com/etcd-io/etcd/pull/15607)
## Bumping order
When multiple etcd modules depend on the same package, please bump the package version for all the modules in the correct order. The rule is simple:
if module A depends on module B, then bump the dependency for module B before module A. If the two modules do not depend on each other, then
it doesn't matter to bump which module first. For example, multiple modules depend on `github.com/spf13/cobra`, we need to bump the dependency
in the following order,
- go.etcd.io/etcd/pkg/v3
- go.etcd.io/etcd/server/v3
- go.etcd.io/etcd/etcdctl/v3
- go.etcd.io/etcd/etcdutl/v3
- go.etcd.io/etcd/tests/v3
- go.etcd.io/etcd/v3
- go.etcd.io/etcd/tools/v3
For more details about etcd Golang modules, please check https://etcd.io/docs/next/dev-internal/modules/
Note the module `go.etcd.io/etcd/tools/v3` doesn't depend on any other modules, nor by any other modules, so it doesn't matter when to bump dependencies for it.
## Steps to bump a dependency
Use the `github.com/spf13/cobra` as an example, follow steps below to bump it from 1.6.1 to 1.7.0 for module `go.etcd.io/etcd/etcdctl/v3`,
```
$ cd ${ETCD_ROOT_DIR}/etcdctl
$ go get github.com/spf13/cobra@v1.7.0
$ go mod tidy
$ cd ..
$ ./scripts/fix.sh
```
Execute the same steps for all other modules. When you finish bumping the dependency for all modules, then commit the change,
```
$ git add .
$ git commit --signoff -m "dependency: bump github.com/spf13/cobra from 1.6.1 to 1.7.0"
```
Please close the related PRs which were automatically opened by dependabot.
When you bump multiple dependencies in one PR, it's recommended to create a separate commit for each dependency. But it isn't a must; for example,
you can get all dependencies bumping for the module `go.etcd.io/etcd/tools/v3` included in one commit.
## Indirect dependencies
Usually we don't bump a dependency if all modules just indirectly depend on it, such as `github.com/go-logr/logr`.
If an indirect dependency (e.g. `D1`) causes any CVE or bugs which affect etcd, usually the module (e.g. `M1`, not part of etcd, but used by etcd)
which depends on it should bump the dependency (`D1`), and then etcd just needs to bump `M1`. However, if the module (`M1`) somehow doesn't
bump the problematic dependency, then etcd can still bump it (`D1`) directly following the same steps above. But as a long-term solution, etcd should
try to remove the dependency on such module (`M1`) that lack maintenance.
For mixed cases, in which some modules directly while others indirectly depend on a dependency, we have multiple options,
- Bump the dependency for all modules, no matter it's direct or indirect dependency.
- Bump the dependency only for modules which directly depend on it.
We should try to follow the first way, and temporarily fall back to the second one if we run into any issue on the first way. Eventually we
should fix the issue and ensure all modules depend on the same version of the dependency.
## Known incompatible dependency updates
### arduino/setup-protoc
Please refer to [build(deps): bump arduino/setup-protoc from 1.3.0 to 2.0.0](https://github.com/etcd-io/etcd/pull/16016)
### About gRPC
There is a compatible issue between etcd and gRPC 1.52.0, and there is a pending PR [pull/15131](https://github.com/etcd-io/etcd/pull/15131).
The plan is to remove the dependency on some grpc-go's experimental API firstly, afterwards try to bump it again. Please get more details in
[issues/15145](https://github.com/etcd-io/etcd/issues/15145).
`go.opentelemetry.io/otel` version update is indirectly blocked due to this gRPC issue. Please get more details in [pull/15810](https://github.com/etcd-io/etcd/pull/15810).
## Rotation worksheet
The dependabot scheduling interval is weekly; it means dependabot will automatically raise a bunch of PRs per week.
Usually human intervention is required each time. We have a [rotation worksheet](https://docs.google.com/spreadsheets/d/1DDWzbcOx1p32MhyelaPZ_SfYtAD6xRsrtGRZ9QXPOyQ/edit#gid=0),
and everyone is welcome to participate; you just need to register your name in the worksheet.
# Stable branches
Usually we don't proactively bump dependencies for stable releases unless there are any CVEs or bugs that affect etcd.
If we have to do it, then follow the same guidance above. Note that there is no `./scripts/fix.sh` in release-3.4, so no need to
execute it for 3.4.

View File

@ -1,83 +0,0 @@
# Features
This document provides an overview of etcd features and general development guidelines for adding and deprecating them. The project maintainers can override these guidelines per the need of the project following the project governance.
## Overview
The etcd features fall into three stages, experimental, stable, and unsafe.
### Experimental
Any new feature is usually added as an experimental feature. An experimental feature is characterized as below:
- Might be buggy due to a lack of user testing. Enabling the feature may not work as expected.
- Disabled by default when added initially.
- Support for such a feature may be dropped at any time without notice
- Feature related issues may be given lower priorities.
- It can be removed in the next minor or major release without following the feature deprecation policy unless it graduates to the stable future.
### Stable
A stable feature is characterized as below:
- Supported as part of the supported releases of etcd.
- May be enabled by default.
- Discontinuation of support must follow the feature deprecation policy.
### Unsafe
Unsafe features are rare and listed under the `Unsafe feature:` section in the etcd usage documentation. By default, they are disabled. They should be used with caution following documentation. An unsafe feature can be removed in the next minor or major release without following feature deprecation policy.
## Development Guidelines
### Adding a new feature
Any new enhancements to the etcd are typically added as an experimental feature. The general development requirements are listed below. They can be somewhat flexible depending on the scope of the feature and review discussions, and will evolve over time.
- Open an issue
- It must provide a clear need for the proposed feature.
- It should list development work items as checkboxes. There must be one work item towards future graduation to the stable future.
- Label the issue with `type/feature` and `experimental`.
- Keep the issue open for tracking purpose until a decision is made on graduation.
- Open a Pull Request (PR)
- Provide unit tests. Integreation tests are also recommended as possible.
- Provide robust e2e test coverage. If the feature being added is complicated or quickly needed, maintainers can decide to go with e2e tests for basic coverage initially and have robust coverage added at the later time before feature graduation to the stable feature.
- Provide logs for proper debugging.
- Provide metrics and benchmarks as needed.
- The Feature should be disabled by default.
- Any configuration flags related to the implementation of the feature must be prefixed with `experimental` e.g. `--experimental-feature-name`.
- Add a CHANGELOG entry.
- At least two maintainers must approve feature requirements and related code changes.
### Graduating an Experimental feature to Stable
It is important that experimental features don't get stuck in that stage. They should be revisited and moved to the stable stage following the graduation steps as described here.
#### Locate graduation candidate
Decide if an experimental feature is ready for graduation to the stable stage.
- Find the issue that was used to enable the experimental feature initially. One way to find such issues is to search for issues with `type/feature` and `experimental` labels.
- Fix any known open issues against the feature.
- Make sure the feature was enabled for at least one previous release. Check the PR(s) reference from the issue to see when the feature related code changes were merged.
#### Provide implementation
If an experimental feature is found ready for graduation to the stable stage, open a Pull Request (PR) with the following changes.
- Add robust e2e tests if not already provided.
- Add a new stable feature flag identical to the experimental feature flag but without the `--experimental` prefix.
- Deprecate the experimental feature following the [feature deprecation policy](#Deprecating-a-feature).
- Implementation must ensure that both the graduated and deprecated experimental feature flags work as expected. Note that both these flags will co-exist for the timeframe described in the feature deprecation policy.
- Enable the graduated feature by default if needed.
At least two maintainers must approve the work. Patch releases should not be considered for graduation.
### Deprecating a feature
#### Experimental
An experimental feature deprecates when it graduates to the stable stage.
- Add a deprecation message in the documentation of the experimental feature with a recommendation to use related stable feature. e.g. `DEPRECATED. Use <feature-name> instead.`
- Add a `deprecated` label in the issue that was initially used to enable the experimental feature.
#### Stable
As the project evolves, a stable feature may sometimes need to be deprecated and removed. Such a situation should be handled using the steps below:
- Create an issue for tracking purpose.
- Add a deprecation message in the feature usage documentation before a planned release for feature deprecation. e.g. `To be deprecated in <release>.`. If a new feature replaces the `To be deprecated` feature, then also provide a message saying so. e.g. `Use <feature-name> instead.`.
- Deprecate the feature in the planned release with a message as part of the feature usage documentation. e.g. `DEPRECATED`. If a new feature replaces the deprecated feature, then also provide a message saying so. e.g. `DEPRECATED. Use <feature-name> instead.`.
- Add a `deprecated` label in the related issue.
Remove the deprecated feature in the following release. Close any related issue(s). At least two maintainers must approve the work. Patch releases should not be considered for deprecation.

View File

@ -1,150 +0,0 @@
# Set up local cluster
For testing and development deployments, the quickest and easiest way is to configure a local cluster. For a production deployment, refer to the [clustering][clustering] section.
## Local standalone cluster
### Starting a cluster
Run the following to deploy an etcd cluster as a standalone cluster:
```
$ ./etcd
...
```
If the `etcd` binary is not present in the current working directory, it might be located either at `$GOPATH/bin/etcd` or at `/usr/local/bin/etcd`. Run the command appropriately.
The running etcd member listens on `localhost:2379` for client requests.
### Interacting with the cluster
Use `etcdctl` to interact with the running cluster:
1. Store an example key-value pair in the cluster:
```
$ ./etcdctl put foo bar
OK
```
If OK is printed, storing key-value pair is successful.
2. Retrieve the value of `foo`:
```
$ ./etcdctl get foo
bar
```
If `bar` is returned, interaction with the etcd cluster is working as expected.
## Local multi-member cluster
### Starting a cluster
A `Procfile` at the base of the etcd git repository is provided to easily configure a local multi-member cluster. To start a multi-member cluster, navigate to the root of the etcd source tree and perform the following:
1. Install `goreman` to control Procfile-based applications:
```
$ go install github.com/mattn/goreman@latest
```
The installation will place executables in the $GOPATH/bin. If $GOPATH environment variable is not set, the tool will be installed into the $HOME/go/bin. Make sure that $PATH is set accordingly in your environment.
2. Start a cluster with `goreman` using etcd's stock Procfile:
```
$ goreman -f Procfile start
```
The members start running. They listen on `localhost:2379`, `localhost:22379`, and `localhost:32379` respectively for client requests.
### Interacting with the cluster
Use `etcdctl` to interact with the running cluster:
1. Print the list of members:
```
$ etcdctl --write-out=table --endpoints=localhost:2379 member list
```
The list of etcd members are displayed as follows:
```
+------------------+---------+--------+------------------------+------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+--------+------------------------+------------------------+
| 8211f1d0f64f3269 | started | infra1 | http://127.0.0.1:2380 | http://127.0.0.1:2379 |
| 91bc3c398fb3c146 | started | infra2 | http://127.0.0.1:22380 | http://127.0.0.1:22379 |
| fd422379fda50e48 | started | infra3 | http://127.0.0.1:32380 | http://127.0.0.1:32379 |
+------------------+---------+--------+------------------------+------------------------+
```
2. Store an example key-value pair in the cluster:
```
$ etcdctl put foo bar
OK
```
If OK is printed, storing key-value pair is successful.
### Testing fault tolerance
To exercise etcd's fault tolerance, kill a member and attempt to retrieve the key.
1. Identify the process name of the member to be stopped.
The `Procfile` lists the properties of the multi-member cluster. For example, consider the member with the process name, `etcd2`.
2. Stop the member:
```
# kill etcd2
$ goreman run stop etcd2
```
3. Store a key:
```
$ etcdctl put key hello
OK
```
4. Retrieve the key that is stored in the previous step:
```
$ etcdctl get key
hello
```
5. Retrieve a key from the stopped member:
```
$ etcdctl --endpoints=localhost:22379 get key
```
The command should display an error caused by connection failure:
```
2017/06/18 23:07:35 grpc: Conn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 127.0.0.1:22379: getsockopt: connection refused"; Reconnecting to "localhost:22379"
Error: grpc: timed out trying to connect
```
6. Restart the stopped member:
```
$ goreman run restart etcd2
```
7. Get the key from the restarted member:
```
$ etcdctl --endpoints=localhost:22379 get key
hello
```
Restarting the member re-establish the connection. `etcdctl` will now be able to retrieve the key successfully. To learn more about interacting with etcd, read [interacting with etcd section][interacting].
[clustering]: https://etcd.io/docs/latest/op-guide/clustering/
[interacting]: https://etcd.io/docs/latest/dev-guide/interacting_v3/

View File

@ -1,33 +0,0 @@
# Logging Conventions
etcd uses the [zap][zap] library for logging application output categorized into *levels*. A log message's level is determined according to these conventions:
* Debug: Everything is still fine, but even common operations may be logged, and less helpful but more quantity of notices. Usually not used in production.
* Examples:
* Send a normal message to a remote peer
* Write a log entry to disk
* Info: Normal, working log information, everything is fine, but helpful notices for auditing or common operations. Should rather not be logged more frequently than once per a few seconds in normal server's operation.
* Examples:
* Startup configuration
* Start to do snapshot
* Warning: (Hopefully) Temporary conditions that may cause errors, but may work fine. A replica disappearing (that may reconnect) is a warning.
* Examples:
* Failure to send raft message to a remote peer
* Failure to receive heartbeat message within the configured election timeout
* Error: Data has been lost, a request has failed for a bad reason, or a required resource has been lost.
* Examples:
* Failure to allocate disk space for WAL
* Panic: Unrecoverable or unexpected error situation that requires stopping execution.
* Examples:
* Failure to create the database
* Fatal: Unrecoverable or unexpected error situation that requires immediate exit. Mostly used in the test.
* Examples:
* Failure to find the data directory
* Failure to run a test function
[zap]: https://github.com/uber-go/zap

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 129 KiB

View File

@ -1,91 +0,0 @@
# Golang modules
The etcd project (since version 3.5) is organized into multiple
[golang modules](https://golang.org/ref/mod) hosted in a [single repository](https://golang.org/ref/mod#vcs-dir).
![modules graph](modules.svg)
There are following modules:
- **go.etcd.io/etcd/api/v3** - contains API definitions
(like protos & proto-generated libraries) that defines communication protocol
between etcd clients and server.
- **go.etcd.io/etcd/pkg/v3** - collection of utility packages used by etcd
without being specific to etcd itself. A package belongs here
only if it could possibly be moved out into its own repository in the future.
Please avoid adding here code that has a lot of dependencies on its own, as
they automatically becoming dependencies of the client library
(that we want to keep lightweight).
- **go.etcd.io/etcd/client/v3** - client library used to contact etcd over
the network (grpc). Recommended for all new usage of etcd.
- **go.etcd.io/raft/v3** - implementation of distributed consensus
protocol. Should have no etcd specific code. Hosted in a separate repository:
https://github.com/etcd-io/raft.
- **go.etcd.io/etcd/server/v3** - etcd implementation.
The code in this package is etcd internal and should not be consumed
by external projects. The package layout and API can change within the minor versions.
- **go.etcd.io/etcd/etcdctl/v3** - a command line tool to access and manage etcd.
- **go.etcd.io/etcd/tests/v3** - a module that contains all integration tests of etcd.
Notice: All unit-tests (fast and not requiring cross-module dependencies)
should be kept in the local modules to the code under the test.
- **go.etcd.io/bbolt** - implementation of persistent b-tree.
Hosted in a separate repository: https://github.com/etcd-io/bbolt.
### Operations
1. All etcd modules should be released in the same versions, e.g.
`go.etcd.io/etcd/client/v3@v3.5.10` must depend on `go.etcd.io/etcd/api/v3@v3.5.10`.
The consistent updating of versions can by performed using:
```shell script
% DRY_RUN=false TARGET_VERSION="v3.5.10" ./scripts/release_mod.sh update_versions
```
2. The released modules should be tagged according to https://golang.org/ref/mod#vcs-version rules,
i.e. each module should get its own tag.
The tagging can be performed using:
```shell script
% DRY_RUN=false REMOTE_REPO="origin" ./scripts/release_mod.sh push_mod_tags
```
3. All etcd modules should depend on the same versions of underlying dependencies.
This can be verified using:
```shell script
% PASSES="dep" ./test.sh
```
4. The go.mod files must not contain dependencies not being used and must
conform to `go mod tidy` format.
This is being verified by:
```
% PASSES="mod_tidy" ./test.sh
```
5. To trigger actions across all modules (e.g. auto-format all files), please
use/expand the following script:
```shell script
% ./scripts/fix.sh
```
### Future
As a North Star, we would like to evaluate etcd modules towards following model:
![modules graph](modules-future.svg)
This assumes:
- Splitting etcdmigrate/etcdadm out of etcdctl binary.
Thanks to this etcdctl would become clearly a command-line wrapper
around network client API,
while etcdmigrate/etcdadm would support direct physical operations on the
etcd storage files.
- Splitting etcd-proxy out of ./etcd binary, as it contains more experimental code
so carries additional risk & dependencies.
- Deprecation of support for v2 protocol.

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 112 KiB

View File

@ -1,75 +0,0 @@
# Release
The guide talks about how to release a new version of etcd.
The procedure includes some manual steps for sanity checking, but it can probably be further scripted. Please keep this document up-to-date if making changes to the release process.
## Release management
etcd community members are assigned to manage the release each etcd major/minor version as well as manage patches
and to each stable release branch. The managers are responsible for communicating the timelines and status of each
release and for ensuring the stability of the release branch.
| Releases | Manager |
|------------------------|-------------------------------------------------------------|
| 3.4 patch (post 3.4.0) | Benjamin Wang [@ahrtr](https://github.com/ahrtr) |
| 3.5 patch (post 3.5.0) | Marek Siarkowicz [@serathius](https://github.com/serathius) |
All releases version numbers follow the format of [semantic versioning 2.0.0](http://semver.org/).
### Major, minor version release, or its pre-release
- Ensure the relevant milestone on GitHub is complete. All referenced issues should be closed, or moved elsewhere.
- Ensure the latest upgrade documentation is available.
- Bump [hardcoded MinClusterVerion in the repository](https://github.com/etcd-io/etcd/blob/v3.4.15/version/version.go#L29), if necessary.
- Add feature capability maps for the new version, if necessary.
### Patch version release
- To request a backport, devlopers submit cherrypick PRs targeting the release branch. The commits should not include merge commits. The commits should be restricted to bug fixes and security patches.
- The cherrypick PRs should target the appropriate release branch (`base:release-<major>-<minor>`). `hack/patch/cherrypick.sh` may be used to automatically generate cherrypick PRs.
- The release patch manager reviews the cherrypick PRs. Please discuss carefully what is backported to the patch release. Each patch release should be strictly better than it's predecessor.
- The release patch manager will cherry-pick these commits starting from the oldest one into stable branch.
## Write release note
- Write introduction for the new release. For example, what major bug we fix, what new features we introduce or what performance improvement we make.
- Put `[GH XXXX]` at the head of change line to reference Pull Request that introduces the change. Moreover, add a link on it to jump to the Pull Request.
- Find PRs with `release-note` label and explain them in `NEWS` file, as a straightforward summary of changes for end-users.
## Build and push the release artifacts
- Ensure `docker` is available.
Run release script in root directory:
```
DRY_RUN=false ./scripts/release.sh ${VERSION}
```
It generates all release binaries and images under directory ./release.
Binaries are pushed to gcr.io and images are pushed to quay.io and gcr.io.
## Publish release page in GitHub
- Set release title as the version name.
- Follow the format of previous release pages.
- Attach the generated binaries and signatures.
- Select whether it is a pre-release.
- Publish the release!
## Announce to the etcd-dev Googlegroup
- Follow the format of [previous release emails](https://groups.google.com/forum/#!forum/etcd-dev).
- Make sure to include a list of authors that contributed since the previous release - something like the following might be handy:
```
git log ...${PREV_VERSION} --pretty=format:"%an" | sort | uniq | tr '\n' ',' | sed -e 's#,#, #g' -e 's#, $##'
```
- Send email to etcd-dev@googlegroups.com
## Post release
- Create new stable branch through `git push origin ${VERSION_MAJOR}.${VERSION_MINOR}` if this is a major stable release. This assumes `origin` corresponds to "https://github.com/etcd-io/etcd".
- Bump [hardcoded Version in the repository](https://github.com/etcd-io/etcd/blob/v3.4.15/version/version.go#L30) to the version `${VERSION}+git`.

View File

@ -1,45 +0,0 @@
# Reporting bugs
If any part of the etcd project has bugs or documentation mistakes, please let us know by [opening an issue][etcd-issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
To make the bug report accurate and easy to understand, please try to create bug reports that are:
- Specific. Include as much details as possible: which version, what environment, what configuration, etc. If the bug is related to running the etcd server, please attach the etcd log (the starting log with etcd configuration is especially important).
- Reproducible. Include the steps to reproduce the problem. We understand some issues might be hard to reproduce, please includes the steps that might lead to the problem. If possible, please attach the affected etcd data dir and stack strace to the bug report.
- Isolated. Please try to isolate and reproduce the bug with minimum dependencies. It would significantly slow down the speed to fix a bug if too many dependencies are involved in a bug report. Debugging external systems that rely on etcd is out of scope, but we are happy to provide guidance in the right direction or help with using etcd itself.
- Unique. Do not duplicate existing bug report.
- Scoped. One bug per report. Do not follow up with another bug inside one report.
It may be worthwhile to read [Elika Etemads article on filing good bug reports][filing-good-bugs] before creating a bug report.
We might ask for further information to locate a bug. A duplicated bug report will be closed.
## Frequently asked questions
### How to get a stack trace
``` bash
$ kill -QUIT $PID
```
### How to get etcd version
``` bash
$ etcd --version
```
### How to get etcd configuration and log when it runs as systemd service etcd2.service
``` bash
$ sudo systemctl cat etcd2
$ sudo journalctl -u etcd2
```
Due to an upstream systemd bug, journald may miss the last few log lines when its processes exit. If journalctl says etcd stopped without fatal or panic message, try `sudo journalctl -f -t etcd2` to get full log.
[etcd-issue]: https://github.com/etcd-io/etcd/issues/new
[filing-good-bugs]: http://fantasai.inkedblade.net/style/talks/filing-good-bugs/

View File

@ -1,180 +0,0 @@
# Issue triage guidelines
## Purpose
Speed up issue management.
The `etcd` issues are listed at <https://github.com/etcd-io/etcd/issues> and are identified with labels. For example, an issue that is identified as a bug will be set to label `type/bug`.
The etcd project uses labels to indicate common attributes such as `area`, `type` and `priority` of incoming issues.
New issues will often start out without any labels, but typically `etcd` maintainers, reviewers and members will add labels by following these triage guidelines. The detailed list of labels can be found at <https://github.com/etcd-io/etcd/labels>.
## Scope
This document serves as the primary guidelines for triaging incoming issues in `etcd`.
All contributors are encouraged and welcome to help manage issues which will help reduce burden on project maintainers, though the work and responsibilities discussed in this document are created with `etcd` project reviewers and members in mind as these individuals will have triage access to the etcd project which is a requirement for actions like applying labels or closing issues.
Refer to [etcd community membership](https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/community-membership.md) for guidance on becoming and etcd project member or reviewer.
## Step 1 - Find an issue to triage
To get started you can use the following recommended issue searches to identify issues that are in need of triage:
* [Issues that have no labels](https://github.com/etcd-io/etcd/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated+no%3Alabel)
* [Issues created recently](https://github.com/etcd-io/etcd/issues?q=is%3Aissue+is%3Aopen+)
* [Issues not assigned but linked pr](https://github.com/etcd-io/etcd/issues?q=is%3Aopen+is%3Aissue+no%3Aassignee+linked%3Apr)
* [Issues with no comments](https://github.com/etcd-io/etcd/issues?q=is%3Aopen+is%3Aissue+comments%3A0+)
* [Issues with help wanted](https://github.com/etcd-io/etcd/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22+)
## Step 2 - Check the issue is valid
Before we start adding labels or trying to work out a priority, our first triage step needs to be working out if the issue actually belongs to the etcd project and is not a duplicate.
### Issues that don't belong to etcd
Sometime issues are reported that actually belongs to other projects that `etcd` use. For example, `grpc` or `golang` issues. Such issues should be addressed by asking reporter to open issues in appropriate other project.
These issues can generally be closed unless a maintainer and issue reporter see a need to keep it open for tracking purpose. If you have triage permissions please close it, alternatively mention the @etcd-io/members group to request a member with triage access close the issue.
### Duplicate issues
If an issue is a duplicate, add a comment stating so along with a reference for the original issue and if you have triage permissions please close it, alternatively mention the @etcd-io/members group to request a member with triage access close the issue.
## Step 3 - Apply the appropriate type label
Adding a `type` label to an issue helps create visibility on the health of the project and helps contributors identify potential priorities, i.e. addressing existing bugs or test flakes before implementing new features.
### Support requests
As a general rule the focus for etcd support is to address common themes in a broad way that helps all users, i.e. through channels like known issues, frequently asked questions and high quality documentation. To make the best use of project members time we should avoid providing 1:1 support if a broad approach is available.
Some people mistakenly use our GitHub bug report or feature request templates to file support requests. Usually they are asking for help operating or configuring some aspect of etcd. Support requests for etcd should instead be raised as [discussions](https://github.com/etcd-io/etcd/discussions).
Common types of support requests are:
1. Questions about configuring or operating existing well documented etcd features, for example <https://github.com/etcd-io/etcd/issues/15945>. Note - If an existing feature is not well documented please apply the `area/documentation` label and propose documentation improvements that would prevent future users from stumbling on the problem again.
2. Bug reports or questions about unspported versions of etcd, for example <https://github.com/etcd-io/etcd/issues/15796>. When responding to these issues please refer to our [supported versions documentation](https://etcd.io/docs/latest/op-guide/versioning) and encourage the reporter to upgrade to a recent patch release of a supported version as soon as possible. We should limit the effort supporting users that do not make the effort to run a supported version of etcd or ensure their version is patched.
3. Bug reports that do not provide a complete list of steps to reproduce issue and/or contributors are not able to reproduce the issue, for example <https://github.com/etcd-io/etcd/issues/15740>. We should limit the effort we put into reproducing issues ourselves and motivate users to provide necessary information to accept the bug report.
4. General questions that are filed using feature request or bug report issue templates, for example <https://github.com/etcd-io/etcd/issues/15914>. Note - These types of requests may surface good additions to our [frequently asked questions](https://etcd.io/docs/v3.5/faq).
If you identify that an issue is a support request please:
1. Add the `type/support` or `type/question` label.
2. Add the following comment to inform the issue creator that discussions should be used instead and that this issue will be converted to a discussion.
> Thank you for your question, this support issue will be moved to our [Discussion Forums](https://github.com/etcd-io/etcd/discussions).
>
> We are trying to consolidate the channels to which questions for help/support are posted so that we can improve our efficiency in responding to your requests, and to make it easier for you to find answers to frequently asked questions and how to address common use cases.
>
> We regularly see messages posted in multiple forums, with the full response thread only in one place or, worse, spread across multiple forums. Also, the large volume of support issues on GitHub is making it difficult for us to use issues to identify real bugs.
>
> Members of the etcd community use Discussion Forums to field support requests. Before posting a new question, please search these for answers to similar questions, and also familiarize yourself with:
>
> 1. [user documentation](https://etcd.io/docs/latest)
> 2. [frequently asked questions](https://etcd.io/docs/v3.5/faq)
>
> Again, thanks for using etcd and raising this question.
>
> The etcd team
3. Finally, click `Convert to discussion` on the right hand panel, selecting the appropriate discussion category.
### Bug reports
If an issue has been raised as a bug it should already have the `type/bug` label, however if this is missing for an issue you determine to be a bug please add the label manually.
The next step is to validate if the issue is indeed a bug. If not, add a comment with findings and close trivial issue. For non-trivial issue, wait to hear back from issue reporter and see if there is any objection. If issue reporter does not reply in 30 days, close the issue.
If the problem can not be reproduced or requires more information, leave a comment for the issue reporter as soon as possible while the issue will be fresh for the issue reporter.
### Feature requests
New feature requests should be created via the etcd feature request template and in theory already have the `type/feature` label, however if this is missing for an issue you determine to be a feature please add the label manually.
### Test flakes
Test flakes are a specific type of bug that the etcd project tracks seperately as these are a priority to address. These should be created via the test flake template and in theory already have the `type/flake` label, however if this is missing for an issue you determine to be related to a flaking test please add the label manually.
## Step 4 - Define the areas impacted
Adding an `area` label to an issue helps create visibility on which areas of the etcd project require attention and helps contributors find issues to work on relating to their particular skills or knowledge of the etcd codebase.
If an issue crosses multiple domains please add additional `area` labels to reflect that.
Below is a brief summary of the area labels in active use by the etcd project along with any notes on their use:
| Label | Notes |
| --- | --- |
| area/external | Tracking label for issues raised that are external to etcd. |
| area/community | |
| area/raft | |
| area/clientv3 | |
| area/performance | |
| area/security | |
| area/tls | |
| area/auth | |
| area/etcdctl | |
| area/etcdutl | |
| area/contrib | Not to be confused with `area/community` this label is specifically used for issues relating to community maintained scripts or files in the `contrib/` directory which aren't part of the core etcd project. |
| area/documentation | |
| area/tooling | Generally used in relation to the third party / external utilities or tools that are used in various stages of the etcd build, test or release process, for example tooling to create sboms. |
| area/testing | |
| area/robustness-testing | |
## Step 5 - Prioritise the issue
Placeholder.
## Step 6 - Support new contributors
As part of the `etcd` triage process once the `kind` and `area` have been determined, please consider if the issue would be suitable for a less experienced contributor. The `good first issue` label is a subset of the `help wanted` label, indicating that members have committed to providing extra assistance for new contributors. All `good first issue` items also have the `help wanted` label.
### Help wanted
Items marked with the `help wanted` label need to ensure that they meet these criteria:
* **Low Barrier to Entry** - It should be easy for new contributors.
* **Clear** - The task is agreed upon and does not require further discussions in the community.
* **Goldilocks priority** - The priority should not be so high that a core contributor should do it, but not too low that it isnt useful enough for a core contributor to spend time reviewing it, answering questions, helping get it into a release, etc.
### Good first issue
Items marked with `good first issue` are intended for first-time contributors. It indicates that members will keep an eye out for these pull requests and shepherd it through our processes.
New contributors should not be left to find an approver, ping for reviews, decipher test commands, or identify that their build failed due to a flake. It is important to make new contributors feel welcome and valued. We should assure them that they will have an extra level of help with their first contribution.
After a contributor has successfully completed one or two `good first issue` items, they should be ready to move on to `help wanted` items.
* **No Barrier to Entry** - The task is something that a new contributor can tackle without advanced setup or domain knowledge.
* **Solution Explained** - The recommended solution is clearly described in the issue.
* **Gives Examples** - Link to examples of similar implementations so new contributors have a reference guide for their changes.
* **Identifies Relevant Code** - The relevant code and tests to be changed should be linked in the issue.
* **Ready to Test** - There should be existing tests that can be modified, or existing test cases fit to be copied. If the area of code doesnt have tests, before labeling the issue, add a test fixture. This prep often makes a great help wanted task!
## Step 7 - Follow up
Once initial triage has been completed, issues need to be re-evaluated over time to ensure they don't become stale incorrectly.
### Track important issues
If an issue is at risk of being closed by stale bot in future, but is an important issuefor the etcd project, then please apply the `stage/tracked` label and remove any `stale` labels that exist. This will ensure the project does not lose sight of the issue.
### Close incomplete issues
Issues that lack enough information from the issue reporter should be closed if issue reporter do not provide information in 30 days. Issues can always be re-opened at a later date if new information is provided.
### Check for incomplete work
If an issue owned by a developer has no pull request created in 30 days, contact the issue owner and kindly ask about the status of their work, or to release ownership on the issue if needed.

View File

@ -1,28 +0,0 @@
# PR management
## Purpose
Speed up PR management.
The `etcd` PRs are listed at https://github.com/etcd-io/etcd/pulls
A PR can have various labels, milestone, reviewer etc. The detailed list of labels can be found at
https://github.com/kubernetes/kubernetes/labels
Following are few example searches on PR for convenience:
* [Open PRS for milestone etcd-v3.6](https://github.com/etcd-io/etcd/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+milestone%3Aetcd-v3.6)
* [PRs under investigation](https://github.com/etcd-io/etcd/labels/Investigating)
## Scope
These guidelines serves as a primary document for managing PRs in `etcd`. Everyone is welcome to help manage PRs but the work and responsibilities discussed in this document is created with `etcd` maintainers and active contributors in mind.
## Handle inactive PRs
Poke PR owner if review comments are not addressed in 15 days. If PR owner does not reply in 90 days, update the PR with a new commit if possible. If not, inactive PR should be closed after 180 days.
## Poke reviewer if needed
Reviewers are responsive in a timely fashion, but considering everyone is busy, give them some time after requesting review if quick response is not provided. If response is not provided in 10 days, feel free to contact them via adding a comment in the PR or sending an email or message on the Slack.
## Verify important labels are in place
Make sure that appropriate reviewers are added to the PR. Also, make sure that a milestone is identified. If any of these or other important labels are missing, add them. If a correct label cannot be decided, leave a comment for the maintainers to do so as needed.

459
Documentation/demo.md Normal file
View File

@ -0,0 +1,459 @@
---
title: Demo
---
This series of examples shows the basic procedures for working with an etcd cluster.
## Set up a cluster
<img src="https://storage.googleapis.com/etcd/demo/01_etcd_clustering_2016051001.gif" alt="01_etcd_clustering_2016050601"/>
On each etcd node, specify the cluster members:
```
TOKEN=token-01
CLUSTER_STATE=new
NAME_1=machine-1
NAME_2=machine-2
NAME_3=machine-3
HOST_1=10.240.0.17
HOST_2=10.240.0.18
HOST_3=10.240.0.19
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
```
Run this on each machine:
```
# For machine 1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
# For machine 2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
# For machine 3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
```
Or use our public discovery service:
```
curl https://discovery.etcd.io/new?size=3
https://discovery.etcd.io/a81b5818e67a6ea83e9d4daea5ecbc92
# grab this token
TOKEN=token-01
CLUSTER_STATE=new
NAME_1=machine-1
NAME_2=machine-2
NAME_3=machine-3
HOST_1=10.240.0.17
HOST_2=10.240.0.18
HOST_3=10.240.0.19
DISCOVERY=https://discovery.etcd.io/a81b5818e67a6ea83e9d4daea5ecbc92
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--discovery ${DISCOVERY} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--discovery ${DISCOVERY} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
--discovery ${DISCOVERY} \
--initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
```
Now etcd is ready! To connect to etcd with etcdctl:
```
export ETCDCTL_API=3
HOST_1=10.240.0.17
HOST_2=10.240.0.18
HOST_3=10.240.0.19
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
etcdctl --endpoints=$ENDPOINTS member list
```
## Access etcd
<img src="https://storage.googleapis.com/etcd/demo/02_etcdctl_access_etcd_2016051001.gif" alt="02_etcdctl_access_etcd_2016051001"/>
`put` command to write:
```
etcdctl --endpoints=$ENDPOINTS put foo "Hello World!"
```
`get` to read from etcd:
```
etcdctl --endpoints=$ENDPOINTS get foo
etcdctl --endpoints=$ENDPOINTS --write-out="json" get foo
```
## Get by prefix
<img src="https://storage.googleapis.com/etcd/demo/03_etcdctl_get_by_prefix_2016050501.gif" alt="03_etcdctl_get_by_prefix_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS put web1 value1
etcdctl --endpoints=$ENDPOINTS put web2 value2
etcdctl --endpoints=$ENDPOINTS put web3 value3
etcdctl --endpoints=$ENDPOINTS get web --prefix
```
## Delete
<img src="https://storage.googleapis.com/etcd/demo/04_etcdctl_delete_2016050601.gif" alt="04_etcdctl_delete_2016050601"/>
```
etcdctl --endpoints=$ENDPOINTS put key myvalue
etcdctl --endpoints=$ENDPOINTS del key
etcdctl --endpoints=$ENDPOINTS put k1 value1
etcdctl --endpoints=$ENDPOINTS put k2 value2
etcdctl --endpoints=$ENDPOINTS del k --prefix
```
## Transactional write
`txn` to wrap multiple requests into one transaction:
<img src="https://storage.googleapis.com/etcd/demo/05_etcdctl_transaction_2016050501.gif" alt="05_etcdctl_transaction_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS put user1 bad
etcdctl --endpoints=$ENDPOINTS txn --interactive
compares:
value("user1") = "bad"
success requests (get, put, delete):
del user1
failure requests (get, put, delete):
put user1 good
```
## Watch
`watch` to get notified of future changes:
<img src="https://storage.googleapis.com/etcd/demo/06_etcdctl_watch_2016050501.gif" alt="06_etcdctl_watch_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS watch stock1
etcdctl --endpoints=$ENDPOINTS put stock1 1000
etcdctl --endpoints=$ENDPOINTS watch stock --prefix
etcdctl --endpoints=$ENDPOINTS put stock1 10
etcdctl --endpoints=$ENDPOINTS put stock2 20
```
## Lease
`lease` to write with TTL:
<img src="https://storage.googleapis.com/etcd/demo/07_etcdctl_lease_2016050501.gif" alt="07_etcdctl_lease_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS lease grant 300
# lease 2be7547fbc6a5afa granted with TTL(300s)
etcdctl --endpoints=$ENDPOINTS put sample value --lease=2be7547fbc6a5afa
etcdctl --endpoints=$ENDPOINTS get sample
etcdctl --endpoints=$ENDPOINTS lease keep-alive 2be7547fbc6a5afa
etcdctl --endpoints=$ENDPOINTS lease revoke 2be7547fbc6a5afa
# or after 300 seconds
etcdctl --endpoints=$ENDPOINTS get sample
```
## Distributed locks
`lock` for distributed lock:
<img src="https://storage.googleapis.com/etcd/demo/08_etcdctl_lock_2016050501.gif" alt="08_etcdctl_lock_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS lock mutex1
# another client with the same name blocks
etcdctl --endpoints=$ENDPOINTS lock mutex1
```
## Elections
`elect` for leader election:
<img src="https://storage.googleapis.com/etcd/demo/09_etcdctl_elect_2016050501.gif" alt="09_etcdctl_elect_2016050501"/>
```
etcdctl --endpoints=$ENDPOINTS elect one p1
# another client with the same name blocks
etcdctl --endpoints=$ENDPOINTS elect one p2
```
## Cluster status
Specify the initial cluster configuration for each machine:
<img src="https://storage.googleapis.com/etcd/demo/10_etcdctl_endpoint_2016050501.gif" alt="10_etcdctl_endpoint_2016050501"/>
```
etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status
+------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+------------------+------------------+---------+---------+-----------+-----------+------------+
| 10.240.0.17:2379 | 4917a7ab173fabe7 | 3.0.0 | 45 kB | true | 4 | 16726 |
| 10.240.0.18:2379 | 59796ba9cd1bcd72 | 3.0.0 | 45 kB | false | 4 | 16726 |
| 10.240.0.19:2379 | 94df724b66343e6c | 3.0.0 | 45 kB | false | 4 | 16726 |
+------------------+------------------+---------+---------+-----------+-----------+------------+
```
```
etcdctl --endpoints=$ENDPOINTS endpoint health
10.240.0.17:2379 is healthy: successfully committed proposal: took = 3.345431ms
10.240.0.19:2379 is healthy: successfully committed proposal: took = 3.767967ms
10.240.0.18:2379 is healthy: successfully committed proposal: took = 4.025451ms
```
## Snapshot
`snapshot` to save point-in-time snapshot of etcd database:
<img src="https://storage.googleapis.com/etcd/demo/11_etcdctl_snapshot_2016051001.gif" alt="11_etcdctl_snapshot_2016051001"/>
Snapshot can only be requested from one etcd node, so `--endpoints` flag should contain only one endpoint.
```
ENDPOINTS=$HOST_1:2379
etcdctl --endpoints=$ENDPOINTS snapshot save my.db
Snapshot saved at my.db
```
```
etcdctl --write-out=table --endpoints=$ENDPOINTS snapshot status my.db
+---------+----------+------------+------------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
+---------+----------+------------+------------+
| c55e8b8 | 9 | 13 | 25 kB |
+---------+----------+------------+------------+
```
## Migrate
`migrate` to transform etcd v2 to v3 data:
<img src="https://storage.googleapis.com/etcd/demo/12_etcdctl_migrate_2016061602.gif" alt="12_etcdctl_migrate_2016061602"/>
```
# write key in etcd version 2 store
export ETCDCTL_API=2
etcdctl --endpoints=http://$ENDPOINT set foo bar
# read key in etcd v2
etcdctl --endpoints=$ENDPOINTS --output="json" get foo
# stop etcd node to migrate, one by one
# migrate v2 data
export ETCDCTL_API=3
etcdctl --endpoints=$ENDPOINT migrate --data-dir="default.etcd" --wal-dir="default.etcd/member/wal"
# restart etcd node after migrate, one by one
# confirm that the key got migrated
etcdctl --endpoints=$ENDPOINTS get /foo
```
## Member
`member` to add,remove,update membership:
<img src="https://storage.googleapis.com/etcd/demo/13_etcdctl_member_2016062301.gif" alt="13_etcdctl_member_2016062301"/>
```
# For each machine
TOKEN=my-etcd-token-1
CLUSTER_STATE=new
NAME_1=etcd-node-1
NAME_2=etcd-node-2
NAME_3=etcd-node-3
HOST_1=10.240.0.13
HOST_2=10.240.0.14
HOST_3=10.240.0.15
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
# For node 1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 \
--listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 \
--listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}
# For node 2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 \
--listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 \
--listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}
# For node 3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 \
--listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 \
--listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}
```
Then replace a member with `member remove` and `member add` commands:
```
# get member ID
export ETCDCTL_API=3
HOST_1=10.240.0.13
HOST_2=10.240.0.14
HOST_3=10.240.0.15
etcdctl --endpoints=${HOST_1}:2379,${HOST_2}:2379,${HOST_3}:2379 member list
# remove the member
MEMBER_ID=278c654c9a6dfd3b
etcdctl --endpoints=${HOST_1}:2379,${HOST_2}:2379,${HOST_3}:2379 \
member remove ${MEMBER_ID}
# add a new member (node 4)
export ETCDCTL_API=3
NAME_1=etcd-node-1
NAME_2=etcd-node-2
NAME_4=etcd-node-4
HOST_1=10.240.0.13
HOST_2=10.240.0.14
HOST_4=10.240.0.16 # new member
etcdctl --endpoints=${HOST_1}:2379,${HOST_2}:2379 \
member add ${NAME_4} \
--peer-urls=http://${HOST_4}:2380
```
Next, start the new member with `--initial-cluster-state existing` flag:
```
# [WARNING] If the new member starts from the same disk space,
# make sure to remove the data directory of the old member
#
# restart with 'existing' flag
TOKEN=my-etcd-token-1
CLUSTER_STATE=existing
NAME_1=etcd-node-1
NAME_2=etcd-node-2
NAME_4=etcd-node-4
HOST_1=10.240.0.13
HOST_2=10.240.0.14
HOST_4=10.240.0.16 # new member
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_4}=http://${HOST_4}:2380
THIS_NAME=${NAME_4}
THIS_IP=${HOST_4}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
--initial-advertise-peer-urls http://${THIS_IP}:2380 \
--listen-peer-urls http://${THIS_IP}:2380 \
--advertise-client-urls http://${THIS_IP}:2379 \
--listen-client-urls http://${THIS_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}
```
## Auth
`auth`,`user`,`role` for authentication:
<img src="https://storage.googleapis.com/etcd/demo/14_etcdctl_auth_2016062301.gif" alt="14_etcdctl_auth_2016062301"/>
```
export ETCDCTL_API=3
ENDPOINTS=localhost:2379
etcdctl --endpoints=${ENDPOINTS} role add root
etcdctl --endpoints=${ENDPOINTS} role grant-permission root readwrite foo
etcdctl --endpoints=${ENDPOINTS} role get root
etcdctl --endpoints=${ENDPOINTS} user add root
etcdctl --endpoints=${ENDPOINTS} user grant-role root root
etcdctl --endpoints=${ENDPOINTS} user get root
etcdctl --endpoints=${ENDPOINTS} auth enable
# now all client requests go through auth
etcdctl --endpoints=${ENDPOINTS} --user=root:123 put foo bar
etcdctl --endpoints=${ENDPOINTS} get foo
etcdctl --endpoints=${ENDPOINTS} --user=root:123 get foo
etcdctl --endpoints=${ENDPOINTS} --user=root:123 get foo1
```

View File

@ -0,0 +1,168 @@
### etcd concurrency API Reference
This is a generated documentation. Please read the proto files for more.
##### service `Lock` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
The lock service exposes client-side locking facilities as a gRPC interface.
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| Lock | LockRequest | LockResponse | Lock acquires a distributed shared lock on a given named lock. On success, it will return a unique key that exists so long as the lock is held by the caller. This key can be used in conjunction with transactions to safely ensure updates to etcd only occur while holding lock ownership. The lock is held until Unlock is called on the key or the lease associate with the owner expires. |
| Unlock | UnlockRequest | UnlockResponse | Unlock takes a key returned by Lock and releases the hold on lock. The next Lock caller waiting for the lock will then be woken up and given ownership of the lock. |
##### message `LockRequest` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the identifier for the distributed shared lock to be acquired. | bytes |
| lease | lease is the ID of the lease that will be attached to ownership of the lock. If the lease expires or is revoked and currently holds the lock, the lock is automatically released. Calls to Lock with the same lease will be treated as a single acquisition; locking twice with the same lease is a no-op. | int64 |
##### message `LockResponse` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
| key | key is a key that will exist on etcd for the duration that the Lock caller owns the lock. Users should not modify this key or the lock may exhibit undefined behavior. | bytes |
##### message `UnlockRequest` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the lock ownership key granted by Lock. | bytes |
##### message `UnlockResponse` (etcdserver/api/v3lock/v3lockpb/v3lock.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
##### service `Election` (etcdserver/api/v3election/v3electionpb/v3election.proto)
The election service exposes client-side election facilities as a gRPC interface.
| Method | Request Type | Response Type | Description |
| ------ | ------------ | ------------- | ----------- |
| Campaign | CampaignRequest | CampaignResponse | Campaign waits to acquire leadership in an election, returning a LeaderKey representing the leadership if successful. The LeaderKey can then be used to issue new values on the election, transactionally guard API requests on leadership still being held, and resign from the election. |
| Proclaim | ProclaimRequest | ProclaimResponse | Proclaim updates the leader's posted value with a new value. |
| Leader | LeaderRequest | LeaderResponse | Leader returns the current election proclamation, if any. |
| Observe | LeaderRequest | LeaderResponse | Observe streams election proclamations in-order as made by the election's elected leaders. |
| Resign | ResignRequest | ResignResponse | Resign releases election leadership so other campaigners may acquire leadership on the election. |
##### message `CampaignRequest` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the election's identifier for the campaign. | bytes |
| lease | lease is the ID of the lease attached to leadership of the election. If the lease expires or is revoked before resigning leadership, then the leadership is transferred to the next campaigner, if any. | int64 |
| value | value is the initial proclaimed value set when the campaigner wins the election. | bytes |
##### message `CampaignResponse` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
| leader | leader describes the resources used for holding leadereship of the election. | LeaderKey |
##### message `LeaderKey` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the election identifier that correponds to the leadership key. | bytes |
| key | key is an opaque key representing the ownership of the election. If the key is deleted, then leadership is lost. | bytes |
| rev | rev is the creation revision of the key. It can be used to test for ownership of an election during transactions by testing the key's creation revision matches rev. | int64 |
| lease | lease is the lease ID of the election leader. | int64 |
##### message `LeaderRequest` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| name | name is the election identifier for the leadership information. | bytes |
##### message `LeaderResponse` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
| kv | kv is the key-value pair representing the latest leader update. | mvccpb.KeyValue |
##### message `ProclaimRequest` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| leader | leader is the leadership hold on the election. | LeaderKey |
| value | value is an update meant to overwrite the leader's current value. | bytes |
##### message `ProclaimResponse` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
##### message `ResignRequest` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| leader | leader is the leadership to relinquish by resignation. | LeaderKey |
##### message `ResignResponse` (etcdserver/api/v3election/v3electionpb/v3election.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| header | | etcdserverpb.ResponseHeader |
##### message `Event` (mvcc/mvccpb/kv.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| type | type is the kind of event. If type is a PUT, it indicates new data has been stored to the key. If type is a DELETE, it indicates the key was deleted. | EventType |
| kv | kv holds the KeyValue for the event. A PUT event contains current kv pair. A PUT event with kv.Version=1 indicates the creation of a key. A DELETE/EXPIRE event contains the deleted key with its modification revision set to the revision of deletion. | KeyValue |
| prev_kv | prev_kv holds the key-value pair before the event happens. | KeyValue |
##### message `KeyValue` (mvcc/mvccpb/kv.proto)
| Field | Description | Type |
| ----- | ----------- | ---- |
| key | key is the key in bytes. An empty key is not allowed. | bytes |
| create_revision | create_revision is the revision of last creation on this key. | int64 |
| mod_revision | mod_revision is the revision of last modification on this key. | int64 |
| version | version is the version of the key. A deletion resets the version to zero and any modification of the key increases its version. | int64 |
| value | value is the value held by the key, in bytes. | bytes |
| lease | lease is the ID of the lease that attached to key. When the attached lease expires, the key will be deleted. If lease is 0, then no lease is attached to the key. | int64 |

View File

@ -0,0 +1,137 @@
---
title: Why gRPC gateway
---
etcd v3 uses [gRPC][grpc] for its messaging protocol. The etcd project includes a gRPC-based [Go client][go-client] and a command line utility, [etcdctl][etcdctl], for communicating with an etcd cluster through gRPC. For languages with no gRPC support, etcd provides a JSON [gRPC gateway][grpc-gateway]. This gateway serves a RESTful proxy that translates HTTP/JSON requests into gRPC messages.
## Using gRPC gateway
The gateway accepts a [JSON mapping][json-mapping] for etcd's [protocol buffer][api-ref] message definitions. Note that `key` and `value` fields are defined as byte arrays and therefore must be base64 encoded in JSON. The following examples use `curl`, but any HTTP/JSON client should work all the same.
### Notes
gRPC gateway endpoint has changed since etcd v3.3:
- etcd v3.2 or before uses only `[CLIENT-URL]/v3alpha/*`.
- etcd v3.3 uses `[CLIENT-URL]/v3beta/*` while keeping `[CLIENT-URL]/v3alpha/*`.
- etcd v3.4 uses `[CLIENT-URL]/v3/*` while keeping `[CLIENT-URL]/v3beta/*`.
- **`[CLIENT-URL]/v3alpha/*` is deprecated**.
- etcd v3.5 or later uses only `[CLIENT-URL]/v3/*`.
- **`[CLIENT-URL]/v3beta/*` is deprecated**.
gRPC-gateway does not support authentication using TLS Common Name.
### Put and get keys
Use the `/v3/kv/range` and `/v3/kv/put` services to read and write keys:
```bash
<<COMMENT
https://www.base64encode.org/
foo is 'Zm9v' in Base64
bar is 'YmFy'
COMMENT
curl -L http://localhost:2379/v3/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"}}
curl -L http://localhost:2379/v3/kv/range \
-X POST -d '{"key": "Zm9v"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}],"count":"1"}
# get all keys prefixed with "foo"
curl -L http://localhost:2379/v3/kv/range \
-X POST -d '{"key": "Zm9v", "range_end": "Zm9w"}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"3"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}],"count":"1"}
```
### Watch keys
Use the `/v3/watch` service to watch keys:
```bash
curl -N http://localhost:2379/v3/watch \
-X POST -d '{"create_request": {"key":"Zm9v"} }' &
# {"result":{"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"1","raft_term":"2"},"created":true}}
curl -L http://localhost:2379/v3/kv/put \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}' >/dev/null 2>&1
# {"result":{"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"2","raft_term":"2"},"events":[{"kv":{"key":"Zm9v","create_revision":"2","mod_revision":"2","version":"1","value":"YmFy"}}]}}
```
### Transactions
Issue a transaction with `/v3/kv/txn`:
```bash
# target CREATE
curl -L http://localhost:2379/v3/kv/txn \
-X POST \
-d '{"compare":[{"target":"CREATE","key":"Zm9v","createRevision":"2"}],"success":[{"requestPut":{"key":"Zm9v","value":"YmFy"}}]}'
# {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"3","raft_term":"2"},"succeeded":true,"responses":[{"response_put":{"header":{"revision":"3"}}}]}
```
```bash
# target VERSION
curl -L http://localhost:2379/v3/kv/txn \
-X POST \
-d '{"compare":[{"version":"4","result":"EQUAL","target":"VERSION","key":"Zm9v"}],"success":[{"requestRange":{"key":"Zm9v"}}]}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"6","raft_term":"3"},"succeeded":true,"responses":[{"response_range":{"header":{"revision":"6"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"6","version":"4","value":"YmF6"}],"count":"1"}}]}
```
### Authentication
Set up authentication with the `/v3/auth` service:
```bash
# create root user
curl -L http://localhost:2379/v3/auth/user/add \
-X POST -d '{"name": "root", "password": "pass"}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"1","raft_term":"2"}}
# create root role
curl -L http://localhost:2379/v3/auth/role/add \
-X POST -d '{"name": "root"}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"1","raft_term":"2"}}
# grant root role
curl -L http://localhost:2379/v3/auth/user/grant \
-X POST -d '{"user": "root", "role": "root"}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"1","raft_term":"2"}}
# enable auth
curl -L http://localhost:2379/v3/auth/enable -X POST -d '{}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"1","raft_term":"2"}}
```
Authenticate with etcd for an authentication token using `/v3/auth/authenticate`:
```bash
# get the auth token for the root user
curl -L http://localhost:2379/v3/auth/authenticate \
-X POST -d '{"name": "root", "password": "pass"}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"1","raft_term":"2"},"token":"sssvIpwfnLAcWAQH.9"}
```
Set the `Authorization` header to the authentication token to fetch a key using authentication credentials:
```bash
curl -L http://localhost:2379/v3/kv/put \
-H 'Authorization : sssvIpwfnLAcWAQH.9' \
-X POST -d '{"key": "Zm9v", "value": "YmFy"}'
# {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"2","raft_term":"2"}}
```
## Swagger
Generated [Swagger][swagger] API definitions can be found at [rpc.swagger.json][swagger-doc].
[api-ref]: ./api_reference_v3.md
[go-client]: https://github.com/coreos/etcd/tree/master/clientv3
[etcdctl]: https://github.com/coreos/etcd/tree/master/etcdctl
[grpc]: https://www.grpc.io/
[grpc-gateway]: https://github.com/grpc-ecosystem/grpc-gateway
[json-mapping]: https://developers.google.com/protocol-buffers/docs/proto3#json
[swagger]: http://swagger.io/
[swagger-doc]: apispec/swagger/rpc.swagger.json

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -1,9 +1,13 @@
{
"swagger": "2.0",
"info": {
"title": "server/etcdserver/api/v3election/v3electionpb/v3election.proto",
"title": "etcdserver/api/v3election/v3electionpb/v3election.proto",
"version": "version not set"
},
"schemes": [
"http",
"https"
],
"consumes": [
"application/json"
],
@ -14,19 +18,13 @@
"/v3/election/campaign": {
"post": {
"summary": "Campaign waits to acquire leadership in an election, returning a LeaderKey\nrepresenting the leadership if successful. The LeaderKey can then be used\nto issue new values on the election, transactionally guard API requests on\nleadership still being held, and resign from the election.",
"operationId": "Election_Campaign",
"operationId": "Campaign",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbCampaignResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
@ -47,19 +45,13 @@
"/v3/election/leader": {
"post": {
"summary": "Leader returns the current election proclamation, if any.",
"operationId": "Election_Leader",
"operationId": "Leader",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
@ -80,27 +72,12 @@
"/v3/election/observe": {
"post": {
"summary": "Observe streams election proclamations in-order as made by the election's\nelected leaders.",
"operationId": "Election_Observe",
"operationId": "Observe",
"responses": {
"200": {
"description": "A successful response.(streaming responses)",
"schema": {
"type": "object",
"properties": {
"result": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
},
"error": {
"$ref": "#/definitions/runtimeStreamError"
}
},
"title": "Stream result of v3electionpbLeaderResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
"$ref": "#/x-stream-definitions/v3electionpbLeaderResponse"
}
}
},
@ -122,19 +99,13 @@
"/v3/election/proclaim": {
"post": {
"summary": "Proclaim updates the leader's posted value with a new value.",
"operationId": "Election_Proclaim",
"operationId": "Proclaim",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbProclaimResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
@ -155,19 +126,13 @@
"/v3/election/resign": {
"post": {
"summary": "Resign releases election leadership so other campaigners may acquire\nleadership on the election.",
"operationId": "Election_Resign",
"operationId": "Resign",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbResignResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
@ -203,7 +168,7 @@
"revision": {
"type": "string",
"format": "int64",
"description": "revision is the key-value store revision when the request was applied, and it's\nunset (so 0) in case of calls not interacting with key-value store.\nFor watch progress responses, the header.revision indicates progress. All future events\nreceived in this stream are guaranteed to have a higher revision number than the\nheader.revision number."
"description": "revision is the key-value store revision when the request was applied.\nFor watch progress responses, the header.revision indicates progress. All future events\nrecieved in this stream are guaranteed to have a higher revision number than the\nheader.revision number."
},
"raft_term": {
"type": "string",
@ -251,35 +216,11 @@
"type": "object",
"properties": {
"type_url": {
"type": "string",
"description": "A URL/resource name that uniquely identifies the type of the serialized\nprotocol buffer message. This string must contain at least\none \"/\" character. The last segment of the URL's path must represent\nthe fully qualified name of the type (as in\n`path/google.protobuf.Duration`). The name should be in a canonical form\n(e.g., leading \".\" is not accepted).\n\nIn practice, teams usually precompile into the binary all types that they\nexpect it to use in the context of Any. However, for URLs which use the\nscheme `http`, `https`, or no scheme, one can optionally set up a type\nserver that maps type URLs to message definitions as follows:\n\n* If no scheme is provided, `https` is assumed.\n* An HTTP GET on the URL must yield a [google.protobuf.Type][]\n value in binary format, or produce an error.\n* Applications are allowed to cache lookup results based on the\n URL, or have them precompiled into a binary to avoid any\n lookup. Therefore, binary compatibility needs to be preserved\n on changes to types. (Use versioned type names to manage\n breaking changes.)\n\nNote: this functionality is not currently available in the official\nprotobuf release, and it is not used for type URLs beginning with\ntype.googleapis.com.\n\nSchemes other than `http`, `https` (or the empty scheme) might be\nused with implementation specific semantics."
"type": "string"
},
"value": {
"type": "string",
"format": "byte",
"description": "Must be a valid serialized protocol buffer of the above specified type."
}
},
"description": "`Any` contains an arbitrary serialized protocol buffer message along with a\nURL that describes the type of the serialized message.\n\nProtobuf library provides support to pack/unpack Any values in the form\nof utility functions or additional generated methods of the Any type.\n\nExample 1: Pack and unpack a message in C++.\n\n Foo foo = ...;\n Any any;\n any.PackFrom(foo);\n ...\n if (any.UnpackTo(\u0026foo)) {\n ...\n }\n\nExample 2: Pack and unpack a message in Java.\n\n Foo foo = ...;\n Any any = Any.pack(foo);\n ...\n if (any.is(Foo.class)) {\n foo = any.unpack(Foo.class);\n }\n\n Example 3: Pack and unpack a message in Python.\n\n foo = Foo(...)\n any = Any()\n any.Pack(foo)\n ...\n if any.Is(Foo.DESCRIPTOR):\n any.Unpack(foo)\n ...\n\n Example 4: Pack and unpack a message in Go\n\n foo := \u0026pb.Foo{...}\n any, err := ptypes.MarshalAny(foo)\n ...\n foo := \u0026pb.Foo{}\n if err := ptypes.UnmarshalAny(any, foo); err != nil {\n ...\n }\n\nThe pack methods provided by protobuf library will by default use\n'type.googleapis.com/full.type.name' as the type URL and the unpack\nmethods only use the fully qualified type name after the last '/'\nin the type URL, for example \"foo.bar.com/x/y.z\" will yield type\nname \"y.z\".\n\n\nJSON\n====\nThe JSON representation of an `Any` value uses the regular\nrepresentation of the deserialized, embedded message, with an\nadditional field `@type` which contains the type URL. Example:\n\n package google.profile;\n message Person {\n string first_name = 1;\n string last_name = 2;\n }\n\n {\n \"@type\": \"type.googleapis.com/google.profile.Person\",\n \"firstName\": \u003cstring\u003e,\n \"lastName\": \u003cstring\u003e\n }\n\nIf the embedded message type is well-known and has a custom JSON\nrepresentation, that representation will be embedded adding a field\n`value` which holds the custom JSON in addition to the `@type`\nfield. Example (for message [google.protobuf.Duration][]):\n\n {\n \"@type\": \"type.googleapis.com/google.protobuf.Duration\",\n \"value\": \"1.212s\"\n }"
},
"runtimeError": {
"type": "object",
"properties": {
"error": {
"type": "string"
},
"code": {
"type": "integer",
"format": "int32"
},
"message": {
"type": "string"
},
"details": {
"type": "array",
"items": {
"$ref": "#/definitions/protobufAny"
}
"format": "byte"
}
}
},
@ -426,5 +367,19 @@
}
}
}
},
"x-stream-definitions": {
"v3electionpbLeaderResponse": {
"type": "object",
"properties": {
"result": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
},
"error": {
"$ref": "#/definitions/runtimeStreamError"
}
},
"title": "Stream result of v3electionpbLeaderResponse"
}
}
}

View File

@ -1,9 +1,13 @@
{
"swagger": "2.0",
"info": {
"title": "server/etcdserver/api/v3lock/v3lockpb/v3lock.proto",
"title": "etcdserver/api/v3lock/v3lockpb/v3lock.proto",
"version": "version not set"
},
"schemes": [
"http",
"https"
],
"consumes": [
"application/json"
],
@ -14,19 +18,13 @@
"/v3/lock/lock": {
"post": {
"summary": "Lock acquires a distributed shared lock on a given named lock.\nOn success, it will return a unique key that exists so long as the\nlock is held by the caller. This key can be used in conjunction with\ntransactions to safely ensure updates to etcd only occur while holding\nlock ownership. The lock is held until Unlock is called on the key or the\nlease associate with the owner expires.",
"operationId": "Lock_Lock",
"operationId": "Lock",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3lockpbLockResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
@ -47,19 +45,13 @@
"/v3/lock/unlock": {
"post": {
"summary": "Unlock takes a key returned by Lock and releases the hold on lock. The\nnext Lock caller waiting for the lock will then be woken up and given\nownership of the lock.",
"operationId": "Lock_Unlock",
"operationId": "Unlock",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3lockpbUnlockResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
@ -95,7 +87,7 @@
"revision": {
"type": "string",
"format": "int64",
"description": "revision is the key-value store revision when the request was applied, and it's\nunset (so 0) in case of calls not interacting with key-value store.\nFor watch progress responses, the header.revision indicates progress. All future events\nreceived in this stream are guaranteed to have a higher revision number than the\nheader.revision number."
"description": "revision is the key-value store revision when the request was applied.\nFor watch progress responses, the header.revision indicates progress. All future events\nrecieved in this stream are guaranteed to have a higher revision number than the\nheader.revision number."
},
"raft_term": {
"type": "string",
@ -104,42 +96,6 @@
}
}
},
"protobufAny": {
"type": "object",
"properties": {
"type_url": {
"type": "string",
"description": "A URL/resource name that uniquely identifies the type of the serialized\nprotocol buffer message. This string must contain at least\none \"/\" character. The last segment of the URL's path must represent\nthe fully qualified name of the type (as in\n`path/google.protobuf.Duration`). The name should be in a canonical form\n(e.g., leading \".\" is not accepted).\n\nIn practice, teams usually precompile into the binary all types that they\nexpect it to use in the context of Any. However, for URLs which use the\nscheme `http`, `https`, or no scheme, one can optionally set up a type\nserver that maps type URLs to message definitions as follows:\n\n* If no scheme is provided, `https` is assumed.\n* An HTTP GET on the URL must yield a [google.protobuf.Type][]\n value in binary format, or produce an error.\n* Applications are allowed to cache lookup results based on the\n URL, or have them precompiled into a binary to avoid any\n lookup. Therefore, binary compatibility needs to be preserved\n on changes to types. (Use versioned type names to manage\n breaking changes.)\n\nNote: this functionality is not currently available in the official\nprotobuf release, and it is not used for type URLs beginning with\ntype.googleapis.com.\n\nSchemes other than `http`, `https` (or the empty scheme) might be\nused with implementation specific semantics."
},
"value": {
"type": "string",
"format": "byte",
"description": "Must be a valid serialized protocol buffer of the above specified type."
}
},
"description": "`Any` contains an arbitrary serialized protocol buffer message along with a\nURL that describes the type of the serialized message.\n\nProtobuf library provides support to pack/unpack Any values in the form\nof utility functions or additional generated methods of the Any type.\n\nExample 1: Pack and unpack a message in C++.\n\n Foo foo = ...;\n Any any;\n any.PackFrom(foo);\n ...\n if (any.UnpackTo(\u0026foo)) {\n ...\n }\n\nExample 2: Pack and unpack a message in Java.\n\n Foo foo = ...;\n Any any = Any.pack(foo);\n ...\n if (any.is(Foo.class)) {\n foo = any.unpack(Foo.class);\n }\n\n Example 3: Pack and unpack a message in Python.\n\n foo = Foo(...)\n any = Any()\n any.Pack(foo)\n ...\n if any.Is(Foo.DESCRIPTOR):\n any.Unpack(foo)\n ...\n\n Example 4: Pack and unpack a message in Go\n\n foo := \u0026pb.Foo{...}\n any, err := ptypes.MarshalAny(foo)\n ...\n foo := \u0026pb.Foo{}\n if err := ptypes.UnmarshalAny(any, foo); err != nil {\n ...\n }\n\nThe pack methods provided by protobuf library will by default use\n'type.googleapis.com/full.type.name' as the type URL and the unpack\nmethods only use the fully qualified type name after the last '/'\nin the type URL, for example \"foo.bar.com/x/y.z\" will yield type\nname \"y.z\".\n\n\nJSON\n====\nThe JSON representation of an `Any` value uses the regular\nrepresentation of the deserialized, embedded message, with an\nadditional field `@type` which contains the type URL. Example:\n\n package google.profile;\n message Person {\n string first_name = 1;\n string last_name = 2;\n }\n\n {\n \"@type\": \"type.googleapis.com/google.profile.Person\",\n \"firstName\": \u003cstring\u003e,\n \"lastName\": \u003cstring\u003e\n }\n\nIf the embedded message type is well-known and has a custom JSON\nrepresentation, that representation will be embedded adding a field\n`value` which holds the custom JSON in addition to the `@type`\nfield. Example (for message [google.protobuf.Duration][]):\n\n {\n \"@type\": \"type.googleapis.com/google.protobuf.Duration\",\n \"value\": \"1.212s\"\n }"
},
"runtimeError": {
"type": "object",
"properties": {
"error": {
"type": "string"
},
"code": {
"type": "integer",
"format": "int32"
},
"message": {
"type": "string"
},
"details": {
"type": "array",
"items": {
"$ref": "#/definitions/protobufAny"
}
}
}
},
"v3lockpbLockRequest": {
"type": "object",
"properties": {

View File

@ -0,0 +1,9 @@
---
title: Experimental APIs and features
---
For the most part, the etcd project is stable, but we are still moving fast! We believe in the release fast philosophy. We want to get early feedback on features still in development and stabilizing. Thus, there are, and will be more, experimental features and APIs. We plan to improve these features based on the early feedback from the community, or abandon them if there is little interest, in the next few releases. Please do not rely on any experimental features or APIs in production environment.
## The current experimental API/features are:
- [KV ordering](https://godoc.org/github.com/etcd-io/etcd/clientv3/ordering) wrapper. When an etcd client switches endpoints, responses to serializable reads may go backward in time if the new endpoint is lagging behind the rest of the cluster. The ordering wrapper caches the current cluster revision from response headers. If a response revision is less than the cached revision, the client selects another endpoint and reissues the read. Enable in grpcproxy with `--experimental-serializable-ordering`.

View File

@ -0,0 +1,67 @@
---
title: gRPC naming and discovery
---
etcd provides a gRPC resolver to support an alternative name system that fetches endpoints from etcd for discovering gRPC services. The underlying mechanism is based on watching updates to keys prefixed with the service name.
## Using etcd discovery with go-grpc
The etcd client provides a gRPC resolver for resolving gRPC endpoints with an etcd backend. The resolver is initialized with an etcd client and given a target for resolution:
```go
import (
"go.etcd.io/etcd/clientv3"
etcdnaming "go.etcd.io/etcd/clientv3/naming"
"google.golang.org/grpc"
)
...
cli, cerr := clientv3.NewFromURL("http://localhost:2379")
r := &etcdnaming.GRPCResolver{Client: cli}
b := grpc.RoundRobin(r)
conn, gerr := grpc.Dial("my-service", grpc.WithBalancer(b), grpc.WithBlock(), ...)
```
## Managing service endpoints
The etcd resolver treats all keys under the prefix of the resolution target following a "/" (e.g., "my-service/") with JSON-encoded go-grpc `naming.Update` values as potential service endpoints. Endpoints are added to the service by creating new keys and removed from the service by deleting keys.
### Adding an endpoint
New endpoints can be added to the service through `etcdctl`:
```sh
ETCDCTL_API=3 etcdctl put my-service/1.2.3.4 '{"Addr":"1.2.3.4","Metadata":"..."}'
```
The etcd client's `GRPCResolver.Update` method can also register new endpoints with a key matching the `Addr`:
```go
r.Update(context.TODO(), "my-service", naming.Update{Op: naming.Add, Addr: "1.2.3.4", Metadata: "..."})
```
### Deleting an endpoint
Hosts can be deleted from the service through `etcdctl`:
```sh
ETCDCTL_API=3 etcdctl del my-service/1.2.3.4
```
The etcd client's `GRPCResolver.Update` method also supports deleting endpoints:
```go
r.Update(context.TODO(), "my-service", naming.Update{Op: naming.Delete, Addr: "1.2.3.4"})
```
### Registering an endpoint with a lease
Registering an endpoint with a lease ensures that if the host can't maintain a keepalive heartbeat (e.g., its machine fails), it will be removed from the service:
```sh
lease=`ETCDCTL_API=3 etcdctl lease grant 5 | cut -f2 -d' '`
ETCDCTL_API=3 etcdctl put --lease=$lease my-service/1.2.3.4 '{"Addr":"1.2.3.4","Metadata":"..."}'
ETCDCTL_API=3 etcdctl lease keep-alive $lease
```

View File

@ -0,0 +1,499 @@
---
title: Interacting with etcd
---
Users mostly interact with etcd by putting or getting the value of a key. This section describes how to do that by using etcdctl, a command line tool for interacting with etcd server. The concepts described here should apply to the gRPC APIs or client library APIs.
The API version used by etcdctl to speak to etcd may be set to version `2` or `3` via the `ETCDCTL_API` environment variable. By default, etcdctl on master (3.4) uses the v3 API and earlier versions (3.3 and earlier) default to the v2 API.
Note that any key that was created using the v2 API will not be able to be queried via the v3 API. A v3 API ```etcdctl get``` of a v2 key will exit with 0 and no key data, this is the expected behaviour.
```bash
export ETCDCTL_API=3
```
## Find versions
etcdctl version and Server API version can be useful in finding the appropriate commands to be used for performing various operations on etcd.
Here is the command to find the versions:
```bash
$ etcdctl version
etcdctl version: 3.1.0-alpha.0+git
API version: 3.1
```
## Write a key
Applications store keys into the etcd cluster by writing to keys. Every stored key is replicated to all etcd cluster members through the Raft protocol to achieve consistency and reliability.
Here is the command to set the value of key `foo` to `bar`:
```bash
$ etcdctl put foo bar
OK
```
Also a key can be set for a specified interval of time by attaching lease to it.
Here is the command to set the value of key `foo1` to `bar1` for 10s.
```bash
$ etcdctl put foo1 bar1 --lease=1234abcd
OK
```
Note: The lease id `1234abcd` in the above command refers to id returned on creating the lease of 10s. This id can then be attached to the key.
## Read keys
Applications can read values of keys from an etcd cluster. Queries may read a single key, or a range of keys.
Suppose the etcd cluster has stored the following keys:
```bash
foo = bar
foo1 = bar1
foo2 = bar2
foo3 = bar3
```
Here is the command to read the value of key `foo`:
```bash
$ etcdctl get foo
foo
bar
```
Here is the command to read the value of key `foo` in hex format:
```bash
$ etcdctl get foo --hex
\x66\x6f\x6f # Key
\x62\x61\x72 # Value
```
Here is the command to read only the value of key `foo`:
```bash
$ etcdctl get foo --print-value-only
bar
```
Here is the command to range over the keys from `foo` to `foo3`:
```bash
$ etcdctl get foo foo3
foo
bar
foo1
bar1
foo2
bar2
```
Note that `foo3` is excluded since the range is over the half-open interval `[foo, foo3)`, excluding `foo3`.
Here is the command to range over all keys prefixed with `foo`:
```bash
$ etcdctl get --prefix foo
foo
bar
foo1
bar1
foo2
bar2
foo3
bar3
```
Here is the command to range over all keys prefixed with `foo`, limiting the number of results to 2:
```bash
$ etcdctl get --prefix --limit=2 foo
foo
bar
foo1
bar1
```
## Read past version of keys
Applications may want to read superseded versions of a key. For example, an application may wish to roll back to an old configuration by accessing an earlier version of a key. Alternatively, an application may want a consistent view over multiple keys through multiple requests by accessing key history.
Since every modification to the etcd cluster key-value store increments the global revision of an etcd cluster, an application can read superseded keys by providing an older etcd revision.
Suppose an etcd cluster already has the following keys:
```bash
foo = bar # revision = 2
foo1 = bar1 # revision = 3
foo = bar_new # revision = 4
foo1 = bar1_new # revision = 5
```
Here are an example to access the past versions of keys:
```bash
$ etcdctl get --prefix foo # access the most recent versions of keys
foo
bar_new
foo1
bar1_new
$ etcdctl get --prefix --rev=4 foo # access the versions of keys at revision 4
foo
bar_new
foo1
bar1
$ etcdctl get --prefix --rev=3 foo # access the versions of keys at revision 3
foo
bar
foo1
bar1
$ etcdctl get --prefix --rev=2 foo # access the versions of keys at revision 2
foo
bar
$ etcdctl get --prefix --rev=1 foo # access the versions of keys at revision 1
```
## Read keys which are greater than or equal to the byte value of the specified key
Applications may want to read keys which are greater than or equal to the byte value of the specified key.
Suppose an etcd cluster already has the following keys:
```bash
a = 123
b = 456
z = 789
```
Here is the command to read keys which are greater than or equal to the byte value of key `b` :
```bash
$ etcdctl get --from-key b
b
456
z
789
```
## Delete keys
Applications can delete a key or a range of keys from an etcd cluster.
Suppose an etcd cluster already has the following keys:
```bash
foo = bar
foo1 = bar1
foo3 = bar3
zoo = val
zoo1 = val1
zoo2 = val2
a = 123
b = 456
z = 789
```
Here is the command to delete key `foo`:
```bash
$ etcdctl del foo
1 # one key is deleted
```
Here is the command to delete keys ranging from `foo` to `foo9`:
```bash
$ etcdctl del foo foo9
2 # two keys are deleted
```
Here is the command to delete key `zoo` with the deleted key value pair returned:
```bash
$ etcdctl del --prev-kv zoo
1 # one key is deleted
zoo # deleted key
val # the value of the deleted key
```
Here is the command to delete keys having prefix as `zoo`:
```bash
$ etcdctl del --prefix zoo
2 # two keys are deleted
```
Here is the command to delete keys which are greater than or equal to the byte value of key `b` :
```bash
$ etcdctl del --from-key b
2 # two keys are deleted
```
## Watch key changes
Applications can watch on a key or a range of keys to monitor for any updates.
Here is the command to watch on key `foo`:
```bash
$ etcdctl watch foo
# in another terminal: etcdctl put foo bar
PUT
foo
bar
```
Here is the command to watch on key `foo` in hex format:
```bash
$ etcdctl watch foo --hex
# in another terminal: etcdctl put foo bar
PUT
\x66\x6f\x6f # Key
\x62\x61\x72 # Value
```
Here is the command to watch on a range key from `foo` to `foo9`:
```bash
$ etcdctl watch foo foo9
# in another terminal: etcdctl put foo bar
PUT
foo
bar
# in another terminal: etcdctl put foo1 bar1
PUT
foo1
bar1
```
Here is the command to watch on keys having prefix `foo`:
```bash
$ etcdctl watch --prefix foo
# in another terminal: etcdctl put foo bar
PUT
foo
bar
# in another terminal: etcdctl put fooz1 barz1
PUT
fooz1
barz1
```
Here is the command to watch on multiple keys `foo` and `zoo`:
```bash
$ etcdctl watch -i
$ watch foo
$ watch zoo
# in another terminal: etcdctl put foo bar
PUT
foo
bar
# in another terminal: etcdctl put zoo val
PUT
zoo
val
```
## Watch historical changes of keys
Applications may want to watch for historical changes of keys in etcd. For example, an application may wish to receive all the modifications of a key; if the application stays connected to etcd, then `watch` is good enough. However, if the application or etcd fails, a change may happen during the failure, and the application will not receive the update in real time. To guarantee the update is delivered, the application must be able to watch for historical changes to keys. To do this, an application can specify a historical revision on a watch, just like reading past version of keys.
Suppose we finished the following sequence of operations:
```bash
$ etcdctl put foo bar # revision = 2
OK
$ etcdctl put foo1 bar1 # revision = 3
OK
$ etcdctl put foo bar_new # revision = 4
OK
$ etcdctl put foo1 bar1_new # revision = 5
OK
```
Here is an example to watch the historical changes:
```bash
# watch for changes on key `foo` since revision 2
$ etcdctl watch --rev=2 foo
PUT
foo
bar
PUT
foo
bar_new
```
```bash
# watch for changes on key `foo` since revision 3
$ etcdctl watch --rev=3 foo
PUT
foo
bar_new
```
Here is an example to watch only from the last historical change:
```bash
# watch for changes on key `foo` and return last revision value along with modified value
$ etcdctl watch --prev-kv foo
# in another terminal: etcdctl put foo bar_latest
PUT
foo # key
bar_new # last value of foo key before modification
foo # key
bar_latest # value of foo key after modification
```
## Watch progress
Applications may want to check the progress of a watch to determine how up-to-date the watch stream is. For example, if a watch is used to update a cache, it can be useful to know if the cache is stale compared to the revision from a quorum read.
Progress requests can be issued using the "progress" command in interactive watch session to ask the etcd server to send a progress notify update in the watch stream:
```bash
$ etcdctl watch -i
$ watch a
$ progress
progress notify: 1
# in another terminal: etcdctl put x 0
# in another terminal: etcdctl put y 1
$ progress
progress notify: 3
```
Note: The revision number in the progress notify response is the revision from the local etcd server node that the watch stream is connected to. If this node is partitioned and not part of quorum, this progress notify revision might be lower than
than the revision returned by a quorum read against a non-partitioned etcd server node.
## Compacted revisions
As we mentioned, etcd keeps revisions so that applications can read past versions of keys. However, to avoid accumulating an unbounded amount of history, it is important to compact past revisions. After compacting, etcd removes historical revisions, releasing resources for future use. All superseded data with revisions before the compacted revision will be unavailable.
Here is the command to compact the revisions:
```bash
$ etcdctl compact 5
compacted revision 5
# any revisions before the compacted one are not accessible
$ etcdctl get --rev=4 foo
Error: rpc error: code = 11 desc = etcdserver: mvcc: required revision has been compacted
```
Note: The current revision of etcd server can be found using get command on any key (existent or non-existent) in json format. Example is shown below for mykey which does not exist in etcd server:
```bash
$ etcdctl get mykey -w=json
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":15,"raft_term":4}}
```
## Grant leases
Applications can grant leases for keys from an etcd cluster. When a key is attached to a lease, its lifetime is bound to the lease's lifetime which in turn is governed by a time-to-live (TTL). Each lease has a minimum time-to-live (TTL) value specified by the application at grant time. The lease's actual TTL value is at least the minimum TTL and is chosen by the etcd cluster. Once a lease's TTL elapses, the lease expires and all attached keys are deleted.
Here is the command to grant a lease:
```bash
# grant a lease with 10 second TTL
$ etcdctl lease grant 10
lease 32695410dcc0ca06 granted with TTL(10s)
# attach key foo to lease 32695410dcc0ca06
$ etcdctl put --lease=32695410dcc0ca06 foo bar
OK
```
## Revoke leases
Applications revoke leases by lease ID. Revoking a lease deletes all of its attached keys.
Suppose we finished the following sequence of operations:
```bash
$ etcdctl lease grant 10
lease 32695410dcc0ca06 granted with TTL(10s)
$ etcdctl put --lease=32695410dcc0ca06 foo bar
OK
```
Here is the command to revoke the same lease:
```bash
$ etcdctl lease revoke 32695410dcc0ca06
lease 32695410dcc0ca06 revoked
$ etcdctl get foo
# empty response since foo is deleted due to lease revocation
```
## Keep leases alive
Applications can keep a lease alive by refreshing its TTL so it does not expire.
Suppose we finished the following sequence of operations:
```bash
$ etcdctl lease grant 10
lease 32695410dcc0ca06 granted with TTL(10s)
```
Here is the command to keep the same lease alive:
```bash
$ etcdctl lease keep-alive 32695410dcc0ca06
lease 32695410dcc0ca06 keepalived with TTL(10)
lease 32695410dcc0ca06 keepalived with TTL(10)
lease 32695410dcc0ca06 keepalived with TTL(10)
...
```
## Get lease information
Applications may want to know about lease information, so that they can be renewed or to check if the lease still exists or it has expired. Applications may also want to know the keys to which a particular lease is attached.
Suppose we finished the following sequence of operations:
```bash
# grant a lease with 500 second TTL
$ etcdctl lease grant 500
lease 694d5765fc71500b granted with TTL(500s)
# attach key zoo1 to lease 694d5765fc71500b
$ etcdctl put zoo1 val1 --lease=694d5765fc71500b
OK
# attach key zoo2 to lease 694d5765fc71500b
$ etcdctl put zoo2 val2 --lease=694d5765fc71500b
OK
```
Here is the command to get information about the lease:
```bash
$ etcdctl lease timetolive 694d5765fc71500b
lease 694d5765fc71500b granted with TTL(500s), remaining(258s)
```
Here is the command to get information about the lease along with the keys attached with the lease:
```bash
$ etcdctl lease timetolive --keys 694d5765fc71500b
lease 694d5765fc71500b granted with TTL(500s), remaining(132s), attached keys([zoo2 zoo1])
# if the lease has expired or does not exist it will give the below response:
Error: etcdserver: requested lease not found
```

View File

@ -0,0 +1,11 @@
---
title: System limits
---
## Request size limit
etcd is designed to handle small key value pairs typical for metadata. Larger requests will work, but may increase the latency of other requests. By default, the maximum size of any request is 1.5 MiB. This limit is configurable through `--max-request-bytes` flag for etcd server.
## Storage size limit
The default storage size limit is 2GB, configurable with `--quota-backend-bytes` flag. 8GB is a suggested maximum size for normal environments and etcd warns at startup if the configured value exceeds it.

View File

@ -0,0 +1,151 @@
---
title: Set up a local cluster
---
For testing and development deployments, the quickest and easiest way is to configure a local cluster. For a production deployment, refer to the [clustering][clustering] section.
## Local standalone cluster
### Starting a cluster
Run the following to deploy an etcd cluster as a standalone cluster:
```
$ ./etcd
...
```
If the `etcd` binary is not present in the current working directory, it might be located either at `$GOPATH/bin/etcd` or at `/usr/local/bin/etcd`. Run the command appropriately.
The running etcd member listens on `localhost:2379` for client requests.
### Interacting with the cluster
Use `etcdctl` to interact with the running cluster:
1. Store an example key-value pair in the cluster:
```
$ ./etcdctl put foo bar
OK
```
If OK is printed, storing key-value pair is successful.
2. Retrieve the value of `foo`:
```
$ ./etcdctl get foo
bar
```
If `bar` is returned, interaction with the etcd cluster is working as expected.
## Local multi-member cluster
### Starting a cluster
A `Procfile` at the base of the etcd git repository is provided to easily configure a local multi-member cluster. To start a multi-member cluster, navigate to the root of the etcd source tree and perform the following:
1. Install `goreman` to control Procfile-based applications:
```
$ go get github.com/mattn/goreman
```
2. Start a cluster with `goreman` using etcd's stock Procfile:
```
$ goreman -f Procfile start
```
The members start running. They listen on `localhost:2379`, `localhost:22379`, and `localhost:32379` respectively for client requests.
### Interacting with the cluster
Use `etcdctl` to interact with the running cluster:
1. Print the list of members:
```
$ etcdctl --write-out=table --endpoints=localhost:2379 member list
```
The list of etcd members are displayed as follows:
```
+------------------+---------+--------+------------------------+------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+--------+------------------------+------------------------+
| 8211f1d0f64f3269 | started | infra1 | http://127.0.0.1:2380 | http://127.0.0.1:2379 |
| 91bc3c398fb3c146 | started | infra2 | http://127.0.0.1:22380 | http://127.0.0.1:22379 |
| fd422379fda50e48 | started | infra3 | http://127.0.0.1:32380 | http://127.0.0.1:32379 |
+------------------+---------+--------+------------------------+------------------------+
```
2. Store an example key-value pair in the cluster:
```
$ etcdctl put foo bar
OK
```
If OK is printed, storing key-value pair is successful.
### Testing fault tolerance
To exercise etcd's fault tolerance, kill a member and attempt to retrieve the key.
1. Identify the process name of the member to be stopped.
The `Procfile` lists the properties of the multi-member cluster. For example, consider the member with the process name, `etcd2`.
2. Stop the member:
```
# kill etcd2
$ goreman run stop etcd2
```
3. Store a key:
```
$ etcdctl put key hello
OK
```
4. Retrieve the key that is stored in the previous step:
```
$ etcdctl get key
hello
```
5. Retrieve a key from the stopped member:
```
$ etcdctl --endpoints=localhost:22379 get key
```
The command should display an error caused by connection failure:
```
2017/06/18 23:07:35 grpc: Conn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 127.0.0.1:22379: getsockopt: connection refused"; Reconnecting to "localhost:22379"
Error: grpc: timed out trying to connect
```
6. Restart the stopped member:
```
$ goreman run restart etcd2
```
7. Get the key from the restarted member:
```
$ etcdctl --endpoints=localhost:22379 get key
hello
```
Restarting the member re-establish the connection. `etcdctl` will now be able to retrieve the key successfully. To learn more about interacting with etcd, read [interacting with etcd section][interacting].
[interacting]: ./interacting_v3.md
[clustering]: ../op-guide/clustering.md

View File

@ -0,0 +1,115 @@
---
title: Discovery service protocol
---
Discovery service protocol helps new etcd member to discover all other members in cluster bootstrap phase using a shared discovery URL.
Discovery service protocol is _only_ used in cluster bootstrap phase, and cannot be used for runtime reconfiguration or cluster monitoring.
The protocol uses a new discovery token to bootstrap one _unique_ etcd cluster. Remember that one discovery token can represent only one etcd cluster. As long as discovery protocol on this token starts, even if it fails halfway, it must not be used to bootstrap another etcd cluster.
The rest of this article will walk through the discovery process with examples that correspond to a self-hosted discovery cluster. The public discovery service, discovery.etcd.io, functions the same way, but with a layer of polish to abstract away ugly URLs, generate UUIDs automatically, and provide some protections against excessive requests. At its core, the public discovery service still uses an etcd cluster as the data store as described in this document.
## Protocol workflow
The idea of discovery protocol is to use an internal etcd cluster to coordinate bootstrap of a new cluster. First, all new members interact with discovery service and help to generate the expected member list. Then each new member bootstraps its server using this list, which performs the same functionality as -initial-cluster flag.
In the following example workflow, we will list each step of protocol in curl format for ease of understanding.
By convention the etcd discovery protocol uses the key prefix `_etcd/registry`. If `http://example.com` hosts an etcd cluster for discovery service, a full URL to discovery keyspace will be `http://example.com/v2/keys/_etcd/registry`. We will use this as the URL prefix in the example.
### Creating a new discovery token
Generate a unique token that will identify the new cluster. This will be used as a unique prefix in discovery keyspace in the following steps. An easy way to do this is to use `uuidgen`:
```
UUID=$(uuidgen)
```
### Specifying the expected cluster size
The discovery token expects a cluster size that must be specified. The size is used by the discovery service to know when it has found all members that will initially form the cluster.
```
curl -X PUT http://example.com/v2/keys/_etcd/registry/${UUID}/_config/size -d value=${cluster_size}
```
Usually the cluster size is 3, 5 or 7. Check [optimal cluster size][cluster-size] for more details.
### Bringing up etcd processes
Given the discovery URL, use it as `-discovery` flag and bring up etcd processes. Every etcd process will follow this next few steps internally if given a `-discovery` flag.
### Registering itself
The first thing for etcd process is to register itself into the discovery URL as a member. This is done by creating member ID as a key in the discovery URL.
```
curl -X PUT http://example.com/v2/keys/_etcd/registry/${UUID}/${member_id}?prevExist=false -d value="${member_name}=${member_peer_url_1}&${member_name}=${member_peer_url_2}"
```
### Checking the status
It checks the expected cluster size and registration status in discovery URL, and decides what the next action is.
```
curl -X GET http://example.com/v2/keys/_etcd/registry/${UUID}/_config/size
curl -X GET http://example.com/v2/keys/_etcd/registry/${UUID}
```
If registered members are still not enough, it will wait for left members to appear.
If the number of registered members is bigger than the expected size N, it treats the first N registered members as the member list for the cluster. If the member itself is in the member list, the discovery procedure succeeds and it fetches all peers through the member list. If it is not in the member list, the discovery procedure finishes with the failure that the cluster has been full.
In etcd implementation, the member may check the cluster status even before registering itself. So it could fail quickly if the cluster has been full.
### Waiting for all members
The wait process is described in detail in the [etcd API documentation][api].
```
curl -X GET http://example.com/v2/keys/_etcd/registry/${UUID}?wait=true&waitIndex=${current_etcd_index}
```
It keeps waiting until finding all members.
## Public discovery service
CoreOS Inc. hosts a public discovery service at https://discovery.etcd.io/ , which provides some nice features for ease of use.
### Mask key prefix
Public discovery service will redirect `https://discovery.etcd.io/${UUID}` to etcd cluster behind for the key at `/v2/keys/_etcd/registry`. It masks register key prefix for short and readable discovery url.
### Get new token
```
GET /new
Sent query:
size=${cluster_size}
Possible status codes:
200 OK
400 Bad Request
200 Body:
generated discovery url
```
The generation process in the service follows the steps from [Creating a New Discovery Token][new-discovery-token] to [Specifying the Expected Cluster Size][expected-cluster-size].
### Check discovery status
```
GET /${UUID}
```
The status for this discovery token, including the machines that have been registered, can be checked by requesting the value of the UUID.
### Open-source repository
The repository is located at https://github.com/coreos/discovery.etcd.io. It could be used to build a custom discovery service.
[api]: ../v2/api.md#waiting-for-a-change
[cluster-size]: ../v2/admin_guide.md#optimal-cluster-size
[expected-cluster-size]: #specifying-the-expected-cluster-size
[new-discovery-token]: #creating-a-new-discovery-token

View File

@ -0,0 +1,31 @@
---
title: Logging conventions
---
etcd uses the [capnslog][capnslog] library for logging application output categorized into *levels*. A log message's level is determined according to these conventions:
* Error: Data has been lost, a request has failed for a bad reason, or a required resource has been lost
* Examples:
* A failure to allocate disk space for WAL
* Warning: (Hopefully) Temporary conditions that may cause errors, but may work fine. A replica disappearing (that may reconnect) is a warning.
* Examples:
* Failure to send raft message to a remote peer
* Failure to receive heartbeat message within the configured election timeout
* Notice: Normal, but important (uncommon) log information.
* Examples:
* Add a new node into the cluster
* Add a new user into auth subsystem
* Info: Normal, working log information, everything is fine, but helpful notices for auditing or common operations.
* Examples:
* Startup configuration
* Start to do snapshot
* Debug: Everything is still fine, but even common operations may be logged, and less helpful but more quantity of notices.
* Examples:
* Send a normal message to a remote peer
* Write a log entry to disk
[capnslog]: https://github.com/coreos/pkg/tree/master/capnslog

View File

@ -0,0 +1,161 @@
---
title: etcd release guide
---
The guide talks about how to release a new version of etcd.
The procedure includes some manual steps for sanity checking, but it can probably be further scripted. Please keep this document up-to-date if making changes to the release process.
## Release management
etcd community members are assigned to manage the release each etcd major/minor version as well as manage patches
and to each stable release branch. The managers are responsible for communicating the timelines and status of each
release and for ensuring the stability of the release branch.
| Releases | Manager |
| -------- | ------- |
| 3.1 patch (post 3.1.0) | Joe Betz [@jpbetz](https://github.com/jpbetz) |
| 3.2 patch (post 3.2.0) | Joe Betz [@jpbetz](https://github.com/jpbetz) |
| 3.3 patch (post 3.3.0) | Gyuho Lee [@gyuho](https://github.com/gyuho) |
## Prepare release
Set desired version as environment variable for following steps. Here is an example to release 2.3.0:
```
export VERSION=v2.3.0
export PREV_VERSION=v2.2.5
```
All releases version numbers follow the format of [semantic versioning 2.0.0](http://semver.org/).
### Major, minor version release, or its pre-release
- Ensure the relevant milestone on GitHub is complete. All referenced issues should be closed, or moved elsewhere.
- Remove this release from [roadmap](https://github.com/etcd-io/etcd/blob/master/ROADMAP.md), if necessary.
- Ensure the latest upgrade documentation is available.
- Bump [hardcoded MinClusterVerion in the repository](https://github.com/etcd-io/etcd/blob/master/version/version.go#L29), if necessary.
- Add feature capability maps for the new version, if necessary.
### Patch version release
- To request a backport, devlopers submit cherrypick PRs targeting the release branch. The commits should not include merge commits. The commits should be restricted to bug fixes and security patches.
- The cherrypick PRs should target the appropriate release branch (`base:release-<major>-<minor>`). `hack/patch/cherrypick.sh` may be used to automatically generate cherrypick PRs.
- The release patch manager reviews the cherrypick PRs. Please discuss carefully what is backported to the patch release. Each patch release should be strictly better than it's predecessor.
- The release patch manager will cherry-pick these commits starting from the oldest one into stable branch.
## Write release note
- Write introduction for the new release. For example, what major bug we fix, what new features we introduce or what performance improvement we make.
- Put `[GH XXXX]` at the head of change line to reference Pull Request that introduces the change. Moreover, add a link on it to jump to the Pull Request.
- Find PRs with `release-note` label and explain them in `NEWS` file, as a straightforward summary of changes for end-users.
## Tag version
- Bump [hardcoded Version in the repository](https://github.com/etcd-io/etcd/blob/master/version/version.go#L30) to the latest version `${VERSION}`.
- Ensure all tests on CI system are passed.
- Manually check etcd is buildable in Linux, Darwin and Windows.
- Manually check upgrade etcd cluster of previous minor version works well.
- Manually check new features work well.
- Add a signed tag through `git tag -s ${VERSION}`.
- Sanity check tag correctness through `git show tags/$VERSION`.
- Push the tag to GitHub through `git push origin tags/$VERSION`. This assumes `origin` corresponds to "https://github.com/etcd-io/etcd".
## Build release binaries and images
- Ensure `docker` is available.
Run release script in root directory:
```
TAG=gcr.io/etcd-development/etcd ./scripts/release.sh ${VERSION}
```
It generates all release binaries and images under directory ./release.
## Sign binaries, images, and source code
etcd project key must be used to sign the generated binaries and images.`$SUBKEYID` is the key ID of etcd project Yubikey. Connect the key and run `gpg2 --card-status` to get the ID.
The following commands are used for public release sign:
```
cd release
for i in etcd-*{.zip,.tar.gz}; do gpg2 --default-key $SUBKEYID --armor --output ${i}.asc --detach-sign ${i}; done
for i in etcd-*{.zip,.tar.gz}; do gpg2 --verify ${i}.asc ${i}; done
# sign zipped source code files
wget https://github.com/etcd-io/etcd/archive/${VERSION}.zip
gpg2 --armor --default-key $SUBKEYID --output ${VERSION}.zip.asc --detach-sign ${VERSION}.zip
gpg2 --verify ${VERSION}.zip.asc ${VERSION}.zip
wget https://github.com/etcd-io/etcd/archive/${VERSION}.tar.gz
gpg2 --armor --default-key $SUBKEYID --output ${VERSION}.tar.gz.asc --detach-sign ${VERSION}.tar.gz
gpg2 --verify ${VERSION}.tar.gz.asc ${VERSION}.tar.gz
```
The public key for GPG signing can be found at [CoreOS Application Signing Key](https://coreos.com/security/app-signing-key)
## Publish release page in GitHub
- Set release title as the version name.
- Follow the format of previous release pages.
- Attach the generated binaries and signatures.
- Select whether it is a pre-release.
- Publish the release!
## Publish docker image in gcr.io
- Push docker image:
```
gcloud docker -- login -u _json_key -p "$(cat /etc/gcp-key-etcd.json)" https://gcr.io
for TARGET_ARCH in "-arm64" "-ppc64le" ""; do
gcloud docker -- push gcr.io/etcd-development/etcd:${VERSION}${TARGET_ARCH}
done
```
- Add `latest` tag to the new image on [gcr.io](https://console.cloud.google.com/gcr/images/etcd-development/GLOBAL/etcd?project=etcd-development&authuser=1) if this is a stable release.
## Publish docker image in Quay.io
- Build docker images with quay.io:
```
for TARGET_ARCH in "amd64" "arm64" "ppc64le"; do
TAG=quay.io/coreos/etcd GOARCH=${TARGET_ARCH} \
BINARYDIR=release/etcd-${VERSION}-linux-${TARGET_ARCH} \
BUILDDIR=release \
./scripts/build-docker.sh ${VERSION}
done
```
- Push docker image:
```
docker login quay.io
for TARGET_ARCH in "-arm64" "-ppc64le" ""; do
docker push quay.io/coreos/etcd:${VERSION}${TARGET_ARCH}
done
```
- Add `latest` tag to the new image on [quay.io](https://quay.io/repository/coreos/etcd?tag=latest&tab=tags) if this is a stable release.
## Announce to the etcd-dev Googlegroup
- Follow the format of [previous release emails](https://groups.google.com/forum/#!forum/etcd-dev).
- Make sure to include a list of authors that contributed since the previous release - something like the following might be handy:
```
git log ...${PREV_VERSION} --pretty=format:"%an" | sort | uniq | tr '\n' ',' | sed -e 's#,#, #g' -e 's#, $##'
```
- Send email to etcd-dev@googlegroups.com
## Post release
- Create new stable branch through `git push origin ${VERSION_MAJOR}.${VERSION_MINOR}` if this is a major stable release. This assumes `origin` corresponds to "https://github.com/etcd-io/etcd".
- Bump [hardcoded Version in the repository](https://github.com/etcd-io/etcd/blob/master/version/version.go#L30) to the version `${VERSION}+git`.

69
Documentation/dl_build.md Normal file
View File

@ -0,0 +1,69 @@
---
title: Download and build
---
## System requirements
The etcd performance benchmarks run etcd on 8 vCPU, 16GB RAM, 50GB SSD GCE instances, but any relatively modern machine with low latency storage and a few gigabytes of memory should suffice for most use cases. Applications with large v2 data stores will require more memory than a large v3 data store since data is kept in anonymous memory instead of memory mapped from a file. For running etcd on a cloud provider, see the [Example hardware configuration][example-hardware-configurations] documentation.
## Download the pre-built binary
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, appc, and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
## Build the latest version
For those wanting to try the very latest version, build etcd from the `master` branch. [Go](https://golang.org/) version 1.9+ is required to build the latest version of etcd. To ensure etcd is built against well-tested libraries, etcd vendors its dependencies for official release binaries. However, etcd's vendoring is also optional to avoid potential import conflicts when embedding the etcd server or using the etcd client.
To build `etcd` from the `master` branch without a `GOPATH` using the official `build` script:
```sh
$ git clone https://github.com/etcd-io/etcd.git
$ cd etcd
$ ./build
```
To build a vendored `etcd` from the `master` branch via `go get`:
```sh
# GOPATH should be set
$ echo $GOPATH
/Users/example/go
$ go get -v go.etcd.io/etcd
$ go get -v go.etcd.io/etcd/etcdctl
```
## Test the installation
Check the etcd binary is built correctly by starting etcd and setting a key.
### Starting etcd
If etcd is built without using `go get`, run the following:
```sh
$ ./bin/etcd
```
If etcd is built using `go get`, run the following:
```sh
$ $GOPATH/bin/etcd
```
### Setting a key
Run the following:
```sh
$ ./bin/etcdctl put foo bar
OK
```
(or `$GOPATH/bin/etcdctl put foo bar` if etcdctl was installed with `go get`)
If OK is printed, then etcd is working!
[github-release]: https://github.com/etcd-io/etcd/releases/
[go]: https://golang.org/doc/install
[build-script]: ../build
[cmd-directory]: ../cmd
[example-hardware-configurations]: op-guide/hardware.md#example-hardware-configurations

118
Documentation/docs.md Normal file
View File

@ -0,0 +1,118 @@
# Documentation
etcd is a distributed key-value store designed to reliably and quickly preserve and provide access to critical data. It enables reliable distributed coordination through distributed locking, leader elections, and write barriers. An etcd cluster is intended for high availability and permanent data storage and retrieval.
## Getting started
New etcd users and developers should get started by [downloading and building][download_build] etcd. After getting etcd, follow this [quick demo][demo] to see the basics of creating and working with an etcd cluster.
## Developing with etcd
The easiest way to get started using etcd as a distributed key-value store is to [set up a local cluster][local_cluster].
- [Setting up local clusters][local_cluster]
- [Interacting with etcd][interacting]
- gRPC [etcd core][api_ref] and [etcd concurrency][api_concurrency_ref] API references
- [HTTP JSON API through the gRPC gateway][api_grpc_gateway]
- [gRPC naming and discovery][grpc_naming]
- [Client][namespace_client] and [proxy][namespace_proxy] namespacing
- [Embedding etcd][embed_etcd]
- [Experimental features and APIs][experimental]
- [System limits][system-limit]
## Operating etcd clusters
Administrators who need a fault-tolerant etcd cluster for either development or production should begin with a [cluster on multiple machines][clustering].
### Setting up etcd
- [Configuration flags][conf]
- [Multi-member cluster][clustering]
- [gRPC proxy][grpc_proxy]
- [L4 gateway][gateway]
### System configuration
- [Supported systems][supported_platforms]
- [Hardware recommendations][hardware]
- [Performance benchmarking][performance]
- [Tuning][tuning]
### Platform guides
- [Amazon Web Services][aws_platform]
- [Container Linux, systemd][container_linux_platform]
- [FreeBSD][freebsd_platform]
- [Docker container][container_docker]
- [rkt container][container_rkt]
### Security
- [TLS][security]
- [Role-based access control][authentication]
### Maintenance and troubleshooting
- [Frequently asked questions][faq]
- [Monitoring][monitoring]
- [Maintenance][maintenance]
- [Failure modes][failures]
- [Disaster recovery][recovery]
- [Upgrading][upgrading]
## Learning
To learn more about the concepts and internals behind etcd, read the following pages:
- [Why etcd?][why]
- [Understand data model][data_model]
- [Understand APIs][understand_apis]
- [Glossary][glossary]
- Design
- [Auth subsystem][design-auth-v3]
- [Client][design-client]
- [Learner][design-learner]
[api_ref]: dev-guide/api_reference_v3.md
[api_concurrency_ref]: dev-guide/api_concurrency_reference_v3.md
[api_grpc_gateway]: dev-guide/api_grpc_gateway.md
[clustering]: op-guide/clustering.md
[conf]: op-guide/configuration.md
[system-limit]: dev-guide/limit.md
[faq]: faq.md
[why]: learning/why.md
[data_model]: learning/data_model.md
[demo]: demo.md
[download_build]: dl_build.md
[embed_etcd]: https://godoc.org/github.com/etcd-io/etcd/embed
[grpc_naming]: dev-guide/grpc_naming.md
[failures]: op-guide/failures.md
[gateway]: op-guide/gateway.md
[glossary]: learning/glossary.md
[namespace_client]: https://godoc.org/github.com/etcd-io/etcd/clientv3/namespace
[namespace_proxy]: op-guide/grpc_proxy.md#namespacing
[grpc_proxy]: op-guide/grpc_proxy.md
[hardware]: op-guide/hardware.md
[interacting]: dev-guide/interacting_v3.md
[local_cluster]: dev-guide/local_cluster.md
[performance]: op-guide/performance.md
[recovery]: op-guide/recovery.md
[maintenance]: op-guide/maintenance.md
[security]: op-guide/security.md
[monitoring]: op-guide/monitoring.md
[v2_migration]: op-guide/v2-migration.md
[container_rkt]: op-guide/container.md#rkt
[container_docker]: op-guide/container.md#docker
[understand_apis]: learning/api.md
[versioning]: op-guide/versioning.md
[supported_platforms]: op-guide/supported-platform.md
[container_linux_platform]: platforms/container-linux-systemd.md
[freebsd_platform]: platforms/freebsd.md
[aws_platform]: platforms/aws.md
[experimental]: dev-guide/experimental_apis.md
[authentication]: op-guide/authentication.md
[design-auth-v3]: learning/design-auth-v3.md
[design-client]: learning/design-client.md
[design-learner]: learning/design-learner.md
[tuning]: tuning.md
[upgrading]: upgrades/upgrading-etcd.md

View File

@ -0,0 +1,25 @@
# Prometheus Monitoring Mixin for etcd
> NOTE: This project is *alpha* stage. Flags, configuration, behaviour and design may change significantly in following releases.
A set of customisable Prometheus alerts for etcd.
Instructions for use are the same as the [kubernetes-mixin](https://github.com/kubernetes-monitoring/kubernetes-mixin).
## Background
* For more information about monitoring mixins, see this [design doc](https://docs.google.com/document/d/1A9xvzwqnFVSOZ5fD3blKODXfsat5fg6ZhnKu9LK3lB4/edit#).
## Testing alerts
Make sure to have [jsonnet](https://jsonnet.org/) and [gojsontoyaml](https://github.com/brancz/gojsontoyaml) installed.
First compile the mixin to a YAML file, which the promtool will read:
```
jsonnet -e '(import "mixin.libsonnet").prometheusAlerts' | gojsontoyaml > mixin.yaml
```
Then run the unit test:
```
promtool test rules test.yaml
```

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,85 @@
rule_files:
- mixin.yaml
evaluation_interval: 1m
tests:
- interval: 1m
input_series:
- series: 'up{job="etcd",instance="10.10.10.0"}'
values: '1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.1"}'
values: '1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.2"}'
values: '1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0'
alert_rule_test:
- eval_time: 3m
alertname: etcdInsufficientMembers
- eval_time: 5m
alertname: etcdInsufficientMembers
- eval_time: 5m
alertname: etcdMembersDown
- eval_time: 7m
alertname: etcdMembersDown
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": members are down (1).'
- eval_time: 7m
alertname: etcdInsufficientMembers
- eval_time: 11m
alertname: etcdInsufficientMembers
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": insufficient members (1).'
- eval_time: 15m
alertname: etcdInsufficientMembers
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": insufficient members (0).'
- interval: 1m
input_series:
- series: 'up{job="etcd",instance="10.10.10.0"}'
values: '1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.1"}'
values: '1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.2"}'
values: '1 1 1 1 0 0 0 0'
alert_rule_test:
- eval_time: 10m
alertname: etcdMembersDown
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": members are down (2).'
- interval: 1m
input_series:
- series: 'up{job="etcd",instance="10.10.10.0"}'
values: '1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0'
- series: 'up{job="etcd",instance="10.10.10.1"}'
values: '1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0'
- series: 'etcd_network_peer_sent_failures_total{To="member-1",job="etcd",endpoint="test"}'
values: '0 0 1 2 3 4 5 6 7 8 9 10'
alert_rule_test:
- eval_time: 4m
alertname: etcdMembersDown
- eval_time: 6m
alertname: etcdMembersDown
exp_alerts:
- exp_labels:
job: etcd
severity: critical
exp_annotations:
message: 'etcd cluster "etcd": members are down (1).'

165
Documentation/faq.md Normal file
View File

@ -0,0 +1,165 @@
---
title: Frequently Asked Questions (FAQ)
---
## etcd, general
### Do clients have to send requests to the etcd leader?
[Raft][raft] is leader-based; the leader handles all client requests which need cluster consensus. However, the client does not need to know which node is the leader. Any request that requires consensus sent to a follower is automatically forwarded to the leader. Requests that do not require consensus (e.g., serialized reads) can be processed by any cluster member.
## Configuration
### What is the difference between listen-<client,peer>-urls, advertise-client-urls or initial-advertise-peer-urls?
`listen-client-urls` and `listen-peer-urls` specify the local addresses etcd server binds to for accepting incoming connections. To listen on a port for all interfaces, specify `0.0.0.0` as the listen IP address.
`advertise-client-urls` and `initial-advertise-peer-urls` specify the addresses etcd clients or other etcd members should use to contact the etcd server. The advertise addresses must be reachable from the remote machines. Do not advertise addresses like `localhost` or `0.0.0.0` for a production setup since these addresses are unreachable from remote machines.
### Why doesn't changing `--listen-peer-urls` or `--initial-advertise-peer-urls` update the advertised peer URLs in `etcdctl member list`?
A member's advertised peer URLs come from `--initial-advertise-peer-urls` on initial cluster boot. Changing the listen peer URLs or the initial advertise peers after booting the member won't affect the exported advertise peer URLs since changes must go through quorum to avoid membership configuration split brain. Use `etcdctl member update` to update a member's peer URLs.
## Deployment
### System requirements
Since etcd writes data to disk, its performance strongly depends on disk performance. For this reason, SSD is highly recommended. To assess whether a disk is fast enough for etcd, one possibility is using a disk benchmarking tool such as [fio][fio]. For an example on how to do that, read [here][fio-blog-post]. To prevent performance degradation or unintentionally overloading the key-value store, etcd enforces a configurable storage size quota set to 2GB by default. To avoid swapping or running out of memory, the machine should have at least as much RAM to cover the quota. 8GB is a suggested maximum size for normal environments and etcd warns at startup if the configured value exceeds it. At CoreOS, an etcd cluster is usually deployed on dedicated CoreOS Container Linux machines with dual-core processors, 2GB of RAM, and 80GB of SSD *at the very least*. **Note that performance is intrinsically workload dependent; please test before production deployment**. See [hardware][hardware-setup] for more recommendations.
Most stable production environment is Linux operating system with amd64 architecture; see [supported platform][supported-platform] for more.
### Why an odd number of cluster members?
An etcd cluster needs a majority of nodes, a quorum, to agree on updates to the cluster state. For a cluster with n members, quorum is (n/2)+1. For any odd-sized cluster, adding one node will always increase the number of nodes necessary for quorum. Although adding a node to an odd-sized cluster appears better since there are more machines, the fault tolerance is worse since exactly the same number of nodes may fail without losing quorum but there are more nodes that can fail. If the cluster is in a state where it can't tolerate any more failures, adding a node before removing nodes is dangerous because if the new node fails to register with the cluster (e.g., the address is misconfigured), quorum will be permanently lost.
### What is maximum cluster size?
Theoretically, there is no hard limit. However, an etcd cluster probably should have no more than seven nodes. [Google Chubby lock service][chubby], similar to etcd and widely deployed within Google for many years, suggests running five nodes. A 5-member etcd cluster can tolerate two member failures, which is enough in most cases. Although larger clusters provide better fault tolerance, the write performance suffers because data must be replicated across more machines.
### What is failure tolerance?
An etcd cluster operates so long as a member quorum can be established. If quorum is lost through transient network failures (e.g., partitions), etcd automatically and safely resumes once the network recovers and restores quorum; Raft enforces cluster consistency. For power loss, etcd persists the Raft log to disk; etcd replays the log to the point of failure and resumes cluster participation. For permanent hardware failure, the node may be removed from the cluster through [runtime reconfiguration][runtime reconfiguration].
It is recommended to have an odd number of members in a cluster. An odd-size cluster tolerates the same number of failures as an even-size cluster but with fewer nodes. The difference can be seen by comparing even and odd sized clusters:
| Cluster Size | Majority | Failure Tolerance |
|:-:|:-:|:-:|
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | 2 |
| 6 | 4 | 2 |
| 7 | 4 | 3 |
| 8 | 5 | 3 |
| 9 | 5 | 4 |
Adding a member to bring the size of cluster up to an even number doesn't buy additional fault tolerance. Likewise, during a network partition, an odd number of members guarantees that there will always be a majority partition that can continue to operate and be the source of truth when the partition ends.
### Does etcd work in cross-region or cross data center deployments?
Deploying etcd across regions improves etcd's fault tolerance since members are in separate failure domains. The cost is higher consensus request latency from crossing data center boundaries. Since etcd relies on a member quorum for consensus, the latency from crossing data centers will be somewhat pronounced because at least a majority of cluster members must respond to consensus requests. Additionally, cluster data must be replicated across all peers, so there will be bandwidth cost as well.
With longer latencies, the default etcd configuration may cause frequent elections or heartbeat timeouts. See [tuning] for adjusting timeouts for high latency deployments.
## Operation
### How to backup a etcd cluster?
etcdctl provides a `snapshot` command to create backups. See [backup][backup] for more details.
### Should I add a member before removing an unhealthy member?
When replacing an etcd node, it's important to remove the member first and then add its replacement.
etcd employs distributed consensus based on a quorum model; (n/2)+1 members, a majority, must agree on a proposal before it can be committed to the cluster. These proposals include key-value updates and membership changes. This model totally avoids any possibility of split brain inconsistency. The downside is permanent quorum loss is catastrophic.
How this applies to membership: If a 3-member cluster has 1 downed member, it can still make forward progress because the quorum is 2 and 2 members are still live. However, adding a new member to a 3-member cluster will increase the quorum to 3 because 3 votes are required for a majority of 4 members. Since the quorum increased, this extra member buys nothing in terms of fault tolerance; the cluster is still one node failure away from being unrecoverable.
Additionally, that new member is risky because it may turn out to be misconfigured or incapable of joining the cluster. In that case, there's no way to recover quorum because the cluster has two members down and two members up, but needs three votes to change membership to undo the botched membership addition. etcd will by default reject member add attempts that could take down the cluster in this manner.
On the other hand, if the downed member is removed from cluster membership first, the number of members becomes 2 and the quorum remains at 2. Following that removal by adding a new member will also keep the quorum steady at 2. So, even if the new node can't be brought up, it's still possible to remove the new member through quorum on the remaining live members.
### Why won't etcd accept my membership changes?
etcd sets `strict-reconfig-check` in order to reject reconfiguration requests that would cause quorum loss. Abandoning quorum is really risky (especially when the cluster is already unhealthy). Although it may be tempting to disable quorum checking if there's quorum loss to add a new member, this could lead to full fledged cluster inconsistency. For many applications, this will make the problem even worse ("disk geometry corruption" being a candidate for most terrifying).
### Why does etcd lose its leader from disk latency spikes?
This is intentional; disk latency is part of leader liveness. Suppose the cluster leader takes a minute to fsync a raft log update to disk, but the etcd cluster has a one second election timeout. Even though the leader can process network messages within the election interval (e.g., send heartbeats), it's effectively unavailable because it can't commit any new proposals; it's waiting on the slow disk. If the cluster frequently loses its leader due to disk latencies, try [tuning][tuning] the disk settings or etcd time parameters.
### What does the etcd warning "request ignored (cluster ID mismatch)" mean?
Every new etcd cluster generates a new cluster ID based on the initial cluster configuration and a user-provided unique `initial-cluster-token` value. By having unique cluster ID's, etcd is protected from cross-cluster interaction which could corrupt the cluster.
Usually this warning happens after tearing down an old cluster, then reusing some of the peer addresses for the new cluster. If any etcd process from the old cluster is still running it will try to contact the new cluster. The new cluster will recognize a cluster ID mismatch, then ignore the request and emit this warning. This warning is often cleared by ensuring peer addresses among distinct clusters are disjoint.
### What does "mvcc: database space exceeded" mean and how do I fix it?
The [multi-version concurrency control][api-mvcc] data model in etcd keeps an exact history of the keyspace. Without periodically compacting this history (e.g., by setting `--auto-compaction`), etcd will eventually exhaust its storage space. If etcd runs low on storage space, it raises a space quota alarm to protect the cluster from further writes. So long as the alarm is raised, etcd responds to write requests with the error `mvcc: database space exceeded`.
To recover from the low space quota alarm:
1. [Compact][maintenance-compact] etcd's history.
2. [Defragment][maintenance-defragment] every etcd endpoint.
3. [Disarm][maintenance-disarm] the alarm.
### What does the etcd warning "etcdserver/api/v3rpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:43020: read: connection reset by peer" mean?
This is gRPC-side warning when a server receives a TCP RST flag with client-side streams being prematurely closed. For example, a client closes its connection, while gRPC server has not yet processed all HTTP/2 frames in the TCP queue. Some data may have been lost in server side, but it is ok so long as client connection has already been closed.
Only [old versions of gRPC](https://github.com/grpc/grpc-go/issues/1362) log this. etcd [>=v3.2.13 by default log this with DEBUG level](https://github.com/etcd-io/etcd/pull/9080), thus only visible with `--debug` flag enabled.
## Performance
### How should I benchmark etcd?
Try the [benchmark] tool. Current [benchmark results][benchmark-result] are available for comparison.
### What does the etcd warning "apply entries took too long" mean?
After a majority of etcd members agree to commit a request, each etcd server applies the request to its data store and persists the result to disk. Even with a slow mechanical disk or a virtualized network disk, such as Amazons EBS or Googles PD, applying a request should normally take fewer than 50 milliseconds. If the average apply duration exceeds 100 milliseconds, etcd will warn that entries are taking too long to apply.
Usually this issue is caused by a slow disk. The disk could be experiencing contention among etcd and other applications, or the disk is too simply slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [backend_commit_duration_seconds][backend_commit_metrics] (p99 duration should be less than 25ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using faster disk will typically solve the problem.
The second most common cause is CPU starvation. If monitoring of the machines CPU usage shows heavy utilization, there may not be enough compute capacity for etcd. Moving etcd to dedicated machine, increasing process resource isolation cgroups, or renicing the etcd server process into a higher priority can usually solve the problem.
Expensive user requests which access too many keys (e.g., fetching the entire keyspace) can also cause long apply latencies. Accessing fewer than a several hundred keys per request, however, should always be performant.
If none of the above suggestions clear the warnings, please [open an issue][new_issue] with detailed logging, monitoring, metrics and optionally workload information.
### What does the etcd warning "failed to send out heartbeat on time" mean?
etcd uses a leader-based consensus protocol for consistent data replication and log execution. Cluster members elect a single leader, all other members become followers. The elected leader must periodically send heartbeats to its followers to maintain its leadership. Followers infer leader failure if no heartbeats are received within an election interval and trigger an election. If a leader doesnt send its heartbeats in time but is still running, the election is spurious and likely caused by insufficient resources. To catch these soft failures, if the leader skips two heartbeat intervals, etcd will warn it failed to send a heartbeat on time.
Usually this issue is caused by a slow disk. Before the leader sends heartbeats attached with metadata, it may need to persist the metadata to disk. The disk could be experiencing contention among etcd and other applications, or the disk is too simply slow (e.g., a shared virtualized disk). To rule out a slow disk from causing this warning, monitor [wal_fsync_duration_seconds][wal_fsync_duration_seconds] (p99 duration should be less than 10ms) to confirm the disk is reasonably fast. If the disk is too slow, assigning a dedicated disk to etcd or using faster disk will typically solve the problem. To tell whether a disk is fast enough for etcd, a benchmarking tool such as [fio][fio] can be used. Read [here][fio-blog-post] for an example.
The second most common cause is CPU starvation. If monitoring of the machines CPU usage shows heavy utilization, there may not be enough compute capacity for etcd. Moving etcd to dedicated machine, increasing process resource isolation with cgroups, or renicing the etcd server process into a higher priority can usually solve the problem.
A slow network can also cause this issue. If network metrics among the etcd machines shows long latencies or high drop rate, there may not be enough network capacity for etcd. Moving etcd members to a less congested network will typically solve the problem. However, if the etcd cluster is deployed across data centers, long latency between members is expected. For such deployments, tune the `heartbeat-interval` configuration to roughly match the round trip time between the machines, and the `election-timeout` configuration to be at least 5 * `heartbeat-interval`. See [tuning documentation][tuning] for detailed information.
If none of the above suggestions clear the warnings, please [open an issue][new_issue] with detailed logging, monitoring, metrics and optionally workload information.
### What does the etcd warning "snapshotting is taking more than x seconds to finish ..." mean?
etcd sends a snapshot of its complete key-value store to refresh slow followers and for [backups][backup]. Slow snapshot transfer times increase MTTR; if the cluster is ingesting data with high throughput, slow followers may livelock by needing a new snapshot before finishing receiving a snapshot. To catch slow snapshot performance, etcd warns when sending a snapshot takes more than thirty seconds and exceeds the expected transfer time for a 1Gbps connection.
[hardware-setup]: ./op-guide/hardware.md
[supported-platform]: ./op-guide/supported-platform.md
[wal_fsync_duration_seconds]: ./metrics.md#disk
[tuning]: ./tuning.md
[new_issue]: https://github.com/etcd-io/etcd/issues/new
[backend_commit_metrics]: ./metrics.md#disk
[raft]: https://raft.github.io/raft.pdf
[backup]: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/recovery.md#snapshotting-the-keyspace
[chubby]: http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf
[runtime reconfiguration]: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/runtime-configuration.md
[benchmark]: https://github.com/coreos/etcd/tree/master/tools/benchmark
[benchmark-result]: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/performance.md
[api-mvcc]: learning/api.md#revisions
[maintenance-compact]: op-guide/maintenance.md#history-compaction
[maintenance-defragment]: op-guide/maintenance.md#defragmentation
[maintenance-disarm]: ../etcdctl/README.md#alarm-disarm
[fio]: https://github.com/axboe/fio
[fio-blog-post]: https://www.ibm.com/blogs/bluemix/2019/04/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd/

View File

@ -1,68 +0,0 @@
# etcd arm64 test infrastructure
## Infrastructure summary
All etcd project pipelines run via github actions. The etcd project currently maintains dedicated infrastructure for running `arm64` continuous integration testing. This is required because currently github actions runner virtual machines are only offered as `x64`.
The infrastructure consists of two `c3.large.arm` bare metal servers kindly provided by [Equinix Metal](https://www.equinix.com/) via the [CNCF Community Infrastructure Lab].
| Hostname | IP | Operating System | Region |
|-------------------------------|----------------|--------------------|---------------|
| etcd-c3-large-arm64-runner-01 | 86.109.7.233 | Ubuntu 22.04.1 LTS | Washington DC |
| etcd-c3-large-arm64-runner-02 | 147.28.151.226 | Ubuntu 22.04.1 LTS | Washington DC |
## Infrastructure support
The etcd project aims to self manage and resolve issues with project infrastructure internally where possible, however if situations emerge where we need to engage support from Equinix Metal we can open an issue under the [CNCF Community Infrastructure Lab] project or contact the [Equinix Metal support team](https://deploy.equinix.com/support). If the situation is urgent contact @vielmetti directly who can provide further assistance or escalation points.
## Granting infrastructure access
Etcd arm64 test infrastructure access is closely controlled to ensure the infrastructure is secure and protect the integrity of the etcd project.
Access to the infrastructure is defined by the infra admins table below:
| Name | Github | K8s Slack | Email |
|---------------------------|----------------|--------------------|--------------------|
| Marek Siarkowicz | @serathius | @ Serathius | Ref MAINTAINERS.md |
| Benjamin Wang | @ahrtr | @ Benjamin Wang | Ref MAINTAINERS.md |
| Davanum Srinivas | @dimns | @ Dims | davanum@gmail.com |
| Chao Chen | @chaochn47 | @ Chao Chen | chaochn@amazon.com |
| James Blair | @jmhbnz | @ James Blair | etcd@jamma.life |
Individuals in this table are granted access to the infrastructure in two ways:
### 1. Equinix metal web console access
An etcd project exists under the CNCF organisation in the Equinix Metal web console. The direct url to the etcd console is <https://console.equinix.com/projects/1b8c1eb7-983c-4b40-97e0-e317406e232e>.
When a new person is added to the infra admins table, an existing member or etcd maintainer should raise an issue in the [CNCF Community Infrastructure Labs](https://github.com/cncf/cluster/issues) to ensure they are granted web console access.
### 2. Server ssh access
Infra admins can ssh directly to the servers with a dedicated user account for each person, usernames are based on github handles for easy recognition in logs. These infra admins will be able to elevate to the `root` user when necessary via `sudo`.
Access to machines via ssh is strictly via individual ssh key based authentication, and is not permitted directly to the `root` user. Password authentication is never to be used for etcd infrastructure ssh authentication.
When a new member is added to the infra admins table, and existing member with ssh access should complete the following actions on all etcd servers:
- create the new user via `sudo adduser <username>`.
- add their public key to `/home/<username>/.ssh/authorized_keys` file. Note: Public keys are to be retrieved via github only, example: <https://github.com/jmhbnz.keys>.
- add the new user to machine sudoers file via `usermod -aG sudo <username>`.
## Revoking infrastructure access
When a member is removed from the infra admins table existing members must review servers and ensure their user access to etcd infrastructure is revoked by removing the members `/home/<username>/.ssh/authorized_keys` entries.
Note: When revoking access do not delete a user or their home directory from servers, as access may need to be reinstated in future.
### Regular access review
On a regular at least quarterly basis members of the infra admins team are responsible for verifying that no unneccessary infrastructure access exists by reviewing membership of the table above and existing server access.
## Provisioning new machines
If the etcd project needs new `arm64` infrastructure we can open an issue with the [CNCF Community Infrastructure Lab]. An example etcd request is [here](https://github.com/cncf/cluster/issues/227).
Note: `arm64` compute capacity is not currently available in all regions, this can be checked with [metal-cli](https://github.com/equinix/metal-cli) `metal capacity get | grep arm`.
[CNCF Community Infrastructure Lab]: https://github.com/cncf/cluster/issues

View File

@ -0,0 +1,179 @@
---
title: Libraries and tools
---
**Tools**
- [etcdctl](https://github.com/etcd-io/etcd/tree/master/etcdctl) - A command line client for etcd
- [etcd-backup](https://github.com/fanhattan/etcd-backup) - A powerful command line utility for dumping/restoring etcd - Supports v2
- [etcd-dump](https://npmjs.org/package/etcd-dump) - Command line utility for dumping/restoring etcd.
- [etcd-fs](https://github.com/xetorthio/etcd-fs) - FUSE filesystem for etcd
- [etcddir](https://github.com/rekby/etcddir) - Realtime sync etcd and local directory. Work with windows and linux.
- [etcd-browser](https://github.com/henszey/etcd-browser) - A web-based key/value editor for etcd using AngularJS
- [etcd-lock](https://github.com/datawisesystems/etcd-lock) - Master election & distributed r/w lock implementation using etcd - Supports v2
- [etcd-console](https://github.com/matishsiao/etcd-console) - A web-base key/value editor for etcd using PHP
- [etcd-viewer](https://github.com/nikfoundas/etcd-viewer) - An etcd key-value store editor/viewer written in Java
- [etcdtool](https://github.com/mickep76/etcdtool) - Export/Import/Edit etcd directory as JSON/YAML/TOML and Validate directory using JSON schema
- [etcd-rest](https://github.com/mickep76/etcd-rest) - Create generic REST API in Go using etcd as a backend with validation using JSON schema
- [etcdsh](https://github.com/kamilhark/etcdsh) - A command line client with support of command history and tab completion. Supports v2
- [etcdloadtest](https://github.com/sinsharat/etcdloadtest) - A command line load test client for etcd version 3.0 and above.
- [lucas](https://github.com/ringtail/lucas) - A web-based key-value viewer for kubernetes etcd3.0+ cluster.
**Go libraries**
- [etcd/clientv3](https://github.com/etcd-io/etcd/blob/master/clientv3) - the officially maintained Go client for v3
- [etcd/client](https://github.com/etcd-io/etcd/blob/master/client) - the officially maintained Go client for v2
- [go-etcd](https://github.com/coreos/go-etcd) - the deprecated official client. May be useful for older (<2.0.0) versions of etcd.
- [encWrapper](https://github.com/lumjjb/etcd/tree/enc_wrapper/clientwrap/encwrapper) - encWrapper is an encryption wrapper for the etcd client Keys API/KV.
**Java libraries**
- [coreos/jetcd](https://github.com/etcd-io/jetcd) - Supports v3
- [boonproject/etcd](https://github.com/boonproject/boon/blob/master/etcd/README.md) - Supports v2, Async/Sync and waits
- [justinsb/jetcd](https://github.com/justinsb/jetcd)
- [diwakergupta/jetcd](https://github.com/diwakergupta/jetcd) - Supports v2
- [jurmous/etcd4j](https://github.com/jurmous/etcd4j) - Supports v2, Async/Sync, waits and SSL
- [AdoHe/etcd4j](http://github.com/AdoHe/etcd4j) - Supports v2 (enhance for real production cluster)
- [cdancy/etcd-rest](https://github.com/cdancy/etcd-rest) - Uses jclouds to provide a complete implementation of v2 API.
**Scala libraries**
- [maciej/etcd-client](https://github.com/maciej/etcd-client) - Supports v2. Akka HTTP-based fully async client
- [eiipii/etcdhttpclient](https://bitbucket.org/eiipii/etcdhttpclient) - Supports v2. Async HTTP client based on Netty and Scala Futures.
**Perl libraries**
- [hexfusion/perl-net-etcd](https://github.com/hexfusion/perl-net-etcd) - Supports v3 grpc gateway HTTP API
- [robn/p5-etcd](https://github.com/robn/p5-etcd) - Supports v2
**Python libraries**
- [kragniz/python-etcd3](https://github.com/kragniz/python-etcd3) - Client for v3
- [jplana/python-etcd](https://github.com/jplana/python-etcd) - Supports v2
- [russellhaering/txetcd](https://github.com/russellhaering/txetcd) - a Twisted Python library
- [cholcombe973/autodock](https://github.com/cholcombe973/autodock) - A docker deployment automation tool
- [lisael/aioetcd](https://github.com/lisael/aioetcd) - (Python 3.4+) Asyncio coroutines client (Supports v2)
- [txaio-etcd](https://github.com/crossbario/txaio-etcd) - Asynchronous etcd v3-only client library for Twisted (today) and asyncio (future)
- [dims/etcd3-gateway](https://github.com/dims/etcd3-gateway) - etcd v3 API library using the HTTP grpc gateway
- [aioetcd3](https://github.com/gaopeiliang/aioetcd3) - (Python 3.6+) etcd v3 API for asyncio
- [Revolution1/etcd3-py](https://github.com/Revolution1/etcd3-py) - (python2.7 and python3.5+) Python client for etcd v3, using gRPC-JSON-Gateway
**Node libraries**
- [mixer/etcd3](https://github.com/mixer/etcd3) - Supports v3
- [stianeikeland/node-etcd](https://github.com/stianeikeland/node-etcd) - Supports v2 (w Coffeescript)
- [lavagetto/nodejs-etcd](https://github.com/lavagetto/nodejs-etcd) - Supports v2
- [deedubs/node-etcd-config](https://github.com/deedubs/node-etcd-config) - Supports v2
**Ruby libraries**
- [iconara/etcd-rb](https://github.com/iconara/etcd-rb)
- [jpfuentes2/etcd-ruby](https://github.com/jpfuentes2/etcd-ruby)
- [ranjib/etcd-ruby](https://github.com/ranjib/etcd-ruby) - Supports v2
- [davissp14/etcdv3-ruby](https://github.com/davissp14/etcdv3-ruby) - Supports v3
**C libraries**
- [apache/celix/etcdlib](https://github.com/apache/celix/tree/develop/etcdlib) - Supports v2
- [jdarcy/etcd-api](https://github.com/jdarcy/etcd-api) - Supports v2
- [shafreeck/cetcd](https://github.com/shafreeck/cetcd) - Supports v2
**C++ libraries**
- [edwardcapriolo/etcdcpp](https://github.com/edwardcapriolo/etcdcpp) - Supports v2
- [suryanathan/etcdcpp](https://github.com/suryanathan/etcdcpp) - Supports v2 (with waits)
- [nokia/etcd-cpp-api](https://github.com/nokia/etcd-cpp-api) - Supports v2
- [nokia/etcd-cpp-apiv3](https://github.com/nokia/etcd-cpp-apiv3) - Supports v3
**Clojure libraries**
- [aterreno/etcd-clojure](https://github.com/aterreno/etcd-clojure)
- [dwwoelfel/cetcd](https://github.com/dwwoelfel/cetcd) - Supports v2
- [rthomas/clj-etcd](https://github.com/rthomas/clj-etcd) - Supports v2
**Erlang libraries**
- [marshall-lee/etcd.erl](https://github.com/marshall-lee/etcd.erl) - Supports v2
- [zhongwencool/eetcd](https://github.com/zhongwencool/eetcd) - Supports v3+ (GRPC only)
**.Net Libraries**
- [wangjia184/etcdnet](https://github.com/wangjia184/etcdnet) - Supports v2
- [drusellers/etcetera](https://github.com/drusellers/etcetera)
- [shubhamranjan/dotnet-etcd](https://github.com/shubhamranjan/dotnet-etcd) - Supports v3+ (GRPC only)
**PHP Libraries**
- [linkorb/etcd-php](https://github.com/linkorb/etcd-php)
- [activecollab/etcd](https://github.com/activecollab/etcd)
- [ouqiang/etcd-php](https://github.com/ouqiang/etcd-php) - Client for v3 gRPC gateway
**Haskell libraries**
- [wereHamster/etcd-hs](https://github.com/wereHamster/etcd-hs)
**R libraries**
- [ropensci/etseed](https://github.com/ropensci/etseed)
**Nim libraries**
- [etcd_client](https://github.com/FedericoCeratto/nim-etcd-client)
**Tcl libraries**
- [efrecon/etcd-tcl](https://github.com/efrecon/etcd-tcl) - Supports v2, except wait.
**Rust libraries**
- [jimmycuadra/rust-etcd](https://github.com/jimmycuadra/rust-etcd) - Supports v2
**Gradle Plugins**
- [gradle-etcd-rest-plugin](https://github.com/cdancy/gradle-etcd-rest-plugin) - Supports v2
**Chef Integration**
- [coderanger/etcd-chef](https://github.com/coderanger/etcd-chef)
**Chef Cookbook**
- [spheromak/etcd-cookbook](https://github.com/spheromak/etcd-cookbook)
**BOSH Releases**
- [cloudfoundry-community/etcd-boshrelease](https://github.com/cloudfoundry-community/etcd-boshrelease)
- [cloudfoundry/cf-release](https://github.com/cloudfoundry/cf-release/tree/master/jobs/etcd)
**Projects using etcd**
- [etcd Raft users](../raft/README.md#notable-users) - projects using etcd's raft library implementation.
- [apache/celix](https://github.com/apache/celix) - an implementation of the OSGi specification adapted to C and C++
- [binocarlos/yoda](https://github.com/binocarlos/yoda) - etcd + ZeroMQ
- [blox/blox](https://github.com/blox/blox) - a collection of open source projects for container management and orchestration with AWS ECS
- [calavera/active-proxy](https://github.com/calavera/active-proxy) - HTTP Proxy configured with etcd
- [chain/chain](https://github.com/chain/chain) - software designed to operate and connect to highly scalable permissioned blockchain networks
- [derekchiang/etcdplus](https://github.com/derekchiang/etcdplus) - A set of distributed synchronization primitives built upon etcd
- [go-discover](https://github.com/flynn/go-discover) - service discovery in Go
- [gleicon/goreman](https://github.com/gleicon/goreman/tree/etcd) - Branch of the Go Foreman clone with etcd support
- [garethr/hiera-etcd](https://github.com/garethr/hiera-etcd) - Puppet hiera backend using etcd
- [mattn/etcd-vim](https://github.com/mattn/etcd-vim) - SET and GET keys from inside vim
- [mattn/etcdenv](https://github.com/mattn/etcdenv) - "env" shebang with etcd integration
- [kelseyhightower/confd](https://github.com/kelseyhightower/confd) - Manage local app config files using templates and data from etcd
- [configdb](https://git.autistici.org/ai/configdb/tree/master) - A REST relational abstraction on top of arbitrary database backends, aimed at storing configs and inventories.
- [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes) - Container cluster manager introduced by Google.
- [mailgun/vulcand](https://github.com/mailgun/vulcand) - HTTP proxy that uses etcd as a configuration backend.
- [duedil-ltd/discodns](https://github.com/duedil-ltd/discodns) - Simple DNS nameserver using etcd as a database for names and records.
- [skynetservices/skydns](https://github.com/skynetservices/skydns) - RFC compliant DNS server
- [xordataexchange/crypt](https://github.com/xordataexchange/crypt) - Securely store values in etcd using GPG encryption
- [spf13/viper](https://github.com/spf13/viper) - Go configuration library, reads values from ENV, pflags, files, and etcd with optional encryption
- [lytics/metafora](https://github.com/lytics/metafora) - Go distributed task library
- [ryandoyle/nss-etcd](https://github.com/ryandoyle/nss-etcd) - A GNU libc NSS module for resolving names from etcd.
- [Gru](https://github.com/dnaeon/gru) - Orchestration made easy with Go
- [Vitess](http://vitess.io/) - Vitess is a database clustering system for horizontal scaling of MySQL.
- [lclarkmichalek/etcdhcp](https://github.com/lclarkmichalek/etcdhcp) - DHCP server that uses etcd for persistence and coordination.
- [openstack/networking-vpp](https://github.com/openstack/networking-vpp) - A networking driver that programs the [FD.io VPP dataplane](https://wiki.fd.io/view/VPP) to provide [OpenStack](https://www.openstack.org/) cloud virtual networking
- [OpenStack](https://github.com/openstack/governance/blob/master/reference/base-services.rst) - OpenStack services can rely on etcd as a base service.
- [CoreDNS](https://github.com/coredns/coredns/tree/master/plugin/etcd) - CoreDNS is a DNS server that chains plugins, part of CNCF and Kubernetes
- [Uber M3](https://github.com/m3db/m3) - M3: Ubers Open Source, Large-scale Metrics Platform for Prometheus
- [Rook](https://github.com/rook/rook) - Storage Orchestration for Kubernetes
- [Patroni](https://github.com/zalando/patroni) - A template for PostgreSQL High Availability with ZooKeeper, etcd, or Consul
- [Trillian](https://github.com/google/trillian) - Trillian implements a Merkle tree whose contents are served from a data storage layer, to allow scalability to extremely large trees.

View File

@ -0,0 +1,483 @@
---
title: etcd3 API
---
This document is meant to give an overview of the etcd3 API's central design. It is by no means all encompassing, but intended to focus on the basic ideas needed to understand etcd without the distraction of less common API calls. All etcd3 API's are defined in [gRPC services][grpc-service], which categorize remote procedure calls (RPCs) understood by the etcd server. A full listing of all etcd RPCs are documented in markdown in the [gRPC API listing][grpc-api].
## gRPC Services
Every API request sent to an etcd server is a gRPC remote procedure call. RPCs in etcd3 are categorized based on functionality into services.
Services important for dealing with etcd's key space include:
* KV - Creates, updates, fetches, and deletes key-value pairs.
* Watch - Monitors changes to keys.
* Lease - Primitives for consuming client keep-alive messages.
Services which manage the cluster itself include:
* Auth - Role based authentication mechanism for authenticating users.
* Cluster - Provides membership information and configuration facilities.
* Maintenance - Takes recovery snapshots, defragments the store, and returns per-member status information.
### Requests and Responses
All RPCs in etcd3 follow the same format. Each RPC has a function `Name` which takes `NameRequest` as an argument and returns `NameResponse` as a response. For example, here is the `Range` RPC description:
```protobuf
service KV {
Range(RangeRequest) returns (RangeResponse)
...
}
```
### Response header
All Responses from etcd API have an attached response header which includes cluster metadata for the response:
```proto
message ResponseHeader {
uint64 cluster_id = 1;
uint64 member_id = 2;
int64 revision = 3;
uint64 raft_term = 4;
}
```
* Cluster_ID - the ID of the cluster generating the response.
* Member_ID - the ID of the member generating the response.
* Revision - the revision of the key-value store when generating the response.
* Raft_Term - the Raft term of the member when generating the response.
An application may read the `Cluster_ID` or `Member_ID` field to ensure it is communicating with the intended cluster (member).
Applications can use the `Revision` field to know the latest revision of the key-value store. This is especially useful when applications specify a historical revision to make a `time travel query` and wish to know the latest revision at the time of the request.
Applications can use `Raft_Term` to detect when the cluster completes a new leader election.
## Key-Value API
The Key-Value API manipulates key-value pairs stored inside etcd. The majority of requests made to etcd are usually key-value requests.
### System primitives
### Key-Value pair
A key-value pair is the smallest unit that the key-value API can manipulate. Each key-value pair has a number of fields, defined in [protobuf format][kv-proto]:
```protobuf
message KeyValue {
bytes key = 1;
int64 create_revision = 2;
int64 mod_revision = 3;
int64 version = 4;
bytes value = 5;
int64 lease = 6;
}
```
* Key - key in bytes. An empty key is not allowed.
* Value - value in bytes.
* Version - version is the version of the key. A deletion resets the version to zero and any modification of the key increases its version.
* Create_Revision - revision of the last creation on the key.
* Mod_Revision - revision of the last modification on the key.
* Lease - the ID of the lease attached to the key. If lease is 0, then no lease is attached to the key.
In addition to just the key and value, etcd attaches additional revision metadata as part of the key message. This revision information orders keys by time of creation and modification, which is useful for managing concurrency for distributed synchronization. The etcd client's [distributed shared locks][locks] use the creation revision to wait for lock ownership. Similarly, the modification revision is used for detecting [software transactional memory][STM] read set conflicts and waiting on [leader election][elections] updates.
#### Revisions
etcd maintains a 64-bit cluster-wide counter, the store revision, that is incremented each time the key space is modified. The revision serves as a global logical clock, sequentially ordering all updates to the store. The change represented by a new revision is incremental; the data associated with a revision is the data that changed the store. Internally, a new revision means writing the changes to the backend's B+tree, keyed by the incremented revision.
Revisions become more valuable when considering etcd3's [multi-version concurrency control][mvcc] backend. The MVCC model means that the key-value store can be viewed from past revisions since historical key revisions are retained. The retention policy for this history can be configured by cluster administrators for fine-grained storage management; usually etcd3 discards old revisions of keys on a timer. A typical etcd3 cluster retains superseded key data for hours. This also provides reliable handling for long client disconnection, not just transient network disruptions: watchers simply resume from the last observed historical revision. Similarly, to read from the store at a particular point-in-time, read requests can be tagged with a revision to return keys from a view of the key space at the point-in-time that revision was committed.
#### Key ranges
The etcd3 data model indexes all keys over a flat binary key space. This differs from other key-value store systems that use a hierarchical system of organizing keys into directories. Instead of listing keys by directory, keys are listed by key intervals `[a, b)`.
These intervals are often referred to as "ranges" in etcd3. Operations over ranges are more powerful than operations on directories. Like a hierarchical store, intervals support single key lookups via `[a, a+1)` (e.g., ['a', 'a\x00') looks up 'a') and directory lookups by encoding keys by directory depth. In addition to those operations, intervals can also encode prefixes; for example the interval `['a', 'b')` looks up all keys prefixed by the string 'a'.
By convention, ranges for a request are denoted by the fields `key` and `range_end`. The `key` field is the first key of the range and should be non-empty. The `range_end` is the key following the last key of the range. If `range_end` is not given or empty, the range is defined to contain only the key argument. If `range_end` is `key` plus one (e.g., "aa"+1 == "ab", "a\xff"+1 == "b"), then the range represents all keys prefixed with key. If both `key` and `range_end` are '\0', then range represents all keys. If `range_end` is '\0', the range is all keys greater than or equal to the key argument.
### Range
Keys are fetched from the key-value store using the `Range` API call, which takes a `RangeRequest`:
```protobuf
message RangeRequest {
enum SortOrder {
NONE = 0; // default, no sorting
ASCEND = 1; // lowest target value first
DESCEND = 2; // highest target value first
}
enum SortTarget {
KEY = 0;
VERSION = 1;
CREATE = 2;
MOD = 3;
VALUE = 4;
}
bytes key = 1;
bytes range_end = 2;
int64 limit = 3;
int64 revision = 4;
SortOrder sort_order = 5;
SortTarget sort_target = 6;
bool serializable = 7;
bool keys_only = 8;
bool count_only = 9;
int64 min_mod_revision = 10;
int64 max_mod_revision = 11;
int64 min_create_revision = 12;
int64 max_create_revision = 13;
}
```
* Key, Range_End - The key range to fetch.
* Limit - the maximum number of keys returned for the request. When limit is set to 0, it is treated as no limit.
* Revision - the point-in-time of the key-value store to use for the range. If revision is less or equal to zero, the range is over the latest key-value store. If the revision is compacted, ErrCompacted is returned as a response.
* Sort_Order - the ordering for sorted requests.
* Sort_Target - the key-value field to sort.
* Serializable - sets the range request to use serializable member-local reads. By default, Range is linearizable; it reflects the current consensus of the cluster. For better performance and availability, in exchange for possible stale reads, a serializable range request is served locally without needing to reach consensus with other nodes in the cluster.
* Keys_Only - return only the keys and not the values.
* Count_Only - return only the count of the keys in the range.
* Min_Mod_Revision - the lower bound for key mod revisions; filters out lesser mod revisions.
* Max_Mod_Revision - the upper bound for key mod revisions; filters out greater mod revisions.
* Min_Create_Revision - the lower bound for key create revisions; filters out lesser create revisions.
* Max_Create_Revision - the upper bound for key create revisions; filters out greater create revisions.
The client receives a `RangeResponse` message from the `Range` call:
```protobuf
message RangeResponse {
ResponseHeader header = 1;
repeated mvccpb.KeyValue kvs = 2;
bool more = 3;
int64 count = 4;
}
```
* Kvs - the list of key-value pairs matched by the range request. When `Count_Only` is set, `Kvs` is empty.
* More - indicates if there are more keys to return in the requested range if `limit` is set.
* Count - the total number of keys satisfying the range request.
### Put
Keys are saved into the key-value store by issuing a `Put` call, which takes a `PutRequest`:
```protobuf
message PutRequest {
bytes key = 1;
bytes value = 2;
int64 lease = 3;
bool prev_kv = 4;
bool ignore_value = 5;
bool ignore_lease = 6;
}
```
* Key - the name of the key to put into the key-value store.
* Value - the value, in bytes, to associate with the key in the key-value store.
* Lease - the lease ID to associate with the key in the key-value store. A lease value of 0 indicates no lease.
* Prev_Kv - when set, responds with the key-value pair data before the update from this `Put` request.
* Ignore_Value - when set, update the key without changing its current value. Returns an error if the key does not exist.
* Ignore_Lease - when set, update the key without changing its current lease. Returns an error if the key does not exist.
The client receives a `PutResponse` message from the `Put` call:
```protobuf
message PutResponse {
ResponseHeader header = 1;
mvccpb.KeyValue prev_kv = 2;
}
```
* Prev_Kv - the key-value pair overwritten by the `Put`, if `Prev_Kv` was set in the `PutRequest`.
### Delete Range
Ranges of keys are deleted using the `DeleteRange` call, which takes a `DeleteRangeRequest`:
```protobuf
message DeleteRangeRequest {
bytes key = 1;
bytes range_end = 2;
bool prev_kv = 3;
}
```
* Key, Range_End - The key range to delete.
* Prev_Kv - when set, return the contents of the deleted key-value pairs.
The client receives a `DeleteRangeResponse` message from the `DeleteRange` call:
```protobuf
message DeleteRangeResponse {
ResponseHeader header = 1;
int64 deleted = 2;
repeated mvccpb.KeyValue prev_kvs = 3;
}
```
* Deleted - number of keys deleted.
* Prev_Kv - a list of all key-value pairs deleted by the `DeleteRange` operation.
### Transaction
A transaction is an atomic If/Then/Else construct over the key-value store. It provides a primitive for grouping requests together in atomic blocks (i.e., then/else) whose execution is guarded (i.e., if) based on the contents of the key-value store. Transactions can be used for protecting keys from unintended concurrent updates, building compare-and-swap operations, and developing higher-level concurrency control.
A transaction can atomically process multiple requests in a single request. For modifications to the key-value store, this means the store's revision is incremented only once for the transaction and all events generated by the transaction will have the same revision. However, modifications to the same key multiple times within a single transaction are forbidden.
All transactions are guarded by a conjunction of comparisons, similar to an `If` statement. Each comparison checks a single key in the store. It may check for the absence or presence of a value, compare with a given value, or check a key's revision or version. Two different comparisons may apply to the same or different keys. All comparisons are applied atomically; if all comparisons are true, the transaction is said to succeed and etcd applies the transaction's then / `success` request block, otherwise it is said to fail and applies the else / `failure` request block.
Each comparison is encoded as a `Compare` message:
```protobuf
message Compare {
enum CompareResult {
EQUAL = 0;
GREATER = 1;
LESS = 2;
NOT_EQUAL = 3;
}
enum CompareTarget {
VERSION = 0;
CREATE = 1;
MOD = 2;
VALUE= 3;
}
CompareResult result = 1;
// target is the key-value field to inspect for the comparison.
CompareTarget target = 2;
// key is the subject key for the comparison operation.
bytes key = 3;
oneof target_union {
int64 version = 4;
int64 create_revision = 5;
int64 mod_revision = 6;
bytes value = 7;
}
}
```
* Result - the kind of logical comparison operation (e.g., equal, less than, etc).
* Target - the key-value field to be compared. Either the key's version, create revision, modification revision, or value.
* Key - the key for the comparison.
* Target_Union - the user-specified data for the comparison.
After processing the comparison block, the transaction applies a block of requests. A block is a list of `RequestOp` messages:
```protobuf
message RequestOp {
// request is a union of request types accepted by a transaction.
oneof request {
RangeRequest request_range = 1;
PutRequest request_put = 2;
DeleteRangeRequest request_delete_range = 3;
}
}
```
* Request_Range - a `RangeRequest`.
* Request_Put - a `PutRequest`. The keys must be unique. It may not share keys with any other Puts or Deletes.
* Request_Delete_Range - a `DeleteRangeRequest`. It may not share keys with any Puts or Deletes requests.
All together, a transaction is issued with a `Txn` API call, which takes a `TxnRequest`:
```protobuf
message TxnRequest {
repeated Compare compare = 1;
repeated RequestOp success = 2;
repeated RequestOp failure = 3;
}
```
* Compare - A list of predicates representing a conjunction of terms for guarding the transaction.
* Success - A list of requests to process if all compare tests evaluate to true.
* Failure - A list of requests to process if any compare test evaluates to false.
The client receives a `TxnResponse` message from the `Txn` call:
```protobuf
message TxnResponse {
ResponseHeader header = 1;
bool succeeded = 2;
repeated ResponseOp responses = 3;
}
```
* Succeeded - Whether `Compare` evaluated to true or false.
* Responses - A list of responses corresponding to the results from applying the `Success` block if succeeded is true or the `Failure` if succeeded is false.
The `Responses` list corresponds to the results from the applied `RequestOp` list, with each response encoded as a `ResponseOp`:
```protobuf
message ResponseOp {
oneof response {
RangeResponse response_range = 1;
PutResponse response_put = 2;
DeleteRangeResponse response_delete_range = 3;
}
}
```
## Watch API
The `Watch` API provides an event-based interface for asynchronously monitoring changes to keys. An etcd3 watch waits for changes to keys by continuously watching from a given revision, either current or historical, and streams key updates back to the client.
### Events
Every change to every key is represented with `Event` messages. An `Event` message provides both the update's data and the type of update:
```protobuf
message Event {
enum EventType {
PUT = 0;
DELETE = 1;
}
EventType type = 1;
KeyValue kv = 2;
KeyValue prev_kv = 3;
}
```
* Type - The kind of event. A PUT type indicates new data has been stored to the key. A DELETE indicates the key was deleted.
* KV - The KeyValue associated with the event. A PUT event contains current kv pair. A PUT event with kv.Version=1 indicates the creation of a key. A DELETE event contains the deleted key with its modification revision set to the revision of deletion.
* Prev_KV - The key-value pair for the key from the revision immediately before the event. To save bandwidth, it is only filled out if the watch has explicitly enabled it.
### Watch streams
Watches are long-running requests and use gRPC streams to stream event data. A watch stream is bi-directional; the client writes to the stream to establish watches and reads to receive watch events. A single watch stream can multiplex many distinct watches by tagging events with per-watch identifiers. This multiplexing helps reducing the memory footprint and connection overhead on the core etcd cluster.
Watches make three guarantees about events:
* Ordered - events are ordered by revision; an event will never appear on a watch if it precedes an event in time that has already been posted.
* Reliable - a sequence of events will never drop any subsequence of events; if there are events ordered in time as a < b < c, then if the watch receives events a and c, it is guaranteed to receive b.
* Atomic - a list of events is guaranteed to encompass complete revisions; updates in the same revision over multiple keys will not be split over several lists of events.
A client creates a watch by sending a `WatchCreateRequest` over a stream returned by `Watch`:
```protobuf
message WatchCreateRequest {
bytes key = 1;
bytes range_end = 2;
int64 start_revision = 3;
bool progress_notify = 4;
enum FilterType {
NOPUT = 0;
NODELETE = 1;
}
repeated FilterType filters = 5;
bool prev_kv = 6;
}
```
* Key, Range_End - The key range to watch.
* Start_Revision - An optional revision for where to inclusively begin watching. If not given, it will stream events following the revision of the watch creation response header revision. The entire available event history can be watched starting from the last compaction revision.
* Progress_Notify - When set, the watch will periodically receive a WatchResponse with no events, if there are no recent events. It is useful when clients wish to recover a disconnected watcher starting from a recent known revision. The etcd server decides how often to send notifications based on current server load.
* Filters - A list of event types to filter away at server side.
* Prev_Kv - When set, the watch receives the key-value data from before the event happens. This is useful for knowing what data has been overwritten.
In response to a `WatchCreateRequest` or if there is a new event for some established watch, the client receives a `WatchResponse`:
```protobuf
message WatchResponse {
ResponseHeader header = 1;
int64 watch_id = 2;
bool created = 3;
bool canceled = 4;
int64 compact_revision = 5;
repeated mvccpb.Event events = 11;
}
```
* Watch_ID - the ID of the watch that corresponds to the response.
* Created - set to true if the response is for a create watch request. The client should store the ID and expect to receive events for the watch on the stream. All events sent to the created watcher will have the same watch_id.
* Canceled - set to true if the response is for a cancel watch request. No further events will be sent to the canceled watcher.
* Compact_Revision - set to the minimum historical revision available to etcd if a watcher tries watching at a compacted revision. This happens when creating a watcher at a compacted revision or the watcher cannot catch up with the progress of the key-value store. The watcher will be canceled; creating new watches with the same start_revision will fail.
* Events - a list of new events in sequence corresponding to the given watch ID.
If the client wishes to stop receiving events for a watch, it issues a `WatchCancelRequest`:
```protobuf
message WatchCancelRequest {
int64 watch_id = 1;
}
```
* Watch_ID - the ID of the watch to cancel so that no more events are transmitted.
## Lease API
Leases are a mechanism for detecting client liveness. The cluster grants leases with a time-to-live. A lease expires if the etcd cluster does not receive a keepAlive within a given TTL period.
To tie leases into the key-value store, each key may be attached to at most one lease. When a lease expires or is revoked, all keys attached to that lease will be deleted. Each expired key generates a delete event in the event history.
### Obtaining leases
Leases are obtained through the `LeaseGrant` API call, which takes a `LeaseGrantRequest`:
```protobuf
message LeaseGrantRequest {
int64 TTL = 1;
int64 ID = 2;
}
```
* TTL - the advisory time-to-live, in seconds.
* ID - the requested ID for the lease. If ID is set to 0, etcd will choose an ID.
The client receives a `LeaseGrantResponse` from the `LeaseGrant` call:
```protobuf
message LeaseGrantResponse {
ResponseHeader header = 1;
int64 ID = 2;
int64 TTL = 3;
}
```
* ID - the lease ID for the granted lease.
* TTL - is the server selected time-to-live, in seconds, for the lease.
```protobuf
message LeaseRevokeRequest {
int64 ID = 1;
}
```
* ID - the lease ID to revoke. When the lease is revoked, all attached keys are deleted.
### Keep alives
Leases are refreshed using a bi-directional stream created with the `LeaseKeepAlive` API call. When the client wishes to refresh a lease, it sends a `LeaseKeepAliveRequest` over the stream:
```protobuf
message LeaseKeepAliveRequest {
int64 ID = 1;
}
```
* ID - the lease ID for the lease to keep alive.
The keep alive stream responds with a `LeaseKeepAliveResponse`:
```protobuf
message LeaseKeepAliveResponse {
ResponseHeader header = 1;
int64 ID = 2;
int64 TTL = 3;
}
```
* ID - the lease that was refreshed with a new TTL.
* TTL - the new time-to-live, in seconds, that the lease has remaining.
[elections]: https://github.com/etcd-io/etcd/blob/master/clientv3/concurrency/election.go
[kv-proto]: https://github.com/etcd-io/etcd/blob/master/mvcc/mvccpb/kv.proto
[grpc-api]: ../dev-guide/api_reference_v3.md
[grpc-service]: https://github.com/etcd-io/etcd/blob/master/etcdserver/etcdserverpb/rpc.proto
[locks]: https://github.com/etcd-io/etcd/blob/master/clientv3/concurrency/mutex.go
[mvcc]: https://en.wikipedia.org/wiki/Multiversion_concurrency_control
[stm]: https://github.com/etcd-io/etcd/blob/master/clientv3/concurrency/stm.go

View File

@ -0,0 +1,66 @@
---
title: KV API guarantees
---
etcd is a consistent and durable key value store with [mini-transaction][txn] support. The key value store is exposed through the KV APIs. etcd tries to ensure the strongest consistency and durability guarantees for a distributed system. This specification enumerates the KV API guarantees made by etcd.
### APIs to consider
* Read APIs
* range
* watch
* Write APIs
* put
* delete
* Combination (read-modify-write) APIs
* txn
### etcd specific definitions
#### Operation completed
An etcd operation is considered complete when it is committed through consensus, and therefore “executed” -- permanently stored -- by the etcd storage engine. The client knows an operation is completed when it receives a response from the etcd server. Note that the client may be uncertain about the status of an operation if it times out, or there is a network disruption between the client and the etcd member. etcd may also abort operations when there is a leader election. etcd does not send `abort` responses to clients outstanding requests in this event.
#### Revision
An etcd operation that modifies the key value store is assigned a single increasing revision. A transaction operation might modify the key value store multiple times, but only one revision is assigned. The revision attribute of a key value pair that was modified by the operation has the same value as the revision of the operation. The revision can be used as a logical clock for key value store. A key value pair that has a larger revision is modified after a key value pair with a smaller revision. Two key value pairs that have the same revision are modified by an operation "concurrently".
### Guarantees provided
#### Atomicity
All API requests are atomic; an operation either completes entirely or not at all. For watch requests, all events generated by one operation will be in one watch response. Watch never observes partial events for a single operation.
#### Consistency
All API calls ensure [sequential consistency][seq_consistency], the strongest consistency guarantee available from distributed systems. No matter which etcd member server a client makes requests to, a client reads the same events in the same order. If two members complete the same number of operations, the state of the two members is consistent.
For watch operations, etcd guarantees to return the same value for the same key across all members for the same revision. For range operations, etcd has a similar guarantee for [linearized][Linearizability] access; serialized access may be behind the quorum state, so that the later revision is not yet available.
As with all distributed systems, it is impossible for etcd to ensure [strict consistency][strict_consistency]. etcd does not guarantee that it will return to a read the “most recent” value (as measured by a wall clock when a request is completed) available on any cluster member.
#### Isolation
etcd ensures [serializable isolation][serializable_isolation], which is the highest isolation level available in distributed systems. Read operations will never observe any intermediate data.
#### Durability
Any completed operations are durable. All accessible data is also durable data. A read will never return data that has not been made durable.
#### Linearizability
Linearizability (also known as Atomic Consistency or External Consistency) is a consistency level between strict consistency and sequential consistency.
For linearizability, suppose each operation receives a timestamp from a loosely synchronized global clock. Operations are linearized if and only if they always complete as though they were executed in a sequential order and each operation appears to complete in the order specified by the program. Likewise, if an operations timestamp precedes another, that operation must also precede the other operation in the sequence.
For example, consider a client completing a write at time point 1 (*t1*). A client issuing a read at *t2* (for *t2* > *t1*) should receive a value at least as recent as the previous write, completed at *t1*. However, the read might actually complete only by *t3*. Linearizability guarantees the read returns the most current value. Without linearizability guarantee, the returned value, current at *t2* when the read began, might be "stale" by *t3* because a concurrent write might happen between *t2* and *t3*.
etcd does not ensure linearizability for watch operations. Users are expected to verify the revision of watch responses to ensure correct ordering.
etcd ensures linearizability for all other operations by default. Linearizability comes with a cost, however, because linearized requests must go through the Raft consensus process. To obtain lower latencies and higher throughput for read requests, clients can configure a requests consistency mode to `serializable`, which may access stale data with respect to quorum, but removes the performance penalty of linearized accesses' reliance on live consensus.
[seq_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Sequential_consistency
[strict_consistency]: https://en.wikipedia.org/wiki/Consistency_model#Strict_consistency
[serializable_isolation]: https://en.wikipedia.org/wiki/Isolation_(database_systems)#Serializable
[Linearizability]: #Linearizability
[txn]: api.md#transactions

View File

@ -0,0 +1,27 @@
---
title: Data model
---
etcd is designed to reliably store infrequently updated data and provide reliable watch queries. etcd exposes previous versions of key-value pairs to support inexpensive snapshots and watch history events (“time travel queries”). A persistent, multi-version, concurrency-control data model is a good fit for these use cases.
etcd stores data in a multiversion [persistent][persistent-ds] key-value store. The persistent key-value store preserves the previous version of a key-value pair when its value is superseded with new data. The key-value store is effectively immutable; its operations do not update the structure in-place, but instead always generate a new updated structure. All past versions of keys are still accessible and watchable after modification. To prevent the data store from growing indefinitely over time and from maintaining old versions, the store may be compacted to shed the oldest versions of superseded data.
### Logical view
The stores logical view is a flat binary key space. The key space has a lexically sorted index on byte string keys so range queries are inexpensive.
The key space maintains multiple **revisions**. Each atomic mutative operation (e.g., a transaction operation may contain multiple operations) creates a new revision on the key space. All data held by previous revisions remains unchanged. Old versions of key can still be accessed through previous revisions. Likewise, revisions are indexed as well; ranging over revisions with watchers is efficient. If the store is compacted to save space, revisions before the compact revision will be removed. Revisions are monotonically increasing over the lifetime of a cluster.
A key's life spans a generation, from creation to deletion. Each key may have one or multiple generations. Creating a key increments the **version** of that key, starting at 1 if the key does not exist at the current revision. Deleting a key generates a key tombstone, concluding the keys current generation by resetting its version to 0. Each modification of a key increments its version; so, versions are monotonically increasing within a key's generation. Once a compaction happens, any generation ended before the compaction revision will be removed, and values set before the compaction revision except the latest one will be removed.
### Physical view
etcd stores the physical data as key-value pairs in a persistent [b+tree][b+tree]. Each revision of the stores state only contains the delta from its previous revision to be efficient. A single revision may correspond to multiple keys in the tree.
The key of key-value pair is a 3-tuple (major, sub, type). Major is the store revision holding the key. Sub differentiates among keys within the same revision. Type is an optional suffix for special value (e.g., `t` if the value contains a tombstone). The value of the key-value pair contains the modification from previous revision, thus one delta from previous revision. The b+tree is ordered by key in lexical byte-order. Ranged lookups over revision deltas are fast; this enables quickly finding modifications from one specific revision to another. Compaction removes out-of-date keys-value pairs.
etcd also keeps a secondary in-memory [btree][btree] index to speed up range queries over keys. The keys in the btree index are the keys of the store exposed to user. The value is a pointer to the modification of the persistent b+tree. Compaction removes dead pointers.
[persistent-ds]: https://en.wikipedia.org/wiki/Persistent_data_structure
[btree]: https://en.wikipedia.org/wiki/B-tree
[b+tree]: https://en.wikipedia.org/wiki/B%2B_tree

View File

@ -0,0 +1,79 @@
---
title: etcd v3 authentication design
---
## Why not reuse the v2 auth system?
The v3 protocol uses gRPC as its transport instead of a RESTful interface like v2. This new protocol provides an opportunity to iterate on and improve the v2 design. For example, v3 auth has connection based authentication, rather than v2's slower per-request authentication. Additionally, v2 auth's semantics tend to be unwieldy in practice with respect to reasoning about consistency, which will be described in the next sections. For v3, there is a well-defined description and implementation of the authentication mechanism which fixes the deficiencies in the v2 auth system.
### Functionality requirements
* Per connection authentication, not per request
* User ID + password based authentication implemented for the gRPC API
* Authentication must be refreshed after auth policy changes
* Its functionality should be as simple and useful as v2
* v3 provides a flat key space, unlike the directory structure of v2. Permission checking will be provided as interval matching.
* It should have stronger consistency guarantees than v2 auth
### Main required changes
* A client must create a dedicated connection only for authentication before sending authenticated requests
* Add permission information (user ID and authorized revision) to the Raft commands (`etcdserverpb.InternalRaftRequest`)
* Every request is permission checked in the state machine layer, rather than API layer
### Permission metadata consistency
The metadata for auth should also be stored and managed in the storage controlled by etcd's Raft protocol like other data stored in etcd. It is required for not sacrificing availability and consistency of the entire etcd cluster. If reading or writing the metadata (e.g. permission information) needs an agreement of every node (more than quorum), single node failure can stop the entire cluster. Requiring all nodes to agree at once means that checking ordinary read/write requests cannot be completed if any cluster member is down, even if the cluster has an available quorum. This unanimous scheme ultimately degrades cluster availability; quorum based consensus from raft should suffice since agreement follows from consistent ordering.
The authentication mechanism in the etcd v2 protocol has a tricky part because the metadata consistency should work as in the above, but does not: each permission check is processed by the etcd member that receives the client request (etcdserver/api/v2http/client.go), including follower members. Therefore, it's possible the check may be based on stale metadata.
This staleness means that auth configuration cannot be reflected as soon as operators execute etcdctl. Therefore there is no way to know how long the stale metadata is active. Practically, the configuration change is reflected immediately after the command execution. However, in some cases of heavy load, the inconsistent state can be prolonged and it might result in counter-intuitive situations for users and developers. It requires a workaround like this: https://github.com/etcd-io/etcd/pull/4317#issuecomment-179037582
### Inconsistent permissions are unsafe for linearized requests
Inconsistent authentication state is most serious for writes. Even if an operator disables write on a user, if the write is only ordered with respect to the key value store but not the authentication system, it's possible the write will complete successfully. Without ordering on both the auth store and the key-value store, the system will be susceptible to stale permission attacks.
Therefore, the permission checking logic should be added to the state machine of etcd. Each state machine should check the requests based on its permission information in the apply phase (so the auth information must not be stale).
## Design and implementation
### Authentication
At first, a client must create a gRPC connection only to authenticate its user ID and password. An etcd server will respond with an authentication reply. The response will be an authentication token on success or an error on failure. The client can use its authentication token to present its credentials to etcd when making API requests.
The client connection used to request the authentication token is typically thrown away; it cannot carry the new token's credentials. This is because gRPC doesn't provide a way for adding per RPC credential after creation of the connection (calling `grpc.Dial()`). Therefore, a client cannot assign a token to its connection that is obtained through the connection. The client needs a new connection for using the token.
#### Notes on the implementation of `Authenticate()` RPC
`Authenticate()` RPC generates an authentication token based on a given user name and password. etcd saves and checks a configured password and a given password using Go's `bcrypt` package. By design, `bcrypt`'s password checking mechanism is computationally expensive, taking nearly 100ms on an ordinary x64 server. Therefore, performing this check in the state machine apply phase would cause performance trouble: the entire etcd cluster can only serve almost 10 `Authenticate()` requests per second.
For good performance, the v3 auth mechanism checks passwords in etcd's API layer, where it can be parallelized outside of raft. However, this can lead to potential time-of-check/time-of-use (TOCTOU) permission lapses:
1. client A sends a request `Authenticate()`
1. the API layer processes the password checking part of `Authenticate()`
1. another client B sends a request of `ChangePassword()` and the server completes it
1. the state machine layer processes the part of getting a revision number for the `Authenticate()` from A
1. the server returns a success to A
1. now A is authenticated on an obsolete password
For avoiding such a situation, the API layer performs *version number validation* based on the revision number of the auth store. During password checking, the API layer saves the revision number of auth store. After successful password checking, the API layer compares the saved revision number and the latest revision number. If the numbers differ, it means someone else updated the auth metadata. So it retries the checking. With this mechanism, the successful password checking based on the obsolete password can be avoided.
### Resolving a token in the API layer
After authenticating with `Authenticate()`, a client can create a gRPC connection as it would without auth. In addition to the existing initialization process, the client must associate the token with the newly created connection. `grpc.WithPerRPCCredentials()` provides the functionality for this purpose.
Every authenticated request from the client has a token. The token can be obtained with `grpc.metadata.FromIncomingContext()` in the server side. The server can obtain who is issuing the request and when the user was authorized. The information will be filled by the API layer in the header (`etcdserverpb.RequestHeader.Username` and `etcdserverpb.RequestHeader.AuthRevision`) of a raft log entry (`etcdserverpb.InternalRaftRequest`).
### Checking permission in the state machine
The auth info in `etcdserverpb.RequestHeader` is checked in the apply phase of the state machine. This step checks the user is granted permission to requested keys on the latest revision of auth store.
### Two types of tokens: simple and JWT
There are two kinds of token types: simple and JWT. The simple token isn't designed for production use cases. Its tokens aren't cryptographically signed and servers must statefully track token-user correspondence; it is meant for development testing. JWT tokens should be used for production deployments since it is cryptographically signed and verified. From the implementation perspective, JWT is stateless. Its token can include metadata including username and revision, so servers don't need to remember correspondence between tokens and the metadata.
## Notes on the difference between KVS models and file system models
etcd v3 is a KVS, not a file system. So the permissions can be granted to the users in form of an exact key name or a key range like `["start key", "end key")`. It means that granting a permission of a nonexistent key is possible. Users should care about unintended permission granting. In a case of file system like system (e.g. Chubby or ZooKeeper), an inode like data structure can include the permission information. So granting permission to a nonexist key won't be possible (except the case of sticky bits).
The etcd v3 model requires multiple lookup of the metadata unlike the file system like systems. The worst case lookup cost will be sum the user's total granted keys and intervals. The cost cannot be avoided because v3's flat key space is completely different from Unix's file system model (every inode includes permission metadata). Practically the cost wont be a serious problem because the metadata is small enough to benefit from caching.

View File

@ -0,0 +1,138 @@
---
title: etcd client design
---
etcd Client Design
==================
*Gyuho Lee (github.com/gyuho, Amazon Web Services, Inc.), Joe Betz (github.com/jpbetz, Google Inc.)*
Introduction
============
etcd server has proven its robustness with years of failure injection testing. Most complex application logic is already handled by etcd server and its data stores (e.g. cluster membership is transparent to clients, with Raft-layer forwarding proposals to leader). Although server components are correct, its composition with client requires a different set of intricate protocols to guarantee its correctness and high availability under faulty conditions. Ideally, etcd server provides one logical cluster view of many physical machines, and client implements automatic failover between replicas. This documents client architectural decisions and its implementation details.
Glossary
========
*clientv3*: etcd Official Go client for etcd v3 API.
*clientv3-grpc1.0*: Official client implementation, with [`grpc-go v1.0.x`](https://github.com/grpc/grpc-go/releases/tag/v1.0.0), which is used in latest etcd v3.1.
*clientv3-grpc1.7*: Official client implementation, with [`grpc-go v1.7.x`](https://github.com/grpc/grpc-go/releases/tag/v1.7.0), which is used in latest etcd v3.2 and v3.3.
*clientv3-grpc1.23*: Official client implementation, with [`grpc-go v1.23.x`](https://github.com/grpc/grpc-go/releases/tag/v1.23.0), which is used in latest etcd v3.4.
*Balancer*: etcd client load balancer that implements retry and failover mechanism. etcd client should automatically balance loads between multiple endpoints.
*Endpoints*: A list of etcd server endpoints that clients can connect to. Typically, 3 or 5 client URLs of an etcd cluster.
*Pinned endpoint*: When configured with multiple endpoints, <= v3.3 client balancer chooses only one endpoint to establish a TCP connection, in order to conserve total open connections to etcd cluster. In v3.4, balancer round-robins pinned endpoints for every request, thus distributing loads more evenly.
*Client Connection*: TCP connection that has been established to an etcd server, via gRPC Dial.
*Sub Connection*: gRPC SubConn interface. Each sub-connection contains a list of addresses. Balancer creates a SubConn from a list of resolved addresses. gRPC ClientConn can map to multiple SubConn (e.g. example.com resolves to `10.10.10.1` and `10.10.10.2` of two sub-connections). etcd v3.4 balancer employs internal resolver to establish one sub-connection for each endpoint.
*Transient disconnect*: When gRPC server returns a status error of [`code Unavailable`](https://godoc.org/google.golang.org/grpc/codes#Code).
Client Requirements
===================
*Correctness*. Requests may fail in the presence of server faults. However, it never violates consistency guarantees: global ordering properties, never write corrupted data, at-most once semantics for mutable operations, watch never observes partial events, and so on.
*Liveness*. Servers may fail or disconnect briefly. Clients should make progress in either way. Clients should [never deadlock](https://github.com/etcd-io/etcd/issues/8980) waiting for a server to come back from offline, unless configured to do so. Ideally, clients detect unavailable servers with HTTP/2 ping and failover to other nodes with clear error messages.
*Effectiveness*. Clients should operate effectively with minimum resources: previous TCP connections should be [gracefully closed](https://github.com/etcd-io/etcd/issues/9212) after endpoint switch. Failover mechanism should effectively predict the next replica to connect, without wastefully retrying on failed nodes.
*Portability*. Official client should be clearly documented and its implementation be applicable to other language bindings. Error handling between different language bindings should be consistent. Since etcd is fully committed to gRPC, implementation should be closely aligned with gRPC long-term design goals (e.g. pluggable retry policy should be compatible with [gRPC retry](https://github.com/grpc/proposal/blob/master/A6-client-retries.md)). Upgrades between two client versions should be non-disruptive.
Client Overview
===============
etcd client implements the following components:
* balancer that establishes gRPC connections to an etcd cluster,
* API client that sends RPCs to an etcd server, and
* error handler that decides whether to retry a failed request or switch endpoints.
Languages may differ in how to establish an initial connection (e.g. configure TLS), how to encode and send Protocol Buffer messages to server, how to handle stream RPCs, and so on. However, errors returned from etcd server will be the same. So should be error handling and retry policy.
For example, etcd server may return `"rpc error: code = Unavailable desc = etcdserver: request timed out"`, which is transient error that expects retries. Or return `rpc error: code = InvalidArgument desc = etcdserver: key is not provided`, which means request was invalid and should not be retried. Go client can parse errors with `google.golang.org/grpc/status.FromError`, and Java client with `io.grpc.Status.fromThrowable`.
clientv3-grpc1.0: Balancer Overview
-----------------------------------
`clientv3-grpc1.0` maintains multiple TCP connections when configured with multiple etcd endpoints. Then pick one address and use it to send all client requests. The pinned address is maintained until the client object is closed (see *Figure 1*). When the client receives an error, it randomly picks another and retries.
![client-balancer-figure-01.png](img/client-balancer-figure-01.png)
clientv3-grpc1.0: Balancer Limitation
-------------------------------------
`clientv3-grpc1.0` opening multiple TCP connections may provide faster balancer failover but requires more resources. The balancer does not understand nodes health status or cluster membership. So, it is possible that balancer gets stuck with one failed or partitioned node.
clientv3-grpc1.7: Balancer Overview
------------------------------------
`clientv3-grpc1.7` maintains only one TCP connection to a chosen etcd server. When given multiple cluster endpoints, a client first tries to connect to them all. As soon as one connection is up, balancer pins the address, closing others (see *Figure 2*). The pinned address is to be maintained until the client object is closed. An error, from server or client network fault, is sent to client error handler (see *Figure 3*).
![client-balancer-figure-02.png](img/client-balancer-figure-02.png)
![client-balancer-figure-03.png](img/client-balancer-figure-03.png)
The client error handler takes an error from gRPC server, and decides whether to retry on the same endpoint, or to switch to other addresses, based on the error code and message (see *Figure 4* and *Figure 5*).
![client-balancer-figure-04.png](img/client-balancer-figure-04.png)
![client-balancer-figure-05.png](img/client-balancer-figure-05.png)
Stream RPCs, such as Watch and KeepAlive, are often requested with no timeouts. Instead, client can send periodic HTTP/2 pings to check the status of a pinned endpoint; if the server does not respond to the ping, balancer switches to other endpoints (see *Figure 6*).
![client-balancer-figure-06.png](img/client-balancer-figure-06.png)
clientv3-grpc1.7: Balancer Limitation
-------------------------------------
`clientv3-grpc1.7` balancer sends HTTP/2 keepalives to detect disconnects from streaming requests. It is a simple gRPC server ping mechanism and does not reason about cluster membership, thus unable to detect network partitions. Since partitioned gRPC server can still respond to client pings, balancer may get stuck with a partitioned node. Ideally, keepalive ping detects partition and triggers endpoint switch, before request time-out (see [etcd#8673](https://github.com/etcd-io/etcd/issues/8673) and *Figure 7*).
![client-balancer-figure-07.png](img/client-balancer-figure-07.png)
`clientv3-grpc1.7` balancer maintains a list of unhealthy endpoints. Disconnected addresses are added to “unhealthy” list, and considered unavailable until after wait duration, which is hard coded as dial timeout with default value 5-second. Balancer can have false positives on which endpoints are unhealthy. For instance, endpoint A may come back right after being blacklisted, but still unusable for next 5 seconds (see *Figure 8*).
`clientv3-grpc1.0` suffered the same problems above.
![client-balancer-figure-08.png](img/client-balancer-figure-08.png)
Upstream gRPC Go had already migrated to new balancer interface. For example, `clientv3-grpc1.7` underlying balancer implementation uses new gRPC balancer and tries to be consistent with old balancer behaviors. While its compatibility has been maintained reasonably well, etcd client still [suffered from subtle breaking changes](https://github.com/grpc/grpc-go/issues/1649). Furthermore, gRPC maintainer recommends to [not rely on the old balancer interface](https://github.com/grpc/grpc-go/issues/1942#issuecomment-375368665). In general, to get better support from upstream, it is best to be in sync with latest gRPC releases. And new features, such as retry policy, may not be backported to gRPC 1.7 branch. Thus, both etcd server and client must migrate to latest gRPC versions.
clientv3-grpc1.23: Balancer Overview
------------------------------------
`clientv3-grpc1.7` is so tightly coupled with old gRPC interface, that every single gRPC dependency upgrade broke client behavior. Majority of development and debugging efforts were devoted to fixing those client behavior changes. As a result, its implementation has become overly complicated with bad assumptions on server connectivities.
The primary goal of `clientv3-grpc1.23` is to simplify balancer failover logic; rather than maintaining a list of unhealthy endpoints, which may be stale, simply roundrobin to the next endpoint whenever client gets disconnected from the current endpoint. It does not assume endpoint status. Thus, no more complicated status tracking is needed (see *Figure 8* and above). Upgrading to `clientv3-grpc1.23` should be no issue; all changes were internal while keeping all the backward compatibilities.
Internally, when given multiple endpoints, `clientv3-grpc1.23` creates multiple sub-connections (one sub-connection per each endpoint), while `clientv3-grpc1.7` creates only one connection to a pinned endpoint (see *Figure 9*). For instance, in 5-node cluster, `clientv3-grpc1.23` balancer would require 5 TCP connections, while `clientv3-grpc1.7` only requires one. By preserving the pool of TCP connections, `clientv3-grpc1.23` may consume more resources but provide more flexible load balancer with better failover performance. The default balancing policy is round robin but can be easily extended to support other types of balancers (e.g. power of two, pick leader, etc.). `clientv3-grpc1.23` uses gRPC resolver group and implements balancer picker policy, in order to delegate complex balancing work to upstream gRPC. On the other hand, `clientv3-grpc1.7` manually handles each gRPC connection and balancer failover, which complicates the implementation. `clientv3-grpc1.23` implements retry in the gRPC interceptor chain that automatically handles gRPC internal errors and enables more advanced retry policies like backoff, while `clientv3-grpc1.7` manually interprets gRPC errors for retries.
![client-balancer-figure-09.png](img/client-balancer-figure-09.png)
clientv3-grpc1.23: Balancer Limitation
--------------------------------------
Improvements can be made by caching the status of each endpoint. For instance, balancer can ping each server in advance to maintain a list of healthy candidates, and use this information when doing round-robin. Or when disconnected, balancer can prioritize healthy endpoints. This may complicate the balancer implementation, thus can be addressed in later versions.
Client-side keepalive ping still does not reason about network partitions. Streaming request may get stuck with a partitioned node. Advanced health checking service need to be implemented to understand the cluster membership (see [etcd#8673](https://github.com/etcd-io/etcd/issues/8673) for more detail).
![client-balancer-figure-07.png](img/client-balancer-figure-07.png)
Currently, retry logic is handled manually as an interceptor. This may be simplified via [official gRPC retries](https://github.com/grpc/proposal/blob/master/A6-client-retries.md).

Some files were not shown because too many files have changed in this diff Show More