Commit Graph

15303 Commits (50babc16e7e0acea10b38f64f5c5f9811ba82c76)

Author SHA1 Message Date
Gyuho Lee 50babc16e7 etcdserver/api/v2v3: skip tests for CI
To fix in v3.5

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-26 05:54:58 -07:00
Xiang Li cf4b5d9c7f
Merge pull request #10938 from MasahikoSawada/updategoversion
README: require Go 1.12+
2019-07-25 22:59:34 -07:00
Masahiko Sawada 222dcc8d13 README: require Go 1.12+
Fixes #10937
2019-07-26 14:31:24 +09:00
Tobias Grieger e36e3ac6a7
Merge pull request #10917 from gyuho/raft-node
raft: improve logging around tick miss
2019-07-25 22:17:28 +02:00
Gyuho Lee 388d15f521
Merge pull request #10622 from philips/add-v2v3-tests
etcdserver: api/v2v3: add initial tests
2019-07-25 10:05:29 -07:00
Gyuho Lee 84a38045c9 CHANGELOG: move "--enable-v2v3" to 3.5
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-25 09:29:31 -07:00
Gyuho Lee c5dba11197
Merge pull request #10933 from ChrisRx/fix-embed-panic
embed: fix oob panic in zap logger
2019-07-25 06:56:31 -07:00
chris 2223142685 embed: fix oob panic in zap logger
This fixes an index out-of-bounds panic caused when using the embed
package and the zap logger. When a TLS handshake error is logged, the
slice for cert ip addresses is allocated with capacity but no length, so
subsequent index access causes the panic, and doesn't surface the TLS
handshake error to the user.

Fixes #10932
2019-07-25 09:42:42 -04:00
Gyuho Lee 1782469a76
Merge pull request #10928 from spzala/changelogdep
changelog: reflect the latest vendor dependencies
2019-07-25 06:19:13 -07:00
Sahdev P. Zala c63c988d03 changelog: reflect the latest vendor dependencies
Vendor dependencies are modified under,
https://github.com/etcd-io/etcd/pull/10918

Also, the protobuf module is mentioned twice so removing one.
2019-07-25 02:52:34 -04:00
Gyuho Lee c7c9428f6b raft: move "RawNode", clarify tick miss
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-24 23:35:36 -07:00
Gyuho Lee 8f000c755b
Merge pull request #10918 from gyuho/zap-go
vendor: upgrade dependencies
2019-07-24 21:59:36 -07:00
Jingyi Hu 425b65467c
Merge pull request #10907 from spzala/emptyrole10905
etcdserver: do not allow creating empty role
2019-07-24 18:51:43 -07:00
Sahdev P. Zala 1cef112a79 etcdserver: do not allow creating empty role
Like user, we should not allow creating empty role.

Related #10905
2019-07-24 17:41:24 -04:00
Gyuho Lee 46166ad733 vendor: update
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-24 14:09:50 -07:00
Tobias Grieger d137fa9d4a
Merge pull request #10920 from tbg/rawnode-ready
raft: require app to consume result from Ready()
2019-07-24 11:46:55 +02:00
Gyuho Lee 8b752ef647 CHANGELOG: update latest changes
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-23 15:36:14 -07:00
Tobias Schottdorf 721127da12 raft: require app to consume result from Ready()
I changed `(*RawNode).Ready`'s behavior in #10892 in a problematic way.
Previously, `Ready()` would create and immediately "accept" a Ready
(i.e. commit the app to actually handling it). In #10892, Ready() became
a pure read-only operation and the "accepting" was moved to
`Advance(rd)`.  As a result it was illegal to use the RawNode in certain
ways while the Ready was being handled. Failure to do so would result in
dropped messages (and perhaps worse). For example, with the following
operations

1. `rd := rawNode.Ready()`
2. `rawNode.Step(someMsg)`
3. `rawNode.Advance(rd)`

`someMsg` would be dropped, because `Advance()` would clear out the
outgoing messages thinking that they had all been handled by the client.
I mistakenly assumed that this restriction had existed prior, but this
is incorrect.

I noticed this while trying to pick up the above PR in CockroachDB,
where it caused unit test failures, precisely due to the above example.

This PR reestablishes the previous behavior (result of `Ready()` must
be handled by the app) and adds a regression test.

While I was there, I carried out a few small clarifying refactors.
2019-07-23 22:45:01 +02:00
Gyuho Lee a91f4e45c0
Merge pull request #10919 from gyuho/ci
*: test with Go 1.12.7
2019-07-23 13:09:01 -07:00
Wenjia 97bd2b3262
Update CHANGELOG-3.4.md for PR #10805 2019-07-23 12:48:28 -07:00
Gyuho Lee dfd62f04e9 *: test with Go 1.12.7
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-07-23 12:40:47 -07:00
Gyuho Lee a9047850de
Merge pull request #10805 from wenjiaswe/rebase-distroless
Dockerfile: Rebase etcd image to debian
2019-07-23 11:42:48 -07:00
Tobias Grieger d1183424bb
Merge pull request #10914 from tbg/api
raft: allow use of joint quorums
2019-07-23 11:49:08 +02:00
Tobias Schottdorf b9c051e7a7 raftpb: clean up naming in ConfChange 2019-07-23 10:40:03 +02:00
Tobias Schottdorf b67303c6a2 raft: allow use of joint quorums
This change introduces joint quorums by changing the Node and RawNode
API to accept pb.ConfChangeV2 (on top of pb.ConfChange).

pb.ConfChange continues to work as today: it allows carrying out a
single configuration change. A pb.ConfChange proposal gets added to
the Raft log as such and is thus also observed by the app during Ready
handling, and fed back to ApplyConfChange.

ConfChangeV2 allows joint configuration changes but will continue to
carry out configuration changes in "one phase" (i.e. without ever
entering a joint config) when this is possible.
2019-07-23 10:40:03 +02:00
Tobias Schottdorf 88f5561733 raft: use ConfChangeSingle internally 2019-07-23 10:39:48 +02:00
Tobias Schottdorf 10680744b9 raft: introduce protos for joint quorums 2019-07-23 10:39:48 +02:00
Wenjia Zhang f856ce963b Dockerfile: rebase etcd image to debian 2019-07-22 18:58:16 -07:00
Jingyi Hu fe86a786a4
Merge pull request #10904 from yuzeming/fixed-10899
integration: fix a data race about `i` and `tt` in TestV3WatchFromCur…
2019-07-19 17:51:21 -07:00
Jingyi Hu 39680c381e
Merge pull request #10908 from tbg/log
etcdserver: fix createConfChangeEnts
2019-07-19 17:45:27 -07:00
Xiang Li 9a69aa17c8
Merge pull request #10614 from jmillikin-stripe/cert-allowed-san-flags
etcdmain, pkg: Support peer and client TLS auth based on SAN fields.
2019-07-19 12:02:28 -07:00
Tobias Schottdorf eb4d9b640a etcdserver: fix createConfChangeEnts
It created a sequence of conf changes that could intermittently cause an
empty set of voters, which Raft asserts against as of #10889.

This fixes TestCtlV2BackupSnapshot and TestCtlV2BackupV3Snapshot, see:
https://github.com/etcd-io/etcd/issues/10700#issuecomment-512358126
2019-07-19 17:13:08 +02:00
Tobias Grieger 3c5e2f51e4
Merge pull request #10892 from tbg/rawnode-everywhere-attempt3
raft: use RawNode for node's event loop; clean up bootstrap
2019-07-19 14:30:08 +02:00
Tobias Schottdorf caa48bcc3d raft: remove TestNodeBoundedLogGrowthWithPartition
It has a data race between the test's call to `reduceUncommittedSize`
and a corresponding call during Ready handling in `(*node).run()`.
The corresponding RawNode test still verifies the functionality, so
instead of fixing the test we can remove it.
2019-07-19 12:35:14 +02:00
Tobias Schottdorf 500af91653 raft: restore ability to bootstrap RawNode
We are worried about breaking backwards compatibility for any
application out there that may have relied on the old behavior. Their
RawNode invocation would have been broken by the removal of the peers
argument so it would not have changed silently; an associated comment
tells callers how to fix it.
2019-07-19 10:02:02 +02:00
Tobias Schottdorf c9491d7861 raft: clean up bootstrap
This is the first (maybe not last) step in cleaning up the bootstrap
code around StartNode.

Initializing a Raft group for the first time is awkward, since a
configuration has to be pulled from thin air. The way this is solved
today is unclean: The app is supposed to pass peers to StartNode(),
we add configuration changes for them to the log, immediately pretend
that they are applied, but actually leave them unapplied (to give the
app a chance to observe them, though if the app did decide to not apply
them things would really go off the rails), and then return control to
the app. The app will then process the initial Readys and as a result
the configuration will be persisted to disk; restarts of the node then
use RestartNode which doesn't take any peers.

The code that did this lived awkwardly in two places fairly deep down
the callstack, though it was really only necessary in StartNode(). This
commit refactors things to make this more obvious: only StartNode does
this dance now. In particular, RawNode does not support this at all any
more; it expects the app to set up its Storage correctly.

Future work may provide helpers to make this "preseeding" of the Storage
more user-friendly. It isn't entirely straightforward to do so since
the Storage interface doesn't provide the right accessors for this
purpose. Briefly speaking, we want to make sure that a non-bootstrapped
node can never catch up via the log so that we can implicitly use one
of the "skipped" log entries to represent the configuration change into
the bootstrap configuration. This is an invasive change that affects
all consumers of raft, and it is of lower urgency since the code (post
this commit) already encapsulates the complexity sufficiently.
2019-07-19 10:02:02 +02:00
Tobias Schottdorf c62b7048b5 raft: use RawNode for node's event loop
It has always bugged me that any new feature essentially needed to be
tested twice due to the two ways in which apps can use raft (`*node` and
`*RawNode`). Due to upcoming testing work for joint consensus, now is a
good time to rectify this somewhat.

This commit removes most logic from `(*node).run` and uses `*RawNode`
internally. This simplifies the logic and also lead (via debugging) to
some insight on how the semantics of the approaches differ, which is now
documented in the comments.
2019-07-19 09:59:59 +02:00
Zeming YU 5f21b557e5 integration: fix a data race about `i` and `tt` in TestV3WatchFromCurrentRevision
don't references to loop variables from within go anonymous function

Fixes etcd-io#10899
2019-07-19 00:48:49 -07:00
Jingyi Hu 233be58056
Merge pull request #10839 from needkane/pr
raft: update log info and annotation
2019-07-18 23:26:44 -07:00
Tobias Grieger 62f4fb3c5e
Merge pull request #10903 from tbg/inflights
raft: return non-nil Inflights in raft status
2019-07-18 17:50:28 +02:00
Tobias Schottdorf 6b0322549f raft: replace StatusWithoutProgress with BasicStatus
Now that a Config is also added to the full status, the old name
did not convey the intention, which was to get a Status without
an associated allocation.
2019-07-18 16:28:37 +02:00
Xiang Li f498392ca7
Merge pull request #10898 from tbg/dep
scripts: fail explicitly in updatedep.sh when gopath.proto exists
2019-07-17 16:47:51 -07:00
Xiang Li 7d2e57216a
Merge pull request #10900 from yuzeming/master
integration: add WaitGroup to TestV3WatchCurrentPutOverlap
2019-07-17 16:46:25 -07:00
yzm 3737979532 move wg.Wait() after loop 2019-07-17 16:31:48 -07:00
Tobias Schottdorf 7ce934cbec raft: return active config in Status
This is useful for debug purposes, and more so once we support joint
quorums.
2019-07-17 14:29:45 +02:00
Tobias Schottdorf 26a1e60eab raft: return non-nil Inflights in raft status
Recent refactoring to the String() method of `Progress` hit an NPE
because we return nil Inflights as part of the Raft status. Just
fix this at the source and properly populate the Raft status instead
of teaching String() to ignore nil. A real Progress always has a
non-nil Inflights.
2019-07-17 12:53:28 +02:00
yzm d87bd2c87c integration: add WaitGroup to prevent calling t.Fatalf after TestV3WatchCurrentPutOverlap function return
It could cause a panic when it happens

Fixes #10886
2019-07-16 10:25:35 -07:00
Tobias Grieger 9fba06ba3b
Merge pull request #10889 from tbg/joint-conf-change-logic
raft: internally support joint consensus
2019-07-16 16:02:16 +02:00
Tobias Schottdorf aa158f36b9 raft: internally support joint consensus
This commit introduces machinery to safely apply joint consensus
configuration changes to Raft.

The main contribution is the new package, `confchange`, which offers
the primitives `Simple`, `EnterJoint`, and `LeaveJoint`.

The first two take a list of configuration changes. `Simple` only
declares success if these configuration changes (applied atomically)
change the set of voters by at most one (i.e. it's fine to add or
remove any number of learners, but change only one voter). `EnterJoint`
makes the configuration joint and then applies the changes to it, in
preparation of the caller returning later and transitioning out of the
joint config into the final desired configuration via `LeaveJoint()`.

This commit streamlines the conversion between voters and learners, which
is now generally allowed whenever the above conditions are upheld (i.e.
it's not possible to demote a voter and add a new voter in the context
of a Simple configuration change, but it is possible via EnterJoint).
Previously, we had the artificial restriction that a voter could not be
demoted to a learner, but had to be removed first.
Even though demoting a learner is generally less useful than promoting
a learner (the latter is used to catch up future voters), demotions
could see use in improved handling of temporary node unavailability,
where it is desired to remove voting power from a down node, but to
preserve its data should it return.

An additional change that was made in this commit is to prevent the use
of empty commit quorums, which was previously possible but for no good
reason; this:

Closes #10884.

The work left to do in a future PR is to actually expose joint
configurations to the applications using Raft. This will entail mostly
API design and the addition of suitable testing, which to be carried
out ergonomically is likely to motivate a larger refactor.

Touches #7625.
2019-07-16 15:36:04 +02:00
Tobias Schottdorf 14625b847c scripts: have genproto.sh clean up after itself
We don't want it to leave gopath.proto around for reasons detailed in
the previous commit (messing up vgo).
2019-07-16 14:01:04 +02:00