Commit Graph

15259 Commits (233be580562ff6b9bba85b6fa85f2e985cf90cac)

Author SHA1 Message Date
Jingyi Hu 233be58056
Merge pull request #10839 from needkane/pr
raft: update log info and annotation
2019-07-18 23:26:44 -07:00
Tobias Grieger 62f4fb3c5e
Merge pull request #10903 from tbg/inflights
raft: return non-nil Inflights in raft status
2019-07-18 17:50:28 +02:00
Tobias Schottdorf 6b0322549f raft: replace StatusWithoutProgress with BasicStatus
Now that a Config is also added to the full status, the old name
did not convey the intention, which was to get a Status without
an associated allocation.
2019-07-18 16:28:37 +02:00
Xiang Li f498392ca7
Merge pull request #10898 from tbg/dep
scripts: fail explicitly in updatedep.sh when gopath.proto exists
2019-07-17 16:47:51 -07:00
Xiang Li 7d2e57216a
Merge pull request #10900 from yuzeming/master
integration: add WaitGroup to TestV3WatchCurrentPutOverlap
2019-07-17 16:46:25 -07:00
yzm 3737979532 move wg.Wait() after loop 2019-07-17 16:31:48 -07:00
Tobias Schottdorf 7ce934cbec raft: return active config in Status
This is useful for debug purposes, and more so once we support joint
quorums.
2019-07-17 14:29:45 +02:00
Tobias Schottdorf 26a1e60eab raft: return non-nil Inflights in raft status
Recent refactoring to the String() method of `Progress` hit an NPE
because we return nil Inflights as part of the Raft status. Just
fix this at the source and properly populate the Raft status instead
of teaching String() to ignore nil. A real Progress always has a
non-nil Inflights.
2019-07-17 12:53:28 +02:00
yzm d87bd2c87c integration: add WaitGroup to prevent calling t.Fatalf after TestV3WatchCurrentPutOverlap function return
It could cause a panic when it happens

Fixes #10886
2019-07-16 10:25:35 -07:00
Tobias Grieger 9fba06ba3b
Merge pull request #10889 from tbg/joint-conf-change-logic
raft: internally support joint consensus
2019-07-16 16:02:16 +02:00
Tobias Schottdorf aa158f36b9 raft: internally support joint consensus
This commit introduces machinery to safely apply joint consensus
configuration changes to Raft.

The main contribution is the new package, `confchange`, which offers
the primitives `Simple`, `EnterJoint`, and `LeaveJoint`.

The first two take a list of configuration changes. `Simple` only
declares success if these configuration changes (applied atomically)
change the set of voters by at most one (i.e. it's fine to add or
remove any number of learners, but change only one voter). `EnterJoint`
makes the configuration joint and then applies the changes to it, in
preparation of the caller returning later and transitioning out of the
joint config into the final desired configuration via `LeaveJoint()`.

This commit streamlines the conversion between voters and learners, which
is now generally allowed whenever the above conditions are upheld (i.e.
it's not possible to demote a voter and add a new voter in the context
of a Simple configuration change, but it is possible via EnterJoint).
Previously, we had the artificial restriction that a voter could not be
demoted to a learner, but had to be removed first.
Even though demoting a learner is generally less useful than promoting
a learner (the latter is used to catch up future voters), demotions
could see use in improved handling of temporary node unavailability,
where it is desired to remove voting power from a down node, but to
preserve its data should it return.

An additional change that was made in this commit is to prevent the use
of empty commit quorums, which was previously possible but for no good
reason; this:

Closes #10884.

The work left to do in a future PR is to actually expose joint
configurations to the applications using Raft. This will entail mostly
API design and the addition of suitable testing, which to be carried
out ergonomically is likely to motivate a larger refactor.

Touches #7625.
2019-07-16 15:36:04 +02:00
Tobias Schottdorf 14625b847c scripts: have genproto.sh clean up after itself
We don't want it to leave gopath.proto around for reasons detailed in
the previous commit (messing up vgo).
2019-07-16 14:01:04 +02:00
Tobias Schottdorf f63984bb33 scripts: fail explicitly in updatedep.sh when gopath.proto exists
I had been dealing with these intermittent failures for a while and
finally figured out why. The real solution is making genproto.sh less
ugly but that won't happen for a while.
2019-07-16 13:54:09 +02:00
Xiang Li 5a734e79f5
Merge pull request #10891 from changkun/raft
raft/rafttest: simulate async send in node test
2019-07-15 11:49:06 -07:00
Changkun Ou 856097181b raft/rafttest: simulate async send in node test
In order to cover message can well be received when a node is paused, this commit sends message async using goroutine and random sleep. This change makes recvms is possible to cache message during node.pause is triggered.
2019-07-13 16:22:33 +02:00
Gyuho Lee e56e8471ec
Merge pull request #10888 from tbg/test-ski
test: allow failures in linux-amd64-integration-4-cpu
2019-07-11 09:24:06 -07:00
Tobias Schottdorf b7327b1cd8 test: allow failures in linux-amd64-integration-4-cpu
This run *should* certainly pass, but it's consistently the one that
fails with a regularity that essentially blocks the CI pipeline.

Someone needs to take a look at #10700, but in the meantime, the show
must go on.
2019-07-11 16:40:54 +02:00
Tobias Grieger b2274efee0
Merge pull request #10864 from tbg/learner-snap
raft: allow voter to become learner through snapshot
2019-07-11 15:48:09 +02:00
Tobias Grieger eb7dd97135
Merge pull request #10882 from tbg/pr-string
raft: optimize string representation of Progress
2019-07-09 16:27:35 +02:00
Tobias Schottdorf 95024fa3cc raft: optimize string representation of Progress
Make it less verbose by omitting the values for the steady state.
Also rearrange the order so that information that is typically more
relevant is printed first.
2019-07-09 11:22:37 +02:00
Xiang Li 0af16979f8
Merge pull request #10879 from lzhfromustc/master
etcdserver: modify a read operation to avoid potential race
2019-07-08 14:48:43 -07:00
lzhfromustc d35f6647bc
Use newbe instead of s.be to avoid potential race
`s.cluster.SetBackend(s.be)` is not in critical section. Using `newbe` instead of `s.be` can avoid potential data race.
2019-07-08 14:24:52 -07:00
Tobias Schottdorf 6f009d211f raft: allow voter to become learner through snapshot
At the time of writing, we don't allow configuration changes to change
voters to learners directly, but note that a snapshot may compress
multiple changes to the configuration into one: the voter could have
been removed, then readded as a learner and the snapshot reflects both
changes. In that case, a voter receives a snapshot telling it that it is
now a learner. In fact, the node has to accept that snapshot, or it is
permanently cut off from the Raft log.

I think this just wasn't realized in the original work, but this is just
my guess since there generally is very little rationale on the various
decisions made. I also generally haven't been able to figure out whether
the decision to prevent voters from becoming learners without first
having been removed was motivated by some particular concern, or if it
just wasn't deemed necessary. I suspect it is the latter because
demoting a voter seems perfectly safe.

See https://github.com/etcd-io/etcd/pull/8751#issuecomment-342028091.
2019-07-08 09:32:24 +02:00
Tobias Grieger 48f5bb6d28
Merge pull request #10865 from tbg/multi-conf-change
raft: centralize configuration change application
2019-07-03 21:57:57 +02:00
Tobias Schottdorf 6697adfff8 raft/tracker: pull Voters and Learners into Config struct
This is helpful to quickly print the configuration log messages without
having to specify Voters and Learners separately.

It will also come in handy for joint quorums because it allows holding
on to voters and learners as a unit, which is useful for unit testing.
2019-07-03 21:26:42 +02:00
Tobias Schottdorf b171e1c78b raft: centralize configuration change application
Put all the logic related to applying a configuration change in one
place in preparation for adding joint consensus.

This inspired various TODOs.

I had to rewrite TestSnapshotSucceedViaAppResp since it was relying
on a snapshot applied to the leader, which is now prevented.
2019-07-03 21:26:42 +02:00
kane 4f7d83a249 raft: update log info and annotation 2019-07-02 23:43:56 -04:00
Xiang Li 1f40b6642f
Merge pull request #10850 from Koprvhdix/role-remove-document-fix
Documentation: change `etcdctl role remove` to `etcdctl role delete`
2019-07-01 15:34:55 -07:00
Xiang Li d506962fec
Merge pull request #10848 from spzala/raftthesis10831
raftdoc: fix raft thesis link
2019-06-28 12:43:32 -07:00
Xiang Li ecba4492f2
Merge pull request #10866 from lzhfromustc/master
clientv3: Fixed a missing block bug
2019-06-28 11:54:50 -07:00
lzhfromustc 8194aa3f03
Fixed a missing block bug
Description: w.mu is locked at line 385 and unlocked at line 396. Among 5 return statements in this function, 4 are below line 396 but there is 1 return at line 387. 
Fix: Add w.mu.Unlock() before that return at line 387.
2019-06-28 11:27:13 -07:00
Clockworkai c34de2aef4 Documentation: change `etcdctl role remove` to `etcdctl role delete`
This is a document error. With running `etcdctl role --help`, we can find that it should be delete, not remove.

Fixes #10849
2019-06-26 09:03:08 +08:00
Sahdev P. Zala 655ab0ac6a raftdoc: fix raft thesis link
The current link does not work and not valid anymore per stanford support.
Replace all current refs with a link that is used by the
https://raft.github.io/

Fixes # https://github.com/etcd-io/etcd/issues/10831
2019-06-24 19:01:00 -04:00
Tobias Grieger 948e276ca7
Merge pull request #10807 from tbg/extract-prs
raft: extract 'tracker' package
2019-06-21 22:50:06 +02:00
Tobias Schottdorf f9c2d00fb3 raft: extract 'tracker' package
Mechanically extract `progressTracker`, `Progress`, and `inflights`
to their own package named `tracker`. Add lots of comments in the
progress, and take the opportunity to rename and clarify various
fields.
2019-06-21 22:15:00 +02:00
Xiang Li 6953ccc135
Merge pull request #10837 from tbg/ci-20m
test: s/20m/30m/g
2019-06-21 08:54:26 +08:00
Tobias Schottdorf 362dfb4d08 test: s/20m/30m/g
Every other test build times out due to the 20 minute test timeout. I
doesn't seem like tests are actually hanging, it's more that 20 minutes
just isn't enough to run the tests any more.
2019-06-20 23:44:25 +02:00
Tobias Grieger 5d30dccdaa
Merge pull request #10838 from tbg/unbreak
quorum: fix vet failure
2019-06-20 23:44:09 +02:00
Tobias Schottdorf e262542d6d quorum: fix vet failure
This slipped in during a rename and I didn't see it in CI because of
CI flakiness and a general intransparency about which failures are
important.
2019-06-20 23:40:08 +02:00
Tobias Grieger 755aab6990
Merge pull request #10779 from tbg/jointq-pr
raft: use half-populated joint quorum
2019-06-20 22:57:46 +02:00
Tobias Schottdorf e039629907 raft: use half-populated joint quorum
To ease a future transition into joint quorums, this commit removes the
previous "ad-hoc" majority-based quorum and vote computations with that
introduced in the `raft/quorum` package.

More specifically, the progressTracker now uses a quorum.JointConfig for
which the "second" majority quorum is always empty; in this case the
quorum behaves like the one quorum.MajorityConfig that is actually
present. Or, more briefly, this change is a no-op, but it will take the
busywork out of actually starting to make use of joint quorums in the
future.

On a side node, I suspect that this might've fixed a bug regarding the
read index though I haven't been able to explicitly come up with a
counter-example. The problem was that the acks collected for the read
index weren't taking into account membership changes, so they'd run the
danger of using acks from nodes since removed to claim that a quorum of
acks had been received. There's a chance that there isn't a
counter-example (the only guarantee extracted from the "quorum" is that
there isn't another leader, but even if there's another leader all that
matters is that that leader doesn't have a divergent history from the
stale leader in the hypothetical counter-example), but either way there
is morally a bug here that is now fixed because VoteCommitted doesn't
care about votes from members that are not voters known to the currently
active configuration.
2019-06-19 14:19:35 +02:00
Tobias Schottdorf 0384c587eb raft: rename makeP{RS,rogressTracker} 2019-06-19 14:19:35 +02:00
Tobias Schottdorf 3def2364e4 raft: use membership sets in progress tracking
Instead of having disjoint mappings of ID to *Progress for voters and
learners, use a map[id]struct{} for each and share a map of *Progress
among them.

This is easier to handle when joint quorums are introduced, at which
point a node may be a voting member of two quorums.
2019-06-19 14:19:35 +02:00
Tobias Schottdorf 76c8ca5a55 quorum: introduce library for majority and joint quorums
The quorum package contains logic to reason about committed indexes as
well as vote outcomes for both majority and joint quorums. The package
is oblivious to the existence of learner replicas.

The plan is to hook this up to etcd/raft in subsequent commits.
2019-06-19 14:19:35 +02:00
Jingyi Hu 9ff7628577
Merge pull request #10801 from roe85/master
DOC: Fix referencing the wrong version number.
2019-06-18 13:24:28 -07:00
Gyuho Lee 53891cbf97
Merge pull request #10822 from tbg/airplane/learner-no-campaign
raft: prevent learners from becoming leader
2019-06-17 10:58:24 -07:00
Gyuho Lee e5876c6ce2
Merge pull request #10826 from yznima/pr-race
Raft HTTP: fix pause/resume race condition
2019-06-17 09:58:18 -07:00
Nima Yahyazadeh b1812a410f Raft HTTP: fix pause/resume race condition 2019-06-17 11:45:25 -04:00
Tobias Schottdorf c844526002 raft: prevent learners from becoming leader
We were already taking some precautions against learners campaigning,
but there was no safeguard against an explicit call to `Campaign()`.
The newly added test also verifies that leadership transfers to
learners are ignored.
2019-06-17 09:20:45 +02:00
Richard c27e1108f4 Doc: Fix referencing the wrong version number. 2019-06-14 14:35:34 +02:00