vitalif/etcd - etcd

Commit Graph

Author	SHA1	Message	Date
Andrei Matei	0db04a38a5	raft: minor comment fix in confchange	2020-04-12 22:44:37 -04:00
Fullstop000	7eae024ead	raft: only redirect msg produced by own node (#11466 ) Signed-off-by: Fullstop000 <fullstop1005@gmail.com>	2020-04-06 20:27:46 -07:00
Lars Lehtonen	dbcf540c88	raft/rafttest: Prune Unused Functions (#11625 ) * raft/rafttest: prune unused network interface * raft/rafttest: prune unused raftNetwork.heal() * raft/rafttest: prune unused logLevels.strToLev() * raft/rafttest: prune unused InteractionEnv.writeErr()	2020-04-04 14:38:14 -07:00
qupeng	6f850a65a1	raft: cleanup read index code (#11528 ) Signed-off-by: qupeng <qupeng@pingcap.com>	2020-03-03 09:20:25 -08:00
Tobias Schottdorf	0544f33248	raft: clarify ApplyConfChange contract for rejected conf changes Apps typically maintain the raft configuration as part of the state machine. As a result, they want to be able to reject configuration change entries at apply time based on the state on which the entry is supposed to be applied. When this happens, the app should not call ApplyConfChange, but the comments did not make this clear. As a result, it was tempting to pass an empty pb.ConfChange or it's V2 version instead of not calling ApplyConfChange. However, an empty V1 or V2 proto aren't noops when the configuration is joint: an empty V1 change is treated internally as a single configuration change for NodeID zero and will cause a panic when applied in a joint state. An empty V2 proto is treated as a signal to leave a joint state, which means that the app's config and raft's would diverge. The comments updated in this commit now ask users to not call ApplyConfState when they reject a conf change. Apps that never use joint consensus can keep their old behavior since the distinction only matters when in a joint state, but we don't want to encourage that.	2020-02-25 12:45:45 +01:00
Tobias Schottdorf	37c7e4d1d8	raft: fix auto-transitioning out of joint config The code doing so was undertested and buggy: it would launch multiple attempts to transition out when the conf change was not the last element in the log. This commit fixes the problem and adds a regression test. It also reworks the code to handle a former untested edge case, in which the auto-transition append is refused. This can't happen any more with the current version of the code because this proposal has size zero and is special cased in increaseUncommittedSize. Last but not least, the auto-leave proposal now also bumps pendingConfIndex, which was not done previously due to an oversight.	2020-02-25 12:35:51 +01:00
Tobias Schottdorf	2332705f10	raft: remove bogus tail end of membership change interaction test The test was supposed to end earlier, but some old copy pasta survived.	2020-02-25 12:35:51 +01:00
po3rin	53f15caf73	raft: Fixed missing package name in README.md (#11566 )	2020-01-29 10:15:00 -08:00
qupeng	eaa0612e02	raft: abort leader transferring if the target is demoted (#11417 ) Signed-off-by: qupeng <qupeng@pingcap.com>	2019-12-20 12:07:52 +08:00
Baohua Yang	4b755e8935	docs: Update the raft usage by adding Hyperledger project	2019-11-19 09:54:16 -08:00
Ilya Sevostyanov	d487b16de1	confchange: removed duplicate check in confchange.Simple.	2019-10-01 11:32:12 +03:00
Wine93	b0534c1b44	raft/log_test: fixed wrong index	2019-08-25 04:47:11 +00:00
Wine93	5f42161750	raft: fixed some typos and simplify minor logic	2019-08-25 04:46:29 +00:00
Xiang Li	f4dfd1976d	Merge pull request #10971 from nilsocket/codeReformat raft : write compact if statements	2019-08-23 09:10:26 +08:00
nilsocket	702c69c906	raft : Write compact if statements	2019-08-23 04:31:40 +05:30
Tobias Schottdorf	47ae53d25d	rafttest: print Ready before processing it It was confusing to see the effects of the Ready (i.e. log messages) printed before the Ready itself.	2019-08-16 09:41:35 +02:00
Tobias Schottdorf	99f8046fd1	raft: fix a test file name	2019-08-16 09:38:44 +02:00
Tobias Schottdorf	8d1946d16a	raft: document problem with leader self-removal When a leader removes itself, it will retain its leadership but not accept new proposals, making the range effectively stuck until manual intervention triggers a campaign event. This commit documents the behavior. It does not correct it yet.	2019-08-16 09:38:44 +02:00
Tobias Schottdorf	306e75a96f	raft: add a batch of interaction-driven conf change tests Verifiy the behavior in various v1 and v2 conf change operations. This also includes various fixups, notably it adds protection against transitioning in and out of new configs when this is not permissible. There are more threads to pull, but those are left for future commits.	2019-08-16 09:38:44 +02:00
Tobias Schottdorf	4e19150676	raft: proactively probe newly added followers When the leader applied a new configuration that added voters, it would not immediately probe these voters, delaying when they would be caught up. I noticed this while writing an interaction-driven test, which has now been cleaned up and completed.	2019-08-14 20:53:34 +02:00
Tobias Schottdorf	3d6721f751	rafttest: add _breakpoint directive It is a helper case to attach a debugger to when a problem needs to be investigated in a longer test file. In such a case, add the following stanza immediately before the interesting behavior starts: _breakpoint: ---- ok and set a breakpoint on the _breakpoint case.	2019-08-14 20:53:34 +02:00
Tobias Schottdorf	fdaed88f14	raft: initialize new Progress at LastIndex, not LastIndex+1 Initializing at LastIndex+1 meant that new peers would not be probed immediately when they appeared in the leader's config, which delays their getting caught up.	2019-08-14 20:53:34 +02:00
Tobias Schottdorf	c2d9514370	raft/rafttest: fix stabilize handler It was bailing out too early.	2019-08-14 17:24:14 +02:00
Tobias Grieger	029401ab81	Merge pull request #11005 from tbg/interactiontest raft/rafttest: introduce datadriven testing	2019-08-12 11:52:52 +02:00
Tobias Schottdorf	e8090e57a2	raft/rafttest: introduce datadriven testing It has often been tedious to test the interactions between multi-member Raft groups, especially when many steps were required to reach a certain scenario. Often, this boilerplate was as boring as it is hard to write and hard to maintain, making it attractive to resort to shortcuts whenever possible, which in turn tended to undercut how meaningful and maintainable the tests ended up being - that is, if the tests were even written, which sometimes they weren't. This change introduces a datadriven framework specifically for testing deterministically the interaction between multiple members of a raft group with the goal of reducing the friction for writing these tests to near zero. In the near term, this will be used to add thorough testing for joint consensus (which is already available today, but wildly undertested), but just converting an existing test into this framework has shown that the concise representation and built-in inspection of log messages highlights unexpected behavior much more readily than the previous unit tests did (the test in question is `snapshot_succeed_via_app_resp`; the reader is invited to compare the old and new version of it). The main building block is `InteractionEnv`, which holds on to the state of the whole system and exposes various relevant methods for manipulating it, including but not limited to adding nodes, delivering and dropping messages, and proposing configuration changes. All of this is extensible so that in the future I hope to use it to explore the phenomena discussed in https://github.com/etcd-io/etcd/issues/7625#issuecomment-488798263 which requires injecting appropriate "crash points" in the Ready handling loop. Discussions of the "what if X happened in state Y" can quickly be made concrete by "scripting up an interaction test". Additionally, this framework is intentionally not kept internal to the raft package.. Though this is in its infancy, a goal is that it should be possible for a suite of interaction tests to allow applications to validate that their Storage implementation behaves accordingly, simply by running a raft-provided interaction suite against their Storage.	2019-08-12 11:13:51 +02:00
Gyuho Lee	6c87b21821	raft: fix typo Signed-off-by: Gyuho Lee <leegyuho@amazon.com>	2019-08-09 21:26:48 -07:00
Tobias Grieger	4cec8dddc6	Merge pull request #11003 from tbg/interaction/restore raft: fix restoring joint configurations	2019-08-09 20:13:04 +02:00
Xiang Li	84c69cca76	Merge pull request #10970 from nilsocket/minorFix1 raft : remove unnecessary, if check	2019-08-09 11:03:01 -07:00
Tobias Schottdorf	37ab5bdd21	raft: fix restoring joint configurations While writing interaction tests for joint configuration changes, I realized that this wasn't working yet - restoring had no notion of the joint configuration and was simply dropping it on the floor. This commit introduces a helper `confchange.Restore` which takes a `ConfState` and initializes a `Tracker` from it. This is then used both in `(*raft).restore` as well as in `newRaft`.	2019-08-09 19:28:43 +02:00
Tobias Schottdorf	a5f785a232	confchange: clean up unnecessary block	2019-08-09 19:28:43 +02:00
Tobias Grieger	7948f39790	Merge pull request #11004 from tbg/interaction/unused-type raft/tracker: visit Progress in stable order	2019-08-09 12:32:04 +02:00
Tobias Schottdorf	1b3e0821a7	raft/tracker: visit Progress in stable order This is helpful for upcoming testing work which allows datadriven testing of the interaction of multiple nodes. This testing requires determinism to work correctly.	2019-08-08 09:37:33 +02:00
Tobias Schottdorf	9553994cd7	raft/auorum: remove unused type	2019-08-07 18:53:01 +02:00
Tobias Schottdorf	c30c2e345b	raft: let learners vote It turns out that that learners must be allowed to cast votes. This seems counter- intuitive but is necessary in the situation in which a learner has been promoted (i.e. is now a voter) but has not learned about this yet. For example, consider a group in which id=1 is a learner and id=2 and id=3 are voters. A configuration change promoting 1 can be committed on the quorum `{2,3}` without the config change being appended to the learner's log. If the leader (say 2) fails, there are de facto two voters remaining. Only 3 can win an election (due to its log containing all committed entries), but to do so it will need 1 to vote. But 1 considers itself a learner and will continue to do so until 3 has stepped up as leader, replicates the conf change to 1, and 1 applies it. Ultimately, by receiving a request to vote, the learner realizes that the candidate believes it to be a voter, and that it should act accordingly. The candidate's config may be stale, too; but in that case it won't win the election, at least in the absence of the bug discussed in: https://github.com/etcd-io/etcd/issues/7625#issuecomment-488798263.	2019-08-07 12:03:18 +02:00
nilsocket	0d99469cdb	raft : `newRaft()` does check for validity of `Config`	2019-08-02 03:09:51 +05:30
Gyuho Lee	4e43a082b2	raft: use mutex in "SetLogger" to avoid race conditions in tests Signed-off-by: Gyuho Lee <leegyuho@amazon.com>	2019-07-29 15:43:19 -07:00
Gyuho Lee	936c506e8d	Merge pull request #10945 from tbg/add-todo raft: leave TODO about leaving StateSnapshot	2019-07-29 13:51:38 -07:00
Tobias Schottdorf	3b02d4c5ff	raft: leave TODO about leaving StateSnapshot The condition is overly strict, which has popped up in CockroachDB recently.	2019-07-26 23:19:34 +02:00
Gyuho Lee	c7c9428f6b	raft: move "RawNode", clarify tick miss Signed-off-by: Gyuho Lee <leegyuho@amazon.com>	2019-07-24 23:35:36 -07:00
Tobias Schottdorf	721127da12	raft: require app to consume result from Ready() I changed `(*RawNode).Ready`'s behavior in #10892 in a problematic way. Previously, `Ready()` would create and immediately "accept" a Ready (i.e. commit the app to actually handling it). In #10892, Ready() became a pure read-only operation and the "accepting" was moved to `Advance(rd)`. As a result it was illegal to use the RawNode in certain ways while the Ready was being handled. Failure to do so would result in dropped messages (and perhaps worse). For example, with the following operations 1. `rd := rawNode.Ready()` 2. `rawNode.Step(someMsg)` 3. `rawNode.Advance(rd)` `someMsg` would be dropped, because `Advance()` would clear out the outgoing messages thinking that they had all been handled by the client. I mistakenly assumed that this restriction had existed prior, but this is incorrect. I noticed this while trying to pick up the above PR in CockroachDB, where it caused unit test failures, precisely due to the above example. This PR reestablishes the previous behavior (result of `Ready()` must be handled by the app) and adds a regression test. While I was there, I carried out a few small clarifying refactors.	2019-07-23 22:45:01 +02:00
Tobias Schottdorf	b9c051e7a7	raftpb: clean up naming in ConfChange	2019-07-23 10:40:03 +02:00
Tobias Schottdorf	b67303c6a2	raft: allow use of joint quorums This change introduces joint quorums by changing the Node and RawNode API to accept pb.ConfChangeV2 (on top of pb.ConfChange). pb.ConfChange continues to work as today: it allows carrying out a single configuration change. A pb.ConfChange proposal gets added to the Raft log as such and is thus also observed by the app during Ready handling, and fed back to ApplyConfChange. ConfChangeV2 allows joint configuration changes but will continue to carry out configuration changes in "one phase" (i.e. without ever entering a joint config) when this is possible.	2019-07-23 10:40:03 +02:00
Tobias Schottdorf	88f5561733	raft: use ConfChangeSingle internally	2019-07-23 10:39:48 +02:00
Tobias Schottdorf	10680744b9	raft: introduce protos for joint quorums	2019-07-23 10:39:48 +02:00
Tobias Schottdorf	caa48bcc3d	raft: remove TestNodeBoundedLogGrowthWithPartition It has a data race between the test's call to `reduceUncommittedSize` and a corresponding call during Ready handling in `(*node).run()`. The corresponding RawNode test still verifies the functionality, so instead of fixing the test we can remove it.	2019-07-19 12:35:14 +02:00
Tobias Schottdorf	500af91653	raft: restore ability to bootstrap RawNode We are worried about breaking backwards compatibility for any application out there that may have relied on the old behavior. Their RawNode invocation would have been broken by the removal of the peers argument so it would not have changed silently; an associated comment tells callers how to fix it.	2019-07-19 10:02:02 +02:00
Tobias Schottdorf	c9491d7861	raft: clean up bootstrap This is the first (maybe not last) step in cleaning up the bootstrap code around StartNode. Initializing a Raft group for the first time is awkward, since a configuration has to be pulled from thin air. The way this is solved today is unclean: The app is supposed to pass peers to StartNode(), we add configuration changes for them to the log, immediately pretend that they are applied, but actually leave them unapplied (to give the app a chance to observe them, though if the app did decide to not apply them things would really go off the rails), and then return control to the app. The app will then process the initial Readys and as a result the configuration will be persisted to disk; restarts of the node then use RestartNode which doesn't take any peers. The code that did this lived awkwardly in two places fairly deep down the callstack, though it was really only necessary in StartNode(). This commit refactors things to make this more obvious: only StartNode does this dance now. In particular, RawNode does not support this at all any more; it expects the app to set up its Storage correctly. Future work may provide helpers to make this "preseeding" of the Storage more user-friendly. It isn't entirely straightforward to do so since the Storage interface doesn't provide the right accessors for this purpose. Briefly speaking, we want to make sure that a non-bootstrapped node can never catch up via the log so that we can implicitly use one of the "skipped" log entries to represent the configuration change into the bootstrap configuration. This is an invasive change that affects all consumers of raft, and it is of lower urgency since the code (post this commit) already encapsulates the complexity sufficiently.	2019-07-19 10:02:02 +02:00
Tobias Schottdorf	c62b7048b5	raft: use RawNode for node's event loop It has always bugged me that any new feature essentially needed to be tested twice due to the two ways in which apps can use raft (`node` and `RawNode`). Due to upcoming testing work for joint consensus, now is a good time to rectify this somewhat. This commit removes most logic from `(node).run` and uses `RawNode` internally. This simplifies the logic and also lead (via debugging) to some insight on how the semantics of the approaches differ, which is now documented in the comments.	2019-07-19 09:59:59 +02:00
Jingyi Hu	233be58056	Merge pull request #10839 from needkane/pr raft: update log info and annotation	2019-07-18 23:26:44 -07:00
Tobias Schottdorf	6b0322549f	raft: replace StatusWithoutProgress with BasicStatus Now that a Config is also added to the full status, the old name did not convey the intention, which was to get a Status without an associated allocation.	2019-07-18 16:28:37 +02:00

1 2 3 4 5 ...

971 Commits (0461b3fa51ebcd3f72ef54296638ac2f19ea9071)