971 Commits (0461b3fa51ebcd3f72ef54296638ac2f19ea9071)

Author SHA1 Message Date
Ben Darnell 340ba8353c raft: Fix election "logs converge" test
The "logs converge" case in TestLeaderElectionPreVote was incorrectly
passing because some nodes were not actually using the preVoteConfig.
This test case was more complex than its siblings and it was not
verifying what it wanted to verify, so pull it out into a separate test
where everything can be tested more explicitly.

Fixes #6895
2016-12-03 17:29:15 +08:00
Xiang Li f2eb8560ed raft: fix TestNodeProposeAddDuplicateNode
Only send the signal after applying the conf change.
Otherwise a deadlock might happen if the raft node receives
a Ready without the conf change when the test server
is slow.
2016-11-20 21:59:31 -08:00
Vincent Lee e6d1ebcc1d raft: use the channel instead of sleep to make test case reliable 2016-11-21 13:30:15 +08:00
Vincent Lee bc6f5ad53e raft: fix test case for data race 2016-11-21 10:30:36 +08:00
Vincent Lee 62bd5477b9 raft: fix test case, should wait config propose applied 2016-11-21 10:10:34 +08:00
Vincent Lee 16e3ab0f11 raft: test case to check the duplicate add node propose 2016-11-20 16:58:11 +08:00
Vincent Lee 4401d88546 raft: add node should reset the pendingConf state
After an add-node conf change is proposed twice with the same node ID, the
pending state is not reset, because addNode returns the second time without
resetting it, and the pending state then stays true until some other conf
change is applied. During this time we cannot add any new node, because the
proposal is ignored while the pending state is true.
2016-11-17 15:50:13 +08:00
Alexander Morozov 7afc490c95 raft: return empty status if node is stopped
If the node is stopped, then Status can hang forever because there is no
event loop to answer. So, just return empty status to avoid deadlocks.

Fix #6855

Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-11-15 15:45:23 -08:00
Gyu-Ho Lee b8b72f80f9 *: revendor, update proto files 2016-11-10 12:02:00 -08:00
fanmin shi c2fd42b556 etcdserver, clientv3: add "!=" to txn
Adding != to compare is functionality requested by an etcd user.

FIX #6719
2016-11-09 14:28:36 -08:00
Ben Darnell 2f34547d39 raft: Check promotable() in MsgTimeoutNow handling
If MsgTimeoutNow arrived after a node was removed, the node could start
and win an election, then panic in becomeLeader (see
cockroachdb/cockroach#8535)
2016-11-07 20:02:21 +08:00
Xiang Li e5987dea37 rafttest: make raft test reliable 2016-11-04 15:55:17 -07:00
Gyu-Ho Lee cb5c92f69b raft: do not attach term to MsgReadIndex
Fix https://github.com/coreos/etcd/issues/6744.

MsgReadIndex, as MsgProp, is to be forwarded to leader.
So we should treat it as local message.
2016-10-28 22:12:25 -07:00
Xiang Li d7bc15300b Merge pull request #6624 from bdarnell/pre-vote
raft: Implement the PreVote RPC described in thesis section 9.6
2016-10-25 13:18:22 -07:00
Ben Darnell 8d5e969f12 raft: Separate test methods for vote and pre-vote tests 2016-10-25 23:31:44 +09:00
Ben Darnell 22aa710c1f raft: Improve comments and formatting for PreVote change 2016-10-24 22:29:33 +09:00
Ben Darnell cf93a74aa8 raft: Refactor vote handling
Move all vote handling from the per-state step functions to the
top-level Step(). This wasn't necessary before because MsgVote would
cause us to become a follower, but MsgPreVote needs to be handled
without changing the node's current state.
2016-10-19 19:35:21 +08:00
Ben Darnell 73cae7abd0 raft: Implement the PreVote RPC described in thesis section 9.6
This prevents disruption when a node that has been partitioned
away rejoins the cluster.

Fixes #6522
2016-10-19 19:35:20 +08:00
Ben Darnell ca87a13b18 raft: More realistic terms in tests
Some tests were starting nodes with a non-empty log but a term of zero,
which cannot happen in the real world. This was affecting the final term
being tested in TestLeaderElection.
2016-10-19 19:35:20 +08:00
Manish R Jain 255670106f
raft: Add dgraph to the list of users
Because Dgraph is a notable user of RAFT.
2016-10-19 17:26:51 +11:00
Manish R Jain e69c2fd382
raft: update README to explain starting a single node cluster and joining it
this PR helps clients of RAFT set up the cluster correctly, when they're
starting with a single node cluster.
2016-10-19 14:09:48 +11:00
Xiang Li dc8bf26cd8 raft: refactor inflight 2016-10-04 13:12:16 -07:00
Gyu-Ho Lee 9b56e51ca7 *: regenerate proto + gofmt change 2016-10-03 15:34:34 -07:00
Dylan.Wen a6eb2939b1 raft: add test cases to improve test coverage 2016-09-28 10:19:30 +08:00
ychen11 69f5b4ba79 Documentation: made watch request doc more clear 2016-09-23 23:13:55 +08:00
Peter Mattis 37fa6ac45c raft: add RawNode.TickQuiesced
TickQuiesced allows the caller to support "quiesced" Raft groups which
do not perform periodic heartbeats and elections. This is useful in a
system with thousands of Raft groups where these periodic operations can
be overwhelming in an otherwise idle system.

It might seem possible to avoid advancing the logical clock at all in
such Raft groups, but doing so has an interaction with the CheckQuorum
functionality. If a follower is not quiesced while the leader is, the
follower can call an election that will fail because the leader's lease
has not expired (electionElapsed < electionTimeout). The next time the
leader sends a heartbeat to this follower the follower will see that the
heartbeat is from a previous term and respond with a MsgAppResp. This in
turn will cause the leader to step down and become a follower even
though there isn't a leader in the group. By allowing the leader's
logical clock to advance via TickQuiesced, the leader won't reject the
election and there will be a smooth transfer of leadership to the
follower.
2016-09-15 21:05:18 -04:00
Dylan.Wen eeca614cd3 raft: add read index for RawNode 2016-09-14 14:43:46 +08:00
Xiang Li cfe717e926 Merge pull request #6275 from xiang90/raft_l
raft: support safe readonly request
2016-09-13 01:36:04 -05:00
Xiang Li 710b14ce56 raft: support safe readonly request
Implement raft readonly request described in raft thesis 6.4
along with the existing clock/lease based approach.
2016-09-12 15:13:52 +08:00
Dylan.Wen 68f2fdc1ff raft: add test case for leader transfer from follower 2016-09-08 17:22:52 +08:00
goroutine ce49fb6ec4 raft: add tests for IsLocalMsg (#6357)
* raft: add tests for IsLocalMsg

* report index of failed tests
2016-09-07 12:52:37 +09:00
Peter Mattis c1948f2940 raft: grow the inflights buffer instead of preallocating
Grow the inflights buffer as needed instead of preallocating it to its
max size. This avoids preallocating a lot of unnecessary
space (8*MaxInflightMsgs) when using lots of raft groups while still
allowing for a reasonable MaxInflightMsgs configuration.
2016-09-06 18:07:01 -04:00
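The idea in c1948f2940 (grow the inflights buffer on demand, up to a fixed cap) can be sketched as below. This is a minimal illustration, not the actual etcd code: the type, field, and method names are assumptions, and freeing of slots is omitted.

```go
package main

import "fmt"

// inflights is a hypothetical ring buffer of in-flight message indexes.
// Its backing slice starts empty and grows by doubling, never past size.
type inflights struct {
	start  int      // position of the oldest entry
	count  int      // number of entries currently stored
	size   int      // hard upper bound (think MaxInflightMsgs)
	buffer []uint64 // backing storage, 8 bytes per slot
}

func newInflights(size int) *inflights { return &inflights{size: size} }

func (in *inflights) full() bool { return in.count == in.size }

// add stores index in the next free slot, growing the buffer if needed.
// It returns false when the cap has been reached.
func (in *inflights) add(index uint64) bool {
	if in.full() {
		return false
	}
	next := in.start + in.count
	if next >= in.size {
		next -= in.size // wrap around the ring
	}
	for next >= len(in.buffer) {
		in.grow()
	}
	in.buffer[next] = index
	in.count++
	return true
}

// grow doubles the backing slice, capped at size.
func (in *inflights) grow() {
	newSize := len(in.buffer) * 2
	if newSize == 0 {
		newSize = 1
	} else if newSize > in.size {
		newSize = in.size
	}
	newBuf := make([]uint64, newSize)
	copy(newBuf, in.buffer)
	in.buffer = newBuf
}

func main() {
	in := newInflights(256)
	for i := uint64(1); i <= 3; i++ {
		in.add(i)
	}
	// Only 4 slots are allocated so far instead of all 256.
	fmt.Printf("buffer slots: %d of %d\n", len(in.buffer), in.size)
}
```

With many mostly idle raft groups, each group then pays only for the slots it has actually used rather than 8*MaxInflightMsgs bytes up front.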
Peter Mattis 4a33aa3917 raft: use a singleton global rand
rand.NewSource creates a 4872 byte object. With a small number of raft
groups in a process this isn't a problem. With 10k raft groups we'd use
46MB for these random sources. The only usage is in
raft.resetRandomizedElectionTimeout which isn't performance critical.

Fixes #6347.
2016-09-05 09:03:18 -04:00
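A rough sketch of the approach in 4a33aa3917: one mutex-guarded random source shared by every raft group in the process, instead of one source per group. The names `lockedRand`, `globalRand`, and `group` are illustrative assumptions, not necessarily those used in the commit.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// lockedRand wraps a single *rand.Rand with a mutex so that many
// goroutines (one per raft group, say) can share it safely.
type lockedRand struct {
	mu   sync.Mutex
	rand *rand.Rand
}

func (r *lockedRand) Intn(n int) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.rand.Intn(n)
}

// globalRand is the only random source in the process.
var globalRand = &lockedRand{
	rand: rand.New(rand.NewSource(time.Now().UnixNano())),
}

// group is a stand-in for a raft instance; only the timeout fields matter here.
type group struct {
	electionTimeout           int
	randomizedElectionTimeout int
}

// resetRandomizedElectionTimeout draws from the shared source. The lock is
// acceptable because this path is not performance critical.
func (g *group) resetRandomizedElectionTimeout() {
	g.randomizedElectionTimeout = g.electionTimeout + globalRand.Intn(g.electionTimeout)
}

func main() {
	groups := make([]group, 10000)
	var wg sync.WaitGroup
	for i := range groups {
		groups[i].electionTimeout = 10
		wg.Add(1)
		go func(g *group) {
			defer wg.Done()
			g.resetRandomizedElectionTimeout()
		}(&groups[i])
	}
	wg.Wait()
	fmt.Println("first group's timeout:", groups[0].randomizedElectionTimeout)
}
```

The trade-off is some lock contention in exchange for not holding one multi-kilobyte source per group.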
Ben Darnell a7a867c1e6 raft: Allow an election immediately after start with checkQuorum
Previously, the checkQuorum flag required an election timeout to
expire before a node could cast its first vote. This change permits
the node to cast a vote at any time when the leader is not known,
including immediately after startup.
2016-08-30 08:28:41 +08:00
sharat 9b3b1f80dd raft: handled panic for Term due to IOB
Instead of panicking, return an error for better handling.

#6215
2016-08-18 23:11:38 +05:30
Gyu-Ho Lee de06dc1272 Merge pull request #6155 from gyuho/raft-leader-transfer
*: expose Raft leader transfer
2016-08-11 08:03:28 -07:00
siddontang f8ee322b08 raft: fix overflow 2016-08-11 09:24:49 +08:00
Gyu-Ho Lee e64ef3f261 raft: add 'TransferLeadership' to Node interface 2016-08-10 16:25:22 -07:00
Gyu-Ho Lee f4141f0f51 raft: handle 'MsgTransferLeader' in follower 2016-08-10 16:24:29 -07:00
Xiang Li 5f0c122496 raft: fix getting unapplied log entries 2016-08-08 10:44:02 -07:00
Xiang Li 6c3efde51b Merge pull request #6099 from sinsharat/master
raft: handling of applying old snapshots
2016-08-05 07:38:07 -07:00
Xiang Li c46955b60a Merge pull request #6097 from swingbach/master
raft: fix #6096
2016-08-04 11:40:02 -07:00
sharat fd757756f5 raft: handling of applying old snapshots
There was a TODO requirement to handle ErrorSnapshotOutOfDate for the
function ApplySnapshot. The same has been implemented

#6090
2016-08-04 21:08:24 +05:30
swingbach@gmail.com 41dee84733 raft: fix #6096 2016-08-04 18:31:22 +08:00
Xiang Li 6e7baab32c Merge pull request #6070 from swingbach/master
raft: fix #6068
2016-08-03 19:59:07 -07:00
swingbach@gmail.com c0a8da7fd0 raft: minor refactor 2016-08-02 08:46:43 +08:00
Xiang Li 8d12017fe2 raft: better doc 2016-07-30 21:11:37 -07:00
swingbach@gmail.com 992f628e6e raft: fix #6068 2016-07-30 03:27:29 +08:00
Gyu-Ho Lee 982e18d80b *: regenerate proto with latest grpc-gateway 2016-07-27 13:21:03 -07:00
Xiang Li a75688bd17 Merge pull request #6039 from xiang90/fix_r
raft: hide Campaign rules on applying all entries
2016-07-26 20:52:09 -07:00
Xiang Li 0d6c028aa2 Merge pull request #6032 from xiang90/gateway
fix a few issues in grpc gateway
2016-07-25 16:48:38 -07:00
Xiang Li 484f579905 raft: hide Campaign rules on applying all entries 2016-07-25 15:53:39 -07:00
Xiang Li fffa484a9f *: regenerate proto for adding deleterange 2016-07-23 16:17:44 -07:00
Gyu-Ho Lee 4ff6c72257 raft: replace 'reflect.DeepEqual' with bytes.Equal 2016-07-22 16:34:13 -07:00
Xiang Li 1c5754f02d raft: fix readindex 2016-07-19 15:00:58 -07:00
Gyu-Ho Lee 50be793f09 *: regenerate proto 2016-07-18 09:33:32 -07:00
Gyu-Ho Lee 5b92e17e86 *: regenerate proto files 2016-07-15 13:24:19 -07:00
Xiang Li 7432e9fbe9 Merge pull request #5809 from swingbach/master
raft: make leader transferring workable when quorum check is on
2016-07-12 09:46:18 -07:00
Xiang Li b2c1112288 Merge pull request #5921 from xiang90/r
raft: do not change RecentActive when resetState for progress
2016-07-12 06:54:14 -07:00
swingbach@gmail.com c36a40ca15 raft: introduce top-level context in message struct 2016-07-12 16:14:06 +08:00
Xiang Li eb08f2274e raft: do not change RecentActive when resetState for progress 2016-07-11 21:12:14 -07:00
Gyu-Ho Lee 6f3a40cb53 raft: set leader id in stepFollower
Follower has already set its leader ID from
previous append messages from the leader, but
to be consistent, this adds a line to set its
leader id from leader snapshot message.
2016-07-11 16:37:31 -07:00
swingbach@gmail.com 0d9b6ba0ab raft: fix a few problems 2016-07-11 14:59:53 +08:00
Gyu-Ho Lee c396b6aaaa raft: remove unnecessary type-cast, else-clause 2016-07-09 22:01:19 -07:00
Jared Hulbert 90889ebc0f raftpb: atomic access alignment
The Entry struct has misaligned fields that are accessed atomically.  The
misalignment is caused by the EntryType enum which the Protocol Buffers
spec forces to be a 32bit int.

Moving the order of the fields without renumbering them in the .proto file
seems to align the go structure without changing the wire format.
2016-07-08 11:13:53 -07:00
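To make the alignment concern in 90889ebc0f concrete, here is a small sketch that prints field offsets for two layouts. The struct definitions are simplified stand-ins for the generated Entry type, not copies of it. The background fact is from the Go `sync/atomic` documentation: on some 32-bit platforms the caller must keep 64-bit words that are accessed atomically 64-bit aligned, and the first word of an allocated struct can be relied on to be aligned.

```go
package main

import (
	"fmt"
	"unsafe"
)

// entryBefore puts a 32-bit enum first. On a 32-bit platform the uint64
// fields can then land at offset 4, which is not 64-bit aligned.
type entryBefore struct {
	Type  int32
	Term  uint64
	Index uint64
}

// entryAfter moves the 64-bit fields to the front, so they sit at
// offsets 0 and 8 regardless of platform.
type entryAfter struct {
	Term  uint64
	Index uint64
	Type  int32
}

func offsetsBefore() (term, index uintptr) {
	var e entryBefore
	return unsafe.Offsetof(e.Term), unsafe.Offsetof(e.Index)
}

func offsetsAfter() (term, index uintptr) {
	var e entryAfter
	return unsafe.Offsetof(e.Term), unsafe.Offsetof(e.Index)
}

func main() {
	bt, bi := offsetsBefore()
	at, ai := offsetsAfter()
	// The "before" offsets depend on the platform; compare GOARCH=386 and amd64.
	fmt.Printf("before: Term@%d Index@%d\n", bt, bi)
	fmt.Printf("after:  Term@%d Index@%d\n", at, ai)
}
```

Reordering Go struct fields this way does not change protobuf field numbers, which is why the commit message expects the wire format to stay the same.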
Jared Hulbert df94f58462 raft: atomic access alignment
The relevant structures are properly aligned, however, there is no comment
highlighting the need to keep it aligned as is present elsewhere in the
codebase.

Adding note to keep alignment, in line with similar comments in the codebase.
2016-07-08 11:05:41 -07:00
Gyu-Ho Lee 9e0de02fde raft: fix minor grammar, remove TODO
- test 'Term' panic cases (remove TODO)
- fix minor grammar in 'Node' godoc
2016-07-05 07:21:52 -07:00
Gyu-Ho Lee 881a120453 raft: minor updates and clean up in log.go
- remove redundant test case in log_test.go
- fix test case comment ('equal or larger')
- lastnewi after matching index and term
2016-07-04 16:52:17 -07:00
Xiang Li 8d99a666f9 Merge pull request #5854 from xiang90/r_f
raft: add features section to readme file
2016-07-03 18:00:31 -07:00
Xiang Li c76dcc5190 raft: add features section to readme file 2016-07-03 17:59:59 -07:00
Gyu-Ho Lee 9b5e99efe0 raft: remove unnecessary reflect.DeepEqual in test 2016-07-03 13:42:26 -07:00
Xiang Li 40c4a7894d *: support return prev deleted kv 2016-07-01 14:01:48 -07:00
Gyu-Ho Lee 2cc2372165 raft: give correct offset in unstable test
`unstable.entries[i] has raft log position i+unstable.offset`

So, this fixes some test cases by giving them correct
offsets.
2016-06-29 12:29:36 -07:00
swingbach@gmail.com e020b2a228 raft: make leader transferring workable when quorum check is on 2016-06-29 18:24:58 +08:00
zhonglin6666 df31eab136 raft: simplify truncateAndAppend
truncateAndAppend does not need to subtract one from the value of 'after'.
2016-06-28 18:53:12 -07:00
Xiang Li 5f1c763993 Merge pull request #5553 from swingbach/master
raft: implemented read-only query when quorum check is on
2016-06-28 12:47:43 -07:00
swingbach@gmail.com 0faae33ace raft: implemented read-only query when quorum check is on 2016-06-28 10:52:53 +08:00
Gyu-Ho Lee 6a48961895 raft: len(entries) before Lock, use firstIndex
- To avoid unnecessary locking in case len(entries) == 0
- use firstIndex method
2016-06-24 23:50:00 -07:00
Gyu-Ho Lee 33f7e7583b raft: fix comment,method name to needSnapshotAbort
And 'maybeSnapshotAbort' does not 'unset'
the pendingSnapshot. 'resetState', which is called after this
method, is the one that unsets pendingSnapshot. So this changes
the method name.
2016-06-24 07:54:10 -07:00
Xiang Li 848f539536 raft: make tick unblock and fix potential live lock 2016-06-16 08:01:06 -07:00
Xiang Li 5a7b7f7595 main: add grpc-gateway support
Now etcd can serve HTTP json request at /v3alpha/
2016-06-14 17:09:06 -07:00
Xiang Li ab65d2b848 raft: add docker/swarmkit as notable raft users 2016-06-09 10:10:44 -07:00
Gyu-Ho Lee 1610391449 *: following changes for proto update 2016-06-07 13:33:03 -07:00
Gyu-Ho Lee 843c53192a raft: small fix in doc
'MsgBeat' is an internal type to signal the leader, not the message type
that gets sent to its followers. 'MsgHeartbeat' is the type sent to followers.
2016-06-05 17:47:46 -07:00
Xiang Li 500296d0fb raft: fix TestNodeStepUnblock
The test cases have side effects. We need to stop testing if one of the tests
fails. Also, the timeout should be much longer to avoid false positives.
2016-06-03 10:22:11 -07:00
Xiang Li 9fee7732f6 Merge pull request #5468 from swingbach/master
implemented leader lease when quorum check is on.
2016-06-01 16:10:41 -07:00
swingbach@gmail.com 337ef64ed5 raft: implemented leader lease when quorum check is on 2016-06-02 06:17:27 +08:00
Xiang Li 5b2e130f09 raft: initial readme 2016-05-28 18:37:21 -07:00
swingbach@gmail.com ff9d16a2e0 raft: fix tiny mistake of message type 2016-05-20 14:04:08 +08:00
swingbach@gmail.com 1e54117580 raft: add more comments for dueling candidates test case 2016-05-19 13:51:20 +08:00
swingbach@gmail.com c703ccab63 raft: add more assertions for dueling candidates test case 2016-05-19 13:50:14 +08:00
Xiang Li 910781ef5b raft: do not panic when removing all the nodes from cluster 2016-05-16 10:04:17 -07:00
Xiang Li 4d2424210f Merge pull request #5313 from xiang90/fix_raft_abort
raft: simplify leadership transfer
2016-05-13 09:26:01 -07:00
Gyu-Ho Lee fe884f8209 raft: update LICENSE header 2016-05-12 20:49:15 -07:00
Xiang Li 82a6de8b69 raft: simplify leadership transfer 2016-05-10 20:03:42 -07:00
Gyu-Ho Lee 015acabdbb *: rerun genproto -g 2016-05-02 23:02:31 -07:00
Xiang Li 2fa5b913fe raft: fix flaky test
We recently changed the randomized election timeout from (et, 2*et-1] to
[et, 2*et-2], where et is user set election timeout.

So 2*et might trigger two elections instead of one. We need to fix the test
code accordingly.

Thanks to the TiKV guys for finding this issue. We probably need to randomize
etcd/raft tests more.
2016-05-02 21:08:19 -07:00
Anthony Romano b7ac758969 *: rename storage package to mvcc 2016-04-25 15:25:51 -07:00
Gyu-Ho Lee 4b31acf0e0 *: update generated Proto 2016-04-25 14:08:33 -07:00
Xiang Li 59c5110b73 raft: fix detected race in node.go 2016-04-22 15:45:33 -07:00
Jud White a9cfbd5414 raft/doc.go: add missing } 2016-04-19 04:21:33 -05:00
Gyu-Ho Lee 7a2ef3eb00 *: regenerate proto buffers 2016-04-13 16:24:07 -07:00
mqliang 1044fbce2c etcdctlv3: update auto generated files 2016-04-12 22:48:47 +08:00
Xiang Li 3a695a82a3 Merge pull request #5036 from xiang90/r_t
raft: add a test case for Test Slice
2016-04-11 16:02:13 -07:00
Xiang Li 9423125ce1 raft: add a test case for Test Slice 2016-04-11 10:04:03 -07:00
Gyu-Ho Lee 9108af9046 *: clean up from go vet, misspell 2016-04-10 23:16:56 -07:00
Xiang Li 4997ed36b4 Merge pull request #5011 from xiang90/r_c
raft: fix issues reported by golint
2016-04-08 11:46:12 -07:00
es-chow ac059eb8cb raft: transfer leader feature 2016-04-08 16:56:32 +08:00
Xiang Li 1b41ee9c99 raft: fix issues reported by golint 2016-04-07 22:14:56 -07:00
Anthony Romano dc17eaace7 *: rename Lease Create to Grant
Creating a lease through the client API interface union looked like
"c.Create(...)"-- the method name wasn't very descriptive.
2016-04-07 12:28:14 -07:00
Gyu-Ho Lee 6e6d64fb9b *: clean up unused vars, functions
With help from https://github.com/dominikh/go-unused.
IsNetTimeoutError seems useful, so moved to pkg/netutil.
2016-04-06 21:33:55 -07:00
Tamir Duberstein 68db18667a raft: correct doc comment 2016-04-06 08:43:42 -04:00
Tamir Duberstein 5250784b09 raft: use rand.Intn instead of rand.Int and mod
This provides a better random distribution and is easier to read.
2016-04-06 08:43:42 -04:00
Gyu-Ho Lee c09f23c46d *: clean up bool comparison 2016-04-02 18:27:54 -07:00
Xiang Li 5d431b4782 raft: lower split vote rate 2016-04-01 12:11:03 -07:00
Anthony Romano bd832e5b0a *: migrate Godeps to vendor/ 2016-03-22 17:10:28 -07:00
Xiang Li f5e60c0e18 raft: add optimization notes 2016-03-17 09:53:50 -07:00
Xiang Li aa59e7518e raft: remove unnecessary waitSchedule in test 2016-03-09 09:18:49 -08:00
Peter Bourgon aedf2c5876 raft: Config: comment wrapping @ 80col 2016-03-01 09:54:58 +01:00
Peter Bourgon 6c1b3a71db raft: clarify Heartbeat/ElectionTick comments
Avoid other, ambiguous interpretations.
2016-03-01 09:52:14 +01:00
Anthony Romano afa0368dcc *: fix godoc bugs in interfaces and slice fields
detected with goword
2016-02-24 00:45:40 -08:00
Anthony Romano c5b51946eb *: exported godoc fixups 2016-02-21 20:36:44 -08:00
Xiang Li 390a4518c0 raft: rework comment for advance interface 2016-02-12 13:43:51 -08:00
Anthony Romano 20461ab11a *: fix many typos 2016-01-31 21:42:39 -08:00
Gyu-Ho Lee c827c7432c raft: fix leaky goroutines in raft test 2016-01-31 12:41:33 -08:00
Gyu-Ho Lee 71c2a9bb3c *: fix minor typos, comments 2016-01-30 18:15:56 -08:00
Shawn Smith edd823bba6 raft: fix var name in comment 2016-01-29 16:18:47 +09:00
Xiang Li 37290820de Merge pull request #4293 from bdarnell/bcast-after-commit
raft: Always call bcastAppend after maybeCommit
2016-01-27 09:58:22 -08:00
Sam Rijs be21d90108 raft/doc: add notice about thread safety of messages
Fixes #4285
2016-01-27 20:18:19 +11:00
Xiang Li 6054748181 Merge pull request #4297 from ngaut/ngaut/raft-typo
raft: typo
2016-01-26 20:48:53 -08:00
ngaut 751ab40f44 raft: typo 2016-01-27 12:35:14 +08:00
Gyu-Ho Lee a35d5889f6 *: update gRPC, proto interface 2016-01-26 17:41:39 -08:00
Ben Darnell 0771d713e6 raft: Always call bcastAppend after maybeCommit 2016-01-26 16:55:47 -05:00
Ben Darnell 22925a1d2f raft: Remove redundant `raft.Commit` field.
Keeping this field in sync with `raft.raftLog.committed` was
error-prone, so instead we synthesize the `HardState` on demand.

Fixes #4278.
2016-01-26 15:18:55 -05:00
Xiang Li 8199147cf8 Merge pull request #4246 from bdarnell/commit-after-remove-node
raft: Call maybeCommit after removing a node
2016-01-25 11:47:56 +08:00
Sam Rijs 896719c877 raft: use configured logger in raft/node.go
Those three log statements in node.go have not been using the logger that was passed via `raft.Config`, but instead the default raft logger. This changes it to use the proper logger.
2016-01-25 00:15:44 +11:00
Gyu-Ho Lee 53d6aede82 Merge pull request #3889 from gyuho/raft_doc.go_20151118
raft: doc, debugging instruction on MessageType
2016-01-22 14:22:49 -08:00
Ben Darnell 46bb2582fe raft: Call maybeCommit after removing a node.
removeNode reduces the required quorum size, so some pending entries may
be able to commit after it is applied.

Discovered in cockroachdb/cockroach#3642
2016-01-20 11:05:48 -04:00
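The reasoning behind 46bb2582fe can be checked with a small worked example. This is a simplified model with hypothetical helper names: the commit index is taken as the highest log index that a quorum of nodes has matched, and removing a lagging node shrinks the quorum.

```go
package main

import (
	"fmt"
	"sort"
)

// quorum returns the majority size for a cluster of n voters.
func quorum(n int) int { return n/2 + 1 }

// commitIndex returns the largest index replicated on a quorum of nodes,
// given each node's highest matched log index.
func commitIndex(match map[uint64]uint64) uint64 {
	if len(match) == 0 {
		return 0
	}
	idx := make([]uint64, 0, len(match))
	for _, m := range match {
		idx = append(idx, m)
	}
	// Sort descending; the quorum-th largest value is safe to commit.
	sort.Slice(idx, func(i, j int) bool { return idx[i] > idx[j] })
	return idx[quorum(len(idx))-1]
}

func main() {
	// Four nodes: two are at index 5, two are still at index 3.
	match := map[uint64]uint64{1: 5, 2: 5, 3: 3, 4: 3}
	fmt.Println("commit with 4 nodes:", commitIndex(match)) // quorum is 3

	// Removing node 4 drops the quorum to 2, so index 5 becomes committable
	// without any new acknowledgement arriving.
	delete(match, 4)
	fmt.Println("commit with 3 nodes:", commitIndex(match))
}
```

This is why the commit re-evaluates the commit index right after a removal instead of waiting for the next append response.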
Ben Darnell c185bdaf95 raft: Improve formatting of DescribeMessage 2016-01-20 11:03:07 -04:00
davygeek 194607812c raft: follow golint notice to replace +=1 with ++ 2016-01-13 09:39:00 +08:00
Xiang Li eab052d5c4 Merge pull request #4141 from ngaut/ngaut/refactor
raft: Rename q() to quorum() which is more readable
2016-01-06 07:32:39 -08:00
siddontang 54a45ba2f5 *: fix typo 2016-01-06 16:17:02 +08:00
ngaut 8ee232d4ec raft: Rename q() to quorum() which is more readable 2016-01-06 15:23:35 +08:00
ngaut b38dfda1c9 raft: Tiny refactor
Rename i to id since i looks like index which is confusing.
2016-01-04 21:20:54 +08:00
ngaut acee23112a raft: typo 2016-01-04 11:51:51 +08:00
Jonathan Boulle 5c65c393a5 raft: small typo fixes in raft package doc 2015-12-23 16:37:06 +01:00
Brandon Philips c72e4ae112 raft: add raftexample to the docs
To help people who want to use this package get started, point to the
raftexample package.
2015-12-22 12:04:39 -08:00
Hitoshi Mitake 9b2da76796 raft: remove go vet complaints 2015-12-16 13:29:23 +09:00
Gyu-Ho Lee 8696a1509c raft/rafttest: fix shadowed variable 2015-12-12 09:38:26 -08:00
Jonathan Boulle af9f352fe3 raft: update RecentActive name in comments
Noticed when retrospectively reviewing #3976 that a couple of places
were missed when the variable was renamed.
2015-12-11 15:06:11 -08:00
Xiang Li cc6d98bf89 etcdserver: only send snapshot when the member is active 2015-12-10 16:15:26 -08:00
Xiang Li 9df46f9d6f raft: expose RecentActive in Progress 2015-12-10 12:17:18 -08:00
Bram Gruneir 1901a4c718 raft: Ensure that Progress is not nil when a MsgSnapStatus comes in.
This was causing some issues in cockroach cockroachdb/cockroach#2950
2015-12-07 16:01:18 -05:00
Gyu-Ho Lee d817f885db raft: doc, debugging instruction on MessageType
This adds documentation on MessageType. Having clear explanation about
MessageType helps understand raft logic and debug etcd when there is a
message dropping. This is partially for coreos#3806.
2015-12-03 00:45:11 -08:00
es-chow 5bc56786dc raft: add RawNode which is a thread-unsafe node without goroutine and remove MultiNode 2015-11-26 17:14:14 +08:00
Xiang Li a8cc1570d0 raft: support quorum check when raft is leader
If quorum check fails, the leader will step down to follower.
2015-11-24 09:36:37 -08:00
Gyu-Ho Lee 81229dbea9 *: add missing package descriptions
This adds and updates package descriptions in etcd projects.
And also deletes some duplicate LICENSE statements.
2015-11-17 20:54:10 -08:00
Gyu-Ho Lee e1c108e604 raft: minor typo in progress.go
Fixes a minor typo.
2015-11-17 14:21:35 -08:00
Xiang Li 5d0268aa2e Merge pull request #3877 from bdarnell/campaign-while-leader
raft: no-op instead of panic for Campaigning while leader
2015-11-16 19:59:34 -08:00
Ben Darnell fbeb58d265 raft: no-op instead of panic for Campaigning while leader
We need to be able to force an election (on one node) after creating a
new group (cockroachdb/cockroach#1384), but it is difficult to ensure
that our call to Campaign does not race with an election that may be
started by raft itself. A redundant call to Campaign should be a no-op
instead of a panic. (But the panic in becomeCandidate remains, because
we don't want to update the term or change the committed index in this
case)
2015-11-16 21:44:14 -05:00
Yicheng Qin 3a65442d7d raft: fix print format for term in one log line
`term` should be printed in decimal representation instead of
hexadecimal one.
2015-11-15 20:26:16 -08:00
Xiang Li 2990249c1d Merge pull request #3856 from xiang90/raft_doc_restart
raft: add doc to make restart clear
2015-11-11 11:15:49 -08:00
Xiang Li f7f28b9984 raft: add doc to make restart clear, especially for configuration changed case 2015-11-11 11:11:58 -08:00
Xiang Li 6df52614fc raft: add more words about raft protocol 2015-11-11 09:20:25 -08:00
Yicheng Qin 0de52414cd raft: extend wait timeout in TestNodeAdvance
This fixes the failure met in semaphore CI.
2015-11-03 16:57:18 -08:00
Yicheng Qin bf3057e5bd raft: extend wait timeout in TestMultiNodeAdvance
This fixes the failure met in semaphore CI:

```
--- FAIL: TestMultiNodeAdvance-2 (0.01s)
		multinode_test.go:458: expect Ready after Advance, but there is
		no Ready available
```
2015-10-23 12:08:24 -07:00
Yicheng Qin 01806c3e80 raft: fix malformed example name
It is reported by latest govet:
```
gopath/src/github.com/coreos/etcd/raft/example_test.go:26: Example_Node
has malformed example suffix: Node
```
2015-10-20 16:40:01 -07:00
Gyu-Ho Lee 1716d5858f raft/documentation: clarify progress's subjects.
If I understand correctly, `progress` represents the states of follower. For
me, some comments weren't clear because it was missing the subjects of
`progress`. This adds more clarification on who is doing what. Please let me
know if I misunderstood anything. Thanks,
2015-10-15 19:15:08 -07:00
Cong Ding 362df8e470 raft/doc: fix misuse of `for' loop in docs 2015-10-15 11:13:30 -05:00
Cong Ding f1f92f0fa3 raft/doc: fix typos 2015-10-15 02:17:34 -05:00
Kenji Kaneda ebd8cb04c1 raft: fix a description of MemoryStorage.Compact
The parameter name is compactIndex, not i.
2015-10-06 21:49:33 -07:00
Cong Ding b2edf1d24a raft: fix typo in doc 2015-10-01 11:21:23 -05:00
Yicheng Qin 533e728b64 Merge pull request #3609 from yichengq/raft-snapshot
raft: kill TODO about behavior when snapshot fails
2015-09-29 19:32:31 -07:00
Yicheng Qin 4c82b481a5 raft: improve behavior when snapshot fails
etcd is going to support incremental snapshots, and in the first stage it is
designed to send out at most one snapshot at a time. So while one snapshot is
in flight, a snapshot request will return an error.

When it fails to get a snapshot while sending MsgSnap, raft logs the failure
and aborts sending the message.
2015-09-29 19:15:15 -07:00
Kenji Kaneda f602767e50 raft: remove an obsolete TODO comment on 4MB maxMsgSize hard coding
The TODO comment was added by 7571b2cd, and it was addressed by d9b5b56c.
2015-09-28 21:31:12 -07:00
Emil Hessman b9f22cb69b raft: fix Node doc typo 2015-09-21 06:13:33 +02:00
Ben Darnell b7baaa6bc8 raft: Allow per-group nodeIDs in MultiNode.
This feature is motivated by
https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/replica_tombstone.md
which requires a change to the way CockroachDB constructs its node IDs.
2015-09-18 15:36:36 -04:00
Jonathan Boulle 7848ac3979 *: add missing license headers 2015-09-15 14:09:01 -07:00
Brandon Philips 68d4ec3e13 raft: improve panic error message
Give a human being some insight into how we might have gotten to this
state based on feedback from #3504.
2015-09-12 12:17:02 -07:00
Dmitry Smirnov b2f4a5f587 *: fix spelling issues (codespell).
Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
2015-09-11 10:22:29 +10:00
Xiang Li ef7cf058a2 *: update gogoproto 2015-09-03 15:32:25 -07:00
Tamir Duberstein 45390b9fb8 *: regenerate proto to use local import path
Using Go-style import paths in protos is not idiomatic. Normally, this
detail would be internal to etcd, but the path from which gogoproto
is imported affects downstream consumers (e.g. cockroachdb).

In cockroach, we want to avoid including `$GOPATH/src` in our protoc
include path for various reasons. This patch puts etcd on the same
convention, which allows this for cockroach.

More information: https://github.com/cockroachdb/cockroach/pull/2339#discussion_r38663417

This commit also regenerates all the protos, which seem to have
drifted a tiny bit.
2015-09-03 13:38:28 -04:00
Ben Darnell 4f20e01f60 raft: Ignore proposals if not a current member.
Fixes another panic in MultiNode.Propose.
2015-08-31 20:31:14 -04:00
Xiang Li 6cbaaa715c Merge pull request #3396 from bdarnell/multinode-propose-panic
raft: Fix a nil-pointer panic in MultiNode.Propose.
2015-08-28 12:34:49 -07:00
Ben Darnell 05924b330a raft: Fix a nil-pointer panic in MultiNode.Propose. 2015-08-28 11:17:59 +02:00
Yicheng Qin df83af944b Merge pull request #3384 from yichengq/fix-shadow
test: use go vet shadow feature instead of go-nyet
2015-08-27 14:27:57 -07:00
Yicheng Qin 92cd24d5bd *: fix govet shadow check failure 2015-08-27 14:15:30 -07:00
Matt Keller 32372e1d70 raft: Fixed a test misassumption
network_test.go:56: total = 59.22354ms, want > 50ms
59 is > 50, but the equation added 10 to the right side
2015-08-27 15:15:34 -04:00
Cong Ding c09b667d57 *: fix go vet reported issues 2015-08-22 12:19:02 -05:00
Xiang Li 6b23a8131f *: test gofmt with -s and fix reported issues 2015-08-21 18:52:16 -07:00
Xiang Li 50c1db3fbf raft: downgrade the logging around snapshot to debugf
Snapshot-related logging is spammy when the leader is trying to
sync a failed peer.
2015-08-18 15:43:53 -07:00
es-chow cc362ccdad raft: set logger to raft so log context such as multinode groupID can be logged 2015-08-12 22:56:00 +08:00
Xiang Li 845c51fedd *: fix typos vaild->valid 2015-08-07 10:57:11 -07:00
Xiang Li 581ef05bab *: resolve proto warnings 2015-06-29 18:39:46 -07:00
Xiang Li 13f44e4b79 *: update generated proto code 2015-06-29 16:45:25 -07:00
Xiang Li e01d53b853 Merge pull request #2979 from xiang90/fix_sendapp
raft: fix panic in send app
2015-06-29 10:49:04 -07:00
Xiang Li b4022899eb raft: fix panic in send app
sendApp accesses the storage several times. Previously, we
assumed that the storage would not be modified during the read
operations. The assumption is not true, since the storage can
be compacted between the read operations. If a compaction
causes a read entries error, we should not panic. Instead, we
can simply retry the sendApp logic until it succeeds.
2015-06-15 14:23:33 -07:00
Xiang Li 2f0169c3ab raft: fix usage section of doc
We recently added a config struct to start raft. Update
our doc accordingly.
2015-06-15 10:26:10 -07:00
Yicheng Qin 4e79abcfeb Merge pull request #2944 from yichengq/fix-2procs
pkg/testutil: ForceGosched -> WaitSchedule
2015-06-10 14:44:32 -07:00
Yicheng Qin 018fb8e6d9 pkg/testutil: ForceGosched -> WaitSchedule
ForceGosched() performs badly when GOMAXPROCS>1. When GOMAXPROCS=1, it
can promise that other goroutines run long enough
because it always yields the processor to other goroutines. But it cannot
yield the processor to goroutines running on other processors. So when
GOMAXPROCS>1, the yield may finish when a goroutine on another
processor has run for only a short time.

Here is a test to confirm the case:

```
package main

import (
	"fmt"
	"runtime"
	"testing"
)

func ForceGosched() {
	// possibility enough to sched up to 10 go routines.
	for i := 0; i < 10000; i++ {
		runtime.Gosched()
	}
}

var d int

func loop(c chan struct{}) {
	for {
		select {
		case <-c:
			for i := 0; i < 1000; i++ {
				fmt.Sprintf("come to time %d", i)
			}
			d++
		}
	}
}

func TestLoop(t *testing.T) {
	c := make(chan struct{}, 1)
	go loop(c)
	c <- struct{}{}
	ForceGosched()
	if d != 1 {
		t.Fatal("d is not incremented")
	}
}
```

`go test -v -race` runs well, but `GOMAXPROCS=2 go test -v -race` fails.

Change the function to wait for scheduling to happen.
2015-06-10 14:37:41 -07:00
Xiang Li 1279e495f0 raft: make the repeated log message under bad path debug level 2015-06-05 17:29:24 -07:00
Xiang Li 1561b85bf3 raft: drop the raft prefix in logging 2015-06-02 12:50:42 -07:00
Xiang Li 0ca6be31f8 raft: remove wrong invariant
The commit > unstable might not be true for a follower. The leader only needs
to ensure the entry is stored on the majority of nodes to commit an
entry. So the minority of the cluster might receive a commit > unstable
append request. This is normal.
2015-05-29 18:48:59 -07:00
Xiang Li 085447ed85 raft: fix raft node start bug
The raft node should set the initial prev hard state to empty.
Otherwise it will not send the first hard coded state to the application
until the state changes again.

This commit fixes the issue. It introduces a small overhead: the
same state might be sent to the application twice when restarting.
But this is fine.
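
The effect can be sketched as follows, assuming a simplified `hardState` struct and a hypothetical `pendingHardState` helper (neither is the actual raft package code): a state is only handed to the application when it differs from the previously emitted one, so the starting value of "previous" decides whether a restored state is ever reported.

```go
package main

import "fmt"

// hardState is a simplified stand-in for the persisted raft state.
type hardState struct {
	Term, Vote, Commit uint64
}

// pendingHardState reports whether cur must be sent to the application,
// i.e. whether it differs from the last state that was sent (prev).
func pendingHardState(prev, cur hardState) (hardState, bool) {
	if prev == cur {
		return hardState{}, false
	}
	return cur, true
}

func main() {
	restored := hardState{Term: 2, Vote: 1, Commit: 5}

	// prev initialized to the restored state: nothing is emitted.
	_, emitted := pendingHardState(restored, restored)
	fmt.Println(emitted)

	// prev initialized to empty: the restored state is emitted once.
	_, emitted = pendingHardState(hardState{}, restored)
	fmt.Println(emitted)
}
```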
2015-05-27 13:32:04 -07:00
Xiang Li 0ad6d7e3ba Merge pull request #2853 from bdarnell/status
raft: MultiNode.Status returns nil for non-existent groups.
2015-05-20 13:07:23 -07:00
Ben Darnell d58fac453d raft: MultiNode.Status returns nil for non-existent groups.
Previously it would panic if the group did not exist.
2015-05-20 15:45:38 -04:00
Ben Darnell ef721db247 raft: Format node IDs as hex in DescribeMessage.
This is how they are printed in all other log messages.
2015-05-20 15:32:56 -04:00
xujun 6b7891c643 raft: fix typo in raftlog
Fix a typo in the String() method of raftlog that misorders
the "committed" and "unstable.offset" output.
2015-04-24 03:28:57 -04:00
Yicheng Qin 89495f9194 Merge pull request #2626 from yichengq/fix-raft-status
raft: generate correct json-format status
2015-04-03 13:54:46 -07:00
Yicheng Qin fa96e64b43 Merge pull request #2624 from yichengq/fix-raft-storage
raft: lock storage when compact it
2015-04-03 13:51:06 -07:00
Yicheng Qin 3d32c059dd raft: generate correct json-format status
The current JSON-format string is missing the double quotes around the status field.

Use %q for better clarity.
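
A small illustration of the idea, assuming a hypothetical `statusJSON` helper and field names (not the actual raft.Status code): `%q` wraps the string in double quotes, so the hand-built output parses as JSON. Note that `%q` applies Go string escaping, which matches JSON for plain ASCII identifiers like these but not for every possible input.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusJSON builds a small JSON object by hand. The field names and
// the helper itself are illustrative only.
func statusJSON(id uint64, state string) string {
	// %x prints the ID as hex inside explicit quotes; %q adds the
	// double quotes around the state string.
	return fmt.Sprintf(`{"id":"%x","raftState":%q}`, id, state)
}

func main() {
	s := statusJSON(1, "StateLeader")
	fmt.Println(s, json.Valid([]byte(s)))
}
```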
2015-04-03 13:49:46 -07:00
Yicheng Qin d91ea7f199 raft: fix freeTo fails to free
If freeTo is called when to is set to the latest inflight, freeTo
fails to free the slots.
2015-04-03 13:21:26 -07:00
Yicheng Qin c6de464587 raft: lock storage when compact it
etcd now compacts raft storage asynchronously, and appending entries to raft
storage may happen at the same time. Add the lock to fix the bug that
the entries saved in storage may be organized in a wrong way.
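
The locking pattern can be sketched like this, assuming a simplified `memStorage` type that stores only entry indexes (the type, its fields, and the `run` helper are illustrative, not the real storage implementation). Both Append and Compact take the same mutex, so a compaction can never interleave with an append.

```go
package main

import (
	"fmt"
	"sync"
)

// memStorage is a simplified in-memory log: ents[i] holds index first+i.
type memStorage struct {
	mu    sync.Mutex
	first uint64
	ents  []uint64
}

// Append adds the next entry index under the lock.
func (s *memStorage) Append(idx uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.ents = append(s.ents, idx)
}

// Compact discards entries with index < to, under the same lock.
func (s *memStorage) Compact(to uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if to <= s.first {
		return
	}
	n := to - s.first
	if n > uint64(len(s.ents)) {
		n = uint64(len(s.ents))
	}
	s.ents = append([]uint64(nil), s.ents[n:]...)
	s.first += n
}

// run appends 1..100 while compacting concurrently, then checks that
// the remaining entries are still contiguous and end at index 100.
func run() bool {
	s := &memStorage{first: 1}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := uint64(1); i <= 100; i++ {
			s.Append(i)
		}
	}()
	go func() {
		defer wg.Done()
		for to := uint64(1); to <= 50; to++ {
			s.Compact(to)
		}
	}()
	wg.Wait()
	for i, e := range s.ents {
		if e != s.first+uint64(i) {
			return false
		}
	}
	return s.first+uint64(len(s.ents)) == 101
}

func main() {
	fmt.Println(run())
}
```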
2015-04-03 11:38:01 -07:00
Xiang Li 3f867bc6ed raft: node bench matches reality 2015-03-28 14:53:42 -07:00
Xiang Li 05e240b892 *: update protobuf 2015-03-25 10:14:35 -07:00
Ben Darnell c9d507df11 raft: Use raft.Config in MultiNode. 2015-03-24 15:37:13 -04:00
Xiang Li b3fb052ad4 raft: make peers a private field in raft.Config 2015-03-24 11:10:07 -07:00
Xiang Li abddef0f28 raft: make node configurable 2015-03-23 21:20:49 -07:00
Brandon Philips 057978bbc6 raft: design: fixup markdown
Need a space between `1.` for markdown to render as a list.
2015-03-23 14:01:17 -07:00
Xiang Li d9b5b56c82 raft: make raft configurable 2015-03-23 09:55:19 -07:00
Xiang Li a552722f03 Merge pull request #2544 from xiang90/raft-inflight
raft: add flow control for progress
2015-03-20 20:12:31 -07:00
Xiang Li 4a64373225 raft: add flow control for progress
Each progress has an inflights sliding window. When the progress
is in replicate state, inflights will control the sending speed
of the leader.

The leader can have at most maxInflight number of inflight
messages for each replicate progress. Receiving an appResp moves
the sliding window forward. A heartbeat response frees one
slot if the window is full.
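
A rough sketch of such a window, written as a ring buffer. The type and method names here are assumptions for illustration and may differ from the code in this commit.

```go
package main

import "fmt"

// inflights is a fixed-size ring buffer holding the last entry index of
// each message that has been sent but not yet acknowledged.
type inflights struct {
	start  int // position of the oldest in-flight message
	count  int // number of in-flight messages
	buffer []uint64
}

func newInflights(size int) *inflights {
	return &inflights{buffer: make([]uint64, size)}
}

// full reports whether the leader must stop sending to this follower.
func (in *inflights) full() bool { return in.count == len(in.buffer) }

// add records a newly sent message; it returns false if the window is full.
func (in *inflights) add(idx uint64) bool {
	if in.full() {
		return false
	}
	next := (in.start + in.count) % len(in.buffer)
	in.buffer[next] = idx
	in.count++
	return true
}

// freeTo releases every in-flight message whose index is <= to,
// including the case where to is the latest in-flight index.
func (in *inflights) freeTo(to uint64) {
	freed, i := 0, in.start
	for freed < in.count && in.buffer[i] <= to {
		i = (i + 1) % len(in.buffer)
		freed++
	}
	in.count -= freed
	in.start = i
}

// freeFirstOne releases the oldest slot, as a heartbeat response would.
func (in *inflights) freeFirstOne() {
	if in.count > 0 {
		in.freeTo(in.buffer[in.start])
	}
}

func main() {
	in := newInflights(3)
	for idx := uint64(1); idx <= 4; idx++ {
		fmt.Println(idx, in.add(idx))
	}
	in.freeFirstOne()
	fmt.Println(in.full(), in.count)
}
```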
2015-03-20 20:04:33 -07:00
Xiang Li 09a86cb9b9 Merge pull request #2553 from xiang90/raft-design
raft: add progress state machine graph
2015-03-20 19:57:51 -07:00
Xiang Li 86622537a1 raft: add progress state machine graph 2015-03-20 15:28:50 -07:00
Xiang Li 44d9209990 Merge pull request #2548 from xiang90/raft-design
raft: add our very first design.md
2015-03-20 09:07:44 -07:00
Yicheng Qin 6e557c58c7 Merge pull request #2532 from yichengq/342
raft: print out date and time in log
2015-03-20 08:03:23 -07:00
Xiang Li 59d8089295 raft: add our very first design.md 2015-03-19 21:00:47 -07:00
Xiang Li 2adb58f9de raft: move progress to progress.go 2015-03-19 10:05:04 -07:00
Xiang Li 7571b2cde2 raft: limit the size of msgApp
Limit the max size of entries sent per message.
This lowers the cost in the probing state, since we limit the size per message,
and lowers the penalty when next is aggressively decreased to too low a value.
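
One way to express the size limit, as a sketch with a simplified `entry` type (the real entries are protobuf messages, and this helper is not guaranteed to match the one in the commit): keep the longest prefix whose payload fits in the budget, but always send at least one entry so replication can make progress.

```go
package main

import "fmt"

// entry is a simplified log entry; only the payload size matters here.
type entry struct {
	Index uint64
	Data  []byte
}

// limitSize returns the longest prefix of ents whose total payload size
// does not exceed maxSize. A non-empty input always yields at least one
// entry, even if that entry alone is larger than maxSize.
func limitSize(ents []entry, maxSize uint64) []entry {
	if len(ents) == 0 {
		return ents
	}
	size := uint64(len(ents[0].Data))
	limit := 1
	for ; limit < len(ents); limit++ {
		size += uint64(len(ents[limit].Data))
		if size > maxSize {
			break
		}
	}
	return ents[:limit]
}

func main() {
	ents := make([]entry, 5)
	for i := range ents {
		ents[i] = entry{Index: uint64(i + 1), Data: make([]byte, 4)}
	}
	fmt.Println(len(limitSize(ents, 10)), len(limitSize(ents, 0)), len(limitSize(ents, 100)))
}
```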
2015-03-18 15:59:30 -07:00
Yicheng Qin 0634cf2cfe raft: print out date and time in log
Keep the default log setting consistent with other packages.
2015-03-18 15:49:06 -07:00
Yicheng Qin 7e7bc76038 Merge pull request #2514 from yichengq/340
raft: introduce progress states
2015-03-18 09:40:30 -07:00
Yicheng Qin 67194c0b22 raft: introduce progress states 2015-03-18 08:16:32 -07:00
Xiang Li d17f3a4452 Merge pull request #2519 from bdarnell/multinode-commit
raft: Use the correct commit index when advancing in MultiNode.
2015-03-17 10:31:53 -07:00
Ben Darnell cd1ff78ff3 raft: Elaborate a little more about committed entries in commitReady. 2015-03-17 13:22:36 -04:00
funkygao 0b912c0faf raft: fix godoc about starting a node 2015-03-17 17:35:18 +08:00
Ben Darnell 271d911c32 raft: Use the correct commit index when advancing in MultiNode.
This fixes an issue when restoring from a snapshot and brings
MultiNode closer to Node.
2015-03-16 18:40:51 -04:00
Ben Darnell 5e19adcf70 raft: correctly pass arguments to Logger.Panicf() 2015-03-12 16:15:43 -04:00
Iago López Galeiras e698192e4a rafttest: fix build error
raftLogger is not exported so we can't access it from here. Go back to
using log.
2015-03-12 11:47:13 +01:00
Xiang Li 39731724ff Merge pull request #2485 from yichengq/337
raft: fall back to bad path when unreachable
2015-03-11 14:16:39 -07:00
Yicheng Qin be0bf2a2bd raft: fall back to bad path when unreachable 2015-03-11 13:21:23 -07:00
Xiang Li c643967a41 raft: reply with the commit index when receives a smaller append message
A follower should not reject an append message with an index smaller than its commit
index. Otherwise it will trigger the leader's resending logic, which might have a high cost.
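
The behavior can be sketched as below. The `follower` and `appendResp` types are hypothetical simplifications, and term checks and log matching are deliberately omitted, so this is not the actual message handler.

```go
package main

import "fmt"

// appendResp is a simplified append response.
type appendResp struct {
	Index  uint64 // index the leader should treat as matched (or the rejected index)
	Reject bool
}

// follower tracks only the two indexes needed for this illustration.
type follower struct {
	committed uint64
	lastIndex uint64
}

// handleAppend processes an append whose entries follow prevIndex and
// carry n entries. Term checks are omitted in this sketch.
func (f *follower) handleAppend(prevIndex, n uint64) appendResp {
	if prevIndex < f.committed {
		// Entries up to committed are already known to match the leader,
		// so acknowledge the commit index instead of rejecting.
		return appendResp{Index: f.committed}
	}
	if prevIndex > f.lastIndex {
		// There is a gap in the log: reject so the leader backs off.
		return appendResp{Index: prevIndex, Reject: true}
	}
	f.lastIndex = prevIndex + n
	return appendResp{Index: f.lastIndex}
}

func main() {
	f := &follower{committed: 5, lastIndex: 7}
	fmt.Println(f.handleAppend(3, 1)) // stale append: acknowledged with the commit index
	fmt.Println(f.handleAppend(9, 1)) // gap: rejected
	fmt.Println(f.handleAppend(7, 2)) // normal append
}
```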
2015-03-10 22:32:36 -07:00
Xiang Li a2be25cba4 Merge pull request #2460 from xiang90/raft-logger
raft: introduce logger interface
2015-03-09 08:00:21 -07:00
Xiang Li 97579e2e1d raft: introduce logger interface 2015-03-08 21:36:32 -07:00
Xiang Li 7fe608532a raft: do not reset vote if term is not changed
raft MUST keep the voting information for the same term. reset
should not reset vote if term is not changed.
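
In sketch form, with a pared-down `raft` struct whose field names are assumptions for illustration: the vote is cleared only when the term actually changes, while other per-role state is reset unconditionally.

```go
package main

import "fmt"

// none marks "no node", e.g. no vote cast or no known leader.
const none uint64 = 0

// raft holds only the fields relevant to this illustration.
type raft struct {
	term uint64
	vote uint64
	lead uint64
}

// reset prepares the node for a role change. The vote survives unless
// the term changes, so a node cannot vote twice in the same term.
func (r *raft) reset(term uint64) {
	if r.term != term {
		r.term = term
		r.vote = none
	}
	r.lead = none
}

func main() {
	r := &raft{term: 2, vote: 3, lead: 3}
	r.reset(2)
	fmt.Println(r.term, r.vote) // same term: vote kept
	r.reset(3)
	fmt.Println(r.term, r.vote) // new term: vote cleared
}
```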
2015-03-07 22:31:20 -08:00
Ben Darnell 725c411346 Add ReportUnreachable and ReportSnapshot to MultiNode.
Add ReportSnapshot requirement to doc.go.
2015-03-05 12:39:52 -05:00
Xiang Li 6b9b695167 Merge pull request #2435 from bdarnell/multinode
raft: Introduce MultiNode.
2015-03-04 21:27:20 -08:00
Ben Darnell c824c867ec raft: more doc updates.
Including parallelism of persist and send, cancellation of
ConfChanges, and the risks of two-node clusters.
2015-03-04 15:48:35 -05:00
Ben Darnell 4e74d81bbb raft: Introduce MultiNode.
MultiNode is an alternative to raft.Node that is more efficient
when a node may participate in many consensus groups. It is currently
used in the CockroachDB project; this commit merges the
github.com/cockroachdb/etcd fork back into the mainline.
2015-03-04 15:30:21 -05:00
Ben Darnell 250970cc23 raft: Expand doc.go
Includes more details on the required caller behavior and the safety of
membership changes.

Closes #2397
2015-03-04 13:18:02 -05:00
Yicheng Qin b4b9b9118a rafthttp: report MsgSnap status 2015-03-02 09:38:11 -08:00