140 Commits (c8ffa36d9e473a9cb2d570f9176c9109b066df68)

Author SHA1 Message Date
Yicheng Qin 239c8dd479 raft: add comment to newLog 2014-11-24 21:47:12 -08:00
Xiang Li 10ebf1a335 raft: fix memoryStorage append 2014-11-24 16:36:59 -08:00
Xiang Li 2876c652ab raft: fix for go vet 2014-11-24 15:00:38 -08:00
Xiang Li 3dd4c458ca raft: refactor term in log.go 2014-11-24 10:13:56 -08:00
Xiang Li 94190286ff raft: add comment for append in unstableEntries in log.go 2014-11-24 09:05:40 -08:00
Xiang Li 0a46c70f5d raft: use empty slice in unstableEntries in log.go 2014-11-24 09:04:45 -08:00
Xiang Li bc0e72acb9 raft: clean up panic in log.go 2014-11-24 09:01:25 -08:00
Xiang Li f3cef87c69 raft: remove extra empty line in log.go 2014-11-24 08:43:34 -08:00
Xiang Li bdbafe2cf3 raft: use max in log.slice 2014-11-24 08:36:15 -08:00
Ben Darnell b29240baf0 Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  scripts: build-docker tag and use ENTRYPOINT
  scripts: build-release add etcd-migrate
  create .godir
  raft: optimistically increase the next if the follower is already matched
  raft: add handleHeartbeat
    handleHeartbeat commits to the commit index in the message. It never decreases
    the commit index of the raft state machine.
  rafthttp: send takes raft message instead of bytes
  *: add rafthttp pkg into test list
  raft: include commitIndex in heartbeat
  rafthttp: move server stats in raftHandler to etcdserver
  *: etcdhttp.raftHandler -> rafthttp.RaftHandler
  etcdserver: rename sender.go -> sendhub.go
  *: etcdserver.sender -> rafthttp.Sender

Conflicts:
	raft/log.go
	raft/raft_paper_test.go
2014-11-19 17:05:16 -05:00
Ben Darnell 355ee4f393 raft: Integrate snapshots into the raft.Storage interface.
Compaction is now treated as an implementation detail of Storage
implementations; Node.Compact() and related functionality have been
removed. Ready.Snapshot is now used only for incoming snapshots.

A return value has been added to ApplyConfChange to allow applications
to track the node information that must be stored in the snapshot.

raftpb.Snapshot has been split into Snapshot and SnapshotMetadata, to
allow the full snapshot data to be read from disk only when needed.

raft.Storage has new methods Snapshot, ApplySnapshot, HardState, and
SetHardState. The Snapshot and HardState parameters have been removed
from RestartNode() and will now be loaded from Storage instead.
The only remaining difference between StartNode and RestartNode is that
the former bootstraps an initial list of Peers.
2014-11-19 16:40:26 -05:00
Ben Darnell 46ee58c6f0 raft: Rename ErrSnapshotRequired To ErrCompacted. 2014-11-18 13:15:10 -05:00
Xiang Li bd4cfa2a07 raft: add handleHeartbeat
handleHeartbeat commits to the commit index in the message. It never decreases the
commit index of the raft state machine.
2014-11-18 08:34:06 -08:00
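
The handleHeartbeat behaviour described in the commit above can be illustrated with a minimal sketch. The types `raftLog` and `heartbeat` and the helper `commitTo` are hypothetical stand-ins chosen for this example, not the package's actual definitions, and the sketch does not check the commit index against the length of the log.

```go
package main

import "fmt"

// raftLog is a hypothetical, stripped-down log that tracks only the commit index.
type raftLog struct {
	committed uint64
}

// commitTo advances the commit index but never moves it backwards.
func (l *raftLog) commitTo(tocommit uint64) {
	if l.committed < tocommit {
		l.committed = tocommit
	}
}

// heartbeat is a hypothetical message carrying the leader's commit index.
type heartbeat struct {
	Commit uint64
}

// handleHeartbeat commits to the commit index carried by the heartbeat.
func handleHeartbeat(l *raftLog, m heartbeat) {
	l.commitTo(m.Commit)
}

func main() {
	l := &raftLog{committed: 5}

	handleHeartbeat(l, heartbeat{Commit: 3}) // stale commit index: ignored
	fmt.Println(l.committed)                 // prints 5

	handleHeartbeat(l, heartbeat{Commit: 8}) // newer commit index: adopted
	fmt.Println(l.committed)                 // prints 8
}
```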
Ben Darnell 300c5a2001 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master: (21 commits)
  etcdserver: refactor ValidateClusterAndAssignIDs
  integration: add integration test for remove member
  integration: add test for member restart
  version: bump to alpha.3
  etcdserver: add buffer to the sender queue
  *: gracefully stop etcdserver
  Fix up migration tool, add snapshot migration
  etcd4: migration from v0.4 -> v0.5
  etcdserver: export Member.StoreKey
  etcdserver: recover cluster when receiving newer snapshot
  etcdserver: check and select committed entries to apply
  etcdserver: recover from snapshot before applying requests
  raft: not set applied when restored from snapshot
  sender: support elegant stop
  etcdserver: add StopNotify
  etcdserver: fix TestDoProposalStopped test
  etcdserver: minor cleanup
  etcdserver: validate new node is not registered before in best effort
  etcdserver: fix server.Stop()
  *: print out configuration when necessary
  ...

Conflicts:
	etcdserver/server.go
	etcdserver/server_test.go
	raft/log.go
2014-11-17 18:28:24 -05:00
Ben Darnell 64d9bcabf1 Add Storage.Term() method and hide the first entry from other methods.
The first entry in the log is a dummy which is used for matchTerm
but may not have an actual payload. This change permits Storage
implementations to treat this term value specially instead of
storing it as a dummy Entry.

Storage.FirstIndex() no longer includes the term-only entry.

This reverses a recent decision to create entry zero as initially
unstable; Storage implementations are now required to make
Term(0) == 0 and the first unstable entry is now index 1.
stableTo(0) is no longer allowed.
2014-11-17 16:54:12 -05:00
Yicheng Qin 7d0ffb3f12 raft: not set applied when restored from snapshot
applied is only updated by application level through Advance.
2014-11-14 12:08:39 -08:00
Ben Darnell 45e96be605 raft: PR feedback.
Removed Get prefix in method names, added assertions and fixed comments.
2014-11-14 13:53:42 -05:00
Ben Darnell 0e8ffe9128 raft: remove a guard that is no longer necessary 2014-11-13 15:51:36 -05:00
Ben Darnell b29c512f50 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master: (27 commits)
  pkg/wait: move wait to pkg/wait
  etcdserver: do not add/remove/update local member to/from sender hub
  etcdserver: not record attributes when add member
  raft: add a test for proposeConfChange
  raft: block Stop() on n.done, support idempotency
  raft: add a test for node proposal
  integration: add increase cluster size test
  integration: remove unnecessary t.Testing argument
  raft: stop the node synchronously
  integration: fix test to propagate NewServer errors
  etcdserver: move peer URLs check to config
  etcdserver: ensure initial-advertise-peer-urls match initial-cluster
  raft: add a test for node.Tick
  raft: add comment string for TestNodeStart
  etcdserver: use member instead of node at etcd level
  raft: nodes return sorted ids
  raft: update unstable when calling stableTo with 0
  *: support updating advertise-peer-url
    Users might want to update the peer URL of an etcd member in several cases. For example,
    if the IP address of the physical machine etcd is running on changes, the user needs to
    update advertise-peer-url accordingly. This commit makes etcd support updating the
    advertise-peer-url of its members.
  transport: create a tls listener only if the tlsInfo is not empty and the scheme is HTTPS
  etcdserver: use member pointer for all tests
  ...

Conflicts:
	etcdserver/server.go
	raft/log.go
	raft/log_test.go
	raft/node.go
2014-11-13 14:21:09 -05:00
Ben Darnell 54b07d7974 Remove raft.loadEnts and the ents parameter to raft.RestartNode.
The initial entries are now provided via the Storage interface.
2014-11-12 18:31:19 -05:00
Ben Darnell 147fd614ce The initial term=0 log entry is now initially unstable.
This entry is now persisted through the normal flow instead of appearing
in the stored log at creation time.  This is how things worked before
the Storage interface was introduced. (see coreos/etcd#1689)
2014-11-12 18:24:16 -05:00
Ben Darnell 76a3de9a33 Require a non-nil Storage parameter in newLog.
Callers must in general have a reference to their Storage objects to
transfer entries from Ready to Storage, so it doesn't make sense to
create a hidden Storage for them.

By explicitly creating Storage objects in tests we can remove a
few casts of raftLog's storage field.
2014-11-12 16:38:50 -05:00
Yicheng Qin 7dba92dd53 raft: update unstable when calling stableTo with 0
It should update unstable in this case because it may happen that raft
only writes entry 0 into stable storage.
2014-11-11 17:20:31 -08:00
Ben Darnell 25b6590547 raft: introduce log storage interface.
This change splits the raftLog.entries array into an in-memory
"unstable" list and a pluggable interface for retrieving entries that
have been persisted to disk. An in-memory implementation of this
interface is provided which behaves the same as the old version;
in a future commit etcdserver could replace the MemoryStorage with
one backed by the WAL.
2014-11-10 17:40:39 -05:00
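
The split described in the commit above, an in-memory "unstable" list in front of a pluggable store for persisted entries, might look roughly like the sketch below. Everything here is a simplified assumption made for illustration: the `Entry` type, the two-method `Storage` interface, and the `slice` helper do not reproduce the package's real signatures (error returns, for instance, are omitted), and entry indexes are assumed to start at 1.

```go
package main

import "fmt"

// Entry is a hypothetical, minimal log entry.
type Entry struct {
	Index, Term uint64
}

// Storage is a hypothetical, simplified view of persisted entries.
type Storage interface {
	// Entries returns the persisted entries in the range [lo, hi).
	Entries(lo, hi uint64) []Entry
	// LastIndex returns the index of the last persisted entry (0 if none).
	LastIndex() uint64
}

// MemoryStorage keeps "persisted" entries in memory; ents[i].Index == i+1.
type MemoryStorage struct {
	ents []Entry
}

func (s *MemoryStorage) Entries(lo, hi uint64) []Entry { return s.ents[lo-1 : hi-1] }
func (s *MemoryStorage) LastIndex() uint64             { return uint64(len(s.ents)) }

// raftLog combines persisted entries with entries not yet written to Storage.
type raftLog struct {
	storage  Storage
	unstable []Entry // entries following storage.LastIndex()
}

// slice returns entries in [lo, hi), reading from storage first and then
// from the unstable list.
func (l *raftLog) slice(lo, hi uint64) []Entry {
	firstUnstable := l.storage.LastIndex() + 1
	var out []Entry
	if lo < firstUnstable {
		stableHi := hi
		if stableHi > firstUnstable {
			stableHi = firstUnstable
		}
		out = append(out, l.storage.Entries(lo, stableHi)...)
	}
	if hi > firstUnstable {
		start := lo
		if start < firstUnstable {
			start = firstUnstable
		}
		out = append(out, l.unstable[start-firstUnstable:hi-firstUnstable]...)
	}
	return out
}

func main() {
	l := &raftLog{
		storage:  &MemoryStorage{ents: []Entry{{Index: 1, Term: 1}, {Index: 2, Term: 1}}},
		unstable: []Entry{{Index: 3, Term: 2}, {Index: 4, Term: 2}},
	}
	for _, e := range l.slice(2, 5) {
		fmt.Println(e.Index, e.Term) // prints "2 1", "3 2", "4 2"
	}
}
```

Because `raftLog` depends only on the interface, an in-memory store could later be swapped for one backed by a write-ahead log without changing `slice`.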
Ben Darnell 21987c8701 raft: remove raftLog.resetUnstable and resetNextEnts
These methods are no longer used outside of tests and are redundant with
the new stableTo and appliedTo methods.
2014-11-06 17:18:00 -05:00
Xiang Li 0d7c43d885 *: add an Advance interface to raft.Node
Node set applied to committed right after sending Ready to the application. This is not
correct, since the application has not actually applied the entries at that point. We add an
Advance interface to Node; the application needs to call Advance to tell the raft Node its progress.
This change also avoids unnecessary copying when the application is still applying entries but
there are more entries to be applied.
2014-11-05 15:04:14 -08:00
Yicheng Qin 421d5fbe72 raft: add tests based on section 5.3 in raft paper 2014-10-31 16:32:34 -07:00
Xiang Li 738da2b3fa raft: fix a incorrect in testMaybeAppend 2014-10-29 14:57:39 -07:00
Xiang Li 74c257f63d Merge pull request #1419 from xiangli-cmu/raft_log_test
raft: add test for findConflict
2014-10-27 14:30:36 -07:00
Xiang Li 460d6490ba raft: address issues in comments 2014-10-27 14:20:42 -07:00
Xiang Li 94f701cf95 raft: refactor isUpToDate and add a test 2014-10-25 20:34:14 -07:00
Xiang Li 8cd95e916d raft: comments for isUpToDate 2014-10-25 20:12:54 -07:00
Xiang Li 86c66cd802 raft: remove unused code 2014-10-25 19:56:13 -07:00
Xiang Li 90f26e4a56 raft: add test for findConflict 2014-10-25 18:58:11 -07:00
Soheil Hassas Yeganeh 09e9618b02 raft: change raftLog.maybeAppend to return the last new index
As per @unihorn's comment on #1366, we change raftLog.maybeAppend to
return the last new index of entries in maybeAppend.
2014-10-23 15:42:47 -04:00
Yicheng Qin e200d2a8e2 etcdserver/raft: remove msgDenied, removedNodes, shouldStop
The future plan is to do all these in etcdserver level.
2014-10-20 15:13:18 -07:00
Jonathan Boulle 7a4d42166b *: add license header to all source files 2014-10-17 15:41:22 -07:00
Jonathan Boulle fc42bdb904 raft: remove unused compactThreshold 2014-10-16 17:11:10 -07:00
Xiang Li af5b8c6c44 raft: int64 -> uint64 2014-10-09 14:26:43 +08:00
Xiang Li 7b61565c0a raft: save removed nodes in snapshot 2014-10-08 15:33:55 +08:00
Xiang Li b3c1bd5616 raft: commitIndex=min(leaderCommit, index of last new entry) 2014-09-29 14:38:17 -07:00
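
The rule named in the commit above can be written as a small helper. The function name `followerCommit` and the use of `uint64` are assumptions made for this sketch.

```go
package main

import "fmt"

// followerCommit returns the commit index a follower adopts after appending
// entries: min(leaderCommit, index of last new entry). A follower cannot
// commit past the entries it has actually received.
func followerCommit(leaderCommit, lastNewIndex uint64) uint64 {
	if leaderCommit < lastNewIndex {
		return leaderCommit
	}
	return lastNewIndex
}

func main() {
	fmt.Println(followerCommit(10, 7)) // prints 7: follower is behind the leader's commit
	fmt.Println(followerCommit(5, 7))  // prints 5: leader has not committed the new entries yet
}
```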
Xiang Li ab61a8aa9a *: init for on disk snap support 2014-09-17 13:56:12 -07:00
Yicheng Qin 140fd6d6c4 raft: restart using last written entry also 2014-09-15 09:56:33 -07:00
Yicheng Qin a9af70c52b raft: write entry 0 into log 2014-09-15 09:55:52 -07:00
Yicheng Qin 072a21782e Merge pull request #1049 from unihorn/120
raftLog: enhance check in compact
2014-09-12 11:35:41 -07:00
Yicheng Qin d31443f5a3 raftLog: compact applied entries only
compact MUST happen only on entries that have been applied; otherwise it may
1. corrupt the log by setting a wrong commitIndex
2. discard unapplied entries
2014-09-12 11:34:08 -07:00
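
A guard of the kind described in the commit above could be sketched as follows. The `miniLog` type, its fields, and the error value are hypothetical; the sketch assumes that `applied` never exceeds the index of the last entry held in the log.

```go
package main

import (
	"errors"
	"fmt"
)

// entry is a hypothetical, minimal log entry.
type entry struct {
	Index uint64
}

// miniLog is a hypothetical log whose first retained entry has index offset.
type miniLog struct {
	ents    []entry
	offset  uint64
	applied uint64
}

var errNotApplied = errors.New("compact: index is beyond the applied index")

// compact discards entries before index i. It refuses to compact past the
// applied index, so unapplied entries are never dropped.
func (l *miniLog) compact(i uint64) error {
	if i > l.applied {
		return errNotApplied
	}
	if i <= l.offset {
		return nil // already compacted up to i
	}
	l.ents = l.ents[i-l.offset:]
	l.offset = i
	return nil
}

func main() {
	l := &miniLog{applied: 3}
	for i := uint64(0); i <= 5; i++ {
		l.ents = append(l.ents, entry{Index: i})
	}

	fmt.Println(l.compact(4) != nil) // prints true: entry 4 has not been applied
	fmt.Println(l.compact(2) != nil) // prints false: entries before 2 are discarded
	fmt.Println(l.offset, len(l.ents)) // prints 2 4
}
```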
Brandon Philips 3bc4b2db12 raft: log comment grammar fix 2014-09-11 13:59:50 -07:00
Xiang Li eaffaacf5e raft: do not need to copy committed entries 2014-09-09 14:09:30 -07:00
Jonathan Boulle 9997c9488a *: fix a few small issues identified by go vet 2014-09-08 23:52:36 -07:00
Blake Mizerany 8d9b7b1680 raft: remove entry type 2014-09-03 15:24:47 -07:00
Blake Mizerany 8463421448 raft: remove configuration 2014-09-03 15:23:05 -07:00
Blake Mizerany e8e588c67b raft: move protobufs into raftpb 2014-09-03 09:20:17 -07:00
Blake Mizerany ddd219f297 many: marshal message 2014-09-03 09:20:16 -07:00
Blake Mizerany 4aa15294a8 raft: re-remove clusterId from raft 2014-09-03 09:20:14 -07:00
Blake Mizerany 134a962222 raft: move raft2 to raft 2014-09-03 09:20:14 -07:00
Blake Mizerany 0453d09af6 raft: moved into new raft 2014-09-03 09:20:11 -07:00
Blake Mizerany f03c3bce05 raft: separate dequeuing from slicing 2014-09-03 09:20:11 -07:00
Yicheng Qin a28dc4559b raft/etcd: recover node 2014-09-03 09:20:10 -07:00
Xiang Li a5df254e53 raft: add clusterId to snapshot 2014-09-03 09:20:08 -07:00
Yicheng Qin ba63cf666d raft: add recover 2014-09-03 09:20:02 -07:00
Xiang Li 6030261363 etcd/raft: add snap 2014-09-03 09:20:02 -07:00
Xiang Li 38ec659cd6 raft: make Entry a protobuf type 2014-09-03 09:20:01 -07:00
Xiang Li 54b4f52e48 raft: add index to entry 2014-09-03 09:20:01 -07:00
Xiang Li b383cd5acf raft: refactor recover 2014-09-03 09:19:59 -07:00
Yicheng Qin e850c644da raft: return offset for unstableEnts 2014-09-03 09:19:58 -07:00
Xiang Li 609e13a240 raft: add node.Unstable
Makes it possible to return all unstable log entries. The application must store unstable
log entries before sending out any messages after calling step.
2014-09-03 09:19:58 -07:00
Xiang Li 1288e1f39d raft: log->raftlog 2014-09-03 09:19:58 -07:00
Xiang Li c7d1beaaa5 raft: add first level logging
We log the message to step and the state of the statemachine before and after
stepping the message.
2014-09-03 09:19:58 -07:00
Xiang Li 2665cc1cc8 raft: heartbeat should not contain entries 2014-09-03 09:19:57 -07:00
Xiang Li 060de128a7 raft: add clusterId 2014-09-03 09:19:56 -07:00
Xiang Li 447d7dc51b raft: fix log append; add tests 2014-09-03 09:19:49 -07:00
Xiang Li 30f4d9faea raft: change index and term to int64 2014-09-03 09:05:14 -07:00
Xiang Li 2a11c1487c raft: sm.compact and sm.restore 2014-09-03 09:05:12 -07:00
Xiang Li 064004b899 raft: add log compact 2014-09-03 09:05:12 -07:00
Xiang Li 6a232dfc13 raft: add offset for log 2014-09-03 09:05:12 -07:00
Xiang Li f387e3e27d raft: add Entry.isConfig 2014-09-03 09:05:11 -07:00
Xiang Li 3817661f82 raft: rename ConfigAdd/ConfigRemove -> AddNode/RemoveNode 2014-09-03 09:05:11 -07:00
Xiang Li 9f315ffe10 raft: make entry type public 2014-09-03 09:05:11 -07:00
Xiang Li 1a75beb57c raft: add confAdd and confRemove entry type 2014-09-03 09:05:09 -07:00
Xiang Li c03fbf68d6 raft: add conf safety
To make configuration changes safe without adding a configuration protocol:

1. We only allow adding or removing one node at a time.

2. We only allow one uncommitted configuration entry in the log.

These two rules ensure there are no disjoint quorums in either the current cluster or the
future clusters (after applying any number of committed or uncommitted entries in the log).

We add a type field to the Entry structure for two reasons:

1. The state machine needs to know if there is a pending configuration change.

2. A configuration entry should be executed by the raft package rather than by the application that is using raft.
2014-09-03 09:05:09 -07:00
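
The second rule in the commit above (at most one uncommitted configuration entry) can be sketched with a typed entry and a check at proposal time. All names here (`entryType`, `stateMachine`, `pendingConf`, `propose`) are hypothetical and chosen for the example. The first rule is reflected only in that each configuration entry adds or removes a single node.

```go
package main

import (
	"errors"
	"fmt"
)

// entryType distinguishes normal entries from configuration entries.
type entryType int

const (
	normal entryType = iota
	addNode
	removeNode
)

// entry is a hypothetical log entry carrying a type field.
type entry struct {
	Type  entryType
	Index uint64
}

// stateMachine is a hypothetical, minimal raft state: a log and a commit index.
type stateMachine struct {
	ents      []entry
	committed uint64
}

var errPendingConf = errors.New("a configuration entry is still uncommitted")

// pendingConf reports whether the log holds an uncommitted configuration entry.
func (sm *stateMachine) pendingConf() bool {
	for _, e := range sm.ents {
		if e.Index > sm.committed && e.Type != normal {
			return true
		}
	}
	return false
}

// propose appends an entry, rejecting a configuration entry while another
// one is still uncommitted.
func (sm *stateMachine) propose(t entryType) error {
	if t != normal && sm.pendingConf() {
		return errPendingConf
	}
	sm.ents = append(sm.ents, entry{Type: t, Index: uint64(len(sm.ents)) + 1})
	return nil
}

func main() {
	sm := &stateMachine{}
	fmt.Println(sm.propose(addNode) == nil)    // prints true
	fmt.Println(sm.propose(removeNode) == nil) // prints false: addNode is uncommitted
	fmt.Println(sm.propose(normal) == nil)     // prints true: normal entries are unaffected

	sm.committed = 1                           // the addNode entry is now committed
	fmt.Println(sm.propose(removeNode) == nil) // prints true
}
```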
Xiang Li 0cdd1b58a4 raft: rename log.commit to log.committed 2014-09-03 09:05:07 -07:00
Xiang Li 9cd3b2153f raft: comment log.nextEnts 2014-09-03 09:05:07 -07:00
Xiang Li a06729a96a raft: use log.lastIndex() 2014-09-03 09:05:07 -07:00
Xiang Li 888ddacd3c raft: remove the init cap of log entries 2014-09-03 09:05:06 -07:00
Xiang Li 2ef9498d6f raft: remove TLA comment 2014-09-03 09:05:06 -07:00
Xiang Li bee9d8bea5 raft: add log.maybeAppend 2014-09-03 09:05:06 -07:00
Xiang Li b70be19653 raft: add log.maybeCommit 2014-09-03 09:05:06 -07:00
Xiang Li 092461d7c8 raft: rename log.len to log.lastIndex 2014-09-03 09:05:06 -07:00
Xiang Li 8f3d109c18 raft: rename log.isOk to log.matchTerm 2014-09-03 09:05:06 -07:00
Xiang Li 4c609ec59c raft: new log struct 2014-09-03 09:05:06 -07:00