Commit Graph

553 Commits (28feb073b5c2cc506b41ad6236f534b8d1a7e13a)

Author SHA1 Message Date
Yicheng Qin dcf34c0ab4 Merge pull request #1938 from yichengq/262
etcdserver: protect the sender map in SendHub
2014-12-15 10:41:52 +08:00
Yicheng Qin ceb077424d etcdserver: protect the sender map in SendHub 2014-12-15 10:37:41 +08:00
Xiang Li d86603840d rafthttp: better logging 2014-12-14 09:50:59 -08:00
Xiang Li 4724cbbe2c etcdserver: one line 2014-12-11 22:17:36 -08:00
Xiang Li 935f7128a9 etcdserver: move stats inferface to stats pkg 2014-12-11 22:14:05 -08:00
Barak Michener 5f16fab541 Merge pull request #1915 from barakmich/1834
Return Unknown instead of NotExist
2014-12-11 13:49:26 -05:00
Barak Michener cf7690cb51 detect more cases of empty directories and actual errors 2014-12-11 13:37:32 -05:00
Xiang Li c26542b7f2 Merge pull request #1913 from xiang90/lazy_snap_dir
etcdserver: create snap dir until start the node
2014-12-11 09:39:51 -08:00
Xiang Li 836ccabad2 etcdserver: create snap dir until start the node 2014-12-11 09:25:18 -08:00
Yicheng Qin 4777cba995 Merge pull request #1898 from robszumski/improve-logging
Improve logging for etcdserver and rafthttp
2014-12-10 10:46:47 -08:00
Nikhil Sarda a852936a59 etcdserver: removed an unhelpful test failure message
this commit changes instances of "blah" in a test to more
descriptive messages
2014-12-09 21:45:50 -08:00
Rob Szumski 13f3158728 etcdserver: improve discovery ignore warning 2014-12-09 15:57:25 -08:00
Xiang Li e4c0f5c1a8 Merge pull request #1895 from xiang90/snap_nodes
etcd: update conf when apply the confChange entry
2014-12-09 11:45:01 -08:00
Xiang Li a5efbf826d raft: drop nodes in softState 2014-12-09 11:43:52 -08:00
Xiang Li 29d7a2a558 etcd: update conf when apply the confChange entry 2014-12-08 23:37:07 -08:00
Yicheng Qin 4804c45e14 raft: set raft.Commit too when setting raftLog.committed 2014-12-08 22:35:55 -08:00
Yicheng Qin 9c8f5c9535 Merge pull request #1891 from yichengq/257
etcdserver: init state before run loop correctly
2014-12-08 16:38:33 -08:00
Yicheng Qin 13814c9d7d etcdserver: init state before run loop correctly 2014-12-08 16:13:16 -08:00
Yicheng Qin 7e06d85651 etcdserver: apply entries when it is not empty
Or it updates appliedi wrongly.
2014-12-08 15:56:38 -08:00
Yicheng Qin 71f3b80fbe etcdserver: check recovery error when new server 2014-12-08 14:55:23 -08:00
Yicheng Qin 8c338ffcc7 etcdserver: correct the log about recovering from snapshot 2014-12-08 14:51:42 -08:00
Yicheng Qin 771ff4589d etcdserver: not add self into sendhub when new server 2014-12-05 00:18:40 -08:00
Yicheng Qin 1d1c2ff834 Merge pull request #1841 from yichengq/246
etcdserver: close storage when stop
2014-12-04 15:36:24 -08:00
Yicheng Qin a7bc03b42b etcdserver: close storage when stop 2014-12-04 15:16:22 -08:00
Xiang Li 88e2fab572 Merge pull request #1859 from xiang90/pause_test
*: add pauseMember test
2014-12-04 15:11:59 -08:00
Veres Lajos 3de2ab2c04 *: typofixes
https://github.com/vlajos/misspell_fixer
2014-12-04 22:51:19 +00:00
Xiang Li 151f043414 *: add pauseMember test 2014-12-04 14:22:43 -08:00
Yicheng Qin fa292391d8 etcdserver: close idle connections when stop sendhub 2014-12-02 00:08:47 -08:00
Xiang Li 7beac083ff Merge pull request #1810 from xiang90/purge
*: support purging old wal/snap files
2014-12-01 12:05:05 -08:00
Xiang Li d3db010190 *: support purging old wal/snap files 2014-12-01 11:50:17 -08:00
Xiang Li bc5acd3c42 etcdserver: log snapshot event 2014-11-30 12:10:20 -08:00
Xiang Li e23f9e76d1 raft: do not applysnapshot in raft 2014-11-26 10:59:13 -08:00
Xiang Li 9df0e7715d raft: do not panic on out of date compaction 2014-11-25 15:14:39 -08:00
Xiang Li 01cbcce8ba etcdserver: do not applySnapshot twice 2014-11-25 14:53:49 -08:00
Xiang Li 74d8c7f457 etcdserver: cleanup main loop 2014-11-25 14:38:18 -08:00
Yicheng Qin 7e6e305c4f Merge branch 'log_interface'
Conflicts:
	raft/raft.go
2014-11-25 14:22:11 -08:00
Yicheng Qin a13d5a70ff etcdserver: save snapshot before entries 2014-11-25 12:39:15 -08:00
Owen Smith c67b937d62 etcdserver: truncate WAL from correct index when forcing new cluster
When loading from a backup with a snapshot and WAL, the length of WAL entries
can be lower than the current index integer value, causing a panic when
slicing off uncommitted entries. This looks for WAL entries higher than
the current index before slicing.
2014-11-25 16:46:56 +00:00
Yicheng Qin 54e1237271 etcdserver: panic when snapshot on raft storage
Snapshot on raft storage should always succeed. If there is an error, it must
be internal fault and needs stack info to debug.
2014-11-24 21:22:49 -08:00
Yicheng Qin 1b038da18a etcdserver: init snapi when init appliedi 2014-11-24 21:19:30 -08:00
Yicheng Qin bd9e93eeea etcdserver: remove finished TODO for raftStorage.Compact 2014-11-24 21:10:53 -08:00
Yicheng Qin 185d37c333 etcdserver: not load dummy entry from the wal 2014-11-24 20:51:04 -08:00
Xiang Li d69e4dbe6d etcdserver: initial index to 1 2014-11-24 14:57:08 -08:00
Xiang Li 453133977d etcdserver: save snapshot only if the index is greater than previous snap index 2014-11-24 14:47:59 -08:00
Xiang Li 4b7af29c37 etcdserver: fix TriggerSnap test.
Sleep for millisecond to allow the server to apply the first nop and
first put separately.
2014-11-24 14:47:49 -08:00
Xiang Li 08f156a1de etcdserver: remove extra empty line in snapshot func 2014-11-24 10:27:18 -08:00
Ben Darnell 0d680d0e6b Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  rafthttp: fix import
  raft: should not decrease match and next when handling out of order msgAppResp
  Fix migration to allow snapshots to have the right IDs
  add snapshotted integration test
  fix test import loop
  fix import loop, add set to types, and fix comments
  etcdserver: autodetect v0.4 WALs and upgrade them to v0.5 automatically
  wal: add a bench for write entry
  rafthttp: add streaming server and client
  dep: use vendored imports in codegangsta/cli
  dep: bump golang.org/x/net/context

Conflicts:
	etcdserver/server.go
	etcdserver/server_test.go
	migrate/snapshot.go
2014-11-21 15:40:11 -05:00
Ben Darnell 30690d15d9 Re-enable a few tests I had missed.
Fix integration test for the change to log entry zero.

Increase test timeouts since integration tests often take
longer than 10s for me.
2014-11-21 15:27:17 -05:00
Brian Waldon c0fb1c8a00 Merge pull request #1755 from bcwaldon/golang.org-deps
Switch to golang.org/x/net/context
2014-11-20 16:26:14 -08:00
Barak Michener 59a0c64e9f fix import loop, add set to types, and fix comments 2014-11-20 15:38:08 -05:00
Barak Michener 78ea3335bf etcdserver: autodetect v0.4 WALs and upgrade them to v0.5 automatically 2014-11-20 15:38:08 -05:00
Yicheng Qin 9d53b94546 rafthttp: add streaming server and client 2014-11-20 11:34:50 -08:00
Brian Waldon 9a728a127a dep: bump golang.org/x/net/context
Move from code.google.com/p/go.net/context to
golang.org/x/net/context before bumping to latest.
2014-11-20 10:19:12 -08:00
Ben Darnell b29240baf0 Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  scripts: build-docker tag and use ENTRYPOINT
  scripts: build-release add etcd-migrate
  create .godir
  raft: optimistically increase the next if the follower is already matched
  raft: add handleHeartbeat handleHeartbeat commits to the commit index in the message. It never decreases the commit index of the raft state machine.
  rafthttp: send takes raft message instead of bytes
  *: add rafthttp pkg into test list
  raft: include commitIndex in heartbeat
  rafthttp: move server stats in raftHandler to etcdserver
  *: etcdhttp.raftHandler -> rafthttp.RaftHandler
  etcdserver: rename sender.go -> sendhub.go
  *: etcdserver.sender -> rafthttp.Sender

Conflicts:
	raft/log.go
	raft/raft_paper_test.go
2014-11-19 17:05:16 -05:00
Ben Darnell 355ee4f393 raft: Integrate snapshots into the raft.Storage interface.
Compaction is now treated as an implementation detail of Storage
implementations; Node.Compact() and related functionality have been
removed. Ready.Snapshot is now used only for incoming snapshots.

A return value has been added to ApplyConfChange to allow applications
to track the node information that must be stored in the snapshot.

raftpb.Snapshot has been split into Snapshot and SnapshotMetadata, to
allow the full snapshot data to be read from disk only when needed.

raft.Storage has new methods Snapshot, ApplySnapshot, HardState, and
SetHardState. The Snapshot and HardState parameters have been removed
from RestartNode() and will now be loaded from Storage instead.
The only remaining difference between StartNode and RestartNode is that
the former bootstraps an initial list of Peers.
2014-11-19 16:40:26 -05:00
Yicheng Qin 1a72143ecb rafthttp: send takes raft message instead of bytes
This gives streaming mechanism the chance to assemble and disassemble
raft messages.
2014-11-17 22:39:53 -08:00
Yicheng Qin a2c568a144 Merge pull request #1669 from yichengq/215
*: add rafthttp as a separate package
2014-11-17 16:14:59 -08:00
Yicheng Qin f24e214ee5 rafthttp: move server stats in raftHandler to etcdserver 2014-11-17 16:02:20 -08:00
Yicheng Qin 5dc5f8145c *: etcdhttp.raftHandler -> rafthttp.RaftHandler 2014-11-17 15:52:24 -08:00
Yicheng Qin 3fcc011717 etcdserver: rename sender.go -> sendhub.go 2014-11-17 15:35:15 -08:00
Yicheng Qin 84fbf7aab5 *: etcdserver.sender -> rafthttp.Sender 2014-11-17 15:35:10 -08:00
Ben Darnell 300c5a2001 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master: (21 commits)
  etcdserver: refactor ValidateClusterAndAssignIDs
  integration: add integration test for remove member
  integration: add test for member restart
  version: bump to alpha.3
  etcdserver: add buffer to the sender queue
  *: gracefully stop etcdserver
  Fix up migration tool, add snapshot migration
  etcd4: migration from v0.4 -> v0.5
  etcdserver: export Member.StoreKey
  etcdserver: recover cluster when receiving newer snapshot
  etcdserver: check and select committed entries to apply
  etcdserver: recover from snapshot before applying requests
  raft: not set applied when restored from snapshot
  sender: support elegant stop
  etcdserver: add StopNotify
  etcdserver: fix TestDoProposalStopped test
  etcdserver: minor cleanup
  etcdserver: validate new node is not registered before in best effort
  etcdserver: fix server.Stop()
  *: print out configuration when necessary
  ...

Conflicts:
	etcdserver/server.go
	etcdserver/server_test.go
	raft/log.go
2014-11-17 18:28:24 -05:00
Xiang Li e04e4632b3 Merge pull request #1736 from xiang90/verify
etcdserver: refactor ValidateClusterAndAssignIDs
2014-11-17 14:34:59 -08:00
Xiang Li 0541f0afa0 etcdserver: refactor ValidateClusterAndAssignIDs 2014-11-17 14:23:37 -08:00
Ben Darnell 64d9bcabf1 Add Storage.Term() method and hide the first entry from other methods.
The first entry in the log is a dummy which is used for matchTerm
but may not have an actual payload. This change permits Storage
implementations to treat this term value specially instead of
storing it as a dummy Entry.

Storage.FirstIndex() no longer includes the term-only entry.

This reverses a recent decision to create entry zero as initially
unstable; Storage implementations are now required to make
Term(0) == 0 and the first unstable entry is now index 1.
stableTo(0) is no longer allowed.
2014-11-17 16:54:12 -05:00
Xiang Li c26de66262 integration: add integration test for remove member 2014-11-17 13:28:09 -08:00
Xiang Li 7c4b84a6cd etcdserver: add buffer to the sender queue 2014-11-14 15:18:16 -08:00
Xiang Li 8bf71d796e *: gracefully stop etcdserver 2014-11-14 14:12:24 -08:00
Barak Michener 192f200d9e Fix up migration tool, add snapshot migration
Fixes all updates since bcwaldon sketched the original, with cleanup and
into an acutal working state. The commit log follows:

fix pb reference and remove unused file post rebase

unbreak the migrate folder

correctly detect node IDs

fix snapshotting

Fix previous broken snapshot

Add raft log entries to the translation; fix test for all timezones. (Still in progress, but passing)

Fix etcd:join and etcd:remove

print more data when dumping the log

Cleanup based on yichengq's comments

more comments

Fix the commited index based on the snapshot, if one exists

detect nodeIDs from snapshot

add initial tool documentation and match the semantics in the build script and main

formalize migration doc

rename function and clarify docs

fix nil pointer

fix the record conversion test

add migration to test suite and fix govet
2014-11-14 16:46:08 -05:00
Brian Waldon 5ea1f2d96f etcd4: migration from v0.4 -> v0.5 2014-11-14 15:57:26 -05:00
Brian Waldon c36abeabd1 etcdserver: export Member.StoreKey 2014-11-14 15:57:26 -05:00
Yicheng Qin b6887e4a83 Merge pull request #1719 from yichengq/228
etcdserver: recover snapshot before applying committed entries
2014-11-14 12:18:41 -08:00
Yicheng Qin 77433ff6da etcdserver: recover cluster when receiving newer snapshot 2014-11-14 12:11:21 -08:00
Yicheng Qin dfaa7290c4 etcdserver: check and select committed entries to apply 2014-11-14 12:11:16 -08:00
Yicheng Qin f6a7f96967 etcdserver: recover from snapshot before applying requests 2014-11-14 12:08:39 -08:00
Yicheng Qin 12dba7d413 sender: support elegant stop 2014-11-13 17:44:36 -08:00
Xiang Li e66bda957b Merge pull request #1714 from xiang90/stop
StopNotify
2014-11-13 15:16:52 -08:00
Xiang Li 6a1fe00615 Merge pull request #1704 from xiang90/print_config
*: print out configuration when necessary
2014-11-13 14:35:50 -08:00
Yicheng Qin 11f392bdc8 Merge pull request #1708 from yichengq/223
etcdserver: validate new node is not registered before in best effort
2014-11-13 14:30:40 -08:00
Xiang Li b5d480f17a etcdserver: add StopNotify 2014-11-13 14:16:48 -08:00
Xiang Li 978d0f1ca1 etcdserver: fix TestDoProposalStopped test
We start etcd server in this test without the cluster. Sometimes it panics when
accessing the cluster. Most of the time it does not panic, since we can stop the
server fast enough before applying the first configuration change entry.
2014-11-13 14:08:59 -08:00
Xiang Li fb344bc33f etcdserver: minor cleanup 2014-11-13 14:01:56 -08:00
Xiang Li 813ff6ba48 Merge pull request #1713 from xiang90/stop
etcdserver: fix server.Stop()
2014-11-13 13:58:07 -08:00
Yicheng Qin ac907d746b etcdserver: validate new node is not registered before in best effort 2014-11-13 13:56:11 -08:00
Xiang Li 30dfdb0ea9 etcdserver: fix server.Stop()
Stop should be idempotent. It should simply send a stop signal to the server.
It is the server's responsibility to stop the go-routines and related components.
2014-11-13 13:47:12 -08:00
Ben Darnell 39eddd8565 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master:
  etcdserver: add sender tests
  raft: Only call stableTo when we have ready entries or a snapshot.
  etcdserver: add ID() function to the Server interface.
  sender: use RoundTripper instead of Client in sender
2014-11-13 15:50:08 -05:00
Yicheng Qin 9716def94b Merge pull request #1700 from yichengq/222
etcdserver: add sender tests
2014-11-13 12:37:29 -08:00
Yicheng Qin d89bf9f215 etcdserver: add sender tests 2014-11-13 12:29:25 -08:00
Xiang Li d6f40acc86 etcdserver: add ID() function to the Server interface. 2014-11-13 11:37:06 -08:00
Ben Darnell b29c512f50 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master: (27 commits)
  pkg/wait: move wait to pkg/wait
  etcdserver: do not add/remove/update local member to/from sender hub
  etcdserver: not record attributes when add member
  raft: add a test for proposeConfChange
  raft: block Stop() on n.done, support idempotency
  raft: add a test for node proposal
  integration: add increase cluster size test
  integration: remove unnecessary t.Testing argument
  raft: stop the node synchronously
  integration: fix test to propagate NewServer errors
  etcdserver: move peer URLs check to config
  etcdserver: ensure initial-advertise-peer-urls match initial-cluster
  raft: add a test for node.Tick
  raft: add comment string for TestNodeStart
  etcdserver: use member instead of node at etcd level
  raft: nodes return sorted ids
  raft: update unstable when calling stableTo with 0
  *: support updating advertise-peer-url Users might want to update the peerurl of the etcd member in several cases. For example, if the IP address of the physical machine etcd running on is changed, user need to update the adversite-pee-rurl accordingly. This commit makes etcd support updating the advertise-peer-url of its members.
  transport: create a tls listener only if the tlsInfo is not empty and the scheme is HTTPS
  etcdserver: use member pointer for all tests
  ...

Conflicts:
	etcdserver/server.go
	raft/log.go
	raft/log_test.go
	raft/node.go
2014-11-13 14:21:09 -05:00
Xiang Li 92096dfdc3 *: print out configuration when necessary 2014-11-13 10:46:42 -08:00
Xiang Li 0d18a0f381 pkg/wait: move wait to pkg/wait 2014-11-13 09:11:53 -08:00
Yicheng Qin 23b5bc0dfe sender: use RoundTripper instead of Client in sender 2014-11-12 21:42:08 -08:00
Yicheng Qin 1e1535e6f9 Merge pull request #1620 from yichengq/204
etcdserver: not record attributes when add member
2014-11-12 21:33:53 -08:00
Xiang Li ba915ad5a8 etcdserver: do not add/remove/update local member to/from sender hub 2014-11-12 20:45:21 -08:00
Yicheng Qin 0c2b45ddc6 etcdserver: not record attributes when add member
There is no need to set attributes value when adding member because new
member will publish the information whenever it starts.
2014-11-12 17:48:15 -08:00
Ben Darnell 54b07d7974 Remove raft.loadEnts and the ents parameter to raft.RestartNode.
The initial entries are now provided via the Storage interface.
2014-11-12 18:31:19 -05:00
Ben Darnell 147fd614ce The initial term=0 log entry is now initially unstable.
This entry is now persisted through the normal flow instead of appearing
in the stored log at creation time.  This is how things worked before
the Storage interface was introduced. (see coreos/etcd#1689)
2014-11-12 18:24:16 -05:00
Jonathan Boulle 1197c1f965 etcdserver: move peer URLs check to config 2014-11-12 13:12:49 -08:00
Jonathan Boulle 3f358b6d5d etcdserver: ensure initial-advertise-peer-urls match initial-cluster
This adds a check to setupCluster to ensure that the list of URLs
specified in `initial-advertise-peer-urls` matches those configured in
`initial-cluster` for this node. Also updates the documentation to
clarify this and address some changes in wording.
2014-11-12 12:54:35 -08:00