Commit Graph

199 Commits (360aafec76b1ad977c294dea18332bbcd58c765c)

Author SHA1 Message Date
Anthony Romano a524d5bdb7 etcdserver: fix race in TestTriggerSnap
Fixes #4584
2016-02-21 22:03:35 -08:00
Caleb Champlin 82778ed478 Add refresh parameter to allow TTL refreshes without firing watch/wait responses 2016-02-08 10:37:37 -07:00
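A minimal Go sketch of driving the new refresh parameter through the v2 keys API; the endpoint, key name, and TTL are illustrative assumptions, not part of the commit:

```
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// refreshTTL renews a key's TTL without changing its value, so watch/wait
// responses are not fired. Endpoint and key are illustrative.
func refreshTTL(endpoint, key string, ttl int) error {
	v := url.Values{}
	v.Set("ttl", fmt.Sprint(ttl))
	v.Set("refresh", "true")   // refresh only; do not notify watchers
	v.Set("prevExist", "true") // the key must already exist
	req, err := http.NewRequest(http.MethodPut,
		endpoint+"/v2/keys/"+key, strings.NewReader(v.Encode()))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("refresh failed: %s", resp.Status)
	}
	return nil
}

func main() {
	if err := refreshTTL("http://127.0.0.1:2379", "foo", 30); err != nil {
		fmt.Println(err)
	}
}
```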
Anthony Romano 082a6c304e etcdserver/test: use recorderstream in TestApplyRepeat
The test was racing when waiting for the node commit.

fixes #4333
2016-01-28 17:19:06 -08:00
Anthony Romano 64596f0c49 etcdserver/test: synchronously wait on TestApplySnapshotAndCommittedEntries
Replaces the RecorderBuffered with a RecorderStream so Wait will block
waiting for updates to the etcdserver store.

Fixes #4296
2016-01-26 21:03:03 -08:00
Anthony Romano bd02d668c8 etcdserver: don't try to apply empty message list
If all messages have been applied, don't apply an empty message list;
otherwise appliedi will be updated to 0 and etcd will panic.

Fixes #4278
2016-01-26 11:56:37 -08:00
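A minimal sketch of the guard described in the commit above; the types and helper are stand-ins, not the actual etcdserver code:

```
package sketch

// Entry stands in for raftpb.Entry in this sketch.
type Entry struct{ Index uint64 }

func applyEntry(e Entry) {} // stand-in for the real apply logic

// apply returns the new applied index. Returning early on an empty list
// avoids clobbering appliedi with 0, which would make etcd panic.
func apply(ents []Entry, appliedi uint64) uint64 {
	if len(ents) == 0 {
		return appliedi
	}
	for _, e := range ents {
		applyEntry(e)
	}
	return ents[len(ents)-1].Index
}
```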
Xiang Li f5753f2f51 *: support lease Attach
Now we can attach keys to leases, and revoking a lease removes all the
keys attached to it.
2016-01-09 11:01:58 -08:00
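A sketch of the attach/revoke bookkeeping the message describes; the type and method names are illustrative, not the actual lease package API:

```
package sketch

// LeaseID identifies a lease; plain strings stand in for storage keys.
type LeaseID int64

type lessor struct {
	leases map[LeaseID]map[string]struct{} // lease -> attached keys
}

// Attach associates a key with a lease.
func (le *lessor) Attach(id LeaseID, key string) {
	if le.leases[id] == nil {
		le.leases[id] = map[string]struct{}{}
	}
	le.leases[id][key] = struct{}{}
}

// Revoke deletes the lease and removes every key attached to it.
func (le *lessor) Revoke(id LeaseID, deleteKey func(string)) {
	for k := range le.leases[id] {
		deleteKey(k) // remove the attached key from the KV
	}
	delete(le.leases, id)
}
```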
Xiang Li 1714290f4e storage: support recovering from backend
We want the KV to support recovering from the backend to avoid an
additional pointer swap. Otherwise we would have to coordinate between
etcdserver and the API layer, since the API layer might hold the kv
pointer and use a closed kv.
2016-01-06 21:16:55 -08:00
Xiang Li 5dd3f91903 *: make backend outside kv
KV and lease will share the same backend, so the backend needs to be
created outside of KV.
2016-01-05 19:55:29 -08:00
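A sketch of the resulting wiring, with all constructor names assumed for illustration: the backend is created first and handed to both KV and lease.

```
package main

// Backend stands in for the real storage backend shared by KV and lease.
type Backend struct{ path string }

func NewBackend(path string) *Backend { return &Backend{path: path} }

type KV struct{ b *Backend }
type Lessor struct{ b *Backend }

func NewKV(b *Backend) *KV         { return &KV{b: b} }
func NewLessor(b *Backend) *Lessor { return &Lessor{b: b} }

func main() {
	b := NewBackend("db") // created outside KV
	_ = NewKV(b)          // KV layers on the shared backend
	_ = NewLessor(b)      // lease bookkeeping persists to the same backend
}
```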
Anthony Romano 838328b057 etcdserver: fix racy WaitSchedule() tests to wait for recorder actions
Fixes #4119
2016-01-05 09:39:18 -08:00
Anthony Romano 384cc76299 pkg/testutil: make Recorder an interface
Provides two implementations of Recorder: one that is non-blocking
like the original version, and one that provides a blocking channel
to avoid busy waiting or racing in tests when no other synchronization
is available.
2016-01-05 09:39:18 -08:00
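A sketch of the two flavors; the real pkg/testutil types differ in detail, and locking is omitted for brevity:

```
package sketch

// Action is a recorded call made by the system under test.
type Action struct{ Name string }

// Recorder records actions and exposes them to tests.
type Recorder interface {
	Record(a Action)
	Action() []Action
}

// recorderBuffered is non-blocking, like the original implementation
// (locking omitted in this sketch).
type recorderBuffered struct{ actions []Action }

func (r *recorderBuffered) Record(a Action)  { r.actions = append(r.actions, a) }
func (r *recorderBuffered) Action() []Action { return r.actions }

// recorderStream hands actions over a channel so a test can block in Wait
// instead of busy-waiting or racing on shared state.
type recorderStream struct{ ch chan Action }

func newRecorderStream() *recorderStream {
	return &recorderStream{ch: make(chan Action, 64)}
}

func (r *recorderStream) Record(a Action)  { r.ch <- a }
func (r *recorderStream) Action() []Action { return nil } // omitted in sketch

// Wait blocks until n actions have been recorded.
func (r *recorderStream) Wait(n int) []Action {
	out := make([]Action, 0, n)
	for len(out) < n {
		out = append(out, <-r.ch)
	}
	return out
}
```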
Anthony Romano e1bf726bc1 *: split out etcdserver's test mockup objects to live in interfaces' packages 2016-01-05 09:39:13 -08:00
Anthony Romano 4cd86ae1ef etcdserver: serialize snapshot merger with applier
Avoids inconsistent snapshotting by only attempting to
create a snapshot after an apply completes.

Fixes #4061
2015-12-29 18:38:39 -08:00
Anthony Romano d7ad721ede etcdserver: stop if removed along with multiple conf changes
shouldstop would get clobbered when several conf changes arrive in a single apply
2015-12-23 16:29:21 -08:00
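A minimal sketch of the fix: accumulate the stop signal across all conf changes in one apply instead of letting a later change overwrite it (names illustrative):

```
package sketch

// ConfChange stands in for raftpb.ConfChange in this sketch.
type ConfChange struct{ RemovesSelf bool }

// applyConfChange reports whether the change removed the local member.
func applyConfChange(cc ConfChange) bool { return cc.RemovesSelf }

// applyAll must not let a later conf change clobber an earlier "stop" result.
func applyAll(ccs []ConfChange) (shouldstop bool) {
	for _, cc := range ccs {
		removedSelf := applyConfChange(cc)
		shouldstop = shouldstop || removedSelf // accumulate, don't overwrite
	}
	return shouldstop
}
```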
Xiang Li 23bd60ccce *: rewrite snapshot sending 2015-12-08 18:21:21 -08:00
Xiang Li 0708a5e50d etcdserver: refactor a for loop in recvSnap test 2015-12-02 15:41:03 -08:00
Xiang Li 3ec3ffbef0 etcdserver: get rid of unreliable WaitSchedule
In this case, we know we are waiting for an action to happen on
storage, so we can busy-wait instead of calling WaitSchedule.

The test previously failed on CI with no observed actions.
2015-12-02 13:18:11 -08:00
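A sketch of the busy wait the message describes, assuming a Recorder-style counter is available to poll:

```
package sketch

import (
	"testing"
	"time"
)

// waitForActions polls until the recorder has seen at least n actions,
// instead of hoping that a WaitSchedule-style yield lasted long enough.
func waitForActions(t *testing.T, count func() int, n int) {
	t.Helper()
	deadline := time.Now().Add(time.Second)
	for count() < n {
		if time.Now().After(deadline) {
			t.Fatalf("timed out waiting for %d actions", n)
		}
		time.Sleep(time.Millisecond)
	}
}
```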
Yicheng Qin 7d757bbc8a etcdserver: extend wait timeout in TestPublishRetry
It fixes the failure in Semaphore CI:
```
--- FAIL: TestPublishRetry (0.00s)
		server_test.go:1108: len(action) = 1, want >= 2
```
2015-10-28 12:07:00 -07:00
Yicheng Qin cacc0d6432 etcdserver: restore KV snapshot when receiving snapshot
When a slow follower receives the snapshot sent from the leader, it
should rename the snapshot file to the default KV file path and
restore the KV snapshot.

This has been tested manually and works well.
2015-10-23 08:43:26 -07:00
Yicheng Qin ab5df57ecf etcdserver: fix raft state machine may block
When the snapshot store requests a raft snapshot from the etcdserver apply
loop, it may block on the channel for some time, or wait for the KV to
snapshot. This is unexpected because the raft state machine should never be blocked.

Even worse, this block may lead to deadlock:
1. raft state machine waits on getting a snapshot from raft memory storage
2. raft memory storage waits on the snapshot store to get the snapshot
3. snapshot store requests a raft snapshot from the apply loop
4. apply loop is applying entries, and waits for the raftNode loop to finish
sending messages
5. raftNode loop waits for the peer loop in Transport to send out messages
6. peer loop in Transport waits for the raft state machine to process messages

Fix it by making getSnap create the snapshot asynchronously.
2015-10-20 09:19:34 -07:00
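A sketch of breaking the cycle: getSnap no longer blocks the raft state machine; it files an asynchronous creation request and returns immediately (all names illustrative):

```
package sketch

import "sync"

// Snapshot stands in for a raft snapshot in this sketch.
type Snapshot struct{ Index uint64 }

type snapshotStore struct {
	mu   sync.Mutex
	snap *Snapshot          // latest created snapshot, if any
	reqC chan chan struct{} // nudges the apply loop to create one
}

func newSnapshotStore() *snapshotStore {
	return &snapshotStore{reqC: make(chan chan struct{}, 1)}
}

// getSnap never blocks: if no snapshot is ready, it requests one
// asynchronously and reports unavailability so the caller retries later.
func (s *snapshotStore) getSnap() (*Snapshot, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.snap != nil {
		return s.snap, true
	}
	select {
	case s.reqC <- make(chan struct{}): // ask the apply loop for a snapshot
	default: // a request is already pending; don't block
	}
	return nil, false
}
```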
Yicheng Qin 1f21ccf166 rafthttp: support sending v3 snapshot message
Use snapshotSender to send the v3 snapshot message. It puts the raft
snapshot message and the v3 snapshot into the request body, then sends
it to the target peer. When it receives http.StatusNoContent, it knows
the message has been received and processed successfully.

On the receiving side, snapHandler saves the v3 snapshot, then processes
the raft snapshot message, and responds with http.StatusNoContent.
2015-10-13 23:11:28 -07:00
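A sketch of the sender side of this contract; the URL path, body framing, and helper names are assumptions:

```
package sketch

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// sendV3Snapshot posts the raft snapshot message and the v3 snapshot data
// as one request body. http.StatusNoContent from the peer means the
// message was received and processed successfully.
func sendV3Snapshot(peerURL string, msg, snapData []byte) error {
	body := io.MultiReader(bytes.NewReader(msg), bytes.NewReader(snapData))
	req, err := http.NewRequest(http.MethodPost, peerURL+"/raft/snapshot", body)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusNoContent {
		return fmt.Errorf("snapshot rejected: %s", resp.Status)
	}
	return nil
}
```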
Yicheng Qin 207c92b627 rafthttp: build transport inside pkg instead of passed-in
rafthttp has different requirements for the connections its transport
creates for different usages, and this is hard to achieve with a single
passed-in http.RoundTripper. Pass the data needed to build transports
into the package instead, and let rafthttp build its own transports.
2015-10-11 21:42:37 -07:00
Yicheng Qin 233e717e2f rafthttp: expose struct to set configuration
The transport constructor takes too many arguments, which makes the New
function unreadable. Change the approach: set fields on the transport struct directly.
2015-10-11 09:02:16 -07:00
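A sketch of the construction style this commit moves to; the field names are illustrative:

```
package sketch

import "time"

// Transport is configured through exported fields rather than a long
// positional argument list to a constructor.
type Transport struct {
	DialTimeout time.Duration
	ID          uint64
	ClusterID   uint64
	// further knobs become self-describing fields, not arguments
}

func (t *Transport) Start() error { return nil } // sketch only

func example() error {
	t := &Transport{
		DialTimeout: time.Second,
		ID:          1,
		ClusterID:   0x1000,
	}
	return t.Start()
}
```

Struct-based configuration keeps call sites readable and lets new options be added without breaking every caller's signature.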
Yicheng Qin f74ff9b867 Merge pull request #3644 from mitake/test-race
etcdserver, test: don't access testing.T in time.AfterFunc()'s own go…
2015-10-07 08:34:58 -07:00
Hitoshi Mitake 68dd3ee621 etcdserver, test: don't access testing.T in time.AfterFunc()'s own goroutine
time.AfterFunc() creates its own goroutine and calls the callback
function in that goroutine. This can cause a data race like the problem
fixed in commit de1a16e0f1. This commit also fixes the potential
data races of the tests in etcdserver/server_test.go.
2015-10-06 11:37:08 +09:00
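A sketch of the race-free pattern: the timer path only signals, and testing.T is touched solely from the test goroutine:

```
package sketch

import (
	"testing"
	"time"
)

// Calling t.Fatal inside time.AfterFunc's callback would run on the
// timer's own goroutine and race with the test goroutine. Instead, let
// timeouts surface through channels and fail from the test goroutine.
func TestNoRace(t *testing.T) {
	done := make(chan struct{})
	go func() {
		// ... the work under test ...
		close(done)
	}()

	select {
	case <-done:
		// success
	case <-time.After(time.Second):
		t.Fatal("timed out") // safe: this is the test goroutine
	}
}
```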
Yicheng Qin bfe9502f4f etcdserver: support creating raft snapshot in apply loop
snapStore can trigger the apply loop to create the latest raft snapshot.
2015-10-02 13:17:56 -07:00
Yicheng Qin 2276328720 etcdserver: add snapshotStore and raftStorage
snapshotStore is the store for snapshots; it supports getting the latest
snapshot and saving incoming snapshots.

raftStorage supports getting the latest snapshot when v3demo is enabled.
2015-10-01 19:00:59 -07:00
Hitoshi Mitake f8859a980d etcdserver: forbid removing started member if quorum cannot be preserved in strict reconfig mode
Like commit 6974fc63ed, this commit makes etcdserver forbid
removing a started member if quorum cannot be preserved after the
reconfiguration, when the option -strict-reconfig-check is passed to
etcd. Such a removal can cause deadlock if unstarted members have wrong
peer URLs.
2015-09-18 10:09:57 +09:00
Hitoshi Mitake 6974fc63ed etcdserver: avoid deadlock caused by adding members with wrong peer URLs
The current membership-change functionality of etcd has a
problem that can cause deadlock.

How to produce:
 1. construct an N-node cluster
 2. add N new nodes with etcdctl member add, without starting the new members

What happens:
After adding the N nodes, the total size of the cluster becomes 2 *
N and its quorum becomes N + 1. This means a membership change
requires at least N + 1 nodes, because Raft treats membership
information in its log like other ordinary log append requests.

Assume the peer URLs of the added nodes are wrong because of
misoperation or bugs in a wrapper program that launches etcd. In such a
case, both adding and removing members become impossible because
quorum cannot be reached, and of course ordinary requests cannot be
served. The cluster is effectively deadlocked.

Of course, the best practice for adding new nodes is to add them one at
a time, letting each node start before adding the next. However, the
effect of this problem is serious enough that preventing it forcibly is
worthwhile.

Solution:
This patch makes etcd forbid adding a new node if the operation changes
the quorum and the new quorum is larger than the number of running
nodes. The checking logic is activated if etcd is launched with the
newly added option -strict-reconfig-check; if the option isn't passed,
the default reconfiguration behavior is kept.

Fixes https://github.com/coreos/etcd/issues/3477
2015-09-13 09:31:53 +09:00
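A worked sketch of the quorum arithmetic behind the check; the predicate and names are illustrative, not the exact patch:

```
package sketch

// canAddMember reports whether adding one member keeps the cluster
// operable: the quorum of the new configuration must still be reachable
// by the members that are actually started.
func canAddMember(started, total int) bool {
	newTotal := total + 1
	newQuorum := newTotal/2 + 1
	// the new member is not started yet, so only `started` nodes can vote
	return started >= newQuorum
}
```

For example, a 3-node cluster with all members started can absorb two unstarted additions (quorum 3 of 5), but adding a third (quorum 4 of 6) would exceed the 3 running nodes, so the check rejects it.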
Yicheng Qin 8f6bf029f8 etcdserver: specify request timeout error due to connection lost
It identifies the request timeout error that is possibly caused by a
lost connection, and prints a better log for the user to understand.

It handles two cases:
1. the leader cannot connect to a majority of the cluster.
2. the connection between a follower and the leader goes down for a while,
and proposals are lost.

log format:
```
20:04:19 etcd3 | 2015-08-25 20:04:19.368126 E | etcdhttp: etcdserver:
request timed out, possibly due to connection lost
20:04:19 etcd3 | 2015-08-25 20:04:19.368227 E | etcdhttp: etcdserver:
request timed out, possibly due to connection lost
```
2015-08-26 12:38:37 -07:00
Xiang Li 6b23a8131f *: test gofmt with -s and fix reported issues 2015-08-21 18:52:16 -07:00
Yicheng Qin 0fdb77aea2 etcdserver: go back to marshal request in 2.1 way
It fixes the problem that a 2.1 cluster cannot be rolling-upgraded to 2.2
smoothly, because 2.1 cannot understand the bytes marshalled by 2.2.
2015-08-13 13:41:52 -07:00
Yicheng Qin 5a91937367 etcdserver: adjust commit timeout based on config
It uses the heartbeat interval and election timeout to estimate the
commit timeout for internal requests.

This PR helps etcd survive in high round-trip-time environments,
e.g., a globally deployed cluster.
2015-08-11 21:09:03 -07:00
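A sketch of the estimate; the fixed 5s allowance and the two-elections term follow etcd's documented heuristic, but treat the exact numbers as assumptions:

```
package sketch

import "time"

// reqTimeout estimates how long an internal proposal may take to commit:
// a fixed allowance for queueing, computation, and disk I/O, plus up to
// two possible leader elections in a high round-trip-time deployment.
func reqTimeout(tickMs uint, electionTicks int) time.Duration {
	electionTimeout := time.Duration(electionTicks) *
		time.Duration(tickMs) * time.Millisecond
	return 5*time.Second + 2*electionTimeout
}
```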
Xiang Li 845c51fedd *: fix typos vaild->valid 2015-08-07 10:57:11 -07:00
Xiang Li 58503817ec etcdserver: internal request union 2015-08-05 07:47:10 -07:00
Yicheng Qin 5d131acfba etcdserver: fix TestTriggerSnap
Before checking, the test needs to wait for the snapshot goroutine to
finish its work.
2015-06-25 09:58:36 -07:00
Yicheng Qin 1af2b4cad7 rafthttp: fix TestUpdateMember
Before this PR, it could fail like this:

```
--- FAIL: TestUpdateMember-2 (0.00s)
		server_test.go:950: action =
		[{ApplyConfChange:ConfChangeUpdateNode []}
{ProposeConfChange:ConfChangeUpdateNode []}], want
[{ProposeConfChange:ConfChangeUpdateNode []}
{ApplyConfChange:ConfChangeUpdateNode []}]
```

This fixes the test by recording the proposal event in time.
2015-06-11 09:45:34 -07:00
Yicheng Qin 4e79abcfeb Merge pull request #2944 from yichengq/fix-2procs
pkg/testutil: ForceGosched -> WaitSchedule
2015-06-10 14:44:32 -07:00
Yicheng Qin 018fb8e6d9 pkg/testutil: ForceGosched -> WaitSchedule
ForceGosched() performs badly when GOMAXPROCS>1. When GOMAXPROCS=1, it
could guarantee that other goroutines ran long enough,
because it always yielded the processor to them. But it cannot
yield the processor to a goroutine running on another processor. So when
GOMAXPROCS>1, the yield may finish while the goroutine on the other
processor has only run for a short time.

Here is a test to confirm the case:

```
package main

import (
	"fmt"
	"runtime"
	"testing"
)

func ForceGosched() {
	// hopefully enough iterations for up to 10 goroutines to get scheduled.
	for i := 0; i < 10000; i++ {
		runtime.Gosched()
	}
}

var d int

func loop(c chan struct{}) {
	for {
		select {
		case <-c:
			for i := 0; i < 1000; i++ {
				fmt.Sprintf("come to time %d", i)
			}
			d++
		}
	}
}

func TestLoop(t *testing.T) {
	c := make(chan struct{}, 1)
	go loop(c)
	c <- struct{}{}
	ForceGosched()
	if d != 1 {
		t.Fatal("d is not incremented")
	}
}
```

`go test -v -race` runs well, but `GOMAXPROCS=2 go test -v -race` fails.

Change the functionality to wait for scheduling to happen.
2015-06-10 14:37:41 -07:00
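A minimal sketch of the replacement, assuming a short sleep is enough to let goroutines on any processor run:

```
package sketch

import "time"

// WaitSchedule briefly sleeps so that goroutines on every processor get a
// chance to run; unlike runtime.Gosched, it does not rely on GOMAXPROCS=1.
func WaitSchedule() {
	time.Sleep(10 * time.Millisecond)
}
```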
Xiang Li e0f9796653 etcdserver: use leveled logging
Leveled logging for etcdserver pkg.
2015-06-09 13:53:07 -07:00
Yicheng Qin a6a649f1c3 etcdserver: stop exposing Cluster struct
After this PR, only the cluster's Cluster interface is exposed, which
makes the code much cleaner and prevents external packages from relying
on the cluster struct in the future.
2015-05-13 10:01:25 -07:00
Xiang Li e866314b94 etcdserver: support update cluster version through raft
1. Persist the cluster version change through raft. When the member is restarted, it can recover
the previously decided cluster version.

2. When there is a new leader, it is forced to do version checking immediately. This helps
update the initial cluster version quickly.
2015-05-12 11:44:34 -07:00
Yicheng Qin 9f19b5660f rafthttp: add AddRemote
Add remotes to rafthttp, which help newly joined members catch up with the
progress of the cluster. A remote supports basic message sending and, for
simplicity, has no stream connection. Remotes are no longer used
after the latest peers have been added into rafthttp.
2015-04-24 11:49:23 -07:00
Yicheng Qin 88224f6f4e Revert "etcdserver: not apply stale conf change in cluster and transport"
This reverts commit 40197f0698.
2015-04-19 11:08:03 -07:00
Xiang Li 98f8dfbc9d etcdserver: prevExist=true + condition is compareAndSwap
PrevExist indicates the key should exist. A condition compares against
an existing key. So PrevExist + condition = CompareAndSwap, not Update.
2015-04-14 23:44:06 -07:00
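A sketch of the classification rule on a simplified request shape:

```
package sketch

// v2Request carries the fields relevant to classifying a PUT.
type v2Request struct {
	PrevExist *bool  // nil means unspecified
	PrevValue string // a condition: compare with the existing value
	PrevIndex uint64 // a condition: compare with the existing index
}

// isCompareAndSwap: prevExist=true says the key must exist, and a
// condition compares against that existing key, so together they form a
// CompareAndSwap rather than a plain Update.
func isCompareAndSwap(r v2Request) bool {
	hasCondition := r.PrevValue != "" || r.PrevIndex != 0
	return r.PrevExist != nil && *r.PrevExist && hasCondition
}
```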
Alex Crawford d9ad6aa2a9 *: update to use IANA-assigned ports 2015-04-06 13:49:43 -07:00
Yicheng Qin 40197f0698 etcdserver: not apply stale conf change in cluster and transport 2015-03-27 12:53:34 -07:00
Xiang Li d015610da5 etcdserver: separate apply and raft routine 2015-03-10 13:34:24 -07:00
Xiang Li a4dab7ad75 *: do not block etcdserver when encoding store into json
Encoding the store into a JSON snapshot has a fairly high CPU cost, and
it blocks for a while. This commit makes the encoding process non-
blocking by running it in another goroutine.
2015-02-28 11:41:58 -08:00
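A sketch of the pattern: take a cheap clone synchronously, then run the expensive JSON encoding in its own goroutine (the Clone method and store shape are assumptions):

```
package sketch

import "encoding/json"

// Store stands in for the v2 store in this sketch.
type Store struct{ data map[string]string }

// Clone takes a cheap copy so encoding can proceed without blocking the
// store against concurrent writers.
func (s *Store) Clone() *Store {
	c := &Store{data: make(map[string]string, len(s.data))}
	for k, v := range s.data {
		c.data[k] = v
	}
	return c
}

// snapshot encodes a clone in another goroutine so the serving path is
// never blocked by the expensive JSON marshalling.
func snapshot(s *Store, out chan<- []byte) {
	clone := s.Clone() // fast, done synchronously
	go func() {
		b, err := json.Marshal(clone.data)
		if err != nil {
			return // sketch: real code would surface the error
		}
		out <- b
	}()
}
```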
Xiang Li 9b4d52ee73 raft: do not resend snapshot if not necessary
raft relies on the link layer to report the status of a sent snapshot.
If the snapshot is still being sent, replication to that remote peer is
paused. If the snapshot finishes sending, replication resumes
optimistically after an election timeout. If the snapshot fails, raft will
try to resend it.
2015-02-28 11:41:58 -08:00
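A sketch of the link-layer report this relies on; ReportSnapshot and its status values are the raft package's API, while the wrapper is illustrative:

```
package sketch

import "github.com/coreos/etcd/raft"

// reportSent tells raft how the snapshot send went, so raft can resume
// replication optimistically on success or resend on failure.
func reportSent(n raft.Node, to uint64, err error) {
	if err != nil {
		n.ReportSnapshot(to, raft.SnapshotFailure)
		return
	}
	n.ReportSnapshot(to, raft.SnapshotFinish)
}
```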
Xiang Li 86429264fb wal: support auto-cut in wal
WAL should control the cut logic itself. We want to use fallocate to
preallocate the space for a segmented WAL file at the beginning,
and cut it when its size reaches the limit.
2015-02-28 11:18:59 -08:00
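A minimal sketch of the cut policy; the segment size and helper names are illustrative:

```
package sketch

import (
	"io"
	"os"
)

const segmentSizeBytes = 64 * 1024 * 1024 // illustrative 64 MB limit

// saveAndMaybeCut appends a record and rolls over to a new preallocated
// segment file once the current one reaches the size limit.
func saveAndMaybeCut(f *os.File, rec []byte, cut func() error) error {
	if _, err := f.Write(rec); err != nil {
		return err
	}
	off, err := f.Seek(0, io.SeekCurrent)
	if err != nil {
		return err
	}
	if off >= segmentSizeBytes {
		return cut() // start a new preallocated segment
	}
	return nil
}
```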