Commit Graph

1062 Commits (030865abe3327815db0586e03198dc5782c1f3ed)

Author SHA1 Message Date
Yicheng Qin 92cd24d5bd *: fix govet shadow check failure 2015-08-27 14:15:30 -07:00
Yicheng Qin 8f6bf029f8 etcdserver: specify request timeout error due to connection lost
It specifies request timeout error possibly caused by connection lost,
and print out better log for user to understand.

It handles two cases:
1. the leader cannot connect to majority of cluster.
2. the connection between follower and leader is down for a while,
and it losts proposals.

log format:
```
20:04:19 etcd3 | 2015-08-25 20:04:19.368126 E | etcdhttp: etcdserver:
request timed out, possibly due to connection lost
20:04:19 etcd3 | 2015-08-25 20:04:19.368227 E | etcdhttp: etcdserver:
request timed out, possibly due to connection lost
```
2015-08-26 12:38:37 -07:00
Mohammad Samman e2e002f94e etcdserver: handle malformed basic auth
return insufficient credentials if basic auth header is malformed

Fixes #3280
2015-08-25 12:37:24 -07:00
Xiang Li e3ef1d363a Merge pull request #3366 from xiang90/v3_proto
update v3 proto and doc
2015-08-24 11:22:29 -07:00
Xiang Li 1cccbb5ebd etcdserverpb: add comments for compaction 2015-08-24 10:52:54 -07:00
Xiang Li 4a5b94478e etcdserverpb: update comment for txn request 2015-08-24 10:40:05 -07:00
Xiang Li 98ceb3cdbd etcdserverpb: add more field into rangeResponse 2015-08-24 10:33:20 -07:00
Cong Ding c09b667d57 *: fix go vet reported issues 2015-08-22 12:19:02 -05:00
Xiang Li 044b23c3ca Merge pull request #3356 from xiang90/travis
*: test gofmt with -s and fix reported issues
2015-08-21 18:59:51 -07:00
Xiang Li 6b23a8131f *: test gofmt with -s and fix reported issues 2015-08-21 18:52:16 -07:00
Yicheng Qin 8c0610d4f5 Merge pull request #3352 from yichengq/fix-name-url
fix that etcd fails to start if using both IP and hostname when discovery srv
2015-08-21 12:38:38 -07:00
Yicheng Qin 72462a72fb etcdserver: remove TODO to delete URLStringsEqual
Discovery SRV supports to compare IP addresses with domain names,
so we need URLStringsEqual function.
2015-08-21 09:52:17 -07:00
Yicheng Qin 8ea3d157c5 Revert "Revert "Treat URLs have same IP address as same""
This reverts commit 3153e635d5.

Conflicts:
	etcdserver/config.go
2015-08-21 09:41:13 -07:00
Xiang Li 11a689d063 etcdserver/auth: cache auth enable result 2015-08-20 23:05:00 -07:00
Yicheng Qin bcb4d5d53e Merge pull request #3311 from yichengq/request-timeout
extend hardcoded timeout for globally-deployed etcd cluster
2015-08-17 17:00:24 -07:00
Yicheng Qin 1375ef8985 etcdserver: remove getVersion timeout
The request can still time out because we have set dial timeout and
read/write timeout. It increases timeout expectation from 1s to 5s,
but it makes it workable in globally-deployer cluster.
2015-08-17 16:50:40 -07:00
Xiang Li d487cf6b63 etcdhttp:write etcderror for all errors in keyhandler 2015-08-17 15:51:29 -07:00
Yicheng Qin c530385d6d Merge pull request #3313 from yichengq/internal-timeout
etcdserver: use ReqTimeout only
2015-08-17 15:05:46 -07:00
Xiang Li af6d1d3d95 Merge pull request #3310 from xiang90/http_err
*: key handler should write auth error as etcd error
2015-08-17 14:57:19 -07:00
Yicheng Qin 2d5b95c49f etcdserver: use ReqTimeout only
We cannot refer RTT value from heartbeat interval, so CommitTimeout
is invalid. Remove it and use ReqTimeout instead.
2015-08-17 14:54:25 -07:00
Xiang Li 87f061bab2 *: key handler should write auth error as etcd error 2015-08-17 14:45:45 -07:00
Xiang Li 15e03d801f etcdserver: add version enforcement when setting cluster version 2015-08-17 11:12:39 -07:00
Xiang Li f199a484af *: only print out major.minor version for cluster version 2015-08-15 08:30:06 -07:00
Xiang Li bbcb38189c Merge pull request #3302 from xiang90/v
etcdserver: better version detection log output
2015-08-14 16:14:55 -07:00
Xiang Li 0076ab154b etcdserver: better version detection log output
Fix https://github.com/coreos/etcd/issues/3288
2015-08-14 16:08:33 -07:00
Xiang Li dd56b7e05e Merge pull request #3299 from xiang90/txn
initial support for txn
2015-08-14 16:05:16 -07:00
Xiang Li 9233fff48f etcdserver: support txn 2015-08-14 11:45:31 -07:00
Xiang Li 46865fa5a5 etcdserverpb: update proto 2015-08-14 11:45:07 -07:00
Yicheng Qin c229e6e655 etcdserver: improve error message when timeout due to leader fail 2015-08-13 15:46:21 -07:00
Yicheng Qin ceb27b1c48 etcdhttp: add auth capability in 2.2 2015-08-13 14:49:10 -07:00
Yicheng Qin 0fdb77aea2 etcdserver: go back to marshal request in 2.1 way
It fixes the problem that 2.1 cannot roll upgrade to 2.2 smoothly
because 2.1 cannot understand the bytes marshalled at 2.2.
2015-08-13 13:41:52 -07:00
Yicheng Qin 27170e67b9 etcdserver: specify timeout caused by leader election
Before this PR, the timeout caused by leader election returns:

```
14:45:37 etcd2 | 2015-08-12 14:45:37.786349 E | etcdhttp: got unexpected
response error (etcdserver: request timed out)
```

After this PR:

```
15:52:54 etcd1 | 2015-08-12 15:52:54.389523 E | etcdhttp: etcdserver:
request timed out, possibly due to leader down
```
2015-08-12 16:53:18 -07:00
Yicheng Qin c3d4d11402 etcdhttp: adjust request timeout based on config
It uses heartbeat interval and election timeout to estimate the
expected request timeout.

This PR helps etcd survive under high roundtrip-time environment,
e.g., globally-deployed cluster.
2015-08-12 09:22:59 -07:00
Yicheng Qin 5a91937367 etcdserver: adjust commit timeout based on config
It uses heartbeat interval and election timeout to estimate the
commit timeout for internal requests.

This PR helps etcd survive under high roundtrip-time environment,
e.g., globally-deployed cluster.
2015-08-11 21:09:03 -07:00
Xiang Li a718329ad3 Merge pull request #3248 from xiang90/v3
initial v3 demo
2015-08-10 13:59:03 -07:00
Brandon Philips fb1951204c etcdserver: move atomics to make etcd work on arm64
Follow the simple rule in the atomic package:

"On both ARM and x86-32, it is the caller's responsibility to arrange
for 64-bit alignment of 64-bit words accessed atomically. The first word
in a global variable or in an allocated struct or slice can be relied
upon to be 64-bit aligned."

Tested on a system with /proc/cpuinfo reporting:

processor       : 0
model name      : ARMv7 Processor rev 1 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3
tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc0d
CPU revision    : 1
2015-08-08 18:11:41 -07:00
Xiang Li 9ff7075ce8 etcdserver: use v3server interface 2015-08-08 10:39:04 -07:00
Xiang Li f004b4dac7 *: etcdserver supports v3 demo 2015-08-08 05:58:29 -07:00
Xiang Li 82afadbcc6 etcdserverpb: update proto 2015-08-08 05:31:35 -07:00
Xiang Li 845c51fedd *: fix typos vaild->valid 2015-08-07 10:57:11 -07:00
Yicheng Qin f03f048232 Merge pull request #3184 from yichengq/fast-bootstrap
etcdserver: tick ElectionTicks before starting when bootstrap new cluster
2015-08-06 15:54:40 -07:00
Yicheng Qin 21f5b885f2 etcdserver: fast election timeout when bootstrap cluster
The behavior accelarates the happen of the first-time leader election,
so the cluster could elect its leader fast. Technically, it could
help to reduce `electionMs - heartbeatMs` wait time for the first leader election.

Main usage:
1. Quick start for the local cluster when setting a little longer
election timeout
2. Quick start for the global cluster, which sets election timeout to
its maximum 50s.
2015-08-06 15:44:26 -07:00
Yicheng Qin a637e86372 Merge pull request #3220 from yichengq/fix-auth-check
etcdhttp: fix access check for multiple roles in auth
2015-08-06 15:09:04 -07:00
Xiang Li 58503817ec etcdserver: internal request union 2015-08-05 07:47:10 -07:00
Yicheng Qin 18169e896c etcdhttp: fix access check for multiple roles in auth
Check access for multiple roles should go through all roles.
2015-08-04 14:31:07 -07:00
Xiang Li 2b8abeb093 *: remove migration related stuff from 2.2 2015-08-01 19:37:20 +08:00
Barak Michener dd1a8fe330 etcdhttp: Improve test coverage surrounding auth 2015-07-30 14:21:08 -04:00
Xiang Li 80b794dccc Merge pull request #3185 from xiang90/add_debug_endpoint
etcdhttp: add config/local/debug endpoint
2015-07-30 08:46:07 +08:00
Xiang Li 4e31df2c2b etcdhttp: add config/local/log endpoint
PUT on the endpoint sets the GlobalDebugLevel to json level value.
The action overwrites the origianl log level setting from
users. We need to write doc to warn this.
2015-07-30 08:35:01 +08:00
Yicheng Qin 6fc9dbfe56 Merge pull request #3114 from yichengq/clean-raft-init
etcdserver: clean up start and stop logic of raft
2015-07-27 14:19:25 -07:00
Yicheng Qin 7696dd3280 etcdserver: clean up start and stop logic of raft
kill TODO and make it more readable.
2015-07-27 13:24:26 -07:00
Xiang Li 53a77fa519 *: tnx -> txn 2015-07-24 23:21:09 +08:00
Yicheng Qin b7892b20c1 etcdserver: rename defaultPublishRetryInterval -> defaultPublishTimeout
This makes code more readable and reasonable.
2015-07-23 10:09:28 -07:00
Yicheng Qin 5be545b872 Merge pull request #3077 from yichengq/fix-test-sync
etcdserver: init raft internal var early
2015-07-10 14:44:52 -07:00
Xiang Li 2fb8347d36 etcdserver: add rpc proto 2015-06-29 20:00:09 -07:00
Xiang Li 581ef05bab *: resolve proto warnings 2015-06-29 18:39:46 -07:00
Xiang Li 13f44e4b79 *: update generated proto code 2015-06-29 16:45:25 -07:00
Yicheng Qin 7f95780bfb etcdserver: init raft internal var early
Its `stopped`/`done` should be created always before being used
in defer in server loop.

It fixes the race detected when running TestSyncTrigger.
2015-06-29 15:34:15 -07:00
Yicheng Qin 2e41b4f9e1 etcdserver/auth: fix return value when creating root user
Before:

```
$ curl http://127.0.0.1:4001/v2/auth/users/root -XPUT -d '{"user": "root",
"password": "root"}'
{"user":"root","roles":null}
```

After:

```
{"user":"root","roles":["root"]}
```
2015-06-27 23:16:54 -07:00
Barak Michener acca9cc3a9 Merge pull request #3047 from barakmich/auth_cov
auth: improve test coverage
2015-06-25 14:47:22 -04:00
Barak Michener 39c10d1fe4 auth: improve test coverage 2015-06-25 14:25:08 -04:00
Yicheng Qin 5d131acfba etcdserver: fix TestTriggerSnap
Before checking, it needs to wait for snapshot goroutine to finish its
work.
2015-06-25 09:58:36 -07:00
Xiang Li 52c2a5731f etcdserver: fix typo in metrics.go 2015-06-24 12:42:40 -07:00
Xiang Li 030d1bbf2d auth: do not allow update root role 2015-06-23 20:15:08 -07:00
Xiang Li e291dfd748 etcdhttp: improve user endpoint validation
Giving both roles and grant/revoke is not allowed.
Creating an existing user is not allowed.
Updating a non-existing user is not allowed.
2015-06-23 14:38:44 -07:00
Xiang Li c8628c8fe5 auth: separate the role create and update path
Giving both permission and grant/revoke is not allowed.
Creating an existing role is not allowed.
Updating a non-existing is not allowed.
2015-06-23 13:15:32 -07:00
Xiang Li bc61056912 etcdhttp: use correct http status const when writing http error 2015-06-23 12:40:30 -07:00
Xiang Li 4f47a6ebfb Merge pull request #3032 from xiang90/refactor_update_role
auth: refactor updateRole
2015-06-23 11:17:45 -07:00
Barak Michener d5a0e3ac6a etcdhttp: Always strip password hash when returning users 2015-06-22 18:39:16 -04:00
Xiang Li 979f531261 auth: refactor updateRole
We will return error if revoke or grant fails to update the role.
No need to check if revoke or grant is nil or not.
2015-06-22 15:16:10 -07:00
Xiang Li 3f82e7b116 auth: do not allow to grant duplicate role or revoke ungranted role to a user 2015-06-22 15:11:09 -07:00
Barak Michener 51a65599dd Merge pull request #3021 from xiang90/auth_err
etcdserver: use correct http status code for auth error
2015-06-22 14:58:33 -04:00
Xiang Li c39aad0e92 etcdserver: use correct http status code for auth error 2015-06-22 09:28:47 -07:00
Xiang Li 3e4479b0cd Merge pull request #3022 from xiang90/aut_type
etcdhttp: fix the response type for auth
2015-06-21 15:06:35 -07:00
Xiang Li d295d21349 etcdserver: better log message for url mismatch 2015-06-19 19:36:26 -07:00
Xiang Li cad757efa0 etcdhttp: fix the response type for auth 2015-06-19 15:19:00 -07:00
Barak Michener 64ec8af91b *: Rename `security` to `auth` 2015-06-15 18:18:50 -04:00
Antoine Grondin 270487d340 etcdserver: use Infof to print formatted argument 2015-06-14 20:22:21 +07:00
Xiang Li 8ad7ed321e *:godep log pkg 2015-06-11 14:22:14 -07:00
Xiang Li f013a627a4 etcdserver/stats: use leveled log 2015-06-11 14:22:14 -07:00
Xiang Li cf7cb2b8a9 etcdserver/security: use leveled log 2015-06-11 14:22:14 -07:00
Xiang Li 2f795e42d0 httptypes: use leveled log 2015-06-11 14:19:53 -07:00
Barak Michener 7bf0479e66 Merge pull request #2882 from barakmich/security_client_new
*: Add security/authorization to etcd/client and etcdctl
2015-06-11 13:40:32 -04:00
Yicheng Qin 1af2b4cad7 rafthttp: fix TestUpdateMember
Before this PR, it may error like this:

```
--- FAIL: TestUpdateMember-2 (0.00s)
		server_test.go:950: action =
		[{ApplyConfChange:ConfChangeUpdateNode []}
{ProposeConfChange:ConfChangeUpdateNode []}], want
[{ProposeConfChange:ConfChangeUpdateNode []}
{ApplyConfChange:ConfChangeUpdateNode []}]
```

This fixes the test by recording the proposal event in time.
2015-06-11 09:45:34 -07:00
Yicheng Qin cd629c9b44 Merge pull request #2939 from yichengq/fix-update-attr
etcdserver: allow to update attributes of removed member
2015-06-10 16:53:39 -07:00
Yicheng Qin 8725e69cf7 etcdserver: allow to update attributes of removed member
There exist the possiblity to update attributes of removed member in
reasonable workflow:
1. start member A
2. leader receives the proposal to remove member A
2. member A sends the proposal of update its attribute to the leader
3. leader commits the two proposals
So etcdserver should allow to update attributes of removed member.
2015-06-10 16:52:18 -07:00
Yicheng Qin 4e79abcfeb Merge pull request #2944 from yichengq/fix-2procs
pkg/testutil: ForceGosched -> WaitSchedule
2015-06-10 14:44:32 -07:00
Yicheng Qin 018fb8e6d9 pkg/testutil: ForceGosched -> WaitSchedule
ForceGosched() performs bad when GOMAXPROCS>1. When GOMAXPROCS=1, it
could promise that other goroutines run long enough
because it always yield the processor to other goroutines. But it cannot
yield processor to goroutine running on other processors. So when
GOMAXPROCS>1, the yield may finish when goroutine on the other
processor just runs for little time.

Here is a test to confirm the case:

```
package main

import (
	"fmt"
	"runtime"
	"testing"
)

func ForceGosched() {
	// possibility enough to sched up to 10 go routines.
	for i := 0; i < 10000; i++ {
		runtime.Gosched()
	}
}

var d int

func loop(c chan struct{}) {
	for {
		select {
		case <-c:
			for i := 0; i < 1000; i++ {
				fmt.Sprintf("come to time %d", i)
			}
			d++
		}
	}
}

func TestLoop(t *testing.T) {
	c := make(chan struct{}, 1)
	go loop(c)
	c <- struct{}{}
	ForceGosched()
	if d != 1 {
		t.Fatal("d is not incremented")
	}
}
```

`go test -v -race` runs well, but `GOMAXPROCS=2 go test -v -race` fails.

Change the functionality to waiting for schedule to happen.
2015-06-10 14:37:41 -07:00
Barak Michener a4d1a5a6e5 *: Add security/auth support to etcdctl and etcd/client
add godep for speakeasy and auth entry parsing
add security_user to client
add role to client
add role commands
add auth support to etcdclient and etcdctl(member/user)
add enable/disable to etcdctl
better error messages, read/write/readwrite
Bump go-etcd to include codec changes, add new dependency
verify the error for revoke/add if nothing changed, remove security-merging prefix
2015-06-10 16:58:10 -04:00
Xiang Li 19ef3a0982 Merge pull request #2934 from xiang90/etcdserver_log
etcdserver: use leveled logging
2015-06-09 15:53:52 -07:00
Xiang Li e0f9796653 etcdserver: use leveled logging
Leveled logging for etcdserver pkg.
2015-06-09 13:53:07 -07:00
Yicheng Qin 9fbd2599ad Merge pull request #2940 from yichengq/improve-raft-loop
etcdserver: stop raft loop when receiving stop signal
2015-06-09 11:24:53 -07:00
Yicheng Qin 0814966ca2 etcdserver: stop raft loop when receiving stop signal
When it waits for apply to be done, it should stop the loop if it
receives stop signal.

This helps to print out panic information. Before this PR, if the panic
happens when server loop is applying entries, server loop will wait for
raft loop to stop forever.
2015-06-09 11:11:53 -07:00
Brian Akins d8a836e618 Simple debug HTTP request logging 2015-06-09 13:40:37 -04:00
Xiang Li 0adeee2965 etcdhttp: use leveled logging 2015-06-09 09:26:57 -07:00
Xiang Li 3af4a45d7b etcdserver: make raft use leveled logger 2015-06-02 12:50:42 -07:00
Xiang Li 42fe370b35 Merge pull request #2848 from xiang90/metrics
*: use namespace and subsystem in metrics
2015-05-26 14:44:54 -07:00
Xiang Li 34ac145b38 *: use namespace and subsystem in metrics
Fix #2841.

From Prometheus developer:
```
the recommended way for etcd as an open source project and under
consideration of its size would be etcd_<subsystem>_<name>.
```

We made the naming change accordingly.
2015-05-26 14:39:04 -07:00
Xiang Li 3028edd7dc Merge pull request #2856 from xiang90/mrefactor
etcdserver: refactore member.go
2015-05-26 14:37:37 -07:00
Barak Michener 9ef098c5ed etcdserver: fix go vet. Fixes #2859 2015-05-22 13:54:54 -04:00
Xiang Li 58eefda72d Merge pull request #2840 from yichengq/revert-url-equal
Revert "Treat URLs have same IP address as same"
2015-05-21 19:27:19 -07:00
Xiang Li 4a72d3a8bb etcdserver: refactore member.go 2015-05-21 09:19:29 -07:00
Xiang Li 260aad5468 Merge pull request #2830 from xiang90/join_checking
checking cluster version compatibility before joining the existing cluster
2015-05-20 12:25:50 -07:00
Xiang Li aa417ab644 etcdserver: log the per endpoint error in getVersion 2015-05-20 12:10:10 -07:00
Xiang Li db7db689a6 etcdserver: check cluster version compability when joining 2015-05-19 10:19:41 -07:00
Barak Michener a88a53274f security: Lazily create the security directories. Fixes #2755, may find new instances for #2741
revert the kv integration test

fix nits

amend security mention of GUEST
2015-05-18 17:28:04 -04:00
Yicheng Qin 3153e635d5 Revert "Treat URLs have same IP address as same"
This reverts commit f8ce5996b0.

etcd no longer resolves TCP addresses passed in through flags,
so there is no need to compare hostname and IP slices anymore.
(for more details: a3892221ee)

Conflicts:
	etcdserver/cluster.go
	etcdserver/config.go
	pkg/netutil/netutil.go
	pkg/netutil/netutil_test.go
2015-05-16 03:21:10 -07:00
Xiang Li 9f8342dba4 etcdserver: do not get local version via HTTP 2015-05-13 17:19:32 -07:00
Xiang Li 988c30bfba etcdserver: getVersion returns both server and cluster version 2015-05-13 17:04:46 -07:00
Xiang Li 6296054ff6 etcdhttp: version endpoint also returns cluster version. 2015-05-13 15:48:10 -07:00
Yicheng Qin 75ee7f4aa1 Merge pull request #2821 from yichengq/private-cluster
etcdserver: stop exposing Cluster struct
2015-05-13 10:26:48 -07:00
Xiang Li 2690535f8a Merge pull request #2820 from xiang90/cap
version capability checking
2015-05-13 10:16:49 -07:00
Xiang Li d3b1d5c008 etcdhttp: support capability checking
etcdhttp will check the cluster version and update its
capability version periodically.

Any new handler's after 2.0 needs to wrap by capability handler
to ensure it is not accessable until rolling upgrade finished.
2015-05-13 10:11:35 -07:00
Yicheng Qin a6a649f1c3 etcdserver: stop exposing Cluster struct
After this PR, only cluster's interface Cluster is exposed, which makes
code much cleaner. And it avoids external packages to rely on cluster
struct in the future.
2015-05-13 10:01:25 -07:00
Xiang Li f2905f2828 etcdserver: remove unnecessary around detect datadir
The log is super unhelpful. When I have a 2.1.0 etcd, it prints out
`2.0.1 vaild dir`. I have no idea why the data dir of a 2.1.0 etcd is
2.0.1.
2015-05-12 22:06:42 -07:00
Yicheng Qin 032db5e396 *: extract types.Cluster from etcdserver.Cluster
The PR extracts types.Cluster from etcdserver.Cluster. types.Cluster
is used for flag parsing and etcdserver config.

There is no need to expose etcdserver.Cluster public, which contains
lots of etcdserver internal details and methods. This is the first step
for it.
2015-05-12 14:53:11 -07:00
Xiang Li e866314b94 etcdserver: support update cluster version through raft
1. Persist the cluster version change through raft. When the member is restarted, it can recover
the previous known decided cluster version.

2. When there is a new leader, it is forced to do a version checking immediately. This helps to
update the first cluster version fast.
2015-05-12 11:44:34 -07:00
Xiang Li 94ffd72c7e etcdserver: rename StoreAdminPrefix to StoreClusterPrefix
We store cluster related key in StoreAdminPrefix for some
historical reason. The previous API is called admin. But now,
the admin name is gone and `cluster` is a more clear and correct
name.
2015-04-29 12:05:51 -07:00
Xiang Li 6699107f61 *: add cluster version and cluster version detection.
Cluster version is the min major.minor of all members in
the etcd cluster. Cluster version is set to the min version
that a etcd member is compatible with when first bootstrapp.

During a rolling upgrades, the cluster version will be updated
automatically.

For example:

```
Cluster [a:1, b:1 ,c:1] -> clusterVersion 1

update a -> 2, b -> 2

after a detection

Cluster [a:2, b:2 ,c:1] -> clusterVersion 1, since c is still 1

update c -> 2

after a detection

Cluster [a:2, b:2 ,c:2] -> clusterVersion 2
```

The API/raft component can utilize clusterVersion to determine if
it can accept a client request or a raft RPC.

We choose polling rather than pushing since we want to use the same
logic for cluster version detection and (TODO) cluster version checking.

Before a member actually joins a etcd cluster, it should check the version
of the cluster. Push does not work since the other members cannot push
version info to it before it actually joins. Moreover, we do not want our
raft RPC system (which is doing the heartbeat pushing) to coordinate cluster version.
2015-04-29 11:31:59 -07:00
Yicheng Qin 1c1cccd236 rafthttp: stop etcd if it is found removed when stream dial
The original process is stopping etcd only when pipeline message finds itself
has been removed. After this PR, stream dial has this functionality too.
It helps fast etcd stop, which doesn't need to wait for stream break to
fall back to pipeline, and wait for election timeout to send out message
to detect self removal.
2015-04-27 15:10:00 -07:00
Yicheng Qin ebecee34e0 Merge pull request #2701 from yichengq/rafthttp-anon
rafthttp: add remotes
2015-04-24 13:04:37 -07:00
Yicheng Qin 9f19b5660f rafthttp: add AddRemote
Add remotes to rafthttp, who help newly joined members catch up the
progress of the cluster. It supports basic message sending to remote, and
has no stream connection for simplicity. remotes will not be used
after the latest peers have been added into rafthttp.
2015-04-24 11:49:23 -07:00
xiaost cab1e9a723 etcdserver: skip noop entry in apply 2015-04-24 12:15:51 +08:00
Barak Michener fa74e702d8 security: Improve the security api as per the suggestions list in #2384
Subcommits:

decouple root and security enable/disable

create root role

prefix matching

godep: bump go-etcd to include credentials

add godep for speakeasy and auth entry parsing

appropriate errors for security enable/disable

WIP adding to etcd/client all the security client methods

add guest access

minor ui return tweaks

revert client changes

respond to comments, log more security operations

fix major ensure() bug, add better UX

block recursive access

fix some boneheaded mistakes

fix integration test

last comments

fix up security_api.md

philips nits

fix docs
2015-04-23 16:11:38 -04:00
Yicheng Qin 1d96de459a etcdserver: init server stats before passing it as argument
It is more reasonable to init the variable before passing it as an
argument.

It fixes a bug that etcdserver may panic on server stats when processing
a message from rafthttp streamReader before server stats is initialized
in server.Start().
2015-04-22 08:28:08 -07:00
Xiang Li 5ad559b503 *: serve json version on both client and peer url 2015-04-20 16:23:51 -07:00
Yicheng Qin 1811701427 Revert "etcdserver: fix cluster fallback recovery"
This reverts commit cff005777a.

Conflicts:
	etcdserver/server.go
2015-04-19 11:34:33 -07:00
Yicheng Qin 88224f6f4e Revert "etcdserver: not apply stale conf change in cluster and transport"
This reverts commit 40197f0698.
2015-04-19 11:08:03 -07:00
Xiang Li 98f8dfbc9d etcdserver: prevExist=true + condition is compareAndSwap
PrevExist indicates the key should exist. Condition compares with
an existing key. So PrevExist+condition = CompareAndSwap not Update.
2015-04-14 23:44:06 -07:00
xiaost eab2c2224a etcdserver: fix minor bug in EtcdServer.send
it seems to nothing serious.
after deleted peers, the log may output:
"etcdserver: send message to unknown receiver %s"
2015-04-13 20:35:58 +08:00
Yicheng Qin 2141308524 Merge pull request #2631 from yichengq/metrics-fd
etcdserver: metrics and monitor number of file descriptor
2015-04-08 11:28:58 -07:00
Yicheng Qin 7a7e1f7a7c etcdserver: metrics and monitor number of file descriptor
It exposes the metrics of file descriptor limit and file descriptor used.
Moreover, it prints out warning when more than 80% of fd limit has been used.

```
2015/04/08 01:26:19 etcdserver: 80% of the file descriptor limit is open
[open = 969, limit = 1024]
```
2015-04-08 11:17:48 -07:00
Alex Crawford d9ad6aa2a9 *: update to use IANA-assigned ports 2015-04-06 13:49:43 -07:00
Xiang Li 471aa1aa89 Merge pull request #2622 from xiang90/fix_watcher
store: fix watcher removal
2015-04-03 10:39:03 -07:00
Xiang Li 999917010d store: fix watcher removal 2015-04-03 10:13:43 -07:00
Yicheng Qin 9e5743c816 etcdserver: stop raft node goroutine before stop server
Stop raftNode goroutine before stopping server goroutine, so
server.Stop does stop all underlying stuffs elegantly now. This fixes
the problem that previous-round lock on WAL may not be released when
etcd is restarted.
2015-04-01 11:20:51 -07:00
Xiang Li 77a04cda0c Merge pull request #2597 from xiang90/wal-repair
wal: fix the unexpectedEOF error in the last wal.
2015-03-30 13:49:05 -07:00
Xiang Li 253f7c4ae1 Merge pull request #2522 from xiang90/user_pw
etcdserver/etcdhttp: do not return back the password of a user
2015-03-30 13:42:41 -07:00
Xiang Li 0b9a318e68 etcdserver: make the wal repairing logic clear 2015-03-29 21:10:28 -07:00
Xiang Li 1231f82f22 etcdserver: save snapshot into wal first 2015-03-29 14:23:05 -07:00
Xiang Li 8b4eed29e5 wal: fix the unexpectedEOF error in the last wal.
It is safe to repair the unexpectedEOF error in the last wal. raft
will not send out message before the entry successfully comitted
into wal. Thus we can safely truncate the last entry in the wal
to repair.
2015-03-28 21:08:14 -07:00
Yicheng Qin 60efd4d96e Revert "etcdhttp: add internalVersion"
This reverts commit a77bf97c14.

Conflicts:
	version/version.go
2015-03-27 16:53:55 -07:00
Yicheng Qin dd92a2b484 Merge pull request #2556 from yichengq/fix-apply-conf
etcdserver: not apply stale conf change
2015-03-27 14:00:30 -07:00
Kelsey Hightower 538d624cfa etcdserver: add stats.LatencyStats and stats.CountsStats types 2015-03-27 13:42:44 -07:00
Yicheng Qin 40197f0698 etcdserver: not apply stale conf change in cluster and transport 2015-03-27 12:53:34 -07:00
Xiang Li e3817adb5b etcdserver: loose member validation for joining existing cluster 2015-03-25 13:59:22 -07:00
Xiang Li 05e240b892 *: update protobuf 2015-03-25 10:14:35 -07:00
Yicheng Qin 5e0077cc0c etcdserver: print out extra files in data dir instead of erroring 2015-03-24 18:56:22 -07:00
Xiang Li 866a9d4e41 Merge pull request #2568 from xiang90/raftnode
raft: make node configurable
2015-03-24 11:18:22 -07:00
Yicheng Qin ea78f5d1aa Merge pull request #2552 from yichengq/fix-2396
etcdserver: check -initial-cluster in join case
2015-03-23 22:46:38 -07:00
Yicheng Qin abcd828114 etcdserver: add join-existing check 2015-03-23 22:31:20 -07:00
Xiang Li abddef0f28 raft: make node configurable 2015-03-23 21:20:49 -07:00
Kelsey Hightower 4611c3b2d7 netutil: add BasicAuth function
etcd ships it's own BasicAuth function and no longer requires
Go 1.4 to build.
2015-03-20 17:32:33 -07:00
Xiang Li 9d28f94005 etcdserver/etcdhttp: do not return back the password of a user 2015-03-16 22:35:01 -07:00
Xiang Li f3e4dbf967 etcdserver/etcdhttp: write the http error to response writer 2015-03-16 15:24:19 -07:00
Xiang Li bba7f75562 Merge pull request #2517 from yichengq/fix-sec2
security: fix var shadowing in CreateOrUpdateUser
2015-03-16 15:08:55 -07:00
Yicheng Qin 8335a5407b security: fix var shadowing in CreateOrUpdateUser 2015-03-16 14:59:05 -07:00
Yicheng Qin d7780cf293 security: fix var shadowing in CreateOrUpdate 2015-03-16 14:55:04 -07:00
Barak Michener 001efa0639 security: Implement RBAC security for etcd
stub out security

further wip

Last stub before CRUD for roles

Complete role merging

start tests

add Godep for golang.org/x/crypto/bcrypt

first round of comments

add tests, remove root addition (will be added back as part of creation)

Add security checks for /v2/machines and /v2/keys

Allow non-root to determine if security is enabled, get machine list.

Responding to comments, remove multiple verbs (like /v2/security/user/foo/password)

add some prefixes to the logging
2015-03-16 16:23:11 -04:00
Xiang Li d015610da5 etcdserver: separate apply and raft routine 2015-03-10 13:34:24 -07:00
Yicheng Qin b4b9b9118a rafthttp: report MsgSnap status 2015-03-02 09:38:11 -08:00
Yicheng Qin 9989bf1d36 Merge pull request #2407 from yichengq/334
rafthttp: report unreachable status of the peer
2015-03-02 09:35:35 -08:00
Yicheng Qin 9b986fb4c1 rafthttp: report unreachable status of the peer
When it failed to send message to the remote peer, it reports unreachable
to raft.
2015-03-01 16:48:26 -08:00
Xiang Li 428b77afc3 etcdserver: keep a min number of entries in memory
Do not aggressively compact raft log entries. After a snapshot,
etcd server can compact the raft log upto snapshot index. etcd server
compacts to an index smaller than snapshot to keep some entries in memory.
The leader can still read out the in memory entries to send to a slightly
slow follower. If all the entries are compacted, the leader will send the
whole snapshot or read entries from disk if possible.
2015-03-01 10:12:13 -08:00
Xiang Li a4dab7ad75 *: do not block etcdserver when encoding store into json
Encoding store into json snapshot has quite high CPU cost. And it
will block for a while. This commit makes the encoding process non-
blocking by running it in another go-routine.
2015-02-28 11:41:58 -08:00
Xiang Li 9b4d52ee73 raft: do not resend snapshot if not necessary
raft relies on the link layer to report the status of the sent snapshot.
If the snapshot is still sending, the replication to that remote peer will
be paused. If the snapshot finish sending, the replication will begin
optimistically after electionTimeout. If the snapshot fails, raft will
try to resend it.
2015-02-28 11:41:58 -08:00
Xiang Li 86429264fb wal: support auto-cut in wal
WAL should control the cut logic itself. We want to do falloc to
per allocate the space for a segmented wal file at the beginning
and cut it when it size reaches the limit.
2015-02-28 11:18:59 -08:00
Xiang Li 95bba154d6 etcdserver: add propose summary 2015-02-28 11:16:42 -08:00
Xiang Li 83c953b153 etcdhttp: move /stats to /debug/vars 2015-02-28 11:16:42 -08:00
Xiang Li 84485643fe *: expose wal metrics at /metrics 2015-02-28 11:06:11 -08:00
Xiang Li 2af33fd494 raft: add reportUnreachable 2015-02-28 10:45:22 -08:00
Brian Waldon 4a77760f56 client: break dependency on httptypes pkg 2015-02-28 10:38:46 -08:00
Xiang Li 2e078582f9 etcdmain: expose runtime metrics 2015-02-28 10:11:53 -08:00
Xiang Li 33afbfead6 etcdserver: remove the dep on metrics. first step towards removing metrics pkg from etcd. 2015-02-28 10:09:55 -08:00
Xiang Li 5ede18be74 raft: separate compact and createsnap in memory storage 2015-02-28 10:08:30 -08:00
Yicheng Qin cff005777a etcdserver: fix cluster fallback recovery
Cluster and transport may recover to old states when new node joins
the cluster. Record cluster last modified index to avoid this.
2015-02-20 14:30:00 -08:00
Barak Michener 92dca0af0f *: remove shadowing of variables from etcd and add travis test
We've been bitten by this enough times that I wrote a tool so that
it never happens again.
2015-02-17 16:31:42 -05:00
Xiang Li beb44ef6ba etcdserver: fix error message when valide the discovery cluster 2015-02-16 09:53:01 -08:00
Xiang Li 73e67628d9 Merge pull request #2313 from xiang90/cluster_mu
etcdserver: move the mutex before what it guards
2015-02-14 23:05:53 -08:00
Xiang Li 04bd06d20b etcdserver: move the mutex before what it guards 2015-02-14 22:26:12 -08:00
Xiang Li c5ca1218f3 etcdserver: GetClusterFromPeers -> GetClusterFromRemotePeers 2015-02-13 19:05:29 -08:00
Xiang Li f7540912d6 etcdserver: getOtherPeerURLs -> getRemotePeerURLs 2015-02-13 18:56:45 -08:00
Xiang Li cfa7ab6074 etcdserver: validate discovery cluster 2015-02-13 14:32:24 -08:00
Xiang Li c16cc3a6a3 etcdserver: recover transport when recovering from a snapshot 2015-02-13 10:16:28 -08:00
Xiang Li fbc4c8efb5 etcdserver: fix snapshot 2015-02-13 09:54:25 -08:00
Barak Michener a0e3bc9cbd etcdserver: Unmask the snapshotter. Fixes #2295 2015-02-13 11:56:00 -05:00
Barak Michener cd50f0e058 etcdserver: Create MemberDir() and base {Snap,WAL}Dir() thereon. Audit DataDir. 2015-02-12 12:45:19 -05:00
Barak Michener fade9b6065 etcdserver: Refactor 2.0.1 directory rename into a proper migration
fix all instances

fix detection test
2015-02-12 11:53:19 -05:00
Xiang Li 163f0f09f6 etcdserver: cleanup cluster_util 2015-02-11 16:20:38 -08:00
Xiang Li 20497f1f85 etcdserver: move remote cluster retrive to cluster_util.go 2015-02-11 14:03:14 -08:00
Xiang Li 6e1aecfc6f etcdserver: save confstate when apply new snapshot 2015-02-10 07:31:25 -08:00
Yicheng Qin f13c7872d5 etcdserver: register pre-defined namespaces in store 2015-02-04 16:33:40 -08:00
Yicheng Qin 7840d49ae0 etcdserver: not add self to transporter based on local ID
If this is decided by local name, it comes to trouble if the name is
duplicate in the cluster.
2015-01-29 12:35:47 -08:00
Xiang Li 276c9540b4 etcdserver: support raft.status 2015-01-26 16:39:33 -08:00
Yicheng Qin f0c9a54edb Merge pull request #2156 from yichengq/309
pkg/metrics: self-manage global expvar map
2015-01-26 16:20:31 -08:00
Yicheng Qin 08b34a3f5b pkg/metrics: self-manage global expvar map
This helps the embedded tests.
2015-01-26 16:20:09 -08:00
Shota Fukumori (sora_h) f8ce5996b0 Treat URLs have same IP address as same
- To solve validation error problem using URLs in hostname #2123
2015-01-27 04:36:41 +09:00
Xiang Li 9c7f66c5d9 Merge pull request #2119 from sorah/peer-ca-on-fetching-members
etcdserver: User peerTLSInfo to get cluster member
2015-01-26 10:50:44 -08:00
Shota Fukumori (sora_h) 033e7d1db9 etcdserver: User peerTLSInfo to get cluster member 2015-01-27 03:43:21 +09:00
Jonathan Boulle f1ed69e883 *: switch to line comments for copyright
Build tags are not compatible with block comments.
Also adds copyright header to a few places it was missing.
2015-01-26 09:53:30 -08:00
Xiang Li a77bf97c14 etcdhttp: add internalVersion 2015-01-22 15:42:16 -08:00
Ben Darnell 59214978a2 raft: Add applied index as an argument to newRaft and RestartNode. 2015-01-22 11:38:05 -05:00
Yicheng Qin 99821579bf metrics: add /rafthttp/stream metrics 2015-01-21 13:24:21 -08:00
Xiang Li 0eaaad0e48 raft: add Status interface
Status returns the current status of raft state machine.
2015-01-16 14:02:04 -08:00
Xiang Li a97f331a0e etcdhttp: add health endpoint 2015-01-16 10:52:02 -08:00
Xiang Li 973f79e1c9 etcdserver: separate out raft related stuff 2015-01-15 15:15:13 -08:00
Xiang Li 1a6161d08a Merge pull request #2104 from xiang90/timeout
etcdserver: make heartbeat/election configurable
2015-01-15 13:52:20 -08:00
Xiang Li 276a4abac0 etcdserver: make heartbeat/election configurable 2015-01-15 11:11:33 -08:00
Yicheng Qin 190fd446f9 pkg/types: add URLs tests 2015-01-15 10:24:23 -08:00
Xiang Li 89d95539cf Merge pull request #2083 from yichengq/293
*: move etcdserver/idutil -> pkg/idutil
2015-01-13 13:04:50 -08:00
Yicheng Qin 07a69430c1 *: move etcdserver/idutil -> pkg/idutil 2015-01-13 11:54:51 -08:00
Yicheng Qin dc6aef0d02 etcdhttp: add NewPeerHandler test 2015-01-12 15:56:29 -08:00
Yicheng Qin 270e67db84 wal: not export unnecessary public functions 2015-01-09 14:55:10 -08:00
Yicheng Qin bca1e5aea6 Merge pull request #2057 from yichengq/282
fix context time-out failure on travis
2015-01-07 13:41:26 -08:00
Yicheng Qin 930156c18a integration: adjust election ticks using env var 2015-01-07 11:18:29 -08:00
Yicheng Qin 6b237416e1 Merge pull request #2044 from yichengq/278
wal: record mark when snapshotting
2015-01-07 08:26:33 -08:00
Yicheng Qin 6460e49a33 wal: save empty snapshot when create
So caller can open at empty snapshot to read all entries.
2015-01-06 19:48:21 -08:00
Yicheng Qin 84f62f21ee wal: record and check snapshot 2015-01-06 16:27:40 -08:00
Xiang Li 1ebad5e42c etcdhttp: support member/leader endpoint 2015-01-06 08:52:33 -08:00
Xiang Li 6b8667152b Merge pull request #2035 from xiang90/errorc
etcdserver: collect error from errorc
2015-01-05 11:29:01 -08:00
Yicheng Qin cb5bff5b05 Merge pull request #2034 from yichengq/276
etcdhttp: reset serve and watch timeout
2015-01-05 08:33:25 -08:00
Xiang Li 15be030aaa etcdserver: collect error from errorc 2015-01-02 20:13:46 -08:00
Yicheng Qin 4dd00be365 etcdhttp: reset serve and watch timeout 2015-01-02 16:39:13 -08:00
Xiang Li 2a83e350b1 Merge pull request #1992 from xiang90/rm_leader
*: support removing the leader from a 2 members cluster
2015-01-02 14:15:12 -08:00
Xiang Li 6e727625b9 etcdserver: continue to apply after self-removed 2015-01-02 14:10:07 -08:00
Xiang Li 51ffc88096 etcdsever: remove mult_server_test 2015-01-02 13:49:58 -08:00
Xiang Li 41f6137261 etcdserver: use the actual store implementation when we need the actual implementation 2015-01-02 12:22:01 -08:00
Xiang Li 27d47977d9 etcdserver: move recorder to testutil 2015-01-02 11:21:23 -08:00
Xiang Li ac6cd03365 etcdserver: refactor server_test.go 2015-01-02 10:56:09 -08:00
Xiang Li 04003a01ba Merge pull request #2013 from xiang90/tr
rafthttp cleanup
2014-12-31 08:35:20 -08:00
Xiang Li 803c38f448 etcdserver: move error to errors.go
Both server.go and cluster.go are using defined ErrX. Move error
to errors.go
2014-12-30 15:02:07 -08:00
Xiang Li c3d2f5eea0 pbutil: add getbool to pbutil 2014-12-30 14:51:26 -08:00
Yicheng Qin 241a474935 etcdserver: refactor server tests
1. remove redundant fake struct
2. use fake node for better testing
3. code clean
2014-12-30 13:49:55 -08:00
Yicheng Qin a92bd1d165 etcdserver: add multi_server_test.go 2014-12-30 00:10:18 -08:00
Xiang Li f79b9042ab etcdserver: fix streaming handler 2014-12-29 22:12:58 -08:00
Yicheng Qin 3748088b96 etcdserver: print out log of normal tests 2014-12-29 14:38:00 -08:00
Yicheng Qin 6ccaadc95d Merge pull request #1952 from yichengq/262
etcdserver: add id generator
2014-12-29 13:59:06 -08:00
Yicheng Qin 05c921229e etcdserver: add id generator 2014-12-29 13:03:04 -08:00
Xiang Li c712dd682a rafthttp: make Transport private 2014-12-29 12:20:52 -08:00
Xiang Li a14d13f724 rafthttp: make fields in Transport private 2014-12-29 12:08:13 -08:00
Xiang Li 7c8b9c0203 Merge pull request #2011 from xiang90/timeutil
etcdserver: move getExpr to timeutil
2014-12-29 12:03:25 -08:00
Xiang Li 152676f43a *: support removing the leader from a 2 members cluster 2014-12-29 11:34:33 -08:00
Yicheng Qin 5bb8eeb5cf rafthttp: transport cleanup 2014-12-29 11:21:40 -08:00
Xiang Li cea29fe158 etcdserver: move getExpr to timeutil 2014-12-29 11:15:02 -08:00
Yicheng Qin 08f839e32c rafthttp: set the API boundary of the package 2014-12-28 15:50:27 -08:00
Xiang Li 1535596252 etcdserver: remove unnecessary indirection 2014-12-26 11:03:13 -08:00
Xiang Li 3dcd66459d etcdserver: remove unused containsUint64() 2014-12-26 10:56:56 -08:00
Xiang Li 69444b6bba etcdserver: cleanup server.go 2014-12-25 21:37:20 -08:00
Xiang Li 78b51d3f2f etcdserver: cleanup cluster.go 2014-12-25 20:56:30 -08:00
Xiang Li 6dc3af5da4 etcdserver: cluster clean up 2014-12-25 20:36:48 -08:00
Xiang Li 7a5bf53222 etcdserver: move member sort interface to member.go 2014-12-25 20:18:55 -08:00
Xiang Li ef0a66bb0a etcdserver: kill a todo in test 2014-12-25 18:14:05 -08:00
Xiang Li f43bc809b9 etcdserver: cleanup wal upgrade 2014-12-24 22:02:46 -08:00
Xiang Li b9d228b0fa etcdserver: remove example.go 2014-12-24 21:41:31 -08:00
Jonathan Boulle 4e6cbc937e *: change remaining 0.5 references -> 2.0 2014-12-18 16:14:42 -08:00
Xiang Li c27c288bef etcdserver: update stats when become leader 2014-12-15 17:02:48 -08:00
Xiang Li 04522baeee etcdserver: fix leader stats 2014-12-15 16:50:03 -08:00
Xiang Li 53bf7e4b5e wal: rename openAtIndex -> open; OpenAtIndexUntilUsing -> openNotInUse 2014-12-14 19:33:06 -08:00
Xiang Li ea94d19147 *: lock the in using files; do not purge locked the wal files 2014-12-14 19:27:22 -08:00
Yicheng Qin dcf34c0ab4 Merge pull request #1938 from yichengq/262
etcdserver: protect the sender map in SendHub
2014-12-15 10:41:52 +08:00
Yicheng Qin ceb077424d etcdserver: protect the sender map in SendHub 2014-12-15 10:37:41 +08:00
Xiang Li d86603840d rafthttp: better logging 2014-12-14 09:50:59 -08:00
Xiang Li 4724cbbe2c etcdserver: one line 2014-12-11 22:17:36 -08:00
Xiang Li 935f7128a9 etcdserver: move stats inferface to stats pkg 2014-12-11 22:14:05 -08:00
Barak Michener 5f16fab541 Merge pull request #1915 from barakmich/1834
Return Unknown instead of NotExist
2014-12-11 13:49:26 -05:00
Barak Michener cf7690cb51 detect more cases of empty directories and actual errors 2014-12-11 13:37:32 -05:00
Xiang Li c26542b7f2 Merge pull request #1913 from xiang90/lazy_snap_dir
etcdserver: create snap dir until start the node
2014-12-11 09:39:51 -08:00
Xiang Li 836ccabad2 etcdserver: create snap dir until start the node 2014-12-11 09:25:18 -08:00
Yicheng Qin 4777cba995 Merge pull request #1898 from robszumski/improve-logging
Improve logging for etcdserver and rafthttp
2014-12-10 10:46:47 -08:00
Nikhil Sarda a852936a59 etcdserver: removed an unhelpful test failure message
this commit changes instances of "blah" in a test to more
descriptive messages
2014-12-09 21:45:50 -08:00
Rob Szumski 13f3158728 etcdserver: improve discovery ignore warning 2014-12-09 15:57:25 -08:00
Xiang Li e4c0f5c1a8 Merge pull request #1895 from xiang90/snap_nodes
etcd: update conf when apply the confChange entry
2014-12-09 11:45:01 -08:00
Xiang Li a5efbf826d raft: drop nodes in softState 2014-12-09 11:43:52 -08:00
Xiang Li 29d7a2a558 etcd: update conf when apply the confChange entry 2014-12-08 23:37:07 -08:00
Yicheng Qin 4804c45e14 raft: set raft.Commit too when setting raftLog.committed 2014-12-08 22:35:55 -08:00
Yicheng Qin 9c8f5c9535 Merge pull request #1891 from yichengq/257
etcdserver: init state before run loop correctly
2014-12-08 16:38:33 -08:00
Yicheng Qin 13814c9d7d etcdserver: init state before run loop correctly 2014-12-08 16:13:16 -08:00
Yicheng Qin 7e06d85651 etcdserver: apply entries when it is not empty
Or it updates appliedi wrongly.
2014-12-08 15:56:38 -08:00
Yicheng Qin 71f3b80fbe etcdserver: check recovery error when new server 2014-12-08 14:55:23 -08:00
Yicheng Qin 8c338ffcc7 etcdserver: correct the log about recovering from snapshot 2014-12-08 14:51:42 -08:00
Yicheng Qin 771ff4589d etcdserver: not add self into sendhub when new server 2014-12-05 00:18:40 -08:00
Yicheng Qin 1d1c2ff834 Merge pull request #1841 from yichengq/246
etcdserver: close storage when stop
2014-12-04 15:36:24 -08:00
Yicheng Qin a7bc03b42b etcdserver: close storage when stop 2014-12-04 15:16:22 -08:00
Xiang Li 88e2fab572 Merge pull request #1859 from xiang90/pause_test
*: add pauseMember test
2014-12-04 15:11:59 -08:00
Veres Lajos 3de2ab2c04 *: typofixes
https://github.com/vlajos/misspell_fixer
2014-12-04 22:51:19 +00:00
Xiang Li 151f043414 *: add pauseMember test 2014-12-04 14:22:43 -08:00
Yicheng Qin fa292391d8 etcdserver: close idle connections when stop sendhub 2014-12-02 00:08:47 -08:00
Xiang Li 7beac083ff Merge pull request #1810 from xiang90/purge
*: support purging old wal/snap files
2014-12-01 12:05:05 -08:00
Xiang Li d3db010190 *: support purging old wal/snap files 2014-12-01 11:50:17 -08:00
Xiang Li bc5acd3c42 etcdserver: log snapshot event 2014-11-30 12:10:20 -08:00
Xiang Li e23f9e76d1 raft: do not applysnapshot in raft 2014-11-26 10:59:13 -08:00
Xiang Li 9df0e7715d raft: do not panic on out of date compaction 2014-11-25 15:14:39 -08:00
Xiang Li 01cbcce8ba etcdserver: do not applySnapshot twice 2014-11-25 14:53:49 -08:00
Xiang Li 74d8c7f457 etcdserver: cleanup main loop 2014-11-25 14:38:18 -08:00
Yicheng Qin 7e6e305c4f Merge branch 'log_interface'
Conflicts:
	raft/raft.go
2014-11-25 14:22:11 -08:00
Yicheng Qin a13d5a70ff etcdserver: save snapshot before entries 2014-11-25 12:39:15 -08:00
Owen Smith c67b937d62 etcdserver: truncate WAL from correct index when forcing new cluster
When loading from a backup with a snapshot and WAL, the length of WAL entries
can be lower than the current index integer value, causing a panic when
slicing off uncommitted entries. This looks for WAL entries higher than
the current index before slicing.
2014-11-25 16:46:56 +00:00
Yicheng Qin 54e1237271 etcdserver: panic when snapshot on raft storage
Snapshot on raft storage should always succeed. If there is an error, it must
be internal fault and needs stack info to debug.
2014-11-24 21:22:49 -08:00
Yicheng Qin 1b038da18a etcdserver: init snapi when init appliedi 2014-11-24 21:19:30 -08:00
Yicheng Qin bd9e93eeea etcdserver: remove finished TODO for raftStorage.Compact 2014-11-24 21:10:53 -08:00
Yicheng Qin 185d37c333 etcdserver: not load dummy entry from the wal 2014-11-24 20:51:04 -08:00
Xiang Li d69e4dbe6d etcdserver: initial index to 1 2014-11-24 14:57:08 -08:00
Xiang Li 453133977d etcdserver: save snapshot only if the index is greater than previous snap index 2014-11-24 14:47:59 -08:00
Xiang Li 4b7af29c37 etcdserver: fix TriggerSnap test.
Sleep for millisecond to allow the server to apply the first nop and
first put separately.
2014-11-24 14:47:49 -08:00
Xiang Li 08f156a1de etcdserver: remove extra empty line in snapshot func 2014-11-24 10:27:18 -08:00
Ben Darnell 0d680d0e6b Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  rafthttp: fix import
  raft: should not decrease match and next when handling out of order msgAppResp
  Fix migration to allow snapshots to have the right IDs
  add snapshotted integration test
  fix test import loop
  fix import loop, add set to types, and fix comments
  etcdserver: autodetect v0.4 WALs and upgrade them to v0.5 automatically
  wal: add a bench for write entry
  rafthttp: add streaming server and client
  dep: use vendored imports in codegangsta/cli
  dep: bump golang.org/x/net/context

Conflicts:
	etcdserver/server.go
	etcdserver/server_test.go
	migrate/snapshot.go
2014-11-21 15:40:11 -05:00
Ben Darnell 30690d15d9 Re-enable a few tests I had missed.
Fix integration test for the change to log entry zero.

Increase test timeouts since integration tests often take
longer than 10s for me.
2014-11-21 15:27:17 -05:00
Brian Waldon c0fb1c8a00 Merge pull request #1755 from bcwaldon/golang.org-deps
Switch to golang.org/x/net/context
2014-11-20 16:26:14 -08:00
Barak Michener 59a0c64e9f fix import loop, add set to types, and fix comments 2014-11-20 15:38:08 -05:00
Barak Michener 78ea3335bf etcdserver: autodetect v0.4 WALs and upgrade them to v0.5 automatically 2014-11-20 15:38:08 -05:00
Yicheng Qin 9d53b94546 rafthttp: add streaming server and client 2014-11-20 11:34:50 -08:00
Brian Waldon 9a728a127a dep: bump golang.org/x/net/context
Move from code.google.com/p/go.net/context to
golang.org/x/net/context before bumping to latest.
2014-11-20 10:19:12 -08:00
Ben Darnell b29240baf0 Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  scripts: build-docker tag and use ENTRYPOINT
  scripts: build-release add etcd-migrate
  create .godir
  raft: optimistically increase the next if the follower is already matched
  raft: add handleHeartbeat handleHeartbeat commits to the commit index in the message. It never decreases the commit index of the raft state machine.
  rafthttp: send takes raft message instead of bytes
  *: add rafthttp pkg into test list
  raft: include commitIndex in heartbeat
  rafthttp: move server stats in raftHandler to etcdserver
  *: etcdhttp.raftHandler -> rafthttp.RaftHandler
  etcdserver: rename sender.go -> sendhub.go
  *: etcdserver.sender -> rafthttp.Sender

Conflicts:
	raft/log.go
	raft/raft_paper_test.go
2014-11-19 17:05:16 -05:00
Ben Darnell 355ee4f393 raft: Integrate snapshots into the raft.Storage interface.
Compaction is now treated as an implementation detail of Storage
implementations; Node.Compact() and related functionality have been
removed. Ready.Snapshot is now used only for incoming snapshots.

A return value has been added to ApplyConfChange to allow applications
to track the node information that must be stored in the snapshot.

raftpb.Snapshot has been split into Snapshot and SnapshotMetadata, to
allow the full snapshot data to be read from disk only when needed.

raft.Storage has new methods Snapshot, ApplySnapshot, HardState, and
SetHardState. The Snapshot and HardState parameters have been removed
from RestartNode() and will now be loaded from Storage instead.
The only remaining difference between StartNode and RestartNode is that
the former bootstraps an initial list of Peers.
2014-11-19 16:40:26 -05:00
Yicheng Qin 1a72143ecb rafthttp: send takes raft message instead of bytes
This gives streaming mechanism the chance to assemble and disassemble
raft messages.
2014-11-17 22:39:53 -08:00
Yicheng Qin a2c568a144 Merge pull request #1669 from yichengq/215
*: add rafthttp as a separate package
2014-11-17 16:14:59 -08:00
Yicheng Qin f24e214ee5 rafthttp: move server stats in raftHandler to etcdserver 2014-11-17 16:02:20 -08:00
Yicheng Qin 5dc5f8145c *: etcdhttp.raftHandler -> rafthttp.RaftHandler 2014-11-17 15:52:24 -08:00
Yicheng Qin 3fcc011717 etcdserver: rename sender.go -> sendhub.go 2014-11-17 15:35:15 -08:00
Yicheng Qin 84fbf7aab5 *: etcdserver.sender -> rafthttp.Sender 2014-11-17 15:35:10 -08:00
Ben Darnell 300c5a2001 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master: (21 commits)
  etcdserver: refactor ValidateClusterAndAssignIDs
  integration: add integration test for remove member
  integration: add test for member restart
  version: bump to alpha.3
  etcdserver: add buffer to the sender queue
  *: gracefully stop etcdserver
  Fix up migration tool, add snapshot migration
  etcd4: migration from v0.4 -> v0.5
  etcdserver: export Member.StoreKey
  etcdserver: recover cluster when receiving newer snapshot
  etcdserver: check and select committed entries to apply
  etcdserver: recover from snapshot before applying requests
  raft: not set applied when restored from snapshot
  sender: support elegant stop
  etcdserver: add StopNotify
  etcdserver: fix TestDoProposalStopped test
  etcdserver: minor cleanup
  etcdserver: validate new node is not registered before in best effort
  etcdserver: fix server.Stop()
  *: print out configuration when necessary
  ...

Conflicts:
	etcdserver/server.go
	etcdserver/server_test.go
	raft/log.go
2014-11-17 18:28:24 -05:00
Xiang Li e04e4632b3 Merge pull request #1736 from xiang90/verify
etcdserver: refactor ValidateClusterAndAssignIDs
2014-11-17 14:34:59 -08:00
Xiang Li 0541f0afa0 etcdserver: refactor ValidateClusterAndAssignIDs 2014-11-17 14:23:37 -08:00
Ben Darnell 64d9bcabf1 Add Storage.Term() method and hide the first entry from other methods.
The first entry in the log is a dummy which is used for matchTerm
but may not have an actual payload. This change permits Storage
implementations to treat this term value specially instead of
storing it as a dummy Entry.

Storage.FirstIndex() no longer includes the term-only entry.

This reverses a recent decision to create entry zero as initially
unstable; Storage implementations are now required to make
Term(0) == 0 and the first unstable entry is now index 1.
stableTo(0) is no longer allowed.
2014-11-17 16:54:12 -05:00
Xiang Li c26de66262 integration: add integration test for remove member 2014-11-17 13:28:09 -08:00
Xiang Li 7c4b84a6cd etcdserver: add buffer to the sender queue 2014-11-14 15:18:16 -08:00
Xiang Li 8bf71d796e *: gracefully stop etcdserver 2014-11-14 14:12:24 -08:00
Barak Michener 192f200d9e Fix up migration tool, add snapshot migration
Fixes all updates since bcwaldon sketched the original, with cleanup and
into an acutal working state. The commit log follows:

fix pb reference and remove unused file post rebase

unbreak the migrate folder

correctly detect node IDs

fix snapshotting

Fix previous broken snapshot

Add raft log entries to the translation; fix test for all timezones. (Still in progress, but passing)

Fix etcd:join and etcd:remove

print more data when dumping the log

Cleanup based on yichengq's comments

more comments

Fix the commited index based on the snapshot, if one exists

detect nodeIDs from snapshot

add initial tool documentation and match the semantics in the build script and main

formalize migration doc

rename function and clarify docs

fix nil pointer

fix the record conversion test

add migration to test suite and fix govet
2014-11-14 16:46:08 -05:00
Brian Waldon 5ea1f2d96f etcd4: migration from v0.4 -> v0.5 2014-11-14 15:57:26 -05:00
Brian Waldon c36abeabd1 etcdserver: export Member.StoreKey 2014-11-14 15:57:26 -05:00
Yicheng Qin b6887e4a83 Merge pull request #1719 from yichengq/228
etcdserver: recover snapshot before applying committed entries
2014-11-14 12:18:41 -08:00
Yicheng Qin 77433ff6da etcdserver: recover cluster when receiving newer snapshot 2014-11-14 12:11:21 -08:00
Yicheng Qin dfaa7290c4 etcdserver: check and select committed entries to apply 2014-11-14 12:11:16 -08:00
Yicheng Qin f6a7f96967 etcdserver: recover from snapshot before applying requests 2014-11-14 12:08:39 -08:00
Yicheng Qin 12dba7d413 sender: support elegant stop 2014-11-13 17:44:36 -08:00
Xiang Li e66bda957b Merge pull request #1714 from xiang90/stop
StopNotify
2014-11-13 15:16:52 -08:00
Xiang Li 6a1fe00615 Merge pull request #1704 from xiang90/print_config
*: print out configuration when necessary
2014-11-13 14:35:50 -08:00
Yicheng Qin 11f392bdc8 Merge pull request #1708 from yichengq/223
etcdserver: validate new node is not registered before in best effort
2014-11-13 14:30:40 -08:00
Xiang Li b5d480f17a etcdserver: add StopNotify 2014-11-13 14:16:48 -08:00
Xiang Li 978d0f1ca1 etcdserver: fix TestDoProposalStopped test
We start etcd server in this test without the cluster. Sometimes it panics when
accessing the cluster. Most of the time it does not panic, since we can stop the
server fast enough before applying the first configuration change entry.
2014-11-13 14:08:59 -08:00
Xiang Li fb344bc33f etcdserver: minor cleanup 2014-11-13 14:01:56 -08:00
Xiang Li 813ff6ba48 Merge pull request #1713 from xiang90/stop
etcdserver: fix server.Stop()
2014-11-13 13:58:07 -08:00
Yicheng Qin ac907d746b etcdserver: validate new node is not registered before in best effort 2014-11-13 13:56:11 -08:00
Xiang Li 30dfdb0ea9 etcdserver: fix server.Stop()
Stop should be idempotent. It should simply send a stop signal to the server.
It is the server's responsibility to stop the go-routines and related components.
2014-11-13 13:47:12 -08:00
Ben Darnell 39eddd8565 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master:
  etcdserver: add sender tests
  raft: Only call stableTo when we have ready entries or a snapshot.
  etcdserver: add ID() function to the Server interface.
  sender: use RoundTripper instead of Client in sender
2014-11-13 15:50:08 -05:00
Yicheng Qin 9716def94b Merge pull request #1700 from yichengq/222
etcdserver: add sender tests
2014-11-13 12:37:29 -08:00
Yicheng Qin d89bf9f215 etcdserver: add sender tests 2014-11-13 12:29:25 -08:00
Xiang Li d6f40acc86 etcdserver: add ID() function to the Server interface. 2014-11-13 11:37:06 -08:00
Ben Darnell b29c512f50 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master: (27 commits)
  pkg/wait: move wait to pkg/wait
  etcdserver: do not add/remove/update local member to/from sender hub
  etcdserver: not record attributes when add member
  raft: add a test for proposeConfChange
  raft: block Stop() on n.done, support idempotency
  raft: add a test for node proposal
  integration: add increase cluster size test
  integration: remove unnecessary t.Testing argument
  raft: stop the node synchronously
  integration: fix test to propagate NewServer errors
  etcdserver: move peer URLs check to config
  etcdserver: ensure initial-advertise-peer-urls match initial-cluster
  raft: add a test for node.Tick
  raft: add comment string for TestNodeStart
  etcdserver: use member instead of node at etcd level
  raft: nodes return sorted ids
  raft: update unstable when calling stableTo with 0
  *: support updating advertise-peer-url Users might want to update the peerurl of the etcd member in several cases. For example, if the IP address of the physical machine etcd running on is changed, user need to update the adversite-pee-rurl accordingly. This commit makes etcd support updating the advertise-peer-url of its members.
  transport: create a tls listener only if the tlsInfo is not empty and the scheme is HTTPS
  etcdserver: use member pointer for all tests
  ...

Conflicts:
	etcdserver/server.go
	raft/log.go
	raft/log_test.go
	raft/node.go
2014-11-13 14:21:09 -05:00
Xiang Li 92096dfdc3 *: print out configuration when necessary 2014-11-13 10:46:42 -08:00