Commit Graph

1062 Commits (030865abe3327815db0586e03198dc5782c1f3ed)

Author SHA1 Message Date
Anthony Romano 7e00325fe9 etcdserver: refactor server.go select loop
splits out the apply case into smaller functions
2015-12-22 12:13:39 -08:00
Gyu-Ho Lee 0ff822bf22 etcdserver/auth: fix shadowed variables from go tool
Fixes for https://github.com/coreos/etcd/issues/3954.
2015-12-12 09:20:26 -08:00
Gyu-Ho Lee 40b11038f2 etcdserver: fixes shadowed variables for go tool vet
Fix for https://github.com/coreos/etcd/issues/3954.
2015-12-12 09:13:12 -08:00
Xiang Li c4cbaf5c2a etcdsever: swap kv pointer atomically 2015-12-10 17:29:36 -08:00
Xiang Li 23bd60ccce *: rewrite snapshot sending 2015-12-08 18:21:21 -08:00
Xiang Li 0708a5e50d etcdserver: refactor a for loop in recvSnap test 2015-12-02 15:41:03 -08:00
Xiang Li 67ffeee521 Merge pull request #3946 from xiang90/fix_snap_test
etcdserver: get rid of unreliable WaitSchedule
2015-12-02 15:01:20 -08:00
Xiang Li 3ec3ffbef0 etcdserver: get rid of unreliable WaitSchedule
In this case, we know we are waiting for an action happened on
storage. We can do a busy wait instead of calling waitSchedule.

The test previously failed on CI with no observed actions.
2015-12-02 13:18:11 -08:00
ngaut e142e073e8 v3rpc: Tiny clean up
unreachable code
2015-12-02 12:30:21 +08:00
Barak Michener 452e5bffc0 etcdserver: Fix panic for v3 transaction compares on non-existent keys
Fixes #3920
2015-11-24 16:49:45 -05:00
Xiang Li b868f4b1b1 v3rpc: report eventReceived correctly 2015-11-19 22:44:46 -08:00
Xiang Li 3cf90a4dff v3rpc: do not send closing event
When a watch stream closes, both of the watcher.Chan and closec
will be closed.

If watcher.Chan is closed, we should not send out the empty event.
Sending the empty is wrong and waste a lot of CPU resources.
Instead we should just return.
2015-11-19 17:56:15 -08:00
Xiang Li c400d05d0a Merge pull request #3892 from xiang90/fix_snapshot_handling
etcdserver: handle incoming v3 snapshot correctly
2015-11-19 12:18:18 -08:00
Xiang Li a07e4bb6e2 etcdserver: handle incoming v3 snapshot correctly
1. we should update all kv reference (including the
on in snapStore).

2. we should first restore a new KV and then close
the old one asynchronously.
2015-11-18 16:07:41 -08:00
Gyu-Ho Lee 81229dbea9 *: add missing package descriptions
This adds and updates package descriptions in etcd projects.
And also deletes some duplicate LICENSE statements.
2015-11-17 20:54:10 -08:00
Xiang Li b4abe5b584 etcdserver: start real txn for txn request
We should open real txn for applying txn requests. Or the intermediate
state might be observed by reader.

This also fixes #3803. Same consistent(raft) index per multiple indenpendent
operations confuses consistentStore.
2015-11-16 21:11:27 -08:00
Xiang Li a1616afc5d auth: use canonical path for pre-defined guest role 2015-11-15 17:58:09 -08:00
Xiang Li ff36b9d9bc Merge pull request #3700 from xiang90/metrics_hi
Replace Summary with Histogram for all metrics
2015-11-10 10:06:45 -08:00
Xiang Li ae2f69b41e etcdserver: rename processInternalRaftReq to processInternalRaftRequest
We have a structure called InternalRaftRequest. Making the function
shorter by calling it processInternalRaftReq seems to be random and
reduce the readability. So we just use the full name.
2015-11-09 13:37:36 -08:00
Hitoshi Mitake 2c8ffa6bcb etcdserver: correct error log for strict reconfig checking
This commit fixes an error log caused by the strict reconfig checking
option.

Before:
14:21:38 etcd2 | 2015-11-05 14:21:38.870356 E | etcdhttp: got unexpected response error (etcdserver: re-configuration failed due to not enough started members)

After:
log
13:27:33 etcd2 | 2015-11-05 13:27:33.089364 E | etcdhttp: etcdserver: re-configuration failed due to not enough started members

The error is not an unexpected thing therefore the old message is
incorrect.
2015-11-06 11:03:42 +09:00
Yicheng Qin 0874c44cdc etcdserver: fix snapshot index in creation log line
The snapshot is created at appliedi instead of snapi.
2015-11-05 14:02:09 -08:00
Xiang Li 08f0d94019 Merge pull request #3809 from xiang90/rpc_kv
*: refactor kv rpc implementation
2015-11-04 19:05:48 -08:00
Yicheng Qin ec3c2d23a3 *: update feature maps to adopt v2.3.0 2015-11-04 14:30:35 -08:00
Yicheng Qin 3d15526c35 Merge pull request #3796 from yichengq/fix-get-version
etcdserver: not reuse connections for peer transport
2015-11-04 11:39:14 -08:00
Xiang Li c37bd2385a *: refactor kv rpc implementation 2015-11-04 11:36:17 -08:00
Yicheng Qin 4ccbcb91c8 rafthttp: add functions to create listener and roundTripper
This moves the code to create listener and roundTripper for raft communication
to the same place, and use explicit functions to build them. This prevents
possible development errors in the future.
2015-11-04 11:12:46 -08:00
Yicheng Qin 32819f6b3f etcdserver: use roundTripper to request peerURL
It uses roundTripper instead of Transport because roundTripper is
sufficient for its requirements.
2015-11-04 10:49:42 -08:00
Xiang Li 1a3f7f7fa4 *: rename etcd service to kv service in gRPC 2015-11-04 10:05:49 -08:00
Xiang Li 10de2e6dbe *: serve watch service
Implement watch service and hook it up
with grpc server in etcdmain.
2015-11-03 15:58:34 -08:00
Xiang Li c160085f44 *: add v3 watch service 2015-11-03 14:21:24 -08:00
Yicheng Qin 0eee88a3d9 etcdserver: use timeout transport as peer transport
This pairs with remote timeout listeners.

etcd uses timeout listener, and times out the accepted connections
if there is no activity. So the idle connections may time out easily.
Becaus timeout transport doesn't reuse connections, it prevents using
timeouted connection.

This fixes the problem that etcd fail to get version of peers.
2015-11-03 07:58:03 -08:00
Xiang Li fe165de1d1 Merge pull request #3794 from yichengq/fix-proxy-term
etcdmain: fix parsing discovery error
2015-11-02 17:33:47 -08:00
Yicheng Qin 9757dcd3a2 etcdmain: fix parsing discovery error
The discovery error is wrapped into a struct now, and cannot be compared
to predefined errors. Correct the comparison behavior to fix the
problem.
2015-11-02 17:23:06 -08:00
Jonathan Boulle ee522025b3 etcdserver: restructure auth.Store and auth.User
This attempts to decouple password-related functions, which previously
existed both in the Store and User structs, by splitting them out into a
separate interface, PasswordStore.  This means that they can be more
easily swapped out during testing.

This also changes the relevant tests to use mock password functions
instead of the bcrypt-backed implementations; as a result, the tests are
much faster.

Before:
```
	github.com/coreos/etcd/etcdserver/auth		31.495s
	github.com/coreos/etcd/etcdserver/etcdhttp	91.205s
```

After:
```
	github.com/coreos/etcd/etcdserver/auth		1.207s
	github.com/coreos/etcd/etcdserver/etcdhttp	1.207s
```
2015-10-30 16:33:40 -07:00
Yicheng Qin 7d757bbc8a etcdserver: extend wait timeout in TestPublishRetry
It fixes the failure in semaphore CI:
```
--- FAIL: TestPublishRetry (0.00s)
		server_test.go:1108: len(action) = 1, want >= 2
```
2015-10-28 12:07:00 -07:00
Yicheng Qin 099d8674c4 Merge pull request #3746 from yichengq/load-storage
etcdserver: fix recovering snapshot from disk
2015-10-27 14:42:41 -07:00
Yicheng Qin 263b270708 etcdserver: commit v3 storage before releasing WAL
This ensures that v3 storage could always find the following log entries
when restart.
2015-10-26 21:06:08 -07:00
Xiang Li a8e6e71bf9 *: fix various data races detected by race detector 2015-10-26 20:49:37 -07:00
Yicheng Qin 15ed6d8268 etcdserver: save consistent index into v3 storage
This helps to recover consistent index when restart in the future.
2015-10-24 09:27:24 -07:00
Yicheng Qin cacc0d6432 etcdserver: restore KV snapshot when receiving snapshot
When a slow follower receives the snapshot sent from the leader, it
should rename the snapshot file to the default KV file path, and
restore KV snapshot.

Have tested it manually and it works pretty well.
2015-10-23 08:43:26 -07:00
Yicheng Qin de669be6d6 Merge pull request #3683 from yichengq/raft-block
etcdserver: fix raft state machine may block
2015-10-20 09:44:34 -07:00
Yicheng Qin ab5df57ecf etcdserver: fix raft state machine may block
When snapshot store requests raft snapshot from etcdserver apply loop,
it may block on the channel for some time, or wait some time for KV to
snapshot. This is unexpected because raft state machine should be unblocked.

Even worse, this block may lead to deadlock:
1. raft state machine waits on getting snapshot from raft memory storage
2. raft memory storage waits snapshot store to get snapshot
3. snapshot store requests raft snapshot from apply loop
4. apply loop is applying entries, and waits raftNode loop to finish
messages sending
5. raftNode loop waits peer loop in Transport to send out messages
6. peer loop in Transport waits for raft state machine to process message

Fix it by changing the logic of getSnap to be asynchronously creation.
2015-10-20 09:19:34 -07:00
Hitoshi Mitake 1b0c65c299 etcdserver: don't allow methods other than GET in /debug/vars
Currently, /debug/vars seems to allow all types of methods e.g. PUT,
POST, etc. However, this path is a readonly stuff so it should allow
GET only.
2015-10-20 17:19:42 +09:00
Xiang Li 32dd4d5de3 Merge pull request #3657 from xiang90/fix_remove
etcdserver: skip updating attr if the member does not exist
2015-10-19 13:35:57 -07:00
Xiang Li d90a47656e etcdserver: use Histogram for proposal_durations 2015-10-17 12:48:25 -07:00
Yicheng Qin 1f21ccf166 rafthttp: support sending v3 snapshot message
Use snapshotSender to send v3 snapshot message. It puts raft snapshot
message and v3 snapshot into request body, then sends it to the target peer.
When it receives http.StatusNoContent, it knows the message has been
received and processed successfully.

As receiver, snapHandler saves v3 snapshot and then processes the raft snapshot
message, then respond with http.StatusNoContent.
2015-10-13 23:11:28 -07:00
Yicheng Qin 207c92b627 rafthttp: build transport inside pkg instead of passed-in
rafthttp has different requirements for connections created by the
transport for different usage, and this is hard to achieve when giving
one http.RoundTripper. Pass into pkg the data needed to build transport
now, and let rafthttp build its own transports.
2015-10-11 21:42:37 -07:00
Yicheng Qin 233e717e2f rafthttp: expose struct to set configuration
transport takes too many arguments and the new function is unable to
read. Change the way to set fields in transport struct directly.
2015-10-11 09:02:16 -07:00
Xiang Li 98e30ca7c2 etcdserver: skip updating attr if the member does not exist 2015-10-08 14:07:16 -07:00
Yicheng Qin f74ff9b867 Merge pull request #3644 from mitake/test-race
etcdserver, test: don't access testing.T in time.AfterFunc()'s own go…
2015-10-07 08:34:58 -07:00
Hitoshi Mitake 68dd3ee621 etcdserver, test: don't access testing.T in time.AfterFunc()'s own goroutine
time.AfterFunc() creates its own goroutine and calls the callback
function in the goroutine. It can cause datarace like the problem
fixed in the commit de1a16e0f1 . This
commit also fixes the potential dataraces of tests in
etcdserver/server_test.go .
2015-10-06 11:37:08 +09:00
Yicheng Qin 8c94ae0ee3 etcdserver: get existing snapshot instead of requesting one
This fixes the problem that proposal cannot be applied.

When start the etcdserver.run loop, it expects to get the latest
existing snapshot. It should not attempt to request one because the loop
is the entity to create the snapshot.
2015-10-05 14:32:16 -07:00
Yicheng Qin 36f4303fc3 storage/etcdserver: update KV.Snapshot function
When using Snapshot function, it is expected:
1. know the size of snapshot before writing data
2. split snapshot-ready phase and write-data phase. so we could cut
snapshot first and write data later.

Update its interface to fit the requirement of etcdserver.
2015-10-03 10:15:23 -07:00
Yicheng Qin 8c0db94fef Merge pull request #3631 from yichengq/create-snapshot
etcdserver: support to create raft snapshot at apply loop
2015-10-03 10:03:27 -07:00
Yicheng Qin 18c568bc82 etcdserver: print out correct restored cluster info
Before this PR, it always prints nil because cluster info has not been
covered when print:

```
2015-10-02 14:00:24.353631 I | etcdserver: loaded cluster information
from store: <nil>
```
2015-10-02 16:11:32 -07:00
Yicheng Qin bfe9502f4f etcdserver: support to create raft snapshot at apply loop
and snapStore could trigger it to create the latest raft snapshot.
2015-10-02 13:17:56 -07:00
Yicheng Qin ccce61bda9 Merge pull request #3614 from yichengq/snapshot-store
etcdserver: add snapshotStore and raftStorage
2015-10-01 19:35:34 -07:00
Yicheng Qin 2276328720 etcdserver: add snapshotStore and raftStorage
snapshotStore is the store of snapshot, and it supports to get latest snapshot
and save incoming snapshot.

raftStorage supports to get latest snapshot when v3demo is open.
2015-10-01 19:00:59 -07:00
Xiang Li 715fdfb669 Merge pull request #3093 from mwitkow-io/feature/httpd_metrics
add `events` metrics in etcdhttp.
2015-10-01 12:10:58 -07:00
Michal Witkowski 1b2dc1c796 metrics: add `events` metrics in etcdhttp. 2015-10-01 08:11:42 +01:00
Yicheng Qin a535cf2cad Merge pull request #3610 from yichengq/load-storage
etcdserver: restore v3 storage when restart
2015-09-29 11:58:38 -07:00
Yicheng Qin 49d262185d Merge pull request #3590 from yichengq/discovery-log
etcdmain: improve log when join discovery fails
2015-09-29 08:02:18 -07:00
Yicheng Qin 5d906a0acc etcdserver: restore v3 storage when restart
To load the previous data.
2015-09-29 00:14:27 -07:00
Yicheng Qin 939aa96a34 etcdmain: improve log when join discovery fails
Before this PR, the log is

```
2015/09/1 13:18:31 etcdmain: client: etcd cluster is unavailable or
misconfigured
```

It is quite hard for people to understand what happens.

Now we print out the exact reason for the failure, and explains the way
to handle it.
2015-09-28 23:23:50 -07:00
Xiang Li 6c05a01ec6 Merge pull request #3604 from gyuho/replace_netutil_BasicAuth
etcdhttp/auth: BasicAuth method in standard pkg
2015-09-28 15:55:46 -07:00
Gyu-Ho Lee e16f81838b etcdhttp/auth: BasicAuth method in standard pkg
I created a new PR from https://github.com/coreos/etcd/pull/3598.
This is for `TODO: use the standard lib BasicAuth method when we move to
Go 1.4.` [1]. `BasicAuth` method got into Go standard package a year ago. [2]

---
1. https://github.com/coreos/etcd/blob/master/pkg/netutil/netutil.go#L126-L138
2. https://codereview.appspot.com/76540043/
2015-09-28 14:02:55 -07:00
Xiang Li 1226838381 etcdhttp: add Content-Type: application/json header to version handler 2015-09-25 15:14:13 -07:00
Gyu-Ho Lee 85f4475f62 httptypes/errors: HTTPError.WriteTo returns error
Squashing all commits into this one
(from https://github.com/coreos/etcd/pull/357).

Thanks,
2015-09-25 08:06:26 -07:00
Xiang Li 2540a3fb7e etcdsever: mismatch error uses the same format as the corresponding flags 2015-09-21 19:32:10 -07:00
Xiang Li ea3dbfed60 Merge pull request #3408 from MSamman/extend-auth-api
etcdserver: extend auth api
2015-09-21 11:51:19 -07:00
Xiang Li 3b70bf87c3 etcdmain: better logging when user forget to set initial flags 2015-09-21 10:43:26 -07:00
Mohammad Samman 6ae1f6c6e4 etcdserver: extend auth api
allow recursive query on users and roles to get more detail

Fixes #3278
2015-09-21 00:51:18 -07:00
Yicheng Qin cedad49dcf Merge pull request #3543 from mitake/reconfig-remove
etcdserver: forbid removing started member if quorum cannot be preserved in strict reconfig mode
2015-09-17 18:22:53 -07:00
Hitoshi Mitake f8859a980d etcdserver: forbid removing started member if quorum cannot be preserved in strict reconfig mode
Like the commit 6974fc63ed, this commit lets etcdserver forbid
removing started member if quorum cannot be preserved after
reconfiguration if the option -strict-reconfig-check is passed to
etcd. The removal can cause deadlock if unstarted members have wrong
peer URLs.
2015-09-18 10:09:57 +09:00
Xiang Li ec4142576e Merge pull request #3534 from xiang90/grpc_err
etcdserver: better v3 api error handling
2015-09-16 12:32:28 -07:00
Jonathan Boulle 7848ac3979 *: add missing license headers 2015-09-15 14:09:01 -07:00
Xiang Li 94f4069a25 etcdserver: better v3 api error handling 2015-09-15 11:20:06 -07:00
Yicheng Qin 352cd768c6 etcdserver: fix shadow declaration 2015-09-14 23:25:16 -07:00
Yicheng Qin 05c74bd890 etcdserver: rename db file into a formal directory
and rename it to a formal name
2015-09-14 22:41:40 -07:00
Yicheng Qin 51f1ee055e Merge pull request #3526 from yichengq/snapshot
etcdserver: forbid to unset v3 demo once used
2015-09-14 21:36:39 -07:00
Yicheng Qin 1f0fb3d9aa etcdserver: forbid to unset v3 demo once used
After enabling v3 demo, it may change the underlying data organization
for v3 store. So we forbid to unset --experimental-v3demo once it has
been used.
2015-09-14 21:27:11 -07:00
Xiang Li 94f784826a *: support v3 compaction 2015-09-14 19:59:36 -07:00
Xiang Li e0d8923f7b Merge pull request #3524 from xiang90/grpc_error
etcdserver: use gRPC error instead of error message in header
2015-09-14 16:38:44 -07:00
Xiang Li 7183387110 etcdserver: use gRPC error instead of error message in header 2015-09-14 16:11:13 -07:00
Gyu-Ho Lee c2dcf7431e etcdserver, store: fix grammars in comments (a->an existing)
I found some grammatical errors in comments.

This pull request was submitted https://github.com/coreos/etcd/pull/3513.
I am resubmitting following the correct guidlines.
2015-09-14 13:41:13 -07:00
Xiang Li c7b4c67436 Merge pull request #3514 from xiang90/v3_raft
support clustered v3 api
2015-09-14 09:35:02 -07:00
Xiang Li 4c81615cef etcdserver: initial support for cluster-wide v3 request 2015-09-13 08:32:01 -07:00
Xiang Li 600456f4ba etcdserverpb: update proto file for raftInternalRequest
We needs to assign each raftInternalRequest an ID for getting
the response after it goes through raft.

We also needs an empty response for error case.
2015-09-13 08:28:10 -07:00
Hitoshi Mitake dad32646eb etcdserver: enhance test cases for isReadyToAddNewMember
- a case of a cluster with even number members
 - a case of an empty cluster
2015-09-13 12:30:10 +09:00
Jonathan Boulle d9cf752060 etcdserver: add test for isReadyToAddNewMember
Also fixed check for special case of one-member cluster
2015-09-13 11:16:08 +09:00
Hitoshi Mitake 6974fc63ed etcdserver: avoid deadlock caused by adding members with wrong peer URLs
Current membership changing functionality of etcd seems to have a
problem which can cause deadlock.

How to produce:
 1. construct N node cluster
 2. add N new nodes with etcdctl member add, without starting the new members

What happens:
After finishing add N nodes, a total number of the cluster becomes 2 *
N and a quorum number of the cluster becomes N + 1. It means
membership change requires at least N + 1 nodes because Raft treats
membership information in its log like other ordinal log append
requests.

Assume the peer URLs of the added nodes are wrong because of miss
operation or bugs in wrapping program which launch etcd. In such a
case, both of adding and removing members are impossible because the
quorum isn't preserved. Of course ordinal requests cannot be
served. The cluster would seem to be deadlock.

Of course, the best practice of adding new nodes is adding one node
and let the node start one by one. However, the effect of this problem
is so serious. I think preventing the problem forcibly would be
valuable.

Solution:
This patch lets etcd forbid adding a new node if the operation changes
quorum and the number of changed quorum is larger than a number of
running nodes. If etcd is launched with a newly added option
-strict-reconfig-check, the checking logic is activated. If the option
isn't passed, default behavior of reconfig is kept.

Fixes https://github.com/coreos/etcd/issues/3477
2015-09-13 09:31:53 +09:00
Xiang Li 95d5556445 etcdserver: refactor v3demo do 2015-09-05 15:31:28 -07:00
Xiang Li 3f18ded10a *: v3api index->revision 2015-09-04 10:41:20 -07:00
Xiang Li 2ac9af4924 *: replace consistent token with revision in v3 api 2015-09-03 15:41:33 -07:00
Xiang Li ef7cf058a2 *: update gogoproto 2015-09-03 15:32:25 -07:00
Tamir Duberstein 45390b9fb8 *: regenerate proto to use local import path
Using Go-style import paths in protos is not idiomatic. Normally, this
detail would be internal to etcd, but the path from which gogoproto
is imported affects downstream consumers (e.g. cockroachdb).

In cockroach, we want to avoid including `$GOPATH/src` in our protoc
include path for various reasons. This patch puts etcd on the same
convention, which allows this for cockroach.

More information: https://github.com/cockroachdb/cockroach/pull/2339#discussion_r38663417

This commit also regenerates all the protos, which seem to have
drifted a tiny bit.
2015-09-03 13:38:28 -04:00
Xiang Li a94118893c Merge pull request #3413 from xiang90/snapshot_dir
*: support wal dir
2015-09-01 10:03:50 -07:00
Xiang Li d94e712d91 *: support wal dir 2015-09-01 09:54:27 -07:00
Yicheng Qin f3bfcb9dee etcdserver: add timeout param on getClusterFromRemotePeers
It sets 10s timeout for public GetClusterFromRemotePeers.

This helps the following cases to work well in high latency scenario:

1. proxy sync members from the cluster
2. newly-joined member sync members from the cluster

Besides 10s request timeout, the request is also controlled by dial
timeout and read connection timeout.
2015-09-01 08:49:01 -07:00
Xiang Li 1bcaa9f4a1 etcdserver: ignore confChangeUpdateNode in getIDs 2015-08-31 09:36:39 -07:00
Yicheng Qin 92cd24d5bd *: fix govet shadow check failure 2015-08-27 14:15:30 -07:00
Yicheng Qin 8f6bf029f8 etcdserver: specify request timeout error due to connection lost
It specifies request timeout error possibly caused by connection lost,
and print out better log for user to understand.

It handles two cases:
1. the leader cannot connect to majority of cluster.
2. the connection between follower and leader is down for a while,
and it losts proposals.

log format:
```
20:04:19 etcd3 | 2015-08-25 20:04:19.368126 E | etcdhttp: etcdserver:
request timed out, possibly due to connection lost
20:04:19 etcd3 | 2015-08-25 20:04:19.368227 E | etcdhttp: etcdserver:
request timed out, possibly due to connection lost
```
2015-08-26 12:38:37 -07:00
Mohammad Samman e2e002f94e etcdserver: handle malformed basic auth
return insufficient credentials if basic auth header is malformed

Fixes #3280
2015-08-25 12:37:24 -07:00
Xiang Li e3ef1d363a Merge pull request #3366 from xiang90/v3_proto
update v3 proto and doc
2015-08-24 11:22:29 -07:00
Xiang Li 1cccbb5ebd etcdserverpb: add comments for compaction 2015-08-24 10:52:54 -07:00
Xiang Li 4a5b94478e etcdserverpb: update comment for txn request 2015-08-24 10:40:05 -07:00
Xiang Li 98ceb3cdbd etcdserverpb: add more field into rangeResponse 2015-08-24 10:33:20 -07:00
Cong Ding c09b667d57 *: fix go vet reported issues 2015-08-22 12:19:02 -05:00
Xiang Li 044b23c3ca Merge pull request #3356 from xiang90/travis
*: test gofmt with -s and fix reported issues
2015-08-21 18:59:51 -07:00
Xiang Li 6b23a8131f *: test gofmt with -s and fix reported issues 2015-08-21 18:52:16 -07:00
Yicheng Qin 8c0610d4f5 Merge pull request #3352 from yichengq/fix-name-url
fix that etcd fails to start if using both IP and hostname when discovery srv
2015-08-21 12:38:38 -07:00
Yicheng Qin 72462a72fb etcdserver: remove TODO to delete URLStringsEqual
Discovery SRV supports to compare IP addresses with domain names,
so we need URLStringsEqual function.
2015-08-21 09:52:17 -07:00
Yicheng Qin 8ea3d157c5 Revert "Revert "Treat URLs have same IP address as same""
This reverts commit 3153e635d5.

Conflicts:
	etcdserver/config.go
2015-08-21 09:41:13 -07:00
Xiang Li 11a689d063 etcdserver/auth: cache auth enable result 2015-08-20 23:05:00 -07:00
Yicheng Qin bcb4d5d53e Merge pull request #3311 from yichengq/request-timeout
extend hardcoded timeout for globally-deployed etcd cluster
2015-08-17 17:00:24 -07:00
Yicheng Qin 1375ef8985 etcdserver: remove getVersion timeout
The request can still time out because we have set dial timeout and
read/write timeout. It increases timeout expectation from 1s to 5s,
but it makes it workable in globally-deployer cluster.
2015-08-17 16:50:40 -07:00
Xiang Li d487cf6b63 etcdhttp:write etcderror for all errors in keyhandler 2015-08-17 15:51:29 -07:00
Yicheng Qin c530385d6d Merge pull request #3313 from yichengq/internal-timeout
etcdserver: use ReqTimeout only
2015-08-17 15:05:46 -07:00
Xiang Li af6d1d3d95 Merge pull request #3310 from xiang90/http_err
*: key handler should write auth error as etcd error
2015-08-17 14:57:19 -07:00
Yicheng Qin 2d5b95c49f etcdserver: use ReqTimeout only
We cannot refer RTT value from heartbeat interval, so CommitTimeout
is invalid. Remove it and use ReqTimeout instead.
2015-08-17 14:54:25 -07:00
Xiang Li 87f061bab2 *: key handler should write auth error as etcd error 2015-08-17 14:45:45 -07:00
Xiang Li 15e03d801f etcdserver: add version enforcement when setting cluster version 2015-08-17 11:12:39 -07:00
Xiang Li f199a484af *: only print out major.minor version for cluster version 2015-08-15 08:30:06 -07:00
Xiang Li bbcb38189c Merge pull request #3302 from xiang90/v
etcdserver: better version detection log output
2015-08-14 16:14:55 -07:00
Xiang Li 0076ab154b etcdserver: better version detection log output
Fix https://github.com/coreos/etcd/issues/3288
2015-08-14 16:08:33 -07:00
Xiang Li dd56b7e05e Merge pull request #3299 from xiang90/txn
initial support for txn
2015-08-14 16:05:16 -07:00
Xiang Li 9233fff48f etcdserver: support txn 2015-08-14 11:45:31 -07:00
Xiang Li 46865fa5a5 etcdserverpb: update proto 2015-08-14 11:45:07 -07:00
Yicheng Qin c229e6e655 etcdserver: improve error message when timeout due to leader fail 2015-08-13 15:46:21 -07:00
Yicheng Qin ceb27b1c48 etcdhttp: add auth capability in 2.2 2015-08-13 14:49:10 -07:00
Yicheng Qin 0fdb77aea2 etcdserver: go back to marshal request in 2.1 way
It fixes the problem that 2.1 cannot roll upgrade to 2.2 smoothly
because 2.1 cannot understand the bytes marshalled at 2.2.
2015-08-13 13:41:52 -07:00
Yicheng Qin 27170e67b9 etcdserver: specify timeout caused by leader election
Before this PR, the timeout caused by leader election returns:

```
14:45:37 etcd2 | 2015-08-12 14:45:37.786349 E | etcdhttp: got unexpected
response error (etcdserver: request timed out)
```

After this PR:

```
15:52:54 etcd1 | 2015-08-12 15:52:54.389523 E | etcdhttp: etcdserver:
request timed out, possibly due to leader down
```
2015-08-12 16:53:18 -07:00
Yicheng Qin c3d4d11402 etcdhttp: adjust request timeout based on config
It uses heartbeat interval and election timeout to estimate the
expected request timeout.

This PR helps etcd survive under high roundtrip-time environment,
e.g., globally-deployed cluster.
2015-08-12 09:22:59 -07:00
Yicheng Qin 5a91937367 etcdserver: adjust commit timeout based on config
It uses heartbeat interval and election timeout to estimate the
commit timeout for internal requests.

This PR helps etcd survive under high roundtrip-time environment,
e.g., globally-deployed cluster.
2015-08-11 21:09:03 -07:00
Xiang Li a718329ad3 Merge pull request #3248 from xiang90/v3
initial v3 demo
2015-08-10 13:59:03 -07:00
Brandon Philips fb1951204c etcdserver: move atomics to make etcd work on arm64
Follow the simple rule in the atomic package:

"On both ARM and x86-32, it is the caller's responsibility to arrange
for 64-bit alignment of 64-bit words accessed atomically. The first word
in a global variable or in an allocated struct or slice can be relied
upon to be 64-bit aligned."

Tested on a system with /proc/cpuinfo reporting:

processor       : 0
model name      : ARMv7 Processor rev 1 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3
tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc0d
CPU revision    : 1
2015-08-08 18:11:41 -07:00
Xiang Li 9ff7075ce8 etcdserver: use v3server interface 2015-08-08 10:39:04 -07:00
Xiang Li f004b4dac7 *: etcdserver supports v3 demo 2015-08-08 05:58:29 -07:00
Xiang Li 82afadbcc6 etcdserverpb: update proto 2015-08-08 05:31:35 -07:00
Xiang Li 845c51fedd *: fix typos vaild->valid 2015-08-07 10:57:11 -07:00
Yicheng Qin f03f048232 Merge pull request #3184 from yichengq/fast-bootstrap
etcdserver: tick ElectionTicks before starting when bootstrap new cluster
2015-08-06 15:54:40 -07:00
Yicheng Qin 21f5b885f2 etcdserver: fast election timeout when bootstrap cluster
The behavior accelarates the happen of the first-time leader election,
so the cluster could elect its leader fast. Technically, it could
help to reduce `electionMs - heartbeatMs` wait time for the first leader election.

Main usage:
1. Quick start for the local cluster when setting a little longer
election timeout
2. Quick start for the global cluster, which sets election timeout to
its maximum 50s.
2015-08-06 15:44:26 -07:00
Yicheng Qin a637e86372 Merge pull request #3220 from yichengq/fix-auth-check
etcdhttp: fix access check for multiple roles in auth
2015-08-06 15:09:04 -07:00
Xiang Li 58503817ec etcdserver: internal request union 2015-08-05 07:47:10 -07:00
Yicheng Qin 18169e896c etcdhttp: fix access check for multiple roles in auth
Check access for multiple roles should go through all roles.
2015-08-04 14:31:07 -07:00
Xiang Li 2b8abeb093 *: remove migration related stuff from 2.2 2015-08-01 19:37:20 +08:00
Barak Michener dd1a8fe330 etcdhttp: Improve test coverage surrounding auth 2015-07-30 14:21:08 -04:00
Xiang Li 80b794dccc Merge pull request #3185 from xiang90/add_debug_endpoint
etcdhttp: add config/local/debug endpoint
2015-07-30 08:46:07 +08:00
Xiang Li 4e31df2c2b etcdhttp: add config/local/log endpoint
PUT on the endpoint sets the GlobalDebugLevel to json level value.
The action overwrites the origianl log level setting from
users. We need to write doc to warn this.
2015-07-30 08:35:01 +08:00
Yicheng Qin 6fc9dbfe56 Merge pull request #3114 from yichengq/clean-raft-init
etcdserver: clean up start and stop logic of raft
2015-07-27 14:19:25 -07:00
Yicheng Qin 7696dd3280 etcdserver: clean up start and stop logic of raft
kill TODO and make it more readable.
2015-07-27 13:24:26 -07:00
Xiang Li 53a77fa519 *: tnx -> txn 2015-07-24 23:21:09 +08:00
Yicheng Qin b7892b20c1 etcdserver: rename defaultPublishRetryInterval -> defaultPublishTimeout
This makes code more readable and reasonable.
2015-07-23 10:09:28 -07:00
Yicheng Qin 5be545b872 Merge pull request #3077 from yichengq/fix-test-sync
etcdserver: init raft internal var early
2015-07-10 14:44:52 -07:00
Xiang Li 2fb8347d36 etcdserver: add rpc proto 2015-06-29 20:00:09 -07:00
Xiang Li 581ef05bab *: resolve proto warnings 2015-06-29 18:39:46 -07:00
Xiang Li 13f44e4b79 *: update generated proto code 2015-06-29 16:45:25 -07:00
Yicheng Qin 7f95780bfb etcdserver: init raft internal var early
Its `stopped`/`done` should be created always before being used
in defer in server loop.

It fixes the race detected when running TestSyncTrigger.
2015-06-29 15:34:15 -07:00
Yicheng Qin 2e41b4f9e1 etcdserver/auth: fix return value when creating root user
Before:

```
$ curl http://127.0.0.1:4001/v2/auth/users/root -XPUT -d '{"user": "root",
"password": "root"}'
{"user":"root","roles":null}
```

After:

```
{"user":"root","roles":["root"]}
```
2015-06-27 23:16:54 -07:00
Barak Michener acca9cc3a9 Merge pull request #3047 from barakmich/auth_cov
auth: improve test coverage
2015-06-25 14:47:22 -04:00
Barak Michener 39c10d1fe4 auth: improve test coverage 2015-06-25 14:25:08 -04:00
Yicheng Qin 5d131acfba etcdserver: fix TestTriggerSnap
Before checking, it needs to wait for snapshot goroutine to finish its
work.
2015-06-25 09:58:36 -07:00
Xiang Li 52c2a5731f etcdserver: fix typo in metrics.go 2015-06-24 12:42:40 -07:00
Xiang Li 030d1bbf2d auth: do not allow update root role 2015-06-23 20:15:08 -07:00
Xiang Li e291dfd748 etcdhttp: improve user endpoint validation
Giving both roles and grant/revoke is not allowed.
Creating an existing user is not allowed.
Updating a non-existing user is not allowed.
2015-06-23 14:38:44 -07:00
Xiang Li c8628c8fe5 auth: separate the role create and update path
Giving both permission and grant/revoke is not allowed.
Creating an existing role is not allowed.
Updating a non-existing is not allowed.
2015-06-23 13:15:32 -07:00
Xiang Li bc61056912 etcdhttp: use correct http status const when writing http error 2015-06-23 12:40:30 -07:00
Xiang Li 4f47a6ebfb Merge pull request #3032 from xiang90/refactor_update_role
auth: refactor updateRole
2015-06-23 11:17:45 -07:00
Barak Michener d5a0e3ac6a etcdhttp: Always strip password hash when returning users 2015-06-22 18:39:16 -04:00
Xiang Li 979f531261 auth: refactor updateRole
We will return error if revoke or grant fails to update the role.
No need to check if revoke or grant is nil or not.
2015-06-22 15:16:10 -07:00
Xiang Li 3f82e7b116 auth: do not allow to grant duplicate role or revoke ungranted role to a user 2015-06-22 15:11:09 -07:00
Barak Michener 51a65599dd Merge pull request #3021 from xiang90/auth_err
etcdserver: use correct http status code for auth error
2015-06-22 14:58:33 -04:00
Xiang Li c39aad0e92 etcdserver: use correct http status code for auth error 2015-06-22 09:28:47 -07:00
Xiang Li 3e4479b0cd Merge pull request #3022 from xiang90/aut_type
etcdhttp: fix the response type for auth
2015-06-21 15:06:35 -07:00
Xiang Li d295d21349 etcdserver: better log message for url mismatch 2015-06-19 19:36:26 -07:00
Xiang Li cad757efa0 etcdhttp: fix the response type for auth 2015-06-19 15:19:00 -07:00
Barak Michener 64ec8af91b *: Rename `security` to `auth` 2015-06-15 18:18:50 -04:00
Antoine Grondin 270487d340 etcdserver: use Infof to print formatted argument 2015-06-14 20:22:21 +07:00
Xiang Li 8ad7ed321e *:godep log pkg 2015-06-11 14:22:14 -07:00
Xiang Li f013a627a4 etcdserver/stats: use leveled log 2015-06-11 14:22:14 -07:00
Xiang Li cf7cb2b8a9 etcdserver/security: use leveled log 2015-06-11 14:22:14 -07:00
Xiang Li 2f795e42d0 httptypes: use leveled log 2015-06-11 14:19:53 -07:00
Barak Michener 7bf0479e66 Merge pull request #2882 from barakmich/security_client_new
*: Add security/authorization to etcd/client and etcdctl
2015-06-11 13:40:32 -04:00
Yicheng Qin 1af2b4cad7 rafthttp: fix TestUpdateMember
Before this PR, it may error like this:

```
--- FAIL: TestUpdateMember-2 (0.00s)
		server_test.go:950: action =
		[{ApplyConfChange:ConfChangeUpdateNode []}
{ProposeConfChange:ConfChangeUpdateNode []}], want
[{ProposeConfChange:ConfChangeUpdateNode []}
{ApplyConfChange:ConfChangeUpdateNode []}]
```

This fixes the test by recording the proposal event in time.
2015-06-11 09:45:34 -07:00
Yicheng Qin cd629c9b44 Merge pull request #2939 from yichengq/fix-update-attr
etcdserver: allow to update attributes of removed member
2015-06-10 16:53:39 -07:00
Yicheng Qin 8725e69cf7 etcdserver: allow to update attributes of removed member
There exist the possiblity to update attributes of removed member in
reasonable workflow:
1. start member A
2. leader receives the proposal to remove member A
2. member A sends the proposal of update its attribute to the leader
3. leader commits the two proposals
So etcdserver should allow to update attributes of removed member.
2015-06-10 16:52:18 -07:00
Yicheng Qin 4e79abcfeb Merge pull request #2944 from yichengq/fix-2procs
pkg/testutil: ForceGosched -> WaitSchedule
2015-06-10 14:44:32 -07:00
Yicheng Qin 018fb8e6d9 pkg/testutil: ForceGosched -> WaitSchedule
ForceGosched() performs bad when GOMAXPROCS>1. When GOMAXPROCS=1, it
could promise that other goroutines run long enough
because it always yield the processor to other goroutines. But it cannot
yield processor to goroutine running on other processors. So when
GOMAXPROCS>1, the yield may finish when goroutine on the other
processor just runs for little time.

Here is a test to confirm the case:

```
package main

import (
	"fmt"
	"runtime"
	"testing"
)

func ForceGosched() {
	// possibility enough to sched up to 10 go routines.
	for i := 0; i < 10000; i++ {
		runtime.Gosched()
	}
}

var d int

func loop(c chan struct{}) {
	for {
		select {
		case <-c:
			for i := 0; i < 1000; i++ {
				fmt.Sprintf("come to time %d", i)
			}
			d++
		}
	}
}

func TestLoop(t *testing.T) {
	c := make(chan struct{}, 1)
	go loop(c)
	c <- struct{}{}
	ForceGosched()
	if d != 1 {
		t.Fatal("d is not incremented")
	}
}
```

`go test -v -race` runs well, but `GOMAXPROCS=2 go test -v -race` fails.

Change the functionality to waiting for schedule to happen.
2015-06-10 14:37:41 -07:00
Barak Michener a4d1a5a6e5 *: Add security/auth support to etcdctl and etcd/client
add godep for speakeasy and auth entry parsing
add security_user to client
add role to client
add role commands
add auth support to etcdclient and etcdctl(member/user)
add enable/disable to etcdctl
better error messages, read/write/readwrite
Bump go-etcd to include codec changes, add new dependency
verify the error for revoke/add if nothing changed, remove security-merging prefix
2015-06-10 16:58:10 -04:00
Xiang Li 19ef3a0982 Merge pull request #2934 from xiang90/etcdserver_log
etcdserver: use leveled logging
2015-06-09 15:53:52 -07:00
Xiang Li e0f9796653 etcdserver: use leveled logging
Leveled logging for etcdserver pkg.
2015-06-09 13:53:07 -07:00
Yicheng Qin 9fbd2599ad Merge pull request #2940 from yichengq/improve-raft-loop
etcdserver: stop raft loop when receiving stop signal
2015-06-09 11:24:53 -07:00
Yicheng Qin 0814966ca2 etcdserver: stop raft loop when receiving stop signal
When it waits for apply to be done, it should stop the loop if it
receives stop signal.

This helps to print out panic information. Before this PR, if the panic
happens when server loop is applying entries, server loop will wait for
raft loop to stop forever.
2015-06-09 11:11:53 -07:00
Brian Akins d8a836e618 Simple debug HTTP request logging 2015-06-09 13:40:37 -04:00
Xiang Li 0adeee2965 etcdhttp: use leveled logging 2015-06-09 09:26:57 -07:00
Xiang Li 3af4a45d7b etcdserver: make raft use leveled logger 2015-06-02 12:50:42 -07:00
Xiang Li 42fe370b35 Merge pull request #2848 from xiang90/metrics
*: use namespace and subsystem in metrics
2015-05-26 14:44:54 -07:00
Xiang Li 34ac145b38 *: use namespace and subsystem in metrics
Fix #2841.

From Prometheus developer:
```
the recommended way for etcd as an open source project and under
consideration of its size would be etcd_<subsystem>_<name>.
```

We made the naming change accordingly.
2015-05-26 14:39:04 -07:00
Xiang Li 3028edd7dc Merge pull request #2856 from xiang90/mrefactor
etcdserver: refactore member.go
2015-05-26 14:37:37 -07:00
Barak Michener 9ef098c5ed etcdserver: fix go vet. Fixes #2859 2015-05-22 13:54:54 -04:00
Xiang Li 58eefda72d Merge pull request #2840 from yichengq/revert-url-equal
Revert "Treat URLs have same IP address as same"
2015-05-21 19:27:19 -07:00
Xiang Li 4a72d3a8bb etcdserver: refactore member.go 2015-05-21 09:19:29 -07:00
Xiang Li 260aad5468 Merge pull request #2830 from xiang90/join_checking
checking cluster version compatibility before joining the existing cluster
2015-05-20 12:25:50 -07:00
Xiang Li aa417ab644 etcdserver: log the per endpoint error in getVersion 2015-05-20 12:10:10 -07:00
Xiang Li db7db689a6 etcdserver: check cluster version compability when joining 2015-05-19 10:19:41 -07:00
Barak Michener a88a53274f security: Lazily create the security directories. Fixes #2755, may find new instances for #2741
revert the kv integration test

fix nits

amend security mention of GUEST
2015-05-18 17:28:04 -04:00
Yicheng Qin 3153e635d5 Revert "Treat URLs have same IP address as same"
This reverts commit f8ce5996b0.

etcd no longer resolves TCP addresses passed in through flags,
so there is no need to compare hostname and IP slices anymore.
(for more details: a3892221ee)

Conflicts:
	etcdserver/cluster.go
	etcdserver/config.go
	pkg/netutil/netutil.go
	pkg/netutil/netutil_test.go
2015-05-16 03:21:10 -07:00
Xiang Li 9f8342dba4 etcdserver: do not get local version via HTTP 2015-05-13 17:19:32 -07:00
Xiang Li 988c30bfba etcdserver: getVersion returns both server and cluster version 2015-05-13 17:04:46 -07:00
Xiang Li 6296054ff6 etcdhttp: version endpoint also returns cluster version. 2015-05-13 15:48:10 -07:00
Yicheng Qin 75ee7f4aa1 Merge pull request #2821 from yichengq/private-cluster
etcdserver: stop exposing Cluster struct
2015-05-13 10:26:48 -07:00
Xiang Li 2690535f8a Merge pull request #2820 from xiang90/cap
version capability checking
2015-05-13 10:16:49 -07:00
Xiang Li d3b1d5c008 etcdhttp: support capability checking
etcdhttp will check the cluster version and update its
capability version periodically.

Any new handler's after 2.0 needs to wrap by capability handler
to ensure it is not accessable until rolling upgrade finished.
2015-05-13 10:11:35 -07:00
Yicheng Qin a6a649f1c3 etcdserver: stop exposing Cluster struct
After this PR, only cluster's interface Cluster is exposed, which makes
code much cleaner. And it avoids external packages to rely on cluster
struct in the future.
2015-05-13 10:01:25 -07:00
Xiang Li f2905f2828 etcdserver: remove unnecessary around detect datadir
The log is super unhelpful. When I have a 2.1.0 etcd, it prints out
`2.0.1 vaild dir`. I have no idea why the data dir of a 2.1.0 etcd is
2.0.1.
2015-05-12 22:06:42 -07:00
Yicheng Qin 032db5e396 *: extract types.Cluster from etcdserver.Cluster
The PR extracts types.Cluster from etcdserver.Cluster. types.Cluster
is used for flag parsing and etcdserver config.

There is no need to expose etcdserver.Cluster public, which contains
lots of etcdserver internal details and methods. This is the first step
for it.
2015-05-12 14:53:11 -07:00
Xiang Li e866314b94 etcdserver: support update cluster version through raft
1. Persist the cluster version change through raft. When the member is restarted, it can recover
the previous known decided cluster version.

2. When there is a new leader, it is forced to do a version checking immediately. This helps to
update the first cluster version fast.
2015-05-12 11:44:34 -07:00
Xiang Li 94ffd72c7e etcdserver: rename StoreAdminPrefix to StoreClusterPrefix
We store cluster related key in StoreAdminPrefix for some
historical reason. The previous API is called admin. But now,
the admin name is gone and `cluster` is a more clear and correct
name.
2015-04-29 12:05:51 -07:00
Xiang Li 6699107f61 *: add cluster version and cluster version detection.
Cluster version is the min major.minor of all members in
the etcd cluster. Cluster version is set to the min version
that a etcd member is compatible with when first bootstrapp.

During a rolling upgrades, the cluster version will be updated
automatically.

For example:

```
Cluster [a:1, b:1 ,c:1] -> clusterVersion 1

update a -> 2, b -> 2

after a detection

Cluster [a:2, b:2 ,c:1] -> clusterVersion 1, since c is still 1

update c -> 2

after a detection

Cluster [a:2, b:2 ,c:2] -> clusterVersion 2
```

The API/raft component can utilize clusterVersion to determine if
it can accept a client request or a raft RPC.

We choose polling rather than pushing since we want to use the same
logic for cluster version detection and (TODO) cluster version checking.

Before a member actually joins a etcd cluster, it should check the version
of the cluster. Push does not work since the other members cannot push
version info to it before it actually joins. Moreover, we do not want our
raft RPC system (which is doing the heartbeat pushing) to coordinate cluster version.
2015-04-29 11:31:59 -07:00
Yicheng Qin 1c1cccd236 rafthttp: stop etcd if it is found removed when stream dial
The original process is stopping etcd only when pipeline message finds itself
has been removed. After this PR, stream dial has this functionality too.
It helps fast etcd stop, which doesn't need to wait for stream break to
fall back to pipeline, and wait for election timeout to send out message
to detect self removal.
2015-04-27 15:10:00 -07:00
Yicheng Qin ebecee34e0 Merge pull request #2701 from yichengq/rafthttp-anon
rafthttp: add remotes
2015-04-24 13:04:37 -07:00
Yicheng Qin 9f19b5660f rafthttp: add AddRemote
Add remotes to rafthttp, who help newly joined members catch up the
progress of the cluster. It supports basic message sending to remote, and
has no stream connection for simplicity. remotes will not be used
after the latest peers have been added into rafthttp.
2015-04-24 11:49:23 -07:00
xiaost cab1e9a723 etcdserver: skip noop entry in apply 2015-04-24 12:15:51 +08:00
Barak Michener fa74e702d8 security: Improve the security api as per the suggestions list in #2384
Subcommits:

decouple root and security enable/disable

create root role

prefix matching

godep: bump go-etcd to include credentials

add godep for speakeasy and auth entry parsing

appropriate errors for security enable/disable

WIP adding to etcd/client all the security client methods

add guest access

minor ui return tweaks

revert client changes

respond to comments, log more security operations

fix major ensure() bug, add better UX

block recursive access

fix some boneheaded mistakes

fix integration test

last comments

fix up security_api.md

philips nits

fix docs
2015-04-23 16:11:38 -04:00
Yicheng Qin 1d96de459a etcdserver: init server stats before passing it as argument
It is more reasonable to init the variable before passing it as an
argument.

It fixes a bug that etcdserver may panic on server stats when processing
a message from rafthttp streamReader before server stats is initialized
in server.Start().
2015-04-22 08:28:08 -07:00
Xiang Li 5ad559b503 *: serve json version on both client and peer url 2015-04-20 16:23:51 -07:00
Yicheng Qin 1811701427 Revert "etcdserver: fix cluster fallback recovery"
This reverts commit cff005777a.

Conflicts:
	etcdserver/server.go
2015-04-19 11:34:33 -07:00
Yicheng Qin 88224f6f4e Revert "etcdserver: not apply stale conf change in cluster and transport"
This reverts commit 40197f0698.
2015-04-19 11:08:03 -07:00
Xiang Li 98f8dfbc9d etcdserver: prevExist=true + condition is compareAndSwap
PrevExist indicates the key should exist. Condition compares with
an existing key. So PrevExist+condition = CompareAndSwap not Update.
2015-04-14 23:44:06 -07:00
xiaost eab2c2224a etcdserver: fix minor bug in EtcdServer.send
it seems to nothing serious.
after deleted peers, the log may output:
"etcdserver: send message to unknown receiver %s"
2015-04-13 20:35:58 +08:00
Yicheng Qin 2141308524 Merge pull request #2631 from yichengq/metrics-fd
etcdserver: metrics and monitor number of file descriptor
2015-04-08 11:28:58 -07:00
Yicheng Qin 7a7e1f7a7c etcdserver: metrics and monitor number of file descriptor
It exposes the metrics of file descriptor limit and file descriptor used.
Moreover, it prints out warning when more than 80% of fd limit has been used.

```
2015/04/08 01:26:19 etcdserver: 80% of the file descriptor limit is open
[open = 969, limit = 1024]
```
2015-04-08 11:17:48 -07:00
Alex Crawford d9ad6aa2a9 *: update to use IANA-assigned ports 2015-04-06 13:49:43 -07:00
Xiang Li 471aa1aa89 Merge pull request #2622 from xiang90/fix_watcher
store: fix watcher removal
2015-04-03 10:39:03 -07:00
Xiang Li 999917010d store: fix watcher removal 2015-04-03 10:13:43 -07:00
Yicheng Qin 9e5743c816 etcdserver: stop raft node goroutine before stop server
Stop raftNode goroutine before stopping server goroutine, so
server.Stop does stop all underlying stuffs elegantly now. This fixes
the problem that previous-round lock on WAL may not be released when
etcd is restarted.
2015-04-01 11:20:51 -07:00
Xiang Li 77a04cda0c Merge pull request #2597 from xiang90/wal-repair
wal: fix the unexpectedEOF error in the last wal.
2015-03-30 13:49:05 -07:00
Xiang Li 253f7c4ae1 Merge pull request #2522 from xiang90/user_pw
etcdserver/etcdhttp: do not return back the password of a user
2015-03-30 13:42:41 -07:00
Xiang Li 0b9a318e68 etcdserver: make the wal repairing logic clear 2015-03-29 21:10:28 -07:00
Xiang Li 1231f82f22 etcdserver: save snapshot into wal first 2015-03-29 14:23:05 -07:00
Xiang Li 8b4eed29e5 wal: fix the unexpectedEOF error in the last wal.
It is safe to repair the unexpectedEOF error in the last wal. raft
will not send out message before the entry successfully comitted
into wal. Thus we can safely truncate the last entry in the wal
to repair.
2015-03-28 21:08:14 -07:00
Yicheng Qin 60efd4d96e Revert "etcdhttp: add internalVersion"
This reverts commit a77bf97c14.

Conflicts:
	version/version.go
2015-03-27 16:53:55 -07:00
Yicheng Qin dd92a2b484 Merge pull request #2556 from yichengq/fix-apply-conf
etcdserver: not apply stale conf change
2015-03-27 14:00:30 -07:00
Kelsey Hightower 538d624cfa etcdserver: add stats.LatencyStats and stats.CountsStats types 2015-03-27 13:42:44 -07:00
Yicheng Qin 40197f0698 etcdserver: not apply stale conf change in cluster and transport 2015-03-27 12:53:34 -07:00
Xiang Li e3817adb5b etcdserver: loose member validation for joining existing cluster 2015-03-25 13:59:22 -07:00
Xiang Li 05e240b892 *: update protobuf 2015-03-25 10:14:35 -07:00
Yicheng Qin 5e0077cc0c etcdserver: print out extra files in data dir instead of erroring 2015-03-24 18:56:22 -07:00
Xiang Li 866a9d4e41 Merge pull request #2568 from xiang90/raftnode
raft: make node configurable
2015-03-24 11:18:22 -07:00
Yicheng Qin ea78f5d1aa Merge pull request #2552 from yichengq/fix-2396
etcdserver: check -initial-cluster in join case
2015-03-23 22:46:38 -07:00