Commit Graph

1406 Commits (b8e9bd2b42516443306fc17841c661c05aca2826)

Author SHA1 Message Date
Gyu-Ho Lee 6ec03d3f7c etcdserver: move 'EtcdServer.send' to raft.go
Clear 'TODO'
2016-10-26 16:26:00 -07:00
Gyu-Ho Lee 1cd6fefd49 etcdserver: set sort ASCEND for empty sort order
when target is not key
2016-10-18 16:29:19 -07:00
Hitoshi Mitake 39e9b1f75a auth, etcdserver: check password at API layer
The cost of bcrypt password checking is quite high (almost 100ms on a
modern machine) so executing it in apply loop will be
problematic. This commit exclude the checking mechanism to the API
layer. The password checking is validated with the OCC like way
similar to the auth of serializable get.

This commit also removes a unit test of Authenticate RPC from
auth/store_test.go. It is because the RPC now accepts an auth request
unconditionally and delegates the checking functionality to
authStore.CheckPassword() (so a unit test for CheckPassword() is
added). The combination of the two functionalities can be tested by
e2e (e.g. TestCtlV3AuthWriteKey).

Fixes https://github.com/coreos/etcd/issues/6530
2016-10-17 14:18:21 +09:00
Xiang Li 698a789644 Merge pull request #6655 from kragniz/range_end-docs
etcdserver: document DeleteRangeRequest prefixes
2016-10-14 15:00:24 -07:00
Louis Taylor ce6276a2e8
etcdserver: document DeleteRangeRequest prefixes
There was missing info about deleting prefixes in the proto docs for
DeleteRangeRequest.

Closes #6641.
2016-10-14 21:39:03 +01:00
Louis Taylor 9df97eb441
etcdserver: increase warnApplyDuration from 10ms to 100ms
When running test suites for a client locally I'm getting spammed by log
lines such as:

    etcdserver: apply entries took too long [14.226771ms for 1 entries]

The comments in #6278 mention there were future plans of increasing the
threshold for logging these warnings, but it hadn't been done yet.
2016-10-13 17:55:50 +01:00
Gyu-Ho Lee 0c61d8804a etcdserver: make WaitGroup.Add sync with Wait 2016-10-12 13:11:35 -07:00
Xiang Li dbaa44372b etcdserver: better panic logging 2016-10-11 13:34:18 -07:00
Gyu-Ho Lee e011ea25ca etcdserver: separate EtcdServer from raftNode 2016-10-07 13:18:39 -07:00
fanmin shi ea9e857eb9 Merge pull request #6599 from fanminshi/lease_error_type_fix
Lease: Add lease errors to togRPCError()
2016-10-06 15:47:51 -07:00
Xiang Li cbbd1f0f44 Merge pull request #6598 from xiang90/cleanup
v3rpc: return nil as error explicitly
2016-10-06 15:30:04 -07:00
fanmin shi a862fd9f0f Lease: Add lease errors to togRPCError()
This allows lease's function to convert lease error to appropriate GRPC errors
2016-10-06 14:29:31 -07:00
Xiang Li 10cafe56b8 v3rpc: return nil as error explicitly 2016-10-06 14:14:43 -07:00
Gyu-Ho Lee 65ac718a11 etcdserver: use 'TTL()' on lease.Lease 2016-10-06 11:24:12 -07:00
Xiang Li f0469f7f25 Merge pull request #6570 from xiang90/lease_expire
Fix lease expire
2016-10-05 15:49:45 -07:00
Xiang Li 0f0c048e29 etcdserver: fix early lessor promotion issue
If we promote the lessor before finish applying all
entries from the last term, we might incorrectly renew
the already revoked leases.

Here is an example:

- Term 1: revoke lease A accepted by raft
- Old leader failed, new election happened
- Term 2: promote
- Term 2: keep alive A succeed. A now has 10 seconds TTL
- Term 2: revoke lease A from Term 1 got committed and applied
- Term 2: the lease A with 10 seconds TTL is revoked

To solve this, the new leader MUST apply all entries from old term
before promote its lessor to start accept renew requests.
2016-10-05 14:41:47 -07:00
Gyu-Ho Lee 9b56e51ca7 *: regenerate proto + gofmt change 2016-10-03 15:34:34 -07:00
Xiang Li dfe85b26cc Merge pull request #6571 from xiang90/log_pkg
*: set repo correctly for logging
2016-10-03 15:49:44 -05:00
Xiang Li 962433c17f *: set repo correctly for logging 2016-10-03 17:03:22 +08:00
Anthony Romano 289e3c0c63 etcdserver: use stream recorder for TestPublishRetry
Fixes #6546
2016-09-30 15:43:32 -07:00
Xiang Li ea0c65797a etcdserver: use linearizableReadNotify for txn 2016-09-28 20:47:49 +08:00
fanmin shi 8ef6687018 etcdserver: fix a node panic bug caused LeaseTimeToLive call on a nonexistent lease
When the non Leader etcd server receives a LeaseTimeToLive on a nonexistent lease, it responds with a nil resp and a nil error The invoking function parses the nil resp and results a segmentation fault.
I fix the bug by making sure the lease not found error is returned so that the invoking function parses the the error message instead.

fix #6537
2016-09-27 17:46:30 -07:00
Xiang Li e3e3993022 etcdserver: support read index
Use read index to achieve l-read.
2016-09-27 13:41:40 +08:00
ychen11 69f5b4ba79 Documentation:made watch request doc more clear 2016-09-23 23:13:55 +08:00
fanmin shi 690a0b6f00 etcdserver: parallelize expired leases process
When 1000 leases expired at the same time, etcd takes more than 5 seconds to clean them. This means that even after the leases have expired, keys associated with leases are still accessible. I increase the deletion throughput by parallelizing leases deletion process.
2016-09-19 16:17:49 -07:00
Anthony Romano 3866e78c26 etcdserver: tighten up goroutine management
All outstanding goroutines now go into the etcdserver waitgroup. goroutines are
shutdown with a "stopping" channel which is closed when the run() goroutine
shutsdown. The done channel will only close once the waitgroup is totally cleared.
2016-09-19 12:10:41 -07:00
Xiang Li c6feb695dc api: update capability map 2016-09-16 14:34:55 +08:00
Xiang Li b3a083d336 Merge pull request #6436 from LiamHaworth/bugfix/6433-support-for-charset-in-content-type-header
etcdserver, api, v2http, client: Added support for semicolons
2016-09-14 23:25:31 -05:00
Liam Haworth 5cfa9e2384 etcdserver, api, v2http, client: Added support for semicolons
Added support into the v2 API to fix an issue (6433) where if there is a semicolon
and fields after it the API would return an "invalid Content-type" message even
if the content type was actually correct
2016-09-15 13:54:22 +10:00
Anthony Romano c0981a90f7 etcdserver, etcdserverpb: range min_create_revision and max_create_revision 2016-09-14 15:31:45 -07:00
Anthony Romano af0264d2e6 etcdserver, etcdserverpb: add MinModRevision and MaxModRevision options to Range 2016-09-12 15:17:57 -07:00
Gyu-Ho Lee 63b0cd470d etcdserver: implement 'LeaseTimeToLive' 2016-09-09 08:14:14 +09:00
Gyu-Ho Lee 0712ebc9b5 v2http: handle '/leases/internal' 2016-09-09 08:12:31 +09:00
Gyu-Ho Lee 3132e36bf3 etcdserverpb: add 'LeaseTimeToLive' RPC 2016-09-09 08:08:14 +09:00
Anthony Romano 1defeda792 v3api, rpctypes: add ErrUnhealthy 2016-09-07 16:51:49 -07:00
Gyu-Ho Lee 48941cea95 Merge pull request #6308 from gyuho/manual2
client: do not send previous node data (optional)
2016-08-30 13:33:22 -07:00
Xiang Li 771ee43169 etcdserver: allow zero kv index for cluster upgrade
If a user upgrades etcd from 2.3.x to 3.0 and shutdown the
cluster immediately without triggering any new backend writes,
then the consistent index in backend would be zero.

The user cannot restart etcdserver due to today's strick index
match checking. We now have to lose this a bit for this case.
2016-08-30 11:28:18 -07:00
Gyu-Ho Lee 2da7b63809 v2http: change to 'NoValueOnSuccess' 2016-08-30 10:53:02 -07:00
Gyu-Ho Lee 572bfd99ff v2http: update function returns 2016-08-30 10:29:37 -07:00
Michael Fraenkel 82053f04b2 client: do not send previous node data (optional)
- Do not send back node data when specified
- remove node and prevNode when noDataOnSuccess is set
2016-08-30 10:04:09 -07:00
Anthony Romano 64ac631863 rpctypes: set unknown codes to Unknown instead of internal
An unrecognized error code isn't "very broken".
2016-08-28 19:37:35 -07:00
Anthony Romano c388b2f22f Merge pull request #6264 from heyitsanthony/error-codes
clientv3: use grpc codes to translate raw grpc errors
2016-08-26 11:52:37 -07:00
Anthony Romano df54ad2208 v3rpc, rpctypes: add error types for timeouts 2016-08-26 09:22:09 -07:00
Anthony Romano 254c0ea814 etcdserver: use request timeout defined by ServerConfig for v3 requests 2016-08-25 18:39:01 -07:00
Xiang Li 7f3d4bfae5 etcdserver: kv.commit needs to be serialized with apply
kv.commit updates the consistent index in backend. When
executing in parallel with apply, it might grab tx lock
after apply update the consistent index and before apply
starts to execute the opeartion. If the server dies right
after kv.commit, the consistent is updated but the opeartion
is not executed. If we restart etcd server, etcd will skip
the operation. :(

There are a few other places that we need to take care of,
but let us fix this first.
2016-08-23 09:16:09 -07:00
Xiang Li 83de13e4a8 etcdserver: support apply wait 2016-08-19 16:18:35 -07:00
Xiang Li d0fa390048 etcdserver: improve logging for leadership transfer 2016-08-17 11:40:46 -07:00
Gyu-Ho Lee f91f7dfb91 v2http: fix tests to use new clockwork 2016-08-16 16:36:24 -07:00
Gyu-Ho Lee 4d3b281369 etcdserver: fix spell errors 2016-08-13 20:54:48 -07:00
Gyu-Ho Lee 64a0e34602 etcdserver: transfer leadership when stopping 2016-08-13 14:31:58 -07:00
sharat 1fec4ba127 etcdserver: optimized veryfying local member
moved the code for perparing and sorting of advertising peer urls and
sorting of peer urls only when strict verification needs to be done.
This is done to avoid this processing when strict verification is not
required like in case of VerifyJoinExisting function.

#6165
2016-08-13 06:17:21 +05:30
Gyu-Ho Lee f975fe8068 Merge pull request #6140 from gyuho/network-partition
*: add network partition tests
2016-08-12 12:33:24 -07:00
Xiang Li 82a3d90763 Merge pull request #6167 from xiang90/fix_txn_rev
etcdserver: fix wrong rev in header when nothing is actually got executed
2016-08-12 12:14:48 -07:00
Xiang Li 92a0f08722 etcdserver: fix wrong rev in header when nothing is actually got executed 2016-08-12 11:44:13 -07:00
Gyu-Ho Lee c6c6cfb502 etcdserver: implement 'CutPeer', 'MendPeer' 2016-08-12 07:38:52 -07:00
Xiang Li c33ea20fef Merge pull request #6161 from sinsharat/master
etcdserver: stats/server - refactored
2016-08-11 17:03:23 -07:00
Anthony Romano 965b2901d5 Merge pull request #6156 from heyitsanthony/remove-member-quorum
etcdserver: reject member removal that breaks active quorum
2016-08-11 11:40:38 -07:00
sharat 6205a9a6cb etcdserver: stats/server - refactored
removed code duplicacy and improved readability

#6160
2016-08-11 22:09:25 +05:30
Anthony Romano a1ce07a321 etcdserver: reject member removal that breaks the current active quorum 2016-08-10 17:00:39 -07:00
Gyu-Ho Lee a56cb82180 etcdserver: add TransferLeadership for raft.Node 2016-08-10 16:26:11 -07:00
Gyu-Ho Lee d219e96359 etcdserver: use Counter for proposals_failed_total
It only ever goes up.
2016-08-10 09:27:51 -07:00
sharat 2b5a5c77cf etcdserver: Error handling for invalid empty raft cluster
TODO implemented for GetClusterFromRemotePeers should not return nil
error with an invalid empty cluster

#6137
2016-08-10 19:23:19 +05:30
Anthony Romano 9063ce5e3f etcdserver, embed: stricter reconfig checking
Make --strict-reconfig-check a default and check if cluster is healthy when
adding a member.
2016-08-05 16:59:25 -07:00
Xiang Li d69d438289 *: minor cleanup for lease 2016-08-04 20:39:32 -07:00
Xiang Li 29a077bdbe etcdserver: always recover lessor first 2016-08-04 08:06:19 -07:00
Anthony Romano bf71497537 etcdserver, lease: tie lease min ttl to election timeout 2016-08-02 13:06:57 -07:00
Gyu-Ho Lee 87498e0209 v2http: use guest access in non-TLS mode
Fix https://github.com/coreos/etcd/issues/6075.
2016-08-01 14:00:38 -07:00
Anthony Romano 06da46c4ee etcdserver: apply serialized requests outside auth apply lock
Fixes #6010
2016-07-30 22:00:49 -07:00
Anthony Romano 79d25a6884 Merge pull request #6061 from heyitsanthony/fix-snapshot-test
etcdserver: don't race when waiting for store in TestSnapshot
2016-07-27 19:15:41 -07:00
Anthony Romano cfe09d34b8 etcdserver: don't race when waiting for store in TestSnapshot 2016-07-27 15:37:27 -07:00
Gyu-Ho Lee 982e18d80b *: regenerate proto with latest grpc-gateway 2016-07-27 13:21:03 -07:00
Anthony Romano 13c2d32061 Merge pull request #6045 from heyitsanthony/fix-version-race
etcdserver, api, membership: don't race on setting version
2016-07-27 08:56:39 -07:00
Hitoshi Mitake 0090573749 etcdserver: skip range requests in txn if the result is needless
If a server isn't serving txn requests from a client, the server
doesn't need the result of range requests in the txn.

This is a succeeding commit of
https://github.com/coreos/etcd/pull/5689
2016-07-26 19:49:07 -07:00
Anthony Romano de2c3ec3db etcdserver, api, membership: don't race on setting version
Fixes #6029
2016-07-26 18:21:40 -07:00
Xiang Li 020a24f1c3 *: regenerate proto for handling eof error 2016-07-23 16:21:44 -07:00
Xiang Li fffa484a9f *: regenerate proto for adding deleterange 2016-07-23 16:17:44 -07:00
Xiang Li b4ce427d45 etcdserverpb: add missing deleterange annotation 2016-07-23 15:59:53 -07:00
Gyu-Ho Lee 5066981cc7 v2http: test with 'ClientCertAuthEnabled' 2016-07-20 16:24:33 -07:00
Gyu-Ho Lee 25aeeb35c3 v2http: set 'ClientCertAuthEnabled' in client.go 2016-07-20 16:24:15 -07:00
Gyu-Ho Lee 68ece954fb v2http: add 'ClientCertAuthEnabled' in handlers 2016-07-20 16:23:41 -07:00
Gyu-Ho Lee 9510bd6036 etcdserver: add 'ClientCertAuthEnabled' option 2016-07-20 16:22:59 -07:00
Gyu-Ho Lee 0f0d32b073 v2http: move 'testdata' from 'etcdhttp' 2016-07-20 16:20:42 -07:00
rob boll ff5709bb41 v2http: client cert cn authentication
introduce client certificate authentication using certificate cn.
2016-07-20 16:20:13 -07:00
rob boll ab17165352 v2http: refactor http basic auth
refactor http basic auth code to combine basic auth extraction and validation
2016-07-20 16:20:05 -07:00
Anthony Romano 299ebc6137 v3rpc: don't elide next progress notification on progress notification
Fixes #5878
2016-07-20 11:37:20 -07:00
Xiang Li aba478fb8a Merge pull request #5793 from mitake/auth-revision
auth, etcdserver: introduce revision of authStore for avoiding TOCTOU problem
2016-07-20 09:32:54 -07:00
Hitoshi Mitake ef6b74411c auth, etcdserver: introduce revision of authStore for avoiding TOCTOU problem
This commit introduces revision of authStore. The revision number
represents a version of authStore that is incremented by updating auth
related information.

The revision is required for avoiding TOCTOU problems. Currently there
are two types of the TOCTOU problems in v3 auth.

The first one is in ordinal linearizable requests with a sequence like
below ():
1. Request from client CA is processed in follower FA. FA looks up the
   username (let it U) for the request from a token of the request. At
   this time, the request is authorized correctly.
2. Another request from client CB is processed in follower FB. CB
   is for changing U's password.
3. FB forwards the request from CB to the leader before FA. Now U's
   password is updated and the request from CA should be rejected.
4. However, the request from CA is processed by the leader because
   authentication is already done in FA.

For avoiding the above sequence, this commit lets
etcdserverpb.RequestHeader have a member revision. The member is
initialized during authentication by followers and checked in a
leader. If the revision in RequestHeader is lower than the leader's
authStore revision, it means a sequence like above happened. In such a
case, the state machine returns auth.ErrAuthRevisionObsolete. The
error code lets nodes retry their requests.

The second one, a case of serializable range and txn, is more
subtle. Because these requests are processed in follower directly. The
TOCTOU problem can be caused by a sequence like below:
1. Serializable request from client CA is processed in follower FA. At
   first, FA looks up the username (let it U) and its permission
   before actual access to KV.
2. Another request from client CB is processed in follower FB and
   forwarded to the leader. The cluster including FA now commits a log
   entry of the request from CB. Assume the request changed the
   permission or password of U.
3. Now the serializable request from CA is accessing to KV. Even if
   the access is allowed at the point of 1, now it can be invalid
   because of the change introduced in 2.

For avoiding the above sequence, this commit lets the functions of
serializable requests (EtcdServer.Range() and EtcdServer.Txn())
compare the revision in the request header with the latest revision of
authStore after the actual access. If the saved revision is lower than
the latest one, it means the permission can be changed. Although it
would introduce false positives (e.g. changing other user's password),
it prevents the TOCTOU problem. This idea is an implementation of
Anthony's comment:
https://github.com/coreos/etcd/pull/5739#issuecomment-228128254
2016-07-20 14:39:04 +09:00
Anthony Romano 8abae076d1 rpctypes, clientv3: retry RPC on EtcdStopped
Fixes #5983
2016-07-19 18:29:12 -07:00
Xiang Li 1c5754f02d raft: fix readindex 2016-07-19 15:00:58 -07:00
Xiang Li 58aa3483c3 grpcproxy: add filter to watcher 2016-07-18 13:02:34 -07:00
Gyu-Ho Lee 50be793f09 *: regenerate proto 2016-07-18 09:33:32 -07:00
Xiang Li 2d761d64a4 etcdserver: set applied index correctly 2016-07-16 11:44:18 -07:00
Gyu-Ho Lee 5b92e17e86 *: regenerate proto files 2016-07-15 13:24:19 -07:00
Anthony Romano 51c5c307fa rpctypes: test error equivalence with Error()
grpc.Errorf() now returns *rpcError, which makes comparisons shallow.
2016-07-14 15:59:06 -07:00
Xiang Li b0f2e5e64a Merge pull request #5927 from xiang90/pacing
*: deny proposals when there is a huge gap between apply/commit
2016-07-14 11:47:53 -07:00
Xiang Li 27b03f0ed5 *: deny proposals when there is a huge gap between apply/commit 2016-07-14 10:02:55 -07:00
Xiang Li 81d5ae3ce1 rpctypes: use permission deny code for permission deny error 2016-07-13 10:32:10 -07:00
Xiang Li b9f6de9277 Merge pull request #5895 from smallfish/master
etcdserver/api/v2http, Documentation: fix debug pprof index miss / in end
2016-07-12 07:10:53 -07:00
Xiang Li f65e75e4b3 *: remove unnecessary data upgrade code 2016-07-11 15:11:56 -07:00
Hitoshi Mitake c47689d98f Merge pull request #5689 from mitake/skip-apply
RFC: etcdserver, pkg: skip needless log entry applying
2016-07-10 01:23:35 +09:00