Commit Graph

169 Commits (e39ad84440cd03005f6c035e001dc14d8b113a64)

Author SHA1 Message Date
Yiqiao Pu ddc4f8bd45 etcdmain: Add max-snapshots and max-wals to help
Based on the configuration doc, seems these two flags are missing
in the help. So add them and the descriptions are from config.go in
the same directory.

Signed-off-by: Yiqiao Pu <ypu@redhat.com>
2015-11-19 11:58:00 +08:00
Xiang Li 9e4a003fb0 etcdmain: fix unstoppable startEtcd function
We should wrap the blocking function with a closure. And first
creates a go routine to execute the function. Or the inner function
blocks before creating the go routine.
2015-11-17 09:04:00 -08:00
Gyu-Ho Lee 12b4a122ce etcdmain: minor typo, make descriptions consistent
This fixes some typos and make help.go and config.go flag descriptions
consistent with each other.
2015-11-10 13:00:08 -08:00
Gyu-Ho Lee 5eb57c2aee etcdmain: more description for init cluster token
This adds more description to initial-cluster-token from
https://github.com/coreos/etcd/issues/3690 to help.go.
2015-11-10 09:40:08 -08:00
Xiang Li 08f0d94019 Merge pull request #3809 from xiang90/rpc_kv
*: refactor kv rpc implementation
2015-11-04 19:05:48 -08:00
Yicheng Qin 3d15526c35 Merge pull request #3796 from yichengq/fix-get-version
etcdserver: not reuse connections for peer transport
2015-11-04 11:39:14 -08:00
Xiang Li c37bd2385a *: refactor kv rpc implementation 2015-11-04 11:36:17 -08:00
Yicheng Qin 4ccbcb91c8 rafthttp: add functions to create listener and roundTripper
This moves the code to create listener and roundTripper for raft communication
to the same place, and use explicit functions to build them. This prevents
possible development errors in the future.
2015-11-04 11:12:46 -08:00
Xiang Li 1a3f7f7fa4 *: rename etcd service to kv service in gRPC 2015-11-04 10:05:49 -08:00
Xiang Li 10de2e6dbe *: serve watch service
Implement watch service and hook it up
with grpc server in etcdmain.
2015-11-03 15:58:34 -08:00
Xiang Li fe165de1d1 Merge pull request #3794 from yichengq/fix-proxy-term
etcdmain: fix parsing discovery error
2015-11-02 17:33:47 -08:00
Yicheng Qin 9757dcd3a2 etcdmain: fix parsing discovery error
The discovery error is wrapped into a struct now, and cannot be compared
to predefined errors. Correct the comparison behavior to fix the
problem.
2015-11-02 17:23:06 -08:00
Gyu-Ho Lee 821c071f3f etcdmain: fix package description for godoc.org
This fixes package description for etcdmain that wasn't compatible with
godoc.org, by deleting the extra blank lines between comment and package name.
2015-10-29 12:28:52 -07:00
Xiang Li ab4892ade2 Merge pull request #3749 from gyuho/etcdmain_flags_20151025
etcdmain: make flags and formats idential
2015-10-26 20:54:37 -07:00
Gyu-Ho Lee 52782cf8ee etcdmain: make flags and formats idential
This makes flagsline and config.go identical in its flag description and some
punctuation conventions.
2015-10-25 06:31:37 -07:00
Yicheng Qin 207c92b627 rafthttp: build transport inside pkg instead of passed-in
rafthttp has different requirements for connections created by the
transport for different usage, and this is hard to achieve when giving
one http.RoundTripper. Pass into pkg the data needed to build transport
now, and let rafthttp build its own transports.
2015-10-11 21:42:37 -07:00
Xiang Li 51043830d4 etcdmain: print out error and suggestion for fixing notify issue 2015-10-02 13:39:41 -07:00
Yicheng Qin 49d262185d Merge pull request #3590 from yichengq/discovery-log
etcdmain: improve log when join discovery fails
2015-09-29 08:02:18 -07:00
Yicheng Qin 939aa96a34 etcdmain: improve log when join discovery fails
Before this PR, the log is

```
2015/09/1 13:18:31 etcdmain: client: etcd cluster is unavailable or
misconfigured
```

It is quite hard for people to understand what happens.

Now we print out the exact reason for the failure, and explains the way
to handle it.
2015-09-28 23:23:50 -07:00
Yicheng Qin dc9a75df1c etcdmain: exit after print out ErrDuplicateID
etcd should exit after printing log for unhandlable error.
2015-09-25 14:10:50 -07:00
Xiang Li 9de7f24301 Merge pull request #3554 from mitake/reconfig-doc
doc: add a description of -strict-reconfig-check
2015-09-24 08:07:32 -07:00
Hitoshi Mitake 78791f81a6 doc: add a description of -strict-reconfig-check 2015-09-24 11:44:55 +09:00
Xiang Li 3b70bf87c3 etcdmain: better logging when user forget to set initial flags 2015-09-21 10:43:26 -07:00
Xiang Li 662b4966d0 Merge pull request #3510 from xiang90/v3_raft
etcdmain: support gRPC addr flag
2015-09-12 22:58:08 -07:00
Xiang Li a0cfcf2dd7 etcdmain: support gRPC addr flag 2015-09-12 22:52:51 -07:00
Hitoshi Mitake 6974fc63ed etcdserver: avoid deadlock caused by adding members with wrong peer URLs
Current membership changing functionality of etcd seems to have a
problem which can cause deadlock.

How to produce:
 1. construct N node cluster
 2. add N new nodes with etcdctl member add, without starting the new members

What happens:
After finishing add N nodes, a total number of the cluster becomes 2 *
N and a quorum number of the cluster becomes N + 1. It means
membership change requires at least N + 1 nodes because Raft treats
membership information in its log like other ordinal log append
requests.

Assume the peer URLs of the added nodes are wrong because of miss
operation or bugs in wrapping program which launch etcd. In such a
case, both of adding and removing members are impossible because the
quorum isn't preserved. Of course ordinal requests cannot be
served. The cluster would seem to be deadlock.

Of course, the best practice of adding new nodes is adding one node
and let the node start one by one. However, the effect of this problem
is so serious. I think preventing the problem forcibly would be
valuable.

Solution:
This patch lets etcd forbid adding a new node if the operation changes
quorum and the number of changed quorum is larger than a number of
running nodes. If etcd is launched with a newly added option
-strict-reconfig-check, the checking logic is activated. If the option
isn't passed, default behavior of reconfig is kept.

Fixes https://github.com/coreos/etcd/issues/3477
2015-09-13 09:31:53 +09:00
Dmitry Smirnov b2f4a5f587 *: fix spelling issues (codespell).
Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
2015-09-11 10:22:29 +10:00
Raoof Mohammed 2de1c36061 etcdmain: Proxy doesnt specify - listening on http or https
etcdmain: Proxy doesnt specify - listening on http or https

Fixes #3464
2015-09-08 17:19:23 -04:00
Xiang Li 7957677cf2 etcdmain: proxy does not need to belong to the discovered cluster 2015-09-01 11:24:02 -07:00
Xiang Li d94e712d91 *: support wal dir 2015-09-01 09:54:27 -07:00
Yicheng Qin 58455a2ae4 etcdmain: check error before assigning peer transport
Or it may panic when new transport fails, e.g., TLS info is invalid.
2015-08-25 22:04:26 -07:00
Yicheng Qin 2ac9a329ab etcdmain: stop setting GOMAXPROCS explicitly
We always want to use GOMAXPROCS() as the way go parses it. When in go1.4, we
want to expose GOMAXPROCS value, so we set GOMAXPROCS explicitly as the
way go 1.4 does and print it out.

But it becomes a problem when go 1.5 changes the way to set GOMAXPROCS.

Fix the problem by stop setting GOMAXPROCS and get its value directly.

Due to this change, it sets default GOMAXPROCS to the
number of CPUs available when compiling in go 1.5, which matches how go 1.5 works:
https://docs.google.com/document/d/1At2Ls5_fhJQ59kDK2DFVhFu3g5mATSXqqV5QrxinasI/edit

This is a behavior change in etcd 2.2.
2015-08-25 13:38:16 -07:00
Xiang Li 044b23c3ca Merge pull request #3356 from xiang90/travis
*: test gofmt with -s and fix reported issues
2015-08-21 18:59:51 -07:00
Xiang Li 6b23a8131f *: test gofmt with -s and fix reported issues 2015-08-21 18:52:16 -07:00
Xiang Li 92634356c1 *: use limitedListener from golang 2015-08-20 20:02:35 -07:00
Xiang Li 6b77c146ec etcdmain: print out version information on startup 2015-08-20 14:50:16 -07:00
Yicheng Qin 927d5f3d26 Merge pull request #3301 from yichengq/ca-file
etcdmain: update -ca-file description
2015-08-17 23:36:33 -07:00
Yicheng Qin c0747a7b8b etcdmain: update -ca-file description
so people could deprecate old flags and use new flags much easier.
2015-08-17 22:36:04 -07:00
Yicheng Qin ffae601af5 etcdmain: calculate dial timeout for peer transport
This helps peer communication in globally-deployed cluster.
2015-08-17 16:52:53 -07:00
Yicheng Qin c3d4d11402 etcdhttp: adjust request timeout based on config
It uses heartbeat interval and election timeout to estimate the
expected request timeout.

This PR helps etcd survive under high roundtrip-time environment,
e.g., globally-deployed cluster.
2015-08-12 09:22:59 -07:00
Xiang Li a718329ad3 Merge pull request #3248 from xiang90/v3
initial v3 demo
2015-08-10 13:59:03 -07:00
Xiang Li 6c58333969 etcdmain: use default formatter
The default formatter would use syslog style when running
under init system, and would use pretty format otherwise.
2015-08-10 13:38:22 -07:00
Xiang Li c1e0b19f9f *: better flag 2015-08-10 09:53:17 -07:00
Xiang Li f004b4dac7 *: etcdserver supports v3 demo 2015-08-08 05:58:29 -07:00
Xiang Li 1b572ae2dd etcdmain: fix path printing 2015-08-06 15:53:24 -07:00
Xiang Li 7314310aed Merge pull request #3233 from xiang90/srv_discovery
better dns discovery error and doc
2015-08-06 14:35:22 -07:00
Yicheng Qin 2c2249dadc Merge pull request #3219 from yichengq/limit-listener
etcdmain: stop accepting client conns when it reachs limit
2015-08-06 12:17:49 -07:00
Yicheng Qin 97923ca3fc etcdmain: close client conns when it exceeds limit
This solves the problem that etcd may fatal because its critical path
cannot get file descriptor resource when the number of clients is too
big. The PR lets the client listener close client connections
immediately after they are accepted when
the file descriptor usage in the process reaches some pre-set limit, so
it ensures that the internal critical path could always get file
descriptor when it needs.

When there are tons to clients connecting to the server, the original
behavior is like this:

```
2015/08/4 16:42:08 etcdserver: cannot monitor file descriptor usage
(open /proc/self/fd: too many open files)
2015/08/4 16:42:33 etcdserver: failed to purge snap file open
default2.etcd/member/snap: too many open files
[halted]
```

Current behavior is like this:

```
2015/08/6 19:05:25 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:25 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:26 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:27 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:28 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:28 etcdserver: 80% of the file descriptor limit is
used [used = 873, limit = 1024]
```

It is available at linux system today because pkg/runtime only has linux
support.
2015-08-06 12:03:20 -07:00
Xiang Li 203e0f178b etcdmian: better error for srv discovery failure 2015-08-06 11:38:53 -07:00
Xiang Li 0cbac56fa2 etcdmain: support sdnotify for readiness 2015-07-31 13:33:18 +08:00