163 Commits (72462a72fbcf5226ecad371adf0825b5705856d8)

Author SHA1 Message Date
Yicheng Qin 8ea3d157c5 Revert "Revert "Treat URLs have same IP address as same""
This reverts commit 3153e635d5.

Conflicts:
	etcdserver/config.go
2015-08-21 09:41:13 -07:00
Xiang Li ff37cc455c pkg/transport: remove home-grown limitedListener 2015-08-20 20:03:27 -07:00
Xiang Li 3ca5482251 pkg/fileutil: treat not-supported error as nil error in preallocate 2015-08-20 11:15:02 -07:00
Yicheng Qin 4778d780a8 pkg/pathutil: change copyright for path.go
The file contains only a function borrowed from the standard http library,
so it uses that library's copyright.
2015-08-18 11:48:22 -07:00
Yicheng Qin b5ec7f543a client: use canonical url path in request
The main change is that the trailing slash is kept. This helps the
auth feature judge path permissions accurately.
2015-08-18 08:59:48 -07:00
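To illustrate what keeping the trailing slash means in commit b5ec7f543a above, here is a minimal sketch modeled on the path cleaning done in Go's net/http. The name `canonicalPath` is hypothetical, and this is not necessarily the code that lives in pkg/pathutil.

```go
package main

import (
	"fmt"
	"path"
)

// canonicalPath is a hypothetical helper: it cleans p the way path.Clean
// does, but keeps a trailing slash when the input had one.
func canonicalPath(p string) string {
	if p == "" {
		return "/"
	}
	if p[0] != '/' {
		p = "/" + p
	}
	np := path.Clean(p)
	// path.Clean drops the trailing slash (except for the root), so restore it.
	if p[len(p)-1] == '/' && np != "/" {
		np += "/"
	}
	return np
}

func main() {
	for _, p := range []string{"/v2/keys/foo/", "/v2/keys//foo", "v2/../keys/"} {
		fmt.Printf("%q -> %q\n", p, canonicalPath(p))
	}
}
```

With a plain `path.Clean`, `/v2/keys/foo/` and `/v2/keys/foo` would collapse into the same string, which is presumably what made path permission checks ambiguous.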
Yicheng Qin 2c2249dadc Merge pull request #3219 from yichengq/limit-listener
etcdmain: stop accepting client conns when it reaches limit
2015-08-06 12:17:49 -07:00
Yicheng Qin 97923ca3fc etcdmain: close client conns when it exceeds limit
This solves the problem that etcd may exit fatally because its critical path
cannot get a file descriptor when the number of clients is too
large. The PR makes the client listener close client connections
immediately after they are accepted once
the file descriptor usage in the process reaches a preset limit, which
ensures that the internal critical path can always get a file
descriptor when it needs one.

When tons of clients connect to the server, the original
behavior is like this:

```
2015/08/4 16:42:08 etcdserver: cannot monitor file descriptor usage
(open /proc/self/fd: too many open files)
2015/08/4 16:42:33 etcdserver: failed to purge snap file open
default2.etcd/member/snap: too many open files
[halted]
```

With this change, the behavior is like this:

```
2015/08/6 19:05:25 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:25 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:26 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:27 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:28 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:28 etcdserver: 80% of the file descriptor limit is
used [used = 873, limit = 1024]
```

This is currently available only on Linux because pkg/runtime only has Linux
support.
2015-08-06 12:03:20 -07:00
Xiang Li 01c286ccb6 Merge pull request #3231 from xiang90/fallocate
pkg/fileutil: support preallocate
2015-08-06 10:25:28 -07:00
Xiang Li 39a4b6a5e5 pkg/fileutil: support preallocate 2015-08-06 10:10:58 -07:00
Xiang Li 58503817ec etcdserver: internal request union 2015-08-05 07:47:10 -07:00
Yicheng Qin 6317abf7e4 pkg/transport: fix HTTPS downgrade bug for keepalive listener
If the TLS config is empty, etcd downgrades the keepalive listener from HTTPS to
HTTP without warning. This results in an HTTPS downgrade bug for client URLs.
This commit returns an error if it cannot listen on TLS.
2015-07-21 12:53:21 -07:00
Xiang Li b8279b3591 types: add len func for urlmaps 2015-07-21 12:53:20 -07:00
Yicheng Qin 1624235bb3 pkg/testutil: extend wait schedule time to 10ms
Waiting 3ms is not long enough for scheduling to work reliably. The test suite
fails about once per 200 runs in travis because of this. Extend the wait to 10ms
so that scheduling has time to happen. With this change it runs 1000 times successfully
in travis.
2015-07-13 09:05:40 -07:00
Yicheng Qin 3d4642c2c4 Merge pull request #3059 from yichengq/fix-wait-stress-test
pkg/wait: extend timeout to check closed channel
2015-06-25 11:16:54 -07:00
Yicheng Qin eea7f28be4 pkg/wait: extend timeout to check closed channel
It is possible to trigger the time.After case if the timer went off
between time.After setting the timer for its channel and the time that
select looked at the channel. So it needs to be longer.

refer: https://groups.google.com/forum/#!topic/golang-nuts/1tjcV80ccq8
2015-06-25 10:43:12 -07:00
Yicheng Qin 107263ef9f pkg/fileutil: fix TestPurgeFile
Sometimes the test needs to wait longer for the file to be detected and removed.
2015-06-25 10:09:20 -07:00
Xiang Li 8e7fa9e201 Merge pull request #2976 from yichengq/fix-lock-test
pkg/fileutil: wait longer for relock
2015-06-12 15:20:18 -07:00
Yicheng Qin 7723b91c06 pkg/fileutil: wait longer for relock
Running on multiple CPUs makes the test slower, so it waits longer for the relock.
2015-06-12 15:17:28 -07:00
Yicheng Qin 75f91bab5c pkg/fileutil: wait longer before checking purge results
Running on multiple CPUs may be slower than running on a single CPU, so it may
take longer to remove files.
Increase the wait from 5ms to 20ms to give it enough time.
2015-06-12 14:36:15 -07:00
Xiang Li 8ad7ed321e *:godep log pkg 2015-06-11 14:22:14 -07:00
Xiang Li 4b5dbeff9b pkg/pbutil: use leveled log 2015-06-11 14:19:53 -07:00
Xiang Li 865a5ffc61 pkg/osutil: use leveled log 2015-06-11 14:19:53 -07:00
Xiang Li a45f53986f pkg/netutil: use leveled log 2015-06-11 14:19:52 -07:00
Xiang Li 69819d334a pkg/flags: use leveled log 2015-06-11 14:19:52 -07:00
Xiang Li f64a8214f7 Merge pull request #2952 from xiang90/fileutil
fileutil: use leveled logging
2015-06-10 16:01:24 -07:00
Xiang Li dc87454487 fileutil: return on error and send it to error chan 2015-06-10 15:59:24 -07:00
Xiang Li e2c2f098bc fileutil: use leveled logging 2015-06-10 15:57:59 -07:00
Yicheng Qin 018fb8e6d9 pkg/testutil: ForceGosched -> WaitSchedule
ForceGosched() performs badly when GOMAXPROCS>1. When GOMAXPROCS=1, it
can promise that other goroutines run long enough
because it always yields the processor to other goroutines. But it cannot
yield the processor to a goroutine running on another processor. So when
GOMAXPROCS>1, the yield may finish when the goroutine on the other
processor has run for only a short time.

Here is a test to confirm the case:

```
package main

import (
	"fmt"
	"runtime"
	"testing"
)

func ForceGosched() {
	// yield many times, which is likely enough to schedule up to 10 goroutines.
	for i := 0; i < 10000; i++ {
		runtime.Gosched()
	}
}

var d int

func loop(c chan struct{}) {
	for {
		select {
		case <-c:
			for i := 0; i < 1000; i++ {
				fmt.Sprintf("come to time %d", i)
			}
			d++
		}
	}
}

func TestLoop(t *testing.T) {
	c := make(chan struct{}, 1)
	go loop(c)
	c <- struct{}{}
	ForceGosched()
	if d != 1 {
		t.Fatal("d is not incremented")
	}
}
```

`go test -v -race` runs well, but `GOMAXPROCS=2 go test -v -race` fails.

This commit changes the helper to wait for scheduling to happen instead.
2015-06-10 14:37:41 -07:00
Yicheng Qin 3153e635d5 Revert "Treat URLs have same IP address as same"
This reverts commit f8ce5996b0.

etcd no longer resolves TCP addresses passed in through flags,
so there is no need to compare hostname and IP slices anymore.
(for more details: a3892221ee)

Conflicts:
	etcdserver/cluster.go
	etcdserver/config.go
	pkg/netutil/netutil.go
	pkg/netutil/netutil_test.go
2015-05-16 03:21:10 -07:00
Yicheng Qin 256a7cfe8c pkg/wait: fix TestWaitTestStress
The test may fail if two consecutive time.Now() calls return the same value.
Sleep 1ns to avoid this situation.
2015-05-13 13:41:34 -07:00
Yicheng Qin 032db5e396 *: extract types.Cluster from etcdserver.Cluster
The PR extracts types.Cluster from etcdserver.Cluster. types.Cluster
is used for flag parsing and etcdserver config.

There is no need to expose etcdserver.Cluster publicly, since it contains
lots of etcdserver internal details and methods. This is the first step
toward hiding it.
2015-05-12 14:53:11 -07:00
mischief 2e8c932ab0 pkg/fileutil: add plan9 lockfile support 2015-05-11 13:24:01 -07:00
Alexander Kolbasov 39c7060d3b pkg/fileutil: add filelock support for solaris 2015-04-24 12:18:08 -07:00
Yicheng Qin 2141308524 Merge pull request #2631 from yichengq/metrics-fd
etcdserver: metrics and monitor number of file descriptor
2015-04-08 11:28:58 -07:00
Yicheng Qin 7a7e1f7a7c etcdserver: metrics and monitor number of file descriptor
It exposes metrics for the file descriptor limit and the number of file descriptors in use.
It also prints a warning when more than 80% of the fd limit has been used.

```
2015/04/08 01:26:19 etcdserver: 80% of the file descriptor limit is open
[open = 969, limit = 1024]
```
2015-04-08 11:17:48 -07:00
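The 80% check in commit 7a7e1f7a7c above can be sketched as a small pure function. The name `fdWarning` is hypothetical and the exact comparison etcd uses is an assumption; the message format is copied from the log line quoted in the commit.

```go
package main

import "fmt"

// fdWarning is a hypothetical helper: it reports whether more than 80% of
// the descriptor limit is open and, if so, returns the warning text.
func fdWarning(open, limit uint64) (string, bool) {
	if limit == 0 || float64(open) <= 0.8*float64(limit) {
		return "", false
	}
	msg := fmt.Sprintf("80%% of the file descriptor limit is open [open = %d, limit = %d]", open, limit)
	return msg, true
}

func main() {
	if msg, ok := fdWarning(969, 1024); ok {
		fmt.Println("etcdserver:", msg)
	}
}
```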
Yicheng Qin 374a18130a Merge pull request #2629 from crawford/ports
*: update to use IANA-assigned ports
2015-04-06 13:57:18 -07:00
Alex Crawford d9ad6aa2a9 *: update to use IANA-assigned ports 2015-04-06 13:49:43 -07:00
Yicheng Qin 2b830dd64b pkg: remove unused pkg/coreos
The package was used in upgrade path, and is not used anywhere now.
2015-04-06 13:33:42 -07:00
Mateus Braga cec8466ad2 osutil: fix InterruptHandler comment position 2015-04-04 11:32:42 -04:00
Yicheng Qin 24f9ba8ee8 pkg/netutil: fix DropPort and RecoverPort in linux
The iptables commands in DropPort do not work because setting the
destination-port flag without specifying the protocol is invalid.
2015-03-31 10:39:31 -07:00
Yicheng Qin 04a62dd54b tools/functional-tester: add isolate failures 2015-03-29 00:29:47 -07:00
Kelsey Hightower 4611c3b2d7 netutil: add BasicAuth function
etcd ships its own BasicAuth function and no longer requires
Go 1.4 to build.
2015-03-20 17:32:33 -07:00
Kelsey Hightower 8dd8b1cdc2 etcd: server SSL and client cert auth configuration is more explicit
etcd does not provide enough flexibility to configure server SSL and
client authentication separately. When configuring server SSL the
`--ca-file` flag is required to trust self-signed SSL certificates
used to service client requests.

The `--ca-file` has the side effect of enabling client cert
authentication. This can be surprising for those looking to simply
secure communication between an etcd server and client.

Resolve this issue by introducing four new flags:

    --client-cert-auth
    --peer-client-cert-auth
    --trusted-ca-file
    --peer-trusted-ca-file

These new flags will allow etcd to support a more explicit SSL
configuration for both etcd clients and peers.

Example usage:

Start etcd with server SSL and no client cert authentication:

    etcd -name etcd0 \
    --advertise-client-urls https://etcd0.example.com:2379 \
    --cert-file etcd0.example.com.crt \
    --key-file etcd0.example.com.key \
    --trusted-ca-file ca.crt

Start etcd with server SSL and enable client cert authentication:

    etcd -name etcd0 \
    --advertise-client-urls https://etcd0.example.com:2379 \
    --cert-file etcd0.example.com.crt \
    --key-file etcd0.example.com.key \
    --trusted-ca-file ca.crt \
    --client-cert-auth

Start etcd with server SSL and client cert authentication for both
peer and client endpoints:

    etcd -name etcd0 \
    --advertise-client-urls https://etcd0.example.com:2379 \
    --cert-file etcd0.example.com.crt \
    --key-file etcd0.example.com.key \
    --trusted-ca-file ca.crt \
    --client-cert-auth \
    --peer-cert-file etcd0.example.com.crt \
    --peer-key-file etcd0.example.com.key \
    --peer-trusted-ca-file ca.crt \
    --peer-client-cert-auth

This change is backwards compatible with etcd versions 2.0.0+. The
current behavior of the `--ca-file` flag is preserved.

Fixes #2499.
2015-03-12 23:09:54 -07:00
kmeaw 00a22891ee pkg/flags: Add support for IPv6 addresses
Support IPv6 address for ETCD_ADDR and ETCD_PEER_ADDR

pkg/flags: Support IPv6 address for ETCD_ADDR and ETCD_PEER_ADDR

pkg/flags: tests for IPv6 addr and bind-addr flags

pkg/flags: IPAddressPort.Host: do not enclose IPv6 address in square brackets

pkg/flags: set default bind address to [::] instead of 0.0.0.0

pkg/flags: we don't need fmt any more

also, one minor fix: net.JoinHostPort takes string as a port value

pkg/flags: fix ipv6 tests

pkg/flags: test both IPv4 and IPv6 addresses in TestIPAddressPortString

etcdmain: test: use [::] instead of 0.0.0.0
2015-03-12 11:30:53 +03:00
Xiang Li 3c9581adde pkg/transport: fix downgrade https to http bug in transport
If the TLS config is empty, etcd downgrades https to http without a warning.
This commit avoids the downgrade and stops etcd from bootstrapping if it cannot
listen on TLS.
2015-03-06 10:42:23 -08:00
Xiang Li e50d43fd32 pkg/transport: set the maxIdleConnsPerHost to -1
For transports that use timeout connections, we set
maxIdleConnsPerHost to -1. The default transport does not clear
the timeout on the connections it marks as idle, so connections
with a timeout cannot be reused.
2015-03-02 21:52:03 -08:00
Yicheng Qin 2c94e2d771 *: make dial timeout configurable
The dial timeout is set shorter because:
1. etcd is supposed to run in a good environment, and the new value is long
enough
2. a shorter dial timeout makes dials fail faster, which is good for
performance
2015-02-28 11:18:59 -08:00
Xiang Li 9b6fcfffb6 *: replace our own metrics with codahale/metrics 2015-02-28 10:11:53 -08:00
Xiang Li a560c52815 Merge pull request #2354 from xiang90/wait_time
pkg/wait: add WaitTime
2015-02-23 14:29:39 -08:00
Xiang Li 53d20a8a29 pkg/wait: add WaitTime
WaitTime waits on deadline instead of id.
2015-02-23 14:26:42 -08:00