Compare commits

367 Commits

Author SHA1 Message Date
Yicheng Qin
a2d4f85d33 *: bump to v2.3.0-alpha.0 2015-11-06 09:08:39 -08:00
Jonathan Boulle
63872d812a Merge pull request #3825 from jonboulle/master
contrib: add example systemd unit file
2015-11-06 17:53:32 +01:00
Jonathan Boulle
652d3f1974 contrib: add example systemd unit file 2015-11-06 17:50:19 +01:00
Jonathan Boulle
e3ce605cb5 Merge pull request #3826 from jonboulle/scripts
scripts: enforce genproto.sh is run from repo root
2015-11-06 17:26:16 +01:00
Jonathan Boulle
de0cb472be scripts: enforce genproto.sh is run from repo root 2015-11-06 16:13:24 +01:00
Jonathan Boulle
f48e95f7b0 Merge pull request #3822 from mitake/strict-reconfig-error-log
etcdserver: correct error log for strict reconfig checking
2015-11-06 15:13:27 +01:00
Hitoshi Mitake
2c8ffa6bcb etcdserver: correct error log for strict reconfig checking
This commit fixes an error log caused by the strict reconfig checking
option.

Before:
14:21:38 etcd2 | 2015-11-05 14:21:38.870356 E | etcdhttp: got unexpected response error (etcdserver: re-configuration failed due to not enough started members)

After:
13:27:33 etcd2 | 2015-11-05 13:27:33.089364 E | etcdhttp: etcdserver: re-configuration failed due to not enough started members

The error is not unexpected, so the old message is
incorrect.
2015-11-06 11:03:42 +09:00
Xiang Li
9d880f136f Merge pull request #3818 from yichengq/req-snap-log
etcdserver: fix snapshot index in creation log line
2015-11-05 14:04:46 -08:00
Yicheng Qin
0874c44cdc etcdserver: fix snapshot index in creation log line
The snapshot is created at appliedi instead of snapi.
2015-11-05 14:02:09 -08:00
Yicheng Qin
dadfdf6af8 Merge pull request #3802 from yichengq/fix-storage-watch
storage: delete key instead of setting it to false
2015-11-05 11:40:46 -08:00
Xiang Li
08f0d94019 Merge pull request #3809 from xiang90/rpc_kv
*: refactor kv rpc implementation
2015-11-04 19:05:48 -08:00
Yicheng Qin
47cad59571 Merge pull request #3813 from yichengq/update-version
*: update clusterMinVersion and feature maps for incoming v2.3
2015-11-04 14:37:37 -08:00
Yicheng Qin
ec3c2d23a3 *: update feature maps to adopt v2.3.0 2015-11-04 14:30:35 -08:00
Yicheng Qin
b82c171f5f version: update MinClusterVersion to v2.2.0
This is the preparation for bumping to v2.3.0-alpha
2015-11-04 14:30:04 -08:00
Xiang Li
03951495d3 Merge pull request #3811 from gyuho/storage_watchergauge_fix
storage: move watcherGauge to watchable_store
2015-11-04 13:23:10 -08:00
Gyu-Ho Lee
6e5eb03544 storage: move watcherGauge to watchable_store
watcherGauge should be increased every time we create a Watcher, not per watch
method call.
2015-11-04 13:17:47 -08:00
Xiang Li
319b77b051 Merge pull request #3810 from gyuho/storage_metrics_add_watcher_gauge
storage: add metrics to watcher
2015-11-04 13:09:07 -08:00
Gyu-Ho Lee
4ebf28aa2e storage: add metrics to watcher
This adds metrics to watcher, and changes some order in MustRegister function
calls in init (same order that we define the gauges).
2015-11-04 13:01:52 -08:00
Yicheng Qin
33fe6f41fb Merge pull request #3808 from yichengq/fix-wait-test
pkg/wait: extend wait timeout in TestWaitTime
2015-11-04 11:39:24 -08:00
Yicheng Qin
3d15526c35 Merge pull request #3796 from yichengq/fix-get-version
etcdserver: not reuse connections for peer transport
2015-11-04 11:39:14 -08:00
Xiang Li
c37bd2385a *: refactor kv rpc implementation 2015-11-04 11:36:17 -08:00
Yicheng Qin
3b8349c06e pkg/wait: extend wait timeout in TestWaitTime
Fix this error happening on travis:
```
--- FAIL: TestWaitTime-2 (0.01s)
		wait_time_test.go:46: cannot receive from ch as expected
```
2015-11-04 11:18:17 -08:00
Yicheng Qin
4ccbcb91c8 rafthttp: add functions to create listener and roundTripper
This moves the code that creates the listener and roundTripper for raft communication
to the same place, and uses explicit functions to build them. This prevents
possible development errors in the future.
2015-11-04 11:12:46 -08:00
Yicheng Qin
32819f6b3f etcdserver: use roundTripper to request peerURL
It uses roundTripper instead of Transport because roundTripper is
sufficient for its requirements.
2015-11-04 10:49:42 -08:00
Xiang Li
5272ee99b5 Merge pull request #3804 from xiang90/ctl_watch
etcdctlv3: support watch
2015-11-04 10:21:05 -08:00
Xiang Li
616078dc1b Merge pull request #3807 from xiang90/kv
*: rename etcd service to kv service in gRPC
2015-11-04 10:11:02 -08:00
Xiang Li
1a3f7f7fa4 *: rename etcd service to kv service in gRPC 2015-11-04 10:05:49 -08:00
Yicheng Qin
65d153db73 Merge pull request #3783 from yichengq/merge-logger
rafthttp: use MergeLogger for rafthttp logging
2015-11-04 09:48:43 -08:00
Yicheng Qin
6040d57106 rafthttp: use MergeLogger to merge message-drop log
rafthttp logs the same message repeatedly when many messages are
dropped, which spams the log.
Use MergeLogger to merge log lines in this case.
2015-11-04 07:26:58 -08:00
Yicheng Qin
5329159b5e rafthttp: remove failureMap from peerStatus
The logging mechanism is verbose, so it is removed from peerStatus.

We would like to see the status change
of connection with peers, and one error that leads to deactivation.
There is no need to print out all non-repeated errors.
2015-11-04 07:26:33 -08:00
Xiang Li
5c1b833232 etcdctlv3: support watch
A draft impl for demo.
2015-11-03 19:28:57 -08:00
Yicheng Qin
c8e622f517 storage: make putm/delm a set with empty value
This cleans the code, and reduces the allocation space.
2015-11-03 19:10:45 -08:00
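As a rough illustration of "a set with empty value", the sketch below uses a Go map whose values are `struct{}`, which take up no storage. The `keySet` type and its methods are hypothetical names chosen for this example and are not taken from the storage package.

```go
package main

import "fmt"

// keySet is a hypothetical set of keys. The struct{} value type
// carries no data, so only the keys cost memory.
type keySet map[string]struct{}

func (s keySet) add(k string)      { s[k] = struct{}{} }
func (s keySet) remove(k string)   { delete(s, k) }
func (s keySet) has(k string) bool { _, ok := s[k]; return ok }

func main() {
	putm := keySet{}
	putm.add("foo")
	putm.add("bar")
	putm.remove("foo")
	// Membership is decided by key presence alone.
	fmt.Println(putm.has("foo"), putm.has("bar"), len(putm))
}
```

With a set, "is the key present" is the only question to ask, so there is no second state (such as a `false` value) to keep in sync.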
Yicheng Qin
6dbfc21846 storage: delete key instead of setting it to false
When getting the watched events, it iterates over all keys in putm and delm
to generate the events. If we don't delete the key from putm/delm,
it would range over a key that was not actually put or deleted. This is
incorrect.

Fix the panic that happens when single put/delete is watched.
2015-11-03 19:00:39 -08:00
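The difference between deleting a key and setting it to false can be shown with a plain Go map. This is a minimal sketch: `rangedKeys` and the map contents are made up for illustration and do not mirror the actual putm/delm code.

```go
package main

import "fmt"

// rangedKeys reports how many keys a range loop visits.
func rangedKeys(m map[string]bool) int {
	n := 0
	for range m {
		n++
	}
	return n
}

func main() {
	// Setting the value to false leaves the key in the map,
	// so a range loop still visits it.
	marked := map[string]bool{"foo": true}
	marked["foo"] = false

	// delete removes the key, so a range loop skips it.
	deleted := map[string]bool{"foo": true}
	delete(deleted, "foo")

	fmt.Println(rangedKeys(marked), rangedKeys(deleted))
}
```

Any code that generates events by ranging over the map, without also checking the value, would therefore emit an event for the key that was only marked false.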
Xiang Li
94c6b6a93d Merge pull request #3801 from yichengq/fix-raft-timeout
raft: extend wait timeout in TestNodeAdvance
2015-11-03 18:29:47 -08:00
Yicheng Qin
0de52414cd raft: extend wait timeout in TestNodeAdvance
This fixes the failure met in semaphore CI.
2015-11-03 16:57:18 -08:00
Xiang Li
1f1d8e9282 Merge pull request #3800 from xiang90/watch_server
*: serve watch service
2015-11-03 16:32:29 -08:00
Xiang Li
10de2e6dbe *: serve watch service
Implement watch service and hook it up
with grpc server in etcdmain.
2015-11-03 15:58:34 -08:00
Xiang Li
70cb8b8391 Merge pull request #3799 from gyuho/nameing_in_metrics_watching
storage: apply same naming in metrics.go
2015-11-03 15:23:39 -08:00
Gyu-Ho Lee
bdc280c4a7 storage: apply same naming in metrics.go
This PR follows up on Xiang's https://github.com/coreos/etcd/pull/3795
and makes the naming consistent with its interface change.
2015-11-03 15:19:18 -08:00
Xiang Li
f6b097c0cc Merge pull request #3798 from xiang90/watch_new
*: add v3 watch service
2015-11-03 14:39:20 -08:00
Xiang Li
c160085f44 *: add v3 watch service 2015-11-03 14:21:24 -08:00
Xiang Li
154fc8e19c Merge pull request #3795 from xiang90/watch_stream
storage: add watchChan
2015-11-03 13:32:49 -08:00
Xiang Li
a1129dd5a5 storage: support multiple watching per watcher
We want to support multiple watchings per watcher chan. Then
we can have a single goroutine watch multiple keys/prefixes.
2015-11-03 12:36:11 -08:00
Xiang Li
34e7611093 Merge pull request #3797 from gyuho/procfile_20151103
Procfile: delay proxy waiting for initial cluster
2015-11-03 10:24:48 -08:00
Gyu-Ho Lee
5e29449d61 Procfile: delay proxy waiting for initial cluster
This fixes #3647 by delaying the proxy start by 3 seconds. Without this, the proxy
starts at the same time as the initial cluster. Since the default proxy director
refresh interval is 30 seconds, if the cluster is not ready on the first attempt, the
proxy fails to discover the members and has to wait another 30 seconds, which delays
proxying for the first 30 seconds.
2015-11-03 10:22:48 -08:00
Yicheng Qin
0eee88a3d9 etcdserver: use timeout transport as peer transport
This pairs with remote timeout listeners.

etcd uses a timeout listener, which times out accepted connections
if there is no activity, so idle connections may time out easily.
Because the timeout transport doesn't reuse connections, it avoids using
timed-out connections.

This fixes the problem that etcd fails to get the version of its peers.
2015-11-03 07:58:03 -08:00
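One way to get a transport that never reuses connections with the standard library alone is to disable keep-alives on `http.Transport`. This is only a sketch of the idea. `newNoReuseTransport` is a hypothetical helper, and etcd's own timeout transport in pkg/transport may be built differently.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newNoReuseTransport is a hypothetical helper. It returns a transport
// that opens a fresh connection for every request, so a connection the
// remote side has already timed out is never picked up again.
func newNoReuseTransport(dialTimeout time.Duration) *http.Transport {
	return &http.Transport{
		DialContext:       (&net.Dialer{Timeout: dialTimeout}).DialContext,
		DisableKeepAlives: true, // one request per connection
	}
}

func main() {
	tr := newNoReuseTransport(time.Second)
	client := &http.Client{Transport: tr}
	_ = client // the client would be used to query a peer's version endpoint
	fmt.Println(tr.DisableKeepAlives)
}
```

The trade-off is an extra connection setup per request, which is acceptable for infrequent calls such as a version check.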
Xiang Li
fe165de1d1 Merge pull request #3794 from yichengq/fix-proxy-term
etcdmain: fix parsing discovery error
2015-11-02 17:33:47 -08:00
Yicheng Qin
9757dcd3a2 etcdmain: fix parsing discovery error
The discovery error is wrapped into a struct now, and cannot be compared
to predefined errors. Correct the comparison behavior to fix the
problem.
2015-11-02 17:23:06 -08:00
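The pitfall described here, where an error wrapped in a struct no longer compares equal to a predefined error, can be reproduced in a few lines. The names `errFullCluster`, `wrappedError`, and `isFullCluster` are hypothetical, and `errors.Is` is a modern standard-library facility (Go 1.13+) rather than what the commit itself used.

```go
package main

import (
	"errors"
	"fmt"
)

// errFullCluster is a hypothetical predefined (sentinel) error.
var errFullCluster = errors.New("discovery: cluster is full")

// wrappedError is a hypothetical struct that wraps another error.
type wrappedError struct {
	Op  string
	Err error
}

func (e *wrappedError) Error() string { return e.Op + ": " + e.Err.Error() }

// Unwrap lets errors.Is look through the wrapper.
func (e *wrappedError) Unwrap() error { return e.Err }

// isFullCluster reports whether err is, or wraps, errFullCluster.
func isFullCluster(err error) bool {
	return errors.Is(err, errFullCluster)
}

func main() {
	var err error = &wrappedError{Op: "join", Err: errFullCluster}
	fmt.Println(err == errFullCluster) // direct comparison misses the wrapped error
	fmt.Println(isFullCluster(err))    // unwrapping finds it
}
```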
Xiang Li
4fd65ecd4c Merge pull request #3785 from yichengq/fix-block-test
storage: extend wait timeout for execution
2015-11-02 12:53:18 -08:00
Xiang Li
a74dd4c47a Merge pull request #3790 from xiang90/etcd-top
add etcdtop
2015-11-02 12:02:05 -08:00
Yicheng Qin
8f9d237d21 Merge pull request #3792 from wojtek-t/update_ugorji
Update dependency on ugorji/go/codec
2015-11-02 07:43:48 -08:00
Wojciech Tyczynski
65ae8784fb client: regenerate code to unmarshal key response
Regenerate code for unmarshaling key response with a new version of
ugorji/go/codec.
2015-11-02 12:06:32 +01:00
Wojciech Tyczynski
02eec7763d Godeps: update ugorji/go/codec dependency
Update ugorji/go/codec dependency to the newer version.
2015-11-02 12:03:49 +01:00
Xiang Li
e1b2e7245b tools/etcd-top: add copyright header 2015-11-01 18:19:32 -08:00
Xiang Li
c4f8fe96e8 travis: install libpcap 2015-11-01 18:16:20 -08:00
Xiang Li
2bfe995fb8 Godeps: add dependency for etcd-top 2015-11-01 18:07:27 -08:00
Tyler Neely
00557e96af tools: add etcd-top 2015-11-01 18:07:27 -08:00
Yicheng Qin
59b5dabc66 storage: extend wait timeout for execution
Extend the timeout so the test always passes in Travis.
2015-10-30 17:44:31 -07:00
Jonathan Boulle
f787e7904b Merge pull request #3762 from jonboulle/auth
etcdserver: restructure auth.Store and auth.User
2015-10-30 16:49:28 -07:00
Jonathan Boulle
ee522025b3 etcdserver: restructure auth.Store and auth.User
This attempts to decouple password-related functions, which previously
existed both in the Store and User structs, by splitting them out into a
separate interface, PasswordStore.  This means that they can be more
easily swapped out during testing.

This also changes the relevant tests to use mock password functions
instead of the bcrypt-backed implementations; as a result, the tests are
much faster.

Before:
```
	github.com/coreos/etcd/etcdserver/auth		31.495s
	github.com/coreos/etcd/etcdserver/etcdhttp	91.205s
```

After:
```
	github.com/coreos/etcd/etcdserver/auth		1.207s
	github.com/coreos/etcd/etcdserver/etcdhttp	1.207s
```
2015-10-30 16:33:40 -07:00
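The decoupling described above can be sketched as follows. Only the interface name `PasswordStore` comes from the commit message. The method set, `fakePasswordStore`, and `userStore` are assumptions made for illustration and do not reproduce the real auth package.

```go
package main

import (
	"errors"
	"fmt"
)

// PasswordStore isolates password hashing so tests can swap it out.
// The two methods below are assumed for this sketch.
type PasswordStore interface {
	HashPassword(password string) (string, error)
	CheckPassword(hashed, password string) bool
}

// fakePasswordStore is a cheap stand-in for a bcrypt-backed store.
type fakePasswordStore struct{}

func (fakePasswordStore) HashPassword(password string) (string, error) {
	return "fake:" + password, nil
}

func (fakePasswordStore) CheckPassword(hashed, password string) bool {
	return hashed == "fake:"+password
}

// userStore is a hypothetical store that depends only on the interface.
type userStore struct {
	PasswordStore
	hashes map[string]string
}

func newUserStore(ps PasswordStore) *userStore {
	return &userStore{PasswordStore: ps, hashes: map[string]string{}}
}

func (s *userStore) createUser(name, password string) error {
	if _, ok := s.hashes[name]; ok {
		return errors.New("user already exists")
	}
	h, err := s.HashPassword(password)
	if err != nil {
		return err
	}
	s.hashes[name] = h
	return nil
}

func (s *userStore) authenticate(name, password string) bool {
	h, ok := s.hashes[name]
	return ok && s.CheckPassword(h, password)
}

func main() {
	s := newUserStore(fakePasswordStore{})
	if err := s.createUser("alice", "secret"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(s.authenticate("alice", "secret"), s.authenticate("alice", "wrong"))
}
```

Because the fake store does no expensive hashing, tests built on it avoid the deliberate slowness of bcrypt, which is consistent with the speed-up reported in the commit message.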
Jonathan Boulle
2840260b3b Merge pull request #3781 from gyuho/doc_typo_20151029
Documentation: fix typo in proxy.md
2015-10-29 20:34:16 -07:00
Gyu-Ho Lee
294f03f85f Documentation: fix typo in proxy.md
This fixes some typos in proxy.md.

Thanks,
2015-10-29 20:30:50 -07:00
Jonathan Boulle
00b4880494 Update ROADMAP.md 2015-10-29 16:05:40 -07:00
Jonathan Boulle
42255748cd Update ROADMAP.md 2015-10-29 16:03:14 -07:00
Yicheng Qin
1b3d9130c9 Merge pull request #3759 from yichengq/rafthttp-unreachable
rafthttp: mark unreachable on unexpected response
2015-10-29 15:12:23 -07:00
Jonathan Boulle
1944893ef8 Merge pull request #3776 from gyuho/etcdmain_doc
etcdmain: fix package description compatible with godoc.org
2015-10-29 12:36:11 -07:00
Gyu-Ho Lee
821c071f3f etcdmain: fix package description for godoc.org
This fixes package description for etcdmain that wasn't compatible with
godoc.org, by deleting the extra blank lines between comment and package name.
2015-10-29 12:28:52 -07:00
Yicheng Qin
84d7825a77 rafthttp: stop masking errMemberRemoved in pipeline
It makes logic more straightforward and readable. Also, it makes the
handle method consistent with stream and snapshot sender.
2015-10-28 21:40:48 -07:00
Yicheng Qin
908a011604 rafthttp: mark unreachable on unexpected response
In rafthttp, when making a request to some endpoint, it may receive a
response with an unexpected status code and header. This indicates the endpoint
doesn't function correctly. It should mark the endpoint unreachable.
2015-10-28 21:40:11 -07:00
Xiang Li
2fe6893d5d Merge pull request #3772 from xiang90/watcher_sep
storage: move watcher interface into watcher.go
2015-10-28 21:14:55 -07:00
Xiang Li
f71bcfa8ce storage: move watcher interface into watcher.go 2015-10-28 21:10:58 -07:00
Jonathan Boulle
de99c9ed58 Merge pull request #3770 from yichengq/link-etcdctl
docs/libraries-and-tools: update the link of etcdctl
2015-10-28 14:07:33 -07:00
Yicheng Qin
4f36897f8c Merge pull request #3767 from kamilhark/master
Added etcdsh command line tool to the list
2015-10-28 13:41:14 -07:00
Yicheng Qin
ebde1d720e docs/libraries-and-tools: update the link of etcdctl
The old repo is deprecated, and we develop etcdctl in etcd repo now.
2015-10-28 13:37:28 -07:00
Yicheng Qin
b5c176360e Merge pull request #3768 from yichengq/fix-publish-test
etcdserver: extend wait timeout in TestPublishRetry
2015-10-28 13:31:51 -07:00
Josh Wood
695a5148cf Merge pull request #3769 from msoap/fix-docs
documentation: changed link to style doc
LGTM. Link accurate (old link was redir'd anyway).
Tiny fix, doc-only, so directly merging.
2015-10-28 13:13:37 -07:00
Sergey Mudrik
3dad5fffc0 documentation: changed link to style doc
Go-project has been moved from code.google.com to github.com
2015-10-28 21:49:28 +02:00
Yicheng Qin
7d757bbc8a etcdserver: extend wait timeout in TestPublishRetry
It fixes the failure in semaphore CI:
```
--- FAIL: TestPublishRetry (0.00s)
		server_test.go:1108: len(action) = 1, want >= 2
```
2015-10-28 12:07:00 -07:00
Kamil
ae18e6ea37 docs/libraries-and-tools: Added etcdsh command line tool to the list 2015-10-28 19:38:09 +01:00
Yicheng Qin
099d8674c4 Merge pull request #3746 from yichengq/load-storage
etcdserver: fix recovering snapshot from disk
2015-10-27 14:42:41 -07:00
Yicheng Qin
4b8ee2d66e storage: skip old entry in ConsistentWatchableStore
This avoids applying the same entry twice when restoring from disk.
2015-10-26 23:26:06 -07:00
Yicheng Qin
263b270708 etcdserver: commit v3 storage before releasing WAL
This ensures that v3 storage can always find the following log entries
on restart.
2015-10-26 21:06:08 -07:00
Xiang Li
70f9407d2d Merge pull request #3758 from xiang90/race
*: fix various data races detected by race detector
2015-10-26 20:57:31 -07:00
Xiang Li
ab4892ade2 Merge pull request #3749 from gyuho/etcdmain_flags_20151025
etcdmain: make flags and formats idential
2015-10-26 20:54:37 -07:00
Xiang Li
a8e6e71bf9 *: fix various data races detected by race detector 2015-10-26 20:49:37 -07:00
Xiang Li
306dd7183b Merge pull request #3757 from xiang90/race
rafthttp: fix data races detected by go race detector
2015-10-26 17:10:17 -07:00
Xiang Li
336d177c82 rafthttp: fix data races detected by go race detector 2015-10-26 15:29:08 -07:00
Yicheng Qin
4766227b76 Merge pull request #3750 from yichengq/rafthttp-continue
rafthttp: fix wrong return in pipeline.handle
2015-10-26 14:11:42 -07:00
Yicheng Qin
4076dda101 rafthttp: fix wrong return in pipeline.handle
pipeline.handle is a long-living one, and should continue to receive
next message to send out when current message fails to send. So it
should `continue` instead of `return` here.
2015-10-26 14:05:19 -07:00
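The effect of `continue` versus `return` in a long-lived send loop can be sketched like this. `sendAll` and `failOn` are hypothetical stand-ins and are not the actual pipeline.handle code.

```go
package main

import (
	"errors"
	"fmt"
)

// sendAll is a hypothetical stand-in for a long-lived handle loop.
// It returns the number of messages sent successfully.
func sendAll(msgs <-chan string, send func(string) error) int {
	sent := 0
	for m := range msgs {
		if err := send(m); err != nil {
			// A `return` here would end the loop and strand every
			// later message; `continue` moves on to the next one.
			continue
		}
		sent++
	}
	return sent
}

// failOn returns a send function that fails only for the given message.
func failOn(bad string) func(string) error {
	return func(m string) error {
		if m == bad {
			return errors.New("send failed")
		}
		return nil
	}
}

func main() {
	msgs := make(chan string, 3)
	for _, m := range []string{"a", "bad", "c"} {
		msgs <- m
	}
	close(msgs)
	fmt.Println(sendAll(msgs, failOn("bad"))) // "c" is still sent after "bad" fails
}
```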
Jonathan Boulle
44bbc87698 Merge pull request #3756 from suryanathan/master
docs/libraries-and-tools: Update libraries-and-tools.md with etcdcpp
2015-10-26 13:52:42 -07:00
suryanathan
e4ada19996 docs/libraries-and-tools: Update libraries-and-tools.md with etcdcpp
Add a c++ language binding for API version 2.2.0
2015-10-26 16:44:17 -04:00
Jonathan Boulle
cc378585a9 Merge pull request #3755 from jonboulle/master
travis: only run unit tests
2015-10-26 13:36:18 -07:00
Jonathan Boulle
516be7a781 travis: only run unit tests
Travis has chronic problems successfully running the integration suite,
and we've successfully moved to Semaphore for that purpose, but Travis can
still be useful as a fail-fast option for running unit tests and checking formatting.
2015-10-26 12:47:15 -07:00
Gyu-Ho Lee
52782cf8ee etcdmain: make flags and formats idential
This makes flagsline and config.go identical in its flag description and some
punctuation conventions.
2015-10-25 06:31:37 -07:00
Yicheng Qin
d44b79c3c9 Merge pull request #3748 from coreos/revert-3737-rafthttp-continue
Revert "rafthttp: fix wrong return in pipeline.handle"
2015-10-24 21:05:52 -07:00
Yicheng Qin
5eda45ece6 Revert "rafthttp: fix wrong return in pipeline.handle" 2015-10-24 20:25:56 -07:00
Yicheng Qin
dbba5bb373 Merge pull request #3737 from yichengq/rafthttp-continue
rafthttp: fix wrong return in pipeline.handle
2015-10-24 19:42:38 -07:00
Yicheng Qin
7e38f05ceb Merge pull request #3742 from yichengq/save-index
etcdserver: save consistent index into v3 storage
2015-10-24 09:48:28 -07:00
Yicheng Qin
15ed6d8268 etcdserver: save consistent index into v3 storage
This helps recover the consistent index on restart in the future.
2015-10-24 09:27:24 -07:00
Yicheng Qin
f648d52afe rafthttp: fix wrong return in pipeline.handle
pipeline.handle is a long-living one, and should continue to receive
next message to send out when current message fails to send. So it
should `continue` instead of `return` here.
2015-10-23 17:00:03 -07:00
Yicheng Qin
41cb39b68a storage: Get -> ConsistentIndex in ConsistentIndexGetter
To make the method name more specific in the context.
2015-10-23 16:40:55 -07:00
Yicheng Qin
4f47b08cf6 Merge pull request #3744 from yichengq/fix-sem
raft: extend wait timeout in TestMultiNodeAdvance
2015-10-23 13:20:52 -07:00
Yicheng Qin
bf3057e5bd raft: extend wait timeout in TestMultiNodeAdvance
This fixes the failure met in semaphore CI:

```
--- FAIL: TestMultiNodeAdvance-2 (0.01s)
		multinode_test.go:458: expect Ready after Advance, but there is
		no Ready available
```
2015-10-23 12:08:24 -07:00
Yicheng Qin
01559fafeb Merge pull request #3741 from yichengq/receive-restore
etcdserver: restore KV snapshot when receiving snapshot
2015-10-23 09:24:17 -07:00
Yicheng Qin
cacc0d6432 etcdserver: restore KV snapshot when receiving snapshot
When a slow follower receives the snapshot sent from the leader, it
should rename the snapshot file to the default KV file path, and
restore KV snapshot.

Have tested it manually and it works pretty well.
2015-10-23 08:43:26 -07:00
Yicheng Qin
d33c26c20a Merge pull request #3730 from yichengq/storage-consistent
storage: add consistentWatchableStore
2015-10-23 08:15:04 -07:00
Yicheng Qin
4fb4bc3ca8 storage: add consistentWatchableStore
consistentWatchableStore maintains an index that is always consistent
with the latest txn. The index can be used to indicate the progress
of the store so far during recovery.
2015-10-22 22:54:51 -07:00
Josh Wood
ae62a77de6 Merge pull request #3729 from xiang90/mem_bench
doc: add benchmark doc for new storage pkg
2015-10-22 10:54:43 -07:00
Xiang Li
e3cedeeb12 doc: add benchmark doc for new storage pkg 2015-10-22 13:53:03 -04:00
Xiang Li
2feccd3fa4 Merge pull request #3733 from yichengq/fix-wait-timeout
pkg/transport: extend wait timeout for write
2015-10-22 13:07:14 -04:00
Yicheng Qin
d3ebecdddd pkg/transport: extend wait timeout for write
This helps the test to pass safely in semaphore CI.

Based on my manual testing, it may take at most 500ms to return
error in semaphore CI, so I set 1s as a safe value.
2015-10-21 18:27:21 -07:00
Yicheng Qin
8b08fff1e9 Merge pull request #3731 from yichengq/storage-kv
storage: fix WatchableKV interface and refine comment
2015-10-21 17:27:24 -07:00
Yicheng Qin
01b163e77d Merge pull request #3588 from gyuho/storage/watchable_store.go-use-map-for-unsynced
storage/watchable_store.go: use map for unsynced
2015-10-21 16:50:15 -07:00
Yicheng Qin
44cecb8624 Merge pull request #3732 from yichengq/config-header
docs/configuration: fix heading hierarchy
2015-10-21 15:50:38 -07:00
Gyu-Ho Lee
f73d0ed1d9 storage: use map for watchable store unsynced
This is for `TODO: use map to reduce cancel cost`.
I switched slice to map, and benchmark results show
that map implementation performs better, as follows:

```
[1]:
benchmark                                   old ns/op     new ns/op     delta
BenchmarkWatchableStoreUnsyncedCancel       215212        1307          -99.39%
BenchmarkWatchableStoreUnsyncedCancel-2     120453        710           -99.41%
BenchmarkWatchableStoreUnsyncedCancel-4     120765        748           -99.38%
BenchmarkWatchableStoreUnsyncedCancel-8     121391        719           -99.41%

benchmark                                   old allocs     new allocs     delta
BenchmarkWatchableStoreUnsyncedCancel       0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-2     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-4     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-8     0              0              +0.00%

benchmark                                   old bytes     new bytes     delta
BenchmarkWatchableStoreUnsyncedCancel       200           1             -99.50%
BenchmarkWatchableStoreUnsyncedCancel-2     138           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-4     138           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-8     139           0             -100.00%

[2]:
benchmark                                   old ns/op     new ns/op     delta
BenchmarkWatchableStoreUnsyncedCancel       212550        1117          -99.47%
BenchmarkWatchableStoreUnsyncedCancel-2     120927        691           -99.43%
BenchmarkWatchableStoreUnsyncedCancel-4     120752        699           -99.42%
BenchmarkWatchableStoreUnsyncedCancel-8     121012        688           -99.43%

benchmark                                   old allocs     new allocs     delta
BenchmarkWatchableStoreUnsyncedCancel       0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-2     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-4     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-8     0              0              +0.00%

benchmark                                   old bytes     new bytes     delta
BenchmarkWatchableStoreUnsyncedCancel       197           1             -99.49%
BenchmarkWatchableStoreUnsyncedCancel-2     138           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-4     138           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-8     139           0             -100.00%

[3]:
benchmark                                   old ns/op     new ns/op     delta
BenchmarkWatchableStoreUnsyncedCancel       214268        1183          -99.45%
BenchmarkWatchableStoreUnsyncedCancel-2     120763        759           -99.37%
BenchmarkWatchableStoreUnsyncedCancel-4     120321        708           -99.41%
BenchmarkWatchableStoreUnsyncedCancel-8     121628        680           -99.44%

benchmark                                   old allocs     new allocs     delta
BenchmarkWatchableStoreUnsyncedCancel       0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-2     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-4     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-8     0              0              +0.00%

benchmark                                   old bytes     new bytes     delta
BenchmarkWatchableStoreUnsyncedCancel       200           1             -99.50%
BenchmarkWatchableStoreUnsyncedCancel-2     139           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-4     138           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-8     139           0             -100.00%

[4]:
benchmark                                   old ns/op     new ns/op     delta
BenchmarkWatchableStoreUnsyncedCancel       208332        1089          -99.48%
BenchmarkWatchableStoreUnsyncedCancel-2     121011        691           -99.43%
BenchmarkWatchableStoreUnsyncedCancel-4     120678        681           -99.44%
BenchmarkWatchableStoreUnsyncedCancel-8     121303        721           -99.41%

benchmark                                   old allocs     new allocs     delta
BenchmarkWatchableStoreUnsyncedCancel       0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-2     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-4     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-8     0              0              +0.00%

benchmark                                   old bytes     new bytes     delta
BenchmarkWatchableStoreUnsyncedCancel       194           1             -99.48%
BenchmarkWatchableStoreUnsyncedCancel-2     139           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-4     139           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-8     139           0             -100.00%

[5]:
benchmark                                   old ns/op     new ns/op     delta
BenchmarkWatchableStoreUnsyncedCancel       211900        1097          -99.48%
BenchmarkWatchableStoreUnsyncedCancel-2     121795        753           -99.38%
BenchmarkWatchableStoreUnsyncedCancel-4     123182        700           -99.43%
BenchmarkWatchableStoreUnsyncedCancel-8     122820        688           -99.44%

benchmark                                   old allocs     new allocs     delta
BenchmarkWatchableStoreUnsyncedCancel       0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-2     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-4     0              0              +0.00%
BenchmarkWatchableStoreUnsyncedCancel-8     0              0              +0.00%

benchmark                                   old bytes     new bytes     delta
BenchmarkWatchableStoreUnsyncedCancel       198           1             -99.49%
BenchmarkWatchableStoreUnsyncedCancel-2     140           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-4     141           0             -100.00%
BenchmarkWatchableStoreUnsyncedCancel-8     141           0             -100.00%
```
2015-10-21 15:30:15 -07:00
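The benchmark delta above is what one would expect when a linear scan is replaced by a hash lookup. The following is a minimal sketch of the two cancel strategies. The `watcher` type and both functions are hypothetical and much simpler than the real watchable store.

```go
package main

import "fmt"

// watcher is a hypothetical, stripped-down watcher.
type watcher struct{ key string }

// cancelFromSlice scans for the watcher before removing it: O(n) per cancel.
func cancelFromSlice(ws []*watcher, w *watcher) []*watcher {
	for i, cur := range ws {
		if cur == w {
			return append(ws[:i], ws[i+1:]...)
		}
	}
	return ws
}

// cancelFromSet removes the watcher by key lookup: O(1) on average.
func cancelFromSet(ws map[*watcher]struct{}, w *watcher) {
	delete(ws, w)
}

func main() {
	a, b, c := &watcher{"a"}, &watcher{"b"}, &watcher{"c"}

	slice := cancelFromSlice([]*watcher{a, b, c}, b)

	set := map[*watcher]struct{}{a: {}, b: {}, c: {}}
	cancelFromSet(set, b)

	fmt.Println(len(slice), len(set))
}
```

The slice version does work proportional to the number of unsynced watchers on every cancel, while the map version does not, which matches the direction of the ns/op numbers reported in the commit.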
Yicheng Qin
f95d18f766 docs/configuration: fix heading hierarchy
to make it consistent with other sections in the doc.
2015-10-21 15:23:48 -07:00
Yicheng Qin
027c073d55 storage: refine Watch comment in WatchableKV
Explain explicitly how these arguments are used.
2015-10-21 14:47:52 -07:00
Jonathan Boulle
56b7584418 Merge pull request #3725 from joshix/hdinghier-mulligan
Documentation: Fix heading hierarchy.
2015-10-21 13:52:57 -07:00
Yicheng Qin
2673e657e6 storage: fix WatchableKV interface
We delete endRev from the watch functionality, so the interface needs
to be fixed.
2015-10-21 11:50:25 -07:00
Yicheng Qin
35eb26ef5d Merge pull request #3726 from yichengq/watch-store
storage: add store field in watchableStore
2015-10-21 11:07:45 -07:00
Yicheng Qin
0f7374ce89 storage: KV field -> store field in watchableStore
We need to access the underlying store to use its RangeEvents function.
It is not good to use unnecessary type conversion.

The underlying store is also needed for further store upon
watchableStore.
2015-10-20 19:23:20 -07:00
Yicheng Qin
8d3ed0176c Merge pull request #3727 from yichengq/govet
raft: fix malformed example name
2015-10-20 16:51:47 -07:00
Yicheng Qin
01806c3e80 raft: fix malformed example name
It is reported by latest govet:
```
gopath/src/github.com/coreos/etcd/raft/example_test.go:26: Example_Node
has malformed example suffix: Node
```
2015-10-20 16:40:01 -07:00
Josh Wood
98bdeab53b Documentation: Fix heading hierarchy.
Correct the hierarchy of Markdown symbols in document headings.
2015-10-20 15:26:49 -07:00
Xiang Li
704bff0c77 Merge pull request #3724 from coreos/philips-patch-1
README: fix language for release binaries
2015-10-20 14:22:02 -07:00
Brandon Philips
5b5b0ef060 README: attempt to make it even clearer 2015-10-20 14:17:33 -07:00
Brandon Philips
b38e21a9e9 README: fix language for release binaries
Address confusion on where to find stuff and make it easier to find with keywords for various operating systems.
2015-10-20 14:07:40 -07:00
Yicheng Qin
9635d8d94c Merge pull request #3720 from yichengq/clean-streamAppV1
rafthttp: deprecate streamTypeMsgApp and remove msgApp stream sent restriction due to streamTypeMsgApp
2015-10-20 10:37:51 -07:00
Yicheng Qin
de669be6d6 Merge pull request #3683 from yichengq/raft-block
etcdserver: fix raft state machine may block
2015-10-20 09:44:34 -07:00
Yicheng Qin
ab5df57ecf etcdserver: fix raft state machine may block
When the snapshot store requests a raft snapshot from the etcdserver apply loop,
it may block on the channel for some time, or wait some time for KV to
snapshot. This is unexpected because the raft state machine should never block.

Even worse, this block may lead to deadlock:
1. raft state machine waits on getting snapshot from raft memory storage
2. raft memory storage waits snapshot store to get snapshot
3. snapshot store requests raft snapshot from apply loop
4. apply loop is applying entries, and waits raftNode loop to finish
messages sending
5. raftNode loop waits peer loop in Transport to send out messages
6. peer loop in Transport waits for raft state machine to process message

Fix it by changing getSnap to create the snapshot asynchronously.
2015-10-20 09:19:34 -07:00
Yicheng Qin
b61eaf3335 rafthttp: msgApp{Reader/Writer} -> msgAppV2{Reader/Writer}
To make its purpose clearer.
2015-10-20 08:28:06 -07:00
Yicheng Qin
5060b2f322 rafthttp: send all MsgApp on stream msgAppV2
For stream msgAppV2, as long as the message is MsgApp type, it should be sent
through stream msgAppV2.
2015-10-20 08:23:36 -07:00
Yicheng Qin
33231fccdd rafthttp: fix wrong stream name returned by pick
msgAppWriter uses streamAppV2 type, and it should return the correct name.
2015-10-20 08:17:06 -07:00
Yicheng Qin
f725f6a552 rafthttp: deprecate streamTypeMsgApp
streamTypeMsgApp is only used in etcd 2.0. etcd 2.3 should not talk to
etcd 2.0, either to send or to receive requests. So I deprecate streamTypeMsgApp
and its related code in the rafthttp package.

Term updating is only used by streamTypeMsgApp, so it is removed too.
2015-10-20 08:15:54 -07:00
Xiang Li
eb7bce893e Merge pull request #3721 from mitake/servevars
etcdserver: don't allow methods other than GET in /debug/vars
2015-10-20 08:01:38 -07:00
Hitoshi Mitake
1b0c65c299 etcdserver: don't allow methods other than GET in /debug/vars
Currently, /debug/vars seems to allow all methods, e.g. PUT,
POST, etc. However, this path is read-only, so it should allow
GET only.
2015-10-20 17:19:42 +09:00
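A GET-only restriction of this kind can be sketched as a small wrapper around any `http.Handler`. The names `getOnly`, `statusFor`, and `varsStub` are hypothetical, and the stub stands in for the real /debug/vars handler. How etcd itself implements the check is not shown in this log.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// getOnly wraps a handler and rejects every method except GET with 405.
func getOnly(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			w.Header().Set("Allow", http.MethodGet)
			http.Error(w, "Method Not Allowed", http.StatusMethodNotAllowed)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// varsStub is a hypothetical stand-in for the read-only debug handler.
var varsStub = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, `{"hypothetical":"vars"}`)
})

// statusFor returns the status code the handler produces for a method.
func statusFor(h http.Handler, method string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, "/debug/vars", nil))
	return rec.Code
}

func main() {
	h := getOnly(varsStub)
	fmt.Println(statusFor(h, "GET"), statusFor(h, "PUT"))
}
```

Setting the `Allow` header alongside the 405 status tells clients which method the path does accept.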
Yicheng Qin
7dcb99b60e Merge pull request #3656 from endocode/kayrus/client_doc
Added example on how to get node's value
2015-10-19 20:41:12 -07:00
Jonathan Boulle
e5082fce54 Merge pull request #3718 from gyuho/gyuho_README
README: fix typo
2015-10-19 17:14:46 -07:00
Gyu-Ho Lee
7d326087f9 README: fix typo
It looks like a typo. Or is it an elided sentence?
Thanks,
2015-10-19 17:03:40 -07:00
Yicheng Qin
4e9f137d1b Merge pull request #3716 from yichengq/add-sem-badge
README: add semaphore CI status badge into README
2015-10-19 16:25:21 -07:00
Yicheng Qin
6ebd62a869 README: add semaphore CI status badge into README 2015-10-19 16:08:40 -07:00
Xiang Li
32dd4d5de3 Merge pull request #3657 from xiang90/fix_remove
etcdserver: skip updating attr if the member does not exist
2015-10-19 13:35:57 -07:00
Xiang Li
776e9fb7be Merge pull request #3703 from xiang90/bolt
storage/backend: avoid creating new bolt.tx during a batchTx
2015-10-19 10:30:57 -07:00
kayrus
afb35e366d client: added example on how to get node's value 2015-10-19 10:31:05 +02:00
Xiang Li
79c263b2ec Merge pull request #3707 from xiang90/CI
pkg/transport: longer timeout for slow CI
2015-10-18 17:22:15 -07:00
Xiang Li
7c6e2deb66 Merge pull request #3708 from xiang90/travis
travis: drop go-nyet
2015-10-18 16:43:53 -07:00
Xiang Li
559b76f401 travis: drop go-nyet 2015-10-18 16:41:57 -07:00
Xiang Li
3c1ecf70cf pkg/transport: longer timeout for slow CI 2015-10-18 16:32:18 -07:00
Xiang Li
5372e11727 Merge pull request #3704 from xiang90/rafthttp
clean up rafthttp pkg: round1
2015-10-18 10:00:26 -07:00
Xiang Li
427a154aae rafthttp: various clean up 2015-10-18 09:49:18 -07:00
Xiang Li
7d3af5e15f rafthttp: rename message.go -> message_codec.go 2015-10-18 09:49:11 -07:00
Xiang Li
e87cd0c17b rafthttp: move new funcs to right place 2015-10-18 09:48:59 -07:00
Xiang Li
f08d750b0b Merge pull request #3697 from mqliang/cluster-health
etcdctl: fix health check condition
2015-10-18 09:26:05 -07:00
Xiang Li
478fab6aca rafthttp: rename NewHandler to newPipelineHandler 2015-10-17 22:33:28 -07:00
Xiang Li
080c11d14e rafthttp: make ConnReadLimitByte private and add comment 2015-10-17 22:20:36 -07:00
Xiang Li
5efdd7bc6d storage/backend: avoid creating new bolt.tx during a batchTx 2015-10-17 21:31:34 -07:00
mqliang
b2d92dedae etcdctl:fix health check condition 2015-10-18 08:22:13 +08:00
Xiang Li
d07c9b00e5 Merge pull request #3701 from xiang90/rm_end_watcher
storage: remove the endRev of watcher
2015-10-17 16:04:27 -07:00
Xiang Li
6556bf1643 storage: remove the endRev of watcher 2015-10-17 15:59:49 -07:00
Xiang Li
f78a11d468 Merge pull request #3694 from philips/fix-configuration-headers
Documentation: configuration docs headers
2015-10-16 12:39:10 -07:00
Brandon Philips
22d8ca4c9a Documentation: configuration docs headers
The configuration docs are indented weird compared to our standards.
This makes the sidebar here not work right:
https://coreos.com/etcd/docs/latest/configuration.html
2015-10-16 12:32:57 -07:00
Xiang Li
fd07e02604 Merge pull request #3691 from gyuho/documentation_20151015
raft/documentation: clarify progress's subjects.
2015-10-15 20:02:00 -07:00
Gyu-Ho Lee
1716d5858f raft/documentation: clarify progress's subjects.
If I understand correctly, `progress` represents the state of a follower. For
me, some comments weren't clear because they were missing the subject of
`progress`. This adds more clarification on who is doing what. Please let me
know if I misunderstood anything. Thanks,
2015-10-15 19:15:08 -07:00
Yicheng Qin
9ce79dbbc3 Merge pull request #3685 from gyuho/etcdctl_mk_command_2
etcdctl: fix mk command with PrevNoExist
2015-10-15 12:13:54 -07:00
Xiang Li
df34d67e98 Merge pull request #3689 from ccding/patch-1
raft/doc: fix misuse of `for' loop in docs
2015-10-15 09:20:05 -07:00
Cong Ding
362df8e470 raft/doc: fix misuse of `for' loop in docs 2015-10-15 11:13:30 -05:00
Gyu-Ho Lee
1dab7e8084 etcdctl/command: mk command with PrevNoExist
This attempts to fix #3676. `PrevNoExist` checks whether the key already exists
and, if so, returns an error, which is how the `mk` command is supposed to work.
The previous code ignored the existing key and overwrote it with the later value.

/cc @yichengq
2015-10-15 09:05:17 -07:00
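The difference between an unconditional set and the create-only behaviour that `mk` is meant to have can be sketched with an in-memory map. The `store` type and its methods are hypothetical and only model the semantics; the real command expresses this through the `PrevNoExist` option named in the commit.

```go
package main

import (
	"errors"
	"fmt"
)

var errKeyExists = errors.New("key already exists")

// store is a toy in-memory stand-in for a key-value backend.
type store map[string]string

// set overwrites unconditionally, like the behaviour the commit fixes.
func (s store) set(key, val string) { s[key] = val }

// mk creates the key only when it does not exist yet.
func (s store) mk(key, val string) error {
	if _, ok := s[key]; ok {
		return errKeyExists
	}
	s[key] = val
	return nil
}

func main() {
	s := store{}
	fmt.Println(s.mk("/foo", "first"))  // key is created
	fmt.Println(s.mk("/foo", "second")) // rejected, value stays "first"
	fmt.Println(s["/foo"])
}
```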
Xiang Li
c2e49b5622 Merge pull request #3687 from ccding/patch-1
raft/doc: fix typos
2015-10-15 07:52:43 -07:00
Cong Ding
f1f92f0fa3 raft/doc: fix typos 2015-10-15 02:17:34 -05:00
Yicheng Qin
afd74dfeb7 Merge pull request #3611 from mitake/etcdctl-timeout
etcdctl: use a context with -total-timeout in simple commands
2015-10-14 16:13:34 -07:00
Yicheng Qin
f78ccbbd3f Merge pull request #3681 from yichengq/godep-update
Godeps: update prometheus dependency
2015-10-14 11:44:49 -07:00
Yicheng Qin
d807c895b8 Godeps: update prometheus dependency
prometheus updates its directory layout
(https://github.com/prometheus/client_golang#where-is-model-extraction-and-text)
and makes Godeps restore/save unable to work.

Remove all prometheus dependency manually and godep save again to fix
this problem.
2015-10-14 09:58:36 -07:00
Jonathan Boulle
aeb6ef5d8a Merge pull request #3680 from gyuho/Documentation_20151014
Documentation: typos in discovery, faq, security
2015-10-14 08:50:02 -07:00
Gyu-Ho Lee
0598de2e25 Documentation: typos in discovery, faq, security
This just fixes some typos from my reading Documentation.

Thanks,

Documentation: security
2015-10-14 08:48:01 -07:00
Yicheng Qin
d2b40c1c98 Merge pull request #3666 from yichengq/transport-snap
rafthttp: support sending v3 snapshot message
2015-10-13 23:58:02 -07:00
Yicheng Qin
1f21ccf166 rafthttp: support sending v3 snapshot message
Use snapshotSender to send v3 snapshot message. It puts raft snapshot
message and v3 snapshot into request body, then sends it to the target peer.
When it receives http.StatusNoContent, it knows the message has been
received and processed successfully.

As receiver, snapHandler saves v3 snapshot and then processes the raft snapshot
message, then respond with http.StatusNoContent.
2015-10-13 23:11:28 -07:00
Jonathan Boulle
6afd8e4fd9 ROADMAP: fix v3 API issues link 2015-10-12 20:56:20 -07:00
Yicheng Qin
70dc205500 Merge pull request #3665 from raoofm/patch-2
hack: bench to use Content Type as value written was always empty
2015-10-12 11:35:34 -07:00
Raoof Mohammed
90e8dcd9bf hack: benchmarking to use Content Type
hack: benchmarking to use Content Type: application/x-www-form-urlencoded

boom sends PUT requests with Content-Type: text/html, which etcd cannot parse. It should use Content-Type: application/x-www-form-urlencoded. After this fix, the official etcd benchmarks should be updated: the keysize was not a factor because the value being set was always empty.
2015-10-12 10:03:09 -04:00
Xiang Li
df7074911e Merge pull request #3664 from yichengq/transport-more
rafthttp: build transport inside pkg instead of passed-in
2015-10-11 22:00:05 -07:00
Yicheng Qin
207c92b627 rafthttp: build transport inside pkg instead of passed-in
rafthttp has different requirements for connections created by the
transport for different usage, and this is hard to achieve when giving
one http.RoundTripper. Pass into pkg the data needed to build transport
now, and let rafthttp build its own transports.
2015-10-11 21:42:37 -07:00
Yicheng Qin
988a09eb20 Merge pull request #3663 from yichengq/transport-rt
pkg/transport: pass dial timeout to NewTransport
2015-10-11 10:12:09 -07:00
Yicheng Qin
9673eb625a pkg/transport: pass dial timeout to NewTransport
So we could set dial timeout for new transport, which makes it
customizable according to max RTT.
2015-10-11 10:09:25 -07:00
Yicheng Qin
017f5f4670 Merge pull request #3662 from yichengq/transport
rafthttp: expose struct to set configuration
2015-10-11 09:35:15 -07:00
Yicheng Qin
233e717e2f rafthttp: expose struct to set configuration
transport takes too many arguments and the new function is unreadable.
Change it to set fields in the transport struct directly.
2015-10-11 09:02:16 -07:00
Xiang Li
9534b11ad2 Merge pull request #3660 from gyuho/Documentation_typos_20151009
Documentation: fix typos
2015-10-09 16:10:49 -07:00
Gyu-Ho Lee
d05d6f8bbb Documentation: fix typos
I found some typos. Please let me know if you have any feedback.

Thanks,

Documentation: fix metrics.md typo

Documentation: trim blank lines in metrics.md
2015-10-09 15:56:58 -07:00
Yicheng Qin
ce45159767 Merge pull request #3655 from wojtek-t/update_dependency
Update dependency on ugorji/go/codec
2015-10-09 10:23:47 -07:00
Wojciech Tyczynski
4eb598be06 client: regenerate code to unmarshal key response
Regenerate code for unmarshaling key response with a new version of
ugorji/go/codec
2015-10-09 10:59:42 +02:00
Wojciech Tyczynski
8ebbec8e05 Godeps: update ugorji/go/codec dependency
Update ugorji/go/codec dependency to the newer version (a bunch of fixes were made).
2015-10-09 10:59:42 +02:00
Yicheng Qin
eadfd138a4 Merge pull request #3658 from mqliang/patch-2
docs/api.md: fix documentation
2015-10-08 19:51:54 -07:00
mqliang
8c580ffe2d docs/api.md: fix documentation
Fix documentation
2015-10-09 10:41:12 +08:00
Xiang Li
98e30ca7c2 etcdserver: skip updating attr if the member does not exist 2015-10-08 14:07:16 -07:00
Yicheng Qin
f74ff9b867 Merge pull request #3644 from mitake/test-race
etcdserver, test: don't access testing.T in time.AfterFunc()'s own go…
2015-10-07 08:34:58 -07:00
Xiang Li
dc394a0a99 Merge pull request #3649 from kkaneda/kkaneda/comment_fix
raft: fix a description of MemoryStorage.Compact
2015-10-06 21:51:20 -07:00
Kenji Kaneda
ebd8cb04c1 raft: fix a description of MemoryStorage.Compact
The parameter name is compactIndex, not i.
2015-10-06 21:49:33 -07:00
Hitoshi Mitake
68dd3ee621 etcdserver, test: don't access testing.T in time.AfterFunc()'s own goroutine
time.AfterFunc() creates its own goroutine and calls the callback
function in that goroutine. This can cause a data race like the one
fixed in commit de1a16e0f1. This commit also fixes the potential data
races in the tests in etcdserver/server_test.go.
2015-10-06 11:37:08 +09:00
Xiang Li
21179d929f Merge pull request #3616 from yichengq/storage-txn
storage: hold batchTx lock during KV txn
2015-10-05 17:12:52 -07:00
Xiang Li
4f2ada3f1e Merge pull request #3643 from xiang90/metrics_storage
storage: add metrics for db total size
2015-10-05 17:11:00 -07:00
Xiang Li
0aa2f1192a storage: add metrics for db total size 2015-10-05 16:56:30 -07:00
Yicheng Qin
522ee6ab3a Merge pull request #3635 from yichengq/parse-ipv6
pkg/types: fix unwanted unescape in NewURLsMap
2015-10-05 15:30:51 -07:00
Yicheng Qin
699e37562e Merge pull request #3637 from yichengq/run-snapshot
etcdserver: get existing snapshot instead of requesting one
2015-10-05 14:57:03 -07:00
Yicheng Qin
e117f36e48 pkg/types: fix unwanted unescape in NewURLsMap
We use url.ParseQuery to parse the names-to-urls string, but it has the
side effect of unescaping the string. If the initial-cluster string has
an IPv6 address containing `%25`, it is unescaped to `%`, which makes
the subsequent URL parsing fail.

Fix it by modifying the parse process.

Go1.4 doesn't support literal IPv6 address w/ zone in
URI(https://github.com/golang/go/issues/6530), so we only enable tests
in Go1.5+.
2015-10-05 14:54:17 -07:00
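The side effect described above can be reproduced with the standard library alone. In this sketch, `parseNameURLs` is a hypothetical replacement parser that splits on `,` and `=` without unescaping; it is not the code from the commit, and the `infra1` name and address are made-up sample input.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// viaParseQuery returns the value url.ParseQuery yields for "infra1".
// ParseQuery unescapes values, so "%25" comes back as "%".
func viaParseQuery(s string) string {
	v, err := url.ParseQuery(s)
	if err != nil {
		return ""
	}
	return v.Get("infra1")
}

// parseNameURLs is a hypothetical parser for "name=url,name=url" strings
// that keeps every URL byte-for-byte as written.
func parseNameURLs(s string) map[string][]string {
	m := make(map[string][]string)
	for _, pair := range strings.Split(s, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			continue // skip malformed entries in this sketch
		}
		m[kv[0]] = append(m[kv[0]], kv[1])
	}
	return m
}

func main() {
	const in = "infra1=http://[fe80::1%25eth0]:2380"
	fmt.Println(viaParseQuery(in))              // zone separator is unescaped
	fmt.Println(parseNameURLs(in)["infra1"][0]) // zone separator is preserved
}
```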
Yicheng Qin
8c94ae0ee3 etcdserver: get existing snapshot instead of requesting one
This fixes the problem that proposal cannot be applied.

When the etcdserver.run loop starts, it expects to get the latest
existing snapshot. It should not attempt to request one because the loop
is the entity to create the snapshot.
2015-10-05 14:32:16 -07:00
Xiang Li
ba949ae2be Merge pull request #3640 from xiang90/watch_metrics
storage: add metrics for watchers
2015-10-05 11:46:50 -07:00
Xiang Li
09157d4f1a storage: add metrics for watchers 2015-10-05 11:32:56 -07:00
Xiang Li
432b1bc230 Merge pull request #3638 from gyuho/documentation_proxy
Documentation: proxy.md typo, line-breaks
2015-10-05 07:57:22 -07:00
Gyu-Ho Lee
f2dae5a0d2 Documentation: proxy.md typo, line-breaks
1. I found a little typo (easily -> easy)
2. If you go to https://coreos.com/etcd/docs/2.0.9/proxy.html,
   the proxy flag command is out of width of the web-page. Can
   we have line-breaks between flags to make the command easier
   to read?

Thanks for the great documentation!
2015-10-04 12:25:25 -07:00
Yicheng Qin
c97dda766e storage: hold batchTx lock during KV txn
One txn is treated as atomic, and might contain multiple Put/Delete/Range
operations. For now, between these operations, we might call forceCommit
to sync the change to disk, or the backend may commit it in the background.
Thus the snapshot state might contain an unfinished multi-object
transaction, which is dangerous if the database is restored from the snapshot.

This PR makes KV txn hold the batchTx lock during the process and prevents
a commit from happening.
2015-10-03 16:01:05 -07:00
Yicheng Qin
581cc5cff4 Merge pull request #3608 from yichengq/storage-snapshot
storage: update KV.Snapshot function
2015-10-03 15:32:14 -07:00
Yicheng Qin
36f4303fc3 storage/etcdserver: update KV.Snapshot function
When using the Snapshot function, the caller expects to:
1. know the size of the snapshot before writing data
2. split the snapshot-ready phase from the write-data phase, so we could cut
the snapshot first and write data later.

Update its interface to fit the requirements of etcdserver.
2015-10-03 10:15:23 -07:00
Yicheng Qin
8c0db94fef Merge pull request #3631 from yichengq/create-snapshot
etcdserver: support to create raft snapshot at apply loop
2015-10-03 10:03:27 -07:00
Yicheng Qin
675d4306b0 Merge pull request #3634 from yichengq/fix-cluster-output
etcdserver: print out correct restored cluster info
2015-10-02 16:23:33 -07:00
Yicheng Qin
18c568bc82 etcdserver: print out correct restored cluster info
Before this PR, it always prints nil because cluster info has not been
recovered when it is printed:

```
2015-10-02 14:00:24.353631 I | etcdserver: loaded cluster information
from store: <nil>
```
2015-10-02 16:11:32 -07:00
Xiang Li
69ca0b8475 Merge pull request #3633 from xiang90/systemd_readiness
etcdmain: print out error and suggestion for fixing notify issue
2015-10-02 13:56:16 -07:00
Xiang Li
51043830d4 etcdmain: print out error and suggestion for fixing notify issue 2015-10-02 13:39:41 -07:00
Yicheng Qin
bfe9502f4f etcdserver: support to create raft snapshot at apply loop
and snapStore could trigger it to create the latest raft snapshot.
2015-10-02 13:17:56 -07:00
Xiang Li
f8a4d1f01b Merge pull request #3607 from xiang90/doc_name
doc: emphasize name should be unique
2015-10-02 12:23:06 -07:00
Xiang Li
2733e3f543 doc: emphasize name should be unique 2015-10-02 09:58:20 -07:00
Xiang Li
f093559b1d Merge pull request #3632 from mickep76/master
docs/libraries-and-tools: add etcd-rest rest api daemon
2015-10-02 09:49:59 -07:00
Michael Persson
4835a411c7 docs/libraries-and-tools: add etcd-rest rest api daemon 2015-10-02 14:55:57 +02:00
Yicheng Qin
ccce61bda9 Merge pull request #3614 from yichengq/snapshot-store
etcdserver: add snapshotStore and raftStorage
2015-10-01 19:35:34 -07:00
Yicheng Qin
f47cbf3073 Merge pull request #3627 from jelmer/typofix
Fix typo: boostrapping -> bootstrapping.
2015-10-01 19:02:47 -07:00
Yicheng Qin
2276328720 etcdserver: add snapshotStore and raftStorage
snapshotStore is the store of snapshots; it supports getting the latest
snapshot and saving an incoming snapshot.

raftStorage supports getting the latest snapshot when v3demo is enabled.
2015-10-01 19:00:59 -07:00
Jelmer Vernooij
d70975e54c Documentation/configuration.md: Fix typo.
boostrapping -> bootstrapping
2015-10-01 20:03:16 +00:00
Xiang Li
715fdfb669 Merge pull request #3093 from mwitkow-io/feature/httpd_metrics
add `events` metrics in etcdhttp.
2015-10-01 12:10:58 -07:00
Xiang Li
6e9943a037 Merge pull request #3629 from ccding/master
raft: fix typo in doc
2015-10-01 10:11:12 -07:00
Cong Ding
b2edf1d24a raft: fix typo in doc 2015-10-01 11:21:23 -05:00
Michal Witkowski
1b2dc1c796 metrics: add events metrics in etcdhttp. 2015-10-01 08:11:42 +01:00
Yicheng Qin
46e5444d93 Merge pull request #3625 from yichengq/fix-race
pkg/transport: fix a data race in TestReadWriteTimeoutDialer
2015-09-30 17:50:48 -07:00
Yicheng Qin
de1a16e0f1 pkg/transport: fix a data race in TestReadWriteTimeoutDialer
Accessing test.T async will cause data race.

Change to use select to coordinate the access of test.T.
2015-09-30 17:29:24 -07:00
Brandon Philips
036ea58a77 Documentation: 04 snapshot: add example with fleet 2015-09-30 16:35:28 -07:00
Brandon Philips
b043635868 Documentation: fix-up the kubernetes github URL 2015-09-30 10:58:33 -07:00
Yicheng Qin
533e728b64 Merge pull request #3609 from yichengq/raft-snapshot
raft: kill TODO about behavior when snapshot fails
2015-09-29 19:32:31 -07:00
Yicheng Qin
4c82b481a5 raft: improve behavior when snapshot fails
etcd is going to support incremental snapshot, and we design it to
send at most one snapshot out in the first stage. So when one snapshot is
in flight, a snapshot request will return an error.

When it fails to get a snapshot while sending MsgSnap, raft prints a
related log and aborts sending this message.
2015-09-29 19:15:15 -07:00
Yicheng Qin
a535cf2cad Merge pull request #3610 from yichengq/load-storage
etcdserver: restore v3 storage when restart
2015-09-29 11:58:38 -07:00
Yicheng Qin
49d262185d Merge pull request #3590 from yichengq/discovery-log
etcdmain: improve log when join discovery fails
2015-09-29 08:02:18 -07:00
Hitoshi Mitake
33a0df3e33 etcdctl: use a context with -total-timeout in simple commands
Like the commit 8ebc933111, this commit lets simple etcdctl commands
use a context with timeout value passed via -total-timeout.

This commit doesn't change complex commands like watch,
cluster-health, and import because it is not obvious whether using the
context in these commands is good or not.
2015-09-29 17:23:01 +09:00
Yicheng Qin
5d906a0acc etcdserver: restore v3 storage when restart
To load the previous data.
2015-09-29 00:14:27 -07:00
Yicheng Qin
939aa96a34 etcdmain: improve log when join discovery fails
Before this PR, the log is

```
2015/09/1 13:18:31 etcdmain: client: etcd cluster is unavailable or
misconfigured
```

It is quite hard for people to understand what happens.

Now we print out the exact reason for the failure, and explain the way
to handle it.
2015-09-28 23:23:50 -07:00
Xiang Li
783884a04e Merge pull request #3606 from kkaneda/kkaneda/tiny_fix
raft: remove an obsolete TODO comment on 4MB maxMsgSize hard coding
2015-09-28 21:44:45 -07:00
Kenji Kaneda
f602767e50 raft: remove an obsolete TODO comment on 4MB maxMsgSize hard coding
The TODO comment was added by 7571b2cd, and it was addressed by d9b5b56c.
2015-09-28 21:31:12 -07:00
Xiang Li
6c05a01ec6 Merge pull request #3604 from gyuho/replace_netutil_BasicAuth
etcdhttp/auth: BasicAuth method in standard pkg
2015-09-28 15:55:46 -07:00
Gyu-Ho Lee
6264a41e22 ectd/Getting-etcd: update README to require Go1.4+
Notice `For those wanting to try the very latest version,`
2015-09-28 15:35:09 -07:00
Gyu-Ho Lee
e16f81838b etcdhttp/auth: BasicAuth method in standard pkg
I created a new PR from https://github.com/coreos/etcd/pull/3598.
This is for `TODO: use the standard lib BasicAuth method when we move to
Go 1.4.` [1]. `BasicAuth` method got into Go standard package a year ago. [2]

---
1. https://github.com/coreos/etcd/blob/master/pkg/netutil/netutil.go#L126-L138
2. https://codereview.appspot.com/76540043/
2015-09-28 14:02:55 -07:00
Yicheng Qin
7410698761 Merge pull request #3530 from mitake/etcdctl-timeout-v2
etcdctl: use user specified timeout value for entire command execution
2015-09-28 09:45:02 -07:00
Hitoshi Mitake
8ebc933111 etcdctl: use user specified timeout value for entire command execution
etcdctl should be able to use a user-specified timeout value for
total command execution, not only a per-request timeout. This commit
adds a new option --total-timeout to the command. The value passed via
this option is used as the timeout value of the entire command execution.

Fixes coreos#3517
2015-09-28 10:31:46 +09:00
Rob Szumski
c645ac23c0 docs: fix link 2015-09-26 17:43:33 -07:00
Xiang Li
49d52eaf1e Merge pull request #3596 from xiang90/json_header
etcdhttp: add Content-Type: application/json header to version handler
2015-09-25 15:27:29 -07:00
Xiang Li
1226838381 etcdhttp: add Content-Type: application/json header to version handler 2015-09-25 15:14:13 -07:00
Xiang Li
c9be719d92 Merge pull request #3579 from gyuho/etcdserver/etcdhttp/httptypes/errors.go-WriteTo-returns-error
httptypes: WriteTo to return error
2015-09-25 14:31:48 -07:00
Yicheng Qin
93edabf85f Merge pull request #3594 from yichengq/exit
etcdmain: exit after print out ErrDuplicateID
2015-09-25 14:28:45 -07:00
Yicheng Qin
dc9a75df1c etcdmain: exit after print out ErrDuplicateID
etcd should exit after printing log for unhandlable error.
2015-09-25 14:10:50 -07:00
Xiang Li
60a641762b Merge pull request #3593 from xiang90/fix_race
pkg/transport: fix a data race in TestWriteReadTimeoutListener
2015-09-25 10:16:17 -07:00
Xiang Li
5d033c22af pkg/transport: fix a data race in TestWriteReadTimeoutListener 2015-09-25 10:02:37 -07:00
Yicheng Qin
dff702b2b8 Merge pull request #3564 from gouyang/master
Improve proxy log for retrying an unavailable endpoint
2015-09-25 10:02:15 -07:00
Gyu-Ho Lee
85f4475f62 httptypes/errors: HTTPError.WriteTo returns error
Squashing all commits into this one
(from https://github.com/coreos/etcd/pull/357).

Thanks,
2015-09-25 08:06:26 -07:00
Guohua ouyang
e35eeeae42 proxy: improve log for retrying an unavailable endpoint
Fixes #3541

Signed-off-by: Guohua ouyang <guohuaouyang@gmail.com>
2015-09-25 07:36:49 +08:00
Xiang Li
9de7f24301 Merge pull request #3554 from mitake/reconfig-doc
doc: add a description of -strict-reconfig-check
2015-09-24 08:07:32 -07:00
Hitoshi Mitake
78791f81a6 doc: add a description of -strict-reconfig-check 2015-09-24 11:44:55 +09:00
Xiang Li
0813a0f2d1 Merge pull request #3585 from xiang90/fix_hash
storage: fix hash by iterating kv
2015-09-23 11:39:21 -07:00
Xiang Li
385e17583f storage: fix hash by iterating kv 2015-09-23 11:28:33 -07:00
Xiang Li
370ce37d32 Merge pull request #3584 from mickep76/master
docs/libraries-and-tools: add etcd-export tool
2015-09-23 09:32:35 -07:00
Michael Persson
c1db1338c9 docs/libraries-and-tools: add etcd-export tool 2015-09-23 18:29:11 +02:00
Yicheng Qin
d6db4e6d6b Merge pull request #3577 from gyuho/storage/watchable_store.go-defer-fix
storage/watchable_store: defer to Unlock s.mu
2015-09-23 07:37:29 -07:00
Gyu-Ho Lee
4113509828 storage/watchable_store: defer to Unlock s.mu
New PR from https://github.com/coreos/etcd/pull/3575.
This adds a `defer` to unlock `s.mu`. The current code does not `Unlock`
in the correct scope, I think.

(Sorry, I accidentally deleted my fork so the changes
might not sound continuous from my previous pull requests.)
2015-09-22 23:25:07 -07:00
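A minimal illustration of why deferring the unlock helps: every return path releases the mutex, including early returns. The `counter` type is a hypothetical example and is unrelated to the watchable store's real fields.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a toy type guarded by a mutex.
type counter struct {
	mu sync.Mutex
	n  map[string]int
}

// incr locks, then defers the unlock so the early return below cannot
// leave the mutex held.
func (c *counter) incr(key string) (int, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if key == "" {
		return 0, false // early return still unlocks
	}
	c.n[key]++
	return c.n[key], true
}

func main() {
	c := &counter{n: map[string]int{}}
	fmt.Println(c.incr(""))  // rejected, lock released
	fmt.Println(c.incr("a")) // would block forever if the lock leaked
}
```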
Xiang Li
89acdd6245 Merge pull request #3555 from xiang90/proxy_doc
doc: add proxy promotion doc
2015-09-22 12:59:40 -07:00
Yicheng Qin
932bb76cbb Merge pull request #3570 from yichengq/extend-timeout
integration: extend request timeout
2015-09-22 10:17:13 -07:00
Xiang Li
eba8a2ed90 Merge pull request #3566 from xiang90/error_msg
etcdsever: mismatch error uses the same format as the corresponding flag
2015-09-22 07:41:46 -07:00
Xiang Li
13cfb4284f Merge pull request #3573 from TheHippo/patch-1
docs/security: fixed command typo
2015-09-22 07:41:30 -07:00
Xiang Li
2540a3fb7e etcdsever: mismatch error uses the same format as the corresponding flags 2015-09-21 19:32:10 -07:00
Philipp Klose
94f3297299 docs/security: fixed command typo
`-peer-client-cert-atuh` should be `-peer-client-cert-auth`
2015-09-22 03:39:29 +02:00
Yicheng Qin
305a0d7ab9 integration: extend request timeout
Extend request timeout to give etcd cluster enough time to return
response.
2015-09-21 16:50:22 -07:00
Xiang Li
ea3dbfed60 Merge pull request #3408 from MSamman/extend-auth-api
etcdserver: extend auth api
2015-09-21 11:51:19 -07:00
Xiang Li
999b2c6ec2 doc: add proxy promotion doc 2015-09-21 11:47:37 -07:00
Xiang Li
6188933c81 Merge pull request #3556 from xiang90/better_error_logging
etcdmain: better logging when user forget to set initial flags
2015-09-21 10:52:34 -07:00
Xiang Li
3b70bf87c3 etcdmain: better logging when user forget to set initial flags 2015-09-21 10:43:26 -07:00
Xiang Li
574d1b0d46 Merge pull request #3563 from dnaeon/fixes
Fix etcd/client API example
2015-09-21 10:06:41 -07:00
Marin Atanasov Nikolov
d6459b8b84 client: Fix API example 2015-09-21 19:51:29 +03:00
Mohammad Samman
6ae1f6c6e4 etcdserver: extend auth api
allow recursive query on users and roles to get more detail

Fixes #3278
2015-09-21 00:51:18 -07:00
Yicheng Qin
f3d2b5831c Merge pull request #3558 from yichengq/watch
storage: add tests for RangeEvents and its underlying functions
2015-09-20 23:58:41 -07:00
Xiang Li
cbddb8670a Merge pull request #3561 from ceh/raft-doc-typo
raft: fix Node doc typo
2015-09-20 21:52:34 -07:00
Emil Hessman
b9f22cb69b raft: fix Node doc typo 2015-09-21 06:13:33 +02:00
Yicheng Qin
d72914c36f storage: clarify comment for store.RangeEvents and fix related bugs
Change to the function:
1. specify the meaning of startRev and endRev parameters
2. specify the meaning of returned nextRev

Moreover, it adds unit tests for the function.
2015-09-19 23:17:03 -07:00
Yicheng Qin
5709b66dfb storage: add unit test for index.RangeEvents 2015-09-19 23:08:24 -07:00
Yicheng Qin
87b5143b15 storage: fix missing continue in keyIndex.since
It should continue to skip following operations.

The test from rev14 to rev0 fails if it doesn't call continue and instead
appends all revisions of the same main rev to the list.
2015-09-19 23:01:18 -07:00
Yicheng Qin
158d6e0e03 storage: fix calculating generation in keyIndex.since
It should skip last empty generation when the key is just tombstoned.

The rev15 and rev16 cases in the test fail if it doesn't skip the last empty
generation and find previous generations.
2015-09-19 22:58:45 -07:00
Xiang Li
06180be154 Merge pull request #3533 from xiang90/proxy
proxy: expose proxy configuration
2015-09-18 14:18:06 -07:00
Xiang Li
ac29432aab proxy: add a test for configHandler 2015-09-18 13:43:54 -07:00
Xiang Li
0f9b2046ef Merge pull request #3547 from bdarnell/multinode-node-ids
raft: Allow per-group nodeIDs in MultiNode.
2015-09-18 13:29:07 -07:00
Ben Darnell
b7baaa6bc8 raft: Allow per-group nodeIDs in MultiNode.
This feature is motivated by
https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/replica_tombstone.md
which requires a change to the way CockroachDB constructs its node IDs.
2015-09-18 15:36:36 -04:00
Yicheng Qin
be80d11948 storage: enhance test for keyIndex.Get and keyIndex.Compact
It covers the case that one key is set multiple times in one main
revision now.
2015-09-17 18:26:17 -07:00
Yicheng Qin
cedad49dcf Merge pull request #3543 from mitake/reconfig-remove
etcdserver: forbid removing started member if quorum cannot be preserved in strict reconfig mode
2015-09-17 18:22:53 -07:00
Hitoshi Mitake
f8859a980d etcdserver: forbid removing started member if quorum cannot be preserved in strict reconfig mode
Like the commit 6974fc63ed, this commit lets etcdserver forbid
removing started member if quorum cannot be preserved after
reconfiguration if the option -strict-reconfig-check is passed to
etcd. The removal can cause deadlock if unstarted members have wrong
peer URLs.
2015-09-18 10:09:57 +09:00
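The quorum arithmetic behind this check can be sketched as follows. `quorum` and `canRemove` are hypothetical helpers written for illustration, under the assumption that a quorum is a strict majority of the configured members; they are not the functions added by the commit.

```go
package main

import "fmt"

// quorum returns the majority size for a cluster of n members.
func quorum(n int) int { return n/2 + 1 }

// canRemove reports whether removing one member leaves enough started
// members to form a quorum of the remaining cluster.
// total is the current member count, started the number of started
// members, and targetStarted says whether the member being removed is
// itself started.
func canRemove(total, started int, targetStarted bool) bool {
	if targetStarted {
		started--
	}
	return started >= quorum(total-1)
}

func main() {
	// 3 members, all started: removing one leaves 2 of 2, quorum is 2.
	fmt.Println(canRemove(3, 3, true))
	// 3 members, 2 started: removing a started one leaves 1 of 2.
	fmt.Println(canRemove(3, 2, true))
	// 3 members, 2 started: removing the unstarted one leaves 2 of 2.
	fmt.Println(canRemove(3, 2, false))
}
```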
Xiang Li
c4b3ad72d9 Merge pull request #3544 from xiang90/bench
v3benchmark: add put benchmark
2015-09-17 15:10:13 -07:00
Xiang Li
f69582e1a2 v3benchmark: add put benchmark 2015-09-17 14:48:07 -07:00
Yicheng Qin
97b67fdbfc Merge pull request #3548 from yichengq/travis
storage/backend: extend wait timeout for commit to finish
2015-09-16 14:22:46 -07:00
Yicheng Qin
f7efbe8b14 storage/backend: extend wait timeout for commit to finish
It needs to take more time on travis. Fix:

```
--- FAIL: TestBackendBatchIntervalCommit (0.01s)
		backend_test.go:113: bucket test does not exit
```
2015-09-16 14:14:51 -07:00
Xiang Li
ec4142576e Merge pull request #3534 from xiang90/grpc_err
etcdserver: better v3 api error handling
2015-09-16 12:32:28 -07:00
Xiang Li
804d80387d Merge pull request #3546 from gae123/patch-1
doc: update admin_guide.md for the recent Go1.5 default MAXPROCS change
2015-09-16 10:44:38 -07:00
gae123
7fb0eb8f56 doc: update admin_guide.md for the recent Go1.5 default MAXPROCS change 2015-09-16 10:43:40 -07:00
Xiang Li
ce47161ae0 Merge pull request #3540 from xiang90/bench
Godep: add cheggaaa dependency
2015-09-15 16:27:07 -07:00
Xiang Li
4628d08879 Godep: add cheggaaa dependency 2015-09-15 16:24:31 -07:00
Xiang Li
3a2700141e Merge pull request #3539 from xiang90/bench
godep: use github.com/cheggaaa/pb
2015-09-15 16:12:34 -07:00
Xiang Li
38dd680f2e godep: use github.com/cheggaaa/pb 2015-09-15 16:08:07 -07:00
Xiang Li
8bb50635ce Merge pull request #3538 from xiang90/bench
benchmarkv3: refactoring the main logic
2015-09-15 15:58:32 -07:00
Xiang Li
4deb12fbbb benchmarkv3: refactoring the main logic 2015-09-15 15:57:38 -07:00
Jonathan Boulle
3221b6787e Merge pull request #3537 from jonboulle/master
*: add missing license headers + test
2015-09-15 14:19:17 -07:00
Jonathan Boulle
108f97d63e test: add license header check 2015-09-15 14:09:01 -07:00
Jonathan Boulle
7848ac3979 *: add missing license headers 2015-09-15 14:09:01 -07:00
Xiang Li
867954f3ad Merge pull request #3535 from xiang90/rev
storage: add rev into kv interface
2015-09-15 12:54:29 -07:00
Xiang Li
6d1f0ce89f storage: add rev into kv interface 2015-09-15 12:11:00 -07:00
Xiang Li
94f4069a25 etcdserver: better v3 api error handling 2015-09-15 11:20:06 -07:00
Xiang Li
e079f87410 proxy: expose proxy configuration 2015-09-15 10:27:51 -07:00
Yicheng Qin
c082488e23 Merge pull request #3507 from yichengq/watch
storage: support basic watch
2015-09-15 00:04:36 -07:00
Yicheng Qin
ec43e0a4c3 storage: introduce WatchableKV and watch feature
WatchableKV is an interface upon KV, and supports watch feature.
2015-09-14 23:53:03 -07:00
Yicheng Qin
34bfac99c4 Merge pull request #3529 from yichengq/snapshot
etcdserver: rename db file into a formal directory
2015-09-14 23:43:17 -07:00
Yicheng Qin
352cd768c6 etcdserver: fix shadow declaration 2015-09-14 23:25:16 -07:00
Yicheng Qin
05c74bd890 etcdserver: rename db file into a formal directory
and rename it to a formal name
2015-09-14 22:41:40 -07:00
Yicheng Qin
51f1ee055e Merge pull request #3526 from yichengq/snapshot
etcdserver: forbid to unset v3 demo once used
2015-09-14 21:36:39 -07:00
Yicheng Qin
1f0fb3d9aa etcdserver: forbid to unset v3 demo once used
After enabling v3 demo, it may change the underlying data organization
for v3 store. So we forbid to unset --experimental-v3demo once it has
been used.
2015-09-14 21:27:11 -07:00
Xiang Li
b1c2d7e526 Merge pull request #3528 from xiang90/compact
*: support v3 compaction
2015-09-14 20:04:50 -07:00
Xiang Li
94f784826a *: support v3 compaction 2015-09-14 19:59:36 -07:00
Xiang Li
e0d8923f7b Merge pull request #3524 from xiang90/grpc_error
etcdserver: use gRPC error instead of error message in header
2015-09-14 16:38:44 -07:00
Xiang Li
7183387110 etcdserver: use gRPC error instead of error message in header 2015-09-14 16:11:13 -07:00
Xiang Li
d04382c30e Merge pull request #3525 from gyuho/master
etcdserver, store: fix grammars in comments (a->an existing)
2015-09-14 13:47:49 -07:00
Gyu-Ho Lee
c2dcf7431e etcdserver, store: fix grammars in comments (a->an existing)
I found some grammatical errors in comments.

This pull request was previously submitted as https://github.com/coreos/etcd/pull/3513.
I am resubmitting it following the correct guidelines.
2015-09-14 13:41:13 -07:00
Yicheng Qin
1fc122741d Merge pull request #3521 from raoofm/patch-3
doc: faq.md change flag --peers to --endpoint
2015-09-14 09:43:41 -07:00
Xiang Li
c7b4c67436 Merge pull request #3514 from xiang90/v3_raft
support clustered v3 api
2015-09-14 09:35:02 -07:00
Raoof Mohammed
d685135832 doc: faq.md change flag --peers to --endpoint
Changing the flag to --endpoint and mentioning that --peers is deprecated.
2015-09-14 12:22:06 -04:00
Xiang Li
451cce4a90 Merge pull request #3516 from xiang90/hash_improved
storage: support hash state
2015-09-13 21:46:12 -07:00
Xiang Li
714b5e0b08 storage: support hash state 2015-09-13 21:34:58 -07:00
Xiang Li
cdaa263346 Merge pull request #3506 from philips/improve-tocommit-error
raft: improve panic error message
2015-09-13 17:46:53 -07:00
Xiang Li
f8fd2c10d6 Merge pull request #3449 from yichengq/cleanup-max-election
Documentation/tuning: cleanup paragraph on max election
2015-09-13 16:55:16 -07:00
Yicheng Qin
95bb6d7584 Merge pull request #3508 from amarshall/patch-3
readme: Use SVG image for build status badge
2015-09-13 15:21:58 -07:00
Xiang Li
40e0a33fcd Merge pull request #3511 from xiang90/v3_raft
Procfile: add a v3DemoProcfile
2015-09-13 08:43:14 -07:00
Xiang Li
4c81615cef etcdserver: initial support for cluster-wide v3 request 2015-09-13 08:32:01 -07:00
Xiang Li
600456f4ba etcdserverpb: update proto file for raftInternalRequest
We need to assign each raftInternalRequest an ID for getting
the response after it goes through raft.

We also need an empty response for the error case.
2015-09-13 08:28:10 -07:00
Xiang Li
ac7253f28e Procfile: add a v3DemoProcfile 2015-09-12 23:08:56 -07:00
Xiang Li
662b4966d0 Merge pull request #3510 from xiang90/v3_raft
etcdmain: support gRPC addr flag
2015-09-12 22:58:08 -07:00
Xiang Li
a0cfcf2dd7 etcdmain: support gRPC addr flag 2015-09-12 22:52:51 -07:00
Xiang Li
35f1531576 Merge pull request #3509 from xiang90/v3_raft
etcdctlv3: support endpoint flag
2015-09-12 22:51:36 -07:00
Xiang Li
121d2b9e9d etcdctlv3: support endpoint flag 2015-09-12 22:46:43 -07:00
Andrew Marshall
0894294074 readme: Use SVG image for build status badge
More accessible, better scaling.
2015-09-13 01:12:11 -04:00
Yicheng Qin
0ca800fbac Merge pull request #3479 from mitake/membership
etcdserver: avoid deadlock caused by adding members with wrong peer URLs
2015-09-12 22:09:13 -07:00
Hitoshi Mitake
dad32646eb etcdserver: enhance test cases for isReadyToAddNewMember
- a case of a cluster with even number members
 - a case of an empty cluster
2015-09-13 12:30:10 +09:00
Jonathan Boulle
d9cf752060 etcdserver: add test for isReadyToAddNewMember
Also fixed check for special case of one-member cluster
2015-09-13 11:16:08 +09:00
Hitoshi Mitake
6974fc63ed etcdserver: avoid deadlock caused by adding members with wrong peer URLs
The current membership-change functionality of etcd seems to have a
problem that can cause a deadlock.

How to reproduce:
 1. construct an N-node cluster
 2. add N new nodes with etcdctl member add, without starting the new members

What happens:
After adding the N nodes, the total size of the cluster becomes 2 * N
and its quorum becomes N + 1. This means a membership change requires
at least N + 1 running nodes, because Raft treats membership
information in its log like any other ordinary log append request.

Assume the peer URLs of the added nodes are wrong because of operator
error or bugs in a wrapper program that launches etcd. In such a
case, both adding and removing members are impossible because the
quorum cannot be reached. Ordinary requests cannot be served either.
The cluster would appear to be deadlocked.

Of course, the best practice for adding new nodes is to add one node
and start it before adding the next. However, the effect of this
problem is serious enough that I think forcibly preventing it would be
valuable.

Solution:
This patch lets etcd forbid adding a new node if the operation changes
the quorum and the new quorum is larger than the number of running
nodes. If etcd is launched with the newly added option
-strict-reconfig-check, the checking logic is activated. If the option
isn't passed, the default reconfig behavior is kept.

Fixes https://github.com/coreos/etcd/issues/3477
2015-09-13 09:31:53 +09:00
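
The check described in the commit message above can be sketched as follows. This is a minimal illustration in Python, not etcd's implementation: the function names are hypothetical, and the single-member exception is an assumption made so that a one-node cluster can still grow.

```python
def quorum(total_members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return total_members // 2 + 1


def ready_to_add_member(total_members: int, started_members: int) -> bool:
    """Return True if adding one more (not yet started) member keeps quorum reachable.

    Hypothetical helper for illustration; not the function used by etcd.
    """
    # Assumption: a healthy single-member cluster may always add a second
    # member, otherwise it could never grow at all.
    if total_members == 1 and started_members == 1:
        return True
    new_total = total_members + 1
    # Reject the change if the started members could not form the new quorum.
    return started_members >= quorum(new_total)
```

For a 3-node cluster with all members started, the first two additions pass (the quorum stays at 3), while a third addition without starting any new member is rejected because the quorum would rise to 4.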
Brandon Philips
68d4ec3e13 raft: improve panic error message
Give a human being some insight into how we might have gotten to this
state based on feedback from #3504.
2015-09-12 12:17:02 -07:00
Yicheng Qin
d4e19d1afb Merge pull request #3501 from yichengq/update-peers
docs/admin_guide: use ETCDCTL_ENDPOINT
2015-09-12 08:31:47 -07:00
Yicheng Qin
e9512f8c5f docs/admin_guide: use ETCDCTL_ENDPOINT
because ETCDCTL_PEERS is not preferred.
2015-09-11 19:38:55 -07:00
Yicheng Qin
28a371471a Merge pull request #3500 from yichengq/fix-ETCD
libraries-and-tools.md: correct project name to etcd
2015-09-11 19:34:15 -07:00
Yicheng Qin
4e71954111 libraries-and-tools.md: correct project name to etcd
etcd is the official name of the project.
2015-09-11 19:31:40 -07:00
Yicheng Qin
a528cb6f5d Merge pull request #3495 from rekby/patch-2
libraries-and-tools.md: add etcddir
2015-09-11 19:30:03 -07:00
Jonathan Boulle
f8f702b3f8 Merge pull request #3497 from jonboulle/master
docs: add official client to libraries-and-tools
2015-09-11 15:17:30 -07:00
Jonathan Boulle
fd82f0b8d5 docs: add official client to libraries-and-tools 2015-09-11 15:16:02 -07:00
Timofey Koolin
136efd3ba9 libraries-and-tools.md: add etcddir 2015-09-11 21:04:56 +03:00
Yicheng Qin
c5c3ae4790 Merge pull request #3486 from yichengq/readme
README: warn that master branch is unstable
2015-09-10 19:20:35 -07:00
Xiang Li
56d61d995a Merge pull request #3487 from onlyjob/master
Minor spelling corrections (codespell).
2015-09-10 17:46:29 -07:00
Dmitry Smirnov
b2f4a5f587 *: fix spelling issues (codespell).
Signed-off-by: Dmitry Smirnov <onlyjob@member.fsf.org>
2015-09-11 10:22:29 +10:00
Yicheng Qin
bd5f924b0f README: warn that master branch is unstable
Avoid users building from master branch for stable binaries.
2015-09-10 14:27:10 -07:00
Xiang Li
1de63deca4 Merge pull request #3483 from xiang90/update_roadmap
roadmap.md: update roadmap for 2.3
2015-09-10 13:32:30 -07:00
Yicheng Qin
f51c2f471b Merge pull request #3482 from yichengq/client
client: add Nodes type to faciliate sorting
2015-09-10 12:29:12 -07:00
Xiang Li
07bd9f65d3 roadmap.md: update roadmap for 2.3 2015-09-10 12:24:48 -07:00
Yicheng Qin
2f558e56d2 client: add Nodes to codecgen and regenerate 2015-09-10 11:51:59 -07:00
Yicheng Qin
eb51901830 client: add Nodes type to faciliate sorting
This helps users to sort easily.
2015-09-10 11:03:12 -07:00
Yicheng Qin
48a4d2ccba Documentation/tuning: cleanup paragraph on max election
- Use one sentence per line for easier diffing
- Walkthrough the thought process and cleanup the grammar
- Move below the other sections

Original author: @philips
2015-09-06 00:38:03 -07:00
226 changed files with 11298 additions and 2044 deletions

View File

@@ -4,8 +4,10 @@ go:
- 1.4
- 1.5
install:
- go get github.com/barakmich/go-nyet
addons:
apt:
packages:
- libpcap-dev
script:
- INTEGRATION=y ./test
- ./test

View File

@@ -35,7 +35,7 @@ Thanks for your contributions!
### Code style
The coding style suggested by the Golang community is used in etcd. See the [style doc](https://code.google.com/p/go-wiki/wiki/CodeReviewComments) for details.
The coding style suggested by the Golang community is used in etcd. See the [style doc](https://github.com/golang/go/wiki/CodeReviewComments) for details.
Please follow this style to make etcd easy to review, maintain and develop.

View File

@@ -1,4 +1,4 @@
## Snapshot Migration
# Snapshot Migration
You can migrate a snapshot of your data from a v0.4.9+ cluster into a new etcd 2.2 cluster using a snapshot migration. After snapshot migration, the etcd indexes of your data will change. Many etcd applications rely on these indexes to behave correctly. This operation should only be done while all etcd applications are stopped.
@@ -15,7 +15,7 @@ etcdctl --endpoint new_cluster.example.com import --snap backup.snap
```
If you have a large amount of data, you can specify more concurrent workers to copy data in parallel by using the `-c` flag.
If you have hidden keys to copy, you can use `--hidden` flag to specify.
If you have hidden keys to copy, you can use the `--hidden` flag to specify them. For example, fleet uses `/_coreos.com/fleet`, so to import those keys use `--hidden /_coreos.com`.
And the data will quickly copy into the new cluster:

View File

@@ -1,8 +1,8 @@
## Administration
# Administration
### Data Directory
## Data Directory
#### Lifecycle
### Lifecycle
When first started, etcd stores its configuration into a data directory specified by the data-dir configuration parameter.
Configuration is stored in the write ahead log and includes: the local member ID, cluster ID, and initial cluster configuration.
@@ -20,7 +20,7 @@ Once removed the member can be re-added with an empty data directory.
[remove-a-member]: runtime-configuration.md#remove-a-member
#### Contents
### Contents
The data directory has two sub-directories in it:
@@ -32,18 +32,18 @@ If `--wal-dir` flag is set, etcd will write the write ahead log files to the spe
[wal-pkg]: http://godoc.org/github.com/coreos/etcd/wal
[snap-pkg]: http://godoc.org/github.com/coreos/etcd/snap
### Cluster Management
## Cluster Management
#### Lifecycle
### Lifecycle
If you are spinning up multiple clusters for testing it is recommended that you specify a unique initial-cluster-token for the different clusters.
This can protect you from cluster corruption in case of mis-configuration because two members started with different cluster tokens will refuse members from each other.
#### Monitoring
### Monitoring
It is important to monitor your production etcd cluster for health information and runtime metrics.
##### Health Monitoring
#### Health Monitoring
At the lowest level, etcd exposes health information via HTTP at `/health` in JSON format. If it returns `{"health": "true"}`, then the cluster is healthy. Please note that the `/health` endpoint is still experimental as of etcd 2.2.
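
As a minimal sketch, a monitoring script could interpret the `/health` response body like this. The function name `is_healthy` is hypothetical, and the check assumes only the response shape quoted above.

```python
import json


def is_healthy(body: str) -> bool:
    """Interpret the JSON body returned by the /health endpoint."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON: treat the member as unhealthy.
        return False
    # The documented healthy response carries the string "true", not a boolean.
    return isinstance(payload, dict) and payload.get("health") == "true"
```

The body could be fetched with `urllib.request.urlopen`, for example from `http://127.0.0.1:2379/health` if the member listens on that client address.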
@@ -63,16 +63,16 @@ member fd422379fda50e48 is healthy: got healthy result from http://127.0.0.1:323
cluster is healthy
```
##### Runtime Metrics
#### Runtime Metrics
etcd uses [Prometheus](http://prometheus.io/) for metrics reporting in the server. You can read more through the runtime metrics [doc](metrics.md).
#### Debugging
### Debugging
Debugging a distributed system can be difficult. etcd provides several ways to make debugging
easier.
##### Enabling Debug Logging
#### Enabling Debug Logging
When you want to debug etcd without stopping it, you can enable debug logging at runtime.
etcd exposes logging configuration at `/config/local/log`.
@@ -85,7 +85,7 @@ $ curl http://127.0.0.1:2379/config/local/log -XPUT -d '{"Level":"INFO"}'
$ # debug logging disabled
```
##### Debugging Variables
#### Debugging Variables
Debug variables are exposed for real-time debugging purposes. Developers who are familiar with etcd can utilize these variables to debug unexpected behavior. etcd exposes debug variables via HTTP at `/debug/vars` in JSON format. The debug variables contain
`cmdline`, `file_descriptor_limit`, `memstats` and `raft.status`.
@@ -107,7 +107,7 @@ Debug variables are exposed for real-time debugging purposes. Developers who are
}
```
#### Optimal Cluster Size
### Optimal Cluster Size
The recommended etcd cluster size is 3, 5 or 7, which is decided by the fault tolerance requirement. A 7-member cluster can provide enough fault tolerance in most cases. While a larger cluster provides better fault tolerance, write performance decreases since data needs to be replicated to more machines.
@@ -152,7 +152,7 @@ This example will walk you through the process of migrating the infra1 member to
|infra2|10.0.1.12:2380|
```sh
$ export ETCDCTL_PEERS=http://10.0.1.10:2379,http://10.0.1.11:2379,http://10.0.1.12:2379
$ export ETCDCTL_ENDPOINT=http://10.0.1.10:2379,http://10.0.1.11:2379,http://10.0.1.12:2379
```
```sh
@@ -293,6 +293,6 @@ If timeout happens several times continuously, administrators should check statu
#### Maximum OS threads
By default, etcd uses the default configuration of the Go 1.4 runtime, which means that at most one operating system thread will be used to execute code simultaneously. (Note that this default behavior [may change in Go 1.5](https://docs.google.com/document/d/1At2Ls5_fhJQ59kDK2DFVhFu3g5mATSXqqV5QrxinasI/edit)).
By default, etcd uses the default configuration of the Go 1.4 runtime, which means that at most one operating system thread will be used to execute code simultaneously. (Note that this default behavior [has changed in Go 1.5](https://golang.org/doc/go1.5#runtime)).
When using etcd in heavy-load scenarios on machines with multiple cores it will usually be desirable to increase the number of threads that etcd can utilize. To do this, simply set the environment variable `GOMAXPROCS` to the desired number when starting etcd. For more information on this variable, see the Go [runtime](https://golang.org/pkg/runtime) documentation.

View File

@@ -359,7 +359,7 @@ curl 'http://127.0.0.1:2379/v2/keys/foo?wait=true&waitIndex=2008'
#### Connection being closed prematurely
The server may close a long polling connection before emitting any events.
This can happend due to a timeout or the server being shutdown.
This can happen due to a timeout or the server being shut down.
Since the HTTP header is sent immediately upon accepting the connection, the response will be seen as empty: `200 OK` and empty body.
The clients should be prepared to deal with this scenario and retry the watch.
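
A client-side retry loop for this case might look like the sketch below. It is an illustration only: `fetch` is a hypothetical callable that performs one long-poll request (for example a GET with `?wait=true`) and returns the response body, and the attempt limit is an arbitrary choice.

```python
from typing import Callable, Optional


def watch_with_retry(fetch: Callable[[], str], max_attempts: int = 5) -> Optional[str]:
    """Call fetch() until it returns a non-empty body or attempts run out."""
    for _ in range(max_attempts):
        body = fetch()
        if body.strip():
            # A non-empty body means an event was delivered.
            return body
        # Empty body with 200 OK: the server closed the connection early, so retry.
    return None
```

Injecting `fetch` keeps the retry logic independent of any particular HTTP library.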
@@ -506,7 +506,7 @@ The current comparable conditions are:
2. `prevIndex` - checks the previous modifiedIndex of the key.
3. `prevExist` - checks existence of the key: if `prevExist` is true, it is an `update` request; if prevExist is `false`, it is a `create` request.
3. `prevExist` - checks existence of the key: if `prevExist` is true, it is an `update` request; if `prevExist` is `false`, it is a `create` request.
Here is a simple example.
Let's create a key-value pair first: `foo=one`.

View File

@@ -19,7 +19,7 @@ Each role has exact one associated Permission List. An permission list exists fo
The special static ROOT (named `root`) role has full permissions on all key-value resources, the permission to manage user resources and settings resources. Only the ROOT role has the permission to manage user resources and modify settings resources. The ROOT role is built-in and does not need to be created.
There is also a special GUEST role, named 'guest'. These are the permissions given to unauthenticated requests to etcd. This role will be created automatically, and by default allows access to the full keyspace due to backward compatability. (etcd did not previously authenticate any actions.). This role can be modified by a ROOT role holder at any time, to reduce the capabilities of unauthenticated users.
There is also a special GUEST role, named 'guest'. These are the permissions given to unauthenticated requests to etcd. This role will be created automatically, and by default allows access to the full keyspace due to backward compatibility. (etcd did not previously authenticate any actions.). This role can be modified by a ROOT role holder at any time, to reduce the capabilities of unauthenticated users.
#### Permissions
@@ -124,7 +124,7 @@ The User JSON object is formed as follows:
Password is only passed when necessary.
**Get a list of users**
**Get a List of Users**
GET/HEAD /v2/auth/users
@@ -137,7 +137,36 @@ GET/HEAD /v2/auth/users
Content-type: application/json
200 Body:
{
  "users": ["alice", "bob", "eve"]
  "users": [
    {
      "user": "alice",
      "roles": [
        {
          "role": "root",
          "permissions": {
            "kv": {
              "read": ["*"],
              "write": ["*"]
            }
          }
        }
      ]
    },
    {
      "user": "bob",
      "roles": [
        {
          "role": "guest",
          "permissions": {
            "kv": {
              "read": ["*"],
              "write": ["*"]
            }
          }
        }
      ]
    }
  ]
}
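
Because the new response nests full role objects under each user, a client that only needs role names has to unpack them. A minimal sketch, assuming the expanded response shape shown in this diff (the function name `roles_by_user` is hypothetical):

```python
import json
from typing import Dict, List


def roles_by_user(body: str) -> Dict[str, List[str]]:
    """Map each user name to the names of the roles listed for it."""
    payload = json.loads(body)
    result: Dict[str, List[str]] = {}
    # Treat a missing or null list as empty.
    for user in payload.get("users") or []:
        roles = user.get("roles") or []
        result[user["user"]] = [role["role"] for role in roles]
    return result
```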
**Get User Details**
@@ -155,7 +184,26 @@ GET/HEAD /v2/auth/users/alice
200 Body:
{
  "user" : "alice",
  "roles" : ["fleet", "etcd"]
  "roles" : [
    {
      "role": "fleet",
      "permissions" : {
        "kv" : {
          "read": [ "/fleet/" ],
          "write": [ "/fleet/" ]
        }
      }
    },
    {
      "role": "etcd",
      "permissions" : {
        "kv" : {
          "read": [ "*" ],
          "write": [ "*" ]
        }
      }
    }
  ]
}
**Create Or Update A User**
@@ -213,22 +261,6 @@ A full role structure may look like this. A Permission List structure is used fo
}
```
**Get a list of Roles**
GET/HEAD /v2/auth/roles
Sent Headers:
Authorization: Basic <BasicAuthString>
Possible Status Codes:
200 OK
401 Unauthorized
200 Headers:
Content-type: application/json
200 Body:
{
"roles": ["fleet", "etcd", "quay"]
}
**Get Role Details**
GET/HEAD /v2/auth/roles/fleet
@@ -252,6 +284,50 @@ GET/HEAD /v2/auth/roles/fleet
}
}
**Get a list of Roles**
GET/HEAD /v2/auth/roles
Sent Headers:
Authorization: Basic <BasicAuthString>
Possible Status Codes:
200 OK
401 Unauthorized
200 Headers:
Content-type: application/json
200 Body:
{
  "roles": [
    {
      "role": "fleet",
      "permissions": {
        "kv": {
          "read": ["/fleet/"],
          "write": ["/fleet/"]
        }
      }
    },
    {
      "role": "etcd",
      "permissions": {
        "kv": {
          "read": ["*"],
          "write": ["*"]
        }
      }
    },
    {
      "role": "quay",
      "permissions": {
        "kv": {
          "read": ["*"],
          "write": ["*"]
        }
      }
    }
  ]
}
**Create Or Update A Role**
PUT /v2/auth/roles/rkt

View File

@@ -134,7 +134,7 @@ $ etcdctl role remove myrolename
## Enabling authentication
The minimal steps to enabling auth follow. The administrator can set up users and roles before or after enabling authentication, as a matter of preference.
The minimal steps to enabling auth are as follows. The administrator can set up users and roles before or after enabling authentication, as a matter of preference.
Make sure the root user is created:

View File

@@ -0,0 +1,98 @@
# Storage Memory Usage Benchmark
<!---todo: link storage to storage design doc-->
Two components of etcd storage consume physical memory. The etcd process allocates an *in-memory index* to speed key lookup. The process's *page cache*, managed by the operating system, stores recently-accessed data from disk for quick re-use.
The in-memory index holds all the keys in a [B-tree][btree] data structure, along with pointers to the on-disk data (the values). Each key in the B-tree may contain multiple pointers, pointing to different versions of its values. The theoretical memory consumption of the in-memory index can hence be approximated with the formula:
`N * (c1 + avg_key_size) + N * (avg_versions_of_key) * (c2 + size_of_pointer)`
where `c1` is the key metadata overhead and `c2` is the version metadata overhead.
The graph shows the detailed structure of the in-memory index B-tree.
```
                            In mem index
                           +------------+
                           | key || ... |
 +--------------+          |     ||     |
 |              |          +------------+
 |              |          | v1  || ... |
 |   disk <----------------|     ||     | Tree Node
 |              |          +------------+
 |              |          | v2  || ... |
 |        <----------------+     ||     |
 |              |          +------------+
 +--------------+    +-----+    |   |   |
                     |     |    |   |   |
                     |     +------------+
                     |
                     |
                     ^
                  ------+
                  | ... |
                  |     |
                  +-----+
                  | ... | Tree Node
                  |     |
                  +-----+
                  | ... |
                  |     |
                  ------+
```
[Page cache memory][pagecache] is managed by the operating system and is not covered in detail in this document.
## Testing Environment
etcd version
- git head https://github.com/coreos/etcd/commit/776e9fb7be7eee5e6b58ab977c8887b4fe4d48db
GCE n1-standard-2 machine type
- 7.5 GB memory
- 2x CPUs
## In-memory index memory usage
In this test, we only benchmark the memory usage of the in-memory index. The goal is to find `c1` and `c2` mentioned above and to understand the hard limit of memory consumption of the storage.
We measure memory consumption with Go's `runtime.ReadMemStats`, taking the difference in total allocated bytes before and after creating the index. This cannot perfectly reflect the memory usage of the in-memory index itself, but it shows the rough consumption pattern.
| N | versions | key size | memory usage |
|------|----------|----------|--------------|
| 100K | 1 | 64bytes | 22MB |
| 100K | 5 | 64bytes | 39MB |
| 1M | 1 | 64bytes | 218MB |
| 1M | 5 | 64bytes | 432MB |
| 100K | 1 | 256bytes | 41MB |
| 100K | 5 | 256bytes | 65MB |
| 1M | 1 | 256bytes | 409MB |
| 1M | 5 | 256bytes | 506MB |
Based on the result, we can calculate `c1=120bytes`, `c2=30bytes`. We only need two sets of data to calculate `c1` and `c2`, since they are the only unknown variables in the formula. The `c1=120bytes` and `c2=30bytes` are the average values of the 4 sets of `c1` and `c2` we calculated. The key metadata overhead is still relatively nontrivial (50%) for small key-value pairs. However, this is a significant improvement over the old store, which had at least 1000% overhead.
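
To make the formula concrete, the following sketch plugs the averaged constants into it. The pointer size of 8 bytes is an assumption (a 64-bit machine), and the function name is hypothetical; the result is an estimate, not a measurement.

```python
C1_BYTES = 120     # key metadata overhead, from the averaged result above
C2_BYTES = 30      # version metadata overhead, from the averaged result above
POINTER_BYTES = 8  # assumption: pointer size on a 64-bit machine


def estimate_index_bytes(num_keys: int, avg_key_size: int, avg_versions: int) -> int:
    """Approximate in-memory index size using the formula from this document."""
    per_key = C1_BYTES + avg_key_size
    per_version = C2_BYTES + POINTER_BYTES
    return num_keys * per_key + num_keys * avg_versions * per_version
```

For 100K keys of 64 bytes with one version each, this gives 22,200,000 bytes, close to the 22MB measured in the first row of the table. Other rows deviate more, since `c1` and `c2` are averages.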
## Overall memory usage
The overall memory usage captures how much RSS etcd consumes with the storage. The value size should have very little impact on the overall memory usage of etcd, since we keep values on disk and only retain hot values in memory, managed by the OS page cache.
| N | versions | key size | value size | memory usage |
|------|----------|----------|------------|--------------|
| 100K | 1 | 64bytes | 256bytes | 40MB |
| 100K | 5 | 64bytes | 256bytes | 89MB |
| 1M | 1 | 64bytes | 256bytes | 470MB |
| 1M | 5 | 64bytes | 256bytes | 880MB |
| 100K | 1 | 64bytes | 1KB | 102MB |
| 100K | 5 | 64bytes | 1KB | 164MB |
| 1M | 1 | 64bytes | 1KB | 587MB |
| 1M | 5 | 64bytes | 1KB | 836MB |
Based on the result, we know the value size does not significantly impact the memory consumption. There is some minor increase due to more data held in the OS page cache.
[btree]: https://en.wikipedia.org/wiki/B-tree
[pagecache]: https://en.wikipedia.org/wiki/Page_cache

View File

@@ -1,6 +1,6 @@
## Branch Management
# Branch Management
### Guide
## Guide
- New development occurs on the [master branch](https://github.com/coreos/etcd/tree/master)
- Master branch should always have a green build!

View File

@@ -124,7 +124,7 @@ There two methods that can be used for discovery:
### etcd Discovery
To better understand the design about discovery service protocol, we suggest you read [this](./discovery_protocol.md).
To better understand the design about discovery service protocol, we suggest you read [this](discovery_protocol.md).
#### Lifetime of a Discovery URL
@@ -148,7 +148,7 @@ If you bootstrap an etcd cluster using discovery service with more than the expe
The URL you will use in this case will be `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` and the etcd members will use the `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` directory for registration as they start.
Each member must have a different name flag specified. Or discovery will fail due to duplicated name.
**Each member must have a different name flag specified, or discovery will fail due to duplicated names. `Hostname` or `machine-id` can be a good choice.**
Now we start etcd with those relevant flags for each member:
@@ -200,7 +200,7 @@ ETCD_DISCOVERY=https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573d
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
Each member must have a different name flag specified. Or discovery will fail due to duplicated name.
**Each member must have a different name flag specified, or discovery will fail due to duplicated names. `Hostname` or `machine-id` can be a good choice.**
Now we start etcd with those relevant flags for each member:
@@ -384,7 +384,7 @@ $ etcd --proxy on -discovery-srv example.com
#### Error Cases
You might see the an error like `cannot find local etcd $name from SRV records.`. That means the etcd member fails to find itself from the cluster defined in SRV records. The resolved address in `-initial-advertise-peer-urls` *must match* one of the resolved addresses in the SRV targets.
You might see an error like `cannot find local etcd $name from SRV records.`. That means the etcd member fails to find itself from the cluster defined in SRV records. The resolved address in `-initial-advertise-peer-urls` *must match* one of the resolved addresses in the SRV targets.
# 0.4 to 2.0+ Migration Guide

View File

@@ -1,4 +1,4 @@
## Configuration Flags
# Configuration Flags
etcd is configurable through command-line flags and environment variables. Options set on the command line take precedence over those from the environment.
@@ -8,251 +8,256 @@ To start etcd automatically using custom settings at startup in Linux, using a [
[systemd-intro]: http://freedesktop.org/wiki/Software/systemd/
### Member Flags
## Member Flags
##### -name
### -name
+ Human-readable name for this member.
+ default: "default"
+ env variable: ETCD_NAME
+ This value is referenced as this node's own entries listed in the `-initial-cluster` flag (Ex: `default=http://localhost:2380` or `default=http://localhost:2380,default=http://localhost:7001`). This needs to match the key used in the flag if you're using [static boostrapping](clustering.md#static).
+ This value is referenced as this node's own entries listed in the `-initial-cluster` flag (Ex: `default=http://localhost:2380` or `default=http://localhost:2380,default=http://localhost:7001`). This needs to match the key used in the flag if you're using [static bootstrapping](clustering.md#static). When using discovery, each member must have a unique name. `Hostname` or `machine-id` can be a good choice.
##### -data-dir
### -data-dir
+ Path to the data directory.
+ default: "${name}.etcd"
+ env variable: ETCD_DATA_DIR
##### -wal-dir
### -wal-dir
+ Path to the dedicated wal directory. If this flag is set, etcd will write the WAL files to the walDir rather than the dataDir. This allows a dedicated disk to be used, and helps avoid io competition between logging and other IO operations.
+ default: ""
+ env variable: ETCD_WAL_DIR
##### -snapshot-count
### -snapshot-count
+ Number of committed transactions to trigger a snapshot to disk.
+ default: "10000"
+ env variable: ETCD_SNAPSHOT_COUNT
##### -heartbeat-interval
### -heartbeat-interval
+ Time (in milliseconds) of a heartbeat interval.
+ default: "100"
+ env variable: ETCD_HEARTBEAT_INTERVAL
##### -election-timeout
### -election-timeout
+ Time (in milliseconds) for an election to timeout. See [Documentation/tuning.md](tuning.md#time-parameters) for details.
+ default: "1000"
+ env variable: ETCD_ELECTION_TIMEOUT
##### -listen-peer-urls
### -listen-peer-urls
+ List of URLs to listen on for peer traffic. This flag tells etcd to accept incoming requests from its peers on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. etcd will respond to requests from any of the listed addresses and ports.
+ default: "http://localhost:2380,http://localhost:7001"
+ env variable: ETCD_LISTEN_PEER_URLS
+ example: "http://10.0.0.1:2380"
+ invalid example: "http://example.com:2380" (domain name is invalid for binding)
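
The binding rule above (IP addresses are accepted, domain names are not) can be checked before starting etcd. This is a sketch with a hypothetical function name; treating `localhost` as acceptable is an assumption based on the documented default value, not a statement about how etcd validates the flag.

```python
import ipaddress
from urllib.parse import urlsplit


def is_bindable_listen_url(url: str) -> bool:
    """Return True if the URL looks like scheme://IP:port with an http(s) scheme."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    try:
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    host = parts.hostname
    if host is None or port is None:
        return False
    if host == "localhost":
        # Assumption: accepted because the documented default uses it.
        return True
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Anything that is not an IP literal is treated as a domain name.
        return False
    return True
```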
##### -listen-client-urls
### -listen-client-urls
+ List of URLs to listen on for client traffic. This flag tells etcd to accept incoming requests from the clients on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. etcd will respond to requests from any of the listed addresses and ports.
+ default: "http://localhost:2379,http://localhost:4001"
+ env variable: ETCD_LISTEN_CLIENT_URLS
+ example: "http://10.0.0.1:2379"
+ invalid example: "http://example.com:2379" (domain name is invalid for binding)
##### -max-snapshots
### -max-snapshots
+ Maximum number of snapshot files to retain (0 is unlimited)
+ default: 5
+ env variable: ETCD_MAX_SNAPSHOTS
+ The default for users on Windows is unlimited, and manual purging down to 5 (or your preference for safety) is recommended.
##### -max-wals
### -max-wals
+ Maximum number of wal files to retain (0 is unlimited)
+ default: 5
+ env variable: ETCD_MAX_WALS
+ The default for users on Windows is unlimited, and manual purging down to 5 (or your preference for safety) is recommended.
##### -cors
### -cors
+ Comma-separated white list of origins for CORS (cross-origin resource sharing).
+ default: none
+ env variable: ETCD_CORS
### Clustering Flags
## Clustering Flags
`-initial` prefix flags are used in bootstrapping ([static bootstrap][build-cluster], [discovery-service bootstrap][discovery] or [runtime reconfiguration][reconfig]) a new member, and ignored when restarting an existing member.
`-discovery` prefix flags need to be set when using [discovery service][discovery].
##### -initial-advertise-peer-urls
### -initial-advertise-peer-urls
+ List of this member's peer URLs to advertise to the rest of the cluster. These addresses are used for communicating etcd data around the cluster. At least one must be routable to all cluster members. These URLs can contain domain names.
+ default: "http://localhost:2380,http://localhost:7001"
+ env variable: ETCD_INITIAL_ADVERTISE_PEER_URLS
+ example: "http://example.com:2380, http://10.0.0.1:2380"
##### -initial-cluster
### -initial-cluster
+ Initial cluster configuration for bootstrapping.
+ default: "default=http://localhost:2380,default=http://localhost:7001"
+ env variable: ETCD_INITIAL_CLUSTER
+ The key is the value of the `-name` flag for each node provided. The default uses `default` for the key because this is the default for the `-name` flag.
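
The `name=url` format of `-initial-cluster` can be illustrated with a small parser. This is a sketch for understanding the format only; the function name is hypothetical and etcd's own parsing may differ in details such as whitespace handling.

```python
from typing import Dict, List


def parse_initial_cluster(value: str) -> Dict[str, List[str]]:
    """Group the peer URLs in an -initial-cluster value by member name."""
    cluster: Dict[str, List[str]] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first "=" only, so URLs containing "=" stay intact.
        name, sep, url = entry.partition("=")
        if not sep or not name or not url:
            raise ValueError(f"malformed entry: {entry!r}")
        # A member name may appear more than once, as in the default value.
        cluster.setdefault(name, []).append(url)
    return cluster
```

Applied to the default value, this yields one member named `default` with two peer URLs.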
##### -initial-cluster-state
### -initial-cluster-state
+ Initial cluster state ("new" or "existing"). Set to `new` for all members present during initial static or DNS bootstrapping. If this option is set to `existing`, etcd will attempt to join the existing cluster. If the wrong value is set, etcd will attempt to start but fail safely.
+ default: "new"
+ env variable: ETCD_INITIAL_CLUSTER_STATE
[static bootstrap]: clustering.md#static
##### -initial-cluster-token
### -initial-cluster-token
+ Initial cluster token for the etcd cluster during bootstrap.
+ default: "etcd-cluster"
+ env variable: ETCD_INITIAL_CLUSTER_TOKEN
##### -advertise-client-urls
### -advertise-client-urls
+ List of this member's client URLs to advertise to the rest of the cluster. These URLs can contain domain names.
+ default: "http://localhost:2379,http://localhost:4001"
+ env variable: ETCD_ADVERTISE_CLIENT_URLS
+ example: "http://example.com:2379, http://10.0.0.1:2379"
+ Be careful if you are advertising URLs such as http://localhost:2379 from a cluster member and are using the proxy feature of etcd. This will cause loops, because the proxy will be forwarding requests to itself until its resources (memory, file descriptors) are eventually depleted.
##### -discovery
### -discovery
+ Discovery URL used to bootstrap the cluster.
+ default: none
+ env variable: ETCD_DISCOVERY
##### -discovery-srv
### -discovery-srv
+ DNS srv domain used to bootstrap the cluster.
+ default: none
+ env variable: ETCD_DISCOVERY_SRV
##### -discovery-fallback
### -discovery-fallback
+ Expected behavior ("exit" or "proxy") when the discovery service fails.
+ default: "proxy"
+ env variable: ETCD_DISCOVERY_FALLBACK
##### -discovery-proxy
### -discovery-proxy
+ HTTP proxy to use for traffic to discovery service.
+ default: none
+ env variable: ETCD_DISCOVERY_PROXY
### Proxy Flags
### -strict-reconfig-check
+ Reject reconfiguration requests that would cause quorum loss.
+ default: false
+ env variable: ETCD_STRICT_RECONFIG_CHECK
## Proxy Flags
`-proxy` prefix flags configure etcd to run in [proxy mode][proxy].
##### -proxy
### -proxy
+ Proxy mode setting ("off", "readonly" or "on").
+ default: "off"
+ env variable: ETCD_PROXY
##### -proxy-failure-wait
### -proxy-failure-wait
+ Time (in milliseconds) an endpoint will be held in a failed state before being reconsidered for proxied requests.
+ default: 5000
+ env variable: ETCD_PROXY_FAILURE_WAIT
##### -proxy-refresh-interval
### -proxy-refresh-interval
+ Time (in milliseconds) of the endpoints refresh interval.
+ default: 30000
+ env variable: ETCD_PROXY_REFRESH_INTERVAL
##### -proxy-dial-timeout
### -proxy-dial-timeout
+ Time (in milliseconds) for a dial to timeout or 0 to disable the timeout
+ default: 1000
+ env variable: ETCD_PROXY_DIAL_TIMEOUT
##### -proxy-write-timeout
### -proxy-write-timeout
+ Time (in milliseconds) for a write to timeout or 0 to disable the timeout.
+ default: 5000
+ env variable: ETCD_PROXY_WRITE_TIMEOUT
##### -proxy-read-timeout
### -proxy-read-timeout
+ Time (in milliseconds) for a read to timeout or 0 to disable the timeout.
+ Don't change this value if you use watches because they are using long polling requests.
+ default: 0
+ env variable: ETCD_PROXY_READ_TIMEOUT
### Security Flags
## Security Flags
The security flags help to [build a secure etcd cluster][security].
##### -ca-file [DEPRECATED]
### -ca-file [DEPRECATED]
+ Path to the client server TLS CA file. `-ca-file ca.crt` could be replaced by `-trusted-ca-file ca.crt -client-cert-auth` and etcd will perform the same.
+ default: none
+ env variable: ETCD_CA_FILE
##### -cert-file
### -cert-file
+ Path to the client server TLS cert file.
+ default: none
+ env variable: ETCD_CERT_FILE
##### -key-file
### -key-file
+ Path to the client server TLS key file.
+ default: none
+ env variable: ETCD_KEY_FILE
##### -client-cert-auth
### -client-cert-auth
+ Enable client cert authentication.
+ default: false
+ env variable: ETCD_CLIENT_CERT_AUTH
##### -trusted-ca-file
### -trusted-ca-file
+ Path to the client server TLS trusted CA key file.
+ default: none
+ env variable: ETCD_TRUSTED_CA_FILE
##### -peer-ca-file [DEPRECATED]
### -peer-ca-file [DEPRECATED]
+ Path to the peer server TLS CA file. `-peer-ca-file ca.crt` could be replaced by `-peer-trusted-ca-file ca.crt -peer-client-cert-auth` and etcd will perform the same.
+ default: none
+ env variable: ETCD_PEER_CA_FILE
##### -peer-cert-file
### -peer-cert-file
+ Path to the peer server TLS cert file.
+ default: none
+ env variable: ETCD_PEER_CERT_FILE
##### -peer-key-file
### -peer-key-file
+ Path to the peer server TLS key file.
+ default: none
+ env variable: ETCD_PEER_KEY_FILE
##### -peer-client-cert-auth
### -peer-client-cert-auth
+ Enable peer client cert authentication.
+ default: false
+ env variable: ETCD_PEER_CLIENT_CERT_AUTH
##### -peer-trusted-ca-file
### -peer-trusted-ca-file
+ Path to the peer server TLS trusted CA file.
+ default: none
+ env variable: ETCD_PEER_TRUSTED_CA_FILE
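As a rough illustration of moving off the deprecated `-ca-file` and `-peer-ca-file` flags, the environment below uses the replacement settings named above. The certificate paths are hypothetical placeholders, and passing `true` as the value of a boolean setting is an assumption of this sketch.

```sh
# Client-facing TLS: replaces the deprecated ETCD_CA_FILE / -ca-file.
# All paths below are hypothetical placeholders.
export ETCD_CERT_FILE=/etc/etcd/server.crt
export ETCD_KEY_FILE=/etc/etcd/server.key
export ETCD_TRUSTED_CA_FILE=/etc/etcd/ca.crt
export ETCD_CLIENT_CERT_AUTH=true   # assumption: boolean settings accept "true"

# Peer TLS: replaces the deprecated ETCD_PEER_CA_FILE / -peer-ca-file.
export ETCD_PEER_CERT_FILE=/etc/etcd/peer.crt
export ETCD_PEER_KEY_FILE=/etc/etcd/peer.key
export ETCD_PEER_TRUSTED_CA_FILE=/etc/etcd/ca.crt
export ETCD_PEER_CLIENT_CERT_AUTH=true

# Start etcd with its usual member flags after exporting these.
```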
### Logging Flags
## Logging Flags
##### -debug
### -debug
+ Drop the default log level to DEBUG for all subpackages.
+ default: false (INFO for all packages)
+ env variable: ETCD_DEBUG
##### -log-package-levels
### -log-package-levels
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
+ default: none (INFO for all packages)
+ env variable: ETCD_LOG_PACKAGE_LEVELS
### Unsafe Flags
## Unsafe Flags
Please be CAUTIOUS when using unsafe flags because they will break the guarantees given by the consensus protocol.
For example, it may panic if other members in the cluster are still alive.
Follow the instructions when using these flags.
##### -force-new-cluster
+ Force to create a new one-member cluster. It commits configuration changes in force to remove all existing members in the cluster and add itself. It needs to be set to [restore a backup][restore].
### -force-new-cluster
+ Force the creation of a new one-member cluster. It forcibly commits configuration changes that remove all existing members in the cluster and add itself. It needs to be set to [restore a backup][restore].
+ default: false
+ env variable: ETCD_FORCE_NEW_CLUSTER
### Experimental Flags
## Experimental Flags
##### -experimental-v3demo
### -experimental-v3demo
+ Enable experimental [v3 demo API](rfc/v3api.proto).
+ default: false
+ env variable: ETCD_EXPERIMENTAL_V3DEMO
### Miscellaneous Flags
## Miscellaneous Flags
##### -version
### -version
+ Print the version and exit.
+ default: false

View File

@@ -4,7 +4,7 @@ Discovery service protocol helps new etcd member to discover all other members i
Discovery service protocol is _only_ used in cluster bootstrap phase, and cannot be used for runtime reconfiguration or cluster monitoring.
The protocol uses a new discovery token to bootstrap one _unique_ etcd cluster. Remember that one discovery token can represent only one etcd cluster. As long as discovery protocol on this token starts, even if fails halfway, it must not be used to bootstrap another etcd cluster.
The protocol uses a new discovery token to bootstrap one _unique_ etcd cluster. Remember that one discovery token can represent only one etcd cluster. As long as discovery protocol on this token starts, even if it fails halfway, it must not be used to bootstrap another etcd cluster.
The rest of this article will walk through the discovery process with examples that correspond to a self-hosted discovery cluster. The public discovery service, discovery.etcd.io, functions the same way, but with a layer of polish to abstract away ugly URLs, generate UUIDs automatically, and provide some protections against excessive requests. At its core, the public discovery service still uses an etcd cluster as the data store as described in this document.

View File

@@ -1,4 +1,4 @@
Error Code
# Error Code
======
This document describes the error code used in key space '/v2/keys'. Feel free to import 'github.com/coreos/etcd/error' to use.

View File

@@ -37,7 +37,7 @@ timeout.
A proxy is a redirection server to the etcd cluster. The proxy handles the
redirection of a client to the current configuration of the etcd cluster. A
typical usecase is to start a proxy on a machine, and on first boot up of the
typical use case is to start a proxy on a machine, and on first boot up of the
proxy specify both the `--proxy` flag and the `--initial-cluster` flag.
From there, any etcdctl client that starts up automatically speaks to the local
@@ -57,24 +57,26 @@ and their integration with the reconfiguration API.
Thus, a member that is down, even infinitely, will never be automatically
removed from the etcd cluster member list.
This makes sense because its usually an application level / administrative
This makes sense because it's usually an application level / administrative
action to determine whether a reconfiguration should happen based on health.
For more information, refer to [Documentation/runtime-reconfiguration.md].
For more information, refer to
[Documentation/runtime-reconf-design.md](https://github.com/coreos/etcd/blob/master/Documentation/runtime-reconf-design.md).
## 6) how does --peers work with etcdctl?
## 6) how does --endpoint work with etcdctl?
The `--peers` flag can specify any number of etcd cluster members in a comma
The `--endpoint` flag can specify any number of etcd cluster members in a comma
separated list. This list might be a subset, equal to, or more than the actual
etcd cluster member list itself.
If only one peer is specified via the `--peers` flag, the etcdctl discovers the
If only one peer is specified via the `--endpoint` flag, the etcdctl discovers the
rest of the cluster via the member list of that one peer, and then it randomly
chooses a member to use. Again, the client can use the `quorum=true` flag on
reads, which will always fail when using a member in the minority.
If peers from multiple clusters are specified via the `--peers` flag, etcdctl
If peers from multiple clusters are specified via the `--endpoint` flag, etcdctl
will randomly choose a peer, and the request will simply get routed to one of
the clusters. This is probably not what you want.
Note: --peers flag is now deprecated and --endpoint should be used instead,
as it might confuse users to give etcdctl a peerURL.

View File

@@ -1,35 +1,35 @@
## Glossary
# Glossary
This document defines the various terms used in etcd documentation, command line and source code.
### Node
## Node
Node is an instance of raft state machine.
It has a unique identification, and records other nodes' progress internally when it is the leader.
### Member
## Member
Member is an instance of etcd. It hosts a node, and provides service to clients.
### Cluster
## Cluster
Cluster consists of several members.
The node in each member follows the raft consensus protocol to replicate logs. The cluster receives proposals from members, commits them, and applies them to the local store.
### Peer
## Peer
Peer is another member of the same cluster.
### Proposal
## Proposal
A proposal is a request (for example a write request, a configuration change request) that needs to go through raft protocol.
### Client
## Client
Client is a caller of the cluster's HTTP API.
### Machine (deprecated)
## Machine (deprecated)
The alternative of Member in etcd before 2.0

View File

@@ -46,7 +46,7 @@ ExecStart=/usr/bin/etcd
There are several error cases:
0) Init has already ran and the data directory is already configured
0) Init has already run and the data directory is already configured
1) Discovery fails because of network timeout, etc
2) Discovery fails because the cluster is already full and etcd needs to fall back to proxy
3) Static cluster configuration fails because of conflict, misconfiguration or timeout

View File

@@ -1,19 +1,24 @@
## Libraries and Tools
# Libraries and Tools
**Tools**
- [etcdctl](https://github.com/coreos/etcdctl) - A command line client for etcd
- [etcdctl](https://github.com/coreos/etcd/tree/master/etcdctl) - A command line client for etcd
- [etcd-backup](https://github.com/fanhattan/etcd-backup) - A powerful command line utility for dumping/restoring etcd - Supports v2
- [etcd-dump](https://npmjs.org/package/etcd-dump) - Command line utility for dumping/restoring etcd.
- [etcd-fs](https://github.com/xetorthio/etcd-fs) - FUSE filesystem for etcd
- [etcddir](https://github.com/rekby/etcddir) - Realtime sync etcd and local directory. Work with windows and linux.
- [etcd-browser](https://github.com/henszey/etcd-browser) - A web-based key/value editor for etcd using AngularJS
- [etcd-lock](https://github.com/datawisesystems/etcd-lock) - Master election & distributed r/w lock implementation using etcd - Supports v2
- [etcd-console](https://github.com/matishsiao/etcd-console) - A web-base key/value editor for etcd using PHP
- [etcd-viewer](https://github.com/nikfoundas/etcd-viewer) - An etcd key-value store editor/viewer written in Java
- [etcd-export](https://github.com/mickep76/etcd-export) - Export/Import etcd directory as JSON/YAML/TOML and Validate directory using JSON schema
- [etcd-rest](https://github.com/mickep76/etcd-rest) - Create generic REST API in Go using etcd as a backend with validation using JSON schema
- [etcdsh](https://github.com/kamilhark/etcdsh) - A command line client with support of command history and tab completion. Supports v2
**Go libraries**
- [go-etcd](https://github.com/coreos/go-etcd) - Supports v2
- [etcd/client](https://github.com/coreos/etcd/blob/master/client) - the officially maintained Go client
- [go-etcd](https://github.com/coreos/go-etcd) - the deprecated official client. May be useful for older (<2.0.0) versions of etcd.
**Java libraries**
@@ -49,6 +54,7 @@
**C++ libraries**
- [edwardcapriolo/etcdcpp](https://github.com/edwardcapriolo/etcdcpp) - Supports v2
- [suryanathan/etcdcpp](https://github.com/suryanathan/etcdcpp) - Supports v2 (with waits)
**Clojure libraries**
@@ -111,7 +117,7 @@ A detailed recap of client functionalities can be found in the [clients compatib
- [configdb](https://git.autistici.org/ai/configdb/tree/master) - A REST relational abstraction on top of arbitrary database backends, aimed at storing configs and inventories.
- [scrz](https://github.com/scrz/scrz) - Container manager, stores configuration in etcd.
- [fleet](https://github.com/coreos/fleet) - Distributed init system
- [GoogleCloudPlatform/kubernetes](https://github.com/GoogleCloudPlatform/kubernetes) - Container cluster manager.
- [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes) - Container cluster manager introduced by Google.
- [mailgun/vulcand](https://github.com/mailgun/vulcand) - HTTP proxy that uses etcd as a configuration backend.
- [duedil-ltd/discodns](https://github.com/duedil-ltd/discodns) - Simple DNS nameserver using etcd as a database for names and records.
- [skynetservices/skydns](https://github.com/skynetservices/skydns) - RFC compliant DNS server

View File

@@ -1,19 +1,19 @@
## Metrics
# Metrics
**NOTE: The metrics feature is considered as an experimental. We might add/change/remove metrics without warning in the future releases.**
**NOTE: The metrics feature is considered experimental. We may add/change/remove metrics without warning in future releases.**
etcd uses [Prometheus](http://prometheus.io/) for metrics reporting in the server. The metrics can be used for real-time monitoring and debugging.
The simplest way to see the available metrics is to cURL the metrics endpoint `/metrics` of etcd. The format is described [here](http://prometheus.io/docs/instrumenting/exposition_formats/).
You can also follow the doc [here](http://prometheus.io/docs/introduction/getting_started/) to start a Promethus server and monitor etcd metrics.
You can also follow the doc [here](http://prometheus.io/docs/introduction/getting_started/) to start a Prometheus server and monitor etcd metrics.
The naming of metrics follows the suggested [best practice of Promethus](http://prometheus.io/docs/practices/naming/). A metric name has an `etcd` prefix as its namespace and a subsystem prefix (for example `wal` and `etcdserver`).
The naming of metrics follows the suggested [best practice of Prometheus](http://prometheus.io/docs/practices/naming/). A metric name has an `etcd` prefix as its namespace and a subsystem prefix (for example `wal` and `etcdserver`).
etcd now exposes the following metrics:
### etcdserver
## etcdserver
| Name | Description | Type |
|-----------------------------------------|--------------------------------------------------|---------|
@@ -30,46 +30,7 @@ Pending proposal (`pending_proposal_total`) gives you an idea about how many pro
Failed proposals (`proposal_failed_total`) are normally related to two issues: temporary failures related to a leader election or longer duration downtime caused by a loss of quorum in the cluster.
### store
These metrics describe the accesses into the data store of etcd members that exist in the cluster. They
are useful to count what kind of actions are taken by users. It is also useful to see whether all etcd members
"see" the same set of data mutations, and whether reads and watches (which are local) are equally distributed.
All these metrics are prefixed with `etcd_store_`.
| Name | Description | Type |
|---------------------------|------------------------------------------------------------------------------------------|--------------------|
| reads_total | Total number of reads from store, should differ among etcd members (local reads). | Counter(action) |
| writes_total | Total number of writes to store, should be same among all etcd members. | Counter(action) |
| reads_failed_total | Number of failed reads from store (e.g. key missing) on local reads. | Counter(action) |
| writes_failed_total | Number of failed writes to store (e.g. failed compare and swap). | Counter(action) |
| expires_total | Total number of expired keys (due to TTL).   | Counter |
| watch_requests_totals | Total number of incoming watch requests to this etcd member (local watches). | Counter |
| watchers | Current count of active watchers on this etcd member. | Gauge |
Both `reads_total` and `writes_total` count both successful and failed requests. `reads_failed_total` and
`writes_failed_total` count failed requests. A lot of failed writes indicate possible contentions on keys (e.g. when
doing `compareAndSet`), and read failures indicate that some clients try to access keys that don't exist.
Example Prometheus queries that may be useful from these metrics (across all etcd members):
* `sum(rate(etcd_store_reads_total{job="etcd"}[1m])) by (action)`
`max(rate(etcd_store_writes_total{job="etcd"}[1m])) by (action)`
Rate of reads and writes by action, across all servers across a time window of `1m`. The reason why `max` is used
for writes as opposed to `sum` for reads is because all of etcd nodes in the cluster apply all writes to their stores.
Shows the rate of successful readonly/write queries across all servers, across a time window of `1m`.
* `sum(rate(etcd_store_watch_requests_total{job="etcd"}[1m]))`
Shows rate of new watch requests per second. Likely driven by how often watched keys change.
* `sum(etcd_store_watchers{job="etcd"})`
Number of active watchers across all etcd servers.
### wal
## wal
| Name | Description | Type |
|------------------------------------|--------------------------------------------------|---------|
@@ -78,7 +39,39 @@ Example Prometheus queries that may be useful from these metrics (across all etc
Abnormally high fsync duration (`fsync_durations_microseconds`) indicates disk issues and might cause the cluster to be unstable.
### snapshot
## http requests
These metrics describe the serving of requests (non-watch events) served by etcd members in non-proxy mode: total
incoming requests, request failures and processing latency (inc. raft rounds for storage). They are useful for tracking
user-generated traffic hitting the etcd cluster.
All these metrics are prefixed with `etcd_http_`.
| Name | Description | Type |
|--------------------------------|-----------------------------------------------------------------------------------------|--------------------|
| received_total | Total number of events after parsing and auth. | Counter(method) |
| failed_total | Total number of failed events.   | Counter(method,error) |
| successful_duration_second | Bucketed handling times of the requests, including raft rounds for writes. | Histogram(method) |
Example Prometheus queries that may be useful from these metrics (across all etcd members):
* `sum(rate(etcd_http_failed_total{job="etcd"}[1m])) by (method) / sum(rate(etcd_http_received_total{job="etcd"}[1m])) by (method)`
Shows the fraction of events that failed by HTTP method across all members, across a time window of `1m`.
* `sum(rate(etcd_http_received_total{job="etcd",method="GET"}[1m])) by (method)`
`sum(rate(etcd_http_received_total{job="etcd",method!="GET"}[1m])) by (method)`
Shows the rate of received readonly/write requests (respectively) across all servers, across a time window of `1m`.
* `histogram_quantile(0.9, sum(increase(etcd_http_successful_processing_seconds{job="etcd",method="GET"}[5m]) ) by (le))`
`histogram_quantile(0.9, sum(increase(etcd_http_successful_processing_seconds{job="etcd",method!="GET"}[5m]) ) by (le))`
Shows the 0.9-quantile latency (in seconds) of read/write (respectively) event handling across all members, with a window of `5m`.
## snapshot
| Name | Description | Type |
|--------------------------------------------|------------------------------------------------------------|---------|
@@ -87,7 +80,7 @@ Abnormally high fsync duration (`fsync_durations_microseconds`) indicates disk i
Abnormally high snapshot duration (`snapshot_save_total_durations_microseconds`) indicates disk issues and might cause the cluster to be unstable.
### rafthttp
## rafthttp
| Name | Description | Type | Labels |
|-----------------------------------|--------------------------------------------|---------|--------------------------------|
@@ -106,7 +99,7 @@ Label `msgType` is the type of raft message. `MsgApp` is log replication message
Label `remoteID` is the member ID of the message destination.
### proxy
## proxy
etcd members operating in proxy mode do not do store operations. They forward all requests
to cluster instances.
@@ -134,4 +127,4 @@ Example Prometheus queries that may be useful from these metrics (across all etc
* `sum(rate(etcd_proxy_dropped_total{job="etcd"}[1m])) by (proxying_error)`
Number of failed requests on the proxy. This should be 0; spikes here indicate connectivity issues to the etcd cluster.

View File

@@ -1,4 +1,4 @@
## Members API
# Members API
* [List members](#list-members)
* [Add a member](#add-a-member)
@@ -103,7 +103,7 @@ Change the peer urls of a given member. The member ID must be a hex-encoded uint
If the POST body is malformed an HTTP 400 will be returned. If the member does not exist in the cluster an HTTP 404 will be returned. If any of the given peerURLs exists in the cluster an HTTP 409 will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
#### Request
### Request
```
PUT /v2/members/<id> HTTP/1.1
@@ -111,7 +111,7 @@ PUT /v2/members/<id> HTTP/1.1
{"peerURLs": ["http://10.0.0.10:2380"]}
```
#### Example
### Example
```sh
curl http://10.0.0.10:2379/v2/members/272e204152 -XPUT \

View File

@@ -1,4 +1,6 @@
# etcd in Production
etcd is being used successfully by many companies in production. It is,
however, under active development and systems like etcd are difficult to get
correct. If you are comfortable with bleeding-edge software please use etcd and
however, under active development, and systems like etcd are difficult to get
correct. If you are comfortable with bleeding-edge software, please use etcd and
provide us with the feedback and testing young software needs.

View File

@@ -1,6 +1,6 @@
## Proxy
# Proxy
etcd can now run as a transparent proxy. Running etcd as a proxy allows for easily discovery of etcd within your infrastructure, since it can run on each machine as a local service. In this mode, etcd acts as a reverse proxy and forwards client requests to an active etcd cluster. The etcd proxy does not participate in the consensus replication of the etcd cluster, thus it neither increases the resilience nor decreases the write performance of the etcd cluster.
etcd can now run as a transparent proxy. Running etcd as a proxy allows for easy discovery of etcd within your infrastructure, since it can run on each machine as a local service. In this mode, etcd acts as a reverse proxy and forwards client requests to an active etcd cluster. The etcd proxy does not participate in the consensus replication of the etcd cluster, thus it neither increases the resilience nor decreases the write performance of the etcd cluster.
etcd currently supports two proxy modes: `readwrite` and `readonly`. The default mode is `readwrite`, which forwards both read and write requests to the etcd cluster. A `readonly` etcd proxy only forwards read requests to the etcd cluster, and returns `HTTP 501` to all write requests.
@@ -8,30 +8,114 @@ The proxy will shuffle the list of cluster members periodically to avoid sending
The member list used by the proxy consists of all client URLs advertised within the cluster, as specified in each member's `-advertise-client-urls` flag. If this flag is set incorrectly, requests sent to the proxy are forwarded to wrong addresses and then fail. Including URLs in the `-advertise-client-urls` flag that point to the proxy itself, e.g. http://localhost:2379, is even more problematic as it will cause loops, because the proxy keeps trying to forward requests to itself until its resources (memory, file descriptors) are eventually depleted. The fix for this problem is to restart the etcd member with the correct `-advertise-client-urls` flag. After the client URL list in the proxy is recalculated, which happens every 30 seconds, requests will be forwarded correctly.
### Using an etcd proxy
## Using an etcd proxy
To start etcd in proxy mode, you need to provide three flags: `proxy`, `listen-client-urls`, and `initial-cluster` (or `discovery`).
To start a readwrite proxy, set `-proxy on`; to start a readonly proxy, set `-proxy readonly`.
The proxy will listen on `listen-client-urls` and forward requests to the etcd cluster discovered from the `initial-cluster` or `discovery` url.
#### Start an etcd proxy with a static configuration
### Start an etcd proxy with a static configuration
To start a proxy that will connect to a statically defined etcd cluster, specify the `initial-cluster` flag:
```
etcd -proxy on -listen-client-urls http://127.0.0.1:8080 -initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380
etcd -proxy on \
-listen-client-urls http://127.0.0.1:8080 \
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380
```
#### Start an etcd proxy with the discovery service
### Start an etcd proxy with the discovery service
If you bootstrap an etcd cluster using the [discovery service][discovery-service], you can also start the proxy with the same `discovery`.
To start a proxy using the discovery service, specify the `discovery` flag. The proxy will wait until the etcd cluster defined at the `discovery` url finishes bootstrapping, and then start to forward the requests.
```
etcd -proxy on -listen-client-urls http://127.0.0.1:8080 -discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
etcd -proxy on \
-listen-client-urls http://127.0.0.1:8080 \
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
#### Fallback to proxy mode with discovery service
## Fallback to proxy mode with discovery service
If you bootstrap an etcd cluster using the [discovery service][discovery-service] with more than the expected number of etcd members, the extra etcd processes will fall back to being `readwrite` proxies by default. They will forward the requests to the cluster as described above. For example, if you create a discovery url with `size=5`, and start ten etcd processes using that same discovery url, the result will be a cluster with five etcd members and five proxies. Note that this behaviour can be disabled with the `discovery-fallback` flag.
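A minimal sketch of turning the fallback off, assuming the setting is the one the configuration reference lists as `-discovery-fallback` (env variable `ETCD_DISCOVERY_FALLBACK`, values "exit" or "proxy"). The discovery URL reuses the example token from this document, and `ETCD_DISCOVERY` is assumed to be the environment form of the `discovery` flag.

```sh
# Join through the discovery service, but exit instead of becoming a proxy
# when the cluster already has its expected number of members.
# The token below is the example token used in this document.
export ETCD_DISCOVERY=https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
export ETCD_DISCOVERY_FALLBACK=exit

# Start etcd with its usual member flags after exporting these.
```

With this setting, the five extra processes in the `size=5` example would be expected to exit rather than run as proxies.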
## Promote a proxy to a member of etcd cluster
A proxy is part of the etcd cluster but does not participate in consensus. A proxy will never automatically promote itself to an etcd member that participates in consensus.
If you want to promote a proxy to an etcd member, there are four steps you need to follow:
- use etcdctl to add the proxy node as an etcd member into the existing cluster
- stop the etcd proxy process or service
- remove the existing proxy data directory
- restart the etcd process with new member configuration
## Example
We assume you have a one member etcd cluster with one proxy. The cluster information is listed below:
|Name|Address|
|------|---------|
|infra0|10.0.1.10|
|proxy0|10.0.1.11|
This example walks you through promoting one proxy to an etcd member. The cluster will become a two member cluster after finishing the four steps.
### Add a new member into the existing cluster
First, use etcdctl to add the member to the cluster, which will output the environment variables needed to correctly configure the new member:
``` bash
$ etcdctl -endpoint http://10.0.1.10:2379 member add infra1 http://10.0.1.11:2380
added member 9bf1b35fc7761a23 to cluster
ETCD_NAME="infra1"
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380"
ETCD_INITIAL_CLUSTER_STATE=existing
```
### Stop the proxy process
Stop the existing proxy so we can wipe its state on disk and reload it with the new configuration:
``` bash
ps aux | grep etcd
kill %etcd_proxy_pid%
```
or (if you are running etcd proxy as etcd service under systemd)
``` bash
sudo systemctl stop etcd
```
### Remove the existing proxy data dir
``` bash
rm -rf %data_dir%/proxy
```
### Start etcd as a new member
Finally, start the reconfigured member and make sure it joins the cluster correctly:
``` bash
$ export ETCD_NAME="infra1"
$ export ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380"
$ export ETCD_INITIAL_CLUSTER_STATE=existing
$ etcd -listen-client-urls http://10.0.1.11:2379 \
-advertise-client-urls http://10.0.1.11:2379 \
-listen-peer-urls http://10.0.1.11:2380 \
-initial-advertise-peer-urls http://10.0.1.11:2380 \
-data-dir %data_dir%
```
If you are running etcd under systemd, you should modify the service file with correct configuration and restart the service:
``` bash
sudo systemctl restart etcd
```
If you see an error, you can read the [add member troubleshooting doc](runtime-configuration.md#error-cases-when-adding-members).
[discovery-service]: clustering.md#discovery

View File

@@ -1,4 +1,4 @@
## Reporting Bugs
# Reporting Bugs
If you find bugs or documentation mistakes in etcd project, please let us know by [opening an issue](https://github.com/coreos/etcd/issues/new). We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check there that one does not already exist.
@@ -20,7 +20,7 @@ We might ask you for further information to locate a bug. A duplicated bug repor
## Frequently Asked Questions
### How to get stack trace
### How to get a stack trace
``` bash
$ kill -QUIT $PID

View File

@@ -1,4 +1,4 @@
## Design
# Design
1. Flatten binary key-value space
@@ -32,9 +32,9 @@
[protobuf](./v3api.proto)
### Examples
## Examples
#### Put a key (foo=bar)
### Put a key (foo=bar)
```
// A put is always successful
Put( PutRequest { key = foo, value = bar } )
@@ -47,7 +47,7 @@ PutResponse {
}
```
#### Get a key (assume we have foo=bar)
### Get a key (assume we have foo=bar)
```
Get ( RangeRequest { key = foo } )
@@ -68,7 +68,7 @@ RangeResponse {
}
```
#### Range over a key space (assume we have foo0=bar0… foo100=bar100)
### Range over a key space (assume we have foo0=bar0… foo100=bar100)
```
Range ( RangeRequest { key = foo, end_key = foo80, limit = 30 } )
@@ -97,7 +97,7 @@ RangeResponse {
}
```
#### Finish a txn (assume we have foo0=bar0, foo1=bar1)
### Finish a txn (assume we have foo0=bar0, foo1=bar1)
```
Txn(TxnRequest {
// mod_revision of foo0 is equal to 1, mod_revision of foo1 is greater than 1
@@ -129,7 +129,7 @@ TxnResponse {
}
```
#### Watch on a key/range
### Watch on a key/range
```
Watch( WatchRequest{

View File

@@ -1,7 +1,6 @@
syntax = "proto3";
// Interface exported by the server.
service etcd {
service KV {
// Range gets the keys in the range from the store.
rpc Range(RangeRequest) returns (RangeResponse) {}
@@ -20,11 +19,6 @@ service etcd {
// and generates events with the same revision in the event history.
rpc Txn(TxnRequest) returns (TxnResponse) {}
// Watch watches the events happening or happened in etcd. Both input and output
// are stream. One watch rpc can watch for multiple ranges and get a stream of
// events. The whole events history can be watched unless compacted.
rpc WatchRange(stream WatchRangeRequest) returns (stream WatchRangeResponse) {}
// Compact compacts the event history in etcd. User should compact the
// event history periodically, or it will grow infinitely.
rpc Compact(CompactionRequest) returns (CompactionResponse) {}
@@ -50,6 +44,14 @@ service etcd {
rpc LeaseKeepAlive(stream LeaseKeepAliveRequest) returns (stream LeaseKeepAliveResponse) {}
}
service watch {
// Watch watches the events happening or happened. Both input and output
// are stream. One watch rpc can watch for multiple keys or prefixes and
// get a stream of events. The whole events history can be watched unless
// compacted.
rpc Watch(stream WatchRequest) returns (stream WatchResponse) {}
}
message ResponseHeader {
// an error type message?
string error = 1;
@@ -190,21 +192,22 @@ message KeyValue {
bytes value = 5;
}
message WatchRangeRequest {
// if the range_end is not given, the request returns the key.
message WatchRequest {
// the key to be watched
bytes key = 1;
// if the range_end is given, it gets the keys in range [key, range_end).
bytes range_end = 2;
// the prefix to be watched.
bytes prefix = 2;
// start_revision is an optional revision (including) to watch from. No start_revision is "now".
int64 start_revision = 3;
// end_revision is an optional revision (excluding) to end watch. No end_revision is "forever".
int64 end_revision = 4;
bool progress_notification = 5;
// TODO: support Range watch?
// TODO: support notification every time interval or revision increase?
// TODO: support cancel watch if the server cannot reach with majority?
}
message WatchRangeResponse {
message WatchResponse {
ResponseHeader header = 1;
repeated Event events = 2;
// TODO: support batched events response?
storagepb.Event event = 2;
}
message Event {

View File

@@ -1,8 +1,8 @@
## Runtime Reconfiguration
# Runtime Reconfiguration
etcd comes with support for incremental runtime reconfiguration, which allows users to update the membership of the cluster at run time.
Reconfiguration requests can only be processed when the the majority of the cluster members are functioning. It is **highly recommended** to always have a cluster size greater than two in production. It is unsafe to remove a member from a two member cluster. The majority of a two member cluster is also two. If there is a failure during the removal process, the cluster might not able to make progress and need to [restart from majority failure][majority failure].
Reconfiguration requests can only be processed when the majority of the cluster members are functioning. It is **highly recommended** to always have a cluster size greater than two in production. It is unsafe to remove a member from a two member cluster. The majority of a two member cluster is also two. If there is a failure during the removal process, the cluster might not able to make progress and need to [restart from majority failure][majority failure].
To better understand the design behind runtime reconfiguration, we suggest you read [this](runtime-reconf-design.md).
@@ -131,7 +131,7 @@ The new member will run as a part of the cluster and immediately begin catching
If you are adding multiple members the best practice is to configure a single member at a time and verify it starts correctly before adding more new members.
If you add a new member to a 1-node cluster, the cluster cannot make progress before the new member starts because it needs two members as majority to agree on the consensus. You will only see this behavior between the time `etcdctl member add` informs the cluster about the new member and the new member successfully establishing a connection to the existing one.
#### Error Cases
#### Error Cases When Adding Members
In the following case we have not included our new host in the list of enumerated nodes.
If this is a new cluster, the node must be added to the list of initial cluster members.
@@ -154,10 +154,18 @@ etcdserver: assign ids error: unmatched member while checking PeerURLs
exit 1
```
When we start etcd using the data directory of a removed member, etcd will exit automatically if it connects to any alive member in the cluster:
When we start etcd using the data directory of a removed member, etcd will exit automatically if it connects to any active member in the cluster:
```sh
$ etcd
etcd: this member has been permanently removed from the cluster. Exiting.
exit 1
```
### Strict Reconfiguration Check Mode (`-strict-reconfig-check`)
As described above, the best practice for adding new members is to configure a single member at a time and verify that it starts correctly before adding more. This step-by-step approach is very important because if a newly added member is not configured correctly (for example, its peer URLs are incorrect), the cluster can lose quorum. Quorum loss happens because the newly added member is counted in the quorum even if it is not reachable from the existing members. Quorum loss can also happen if there is a connectivity issue or another operational issue.
To avoid this problem, etcd provides the `-strict-reconfig-check` option. If this option is passed to etcd, etcd rejects reconfiguration requests if the number of started members would be less than a quorum of the reconfigured cluster.
Enabling this option is recommended. However, it is disabled by default to preserve compatibility.
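The rule above can be sketched in a few lines of Go. This is only an illustration of the stated rule, not etcd's implementation: the `quorum` and `allowReconfig` helpers are hypothetical names, and the sketch applies the rule literally without modeling any special cases.

```go
package main

import "fmt"

// quorum returns the majority size for a cluster of n members.
func quorum(n int) int { return n/2 + 1 }

// allowReconfig reports whether a reconfiguration would pass a strict check:
// the started members must still form a quorum of the reconfigured cluster.
// Hypothetical helper for illustration; not part of etcd's API.
func allowReconfig(startedMembers, newClusterSize int) bool {
	return startedMembers >= quorum(newClusterSize)
}

func main() {
	// A 3-member cluster with only 2 started members wants to add a 4th:
	// the reconfigured cluster needs 3 for quorum, so the request is rejected.
	fmt.Println(allowReconfig(2, 4))
	// With all 3 members started, the same request is accepted.
	fmt.Println(allowReconfig(3, 4))
}
```

The two-member case from the top of this document follows from the same arithmetic: `quorum(2)` is 2, so a two-member cluster cannot tolerate any failure.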

@@ -1,10 +1,10 @@
### Design of Runtime Reconfiguration
# Design of Runtime Reconfiguration
Runtime reconfiguration is one of the hardest and most error prone features in a distributed system, especially in a consensus based system like etcd.
Read on to learn about the design of etcd's runtime reconfiguration commands and how we tackled these problems.
### Two Phase Config Changes Keep you Safe
## Two Phase Config Changes Keep you Safe
In etcd, every runtime reconfiguration has to go through [two phases](Documentation/runtime-configuration.md#add-a-new-member) for safety reasons. For example, to add a member you need to first inform cluster of new configuration and then start the new member.
@@ -22,7 +22,7 @@ Without the explicit workflow around cluster membership etcd would be vulnerable
We think runtime reconfiguration should be a low frequent operation. We made the decision to keep it explicit and user-driven to ensure configuration safety and keep your cluster always running smoothly under your control.
### Permanent Loss of Quorum Requires New Cluster
## Permanent Loss of Quorum Requires New Cluster
If a cluster permanently loses a majority of its members, a new cluster will need to be started from an old data directory to recover the previous state.
@@ -30,7 +30,7 @@ It is entirely possible to force removing the failed members from the existing c
If you have a correct deployment, the possibility of permanent majority loss is very low. But it is a severe enough problem that it is worth special care. We strongly suggest that you read the [disaster recovery documentation](admin_guide.md#disaster-recovery) and prepare for permanent majority loss before you put etcd into production.
### Do Not Use Public Discovery Service For Runtime Reconfiguration
## Do Not Use Public Discovery Service For Runtime Reconfiguration
The public discovery service should only be used for bootstrapping a cluster. To join a member to an existing cluster, you should use the runtime reconfiguration API.
@@ -38,10 +38,10 @@ Discovery service is designed for bootstrapping an etcd cluster in the cloud env
It seems that using public discovery service is a convenient way to do runtime reconfiguration, after all discovery service already has all the cluster configuration information. However relying on public discovery service brings troubles:
1. it introduces a external dependencies for the entire life-cycle of your cluster, not just bootstrap time. If there is a network issue between your cluster and public discover service, your cluster will suffer from it.
1. it introduces external dependencies for the entire life-cycle of your cluster, not just bootstrap time. If there is a network issue between your cluster and public discovery service, your cluster will suffer from it.
2. public discovery service must reflect the correct runtime configuration of your cluster during its life-cycle. It has to provide a security mechanism to prevent bad actions, and that is hard.
3. public discovery service has to keep tens of thousands of cluster configurations. Our public discovery service backend is not ready for that workload.
If you want to have a discovery service that supports runtime reconfiguration, the best choice is to build your private one.
If you want to have a discovery service that supports runtime reconfiguration, the best choice is to build your private one.

@@ -97,7 +97,7 @@ $ curl --cacert /path/to/ca.crt --cert /path/to/client.crt --key /path/to/client
-L https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
```
You should able to see:
You should be able to see:
```
...
@@ -138,7 +138,7 @@ $ etcd -name infra1 -data-dir infra1 \
# member2
$ etcd -name infra2 -data-dir infra2 \
-peer-client-cert-atuh -peer-trusted-ca-file=/path/to/ca.crt -peer-cert-file=/path/to/member2.crt -peer-key-file=/path/to/member2.key \
-peer-client-cert-auth -peer-trusted-ca-file=/path/to/ca.crt -peer-cert-file=/path/to/member2.crt -peer-key-file=/path/to/member2.key \
-initial-advertise-peer-urls=https://10.0.1.11:2380 -listen-peer-urls=https://10.0.1.11:2380 \
-discovery ${DISCOVERY_URL}
```
@@ -152,7 +152,7 @@ The etcd members will form a cluster and all communication between members in th
The internal protocol of etcd v2.0.x uses a lot of short-lived HTTP connections.
So, when enabling TLS you may need to increase the heartbeat interval and election timeouts to reduce internal cluster connection churn.
A reasonable place to start are these values: ` --heartbeat-interval 500 --election-timeout 2500`.
This issues is resolved in the etcd v2.1.x series of releases which uses fewer connections.
These issues are resolved in the etcd v2.1.x series of releases which uses fewer connections.
### I'm seeing a SSLv3 alert handshake failure when using SSL client authentication?

@@ -1,16 +1,16 @@
## Tuning
# Tuning
The default settings in etcd should work well for installations on a local network where the average network latency is low.
However, when using etcd across multiple data centers or over networks with high latency you may need to tweak the heartbeat interval and election timeout settings.
The network isn't the only source of latency. Each request and response may be impacted by slow disks on both the leader and follower. Each of these timeouts represents the total time from request to successful response from the other machine.
### Time Parameters
## Time Parameters
The underlying distributed consensus protocol relies on two separate time parameters to ensure that nodes can handoff leadership if one stalls or goes offline.
The first parameter is called the *Heartbeat Interval*.
This is the frequency with which the leader will notify followers that it is still the leader.
For best pratices, the parameter should be set around round-trip time between members.
For best practices, the parameter should be set around round-trip time between members.
By default, etcd uses a `100ms` heartbeat interval.
The second parameter is the *Election Timeout*.
@@ -27,11 +27,14 @@ The election timeout should be set based on the heartbeat interval and average r
Election timeouts must be at least 10 times the round-trip time so it can account for variance in your network.
For example, if the round-trip time between your members is 10ms then you should have at least a 100ms election timeout.
The upper limit of election timeout is 50000ms, which should only be used when deploying global etcd cluster. First, 5s is the upper limit of average global round-trip time. A reasonable round-trip time for the continental united states is 130ms, and the time between US and japan is around 350-400ms. Because package gets delayed a lot, and network situation may be terrible, 5s is a safe value for it. Then, because election timeout should be an order of magnitude bigger than broadcast time, 50s becomes its maximum.
You should also set your election timeout to at least 5 to 10 times your heartbeat interval to account for variance in leader replication.
For a heartbeat interval of 50ms you should set your election timeout to at least 250ms - 500ms.
The upper limit of election timeout is 50000ms (50s), which should only be used when deploying a globally-distributed etcd cluster.
A reasonable round-trip time for the continental United States is 130ms, and the time between US and Japan is around 350-400ms.
If your network has uneven performance or regular packet delays/loss then it is possible that a couple of retries may be necessary to successfully send a packet. So 5s is a safe upper limit of global round-trip time.
As the election timeout should be an order of magnitude bigger than broadcast time, in the case of ~5s for a globally distributed cluster, then 50 seconds becomes a reasonable maximum.
The heartbeat interval and election timeout value should be the same for all members in one cluster. Setting different values for etcd members may disrupt cluster stability.
You can override the default values on the command line:
@@ -46,7 +49,7 @@ $ ETCD_HEARTBEAT_INTERVAL=100 ETCD_ELECTION_TIMEOUT=500 etcd
The values are specified in milliseconds.
### Snapshots
## Snapshots
etcd appends all key changes to a log file.
This log grows forever and is a complete linear history of every change made to the keys.

@@ -1,4 +1,4 @@
## Upgrade etcd to 2.1
# Upgrade etcd to 2.1
In the general case, upgrading from etcd 2.0 to 2.1 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v2.0 processes and replace them with etcd v2.1 processes
@@ -6,15 +6,15 @@ In the general case, upgrading from etcd 2.0 to 2.1 can be a zero-downtime, roll
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade Checklists
## Upgrade Checklists
#### Upgrade Requirement
### Upgrade Requirements
To upgrade an existing etcd deployment to 2.1, you must be running 2.0. If you're running a version of etcd before 2.0, you must upgrade to [2.0](https://github.com/coreos/etcd/releases/tag/v2.0.13) before upgrading to 2.1.
Also, to ensure a smooth rolling upgrade, your running cluster must be healthy. You can check the health of the cluster by using the `etcdctl cluster-health` command.
#### Preparedness
### Preparedness
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
@@ -22,13 +22,13 @@ You might also want to [backup your data directory](admin_guide.md#backing-up-th
etcd 2.1 introduces a new [authentication](auth_api.md) feature, which is disabled by default. If your deployment depends on these, you may want to test the auth features before enabling them in production.
#### Mixed Versions
### Mixed Versions
While upgrading, an etcd cluster supports mixed versions of etcd members. The cluster is only considered upgraded once all its members are upgraded to 2.1.
Internally, etcd members negotiate with each other to determine the overall etcd cluster version, which controls the reported cluster version and the supported features. For example, if you are mid-upgrade, any 2.1 features (such as the authentication feature mentioned above) won't be available.
#### Limitations
### Limitations
If you encounter any issues during the upgrade, you can attempt to restart the etcd process in trouble using a newer v2.1 binary to solve the problem. One known issue is that etcd v2.0.0 and v2.0.2 may panic during rolling upgrades due to an existing bug, which has been fixed since etcd v2.0.3.
@@ -36,7 +36,7 @@ It might take up to 2 minutes for the newly upgraded member to catch up with the
If you have even more data, this might take more time. If you have a data size larger than 100MB you should contact us before upgrading, so we can make sure the upgrades work smoothly.
#### Downgrade
### Downgrade
If all members have been upgraded to v2.1, the cluster will be upgraded to v2.1, and downgrade is **not possible**. If any member is still v2.0, the cluster will remain in v2.0, and you can go back to using the v2.0 binary.

@@ -1,4 +1,4 @@
## Upgrade etcd from 2.1 to 2.2
# Upgrade etcd from 2.1 to 2.2
In the general case, upgrading from etcd 2.1 to 2.2 can be a zero-downtime, rolling upgrade:
@@ -7,33 +7,33 @@ In the general case, upgrading from etcd 2.1 to 2.2 can be a zero-downtime, roll
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade Checklists
## Upgrade Checklists
#### Upgrade Requirement
### Upgrade Requirement
To upgrade an existing etcd deployment to 2.2, you must be running 2.1. If you're running a version of etcd before 2.1, you must upgrade to [2.1](https://github.com/coreos/etcd/releases/tag/v2.1.2) before upgrading to 2.2.
Also, to ensure a smooth rolling upgrade, your running cluster must be healthy. You can check the health of the cluster by using the `etcdctl cluster-health` command.
#### Preparedness
### Preparedness
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
You might also want to [backup your data directory](admin_guide.md#backing-up-the-datastore) for a potential [downgrade](#downgrade).
#### Mixed Versions
### Mixed Versions
While upgrading, an etcd cluster supports mixed versions of etcd members. The cluster is only considered upgraded once all its members are upgraded to 2.2.
Internally, etcd members negotiate with each other to determine the overall etcd cluster version, which controls the reported cluster version and the supported features.
#### Limitations
### Limitations
If you have a data size larger than 100MB you should contact us before upgrading, so we can make sure the upgrades work smoothly.
Every etcd 2.2 member periodically performs health checks across the cluster. etcd 2.1 members do not support health checking. During the upgrade, etcd 2.2 members will log warnings about the unhealthy state of etcd 2.1 members. You can ignore these warnings.
#### Downgrade
### Downgrade
If all members have been upgraded to v2.2, the cluster will be upgraded to v2.2, and downgrade is **not possible**. If any member is still v2.1, the cluster will remain in v2.1, and you can go back to using the v2.1 binary.

16
Godeps/Godeps.json generated

@@ -1,6 +1,6 @@
{
"ImportPath": "github.com/coreos/etcd",
"GoVersion": "go1.4.2",
"GoVersion": "go1.5",
"Packages": [
"./..."
],
@@ -10,6 +10,10 @@
"Comment": "null-5",
"Rev": "75cd24fc2f2c2a2088577d12123ddee5f54e0675"
},
{
"ImportPath": "github.com/akrennmair/gopcap",
"Rev": "00e11033259acb75598ba416495bb708d864a010"
},
{
"ImportPath": "github.com/beorn7/perks/quantile",
"Rev": "b965b613227fddccbfffe13eae360ed3fa822f8d"
@@ -27,6 +31,10 @@
"ImportPath": "github.com/bradfitz/http2",
"Rev": "3e36af6d3af0e56fa3da71099f864933dea3d9fb"
},
{
"ImportPath": "github.com/cheggaaa/pb",
"Rev": "da1f27ad1d9509b16f65f52fd9d8138b0f2dc7b2"
},
{
"ImportPath": "github.com/codegangsta/cli",
"Comment": "1.2.0-26-gf7ebb76",
@@ -53,7 +61,7 @@
},
{
"ImportPath": "github.com/coreos/pkg/capnslog",
"Rev": "2c77715c4df99b5420ffcae14ead08f52104065d"
"Rev": "42a8c3b1a6f917bb8346ef738f32712a7ca0ede7"
},
{
"ImportPath": "github.com/gogo/protobuf/proto",
@@ -102,8 +110,8 @@
"Rev": "454a56f35412459b5e684fd5ec0f9211b94f002a"
},
{
"ImportPath": "github.com/rakyll/pb",
"Rev": "dc507ad06b7462501281bb4691ee43f0b1d1ec37"
"ImportPath": "github.com/spacejam/loghisto",
"Rev": "323309774dec8b7430187e46cd0793974ccca04a"
},
{
"ImportPath": "github.com/stretchr/testify/assert",

@@ -0,0 +1,5 @@
#*
*~
/tools/pass/pass
/tools/pcaptest/pcaptest
/tools/tcpdump/tcpdump

@@ -0,0 +1,27 @@
Copyright (c) 2009-2011 Andreas Krennmair. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Andreas Krennmair nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

@@ -0,0 +1,11 @@
# PCAP
This is a simple wrapper around libpcap for Go. Originally written by Andreas
Krennmair <ak@synflood.at> and only minorly touched up by Mark Smith <mark@qq.is>.
Please see the included pcaptest.go and tcpdump.go programs for instructions on
how to use this library.
Miek Gieben <miek@miek.nl> has created a more Go-like package and replaced functionality
with standard functions from the standard library. The package has also been renamed to
pcap.

@@ -0,0 +1,527 @@
package pcap
import (
"encoding/binary"
"fmt"
"net"
"reflect"
"strings"
)
const (
TYPE_IP = 0x0800
TYPE_ARP = 0x0806
TYPE_IP6 = 0x86DD
TYPE_VLAN = 0x8100
IP_ICMP = 1
IP_INIP = 4
IP_TCP = 6
IP_UDP = 17
)
const (
ERRBUF_SIZE = 256
// According to pcap-linktype(7).
LINKTYPE_NULL = 0
LINKTYPE_ETHERNET = 1
LINKTYPE_TOKEN_RING = 6
LINKTYPE_ARCNET = 7
LINKTYPE_SLIP = 8
LINKTYPE_PPP = 9
LINKTYPE_FDDI = 10
LINKTYPE_ATM_RFC1483 = 100
LINKTYPE_RAW = 101
LINKTYPE_PPP_HDLC = 50
LINKTYPE_PPP_ETHER = 51
LINKTYPE_C_HDLC = 104
LINKTYPE_IEEE802_11 = 105
LINKTYPE_FRELAY = 107
LINKTYPE_LOOP = 108
LINKTYPE_LINUX_SLL = 113
LINKTYPE_LTALK = 114
LINKTYPE_PFLOG = 117
LINKTYPE_PRISM_HEADER = 119
LINKTYPE_IP_OVER_FC = 122
LINKTYPE_SUNATM = 123
LINKTYPE_IEEE802_11_RADIO = 127
LINKTYPE_ARCNET_LINUX = 129
LINKTYPE_LINUX_IRDA = 144
LINKTYPE_LINUX_LAPD = 177
)
type addrHdr interface {
SrcAddr() string
DestAddr() string
Len() int
}
type addrStringer interface {
String(addr addrHdr) string
}
func decodemac(pkt []byte) uint64 {
mac := uint64(0)
for i := uint(0); i < 6; i++ {
mac = (mac << 8) + uint64(pkt[i])
}
return mac
}
// Decode decodes the headers of a Packet.
func (p *Packet) Decode() {
if len(p.Data) <= 14 {
return
}
p.Type = int(binary.BigEndian.Uint16(p.Data[12:14]))
p.DestMac = decodemac(p.Data[0:6])
p.SrcMac = decodemac(p.Data[6:12])
if len(p.Data) >= 15 {
p.Payload = p.Data[14:]
}
switch p.Type {
case TYPE_IP:
p.decodeIp()
case TYPE_IP6:
p.decodeIp6()
case TYPE_ARP:
p.decodeArp()
case TYPE_VLAN:
p.decodeVlan()
}
}
func (p *Packet) headerString(headers []interface{}) string {
// If there's just one header, return that.
if len(headers) == 1 {
if hdr, ok := headers[0].(fmt.Stringer); ok {
return hdr.String()
}
}
// If there are two headers (IPv4/IPv6 -> TCP/UDP/IP..)
if len(headers) == 2 {
// Commonly the first header is an address.
if addr, ok := headers[0].(addrHdr); ok {
if hdr, ok := headers[1].(addrStringer); ok {
return fmt.Sprintf("%s %s", p.Time, hdr.String(addr))
}
}
}
// For IP in IP, we do a recursive call.
if len(headers) >= 2 {
if addr, ok := headers[0].(addrHdr); ok {
if _, ok := headers[1].(addrHdr); ok {
return fmt.Sprintf("%s > %s IP in IP: %s",
addr.SrcAddr(), addr.DestAddr(), p.headerString(headers[1:]))
}
}
}
var typeNames []string
for _, hdr := range headers {
typeNames = append(typeNames, reflect.TypeOf(hdr).String())
}
return fmt.Sprintf("unknown [%s]", strings.Join(typeNames, ","))
}
// String prints a one-line representation of the packet header.
// The output is suitable for use in a tcpdump program.
func (p *Packet) String() string {
// If there are no headers, print "unsupported protocol".
if len(p.Headers) == 0 {
return fmt.Sprintf("%s unsupported protocol %d", p.Time, int(p.Type))
}
return fmt.Sprintf("%s %s", p.Time, p.headerString(p.Headers))
}
// Arphdr is an ARP packet header.
type Arphdr struct {
Addrtype uint16
Protocol uint16
HwAddressSize uint8
ProtAddressSize uint8
Operation uint16
SourceHwAddress []byte
SourceProtAddress []byte
DestHwAddress []byte
DestProtAddress []byte
}
func (arp *Arphdr) String() (s string) {
switch arp.Operation {
case 1:
s = "ARP request"
case 2:
s = "ARP Reply"
default:
s = "ARP"
}
// Append the address details instead of overwriting the operation name.
if arp.Addrtype == LINKTYPE_ETHERNET && arp.Protocol == TYPE_IP {
s += fmt.Sprintf(" %012x (%s) > %012x (%s)",
decodemac(arp.SourceHwAddress), net.IP(arp.SourceProtAddress),
decodemac(arp.DestHwAddress), net.IP(arp.DestProtAddress))
} else {
s += fmt.Sprintf(" addrtype = %d protocol = %d", arp.Addrtype, arp.Protocol)
}
return
}
func (p *Packet) decodeArp() {
if len(p.Payload) < 8 {
return
}
pkt := p.Payload
arp := new(Arphdr)
arp.Addrtype = binary.BigEndian.Uint16(pkt[0:2])
arp.Protocol = binary.BigEndian.Uint16(pkt[2:4])
arp.HwAddressSize = pkt[4]
arp.ProtAddressSize = pkt[5]
arp.Operation = binary.BigEndian.Uint16(pkt[6:8])
// Compute offsets as int: the size fields are uint8, so uint8 arithmetic could wrap.
hlen, plen := int(arp.HwAddressSize), int(arp.ProtAddressSize)
end := 8 + 2*hlen + 2*plen
if len(pkt) < end {
return
}
arp.SourceHwAddress = pkt[8 : 8+hlen]
arp.SourceProtAddress = pkt[8+hlen : 8+hlen+plen]
arp.DestHwAddress = pkt[8+hlen+plen : 8+2*hlen+plen]
arp.DestProtAddress = pkt[8+2*hlen+plen : end]
p.Headers = append(p.Headers, arp)
p.Payload = p.Payload[end:]
}
// Iphdr is the header of an IP packet.
type Iphdr struct {
Version uint8
Ihl uint8
Tos uint8
Length uint16
Id uint16
Flags uint8
FragOffset uint16
Ttl uint8
Protocol uint8
Checksum uint16
SrcIp []byte
DestIp []byte
}
func (p *Packet) decodeIp() {
if len(p.Payload) < 20 {
return
}
pkt := p.Payload
ip := new(Iphdr)
ip.Version = uint8(pkt[0]) >> 4
ip.Ihl = uint8(pkt[0]) & 0x0F
ip.Tos = pkt[1]
ip.Length = binary.BigEndian.Uint16(pkt[2:4])
ip.Id = binary.BigEndian.Uint16(pkt[4:6])
flagsfrags := binary.BigEndian.Uint16(pkt[6:8])
ip.Flags = uint8(flagsfrags >> 13)
ip.FragOffset = flagsfrags & 0x1FFF
ip.Ttl = pkt[8]
ip.Protocol = pkt[9]
ip.Checksum = binary.BigEndian.Uint16(pkt[10:12])
ip.SrcIp = pkt[12:16]
ip.DestIp = pkt[16:20]
pEnd := int(ip.Length)
if pEnd > len(pkt) {
pEnd = len(pkt)
}
if len(pkt) >= pEnd && int(ip.Ihl*4) < pEnd {
p.Payload = pkt[ip.Ihl*4 : pEnd]
} else {
p.Payload = []byte{}
}
p.Headers = append(p.Headers, ip)
p.IP = ip
switch ip.Protocol {
case IP_TCP:
p.decodeTcp()
case IP_UDP:
p.decodeUdp()
case IP_ICMP:
p.decodeIcmp()
case IP_INIP:
p.decodeIp()
}
}
func (ip *Iphdr) SrcAddr() string { return net.IP(ip.SrcIp).String() }
func (ip *Iphdr) DestAddr() string { return net.IP(ip.DestIp).String() }
func (ip *Iphdr) Len() int { return int(ip.Length) }
type Vlanhdr struct {
Priority byte
DropEligible bool
VlanIdentifier int
Type int // Not actually part of the vlan header, but the type of the actual packet
}
// String returns a one-line description of the VLAN header.
func (v *Vlanhdr) String() string {
return fmt.Sprintf("VLAN Priority:%d Drop:%v Tag:%d", v.Priority, v.DropEligible, v.VlanIdentifier)
}
func (p *Packet) decodeVlan() {
pkt := p.Payload
vlan := new(Vlanhdr)
if len(pkt) < 4 {
return
}
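// The TCI occupies the first two bytes: priority (3 bits), drop eligible (1 bit), VLAN ID (12 bits).
vlan.Priority = (pkt[0] & 0xE0) >> 5
vlan.DropEligible = pkt[0]&0x10 != 0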
vlan.VlanIdentifier = int(binary.BigEndian.Uint16(pkt[:2])) & 0x0FFF
vlan.Type = int(binary.BigEndian.Uint16(p.Payload[2:4]))
p.Headers = append(p.Headers, vlan)
if len(pkt) >= 5 {
p.Payload = p.Payload[4:]
}
switch vlan.Type {
case TYPE_IP:
p.decodeIp()
case TYPE_IP6:
p.decodeIp6()
case TYPE_ARP:
p.decodeArp()
}
}
type Tcphdr struct {
SrcPort uint16
DestPort uint16
Seq uint32
Ack uint32
DataOffset uint8
Flags uint16
Window uint16
Checksum uint16
Urgent uint16
Data []byte
}
const (
TCP_FIN = 1 << iota
TCP_SYN
TCP_RST
TCP_PSH
TCP_ACK
TCP_URG
TCP_ECE
TCP_CWR
TCP_NS
)
func (p *Packet) decodeTcp() {
if len(p.Payload) < 20 {
return
}
pkt := p.Payload
tcp := new(Tcphdr)
tcp.SrcPort = binary.BigEndian.Uint16(pkt[0:2])
tcp.DestPort = binary.BigEndian.Uint16(pkt[2:4])
tcp.Seq = binary.BigEndian.Uint32(pkt[4:8])
tcp.Ack = binary.BigEndian.Uint32(pkt[8:12])
tcp.DataOffset = (pkt[12] & 0xF0) >> 4
tcp.Flags = binary.BigEndian.Uint16(pkt[12:14]) & 0x1FF
tcp.Window = binary.BigEndian.Uint16(pkt[14:16])
tcp.Checksum = binary.BigEndian.Uint16(pkt[16:18])
tcp.Urgent = binary.BigEndian.Uint16(pkt[18:20])
if len(pkt) >= int(tcp.DataOffset*4) {
p.Payload = pkt[tcp.DataOffset*4:]
}
p.Headers = append(p.Headers, tcp)
p.TCP = tcp
}
func (tcp *Tcphdr) String(hdr addrHdr) string {
return fmt.Sprintf("TCP %s:%d > %s:%d %s SEQ=%d ACK=%d LEN=%d",
hdr.SrcAddr(), int(tcp.SrcPort), hdr.DestAddr(), int(tcp.DestPort),
tcp.FlagsString(), int64(tcp.Seq), int64(tcp.Ack), hdr.Len())
}
func (tcp *Tcphdr) FlagsString() string {
var sflags []string
if 0 != (tcp.Flags & TCP_SYN) {
sflags = append(sflags, "syn")
}
if 0 != (tcp.Flags & TCP_FIN) {
sflags = append(sflags, "fin")
}
if 0 != (tcp.Flags & TCP_ACK) {
sflags = append(sflags, "ack")
}
if 0 != (tcp.Flags & TCP_PSH) {
sflags = append(sflags, "psh")
}
if 0 != (tcp.Flags & TCP_RST) {
sflags = append(sflags, "rst")
}
if 0 != (tcp.Flags & TCP_URG) {
sflags = append(sflags, "urg")
}
if 0 != (tcp.Flags & TCP_NS) {
sflags = append(sflags, "ns")
}
if 0 != (tcp.Flags & TCP_CWR) {
sflags = append(sflags, "cwr")
}
if 0 != (tcp.Flags & TCP_ECE) {
sflags = append(sflags, "ece")
}
return fmt.Sprintf("[%s]", strings.Join(sflags, " "))
}
type Udphdr struct {
SrcPort uint16
DestPort uint16
Length uint16
Checksum uint16
}
func (p *Packet) decodeUdp() {
if len(p.Payload) < 8 {
return
}
pkt := p.Payload
udp := new(Udphdr)
udp.SrcPort = binary.BigEndian.Uint16(pkt[0:2])
udp.DestPort = binary.BigEndian.Uint16(pkt[2:4])
udp.Length = binary.BigEndian.Uint16(pkt[4:6])
udp.Checksum = binary.BigEndian.Uint16(pkt[6:8])
p.Headers = append(p.Headers, udp)
p.UDP = udp
if len(p.Payload) >= 8 {
p.Payload = pkt[8:]
}
}
func (udp *Udphdr) String(hdr addrHdr) string {
return fmt.Sprintf("UDP %s:%d > %s:%d LEN=%d CHKSUM=%d",
hdr.SrcAddr(), int(udp.SrcPort), hdr.DestAddr(), int(udp.DestPort),
int(udp.Length), int(udp.Checksum))
}
type Icmphdr struct {
Type uint8
Code uint8
Checksum uint16
Id uint16
Seq uint16
Data []byte
}
func (p *Packet) decodeIcmp() *Icmphdr {
if len(p.Payload) < 8 {
return nil
}
pkt := p.Payload
icmp := new(Icmphdr)
icmp.Type = pkt[0]
icmp.Code = pkt[1]
icmp.Checksum = binary.BigEndian.Uint16(pkt[2:4])
icmp.Id = binary.BigEndian.Uint16(pkt[4:6])
icmp.Seq = binary.BigEndian.Uint16(pkt[6:8])
p.Payload = pkt[8:]
p.Headers = append(p.Headers, icmp)
return icmp
}
func (icmp *Icmphdr) String(hdr addrHdr) string {
return fmt.Sprintf("ICMP %s > %s Type = %d Code = %d ",
hdr.SrcAddr(), hdr.DestAddr(), icmp.Type, icmp.Code)
}
func (icmp *Icmphdr) TypeString() (result string) {
switch icmp.Type {
case 0:
result = fmt.Sprintf("Echo reply seq=%d", icmp.Seq)
case 3:
switch icmp.Code {
case 0:
result = "Network unreachable"
case 1:
result = "Host unreachable"
case 2:
result = "Protocol unreachable"
case 3:
result = "Port unreachable"
default:
result = "Destination unreachable"
}
case 8:
result = fmt.Sprintf("Echo request seq=%d", icmp.Seq)
case 30:
result = "Traceroute"
}
return
}
type Ip6hdr struct {
// http://www.networksorcery.com/enp/protocol/ipv6.htm
Version uint8 // 4 bits
TrafficClass uint8 // 8 bits
FlowLabel uint32 // 20 bits
Length uint16 // 16 bits
NextHeader uint8 // 8 bits, same as Protocol in Iphdr
HopLimit uint8 // 8 bits
SrcIp []byte // 16 bytes
DestIp []byte // 16 bytes
}
func (p *Packet) decodeIp6() {
if len(p.Payload) < 40 {
return
}
pkt := p.Payload
ip6 := new(Ip6hdr)
ip6.Version = uint8(pkt[0]) >> 4
ip6.TrafficClass = uint8((binary.BigEndian.Uint16(pkt[0:2]) >> 4) & 0x00FF)
ip6.FlowLabel = binary.BigEndian.Uint32(pkt[0:4]) & 0x000FFFFF
ip6.Length = binary.BigEndian.Uint16(pkt[4:6])
ip6.NextHeader = pkt[6]
ip6.HopLimit = pkt[7]
ip6.SrcIp = pkt[8:24]
ip6.DestIp = pkt[24:40]
if len(p.Payload) >= 40 {
p.Payload = pkt[40:]
}
p.Headers = append(p.Headers, ip6)
switch ip6.NextHeader {
case IP_TCP:
p.decodeTcp()
case IP_UDP:
p.decodeUdp()
case IP_ICMP:
p.decodeIcmp()
case IP_INIP:
p.decodeIp()
}
}
func (ip6 *Ip6hdr) SrcAddr() string { return net.IP(ip6.SrcIp).String() }
func (ip6 *Ip6hdr) DestAddr() string { return net.IP(ip6.DestIp).String() }
func (ip6 *Ip6hdr) Len() int { return int(ip6.Length) }

@@ -0,0 +1,247 @@
package pcap
import (
"bytes"
"testing"
"time"
)
var testSimpleTcpPacket *Packet = &Packet{
Data: []byte{
0x00, 0x00, 0x0c, 0x9f, 0xf0, 0x20, 0xbc, 0x30, 0x5b, 0xe8, 0xd3, 0x49,
0x08, 0x00, 0x45, 0x00, 0x01, 0xa4, 0x39, 0xdf, 0x40, 0x00, 0x40, 0x06,
0x55, 0x5a, 0xac, 0x11, 0x51, 0x49, 0xad, 0xde, 0xfe, 0xe1, 0xc5, 0xf7,
0x00, 0x50, 0xc5, 0x7e, 0x0e, 0x48, 0x49, 0x07, 0x42, 0x32, 0x80, 0x18,
0x00, 0x73, 0xab, 0xb1, 0x00, 0x00, 0x01, 0x01, 0x08, 0x0a, 0x03, 0x77,
0x37, 0x9c, 0x42, 0x77, 0x5e, 0x3a, 0x47, 0x45, 0x54, 0x20, 0x2f, 0x20,
0x48, 0x54, 0x54, 0x50, 0x2f, 0x31, 0x2e, 0x31, 0x0d, 0x0a, 0x48, 0x6f,
0x73, 0x74, 0x3a, 0x20, 0x77, 0x77, 0x77, 0x2e, 0x66, 0x69, 0x73, 0x68,
0x2e, 0x63, 0x6f, 0x6d, 0x0d, 0x0a, 0x43, 0x6f, 0x6e, 0x6e, 0x65, 0x63,
0x74, 0x69, 0x6f, 0x6e, 0x3a, 0x20, 0x6b, 0x65, 0x65, 0x70, 0x2d, 0x61,
0x6c, 0x69, 0x76, 0x65, 0x0d, 0x0a, 0x55, 0x73, 0x65, 0x72, 0x2d, 0x41,
0x67, 0x65, 0x6e, 0x74, 0x3a, 0x20, 0x4d, 0x6f, 0x7a, 0x69, 0x6c, 0x6c,
0x61, 0x2f, 0x35, 0x2e, 0x30, 0x20, 0x28, 0x58, 0x31, 0x31, 0x3b, 0x20,
0x4c, 0x69, 0x6e, 0x75, 0x78, 0x20, 0x78, 0x38, 0x36, 0x5f, 0x36, 0x34,
0x29, 0x20, 0x41, 0x70, 0x70, 0x6c, 0x65, 0x57, 0x65, 0x62, 0x4b, 0x69,
0x74, 0x2f, 0x35, 0x33, 0x35, 0x2e, 0x32, 0x20, 0x28, 0x4b, 0x48, 0x54,
0x4d, 0x4c, 0x2c, 0x20, 0x6c, 0x69, 0x6b, 0x65, 0x20, 0x47, 0x65, 0x63,
0x6b, 0x6f, 0x29, 0x20, 0x43, 0x68, 0x72, 0x6f, 0x6d, 0x65, 0x2f, 0x31,
0x35, 0x2e, 0x30, 0x2e, 0x38, 0x37, 0x34, 0x2e, 0x31, 0x32, 0x31, 0x20,
0x53, 0x61, 0x66, 0x61, 0x72, 0x69, 0x2f, 0x35, 0x33, 0x35, 0x2e, 0x32,
0x0d, 0x0a, 0x41, 0x63, 0x63, 0x65, 0x70, 0x74, 0x3a, 0x20, 0x74, 0x65,
0x78, 0x74, 0x2f, 0x68, 0x74, 0x6d, 0x6c, 0x2c, 0x61, 0x70, 0x70, 0x6c,
0x69, 0x63, 0x61, 0x74, 0x69, 0x6f, 0x6e, 0x2f, 0x78, 0x68, 0x74, 0x6d,
0x6c, 0x2b, 0x78, 0x6d, 0x6c, 0x2c, 0x61, 0x70, 0x70, 0x6c, 0x69, 0x63,
0x61, 0x74, 0x69, 0x6f, 0x6e, 0x2f, 0x78, 0x6d, 0x6c, 0x3b, 0x71, 0x3d,
0x30, 0x2e, 0x39, 0x2c, 0x2a, 0x2f, 0x2a, 0x3b, 0x71, 0x3d, 0x30, 0x2e,
0x38, 0x0d, 0x0a, 0x41, 0x63, 0x63, 0x65, 0x70, 0x74, 0x2d, 0x45, 0x6e,
0x63, 0x6f, 0x64, 0x69, 0x6e, 0x67, 0x3a, 0x20, 0x67, 0x7a, 0x69, 0x70,
0x2c, 0x64, 0x65, 0x66, 0x6c, 0x61, 0x74, 0x65, 0x2c, 0x73, 0x64, 0x63,
0x68, 0x0d, 0x0a, 0x41, 0x63, 0x63, 0x65, 0x70, 0x74, 0x2d, 0x4c, 0x61,
0x6e, 0x67, 0x75, 0x61, 0x67, 0x65, 0x3a, 0x20, 0x65, 0x6e, 0x2d, 0x55,
0x53, 0x2c, 0x65, 0x6e, 0x3b, 0x71, 0x3d, 0x30, 0x2e, 0x38, 0x0d, 0x0a,
0x41, 0x63, 0x63, 0x65, 0x70, 0x74, 0x2d, 0x43, 0x68, 0x61, 0x72, 0x73,
0x65, 0x74, 0x3a, 0x20, 0x49, 0x53, 0x4f, 0x2d, 0x38, 0x38, 0x35, 0x39,
0x2d, 0x31, 0x2c, 0x75, 0x74, 0x66, 0x2d, 0x38, 0x3b, 0x71, 0x3d, 0x30,
0x2e, 0x37, 0x2c, 0x2a, 0x3b, 0x71, 0x3d, 0x30, 0x2e, 0x33, 0x0d, 0x0a,
0x0d, 0x0a,
}}
func BenchmarkDecodeSimpleTcpPacket(b *testing.B) {
for i := 0; i < b.N; i++ {
testSimpleTcpPacket.Decode()
}
}
func TestDecodeSimpleTcpPacket(t *testing.T) {
p := testSimpleTcpPacket
p.Decode()
if p.DestMac != 0x00000c9ff020 {
t.Error("Dest mac", p.DestMac)
}
if p.SrcMac != 0xbc305be8d349 {
t.Error("Src mac", p.SrcMac)
}
if len(p.Headers) != 2 {
t.Error("Incorrect number of headers", len(p.Headers))
return
}
if ip, ipOk := p.Headers[0].(*Iphdr); ipOk {
if ip.Version != 4 {
t.Error("ip Version", ip.Version)
}
if ip.Ihl != 5 {
t.Error("ip header length", ip.Ihl)
}
if ip.Tos != 0 {
t.Error("ip TOS", ip.Tos)
}
if ip.Length != 420 {
t.Error("ip Length", ip.Length)
}
if ip.Id != 14815 {
t.Error("ip ID", ip.Id)
}
if ip.Flags != 0x02 {
t.Error("ip Flags", ip.Flags)
}
if ip.FragOffset != 0 {
t.Error("ip Fragoffset", ip.FragOffset)
}
if ip.Ttl != 64 {
t.Error("ip TTL", ip.Ttl)
}
if ip.Protocol != 6 {
t.Error("ip Protocol", ip.Protocol)
}
if ip.Checksum != 0x555A {
t.Error("ip Checksum", ip.Checksum)
}
if !bytes.Equal(ip.SrcIp, []byte{172, 17, 81, 73}) {
t.Error("ip Src", ip.SrcIp)
}
if !bytes.Equal(ip.DestIp, []byte{173, 222, 254, 225}) {
t.Error("ip Dest", ip.DestIp)
}
if tcp, tcpOk := p.Headers[1].(*Tcphdr); tcpOk {
if tcp.SrcPort != 50679 {
t.Error("tcp srcport", tcp.SrcPort)
}
if tcp.DestPort != 80 {
t.Error("tcp destport", tcp.DestPort)
}
if tcp.Seq != 0xc57e0e48 {
t.Error("tcp seq", tcp.Seq)
}
if tcp.Ack != 0x49074232 {
t.Error("tcp ack", tcp.Ack)
}
if tcp.DataOffset != 8 {
t.Error("tcp dataoffset", tcp.DataOffset)
}
if tcp.Flags != 0x18 {
t.Error("tcp flags", tcp.Flags)
}
if tcp.Window != 0x73 {
t.Error("tcp window", tcp.Window)
}
if tcp.Checksum != 0xabb1 {
t.Error("tcp checksum", tcp.Checksum)
}
if tcp.Urgent != 0 {
t.Error("tcp urgent", tcp.Urgent)
}
} else {
t.Error("Second header is not TCP header")
}
} else {
t.Error("First header is not IP header")
}
if string(p.Payload) != "GET / HTTP/1.1\r\nHost: www.fish.com\r\nConnection: keep-alive\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Encoding: gzip,deflate,sdch\r\nAccept-Language: en-US,en;q=0.8\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3\r\n\r\n" {
t.Error("--- PAYLOAD STRING ---\n", string(p.Payload), "\n--- PAYLOAD BYTES ---\n", p.Payload)
}
}
// Makes sure the packet payload doesn't include the 6 trailing null bytes of
// this packet. They're actually the ethernet trailer.
func TestDecodeSmallTcpPacketHasEmptyPayload(t *testing.T) {
p := &Packet{
// This packet is only 54 bytes (an empty TCP RST), thus 6 trailing null
// bytes are added by the ethernet layer to make it the minimum packet size.
Data: []byte{
0xbc, 0x30, 0x5b, 0xe8, 0xd3, 0x49, 0xb8, 0xac, 0x6f, 0x92, 0xd5, 0xbf,
0x08, 0x00, 0x45, 0x00, 0x00, 0x28, 0x00, 0x00, 0x40, 0x00, 0x40, 0x06,
0x3f, 0x9f, 0xac, 0x11, 0x51, 0xc5, 0xac, 0x11, 0x51, 0x49, 0x00, 0x63,
0x9a, 0xef, 0x00, 0x00, 0x00, 0x00, 0x2e, 0xc1, 0x27, 0x83, 0x50, 0x14,
0x00, 0x00, 0xc3, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
}}
p.Decode()
if p.Payload == nil {
t.Error("Nil payload")
}
if len(p.Payload) != 0 {
t.Error("Non-empty payload:", p.Payload)
}
}
func TestDecodeVlanPacket(t *testing.T) {
p := &Packet{
Data: []byte{
0x00, 0x10, 0xdb, 0xff, 0x10, 0x00, 0x00, 0x15, 0x2c, 0x9d, 0xcc, 0x00, 0x81, 0x00, 0x01, 0xf7,
0x08, 0x00, 0x45, 0x00, 0x00, 0x28, 0x29, 0x8d, 0x40, 0x00, 0x7d, 0x06, 0x83, 0xa0, 0xac, 0x1b,
0xca, 0x8e, 0x45, 0x16, 0x94, 0xe2, 0xd4, 0x0a, 0x00, 0x50, 0xdf, 0xab, 0x9c, 0xc6, 0xcd, 0x1e,
0xe5, 0xd1, 0x50, 0x10, 0x01, 0x00, 0x5a, 0x74, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
}}
p.Decode()
if p.Type != TYPE_VLAN {
t.Error("Didn't detect vlan")
}
if len(p.Headers) != 3 {
t.Error("Incorrect number of headers:", len(p.Headers))
for i, h := range p.Headers {
t.Errorf("Header %d: %#v", i, h)
}
t.FailNow()
}
if _, ok := p.Headers[0].(*Vlanhdr); !ok {
t.Errorf("First header isn't vlan: %q", p.Headers[0])
}
if _, ok := p.Headers[1].(*Iphdr); !ok {
t.Errorf("Second header isn't IP: %q", p.Headers[1])
}
if _, ok := p.Headers[2].(*Tcphdr); !ok {
t.Errorf("Third header isn't TCP: %q", p.Headers[2])
}
}
func TestDecodeFuzzFallout(t *testing.T) {
testData := []struct {
Data []byte
}{
{[]byte("000000000000\x81\x000")},
{[]byte("000000000000\x81\x00000")},
{[]byte("000000000000\x86\xdd0")},
{[]byte("000000000000\b\x000")},
{[]byte("000000000000\b\x060")},
{[]byte{}},
{[]byte("000000000000\b\x0600000000")},
{[]byte("000000000000\x86\xdd000000\x01000000000000000000000000000000000")},
{[]byte("000000000000\x81\x0000\b\x0600000000")},
{[]byte("000000000000\b\x00n0000000000000000000")},
{[]byte("000000000000\x86\xdd000000\x0100000000000000000000000000000000000")},
{[]byte("000000000000\x81\x0000\b\x00g0000000000000000000")},
//{[]byte()},
{[]byte("000000000000\b\x00400000000\x110000000000")},
{[]byte("0nMء\xfe\x13\x13\x81\x00gr\b\x00&x\xc9\xe5b'\x1e0\x00\x04\x00\x0020596224")},
{[]byte("000000000000\x81\x0000\b\x00400000000\x110000000000")},
{[]byte("000000000000\b\x00000000000\x0600\xff0000000")},
{[]byte("000000000000\x86\xdd000000\x06000000000000000000000000000000000")},
{[]byte("000000000000\x81\x0000\b\x00000000000\x0600b0000000")},
{[]byte("000000000000\x81\x0000\b\x00400000000\x060000000000")},
{[]byte("000000000000\x86\xdd000000\x11000000000000000000000000000000000")},
{[]byte("000000000000\x86\xdd000000\x0600000000000000000000000000000000000000000000M")},
{[]byte("000000000000\b\x00500000000\x0600000000000")},
{[]byte("0nM\xd80\xfe\x13\x13\x81\x00gr\b\x00&x\xc9\xe5b'\x1e0\x00\x04\x00\x0020596224")},
}
for _, entry := range testData {
pkt := &Packet{
Time: time.Now(),
Caplen: uint32(len(entry.Data)),
Len: uint32(len(entry.Data)),
Data: entry.Data,
}
pkt.Decode()
/*
func() {
defer func() {
if err := recover(); err != nil {
t.Fatalf("%d. %q failed: %v", idx, string(entry.Data), err)
}
}()
pkt.Decode()
}()
*/
}
}
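
The commented-out block in `TestDecodeFuzzFallout` hints at a common pattern for fuzz regression tests: run the decoder inside a function that recovers from panics, so a crashing input is reported as a failure for that input rather than aborting the whole run. A minimal sketch of that pattern follows; `safeCall` and `decodeUnchecked` are hypothetical helpers written for this illustration and are not part of the vendored package.

```go
package main

import "fmt"

// safeCall runs fn and converts a panic into an error.
func safeCall(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	fn()
	return nil
}

// decodeUnchecked is a hypothetical decoder that reads a 2-byte field
// without checking the input length first.
func decodeUnchecked(data []byte) uint16 {
	return uint16(data[0])<<8 | uint16(data[1])
}

func main() {
	inputs := [][]byte{{0x08, 0x00}, {0x08}, {}}
	for i, in := range inputs {
		err := safeCall(func() { decodeUnchecked(in) })
		fmt.Printf("%d. %q: %v\n", i, in, err)
	}
}
```

In a real test the loop body would call `t.Errorf` with the index and the input, which gives the information the commented-out code was trying to print.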

View File

@@ -0,0 +1,206 @@
package pcap
import (
"encoding/binary"
"fmt"
"io"
"time"
)
// FileHeader is the parsed header of a pcap file.
// http://wiki.wireshark.org/Development/LibpcapFileFormat
type FileHeader struct {
MagicNumber uint32
VersionMajor uint16
VersionMinor uint16
TimeZone int32
SigFigs uint32
SnapLen uint32
Network uint32
}
type PacketTime struct {
Sec int32
Usec int32
}
// Convert the PacketTime to a go Time struct.
func (p *PacketTime) Time() time.Time {
return time.Unix(int64(p.Sec), int64(p.Usec)*1000)
}
// Packet is a single packet parsed from a pcap file.
//
// Convenient access to IP, TCP, and UDP headers is provided after Decode()
// is called if the packet is of the appropriate type.
type Packet struct {
Time time.Time // packet send/receive time
Caplen uint32 // bytes stored in the file (caplen <= len)
Len uint32 // bytes sent/received
Data []byte // packet data
Type int // protocol type, see LINKTYPE_*
DestMac uint64
SrcMac uint64
Headers []interface{} // decoded headers, in order
Payload []byte // remaining non-header bytes
IP *Iphdr // IP header (for IP packets, after decoding)
TCP *Tcphdr // TCP header (for TCP packets, after decoding)
UDP *Udphdr // UDP header (for UDP packets after decoding)
}
// Reader parses pcap files.
type Reader struct {
flip bool
buf io.Reader
err error
fourBytes []byte
twoBytes []byte
sixteenBytes []byte
Header FileHeader
}
// NewReader reads pcap data from an io.Reader.
func NewReader(reader io.Reader) (*Reader, error) {
r := &Reader{
buf: reader,
fourBytes: make([]byte, 4),
twoBytes: make([]byte, 2),
sixteenBytes: make([]byte, 16),
}
switch magic := r.readUint32(); magic {
case 0xa1b2c3d4:
r.flip = false
case 0xd4c3b2a1:
r.flip = true
default:
return nil, fmt.Errorf("pcap: bad magic number: %0x", magic)
}
r.Header = FileHeader{
MagicNumber: 0xa1b2c3d4,
VersionMajor: r.readUint16(),
VersionMinor: r.readUint16(),
TimeZone: r.readInt32(),
SigFigs: r.readUint32(),
SnapLen: r.readUint32(),
Network: r.readUint32(),
}
return r, nil
}
// Next returns the next packet or nil if no more packets can be read.
func (r *Reader) Next() *Packet {
d := r.sixteenBytes
r.err = r.read(d)
if r.err != nil {
return nil
}
timeSec := asUint32(d[0:4], r.flip)
timeUsec := asUint32(d[4:8], r.flip)
capLen := asUint32(d[8:12], r.flip)
origLen := asUint32(d[12:16], r.flip)
data := make([]byte, capLen)
if r.err = r.read(data); r.err != nil {
return nil
}
return &Packet{
Time: time.Unix(int64(timeSec), int64(timeUsec)),
Caplen: capLen,
Len: origLen,
Data: data,
}
}
func (r *Reader) read(data []byte) error {
var err error
n, err := r.buf.Read(data)
for err == nil && n != len(data) {
var chunk int
chunk, err = r.buf.Read(data[n:])
n += chunk
}
if len(data) == n {
return nil
}
return err
}
func (r *Reader) readUint32() uint32 {
data := r.fourBytes
if r.err = r.read(data); r.err != nil {
return 0
}
return asUint32(data, r.flip)
}
func (r *Reader) readInt32() int32 {
data := r.fourBytes
if r.err = r.read(data); r.err != nil {
return 0
}
return int32(asUint32(data, r.flip))
}
func (r *Reader) readUint16() uint16 {
data := r.twoBytes
if r.err = r.read(data); r.err != nil {
return 0
}
return asUint16(data, r.flip)
}
// Writer writes a pcap file.
type Writer struct {
writer io.Writer
buf []byte
}
// NewWriter creates a Writer that stores output in an io.Writer.
// The FileHeader is written immediately.
func NewWriter(writer io.Writer, header *FileHeader) (*Writer, error) {
w := &Writer{
writer: writer,
buf: make([]byte, 24),
}
binary.LittleEndian.PutUint32(w.buf, header.MagicNumber)
binary.LittleEndian.PutUint16(w.buf[4:], header.VersionMajor)
binary.LittleEndian.PutUint16(w.buf[6:], header.VersionMinor)
binary.LittleEndian.PutUint32(w.buf[8:], uint32(header.TimeZone))
binary.LittleEndian.PutUint32(w.buf[12:], header.SigFigs)
binary.LittleEndian.PutUint32(w.buf[16:], header.SnapLen)
binary.LittleEndian.PutUint32(w.buf[20:], header.Network)
if _, err := writer.Write(w.buf); err != nil {
return nil, err
}
return w, nil
}
// Write writes a packet to the underlying writer.
func (w *Writer) Write(pkt *Packet) error {
binary.LittleEndian.PutUint32(w.buf, uint32(pkt.Time.Unix()))
binary.LittleEndian.PutUint32(w.buf[4:], uint32(pkt.Time.Nanosecond()))
binary.LittleEndian.PutUint32(w.buf[8:], uint32(len(pkt.Data))) // captured length, not the timestamp
binary.LittleEndian.PutUint32(w.buf[12:], pkt.Len)
if _, err := w.writer.Write(w.buf[:16]); err != nil {
return err
}
_, err := w.writer.Write(pkt.Data)
return err
}
func asUint32(data []byte, flip bool) uint32 {
if flip {
return binary.BigEndian.Uint32(data)
}
return binary.LittleEndian.Uint32(data)
}
func asUint16(data []byte, flip bool) uint16 {
if flip {
return binary.BigEndian.Uint16(data)
}
return binary.LittleEndian.Uint16(data)
}
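
Each packet in a pcap file is preceded by a 16-byte record header: seconds, microseconds, captured length, and original length. The sketch below shows one way to write and read such a record with `io.ReadFull` in place of a hand-rolled read loop. It assumes little-endian byte order only, and the `record` type and the `writeRecord`, `readRecord`, `encodeRecord`, `decodeRecord`, and `sampleRecord` helpers are hypothetical names for illustration, not the vendored `Reader`/`Writer` API.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"time"
)

// record is a hypothetical, simplified stand-in for Packet.
type record struct {
	Time    time.Time
	OrigLen uint32
	Data    []byte
}

// writeRecord emits the 16-byte little-endian record header, then the data.
func writeRecord(w io.Writer, rec record) error {
	var hdr [16]byte
	binary.LittleEndian.PutUint32(hdr[0:], uint32(rec.Time.Unix()))
	binary.LittleEndian.PutUint32(hdr[4:], uint32(rec.Time.Nanosecond()/1000)) // microseconds
	binary.LittleEndian.PutUint32(hdr[8:], uint32(len(rec.Data)))              // captured length
	binary.LittleEndian.PutUint32(hdr[12:], rec.OrigLen)                       // length on the wire
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(rec.Data)
	return err
}

// readRecord reads one record. io.ReadFull keeps reading until the buffer
// is full and returns an error if the stream ends early.
func readRecord(r io.Reader) (record, error) {
	var hdr [16]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return record{}, err
	}
	sec := binary.LittleEndian.Uint32(hdr[0:4])
	usec := binary.LittleEndian.Uint32(hdr[4:8])
	data := make([]byte, binary.LittleEndian.Uint32(hdr[8:12]))
	if _, err := io.ReadFull(r, data); err != nil {
		return record{}, err
	}
	return record{
		Time:    time.Unix(int64(sec), int64(usec)*1000), // microseconds to nanoseconds
		OrigLen: binary.LittleEndian.Uint32(hdr[12:16]),
		Data:    data,
	}, nil
}

// encodeRecord and decodeRecord wrap the functions above for in-memory use.
func encodeRecord(rec record) ([]byte, error) {
	var buf bytes.Buffer
	err := writeRecord(&buf, rec)
	return buf.Bytes(), err
}

func decodeRecord(raw []byte) (record, error) {
	return readRecord(bytes.NewReader(raw))
}

// sampleRecord returns made-up data for the demonstration.
func sampleRecord() record {
	return record{
		Time:    time.Unix(1446829719, 123456000),
		OrigLen: 60,
		Data:    []byte{0xde, 0xad, 0xbe, 0xef},
	}
}

func main() {
	raw, err := encodeRecord(sampleRecord())
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	rec, err := decodeRecord(raw)
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Println(len(raw), rec.Time.UTC(), rec.OrigLen, len(rec.Data))
}
```

Deriving the captured length from `len(rec.Data)` means the header can never disagree with the number of bytes that follow it.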

View File

@@ -0,0 +1,266 @@
// Interface to both live and offline pcap parsing.
package pcap
/*
#cgo linux LDFLAGS: -lpcap
#cgo freebsd LDFLAGS: -lpcap
#cgo darwin LDFLAGS: -lpcap
#cgo windows CFLAGS: -I C:/WpdPack/Include
#cgo windows,386 LDFLAGS: -L C:/WpdPack/Lib -lwpcap
#cgo windows,amd64 LDFLAGS: -L C:/WpdPack/Lib/x64 -lwpcap
#include <stdlib.h>
#include <pcap.h>
// Workaround for not knowing how to cast to const u_char**
int hack_pcap_next_ex(pcap_t *p, struct pcap_pkthdr **pkt_header,
u_char **pkt_data) {
return pcap_next_ex(p, pkt_header, (const u_char **)pkt_data);
}
*/
import "C"
import (
"errors"
"net"
"syscall"
"time"
"unsafe"
)
type Pcap struct {
cptr *C.pcap_t
}
type Stat struct {
PacketsReceived uint32
PacketsDropped uint32
PacketsIfDropped uint32
}
type Interface struct {
Name string
Description string
Addresses []IFAddress
// TODO: add more elements
}
type IFAddress struct {
IP net.IP
Netmask net.IPMask
// TODO: add broadcast + PtP dst ?
}
func (p *Pcap) Next() (pkt *Packet) {
rv, _ := p.NextEx()
return rv
}
// Openlive opens a device for live capture and returns a *Pcap handle.
func Openlive(device string, snaplen int32, promisc bool, timeout_ms int32) (handle *Pcap, err error) {
var buf *C.char
buf = (*C.char)(C.calloc(ERRBUF_SIZE, 1))
h := new(Pcap)
var pro int32
if promisc {
pro = 1
}
dev := C.CString(device)
defer C.free(unsafe.Pointer(dev))
h.cptr = C.pcap_open_live(dev, C.int(snaplen), C.int(pro), C.int(timeout_ms), buf)
if nil == h.cptr {
handle = nil
err = errors.New(C.GoString(buf))
} else {
handle = h
}
C.free(unsafe.Pointer(buf))
return
}
func Openoffline(file string) (handle *Pcap, err error) {
var buf *C.char
buf = (*C.char)(C.calloc(ERRBUF_SIZE, 1))
h := new(Pcap)
cf := C.CString(file)
defer C.free(unsafe.Pointer(cf))
h.cptr = C.pcap_open_offline(cf, buf)
if nil == h.cptr {
handle = nil
err = errors.New(C.GoString(buf))
} else {
handle = h
}
C.free(unsafe.Pointer(buf))
return
}
func (p *Pcap) NextEx() (pkt *Packet, result int32) {
var pkthdr *C.struct_pcap_pkthdr
var buf_ptr *C.u_char
var buf unsafe.Pointer
result = int32(C.hack_pcap_next_ex(p.cptr, &pkthdr, &buf_ptr))
buf = unsafe.Pointer(buf_ptr)
if nil == buf {
return
}
pkt = new(Packet)
pkt.Time = time.Unix(int64(pkthdr.ts.tv_sec), int64(pkthdr.ts.tv_usec)*1000)
pkt.Caplen = uint32(pkthdr.caplen)
pkt.Len = uint32(pkthdr.len)
pkt.Data = C.GoBytes(buf, C.int(pkthdr.caplen))
return
}
func (p *Pcap) Close() {
C.pcap_close(p.cptr)
}
func (p *Pcap) Geterror() error {
return errors.New(C.GoString(C.pcap_geterr(p.cptr)))
}
func (p *Pcap) Getstats() (stat *Stat, err error) {
var cstats _Ctype_struct_pcap_stat
if -1 == C.pcap_stats(p.cptr, &cstats) {
return nil, p.Geterror()
}
stats := new(Stat)
stats.PacketsReceived = uint32(cstats.ps_recv)
stats.PacketsDropped = uint32(cstats.ps_drop)
stats.PacketsIfDropped = uint32(cstats.ps_ifdrop)
return stats, nil
}
func (p *Pcap) Setfilter(expr string) (err error) {
var bpf _Ctype_struct_bpf_program
cexpr := C.CString(expr)
defer C.free(unsafe.Pointer(cexpr))
if -1 == C.pcap_compile(p.cptr, &bpf, cexpr, 1, 0) {
return p.Geterror()
}
if -1 == C.pcap_setfilter(p.cptr, &bpf) {
C.pcap_freecode(&bpf)
return p.Geterror()
}
C.pcap_freecode(&bpf)
return nil
}
func Version() string {
return C.GoString(C.pcap_lib_version())
}
func (p *Pcap) Datalink() int {
return int(C.pcap_datalink(p.cptr))
}
func (p *Pcap) Setdatalink(dlt int) error {
if -1 == C.pcap_set_datalink(p.cptr, C.int(dlt)) {
return p.Geterror()
}
return nil
}
func DatalinkValueToName(dlt int) string {
if name := C.pcap_datalink_val_to_name(C.int(dlt)); name != nil {
return C.GoString(name)
}
return ""
}
func DatalinkValueToDescription(dlt int) string {
if desc := C.pcap_datalink_val_to_description(C.int(dlt)); desc != nil {
return C.GoString(desc)
}
return ""
}
func Findalldevs() (ifs []Interface, err error) {
var buf *C.char
buf = (*C.char)(C.calloc(ERRBUF_SIZE, 1))
defer C.free(unsafe.Pointer(buf))
var alldevsp *C.pcap_if_t
if -1 == C.pcap_findalldevs((**C.pcap_if_t)(&alldevsp), buf) {
return nil, errors.New(C.GoString(buf))
}
defer C.pcap_freealldevs((*C.pcap_if_t)(alldevsp))
dev := alldevsp
var i uint32
for i = 0; dev != nil; dev = (*C.pcap_if_t)(dev.next) {
i++
}
ifs = make([]Interface, i)
dev = alldevsp
for j := uint32(0); dev != nil; dev = (*C.pcap_if_t)(dev.next) {
var iface Interface
iface.Name = C.GoString(dev.name)
iface.Description = C.GoString(dev.description)
iface.Addresses = findalladdresses(dev.addresses)
// TODO: add more elements
ifs[j] = iface
j++
}
return
}
func findalladdresses(addresses *_Ctype_struct_pcap_addr) (retval []IFAddress) {
// TODO - make it support more than IPv4 and IPv6?
retval = make([]IFAddress, 0, 1)
for curaddr := addresses; curaddr != nil; curaddr = (*_Ctype_struct_pcap_addr)(curaddr.next) {
var a IFAddress
var err error
if a.IP, err = sockaddr_to_IP((*syscall.RawSockaddr)(unsafe.Pointer(curaddr.addr))); err != nil {
continue
}
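// Some addresses carry no netmask; skip them rather than dereference nil.
if curaddr.netmask == nil {
continue
}
if a.Netmask, err = sockaddr_to_IP((*syscall.RawSockaddr)(unsafe.Pointer(curaddr.netmask))); err != nil {
continue
}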
retval = append(retval, a)
}
return
}
func sockaddr_to_IP(rsa *syscall.RawSockaddr) (IP []byte, err error) {
switch rsa.Family {
case syscall.AF_INET:
pp := (*syscall.RawSockaddrInet4)(unsafe.Pointer(rsa))
IP = make([]byte, 4)
for i := 0; i < len(IP); i++ {
IP[i] = pp.Addr[i]
}
return
case syscall.AF_INET6:
pp := (*syscall.RawSockaddrInet6)(unsafe.Pointer(rsa))
IP = make([]byte, 16)
for i := 0; i < len(IP); i++ {
IP[i] = pp.Addr[i]
}
return
}
err = errors.New("Unsupported address type")
return
}
func (p *Pcap) Inject(data []byte) (err error) {
buf := (*C.char)(C.malloc((C.size_t)(len(data))))
for i := 0; i < len(data); i++ {
*(*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(buf)) + uintptr(i))) = data[i]
}
if -1 == C.pcap_sendpacket(p.cptr, (*C.u_char)(unsafe.Pointer(buf)), (C.int)(len(data))) {
err = p.Geterror()
}
C.free(unsafe.Pointer(buf))
return
}
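
`IFAddress` stores the address and netmask as `net.IP` and `net.IPMask`, so the standard library can turn them into the familiar network/prefix form. The sketch below is only an illustration of that idea; `describe` is a hypothetical helper that takes raw byte slices, as `sockaddr_to_IP` produces them, and it is not part of the vendored package.

```go
package main

import (
	"fmt"
	"net"
)

// describe renders an address/netmask pair as "network/prefix".
// It returns "unsupported" for non-contiguous masks or mismatched lengths.
func describe(ip, mask []byte) string {
	ones, bits := net.IPMask(mask).Size()
	if bits == 0 || len(ip) != len(mask) {
		return "unsupported"
	}
	return fmt.Sprintf("%s/%d", net.IP(ip).Mask(net.IPMask(mask)), ones)
}

func main() {
	fmt.Println(describe([]byte{192, 168, 1, 42}, []byte{255, 255, 255, 0}))
	fmt.Println(describe([]byte{10, 0, 0, 1}, []byte{255, 0, 255, 0}))
}
```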

View File

@@ -0,0 +1,49 @@
package main
import (
"flag"
"fmt"
"os"
"runtime/pprof"
"time"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/akrennmair/gopcap"
)
func main() {
var filename *string = flag.String("file", "", "filename")
var decode *bool = flag.Bool("d", false, "If true, decode each packet")
var cpuprofile *string = flag.String("cpuprofile", "", "filename")
flag.Parse()
h, err := pcap.Openoffline(*filename)
if err != nil {
fmt.Printf("Couldn't create pcap reader: %v\n", err)
return
}
if *cpuprofile != "" {
if out, err := os.Create(*cpuprofile); err == nil {
pprof.StartCPUProfile(out)
defer func() {
pprof.StopCPUProfile()
out.Close()
}()
} else {
panic(err)
}
}
i, nilPackets := 0, 0
start := time.Now()
for pkt, code := h.NextEx(); code != -2; pkt, code = h.NextEx() {
if pkt == nil {
nilPackets++
} else if *decode {
pkt.Decode()
}
i++
}
duration := time.Since(start)
fmt.Printf("Took %v to process %v packets, %v per packet, %d nil packets\n", duration, i, duration/time.Duration(i), nilPackets)
}
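
The benchmark loop above stops only when `NextEx` returns -2. libpcap documents four outcomes for `pcap_next_ex`: 1 for a packet, 0 for a live-capture timeout, -1 for an error, and -2 when a savefile is exhausted. The sketch below shows a loop that treats each case separately. The `packetSource` interface and `fakeSource` type are hypothetical and exist only to make the example runnable without libpcap; the real `(*Pcap).NextEx` returns a `*Packet` rather than a byte slice.

```go
package main

import "fmt"

// Result codes as documented for pcap_next_ex.
const (
	resultOK      = 1  // a packet was read
	resultTimeout = 0  // live capture: read timeout expired
	resultError   = -1 // an error occurred while reading
	resultEOF     = -2 // savefile: no more packets
)

// packetSource is a hypothetical interface with the shape of NextEx.
type packetSource interface {
	NextEx() (data []byte, result int32)
}

// fakeSource replays a fixed list of result codes, then reports EOF.
type fakeSource struct {
	results []int32
	pos     int
}

func (f *fakeSource) NextEx() ([]byte, int32) {
	if f.pos >= len(f.results) {
		return nil, resultEOF
	}
	r := f.results[f.pos]
	f.pos++
	if r == resultOK {
		return []byte{0x00}, r
	}
	return nil, r
}

// drain reads until EOF or error, counting packets and timeouts.
func drain(src packetSource) (packets, timeouts int, err error) {
	for {
		data, result := src.NextEx()
		switch result {
		case resultOK:
			if data != nil {
				packets++
			}
		case resultTimeout:
			timeouts++
		case resultEOF:
			return packets, timeouts, nil
		default:
			return packets, timeouts, fmt.Errorf("capture failed with code %d", result)
		}
	}
}

func main() {
	src := &fakeSource{results: []int32{resultOK, resultTimeout, resultOK}}
	fmt.Println(drain(src))
}
```

Returning on any unexpected code keeps the loop from spinning if the source keeps reporting the same error.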

View File

@@ -0,0 +1,96 @@
package main
// Parses a pcap file, writes it back to disk, then verifies the files
// are the same.
import (
"bufio"
"flag"
"fmt"
"io"
"os"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/akrennmair/gopcap"
)
var input *string = flag.String("input", "", "input file")
var output *string = flag.String("output", "", "output file")
var decode *bool = flag.Bool("decode", false, "print decoded packets")
func copyPcap(dest, src string) {
f, err := os.Open(src)
if err != nil {
fmt.Printf("couldn't open %q: %v\n", src, err)
return
}
defer f.Close()
reader, err := pcap.NewReader(bufio.NewReader(f))
if err != nil {
fmt.Printf("couldn't create reader: %v\n", err)
return
}
w, err := os.Create(dest)
if err != nil {
fmt.Printf("couldn't open %q: %v\n", dest, err)
return
}
defer w.Close()
buf := bufio.NewWriter(w)
writer, err := pcap.NewWriter(buf, &reader.Header)
if err != nil {
fmt.Printf("couldn't create writer: %v\n", err)
return
}
for {
pkt := reader.Next()
if pkt == nil {
break
}
if *decode {
pkt.Decode()
fmt.Println(pkt.String())
}
writer.Write(pkt)
}
buf.Flush()
}
func check(dest, src string) {
f, err := os.Open(src)
if err != nil {
fmt.Printf("couldn't open %q: %v\n", src, err)
return
}
defer f.Close()
freader := bufio.NewReader(f)
g, err := os.Open(dest)
if err != nil {
fmt.Printf("couldn't open %q: %v\n", dest, err)
return
}
defer g.Close()
greader := bufio.NewReader(g)
for {
fb, ferr := freader.ReadByte()
gb, gerr := greader.ReadByte()
if ferr == io.EOF && gerr == io.EOF {
break
}
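// Bytes only count as equal when both reads succeeded; if just one file
// has reached EOF the files differ in length.
if ferr == nil && gerr == nil && fb == gb {
continue
}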
fmt.Println("FAIL")
return
}
fmt.Println("PASS")
}
func main() {
flag.Parse()
copyPcap(*output, *input)
check(*output, *input)
}
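
The `check` function above compares two files byte by byte. A minimal sketch of the same comparison over arbitrary `io.Reader` values follows, with the case where one stream ends before the other handled explicitly. `sameContent` and `sameBytes` are hypothetical helper names used only for this illustration.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// sameContent reports whether two streams yield identical bytes.
func sameContent(a, b io.Reader) (bool, error) {
	ra, rb := bufio.NewReader(a), bufio.NewReader(b)
	for {
		ba, errA := ra.ReadByte()
		bb, errB := rb.ReadByte()
		if errA == io.EOF && errB == io.EOF {
			return true, nil // both ended at the same offset
		}
		if errA == io.EOF || errB == io.EOF {
			return false, nil // one stream is shorter than the other
		}
		if errA != nil {
			return false, errA
		}
		if errB != nil {
			return false, errB
		}
		if ba != bb {
			return false, nil
		}
	}
}

// sameBytes wraps sameContent for in-memory data.
func sameBytes(a, b []byte) bool {
	ok, err := sameContent(bytes.NewReader(a), bytes.NewReader(b))
	return err == nil && ok
}

func main() {
	fmt.Println(sameBytes([]byte("abc"), []byte("abc")))
	fmt.Println(sameBytes([]byte("abc"), []byte("abc\x00")))
}
```

The second call differs only by a trailing zero byte, which is the case a comparison based on byte values alone can miss.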

View File

@@ -0,0 +1,82 @@
package main
import (
"flag"
"fmt"
"time"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/akrennmair/gopcap"
)
func min(x uint32, y uint32) uint32 {
if x < y {
return x
}
return y
}
func main() {
var device *string = flag.String("d", "", "device")
var file *string = flag.String("r", "", "file")
var expr *string = flag.String("e", "", "filter expression")
flag.Parse()
var h *pcap.Pcap
var err error
ifs, err := pcap.Findalldevs()
if len(ifs) == 0 {
fmt.Printf("Warning: no devices found : %s\n", err)
} else {
for i := 0; i < len(ifs); i++ {
fmt.Printf("dev %d: %s (%s)\n", i+1, ifs[i].Name, ifs[i].Description)
}
}
if *device != "" {
h, err = pcap.Openlive(*device, 65535, true, 0)
if h == nil {
fmt.Printf("Openlive(%s) failed: %s\n", *device, err)
return
}
} else if *file != "" {
h, err = pcap.Openoffline(*file)
if h == nil {
fmt.Printf("Openoffline(%s) failed: %s\n", *file, err)
return
}
} else {
fmt.Printf("usage: pcaptest [-d <device> | -r <file>]\n")
return
}
defer h.Close()
fmt.Printf("pcap version: %s\n", pcap.Version())
if *expr != "" {
fmt.Printf("Setting filter: %s\n", *expr)
err := h.Setfilter(*expr)
if err != nil {
fmt.Printf("Warning: setting filter failed: %s\n", err)
}
}
for pkt := h.Next(); pkt != nil; pkt = h.Next() {
fmt.Printf("time: %d.%06d (%s) caplen: %d len: %d\nData:",
pkt.Time.Unix(), int64(pkt.Time.Nanosecond()/1000),
time.Unix(pkt.Time.Unix(), 0).String(), int64(pkt.Caplen), int64(pkt.Len))
for i := uint32(0); i < pkt.Caplen; i++ {
if i%32 == 0 {
fmt.Printf("\n")
}
if 32 <= pkt.Data[i] && pkt.Data[i] <= 126 {
fmt.Printf("%c", pkt.Data[i])
} else {
fmt.Printf(".")
}
}
fmt.Printf("\n\n")
}
}

View File

@@ -0,0 +1,121 @@
package main
import (
"bufio"
"flag"
"fmt"
"os"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/akrennmair/gopcap"
)
const (
TYPE_IP = 0x0800
TYPE_ARP = 0x0806
TYPE_IP6 = 0x86DD
IP_ICMP = 1
IP_INIP = 4
IP_TCP = 6
IP_UDP = 17
)
var out *bufio.Writer
var errout *bufio.Writer
func main() {
var device *string = flag.String("i", "", "interface")
var snaplen *int = flag.Int("s", 65535, "snaplen")
var hexdump *bool = flag.Bool("X", false, "hexdump")
expr := ""
out = bufio.NewWriter(os.Stdout)
errout = bufio.NewWriter(os.Stderr)
flag.Usage = func() {
fmt.Fprintf(errout, "usage: %s [ -i interface ] [ -s snaplen ] [ -X ] [ expression ]\n", os.Args[0])
errout.Flush()
os.Exit(1)
}
flag.Parse()
if len(flag.Args()) > 0 {
expr = flag.Arg(0)
}
if *device == "" {
devs, err := pcap.Findalldevs()
if err != nil {
fmt.Fprintf(errout, "tcpdump: couldn't find any devices: %s\n", err)
}
if 0 == len(devs) {
flag.Usage()
}
*device = devs[0].Name
}
h, err := pcap.Openlive(*device, int32(*snaplen), true, 0)
if h == nil {
fmt.Fprintf(errout, "tcpdump: %s\n", err)
errout.Flush()
return
}
defer h.Close()
if expr != "" {
ferr := h.Setfilter(expr)
if ferr != nil {
fmt.Fprintf(out, "tcpdump: %s\n", ferr)
out.Flush()
}
}
for pkt := h.Next(); pkt != nil; pkt = h.Next() {
pkt.Decode()
fmt.Fprintf(out, "%s\n", pkt.String())
if *hexdump {
Hexdump(pkt)
}
out.Flush()
}
}
func min(a, b int) int {
if a < b {
return a
}
return b
}
func Hexdump(pkt *pcap.Packet) {
for i := 0; i < len(pkt.Data); i += 16 {
Dumpline(uint32(i), pkt.Data[i:min(i+16, len(pkt.Data))])
}
}
func Dumpline(addr uint32, line []byte) {
fmt.Fprintf(out, "\t0x%04x: ", int32(addr))
var i uint16
for i = 0; i < 16 && i < uint16(len(line)); i++ {
if i%2 == 0 {
out.WriteString(" ")
}
fmt.Fprintf(out, "%02x", line[i])
}
for j := i; j <= 16; j++ {
if j%2 == 0 {
out.WriteString(" ")
}
out.WriteString(" ")
}
out.WriteString(" ")
for i = 0; i < 16 && i < uint16(len(line)); i++ {
if line[i] >= 32 && line[i] <= 126 {
fmt.Fprintf(out, "%c", line[i])
} else {
out.WriteString(".")
}
}
out.WriteString("\n")
}
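
`Hexdump` and `Dumpline` print 16 bytes per row: an offset, the bytes in two-byte hex groups, then the printable ASCII. The sketch below builds the same kind of row as a string, which makes the layout easy to test. It is an illustration under assumptions, not the tool's code: `dumpLine` and `hexdump` are hypothetical names and the exact column spacing is chosen for this example. The standard library's `encoding/hex` package also provides `hex.Dump` for a `hexdump -C` style layout.

```go
package main

import (
	"fmt"
	"strings"
)

// dumpLine formats up to 16 bytes as "offset: hex groups  ascii".
// Short rows are padded so the ASCII column stays aligned.
func dumpLine(addr int, line []byte) string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "0x%04x: ", addr)
	for i := 0; i < 16; i++ {
		if i%2 == 0 {
			sb.WriteByte(' ') // separator before each two-byte group
		}
		if i < len(line) {
			fmt.Fprintf(&sb, "%02x", line[i])
		} else {
			sb.WriteString("  ") // padding for a missing byte
		}
	}
	sb.WriteString("  ")
	for _, b := range line {
		if b >= 32 && b <= 126 {
			sb.WriteByte(b)
		} else {
			sb.WriteByte('.')
		}
	}
	return sb.String()
}

// hexdump splits data into 16-byte rows.
func hexdump(data []byte) []string {
	var lines []string
	for off := 0; off < len(data); off += 16 {
		end := off + 16
		if end > len(data) {
			end = len(data)
		}
		lines = append(lines, dumpLine(off, data[off:end]))
	}
	return lines
}

func main() {
	for _, l := range hexdump([]byte("GET / HTTP/1.1\r\nHost: example\r\n")) {
		fmt.Println(l)
	}
}
```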

View File

@@ -0,0 +1,4 @@
language: go
go:
- 1.4.2
sudo: false

View File

@@ -57,6 +57,12 @@ bar.ShowTimeLeft = true
// show average speed
bar.ShowSpeed = true
// sets the width of the progress bar
bar.SetWidth(80)
// sets the maximum width of the progress bar; ignored if it is wider than the terminal
bar.SetMaxWidth(80)
// convert output to readable format (like KB, MB)
bar.SetUnits(pb.U_BYTES)

View File

@@ -1,14 +1,14 @@
package main
import (
"github.com/cheggaaa/pb"
"os"
"fmt"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/cheggaaa/pb"
"io"
"time"
"strings"
"net/http"
"os"
"strconv"
"strings"
"time"
)
func main() {
@@ -18,12 +18,12 @@ func main() {
return
}
sourceName, destName := os.Args[1], os.Args[2]
// check source
var source io.Reader
var sourceSize int64
if strings.HasPrefix(sourceName, "http://") {
// open as url
// open as url
resp, err := http.Get(sourceName)
if err != nil {
fmt.Printf("Can't get %s: %v\n", sourceName, err)
@@ -54,9 +54,7 @@ func main() {
sourceSize = sourceStat.Size()
source = s
}
// create dest
dest, err := os.Create(destName)
if err != nil {
@@ -64,15 +62,15 @@ func main() {
return
}
defer dest.Close()
// create bar
// create bar
bar := pb.New(int(sourceSize)).SetUnits(pb.U_BYTES).SetRefreshRate(time.Millisecond * 10)
bar.ShowSpeed = true
bar.Start()
// create multi writer
writer := io.MultiWriter(dest, bar)
// and copy
io.Copy(writer, source)
bar.Finish()

View File

@@ -1,7 +1,7 @@
package main
import (
"github.com/cheggaaa/pb"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/cheggaaa/pb"
"time"
)
@@ -13,7 +13,7 @@ func main() {
bar.ShowPercent = true
// show bar (by default already true)
bar.ShowPercent = true
bar.ShowBar = true
// no need counters
bar.ShowCounters = true

45
Godeps/_workspace/src/github.com/cheggaaa/pb/format.go generated vendored Normal file
View File

@@ -0,0 +1,45 @@
package pb
import (
"fmt"
"strconv"
"strings"
)
type Units int
const (
// By default, without type handle
U_NO Units = iota
// Handle as b, Kb, Mb, etc
U_BYTES
)
// Format integer
func Format(i int64, units Units) string {
switch units {
case U_BYTES:
return FormatBytes(i)
default:
// by default just convert to string
return strconv.FormatInt(i, 10)
}
}
// Convert bytes to human readable string. Like a 2 MB, 64.2 KB, 52 B
func FormatBytes(i int64) (result string) {
switch {
case i > (1024 * 1024 * 1024 * 1024):
result = fmt.Sprintf("%.02f TB", float64(i)/1024/1024/1024/1024)
case i > (1024 * 1024 * 1024):
result = fmt.Sprintf("%.02f GB", float64(i)/1024/1024/1024)
case i > (1024 * 1024):
result = fmt.Sprintf("%.02f MB", float64(i)/1024/1024)
case i > 1024:
result = fmt.Sprintf("%.02f KB", float64(i)/1024)
default:
result = fmt.Sprintf("%d B", i)
}
result = strings.Trim(result, " ")
return
}
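
`FormatBytes` picks a unit by comparing against successive powers of 1024. The same logic can be written as a loop over a small table, which is sketched below. This is an illustrative variation rather than the package's code: `formatBytes` and `byteUnits` are hypothetical names, and the sketch deliberately keeps the strict `>` comparison used above, so exactly 1024 still prints as bytes.

```go
package main

import "fmt"

// byteUnits lists the units from largest to smallest.
var byteUnits = []struct {
	name string
	size float64
}{
	{"TB", 1 << 40},
	{"GB", 1 << 30},
	{"MB", 1 << 20},
	{"KB", 1 << 10},
}

// formatBytes renders n using the largest unit that n strictly exceeds.
func formatBytes(n int64) string {
	for _, u := range byteUnits {
		if float64(n) > u.size {
			return fmt.Sprintf("%.2f %s", float64(n)/u.size, u.name)
		}
	}
	return fmt.Sprintf("%d B", n)
}

func main() {
	for _, n := range []int64{52, 1024, 1536, 5 << 20} {
		fmt.Println(formatBytes(n))
	}
}
```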

View File

@@ -0,0 +1,37 @@
package pb
import (
"fmt"
"strconv"
"testing"
)
func Test_DefaultsToInteger(t *testing.T) {
value := int64(1000)
expected := strconv.Itoa(int(value))
actual := Format(value, -1)
if actual != expected {
t.Error(fmt.Sprintf("Expected {%s} was {%s}", expected, actual))
}
}
func Test_CanFormatAsInteger(t *testing.T) {
value := int64(1000)
expected := strconv.Itoa(int(value))
actual := Format(value, U_NO)
if actual != expected {
t.Error(fmt.Sprintf("Expected {%s} was {%s}", expected, actual))
}
}
func Test_CanFormatAsBytes(t *testing.T) {
value := int64(1000)
expected := "1000 B"
actual := Format(value, U_BYTES)
if actual != expected {
t.Error(fmt.Sprintf("Expected {%s} was {%s}", expected, actual))
}
}

367
Godeps/_workspace/src/github.com/cheggaaa/pb/pb.go generated vendored Normal file
View File

@@ -0,0 +1,367 @@
package pb
import (
"fmt"
"io"
"math"
"strings"
"sync"
"sync/atomic"
"time"
"unicode/utf8"
)
const (
// Default refresh rate - 200ms
DEFAULT_REFRESH_RATE = time.Millisecond * 200
FORMAT = "[=>-]"
)
// DEPRECATED
// Variables kept for backward compatibility; they no longer have any effect.
// Use pb.Format and pb.SetRefreshRate instead.
var (
DefaultRefreshRate = DEFAULT_REFRESH_RATE
BarStart, BarEnd, Empty, Current, CurrentN string
)
// Create new progress bar object
func New(total int) *ProgressBar {
return New64(int64(total))
}
// Create new progress bar object using int64 as total
func New64(total int64) *ProgressBar {
pb := &ProgressBar{
Total: total,
RefreshRate: DEFAULT_REFRESH_RATE,
ShowPercent: true,
ShowCounters: true,
ShowBar: true,
ShowTimeLeft: true,
ShowFinalTime: true,
Units: U_NO,
ManualUpdate: false,
isFinish: make(chan struct{}),
currentValue: -1,
}
return pb.Format(FORMAT)
}
// Create new object and start
func StartNew(total int) *ProgressBar {
return New(total).Start()
}
// Callback for custom output
// For example:
// bar.Callback = func(s string) {
// mySuperPrint(s)
// }
//
type Callback func(out string)
type ProgressBar struct {
current int64 // current must be first member of struct (https://code.google.com/p/go/issues/detail?id=5278)
Total int64
RefreshRate time.Duration
ShowPercent, ShowCounters bool
ShowSpeed, ShowTimeLeft, ShowBar bool
ShowFinalTime bool
Output io.Writer
Callback Callback
NotPrint bool
Units Units
Width int
ForceWidth bool
ManualUpdate bool
finishOnce sync.Once //Guards isFinish
isFinish chan struct{}
startTime time.Time
startValue int64
currentValue int64
prefix, postfix string
BarStart string
BarEnd string
Empty string
Current string
CurrentN string
}
// Start print
func (pb *ProgressBar) Start() *ProgressBar {
pb.startTime = time.Now()
pb.startValue = pb.current
if pb.Total == 0 {
pb.ShowTimeLeft = false
pb.ShowPercent = false
}
if !pb.ManualUpdate {
go pb.writer()
}
return pb
}
// Increment current value
func (pb *ProgressBar) Increment() int {
return pb.Add(1)
}
// Set current value
func (pb *ProgressBar) Set(current int) *ProgressBar {
return pb.Set64(int64(current))
}
// Set64 sets the current value as int64
func (pb *ProgressBar) Set64(current int64) *ProgressBar {
atomic.StoreInt64(&pb.current, current)
return pb
}
// Add to current value
func (pb *ProgressBar) Add(add int) int {
return int(pb.Add64(int64(add)))
}
func (pb *ProgressBar) Add64(add int64) int64 {
return atomic.AddInt64(&pb.current, add)
}
// Set prefix string
func (pb *ProgressBar) Prefix(prefix string) *ProgressBar {
pb.prefix = prefix
return pb
}
// Set postfix string
func (pb *ProgressBar) Postfix(postfix string) *ProgressBar {
pb.postfix = postfix
return pb
}
// Set custom format for bar
// Example: bar.Format("[=>_]")
func (pb *ProgressBar) Format(format string) *ProgressBar {
formatEntries := strings.Split(format, "")
if len(formatEntries) == 5 {
pb.BarStart = formatEntries[0]
pb.BarEnd = formatEntries[4]
pb.Empty = formatEntries[3]
pb.Current = formatEntries[1]
pb.CurrentN = formatEntries[2]
}
return pb
}
// Set bar refresh rate
func (pb *ProgressBar) SetRefreshRate(rate time.Duration) *ProgressBar {
pb.RefreshRate = rate
return pb
}
// Set units
// bar.SetUnits(U_NO) - by default
// bar.SetUnits(U_BYTES) - for Mb, Kb, etc
func (pb *ProgressBar) SetUnits(units Units) *ProgressBar {
pb.Units = units
return pb
}
// Set max width; ignored if it is bigger than the terminal width
func (pb *ProgressBar) SetMaxWidth(width int) *ProgressBar {
pb.Width = width
pb.ForceWidth = false
return pb
}
// Set bar width
func (pb *ProgressBar) SetWidth(width int) *ProgressBar {
pb.Width = width
pb.ForceWidth = true
return pb
}
// End print
func (pb *ProgressBar) Finish() {
//Protect multiple calls
pb.finishOnce.Do(func() {
close(pb.isFinish)
pb.write(atomic.LoadInt64(&pb.current))
if !pb.NotPrint {
fmt.Println()
}
})
}
// End print and write string 'str'
func (pb *ProgressBar) FinishPrint(str string) {
pb.Finish()
fmt.Println(str)
}
// implement io.Writer
func (pb *ProgressBar) Write(p []byte) (n int, err error) {
n = len(p)
pb.Add(n)
return
}
// implement io.Reader
func (pb *ProgressBar) Read(p []byte) (n int, err error) {
n = len(p)
pb.Add(n)
return
}
// Create new proxy reader over bar
func (pb *ProgressBar) NewProxyReader(r io.Reader) *Reader {
return &Reader{r, pb}
}
func (pb *ProgressBar) write(current int64) {
width := pb.GetWidth()
var percentBox, countersBox, timeLeftBox, speedBox, barBox, end, out string
// percents
if pb.ShowPercent {
percent := float64(current) / (float64(pb.Total) / float64(100))
percentBox = fmt.Sprintf(" %.02f %% ", percent)
}
// counters
if pb.ShowCounters {
if pb.Total > 0 {
countersBox = fmt.Sprintf("%s / %s ", Format(current, pb.Units), Format(pb.Total, pb.Units))
} else {
countersBox = Format(current, pb.Units) + " / ? "
}
}
// time left
fromStart := time.Now().Sub(pb.startTime)
currentFromStart := current - pb.startValue
select {
case <-pb.isFinish:
if pb.ShowFinalTime {
left := (fromStart / time.Second) * time.Second
timeLeftBox = left.String()
}
default:
if pb.ShowTimeLeft && currentFromStart > 0 {
perEntry := fromStart / time.Duration(currentFromStart)
left := time.Duration(pb.Total-currentFromStart) * perEntry
left = (left / time.Second) * time.Second
timeLeftBox = left.String()
}
}
// speed
if pb.ShowSpeed && currentFromStart > 0 {
fromStart := time.Now().Sub(pb.startTime)
speed := float64(currentFromStart) / (float64(fromStart) / float64(time.Second))
speedBox = Format(int64(speed), pb.Units) + "/s "
}
barWidth := utf8.RuneCountInString(countersBox + pb.BarStart + pb.BarEnd + percentBox + timeLeftBox + speedBox + pb.prefix + pb.postfix)
// bar
if pb.ShowBar {
size := width - barWidth
if size > 0 {
if pb.Total > 0 {
curCount := int(math.Ceil((float64(current) / float64(pb.Total)) * float64(size)))
emptCount := size - curCount
barBox = pb.BarStart
if emptCount < 0 {
emptCount = 0
}
if curCount > size {
curCount = size
}
if emptCount <= 0 {
barBox += strings.Repeat(pb.Current, curCount)
} else if curCount > 0 {
barBox += strings.Repeat(pb.Current, curCount-1) + pb.CurrentN
}
barBox += strings.Repeat(pb.Empty, emptCount) + pb.BarEnd
} else {
barBox = pb.BarStart
pos := size - int(current)%int(size)
if pos-1 > 0 {
barBox += strings.Repeat(pb.Empty, pos-1)
}
barBox += pb.Current
if size-pos-1 > 0 {
barBox += strings.Repeat(pb.Empty, size-pos-1)
}
barBox += pb.BarEnd
}
}
}
// check len
out = pb.prefix + countersBox + barBox + percentBox + speedBox + timeLeftBox + pb.postfix
if utf8.RuneCountInString(out) < width {
end = strings.Repeat(" ", width-utf8.RuneCountInString(out))
}
// and print!
switch {
case pb.Output != nil:
fmt.Fprint(pb.Output, "\r"+out+end)
case pb.Callback != nil:
pb.Callback(out + end)
case !pb.NotPrint:
fmt.Print("\r" + out + end)
}
}
func (pb *ProgressBar) GetWidth() int {
if pb.ForceWidth {
return pb.Width
}
width := pb.Width
termWidth, _ := terminalWidth()
if width == 0 || termWidth <= width {
width = termWidth
}
return width
}
// Write the current state of the progressbar
func (pb *ProgressBar) Update() {
c := atomic.LoadInt64(&pb.current)
if c != pb.currentValue {
pb.write(c)
pb.currentValue = c
}
}
// Internal loop for writing progressbar
func (pb *ProgressBar) writer() {
pb.Update()
for {
select {
case <-pb.isFinish:
return
case <-time.After(pb.RefreshRate):
pb.Update()
}
}
}
type window struct {
Row uint16
Col uint16
Xpixel uint16
Ypixel uint16
}


@@ -0,0 +1,7 @@
// +build linux darwin freebsd netbsd openbsd
package pb
import "syscall"
const sys_ioctl = syscall.SYS_IOCTL


@@ -0,0 +1,5 @@
// +build solaris
package pb
const sys_ioctl = 54


@@ -0,0 +1,37 @@
package pb
import (
"testing"
)
func Test_IncrementAddsOne(t *testing.T) {
count := 5000
bar := New(count)
expected := 1
actual := bar.Increment()
if actual != expected {
t.Errorf("Expected {%d} was {%d}", expected, actual)
}
}
func Test_Width(t *testing.T) {
count := 5000
bar := New(count)
width := 100
bar.SetWidth(100).Callback = func(out string) {
if len(out) != width {
t.Errorf("Bar width expected {%d} was {%d}", width, len(out))
}
}
bar.Start()
bar.Increment()
bar.Finish()
}
func Test_MultipleFinish(t *testing.T) {
bar := New(5000)
bar.Add(2000)
bar.Finish()
bar.Finish()
}


@@ -1,16 +1,16 @@
// +build windows
package pb
import (
"github.com/olekukonko/ts"
)
func bold(str string) string {
return str
}
func terminalWidth() (int, error) {
size , err := ts.GetSize()
return size.Col() , err
}
// +build windows
package pb
import (
"github.com/olekukonko/ts"
)
func bold(str string) string {
return str
}
func terminalWidth() (int, error) {
size, err := ts.GetSize()
return size.Col(), err
}


@@ -1,8 +1,9 @@
// +build linux darwin freebsd
// +build linux darwin freebsd netbsd openbsd solaris
package pb
import (
"os"
"runtime"
"syscall"
"unsafe"
@@ -13,6 +14,16 @@ const (
TIOCGWINSZ_OSX = 1074295912
)
var tty *os.File
func init() {
var err error
tty, err = os.Open("/dev/tty")
if err != nil {
tty = os.Stdin
}
}
func bold(str string) string {
return "\033[1m" + str + "\033[0m"
}
@@ -23,8 +34,8 @@ func terminalWidth() (int, error) {
if runtime.GOOS == "darwin" {
tio = TIOCGWINSZ_OSX
}
res, _, err := syscall.Syscall(syscall.SYS_IOCTL,
uintptr(syscall.Stdin),
res, _, err := syscall.Syscall(sys_ioctl,
tty.Fd(),
uintptr(tio),
uintptr(unsafe.Pointer(w)),
)

Godeps/_workspace/src/github.com/cheggaaa/pb/reader.go

@@ -0,0 +1,17 @@
package pb
import (
"io"
)
// Reader is a proxy reader: it implements io.Reader and adds the bytes read to the bar
type Reader struct {
io.Reader
bar *ProgressBar
}
func (r *Reader) Read(p []byte) (n int, err error) {
n, err = r.Reader.Read(p)
r.bar.Add(n)
return
}


@@ -1,33 +1,32 @@
# capnslog, the CoreOS logging package
# CoreOS Log
There are far too many logging packages out there, with varying degrees of licenses, far too many features (colorization, all sorts of log frameworks) or are just a pain to use (lack of `Fatalln()`?).
capnslog provides a simple but consistent logging interface suitable for all kinds of projects.
There are far too many logging packages out there, with varying degrees of licenses, far too many features (colorization, all sorts of log frameworks) or are just a pain to use (lack of `Fatalln()`?)
### Design Principles
## Design Principles
##### `package main` is the place where logging gets turned on and routed
* `package main` is the place where logging gets turned on and routed
A library should not touch log options, only generate log entries. Libraries are silent until main lets them speak.
##### All log options are runtime-configurable.
* All log options are runtime-configurable.
Still the job of `main` to expose these configurations. `main` may delegate this to, say, a configuration webhook, but does so explicitly.
##### There is one log object per package. It is registered under its repository and package name.
* There is one log object per package. It is registered under its repository and package name.
`main` activates logging for its repository and any dependency repositories it would also like to have output in its logstream. `main` also dictates at which level each subpackage logs.
##### There is *one* output stream, and it is an `io.Writer` composed with a formatter.
* There is *one* output stream, and it is an `io.Writer` composed with a formatter.
Splitting streams is probably not the job of your program, but rather, your log aggregation framework. If you must split output streams, again, `main` configures this and you can write a very simple two-output struct that satisfies io.Writer.
Fancy colorful formatting and JSON output are beyond the scope of a basic logging framework -- they're application/log-collector dependent. These are, at best, provided as options, but more likely, provided by your application.
##### Log objects are an interface
* Log objects are an interface
An object knows best how to print itself. Log objects can collect more interesting metadata if they wish, however, because text isn't going away anytime soon, they must all be marshalable to text. The simplest log object is a string, which returns itself. If you wish to do more fancy tricks for printing your log objects, see also JSON output -- introspect and write a formatter which can handle your advanced log interface. Making strings is the only thing guaranteed.
##### Log levels have specific meanings:
* Log levels have specific meanings:
* Critical: Unrecoverable. Must fail.
* Error: Data has been lost, a request has failed for a bad reason, or a required resource has been lost


@@ -19,7 +19,7 @@ import "os"
func init() {
initHijack()
// Go `log` package uses os.Stderr.
// Go `log` pacakge uses os.Stderr.
SetFormatter(NewPrettyFormatter(os.Stderr, false))
SetGlobalLogLevel(INFO)
}


@@ -20,7 +20,6 @@ import (
"errors"
"fmt"
"os"
"path/filepath"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/coreos/go-systemd/journal"
)
@@ -56,8 +55,7 @@ func (j *journaldFormatter) Format(pkg string, l LogLevel, _ int, entries ...int
}
msg := fmt.Sprint(entries...)
tags := map[string]string{
"PACKAGE": pkg,
"SYSLOG_IDENTIFIER": filepath.Base(os.Args[0]),
"PACKAGE": pkg,
}
err := journal.Send(msg, pri, tags)
if err != nil {


@@ -1,44 +0,0 @@
package pb
import (
"fmt"
"strings"
"strconv"
)
const (
// By default, without type handle
U_NO = 0
// Handle as b, Kb, Mb, etc
U_BYTES = 1
)
// Format integer
func Format(i int64, units int) string {
switch units {
case U_BYTES:
return FormatBytes(i)
}
// by default just convert to string
return strconv.Itoa(int(i))
}
// Convert bytes to human readable string. Like a 2 MiB, 64.2 KiB, 52 B
func FormatBytes(i int64) (result string) {
switch {
case i > (1024 * 1024 * 1024 * 1024):
result = fmt.Sprintf("%#.02f TB", float64(i)/1024/1024/1024/1024)
case i > (1024 * 1024 * 1024):
result = fmt.Sprintf("%#.02f GB", float64(i)/1024/1024/1024)
case i > (1024 * 1024):
result = fmt.Sprintf("%#.02f MB", float64(i)/1024/1024)
case i > 1024:
result = fmt.Sprintf("%#.02f KB", float64(i)/1024)
default:
result = fmt.Sprintf("%d B", i)
}
result = strings.Trim(result, " ")
return
}


@@ -1,267 +0,0 @@
package pb
import (
"fmt"
"io"
"math"
"strings"
"sync/atomic"
"time"
)
const (
// Default refresh rate - 200ms
DEFAULT_REFRESH_RATE = time.Millisecond * 200
FORMAT = "[=>-]"
)
// DEPRECATED
// variables for backward compatibility, from now do not work
// use pb.Format and pb.SetRefreshRate
var (
DefaultRefreshRate = DEFAULT_REFRESH_RATE
BarStart, BarEnd, Empty, Current, CurrentN string
)
// Create new progress bar object
func New(total int) (pb *ProgressBar) {
pb = &ProgressBar{
Total: int64(total),
RefreshRate: DEFAULT_REFRESH_RATE,
ShowPercent: true,
ShowCounters: true,
ShowBar: true,
ShowTimeLeft: true,
}
pb.Format(FORMAT)
return
}
// Create new object and start
func StartNew(total int) (pb *ProgressBar) {
pb = New(total)
pb.Start()
return
}
// Callback for custom output
// For example:
// bar.Callback = func(s string) {
// mySuperPrint(s)
// }
//
type Callback func(out string)
type ProgressBar struct {
current int64 // current must be first member of struct (https://code.google.com/p/go/issues/detail?id=5278)
Total int64
RefreshRate time.Duration
ShowPercent, ShowCounters bool
ShowSpeed, ShowTimeLeft, ShowBar bool
Output io.Writer
Callback Callback
NotPrint bool
Units int
isFinish bool
startTime time.Time
BarStart string
BarEnd string
Empty string
Current string
CurrentN string
}
// Start print
func (pb *ProgressBar) Start() {
pb.startTime = time.Now()
if pb.Total == 0 {
pb.ShowBar = false
pb.ShowTimeLeft = false
pb.ShowPercent = false
}
go pb.writer()
}
// Increment current value
func (pb *ProgressBar) Increment() int {
return pb.Add(1)
}
// Set current value
func (pb *ProgressBar) Set(current int) {
atomic.StoreInt64(&pb.current, int64(current))
}
// Add to current value
func (pb *ProgressBar) Add(add int) int {
return int(atomic.AddInt64(&pb.current, int64(add)))
}
// Set custom format for bar
// Example: bar.Format("[=>_]")
func (pb *ProgressBar) Format(format string) (bar *ProgressBar) {
bar = pb
formatEntries := strings.Split(format, "")
if len(formatEntries) != 5 {
return
}
pb.BarStart = formatEntries[0]
pb.BarEnd = formatEntries[4]
pb.Empty = formatEntries[3]
pb.Current = formatEntries[1]
pb.CurrentN = formatEntries[2]
return
}
// Set bar refresh rate
func (pb *ProgressBar) SetRefreshRate(rate time.Duration) (bar *ProgressBar) {
bar = pb
pb.RefreshRate = rate
return
}
// Set units
// bar.SetUnits(U_NO) - by default
// bar.SetUnits(U_BYTES) - for Mb, Kb, etc
func (pb *ProgressBar) SetUnits(units int) (bar *ProgressBar) {
bar = pb
switch units {
case U_NO, U_BYTES:
pb.Units = units
}
return
}
// End print
func (pb *ProgressBar) Finish() {
pb.isFinish = true
pb.write(atomic.LoadInt64(&pb.current))
if !pb.NotPrint {
fmt.Println()
}
}
// End print and write string 'str'
func (pb *ProgressBar) FinishPrint(str string) {
pb.Finish()
fmt.Println(str)
}
// implement io.Writer
func (pb *ProgressBar) Write(p []byte) (n int, err error) {
n = len(p)
pb.Add(n)
return
}
// implement io.Reader
func (pb *ProgressBar) Read(p []byte) (n int, err error) {
n = len(p)
pb.Add(n)
return
}
func (pb *ProgressBar) write(current int64) {
width, _ := terminalWidth()
var percentBox, countersBox, timeLeftBox, speedBox, barBox, end, out string
// percents
if pb.ShowPercent {
percent := float64(current) / (float64(pb.Total) / float64(100))
percentBox = fmt.Sprintf(" %#.02f %% ", percent)
}
// counters
if pb.ShowCounters {
if pb.Total > 0 {
countersBox = fmt.Sprintf("%s / %s ", Format(current, pb.Units), Format(pb.Total, pb.Units))
} else {
countersBox = Format(current, pb.Units) + " "
}
}
// time left
if pb.ShowTimeLeft && current > 0 {
fromStart := time.Now().Sub(pb.startTime)
perEntry := fromStart / time.Duration(current)
left := time.Duration(pb.Total-current) * perEntry
left = (left / time.Second) * time.Second
if left > 0 {
timeLeftBox = left.String()
}
}
// speed
if pb.ShowSpeed && current > 0 {
fromStart := time.Now().Sub(pb.startTime)
speed := float64(current) / (float64(fromStart) / float64(time.Second))
speedBox = Format(int64(speed), pb.Units) + "/s "
}
// bar
if pb.ShowBar {
size := width - len(countersBox+pb.BarStart+pb.BarEnd+percentBox+timeLeftBox+speedBox)
if size > 0 {
curCount := int(math.Ceil((float64(current) / float64(pb.Total)) * float64(size)))
emptCount := size - curCount
barBox = pb.BarStart
if emptCount < 0 {
emptCount = 0
}
if curCount > size {
curCount = size
}
if emptCount <= 0 {
barBox += strings.Repeat(pb.Current, curCount)
} else if curCount > 0 {
barBox += strings.Repeat(pb.Current, curCount-1) + pb.CurrentN
}
barBox += strings.Repeat(pb.Empty, emptCount) + pb.BarEnd
}
}
// check len
out = countersBox + barBox + percentBox + speedBox + timeLeftBox
if len(out) < width {
end = strings.Repeat(" ", width-len(out))
}
out = countersBox + barBox + percentBox + speedBox + timeLeftBox
// and print!
switch {
case pb.Output != nil:
fmt.Fprint(pb.Output, out+end)
case pb.Callback != nil:
pb.Callback(out + end)
case !pb.NotPrint:
fmt.Print("\r" + out + end)
}
}
func (pb *ProgressBar) writer() {
var c, oc int64
oc = -1
for {
if pb.isFinish {
break
}
c = atomic.LoadInt64(&pb.current)
if c != oc {
pb.write(c)
oc = c
}
time.Sleep(pb.RefreshRate)
}
}
type window struct {
Row uint16
Col uint16
Xpixel uint16
Ypixel uint16
}


@@ -0,0 +1,5 @@
language: go
go:
- 1.4
- tip


@@ -0,0 +1,75 @@
// Copyright 2014 The Cockroach Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
// implied. See the License for the specific language governing
// permissions and limitations under the License. See the AUTHORS file
// for names of contributors.
//
// Author: Tyler Neely (t@jujit.su)
package loghisto
import (
"bytes"
"fmt"
"os"
"strings"
)
type graphiteStat struct {
Metric string
Time int64
Value float64
Host string
}
type graphiteStatArray []*graphiteStat
func (stats graphiteStatArray) ToRequest() []byte {
var request bytes.Buffer
for _, stat := range stats {
fmt.Fprintf(&request, "cockroach.%s.%s %f %d\n",
stat.Host,
strings.Replace(stat.Metric, "_", ".", -1),
stat.Value,
stat.Time,
)
}
return request.Bytes()
}
func (metricSet *ProcessedMetricSet) tographiteStats() graphiteStatArray {
hostname, err := os.Hostname()
if err != nil {
hostname = "unknown"
}
stats := make([]*graphiteStat, 0, len(metricSet.Metrics))
for metric, value := range metricSet.Metrics {
//TODO(tyler) custom tags
stats = append(stats, &graphiteStat{
Metric: metric,
Time: metricSet.Time.Unix(),
Value: value,
Host: hostname,
})
}
return stats
}
// GraphiteProtocol generates a wire representation of a ProcessedMetricSet
// for submission to a Graphite Carbon instance using the plaintext protocol.
func GraphiteProtocol(ms *ProcessedMetricSet) []byte {
return ms.tographiteStats().ToRequest()
}
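
To make the wire format concrete, the sketch below formats a single sample the same way `ToRequest` does: a dotted metric path, the value, and a Unix timestamp, terminated by a newline. `graphiteLine` is a hypothetical helper, and the host name, metric name, and timestamp are made-up inputs.

```go
package main

import (
	"fmt"
	"strings"
)

// graphiteLine is a hypothetical helper that formats one sample as
// "<path> <value> <timestamp>\n", replacing "_" with "." in the metric name
// so that Graphite treats each segment as a path component.
func graphiteLine(host, metric string, value float64, unix int64) string {
	return fmt.Sprintf("cockroach.%s.%s %f %d\n",
		host,
		strings.Replace(metric, "_", ".", -1),
		value,
		unix,
	)
}

func main() {
	fmt.Print(graphiteLine("node1", "latency_99", 50.54, 1446800000))
}
```

Note that the `%f` verb always prints six decimal places, so 50.54 is sent as `50.540000`.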


@@ -0,0 +1,23 @@
package loghisto
import (
"testing"
"time"
)
func TestGraphite(t *testing.T) {
ms := NewMetricSystem(time.Second, true)
s := NewSubmitter(ms, GraphiteProtocol, "tcp", "localhost:7777")
s.Start()
metrics := &ProcessedMetricSet{
Time: time.Now(),
Metrics: map[string]float64{
"test.3": 50.54,
"test.4": 10.21,
},
}
request := s.serializer(metrics)
s.submit(request)
s.Shutdown()
}


@@ -0,0 +1,653 @@
// Copyright 2014 The Cockroach Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
// implied. See the License for the specific language governing
// permissions and limitations under the License. See the AUTHORS file
// for names of contributors.
//
// Author: Tyler Neely (t@jujit.su)
// IMPORTANT: only subscribe to the metric stream
// using buffered channels that are regularly
// flushed, as reaper will NOT block while trying
// to send metrics to a subscriber, and will ignore
// a subscriber if they fail to clear their channel
// 3 times in a row!
package loghisto
import (
"errors"
"fmt"
"math"
"runtime"
"sort"
"sync"
"sync/atomic"
"time"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/golang/glog"
)
const (
// precision affects the bucketing used during histogram value compression.
precision = 100
)
// ProcessedMetricSet contains human-readable metrics that may also be
// suitable for storage in time-series databases.
type ProcessedMetricSet struct {
Time time.Time
Metrics map[string]float64
}
// RawMetricSet contains metrics in a form that supports generation of
// percentiles and other rich statistics.
type RawMetricSet struct {
Time time.Time
Counters map[string]uint64
Rates map[string]uint64
Histograms map[string]map[int16]*uint64
Gauges map[string]float64
}
// TimerToken facilitates concurrent timings of durations of the same label.
type TimerToken struct {
Name string
Start time.Time
MetricSystem *MetricSystem
}
// proportion is a compact value with a corresponding count of
// occurrences in this interval.
type proportion struct {
Value float64
Count uint64
}
// proportionArray is a sortable collection of proportion types.
type proportionArray []proportion
// MetricSystem facilitates the collection and distribution of metrics.
type MetricSystem struct {
// percentiles is a mapping from labels to desired percentiles to be
// calculated by the MetricSystem
percentiles map[string]float64
// interval is the duration between collections and broadcasts of metrics
// to subscribers.
interval time.Duration
// subscribeToRawMetrics allows subscription to a RawMetricSet generated
// by reaper at the end of each interval on a sent channel.
subscribeToRawMetrics chan chan *RawMetricSet
// unsubscribeFromRawMetrics allows subscribers to unsubscribe from
// receiving a RawMetricSet on the sent channel.
unsubscribeFromRawMetrics chan chan *RawMetricSet
// subscribeToProcessedMetrics allows subscription to a ProcessedMetricSet
// generated by reaper at the end of each interval on a sent channel.
subscribeToProcessedMetrics chan chan *ProcessedMetricSet
// unsubscribeFromProcessedMetrics allows subscribers to unsubscribe from
// receiving a ProcessedMetricSet on the sent channel.
unsubscribeFromProcessedMetrics chan chan *ProcessedMetricSet
// rawSubscribers stores current subscribers to RawMetrics
rawSubscribers map[chan *RawMetricSet]struct{}
// rawBadSubscribers tracks misbehaving subscribers who do not clear their
// subscription channels regularly.
rawBadSubscribers map[chan *RawMetricSet]int
// processedSubscribers stores current subscribers to ProcessedMetrics
processedSubscribers map[chan *ProcessedMetricSet]struct{}
// processedBadSubscribers tracks misbehaving subscribers who do not clear
// their subscription channels regularly.
processedBadSubscribers map[chan *ProcessedMetricSet]int
// subscribersMu controls access to subscription structures
subscribersMu sync.RWMutex
// counterStore maintains the total counts of counters.
counterStore map[string]*uint64
counterStoreMu sync.RWMutex
// counterCache aggregates new Counters until they are collected by reaper().
counterCache map[string]*uint64
// counterMu controls access to counterCache.
counterMu sync.RWMutex
// histogramCache aggregates Histograms until they are collected by reaper().
histogramCache map[string]map[int16]*uint64
// histogramMu controls access to histogramCache.
histogramMu sync.RWMutex
// histogramCountStore keeps track of aggregate counts and sums for aggregate
// mean calculation.
histogramCountStore map[string]*uint64
// histogramCountMu controls access to the histogramCountStore.
histogramCountMu sync.RWMutex
// gaugeFuncs maps metrics to functions used for calculating their value
gaugeFuncs map[string]func() float64
// gaugeFuncsMu controls access to the gaugeFuncs map.
gaugeFuncsMu sync.Mutex
// Has reaper() been started?
reaping bool
// Close this to bring down this MetricSystem
shutdownChan chan struct{}
}
// Metrics is the default metric system, which collects and broadcasts metrics
// to subscribers once every 60 seconds. Also includes default system stats.
var Metrics = NewMetricSystem(60*time.Second, true)
// NewMetricSystem returns a new metric system that collects and broadcasts
// metrics after each interval.
func NewMetricSystem(interval time.Duration, sysStats bool) *MetricSystem {
ms := &MetricSystem{
percentiles: map[string]float64{
"%s_min": 0,
"%s_50": .5,
"%s_75": .75,
"%s_90": .9,
"%s_95": .95,
"%s_99": .99,
"%s_99.9": .999,
"%s_99.99": .9999,
"%s_max": 1,
},
interval: interval,
subscribeToRawMetrics: make(chan chan *RawMetricSet, 64),
unsubscribeFromRawMetrics: make(chan chan *RawMetricSet, 64),
subscribeToProcessedMetrics: make(chan chan *ProcessedMetricSet, 64),
unsubscribeFromProcessedMetrics: make(chan chan *ProcessedMetricSet, 64),
rawSubscribers: make(map[chan *RawMetricSet]struct{}),
rawBadSubscribers: make(map[chan *RawMetricSet]int),
processedSubscribers: make(map[chan *ProcessedMetricSet]struct{}),
processedBadSubscribers: make(map[chan *ProcessedMetricSet]int),
counterStore: make(map[string]*uint64),
counterCache: make(map[string]*uint64),
histogramCache: make(map[string]map[int16]*uint64),
histogramCountStore: make(map[string]*uint64),
gaugeFuncs: make(map[string]func() float64),
shutdownChan: make(chan struct{}),
}
if sysStats {
ms.gaugeFuncsMu.Lock()
ms.gaugeFuncs["sys.Alloc"] = func() float64 {
memStats := new(runtime.MemStats)
runtime.ReadMemStats(memStats)
return float64(memStats.Alloc)
}
ms.gaugeFuncs["sys.NumGC"] = func() float64 {
memStats := new(runtime.MemStats)
runtime.ReadMemStats(memStats)
return float64(memStats.NumGC)
}
ms.gaugeFuncs["sys.PauseTotalNs"] = func() float64 {
memStats := new(runtime.MemStats)
runtime.ReadMemStats(memStats)
return float64(memStats.PauseTotalNs)
}
ms.gaugeFuncs["sys.NumGoroutine"] = func() float64 {
return float64(runtime.NumGoroutine())
}
ms.gaugeFuncsMu.Unlock()
}
return ms
}
// SpecifyPercentiles allows users to override the default collected
// and reported percentiles.
func (ms *MetricSystem) SpecifyPercentiles(percentiles map[string]float64) {
ms.percentiles = percentiles
}
// SubscribeToRawMetrics registers a channel to receive RawMetricSets
// periodically generated by reaper at each interval.
func (ms *MetricSystem) SubscribeToRawMetrics(metricStream chan *RawMetricSet) {
ms.subscribeToRawMetrics <- metricStream
}
// UnsubscribeFromRawMetrics unregisters a channel so that it no longer
// receives the RawMetricSets generated by reaper at each interval.
func (ms *MetricSystem) UnsubscribeFromRawMetrics(
metricStream chan *RawMetricSet) {
ms.unsubscribeFromRawMetrics <- metricStream
}
// SubscribeToProcessedMetrics registers a channel to receive
// ProcessedMetricSets periodically generated by reaper at each interval.
func (ms *MetricSystem) SubscribeToProcessedMetrics(
metricStream chan *ProcessedMetricSet) {
ms.subscribeToProcessedMetrics <- metricStream
}
// UnsubscribeFromProcessedMetrics unregisters a channel so that it no longer
// receives the ProcessedMetricSets generated by reaper at each interval.
func (ms *MetricSystem) UnsubscribeFromProcessedMetrics(
metricStream chan *ProcessedMetricSet) {
ms.unsubscribeFromProcessedMetrics <- metricStream
}
// StartTimer begins a timer and returns a token which is required for halting
// the timer. This allows for concurrent timings under the same name.
func (ms *MetricSystem) StartTimer(name string) TimerToken {
return TimerToken{
Name: name,
Start: time.Now(),
MetricSystem: ms,
}
}
// Stop stops a timer given by StartTimer, submits a Histogram of its duration
// in nanoseconds, and returns its duration in nanoseconds.
func (tt *TimerToken) Stop() time.Duration {
duration := time.Since(tt.Start)
tt.MetricSystem.Histogram(tt.Name, float64(duration.Nanoseconds()))
return duration
}
// Counter is used for recording a running count of the total occurrences of
// a particular event. A rate is also exported for the amount that a counter
// has increased during an interval of this MetricSystem.
func (ms *MetricSystem) Counter(name string, amount uint64) {
ms.counterMu.RLock()
_, exists := ms.counterCache[name]
// perform lock promotion when we need more control
if exists {
atomic.AddUint64(ms.counterCache[name], amount)
ms.counterMu.RUnlock()
} else {
ms.counterMu.RUnlock()
ms.counterMu.Lock()
_, syncExists := ms.counterCache[name]
if !syncExists {
var z uint64
ms.counterCache[name] = &z
}
atomic.AddUint64(ms.counterCache[name], amount)
ms.counterMu.Unlock()
}
}
// Histogram is used for generating rich metrics, such as percentiles, from
// periodically occurring continuous values.
func (ms *MetricSystem) Histogram(name string, value float64) {
compressedValue := compress(value)
ms.histogramMu.RLock()
_, present := ms.histogramCache[name][compressedValue]
if present {
atomic.AddUint64(ms.histogramCache[name][compressedValue], 1)
ms.histogramMu.RUnlock()
} else {
ms.histogramMu.RUnlock()
ms.histogramMu.Lock()
_, syncPresent := ms.histogramCache[name][compressedValue]
if !syncPresent {
var z uint64
_, mapPresent := ms.histogramCache[name]
if !mapPresent {
ms.histogramCache[name] = make(map[int16]*uint64)
}
ms.histogramCache[name][compressedValue] = &z
}
atomic.AddUint64(ms.histogramCache[name][compressedValue], 1)
ms.histogramMu.Unlock()
}
}
// RegisterGaugeFunc registers a function to be called at each interval
// whose return value will be used to populate the <name> metric.
func (ms *MetricSystem) RegisterGaugeFunc(name string, f func() float64) {
ms.gaugeFuncsMu.Lock()
ms.gaugeFuncs[name] = f
ms.gaugeFuncsMu.Unlock()
}
// DeregisterGaugeFunc deregisters a function for the <name> metric.
func (ms *MetricSystem) DeregisterGaugeFunc(name string) {
ms.gaugeFuncsMu.Lock()
delete(ms.gaugeFuncs, name)
ms.gaugeFuncsMu.Unlock()
}
// compress takes a float64 and lossily shrinks it to an int16 to facilitate
// bucketing of histogram values, staying within 1% of the true value. This
// fails for large values of 1e142 and above, and is inaccurate for values
// closer to 0 than +/- 0.51 or +/- math.Inf.
func compress(value float64) int16 {
i := int16(precision*math.Log(1.0+math.Abs(value)) + 0.5)
if value < 0 {
return -1 * i
}
return i
}
// decompress takes a lossily shrunk int16 and returns a float64 within 1% of
// the original float64 passed to compress.
func decompress(compressedValue int16) float64 {
f := math.Exp(math.Abs(float64(compressedValue))/precision) - 1.0
if compressedValue < 0 {
return -1.0 * f
}
return f
}
// processHistograms derives rich metrics from histograms, currently
// percentiles, sum, count, and mean.
func (ms *MetricSystem) processHistograms(name string,
valuesToCounts map[int16]*uint64) map[string]float64 {
output := make(map[string]float64)
totalSum := float64(0)
totalCount := uint64(0)
proportions := make([]proportion, 0, len(valuesToCounts))
for compressedValue, count := range valuesToCounts {
value := decompress(compressedValue)
totalSum += value * float64(*count)
totalCount += *count
proportions = append(proportions, proportion{Value: value, Count: *count})
}
sumName := fmt.Sprintf("%s_sum", name)
countName := fmt.Sprintf("%s_count", name)
avgName := fmt.Sprintf("%s_avg", name)
// increment interval sum and count
output[countName] = float64(totalCount)
output[sumName] = totalSum
output[avgName] = totalSum / float64(totalCount)
// increment aggregate sum and count
ms.histogramCountMu.RLock()
_, present := ms.histogramCountStore[sumName]
if !present {
ms.histogramCountMu.RUnlock()
ms.histogramCountMu.Lock()
_, syncPresent := ms.histogramCountStore[sumName]
if !syncPresent {
var x uint64
ms.histogramCountStore[sumName] = &x
var z uint64
ms.histogramCountStore[countName] = &z
}
ms.histogramCountMu.Unlock()
ms.histogramCountMu.RLock()
}
atomic.AddUint64(ms.histogramCountStore[sumName], uint64(totalSum))
atomic.AddUint64(ms.histogramCountStore[countName], totalCount)
ms.histogramCountMu.RUnlock()
for label, p := range ms.percentiles {
value, err := percentile(totalCount, proportions, p)
if err != nil {
glog.Errorf("unable to calculate percentile: %s", err)
} else {
output[fmt.Sprintf(label, name)] = value
}
}
return output
}
// These next 3 methods are for the implementation of sort.Interface
func (s proportionArray) Len() int {
return len(s)
}
func (s proportionArray) Less(i, j int) bool {
return s[i].Value < s[j].Value
}
func (s proportionArray) Swap(i, j int) {
s[i], s[j] = s[j], s[i]
}
// percentile calculates a percentile represented as a float64 between 0 and 1
// inclusive from a proportionArray. totalCount is the sum of all counts of
// elements in the proportionArray.
func percentile(totalCount uint64, proportions proportionArray,
percentile float64) (float64, error) {
//TODO(tyler) handle multiple percentiles at once for efficiency
sort.Sort(proportions)
sofar := uint64(0)
for _, proportion := range proportions {
sofar += proportion.Count
if float64(sofar)/float64(totalCount) >= percentile {
return proportion.Value, nil
}
}
return 0, errors.New("Invalid percentile. Should be between 0 and 1.")
}
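
The percentile lookup above walks the sorted value/count pairs until the cumulative share of samples reaches the requested fraction. A minimal standalone sketch follows; `bucket` and `weightedPercentile` are hypothetical names, and `sort.Slice` is used here in place of the `sort.Interface` methods.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// bucket is a hypothetical stand-in for proportion: a value and how often it occurred.
type bucket struct {
	Value float64
	Count uint64
}

// weightedPercentile returns the smallest value whose cumulative count reaches
// fraction p of total. It sorts buckets in place by value.
func weightedPercentile(total uint64, buckets []bucket, p float64) (float64, error) {
	sort.Slice(buckets, func(i, j int) bool { return buckets[i].Value < buckets[j].Value })
	var sofar uint64
	for _, b := range buckets {
		sofar += b.Count
		if float64(sofar)/float64(total) >= p {
			return b.Value, nil
		}
	}
	return 0, errors.New("invalid percentile: should be between 0 and 1")
}

func main() {
	data := []bucket{{Value: 30, Count: 2}, {Value: 10, Count: 5}, {Value: 20, Count: 3}}
	for _, p := range []float64{0.5, 0.75, 1} {
		v, err := weightedPercentile(10, data, p)
		fmt.Println(p, v, err)
	}
}
```

Sorting on every call is what the TODO above refers to: computing several percentiles in one pass over a single sorted slice would avoid the repeated work.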
func (ms *MetricSystem) collectRawMetrics() *RawMetricSet {
normalizedInterval := time.Unix(0, time.Now().UnixNano()/
ms.interval.Nanoseconds()*
ms.interval.Nanoseconds())
ms.counterMu.Lock()
freshCounters := ms.counterCache
ms.counterCache = make(map[string]*uint64)
ms.counterMu.Unlock()
rates := make(map[string]uint64)
for name, count := range freshCounters {
rates[name] = *count
}
counters := make(map[string]uint64)
ms.counterStoreMu.RLock()
// update counters
for name, count := range freshCounters {
_, exists := ms.counterStore[name]
// only take a write lock when it's a totally new counter
if !exists {
ms.counterStoreMu.RUnlock()
ms.counterStoreMu.Lock()
_, syncExists := ms.counterStore[name]
if !syncExists {
var z uint64
ms.counterStore[name] = &z
}
ms.counterStoreMu.Unlock()
ms.counterStoreMu.RLock()
}
atomic.AddUint64(ms.counterStore[name], *count)
}
// copy counters for export
for name, count := range ms.counterStore {
counters[name] = *count
}
ms.counterStoreMu.RUnlock()
ms.histogramMu.Lock()
histograms := ms.histogramCache
ms.histogramCache = make(map[string]map[int16]*uint64)
ms.histogramMu.Unlock()
ms.gaugeFuncsMu.Lock()
gauges := make(map[string]float64)
for name, f := range ms.gaugeFuncs {
gauges[name] = f()
}
ms.gaugeFuncsMu.Unlock()
return &RawMetricSet{
Time: normalizedInterval,
Counters: counters,
Rates: rates,
Histograms: histograms,
Gauges: gauges,
}
}
// processMetrics (potentially slowly) creates human consumable metrics from a
// RawMetricSet, deriving rich statistics from histograms such as percentiles.
func (ms *MetricSystem) processMetrics(
rawMetrics *RawMetricSet) *ProcessedMetricSet {
metrics := make(map[string]float64)
for name, count := range rawMetrics.Counters {
metrics[name] = float64(count)
}
for name, count := range rawMetrics.Rates {
metrics[fmt.Sprintf("%s_rate", name)] = float64(count)
}
for name, valuesToCounts := range rawMetrics.Histograms {
for histoName, histoValue := range ms.processHistograms(name, valuesToCounts) {
metrics[histoName] = histoValue
}
}
for name, value := range rawMetrics.Gauges {
metrics[name] = value
}
return &ProcessedMetricSet{Time: rawMetrics.Time, Metrics: metrics}
}
func (ms *MetricSystem) updateSubscribers() {
ms.subscribersMu.Lock()
defer ms.subscribersMu.Unlock()
for {
select {
case subscriber := <-ms.subscribeToRawMetrics:
ms.rawSubscribers[subscriber] = struct{}{}
case unsubscriber := <-ms.unsubscribeFromRawMetrics:
delete(ms.rawSubscribers, unsubscriber)
case subscriber := <-ms.subscribeToProcessedMetrics:
ms.processedSubscribers[subscriber] = struct{}{}
case unsubscriber := <-ms.unsubscribeFromProcessedMetrics:
delete(ms.processedSubscribers, unsubscriber)
default: // no changes in subscribers
return
}
}
}
// reaper wakes up every <interval> seconds,
// collects and processes metrics, and pushes
// them to the corresponding subscribing channels.
func (ms *MetricSystem) reaper() {
ms.reaping = true
// create goroutine pool to handle multiple processing tasks at once
processChan := make(chan func(), 16)
for i := 0; i < int(math.Max(float64(runtime.NumCPU()/4), 4)); i++ {
go func() {
for {
c, ok := <-processChan
if !ok {
return
}
c()
}
}()
}
// begin reaper main loop
for {
// sleep until the next interval, or die if shutdownChan is closed
tts := ms.interval.Nanoseconds() -
(time.Now().UnixNano() % ms.interval.Nanoseconds())
select {
case <-time.After(time.Duration(tts)):
case <-ms.shutdownChan:
ms.reaping = false
close(processChan)
return
}
rawMetrics := ms.collectRawMetrics()
ms.updateSubscribers()
// broadcast raw metrics
for subscriber := range ms.rawSubscribers {
// new subscribers get all counters, otherwise just the new diffs
select {
case subscriber <- rawMetrics:
delete(ms.rawBadSubscribers, subscriber)
default:
ms.rawBadSubscribers[subscriber]++
glog.Error("a raw subscriber has allowed their channel to fill up. ",
"dropping their metrics on the floor rather than blocking.")
if ms.rawBadSubscribers[subscriber] >= 2 {
glog.Error("this raw subscriber has caused dropped metrics at ",
"least 3 times in a row. closing the channel.")
delete(ms.rawSubscribers, subscriber)
close(subscriber)
}
}
}
// Perform the rest in another goroutine since processing is not
// guaranteed to complete before the interval is up.
sendProcessed := func() {
// this is potentially expensive if there is a massive number of metrics
processedMetrics := ms.processMetrics(rawMetrics)
// add aggregate mean
for name := range rawMetrics.Histograms {
ms.histogramCountMu.RLock()
aggCountPtr, countPresent :=
ms.histogramCountStore[fmt.Sprintf("%s_count", name)]
aggCount := atomic.LoadUint64(aggCountPtr)
aggSumPtr, sumPresent :=
ms.histogramCountStore[fmt.Sprintf("%s_sum", name)]
aggSum := atomic.LoadUint64(aggSumPtr)
ms.histogramCountMu.RUnlock()
if countPresent && sumPresent && aggCount > 0 {
processedMetrics.Metrics[fmt.Sprintf("%s_agg_avg", name)] =
float64(aggSum / aggCount)
processedMetrics.Metrics[fmt.Sprintf("%s_agg_count", name)] =
float64(aggCount)
processedMetrics.Metrics[fmt.Sprintf("%s_agg_sum", name)] =
float64(aggSum)
}
}
// broadcast processed metrics
ms.subscribersMu.Lock()
for subscriber := range ms.processedSubscribers {
select {
case subscriber <- processedMetrics:
delete(ms.processedBadSubscribers, subscriber)
default:
ms.processedBadSubscribers[subscriber]++
glog.Error("a subscriber has allowed their channel to fill up. ",
"dropping their metrics on the floor rather than blocking.")
if ms.processedBadSubscribers[subscriber] >= 2 {
glog.Error("this subscriber has caused dropped metrics at ",
"least 3 times in a row. closing the channel.")
delete(ms.processedSubscribers, subscriber)
close(subscriber)
}
}
}
ms.subscribersMu.Unlock()
}
select {
case processChan <- sendProcessed:
default:
// processChan has filled up, this metric load is not sustainable
glog.Errorf("processing of metrics is taking longer than this node can "+
"handle. dropping this entire interval of %s metrics on the "+
"floor rather than blocking the reaper.", rawMetrics.Time)
}
} // end main reaper loop
}
// Start spawns a goroutine for merging metrics into caches from
// metric submitters, and a reaper goroutine that harvests metrics at the
// default interval of every 60 seconds.
func (ms *MetricSystem) Start() {
if !ms.reaping {
go ms.reaper()
}
}
// Stop shuts down a MetricSystem
func (ms *MetricSystem) Stop() {
close(ms.shutdownChan)
}


@@ -0,0 +1,363 @@
// Copyright 2014 The Cockroach Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
// implied. See the License for the specific language governing
// permissions and limitations under the License. See the AUTHORS file
// for names of contributors.
//
// Author: Tyler Neely (t@jujit.su)
package loghisto
import (
"fmt"
"math"
"runtime"
"testing"
"time"
)
func ExampleMetricSystem() {
ms := NewMetricSystem(time.Microsecond, true)
ms.Start()
myMetricStream := make(chan *ProcessedMetricSet, 2)
ms.SubscribeToProcessedMetrics(myMetricStream)
timeToken := ms.StartTimer("submit_metrics")
ms.Counter("range_splits", 1)
ms.Histogram("some_ipc_latency", 123)
timeToken.Stop()
processedMetricSet := <-myMetricStream
ms.UnsubscribeFromProcessedMetrics(myMetricStream)
m := processedMetricSet.Metrics
example := []struct {
Name string
Value float64
}{
{
"total range splits during the process lifetime",
m["range_splits"],
}, {
"range splits in this period",
m["range_splits_rate"],
}, {
"some_ipc 99.9th percentile",
m["some_ipc_latency_99.9"],
}, {
"some_ipc max",
m["some_ipc_latency_max"],
}, {
"some_ipc calls this period",
m["some_ipc_latency_count"],
}, {
"some_ipc calls during the process lifetime",
m["some_ipc_latency_agg_count"],
}, {
"some_ipc total latency this period",
m["some_ipc_latency_sum"],
}, {
"some_ipc mean this period",
m["some_ipc_latency_avg"],
}, {
"some_ipc aggregate mean",
m["some_ipc_latency_agg_avg"],
}, {
"time spent submitting metrics this period",
m["submit_metrics_sum"],
}, {
"number of goroutines",
m["sys.NumGoroutine"],
}, {
"time spent in GC",
m["sys.PauseTotalNs"],
},
}
for _, nameValue := range example {
var result string
if nameValue.Value == float64(0) {
result = "NOT present"
} else {
result = "present"
}
fmt.Println(nameValue.Name, result)
}
ms.Stop()
// Output:
// total range splits during the process lifetime present
// range splits in this period present
// some_ipc 99.9th percentile present
// some_ipc max present
// some_ipc calls this period present
// some_ipc calls during the process lifetime present
// some_ipc total latency this period present
// some_ipc mean this period present
// some_ipc aggregate mean present
// time spent submitting metrics this period present
// number of goroutines present
// time spent in GC present
}
func TestPercentile(t *testing.T) {
metrics := map[float64]uint64{
10: 9000,
25: 900,
33: 90,
47: 9,
500: 1,
}
percentileToExpected := map[float64]float64{
0: 10,
.99: 25,
.999: 33,
.9991: 47,
.9999: 47,
1: 500,
}
totalcount := uint64(0)
proportions := make([]proportion, 0, len(metrics))
for value, count := range metrics {
totalcount += count
proportions = append(proportions, proportion{Value: value, Count: count})
}
for p, expected := range percentileToExpected {
result, err := percentile(totalcount, proportions, p)
if err != nil {
t.Error("error:", err)
}
// results must be within 1% of their expected values.
diff := math.Abs(expected/result - 1)
if diff > .01 {
t.Errorf("percentile: %.04f, expected: %.04f, actual: %.04f, %% off: %.04f\n",
p, expected, result, diff*100)
}
}
}
func TestCompress(t *testing.T) {
toTest := []float64{
-421408208120481,
-1,
0,
1,
214141241241241,
}
for _, f := range toTest {
result := decompress(compress(f))
var diff float64
if result == 0 {
diff = math.Abs(f - result)
} else {
diff = math.Abs(f/result - 1)
}
if diff > .01 {
t.Errorf("expected: %f, actual: %f, %% off: %.04f\n",
f, result, diff*100)
}
}
}
func TestSysStats(t *testing.T) {
metricSystem := NewMetricSystem(time.Microsecond, true)
gauges := metricSystem.collectRawMetrics().Gauges
v, present := gauges["sys.Alloc"]
if v <= 0 || !present {
t.Errorf("expected positive reported allocated bytes, got %f\n", v)
}
}
func TestTimer(t *testing.T) {
metricSystem := NewMetricSystem(time.Microsecond, false)
token1 := metricSystem.StartTimer("timer1")
token2 := metricSystem.StartTimer("timer1")
time.Sleep(50 * time.Microsecond)
token1.Stop()
time.Sleep(5 * time.Microsecond)
token2.Stop()
token3 := metricSystem.StartTimer("timer1")
time.Sleep(10 * time.Microsecond)
token3.Stop()
result := metricSystem.processMetrics(metricSystem.collectRawMetrics()).Metrics
if result["timer1_min"] > result["timer1_50"] ||
result["timer1_50"] > result["timer1_max"] {
t.Error("bad result map:", result)
}
}
func TestRate(t *testing.T) {
metricSystem := NewMetricSystem(time.Microsecond, false)
metricSystem.Counter("rate1", 777)
time.Sleep(20 * time.Millisecond)
metrics := metricSystem.processMetrics(metricSystem.collectRawMetrics()).Metrics
if metrics["rate1_rate"] != 777 {
t.Error("count one value")
}
metricSystem.Counter("rate1", 1223)
time.Sleep(20 * time.Millisecond)
metrics = metricSystem.processMetrics(metricSystem.collectRawMetrics()).Metrics
if metrics["rate1_rate"] != 1223 {
t.Errorf("expected rate: 1223, actual: %f", metrics["rate1_rate"])
}
metricSystem.Counter("rate1", 1223)
metricSystem.Counter("rate1", 1223)
time.Sleep(20 * time.Millisecond)
metrics = metricSystem.processMetrics(metricSystem.collectRawMetrics()).Metrics
if metrics["rate1_rate"] != 2446 {
t.Errorf("expected rate: 2446, actual: %f", metrics["rate1_rate"])
}
}
func TestCounter(t *testing.T) {
metricSystem := NewMetricSystem(time.Microsecond, false)
metricSystem.Counter("counter1", 3290)
time.Sleep(20 * time.Millisecond)
metrics := metricSystem.processMetrics(metricSystem.collectRawMetrics()).Metrics
if metrics["counter1"] != 3290 {
t.Error("count one value", metrics)
}
metricSystem.Counter("counter1", 10000)
time.Sleep(20 * time.Millisecond)
metrics = metricSystem.processMetrics(metricSystem.collectRawMetrics()).Metrics
if metrics["counter1"] != 13290 {
t.Error("accumulate counts across broadcasts")
}
}
func TestUpdateSubscribers(t *testing.T) {
rawMetricStream := make(chan *RawMetricSet)
processedMetricStream := make(chan *ProcessedMetricSet)
metricSystem := NewMetricSystem(2*time.Microsecond, false)
metricSystem.SubscribeToRawMetrics(rawMetricStream)
metricSystem.SubscribeToProcessedMetrics(processedMetricStream)
metricSystem.Counter("counter5", 33)
go func() {
select {
case <-rawMetricStream:
case <-time.After(20 * time.Millisecond):
t.Error("received no raw metrics from the MetricSystem after 20 milliseconds.")
}
metricSystem.UnsubscribeFromRawMetrics(rawMetricStream)
}()
go func() {
select {
case <-processedMetricStream:
case <-time.After(20 * time.Millisecond):
t.Error("received no processed metrics from the MetricSystem after 20 milliseconds.")
}
metricSystem.UnsubscribeFromProcessedMetrics(processedMetricStream)
}()
metricSystem.Start()
time.Sleep(20 * time.Millisecond)
go func() {
select {
case <-rawMetricStream:
t.Error("received raw metrics from the MetricSystem after unsubscribing.")
default:
}
}()
go func() {
select {
case <-processedMetricStream:
t.Error("received processed metrics from the MetricSystem after unsubscribing.")
default:
}
}()
time.Sleep(20 * time.Millisecond)
}
func TestProcessedBroadcast(t *testing.T) {
processedMetricStream := make(chan *ProcessedMetricSet, 128)
metricSystem := NewMetricSystem(time.Microsecond, false)
metricSystem.SubscribeToProcessedMetrics(processedMetricStream)
metricSystem.Histogram("histogram1", 33)
metricSystem.Histogram("histogram1", 59)
metricSystem.Histogram("histogram1", 330000)
metricSystem.Start()
select {
case processedMetrics := <-processedMetricStream:
if int(processedMetrics.Metrics["histogram1_sum"]) != 331132 {
t.Error("expected histogram1_sum to be 331132, instead was",
processedMetrics.Metrics["histogram1_sum"])
}
if int(processedMetrics.Metrics["histogram1_agg_avg"]) != 110377 {
t.Error("expected histogram1_agg_avg to be 110377, instead was",
processedMetrics.Metrics["histogram1_agg_avg"])
}
if int(processedMetrics.Metrics["histogram1_count"]) != 3 {
t.Error("expected histogram1_count to be 3, instead was",
processedMetrics.Metrics["histogram1_count"])
}
case <-time.After(20 * time.Millisecond):
t.Error("received no metrics from the MetricSystem after 20 milliseconds.")
}
metricSystem.UnsubscribeFromProcessedMetrics(processedMetricStream)
metricSystem.Stop()
}
func TestRawBroadcast(t *testing.T) {
rawMetricStream := make(chan *RawMetricSet, 128)
metricSystem := NewMetricSystem(time.Microsecond, false)
metricSystem.SubscribeToRawMetrics(rawMetricStream)
metricSystem.Counter("counter2", 10)
metricSystem.Counter("counter2", 111)
metricSystem.Start()
select {
case rawMetrics := <-rawMetricStream:
if rawMetrics.Counters["counter2"] != 121 {
t.Error("expected counter2 to be 121, instead was",
rawMetrics.Counters["counter2"])
}
if rawMetrics.Rates["counter2"] != 121 {
t.Error("expected counter2 rate to be 121, instead was",
rawMetrics.Rates["counter2"])
}
case <-time.After(20 * time.Millisecond):
t.Error("received no metrics from the MetricSystem after 20 milliseconds.")
}
metricSystem.UnsubscribeFromRawMetrics(rawMetricStream)
metricSystem.Stop()
}
func TestMetricSystemStop(t *testing.T) {
metricSystem := NewMetricSystem(time.Microsecond, false)
startingRoutines := runtime.NumGoroutine()
metricSystem.Start()
metricSystem.Stop()
time.Sleep(20 * time.Millisecond)
endRoutines := runtime.NumGoroutine()
if startingRoutines < endRoutines {
t.Errorf("lingering goroutines have not been cleaned up: "+
"before: %d, after: %d\n", startingRoutines, endRoutines)
}
}


@@ -0,0 +1,85 @@
// Copyright 2014 The Cockroach Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
// implied. See the License for the specific language governing
// permissions and limitations under the License. See the AUTHORS file
// for names of contributors.
//
// Author: Tyler Neely (t@jujit.su)
package loghisto
import (
"bytes"
"fmt"
"os"
"strings"
)
type openTSDBStat struct {
Metric string
Time int64
Value float64
Tags map[string]string
}
type openTSDBStatArray []*openTSDBStat
func mapToTSDProtocolTags(tagMap map[string]string) string {
tags := make([]string, 0, len(tagMap))
for tag, value := range tagMap {
tags = append(tags, fmt.Sprintf("%s=%s", tag, value))
}
return strings.Join(tags, " ")
}
func (stats openTSDBStatArray) ToRequest() []byte {
var request bytes.Buffer
for _, stat := range stats {
request.Write([]byte(fmt.Sprintf("put %s %d %f %s\n",
stat.Metric,
stat.Time,
stat.Value,
mapToTSDProtocolTags(stat.Tags))))
}
return []byte(request.String())
}
func (metricSet *ProcessedMetricSet) toopenTSDBStats() openTSDBStatArray {
hostname, err := os.Hostname()
if err != nil {
hostname = "unknown"
}
stats := make([]*openTSDBStat, 0, len(metricSet.Metrics))
i := 0
for metric, value := range metricSet.Metrics {
var tags = map[string]string{
"host": hostname,
}
//TODO(tyler) custom tags
stats = append(stats, &openTSDBStat{
Metric: metric,
Time: metricSet.Time.Unix(),
Value: value,
Tags: tags,
})
i++
}
return stats
}
// OpenTSDBProtocol generates a wire representation of a ProcessedMetricSet
// for submission to an OpenTSDB instance.
func OpenTSDBProtocol(ms *ProcessedMetricSet) []byte {
return ms.toopenTSDBStats().ToRequest()
}


@@ -0,0 +1,23 @@
package loghisto
import (
"testing"
"time"
)
func TestOpenTSDB(t *testing.T) {
ms := NewMetricSystem(time.Second, true)
s := NewSubmitter(ms, OpenTSDBProtocol, "tcp", "localhost:7777")
s.Start()
metrics := &ProcessedMetricSet{
Time: time.Now(),
Metrics: map[string]float64{
"test.1": 43.32,
"test.2": 12.3,
},
}
request := s.serializer(metrics)
s.submit(request)
s.Shutdown()
}


@@ -0,0 +1,106 @@
// Copyright 2014 The Cockroach Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
// implied. See the License for the specific language governing
// permissions and limitations under the License. See the AUTHORS file
// for names of contributors.
//
// Author: Tyler Neely (t@jujit.su)
package loghisto
import (
"fmt"
"os"
"runtime"
"text/tabwriter"
"time"
)
// PrintBenchmark will run the provided function at the specified
// concurrency, time the operation, and once per second write the
// following information to standard out:
//
// 2014-08-09 17:44:57 -0400 EDT
// raft_AppendLogEntries_count: 16488
// raft_AppendLogEntries_max: 3.982478339757623e+07
// raft_AppendLogEntries_99.99: 3.864778314316012e+07
// raft_AppendLogEntries_99.9: 3.4366224772310276e+06
// raft_AppendLogEntries_99: 2.0228126576114902e+06
// raft_AppendLogEntries_50: 469769.7083161708
// raft_AppendLogEntries_min: 129313.15075081984
// raft_AppendLogEntries_sum: 9.975892639594093e+09
// raft_AppendLogEntries_avg: 605039.5827022133
// raft_AppendLogEntries_agg_avg: 618937
// raft_AppendLogEntries_agg_count: 121095
// raft_AppendLogEntries_agg_sum: 7.4950269894e+10
// sys.Alloc: 997328
// sys.NumGC: 1115
// sys.PauseTotalNs: 2.94946542e+08
// sys.NumGoroutine: 26
func PrintBenchmark(name string, concurrency uint, op func()) {
runtime.GOMAXPROCS(runtime.NumCPU())
var ms = NewMetricSystem(time.Second, true)
mc := make(chan *ProcessedMetricSet, 1)
ms.SubscribeToProcessedMetrics(mc)
ms.Start()
defer ms.Stop()
go receiver(name, mc)
for i := uint(0); i < concurrency; i++ {
go func() {
for {
timer := ms.StartTimer(name)
op()
timer.Stop()
}
}()
}
<-make(chan struct{})
}
func receiver(name string, mc chan *ProcessedMetricSet) {
interesting := []string{
fmt.Sprintf("%s_count", name),
fmt.Sprintf("%s_max", name),
fmt.Sprintf("%s_99.99", name),
fmt.Sprintf("%s_99.9", name),
fmt.Sprintf("%s_99", name),
fmt.Sprintf("%s_95", name),
fmt.Sprintf("%s_90", name),
fmt.Sprintf("%s_75", name),
fmt.Sprintf("%s_50", name),
fmt.Sprintf("%s_min", name),
fmt.Sprintf("%s_sum", name),
fmt.Sprintf("%s_avg", name),
fmt.Sprintf("%s_agg_avg", name),
fmt.Sprintf("%s_agg_count", name),
fmt.Sprintf("%s_agg_sum", name),
"sys.Alloc",
"sys.NumGC",
"sys.PauseTotalNs",
"sys.NumGoroutine",
}
w := new(tabwriter.Writer)
w.Init(os.Stdout, 0, 8, 0, '\t', 0)
for m := range mc {
fmt.Fprintln(w, m.Time)
for _, e := range interesting {
fmt.Fprintln(w, fmt.Sprintf("%s:\t", e), m.Metrics[e])
}
fmt.Fprintln(w)
w.Flush()
}
}


@@ -0,0 +1,113 @@
loghisto
============
[![Build Status](https://travis-ci.org/spacejam/loghisto.svg)](https://travis-ci.org/spacejam/loghisto)
A metric system for high performance counters and histograms. Unlike popular metric systems today, this does not destroy the accuracy of histograms by sampling. Instead, a logarithmic bucketing function compresses values, generally within 1% of their true value (although between 0 and 1 the precision loss may not be within this boundary). This allows for extreme compression, which allows us to calculate arbitrarily high percentiles with no loss of accuracy - just a small amount of precision. This is particularly useful for highly-clustered events that are tolerant of a small precision loss, but for which you REALLY care about what the tail looks like, such as measuring latency across a distributed system.
Copied out of my work for the CockroachDB metrics system. Based on an algorithm created by Keith Frost.
### running a print benchmark for quick analysis
```go
package main
import (
"runtime"
"github.com/spacejam/loghisto"
)
func benchmark() {
// do some stuff
}
func main() {
numCPU := runtime.NumCPU()
runtime.GOMAXPROCS(numCPU)
desiredConcurrency := uint(100)
loghisto.PrintBenchmark("benchmark1234", desiredConcurrency, benchmark)
}
```
results in something like this printed to stdout each second:
```
2014-12-11 21:41:45 -0500 EST
benchmark1234_count: 2.0171025e+07
benchmark1234_max: 2.4642914167480484e+07
benchmark1234_99.99: 4913.768840299134
benchmark1234_99.9: 1001.2472422902518
benchmark1234_99: 71.24044000732538
benchmark1234_95: 67.03348428941965
benchmark1234_90: 65.68633104092515
benchmark1234_75: 63.07152259993664
benchmark1234_50: 58.739891704145194
benchmark1234_min: -657.5233632152207 // Corollary: time.Since(time.Now()) is often < 0
benchmark1234_sum: 1.648051169322668e+09
benchmark1234_avg: 81.70388809307748
benchmark1234_agg_avg: 89
benchmark1234_agg_count: 6.0962226e+07
benchmark1234_agg_sum: 5.454779078e+09
sys.Alloc: 1.132672e+06
sys.NumGC: 5741
sys.PauseTotalNs: 1.569390954e+09
sys.NumGoroutine: 113
```
### adding an embedded metric system to your code
```go
import (
"time"
"fmt"
"github.com/spacejam/loghisto"
)
func ExampleMetricSystem() {
// Create metric system that reports once a minute, and includes stats
// about goroutines, memory usage and GC.
includeGoProcessStats := true
ms := loghisto.NewMetricSystem(time.Minute, includeGoProcessStats)
ms.Start()
// create a channel that subscribes to metrics as they are produced once
// per minute.
// NOTE: if you allow this channel to fill up, the metric system will NOT
// block, and will FORGET about your channel if you fail to unblock the
// channel after 3 configured intervals (in this case 3 minutes) rather
// than causing a memory leak.
myMetricStream := make(chan *loghisto.ProcessedMetricSet, 2)
ms.SubscribeToProcessedMetrics(myMetricStream)
// create some metrics
timeToken := ms.StartTimer("time for creating a counter and histo")
ms.Counter("some event", 1)
ms.Histogram("some measured thing", 123)
timeToken.Stop()
for m := range myMetricStream {
fmt.Printf("number of goroutines: %f\n", m.Metrics["sys.NumGoroutine"])
}
// if you want to manually unsubscribe from the metric stream
ms.UnsubscribeFromProcessedMetrics(myMetricStream)
// to stop and clean up your metric system
ms.Stop()
}
```
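The NOTE in the example above describes a publisher that never blocks on a slow subscriber and eventually forgets it. A minimal sketch of that pattern follows, assuming a `broadcaster` type and a `maxMisses` threshold that are invented for this illustration and are not the library's actual internals. It relies on a `select` with a `default` branch, so a full channel costs the subscriber a missed value instead of stalling everyone else.

```go
package main

import "fmt"

// broadcaster is a hypothetical stand-in for a metric system's fan-out logic.
type broadcaster struct {
	subscribers map[chan int]int // channel -> consecutive missed deliveries
	maxMisses   int              // assumed threshold before a channel is dropped
}

// publish offers v to every subscriber without blocking. A subscriber whose
// buffer is full misses the value; after maxMisses consecutive misses its
// channel is closed and removed.
func (b *broadcaster) publish(v int) {
	for ch, misses := range b.subscribers {
		select {
		case ch <- v:
			b.subscribers[ch] = 0 // delivery succeeded, reset the streak
		default:
			misses++
			if misses >= b.maxMisses {
				delete(b.subscribers, ch)
				close(ch)
				continue
			}
			b.subscribers[ch] = misses
		}
	}
}

func main() {
	slow := make(chan int, 1) // never drained
	fast := make(chan int, 1) // drained after every publish
	b := &broadcaster{
		subscribers: map[chan int]int{slow: 0, fast: 0},
		maxMisses:   3,
	}
	for i := 1; i <= 4; i++ {
		b.publish(i)
		<-fast
	}
	fmt.Println("subscribers still registered:", len(b.subscribers))
}
```

A consumer should therefore size its channel buffer for the longest pause it expects, and treat a closed channel as a signal to resubscribe.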
### automatically sending your metrics to OpenTSDB, KairosDB or Graphite
```go
func ExampleExternalSubmitter() {
includeGoProcessStats := true
ms := NewMetricSystem(time.Minute, includeGoProcessStats)
ms.Start()
// graphite
graphiteSubmitter := NewSubmitter(ms, GraphiteProtocol, "tcp", "localhost:7777")
graphiteSubmitter.Start()
// opentsdb / kairosdb
tsdbSubmitter := NewSubmitter(ms, OpenTSDBProtocol, "tcp", "localhost:7777")
tsdbSubmitter.Start()
// to tear down:
graphiteSubmitter.Shutdown()
tsdbSubmitter.Shutdown()
}
```
To add your own output plugin, see the code for the Graphite and OpenTSDB protocols; it is pretty simple.


@@ -0,0 +1,159 @@
// Copyright 2014 The Cockroach Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
// implied. See the License for the specific language governing
// permissions and limitations under the License. See the AUTHORS file
// for names of contributors.
//
// Author: Tyler Neely (t@jujit.su)
package loghisto
import (
"net"
"sync"
"time"
)
type requestable interface{}
type requestableArray interface {
ToRequest() []byte
}
// Submitter encapsulates the state of a metric submitter.
type Submitter struct {
// backlog works as an evicting queue
backlog [60][]byte
backlogHead uint
backlogTail uint
backlogMu sync.Mutex
serializer func(*ProcessedMetricSet) []byte
DestinationNetwork string
DestinationAddress string
metricSystem *MetricSystem
metricChan chan *ProcessedMetricSet
shutdownChan chan struct{}
}
// NewSubmitter creates a Submitter that receives metrics off of a
// specified metric channel, serializes them using the provided
// serialization function, and attempts to send them to the
// specified destination.
func NewSubmitter(metricSystem *MetricSystem,
serializer func(*ProcessedMetricSet) []byte, destinationNetwork string,
destinationAddress string) *Submitter {
metricChan := make(chan *ProcessedMetricSet, 60)
metricSystem.SubscribeToProcessedMetrics(metricChan)
return &Submitter{
backlog: [60][]byte{},
backlogHead: 0,
backlogTail: 0,
serializer: serializer,
DestinationNetwork: destinationNetwork,
DestinationAddress: destinationAddress,
metricSystem: metricSystem,
metricChan: metricChan,
shutdownChan: make(chan struct{}),
}
}
func (s *Submitter) retryBacklog() error {
var request []byte
for {
s.backlogMu.Lock()
head := s.backlogHead
tail := s.backlogTail
if head != tail {
request = s.backlog[head]
}
s.backlogMu.Unlock()
if head == tail {
return nil
}
err := s.submit(request)
if err != nil {
return err
}
s.backlogMu.Lock()
s.backlogHead = (s.backlogHead + 1) % 60
s.backlogMu.Unlock()
}
}
func (s *Submitter) appendToBacklog(request []byte) {
s.backlogMu.Lock()
s.backlog[s.backlogTail] = request
s.backlogTail = (s.backlogTail + 1) % 60
// if we've run into the head, evict it
if s.backlogHead == s.backlogTail {
s.backlogHead = (s.backlogHead + 1) % 60
}
s.backlogMu.Unlock()
}
func (s *Submitter) submit(request []byte) error {
conn, err := net.DialTimeout(s.DestinationNetwork, s.DestinationAddress,
5*time.Second)
if err != nil {
return err
}
conn.SetDeadline(time.Now().Add(5 * time.Second))
_, err = conn.Write(request)
conn.Close()
return err
}
// Start creates the goroutines that receive, serialize, and send metrics.
func (s *Submitter) Start() {
go func() {
for {
select {
case metrics, ok := <-s.metricChan:
if !ok {
// We can no longer make progress.
return
}
request := s.serializer(metrics)
s.appendToBacklog(request)
case <-s.shutdownChan:
return
}
}
}()
go func() {
for {
select {
case <-s.shutdownChan:
return
default:
s.retryBacklog()
tts := s.metricSystem.interval.Nanoseconds() -
(time.Now().UnixNano() % s.metricSystem.interval.Nanoseconds())
time.Sleep(time.Duration(tts))
}
}
}()
}
// Shutdown shuts down a submitter
func (s *Submitter) Shutdown() {
select {
case <-s.shutdownChan:
// already closed
default:
close(s.shutdownChan)
}
}


@@ -2,4 +2,5 @@
etcd1: bin/etcd -name infra1 -listen-client-urls http://127.0.0.1:12379 -advertise-client-urls http://127.0.0.1:12379 -listen-peer-urls http://127.0.0.1:12380 -initial-advertise-peer-urls http://127.0.0.1:12380 -initial-cluster-token etcd-cluster-1 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' -initial-cluster-state new
etcd2: bin/etcd -name infra2 -listen-client-urls http://127.0.0.1:22379 -advertise-client-urls http://127.0.0.1:22379 -listen-peer-urls http://127.0.0.1:22380 -initial-advertise-peer-urls http://127.0.0.1:22380 -initial-cluster-token etcd-cluster-1 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' -initial-cluster-state new
etcd3: bin/etcd -name infra3 -listen-client-urls http://127.0.0.1:32379 -advertise-client-urls http://127.0.0.1:32379 -listen-peer-urls http://127.0.0.1:32380 -initial-advertise-peer-urls http://127.0.0.1:32380 -initial-cluster-token etcd-cluster-1 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' -initial-cluster-state new
proxy: bin/etcd -name infra-proxy1 -proxy=on -listen-client-urls http://127.0.0.1:2379 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380'
# wait 3 seconds until initial cluster is ready to be proxied.
proxy: sleep 3s && bin/etcd -name infra-proxy1 -proxy=on -listen-client-urls http://127.0.0.1:2379 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380'


@@ -1,8 +1,11 @@
# etcd
[![Build Status](https://travis-ci.org/coreos/etcd.png?branch=master)](https://travis-ci.org/coreos/etcd)
[![Build Status](https://travis-ci.org/coreos/etcd.svg?branch=master)](https://travis-ci.org/coreos/etcd)
[![Build Status](https://semaphoreci.com/api/v1/projects/406f9909-2f4f-4839-b59e-95082cb088f1/575109/badge.svg)](https://semaphoreci.com/coreos/etcd)
[![Docker Repository on Quay.io](https://quay.io/repository/coreos/etcd-git/status "Docker Repository on Quay.io")](https://quay.io/repository/coreos/etcd-git)
**Note**: `master` branch may be in *unstable or even broken state* during development. Please use [releases][github-release] instead of `master` branch to get stable binaries.
![etcd Logo](logos/etcd-horizontal-color.png)
etcd is a distributed, consistent key-value store for shared configuration and service discovery, with a focus on being:
@@ -26,9 +29,10 @@ If you're considering etcd for production use, please see: [production-ready.md]
### Getting etcd
The easiest way to get etcd is to install one of the pre-built binaries from the tagged releases: instructions are available on [GitHub][github-release].
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, AppC (ACI), and Docker. Instructions for using these binaries are on the [GitHub releases page][github-release].
For those wanting to try the very latest version, you can build the latest version of etcd from the `master` branch.
You will first need [*Go*](https://golang.org/) installed on your machine (version 1.4+ is required).
All development occurs on `master`, including new features and bug fixes.
Bug fixes are first targeted at `master` and subsequently ported to release branches, as described in the [branch management][branch-management] guide.
@@ -101,7 +105,7 @@ See [CONTRIBUTING](CONTRIBUTING.md) for details on submitting patches and the co
## Reporting bugs
See [reporting bugs](Documentation/reporting_bugs.md) for details about reporting any issue you may encounter..
See [reporting bugs](Documentation/reporting_bugs.md) for details about reporting any issue you may encounter.
## Project Details

@@ -8,12 +8,51 @@ The dates below should not be considered authoritative, but rather indicative of
etcd 2.2 is our current stable branch. The roadmap below outlines new features that will be added to etcd, and while subject to change, define what future stable will look like.
### etcd 2.3alpha (November)
- v3 API preview
- support clustered API
- use gRPC error code
- initial API level testing
- transactions
- basic runtime metrics
- better backend
- benchmark memory usage
- experimental v3 compatibility
- store v2 snapshot into new backend
- move snapshot logic out of raft to support new snapshot work-flow
### etcd 2.3 (November)
- improved v3 API preview
- initial performance benchmark for get/put/delete
- support watch API
- improved runtime metrics
- raft state machine
- new backend
- V3 API
- better backend
- fully tested backend
- benchmark performance for key operations
### etcd 3.0 (January)
- v3 API ([see also the issue tag](https://github.com/coreos/etcd/issues?utf8=%E2%9C%93&q=label%3Av3api))
- Transactions
- Leases
- Binary protocol
- Support a large number of watchers
- Better disk backend
- Improved write throughput
- Support larger datasets and histories
- v3 API ([see also the issue tag](https://github.com/coreos/etcd/issues?utf8=%E2%9C%93&q=label%3Aarea/v3api))
- Leases
- Binary protocol
- Support a large number of watchers
- Failure guarantees documented
- Simple v3 client (golang)
### etcd 3.1 (February)
- v3 API
- Locking
- Better disk backend
- Improved write throughput
- Support larger datasets and histories
- Simpler disaster recovery UX
- Integrated with Kubernetes
### etcd 3.2 (March)
- API bindings for other languages
### etcd 3.+ (future)
- Mirroring
- Horizontally scalable proxy layer

V3DemoProcfile

@@ -0,0 +1,5 @@
# Use goreman to run this Procfile; install it with `go get github.com/mattn/goreman`
etcd1: bin/etcd --experimental-v3demo=true --experimental-gRPC-addr 127.0.0.1:12378 -name infra1 -listen-client-urls http://127.0.0.1:12379 -advertise-client-urls http://127.0.0.1:12379 -listen-peer-urls http://127.0.0.1:12380 -initial-advertise-peer-urls http://127.0.0.1:12380 -initial-cluster-token etcd-cluster-1 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' -initial-cluster-state new
etcd2: bin/etcd --experimental-v3demo=true --experimental-gRPC-addr 127.0.0.1:22378 -name infra2 -listen-client-urls http://127.0.0.1:22379 -advertise-client-urls http://127.0.0.1:22379 -listen-peer-urls http://127.0.0.1:22380 -initial-advertise-peer-urls http://127.0.0.1:22380 -initial-cluster-token etcd-cluster-1 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' -initial-cluster-state new
etcd3: bin/etcd --experimental-v3demo=true --experimental-gRPC-addr 127.0.0.1:32378 -name infra3 -listen-client-urls http://127.0.0.1:32379 -advertise-client-urls http://127.0.0.1:32379 -listen-peer-urls http://127.0.0.1:32380 -initial-advertise-peer-urls http://127.0.0.1:32380 -initial-cluster-token etcd-cluster-1 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' -initial-cluster-state new
proxy: bin/etcd -name infra-proxy1 -proxy=on -listen-client-urls http://127.0.0.1:2379 -initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380'

@@ -35,9 +35,25 @@ func main() {
log.Fatal(err)
}
kapi := client.NewKeysAPI(c)
resp, err := kapi.Set(context.Background(), "foo", "bar", nil)
// set "/foo" key with "bar" value
log.Print("Setting '/foo' key with 'bar' value")
resp, err := kapi.Set(context.Background(), "/foo", "bar", nil)
if err != nil {
log.Fatal(err)
} else {
// print common key info
log.Printf("Set is done. Metadata is %q\n", resp)
}
// get "/foo" key's value
log.Print("Getting '/foo' key value")
resp, err = kapi.Get(context.Background(), "/foo", nil)
if err != nil {
log.Fatal(err)
} else {
// print common key info
log.Printf("Get is done. Metadata is %q\n", resp)
// print value
log.Printf("%q key has %q value\n", resp.Node.Key, resp.Node.Value)
}
}
```

@@ -16,6 +16,6 @@ package client
// Discoverer is an interface that wraps the Discover method.
type Discoverer interface {
// Dicover looks up the etcd servers for the domain.
// Discover looks up the etcd servers for the domain.
Discover(domain string) ([]string, error)
}

@@ -1,3 +1,17 @@
// Copyright 2015 CoreOS, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package client
import (

@@ -84,7 +84,7 @@ func TestSRVDiscover(t *testing.T) {
if service == "etcd-server" {
return "", tt.withoutSSL, nil
}
return "", nil, errors.New("Unkown service in mock")
return "", nil, errors.New("Unknown service in mock")
}
d := NewSRVDiscover()

contrib/README.md

@@ -0,0 +1,5 @@
## contrib
Scripts and files which may be useful but aren't part of the core etcd project.
- [systemd](systemd) - an example unit file for deploying etcd on systemd-based distributions

@@ -0,0 +1,16 @@
[Unit]
Description=etcd key-value store
Documentation=https://github.com/coreos/etcd
[Service]
User=etcd
Type=notify
Environment=ETCD_DATA_DIR=/var/lib/etcd
Environment=ETCD_NAME=%m
ExecStart=/usr/bin/etcd
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target

@@ -106,7 +106,7 @@ func TestSRVGetCluster(t *testing.T) {
if service == "etcd-server" {
return "", tt.withoutSSL, nil
}
return "", nil, errors.New("Unkown service in mock")
return "", nil, errors.New("Unknown service in mock")
}
resolveTCPAddr = func(network, addr string) (*net.TCPAddr, error) {
if tt.dns == nil || tt.dns[addr] == "" {

@@ -143,7 +143,7 @@ func (e Error) toJsonString() string {
return string(b)
}
func (e Error) statusCode() int {
func (e Error) StatusCode() int {
status, ok := errorStatus[e.ErrorCode]
if !ok {
status = http.StatusBadRequest
@@ -154,6 +154,6 @@ func (e Error) statusCode() int {
func (e Error) WriteTo(w http.ResponseWriter) {
w.Header().Add("X-Etcd-Index", fmt.Sprint(e.Index))
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(e.statusCode())
w.WriteHeader(e.StatusCode())
fmt.Fprintln(w, e.toJsonString())
}

@@ -28,8 +28,8 @@ func TestErrorWriteTo(t *testing.T) {
rr := httptest.NewRecorder()
err.WriteTo(rr)
if err.statusCode() != rr.Code {
t.Errorf("HTTP status code %d, want %d", rr.Code, err.statusCode())
if err.StatusCode() != rr.Code {
t.Errorf("HTTP status code %d, want %d", rr.Code, err.StatusCode())
}
gbody := strings.TrimSuffix(rr.Body.String(), "\n")

@@ -1,3 +1,17 @@
// Copyright 2015 CoreOS, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package command
import (

@@ -1,3 +1,17 @@
// Copyright 2015 CoreOS, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package command
import (

@@ -158,7 +158,7 @@ func actionMemberRemove(c *cli.Context) {
// Actually attempt to remove the member.
err = mAPI.Remove(ctx, removalID)
if err != nil {
fmt.Fprintf(os.Stderr, "Recieved an error trying to remove member %s: %s", removalID, err.Error())
fmt.Fprintf(os.Stderr, "Received an error trying to remove member %s: %s", removalID, err.Error())
os.Exit(1)
}

@@ -23,6 +23,7 @@ import (
"net/url"
"os"
"strings"
"time"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/bgentry/speakeasy"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/codegangsta/cli"
@@ -33,6 +34,10 @@ import (
var (
ErrNoAvailSrc = errors.New("no available argument and stdin")
// defaultDialTimeout is the maximum amount of time a dial will wait for a connection to be set up.
// 30s is long enough for most network conditions.
defaultDialTimeout = 30 * time.Second
)
// trimsplit slices s into all substrings separated by sep and returns a
@@ -153,7 +158,7 @@ func getTransport(c *cli.Context) (*http.Transport, error) {
CertFile: certfile,
KeyFile: keyfile,
}
return transport.NewTransport(tls)
return transport.NewTransport(tls, defaultDialTimeout)
}
func getUsernamePasswordFromFlag(usernameFlag string) (username string, password string, err error) {

@@ -0,0 +1,55 @@
// Copyright 2015 CoreOS, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package command
import (
"strconv"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/codegangsta/cli"
"github.com/coreos/etcd/Godeps/_workspace/src/golang.org/x/net/context"
"github.com/coreos/etcd/Godeps/_workspace/src/google.golang.org/grpc"
pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
)
// NewCompactionCommand returns the CLI command for "compaction".
func NewCompactionCommand() cli.Command {
return cli.Command{
Name: "compaction",
Action: func(c *cli.Context) {
compactionCommandFunc(c)
},
}
}
// compactionCommandFunc executes the "compaction" command.
func compactionCommandFunc(c *cli.Context) {
if len(c.Args()) != 1 {
panic("bad arg")
}
rev, err := strconv.ParseInt(c.Args()[0], 10, 64)
if err != nil {
panic("bad arg")
}
conn, err := grpc.Dial(c.GlobalString("endpoint"))
if err != nil {
panic(err)
}
kv := pb.NewKVClient(conn)
req := &pb.CompactionRequest{Revision: rev}
kv.Compact(context.Background(), req)
}

@@ -44,14 +44,14 @@ func deleteRangeCommandFunc(c *cli.Context) {
if len(c.Args()) > 1 {
rangeEnd = []byte(c.Args()[1])
}
conn, err := grpc.Dial("127.0.0.1:12379")
conn, err := grpc.Dial(c.GlobalString("endpoint"))
if err != nil {
panic(err)
}
etcd := pb.NewEtcdClient(conn)
kv := pb.NewKVClient(conn)
req := &pb.DeleteRangeRequest{Key: key, RangeEnd: rangeEnd}
etcd.DeleteRange(context.Background(), req)
kv.DeleteRange(context.Background(), req)
if rangeEnd != nil {
fmt.Printf("range [%s, %s) is deleted\n", string(key), string(rangeEnd))

@@ -41,13 +41,13 @@ func putCommandFunc(c *cli.Context) {
key := []byte(c.Args()[0])
value := []byte(c.Args()[1])
conn, err := grpc.Dial("127.0.0.1:12379")
conn, err := grpc.Dial(c.GlobalString("endpoint"))
if err != nil {
panic(err)
}
etcd := pb.NewEtcdClient(conn)
kv := pb.NewKVClient(conn)
req := &pb.PutRequest{Key: key, Value: value}
etcd.Put(context.Background(), req)
kv.Put(context.Background(), req)
fmt.Printf("%s %s\n", key, value)
}

@@ -44,14 +44,14 @@ func rangeCommandFunc(c *cli.Context) {
if len(c.Args()) > 1 {
rangeEnd = []byte(c.Args()[1])
}
conn, err := grpc.Dial("127.0.0.1:12379")
conn, err := grpc.Dial(c.GlobalString("endpoint"))
if err != nil {
panic(err)
}
etcd := pb.NewEtcdClient(conn)
kv := pb.NewKVClient(conn)
req := &pb.RangeRequest{Key: key, RangeEnd: rangeEnd}
resp, err := etcd.Range(context.Background(), req)
resp, err := kv.Range(context.Background(), req)
for _, kv := range resp.Kvs {
fmt.Printf("%s %s\n", string(kv.Key), string(kv.Value))
}

@@ -51,13 +51,13 @@ func txnCommandFunc(c *cli.Context) {
next = next(txn, reader)
}
conn, err := grpc.Dial("127.0.0.1:12379")
conn, err := grpc.Dial(c.GlobalString("endpoint"))
if err != nil {
panic(err)
}
etcd := pb.NewEtcdClient(conn)
kv := pb.NewKVClient(conn)
resp, err := etcd.Txn(context.Background(), txn)
resp, err := kv.Txn(context.Background(), txn)
if err != nil {
fmt.Println(err)
}

@@ -0,0 +1,101 @@
// Copyright 2015 CoreOS, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package command
import (
"bufio"
"fmt"
"io"
"os"
"strings"
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/codegangsta/cli"
"github.com/coreos/etcd/Godeps/_workspace/src/golang.org/x/net/context"
"github.com/coreos/etcd/Godeps/_workspace/src/google.golang.org/grpc"
pb "github.com/coreos/etcd/etcdserver/etcdserverpb"
)
// NewWatchCommand returns the CLI command for "watch".
func NewWatchCommand() cli.Command {
return cli.Command{
Name: "watch",
Action: func(c *cli.Context) {
watchCommandFunc(c)
},
}
}
// watchCommandFunc executes the "watch" command.
func watchCommandFunc(c *cli.Context) {
conn, err := grpc.Dial(c.GlobalString("endpoint"))
if err != nil {
panic(err)
}
wAPI := pb.NewWatchClient(conn)
wStream, err := wAPI.Watch(context.TODO())
if err != nil {
panic(err)
}
go recvLoop(wStream)
reader := bufio.NewReader(os.Stdin)
for {
l, err := reader.ReadString('\n')
if err != nil {
fmt.Fprintf(os.Stderr, "Error reading watch request line: %v", err)
os.Exit(1)
}
l = strings.TrimSuffix(l, "\n")
// TODO: support start and end revision
segs := strings.Split(l, " ")
if len(segs) != 2 {
fmt.Fprintf(os.Stderr, "Invalid watch request format: use watch key or watchprefix prefix\n")
continue
}
var r *pb.WatchRequest
switch segs[0] {
case "watch":
r = &pb.WatchRequest{Key: []byte(segs[1])}
case "watchprefix":
r = &pb.WatchRequest{Prefix: []byte(segs[1])}
default:
fmt.Fprintf(os.Stderr, "Invalid watch request format: use watch key or watchprefix prefix\n")
continue
}
err = wStream.Send(r)
if err != nil {
fmt.Fprintf(os.Stderr, "Error sending request to server: %v\n", err)
}
}
}
func recvLoop(wStream pb.Watch_WatchClient) {
for {
resp, err := wStream.Recv()
if err == io.EOF {
os.Exit(0)
}
if err != nil {
panic(err)
}
fmt.Printf("%s: %s %s\n", resp.Event.Type, string(resp.Event.Kv.Key), string(resp.Event.Kv.Value))
}
}
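The line parsing inside `watchCommandFunc` can be exercised without a gRPC connection if it is pulled into a helper. The sketch below does that under stated assumptions: `watchRequest` is a hypothetical stand-in for `pb.WatchRequest`, and `parseWatchLine` and `errBadFormat` are names introduced here for illustration, not part of the commit.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// watchRequest is a hypothetical stand-in for pb.WatchRequest.
type watchRequest struct {
	Key    []byte
	Prefix []byte
}

var errBadFormat = errors.New("invalid watch request format: use 'watch key' or 'watchprefix prefix'")

// parseWatchLine turns one stdin line into a request, applying the same
// rules as the loop above: exactly two space-separated fields, where the
// first is either "watch" or "watchprefix".
func parseWatchLine(l string) (*watchRequest, error) {
	l = strings.TrimSuffix(l, "\n")
	segs := strings.Split(l, " ")
	if len(segs) != 2 {
		return nil, errBadFormat
	}
	switch segs[0] {
	case "watch":
		return &watchRequest{Key: []byte(segs[1])}, nil
	case "watchprefix":
		return &watchRequest{Prefix: []byte(segs[1])}, nil
	default:
		return nil, errBadFormat
	}
}

func main() {
	for _, l := range []string{"watch foo\n", "watchprefix fo\n", "get foo\n", "watch\n"} {
		r, err := parseWatchLine(l)
		if err != nil {
			fmt.Printf("%q: %v\n", l, err)
			continue
		}
		fmt.Printf("%q: key=%q prefix=%q\n", l, r.Key, r.Prefix)
	}
}
```

Keeping the parser separate from the send loop would let the format rules be unit tested independently of the stream.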

@@ -27,11 +27,16 @@ func main() {
app.Name = "etcdctlv3"
app.Version = version.Version
app.Usage = "A simple command line client for etcd3."
app.Flags = []cli.Flag{
cli.StringFlag{Name: "endpoint", Value: "127.0.0.1:2378", Usage: "gRPC endpoint"},
}
app.Commands = []cli.Command{
command.NewRangeCommand(),
command.NewPutCommand(),
command.NewDeleteRangeCommand(),
command.NewTxnCommand(),
command.NewCompactionCommand(),
command.NewWatchCommand(),
}
app.Run(os.Args)

@@ -96,6 +96,7 @@ type config struct {
fallback *flags.StringsFlag
initialCluster string
initialClusterToken string
strictReconfigCheck bool
// proxy
proxy *flags.StringsFlag
@@ -117,7 +118,8 @@ type config struct {
printVersion bool
v3demo bool
v3demo bool
gRPCAddr string
ignored []string
}
@@ -149,35 +151,36 @@ func NewConfig() *config {
// member
fs.Var(cfg.corsInfo, "cors", "Comma-separated white list of origins for CORS (cross-origin resource sharing).")
fs.StringVar(&cfg.dir, "data-dir", "", "Path to the data directory")
fs.StringVar(&cfg.walDir, "wal-dir", "", "Path to the dedicated wal directory")
fs.Var(flags.NewURLsValue("http://localhost:2380,http://localhost:7001"), "listen-peer-urls", "List of URLs to listen on for peer traffic")
fs.Var(flags.NewURLsValue("http://localhost:2379,http://localhost:4001"), "listen-client-urls", "List of URLs to listen on for client traffic")
fs.UintVar(&cfg.maxSnapFiles, "max-snapshots", defaultMaxSnapshots, "Maximum number of snapshot files to retain (0 is unlimited)")
fs.UintVar(&cfg.maxWalFiles, "max-wals", defaultMaxWALs, "Maximum number of wal files to retain (0 is unlimited)")
fs.StringVar(&cfg.name, "name", defaultName, "Unique human-readable name for this node")
fs.Uint64Var(&cfg.snapCount, "snapshot-count", etcdserver.DefaultSnapCount, "Number of committed transactions to trigger a snapshot")
fs.StringVar(&cfg.dir, "data-dir", "", "Path to the data directory.")
fs.StringVar(&cfg.walDir, "wal-dir", "", "Path to the dedicated wal directory.")
fs.Var(flags.NewURLsValue("http://localhost:2380,http://localhost:7001"), "listen-peer-urls", "List of URLs to listen on for peer traffic.")
fs.Var(flags.NewURLsValue("http://localhost:2379,http://localhost:4001"), "listen-client-urls", "List of URLs to listen on for client traffic.")
fs.UintVar(&cfg.maxSnapFiles, "max-snapshots", defaultMaxSnapshots, "Maximum number of snapshot files to retain (0 is unlimited).")
fs.UintVar(&cfg.maxWalFiles, "max-wals", defaultMaxWALs, "Maximum number of wal files to retain (0 is unlimited).")
fs.StringVar(&cfg.name, "name", defaultName, "Unique human-readable name for this node.")
fs.Uint64Var(&cfg.snapCount, "snapshot-count", etcdserver.DefaultSnapCount, "Number of committed transactions to trigger a snapshot.")
fs.UintVar(&cfg.TickMs, "heartbeat-interval", 100, "Time (in milliseconds) of a heartbeat interval.")
fs.UintVar(&cfg.ElectionMs, "election-timeout", 1000, "Time (in milliseconds) for an election to timeout.")
// clustering
fs.Var(flags.NewURLsValue(defaultInitialAdvertisePeerURLs), "initial-advertise-peer-urls", "List of this member's peer URLs to advertise to the rest of the cluster")
fs.Var(flags.NewURLsValue("http://localhost:2379,http://localhost:4001"), "advertise-client-urls", "List of this member's client URLs to advertise to the rest of the cluster")
fs.StringVar(&cfg.durl, "discovery", "", "Discovery service used to bootstrap the initial cluster")
fs.Var(flags.NewURLsValue(defaultInitialAdvertisePeerURLs), "initial-advertise-peer-urls", "List of this member's peer URLs to advertise to the rest of the cluster.")
fs.Var(flags.NewURLsValue("http://localhost:2379,http://localhost:4001"), "advertise-client-urls", "List of this member's client URLs to advertise to the rest of the cluster.")
fs.StringVar(&cfg.durl, "discovery", "", "Discovery service used to bootstrap the initial cluster.")
fs.Var(cfg.fallback, "discovery-fallback", fmt.Sprintf("Valid values include %s", strings.Join(cfg.fallback.Values, ", ")))
if err := cfg.fallback.Set(fallbackFlagProxy); err != nil {
// Should never happen.
plog.Panicf("unexpected error setting up discovery-fallback flag: %v", err)
}
fs.StringVar(&cfg.dproxy, "discovery-proxy", "", "HTTP proxy to use for traffic to discovery service")
fs.StringVar(&cfg.dnsCluster, "discovery-srv", "", "DNS domain used to bootstrap initial cluster")
fs.StringVar(&cfg.initialCluster, "initial-cluster", initialClusterFromName(defaultName), "Initial cluster configuration for bootstrapping")
fs.StringVar(&cfg.initialClusterToken, "initial-cluster-token", "etcd-cluster", "Initial cluster token for the etcd cluster during bootstrap")
fs.Var(cfg.clusterState, "initial-cluster-state", "Initial cluster configuration for bootstrapping")
fs.StringVar(&cfg.dproxy, "discovery-proxy", "", "HTTP proxy to use for traffic to discovery service.")
fs.StringVar(&cfg.dnsCluster, "discovery-srv", "", "DNS domain used to bootstrap initial cluster.")
fs.StringVar(&cfg.initialCluster, "initial-cluster", initialClusterFromName(defaultName), "Initial cluster configuration for bootstrapping.")
fs.StringVar(&cfg.initialClusterToken, "initial-cluster-token", "etcd-cluster", "Initial cluster token for the etcd cluster during bootstrap.")
fs.Var(cfg.clusterState, "initial-cluster-state", "Initial cluster configuration for bootstrapping.")
if err := cfg.clusterState.Set(clusterStateFlagNew); err != nil {
// Should never happen.
plog.Panicf("unexpected error setting up clusterStateFlag: %v", err)
}
fs.BoolVar(&cfg.strictReconfigCheck, "strict-reconfig-check", false, "Reject reconfiguration that might cause quorum loss.")
// proxy
fs.Var(cfg.proxy, "proxy", fmt.Sprintf("Valid values include %s", strings.Join(cfg.proxy.Values, ", ")))
@@ -205,24 +208,25 @@ func NewConfig() *config {
// logging
fs.BoolVar(&cfg.debug, "debug", false, "Enable debug output to the logs.")
fs.StringVar(&cfg.logPkgLevels, "log-package-levels", "", "Specify a particular log level for each etcd package.")
fs.StringVar(&cfg.logPkgLevels, "log-package-levels", "", "Specify a particular log level for each etcd package (eg: 'etcdmain=CRITICAL,etcdserver=DEBUG').")
// unsafe
fs.BoolVar(&cfg.forceNewCluster, "force-new-cluster", false, "Force to create a new one member cluster")
fs.BoolVar(&cfg.forceNewCluster, "force-new-cluster", false, "Force to create a new one member cluster.")
// version
fs.BoolVar(&cfg.printVersion, "version", false, "Print the version and exit")
fs.BoolVar(&cfg.printVersion, "version", false, "Print the version and exit.")
// demo flag
fs.BoolVar(&cfg.v3demo, "experimental-v3demo", false, "Enable experimental v3 demo API")
fs.BoolVar(&cfg.v3demo, "experimental-v3demo", false, "Enable experimental v3 demo API.")
fs.StringVar(&cfg.gRPCAddr, "experimental-gRPC-addr", "127.0.0.1:2378", "gRPC address for experimental v3 demo API.")
// backwards-compatibility with v0.4.6
fs.Var(&flags.IPAddressPort{}, "addr", "DEPRECATED: Use -advertise-client-urls instead.")
fs.Var(&flags.IPAddressPort{}, "bind-addr", "DEPRECATED: Use -listen-client-urls instead.")
fs.Var(&flags.IPAddressPort{}, "peer-addr", "DEPRECATED: Use -initial-advertise-peer-urls instead.")
fs.Var(&flags.IPAddressPort{}, "peer-bind-addr", "DEPRECATED: Use -listen-peer-urls instead.")
fs.Var(&flags.DeprecatedFlag{Name: "peers"}, "peers", "DEPRECATED: Use -initial-cluster instead")
fs.Var(&flags.DeprecatedFlag{Name: "peers-file"}, "peers-file", "DEPRECATED: Use -initial-cluster instead")
fs.Var(&flags.DeprecatedFlag{Name: "peers"}, "peers", "DEPRECATED: Use -initial-cluster instead.")
fs.Var(&flags.DeprecatedFlag{Name: "peers-file"}, "peers-file", "DEPRECATED: Use -initial-cluster instead.")
// ignored
for _, f := range cfg.ignored {

@@ -12,6 +12,5 @@
// See the License for the specific language governing permissions and
// limitations under the License.
/* Package etcd contains the main entry point for the etcd binary. */
// Package etcdmain contains the main entry point for the etcd binary.
package etcdmain

@@ -201,11 +201,6 @@ func startEtcd(cfg *config) (<-chan struct{}, error) {
return nil, fmt.Errorf("error setting up initial cluster: %v", err)
}
pt, err := transport.NewTimeoutTransport(cfg.peerTLSInfo, peerDialTimeout(cfg.ElectionMs), rafthttp.ConnReadTimeout, rafthttp.ConnWriteTimeout)
if err != nil {
return nil, err
}
if !cfg.peerTLSInfo.Empty() {
plog.Infof("peerTLS: %s", cfg.peerTLSInfo)
}
@@ -215,7 +210,7 @@ func startEtcd(cfg *config) (<-chan struct{}, error) {
plog.Warningf("The scheme of peer url %s is http while peer key/cert files are presented. Ignored peer key/cert files.", u.String())
}
var l net.Listener
l, err = transport.NewTimeoutListener(u.Host, u.Scheme, cfg.peerTLSInfo, rafthttp.ConnReadTimeout, rafthttp.ConnWriteTimeout)
l, err = rafthttp.NewListener(u, cfg.peerTLSInfo)
if err != nil {
return nil, err
}
@@ -264,11 +259,11 @@ func startEtcd(cfg *config) (<-chan struct{}, error) {
var v3l net.Listener
if cfg.v3demo {
v3l, err = net.Listen("tcp", "127.0.0.1:12379")
v3l, err = net.Listen("tcp", cfg.gRPCAddr)
if err != nil {
plog.Fatal(err)
}
plog.Infof("listening for client rpc on 127.0.0.1:12379")
plog.Infof("listening for client rpc on %s", cfg.gRPCAddr)
}
srvcfg := &etcdserver.ServerConfig{
@@ -286,10 +281,11 @@ func startEtcd(cfg *config) (<-chan struct{}, error) {
DiscoveryProxy: cfg.dproxy,
NewCluster: cfg.isNewCluster(),
ForceNewCluster: cfg.forceNewCluster,
Transport: pt,
PeerTLSInfo: cfg.peerTLSInfo,
TickMs: cfg.TickMs,
ElectionTicks: cfg.electionTicks(),
V3demo: cfg.v3demo,
StrictReconfigCheck: cfg.strictReconfigCheck,
}
var s *etcdserver.EtcdServer
s, err = etcdserver.NewServer(srvcfg)
@@ -325,7 +321,8 @@ func startEtcd(cfg *config) (<-chan struct{}, error) {
if cfg.v3demo {
// set up v3 demo rpc
grpcServer := grpc.NewServer()
etcdserverpb.RegisterEtcdServer(grpcServer, v3rpc.New(s))
etcdserverpb.RegisterKVServer(grpcServer, v3rpc.NewKVServer(s))
etcdserverpb.RegisterWatchServer(grpcServer, v3rpc.NewWatchServer(s.Watchable()))
go plog.Fatal(grpcServer.Serve(v3l))
}
@@ -392,7 +389,7 @@ func startProxy(cfg *config) error {
uf := func() []string {
gcls, err := etcdserver.GetClusterFromRemotePeers(peerURLs, tr)
// TODO: remove the 2nd check when we fix GetClusterFromPeers
// GetClusterFromPeers should not return nil error with an invaild empty cluster
// GetClusterFromPeers should not return nil error with an invalid empty cluster
if err != nil {
plog.Warningf("proxy: %v", err)
return []string{}
@@ -535,9 +532,3 @@ func setupLogging(cfg *config) {
repoLog.SetLogLevel(settings)
}
}
func peerDialTimeout(electionMs uint) time.Duration {
// 1s for queue wait and system delay
// + one RTT, which is smaller than 1/5 election timeout
return time.Second + time.Duration(electionMs)*time.Millisecond/5
}

@@ -68,7 +68,8 @@ clustering flags:
HTTP proxy to use for traffic to discovery service.
--discovery-srv ''
dns srv domain used to bootstrap the cluster.
--strict-reconfig-check
reject reconfiguration requests that would cause quorum loss.
proxy flags:
@@ -114,7 +115,7 @@ logging flags
--debug 'false'
enable debug-level logging for etcd.
--log-package-levels ''
set individual packages to various log levels (eg: 'etcdmain=CRITICAL,etcdserver=DEBUG')
specify a particular log level for each etcd package (eg: 'etcdmain=CRITICAL,etcdserver=DEBUG').
unsafe flags:
@@ -128,6 +129,8 @@ given by the consensus protocol.
experimental flags:
--experimental-v3demo 'false'
enable experimental v3 demo API
enable experimental v3 demo API.
--experimental-gRPC-addr '127.0.0.1:2378'
gRPC address for experimental v3 demo API.
`
)
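The `--strict-reconfig-check` help text says only that reconfigurations causing quorum loss are rejected. One plausible shape for such a check on member removal is sketched below. It is an assumption made for illustration, not the server's actual logic: `canRemoveMember` and its parameters are hypothetical, and quorum is taken to be a simple majority of the cluster size after removal.

```go
package main

import "fmt"

// canRemoveMember reports whether, after removing one member, the members
// that are still started would form a majority of the remaining cluster.
// total is the current cluster size, started is how many members are
// currently started, and removingStarted says whether the member being
// removed is one of the started ones. (Hypothetical helper.)
func canRemoveMember(total, started int, removingStarted bool) bool {
	newTotal := total - 1
	if removingStarted {
		started--
	}
	quorum := newTotal/2 + 1
	return started >= quorum
}

func main() {
	// 3 members, all started: removing one leaves 2 of 2 started.
	fmt.Println(canRemoveMember(3, 3, true))
	// 3 members, 2 started: removing a started one leaves 1 of 2 started.
	fmt.Println(canRemoveMember(3, 2, true))
	// 3 members, 2 started: removing the stopped one leaves 2 of 2 started.
	fmt.Println(canRemoveMember(3, 2, false))
}
```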

@@ -0,0 +1,27 @@
// Copyright 2015 CoreOS, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package v3rpc
import (
"github.com/coreos/etcd/Godeps/_workspace/src/google.golang.org/grpc"
"github.com/coreos/etcd/Godeps/_workspace/src/google.golang.org/grpc/codes"
"github.com/coreos/etcd/storage"
)
var (
ErrEmptyKey = grpc.Errorf(codes.InvalidArgument, "key is not provided")
ErrCompacted = grpc.Errorf(codes.OutOfRange, storage.ErrCompacted.Error())
ErrFutureRev = grpc.Errorf(codes.OutOfRange, storage.ErrFutureRev.Error())
)

Some files were not shown because too many files have changed in this diff.