Compare commits
45 Commits
v2.3.0-alp
...
v2.2.4
Author | SHA1 | Date | |
---|---|---|---|
![]() |
bdee27b19e | ||
![]() |
7f684641a3 | ||
![]() |
d54cf26bed | ||
![]() |
d3e73aadab | ||
![]() |
e340928988 | ||
![]() |
e6ffe22e16 | ||
![]() |
05b564a394 | ||
![]() |
cb779b2305 | ||
![]() |
22c3208fb3 | ||
![]() |
e44372e430 | ||
![]() |
05a90bc1e5 | ||
![]() |
6751727809 | ||
![]() |
916106c3a2 | ||
![]() |
e0c7768f94 | ||
![]() |
0fb2d5d4d3 | ||
![]() |
fc61fc7c7a | ||
![]() |
09b81bad15 | ||
![]() |
b4bddf685b | ||
![]() |
af1c711270 | ||
![]() |
c269426be8 | ||
![]() |
20b7df3c12 | ||
![]() |
e342de3cc5 | ||
![]() |
26cc2111cd | ||
![]() |
5d6457e658 | ||
![]() |
53bc644168 | ||
![]() |
ad3bb484ca | ||
![]() |
15f7b736e4 | ||
![]() |
4dc835c718 | ||
![]() |
75f8282eef | ||
![]() |
45c86af0eb | ||
![]() |
71e5467807 | ||
![]() |
0169fec873 | ||
![]() |
766023b1b0 | ||
![]() |
ca9e63dde2 | ||
![]() |
7659bbb1b2 | ||
![]() |
f8b98d3925 | ||
![]() |
9ee3ed777b | ||
![]() |
c9bd125490 | ||
![]() |
ec49496111 | ||
![]() |
baaefd18e2 | ||
![]() |
72c18eb7ba | ||
![]() |
2e87d71bc6 | ||
![]() |
217dccd617 | ||
![]() |
3ceb5dd270 | ||
![]() |
49b77a59cf |
1
.gitignore
vendored
1
.gitignore
vendored
@@ -9,4 +9,3 @@
|
||||
*.swp
|
||||
/hack/insta-discovery/.env
|
||||
*.test
|
||||
tools/functional-tester/docker/bin
|
||||
|
2
.header
2
.header
@@ -1,4 +1,4 @@
|
||||
// Copyright 2016 CoreOS, Inc.
|
||||
// Copyright 2014 CoreOS, Inc.
|
||||
//
|
||||
// Licensed under the Apache License, Version 2.0 (the "License");
|
||||
// you may not use this file except in compliance with the License.
|
||||
|
14
.travis.yml
14
.travis.yml
@@ -1,19 +1,11 @@
|
||||
language: go
|
||||
sudo: false
|
||||
|
||||
go:
|
||||
- 1.4
|
||||
- 1.5
|
||||
- tip
|
||||
|
||||
matrix:
|
||||
allow_failures:
|
||||
- go: tip
|
||||
|
||||
addons:
|
||||
apt:
|
||||
packages:
|
||||
- libpcap-dev
|
||||
install:
|
||||
- go get github.com/barakmich/go-nyet
|
||||
|
||||
script:
|
||||
- ./test
|
||||
- INTEGRATION=y ./test
|
||||
|
@@ -35,7 +35,7 @@ Thanks for your contributions!
|
||||
|
||||
### Code style
|
||||
|
||||
The coding style suggested by the Golang community is used in etcd. See the [style doc](https://github.com/golang/go/wiki/CodeReviewComments) for details.
|
||||
The coding style suggested by the Golang community is used in etcd. See the [style doc](https://code.google.com/p/go-wiki/wiki/CodeReviewComments) for details.
|
||||
|
||||
Please follow this style to make etcd easy to review, maintain and develop.
|
||||
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# Snapshot Migration
|
||||
## Snapshot Migration
|
||||
|
||||
You can migrate a snapshot of your data from a v0.4.9+ cluster into a new etcd 2.2 cluster using a snapshot migration. After snapshot migration, the etcd indexes of your data will change. Many etcd applications rely on these indexes to behave correctly. This operation should only be done while all etcd applications are stopped.
|
||||
|
||||
@@ -15,7 +15,7 @@ etcdctl --endpoint new_cluster.example.com import --snap backup.snap
|
||||
```
|
||||
|
||||
If you have a large amount of data, you can specify more concurrent works to copy data in parallel by using `-c` flag.
|
||||
If you have hidden keys to copy, you can use `--hidden` flag to specify. For example fleet uses `/_coreos.com/fleet` so to import those keys use `--hidden /_coreos.com`.
|
||||
If you have hidden keys to copy, you can use `--hidden` flag to specify.
|
||||
|
||||
And the data will quickly copy into the new cluster:
|
||||
|
||||
|
@@ -1,8 +1,8 @@
|
||||
# Administration
|
||||
## Administration
|
||||
|
||||
## Data Directory
|
||||
### Data Directory
|
||||
|
||||
### Lifecycle
|
||||
#### Lifecycle
|
||||
|
||||
When first started, etcd stores its configuration into a data directory specified by the data-dir configuration parameter.
|
||||
Configuration is stored in the write ahead log and includes: the local member ID, cluster ID, and initial cluster configuration.
|
||||
@@ -20,7 +20,7 @@ Once removed the member can be re-added with an empty data directory.
|
||||
|
||||
[remove-a-member]: runtime-configuration.md#remove-a-member
|
||||
|
||||
### Contents
|
||||
#### Contents
|
||||
|
||||
The data directory has two sub-directories in it:
|
||||
|
||||
@@ -32,18 +32,18 @@ If `--wal-dir` flag is set, etcd will write the write ahead log files to the spe
|
||||
[wal-pkg]: http://godoc.org/github.com/coreos/etcd/wal
|
||||
[snap-pkg]: http://godoc.org/github.com/coreos/etcd/snap
|
||||
|
||||
## Cluster Management
|
||||
### Cluster Management
|
||||
|
||||
### Lifecycle
|
||||
#### Lifecycle
|
||||
|
||||
If you are spinning up multiple clusters for testing it is recommended that you specify a unique initial-cluster-token for the different clusters.
|
||||
This can protect you from cluster corruption in case of mis-configuration because two members started with different cluster tokens will refuse members from each other.
|
||||
|
||||
### Monitoring
|
||||
#### Monitoring
|
||||
|
||||
It is important to monitor your production etcd cluster for healthy information and runtime metrics.
|
||||
|
||||
#### Health Monitoring
|
||||
##### Health Monitoring
|
||||
|
||||
At lowest level, etcd exposes health information via HTTP at `/health` in JSON format. If it returns `{"health": "true"}`, then the cluster is healthy. Please note the `/health` endpoint is still an experimental one as in etcd 2.2.
|
||||
|
||||
@@ -63,16 +63,16 @@ member fd422379fda50e48 is healthy: got healthy result from http://127.0.0.1:323
|
||||
cluster is healthy
|
||||
```
|
||||
|
||||
#### Runtime Metrics
|
||||
##### Runtime Metrics
|
||||
|
||||
etcd uses [Prometheus](http://prometheus.io/) for metrics reporting in the server. You can read more through the runtime metrics [doc](metrics.md).
|
||||
|
||||
### Debugging
|
||||
#### Debugging
|
||||
|
||||
Debugging a distributed system can be difficult. etcd provides several ways to make debug
|
||||
easier.
|
||||
|
||||
#### Enabling Debug Logging
|
||||
##### Enabling Debug Logging
|
||||
|
||||
When you want to debug etcd without stopping it, you can enable debug logging at runtime.
|
||||
etcd exposes logging configuration at `/config/local/log`.
|
||||
@@ -85,7 +85,7 @@ $ curl http://127.0.0.1:2379/config/local/log -XPUT -d '{"Level":"INFO"}'
|
||||
$ # debug logging disabled
|
||||
```
|
||||
|
||||
#### Debugging Variables
|
||||
##### Debugging Variables
|
||||
|
||||
Debug variables are exposed for real-time debugging purposes. Developers who are familiar with etcd can utilize these variables to debug unexpected behavior. etcd exposes debug variables via HTTP at `/debug/vars` in JSON format. The debug variables contains
|
||||
`cmdline`, `file_descriptor_limit`, `memstats` and `raft.status`.
|
||||
@@ -107,7 +107,7 @@ Debug variables are exposed for real-time debugging purposes. Developers who are
|
||||
}
|
||||
```
|
||||
|
||||
### Optimal Cluster Size
|
||||
#### Optimal Cluster Size
|
||||
|
||||
The recommended etcd cluster size is 3, 5 or 7, which is decided by the fault tolerance requirement. A 7-member cluster can provide enough fault tolerance in most cases. While larger cluster provides better fault tolerance the write performance reduces since data needs to be replicated to more machines.
|
||||
|
||||
@@ -152,7 +152,7 @@ This example will walk you through the process of migrating the infra1 member to
|
||||
|infra2|10.0.1.12:2380|
|
||||
|
||||
```sh
|
||||
$ export ETCDCTL_ENDPOINT=http://10.0.1.10:2379,http://10.0.1.11:2379,http://10.0.1.12:2379
|
||||
$ export ETCDCTL_PEERS=http://10.0.1.10:2379,http://10.0.1.11:2379,http://10.0.1.12:2379
|
||||
```
|
||||
|
||||
```sh
|
||||
@@ -293,6 +293,6 @@ If timeout happens several times continuously, administrators should check statu
|
||||
|
||||
#### Maximum OS threads
|
||||
|
||||
By default, etcd uses the default configuration of the Go 1.4 runtime, which means that at most one operating system thread will be used to execute code simultaneously. (Note that this default behavior [has changed in Go 1.5](https://golang.org/doc/go1.5#runtime)).
|
||||
By default, etcd uses the default configuration of the Go 1.4 runtime, which means that at most one operating system thread will be used to execute code simultaneously. (Note that this default behavior [may change in Go 1.5](https://docs.google.com/document/d/1At2Ls5_fhJQ59kDK2DFVhFu3g5mATSXqqV5QrxinasI/edit)).
|
||||
|
||||
When using etcd in heavy-load scenarios on machines with multiple cores it will usually be desirable to increase the number of threads that etcd can utilize. To do this, simply set the environment variable `GOMAXPROCS` to the desired number when starting etcd. For more information on this variable, see the Go [runtime](https://golang.org/pkg/runtime) documentation.
|
||||
|
@@ -234,50 +234,6 @@ curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -d ttl= -d prevExist=t
|
||||
}
|
||||
```
|
||||
|
||||
### Refreshing key TTL
|
||||
|
||||
Keys in etcd can be refreshed notifying watchers
|
||||
this can be achieved by setting the refresh to true when updating a TTL
|
||||
|
||||
You cannot update the value of a key when refreshing it
|
||||
|
||||
```sh
|
||||
curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -d ttl=5
|
||||
curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d ttl=5 -d refresh=true -d prevExist=true
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"action": "set",
|
||||
"node": {
|
||||
"createdIndex": 5,
|
||||
"expiration": "2013-12-04T12:01:21.874888581-08:00",
|
||||
"key": "/foo",
|
||||
"modifiedIndex": 5,
|
||||
"ttl": 5,
|
||||
"value": "bar"
|
||||
}
|
||||
}
|
||||
{
|
||||
"action":"update",
|
||||
"node":{
|
||||
"key":"/foo",
|
||||
"value":"bar",
|
||||
"expiration": "2013-12-04T12:01:26.874888581-08:00",
|
||||
"ttl":5,
|
||||
"modifiedIndex":6,
|
||||
"createdIndex":5
|
||||
},
|
||||
"prevNode":{
|
||||
"key":"/foo",
|
||||
"value":"bar",
|
||||
"expiration":"2013-12-04T12:01:21.874888581-08:00",
|
||||
"ttl":3,
|
||||
"modifiedIndex":5,
|
||||
"createdIndex":5
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Waiting for a change
|
||||
|
||||
@@ -403,7 +359,7 @@ curl 'http://127.0.0.1:2379/v2/keys/foo?wait=true&waitIndex=2008'
|
||||
#### Connection being closed prematurely
|
||||
|
||||
The server may close a long polling connection before emitting any events.
|
||||
This can happen due to a timeout or the server being shutdown.
|
||||
This can happend due to a timeout or the server being shutdown.
|
||||
Since the HTTP header is sent immediately upon accepting the connection, the response will be seen as empty: `200 OK` and empty body.
|
||||
The clients should be prepared to deal with this scenario and retry the watch.
|
||||
|
||||
@@ -544,15 +500,13 @@ etcd can be used as a centralized coordination service in a cluster, and `Compar
|
||||
|
||||
This command will set the value of a key only if the client-provided conditions are equal to the current conditions.
|
||||
|
||||
_Note that `CompareAndSwap` does not work with [directories](#listing-a-directory). If an attempt is made to `CompareAndSwap` a directory, a 102 "Not a file" error will be returned._
|
||||
|
||||
The current comparable conditions are:
|
||||
|
||||
1. `prevValue` - checks the previous value of the key.
|
||||
|
||||
2. `prevIndex` - checks the previous modifiedIndex of the key.
|
||||
|
||||
3. `prevExist` - checks existence of the key: if `prevExist` is true, it is an `update` request; if `prevExist` is `false`, it is a `create` request.
|
||||
3. `prevExist` - checks existence of the key: if `prevExist` is true, it is an `update` request; if prevExist is `false`, it is a `create` request.
|
||||
|
||||
Here is a simple example.
|
||||
Let's create a key-value pair first: `foo=one`.
|
||||
@@ -631,8 +585,6 @@ We successfully changed the value from "one" to "two" since we gave the correct
|
||||
|
||||
This command will delete a key only if the client-provided conditions are equal to the current conditions.
|
||||
|
||||
_Note that `CompareAndDelete` does not work with [directories](#listing-a-directory). If an attempt is made to `CompareAndDelete` a directory, a 102 "Not a file" error will be returned._
|
||||
|
||||
The current comparable conditions are:
|
||||
|
||||
1. `prevValue` - checks the previous value of the key.
|
||||
@@ -1096,7 +1048,6 @@ curl http://127.0.0.1:2379/v2/stats/self
|
||||
### Store Statistics
|
||||
|
||||
The store statistics include information about the operations that this node has handled.
|
||||
Note that v2 `store Statistics` is stored in-memory. When a member stops, store statistics will reset on restart.
|
||||
|
||||
Operations that modify the store's state like create, delete, set and update are seen by the entire cluster and the number will increase on all nodes.
|
||||
Operations like get and watch are node local and will only be seen on this node.
|
||||
@@ -1126,6 +1077,6 @@ curl http://127.0.0.1:2379/v2/stats/store
|
||||
|
||||
## Cluster Config
|
||||
|
||||
See the [members API][members-api] for details on the cluster management.
|
||||
See the [other etcd APIs][other-apis] for details on the cluster management.
|
||||
|
||||
[members-api]: members_api.md
|
||||
[other-apis]: other_apis.md
|
||||
|
@@ -19,7 +19,7 @@ Each role has exact one associated Permission List. An permission list exists fo
|
||||
|
||||
The special static ROOT (named `root`) role has a full permissions on all key-value resources, the permission to manage user resources and settings resources. Only the ROOT role has the permission to manage user resources and modify settings resources. The ROOT role is built-in and does not need to be created.
|
||||
|
||||
There is also a special GUEST role, named 'guest'. These are the permissions given to unauthenticated requests to etcd. This role will be created automatically, and by default allows access to the full keyspace due to backward compatibility. (etcd did not previously authenticate any actions.). This role can be modified by a ROOT role holder at any time, to reduce the capabilities of unauthenticated users.
|
||||
There is also a special GUEST role, named 'guest'. These are the permissions given to unauthenticated requests to etcd. This role will be created automatically, and by default allows access to the full keyspace due to backward compatability. (etcd did not previously authenticate any actions.). This role can be modified by a ROOT role holder at any time, to reduce the capabilities of unauthenticated users.
|
||||
|
||||
#### Permissions
|
||||
|
||||
@@ -124,7 +124,7 @@ The User JSON object is formed as follows:
|
||||
|
||||
Password is only passed when necessary.
|
||||
|
||||
**Get a List of Users**
|
||||
**Get a list of users**
|
||||
|
||||
GET/HEAD /v2/auth/users
|
||||
|
||||
@@ -137,36 +137,7 @@ GET/HEAD /v2/auth/users
|
||||
Content-type: application/json
|
||||
200 Body:
|
||||
{
|
||||
"users": [
|
||||
{
|
||||
"user": "alice",
|
||||
"roles": [
|
||||
{
|
||||
"role": "root",
|
||||
"permissions": {
|
||||
"kv": {
|
||||
"read": ["*"],
|
||||
"write": ["*"]
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"user": "bob",
|
||||
"roles": [
|
||||
{
|
||||
"role": "guest",
|
||||
"permissions": {
|
||||
"kv": {
|
||||
"read": ["*"],
|
||||
"write": ["*"]
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
"users": ["alice", "bob", "eve"]
|
||||
}
|
||||
|
||||
**Get User Details**
|
||||
@@ -184,26 +155,7 @@ GET/HEAD /v2/auth/users/alice
|
||||
200 Body:
|
||||
{
|
||||
"user" : "alice",
|
||||
"roles" : [
|
||||
{
|
||||
"role": "fleet",
|
||||
"permissions" : {
|
||||
"kv" : {
|
||||
"read": [ "/fleet/" ],
|
||||
"write": [ "/fleet/" ]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"role": "etcd",
|
||||
"permissions" : {
|
||||
"kv" : {
|
||||
"read": [ "*" ],
|
||||
"write": [ "*" ]
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
"roles" : ["fleet", "etcd"]
|
||||
}
|
||||
|
||||
**Create Or Update A User**
|
||||
@@ -261,6 +213,22 @@ A full role structure may look like this. A Permission List structure is used fo
|
||||
}
|
||||
```
|
||||
|
||||
**Get a list of Roles**
|
||||
|
||||
GET/HEAD /v2/auth/roles
|
||||
|
||||
Sent Headers:
|
||||
Authorization: Basic <BasicAuthString>
|
||||
Possible Status Codes:
|
||||
200 OK
|
||||
401 Unauthorized
|
||||
200 Headers:
|
||||
Content-type: application/json
|
||||
200 Body:
|
||||
{
|
||||
"roles": ["fleet", "etcd", "quay"]
|
||||
}
|
||||
|
||||
**Get Role Details**
|
||||
|
||||
GET/HEAD /v2/auth/roles/fleet
|
||||
@@ -284,50 +252,6 @@ GET/HEAD /v2/auth/roles/fleet
|
||||
}
|
||||
}
|
||||
|
||||
**Get a list of Roles**
|
||||
|
||||
GET/HEAD /v2/auth/roles
|
||||
|
||||
Sent Headers:
|
||||
Authorization: Basic <BasicAuthString>
|
||||
Possible Status Codes:
|
||||
200 OK
|
||||
401 Unauthorized
|
||||
200 Headers:
|
||||
Content-type: application/json
|
||||
200 Body:
|
||||
{
|
||||
"roles": [
|
||||
{
|
||||
"role": "fleet",
|
||||
"permissions": {
|
||||
"kv": {
|
||||
"read": ["/fleet/"],
|
||||
"write": ["/fleet/"]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"role": "etcd",
|
||||
"permissions": {
|
||||
"kv": {
|
||||
"read": ["*"],
|
||||
"write": ["*"]
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"role": "quay",
|
||||
"permissions": {
|
||||
"kv": {
|
||||
"read": ["*"],
|
||||
"write": ["*"]
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
**Create Or Update A Role**
|
||||
|
||||
PUT /v2/auth/roles/rkt
|
||||
|
@@ -134,7 +134,7 @@ $ etcdctl role remove myrolename
|
||||
|
||||
## Enabling authentication
|
||||
|
||||
The minimal steps to enabling auth are as follows. The administrator can set up users and roles before or after enabling authentication, as a matter of preference.
|
||||
The minimal steps to enabling auth follow. The administrator can set up users and roles before or after enabling authentication, as a matter of preference.
|
||||
|
||||
Make sure the root user is created:
|
||||
|
||||
|
@@ -32,7 +32,7 @@ The consistent flag for read operations is removed in etcd 2.0.0. The normal rea
|
||||
|
||||
The read consistency guarantees are:
|
||||
|
||||
The consistent read guarantees the sequential consistency within one client that talks to one etcd server. Read/Write from one client to one etcd member should be observed in order. If one client write a value to an etcd server successfully, it should be able to get the value out of the server immediately.
|
||||
The consistent read guarantees the sequential consistency within one client that talks to one etcd server. Read/Write from one client to one etcd member should be observed in order. If one client write a value to a etcd server successfully, it should be able to get the value out of the server immediately.
|
||||
|
||||
Each etcd member will proxy the request to leader and only return the result to user after the result is applied on the local member. Thus after the write succeed, the user is guaranteed to see the value on the member it sent the request to.
|
||||
|
||||
@@ -60,9 +60,9 @@ A size key needs to be provided inside a [discovery token][discoverytoken].
|
||||
|
||||
## HTTP Admin API
|
||||
|
||||
`v2/admin` on peer url and `v2/keys/_etcd` are unified under the new [v2/members API][members-api] to better explain which machines are part of an etcd cluster, and to simplify the keyspace for all your use cases.
|
||||
`v2/admin` on peer url and `v2/keys/_etcd` are unified under the new [v2/member API][memberapi] to better explain which machines are part of an etcd cluster, and to simplify the keyspace for all your use cases.
|
||||
|
||||
[members-api]: members_api.md
|
||||
[memberapi]: other_apis.md
|
||||
|
||||
## HTTP Key Value API
|
||||
- The follower can now transparently proxy write requests to the leader. Clients will no longer see 307 redirections to the leader from etcd.
|
||||
|
@@ -14,7 +14,7 @@ GCE n1-highcpu-2 machine type
|
||||
|
||||
## Testing
|
||||
|
||||
Use [etcd v3 benchmark tool](../../tools/v3benchmark/).
|
||||
Use [etcd v3 benchmark tool](../../hack/v3benchmark/).
|
||||
|
||||
## Performance
|
||||
|
||||
|
@@ -1,76 +0,0 @@
|
||||
# Watch Memory Usage Benchmark
|
||||
|
||||
*NOTE*: The watch features are under active development, and their memory usage may change as that development progresses. We do not expect it to significantly increase beyond the figures stated below.
|
||||
|
||||
A primary goal of etcd is supporting a very large number of watchers doing a massively large amount of watching. etcd aims to support O(10k) clients, O(100K) watch streams (O(10) streams per client) and O(10M) total watchings (O(100) watching per stream). The memory consumed by each individual watching accounts for the largest portion of etcd's overall usage, and is therefore the focus of current and future optimizations.
|
||||
|
||||
|
||||
Three related components of etcd watch consume physical memory: each `grpc.Conn`, each watch stream, and each instance of the watching activity. `grpc.Conn` maintains the actual TCP connection and other gRPC connection state. Each `grpc.Conn` consumes O(10kb) of memory, and might have multiple watch streams attached.
|
||||
|
||||
Each watch stream is an independent HTTP2 connection which consumes another O(10kb) of memory.
|
||||
Multiple watchings might share one watch stream.
|
||||
|
||||
Watching is the actual struct that tracks the changes on the key-value store. Each watching should only consume < O(1kb).
|
||||
|
||||
```
|
||||
+-------+
|
||||
| watch |
|
||||
+---------> | foo |
|
||||
| +-------+
|
||||
+------+-----+
|
||||
| stream |
|
||||
+--------------> | |
|
||||
| +------+-----+ +-------+
|
||||
| | | watch |
|
||||
| +---------> | bar |
|
||||
+-----+------+ +-------+
|
||||
| | +------------+
|
||||
| conn +-------> | stream |
|
||||
| | | |
|
||||
+-----+------+ +------------+
|
||||
|
|
||||
|
|
||||
|
|
||||
| +------------+
|
||||
+--------------> | stream |
|
||||
| |
|
||||
+------------+
|
||||
```
|
||||
|
||||
The theoretical memory consumption of watch can be approximated with the formula:
|
||||
`memory = c1 * number_of_conn + c2 * avg_number_of_stream_per_conn + c3 * avg_number_of_watch_stream`
|
||||
|
||||
## Testing Environment
|
||||
|
||||
etcd version
|
||||
- git head https://github.com/coreos/etcd/commit/185097ffaa627b909007e772c175e8fefac17af3
|
||||
|
||||
GCE n1-standard-2 machine type
|
||||
- 7.5 GB memory
|
||||
- 2x CPUs
|
||||
|
||||
## Overall memory usage
|
||||
|
||||
The overall memory usage captures how much [RSS](https://en.wikipedia.org/wiki/Resident_set_size) etcd consumes with the client watchers. The result is not very accurate and might vary around 10%. It is still very meaningful, since the goal is to learn about the rough memory usage and the pattern (find out `c1`, `c2` and `c3`).
|
||||
|
||||
With the benchmark result, we can calculate roughly that `c1 = 17kb`, `c2 = 18kb` and `c3 = 350bytes`. So each additional client connection consumes 17kb of memory and each additional stream consumes 18kb of memory, and each additional watching only cause 350bytes. A single etcd server can maintain millions of watchings with a few GB of memory in normal case.
|
||||
|
||||
|
||||
| clients | streams per client | watchings per stream | total watching | memory usage |
|
||||
|---------|---------|-----------|----------------|--------------|
|
||||
| 1k | 1 | 1 | 1k | 50MB |
|
||||
| 2k | 1 | 1 | 2k | 90MB |
|
||||
| 5k | 1 | 1 | 5k | 200MB |
|
||||
| 1k | 10 | 1 | 10k | 217MB |
|
||||
| 2k | 10 | 1 | 20k | 417MB |
|
||||
| 5k | 10 | 1 | 50k | 980MB |
|
||||
| 1k | 50 | 1 | 50k | 1001MB |
|
||||
| 2k | 50 | 1 | 100k | 1960MB |
|
||||
| 5k | 50 | 1 | 250k | 4700MB |
|
||||
| 1k | 50 | 10 | 500k | 1171MB |
|
||||
| 2k | 50 | 10 | 1M | 2371MB |
|
||||
| 5k | 50 | 10 | 2.5M | 5710MB |
|
||||
| 1k | 50 | 100 | 5M | 2380MB |
|
||||
| 2k | 50 | 100 | 10M | 4672MB |
|
||||
| 5k | 50 | 100 | 50M | *OOM* |
|
||||
|
@@ -1,98 +0,0 @@
|
||||
# Storage Memory Usage Benchmark
|
||||
|
||||
<!---todo: link storage to storage design doc-->
|
||||
Two components of etcd storage consume physical memory. The etcd process allocates an *in-memory index* to speed key lookup. The process's *page cache*, managed by the operating system, stores recently-accessed data from disk for quick re-use.
|
||||
|
||||
The in-memory index holds all the keys in a [B-tree][btree] data structure, along with pointers to the on-disk data (the values). Each key in the B-tree may contain multiple pointers, pointing to different versions of its values. The theoretical memory consumption of the in-memory index can hence be approximated with the formula:
|
||||
|
||||
`N * (c1 + avg_key_size) + N * (avg_versions_of_key) * (c2 + size_of_pointer)`
|
||||
|
||||
where `c1` is the key metadata overhead and `c2` is the version metadata overhead.
|
||||
|
||||
The graph shows the detailed structure of the in-memory index B-tree.
|
||||
|
||||
```
|
||||
|
||||
|
||||
In mem index
|
||||
|
||||
+------------+
|
||||
| key || ... |
|
||||
+--------------+ | || |
|
||||
| | +------------+
|
||||
| | | v1 || ... |
|
||||
| disk <----------------| || | Tree Node
|
||||
| | +------------+
|
||||
| | | v2 || ... |
|
||||
| <----------------+ || |
|
||||
| | +------------+
|
||||
+--------------+ +-----+ | | |
|
||||
| | | | |
|
||||
| +------------+
|
||||
|
|
||||
|
|
||||
^
|
||||
------+
|
||||
| ... |
|
||||
| |
|
||||
+-----+
|
||||
| ... | Tree Node
|
||||
| |
|
||||
+-----+
|
||||
| ... |
|
||||
| |
|
||||
------+
|
||||
```
|
||||
|
||||
[Page cache memory][pagecache] is managed by the operating system and is not covered in detail in this document.
|
||||
|
||||
## Testing Environment
|
||||
|
||||
etcd version
|
||||
- git head https://github.com/coreos/etcd/commit/776e9fb7be7eee5e6b58ab977c8887b4fe4d48db
|
||||
|
||||
GCE n1-standard-2 machine type
|
||||
|
||||
- 7.5 GB memory
|
||||
- 2x CPUs
|
||||
|
||||
## In-memory index memory usage
|
||||
|
||||
In this test, we only benchmark the memory usage of the in-memory index. The goal is to find `c1` and `c2` mentioned above and to understand the hard limit of memory consumption of the storage.
|
||||
|
||||
We calculate the memory usage consumption via the Go runtime.ReadMemStats. We calculate the total allocated bytes difference before creating the index and after creating the index. It cannot perfectly reflect the memory usage of the in-memory index itself but can show the rough consumption pattern.
|
||||
|
||||
| N | versions | key size | memory usage |
|
||||
|------|----------|----------|--------------|
|
||||
| 100K | 1 | 64bytes | 22MB |
|
||||
| 100K | 5 | 64bytes | 39MB |
|
||||
| 1M | 1 | 64bytes | 218MB |
|
||||
| 1M | 5 | 64bytes | 432MB |
|
||||
| 100K | 1 | 256bytes | 41MB |
|
||||
| 100K | 5 | 256bytes | 65MB |
|
||||
| 1M | 1 | 256bytes | 409MB |
|
||||
| 1M | 5 | 256bytes | 506MB |
|
||||
|
||||
|
||||
Based on the result, we can calculate `c1=120bytes`, `c2=30bytes`. We only need two sets of data to calculate `c1` and `c2`, since they are the only unknown variable in the formula. The `c1=120bytes` and `c2=30bytes` are the average value of the 4 sets of `c1` and `c2` we calculated. The key metadata overhead is still relatively nontrivial (50%) for small key-value pairs. However, this is a significant improvement over the old store, which had at least 1000% overhead.
|
||||
|
||||
## Overall memory usage
|
||||
|
||||
The overall memory usage captures how much RSS etcd consumes with the storage. The value size should have very little impact on the overall memory usage of etcd, since we keep values on disk and only retain hot values in memory, managed by the OS page cache.
|
||||
|
||||
| N | versions | key size | value size | memory usage |
|
||||
|------|----------|----------|------------|--------------|
|
||||
| 100K | 1 | 64bytes | 256bytes | 40MB |
|
||||
| 100K | 5 | 64bytes | 256bytes | 89MB |
|
||||
| 1M | 1 | 64bytes | 256bytes | 470MB |
|
||||
| 1M | 5 | 64bytes | 256bytes | 880MB |
|
||||
| 100K | 1 | 64bytes | 1KB | 102MB |
|
||||
| 100K | 5 | 64bytes | 1KB | 164MB |
|
||||
| 1M | 1 | 64bytes | 1KB | 587MB |
|
||||
| 1M | 5 | 64bytes | 1KB | 836MB |
|
||||
|
||||
Based on the result, we know the value size does not significantly impact the memory consumption. There is some minor increase due to more data held in the OS page cache.
|
||||
|
||||
[btree]: https://en.wikipedia.org/wiki/B-tree
|
||||
[pagecache]: https://en.wikipedia.org/wiki/Page_cache
|
||||
|
@@ -1,6 +1,6 @@
|
||||
# Branch Management
|
||||
## Branch Management
|
||||
|
||||
## Guide
|
||||
### Guide
|
||||
|
||||
- New development occurs on the [master branch](https://github.com/coreos/etcd/tree/master)
|
||||
- Master branch should always have a green build!
|
||||
|
@@ -124,7 +124,7 @@ There two methods that can be used for discovery:
|
||||
|
||||
### etcd Discovery
|
||||
|
||||
To better understand the design about discovery service protocol, we suggest you read [this](discovery_protocol.md).
|
||||
To better understand the design about discovery service protocol, we suggest you read [this](./discovery_protocol.md).
|
||||
|
||||
#### Lifetime of a Discovery URL
|
||||
|
||||
@@ -148,7 +148,7 @@ If you bootstrap an etcd cluster using discovery service with more than the expe
|
||||
|
||||
The URL you will use in this case will be `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` and the etcd members will use the `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` directory for registration as they start.
|
||||
|
||||
**Each member must have a different name flag specified. `Hostname` or `machine-id` can be a good choice. Or discovery will fail due to duplicated name.**
|
||||
Each member must have a different name flag specified. Or discovery will fail due to duplicated name.
|
||||
|
||||
Now we start etcd with those relevant flags for each member:
|
||||
|
||||
@@ -200,7 +200,7 @@ ETCD_DISCOVERY=https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573d
|
||||
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
|
||||
```
|
||||
|
||||
**Each member must have a different name flag specified. `Hostname` or `machine-id` can be a good choice. Or discovery will fail due to duplicated name.**
|
||||
Each member must have a different name flag specified. Or discovery will fail due to duplicated name.
|
||||
|
||||
Now we start etcd with those relevant flags for each member:
|
||||
|
||||
@@ -285,13 +285,6 @@ The following DNS SRV records are looked up in the listed order:
|
||||
|
||||
If `_etcd-server-ssl._tcp.example.com` is found then etcd will attempt the bootstrapping process over SSL.
|
||||
|
||||
To help clients discover the etcd cluster, the following DNS SRV records are looked up in the listed order:
|
||||
|
||||
* _etcd-client._tcp.example.com
|
||||
* _etcd-client-ssl._tcp.example.com
|
||||
|
||||
If `_etcd-client-ssl._tcp.example.com` is found, clients will attempt to communicate with the etcd cluster over SSL.
|
||||
|
||||
#### Create DNS SRV records
|
||||
|
||||
```
|
||||
@@ -301,13 +294,6 @@ _etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra1.example.com.
|
||||
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra2.example.com.
|
||||
```
|
||||
|
||||
```
|
||||
$ dig +noall +answer SRV _etcd-client._tcp.example.com
|
||||
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra0.example.com.
|
||||
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra1.example.com.
|
||||
_etcd-client._tcp.example.com. 300 IN SRV 0 0 2379 infra2.example.com.
|
||||
```
|
||||
|
||||
```
|
||||
$ dig +noall +answer infra0.example.com infra1.example.com infra2.example.com
|
||||
infra0.example.com. 300 IN A 10.0.1.10
|
||||
@@ -396,21 +382,9 @@ DNS SRV records can also be used to configure the list of peers for an etcd serv
|
||||
$ etcd --proxy on -discovery-srv example.com
|
||||
```
|
||||
|
||||
#### etcd client configuration
|
||||
|
||||
DNS SRV records can also be used to help clients discover the etcd cluster.
|
||||
|
||||
The official [etcd/client](../client) supports [DNS Discovery](https://godoc.org/github.com/coreos/etcd/client#Discoverer).
|
||||
|
||||
`etcdctl` also supports DNS Discovery by specifying the `--discovery-srv` option.
|
||||
|
||||
```
|
||||
$ etcdctl --discovery-srv example.com set foo bar
|
||||
```
|
||||
|
||||
#### Error Cases
|
||||
|
||||
You might see an error like `cannot find local etcd $name from SRV records.`. That means the etcd member fails to find itself from the cluster defined in SRV records. The resolved address in `-initial-advertise-peer-urls` *must match* one of the resolved addresses in the SRV targets.
|
||||
You might see the an error like `cannot find local etcd $name from SRV records.`. That means the etcd member fails to find itself from the cluster defined in SRV records. The resolved address in `-initial-advertise-peer-urls` *must match* one of the resolved addresses in the SRV targets.
|
||||
|
||||
# 0.4 to 2.0+ Migration Guide
|
||||
|
||||
|
@@ -1,274 +1,261 @@
|
||||
# Configuration Flags
|
||||
## Configuration Flags
|
||||
|
||||
etcd is configurable through command-line flags and environment variables. Options set on the command line take precedence over those from the environment.
|
||||
|
||||
The format of environment variable for flag `-my-flag` is `ETCD_MY_FLAG`. It applies to all flags.
|
||||
|
||||
The [official etcd ports](https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml?search=etcd) are 2379 for client requests, and 2380 for peer communication. Some legacy code and documentation still references ports 4001 and 7001, but all new etcd use and discussion should adopt the assigned ports.
|
||||
|
||||
To start etcd automatically using custom settings at startup in Linux, using a [systemd][systemd-intro] unit is highly recommended.
|
||||
|
||||
[systemd-intro]: http://freedesktop.org/wiki/Software/systemd/
|
||||
|
||||
## Member Flags
|
||||
### Member Flags
|
||||
|
||||
### -name
|
||||
##### -name
|
||||
+ Human-readable name for this member.
|
||||
+ default: "default"
|
||||
+ env variable: ETCD_NAME
|
||||
+ This value is referenced as this node's own entries listed in the `-initial-cluster` flag (Ex: `default=http://localhost:2380` or `default=http://localhost:2380,default=http://localhost:7001`). This needs to match the key used in the flag if you're using [static bootstrapping](clustering.md#static). When using discovery, each member must have a unique name. `Hostname` or `machine-id` can be a good choice.
|
||||
+ This value is referenced as this node's own entries listed in the `-initial-cluster` flag (Ex: `default=http://localhost:2380` or `default=http://localhost:2380,default=http://localhost:7001`). This needs to match the key used in the flag if you're using [static boostrapping](clustering.md#static).
|
||||
|
||||
### -data-dir
|
||||
##### -data-dir
|
||||
+ Path to the data directory.
|
||||
+ default: "${name}.etcd"
|
||||
+ env variable: ETCD_DATA_DIR
|
||||
|
||||
### -wal-dir
|
||||
##### -wal-dir
|
||||
+ Path to the dedicated wal directory. If this flag is set, etcd will write the WAL files to the walDir rather than the dataDir. This allows a dedicated disk to be used, and helps avoid io competition between logging and other IO operations.
|
||||
+ default: ""
|
||||
+ env variable: ETCD_WAL_DIR
|
||||
|
||||
### -snapshot-count
|
||||
##### -snapshot-count
|
||||
+ Number of committed transactions to trigger a snapshot to disk.
|
||||
+ default: "10000"
|
||||
+ env variable: ETCD_SNAPSHOT_COUNT
|
||||
|
||||
### -heartbeat-interval
|
||||
##### -heartbeat-interval
|
||||
+ Time (in milliseconds) of a heartbeat interval.
|
||||
+ default: "100"
|
||||
+ env variable: ETCD_HEARTBEAT_INTERVAL
|
||||
|
||||
### -election-timeout
|
||||
##### -election-timeout
|
||||
+ Time (in milliseconds) for an election to timeout. See [Documentation/tuning.md](tuning.md#time-parameters) for details.
|
||||
+ default: "1000"
|
||||
+ env variable: ETCD_ELECTION_TIMEOUT
|
||||
|
||||
### -listen-peer-urls
|
||||
##### -listen-peer-urls
|
||||
+ List of URLs to listen on for peer traffic. This flag tells the etcd to accept incoming requests from its peers on the specified scheme://IP:port combinations. Scheme can be either http or https.If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. The etcd will respond to requests from any of the listed addresses and ports.
|
||||
+ default: "http://localhost:2380,http://localhost:7001"
|
||||
+ env variable: ETCD_LISTEN_PEER_URLS
|
||||
+ example: "http://10.0.0.1:2380"
|
||||
+ invalid example: "http://example.com:2380" (domain name is invalid for binding)
|
||||
|
||||
### -listen-client-urls
|
||||
##### -listen-client-urls
|
||||
+ List of URLs to listen on for client traffic. This flag tells the etcd to accept incoming requests from the clients on the specified scheme://IP:port combinations. Scheme can be either http or https. If 0.0.0.0 is specified as the IP, etcd listens to the given port on all interfaces. If an IP address is given as well as a port, etcd will listen on the given port and interface. Multiple URLs may be used to specify a number of addresses and ports to listen on. The etcd will respond to requests from any of the listed addresses and ports.
|
||||
+ default: "http://localhost:2379,http://localhost:4001"
|
||||
+ env variable: ETCD_LISTEN_CLIENT_URLS
|
||||
+ example: "http://10.0.0.1:2379"
|
||||
+ invalid example: "http://example.com:2379" (domain name is invalid for binding)
|
||||
|
||||
### -max-snapshots
|
||||
##### -max-snapshots
|
||||
+ Maximum number of snapshot files to retain (0 is unlimited)
|
||||
+ default: 5
|
||||
+ env variable: ETCD_MAX_SNAPSHOTS
|
||||
+ The default for users on Windows is unlimited, and manual purging down to 5 (or your preference for safety) is recommended.
|
||||
|
||||
### -max-wals
|
||||
##### -max-wals
|
||||
+ Maximum number of wal files to retain (0 is unlimited)
|
||||
+ default: 5
|
||||
+ env variable: ETCD_MAX_WALS
|
||||
+ The default for users on Windows is unlimited, and manual purging down to 5 (or your preference for safety) is recommended.
|
||||
|
||||
### -cors
|
||||
##### -cors
|
||||
+ Comma-separated white list of origins for CORS (cross-origin resource sharing).
|
||||
+ default: none
|
||||
+ env variable: ETCD_CORS
|
||||
|
||||
## Clustering Flags
|
||||
### Clustering Flags
|
||||
|
||||
`-initial` prefix flags are used in bootstrapping ([static bootstrap][build-cluster], [discovery-service bootstrap][discovery] or [runtime reconfiguration][reconfig]) a new member, and ignored when restarting an existing member.
|
||||
|
||||
`-discovery` prefix flags need to be set when using [discovery service][discovery].
|
||||
|
||||
### -initial-advertise-peer-urls
|
||||
##### -initial-advertise-peer-urls
|
||||
|
||||
+ List of this member's peer URLs to advertise to the rest of the cluster. These addresses are used for communicating etcd data around the cluster. At least one must be routable to all cluster members. These URLs can contain domain names.
|
||||
+ default: "http://localhost:2380,http://localhost:7001"
|
||||
+ env variable: ETCD_INITIAL_ADVERTISE_PEER_URLS
|
||||
+ example: "http://example.com:2380, http://10.0.0.1:2380"
|
||||
|
||||
### -initial-cluster
|
||||
##### -initial-cluster
|
||||
+ Initial cluster configuration for bootstrapping.
|
||||
+ default: "default=http://localhost:2380,default=http://localhost:7001"
|
||||
+ env variable: ETCD_INITIAL_CLUSTER
|
||||
+ The key is the value of the `-name` flag for each node provided. The default uses `default` for the key because this is the default for the `-name` flag.
|
||||
|
||||
### -initial-cluster-state
|
||||
##### -initial-cluster-state
|
||||
+ Initial cluster state ("new" or "existing"). Set to `new` for all members present during initial static or DNS bootstrapping. If this option is set to `existing`, etcd will attempt to join the existing cluster. If the wrong value is set, etcd will attempt to start but fail safely.
|
||||
+ default: "new"
|
||||
+ env variable: ETCD_INITIAL_CLUSTER_STATE
|
||||
|
||||
[static bootstrap]: clustering.md#static
|
||||
|
||||
### -initial-cluster-token
|
||||
##### -initial-cluster-token
|
||||
+ Initial cluster token for the etcd cluster during bootstrap.
|
||||
+ default: "etcd-cluster"
|
||||
+ env variable: ETCD_INITIAL_CLUSTER_TOKEN
|
||||
|
||||
### -advertise-client-urls
|
||||
##### -advertise-client-urls
|
||||
+ List of this member's client URLs to advertise to the rest of the cluster. These URLs can contain domain names.
|
||||
+ default: "http://localhost:2379,http://localhost:4001"
|
||||
+ env variable: ETCD_ADVERTISE_CLIENT_URLS
|
||||
+ example: "http://example.com:2379, http://10.0.0.1:2379"
|
||||
+ Be careful if you are advertising URLs such as http://localhost:2379 from a cluster member and are using the proxy feature of etcd. This will cause loops, because the proxy will be forwarding requests to itself until its resources (memory, file descriptors) are eventually depleted.
|
||||
|
||||
### -discovery
|
||||
##### -discovery
|
||||
+ Discovery URL used to bootstrap the cluster.
|
||||
+ default: none
|
||||
+ env variable: ETCD_DISCOVERY
|
||||
|
||||
### -discovery-srv
|
||||
##### -discovery-srv
|
||||
+ DNS srv domain used to bootstrap the cluster.
|
||||
+ default: none
|
||||
+ env variable: ETCD_DISCOVERY_SRV
|
||||
|
||||
### -discovery-fallback
|
||||
##### -discovery-fallback
|
||||
+ Expected behavior ("exit" or "proxy") when discovery services fails.
|
||||
+ default: "proxy"
|
||||
+ env variable: ETCD_DISCOVERY_FALLBACK
|
||||
|
||||
### -discovery-proxy
|
||||
##### -discovery-proxy
|
||||
+ HTTP proxy to use for traffic to discovery service.
|
||||
+ default: none
|
||||
+ env variable: ETCD_DISCOVERY_PROXY
|
||||
|
||||
### -strict-reconfig-check
|
||||
+ Reject reconfiguration requests that would cause quorum loss.
|
||||
+ default: false
|
||||
+ env variable: ETCD_STRICT_RECONFIG_CHECK
|
||||
|
||||
## Proxy Flags
|
||||
### Proxy Flags
|
||||
|
||||
`-proxy` prefix flags configures etcd to run in [proxy mode][proxy].
|
||||
|
||||
### -proxy
|
||||
##### -proxy
|
||||
+ Proxy mode setting ("off", "readonly" or "on").
|
||||
+ default: "off"
|
||||
+ env variable: ETCD_PROXY
|
||||
|
||||
### -proxy-failure-wait
|
||||
##### -proxy-failure-wait
|
||||
+ Time (in milliseconds) an endpoint will be held in a failed state before being reconsidered for proxied requests.
|
||||
+ default: 5000
|
||||
+ env variable: ETCD_PROXY_FAILURE_WAIT
|
||||
|
||||
### -proxy-refresh-interval
|
||||
##### -proxy-refresh-interval
|
||||
+ Time (in milliseconds) of the endpoints refresh interval.
|
||||
+ default: 30000
|
||||
+ env variable: ETCD_PROXY_REFRESH_INTERVAL
|
||||
|
||||
### -proxy-dial-timeout
|
||||
##### -proxy-dial-timeout
|
||||
+ Time (in milliseconds) for a dial to timeout or 0 to disable the timeout
|
||||
+ default: 1000
|
||||
+ env variable: ETCD_PROXY_DIAL_TIMEOUT
|
||||
|
||||
### -proxy-write-timeout
|
||||
##### -proxy-write-timeout
|
||||
+ Time (in milliseconds) for a write to timeout or 0 to disable the timeout.
|
||||
+ default: 5000
|
||||
+ env variable: ETCD_PROXY_WRITE_TIMEOUT
|
||||
|
||||
### -proxy-read-timeout
|
||||
##### -proxy-read-timeout
|
||||
+ Time (in milliseconds) for a read to timeout or 0 to disable the timeout.
|
||||
+ Don't change this value if you use watches because they are using long polling requests.
|
||||
+ default: 0
|
||||
+ env variable: ETCD_PROXY_READ_TIMEOUT
|
||||
|
||||
## Security Flags
|
||||
### Security Flags
|
||||
|
||||
The security flags help to [build a secure etcd cluster][security].
|
||||
|
||||
### -ca-file [DEPRECATED]
|
||||
##### -ca-file [DEPRECATED]
|
||||
+ Path to the client server TLS CA file. `-ca-file ca.crt` could be replaced by `-trusted-ca-file ca.crt -client-cert-auth` and etcd will perform the same.
|
||||
+ default: none
|
||||
+ env variable: ETCD_CA_FILE
|
||||
|
||||
### -cert-file
|
||||
##### -cert-file
|
||||
+ Path to the client server TLS cert file.
|
||||
+ default: none
|
||||
+ env variable: ETCD_CERT_FILE
|
||||
|
||||
### -key-file
|
||||
##### -key-file
|
||||
+ Path to the client server TLS key file.
|
||||
+ default: none
|
||||
+ env variable: ETCD_KEY_FILE
|
||||
|
||||
### -client-cert-auth
|
||||
##### -client-cert-auth
|
||||
+ Enable client cert authentication.
|
||||
+ default: false
|
||||
+ env variable: ETCD_CLIENT_CERT_AUTH
|
||||
|
||||
### -trusted-ca-file
|
||||
##### -trusted-ca-file
|
||||
+ Path to the client server TLS trusted CA key file.
|
||||
+ default: none
|
||||
+ env variable: ETCD_TRUSTED_CA_FILE
|
||||
|
||||
### -peer-ca-file [DEPRECATED]
|
||||
##### -peer-ca-file [DEPRECATED]
|
||||
+ Path to the peer server TLS CA file. `-peer-ca-file ca.crt` could be replaced by `-peer-trusted-ca-file ca.crt -peer-client-cert-auth` and etcd will perform the same.
|
||||
+ default: none
|
||||
+ env variable: ETCD_PEER_CA_FILE
|
||||
|
||||
### -peer-cert-file
|
||||
##### -peer-cert-file
|
||||
+ Path to the peer server TLS cert file.
|
||||
+ default: none
|
||||
+ env variable: ETCD_PEER_CERT_FILE
|
||||
|
||||
### -peer-key-file
|
||||
##### -peer-key-file
|
||||
+ Path to the peer server TLS key file.
|
||||
+ default: none
|
||||
+ env variable: ETCD_PEER_KEY_FILE
|
||||
|
||||
### -peer-client-cert-auth
|
||||
##### -peer-client-cert-auth
|
||||
+ Enable peer client cert authentication.
|
||||
+ default: false
|
||||
+ env variable: ETCD_PEER_CLIENT_CERT_AUTH
|
||||
|
||||
### -peer-trusted-ca-file
|
||||
##### -peer-trusted-ca-file
|
||||
+ Path to the peer server TLS trusted CA file.
|
||||
+ default: none
|
||||
+ env variable: ETCD_PEER_TRUSTED_CA_FILE
|
||||
|
||||
## Logging Flags
|
||||
### Logging Flags
|
||||
|
||||
### -debug
|
||||
##### -debug
|
||||
+ Drop the default log level to DEBUG for all subpackages.
|
||||
+ default: false (INFO for all packages)
|
||||
+ env variable: ETCD_DEBUG
|
||||
|
||||
### -log-package-levels
|
||||
##### -log-package-levels
|
||||
+ Set individual etcd subpackages to specific log levels. An example being `etcdserver=WARNING,security=DEBUG`
|
||||
+ default: none (INFO for all packages)
|
||||
+ env variable: ETCD_LOG_PACKAGE_LEVELS
|
||||
|
||||
|
||||
## Unsafe Flags
|
||||
### Unsafe Flags
|
||||
|
||||
Please be CAUTIOUS when using unsafe flags because it will break the guarantees given by the consensus protocol.
|
||||
For example, it may panic if other members in the cluster are still alive.
|
||||
Follow the instructions when using these flags.
|
||||
|
||||
### -force-new-cluster
|
||||
+ Force to create a new one-member cluster. It commits configuration changes forcing to remove all existing members in the cluster and add itself. It needs to be set to [restore a backup][restore].
|
||||
##### -force-new-cluster
|
||||
+ Force to create a new one-member cluster. It commits configuration changes in force to remove all existing members in the cluster and add itself. It needs to be set to [restore a backup][restore].
|
||||
+ default: false
|
||||
+ env variable: ETCD_FORCE_NEW_CLUSTER
|
||||
|
||||
## Experimental Flags
|
||||
### Experimental Flags
|
||||
|
||||
### -experimental-v3demo
|
||||
##### -experimental-v3demo
|
||||
+ Enable experimental [v3 demo API](rfc/v3api.proto).
|
||||
+ default: false
|
||||
+ env variable: ETCD_EXPERIMENTAL_V3DEMO
|
||||
|
||||
## Miscellaneous Flags
|
||||
### Miscellaneous Flags
|
||||
|
||||
### -version
|
||||
##### -version
|
||||
+ Print the version and exit.
|
||||
+ default: false
|
||||
|
||||
## Profiling flags
|
||||
|
||||
### -enable-pprof
|
||||
+ Enable runtime profiling data via HTTP server. Address is at client URL + "/debug/pprof"
|
||||
+ default: false
|
||||
|
||||
[build-cluster]: clustering.md#static
|
||||
[reconfig]: runtime-configuration.md
|
||||
[discovery]: clustering.md#discovery
|
||||
|
@@ -4,7 +4,7 @@ Discovery service protocol helps new etcd member to discover all other members i
|
||||
|
||||
Discovery service protocol is _only_ used in cluster bootstrap phase, and cannot be used for runtime reconfiguration or cluster monitoring.
|
||||
|
||||
The protocol uses a new discovery token to bootstrap one _unique_ etcd cluster. Remember that one discovery token can represent only one etcd cluster. As long as discovery protocol on this token starts, even if it fails halfway, it must not be used to bootstrap another etcd cluster.
|
||||
The protocol uses a new discovery token to bootstrap one _unique_ etcd cluster. Remember that one discovery token can represent only one etcd cluster. As long as discovery protocol on this token starts, even if fails halfway, it must not be used to bootstrap another etcd cluster.
|
||||
|
||||
The rest of this article will walk through the discovery process with examples that correspond to a self-hosted discovery cluster. The public discovery service, discovery.etcd.io, functions the same way, but with a layer of polish to abstract away ugly URLs, generate UUIDs automatically, and provide some protections against excessive requests. At its core, the public discovery service still uses an etcd cluster as the data store as described in this document.
|
||||
|
||||
@@ -14,7 +14,7 @@ The idea of discovery protocol is to use an internal etcd cluster to coordinate
|
||||
|
||||
In the following example workflow, we will list each step of protocol in curl format for ease of understanding.
|
||||
|
||||
By convention the etcd discovery protocol uses the key prefix `_etcd/registry`. If `http://example.com` hosts an etcd cluster for discovery service, a full URL to discovery keyspace will be `http://example.com/v2/keys/_etcd/registry`. We will use this as the URL prefix in the example.
|
||||
By convention the etcd discovery protocol uses the key prefix `_etcd/registry`. If `http://example.com` hosts a etcd cluster for discovery service, a full URL to discovery keyspace will be `http://example.com/v2/keys/_etcd/registry`. We will use this as the URL prefix in the example.
|
||||
|
||||
### Creating a New Discovery Token
|
||||
|
||||
|
@@ -12,11 +12,9 @@ export HostIP="192.168.12.50"
|
||||
|
||||
The following `docker run` command will expose the etcd client API over ports 4001 and 2379, and expose the peer port over 2380.
|
||||
|
||||
This will run the latest release version of etcd. You can specify version if needed (e.g. `quay.io/coreos/etcd:v2.2.0`).
|
||||
|
||||
```
|
||||
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
|
||||
--name etcd quay.io/coreos/etcd \
|
||||
--name etcd quay.io/coreos/etcd:v2.0.8 \
|
||||
-name etcd0 \
|
||||
-advertise-client-urls http://${HostIP}:2379,http://${HostIP}:4001 \
|
||||
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
|
||||
@@ -46,7 +44,7 @@ The main difference being the value used for the `-initial-cluster` flag, which
|
||||
|
||||
```
|
||||
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
|
||||
--name etcd quay.io/coreos/etcd \
|
||||
--name etcd quay.io/coreos/etcd:v2.0.8 \
|
||||
-name etcd0 \
|
||||
-advertise-client-urls http://192.168.12.50:2379,http://192.168.12.50:4001 \
|
||||
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
|
||||
@@ -61,7 +59,7 @@ docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380
|
||||
|
||||
```
|
||||
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
|
||||
--name etcd quay.io/coreos/etcd \
|
||||
--name etcd quay.io/coreos/etcd:v2.0.8 \
|
||||
-name etcd1 \
|
||||
-advertise-client-urls http://192.168.12.51:2379,http://192.168.12.51:4001 \
|
||||
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
|
||||
@@ -76,7 +74,7 @@ docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380
|
||||
|
||||
```
|
||||
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
|
||||
--name etcd quay.io/coreos/etcd \
|
||||
--name etcd quay.io/coreos/etcd:v2.0.8 \
|
||||
-name etcd2 \
|
||||
-advertise-client-urls http://192.168.12.52:2379,http://192.168.12.52:4001 \
|
||||
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# Error Code
|
||||
Error Code
|
||||
======
|
||||
|
||||
This document describes the error code used in key space '/v2/keys'. Feel free to import 'github.com/coreos/etcd/error' to use.
|
||||
|
@@ -37,7 +37,7 @@ timeout.
|
||||
|
||||
A proxy is a redirection server to the etcd cluster. The proxy handles the
|
||||
redirection of a client to the current configuration of the etcd cluster. A
|
||||
typical use case is to start a proxy on a machine, and on first boot up of the
|
||||
typical usecase is to start a proxy on a machine, and on first boot up of the
|
||||
proxy specify both the `--proxy` flag and the `--initial-cluster` flag.
|
||||
|
||||
From there, any etcdctl client that starts up automatically speaks to the local
|
||||
@@ -57,26 +57,24 @@ and their integration with the reconfiguration API.
|
||||
Thus, a member that is down, even infinitely, will never be automatically
|
||||
removed from the etcd cluster member list.
|
||||
|
||||
This makes sense because it's usually an application level / administrative
|
||||
This makes sense because its usually an application level / administrative
|
||||
action to determine whether a reconfiguration should happen based on health.
|
||||
|
||||
For more information, refer to
|
||||
[Documentation/runtime-reconf-design.md](https://github.com/coreos/etcd/blob/master/Documentation/runtime-reconf-design.md).
|
||||
For more information, refer to [Documentation/runtime-reconfiguration.md].
|
||||
|
||||
## 6) how does --endpoint work with etcdctl?
|
||||
## 6) how does --peers work with etcdctl?
|
||||
|
||||
The `--endpoint` flag can specify any number of etcd cluster members in a comma
|
||||
The `--peers` flag can specify any number of etcd cluster members in a comma
|
||||
separated list. This list might be a subset, equal to, or more than the actual
|
||||
etcd cluster member list itself.
|
||||
|
||||
If only one peer is specified via the `--endpoint` flag, the etcdctl discovers the
|
||||
If only one peer is specified via the `--peers` flag, the etcdctl discovers the
|
||||
rest of the cluster via the member list of that one peer, and then it randomly
|
||||
chooses a member to use. Again, the client can use the `quorum=true` flag on
|
||||
reads, which will always fail when using a member in the minority.
|
||||
|
||||
If peers from multiple clusters are specified via the `--endpoint` flag, etcdctl
|
||||
If peers from multiple clusters are specified via the `--peers` flag, etcdctl
|
||||
will randomly choose a peer, and the request will simply get routed to one of
|
||||
the clusters. This is probably not what you want.
|
||||
|
||||
Note: --peers flag is now deprecated and --endpoint should be used instead,
|
||||
as it might confuse users to give etcdctl a peerURL.
|
||||
|
||||
|
@@ -1,35 +1,35 @@
|
||||
# Glossary
|
||||
## Glossary
|
||||
|
||||
This document defines the various terms used in etcd documentation, command line and source code.
|
||||
|
||||
## Node
|
||||
### Node
|
||||
|
||||
Node is an instance of raft state machine.
|
||||
|
||||
It has a unique identification, and records other nodes' progress internally when it is the leader.
|
||||
|
||||
## Member
|
||||
### Member
|
||||
|
||||
Member is an instance of etcd. It hosts a node, and provides service to clients.
|
||||
|
||||
## Cluster
|
||||
### Cluster
|
||||
|
||||
Cluster consists of several members.
|
||||
|
||||
The node in each member follows raft consensus protocol to replicate logs. Cluster receives proposals from members, commits them and apply to local store.
|
||||
|
||||
## Peer
|
||||
### Peer
|
||||
|
||||
Peer is another member of the same cluster.
|
||||
|
||||
## Proposal
|
||||
### Proposal
|
||||
|
||||
A proposal is a request (for example a write request, a configuration change request) that needs to go through raft protocol.
|
||||
|
||||
## Client
|
||||
### Client
|
||||
|
||||
Client is a caller of the cluster's HTTP API.
|
||||
|
||||
## Machine (deprecated)
|
||||
### Machine (deprecated)
|
||||
|
||||
The alternative of Member in etcd before 2.0
|
||||
|
@@ -46,7 +46,7 @@ ExecStart=/usr/bin/etcd
|
||||
|
||||
There are several error cases:
|
||||
|
||||
0) Init has already run and the data directory is already configured
|
||||
0) Init has already ran and the data directory is already configured
|
||||
1) Discovery fails because of network timeout, etc
|
||||
2) Discovery fails because the cluster is already full and etcd needs to fall back to proxy
|
||||
3) Static cluster configuration fails because of conflict, misconfiguration or timeout
|
||||
|
@@ -1,24 +1,19 @@
|
||||
# Libraries and Tools
|
||||
## Libraries and Tools
|
||||
|
||||
**Tools**
|
||||
|
||||
- [etcdctl](https://github.com/coreos/etcd/tree/master/etcdctl) - A command line client for etcd
|
||||
- [etcdctl](https://github.com/coreos/etcdctl) - A command line client for etcd
|
||||
- [etcd-backup](https://github.com/fanhattan/etcd-backup) - A powerful command line utility for dumping/restoring etcd - Supports v2
|
||||
- [etcd-dump](https://npmjs.org/package/etcd-dump) - Command line utility for dumping/restoring etcd.
|
||||
- [etcd-fs](https://github.com/xetorthio/etcd-fs) - FUSE filesystem for etcd
|
||||
- [etcddir](https://github.com/rekby/etcddir) - Realtime sync etcd and local directory. Work with windows and linux.
|
||||
- [etcd-browser](https://github.com/henszey/etcd-browser) - A web-based key/value editor for etcd using AngularJS
|
||||
- [etcd-lock](https://github.com/datawisesystems/etcd-lock) - Master election & distributed r/w lock implementation using etcd - Supports v2
|
||||
- [etcd-console](https://github.com/matishsiao/etcd-console) - A web-base key/value editor for etcd using PHP
|
||||
- [etcd-viewer](https://github.com/nikfoundas/etcd-viewer) - An etcd key-value store editor/viewer written in Java
|
||||
- [etcdtool](https://github.com/mickep76/etcdtool) - Export/Import/Edit etcd directory as JSON/YAML/TOML and Validate directory using JSON schema
|
||||
- [etcd-rest](https://github.com/mickep76/etcd-rest) - Create generic REST API in Go using etcd as a backend with validation using JSON schema
|
||||
- [etcdsh](https://github.com/kamilhark/etcdsh) - A command line client with support of command history and tab completion. Supports v2
|
||||
|
||||
**Go libraries**
|
||||
|
||||
- [etcd/client](https://github.com/coreos/etcd/blob/master/client) - the officially maintained Go client
|
||||
- [go-etcd](https://github.com/coreos/go-etcd) - the deprecated official client. May be useful for older (<2.0.0) versions of etcd.
|
||||
- [go-etcd](https://github.com/coreos/go-etcd) - Supports v2
|
||||
|
||||
**Java libraries**
|
||||
|
||||
@@ -54,7 +49,6 @@
|
||||
|
||||
**C++ libraries**
|
||||
- [edwardcapriolo/etcdcpp](https://github.com/edwardcapriolo/etcdcpp) - Supports v2
|
||||
- [suryanathan/etcdcpp](https://github.com/suryanathan/etcdcpp) - Supports v2 (with waits)
|
||||
|
||||
**Clojure libraries**
|
||||
|
||||
@@ -68,7 +62,6 @@
|
||||
|
||||
**.Net Libraries**
|
||||
|
||||
- [wangjia184/etcdnet](https://github.com/wangjia184/etcdnet) - Supports v2
|
||||
- [drusellers/etcetera](https://github.com/drusellers/etcetera)
|
||||
|
||||
**PHP Libraries**
|
||||
@@ -87,6 +80,10 @@
|
||||
|
||||
- [efrecon/etcd-tcl](https://github.com/efrecon/etcd-tcl) - Supports v2, except wait.
|
||||
|
||||
A detailed recap of client functionalities can be found in the [clients compatibility matrix][clients-matrix.md].
|
||||
|
||||
[clients-matrix.md]: https://github.com/coreos/etcd/blob/master/Documentation/clients-matrix.md
|
||||
|
||||
**Chef Integration**
|
||||
|
||||
- [coderanger/etcd-chef](https://github.com/coderanger/etcd-chef)
|
||||
@@ -114,7 +111,7 @@
|
||||
- [configdb](https://git.autistici.org/ai/configdb/tree/master) - A REST relational abstraction on top of arbitrary database backends, aimed at storing configs and inventories.
|
||||
- [scrz](https://github.com/scrz/scrz) - Container manager, stores configuration in etcd.
|
||||
- [fleet](https://github.com/coreos/fleet) - Distributed init system
|
||||
- [kubernetes/kubernetes](https://github.com/kubernetes/kubernetes) - Container cluster manager introduced by Google.
|
||||
- [GoogleCloudPlatform/kubernetes](https://github.com/GoogleCloudPlatform/kubernetes) - Container cluster manager.
|
||||
- [mailgun/vulcand](https://github.com/mailgun/vulcand) - HTTP proxy that uses etcd as a configuration backend.
|
||||
- [duedil-ltd/discodns](https://github.com/duedil-ltd/discodns) - Simple DNS nameserver using etcd as a database for names and records.
|
||||
- [skynetservices/skydns](https://github.com/skynetservices/skydns) - RFC compliant DNS server
|
||||
|
@@ -1,120 +0,0 @@
|
||||
# Members API
|
||||
|
||||
* [List members](#list-members)
|
||||
* [Add a member](#add-a-member)
|
||||
* [Delete a member](#delete-a-member)
|
||||
* [Change the peer urls of a member](#change-the-peer-urls-of-a-member)
|
||||
|
||||
## List members
|
||||
|
||||
Return an HTTP 200 OK response code and a representation of all members in the etcd cluster.
|
||||
|
||||
### Request
|
||||
|
||||
```
|
||||
GET /v2/members HTTP/1.1
|
||||
```
|
||||
|
||||
### Example
|
||||
|
||||
```sh
|
||||
curl http://10.0.0.10:2379/v2/members
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"members": [
|
||||
{
|
||||
"id": "272e204152",
|
||||
"name": "infra1",
|
||||
"peerURLs": [
|
||||
"http://10.0.0.10:2380"
|
||||
],
|
||||
"clientURLs": [
|
||||
"http://10.0.0.10:2379"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "2225373f43",
|
||||
"name": "infra2",
|
||||
"peerURLs": [
|
||||
"http://10.0.0.11:2380"
|
||||
],
|
||||
"clientURLs": [
|
||||
"http://10.0.0.11:2379"
|
||||
]
|
||||
},
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Add a member
|
||||
|
||||
Returns an HTTP 201 response code and the representation of added member with a newly generated a memberID when successful. Returns a string describing the failure condition when unsuccessful.
|
||||
|
||||
If the POST body is malformed an HTTP 400 will be returned. If the member exists in the cluster or existed in the cluster at some point in the past an HTTP 409 will be returned. If any of the given peerURLs exists in the cluster an HTTP 409 will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
|
||||
|
||||
### Request
|
||||
|
||||
```
|
||||
POST /v2/members HTTP/1.1
|
||||
|
||||
{"peerURLs": ["http://10.0.0.10:2380"]}
|
||||
```
|
||||
|
||||
### Example
|
||||
|
||||
```sh
|
||||
curl http://10.0.0.10:2379/v2/members -XPOST \
|
||||
-H "Content-Type: application/json" -d '{"peerURLs":["http://10.0.0.10:2380"]}'
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"id": "3777296169",
|
||||
"peerURLs": [
|
||||
"http://10.0.0.10:2380"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Delete a member
|
||||
|
||||
Remove a member from the cluster. The member ID must be a hex-encoded uint64.
|
||||
Returns 204 with empty content when successful. Returns a string describing the failure condition when unsuccessful.
|
||||
|
||||
If the member does not exist in the cluster an HTTP 500(TODO: fix this) will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
|
||||
|
||||
### Request
|
||||
|
||||
```
|
||||
DELETE /v2/members/<id> HTTP/1.1
|
||||
```
|
||||
|
||||
### Example
|
||||
|
||||
```sh
|
||||
curl http://10.0.0.10:2379/v2/members/272e204152 -XDELETE
|
||||
```
|
||||
|
||||
## Change the peer urls of a member
|
||||
|
||||
Change the peer urls of a given member. The member ID must be a hex-encoded uint64. Returns 204 with empty content when successful. Returns a string describing the failure condition when unsuccessful.
|
||||
|
||||
If the POST body is malformed an HTTP 400 will be returned. If the member does not exist in the cluster an HTTP 404 will be returned. If any of the given peerURLs exists in the cluster an HTTP 409 will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
|
||||
|
||||
### Request
|
||||
|
||||
```
|
||||
PUT /v2/members/<id> HTTP/1.1
|
||||
|
||||
{"peerURLs": ["http://10.0.0.10:2380"]}
|
||||
```
|
||||
|
||||
### Example
|
||||
|
||||
```sh
|
||||
curl http://10.0.0.10:2379/v2/members/272e204152 -XPUT \
|
||||
-H "Content-Type: application/json" -d '{"peerURLs":["http://10.0.0.10:2380"]}'
|
||||
```
|
||||
|
@@ -1,95 +1,101 @@
|
||||
# Metrics
|
||||
## Metrics
|
||||
|
||||
**NOTE: The metrics feature is considered experimental. We may add/change/remove metrics without warning in future releases.**
|
||||
**NOTE: The metrics feature is considered as an experimental. We might add/change/remove metrics without warning in the future releases.**
|
||||
|
||||
etcd uses [Prometheus](http://prometheus.io/) for metrics reporting in the server. The metrics can be used for real-time monitoring and debugging.
|
||||
etcd only stores these data in memory. If a member restarts, metrics will reset.
|
||||
|
||||
The simplest way to see the available metrics is to cURL the metrics endpoint `/metrics` of etcd. The format is described [here](http://prometheus.io/docs/instrumenting/exposition_formats/).
|
||||
|
||||
|
||||
You can also follow the doc [here](http://prometheus.io/docs/introduction/getting_started/) to start a Prometheus server and monitor etcd metrics.
|
||||
You can also follow the doc [here](http://prometheus.io/docs/introduction/getting_started/) to start a Promethus server and monitor etcd metrics.
|
||||
|
||||
The naming of metrics follows the suggested [best practice of Prometheus](http://prometheus.io/docs/practices/naming/). A metric name has an `etcd` prefix as its namespace and a subsystem prefix (for example `wal` and `etcdserver`).
|
||||
The naming of metrics follows the suggested [best practice of Promethus](http://prometheus.io/docs/practices/naming/). A metric name has an `etcd` prefix as its namespace and a subsystem prefix (for example `wal` and `etcdserver`).
|
||||
|
||||
etcd now exposes the following metrics:
|
||||
|
||||
## etcdserver
|
||||
### etcdserver
|
||||
|
||||
| Name | Description | Type |
|
||||
|-----------------------------------------|--------------------------------------------------|-----------|
|
||||
| file_descriptors_used_total | The total number of file descriptors used | Gauge |
|
||||
| proposal_durations_seconds | The latency distributions of committing proposal | Histogram |
|
||||
| pending_proposal_total | The total number of pending proposals | Gauge |
|
||||
| proposal_failed_total | The total number of failed proposals | Counter |
|
||||
| Name | Description | Type |
|
||||
|-----------------------------------------|--------------------------------------------------|---------|
|
||||
| file_descriptors_used_total | The total number of file descriptors used | Gauge |
|
||||
| proposal_durations_milliseconds | The latency distributions of committing proposal | Summary |
|
||||
| pending_proposal_total | The total number of pending proposals | Gauge |
|
||||
| proposal_failed_total | The total number of failed proposals | Counter |
|
||||
|
||||
High file descriptors (`file_descriptors_used_total`) usage (near the file descriptors limitation of the process) indicates a potential out of file descriptors issue. That might cause etcd fails to create new WAL files and panics.
|
||||
|
||||
[Proposal](glossary.md#proposal) durations (`proposal_durations_seconds`) give you a histogram about the proposal commit latency. Latency can be introduced into this process by network and disk IO.
|
||||
[Proposal](glossary.md#proposal) durations (`proposal_durations_milliseconds`) give you an summary about the proposal commit latency. Latency can be introduced into this process by network and disk IO.
|
||||
|
||||
Pending proposal (`pending_proposal_total`) gives you an idea about how many proposal are in the queue and waiting for commit. An increasing pending number indicates a high client load or an unstable cluster.
|
||||
|
||||
Failed proposals (`proposal_failed_total`) are normally related to two issues: temporary failures related to a leader election or longer duration downtime caused by a loss of quorum in the cluster.
|
||||
|
||||
## wal
|
||||
|
||||
| Name | Description | Type |
|
||||
|------------------------------------|--------------------------------------------------|-----------|
|
||||
| fsync_durations_seconds | The latency distributions of fsync called by wal | Histogram |
|
||||
| last_index_saved | The index of the last entry saved by wal | Gauge |
|
||||
### store
|
||||
|
||||
Abnormally high fsync duration (`fsync_durations_seconds`) indicates disk issues and might cause the cluster to be unstable.
|
||||
These metrics describe the accesses into the data store of etcd members that exist in the cluster. They
|
||||
are useful to count what kind of actions are taken by users. It is also useful to see and whether all etcd members
|
||||
"see" the same set of data mutations, and whether reads and watches (which are local) are equally distributed.
|
||||
|
||||
All these metrics are prefixed with `etcd_store_`.
|
||||
|
||||
## http requests
|
||||
|
||||
These metrics describe the serving of requests (non-watch events) served by etcd members in non-proxy mode: total
|
||||
incoming requests, request failures and processing latency (inc. raft rounds for storage). They are useful for tracking
|
||||
user-generated traffic hitting the etcd cluster .
|
||||
|
||||
All these metrics are prefixed with `etcd_http_`
|
||||
|
||||
| Name | Description | Type |
|
||||
|--------------------------------|-----------------------------------------------------------------------------------------|--------------------|
|
||||
| received_total | Total number of events after parsing and auth. | Counter(method) |
|
||||
| failed_total | Total number of failed events. | Counter(method,error) |
|
||||
| successful_duration_second | Bucketed handling times of the requests, including raft rounds for writes. | Histogram(method) |
|
||||
| Name | Description | Type |
|
||||
|---------------------------|------------------------------------------------------------------------------------------|--------------------|
|
||||
| reads_total | Total number of reads from store, should differ among etcd members (local reads). | Counter(action) |
|
||||
| writes_total | Total number of writes to store, should be same among all etcd members. | Counter(action) |
|
||||
| reads_failed_total | Number of failed reads from store (e.g. key missing) on local reads. | Counter(action) |
|
||||
| writes_failed_total | Number of failed writes to store (e.g. failed compare and swap). | Counter(action) |
|
||||
| expires_total | Total number of expired keys (due to TTL). | Counter |
|
||||
| watch_requests_totals | Total number of incoming watch requests to this etcd member (local watches). | Counter |
|
||||
| watchers | Current count of active watchers on this etcd member. | Gauge |
|
||||
|
||||
Both `reads_total` and `writes_total` count both successful and failed requests. `reads_failed_total` and
|
||||
`writes_failed_total` count failed requests. A lot of failed writes indicate possible contentions on keys (e.g. when
|
||||
doing `compareAndSet`), and read failures indicate that some clients try to access keys that don't exist.
|
||||
|
||||
Example Prometheus queries that may be useful from these metrics (across all etcd members):
|
||||
|
||||
* `sum(rate(etcd_http_failed_total{job="etcd"}[1m]) by (method) / sum(rate(etcd_http_events_received_total{job="etcd"})[1m]) by (method)`
|
||||
|
||||
* `sum(rate(etcd_store_reads_total{job="etcd"}[1m])) by (action)`
|
||||
`max(rate(etcd_store_writes_total{job="etcd"}[1m])) by (action)`
|
||||
|
||||
Shows the fraction of events that failed by HTTP method across all members, across a time window of `1m`.
|
||||
|
||||
* `sum(rate(etcd_http_received_total{job="etcd",method="GET})[1m]) by (method)`
|
||||
`sum(rate(etcd_http_received_total{job="etcd",method~="GET})[1m]) by (method)`
|
||||
Rate of reads and writes by action, across all servers across a time window of `1m`. The reason why `max` is used
|
||||
for writes as opposed to `sum` for reads is because all of etcd nodes in the cluster apply all writes to their stores.
|
||||
Shows the rate of successfull readonly/write queries across all servers, across a time window of `1m`.
|
||||
* `sum(rate(etcd_store_watch_requests_total{job="etcd"}[1m]))`
|
||||
|
||||
Shows the rate of successful readonly/write queries across all servers, across a time window of `1m`.
|
||||
Shows rate of new watch requests per second. Likely driven by how often watched keys change.
|
||||
* `sum(etcd_store_watchers{job="etcd"})`
|
||||
|
||||
* `histogram_quantile(0.9, sum(increase(etcd_http_successful_processing_seconds{job="etcd",method="GET"}[5m]) ) by (le))`
|
||||
`histogram_quantile(0.9, sum(increase(etcd_http_successful_processing_seconds{job="etcd",method!="GET"}[5m]) ) by (le))`
|
||||
|
||||
Show the 0.90-tile latency (in seconds) of read/write (respectively) event handling across all members, with a window of `5m`.
|
||||
|
||||
## snapshot
|
||||
|
||||
| Name | Description | Type |
|
||||
|--------------------------------------------|------------------------------------------------------------|-----------|
|
||||
| snapshot_save_total_durations_seconds | The total latency distributions of save called by snapshot | Histogram |
|
||||
|
||||
Abnormally high snapshot duration (`snapshot_save_total_durations_seconds`) indicates disk issues and might cause the cluster to be unstable.
|
||||
Number of active watchers across all etcd servers.
|
||||
|
||||
|
||||
## rafthttp
|
||||
### wal
|
||||
|
||||
| Name | Description | Type | Labels |
|
||||
|-----------------------------------|--------------------------------------------|--------------|--------------------------------|
|
||||
| message_sent_latency_seconds | The latency distributions of messages sent | HistogramVec | sendingType, msgType, remoteID |
|
||||
| message_sent_failed_total | The total number of failed messages sent | Summary | sendingType, msgType, remoteID |
|
||||
| Name | Description | Type |
|
||||
|------------------------------------|--------------------------------------------------|---------|
|
||||
| fsync_durations_microseconds | The latency distributions of fsync called by wal | Summary |
|
||||
| last_index_saved | The index of the last entry saved by wal | Gauge |
|
||||
|
||||
Abnormally high fsync duration (`fsync_durations_microseconds`) indicates disk issues and might cause the cluster to be unstable.
|
||||
|
||||
### snapshot
|
||||
|
||||
| Name | Description | Type |
|
||||
|--------------------------------------------|------------------------------------------------------------|---------|
|
||||
| snapshot_save_total_durations_microseconds | The total latency distributions of save called by snapshot | Summary |
|
||||
|
||||
Abnormally high snapshot duration (`snapshot_save_total_durations_microseconds`) indicates disk issues and might cause the cluster to be unstable.
|
||||
|
||||
|
||||
Abnormally high message duration (`message_sent_latency_seconds`) indicates network issues and might cause the cluster to be unstable.
|
||||
### rafthttp
|
||||
|
||||
| Name | Description | Type | Labels |
|
||||
|-----------------------------------|--------------------------------------------|---------|--------------------------------|
|
||||
| message_sent_latency_microseconds | The latency distributions of messages sent | Summary | sendingType, msgType, remoteID |
|
||||
| message_sent_failed_total | The total number of failed messages sent | Summary | sendingType, msgType, remoteID |
|
||||
|
||||
|
||||
Abnormally high message duration (`message_sent_latency_microseconds`) indicates network issues and might cause the cluster to be unstable.
|
||||
|
||||
An increase in message failures (`message_sent_failed_total`) indicates more severe network issues and might cause the cluster to be unstable.
|
||||
|
||||
@@ -100,7 +106,7 @@ Label `msgType` is the type of raft message. `MsgApp` is log replication message
|
||||
Label `remoteID` is the member ID of the message destination.
|
||||
|
||||
|
||||
## proxy
|
||||
### proxy
|
||||
|
||||
etcd members operating in proxy mode do not do store operations. They forward all requests
|
||||
to cluster instances.
|
||||
@@ -124,8 +130,8 @@ Example Prometheus queries that may be useful from these metrics (across all etc
|
||||
* `histogram_quantile(0.9, sum(increase(etcd_proxy_events_handling_time_seconds_bucket{job="etcd",method="GET"}[5m])) by (le))`
|
||||
`histogram_quantile(0.9, sum(increase(etcd_proxy_events_handling_time_seconds_bucket{job="etcd",method!="GET"}[5m])) by (le))`
|
||||
|
||||
Show the 0.90-tile latency (in seconds) of handling of user requests across all proxy machines, with a window of `5m`.
|
||||
Show the 0.90-tile latency (in seconds) of handling of user requestsacross all proxy machines, with a window of `5m`.
|
||||
* `sum(rate(etcd_proxy_dropped_total{job="etcd"}[1m])) by (proxying_error)`
|
||||
|
||||
Number of failed request on the proxy. This should be 0, spikes here indicate connectivity issues to etcd cluster.
|
||||
|
||||
|
@@ -1,28 +1,119 @@
|
||||
# Miscellaneous APIs
|
||||
## Members API
|
||||
|
||||
* [Getting the etcd version](#getting-the-etcd-version)
|
||||
* [Checking health of an etcd member node](#checking-health-of-an-etcd-member-node)
|
||||
* [List members](#list-members)
|
||||
* [Add a member](#add-a-member)
|
||||
* [Delete a member](#delete-a-member)
|
||||
* [Change the peer urls of a member](#change-the-peer-urls-of-a-member)
|
||||
|
||||
## Getting the etcd version
|
||||
## List members
|
||||
|
||||
The etcd version of a specific instance can be obtained from the `/version` endpoint.
|
||||
Return an HTTP 200 OK response code and a representation of all members in the etcd cluster.
|
||||
|
||||
### Request
|
||||
|
||||
```
|
||||
GET /v2/members HTTP/1.1
|
||||
```
|
||||
|
||||
### Example
|
||||
|
||||
```sh
|
||||
curl -L http://127.0.0.1:2379/version
|
||||
```
|
||||
|
||||
```
|
||||
etcd 2.0.12
|
||||
```
|
||||
|
||||
## Checking health of an etcd member node
|
||||
|
||||
etcd provides a `/health` endpoint to verify the health of a particular member.
|
||||
|
||||
```sh
|
||||
curl http://10.0.0.10:2379/health
|
||||
curl http://10.0.0.10:2379/v2/members
|
||||
```
|
||||
|
||||
```json
|
||||
{"health": "true"}
|
||||
{
|
||||
"members": [
|
||||
{
|
||||
"id": "272e204152",
|
||||
"name": "infra1",
|
||||
"peerURLs": [
|
||||
"http://10.0.0.10:2380"
|
||||
],
|
||||
"clientURLs": [
|
||||
"http://10.0.0.10:2379"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "2225373f43",
|
||||
"name": "infra2",
|
||||
"peerURLs": [
|
||||
"http://10.0.0.11:2380"
|
||||
],
|
||||
"clientURLs": [
|
||||
"http://10.0.0.11:2379"
|
||||
]
|
||||
},
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Add a member
|
||||
|
||||
Returns an HTTP 201 response code and the representation of added member with a newly generated a memberID when successful. Returns a string describing the failure condition when unsuccessful.
|
||||
|
||||
If the POST body is malformed an HTTP 400 will be returned. If the member exists in the cluster or existed in the cluster at some point in the past an HTTP 409 will be returned. If any of the given peerURLs exists in the cluster an HTTP 409 will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
|
||||
|
||||
### Request
|
||||
|
||||
```
|
||||
POST /v2/members HTTP/1.1
|
||||
|
||||
{"peerURLs": ["http://10.0.0.10:2380"]}
|
||||
```
|
||||
|
||||
### Example
|
||||
|
||||
```sh
|
||||
curl http://10.0.0.10:2379/v2/members -XPOST \
|
||||
-H "Content-Type: application/json" -d '{"peerURLs":["http://10.0.0.10:2380"]}'
|
||||
```
|
||||
|
||||
```json
|
||||
{
|
||||
"id": "3777296169",
|
||||
"peerURLs": [
|
||||
"http://10.0.0.10:2380"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Delete a member
|
||||
|
||||
Remove a member from the cluster. The member ID must be a hex-encoded uint64.
|
||||
Returns 204 with empty content when successful. Returns a string describing the failure condition when unsuccessful.
|
||||
|
||||
If the member does not exist in the cluster an HTTP 500(TODO: fix this) will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
|
||||
|
||||
### Request
|
||||
|
||||
```
|
||||
DELETE /v2/members/<id> HTTP/1.1
|
||||
```
|
||||
|
||||
### Example
|
||||
|
||||
```sh
|
||||
curl http://10.0.0.10:2379/v2/members/272e204152 -XDELETE
|
||||
```
|
||||
|
||||
## Change the peer urls of a member
|
||||
|
||||
Change the peer urls of a given member. The member ID must be a hex-encoded uint64. Returns 204 with empty content when successful. Returns a string describing the failure condition when unsuccessful.
|
||||
|
||||
If the POST body is malformed an HTTP 400 will be returned. If the member does not exist in the cluster an HTTP 404 will be returned. If any of the given peerURLs exists in the cluster an HTTP 409 will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
|
||||
|
||||
#### Request
|
||||
|
||||
```
|
||||
PUT /v2/members/<id> HTTP/1.1
|
||||
|
||||
{"peerURLs": ["http://10.0.0.10:2380"]}
|
||||
```
|
||||
|
||||
#### Example
|
||||
|
||||
```sh
|
||||
curl http://10.0.0.10:2379/v2/members/272e204152 -XPUT \
|
||||
-H "Content-Type: application/json" -d '{"peerURLs":["http://10.0.0.10:2380"]}'
|
||||
```
|
||||
|
4
Documentation/production-ready.md
Normal file
4
Documentation/production-ready.md
Normal file
@@ -0,0 +1,4 @@
|
||||
etcd is being used successfully by many companies in production. It is,
|
||||
however, under active development and systems like etcd are difficult to get
|
||||
correct. If you are comfortable with bleeding-edge software please use etcd and
|
||||
provide us with the feedback and testing young software needs.
|
@@ -1,49 +0,0 @@
|
||||
# Production Users
|
||||
|
||||
This document tracks people and use cases for etcd in production. By creating a list of production use cases we hope to build a community of advisors that we can reach out to with experience using various etcd applications, operation environments, and cluster sizes. The etcd development team may reach out periodically to check-in on your experience and update this list.
|
||||
|
||||
## discovery.etcd.io
|
||||
|
||||
- *Application*: https://github.com/coreos/discovery.etcd.io
|
||||
- *Launched*: Feb. 2014
|
||||
- *Cluster Size*: 5 members, 5 discovery proxies
|
||||
- *Order of Data Size*: 100s of Megabytes
|
||||
- *Operator*: CoreOS, brandon.philips@coreos.com
|
||||
- *Environment*: AWS
|
||||
- *Backups*: Periodic async to S3
|
||||
|
||||
discovery.etcd.io is the longest continuously running etcd backed service that we know about. It is the basis of automatic cluster bootstrap and was launched in Feb. 2014: https://coreos.com/blog/etcd-0.3.0-released/.
|
||||
|
||||
## OpenTable
|
||||
|
||||
- *Application*: OpenTable internal service discovery and cluster configuration management
|
||||
- *Launched*: May 2014
|
||||
- *Cluster Size*: 3 members each in 6 independent clusters; approximately 50 nodes reading / writing
|
||||
- *Order of Data Size*: 10s of MB
|
||||
- *Operator*: OpenTable, Inc; sschlansker@opentable.com
|
||||
- *Environment*: AWS, VMWare
|
||||
- *Backups*: None, all data can be re-created if necessary.
|
||||
|
||||
## cycoresys.com
|
||||
|
||||
- *Application*: multiple
|
||||
- *Launched*: Jul. 2014
|
||||
- *Cluster Size*: 3 members, _n_ proxies
|
||||
- *Order of Data Size*: 100s of kilobytes
|
||||
- *Operator*: CyCore Systems, Inc, sys@cycoresys.com
|
||||
- *Environment*: Baremetal
|
||||
- *Backups*: Periodic sync to Ceph RadosGW and DigitalOcean VM
|
||||
|
||||
CyCore Systems provides architecture and engineering for computing systems. This cluster provides microservices, virtual machines, databases, storage clusters to a number of clients. It is built on CoreOS machines, with each machine in the cluster running etcd as a peer or proxy.
|
||||
|
||||
## Radius Intelligence
|
||||
|
||||
- *Application*: multiple internal tools, Kubernetes clusters, bootstrappable system configs
|
||||
- *Launched*: June 2015
|
||||
- *Cluster Size*: 2 clusters of 5 and 3 members; approximately a dozen nodes read/write
|
||||
- *Order of Data Size*: 100s of kilobytes
|
||||
- *Operator*: Radius Intelligence; jcderr@radius.com
|
||||
- *Environment*: AWS, CoreOS, Kubernetes
|
||||
- *Backups*: None, all data can be recreated if necessary.
|
||||
|
||||
Radius Intelligence uses Kubernetes running CoreOS to containerize and scale internal toolsets. Examples include running [JetBrains TeamCity](https://www.jetbrains.com/teamcity/) and internal AWS security and cost reporting tools. etcd clusters back these clusters as well as provide some basic environment bootstrapping configuration keys.
|
@@ -1,150 +1,37 @@
|
||||
# Proxy
|
||||
## Proxy
|
||||
|
||||
etcd can run as a transparent proxy. Doing so allows for easy discovery of etcd within your infrastructure, since it can run on each machine as a local service. In this mode, etcd acts as a reverse proxy and forwards client requests to an active etcd cluster. The etcd proxy does not participate in the consensus replication of the etcd cluster, thus it neither increases the resilience nor decreases the write performance of the etcd cluster.
|
||||
etcd can now run as a transparent proxy. Running etcd as a proxy allows for easily discovery of etcd within your infrastructure, since it can run on each machine as a local service. In this mode, etcd acts as a reverse proxy and forwards client requests to an active etcd cluster. The etcd proxy does not participate in the consensus replication of the etcd cluster, thus it neither increases the resilience nor decreases the write performance of the etcd cluster.
|
||||
|
||||
etcd currently supports two proxy modes: `readwrite` and `readonly`. The default mode is `readwrite`, which forwards both read and write requests to the etcd cluster. A `readonly` etcd proxy only forwards read requests to the etcd cluster, and returns `HTTP 501` to all write requests.
|
||||
etcd currently supports two proxy modes: `readwrite` and `readonly`. The default mode is `readwrite`, which forwards both read and write requests to the etcd cluster. A `readonly` etcd proxy only forwards read requests to the etcd cluster, and returns `HTTP 501` to all write requests.
|
||||
|
||||
The proxy will shuffle the list of cluster members periodically to avoid sending all connections to a single member.
|
||||
|
||||
The member list used by an etcd proxy consists of all client URLs advertised in the cluster. These client URLs are specified in each etcd cluster member's `advertise-client-urls` option.
|
||||
The member list used by proxy consists of all client URLs advertised within the cluster, as specified in each members' `-advertise-client-urls` flag. If this flag is set incorrectly, requests sent to the proxy are forwarded to wrong addresses and then fail. Including URLs in the `-advertise-client-urls` flag that point to the proxy itself, e.g. http://localhost:2379, is even more problematic as it will cause loops, because the proxy keeps trying to forward requests to itself until its resources (memory, file descriptors) are eventually depleted. The fix for this problem is to restart etcd member with correct `-advertise-client-urls` flag. After client URLs list in proxy is recalculated, which happens every 30 seconds, requests will be forwarded correctly.
|
||||
|
||||
An etcd proxy examines several command-line options to discover its peer URLs. In order of precedence, these options are `discovery`, `discovery-srv`, and `initial-cluster`. The `initial-cluster` option is set to a comma-separated list of one or more etcd peer URLs used temporarily in order to discover the permanent cluster.
|
||||
|
||||
After establishing a list of peer URLs in this manner, the proxy retrieves the list of client URLs from the first reachable peer. These client URLs are specified by the `advertise-client-urls` option to etcd peers. The proxy then continues to connect to the first reachable etcd cluster member every thirty seconds to refresh the list of client URLs.
|
||||
|
||||
While etcd proxies therefore do not need to be given the `advertise-client-urls` option, as they retrieve this configuration from the cluster, this implies that `initial-cluster` must be set correctly for every proxy, and the `advertise-client-urls` option must be set correctly for every non-proxy, first-order cluster peer. Otherwise, requests to any etcd proxy would be forwarded improperly. Take special care not to set the `advertise-client-urls` option to URLs that point to the proxy itself, as such a configuration will cause the proxy to enter a loop, forwarding requests to itself until resources are exhausted. To correct either case, stop etcd and restart it with the correct URLs.
|
||||
|
||||
[This example Procfile](https://github.com/coreos/etcd/blob/master/Procfile) illustrates the difference in the etcd peer and proxy command lines used to configure and start a cluster with one proxy under the [goreman process management utility](https://github.com/mattn/goreman).
|
||||
|
||||
To summarize etcd proxy startup and peer discovery:
|
||||
|
||||
1. etcd proxies execute the following steps in order until the cluster *peer-urls* are known:
|
||||
1. If `discovery` is set for the proxy, ask the given discovery service for
|
||||
the *peer-urls*. The *peer-urls* will be the combined
|
||||
`initial-advertise-peer-urls` of all first-order, non-proxy cluster
|
||||
members.
|
||||
2. If `discovery-srv` is set for the proxy, the *peer-urls* are discovered
|
||||
from DNS.
|
||||
3. If `initial-cluster` is set for the proxy, that will become the value of
|
||||
*peer-urls*.
|
||||
4. Otherwise use the default value of
|
||||
`http://localhost:2380,http://localhost:7001`.
|
||||
2. These *peer-urls* are used to contact the (non-proxy) members of the cluster
|
||||
to find their *client-urls*. The *client-urls* will thus be the combined
|
||||
`advertise-client-urls` of all cluster members (i.e. non-proxies).
|
||||
3. Request of clients of the proxy will be forwarded (proxied) to these
|
||||
*client-urls*.
|
||||
|
||||
Always start the first-order etcd cluster members first, then any proxies. A proxy must be able to reach the cluster members to retrieve its configuration, and will attempt connections somewhat aggressively in the absence of such a channel. Starting the members before any proxy ensures the proxy can discover the client URLs when it later starts.
|
||||
|
||||
## Using an etcd proxy
|
||||
To start etcd in proxy mode, you need to provide three flags: `proxy`, `listen-client-urls`, and `initial-cluster` (or `discovery`).
|
||||
### Using an etcd proxy
|
||||
To start etcd in proxy mode, you need to provide three flags: `proxy`, `listen-client-urls`, and `initial-cluster` (or `discovery`).
|
||||
|
||||
To start a readwrite proxy, set `-proxy on`; To start a readonly proxy, set `-proxy readonly`.
|
||||
|
||||
The proxy will be listening on `listen-client-urls` and forward requests to the etcd cluster discovered from in `initial-cluster` or `discovery` url.
|
||||
The proxy will be listening on `listen-client-urls` and forward requests to the etcd cluster discovered from in `initial-cluster` or `discovery` url.
|
||||
|
||||
### Start an etcd proxy with a static configuration
|
||||
#### Start an etcd proxy with a static configuration
|
||||
To start a proxy that will connect to a statically defined etcd cluster, specify the `initial-cluster` flag:
|
||||
|
||||
```
|
||||
etcd -proxy on \
|
||||
-listen-client-urls http://127.0.0.1:8080 \
|
||||
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380
|
||||
etcd -proxy on -listen-client-urls http://127.0.0.1:8080 -initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380
|
||||
```
|
||||
|
||||
### Start an etcd proxy with the discovery service
|
||||
If you bootstrap an etcd cluster using the [discovery service][discovery-service], you can also start the proxy with the same `discovery`.
|
||||
#### Start an etcd proxy with the discovery service
|
||||
If you bootstrap an etcd cluster using the [discovery service][discovery-service], you can also start the proxy with the same `discovery`.
|
||||
|
||||
To start a proxy using the discovery service, specify the `discovery` flag. The proxy will wait until the etcd cluster defined at the `discovery` url finishes bootstrapping, and then start to forward the requests.
|
||||
To start a proxy using the discovery service, specify the `discovery` flag. The proxy will wait until the etcd cluster defined at the `discovery` url finishes bootstrapping, and then start to forward the requests.
|
||||
|
||||
```
|
||||
etcd -proxy on \
|
||||
-listen-client-urls http://127.0.0.1:8080 \
|
||||
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
|
||||
etcd -proxy on -listen-client-urls http://127.0.0.1:8080 -discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
|
||||
```
|
||||
|
||||
## Fallback to proxy mode with discovery service
|
||||
|
||||
If you bootstrap an etcd cluster using [discovery service][discovery-service] with more than the expected number of etcd members, the extra etcd processes will fall back to being `readwrite` proxies by default. They will forward the requests to the cluster as described above. For example, if you create a discovery url with `size=5`, and start ten etcd processes using that same discovery url, the result will be a cluster with five etcd members and five proxies. Note that this behaviour can be disabled with the `proxy-fallback` flag.
|
||||
|
||||
## Promote a proxy to a member of etcd cluster
|
||||
|
||||
A Proxy is in the part of etcd cluster that does not participate in consensus. A proxy will not promote itself to an etcd member that participates in consensus automatically in any case.
|
||||
|
||||
If you want to promote a proxy to an etcd member, there are four steps you need to follow:
|
||||
|
||||
- use etcdctl to add the proxy node as an etcd member into the existing cluster
|
||||
- stop the etcd proxy process or service
|
||||
- remove the existing proxy data directory
|
||||
- restart the etcd process with new member configuration
|
||||
|
||||
## Example
|
||||
|
||||
We assume you have a one member etcd cluster with one proxy. The cluster information is listed below:
|
||||
|
||||
|Name|Address|
|
||||
|------|---------|
|
||||
|infra0|10.0.1.10|
|
||||
|proxy0|10.0.1.11|
|
||||
|
||||
This example walks you through a case that you promote one proxy to an etcd member. The cluster will become a two member cluster after finishing the four steps.
|
||||
|
||||
### Add a new member into the existing cluster
|
||||
|
||||
First, use etcdctl to add the member to the cluster, which will output the environment variables need to correctly configure the new member:
|
||||
|
||||
``` bash
|
||||
$ etcdctl -endpoint http://10.0.1.10:2379 member add infra1 http://10.0.1.11:2380
|
||||
added member 9bf1b35fc7761a23 to cluster
|
||||
|
||||
ETCD_NAME="infra1"
|
||||
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380"
|
||||
ETCD_INITIAL_CLUSTER_STATE=existing
|
||||
```
|
||||
|
||||
### Stop the proxy process
|
||||
|
||||
Stop the existing proxy so we can wipe it's state on disk and reload it with the new configuration:
|
||||
|
||||
``` bash
|
||||
px aux | grep etcd
|
||||
kill %etcd_proxy_pid%
|
||||
```
|
||||
|
||||
or (if you are running etcd proxy as etcd service under systemd)
|
||||
|
||||
``` bash
|
||||
sudo systemctl stop etcd
|
||||
```
|
||||
|
||||
### Remove the existing proxy data dir
|
||||
|
||||
``` bash
|
||||
rm -rf %data_dir%/proxy
|
||||
```
|
||||
|
||||
### Start etcd as a new member
|
||||
|
||||
Finally, start the reconfigured member and make sure it joins the cluster correctly:
|
||||
|
||||
``` bash
|
||||
$ export ETCD_NAME="infra1"
|
||||
$ export ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380"
|
||||
$ export ETCD_INITIAL_CLUSTER_STATE=existing
|
||||
$ etcd -listen-client-urls http://10.0.1.11:2379 \
|
||||
-advertise-client-urls http://10.0.1.11:2379 \
|
||||
-listen-peer-urls http://10.0.1.11:2380 \
|
||||
-initial-advertise-peer-urls http://10.0.1.11:2380 \
|
||||
-data-dir %data_dir%
|
||||
```
|
||||
|
||||
If you are running etcd under systemd, you should modify the service file with correct configuration and restart the service:
|
||||
|
||||
``` bash
|
||||
sudo systemd restart etcd
|
||||
```
|
||||
|
||||
If you see an error, you can read the [add member troubleshooting doc](runtime-configuration.md#error-cases).
|
||||
#### Fallback to proxy mode with discovery service
|
||||
If you bootstrap a etcd cluster using [discovery service][discovery-service] with more than the expected number of etcd members, the extra etcd processes will fall back to being `readwrite` proxies by default. They will forward the requests to the cluster as described above. For example, if you create a discovery url with `size=5`, and start ten etcd processes using that same discovery url, the result will be a cluster with five etcd members and five proxies. Note that this behaviour can be disabled with the `proxy-fallback` flag.
|
||||
|
||||
[discovery-service]: clustering.md#discovery
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# Reporting Bugs
|
||||
## Reporting Bugs
|
||||
|
||||
If you find bugs or documentation mistakes in etcd project, please let us know by [opening an issue](https://github.com/coreos/etcd/issues/new). We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check there that one does not already exist.
|
||||
|
||||
@@ -20,7 +20,7 @@ We might ask you for further information to locate a bug. A duplicated bug repor
|
||||
|
||||
## Frequently Asked Questions
|
||||
|
||||
### How to get a stack trace
|
||||
### How to get stack trace
|
||||
|
||||
``` bash
|
||||
$ kill -QUIT $PID
|
||||
|
@@ -1,10 +1,4 @@
|
||||
# Overview
|
||||
|
||||
The etcd v3 API is designed to give users a more efficient and cleaner abstraction compared to etcd v2. There are a number of semantic and protocol changes in this new API. For an overview [see Xiang Li's video](https://youtu.be/J5AioGtEPeQ?t=211).
|
||||
|
||||
To prove out the design of the v3 API the team has also built [a number of example recipes](https://github.com/coreos/etcd/tree/master/contrib/recipes), there is a [video discussing these recipes too](https://www.youtube.com/watch?v=fj-2RY-3yVU&feature=youtu.be&t=590).
|
||||
|
||||
# Design
|
||||
## Design
|
||||
|
||||
1. Flatten binary key-value space
|
||||
|
||||
@@ -34,24 +28,13 @@ To prove out the design of the v3 API the team has also built [a number of examp
|
||||
- easy for people to write simple etcd application
|
||||
|
||||
|
||||
## Notes
|
||||
|
||||
### Request Size Limitation
|
||||
|
||||
The max request size is around 1MB. Since etcd replicates requests in a streaming fashion, a very large
|
||||
request might block other requests for a long time. The use case for etcd is to store small configuration
|
||||
values, so we prevent user from submitting large requests. This also applies to Txn requests. We might loosen
|
||||
the size in the future a little bit or make it configurable.
|
||||
|
||||
## Protobuf Defined API
|
||||
|
||||
[api protobuf](../../etcdserver/etcdserverpb/rpc.proto)
|
||||
[protobuf](./v3api.proto)
|
||||
|
||||
[kv protobuf](../../storage/storagepb/kv.proto)
|
||||
### Examples
|
||||
|
||||
## Examples
|
||||
|
||||
### Put a key (foo=bar)
|
||||
#### Put a key (foo=bar)
|
||||
```
|
||||
// A put is always successful
|
||||
Put( PutRequest { key = foo, value = bar } )
|
||||
@@ -64,7 +47,7 @@ PutResponse {
|
||||
}
|
||||
```
|
||||
|
||||
### Get a key (assume we have foo=bar)
|
||||
#### Get a key (assume we have foo=bar)
|
||||
```
|
||||
Get ( RangeRequest { key = foo } )
|
||||
|
||||
@@ -85,7 +68,7 @@ RangeResponse {
|
||||
}
|
||||
```
|
||||
|
||||
### Range over a key space (assume we have foo0=bar0… foo100=bar100)
|
||||
#### Range over a key space (assume we have foo0=bar0… foo100=bar100)
|
||||
```
|
||||
Range ( RangeRequest { key = foo, end_key = foo80, limit = 30 } )
|
||||
|
||||
@@ -114,7 +97,7 @@ RangeResponse {
|
||||
}
|
||||
```
|
||||
|
||||
### Finish a txn (assume we have foo0=bar0, foo1=bar1)
|
||||
#### Finish a txn (assume we have foo0=bar0, foo1=bar1)
|
||||
```
|
||||
Txn(TxnRequest {
|
||||
// mod_revision of foo0 is equal to 1, mod_revision of foo1 is greater than 1
|
||||
@@ -146,7 +129,7 @@ TxnResponse {
|
||||
}
|
||||
```
|
||||
|
||||
### Watch on a key/range
|
||||
#### Watch on a key/range
|
||||
|
||||
```
|
||||
Watch( WatchRequest{
|
||||
|
285
Documentation/rfc/v3api.proto
Normal file
285
Documentation/rfc/v3api.proto
Normal file
@@ -0,0 +1,285 @@
|
||||
syntax = "proto3";
|
||||
|
||||
// Interface exported by the server.
|
||||
service etcd {
|
||||
// Range gets the keys in the range from the store.
|
||||
rpc Range(RangeRequest) returns (RangeResponse) {}
|
||||
|
||||
// Put puts the given key into the store.
|
||||
// A put request increases the revision of the store,
|
||||
// and generates one event in the event history.
|
||||
rpc Put(PutRequest) returns (PutResponse) {}
|
||||
|
||||
// Delete deletes the given range from the store.
|
||||
// A delete request increase the revision of the store,
|
||||
// and generates one event in the event history.
|
||||
rpc DeleteRange(DeleteRangeRequest) returns (DeleteRangeResponse) {}
|
||||
|
||||
// Txn processes all the requests in one transaction.
|
||||
// A txn request increases the revision of the store,
|
||||
// and generates events with the same revision in the event history.
|
||||
rpc Txn(TxnRequest) returns (TxnResponse) {}
|
||||
|
||||
// Watch watches the events happening or happened in etcd. Both input and output
|
||||
// are stream. One watch rpc can watch for multiple ranges and get a stream of
|
||||
// events. The whole events history can be watched unless compacted.
|
||||
rpc WatchRange(stream WatchRangeRequest) returns (stream WatchRangeResponse) {}
|
||||
|
||||
// Compact compacts the event history in etcd. User should compact the
|
||||
// event history periodically, or it will grow infinitely.
|
||||
rpc Compact(CompactionRequest) returns (CompactionResponse) {}
|
||||
|
||||
// LeaseCreate creates a lease. A lease has a TTL. The lease will expire if the
|
||||
// server does not receive a keepAlive within TTL from the lease holder.
|
||||
// All keys attached to the lease will be expired and deleted if the lease expires.
|
||||
// The key expiration generates an event in event history.
|
||||
rpc LeaseCreate(LeaseCreateRequest) returns (LeaseCreateResponse) {}
|
||||
|
||||
// LeaseRevoke revokes a lease. All the key attached to the lease will be expired and deleted.
|
||||
rpc LeaseRevoke(LeaseRevokeRequest) returns (LeaseRevokeResponse) {}
|
||||
|
||||
// LeaseAttach attaches keys with a lease.
|
||||
rpc LeaseAttach(LeaseAttachRequest) returns (LeaseAttachResponse) {}
|
||||
|
||||
// LeaseTxn likes Txn. It has two addition success and failure LeaseAttachRequest list.
|
||||
// If the Txn is successful, then the success list will be executed. Or the failure list
|
||||
// will be executed.
|
||||
rpc LeaseTxn(LeaseTxnRequest) returns (LeaseTxnResponse) {}
|
||||
|
||||
// KeepAlive keeps the lease alive.
|
||||
rpc LeaseKeepAlive(stream LeaseKeepAliveRequest) returns (stream LeaseKeepAliveResponse) {}
|
||||
}
|
||||
|
||||
message ResponseHeader {
|
||||
// an error type message?
|
||||
string error = 1;
|
||||
uint64 cluster_id = 2;
|
||||
uint64 member_id = 3;
|
||||
// revision of the store when the request was applied.
|
||||
int64 revision = 4;
|
||||
// term of raft when the request was applied.
|
||||
uint64 raft_term = 5;
|
||||
}
|
||||
|
||||
message RangeRequest {
|
||||
// if the range_end is not given, the request returns the key.
|
||||
bytes key = 1;
|
||||
// if the range_end is given, it gets the keys in range [key, range_end).
|
||||
bytes range_end = 2;
|
||||
// limit the number of keys returned.
|
||||
int64 limit = 3;
|
||||
// range over the store at the given revision.
|
||||
// if revision is less or equal to zero, range over the newest store.
|
||||
// if the revision has been compacted, ErrCompaction will be returned in
|
||||
// response.
|
||||
int64 revision = 4;
|
||||
}
|
||||
|
||||
message RangeResponse {
|
||||
ResponseHeader header = 1;
|
||||
repeated storagepb.KeyValue kvs = 2;
|
||||
// more indicates if there are more keys to return in the requested range.
|
||||
bool more = 3;
|
||||
}
|
||||
|
||||
message PutRequest {
|
||||
bytes key = 1;
|
||||
bytes value = 2;
|
||||
}
|
||||
|
||||
message PutResponse {
|
||||
ResponseHeader header = 1;
|
||||
}
|
||||
|
||||
message DeleteRangeRequest {
|
||||
// if the range_end is not given, the request deletes the key.
|
||||
bytes key = 1;
|
||||
// if the range_end is given, it deletes the keys in range [key, range_end).
|
||||
bytes range_end = 2;
|
||||
}
|
||||
|
||||
message DeleteRangeResponse {
|
||||
ResponseHeader header = 1;
|
||||
}
|
||||
|
||||
message RequestUnion {
|
||||
oneof request {
|
||||
RangeRequest request_range = 1;
|
||||
PutRequest request_put = 2;
|
||||
DeleteRangeRequest request_delete_range = 3;
|
||||
}
|
||||
}
|
||||
|
||||
message ResponseUnion {
|
||||
oneof response {
|
||||
RangeResponse response_range = 1;
|
||||
PutResponse response_put = 2;
|
||||
DeleteRangeResponse response_delete_range = 3;
|
||||
}
|
||||
}
|
||||
|
||||
message Compare {
|
||||
enum CompareResult {
|
||||
EQUAL = 0;
|
||||
GREATER = 1;
|
||||
LESS = 2;
|
||||
}
|
||||
enum CompareTarget {
|
||||
VERSION = 0;
|
||||
CREATE = 1;
|
||||
MOD = 2;
|
||||
VALUE= 3;
|
||||
}
|
||||
CompareResult result = 1;
|
||||
CompareTarget target = 2;
|
||||
// key path
|
||||
bytes key = 3;
|
||||
oneof target_union {
|
||||
// version of the given key
|
||||
int64 version = 4;
|
||||
// create revision of the given key
|
||||
int64 create_revision = 5;
|
||||
// last modified revision of the given key
|
||||
int64 mod_revision = 6;
|
||||
// value of the given key
|
||||
bytes value = 7;
|
||||
}
|
||||
}
|
||||
|
||||
// If the comparisons succeed, then the success requests will be processed in order,
|
||||
// and the response will contain their respective responses in order.
|
||||
// If the comparisons fail, then the failure requests will be processed in order,
|
||||
// and the response will contain their respective responses in order.
|
||||
|
||||
// From google paxosdb paper:
|
||||
// Our implementation hinges around a powerful primitive which we call MultiOp. All other database
|
||||
// operations except for iteration are implemented as a single call to MultiOp. A MultiOp is applied atomically
|
||||
// and consists of three components:
|
||||
// 1. A list of tests called guard. Each test in guard checks a single entry in the database. It may check
|
||||
// for the absence or presence of a value, or compare with a given value. Two different tests in the guard
|
||||
// may apply to the same or different entries in the database. All tests in the guard are applied and
|
||||
// MultiOp returns the results. If all tests are true, MultiOp executes t op (see item 2 below), otherwise
|
||||
// it executes f op (see item 3 below).
|
||||
// 2. A list of database operations called t op. Each operation in the list is either an insert, delete, or
|
||||
// lookup operation, and applies to a single database entry. Two different operations in the list may apply
|
||||
// to the same or different entries in the database. These operations are executed
|
||||
// if guard evaluates to
|
||||
// true.
|
||||
// 3. A list of database operations called f op. Like t op, but executed if guard evaluates to false.
|
||||
message TxnRequest {
|
||||
repeated Compare compare = 1;
|
||||
repeated RequestUnion success = 2;
|
||||
repeated RequestUnion failure = 3;
|
||||
}
|
||||
|
||||
message TxnResponse {
|
||||
ResponseHeader header = 1;
|
||||
bool succeeded = 2;
|
||||
repeated ResponseUnion responses = 3;
|
||||
}
|
||||
|
||||
message KeyValue {
|
||||
bytes key = 1;
|
||||
int64 create_revision = 2;
|
||||
// mod_revision is the last modified revision of the key.
|
||||
int64 mod_revision = 3;
|
||||
// version is the version of the key. A deletion resets
|
||||
// the version to zero and any modification of the key
|
||||
// increases its version.
|
||||
int64 version = 4;
|
||||
bytes value = 5;
|
||||
}
|
||||
|
||||
message WatchRangeRequest {
|
||||
// if the range_end is not given, the request returns the key.
|
||||
bytes key = 1;
|
||||
// if the range_end is given, it gets the keys in range [key, range_end).
|
||||
bytes range_end = 2;
|
||||
// start_revision is an optional revision (including) to watch from. No start_revision is "now".
|
||||
int64 start_revision = 3;
|
||||
// end_revision is an optional revision (excluding) to end watch. No end_revision is "forever".
|
||||
int64 end_revision = 4;
|
||||
bool progress_notification = 5;
|
||||
}
|
||||
|
||||
message WatchRangeResponse {
|
||||
ResponseHeader header = 1;
|
||||
repeated Event events = 2;
|
||||
}
|
||||
|
||||
message Event {
|
||||
enum EventType {
|
||||
PUT = 0;
|
||||
DELETE = 1;
|
||||
EXPIRE = 2;
|
||||
}
|
||||
EventType event_type = 1;
|
||||
// a put event contains the current key-value
|
||||
// a delete/expire event contains the previous
|
||||
// key-value
|
||||
KeyValue kv = 2;
|
||||
}
|
||||
|
||||
// Compaction compacts the kv store upto the given revision (including).
|
||||
// It removes the old versions of a key. It keeps the newest version of
|
||||
// the key even if its latest modification revision is smaller than the given
|
||||
// revision.
|
||||
message CompactionRequest {
|
||||
int64 revision = 1;
|
||||
}
|
||||
|
||||
message CompactionResponse {
|
||||
ResponseHeader header = 1;
|
||||
}
|
||||
|
||||
message LeaseCreateRequest {
|
||||
// advisory ttl in seconds
|
||||
int64 ttl = 1;
|
||||
}
|
||||
|
||||
message LeaseCreateResponse {
|
||||
ResponseHeader header = 1;
|
||||
int64 lease_id = 2;
|
||||
// server decided ttl in second
|
||||
int64 ttl = 3;
|
||||
string error = 4;
|
||||
}
|
||||
|
||||
message LeaseRevokeRequest {
|
||||
int64 lease_id = 1;
|
||||
}
|
||||
|
||||
message LeaseRevokeResponse {
|
||||
ResponseHeader header = 1;
|
||||
}
|
||||
|
||||
message LeaseTxnRequest {
|
||||
TxnRequest request = 1;
|
||||
repeated LeaseAttachRequest success = 2;
|
||||
repeated LeaseAttachRequest failure = 3;
|
||||
}
|
||||
|
||||
message LeaseTxnResponse {
|
||||
ResponseHeader header = 1;
|
||||
TxnResponse response = 2;
|
||||
repeated LeaseAttachResponse attach_responses = 3;
|
||||
}
|
||||
|
||||
message LeaseAttachRequest {
|
||||
int64 lease_id = 1;
|
||||
bytes key = 2;
|
||||
}
|
||||
|
||||
message LeaseAttachResponse {
|
||||
ResponseHeader header = 1;
|
||||
}
|
||||
|
||||
message LeaseKeepAliveRequest {
|
||||
int64 lease_id = 1;
|
||||
}
|
||||
|
||||
message LeaseKeepAliveResponse {
|
||||
ResponseHeader header = 1;
|
||||
int64 lease_id = 2;
|
||||
int64 ttl = 3;
|
||||
}
|
@@ -1,8 +1,8 @@
|
||||
# Runtime Reconfiguration
|
||||
## Runtime Reconfiguration
|
||||
|
||||
etcd comes with support for incremental runtime reconfiguration, which allows users to update the membership of the cluster at run time.
|
||||
|
||||
Reconfiguration requests can only be processed when the majority of the cluster members are functioning. It is **highly recommended** to always have a cluster size greater than two in production. It is unsafe to remove a member from a two member cluster. The majority of a two member cluster is also two. If there is a failure during the removal process, the cluster might not able to make progress and need to [restart from majority failure][majority failure].
|
||||
Reconfiguration requests can only be processed when the the majority of the cluster members are functioning. It is **highly recommended** to always have a cluster size greater than two in production. It is unsafe to remove a member from a two member cluster. The majority of a two member cluster is also two. If there is a failure during the removal process, the cluster might not able to make progress and need to [restart from majority failure][majority failure].
|
||||
|
||||
To better understand the design behind runtime reconfiguration, we suggest you read [this](runtime-reconf-design.md).
|
||||
|
||||
@@ -60,24 +60,11 @@ All changes to the cluster are done one at a time:
|
||||
* To decrease from 5 to 3 you will make two remove operations
|
||||
|
||||
All of these examples will use the `etcdctl` command line tool that ships with etcd.
|
||||
If you want to use the members API directly you can find the documentation [here](members_api.md).
|
||||
If you want to use the member API directly you can find the documentation [here](other_apis.md).
|
||||
|
||||
### Update a Member
|
||||
|
||||
#### Update advertise client URLs
|
||||
|
||||
If you would like to update the advertise client URLs of a member, you can simply restart
|
||||
that member with updated client urls flag (`--advertise-client-urls`) or environment variable
|
||||
(`ETCD_ADVERTISE_CLIENT_URLS`). The restarted member will self publish the updated URLs.
|
||||
A wrongly updated client URL will not affect the health of the etcd cluster.
|
||||
|
||||
#### Update advertise peer URLs
|
||||
|
||||
If you would like to update the advertise peer URLs of a member, you have to first update
|
||||
it explicitly via member command and then restart the member. The additional action is required
|
||||
since updating peer URLs changes the cluster wide configuration and can affect the health of the etcd cluster.
|
||||
|
||||
To update the peer URLs, first, we need to find the target member's ID. You can list all members with `etcdctl`:
|
||||
If you would like to update a member IP address (peerURLs), first, we need to find the target member's ID. You can list all members with `etcdctl`:
|
||||
|
||||
```sh
|
||||
$ etcdctl member list
|
||||
@@ -115,7 +102,7 @@ It is safe to remove the leader, however the cluster will be inactive while a ne
|
||||
|
||||
Adding a member is a two step process:
|
||||
|
||||
* Add the new member to the cluster via the [members API](members_api.md#post-v2members) or the `etcdctl member add` command.
|
||||
* Add the new member to the cluster via the [members API](other_apis.md#post-v2members) or the `etcdctl member add` command.
|
||||
* Start the new member with the new cluster configuration, including a list of the updated members (existing members + the new member).
|
||||
|
||||
Using `etcdctl` let's add the new member to the cluster by specifying its [name](configuration.md#-name) and [advertised peer URLs](configuration.md#-initial-advertise-peer-urls):
|
||||
@@ -144,7 +131,7 @@ The new member will run as a part of the cluster and immediately begin catching
|
||||
If you are adding multiple members the best practice is to configure a single member at a time and verify it starts correctly before adding more new members.
|
||||
If you add a new member to a 1-node cluster, the cluster cannot make progress before the new member starts because it needs two members as majority to agree on the consensus. You will only see this behavior between the time `etcdctl member add` informs the cluster about the new member and the new member successfully establishing a connection to the existing one.
|
||||
|
||||
#### Error Cases When Adding Members
|
||||
#### Error Cases
|
||||
|
||||
In the following case we have not included our new host in the list of enumerated nodes.
|
||||
If this is a new cluster, the node must be added to the list of initial cluster members.
|
||||
@@ -167,18 +154,10 @@ etcdserver: assign ids error: unmatched member while checking PeerURLs
|
||||
exit 1
|
||||
```
|
||||
|
||||
When we start etcd using the data directory of a removed member, etcd will exit automatically if it connects to any active member in the cluster:
|
||||
When we start etcd using the data directory of a removed member, etcd will exit automatically if it connects to any alive member in the cluster:
|
||||
|
||||
```sh
|
||||
$ etcd
|
||||
etcd: this member has been permanently removed from the cluster. Exiting.
|
||||
exit 1
|
||||
```
|
||||
|
||||
### Strict Reconfiguration Check Mode (`-strict-reconfig-check`)
|
||||
|
||||
As described in the above, the best practice of adding new members is to configure a single member at a time and verify it starts correctly before adding more new members. This step by step approach is very important because if newly added members is not configured correctly (for example the peer URLs are incorrect), the cluster can lose quorum. The quorum loss happens since the newly added member are counted in the quorum even if that member is not reachable from other existing members. Also quorum loss might happen if there is a connectivity issue or there are operational issues.
|
||||
|
||||
For avoiding this problem, etcd provides an option `-strict-reconfig-check`. If this option is passed to etcd, etcd rejects reconfiguration requests if the number of started members will be less than a quorum of the reconfigured cluster.
|
||||
|
||||
It is recommended to enable this option. However, it is disabled by default because of keeping compatibility.
|
||||
|
@@ -1,10 +1,10 @@
|
||||
# Design of Runtime Reconfiguration
|
||||
### Design of Runtime Reconfiguration
|
||||
|
||||
Runtime reconfiguration is one of the hardest and most error prone features in a distributed system, especially in a consensus based system like etcd.
|
||||
|
||||
Read on to learn about the design of etcd's runtime reconfiguration commands and how we tackled these problems.
|
||||
|
||||
## Two Phase Config Changes Keep you Safe
|
||||
### Two Phase Config Changes Keep you Safe
|
||||
|
||||
In etcd, every runtime reconfiguration has to go through [two phases](Documentation/runtime-configuration.md#add-a-new-member) for safety reasons. For example, to add a member you need to first inform cluster of new configuration and then start the new member.
|
||||
|
||||
@@ -22,7 +22,7 @@ Without the explicit workflow around cluster membership etcd would be vulnerable
|
||||
|
||||
We think runtime reconfiguration should be a low frequent operation. We made the decision to keep it explicit and user-driven to ensure configuration safety and keep your cluster always running smoothly under your control.
|
||||
|
||||
## Permanent Loss of Quorum Requires New Cluster
|
||||
### Permanent Loss of Quorum Requires New Cluster
|
||||
|
||||
If a cluster permanently loses a majority of its members, a new cluster will need to be started from an old data directory to recover the previous state.
|
||||
|
||||
@@ -30,7 +30,7 @@ It is entirely possible to force removing the failed members from the existing c
|
||||
|
||||
If you have a correct deployment, the possibility of permanent majority lose is very low. But it is a severe enough problem that worth special care. We strongly suggest you to read the [disaster recovery documentation](admin_guide.md#disaster-recovery) and prepare for permanent majority lose before you put etcd into production.
|
||||
|
||||
## Do Not Use Public Discovery Service For Runtime Reconfiguration
|
||||
### Do Not Use Public Discovery Service For Runtime Reconfiguration
|
||||
|
||||
The public discovery service should only be used for bootstrapping a cluster. To join member into an existing cluster, you should use runtime reconfiguration API.
|
||||
|
||||
@@ -38,10 +38,10 @@ Discovery service is designed for bootstrapping an etcd cluster in the cloud env
|
||||
|
||||
It seems that using public discovery service is a convenient way to do runtime reconfiguration, after all discovery service already has all the cluster configuration information. However relying on public discovery service brings troubles:
|
||||
|
||||
1. it introduces external dependencies for the entire life-cycle of your cluster, not just bootstrap time. If there is a network issue between your cluster and public discovery service, your cluster will suffer from it.
|
||||
1. it introduces a external dependencies for the entire life-cycle of your cluster, not just bootstrap time. If there is a network issue between your cluster and public discover service, your cluster will suffer from it.
|
||||
|
||||
2. public discovery service must reflect correct runtime configuration of your cluster during it life-cycle. It has to provide security mechanism to avoid bad actions, and it is hard.
|
||||
|
||||
3. public discovery service has to keep tens of thousands of cluster configurations. Our public discovery service backend is not ready for that workload.
|
||||
|
||||
If you want to have a discovery service that supports runtime reconfiguration, the best choice is to build your private one.
|
||||
If you want to have a discovery service that supports runtime reconfiguration, the best choice is to build your private one.
|
@@ -1,10 +1,12 @@
|
||||
# Security Model
|
||||
# security model
|
||||
|
||||
etcd supports SSL/TLS as well as authentication through client certificates, both for clients to server as well as peer (server to server / cluster) communication.
|
||||
|
||||
To get up and running you first need to have a CA certificate and a signed key pair for one member. It is recommended to create and sign a new key pair for every member in a cluster.
|
||||
|
||||
For convenience, the [cfssl](https://github.com/cloudflare/cfssl) tool provides an easy interface to certificate generation, and we provide an example using the tool [here](https://github.com/coreos/etcd/tree/master/hack/tls-setup). You can also examine this [alternative guide to generating self-signed key pairs](http://www.g-loaded.eu/2005/11/10/be-your-own-ca/).
|
||||
For convenience the [cfssl](https://github.com/cloudflare/cfssl) tool provides an easy interface to certificate generation, and we provide a full example using the tool at [here](../hack/tls-setup). Alternatively this site provides a good reference on how to generate self-signed key pairs:
|
||||
|
||||
http://www.g-loaded.eu/2005/11/10/be-your-own-ca/
|
||||
|
||||
## Basic setup
|
||||
|
||||
@@ -95,7 +97,7 @@ $ curl --cacert /path/to/ca.crt --cert /path/to/client.crt --key /path/to/client
|
||||
-L https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
|
||||
```
|
||||
|
||||
You should be able to see:
|
||||
You should able to see:
|
||||
|
||||
```
|
||||
...
|
||||
@@ -136,7 +138,7 @@ $ etcd -name infra1 -data-dir infra1 \
|
||||
|
||||
# member2
|
||||
$ etcd -name infra2 -data-dir infra2 \
|
||||
-peer-client-cert-auth -peer-trusted-ca-file=/path/to/ca.crt -peer-cert-file=/path/to/member2.crt -peer-key-file=/path/to/member2.key \
|
||||
-peer-client-cert-atuh -peer-trusted-ca-file=/path/to/ca.crt -peer-cert-file=/path/to/member2.crt -peer-key-file=/path/to/member2.key \
|
||||
-initial-advertise-peer-urls=https://10.0.1.11:2380 -listen-peer-urls=https://10.0.1.11:2380 \
|
||||
-discovery ${DISCOVERY_URL}
|
||||
```
|
||||
@@ -150,7 +152,7 @@ The etcd members will form a cluster and all communication between members in th
|
||||
The internal protocol of etcd v2.0.x uses a lot of short-lived HTTP connections.
|
||||
So, when enabling TLS you may need to increase the heartbeat interval and election timeouts to reduce internal cluster connection churn.
|
||||
A reasonable place to start are these values: ` --heartbeat-interval 500 --election-timeout 2500`.
|
||||
These issues are resolved in the etcd v2.1.x series of releases which uses fewer connections.
|
||||
This issues is resolved in the etcd v2.1.x series of releases which uses fewer connections.
|
||||
|
||||
### I'm seeing a SSLv3 alert handshake failure when using SSL client authentication?
|
||||
|
||||
|
@@ -1,16 +1,16 @@
|
||||
# Tuning
|
||||
## Tuning
|
||||
|
||||
The default settings in etcd should work well for installations on a local network where the average network latency is low.
|
||||
However, when using etcd across multiple data centers or over networks with high latency you may need to tweak the heartbeat interval and election timeout settings.
|
||||
|
||||
The network isn't the only source of latency. Each request and response may be impacted by slow disks on both the leader and follower. Each of these timeouts represents the total time from request to successful response from the other machine.
|
||||
|
||||
## Time Parameters
|
||||
### Time Parameters
|
||||
|
||||
The underlying distributed consensus protocol relies on two separate time parameters to ensure that nodes can handoff leadership if one stalls or goes offline.
|
||||
The first parameter is called the *Heartbeat Interval*.
|
||||
This is the frequency with which the leader will notify followers that it is still the leader.
|
||||
For best practices, the parameter should be set around round-trip time between members.
|
||||
For best pratices, the parameter should be set around round-trip time between members.
|
||||
By default, etcd uses a `100ms` heartbeat interval.
|
||||
|
||||
The second parameter is the *Election Timeout*.
|
||||
@@ -27,14 +27,11 @@ The election timeout should be set based on the heartbeat interval and average r
|
||||
Election timeouts must be at least 10 times the round-trip time so it can account for variance in your network.
|
||||
For example, if the round-trip time between your members is 10ms then you should have at least a 100ms election timeout.
|
||||
|
||||
The upper limit of election timeout is 50000ms, which should only be used when deploying global etcd cluster. First, 5s is the upper limit of average global round-trip time. A reasonable round-trip time for the continental united states is 130ms, and the time between US and japan is around 350-400ms. Because package gets delayed a lot, and network situation may be terrible, 5s is a safe value for it. Then, because election timeout should be an order of magnitude bigger than broadcast time, 50s becomes its maximum.
|
||||
|
||||
You should also set your election timeout to at least 5 to 10 times your heartbeat interval to account for variance in leader replication.
|
||||
For a heartbeat interval of 50ms you should set your election timeout to at least 250ms - 500ms.
|
||||
|
||||
The upper limit of election timeout is 50000ms (50s), which should only be used when deploying a globally-distributed etcd cluster.
|
||||
A reasonable round-trip time for the continental United States is 130ms, and the time between US and Japan is around 350-400ms.
|
||||
If your network has uneven performance or regular packet delays/loss then it is possible that a couple of retries may be necessary to successfully send a packet. So 5s is a safe upper limit of global round-trip time.
|
||||
As the election timeout should be an order of magnitude bigger than broadcast time, in the case of ~5s for a globally distributed cluster, then 50 seconds becomes a reasonable maximum.
|
||||
|
||||
The heartbeat interval and election timeout value should be the same for all members in one cluster. Setting different values for etcd members may disrupt cluster stability.
|
||||
|
||||
You can override the default values on the command line:
|
||||
@@ -49,7 +46,7 @@ $ ETCD_HEARTBEAT_INTERVAL=100 ETCD_ELECTION_TIMEOUT=500 etcd
|
||||
|
||||
The values are specified in milliseconds.
|
||||
|
||||
## Snapshots
|
||||
### Snapshots
|
||||
|
||||
etcd appends all key changes to a log file.
|
||||
This log grows forever and is a complete linear history of every change made to the keys.
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# Upgrade etcd to 2.1
|
||||
## Upgrade etcd to 2.1
|
||||
|
||||
In the general case, upgrading from etcd 2.0 to 2.1 can be a zero-downtime, rolling upgrade:
|
||||
- one by one, stop the etcd v2.0 processes and replace them with etcd v2.1 processes
|
||||
@@ -6,15 +6,15 @@ In the general case, upgrading from etcd 2.0 to 2.1 can be a zero-downtime, roll
|
||||
|
||||
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
|
||||
|
||||
## Upgrade Checklists
|
||||
### Upgrade Checklists
|
||||
|
||||
### Upgrade Requirements
|
||||
#### Upgrade Requirement
|
||||
|
||||
To upgrade an existing etcd deployment to 2.1, you must be running 2.0. If you’re running a version of etcd before 2.0, you must upgrade to [2.0](https://github.com/coreos/etcd/releases/tag/v2.0.13) before upgrading to 2.1.
|
||||
|
||||
Also, to ensure a smooth rolling upgrade, your running cluster must be healthy. You can check the health of the cluster by using `etcdctl cluster-health` command.
|
||||
|
||||
### Preparedness
|
||||
#### Preparedness
|
||||
|
||||
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
|
||||
|
||||
@@ -22,13 +22,13 @@ You might also want to [backup your data directory](admin_guide.md#backing-up-th
|
||||
|
||||
etcd 2.1 introduces a new [authentication](auth_api.md) feature, which is disabled by default. If your deployment depends on these, you may want to test the auth features before enabling them in production.
|
||||
|
||||
### Mixed Versions
|
||||
#### Mixed Versions
|
||||
|
||||
While upgrading, an etcd cluster supports mixed versions of etcd members. The cluster is only considered upgraded once all its members are upgraded to 2.1.
|
||||
|
||||
Internally, etcd members negotiate with each other to determine the overall etcd cluster version, which controls the reported cluster version and the supported features. For example, if you are mid-upgrade, any 2.1 features (such as the the authentication feature mentioned above) won’t be available.
|
||||
|
||||
### Limitations
|
||||
#### Limitations
|
||||
|
||||
If you encounter any issues during the upgrade, you can attempt to restart the etcd process in trouble using a newer v2.1 binary to solve the problem. One known issue is that etcd v2.0.0 and v2.0.2 may panic during rolling upgrades due to an existing bug, which has been fixed since etcd v2.0.3.
|
||||
|
||||
@@ -36,7 +36,7 @@ It might take up to 2 minutes for the newly upgraded member to catch up with the
|
||||
|
||||
If you have even more data, this might take more time. If you have a data size larger than 100MB you should contact us before upgrading, so we can make sure the upgrades work smoothly.
|
||||
|
||||
### Downgrade
|
||||
#### Downgrade
|
||||
|
||||
If all members have been upgraded to v2.1, the cluster will be upgraded to v2.1, and downgrade is **not possible**. If any member is still v2.0, the cluster will remain in v2.0, and you can go back to use v2.0 binary.
|
||||
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# Upgrade etcd from 2.1 to 2.2
|
||||
## Upgrade etcd from 2.1 to 2.2
|
||||
|
||||
In the general case, upgrading from etcd 2.1 to 2.2 can be a zero-downtime, rolling upgrade:
|
||||
|
||||
@@ -7,33 +7,33 @@ In the general case, upgrading from etcd 2.1 to 2.2 can be a zero-downtime, roll
|
||||
|
||||
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
|
||||
|
||||
## Upgrade Checklists
|
||||
### Upgrade Checklists
|
||||
|
||||
### Upgrade Requirement
|
||||
#### Upgrade Requirement
|
||||
|
||||
To upgrade an existing etcd deployment to 2.2, you must be running 2.1. If you’re running a version of etcd before 2.1, you must upgrade to [2.1](https://github.com/coreos/etcd/releases/tag/v2.1.2) before upgrading to 2.2.
|
||||
|
||||
Also, to ensure a smooth rolling upgrade, your running cluster must be healthy. You can check the health of the cluster by using `etcdctl cluster-health` command.
|
||||
|
||||
### Preparedness
|
||||
#### Preparedness
|
||||
|
||||
Before upgrading etcd, always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
|
||||
|
||||
You might also want to [backup your data directory](admin_guide.md#backing-up-the-datastore) for a potential [downgrade](#downgrade).
|
||||
|
||||
### Mixed Versions
|
||||
#### Mixed Versions
|
||||
|
||||
While upgrading, an etcd cluster supports mixed versions of etcd members. The cluster is only considered upgraded once all its members are upgraded to 2.2.
|
||||
|
||||
Internally, etcd members negotiate with each other to determine the overall etcd cluster version, which controls the reported cluster version and the supported features.
|
||||
|
||||
### Limitations
|
||||
#### Limitations
|
||||
|
||||
If you have a data size larger than 100MB you should contact us before upgrading, so we can make sure the upgrades work smoothly.
|
||||
|
||||
Every etcd 2.2 member will do health checking across the cluster periodically. etcd 2.1 member does not support health checking. During the upgrade, etcd 2.2 member will log warning about the unhealthy state of etcd 2.1 member. You can ignore the warning.
|
||||
|
||||
### Downgrade
|
||||
#### Downgrade
|
||||
|
||||
If all members have been upgraded to v2.2, the cluster will be upgraded to v2.2, and downgrade is **not possible**. If any member is still v2.1, the cluster will remain in v2.1, and you can go back to use v2.1 binary.
|
||||
|
||||
|
94
Godeps/Godeps.json
generated
94
Godeps/Godeps.json
generated
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"ImportPath": "github.com/coreos/etcd",
|
||||
"GoVersion": "go1.5.1",
|
||||
"GoVersion": "go1.4.2",
|
||||
"Packages": [
|
||||
"./..."
|
||||
],
|
||||
@@ -10,10 +10,6 @@
|
||||
"Comment": "null-5",
|
||||
"Rev": "75cd24fc2f2c2a2088577d12123ddee5f54e0675"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/akrennmair/gopcap",
|
||||
"Rev": "00e11033259acb75598ba416495bb708d864a010"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/beorn7/perks/quantile",
|
||||
"Rev": "b965b613227fddccbfffe13eae360ed3fa822f8d"
|
||||
@@ -24,21 +20,17 @@
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/boltdb/bolt",
|
||||
"Comment": "v1.1.0-65-gee4a088",
|
||||
"Rev": "ee4a0888a9abe7eefe5a0992ca4cb06864839873"
|
||||
"Comment": "v1.0-119-g90fef38",
|
||||
"Rev": "90fef389f98027ca55594edd7dbd6e7f3926fdad"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/cheggaaa/pb",
|
||||
"Rev": "da1f27ad1d9509b16f65f52fd9d8138b0f2dc7b2"
|
||||
"ImportPath": "github.com/bradfitz/http2",
|
||||
"Rev": "3e36af6d3af0e56fa3da71099f864933dea3d9fb"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/codegangsta/cli",
|
||||
"Comment": "1.2.0-183-gb5232bb",
|
||||
"Rev": "b5232bb2934f606f9f27a1305f1eea224e8e8b88"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/coreos/gexpect",
|
||||
"Rev": "5173270e159f5aa8fbc999dc7e3dcb50f4098a69"
|
||||
"Comment": "1.2.0-26-gf7ebb76",
|
||||
"Rev": "f7ebb761e83e21225d1d8954fde853bf8edd46c4"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/coreos/go-semver/semver",
|
||||
@@ -63,15 +55,9 @@
|
||||
"ImportPath": "github.com/coreos/pkg/capnslog",
|
||||
"Rev": "2c77715c4df99b5420ffcae14ead08f52104065d"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/cpuguy83/go-md2man/md2man",
|
||||
"Comment": "v1.0.4",
|
||||
"Rev": "71acacd42f85e5e82f70a55327789582a5200a90"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/gogo/protobuf/proto",
|
||||
"Comment": "v0.1-118-ge8904f5",
|
||||
"Rev": "e8904f58e872a473a5b91bc9bf3377d223555263"
|
||||
"Rev": "64f27bf06efee53589314a6e5a4af34cdd85adf6"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/golang/glog",
|
||||
@@ -79,37 +65,20 @@
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/golang/protobuf/proto",
|
||||
"Rev": "6aaa8d47701fa6cf07e914ec01fde3d4a1fe79c3"
|
||||
"Rev": "5677a0e3d5e89854c9974e1256839ee23f8233ca"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/google/btree",
|
||||
"Rev": "cc6329d4279e3f025a53a83c397d2339b5705c45"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/inconshreveable/mousetrap",
|
||||
"Rev": "76626ae9c91c4f2a10f34cad8ce83ea42c93bb75"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/jonboulle/clockwork",
|
||||
"Rev": "72f9bd7c4e0c2a40055ab3d0f09654f730cce982"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/kballard/go-shellquote",
|
||||
"Rev": "d8ec1a69a250a17bb0e419c386eac1f3711dc142"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/kr/pty",
|
||||
"Comment": "release.r56-29-gf7ee69f",
|
||||
"Rev": "f7ee69f31298ecbe5d2b349c711e2547a617d398"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/matttproud/golang_protobuf_extensions/pbutil",
|
||||
"Rev": "fc2b8d3a73c4867e51861bbdd5ae3c1f0869dd6a"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/olekukonko/ts",
|
||||
"Rev": "ecf753e7c962639ab5a1fb46f7da627d4c0a04b8"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/prometheus/client_golang/prometheus",
|
||||
"Comment": "0.7.0-52-ge51041b",
|
||||
@@ -133,25 +102,8 @@
|
||||
"Rev": "454a56f35412459b5e684fd5ec0f9211b94f002a"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/russross/blackfriday",
|
||||
"Comment": "v1.4-2-g300106c",
|
||||
"Rev": "300106c228d52c8941d4b3de6054a6062a86dda3"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/shurcooL/sanitized_anchor_name",
|
||||
"Rev": "10ef21a441db47d8b13ebcc5fd2310f636973c77"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/spacejam/loghisto",
|
||||
"Rev": "323309774dec8b7430187e46cd0793974ccca04a"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/spf13/cobra",
|
||||
"Rev": "1c44ec8d3f1552cac48999f9306da23c4d8a288b"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/spf13/pflag",
|
||||
"Rev": "08b1a584251b5b62f458943640fc8ebd4d50aaa5"
|
||||
"ImportPath": "github.com/rakyll/pb",
|
||||
"Rev": "dc507ad06b7462501281bb4691ee43f0b1d1ec37"
|
||||
},
|
||||
{
|
||||
"ImportPath": "github.com/stretchr/testify/assert",
|
||||
@@ -175,27 +127,27 @@
|
||||
},
|
||||
{
|
||||
"ImportPath": "golang.org/x/net/context",
|
||||
"Rev": "04b9de9b512f58addf28c9853d50ebef61c3953e"
|
||||
"Rev": "7dbad50ab5b31073856416cdcfeb2796d682f844"
|
||||
},
|
||||
{
|
||||
"ImportPath": "golang.org/x/net/http2",
|
||||
"Rev": "04b9de9b512f58addf28c9853d50ebef61c3953e"
|
||||
},
|
||||
{
|
||||
"ImportPath": "golang.org/x/net/internal/timeseries",
|
||||
"Rev": "04b9de9b512f58addf28c9853d50ebef61c3953e"
|
||||
},
|
||||
{
|
||||
"ImportPath": "golang.org/x/net/trace",
|
||||
"Rev": "04b9de9b512f58addf28c9853d50ebef61c3953e"
|
||||
"ImportPath": "golang.org/x/oauth2",
|
||||
"Rev": "3046bc76d6dfd7d3707f6640f85e42d9c4050f50"
|
||||
},
|
||||
{
|
||||
"ImportPath": "golang.org/x/sys/unix",
|
||||
"Rev": "9c60d1c508f5134d1ca726b4641db998f2523357"
|
||||
},
|
||||
{
|
||||
"ImportPath": "google.golang.org/cloud/compute/metadata",
|
||||
"Rev": "f20d6dcccb44ed49de45ae3703312cb46e627db1"
|
||||
},
|
||||
{
|
||||
"ImportPath": "google.golang.org/cloud/internal",
|
||||
"Rev": "f20d6dcccb44ed49de45ae3703312cb46e627db1"
|
||||
},
|
||||
{
|
||||
"ImportPath": "google.golang.org/grpc",
|
||||
"Rev": "e29d659177655e589850ba7d3d83f7ce12ef23dd"
|
||||
"Rev": "f5ebd86be717593ab029545492c93ddf8914832b"
|
||||
}
|
||||
]
|
||||
}
|
||||
|
5
Godeps/_workspace/src/github.com/akrennmair/gopcap/.gitignore
generated
vendored
5
Godeps/_workspace/src/github.com/akrennmair/gopcap/.gitignore
generated
vendored
@@ -1,5 +0,0 @@
|
||||
#*
|
||||
*~
|
||||
/tools/pass/pass
|
||||
/tools/pcaptest/pcaptest
|
||||
/tools/tcpdump/tcpdump
|
27
Godeps/_workspace/src/github.com/akrennmair/gopcap/LICENSE
generated
vendored
27
Godeps/_workspace/src/github.com/akrennmair/gopcap/LICENSE
generated
vendored
@@ -1,27 +0,0 @@
|
||||
Copyright (c) 2009-2011 Andreas Krennmair. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions are
|
||||
met:
|
||||
|
||||
* Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
* Redistributions in binary form must reproduce the above
|
||||
copyright notice, this list of conditions and the following disclaimer
|
||||
in the documentation and/or other materials provided with the
|
||||
distribution.
|
||||
* Neither the name of Andreas Krennmair nor the names of its
|
||||
contributors may be used to endorse or promote products derived from
|
||||
this software without specific prior written permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
|
||||
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
|
||||
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
|
||||
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
|
||||
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
|
||||
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
|
||||
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
||||
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
||||
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
||||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
|
||||
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
11
Godeps/_workspace/src/github.com/akrennmair/gopcap/README.mkd
generated
vendored
11
Godeps/_workspace/src/github.com/akrennmair/gopcap/README.mkd
generated
vendored
@@ -1,11 +0,0 @@
|
||||
# PCAP
|
||||
|
||||
This is a simple wrapper around libpcap for Go. Originally written by Andreas
|
||||
Krennmair <ak@synflood.at> and only minorly touched up by Mark Smith <mark@qq.is>.
|
||||
|
||||
Please see the included pcaptest.go and tcpdump.go programs for instructions on
|
||||
how to use this library.
|
||||
|
||||
Miek Gieben <miek@miek.nl> has created a more Go-like package and replaced functionality
|
||||
with standard functions from the standard library. The package has also been renamed to
|
||||
pcap.
|
527
Godeps/_workspace/src/github.com/akrennmair/gopcap/decode.go
generated
vendored
527
Godeps/_workspace/src/github.com/akrennmair/gopcap/decode.go
generated
vendored
@@ -1,527 +0,0 @@
|
||||
package pcap
|
||||
|
||||
import (
|
||||
"encoding/binary"
|
||||
"fmt"
|
||||
"net"
|
||||
"reflect"
|
||||
"strings"
|
||||
)
|
||||
|
||||
const (
|
||||
TYPE_IP = 0x0800
|
||||
TYPE_ARP = 0x0806
|
||||
TYPE_IP6 = 0x86DD
|
||||
TYPE_VLAN = 0x8100
|
||||
|
||||
IP_ICMP = 1
|
||||
IP_INIP = 4
|
||||
IP_TCP = 6
|
||||
IP_UDP = 17
|
||||
)
|
||||
|
||||
const (
|
||||
ERRBUF_SIZE = 256
|
||||
|
||||
// According to pcap-linktype(7).
|
||||
LINKTYPE_NULL = 0
|
||||
LINKTYPE_ETHERNET = 1
|
||||
LINKTYPE_TOKEN_RING = 6
|
||||
LINKTYPE_ARCNET = 7
|
||||
LINKTYPE_SLIP = 8
|
||||
LINKTYPE_PPP = 9
|
||||
LINKTYPE_FDDI = 10
|
||||
LINKTYPE_ATM_RFC1483 = 100
|
||||
LINKTYPE_RAW = 101
|
||||
LINKTYPE_PPP_HDLC = 50
|
||||
LINKTYPE_PPP_ETHER = 51
|
||||
LINKTYPE_C_HDLC = 104
|
||||
LINKTYPE_IEEE802_11 = 105
|
||||
LINKTYPE_FRELAY = 107
|
||||
LINKTYPE_LOOP = 108
|
||||
LINKTYPE_LINUX_SLL = 113
|
||||
LINKTYPE_LTALK = 104
|
||||
LINKTYPE_PFLOG = 117
|
||||
LINKTYPE_PRISM_HEADER = 119
|
||||
LINKTYPE_IP_OVER_FC = 122
|
||||
LINKTYPE_SUNATM = 123
|
||||
LINKTYPE_IEEE802_11_RADIO = 127
|
||||
LINKTYPE_ARCNET_LINUX = 129
|
||||
LINKTYPE_LINUX_IRDA = 144
|
||||
LINKTYPE_LINUX_LAPD = 177
|
||||
)
|
||||
|
||||
type addrHdr interface {
|
||||
SrcAddr() string
|
||||
DestAddr() string
|
||||
Len() int
|
||||
}
|
||||
|
||||
type addrStringer interface {
|
||||
String(addr addrHdr) string
|
||||
}
|
||||
|
||||
func decodemac(pkt []byte) uint64 {
|
||||
mac := uint64(0)
|
||||
for i := uint(0); i < 6; i++ {
|
||||
mac = (mac << 8) + uint64(pkt[i])
|
||||
}
|
||||
return mac
|
||||
}
|
||||
|
||||
// Decode decodes the headers of a Packet.
|
||||
func (p *Packet) Decode() {
|
||||
if len(p.Data) <= 14 {
|
||||
return
|
||||
}
|
||||
|
||||
p.Type = int(binary.BigEndian.Uint16(p.Data[12:14]))
|
||||
p.DestMac = decodemac(p.Data[0:6])
|
||||
p.SrcMac = decodemac(p.Data[6:12])
|
||||
|
||||
if len(p.Data) >= 15 {
|
||||
p.Payload = p.Data[14:]
|
||||
}
|
||||
|
||||
switch p.Type {
|
||||
case TYPE_IP:
|
||||
p.decodeIp()
|
||||
case TYPE_IP6:
|
||||
p.decodeIp6()
|
||||
case TYPE_ARP:
|
||||
p.decodeArp()
|
||||
case TYPE_VLAN:
|
||||
p.decodeVlan()
|
||||
}
|
||||
}
|
||||
|
||||
func (p *Packet) headerString(headers []interface{}) string {
|
||||
// If there's just one header, return that.
|
||||
if len(headers) == 1 {
|
||||
if hdr, ok := headers[0].(fmt.Stringer); ok {
|
||||
return hdr.String()
|
||||
}
|
||||
}
|
||||
// If there are two headers (IPv4/IPv6 -> TCP/UDP/IP..)
|
||||
if len(headers) == 2 {
|
||||
// Commonly the first header is an address.
|
||||
if addr, ok := p.Headers[0].(addrHdr); ok {
|
||||
if hdr, ok := p.Headers[1].(addrStringer); ok {
|
||||
return fmt.Sprintf("%s %s", p.Time, hdr.String(addr))
|
||||
}
|
||||
}
|
||||
}
|
||||
// For IP in IP, we do a recursive call.
|
||||
if len(headers) >= 2 {
|
||||
if addr, ok := headers[0].(addrHdr); ok {
|
||||
if _, ok := headers[1].(addrHdr); ok {
|
||||
return fmt.Sprintf("%s > %s IP in IP: ",
|
||||
addr.SrcAddr(), addr.DestAddr(), p.headerString(headers[1:]))
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
var typeNames []string
|
||||
for _, hdr := range headers {
|
||||
typeNames = append(typeNames, reflect.TypeOf(hdr).String())
|
||||
}
|
||||
|
||||
return fmt.Sprintf("unknown [%s]", strings.Join(typeNames, ","))
|
||||
}
|
||||
|
||||
// String prints a one-line representation of the packet header.
|
||||
// The output is suitable for use in a tcpdump program.
|
||||
func (p *Packet) String() string {
|
||||
// If there are no headers, print "unsupported protocol".
|
||||
if len(p.Headers) == 0 {
|
||||
return fmt.Sprintf("%s unsupported protocol %d", p.Time, int(p.Type))
|
||||
}
|
||||
return fmt.Sprintf("%s %s", p.Time, p.headerString(p.Headers))
|
||||
}
|
||||
|
||||
// Arphdr is a ARP packet header.
|
||||
type Arphdr struct {
|
||||
Addrtype uint16
|
||||
Protocol uint16
|
||||
HwAddressSize uint8
|
||||
ProtAddressSize uint8
|
||||
Operation uint16
|
||||
SourceHwAddress []byte
|
||||
SourceProtAddress []byte
|
||||
DestHwAddress []byte
|
||||
DestProtAddress []byte
|
||||
}
|
||||
|
||||
func (arp *Arphdr) String() (s string) {
|
||||
switch arp.Operation {
|
||||
case 1:
|
||||
s = "ARP request"
|
||||
case 2:
|
||||
s = "ARP Reply"
|
||||
}
|
||||
if arp.Addrtype == LINKTYPE_ETHERNET && arp.Protocol == TYPE_IP {
|
||||
s = fmt.Sprintf("%012x (%s) > %012x (%s)",
|
||||
decodemac(arp.SourceHwAddress), arp.SourceProtAddress,
|
||||
decodemac(arp.DestHwAddress), arp.DestProtAddress)
|
||||
} else {
|
||||
s = fmt.Sprintf("addrtype = %d protocol = %d", arp.Addrtype, arp.Protocol)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (p *Packet) decodeArp() {
|
||||
if len(p.Payload) < 8 {
|
||||
return
|
||||
}
|
||||
|
||||
pkt := p.Payload
|
||||
arp := new(Arphdr)
|
||||
arp.Addrtype = binary.BigEndian.Uint16(pkt[0:2])
|
||||
arp.Protocol = binary.BigEndian.Uint16(pkt[2:4])
|
||||
arp.HwAddressSize = pkt[4]
|
||||
arp.ProtAddressSize = pkt[5]
|
||||
arp.Operation = binary.BigEndian.Uint16(pkt[6:8])
|
||||
|
||||
if len(pkt) < int(8+2*arp.HwAddressSize+2*arp.ProtAddressSize) {
|
||||
return
|
||||
}
|
||||
arp.SourceHwAddress = pkt[8 : 8+arp.HwAddressSize]
|
||||
arp.SourceProtAddress = pkt[8+arp.HwAddressSize : 8+arp.HwAddressSize+arp.ProtAddressSize]
|
||||
arp.DestHwAddress = pkt[8+arp.HwAddressSize+arp.ProtAddressSize : 8+2*arp.HwAddressSize+arp.ProtAddressSize]
|
||||
arp.DestProtAddress = pkt[8+2*arp.HwAddressSize+arp.ProtAddressSize : 8+2*arp.HwAddressSize+2*arp.ProtAddressSize]
|
||||
|
||||
p.Headers = append(p.Headers, arp)
|
||||
|
||||
if len(pkt) >= int(8+2*arp.HwAddressSize+2*arp.ProtAddressSize) {
|
||||
p.Payload = p.Payload[8+2*arp.HwAddressSize+2*arp.ProtAddressSize:]
|
||||
}
|
||||
}
|
||||
|
||||
// IPadr is the header of an IP packet.
|
||||
type Iphdr struct {
|
||||
Version uint8
|
||||
Ihl uint8
|
||||
Tos uint8
|
||||
Length uint16
|
||||
Id uint16
|
||||
Flags uint8
|
||||
FragOffset uint16
|
||||
Ttl uint8
|
||||
Protocol uint8
|
||||
Checksum uint16
|
||||
SrcIp []byte
|
||||
DestIp []byte
|
||||
}
|
||||
|
||||
func (p *Packet) decodeIp() {
|
||||
if len(p.Payload) < 20 {
|
||||
return
|
||||
}
|
||||
|
||||
pkt := p.Payload
|
||||
ip := new(Iphdr)
|
||||
|
||||
ip.Version = uint8(pkt[0]) >> 4
|
||||
ip.Ihl = uint8(pkt[0]) & 0x0F
|
||||
ip.Tos = pkt[1]
|
||||
ip.Length = binary.BigEndian.Uint16(pkt[2:4])
|
||||
ip.Id = binary.BigEndian.Uint16(pkt[4:6])
|
||||
flagsfrags := binary.BigEndian.Uint16(pkt[6:8])
|
||||
ip.Flags = uint8(flagsfrags >> 13)
|
||||
ip.FragOffset = flagsfrags & 0x1FFF
|
||||
ip.Ttl = pkt[8]
|
||||
ip.Protocol = pkt[9]
|
||||
ip.Checksum = binary.BigEndian.Uint16(pkt[10:12])
|
||||
ip.SrcIp = pkt[12:16]
|
||||
ip.DestIp = pkt[16:20]
|
||||
|
||||
pEnd := int(ip.Length)
|
||||
if pEnd > len(pkt) {
|
||||
pEnd = len(pkt)
|
||||
}
|
||||
|
||||
if len(pkt) >= pEnd && int(ip.Ihl*4) < pEnd {
|
||||
p.Payload = pkt[ip.Ihl*4 : pEnd]
|
||||
} else {
|
||||
p.Payload = []byte{}
|
||||
}
|
||||
|
||||
p.Headers = append(p.Headers, ip)
|
||||
p.IP = ip
|
||||
|
||||
switch ip.Protocol {
|
||||
case IP_TCP:
|
||||
p.decodeTcp()
|
||||
case IP_UDP:
|
||||
p.decodeUdp()
|
||||
case IP_ICMP:
|
||||
p.decodeIcmp()
|
||||
case IP_INIP:
|
||||
p.decodeIp()
|
||||
}
|
||||
}
|
||||
|
||||
func (ip *Iphdr) SrcAddr() string { return net.IP(ip.SrcIp).String() }
|
||||
func (ip *Iphdr) DestAddr() string { return net.IP(ip.DestIp).String() }
|
||||
func (ip *Iphdr) Len() int { return int(ip.Length) }
|
||||
|
||||
type Vlanhdr struct {
|
||||
Priority byte
|
||||
DropEligible bool
|
||||
VlanIdentifier int
|
||||
Type int // Not actually part of the vlan header, but the type of the actual packet
|
||||
}
|
||||
|
||||
func (v *Vlanhdr) String() {
|
||||
fmt.Sprintf("VLAN Priority:%d Drop:%v Tag:%d", v.Priority, v.DropEligible, v.VlanIdentifier)
|
||||
}
|
||||
|
||||
func (p *Packet) decodeVlan() {
|
||||
pkt := p.Payload
|
||||
vlan := new(Vlanhdr)
|
||||
if len(pkt) < 4 {
|
||||
return
|
||||
}
|
||||
|
||||
vlan.Priority = (pkt[2] & 0xE0) >> 13
|
||||
vlan.DropEligible = pkt[2]&0x10 != 0
|
||||
vlan.VlanIdentifier = int(binary.BigEndian.Uint16(pkt[:2])) & 0x0FFF
|
||||
vlan.Type = int(binary.BigEndian.Uint16(p.Payload[2:4]))
|
||||
p.Headers = append(p.Headers, vlan)
|
||||
|
||||
if len(pkt) >= 5 {
|
||||
p.Payload = p.Payload[4:]
|
||||
}
|
||||
|
||||
switch vlan.Type {
|
||||
case TYPE_IP:
|
||||
p.decodeIp()
|
||||
case TYPE_IP6:
|
||||
p.decodeIp6()
|
||||
case TYPE_ARP:
|
||||
p.decodeArp()
|
||||
}
|
||||
}
|
||||
|
||||
type Tcphdr struct {
|
||||
SrcPort uint16
|
||||
DestPort uint16
|
||||
Seq uint32
|
||||
Ack uint32
|
||||
DataOffset uint8
|
||||
Flags uint16
|
||||
Window uint16
|
||||
Checksum uint16
|
||||
Urgent uint16
|
||||
Data []byte
|
||||
}
|
||||
|
||||
const (
|
||||
TCP_FIN = 1 << iota
|
||||
TCP_SYN
|
||||
TCP_RST
|
||||
TCP_PSH
|
||||
TCP_ACK
|
||||
TCP_URG
|
||||
TCP_ECE
|
||||
TCP_CWR
|
||||
TCP_NS
|
||||
)
|
||||
|
||||
func (p *Packet) decodeTcp() {
|
||||
if len(p.Payload) < 20 {
|
||||
return
|
||||
}
|
||||
|
||||
pkt := p.Payload
|
||||
tcp := new(Tcphdr)
|
||||
tcp.SrcPort = binary.BigEndian.Uint16(pkt[0:2])
|
||||
tcp.DestPort = binary.BigEndian.Uint16(pkt[2:4])
|
||||
tcp.Seq = binary.BigEndian.Uint32(pkt[4:8])
|
||||
tcp.Ack = binary.BigEndian.Uint32(pkt[8:12])
|
||||
tcp.DataOffset = (pkt[12] & 0xF0) >> 4
|
||||
tcp.Flags = binary.BigEndian.Uint16(pkt[12:14]) & 0x1FF
|
||||
tcp.Window = binary.BigEndian.Uint16(pkt[14:16])
|
||||
tcp.Checksum = binary.BigEndian.Uint16(pkt[16:18])
|
||||
tcp.Urgent = binary.BigEndian.Uint16(pkt[18:20])
|
||||
if len(pkt) >= int(tcp.DataOffset*4) {
|
||||
p.Payload = pkt[tcp.DataOffset*4:]
|
||||
}
|
||||
p.Headers = append(p.Headers, tcp)
|
||||
p.TCP = tcp
|
||||
}
|
||||
|
||||
func (tcp *Tcphdr) String(hdr addrHdr) string {
|
||||
return fmt.Sprintf("TCP %s:%d > %s:%d %s SEQ=%d ACK=%d LEN=%d",
|
||||
hdr.SrcAddr(), int(tcp.SrcPort), hdr.DestAddr(), int(tcp.DestPort),
|
||||
tcp.FlagsString(), int64(tcp.Seq), int64(tcp.Ack), hdr.Len())
|
||||
}
|
||||
|
||||
func (tcp *Tcphdr) FlagsString() string {
|
||||
var sflags []string
|
||||
if 0 != (tcp.Flags & TCP_SYN) {
|
||||
sflags = append(sflags, "syn")
|
||||
}
|
||||
if 0 != (tcp.Flags & TCP_FIN) {
|
||||
sflags = append(sflags, "fin")
|
||||
}
|
||||
if 0 != (tcp.Flags & TCP_ACK) {
|
||||
sflags = append(sflags, "ack")
|
||||
}
|
||||
if 0 != (tcp.Flags & TCP_PSH) {
|
||||
sflags = append(sflags, "psh")
|
||||
}
|
||||
if 0 != (tcp.Flags & TCP_RST) {
|
||||
sflags = append(sflags, "rst")
|
||||
}
|
||||
if 0 != (tcp.Flags & TCP_URG) {
|
||||
sflags = append(sflags, "urg")
|
||||
}
|
||||
if 0 != (tcp.Flags & TCP_NS) {
|
||||
sflags = append(sflags, "ns")
|
||||
}
|
||||
if 0 != (tcp.Flags & TCP_CWR) {
|
||||
sflags = append(sflags, "cwr")
|
||||
}
|
||||
if 0 != (tcp.Flags & TCP_ECE) {
|
||||
sflags = append(sflags, "ece")
|
||||
}
|
||||
return fmt.Sprintf("[%s]", strings.Join(sflags, " "))
|
||||
}
|
||||
|
||||
type Udphdr struct {
|
||||
SrcPort uint16
|
||||
DestPort uint16
|
||||
Length uint16
|
||||
Checksum uint16
|
||||
}
|
||||
|
||||
func (p *Packet) decodeUdp() {
|
||||
if len(p.Payload) < 8 {
|
||||
return
|
||||
}
|
||||
|
||||
pkt := p.Payload
|
||||
udp := new(Udphdr)
|
||||
udp.SrcPort = binary.BigEndian.Uint16(pkt[0:2])
|
||||
udp.DestPort = binary.BigEndian.Uint16(pkt[2:4])
|
||||
udp.Length = binary.BigEndian.Uint16(pkt[4:6])
|
||||
udp.Checksum = binary.BigEndian.Uint16(pkt[6:8])
|
||||
p.Headers = append(p.Headers, udp)
|
||||
p.UDP = udp
|
||||
if len(p.Payload) >= 8 {
|
||||
p.Payload = pkt[8:]
|
||||
}
|
||||
}
|
||||
|
||||
func (udp *Udphdr) String(hdr addrHdr) string {
|
||||
return fmt.Sprintf("UDP %s:%d > %s:%d LEN=%d CHKSUM=%d",
|
||||
hdr.SrcAddr(), int(udp.SrcPort), hdr.DestAddr(), int(udp.DestPort),
|
||||
int(udp.Length), int(udp.Checksum))
|
||||
}
|
||||
|
||||
type Icmphdr struct {
|
||||
Type uint8
|
||||
Code uint8
|
||||
Checksum uint16
|
||||
Id uint16
|
||||
Seq uint16
|
||||
Data []byte
|
||||
}
|
||||
|
||||
func (p *Packet) decodeIcmp() *Icmphdr {
|
||||
if len(p.Payload) < 8 {
|
||||
return nil
|
||||
}
|
||||
|
||||
pkt := p.Payload
|
||||
icmp := new(Icmphdr)
|
||||
icmp.Type = pkt[0]
|
||||
icmp.Code = pkt[1]
|
||||
icmp.Checksum = binary.BigEndian.Uint16(pkt[2:4])
|
||||
icmp.Id = binary.BigEndian.Uint16(pkt[4:6])
|
||||
icmp.Seq = binary.BigEndian.Uint16(pkt[6:8])
|
||||
p.Payload = pkt[8:]
|
||||
p.Headers = append(p.Headers, icmp)
|
||||
return icmp
|
||||
}
|
||||
|
||||
func (icmp *Icmphdr) String(hdr addrHdr) string {
|
||||
return fmt.Sprintf("ICMP %s > %s Type = %d Code = %d ",
|
||||
hdr.SrcAddr(), hdr.DestAddr(), icmp.Type, icmp.Code)
|
||||
}
|
||||
|
||||
func (icmp *Icmphdr) TypeString() (result string) {
|
||||
switch icmp.Type {
|
||||
case 0:
|
||||
result = fmt.Sprintf("Echo reply seq=%d", icmp.Seq)
|
||||
case 3:
|
||||
switch icmp.Code {
|
||||
case 0:
|
||||
result = "Network unreachable"
|
||||
case 1:
|
||||
result = "Host unreachable"
|
||||
case 2:
|
||||
result = "Protocol unreachable"
|
||||
case 3:
|
||||
result = "Port unreachable"
|
||||
default:
|
||||
result = "Destination unreachable"
|
||||
}
|
||||
case 8:
|
||||
result = fmt.Sprintf("Echo request seq=%d", icmp.Seq)
|
||||
case 30:
|
||||
result = "Traceroute"
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
type Ip6hdr struct {
|
||||
// http://www.networksorcery.com/enp/protocol/ipv6.htm
|
||||
Version uint8 // 4 bits
|
||||
TrafficClass uint8 // 8 bits
|
||||
FlowLabel uint32 // 20 bits
|
||||
Length uint16 // 16 bits
|
||||
NextHeader uint8 // 8 bits, same as Protocol in Iphdr
|
||||
HopLimit uint8 // 8 bits
|
||||
SrcIp []byte // 16 bytes
|
||||
DestIp []byte // 16 bytes
|
||||
}
|
||||
|
||||
func (p *Packet) decodeIp6() {
|
||||
if len(p.Payload) < 40 {
|
||||
return
|
||||
}
|
||||
|
||||
pkt := p.Payload
|
||||
ip6 := new(Ip6hdr)
|
||||
ip6.Version = uint8(pkt[0]) >> 4
|
||||
ip6.TrafficClass = uint8((binary.BigEndian.Uint16(pkt[0:2]) >> 4) & 0x00FF)
|
||||
ip6.FlowLabel = binary.BigEndian.Uint32(pkt[0:4]) & 0x000FFFFF
|
||||
ip6.Length = binary.BigEndian.Uint16(pkt[4:6])
|
||||
ip6.NextHeader = pkt[6]
|
||||
ip6.HopLimit = pkt[7]
|
||||
ip6.SrcIp = pkt[8:24]
|
||||
ip6.DestIp = pkt[24:40]
|
||||
|
||||
if len(p.Payload) >= 40 {
|
||||
p.Payload = pkt[40:]
|
||||
}
|
||||
|
||||
p.Headers = append(p.Headers, ip6)
|
||||
|
||||
switch ip6.NextHeader {
|
||||
case IP_TCP:
|
||||
p.decodeTcp()
|
||||
case IP_UDP:
|
||||
p.decodeUdp()
|
||||
case IP_ICMP:
|
||||
p.decodeIcmp()
|
||||
case IP_INIP:
|
||||
p.decodeIp()
|
||||
}
|
||||
}
|
||||
|
||||
func (ip6 *Ip6hdr) SrcAddr() string { return net.IP(ip6.SrcIp).String() }
|
||||
func (ip6 *Ip6hdr) DestAddr() string { return net.IP(ip6.DestIp).String() }
|
||||
func (ip6 *Ip6hdr) Len() int { return int(ip6.Length) }
|
247
Godeps/_workspace/src/github.com/akrennmair/gopcap/decode_test.go
generated
vendored
247
Godeps/_workspace/src/github.com/akrennmair/gopcap/decode_test.go
generated
vendored
@@ -1,247 +0,0 @@
|
||||
package pcap
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
var testSimpleTcpPacket *Packet = &Packet{
|
||||
Data: []byte{
|
||||
0x00, 0x00, 0x0c, 0x9f, 0xf0, 0x20, 0xbc, 0x30, 0x5b, 0xe8, 0xd3, 0x49,
|
||||
0x08, 0x00, 0x45, 0x00, 0x01, 0xa4, 0x39, 0xdf, 0x40, 0x00, 0x40, 0x06,
|
||||
0x55, 0x5a, 0xac, 0x11, 0x51, 0x49, 0xad, 0xde, 0xfe, 0xe1, 0xc5, 0xf7,
|
||||
0x00, 0x50, 0xc5, 0x7e, 0x0e, 0x48, 0x49, 0x07, 0x42, 0x32, 0x80, 0x18,
|
||||
0x00, 0x73, 0xab, 0xb1, 0x00, 0x00, 0x01, 0x01, 0x08, 0x0a, 0x03, 0x77,
|
||||
0x37, 0x9c, 0x42, 0x77, 0x5e, 0x3a, 0x47, 0x45, 0x54, 0x20, 0x2f, 0x20,
|
||||
0x48, 0x54, 0x54, 0x50, 0x2f, 0x31, 0x2e, 0x31, 0x0d, 0x0a, 0x48, 0x6f,
|
||||
0x73, 0x74, 0x3a, 0x20, 0x77, 0x77, 0x77, 0x2e, 0x66, 0x69, 0x73, 0x68,
|
||||
0x2e, 0x63, 0x6f, 0x6d, 0x0d, 0x0a, 0x43, 0x6f, 0x6e, 0x6e, 0x65, 0x63,
|
||||
0x74, 0x69, 0x6f, 0x6e, 0x3a, 0x20, 0x6b, 0x65, 0x65, 0x70, 0x2d, 0x61,
|
||||
0x6c, 0x69, 0x76, 0x65, 0x0d, 0x0a, 0x55, 0x73, 0x65, 0x72, 0x2d, 0x41,
|
||||
0x67, 0x65, 0x6e, 0x74, 0x3a, 0x20, 0x4d, 0x6f, 0x7a, 0x69, 0x6c, 0x6c,
|
||||
0x61, 0x2f, 0x35, 0x2e, 0x30, 0x20, 0x28, 0x58, 0x31, 0x31, 0x3b, 0x20,
|
||||
0x4c, 0x69, 0x6e, 0x75, 0x78, 0x20, 0x78, 0x38, 0x36, 0x5f, 0x36, 0x34,
|
||||
0x29, 0x20, 0x41, 0x70, 0x70, 0x6c, 0x65, 0x57, 0x65, 0x62, 0x4b, 0x69,
|
||||
0x74, 0x2f, 0x35, 0x33, 0x35, 0x2e, 0x32, 0x20, 0x28, 0x4b, 0x48, 0x54,
|
||||
0x4d, 0x4c, 0x2c, 0x20, 0x6c, 0x69, 0x6b, 0x65, 0x20, 0x47, 0x65, 0x63,
|
||||
0x6b, 0x6f, 0x29, 0x20, 0x43, 0x68, 0x72, 0x6f, 0x6d, 0x65, 0x2f, 0x31,
|
||||
0x35, 0x2e, 0x30, 0x2e, 0x38, 0x37, 0x34, 0x2e, 0x31, 0x32, 0x31, 0x20,
|
||||
0x53, 0x61, 0x66, 0x61, 0x72, 0x69, 0x2f, 0x35, 0x33, 0x35, 0x2e, 0x32,
|
||||
0x0d, 0x0a, 0x41, 0x63, 0x63, 0x65, 0x70, 0x74, 0x3a, 0x20, 0x74, 0x65,
|
||||
0x78, 0x74, 0x2f, 0x68, 0x74, 0x6d, 0x6c, 0x2c, 0x61, 0x70, 0x70, 0x6c,
|
||||
0x69, 0x63, 0x61, 0x74, 0x69, 0x6f, 0x6e, 0x2f, 0x78, 0x68, 0x74, 0x6d,
|
||||
0x6c, 0x2b, 0x78, 0x6d, 0x6c, 0x2c, 0x61, 0x70, 0x70, 0x6c, 0x69, 0x63,
|
||||
0x61, 0x74, 0x69, 0x6f, 0x6e, 0x2f, 0x78, 0x6d, 0x6c, 0x3b, 0x71, 0x3d,
|
||||
0x30, 0x2e, 0x39, 0x2c, 0x2a, 0x2f, 0x2a, 0x3b, 0x71, 0x3d, 0x30, 0x2e,
|
||||
0x38, 0x0d, 0x0a, 0x41, 0x63, 0x63, 0x65, 0x70, 0x74, 0x2d, 0x45, 0x6e,
|
||||
0x63, 0x6f, 0x64, 0x69, 0x6e, 0x67, 0x3a, 0x20, 0x67, 0x7a, 0x69, 0x70,
|
||||
0x2c, 0x64, 0x65, 0x66, 0x6c, 0x61, 0x74, 0x65, 0x2c, 0x73, 0x64, 0x63,
|
||||
0x68, 0x0d, 0x0a, 0x41, 0x63, 0x63, 0x65, 0x70, 0x74, 0x2d, 0x4c, 0x61,
|
||||
0x6e, 0x67, 0x75, 0x61, 0x67, 0x65, 0x3a, 0x20, 0x65, 0x6e, 0x2d, 0x55,
|
||||
0x53, 0x2c, 0x65, 0x6e, 0x3b, 0x71, 0x3d, 0x30, 0x2e, 0x38, 0x0d, 0x0a,
|
||||
0x41, 0x63, 0x63, 0x65, 0x70, 0x74, 0x2d, 0x43, 0x68, 0x61, 0x72, 0x73,
|
||||
0x65, 0x74, 0x3a, 0x20, 0x49, 0x53, 0x4f, 0x2d, 0x38, 0x38, 0x35, 0x39,
|
||||
0x2d, 0x31, 0x2c, 0x75, 0x74, 0x66, 0x2d, 0x38, 0x3b, 0x71, 0x3d, 0x30,
|
||||
0x2e, 0x37, 0x2c, 0x2a, 0x3b, 0x71, 0x3d, 0x30, 0x2e, 0x33, 0x0d, 0x0a,
|
||||
0x0d, 0x0a,
|
||||
}}
|
||||
|
||||
func BenchmarkDecodeSimpleTcpPacket(b *testing.B) {
|
||||
for i := 0; i < b.N; i++ {
|
||||
testSimpleTcpPacket.Decode()
|
||||
}
|
||||
}
|
||||
|
||||
func TestDecodeSimpleTcpPacket(t *testing.T) {
|
||||
p := testSimpleTcpPacket
|
||||
p.Decode()
|
||||
if p.DestMac != 0x00000c9ff020 {
|
||||
t.Error("Dest mac", p.DestMac)
|
||||
}
|
||||
if p.SrcMac != 0xbc305be8d349 {
|
||||
t.Error("Src mac", p.SrcMac)
|
||||
}
|
||||
if len(p.Headers) != 2 {
|
||||
t.Error("Incorrect number of headers", len(p.Headers))
|
||||
return
|
||||
}
|
||||
if ip, ipOk := p.Headers[0].(*Iphdr); ipOk {
|
||||
if ip.Version != 4 {
|
||||
t.Error("ip Version", ip.Version)
|
||||
}
|
||||
if ip.Ihl != 5 {
|
||||
t.Error("ip header length", ip.Ihl)
|
||||
}
|
||||
if ip.Tos != 0 {
|
||||
t.Error("ip TOS", ip.Tos)
|
||||
}
|
||||
if ip.Length != 420 {
|
||||
t.Error("ip Length", ip.Length)
|
||||
}
|
||||
if ip.Id != 14815 {
|
||||
t.Error("ip ID", ip.Id)
|
||||
}
|
||||
if ip.Flags != 0x02 {
|
||||
t.Error("ip Flags", ip.Flags)
|
||||
}
|
||||
if ip.FragOffset != 0 {
|
||||
t.Error("ip Fragoffset", ip.FragOffset)
|
||||
}
|
||||
if ip.Ttl != 64 {
|
||||
t.Error("ip TTL", ip.Ttl)
|
||||
}
|
||||
if ip.Protocol != 6 {
|
||||
t.Error("ip Protocol", ip.Protocol)
|
||||
}
|
||||
if ip.Checksum != 0x555A {
|
||||
t.Error("ip Checksum", ip.Checksum)
|
||||
}
|
||||
if !bytes.Equal(ip.SrcIp, []byte{172, 17, 81, 73}) {
|
||||
t.Error("ip Src", ip.SrcIp)
|
||||
}
|
||||
if !bytes.Equal(ip.DestIp, []byte{173, 222, 254, 225}) {
|
||||
t.Error("ip Dest", ip.DestIp)
|
||||
}
|
||||
if tcp, tcpOk := p.Headers[1].(*Tcphdr); tcpOk {
|
||||
if tcp.SrcPort != 50679 {
|
||||
t.Error("tcp srcport", tcp.SrcPort)
|
||||
}
|
||||
if tcp.DestPort != 80 {
|
||||
t.Error("tcp destport", tcp.DestPort)
|
||||
}
|
||||
if tcp.Seq != 0xc57e0e48 {
|
||||
t.Error("tcp seq", tcp.Seq)
|
||||
}
|
||||
if tcp.Ack != 0x49074232 {
|
||||
t.Error("tcp ack", tcp.Ack)
|
||||
}
|
||||
if tcp.DataOffset != 8 {
|
||||
t.Error("tcp dataoffset", tcp.DataOffset)
|
||||
}
|
||||
if tcp.Flags != 0x18 {
|
||||
t.Error("tcp flags", tcp.Flags)
|
||||
}
|
||||
if tcp.Window != 0x73 {
|
||||
t.Error("tcp window", tcp.Window)
|
||||
}
|
||||
if tcp.Checksum != 0xabb1 {
|
||||
t.Error("tcp checksum", tcp.Checksum)
|
||||
}
|
||||
if tcp.Urgent != 0 {
|
||||
t.Error("tcp urgent", tcp.Urgent)
|
||||
}
|
||||
} else {
|
||||
t.Error("Second header is not TCP header")
|
||||
}
|
||||
} else {
|
||||
t.Error("First header is not IP header")
|
||||
}
|
||||
if string(p.Payload) != "GET / HTTP/1.1\r\nHost: www.fish.com\r\nConnection: keep-alive\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Encoding: gzip,deflate,sdch\r\nAccept-Language: en-US,en;q=0.8\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3\r\n\r\n" {
|
||||
t.Error("--- PAYLOAD STRING ---\n", string(p.Payload), "\n--- PAYLOAD BYTES ---\n", p.Payload)
|
||||
}
|
||||
}
|
||||
|
||||
// Makes sure packet payload doesn't display the 6 trailing null of this packet
|
||||
// as part of the payload. They're actually the ethernet trailer.
|
||||
func TestDecodeSmallTcpPacketHasEmptyPayload(t *testing.T) {
|
||||
p := &Packet{
|
||||
// This packet is only 54 bits (an empty TCP RST), thus 6 trailing null
|
||||
// bytes are added by the ethernet layer to make it the minimum packet size.
|
||||
Data: []byte{
|
||||
0xbc, 0x30, 0x5b, 0xe8, 0xd3, 0x49, 0xb8, 0xac, 0x6f, 0x92, 0xd5, 0xbf,
|
||||
0x08, 0x00, 0x45, 0x00, 0x00, 0x28, 0x00, 0x00, 0x40, 0x00, 0x40, 0x06,
|
||||
0x3f, 0x9f, 0xac, 0x11, 0x51, 0xc5, 0xac, 0x11, 0x51, 0x49, 0x00, 0x63,
|
||||
0x9a, 0xef, 0x00, 0x00, 0x00, 0x00, 0x2e, 0xc1, 0x27, 0x83, 0x50, 0x14,
|
||||
0x00, 0x00, 0xc3, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
}}
|
||||
p.Decode()
|
||||
if p.Payload == nil {
|
||||
t.Error("Nil payload")
|
||||
}
|
||||
if len(p.Payload) != 0 {
|
||||
t.Error("Non-empty payload:", p.Payload)
|
||||
}
|
||||
}
|
||||
|
||||
func TestDecodeVlanPacket(t *testing.T) {
|
||||
p := &Packet{
|
||||
Data: []byte{
|
||||
0x00, 0x10, 0xdb, 0xff, 0x10, 0x00, 0x00, 0x15, 0x2c, 0x9d, 0xcc, 0x00, 0x81, 0x00, 0x01, 0xf7,
|
||||
0x08, 0x00, 0x45, 0x00, 0x00, 0x28, 0x29, 0x8d, 0x40, 0x00, 0x7d, 0x06, 0x83, 0xa0, 0xac, 0x1b,
|
||||
0xca, 0x8e, 0x45, 0x16, 0x94, 0xe2, 0xd4, 0x0a, 0x00, 0x50, 0xdf, 0xab, 0x9c, 0xc6, 0xcd, 0x1e,
|
||||
0xe5, 0xd1, 0x50, 0x10, 0x01, 0x00, 0x5a, 0x74, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
}}
|
||||
p.Decode()
|
||||
if p.Type != TYPE_VLAN {
|
||||
t.Error("Didn't detect vlan")
|
||||
}
|
||||
if len(p.Headers) != 3 {
|
||||
t.Error("Incorrect number of headers:", len(p.Headers))
|
||||
for i, h := range p.Headers {
|
||||
t.Errorf("Header %d: %#v", i, h)
|
||||
}
|
||||
t.FailNow()
|
||||
}
|
||||
if _, ok := p.Headers[0].(*Vlanhdr); !ok {
|
||||
t.Errorf("First header isn't vlan: %q", p.Headers[0])
|
||||
}
|
||||
if _, ok := p.Headers[1].(*Iphdr); !ok {
|
||||
t.Errorf("Second header isn't IP: %q", p.Headers[1])
|
||||
}
|
||||
if _, ok := p.Headers[2].(*Tcphdr); !ok {
|
||||
t.Errorf("Third header isn't TCP: %q", p.Headers[2])
|
||||
}
|
||||
}
|
||||
|
||||
func TestDecodeFuzzFallout(t *testing.T) {
|
||||
testData := []struct {
|
||||
Data []byte
|
||||
}{
|
||||
{[]byte("000000000000\x81\x000")},
|
||||
{[]byte("000000000000\x81\x00000")},
|
||||
{[]byte("000000000000\x86\xdd0")},
|
||||
{[]byte("000000000000\b\x000")},
|
||||
{[]byte("000000000000\b\x060")},
|
||||
{[]byte{}},
|
||||
{[]byte("000000000000\b\x0600000000")},
|
||||
{[]byte("000000000000\x86\xdd000000\x01000000000000000000000000000000000")},
|
||||
{[]byte("000000000000\x81\x0000\b\x0600000000")},
|
||||
{[]byte("000000000000\b\x00n0000000000000000000")},
|
||||
{[]byte("000000000000\x86\xdd000000\x0100000000000000000000000000000000000")},
|
||||
{[]byte("000000000000\x81\x0000\b\x00g0000000000000000000")},
|
||||
//{[]byte()},
|
||||
{[]byte("000000000000\b\x00400000000\x110000000000")},
|
||||
{[]byte("0nMء\xfe\x13\x13\x81\x00gr\b\x00&x\xc9\xe5b'\x1e0\x00\x04\x00\x0020596224")},
|
||||
{[]byte("000000000000\x81\x0000\b\x00400000000\x110000000000")},
|
||||
{[]byte("000000000000\b\x00000000000\x0600\xff0000000")},
|
||||
{[]byte("000000000000\x86\xdd000000\x06000000000000000000000000000000000")},
|
||||
{[]byte("000000000000\x81\x0000\b\x00000000000\x0600b0000000")},
|
||||
{[]byte("000000000000\x81\x0000\b\x00400000000\x060000000000")},
|
||||
{[]byte("000000000000\x86\xdd000000\x11000000000000000000000000000000000")},
|
||||
{[]byte("000000000000\x86\xdd000000\x0600000000000000000000000000000000000000000000M")},
|
||||
{[]byte("000000000000\b\x00500000000\x0600000000000")},
|
||||
{[]byte("0nM\xd80\xfe\x13\x13\x81\x00gr\b\x00&x\xc9\xe5b'\x1e0\x00\x04\x00\x0020596224")},
|
||||
}
|
||||
|
||||
for _, entry := range testData {
|
||||
pkt := &Packet{
|
||||
Time: time.Now(),
|
||||
Caplen: uint32(len(entry.Data)),
|
||||
Len: uint32(len(entry.Data)),
|
||||
Data: entry.Data,
|
||||
}
|
||||
|
||||
pkt.Decode()
|
||||
/*
|
||||
func() {
|
||||
defer func() {
|
||||
if err := recover(); err != nil {
|
||||
t.Fatalf("%d. %q failed: %v", idx, string(entry.Data), err)
|
||||
}
|
||||
}()
|
||||
pkt.Decode()
|
||||
}()
|
||||
*/
|
||||
}
|
||||
}
|
206
Godeps/_workspace/src/github.com/akrennmair/gopcap/io.go
generated
vendored
206
Godeps/_workspace/src/github.com/akrennmair/gopcap/io.go
generated
vendored
@@ -1,206 +0,0 @@
|
||||
package pcap
|
||||
|
||||
import (
|
||||
"encoding/binary"
|
||||
"fmt"
|
||||
"io"
|
||||
"time"
|
||||
)
|
||||
|
||||
// FileHeader is the parsed header of a pcap file.
|
||||
// http://wiki.wireshark.org/Development/LibpcapFileFormat
|
||||
type FileHeader struct {
|
||||
MagicNumber uint32
|
||||
VersionMajor uint16
|
||||
VersionMinor uint16
|
||||
TimeZone int32
|
||||
SigFigs uint32
|
||||
SnapLen uint32
|
||||
Network uint32
|
||||
}
|
||||
|
||||
type PacketTime struct {
|
||||
Sec int32
|
||||
Usec int32
|
||||
}
|
||||
|
||||
// Convert the PacketTime to a go Time struct.
|
||||
func (p *PacketTime) Time() time.Time {
|
||||
return time.Unix(int64(p.Sec), int64(p.Usec)*1000)
|
||||
}
|
||||
|
||||
// Packet is a single packet parsed from a pcap file.
|
||||
//
|
||||
// Convenient access to IP, TCP, and UDP headers is provided after Decode()
|
||||
// is called if the packet is of the appropriate type.
|
||||
type Packet struct {
|
||||
Time time.Time // packet send/receive time
|
||||
Caplen uint32 // bytes stored in the file (caplen <= len)
|
||||
Len uint32 // bytes sent/received
|
||||
Data []byte // packet data
|
||||
|
||||
Type int // protocol type, see LINKTYPE_*
|
||||
DestMac uint64
|
||||
SrcMac uint64
|
||||
|
||||
Headers []interface{} // decoded headers, in order
|
||||
Payload []byte // remaining non-header bytes
|
||||
|
||||
IP *Iphdr // IP header (for IP packets, after decoding)
|
||||
TCP *Tcphdr // TCP header (for TCP packets, after decoding)
|
||||
UDP *Udphdr // UDP header (for UDP packets after decoding)
|
||||
}
|
||||
|
||||
// Reader parses pcap files.
|
||||
type Reader struct {
|
||||
flip bool
|
||||
buf io.Reader
|
||||
err error
|
||||
fourBytes []byte
|
||||
twoBytes []byte
|
||||
sixteenBytes []byte
|
||||
Header FileHeader
|
||||
}
|
||||
|
||||
// NewReader reads pcap data from an io.Reader.
|
||||
func NewReader(reader io.Reader) (*Reader, error) {
|
||||
r := &Reader{
|
||||
buf: reader,
|
||||
fourBytes: make([]byte, 4),
|
||||
twoBytes: make([]byte, 2),
|
||||
sixteenBytes: make([]byte, 16),
|
||||
}
|
||||
switch magic := r.readUint32(); magic {
|
||||
case 0xa1b2c3d4:
|
||||
r.flip = false
|
||||
case 0xd4c3b2a1:
|
||||
r.flip = true
|
||||
default:
|
||||
return nil, fmt.Errorf("pcap: bad magic number: %0x", magic)
|
||||
}
|
||||
r.Header = FileHeader{
|
||||
MagicNumber: 0xa1b2c3d4,
|
||||
VersionMajor: r.readUint16(),
|
||||
VersionMinor: r.readUint16(),
|
||||
TimeZone: r.readInt32(),
|
||||
SigFigs: r.readUint32(),
|
||||
SnapLen: r.readUint32(),
|
||||
Network: r.readUint32(),
|
||||
}
|
||||
return r, nil
|
||||
}
|
||||
|
||||
// Next returns the next packet or nil if no more packets can be read.
|
||||
func (r *Reader) Next() *Packet {
|
||||
d := r.sixteenBytes
|
||||
r.err = r.read(d)
|
||||
if r.err != nil {
|
||||
return nil
|
||||
}
|
||||
timeSec := asUint32(d[0:4], r.flip)
|
||||
timeUsec := asUint32(d[4:8], r.flip)
|
||||
capLen := asUint32(d[8:12], r.flip)
|
||||
origLen := asUint32(d[12:16], r.flip)
|
||||
|
||||
data := make([]byte, capLen)
|
||||
if r.err = r.read(data); r.err != nil {
|
||||
return nil
|
||||
}
|
||||
return &Packet{
|
||||
Time: time.Unix(int64(timeSec), int64(timeUsec)),
|
||||
Caplen: capLen,
|
||||
Len: origLen,
|
||||
Data: data,
|
||||
}
|
||||
}
|
||||
|
||||
func (r *Reader) read(data []byte) error {
|
||||
var err error
|
||||
n, err := r.buf.Read(data)
|
||||
for err == nil && n != len(data) {
|
||||
var chunk int
|
||||
chunk, err = r.buf.Read(data[n:])
|
||||
n += chunk
|
||||
}
|
||||
if len(data) == n {
|
||||
return nil
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
func (r *Reader) readUint32() uint32 {
|
||||
data := r.fourBytes
|
||||
if r.err = r.read(data); r.err != nil {
|
||||
return 0
|
||||
}
|
||||
return asUint32(data, r.flip)
|
||||
}
|
||||
|
||||
func (r *Reader) readInt32() int32 {
|
||||
data := r.fourBytes
|
||||
if r.err = r.read(data); r.err != nil {
|
||||
return 0
|
||||
}
|
||||
return int32(asUint32(data, r.flip))
|
||||
}
|
||||
|
||||
func (r *Reader) readUint16() uint16 {
|
||||
data := r.twoBytes
|
||||
if r.err = r.read(data); r.err != nil {
|
||||
return 0
|
||||
}
|
||||
return asUint16(data, r.flip)
|
||||
}
|
||||
|
||||
// Writer writes a pcap file.
|
||||
type Writer struct {
|
||||
writer io.Writer
|
||||
buf []byte
|
||||
}
|
||||
|
||||
// NewWriter creates a Writer that stores output in an io.Writer.
|
||||
// The FileHeader is written immediately.
|
||||
func NewWriter(writer io.Writer, header *FileHeader) (*Writer, error) {
|
||||
w := &Writer{
|
||||
writer: writer,
|
||||
buf: make([]byte, 24),
|
||||
}
|
||||
binary.LittleEndian.PutUint32(w.buf, header.MagicNumber)
|
||||
binary.LittleEndian.PutUint16(w.buf[4:], header.VersionMajor)
|
||||
binary.LittleEndian.PutUint16(w.buf[6:], header.VersionMinor)
|
||||
binary.LittleEndian.PutUint32(w.buf[8:], uint32(header.TimeZone))
|
||||
binary.LittleEndian.PutUint32(w.buf[12:], header.SigFigs)
|
||||
binary.LittleEndian.PutUint32(w.buf[16:], header.SnapLen)
|
||||
binary.LittleEndian.PutUint32(w.buf[20:], header.Network)
|
||||
if _, err := writer.Write(w.buf); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return w, nil
|
||||
}
|
||||
|
||||
// Writer writes a packet to the underlying writer.
|
||||
func (w *Writer) Write(pkt *Packet) error {
|
||||
binary.LittleEndian.PutUint32(w.buf, uint32(pkt.Time.Unix()))
|
||||
binary.LittleEndian.PutUint32(w.buf[4:], uint32(pkt.Time.Nanosecond()))
|
||||
binary.LittleEndian.PutUint32(w.buf[8:], uint32(pkt.Time.Unix()))
|
||||
binary.LittleEndian.PutUint32(w.buf[12:], pkt.Len)
|
||||
if _, err := w.writer.Write(w.buf[:16]); err != nil {
|
||||
return err
|
||||
}
|
||||
_, err := w.writer.Write(pkt.Data)
|
||||
return err
|
||||
}
|
||||
|
||||
func asUint32(data []byte, flip bool) uint32 {
|
||||
if flip {
|
||||
return binary.BigEndian.Uint32(data)
|
||||
}
|
||||
return binary.LittleEndian.Uint32(data)
|
||||
}
|
||||
|
||||
func asUint16(data []byte, flip bool) uint16 {
|
||||
if flip {
|
||||
return binary.BigEndian.Uint16(data)
|
||||
}
|
||||
return binary.LittleEndian.Uint16(data)
|
||||
}
|
266
Godeps/_workspace/src/github.com/akrennmair/gopcap/pcap.go
generated
vendored
266
Godeps/_workspace/src/github.com/akrennmair/gopcap/pcap.go
generated
vendored
@@ -1,266 +0,0 @@
|
||||
// Interface to both live and offline pcap parsing.
|
||||
package pcap
|
||||
|
||||
/*
|
||||
#cgo linux LDFLAGS: -lpcap
|
||||
#cgo freebsd LDFLAGS: -lpcap
|
||||
#cgo darwin LDFLAGS: -lpcap
|
||||
#cgo windows CFLAGS: -I C:/WpdPack/Include
|
||||
#cgo windows,386 LDFLAGS: -L C:/WpdPack/Lib -lwpcap
|
||||
#cgo windows,amd64 LDFLAGS: -L C:/WpdPack/Lib/x64 -lwpcap
|
||||
#include <stdlib.h>
|
||||
#include <pcap.h>
|
||||
|
||||
// Workaround for not knowing how to cast to const u_char**
|
||||
int hack_pcap_next_ex(pcap_t *p, struct pcap_pkthdr **pkt_header,
|
||||
u_char **pkt_data) {
|
||||
return pcap_next_ex(p, pkt_header, (const u_char **)pkt_data);
|
||||
}
|
||||
*/
|
||||
import "C"
|
||||
import (
|
||||
"errors"
|
||||
"net"
|
||||
"syscall"
|
||||
"time"
|
||||
"unsafe"
|
||||
)
|
||||
|
||||
type Pcap struct {
|
||||
cptr *C.pcap_t
|
||||
}
|
||||
|
||||
type Stat struct {
|
||||
PacketsReceived uint32
|
||||
PacketsDropped uint32
|
||||
PacketsIfDropped uint32
|
||||
}
|
||||
|
||||
type Interface struct {
|
||||
Name string
|
||||
Description string
|
||||
Addresses []IFAddress
|
||||
// TODO: add more elements
|
||||
}
|
||||
|
||||
type IFAddress struct {
|
||||
IP net.IP
|
||||
Netmask net.IPMask
|
||||
// TODO: add broadcast + PtP dst ?
|
||||
}
|
||||
|
||||
func (p *Pcap) Next() (pkt *Packet) {
|
||||
rv, _ := p.NextEx()
|
||||
return rv
|
||||
}
|
||||
|
||||
// Openlive opens a device and returns a *Pcap handler
|
||||
func Openlive(device string, snaplen int32, promisc bool, timeout_ms int32) (handle *Pcap, err error) {
|
||||
var buf *C.char
|
||||
buf = (*C.char)(C.calloc(ERRBUF_SIZE, 1))
|
||||
h := new(Pcap)
|
||||
var pro int32
|
||||
if promisc {
|
||||
pro = 1
|
||||
}
|
||||
|
||||
dev := C.CString(device)
|
||||
defer C.free(unsafe.Pointer(dev))
|
||||
|
||||
h.cptr = C.pcap_open_live(dev, C.int(snaplen), C.int(pro), C.int(timeout_ms), buf)
|
||||
if nil == h.cptr {
|
||||
handle = nil
|
||||
err = errors.New(C.GoString(buf))
|
||||
} else {
|
||||
handle = h
|
||||
}
|
||||
C.free(unsafe.Pointer(buf))
|
||||
return
|
||||
}
|
||||
|
||||
func Openoffline(file string) (handle *Pcap, err error) {
|
||||
var buf *C.char
|
||||
buf = (*C.char)(C.calloc(ERRBUF_SIZE, 1))
|
||||
h := new(Pcap)
|
||||
|
||||
cf := C.CString(file)
|
||||
defer C.free(unsafe.Pointer(cf))
|
||||
|
||||
h.cptr = C.pcap_open_offline(cf, buf)
|
||||
if nil == h.cptr {
|
||||
handle = nil
|
||||
err = errors.New(C.GoString(buf))
|
||||
} else {
|
||||
handle = h
|
||||
}
|
||||
C.free(unsafe.Pointer(buf))
|
||||
return
|
||||
}
|
||||
|
||||
func (p *Pcap) NextEx() (pkt *Packet, result int32) {
|
||||
var pkthdr *C.struct_pcap_pkthdr
|
||||
|
||||
var buf_ptr *C.u_char
|
||||
var buf unsafe.Pointer
|
||||
result = int32(C.hack_pcap_next_ex(p.cptr, &pkthdr, &buf_ptr))
|
||||
|
||||
buf = unsafe.Pointer(buf_ptr)
|
||||
if nil == buf {
|
||||
return
|
||||
}
|
||||
|
||||
pkt = new(Packet)
|
||||
pkt.Time = time.Unix(int64(pkthdr.ts.tv_sec), int64(pkthdr.ts.tv_usec)*1000)
|
||||
pkt.Caplen = uint32(pkthdr.caplen)
|
||||
pkt.Len = uint32(pkthdr.len)
|
||||
pkt.Data = C.GoBytes(buf, C.int(pkthdr.caplen))
|
||||
return
|
||||
}
|
||||
|
||||
func (p *Pcap) Close() {
|
||||
C.pcap_close(p.cptr)
|
||||
}
|
||||
|
||||
func (p *Pcap) Geterror() error {
|
||||
return errors.New(C.GoString(C.pcap_geterr(p.cptr)))
|
||||
}
|
||||
|
||||
func (p *Pcap) Getstats() (stat *Stat, err error) {
|
||||
var cstats _Ctype_struct_pcap_stat
|
||||
if -1 == C.pcap_stats(p.cptr, &cstats) {
|
||||
return nil, p.Geterror()
|
||||
}
|
||||
stats := new(Stat)
|
||||
stats.PacketsReceived = uint32(cstats.ps_recv)
|
||||
stats.PacketsDropped = uint32(cstats.ps_drop)
|
||||
stats.PacketsIfDropped = uint32(cstats.ps_ifdrop)
|
||||
|
||||
return stats, nil
|
||||
}
|
||||
|
||||
func (p *Pcap) Setfilter(expr string) (err error) {
|
||||
var bpf _Ctype_struct_bpf_program
|
||||
cexpr := C.CString(expr)
|
||||
defer C.free(unsafe.Pointer(cexpr))
|
||||
|
||||
if -1 == C.pcap_compile(p.cptr, &bpf, cexpr, 1, 0) {
|
||||
return p.Geterror()
|
||||
}
|
||||
|
||||
if -1 == C.pcap_setfilter(p.cptr, &bpf) {
|
||||
C.pcap_freecode(&bpf)
|
||||
return p.Geterror()
|
||||
}
|
||||
C.pcap_freecode(&bpf)
|
||||
return nil
|
||||
}
|
||||
|
||||
func Version() string {
|
||||
return C.GoString(C.pcap_lib_version())
|
||||
}
|
||||
|
||||
func (p *Pcap) Datalink() int {
|
||||
return int(C.pcap_datalink(p.cptr))
|
||||
}
|
||||
|
||||
func (p *Pcap) Setdatalink(dlt int) error {
|
||||
if -1 == C.pcap_set_datalink(p.cptr, C.int(dlt)) {
|
||||
return p.Geterror()
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func DatalinkValueToName(dlt int) string {
|
||||
if name := C.pcap_datalink_val_to_name(C.int(dlt)); name != nil {
|
||||
return C.GoString(name)
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func DatalinkValueToDescription(dlt int) string {
|
||||
if desc := C.pcap_datalink_val_to_description(C.int(dlt)); desc != nil {
|
||||
return C.GoString(desc)
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func Findalldevs() (ifs []Interface, err error) {
|
||||
var buf *C.char
|
||||
buf = (*C.char)(C.calloc(ERRBUF_SIZE, 1))
|
||||
defer C.free(unsafe.Pointer(buf))
|
||||
var alldevsp *C.pcap_if_t
|
||||
|
||||
if -1 == C.pcap_findalldevs((**C.pcap_if_t)(&alldevsp), buf) {
|
||||
return nil, errors.New(C.GoString(buf))
|
||||
}
|
||||
defer C.pcap_freealldevs((*C.pcap_if_t)(alldevsp))
|
||||
dev := alldevsp
|
||||
var i uint32
|
||||
for i = 0; dev != nil; dev = (*C.pcap_if_t)(dev.next) {
|
||||
i++
|
||||
}
|
||||
ifs = make([]Interface, i)
|
||||
dev = alldevsp
|
||||
for j := uint32(0); dev != nil; dev = (*C.pcap_if_t)(dev.next) {
|
||||
var iface Interface
|
||||
iface.Name = C.GoString(dev.name)
|
||||
iface.Description = C.GoString(dev.description)
|
||||
iface.Addresses = findalladdresses(dev.addresses)
|
||||
// TODO: add more elements
|
||||
ifs[j] = iface
|
||||
j++
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func findalladdresses(addresses *_Ctype_struct_pcap_addr) (retval []IFAddress) {
|
||||
// TODO - make it support more than IPv4 and IPv6?
|
||||
retval = make([]IFAddress, 0, 1)
|
||||
for curaddr := addresses; curaddr != nil; curaddr = (*_Ctype_struct_pcap_addr)(curaddr.next) {
|
||||
var a IFAddress
|
||||
var err error
|
||||
if a.IP, err = sockaddr_to_IP((*syscall.RawSockaddr)(unsafe.Pointer(curaddr.addr))); err != nil {
|
||||
continue
|
||||
}
|
||||
if a.Netmask, err = sockaddr_to_IP((*syscall.RawSockaddr)(unsafe.Pointer(curaddr.addr))); err != nil {
|
||||
continue
|
||||
}
|
||||
retval = append(retval, a)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func sockaddr_to_IP(rsa *syscall.RawSockaddr) (IP []byte, err error) {
|
||||
switch rsa.Family {
|
||||
case syscall.AF_INET:
|
||||
pp := (*syscall.RawSockaddrInet4)(unsafe.Pointer(rsa))
|
||||
IP = make([]byte, 4)
|
||||
for i := 0; i < len(IP); i++ {
|
||||
IP[i] = pp.Addr[i]
|
||||
}
|
||||
return
|
||||
case syscall.AF_INET6:
|
||||
pp := (*syscall.RawSockaddrInet6)(unsafe.Pointer(rsa))
|
||||
IP = make([]byte, 16)
|
||||
for i := 0; i < len(IP); i++ {
|
||||
IP[i] = pp.Addr[i]
|
||||
}
|
||||
return
|
||||
}
|
||||
err = errors.New("Unsupported address type")
|
||||
return
|
||||
}
|
||||
|
||||
func (p *Pcap) Inject(data []byte) (err error) {
|
||||
buf := (*C.char)(C.malloc((C.size_t)(len(data))))
|
||||
|
||||
for i := 0; i < len(data); i++ {
|
||||
*(*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(buf)) + uintptr(i))) = data[i]
|
||||
}
|
||||
|
||||
if -1 == C.pcap_sendpacket(p.cptr, (*C.u_char)(unsafe.Pointer(buf)), (C.int)(len(data))) {
|
||||
err = p.Geterror()
|
||||
}
|
||||
C.free(unsafe.Pointer(buf))
|
||||
return
|
||||
}
|
49
Godeps/_workspace/src/github.com/akrennmair/gopcap/tools/benchmark/benchmark.go
generated
vendored
49
Godeps/_workspace/src/github.com/akrennmair/gopcap/tools/benchmark/benchmark.go
generated
vendored
@@ -1,49 +0,0 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"os"
|
||||
"runtime/pprof"
|
||||
"time"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/akrennmair/gopcap"
|
||||
)
|
||||
|
||||
func main() {
|
||||
var filename *string = flag.String("file", "", "filename")
|
||||
var decode *bool = flag.Bool("d", false, "If true, decode each packet")
|
||||
var cpuprofile *string = flag.String("cpuprofile", "", "filename")
|
||||
|
||||
flag.Parse()
|
||||
|
||||
h, err := pcap.Openoffline(*filename)
|
||||
if err != nil {
|
||||
fmt.Printf("Couldn't create pcap reader: %v", err)
|
||||
}
|
||||
|
||||
if *cpuprofile != "" {
|
||||
if out, err := os.Create(*cpuprofile); err == nil {
|
||||
pprof.StartCPUProfile(out)
|
||||
defer func() {
|
||||
pprof.StopCPUProfile()
|
||||
out.Close()
|
||||
}()
|
||||
} else {
|
||||
panic(err)
|
||||
}
|
||||
}
|
||||
|
||||
i, nilPackets := 0, 0
|
||||
start := time.Now()
|
||||
for pkt, code := h.NextEx(); code != -2; pkt, code = h.NextEx() {
|
||||
if pkt == nil {
|
||||
nilPackets++
|
||||
} else if *decode {
|
||||
pkt.Decode()
|
||||
}
|
||||
i++
|
||||
}
|
||||
duration := time.Since(start)
|
||||
fmt.Printf("Took %v to process %v packets, %v per packet, %d nil packets\n", duration, i, duration/time.Duration(i), nilPackets)
|
||||
}
|
96
Godeps/_workspace/src/github.com/akrennmair/gopcap/tools/pass/pass.go
generated
vendored
96
Godeps/_workspace/src/github.com/akrennmair/gopcap/tools/pass/pass.go
generated
vendored
@@ -1,96 +0,0 @@
|
||||
package main
|
||||
|
||||
// Parses a pcap file, writes it back to disk, then verifies the files
|
||||
// are the same.
|
||||
import (
|
||||
"bufio"
|
||||
"flag"
|
||||
"fmt"
|
||||
"io"
|
||||
"os"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/akrennmair/gopcap"
|
||||
)
|
||||
|
||||
var input *string = flag.String("input", "", "input file")
|
||||
var output *string = flag.String("output", "", "output file")
|
||||
var decode *bool = flag.Bool("decode", false, "print decoded packets")
|
||||
|
||||
func copyPcap(dest, src string) {
|
||||
f, err := os.Open(src)
|
||||
if err != nil {
|
||||
fmt.Printf("couldn't open %q: %v\n", src, err)
|
||||
return
|
||||
}
|
||||
defer f.Close()
|
||||
reader, err := pcap.NewReader(bufio.NewReader(f))
|
||||
if err != nil {
|
||||
fmt.Printf("couldn't create reader: %v\n", err)
|
||||
return
|
||||
}
|
||||
w, err := os.Create(dest)
|
||||
if err != nil {
|
||||
fmt.Printf("couldn't open %q: %v\n", dest, err)
|
||||
return
|
||||
}
|
||||
defer w.Close()
|
||||
buf := bufio.NewWriter(w)
|
||||
writer, err := pcap.NewWriter(buf, &reader.Header)
|
||||
if err != nil {
|
||||
fmt.Printf("couldn't create writer: %v\n", err)
|
||||
return
|
||||
}
|
||||
for {
|
||||
pkt := reader.Next()
|
||||
if pkt == nil {
|
||||
break
|
||||
}
|
||||
if *decode {
|
||||
pkt.Decode()
|
||||
fmt.Println(pkt.String())
|
||||
}
|
||||
writer.Write(pkt)
|
||||
}
|
||||
buf.Flush()
|
||||
}
|
||||
|
||||
func check(dest, src string) {
|
||||
f, err := os.Open(src)
|
||||
if err != nil {
|
||||
fmt.Printf("couldn't open %q: %v\n", src, err)
|
||||
return
|
||||
}
|
||||
defer f.Close()
|
||||
freader := bufio.NewReader(f)
|
||||
|
||||
g, err := os.Open(dest)
|
||||
if err != nil {
|
||||
fmt.Printf("couldn't open %q: %v\n", src, err)
|
||||
return
|
||||
}
|
||||
defer g.Close()
|
||||
greader := bufio.NewReader(g)
|
||||
|
||||
for {
|
||||
fb, ferr := freader.ReadByte()
|
||||
gb, gerr := greader.ReadByte()
|
||||
|
||||
if ferr == io.EOF && gerr == io.EOF {
|
||||
break
|
||||
}
|
||||
if fb == gb {
|
||||
continue
|
||||
}
|
||||
fmt.Println("FAIL")
|
||||
return
|
||||
}
|
||||
|
||||
fmt.Println("PASS")
|
||||
}
|
||||
|
||||
func main() {
|
||||
flag.Parse()
|
||||
|
||||
copyPcap(*output, *input)
|
||||
check(*output, *input)
|
||||
}
|
82
Godeps/_workspace/src/github.com/akrennmair/gopcap/tools/pcaptest/pcaptest.go
generated
vendored
82
Godeps/_workspace/src/github.com/akrennmair/gopcap/tools/pcaptest/pcaptest.go
generated
vendored
@@ -1,82 +0,0 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"time"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/akrennmair/gopcap"
|
||||
)
|
||||
|
||||
func min(x uint32, y uint32) uint32 {
|
||||
if x < y {
|
||||
return x
|
||||
}
|
||||
return y
|
||||
}
|
||||
|
||||
func main() {
|
||||
var device *string = flag.String("d", "", "device")
|
||||
var file *string = flag.String("r", "", "file")
|
||||
var expr *string = flag.String("e", "", "filter expression")
|
||||
|
||||
flag.Parse()
|
||||
|
||||
var h *pcap.Pcap
|
||||
var err error
|
||||
|
||||
ifs, err := pcap.Findalldevs()
|
||||
if len(ifs) == 0 {
|
||||
fmt.Printf("Warning: no devices found : %s\n", err)
|
||||
} else {
|
||||
for i := 0; i < len(ifs); i++ {
|
||||
fmt.Printf("dev %d: %s (%s)\n", i+1, ifs[i].Name, ifs[i].Description)
|
||||
}
|
||||
}
|
||||
|
||||
if *device != "" {
|
||||
h, err = pcap.Openlive(*device, 65535, true, 0)
|
||||
if h == nil {
|
||||
fmt.Printf("Openlive(%s) failed: %s\n", *device, err)
|
||||
return
|
||||
}
|
||||
} else if *file != "" {
|
||||
h, err = pcap.Openoffline(*file)
|
||||
if h == nil {
|
||||
fmt.Printf("Openoffline(%s) failed: %s\n", *file, err)
|
||||
return
|
||||
}
|
||||
} else {
|
||||
fmt.Printf("usage: pcaptest [-d <device> | -r <file>]\n")
|
||||
return
|
||||
}
|
||||
defer h.Close()
|
||||
|
||||
fmt.Printf("pcap version: %s\n", pcap.Version())
|
||||
|
||||
if *expr != "" {
|
||||
fmt.Printf("Setting filter: %s\n", *expr)
|
||||
err := h.Setfilter(*expr)
|
||||
if err != nil {
|
||||
fmt.Printf("Warning: setting filter failed: %s\n", err)
|
||||
}
|
||||
}
|
||||
|
||||
for pkt := h.Next(); pkt != nil; pkt = h.Next() {
|
||||
fmt.Printf("time: %d.%06d (%s) caplen: %d len: %d\nData:",
|
||||
int64(pkt.Time.Second()), int64(pkt.Time.Nanosecond()),
|
||||
time.Unix(int64(pkt.Time.Second()), 0).String(), int64(pkt.Caplen), int64(pkt.Len))
|
||||
for i := uint32(0); i < pkt.Caplen; i++ {
|
||||
if i%32 == 0 {
|
||||
fmt.Printf("\n")
|
||||
}
|
||||
if 32 <= pkt.Data[i] && pkt.Data[i] <= 126 {
|
||||
fmt.Printf("%c", pkt.Data[i])
|
||||
} else {
|
||||
fmt.Printf(".")
|
||||
}
|
||||
}
|
||||
fmt.Printf("\n\n")
|
||||
}
|
||||
|
||||
}
|
121
Godeps/_workspace/src/github.com/akrennmair/gopcap/tools/tcpdump/tcpdump.go
generated
vendored
121
Godeps/_workspace/src/github.com/akrennmair/gopcap/tools/tcpdump/tcpdump.go
generated
vendored
@@ -1,121 +0,0 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"flag"
|
||||
"fmt"
|
||||
"os"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/akrennmair/gopcap"
|
||||
)
|
||||
|
||||
const (
|
||||
TYPE_IP = 0x0800
|
||||
TYPE_ARP = 0x0806
|
||||
TYPE_IP6 = 0x86DD
|
||||
|
||||
IP_ICMP = 1
|
||||
IP_INIP = 4
|
||||
IP_TCP = 6
|
||||
IP_UDP = 17
|
||||
)
|
||||
|
||||
var out *bufio.Writer
|
||||
var errout *bufio.Writer
|
||||
|
||||
func main() {
|
||||
var device *string = flag.String("i", "", "interface")
|
||||
var snaplen *int = flag.Int("s", 65535, "snaplen")
|
||||
var hexdump *bool = flag.Bool("X", false, "hexdump")
|
||||
expr := ""
|
||||
|
||||
out = bufio.NewWriter(os.Stdout)
|
||||
errout = bufio.NewWriter(os.Stderr)
|
||||
|
||||
flag.Usage = func() {
|
||||
fmt.Fprintf(errout, "usage: %s [ -i interface ] [ -s snaplen ] [ -X ] [ expression ]\n", os.Args[0])
|
||||
errout.Flush()
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
flag.Parse()
|
||||
|
||||
if len(flag.Args()) > 0 {
|
||||
expr = flag.Arg(0)
|
||||
}
|
||||
|
||||
if *device == "" {
|
||||
devs, err := pcap.Findalldevs()
|
||||
if err != nil {
|
||||
fmt.Fprintf(errout, "tcpdump: couldn't find any devices: %s\n", err)
|
||||
}
|
||||
if 0 == len(devs) {
|
||||
flag.Usage()
|
||||
}
|
||||
*device = devs[0].Name
|
||||
}
|
||||
|
||||
h, err := pcap.Openlive(*device, int32(*snaplen), true, 0)
|
||||
if h == nil {
|
||||
fmt.Fprintf(errout, "tcpdump: %s\n", err)
|
||||
errout.Flush()
|
||||
return
|
||||
}
|
||||
defer h.Close()
|
||||
|
||||
if expr != "" {
|
||||
ferr := h.Setfilter(expr)
|
||||
if ferr != nil {
|
||||
fmt.Fprintf(out, "tcpdump: %s\n", ferr)
|
||||
out.Flush()
|
||||
}
|
||||
}
|
||||
|
||||
for pkt := h.Next(); pkt != nil; pkt = h.Next() {
|
||||
pkt.Decode()
|
||||
fmt.Fprintf(out, "%s\n", pkt.String())
|
||||
if *hexdump {
|
||||
Hexdump(pkt)
|
||||
}
|
||||
out.Flush()
|
||||
}
|
||||
}
|
||||
|
||||
func min(a, b int) int {
|
||||
if a < b {
|
||||
return a
|
||||
}
|
||||
return b
|
||||
}
|
||||
|
||||
func Hexdump(pkt *pcap.Packet) {
|
||||
for i := 0; i < len(pkt.Data); i += 16 {
|
||||
Dumpline(uint32(i), pkt.Data[i:min(i+16, len(pkt.Data))])
|
||||
}
|
||||
}
|
||||
|
||||
func Dumpline(addr uint32, line []byte) {
|
||||
fmt.Fprintf(out, "\t0x%04x: ", int32(addr))
|
||||
var i uint16
|
||||
for i = 0; i < 16 && i < uint16(len(line)); i++ {
|
||||
if i%2 == 0 {
|
||||
out.WriteString(" ")
|
||||
}
|
||||
fmt.Fprintf(out, "%02x", line[i])
|
||||
}
|
||||
for j := i; j <= 16; j++ {
|
||||
if j%2 == 0 {
|
||||
out.WriteString(" ")
|
||||
}
|
||||
out.WriteString(" ")
|
||||
}
|
||||
out.WriteString(" ")
|
||||
for i = 0; i < 16 && i < uint16(len(line)); i++ {
|
||||
if line[i] >= 32 && line[i] <= 126 {
|
||||
fmt.Fprintf(out, "%c", line[i])
|
||||
} else {
|
||||
out.WriteString(".")
|
||||
}
|
||||
}
|
||||
out.WriteString("\n")
|
||||
}
|
50
Godeps/_workspace/src/github.com/boltdb/bolt/Makefile
generated
vendored
50
Godeps/_workspace/src/github.com/boltdb/bolt/Makefile
generated
vendored
@@ -1,18 +1,54 @@
|
||||
TEST=.
|
||||
BENCH=.
|
||||
COVERPROFILE=/tmp/c.out
|
||||
BRANCH=`git rev-parse --abbrev-ref HEAD`
|
||||
COMMIT=`git rev-parse --short HEAD`
|
||||
GOLDFLAGS="-X main.branch $(BRANCH) -X main.commit $(COMMIT)"
|
||||
|
||||
default: build
|
||||
|
||||
race:
|
||||
@go test -v -race -test.run="TestSimulate_(100op|1000op)"
|
||||
bench:
|
||||
go test -v -test.run=NOTHINCONTAINSTHIS -test.bench=$(BENCH)
|
||||
|
||||
# http://cloc.sourceforge.net/
|
||||
cloc:
|
||||
@cloc --not-match-f='Makefile|_test.go' .
|
||||
|
||||
cover: fmt
|
||||
go test -coverprofile=$(COVERPROFILE) -test.run=$(TEST) $(COVERFLAG) .
|
||||
go tool cover -html=$(COVERPROFILE)
|
||||
rm $(COVERPROFILE)
|
||||
|
||||
cpuprofile: fmt
|
||||
@go test -c
|
||||
@./bolt.test -test.v -test.run=$(TEST) -test.cpuprofile cpu.prof
|
||||
|
||||
# go get github.com/kisielk/errcheck
|
||||
errcheck:
|
||||
@errcheck -ignorepkg=bytes -ignore=os:Remove github.com/boltdb/bolt
|
||||
@echo "=== errcheck ==="
|
||||
@errcheck github.com/boltdb/bolt
|
||||
|
||||
test:
|
||||
@go test -v -cover .
|
||||
@go test -v ./cmd/bolt
|
||||
fmt:
|
||||
@go fmt ./...
|
||||
|
||||
.PHONY: fmt test
|
||||
get:
|
||||
@go get -d ./...
|
||||
|
||||
build: get
|
||||
@mkdir -p bin
|
||||
@go build -ldflags=$(GOLDFLAGS) -a -o bin/bolt ./cmd/bolt
|
||||
|
||||
test: fmt
|
||||
@go get github.com/stretchr/testify/assert
|
||||
@echo "=== TESTS ==="
|
||||
@go test -v -cover -test.run=$(TEST)
|
||||
@echo ""
|
||||
@echo ""
|
||||
@echo "=== CLI ==="
|
||||
@go test -v -test.run=$(TEST) ./cmd/bolt
|
||||
@echo ""
|
||||
@echo ""
|
||||
@echo "=== RACE DETECTOR ==="
|
||||
@go test -v -race -test.run="TestSimulate_(100op|1000op)"
|
||||
|
||||
.PHONY: bench cloc cover cpuprofile fmt memprofile test
|
||||
|
269
Godeps/_workspace/src/github.com/boltdb/bolt/README.md
generated
vendored
269
Godeps/_workspace/src/github.com/boltdb/bolt/README.md
generated
vendored
@@ -1,8 +1,8 @@
|
||||
Bolt [](https://drone.io/github.com/boltdb/bolt/latest) [](https://coveralls.io/r/boltdb/bolt?branch=master) [](https://godoc.org/github.com/boltdb/bolt) 
|
||||
Bolt [](https://drone.io/github.com/boltdb/bolt/latest) [](https://coveralls.io/r/boltdb/bolt?branch=master) [](https://godoc.org/github.com/boltdb/bolt) 
|
||||
====
|
||||
|
||||
Bolt is a pure Go key/value store inspired by [Howard Chu's][hyc_symas]
|
||||
[LMDB project][lmdb]. The goal of the project is to provide a simple,
|
||||
Bolt is a pure Go key/value store inspired by [Howard Chu's][hyc_symas] and
|
||||
the [LMDB project][lmdb]. The goal of the project is to provide a simple,
|
||||
fast, and reliable database for projects that don't require a full database
|
||||
server such as Postgres or MySQL.
|
||||
|
||||
@@ -13,6 +13,7 @@ and setting values. That's it.
|
||||
[hyc_symas]: https://twitter.com/hyc_symas
|
||||
[lmdb]: http://symas.com/mdb/
|
||||
|
||||
|
||||
## Project Status
|
||||
|
||||
Bolt is stable and the API is fixed. Full unit test coverage and randomized
|
||||
@@ -21,36 +22,6 @@ Bolt is currently in high-load production environments serving databases as
|
||||
large as 1TB. Many companies such as Shopify and Heroku use Bolt-backed
|
||||
services every day.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Getting Started](#getting-started)
|
||||
- [Installing](#installing)
|
||||
- [Opening a database](#opening-a-database)
|
||||
- [Transactions](#transactions)
|
||||
- [Read-write transactions](#read-write-transactions)
|
||||
- [Read-only transactions](#read-only-transactions)
|
||||
- [Batch read-write transactions](#batch-read-write-transactions)
|
||||
- [Managing transactions manually](#managing-transactions-manually)
|
||||
- [Using buckets](#using-buckets)
|
||||
- [Using key/value pairs](#using-keyvalue-pairs)
|
||||
- [Autoincrementing integer for the bucket](#autoincrementing-integer-for-the-bucket)
|
||||
- [Iterating over keys](#iterating-over-keys)
|
||||
- [Prefix scans](#prefix-scans)
|
||||
- [Range scans](#range-scans)
|
||||
- [ForEach()](#foreach)
|
||||
- [Nested buckets](#nested-buckets)
|
||||
- [Database backups](#database-backups)
|
||||
- [Statistics](#statistics)
|
||||
- [Read-Only Mode](#read-only-mode)
|
||||
- [Mobile Use (iOS/Android)](#mobile-use-iosandroid)
|
||||
- [Resources](#resources)
|
||||
- [Comparison with other databases](#comparison-with-other-databases)
|
||||
- [Postgres, MySQL, & other relational databases](#postgres-mysql--other-relational-databases)
|
||||
- [LevelDB, RocksDB](#leveldb-rocksdb)
|
||||
- [LMDB](#lmdb)
|
||||
- [Caveats & Limitations](#caveats--limitations)
|
||||
- [Reading the Source](#reading-the-source)
|
||||
- [Other Projects Using Bolt](#other-projects-using-bolt)
|
||||
|
||||
## Getting Started
|
||||
|
||||
@@ -209,8 +180,8 @@ and then safely close your transaction if an error is returned. This is the
|
||||
recommended way to use Bolt transactions.
|
||||
|
||||
However, sometimes you may want to manually start and end your transactions.
|
||||
You can use the `Tx.Begin()` function directly but **please** be sure to close
|
||||
the transaction.
|
||||
You can use the `Tx.Begin()` function directly but _please_ be sure to close the
|
||||
transaction.
|
||||
|
||||
```go
|
||||
// Start a writable transaction.
|
||||
@@ -285,7 +256,7 @@ db.View(func(tx *bolt.Tx) error {
|
||||
```
|
||||
|
||||
The `Get()` function does not return an error because its operation is
|
||||
guaranteed to work (unless there is some kind of system failure). If the key
|
||||
guarenteed to work (unless there is some kind of system failure). If the key
|
||||
exists then it will return its byte slice value. If it doesn't exist then it
|
||||
will return `nil`. It's important to note that you can have a zero-length value
|
||||
set to a key which is different than the key not existing.
|
||||
@@ -297,49 +268,6 @@ transaction is open. If you need to use a value outside of the transaction
|
||||
then you must use `copy()` to copy it to another byte slice.
|
||||
|
||||
|
||||
### Autoincrementing integer for the bucket
|
||||
By using the `NextSequence()` function, you can let Bolt determine a sequence
|
||||
which can be used as the unique identifier for your key/value pairs. See the
|
||||
example below.
|
||||
|
||||
```go
|
||||
// CreateUser saves u to the store. The new user ID is set on u once the data is persisted.
|
||||
func (s *Store) CreateUser(u *User) error {
|
||||
return s.db.Update(func(tx *bolt.Tx) error {
|
||||
// Retrieve the users bucket.
|
||||
// This should be created when the DB is first opened.
|
||||
b := tx.Bucket([]byte("users"))
|
||||
|
||||
// Generate ID for the user.
|
||||
// This returns an error only if the Tx is closed or not writeable.
|
||||
// That can't happen in an Update() call so I ignore the error check.
|
||||
id, _ = b.NextSequence()
|
||||
u.ID = int(id)
|
||||
|
||||
// Marshal user data into bytes.
|
||||
buf, err := json.Marshal(u)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
// Persist bytes to users bucket.
|
||||
return b.Put(itob(u.ID), buf)
|
||||
})
|
||||
}
|
||||
|
||||
// itob returns an 8-byte big endian representation of v.
|
||||
func itob(v int) []byte {
|
||||
b := make([]byte, 8)
|
||||
binary.BigEndian.PutUint64(b, uint64(v))
|
||||
return b
|
||||
}
|
||||
|
||||
type User struct {
|
||||
ID int
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
### Iterating over keys
|
||||
|
||||
Bolt stores its keys in byte-sorted order within a bucket. This makes sequential
|
||||
@@ -348,9 +276,7 @@ iteration over these keys extremely fast. To iterate over keys we'll use a
|
||||
|
||||
```go
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
// Assume bucket exists and has keys
|
||||
b := tx.Bucket([]byte("MyBucket"))
|
||||
|
||||
c := b.Cursor()
|
||||
|
||||
for k, v := c.First(); k != nil; k, v = c.Next() {
|
||||
@@ -374,15 +300,10 @@ Next() Move to the next key.
|
||||
Prev() Move to the previous key.
|
||||
```
|
||||
|
||||
Each of those functions has a return signature of `(key []byte, value []byte)`.
|
||||
When you have iterated to the end of the cursor then `Next()` will return a
|
||||
`nil` key. You must seek to a position using `First()`, `Last()`, or `Seek()`
|
||||
before calling `Next()` or `Prev()`. If you do not seek to a position then
|
||||
these functions will return a `nil` key.
|
||||
|
||||
During iteration, if the key is non-`nil` but the value is `nil`, that means
|
||||
the key refers to a bucket rather than a value. Use `Bucket.Bucket()` to
|
||||
access the sub-bucket.
|
||||
When you have iterated to the end of the cursor then `Next()` will return `nil`.
|
||||
You must seek to a position using `First()`, `Last()`, or `Seek()` before
|
||||
calling `Next()` or `Prev()`. If you do not seek to a position then these
|
||||
functions will return `nil`.
|
||||
|
||||
|
||||
#### Prefix scans
|
||||
@@ -391,7 +312,6 @@ To iterate over a key prefix, you can combine `Seek()` and `bytes.HasPrefix()`:
|
||||
|
||||
```go
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
// Assume bucket exists and has keys
|
||||
c := tx.Bucket([]byte("MyBucket")).Cursor()
|
||||
|
||||
prefix := []byte("1234")
|
||||
@@ -411,7 +331,7 @@ date range like this:
|
||||
|
||||
```go
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
// Assume our events bucket exists and has RFC3339 encoded time keys.
|
||||
// Assume our events bucket has RFC3339 encoded time keys.
|
||||
c := tx.Bucket([]byte("Events")).Cursor()
|
||||
|
||||
// Our time range spans the 90's decade.
|
||||
@@ -435,9 +355,7 @@ all the keys in a bucket:
|
||||
|
||||
```go
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
// Assume bucket exists and has keys
|
||||
b := tx.Bucket([]byte("MyBucket"))
|
||||
|
||||
b.ForEach(func(k, v []byte) error {
|
||||
fmt.Printf("key=%s, value=%s\n", k, v)
|
||||
return nil
|
||||
@@ -464,11 +382,8 @@ func (*Bucket) DeleteBucket(key []byte) error
|
||||
Bolt is a single file so it's easy to backup. You can use the `Tx.WriteTo()`
|
||||
function to write a consistent view of the database to a writer. If you call
|
||||
this from a read-only transaction, it will perform a hot backup and not block
|
||||
your other database reads and writes.
|
||||
|
||||
By default, it will use a regular file handle which will utilize the operating
|
||||
system's page cache. See the [`Tx`](https://godoc.org/github.com/boltdb/bolt#Tx)
|
||||
documentation for information about optimizing for larger-than-RAM datasets.
|
||||
your other database reads and writes. It will also use `O_DIRECT` when available
|
||||
to prevent page cache trashing.
|
||||
|
||||
One common use case is to backup over HTTP so you can use tools like `cURL` to
|
||||
do database backups:
|
||||
@@ -550,84 +465,6 @@ if err != nil {
|
||||
}
|
||||
```
|
||||
|
||||
### Mobile Use (iOS/Android)
|
||||
|
||||
Bolt is able to run on mobile devices by leveraging the binding feature of the
|
||||
[gomobile](https://github.com/golang/mobile) tool. Create a struct that will
|
||||
contain your database logic and a reference to a `*bolt.DB` with a initializing
|
||||
contstructor that takes in a filepath where the database file will be stored.
|
||||
Neither Android nor iOS require extra permissions or cleanup from using this method.
|
||||
|
||||
```go
|
||||
func NewBoltDB(filepath string) *BoltDB {
|
||||
db, err := bolt.Open(filepath+"/demo.db", 0600, nil)
|
||||
if err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
|
||||
return &BoltDB{db}
|
||||
}
|
||||
|
||||
type BoltDB struct {
|
||||
db *bolt.DB
|
||||
...
|
||||
}
|
||||
|
||||
func (b *BoltDB) Path() string {
|
||||
return b.db.Path()
|
||||
}
|
||||
|
||||
func (b *BoltDB) Close() {
|
||||
b.db.Close()
|
||||
}
|
||||
```
|
||||
|
||||
Database logic should be defined as methods on this wrapper struct.
|
||||
|
||||
To initialize this struct from the native language (both platforms now sync
|
||||
their local storage to the cloud. These snippets disable that functionality for the
|
||||
database file):
|
||||
|
||||
#### Android
|
||||
|
||||
```java
|
||||
String path;
|
||||
if (android.os.Build.VERSION.SDK_INT >=android.os.Build.VERSION_CODES.LOLLIPOP){
|
||||
path = getNoBackupFilesDir().getAbsolutePath();
|
||||
} else{
|
||||
path = getFilesDir().getAbsolutePath();
|
||||
}
|
||||
Boltmobiledemo.BoltDB boltDB = Boltmobiledemo.NewBoltDB(path)
|
||||
```
|
||||
|
||||
#### iOS
|
||||
|
||||
```objc
|
||||
- (void)demo {
|
||||
NSString* path = [NSSearchPathForDirectoriesInDomains(NSLibraryDirectory,
|
||||
NSUserDomainMask,
|
||||
YES) objectAtIndex:0];
|
||||
GoBoltmobiledemoBoltDB * demo = GoBoltmobiledemoNewBoltDB(path);
|
||||
[self addSkipBackupAttributeToItemAtPath:demo.path];
|
||||
//Some DB Logic would go here
|
||||
[demo close];
|
||||
}
|
||||
|
||||
- (BOOL)addSkipBackupAttributeToItemAtPath:(NSString *) filePathString
|
||||
{
|
||||
NSURL* URL= [NSURL fileURLWithPath: filePathString];
|
||||
assert([[NSFileManager defaultManager] fileExistsAtPath: [URL path]]);
|
||||
|
||||
NSError *error = nil;
|
||||
BOOL success = [URL setResourceValue: [NSNumber numberWithBool: YES]
|
||||
forKey: NSURLIsExcludedFromBackupKey error: &error];
|
||||
if(!success){
|
||||
NSLog(@"Error excluding %@ from backup %@", [URL lastPathComponent], error);
|
||||
}
|
||||
return success;
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
## Resources
|
||||
|
||||
@@ -663,7 +500,7 @@ they are libraries bundled into the application, however, their underlying
|
||||
structure is a log-structured merge-tree (LSM tree). An LSM tree optimizes
|
||||
random writes by using a write ahead log and multi-tiered, sorted files called
|
||||
SSTables. Bolt uses a B+tree internally and only a single file. Both approaches
|
||||
have trade-offs.
|
||||
have trade offs.
|
||||
|
||||
If you require a high random write throughput (>10,000 w/sec) or you need to use
|
||||
spinning disks then LevelDB could be a good choice. If your application is
|
||||
@@ -699,8 +536,9 @@ It's important to pick the right tool for the job and Bolt is no exception.
|
||||
Here are a few things to note when evaluating and using Bolt:
|
||||
|
||||
* Bolt is good for read intensive workloads. Sequential write performance is
|
||||
also fast but random writes can be slow. You can use `DB.Batch()` or add a
|
||||
write-ahead log to help mitigate this issue.
|
||||
also fast but random writes can be slow. You can add a write-ahead log or
|
||||
[transaction coalescer](https://github.com/boltdb/coalescer) in front of Bolt
|
||||
to mitigate this issue.
|
||||
|
||||
* Bolt uses a B+tree internally so there can be a lot of random page access.
|
||||
SSDs provide a significant performance boost over spinning disks.
|
||||
@@ -730,13 +568,11 @@ Here are a few things to note when evaluating and using Bolt:
|
||||
can in memory and will release memory as needed to other processes. This means
|
||||
that Bolt can show very high memory usage when working with large databases.
|
||||
However, this is expected and the OS will release memory as needed. Bolt can
|
||||
handle databases much larger than the available physical RAM, provided its
|
||||
memory-map fits in the process virtual address space. It may be problematic
|
||||
on 32-bits systems.
|
||||
handle databases much larger than the available physical RAM.
|
||||
|
||||
* The data structures in the Bolt database are memory mapped so the data file
|
||||
will be endian specific. This means that you cannot copy a Bolt file from a
|
||||
little endian machine to a big endian machine and have it work. For most
|
||||
little endian machine to a big endian machine and have it work. For most
|
||||
users this is not a concern since most modern CPUs are little endian.
|
||||
|
||||
* Because of the way pages are laid out on disk, Bolt cannot truncate data files
|
||||
@@ -751,56 +587,6 @@ Here are a few things to note when evaluating and using Bolt:
|
||||
[page-allocation]: https://github.com/boltdb/bolt/issues/308#issuecomment-74811638
|
||||
|
||||
|
||||
## Reading the Source
|
||||
|
||||
Bolt is a relatively small code base (<3KLOC) for an embedded, serializable,
|
||||
transactional key/value database so it can be a good starting point for people
|
||||
interested in how databases work.
|
||||
|
||||
The best places to start are the main entry points into Bolt:
|
||||
|
||||
- `Open()` - Initializes the reference to the database. It's responsible for
|
||||
creating the database if it doesn't exist, obtaining an exclusive lock on the
|
||||
file, reading the meta pages, & memory-mapping the file.
|
||||
|
||||
- `DB.Begin()` - Starts a read-only or read-write transaction depending on the
|
||||
value of the `writable` argument. This requires briefly obtaining the "meta"
|
||||
lock to keep track of open transactions. Only one read-write transaction can
|
||||
exist at a time so the "rwlock" is acquired during the life of a read-write
|
||||
transaction.
|
||||
|
||||
- `Bucket.Put()` - Writes a key/value pair into a bucket. After validating the
|
||||
arguments, a cursor is used to traverse the B+tree to the page and position
|
||||
where they key & value will be written. Once the position is found, the bucket
|
||||
materializes the underlying page and the page's parent pages into memory as
|
||||
"nodes". These nodes are where mutations occur during read-write transactions.
|
||||
These changes get flushed to disk during commit.
|
||||
|
||||
- `Bucket.Get()` - Retrieves a key/value pair from a bucket. This uses a cursor
|
||||
to move to the page & position of a key/value pair. During a read-only
|
||||
transaction, the key and value data is returned as a direct reference to the
|
||||
underlying mmap file so there's no allocation overhead. For read-write
|
||||
transactions, this data may reference the mmap file or one of the in-memory
|
||||
node values.
|
||||
|
||||
- `Cursor` - This object is simply for traversing the B+tree of on-disk pages
|
||||
or in-memory nodes. It can seek to a specific key, move to the first or last
|
||||
value, or it can move forward or backward. The cursor handles the movement up
|
||||
and down the B+tree transparently to the end user.
|
||||
|
||||
- `Tx.Commit()` - Converts the in-memory dirty nodes and the list of free pages
|
||||
into pages to be written to disk. Writing to disk then occurs in two phases.
|
||||
First, the dirty pages are written to disk and an `fsync()` occurs. Second, a
|
||||
new meta page with an incremented transaction ID is written and another
|
||||
`fsync()` occurs. This two phase write ensures that partially written data
|
||||
pages are ignored in the event of a crash since the meta page pointing to them
|
||||
is never written. Partially written meta pages are invalidated because they
|
||||
are written with a checksum.
|
||||
|
||||
If you have additional notes that could be helpful for others, please submit
|
||||
them via pull request.
|
||||
|
||||
|
||||
## Other Projects Using Bolt
|
||||
|
||||
Below is a list of public, open source projects that use Bolt:
|
||||
@@ -811,30 +597,25 @@ Below is a list of public, open source projects that use Bolt:
|
||||
* [Skybox Analytics](https://github.com/skybox/skybox) - A standalone funnel analysis tool for web analytics.
|
||||
* [Scuttlebutt](https://github.com/benbjohnson/scuttlebutt) - Uses Bolt to store and process all Twitter mentions of GitHub projects.
|
||||
* [Wiki](https://github.com/peterhellberg/wiki) - A tiny wiki using Goji, BoltDB and Blackfriday.
|
||||
* [ChainStore](https://github.com/pressly/chainstore) - Simple key-value interface to a variety of storage engines organized as a chain of operations.
|
||||
* [ChainStore](https://github.com/nulayer/chainstore) - Simple key-value interface to a variety of storage engines organized as a chain of operations.
|
||||
* [MetricBase](https://github.com/msiebuhr/MetricBase) - Single-binary version of Graphite.
|
||||
* [Gitchain](https://github.com/gitchain/gitchain) - Decentralized, peer-to-peer Git repositories aka "Git meets Bitcoin".
|
||||
* [event-shuttle](https://github.com/sclasen/event-shuttle) - A Unix system service to collect and reliably deliver messages to Kafka.
|
||||
* [ipxed](https://github.com/kelseyhightower/ipxed) - Web interface and api for ipxed.
|
||||
* [BoltStore](https://github.com/yosssi/boltstore) - Session store using Bolt.
|
||||
* [photosite/session](https://godoc.org/bitbucket.org/kardianos/photosite/session) - Sessions for a photo viewing site.
|
||||
* [photosite/session](http://godoc.org/bitbucket.org/kardianos/photosite/session) - Sessions for a photo viewing site.
|
||||
* [LedisDB](https://github.com/siddontang/ledisdb) - A high performance NoSQL, using Bolt as optional storage.
|
||||
* [ipLocator](https://github.com/AndreasBriese/ipLocator) - A fast ip-geo-location-server using bolt with bloom filters.
|
||||
* [cayley](https://github.com/google/cayley) - Cayley is an open-source graph database using Bolt as optional backend.
|
||||
* [bleve](http://www.blevesearch.com/) - A pure Go search engine similar to ElasticSearch that uses Bolt as the default storage backend.
|
||||
* [tentacool](https://github.com/optiflows/tentacool) - REST api server to manage system stuff (IP, DNS, Gateway...) on a linux server.
|
||||
* [SkyDB](https://github.com/skydb/sky) - Behavioral analytics database.
|
||||
* [Seaweed File System](https://github.com/chrislusf/seaweedfs) - Highly scalable distributed key~file system with O(1) disk read.
|
||||
* [InfluxDB](https://influxdata.com) - Scalable datastore for metrics, events, and real-time analytics.
|
||||
* [Seaweed File System](https://github.com/chrislusf/weed-fs) - Highly scalable distributed key~file system with O(1) disk read.
|
||||
* [InfluxDB](http://influxdb.com) - Scalable datastore for metrics, events, and real-time analytics.
|
||||
* [Freehold](http://tshannon.bitbucket.org/freehold/) - An open, secure, and lightweight platform for your files and data.
|
||||
* [Prometheus Annotation Server](https://github.com/oliver006/prom_annotation_server) - Annotation server for PromDash & Prometheus service monitoring system.
|
||||
* [Consul](https://github.com/hashicorp/consul) - Consul is service discovery and configuration made easy. Distributed, highly available, and datacenter-aware.
|
||||
* [Kala](https://github.com/ajvb/kala) - Kala is a modern job scheduler optimized to run on a single node. It is persistent, JSON over HTTP API, ISO 8601 duration notation, and dependent jobs.
|
||||
* [Kala](https://github.com/ajvb/kala) - Kala is a modern job scheduler optimized to run on a single node. It is persistant, JSON over HTTP API, ISO 8601 duration notation, and dependent jobs.
|
||||
* [drive](https://github.com/odeke-em/drive) - drive is an unofficial Google Drive command line client for \*NIX operating systems.
|
||||
* [stow](https://github.com/djherbis/stow) - a persistence manager for objects
|
||||
backed by boltdb.
|
||||
* [buckets](https://github.com/joyrexus/buckets) - a bolt wrapper streamlining
|
||||
simple tx and key scans.
|
||||
* [Request Baskets](https://github.com/darklynx/request-baskets) - A web service to collect arbitrary HTTP requests and inspect them via REST API or simple web UI, similar to [RequestBin](http://requestb.in/) service
|
||||
|
||||
If you are using Bolt in a project please send a pull request to add it to the list.
|
||||
|
138
Godeps/_workspace/src/github.com/boltdb/bolt/batch.go
generated
vendored
Normal file
138
Godeps/_workspace/src/github.com/boltdb/bolt/batch.go
generated
vendored
Normal file
@@ -0,0 +1,138 @@
|
||||
package bolt
|
||||
|
||||
import (
|
||||
"errors"
|
||||
"fmt"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
// Batch calls fn as part of a batch. It behaves similar to Update,
|
||||
// except:
|
||||
//
|
||||
// 1. concurrent Batch calls can be combined into a single Bolt
|
||||
// transaction.
|
||||
//
|
||||
// 2. the function passed to Batch may be called multiple times,
|
||||
// regardless of whether it returns error or not.
|
||||
//
|
||||
// This means that Batch function side effects must be idempotent and
|
||||
// take permanent effect only after a successful return is seen in
|
||||
// caller.
|
||||
//
|
||||
// The maximum batch size and delay can be adjusted with DB.MaxBatchSize
|
||||
// and DB.MaxBatchDelay, respectively.
|
||||
//
|
||||
// Batch is only useful when there are multiple goroutines calling it.
|
||||
func (db *DB) Batch(fn func(*Tx) error) error {
|
||||
errCh := make(chan error, 1)
|
||||
|
||||
db.batchMu.Lock()
|
||||
if (db.batch == nil) || (db.batch != nil && len(db.batch.calls) >= db.MaxBatchSize) {
|
||||
// There is no existing batch, or the existing batch is full; start a new one.
|
||||
db.batch = &batch{
|
||||
db: db,
|
||||
}
|
||||
db.batch.timer = time.AfterFunc(db.MaxBatchDelay, db.batch.trigger)
|
||||
}
|
||||
db.batch.calls = append(db.batch.calls, call{fn: fn, err: errCh})
|
||||
if len(db.batch.calls) >= db.MaxBatchSize {
|
||||
// wake up batch, it's ready to run
|
||||
go db.batch.trigger()
|
||||
}
|
||||
db.batchMu.Unlock()
|
||||
|
||||
err := <-errCh
|
||||
if err == trySolo {
|
||||
err = db.Update(fn)
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
type call struct {
|
||||
fn func(*Tx) error
|
||||
err chan<- error
|
||||
}
|
||||
|
||||
type batch struct {
|
||||
db *DB
|
||||
timer *time.Timer
|
||||
start sync.Once
|
||||
calls []call
|
||||
}
|
||||
|
||||
// trigger runs the batch if it hasn't already been run.
|
||||
func (b *batch) trigger() {
|
||||
b.start.Do(b.run)
|
||||
}
|
||||
|
||||
// run performs the transactions in the batch and communicates results
|
||||
// back to DB.Batch.
|
||||
func (b *batch) run() {
|
||||
b.db.batchMu.Lock()
|
||||
b.timer.Stop()
|
||||
// Make sure no new work is added to this batch, but don't break
|
||||
// other batches.
|
||||
if b.db.batch == b {
|
||||
b.db.batch = nil
|
||||
}
|
||||
b.db.batchMu.Unlock()
|
||||
|
||||
retry:
|
||||
for len(b.calls) > 0 {
|
||||
var failIdx = -1
|
||||
err := b.db.Update(func(tx *Tx) error {
|
||||
for i, c := range b.calls {
|
||||
if err := safelyCall(c.fn, tx); err != nil {
|
||||
failIdx = i
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
})
|
||||
|
||||
if failIdx >= 0 {
|
||||
// take the failing transaction out of the batch. it's
|
||||
// safe to shorten b.calls here because db.batch no longer
|
||||
// points to us, and we hold the mutex anyway.
|
||||
c := b.calls[failIdx]
|
||||
b.calls[failIdx], b.calls = b.calls[len(b.calls)-1], b.calls[:len(b.calls)-1]
|
||||
// tell the submitter re-run it solo, continue with the rest of the batch
|
||||
c.err <- trySolo
|
||||
continue retry
|
||||
}
|
||||
|
||||
// pass success, or bolt internal errors, to all callers
|
||||
for _, c := range b.calls {
|
||||
if c.err != nil {
|
||||
c.err <- err
|
||||
}
|
||||
}
|
||||
break retry
|
||||
}
|
||||
}
|
||||
|
||||
// trySolo is a special sentinel error value used for signaling that a
|
||||
// transaction function should be re-run. It should never be seen by
|
||||
// callers.
|
||||
var trySolo = errors.New("batch function returned an error and should be re-run solo")
|
||||
|
||||
type panicked struct {
|
||||
reason interface{}
|
||||
}
|
||||
|
||||
func (p panicked) Error() string {
|
||||
if err, ok := p.reason.(error); ok {
|
||||
return err.Error()
|
||||
}
|
||||
return fmt.Sprintf("panic: %v", p.reason)
|
||||
}
|
||||
|
||||
func safelyCall(fn func(*Tx) error, tx *Tx) (err error) {
|
||||
defer func() {
|
||||
if p := recover(); p != nil {
|
||||
err = panicked{p}
|
||||
}
|
||||
}()
|
||||
return fn(tx)
|
||||
}
|
170
Godeps/_workspace/src/github.com/boltdb/bolt/batch_benchmark_test.go
generated
vendored
Normal file
170
Godeps/_workspace/src/github.com/boltdb/bolt/batch_benchmark_test.go
generated
vendored
Normal file
@@ -0,0 +1,170 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/binary"
|
||||
"errors"
|
||||
"hash/fnv"
|
||||
"sync"
|
||||
"testing"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt"
|
||||
)
|
||||
|
||||
func validateBatchBench(b *testing.B, db *TestDB) {
|
||||
var rollback = errors.New("sentinel error to cause rollback")
|
||||
validate := func(tx *bolt.Tx) error {
|
||||
bucket := tx.Bucket([]byte("bench"))
|
||||
h := fnv.New32a()
|
||||
buf := make([]byte, 4)
|
||||
for id := uint32(0); id < 1000; id++ {
|
||||
binary.LittleEndian.PutUint32(buf, id)
|
||||
h.Reset()
|
||||
h.Write(buf[:])
|
||||
k := h.Sum(nil)
|
||||
v := bucket.Get(k)
|
||||
if v == nil {
|
||||
b.Errorf("not found id=%d key=%x", id, k)
|
||||
continue
|
||||
}
|
||||
if g, e := v, []byte("filler"); !bytes.Equal(g, e) {
|
||||
b.Errorf("bad value for id=%d key=%x: %s != %q", id, k, g, e)
|
||||
}
|
||||
if err := bucket.Delete(k); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
// should be empty now
|
||||
c := bucket.Cursor()
|
||||
for k, v := c.First(); k != nil; k, v = c.Next() {
|
||||
b.Errorf("unexpected key: %x = %q", k, v)
|
||||
}
|
||||
return rollback
|
||||
}
|
||||
if err := db.Update(validate); err != nil && err != rollback {
|
||||
b.Error(err)
|
||||
}
|
||||
}
|
||||
|
||||
func BenchmarkDBBatchAutomatic(b *testing.B) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.MustCreateBucket([]byte("bench"))
|
||||
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
start := make(chan struct{})
|
||||
var wg sync.WaitGroup
|
||||
|
||||
for round := 0; round < 1000; round++ {
|
||||
wg.Add(1)
|
||||
|
||||
go func(id uint32) {
|
||||
defer wg.Done()
|
||||
<-start
|
||||
|
||||
h := fnv.New32a()
|
||||
buf := make([]byte, 4)
|
||||
binary.LittleEndian.PutUint32(buf, id)
|
||||
h.Write(buf[:])
|
||||
k := h.Sum(nil)
|
||||
insert := func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("bench"))
|
||||
return b.Put(k, []byte("filler"))
|
||||
}
|
||||
if err := db.Batch(insert); err != nil {
|
||||
b.Error(err)
|
||||
return
|
||||
}
|
||||
}(uint32(round))
|
||||
}
|
||||
close(start)
|
||||
wg.Wait()
|
||||
}
|
||||
|
||||
b.StopTimer()
|
||||
validateBatchBench(b, db)
|
||||
}
|
||||
|
||||
func BenchmarkDBBatchSingle(b *testing.B) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.MustCreateBucket([]byte("bench"))
|
||||
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
start := make(chan struct{})
|
||||
var wg sync.WaitGroup
|
||||
|
||||
for round := 0; round < 1000; round++ {
|
||||
wg.Add(1)
|
||||
go func(id uint32) {
|
||||
defer wg.Done()
|
||||
<-start
|
||||
|
||||
h := fnv.New32a()
|
||||
buf := make([]byte, 4)
|
||||
binary.LittleEndian.PutUint32(buf, id)
|
||||
h.Write(buf[:])
|
||||
k := h.Sum(nil)
|
||||
insert := func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("bench"))
|
||||
return b.Put(k, []byte("filler"))
|
||||
}
|
||||
if err := db.Update(insert); err != nil {
|
||||
b.Error(err)
|
||||
return
|
||||
}
|
||||
}(uint32(round))
|
||||
}
|
||||
close(start)
|
||||
wg.Wait()
|
||||
}
|
||||
|
||||
b.StopTimer()
|
||||
validateBatchBench(b, db)
|
||||
}
|
||||
|
||||
func BenchmarkDBBatchManual10x100(b *testing.B) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.MustCreateBucket([]byte("bench"))
|
||||
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
start := make(chan struct{})
|
||||
var wg sync.WaitGroup
|
||||
|
||||
for major := 0; major < 10; major++ {
|
||||
wg.Add(1)
|
||||
go func(id uint32) {
|
||||
defer wg.Done()
|
||||
<-start
|
||||
|
||||
insert100 := func(tx *bolt.Tx) error {
|
||||
h := fnv.New32a()
|
||||
buf := make([]byte, 4)
|
||||
for minor := uint32(0); minor < 100; minor++ {
|
||||
binary.LittleEndian.PutUint32(buf, uint32(id*100+minor))
|
||||
h.Reset()
|
||||
h.Write(buf[:])
|
||||
k := h.Sum(nil)
|
||||
b := tx.Bucket([]byte("bench"))
|
||||
if err := b.Put(k, []byte("filler")); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
if err := db.Update(insert100); err != nil {
|
||||
b.Fatal(err)
|
||||
}
|
||||
}(uint32(major))
|
||||
}
|
||||
close(start)
|
||||
wg.Wait()
|
||||
}
|
||||
|
||||
b.StopTimer()
|
||||
validateBatchBench(b, db)
|
||||
}
|
148
Godeps/_workspace/src/github.com/boltdb/bolt/batch_example_test.go
generated
vendored
Normal file
148
Godeps/_workspace/src/github.com/boltdb/bolt/batch_example_test.go
generated
vendored
Normal file
@@ -0,0 +1,148 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"encoding/binary"
|
||||
"fmt"
|
||||
"io/ioutil"
|
||||
"log"
|
||||
"math/rand"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"os"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt"
|
||||
)
|
||||
|
||||
// Set this to see how the counts are actually updated.
|
||||
const verbose = false
|
||||
|
||||
// Counter updates a counter in Bolt for every URL path requested.
|
||||
type counter struct {
|
||||
db *bolt.DB
|
||||
}
|
||||
|
||||
func (c counter) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
|
||||
// Communicates the new count from a successful database
|
||||
// transaction.
|
||||
var result uint64
|
||||
|
||||
increment := func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucketIfNotExists([]byte("hits"))
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
key := []byte(req.URL.String())
|
||||
// Decode handles key not found for us.
|
||||
count := decode(b.Get(key)) + 1
|
||||
b.Put(key, encode(count))
|
||||
// All good, communicate new count.
|
||||
result = count
|
||||
return nil
|
||||
}
|
||||
if err := c.db.Batch(increment); err != nil {
|
||||
http.Error(rw, err.Error(), 500)
|
||||
return
|
||||
}
|
||||
|
||||
if verbose {
|
||||
log.Printf("server: %s: %d", req.URL.String(), result)
|
||||
}
|
||||
|
||||
rw.Header().Set("Content-Type", "application/octet-stream")
|
||||
fmt.Fprintf(rw, "%d\n", result)
|
||||
}
|
||||
|
||||
func client(id int, base string, paths []string) error {
|
||||
// Process paths in random order.
|
||||
rng := rand.New(rand.NewSource(int64(id)))
|
||||
permutation := rng.Perm(len(paths))
|
||||
|
||||
for i := range paths {
|
||||
path := paths[permutation[i]]
|
||||
resp, err := http.Get(base + path)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
buf, err := ioutil.ReadAll(resp.Body)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if verbose {
|
||||
log.Printf("client: %s: %s", path, buf)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func ExampleDB_Batch() {
|
||||
// Open the database.
|
||||
db, _ := bolt.Open(tempfile(), 0666, nil)
|
||||
defer os.Remove(db.Path())
|
||||
defer db.Close()
|
||||
|
||||
// Start our web server
|
||||
count := counter{db}
|
||||
srv := httptest.NewServer(count)
|
||||
defer srv.Close()
|
||||
|
||||
// Decrease the batch size to make things more interesting.
|
||||
db.MaxBatchSize = 3
|
||||
|
||||
// Get every path multiple times concurrently.
|
||||
const clients = 10
|
||||
paths := []string{
|
||||
"/foo",
|
||||
"/bar",
|
||||
"/baz",
|
||||
"/quux",
|
||||
"/thud",
|
||||
"/xyzzy",
|
||||
}
|
||||
errors := make(chan error, clients)
|
||||
for i := 0; i < clients; i++ {
|
||||
go func(id int) {
|
||||
errors <- client(id, srv.URL, paths)
|
||||
}(i)
|
||||
}
|
||||
// Check all responses to make sure there's no error.
|
||||
for i := 0; i < clients; i++ {
|
||||
if err := <-errors; err != nil {
|
||||
fmt.Printf("client error: %v", err)
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
// Check the final result
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("hits"))
|
||||
c := b.Cursor()
|
||||
for k, v := c.First(); k != nil; k, v = c.Next() {
|
||||
fmt.Printf("hits to %s: %d\n", k, decode(v))
|
||||
}
|
||||
return nil
|
||||
})
|
||||
|
||||
// Output:
|
||||
// hits to /bar: 10
|
||||
// hits to /baz: 10
|
||||
// hits to /foo: 10
|
||||
// hits to /quux: 10
|
||||
// hits to /thud: 10
|
||||
// hits to /xyzzy: 10
|
||||
}
|
||||
|
||||
// encode marshals a counter.
|
||||
func encode(n uint64) []byte {
|
||||
buf := make([]byte, 8)
|
||||
binary.BigEndian.PutUint64(buf, n)
|
||||
return buf
|
||||
}
|
||||
|
||||
// decode unmarshals a counter. Nil buffers are decoded as 0.
|
||||
func decode(buf []byte) uint64 {
|
||||
if buf == nil {
|
||||
return 0
|
||||
}
|
||||
return binary.BigEndian.Uint64(buf)
|
||||
}
|
167
Godeps/_workspace/src/github.com/boltdb/bolt/batch_test.go
generated
vendored
Normal file
167
Godeps/_workspace/src/github.com/boltdb/bolt/batch_test.go
generated
vendored
Normal file
@@ -0,0 +1,167 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt"
|
||||
)
|
||||
|
||||
// Ensure two functions can perform updates in a single batch.
|
||||
func TestDB_Batch(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.MustCreateBucket([]byte("widgets"))
|
||||
|
||||
// Iterate over multiple updates in separate goroutines.
|
||||
n := 2
|
||||
ch := make(chan error)
|
||||
for i := 0; i < n; i++ {
|
||||
go func(i int) {
|
||||
ch <- db.Batch(func(tx *bolt.Tx) error {
|
||||
return tx.Bucket([]byte("widgets")).Put(u64tob(uint64(i)), []byte{})
|
||||
})
|
||||
}(i)
|
||||
}
|
||||
|
||||
// Check all responses to make sure there's no error.
|
||||
for i := 0; i < n; i++ {
|
||||
if err := <-ch; err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure data is correct.
|
||||
db.MustView(func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
for i := 0; i < n; i++ {
|
||||
if v := b.Get(u64tob(uint64(i))); v == nil {
|
||||
t.Errorf("key not found: %d", i)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
func TestDB_Batch_Panic(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
var sentinel int
|
||||
var bork = &sentinel
|
||||
var problem interface{}
|
||||
var err error
|
||||
|
||||
// Execute a function inside a batch that panics.
|
||||
func() {
|
||||
defer func() {
|
||||
if p := recover(); p != nil {
|
||||
problem = p
|
||||
}
|
||||
}()
|
||||
err = db.Batch(func(tx *bolt.Tx) error {
|
||||
panic(bork)
|
||||
})
|
||||
}()
|
||||
|
||||
// Verify there is no error.
|
||||
if g, e := err, error(nil); g != e {
|
||||
t.Fatalf("wrong error: %v != %v", g, e)
|
||||
}
|
||||
// Verify the panic was captured.
|
||||
if g, e := problem, bork; g != e {
|
||||
t.Fatalf("wrong error: %v != %v", g, e)
|
||||
}
|
||||
}
|
||||
|
||||
func TestDB_BatchFull(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.MustCreateBucket([]byte("widgets"))
|
||||
|
||||
const size = 3
|
||||
// buffered so we never leak goroutines
|
||||
ch := make(chan error, size)
|
||||
put := func(i int) {
|
||||
ch <- db.Batch(func(tx *bolt.Tx) error {
|
||||
return tx.Bucket([]byte("widgets")).Put(u64tob(uint64(i)), []byte{})
|
||||
})
|
||||
}
|
||||
|
||||
db.MaxBatchSize = size
|
||||
// high enough to never trigger here
|
||||
db.MaxBatchDelay = 1 * time.Hour
|
||||
|
||||
go put(1)
|
||||
go put(2)
|
||||
|
||||
// Give the batch a chance to exhibit bugs.
|
||||
time.Sleep(10 * time.Millisecond)
|
||||
|
||||
// not triggered yet
|
||||
select {
|
||||
case <-ch:
|
||||
t.Fatalf("batch triggered too early")
|
||||
default:
|
||||
}
|
||||
|
||||
go put(3)
|
||||
|
||||
// Check all responses to make sure there's no error.
|
||||
for i := 0; i < size; i++ {
|
||||
if err := <-ch; err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure data is correct.
|
||||
db.MustView(func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
for i := 1; i <= size; i++ {
|
||||
if v := b.Get(u64tob(uint64(i))); v == nil {
|
||||
t.Errorf("key not found: %d", i)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
func TestDB_BatchTime(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.MustCreateBucket([]byte("widgets"))
|
||||
|
||||
const size = 1
|
||||
// buffered so we never leak goroutines
|
||||
ch := make(chan error, size)
|
||||
put := func(i int) {
|
||||
ch <- db.Batch(func(tx *bolt.Tx) error {
|
||||
return tx.Bucket([]byte("widgets")).Put(u64tob(uint64(i)), []byte{})
|
||||
})
|
||||
}
|
||||
|
||||
db.MaxBatchSize = 1000
|
||||
db.MaxBatchDelay = 0
|
||||
|
||||
go put(1)
|
||||
|
||||
// Batch must trigger by time alone.
|
||||
|
||||
// Check all responses to make sure there's no error.
|
||||
for i := 0; i < size; i++ {
|
||||
if err := <-ch; err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure data is correct.
|
||||
db.MustView(func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
for i := 1; i <= size; i++ {
|
||||
if v := b.Get(u64tob(uint64(i))); v == nil {
|
||||
t.Errorf("key not found: %d", i)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
})
|
||||
}
|
9
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_arm64.go
generated
vendored
9
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_arm64.go
generated
vendored
@@ -1,9 +0,0 @@
|
||||
// +build arm64
|
||||
|
||||
package bolt
|
||||
|
||||
// maxMapSize represents the largest mmap size supported by Bolt.
|
||||
const maxMapSize = 0xFFFFFFFFFFFF // 256TB
|
||||
|
||||
// maxAllocSize is the size used when creating array pointers.
|
||||
const maxAllocSize = 0x7FFFFFFF
|
2
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_linux.go
generated
vendored
2
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_linux.go
generated
vendored
@@ -4,6 +4,8 @@ import (
|
||||
"syscall"
|
||||
)
|
||||
|
||||
var odirect = syscall.O_DIRECT
|
||||
|
||||
// fdatasync flushes written data to a file descriptor.
|
||||
func fdatasync(db *DB) error {
|
||||
return syscall.Fdatasync(int(db.file.Fd()))
|
||||
|
2
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_openbsd.go
generated
vendored
2
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_openbsd.go
generated
vendored
@@ -11,6 +11,8 @@ const (
|
||||
msInvalidate // invalidate cached data
|
||||
)
|
||||
|
||||
var odirect int
|
||||
|
||||
func msync(db *DB) error {
|
||||
_, _, errno := syscall.Syscall(syscall.SYS_MSYNC, uintptr(unsafe.Pointer(db.data)), uintptr(db.datasz), msInvalidate)
|
||||
if errno != 0 {
|
||||
|
9
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_ppc64le.go
generated
vendored
9
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_ppc64le.go
generated
vendored
@@ -1,9 +0,0 @@
|
||||
// +build ppc64le
|
||||
|
||||
package bolt
|
||||
|
||||
// maxMapSize represents the largest mmap size supported by Bolt.
|
||||
const maxMapSize = 0xFFFFFFFFFFFF // 256TB
|
||||
|
||||
// maxAllocSize is the size used when creating array pointers.
|
||||
const maxAllocSize = 0x7FFFFFFF
|
9
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_s390x.go
generated
vendored
9
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_s390x.go
generated
vendored
@@ -1,9 +0,0 @@
|
||||
// +build s390x
|
||||
|
||||
package bolt
|
||||
|
||||
// maxMapSize represents the largest mmap size supported by Bolt.
|
||||
const maxMapSize = 0xFFFFFFFFFFFF // 256TB
|
||||
|
||||
// maxAllocSize is the size used when creating array pointers.
|
||||
const maxAllocSize = 0x7FFFFFFF
|
36
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_test.go
generated
vendored
Normal file
36
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_test.go
generated
vendored
Normal file
@@ -0,0 +1,36 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"path/filepath"
|
||||
"reflect"
|
||||
"runtime"
|
||||
"testing"
|
||||
)
|
||||
|
||||
// assert fails the test if the condition is false.
|
||||
func assert(tb testing.TB, condition bool, msg string, v ...interface{}) {
|
||||
if !condition {
|
||||
_, file, line, _ := runtime.Caller(1)
|
||||
fmt.Printf("\033[31m%s:%d: "+msg+"\033[39m\n\n", append([]interface{}{filepath.Base(file), line}, v...)...)
|
||||
tb.FailNow()
|
||||
}
|
||||
}
|
||||
|
||||
// ok fails the test if an err is not nil.
|
||||
func ok(tb testing.TB, err error) {
|
||||
if err != nil {
|
||||
_, file, line, _ := runtime.Caller(1)
|
||||
fmt.Printf("\033[31m%s:%d: unexpected error: %s\033[39m\n\n", filepath.Base(file), line, err.Error())
|
||||
tb.FailNow()
|
||||
}
|
||||
}
|
||||
|
||||
// equals fails the test if exp is not equal to act.
|
||||
func equals(tb testing.TB, exp, act interface{}) {
|
||||
if !reflect.DeepEqual(exp, act) {
|
||||
_, file, line, _ := runtime.Caller(1)
|
||||
fmt.Printf("\033[31m%s:%d:\n\n\texp: %#v\n\n\tgot: %#v\033[39m\n\n", filepath.Base(file), line, exp, act)
|
||||
tb.FailNow()
|
||||
}
|
||||
}
|
13
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_unix.go
generated
vendored
13
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_unix.go
generated
vendored
@@ -46,8 +46,19 @@ func funlock(f *os.File) error {
|
||||
|
||||
// mmap memory maps a DB's data file.
|
||||
func mmap(db *DB, sz int) error {
|
||||
// Truncate and fsync to ensure file size metadata is flushed.
|
||||
// https://github.com/boltdb/bolt/issues/284
|
||||
if !db.NoGrowSync && !db.readOnly {
|
||||
if err := db.file.Truncate(int64(sz)); err != nil {
|
||||
return fmt.Errorf("file resize error: %s", err)
|
||||
}
|
||||
if err := db.file.Sync(); err != nil {
|
||||
return fmt.Errorf("file sync error: %s", err)
|
||||
}
|
||||
}
|
||||
|
||||
// Map the data file to memory.
|
||||
b, err := syscall.Mmap(int(db.file.Fd()), 0, sz, syscall.PROT_READ, syscall.MAP_SHARED|db.MmapFlags)
|
||||
b, err := syscall.Mmap(int(db.file.Fd()), 0, sz, syscall.PROT_READ, syscall.MAP_SHARED)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
16
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_unix_solaris.go
generated
vendored
16
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_unix_solaris.go
generated
vendored
@@ -2,12 +2,11 @@ package bolt
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/golang.org/x/sys/unix"
|
||||
"os"
|
||||
"syscall"
|
||||
"time"
|
||||
"unsafe"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/golang.org/x/sys/unix"
|
||||
)
|
||||
|
||||
// flock acquires an advisory lock on a file descriptor.
|
||||
@@ -56,8 +55,19 @@ func funlock(f *os.File) error {
|
||||
|
||||
// mmap memory maps a DB's data file.
|
||||
func mmap(db *DB, sz int) error {
|
||||
// Truncate and fsync to ensure file size metadata is flushed.
|
||||
// https://github.com/boltdb/bolt/issues/284
|
||||
if !db.NoGrowSync && !db.readOnly {
|
||||
if err := db.file.Truncate(int64(sz)); err != nil {
|
||||
return fmt.Errorf("file resize error: %s", err)
|
||||
}
|
||||
if err := db.file.Sync(); err != nil {
|
||||
return fmt.Errorf("file sync error: %s", err)
|
||||
}
|
||||
}
|
||||
|
||||
// Map the data file to memory.
|
||||
b, err := unix.Mmap(int(db.file.Fd()), 0, sz, syscall.PROT_READ, syscall.MAP_SHARED|db.MmapFlags)
|
||||
b, err := unix.Mmap(int(db.file.Fd()), 0, sz, syscall.PROT_READ, syscall.MAP_SHARED)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
62
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_windows.go
generated
vendored
62
Godeps/_workspace/src/github.com/boltdb/bolt/bolt_windows.go
generated
vendored
@@ -8,37 +8,7 @@ import (
|
||||
"unsafe"
|
||||
)
|
||||
|
||||
// LockFileEx code derived from golang build filemutex_windows.go @ v1.5.1
|
||||
var (
|
||||
modkernel32 = syscall.NewLazyDLL("kernel32.dll")
|
||||
procLockFileEx = modkernel32.NewProc("LockFileEx")
|
||||
procUnlockFileEx = modkernel32.NewProc("UnlockFileEx")
|
||||
)
|
||||
|
||||
const (
|
||||
// see https://msdn.microsoft.com/en-us/library/windows/desktop/aa365203(v=vs.85).aspx
|
||||
flagLockExclusive = 2
|
||||
flagLockFailImmediately = 1
|
||||
|
||||
// see https://msdn.microsoft.com/en-us/library/windows/desktop/ms681382(v=vs.85).aspx
|
||||
errLockViolation syscall.Errno = 0x21
|
||||
)
|
||||
|
||||
func lockFileEx(h syscall.Handle, flags, reserved, locklow, lockhigh uint32, ol *syscall.Overlapped) (err error) {
|
||||
r, _, err := procLockFileEx.Call(uintptr(h), uintptr(flags), uintptr(reserved), uintptr(locklow), uintptr(lockhigh), uintptr(unsafe.Pointer(ol)))
|
||||
if r == 0 {
|
||||
return err
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func unlockFileEx(h syscall.Handle, reserved, locklow, lockhigh uint32, ol *syscall.Overlapped) (err error) {
|
||||
r, _, err := procUnlockFileEx.Call(uintptr(h), uintptr(reserved), uintptr(locklow), uintptr(lockhigh), uintptr(unsafe.Pointer(ol)), 0)
|
||||
if r == 0 {
|
||||
return err
|
||||
}
|
||||
return nil
|
||||
}
|
||||
var odirect int
|
||||
|
||||
// fdatasync flushes written data to a file descriptor.
|
||||
func fdatasync(db *DB) error {
|
||||
@@ -46,37 +16,13 @@ func fdatasync(db *DB) error {
|
||||
}
|
||||
|
||||
// flock acquires an advisory lock on a file descriptor.
|
||||
func flock(f *os.File, exclusive bool, timeout time.Duration) error {
|
||||
var t time.Time
|
||||
for {
|
||||
// If we're beyond our timeout then return an error.
|
||||
// This can only occur after we've attempted a flock once.
|
||||
if t.IsZero() {
|
||||
t = time.Now()
|
||||
} else if timeout > 0 && time.Since(t) > timeout {
|
||||
return ErrTimeout
|
||||
}
|
||||
|
||||
var flag uint32 = flagLockFailImmediately
|
||||
if exclusive {
|
||||
flag |= flagLockExclusive
|
||||
}
|
||||
|
||||
err := lockFileEx(syscall.Handle(f.Fd()), flag, 0, 1, 0, &syscall.Overlapped{})
|
||||
if err == nil {
|
||||
return nil
|
||||
} else if err != errLockViolation {
|
||||
return err
|
||||
}
|
||||
|
||||
// Wait for a bit and try again.
|
||||
time.Sleep(50 * time.Millisecond)
|
||||
}
|
||||
func flock(f *os.File, _ bool, _ time.Duration) error {
|
||||
return nil
|
||||
}
|
||||
|
||||
// funlock releases an advisory lock on a file descriptor.
|
||||
func funlock(f *os.File) error {
|
||||
return unlockFileEx(syscall.Handle(f.Fd()), 0, 1, 0, &syscall.Overlapped{})
|
||||
return nil
|
||||
}
|
||||
|
||||
// mmap memory maps a DB's data file.
|
||||
|
2
Godeps/_workspace/src/github.com/boltdb/bolt/boltsync_unix.go
generated
vendored
2
Godeps/_workspace/src/github.com/boltdb/bolt/boltsync_unix.go
generated
vendored
@@ -2,6 +2,8 @@
|
||||
|
||||
package bolt
|
||||
|
||||
var odirect int
|
||||
|
||||
// fdatasync flushes written data to a file descriptor.
|
||||
func fdatasync(db *DB) error {
|
||||
return db.file.Sync()
|
||||
|
9
Godeps/_workspace/src/github.com/boltdb/bolt/bucket.go
generated
vendored
9
Godeps/_workspace/src/github.com/boltdb/bolt/bucket.go
generated
vendored
@@ -11,7 +11,7 @@ const (
|
||||
MaxKeySize = 32768
|
||||
|
||||
// MaxValueSize is the maximum length of a value, in bytes.
|
||||
MaxValueSize = (1 << 31) - 2
|
||||
MaxValueSize = 4294967295
|
||||
)
|
||||
|
||||
const (
|
||||
@@ -99,7 +99,6 @@ func (b *Bucket) Cursor() *Cursor {
|
||||
|
||||
// Bucket retrieves a nested bucket by name.
|
||||
// Returns nil if the bucket does not exist.
|
||||
// The bucket instance is only valid for the lifetime of the transaction.
|
||||
func (b *Bucket) Bucket(name []byte) *Bucket {
|
||||
if b.buckets != nil {
|
||||
if child := b.buckets[string(name)]; child != nil {
|
||||
@@ -149,7 +148,6 @@ func (b *Bucket) openBucket(value []byte) *Bucket {
|
||||
|
||||
// CreateBucket creates a new bucket at the given key and returns the new bucket.
|
||||
// Returns an error if the key already exists, if the bucket name is blank, or if the bucket name is too long.
|
||||
// The bucket instance is only valid for the lifetime of the transaction.
|
||||
func (b *Bucket) CreateBucket(key []byte) (*Bucket, error) {
|
||||
if b.tx.db == nil {
|
||||
return nil, ErrTxClosed
|
||||
@@ -194,7 +192,6 @@ func (b *Bucket) CreateBucket(key []byte) (*Bucket, error) {
|
||||
|
||||
// CreateBucketIfNotExists creates a new bucket if it doesn't already exist and returns a reference to it.
|
||||
// Returns an error if the bucket name is blank, or if the bucket name is too long.
|
||||
// The bucket instance is only valid for the lifetime of the transaction.
|
||||
func (b *Bucket) CreateBucketIfNotExists(key []byte) (*Bucket, error) {
|
||||
child, err := b.CreateBucket(key)
|
||||
if err == ErrBucketExists {
|
||||
@@ -273,7 +270,6 @@ func (b *Bucket) Get(key []byte) []byte {
|
||||
|
||||
// Put sets the value for a key in the bucket.
|
||||
// If the key exist then its previous value will be overwritten.
|
||||
// Supplied value must remain valid for the life of the transaction.
|
||||
// Returns an error if the bucket was created from a read-only transaction, if the key is blank, if the key is too large, or if the value is too large.
|
||||
func (b *Bucket) Put(key []byte, value []byte) error {
|
||||
if b.tx.db == nil {
|
||||
@@ -350,8 +346,7 @@ func (b *Bucket) NextSequence() (uint64, error) {
|
||||
|
||||
// ForEach executes a function for each key/value pair in a bucket.
|
||||
// If the provided function returns an error then the iteration is stopped and
|
||||
// the error is returned to the caller. The provided function must not modify
|
||||
// the bucket; this will result in undefined behavior.
|
||||
// the error is returned to the caller.
|
||||
func (b *Bucket) ForEach(fn func(k, v []byte) error) error {
|
||||
if b.tx.db == nil {
|
||||
return ErrTxClosed
|
||||
|
1169
Godeps/_workspace/src/github.com/boltdb/bolt/bucket_test.go
generated
vendored
Normal file
1169
Godeps/_workspace/src/github.com/boltdb/bolt/bucket_test.go
generated
vendored
Normal file
File diff suppressed because it is too large
Load Diff
5
Godeps/_workspace/src/github.com/boltdb/bolt/cmd/bolt/main.go
generated
vendored
5
Godeps/_workspace/src/github.com/boltdb/bolt/cmd/bolt/main.go
generated
vendored
@@ -825,10 +825,7 @@ func (cmd *StatsCommand) Run(args ...string) error {
|
||||
|
||||
fmt.Fprintln(cmd.Stdout, "Bucket statistics")
|
||||
fmt.Fprintf(cmd.Stdout, "\tTotal number of buckets: %d\n", s.BucketN)
|
||||
percentage = 0
|
||||
if s.BucketN != 0 {
|
||||
percentage = int(float32(s.InlineBucketN) * 100.0 / float32(s.BucketN))
|
||||
}
|
||||
percentage = int(float32(s.InlineBucketN) * 100.0 / float32(s.BucketN))
|
||||
fmt.Fprintf(cmd.Stdout, "\tTotal number on inlined buckets: %d (%d%%)\n", s.InlineBucketN, percentage)
|
||||
percentage = 0
|
||||
if s.LeafInuse != 0 {
|
||||
|
145
Godeps/_workspace/src/github.com/boltdb/bolt/cmd/bolt/main_test.go
generated
vendored
Normal file
145
Godeps/_workspace/src/github.com/boltdb/bolt/cmd/bolt/main_test.go
generated
vendored
Normal file
@@ -0,0 +1,145 @@
|
||||
package main_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"io/ioutil"
|
||||
"os"
|
||||
"strconv"
|
||||
"testing"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt"
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt/cmd/bolt"
|
||||
)
|
||||
|
||||
// Ensure the "info" command can print information about a database.
|
||||
func TestInfoCommand_Run(t *testing.T) {
|
||||
db := MustOpen(0666, nil)
|
||||
db.DB.Close()
|
||||
defer db.Close()
|
||||
|
||||
// Run the info command.
|
||||
m := NewMain()
|
||||
if err := m.Run("info", db.Path); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure the "stats" command can execute correctly.
|
||||
func TestStatsCommand_Run(t *testing.T) {
|
||||
// Ignore
|
||||
if os.Getpagesize() != 4096 {
|
||||
t.Skip("system does not use 4KB page size")
|
||||
}
|
||||
|
||||
db := MustOpen(0666, nil)
|
||||
defer db.Close()
|
||||
|
||||
if err := db.Update(func(tx *bolt.Tx) error {
|
||||
// Create "foo" bucket.
|
||||
b, err := tx.CreateBucket([]byte("foo"))
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
for i := 0; i < 10; i++ {
|
||||
if err := b.Put([]byte(strconv.Itoa(i)), []byte(strconv.Itoa(i))); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
|
||||
// Create "bar" bucket.
|
||||
b, err = tx.CreateBucket([]byte("bar"))
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
for i := 0; i < 100; i++ {
|
||||
if err := b.Put([]byte(strconv.Itoa(i)), []byte(strconv.Itoa(i))); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
|
||||
// Create "baz" bucket.
|
||||
b, err = tx.CreateBucket([]byte("baz"))
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if err := b.Put([]byte("key"), []byte("value")); err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
return nil
|
||||
}); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
db.DB.Close()
|
||||
|
||||
// Generate expected result.
|
||||
exp := "Aggregate statistics for 3 buckets\n\n" +
|
||||
"Page count statistics\n" +
|
||||
"\tNumber of logical branch pages: 0\n" +
|
||||
"\tNumber of physical branch overflow pages: 0\n" +
|
||||
"\tNumber of logical leaf pages: 1\n" +
|
||||
"\tNumber of physical leaf overflow pages: 0\n" +
|
||||
"Tree statistics\n" +
|
||||
"\tNumber of keys/value pairs: 111\n" +
|
||||
"\tNumber of levels in B+tree: 1\n" +
|
||||
"Page size utilization\n" +
|
||||
"\tBytes allocated for physical branch pages: 0\n" +
|
||||
"\tBytes actually used for branch data: 0 (0%)\n" +
|
||||
"\tBytes allocated for physical leaf pages: 4096\n" +
|
||||
"\tBytes actually used for leaf data: 1996 (48%)\n" +
|
||||
"Bucket statistics\n" +
|
||||
"\tTotal number of buckets: 3\n" +
|
||||
"\tTotal number on inlined buckets: 2 (66%)\n" +
|
||||
"\tBytes used for inlined buckets: 236 (11%)\n"
|
||||
|
||||
// Run the command.
|
||||
m := NewMain()
|
||||
if err := m.Run("stats", db.Path); err != nil {
|
||||
t.Fatal(err)
|
||||
} else if m.Stdout.String() != exp {
|
||||
t.Fatalf("unexpected stdout:\n\n%s", m.Stdout.String())
|
||||
}
|
||||
}
|
||||
|
||||
// Main represents a test wrapper for main.Main that records output.
|
||||
type Main struct {
|
||||
*main.Main
|
||||
Stdin bytes.Buffer
|
||||
Stdout bytes.Buffer
|
||||
Stderr bytes.Buffer
|
||||
}
|
||||
|
||||
// NewMain returns a new instance of Main.
|
||||
func NewMain() *Main {
|
||||
m := &Main{Main: main.NewMain()}
|
||||
m.Main.Stdin = &m.Stdin
|
||||
m.Main.Stdout = &m.Stdout
|
||||
m.Main.Stderr = &m.Stderr
|
||||
return m
|
||||
}
|
||||
|
||||
// MustOpen creates a Bolt database in a temporary location.
|
||||
func MustOpen(mode os.FileMode, options *bolt.Options) *DB {
|
||||
// Create temporary path.
|
||||
f, _ := ioutil.TempFile("", "bolt-")
|
||||
f.Close()
|
||||
os.Remove(f.Name())
|
||||
|
||||
db, err := bolt.Open(f.Name(), mode, options)
|
||||
if err != nil {
|
||||
panic(err.Error())
|
||||
}
|
||||
return &DB{DB: db, Path: f.Name()}
|
||||
}
|
||||
|
||||
// DB is a test wrapper for bolt.DB.
|
||||
type DB struct {
|
||||
*bolt.DB
|
||||
Path string
|
||||
}
|
||||
|
||||
// Close closes and removes the database.
|
||||
func (db *DB) Close() error {
|
||||
defer os.Remove(db.Path)
|
||||
return db.DB.Close()
|
||||
}
|
56
Godeps/_workspace/src/github.com/boltdb/bolt/cursor.go
generated
vendored
56
Godeps/_workspace/src/github.com/boltdb/bolt/cursor.go
generated
vendored
@@ -34,13 +34,6 @@ func (c *Cursor) First() (key []byte, value []byte) {
|
||||
p, n := c.bucket.pageNode(c.bucket.root)
|
||||
c.stack = append(c.stack, elemRef{page: p, node: n, index: 0})
|
||||
c.first()
|
||||
|
||||
// If we land on an empty page then move to the next value.
|
||||
// https://github.com/boltdb/bolt/issues/450
|
||||
if c.stack[len(c.stack)-1].count() == 0 {
|
||||
c.next()
|
||||
}
|
||||
|
||||
k, v, flags := c.keyValue()
|
||||
if (flags & uint32(bucketLeafFlag)) != 0 {
|
||||
return k, nil
|
||||
@@ -216,37 +209,28 @@ func (c *Cursor) last() {
|
||||
// next moves to the next leaf element and returns the key and value.
|
||||
// If the cursor is at the last leaf element then it stays there and returns nil.
|
||||
func (c *Cursor) next() (key []byte, value []byte, flags uint32) {
|
||||
for {
|
||||
// Attempt to move over one element until we're successful.
|
||||
// Move up the stack as we hit the end of each page in our stack.
|
||||
var i int
|
||||
for i = len(c.stack) - 1; i >= 0; i-- {
|
||||
elem := &c.stack[i]
|
||||
if elem.index < elem.count()-1 {
|
||||
elem.index++
|
||||
break
|
||||
}
|
||||
// Attempt to move over one element until we're successful.
|
||||
// Move up the stack as we hit the end of each page in our stack.
|
||||
var i int
|
||||
for i = len(c.stack) - 1; i >= 0; i-- {
|
||||
elem := &c.stack[i]
|
||||
if elem.index < elem.count()-1 {
|
||||
elem.index++
|
||||
break
|
||||
}
|
||||
|
||||
// If we've hit the root page then stop and return. This will leave the
|
||||
// cursor on the last element of the last page.
|
||||
if i == -1 {
|
||||
return nil, nil, 0
|
||||
}
|
||||
|
||||
// Otherwise start from where we left off in the stack and find the
|
||||
// first element of the first leaf page.
|
||||
c.stack = c.stack[:i+1]
|
||||
c.first()
|
||||
|
||||
// If this is an empty page then restart and move back up the stack.
|
||||
// https://github.com/boltdb/bolt/issues/450
|
||||
if c.stack[len(c.stack)-1].count() == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
return c.keyValue()
|
||||
}
|
||||
|
||||
// If we've hit the root page then stop and return. This will leave the
|
||||
// cursor on the last element of the last page.
|
||||
if i == -1 {
|
||||
return nil, nil, 0
|
||||
}
|
||||
|
||||
// Otherwise start from where we left off in the stack and find the
|
||||
// first element of the first leaf page.
|
||||
c.stack = c.stack[:i+1]
|
||||
c.first()
|
||||
return c.keyValue()
|
||||
}
|
||||
|
||||
// search recursively performs a binary search against a given page/node until it finds a given key.
|
||||
|
511
Godeps/_workspace/src/github.com/boltdb/bolt/cursor_test.go
generated
vendored
Normal file
511
Godeps/_workspace/src/github.com/boltdb/bolt/cursor_test.go
generated
vendored
Normal file
@@ -0,0 +1,511 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"encoding/binary"
|
||||
"fmt"
|
||||
"os"
|
||||
"sort"
|
||||
"testing"
|
||||
"testing/quick"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt"
|
||||
)
|
||||
|
||||
// Ensure that a cursor can return a reference to the bucket that created it.
|
||||
func TestCursor_Bucket(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, _ := tx.CreateBucket([]byte("widgets"))
|
||||
c := b.Cursor()
|
||||
equals(t, b, c.Bucket())
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a Tx cursor can seek to the appropriate keys.
|
||||
func TestCursor_Seek(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket([]byte("widgets"))
|
||||
ok(t, err)
|
||||
ok(t, b.Put([]byte("foo"), []byte("0001")))
|
||||
ok(t, b.Put([]byte("bar"), []byte("0002")))
|
||||
ok(t, b.Put([]byte("baz"), []byte("0003")))
|
||||
_, err = b.CreateBucket([]byte("bkt"))
|
||||
ok(t, err)
|
||||
return nil
|
||||
})
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
|
||||
// Exact match should go to the key.
|
||||
k, v := c.Seek([]byte("bar"))
|
||||
equals(t, []byte("bar"), k)
|
||||
equals(t, []byte("0002"), v)
|
||||
|
||||
// Inexact match should go to the next key.
|
||||
k, v = c.Seek([]byte("bas"))
|
||||
equals(t, []byte("baz"), k)
|
||||
equals(t, []byte("0003"), v)
|
||||
|
||||
// Low key should go to the first key.
|
||||
k, v = c.Seek([]byte(""))
|
||||
equals(t, []byte("bar"), k)
|
||||
equals(t, []byte("0002"), v)
|
||||
|
||||
// High key should return no key.
|
||||
k, v = c.Seek([]byte("zzz"))
|
||||
assert(t, k == nil, "")
|
||||
assert(t, v == nil, "")
|
||||
|
||||
// Buckets should return their key but no value.
|
||||
k, v = c.Seek([]byte("bkt"))
|
||||
equals(t, []byte("bkt"), k)
|
||||
assert(t, v == nil, "")
|
||||
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
func TestCursor_Delete(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
var count = 1000
|
||||
|
||||
// Insert every other key between 0 and $count.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, _ := tx.CreateBucket([]byte("widgets"))
|
||||
for i := 0; i < count; i += 1 {
|
||||
k := make([]byte, 8)
|
||||
binary.BigEndian.PutUint64(k, uint64(i))
|
||||
b.Put(k, make([]byte, 100))
|
||||
}
|
||||
b.CreateBucket([]byte("sub"))
|
||||
return nil
|
||||
})
|
||||
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
bound := make([]byte, 8)
|
||||
binary.BigEndian.PutUint64(bound, uint64(count/2))
|
||||
for key, _ := c.First(); bytes.Compare(key, bound) < 0; key, _ = c.Next() {
|
||||
if err := c.Delete(); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
c.Seek([]byte("sub"))
|
||||
err := c.Delete()
|
||||
equals(t, err, bolt.ErrIncompatibleValue)
|
||||
return nil
|
||||
})
|
||||
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
equals(t, b.Stats().KeyN, count/2+1)
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a Tx cursor can seek to the appropriate keys when there are a
|
||||
// large number of keys. This test also checks that seek will always move
|
||||
// forward to the next key.
|
||||
//
|
||||
// Related: https://github.com/boltdb/bolt/pull/187
|
||||
func TestCursor_Seek_Large(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
var count = 10000
|
||||
|
||||
// Insert every other key between 0 and $count.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, _ := tx.CreateBucket([]byte("widgets"))
|
||||
for i := 0; i < count; i += 100 {
|
||||
for j := i; j < i+100; j += 2 {
|
||||
k := make([]byte, 8)
|
||||
binary.BigEndian.PutUint64(k, uint64(j))
|
||||
b.Put(k, make([]byte, 100))
|
||||
}
|
||||
}
|
||||
return nil
|
||||
})
|
||||
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
for i := 0; i < count; i++ {
|
||||
seek := make([]byte, 8)
|
||||
binary.BigEndian.PutUint64(seek, uint64(i))
|
||||
|
||||
k, _ := c.Seek(seek)
|
||||
|
||||
// The last seek is beyond the end of the the range so
|
||||
// it should return nil.
|
||||
if i == count-1 {
|
||||
assert(t, k == nil, "")
|
||||
continue
|
||||
}
|
||||
|
||||
// Otherwise we should seek to the exact key or the next key.
|
||||
num := binary.BigEndian.Uint64(k)
|
||||
if i%2 == 0 {
|
||||
equals(t, uint64(i), num)
|
||||
} else {
|
||||
equals(t, uint64(i+1), num)
|
||||
}
|
||||
}
|
||||
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a cursor can iterate over an empty bucket without error.
|
||||
func TestCursor_EmptyBucket(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
_, err := tx.CreateBucket([]byte("widgets"))
|
||||
return err
|
||||
})
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
k, v := c.First()
|
||||
assert(t, k == nil, "")
|
||||
assert(t, v == nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a Tx cursor can reverse iterate over an empty bucket without error.
|
||||
func TestCursor_EmptyBucketReverse(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
_, err := tx.CreateBucket([]byte("widgets"))
|
||||
return err
|
||||
})
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
k, v := c.Last()
|
||||
assert(t, k == nil, "")
|
||||
assert(t, v == nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a Tx cursor can iterate over a single root with a couple elements.
|
||||
func TestCursor_Iterate_Leaf(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("baz"), []byte{})
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte{0})
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("bar"), []byte{1})
|
||||
return nil
|
||||
})
|
||||
tx, _ := db.Begin(false)
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
|
||||
k, v := c.First()
|
||||
equals(t, string(k), "bar")
|
||||
equals(t, v, []byte{1})
|
||||
|
||||
k, v = c.Next()
|
||||
equals(t, string(k), "baz")
|
||||
equals(t, v, []byte{})
|
||||
|
||||
k, v = c.Next()
|
||||
equals(t, string(k), "foo")
|
||||
equals(t, v, []byte{0})
|
||||
|
||||
k, v = c.Next()
|
||||
assert(t, k == nil, "")
|
||||
assert(t, v == nil, "")
|
||||
|
||||
k, v = c.Next()
|
||||
assert(t, k == nil, "")
|
||||
assert(t, v == nil, "")
|
||||
|
||||
tx.Rollback()
|
||||
}
|
||||
|
||||
// Ensure that a Tx cursor can iterate in reverse over a single root with a couple elements.
|
||||
func TestCursor_LeafRootReverse(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("baz"), []byte{})
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte{0})
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("bar"), []byte{1})
|
||||
return nil
|
||||
})
|
||||
tx, _ := db.Begin(false)
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
|
||||
k, v := c.Last()
|
||||
equals(t, string(k), "foo")
|
||||
equals(t, v, []byte{0})
|
||||
|
||||
k, v = c.Prev()
|
||||
equals(t, string(k), "baz")
|
||||
equals(t, v, []byte{})
|
||||
|
||||
k, v = c.Prev()
|
||||
equals(t, string(k), "bar")
|
||||
equals(t, v, []byte{1})
|
||||
|
||||
k, v = c.Prev()
|
||||
assert(t, k == nil, "")
|
||||
assert(t, v == nil, "")
|
||||
|
||||
k, v = c.Prev()
|
||||
assert(t, k == nil, "")
|
||||
assert(t, v == nil, "")
|
||||
|
||||
tx.Rollback()
|
||||
}
|
||||
|
||||
// Ensure that a Tx cursor can restart from the beginning.
|
||||
func TestCursor_Restart(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("bar"), []byte{})
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte{})
|
||||
return nil
|
||||
})
|
||||
|
||||
tx, _ := db.Begin(false)
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
|
||||
k, _ := c.First()
|
||||
equals(t, string(k), "bar")
|
||||
|
||||
k, _ = c.Next()
|
||||
equals(t, string(k), "foo")
|
||||
|
||||
k, _ = c.First()
|
||||
equals(t, string(k), "bar")
|
||||
|
||||
k, _ = c.Next()
|
||||
equals(t, string(k), "foo")
|
||||
|
||||
tx.Rollback()
|
||||
}
|
||||
|
||||
// Ensure that a Tx can iterate over all elements in a bucket.
|
||||
func TestCursor_QuickCheck(t *testing.T) {
|
||||
f := func(items testdata) bool {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
// Bulk insert all values.
|
||||
tx, _ := db.Begin(true)
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
for _, item := range items {
|
||||
ok(t, b.Put(item.Key, item.Value))
|
||||
}
|
||||
ok(t, tx.Commit())
|
||||
|
||||
// Sort test data.
|
||||
sort.Sort(items)
|
||||
|
||||
// Iterate over all items and check consistency.
|
||||
var index = 0
|
||||
tx, _ = db.Begin(false)
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
for k, v := c.First(); k != nil && index < len(items); k, v = c.Next() {
|
||||
equals(t, k, items[index].Key)
|
||||
equals(t, v, items[index].Value)
|
||||
index++
|
||||
}
|
||||
equals(t, len(items), index)
|
||||
tx.Rollback()
|
||||
|
||||
return true
|
||||
}
|
||||
if err := quick.Check(f, qconfig()); err != nil {
|
||||
t.Error(err)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a transaction can iterate over all elements in a bucket in reverse.
|
||||
func TestCursor_QuickCheck_Reverse(t *testing.T) {
|
||||
f := func(items testdata) bool {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
// Bulk insert all values.
|
||||
tx, _ := db.Begin(true)
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
for _, item := range items {
|
||||
ok(t, b.Put(item.Key, item.Value))
|
||||
}
|
||||
ok(t, tx.Commit())
|
||||
|
||||
// Sort test data.
|
||||
sort.Sort(revtestdata(items))
|
||||
|
||||
// Iterate over all items and check consistency.
|
||||
var index = 0
|
||||
tx, _ = db.Begin(false)
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
for k, v := c.Last(); k != nil && index < len(items); k, v = c.Prev() {
|
||||
equals(t, k, items[index].Key)
|
||||
equals(t, v, items[index].Value)
|
||||
index++
|
||||
}
|
||||
equals(t, len(items), index)
|
||||
tx.Rollback()
|
||||
|
||||
return true
|
||||
}
|
||||
if err := quick.Check(f, qconfig()); err != nil {
|
||||
t.Error(err)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a Tx cursor can iterate over subbuckets.
|
||||
func TestCursor_QuickCheck_BucketsOnly(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket([]byte("widgets"))
|
||||
ok(t, err)
|
||||
_, err = b.CreateBucket([]byte("foo"))
|
||||
ok(t, err)
|
||||
_, err = b.CreateBucket([]byte("bar"))
|
||||
ok(t, err)
|
||||
_, err = b.CreateBucket([]byte("baz"))
|
||||
ok(t, err)
|
||||
return nil
|
||||
})
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
var names []string
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
for k, v := c.First(); k != nil; k, v = c.Next() {
|
||||
names = append(names, string(k))
|
||||
assert(t, v == nil, "")
|
||||
}
|
||||
equals(t, names, []string{"bar", "baz", "foo"})
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a Tx cursor can reverse iterate over subbuckets.
|
||||
func TestCursor_QuickCheck_BucketsOnly_Reverse(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket([]byte("widgets"))
|
||||
ok(t, err)
|
||||
_, err = b.CreateBucket([]byte("foo"))
|
||||
ok(t, err)
|
||||
_, err = b.CreateBucket([]byte("bar"))
|
||||
ok(t, err)
|
||||
_, err = b.CreateBucket([]byte("baz"))
|
||||
ok(t, err)
|
||||
return nil
|
||||
})
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
var names []string
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
for k, v := c.Last(); k != nil; k, v = c.Prev() {
|
||||
names = append(names, string(k))
|
||||
assert(t, v == nil, "")
|
||||
}
|
||||
equals(t, names, []string{"foo", "baz", "bar"})
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
func ExampleCursor() {
|
||||
// Open the database.
|
||||
db, _ := bolt.Open(tempfile(), 0666, nil)
|
||||
defer os.Remove(db.Path())
|
||||
defer db.Close()
|
||||
|
||||
// Start a read-write transaction.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
// Create a new bucket.
|
||||
tx.CreateBucket([]byte("animals"))
|
||||
|
||||
// Insert data into a bucket.
|
||||
b := tx.Bucket([]byte("animals"))
|
||||
b.Put([]byte("dog"), []byte("fun"))
|
||||
b.Put([]byte("cat"), []byte("lame"))
|
||||
b.Put([]byte("liger"), []byte("awesome"))
|
||||
|
||||
// Create a cursor for iteration.
|
||||
c := b.Cursor()
|
||||
|
||||
// Iterate over items in sorted key order. This starts from the
|
||||
// first key/value pair and updates the k/v variables to the
|
||||
// next key/value on each iteration.
|
||||
//
|
||||
// The loop finishes at the end of the cursor when a nil key is returned.
|
||||
for k, v := c.First(); k != nil; k, v = c.Next() {
|
||||
fmt.Printf("A %s is %s.\n", k, v)
|
||||
}
|
||||
|
||||
return nil
|
||||
})
|
||||
|
||||
// Output:
|
||||
// A cat is lame.
|
||||
// A dog is fun.
|
||||
// A liger is awesome.
|
||||
}
|
||||
|
||||
func ExampleCursor_reverse() {
|
||||
// Open the database.
|
||||
db, _ := bolt.Open(tempfile(), 0666, nil)
|
||||
defer os.Remove(db.Path())
|
||||
defer db.Close()
|
||||
|
||||
// Start a read-write transaction.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
// Create a new bucket.
|
||||
tx.CreateBucket([]byte("animals"))
|
||||
|
||||
// Insert data into a bucket.
|
||||
b := tx.Bucket([]byte("animals"))
|
||||
b.Put([]byte("dog"), []byte("fun"))
|
||||
b.Put([]byte("cat"), []byte("lame"))
|
||||
b.Put([]byte("liger"), []byte("awesome"))
|
||||
|
||||
// Create a cursor for iteration.
|
||||
c := b.Cursor()
|
||||
|
||||
// Iterate over items in reverse sorted key order. This starts
|
||||
// from the last key/value pair and updates the k/v variables to
|
||||
// the previous key/value on each iteration.
|
||||
//
|
||||
// The loop finishes at the beginning of the cursor when a nil key
|
||||
// is returned.
|
||||
for k, v := c.Last(); k != nil; k, v = c.Prev() {
|
||||
fmt.Printf("A %s is %s.\n", k, v)
|
||||
}
|
||||
|
||||
return nil
|
||||
})
|
||||
|
||||
// Output:
|
||||
// A liger is awesome.
|
||||
// A dog is fun.
|
||||
// A cat is lame.
|
||||
}
|
212
Godeps/_workspace/src/github.com/boltdb/bolt/db.go
generated
vendored
212
Godeps/_workspace/src/github.com/boltdb/bolt/db.go
generated
vendored
@@ -1,10 +1,8 @@
|
||||
package bolt
|
||||
|
||||
import (
|
||||
"errors"
|
||||
"fmt"
|
||||
"hash/fnv"
|
||||
"log"
|
||||
"os"
|
||||
"runtime"
|
||||
"runtime/debug"
|
||||
@@ -26,14 +24,13 @@ const magic uint32 = 0xED0CDAED
|
||||
// IgnoreNoSync specifies whether the NoSync field of a DB is ignored when
|
||||
// syncing changes to a file. This is required as some operating systems,
|
||||
// such as OpenBSD, do not have a unified buffer cache (UBC) and writes
|
||||
// must be synchronized using the msync(2) syscall.
|
||||
// must be synchronzied using the msync(2) syscall.
|
||||
const IgnoreNoSync = runtime.GOOS == "openbsd"
|
||||
|
||||
// Default values if not set in a DB instance.
|
||||
const (
|
||||
DefaultMaxBatchSize int = 1000
|
||||
DefaultMaxBatchDelay = 10 * time.Millisecond
|
||||
DefaultAllocSize = 16 * 1024 * 1024
|
||||
)
|
||||
|
||||
// DB represents a collection of buckets persisted to a file on disk.
|
||||
@@ -66,10 +63,6 @@ type DB struct {
|
||||
// https://github.com/boltdb/bolt/issues/284
|
||||
NoGrowSync bool
|
||||
|
||||
// If you want to read the entire database fast, you can set MmapFlag to
|
||||
// syscall.MAP_POPULATE on Linux 2.6.23+ for sequential read-ahead.
|
||||
MmapFlags int
|
||||
|
||||
// MaxBatchSize is the maximum size of a batch. Default value is
|
||||
// copied from DefaultMaxBatchSize in Open.
|
||||
//
|
||||
@@ -86,17 +79,11 @@ type DB struct {
|
||||
// Do not change concurrently with calls to Batch.
|
||||
MaxBatchDelay time.Duration
|
||||
|
||||
// AllocSize is the amount of space allocated when the database
|
||||
// needs to create new pages. This is done to amortize the cost
|
||||
// of truncate() and fsync() when growing the data file.
|
||||
AllocSize int
|
||||
|
||||
path string
|
||||
file *os.File
|
||||
dataref []byte // mmap'ed readonly, write throws SEGV
|
||||
data *[maxMapSize]byte
|
||||
datasz int
|
||||
filesz int // current on disk file size
|
||||
meta0 *meta
|
||||
meta1 *meta
|
||||
pageSize int
|
||||
@@ -149,12 +136,10 @@ func Open(path string, mode os.FileMode, options *Options) (*DB, error) {
|
||||
options = DefaultOptions
|
||||
}
|
||||
db.NoGrowSync = options.NoGrowSync
|
||||
db.MmapFlags = options.MmapFlags
|
||||
|
||||
// Set default values for later DB operations.
|
||||
db.MaxBatchSize = DefaultMaxBatchSize
|
||||
db.MaxBatchDelay = DefaultMaxBatchDelay
|
||||
db.AllocSize = DefaultAllocSize
|
||||
|
||||
flag := os.O_RDWR
|
||||
if options.ReadOnly {
|
||||
@@ -187,7 +172,7 @@ func Open(path string, mode os.FileMode, options *Options) (*DB, error) {
|
||||
|
||||
// Initialize the database if it doesn't exist.
|
||||
if info, err := db.file.Stat(); err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("stat error: %s", err)
|
||||
} else if info.Size() == 0 {
|
||||
// Initialize new files with meta pages.
|
||||
if err := db.init(); err != nil {
|
||||
@@ -199,14 +184,14 @@ func Open(path string, mode os.FileMode, options *Options) (*DB, error) {
|
||||
if _, err := db.file.ReadAt(buf[:], 0); err == nil {
|
||||
m := db.pageInBuffer(buf[:], 0).meta()
|
||||
if err := m.validate(); err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("meta0 error: %s", err)
|
||||
}
|
||||
db.pageSize = int(m.pageSize)
|
||||
}
|
||||
}
|
||||
|
||||
// Memory map the data file.
|
||||
if err := db.mmap(options.InitialMmapSize); err != nil {
|
||||
if err := db.mmap(0); err != nil {
|
||||
_ = db.close()
|
||||
return nil, err
|
||||
}
|
||||
@@ -263,10 +248,10 @@ func (db *DB) mmap(minsz int) error {
|
||||
|
||||
// Validate the meta pages.
|
||||
if err := db.meta0.validate(); err != nil {
|
||||
return err
|
||||
return fmt.Errorf("meta0 error: %s", err)
|
||||
}
|
||||
if err := db.meta1.validate(); err != nil {
|
||||
return err
|
||||
return fmt.Errorf("meta1 error: %s", err)
|
||||
}
|
||||
|
||||
return nil
|
||||
@@ -281,7 +266,7 @@ func (db *DB) munmap() error {
|
||||
}
|
||||
|
||||
// mmapSize determines the appropriate size for the mmap given the current size
|
||||
// of the database. The minimum size is 32KB and doubles until it reaches 1GB.
|
||||
// of the database. The minimum size is 1MB and doubles until it reaches 1GB.
|
||||
// Returns an error if the new mmap size is greater than the max allowed.
|
||||
func (db *DB) mmapSize(size int) (int, error) {
|
||||
// Double the size from 32KB until 1GB.
|
||||
@@ -397,9 +382,7 @@ func (db *DB) close() error {
|
||||
// No need to unlock read-only file.
|
||||
if !db.readOnly {
|
||||
// Unlock the file.
|
||||
if err := funlock(db.file); err != nil {
|
||||
log.Printf("bolt.Close(): funlock error: %s", err)
|
||||
}
|
||||
_ = funlock(db.file)
|
||||
}
|
||||
|
||||
// Close the file descriptor.
|
||||
@@ -418,15 +401,11 @@ func (db *DB) close() error {
|
||||
// will cause the calls to block and be serialized until the current write
|
||||
// transaction finishes.
|
||||
//
|
||||
// Transactions should not be dependent on one another. Opening a read
|
||||
// Transactions should not be depedent on one another. Opening a read
|
||||
// transaction and a write transaction in the same goroutine can cause the
|
||||
// writer to deadlock because the database periodically needs to re-mmap itself
|
||||
// as it grows and it cannot do that while a read transaction is open.
|
||||
//
|
||||
// If a long running read transaction (for example, a snapshot transaction) is
|
||||
// needed, you might want to set DB.InitialMmapSize to a large enough value
|
||||
// to avoid potential blocking of write transaction.
|
||||
//
|
||||
// IMPORTANT: You must close read-only transactions after you are finished or
|
||||
// else the database will not reclaim old pages.
|
||||
func (db *DB) Begin(writable bool) (*Tx, error) {
|
||||
@@ -610,136 +589,6 @@ func (db *DB) View(fn func(*Tx) error) error {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Batch calls fn as part of a batch. It behaves similar to Update,
|
||||
// except:
|
||||
//
|
||||
// 1. concurrent Batch calls can be combined into a single Bolt
|
||||
// transaction.
|
||||
//
|
||||
// 2. the function passed to Batch may be called multiple times,
|
||||
// regardless of whether it returns error or not.
|
||||
//
|
||||
// This means that Batch function side effects must be idempotent and
|
||||
// take permanent effect only after a successful return is seen in
|
||||
// caller.
|
||||
//
|
||||
// The maximum batch size and delay can be adjusted with DB.MaxBatchSize
|
||||
// and DB.MaxBatchDelay, respectively.
|
||||
//
|
||||
// Batch is only useful when there are multiple goroutines calling it.
|
||||
func (db *DB) Batch(fn func(*Tx) error) error {
|
||||
errCh := make(chan error, 1)
|
||||
|
||||
db.batchMu.Lock()
|
||||
if (db.batch == nil) || (db.batch != nil && len(db.batch.calls) >= db.MaxBatchSize) {
|
||||
// There is no existing batch, or the existing batch is full; start a new one.
|
||||
db.batch = &batch{
|
||||
db: db,
|
||||
}
|
||||
db.batch.timer = time.AfterFunc(db.MaxBatchDelay, db.batch.trigger)
|
||||
}
|
||||
db.batch.calls = append(db.batch.calls, call{fn: fn, err: errCh})
|
||||
if len(db.batch.calls) >= db.MaxBatchSize {
|
||||
// wake up batch, it's ready to run
|
||||
go db.batch.trigger()
|
||||
}
|
||||
db.batchMu.Unlock()
|
||||
|
||||
err := <-errCh
|
||||
if err == trySolo {
|
||||
err = db.Update(fn)
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
type call struct {
|
||||
fn func(*Tx) error
|
||||
err chan<- error
|
||||
}
|
||||
|
||||
type batch struct {
|
||||
db *DB
|
||||
timer *time.Timer
|
||||
start sync.Once
|
||||
calls []call
|
||||
}
|
||||
|
||||
// trigger runs the batch if it hasn't already been run.
|
||||
func (b *batch) trigger() {
|
||||
b.start.Do(b.run)
|
||||
}
|
||||
|
||||
// run performs the transactions in the batch and communicates results
|
||||
// back to DB.Batch.
|
||||
func (b *batch) run() {
|
||||
b.db.batchMu.Lock()
|
||||
b.timer.Stop()
|
||||
// Make sure no new work is added to this batch, but don't break
|
||||
// other batches.
|
||||
if b.db.batch == b {
|
||||
b.db.batch = nil
|
||||
}
|
||||
b.db.batchMu.Unlock()
|
||||
|
||||
retry:
|
||||
for len(b.calls) > 0 {
|
||||
var failIdx = -1
|
||||
err := b.db.Update(func(tx *Tx) error {
|
||||
for i, c := range b.calls {
|
||||
if err := safelyCall(c.fn, tx); err != nil {
|
||||
failIdx = i
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
})
|
||||
|
||||
if failIdx >= 0 {
|
||||
// take the failing transaction out of the batch. it's
|
||||
// safe to shorten b.calls here because db.batch no longer
|
||||
// points to us, and we hold the mutex anyway.
|
||||
c := b.calls[failIdx]
|
||||
b.calls[failIdx], b.calls = b.calls[len(b.calls)-1], b.calls[:len(b.calls)-1]
|
||||
// tell the submitter re-run it solo, continue with the rest of the batch
|
||||
c.err <- trySolo
|
||||
continue retry
|
||||
}
|
||||
|
||||
// pass success, or bolt internal errors, to all callers
|
||||
for _, c := range b.calls {
|
||||
if c.err != nil {
|
||||
c.err <- err
|
||||
}
|
||||
}
|
||||
break retry
|
||||
}
|
||||
}
|
||||
|
||||
// trySolo is a special sentinel error value used for signaling that a
|
||||
// transaction function should be re-run. It should never be seen by
|
||||
// callers.
|
||||
var trySolo = errors.New("batch function returned an error and should be re-run solo")
|
||||
|
||||
type panicked struct {
|
||||
reason interface{}
|
||||
}
|
||||
|
||||
func (p panicked) Error() string {
|
||||
if err, ok := p.reason.(error); ok {
|
||||
return err.Error()
|
||||
}
|
||||
return fmt.Sprintf("panic: %v", p.reason)
|
||||
}
|
||||
|
||||
func safelyCall(fn func(*Tx) error, tx *Tx) (err error) {
|
||||
defer func() {
|
||||
if p := recover(); p != nil {
|
||||
err = panicked{p}
|
||||
}
|
||||
}()
|
||||
return fn(tx)
|
||||
}
|
||||
|
||||
// Sync executes fdatasync() against the database file handle.
|
||||
//
|
||||
// This is not necessary under normal operation, however, if you use NoSync
|
||||
@@ -806,36 +655,6 @@ func (db *DB) allocate(count int) (*page, error) {
|
||||
return p, nil
|
||||
}
|
||||
|
||||
// grow grows the size of the database to the given sz.
|
||||
func (db *DB) grow(sz int) error {
|
||||
// Ignore if the new size is less than available file size.
|
||||
if sz <= db.filesz {
|
||||
return nil
|
||||
}
|
||||
|
||||
// If the data is smaller than the alloc size then only allocate what's needed.
|
||||
// Once it goes over the allocation size then allocate in chunks.
|
||||
if db.datasz < db.AllocSize {
|
||||
sz = db.datasz
|
||||
} else {
|
||||
sz += db.AllocSize
|
||||
}
|
||||
|
||||
// Truncate and fsync to ensure file size metadata is flushed.
|
||||
// https://github.com/boltdb/bolt/issues/284
|
||||
if !db.NoGrowSync && !db.readOnly {
|
||||
if err := db.file.Truncate(int64(sz)); err != nil {
|
||||
return fmt.Errorf("file resize error: %s", err)
|
||||
}
|
||||
if err := db.file.Sync(); err != nil {
|
||||
return fmt.Errorf("file sync error: %s", err)
|
||||
}
|
||||
}
|
||||
|
||||
db.filesz = sz
|
||||
return nil
|
||||
}
|
||||
|
||||
func (db *DB) IsReadOnly() bool {
|
||||
return db.readOnly
|
||||
}
|
||||
@@ -853,19 +672,6 @@ type Options struct {
|
||||
// Open database in read-only mode. Uses flock(..., LOCK_SH |LOCK_NB) to
|
||||
// grab a shared lock (UNIX).
|
||||
ReadOnly bool
|
||||
|
||||
// Sets the DB.MmapFlags flag before memory mapping the file.
|
||||
MmapFlags int
|
||||
|
||||
// InitialMmapSize is the initial mmap size of the database
|
||||
// in bytes. Read transactions won't block write transaction
|
||||
// if the InitialMmapSize is large enough to hold database mmap
|
||||
// size. (See DB.Begin for more information)
|
||||
//
|
||||
// If <=0, the initial map size is 0.
|
||||
// If initialMmapSize is smaller than the previous database size,
|
||||
// it takes no effect.
|
||||
InitialMmapSize int
|
||||
}
|
||||
|
||||
// DefaultOptions represent the options used if nil options are passed into Open().
|
||||
|
913
Godeps/_workspace/src/github.com/boltdb/bolt/db_test.go
generated
vendored
Normal file
913
Godeps/_workspace/src/github.com/boltdb/bolt/db_test.go
generated
vendored
Normal file
@@ -0,0 +1,913 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"encoding/binary"
|
||||
"errors"
|
||||
"flag"
|
||||
"fmt"
|
||||
"io/ioutil"
|
||||
"os"
|
||||
"regexp"
|
||||
"runtime"
|
||||
"sort"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt"
|
||||
)
|
||||
|
||||
var statsFlag = flag.Bool("stats", false, "show performance stats")
|
||||
|
||||
// Ensure that opening a database with a bad path returns an error.
|
||||
func TestOpen_BadPath(t *testing.T) {
|
||||
db, err := bolt.Open("", 0666, nil)
|
||||
assert(t, err != nil, "err: %s", err)
|
||||
assert(t, db == nil, "")
|
||||
}
|
||||
|
||||
// Ensure that a database can be opened without error.
|
||||
func TestOpen(t *testing.T) {
|
||||
path := tempfile()
|
||||
defer os.Remove(path)
|
||||
db, err := bolt.Open(path, 0666, nil)
|
||||
assert(t, db != nil, "")
|
||||
ok(t, err)
|
||||
equals(t, db.Path(), path)
|
||||
ok(t, db.Close())
|
||||
}
|
||||
|
||||
// Ensure that opening an already open database file will timeout.
|
||||
func TestOpen_Timeout(t *testing.T) {
|
||||
if runtime.GOOS == "windows" {
|
||||
t.Skip("timeout not supported on windows")
|
||||
}
|
||||
if runtime.GOOS == "solaris" {
|
||||
t.Skip("solaris fcntl locks don't support intra-process locking")
|
||||
}
|
||||
|
||||
path := tempfile()
|
||||
defer os.Remove(path)
|
||||
|
||||
// Open a data file.
|
||||
db0, err := bolt.Open(path, 0666, nil)
|
||||
assert(t, db0 != nil, "")
|
||||
ok(t, err)
|
||||
|
||||
// Attempt to open the database again.
|
||||
start := time.Now()
|
||||
db1, err := bolt.Open(path, 0666, &bolt.Options{Timeout: 100 * time.Millisecond})
|
||||
assert(t, db1 == nil, "")
|
||||
equals(t, bolt.ErrTimeout, err)
|
||||
assert(t, time.Since(start) > 100*time.Millisecond, "")
|
||||
|
||||
db0.Close()
|
||||
}
|
||||
|
||||
// Ensure that opening an already open database file will wait until its closed.
|
||||
func TestOpen_Wait(t *testing.T) {
|
||||
if runtime.GOOS == "windows" {
|
||||
t.Skip("timeout not supported on windows")
|
||||
}
|
||||
if runtime.GOOS == "solaris" {
|
||||
t.Skip("solaris fcntl locks don't support intra-process locking")
|
||||
}
|
||||
|
||||
path := tempfile()
|
||||
defer os.Remove(path)
|
||||
|
||||
// Open a data file.
|
||||
db0, err := bolt.Open(path, 0666, nil)
|
||||
assert(t, db0 != nil, "")
|
||||
ok(t, err)
|
||||
|
||||
// Close it in just a bit.
|
||||
time.AfterFunc(100*time.Millisecond, func() { db0.Close() })
|
||||
|
||||
// Attempt to open the database again.
|
||||
start := time.Now()
|
||||
db1, err := bolt.Open(path, 0666, &bolt.Options{Timeout: 200 * time.Millisecond})
|
||||
assert(t, db1 != nil, "")
|
||||
ok(t, err)
|
||||
assert(t, time.Since(start) > 100*time.Millisecond, "")
|
||||
}
|
||||
|
||||
// Ensure that opening a database does not increase its size.
|
||||
// https://github.com/boltdb/bolt/issues/291
|
||||
func TestOpen_Size(t *testing.T) {
|
||||
// Open a data file.
|
||||
db := NewTestDB()
|
||||
path := db.Path()
|
||||
defer db.Close()
|
||||
|
||||
// Insert until we get above the minimum 4MB size.
|
||||
ok(t, db.Update(func(tx *bolt.Tx) error {
|
||||
b, _ := tx.CreateBucketIfNotExists([]byte("data"))
|
||||
for i := 0; i < 10000; i++ {
|
||||
ok(t, b.Put([]byte(fmt.Sprintf("%04d", i)), make([]byte, 1000)))
|
||||
}
|
||||
return nil
|
||||
}))
|
||||
|
||||
// Close database and grab the size.
|
||||
db.DB.Close()
|
||||
sz := fileSize(path)
|
||||
if sz == 0 {
|
||||
t.Fatalf("unexpected new file size: %d", sz)
|
||||
}
|
||||
|
||||
// Reopen database, update, and check size again.
|
||||
db0, err := bolt.Open(path, 0666, nil)
|
||||
ok(t, err)
|
||||
ok(t, db0.Update(func(tx *bolt.Tx) error { return tx.Bucket([]byte("data")).Put([]byte{0}, []byte{0}) }))
|
||||
ok(t, db0.Close())
|
||||
newSz := fileSize(path)
|
||||
if newSz == 0 {
|
||||
t.Fatalf("unexpected new file size: %d", newSz)
|
||||
}
|
||||
|
||||
// Compare the original size with the new size.
|
||||
if sz != newSz {
|
||||
t.Fatalf("unexpected file growth: %d => %d", sz, newSz)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that opening a database beyond the max step size does not increase its size.
|
||||
// https://github.com/boltdb/bolt/issues/303
|
||||
func TestOpen_Size_Large(t *testing.T) {
|
||||
if testing.Short() {
|
||||
t.Skip("short mode")
|
||||
}
|
||||
|
||||
// Open a data file.
|
||||
db := NewTestDB()
|
||||
path := db.Path()
|
||||
defer db.Close()
|
||||
|
||||
// Insert until we get above the minimum 4MB size.
|
||||
var index uint64
|
||||
for i := 0; i < 10000; i++ {
|
||||
ok(t, db.Update(func(tx *bolt.Tx) error {
|
||||
b, _ := tx.CreateBucketIfNotExists([]byte("data"))
|
||||
for j := 0; j < 1000; j++ {
|
||||
ok(t, b.Put(u64tob(index), make([]byte, 50)))
|
||||
index++
|
||||
}
|
||||
return nil
|
||||
}))
|
||||
}
|
||||
|
||||
// Close database and grab the size.
|
||||
db.DB.Close()
|
||||
sz := fileSize(path)
|
||||
if sz == 0 {
|
||||
t.Fatalf("unexpected new file size: %d", sz)
|
||||
} else if sz < (1 << 30) {
|
||||
t.Fatalf("expected larger initial size: %d", sz)
|
||||
}
|
||||
|
||||
// Reopen database, update, and check size again.
|
||||
db0, err := bolt.Open(path, 0666, nil)
|
||||
ok(t, err)
|
||||
ok(t, db0.Update(func(tx *bolt.Tx) error { return tx.Bucket([]byte("data")).Put([]byte{0}, []byte{0}) }))
|
||||
ok(t, db0.Close())
|
||||
newSz := fileSize(path)
|
||||
if newSz == 0 {
|
||||
t.Fatalf("unexpected new file size: %d", newSz)
|
||||
}
|
||||
|
||||
// Compare the original size with the new size.
|
||||
if sz != newSz {
|
||||
t.Fatalf("unexpected file growth: %d => %d", sz, newSz)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a re-opened database is consistent.
|
||||
func TestOpen_Check(t *testing.T) {
|
||||
path := tempfile()
|
||||
defer os.Remove(path)
|
||||
|
||||
db, err := bolt.Open(path, 0666, nil)
|
||||
ok(t, err)
|
||||
ok(t, db.View(func(tx *bolt.Tx) error { return <-tx.Check() }))
|
||||
db.Close()
|
||||
|
||||
db, err = bolt.Open(path, 0666, nil)
|
||||
ok(t, err)
|
||||
ok(t, db.View(func(tx *bolt.Tx) error { return <-tx.Check() }))
|
||||
db.Close()
|
||||
}
|
||||
|
||||
// Ensure that the database returns an error if the file handle cannot be open.
|
||||
func TestDB_Open_FileError(t *testing.T) {
|
||||
path := tempfile()
|
||||
defer os.Remove(path)
|
||||
|
||||
_, err := bolt.Open(path+"/youre-not-my-real-parent", 0666, nil)
|
||||
assert(t, err.(*os.PathError) != nil, "")
|
||||
equals(t, path+"/youre-not-my-real-parent", err.(*os.PathError).Path)
|
||||
equals(t, "open", err.(*os.PathError).Op)
|
||||
}
|
||||
|
||||
// Ensure that write errors to the meta file handler during initialization are returned.
|
||||
func TestDB_Open_MetaInitWriteError(t *testing.T) {
|
||||
t.Skip("pending")
|
||||
}
|
||||
|
||||
// Ensure that a database that is too small returns an error.
|
||||
func TestDB_Open_FileTooSmall(t *testing.T) {
|
||||
path := tempfile()
|
||||
defer os.Remove(path)
|
||||
|
||||
db, err := bolt.Open(path, 0666, nil)
|
||||
ok(t, err)
|
||||
db.Close()
|
||||
|
||||
// corrupt the database
|
||||
ok(t, os.Truncate(path, int64(os.Getpagesize())))
|
||||
|
||||
db, err = bolt.Open(path, 0666, nil)
|
||||
equals(t, errors.New("file size too small"), err)
|
||||
}
|
||||
|
||||
// Ensure that a database can be opened in read-only mode by multiple processes
|
||||
// and that a database can not be opened in read-write mode and in read-only
|
||||
// mode at the same time.
|
||||
func TestOpen_ReadOnly(t *testing.T) {
|
||||
if runtime.GOOS == "solaris" {
|
||||
t.Skip("solaris fcntl locks don't support intra-process locking")
|
||||
}
|
||||
|
||||
bucket, key, value := []byte(`bucket`), []byte(`key`), []byte(`value`)
|
||||
|
||||
path := tempfile()
|
||||
defer os.Remove(path)
|
||||
|
||||
// Open in read-write mode.
|
||||
db, err := bolt.Open(path, 0666, nil)
|
||||
ok(t, db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket(bucket)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return b.Put(key, value)
|
||||
}))
|
||||
assert(t, db != nil, "")
|
||||
assert(t, !db.IsReadOnly(), "")
|
||||
ok(t, err)
|
||||
ok(t, db.Close())
|
||||
|
||||
// Open in read-only mode.
|
||||
db0, err := bolt.Open(path, 0666, &bolt.Options{ReadOnly: true})
|
||||
ok(t, err)
|
||||
defer db0.Close()
|
||||
|
||||
// Opening in read-write mode should return an error.
|
||||
_, err = bolt.Open(path, 0666, &bolt.Options{Timeout: time.Millisecond * 100})
|
||||
assert(t, err != nil, "")
|
||||
|
||||
// And again (in read-only mode).
|
||||
db1, err := bolt.Open(path, 0666, &bolt.Options{ReadOnly: true})
|
||||
ok(t, err)
|
||||
defer db1.Close()
|
||||
|
||||
// Verify both read-only databases are accessible.
|
||||
for _, db := range []*bolt.DB{db0, db1} {
|
||||
// Verify is is in read only mode indeed.
|
||||
assert(t, db.IsReadOnly(), "")
|
||||
|
||||
// Read-only databases should not allow updates.
|
||||
assert(t,
|
||||
bolt.ErrDatabaseReadOnly == db.Update(func(*bolt.Tx) error {
|
||||
panic(`should never get here`)
|
||||
}),
|
||||
"")
|
||||
|
||||
// Read-only databases should not allow beginning writable txns.
|
||||
_, err = db.Begin(true)
|
||||
assert(t, bolt.ErrDatabaseReadOnly == err, "")
|
||||
|
||||
// Verify the data.
|
||||
ok(t, db.View(func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket(bucket)
|
||||
if b == nil {
|
||||
return fmt.Errorf("expected bucket `%s`", string(bucket))
|
||||
}
|
||||
|
||||
got := string(b.Get(key))
|
||||
expected := string(value)
|
||||
if got != expected {
|
||||
return fmt.Errorf("expected `%s`, got `%s`", expected, got)
|
||||
}
|
||||
return nil
|
||||
}))
|
||||
}
|
||||
}
|
||||
|
||||
// TODO(benbjohnson): Test corruption at every byte of the first two pages.
|
||||
|
||||
// Ensure that a database cannot open a transaction when it's not open.
|
||||
func TestDB_Begin_DatabaseNotOpen(t *testing.T) {
|
||||
var db bolt.DB
|
||||
tx, err := db.Begin(false)
|
||||
assert(t, tx == nil, "")
|
||||
equals(t, err, bolt.ErrDatabaseNotOpen)
|
||||
}
|
||||
|
||||
// Ensure that a read-write transaction can be retrieved.
|
||||
func TestDB_BeginRW(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
tx, err := db.Begin(true)
|
||||
assert(t, tx != nil, "")
|
||||
ok(t, err)
|
||||
assert(t, tx.DB() == db.DB, "")
|
||||
equals(t, tx.Writable(), true)
|
||||
ok(t, tx.Commit())
|
||||
}
|
||||
|
||||
// Ensure that opening a transaction while the DB is closed returns an error.
|
||||
func TestDB_BeginRW_Closed(t *testing.T) {
|
||||
var db bolt.DB
|
||||
tx, err := db.Begin(true)
|
||||
equals(t, err, bolt.ErrDatabaseNotOpen)
|
||||
assert(t, tx == nil, "")
|
||||
}
|
||||
|
||||
func TestDB_Close_PendingTx_RW(t *testing.T) { testDB_Close_PendingTx(t, true) }
|
||||
func TestDB_Close_PendingTx_RO(t *testing.T) { testDB_Close_PendingTx(t, false) }
|
||||
|
||||
// Ensure that a database cannot close while transactions are open.
|
||||
func testDB_Close_PendingTx(t *testing.T, writable bool) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
// Start transaction.
|
||||
tx, err := db.Begin(true)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
|
||||
// Open update in separate goroutine.
|
||||
done := make(chan struct{})
|
||||
go func() {
|
||||
db.Close()
|
||||
close(done)
|
||||
}()
|
||||
|
||||
// Ensure database hasn't closed.
|
||||
time.Sleep(100 * time.Millisecond)
|
||||
select {
|
||||
case <-done:
|
||||
t.Fatal("database closed too early")
|
||||
default:
|
||||
}
|
||||
|
||||
// Commit transaction.
|
||||
if err := tx.Commit(); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
|
||||
// Ensure database closed now.
|
||||
time.Sleep(100 * time.Millisecond)
|
||||
select {
|
||||
case <-done:
|
||||
default:
|
||||
t.Fatal("database did not close")
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure a database can provide a transactional block.
|
||||
func TestDB_Update(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
err := db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
b.Put([]byte("foo"), []byte("bar"))
|
||||
b.Put([]byte("baz"), []byte("bat"))
|
||||
b.Delete([]byte("foo"))
|
||||
return nil
|
||||
})
|
||||
ok(t, err)
|
||||
err = db.View(func(tx *bolt.Tx) error {
|
||||
assert(t, tx.Bucket([]byte("widgets")).Get([]byte("foo")) == nil, "")
|
||||
equals(t, []byte("bat"), tx.Bucket([]byte("widgets")).Get([]byte("baz")))
|
||||
return nil
|
||||
})
|
||||
ok(t, err)
|
||||
}
|
||||
|
||||
// Ensure a closed database returns an error while running a transaction block
|
||||
func TestDB_Update_Closed(t *testing.T) {
|
||||
var db bolt.DB
|
||||
err := db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
return nil
|
||||
})
|
||||
equals(t, err, bolt.ErrDatabaseNotOpen)
|
||||
}
|
||||
|
||||
// Ensure a panic occurs while trying to commit a managed transaction.
|
||||
func TestDB_Update_ManualCommit(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
var ok bool
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
func() {
|
||||
defer func() {
|
||||
if r := recover(); r != nil {
|
||||
ok = true
|
||||
}
|
||||
}()
|
||||
tx.Commit()
|
||||
}()
|
||||
return nil
|
||||
})
|
||||
assert(t, ok, "expected panic")
|
||||
}
|
||||
|
||||
// Ensure a panic occurs while trying to rollback a managed transaction.
|
||||
func TestDB_Update_ManualRollback(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
var ok bool
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
func() {
|
||||
defer func() {
|
||||
if r := recover(); r != nil {
|
||||
ok = true
|
||||
}
|
||||
}()
|
||||
tx.Rollback()
|
||||
}()
|
||||
return nil
|
||||
})
|
||||
assert(t, ok, "expected panic")
|
||||
}
|
||||
|
||||
// Ensure a panic occurs while trying to commit a managed transaction.
|
||||
func TestDB_View_ManualCommit(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
var ok bool
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
func() {
|
||||
defer func() {
|
||||
if r := recover(); r != nil {
|
||||
ok = true
|
||||
}
|
||||
}()
|
||||
tx.Commit()
|
||||
}()
|
||||
return nil
|
||||
})
|
||||
assert(t, ok, "expected panic")
|
||||
}
|
||||
|
||||
// Ensure a panic occurs while trying to rollback a managed transaction.
|
||||
func TestDB_View_ManualRollback(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
var ok bool
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
func() {
|
||||
defer func() {
|
||||
if r := recover(); r != nil {
|
||||
ok = true
|
||||
}
|
||||
}()
|
||||
tx.Rollback()
|
||||
}()
|
||||
return nil
|
||||
})
|
||||
assert(t, ok, "expected panic")
|
||||
}
|
||||
|
||||
// Ensure a write transaction that panics does not hold open locks.
|
||||
func TestDB_Update_Panic(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
func() {
|
||||
defer func() {
|
||||
if r := recover(); r != nil {
|
||||
t.Log("recover: update", r)
|
||||
}
|
||||
}()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
panic("omg")
|
||||
})
|
||||
}()
|
||||
|
||||
// Verify we can update again.
|
||||
err := db.Update(func(tx *bolt.Tx) error {
|
||||
_, err := tx.CreateBucket([]byte("widgets"))
|
||||
return err
|
||||
})
|
||||
ok(t, err)
|
||||
|
||||
// Verify that our change persisted.
|
||||
err = db.Update(func(tx *bolt.Tx) error {
|
||||
assert(t, tx.Bucket([]byte("widgets")) != nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure a database can return an error through a read-only transactional block.
|
||||
func TestDB_View_Error(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
err := db.View(func(tx *bolt.Tx) error {
|
||||
return errors.New("xxx")
|
||||
})
|
||||
equals(t, errors.New("xxx"), err)
|
||||
}
|
||||
|
||||
// Ensure a read transaction that panics does not hold open locks.
|
||||
func TestDB_View_Panic(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
return nil
|
||||
})
|
||||
|
||||
func() {
|
||||
defer func() {
|
||||
if r := recover(); r != nil {
|
||||
t.Log("recover: view", r)
|
||||
}
|
||||
}()
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
assert(t, tx.Bucket([]byte("widgets")) != nil, "")
|
||||
panic("omg")
|
||||
})
|
||||
}()
|
||||
|
||||
// Verify that we can still use read transactions.
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
assert(t, tx.Bucket([]byte("widgets")) != nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that an error is returned when a database write fails.
|
||||
func TestDB_Commit_WriteFail(t *testing.T) {
|
||||
t.Skip("pending") // TODO(benbjohnson)
|
||||
}
|
||||
|
||||
// Ensure that DB stats can be returned.
|
||||
func TestDB_Stats(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
_, err := tx.CreateBucket([]byte("widgets"))
|
||||
return err
|
||||
})
|
||||
stats := db.Stats()
|
||||
equals(t, 2, stats.TxStats.PageCount)
|
||||
equals(t, 0, stats.FreePageN)
|
||||
equals(t, 2, stats.PendingPageN)
|
||||
}
|
||||
|
||||
// Ensure that database pages are in expected order and type.
|
||||
func TestDB_Consistency(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
_, err := tx.CreateBucket([]byte("widgets"))
|
||||
return err
|
||||
})
|
||||
|
||||
for i := 0; i < 10; i++ {
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
ok(t, tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar")))
|
||||
return nil
|
||||
})
|
||||
}
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
p, _ := tx.Page(0)
|
||||
assert(t, p != nil, "")
|
||||
equals(t, "meta", p.Type)
|
||||
|
||||
p, _ = tx.Page(1)
|
||||
assert(t, p != nil, "")
|
||||
equals(t, "meta", p.Type)
|
||||
|
||||
p, _ = tx.Page(2)
|
||||
assert(t, p != nil, "")
|
||||
equals(t, "free", p.Type)
|
||||
|
||||
p, _ = tx.Page(3)
|
||||
assert(t, p != nil, "")
|
||||
equals(t, "free", p.Type)
|
||||
|
||||
p, _ = tx.Page(4)
|
||||
assert(t, p != nil, "")
|
||||
equals(t, "leaf", p.Type)
|
||||
|
||||
p, _ = tx.Page(5)
|
||||
assert(t, p != nil, "")
|
||||
equals(t, "freelist", p.Type)
|
||||
|
||||
p, _ = tx.Page(6)
|
||||
assert(t, p == nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that DB stats can be substracted from one another.
|
||||
func TestDBStats_Sub(t *testing.T) {
|
||||
var a, b bolt.Stats
|
||||
a.TxStats.PageCount = 3
|
||||
a.FreePageN = 4
|
||||
b.TxStats.PageCount = 10
|
||||
b.FreePageN = 14
|
||||
diff := b.Sub(&a)
|
||||
equals(t, 7, diff.TxStats.PageCount)
|
||||
// free page stats are copied from the receiver and not subtracted
|
||||
equals(t, 14, diff.FreePageN)
|
||||
}
|
||||
|
||||
func ExampleDB_Update() {
|
||||
// Open the database.
|
||||
db, _ := bolt.Open(tempfile(), 0666, nil)
|
||||
defer os.Remove(db.Path())
|
||||
defer db.Close()
|
||||
|
||||
// Execute several commands within a write transaction.
|
||||
err := db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket([]byte("widgets"))
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if err := b.Put([]byte("foo"), []byte("bar")); err != nil {
|
||||
return err
|
||||
}
|
||||
return nil
|
||||
})
|
||||
|
||||
// If our transactional block didn't return an error then our data is saved.
|
||||
if err == nil {
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
value := tx.Bucket([]byte("widgets")).Get([]byte("foo"))
|
||||
fmt.Printf("The value of 'foo' is: %s\n", value)
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Output:
|
||||
// The value of 'foo' is: bar
|
||||
}
|
||||
|
||||
func ExampleDB_View() {
|
||||
// Open the database.
|
||||
db, _ := bolt.Open(tempfile(), 0666, nil)
|
||||
defer os.Remove(db.Path())
|
||||
defer db.Close()
|
||||
|
||||
// Insert data into a bucket.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("people"))
|
||||
b := tx.Bucket([]byte("people"))
|
||||
b.Put([]byte("john"), []byte("doe"))
|
||||
b.Put([]byte("susy"), []byte("que"))
|
||||
return nil
|
||||
})
|
||||
|
||||
// Access data from within a read-only transactional block.
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
v := tx.Bucket([]byte("people")).Get([]byte("john"))
|
||||
fmt.Printf("John's last name is %s.\n", v)
|
||||
return nil
|
||||
})
|
||||
|
||||
// Output:
|
||||
// John's last name is doe.
|
||||
}
|
||||
|
||||
func ExampleDB_Begin_ReadOnly() {
|
||||
// Open the database.
|
||||
db, _ := bolt.Open(tempfile(), 0666, nil)
|
||||
defer os.Remove(db.Path())
|
||||
defer db.Close()
|
||||
|
||||
// Create a bucket.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
_, err := tx.CreateBucket([]byte("widgets"))
|
||||
return err
|
||||
})
|
||||
|
||||
// Create several keys in a transaction.
|
||||
tx, _ := db.Begin(true)
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
b.Put([]byte("john"), []byte("blue"))
|
||||
b.Put([]byte("abby"), []byte("red"))
|
||||
b.Put([]byte("zephyr"), []byte("purple"))
|
||||
tx.Commit()
|
||||
|
||||
// Iterate over the values in sorted key order.
|
||||
tx, _ = db.Begin(false)
|
||||
c := tx.Bucket([]byte("widgets")).Cursor()
|
||||
for k, v := c.First(); k != nil; k, v = c.Next() {
|
||||
fmt.Printf("%s likes %s\n", k, v)
|
||||
}
|
||||
tx.Rollback()
|
||||
|
||||
// Output:
|
||||
// abby likes red
|
||||
// john likes blue
|
||||
// zephyr likes purple
|
||||
}
|
||||
|
||||
// TestDB represents a wrapper around a Bolt DB to handle temporary file
|
||||
// creation and automatic cleanup on close.
|
||||
type TestDB struct {
|
||||
*bolt.DB
|
||||
}
|
||||
|
||||
// NewTestDB returns a new instance of TestDB.
|
||||
func NewTestDB() *TestDB {
|
||||
db, err := bolt.Open(tempfile(), 0666, nil)
|
||||
if err != nil {
|
||||
panic("cannot open db: " + err.Error())
|
||||
}
|
||||
return &TestDB{db}
|
||||
}
|
||||
|
||||
// MustView executes a read-only function. Panic on error.
|
||||
func (db *TestDB) MustView(fn func(tx *bolt.Tx) error) {
|
||||
if err := db.DB.View(func(tx *bolt.Tx) error {
|
||||
return fn(tx)
|
||||
}); err != nil {
|
||||
panic(err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
// MustUpdate executes a read-write function. Panic on error.
|
||||
func (db *TestDB) MustUpdate(fn func(tx *bolt.Tx) error) {
|
||||
if err := db.DB.View(func(tx *bolt.Tx) error {
|
||||
return fn(tx)
|
||||
}); err != nil {
|
||||
panic(err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
// MustCreateBucket creates a new bucket. Panic on error.
|
||||
func (db *TestDB) MustCreateBucket(name []byte) {
|
||||
if err := db.Update(func(tx *bolt.Tx) error {
|
||||
_, err := tx.CreateBucket([]byte(name))
|
||||
return err
|
||||
}); err != nil {
|
||||
panic(err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
// Close closes the database and deletes the underlying file.
|
||||
func (db *TestDB) Close() {
|
||||
// Log statistics.
|
||||
if *statsFlag {
|
||||
db.PrintStats()
|
||||
}
|
||||
|
||||
// Check database consistency after every test.
|
||||
db.MustCheck()
|
||||
|
||||
// Close database and remove file.
|
||||
defer os.Remove(db.Path())
|
||||
db.DB.Close()
|
||||
}
|
||||
|
||||
// PrintStats prints the database stats
|
||||
func (db *TestDB) PrintStats() {
|
||||
var stats = db.Stats()
|
||||
fmt.Printf("[db] %-20s %-20s %-20s\n",
|
||||
fmt.Sprintf("pg(%d/%d)", stats.TxStats.PageCount, stats.TxStats.PageAlloc),
|
||||
fmt.Sprintf("cur(%d)", stats.TxStats.CursorCount),
|
||||
fmt.Sprintf("node(%d/%d)", stats.TxStats.NodeCount, stats.TxStats.NodeDeref),
|
||||
)
|
||||
fmt.Printf(" %-20s %-20s %-20s\n",
|
||||
fmt.Sprintf("rebal(%d/%v)", stats.TxStats.Rebalance, truncDuration(stats.TxStats.RebalanceTime)),
|
||||
fmt.Sprintf("spill(%d/%v)", stats.TxStats.Spill, truncDuration(stats.TxStats.SpillTime)),
|
||||
fmt.Sprintf("w(%d/%v)", stats.TxStats.Write, truncDuration(stats.TxStats.WriteTime)),
|
||||
)
|
||||
}
|
||||
|
||||
// MustCheck runs a consistency check on the database and panics if any errors are found.
|
||||
func (db *TestDB) MustCheck() {
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
// Collect all the errors.
|
||||
var errors []error
|
||||
for err := range tx.Check() {
|
||||
errors = append(errors, err)
|
||||
if len(errors) > 10 {
|
||||
break
|
||||
}
|
||||
}
|
||||
|
||||
// If errors occurred, copy the DB and print the errors.
|
||||
if len(errors) > 0 {
|
||||
var path = tempfile()
|
||||
tx.CopyFile(path, 0600)
|
||||
|
||||
// Print errors.
|
||||
fmt.Print("\n\n")
|
||||
fmt.Printf("consistency check failed (%d errors)\n", len(errors))
|
||||
for _, err := range errors {
|
||||
fmt.Println(err)
|
||||
}
|
||||
fmt.Println("")
|
||||
fmt.Println("db saved to:")
|
||||
fmt.Println(path)
|
||||
fmt.Print("\n\n")
|
||||
os.Exit(-1)
|
||||
}
|
||||
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// CopyTempFile copies a database to a temporary file.
|
||||
func (db *TestDB) CopyTempFile() {
|
||||
path := tempfile()
|
||||
db.View(func(tx *bolt.Tx) error { return tx.CopyFile(path, 0600) })
|
||||
fmt.Println("db copied to: ", path)
|
||||
}
|
||||
|
||||
// tempfile returns a temporary file path.
|
||||
func tempfile() string {
|
||||
f, _ := ioutil.TempFile("", "bolt-")
|
||||
f.Close()
|
||||
os.Remove(f.Name())
|
||||
return f.Name()
|
||||
}
|
||||
|
||||
// mustContainKeys checks that a bucket contains a given set of keys.
|
||||
func mustContainKeys(b *bolt.Bucket, m map[string]string) {
|
||||
found := make(map[string]string)
|
||||
b.ForEach(func(k, _ []byte) error {
|
||||
found[string(k)] = ""
|
||||
return nil
|
||||
})
|
||||
|
||||
// Check for keys found in bucket that shouldn't be there.
|
||||
var keys []string
|
||||
for k, _ := range found {
|
||||
if _, ok := m[string(k)]; !ok {
|
||||
keys = append(keys, k)
|
||||
}
|
||||
}
|
||||
if len(keys) > 0 {
|
||||
sort.Strings(keys)
|
||||
panic(fmt.Sprintf("keys found(%d): %s", len(keys), strings.Join(keys, ",")))
|
||||
}
|
||||
|
||||
// Check for keys not found in bucket that should be there.
|
||||
for k, _ := range m {
|
||||
if _, ok := found[string(k)]; !ok {
|
||||
keys = append(keys, k)
|
||||
}
|
||||
}
|
||||
if len(keys) > 0 {
|
||||
sort.Strings(keys)
|
||||
panic(fmt.Sprintf("keys not found(%d): %s", len(keys), strings.Join(keys, ",")))
|
||||
}
|
||||
}
|
||||
|
||||
func trunc(b []byte, length int) []byte {
|
||||
if length < len(b) {
|
||||
return b[:length]
|
||||
}
|
||||
return b
|
||||
}
|
||||
|
||||
func truncDuration(d time.Duration) string {
|
||||
return regexp.MustCompile(`^(\d+)(\.\d+)`).ReplaceAllString(d.String(), "$1")
|
||||
}
|
||||
|
||||
func fileSize(path string) int64 {
|
||||
fi, err := os.Stat(path)
|
||||
if err != nil {
|
||||
return 0
|
||||
}
|
||||
return fi.Size()
|
||||
}
|
||||
|
||||
func warn(v ...interface{}) { fmt.Fprintln(os.Stderr, v...) }
|
||||
func warnf(msg string, v ...interface{}) { fmt.Fprintf(os.Stderr, msg+"\n", v...) }
|
||||
|
||||
// u64tob converts a uint64 into an 8-byte slice.
|
||||
func u64tob(v uint64) []byte {
|
||||
b := make([]byte, 8)
|
||||
binary.BigEndian.PutUint64(b, v)
|
||||
return b
|
||||
}
|
||||
|
||||
// btou64 converts an 8-byte slice into an uint64.
|
||||
func btou64(b []byte) uint64 { return binary.BigEndian.Uint64(b) }
|
156
Godeps/_workspace/src/github.com/boltdb/bolt/freelist_test.go
generated
vendored
Normal file
156
Godeps/_workspace/src/github.com/boltdb/bolt/freelist_test.go
generated
vendored
Normal file
@@ -0,0 +1,156 @@
|
||||
package bolt
|
||||
|
||||
import (
|
||||
"math/rand"
|
||||
"reflect"
|
||||
"sort"
|
||||
"testing"
|
||||
"unsafe"
|
||||
)
|
||||
|
||||
// Ensure that a page is added to a transaction's freelist.
|
||||
func TestFreelist_free(t *testing.T) {
|
||||
f := newFreelist()
|
||||
f.free(100, &page{id: 12})
|
||||
if !reflect.DeepEqual([]pgid{12}, f.pending[100]) {
|
||||
t.Fatalf("exp=%v; got=%v", []pgid{12}, f.pending[100])
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a page and its overflow is added to a transaction's freelist.
|
||||
func TestFreelist_free_overflow(t *testing.T) {
|
||||
f := newFreelist()
|
||||
f.free(100, &page{id: 12, overflow: 3})
|
||||
if exp := []pgid{12, 13, 14, 15}; !reflect.DeepEqual(exp, f.pending[100]) {
|
||||
t.Fatalf("exp=%v; got=%v", exp, f.pending[100])
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a transaction's free pages can be released.
|
||||
func TestFreelist_release(t *testing.T) {
|
||||
f := newFreelist()
|
||||
f.free(100, &page{id: 12, overflow: 1})
|
||||
f.free(100, &page{id: 9})
|
||||
f.free(102, &page{id: 39})
|
||||
f.release(100)
|
||||
f.release(101)
|
||||
if exp := []pgid{9, 12, 13}; !reflect.DeepEqual(exp, f.ids) {
|
||||
t.Fatalf("exp=%v; got=%v", exp, f.ids)
|
||||
}
|
||||
|
||||
f.release(102)
|
||||
if exp := []pgid{9, 12, 13, 39}; !reflect.DeepEqual(exp, f.ids) {
|
||||
t.Fatalf("exp=%v; got=%v", exp, f.ids)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a freelist can find contiguous blocks of pages.
|
||||
func TestFreelist_allocate(t *testing.T) {
|
||||
f := &freelist{ids: []pgid{3, 4, 5, 6, 7, 9, 12, 13, 18}}
|
||||
if id := int(f.allocate(3)); id != 3 {
|
||||
t.Fatalf("exp=3; got=%v", id)
|
||||
}
|
||||
if id := int(f.allocate(1)); id != 6 {
|
||||
t.Fatalf("exp=6; got=%v", id)
|
||||
}
|
||||
if id := int(f.allocate(3)); id != 0 {
|
||||
t.Fatalf("exp=0; got=%v", id)
|
||||
}
|
||||
if id := int(f.allocate(2)); id != 12 {
|
||||
t.Fatalf("exp=12; got=%v", id)
|
||||
}
|
||||
if id := int(f.allocate(1)); id != 7 {
|
||||
t.Fatalf("exp=7; got=%v", id)
|
||||
}
|
||||
if id := int(f.allocate(0)); id != 0 {
|
||||
t.Fatalf("exp=0; got=%v", id)
|
||||
}
|
||||
if id := int(f.allocate(0)); id != 0 {
|
||||
t.Fatalf("exp=0; got=%v", id)
|
||||
}
|
||||
if exp := []pgid{9, 18}; !reflect.DeepEqual(exp, f.ids) {
|
||||
t.Fatalf("exp=%v; got=%v", exp, f.ids)
|
||||
}
|
||||
|
||||
if id := int(f.allocate(1)); id != 9 {
|
||||
t.Fatalf("exp=9; got=%v", id)
|
||||
}
|
||||
if id := int(f.allocate(1)); id != 18 {
|
||||
t.Fatalf("exp=18; got=%v", id)
|
||||
}
|
||||
if id := int(f.allocate(1)); id != 0 {
|
||||
t.Fatalf("exp=0; got=%v", id)
|
||||
}
|
||||
if exp := []pgid{}; !reflect.DeepEqual(exp, f.ids) {
|
||||
t.Fatalf("exp=%v; got=%v", exp, f.ids)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a freelist can deserialize from a freelist page.
|
||||
func TestFreelist_read(t *testing.T) {
|
||||
// Create a page.
|
||||
var buf [4096]byte
|
||||
page := (*page)(unsafe.Pointer(&buf[0]))
|
||||
page.flags = freelistPageFlag
|
||||
page.count = 2
|
||||
|
||||
// Insert 2 page ids.
|
||||
ids := (*[3]pgid)(unsafe.Pointer(&page.ptr))
|
||||
ids[0] = 23
|
||||
ids[1] = 50
|
||||
|
||||
// Deserialize page into a freelist.
|
||||
f := newFreelist()
|
||||
f.read(page)
|
||||
|
||||
// Ensure that there are two page ids in the freelist.
|
||||
if exp := []pgid{23, 50}; !reflect.DeepEqual(exp, f.ids) {
|
||||
t.Fatalf("exp=%v; got=%v", exp, f.ids)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a freelist can serialize into a freelist page.
|
||||
func TestFreelist_write(t *testing.T) {
|
||||
// Create a freelist and write it to a page.
|
||||
var buf [4096]byte
|
||||
f := &freelist{ids: []pgid{12, 39}, pending: make(map[txid][]pgid)}
|
||||
f.pending[100] = []pgid{28, 11}
|
||||
f.pending[101] = []pgid{3}
|
||||
p := (*page)(unsafe.Pointer(&buf[0]))
|
||||
f.write(p)
|
||||
|
||||
// Read the page back out.
|
||||
f2 := newFreelist()
|
||||
f2.read(p)
|
||||
|
||||
// Ensure that the freelist is correct.
|
||||
// All pages should be present and in reverse order.
|
||||
if exp := []pgid{3, 11, 12, 28, 39}; !reflect.DeepEqual(exp, f2.ids) {
|
||||
t.Fatalf("exp=%v; got=%v", exp, f2.ids)
|
||||
}
|
||||
}
|
||||
|
||||
func Benchmark_FreelistRelease10K(b *testing.B) { benchmark_FreelistRelease(b, 10000) }
|
||||
func Benchmark_FreelistRelease100K(b *testing.B) { benchmark_FreelistRelease(b, 100000) }
|
||||
func Benchmark_FreelistRelease1000K(b *testing.B) { benchmark_FreelistRelease(b, 1000000) }
|
||||
func Benchmark_FreelistRelease10000K(b *testing.B) { benchmark_FreelistRelease(b, 10000000) }
|
||||
|
||||
func benchmark_FreelistRelease(b *testing.B, size int) {
|
||||
ids := randomPgids(size)
|
||||
pending := randomPgids(len(ids) / 400)
|
||||
b.ResetTimer()
|
||||
for i := 0; i < b.N; i++ {
|
||||
f := &freelist{ids: ids, pending: map[txid][]pgid{1: pending}}
|
||||
f.release(1)
|
||||
}
|
||||
}
|
||||
|
||||
func randomPgids(n int) []pgid {
|
||||
rand.Seed(42)
|
||||
pgids := make(pgids, n)
|
||||
for i := range pgids {
|
||||
pgids[i] = pgid(rand.Int63())
|
||||
}
|
||||
sort.Sort(pgids)
|
||||
return pgids
|
||||
}
|
156
Godeps/_workspace/src/github.com/boltdb/bolt/node_test.go
generated
vendored
Normal file
156
Godeps/_workspace/src/github.com/boltdb/bolt/node_test.go
generated
vendored
Normal file
@@ -0,0 +1,156 @@
|
||||
package bolt
|
||||
|
||||
import (
|
||||
"testing"
|
||||
"unsafe"
|
||||
)
|
||||
|
||||
// Ensure that a node can insert a key/value.
|
||||
func TestNode_put(t *testing.T) {
|
||||
n := &node{inodes: make(inodes, 0), bucket: &Bucket{tx: &Tx{meta: &meta{pgid: 1}}}}
|
||||
n.put([]byte("baz"), []byte("baz"), []byte("2"), 0, 0)
|
||||
n.put([]byte("foo"), []byte("foo"), []byte("0"), 0, 0)
|
||||
n.put([]byte("bar"), []byte("bar"), []byte("1"), 0, 0)
|
||||
n.put([]byte("foo"), []byte("foo"), []byte("3"), 0, leafPageFlag)
|
||||
|
||||
if len(n.inodes) != 3 {
|
||||
t.Fatalf("exp=3; got=%d", len(n.inodes))
|
||||
}
|
||||
if k, v := n.inodes[0].key, n.inodes[0].value; string(k) != "bar" || string(v) != "1" {
|
||||
t.Fatalf("exp=<bar,1>; got=<%s,%s>", k, v)
|
||||
}
|
||||
if k, v := n.inodes[1].key, n.inodes[1].value; string(k) != "baz" || string(v) != "2" {
|
||||
t.Fatalf("exp=<baz,2>; got=<%s,%s>", k, v)
|
||||
}
|
||||
if k, v := n.inodes[2].key, n.inodes[2].value; string(k) != "foo" || string(v) != "3" {
|
||||
t.Fatalf("exp=<foo,3>; got=<%s,%s>", k, v)
|
||||
}
|
||||
if n.inodes[2].flags != uint32(leafPageFlag) {
|
||||
t.Fatalf("not a leaf: %d", n.inodes[2].flags)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a node can deserialize from a leaf page.
|
||||
func TestNode_read_LeafPage(t *testing.T) {
|
||||
// Create a page.
|
||||
var buf [4096]byte
|
||||
page := (*page)(unsafe.Pointer(&buf[0]))
|
||||
page.flags = leafPageFlag
|
||||
page.count = 2
|
||||
|
||||
// Insert 2 elements at the beginning. sizeof(leafPageElement) == 16
|
||||
nodes := (*[3]leafPageElement)(unsafe.Pointer(&page.ptr))
|
||||
nodes[0] = leafPageElement{flags: 0, pos: 32, ksize: 3, vsize: 4} // pos = sizeof(leafPageElement) * 2
|
||||
nodes[1] = leafPageElement{flags: 0, pos: 23, ksize: 10, vsize: 3} // pos = sizeof(leafPageElement) + 3 + 4
|
||||
|
||||
// Write data for the nodes at the end.
|
||||
data := (*[4096]byte)(unsafe.Pointer(&nodes[2]))
|
||||
copy(data[:], []byte("barfooz"))
|
||||
copy(data[7:], []byte("helloworldbye"))
|
||||
|
||||
// Deserialize page into a leaf.
|
||||
n := &node{}
|
||||
n.read(page)
|
||||
|
||||
// Check that there are two inodes with correct data.
|
||||
if !n.isLeaf {
|
||||
t.Fatal("expected leaf")
|
||||
}
|
||||
if len(n.inodes) != 2 {
|
||||
t.Fatalf("exp=2; got=%d", len(n.inodes))
|
||||
}
|
||||
if k, v := n.inodes[0].key, n.inodes[0].value; string(k) != "bar" || string(v) != "fooz" {
|
||||
t.Fatalf("exp=<bar,fooz>; got=<%s,%s>", k, v)
|
||||
}
|
||||
if k, v := n.inodes[1].key, n.inodes[1].value; string(k) != "helloworld" || string(v) != "bye" {
|
||||
t.Fatalf("exp=<helloworld,bye>; got=<%s,%s>", k, v)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a node can serialize into a leaf page.
|
||||
func TestNode_write_LeafPage(t *testing.T) {
|
||||
// Create a node.
|
||||
n := &node{isLeaf: true, inodes: make(inodes, 0), bucket: &Bucket{tx: &Tx{db: &DB{}, meta: &meta{pgid: 1}}}}
|
||||
n.put([]byte("susy"), []byte("susy"), []byte("que"), 0, 0)
|
||||
n.put([]byte("ricki"), []byte("ricki"), []byte("lake"), 0, 0)
|
||||
n.put([]byte("john"), []byte("john"), []byte("johnson"), 0, 0)
|
||||
|
||||
// Write it to a page.
|
||||
var buf [4096]byte
|
||||
p := (*page)(unsafe.Pointer(&buf[0]))
|
||||
n.write(p)
|
||||
|
||||
// Read the page back in.
|
||||
n2 := &node{}
|
||||
n2.read(p)
|
||||
|
||||
// Check that the two pages are the same.
|
||||
if len(n2.inodes) != 3 {
|
||||
t.Fatalf("exp=3; got=%d", len(n2.inodes))
|
||||
}
|
||||
if k, v := n2.inodes[0].key, n2.inodes[0].value; string(k) != "john" || string(v) != "johnson" {
|
||||
t.Fatalf("exp=<john,johnson>; got=<%s,%s>", k, v)
|
||||
}
|
||||
if k, v := n2.inodes[1].key, n2.inodes[1].value; string(k) != "ricki" || string(v) != "lake" {
|
||||
t.Fatalf("exp=<ricki,lake>; got=<%s,%s>", k, v)
|
||||
}
|
||||
if k, v := n2.inodes[2].key, n2.inodes[2].value; string(k) != "susy" || string(v) != "que" {
|
||||
t.Fatalf("exp=<susy,que>; got=<%s,%s>", k, v)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a node can split into appropriate subgroups.
|
||||
func TestNode_split(t *testing.T) {
|
||||
// Create a node.
|
||||
n := &node{inodes: make(inodes, 0), bucket: &Bucket{tx: &Tx{db: &DB{}, meta: &meta{pgid: 1}}}}
|
||||
n.put([]byte("00000001"), []byte("00000001"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000002"), []byte("00000002"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000003"), []byte("00000003"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000004"), []byte("00000004"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000005"), []byte("00000005"), []byte("0123456701234567"), 0, 0)
|
||||
|
||||
// Split between 2 & 3.
|
||||
n.split(100)
|
||||
|
||||
var parent = n.parent
|
||||
if len(parent.children) != 2 {
|
||||
t.Fatalf("exp=2; got=%d", len(parent.children))
|
||||
}
|
||||
if len(parent.children[0].inodes) != 2 {
|
||||
t.Fatalf("exp=2; got=%d", len(parent.children[0].inodes))
|
||||
}
|
||||
if len(parent.children[1].inodes) != 3 {
|
||||
t.Fatalf("exp=3; got=%d", len(parent.children[1].inodes))
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a page with the minimum number of inodes just returns a single node.
|
||||
func TestNode_split_MinKeys(t *testing.T) {
|
||||
// Create a node.
|
||||
n := &node{inodes: make(inodes, 0), bucket: &Bucket{tx: &Tx{db: &DB{}, meta: &meta{pgid: 1}}}}
|
||||
n.put([]byte("00000001"), []byte("00000001"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000002"), []byte("00000002"), []byte("0123456701234567"), 0, 0)
|
||||
|
||||
// Split.
|
||||
n.split(20)
|
||||
if n.parent != nil {
|
||||
t.Fatalf("expected nil parent")
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that a node that has keys that all fit on a page just returns one leaf.
|
||||
func TestNode_split_SinglePage(t *testing.T) {
|
||||
// Create a node.
|
||||
n := &node{inodes: make(inodes, 0), bucket: &Bucket{tx: &Tx{db: &DB{}, meta: &meta{pgid: 1}}}}
|
||||
n.put([]byte("00000001"), []byte("00000001"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000002"), []byte("00000002"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000003"), []byte("00000003"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000004"), []byte("00000004"), []byte("0123456701234567"), 0, 0)
|
||||
n.put([]byte("00000005"), []byte("00000005"), []byte("0123456701234567"), 0, 0)
|
||||
|
||||
// Split.
|
||||
n.split(4096)
|
||||
if n.parent != nil {
|
||||
t.Fatalf("expected nil parent")
|
||||
}
|
||||
}
|
72
Godeps/_workspace/src/github.com/boltdb/bolt/page_test.go
generated
vendored
Normal file
72
Godeps/_workspace/src/github.com/boltdb/bolt/page_test.go
generated
vendored
Normal file
@@ -0,0 +1,72 @@
|
||||
package bolt
|
||||
|
||||
import (
|
||||
"reflect"
|
||||
"sort"
|
||||
"testing"
|
||||
"testing/quick"
|
||||
)
|
||||
|
||||
// Ensure that the page type can be returned in human readable format.
|
||||
func TestPage_typ(t *testing.T) {
|
||||
if typ := (&page{flags: branchPageFlag}).typ(); typ != "branch" {
|
||||
t.Fatalf("exp=branch; got=%v", typ)
|
||||
}
|
||||
if typ := (&page{flags: leafPageFlag}).typ(); typ != "leaf" {
|
||||
t.Fatalf("exp=leaf; got=%v", typ)
|
||||
}
|
||||
if typ := (&page{flags: metaPageFlag}).typ(); typ != "meta" {
|
||||
t.Fatalf("exp=meta; got=%v", typ)
|
||||
}
|
||||
if typ := (&page{flags: freelistPageFlag}).typ(); typ != "freelist" {
|
||||
t.Fatalf("exp=freelist; got=%v", typ)
|
||||
}
|
||||
if typ := (&page{flags: 20000}).typ(); typ != "unknown<4e20>" {
|
||||
t.Fatalf("exp=unknown<4e20>; got=%v", typ)
|
||||
}
|
||||
}
|
||||
|
||||
// Ensure that the hexdump debugging function doesn't blow up.
|
||||
func TestPage_dump(t *testing.T) {
|
||||
(&page{id: 256}).hexdump(16)
|
||||
}
|
||||
|
||||
func TestPgids_merge(t *testing.T) {
|
||||
a := pgids{4, 5, 6, 10, 11, 12, 13, 27}
|
||||
b := pgids{1, 3, 8, 9, 25, 30}
|
||||
c := a.merge(b)
|
||||
if !reflect.DeepEqual(c, pgids{1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 25, 27, 30}) {
|
||||
t.Errorf("mismatch: %v", c)
|
||||
}
|
||||
|
||||
a = pgids{4, 5, 6, 10, 11, 12, 13, 27, 35, 36}
|
||||
b = pgids{8, 9, 25, 30}
|
||||
c = a.merge(b)
|
||||
if !reflect.DeepEqual(c, pgids{4, 5, 6, 8, 9, 10, 11, 12, 13, 25, 27, 30, 35, 36}) {
|
||||
t.Errorf("mismatch: %v", c)
|
||||
}
|
||||
}
|
||||
|
||||
func TestPgids_merge_quick(t *testing.T) {
|
||||
if err := quick.Check(func(a, b pgids) bool {
|
||||
// Sort incoming lists.
|
||||
sort.Sort(a)
|
||||
sort.Sort(b)
|
||||
|
||||
// Merge the two lists together.
|
||||
got := a.merge(b)
|
||||
|
||||
// The expected value should be the two lists combined and sorted.
|
||||
exp := append(a, b...)
|
||||
sort.Sort(exp)
|
||||
|
||||
if !reflect.DeepEqual(exp, got) {
|
||||
t.Errorf("\nexp=%+v\ngot=%+v\n", exp, got)
|
||||
return false
|
||||
}
|
||||
|
||||
return true
|
||||
}, nil); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
}
|
79
Godeps/_workspace/src/github.com/boltdb/bolt/quick_test.go
generated
vendored
Normal file
79
Godeps/_workspace/src/github.com/boltdb/bolt/quick_test.go
generated
vendored
Normal file
@@ -0,0 +1,79 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"flag"
|
||||
"fmt"
|
||||
"math/rand"
|
||||
"os"
|
||||
"reflect"
|
||||
"testing/quick"
|
||||
"time"
|
||||
)
|
||||
|
||||
// testing/quick defaults to 5 iterations and a random seed.
|
||||
// You can override these settings from the command line:
|
||||
//
|
||||
// -quick.count The number of iterations to perform.
|
||||
// -quick.seed The seed to use for randomizing.
|
||||
// -quick.maxitems The maximum number of items to insert into a DB.
|
||||
// -quick.maxksize The maximum size of a key.
|
||||
// -quick.maxvsize The maximum size of a value.
|
||||
//
|
||||
|
||||
var qcount, qseed, qmaxitems, qmaxksize, qmaxvsize int
|
||||
|
||||
func init() {
|
||||
flag.IntVar(&qcount, "quick.count", 5, "")
|
||||
flag.IntVar(&qseed, "quick.seed", int(time.Now().UnixNano())%100000, "")
|
||||
flag.IntVar(&qmaxitems, "quick.maxitems", 1000, "")
|
||||
flag.IntVar(&qmaxksize, "quick.maxksize", 1024, "")
|
||||
flag.IntVar(&qmaxvsize, "quick.maxvsize", 1024, "")
|
||||
flag.Parse()
|
||||
fmt.Fprintln(os.Stderr, "seed:", qseed)
|
||||
fmt.Fprintf(os.Stderr, "quick settings: count=%v, items=%v, ksize=%v, vsize=%v\n", qcount, qmaxitems, qmaxksize, qmaxvsize)
|
||||
}
|
||||
|
||||
func qconfig() *quick.Config {
|
||||
return &quick.Config{
|
||||
MaxCount: qcount,
|
||||
Rand: rand.New(rand.NewSource(int64(qseed))),
|
||||
}
|
||||
}
|
||||
|
||||
type testdata []testdataitem
|
||||
|
||||
func (t testdata) Len() int { return len(t) }
|
||||
func (t testdata) Swap(i, j int) { t[i], t[j] = t[j], t[i] }
|
||||
func (t testdata) Less(i, j int) bool { return bytes.Compare(t[i].Key, t[j].Key) == -1 }
|
||||
|
||||
func (t testdata) Generate(rand *rand.Rand, size int) reflect.Value {
|
||||
n := rand.Intn(qmaxitems-1) + 1
|
||||
items := make(testdata, n)
|
||||
for i := 0; i < n; i++ {
|
||||
item := &items[i]
|
||||
item.Key = randByteSlice(rand, 1, qmaxksize)
|
||||
item.Value = randByteSlice(rand, 0, qmaxvsize)
|
||||
}
|
||||
return reflect.ValueOf(items)
|
||||
}
|
||||
|
||||
type revtestdata []testdataitem
|
||||
|
||||
func (t revtestdata) Len() int { return len(t) }
|
||||
func (t revtestdata) Swap(i, j int) { t[i], t[j] = t[j], t[i] }
|
||||
func (t revtestdata) Less(i, j int) bool { return bytes.Compare(t[i].Key, t[j].Key) == 1 }
|
||||
|
||||
type testdataitem struct {
|
||||
Key []byte
|
||||
Value []byte
|
||||
}
|
||||
|
||||
func randByteSlice(rand *rand.Rand, minSize, maxSize int) []byte {
|
||||
n := rand.Intn(maxSize-minSize) + minSize
|
||||
b := make([]byte, n)
|
||||
for i := 0; i < n; i++ {
|
||||
b[i] = byte(rand.Intn(255))
|
||||
}
|
||||
return b
|
||||
}
|
327
Godeps/_workspace/src/github.com/boltdb/bolt/simulation_test.go
generated
vendored
Normal file
327
Godeps/_workspace/src/github.com/boltdb/bolt/simulation_test.go
generated
vendored
Normal file
@@ -0,0 +1,327 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"fmt"
|
||||
"math/rand"
|
||||
"sync"
|
||||
"testing"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt"
|
||||
)
|
||||
|
||||
func TestSimulate_1op_1p(t *testing.T) { testSimulate(t, 100, 1) }
|
||||
func TestSimulate_10op_1p(t *testing.T) { testSimulate(t, 10, 1) }
|
||||
func TestSimulate_100op_1p(t *testing.T) { testSimulate(t, 100, 1) }
|
||||
func TestSimulate_1000op_1p(t *testing.T) { testSimulate(t, 1000, 1) }
|
||||
func TestSimulate_10000op_1p(t *testing.T) { testSimulate(t, 10000, 1) }
|
||||
|
||||
func TestSimulate_10op_10p(t *testing.T) { testSimulate(t, 10, 10) }
|
||||
func TestSimulate_100op_10p(t *testing.T) { testSimulate(t, 100, 10) }
|
||||
func TestSimulate_1000op_10p(t *testing.T) { testSimulate(t, 1000, 10) }
|
||||
func TestSimulate_10000op_10p(t *testing.T) { testSimulate(t, 10000, 10) }
|
||||
|
||||
func TestSimulate_100op_100p(t *testing.T) { testSimulate(t, 100, 100) }
|
||||
func TestSimulate_1000op_100p(t *testing.T) { testSimulate(t, 1000, 100) }
|
||||
func TestSimulate_10000op_100p(t *testing.T) { testSimulate(t, 10000, 100) }
|
||||
|
||||
func TestSimulate_10000op_1000p(t *testing.T) { testSimulate(t, 10000, 1000) }
|
||||
|
||||
// Randomly generate operations on a given database with multiple clients to ensure consistency and thread safety.
|
||||
func testSimulate(t *testing.T, threadCount, parallelism int) {
|
||||
if testing.Short() {
|
||||
t.Skip("skipping test in short mode.")
|
||||
}
|
||||
|
||||
rand.Seed(int64(qseed))
|
||||
|
||||
// A list of operations that readers and writers can perform.
|
||||
var readerHandlers = []simulateHandler{simulateGetHandler}
|
||||
var writerHandlers = []simulateHandler{simulateGetHandler, simulatePutHandler}
|
||||
|
||||
var versions = make(map[int]*QuickDB)
|
||||
versions[1] = NewQuickDB()
|
||||
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
var mutex sync.Mutex
|
||||
|
||||
// Run n threads in parallel, each with their own operation.
|
||||
var wg sync.WaitGroup
|
||||
var threads = make(chan bool, parallelism)
|
||||
var i int
|
||||
for {
|
||||
threads <- true
|
||||
wg.Add(1)
|
||||
writable := ((rand.Int() % 100) < 20) // 20% writers
|
||||
|
||||
// Choose an operation to execute.
|
||||
var handler simulateHandler
|
||||
if writable {
|
||||
handler = writerHandlers[rand.Intn(len(writerHandlers))]
|
||||
} else {
|
||||
handler = readerHandlers[rand.Intn(len(readerHandlers))]
|
||||
}
|
||||
|
||||
// Execute a thread for the given operation.
|
||||
go func(writable bool, handler simulateHandler) {
|
||||
defer wg.Done()
|
||||
|
||||
// Start transaction.
|
||||
tx, err := db.Begin(writable)
|
||||
if err != nil {
|
||||
t.Fatal("tx begin: ", err)
|
||||
}
|
||||
|
||||
// Obtain current state of the dataset.
|
||||
mutex.Lock()
|
||||
var qdb = versions[tx.ID()]
|
||||
if writable {
|
||||
qdb = versions[tx.ID()-1].Copy()
|
||||
}
|
||||
mutex.Unlock()
|
||||
|
||||
// Make sure we commit/rollback the tx at the end and update the state.
|
||||
if writable {
|
||||
defer func() {
|
||||
mutex.Lock()
|
||||
versions[tx.ID()] = qdb
|
||||
mutex.Unlock()
|
||||
|
||||
ok(t, tx.Commit())
|
||||
}()
|
||||
} else {
|
||||
defer tx.Rollback()
|
||||
}
|
||||
|
||||
// Ignore operation if we don't have data yet.
|
||||
if qdb == nil {
|
||||
return
|
||||
}
|
||||
|
||||
// Execute handler.
|
||||
handler(tx, qdb)
|
||||
|
||||
// Release a thread back to the scheduling loop.
|
||||
<-threads
|
||||
}(writable, handler)
|
||||
|
||||
i++
|
||||
if i > threadCount {
|
||||
break
|
||||
}
|
||||
}
|
||||
|
||||
// Wait until all threads are done.
|
||||
wg.Wait()
|
||||
}
|
||||
|
||||
type simulateHandler func(tx *bolt.Tx, qdb *QuickDB)
|
||||
|
||||
// Retrieves a key from the database and verifies that it is what is expected.
|
||||
func simulateGetHandler(tx *bolt.Tx, qdb *QuickDB) {
|
||||
// Randomly retrieve an existing exist.
|
||||
keys := qdb.Rand()
|
||||
if len(keys) == 0 {
|
||||
return
|
||||
}
|
||||
|
||||
// Retrieve root bucket.
|
||||
b := tx.Bucket(keys[0])
|
||||
if b == nil {
|
||||
panic(fmt.Sprintf("bucket[0] expected: %08x\n", trunc(keys[0], 4)))
|
||||
}
|
||||
|
||||
// Drill into nested buckets.
|
||||
for _, key := range keys[1 : len(keys)-1] {
|
||||
b = b.Bucket(key)
|
||||
if b == nil {
|
||||
panic(fmt.Sprintf("bucket[n] expected: %v -> %v\n", keys, key))
|
||||
}
|
||||
}
|
||||
|
||||
// Verify key/value on the final bucket.
|
||||
expected := qdb.Get(keys)
|
||||
actual := b.Get(keys[len(keys)-1])
|
||||
if !bytes.Equal(actual, expected) {
|
||||
fmt.Println("=== EXPECTED ===")
|
||||
fmt.Println(expected)
|
||||
fmt.Println("=== ACTUAL ===")
|
||||
fmt.Println(actual)
|
||||
fmt.Println("=== END ===")
|
||||
panic("value mismatch")
|
||||
}
|
||||
}
|
||||
|
||||
// Inserts a key into the database.
|
||||
func simulatePutHandler(tx *bolt.Tx, qdb *QuickDB) {
|
||||
var err error
|
||||
keys, value := randKeys(), randValue()
|
||||
|
||||
// Retrieve root bucket.
|
||||
b := tx.Bucket(keys[0])
|
||||
if b == nil {
|
||||
b, err = tx.CreateBucket(keys[0])
|
||||
if err != nil {
|
||||
panic("create bucket: " + err.Error())
|
||||
}
|
||||
}
|
||||
|
||||
// Create nested buckets, if necessary.
|
||||
for _, key := range keys[1 : len(keys)-1] {
|
||||
child := b.Bucket(key)
|
||||
if child != nil {
|
||||
b = child
|
||||
} else {
|
||||
b, err = b.CreateBucket(key)
|
||||
if err != nil {
|
||||
panic("create bucket: " + err.Error())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Insert into database.
|
||||
if err := b.Put(keys[len(keys)-1], value); err != nil {
|
||||
panic("put: " + err.Error())
|
||||
}
|
||||
|
||||
// Insert into in-memory database.
|
||||
qdb.Put(keys, value)
|
||||
}
|
||||
|
||||
// QuickDB is an in-memory database that replicates the functionality of the
|
||||
// Bolt DB type except that it is entirely in-memory. It is meant for testing
|
||||
// that the Bolt database is consistent.
|
||||
type QuickDB struct {
|
||||
sync.RWMutex
|
||||
m map[string]interface{}
|
||||
}
|
||||
|
||||
// NewQuickDB returns an instance of QuickDB.
|
||||
func NewQuickDB() *QuickDB {
|
||||
return &QuickDB{m: make(map[string]interface{})}
|
||||
}
|
||||
|
||||
// Get retrieves the value at a key path.
|
||||
func (db *QuickDB) Get(keys [][]byte) []byte {
|
||||
db.RLock()
|
||||
defer db.RUnlock()
|
||||
|
||||
m := db.m
|
||||
for _, key := range keys[:len(keys)-1] {
|
||||
value := m[string(key)]
|
||||
if value == nil {
|
||||
return nil
|
||||
}
|
||||
switch value := value.(type) {
|
||||
case map[string]interface{}:
|
||||
m = value
|
||||
case []byte:
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
// Only return if it's a simple value.
|
||||
if value, ok := m[string(keys[len(keys)-1])].([]byte); ok {
|
||||
return value
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// Put inserts a value into a key path.
|
||||
func (db *QuickDB) Put(keys [][]byte, value []byte) {
|
||||
db.Lock()
|
||||
defer db.Unlock()
|
||||
|
||||
// Build buckets all the way down the key path.
|
||||
m := db.m
|
||||
for _, key := range keys[:len(keys)-1] {
|
||||
if _, ok := m[string(key)].([]byte); ok {
|
||||
return // Keypath intersects with a simple value. Do nothing.
|
||||
}
|
||||
|
||||
if m[string(key)] == nil {
|
||||
m[string(key)] = make(map[string]interface{})
|
||||
}
|
||||
m = m[string(key)].(map[string]interface{})
|
||||
}
|
||||
|
||||
// Insert value into the last key.
|
||||
m[string(keys[len(keys)-1])] = value
|
||||
}
|
||||
|
||||
// Rand returns a random key path that points to a simple value.
|
||||
func (db *QuickDB) Rand() [][]byte {
|
||||
db.RLock()
|
||||
defer db.RUnlock()
|
||||
if len(db.m) == 0 {
|
||||
return nil
|
||||
}
|
||||
var keys [][]byte
|
||||
db.rand(db.m, &keys)
|
||||
return keys
|
||||
}
|
||||
|
||||
func (db *QuickDB) rand(m map[string]interface{}, keys *[][]byte) {
|
||||
i, index := 0, rand.Intn(len(m))
|
||||
for k, v := range m {
|
||||
if i == index {
|
||||
*keys = append(*keys, []byte(k))
|
||||
if v, ok := v.(map[string]interface{}); ok {
|
||||
db.rand(v, keys)
|
||||
}
|
||||
return
|
||||
}
|
||||
i++
|
||||
}
|
||||
panic("quickdb rand: out-of-range")
|
||||
}
|
||||
|
||||
// Copy copies the entire database.
|
||||
func (db *QuickDB) Copy() *QuickDB {
|
||||
db.RLock()
|
||||
defer db.RUnlock()
|
||||
return &QuickDB{m: db.copy(db.m)}
|
||||
}
|
||||
|
||||
func (db *QuickDB) copy(m map[string]interface{}) map[string]interface{} {
|
||||
clone := make(map[string]interface{}, len(m))
|
||||
for k, v := range m {
|
||||
switch v := v.(type) {
|
||||
case map[string]interface{}:
|
||||
clone[k] = db.copy(v)
|
||||
default:
|
||||
clone[k] = v
|
||||
}
|
||||
}
|
||||
return clone
|
||||
}
|
||||
|
||||
func randKey() []byte {
|
||||
var min, max = 1, 1024
|
||||
n := rand.Intn(max-min) + min
|
||||
b := make([]byte, n)
|
||||
for i := 0; i < n; i++ {
|
||||
b[i] = byte(rand.Intn(255))
|
||||
}
|
||||
return b
|
||||
}
|
||||
|
||||
func randKeys() [][]byte {
|
||||
var keys [][]byte
|
||||
var count = rand.Intn(2) + 2
|
||||
for i := 0; i < count; i++ {
|
||||
keys = append(keys, randKey())
|
||||
}
|
||||
return keys
|
||||
}
|
||||
|
||||
func randValue() []byte {
|
||||
n := rand.Intn(8192)
|
||||
b := make([]byte, n)
|
||||
for i := 0; i < n; i++ {
|
||||
b[i] = byte(rand.Intn(255))
|
||||
}
|
||||
return b
|
||||
}
|
47
Godeps/_workspace/src/github.com/boltdb/bolt/tx.go
generated
vendored
47
Godeps/_workspace/src/github.com/boltdb/bolt/tx.go
generated
vendored
@@ -29,14 +29,6 @@ type Tx struct {
|
||||
pages map[pgid]*page
|
||||
stats TxStats
|
||||
commitHandlers []func()
|
||||
|
||||
// WriteFlag specifies the flag for write-related methods like WriteTo().
|
||||
// Tx opens the database file with the specified flag to copy the data.
|
||||
//
|
||||
// By default, the flag is unset, which works well for mostly in-memory
|
||||
// workloads. For databases that are much larger than available RAM,
|
||||
// set the flag to syscall.O_DIRECT to avoid trashing the page cache.
|
||||
WriteFlag int
|
||||
}
|
||||
|
||||
// init initializes the transaction.
|
||||
@@ -95,21 +87,18 @@ func (tx *Tx) Stats() TxStats {
|
||||
|
||||
// Bucket retrieves a bucket by name.
|
||||
// Returns nil if the bucket does not exist.
|
||||
// The bucket instance is only valid for the lifetime of the transaction.
|
||||
func (tx *Tx) Bucket(name []byte) *Bucket {
|
||||
return tx.root.Bucket(name)
|
||||
}
|
||||
|
||||
// CreateBucket creates a new bucket.
|
||||
// Returns an error if the bucket already exists, if the bucket name is blank, or if the bucket name is too long.
|
||||
// The bucket instance is only valid for the lifetime of the transaction.
|
||||
func (tx *Tx) CreateBucket(name []byte) (*Bucket, error) {
|
||||
return tx.root.CreateBucket(name)
|
||||
}
|
||||
|
||||
// CreateBucketIfNotExists creates a new bucket if it doesn't already exist.
|
||||
// Returns an error if the bucket name is blank, or if the bucket name is too long.
|
||||
// The bucket instance is only valid for the lifetime of the transaction.
|
||||
func (tx *Tx) CreateBucketIfNotExists(name []byte) (*Bucket, error) {
|
||||
return tx.root.CreateBucketIfNotExists(name)
|
||||
}
|
||||
@@ -168,8 +157,6 @@ func (tx *Tx) Commit() error {
|
||||
// Free the old root bucket.
|
||||
tx.meta.root.root = tx.root.root
|
||||
|
||||
opgid := tx.meta.pgid
|
||||
|
||||
// Free the freelist and allocate new pages for it. This will overestimate
|
||||
// the size of the freelist but not underestimate the size (which would be bad).
|
||||
tx.db.freelist.free(tx.meta.txid, tx.db.page(tx.meta.freelist))
|
||||
@@ -184,14 +171,6 @@ func (tx *Tx) Commit() error {
|
||||
}
|
||||
tx.meta.freelist = p.id
|
||||
|
||||
// If the high water mark has moved up then attempt to grow the database.
|
||||
if tx.meta.pgid > opgid {
|
||||
if err := tx.db.grow(int(tx.meta.pgid+1) * tx.db.pageSize); err != nil {
|
||||
tx.rollback()
|
||||
return err
|
||||
}
|
||||
}
|
||||
|
||||
// Write dirty pages to disk.
|
||||
startTime = time.Now()
|
||||
if err := tx.write(); err != nil {
|
||||
@@ -257,8 +236,7 @@ func (tx *Tx) close() {
|
||||
var freelistPendingN = tx.db.freelist.pending_count()
|
||||
var freelistAlloc = tx.db.freelist.size()
|
||||
|
||||
// Remove transaction ref & writer lock.
|
||||
tx.db.rwtx = nil
|
||||
// Remove writer lock.
|
||||
tx.db.rwlock.Unlock()
|
||||
|
||||
// Merge statistics.
|
||||
@@ -272,16 +250,11 @@ func (tx *Tx) close() {
|
||||
} else {
|
||||
tx.db.removeTx(tx)
|
||||
}
|
||||
|
||||
// Clear all references.
|
||||
tx.db = nil
|
||||
tx.meta = nil
|
||||
tx.root = Bucket{tx: tx}
|
||||
tx.pages = nil
|
||||
}
|
||||
|
||||
// Copy writes the entire database to a writer.
|
||||
// This function exists for backwards compatibility. Use WriteTo() instead.
|
||||
// This function exists for backwards compatibility. Use WriteTo() in
|
||||
func (tx *Tx) Copy(w io.Writer) error {
|
||||
_, err := tx.WriteTo(w)
|
||||
return err
|
||||
@@ -290,18 +263,21 @@ func (tx *Tx) Copy(w io.Writer) error {
|
||||
// WriteTo writes the entire database to a writer.
|
||||
// If err == nil then exactly tx.Size() bytes will be written into the writer.
|
||||
func (tx *Tx) WriteTo(w io.Writer) (n int64, err error) {
|
||||
// Attempt to open reader with WriteFlag
|
||||
f, err := os.OpenFile(tx.db.path, os.O_RDONLY|tx.WriteFlag, 0)
|
||||
if err != nil {
|
||||
return 0, err
|
||||
// Attempt to open reader directly.
|
||||
var f *os.File
|
||||
if f, err = os.OpenFile(tx.db.path, os.O_RDONLY|odirect, 0); err != nil {
|
||||
// Fallback to a regular open if that doesn't work.
|
||||
if f, err = os.OpenFile(tx.db.path, os.O_RDONLY, 0); err != nil {
|
||||
return 0, err
|
||||
}
|
||||
}
|
||||
defer func() { _ = f.Close() }()
|
||||
|
||||
// Copy the meta pages.
|
||||
tx.db.metalock.Lock()
|
||||
n, err = io.CopyN(w, f, int64(tx.db.pageSize*2))
|
||||
tx.db.metalock.Unlock()
|
||||
if err != nil {
|
||||
_ = f.Close()
|
||||
return n, fmt.Errorf("meta copy: %s", err)
|
||||
}
|
||||
|
||||
@@ -309,6 +285,7 @@ func (tx *Tx) WriteTo(w io.Writer) (n int64, err error) {
|
||||
wn, err := io.CopyN(w, f, tx.Size()-int64(tx.db.pageSize*2))
|
||||
n += wn
|
||||
if err != nil {
|
||||
_ = f.Close()
|
||||
return n, err
|
||||
}
|
||||
|
||||
@@ -515,7 +492,7 @@ func (tx *Tx) writeMeta() error {
|
||||
}
|
||||
|
||||
// page returns a reference to the page with a given id.
|
||||
// If page has been written to then a temporary buffered page is returned.
|
||||
// If page has been written to then a temporary bufferred page is returned.
|
||||
func (tx *Tx) page(id pgid) *page {
|
||||
// Check the dirty pages first.
|
||||
if tx.pages != nil {
|
||||
|
456
Godeps/_workspace/src/github.com/boltdb/bolt/tx_test.go
generated
vendored
Normal file
456
Godeps/_workspace/src/github.com/boltdb/bolt/tx_test.go
generated
vendored
Normal file
@@ -0,0 +1,456 @@
|
||||
package bolt_test
|
||||
|
||||
import (
|
||||
"errors"
|
||||
"fmt"
|
||||
"os"
|
||||
"testing"
|
||||
|
||||
"github.com/coreos/etcd/Godeps/_workspace/src/github.com/boltdb/bolt"
|
||||
)
|
||||
|
||||
// Ensure that committing a closed transaction returns an error.
|
||||
func TestTx_Commit_Closed(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
tx, _ := db.Begin(true)
|
||||
tx.CreateBucket([]byte("foo"))
|
||||
ok(t, tx.Commit())
|
||||
equals(t, tx.Commit(), bolt.ErrTxClosed)
|
||||
}
|
||||
|
||||
// Ensure that rolling back a closed transaction returns an error.
|
||||
func TestTx_Rollback_Closed(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
tx, _ := db.Begin(true)
|
||||
ok(t, tx.Rollback())
|
||||
equals(t, tx.Rollback(), bolt.ErrTxClosed)
|
||||
}
|
||||
|
||||
// Ensure that committing a read-only transaction returns an error.
|
||||
func TestTx_Commit_ReadOnly(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
tx, _ := db.Begin(false)
|
||||
equals(t, tx.Commit(), bolt.ErrTxNotWritable)
|
||||
}
|
||||
|
||||
// Ensure that a transaction can retrieve a cursor on the root bucket.
|
||||
func TestTx_Cursor(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.CreateBucket([]byte("woojits"))
|
||||
c := tx.Cursor()
|
||||
|
||||
k, v := c.First()
|
||||
equals(t, "widgets", string(k))
|
||||
assert(t, v == nil, "")
|
||||
|
||||
k, v = c.Next()
|
||||
equals(t, "woojits", string(k))
|
||||
assert(t, v == nil, "")
|
||||
|
||||
k, v = c.Next()
|
||||
assert(t, k == nil, "")
|
||||
assert(t, v == nil, "")
|
||||
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that creating a bucket with a read-only transaction returns an error.
|
||||
func TestTx_CreateBucket_ReadOnly(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket([]byte("foo"))
|
||||
assert(t, b == nil, "")
|
||||
equals(t, bolt.ErrTxNotWritable, err)
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that creating a bucket on a closed transaction returns an error.
|
||||
func TestTx_CreateBucket_Closed(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
tx, _ := db.Begin(true)
|
||||
tx.Commit()
|
||||
b, err := tx.CreateBucket([]byte("foo"))
|
||||
assert(t, b == nil, "")
|
||||
equals(t, bolt.ErrTxClosed, err)
|
||||
}
|
||||
|
||||
// Ensure that a Tx can retrieve a bucket.
|
||||
func TestTx_Bucket(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
assert(t, b != nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a Tx retrieving a non-existent key returns nil.
|
||||
func TestTx_Get_Missing(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
value := tx.Bucket([]byte("widgets")).Get([]byte("no_such_key"))
|
||||
assert(t, value == nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a bucket can be created and retrieved.
|
||||
func TestTx_CreateBucket(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
// Create a bucket.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket([]byte("widgets"))
|
||||
assert(t, b != nil, "")
|
||||
ok(t, err)
|
||||
return nil
|
||||
})
|
||||
|
||||
// Read the bucket through a separate transaction.
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
assert(t, b != nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a bucket can be created if it doesn't already exist.
|
||||
func TestTx_CreateBucketIfNotExists(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucketIfNotExists([]byte("widgets"))
|
||||
assert(t, b != nil, "")
|
||||
ok(t, err)
|
||||
|
||||
b, err = tx.CreateBucketIfNotExists([]byte("widgets"))
|
||||
assert(t, b != nil, "")
|
||||
ok(t, err)
|
||||
|
||||
b, err = tx.CreateBucketIfNotExists([]byte{})
|
||||
assert(t, b == nil, "")
|
||||
equals(t, bolt.ErrBucketNameRequired, err)
|
||||
|
||||
b, err = tx.CreateBucketIfNotExists(nil)
|
||||
assert(t, b == nil, "")
|
||||
equals(t, bolt.ErrBucketNameRequired, err)
|
||||
return nil
|
||||
})
|
||||
|
||||
// Read the bucket through a separate transaction.
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
assert(t, b != nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a bucket cannot be created twice.
|
||||
func TestTx_CreateBucket_Exists(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
// Create a bucket.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket([]byte("widgets"))
|
||||
assert(t, b != nil, "")
|
||||
ok(t, err)
|
||||
return nil
|
||||
})
|
||||
|
||||
// Create the same bucket again.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket([]byte("widgets"))
|
||||
assert(t, b == nil, "")
|
||||
equals(t, bolt.ErrBucketExists, err)
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a bucket is created with a non-blank name.
|
||||
func TestTx_CreateBucket_NameRequired(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
b, err := tx.CreateBucket(nil)
|
||||
assert(t, b == nil, "")
|
||||
equals(t, bolt.ErrBucketNameRequired, err)
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that a bucket can be deleted.
|
||||
func TestTx_DeleteBucket(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
|
||||
// Create a bucket and add a value.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
return nil
|
||||
})
|
||||
|
||||
// Delete the bucket and make sure we can't get the value.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
ok(t, tx.DeleteBucket([]byte("widgets")))
|
||||
assert(t, tx.Bucket([]byte("widgets")) == nil, "")
|
||||
return nil
|
||||
})
|
||||
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
// Create the bucket again and make sure there's not a phantom value.
|
||||
b, err := tx.CreateBucket([]byte("widgets"))
|
||||
assert(t, b != nil, "")
|
||||
ok(t, err)
|
||||
assert(t, tx.Bucket([]byte("widgets")).Get([]byte("foo")) == nil, "")
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that deleting a bucket on a closed transaction returns an error.
|
||||
func TestTx_DeleteBucket_Closed(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
tx, _ := db.Begin(true)
|
||||
tx.Commit()
|
||||
equals(t, tx.DeleteBucket([]byte("foo")), bolt.ErrTxClosed)
|
||||
}
|
||||
|
||||
// Ensure that deleting a bucket with a read-only transaction returns an error.
|
||||
func TestTx_DeleteBucket_ReadOnly(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
equals(t, tx.DeleteBucket([]byte("foo")), bolt.ErrTxNotWritable)
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that nothing happens when deleting a bucket that doesn't exist.
|
||||
func TestTx_DeleteBucket_NotFound(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
equals(t, bolt.ErrBucketNotFound, tx.DeleteBucket([]byte("widgets")))
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that no error is returned when a tx.ForEach function does not return
|
||||
// an error.
|
||||
func TestTx_ForEach_NoError(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
|
||||
equals(t, nil, tx.ForEach(func(name []byte, b *bolt.Bucket) error {
|
||||
return nil
|
||||
}))
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that an error is returned when a tx.ForEach function returns an error.
|
||||
func TestTx_ForEach_WithError(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
|
||||
err := errors.New("foo")
|
||||
equals(t, err, tx.ForEach(func(name []byte, b *bolt.Bucket) error {
|
||||
return err
|
||||
}))
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
// Ensure that Tx commit handlers are called after a transaction successfully commits.
|
||||
func TestTx_OnCommit(t *testing.T) {
|
||||
var x int
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.OnCommit(func() { x += 1 })
|
||||
tx.OnCommit(func() { x += 2 })
|
||||
_, err := tx.CreateBucket([]byte("widgets"))
|
||||
return err
|
||||
})
|
||||
equals(t, 3, x)
|
||||
}
|
||||
|
||||
// Ensure that Tx commit handlers are NOT called after a transaction rolls back.
|
||||
func TestTx_OnCommit_Rollback(t *testing.T) {
|
||||
var x int
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.OnCommit(func() { x += 1 })
|
||||
tx.OnCommit(func() { x += 2 })
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
return errors.New("rollback this commit")
|
||||
})
|
||||
equals(t, 0, x)
|
||||
}
|
||||
|
||||
// Ensure that the database can be copied to a file path.
|
||||
func TestTx_CopyFile(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
var dest = tempfile()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("baz"), []byte("bat"))
|
||||
return nil
|
||||
})
|
||||
|
||||
ok(t, db.View(func(tx *bolt.Tx) error { return tx.CopyFile(dest, 0600) }))
|
||||
|
||||
db2, err := bolt.Open(dest, 0600, nil)
|
||||
ok(t, err)
|
||||
defer db2.Close()
|
||||
|
||||
db2.View(func(tx *bolt.Tx) error {
|
||||
equals(t, []byte("bar"), tx.Bucket([]byte("widgets")).Get([]byte("foo")))
|
||||
equals(t, []byte("bat"), tx.Bucket([]byte("widgets")).Get([]byte("baz")))
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
type failWriterError struct{}
|
||||
|
||||
func (failWriterError) Error() string {
|
||||
return "error injected for tests"
|
||||
}
|
||||
|
||||
type failWriter struct {
|
||||
// fail after this many bytes
|
||||
After int
|
||||
}
|
||||
|
||||
func (f *failWriter) Write(p []byte) (n int, err error) {
|
||||
n = len(p)
|
||||
if n > f.After {
|
||||
n = f.After
|
||||
err = failWriterError{}
|
||||
}
|
||||
f.After -= n
|
||||
return n, err
|
||||
}
|
||||
|
||||
// Ensure that Copy handles write errors right.
|
||||
func TestTx_CopyFile_Error_Meta(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("baz"), []byte("bat"))
|
||||
return nil
|
||||
})
|
||||
|
||||
err := db.View(func(tx *bolt.Tx) error { return tx.Copy(&failWriter{}) })
|
||||
equals(t, err.Error(), "meta copy: error injected for tests")
|
||||
}
|
||||
|
||||
// Ensure that Copy handles write errors right.
|
||||
func TestTx_CopyFile_Error_Normal(t *testing.T) {
|
||||
db := NewTestDB()
|
||||
defer db.Close()
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("baz"), []byte("bat"))
|
||||
return nil
|
||||
})
|
||||
|
||||
err := db.View(func(tx *bolt.Tx) error { return tx.Copy(&failWriter{3 * db.Info().PageSize}) })
|
||||
equals(t, err.Error(), "error injected for tests")
|
||||
}
|
||||
|
||||
func ExampleTx_Rollback() {
|
||||
// Open the database.
|
||||
db, _ := bolt.Open(tempfile(), 0666, nil)
|
||||
defer os.Remove(db.Path())
|
||||
defer db.Close()
|
||||
|
||||
// Create a bucket.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
_, err := tx.CreateBucket([]byte("widgets"))
|
||||
return err
|
||||
})
|
||||
|
||||
// Set a value for a key.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
return tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
})
|
||||
|
||||
// Update the key but rollback the transaction so it never saves.
|
||||
tx, _ := db.Begin(true)
|
||||
b := tx.Bucket([]byte("widgets"))
|
||||
b.Put([]byte("foo"), []byte("baz"))
|
||||
tx.Rollback()
|
||||
|
||||
// Ensure that our original value is still set.
|
||||
db.View(func(tx *bolt.Tx) error {
|
||||
value := tx.Bucket([]byte("widgets")).Get([]byte("foo"))
|
||||
fmt.Printf("The value for 'foo' is still: %s\n", value)
|
||||
return nil
|
||||
})
|
||||
|
||||
// Output:
|
||||
// The value for 'foo' is still: bar
|
||||
}
|
||||
|
||||
func ExampleTx_CopyFile() {
|
||||
// Open the database.
|
||||
db, _ := bolt.Open(tempfile(), 0666, nil)
|
||||
defer os.Remove(db.Path())
|
||||
defer db.Close()
|
||||
|
||||
// Create a bucket and a key.
|
||||
db.Update(func(tx *bolt.Tx) error {
|
||||
tx.CreateBucket([]byte("widgets"))
|
||||
tx.Bucket([]byte("widgets")).Put([]byte("foo"), []byte("bar"))
|
||||
return nil
|
||||
})
|
||||
|
||||
// Copy the database to another file.
|
||||
toFile := tempfile()
|
||||
db.View(func(tx *bolt.Tx) error { return tx.CopyFile(toFile, 0666) })
|
||||
defer os.Remove(toFile)
|
||||
|
||||
// Open the cloned database.
|
||||
db2, _ := bolt.Open(toFile, 0666, nil)
|
||||
defer db2.Close()
|
||||
|
||||
// Ensure that the key exists in the copy.
|
||||
db2.View(func(tx *bolt.Tx) error {
|
||||
value := tx.Bucket([]byte("widgets")).Get([]byte("foo"))
|
||||
fmt.Printf("The value for 'foo' in the clone is: %s\n", value)
|
||||
return nil
|
||||
})
|
||||
|
||||
// Output:
|
||||
// The value for 'foo' in the clone is: bar
|
||||
}
|
1
Godeps/_workspace/src/github.com/bradfitz/http2/.gitignore
generated
vendored
Normal file
1
Godeps/_workspace/src/github.com/bradfitz/http2/.gitignore
generated
vendored
Normal file
@@ -0,0 +1 @@
|
||||
*~
|
19
Godeps/_workspace/src/github.com/bradfitz/http2/AUTHORS
generated
vendored
Normal file
19
Godeps/_workspace/src/github.com/bradfitz/http2/AUTHORS
generated
vendored
Normal file
@@ -0,0 +1,19 @@
|
||||
# This file is like Go's AUTHORS file: it lists Copyright holders.
|
||||
# The list of humans who have contributd is in the CONTRIBUTORS file.
|
||||
#
|
||||
# To contribute to this project, because it will eventually be folded
|
||||
# back in to Go itself, you need to submit a CLA:
|
||||
#
|
||||
# http://golang.org/doc/contribute.html#copyright
|
||||
#
|
||||
# Then you get added to CONTRIBUTORS and you or your company get added
|
||||
# to the AUTHORS file.
|
||||
|
||||
Blake Mizerany <blake.mizerany@gmail.com> github=bmizerany
|
||||
Daniel Morsing <daniel.morsing@gmail.com> github=DanielMorsing
|
||||
Gabriel Aszalos <gabriel.aszalos@gmail.com> github=gbbr
|
||||
Google, Inc.
|
||||
Keith Rarick <kr@xph.us> github=kr
|
||||
Matthew Keenan <tank.en.mate@gmail.com> <github@mattkeenan.net> github=mattkeenan
|
||||
Matt Layher <mdlayher@gmail.com> github=mdlayher
|
||||
Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com> github=tatsuhiro-t
|
19
Godeps/_workspace/src/github.com/bradfitz/http2/CONTRIBUTORS
generated
vendored
Normal file
19
Godeps/_workspace/src/github.com/bradfitz/http2/CONTRIBUTORS
generated
vendored
Normal file
@@ -0,0 +1,19 @@
|
||||
# This file is like Go's CONTRIBUTORS file: it lists humans.
|
||||
# The list of copyright holders (which may be companies) are in the AUTHORS file.
|
||||
#
|
||||
# To contribute to this project, because it will eventually be folded
|
||||
# back in to Go itself, you need to submit a CLA:
|
||||
#
|
||||
# http://golang.org/doc/contribute.html#copyright
|
||||
#
|
||||
# Then you get added to CONTRIBUTORS and you or your company get added
|
||||
# to the AUTHORS file.
|
||||
|
||||
Blake Mizerany <blake.mizerany@gmail.com> github=bmizerany
|
||||
Brad Fitzpatrick <bradfitz@golang.org> github=bradfitz
|
||||
Daniel Morsing <daniel.morsing@gmail.com> github=DanielMorsing
|
||||
Gabriel Aszalos <gabriel.aszalos@gmail.com> github=gbbr
|
||||
Keith Rarick <kr@xph.us> github=kr
|
||||
Matthew Keenan <tank.en.mate@gmail.com> <github@mattkeenan.net> github=mattkeenan
|
||||
Matt Layher <mdlayher@gmail.com> github=mdlayher
|
||||
Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com> github=tatsuhiro-t
|
@@ -17,15 +17,8 @@ RUN apt-get install -y --no-install-recommends \
|
||||
libcunit1-dev libssl-dev libxml2-dev libevent-dev \
|
||||
automake autoconf
|
||||
|
||||
# The list of packages nghttp2 recommends for h2load:
|
||||
RUN apt-get install -y --no-install-recommends make binutils \
|
||||
autoconf automake autotools-dev \
|
||||
libtool pkg-config zlib1g-dev libcunit1-dev libssl-dev libxml2-dev \
|
||||
libev-dev libevent-dev libjansson-dev libjemalloc-dev \
|
||||
cython python3.4-dev python-setuptools
|
||||
|
||||
# Note: setting NGHTTP2_VER before the git clone, so an old git clone isn't cached:
|
||||
ENV NGHTTP2_VER 895da9a
|
||||
ENV NGHTTP2_VER af24f8394e43f4
|
||||
RUN cd /root && git clone https://github.com/tatsuhiro-t/nghttp2.git
|
||||
|
||||
WORKDIR /root/nghttp2
|
||||
@@ -38,9 +31,9 @@ RUN make
|
||||
RUN make install
|
||||
|
||||
WORKDIR /root
|
||||
RUN wget http://curl.haxx.se/download/curl-7.45.0.tar.gz
|
||||
RUN tar -zxvf curl-7.45.0.tar.gz
|
||||
WORKDIR /root/curl-7.45.0
|
||||
RUN wget http://curl.haxx.se/download/curl-7.40.0.tar.gz
|
||||
RUN tar -zxvf curl-7.40.0.tar.gz
|
||||
WORKDIR /root/curl-7.40.0
|
||||
RUN ./configure --with-ssl --with-nghttp2=/usr/local
|
||||
RUN make
|
||||
RUN make install
|
5
Godeps/_workspace/src/github.com/bradfitz/http2/HACKING
generated
vendored
Normal file
5
Godeps/_workspace/src/github.com/bradfitz/http2/HACKING
generated
vendored
Normal file
@@ -0,0 +1,5 @@
|
||||
We only accept contributions from users who have gone through Go's
|
||||
contribution process (signed a CLA).
|
||||
|
||||
Please acknowledge whether you have (and use the same email) if
|
||||
sending a pull request.
|
7
Godeps/_workspace/src/github.com/bradfitz/http2/LICENSE
generated
vendored
Normal file
7
Godeps/_workspace/src/github.com/bradfitz/http2/LICENSE
generated
vendored
Normal file
@@ -0,0 +1,7 @@
|
||||
Copyright 2014 Google & the Go AUTHORS
|
||||
|
||||
Go AUTHORS are:
|
||||
See https://code.google.com/p/go/source/browse/AUTHORS
|
||||
|
||||
Licensed under the terms of Go itself:
|
||||
https://code.google.com/p/go/source/browse/LICENSE
|
@@ -10,11 +10,8 @@ Status:
|
||||
* The client work has just started but shares a lot of code
|
||||
is coming along much quicker.
|
||||
|
||||
Docs are at https://godoc.org/golang.org/x/net/http2
|
||||
Docs are at https://godoc.org/github.com/bradfitz/http2
|
||||
|
||||
Demo test server at https://http2.golang.org/
|
||||
|
||||
Help & bug reports welcome!
|
||||
|
||||
Contributing: https://golang.org/doc/contribute.html
|
||||
Bugs: https://golang.org/issue/new?title=x/net/http2:+
|
||||
Help & bug reports welcome.
|
75
Godeps/_workspace/src/github.com/bradfitz/http2/buffer.go
generated
vendored
Normal file
75
Godeps/_workspace/src/github.com/bradfitz/http2/buffer.go
generated
vendored
Normal file
@@ -0,0 +1,75 @@
|
||||
// Copyright 2014 The Go Authors.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
package http2
|
||||
|
||||
import (
|
||||
"errors"
|
||||
)
|
||||
|
||||
// buffer is an io.ReadWriteCloser backed by a fixed size buffer.
|
||||
// It never allocates, but moves old data as new data is written.
|
||||
type buffer struct {
|
||||
buf []byte
|
||||
r, w int
|
||||
closed bool
|
||||
err error // err to return to reader
|
||||
}
|
||||
|
||||
var (
|
||||
errReadEmpty = errors.New("read from empty buffer")
|
||||
errWriteFull = errors.New("write on full buffer")
|
||||
)
|
||||
|
||||
// Read copies bytes from the buffer into p.
|
||||
// It is an error to read when no data is available.
|
||||
func (b *buffer) Read(p []byte) (n int, err error) {
|
||||
n = copy(p, b.buf[b.r:b.w])
|
||||
b.r += n
|
||||
if b.closed && b.r == b.w {
|
||||
err = b.err
|
||||
} else if b.r == b.w && n == 0 {
|
||||
err = errReadEmpty
|
||||
}
|
||||
return n, err
|
||||
}
|
||||
|
||||
// Len returns the number of bytes of the unread portion of the buffer.
|
||||
func (b *buffer) Len() int {
|
||||
return b.w - b.r
|
||||
}
|
||||
|
||||
// Write copies bytes from p into the buffer.
|
||||
// It is an error to write more data than the buffer can hold.
|
||||
func (b *buffer) Write(p []byte) (n int, err error) {
|
||||
if b.closed {
|
||||
return 0, errors.New("closed")
|
||||
}
|
||||
|
||||
// Slide existing data to beginning.
|
||||
if b.r > 0 && len(p) > len(b.buf)-b.w {
|
||||
copy(b.buf, b.buf[b.r:b.w])
|
||||
b.w -= b.r
|
||||
b.r = 0
|
||||
}
|
||||
|
||||
// Write new data.
|
||||
n = copy(b.buf[b.w:], p)
|
||||
b.w += n
|
||||
if n < len(p) {
|
||||
err = errWriteFull
|
||||
}
|
||||
return n, err
|
||||
}
|
||||
|
||||
// Close marks the buffer as closed. Future calls to Write will
|
||||
// return an error. Future calls to Read, once the buffer is
|
||||
// empty, will return err.
|
||||
func (b *buffer) Close(err error) {
|
||||
if !b.closed {
|
||||
b.closed = true
|
||||
b.err = err
|
||||
}
|
||||
}
|
73
Godeps/_workspace/src/github.com/bradfitz/http2/buffer_test.go
generated
vendored
Normal file
73
Godeps/_workspace/src/github.com/bradfitz/http2/buffer_test.go
generated
vendored
Normal file
@@ -0,0 +1,73 @@
|
||||
// Copyright 2014 The Go Authors.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
package http2
|
||||
|
||||
import (
|
||||
"io"
|
||||
"reflect"
|
||||
"testing"
|
||||
)
|
||||
|
||||
var bufferReadTests = []struct {
|
||||
buf buffer
|
||||
read, wn int
|
||||
werr error
|
||||
wp []byte
|
||||
wbuf buffer
|
||||
}{
|
||||
{
|
||||
buffer{[]byte{'a', 0}, 0, 1, false, nil},
|
||||
5, 1, nil, []byte{'a'},
|
||||
buffer{[]byte{'a', 0}, 1, 1, false, nil},
|
||||
},
|
||||
{
|
||||
buffer{[]byte{'a', 0}, 0, 1, true, io.EOF},
|
||||
5, 1, io.EOF, []byte{'a'},
|
||||
buffer{[]byte{'a', 0}, 1, 1, true, io.EOF},
|
||||
},
|
||||
{
|
||||
buffer{[]byte{0, 'a'}, 1, 2, false, nil},
|
||||
5, 1, nil, []byte{'a'},
|
||||
buffer{[]byte{0, 'a'}, 2, 2, false, nil},
|
||||
},
|
||||
{
|
||||
buffer{[]byte{0, 'a'}, 1, 2, true, io.EOF},
|
||||
5, 1, io.EOF, []byte{'a'},
|
||||
buffer{[]byte{0, 'a'}, 2, 2, true, io.EOF},
|
||||
},
|
||||
{
|
||||
buffer{[]byte{}, 0, 0, false, nil},
|
||||
5, 0, errReadEmpty, []byte{},
|
||||
buffer{[]byte{}, 0, 0, false, nil},
|
||||
},
|
||||
{
|
||||
buffer{[]byte{}, 0, 0, true, io.EOF},
|
||||
5, 0, io.EOF, []byte{},
|
||||
buffer{[]byte{}, 0, 0, true, io.EOF},
|
||||
},
|
||||
}
|
||||
|
||||
func TestBufferRead(t *testing.T) {
|
||||
for i, tt := range bufferReadTests {
|
||||
read := make([]byte, tt.read)
|
||||
n, err := tt.buf.Read(read)
|
||||
if n != tt.wn {
|
||||
t.Errorf("#%d: wn = %d want %d", i, n, tt.wn)
|
||||
continue
|
||||
}
|
||||
if err != tt.werr {
|
||||
t.Errorf("#%d: werr = %v want %v", i, err, tt.werr)
|
||||
continue
|
||||
}
|
||||
read = read[:n]
|
||||
if !reflect.DeepEqual(read, tt.wp) {
|
||||
t.Errorf("#%d: read = %+v want %+v", i, read, tt.wp)
|
||||
}
|
||||
if !reflect.DeepEqual(tt.buf, tt.wbuf) {
|
||||
t.Errorf("#%d: buf = %+v want %+v", i, tt.buf, tt.wbuf)
|
||||
}
|
||||
}
|
||||
}
|
@@ -1,6 +1,7 @@
|
||||
// Copyright 2014 The Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
// Copyright 2014 The Go Authors.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
package http2
|
||||
|
||||
@@ -75,16 +76,3 @@ func (e StreamError) Error() string {
|
||||
type goAwayFlowError struct{}
|
||||
|
||||
func (goAwayFlowError) Error() string { return "connection exceeded flow control window size" }
|
||||
|
||||
// connErrorReason wraps a ConnectionError with an informative error about why it occurs.
|
||||
|
||||
// Errors of this type are only returned by the frame parser functions
|
||||
// and converted into ConnectionError(ErrCodeProtocol).
|
||||
type connError struct {
|
||||
Code ErrCode
|
||||
Reason string
|
||||
}
|
||||
|
||||
func (e connError) Error() string {
|
||||
return fmt.Sprintf("http2: connection error: %v: %v", e.Code, e.Reason)
|
||||
}
|
@@ -1,6 +1,9 @@
|
||||
// Copyright 2014 The Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
package http2
|
||||
|
@@ -1,6 +1,7 @@
|
||||
// Copyright 2014 The Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
// Copyright 2014 The Go Authors.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
// Flow control
|
||||
|
@@ -1,6 +1,7 @@
|
||||
// Copyright 2014 The Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
// Copyright 2014 The Go Authors.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
package http2
|
||||
|
@@ -1,6 +1,7 @@
|
||||
// Copyright 2014 The Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
// Copyright 2014 The Go Authors.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
package http2
|
||||
|
||||
@@ -10,7 +11,6 @@ import (
|
||||
"errors"
|
||||
"fmt"
|
||||
"io"
|
||||
"log"
|
||||
"sync"
|
||||
)
|
||||
|
||||
@@ -85,8 +85,8 @@ const (
|
||||
// Continuation Frame
|
||||
FlagContinuationEndHeaders Flags = 0x4
|
||||
|
||||
FlagPushPromiseEndHeaders Flags = 0x4
|
||||
FlagPushPromisePadded Flags = 0x8
|
||||
FlagPushPromiseEndHeaders = 0x4
|
||||
FlagPushPromisePadded = 0x8
|
||||
)
|
||||
|
||||
var flagName = map[FrameType]map[Flags]string{
|
||||
@@ -172,12 +172,6 @@ func (h FrameHeader) Header() FrameHeader { return h }
|
||||
func (h FrameHeader) String() string {
|
||||
var buf bytes.Buffer
|
||||
buf.WriteString("[FrameHeader ")
|
||||
h.writeDebug(&buf)
|
||||
buf.WriteByte(']')
|
||||
return buf.String()
|
||||
}
|
||||
|
||||
func (h FrameHeader) writeDebug(buf *bytes.Buffer) {
|
||||
buf.WriteString(h.Type.String())
|
||||
if h.Flags != 0 {
|
||||
buf.WriteString(" flags=")
|
||||
@@ -194,14 +188,15 @@ func (h FrameHeader) writeDebug(buf *bytes.Buffer) {
|
||||
if name != "" {
|
||||
buf.WriteString(name)
|
||||
} else {
|
||||
fmt.Fprintf(buf, "0x%x", 1<<i)
|
||||
fmt.Fprintf(&buf, "0x%x", 1<<i)
|
||||
}
|
||||
}
|
||||
}
|
||||
if h.StreamID != 0 {
|
||||
fmt.Fprintf(buf, " stream=%d", h.StreamID)
|
||||
fmt.Fprintf(&buf, " stream=%d", h.StreamID)
|
||||
}
|
||||
fmt.Fprintf(buf, " len=%d", h.Length)
|
||||
fmt.Fprintf(&buf, " len=%d]", h.Length)
|
||||
return buf.String()
|
||||
}
|
||||
|
||||
func (h *FrameHeader) checkValid() {
|
||||
@@ -261,11 +256,6 @@ type Frame interface {
|
||||
type Framer struct {
|
||||
r io.Reader
|
||||
lastFrame Frame
|
||||
errReason string
|
||||
|
||||
// lastHeaderStream is non-zero if the last frame was an
|
||||
// unfinished HEADERS/CONTINUATION.
|
||||
lastHeaderStream uint32
|
||||
|
||||
maxReadSize uint32
|
||||
headerBuf [frameHeaderLen]byte
|
||||
@@ -282,29 +272,18 @@ type Framer struct {
|
||||
wbuf []byte
|
||||
|
||||
// AllowIllegalWrites permits the Framer's Write methods to
|
||||
// write frames that do not conform to the HTTP/2 spec. This
|
||||
// write frames that do not conform to the HTTP/2 spec. This
|
||||
// permits using the Framer to test other HTTP/2
|
||||
// implementations' conformance to the spec.
|
||||
// If false, the Write methods will prefer to return an error
|
||||
// rather than comply.
|
||||
AllowIllegalWrites bool
|
||||
|
||||
// AllowIllegalReads permits the Framer's ReadFrame method
|
||||
// to return non-compliant frames or frame orders.
|
||||
// This is for testing and permits using the Framer to test
|
||||
// other HTTP/2 implementations' conformance to the spec.
|
||||
AllowIllegalReads bool
|
||||
|
||||
// TODO: track which type of frame & with which flags was sent
|
||||
// last. Then return an error (unless AllowIllegalWrites) if
|
||||
// we're in the middle of a header block and a
|
||||
// non-Continuation or Continuation on a different stream is
|
||||
// attempted to be written.
|
||||
|
||||
logReads bool
|
||||
|
||||
debugFramer *Framer // only use for logging written writes
|
||||
debugFramerBuf *bytes.Buffer
|
||||
}
|
||||
|
||||
func (f *Framer) startWrite(ftype FrameType, flags Flags, streamID uint32) {
|
||||
@@ -332,10 +311,6 @@ func (f *Framer) endWrite() error {
|
||||
byte(length>>16),
|
||||
byte(length>>8),
|
||||
byte(length))
|
||||
if logFrameWrites {
|
||||
f.logWrite()
|
||||
}
|
||||
|
||||
n, err := f.w.Write(f.wbuf)
|
||||
if err == nil && n != len(f.wbuf) {
|
||||
err = io.ErrShortWrite
|
||||
@@ -343,24 +318,6 @@ func (f *Framer) endWrite() error {
|
||||
return err
|
||||
}
|
||||
|
||||
func (f *Framer) logWrite() {
|
||||
if f.debugFramer == nil {
|
||||
f.debugFramerBuf = new(bytes.Buffer)
|
||||
f.debugFramer = NewFramer(nil, f.debugFramerBuf)
|
||||
f.debugFramer.logReads = false // we log it ourselves, saying "wrote" below
|
||||
// Let us read anything, even if we accidentally wrote it
|
||||
// in the wrong order:
|
||||
f.debugFramer.AllowIllegalReads = true
|
||||
}
|
||||
f.debugFramerBuf.Write(f.wbuf)
|
||||
fr, err := f.debugFramer.ReadFrame()
|
||||
if err != nil {
|
||||
log.Printf("http2: Framer %p: failed to decode just-written frame", f)
|
||||
return
|
||||
}
|
||||
log.Printf("http2: Framer %p: wrote %v", f, summarizeFrame(fr))
|
||||
}
|
||||
|
||||
func (f *Framer) writeByte(v byte) { f.wbuf = append(f.wbuf, v) }
|
||||
func (f *Framer) writeBytes(v []byte) { f.wbuf = append(f.wbuf, v...) }
|
||||
func (f *Framer) writeUint16(v uint16) { f.wbuf = append(f.wbuf, byte(v>>8), byte(v)) }
|
||||
@@ -376,9 +333,8 @@ const (
|
||||
// NewFramer returns a Framer that writes frames to w and reads them from r.
|
||||
func NewFramer(w io.Writer, r io.Reader) *Framer {
|
||||
fr := &Framer{
|
||||
w: w,
|
||||
r: r,
|
||||
logReads: logFrameReads,
|
||||
w: w,
|
||||
r: r,
|
||||
}
|
||||
fr.getReadBuf = func(size uint32) []byte {
|
||||
if cap(fr.readBuf) >= int(size) {
|
||||
@@ -406,22 +362,10 @@ func (fr *Framer) SetMaxReadFrameSize(v uint32) {
|
||||
// sends a frame that is larger than declared with SetMaxReadFrameSize.
|
||||
var ErrFrameTooLarge = errors.New("http2: frame too large")
|
||||
|
||||
// terminalReadFrameError reports whether err is an unrecoverable
|
||||
// error from ReadFrame and no other frames should be read.
|
||||
func terminalReadFrameError(err error) bool {
|
||||
if _, ok := err.(StreamError); ok {
|
||||
return false
|
||||
}
|
||||
return err != nil
|
||||
}
|
||||
|
||||
// ReadFrame reads a single frame. The returned Frame is only valid
|
||||
// until the next call to ReadFrame.
|
||||
//
|
||||
// If the frame is larger than previously set with SetMaxReadFrameSize, the
|
||||
// returned error is ErrFrameTooLarge. Other errors may be of type
|
||||
// ConnectionError, StreamError, or anything else from from the underlying
|
||||
// reader.
|
||||
// If the frame is larger than previously set with SetMaxReadFrameSize,
|
||||
// the returned error is ErrFrameTooLarge.
|
||||
func (fr *Framer) ReadFrame() (Frame, error) {
|
||||
if fr.lastFrame != nil {
|
||||
fr.lastFrame.invalidate()
|
||||
@@ -439,66 +383,10 @@ func (fr *Framer) ReadFrame() (Frame, error) {
|
||||
}
|
||||
f, err := typeFrameParser(fh.Type)(fh, payload)
|
||||
if err != nil {
|
||||
if ce, ok := err.(connError); ok {
|
||||
return nil, fr.connError(ce.Code, ce.Reason)
|
||||
}
|
||||
return nil, err
|
||||
}
|
||||
if err := fr.checkFrameOrder(f); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if fr.logReads {
|
||||
log.Printf("http2: Framer %p: read %v", fr, summarizeFrame(f))
|
||||
}
|
||||
return f, nil
|
||||
}
|
||||
|
||||
// connError returns ConnectionError(code) but first
|
||||
// stashes away a public reason to the caller can optionally relay it
|
||||
// to the peer before hanging up on them. This might help others debug
|
||||
// their implementations.
|
||||
func (fr *Framer) connError(code ErrCode, reason string) error {
|
||||
fr.errReason = reason
|
||||
return ConnectionError(code)
|
||||
}
|
||||
|
||||
// checkFrameOrder reports an error if f is an invalid frame to return
|
||||
// next from ReadFrame. Mostly it checks whether HEADERS and
|
||||
// CONTINUATION frames are contiguous.
|
||||
func (fr *Framer) checkFrameOrder(f Frame) error {
|
||||
last := fr.lastFrame
|
||||
fr.lastFrame = f
|
||||
if fr.AllowIllegalReads {
|
||||
return nil
|
||||
}
|
||||
|
||||
fh := f.Header()
|
||||
if fr.lastHeaderStream != 0 {
|
||||
if fh.Type != FrameContinuation {
|
||||
return fr.connError(ErrCodeProtocol,
|
||||
fmt.Sprintf("got %s for stream %d; expected CONTINUATION following %s for stream %d",
|
||||
fh.Type, fh.StreamID,
|
||||
last.Header().Type, fr.lastHeaderStream))
|
||||
}
|
||||
if fh.StreamID != fr.lastHeaderStream {
|
||||
return fr.connError(ErrCodeProtocol,
|
||||
fmt.Sprintf("got CONTINUATION for stream %d; expected stream %d",
|
||||
fh.StreamID, fr.lastHeaderStream))
|
||||
}
|
||||
} else if fh.Type == FrameContinuation {
|
||||
return fr.connError(ErrCodeProtocol, fmt.Sprintf("unexpected CONTINUATION for stream %d", fh.StreamID))
|
||||
}
|
||||
|
||||
switch fh.Type {
|
||||
case FrameHeaders, FrameContinuation:
|
||||
if fh.Flags.Has(FlagHeadersEndHeaders) {
|
||||
fr.lastHeaderStream = 0
|
||||
} else {
|
||||
fr.lastHeaderStream = fh.StreamID
|
||||
}
|
||||
}
|
||||
|
||||
return nil
|
||||
return f, nil
|
||||
}
|
||||
|
||||
// A DataFrame conveys arbitrary, variable-length sequences of octets
|
||||
@@ -529,7 +417,7 @@ func parseDataFrame(fh FrameHeader, payload []byte) (Frame, error) {
|
||||
// field is 0x0, the recipient MUST respond with a
|
||||
// connection error (Section 5.4.1) of type
|
||||
// PROTOCOL_ERROR.
|
||||
return nil, connError{ErrCodeProtocol, "DATA frame with stream ID 0"}
|
||||
return nil, ConnectionError(ErrCodeProtocol)
|
||||
}
|
||||
f := &DataFrame{
|
||||
FrameHeader: fh,
|
||||
@@ -547,7 +435,7 @@ func parseDataFrame(fh FrameHeader, payload []byte) (Frame, error) {
|
||||
// length of the frame payload, the recipient MUST
|
||||
// treat this as a connection error.
|
||||
// Filed: https://github.com/http2/http2-spec/issues/610
|
||||
return nil, connError{ErrCodeProtocol, "pad size larger than data payload"}
|
||||
return nil, ConnectionError(ErrCodeProtocol)
|
||||
}
|
||||
f.data = payload[:len(payload)-int(padSize)]
|
||||
return f, nil
|
||||
@@ -687,8 +575,6 @@ type PingFrame struct {
|
||||
Data [8]byte
|
||||
}
|
||||
|
||||
func (f *PingFrame) IsAck() bool { return f.Flags.Has(FlagPingAck) }
|
||||
|
||||
func parsePingFrame(fh FrameHeader, payload []byte) (Frame, error) {
|
||||
if len(payload) != 8 {
|
||||
return nil, ConnectionError(ErrCodeFrameSize)
|
||||
@@ -777,7 +663,7 @@ func parseUnknownFrame(fh FrameHeader, p []byte) (Frame, error) {
|
||||
// See http://http2.github.io/http2-spec/#rfc.section.6.9
|
||||
type WindowUpdateFrame struct {
|
||||
FrameHeader
|
||||
Increment uint32 // never read with high bit set
|
||||
Increment uint32
|
||||
}
|
||||
|
||||
func parseWindowUpdateFrame(fh FrameHeader, p []byte) (Frame, error) {
|
||||
@@ -854,7 +740,7 @@ func parseHeadersFrame(fh FrameHeader, p []byte) (_ Frame, err error) {
|
||||
// is received whose stream identifier field is 0x0, the recipient MUST
|
||||
// respond with a connection error (Section 5.4.1) of type
|
||||
// PROTOCOL_ERROR.
|
||||
return nil, connError{ErrCodeProtocol, "HEADERS frame with stream ID 0"}
|
||||
return nil, ConnectionError(ErrCodeProtocol)
|
||||
}
|
||||
var padLength uint8
|
||||
if fh.Flags.Has(FlagHeadersPadded) {
|
||||
@@ -984,10 +870,10 @@ func (p PriorityParam) IsZero() bool {
|
||||
|
||||
func parsePriorityFrame(fh FrameHeader, payload []byte) (Frame, error) {
|
||||
if fh.StreamID == 0 {
|
||||
return nil, connError{ErrCodeProtocol, "PRIORITY frame with stream ID 0"}
|
||||
return nil, ConnectionError(ErrCodeProtocol)
|
||||
}
|
||||
if len(payload) != 5 {
|
||||
return nil, connError{ErrCodeFrameSize, fmt.Sprintf("PRIORITY frame payload size was %d; want 5", len(payload))}
|
||||
return nil, ConnectionError(ErrCodeFrameSize)
|
||||
}
|
||||
v := binary.BigEndian.Uint32(payload[:4])
|
||||
streamID := v & 0x7fffffff // mask off high bit
|
||||
@@ -1057,12 +943,13 @@ type ContinuationFrame struct {
|
||||
}
|
||||
|
||||
func parseContinuationFrame(fh FrameHeader, p []byte) (Frame, error) {
|
||||
if fh.StreamID == 0 {
|
||||
return nil, connError{ErrCodeProtocol, "CONTINUATION frame with stream ID 0"}
|
||||
}
|
||||
return &ContinuationFrame{fh, p}, nil
|
||||
}
|
||||
|
||||
func (f *ContinuationFrame) StreamEnded() bool {
|
||||
return f.FrameHeader.Flags.Has(FlagDataEndStream)
|
||||
}
|
||||
|
||||
func (f *ContinuationFrame) HeaderBlockFragment() []byte {
|
||||
f.checkValid()
|
||||
return f.headerFragBuf
|
||||
@@ -1224,46 +1111,3 @@ type streamEnder interface {
|
||||
type headersEnder interface {
|
||||
HeadersEnded() bool
|
||||
}
|
||||
|
||||
func summarizeFrame(f Frame) string {
|
||||
var buf bytes.Buffer
|
||||
f.Header().writeDebug(&buf)
|
||||
switch f := f.(type) {
|
||||
case *SettingsFrame:
|
||||
n := 0
|
||||
f.ForeachSetting(func(s Setting) error {
|
||||
n++
|
||||
if n == 1 {
|
||||
buf.WriteString(", settings:")
|
||||
}
|
||||
fmt.Fprintf(&buf, " %v=%v,", s.ID, s.Val)
|
||||
return nil
|
||||
})
|
||||
if n > 0 {
|
||||
buf.Truncate(buf.Len() - 1) // remove trailing comma
|
||||
}
|
||||
case *DataFrame:
|
||||
data := f.Data()
|
||||
const max = 256
|
||||
if len(data) > max {
|
||||
data = data[:max]
|
||||
}
|
||||
fmt.Fprintf(&buf, " data=%q", data)
|
||||
if len(f.Data()) > max {
|
||||
fmt.Fprintf(&buf, " (%d bytes omitted)", len(f.Data())-max)
|
||||
}
|
||||
case *WindowUpdateFrame:
|
||||
if f.StreamID == 0 {
|
||||
buf.WriteString(" (conn)")
|
||||
}
|
||||
fmt.Fprintf(&buf, " incr=%v", f.Increment)
|
||||
case *PingFrame:
|
||||
fmt.Fprintf(&buf, " ping=%q", f.Data[:])
|
||||
case *GoAwayFrame:
|
||||
fmt.Fprintf(&buf, " LastStreamID=%v ErrCode=%v Debug=%q",
|
||||
f.LastStreamID, f.ErrCode, f.debugData)
|
||||
case *RSTStreamFrame:
|
||||
fmt.Fprintf(&buf, " ErrCode=%v", f.ErrCode)
|
||||
}
|
||||
return buf.String()
|
||||
}
|
@@ -6,8 +6,6 @@ package http2
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"fmt"
|
||||
"io"
|
||||
"reflect"
|
||||
"strings"
|
||||
"testing"
|
||||
@@ -26,25 +24,6 @@ func TestFrameSizes(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestFrameTypeString(t *testing.T) {
|
||||
tests := []struct {
|
||||
ft FrameType
|
||||
want string
|
||||
}{
|
||||
{FrameData, "DATA"},
|
||||
{FramePing, "PING"},
|
||||
{FrameGoAway, "GOAWAY"},
|
||||
{0xf, "UNKNOWN_FRAME_TYPE_15"},
|
||||
}
|
||||
|
||||
for i, tt := range tests {
|
||||
got := tt.ft.String()
|
||||
if got != tt.want {
|
||||
t.Errorf("%d. String(FrameType %d) = %q; want %q", i, int(tt.ft), got, tt.want)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestWriteRST(t *testing.T) {
|
||||
fr, buf := testFramer()
|
||||
var streamID uint32 = 1<<24 + 2<<16 + 3<<8 + 4
|
||||
@@ -266,7 +245,6 @@ func TestWriteContinuation(t *testing.T) {
|
||||
t.Errorf("test %q: %v", tt.name, err)
|
||||
continue
|
||||
}
|
||||
fr.AllowIllegalReads = true
|
||||
f, err := fr.ReadFrame()
|
||||
if err != nil {
|
||||
t.Errorf("test %q: failed to read the frame back: %v", tt.name, err)
|
||||
@@ -598,138 +576,3 @@ func TestWritePushPromise(t *testing.T) {
|
||||
t.Fatalf("parsed back:\n%#v\nwant:\n%#v", f, want)
|
||||
}
|
||||
}
|
||||
|
||||
// test checkFrameOrder and that HEADERS and CONTINUATION frames can't be intermingled.
|
||||
func TestReadFrameOrder(t *testing.T) {
|
||||
head := func(f *Framer, id uint32, end bool) {
|
||||
f.WriteHeaders(HeadersFrameParam{
|
||||
StreamID: id,
|
||||
BlockFragment: []byte("foo"), // unused, but non-empty
|
||||
EndHeaders: end,
|
||||
})
|
||||
}
|
||||
cont := func(f *Framer, id uint32, end bool) {
|
||||
f.WriteContinuation(id, end, []byte("foo"))
|
||||
}
|
||||
|
||||
tests := [...]struct {
|
||||
name string
|
||||
w func(*Framer)
|
||||
atLeast int
|
||||
wantErr string
|
||||
}{
|
||||
0: {
|
||||
w: func(f *Framer) {
|
||||
head(f, 1, true)
|
||||
},
|
||||
},
|
||||
1: {
|
||||
w: func(f *Framer) {
|
||||
head(f, 1, true)
|
||||
head(f, 2, true)
|
||||
},
|
||||
},
|
||||
2: {
|
||||
wantErr: "got HEADERS for stream 2; expected CONTINUATION following HEADERS for stream 1",
|
||||
w: func(f *Framer) {
|
||||
head(f, 1, false)
|
||||
head(f, 2, true)
|
||||
},
|
||||
},
|
||||
3: {
|
||||
wantErr: "got DATA for stream 1; expected CONTINUATION following HEADERS for stream 1",
|
||||
w: func(f *Framer) {
|
||||
head(f, 1, false)
|
||||
},
|
||||
},
|
||||
4: {
|
||||
w: func(f *Framer) {
|
||||
head(f, 1, false)
|
||||
cont(f, 1, true)
|
||||
head(f, 2, true)
|
||||
},
|
||||
},
|
||||
5: {
|
||||
wantErr: "got CONTINUATION for stream 2; expected stream 1",
|
||||
w: func(f *Framer) {
|
||||
head(f, 1, false)
|
||||
cont(f, 2, true)
|
||||
head(f, 2, true)
|
||||
},
|
||||
},
|
||||
6: {
|
||||
wantErr: "unexpected CONTINUATION for stream 1",
|
||||
w: func(f *Framer) {
|
||||
cont(f, 1, true)
|
||||
},
|
||||
},
|
||||
7: {
|
||||
wantErr: "unexpected CONTINUATION for stream 1",
|
||||
w: func(f *Framer) {
|
||||
cont(f, 1, false)
|
||||
},
|
||||
},
|
||||
8: {
|
||||
wantErr: "HEADERS frame with stream ID 0",
|
||||
w: func(f *Framer) {
|
||||
head(f, 0, true)
|
||||
},
|
||||
},
|
||||
9: {
|
||||
wantErr: "CONTINUATION frame with stream ID 0",
|
||||
w: func(f *Framer) {
|
||||
cont(f, 0, true)
|
||||
},
|
||||
},
|
||||
10: {
|
||||
wantErr: "unexpected CONTINUATION for stream 1",
|
||||
atLeast: 5,
|
||||
w: func(f *Framer) {
|
||||
head(f, 1, false)
|
||||
cont(f, 1, false)
|
||||
cont(f, 1, false)
|
||||
cont(f, 1, false)
|
||||
cont(f, 1, true)
|
||||
cont(f, 1, false)
|
||||
},
|
||||
},
|
||||
}
|
||||
for i, tt := range tests {
|
||||
buf := new(bytes.Buffer)
|
||||
f := NewFramer(buf, buf)
|
||||
f.AllowIllegalWrites = true
|
||||
tt.w(f)
|
||||
f.WriteData(1, true, nil) // to test transition away from last step
|
||||
|
||||
var err error
|
||||
n := 0
|
||||
var log bytes.Buffer
|
||||
for {
|
||||
var got Frame
|
||||
got, err = f.ReadFrame()
|
||||
fmt.Fprintf(&log, " read %v, %v\n", got, err)
|
||||
if err != nil {
|
||||
break
|
||||
}
|
||||
n++
|
||||
}
|
||||
if err == io.EOF {
|
||||
err = nil
|
||||
}
|
||||
ok := tt.wantErr == ""
|
||||
if ok && err != nil {
|
||||
t.Errorf("%d. after %d good frames, ReadFrame = %v; want success\n%s", i, n, err, log.Bytes())
|
||||
continue
|
||||
}
|
||||
if !ok && err != ConnectionError(ErrCodeProtocol) {
|
||||
t.Errorf("%d. after %d good frames, ReadFrame = %v; want ConnectionError(ErrCodeProtocol)\n%s", i, n, err, log.Bytes())
|
||||
continue
|
||||
}
|
||||
if f.errReason != tt.wantErr {
|
||||
t.Errorf("%d. framer eror = %q; want %q\n%s", i, f.errReason, tt.wantErr, log.Bytes())
|
||||
}
|
||||
if n < tt.atLeast {
|
||||
t.Errorf("%d. framer only read %d frames; want at least %d\n%s", i, n, tt.atLeast, log.Bytes())
|
||||
}
|
||||
}
|
||||
}
|
@@ -1,6 +1,9 @@
|
||||
// Copyright 2014 The Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
// Defensive debug-only utility to track that functions run on the
|
||||
// goroutine that they're supposed to.
|
||||
@@ -11,20 +14,16 @@ import (
|
||||
"bytes"
|
||||
"errors"
|
||||
"fmt"
|
||||
"os"
|
||||
"runtime"
|
||||
"strconv"
|
||||
"sync"
|
||||
)
|
||||
|
||||
var DebugGoroutines = os.Getenv("DEBUG_HTTP2_GOROUTINES") == "1"
|
||||
var DebugGoroutines = false
|
||||
|
||||
type goroutineLock uint64
|
||||
|
||||
func newGoroutineLock() goroutineLock {
|
||||
if !DebugGoroutines {
|
||||
return 0
|
||||
}
|
||||
return goroutineLock(curGoroutineID())
|
||||
}
|
||||
|
@@ -1,6 +1,9 @@
|
||||
// Copyright 2014 The Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
// See https://code.google.com/p/go/source/browse/CONTRIBUTORS
|
||||
// Licensed under the same terms as Go itself:
|
||||
// https://code.google.com/p/go/source/browse/LICENSE
|
||||
|
||||
package http2
|
||||
|
||||
@@ -11,10 +14,7 @@ import (
|
||||
)
|
||||
|
||||
func TestGoroutineLock(t *testing.T) {
|
||||
oldDebug := DebugGoroutines
|
||||
DebugGoroutines = true
|
||||
defer func() { DebugGoroutines = oldDebug }()
|
||||
|
||||
g := newGoroutineLock()
|
||||
g.check()
|
||||
|
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user