doc: remove out-of-dated docs

release-2.0
Xiang Li 2015-01-13 18:35:11 -08:00
parent 9010e8a2c4
commit 3e268467c8
7 changed files with 0 additions and 712 deletions


@@ -1,139 +0,0 @@
# Etcd Configuration
## Node Configuration
Individual node configuration options can be set in three places:
1. Command line flags
2. Environment variables
3. Configuration file
Options set on the command line take precedence over all other sources.
Options set in environment variables take precedence over options set in
configuration files.
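The precedence rule above can be sketched as a small lookup. This is a minimal illustration, not etcd's implementation: `resolve_option` is a hypothetical helper, the configuration file is modelled as a flat dictionary (the real file nests some keys under `[peer]` and `[cluster]`), and the `ETCD_` name mapping is inferred from the flag and environment variable lists in this document.

```python
import os


def resolve_option(name, flags, config_file, environ=None):
    """Return the effective value of one option, or None if it is unset.

    name        -- flag name without the leading dash, e.g. "data-dir"
    flags       -- dict of values given on the command line
    config_file -- dict of values read from the configuration file
                   (flat here; an assumption made for brevity)
    environ     -- mapping of environment variables (defaults to os.environ)
    """
    environ = os.environ if environ is None else environ
    # Assumed mapping: upper-case the flag name, turn "-" into "_", add "ETCD_".
    env_key = "ETCD_" + name.upper().replace("-", "_")

    if name in flags:          # 1. command line flags win
        return flags[name]
    if env_key in environ:     # 2. then environment variables
        return environ[env_key]
    # 3. finally the configuration file, whose keys use underscores
    return config_file.get(name.replace("-", "_"))
```

For example, with `-addr` absent from the command line, `ETCD_ADDR` set, and `addr` present in the file, the environment value is the one returned.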
## Cluster Configuration
Cluster-wide settings are configured via the `/config` admin endpoint and additionally in the configuration file. Values contained in the configuration file will seed the cluster setting with the provided value. After the cluster is running, only the admin endpoint is used.
The full documentation is contained in the [API docs](https://github.com/coreos/etcd/blob/master/Documentation/api.md#cluster-config).
* `activeSize` - the maximum number of peers that can participate in the consensus protocol. Other peers will join as standbys.
* `removeDelay` - the minimum time in seconds that a machine has been observed to be unresponsive before it is removed from the cluster.
* `syncInterval` - the amount of time in seconds between cluster sync when it runs in standby mode.
## Command Line Flags
### Required
* `-name` - The node name. Defaults to a UUID.
### Optional
* `-addr` - The advertised public hostname:port for client communication. Defaults to `127.0.0.1:4001`.
* `-bind-addr` - The listening hostname for client communication. Defaults to advertised IP.
* `-ca-file` - The path of the client CAFile. Enables client cert authentication when present.
* `-cert-file` - The cert file of the client.
* `-cluster-active-size` - The expected number of instances participating in the consensus protocol. Only applied if the etcd instance is the first peer in the cluster.
* `-cluster-remove-delay` - The number of seconds a node must be unreachable before it is removed from the cluster. Only applied if the etcd instance is the first peer in the cluster.
* `-cluster-sync-interval` - The number of seconds between synchronizations of a standby-mode instance with the cluster. Only applied if the etcd instance is the first peer in the cluster.
* `-config` - The path of the etcd configuration file. Defaults to `/etc/etcd/etcd.conf`.
* `-cors` - A comma-separated whitelist of origins for cross-origin resource sharing.
* `-cpuprofile` - The path to a file to output CPU profile data. Enables CPU profiling when present.
* `-data-dir` - The directory in which to store the log and snapshots. Defaults to the current working directory.
* `-discovery` - A URL to use for discovering the peer list (e.g. `"https://discovery.etcd.io/your-unique-key"`).
* `-graphite-host` - The Graphite endpoint to which to send metrics.
* `-http-read-timeout` - The number of seconds before an HTTP read operation is timed out.
* `-http-write-timeout` - The number of seconds before an HTTP write operation is timed out.
* `-key-file` - The key file of the client.
* `-max-result-buffer` - The max size of result buffer. Defaults to `1024`.
* `-max-retry-attempts` - The max retry attempts when trying to join a cluster. Defaults to `3`.
* `-peer-addr` - The advertised public hostname:port for server communication. Defaults to `127.0.0.1:7001`.
* `-peer-bind-addr` - The listening hostname for server communication. Defaults to advertised IP.
* `-peer-ca-file` - The path of the CAFile. Enables client/peer cert authentication when present.
* `-peer-cert-file` - The cert file of the server.
* `-peer-election-timeout` - The number of milliseconds to wait before the leader is declared unhealthy.
* `-peer-heartbeat-interval` - The number of milliseconds between heartbeat requests.
* `-peer-key-file` - The key file of the server.
* `-peers` - A comma-separated list of peers in the cluster (e.g. `"203.0.113.101:7001,203.0.113.102:7001"`).
* `-peers-file` - The path of a file containing a comma-separated list of peers in the cluster.
* `-retry-interval` - Seconds to wait between cluster join retry attempts.
* `-snapshot=false` - Disable log snapshots. Defaults to `true`.
* `-v` - Enable verbose logging. Defaults to `false`.
* `-vv` - Enable very verbose logging. Defaults to `false`.
* `-version` - Print the version and exit.
## Configuration File
The etcd configuration file is written in [TOML](https://github.com/mojombo/toml)
and read from `/etc/etcd/etcd.conf` by default.
```TOML
addr = "127.0.0.1:4001"
bind_addr = "127.0.0.1:4001"
ca_file = ""
cert_file = ""
cors = []
cpu_profile_file = ""
data_dir = "."
discovery = "http://etcd.local:4001/v2/keys/_etcd/registry/examplecluster"
http_read_timeout = 10.0
http_write_timeout = 10.0
key_file = ""
peers = []
peers_file = ""
max_result_buffer = 1024
max_retry_attempts = 3
name = "default-name"
snapshot = true
verbose = false
very_verbose = false
[peer]
addr = "127.0.0.1:7001"
bind_addr = "127.0.0.1:7001"
ca_file = ""
cert_file = ""
key_file = ""
[cluster]
active_size = 9
remove_delay = 1800.0
sync_interval = 5.0
```
## Environment Variables
* `ETCD_ADDR`
* `ETCD_BIND_ADDR`
* `ETCD_CA_FILE`
* `ETCD_CERT_FILE`
* `ETCD_CLUSTER_ACTIVE_SIZE`
* `ETCD_CLUSTER_REMOVE_DELAY`
* `ETCD_CLUSTER_SYNC_INTERVAL`
* `ETCD_CORS`
* `ETCD_DATA_DIR`
* `ETCD_DISCOVERY`
* `ETCD_GRAPHITE_HOST`
* `ETCD_HTTP_READ_TIMEOUT`
* `ETCD_HTTP_WRITE_TIMEOUT`
* `ETCD_KEY_FILE`
* `ETCD_MAX_RESULT_BUFFER`
* `ETCD_MAX_RETRY_ATTEMPTS`
* `ETCD_NAME`
* `ETCD_PEER_ADDR`
* `ETCD_PEER_BIND_ADDR`
* `ETCD_PEER_CA_FILE`
* `ETCD_PEER_CERT_FILE`
* `ETCD_PEER_ELECTION_TIMEOUT`
* `ETCD_PEER_HEARTBEAT_INTERVAL`
* `ETCD_PEER_KEY_FILE`
* `ETCD_PEERS`
* `ETCD_PEERS_FILE`
* `ETCD_RETRY_INTERVAL`
* `ETCD_SNAPSHOT`
* `ETCD_SNAPSHOTCOUNT`
* `ETCD_TRACE`
* `ETCD_VERBOSE`
* `ETCD_VERY_VERBOSE`
* `ETCD_VERY_VERY_VERBOSE`


@@ -1,69 +0,0 @@
# Debugging etcd
Diagnosing issues in a distributed application is hard.
etcd will help as much as it can - just enable these debug features using the CLI flag `-trace=*` or the config option `trace=*`.
## Logging
Log verbosity can be increased to the max using either the `-vvv` CLI flag or the `very_very_verbose=true` config option.
The only supported logging mode is to stdout.
## Metrics
etcd itself can generate a set of metrics.
These metrics represent many different internal data points that can be helpful when debugging etcd servers.
#### Metrics reference
Each individual metric name is prefixed with `etcd.<NAME>`, where \<NAME\> is the configured name of the etcd server.
* `timer.appendentries.handle`: amount of time a peer takes to process an AppendEntriesRequest from the POV of the peer itself
* `timer.peer.<PEER>.heartbeat`: amount of time a peer heartbeat operation takes from the POV of the leader that initiated that operation for peer \<PEER\>
* `timer.command.<COMMAND>`: amount of time a given command took to be processed through the local server's raft state machine. This does not include time waiting on locks.
#### Fetching metrics over HTTP
Once tracing has been enabled on a given etcd server, all metric data is available at the server's `/debug/metrics` HTTP endpoint (e.g. `http://127.0.0.1:4001/debug/metrics`).
Executing a GET HTTP command against the metrics endpoint will yield the current state of all metrics in the etcd server.
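As a minimal sketch of the GET described above, the following uses only the Python standard library. The helper names `metrics_url` and `fetch_metrics` are hypothetical, the default address is the client address used elsewhere in this document, and the body is returned as raw text because the response format is not specified here.

```python
import urllib.request


def metrics_url(client_addr="127.0.0.1:4001"):
    """Build the metrics endpoint URL for a server's client address."""
    return "http://" + client_addr + "/debug/metrics"


def fetch_metrics(client_addr="127.0.0.1:4001", timeout=5.0):
    """GET the current metrics and return the response body as text.

    Tracing must already be enabled on the server (for example with
    `-trace=*`), otherwise the endpoint is not expected to be served.
    """
    with urllib.request.urlopen(metrics_url(client_addr), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_metrics()` against a local server started with tracing enabled should return the same data as a browser request to the URL above.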
#### Sending metrics to Graphite
etcd supports [Graphite's Carbon plaintext protocol](https://graphite.readthedocs.org/en/latest/feeding-carbon.html#the-plaintext-protocol) - a TCP wire protocol designed for shipping metric data to an aggregator.
To send metrics to a Graphite endpoint using this protocol, use the `-graphite-host` CLI flag or the `graphite_host` config option (e.g. `graphite_host=172.17.0.19:2003`).
See an [example graphite deploy script](https://github.com/coreos/etcd/contrib/graphite).
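To check that a Graphite endpoint is reachable before pointing etcd at it, a hand-built test metric can be sent over the same plaintext protocol, which carries one `<path> <value> <timestamp>` line per metric. The sketch below is an illustration only: `carbon_line` and `send_metric` are hypothetical helpers, and the `etcd.<NAME>` prefix simply follows the naming described in the metrics reference above rather than the exact lines etcd emits.

```python
import socket
import time


def carbon_line(server_name, metric, value, timestamp=None):
    """Format one metric in Carbon's plaintext protocol.

    The path is prefixed with etcd.<NAME>, mirroring the metric naming
    described in this document.
    """
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return "etcd.%s.%s %s %d\n" % (server_name, metric, value, ts)


def send_metric(graphite_host, line, timeout=5.0):
    """Send one preformatted line to a host:port Graphite endpoint over TCP."""
    host, port = graphite_host.rsplit(":", 1)
    with socket.create_connection((host, int(port)), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))
```

For instance, `send_metric("172.17.0.19:2003", carbon_line("node1", "timer.test", 1))` would push a single data point to the endpoint used in the example above.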
#### Generating additional metrics with Collectd
[Collectd](http://collectd.org/documentation.shtml) gathers metrics from the host running etcd.
While these aren't metrics generated by etcd itself, it can be invaluable to compare etcd's view of the world to that of a separate process running next to etcd.
See an [example collectd deploy script](https://github.com/coreos/etcd/contrib/collectd).
## Profiling
etcd exposes profiling information from the Go pprof package over HTTP.
The basic browsable interface is served by etcd at the `/debug/pprof` HTTP endpoint (e.g. `http://127.0.0.1:4001/debug/pprof`).
For more information on using profiling tools, see http://blog.golang.org/profiling-go-programs.
**NOTE**: In the following examples you need to ensure that the local `./bin/etcd` binary is identical to the etcd binary that you are targeting (same git hash, arch, platform, etc).
#### Heap memory profile
```
go tool pprof ./bin/etcd http://127.0.0.1:4001/debug/pprof/heap
```
#### CPU profile
```
go tool pprof ./bin/etcd http://127.0.0.1:4001/debug/pprof/profile
```
#### Blocked goroutine profile
```
go tool pprof ./bin/etcd http://127.0.0.1:4001/debug/pprof/block
```


@@ -1,34 +0,0 @@
## Cluster Finding Process
Peer discovery uses the following sources in this order: log data in `-data-dir`, `-discovery` and `-peers`.
If log data is provided, etcd will concatenate possible peers from three sources: the log data, the `-discovery` option, and `-peers`. It then tries to join the cluster through each of them, one by one. If all connection attempts fail (which indicates that the majority of the cluster is currently down), it will restart itself based on the log data, which helps the cluster recover from a full outage.
Without log data, the instance is assumed to be a brand new one. If possible targets are provided by `-discovery` and `-peers`, etcd will make a best-effort attempt to join them, and if none is reachable it will exit. Otherwise, if no `-discovery` or `-peers` option is provided, a new cluster will always be started.
This ensures that users can always restart the node safely with the same command (without --force), and etcd will either reconnect to the old cluster if it is still running or recover its cluster from an outage.
## Logical Workflow
Start an etcd machine:
```
If log data is given:
  Try to join via peers in previous cluster
  Try to join via peers found in discover URL
  Try to join via peers in peer list
  Restart the previous cluster which is down
  return
If discover URL is given:
  Fetch peers through discover URL
  If Success:
    Join via peers found
    return
If peer list is given:
  Join as follower via peers in peer list
  return
Start as the leader of a new cluster
```
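
The decision described in the Cluster Finding Process section can be sketched as a pure function. This is a simplified model that follows the prose rather than etcd's code: `find_cluster` and its result labels are hypothetical, and `try_join` stands in for an actual join attempt against one peer.

```python
def find_cluster(has_log, log_peers, discovery_peers, peer_list, try_join):
    """Decide how a starting node finds its cluster.

    try_join(peer) -> bool is a caller-supplied join attempt (an assumption
    of this sketch). Returns an (action, peer) tuple.
    """
    if has_log:
        # With log data: try every known source in order, then fall back
        # to restarting the previous cluster from the log.
        for peer in list(log_peers) + list(discovery_peers) + list(peer_list):
            if try_join(peer):
                return ("joined", peer)
        return ("restarted-from-log", None)

    # Without log data the node is treated as brand new.
    candidates = list(discovery_peers) + list(peer_list)
    if candidates:
        for peer in candidates:
            if try_join(peer):
                return ("joined", peer)
        return ("exit", None)  # targets were given but none was reachable

    return ("new-cluster", None)  # nothing to join: start a new cluster
```

Because the function takes the join attempt as a parameter, each branch can be exercised without a running cluster.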


@@ -1,232 +0,0 @@
## Standbys
Adding peers in an etcd cluster adds network, CPU, and disk overhead to the leader since each one requires replication.
Peers primarily provide resiliency in the event of a leader failure but the benefit of more failover nodes decreases as the cluster size increases.
A lightweight alternative is the standby.
Standbys are a way for an etcd node to forward requests along to the cluster but the standbys are not part of the Raft cluster themselves.
This provides an easier API for local applications while reducing the overhead required by a regular peer node.
Standbys also act as standby nodes in the event that a peer node in the cluster has not recovered after a long duration.
## Configuration Parameters
There are three configuration parameters used by standbys: active size, remove delay and standby sync interval.
The active size specifies a target size for the number of peers in the cluster.
If there are not enough peers to meet the active size, standbys will send join requests until the peer count is equal to the active size.
If there are more peers than the target active size then peers are removed by the leader and will become standbys.
The remove delay specifies how long the cluster should wait before removing a dead peer.
By default this is 30 minutes.
If a peer is inactive for 30 minutes then the peer is removed.
The standby sync interval specifies the synchronization interval of standbys with the cluster.
By default this is 5 seconds.
After each interval, standbys synchronize information with the cluster.
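The effect of the active size can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `rebalance` is a hypothetical function, the demoted peer is taken from the end of the list (the leader is described later as choosing randomly), and standbys are promoted in list order rather than by racing join requests. The default of 9 comes from the sample configuration file.

```python
def rebalance(peers, standbys, active_size=9):
    """Return (peers, standbys) after converging on the active size."""
    peers, standbys = list(peers), list(standbys)

    # Too many peers: the surplus is removed and becomes standbys.
    while len(peers) > active_size:
        standbys.append(peers.pop())

    # Too few peers: standbys join until the target is met or none remain.
    while len(peers) < active_size and standbys:
        peers.append(standbys.pop(0))

    return peers, standbys
```

With three peers, one standby, and an active size of 2, the last peer is demoted and the cluster ends up with two peers and two standbys.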
## Logical Workflow
### Start an etcd machine
#### Main logic
```
If find existing standby cluster info:
  Goto standby loop
Find cluster as required
If determine to start peer server:
  Goto peer loop
Else:
  Goto standby loop
Peer loop:
  Start peer mode
  If running:
    Wait for stop
  Goto standby loop
Standby loop:
  Start standby mode
  If running:
    Wait for stop
  Goto peer loop
```
#### [Cluster finding logic](cluster-finding.md)
#### Join request logic:
```
Fetch machine info
If cannot match version:
  return false
If active size <= peer count:
  return false
If it has existed in the cluster:
  return true
If join request fails:
  return false
return true
```
**Note**
1. [TODO] The running mode cannot be determined from the log, because the log may be outdated. But the log could be used to estimate its state.
2. Even if syncing with the cluster fails, the machine still restarts so that it can recover from a full outage.
#### Peer mode start logic
```
Start raft server
Start other helper routines
```
#### Peer mode auto stop logic
```
When removed from the cluster:
  Stop raft server
  Stop other helper routines
```
#### Standby mode run logic
```
Loop:
  Sleep for some time
  Sync cluster, and write cluster info into disk
  Check active size and send join request if needed
  If succeed:
    Clear cluster info from disk
    Return
```
#### Serve Requests as Standby
Always return '404 Page Not Found' on the peer address. The peer address is used for raft communication and cluster management, neither of which should be used in standby mode.
Serve requests from client:
```
Redirect all requests to client URL of leader
```
**Note**
1. The leader here is the leader of the raft cluster as of the latest successful synchronization.
2. [IDEA] We could extend HTTP Redirect to multiple possible targets.
### Join Request Handling
```
If machine has existed in the cluster:
  Return
If peer count < active size:
  Add peer
  Increase peer count
```
### Remove Request Handling
```
If machine exists in the cluster:
  Remove peer
  Decrease peer count
```
## Cluster Monitor Logic
### Active Size Monitor:
This is only run by the current cluster leader.
```
Loop:
  Sleep for some time
  If peer count > active size:
    Remove randomly selected peer
```
### Peer Activity Monitor
This is only run by the current cluster leader.
```
Loop:
  Sleep for some time
  For each peer:
    If peer last activity time > remove delay:
      Remove the peer
      Goto Loop
```
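
One iteration of each monitor can be sketched as a pure function. This is an illustrative model, not etcd's code: both function names are hypothetical, "last activity time" is interpreted as the time elapsed since the peer was last seen, and the 1800-second default mirrors the 30-minute remove delay described above.

```python
import random


def pick_surplus_peer(peers, active_size, rng=random):
    """Active size monitor: choose a random peer to remove, or None."""
    if len(peers) > active_size:
        return rng.choice(list(peers))
    return None


def pick_inactive_peer(last_seen, now, remove_delay=1800.0):
    """Peer activity monitor: return the first peer idle past the delay.

    last_seen maps peer name -> timestamp of its last observed activity.
    Only one peer is returned per call, matching the "Goto Loop" step.
    """
    for name, seen in last_seen.items():
        if now - seen > remove_delay:
            return name
    return None
```

A leader would call each function after its sleep interval and issue a remove request for whatever peer is returned.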
## Cluster Cases
### Create Cluster with Thousands of Instances
The first few machines run in peer mode.
All the others check the status of the cluster and run in standby mode.
### Recover from full outage
Machines with log data restart after their join attempts fail.
Machines in peer mode recover heartbeats between each other.
Machines in standby mode always sync with the cluster. If the sync fails, a standby uses the first address from its data log as the redirect target.
### Kill one peer machine
The leader of the cluster loses the connection with the peer.
When the time exceeds the remove delay, the leader removes the peer from the cluster.
A machine in standby mode finds the free slot in the cluster. It sends a join request and joins the cluster.
**Note**
1. [TODO] A machine that was separated from the majority and removed from the cluster will disrupt the running of the cluster if the new node uses the same name.
### Kill one standby machine
No change for the cluster.
## Cons
1. A new instance cannot join immediately after a peer is kicked out of the cluster, because the leader has no information about the standby instances.
2. It may introduce join collisions.
3. The cluster needs a well-chosen sync interval to balance join delay against join collisions.
## Future Attack Plans
1. Based on missed heartbeats and the remove delay, a standby could adjust its next check time.
2. Preregister the promotion target when a heartbeat is missed.
3. Estimate the cluster size from the checks made during each sync interval, and adjust the sync interval dynamically.
4. Accept join requests based on active size and alive peers.


@@ -1,101 +0,0 @@
# Etcd File System
## Structure
[TODO]
![alt text](./img/etcd_fs_structure.jpg "etcd file system structure")
## Node
In **etcd**, the **node** is the base from which the filesystem is constructed.
**etcd**'s file system is Unix-like with two kinds of nodes: files and directories.
- A **file node** has data associated with it.
- A **directory node** has child nodes associated with it.
All nodes, regardless of type, have the following attributes and operations:
### Attributes:
- **Expiration Time** [optional]
The node will be deleted when it expires.
- **ACL**
The path to the node's access control list.
### Operation:
- **Get** (path, recursive, sorted)

  Get the content of the node.
  - If the node is a file, the data of the file will be returned.
  - If the node is a directory, the child nodes of the directory will be returned.
  - If recursive is true, it will recursively get the nodes of the directory.
  - If sorted is true, the result will be sorted based on the path.

- **Create** (path, value [optional], ttl [optional])

  Create a file. The create operation also creates any missing intermediate directories, with no expiration time.
  - If the file already exists, create will fail.
  - If the value is given, the operation will create a file.
  - If the value is not given, the operation will create a directory.
  - If ttl is given, the node will be deleted when it expires.

- **Update** (path, value [optional], ttl [optional])

  Update the content of the node.
  - If the value is given, the value of the key will be updated.
  - If ttl is given, the expiration time of the node will be updated.

- **Delete** (path, recursive)

  Delete the node at the given path.
  - If the node is a directory:
    - If recursive is true, the operation will delete all nodes under the directory.
    - If recursive is false, an error will be returned.

- **TestAndSet** (path, prevValue [prevIndex], value, ttl)

  Atomically *test and set* the value of a file. If the test succeeds, this operation changes the previous value of the file to the given value.
  - If the prevValue is given, it will test against the previous value of the node.
    - If the prevValue is empty, it will test that the node does not exist.
    - If the prevValue is not empty, it will test that the prevValue is equal to the current value of the file.
  - If the prevIndex is given, it will test that the create/last modified index of the node is equal to prevIndex.

- **Renew** (path, ttl)

  Set the node's expiration time to (current time + ttl).
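
The TestAndSet conditions listed above can be sketched with a tiny in-memory store. This is a deliberately reduced model and not etcd's store: the `Store` class is hypothetical, it holds only files (no directories, TTLs, or ACLs), it is not thread-safe, and it treats a call with neither `prev_value` nor `prev_index` as an unconditional set, which is an assumption of this sketch.

```python
class Store:
    """Minimal file-only store that tracks a modification index per path."""

    def __init__(self):
        self.files = {}  # path -> (value, last modified index)
        self.index = 0   # global, monotonically increasing index

    def test_and_set(self, path, value, prev_value=None, prev_index=None):
        """Set value only if every supplied test passes; return success."""
        current = self.files.get(path)

        if prev_value is not None:
            if prev_value == "":
                # Empty prevValue: the node must not exist yet.
                if current is not None:
                    return False
            elif current is None or current[0] != prev_value:
                # Non-empty prevValue: must equal the current value.
                return False

        if prev_index is not None:
            # prevIndex: must equal the node's last modified index.
            if current is None or current[1] != prev_index:
                return False

        self.index += 1
        self.files[path] = (value, self.index)
        return True
```

A caller that wants create-if-absent semantics passes `prev_value=""`, while a caller doing an optimistic update passes the value or index it last read.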
## ACL
### Theory
Etcd exports a Unix-like file system interface consisting of files and directories, collectively called nodes.
Each node has various metadata, including the names of three access control lists used to control reading, writing and changing (changing the ACL names for the node).
We store the ACL names for nodes under a special *ACL* directory.
Each node has an ACL name corresponding to one file within the *ACL* directory.
Unless overridden, a node naturally inherits the ACL names of its parent directory on creation.
Each ACL name has three children: *R (Reading)*, *W (Writing)*, *C (Changing)*.
Each permission is also a node. It contains the users who have this permission for any file referring to this ACL name.
### Example
[TODO]
### Diagram
[TODO]
### Interface
Testing permissions:
- (node *Node) get_perm()
- (node *Node) has_perm(perm string, user string)
Setting/Changing permissions:
- (node *Node) set_perm(perm string)
- (node *Node) change_ACLname(aclname string)
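The permission check in the interface above can be sketched against a plain dictionary standing in for the *ACL* directory. This is a hypothetical model for illustration only: the nested-dict layout (`ACL name -> permission -> set of users`) and both function names are assumptions, since the design leaves the concrete storage format open.

```python
PERMISSIONS = ("R", "W", "C")  # Reading, Writing, Changing


def has_perm(acl_dir, acl_name, perm, user):
    """Return True if user is listed under ACL/<acl_name>/<perm>."""
    if perm not in PERMISSIONS:
        raise ValueError("unknown permission: " + perm)
    return user in acl_dir.get(acl_name, {}).get(perm, set())


def acl_name_on_create(parent_acl_name, override=None):
    """A new node inherits its parent's ACL name unless one is given."""
    return parent_acl_name if override is None else override
```

For example, a node created under a directory whose ACL name is `ops` would be checked with `has_perm(acl_dir, "ops", "W", user)` unless an override was supplied at creation.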
## User Group
[TODO]


@@ -1,120 +0,0 @@
## Modules
etcd has a number of modules that are built on top of the core etcd API.
These modules provide things like dashboards, locks and leader election (removed).
**Warning**: Modules and the dashboard are deprecated as of v0.4 until we have a solid base we can apply them back onto.
For now, we are choosing to focus on the raft algorithm and core etcd to make sure that they work correctly and fast.
Maintaining these modules during this period is time consuming, given that etcd's API changes from time to time.
Moreover, the lock module has some unfixed bugs, which may mislead users.
We also recognize that these modules are popular and useful, and we plan to add them back with full functionality as soon as possible.
### Dashboard
**Other Dashboards**: There are other dashboards available on [GitHub](https://github.com/henszey/etcd-browser) that can be run [in a container](https://registry.hub.docker.com/u/tomaskral/etcd-browser/).
An HTML dashboard can be found at `http://127.0.0.1:4001/mod/dashboard/`.
This dashboard is compiled into the etcd binary and uses the same API as regular etcd clients.
Use the `-cors='*'` flag to allow your browser to request information from the current master as it changes.
### Lock
The Lock module implements a fair lock that can be used when lots of clients want access to a single resource.
A lock can be associated with a value.
The value is unique so if a lock tries to request a value that is already queued for a lock then it will find it and watch until that value obtains the lock.
You may supply a `timeout` which will cancel the lock request if it is not obtained within `timeout` seconds. If `timeout` is not supplied, it is presumed to be infinite. If `timeout` is `0`, the lock request will fail if it is not immediately acquired.
If you lock the same value on a key from two separate curl sessions they'll both return at the same time.
Here's the API:
**Acquire a lock (with no value) for "customer1"**
```sh
curl -X POST http://127.0.0.1:4001/mod/v2/lock/customer1?ttl=60
```
**Acquire a lock for "customer1" that is associated with the value "bar"**
```sh
curl -X POST http://127.0.0.1:4001/mod/v2/lock/customer1?ttl=60 -d value=bar
```
**Acquire a lock for "customer1" that is associated with the value "bar" only if it is done within 2 seconds**
```sh
curl -X POST http://127.0.0.1:4001/mod/v2/lock/customer1?ttl=60 -d value=bar -d timeout=2
```
**Renew the TTL on the "customer1" lock for index 2**
```sh
curl -X PUT http://127.0.0.1:4001/mod/v2/lock/customer1?ttl=60 -d index=2
```
**Renew the TTL on the "customer1" lock for value "bar"**
```sh
curl -X PUT http://127.0.0.1:4001/mod/v2/lock/customer1?ttl=60 -d value=bar
```
**Retrieve the current value for the "customer1" lock.**
```sh
curl http://127.0.0.1:4001/mod/v2/lock/customer1
```
**Retrieve the current index for the "customer1" lock**
```sh
curl http://127.0.0.1:4001/mod/v2/lock/customer1?field=index
```
**Delete the "customer1" lock with the index 2**
```sh
curl -X DELETE http://127.0.0.1:4001/mod/v2/lock/customer1?index=2
```
**Delete the "customer1" lock with the value "bar"**
```sh
curl -X DELETE http://127.0.0.1:4001/mod/v2/lock/customer1?value=bar
```
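
The acquire calls above can also be issued from a script. The sketch below builds the equivalent request with the Python standard library; `acquire_lock_request` is a hypothetical helper, the endpoint and field names are the ones shown in the curl examples, and nothing here is part of etcd itself.

```python
import urllib.parse
import urllib.request


def acquire_lock_request(key, ttl, value=None, timeout=None,
                         base="http://127.0.0.1:4001"):
    """Build a POST request that acquires the lock named key."""
    url = "%s/mod/v2/lock/%s?%s" % (
        base,
        urllib.parse.quote(key, safe=""),
        urllib.parse.urlencode({"ttl": ttl}),
    )
    fields = {}
    if value is not None:
        fields["value"] = value
    if timeout is not None:
        # timeout=0 is meaningful (fail unless acquired immediately),
        # so test against None rather than truthiness.
        fields["timeout"] = timeout
    body = urllib.parse.urlencode(fields).encode("ascii") if fields else None
    return urllib.request.Request(url, data=body, method="POST")
```

Passing the returned object to `urllib.request.urlopen` sends the request; keeping construction separate makes it possible to inspect the URL and body first.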
### Leader Election (Deprecated and Removed in 0.4)
The Leader Election module wraps the Lock module to allow clients to come to consensus on a single value.
This is useful when you want one server to process at a time but allow other servers to fail over.
The API is similar to the Lock module but is limited to simple string values.
Here's the API:
**Attempt to set a value for the "order_processing" leader key:**
```sh
curl -X PUT http://127.0.0.1:4001/mod/v2/leader/order_processing?ttl=60 -d name=myserver1.foo.com
```
**Retrieve the current value for the "order_processing" leader key:**
```sh
curl http://127.0.0.1:4001/mod/v2/leader/order_processing
myserver1.foo.com
```
**Remove a value from the "order_processing" leader key:**
```sh
curl -X DELETE http://127.0.0.1:4001/mod/v2/leader/order_processing?name=myserver1.foo.com
```
If multiple clients attempt to set the value for a key then only one will succeed.
The other clients will hang until the current value is removed because of TTL or because of a `DELETE` operation.
Multiple clients can submit the same value and will all be notified when that value succeeds.
To update the TTL of a value simply reissue the same `PUT` command that you used to set the value.


@@ -1,17 +0,0 @@
# Upgrading an Existing Cluster
etcd clusters can be upgraded by doing a rolling upgrade or all at once. We make every effort to test this process, but please be sure to back up your data with [etcd-dump](https://github.com/AaronO/etcd-dump), or make a copy of the data directory beforehand.
## Upgrade Process
- Stop the old etcd processes
- Upgrade the etcd binary
- Restart the etcd instance using the original --name, --address, --peer-address and --data-dir.
## Rolling Upgrade
During an upgrade, etcd clusters are designed to continue working in a mix of old and new versions. It's recommended to converge on the new version quickly. Using new API features before the entire cluster has been upgraded is only supported as a best effort. Each instance's version can be found with `curl http://127.0.0.1:4001/version`.
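A rolling upgrade can be tracked by polling the `/version` endpoint of every instance and checking whether they agree. The following is a minimal sketch: `fetch_version` and `upgrade_converged` are hypothetical helpers, and the version is compared as an opaque string because the exact response format is not specified here.

```python
import urllib.request


def fetch_version(client_addr, timeout=5.0):
    """Return the body of http://<client_addr>/version as stripped text."""
    url = "http://%s/version" % client_addr
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace").strip()


def upgrade_converged(versions):
    """True when every instance reports the same version string.

    versions maps instance address -> version string.
    """
    return len(set(versions.values())) <= 1
```

A script could build the `versions` mapping by calling `fetch_version` for each client address and hold off on new API features until `upgrade_converged` returns True.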
## All at Once
If downtime is not an issue, the easiest way to upgrade your cluster is to shut down all of the etcd instances and restart them with the new binary. The current state of the cluster is saved to disk and will be loaded into the cluster when it restarts.