Compare commits

32 Commits

Author SHA1 Message Date
Yicheng Qin 5dcbb998f1 *: bump to v2.1.3+git 2015-09-03 13:33:49 -07:00
Yicheng Qin 30801de468 *: bump to v2.1.3 2015-09-03 13:33:27 -07:00
Yicheng Qin dbac8c8f42 etcdmain: check error before assigning peer transport
Otherwise it may panic when creating the new transport fails, e.g., when the TLS info is invalid.
2015-09-03 13:22:19 -07:00
Yicheng Qin 151c18d650 *: bump to v2.1.2+git 2015-08-21 16:20:16 -07:00
Yicheng Qin ff8d1ecb9f *: bump to v2.1.2 2015-08-21 16:19:55 -07:00
Yicheng Qin ccb67a691b pkg/netutil: stop resolving in place
It helps to copy out a and b, and not modify the original a and b.
2015-08-21 15:39:57 -07:00
Yicheng Qin 059233768e pkg/netutil: not introduce empty url when converting
It should not create a slice with a non-zero length and then append
elements to it.
2015-08-21 15:39:48 -07:00
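The slice pitfall named in the commit above can be reproduced in a few lines. This is a minimal sketch, not the code from pkg/netutil; the function names `buggyConvert` and `fixedConvert` and the sample URLs are assumptions made for illustration.

```go
package main

import "fmt"

// buggyConvert (hypothetical name) creates the slice with a non-zero length
// and then appends to it, so the first len(in) entries stay empty strings.
func buggyConvert(in []string) []string {
	out := make([]string, len(in))
	for _, s := range in {
		out = append(out, s)
	}
	return out
}

// fixedConvert (hypothetical name) reserves capacity only, so append fills
// the slice starting at index 0 and no empty entries are introduced.
func fixedConvert(in []string) []string {
	out := make([]string, 0, len(in))
	for _, s := range in {
		out = append(out, s)
	}
	return out
}

func main() {
	in := []string{"http://10.0.0.1:2380", "http://10.0.0.2:2380"}
	fmt.Println(len(buggyConvert(in))) // 4: two empty entries plus the two URLs
	fmt.Println(len(fixedConvert(in))) // 2
}
```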
Yicheng Qin c530acf6a4 pkg/netutil: not export resolve and urlsEqual functions
They are only used in this package, so there is no need to export them.
2015-08-21 15:39:38 -07:00
Yicheng Qin bad1b20620 pkg/netutil: fix false negative comparison
Sort the resolved URLs before DeepEqual, so it will not compare URLs
that may be out of order due to resolution.
2015-08-21 15:39:29 -07:00
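The order-insensitive comparison described above might look roughly like the following. It is a sketch under assumptions: `sameURLs` is a hypothetical helper working on plain strings, whereas the real package compares resolved URL values. It sorts copies, in the spirit of the "stop resolving in place" commit, so the callers' slices are not modified.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// sameURLs (hypothetical helper) reports whether a and b contain the same
// URL strings regardless of order. It sorts copies of the inputs so the
// original slices stay untouched.
func sameURLs(a, b []string) bool {
	ac := append([]string(nil), a...)
	bc := append([]string(nil), b...)
	sort.Strings(ac)
	sort.Strings(bc)
	return reflect.DeepEqual(ac, bc)
}

func main() {
	a := []string{"http://10.0.0.2:2380", "http://10.0.0.1:2380"}
	b := []string{"http://10.0.0.1:2380", "http://10.0.0.2:2380"}
	fmt.Println(reflect.DeepEqual(a, b)) // false: same URLs, different order
	fmt.Println(sameURLs(a, b))          // true
}
```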
Yicheng Qin 89640cf08f etcdserver: remove TODO to delete URLStringsEqual
Discovery SRV needs to compare IP addresses with domain names,
so we still need the URLStringsEqual function.
2015-08-21 15:39:19 -07:00
Yicheng Qin bbefb0ad0b Revert "Revert "Treat URLs have same IP address as same""
This reverts commit 3153e635d5.

Conflicts:
	etcdserver/config.go
2015-08-21 15:39:10 -07:00
Xiang Li 8e0706583c etcdmain: print out version information on startup
Conflicts:
	etcdmain/etcd.go
2015-08-21 15:38:34 -07:00
Yicheng Qin cd2a2182cf etcdctl/cluster_health: set health var when checked healthy
This was a typo.
2015-08-21 15:37:26 -07:00
Xiang Li 8d410bdfcb etcdctl: use health endpoint to greatly simplify health checking 2015-08-21 15:22:19 -07:00
Xiang Li 0a2d2b8b9d etcdctl: cluster-health supports forever flag
cluster-health command supports checking the cluster health
forever.
2015-08-21 15:22:13 -07:00
Yicheng Qin 6c9e876d7a etcdctl: refactor the way to check cluster health
This method uses raft status exposed at /debug/varz to determine the
health of the cluster. It uses whether commit index increases to
determine the cluster health, and uses whether match index increases to
determine the member health.

This should fix bug #2711, where an unhealthy follower goes undetected,
because the check no longer relies on whether a message is sent on the
long-polling connection.

This health check is stricter than the old one and reflects whether
followers are healthy from the leader's point of view. For example, a
follower that is receiving a snapshot is reported as unhealthy because it
does not move forward.

`etcdctl cluster-health` reflects health at the raft level, while
connectivity checks reflect health at the transport level.
2015-08-21 15:22:05 -07:00
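The rule in the commit above (commit index progress for the cluster, match index progress for each member) can be sketched as below. The `sample` type, its fields, and the member IDs are hypothetical; the sketch does not read /debug/varz and is not the etcdctl implementation.

```go
package main

import "fmt"

// sample is a hypothetical snapshot of raft progress as seen by the leader.
type sample struct {
	commit uint64            // leader's commit index
	match  map[string]uint64 // match index per member ID
}

// clusterHealthy reports whether the commit index increased between samples.
func clusterHealthy(prev, cur sample) bool {
	return cur.commit > prev.commit
}

// memberHealthy reports whether the member's match index increased.
func memberHealthy(prev, cur sample, id string) bool {
	return cur.match[id] > prev.match[id]
}

func main() {
	prev := sample{commit: 100, match: map[string]uint64{"a": 100, "b": 90}}
	cur := sample{commit: 105, match: map[string]uint64{"a": 105, "b": 90}}
	fmt.Println(clusterHealthy(prev, cur))     // true
	fmt.Println(memberHealthy(prev, cur, "a")) // true
	fmt.Println(memberHealthy(prev, cur, "b")) // false: e.g. stuck receiving a snapshot
}
```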
Xiang Li a845f82d4f etcdctl: health use etcd/client
Conflicts:
	etcdctl/command/cluster_health.go
2015-08-21 15:21:44 -07:00
Xiang Li c1c23626cb raft: downgrade the logging around snapshot to debugf
Snapshot-related logging becomes spam when the leader is trying to
sync a failed peer.

Conflicts:
	raft/raft.go
2015-08-21 15:11:47 -07:00
Xiang Li ac67aa9f63 etcdhttp:write etcderror for all errors in keyhandler 2015-08-21 15:10:01 -07:00
Xiang Li 52c5203370 *: key handler should write auth error as etcd error 2015-08-21 15:09:53 -07:00
Yicheng Qin 27bfb3fcb2 etcdserver: improve error message when timeout due to leader fail 2015-08-21 15:09:46 -07:00
Yicheng Qin 084936a920 etcdserver: specify timeout caused by leader election
Before this PR, the timeout caused by leader election returns:

```
14:45:37 etcd2 | 2015-08-12 14:45:37.786349 E | etcdhttp: got unexpected
response error (etcdserver: request timed out)
```

After this PR:

```
15:52:54 etcd1 | 2015-08-12 15:52:54.389523 E | etcdhttp: etcdserver:
request timed out, possibly due to leader down
```

Conflicts:
	etcdserver/raft.go
2015-08-21 15:09:32 -07:00
Brandon Philips d2ecd9cecf test: race detector doesn't work on armv7l
Test fails without this fix on armv7l:

    go test: -race is only supported on linux/amd64, freebsd/amd64, darwin/amd64 and windows/amd64
2015-08-21 14:59:07 -07:00
Brandon Philips 07b82832f0 etcdserver: move atomics to make etcd work on arm64
Follow the simple rule in the atomic package:

"On both ARM and x86-32, it is the caller's responsibility to arrange
for 64-bit alignment of 64-bit words accessed atomically. The first word
in a global variable or in an allocated struct or slice can be relied
upon to be 64-bit aligned."

Tested on a system with /proc/cpuinfo reporting:

processor       : 0
model name      : ARMv7 Processor rev 1 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee neon vfpv3
tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc0d
CPU revision    : 1
2015-08-21 14:58:59 -07:00
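The alignment rule quoted from the atomic package can be illustrated with a small struct. This is a sketch only: the `stats` type and its fields are hypothetical and are not the etcdserver structs the commit touched.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// stats is a hypothetical struct. The 64-bit counter is the first field, so
// in an allocated struct it can be relied upon to be 64-bit aligned, even on
// 32-bit ARM and x86-32.
type stats struct {
	requests int64 // accessed atomically; keep as the first field
	name     string
	healthy  bool
}

// inc atomically increments the counter and returns the new value.
func (s *stats) inc() int64 {
	return atomic.AddInt64(&s.requests, 1)
}

func main() {
	s := &stats{name: "member-1", healthy: true}
	s.inc()
	fmt.Println(s.inc()) // 2
}
```

Placing the counter after smaller fields such as `healthy` could leave it misaligned on 32-bit platforms, which is the situation the commit avoids by moving the atomics.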
Yicheng Qin 61f4e74652 etcdmain: reject unreasonably high values of -election-timeout
This helps users detect configuration problems early.
2015-08-21 14:58:49 -07:00
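A startup check of this kind could be sketched as follows. The upper bound `maxElectionMs` and the error text are assumptions for illustration; the commit message does not state the actual limit.

```go
package main

import "fmt"

// maxElectionMs is a hypothetical upper bound; the real limit is not given
// in the commit message.
const maxElectionMs = 50000

// checkElectionTimeout rejects values of -election-timeout (in milliseconds)
// above the assumed bound so that a misconfiguration fails at startup.
func checkElectionTimeout(ms int) error {
	if ms > maxElectionMs {
		return fmt.Errorf("-election-timeout[%dms] is too long, and should be set less than %dms", ms, maxElectionMs)
	}
	return nil
}

func main() {
	fmt.Println(checkElectionTimeout(1000))    // <nil>
	fmt.Println(checkElectionTimeout(9000000)) // non-nil error
}
```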
Yicheng Qin 331ecdf8c8 client: return correct error for 50x response
etcd returns a 500 or 503 response when it may have no leader, so any
other 50x response should be reported as an ordinary server error.

This makes the log correct when discovery hits a 504 error. Before this

```
18:31:58 etcd2 | 2015/08/4 18:31:58 discovery: error #0: client: etcd
member https://discovery.etcd.io has no leader
18:31:58 etcd2 | 2015/08/4 18:31:58 discovery: waiting for other nodes:
error connecting to https://discovery.etcd.io, retrying in 4s
```

After this PR:

```
22:20:25 etcd2 | 2015/08/4 22:20:25 discovery: error #0: client: etcd
member https://discovery.etcd.io returns server error [Gateway Timeout]
22:20:25 etcd2 | 2015/08/4 22:20:25 discovery: waiting for other nodes:
error connecting to https://discovery.etcd.io, retrying in 4s
```

Conflicts:
	client/client.go
2015-08-21 14:55:35 -07:00
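The status-code distinction made by this commit can be sketched as below. The function `errorForStatus`, the sentinel `errNoLeader`, and the exact message formats are hypothetical and only loosely modeled on the log lines above; this is not the code in client/client.go.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errNoLeader is a hypothetical sentinel for the "may have no leader" case.
var errNoLeader = errors.New("client: etcd member has no leader")

// errorForStatus maps an HTTP status code to an error. Only 500 and 503 are
// treated as "no leader"; every other 5xx code is reported with its standard
// status text.
func errorForStatus(code int) error {
	switch {
	case code == http.StatusInternalServerError || code == http.StatusServiceUnavailable:
		return errNoLeader
	case code >= 500 && code <= 599:
		return fmt.Errorf("client: etcd member returns server error [%s]", http.StatusText(code))
	}
	return nil
}

func main() {
	fmt.Println(errorForStatus(503))
	fmt.Println(errorForStatus(504))
	fmt.Println(errorForStatus(200))
}
```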
Xiang Li 3a346eac25 discovery: print out detailed cluster error
Conflicts:
	discovery/discovery.go
2015-08-21 14:54:36 -07:00
Xiang Li 97605046c1 client: return cluster error if the etcd cluster is not available
Add a new ClusterError type. It contains all encountered errors and
returns ClusterNotAvailable as the error string.

Conflicts:
	client/client.go
	discovery/discovery.go
2015-08-21 14:51:41 -07:00
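An error type that collects per-endpoint failures could look roughly like this. Only the name ClusterError comes from the commit message; the `Errors` field, the `Detail` method, the message text, and the `tryEndpoints` helper are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// clusterNotAvailable is a hypothetical error string.
const clusterNotAvailable = "client: etcd cluster is not available"

// errRefused is a hypothetical per-endpoint failure used in the example.
var errRefused = errors.New("connection refused")

// ClusterError holds every error encountered while trying the endpoints.
type ClusterError struct {
	Errors []error
}

// Error returns the fixed summary string.
func (ce *ClusterError) Error() string { return clusterNotAvailable }

// Detail lists each collected error on its own line.
func (ce *ClusterError) Detail() string {
	s := ""
	for i, e := range ce.Errors {
		s += fmt.Sprintf("error #%d: %s\n", i, e)
	}
	return s
}

// tryEndpoints calls do for each endpoint and returns nil on the first
// success. If every endpoint fails, it returns a *ClusterError.
func tryEndpoints(endpoints []string, do func(string) error) error {
	ce := &ClusterError{}
	for _, ep := range endpoints {
		err := do(ep)
		if err == nil {
			return nil
		}
		ce.Errors = append(ce.Errors, err)
	}
	return ce
}

func main() {
	eps := []string{"http://10.0.0.1:2379", "http://10.0.0.2:2379"}
	err := tryEndpoints(eps, func(string) error { return errRefused })
	fmt.Println(err)
	if ce, ok := err.(*ClusterError); ok {
		fmt.Print(ce.Detail())
	}
}
```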
Guohua Ouyang 41ecf7f722 etcdmain: Don't print flags when flag parse error
At present it prints the whole usage and flags, which hides the actual
error message two screens above.

Fixes #3141

Signed-off-by: Guohua Ouyang <gouyang@redhat.com>
2015-08-21 14:32:49 -07:00
Xiang Li fcd564efb8 etcdmain: fix initialization conflict
Fix #3142

Ignore flags if etcd is already initialized.
2015-08-21 14:32:41 -07:00
Yicheng Qin 0876c5e1ef etcdmain: warn when listening on HTTP if TLS is set
If the user sets TLS info, this implies that they want to listen on TLS.
If etcd finds that a listen URL still uses the HTTP scheme, it prints a
warning to notify the user about a possibly wrong setting.
2015-08-21 14:32:31 -07:00
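The check described above might be sketched like this. The `tlsInfo` type, the `httpWarnings` function, and the warning text are hypothetical stand-ins, not etcdmain's actual types or messages.

```go
package main

import (
	"fmt"
	"net/url"
)

// tlsInfo is a hypothetical stand-in for the user's TLS settings.
type tlsInfo struct {
	CertFile string
	KeyFile  string
}

// empty reports whether no TLS setting was provided.
func (t tlsInfo) empty() bool { return t.CertFile == "" && t.KeyFile == "" }

// httpWarnings returns one warning per listen URL that still uses the http
// scheme although TLS info is set. It returns no warnings when TLS is unset.
func httpWarnings(listenURLs []string, t tlsInfo) ([]string, error) {
	if t.empty() {
		return nil, nil
	}
	var warns []string
	for _, raw := range listenURLs {
		u, err := url.Parse(raw)
		if err != nil {
			return nil, err
		}
		if u.Scheme == "http" {
			warns = append(warns, fmt.Sprintf("TLS info is set but %s uses the http scheme", raw))
		}
	}
	return warns, nil
}

func main() {
	urls := []string{"http://127.0.0.1:2379", "https://127.0.0.1:2379"}
	warns, err := httpWarnings(urls, tlsInfo{CertFile: "server.crt", KeyFile: "server.key"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, w := range warns {
		fmt.Println("warning:", w)
	}
}
```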
Yicheng Qin ef80bb5cbf pkg/transport: fix HTTPS downgrade bug for keepalive listener
If the TLS config is empty, etcd downgrades the keepalive listener from HTTPS to
HTTP without warning. This results in an HTTPS downgrade bug for client URLs.
This commit returns an error if it cannot listen on TLS.
2015-08-21 14:32:18 -07:00
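The fail-instead-of-downgrade behavior can be sketched as below. This is an assumption-laden illustration, not the pkg/transport code: `newListener` and `errNoTLSConfig` are hypothetical names, and TCP keepalive handling is left out.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"net"
)

// errNoTLSConfig is a hypothetical error for an https listener without TLS config.
var errNoTLSConfig = errors.New("cannot listen on TLS: TLS config is empty")

// newListener returns an error for the https scheme when cfg is nil instead
// of silently falling back to a plain TCP listener.
func newListener(addr, scheme string, cfg *tls.Config) (net.Listener, error) {
	if scheme == "https" && cfg == nil {
		return nil, errNoTLSConfig
	}
	l, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	if scheme == "https" {
		return tls.NewListener(l, cfg), nil
	}
	return l, nil
}

func main() {
	_, err := newListener("127.0.0.1:0", "https", nil)
	fmt.Println(err)
}
```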
2122 changed files with 219443 additions and 245201 deletions

@ -1,23 +0,0 @@
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/go
{
  "name": "Go",
  // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
  "image": "mcr.microsoft.com/devcontainers/go:1.19-bullseye",
  // Features to add to the dev container. More info: https://containers.dev/features.
  "features": {
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    "ghcr.io/devcontainers/features/github-cli:1": {}
  },
  // Use 'forwardPorts' to make a list of ports inside the container available locally.
  "forwardPorts": [2379, 2380],
  // Use 'postCreateCommand' to run commands after the container is created.
  "postCreateCommand": "make build"
  // Configure tool-specific properties.
  // "customizations": {},
}

1
.dockerignore Normal file
@ -0,0 +1 @@
.git

@ -1,102 +0,0 @@
---
name: Bug Report
description: Report a bug encountered while operating etcd
labels:
- type/bug
body:
- type: checkboxes
id: confirmations
attributes:
label: Bug report criteria
description: Please confirm this bug report meets the following criteria.
options:
- label: This bug report is not security related, security issues should be disclosed privately via security@etcd.io.
- label: This is not a support request, support requests should be raised in the etcd [discussion forums](https://github.com/etcd-io/etcd/discussions).
- label: You have read the etcd [bug reporting guidelines](https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/reporting_bugs.md).
- label: Existing open issues along with etcd [frequently asked questions](https://etcd.io/docs/latest/faq) have been checked and this is not a duplicate.
- type: markdown
attributes:
value: |
Please fill the form below and provide as much information as possible.
Not doing so may result in your bug not being addressed in a timely manner.
- type: textarea
id: problem
attributes:
label: What happened?
validations:
required: true
- type: textarea
id: expected
attributes:
label: What did you expect to happen?
validations:
required: true
- type: textarea
id: repro
attributes:
label: How can we reproduce it (as minimally and precisely as possible)?
validations:
required: true
- type: textarea
id: additional
attributes:
label: Anything else we need to know?
- type: textarea
id: etcdVersion
attributes:
label: Etcd version (please run commands below)
value: |
<details>
```console
$ etcd --version
# paste output here
$ etcdctl version
# paste output here
```
</details>
validations:
required: true
- type: textarea
id: config
attributes:
label: Etcd configuration (command line flags or environment variables)
value: |
<details>
# paste your configuration here
</details>
- type: textarea
id: etcdDebugInformation
attributes:
label: Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
value: |
<details>
```console
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
```
</details>
- type: textarea
id: logs
attributes:
label: Relevant log output
description: Please copy and paste any relevant log output. This will be automatically formatted into code, so no need for backticks.
render: Shell

@ -1,6 +0,0 @@
---
blank_issues_enabled: false
contact_links:
  - name: Question
    url: https://github.com/etcd-io/etcd/discussions
    about: Question relating to Etcd

@ -1,19 +0,0 @@
---
name: Feature request
description: Provide idea for a new feature
labels:
  - type/feature
body:
  - type: textarea
    id: feature
    attributes:
      label: What would you like to be added?
    validations:
      required: true
  - type: textarea
    id: rationale
    attributes:
      label: Why is this needed?
    validations:
      required: true

@ -1,31 +0,0 @@
---
name: Membership nomination
description: Nominate new etcd members
labels:
- area/community
body:
- type: textarea
id: feature
attributes:
label: Who would you like to nominate?
validations:
required: true
- id: requirements
type: checkboxes
attributes:
label: Requirements
options:
- label: I have reviewed the [community membership guidelines](https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/community-membership.md)
required: true
- label: The members are actively contributing to 1 or more etcd subprojects
required: true
- label: The members are being sponsored by two current reviewers or a current maintainer.
required: true
- type: textarea
id: rationale
attributes:
label: How do the new members meet the regular active contribution requirements?
validations:
required: true

@ -1,34 +0,0 @@
---
name: Flaking Test
description: Report flaky tests
labels:
- type/flake
body:
- type: textarea
id: workflows
attributes:
label: Which github workflows are flaking?
validations:
required: true
- type: textarea
id: tests
attributes:
label: Which tests are flaking?
validations:
required: true
- type: input
id: link
attributes:
label: Github Action link
- type: textarea
id: reason
attributes:
label: Reason for failure (if possible)
- type: textarea
id: additional
attributes:
label: Anything else we need to know?

@ -1,2 +0,0 @@
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

2
.github/SECURITY.md vendored
@ -1,2 +0,0 @@
Please read https://github.com/etcd-io/etcd/blob/main/security/README.md.

@ -1,21 +0,0 @@
---
version: 2
updates:
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
  - package-ecosystem: gomod
    directory: /
    schedule:
      interval: weekly
    allow:
      - dependency-type: all
  - package-ecosystem: gomod
    directory: /tools/mod # Not linked from /go.mod
    schedule:
      interval: weekly
    allow:
      - dependency-type: all

56
.github/stale.yml vendored
@ -1,56 +0,0 @@
---
# Configuration for probot-stale - https://github.com/probot/stale
# Number of days of inactivity before an Issue or Pull Request becomes stale
daysUntilStale: 90
# Number of days of inactivity before an Issue or Pull Request with the stale label is closed.
# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale.
daysUntilClose: 21
# Only issues or pull requests with all of these labels are check if stale. Defaults to `[]` (disabled)
onlyLabels: []
# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable
exemptLabels:
- "stage/tracked"
# Set to true to ignore issues in a project (defaults to false)
exemptProjects: false
# Set to true to ignore issues in a milestone (defaults to false)
exemptMilestones: false
# Set to true to ignore issues with an assignee (defaults to false)
exemptAssignees: false
# Label to use when marking as stale
staleLabel: stale
# Comment to post when marking as stale. Set to `false` to disable
markComment: This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
# Comment to post when removing the stale label.
# unmarkComment: >
# Your comment here.
# Comment to post when closing a stale Issue or Pull Request.
# closeComment: >
# Your comment here.
# Limit the number of actions per hour, from 1-30. Default is 30
limitPerRun: 30
# Limit to only `issues` or `pulls`
# only: issues
# Optionally, specify configuration settings that are specific to just 'issues' or 'pulls':
# pulls:
# daysUntilStale: 30
# markComment: >
# This pull request has been automatically marked as stale because it has not had
# recent activity. It will be closed if no further activity occurs. Thank you
# for your contributions.
# issues:
# exemptLabels:
# - confirmed

@ -1,67 +0,0 @@
---
name: Build
on: [push, pull_request]
permissions: read-all
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        target:
          - linux-amd64
          - linux-386
          - darwin-amd64
          - darwin-arm64
          - windows-amd64
          - linux-arm
          - linux-arm64
          - linux-ppc64le
          - linux-s390x
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - id: goversion
        run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
      - uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
        with:
          go-version: ${{ steps.goversion.outputs.goversion }}
      - env:
          TARGET: ${{ matrix.target }}
        run: |
          set -euo pipefail
          echo "${TARGET}"
          case "${TARGET}" in
            linux-amd64)
              GOOS=linux GOARCH=amd64 make build
              ;;
            linux-386)
              GOOS=linux GOARCH=386 make build
              ;;
            darwin-amd64)
              GOOS=darwin GOARCH=amd64 make build
              ;;
            darwin-arm64)
              GOOS=darwin GOARCH=arm64 make build
              ;;
            windows-amd64)
              GOOS=windows GOARCH=amd64 make build
              ;;
            linux-arm)
              GOOS=linux GOARCH=arm make build
              ;;
            linux-arm64)
              GOOS=linux GOARCH=arm64 make build
              ;;
            linux-ppc64le)
              GOOS=linux GOARCH=ppc64le make build
              ;;
            linux-s390x)
              GOOS=linux GOARCH=s390x make build
              ;;
            *)
              echo "Failed to find target"
              exit 1
              ;;
          esac

@ -1,55 +0,0 @@
---
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"
on:
push:
branches: [main, release-3.4, release-3.5, release-3.6]
pull_request:
# The branches below must be a subset of the branches above
branches: [main]
schedule:
- cron: '20 14 * * 5'
permissions: read-all
jobs:
analyze:
name: Analyze
runs-on: ubuntu-latest
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python' ]
# Learn more:
# https://docs.github.com/en/free-pro-team@latest/github/finding-security-vulnerabilities-and-errors-in-your-code/configuring-code-scanning#changing-the-languages-that-are-analyzed
language: ['go']
steps:
- name: Checkout repository
uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@6c089f53dd51dc3fc7e599c3cb5356453a52ca9e # v2.20.0
with:
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# queries: ./path/to/local/query, your-org/your-repo/queries@main
languages: ${{ matrix.language }}
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@6c089f53dd51dc3fc7e599c3cb5356453a52ca9e # v2.20.0
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@6c089f53dd51dc3fc7e599c3cb5356453a52ca9e # v2.20.0

@ -1,18 +0,0 @@
---
name: Test contrib/mixin
on: [push, pull_request]
permissions: read-all
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - id: goversion
        run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
      - uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
        with:
          go-version: ${{ steps.goversion.outputs.goversion }}
      - run: |
          set -euo pipefail
          make -C contrib/mixin tools test

@ -1,32 +0,0 @@
---
name: Coverage
on: [push]
permissions: read-all
jobs:
  coverage:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        target:
          - linux-amd64-coverage
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - id: goversion
        run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
      - uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
        with:
          go-version: ${{ steps.goversion.outputs.goversion }}
      - env:
          TARGET: ${{ matrix.target }}
        run: |
          mkdir "${TARGET}"
          case "${TARGET}" in
            linux-amd64-coverage)
              GOARCH=amd64 ./scripts/codecov_upload.sh
              ;;
            *)
              echo "Failed to find target"
              exit 1
              ;;
          esac

@ -1,45 +0,0 @@
---
name: E2E-arm64
on:
schedule:
- cron: '0 1 * * *' # runs daily at 1am.
permissions: read-all
jobs:
test:
# this is to prevent the job to run at forked projects
if: github.repository == 'etcd-io/etcd'
runs-on: [self-hosted, Linux, ARM64]
container: golang:1.19-bullseye
defaults:
run:
shell: bash
strategy:
fail-fast: true
matrix:
target:
- linux-arm64-e2e
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
# https://github.com/actions/checkout/issues/1169
- run: git config --system --add safe.directory '*'
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
go clean -testcache
echo "${TARGET}"
case "${TARGET}" in
linux-arm64-e2e)
GOOS=linux GOARCH=arm64 CPU=4 EXPECT_DEBUG=true RACE=true make test-e2e-release
;;
*)
echo "Failed to find target"
exit 1
;;
esac

@ -1,39 +0,0 @@
---
name: E2E
on: [push, pull_request]
permissions: read-all
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: true
      matrix:
        target:
          - linux-amd64-e2e
          - linux-386-e2e
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - id: goversion
        run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
      - uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
        with:
          go-version: ${{ steps.goversion.outputs.goversion }}
      - env:
          TARGET: ${{ matrix.target }}
        run: |
          set -euo pipefail
          go clean -testcache
          echo "${TARGET}"
          case "${TARGET}" in
            linux-amd64-e2e)
              VERBOSE=1 GOOS=linux GOARCH=amd64 CPU=4 EXPECT_DEBUG=true RACE=true make test-e2e-release
              ;;
            linux-386-e2e)
              VERBOSE=1 GOOS=linux GOARCH=386 CPU=4 EXPECT_DEBUG=true RACE=true make test-e2e
              ;;
            *)
              echo "Failed to find target"
              exit 1
              ;;
          esac

@ -1,26 +0,0 @@
---
name: Fuzzing v3rpc
on: [push, pull_request]
permissions: read-all
jobs:
  fuzzing:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
    env:
      TARGET_PATH: ./server/etcdserver/api/v3rpc
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - id: goversion
        run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
      - uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
        with:
          go-version: ${{ steps.goversion.outputs.goversion }}
      - run: |
          set -euo pipefail
          GOARCH=amd64 CPU=4 make fuzz
      - uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
        if: failure()
        with:
          path: "${{env.TARGET_PATH}}/testdata/fuzz/**/*"

@ -1,19 +0,0 @@
---
name: Go Vulnerability Checker
on: [push, pull_request]
permissions: read-all
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - id: goversion
        run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
      - uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
        with:
          go-version: ${{ steps.goversion.outputs.goversion }}
      - run: date
      - run: |
          set -euo pipefail
          go install golang.org/x/vuln/cmd/govulncheck@latest && govulncheck ./...

@ -1,38 +0,0 @@
---
name: grpcProxy-tests
on: [push, pull_request]
permissions: read-all
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: true
      matrix:
        target:
          - linux-amd64-grpcproxy-integration
          - linux-amd64-grpcproxy-e2e
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - id: goversion
        run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
      - uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
        with:
          go-version: ${{ steps.goversion.outputs.goversion }}
      - env:
          TARGET: ${{ matrix.target }}
        run: |
          set -euo pipefail
          echo "${TARGET}"
          case "${TARGET}" in
            linux-amd64-grpcproxy-integration)
              GOOS=linux GOARCH=amd64 CPU=4 RACE=true make test-grpcproxy-integration
              ;;
            linux-amd64-grpcproxy-e2e)
              GOOS=linux GOARCH=amd64 CPU=4 RACE=true make test-grpcproxy-e2e
              ;;
            *)
              echo "Failed to find target"
              exit 1
              ;;
          esac

@ -1,23 +0,0 @@
---
name: Measure Test Flakiness
on:
  schedule:
    - cron: "0 0 * * 0" # run every Sunday at midnight
permissions: read-all
jobs:
  measure-test-flakiness:
    name: Measure Test Flakiness
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          set -euo pipefail
          ./scripts/measure-test-flakiness.sh
          make bin/etcd-test-analyzer
          bin/etcd-test-analyzer run -token $GITHUB_TOKEN -max-age=168h -workflow Tests -branch main

@ -1,34 +0,0 @@
---
name: Release
on: [push, pull_request]
permissions: read-all
jobs:
main:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: release
run: |
set -euo pipefail
git config --global user.email "github-action@etcd.io"
git config --global user.name "Github Action"
gpg --batch --gen-key <<EOF
%no-protection
Key-Type: 1
Key-Length: 2048
Subkey-Type: 1
Subkey-Length: 2048
Name-Real: Github Action
Name-Email: github-action@etcd.io
Expire-Date: 0
EOF
DRY_RUN=true ./scripts/release.sh --no-upload --no-docker-push --in-place 3.6.99
- name: test-image
run: |
VERSION=3.6.99 ./scripts/test_images.sh

@ -1,39 +0,0 @@
---
name: Robustness Nightly
permissions: read-all
on:
  # schedules always run against the main branch, hence we have to create separate jobs
  # with individual checkout actions for each of the active release branches
  schedule:
    - cron: '25 9 * * *' # runs every day at 09:25 UTC
jobs:
  main:
    # GHA has a maximum amount of 6h execution time, we try to get done within 3h
    uses: ./.github/workflows/robustness-template.yaml
    with:
      etcdBranch: main
      count: 100
      testTimeout: 200m
      artifactName: main
  main-arm64:
    uses: ./.github/workflows/robustness-template-arm64.yaml
    with:
      etcdBranch: main
      count: 100
      testTimeout: 200m
      artifactName: main-arm64
      runs-on: "['self-hosted', 'Linux', 'ARM64']"
  release-35:
    uses: ./.github/workflows/robustness-template.yaml
    with:
      etcdBranch: release-3.5
      count: 100
      testTimeout: 200m
      artifactName: release-35
  release-34:
    uses: ./.github/workflows/robustness-template.yaml
    with:
      etcdBranch: release-3.4
      count: 100
      testTimeout: 200m
      artifactName: release-34

@ -1,72 +0,0 @@
---
name: Reusable Robustness Workflow
on:
workflow_call:
inputs:
etcdBranch:
required: true
type: string
count:
required: true
type: number
testTimeout:
required: false
type: string
default: '30m'
artifactName:
required: true
type: string
runs-on:
required: false
type: string
default: "['ubuntu-latest']"
permissions: read-all
jobs:
test:
timeout-minutes: 210
runs-on: ${{ fromJson(inputs.runs-on) }}
container: golang:1.19-bullseye
defaults:
run:
shell: bash
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
# https://github.com/actions/checkout/issues/1169
- run: git config --system --add safe.directory '*'
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: test-robustness
env:
ETCD_BRANCH: "${{ inputs.etcdBranch }}"
run: |
set -euo pipefail
go clean -testcache
# Use --failfast to avoid overriding report generated by failed test
GO_TEST_FLAGS="-v --count ${{ inputs.count }} --timeout ${{ inputs.testTimeout }} --failfast --run TestRobustness"
case "${ETCD_BRANCH}" in
release-3.5)
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness-release-3.5
;;
release-3.4)
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness-release-3.4
;;
main)
make gofail-enable
make build
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness
;;
*)
echo "Failed to find target ${ETCD_BRANCH}"
exit 1
;;
esac
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce
if: always()
with:
name: ${{ inputs.artifactName }}
path: /tmp/results/*

@ -1,65 +0,0 @@
---
name: Reusable Robustness Workflow
on:
workflow_call:
inputs:
etcdBranch:
required: true
type: string
count:
required: true
type: number
testTimeout:
required: false
type: string
default: '30m'
artifactName:
required: true
type: string
runs-on:
required: false
type: string
default: "['ubuntu-latest']"
permissions: read-all
jobs:
test:
timeout-minutes: 210
runs-on: ${{ fromJson(inputs.runs-on) }}
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- name: test-robustness
env:
ETCD_BRANCH: "${{ inputs.etcdBranch }}"
run: |
set -euo pipefail
go clean -testcache
# Use --failfast to avoid overriding report generated by failed test
GO_TEST_FLAGS="-v --count ${{ inputs.count }} --timeout ${{ inputs.testTimeout }} --failfast --run TestRobustness"
case "${ETCD_BRANCH}" in
release-3.5)
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness-release-3.5
;;
release-3.4)
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness-release-3.4
;;
main)
make gofail-enable
make build
EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results make test-robustness
;;
*)
echo "Failed to find target ${ETCD_BRANCH}"
exit 1
;;
esac
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce
if: always()
with:
name: ${{ inputs.artifactName }}
path: /tmp/results/*

@ -1,12 +0,0 @@
---
name: Robustness
on: [push, pull_request]
permissions: read-all
jobs:
  main:
    uses: ./.github/workflows/robustness-template.yaml
    with:
      etcdBranch: main
      count: 15
      testTimeout: 30m
      artifactName: main

@ -1,55 +0,0 @@
---
name: Scorecards supply-chain security
on:
# Only the default branch is supported.
branch_protection_rule:
schedule:
- cron: '45 1 * * 0'
push:
branches: ["main"]
# Declare default permissions as read only.
permissions: read-all
jobs:
analysis:
name: Scorecards analysis
runs-on: ubuntu-latest
permissions:
# Needed to upload the results to code-scanning dashboard.
security-events: write
# Used to receive a badge.
id-token: write
steps:
- name: "Checkout code"
uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # tag=v3.0.0
with:
persist-credentials: false
- name: "Run analysis"
uses: ossf/scorecard-action@80e868c13c90f172d68d1f4501dee99e2479f7af # tag=v2.1.3
with:
results_file: results.sarif
results_format: sarif
# Publish the results for public repositories to enable scorecard badges. For more details, see
# https://github.com/ossf/scorecard-action#publishing-results.
# For private repositories, `publish_results` will automatically be set to `false`, regardless
# of the value entered here.
publish_results: true
# Upload the results as artifacts (optional). Commenting out will disable uploads of run results in SARIF
# format to the repository Actions tab.
- name: "Upload artifact"
uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # tag=v3.0.0
with:
name: SARIF file
path: results.sarif
retention-days: 5
# Upload the results to GitHub's code scanning dashboard.
- name: "Upload to code-scanning"
uses: github/codeql-action/upload-sarif@6c089f53dd51dc3fc7e599c3cb5356453a52ca9e # tag=v1.0.26
with:
sarif_file: results.sarif

@ -1,32 +0,0 @@
---
name: Static Analysis
on: [push, pull_request]
permissions: read-all
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - id: goversion
        run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
      - uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
        with:
          go-version: ${{ steps.goversion.outputs.goversion }}
      - name: golangci-lint
        uses: golangci/golangci-lint-action@639cd343e1d3b897ff35927a75193d57cfcba299 # v3.6.0
        with:
          version: v1.49.0
          args: --config tools/.golangci.yaml
      - name: protoc
        uses: arduino/setup-protoc@149f6c87b92550901b26acd1632e11c3662e381f # v1.3.0
        with:
          version: '3.14.0'
          repo-token: ${{ secrets.GITHUB_TOKEN }}
      - run: |
          set -euo pipefail
          make verify
      - run: |
          set -euo pipefail
          make fix


@ -1,62 +0,0 @@
---
name: Tests-arm64
on:
schedule:
- cron: '30 1 * * *' # runs daily at 1:30 am.
permissions: read-all
jobs:
test:
# this prevents the job from running on forked projects
if: github.repository == 'etcd-io/etcd'
runs-on: [self-hosted, Linux, ARM64]
container: golang:1.19-bullseye
defaults:
run:
shell: bash
strategy:
fail-fast: false
matrix:
target:
- linux-arm64-integration-1-cpu
- linux-arm64-integration-2-cpu
- linux-arm64-integration-4-cpu
- linux-arm64-unit-4-cpu-race
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
# https://github.com/actions/checkout/issues/1169
- run: git config --system --add safe.directory '*'
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
go clean -testcache
mkdir "${TARGET}"
export JUNIT_REPORT_DIR=$(realpath ${TARGET})
case "${TARGET}" in
linux-arm64-integration-1-cpu)
GOOS=linux GOARCH=arm64 CPU=1 make test-integration
;;
linux-arm64-integration-2-cpu)
GOOS=linux GOARCH=arm64 CPU=2 make test-integration
;;
linux-arm64-integration-4-cpu)
GOOS=linux GOARCH=arm64 CPU=4 make test-integration
;;
linux-arm64-unit-4-cpu-race)
GOOS=linux GOARCH=arm64 CPU=4 RACE=true GO_TEST_FLAGS='-p=2' make test-unit
;;
*)
echo "Failed to find target"
exit 1
;;
esac
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
if: always()
with:
path: ./**/junit_*.xml


@ -1,56 +0,0 @@
---
name: Tests
on: [push, pull_request]
permissions: read-all
jobs:
test:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
target:
- linux-amd64-integration-1-cpu
- linux-amd64-integration-2-cpu
- linux-amd64-integration-4-cpu
- linux-amd64-unit-4-cpu-race
- linux-386-unit-1-cpu
steps:
- uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
- id: goversion
run: echo "goversion=$(cat .go-version)" >> "$GITHUB_OUTPUT"
- uses: actions/setup-go@fac708d6674e30b6ba41289acaab6d4b75aa0753 # v4.0.1
with:
go-version: ${{ steps.goversion.outputs.goversion }}
- env:
TARGET: ${{ matrix.target }}
run: |
set -euo pipefail
go clean -testcache
mkdir "${TARGET}"
export JUNIT_REPORT_DIR=$(realpath ${TARGET})
case "${TARGET}" in
linux-amd64-integration-1-cpu)
GOOS=linux GOARCH=amd64 CPU=1 make test-integration
;;
linux-amd64-integration-2-cpu)
GOOS=linux GOARCH=amd64 CPU=2 make test-integration
;;
linux-amd64-integration-4-cpu)
GOOS=linux GOARCH=amd64 CPU=4 make test-integration
;;
linux-amd64-unit-4-cpu-race)
GOOS=linux GOARCH=amd64 CPU=4 RACE=true GO_TEST_FLAGS='-p=2' make test-unit
;;
linux-386-unit-1-cpu)
GOOS=linux GOARCH=386 CPU=1 GO_TEST_FLAGS='-p=4' make test-unit
;;
*)
echo "Failed to find target"
exit 1
;;
esac
- uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce # v3.1.2
if: always()
with:
path: ./**/junit_*.xml

34
.gitignore vendored

@ -1,37 +1,11 @@
/agent-*
/coverage
/covdir
/gopath
/gopath.proto
/release
/go-bindata
/machine*
/bin
.vagrant
*.etcd
*.log
*.swp
/etcd
*.swp
/hack/insta-discovery/.env
*.coverprofile
*.test
hack/tls-setup/certs
.idea
*.iml
/contrib/mixin/manifests
/contrib/raftexample/raftexample
/contrib/raftexample/raftexample-*
/vendor
/tests/e2e/default.proxy
*.tmp
*.bak
.gobincache/
.DS_Store
/Documentation/dev-guide/api_reference_v3.md
/Documentation/dev-guide/api_concurrency_reference_v3.md
/tools/etcd-dump-db/etcd-dump-db
/tools/etcd-dump-logs/etcd-dump-logs
/tools/etcd-dump-metrics/etcd-dump-metrics
/tools/local-tester/bridge/bridge
/tools/proto-annotations/proto-annotations
/tools/benchmark/benchmark
/out
/etcd-dump-logs


@ -1 +0,0 @@
1.19.10

1
.godir Normal file

@ -0,0 +1 @@
github.com/coreos/etcd


@ -1,4 +1,4 @@
// Copyright 2016 The etcd Authors
// Copyright 2014 CoreOS, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.

10
.travis.yml Normal file

@ -0,0 +1,10 @@
language: go
sudo: false
go:
- 1.4
install:
- go get github.com/barakmich/go-nyet
script:
- INTEGRATION=y ./test


@ -1,250 +0,0 @@
---
title: Production users
---
This document tracks people and use cases for etcd in production. By creating a list of production use cases we hope to build a community of advisors that we can reach out to with experience using various etcd applications, operation environments, and cluster sizes. The etcd development team may reach out periodically to check-in on how etcd is working in the field and update this list.
## All Kubernetes Users
- *Application*: https://kubernetes.io/
- *Environments*: AWS, OpenStack, Azure, Google Cloud, Huawei Cloud, Bare Metal, etc
**This is a meta user; please feel free to document specific Kubernetes clusters!**
All Kubernetes clusters use etcd as their primary data store. This means etcd's users include such companies as [Niantic, Inc Pokemon Go](https://cloudplatform.googleblog.com/2016/09/bringing-Pokemon-GO-to-life-on-Google-Cloud.html), [Box](https://blog.box.com/blog/kubernetes-box-microservices-maximum-velocity/), [CoreOS](https://coreos.com/tectonic), [Ticketmaster](https://www.youtube.com/watch?v=wqXVKneP0Hg), [Salesforce](https://www.salesforce.com) and many many more.
## discovery.etcd.io
- *Application*: https://github.com/coreos/discovery.etcd.io
- *Launched*: Feb. 2014
- *Cluster Size*: 5 members, 5 discovery proxies
- *Order of Data Size*: 100s of Megabytes
- *Operator*: CoreOS, brandon.philips@coreos.com
- *Environment*: AWS
- *Backups*: Periodic async to S3
discovery.etcd.io is the longest continuously running etcd backed service that we know about. It is the basis of automatic cluster bootstrap and was launched in Feb. 2014: https://coreos.com/blog/etcd-0.3.0-released/.
## OpenTable
- *Application*: OpenTable internal service discovery and cluster configuration management
- *Launched*: May 2014
- *Cluster Size*: 3 members each in 6 independent clusters; approximately 50 nodes reading / writing
- *Order of Data Size*: 10s of MB
- *Operator*: OpenTable, Inc; sschlansker@opentable.com
- *Environment*: AWS, VMWare
- *Backups*: None, all data can be re-created if necessary.
## cycoresys.com
- *Application*: multiple
- *Launched*: Jul. 2014
- *Cluster Size*: 3 members, _n_ proxies
- *Order of Data Size*: 100s of kilobytes
- *Operator*: CyCore Systems, Inc, sys@cycoresys.com
- *Environment*: Baremetal
- *Backups*: Periodic sync to Ceph RadosGW and DigitalOcean VM
CyCore Systems provides architecture and engineering for computing systems. This cluster provides microservices, virtual machines, databases, storage clusters to a number of clients. It is built on CoreOS machines, with each machine in the cluster running etcd as a peer or proxy.
## Radius Intelligence
- *Application*: multiple internal tools, Kubernetes clusters, bootstrappable system configs
- *Launched*: June 2015
- *Cluster Size*: 2 clusters of 5 and 3 members; approximately a dozen nodes read/write
- *Order of Data Size*: 100s of kilobytes
- *Operator*: Radius Intelligence; jcderr@radius.com
- *Environment*: AWS, CoreOS, Kubernetes
- *Backups*: None, all data can be recreated if necessary.
Radius Intelligence uses Kubernetes running CoreOS to containerize and scale internal toolsets. Examples include running [JetBrains TeamCity][teamcity] and internal AWS security and cost reporting tools. etcd clusters back these clusters as well as provide some basic environment bootstrapping configuration keys.
## Vonage
- *Application*: kubernetes, vault backend, system configuration for microservices, scheduling, locks (future - service discovery)
- *Launched*: August 2015
- *Cluster Size*: 2 clusters of 5 members in 2 DCs, n local proxies 1-to-1 with microservice, (ssl and SRV look up)
- *Order of Data Size*: kilobytes
- *Operator*: Vonage [devAdmin][raoofm]
- *Environment*: VMWare, AWS
- *Backups*: Daily snapshots on VMs. Backups done for upgrades.
## PD
- *Application*: embed etcd
- *Launched*: Mar 2016
- *Cluster Size*: 3 or 5 members
- *Order of Data Size*: megabytes
- *Operator*: PingCAP, Inc.
- *Environment*: Bare Metal, AWS, etc.
- *Backups*: None.
PD (Placement Driver) is the central controller in the TiDB cluster. It saves the cluster meta information, schedules the data, allocates the globally unique timestamps for distributed transactions, etc. It embeds etcd to supply high availability and auto failover.
## Huawei
- *Application*: System configuration for overlay network (Canal)
- *Launched*: June 2016
- *Cluster Size*: 3 members for each cluster
- *Order of Data Size*: kilobytes
- *Operator*: Huawei Euler Department
- *Environment*: [Huawei Cloud](http://www.hwclouds.com/product/cce.html)
- *Backups*: None, all data can be recreated if necessary.
[teamcity]: https://www.jetbrains.com/teamcity/
[raoofm]:https://github.com/raoofm
## Qiniu Cloud
- *Application*: system configuration for microservices, distributed locks
- *Launched*: Jan. 2016
- *Cluster Size*: 3 members each with several clusters
- *Order of Data Size*: kilobytes
- *Operator*: Pandora, chenchao@qiniu.com
- *Environment*: Baremetal
- *Backups*: None, all data can be recreated if necessary
## QingCloud
- *Application*: [QingCloud][qingcloud] appcenter cluster for service discovery as [metad][metad] backend.
- *Launched*: December 2016
- *Cluster Size*: 1 cluster of 3 members per user.
- *Order of Data Size*: kilobytes
- *Operator*: [yunify][yunify]
- *Environment*: QingCloud IaaS
- *Backups*: None, all data can be recreated if necessary.
[metad]:https://github.com/yunify/metad
[yunify]:https://github.com/yunify
[qingcloud]:https://qingcloud.com/
## Yandex
- *Application*: system configuration for services, service discovery
- *Launched*: March 2016
- *Cluster Size*: 3 clusters of 5 members
- *Order of Data Size*: several gigabytes
- *Operator*: Yandex; [nekto0n][nekto0n]
- *Environment*: Bare Metal
- *Backups*: None
[nekto0n]:https://github.com/nekto0n
## Tencent Games
- *Application*: Meta data and configuration data for service discovery, Kubernetes, etc.
- *Launched*: Jan. 2015
- *Cluster Size*: 3 members each with 10s of clusters
- *Order of Data Size*: 10s of Megabytes
- *Operator*: Tencent Game Operations Department
- *Environment*: Baremetal
- *Backups*: Periodic sync to backup server
In Tencent games, we use Docker and Kubernetes to deploy and run our applications, and use etcd to save meta data for service discovery, Kubernetes, etc.
## Hyper.sh
- *Application*: Kubernetes, distributed locks, etc.
- *Launched*: April 2016
- *Cluster Size*: 1 cluster of 3 members
- *Order of Data Size*: 10s of MB
- *Operator*: Hyper.sh
- *Environment*: Baremetal
- *Backups*: None, all data can be recreated if necessary.
In [hyper.sh][hyper.sh], the container service is backed by [hypernetes][hypernetes], a multi-tenant Kubernetes distro. Moreover, we use etcd to coordinate the multiple management services and store global metadata.
[hypernetes]:https://github.com/hyperhq/hypernetes
[Hyper.sh]:https://www.hyper.sh
## Meitu
- *Application*: system configuration for services, service discovery, kubernetes in test environment
- *Launched*: October 2015
- *Cluster Size*: 1 cluster of 3 members
- *Order of Data Size*: megabytes
- *Operator*: Meitu, hxj@meitu.com, [shafreeck][shafreeck]
- *Environment*: Bare Metal
- *Backups*: None, all data can be recreated if necessary.
[shafreeck]:https://github.com/shafreeck
## Grab
- *Application*: system configuration for services, service discovery
- *Launched*: June 2016
- *Cluster Size*: 1 cluster of 7 members
- *Order of Data Size*: megabytes
- *Operator*: Grab, [taxitan][taxitan], [reterVision][reterVision]
- *Environment*: AWS
- *Backups*: None, all data can be recreated if necessary.
[taxitan]:https://github.com/taxitan
[reterVision]:https://github.com/reterVision
## DaoCloud.io
- *Application*: container management
- *Launched*: Sep. 2015
- *Cluster Size*: 1000+ deployments, each deployment contains a 3 node cluster.
- *Order of Data Size*: 100s of Megabytes
- *Operator*: daocloud.io
- *Environment*: Baremetal and virtual machines
- *Backups*: None, all data can be recreated if necessary.
In [DaoCloud][DaoCloud], we use Docker and Swarm to deploy and run our applications, and we use etcd to save metadata for service discovery.
[DaoCloud]:https://www.daocloud.io
## Branch.io
- *Application*: Kubernetes
- *Launched*: April 2016
- *Cluster Size*: Multiple clusters, multiple sizes
- *Order of Data Size*: 100s of Megabytes
- *Operator*: branch.io
- *Environment*: AWS, Kubernetes
- *Backups*: EBS volume backups
At [Branch][branch], we use kubernetes heavily as our core microservice platform for staging and production.
[branch]: https://branch.io
## Baidu Waimai
- *Application*: SkyDNS, Kubernetes, UDC, CMDB and other distributed systems
- *Launched*: April 2016
- *Cluster Size*: 3 clusters of 5 members
- *Order of Data Size*: several gigabytes
- *Operator*: Baidu Waimai Operations Department
- *Environment*: CentOS 6.5
- *Backups*: backup scripts
## Salesforce.com
- *Application*: Kubernetes
- *Launched*: Jan 2017
- *Cluster Size*: Multiple clusters of 3 members
- *Order of Data Size*: 100s of Megabytes
- *Operator*: Salesforce.com (krmayankk@github)
- *Environment*: BareMetal
- *Backups*: None, all data can be recreated
## Hosted Graphite
- *Application*: Service discovery, locking, ephemeral application data
- *Launched*: January 2017
- *Cluster Size*: 2 clusters of 7 members
- *Order of Data Size*: Megabytes
- *Operator*: Hosted Graphite (sre@hostedgraphite.com)
- *Environment*: Bare Metal
- *Backups*: None, all data is considered ephemeral.
## Transwarp
- *Application*: Transwarp Data Cloud, Transwarp Operating System, Transwarp Data Hub, Sophon
- *Launched*: January 2016
- *Cluster Size*: Multiple clusters, multiple sizes
- *Order of Data Size*: Megabytes
- *Operator*: Transwarp Operating System
- *Environment*: Bare Metal, Container
- *Backups*: backup scripts


@ -1,16 +0,0 @@
<hr>
## [v2.3.8](https://github.com/etcd-io/etcd/releases/tag/v2.3.8) (2017-02-17)
See [code changes](https://github.com/etcd-io/etcd/compare/v2.3.7...v2.3.8).
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>


@ -1,291 +0,0 @@
<hr>
## [v3.0.16](https://github.com/etcd-io/etcd/releases/tag/v3.0.16) (2016-11-13)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.15...v3.0.16) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.4*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.15](https://github.com/etcd-io/etcd/releases/tag/v3.0.15) (2016-11-11)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.14...v3.0.15) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Fixed
- Fix cancel watch request with wrong range end.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.14](https://github.com/etcd-io/etcd/releases/tag/v3.0.14) (2016-11-04)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.13...v3.0.14) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Added
- v3 `etcdctl migrate` command now supports `--no-ttl` flag to discard keys on transform.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.13](https://github.com/etcd-io/etcd/releases/tag/v3.0.13) (2016-10-24)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.12...v3.0.13) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.12](https://github.com/etcd-io/etcd/releases/tag/v3.0.12) (2016-10-07)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.11...v3.0.12) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.11](https://github.com/etcd-io/etcd/releases/tag/v3.0.11) (2016-10-07)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.10...v3.0.11) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Added
- Server returns previous key-value (optional)
- `clientv3.WithPrevKV` option
- v3 etcdctl `put,watch,del --prev-kv` flag
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.10](https://github.com/etcd-io/etcd/releases/tag/v3.0.10) (2016-09-23)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.9...v3.0.10) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.9](https://github.com/etcd-io/etcd/releases/tag/v3.0.9) (2016-09-15)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.8...v3.0.9) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Added
- Warn on domain names on listen URLs (v3.2 will reject domain names).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.8](https://github.com/etcd-io/etcd/releases/tag/v3.0.8) (2016-09-09)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.7...v3.0.8) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- Allow only IP addresses in listen URLs (domain names are rejected).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.7](https://github.com/etcd-io/etcd/releases/tag/v3.0.7) (2016-08-31)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.6...v3.0.7) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- SRV records only allow A records (RFC 2052).
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.6](https://github.com/etcd-io/etcd/releases/tag/v3.0.6) (2016-08-19)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.5...v3.0.6) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.5](https://github.com/etcd-io/etcd/releases/tag/v3.0.5) (2016-08-19)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.4...v3.0.5) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- SRV records (e.g., infra1.example.com) must match the discovery domain (i.e., example.com) if no custom certificate authority is given.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.4](https://github.com/etcd-io/etcd/releases/tag/v3.0.4) (2016-07-27)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.3...v3.0.4) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Added
- v2 `etcdctl ls` command now supports `--output=json`.
- Add /var/lib/etcd directory to etcd official Docker image.
### Other
- v2 auth can now use common name from TLS certificate when `--client-cert-auth` is enabled.
### Go
- Compile with [*Go 1.6.3*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.3](https://github.com/etcd-io/etcd/releases/tag/v3.0.3) (2016-07-15)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.2...v3.0.3) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- Revert Dockerfile to use `CMD`, instead of `ENTRYPOINT`, to support `etcdctl` run.
- Docker commands for v3.0.2 won't work without specifying executable binary paths.
- v3 etcdctl default endpoints are now `127.0.0.1:2379`.
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.2](https://github.com/etcd-io/etcd/releases/tag/v3.0.2) (2016-07-08)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.1...v3.0.2) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Other
- Dockerfile uses `ENTRYPOINT`, instead of `CMD`, to run etcd without binary path specified.
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.1](https://github.com/etcd-io/etcd/releases/tag/v3.0.1) (2016-07-01)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.0...v3.0.1) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
<hr>
## [v3.0.0](https://github.com/etcd-io/etcd/releases/tag/v3.0.0) (2016-06-30)
See [code changes](https://github.com/etcd-io/etcd/compare/v2.3.0...v3.0.0) and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_0/).**
### Go
- Compile with [*Go 1.6.2*](https://golang.org/doc/devel/release.html#go1.6).
<hr>


@ -1,574 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.0](https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.0.md).
<hr>
## [v3.1.21](https://github.com/etcd-io/etcd/releases/tag/v3.1.21) (2019-TBD)
### etcdctl v3
- [Strip out insecure endpoints from DNS SRV records when using discovery](https://github.com/etcd-io/etcd/pull/10443) with etcdctl v2
- Add [`etcdctl endpoint health --write-out` support](https://github.com/etcd-io/etcd/pull/9540).
- Previously, [`etcdctl endpoint health --write-out json` did not work](https://github.com/etcd-io/etcd/issues/9532).
- The command output is changed. Previously, if an endpoint was unreachable, the command output was
"\<endpoint\> is unhealthy: failed to connect: \<error message\>". This change unifies the error message: all error types
now have the same output, "\<endpoint\> is unhealthy: failed to commit proposal: \<error message\>".
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Fix bug where [db_compaction_total_duration_milliseconds metric incorrectly measured duration as 0](https://github.com/etcd-io/etcd/pull/10646).
<hr>
## [v3.1.20](https://github.com/etcd-io/etcd/releases/tag/v3.1.20) (2018-10-10)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.19...v3.1.20) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Improved
- Improve ["became inactive" warning log](https://github.com/etcd-io/etcd/pull/10024), which indicates that sending a message to a peer failed.
- Improve [read index wait timeout warning log](https://github.com/etcd-io/etcd/pull/10026), which indicates that the local node might have a slow network.
- Add [gRPC interceptor for debugging logs](https://github.com/etcd-io/etcd/pull/9990); enable `etcd --debug` flag to see per-request debug information.
- Add [consistency check in snapshot status](https://github.com/etcd-io/etcd/pull/10109). If consistency check on snapshot file fails, `snapshot status` returns `"snapshot file integrity check failed..."` error.
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Improve [`etcd_network_peer_round_trip_time_seconds`](https://github.com/etcd-io/etcd/pull/10155) Prometheus metric to track leader heartbeats.
- Previously, it only sampled the TCP connection for snapshot messages.
- Display all registered [gRPC metrics at start](https://github.com/etcd-io/etcd/pull/10034).
- Add [`etcd_snap_db_fsync_duration_seconds_count`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_snap_db_save_total_duration_seconds_bucket`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_send_success`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_send_failures`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_send_total_duration_seconds`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_receive_success`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_receive_failures`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_network_snapshot_receive_total_duration_seconds`](https://github.com/etcd-io/etcd/pull/9997) Prometheus metric.
- Add [`etcd_server_id`](https://github.com/etcd-io/etcd/pull/9998) Prometheus metric.
- Add [`etcd_server_health_success`](https://github.com/etcd-io/etcd/pull/10156) Prometheus metric.
- Add [`etcd_server_health_failures`](https://github.com/etcd-io/etcd/pull/10156) Prometheus metric.
- Add [`etcd_server_read_indexes_failed_total`](https://github.com/etcd-io/etcd/pull/10094) Prometheus metric.
### client v3
- Fix logic on [release lock key if cancelled](https://github.com/etcd-io/etcd/pull/10153) in `clientv3/concurrency` package.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.19](https://github.com/etcd-io/etcd/releases/tag/v3.1.19) (2018-07-24)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.18...v3.1.19) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Improved
- Improve [Raft Read Index timeout warning messages](https://github.com/etcd-io/etcd/pull/9897).
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_go_version`](https://github.com/etcd-io/etcd/pull/9957) Prometheus metric.
- Add [`etcd_server_slow_read_indexes_total`](https://github.com/etcd-io/etcd/pull/9897) Prometheus metric.
- Add [`etcd_server_quota_backend_bytes`](https://github.com/etcd-io/etcd/pull/9820) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means the expected DB size is 16 KB once a defragment operation completes.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
- Add [`etcd_mvcc_db_total_size_in_bytes`](https://github.com/etcd-io/etcd/pull/9819) Prometheus metric.
- In addition to [`etcd_debugging_mvcc_db_total_size_in_bytes`](https://github.com/etcd-io/etcd/pull/9819).
- Add [`etcd_mvcc_db_total_size_in_use_in_bytes`](https://github.com/etcd-io/etcd/pull/9256) Prometheus metric.
- Use it with `etcd_mvcc_db_total_size_in_bytes` and `etcd_mvcc_db_total_size_in_use_in_bytes`.
- `etcd_server_quota_backend_bytes 2.147483648e+09` means current quota size is 2 GB.
- `etcd_mvcc_db_total_size_in_bytes 20480` means current physically allocated DB size is 20 KB.
- `etcd_mvcc_db_total_size_in_use_in_bytes 16384` means the expected DB size is 16 KB once a defragment operation completes.
- `etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes` is the number of bytes that can be saved on disk with defragment operation.
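The arithmetic behind these three metrics can be sketched in Go. This is only an illustration using the sample values quoted above; the function names `reclaimableBytes` and `quotaUsedRatio` are hypothetical and are not part of etcd.

```go
package main

import "fmt"

// reclaimableBytes returns how many bytes a defragment operation could free:
// etcd_mvcc_db_total_size_in_bytes - etcd_mvcc_db_total_size_in_use_in_bytes.
// (Hypothetical helper, not an etcd API.)
func reclaimableBytes(totalSize, inUseSize int64) int64 {
	if inUseSize > totalSize {
		return 0
	}
	return totalSize - inUseSize
}

// quotaUsedRatio returns the fraction of etcd_server_quota_backend_bytes
// taken by the physically allocated DB size. (Hypothetical helper.)
func quotaUsedRatio(totalSize, quota int64) float64 {
	if quota <= 0 {
		return 0
	}
	return float64(totalSize) / float64(quota)
}

func main() {
	const (
		quota = 2147483648 // etcd_server_quota_backend_bytes (2 GB)
		total = 20480      // etcd_mvcc_db_total_size_in_bytes (20 KB)
		inUse = 16384      // etcd_mvcc_db_total_size_in_use_in_bytes
	)
	fmt.Println(reclaimableBytes(total, inUse)) // prints 4096
	fmt.Println(quotaUsedRatio(total, quota) < 0.01)
}
```

With the sample values, a defragment operation would free 4096 bytes, and the database uses well under 1% of the quota.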
### client v3
- Fix [lease keepalive interval updates when response queue is full](https://github.com/etcd-io/etcd/pull/9952).
- If `<-chan *clientv3.LeaseKeepAliveResponse` from `clientv3.Lease.KeepAlive` was never consumed or the channel was full, the client was [sending a keepalive request every 500ms](https://github.com/etcd-io/etcd/issues/9911) instead of at the expected rate of once every "TTL / 3" duration.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.18](https://github.com/etcd-io/etcd/releases/tag/v3.1.18) (2018-06-15)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.17...v3.1.18) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_version`](https://github.com/etcd-io/etcd/pull/8960) Prometheus metric.
- To replace [Kubernetes `etcd-version-monitor`](https://github.com/etcd-io/etcd/issues/8948).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.17](https://github.com/etcd-io/etcd/releases/tag/v3.1.17) (2018-06-06)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.16...v3.1.17) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fix [v3 snapshot recovery](https://github.com/etcd-io/etcd/issues/7628).
- A follower receives a leader snapshot to be persisted as a `[SNAPSHOT-INDEX].snap.db` file on disk.
- Now, the server [ensures that the incoming snapshot is persisted on disk before loading it](https://github.com/etcd-io/etcd/pull/7876).
- Otherwise, an index mismatch happens and triggers a server-side panic (e.g. newer WAL entry with outdated snapshot index).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.16](https://github.com/etcd-io/etcd/releases/tag/v3.1.16) (2018-05-31)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.15...v3.1.16) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fix [`mvcc` server panic from restore operation](https://github.com/etcd-io/etcd/pull/9775).
- Suppose a watcher was requested with a future revision X and sent to node A, which then became network-partitioned. Meanwhile, the cluster makes progress. When the partition is removed, the leader sends a snapshot to node A. Previously, if the snapshot's latest revision was still lower than the watch revision X, **etcd server panicked** during the snapshot restore operation.
- Now, this server-side panic has been fixed.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.15](https://github.com/etcd-io/etcd/releases/tag/v3.1.15) (2018-05-09)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.14...v3.1.15) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Purge old [`*.snap.db` snapshot files](https://github.com/etcd-io/etcd/pull/7967).
- Previously, etcd did not respect the `--max-snapshots` flag when purging old `*.snap.db` files.
- Now, etcd purges old `*.snap.db` files to keep at most `--max-snapshots` files on disk.
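The purge rule can be sketched as follows. This is an illustration, not etcd's implementation: the function name `snapDBFilesToPurge` is hypothetical, and it assumes that file names are zero-padded so lexical order matches age, and that a non-positive limit means "keep everything".

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// snapDBFilesToPurge returns the oldest *.snap.db files that must be removed
// so that at most maxSnapshots of them remain.
// Assumptions: names sort lexically from oldest to newest, and
// maxSnapshots <= 0 disables purging.
func snapDBFilesToPurge(names []string, maxSnapshots int) []string {
	if maxSnapshots <= 0 {
		return nil
	}
	var snaps []string
	for _, n := range names {
		if strings.HasSuffix(n, ".snap.db") {
			snaps = append(snaps, n)
		}
	}
	sort.Strings(snaps)
	if len(snaps) <= maxSnapshots {
		return nil
	}
	return snaps[:len(snaps)-maxSnapshots]
}

func main() {
	names := []string{
		"0000000000000003.snap.db",
		"0000000000000001.snap.db",
		"0000000000000002.snap.db",
		"0000000000000001.snap", // not a *.snap.db file, ignored here
	}
	fmt.Println(snapDBFilesToPurge(names, 2)) // prints [0000000000000001.snap.db]
}
```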
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.14](https://github.com/etcd-io/etcd/releases/tag/v3.1.14) (2018-04-24)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.13...v3.1.14) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add [`etcd_server_is_leader`](https://github.com/etcd-io/etcd/pull/9587) Prometheus metric.
### etcd server
- Add [`--initial-election-tick-advance`](https://github.com/etcd-io/etcd/pull/9591) flag to configure initial election tick fast-forward.
- By default, `--initial-election-tick-advance=true`, so the local member fast-forwards election ticks to speed up the "initial" leader election trigger.
- This benefits deployments with larger election ticks. For instance, a cross-datacenter deployment may require a longer election timeout of 10 seconds. If true, the local node does not need to wait up to 10 seconds. Instead, it forwards its election ticks to 8 seconds, leaving only 2 seconds before leader election.
- The major assumptions are: either the cluster has no active leader, so advancing ticks enables faster leader election; or the cluster already has an established leader, and the rejoining follower is likely to receive heartbeats from the leader after the tick advance and before the election timeout.
- However, when the network from the leader to the rejoining follower is congested, and the follower does not receive a leader heartbeat within the remaining election ticks, a disruptive election has to happen, thus affecting cluster availability.
- Now, this can be disabled by setting `--initial-election-tick-advance=false`.
- Disabling this slows down the initial bootstrap process for cross-datacenter deployments. Weigh the tradeoff: setting `--initial-election-tick-advance=false` avoids disruptive elections at the cost of a slower initial bootstrap.
- In a single-node cluster, ticks are advanced regardless.
- Address [disruptive rejoining follower node](https://github.com/etcd-io/etcd/issues/9333).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.13](https://github.com/etcd-io/etcd/releases/tag/v3.1.13) (2018-03-29)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.12...v3.1.13) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Improved
- Adjust [election timeout on server restart](https://github.com/etcd-io/etcd/pull/9415) to reduce [disruptive rejoining servers](https://github.com/etcd-io/etcd/issues/9333).
- Previously, etcd fast-forwarded election ticks on server start, leaving only one tick for leader election. This sped up the start phase, without having to wait until all election ticks elapsed. Advancing election ticks is useful for cross-datacenter deployments with larger election timeouts. However, it affected cluster availability if the last tick elapsed before the leader contacted the restarted node.
- Now, when etcd restarts, it adjusts election ticks so that more than one tick is left, giving the leader more time to prevent a disruptive restart.
### Metrics, Monitoring
See [List of metrics](https://github.com/etcd-io/etcd/tree/main/Documentation/metrics) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Add missing [`etcd_network_peer_sent_failures_total` count](https://github.com/etcd-io/etcd/pull/9437).
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.12](https://github.com/etcd-io/etcd/releases/tag/v3.1.12) (2018-03-08)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.11...v3.1.12) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fix [`mvcc` "unsynced" watcher restore operation](https://github.com/etcd-io/etcd/pull/9297).
- An "unsynced" watcher is a watcher that still needs to catch up with events that have already happened.
- That is, an "unsynced" watcher is a slow watcher that was requested on an old revision.
- The "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
- This could cause [missing events from "unsynced" watchers](https://github.com/etcd-io/etcd/issues/9086).
- A node gets network-partitioned with a watcher on a future revision, falls behind, and receives a leader snapshot after the partition is removed. When applying this snapshot, etcd watch storage moves the current synced watchers to unsynced, since synced watchers might have become stale during the network partition, and resets the synced watcher group to restart watcher routines. Previously, there was a bug when moving watchers from the synced group to the unsynced group, so a client would miss events when the watcher was requested on the network-partitioned node.
### Go
- Compile with [*Go 1.8.7*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.11](https://github.com/etcd-io/etcd/releases/tag/v3.1.11) (2017-11-28)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.10...v3.1.11) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- [#8411](https://github.com/etcd-io/etcd/issues/8411),[#8806](https://github.com/etcd-io/etcd/pull/8806) backport "mvcc: sending events after restore"
- [#8009](https://github.com/etcd-io/etcd/issues/8009),[#8902](https://github.com/etcd-io/etcd/pull/8902) backport coreos/bbolt v1.3.1-coreos.5
### Go
- Compile with [*Go 1.8.5*](https://golang.org/doc/devel/release.html#go1.8).
<hr>
## [v3.1.10](https://github.com/etcd-io/etcd/releases/tag/v3.1.10) (2017-07-14)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.9...v3.1.10) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Added
- Tag docker images with minor versions.
- e.g. `docker pull quay.io/coreos/etcd:v3.1` to fetch latest v3.1 versions.
### Go
- Compile with [*Go 1.8.3*](https://golang.org/doc/devel/release.html#go1.8).
- Fix panic on `net/http.CloseNotify`
<hr>
## [v3.1.9](https://github.com/etcd-io/etcd/releases/tag/v3.1.9) (2017-06-09)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.8...v3.1.9) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Allow v2 snapshot over 512MB.
### Go
- Compile with [*Go 1.7.6*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.8](https://github.com/etcd-io/etcd/releases/tag/v3.1.8) (2017-05-19)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.7...v3.1.8) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.7](https://github.com/etcd-io/etcd/releases/tag/v3.1.7) (2017-04-28)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.6...v3.1.7) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.6](https://github.com/etcd-io/etcd/releases/tag/v3.1.6) (2017-04-19)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.5...v3.1.6) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fill in Auth API response header.
- Remove auth check in Status API.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.5](https://github.com/etcd-io/etcd/releases/tag/v3.1.5) (2017-03-27)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.4...v3.1.5) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd server
- Fix raft memory leak issue.
- Fix Windows file path issues.
### Other
- Add `/etc/nsswitch.conf` file to alpine-based Docker image.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.4](https://github.com/etcd-io/etcd/releases/tag/v3.1.4) (2017-03-22)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.3...v3.1.4) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.3](https://github.com/etcd-io/etcd/releases/tag/v3.1.3) (2017-03-10)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.2...v3.1.3) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd gateway
- Fix `etcd gateway` schema handling in DNS discovery.
- Fix sd_notify behaviors in `gateway`, `grpc-proxy`.
### gRPC Proxy
- Fix sd_notify behaviors in `gateway`, `grpc-proxy`.
### Other
- Use machine default host when advertise URLs are default values (`localhost:2379,2380`) and the listen URL is `0.0.0.0`.
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.2](https://github.com/etcd-io/etcd/releases/tag/v3.1.2) (2017-02-24)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.1...v3.1.2) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### etcd gateway
- Fix `etcd gateway` with multiple endpoints.
### Other
- Use IPv4 default host, by default (when IPv4 and IPv6 are available).
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.1](https://github.com/etcd-io/etcd/releases/tag/v3.1.1) (2017-02-17)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.1.0...v3.1.1) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Go
- Compile with [*Go 1.7.5*](https://golang.org/doc/devel/release.html#go1.7).
<hr>
## [v3.1.0](https://github.com/etcd-io/etcd/releases/tag/v3.1.0) (2017-01-20)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.0.0...v3.1.0) and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.1 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_1/).**
### Improved
- Faster linearizable reads (implements Raft [read-index](https://github.com/etcd-io/etcd/pull/6212)).
- v3 authentication API is now stable.
### Breaking Changes
- Deprecated the following gRPC metrics in favor of [go-grpc-prometheus](https://github.com/grpc-ecosystem/go-grpc-prometheus).
- `etcd_grpc_requests_total`
- `etcd_grpc_requests_failed_total`
- `etcd_grpc_active_streams`
- `etcd_grpc_unary_requests_duration_seconds`
### Dependency
- Upgrade [`github.com/ugorji/go/codec`](https://github.com/ugorji/go) to [**`ugorji/go@9c7f9b7`**](https://github.com/ugorji/go/commit/9c7f9b7a2bc3a520f7c7b30b34b7f85f47fe27b6), and [regenerate v2 `client`](https://github.com/etcd-io/etcd/pull/6945).
### Security, Authentication
See [security doc](https://etcd.io/docs/latest/op-guide/security/) for more details.
- SRV records (e.g., infra1.example.com) must match the discovery domain (i.e., example.com) if no custom certificate authority is given.
- `TLSConfig.ServerName` is ignored with user-provided certificates for backwards compatibility; to be deprecated.
- For example, `etcd --discovery-srv=example.com` will only authenticate peers/clients when the provided certs have root domain `example.com` as an entry in Subject Alternative Name (SAN) field.
### etcd server
- Automatic leadership transfer when leader steps down.
- etcd flags
- `--strict-reconfig-check` flag is set by default.
- Add `--log-output` flag.
- Add `--metrics` flag.
- etcd uses default route IP if advertise URL is not given.
- Cluster rejects removing members if quorum will be lost.
- Discovery now has upper limit for waiting on retries.
- Warn on binding listeners through domain names; to be deprecated.
- v3.0 and v3.1 with `--auto-compaction-retention=10` run periodic compaction on the v3 key-value store every 10 hours.
- Compactor only supports periodic compaction.
- Compactor records the latest revision every 5 minutes, until it reaches the first compaction period (e.g. 10 hours).
- To retain the key-value history of the last compaction period, it uses the last revision that was fetched before the compaction period, from the revision records that were collected every 5 minutes.
- When `--auto-compaction-retention=10`, compactor uses revision 100 as the compact revision if revision 100 is the latest revision fetched 10 hours ago.
- If compaction succeeds or the requested revision has already been compacted, it resets the period timer and starts over with new historical revision records (e.g. restarts revision collection and compacts for the next 10-hour period).
- If compaction fails, it retries in 5 minutes.
### client v3
- Add `SetEndpoints` method; update endpoints at runtime.
- Add `Sync` method; auto-update endpoints at runtime.
- Add `Lease TimeToLive` API; fetch lease information.
- Replace `Config.Logger` field with global logger.
- Get API responses are sorted in ascending order by default.
### etcdctl v3
- Add `lease timetolive` command.
- Add `--print-value-only` flag to get command.
- Add `--dest-prefix` flag to make-mirror command.
- `get` command responses are sorted in ascending order by default.
### gRPC Proxy
- Experimental gRPC proxy feature.
### Other
- `recipes` now conform to sessions defined in `clientv3/concurrency`.
- ACI has symlinks to `/usr/local/bin/etcd*`.
### Go
- Compile with [*Go 1.7.4*](https://golang.org/doc/devel/release.html#go1.7).
<hr>


@@ -1,508 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.4](https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.4.md).
<hr>
## v3.5.10 (tbd)
### etcd server
- Fix [corruption check may get a `ErrCompacted` error when server has just been compacted](https://github.com/etcd-io/etcd/pull/16048)
- Improve [Lease put performance for the case that auth is disabled or the user is admin](https://github.com/etcd-io/etcd/pull/16019)
### etcd grpc-proxy
- Fix [Memberlist results not updated when proxy node down](https://github.com/etcd-io/etcd/pull/15907).
<hr>
## v3.5.9 (2023-05-11)
### etcd server
- Fix [LeaseTimeToLive API may return keys to clients which have no read permission on the keys](https://github.com/etcd-io/etcd/pull/15815).
### Dependencies
- Compile binaries using [Go 1.19.9](https://github.com/etcd-io/etcd/pull/15822).
<hr>
## v3.5.8 (2023-04-13)
### etcd server
- Add [`etcd --tls-min-version --tls-max-version`](https://github.com/etcd-io/etcd/pull/15483) to enable support for TLS 1.3.
- Add [`etcd --listen-client-http-urls`](https://github.com/etcd-io/etcd/pull/15589) flag to support separating http server from grpc one, thus giving full immunity to [watch stream starvation under high read load](https://github.com/etcd-io/etcd/issues/15402).
- Change [http2 frame scheduler to random algorithm](https://github.com/etcd-io/etcd/pull/15452)
- Fix [Watch response traveling back in time when reconnecting member downloads snapshot from the leader](https://github.com/etcd-io/etcd/pull/15515)
- Fix [race when starting both secure & insecure gRPC servers on the same address](https://github.com/etcd-io/etcd/pull/15517)
- Fix [server/auth: disallow creating empty permission ranges](https://github.com/etcd-io/etcd/pull/15619)
- Fix [aligning zap log timestamp resolution to microseconds](https://github.com/etcd-io/etcd/pull/15240). Etcd now uses zap timestamp format: `2006-01-02T15:04:05.999999Z0700` (microsecond instead of milliseconds precision).
- Fix [wsproxy did not print log in JSON format](https://github.com/etcd-io/etcd/pull/15661).
- Fix [CVE-2021-28235](https://nvd.nist.gov/vuln/detail/CVE-2021-28235) by [clearing password after authenticating the user](https://github.com/etcd-io/etcd/pull/15653).
- Fix [etcdserver may panic when parsing a JWT token without username or revision](https://github.com/etcd-io/etcd/pull/15676).
- Fix [Requested watcher progress notifications are not synchronised with stream](https://github.com/etcd-io/etcd/pull/15695).
### Package `netutil`
- Fix [consistently format IPv6 addresses for comparison](https://github.com/etcd-io/etcd/pull/15187).
### Package `clientv3`
- Fix [etcd might send duplicated events to watch clients](https://github.com/etcd-io/etcd/pull/15274).
### Dependencies
- Recommend [Go 1.19+](https://github.com/etcd-io/etcd/pull/15337).
- Compile binaries using [Go 1.19.8](https://github.com/etcd-io/etcd/pull/15651).
- Upgrade [golang.org/x/net to v0.7.0](https://github.com/etcd-io/etcd/pull/15337)
- Upgrade [bbolt to v1.3.7](https://github.com/etcd-io/etcd/pull/15222).
### Docker image
- [Remove nsswitch.conf from docker image](https://github.com/etcd-io/etcd/pull/15161)
- Fix [etcd docker images all tagged with amd64 architecture](https://github.com/etcd-io/etcd/pull/15612)
<hr>
## v3.5.7 (2023-01-20)
### etcd server
- Fix [Remove memberID from data corrupt alarm](https://github.com/etcd-io/etcd/pull/14852).
- Fix [Allow non mutating requests pass through quotaKVServer when NOSPACE](https://github.com/etcd-io/etcd/pull/14884).
- Fix [nil pointer panic for readonly txn due to nil response](https://github.com/etcd-io/etcd/pull/14899).
- Fix [The last record which was partially synced to disk isn't automatically repaired](https://github.com/etcd-io/etcd/pull/15069).
- Fix [etcdserver might promote a non-started learner](https://github.com/etcd-io/etcd/pull/15096).
### Package `clientv3`
- Reverted the fix to [auth invalid token and old revision errors in watch](https://github.com/etcd-io/etcd/pull/14995).
### Dependencies
- Recommend [Go 1.17+](https://github.com/etcd-io/etcd/pull/15019).
- Compile binaries using [Go 1.17.13](https://github.com/etcd-io/etcd/pull/15019)
- Bumped [some dependencies](https://github.com/etcd-io/etcd/pull/15018) to address some HIGH Vulnerabilities.
### Docker image
- Use [distroless base image](https://github.com/etcd-io/etcd/pull/15016) to address critical Vulnerabilities.
- Updated [base image from base-debian11 to static-debian11 and removed dependency on busybox](https://github.com/etcd-io/etcd/pull/15037).
<hr>
## v3.5.6 (2022-11-21)
### etcd server
- Fix [auth invalid token and old revision errors in watch](https://github.com/etcd-io/etcd/pull/14547)
- Fix [avoid closing a watch with ID 0 incorrectly](https://github.com/etcd-io/etcd/pull/14563)
- Fix [auth: fix data consistency issue caused by recovery from snapshot](https://github.com/etcd-io/etcd/pull/14648)
- Fix [revisions might be inconsistent between members when etcd crashes while processing a defragmentation operation](https://github.com/etcd-io/etcd/pull/14733)
- Fix [timestamp in inconsistent format](https://github.com/etcd-io/etcd/pull/14799)
- Fix [Failed resolving host due to lost DNS record](https://github.com/etcd-io/etcd/pull/14573)
### Package `clientv3`
- Fix [Add backoff before retry when watch stream returns unavailable](https://github.com/etcd-io/etcd/pull/14582).
- Fix [stack overflow error in double barrier](https://github.com/etcd-io/etcd/pull/14658)
- Fix [Refreshing token on CommonName based authentication causes segmentation violation in client](https://github.com/etcd-io/etcd/pull/14790).
### etcd grpc-proxy
- Add [`etcd grpc-proxy start --listen-cipher-suites`](https://github.com/etcd-io/etcd/pull/14500) flag to support adding configurable cipher list.
<hr>
## v3.5.5 (2022-09-15)
### Deprecations
- Deprecated [SetKeepAlive and SetKeepAlivePeriod in limitListenerConn](https://github.com/etcd-io/etcd/pull/14366).
### Package `clientv3`
- Fix [do not overwrite authTokenBundle on dial](https://github.com/etcd-io/etcd/pull/14132).
- Fix [IsOptsWithPrefix returns false even if WithPrefix() is included](https://github.com/etcd-io/etcd/pull/14187).
### etcd server
- [Build official darwin/arm64 artifacts](https://github.com/etcd-io/etcd/pull/14436).
- Add [`etcd --max-concurrent-streams`](https://github.com/etcd-io/etcd/pull/14219) flag to configure the maximum number of concurrent streams each client can open at a time; it defaults to `math.MaxUint32`.
- Add [`etcd --experimental-compact-hash-check-enabled --experimental-compact-hash-check-time`](https://github.com/etcd-io/etcd/issues/14039) flags to support enabling reliable corruption detection on compacted revisions.
- Fix [unexpected error during txn](https://github.com/etcd-io/etcd/issues/14110).
- Fix [lease leak issue due to tokenProvider isn't enabled when restoring auth store from a snapshot](https://github.com/etcd-io/etcd/pull/13205).
- Fix [the race condition between goroutine and channel on the same leases to be revoked](https://github.com/etcd-io/etcd/pull/14087).
- Fix [lessor may continue to schedule checkpoint after stepping down leader role](https://github.com/etcd-io/etcd/pull/14087).
- Fix [Restrict the max size of each WAL entry to the remaining size of the WAL file](https://github.com/etcd-io/etcd/pull/14127).
- Fix [Protect rangePermCache with a RW lock correctly](https://github.com/etcd-io/etcd/pull/14227)
- Fix [memberID equals zero in corruption alarm](https://github.com/etcd-io/etcd/pull/14272)
- Fix [Durability API guarantee broken in single node cluster](https://github.com/etcd-io/etcd/pull/14424)
- Fix [etcd fails to start after performing alarm list operation and then power off/on](https://github.com/etcd-io/etcd/pull/14429)
- Fix [authentication data not loaded on member startup](https://github.com/etcd-io/etcd/pull/14409)
### etcdctl v3
- Fix [etcdctl move-leader may fail for multiple endpoints](https://github.com/etcd-io/etcd/pull/14434)
### Other
- [Bump golang.org/x/crypto to latest version](https://github.com/etcd-io/etcd/pull/13996) to address [CVE-2022-27191](https://github.com/advisories/GHSA-8c26-wmh5-6g9v).
- [Bump OpenTelemetry to 1.0.1 and gRPC to 1.41.0](https://github.com/etcd-io/etcd/pull/14312).
<hr>
## v3.5.4 (2022-04-24)
### etcd server
- Fix [etcd panic on startup (auth enabled)](https://github.com/etcd-io/etcd/pull/13946)
### package `client/pkg/v3`
- [Revert the change of trimming the trailing dot from SRV.Target](https://github.com/etcd-io/etcd/pull/13950) returned by DNS lookup
<hr>
## v3.5.3 (2022-04-13)
### etcd server
- Fix [Provide a better liveness probe for when etcd runs as a Kubernetes pod](https://github.com/etcd-io/etcd/pull/13706)
- Fix [inconsistent log format](https://github.com/etcd-io/etcd/pull/13864)
- Fix [Inconsistent revision and data occurs](https://github.com/etcd-io/etcd/pull/13908)
- Fix [Etcdserver is still in progress of processing LeaseGrantRequest when it receives a LeaseKeepAliveRequest on the same leaseID](https://github.com/etcd-io/etcd/pull/13932)
- Fix [consistent_index coming from snapshot is overwritten by the old local value](https://github.com/etcd-io/etcd/pull/13933)
- [Update container base image snapshot](https://github.com/etcd-io/etcd/pull/13862)
- Fix [Defrag unsets backend options](https://github.com/etcd-io/etcd/pull/13701).
### package `client/pkg/v3`
- [Trim the suffix dot from the target](https://github.com/etcd-io/etcd/pull/13714) in SRV records returned by DNS lookup
### etcdctl v3
- [Always print the raft_term in decimal](https://github.com/etcd-io/etcd/pull/13727) when displaying member list in json.
<hr>
## [v3.5.2](https://github.com/etcd-io/etcd/releases/tag/v3.5.2) (2022-02-01)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.5.1...v3.5.2) and [v3.5 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_5/) for any breaking changes.
### etcd server
- Fix [exclude the same alarm type activated by multiple peers](https://github.com/etcd-io/etcd/pull/13476).
- Add [`etcd --experimental-enable-lease-checkpoint-persist`](https://github.com/etcd-io/etcd/pull/13508) flag to enable checkpoint persisting.
- Fix [Lease checkpoints don't prevent TTL reset on leader change](https://github.com/etcd-io/etcd/pull/13508); the fix requires enabling checkpoint persisting.
- Fix [assertion failed due to tx closed when recovering v3 backend from a snapshot db](https://github.com/etcd-io/etcd/pull/13501)
- Fix [segmentation violation(SIGSEGV) error due to premature unlocking of watchableStore](https://github.com/etcd-io/etcd/pull/13541)
<hr>
## [v3.5.1](https://github.com/etcd-io/etcd/releases/tag/v3.5.1) (2021-10-15)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0...v3.5.1) and [v3.5 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_5/) for any breaking changes.
### etcd server
- Fix [self-signed-cert-validity parameter cannot be specified in the config file](https://github.com/etcd-io/etcd/pull/13237).
- Fix [ensure that cluster members stored in v2store and backend are in sync](https://github.com/etcd-io/etcd/pull/13348)
### etcd client
- [Fix etcd client sends invalid :authority header](https://github.com/etcd-io/etcd/issues/13192)
### package clientv3
- Endpoints self identify now as `etcd-endpoints://{id}/{authority}` where authority is based on first endpoint passed, for example `etcd-endpoints://0xc0009d8540/localhost:2079`
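
The target format above can be taken apart with the standard library. The following is a minimal sketch, assuming the target always follows `etcd-endpoints://{id}/{authority}`; `splitEndpointTarget` is a hypothetical helper for illustration and is not part of `clientv3`.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitEndpointTarget is a hypothetical helper (not a clientv3 API).
// It splits a target of the form etcd-endpoints://{id}/{authority}
// into the id and the authority.
func splitEndpointTarget(target string) (id, authority string, err error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", "", err
	}
	if u.Scheme != "etcd-endpoints" {
		return "", "", fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	// The id sits in the URL host position; the authority is the path
	// without its leading slash.
	return u.Host, strings.TrimPrefix(u.Path, "/"), nil
}

func main() {
	id, authority, err := splitEndpointTarget("etcd-endpoints://0xc0009d8540/localhost:2079")
	if err != nil {
		panic(err)
	}
	fmt.Println("id:", id)
	fmt.Println("authority:", authority)
}
```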
### Other
- Updated [base image](https://github.com/etcd-io/etcd/pull/13386) from `debian:buster-v1.4.0` to `debian:bullseye-20210927` to fix the following critical CVEs:
- [CVE-2021-3711](https://nvd.nist.gov/vuln/detail/CVE-2021-3711): miscalculation of a buffer size in openssl's SM2 decryption
- [CVE-2021-35942](https://nvd.nist.gov/vuln/detail/CVE-2021-35942): integer overflow flaw in glibc
- [CVE-2019-9893](https://nvd.nist.gov/vuln/detail/CVE-2019-9893): incorrect syscall argument generation in libseccomp
- [CVE-2021-36159](https://nvd.nist.gov/vuln/detail/CVE-2021-36159): libfetch in apk-tools mishandles numeric strings in FTP and HTTP protocols to allow out of bound reads.
<hr>
## v3.5.0 (2021-06)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.4.0...v3.5.0) and [v3.5 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_5/) for any breaking changes.
- [v3.5.0](https://github.com/etcd-io/etcd/releases/tag/v3.5.0) (2021 TBD), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-rc.1...v3.5.0).
- [v3.5.0-rc.1](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-rc.1) (2021-06-10), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-rc.0...v3.5.0-rc.1).
- [v3.5.0-rc.0](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-rc.0) (2021-06-04), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-beta.4...v3.5.0-rc.0).
- [v3.5.0-beta.4](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-beta.4) (2021-05-26), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-beta.3...v3.5.0-beta.4).
- [v3.5.0-beta.3](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-beta.3) (2021-05-18), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-beta.2...v3.5.0-beta.3).
- [v3.5.0-beta.2](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-beta.2) (2021-05-18), see [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0-beta.1...v3.5.0-beta.2).
- [v3.5.0-beta.1](https://github.com/etcd-io/etcd/releases/tag/v3.5.0-beta.1) (2021-05-18), see [code changes](https://github.com/etcd-io/etcd/compare/v3.4.0...v3.5.0-beta.1).
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v3.5 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_3_5/).**
### Breaking Changes
- `go.etcd.io/etcd` Go packages have moved to `go.etcd.io/etcd/{api,pkg,raft,client,etcdctl,server,tests}/v3` to follow the [Go modules](https://github.com/golang/go/wiki/Modules) conventions
- `go.etcd.io/clientv3/snapshot` SnapshotManager class has moved to `go.etcd.io/clientv3/etcdctl`.
The method `snapshot.Save` to download a snapshot from the remote server was preserved in `go.etcd.io/clientv3/snapshot`.
- `go.etcd.io/client` package was migrated to `go.etcd.io/client/v2`.
- Changed behavior of clientv3 API [MemberList](https://github.com/etcd-io/etcd/pull/11639).
- Previously, it was served directly from the server's local data, which could be stale.
- Now, it is served with a linearizable guarantee. If the server is disconnected from quorum, the `MemberList` call will fail.
- [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) only supports [`/v3`](TODO) endpoint.
- Deprecated [`/v3beta`](https://github.com/etcd-io/etcd/pull/9298).
- `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` doesn't work in v3.5. Use `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- **`etcd --experimental-enable-v2v3` flag remains experimental and to be deprecated.**
- v2 storage emulation feature will be deprecated in the next release.
- etcd 3.5 is the last version that supports V2 API. Flags `--enable-v2` and `--experimental-enable-v2v3` [are now deprecated](https://github.com/etcd-io/etcd/pull/12940) and will be removed in etcd v3.6 release.
- **`etcd --experimental-backend-bbolt-freelist-type` flag has been deprecated.** Use **`etcd --backend-bbolt-freelist-type`** instead. The default type is hashmap and it is stable now.
- **`etcd --debug` flag has been deprecated.** Use **`etcd --log-level=debug`** instead.
- Remove [`embed.Config.Debug`](https://github.com/etcd-io/etcd/pull/10947).
- **`etcd --log-output` flag has been deprecated.** Use **`etcd --log-outputs`** instead.
- **`etcd --logger=zap --log-outputs=stderr`** is now the default.
- **`etcd --logger=capnslog` flag value has been deprecated.**
- **`etcd --logger=zap --log-outputs=default` flag value is not supported.**
- Use `etcd --logger=zap --log-outputs=stderr`.
- Or, use `etcd --logger=zap --log-outputs=systemd/journal` to send logs to the local systemd journal.
- Previously, if the etcd parent process ID (PPID) was 1 (e.g. run with systemd), `etcd --logger=capnslog --log-outputs=default` redirected server logs to the local systemd journal. If the write to journald failed, it wrote to `os.Stderr` as a fallback.
- However, even with PPID 1, it can fail to dial the systemd journal (e.g. when running embedded etcd in a Docker container). Then, [every single log write will fail](https://github.com/etcd-io/etcd/pull/9729) and fall back to `os.Stderr`, which is inefficient.
- To avoid this problem, systemd journal logging must be configured manually.
- **`etcd --log-outputs=stderr`** is now the default.
- **`etcd --log-package-levels` flag for `capnslog` has been deprecated.** Now, **`etcd --logger=zap --log-outputs=stderr`** is the default.
- **`[CLIENT-URL]/config/local/log` endpoint has been deprecated, as is `etcd --log-package-levels` flag.**
- `curl http://127.0.0.1:2379/config/local/log -XPUT -d '{"Level":"DEBUG"}'` won't work.
- Please use `etcd --logger=zap --log-outputs=stderr` instead.
- Deprecated `etcd_debugging_mvcc_db_total_size_in_bytes` Prometheus metric. Use `etcd_mvcc_db_total_size_in_bytes` instead.
- Deprecated `etcd_debugging_mvcc_put_total` Prometheus metric. Use `etcd_mvcc_put_total` instead.
- Deprecated `etcd_debugging_mvcc_delete_total` Prometheus metric. Use `etcd_mvcc_delete_total` instead.
- Deprecated `etcd_debugging_mvcc_txn_total` Prometheus metric. Use `etcd_mvcc_txn_total` instead.
- Deprecated `etcd_debugging_mvcc_range_total` Prometheus metric. Use `etcd_mvcc_range_total` instead.
- Main branch `/version` outputs `3.5.0-pre`, instead of `3.4.0+git`.
- Changed `proxy` package function signature to [support structured logger](https://github.com/etcd-io/etcd/pull/11614).
- Previously, `NewClusterProxy(c *clientv3.Client, advaddr string, prefix string) (pb.ClusterServer, <-chan struct{})`, now `NewClusterProxy(lg *zap.Logger, c *clientv3.Client, advaddr string, prefix string) (pb.ClusterServer, <-chan struct{})`.
- Previously, `Register(c *clientv3.Client, prefix string, addr string, ttl int)`, now `Register(lg *zap.Logger, c *clientv3.Client, prefix string, addr string, ttl int) <-chan struct{}`.
- Previously, `NewHandler(t *http.Transport, urlsFunc GetProxyURLs, failureWait time.Duration, refreshInterval time.Duration) http.Handler`, now `NewHandler(lg *zap.Logger, t *http.Transport, urlsFunc GetProxyURLs, failureWait time.Duration, refreshInterval time.Duration) http.Handler`.
- Changed `pkg/flags` function signature to [support structured logger](https://github.com/etcd-io/etcd/pull/11616).
- Previously, `SetFlagsFromEnv(prefix string, fs *flag.FlagSet) error`, now `SetFlagsFromEnv(lg *zap.Logger, prefix string, fs *flag.FlagSet) error`.
- Previously, `SetPflagsFromEnv(prefix string, fs *pflag.FlagSet) error`, now `SetPflagsFromEnv(lg *zap.Logger, prefix string, fs *pflag.FlagSet) error`.
- ClientV3 supports [grpc resolver API](https://github.com/etcd-io/etcd/blob/main/client/v3/naming/resolver/resolver.go).
- Endpoints can be managed using [endpoints.Manager](https://github.com/etcd-io/etcd/blob/main/client/v3/naming/endpoints/endpoints.go)
- Previously supported [GRPCResolver was decommissioned](https://github.com/etcd-io/etcd/pull/12675). Use [resolver](https://github.com/etcd-io/etcd/blob/main/client/v3/naming/resolver/resolver.go) instead.
- Turned on [--pre-vote by default](https://github.com/etcd-io/etcd/pull/12770). This should prevent an individual member from disrupting the Raft leader.
- [ETCD_CLIENT_DEBUG env](https://github.com/etcd-io/etcd/pull/12786): Now supports log levels (debug, info, warn, error, dpanic, panic, fatal). Only when set, overrides application-wide grpc logging settings.
- [Embed Etcd.Close()](https://github.com/etcd-io/etcd/pull/12828) needs to be called exactly once and closes the Etcd.Err() stream.
- [Embed Etcd does not override global/grpc logger](https://github.com/etcd-io/etcd/pull/12861) by default any longer. If desired, please call `embed.Config::SetupGlobalLoggers()` explicitly.
- [Embed Etcd custom logger should be configured using simpler builder `NewZapLoggerBuilder`](https://github.com/etcd-io/etcd/pull/12973).
- Client errors of `context cancelled` or `context deadline exceeded` are exposed as `codes.Canceled` and `codes.DeadlineExceeded`, instead of `codes.Unknown`.
### Storage format changes
- [WAL log's snapshots persists raftpb.ConfState](https://github.com/etcd-io/etcd/pull/12735)
- [Backend persists raftpb.ConfState](https://github.com/etcd-io/etcd/pull/12962) in the `meta` bucket `confState` key.
- [Backend persists applied term](https://github.com/etcd-io/etcd/pull/) in the `meta` bucket.
- Backend persists `downgrade` in the `cluster` bucket
### Security
- Add [`TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256` and `TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256` to `etcd --cipher-suites`](https://github.com/etcd-io/etcd/pull/11864).
- Changed [the format of WAL entries related to auth for not keeping password as a plain text](https://github.com/etcd-io/etcd/pull/11943).
- Add third party [Security Audit Report](https://github.com/etcd-io/etcd/pull/12201).
- A [log warning](https://github.com/etcd-io/etcd/pull/12242) is added when etcd uses any existing directory that has a permission different than 700 on Linux and 777 on Windows.
- Add optional [`ClientCertFile` and `ClientKeyFile`](https://github.com/etcd-io/etcd/pull/12705) options for peer and client tls configuration when split certificates are used.
### Metrics, Monitoring
See [List of metrics](https://etcd.io/docs/latest/metrics/) for all metrics per release.
Note that any `etcd_debugging_*` metrics are experimental and subject to change.
- Deprecated `etcd_debugging_mvcc_db_total_size_in_bytes` Prometheus metric. Use `etcd_mvcc_db_total_size_in_bytes` instead.
- Deprecated `etcd_debugging_mvcc_put_total` Prometheus metric. Use `etcd_mvcc_put_total` instead.
- Deprecated `etcd_debugging_mvcc_delete_total` Prometheus metric. Use `etcd_mvcc_delete_total` instead.
- Deprecated `etcd_debugging_mvcc_txn_total` Prometheus metric. Use `etcd_mvcc_txn_total` instead.
- Deprecated `etcd_debugging_mvcc_range_total` Prometheus metric. Use `etcd_mvcc_range_total` instead.
- Add [`etcd_debugging_mvcc_current_revision`](https://github.com/etcd-io/etcd/pull/11126) Prometheus metric.
- Add [`etcd_debugging_mvcc_compact_revision`](https://github.com/etcd-io/etcd/pull/11126) Prometheus metric.
- Change [`etcd_cluster_version`](https://github.com/etcd-io/etcd/pull/11254) Prometheus metrics to include only major and minor version.
- Add [`etcd_debugging_mvcc_total_put_size_in_bytes`](https://github.com/etcd-io/etcd/pull/11374) Prometheus metric.
- Add [`etcd_server_client_requests_total` with `"type"` and `"client_api_version"` labels](https://github.com/etcd-io/etcd/pull/11687).
- Add [`etcd_wal_write_bytes_total`](https://github.com/etcd-io/etcd/pull/11738).
- Add [`etcd_debugging_auth_revision`](https://github.com/etcd-io/etcd/commit/f14d2a087f7b0fd6f7980b95b5e0b945109c95f3).
- Add [`os_fd_used` and `os_fd_limit` to monitor current OS file descriptors](https://github.com/etcd-io/etcd/pull/12214).
- Add [`etcd_disk_defrag_inflight`](https://github.com/etcd-io/etcd/pull/13395).
### etcd server
- Add [don't attempt to grant nil permission to a role](https://github.com/etcd-io/etcd/pull/13086).
- Add [don't activate alarms w/missing AlarmType](https://github.com/etcd-io/etcd/pull/13084).
- Add [`TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256` and `TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256` to `etcd --cipher-suites`](https://github.com/etcd-io/etcd/pull/11864).
- Automatically [create parent directory if it does not exist](https://github.com/etcd-io/etcd/pull/9626) (fix [issue#9609](https://github.com/etcd-io/etcd/issues/9609)).
- v4.0 will configure `etcd --enable-v2=true --enable-v2v3=/aaa` to enable v2 API server that is backed by **v3 storage**.
- `etcd --backend-bbolt-freelist-type` flag is now stable.
- `etcd --experimental-backend-bbolt-freelist-type` has been deprecated.
- Support [downgrade API](https://github.com/etcd-io/etcd/pull/11715).
- Deprecate v2 apply on cluster version. [Use v3 request to set cluster version and recover cluster version from v3 backend](https://github.com/etcd-io/etcd/pull/11427).
- [Use v2 api to update cluster version to support mixed version cluster during upgrade](https://github.com/etcd-io/etcd/pull/12988).
- [Fix corruption bug in defrag](https://github.com/etcd-io/etcd/pull/11613).
- Fix [quorum protection logic when promoting a learner](https://github.com/etcd-io/etcd/pull/11640).
- Improve [peer corruption checker](https://github.com/etcd-io/etcd/pull/11621) to work when peer mTLS is enabled.
- Log [`[CLIENT-PORT]/health` check in server side](https://github.com/etcd-io/etcd/pull/11704).
- Log [successful etcd server-side health check in debug level](https://github.com/etcd-io/etcd/pull/12677).
- Improve [compaction performance when latest index is greater than 1-million](https://github.com/etcd-io/etcd/pull/11734).
- [Refactor consistentindex](https://github.com/etcd-io/etcd/pull/11699).
- [Add log when etcdserver failed to apply command](https://github.com/etcd-io/etcd/pull/11670).
- Improve [count-only range performance](https://github.com/etcd-io/etcd/pull/11771).
- Remove [redundant storage restore operation to shorten the startup time](https://github.com/etcd-io/etcd/pull/11779).
- With 40 million keys of test data, it can shorten the startup time from 5 min to 2.5 min.
- [Fix deadlock bug in mvcc](https://github.com/etcd-io/etcd/pull/11817).
- Fix [inconsistency between WAL and server snapshot](https://github.com/etcd-io/etcd/pull/11888).
- Previously, server restore fails if it had crashed after persisting raft hard state but before saving snapshot.
- See https://github.com/etcd-io/etcd/issues/10219 for more.
- Add [missing CRC checksum check in WAL validate method otherwise causes panic](https://github.com/etcd-io/etcd/pull/11924).
- See https://github.com/etcd-io/etcd/issues/11918.
- Improve logging around snapshot send and receive.
- [Push down RangeOptions.limit argv into index tree to reduce memory overhead](https://github.com/etcd-io/etcd/pull/11990).
- Add [reason field for /health response](https://github.com/etcd-io/etcd/pull/11983).
- Add [exclude alarms from health check conditionally](https://github.com/etcd-io/etcd/pull/12880).
- Add [`etcd --unsafe-no-fsync`](https://github.com/etcd-io/etcd/pull/11946) flag.
- Setting the flag disables all uses of fsync, which is unsafe and will cause data loss. This flag makes it possible to run an etcd node for testing and development without placing lots of load on the file system.
- Add [`etcd --auth-token-ttl`](https://github.com/etcd-io/etcd/pull/11980) flag to customize `simpleTokenTTL` settings.
- Improve [`runtime.FDUsage` call pattern to reduce object allocations, memory usage, and CPU usage](https://github.com/etcd-io/etcd/pull/11986).
- Improve [mvcc.watchResponse channel Memory Usage](https://github.com/etcd-io/etcd/pull/11987).
- Log [expensive request info in UnaryInterceptor](https://github.com/etcd-io/etcd/pull/12086).
- [Fix invalid Go type in etcdserverpb](https://github.com/etcd-io/etcd/pull/12000).
- [Improve healthcheck by using v3 range request and its corresponding timeout](https://github.com/etcd-io/etcd/pull/12195).
- Add [`etcd --experimental-watch-progress-notify-interval`](https://github.com/etcd-io/etcd/pull/12216) flag to make watch progress notify interval configurable.
- Fix [server panic in slow writes warnings](https://github.com/etcd-io/etcd/issues/12197).
- Fixed via [PR#12238](https://github.com/etcd-io/etcd/pull/12238).
- [Fix server panic](https://github.com/etcd-io/etcd/pull/12288) when force-new-cluster flag is enabled in a cluster which had learner node.
- Add [`etcd --self-signed-cert-validity`](https://github.com/etcd-io/etcd/pull/12429) flag to support setting certificate expiration time.
- Notice, certificates generated by etcd are valid for 1 year by default when specifying the auto-tls or peer-auto-tls option.
- Add [`etcd --experimental-warning-apply-duration`](https://github.com/etcd-io/etcd/pull/12448) flag which allows apply duration threshold to be configurable.
- Add [`etcd --experimental-memory-mlock`](https://github.com/etcd-io/etcd/pull/TODO) flag which prevents etcd memory pages from being swapped out.
- Add [`etcd --socket-reuse-port`](https://github.com/etcd-io/etcd/pull/12702) flag
- Setting this flag enables `SO_REUSEPORT` which allows rebind of a port already in use. User should take caution when using this flag to ensure flock is properly enforced.
- Add [`etcd --socket-reuse-address`](https://github.com/etcd-io/etcd/pull/12702) flag
- Setting this flag enables `SO_REUSEADDR` which allows binding to an address in `TIME_WAIT` state, improving etcd restart time.
- Reduce [around 30% memory allocation by logging range response size without marshal](https://github.com/etcd-io/etcd/pull/12871).
- `ETCD_VERIFY="all"` environment triggers [additional verification of consistency](https://github.com/etcd-io/etcd/pull/12901) of etcd data-dir files.
- Add [`etcd --enable-log-rotation`](https://github.com/etcd-io/etcd/pull/12774) boolean flag which enables log rotation if true.
- Add [`etcd --log-rotation-config-json`](https://github.com/etcd-io/etcd/pull/12774) flag which allows passthrough of JSON config to configure log rotation for a file output target.
- Add experimental distributed tracing boolean flag [`--experimental-enable-distributed-tracing`](https://github.com/etcd-io/etcd/pull/12919) which enables tracing.
- Add [`etcd --experimental-distributed-tracing-address`](https://github.com/etcd-io/etcd/pull/12919) string flag which allows configuring the OpenTelemetry collector address.
- Add [`etcd --experimental-distributed-tracing-service-name`](https://github.com/etcd-io/etcd/pull/12919) string flag which allows changing the default "etcd" service name.
- Add [`etcd --experimental-distributed-tracing-instance-id`](https://github.com/etcd-io/etcd/pull/12919) string flag which configures an instance ID, which must be unique per etcd instance.
- Add [`--experimental-bootstrap-defrag-threshold-megabytes`](https://github.com/etcd-io/etcd/pull/12941) which configures a threshold for the unused db size and etcdserver will automatically perform defragmentation on bootstrap when it exceeds this value. The functionality is disabled if the value is 0.
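
The bootstrap defragmentation rule in the last entry can be read as a simple predicate. The sketch below is an assumption-laden illustration, not etcd's implementation: `shouldDefragOnBootstrap` is a hypothetical function, "unused" is taken to mean total db size minus size in use, one megabyte is taken as 1024*1024 bytes, and the comparison is strict because the entry says "exceeds".

```go
package main

import "fmt"

// shouldDefragOnBootstrap is a hypothetical predicate (not an etcd API).
// It returns true when the unused part of the db is larger than the
// configured threshold. A threshold of 0 disables the feature.
func shouldDefragOnBootstrap(dbSize, dbSizeInUse int64, thresholdMB uint) bool {
	if thresholdMB == 0 {
		return false // feature disabled
	}
	unused := dbSize - dbSizeInUse
	if unused < 0 {
		unused = 0
	}
	// Assumption: megabytes are converted with 1024*1024 bytes each.
	return uint64(unused) > uint64(thresholdMB)*1024*1024
}

func main() {
	const gib = int64(1) << 30
	fmt.Println(shouldDefragOnBootstrap(3*gib, 1*gib, 0))    // threshold 0: disabled
	fmt.Println(shouldDefragOnBootstrap(3*gib, 1*gib, 1024)) // 2 GiB unused vs 1 GiB threshold
}
```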
### Package `runtime`
- Optimize [`runtime.FDUsage` by removing unnecessary sorting](https://github.com/etcd-io/etcd/pull/12214).
### Package `embed`
- Remove [`embed.Config.Debug`](https://github.com/etcd-io/etcd/pull/10947).
- Use `embed.Config.LogLevel` instead.
- Add [`embed.Config.ZapLoggerBuilder`](https://github.com/etcd-io/etcd/pull/11147) to allow creating a custom zap logger.
- Replace [global `*zap.Logger` with etcd server logger object](https://github.com/etcd-io/etcd/pull/12212).
- Add [`embed.Config.EnableLogRotation`](https://github.com/etcd-io/etcd/pull/12774) which enables log rotation if true.
- Add [`embed.Config.LogRotationConfigJSON`](https://github.com/etcd-io/etcd/pull/12774) to allow passthrough of JSON config to configure log rotation for a file output target.
- Add [`embed.Config.ExperimentalEnableDistributedTracing`](https://github.com/etcd-io/etcd/pull/12919) which enables experimental distributed tracing if true.
- Add [`embed.Config.ExperimentalDistributedTracingAddress`](https://github.com/etcd-io/etcd/pull/12919) which allows overriding default collector address.
- Add [`embed.Config.ExperimentalDistributedTracingServiceName`](https://github.com/etcd-io/etcd/pull/12919) which allows overriding default "etcd" service name.
- Add [`embed.Config.ExperimentalDistributedTracingServiceInstanceID`](https://github.com/etcd-io/etcd/pull/12919) which allows configuring an instance ID, which must be unique per etcd instance.
### Package `clientv3`
- Remove [excessive watch cancel logging messages](https://github.com/etcd-io/etcd/pull/12187).
- See [kubernetes/kubernetes#93450](https://github.com/kubernetes/kubernetes/issues/93450).
- Add [`TryLock`](https://github.com/etcd-io/etcd/pull/11104) method to `clientv3/concurrency/Mutex`. A non-blocking method on `Mutex` which does not wait to get lock on the Mutex, returns immediately if Mutex is locked by another session.
- Fix [client balancer failover against multiple endpoints](https://github.com/etcd-io/etcd/pull/11184).
- Fix [`"kube-apiserver: failover on multi-member etcd cluster fails certificate check on DNS mismatch"`](https://github.com/kubernetes/kubernetes/issues/83028).
- Fix [IPv6 endpoint parsing in client](https://github.com/etcd-io/etcd/pull/11211).
- Fix ["1.16: etcd client does not parse IPv6 addresses correctly when members are joining" (kubernetes#83550)](https://github.com/kubernetes/kubernetes/issues/83550).
- Fix [errors caused by grpc changing balancer/resolver API](https://github.com/etcd-io/etcd/pull/11564). This change is compatible with grpc >= [v1.26.0](https://github.com/grpc/grpc-go/releases/tag/v1.26.0), but is not compatible with < v1.26.0 version.
- Use [ServerName as the authority](https://github.com/etcd-io/etcd/pull/11574) after bumping to grpc v1.26.0. Remove workaround in [#11184](https://github.com/etcd-io/etcd/pull/11184).
- Fix [`"hasleader"` metadata embedding](https://github.com/etcd-io/etcd/pull/11687).
- Previously, `clientv3.WithRequireLeader(ctx)` was overwriting existing context keys.
- Fix [watch leak caused by lazy cancellation](https://github.com/etcd-io/etcd/pull/11850). When clients cancel their watches, a cancel request will now be immediately sent to the server instead of waiting for the next watch event.
- Make sure [save snapshot downloads checksum for integrity checks](https://github.com/etcd-io/etcd/pull/11896).
- Fix [auth token invalid after watch reconnects](https://github.com/etcd-io/etcd/pull/12264). Get AuthToken automatically when clientConn is ready.
- Improve [clientv3:get AuthToken gracefully without extra connection](https://github.com/etcd-io/etcd/pull/12165).
- Changed [clientv3 dialing code](https://github.com/etcd-io/etcd/pull/12671) to use grpc resolver API instead of custom balancer.
- Endpoints self identify now as `etcd-endpoints://{id}/#initially={list of endpoints}` e.g. `etcd-endpoints://0xc0009d8540/#initially=[localhost:2079]`
### Package `lease`
- Fix [memory leak in follower nodes](https://github.com/etcd-io/etcd/pull/11731).
- https://github.com/etcd-io/etcd/issues/11495
- https://github.com/etcd-io/etcd/issues/11730
- Make sure [grant/revoke won't be applied repeatedly after restarting etcd](https://github.com/etcd-io/etcd/pull/11935).
### Package `wal`
- Add [`etcd_wal_write_bytes_total`](https://github.com/etcd-io/etcd/pull/11738).
- Handle [out-of-range slice bound in `ReadAll` and entry limit in `decodeRecord`](https://github.com/etcd-io/etcd/pull/11793).
### etcdctl v3
- Fix `etcdctl member add` command to prevent potential timeout. ([PR#11194](https://github.com/etcd-io/etcd/pull/11194) and [PR#11638](https://github.com/etcd-io/etcd/pull/11638))
- Add [`etcdctl watch --progress-notify`](https://github.com/etcd-io/etcd/pull/11462) flag.
- Add [`etcdctl auth status`](https://github.com/etcd-io/etcd/pull/11536) command to check if authentication is enabled
- Add [`etcdctl get --count-only`](https://github.com/etcd-io/etcd/pull/11743) flag for output type `fields`.
- Add [`etcdctl member list -w=json --hex`](https://github.com/etcd-io/etcd/pull/11812) flag to print memberListResponse in hex format json.
- Changed [`etcdctl lock <lockname> exec-command`](https://github.com/etcd-io/etcd/pull/12829) to return exit code of exec-command.
- [New tool: `etcdutl`](https://github.com/etcd-io/etcd/pull/12971) incorporated functionality of: `etcdctl snapshot status|restore`, `etcdctl backup`, `etcdctl defrag --data-dir ...`.
- [ETCDCTL_API=3 `etcdctl migrate`](https://github.com/etcd-io/etcd/pull/12971) has been decommissioned. Use etcd <=v3.4 to restore v2 storage.
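
The difference between the decimal and hex renderings mentioned for `etcdctl member list -w=json --hex` comes down to the base used to print a 64-bit ID. A minimal sketch, assuming IDs are `uint64` values; `formatID` is a hypothetical helper and the sample ID is illustrative only.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatID is a hypothetical helper (not etcdctl code). It renders a
// 64-bit ID either in hexadecimal or in decimal.
func formatID(id uint64, hex bool) string {
	if hex {
		return strconv.FormatUint(id, 16)
	}
	return strconv.FormatUint(id, 10)
}

func main() {
	// Illustrative ID, parsed from its hex form.
	id, err := strconv.ParseUint("8e9e05c52164694d", 16, 64)
	if err != nil {
		panic(err)
	}
	fmt.Println("decimal:", formatID(id, false))
	fmt.Println("hex:    ", formatID(id, true))
}
```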
### gRPC gateway
- [gRPC gateway](https://github.com/grpc-ecosystem/grpc-gateway) only supports [`/v3`](TODO) endpoint.
- Deprecated [`/v3beta`](https://github.com/etcd-io/etcd/pull/9298).
- `curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` doesn't work in v3.5. Use `curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}'` instead.
- Set [`enable-grpc-gateway`](https://github.com/etcd-io/etcd/pull/12297) flag to true when using a config file to keep the defaults the same as the command line configuration.
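
The `curl` payload above carries base64-encoded keys and values (`Zm9v` is `foo`, `YmFy` is `bar`). The following sketch builds the same `/v3/kv/put` URL and body with the standard library; `buildPut` and `putRequest` are hypothetical names, and the code only constructs the request without sending it.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// putRequest mirrors the JSON body used in the curl example.
type putRequest struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// buildPut is a hypothetical helper. It returns the /v3/kv/put URL for the
// given endpoint and a JSON body with base64-encoded key and value.
func buildPut(endpoint, key, value string) (string, []byte, error) {
	body, err := json.Marshal(putRequest{
		Key:   base64.StdEncoding.EncodeToString([]byte(key)),
		Value: base64.StdEncoding.EncodeToString([]byte(value)),
	})
	if err != nil {
		return "", nil, err
	}
	return endpoint + "/v3/kv/put", body, nil
}

func main() {
	target, body, err := buildPut("http://localhost:2379", "foo", "bar")
	if err != nil {
		panic(err)
	}
	fmt.Println(target)
	fmt.Println(string(body))
}
```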
### gRPC Proxy
- Fix [`panic on error`](https://github.com/etcd-io/etcd/pull/11694) for metrics handler.
- Add [gRPC keepalive related flags](https://github.com/etcd-io/etcd/pull/11711) `grpc-keepalive-min-time`, `grpc-keepalive-interval` and `grpc-keepalive-timeout`.
- [Fix grpc watch proxy hangs when failed to cancel a watcher](https://github.com/etcd-io/etcd/pull/12030) .
- Add [metrics handler for grpcproxy self](https://github.com/etcd-io/etcd/pull/12107).
- Add [health handler for grpcproxy self](https://github.com/etcd-io/etcd/pull/12114).
### Auth
- Fix [NoPassword check when adding user through GRPC gateway](https://github.com/etcd-io/etcd/pull/11418) ([issue#11414](https://github.com/etcd-io/etcd/issues/11414))
- Fix bug where [some auth related messages are logged at wrong level](https://github.com/etcd-io/etcd/pull/11586)
- [Fix a data corruption bug by saving consistent index](https://github.com/etcd-io/etcd/pull/11652).
- [Improve checkPassword performance](https://github.com/etcd-io/etcd/pull/11735).
- [Add authRevision field in AuthStatus](https://github.com/etcd-io/etcd/pull/11659).
- Fix [a bug of not refreshing expired tokens](https://github.com/etcd-io/etcd/pull/13308).
### API
- Add [`/v3/auth/status`](https://github.com/etcd-io/etcd/pull/11536) endpoint to check if authentication is enabled
- [Add `Linearizable` field to `etcdserverpb.MemberListRequest`](https://github.com/etcd-io/etcd/pull/11639).
- [Learner support Snapshot RPC](https://github.com/etcd-io/etcd/pull/12890/).
### Package `netutil`
- Remove [`netutil.DropPort/RecoverPort/SetLatency/RemoveLatency`](https://github.com/etcd-io/etcd/pull/12491).
- These are not used anymore. They were only used for older versions of functional testing.
- Removed to adhere to best security practices, minimize arbitrary shell invocation.
### `tools/etcd-dump-metrics`
- Implement [input validation to prevent arbitrary shell invocation](https://github.com/etcd-io/etcd/pull/12491).
### Dependency
- Upgrade [`google.golang.org/grpc`](https://github.com/grpc/grpc-go/releases) from [**`v1.23.0`**](https://github.com/grpc/grpc-go/releases/tag/v1.23.0) to [**`v1.37.0`**](https://github.com/grpc/grpc-go/releases/tag/v1.37.0).
- Upgrade [`go.uber.org/zap`](https://github.com/uber-go/zap/releases) from [**`v1.14.1`**](https://github.com/uber-go/zap/releases/tag/v1.14.1) to [**`v1.16.0`**](https://github.com/uber-go/zap/releases/tag/v1.16.0).
### Platforms
- etcd now [officially supports `arm64`](https://github.com/etcd-io/etcd/pull/12929).
- See https://github.com/etcd-io/etcd/pull/12928 for adding automated tests with `arm64` EC2 instances (Graviton 2).
- See https://github.com/etcd-io/website/pull/273 for new platform support tier policies.
### Release
- Add s390x build support ([PR#11548](https://github.com/etcd-io/etcd/pull/11548) and [PR#11358](https://github.com/etcd-io/etcd/pull/11358))
### Go
- Require [*Go 1.16+*](https://github.com/etcd-io/etcd/pull/11110).
- Compile with [*Go 1.16+*](https://golang.org/doc/devel/release.html#go1.16)
- etcd uses [go modules](https://github.com/etcd-io/etcd/pull/12279) (instead of vendor dir) to track dependencies.
### Project Governance
- The etcd team has added a well-defined and openly discussed project [governance](https://github.com/etcd-io/etcd/pull/11175).
<hr>


@ -1,103 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.5](https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.5.md).
<hr>
## v3.6.0 (TBD)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0...v3.6.0).
### Breaking Changes
- `etcd` will no longer start on data dir created by newer versions (for example etcd v3.6 will not run on v3.7+ data dir). To downgrade data dir please check out `etcdutl migrate` command.
- `etcd` doesn't support serving client requests on the peer listen endpoints (--listen-peer-urls). See [pull/13565](https://github.com/etcd-io/etcd/pull/13565).
- `etcdctl` will sleep(2s) in case of range delete without `--range` flag. See [pull/13747](https://github.com/etcd-io/etcd/pull/13747)
- Applications which depend on etcd v3.6 packages must be built with go version >= v1.18.
### Deprecations
- Deprecated [V2 discovery](https://etcd.io/docs/v3.5/dev-internal/discovery_protocol/).
- Deprecated [SetKeepAlive and SetKeepAlivePeriod in limitListenerConn](https://github.com/etcd-io/etcd/pull/14356).
- Removed [etcdctl defrag --data-dir](https://github.com/etcd-io/etcd/pull/13793).
- Removed [etcdctl snapshot status](https://github.com/etcd-io/etcd/pull/13809).
- Removed [etcdctl snapshot restore](https://github.com/etcd-io/etcd/pull/13809).
- Removed [etcdutl snapshot save](https://github.com/etcd-io/etcd/pull/13809).
### etcdctl v3
- Add command to generate [shell completion](https://github.com/etcd-io/etcd/pull/13133).
- When printing endpoint status, [show db size in use](https://github.com/etcd-io/etcd/pull/13639).
- [Always print the raft_term in decimal](https://github.com/etcd-io/etcd/pull/13711) when displaying member list in json.
- [Add one more field `storageVersion`](https://github.com/etcd-io/etcd/pull/13773) into the response of command `etcdctl endpoint status`.
- Add [`--max-txn-ops`](https://github.com/etcd-io/etcd/pull/14340) flag to make-mirror command.
- Add [`--consistency`](https://github.com/etcd-io/etcd/pull/15261) flag to member list command.
- Display [field `hash_revision`](https://github.com/etcd-io/etcd/pull/14812) for `etcdctl endpoint hash` command.
### etcdutl v3
- Add command to generate [shell completion](https://github.com/etcd-io/etcd/pull/13142).
- Add `migrate` command for downgrading/upgrading etcd data dir files.
### Package `clientv3`
- [Support serializable `MemberList` operation](https://github.com/etcd-io/etcd/pull/15261).
### Package `server`
- Package `mvcc` was moved to `storage/mvcc`
- Package `mvcc/backend` was moved to `storage/backend`
- Package `mvcc/buckets` was moved to `storage/schema`
- Package `wal` was moved to `storage/wal`
- Package `datadir` was moved to `storage/datadir`
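
The package moves listed above amount to a prefix rewrite of import paths. The sketch below assumes the old and new paths are relative to `go.etcd.io/etcd/server/v3/`; `rewriteImport` and the `moves` table are hypothetical and only encode the five moves from this list.

```go
package main

import (
	"fmt"
	"strings"
)

// serverModule is the assumed module prefix for the moved packages.
const serverModule = "go.etcd.io/etcd/server/v3/"

// moves lists old and new relative package paths. More specific entries
// (mvcc/backend, mvcc/buckets) come before mvcc so they match first.
var moves = []struct{ from, to string }{
	{"mvcc/backend", "storage/backend"},
	{"mvcc/buckets", "storage/schema"},
	{"mvcc", "storage/mvcc"},
	{"wal", "storage/wal"},
	{"datadir", "storage/datadir"},
}

// rewriteImport is a hypothetical helper. It maps an old server import path
// to its new location and returns unrelated paths unchanged.
func rewriteImport(path string) string {
	if !strings.HasPrefix(path, serverModule) {
		return path
	}
	rel := strings.TrimPrefix(path, serverModule)
	for _, m := range moves {
		// Match the package itself or any of its subpackages.
		if rel == m.from || strings.HasPrefix(rel, m.from+"/") {
			return serverModule + m.to + strings.TrimPrefix(rel, m.from)
		}
	}
	return path
}

func main() {
	for _, p := range []string{
		"go.etcd.io/etcd/server/v3/mvcc",
		"go.etcd.io/etcd/server/v3/mvcc/backend",
		"go.etcd.io/etcd/server/v3/wal",
	} {
		fmt.Println(p, "->", rewriteImport(p))
	}
}
```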
### Package `raft`
- Send empty `MsgApp` when entry in-flight limits are exceeded. See [pull/14633](https://github.com/etcd-io/etcd/pull/14633).
- Add [MaxInflightBytes](https://github.com/etcd-io/etcd/pull/14624) setting in `raft.Config` for better flow control of entries.
- [Decouple raft from etcd](https://github.com/etcd-io/etcd/issues/14713). Migrated raft to a separate [repository](https://github.com/etcd-io/raft), and renamed raft module to `go.etcd.io/raft/v3`.
### etcd server
- Add [`etcd --log-format`](https://github.com/etcd-io/etcd/pull/13339) flag to support log format.
- Add [`etcd --experimental-max-learners`](https://github.com/etcd-io/etcd/pull/13377) flag to allow configuration of learner max membership.
- Add [`etcd --experimental-enable-lease-checkpoint-persist`](https://github.com/etcd-io/etcd/pull/13508) flag to handle upgrade from v3.5.2 clusters with this feature enabled.
- Add [`etcdctl make-mirror --rev`](https://github.com/etcd-io/etcd/pull/13519) flag to support incremental mirror.
- Add [`etcd --experimental-wait-cluster-ready-timeout`](https://github.com/etcd-io/etcd/pull/13525) flag to wait for cluster to be ready before serving client requests.
- Add [v3 discovery](https://github.com/etcd-io/etcd/pull/13635) to bootstrap a new etcd cluster.
- Add [field `storage`](https://github.com/etcd-io/etcd/pull/13772) into the response body of endpoint `/version`.
- Add [`etcd --max-concurrent-streams`](https://github.com/etcd-io/etcd/pull/14169) flag to configure the max concurrent streams each client can open at a time, and defaults to math.MaxUint32.
- Add [`etcd grpc-proxy --experimental-enable-grpc-logging`](https://github.com/etcd-io/etcd/pull/14266) flag to log all gRPC requests and responses.
- Add [`etcd --experimental-compact-hash-check-enabled --experimental-compact-hash-check-time`](https://github.com/etcd-io/etcd/issues/14039) flags to support enabling reliable corruption detection on compacted revisions.
- Add [Protection on maintenance request when auth is enabled](https://github.com/etcd-io/etcd/pull/14663).
- Graduated [`--experimental-warning-unary-request-duration` to `--warning-unary-request-duration`](https://github.com/etcd-io/etcd/pull/14414). Note the experimental flag is deprecated and will be decommissioned in v3.7.
- Add [field `hash_revision` into `HashKVResponse`](https://github.com/etcd-io/etcd/pull/14537).
- Add [`etcd --experimental-snapshot-catch-up-entries`](https://github.com/etcd-io/etcd/pull/15033) flag to configure the number of entries a slow follower can use to catch up after the raft storage entries are compacted; defaults to 5k.
- Decreased [`--snapshot-count` default value from 100,000 to 10,000](https://github.com/etcd-io/etcd/pull/15408)
- Add [`etcd --tls-min-version --tls-max-version`](https://github.com/etcd-io/etcd/pull/15156) to enable support for TLS 1.3.
### etcd grpc-proxy
- Add [`etcd grpc-proxy start --endpoints-auto-sync-interval`](https://github.com/etcd-io/etcd/pull/14354) flag to enable and configure interval of auto sync of endpoints with server.
- Add [`etcd grpc-proxy start --listen-cipher-suites`](https://github.com/etcd-io/etcd/pull/14308) flag to support adding configurable cipher list.
### tools/benchmark
- [Add etcd client autoSync flag](https://github.com/etcd-io/etcd/pull/13416)
### Metrics, Monitoring
See [List of metrics](https://etcd.io/docs/latest/metrics/) for all metrics per release.
- Add [`etcd_disk_defrag_inflight`](https://github.com/etcd-io/etcd/pull/13371).
- Add [`etcd_debugging_server_alarms`](https://github.com/etcd-io/etcd/pull/14276).
### Go
- Require [Go 1.19+](https://github.com/etcd-io/etcd/pull/14463).
- Compile with [Go 1.19+](https://golang.org/doc/devel/release.html#go1.19). Please refer to [gc-guide](https://go.dev/doc/gc-guide) to configure `GOGC` and `GOMEMLIMIT` properly.
### Other
- Use Distroless as base image to make the image less vulnerable and reduce image size.
<hr>


@ -1,44 +0,0 @@
Previous change logs can be found at [CHANGELOG-3.x](https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.x.md).
<hr>
## v4.0.0 (TBD)
See [code changes](https://github.com/etcd-io/etcd/compare/v3.5.0...v4.0.0) and [v4.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_4_0/) for any breaking changes.
**Again, before running upgrades from any previous release, please make sure to read change logs below and [v4.0 upgrade guide](https://etcd.io/docs/latest/upgrades/upgrade_4_0/).**
### Breaking Changes
- [Secure etcd by default](https://github.com/etcd-io/etcd/issues/9475)?
- Deprecate [`etcd --proxy*`](TODO) flags; **no more v2 proxy**.
- Deprecate [v2 storage backend](https://github.com/etcd-io/etcd/issues/9232); **no more v2 store**.
- v2 API is still supported via [v2 emulation](TODO).
- Deprecate [`etcdctl backup`](TODO) command.
- `clientv3.Client.KeepAlive(ctx context.Context, id LeaseID) (<-chan *LeaseKeepAliveResponse, error)` is now [`clientv4.Client.KeepAlive(ctx context.Context, id LeaseID) <-chan *LeaseKeepAliveResponse`](TODO).
- Similar to `Watch`, [`KeepAlive` does not return errors](https://github.com/etcd-io/etcd/issues/7488).
- If there's an unknown server error, kill all open channels and create a new stream on the next `KeepAlive` call.
- Rename `github.com/coreos/client` to `github.com/coreos/clientv2`.
- [`etcd --experimental-initial-corrupt-check`](TODO) has been deprecated.
- Use [`etcd --initial-corrupt-check`](TODO) instead.
- [`etcd --experimental-corrupt-check-time`](TODO) has been deprecated.
- Use [`etcd --corrupt-check-time`](TODO) instead.
- Enable TLS 1.3, deprecate TLS cipher suites.
### etcd server
- [`etcd --initial-corrupt-check`](TODO) flag is now stable (`etcd --experimental-initial-corrupt-check` has been deprecated).
- `etcd --initial-corrupt-check=true` by default, to check cluster database hashes before serving client/peer traffic.
- [`etcd --corrupt-check-time`](TODO) flag is now stable (`etcd --experimental-corrupt-check-time` has been deprecated).
- `etcd --corrupt-check-time=12h` by default, to check cluster database hashes every 12 hours.
- Enable TLS 1.3, deprecate TLS cipher suites.
### Go
- Require [*Go 2*](https://blog.golang.org/go2draft).
<hr>


@ -1,21 +0,0 @@
# Change logs
## Production recommendation
The minimum recommended etcd versions to run in **production** are v3.4.22+ and v3.5.6+. Refer to the [versioning policy](https://etcd.io/docs/v3.5/op-guide/versioning/) for more details.
### v3.5 data corruption issue
Running etcd v3.5.2, v3.5.1 and v3.5.0 under high load can cause a data corruption issue.
If the etcd process is killed, occasionally some committed transactions are not reflected on all the members.
The recommendation is to upgrade to v3.5.4+.
If you have encountered data corruption, please follow instructions on https://etcd.io/docs/v3.5/op-guide/data_corruption/.
## Change log rules
1. Each patch release only includes changes against the previous patch release.
For example, the change log of v3.5.5 should only include items which are new to v3.5.4.
2. The change log of the first release of each minor or major version (e.g. 3.4.0, 3.5.0,
3.6.0, 4.0.0) only includes changes which are new relative to the first release of the
previous minor or major version. For example, v3.5.0 should only include items which are
new to v3.4.0, and v3.6.0 should only include items which are new to v3.5.0.


@ -1,148 +1,62 @@
# How to contribute
etcd is Apache 2.0 licensed and accepts contributions via GitHub pull requests.
This document outlines basics of contributing to etcd.
etcd is Apache 2.0 licensed and accepts contributions via GitHub pull requests. This document outlines some of the conventions on commit message formatting, contact points for developers and other resources to make getting your contribution into etcd easier.
# Email and chat
- Email: [etcd-dev](https://groups.google.com/forum/?hl=en#!forum/etcd-dev)
- IRC: #[coreos](irc://irc.freenode.org:6667/#coreos) IRC channel on freenode.org
## Getting started
- Fork the repository on GitHub
- Read the README.md for build instructions
## Contribution flow
This is a rough outline of what a contributor's workflow looks like:
* [Find something to work on](#Find-something-to-work-on)
* [Setup development environment](#Setup-development-environment)
* [Implement your change](#Implement-your-change)
* [Commit your change](#Commit-your-change)
* [Create a pull request](#Create-a-pull-request)
* [Get your pull request reviewed](#Get-your-pull-request-reviewed)
If you have any questions, please reach out using one of the methods listed in [contact].
- Create a topic branch from where you want to base your work. This is usually master.
- Make commits of logical units.
- Make sure your commit messages are in the proper format (see below).
- Push your changes to a topic branch in your fork of the repository.
- Submit a pull request to coreos/etcd.
- Your PR must receive a LGTM from two maintainers found in the MAINTAINERS file.
[contact]: ./README.md#Contact
Thanks for your contributions!
## Learn more about etcd
### Code style
Before making a change please look through resources below to learn more about etcd and tools used for development.
The coding style suggested by the Golang community is used in etcd. See the [style doc](https://code.google.com/p/go-wiki/wiki/CodeReviewComments) for details.
* Please learn about [Git](https://github.com/git-guides) version control system used in etcd.
* Read the [etcd learning resources](https://etcd.io/docs/v3.5/learning/)
* Read the [etcd community membership](/Documentation/contributor-guide/community-membership.md)
* Watch [etcd deep dive](https://www.youtube.com/watch?v=D2pm6ufIt98&t=927s)
* Watch [etcd code walk through](https://www.youtube.com/watch?v=H3XaSF6wF7w)
Please follow this style to make etcd easy to review, maintain and develop.
## Find something to work on
### Format of the Commit Message
All the work in etcd project is tracked in [github issue tracker].
Issues should be properly labeled making it easy to find something for you.
We follow a rough convention for commit messages that is designed to answer two
questions: what changed and why. The subject line should feature the what and
the body of the commit should describe the why.
Depending on your interest and experience you should check different labels:
* If you are just starting, check issues labeled with [good first issue].
* When you feel more comfortable in your contributions, check out [help wanted].
* Advanced contributors can try to help with issues labeled [priority/important] covering most relevant work at the time.
```
scripts: add the test-cluster command
If any of the aforementioned labels has no unassigned issues, please [contact] one of the [maintainers] and ask them to triage more issues.
this uses tmux to setup a test cluster that you can easily kill and
start for debugging.
[github issue tracker]: https://github.com/etcd-io/etcd/issues
[good first issue]: https://github.com/etcd-io/etcd/labels/good%20first%20issue
[help wanted]: https://github.com/etcd-io/etcd/labels/help%20wanted
[maintainers]: https://github.com/etcd-io/etcd/blob/main/MAINTAINERS
[priority/important]: https://github.com/etcd-io/etcd/labels/priority%2Fimportant
## Setup development environment
The etcd project supports two options for development:
1. Manually setup local environment.
2. Automatically setup [devcontainer](https://containers.dev).
For both options the only supported architecture is `linux-amd64`. Bug reports for other environments will generally be ignored. Supporting new environments requires the introduction of proper tests and maintainer support that is currently lacking in the etcd project.
If you would like etcd to support your preferred environment you can [file an issue].
### Option 1 - Manually setup local environment
This is the original etcd development environment. It is the best supported option and is backwards compatible for development of older etcd versions.
Follow the steps below to setup the environment:
- [Clone the repository](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository)
- Install Go by following [installation](https://go.dev/doc/install). Please check minimal go version in [go.mod file](./go.mod#L3).
- Install build tools (`make`):
- For debian based distributions you can run `sudo apt-get install build-essential`
- Verify that everything is installed by running `make build`
Note: `make build` runs with `-v`. Other build flags can be added through env `GO_BUILD_FLAGS`, **if required**. Eg.,
```console
GO_BUILD_FLAGS="-buildmode=pie" make build
Fixes #38
```
### Option 2 - Automatically setup devcontainer
The format can be described more formally as follows:
This is a more recently added environment that aims to make it faster for new contributors to get started with etcd. This option is supported for etcd versions 3.6 onwards.
This option can be [used locally](https://code.visualstudio.com/docs/devcontainers/tutorial) on a system running Visual Studio Code and Docker, or in a remote cloud based [Codespaces](https://github.com/features/codespaces) environment.
To get started, create a codespace for this repository by clicking this 👇
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=11225014)
A codespace will open in a web-based version of Visual Studio Code. The [dev container](.devcontainer/devcontainer.json) is fully configured with software needed for this project.
**Note**: Dev containers is an open spec which is supported by [GitHub Codespaces](https://github.com/codespaces) and [other tools](https://containers.dev/supporting).
[file an issue]: https://github.com/etcd-io/etcd/issues/new/choose
## Implement your change
etcd code should follow coding style suggested by the Golang community.
See the [style doc](https://github.com/golang/go/wiki/CodeReviewComments) for details.
Please ensure that your change passes static analysis (requires [golangci-lint](https://golangci-lint.run/usage/install/)):
- `make verify` to verify if all checks pass.
- `make verify-*` to verify a single check, for example `make verify-bom` to verify if bill-of-materials.json file is up-to-date.
- `make fix` to fix all checks.
- `make fix-*` to fix a single check, for example `make fix-bom` to update bill-of-materials.json.
Please ensure that your change passes tests.
- `make test-unit` to run unit tests.
- `make test-integration` to run integration tests.
- `make test-e2e` to run e2e tests.
All changes are expected to come with unit tests.
All new features are expected to have either e2e or integration tests.
## Commit your change
etcd follows a rough convention for commit messages:
* First line:
* Should start with the name of the package (for example `etcdserver`, `etcdctl`) followed by a `:` character.
* Describe the `what` behind the change
* Optionally author might provide the `why` behind the change in the main commit message body.
* Last line should be `Signed-off-by: firstname lastname <email@example.com>` (can be automatically generated by providing `--signoff` to the git commit command).
Example of commit message:
```
etcdserver: add grpc interceptor to log info on incoming requests
To improve debuggability of etcd v3. Added a grpc interceptor to log
info on incoming requests to etcd server. The log output includes
remote client info, request content (with value field redacted), request
handling latency, response size, etc. Uses zap logger if available,
otherwise uses capnslog.
Signed-off-by: FirstName LastName <github@github.com>
<subsystem>: <what changed>
<BLANK LINE>
<why this change was made>
<BLANK LINE>
<footer>
```
## Create a pull request
Please follow [making a pull request](https://docs.github.com/en/get-started/quickstart/contributing-to-projects#making-a-pull-request) guide.
If you are still working on the pull request, you can convert it to draft by clicking `Convert to draft` link just below list of reviewers.
Multiple small PRs are preferred over single large ones (>500 lines of code).
## Get your pull request reviewed
Before requesting review please ensure that all GitHub checks were successful.
It might happen that some unrelated tests on your PR are failing, due to their flakiness.
In such cases please [file an issue] to deflake the problematic test and ask one of [maintainers] to rerun the tests.
If all checks were successful feel free to reach out for review from people that were involved in the original discussion or [maintainers].
Depending on complexity of the PR it might require between 1 and 2 maintainers to approve your change before merging.
Thanks for contributing!
The first line is the subject and should be no longer than 70 characters, the
second line is always blank, and other lines should be wrapped at 80 characters.
This allows the message to be easier to read on GitHub as well as in various
git tools.
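The length rules above can be checked mechanically. The following is a minimal sketch, not part of the etcd tooling: `check_commit_msg` is a hypothetical helper name, and it only covers the 70/80-character and blank-line rules described here.

```sh
# check_commit_msg reads a commit message on stdin and reports violations of
# the subject/body length conventions. Hypothetical helper, not an etcd script.
check_commit_msg() {
  awk '
    NR == 1 && length($0) > 70 { print "subject is longer than 70 characters"; bad = 1 }
    NR == 2 && length($0) > 0  { print "second line must be blank"; bad = 1 }
    NR > 2  && length($0) > 80 { print "line " NR " is longer than 80 characters"; bad = 1 }
    END { exit bad + 0 }
  '
}
```

For example, `git log -1 --format=%B | check_commit_msg` checks the most recent commit and exits non-zero if any rule is violated.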


@ -1,14 +1,2 @@
ARG ARCH=amd64
FROM --platform=linux/${ARCH} gcr.io/distroless/static-debian11
ADD etcd /usr/local/bin/
ADD etcdctl /usr/local/bin/
ADD etcdutl /usr/local/bin/
WORKDIR /var/etcd/
WORKDIR /var/lib/etcd/
EXPOSE 2379 2380
# Define default command.
CMD ["/usr/local/bin/etcd"]
FROM golang:onbuild
EXPOSE 4001 7001 2379 2380


@ -1,4 +0,0 @@
This directory includes etcd project internal documentation for new and existing contributors.
For user and developer documentation please go to [etcd.io](https://etcd.io/),
which is developed in [website](https://github.com/etcd-io/website/) repo.


@ -0,0 +1,219 @@
## Administration
### Data Directory
#### Lifecycle
When first started, etcd stores its configuration into a data directory specified by the data-dir configuration parameter.
Configuration is stored in the write ahead log and includes: the local member ID, cluster ID, and initial cluster configuration.
The write ahead log and snapshot files are used during member operation and to recover after a restart.
If a member's data directory is ever lost or corrupted then the user should remove the etcd member from the cluster via the [members API][members-api].
A user should avoid restarting an etcd member with a data directory from an out-of-date backup.
Using an out-of-date data directory can lead to inconsistency, because the member had agreed to store information via raft and then rejoins saying it needs that information again.
For maximum safety, if an etcd member suffers any sort of data corruption or loss, it must be removed from the cluster.
Once removed the member can be re-added with an empty data directory.
[members-api]: other_apis.md#members-api
#### Contents
The data directory has two sub-directories in it:
1. wal: write ahead log files are stored here. For details see the [wal package documentation][wal-pkg]
2. snap: log snapshots are stored here. For details see the [snap package documentation][snap-pkg]
[wal-pkg]: http://godoc.org/github.com/coreos/etcd/wal
[snap-pkg]: http://godoc.org/github.com/coreos/etcd/snap
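As a rough sanity check before reusing or copying a data directory, the layout described above can be verified with a few lines of shell. This is only a sketch: `check_data_dir` is a hypothetical helper, and it assumes the `wal` and `snap` sub-directories sit directly under the directory passed in, as this section describes.

```sh
# check_data_dir DIR: succeed only if DIR contains both sub-directories
# described above. Hypothetical helper, not shipped with etcd.
check_data_dir() {
  for sub in wal snap; do
    if [ ! -d "$1/$sub" ]; then
      echo "missing $sub directory in $1" >&2
      return 1
    fi
  done
  return 0
}
```

The check says nothing about whether the files inside are current or intact; it only catches an empty or mistyped path.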
### Cluster Management
#### Lifecycle
If you are spinning up multiple clusters for testing it is recommended that you specify a unique initial-cluster-token for the different clusters.
This can protect you from cluster corruption in case of mis-configuration because two members started with different cluster tokens will refuse members from each other.
#### Optimal Cluster Size
The recommended etcd cluster size is 3, 5 or 7, depending on the fault tolerance requirement. A 7-member cluster provides enough fault tolerance in most cases. While a larger cluster provides better fault tolerance, write performance decreases because data needs to be replicated to more machines.
#### Fault Tolerance Table
It is recommended to have an odd number of members in a cluster. Growing an even-sized cluster by one member to an odd size doesn't change the number needed for majority, but it raises the failure tolerance by one. You can see this in practice when comparing even and odd sized clusters:
| Cluster Size | Majority | Failure Tolerance |
|--------------|------------|-------------------|
| 1 | 1 | 0 |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | **2** |
| 6 | 4 | 2 |
| 7 | 4 | **3** |
| 8 | 5 | 3 |
| 9 | 5 | **4** |
As you can see, adding another member to bring the size of cluster up to an odd size is always worth it. During a network partition, an odd number of members also guarantees that there will almost always be a majority of the cluster that can continue to operate and be the source of truth when the partition ends.
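The table follows from two integer formulas: the majority is `N/2 + 1` and the failure tolerance is `(N-1)/2`, both rounded down. A minimal sketch that reproduces the rows (the function names `majority` and `failure_tolerance` are illustrative only):

```sh
# Majority (quorum) size for a cluster of N members, using integer division.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority still remains.
failure_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

# Print one line per cluster size listed in the table above.
for n in 1 3 4 5 6 7 8 9; do
  printf 'size=%s majority=%s tolerance=%s\n' \
    "$n" "$(majority "$n")" "$(failure_tolerance "$n")"
done
```

Running the loop for consecutive sizes shows the pattern directly: going from 4 to 5 members keeps the majority at 3 while the tolerance rises from 1 to 2.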
#### Changing Cluster Size
After your cluster is up and running, adding or removing members is done via [runtime reconfiguration](runtime-configuration.md), which allows the cluster to be modified without downtime. The `etcdctl` tool has `member list`, `member add` and `member remove` commands to complete this process.
### Member Migration
When there is a scheduled machine maintenance or retirement, you might want to migrate an etcd member to another machine without losing the data and changing the member ID.
The data directory contains all the data to recover a member to its point-in-time state. To migrate a member:
* Stop the member process
* Copy the data directory of the now-idle member to the new machine
* Update the peer URLs for that member to reflect the new machine using the [member API][change peer url]
* Start etcd on the new machine, using the same configuration and the copy of the data directory
This example will walk you through the process of migrating the infra1 member to a new machine:
|Name|Peer URL|
|------|--------------|
|infra0|10.0.1.10:2380|
|infra1|10.0.1.11:2380|
|infra2|10.0.1.12:2380|
```
$ export ETCDCTL_PEERS=http://10.0.1.10:2379,http://10.0.1.11:2379,http://10.0.1.12:2379
```
```
$ etcdctl member list
84194f7c5edd8b37: name=infra0 peerURLs=http://10.0.1.10:2380 clientURLs=http://127.0.0.1:2379,http://10.0.1.10:2379
b4db3bf5e495e255: name=infra1 peerURLs=http://10.0.1.11:2380 clientURLs=http://127.0.0.1:2379,http://10.0.1.11:2379
bc1083c870280d44: name=infra2 peerURLs=http://10.0.1.12:2380 clientURLs=http://127.0.0.1:2379,http://10.0.1.12:2379
```
#### Stop the member etcd process
```
$ ssh core@10.0.1.11
```
```
$ sudo systemctl stop etcd
```
#### Copy the data directory of the now-idle member to the new machine
```
$ tar -cvzf node1.etcd.tar.gz -C /var/lib/etcd node1.etcd
```
```
$ scp node1.etcd.tar.gz core@10.0.1.13:~/
```
#### Update the peer URLs for that member to reflect the new machine
```
$ curl http://10.0.1.10:2379/v2/members/b4db3bf5e495e255 -XPUT \
-H "Content-Type: application/json" -d '{"peerURLs":["http://10.0.1.13:2380"]}'
```
#### Start etcd on the new machine, using the same configuration and the copy of the data directory
```
$ ssh core@10.0.1.13
```
```
$ tar -xzvf node1.etcd.tar.gz -C /var/lib/etcd
```
```
etcd -name node1 \
-listen-peer-urls http://10.0.1.13:2380 \
-listen-client-urls http://10.0.1.13:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.13:2379,http://127.0.0.1:2379
```
[change peer url]: other_apis.md#change-the-peer-urls-of-a-member
### Disaster Recovery
etcd is designed to be resilient to machine failures. An etcd cluster can automatically recover from any number of temporary failures (for example, machine reboots), and a cluster of N members can tolerate up to _(N-1)/2_ permanent failures, rounded down (where a member can no longer access the cluster, due to hardware failure or disk corruption). However, in extreme circumstances, a cluster might permanently lose enough members such that quorum is irrevocably lost. For example, if a three-node cluster suffered two simultaneous and unrecoverable machine failures, it would normally be impossible for the cluster to restore quorum and continue functioning.
To recover from such scenarios, etcd provides functionality to back up and restore the datastore and recreate the cluster without data loss.
#### Backing up the datastore
**NB:** Windows users must stop etcd before running the backup command.
The first step of the recovery is to back up the data directory on a functioning etcd node. To do this, use the `etcdctl backup` command, passing in the original data directory used by etcd. For example:
```sh
etcdctl backup \
--data-dir /var/lib/etcd \
--backup-dir /tmp/etcd_backup
```
This command will rewrite some of the metadata contained in the backup (specifically, the node ID and cluster ID), which means that the node will lose its former identity. In order to recreate a cluster from the backup, you will need to start a new, single-node cluster. The metadata is rewritten to prevent the new node from inadvertently being joined onto an existing cluster.
#### Restoring a backup
To restore a backup created using the procedure above, start etcd with the `-force-new-cluster` option, pointing it at the backup directory. This will initialize a new, single-member cluster with the default advertised peer URLs, but preserve the entire contents of the etcd data store. Continuing from the previous example:
```sh
etcd \
-data-dir=/tmp/etcd_backup \
-force-new-cluster \
...
```
Now etcd should be available on this node and serving the original datastore.
Once you have verified that etcd has started successfully, shut it down and move the data back to the previous location (you may wish to make another copy as well to be safe):
```sh
pkill etcd
rm -fr /var/lib/etcd
mv /tmp/etcd_backup /var/lib/etcd
etcd \
-data-dir=/var/lib/etcd \
...
```
#### Restoring the cluster
Now that the node is running successfully, you should [change its advertised peer URLs](other_apis.md#change-the-peer-urls-of-a-member), as the `-force-new-cluster` option has set the peer URL to the default (listening on localhost).
You can then add more nodes to the cluster and restore resiliency. See the [runtime configuration](runtime-configuration.md) guide for more details.
### Client Request Timeout
etcd sets different timeouts for various types of client requests. The timeout values are currently not tunable; making them configurable is planned (https://github.com/coreos/etcd/issues/2038).
#### Get requests
Timeout is not set for get requests, because etcd serves the result locally in a non-blocking way.
**Note**: QuorumGet request is a different type, which is mentioned in the following sections.
#### Watch requests
Timeout is not set for watch requests. etcd will not stop a watch request until client cancels it, or the connection is broken.
#### Delete, Put, Post, QuorumGet requests
The default timeout is 5 seconds. It should be large enough to allow all key modifications if the majority of the cluster is functioning.
If the request times out, it indicates two possibilities:
1. the server the request was sent to was not functioning at that time.
2. the majority of the cluster is not functioning.
If timeouts happen several times in a row, administrators should check the status of the cluster and resolve the problem as soon as possible.
### Best Practices
#### Maximum OS threads
By default, etcd uses the default configuration of the Go 1.4 runtime, which means that at most one operating system thread will be used to execute code simultaneously. (Note that this default behavior [may change in Go 1.5](https://docs.google.com/document/d/1At2Ls5_fhJQ59kDK2DFVhFu3g5mATSXqqV5QrxinasI/edit)).
When using etcd in heavy-load scenarios on machines with multiple cores it will usually be desirable to increase the number of threads that etcd can utilize. To do this, simply set the environment variable `GOMAXPROCS` to the desired number when starting etcd. For more information on this variable, see the Go [runtime](https://golang.org/pkg/runtime) documentation.
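One way to pick the value is to match the number of online CPUs. The snippet below is a sketch under assumptions: `cpu_count` is a hypothetical helper, `getconf _NPROCESSORS_ONLN` is widely available but not universal, and the member name `infra0` is a placeholder. It prints the command line instead of starting etcd, so it is safe to try.

```sh
# cpu_count prints the number of online CPUs, falling back to 1 when the
# query is unsupported or returns something that is not a number.
cpu_count() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
  case $n in
    ''|*[!0-9]*) n=1 ;;
  esac
  echo "$n"
}

# Show the command that would start etcd with one thread per CPU.
echo "GOMAXPROCS=$(cpu_count) etcd -name infra0"
```

Whether one thread per CPU is the right setting depends on what else runs on the machine; treat the printed value as a starting point.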

1075
Documentation/api.md Normal file

File diff suppressed because it is too large

434
Documentation/auth_api.md Normal file

@ -0,0 +1,434 @@
# v2 Auth and Security
## etcd Resources
There are three types of resources in etcd:
1. permission resources: users and roles in the user store
2. key-value resources: key-value pairs in the key-value store
3. settings resources: security settings, auth settings, and dynamic etcd cluster settings (election/heartbeat)
### Permission Resources
#### Users
A user is an identity to be authenticated. Each user can have multiple roles. The user has a capability (such as reading or writing) on the resource if one of the roles has that capability.
A user named `root` is required before authentication can be enabled, and it always has the ROOT role. The ROOT role can be granted to multiple users, but `root` is required for recovery purposes.
#### Roles
Each role has exactly one associated Permission List. A permission list exists for each permission on key-value resources.
The special static ROOT role (named `root`) has full permissions on all key-value resources, as well as the permission to manage user resources and settings resources. Only the ROOT role has the permission to manage user resources and modify settings resources. The ROOT role is built-in and does not need to be created.
There is also a special GUEST role, named `guest`. It defines the permissions given to unauthenticated requests to etcd. This role is created automatically and, for backward compatibility, allows access to the full keyspace by default (etcd did not previously authenticate any actions). A ROOT role holder can modify this role at any time to reduce the capabilities of unauthenticated users.
#### Permissions
There are two types of permissions, `read` and `write`. All management and settings require the ROOT role.
A Permission List is a list of allowed patterns for that particular permission (read or write). Only ALLOW prefixes are supported. DENY becomes more complicated and is TBD.
### Key-Value Resources
A key-value resource is a key-value pair in the store. Given a list of matching patterns, permission for any given key in a request is granted if any of the patterns in the list match.
Only prefixes or exact keys are supported. A prefix permission string ends in `*`.
A permission on `/foo` is for that exact key or directory, not its children or recursively. `/foo*` is a prefix that matches `/foo` recursively, all keys thereunder, and keys with that prefix (e.g. `/foobar`; contrast with the prefix `/foo/*`). `*` alone is permission on the full keyspace.
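The matching rule described above can be sketched in a few lines of shell. This is an illustration under stated assumptions, not etcd's implementation: `perm_matches` and `perm_allowed` are hypothetical names, and a pattern is treated as a prefix only when it ends in `*`.

```sh
# perm_matches PATTERN KEY: succeed if KEY is covered by PATTERN.
# Hypothetical helper illustrating the documented rule, not etcd code.
perm_matches() {
  pattern=$1
  key=$2
  case $pattern in
    *\*)
      # Prefix pattern: drop the trailing "*" and compare the key's prefix.
      prefix=${pattern%\*}
      case $key in
        "$prefix"*) return 0 ;;
      esac
      return 1
      ;;
    *)
      # Exact pattern: the key must be identical.
      [ "$key" = "$pattern" ]
      ;;
  esac
}

# perm_allowed KEY PATTERN...: succeed if any pattern in the list matches KEY.
perm_allowed() {
  key_to_check=$1
  shift
  for p in "$@"; do
    perm_matches "$p" "$key_to_check" && return 0
  done
  return 1
}
```

With these helpers, `perm_matches '/foo*' /foobar` succeeds while `perm_matches '/foo/*' /foobar` fails, which is the contrast drawn in the text.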
### Settings Resources
Specific settings for the cluster as a whole. This can include adding and removing cluster members, enabling or disabling authentication, replacing certificates, and any other dynamic configuration by the administrator (holder of the ROOT role).
## v2 Auth
### Basic Auth
We only support [Basic Auth](http://en.wikipedia.org/wiki/Basic_access_authentication) for the first version. Clients need to attach the Basic Auth credentials in the HTTP `Authorization` header.
### Authorization field for operations
Added to requests to /v2/keys, /v2/auth
Add code 401 Unauthorized to the set of responses from the v2 API
Authorization: Basic {encoded string}
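As a sketch of what a client has to send, the encoded string is the Base64 encoding of `user:password`. The helper below is hypothetical (`basic_auth_header` is not an etcd or etcdctl command), the credentials are placeholders, and it assumes a `base64` utility is installed.

```sh
# basic_auth_header USER PASSWORD: print the Authorization header line.
# Hypothetical helper; assumes the base64 utility is available.
basic_auth_header() {
  # Encode "user:password"; tr strips any line wrapping added by base64.
  encoded=$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')
  printf 'Authorization: Basic %s\n' "$encoded"
}

basic_auth_header root secret
```

Tools such as curl can build the same header themselves from `-u user:password`, so the helper is mainly useful for seeing what goes over the wire. Basic Auth is an encoding, not encryption, so the credentials are only protected if the connection itself is.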
### Future Work
Other types of auth can be considered for the future (e.g. signed certs, public keys), and the `Authorization:` header allows for other such types.
### Things out of Scope for etcd Permissions
* Pluggable AUTH backends like LDAP (other Authorization tokens generated by LDAP et al may be a possibility)
* Very fine-grained access controls (eg: users modifying keys outside work hours)
## API endpoints
An Error JSON corresponds to:
{
"name": "ErrErrorName",
"description" : "The longer helpful description of the error."
}
#### Enable and Disable Authentication
**Get auth status**
GET /v2/auth/enable
Sent Headers:
Possible Status Codes:
200 OK
200 Body:
{
"enabled": true
}
**Enable auth**
PUT /v2/auth/enable
Sent Headers:
Put Body: (empty)
Possible Status Codes:
200 OK
400 Bad Request (if root user has not been created)
409 Conflict (already enabled)
200 Body: (empty)
**Disable auth**
DELETE /v2/auth/enable
Sent Headers:
Authorization: Basic <RootAuthString>
Possible Status Codes:
200 OK
401 Unauthorized (if not a root user)
409 Conflict (already disabled)
200 Body: (empty)
#### Users
The User JSON object is formed as follows:
```
{
"user": "userName",
"password": "password",
"roles": [
"role1",
"role2"
],
"grant": [],
"revoke": []
}
```
Password is only passed when necessary.
**Get a list of users**
GET/HEAD /v2/auth/users
Sent Headers:
Authorization: Basic <BasicAuthString>
Possible Status Codes:
200 OK
401 Unauthorized
200 Headers:
Content-type: application/json
200 Body:
{
"users": ["alice", "bob", "eve"]
}
**Get User Details**
GET/HEAD /v2/auth/users/alice
Sent Headers:
Authorization: Basic <BasicAuthString>
Possible Status Codes:
200 OK
401 Unauthorized
404 Not Found
200 Headers:
Content-type: application/json
200 Body:
{
"user" : "alice",
"roles" : ["fleet", "etcd"]
}
**Create Or Update A User**
A user can be created with initial roles, if filled in. However, no roles are required; only the username and password fields are.
PUT /v2/auth/users/charlie
Sent Headers:
Authorization: Basic <BasicAuthString>
Put Body:
JSON struct, above, matching the appropriate name
* Starting password and roles when creating.
* Grant/Revoke/Password filled in when updating (to grant roles, revoke roles, or change the password).
Possible Status Codes:
200 OK
201 Created
400 Bad Request
401 Unauthorized
404 Not Found (update non-existent users)
409 Conflict (when granting duplicated roles or revoking non-existent roles)
200 Headers:
Content-type: application/json
200 Body:
JSON state of the user
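The request body for a create or update can be assembled as in the sketch below. It assumes that fields which are not needed are simply left out of the JSON; the helper name `user_request_body` and the password value are hypothetical.

```python
import json
from typing import List, Optional


def user_request_body(
    user: str,
    password: Optional[str] = None,
    roles: Optional[List[str]] = None,
    grant: Optional[List[str]] = None,
    revoke: Optional[List[str]] = None,
) -> str:
    """Serialize a User JSON object, omitting fields that were not supplied."""
    fields = {
        "user": user,
        "password": password,
        "roles": roles,
        "grant": grant,
        "revoke": revoke,
    }
    return json.dumps({k: v for k, v in fields.items() if v is not None})


# Creating a user: starting password and roles.
print(user_request_body("charlie", password="charliepw", roles=["fleet"]))
# Updating a user: grant one role and revoke another.
print(user_request_body("charlie", grant=["etcd"], revoke=["fleet"]))
```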
**Remove A User**
DELETE /v2/auth/users/charlie
Sent Headers:
Authorization: Basic <BasicAuthString>
Possible Status Codes:
200 OK
401 Unauthorized
403 Forbidden (remove root user when auth is enabled)
404 Not Found
200 Headers:
200 Body: (empty)
#### Roles
A full role structure may look like this. A Permission List structure is used for the "permissions", "grant", and "revoke" keys.
```
{
"role" : "fleet",
"permissions" : {
"kv" : {
"read" : [ "/fleet/" ],
"write": [ "/fleet/" ]
}
},
"grant" : {"kv": {...}},
"revoke": {"kv": {...}}
}
```
**Get a list of Roles**
GET/HEAD /v2/auth/roles
Sent Headers:
Authorization: Basic <BasicAuthString>
Possible Status Codes:
200 OK
401 Unauthorized
200 Headers:
Content-type: application/json
200 Body:
{
"roles": ["fleet", "etcd", "quay"]
}
**Get Role Details**
GET/HEAD /v2/auth/roles/fleet
Sent Headers:
Authorization: Basic <BasicAuthString>
Possible Status Codes:
200 OK
401 Unauthorized
404 Not Found
200 Headers:
Content-type: application/json
200 Body:
{
"role" : "fleet",
"permissions" : {
"kv" : {
"read": [ "/fleet/" ],
"write": [ "/fleet/" ]
}
}
}
**Create Or Update A Role**
PUT /v2/auth/roles/rkt
Sent Headers:
Authorization: Basic <BasicAuthString>
Put Body:
Initial desired JSON state, including the role name for verification and:
* Starting permission set if creating
* Granted/Revoked permission set if updating
Possible Status Codes:
200 OK
201 Created
400 Bad Request
401 Unauthorized
404 Not Found (update non-existent roles)
409 Conflict (when granting duplicated permission or revoking non-existent permission)
200 Body:
JSON state of the role
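The grant/revoke behavior and the 409 cases listed above can be modeled roughly as follows. This is a sketch of the documented behavior for the `kv` permission lists, not the server's code; `apply_role_update` and `Conflict` are hypothetical names.

```python
from typing import Dict, List

Permissions = Dict[str, List[str]]  # {"read": [...paths], "write": [...paths]}


class Conflict(Exception):
    """Stands in for the 409 Conflict status listed above."""


def apply_role_update(current: Permissions, grant: Permissions, revoke: Permissions) -> Permissions:
    """Return the kv permissions of a role after a grant/revoke update."""
    # Copy the lists so the caller's role definition is left untouched.
    updated = {access: list(paths) for access, paths in current.items()}
    for access, paths in grant.items():
        granted = updated.setdefault(access, [])
        for path in paths:
            if path in granted:
                raise Conflict(f"{access} on {path} is already granted")
            granted.append(path)
    for access, paths in revoke.items():
        granted = updated.setdefault(access, [])
        for path in paths:
            if path not in granted:
                raise Conflict(f"{access} on {path} was never granted")
            granted.remove(path)
    return updated


fleet = apply_role_update({"read": [], "write": []}, {"read": ["/fleet/*"]}, {})
print(fleet)
```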
**Remove A Role**
DELETE /v2/auth/roles/rkt
Sent Headers:
Authorization: Basic <BasicAuthString>
Possible Status Codes:
200 OK
401 Unauthorized
403 Forbidden (remove root)
404 Not Found
200 Headers:
200 Body: (empty)
## Example Workflow
Let's walk through an example to show two tenants (applications, in our case) using etcd permissions.
### Create root user
```
PUT /v2/auth/users/root
Put Body:
{"user" : "root", "password": "betterRootPW!"}
```
### Enable auth
```
PUT /v2/auth/enable
```
### Modify guest role (revoke write permission)
```
PUT /v2/auth/roles/guest
Headers:
Authorization: Basic <root:betterRootPW!>
Put Body:
{
"role" : "guest",
"revoke" : {
"kv" : {
"write": [
"*"
]
}
}
}
```
### Create Roles for the Applications
Create the rkt role fully specified:
```
PUT /v2/auth/roles/rkt
Headers:
Authorization: Basic <root:betterRootPW!>
Body:
{
"role" : "rkt",
"permissions" : {
"kv": {
"read": [
"/rkt/*"
],
"write": [
"/rkt/*"
]
}
}
}
```
But let's make fleet just a basic role for now:
```
PUT /v2/auth/roles/fleet
Headers:
Authorization: Basic <root:betterRootPW!>
Body:
{
"role" : "fleet"
}
```
### Optional: Grant some permissions to the roles
Well, we finally figured out where we want fleet to live. Let's fix it.
(Note that we avoided this in the rkt case. So this step is optional.)
```
PUT /v2/auth/roles/fleet
Headers:
Authorization: Basic <root:betterRootPW!>
Put Body:
{
"role" : "fleet",
"grant" : {
"kv" : {
"read": [
"/rkt/fleet",
"/fleet/*"
]
}
}
}
```
### Create Users
Same as before, let's create the rkt user all at once and the fleet user in separate steps.
```
PUT /v2/auth/users/rktuser
Headers:
Authorization: Basic <root:betterRootPW!>
Body:
{"user" : "rktuser", "password" : "rktpw", "roles" : ["rkt"]}
```
```
PUT /v2/auth/users/fleetuser
Headers:
Authorization: Basic <root:betterRootPW!>
Body:
{"user" : "fleetuser", "password" : "fleetpw"}
```
### Optional: Grant Roles to Users
Likewise, let's explicitly grant fleetuser access.
```
PUT /v2/auth/users/fleetuser
Headers:
Authorization: Basic <root:betterRootPW!>
Body:
{"user": "fleetuser", "grant": ["fleet"]}
```
#### Start to use fleetuser and rktuser
For example:
```
PUT /v2/keys/rkt/RktData
Headers:
Authorization: Basic <rktuser:rktpw>
Body:
value=launch
```
Reads and writes outside the prefixes granted will fail with a 401 Unauthorized.
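To make the outcome of this workflow concrete, the sketch below evaluates requests against the roles and users created above. It is a simplified model (it ignores the `guest` and `root` roles and is not how etcd itself is implemented); `is_allowed` and `matches` are hypothetical helpers.

```python
# Roles and users as they stand at the end of the workflow above.
ROLES = {
    "rkt": {"read": ["/rkt/*"], "write": ["/rkt/*"]},
    "fleet": {"read": ["/rkt/fleet", "/fleet/*"], "write": []},
}
USERS = {"rktuser": ["rkt"], "fleetuser": ["fleet"]}


def matches(pattern: str, key: str) -> bool:
    """A trailing `*` is a prefix match; anything else is an exact match."""
    if pattern.endswith("*"):
        return key.startswith(pattern[:-1])
    return key == pattern


def is_allowed(user: str, access: str, key: str) -> bool:
    """True if any role held by `user` grants `access` ("read"/"write") on `key`."""
    return any(
        matches(pattern, key)
        for role in USERS.get(user, [])
        for pattern in ROLES[role][access]
    )


print(is_allowed("rktuser", "write", "/rkt/RktData"))
print(is_allowed("fleetuser", "write", "/fleet/units"))
```

In this model, a request for which `is_allowed` returns `False` corresponds to the 401 Unauthorized response described above.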


@ -0,0 +1,179 @@
# Authentication Guide
**NOTE: The authentication feature is considered experimental. We may change workflow without warning in future releases.**
## Overview
Authentication -- having users and roles in etcd -- was added in etcd 2.1. This guide will help you set up basic authentication in etcd.
etcd before 2.1 was a completely open system; anyone with access to the API could change keys. In order to preserve backward compatibility and upgradability, this feature is off by default.
For a full discussion of the RESTful API, see [the authentication API documentation](auth_api.md)
## Special Users and Roles
There is one special user, `root`, and there are two special roles, `root` and `guest`.
### User `root`
User `root` must be created before security can be activated. It has the `root` role and allows for the changing of anything inside etcd. The idea behind the `root` user is for recovery purposes -- a password is generated and stored somewhere -- and the root role is granted to the administrator accounts on the system. In the future, for troubleshooting and recovery, we will need to assume some access to the system, and future documentation will assume this root user (though anyone with the role will suffice).
### Role `root`
Role `root` cannot be modified, but it may be granted to any user. Having access via the root role not only allows global read-write access (as was the case before 2.1) but allows modification of the authentication policy and all administrative things, like modifying the cluster membership.
### Role `guest`
The `guest` role defines the permissions granted to any request that does not provide authentication. It will be created on security activation (if it doesn't already exist) with full access to all keys, as was true in etcd 2.0. It may be modified at any time, but cannot be removed.
## Working with users
The `user` subcommand for `etcdctl` handles all things having to do with user accounts.
A listing of users can be found with
```
$ etcdctl user list
```
Creating a user is as easy as
```
$ etcdctl user add myusername
```
You will then be prompted for a new password.
Roles can be granted and revoked for a user with
```
$ etcdctl user grant myusername -roles foo,bar,baz
$ etcdctl user revoke myusername -roles bar,baz
```
We can look at this user with
```
$ etcdctl user get myusername
```
And the password for a user can be changed with
```
$ etcdctl user passwd myusername
```
Which will prompt again for a new password.
To delete an account, there's always
```
$ etcdctl user remove myusername
```
## Working with roles
The `role` subcommand for `etcdctl` handles all things having to do with access controls for particular roles, as were granted to individual users.
A listing of roles can be found with
```
$ etcdctl role list
```
A new role can be created with
```
$ etcdctl role add myrolename
```
A role has no password; we are merely defining a new set of access rights.
Roles are granted access to various parts of the keyspace, a single path at a time.
Reading a path is simple: if the path ends in `*`, that key **and all keys prefixed with it** are granted to holders of this role. If it does not end in `*`, only that exact key is granted.
Access can be granted as either read, write, or both, as in the following examples:
```
# Give read access to keys under the /foo directory
$ etcdctl role grant myrolename -path '/foo/*' -read
# Give write-only access to the key at /foo/bar
$ etcdctl role grant myrolename -path '/foo/bar' -write
# Give full access to keys under /pub
$ etcdctl role grant myrolename -path '/pub/*' -readwrite
```
Beware that omitting the slash changes what the grant covers:
```
# Give full access to keys under /pub??
$ etcdctl role grant myrolename -path '/pub*' -readwrite
```
Without the slash, the grant may also include keys under `/publishing`, for example. To cover both the directory and the keys under it, grant `/pub` and `/pub/*`.
To see what's granted, we can look at the role at any time:
```
$ etcdctl role get myrolename
```
Revocation of permissions is done the same logical way:
```
$ etcdctl role revoke myrolename -path '/foo/bar' -write
```
As is removing a role entirely
```
$ etcdctl role remove myrolename
```
## Enabling authentication
The minimal steps to enabling auth follow. The administrator can set up users and roles before or after enabling authentication, as a matter of preference.
Make sure the root user is created:
```
$ etcdctl user add root
New password:
```
And enable authentication
```
$ etcdctl auth enable
```
After this, etcd is running with authentication enabled. To disable it for any reason, use the reciprocal command:
```
$ etcdctl -u root:rootpw auth disable
```
It would also be good to check what guests (unauthenticated users) are allowed to do:
```
$ etcdctl -u root:rootpw role get guest
```
And modify this role appropriately, depending on your policies.
## Using `etcdctl` to authenticate
`etcdctl` supports a `-u` flag for authentication, similar to the one in `curl`.
```
$ etcdctl -u user:password get foo
```
or if you prefer to be prompted:
```
$ etcdctl -u user get foo
```
Otherwise, all `etcdctl` commands remain the same. Users and roles can still be created and modified, but require authentication by a user with the root role.


@ -0,0 +1,114 @@
# Backward Compatibility
The main goal of the etcd 2.0 release is to improve cluster safety around bootstrapping and dynamic reconfiguration. To do this, we deprecated the old error-prone APIs and provided a new set of APIs.
The other main focus of this release was a more reliable Raft implementation, but as this change is internal it should not have any notable effect on users.
## Command Line Flags Changes
The major flag changes are mostly related to bootstrapping. The `initial-*` flags provide an improved way to specify the required criteria to start the cluster. The advertised URLs now support a list of values instead of a single value, which allows etcd users to gracefully migrate to the new set of IANA-assigned ports (2379/client and 2380/peers) while maintaining backward compatibility with the old ports.
- `-addr` is replaced by `-advertise-client-urls`.
- `-bind-addr` is replaced by `-listen-client-urls`.
- `-peer-addr` is replaced by `-initial-advertise-peer-urls`.
- `-peer-bind-addr` is replaced by `-listen-peer-urls`.
- `-peers` is replaced by `-initial-cluster`.
- `-peers-file` is replaced by `-initial-cluster`.
- `-peer-heartbeat-interval` is replaced by `-heartbeat-interval`.
- `-peer-election-timeout` is replaced by `-election-timeout`.
The documentation of new command line flags can be found at
https://github.com/coreos/etcd/blob/master/Documentation/configuration.md.
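When updating unit files or scripts, the renames above can be applied mechanically. The sketch below only rewrites flag names; the values generally still need attention by hand (for example, `-peers-file` took a file path while `-initial-cluster` takes a `name=url` list). `rename_flags` is a hypothetical helper, not part of etcd.

```python
# Old (0.4) flag -> replacement (2.0) flag, as listed above.
FLAG_REPLACEMENTS = {
    "-addr": "-advertise-client-urls",
    "-bind-addr": "-listen-client-urls",
    "-peer-addr": "-initial-advertise-peer-urls",
    "-peer-bind-addr": "-listen-peer-urls",
    "-peers": "-initial-cluster",
    "-peers-file": "-initial-cluster",
    "-peer-heartbeat-interval": "-heartbeat-interval",
    "-peer-election-timeout": "-election-timeout",
}


def rename_flags(args):
    """Rename deprecated flag names; values are passed through untouched."""
    return [FLAG_REPLACEMENTS.get(arg, arg) for arg in args]


print(rename_flags(["-name", "infra0", "-peer-heartbeat-interval", "100"]))
```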
## Data Directory Naming
The default data dir location has changed from {$hostname}.etcd to {name}.etcd.
## Data Directory Migration
The disk format within the data directory changed with etcd 2.0.
If you run etcd 2.0 on an etcd 0.4 data directory it will automatically migrate the data and start.
You will want to coordinate this upgrade by walking through each of your machines in the cluster, stopping etcd 0.4 and then starting etcd 2.0.
If you would rather manually do the migration, to test it out first in another environment, you can use the [migration tool doc][migrationtooldoc].
[migrationtooldoc]: https://github.com/coreos/etcd/blob/master/tools/etcd-migrate/README.md
## Snapshot Migration
If you are only interested in the data in etcd you can migrate a snapshot of your data from a v0.4.9+ cluster into a new etcd 2.0 cluster using a snapshot migration.
The advantage of this method is that you are directly dumping only the etcd data so you can run your old and new cluster side-by-side, snapshot the data, import it and then point your applications at this cluster.
The disadvantage is that the etcd indexes of your data will change which may confuse applications that use etcd.
To get started get the newest data snapshot from the 0.4.9+ cluster:
```
curl http://cluster.example.com:4001/v2/migration/snapshot > backup.snap
```
Now, import the snapshot into your new cluster:
```
etcdctl -C new_cluster.example.com import --snap backup.snap
```
If you have a large amount of data, you can specify more concurrent workers to copy data in parallel by using the `-c` flag.
If you have hidden keys to copy, you can use the `--hidden` flag to specify them.
And the data will quickly copy into the new cluster:
```
entering dir: /
entering dir: /foo
entering dir: /foo/bar
copying key: /foo/bar/1 1
entering dir: /
entering dir: /foo2
entering dir: /foo2/bar2
copying key: /foo2/bar2/2 2
```
## Key-Value API
### Read consistency flag
The consistent flag for read operations is removed in etcd 2.0.0. Normal read operations provide the same consistency guarantees as 0.4.6 read operations with the consistent flag set.
The read consistency guarantees are:
The consistent read guarantees sequential consistency within one client that talks to one etcd server. Reads and writes from one client to one etcd member should be observed in order. If a client writes a value to an etcd server successfully, it should be able to get the value out of the server immediately.
Each etcd member will proxy the request to the leader and only return the result to the user after the result is applied on the local member. Thus, after the write succeeds, the user is guaranteed to see the value on the member it sent the request to.
Reads do not provide linearizability. If you want a linearizable read, you need to set the quorum option to true.
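As a small illustration, the quorum option is passed as a query parameter on a v2 keys read. The Python sketch below only builds the URL; the endpoint address and the helper name `key_url` are assumptions for the example.

```python
from urllib.parse import quote, urlencode


def key_url(endpoint: str, key: str, quorum: bool = False) -> str:
    """Build a v2 keys URL, optionally requesting a quorum read."""
    url = endpoint.rstrip("/") + "/v2/keys/" + quote(key.lstrip("/"))
    if quorum:
        url += "?" + urlencode({"quorum": "true"})
    return url


print(key_url("http://127.0.0.1:2379", "/foo/bar", quorum=True))
```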
**Previous behavior**
We added an option for a consistent read in the old version of etcd because etcd 0.x redirects write requests to the leader. When the user gets the result back from the leader, the member it originally sent the request to might not have applied the write yet. With the consistent flag set to true, the client always sends read requests to the leader, so a client should be able to see its last write when consistent=true is enabled. There are no ordering guarantees among different clients.
## Standby
etcd 0.4's standby mode has been deprecated. [Proxy mode][proxymode] is introduced to solve a subset of the problems standby was solving.
Standby mode was intended for large clusters that had a subset of the members acting in the consensus process. Overall this process was too magical and allowed for operators to back themselves into a corner.
Proxy mode in 2.0 will provide similar functionality, and with improved control over which machines act as proxies due to the operator specifically configuring them. Proxies also support read only or read/write modes for increased security and durability.
[proxymode]: proxy.md
## Discovery Service
A size key needs to be provided inside a [discovery token][discoverytoken].
[discoverytoken]: clustering.md#custom-etcd-discovery-service
## HTTP Admin API
`v2/admin` on peer url and `v2/keys/_etcd` are unified under the new [v2/member API][memberapi] to better explain which machines are part of an etcd cluster, and to simplify the keyspace for all your use cases.
[memberapi]: other_apis.md
## HTTP Key Value API
- The follower can now transparently proxy write requests to the leader. Clients will no longer see 307 redirections to the leader from etcd.
- Expiration time is in UTC instead of local time.


@ -0,0 +1,5 @@
# Benchmarks
etcd benchmarks will be published regularly and tracked for each release below:
- [etcd v2.1.0](etcd-2-1-0-benchmarks.md)


@ -0,0 +1,49 @@
## Physical machines
GCE n1-highcpu-2 machine type
- 1x dedicated local SSD mounted under /var/lib/etcd
- 1x dedicated slow disk for the OS
- 1.8 GB memory
- 2x CPUs
- etcd version 2.1.0
## etcd Cluster
3 etcd members, each runs on a single machine
## Testing
Bootstrap another machine and use benchmark tool [boom](https://github.com/rakyll/boom) to send requests to each etcd member.
## Performance
### reading one single key
| key size in bytes | number of clients | target etcd server | read QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|----------|---------------|
| 64 | 1 | leader only | 1534 | 0.7 |
| 64 | 64 | leader only | 10125 | 9.1 |
| 64 | 256 | leader only | 13892 | 27.1 |
| 256 | 1 | leader only | 1530 | 0.8 |
| 256 | 64 | leader only | 10106 | 10.1 |
| 256 | 256 | leader only | 14667 | 27.0 |
| 64 | 64 | all servers | 24200 | 3.9 |
| 64 | 256 | all servers | 33300 | 11.8 |
| 256 | 64 | all servers | 24800 | 3.9 |
| 256 | 256 | all servers | 33000 | 11.5 |
### writing one single key
| key size in bytes | number of clients | target etcd server | write QPS | 90th Percentile Latency (ms) |
|-------------------|-------------------|--------------------|-----------|---------------|
| 64 | 1 | leader only | 60 | 21.4 |
| 64 | 64 | leader only | 1742 | 46.8 |
| 64 | 256 | leader only | 3982 | 90.5 |
| 256 | 1 | leader only | 58 | 20.3 |
| 256 | 64 | leader only | 1770 | 47.8 |
| 256 | 256 | leader only | 4157 | 105.3 |
| 64 | 64 | all servers | 1028 | 123.4 |
| 64 | 256 | all servers | 3260 | 123.8 |
| 256 | 64 | all servers | 1033 | 121.5 |
| 256 | 256 | all servers | 3061 | 119.3 |
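One way to read these tables is to compare, for each configuration, the throughput when requests go to all servers against the throughput when they go to the leader only. The sketch below does that arithmetic on the QPS figures copied from the tables (multi-client rows only); `all_vs_leader` is a hypothetical helper.

```python
# (key size in bytes, number of clients) -> QPS, copied from the tables above.
read_qps = {
    "leader only": {(64, 64): 10125, (64, 256): 13892, (256, 64): 10106, (256, 256): 14667},
    "all servers": {(64, 64): 24200, (64, 256): 33300, (256, 64): 24800, (256, 256): 33000},
}
write_qps = {
    "leader only": {(64, 64): 1742, (64, 256): 3982, (256, 64): 1770, (256, 256): 4157},
    "all servers": {(64, 64): 1028, (64, 256): 3260, (256, 64): 1033, (256, 256): 3061},
}


def all_vs_leader(table):
    """Ratio of all-servers QPS to leader-only QPS for each configuration."""
    return {
        cfg: table["all servers"][cfg] / table["leader only"][cfg]
        for cfg in table["leader only"]
    }


for name, table in (("read", read_qps), ("write", write_qps)):
    for (key_size, clients), ratio in all_vs_leader(table).items():
        print(f"{name} {key_size}B keys, {clients} clients: {ratio:.2f}x")
```

In these measurements, spreading reads across all members more than doubles read throughput, while writes sent to all members are slower than writes sent directly to the leader.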


@ -0,0 +1,24 @@
## Branch Management
### Guide
- New development occurs on the [master branch](https://github.com/coreos/etcd/tree/master)
- Master branch should always have a green build!
- Backwards-compatible bug fixes should target the master branch and subsequently be ported to stable branches
- Once the master branch is ready for release, it will be tagged and become the new stable branch.
The etcd team has adopted a _rolling release model_ and supports one stable version of etcd.
### Master branch
The `master` branch is our development branch. All new features land here first.
If you want to try new features, pull `master` and play with it. Note that `master` may not be stable because new features may introduce bugs.
Before the release of the next stable version, feature PRs will be frozen. We will focus on testing, bug fixes and documentation for one to two weeks.
### Stable branches
All branches with prefix `release-` are considered _stable_ branches.
After every minor release (http://semver.org/), we will have a new stable branch for that release. We will keep fixing the backwards-compatible bugs for the latest stable release, but not previous releases. The _patch_ release, incorporating any bug fixes, will be once every two weeks, given any patches.

Documentation/clustering.md

@ -0,0 +1,388 @@
# Clustering Guide
## Overview
Starting an etcd cluster statically requires that each member knows another in the cluster. In a number of cases, you might not know the IPs of your cluster members ahead of time. In these cases, you can bootstrap an etcd cluster with the help of a discovery service.
Once an etcd cluster is up and running, adding or removing members is done via [runtime reconfiguration](runtime-configuration.md).
This guide will cover the following mechanisms for bootstrapping an etcd cluster:
* [Static](#static)
* [etcd Discovery](#etcd-discovery)
* [DNS Discovery](#dns-discovery)
Each of the bootstrapping mechanisms will be used to create a three machine etcd cluster with the following details:
|Name|Address|Hostname|
|------|---------|------------------|
|infra0|10.0.1.10|infra0.example.com|
|infra1|10.0.1.11|infra1.example.com|
|infra2|10.0.1.12|infra2.example.com|
## Static
As we know the cluster members, their addresses and the size of the cluster before starting, we can use an offline bootstrap configuration by setting the `initial-cluster` flag. Each machine will get either the following command line or environment variables:
```
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380"
ETCD_INITIAL_CLUSTER_STATE=new
```
```
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
-initial-cluster-state new
```
Note that the URLs specified in `initial-cluster` are the _advertised peer URLs_, i.e. they should match the value of `initial-advertise-peer-urls` on the respective nodes.
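The `initial-cluster` value is a comma-separated list of `name=peer-url` pairs. A minimal Python sketch for building and parsing it is shown below; the helper names are hypothetical, and the handling of a member with several peer URLs (repeating its name once per URL) is an assumption of this sketch.

```python
def format_initial_cluster(members):
    """Join (name, peer URL) pairs into an initial-cluster value."""
    return ",".join(f"{name}={url}" for name, url in members)


def parse_initial_cluster(value):
    """Split an initial-cluster value into {name: [peer URLs]}."""
    cluster = {}
    for entry in value.split(","):
        name, url = entry.split("=", 1)
        cluster.setdefault(name, []).append(url)
    return cluster


members = [
    ("infra0", "http://10.0.1.10:2380"),
    ("infra1", "http://10.0.1.11:2380"),
    ("infra2", "http://10.0.1.12:2380"),
]
value = format_initial_cluster(members)
print(value)
print(parse_initial_cluster(value))
```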
If you are spinning up multiple clusters (or creating and destroying a single cluster) with the same configuration for testing purposes, it is highly recommended that you specify a unique `initial-cluster-token` for the different clusters. By doing this, etcd can generate unique cluster IDs and member IDs for the clusters even if they otherwise have the exact same configuration. This can protect you from cross-cluster interaction, which might corrupt your clusters.
On each machine you would start etcd with these flags:
```
$ etcd -name infra0 -initial-advertise-peer-urls http://10.0.1.10:2380 \
-listen-peer-urls http://10.0.1.10:2380 \
-listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.10:2379 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
-initial-cluster-state new
```
```
$ etcd -name infra1 -initial-advertise-peer-urls http://10.0.1.11:2380 \
-listen-peer-urls http://10.0.1.11:2380 \
-listen-client-urls http://10.0.1.11:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.11:2379 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
-initial-cluster-state new
```
```
$ etcd -name infra2 -initial-advertise-peer-urls http://10.0.1.12:2380 \
-listen-peer-urls http://10.0.1.12:2380 \
-listen-client-urls http://10.0.1.12:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.12:2379 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
-initial-cluster-state new
```
The command line parameters starting with `-initial-cluster` will be ignored on subsequent runs of etcd. You are free to remove the environment variables or command line flags after the initial bootstrap process. If you need to make changes to the configuration later (for example, adding or removing members to/from the cluster), see the [runtime configuration](runtime-configuration.md) guide.
### Error Cases
In the following example, we have not included our new host in the list of enumerated nodes. If this is a new cluster, the node _must_ be added to the list of initial cluster members.
```
$ etcd -name infra1 -initial-advertise-peer-urls http://10.0.1.11:2380 \
-listen-peer-urls http://10.0.1.11:2380 \
-listen-client-urls http://10.0.1.11:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.11:2379 \
-initial-cluster infra0=http://10.0.1.10:2380 \
-initial-cluster-state new
etcd: infra1 not listed in the initial cluster config
exit 1
```
In this example, we are attempting to map a node (infra0) on a different address (127.0.0.1:2380) than its enumerated address in the cluster list (10.0.1.10:2380). If this node is to listen on multiple addresses, all addresses _must_ be reflected in the "initial-cluster" configuration directive.
```
$ etcd -name infra0 -initial-advertise-peer-urls http://127.0.0.1:2380 \
-listen-peer-urls http://10.0.1.10:2380 \
-listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.10:2379 \
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
-initial-cluster-state=new
etcd: error setting up initial cluster: infra0 has different advertised URLs in the cluster and advertised peer URLs list
exit 1
```
If you configure a peer with a different set of configuration and attempt to join this cluster you will get a cluster ID mismatch and etcd will exit.
```
$ etcd -name infra3 -initial-advertise-peer-urls http://10.0.1.13:2380 \
-listen-peer-urls http://10.0.1.13:2380 \
-listen-client-urls http://10.0.1.13:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.13:2379 \
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra3=http://10.0.1.13:2380 \
-initial-cluster-state=new
etcd: conflicting cluster ID to the target cluster (c6ab534d07e8fcc4 != bc25ea2a74fb18b0). Exiting.
exit 1
```
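The first two error cases can be caught before starting etcd with a check along the lines of the sketch below. It compares URLs as plain strings and does not resolve hostnames, so it is stricter than etcd may be; `check_member` is a hypothetical helper, and the messages are copied from the output shown above.

```python
def check_member(name, advertised_peer_urls, initial_cluster):
    """Raise ValueError for the first two misconfigurations shown above.

    `initial_cluster` maps member names to lists of peer URLs.
    """
    if name not in initial_cluster:
        raise ValueError(f"{name} not listed in the initial cluster config")
    if sorted(initial_cluster[name]) != sorted(advertised_peer_urls):
        raise ValueError(
            f"{name} has different advertised URLs in the cluster "
            "and advertised peer URLs list"
        )


cluster = {
    "infra0": ["http://10.0.1.10:2380"],
    "infra1": ["http://10.0.1.11:2380"],
    "infra2": ["http://10.0.1.12:2380"],
}
check_member("infra0", ["http://10.0.1.10:2380"], cluster)  # passes silently
```

The third case, a cluster ID mismatch, only shows up when the member contacts the running cluster, so a local check like this cannot detect it.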
## Discovery
In a number of cases, you might not know the IPs of your cluster peers ahead of time. This is common when utilizing cloud providers or when your network uses DHCP. In these cases, rather than specifying a static configuration, you can use an existing etcd cluster to bootstrap a new one. We call this process "discovery".
There are two methods that can be used for discovery:
* etcd discovery service
* DNS SRV records
### etcd Discovery
#### Lifetime of a Discovery URL
A discovery URL identifies a unique etcd cluster. Instead of reusing a discovery URL, you should always create discovery URLs for new clusters.
Moreover, discovery URLs should ONLY be used for the initial bootstrapping of a cluster. To change cluster membership after the cluster is already running, see the [runtime reconfiguration][runtime] guide.
[runtime]: runtime-configuration.md
#### Custom etcd Discovery Service
Discovery uses an existing cluster to bootstrap itself. If you are using your own etcd cluster you can create a URL like so:
```
$ curl -X PUT https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/_config/size -d value=3
```
By setting the size key to the URL, you create a discovery URL with an expected cluster size of 3.
If you bootstrap an etcd cluster using discovery service with more than the expected number of etcd members, the extra etcd processes will [fall back][fall-back] to being [proxies][proxy] by default.
The URL you will use in this case will be `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` and the etcd members will use the `https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83` directory for registration as they start.
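The relationship between the discovery URL and the size key can be sketched as follows. The helper name `discovery_urls` is hypothetical; the token is the one used in the example above.

```python
def discovery_urls(keys_endpoint: str, token: str):
    """Return (discovery URL, size key URL) for a custom discovery service."""
    discovery = f"{keys_endpoint.rstrip('/')}/discovery/{token}"
    return discovery, discovery + "/_config/size"


discovery, size_key = discovery_urls(
    "https://myetcd.local/v2/keys",
    "6c007a14875d53d9bf0ef5a6fc0257c817f0fb83",
)
print("PUT value=3 to:", size_key)
print("pass to -discovery:", discovery)
```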
Now we start etcd with those relevant flags for each member:
```
$ etcd -name infra0 -initial-advertise-peer-urls http://10.0.1.10:2380 \
-listen-peer-urls http://10.0.1.10:2380 \
-listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.10:2379 \
-discovery https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83
```
```
$ etcd -name infra1 -initial-advertise-peer-urls http://10.0.1.11:2380 \
-listen-peer-urls http://10.0.1.11:2380 \
-listen-client-urls http://10.0.1.11:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.11:2379 \
-discovery https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83
```
```
$ etcd -name infra2 -initial-advertise-peer-urls http://10.0.1.12:2380 \
-listen-peer-urls http://10.0.1.12:2380 \
-listen-client-urls http://10.0.1.12:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.12:2379 \
-discovery https://myetcd.local/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83
```
This will cause each member to register itself with the custom etcd discovery service and begin the cluster once all machines have been registered.
#### Public etcd Discovery Service
If you do not have access to an existing cluster, you can use the public discovery service hosted at `discovery.etcd.io`. You can create a private discovery URL using the "new" endpoint like so:
```
$ curl https://discovery.etcd.io/new?size=3
https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
This will create the cluster with an initial expected size of 3 members. If you do not specify a size, a default of 3 will be used.
If you bootstrap an etcd cluster using discovery service with more than the expected number of etcd members, the extra etcd processes will [fall back][fall-back] to being [proxies][proxy] by default.
[fall-back]: proxy.md#fallback-to-proxy-mode-with-discovery-service
[proxy]: proxy.md
```
ETCD_DISCOVERY=https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
```
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
Now we start etcd with those relevant flags for each member:
```
$ etcd -name infra0 -initial-advertise-peer-urls http://10.0.1.10:2380 \
-listen-peer-urls http://10.0.1.10:2380 \
-listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.10:2379 \
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
```
$ etcd -name infra1 -initial-advertise-peer-urls http://10.0.1.11:2380 \
-listen-peer-urls http://10.0.1.11:2380 \
-listen-client-urls http://10.0.1.11:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.11:2379 \
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
```
$ etcd -name infra2 -initial-advertise-peer-urls http://10.0.1.12:2380 \
-listen-peer-urls http://10.0.1.12:2380 \
-listen-client-urls http://10.0.1.12:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.12:2379 \
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
This will cause each member to register itself with the discovery service and begin the cluster once all members have been registered.
You can use the environment variable `ETCD_DISCOVERY_PROXY` to cause etcd to use an HTTP proxy to connect to the discovery service.
#### Error and Warning Cases
##### Discovery Server Errors
```
$ etcd -name infra0 -initial-advertise-peer-urls http://10.0.1.10:2380 \
-listen-peer-urls http://10.0.1.10:2380 \
-listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.10:2379 \
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
etcd: error: the cluster doesnt have a size configuration value in https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de/_config
exit 1
```
##### User Errors
This error will occur if the discovery cluster already has the configured number of members, and `discovery-fallback` is explicitly disabled
```
$ etcd -name infra0 -initial-advertise-peer-urls http://10.0.1.10:2380 \
-listen-peer-urls http://10.0.1.10:2380 \
-listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.10:2379 \
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de \
-discovery-fallback exit
etcd: discovery: cluster is full
exit 1
```
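The interaction between the expected cluster size and `-discovery-fallback` can be summarized by the sketch below. It models only the behavior described in this guide; `join_outcome` is a hypothetical function, and `registered` is assumed to count the members that registered before the new process.

```python
def join_outcome(expected_size: int, registered: int, fallback: str = "proxy") -> str:
    """Decide what a newly started etcd process does during discovery."""
    if registered < expected_size:
        return "member"
    if fallback == "proxy":
        return "proxy"
    # With -discovery-fallback exit, a full cluster is a fatal error.
    raise RuntimeError("discovery: cluster is full")


print(join_outcome(3, 2))  # room left in the cluster
print(join_outcome(3, 3))  # cluster full, default fallback
```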
##### Warnings
This is a harmless warning notifying you that the discovery URL will be
ignored on this machine.
```
$ etcd -name infra0 -initial-advertise-peer-urls http://10.0.1.10:2380 \
-listen-peer-urls http://10.0.1.10:2380 \
-listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
-advertise-client-urls http://10.0.1.10:2379 \
-discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
etcdserver: discovery token ignored since a cluster has already been initialized. Valid log found at /var/lib/etcd
```
### DNS Discovery
DNS [SRV records](http://www.ietf.org/rfc/rfc2052.txt) can be used as a discovery mechanism.
The `-discovery-srv` flag can be used to set the DNS domain name where the discovery SRV records can be found.
The following DNS SRV records are looked up in the listed order:
* _etcd-server-ssl._tcp.example.com
* _etcd-server._tcp.example.com
If `_etcd-server-ssl._tcp.example.com` is found then etcd will attempt the bootstrapping process over SSL.
#### Create DNS SRV records
```
$ dig +noall +answer SRV _etcd-server._tcp.example.com
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra0.example.com.
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra1.example.com.
_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra2.example.com.
```
```
$ dig +noall +answer infra0.example.com infra1.example.com infra2.example.com
infra0.example.com. 300 IN A 10.0.1.10
infra1.example.com. 300 IN A 10.0.1.11
infra2.example.com. 300 IN A 10.0.1.12
```
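As a rough illustration of how the SRV answers above relate to peer URLs, the following sketch turns `dig` answer lines into `scheme://target:port` strings. The helper name `srv_to_peer_urls` is hypothetical, and mapping `_etcd-server-ssl` records to `https` is an assumption based on the lookup order described above; this is not etcd's own resolution code.

```shell
# srv_to_peer_urls is a hypothetical helper, not part of etcd.
# It reads `dig +noall +answer SRV ...` lines on stdin, whose fields are:
#   name ttl class type priority weight port target
srv_to_peer_urls() {
  awk '$4 == "SRV" {
    # Assumption: records under _etcd-server-ssl map to https, all others to http.
    scheme = ($1 ~ /^_etcd-server-ssl\./) ? "https" : "http"
    target = $8
    sub(/\.$/, "", target)  # drop the trailing root dot
    print scheme "://" target ":" $7
  }'
}

# Sample input copied from the dig output above.
printf '%s\n' \
  '_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra0.example.com.' \
  '_etcd-server._tcp.example.com. 300 IN SRV 0 0 2380 infra1.example.com.' \
  | srv_to_peer_urls
```

The port comes from the SRV record itself, so the peer port only has to be configured in DNS.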
#### Bootstrap the etcd cluster using DNS
etcd cluster members can listen on domain names or IP addresses; the bootstrap process will resolve DNS A records.
```
$ etcd -name infra0 \
-discovery-srv example.com \
-initial-advertise-peer-urls http://infra0.example.com:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster-state new \
-advertise-client-urls http://infra0.example.com:2379 \
-listen-client-urls http://infra0.example.com:2379 \
-listen-peer-urls http://infra0.example.com:2380
```
```
$ etcd -name infra1 \
-discovery-srv example.com \
-initial-advertise-peer-urls http://infra1.example.com:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster-state new \
-advertise-client-urls http://infra1.example.com:2379 \
-listen-client-urls http://infra1.example.com:2379 \
-listen-peer-urls http://infra1.example.com:2380
```
```
$ etcd -name infra2 \
-discovery-srv example.com \
-initial-advertise-peer-urls http://infra2.example.com:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster-state new \
-advertise-client-urls http://infra2.example.com:2379 \
-listen-client-urls http://infra2.example.com:2379 \
-listen-peer-urls http://infra2.example.com:2380
```
You can also bootstrap the cluster using IP addresses instead of domain names:
```
$ etcd -name infra0 \
-discovery-srv example.com \
-initial-advertise-peer-urls http://10.0.1.10:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster-state new \
-advertise-client-urls http://10.0.1.10:2379 \
-listen-client-urls http://10.0.1.10:2379 \
-listen-peer-urls http://10.0.1.10:2380
```
```
$ etcd -name infra1 \
-discovery-srv example.com \
-initial-advertise-peer-urls http://10.0.1.11:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster-state new \
-advertise-client-urls http://10.0.1.11:2379 \
-listen-client-urls http://10.0.1.11:2379 \
-listen-peer-urls http://10.0.1.11:2380
```
```
$ etcd -name infra2 \
-discovery-srv example.com \
-initial-advertise-peer-urls http://10.0.1.12:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster-state new \
-advertise-client-urls http://10.0.1.12:2379 \
-listen-client-urls http://10.0.1.12:2379 \
-listen-peer-urls http://10.0.1.12:2380
```
#### etcd proxy configuration
DNS SRV records can also be used to configure the list of peers for an etcd server running in proxy mode:
```
$ etcd -proxy on -discovery-srv example.com
```
# 0.4 to 2.0+ Migration Guide
In etcd 2.0 we introduced the ability to listen on more than one address and to advertise multiple addresses. This makes using etcd easier when you have complex networking, such as private and public networks on various cloud providers.
To make understanding this feature easier, we changed the naming of some flags, but we support the old flags to make the migration from the old to new version easier.
|Old Flag |New Flag |Migration Behavior |
|-----------------------|-----------------------|---------------------------------------------------------------------------------------|
|-peer-addr |-initial-advertise-peer-urls |If specified, peer-addr will be used as the only peer URL. Error if both flags specified.|
|-addr |-advertise-client-urls |If specified, addr will be used as the only client URL. Error if both flags specified.|
|-peer-bind-addr |-listen-peer-urls |If specified, peer-bind-addr will be used as the only peer bind URL. Error if both flags specified.|
|-bind-addr |-listen-client-urls |If specified, bind-addr will be used as the only client bind URL. Error if both flags specified.|
|-peers |none |Deprecated. The -initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
|-peers-file |none |Deprecated. The -initial-cluster flag provides a similar concept with different semantics. Please read this guide on cluster startup.|
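The flag renames in the table above can be sketched as a small lookup. The function name `new_flag_for` is hypothetical and only maps flag names; it does not convert the flag values, and it is meant as a reading aid rather than a migration tool.

```shell
# new_flag_for is a hypothetical helper that mirrors the migration table above.
# It maps an old (0.4) flag name to its 2.0+ replacement.
new_flag_for() {
  case "$1" in
    -peer-addr)         echo "-initial-advertise-peer-urls" ;;
    -addr)              echo "-advertise-client-urls" ;;
    -peer-bind-addr)    echo "-listen-peer-urls" ;;
    -bind-addr)         echo "-listen-client-urls" ;;
    -peers|-peers-file) echo "none (deprecated; see -initial-cluster)" ;;
    *)                  echo "$1" ;;  # not an old flag: return it unchanged
  esac
}

new_flag_for -peer-addr
new_flag_for -bind-addr
```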


@ -0,0 +1,207 @@
## Configuration Flags
etcd is configurable through command-line flags and environment variables. Options set on the command line take precedence over those from the environment.
The environment variable for the flag `-my-flag` is `ETCD_MY_FLAG`; this format applies to all flags.
To start etcd automatically using custom settings at startup in Linux, using a [systemd][systemd-intro] unit is highly recommended.
[systemd-intro]: http://freedesktop.org/wiki/Software/systemd/
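The flag-to-variable naming rule described above can be sketched as a short shell function. The name `flag_to_env` is hypothetical; it only illustrates the convention (strip leading dashes, upper-case, replace dashes with underscores, add the `ETCD_` prefix).

```shell
# flag_to_env is a hypothetical helper illustrating the naming convention;
# etcd itself does not ship such a command.
flag_to_env() {
  # Strip leading dashes, turn remaining dashes into underscores, then upper-case.
  name=$(printf '%s' "$1" | sed -e 's/^-*//' -e 's/-/_/g' | tr '[:lower:]' '[:upper:]')
  printf 'ETCD_%s\n' "$name"
}

flag_to_env -my-flag
flag_to_env -initial-cluster-token
```

Because command-line options take precedence, a variable derived this way only takes effect when the corresponding flag is not also passed on the command line.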
### Member Flags
##### -name
+ Human-readable name for this member.
+ default: "default"
+ This value is referenced as this node's own entries listed in the `-initial-cluster` flag (e.g. `default=http://localhost:2380` or `default=http://localhost:2380,default=http://localhost:7001`). It needs to match the key used in that flag if you're using [static bootstrapping](clustering.md#static).
##### -data-dir
+ Path to the data directory.
+ default: "${name}.etcd"
##### -snapshot-count
+ Number of committed transactions to trigger a snapshot to disk.
+ default: "10000"
##### -heartbeat-interval
+ Time (in milliseconds) of a heartbeat interval.
+ default: "100"
##### -election-timeout
+ Time (in milliseconds) for an election to time out. See [Documentation/tuning.md](tuning.md#time-parameters) for details.
+ default: "1000"
##### -listen-peer-urls
+ List of URLs to listen on for peer traffic.
+ default: "http://localhost:2380,http://localhost:7001"
##### -listen-client-urls
+ List of URLs to listen on for client traffic.
+ default: "http://localhost:2379,http://localhost:4001"
##### -max-snapshots
+ Maximum number of snapshot files to retain (0 is unlimited).
+ default: 5
+ The default for users on Windows is unlimited, and manual purging down to 5 (or your preference for safety) is recommended.
##### -max-wals
+ Maximum number of WAL files to retain (0 is unlimited).
+ default: 5
+ The default for users on Windows is unlimited, and manual purging down to 5 (or your preference for safety) is recommended.
##### -cors
+ Comma-separated white list of origins for CORS (cross-origin resource sharing).
+ default: none
### Clustering Flags
`-initial` prefix flags are used in bootstrapping ([static bootstrap][build-cluster], [discovery-service bootstrap][discovery] or [runtime reconfiguration][reconfig]) a new member, and ignored when restarting an existing member.
`-discovery` prefix flags need to be set when using [discovery service][discovery].
##### -initial-advertise-peer-urls
+ List of this member's peer URLs to advertise to the rest of the cluster. These addresses are used for communicating etcd data around the cluster. At least one must be routable to all cluster members.
+ default: "http://localhost:2380,http://localhost:7001"
##### -initial-cluster
+ Initial cluster configuration for bootstrapping.
+ default: "default=http://localhost:2380,default=http://localhost:7001"
+ The key is the value of the `-name` flag for each node provided. The default uses `default` for the key because this is the default for the `-name` flag.
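To make the key/value relationship concrete, here is a minimal sketch that assembles an `-initial-cluster` value from member names and peer URLs. The function `initial_cluster` is hypothetical; the member names and addresses are the sample values used elsewhere in this documentation.

```shell
# initial_cluster is a hypothetical helper: it joins "name url" argument pairs
# into the comma-separated name=url form expected by -initial-cluster.
initial_cluster() {
  out=""
  while [ "$#" -ge 2 ]; do
    # Each key must equal the -name value of the member that owns the URL.
    out="${out:+${out},}$1=$2"
    shift 2
  done
  printf '%s\n' "$out"
}

initial_cluster \
  infra0 http://10.0.1.10:2380 \
  infra1 http://10.0.1.11:2380 \
  infra2 http://10.0.1.12:2380
```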
##### -initial-cluster-state
+ Initial cluster state ("new" or "existing"). Set to `new` for all members present during initial static or DNS bootstrapping. If this option is set to `existing`, etcd will attempt to join the existing cluster. If the wrong value is set, etcd will attempt to start but fail safely.
+ default: "new"
[static bootstrap]: clustering.md#static
##### -initial-cluster-token
+ Initial cluster token for the etcd cluster during bootstrap.
+ default: "etcd-cluster"
##### -advertise-client-urls
+ List of this member's client URLs to advertise to the rest of the cluster.
+ default: "http://localhost:2379,http://localhost:4001"
##### -discovery
+ Discovery URL used to bootstrap the cluster.
+ default: none
##### -discovery-srv
+ DNS srv domain used to bootstrap the cluster.
+ default: none
##### -discovery-fallback
+ Expected behavior ("exit" or "proxy") when the discovery service fails.
+ default: "proxy"
##### -discovery-proxy
+ HTTP proxy to use for traffic to discovery service.
+ default: none
### Proxy Flags
`-proxy` prefix flags configure etcd to run in [proxy mode][proxy].
##### -proxy
+ Proxy mode setting ("off", "readonly" or "on").
+ default: "off"
##### -proxy-failure-wait
+ Time (in milliseconds) an endpoint will be held in a failed state before being reconsidered for proxied requests.
+ default: 5000
##### -proxy-refresh-interval
+ Time (in milliseconds) of the endpoints refresh interval.
+ default: 30000
##### -proxy-dial-timeout
+ Time (in milliseconds) for a dial to time out, or 0 to disable the timeout.
+ default: 1000
##### -proxy-write-timeout
+ Time (in milliseconds) for a write to time out, or 0 to disable the timeout.
+ default: 5000
##### -proxy-read-timeout
+ Time (in milliseconds) for a read to time out, or 0 to disable the timeout.
+ Don't change this value if you use watches, because watches use long-polling requests.
+ default: 0
### Security Flags
The security flags help to [build a secure etcd cluster][security].
##### -ca-file [DEPRECATED]
+ Path to the client server TLS CA file.
+ default: none
##### -cert-file
+ Path to the client server TLS cert file.
+ default: none
##### -key-file
+ Path to the client server TLS key file.
+ default: none
##### -client-cert-auth
+ Enable client cert authentication.
+ default: false
##### -trusted-ca-file
+ Path to the client server TLS trusted CA key file.
+ default: none
##### -peer-ca-file [DEPRECATED]
+ Path to the peer server TLS CA file.
+ default: none
##### -peer-cert-file
+ Path to the peer server TLS cert file.
+ default: none
##### -peer-key-file
+ Path to the peer server TLS key file.
+ default: none
##### -peer-client-cert-auth
+ Enable peer client cert authentication.
+ default: false
##### -peer-trusted-ca-file
+ Path to the peer server TLS trusted CA file.
+ default: none
### Logging Flags
##### -debug
+ Drop the default log level to DEBUG for all subpackages.
+ default: false (INFO for all packages)
##### -log-package-levels
+ Set individual etcd subpackages to specific log levels, for example `etcdserver=WARNING,security=DEBUG`.
+ default: none (INFO for all packages)
### Unsafe Flags
Please be CAUTIOUS when using unsafe flags because they will break the guarantees given by the consensus protocol.
For example, etcd may panic if other members in the cluster are still alive.
Follow the instructions when using these flags.
##### -force-new-cluster
+ Force the creation of a new one-member cluster. It forcibly commits configuration changes that remove all existing members in the cluster and add itself. It needs to be set to [restore a backup][restore].
+ default: false
### Miscellaneous Flags
##### -version
+ Print the version and exit.
+ default: false
[build-cluster]: clustering.md#static
[reconfig]: runtime-configuration.md
[discovery]: clustering.md#discovery
[proxy]: proxy.md
[security]: security.md
[restore]: admin_guide.md#restoring-a-backup


@ -1,27 +0,0 @@
# Branch management
## Guide
* New development occurs on the [main branch][main].
* Main branch should always have a green build!
* Backwards-compatible bug fixes should target the main branch and subsequently be ported to stable branches.
* Once the main branch is ready for release, it will be tagged and become the new stable branch.
The etcd team has adopted a *rolling release model* and supports two stable versions of etcd.
### Main branch
The `main` branch is our development branch. All new features land here first.
To try new and experimental features, pull `main` and play with it. Note that `main` may not be stable because new features may introduce bugs.
Before the release of the next stable version, feature PRs will be frozen. A [release manager](./release.md/#release-management) will be assigned to each major/minor version and will lead the etcd community in testing, bug fixing, and documentation of the release for one to two weeks.
### Stable branches
All branches with prefix `release-` are considered _stable_ branches.
After every minor release ([semver.org](https://semver.org/)), we will have a new stable branch for that release, managed by a [patch release manager](./release.md/#release-management). We will keep fixing backwards-compatible bugs for the latest two stable releases. A _patch_ release for each supported release branch, incorporating any bug fixes, will be made once every two weeks, provided there are patches to release.
[main]: https://github.com/etcd-io/etcd/tree/main


@ -1,168 +0,0 @@
# Community membership
This doc outlines the various responsibilities of contributor roles in etcd.
| Role | Responsibilities | Requirements | Defined by |
|------------|----------------------------------------------|---------------------------------------------------------------|--------------------------------------|
| Member | Active contributor in the community | Sponsored by 2 reviewers and multiple contributions | etcd GitHub org member |
| Reviewer | Review contributions from other members | History of review and authorship | [MAINTAINERS] file reviewer entry |
| Maintainer | Set direction and priorities for the project | Demonstrated responsibility and excellent technical judgement | [MAINTAINERS] file maintainers entry |
## New contributors
New contributors should be welcomed to the community by existing members,
helped with PR workflow, and directed to relevant documentation and
communication channels.
## Established community members
Established community members are expected to demonstrate their adherence to the
principles in this document, familiarity with project organization, roles,
policies, procedures, conventions, etc., and technical and/or writing ability.
Role-specific expectations, responsibilities, and requirements are enumerated
below.
## Member
Members are continuously active contributors in the community. They can have
issues and PRs assigned to them. Members are expected to remain active
contributors to the community.
**Defined by:** Member of the etcd GitHub organization.
### Requirements
- Enabled [two-factor authentication] on their GitHub account
- Have made multiple contributions to the project or community. Contribution may include, but is not limited to:
    - Authoring or reviewing PRs on GitHub. At least one PR must be **merged**.
    - Filing or commenting on issues on GitHub
    - Contributing to community discussions (e.g. meetings, Slack, email discussion
      forums, Stack Overflow)
- Subscribed to [etcd-dev@googlegroups.com]
- Have read the [contributor guide]
- Sponsored by one active maintainer or two reviewers.
- Sponsors must be from multiple member companies to demonstrate integration across community.
- With no objections from other maintainers
- Open a [membership nomination] issue against the etcd-io/etcd repo
- Ensure your sponsors are @mentioned on the issue
- Make sure that the list of contributions included is representative of your work on the project.
- Members can be removed by a supermajority of the maintainers or can resign by notifying
the maintainers.
### Responsibilities and privileges
- Responsive to issues and PRs assigned to them
- Granted "triage access" to etcd project
- Active owner of code they have contributed (unless ownership is explicitly transferred)
- Code is well tested
- Tests consistently pass
- Addresses bugs or issues discovered after code is accepted
**Note:** members who frequently contribute code are expected to proactively
perform code reviews and work towards becoming a *reviewer*.
## Reviewers
Reviewers are contributors who have demonstrated greater skill in
reviewing the code from other contributors. They are knowledgeable about both
the codebase and software engineering principles. Their LGTM counts towards
merging a code change into the project. A reviewer is generally on the ladder towards
maintainership.
**Defined by:** *reviewers* entry in the [MAINTAINERS] file.
### Requirements
- Member for at least 3 months.
- Primary reviewer for at least 5 PRs to the codebase.
- Reviewed or contributed at least 20 substantial PRs to the codebase.
- Knowledgeable about the codebase.
- Sponsored by two active maintainers.
- Sponsors must be from multiple member companies to demonstrate integration across community.
- With no objections from other maintainers
- Reviewers can be removed by a supermajority of the maintainers or can resign by notifying
the maintainers.
### Responsibilities and privileges
- Code reviewer status may be a precondition to accepting large code contributions
- Responsible for project quality control via code reviews
- Focus on code quality and correctness, including testing and factoring
- May also review for more holistic issues, but not a requirement
- Expected to be responsive to review requests
- Assigned PRs to review related to area of expertise
- Assigned test bugs related to area of expertise
- Granted "triage access" to etcd project
## Maintainers
Maintainers are first and foremost contributors that have shown they
are committed to the long term success of a project. Maintainership is about building
trust with the current maintainers and being a person that they can
depend on to make decisions in the best interest of the project in a consistent manner.
**Defined by:** *maintainers* entry in the [MAINTAINERS] file.
### Requirements
- Deep understanding of the technical goals and direction of the project
- Deep understanding of the technical domain of the project
- Sustained contributions to design and direction by doing all of:
    - Authoring and reviewing proposals
    - Initiating, contributing to, and resolving discussions (emails, GitHub issues, meetings)
    - Identifying subtle or complex issues in designs and implementation PRs
- Directly contributed to the project through implementation and / or review
- Sponsored by two active maintainers and elected by supermajority
- Sponsors must be from multiple member companies to demonstrate integration across community.
- To become a maintainer send an email with your candidacy to [etcd-maintainers-private@googlegroups.com]
- Ensure your sponsors are @mentioned on the email
- Include a list of contributions representative of your work on the project.
- Existing maintainers will vote privately and respond to the email with either acceptance or feedback for suggested improvement.
- With your membership approved you are expected to:
- Open a PR and add an entry to the [MAINTAINERS] file
- Subscribe to [etcd-maintainers@googlegroups.com] and [etcd-maintainers-private@googlegroups.com]
- Request to join the [etcd-maintainer teams of etcd organization of GitHub](https://github.com/orgs/etcd-io/teams/maintainers-etcd)
- Request to join the private Slack channel for etcd maintainers on [Kubernetes Slack](http://slack.kubernetes.io/)
- Request access to etcd-development GCP project where we publish releases
- Request access to passwords shared between maintainers
### Responsibilities and privileges
- Make and approve technical design decisions
- Set technical direction and priorities
- Define milestones and releases
- Mentor and guide reviewers and contributors to the project.
- Participate when called upon in the [security disclosure and release process]
- Ensure continued health of the project
- Adequate test coverage to confidently release
- Tests are passing reliably (i.e. not flaky) and are fixed when they fail
- Ensure a healthy process for discussion and decision making is in place.
- Work with other maintainers to maintain the project's overall health and success holistically
### Retiring
Life priorities, interests, and passions can change. Maintainers can retire and
move to [emeritus maintainers]. If a maintainer needs to step down, they should
inform other maintainers and, if possible, help find someone to pick up the related
work. At the very least, they should ensure the related work can be continued. Afterward
they can remove themselves from the list of existing maintainers.
If a maintainer has not been performing their duties for a period of 12 months,
they can be removed by other maintainers. In that case the inactive maintainer will
first be notified via email. If the situation doesn't improve, they will be
removed. If an emeritus maintainer wants to regain an active role, they can do
so by renewing their contributions. Active maintainers should welcome such a move.
Retiring of other maintainers or regaining the status should require approval
of at least two active maintainers.
## Acknowledgements
Contributor roles and responsibilities were written based on [Kubernetes community membership].
[MAINTAINERS]: /MAINTAINERS
[contributor guide]: /CONTRIBUTING.md
[membership nomination]:https://github.com/etcd-io/etcd/issues/new?assignees=&labels=area%2Fcommunity&template=membership-request.yml
[Kubernetes community membership]: https://github.com/kubernetes/community/blob/master/community-membership.md
[emeritus maintainers]: /README.md#etcd-emeritus-maintainers
[security disclosure and release process]: /security/README.md
[two-factor authentication]: https://docs.github.com/en/authentication/securing-your-account-with-two-factor-authentication-2fa/about-two-factor-authentication


@ -1,102 +0,0 @@
Dependency management
======
# Table of Contents
- **[Main branch](#main-branch)**
- [Dependencies used in workflows](#dependencies-used-in-workflows)
- [Bumping order](#bumping-order)
- [Steps to bump a dependency](#steps-to-bump-a-dependency)
- [Indirect dependencies](#indirect-dependencies)
- [About gRPC](#about-grpc)
- [Rotation worksheet](#rotation-worksheet)
- **[Stable branches](#stable-branches)**
# Main branch
Dependabot is enabled and [configured](https://github.com/etcd-io/etcd/blob/main/.github/dependabot.yml) to
manage dependencies for the etcd `main` branch. However, dependabot doesn't work well for multi-module repositories like `etcd`;
see [dependabot-core/issues/6678](https://github.com/dependabot/dependabot-core/issues/6678).
Human intervention is usually required each time dependabot automatically opens PRs to bump dependencies.
Please see the guidance below.
## Dependencies used in workflows
The PRs which automatically bump dependencies (see examples below) used in workflows are fine, and can be approved & merged directly as long as all checks are successful.
- [build(deps): bump github/codeql-action from 2.2.11 to 2.2.12](https://github.com/etcd-io/etcd/pull/15736)
- [build(deps): bump actions/checkout from 3.5.0 to 3.5.2](https://github.com/etcd-io/etcd/pull/15735)
- [build(deps): bump ossf/scorecard-action from 2.1.2 to 2.1.3](https://github.com/etcd-io/etcd/pull/15607)
## Bumping order
When multiple etcd modules depend on the same package, please bump the package version for all the modules in the correct order. The rule is simple:
if module A depends on module B, then bump the dependency for module B before module A. If the two modules do not depend on each other, then
it doesn't matter which module is bumped first. For example, multiple modules depend on `github.com/spf13/cobra`, so we need to bump the dependency
in the following order:
- go.etcd.io/etcd/pkg/v3
- go.etcd.io/etcd/server/v3
- go.etcd.io/etcd/etcdctl/v3
- go.etcd.io/etcd/etcdutl/v3
- go.etcd.io/etcd/tests/v3
- go.etcd.io/etcd/v3
- go.etcd.io/etcd/tools/v3
For more details about etcd Golang modules, please check https://etcd.io/docs/next/dev-internal/modules/
Note that the module `go.etcd.io/etcd/tools/v3` doesn't depend on any other modules, nor is it depended on by any other modules, so it doesn't matter when its dependencies are bumped.
## Steps to bump a dependency
Using `github.com/spf13/cobra` as an example, follow the steps below to bump it from 1.6.1 to 1.7.0 for the module `go.etcd.io/etcd/etcdctl/v3`:
```
$ cd ${ETCD_ROOT_DIR}/etcdctl
$ go get github.com/spf13/cobra@v1.7.0
$ go mod tidy
$ cd ..
$ ./scripts/fix.sh
```
Execute the same steps for all other modules. When you have finished bumping the dependency for all modules, commit the change:
```
$ git add .
$ git commit --signoff -m "dependency: bump github.com/spf13/cobra from 1.6.1 to 1.7.0"
```
Please close the related PRs which were automatically opened by dependabot.
When you bump multiple dependencies in one PR, it's recommended to create a separate commit for each dependency. This isn't mandatory, however; for example,
you can include all dependency bumps for the module `go.etcd.io/etcd/tools/v3` in one commit.
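The per-module steps can be sketched as a dry-run helper that prints the commands for every module in the bumping order listed earlier. The function `bump_plan` is hypothetical and not part of the etcd scripts; it prints commands instead of running them, and it leaves the directory of each module for you to fill in.

```shell
# bump_plan is a hypothetical dry-run helper; it only prints commands.
bump_plan() {
  dep="$1"      # dependency path, e.g. github.com/spf13/cobra
  version="$2"  # target version, e.g. v1.7.0
  # Modules in the bumping order listed earlier in this document.
  for module in \
    go.etcd.io/etcd/pkg/v3 \
    go.etcd.io/etcd/server/v3 \
    go.etcd.io/etcd/etcdctl/v3 \
    go.etcd.io/etcd/etcdutl/v3 \
    go.etcd.io/etcd/tests/v3 \
    go.etcd.io/etcd/v3 \
    go.etcd.io/etcd/tools/v3
  do
    echo "# module ${module} (run in that module's directory)"
    echo "go get ${dep}@${version}"
    echo "go mod tidy"
    echo "./scripts/fix.sh  # run from the repository root"
  done
}

bump_plan github.com/spf13/cobra v1.7.0
```

Printing the plan first makes it easy to review the order before touching any `go.mod` file.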
## Indirect dependencies
Usually we don't bump a dependency if all modules just indirectly depend on it, such as `github.com/go-logr/logr`.
If an indirect dependency (e.g. `D1`) causes any CVE or bugs which affect etcd, usually the module (e.g. `M1`, not part of etcd, but used by etcd)
which depends on it should bump the dependency (`D1`), and then etcd just needs to bump `M1`. However, if the module (`M1`) somehow doesn't
bump the problematic dependency, then etcd can still bump it (`D1`) directly following the same steps above. But as a long-term solution, etcd should
try to remove the dependency on any such module (`M1`) that lacks maintenance.
For mixed cases, in which some modules depend on a dependency directly and others indirectly, we have multiple options:
- Bump the dependency for all modules, whether it's a direct or indirect dependency.
- Bump the dependency only for modules which directly depend on it.
We should try to follow the first way, and temporarily fall back to the second one if we run into any issue on the first way. Eventually we
should fix the issue and ensure all modules depend on the same version of the dependency.
## Known incompatible dependency updates
### arduino/setup-protoc
Please refer to [build(deps): bump arduino/setup-protoc from 1.3.0 to 2.0.0](https://github.com/etcd-io/etcd/pull/16016)
### About gRPC
There is a compatibility issue between etcd and gRPC 1.52.0, and there is a pending PR [pull/15131](https://github.com/etcd-io/etcd/pull/15131).
The plan is to first remove the dependency on some of grpc-go's experimental APIs, and then try to bump it again. See
[issues/15145](https://github.com/etcd-io/etcd/issues/15145) for more details.
The `go.opentelemetry.io/otel` version update is indirectly blocked by this gRPC issue. See [pull/15810](https://github.com/etcd-io/etcd/pull/15810) for more details.
## Rotation worksheet
The dependabot scheduling interval is weekly; it means dependabot will automatically raise a bunch of PRs per week.
Usually human intervention is required each time. We have a [rotation worksheet](https://docs.google.com/spreadsheets/d/1DDWzbcOx1p32MhyelaPZ_SfYtAD6xRsrtGRZ9QXPOyQ/edit#gid=0),
and everyone is welcome to participate; you just need to register your name in the worksheet.
# Stable branches
Usually we don't proactively bump dependencies for stable releases unless there are any CVEs or bugs that affect etcd.
If we have to do it, then follow the same guidance above. Note that there is no `./scripts/fix.sh` in release-3.4, so no need to
execute it for 3.4.


@ -1,83 +0,0 @@
# Features
This document provides an overview of etcd features and general development guidelines for adding and deprecating them. The project maintainers can override these guidelines per the need of the project following the project governance.
## Overview
etcd features fall into three stages: experimental, stable, and unsafe.
### Experimental
Any new feature is usually added as an experimental feature. An experimental feature is characterized as below:
- Might be buggy due to a lack of user testing. Enabling the feature may not work as expected.
- Disabled by default when added initially.
- Support for such a feature may be dropped at any time without notice.
- Feature related issues may be given lower priorities.
- It can be removed in the next minor or major release without following the feature deprecation policy unless it graduates to a stable feature.
### Stable
A stable feature is characterized as below:
- Supported as part of the supported releases of etcd.
- May be enabled by default.
- Discontinuation of support must follow the feature deprecation policy.
### Unsafe
Unsafe features are rare and listed under the `Unsafe feature:` section in the etcd usage documentation. By default, they are disabled. They should be used with caution following documentation. An unsafe feature can be removed in the next minor or major release without following feature deprecation policy.
## Development Guidelines
### Adding a new feature
Any new enhancements to etcd are typically added as experimental features. The general development requirements are listed below. They can be somewhat flexible depending on the scope of the feature and review discussions, and will evolve over time.
- Open an issue
- It must provide a clear need for the proposed feature.
- It should list development work items as checkboxes. There must be one work item towards future graduation to a stable feature.
- Label the issue with `type/feature` and `experimental`.
- Keep the issue open for tracking purposes until a decision is made on graduation.
- Open a Pull Request (PR)
- Provide unit tests. Integration tests are also recommended where possible.
- Provide robust e2e test coverage. If the feature being added is complicated or quickly needed, maintainers can decide to go with e2e tests for basic coverage initially and have robust coverage added at a later time, before the feature graduates to stable.
- Provide logs for proper debugging.
- Provide metrics and benchmarks as needed.
- The feature should be disabled by default.
- Any configuration flags related to the implementation of the feature must be prefixed with `experimental` e.g. `--experimental-feature-name`.
- Add a CHANGELOG entry.
- At least two maintainers must approve feature requirements and related code changes.
### Graduating an Experimental feature to Stable
It is important that experimental features don't get stuck in that stage. They should be revisited and moved to the stable stage following the graduation steps as described here.
#### Locate graduation candidate
Decide if an experimental feature is ready for graduation to the stable stage.
- Find the issue that was used to enable the experimental feature initially. One way to find such issues is to search for issues with `type/feature` and `experimental` labels.
- Fix any known open issues against the feature.
- Make sure the feature was enabled for at least one previous release. Check the PR(s) reference from the issue to see when the feature related code changes were merged.
#### Provide implementation
If an experimental feature is found ready for graduation to the stable stage, open a Pull Request (PR) with the following changes.
- Add robust e2e tests if not already provided.
- Add a new stable feature flag identical to the experimental feature flag but without the `--experimental` prefix.
- Deprecate the experimental feature following the [feature deprecation policy](#Deprecating-a-feature).
- Implementation must ensure that both the graduated and deprecated experimental feature flags work as expected. Note that both these flags will co-exist for the timeframe described in the feature deprecation policy.
- Enable the graduated feature by default if needed.
At least two maintainers must approve the work. Patch releases should not be considered for graduation.
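As a minimal sketch of the flag-naming rule above, the stable flag name can be derived by dropping the `experimental` prefix. The helper name `stable_flag_name` is hypothetical and not part of etcd; `--experimental-feature-name` is the placeholder used in this document.

```shell
# Sketch only: derive the stable flag name from an experimental flag name.
# stable_flag_name is a hypothetical helper, not part of etcd.
stable_flag_name() {
  case "$1" in
    --experimental-*)
      # Strip the "--experimental-" prefix and re-add the leading dashes.
      printf '%s\n' "--${1#--experimental-}"
      ;;
    *)
      echo "not an experimental flag: $1" >&2
      return 1
      ;;
  esac
}

stable_flag_name --experimental-feature-name   # prints --feature-name
```

Both names are expected to keep working side by side for the period described in the deprecation policy.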
### Deprecating a feature
#### Experimental
An experimental feature deprecates when it graduates to the stable stage.
- Add a deprecation message in the documentation of the experimental feature with a recommendation to use the related stable feature, e.g. `DEPRECATED. Use <feature-name> instead.`
- Add a `deprecated` label in the issue that was initially used to enable the experimental feature.
#### Stable
As the project evolves, a stable feature may sometimes need to be deprecated and removed. Such a situation should be handled using the steps below:
- Create an issue for tracking purpose.
- Add a deprecation message in the feature usage documentation before a planned release for feature deprecation. e.g. `To be deprecated in <release>.`. If a new feature replaces the `To be deprecated` feature, then also provide a message saying so. e.g. `Use <feature-name> instead.`.
- Deprecate the feature in the planned release with a message as part of the feature usage documentation. e.g. `DEPRECATED`. If a new feature replaces the deprecated feature, then also provide a message saying so. e.g. `DEPRECATED. Use <feature-name> instead.`.
- Add a `deprecated` label in the related issue.
Remove the deprecated feature in the following release. Close any related issue(s). At least two maintainers must approve the work. Patch releases should not be considered for deprecation.

View File

@ -1,150 +0,0 @@
# Set up local cluster
For testing and development deployments, the quickest and easiest way is to configure a local cluster. For a production deployment, refer to the [clustering][clustering] section.
## Local standalone cluster
### Starting a cluster
Run the following to deploy an etcd cluster as a standalone cluster:
```
$ ./etcd
...
```
If the `etcd` binary is not present in the current working directory, it might be located either at `$GOPATH/bin/etcd` or at `/usr/local/bin/etcd`. Run the command appropriately.
The running etcd member listens on `localhost:2379` for client requests.
### Interacting with the cluster
Use `etcdctl` to interact with the running cluster:
1. Store an example key-value pair in the cluster:
```
$ ./etcdctl put foo bar
OK
```
If `OK` is printed, the key-value pair was stored successfully.
2. Retrieve the value of `foo`:
```
$ ./etcdctl get foo
bar
```
If `bar` is returned, interaction with the etcd cluster is working as expected.
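The two steps above can be combined into a small smoke test. This is a sketch under assumptions: `roundtrip` is a hypothetical helper, and `ETCDCTL` is an assumed variable that names the `etcdctl` binary to run (defaulting to `./etcdctl` as in the commands above). It uses the `--print-value-only` option of `etcdctl get` so that only the value is compared.

```shell
# Sketch only: write a key, read it back, and compare the result.
# ETCDCTL is an assumed variable naming the etcdctl binary to run.
roundtrip() {
  key=$1
  value=$2
  "${ETCDCTL:-./etcdctl}" put "$key" "$value" >/dev/null || return 1
  got=$("${ETCDCTL:-./etcdctl}" get --print-value-only "$key") || return 1
  # Succeed only if the value read back matches the value written.
  [ "$got" = "$value" ]
}

# Usage against a running local member:
#   roundtrip foo bar && echo "round trip OK"
```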
## Local multi-member cluster
### Starting a cluster
A `Procfile` at the base of the etcd git repository is provided to easily configure a local multi-member cluster. To start a multi-member cluster, navigate to the root of the etcd source tree and perform the following:
1. Install `goreman` to control Procfile-based applications:
```
$ go install github.com/mattn/goreman@latest
```
The installation places executables in `$GOPATH/bin`. If the `$GOPATH` environment variable is not set, the tool is installed into `$HOME/go/bin`. Make sure that `$PATH` is set accordingly in your environment.
2. Start a cluster with `goreman` using etcd's stock Procfile:
```
$ goreman -f Procfile start
```
The members start running. They listen on `localhost:2379`, `localhost:22379`, and `localhost:32379` respectively for client requests.
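To address all three members at once, the client endpoints can be joined into one comma-separated value for `--endpoints`. A minimal sketch, where `join_endpoints` is a hypothetical helper and the addresses are the ones listed above:

```shell
# Sketch only: join endpoints into the comma-separated form that
# etcdctl's --endpoints flag accepts. join_endpoints is a hypothetical helper.
join_endpoints() {
  out=""
  for ep in "$@"; do
    # Add a comma only between entries, not before the first one.
    out="${out:+$out,}$ep"
  done
  printf '%s\n' "$out"
}

ENDPOINTS=$(join_endpoints localhost:2379 localhost:22379 localhost:32379)
echo "$ENDPOINTS"

# Usage against the running cluster:
#   etcdctl --endpoints="$ENDPOINTS" endpoint health
```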
### Interacting with the cluster
Use `etcdctl` to interact with the running cluster:
1. Print the list of members:
```
$ etcdctl --write-out=table --endpoints=localhost:2379 member list
```
The list of etcd members is displayed as follows:
```
+------------------+---------+--------+------------------------+------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+--------+------------------------+------------------------+
| 8211f1d0f64f3269 | started | infra1 | http://127.0.0.1:2380 | http://127.0.0.1:2379 |
| 91bc3c398fb3c146 | started | infra2 | http://127.0.0.1:22380 | http://127.0.0.1:22379 |
| fd422379fda50e48 | started | infra3 | http://127.0.0.1:32380 | http://127.0.0.1:32379 |
+------------------+---------+--------+------------------------+------------------------+
```
2. Store an example key-value pair in the cluster:
```
$ etcdctl put foo bar
OK
```
If `OK` is printed, the key-value pair was stored successfully.
### Testing fault tolerance
To exercise etcd's fault tolerance, kill a member and attempt to retrieve the key.
1. Identify the process name of the member to be stopped.
The `Procfile` lists the properties of the multi-member cluster. For example, consider the member with the process name, `etcd2`.
2. Stop the member:
```
# kill etcd2
$ goreman run stop etcd2
```
3. Store a key:
```
$ etcdctl put key hello
OK
```
4. Retrieve the key that is stored in the previous step:
```
$ etcdctl get key
hello
```
5. Retrieve a key from the stopped member:
```
$ etcdctl --endpoints=localhost:22379 get key
```
The command should display an error caused by connection failure:
```
2017/06/18 23:07:35 grpc: Conn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp 127.0.0.1:22379: getsockopt: connection refused"; Reconnecting to "localhost:22379"
Error: grpc: timed out trying to connect
```
6. Restart the stopped member:
```
$ goreman run restart etcd2
```
7. Get the key from the restarted member:
```
$ etcdctl --endpoints=localhost:22379 get key
hello
```
Restarting the member re-establishes the connection. `etcdctl` will now be able to retrieve the key successfully. To learn more about interacting with etcd, read [interacting with etcd section][interacting].
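The write in step 3 succeeds with one member stopped because a majority of the three members is still available. A minimal sketch of that arithmetic, with hypothetical helper names:

```shell
# Sketch only: majority arithmetic for a cluster of N members.
# quorum and fault_tolerance are hypothetical helper names.
quorum() {
  # Smallest number of members that forms a majority.
  echo $(( $1 / 2 + 1 ))
}

fault_tolerance() {
  # Number of members that can fail while a majority remains.
  echo $(( ($1 - 1) / 2 ))
}

quorum 3            # prints 2
fault_tolerance 3   # prints 1
```

With the three-member cluster from the `Procfile`, stopping `etcd2` leaves two members running, which still meets the quorum of two.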
[clustering]: https://etcd.io/docs/latest/op-guide/clustering/
[interacting]: https://etcd.io/docs/latest/dev-guide/interacting_v3/

View File

@ -1,33 +0,0 @@
# Logging Conventions
etcd uses the [zap][zap] library for logging application output categorized into *levels*. A log message's level is determined according to these conventions:
* Debug: Everything is still fine, but even common operations may be logged, producing a larger volume of less helpful notices. Usually not used in production.
  * Examples:
    * Send a normal message to a remote peer
    * Write a log entry to disk
* Info: Normal, working log information. Everything is fine, but these are helpful notices for auditing or common operations. Should not be logged more often than once every few seconds during normal server operation.
  * Examples:
    * Startup configuration
    * Start to do snapshot
* Warning: (Hopefully) Temporary conditions that may cause errors, but may work fine. A replica disappearing (that may reconnect) is a warning.
  * Examples:
    * Failure to send raft message to a remote peer
    * Failure to receive heartbeat message within the configured election timeout
* Error: Data has been lost, a request has failed for a bad reason, or a required resource has been lost.
  * Examples:
    * Failure to allocate disk space for WAL
* Panic: Unrecoverable or unexpected error situation that requires stopping execution.
  * Examples:
    * Failure to create the database
* Fatal: Unrecoverable or unexpected error situation that requires immediate exit. Mostly used in tests.
  * Examples:
    * Failure to find the data directory
    * Failure to run a test function
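When checking whether a level is used too often, it can help to count log lines per level. The following is a sketch under assumptions: it assumes JSON-encoded log lines that carry a `level` field with lowercase names such as `warn`, and `count_level` is a hypothetical helper.

```shell
# Sketch only: count log lines at a given level.
# Assumes JSON-encoded log lines with a "level" field, e.g. "level":"warn".
# count_level is a hypothetical helper, not part of etcd.
count_level() {
  level=$1
  file=$2
  # grep -c prints the number of matching lines (0 if none match).
  grep -c "\"level\":\"$level\"" "$file"
}

# Usage with a hypothetical log file name:
#   count_level warn etcd.log
```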
[zap]: https://github.com/uber-go/zap

File diff suppressed because one or more lines are too long

View File

@ -1,91 +0,0 @@
# Golang modules
The etcd project (since version 3.5) is organized into multiple
[golang modules](https://golang.org/ref/mod) hosted in a [single repository](https://golang.org/ref/mod#vcs-dir).
![modules graph](modules.svg)
There are the following modules:
- **go.etcd.io/etcd/api/v3** - contains API definitions
(like protos & proto-generated libraries) that define the communication protocol
between etcd clients and server.
- **go.etcd.io/etcd/pkg/v3** - collection of utility packages used by etcd
without being specific to etcd itself. A package belongs here
only if it could possibly be moved out into its own repository in the future.
Please avoid adding code here that has a lot of dependencies of its own, as
they automatically become dependencies of the client library
(that we want to keep lightweight).
- **go.etcd.io/etcd/client/v3** - client library used to contact etcd over
the network (grpc). Recommended for all new usage of etcd.
- **go.etcd.io/raft/v3** - implementation of distributed consensus
protocol. Should have no etcd specific code. Hosted in a separate repository:
https://github.com/etcd-io/raft.
- **go.etcd.io/etcd/server/v3** - etcd implementation.
The code in this package is etcd internal and should not be consumed
by external projects. The package layout and API can change within the minor versions.
- **go.etcd.io/etcd/etcdctl/v3** - a command line tool to access and manage etcd.
- **go.etcd.io/etcd/tests/v3** - a module that contains all integration tests of etcd.
Notice: All unit tests (fast and not requiring cross-module dependencies)
should be kept in the same module as the code under test.
- **go.etcd.io/bbolt** - implementation of persistent b-tree.
Hosted in a separate repository: https://github.com/etcd-io/bbolt.
### Operations
1. All etcd modules should be released in the same versions, e.g.
`go.etcd.io/etcd/client/v3@v3.5.10` must depend on `go.etcd.io/etcd/api/v3@v3.5.10`.
The consistent updating of versions can be performed using:
```shell script
% DRY_RUN=false TARGET_VERSION="v3.5.10" ./scripts/release_mod.sh update_versions
```
2. The released modules should be tagged according to https://golang.org/ref/mod#vcs-version rules,
i.e. each module should get its own tag.
The tagging can be performed using:
```shell script
% DRY_RUN=false REMOTE_REPO="origin" ./scripts/release_mod.sh push_mod_tags
```
3. All etcd modules should depend on the same versions of underlying dependencies.
This can be verified using:
```shell script
% PASSES="dep" ./test.sh
```
4. The go.mod files must not contain dependencies not being used and must
conform to `go mod tidy` format.
This is being verified by:
```
% PASSES="mod_tidy" ./test.sh
```
5. To trigger actions across all modules (e.g. auto-format all files), please
use/expand the following script:
```shell script
% ./scripts/fix.sh
```
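For item 2 above, the Go module reference prefixes the version tag of a module in a subdirectory with that subdirectory, so each module gets its own tag. A minimal sketch of that mapping, assuming module paths of the form listed earlier; `module_tag` is a hypothetical helper and is not part of `release_mod.sh`:

```shell
# Sketch only: compute the VCS tag for a module at a given version.
# module_tag is a hypothetical helper, not part of release_mod.sh.
module_tag() {
  path=${1#go.etcd.io/etcd}   # e.g. "/api/v3", or "/v3" for the root module
  path=${path%/v[0-9]*}       # drop the major-version suffix
  path=${path#/}              # drop the leading slash
  if [ -n "$path" ]; then
    printf '%s/%s\n' "$path" "$2"
  else
    printf '%s\n' "$2"
  fi
}

module_tag go.etcd.io/etcd/api/v3 v3.5.10      # prints api/v3.5.10
module_tag go.etcd.io/etcd/client/v3 v3.5.10   # prints client/v3.5.10
```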
### Future
As a North Star, we would like to evolve the etcd modules towards the following model:
![modules graph](modules-future.svg)
This assumes:
- Splitting etcdmigrate/etcdadm out of etcdctl binary.
Thanks to this etcdctl would become clearly a command-line wrapper
around network client API,
while etcdmigrate/etcdadm would support direct physical operations on the
etcd storage files.
- Splitting etcd-proxy out of ./etcd binary, as it contains more experimental code
so carries additional risk & dependencies.
- Deprecation of support for v2 protocol.

File diff suppressed because one or more lines are too long

View File

@ -1,75 +0,0 @@
# Release
The guide talks about how to release a new version of etcd.
The procedure includes some manual steps for sanity checking, but it can probably be further scripted. Please keep this document up-to-date if making changes to the release process.
## Release management
etcd community members are assigned to manage the release of each etcd major/minor version, as well as to manage
patches to each stable release branch. The managers are responsible for communicating the timelines and status of each
release and for ensuring the stability of the release branch.
| Releases | Manager |
|------------------------|-------------------------------------------------------------|
| 3.4 patch (post 3.4.0) | Benjamin Wang [@ahrtr](https://github.com/ahrtr) |
| 3.5 patch (post 3.5.0) | Marek Siarkowicz [@serathius](https://github.com/serathius) |
All release version numbers follow the format of [semantic versioning 2.0.0](http://semver.org/).
### Major, minor version release, or its pre-release
- Ensure the relevant milestone on GitHub is complete. All referenced issues should be closed, or moved elsewhere.
- Ensure the latest upgrade documentation is available.
- Bump [hardcoded MinClusterVersion in the repository](https://github.com/etcd-io/etcd/blob/v3.4.15/version/version.go#L29), if necessary.
- Add feature capability maps for the new version, if necessary.
### Patch version release
- To request a backport, developers submit cherrypick PRs targeting the release branch. The commits should not include merge commits. The commits should be restricted to bug fixes and security patches.
- The cherrypick PRs should target the appropriate release branch (`base:release-<major>-<minor>`). `hack/patch/cherrypick.sh` may be used to automatically generate cherrypick PRs.
- The release patch manager reviews the cherrypick PRs. Please discuss carefully what is backported to the patch release. Each patch release should be strictly better than its predecessor.
- The release patch manager will cherry-pick these commits, starting from the oldest one, into the stable branch.
## Write release note
- Write an introduction for the new release. For example, which major bugs are fixed, which new features are introduced, or which performance improvements are made.
- Put `[GH XXXX]` at the head of each change line to reference the Pull Request that introduces the change. Moreover, add a link on it to jump to the Pull Request.
- Find PRs with the `release-note` label and explain them in the `NEWS` file, as a straightforward summary of changes for end-users.
## Build and push the release artifacts
- Ensure `docker` is available.
Run release script in root directory:
```
DRY_RUN=false ./scripts/release.sh ${VERSION}
```
It generates all release binaries and images under directory ./release.
Binaries are pushed to gcr.io and images are pushed to quay.io and gcr.io.
## Publish release page in GitHub
- Set release title as the version name.
- Follow the format of previous release pages.
- Attach the generated binaries and signatures.
- Select whether it is a pre-release.
- Publish the release!
## Announce to the etcd-dev Googlegroup
- Follow the format of [previous release emails](https://groups.google.com/forum/#!forum/etcd-dev).
- Make sure to include a list of authors that contributed since the previous release - something like the following might be handy:
```
git log ...${PREV_VERSION} --pretty=format:"%an" | sort | uniq | tr '\n' ',' | sed -e 's#,#, #g' -e 's#, $##'
```
- Send email to etcd-dev@googlegroups.com
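The text-processing half of the command above can be tried on its own without a git history. This sketch wraps the same `sort | uniq | tr | sed` pipeline in a hypothetical function, `format_authors`, and feeds it made-up sample names:

```shell
# Sketch only: the formatting stage of the author-list command,
# wrapped in a hypothetical helper that reads one name per line.
format_authors() {
  # Deduplicate, join with commas, add a space after each comma,
  # then drop the trailing separator.
  sort | uniq | tr '\n' ',' | sed -e 's#,#, #g' -e 's#, $##'
}

# Made-up sample input, not real contributors:
printf 'Bob\nAlice\nBob\n' | format_authors   # prints: Alice, Bob
```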
## Post release
- Create new stable branch through `git push origin ${VERSION_MAJOR}.${VERSION_MINOR}` if this is a major stable release. This assumes `origin` corresponds to "https://github.com/etcd-io/etcd".
- Bump [hardcoded Version in the repository](https://github.com/etcd-io/etcd/blob/v3.4.15/version/version.go#L30) to the version `${VERSION}+git`.
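The values used in the two steps above can be derived from `${VERSION}`. A minimal sketch, assuming `VERSION` looks like `v3.5.10`; both helper names are hypothetical:

```shell
# Sketch only: derive post-release values from a version string.
# Assumes the version looks like v3.5.10; helper names are hypothetical.
version_major_minor() {
  # Drop the leading "v" and keep the first two dot-separated fields.
  printf '%s\n' "${1#v}" | cut -d. -f1,2
}

dev_version() {
  # Append the "+git" suffix used between releases.
  printf '%s+git\n' "$1"
}

version_major_minor v3.5.10   # prints 3.5
dev_version v3.5.10           # prints v3.5.10+git
```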

View File

@ -1,45 +0,0 @@
# Reporting bugs
If any part of the etcd project has bugs or documentation mistakes, please let us know by [opening an issue][etcd-issue]. We treat bugs and mistakes very seriously and believe no issue is too small. Before creating a bug report, please check that an issue reporting the same problem does not already exist.
To make the bug report accurate and easy to understand, please try to create bug reports that are:
- Specific. Include as many details as possible: which version, what environment, what configuration, etc. If the bug is related to running the etcd server, please attach the etcd log (the starting log with etcd configuration is especially important).
- Reproducible. Include the steps to reproduce the problem. We understand some issues might be hard to reproduce; in that case, please include the steps that might lead to the problem. If possible, please attach the affected etcd data dir and stack trace to the bug report.
- Isolated. Please try to isolate and reproduce the bug with minimum dependencies. Fixing a bug is significantly slower if too many dependencies are involved in the bug report. Debugging external systems that rely on etcd is out of scope, but we are happy to provide guidance in the right direction or help with using etcd itself.
- Unique. Do not duplicate an existing bug report.
- Scoped. One bug per report. Do not follow up with another bug inside one report.
It may be worthwhile to read [Elika Etemad's article on filing good bug reports][filing-good-bugs] before creating a bug report.
We might ask for further information to locate a bug. A duplicated bug report will be closed.
## Frequently asked questions
### How to get a stack trace
``` bash
$ kill -QUIT $PID
```
### How to get etcd version
``` bash
$ etcd --version
```
### How to get etcd configuration and log when it runs as systemd service etcd2.service
``` bash
$ sudo systemctl cat etcd2
$ sudo journalctl -u etcd2
```
Due to an upstream systemd bug, journald may miss the last few log lines when its processes exit. If journalctl says etcd stopped without a fatal or panic message, try `sudo journalctl -f -t etcd2` to get the full log.
[etcd-issue]: https://github.com/etcd-io/etcd/issues/new
[filing-good-bugs]: http://fantasai.inkedblade.net/style/talks/filing-good-bugs/

View File

@ -1,180 +0,0 @@
# Issue triage guidelines
## Purpose
Speed up issue management.
The `etcd` issues are listed at <https://github.com/etcd-io/etcd/issues> and are identified with labels. For example, an issue that is identified as a bug will be set to label `type/bug`.
The etcd project uses labels to indicate common attributes such as `area`, `type` and `priority` of incoming issues.
New issues will often start out without any labels, but typically `etcd` maintainers, reviewers and members will add labels by following these triage guidelines. The detailed list of labels can be found at <https://github.com/etcd-io/etcd/labels>.
## Scope
This document serves as the primary guidelines for triaging incoming issues in `etcd`.
All contributors are encouraged and welcome to help manage issues which will help reduce burden on project maintainers, though the work and responsibilities discussed in this document are created with `etcd` project reviewers and members in mind as these individuals will have triage access to the etcd project which is a requirement for actions like applying labels or closing issues.
Refer to [etcd community membership](https://github.com/etcd-io/etcd/blob/main/Documentation/contributor-guide/community-membership.md) for guidance on becoming an etcd project member or reviewer.
## Step 1 - Find an issue to triage
To get started you can use the following recommended issue searches to identify issues that are in need of triage:
* [Issues that have no labels](https://github.com/etcd-io/etcd/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated+no%3Alabel)
* [Issues created recently](https://github.com/etcd-io/etcd/issues?q=is%3Aissue+is%3Aopen+)
* [Issues not assigned but linked pr](https://github.com/etcd-io/etcd/issues?q=is%3Aopen+is%3Aissue+no%3Aassignee+linked%3Apr)
* [Issues with no comments](https://github.com/etcd-io/etcd/issues?q=is%3Aopen+is%3Aissue+comments%3A0+)
* [Issues with help wanted](https://github.com/etcd-io/etcd/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22+)
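The search links above are ordinary GitHub issue queries with the query text URL-encoded. A minimal sketch that builds such a link from a query string; `issue_search_url` is a hypothetical helper, and it only encodes the characters that occur in the queries above (`:`, `"` and space):

```shell
# Sketch only: build an etcd issue search URL from a query string.
# issue_search_url is a hypothetical helper; it encodes only ':', '"'
# and spaces, which is enough for the searches listed above.
issue_search_url() {
  query=$(printf '%s' "$1" | sed -e 's/:/%3A/g' -e 's/"/%22/g' -e 's/ /+/g')
  printf 'https://github.com/etcd-io/etcd/issues?q=%s\n' "$query"
}

issue_search_url 'is:issue is:open sort:updated no:label'
```

The call shown produces the same URL as the "Issues that have no labels" link.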
## Step 2 - Check the issue is valid
Before we start adding labels or trying to work out a priority, our first triage step needs to be working out if the issue actually belongs to the etcd project and is not a duplicate.
### Issues that don't belong to etcd
Sometimes issues are reported that actually belong to other projects that `etcd` uses, for example `grpc` or `golang` issues. Such issues should be addressed by asking the reporter to open an issue in the appropriate project.
These issues can generally be closed unless a maintainer and the issue reporter see a need to keep one open for tracking purposes. If you have triage permissions, please close it; alternatively, mention the @etcd-io/members group to request that a member with triage access close the issue.
### Duplicate issues
If an issue is a duplicate, add a comment stating so along with a reference to the original issue. If you have triage permissions, please close it; alternatively, mention the @etcd-io/members group to request that a member with triage access close the issue.
## Step 3 - Apply the appropriate type label
Adding a `type` label to an issue helps create visibility on the health of the project and helps contributors identify potential priorities, i.e. addressing existing bugs or test flakes before implementing new features.
### Support requests
As a general rule, the focus for etcd support is to address common themes in a broad way that helps all users, i.e. through channels like known issues, frequently asked questions and high quality documentation. To make the best use of project members' time we should avoid providing 1:1 support if a broad approach is available.
Some people mistakenly use our GitHub bug report or feature request templates to file support requests. Usually they are asking for help operating or configuring some aspect of etcd. Support requests for etcd should instead be raised as [discussions](https://github.com/etcd-io/etcd/discussions).
Common types of support requests are:
1. Questions about configuring or operating existing well documented etcd features, for example <https://github.com/etcd-io/etcd/issues/15945>. Note - If an existing feature is not well documented please apply the `area/documentation` label and propose documentation improvements that would prevent future users from stumbling on the problem again.
2. Bug reports or questions about unsupported versions of etcd, for example <https://github.com/etcd-io/etcd/issues/15796>. When responding to these issues please refer to our [supported versions documentation](https://etcd.io/docs/latest/op-guide/versioning) and encourage the reporter to upgrade to a recent patch release of a supported version as soon as possible. We should limit the effort supporting users that do not make the effort to run a supported version of etcd or ensure their version is patched.
3. Bug reports that do not provide a complete list of steps to reproduce the issue and/or that contributors are not able to reproduce, for example <https://github.com/etcd-io/etcd/issues/15740>. We should limit the effort we put into reproducing issues ourselves and motivate users to provide the information necessary to accept the bug report.
4. General questions that are filed using feature request or bug report issue templates, for example <https://github.com/etcd-io/etcd/issues/15914>. Note - These types of requests may surface good additions to our [frequently asked questions](https://etcd.io/docs/v3.5/faq).
If you identify that an issue is a support request please:
1. Add the `type/support` or `type/question` label.
2. Add the following comment to inform the issue creator that discussions should be used instead and that this issue will be converted to a discussion.
> Thank you for your question, this support issue will be moved to our [Discussion Forums](https://github.com/etcd-io/etcd/discussions).
>
> We are trying to consolidate the channels to which questions for help/support are posted so that we can improve our efficiency in responding to your requests, and to make it easier for you to find answers to frequently asked questions and how to address common use cases.
>
> We regularly see messages posted in multiple forums, with the full response thread only in one place or, worse, spread across multiple forums. Also, the large volume of support issues on GitHub is making it difficult for us to use issues to identify real bugs.
>
> Members of the etcd community use Discussion Forums to field support requests. Before posting a new question, please search these for answers to similar questions, and also familiarize yourself with:
>
> 1. [user documentation](https://etcd.io/docs/latest)
> 2. [frequently asked questions](https://etcd.io/docs/v3.5/faq)
>
> Again, thanks for using etcd and raising this question.
>
> The etcd team
3. Finally, click `Convert to discussion` on the right hand panel, selecting the appropriate discussion category.
### Bug reports
If an issue has been raised as a bug it should already have the `type/bug` label, however if this is missing for an issue you determine to be a bug please add the label manually.
The next step is to validate whether the issue is indeed a bug. If not, add a comment with your findings and close the issue if it is trivial. For a non-trivial issue, wait to hear back from the issue reporter and see if there is any objection. If the issue reporter does not reply in 30 days, close the issue.
If the problem cannot be reproduced or requires more information, leave a comment for the issue reporter as soon as possible, while the issue is still fresh for them.
### Feature requests
New feature requests should be created via the etcd feature request template and in theory already have the `type/feature` label, however if this is missing for an issue you determine to be a feature please add the label manually.
### Test flakes
Test flakes are a specific type of bug that the etcd project tracks separately as these are a priority to address. These should be created via the test flake template and in theory already have the `type/flake` label, however if this is missing for an issue you determine to be related to a flaking test please add the label manually.
## Step 4 - Define the areas impacted
Adding an `area` label to an issue helps create visibility on which areas of the etcd project require attention and helps contributors find issues to work on relating to their particular skills or knowledge of the etcd codebase.
If an issue crosses multiple domains please add additional `area` labels to reflect that.
Below is a brief summary of the area labels in active use by the etcd project along with any notes on their use:
| Label | Notes |
| --- | --- |
| area/external | Tracking label for issues raised that are external to etcd. |
| area/community | |
| area/raft | |
| area/clientv3 | |
| area/performance | |
| area/security | |
| area/tls | |
| area/auth | |
| area/etcdctl | |
| area/etcdutl | |
| area/contrib | Not to be confused with `area/community` this label is specifically used for issues relating to community maintained scripts or files in the `contrib/` directory which aren't part of the core etcd project. |
| area/documentation | |
| area/tooling | Generally used in relation to the third party / external utilities or tools that are used in various stages of the etcd build, test or release process, for example tooling to create sboms. |
| area/testing | |
| area/robustness-testing | |
## Step 5 - Prioritise the issue
Placeholder.
## Step 6 - Support new contributors
As part of the `etcd` triage process, once the `type` and `area` have been determined, please consider if the issue would be suitable for a less experienced contributor. The `good first issue` label is a subset of the `help wanted` label, indicating that members have committed to providing extra assistance for new contributors. All `good first issue` items also have the `help wanted` label.
### Help wanted
Items marked with the `help wanted` label need to ensure that they meet these criteria:
* **Low Barrier to Entry** - It should be easy for new contributors.
* **Clear** - The task is agreed upon and does not require further discussions in the community.
* **Goldilocks priority** - The priority should not be so high that a core contributor should do it, but not so low that it isn't useful enough for a core contributor to spend time reviewing it, answering questions, helping get it into a release, etc.
### Good first issue
Items marked with `good first issue` are intended for first-time contributors. The label indicates that members will keep an eye out for the resulting pull requests and shepherd them through our processes.
New contributors should not be left to find an approver, ping for reviews, decipher test commands, or identify that their build failed due to a flake. It is important to make new contributors feel welcome and valued. We should assure them that they will have an extra level of help with their first contribution.
After a contributor has successfully completed one or two `good first issue` items, they should be ready to move on to `help wanted` items.
* **No Barrier to Entry** - The task is something that a new contributor can tackle without advanced setup or domain knowledge.
* **Solution Explained** - The recommended solution is clearly described in the issue.
* **Gives Examples** - Link to examples of similar implementations so new contributors have a reference guide for their changes.
* **Identifies Relevant Code** - The relevant code and tests to be changed should be linked in the issue.
* **Ready to Test** - There should be existing tests that can be modified, or existing test cases fit to be copied. If the area of code doesn't have tests, before labeling the issue, add a test fixture. This prep often makes a great help wanted task!
## Step 7 - Follow up
Once initial triage has been completed, issues need to be re-evaluated over time to ensure they don't become stale incorrectly.
### Track important issues
If an issue is at risk of being closed by the stale bot in future, but is an important issue for the etcd project, then please apply the `stage/tracked` label and remove any `stale` labels that exist. This will ensure the project does not lose sight of the issue.
### Close incomplete issues
Issues that lack enough information from the issue reporter should be closed if the issue reporter does not provide the information within 30 days. Issues can always be re-opened at a later date if new information is provided.
### Check for incomplete work
If an issue owned by a developer has no pull request created in 30 days, contact the issue owner and kindly ask about the status of their work, or to release ownership on the issue if needed.

View File

@ -1,28 +0,0 @@
# PR management
## Purpose
Speed up PR management.
The `etcd` PRs are listed at https://github.com/etcd-io/etcd/pulls
A PR can have various labels, a milestone, reviewers, etc. The detailed list of labels can be found at
https://github.com/kubernetes/kubernetes/labels
The following are a few example PR searches for convenience:
* [Open PRs for milestone etcd-v3.6](https://github.com/etcd-io/etcd/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+milestone%3Aetcd-v3.6)
* [PRs under investigation](https://github.com/etcd-io/etcd/labels/Investigating)
## Scope
These guidelines serve as a primary document for managing PRs in `etcd`. Everyone is welcome to help manage PRs, but the work and responsibilities discussed in this document are created with `etcd` maintainers and active contributors in mind.
## Handle inactive PRs
Poke the PR owner if review comments are not addressed in 15 days. If the PR owner does not reply in 90 days, update the PR with a new commit if possible. If not, the inactive PR should be closed after 180 days.
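The thresholds above can be written down as a small decision helper. This is a sketch under assumptions: `pr_action` and its output labels are hypothetical, and it treats each threshold as "this many days or more of inactivity".

```shell
# Sketch only: map days of PR inactivity to the suggested action.
# pr_action and its output labels are hypothetical; thresholds are
# read as "this many days or more".
pr_action() {
  days=$1
  if [ "$days" -ge 180 ]; then
    echo close
  elif [ "$days" -ge 90 ]; then
    echo update-pr     # add a new commit to the PR if possible
  elif [ "$days" -ge 15 ]; then
    echo poke-owner
  else
    echo wait
  fi
}

pr_action 20    # prints poke-owner
pr_action 200   # prints close
```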
## Poke reviewer if needed
Reviewers usually respond in a timely fashion, but everyone is busy, so give them some time after requesting a review. If no response is provided in 10 days, feel free to contact them by adding a comment in the PR or by sending an email or a message on Slack.
## Verify important labels are in place
Make sure that appropriate reviewers are added to the PR. Also, make sure that a milestone is identified. If any of these or other important labels are missing, add them. If a correct label cannot be decided, leave a comment for the maintainers to do so as needed.

File diff suppressed because it is too large Load Diff

View File

@ -1,430 +0,0 @@
{
"swagger": "2.0",
"info": {
"title": "server/etcdserver/api/v3election/v3electionpb/v3election.proto",
"version": "version not set"
},
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"paths": {
"/v3/election/campaign": {
"post": {
"summary": "Campaign waits to acquire leadership in an election, returning a LeaderKey\nrepresenting the leadership if successful. The LeaderKey can then be used\nto issue new values on the election, transactionally guard API requests on\nleadership still being held, and resign from the election.",
"operationId": "Election_Campaign",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbCampaignResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbCampaignRequest"
}
}
],
"tags": [
"Election"
]
}
},
"/v3/election/leader": {
"post": {
"summary": "Leader returns the current election proclamation, if any.",
"operationId": "Election_Leader",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbLeaderRequest"
}
}
],
"tags": [
"Election"
]
}
},
"/v3/election/observe": {
"post": {
"summary": "Observe streams election proclamations in-order as made by the election's\nelected leaders.",
"operationId": "Election_Observe",
"responses": {
"200": {
"description": "A successful response.(streaming responses)",
"schema": {
"type": "object",
"properties": {
"result": {
"$ref": "#/definitions/v3electionpbLeaderResponse"
},
"error": {
"$ref": "#/definitions/runtimeStreamError"
}
},
"title": "Stream result of v3electionpbLeaderResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbLeaderRequest"
}
}
],
"tags": [
"Election"
]
}
},
"/v3/election/proclaim": {
"post": {
"summary": "Proclaim updates the leader's posted value with a new value.",
"operationId": "Election_Proclaim",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbProclaimResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbProclaimRequest"
}
}
],
"tags": [
"Election"
]
}
},
"/v3/election/resign": {
"post": {
"summary": "Resign releases election leadership so other campaigners may acquire\nleadership on the election.",
"operationId": "Election_Resign",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3electionpbResignResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3electionpbResignRequest"
}
}
],
"tags": [
"Election"
]
}
}
},
"definitions": {
"etcdserverpbResponseHeader": {
"type": "object",
"properties": {
"cluster_id": {
"type": "string",
"format": "uint64",
"description": "cluster_id is the ID of the cluster which sent the response."
},
"member_id": {
"type": "string",
"format": "uint64",
"description": "member_id is the ID of the member which sent the response."
},
"revision": {
"type": "string",
"format": "int64",
"description": "revision is the key-value store revision when the request was applied, and it's\nunset (so 0) in case of calls not interacting with key-value store.\nFor watch progress responses, the header.revision indicates progress. All future events\nreceived in this stream are guaranteed to have a higher revision number than the\nheader.revision number."
},
"raft_term": {
"type": "string",
"format": "uint64",
"description": "raft_term is the raft term when the request was applied."
}
}
},
"mvccpbKeyValue": {
"type": "object",
"properties": {
"key": {
"type": "string",
"format": "byte",
"description": "key is the key in bytes. An empty key is not allowed."
},
"create_revision": {
"type": "string",
"format": "int64",
"description": "create_revision is the revision of last creation on this key."
},
"mod_revision": {
"type": "string",
"format": "int64",
"description": "mod_revision is the revision of last modification on this key."
},
"version": {
"type": "string",
"format": "int64",
"description": "version is the version of the key. A deletion resets\nthe version to zero and any modification of the key\nincreases its version."
},
"value": {
"type": "string",
"format": "byte",
"description": "value is the value held by the key, in bytes."
},
"lease": {
"type": "string",
"format": "int64",
"description": "lease is the ID of the lease that attached to key.\nWhen the attached lease expires, the key will be deleted.\nIf lease is 0, then no lease is attached to the key."
}
}
},
"protobufAny": {
"type": "object",
"properties": {
"type_url": {
"type": "string",
"description": "A URL/resource name that uniquely identifies the type of the serialized\nprotocol buffer message. This string must contain at least\none \"/\" character. The last segment of the URL's path must represent\nthe fully qualified name of the type (as in\n`path/google.protobuf.Duration`). The name should be in a canonical form\n(e.g., leading \".\" is not accepted).\n\nIn practice, teams usually precompile into the binary all types that they\nexpect it to use in the context of Any. However, for URLs which use the\nscheme `http`, `https`, or no scheme, one can optionally set up a type\nserver that maps type URLs to message definitions as follows:\n\n* If no scheme is provided, `https` is assumed.\n* An HTTP GET on the URL must yield a [google.protobuf.Type][]\n value in binary format, or produce an error.\n* Applications are allowed to cache lookup results based on the\n URL, or have them precompiled into a binary to avoid any\n lookup. Therefore, binary compatibility needs to be preserved\n on changes to types. (Use versioned type names to manage\n breaking changes.)\n\nNote: this functionality is not currently available in the official\nprotobuf release, and it is not used for type URLs beginning with\ntype.googleapis.com.\n\nSchemes other than `http`, `https` (or the empty scheme) might be\nused with implementation specific semantics."
},
"value": {
"type": "string",
"format": "byte",
"description": "Must be a valid serialized protocol buffer of the above specified type."
}
},
"description": "`Any` contains an arbitrary serialized protocol buffer message along with a\nURL that describes the type of the serialized message.\n\nProtobuf library provides support to pack/unpack Any values in the form\nof utility functions or additional generated methods of the Any type.\n\nExample 1: Pack and unpack a message in C++.\n\n Foo foo = ...;\n Any any;\n any.PackFrom(foo);\n ...\n if (any.UnpackTo(\u0026foo)) {\n ...\n }\n\nExample 2: Pack and unpack a message in Java.\n\n Foo foo = ...;\n Any any = Any.pack(foo);\n ...\n if (any.is(Foo.class)) {\n foo = any.unpack(Foo.class);\n }\n\n Example 3: Pack and unpack a message in Python.\n\n foo = Foo(...)\n any = Any()\n any.Pack(foo)\n ...\n if any.Is(Foo.DESCRIPTOR):\n any.Unpack(foo)\n ...\n\n Example 4: Pack and unpack a message in Go\n\n foo := \u0026pb.Foo{...}\n any, err := ptypes.MarshalAny(foo)\n ...\n foo := \u0026pb.Foo{}\n if err := ptypes.UnmarshalAny(any, foo); err != nil {\n ...\n }\n\nThe pack methods provided by protobuf library will by default use\n'type.googleapis.com/full.type.name' as the type URL and the unpack\nmethods only use the fully qualified type name after the last '/'\nin the type URL, for example \"foo.bar.com/x/y.z\" will yield type\nname \"y.z\".\n\n\nJSON\n====\nThe JSON representation of an `Any` value uses the regular\nrepresentation of the deserialized, embedded message, with an\nadditional field `@type` which contains the type URL. Example:\n\n package google.profile;\n message Person {\n string first_name = 1;\n string last_name = 2;\n }\n\n {\n \"@type\": \"type.googleapis.com/google.profile.Person\",\n \"firstName\": \u003cstring\u003e,\n \"lastName\": \u003cstring\u003e\n }\n\nIf the embedded message type is well-known and has a custom JSON\nrepresentation, that representation will be embedded adding a field\n`value` which holds the custom JSON in addition to the `@type`\nfield. 
Example (for message [google.protobuf.Duration][]):\n\n {\n \"@type\": \"type.googleapis.com/google.protobuf.Duration\",\n \"value\": \"1.212s\"\n }"
},
"runtimeError": {
"type": "object",
"properties": {
"error": {
"type": "string"
},
"code": {
"type": "integer",
"format": "int32"
},
"message": {
"type": "string"
},
"details": {
"type": "array",
"items": {
"$ref": "#/definitions/protobufAny"
}
}
}
},
"runtimeStreamError": {
"type": "object",
"properties": {
"grpc_code": {
"type": "integer",
"format": "int32"
},
"http_code": {
"type": "integer",
"format": "int32"
},
"message": {
"type": "string"
},
"http_status": {
"type": "string"
},
"details": {
"type": "array",
"items": {
"$ref": "#/definitions/protobufAny"
}
}
}
},
"v3electionpbCampaignRequest": {
"type": "object",
"properties": {
"name": {
"type": "string",
"format": "byte",
"description": "name is the election's identifier for the campaign."
},
"lease": {
"type": "string",
"format": "int64",
"description": "lease is the ID of the lease attached to leadership of the election. If the\nlease expires or is revoked before resigning leadership, then the\nleadership is transferred to the next campaigner, if any."
},
"value": {
"type": "string",
"format": "byte",
"description": "value is the initial proclaimed value set when the campaigner wins the\nelection."
}
}
},
"v3electionpbCampaignResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
},
"leader": {
"$ref": "#/definitions/v3electionpbLeaderKey",
"description": "leader describes the resources used for holding leadereship of the election."
}
}
},
"v3electionpbLeaderKey": {
"type": "object",
"properties": {
"name": {
"type": "string",
"format": "byte",
"description": "name is the election identifier that correponds to the leadership key."
},
"key": {
"type": "string",
"format": "byte",
"description": "key is an opaque key representing the ownership of the election. If the key\nis deleted, then leadership is lost."
},
"rev": {
"type": "string",
"format": "int64",
"description": "rev is the creation revision of the key. It can be used to test for ownership\nof an election during transactions by testing the key's creation revision\nmatches rev."
},
"lease": {
"type": "string",
"format": "int64",
"description": "lease is the lease ID of the election leader."
}
}
},
"v3electionpbLeaderRequest": {
"type": "object",
"properties": {
"name": {
"type": "string",
"format": "byte",
"description": "name is the election identifier for the leadership information."
}
}
},
"v3electionpbLeaderResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
},
"kv": {
"$ref": "#/definitions/mvccpbKeyValue",
"description": "kv is the key-value pair representing the latest leader update."
}
}
},
"v3electionpbProclaimRequest": {
"type": "object",
"properties": {
"leader": {
"$ref": "#/definitions/v3electionpbLeaderKey",
"description": "leader is the leadership hold on the election."
},
"value": {
"type": "string",
"format": "byte",
"description": "value is an update meant to overwrite the leader's current value."
}
}
},
"v3electionpbProclaimResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
}
}
},
"v3electionpbResignRequest": {
"type": "object",
"properties": {
"leader": {
"$ref": "#/definitions/v3electionpbLeaderKey",
"description": "leader is the leadership to relinquish by resignation."
}
}
},
"v3electionpbResignResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
}
}
}
}
}

View File

@ -1,190 +0,0 @@
{
"swagger": "2.0",
"info": {
"title": "server/etcdserver/api/v3lock/v3lockpb/v3lock.proto",
"version": "version not set"
},
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"paths": {
"/v3/lock/lock": {
"post": {
"summary": "Lock acquires a distributed shared lock on a given named lock.\nOn success, it will return a unique key that exists so long as the\nlock is held by the caller. This key can be used in conjunction with\ntransactions to safely ensure updates to etcd only occur while holding\nlock ownership. The lock is held until Unlock is called on the key or the\nlease associate with the owner expires.",
"operationId": "Lock_Lock",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3lockpbLockResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3lockpbLockRequest"
}
}
],
"tags": [
"Lock"
]
}
},
"/v3/lock/unlock": {
"post": {
"summary": "Unlock takes a key returned by Lock and releases the hold on lock. The\nnext Lock caller waiting for the lock will then be woken up and given\nownership of the lock.",
"operationId": "Lock_Unlock",
"responses": {
"200": {
"description": "A successful response.",
"schema": {
"$ref": "#/definitions/v3lockpbUnlockResponse"
}
},
"default": {
"description": "An unexpected error response.",
"schema": {
"$ref": "#/definitions/runtimeError"
}
}
},
"parameters": [
{
"name": "body",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/v3lockpbUnlockRequest"
}
}
],
"tags": [
"Lock"
]
}
}
},
"definitions": {
"etcdserverpbResponseHeader": {
"type": "object",
"properties": {
"cluster_id": {
"type": "string",
"format": "uint64",
"description": "cluster_id is the ID of the cluster which sent the response."
},
"member_id": {
"type": "string",
"format": "uint64",
"description": "member_id is the ID of the member which sent the response."
},
"revision": {
"type": "string",
"format": "int64",
"description": "revision is the key-value store revision when the request was applied, and it's\nunset (so 0) in case of calls not interacting with key-value store.\nFor watch progress responses, the header.revision indicates progress. All future events\nreceived in this stream are guaranteed to have a higher revision number than the\nheader.revision number."
},
"raft_term": {
"type": "string",
"format": "uint64",
"description": "raft_term is the raft term when the request was applied."
}
}
},
"protobufAny": {
"type": "object",
"properties": {
"type_url": {
"type": "string",
"description": "A URL/resource name that uniquely identifies the type of the serialized\nprotocol buffer message. This string must contain at least\none \"/\" character. The last segment of the URL's path must represent\nthe fully qualified name of the type (as in\n`path/google.protobuf.Duration`). The name should be in a canonical form\n(e.g., leading \".\" is not accepted).\n\nIn practice, teams usually precompile into the binary all types that they\nexpect it to use in the context of Any. However, for URLs which use the\nscheme `http`, `https`, or no scheme, one can optionally set up a type\nserver that maps type URLs to message definitions as follows:\n\n* If no scheme is provided, `https` is assumed.\n* An HTTP GET on the URL must yield a [google.protobuf.Type][]\n value in binary format, or produce an error.\n* Applications are allowed to cache lookup results based on the\n URL, or have them precompiled into a binary to avoid any\n lookup. Therefore, binary compatibility needs to be preserved\n on changes to types. (Use versioned type names to manage\n breaking changes.)\n\nNote: this functionality is not currently available in the official\nprotobuf release, and it is not used for type URLs beginning with\ntype.googleapis.com.\n\nSchemes other than `http`, `https` (or the empty scheme) might be\nused with implementation specific semantics."
},
"value": {
"type": "string",
"format": "byte",
"description": "Must be a valid serialized protocol buffer of the above specified type."
}
},
"description": "`Any` contains an arbitrary serialized protocol buffer message along with a\nURL that describes the type of the serialized message.\n\nProtobuf library provides support to pack/unpack Any values in the form\nof utility functions or additional generated methods of the Any type.\n\nExample 1: Pack and unpack a message in C++.\n\n Foo foo = ...;\n Any any;\n any.PackFrom(foo);\n ...\n if (any.UnpackTo(\u0026foo)) {\n ...\n }\n\nExample 2: Pack and unpack a message in Java.\n\n Foo foo = ...;\n Any any = Any.pack(foo);\n ...\n if (any.is(Foo.class)) {\n foo = any.unpack(Foo.class);\n }\n\n Example 3: Pack and unpack a message in Python.\n\n foo = Foo(...)\n any = Any()\n any.Pack(foo)\n ...\n if any.Is(Foo.DESCRIPTOR):\n any.Unpack(foo)\n ...\n\n Example 4: Pack and unpack a message in Go\n\n foo := \u0026pb.Foo{...}\n any, err := ptypes.MarshalAny(foo)\n ...\n foo := \u0026pb.Foo{}\n if err := ptypes.UnmarshalAny(any, foo); err != nil {\n ...\n }\n\nThe pack methods provided by protobuf library will by default use\n'type.googleapis.com/full.type.name' as the type URL and the unpack\nmethods only use the fully qualified type name after the last '/'\nin the type URL, for example \"foo.bar.com/x/y.z\" will yield type\nname \"y.z\".\n\n\nJSON\n====\nThe JSON representation of an `Any` value uses the regular\nrepresentation of the deserialized, embedded message, with an\nadditional field `@type` which contains the type URL. Example:\n\n package google.profile;\n message Person {\n string first_name = 1;\n string last_name = 2;\n }\n\n {\n \"@type\": \"type.googleapis.com/google.profile.Person\",\n \"firstName\": \u003cstring\u003e,\n \"lastName\": \u003cstring\u003e\n }\n\nIf the embedded message type is well-known and has a custom JSON\nrepresentation, that representation will be embedded adding a field\n`value` which holds the custom JSON in addition to the `@type`\nfield. 
Example (for message [google.protobuf.Duration][]):\n\n {\n \"@type\": \"type.googleapis.com/google.protobuf.Duration\",\n \"value\": \"1.212s\"\n }"
},
"runtimeError": {
"type": "object",
"properties": {
"error": {
"type": "string"
},
"code": {
"type": "integer",
"format": "int32"
},
"message": {
"type": "string"
},
"details": {
"type": "array",
"items": {
"$ref": "#/definitions/protobufAny"
}
}
}
},
"v3lockpbLockRequest": {
"type": "object",
"properties": {
"name": {
"type": "string",
"format": "byte",
"description": "name is the identifier for the distributed shared lock to be acquired."
},
"lease": {
"type": "string",
"format": "int64",
"description": "lease is the ID of the lease that will be attached to ownership of the\nlock. If the lease expires or is revoked and currently holds the lock,\nthe lock is automatically released. Calls to Lock with the same lease will\nbe treated as a single acquisition; locking twice with the same lease is a\nno-op."
}
}
},
"v3lockpbLockResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
},
"key": {
"type": "string",
"format": "byte",
"description": "key is a key that will exist on etcd for the duration that the Lock caller\nowns the lock. Users should not modify this key or the lock may exhibit\nundefined behavior."
}
}
},
"v3lockpbUnlockRequest": {
"type": "object",
"properties": {
"key": {
"type": "string",
"format": "byte",
"description": "key is the lock ownership key granted by Lock."
}
}
},
"v3lockpbUnlockResponse": {
"type": "object",
"properties": {
"header": {
"$ref": "#/definitions/etcdserverpbResponseHeader"
}
}
}
}
}

View File

@ -0,0 +1,92 @@
# Running etcd under Docker
The following guide will show you how to run etcd under Docker using the [static bootstrap process](clustering.md#static).
## Running etcd in standalone mode
In order to expose the etcd API to clients outside of the Docker host, you'll need to use the host IP address when configuring etcd.
```
export HostIP="192.168.12.50"
```
The following `docker run` command will expose the etcd client API over ports 4001 and 2379, and expose the peer port over 2380.
```
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
--name etcd quay.io/coreos/etcd:v2.0.8 \
-name etcd0 \
-advertise-client-urls http://${HostIP}:2379,http://${HostIP}:4001 \
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
-initial-advertise-peer-urls http://${HostIP}:2380 \
-listen-peer-urls http://0.0.0.0:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster etcd0=http://${HostIP}:2380 \
-initial-cluster-state new
```
Configure etcd clients to use the Docker host IP and one of the listening ports from above.
```
etcdctl -C http://192.168.12.50:2379 member list
```
```
etcdctl -C http://192.168.12.50:4001 member list
```
## Running a 3 node etcd cluster
Using Docker to set up a multi-node cluster is very similar to the standalone mode configuration.
The main difference is the value used for the `-initial-cluster` flag, which must contain the peer URLs for each etcd member in the cluster.
### etcd0
```
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
--name etcd quay.io/coreos/etcd:v2.0.8 \
-name etcd0 \
-advertise-client-urls http://192.168.12.50:2379,http://192.168.12.50:4001 \
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
-initial-advertise-peer-urls http://192.168.12.50:2380 \
-listen-peer-urls http://0.0.0.0:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster etcd0=http://192.168.12.50:2380,etcd1=http://192.168.12.51:2380,etcd2=http://192.168.12.52:2380 \
-initial-cluster-state new
```
### etcd1
```
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
--name etcd quay.io/coreos/etcd:v2.0.8 \
-name etcd1 \
-advertise-client-urls http://192.168.12.51:2379,http://192.168.12.51:4001 \
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
-initial-advertise-peer-urls http://192.168.12.51:2380 \
-listen-peer-urls http://0.0.0.0:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster etcd0=http://192.168.12.50:2380,etcd1=http://192.168.12.51:2380,etcd2=http://192.168.12.52:2380 \
-initial-cluster-state new
```
### etcd2
```
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
--name etcd quay.io/coreos/etcd:v2.0.8 \
-name etcd2 \
-advertise-client-urls http://192.168.12.52:2379,http://192.168.12.52:4001 \
-listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
-initial-advertise-peer-urls http://192.168.12.52:2380 \
-listen-peer-urls http://0.0.0.0:2380 \
-initial-cluster-token etcd-cluster-1 \
-initial-cluster etcd0=http://192.168.12.50:2380,etcd1=http://192.168.12.51:2380,etcd2=http://192.168.12.52:2380 \
-initial-cluster-state new
```
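The `-initial-cluster` value repeated in the three commands above is a comma-separated list of `name=peerURL` pairs. A minimal sketch of assembling it, assuming a hypothetical helper `initialCluster` and sorting by member name only to make the output deterministic:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// initialCluster builds the -initial-cluster flag value from a map of
// member name to peer URL. Hypothetical helper, not part of etcd.
func initialCluster(peers map[string]string) string {
	names := make([]string, 0, len(peers))
	for name := range peers {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random, so sort the names

	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, name+"="+peers[name])
	}
	return strings.Join(parts, ",")
}

func main() {
	peers := map[string]string{
		"etcd0": "http://192.168.12.50:2380",
		"etcd1": "http://192.168.12.51:2380",
		"etcd2": "http://192.168.12.52:2380",
	}
	fmt.Println(initialCluster(peers))
}
```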
Once the cluster has been bootstrapped, etcd clients can be configured with a list of etcd members:
```
etcdctl -C http://192.168.12.50:2379,http://192.168.12.51:2379,http://192.168.12.52:2379 member list
```

View File

@ -0,0 +1,42 @@
Error Code
======
This document describes the error codes used in the key space '/v2/keys'. Feel free to import 'github.com/coreos/etcd/error' to use them.
The codes are categorized into four groups:
- Command Related Error
| name | code | strerror |
|----------------------|------|-----------------------|
| EcodeKeyNotFound | 100 | "Key not found" |
| EcodeTestFailed | 101 | "Compare failed" |
| EcodeNotFile | 102 | "Not a file" |
| EcodeNotDir | 104 | "Not a directory" |
| EcodeNodeExist | 105 | "Key already exists" |
| EcodeRootROnly | 107 | "Root is read only" |
| EcodeDirNotEmpty | 108 | "Directory not empty" |
- Post Form Related Error
| name | code | strerror |
|--------------------------|------|------------------------------------------------|
| EcodePrevValueRequired | 201 | "PrevValue is Required in POST form" |
| EcodeTTLNaN | 202 | "The given TTL in POST form is not a number" |
| EcodeIndexNaN | 203 | "The given index in POST form is not a number" |
| EcodeInvalidField | 209 | "Invalid field" |
| EcodeInvalidForm | 210 | "Invalid POST form" |
- Raft Related Error
| name | code | strerror |
|-------------------|------|--------------------------|
| EcodeRaftInternal | 300 | "Raft Internal Error" |
| EcodeLeaderElect | 301 | "During Leader Election" |
- Etcd Related Error
| name | code | strerror |
|-------------------------|------|--------------------------------------------------------|
| EcodeWatcherCleared | 400 | "watcher is cleared due to etcd recovery" |
| EcodeEventIndexCleared | 401 | "The event in requested index is outdated and cleared" |
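In the tables above, the hundreds digit of each code matches its group. The sketch below classifies a code on that basis. `errorGroup` is a hypothetical helper written for this example and is not a function of the `github.com/coreos/etcd/error` package.

```go
package main

import "fmt"

// errorGroup returns the documented group for a v2 error code, inferred
// from the hundreds digit used in the tables. Hypothetical helper.
func errorGroup(code int) string {
	switch code / 100 {
	case 1:
		return "Command Related Error"
	case 2:
		return "Post Form Related Error"
	case 3:
		return "Raft Related Error"
	case 4:
		return "Etcd Related Error"
	default:
		return "unknown"
	}
}

func main() {
	for _, code := range []int{100, 209, 301, 401} {
		fmt.Printf("%d: %s\n", code, errorGroup(code))
	}
}
```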

Documentation/glossary.md Normal file
View File

@ -0,0 +1,35 @@
## Glossary
This document defines the various terms used in etcd documentation, command line and source code.
### Node
A node is an instance of a raft state machine.
It has a unique identifier and, when it is the leader, records other nodes' progress internally.
### Member
A member is an instance of etcd. It hosts a node and provides service to clients.
### Cluster
A cluster consists of several members.
The node in each member follows the raft consensus protocol to replicate logs. The cluster receives proposals from members, commits them, and applies them to the local store.
### Peer
A peer is another member of the same cluster.
### Proposal
A proposal is a request (for example, a write request or a configuration change request) that needs to go through the raft protocol.
### Client
A client is a caller of the cluster's HTTP API.
### Machine (deprecated)
The term used for Member in etcd before 2.0.

View File

@ -0,0 +1,65 @@
# FAQ
## Initial Bootstrapping UX
etcd initial bootstrapping is done via command line flags such as
`--initial-cluster` or `--discovery`. These flags can safely be left on the
command line after your cluster is running but they will be ignored if you have
a non-empty data dir. So, why did we decide to have this sort of odd UX?
One of the design goals of etcd is easy bringup of clusters using a one-shot
static configuration like AWS Cloud Formation, PXE booting, etc. Essentially we
want to describe several virtual machines and bring them all up at once into an
etcd cluster.
To achieve this sort of hands-free cluster bootstrap we had two other options:
**API to bootstrap**
This is problematic because it cannot be coordinated from a single service file
and we didn't want to have the etcd socket listening but unresponsive to
clients for an unbounded period of time.
It would look something like this:
```
ExecStart=/usr/bin/etcd
ExecStartPost=/usr/bin/etcd init localhost:2379 --cluster=
```
**etcd init subcommand**
```
etcd init --cluster='default=http://localhost:2380,default=http://localhost:7001'...
etcd init --discovery https://discovery-example.etcd.io/193e4
```
Then after running an init step you would execute `etcd`. This however
introduced problems: we now have to define a hand-off protocol between the etcd
init process and the etcd binary itself. This is hard to coordinate in a single
service file such as:
```
ExecStartPre=/usr/bin/etcd init --cluster=....
ExecStart=/usr/bin/etcd
```
There are several error cases:
0) Init has already run and the data directory is already configured
1) Discovery fails because of network timeout, etc
2) Discovery fails because the cluster is already full and etcd needs to fall back to proxy
3) Static cluster configuration fails because of conflict, misconfiguration or timeout
In hindsight we could have made this work by doing:
```
rc status
0 Init already ran
1 Discovery fails on network timeout, etc
0 Discovery fails for cluster full, coordinate via proxy state file
1 Static cluster configuration failed
```
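The return-code table above can be read as a mapping from init outcome to exit status. The following sketch expresses that mapping. Since the `etcd init` subcommand was only a design idea, every name here is hypothetical.

```go
package main

import "fmt"

// initOutcome enumerates the four error cases listed in the FAQ.
type initOutcome int

const (
	alreadyInitialized initOutcome = iota
	discoveryNetworkFailure
	discoveryClusterFull
	staticConfigFailure
)

// exitCode returns the status the table assigns to each outcome:
// 0 lets the service file continue on to start etcd, 1 aborts.
func exitCode(o initOutcome) int {
	switch o {
	case alreadyInitialized, discoveryClusterFull:
		return 0
	default:
		return 1
	}
}

func main() {
	fmt.Println(exitCode(alreadyInitialized), exitCode(discoveryNetworkFailure),
		exitCode(discoveryClusterFull), exitCode(staticConfigFailure))
}
```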
Perhaps we can add the init command in a future version and deprecate the current flags if the UX
continues to confuse people.

View File

@ -1,68 +0,0 @@
# etcd arm64 test infrastructure
## Infrastructure summary
All etcd project pipelines run via github actions. The etcd project currently maintains dedicated infrastructure for running `arm64` continuous integration testing. This is required because currently github actions runner virtual machines are only offered as `x64`.
The infrastructure consists of two `c3.large.arm` bare metal servers kindly provided by [Equinix Metal](https://www.equinix.com/) via the [CNCF Community Infrastructure Lab].
| Hostname | IP | Operating System | Region |
|-------------------------------|----------------|--------------------|---------------|
| etcd-c3-large-arm64-runner-01 | 86.109.7.233 | Ubuntu 22.04.1 LTS | Washington DC |
| etcd-c3-large-arm64-runner-02 | 147.28.151.226 | Ubuntu 22.04.1 LTS | Washington DC |
## Infrastructure support
The etcd project aims to self-manage and resolve issues with project infrastructure internally where possible. However, if situations emerge where we need to engage support from Equinix Metal, we can open an issue under the [CNCF Community Infrastructure Lab] project or contact the [Equinix Metal support team](https://deploy.equinix.com/support). If the situation is urgent, contact @vielmetti directly, who can provide further assistance or escalation points.
## Granting infrastructure access
Etcd arm64 test infrastructure access is closely controlled to ensure the infrastructure is secure and protect the integrity of the etcd project.
Access to the infrastructure is defined by the infra admins table below:
| Name | Github | K8s Slack | Email |
|---------------------------|----------------|--------------------|--------------------|
| Marek Siarkowicz | @serathius | @ Serathius | Ref MAINTAINERS.md |
| Benjamin Wang | @ahrtr | @ Benjamin Wang | Ref MAINTAINERS.md |
| Davanum Srinivas | @dimns | @ Dims | davanum@gmail.com |
| Chao Chen | @chaochn47 | @ Chao Chen | chaochn@amazon.com |
| James Blair | @jmhbnz | @ James Blair | etcd@jamma.life |
Individuals in this table are granted access to the infrastructure in two ways:
### 1. Equinix metal web console access
An etcd project exists under the CNCF organisation in the Equinix Metal web console. The direct url to the etcd console is <https://console.equinix.com/projects/1b8c1eb7-983c-4b40-97e0-e317406e232e>.
When a new person is added to the infra admins table, an existing member or etcd maintainer should raise an issue in the [CNCF Community Infrastructure Labs](https://github.com/cncf/cluster/issues) to ensure they are granted web console access.
### 2. Server ssh access
Infra admins can ssh directly to the servers with a dedicated user account for each person. Usernames are based on github handles for easy recognition in logs. These infra admins will be able to elevate to the `root` user when necessary via `sudo`.
Access to machines via ssh is strictly via individual ssh key based authentication, and is not permitted directly to the `root` user. Password authentication is never to be used for etcd infrastructure ssh authentication.
When a new member is added to the infra admins table, an existing member with ssh access should complete the following actions on all etcd servers:
- create the new user via `sudo adduser <username>`.
- add their public key to `/home/<username>/.ssh/authorized_keys` file. Note: Public keys are to be retrieved via github only, example: <https://github.com/jmhbnz.keys>.
- add the new user to machine sudoers file via `usermod -aG sudo <username>`.
## Revoking infrastructure access
When a member is removed from the infra admins table, existing members must review the servers and ensure that the removed member's access to etcd infrastructure is revoked by removing the member's `/home/<username>/.ssh/authorized_keys` entries.
Note: When revoking access do not delete a user or their home directory from servers, as access may need to be reinstated in future.
### Regular access review
At least quarterly, members of the infra admins team are responsible for verifying that no unnecessary infrastructure access exists by reviewing membership of the table above and existing server access.
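The review above amounts to comparing two sets: the handles in the infra admins table and the accounts that still hold keys on a server. The following is a minimal sketch of that comparison. The function name, the input shapes and the `formeradmin` account are hypothetical, and how the per-server data is collected is left out.

```python
def find_stale_accounts(infra_admins, server_accounts):
    """Return accounts that still hold SSH keys but are not in the infra admins table.

    infra_admins: GitHub handles from the table, with or without a leading "@".
    server_accounts: mapping of username -> True if authorized_keys still has entries.
    Both shapes are assumptions made for this sketch.
    """
    admins = {handle.lstrip("@").lower() for handle in infra_admins}
    return sorted(
        user
        for user, has_keys in server_accounts.items()
        if has_keys and user.lower() not in admins
    )


infra_admins = ["@ahrtr", "@chaochn47", "@jmhbnz"]
# Revoked users keep their account and home directory, so "no keys" is a valid state.
server_accounts = {"ahrtr": True, "jmhbnz": True, "chaochn47": False, "formeradmin": True}

print(find_stale_accounts(infra_admins, server_accounts))
```

Any name printed here would need its `authorized_keys` entries removed, following the revocation procedure above.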
## Provisioning new machines
If the etcd project needs new `arm64` infrastructure we can open an issue with the [CNCF Community Infrastructure Lab]. An example etcd request is [here](https://github.com/cncf/cluster/issues/227).
Note: `arm64` compute capacity is not currently available in all regions. Availability can be checked with [metal-cli](https://github.com/equinix/metal-cli): `metal capacity get | grep arm`.
[CNCF Community Infrastructure Lab]: https://github.com/cncf/cluster/issues


@ -0,0 +1,61 @@
# Versioning
Goal: We want to be able to upgrade an individual peer in an etcd cluster to a newer version of etcd.
The process will take the form of individual followers upgrading to the latest version until the entire cluster is on the new version.
Immediate need: etcd is moving too fast to version the internal API right now.
But, we need to keep mixed version clusters from being started by a rolling upgrade process (e.g. the CoreOS developer alpha).
Longer term need: Having a mixed version cluster where not all peers are running the exact same version of etcd itself, but all are able to speak one version of the internal protocol.
Solution: The internal protocol needs to be versioned just as the client protocol is.
Initially during the 0.\*.\* series of etcd releases we won't allow mixed versions at all.
## Join Control
We will add a version field to the join command.
But, who decides whether a newly upgraded follower should be able to join a cluster?
### Leader Controlled
If the leader controls the version of followers joining the cluster, it compares its own version to the version number presented by the follower in the JoinCommand and rejects the join if that number is lower than the leader's version number.
Advantages
- Leader controls all cluster decisions still
Disadvantages
- Follower knows better what versions of the internal protocol it can talk than the leader
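A minimal sketch of the leader-controlled check, assuming the version travels as a dotted numeric string such as `0.2.0`. The text does not specify the field's format, and both function names are hypothetical.

```python
def parse_version(text):
    """Turn "0.10.1" into (0, 10, 1) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def leader_accepts_join(leader_version, follower_version):
    """Reject the join when the follower's version is lower than the leader's."""
    return parse_version(follower_version) >= parse_version(leader_version)


print(leader_accepts_join("0.2.0", "0.2.0"))
print(leader_accepts_join("0.2.0", "0.1.9"))
```

Comparing tuples of integers rather than raw strings matters once a component reaches two digits: as strings, `"0.10.0"` sorts before `"0.2.0"`.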
### Follower Controlled
A newly upgraded follower should be able to figure out the leader's internal version from a defined, backwards-compatible internal API endpoint and figure out if it can join the cluster.
If it cannot join the cluster then it simply exits.
Advantages
- The follower is running newer code and knows better if it can talk older protocols
Disadvantages
- This cluster decision isn't made by the leader
## Recommendation
To solve the immediate need and to plan for the future, let's do the following:
- Add Version field to JoinCommand
- Have a joining follower read the Version field of the leader and if its own version doesn't match the leader then sleep for some random interval and retry later to see if the leader has upgraded.
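The recommended follower behaviour can be sketched as a retry loop with a random pause between attempts. Everything named here is an assumption for illustration: `read_leader_version` stands in for whatever call reads the leader's Version field, and the attempt limit and maximum delay are arbitrary.

```python
import random
import time


def wait_until_versions_match(own_version, read_leader_version, sleep=time.sleep,
                              max_attempts=5, max_delay_seconds=10.0):
    """Return the attempt number on which the versions matched, or None if they never did."""
    for attempt in range(1, max_attempts + 1):
        if read_leader_version() == own_version:
            return attempt
        if attempt < max_attempts:
            # Random interval so that many waiting followers do not retry in lockstep.
            sleep(random.uniform(0.0, max_delay_seconds))
    return None


# Simulated leader that is upgraded between the second and third check.
leader_versions = iter(["0.1.0", "0.1.0", "0.2.0"])
delays = []  # record the pauses instead of really sleeping
attempts = wait_until_versions_match("0.2.0", lambda: next(leader_versions), sleep=delays.append)
print(attempts, len(delays))
```

Passing `sleep` as a parameter keeps the loop testable without waiting. A real follower would use the default `time.sleep`.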
# Research
## Zookeeper versioning
Zookeeper very recently added versioning into the protocol and it doesn't seem to have seen any use yet.
https://issues.apache.org/jira/browse/ZOOKEEPER-1633
## doozerd
doozerd stores the version number of the peers in the datastore for other clients to check; no decisions are currently made based on this number.


@ -0,0 +1,120 @@
## Libraries and Tools
**Tools**
- [etcdctl](https://github.com/coreos/etcdctl) - A command line client for etcd
- [etcd-backup](https://github.com/fanhattan/etcd-backup) - A powerful command line utility for dumping/restoring etcd - Supports v2
- [etcd-dump](https://npmjs.org/package/etcd-dump) - Command line utility for dumping/restoring etcd.
- [etcd-fs](https://github.com/xetorthio/etcd-fs) - FUSE filesystem for etcd
- [etcd-browser](https://github.com/henszey/etcd-browser) - A web-based key/value editor for etcd using AngularJS
- [etcd-lock](https://github.com/datawisesystems/etcd-lock) - Master election & distributed r/w lock implementation using etcd - Supports v2
- [etcd-console](https://github.com/matishsiao/etcd-console) - A web-based key/value editor for etcd using PHP
- [etcd-viewer](https://github.com/nikfoundas/etcd-viewer) - An etcd key-value store editor/viewer written in Java
**Go libraries**
- [go-etcd](https://github.com/coreos/go-etcd) - Supports v2
**Java libraries**
- [boonproject/etcd](https://github.com/boonproject/boon/blob/master/etcd/README.md) - Supports v2, Async/Sync and waits
- [justinsb/jetcd](https://github.com/justinsb/jetcd)
- [diwakergupta/jetcd](https://github.com/diwakergupta/jetcd) - Supports v2
- [jurmous/etcd4j](https://github.com/jurmous/etcd4j) - Supports v2, Async/Sync, waits and SSL
- [AdoHe/etcd4j](http://github.com/AdoHe/etcd4j) - Supports v2 (enhanced for real production clusters)
**Python libraries**
- [jplana/python-etcd](https://github.com/jplana/python-etcd) - Supports v2
- [russellhaering/txetcd](https://github.com/russellhaering/txetcd) - a Twisted Python library
- [cholcombe973/autodock](https://github.com/cholcombe973/autodock) - A docker deployment automation tool
- [lisael/aioetcd](https://github.com/lisael/aioetcd) - (Python 3.4+) Asyncio coroutines client (Supports v2)
**Node libraries**
- [stianeikeland/node-etcd](https://github.com/stianeikeland/node-etcd) - Supports v2 (with CoffeeScript)
- [lavagetto/nodejs-etcd](https://github.com/lavagetto/nodejs-etcd) - Supports v2
- [deedubs/node-etcd-config](https://github.com/deedubs/node-etcd-config) - Supports v2
**Ruby libraries**
- [iconara/etcd-rb](https://github.com/iconara/etcd-rb)
- [jpfuentes2/etcd-ruby](https://github.com/jpfuentes2/etcd-ruby)
- [ranjib/etcd-ruby](https://github.com/ranjib/etcd-ruby) - Supports v2
**C libraries**
- [jdarcy/etcd-api](https://github.com/jdarcy/etcd-api) - Supports v2
**C++ libraries**
- [edwardcapriolo/etcdcpp](https://github.com/edwardcapriolo/etcdcpp) - Supports v2
**Clojure libraries**
- [aterreno/etcd-clojure](https://github.com/aterreno/etcd-clojure)
- [dwwoelfel/cetcd](https://github.com/dwwoelfel/cetcd) - Supports v2
- [rthomas/clj-etcd](https://github.com/rthomas/clj-etcd) - Supports v2
**Erlang libraries**
- [marshall-lee/etcd.erl](https://github.com/marshall-lee/etcd.erl)
**.Net Libraries**
- [drusellers/etcetera](https://github.com/drusellers/etcetera)
**PHP Libraries**
- [linkorb/etcd-php](https://github.com/linkorb/etcd-php)
**Haskell libraries**
- [wereHamster/etcd-hs](https://github.com/wereHamster/etcd-hs)
**R libraries**
- [ropensci/etseed](https://github.com/ropensci/etseed)
**Tcl libraries**
- [efrecon/etcd-tcl](https://github.com/efrecon/etcd-tcl) - Supports v2, except wait.
A detailed recap of client functionalities can be found in the [clients compatibility matrix][clients-matrix.md].
[clients-matrix.md]: https://github.com/coreos/etcd/blob/master/Documentation/clients-matrix.md
**Chef Integration**
- [coderanger/etcd-chef](https://github.com/coderanger/etcd-chef)
**Chef Cookbook**
- [spheromak/etcd-cookbook](https://github.com/spheromak/etcd-cookbook)
**BOSH Releases**
- [cloudfoundry-community/etcd-boshrelease](https://github.com/cloudfoundry-community/etcd-boshrelease)
- [cloudfoundry/cf-release](https://github.com/cloudfoundry/cf-release/tree/master/jobs/etcd)
**Projects using etcd**
- [binocarlos/yoda](https://github.com/binocarlos/yoda) - etcd + ZeroMQ
- [calavera/active-proxy](https://github.com/calavera/active-proxy) - HTTP Proxy configured with etcd
- [derekchiang/etcdplus](https://github.com/derekchiang/etcdplus) - A set of distributed synchronization primitives built upon etcd
- [go-discover](https://github.com/flynn/go-discover) - service discovery in Go
- [gleicon/goreman](https://github.com/gleicon/goreman/tree/etcd) - Branch of the Go Foreman clone with etcd support
- [garethr/hiera-etcd](https://github.com/garethr/hiera-etcd) - Puppet hiera backend using etcd
- [mattn/etcd-vim](https://github.com/mattn/etcd-vim) - SET and GET keys from inside vim
- [mattn/etcdenv](https://github.com/mattn/etcdenv) - "env" shebang with etcd integration
- [kelseyhightower/confd](https://github.com/kelseyhightower/confd) - Manage local app config files using templates and data from etcd
- [configdb](https://git.autistici.org/ai/configdb/tree/master) - A REST relational abstraction on top of arbitrary database backends, aimed at storing configs and inventories.
- [scrz](https://github.com/scrz/scrz) - Container manager, stores configuration in etcd.
- [fleet](https://github.com/coreos/fleet) - Distributed init system
- [GoogleCloudPlatform/kubernetes](https://github.com/GoogleCloudPlatform/kubernetes) - Container cluster manager.
- [mailgun/vulcand](https://github.com/mailgun/vulcand) - HTTP proxy that uses etcd as a configuration backend.
- [duedil-ltd/discodns](https://github.com/duedil-ltd/discodns) - Simple DNS nameserver using etcd as a database for names and records.
- [skynetservices/skydns](https://github.com/skynetservices/skydns) - RFC compliant DNS server
- [xordataexchange/crypt](https://github.com/xordataexchange/crypt) - Securely store values in etcd using GPG encryption
- [spf13/viper](https://github.com/spf13/viper) - Go configuration library, reads values from ENV, pflags, files, and etcd with optional encryption
- [lytics/metafora](https://github.com/lytics/metafora) - Go distributed task library
- [ryandoyle/nss-etcd](https://github.com/ryandoyle/nss-etcd) - A GNU libc NSS module for resolving names from etcd.

137
Documentation/metrics.md Normal file

@ -0,0 +1,137 @@
## Metrics
**NOTE: The metrics feature is considered experimental. We might add, change or remove metrics without warning in future releases.**
etcd uses [Prometheus](http://prometheus.io/) for metrics reporting in the server. The metrics can be used for real-time monitoring and debugging.
The simplest way to see the available metrics is to cURL the metrics endpoint `/metrics` of etcd. The format is described [here](http://prometheus.io/docs/instrumenting/exposition_formats/).
You can also follow the doc [here](http://prometheus.io/docs/introduction/getting_started/) to start a Prometheus server and monitor etcd metrics.
The naming of metrics follows the suggested [best practice of Prometheus](http://prometheus.io/docs/practices/naming/). A metric name has an `etcd` prefix as its namespace and a subsystem prefix (for example `wal` and `etcdserver`).
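To work with the `/metrics` output programmatically, the text exposition format can be read line by line. The sketch below is a deliberately small parser, not a complete implementation: it assumes one sample per line with no trailing timestamp, and the sample text (including the `HELP` string and the `action="get"` label value) is made up for illustration.

```python
def parse_metrics(text):
    """Map each sample's name (including any labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated field; label values may contain spaces.
        name_and_labels, _, value = line.rpartition(" ")
        samples[name_and_labels] = float(value)
    return samples


SAMPLE = """\
# HELP etcd_store_watchers Illustrative help text.
# TYPE etcd_store_watchers gauge
etcd_store_watchers 3
etcd_store_reads_total{action="get"} 10
etcd_wal_last_index_saved 42
"""

for name, value in parse_metrics(SAMPLE).items():
    print(name, value)
```

For anything beyond quick inspection, a Prometheus server or an official client library is the more robust choice, since the full format also allows timestamps and escaped label values.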
etcd now exposes the following metrics:
### etcdserver
| Name | Description | Type |
|-----------------------------------------|--------------------------------------------------|---------|
| file_descriptors_used_total | The total number of file descriptors used | Gauge |
| proposal_durations_milliseconds | The latency distributions of committing proposal | Summary |
| pending_proposal_total | The total number of pending proposals | Gauge |
| proposal_failed_total | The total number of failed proposals | Counter |
High file descriptor usage (`file_descriptors_used_total`), near the file descriptor limit of the process, indicates a potential out-of-file-descriptors issue. That might cause etcd to fail to create new WAL files and panic.
[Proposal](glossary.md#proposal) durations (`proposal_durations_milliseconds`) give you a summary of the proposal commit latency. Latency can be introduced into this process by network and disk IO.
Pending proposals (`pending_proposal_total`) give you an idea of how many proposals are in the queue and waiting for commit. An increasing pending number indicates a high client load or an unstable cluster.
Failed proposals (`proposal_failed_total`) are normally related to two issues: temporary failures related to a leader election or longer duration downtime caused by a loss of quorum in the cluster.
### store
These metrics describe the accesses into the data store of etcd members that exist in the cluster. They
are useful to count what kind of actions are taken by users. They are also useful to see whether all etcd members
"see" the same set of data mutations, and whether reads and watches (which are local) are equally distributed.
All these metrics are prefixed with `etcd_store_`.
| Name | Description | Type |
|---------------------------|------------------------------------------------------------------------------------------|--------------------|
| reads_total | Total number of reads from store, should differ among etcd members (local reads). | Counter(action) |
| writes_total | Total number of writes to store, should be same among all etcd members. | Counter(action) |
| reads_failed_total | Number of failed reads from store (e.g. key missing) on local reads. | Counter(action) |
| writes_failed_total | Number of failed writes to store (e.g. failed compare and swap). | Counter(action) |
| expires_total | Total number of expired keys (due to TTL). | Counter |
| watch_requests_total | Total number of incoming watch requests to this etcd member (local watches). | Counter |
| watchers | Current count of active watchers on this etcd member. | Gauge |
Both `reads_total` and `writes_total` count both successful and failed requests. `reads_failed_total` and
`writes_failed_total` count failed requests. A lot of failed writes indicate possible contentions on keys (e.g. when
doing `compareAndSet`), and read failures indicate that some clients try to access keys that don't exist.
Example Prometheus queries that may be useful from these metrics (across all etcd members):
* `sum(rate(etcd_store_reads_total{job="etcd"}[1m])) by (action)`
`max(rate(etcd_store_writes_total{job="etcd"}[1m])) by (action)`
Rate of reads and writes by action, across all servers across a time window of `1m`. The reason why `max` is used
for writes as opposed to `sum` for reads is because all of etcd nodes in the cluster apply all writes to their stores.
Shows the rate of successful readonly/write queries across all servers, across a time window of `1m`.
* `sum(rate(etcd_store_watch_requests_total{job="etcd"}[1m]))`
Shows rate of new watch requests per second. Likely driven by how often watched keys change.
* `sum(etcd_store_watchers{job="etcd"})`
Number of active watchers across all etcd servers.
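The choice between `sum` and `max` in the first pair of queries can be checked with a little arithmetic. The sketch below uses made-up per-member rates for a three-member cluster; the member names and numbers are assumptions.

```python
# Per-member rates (operations per second) as each member would report them.
# Every member applies every write, so the write rates are identical.
member_write_rates = {"infra0": 12.0, "infra1": 12.0, "infra2": 12.0}
# Reads are served locally, so each member only sees its own share.
member_read_rates = {"infra0": 40.0, "infra1": 5.0, "infra2": 15.0}

cluster_write_rate = max(member_write_rates.values())   # what max(...) reports
overcounted_writes = sum(member_write_rates.values())   # what sum(...) would report
cluster_read_rate = sum(member_read_rates.values())     # reads must be added up

print(cluster_write_rate, overcounted_writes, cluster_read_rate)
```

Summing the writes would count each write once per member, which is why the write query uses `max` while the read query uses `sum`.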
### wal
| Name | Description | Type |
|------------------------------------|--------------------------------------------------|---------|
| fsync_durations_microseconds | The latency distributions of fsync called by wal | Summary |
| last_index_saved | The index of the last entry saved by wal | Gauge |
Abnormally high fsync duration (`fsync_durations_microseconds`) indicates disk issues and might cause the cluster to be unstable.
### snapshot
| Name | Description | Type |
|--------------------------------------------|------------------------------------------------------------|---------|
| snapshot_save_total_durations_microseconds | The total latency distributions of save called by snapshot | Summary |
Abnormally high snapshot duration (`snapshot_save_total_durations_microseconds`) indicates disk issues and might cause the cluster to be unstable.
### rafthttp
| Name | Description | Type | Labels |
|-----------------------------------|--------------------------------------------|---------|--------------------------------|
| message_sent_latency_microseconds | The latency distributions of messages sent | Summary | sendingType, msgType, remoteID |
| message_sent_failed_total | The total number of failed messages sent | Counter | sendingType, msgType, remoteID |
Abnormally high message duration (`message_sent_latency_microseconds`) indicates network issues and might cause the cluster to be unstable.
An increase in message failures (`message_sent_failed_total`) indicates more severe network issues and might cause the cluster to be unstable.
Label `sendingType` is the connection type used to send messages. `message`, `msgapp` and `msgappv2` use HTTP streaming, while `pipeline` issues an HTTP request for each message.
Label `msgType` is the type of raft message. `MsgApp` is a log replication message; `MsgSnap` is a snapshot install message; `MsgProp` is a proposal forward message; the others are used to maintain raft internal status. If you have a large snapshot, you would expect a long `MsgSnap` sending latency. For other types of messages, you would expect low latency, which is comparable to your ping latency if you have enough network bandwidth.
Label `remoteID` is the member ID of the message destination.
### proxy
etcd members operating in proxy mode do not do store operations. They forward all requests
to cluster instances.
Tracking the rate of requests coming from a proxy allows one to pin down which machine is performing most reads/writes.
All these metrics are prefixed with `etcd_proxy_`
| Name | Description | Type |
|---------------------------|-----------------------------------------------------------------------------------------|--------------------|
| requests_total | Total number of requests by this proxy instance. | Counter(method) |
| handled_total | Total number of fully handled requests, with responses from etcd members. | Counter(method) |
| dropped_total | Total number of dropped requests due to forwarding errors to etcd members. | Counter(method,error) |
| handling_duration_seconds | Bucketed handling times by HTTP method, including round trip to member instances. | Histogram(method) |
Example Prometheus queries that may be useful from these metrics (across all etcd servers):
* `sum(rate(etcd_proxy_handled_total{job="etcd"}[1m])) by (method)`
Rate of requests (by HTTP method) handled by all proxies, across a window of `1m`.
* `histogram_quantile(0.9, sum(increase(etcd_proxy_handling_duration_seconds_bucket{job="etcd",method="GET"}[5m])) by (le))`
`histogram_quantile(0.9, sum(increase(etcd_proxy_handling_duration_seconds_bucket{job="etcd",method!="GET"}[5m])) by (le))`
Show the 0.90-tile latency (in seconds) of handling of user requests across all proxy machines, with a window of `5m`.
* `sum(rate(etcd_proxy_dropped_total{job="etcd"}[1m])) by (proxying_error)`
Number of failed requests on the proxy. This should be 0; spikes here indicate connectivity issues to the etcd cluster.
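The `histogram_quantile` queries above estimate a latency quantile from cumulative bucket counts. The sketch below shows the general idea (find the bucket that contains the target rank, then interpolate linearly inside it). It is a simplified approximation of what Prometheus computes, and the bucket bounds and counts are invented.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, count_below = 0.0, 0.0
    for position, (upper_bound, cumulative) in enumerate(buckets):
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # The rank falls past the last finite bucket: report its upper bound.
                return buckets[position - 1][0] if position > 0 else math.nan
            in_bucket = cumulative - count_below
            return lower_bound + (upper_bound - lower_bound) * (rank - count_below) / in_bucket
        lower_bound, count_below = upper_bound, cumulative
    return math.nan


# Handling times in seconds: 40 requests took <= 0.1s, 80 took <= 0.5s, and so on.
buckets = [(0.1, 40.0), (0.5, 80.0), (1.0, 95.0), (math.inf, 100.0)]
print(histogram_quantile(0.9, buckets))
```

Because of the interpolation, the result is only as precise as the bucket boundaries allow. It is an estimate, not an exact percentile.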

119
Documentation/other_apis.md Normal file

@ -0,0 +1,119 @@
## Members API
* [List members](#list-members)
* [Add a member](#add-a-member)
* [Delete a member](#delete-a-member)
* [Change the peer urls of a member](#change-the-peer-urls-of-a-member)
## List members
Return an HTTP 200 OK response code and a representation of all members in the etcd cluster.
### Request
```
GET /v2/members HTTP/1.1
```
### Example
```sh
curl http://10.0.0.10:2379/v2/members
```
```json
{
"members": [
{
"id": "272e204152",
"name": "infra1",
"peerURLs": [
"http://10.0.0.10:2380"
],
"clientURLs": [
"http://10.0.0.10:2379"
]
},
{
"id": "2225373f43",
"name": "infra2",
"peerURLs": [
"http://10.0.0.11:2380"
],
"clientURLs": [
"http://10.0.0.11:2379"
]
}
]
}
```
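A client usually wants to turn this response into a lookup structure. The following is a minimal sketch that maps each member ID to its peer URLs. It assumes the response body has the shape shown above, and the function name is hypothetical.

```python
import json


def peer_urls_by_id(response_body):
    """Map member ID -> list of peer URLs from a GET /v2/members response body."""
    members = json.loads(response_body)["members"]
    return {member["id"]: member["peerURLs"] for member in members}


# Body copied from the example above (as valid JSON).
body = """
{
  "members": [
    {"id": "272e204152", "name": "infra1",
     "peerURLs": ["http://10.0.0.10:2380"], "clientURLs": ["http://10.0.0.10:2379"]},
    {"id": "2225373f43", "name": "infra2",
     "peerURLs": ["http://10.0.0.11:2380"], "clientURLs": ["http://10.0.0.11:2379"]}
  ]
}
"""

print(peer_urls_by_id(body))
```

Keying by `id` rather than `name` is a design choice for this sketch: the ID is what the delete and update endpoints below expect.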
## Add a member
Returns an HTTP 201 response code and the representation of the added member with a newly generated memberID when successful. Returns a string describing the failure condition when unsuccessful.
If the POST body is malformed an HTTP 400 will be returned. If the member exists in the cluster or existed in the cluster at some point in the past an HTTP 409 will be returned. If any of the given peerURLs exists in the cluster an HTTP 409 will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
### Request
```
POST /v2/members HTTP/1.1
{"peerURLs": ["http://10.0.0.10:2380"]}
```
### Example
```sh
curl http://10.0.0.10:2379/v2/members -XPOST \
-H "Content-Type: application/json" -d '{"peerURLs":["http://10.0.0.10:2380"]}'
```
```json
{
"id": "3777296169",
"peerURLs": [
"http://10.0.0.10:2380"
]
}
```
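The same request can be built from code. The sketch below uses only the Python standard library and stops short of sending anything, so it can be inspected safely. The function name is hypothetical and the addresses are the ones from the example above.

```python
import json
import urllib.request


def build_add_member_request(client_url, peer_urls):
    """Build (but do not send) the POST /v2/members request for a new member."""
    body = json.dumps({"peerURLs": peer_urls}).encode("utf-8")
    return urllib.request.Request(
        client_url.rstrip("/") + "/v2/members",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_add_member_request("http://10.0.0.10:2379", ["http://10.0.0.10:2380"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it with `urllib.request.urlopen(request)` should yield a 201 on success according to the description above. Note that `urlopen` raises `urllib.error.HTTPError` for the 400, 409 and 500 cases, so callers need to catch it to read the failure string.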
## Delete a member
Remove a member from the cluster. The member ID must be a hex-encoded uint64.
Returns 204 with empty content when successful. Returns a string describing the failure condition when unsuccessful.
If the member does not exist in the cluster an HTTP 500(TODO: fix this) will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
### Request
```
DELETE /v2/members/<id> HTTP/1.1
```
### Example
```sh
curl http://10.0.0.10:2379/v2/members/272e204152 -XDELETE
```
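Since the member ID must be a hex-encoded uint64, a client can validate it before issuing the request. This is a minimal sketch: the function name is hypothetical, and accepting both upper- and lower-case digits is an assumption.

```python
import re

# A uint64 needs at most 16 hexadecimal digits.
_MEMBER_ID = re.compile(r"[0-9a-fA-F]{1,16}")


def is_valid_member_id(member_id):
    """Return True if member_id looks like a hex-encoded uint64."""
    return _MEMBER_ID.fullmatch(member_id) is not None


print(is_valid_member_id("272e204152"))
print(is_valid_member_id("not-an-id"))
```

A regular expression is used instead of `int(member_id, 16)` because `int` also accepts forms such as a `0x` prefix, surrounding whitespace or underscores, which are not plain hex digits.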
## Change the peer urls of a member
Change the peer urls of a given member. The member ID must be a hex-encoded uint64. Returns 204 with empty content when successful. Returns a string describing the failure condition when unsuccessful.
If the PUT body is malformed an HTTP 400 will be returned. If the member does not exist in the cluster an HTTP 404 will be returned. If any of the given peerURLs exists in the cluster an HTTP 409 will be returned. If the cluster fails to process the request within timeout an HTTP 500 will be returned, though the request may be processed later.
### Request
```
PUT /v2/members/<id> HTTP/1.1
{"peerURLs": ["http://10.0.0.10:2380"]}
```
### Example
```sh
curl http://10.0.0.10:2379/v2/members/272e204152 -XPUT \
-H "Content-Type: application/json" -d '{"peerURLs":["http://10.0.0.10:2380"]}'
```
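The status codes documented for this endpoint can be collected into a small lookup so that callers report something more useful than a bare number. This is only a sketch of the table described above; the names are hypothetical.

```python
# Status codes for PUT /v2/members/<id>, as described in this section.
UPDATE_PEER_URLS_OUTCOMES = {
    204: "peer URLs changed",
    400: "malformed request body",
    404: "member does not exist in the cluster",
    409: "one of the given peerURLs already exists in the cluster",
    500: "request not processed within the timeout; it may still be processed later",
}


def describe_update_result(status_code):
    """Translate a status code into the outcome documented above."""
    return UPDATE_PEER_URLS_OUTCOMES.get(status_code, "status code not covered by this document")


print(describe_update_result(204))
print(describe_update_result(409))
```

The 500 case deserves care in callers: because the change may still be applied later, retrying blindly can run into a 409.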


@ -0,0 +1,62 @@
# FreeBSD
Starting with version 0.1.2 both etcd and etcdctl have been ported to FreeBSD and can
be installed either via packages or ports system. Their versions have been recently
updated to 0.2.0 so now you can enjoy using etcd and etcdctl on FreeBSD 10.0 (RC4 as
of now) and 9.x where they have been tested. They might also work when installed from
ports on earlier versions of FreeBSD, but your mileage may vary.
## Installation
### Using pkgng package system
1. If you do not have pkgng installed, install it by running `pkg` and answering 'Y'
when asked
2. Update your repository data with `pkg update`
3. Install etcd with `pkg install coreos-etcd coreos-etcdctl`
4. Verify successful installation with `pkg info | grep etcd` and you should get:
```
r@fbsd-10:/ # pkg info | grep etcd
coreos-etcd-0.2.0              Highly-available key value store and service discovery
coreos-etcdctl-0.2.0           Simple commandline client for etcd
r@fbsd-10:/ #
```
5. You're ready to use etcd and etcdctl! For more information about using pkgng, please
see: http://www.freebsd.org/doc/handbook/pkgng-intro.html
 
### Using ports system
1. If you do not have ports installed, install it with `portsnap fetch extract` (it
may take some time depending on your hardware and network connection)
2. Build etcd with `cd /usr/ports/devel/etcd && make install clean`; you
will get an option to build and install documentation and etcdctl with it.
3. If you haven't installed etcdctl with it, and you would like to install it later, you can build it
with `cd /usr/ports/devel/etcdctl && make install clean`
4. Verify successful installation with `pkg info | grep etcd` and you should get:

```
r@fbsd-10:/ # pkg info | grep etcd
coreos-etcd-0.2.0              Highly-available key value store and service discovery
coreos-etcdctl-0.2.0           Simple commandline client for etcd
r@fbsd-10:/ #
```
5. You're ready to use etcd and etcdctl! For more information about using the ports system,
please see: https://www.freebsd.org/doc/handbook/ports-using.html
## Issues
If you find any issues with the build/install procedure or you've found a problem that
you've verified is local to the FreeBSD version only (for example, by not being able to
reproduce it on any other platform, like OSX or Linux), please send a
problem report using this page for more
information: http://www.freebsd.org/send-pr.html


@ -1,142 +0,0 @@
# v3.5 data inconsistency postmortem
| | |
|---------|------------|
| Authors | serathius@ |
| Date | 2022-04-20 |
| Status | published |
## Summary
| | |
|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Summary | A code refactor in v3.5.0 resulted in the consistent index not being saved atomically. An independent crash could lead to committed transactions not being reflected on all the members. |
| Impact | No users reported problems in production, as triggering the issue required frequent crashes; however, the issue was critical enough to motivate a public statement. The main impact comes from losing user trust in etcd reliability. |
## Background
etcd v3 state is preserved on disk in two forms: the write-ahead log (WAL) and the database state (DB).
etcd v3.5 also still maintains v2 state; however, it's deprecated and not relevant to the issue in this postmortem.
The WAL stores the history of changes to etcd state, and the database represents the state at one point in that history.
To know which point of history the database is representing, it stores the consistent index (CI).
It's a special metadata field that points to the last entry in the WAL that the database has seen.
When etcd is updating database state, it replays entries from the WAL and updates the consistent index to point to the new entry.
This operation is required to be [atomic](https://en.wikipedia.org/wiki/Atomic_commit).
A partial failure would mean that the database and WAL no longer match, so some entries would be either skipped (if only the CI is updated) or executed twice (if only the changes are applied).
This is especially important for a distributed system like etcd, where there are multiple cluster members, each applying the WAL entries to their database.
Correctness of the system depends on the assumption that every member of the cluster, while replaying WAL entries, will reach the same state.
## Root cause
To simplify managing the consistent index, etcd introduced backend hooks in https://github.com/etcd-io/etcd/pull/12855.
The goal was to ensure that the consistent index is always updated, by automatically triggering the update during commit.
The implementation was as follows: before applying the WAL entries, etcd updated the in-memory value of the consistent index.
As part of the transaction commit process, a database hook would read the value of the consistent index and store it in the database.
The problem is that the in-memory value of the consistent index is shared, and there might be other in-flight transactions apart from the serial WAL apply flow.
So imagine the following scenario:
1. etcd server starts an apply workflow, and it just sets a new consistent index value.
2. The periodic commit is triggered, and it executes the backend hook and saves the consistent index from the apply workflow.
3. etcd server finishes the apply workflow, saves the new changes and saves the same value of the consistent index again.
Between the second and third points there is a very small window where the consistent index is increased without applying the entry from the WAL.
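The window described above can be reproduced with a toy model. The sketch below is not etcd code: `Backend`, `recover` and the two-entry WAL are simplified stand-ins invented for illustration, and the crash is modelled simply by never running step 3.

```python
class Backend:
    """Toy stand-in for the database: committed data plus the persisted consistent index."""

    def __init__(self):
        self.data = {}
        self.pending = {}
        self.consistent_index = 0

    def commit(self, shared_index):
        # Models the backend hook: every commit persists the shared in-memory index,
        # whether or not the matching changes have been written yet.
        self.data.update(self.pending)
        self.pending = {}
        self.consistent_index = shared_index


def recover(backend, wal):
    """Replay WAL entries newer than the persisted consistent index."""
    for index, (key, value) in enumerate(wal, start=1):
        if index > backend.consistent_index:
            backend.data[key] = value
            backend.consistent_index = index
    return backend.data


wal = [("a", "1"), ("b", "2")]  # WAL entries 1 and 2

crashed = Backend()
shared_index = 1
crashed.pending["a"] = "1"
crashed.commit(shared_index)   # entry 1 is applied and committed normally

shared_index = 2               # step 1: the apply workflow for entry 2 sets the index
crashed.commit(shared_index)   # step 2: the periodic commit persists index 2, no data yet
# Crash here: step 3 (writing "b" and committing again) never happens.

after_restart = recover(crashed, wal)
healthy = recover(Backend(), wal)
print(after_restart)
print(healthy)
```

After the restart, the persisted index already points at entry 2, so replay skips it and the key `b` never reaches the crashed member, while a member that replays the same WAL from scratch has both keys.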
## Trigger
If etcd crashed after the consistent index was saved, but before the apply workflow finished, it would lead to data inconsistency.
When recovering the data, etcd would skip executing changes from the failed apply workflow, assuming they had already been executed.
This matches the issue reports and the code used to reproduce the issue, where the trigger was etcd crashing under high request load.
etcd v3.5.0 was released with a bug (https://github.com/etcd-io/etcd/pull/13505) that could cause etcd to crash; it was fixed in v3.5.1.
Apart from that, all reports described etcd running under high memory pressure, causing it to go out of memory from time to time.
The reproduction ran etcd under high stress and randomly killed one of the members using the SIGKILL signal (non-recoverable immediate process death).
## Detection
For a single-member cluster it is totally undetectable.
There is no mechanism or tool for verifying that the database state matches the WAL.
In a cluster with multiple members, the member that crashed will be missing changes from the failed apply workflow.
This means that it will have a different database state and will return a different hash via the `HashKV` gRPC call.
There is an automatic mechanism to detect data inconsistency.
It can be executed during etcd start via `--experimental-initial-corrupt-check` and periodically via `--experimental-corrupt-check-time`.
Both checks, however, have a flaw: they depend on the `HashKV` gRPC method, which might fail, causing the check to pass.
In a multi-member etcd cluster, each member can run with different performance and be at a different stage of applying the WAL.
Comparing database hashes between multiple etcd members requires all hashes to be calculated at the same change.
This is done by requesting the hash for the same `revision` (version of the key-value store).
However, it will not work if the provided revision is not available on the members.
This can happen on very slow members, or in cases where corruption has led revision numbers to diverge.
This means that for this issue, the corrupt check is only reliable during etcd start just after etcd crashes.
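The limitation of comparing hashes at a fixed revision can be illustrated with a toy model. The sketch below is not the `HashKV` algorithm: it hashes a sorted dump of the keys with SHA-256 and treats revision N as "the state after the first N writes", both of which are simplifications chosen for illustration. It also reports an explicit `inconclusive` result to make the gap visible.

```python
import hashlib


def hash_at_revision(history, revision):
    """Hash the state after the first `revision` writes, or None if not available."""
    if revision > len(history):
        return None  # this member has not reached the requested revision
    state = {}
    for key, value in history[:revision]:
        state[key] = value
    dump = repr(sorted(state.items())).encode("utf-8")
    return hashlib.sha256(dump).hexdigest()


def check_consistency(members, revision):
    """Compare hashes of all members at one revision."""
    hashes = [hash_at_revision(history, revision) for history in members.values()]
    if any(h is None for h in hashes):
        return "inconclusive"
    return "consistent" if len(set(hashes)) == 1 else "corrupt"


# m3 skipped the write of "b" after a crash, so its revisions have diverged.
members = {
    "m1": [("a", "1"), ("b", "2"), ("c", "3")],
    "m2": [("a", "1"), ("b", "2"), ("c", "3")],
    "m3": [("a", "1"), ("c", "3")],
}

for revision in (1, 2, 3):
    print(revision, check_consistency(members, revision))
```

At revision 2 the mismatch is visible, but at revision 3 the diverged member cannot answer at all. A check that treats a missing answer as a pass would miss the corruption in exactly that case.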
## Impact
We are not aware of any cases of users reporting data corruption in a production environment.
However, the issue was critical enough to motivate a public statement.
The main impact comes from losing user trust in etcd reliability.
## Lessons learned
### What went well
* Multiple maintainers were able to work effectively on reproducing and fixing the issue. As they are in different timezones, there was always someone working on the issue.
* When fixing the main data inconsistency we found multiple other edge cases that could lead to data corruption (https://github.com/etcd-io/etcd/issues/13514, https://github.com/etcd-io/etcd/issues/13922, https://github.com/etcd-io/etcd/issues/13937).
### What went wrong
* No users enable data corruption detection, as it is still an experimental feature introduced in v3.3. All reported cases were detected manually, making it almost impossible to reproduce.
* etcd has functional tests designed to detect such problems; however, they are unmaintained, flaky and are missing crucial scenarios.
* The etcd v3.5 release was not qualified as comprehensively as previous ones. Previous maintainers ran a manual qualification process that is no longer known or executed.
* etcd apply code is so complicated that fixing the data inconsistency took almost 2 weeks and multiple tries. The fix needed to be so complicated that we needed to develop automatic validation for it (https://github.com/etcd-io/etcd/pull/13885).
* etcd v3.5 was recommended for production without enough insight into production adoption. Production-ready recommendations were based on some internal feedback... to get diverse usage, but users hold on until someone else discovers the issues.
### Where we got lucky
* We reproduced the issue using the etcd functional tests only because of a weird partition setup on a workstation. Functional tests store etcd data under `/tmp`, which is usually mounted to an in-memory filesystem. The problem was reproduced only because one of the maintainers has `/tmp` mounted to a standard disk.
## Action items
Action items should directly address items listed in lessons learned.
We should double down on things that went well, fix things that went wrong, and stop depending on luck.
Action items fall under three types, and we should have at least one item per type. Types:
* Prevent - Prevent similar issues from occurring. In this case, what testing we should introduce to find data inconsistency issues before release, preventing publishing broken release.
* Detect - Be more effective in detecting when similar issues occur. In this case, improve mechanism to detect data inconsistency issue so users will be automatically informed.
* Mitigate - Reduce time to recovery for users. In this case, how we ensure that users are able to quickly fix data inconsistency.
Actions should not be restricted to fixing the immediate issues and should also propose long term strategic improvements.
To reflect this, action items should have an assigned priority:
* P0 - Critical for reliability of the v3.5 release. Should be prioritized over all other work and backported to v3.5.
* P1 - Important for long term success of the project. Blocks v3.6 release.
* P2 - Stretch goals that would be nice to have for v3.6, however should not be blocking.
| Action Item | Type | Priority | Bug | Status |
|-------------------------------------------------------------------------------------|----------|----------|----------------------------------------------|--------|
| etcd testing can reproduce historical data inconsistency issues | Prevent | P0 | https://github.com/etcd-io/etcd/issues/14045 | DONE |
| etcd detects data corruption by default | Detect | P0 | https://github.com/etcd-io/etcd/issues/14039 | DONE |
| etcd testing is high quality, easy to maintain and expand | Prevent | P1 | https://github.com/etcd-io/etcd/issues/13637 | |
| etcd apply code should be easy to understand and validate correctness | Prevent | P1 | | |
| Critical etcd features are not abandoned when contributors move on | Prevent | P1 | https://github.com/etcd-io/etcd/issues/13775 | DONE |
| etcd is continuously qualified with failure injection | Prevent | P1 | https://github.com/etcd-io/etcd/pull/14911 | DONE |
| etcd can reliably detect data corruption (hash is linearizable) | Detect | P1 | | |
| etcd checks consistency of snapshots sent between leader and followers | Detect | P1 | https://github.com/etcd-io/etcd/issues/13973 | DONE |
| etcd recovery from data inconsistency procedures are documented and tested | Mitigate | P1 | | |
| etcd can immediately detect and recover from data corruption (implement Merkle root) | Mitigate | P2 | https://github.com/etcd-io/etcd/issues/13839 | |
## Timeline
| Date | Event |
|------------|-----------------------------------------------------------------------------------------------------------------------|
| 2021-05-08 | Pull request that caused data corruption was merged - https://github.com/etcd-io/etcd/pull/12855 |
| 2021-06-16 | Release v3.5.0 with data corruption was published - https://github.com/etcd-io/etcd/releases/tag/v3.5.0 |
| 2021-12-01 | Report of data corruption - https://github.com/etcd-io/etcd/issues/13514 |
| 2022-01-28 | Report of data corruption - https://github.com/etcd-io/etcd/issues/13654 |
| 2022-03-08 | Report of data corruption - https://github.com/etcd-io/etcd/issues/13766 |
| 2022-03-25 | Corruption confirmed by one of the maintainers - https://github.com/etcd-io/etcd/issues/13766#issuecomment-1078897588 |
| 2022-03-29 | Statement about the corruption was sent to etcd-dev@googlegroups.com and dev@kubernetes.io |
| 2022-04-24 | Release v3.5.3 with fix was published - https://github.com/etcd-io/etcd/releases/tag/v3.5.3 |


@ -0,0 +1,4 @@
etcd is being used successfully by many companies in production. It is,
however, under active development and systems like etcd are difficult to get
correct. If you are comfortable with bleeding-edge software please use etcd and
provide us with the feedback and testing young software needs.

36
Documentation/proxy.md Normal file

@ -0,0 +1,36 @@
## Proxy
etcd can now run as a transparent proxy. Running etcd as a proxy allows for easy discovery of etcd within your infrastructure, since it can run on each machine as a local service. In this mode, etcd acts as a reverse proxy and forwards client requests to an active etcd cluster. The etcd proxy does not participate in the consensus replication of the etcd cluster, so it neither increases the resilience nor decreases the write performance of the etcd cluster.
etcd currently supports two proxy modes: `readwrite` and `readonly`. The default mode is `readwrite`, which forwards both read and write requests to the etcd cluster. A `readonly` etcd proxy only forwards read requests to the etcd cluster, and returns `HTTP 501` to all write requests.
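The difference between the two modes can be sketched as a small decision function. This is an illustration only, not etcd's proxy code: the name `proxy_decision` is hypothetical, and treating only `GET` as a read request is an assumption.

```python
from http import HTTPStatus

# Assumption for this sketch: only GET counts as a read request.
READ_METHODS = frozenset({"GET"})


def proxy_decision(mode, method):
    """Return "forward", or the status line a proxy in `mode` answers with."""
    if mode not in ("readwrite", "readonly"):
        raise ValueError(f"unknown proxy mode: {mode!r}")
    if mode == "readonly" and method.upper() not in READ_METHODS:
        status = HTTPStatus.NOT_IMPLEMENTED  # HTTP 501
        return f"{status.value} {status.phrase}"
    return "forward"
```

A `readwrite` proxy forwards every request, while a `readonly` proxy answers writes itself without contacting the cluster.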
The proxy will shuffle the list of cluster members periodically to avoid sending all connections to a single member.
The member list used by the proxy consists of all client URLs advertised within the cluster, as specified in each member's `-advertise-client-urls` flag. If this flag is set incorrectly, requests sent to the proxy are forwarded to the wrong addresses and then fail. The fix for this problem is to restart the etcd member with the correct `-advertise-client-urls` flag. After the client URL list in the proxy is recalculated, which happens every 30 seconds, requests will be forwarded correctly.
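As a rough sketch of how such an endpoint list could be assembled, the function below flattens every member's advertised client URLs and shuffles them. The name `build_endpoints` and the dictionary layout are hypothetical; only the 30-second interval comes from the text above.

```python
import random

REFRESH_INTERVAL_SECONDS = 30  # recalculation period stated in the text


def build_endpoints(members, rng):
    """Flatten each member's advertised client URLs and shuffle the result.

    `members` maps a member name to its list of advertised client URLs.
    """
    endpoints = [url for urls in members.values() for url in urls]
    rng.shuffle(endpoints)  # avoid always hitting the same member first
    return endpoints
```

Because the list is built purely from what members advertise, a wrong `-advertise-client-urls` value ends up in the list unchanged, which is why the member itself has to be restarted with the right value.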
### Using an etcd proxy
To start etcd in proxy mode, you need to provide three flags: `proxy`, `listen-client-urls`, and `initial-cluster` (or `discovery`).
To start a readwrite proxy, set `-proxy on`; to start a readonly proxy, set `-proxy readonly`.
The proxy will listen on `listen-client-urls` and forward requests to the etcd cluster discovered from the `initial-cluster` or `discovery` URL.
#### Start an etcd proxy with a static configuration
To start a proxy that will connect to a statically defined etcd cluster, specify the `initial-cluster` flag:
```
etcd -proxy on -listen-client-urls http://127.0.0.1:8080 -initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380
```
#### Start an etcd proxy with the discovery service
If you bootstrap an etcd cluster using the [discovery service][discovery-service], you can also start the proxy with the same `discovery`.
To start a proxy using the discovery service, specify the `discovery` flag. The proxy will wait until the etcd cluster defined at the `discovery` url finishes bootstrapping, and then start to forward the requests.
```
etcd -proxy on -listen-client-urls http://127.0.0.1:8080 -discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de
```
#### Fallback to proxy mode with discovery service
If you bootstrap an etcd cluster using the [discovery service][discovery-service] with more than the expected number of etcd members, the extra etcd processes will fall back to being `readwrite` proxies by default. They will forward the requests to the cluster as described above. For example, if you create a discovery URL with `size=5`, and start ten etcd processes using that same discovery URL, the result will be a cluster with five etcd members and five proxies. Note that this behaviour can be disabled with the `discovery-fallback` flag.
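The arithmetic of the fallback can be sketched as follows. The function `assign_roles` is hypothetical, and two details are assumptions: that processes are considered in the order they register, and that an extra process simply exits when fallback is disabled.

```python
def assign_roles(discovery_size, names, proxy_fallback=True):
    """Map each process name to "member", "proxy", or "exit".

    `names` is assumed to be in registration order.
    """
    roles = {}
    for position, name in enumerate(names):
        if position < discovery_size:
            roles[name] = "member"
        elif proxy_fallback:
            roles[name] = "proxy"
        else:
            roles[name] = "exit"  # assumption: extra processes stop
    return roles
```

With `discovery_size=5` and ten names, five entries come out as `"member"` and five as `"proxy"`, matching the example in the paragraph above.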
[discovery-service]: clustering.md#discovery

191
Documentation/rfc/v3api.md Normal file

@ -0,0 +1,191 @@
## Design
1. Flat binary key-value space
2. Keep the event history until compaction
    - access to old versions of keys
    - user-controlled history compaction
3. Support range queries
    - pagination support with a limit argument
    - consistency guarantee across multiple range queries
4. Replace TTL keys with Leases
    - more efficient, low-cost keep-alive
    - a logical group of TTL keys
5. Replace CAS/CAD with multi-object Tnx
    - MUCH MORE powerful and flexible
6. Support efficient watching with multiple ranges
7. RPC API supports the complete set of APIs
    - more efficient than JSON/HTTP
    - additional tnx/lease support
8. HTTP API supports a subset of APIs
    - easy for people to try out etcd
    - easy for people to write simple etcd applications
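To make the lease idea in item 4 concrete, here is a toy in-memory model. It is a sketch under assumptions, not the proposed implementation: the class `LeaseStore` and its method names are hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```python
class LeaseStore:
    """Toy key-value store whose keys can be grouped under a lease with a TTL."""

    def __init__(self):
        self.kv = {}      # key -> value
        self.leases = {}  # lease_id -> {"ttl", "deadline", "keys"}
        self._next_id = 1

    def lease_create(self, ttl, now):
        lease_id = self._next_id
        self._next_id += 1
        self.leases[lease_id] = {"ttl": ttl, "deadline": now + ttl, "keys": set()}
        return lease_id

    def lease_attach(self, lease_id, key):
        self.leases[lease_id]["keys"].add(key)

    def lease_keep_alive(self, lease_id, now):
        # One keep-alive refreshes every key attached to the lease.
        lease = self.leases[lease_id]
        lease["deadline"] = now + lease["ttl"]

    def expire(self, now):
        """Drop leases whose deadline has passed, together with their keys."""
        expired = [lid for lid, l in self.leases.items() if l["deadline"] <= now]
        for lid in expired:
            for key in self.leases.pop(lid)["keys"]:
                self.kv.pop(key, None)
        return expired
```

The point of the design is visible in `lease_keep_alive`: the cost of keeping a group of keys alive is one refresh per lease rather than one per key.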
## Protobuf Defined API
[protobuf](./v3api.proto)
### Examples
#### Put a key (foo=bar)
```
// A put is always successful
Put( PutRequest { key = foo, value = bar } )
PutResponse {
cluster_id = 0x1000,
member_id = 0x1,
index = 1,
raft_term = 0x1,
}
```
#### Get a key (assume we have foo=bar)
```
Get ( RangeRequest { key = foo } )
RangeResponse {
cluster_id = 0x1000,
member_id = 0x1,
index = 1,
raft_term = 0x1,
kvs = {
{
key = foo,
value = bar,
create_index = 1,
mod_index = 1,
version = 1;
},
},
}
```
#### Range over a key space (assume we have foo0=bar0… foo100=bar100)
```
Range ( RangeRequest { key = foo, end_key = foo80, limit = 30 } )
RangeResponse {
cluster_id = 0x1000,
member_id = 0x1,
index = 100,
raft_term = 0x1,
kvs = {
{
key = foo0,
value = bar0,
create_index = 1,
mod_index = 1,
version = 1;
},
...,
{
key = foo30,
value = bar30,
create_index = 30,
mod_index = 30,
version = 1;
},
},
}
```
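The semantics of a half-open range with a limit can be sketched in a few lines. This is illustrative only: `range_query` is a hypothetical helper, keys are plain strings rather than bytes, and byte-wise lexicographic ordering of keys is an assumption of the sketch.

```python
import bisect


def range_query(sorted_keys, store, key, end_key=None, limit=0):
    """Return (key, value) pairs in [key, end_key); a limit of 0 means no limit."""
    if end_key is None:
        # Without an end key the request addresses a single key.
        return [(key, store[key])] if key in store else []
    lo = bisect.bisect_left(sorted_keys, key)
    hi = bisect.bisect_left(sorted_keys, end_key)  # end_key itself is excluded
    selected = sorted_keys[lo:hi]
    if limit > 0:
        selected = selected[:limit]
    return [(k, store[k]) for k in selected]
```

Under this ordering assumption `foo10` sorts before `foo2`, so a client that wants numeric order would need zero-padded key names.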
#### Finish a tnx (assume we have foo0=bar0, foo1=bar1)
```
Tnx(TnxRequest {
    // mod_index of foo0 is equal to 1, mod_index of foo1 is greater than 1
    compare = {
        {compareType = equal, key = foo0, mod_index = 1},
        {compareType = greater, key = foo1, mod_index = 1},
    },
    // if the comparison succeeds, put foo2 = success
    success = {PutRequest { key = foo2, value = success }},
    // if the comparison fails, put foo2 = failure
    failure = {PutRequest { key = foo2, value = failure }},
}
)
TnxResponse {
cluster_id = 0x1000,
member_id = 0x1,
index = 3,
raft_term = 0x1,
succeeded = true,
responses = {
// response of PUT foo2=success
{
cluster_id = 0x1000,
member_id = 0x1,
index = 3,
raft_term = 0x1,
}
}
}
```
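The compare-then-apply behaviour of a transaction can be modelled with a small in-memory store. This is a sketch under stated assumptions rather than the proposed implementation: the class `Store`, the tuple layouts, and the rule that a comparison against a missing key fails are all hypothetical choices made for the example.

```python
import operator

COMPARATORS = {"equal": operator.eq, "greater": operator.gt, "less": operator.lt}


class Store:
    def __init__(self):
        self.index = 0
        self.kvs = {}  # key -> {"value", "create_index", "mod_index", "version"}

    def _apply_put(self, key, value):
        old = self.kvs.get(key)
        self.kvs[key] = {
            "value": value,
            "create_index": old["create_index"] if old else self.index,
            "mod_index": self.index,
            "version": old["version"] + 1 if old else 1,
        }

    def put(self, key, value):
        self.index += 1
        self._apply_put(key, value)

    def tnx(self, compare, success, failure):
        """compare: (type, key, field, target) tuples; success/failure: (key, value) puts."""
        succeeded = all(
            key in self.kvs and COMPARATORS[ctype](self.kvs[key][field], target)
            for ctype, key, field, target in compare
        )
        self.index += 1  # the whole transaction shares one index
        for key, value in (success if succeeded else failure):
            self._apply_put(key, value)
        return succeeded
```

Replaying the example above (put `foo0`, put `foo1`, then the transaction) leaves the store at index 3 with `foo2` set to `success`, because `foo0` has `mod_index` 1 and `foo1` has `mod_index` 2.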
#### Watch on a key/range
```
Watch( WatchRequest{
key = foo,
end_key = fop, // prefix foo
start_index = 20,
end_index = 10000,
// server decided notification frequency
progress_notification = true,
}
… // this can be a watch request stream
)
// put (foo0=bar0) event at 3
WatchResponse {
cluster_id = 0x1000,
member_id = 0x1,
index = 3,
raft_term = 0x1,
event_type = put,
kv = {
key = foo0,
value = bar0,
create_index = 1,
mod_index = 1,
version = 1;
},
}
// a notification at 2000
WatchResponse {
cluster_id = 0x1000,
member_id = 0x1,
index = 2000,
raft_term = 0x1,
// nil event as notification
}
// put (foo0=bar3000) event at 3000
WatchResponse {
cluster_id = 0x1000,
member_id = 0x1,
index = 3000,
raft_term = 0x1,
event_type = put,
kv = {
key = foo0,
value = bar3000,
create_index = 1,
mod_index = 3000,
version = 2;
},
}
```
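The watch request above uses `end_key = fop` to cover every key with the prefix `foo`. A minimal sketch of how such an end key can be derived from a prefix follows; the helper name `prefix_range_end` is hypothetical, and returning `None` when no upper bound exists is a choice made for this example.

```python
def prefix_range_end(prefix):
    """Return the smallest key greater than every key that starts with `prefix`.

    Returns None when there is no such key (empty prefix or only 0xff bytes).
    """
    trimmed = prefix.rstrip(b"\xff")  # a trailing 0xff cannot be incremented
    if not trimmed:
        return None
    return trimmed[:-1] + bytes([trimmed[-1] + 1])
```

For example, `prefix_range_end(b"foo")` yields `b"fop"`, so the half-open range `[foo, fop)` contains exactly the keys beginning with `foo`.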


@ -0,0 +1,272 @@
syntax = "proto3";
// Interface exported by the server.
service etcd {
// Range gets the keys in the range from the store.
rpc Range(RangeRequest) returns (RangeResponse) {}
// Put puts the given key into the store.
// A put request increases the index of the store,
// and generates one event in the event history.
rpc Put(PutRequest) returns (PutResponse) {}
// Delete deletes the given range from the store.
// A delete request increases the index of the store,
// and generates one event in the event history.
rpc DeleteRange(DeleteRangeRequest) returns (DeleteRangeResponse) {}
// Tnx processes all the requests in one transaction.
// A tnx request increases the index of the store,
// and generates events with the same index in the event history.
rpc Tnx(TnxRequest) returns (TnxResponse) {}
// Watch watches for events that are happening or have happened in etcd. Both input
// and output are streams. One watch rpc can watch multiple ranges and get a stream of
// events. The whole event history can be watched unless compacted.
rpc WatchRange(stream WatchRangeRequest) returns (stream WatchRangeResponse) {}
// Compact compacts the event history in etcd. User should compact the
// event history periodically, or it will grow infinitely.
rpc Compact(CompactionRequest) returns (CompactionResponse) {}
// LeaseCreate creates a lease. A lease has a TTL. The lease will expire if the
// server does not receive a keepAlive within TTL from the lease holder.
// All keys attached to the lease will be expired and deleted if the lease expires.
// The key expiration generates an event in event history.
rpc LeaseCreate(LeaseCreateRequest) returns (LeaseCreateResponse) {}
// LeaseRevoke revokes a lease. All the keys attached to the lease will be expired and deleted.
rpc LeaseRevoke(LeaseRevokeRequest) returns (LeaseRevokeResponse) {}
// LeaseAttach attaches keys with a lease.
rpc LeaseAttach(LeaseAttachRequest) returns (LeaseAttachResponse) {}
// LeaseTnx is like Tnx. It has two additional LeaseAttachRequest lists: success and failure.
// If the Tnx is successful, the success list is executed. Otherwise, the failure list
// is executed.
rpc LeaseTnx(LeaseTnxRequest) returns (LeaseTnxResponse) {}
// KeepAlive keeps the lease alive.
rpc LeaseKeepAlive(stream LeaseKeepAliveRequest) returns (stream LeaseKeepAliveResponse) {}
}
message ResponseHeader {
// an error type message?
optional string error = 1;
optional uint64 cluster_id = 2;
optional uint64 member_id = 3;
// index of the store when the request was applied.
optional int64 index = 4;
// term of raft when the request was applied.
optional uint64 raft_term = 5;
}
message RangeRequest {
// if the range_end is not given, the request returns the key.
optional bytes key = 1;
// if the range_end is given, it gets the keys in range [key, range_end).
optional bytes range_end = 2;
// limit the number of keys returned.
optional int64 limit = 3;
// the response will be consistent with the previous request with the same token if the
// token is given and is valid.
optional bytes consistent_token = 4;
}
message RangeResponse {
optional ResponseHeader header = 1;
repeated KeyValue kvs = 2;
optional bytes consistent_token = 3;
}
message PutRequest {
optional bytes key = 1;
optional bytes value = 2;
}
message PutResponse {
optional ResponseHeader header = 1;
}
message DeleteRangeRequest {
// if the range_end is not given, the request deletes the key.
optional bytes key = 1;
// if the range_end is given, it deletes the keys in range [key, range_end).
optional bytes range_end = 2;
}
message DeleteRangeResponse {
optional ResponseHeader header = 1;
}
message RequestUnion {
oneof request {
RangeRequest request_range = 1;
PutRequest request_put = 2;
DeleteRangeRequest request_delete_range = 3;
}
}
message ResponseUnion {
oneof response {
RangeResponse response_range = 1;
PutResponse response_put = 2;
DeleteRangeResponse response_delete_range = 3;
}
}
message Compare {
enum CompareType {
EQUAL = 0;
GREATER = 1;
LESS = 2;
}
optional CompareType type = 1;
// key path
optional bytes key = 2;
oneof target {
// version of the given key
int64 version = 3;
// create index of the given key
int64 create_index = 4;
// last modified index of the given key
int64 mod_index = 5;
// value of the given key
bytes value = 6;
}
}
// First, all the compare requests are processed.
// If all the comparisons succeed, all the success
// requests will be processed.
// Otherwise, all the failure requests will be processed and
// all the errors in the comparison will be returned.
// From google paxosdb paper:
// Our implementation hinges around a powerful primitive which we call MultiOp. All other database
// operations except for iteration are implemented as a single call to MultiOp. A MultiOp is applied atomically
// and consists of three components:
// 1. A list of tests called guard. Each test in guard checks a single entry in the database. It may check
// for the absence or presence of a value, or compare with a given value. Two different tests in the guard
// may apply to the same or different entries in the database. All tests in the guard are applied and
// MultiOp returns the results. If all tests are true, MultiOp executes t op (see item 2 below), otherwise
// it executes f op (see item 3 below).
// 2. A list of database operations called t op. Each operation in the list is either an insert, delete, or
// lookup operation, and applies to a single database entry. Two different operations in the list may apply
// to the same or different entries in the database. These operations are executed if guard
// evaluates to true.
// 3. A list of database operations called f op. Like t op, but executed if guard evaluates to false.
message TnxRequest {
repeated Compare compare = 1;
repeated RequestUnion success = 2;
repeated RequestUnion failure = 3;
}
message TnxResponse {
optional ResponseHeader header = 1;
optional bool succeeded = 2;
repeated ResponseUnion responses = 3;
}
message KeyValue {
optional bytes key = 1;
// create_index is the index at which the key was created.
optional int64 create_index = 2;
// mod_index is the last modified index of the key.
optional int64 mod_index = 3;
// version is the version of the key. A deletion resets
// the version to zero and any modification of the key
// increases its version.
optional int64 version = 4;
optional bytes value = 5;
}
message WatchRangeRequest {
// if the range_end is not given, the request returns the key.
optional bytes key = 1;
// if the range_end is given, it gets the keys in range [key, range_end).
optional bytes range_end = 2;
// start_index is an optional index (inclusive) to watch from. No start_index is "now".
optional int64 start_index = 3;
// end_index is an optional index (exclusive) to end watch. No end_index is "forever".
optional int64 end_index = 4;
optional bool progress_notification = 5;
}
message WatchRangeResponse {
optional ResponseHeader header = 1;
repeated Event events = 2;
}
message Event {
enum EventType {
PUT = 0;
DELETE = 1;
EXPIRE = 2;
}
optional EventType event_type = 1;
// a put event contains the current key-value
// a delete/expire event contains the previous
// key-value
optional KeyValue kv = 2;
}
message CompactionRequest {
optional int64 index = 1;
}
message CompactionResponse {
optional ResponseHeader header = 1;
}
message LeaseCreateRequest {
// advisory ttl in seconds
optional int64 ttl = 1;
}
message LeaseCreateResponse {
optional ResponseHeader header = 1;
optional int64 lease_id = 2;
// server decided ttl in seconds
optional int64 ttl = 3;
optional string error = 4;
}
message LeaseRevokeRequest {
optional int64 lease_id = 1;
}
message LeaseRevokeResponse {
optional ResponseHeader header = 1;
}
message LeaseTnxRequest {
optional TnxRequest request = 1;
repeated LeaseAttachRequest success = 2;
repeated LeaseAttachRequest failure = 3;
}
message LeaseTnxResponse {
optional ResponseHeader header = 1;
optional TnxResponse response = 2;
repeated LeaseAttachResponse attach_responses = 3;
}
message LeaseAttachRequest {
optional int64 lease_id = 1;
optional bytes key = 2;
}
message LeaseAttachResponse {
optional ResponseHeader header = 1;
}
message LeaseKeepAliveRequest {
optional int64 lease_id = 1;
}
message LeaseKeepAliveResponse {
optional ResponseHeader header = 1;
optional int64 lease_id = 2;
optional int64 ttl = 3;
}


@ -0,0 +1,151 @@
## Runtime Reconfiguration
etcd comes with support for incremental runtime reconfiguration, which allows users to update the membership of the cluster at run time.
Reconfiguration requests can only be processed when the majority of the cluster members are functioning. It is **highly recommended** to always have a cluster size greater than two in production. It is unsafe to remove a member from a two-member cluster. The majority of a two-member cluster is also two. If there is a failure during the removal process, the cluster might not be able to make progress and will need to be [restarted from majority failure][majority failure].
[majority failure]: #restart-cluster-from-majority-failure
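The majority arithmetic behind this recommendation is short enough to write out. The function names below are hypothetical; the formula is the usual simple majority of the cluster size.

```python
def quorum(cluster_size):
    """Smallest number of members that forms a majority."""
    return cluster_size // 2 + 1


def failure_tolerance(cluster_size):
    """How many members can fail while a majority is still available."""
    return cluster_size - quorum(cluster_size)
```

A two-member cluster has a quorum of two and a failure tolerance of zero, which is why losing either member during a removal can leave it unable to make progress. A three-member cluster also has a quorum of two but tolerates one failure.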
## Reconfiguration Use Cases
Let us walk through some common reasons for reconfiguring a cluster. Most of these just involve combinations of adding or removing a member, which are explained below under [Cluster Reconfiguration Operations](#cluster-reconfiguration-operations).
### Cycle or Upgrade Multiple Machines
If you need to move multiple members of your cluster due to planned maintenance (hardware upgrades, network downtime, etc.), it is recommended to modify members one at a time.
It is safe to remove the leader; however, there is a brief period of downtime while the election process takes place. If your cluster holds more than 50MB, it is recommended to [migrate the member's data directory][member migration].
[member migration]: admin_guide.md#member-migration
### Change the Cluster Size
Increasing the cluster size can enhance [failure tolerance][fault tolerance table] and provide better read performance. Since clients can read from any member, increasing the number of members increases the overall read throughput.
Decreasing the cluster size can improve the write performance of a cluster, with a trade-off of decreased resilience. Writes into the cluster are replicated to a majority of members of the cluster before being considered committed. Decreasing the cluster size lowers the majority, so each write is committed more quickly.
[fault tolerance table]: admin_guide.md#fault-tolerance-table
### Replace A Failed Machine
If a machine fails due to hardware failure, data directory corruption, or some other fatal situation, it should be replaced as soon as possible. Machines that have failed but haven't been removed adversely affect your quorum and reduce the tolerance for an additional failure.
To replace the machine, follow the instructions for [removing the member][remove member] from the cluster, and then [add a new member][add member] in its place. If your cluster holds more than 50MB, it is recommended to [migrate the failed member's data directory][member migration] if you can still access it.
[remove member]: #remove-a-member
[add member]: #add-a-new-member
### Restart Cluster from Majority Failure
If the majority of your cluster is lost, then you need to take manual action in order to recover safely.
The basic steps in the recovery process include [creating a new cluster using the old data][disaster recovery], forcing a single member to act as the leader, and finally using runtime configuration to [add new members][add member] to this new cluster one at a time.
[add member]: #add-a-new-member
[disaster recovery]: admin_guide.md#disaster-recovery
## Cluster Reconfiguration Operations
Now that we have the use cases in mind, let us lay out the operations involved in each.
Before making any change, the simple majority (quorum) of etcd members must be available.
This is essentially the same requirement as for any other write to etcd.
All changes to the cluster are done one at a time:
* To replace a single member you will make an add then a remove operation
* To increase from 3 to 5 members you will make two add operations
* To decrease from 5 to 3 you will make two remove operations
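The one-at-a-time rule can be sketched as a planner that breaks a size change into single-member steps. The function `plan_resize` is hypothetical and only models the counting; each step would correspond to one `etcdctl member add` or `etcdctl member remove` call.

```python
def plan_resize(current, target):
    """Return (operation, resulting_size) steps from `current` to `target` members."""
    if current < 1 or target < 1:
        raise ValueError("cluster sizes must be positive")
    step = 1 if target > current else -1
    operation = "add" if step == 1 else "remove"
    plan = []
    size = current
    while size != target:
        size += step  # exactly one membership change per step
        plan.append((operation, size))
    return plan
```

Going from 3 to 5 members gives two `add` steps with intermediate sizes 4 and 5, and going from 5 to 3 gives two `remove` steps.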
All of these examples will use the `etcdctl` command line tool that ships with etcd.
If you want to use the member API directly you can find the documentation [here](other_apis.md).
### Remove a Member
First, we need to find the target member's ID. You can list all members with `etcdctl`:
```
$ etcdctl member list
6e3bd23ae5f1eae0: name=node2 peerURLs=http://localhost:7002 clientURLs=http://127.0.0.1:4002
924e2e83e93f2560: name=node3 peerURLs=http://localhost:7003 clientURLs=http://127.0.0.1:4003
a8266ecf031671f3: name=node1 peerURLs=http://localhost:7001 clientURLs=http://127.0.0.1:4001
```
Let us say the member ID we want to remove is a8266ecf031671f3.
We then use the `remove` command to perform the removal:
```
$ etcdctl member remove a8266ecf031671f3
Removed member a8266ecf031671f3 from cluster
```
The target member will stop itself at this point and print out the removal in the log:
```
etcd: this member has been permanently removed from the cluster. Exiting.
```
It is safe to remove the leader; however, the cluster will be inactive while a new leader is elected. This duration is normally the election timeout plus the time taken by the voting process.
### Add a New Member
Adding a member is a two step process:
* Add the new member to the cluster via the [members API](other_apis.md#post-v2members) or the `etcdctl member add` command.
* Start the new member with the new cluster configuration, including a list of the updated members (existing members + the new member).
Using `etcdctl` let's add the new member to the cluster by specifying its [name](configuration.md#-name) and [advertised peer URLs](configuration.md#-initial-advertise-peer-urls):
```
$ etcdctl member add infra3 http://10.0.1.13:2380
added member 9bf1b35fc7761a23 to cluster
ETCD_NAME="infra3"
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE=existing
```
`etcdctl` has informed the cluster about the new member and printed out the environment variables needed to successfully start it.
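The shape of those variables can be reproduced with a few lines of code, which may help when scripting the start of a new member. This is a sketch only: `initial_cluster` and `member_environment` are hypothetical helpers, and the real values should always be taken from the `etcdctl member add` output.

```python
def initial_cluster(members):
    """Join name -> peer URL pairs into the `name=url,name=url` form."""
    return ",".join(f"{name}={url}" for name, url in members.items())


def member_environment(existing, new_name, new_peer_url):
    """Build the three variables for a member joining an existing cluster."""
    members = dict(existing)  # existing members first, then the new one
    members[new_name] = new_peer_url
    return {
        "ETCD_NAME": new_name,
        "ETCD_INITIAL_CLUSTER": initial_cluster(members),
        "ETCD_INITIAL_CLUSTER_STATE": "existing",
    }
```

The list must contain the existing members plus the new one, with the new member's peer URL exactly as it was given to `etcdctl member add`.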
Now start the new etcd process with the relevant flags for the new member:
```
$ export ETCD_NAME="infra3"
$ export ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
$ export ETCD_INITIAL_CLUSTER_STATE=existing
$ etcd -listen-client-urls http://10.0.1.13:2379 -advertise-client-urls http://10.0.1.13:2379 -listen-peer-urls http://10.0.1.13:2380 -initial-advertise-peer-urls http://10.0.1.13:2380
```
The new member will run as a part of the cluster and immediately begin catching up with the rest of the cluster.
If you are adding multiple members the best practice is to configure a single member at a time and verify it starts correctly before adding more new members.
If you add a new member to a 1-node cluster, the cluster cannot make progress before the new member starts, because it needs two members as a majority to reach consensus. You will only see this behavior between the time `etcdctl member add` informs the cluster about the new member and the time the new member successfully establishes a connection to the existing one.
#### Error Cases
In the following case we have not included our new host in the list of enumerated nodes.
If this is a new cluster, the node must be added to the list of initial cluster members.
```
$ etcd -name infra3 \
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
-initial-cluster-state existing
etcdserver: assign ids error: the member count is unequal
exit 1
```
In this case we give a different address (10.0.1.14:2380) from the one that we used to join the cluster (10.0.1.13:2380).
```
$ etcd -name infra4 \
-initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra4=http://10.0.1.14:2380 \
-initial-cluster-state existing
etcdserver: assign ids error: unmatched member while checking PeerURLs
exit 1
```
When we start etcd using the data directory of a removed member, etcd will exit automatically if it connects to any alive member in the cluster:
```
$ etcd
etcd: this member has been permanently removed from the cluster. Exiting.
exit 1
```

182
Documentation/security.md Normal file

@ -0,0 +1,182 @@
# Security model
etcd supports SSL/TLS as well as authentication through client certificates, for both client-to-server and peer (server-to-server / cluster) communication.
To get up and running you first need to have a CA certificate and a signed key pair for one member. It is recommended to create and sign a new key pair for every member in a cluster.
For convenience, the [etcd-ca](https://github.com/coreos/etcd-ca) tool provides an easy interface to certificate generation. Alternatively, this site provides a good reference on how to generate self-signed key pairs:
http://www.g-loaded.eu/2005/11/10/be-your-own-ca/
## Basic setup
etcd takes several certificate related configuration options, either through command-line flags or environment variables:
**Client-to-server communication:**
`--cert-file=<path>`: Certificate used for SSL/TLS connections **to** etcd. When this option is set, you can set advertise-client-urls using the HTTPS scheme.
`--key-file=<path>`: Key for the certificate. Must be unencrypted.
`--client-cert-auth`: When this is set, etcd will check all incoming HTTPS requests for a client certificate signed by the trusted CA. Requests that don't supply a valid client certificate will fail.
`--trusted-ca-file=<path>`: Trusted certificate authority.
**Peer (server-to-server / cluster) communication:**
The peer options work the same way as the client-to-server options:
`--peer-cert-file=<path>`: Certificate used for SSL/TLS connections between peers. This will be used both for listening on the peer address as well as sending requests to other peers.
`--peer-key-file=<path>`: Key for the certificate. Must be unencrypted.
`--peer-client-cert-auth`: When set, etcd will check all incoming peer requests from the cluster for valid client certificates signed by the supplied CA.
`--peer-trusted-ca-file=<path>`: Trusted certificate authority.
If either a client-to-server or peer certificate is supplied, the key must also be set. All of these configuration options are also available through environment variables: `ETCD_TRUSTED_CA_FILE`, `ETCD_PEER_TRUSTED_CA_FILE`, and so on.
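The environment variable for a flag follows a mechanical naming rule: prefix `ETCD_`, upper-case the flag name, and turn dashes into underscores. A minimal sketch of that mapping, with `flag_to_env` as a hypothetical helper name:

```python
def flag_to_env(flag):
    """Map a flag such as "--peer-cert-file" to its environment variable name."""
    return "ETCD_" + flag.lstrip("-").replace("-", "_").upper()
```

For example, `flag_to_env("--cert-file")` gives `ETCD_CERT_FILE` and `flag_to_env("--peer-key-file")` gives `ETCD_PEER_KEY_FILE`.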
## Example 1: Client-to-server transport security with HTTPS
For this you need your CA certificate (`ca.crt`) and signed key pair (`server.crt`, `server.key`) ready.
Let us configure etcd to provide simple HTTPS transport security step by step:
```sh
$ etcd -name infra0 -data-dir infra0 \
-cert-file=/path/to/server.crt -key-file=/path/to/server.key \
-advertise-client-urls=https://127.0.0.1:2379 -listen-client-urls=https://127.0.0.1:2379
```
This should start up fine and you can now test the configuration by speaking HTTPS to etcd:
```sh
$ curl --cacert /path/to/ca.crt https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
```
You should be able to see the handshake succeed. Because we use self-signed certificates with our own certificate authorities you need to provide the CA to curl using the `--cacert` option. Another possibility would be to add your CA certificate to the trusted certificates on your system (usually in `/etc/ssl/certs`).
**OSX 10.9+ Users**: curl 7.30.0 on OSX 10.9+ doesn't understand certificates passed in on the command line.
Instead you must import the dummy ca.crt directly into the keychain or add the `-k` flag to curl to ignore errors.
If you want to test without the `-k` flag run `open ./fixtures/ca/ca.crt` and follow the prompts.
Please remove this certificate after you are done testing!
If you know of a workaround let us know.
## Example 2: Client-to-server authentication with HTTPS client certificates
So far we've given the etcd client the ability to verify the server identity and provided transport security. We can, however, also use client certificates to prevent unauthorized access to etcd.
The clients will provide their certificates to the server and the server will check whether the cert is signed by the supplied CA and decide whether to serve the request.
You need the same files mentioned in the first example for this, as well as a key pair for the client (`client.crt`, `client.key`) signed by the same certificate authority.
```sh
$ etcd -name infra0 -data-dir infra0 \
-client-cert-auth -trusted-ca-file=/path/to/ca.crt -cert-file=/path/to/server.crt -key-file=/path/to/server.key \
-advertise-client-urls https://127.0.0.1:2379 -listen-client-urls https://127.0.0.1:2379
```
Now try the same request as above to this server:
```sh
$ curl --cacert /path/to/ca.crt https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
```
The request should be rejected by the server:
```
...
routines:SSL3_READ_BYTES:sslv3 alert bad certificate
...
```
To make it succeed, we need to give the CA signed client certificate to the server:
```sh
$ curl --cacert /path/to/ca.crt --cert /path/to/client.crt --key /path/to/client.key \
-L https://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar -v
```
You should be able to see:
```
...
SSLv3, TLS handshake, CERT verify (15):
...
TLS handshake, Finished (20)
```
And also the response from the server:
```json
{
"action": "set",
"node": {
"createdIndex": 12,
"key": "/foo",
"modifiedIndex": 12,
"value": "bar"
}
}
```
## Example 3: Transport security & client certificates in a cluster
etcd supports the same model as above for **peer communication**, that is, the communication between etcd members in a cluster.
Assuming we have our `ca.crt` and two members with their own keypairs (`member1.crt` & `member1.key`, `member2.crt` & `member2.key`) signed by this CA, we launch etcd as follows:
```sh
DISCOVERY_URL=... # from https://discovery.etcd.io/new
# member1
$ etcd -name infra1 -data-dir infra1 \
-peer-client-cert-auth -peer-trusted-ca-file=/path/to/ca.crt -peer-cert-file=/path/to/member1.crt -peer-key-file=/path/to/member1.key \
-initial-advertise-peer-urls=https://10.0.1.10:2380 -listen-peer-urls=https://10.0.1.10:2380 \
-discovery ${DISCOVERY_URL}
# member2
$ etcd -name infra2 -data-dir infra2 \
-peer-client-cert-auth -peer-trusted-ca-file=/path/to/ca.crt -peer-cert-file=/path/to/member2.crt -peer-key-file=/path/to/member2.key \
-initial-advertise-peer-urls=https://10.0.1.11:2380 -listen-peer-urls=https://10.0.1.11:2380 \
-discovery ${DISCOVERY_URL}
```
The etcd members will form a cluster and all communication between members in the cluster will be encrypted and authenticated using the client certificates. You will see in the output of etcd that the addresses it connects to use HTTPS.
## Frequently Asked Questions
### My cluster is not working with peer TLS configuration?
The internal protocol of etcd v2.0.x uses a lot of short-lived HTTP connections.
So, when enabling TLS you may need to increase the heartbeat interval and election timeouts to reduce internal cluster connection churn.
A reasonable place to start is with these values: `--heartbeat-interval 500 --election-timeout 2500`.
This issue is resolved in the etcd v2.1.x series of releases, which uses fewer connections.
### I'm seeing an SSLv3 alert handshake failure when using SSL client authentication?
The `crypto/tls` package of `golang` checks the key usage of the certificate public key before using it.
To use the certificate public key to do client auth, we need to add `clientAuth` to `Extended Key Usage` when creating the certificate public key.
Here is how to do it:
Add the following section to your openssl.cnf:
```
[ ssl_client ]
...
extendedKeyUsage = clientAuth
...
```
When creating the cert be sure to reference it in the `-extensions` flag:
```
$ openssl ca -config openssl.cnf -policy policy_anything -extensions ssl_client -out certs/machine.crt -infiles machine.csr
```
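To confirm that a signed certificate actually carries the extension, its text dump can be inspected. The following is a minimal sketch, not part of etcd: `has_client_auth` is a hypothetical helper name, and it assumes the input is the output of `openssl x509 -noout -text`, in which OpenSSL prints the `clientAuth` usage as `TLS Web Client Authentication`.

```sh
#!/bin/sh
# Hypothetical helper (not part of etcd or openssl).
# Reads the text dump of a certificate on stdin and succeeds only if
# the dump lists the client authentication extended key usage.
has_client_auth() {
  grep -q "TLS Web Client Authentication"
}

# Usage, assuming the certificate was written to certs/machine.crt as above:
#   openssl x509 -in certs/machine.crt -noout -text | has_client_auth \
#     && echo "clientAuth present" \
#     || echo "clientAuth missing"
```

If the check fails, the certificate was most likely signed without the `-extensions ssl_client` flag.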
### With peer certificate authentication I receive "certificate is valid for 127.0.0.1, not $MY_IP"
Make sure that you sign your certificates with a Subject Name that contains your member's public IP address. The `etcd-ca` tool, for example, provides an `--ip=` option for its `new-cert` command.
If you need your certificate to be signed for your member's FQDN in its Subject Name, then you could use Subject Alternative Names (IP SANs for short) to add your IP address. The `etcd-ca` tool provides a `--domain=` option for its `new-cert` command, and openssl can do [it](http://wiki.cacert.org/FAQ/subjectAltName) too.

66
Documentation/tuning.md Normal file

@ -0,0 +1,66 @@
## Tuning
The default settings in etcd should work well for installations on a local network where the average network latency is low.
However, when using etcd across multiple data centers or over networks with high latency you may need to tweak the heartbeat interval and election timeout settings.
The network isn't the only source of latency. Each request and response may be impacted by slow disks on both the leader and follower. Each of these timeouts represents the total time from request to successful response from the other machine.
### Time Parameters
The underlying distributed consensus protocol relies on two separate time parameters to ensure that nodes can hand off leadership if one stalls or goes offline.
The first parameter is called the *Heartbeat Interval*.
This is the frequency with which the leader will notify followers that it is still the leader.
etcd batches commands together for higher throughput so this heartbeat interval is also a delay for how long it takes for commands to be committed.
By default, etcd uses a `100ms` heartbeat interval.
The second parameter is the *Election Timeout*.
This timeout is how long a follower node will go without hearing a heartbeat before attempting to become leader itself.
By default, etcd uses a `1000ms` election timeout.
Adjusting these values is a trade-off.
Lowering the heartbeat interval will cause individual commands to be committed faster but it will lower the overall throughput of etcd.
If your etcd instances have low utilization then lowering the heartbeat interval can improve your command response time.
The election timeout should be set based on the heartbeat interval and your network ping time between nodes.
Election timeouts should be at least 10 times your ping time so it can account for variance in your network.
For example, if the ping time between your nodes is 10ms then you should have at least a 100ms election timeout.
The upper limit of the election timeout is 50000ms (50s), which should only be used when deploying a globally distributed etcd cluster. First, 5s is taken as the upper limit of the average global round-trip time: a reasonable round-trip time for the continental United States is 130ms, and the time between the US and Japan is around 350-400ms, but because packets can be delayed a lot and network conditions may be terrible, 5s is a safe value. Then, because the election timeout should be an order of magnitude bigger than the broadcast time, 50s becomes its maximum.
You should also set your election timeout to at least 5 to 10 times your heartbeat interval to account for variance in leader replication.
For a heartbeat interval of 50ms you should set your election timeout to at least 250ms - 500ms.
You can override the default values on the command line:
```sh
# Command line arguments:
$ etcd -heartbeat-interval=100 -election-timeout=500
# Environment variables:
$ ETCD_HEARTBEAT_INTERVAL=100 ETCD_ELECTION_TIMEOUT=500 etcd
```
The values are specified in milliseconds.
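The sizing rules above (at least 10 times the ping time, at least 5 to 10 times the heartbeat interval, and never above 50000ms) can be combined into a single lower bound. The following is a minimal sketch under stated assumptions: `min_election_timeout` is a hypothetical helper that is not part of etcd, both arguments are in milliseconds, and it uses the lower multiplier of 5 for the heartbeat rule.

```sh
#!/bin/sh
# Hypothetical helper (not part of etcd).
# Prints a lower bound for -election-timeout in milliseconds, given the
# ping time between nodes ($1) and the heartbeat interval ($2), both in ms.
min_election_timeout() {
  rtt_ms=$1
  heartbeat_ms=$2
  by_rtt=$((rtt_ms * 10))             # at least 10x the ping time
  by_heartbeat=$((heartbeat_ms * 5))  # at least 5x the heartbeat interval
  if [ "$by_rtt" -gt "$by_heartbeat" ]; then
    result=$by_rtt
  else
    result=$by_heartbeat
  fi
  # The documented upper limit of the election timeout is 50000ms.
  if [ "$result" -gt 50000 ]; then
    result=50000
  fi
  echo "$result"
}

# Example: 10ms ping time and a 50ms heartbeat interval.
min_election_timeout 10 50
```

Treat the printed value as a floor rather than a recommendation; a noisy network or slow disks may call for a larger timeout.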
### Snapshots
etcd appends all key changes to a log file.
This log grows forever and is a complete linear history of every change made to the keys.
A complete history works well for lightly used clusters but clusters that are heavily used would carry around a large log.
To avoid having a huge log etcd makes periodic snapshots.
These snapshots provide a way for etcd to compact the log by saving the current state of the system and removing old logs.
### Snapshot Tuning
Creating snapshots can be expensive so they're only created after a given number of changes to etcd.
By default, snapshots will be made after every 10,000 changes.
If etcd's memory usage and disk usage are too high, you can lower the snapshot threshold by setting the following on the command line:
```sh
# Command line arguments:
$ etcd -snapshot-count=5000
# Environment variables:
$ ETCD_SNAPSHOT_COUNT=5000 etcd
```


@ -0,0 +1,112 @@
## Upgrade etcd to 2.1
In the general case, upgrading from etcd 2.0 to 2.1 can be a zero-downtime, rolling upgrade:
- one by one, stop the etcd v2.0 processes and replace them with etcd v2.1 processes
- after you are running all v2.1 processes, new features in v2.1 are available to the cluster
Before [starting an upgrade](#upgrade-procedure), read through the rest of this guide to prepare.
### Upgrade Checklists
#### Upgrade Requirement
To upgrade an existing etcd deployment to 2.1, you must be running 2.0. If you're running a version of etcd before 2.0, you must upgrade to [2.0](https://github.com/coreos/etcd/releases/tag/v2.0.13) before upgrading to 2.1.
Also, to ensure a smooth rolling upgrade, your running cluster must be healthy. You can check the health of the cluster by using the `etcdctl cluster-health` command.
#### Preparedness
Always test the services relying on etcd in a staging environment before deploying the upgrade to the production environment.
You might also want to [back up your data directory](admin_guide.md#backing-up-the-datastore) for a potential [downgrade](#downgrade).
etcd 2.1 introduces a new [authentication](auth_api.md) feature, which is disabled by default. If your deployment will depend on this feature, you may want to test it before enabling it in production.
#### Mixed Versions
While upgrading, an etcd cluster supports mixed versions of etcd members. The cluster is only considered upgraded once all its members are upgraded to 2.1.
Internally, etcd members negotiate with each other to determine the overall etcd cluster version, which controls the reported cluster version and the supported features. For example, if you are mid-upgrade, any 2.1 features (such as the authentication feature mentioned above) won't be available.
#### Limitations
If you encounter any issues during the upgrade, you can attempt to restart the etcd process in trouble using a newer v2.1 binary to solve the problem. One known issue is that etcd v2.0.0 and v2.0.2 may panic during rolling upgrades due to an existing bug, which has been fixed since etcd v2.0.3.
It might take up to 2 minutes for the newly upgraded member to catch up with the existing cluster when the total data size is larger than 50MB (you can check the size of the existing snapshot to estimate the data size). In other words, it is safest to wait for 2 minutes before upgrading the next member.
If you have even more data, this might take more time. If you have a data size larger than 100MB you should contact us before upgrading, so we can make sure the upgrades work smoothly.
#### Downgrade
If all members have been upgraded to v2.1, the cluster will be upgraded to v2.1, and downgrade is **not possible**. If any member is still v2.0, the cluster will remain in v2.0, and you can go back to using the v2.0 binary.
Please [back up the data directory](admin_guide.md#backing-up-the-datastore) of all etcd members if you want to be able to downgrade the cluster, even after it has been upgraded.
### Upgrade Procedure
#### 1. Check upgrade requirements.
```
$ etcdctl cluster-health
cluster is healthy
member 6e3bd23ae5f1eae0 is healthy
member 924e2e83e93f2560 is healthy
member a8266ecf031671f3 is healthy
$ curl http://127.0.0.1:4001/version
etcd 2.0.x
```
#### 2. Stop the existing etcd process
You will see error logging similar to the following from the other etcd processes in your cluster. This is normal, since you just shut down a member.
```
2015/06/23 15:45:09 sender: error posting to 6e3bd23ae5f1eae0: dial tcp 127.0.0.1:7002: connection refused
2015/06/23 15:45:09 sender: the connection with 6e3bd23ae5f1eae0 became inactive
2015/06/23 15:45:11 rafthttp: encountered error writing to server log stream: write tcp 127.0.0.1:53783: broken pipe
2015/06/23 15:45:11 rafthttp: server streaming to 6e3bd23ae5f1eae0 at term 2 has been stopped
2015/06/23 15:45:11 stream: error sending message: stopped
2015/06/23 15:45:11 stream: stopping the stream server...
```
You could [back up your data directory](https://github.com/coreos/etcd/blob/7f7e2cc79d9c5c342a6eb1e48c386b0223cf934e/Documentation/admin_guide.md#backing-up-the-datastore) for data safety.
```
$ etcdctl backup \
--data-dir /var/lib/etcd \
--backup-dir /tmp/etcd_backup
```
#### 3. Drop in the etcd v2.1 binary and start the new etcd process
You will see etcd publish its information to the cluster.
```
2015/06/23 15:45:39 etcdserver: published {Name:infra2 ClientURLs:[http://localhost:4002]} to cluster e9c7614f68f35fb2
```
You can verify that the cluster becomes healthy.
```
$ etcdctl cluster-health
cluster is healthy
member 6e3bd23ae5f1eae0 is healthy
member 924e2e83e93f2560 is healthy
member a8266ecf031671f3 is healthy
```
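Before moving on to the next member, the health check above can be scripted. The following is a minimal sketch, not an official tool: `cluster_is_healthy` and `wait_for_healthy` are hypothetical helper names, the match on the line `cluster is healthy` assumes the output format shown above, and the 5 second pause between attempts is an arbitrary choice.

```sh
#!/bin/sh
# Hypothetical helper (not part of etcd).
# Reads `etcdctl cluster-health` output on stdin and succeeds only if
# it contains the exact line "cluster is healthy".
cluster_is_healthy() {
  grep -qx "cluster is healthy"
}

# Hypothetical helper (not part of etcd).
# Polls the cluster until it reports healthy, giving up after $1 attempts.
wait_for_healthy() {
  attempts=$1
  while [ "$attempts" -gt 0 ]; do
    if etcdctl cluster-health 2>/dev/null | cluster_is_healthy; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 5
  done
  return 1
}

# Usage: wait up to about 2 minutes before upgrading the next member.
#   wait_for_healthy 24 || echo "cluster did not become healthy" >&2
```

A healthy report does not guarantee that the upgraded member has fully caught up, so the waiting guidance in the Limitations section still applies.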
#### 4. Repeat step 2 to step 3 for all other members
#### 5. Finish
When all members are upgraded, you will see the cluster is upgraded to 2.1 successfully:
```
2015/06/23 15:46:35 etcdserver: updated the cluster version from 2.0.0 to 2.1.0
```
```
$ curl http://127.0.0.1:4001/version
{"etcdserver":"2.1.x","etcdcluster":"2.1.0"}
```


@ -1,42 +0,0 @@
# etcd Governance
## Principles
The etcd community adheres to the following principles:
- Open: etcd is open source.
- Welcoming and respectful: See [Code of Conduct].
- Transparent and accessible: Changes to the etcd code repository and CNCF related
activities (e.g. level, involvement, etc) are done in public.
- Merit: Ideas and contributions are accepted according to their technical merit for
the betterment of the project. For specific guidance on practical contribution steps
please see the [contributor guide].
## Roles and responsibilities
Etcd project roles along with their requirements and responsibilities are defined
in [community membership].
## Decision making process
Decisions are built on consensus between [maintainers] publicly. Proposals and ideas
can either be submitted for agreement via a GitHub issue or PR, or by sending an email
to `etcd-maintainers@googlegroups.com`.
## Conflict resolution
In general, we prefer that technical issues and maintainer membership are amicably
worked out between the persons involved. However, for any technical dispute that has
reached an impasse with a subset of the community, any contributor may open a GitHub
issue or PR or send an email to `etcd-maintainers@googlegroups.com`. If the
maintainers themselves cannot decide an issue, the issue will be resolved by a
supermajority of the maintainers, with a fallback on lazy consensus after a
three-business-week inactive voting period and as long as two maintainers are on board.
## Changes in Governance
Changes in project governance could be initiated by opening a GitHub PR.
[community membership]: /Documentation/contributor-guide/community-membership.md
[Code of Conduct]: /code-of-conduct.md
[contributor guide]: /CONTRIBUTING.md
[maintainers]: /MAINTAINERS

133
Godeps/Godeps.json generated Normal file

@ -0,0 +1,133 @@
{
"ImportPath": "github.com/coreos/etcd",
"GoVersion": "go1.4.1",
"Packages": [
"./..."
],
"Deps": [
{
"ImportPath": "bitbucket.org/ww/goautoneg",
"Comment": "null-5",
"Rev": "75cd24fc2f2c2a2088577d12123ddee5f54e0675"
},
{
"ImportPath": "github.com/beorn7/perks/quantile",
"Rev": "b965b613227fddccbfffe13eae360ed3fa822f8d"
},
{
"ImportPath": "github.com/bgentry/speakeasy",
"Rev": "5dfe43257d1f86b96484e760f2f0c4e2559089c7"
},
{
"ImportPath": "github.com/boltdb/bolt",
"Comment": "v1.0-71-g71f28ea",
"Rev": "71f28eaecbebd00604d87bb1de0dae8fcfa54bbd"
},
{
"ImportPath": "github.com/bradfitz/http2",
"Rev": "3e36af6d3af0e56fa3da71099f864933dea3d9fb"
},
{
"ImportPath": "github.com/codegangsta/cli",
"Comment": "1.2.0-26-gf7ebb76",
"Rev": "f7ebb761e83e21225d1d8954fde853bf8edd46c4"
},
{
"ImportPath": "github.com/coreos/go-etcd/etcd",
"Comment": "v2.0.0-13-g4cceaf7",
"Rev": "4cceaf7283b76f27c4a732b20730dcdb61053bf5"
},
{
"ImportPath": "github.com/coreos/go-semver/semver",
"Rev": "568e959cd89871e61434c1143528d9162da89ef2"
},
{
"ImportPath": "github.com/coreos/pkg/capnslog",
"Rev": "99f6e6b8f8ea30b0f82769c1411691c44a66d015"
},
{
"ImportPath": "github.com/gogo/protobuf/proto",
"Rev": "64f27bf06efee53589314a6e5a4af34cdd85adf6"
},
{
"ImportPath": "github.com/golang/glog",
"Rev": "44145f04b68cf362d9c4df2182967c2275eaefed"
},
{
"ImportPath": "github.com/golang/protobuf/proto",
"Rev": "5677a0e3d5e89854c9974e1256839ee23f8233ca"
},
{
"ImportPath": "github.com/google/btree",
"Rev": "cc6329d4279e3f025a53a83c397d2339b5705c45"
},
{
"ImportPath": "github.com/jonboulle/clockwork",
"Rev": "72f9bd7c4e0c2a40055ab3d0f09654f730cce982"
},
{
"ImportPath": "github.com/matttproud/golang_protobuf_extensions/pbutil",
"Rev": "fc2b8d3a73c4867e51861bbdd5ae3c1f0869dd6a"
},
{
"ImportPath": "github.com/prometheus/client_golang/model",
"Comment": "0.5.0-10-ga842dc1",
"Rev": "a842dc11e0621c34a71cab634d1d0190a59802a8"
},
{
"ImportPath": "github.com/prometheus/client_golang/prometheus",
"Comment": "0.5.0-10-ga842dc1",
"Rev": "a842dc11e0621c34a71cab634d1d0190a59802a8"
},
{
"ImportPath": "github.com/prometheus/client_golang/text",
"Comment": "0.5.0-10-ga842dc1",
"Rev": "a842dc11e0621c34a71cab634d1d0190a59802a8"
},
{
"ImportPath": "github.com/prometheus/client_model/go",
"Comment": "model-0.0.2-12-gfa8ad6f",
"Rev": "fa8ad6fec33561be4280a8f0514318c79d7f6cb6"
},
{
"ImportPath": "github.com/prometheus/procfs",
"Rev": "ee2372b58cee877abe07cde670d04d3b3bac5ee6"
},
{
"ImportPath": "github.com/stretchr/testify/assert",
"Rev": "9cc77fa25329013ce07362c7742952ff887361f2"
},
{
"ImportPath": "github.com/ugorji/go/codec",
"Rev": "821cda7e48749cacf7cad2c6ed01e96457ca7e9d"
},
{
"ImportPath": "golang.org/x/crypto/bcrypt",
"Rev": "1351f936d976c60a0a48d728281922cf63eafb8d"
},
{
"ImportPath": "golang.org/x/crypto/blowfish",
"Rev": "1351f936d976c60a0a48d728281922cf63eafb8d"
},
{
"ImportPath": "golang.org/x/net/context",
"Rev": "7dbad50ab5b31073856416cdcfeb2796d682f844"
},
{
"ImportPath": "golang.org/x/oauth2",
"Rev": "3046bc76d6dfd7d3707f6640f85e42d9c4050f50"
},
{
"ImportPath": "google.golang.org/cloud/compute/metadata",
"Rev": "f20d6dcccb44ed49de45ae3703312cb46e627db1"
},
{
"ImportPath": "google.golang.org/cloud/internal",
"Rev": "f20d6dcccb44ed49de45ae3703312cb46e627db1"
},
{
"ImportPath": "google.golang.org/grpc",
"Rev": "f5ebd86be717593ab029545492c93ddf8914832b"
}
]
}

5
Godeps/Readme generated Normal file

@ -0,0 +1,5 @@
This directory tree is generated automatically by godep.
Please do not edit.
See https://github.com/tools/godep for more information.

2
Godeps/_workspace/.gitignore generated vendored Normal file

@ -0,0 +1,2 @@
/pkg
/bin


@ -0,0 +1,13 @@
include $(GOROOT)/src/Make.inc
TARG=bitbucket.org/ww/goautoneg
GOFILES=autoneg.go
include $(GOROOT)/src/Make.pkg
format:
	gofmt -w *.go
docs:
	gomake clean
	godoc ${TARG} > README.txt


@ -0,0 +1,67 @@
PACKAGE
package goautoneg
import "bitbucket.org/ww/goautoneg"
HTTP Content-Type Autonegotiation.
The functions in this package implement the behaviour specified in
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
Copyright (c) 2011, Open Knowledge Foundation Ltd.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.
Neither the name of the Open Knowledge Foundation Ltd. nor the
names of its contributors may be used to endorse or promote
products derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
FUNCTIONS
func Negotiate(header string, alternatives []string) (content_type string)
Negotiate the most appropriate content_type given the accept header
and a list of alternatives.
func ParseAccept(header string) (accept []Accept)
Parse an Accept Header string returning a sorted list
of clauses
TYPES
type Accept struct {
Type, SubType string
Q float32
Params map[string]string
}
Structure to represent a clause in an HTTP Accept Header
SUBDIRECTORIES
.hg

Some files were not shown because too many files have changed in this diff.