Commit Graph

182 Commits (f9c2d00fb3d0a63cf3a7d8c124ad87cd5d826253)

Author SHA1 Message Date
Xiang Li 2c5162af5c
Merge pull request #10523 from jingyih/fully_concurrent_reads
mvcc: fully concurrent read
2019-06-14 12:25:17 +08:00
Jingyi Hu 55066ebdc0 mvcc: address comments 2019-06-13 18:05:50 -07:00
Jingyi Hu 2a9320e944 mvcc: add TestConcurrentReadTxAndWrite
Add TestConcurrentReadTxAndWrite which creates random reads and writes,
and ensures reads always see latest writes.
2019-06-11 17:05:41 -07:00
Jingyi Hu b873fbd127 mvcc/backend: correct RLock in test
Should use RLock instead of Lock.
2019-06-11 16:24:44 -07:00
Jingyi Hu 693afd8e5e mvcc/backend: add unit test for ConcurrentReadTx
Add TestConcurrentReadTx to ensure concurrentReadTx can see all the
prior writes which are stored on the read buffer.
2019-06-10 18:30:21 -07:00
Jingyi Hu ad80752715 mvcc: add metrics dbOpenReadTxn
Expose the number of currently open read transactions in backend to
metrics endpoint.
2019-06-10 17:20:04 -07:00
Gyuho Lee 1caaa9ed4a test: test update for Go 1.12.5 and related changes
Update to Go 1.12.5 testing. Remove deprecated unused and gosimple
packages, and mask staticcheck 1006. Also, fix unconvert errors related
to unnecessary type conversions and following staticcheck errors:
- remove redundant return statements
- use for range instead of for select
- use time.Since instead of time.Now().Sub
- omit comparison to bool constant
- replace T.Fatal and T.Fatalf in tests with T.Error and T.Errorf respectively, because T.Fatal must be called in the same goroutine as the test
- fix error strings that should not be capitalized
- use sort.Strings(...) instead of sort.Sort(sort.StringSlice(...))
- use the status code of Canceled instead of grpc.ErrClientConnClosing which is deprecated
- use status.Errorf instead of grpc.Errorf which is deprecated

Related #10528 #10438
2019-06-05 17:02:05 -04:00
Jingyi Hu 4345f74426 mvcc: revert change made by 10526
Revert #10526 and its followup #10699.
2019-05-29 17:41:33 -07:00
Gyuho Lee 34bd797e67 *: revert module import paths
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:35 -07:00
Max Lowther 9ab3572662 Doc: Fix typo in revision.go 2019-05-16 14:29:10 +01:00
Jingyi Hu a73fb85c0c mvcc: fully concurrent read 2019-05-08 19:11:23 -07:00
Gyuho Lee e899023f3f
Merge pull request #10640 from shrajfr12/gomodulecompat
Fix module path to have the major version to comply with go modules specification.
2019-05-01 22:46:03 -07:00
Jingyi Hu 88922b0d08 mvcc: protect tree clone with write lock 2019-05-01 12:34:09 -07:00
shivaramr 9150bf52d6 go modules: Fix module path version to include version number 2019-04-26 15:29:50 -07:00
Yingnan Zhang c5cb5509ea mvcc: fix db_compaction_total_duration_milliseconds 2019-04-16 10:32:21 +08:00
Jingyi Hu 93732df3ef mvcc: release lock early when traversing index
Make a copy-on-write clone of the index tree when traversing, so that the
lock can be released right after the clone to improve backend concurrency.
2019-03-06 14:26:17 -08:00
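The idea in commit 93732df3ef (snapshot the index under the lock, then traverse the snapshot with the lock released) can be sketched as follows. This is a simplified stand-in, not the etcd implementation: the `index` type and its methods are hypothetical, and an immutable sorted slice replaces the copy-on-write B-tree clone. A real tree library's clone operation may have different locking requirements than the read lock used here.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// index is a hypothetical stand-in for a key index. The published slice is
// never mutated in place, so a reader can keep using an old snapshot.
type index struct {
	mu   sync.RWMutex
	keys []string // sorted; replaced wholesale on every write
}

// put inserts key by building a new slice (copy-on-write).
func (ix *index) put(key string) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	i := sort.SearchStrings(ix.keys, key)
	if i < len(ix.keys) && ix.keys[i] == key {
		return // already present
	}
	next := make([]string, 0, len(ix.keys)+1)
	next = append(next, ix.keys[:i]...)
	next = append(next, key)
	next = append(next, ix.keys[i:]...)
	ix.keys = next
}

// snapshot holds the lock only long enough to grab the current slice.
func (ix *index) snapshot() []string {
	ix.mu.RLock()
	defer ix.mu.RUnlock()
	return ix.keys
}

// visit traverses a snapshot; the lock is already released, so writers
// (even ones triggered from inside f) are not blocked by the traversal.
func (ix *index) visit(f func(key string) bool) {
	for _, k := range ix.snapshot() {
		if !f(k) {
			return
		}
	}
}

func main() {
	ix := &index{}
	ix.put("a")
	ix.put("c")

	var seen []string
	ix.visit(func(k string) bool {
		ix.put("b") // would deadlock if visit still held the lock
		seen = append(seen, k)
		return true
	})
	fmt.Println("visited:", seen)
	fmt.Println("index now:", ix.snapshot())
}
```

The traversal sees the keys that existed when the snapshot was taken, while the write made during the traversal shows up only in later snapshots.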
Jingyi Hu 1c19f126cb mvcc/backend: rename ReadTx Lock() to RLock()
For better code readability, rename Lock() to RLock() in the ReadTx
interface.
2019-03-05 13:53:27 -08:00
WizardCXY e6c6d8492e *: add flag to let etcd use the new boltdb freelistType feature 2019-02-14 11:07:08 +08:00
WizardCXY 6e8913b004 bugfix: deadlock on store.mu when store.Compact happens in store.Restore 2019-01-21 10:46:58 +08:00
caoming cf309757d6 mvcc/backend: code format optimization 2018-10-17 14:18:09 +08:00
caoming bf49b9a145 mvcc/backend: fix to use the backend created by snapshot instead of the original one. 2018-10-15 09:35:20 +08:00
Gyuho Lee 5adbc231f2 mvcc/backend: use "go.etcd.io/bbolt"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 12:31:04 -07:00
Gyuho Lee d537b328cb mvcc: update import paths "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
Gyuho Lee 6ab3cc0a2e mvcc: clean up code format
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-21 16:03:16 -07:00
Gyuho Lee e388a4a1a1 mvcc: simplify increment "rrev"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-05 10:28:10 -07:00
Gyuho Lee bc18474029 mvcc: remove unnecessary type conversion
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-05 10:09:53 -07:00
Xiang Li 2f1730fcae backend: more metrics for bboltdb transaction 2018-06-11 14:05:04 -07:00
Gyuho Lee f2db05a869 mvcc: server db size with "etcd_debugging" namespace for backward compatibility
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-07 10:23:12 -07:00
Gyuho Lee 21130d5fb6 mvcc: promote db size metrics to "etcd"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-07 10:20:45 -07:00
Gyuho Lee e239cc276a mvcc: separate synced/unsynced benchmarks
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-01 10:00:18 -07:00
Gyuho Lee 0398ec7dcb mvcc: fix panic by allowing future revision watcher from restore operation
This also happens without gRPC proxy.

Fix panic when gRPC proxy leader watcher is restored:

```
go test -v -tags cluster_proxy -cpu 4 -race -run TestV3WatchRestoreSnapshotUnsync

=== RUN   TestV3WatchRestoreSnapshotUnsync
panic: watcher minimum revision 9223372036854775805 should not exceed current revision 16

goroutine 156 [running]:
github.com/coreos/etcd/mvcc.(*watcherGroup).chooseAll(0xc4202b8720, 0x10, 0xffffffffffffffff, 0x1)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:242 +0x3b5
github.com/coreos/etcd/mvcc.(*watcherGroup).choose(0xc4202b8720, 0x200, 0x10, 0xffffffffffffffff, 0xc420253378, 0xc420253378)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:225 +0x289
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchers(0xc4202b86e0, 0x0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:340 +0x237
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchersLoop(0xc4202b86e0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:214 +0x280
created by github.com/coreos/etcd/mvcc.newWatchableStore
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:90 +0x477
exit status 2
FAIL	github.com/coreos/etcd/integration	2.551s
```

gRPC proxy spawns a watcher with a key "proxy-namespace__lostleader"
and watch revision "int64(math.MaxInt64 - 2)" to detect leader loss.
But, when the partitioned node restores, this watcher triggers
panic with "watcher minimum revision ... should not exceed current ...".

This check was added a long time ago, by my PR, when there was no gRPC proxy:

https://github.com/coreos/etcd/pull/4043#discussion_r48457145

> we can remove this checking actually. it is impossible for a unsynced watching to have a future rev. or we should just panic here.

However, it is now possible for an unsynced watcher to have a future
revision, when it was moved from a synced watcher group through a
restore operation.

This PR adds "restore" flag to indicate that a watcher was moved
from the synced watcher group with restore operation. Otherwise,
the watcher with future revision in an unsynced watcher group
would still panic.
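
The rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code from this PR: the `watcher` struct, its field names, and `checkFutureRev` are hypothetical, and the sketch returns an error where the commit message describes a panic.

```go
package main

import (
	"fmt"
	"math"
)

// watcher is a hypothetical, reduced view of a watcher: only the fields
// needed to illustrate the check are included.
type watcher struct {
	key     string
	minRev  int64
	restore bool // set when the watcher was moved from the synced group by a restore
}

// checkFutureRev reports an error for a watcher whose minimum revision is
// ahead of the current revision, unless the watcher carries the restore flag.
func checkFutureRev(w *watcher, curRev int64) error {
	if w.minRev <= curRev {
		return nil // not a future revision, nothing to check
	}
	if !w.restore {
		return fmt.Errorf("watcher minimum revision %d should not exceed current revision %d",
			w.minRev, curRev)
	}
	return nil // future revision tolerated because of the restore operation
}

func main() {
	lostLeader := &watcher{
		key:     "proxy-namespace__lostleader",
		minRev:  math.MaxInt64 - 2,
		restore: true,
	}
	fmt.Println(checkFutureRev(lostLeader, 16))

	lostLeader.restore = false
	fmt.Println(checkFutureRev(lostLeader, 16))
}
```

With the flag set, the future-revision watcher is accepted. Without it, the check still fails, matching the behavior the message describes for other unsynced watchers.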

Example logs with future revision watcher from restore operation:

```
{"level":"info","ts":1527196358.9057755,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
{"level":"info","ts":1527196358.910349,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
```

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-25 12:40:02 -07:00
Gyuho Lee 210c842345 mvcc: improve watcherGroup panic message
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 15:38:40 -07:00
Gyuho Lee 1d91698268 mvcc: document, clean up histogram variables
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 14:03:28 -07:00
Gyuho Lee e6a113cdcd mvcc/backend: clean up histogram variables
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 14:03:28 -07:00
Gyuho Lee bc59f7b42f mvcc: add "etcd_mvcc_hash_(rev)_duration_seconds"
etcd_mvcc_hash_duration_seconds
etcd_mvcc_hash_rev_duration_seconds

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee 966ee9323c mvcc/backend: fix defrag duration scale
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee d326b2933c mvcc/backend: add "etcd_disk_backend_defrag_duration_seconds"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee 60a9ec8a15 mvcc/backend: document metrics ExponentialBuckets
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee 58e3ead219 mvcc/backend: clean up mutex, logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee 1a83c6ad80 mvcc: remove unused parameters
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-09 15:42:45 -07:00
Gyuho Lee 5165344981 mvcc: use latest revision to tombstone
We replace/insert into the in-memory B-tree, which means
we keep only a single node per key and thus do not support
delete by revision on the B-tree. So, (*keyIndex).tombstone
has always been marked with the latest revision.

tombstone with key's modified revision panics:

panic: store.keyindex: put with unexpected smaller revision [{2 0} / {2 0}]

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-09 09:07:39 -07:00
Gyuho Lee 03ef9745a9 mvcc: add more structured logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-04 13:15:51 -07:00
Gyuho Lee 4d863dac5a mvcc: support structured logging in compact restore
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-02 11:57:23 -07:00
Gyuho Lee 3df30b9c7f mvcc: fix "unconvert" warnings
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-30 15:32:16 -07:00
jocalvert f176427791 mvcc: Clone the key index for compaction and lock on each item
For compaction, clone the original B-tree for traversal purposes, so as
not to hold the lock for the duration of compaction. This preserves read/write
throughput by not blocking when the index tree is large (> 1M entries).

mvcc: add comment for index compaction lock
mvcc: explicitly unlock store to do index compaction synchronously
mvcc: formatting index bench
mvcc: add release note for index compaction changes
mvcc: add license header
2018-04-18 13:29:27 -07:00
Gyuho Lee c00c6cb685 mvcc: support structured logger
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-16 17:36:00 -07:00
Gyuho Lee 6c40b2b5d4 mvcc/backend: defrag to block concurrent read requests while resetting tx
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-16 03:29:18 -04:00
Gyuho Lee 8a518b01c4 *: revert "internal/mvcc" change
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-02-26 17:11:40 -08:00
Gyuho Lee 80d15948bc *: move "mvcc" to "internal/mvcc"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-01-26 11:14:41 -08:00
Gyuho Lee 349a377a67 *: move "lease" to "internal/lease"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-01-26 11:09:29 -08:00