vitalif/etcd - etcd

Commit Graph

Author	SHA1	Message	Date
Jingyi Hu	a73fb85c0c	mvcc: fully concurrent read	2019-05-08 19:11:23 -07:00
Gyuho Lee	e899023f3f	Merge pull request #10640 from shrajfr12/gomodulecompat Fix module path to have the major version to comply with go modules specification.	2019-05-01 22:46:03 -07:00
Jingyi Hu	88922b0d08	mvcc: protect tree clone with write lock	2019-05-01 12:34:09 -07:00
shivaramr	9150bf52d6	go modules: Fix module path version to include version number	2019-04-26 15:29:50 -07:00
Yingnan Zhang	c5cb5509ea	mvcc: fix db_compaction_total_duration_milliseconds	2019-04-16 10:32:21 +08:00
Jingyi Hu	93732df3ef	mvcc: release lock early when traversing index Make a copy-on-write clone of index tree when traversing. So that lock can be released right after the clone to improve backend concurrency.	2019-03-06 14:26:17 -08:00
Jingyi Hu	1c19f126cb	mvcc/backend: rename ReadTx Lock() to RLock() For better code readability, renaming Lock() to RLock() in ReadTx interface.	2019-03-05 13:53:27 -08:00
WizardCXY	e6c6d8492e	*: add flag to let etcd use the new boltdb freelistType feature	2019-02-14 11:07:08 +08:00
WizardCXY	6e8913b004	bugfix:dead lock on store.mu when store.Compact in store.Restore happens	2019-01-21 10:46:58 +08:00
caoming	cf309757d6	mvcc/backend: code format optimization	2018-10-17 14:18:09 +08:00
caoming	bf49b9a145	mvcc/backend: fix to use the backend create by snapshot instead of origin one.	2018-10-15 09:35:20 +08:00
Gyuho Lee	5adbc231f2	mvcc/backend: use "go.etcd.io/bbolt" Signed-off-by: Gyuho Lee <leegyuho@amazon.com>	2018-08-29 12:31:04 -07:00
Gyuho Lee	d537b328cb	mvcc: update import paths "go.etcd.io/etcd" Signed-off-by: Gyuho Lee <leegyuho@amazon.com>	2018-08-28 17:47:55 -07:00
Gyuho Lee	6ab3cc0a2e	mvcc: clean up code format Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-07-21 16:03:16 -07:00
Gyuho Lee	e388a4a1a1	mvcc: simplify increment "rrev" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-07-05 10:28:10 -07:00
Gyuho Lee	bc18474029	mvcc: remove unnecessary type conversion Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-07-05 10:09:53 -07:00
Xiang Li	2f1730fcae	backend: more metrics for bboltdb transcation	2018-06-11 14:05:04 -07:00
Gyuho Lee	f2db05a869	mvcc: server db size with "etcd_debugging" namespace for backward compatibility Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-06-07 10:23:12 -07:00
Gyuho Lee	21130d5fb6	mvcc: promote db size metrics to "etcd" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-06-07 10:20:45 -07:00
Gyuho Lee	e239cc276a	mvcc: separate synced/unsynced benchmarks Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-06-01 10:00:18 -07:00
Gyuho Lee	0398ec7dcb	mvcc: fix panic by allowing future revision watcher from restore operation This also happens without gRPC proxy. Fix panic when gRPC proxy leader watcher is restored: ``` go test -v -tags cluster_proxy -cpu 4 -race -run TestV3WatchRestoreSnapshotUnsync === RUN TestV3WatchRestoreSnapshotUnsync panic: watcher minimum revision 9223372036854775805 should not exceed current revision 16 goroutine 156 [running]: github.com/coreos/etcd/mvcc.(*watcherGroup).chooseAll(0xc4202b8720, 0x10, 0xffffffffffffffff, 0x1) /home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:242 +0x3b5 github.com/coreos/etcd/mvcc.(*watcherGroup).choose(0xc4202b8720, 0x200, 0x10, 0xffffffffffffffff, 0xc420253378, 0xc420253378) /home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:225 +0x289 github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchers(0xc4202b86e0, 0x0) /home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:340 +0x237 github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchersLoop(0xc4202b86e0) /home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:214 +0x280 created by github.com/coreos/etcd/mvcc.newWatchableStore /home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:90 +0x477 exit status 2 FAIL github.com/coreos/etcd/integration 2.551s ``` gRPC proxy spawns a watcher with a key "proxy-namespace__lostleader" and watch revision "int64(math.MaxInt64 - 2)" to detect leader loss. But, when the partitioned node restores, this watcher triggers panic with "watcher minimum revision ... should not exceed current ...". This check was added a long time ago, by my PR, when there was no gRPC proxy: https://github.com/coreos/etcd/pull/4043#discussion_r48457145 > we can remove this checking actually. it is impossible for a unsynced watching to have a future rev. or we should just panic here. However, now it's possible that a unsynced watcher has a future revision, when it was moved from a synced watcher group through restore operation. This PR adds "restore" flag to indicate that a watcher was moved from the synced watcher group with restore operation. Otherwise, the watcher with future revision in an unsynced watcher group would still panic. Example logs with future revision watcher from restore operation: ``` {"level":"info","ts":1527196358.9057755,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16} {"level":"info","ts":1527196358.910349,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16} ``` Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-25 12:40:02 -07:00
Gyuho Lee	210c842345	mvcc: improve watcherGroup panic message Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-23 15:38:40 -07:00
Gyuho Lee	1d91698268	mvcc: document, clean up histogram variables Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-23 14:03:28 -07:00
Gyuho Lee	e6a113cdcd	mvcc/backend: clean up histogram variables Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-23 14:03:28 -07:00
Gyuho Lee	bc59f7b42f	mvcc: add "etcd_mvcc_hash_(rev)_duration_seconds" etcd_mvcc_hash_duration_seconds etcd_mvcc_hash_rev_duration_seconds Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-23 13:09:42 -07:00
Gyuho Lee	966ee9323c	mvcc/backend: fix defrag duration scale Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-23 13:09:42 -07:00
Gyuho Lee	d326b2933c	mvcc/backend: add "etcd_disk_backend_defrag_duration_seconds" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-23 13:09:42 -07:00
Gyuho Lee	60a9ec8a15	mvcc/backend: document metrics ExponentialBuckets Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-23 13:09:42 -07:00
Gyuho Lee	58e3ead219	mvcc/backend: clean up mutex, logging Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-23 13:09:42 -07:00
Gyuho Lee	1a83c6ad80	mvcc: remove unused parameters Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-09 15:42:45 -07:00
Gyuho Lee	5165344981	mvcc: use latest revision to tombstone We replace/insert into in-memory B-tree, which means we only keep a single node per key thus do not support delete by revision on B-tree. So, (*keyIndex).tombstone has always been marked with latest revision. tombstone with key's modified revision panics: panic: store.keyindex: put with unexpected smaller revision [{2 0} / {2 0}] Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-09 09:07:39 -07:00
Gyuho Lee	03ef9745a9	mvcc: add more structured logging Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-04 13:15:51 -07:00
Gyuho Lee	4d863dac5a	mvcc: support structured logging in compact restore Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-05-02 11:57:23 -07:00
Gyuho Lee	3df30b9c7f	mvcc: fix "unconvert" warnings Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-04-30 15:32:16 -07:00
jocalvert	f176427791	mvcc: Clone the key index for compaction and lock on each item For compaction, clone the original Btree for traversal purposes, so as to not hold the lock for the duration of compaction. This allows read/write throughput by not blocking when the index tree is large (> 1M entries). mvcc: add comment for index compaction lock mvcc: explicitly unlock store to do index compaction synchronously mvcc: formatting index bench mvcc: add release note for index compaction changes mvcc: add license header	2018-04-18 13:29:27 -07:00
Gyuho Lee	c00c6cb685	mvcc: support structured logger Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-04-16 17:36:00 -07:00
Gyuho Lee	6c40b2b5d4	mvcc/backend: defrag to block concurrent read requests while resetting tx Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-03-16 03:29:18 -04:00
Gyuho Lee	8a518b01c4	*: revert "internal/mvcc" change Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	80d15948bc	*: move "mvcc" to "internal/mvcc" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-26 11:14:41 -08:00
Gyuho Lee	349a377a67	*: move "lease" to "internal/lease" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-26 11:09:29 -08:00
Manjunath A Kumatagi	89221a25b8	mvcc : Fix Govet errors	2018-01-25 02:30:37 -05:00
Iwasaki Yudai	0b1b82aff2	mvcc: check null before set FillPercent not to panic Since CreateBucketIfNotExists() can return nil when it gets an error, accessing FillPercent must be done after a nil check, not to cause a panic.	2018-01-08 11:34:34 -08:00
Gyuho Lee	82a164e3b9	mvcc: make test struct fields unexported Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2017-12-31 13:20:41 -08:00
Connor Peet	fc3b59046f	mvcc: allow clients to assign watcher IDs This allows for watchers to be created concurrently without needing potentially complex and latency-adding queuing on the client. Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2017-12-31 13:20:40 -08:00
Gyuho Lee	76dd9d56a1	mvcc: clean-up godoc in key_index.go Minor clean-up. Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2017-12-18 13:20:00 -08:00
Gyuho Lee	2e95ace82b	mvcc: fetch revisions with current revision, not 0, in HashByRev It was getting revisions with "atRev==0", which makes "available" from "keep" method always empty since "walk" on "keyIndex" only returns true. "available" should be populated with all revisions to be kept if the compaction happens with the given revision. But, "available" was being empty when "kvindex.Keep(0)" since it's always the case that "rev.main > atRev==0". Fix https://github.com/coreos/etcd/issues/9022. Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2017-12-18 12:17:06 -08:00
Gyuho Lee	bcd5390b35	*: regenerate protobuf, grpc-gateway Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2017-12-07 21:31:13 -08:00
Gyu-Ho Lee	9154b31bf3	mvcc: move 'keyi' define before holding locks To make it consistent with other code paths. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-10-05 10:06:28 -07:00
Anthony Romano	4fa1dd196c	*: make receiver names consistent	2017-09-12 03:54:04 -07:00
Gyu-Ho Lee	f65aee0759	*: replace 'golang.org/x/net/context' with 'context' Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-09-07 13:39:42 -07:00
Anthony Romano	9d79d5fe65	mvcc: don't allocate keys when computing Revisions	2017-08-31 13:23:23 -07:00
Anthony Romano	be7d488982	mvcc: add range benchmark for fetching 100 keys	2017-08-31 13:23:23 -07:00
Anthony Romano	896447ed99	mvcc: only remove watch cancel after cancel completes If Close() is called before Cancel()'s cancel() completes, the watch channel will be closed while the watch is still in the synced list. If there's an event, etcd will try to write to a closed channel. Instead, remove the watch from the bookkeeping structures only after cancel completes, so Close() will always call it. Fixes #8443	2017-08-28 17:06:33 -07:00
Anthony Romano	bd53ae5680	mvcc: test concurrently closing watch streams and canceling watches Triggers a race that causes a write to a closed watch stream channel.	2017-08-28 17:06:32 -07:00
Anthony Romano	f58c0cfb66	mvcc: Revisions() method for index to avoid key allocation Save another alloc on the one key path.	2017-08-21 11:30:02 -07:00
fengshaobao 00231050	13041c15ba	mvcc: sending events after restore Fixes: #8411	2017-08-21 10:32:49 -07:00
Anthony Romano	8b872196d0	backend: cache buckets in read tx Saves an alloc and about 10% of Range() time.	2017-08-21 02:16:55 -07:00
Anthony Romano	10b65c97dd	mvcc: benchmark Range() on a single key	2017-08-21 00:14:46 -07:00
Anthony Romano	ccd1bb1780	mvcc: test keys gauge is reloaded correctly on restore	2017-08-10 09:21:39 -07:00
Anthony Romano	32866572bf	mvcc: reset keys gauge on restore Fixes #8388	2017-08-10 08:37:50 -07:00
fanmin shi	df5a3d15ce	mvcc: increase rev for TestHashKVWhenCompacting	2017-07-31 17:59:49 -07:00
fanmin shi	bb86c327e2	mvcc: HashKV gets keep from kvindex.Keep	2017-07-31 17:59:49 -07:00
fanmin shi	4c2c5b0084	mvcc: add tests for Keep	2017-07-31 17:59:42 -07:00
fanmin shi	7b8fb3cf0a	mvcc: add and implement Keep api to index Keep finds all revisions to be kept for a Compaction at the given rev.	2017-07-31 14:04:03 -07:00
fanmin shi	451b062184	mvcc/backend: add TestBackendWritebackForEach to backend_test.go	2017-07-28 09:39:48 -07:00
fanmin shi	785deebd62	mvcc/backend: enforce ordering for UnsafeForEach in read_tx.go This pr changes UnsafeForEach to traverse on boltdb before on the buffer. This ordering guarantees that UnsafeForEach traverses in the same order before or after the commit of buffer.	2017-07-28 09:30:23 -07:00
Xiang Li	2a348fb8e9	Merge pull request #8263 from fanminshi/hash_by_rev api: hash by rev	2017-07-26 11:22:33 -07:00
fanmin shi	8609521ce2	mvcc: add TestHashKVWhenCompacting to kvstore_test	2017-07-26 09:48:29 -07:00
fanmin shi	deca9879c2	mvcc: add HashByRev to kv.go HashByRev computes the hash of all MVCC keys up to a given revision.	2017-07-25 17:00:46 -07:00
Joe Betz	c06953ae08	mvcc: Add metric for count of db key revisions compacted. When digging into etcd/boltdb "storage space exceeded" issues, this metric may help answer questions about if/when compactions occured and how much data was freed.	2017-07-20 10:07:56 -07:00
Anthony Romano	e9d096ae6b	mvcc: don't allocate end revision while computing range Use 'nil' since it's only reading a single key. Also preallocates the result slice based on limit / number of revisions fetched. Fixes #8208	2017-07-06 15:59:27 -07:00
Gyu-Ho Lee	870302afa6	mvcc/backend: enable 'NoFreelistSync' by default (linux) Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-07-05 16:10:04 -07:00
Gyu-Ho Lee	318e9c766f	*: replace 'boltdb' import paths with 'coreos/bbolt' Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-07-05 14:32:13 -07:00
Anthony Romano	522e75cb4f	mvcc: use GaugeFunc metric to load db size when requested Relying on mvcc to set the db size metric can cause it to miss size changes when a txn commits after the last write completes before a quiescent period. Instead, load the db size on demand. Fixes #8146	2017-06-21 23:58:37 -07:00
Anthony Romano	51a568aa81	mvcc: restore into tree index with one key index Clobbering the mvcc kvindex with new keyIndexes for each restore chunk would cause index corruption by dropping historical information.	2017-06-19 12:04:01 -07:00
Anthony Romano	02164874d9	mvcc: test restore and deletes with small chunk sizes	2017-06-19 12:04:01 -07:00
Anthony Romano	7f149d8fb6	mvcc: set db size metric on restore Fixes #8080	2017-06-16 11:27:34 -07:00
Anthony Romano	da48f1feaf	mvcc: create TxnWrites from TxnRead with NewReadOnlyTxnWrite Already used internally by mvcc, but needed by etcdserver txns.	2017-06-09 09:20:38 -07:00
Anthony Romano	300feea177	Merge pull request #8052 from heyitsanthony/watch-victim-test mvcc: test watch victim/delay path	2017-06-08 11:10:33 -07:00
Anthony Romano	83b2ea2f60	mvcc: test watch victim/delay path Current tests don't normally trigger the watch victim path because the constants are too large; set the constants to small values and hammer the store to cause watch delivery delays.	2017-06-07 17:02:00 -07:00
Anthony Romano	0352ce79b8	mvcc: count range/put/del operations for txns Txns were previously only bumping the txn counter; now bumps all operation counters.	2017-06-07 16:53:50 -07:00
Anthony Romano	fd71da47d1	mvcc: remove unused store.Equals function	2017-06-07 09:25:42 -07:00
Anthony Romano	ef63abdf7f	mvcc: don't use pointer for storeTxnRead in storeTxnWrite Saves an allocation when creating a storeTxnWrite.	2017-06-06 09:51:57 -07:00
Anthony Romano	6846e49edf	Merge pull request #7859 from heyitsanthony/cache-consistent-get mvcc: cache consistent index	2017-05-26 10:52:53 -07:00
Anthony Romano	ac4855e911	mvcc: benchmark ConsistentIndex	2017-05-26 09:49:40 -07:00
Anthony Romano	73dee0bec4	mvcc: cache consistentIndex Called on every entry apply and boltdb requests aren't free.	2017-05-26 09:49:40 -07:00
Anthony Romano	0506f49f9e	backend: don't hold boltdb read txn lock on cursor scanning Large fetches hold the lock when they do not need to do so.	2017-05-26 09:28:08 -07:00
Anthony Romano	343a018361	Merge pull request #7900 from heyitsanthony/chunk-restore mvcc: chunk reads for restoring	2017-05-26 09:21:59 -07:00
Anthony Romano	8516d8ccc5	backend: force initial mmap size to 0 for windows boltdb on windows allocates a file with the full mmap size even if the db is empty. Force the initial mmap size to 0 so there's no huge initial db file on windows. Fixes #7910	2017-05-12 14:34:07 -07:00
fanmin shi	8468b38631	backend: dynamically set snapshotWarningTimeout based on db size	2017-05-11 15:25:35 -07:00
Anthony Romano	1aca63e9e0	mvcc: time restore in restore benchmark This never worked.	2017-05-09 20:14:58 -07:00
Anthony Romano	163fd2d76b	mvcc: chunk reads for restoring Loading all keys at once would cause etcd to use twice as much memory than it would need to serve the keys, causing RSS to spike on boot. Instead, load the keys into the mvcc by chunk. Uses pipelining for some concurrency. Fixes #7822	2017-05-09 20:14:58 -07:00
fanmin shi	230106dd3c	backend: add prometheus metric for large snapshot duration. FIXES #7878	2017-05-05 17:27:33 -07:00
fanmin shi	f7f30f2361	backend: print snapshotting duration warning every 30s FIXES #7870	2017-05-04 16:41:03 -07:00
Anthony Romano	14d6ed9e5f	*: clear redundant return statement warnings (S1027)	2017-04-21 14:01:00 -07:00
Gyu-Ho Lee	5000d29b4a	mvcc: remove stopc select case in Hash Revert change in `33acbb694b`. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-04-17 14:19:48 -07:00
Gyu-Ho Lee	8ffd58fb3b	mvcc/backend: remove t.tx.DB()==nil checks with GracefulStop Revert https://github.com/coreos/etcd/pull/6662. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-04-17 14:17:00 -07:00
Gyu-Ho Lee	cd470f9ccd	Revert "mvcc: test inflight Hash to trigger Size on nil db" This reverts commit `994e8e4f40`. Since now etcdserver gracefully shuts down the gRPC server	2017-04-17 14:15:43 -07:00
Anthony Romano	78a5eb79b5	*: add swagger and grpc-gateway assets for v3lock and v3election	2017-04-10 15:21:07 -07:00
Anthony Romano	f67bdc2eed	*: support checking that an interval tree's keys cover an entire interval	2017-04-03 15:38:07 -07:00


222 Commits (5cb1e0b342f52a96cbe7f31667d98e999e167230)