182 Commits (f9c2d00fb3d0a63cf3a7d8c124ad87cd5d826253)

Author SHA1 Message Date
Manjunath A Kumatagi 89221a25b8 mvcc : Fix Govet errors 2018-01-25 02:30:37 -05:00
Iwasaki Yudai 0b1b82aff2 mvcc: check null before set FillPercent not to panic
Since CreateBucketIfNotExists() can return nil when it gets an error,
FillPercent must be accessed only after a nil check, to avoid
a panic.
2018-01-08 11:34:34 -08:00
Gyuho Lee 82a164e3b9 mvcc: make test struct fields unexported
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-31 13:20:41 -08:00
Connor Peet fc3b59046f mvcc: allow clients to assign watcher IDs
This allows for watchers to be created concurrently
without needing potentially complex and latency-adding
queuing on the client.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-31 13:20:40 -08:00
Gyuho Lee 76dd9d56a1 mvcc: clean-up godoc in key_index.go
Minor clean-up.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-18 13:20:00 -08:00
Gyuho Lee 2e95ace82b mvcc: fetch revisions with current revision, not 0, in HashByRev
It was getting revisions with "atRev==0", which makes
"available" from "keep" method always empty since
"walk" on "keyIndex" only returns true.

"available" should be populated with all revisions to be
kept if the compaction happens with the given revision.
But "available" was empty with "kvindex.Keep(0)",
since it is always the case that "rev.main > atRev==0".

Fix https://github.com/coreos/etcd/issues/9022.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-18 12:17:06 -08:00
Gyuho Lee bcd5390b35 *: regenerate protobuf, grpc-gateway
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-07 21:31:13 -08:00
Gyu-Ho Lee 9154b31bf3 mvcc: move 'keyi' define before holding locks
To make it consistent with other code paths.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-10-05 10:06:28 -07:00
Anthony Romano 4fa1dd196c *: make receiver names consistent 2017-09-12 03:54:04 -07:00
Gyu-Ho Lee f65aee0759 *: replace 'golang.org/x/net/context' with 'context'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-09-07 13:39:42 -07:00
Anthony Romano 9d79d5fe65 mvcc: don't allocate keys when computing Revisions 2017-08-31 13:23:23 -07:00
Anthony Romano be7d488982 mvcc: add range benchmark for fetching 100 keys 2017-08-31 13:23:23 -07:00
Anthony Romano 896447ed99 mvcc: only remove watch cancel after cancel completes
If Close() is called before Cancel()'s cancel() completes, the
watch channel will be closed while the watch is still in the
synced list. If there's an event, etcd will try to write to a
closed channel. Instead, remove the watch from the bookkeeping
structures only after cancel completes, so Close() will always
call it.

Fixes #8443
2017-08-28 17:06:33 -07:00
Anthony Romano bd53ae5680 mvcc: test concurrently closing watch streams and canceling watches
Triggers a race that causes a write to a closed watch stream channel.
2017-08-28 17:06:32 -07:00
Anthony Romano f58c0cfb66 mvcc: Revisions() method for index to avoid key allocation
Save another alloc on the one key path.
2017-08-21 11:30:02 -07:00
fengshaobao 00231050 13041c15ba mvcc: sending events after restore
Fixes: #8411
2017-08-21 10:32:49 -07:00
Anthony Romano 8b872196d0 backend: cache buckets in read tx
Saves an alloc and about 10% of Range() time.
2017-08-21 02:16:55 -07:00
Anthony Romano 10b65c97dd mvcc: benchmark Range() on a single key 2017-08-21 00:14:46 -07:00
Anthony Romano ccd1bb1780 mvcc: test keys gauge is reloaded correctly on restore 2017-08-10 09:21:39 -07:00
Anthony Romano 32866572bf mvcc: reset keys gauge on restore
Fixes #8388
2017-08-10 08:37:50 -07:00
fanmin shi df5a3d15ce mvcc: increase rev for TestHashKVWhenCompacting 2017-07-31 17:59:49 -07:00
fanmin shi bb86c327e2 mvcc: HashKV gets keep from kvindex.Keep 2017-07-31 17:59:49 -07:00
fanmin shi 4c2c5b0084 mvcc: add tests for Keep 2017-07-31 17:59:42 -07:00
fanmin shi 7b8fb3cf0a mvcc: add and implement Keep api to index
Keep finds all revisions to be kept for a Compaction at the given rev.
2017-07-31 14:04:03 -07:00
fanmin shi 451b062184 mvcc/backend: add TestBackendWritebackForEach to backend_test.go 2017-07-28 09:39:48 -07:00
fanmin shi 785deebd62 mvcc/backend: enforce ordering for UnsafeForEach in read_tx.go
This PR changes UnsafeForEach to traverse boltdb before the buffer.
This ordering guarantees that UnsafeForEach traverses in the same order
before and after the buffer is committed.
2017-07-28 09:30:23 -07:00
Xiang Li 2a348fb8e9 Merge pull request #8263 from fanminshi/hash_by_rev
api: hash by rev
2017-07-26 11:22:33 -07:00
fanmin shi 8609521ce2 mvcc: add TestHashKVWhenCompacting to kvstore_test 2017-07-26 09:48:29 -07:00
fanmin shi deca9879c2 mvcc: add HashByRev to kv.go
HashByRev computes the hash of all MVCC keys up to a given revision.
2017-07-25 17:00:46 -07:00
Joe Betz c06953ae08 mvcc: Add metric for count of db key revisions compacted.
When digging into etcd/boltdb "storage space exceeded" issues, this metric may help answer questions about if/when compactions occurred and how much data was freed.
2017-07-20 10:07:56 -07:00
Anthony Romano e9d096ae6b mvcc: don't allocate end revision while computing range
Use 'nil' since it's only reading a single key. Also preallocates
the result slice based on limit / number of revisions fetched.

Fixes #8208
2017-07-06 15:59:27 -07:00
Gyu-Ho Lee 870302afa6 mvcc/backend: enable 'NoFreelistSync' by default (linux)
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-05 16:10:04 -07:00
Gyu-Ho Lee 318e9c766f *: replace 'boltdb' import paths with 'coreos/bbolt'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-05 14:32:13 -07:00
Anthony Romano 522e75cb4f mvcc: use GaugeFunc metric to load db size when requested
Relying on mvcc to set the db size metric can cause it to
miss size changes when a txn commits after the last write
completes before a quiescent period. Instead, load the
db size on demand.

Fixes #8146
2017-06-21 23:58:37 -07:00
Anthony Romano 51a568aa81 mvcc: restore into tree index with one key index
Clobbering the mvcc kvindex with new keyIndexes for each restore
chunk would cause index corruption by dropping historical information.
2017-06-19 12:04:01 -07:00
Anthony Romano 02164874d9 mvcc: test restore and deletes with small chunk sizes 2017-06-19 12:04:01 -07:00
Anthony Romano 7f149d8fb6 mvcc: set db size metric on restore
Fixes #8080
2017-06-16 11:27:34 -07:00
Anthony Romano da48f1feaf mvcc: create TxnWrites from TxnRead with NewReadOnlyTxnWrite
Already used internally by mvcc, but needed by etcdserver txns.
2017-06-09 09:20:38 -07:00
Anthony Romano 300feea177 Merge pull request #8052 from heyitsanthony/watch-victim-test
mvcc: test watch victim/delay path
2017-06-08 11:10:33 -07:00
Anthony Romano 83b2ea2f60 mvcc: test watch victim/delay path
Current tests don't normally trigger the watch victim path because the
constants are too large; set the constants to small values and hammer
the store to cause watch delivery delays.
2017-06-07 17:02:00 -07:00
Anthony Romano 0352ce79b8 mvcc: count range/put/del operations for txns
Txns previously bumped only the txn counter; they now bump all operation
counters.
2017-06-07 16:53:50 -07:00
Anthony Romano fd71da47d1 mvcc: remove unused store.Equals function 2017-06-07 09:25:42 -07:00
Anthony Romano ef63abdf7f mvcc: don't use pointer for storeTxnRead in storeTxnWrite
Saves an allocation when creating a storeTxnWrite.
2017-06-06 09:51:57 -07:00
Anthony Romano 6846e49edf Merge pull request #7859 from heyitsanthony/cache-consistent-get
mvcc: cache consistent index
2017-05-26 10:52:53 -07:00
Anthony Romano ac4855e911 mvcc: benchmark ConsistentIndex 2017-05-26 09:49:40 -07:00
Anthony Romano 73dee0bec4 mvcc: cache consistentIndex
Called on every entry apply and boltdb requests aren't free.
2017-05-26 09:49:40 -07:00
Anthony Romano 0506f49f9e backend: don't hold boltdb read txn lock on cursor scanning
Large fetches hold the lock when they do not need to do so.
2017-05-26 09:28:08 -07:00
Anthony Romano 343a018361 Merge pull request #7900 from heyitsanthony/chunk-restore
mvcc: chunk reads for restoring
2017-05-26 09:21:59 -07:00
Anthony Romano 8516d8ccc5 backend: force initial mmap size to 0 for windows
boltdb on windows allocates a file with the full mmap size even if the
db is empty. Force the initial mmap size to 0 so there's no huge initial
db file on windows.

Fixes #7910
2017-05-12 14:34:07 -07:00
fanmin shi 8468b38631 backend: dynamically set snapshotWarningTimeout based on db size 2017-05-11 15:25:35 -07:00
Anthony Romano 1aca63e9e0 mvcc: time restore in restore benchmark
This never worked.
2017-05-09 20:14:58 -07:00
Anthony Romano 163fd2d76b mvcc: chunk reads for restoring
Loading all keys at once would cause etcd to use twice as much
memory as it would need to serve the keys, causing RSS to spike on
boot. Instead, load the keys into the mvcc by chunk. Uses pipelining
for some concurrency.

Fixes #7822
2017-05-09 20:14:58 -07:00
fanmin shi 230106dd3c backend: add prometheus metric for large snapshot duration.
FIXES #7878
2017-05-05 17:27:33 -07:00
fanmin shi f7f30f2361 backend: print snapshotting duration warning every 30s
FIXES #7870
2017-05-04 16:41:03 -07:00
Anthony Romano 14d6ed9e5f *: clear redundant return statement warnings (S1027) 2017-04-21 14:01:00 -07:00
Gyu-Ho Lee 5000d29b4a mvcc: remove stopc select case in Hash
Revert change in 33acbb694b.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:19:48 -07:00
Gyu-Ho Lee 8ffd58fb3b mvcc/backend: remove t.tx.DB()==nil checks with GracefulStop
Revert https://github.com/coreos/etcd/pull/6662.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:17:00 -07:00
Gyu-Ho Lee cd470f9ccd Revert "mvcc: test inflight Hash to trigger Size on nil db"
This reverts commit 994e8e4f40.

The test is no longer needed since etcdserver now gracefully shuts down the gRPC server.
2017-04-17 14:15:43 -07:00
Anthony Romano 78a5eb79b5 *: add swagger and grpc-gateway assets for v3lock and v3election 2017-04-10 15:21:07 -07:00
Anthony Romano f67bdc2eed *: support checking that an interval tree's keys cover an entire interval 2017-04-03 15:38:07 -07:00
Gyu-Ho Lee 161c7f6bdf Merge pull request #7579 from gyuho/fix-defrage
*: fix panic during defrag operation
2017-03-23 10:08:33 -07:00
Anthony Romano 7ef75e373a Merge pull request #7525 from heyitsanthony/big-backend
etcdserver, backend: configure mmap size based on quota
2017-03-23 10:06:00 -07:00
Gyu-Ho Lee 26abd25cd3 mvcc/backend: hold 'readTx.Lock' until completing bolt.Tx reset
Fix https://github.com/coreos/etcd/issues/7526.

When resetting `bolt.Tx` in the `defrag` and `batchTxBuffered.commit`
operations, we do not hold the `readTx` lock, so in-flight range
requests can trigger a panic in `mvcc.Range` paths. This fixes it by
moving the mutexes out and holding the lock while resetting the `readTx`.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-23 09:47:43 -07:00
Xiang Li 7698a2a546 Merge pull request #7553 from xiang90/fix_defrag
backend: add FillPercent option
2017-03-21 11:16:17 -07:00
Xiang 95870a21eb backend: add FillPercent option 2017-03-21 08:06:03 -07:00
Anthony Romano 8a3fee15a3 etcdserver, backend: only warn if exceeding max quota 2017-03-17 15:38:57 -07:00
Anthony Romano 5e4b008106 *: base initial mmap size on quota size 2017-03-17 15:38:49 -07:00
Anthony Romano 2f1542c06d *: use filepath.Join for files 2017-03-16 07:46:06 -07:00
Anthony Romano 33acbb694b mvcc: txns and r/w views
Clean-up of the mvcc interfaces to use txn interfaces instead of an id.

Adds support for concurrent read-only mvcc transactions.

Fixes #7083
2017-03-08 20:52:59 -08:00
Anthony Romano 8d438c2939 backend: readtx
ReadTxs are designed for read-only accesses to the backend using a
read-only boltDB transaction. Since BatchTx's are long-running
transactions, all writes to BatchTx will writeback to ReadTx, overlaying
the base read-only transaction.
2017-03-08 20:52:59 -08:00
Gyu-Ho Lee 3d75395875 *: remove never-used vars, minor lint fix
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:59:12 -08:00
Manjunath A Kumatagi 0914b8b707 test: Fix gosimple errors
Running the test script produces gosimple suggestions; this PR fixes the gosimple S1019 check failures:
raft/node_test.go:456:40: should use make([]raftpb.Entry, 1) instead (S1019)
raft/node_test.go:457:49: should use make([]raftpb.Entry, 1) instead (S1019)
raft/node_test.go:458:43: should use make([]raftpb.Message, 1) instead (S1019)

Refer https://github.com/dominikh/go-tools/blob/master/cmd/gosimple/README.md#checks for more information.
2017-02-09 08:01:28 -05:00
sharat 43078d3ced mvcc: remove unused restore method 2016-11-18 23:04:39 +05:30
sharat aa2b5aec1b mvcc : Added benchmark for store.restore 2016-11-17 04:01:15 +05:30
sharat f014cca644 mvcc: TestStoreRestore fix 2016-11-16 16:58:42 +05:30
sharat 95fb41a923 mvcc: store.restore taking too long triggering snapshot cycle fix 2016-11-16 16:31:20 +05:30
Gyu-Ho Lee b8b72f80f9 *: revendor, update proto files 2016-11-10 12:02:00 -08:00
Gyu-Ho Lee 425acb28c4 mvcc: return -1 for wrong watcher range key >= end
Fix https://github.com/coreos/etcd/issues/6819.
2016-11-08 17:02:28 -08:00
Gyu-Ho Lee 4e1d3f0f52 mvcc: expose 'backend.IgnoreKey' 2016-10-25 10:07:08 -07:00
Gyu-Ho Lee 994e8e4f40 mvcc: test inflight Hash to trigger Size on nil db 2016-10-21 11:02:09 -07:00
Gyu-Ho Lee 7d30326968 backend: skip *bolt.DB.Size call when nil
Fix https://github.com/coreos/etcdlabs/issues/30.
2016-10-21 11:01:23 -07:00
Gyu-Ho Lee 46716fe9fb mvcc: fix gofmt issues from Go tip 2016-10-20 16:32:47 -07:00
Xiang Li 93225ebafc mvcc: fix rev inconsistency
Try:

./etcdctl put foo bar
./etcdctl del foo
./etcdctl compact 3

restart etcd

./etcdctl get foo
mvcc: required revision has been compacted

The error is unexpected when ranging over the head revision.

Internally, we incorrectly set the current revision smaller than the
compacted revision when we remove all keys around the compacted revision.

This commit fixes the issue by recovering the current revision to at
least the compacted revision.
2016-10-12 10:42:57 -07:00
Nikita Vetoshkin 064e02f4b3 mvcc: Optimize updating key by storing lease in lessor 2016-10-12 09:37:09 +05:00
Nikita Vetoshkin 9970ded79f mvcc: add BenchmarkWatchableStoreTxnPut benchmark 2016-10-06 22:44:25 +05:00
Xiang Li fa1e28102e Merge pull request #5316 from ajityagaty/too_many_allocs
mvcc: Reduce number of allocs in PUT when watchableStore has no watchers.
2016-10-06 09:47:59 -07:00
Gyu-Ho Lee 9b56e51ca7 *: regenerate proto + gofmt change 2016-10-03 15:34:34 -07:00
Xiang Li 962433c17f *: set repo correctly for logging 2016-10-03 17:03:22 +08:00
ychen11 69f5b4ba79 Documentation:made watch request doc more clear 2016-09-23 23:13:55 +08:00
Xiang Li 1437388f77 mvcc: force commit and hash should be atomic for getting hash 2016-08-27 19:22:22 -07:00
Xiang Li e1789aa531 mvcc: only write txn should update index 2016-08-22 22:05:51 -07:00
Xiang Li de864d3b58 mvcc: fix count 2016-08-10 10:54:25 -07:00
Xiang Li bd62b0a646 mvcc: attach keys to leases after recover all state
The previous logic is wrong. When we have a history like Put(foo, bar, lease1)
and Put(foo, bar, lease2), we end up attaching foo to both leases 1 and
2. Similar things can happen for detach, when the lease of a key is cleared.

Now we fix this by attaching leases only at the end of the recovery.
We use a map to keep the last lease attachment state.
2016-08-04 11:17:58 -07:00
Gyu-Ho Lee 982e18d80b *: regenerate proto with latest grpc-gateway 2016-07-27 13:21:03 -07:00
Xiang Li fffa484a9f *: regenerate proto for adding deleterange 2016-07-23 16:17:44 -07:00
Gyu-Ho Lee 50be793f09 *: regenerate proto 2016-07-18 09:33:32 -07:00
Anthony Romano ba2725c2d0 build, backend: add backend commit failpoints 2016-07-14 12:26:35 -07:00
Xiang Li c853704ac9 *: support get-old-kv in watch 2016-07-05 16:17:09 -07:00
Xiang Li bc6d7659af Merge pull request #5795 from xiang90/filter
*: support watch with filters
2016-06-28 14:07:12 -07:00
Xiang Li dced92f8bd *: support watch with filters
Users can now filter events by type. The API is also extensible.
It might make sense for the proxy to filter out events based on
more expensive or customized filters.
2016-06-28 13:46:57 -07:00