The watch count metrics are not robust to duplicate cancellations. These
cause the count to be decremented twice, leading eventually to negative
counts. We are seeing this in production. The duplicate cancellations
themselves are not themselves a big problem (except performance), but
they are caused by the new proactive cancellation logic (#11850) which
cancels proactively even immediately before initiating a Close, thus
nearly guaranteeing a Close-cancel race, as discussed in
watchable_store.go. We can avoid this in most cases by not sending a
cancellation when we are going to Close.
This change makes the etcd package compatible with the existing Go
ecosystem for module versioning.
Used this tool to update package imports:
https://github.com/KSubedi/gomove
Method `kvsToEvents` can take a long time if a large number of events need to be synchronized. However, this method does not access any shared resources, so it no need to hold the tx lock when executing. So we can move it outside to reduce lock holding time.
This also happens without gRPC proxy.
Fix panic when gRPC proxy leader watcher is restored:
```
go test -v -tags cluster_proxy -cpu 4 -race -run TestV3WatchRestoreSnapshotUnsync
=== RUN TestV3WatchRestoreSnapshotUnsync
panic: watcher minimum revision 9223372036854775805 should not exceed current revision 16
goroutine 156 [running]:
github.com/coreos/etcd/mvcc.(*watcherGroup).chooseAll(0xc4202b8720, 0x10, 0xffffffffffffffff, 0x1)
/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:242 +0x3b5
github.com/coreos/etcd/mvcc.(*watcherGroup).choose(0xc4202b8720, 0x200, 0x10, 0xffffffffffffffff, 0xc420253378, 0xc420253378)
/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:225 +0x289
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchers(0xc4202b86e0, 0x0)
/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:340 +0x237
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchersLoop(0xc4202b86e0)
/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:214 +0x280
created by github.com/coreos/etcd/mvcc.newWatchableStore
/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:90 +0x477
exit status 2
FAIL github.com/coreos/etcd/integration 2.551s
```
gRPC proxy spawns a watcher with a key "proxy-namespace__lostleader"
and watch revision "int64(math.MaxInt64 - 2)" to detect leader loss.
But, when the partitioned node restores, this watcher triggers
panic with "watcher minimum revision ... should not exceed current ...".
This check was added a long time ago, by my PR, when there was no gRPC proxy:
https://github.com/coreos/etcd/pull/4043#discussion_r48457145
> we can remove this checking actually. it is impossible for a unsynced watching to have a future rev. or we should just panic here.
However, now it's possible that a unsynced watcher has a future
revision, when it was moved from a synced watcher group through
restore operation.
This PR adds "restore" flag to indicate that a watcher was moved
from the synced watcher group with restore operation. Otherwise,
the watcher with future revision in an unsynced watcher group
would still panic.
Example logs with future revision watcher from restore operation:
```
{"level":"info","ts":1527196358.9057755,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
{"level":"info","ts":1527196358.910349,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
```
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
If Close() is called before Cancel()'s cancel() completes, the
watch channel will be closed while the watch is still in the
synced list. If there's an event, etcd will try to write to a
closed channel. Instead, remove the watch from the bookkeeping
structures only after cancel completes, so Close() will always
call it.
Fixes#8443
Current tests don't normally trigger the watch victim path because the
constants are too large; set the constants to small values and hammer
the store to cause watch delivery delays.
Now user can filter events with types. The API is also extensible.
It might make sense for the proxy to filter out events based on
more expensive/customized filter.
Makes w.cur into w.minrev, the minimum revision for the next update, and
retries cancelation if the watcher isn't found (because it's being processed
by moveVictims).
Fixes: #5459
Instead of holding the store lock while doing a lot of work like when syncung
unsynced watchers, the work from a blocked synced notify can be reused and
dispatched without holding the store lock for long.