Improved the overall performance more than 20% under heavyload
with little latency impact
heavy load
```
Requests/sec: ~23200 vs Requests/sec: ~31500
Latency distribution:
10% in 0.0883 secs.
25% in 0.1022 secs.
50% in 0.1207 secs.
75% in 0.1460 secs.
90% in 0.1647 secs.
95% in 0.1783 secs.
99% in 0.2223 secs.
vs
Latency distribution:
10% in 0.1119 secs.
25% in 0.1272 secs.
50% in 0.1469 secs.
75% in 0.1626 secs.
90% in 0.1765 secs.
95% in 0.1863 secs.
99% in 0.2276 secs.
```
Similar on light load too.
Database snapshot can be as large as 5GB. It is reasonable
to log before receiving it. Or the user might not know what
is happening and why etcd starts to use IO intensively.
The issue is not caused by this code, but by reading snapshot
from disk. etcd assumes the snapshot of v3 kv should live in
memory. If not, etcd does not work well anyway.
This moves the code to create listener and roundTripper for raft communication
to the same place, and use explicit functions to build them. This prevents
possible development errors in the future.
rafthttp logs repeated messages when amounts of message-drop logs
happen, and it becomes log spamming.
Use MergeLogger to merge log lines in this case.
The logging mechanism is verbose, so it is removed from peerStatus.
We would like to see the status change
of connection with peers, and one error that leads to deactivation.
There is no need to print out all non-repeated errors.
This pairs with remote timeout listeners.
etcd uses timeout listener, and times out the accepted connections
if there is no activity. So the idle connections may time out easily.
Becaus timeout transport doesn't reuse connections, it prevents using
timeouted connection.
This fixes the problem that etcd fail to get version of peers.
In rafthttp, when making request to some endpoint, it may receive
response with unexpected status code and header. This indicates the endpoint
doesn't function correctly. It should mark the endpoint unreachable.
pipeline.handle is a long-living one, and should continue to receive
next message to send out when current message fails to send. So it
should `continue` instead of `return` here.
When snapshot store requests raft snapshot from etcdserver apply loop,
it may block on the channel for some time, or wait some time for KV to
snapshot. This is unexpected because raft state machine should be unblocked.
Even worse, this block may lead to deadlock:
1. raft state machine waits on getting snapshot from raft memory storage
2. raft memory storage waits snapshot store to get snapshot
3. snapshot store requests raft snapshot from apply loop
4. apply loop is applying entries, and waits raftNode loop to finish
messages sending
5. raftNode loop waits peer loop in Transport to send out messages
6. peer loop in Transport waits for raft state machine to process message
Fix it by changing the logic of getSnap to be asynchronously creation.