5.8 KiB
Documentation → Configuration → Network Protocol Parameters
Network Protocol Parameters
These parameters apply to clients and OSDs and affect network connection logic between clients, OSDs and etcd.
- tcp_header_buffer_size
- use_sync_send_recv
- use_rdma
- rdma_device
- rdma_port_num
- rdma_gid_index
- rdma_mtu
- rdma_max_sge
- rdma_max_msg
- rdma_max_recv
- peer_connect_interval
- peer_connect_timeout
- osd_idle_timeout
- osd_ping_timeout
- up_wait_retry_interval
- max_etcd_attempts
- etcd_quick_timeout
- etcd_slow_timeout
- etcd_keepalive_timeout
- etcd_ws_keepalive_timeout
tcp_header_buffer_size
- Type: integer
- Default: 65536
Size of the buffer used to read data using an additional copy. Vitastor packet headers are 128 bytes, payload is always at least 4 KB, so it is usually beneficial to try to read multiple packets at once even though it requires to copy the data an additional time. The rest of each packet is received without an additional copy. You can try to play with this parameter and see how it affects random iops and linear bandwidth if you want.
use_sync_send_recv
- Type: boolean
- Default: false
If true, synchronous send/recv syscalls are used instead of io_uring for socket communication. Useless for OSDs because they require io_uring anyway, but may be required for clients with old kernel versions.
use_rdma
- Type: boolean
- Default: true
Try to use RDMA for communication if it's available. Disable if you don't want Vitastor to use RDMA. TCP-only clients can also talk to an RDMA-enabled cluster, so disabling RDMA may be needed if clients have RDMA devices, but they are not connected to the cluster.
rdma_device
- Type: string
RDMA device name to use for Vitastor OSD communications (for example,
"rocep5s0f0"). Please note that Vitastor RDMA requires Implicit On-Demand
Paging (Implicit ODP) and Scatter/Gather (SG) support from the RDMA device
to work. For example, Mellanox ConnectX-3 and older adapters don't have
Implicit ODP, so they're unsupported by Vitastor. Run ibv_devinfo -v
as
root to list available RDMA devices and their features.
rdma_port_num
- Type: integer
- Default: 1
RDMA device port number to use. Only for devices that have more than 1 port.
See phys_port_cnt
in ibv_devinfo -v
output to determine how many ports
your device has.
rdma_gid_index
- Type: integer
- Default: 0
Global address identifier index of the RDMA device to use. Different GID
indexes may correspond to different protocols like RoCEv1, RoCEv2 and iWARP.
Search for "GID" in ibv_devinfo -v
output to determine which GID index
you need.
IMPORTANT: If you want to use RoCEv2 (as recommended) then the correct rdma_gid_index is usually 1 (IPv6) or 3 (IPv4).
rdma_mtu
- Type: integer
- Default: 4096
RDMA Path MTU to use. Must be 1024, 2048 or 4096. There is usually no sense to change it from the default 4096.
rdma_max_sge
- Type: integer
- Default: 128
Maximum number of scatter/gather entries to use for RDMA. OSDs negotiate the actual value when establishing connection anyway, so it's usually not required to change this parameter.
rdma_max_msg
- Type: integer
- Default: 1048576
Maximum size of a single RDMA send or receive operation in bytes.
rdma_max_recv
- Type: integer
- Default: 8
Maximum number of parallel RDMA receive operations. Note that this number
of receive buffers rdma_max_msg
in size are allocated for each client,
so this setting actually affects memory usage. This is because RDMA receive
operations are (sadly) still not zero-copy in Vitastor. It may be fixed in
later versions.
peer_connect_interval
- Type: seconds
- Default: 5
- Minimum: 1
Interval before attempting to reconnect to an unavailable OSD.
peer_connect_timeout
- Type: seconds
- Default: 5
- Minimum: 1
Timeout for OSD connection attempts.
osd_idle_timeout
- Type: seconds
- Default: 5
- Minimum: 1
OSD connection inactivity time after which clients and other OSDs send keepalive requests to check state of the connection.
osd_ping_timeout
- Type: seconds
- Default: 5
- Minimum: 1
Maximum time to wait for OSD keepalive responses. If an OSD doesn't respond within this time, the connection to it is dropped and a reconnection attempt is scheduled.
up_wait_retry_interval
- Type: milliseconds
- Default: 500
- Minimum: 50
OSDs respond to clients with a special error code when they receive I/O requests for a PG that's not synchronized and started. This parameter sets the time for the clients to wait before re-attempting such I/O requests.
max_etcd_attempts
- Type: integer
- Default: 5
Maximum number of attempts for etcd requests which can't be retried indefinitely.
etcd_quick_timeout
- Type: milliseconds
- Default: 1000
Timeout for etcd requests which should complete quickly, like lease refresh.
etcd_slow_timeout
- Type: milliseconds
- Default: 5000
Timeout for etcd requests which are allowed to wait for some time.
etcd_keepalive_timeout
- Type: seconds
- Default: max(30, etcd_report_interval*2)
Timeout for etcd connection HTTP Keep-Alive. Should be higher than etcd_report_interval to guarantee that keepalive actually works.
etcd_ws_keepalive_timeout
- Type: seconds
- Default: 30
etcd websocket ping interval required to keep the connection alive and detect disconnections quickly.