Compare commits

2 Commits

Author SHA1 Message Date
Vitaliy Filippov 7492e0795d Add pve-qemu 9.1 patch 2024-12-19 11:36:18 +03:00
Vitaliy Filippov 6febd1e2cc Document NFS-RDMA 2024-12-19 11:27:06 +03:00
3 changed files with 218 additions and 10 deletions

@@ -111,6 +111,19 @@ settings, because Vitastor NFS proxy doesn't keep uncommitted data in memory
with these settings. But it may even work without `immediate_commit=all` because
the Linux NFS client repeats all uncommitted writes if it loses the connection.
## RDMA
vitastor-nfs supports NFS over RDMA. You can test it even if you don't have an RDMA
NIC by using SoftROCE:
1. First, add a SoftROCE device on both servers: `rdma link add rxe0 type rxe netdev eth0`.
Here, the `rdma` utility is part of the iproute2 package, and `eth0` should be replaced with
the name of your Ethernet NIC.
2. Start vitastor-nfs with RDMA: `vitastor-nfs start (--fs <NAME> | --block) --pool <POOL> --port 20049 --nfs_rdma 20049 --portmap 0`
3. Mount the FS: `mount 192.168.0.10:/mnt/test/ /mnt/vita/ -o port=20049,mountport=20049,nfsvers=3,soft,nolock,rdma`
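
The three steps above can be collected into a small dry-run script. This is only a sketch: `NETDEV`, `SERVER_IP`, `FS_NAME` and `POOL` are placeholder values, and the `run` and `nfs_rdma_mount_opts` helpers are hypothetical names introduced here for illustration. Commands are printed rather than executed unless `RUN=1` is set, because they need root privileges and real hosts.

```shell
# Placeholder values (assumptions); override them from the environment.
NETDEV="${NETDEV:-eth0}"
SERVER_IP="${SERVER_IP:-192.168.0.10}"
FS_NAME="${FS_NAME:-testfs}"
POOL="${POOL:-testpool}"
RDMA_PORT="${RDMA_PORT:-20049}"

# Print a command; execute it only when RUN=1.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Build the mount options from step 3 for a given port.
nfs_rdma_mount_opts() {
    echo "port=$1,mountport=$1,nfsvers=3,soft,nolock,rdma"
}

# 1. On both servers: create a SoftROCE device on top of the Ethernet NIC.
run rdma link add rxe0 type rxe netdev "$NETDEV"
# 2. On the server: start vitastor-nfs with TCP and RDMA on the same port number.
run vitastor-nfs start --fs "$FS_NAME" --pool "$POOL" \
    --port "$RDMA_PORT" --nfs_rdma "$RDMA_PORT" --portmap 0
# 3. On the client: mount the FS over RDMA.
run mount "$SERVER_IP:/mnt/test/" /mnt/vita/ -o "$(nfs_rdma_mount_opts "$RDMA_PORT")"
```

Review the printed commands first, then rerun with `RUN=1` on the appropriate hosts once the values match your setup.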
## Commands
### mount
@@ -132,10 +145,15 @@ The server will be automatically stopped when the FS is unmounted.
Start network NFS server. Options:
| <!-- --> | <!-- --> |
|------------------------|-----------------------------------------------------------------------------------------------------------------------------|
| `--bind <IP>`          | bind service to \<IP> address (default 0.0.0.0) |
| `--port <PORT>`        | use port \<PORT> for NFS services (default is 2049). Specify "auto" to auto-select and print port |
| `--portmap 0`          | do not listen on port 111 (portmap/rpcbind, requires root) |
| `--nfs_rdma <PORT>`    | enable NFS-RDMA at RDMA-CM port \<PORT> (you can try 20049). If RDMA is enabled and --port is set to 0, TCP will be disabled |
| `--nfs_rdma_credit 16` | maximum operation credit for RDMA clients (max iodepth) |
| `--nfs_rdma_send 1024` | maximum RDMA send operation count (should be larger than iodepth) |
| `--nfs_rdma_alloc 1M`  | RDMA memory allocation rounding |
| `--nfs_rdma_gc 64M`    | maximum unused RDMA buffers |
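
As an illustration of how these options combine, the sketch below assembles a command line for an RDMA-only server (`--port 0` together with `--nfs_rdma`, as described in the table). The `rdma_only_cmd` helper and the `testfs`/`testpool` names are hypothetical; the numeric values are simply the ones shown in the table, and the check reflects the note that the send count should be larger than the credit.

```shell
# Hypothetical helper: print a vitastor-nfs command line that serves NFS
# over RDMA only. Arguments: FS name, pool, credit (optional), send (optional).
rdma_only_cmd() {
    fs="$1"; pool="$2"; credit="${3:-16}"; send="${4:-1024}"
    # The table says the send count should be larger than the credit (iodepth).
    if [ "$send" -le "$credit" ]; then
        echo "nfs_rdma_send ($send) should be larger than nfs_rdma_credit ($credit)" >&2
        return 1
    fi
    echo "vitastor-nfs start --fs $fs --pool $pool --port 0 --portmap 0" \
         "--nfs_rdma 20049 --nfs_rdma_credit $credit --nfs_rdma_send $send"
}

# Print the command for placeholder FS and pool names.
rdma_only_cmd testfs testpool
```

The helper only prints the command, so it can be inspected before being run on a real server.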
### upgrade

@@ -116,6 +116,19 @@ JSON format :-). To inspect the DB contents
even without `immediate_commit=all`, because the Linux kernel NFS client repeats all
uncommitted requests if the connection is lost.
## RDMA
vitastor-nfs supports NFS over RDMA. You can test it even if you don't have an
RDMA card by using SoftROCE:
1. First, create SoftROCE devices on both test servers: `rdma link add rxe0 type rxe netdev eth0`.
The `rdma` utility is part of the iproute2 package, and you should replace `eth0` with the name of
your network card.
2. Start vitastor-nfs with RDMA: `vitastor-nfs start (--fs <NAME> | --block) --pool <POOL> --port 20049 --nfs_rdma 20049 --portmap 0`
3. Mount the FS: `mount 192.168.0.10:/mnt/test/ /mnt/vita/ -o port=20049,mountport=20049,nfsvers=3,soft,nolock,rdma`
## Commands
### mount
@@ -137,10 +150,15 @@ JSON format :-). To inspect the DB contents
Start the network NFS server. Options:
| <!-- --> | <!-- --> |
|------------------------|-----------------------------------------------------------------------------------------------------------------------------|
| `--bind <IP>`          | accept connections at address \<IP> (default 0.0.0.0, i.e. all addresses) |
| `--port <PORT>`        | use port \<PORT> for NFS services (default 2049). Specify "auto" to select and print a random port |
| `--portmap 0`          | disable the portmap/rpcbind service on port 111 (enabled by default and requires root privileges) |
| `--nfs_rdma <PORT>`    | enable NFS-RDMA on RDMA-CM port \<PORT> (try 20049). If RDMA is enabled and `--port 0` is specified, TCP will be disabled |
| `--nfs_rdma_credit 16` | maximum "credit", i.e. queue depth, for NFS clients |
| `--nfs_rdma_send 1024` | maximum number of RDMA send operations (should be larger than nfs_rdma_credit) |
| `--nfs_rdma_alloc 1M`  | memory allocation rounding for RDMA clients |
| `--nfs_rdma_gc 64M`    | maximum amount of unused memory held by an RDMA client before it is freed |
### upgrade

@@ -0,0 +1,172 @@
Index: pve-qemu-kvm-9.1.2/block/meson.build
===================================================================
--- pve-qemu-kvm-9.1.2.orig/block/meson.build
+++ pve-qemu-kvm-9.1.2/block/meson.build
@@ -126,6 +126,7 @@ foreach m : [
[libnfs, 'nfs', files('nfs.c')],
[libssh, 'ssh', files('ssh.c')],
[rbd, 'rbd', files('rbd.c')],
+ [vitastor, 'vitastor', files('vitastor.c')],
]
if m[0].found()
module_ss = ss.source_set()
Index: pve-qemu-kvm-9.1.2/meson.build
===================================================================
--- pve-qemu-kvm-9.1.2.orig/meson.build
+++ pve-qemu-kvm-9.1.2/meson.build
@@ -1516,6 +1516,26 @@ if not get_option('rbd').auto() or have_
endif
endif
+vitastor = not_found
+if not get_option('vitastor').auto() or have_block
+  libvitastor_client = cc.find_library('vitastor_client', has_headers: ['vitastor_c.h'],
+    required: get_option('vitastor'))
+  if libvitastor_client.found()
+    if cc.links('''
+      #include <vitastor_c.h>
+      int main(void) {
+        vitastor_c_create_qemu(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
+        return 0;
+      }''', dependencies: libvitastor_client)
+      vitastor = declare_dependency(dependencies: libvitastor_client)
+    elif get_option('vitastor').enabled()
+      error('could not link libvitastor_client')
+    else
+      warning('could not link libvitastor_client, disabling')
+    endif
+  endif
+endif
+
glusterfs = not_found
glusterfs_ftruncate_has_stat = false
glusterfs_iocb_has_stat = false
@@ -2367,6 +2387,7 @@ endif
config_host_data.set('CONFIG_OPENGL', opengl.found())
config_host_data.set('CONFIG_PLUGIN', get_option('plugins'))
config_host_data.set('CONFIG_RBD', rbd.found())
+config_host_data.set('CONFIG_VITASTOR', vitastor.found())
config_host_data.set('CONFIG_RDMA', rdma.found())
config_host_data.set('CONFIG_RELOCATABLE', get_option('relocatable'))
config_host_data.set('CONFIG_SAFESTACK', get_option('safe_stack'))
@@ -4534,6 +4555,7 @@ summary_info += {'fdt support': fd
summary_info += {'libcap-ng support': libcap_ng}
summary_info += {'bpf support': libbpf}
summary_info += {'rbd support': rbd}
+summary_info += {'vitastor support': vitastor}
summary_info += {'smartcard support': cacard}
summary_info += {'U2F support': u2f}
summary_info += {'libusb': libusb}
Index: pve-qemu-kvm-9.1.2/meson_options.txt
===================================================================
--- pve-qemu-kvm-9.1.2.orig/meson_options.txt
+++ pve-qemu-kvm-9.1.2/meson_options.txt
@@ -194,6 +194,8 @@ option('lzo', type : 'feature', value :
description: 'lzo compression support')
option('rbd', type : 'feature', value : 'auto',
description: 'Ceph block device driver')
+option('vitastor', type : 'feature', value : 'auto',
+ description: 'Vitastor block device driver')
option('opengl', type : 'feature', value : 'auto',
description: 'OpenGL support')
option('rdma', type : 'feature', value : 'auto',
Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
===================================================================
--- pve-qemu-kvm-9.1.2.orig/qapi/block-core.json
+++ pve-qemu-kvm-9.1.2/qapi/block-core.json
@@ -3477,7 +3477,7 @@
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
'pbs',
- 'ssh', 'throttle', 'vdi', 'vhdx',
+ 'ssh', 'throttle', 'vdi', 'vhdx', 'vitastor',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
@@ -4588,6 +4588,28 @@
'*server': ['InetSocketAddressBase'] } }
##
+# @BlockdevOptionsVitastor:
+#
+# Driver specific block device options for vitastor
+#
+# @image: Image name
+# @inode: Inode number
+# @pool: Pool ID
+# @size: Desired image size in bytes
+# @config-path: Path to Vitastor configuration
+# @etcd-host: etcd connection address(es)
+# @etcd-prefix: etcd key/value prefix
+##
+{ 'struct': 'BlockdevOptionsVitastor',
+  'data': { '*inode': 'uint64',
+            '*pool': 'uint64',
+            '*size': 'uint64',
+            '*image': 'str',
+            '*config-path': 'str',
+            '*etcd-host': 'str',
+            '*etcd-prefix': 'str' } }
+
+##
# @ReplicationMode:
#
# An enumeration of replication modes.
@@ -5050,6 +5072,7 @@
'throttle': 'BlockdevOptionsThrottle',
'vdi': 'BlockdevOptionsGenericFormat',
'vhdx': 'BlockdevOptionsGenericFormat',
+ 'vitastor': 'BlockdevOptionsVitastor',
'virtio-blk-vfio-pci':
{ 'type': 'BlockdevOptionsVirtioBlkVfioPci',
'if': 'CONFIG_BLKIO' },
@@ -5497,6 +5520,20 @@
'*encrypt' : 'RbdEncryptionCreateOptions' } }
##
+# @BlockdevCreateOptionsVitastor:
+#
+# Driver specific image creation options for Vitastor.
+#
+# @location: Where to store the new image file. This location cannot
+# point to a snapshot.
+#
+# @size: Size of the virtual disk in bytes
+##
+{ 'struct': 'BlockdevCreateOptionsVitastor',
+  'data': { 'location': 'BlockdevOptionsVitastor',
+            'size': 'size' } }
+
+##
# @BlockdevVmdkSubformat:
#
# Subformat options for VMDK images
@@ -5718,6 +5755,7 @@
'ssh': 'BlockdevCreateOptionsSsh',
'vdi': 'BlockdevCreateOptionsVdi',
'vhdx': 'BlockdevCreateOptionsVhdx',
+ 'vitastor': 'BlockdevCreateOptionsVitastor',
'vmdk': 'BlockdevCreateOptionsVmdk',
'vpc': 'BlockdevCreateOptionsVpc'
} }
Index: pve-qemu-kvm-9.1.2/scripts/meson-buildoptions.sh
===================================================================
--- pve-qemu-kvm-9.1.2.orig/scripts/meson-buildoptions.sh
+++ pve-qemu-kvm-9.1.2/scripts/meson-buildoptions.sh
@@ -168,6 +168,7 @@ meson_options_help() {
printf "%s\n" ' qga-vss build QGA VSS support (broken with MinGW)'
printf "%s\n" ' qpl Query Processing Library support'
printf "%s\n" ' rbd Ceph block device driver'
+ printf "%s\n" ' vitastor Vitastor block device driver'
printf "%s\n" ' rdma Enable RDMA-based migration'
printf "%s\n" ' replication replication support'
printf "%s\n" ' rutabaga-gfx rutabaga_gfx support'
@@ -444,6 +445,8 @@ _meson_option_parse() {
--disable-qpl) printf "%s" -Dqpl=disabled ;;
--enable-rbd) printf "%s" -Drbd=enabled ;;
--disable-rbd) printf "%s" -Drbd=disabled ;;
+ --enable-vitastor) printf "%s" -Dvitastor=enabled ;;
+ --disable-vitastor) printf "%s" -Dvitastor=disabled ;;
--enable-rdma) printf "%s" -Drdma=enabled ;;
--disable-rdma) printf "%s" -Drdma=disabled ;;
--enable-relocatable) printf "%s" -Drelocatable=true ;;
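
For orientation, here is a hedged sketch of how the pieces added by this patch might be used once QEMU is built with it. The build side is the `--enable-vitastor` configure switch from the hunk above. On the runtime side, the hypothetical `vitastor_blockdev` helper below renders a `-blockdev` JSON argument using the `vitastor` driver name and the `image` and `config-path` fields of `BlockdevOptionsVitastor`. The node name, image name and config path are placeholder values, not taken from the patch.

```shell
# Build side (shown as a comment, not executed here):
#   ./configure --enable-vitastor    # mapped to -Dvitastor=enabled by the patch

# Hypothetical helper: render a -blockdev JSON argument for the vitastor driver.
# Arguments: node name, image name, path to the Vitastor configuration file.
vitastor_blockdev() {
    node="$1"; image="$2"; conf="$3"
    printf '{"driver":"vitastor","node-name":"%s","image":"%s","config-path":"%s"}\n' \
        "$node" "$image" "$conf"
}

# Placeholder values (assumptions); the result would be passed to QEMU as
#   -blockdev "$(vitastor_blockdev disk0 testimg /etc/vitastor/vitastor.conf)"
vitastor_blockdev disk0 testimg /etc/vitastor/vitastor.conf
```

All fields of `BlockdevOptionsVitastor` are optional in the schema, so an image could presumably also be addressed by `pool`, `inode` and `size` instead of `image`; that variant is not shown here.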