Compare commits
3 Commits
e44a0c2360...4c455642ef
| Author | SHA1 | Date |
|---|---|---|
| Vitaliy Filippov | 4c455642ef | |
| Vitaliy Filippov | 72b116c57b | |
| Vitaliy Filippov | 76bb603a79 | |
@@ -22,7 +22,7 @@ RUN set -e -x; \
     echo 'APT::Install-Suggests false;' >> /etc/apt/apt.conf

 RUN apt-get update
-RUN apt-get -y install fio liburing-dev libgoogle-perftools-dev devscripts libjerasure-dev cmake libibverbs-dev libisal-dev libnl-3-dev libnl-genl-3-dev curl
+RUN apt-get -y install fio liburing-dev libgoogle-perftools-dev devscripts libjerasure-dev cmake libibverbs-dev librdmacm-dev libisal-dev libnl-3-dev libnl-genl-3-dev curl
 RUN apt-get -y build-dep fio
 RUN apt-get --download-only source fio

@@ -36,6 +36,7 @@
 - [Clustered file system](../usage/nfs.en.md#vitastorfs)
 - [Experimental internal etcd replacement - antietcd](../config/monitor.en.md#use_antietcd)
 - [Built-in Prometheus metric exporter](../config/monitor.en.md#enable_prometheus)
+- [NFS RDMA support](../usage/nfs.en.md#rdma) (probably also usable for GPUDirect)

 ## Plugins and tools

@@ -38,6 +38,7 @@
 - [Clustered file system](../usage/nfs.ru.md#vitastorfs)
 - [Experimental built-in etcd replacement - antietcd](../config/monitor.ru.md#use_antietcd)
 - [Built-in Prometheus metric exporter](../config/monitor.ru.md#enable_prometheus)
+- [NFS RDMA support](../usage/nfs.ru.md#rdma) (probably also suitable for GPUDirect)

 ## Drivers and tools

@@ -111,6 +111,21 @@ settings, because Vitastor NFS proxy doesn't keep uncommitted data in memory
 with these settings. But it may even work without `immediate_commit=all` because
 the Linux NFS client repeats all uncommitted writes if it loses the connection.

+## RDMA
+
+vitastor-nfs supports NFS over RDMA, which, in theory, should also make it possible to use
+VitastorFS from GPUDirect.
+
+You can test NFS-RDMA even if you don't have an RDMA NIC by using SoftROCE:
+
+1. First, add a SoftROCE device on both servers: `rdma link add rxe0 type rxe netdev eth0`.
+   Here, the `rdma` utility is part of the iproute2 package, and `eth0` should be replaced with
+   the name of your Ethernet NIC.
+
+2. Start vitastor-nfs with RDMA: `vitastor-nfs start (--fs <NAME> | --block) --pool <POOL> --port 20049 --nfs_rdma 20049 --portmap 0`
+
+3. Mount the FS: `mount 192.168.0.10:/mnt/test/ /mnt/vita/ -o port=20049,mountport=20049,nfsvers=3,soft,nolock,rdma`
+
 ## Commands

 ### mount
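
The three steps documented in the RDMA section above can be sketched as a dry-run script. This is only an illustration under stated assumptions: `POOL` and `FS` are hypothetical placeholder names, the `run` helper and the `DRY_RUN` variable are inventions of this sketch, and the `--fs` variant of step 2 is chosen arbitrarily over `--block`. The commands and options themselves are copied from the documentation text.

```shell
#!/bin/sh
# Dry-run sketch of the three NFS-RDMA test steps described in the docs.
# Nothing is executed unless DRY_RUN=0; by default commands are only printed.
NETDEV="${NETDEV:-eth0}"          # Ethernet NIC to attach the SoftROCE device to
SERVER="${SERVER:-192.168.0.10}"  # address of the host running vitastor-nfs
POOL="${POOL:-testpool}"          # hypothetical pool name (placeholder)
FS="${FS:-testfs}"                # hypothetical file system name (placeholder)
PORT="${PORT:-20049}"             # used for both the NFS TCP port and the RDMA-CM port

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1 (on both servers): create the SoftROCE device.
add_softroce() {
    run rdma link add rxe0 type rxe netdev "$NETDEV"
}

# Step 2 (on the server): start vitastor-nfs with RDMA enabled.
start_server() {
    run vitastor-nfs start --fs "$FS" --pool "$POOL" \
        --port "$PORT" --nfs_rdma "$PORT" --portmap 0
}

# Step 3 (on the client): mount the file system over RDMA.
mount_client() {
    run mount "$SERVER:/mnt/test/" /mnt/vita/ \
        -o "port=$PORT,mountport=$PORT,nfsvers=3,soft,nolock,rdma"
}

add_softroce
start_server
mount_client
```

Running the script as-is only prints the three command lines, which makes it easy to review them before setting `DRY_RUN=0` on a real test host.
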
@@ -131,11 +146,16 @@ The server will be automatically stopped when the FS is unmounted.

 Start network NFS server. Options:

-| <!-- --> | <!-- --> |
-|-----------------|------------------------------------------------------------|
-| `--bind <IP>` | bind service to \<IP> address (default 0.0.0.0) |
-| `--port <PORT>` | use port \<PORT> for NFS services (default is 2049) |
-| `--portmap 0` | do not listen on port 111 (portmap/rpcbind, requires root) |
+| <!-- --> | <!-- --> |
+|------------------------|-----------------------------------------------------------------------------------------------------------------------------|
+| `--bind <IP>` | bind service to \<IP> address (default 0.0.0.0) |
+| `--port <PORT>` | use port \<PORT> for NFS services (default is 2049). Specify "auto" to auto-select and print port |
+| `--portmap 0` | do not listen on port 111 (portmap/rpcbind, requires root) |
+| `--nfs_rdma <PORT>` | enable NFS-RDMA at RDMA-CM port \<PORT> (you can try 20049). If RDMA is enabled and --port is set to 0, TCP will be disabled |
+| `--nfs_rdma_credit 16` | maximum operation credit for RDMA clients (max iodepth) |
+| `--nfs_rdma_send 1024` | maximum RDMA send operation count (should be larger than iodepth) |
+| `--nfs_rdma_alloc 1M` | RDMA memory allocation rounding |
+| `--nfs_rdma_gc 64M` | maximum unused RDMA buffers |

 ### upgrade

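
As an illustration of how the new options in the table above combine, here is a small sketch that prints a command line for an RDMA-only server. It relies on the table's statement that `--port 0` together with `--nfs_rdma` disables TCP. `POOL` and `FS` are hypothetical placeholders, and the numeric values are simply the ones shown in the table rows, not tuning recommendations.

```shell
#!/bin/sh
# Sketch: print a vitastor-nfs command line for an RDMA-only NFS server.
# POOL and FS are hypothetical placeholders; every flag is taken from the options table.
POOL="${POOL:-testpool}"
FS="${FS:-testfs}"

rdma_only_cmd() {
    # --port 0 combined with --nfs_rdma disables the TCP listener, per the table.
    # The remaining values are copied from the table rows.
    echo vitastor-nfs start --fs "$FS" --pool "$POOL" \
        --port 0 --portmap 0 \
        --nfs_rdma 20049 \
        --nfs_rdma_credit 16 \
        --nfs_rdma_send 1024 \
        --nfs_rdma_alloc 1M \
        --nfs_rdma_gc 64M
}

rdma_only_cmd
```

The function only echoes the command, so it can be inspected or adapted before being run by hand.
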
@@ -116,6 +116,21 @@ JSON format :-). To inspect the contents of the DB
 even without `immediate_commit=all`, because the Linux kernel NFS client repeats all
 uncommitted requests if it loses the connection.

+## RDMA
+
+vitastor-nfs supports NFS over RDMA. In theory, this should also make it possible to use
+VitastorFS from GPUDirect.
+
+You can test NFS-RDMA even if you don't have an RDMA NIC by using SoftROCE:
+
+1. First, create SoftROCE devices on both test servers: `rdma link add rxe0 type rxe netdev eth0`.
+   The `rdma` utility is part of the iproute2 package, and `eth0` should be replaced with the name of your
+   network card.
+
+2. Start vitastor-nfs with RDMA: `vitastor-nfs start (--fs <NAME> | --block) --pool <POOL> --port 20049 --nfs_rdma 20049 --portmap 0`
+
+3. Mount the FS: `mount 192.168.0.10:/mnt/test/ /mnt/vita/ -o port=20049,mountport=20049,nfsvers=3,soft,nolock,rdma`
+
 ## Commands

 ### mount
@@ -136,11 +151,16 @@ JSON format :-). To inspect the contents of the DB

 Start network NFS server. Options:

-| <!-- --> | <!-- --> |
-|-----------------|-----------------------------------------------------------------------|
-| `--bind <IP>` | accept connections on address \<IP> (default 0.0.0.0 - all addresses) |
-| `--port <PORT>` | use port \<PORT> for NFS services (default 2049) |
-| `--portmap 0` | disable the portmap/rpcbind service on port 111 (enabled by default, requires root privileges) |
+| <!-- --> | <!-- --> |
+|------------------------|-----------------------------------------------------------------------------------------------------------------------------|
+| `--bind <IP>` | accept connections on address \<IP> (default 0.0.0.0 - all addresses) |
+| `--port <PORT>` | use port \<PORT> for NFS services (default 2049). Specify "auto" to select and print a random port |
+| `--portmap 0` | disable the portmap/rpcbind service on port 111 (enabled by default, requires root privileges) |
+| `--nfs_rdma <PORT>` | enable NFS-RDMA on RDMA-CM port \<PORT> (try 20049). If RDMA is enabled and `--port 0` is specified, TCP will be disabled |
+| `--nfs_rdma_credit 16` | maximum "credit" (queue depth) for NFS clients |
+| `--nfs_rdma_send 1024` | maximum number of RDMA send operations (should be larger than nfs_rdma_credit) |
+| `--nfs_rdma_alloc 1M` | memory allocation rounding for RDMA clients |
+| `--nfs_rdma_gc 64M` | maximum amount of unused memory held by an RDMA client before it is freed |

 ### upgrade

@@ -0,0 +1,172 @@
+Index: pve-qemu-kvm-9.1.2/block/meson.build
+===================================================================
+--- pve-qemu-kvm-9.1.2.orig/block/meson.build
++++ pve-qemu-kvm-9.1.2/block/meson.build
+@@ -126,6 +126,7 @@ foreach m : [
+   [libnfs, 'nfs', files('nfs.c')],
+   [libssh, 'ssh', files('ssh.c')],
+   [rbd, 'rbd', files('rbd.c')],
++  [vitastor, 'vitastor', files('vitastor.c')],
+ ]
+   if m[0].found()
+     module_ss = ss.source_set()
+Index: pve-qemu-kvm-9.1.2/meson.build
+===================================================================
+--- pve-qemu-kvm-9.1.2.orig/meson.build
++++ pve-qemu-kvm-9.1.2/meson.build
+@@ -1516,6 +1516,26 @@ if not get_option('rbd').auto() or have_
+   endif
+ endif
+
++vitastor = not_found
++if not get_option('vitastor').auto() or have_block
++  libvitastor_client = cc.find_library('vitastor_client', has_headers: ['vitastor_c.h'],
++    required: get_option('vitastor'))
++  if libvitastor_client.found()
++    if cc.links('''
++      #include <vitastor_c.h>
++      int main(void) {
++        vitastor_c_create_qemu(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
++        return 0;
++      }''', dependencies: libvitastor_client)
++      vitastor = declare_dependency(dependencies: libvitastor_client)
++    elif get_option('vitastor').enabled()
++      error('could not link libvitastor_client')
++    else
++      warning('could not link libvitastor_client, disabling')
++    endif
++  endif
++endif
++
+ glusterfs = not_found
+ glusterfs_ftruncate_has_stat = false
+ glusterfs_iocb_has_stat = false
+@@ -2367,6 +2387,7 @@ endif
+ config_host_data.set('CONFIG_OPENGL', opengl.found())
+ config_host_data.set('CONFIG_PLUGIN', get_option('plugins'))
+ config_host_data.set('CONFIG_RBD', rbd.found())
++config_host_data.set('CONFIG_VITASTOR', vitastor.found())
+ config_host_data.set('CONFIG_RDMA', rdma.found())
+ config_host_data.set('CONFIG_RELOCATABLE', get_option('relocatable'))
+ config_host_data.set('CONFIG_SAFESTACK', get_option('safe_stack'))
+@@ -4534,6 +4555,7 @@ summary_info += {'fdt support': fd
+ summary_info += {'libcap-ng support': libcap_ng}
+ summary_info += {'bpf support': libbpf}
+ summary_info += {'rbd support': rbd}
++summary_info += {'vitastor support': vitastor}
+ summary_info += {'smartcard support': cacard}
+ summary_info += {'U2F support': u2f}
+ summary_info += {'libusb': libusb}
+Index: pve-qemu-kvm-9.1.2/meson_options.txt
+===================================================================
+--- pve-qemu-kvm-9.1.2.orig/meson_options.txt
++++ pve-qemu-kvm-9.1.2/meson_options.txt
+@@ -194,6 +194,8 @@ option('lzo', type : 'feature', value :
+        description: 'lzo compression support')
+ option('rbd', type : 'feature', value : 'auto',
+        description: 'Ceph block device driver')
++option('vitastor', type : 'feature', value : 'auto',
++       description: 'Vitastor block device driver')
+ option('opengl', type : 'feature', value : 'auto',
+        description: 'OpenGL support')
+ option('rdma', type : 'feature', value : 'auto',
+Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
+===================================================================
+--- pve-qemu-kvm-9.1.2.orig/qapi/block-core.json
++++ pve-qemu-kvm-9.1.2/qapi/block-core.json
+@@ -3477,7 +3477,7 @@
+             'raw', 'rbd',
+             { 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
+             'pbs',
+-            'ssh', 'throttle', 'vdi', 'vhdx',
++            'ssh', 'throttle', 'vdi', 'vhdx', 'vitastor',
+             { 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
+             { 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
+             { 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
+@@ -4588,6 +4588,28 @@
+             '*server': ['InetSocketAddressBase'] } }
+
+ ##
++# @BlockdevOptionsVitastor:
++#
++# Driver specific block device options for vitastor
++#
++# @image: Image name
++# @inode: Inode number
++# @pool: Pool ID
++# @size: Desired image size in bytes
++# @config-path: Path to Vitastor configuration
++# @etcd-host: etcd connection address(es)
++# @etcd-prefix: etcd key/value prefix
++##
++{ 'struct': 'BlockdevOptionsVitastor',
++  'data': { '*inode': 'uint64',
++            '*pool': 'uint64',
++            '*size': 'uint64',
++            '*image': 'str',
++            '*config-path': 'str',
++            '*etcd-host': 'str',
++            '*etcd-prefix': 'str' } }
++
++##
+ # @ReplicationMode:
+ #
+ # An enumeration of replication modes.
+@@ -5050,6 +5072,7 @@
+       'throttle': 'BlockdevOptionsThrottle',
+       'vdi': 'BlockdevOptionsGenericFormat',
+       'vhdx': 'BlockdevOptionsGenericFormat',
++      'vitastor': 'BlockdevOptionsVitastor',
+       'virtio-blk-vfio-pci':
+                     { 'type': 'BlockdevOptionsVirtioBlkVfioPci',
+                       'if': 'CONFIG_BLKIO' },
+@@ -5497,6 +5520,20 @@
+             '*encrypt' : 'RbdEncryptionCreateOptions' } }
+
+ ##
++# @BlockdevCreateOptionsVitastor:
++#
++# Driver specific image creation options for Vitastor.
++#
++# @location: Where to store the new image file. This location cannot
++#     point to a snapshot.
++#
++# @size: Size of the virtual disk in bytes
++##
++{ 'struct': 'BlockdevCreateOptionsVitastor',
++  'data': { 'location': 'BlockdevOptionsVitastor',
++            'size': 'size' } }
++
++##
+ # @BlockdevVmdkSubformat:
+ #
+ # Subformat options for VMDK images
+@@ -5718,6 +5755,7 @@
+       'ssh': 'BlockdevCreateOptionsSsh',
+       'vdi': 'BlockdevCreateOptionsVdi',
+       'vhdx': 'BlockdevCreateOptionsVhdx',
++      'vitastor': 'BlockdevCreateOptionsVitastor',
+       'vmdk': 'BlockdevCreateOptionsVmdk',
+       'vpc': 'BlockdevCreateOptionsVpc'
+   } }
+Index: pve-qemu-kvm-9.1.2/scripts/meson-buildoptions.sh
+===================================================================
+--- pve-qemu-kvm-9.1.2.orig/scripts/meson-buildoptions.sh
++++ pve-qemu-kvm-9.1.2/scripts/meson-buildoptions.sh
+@@ -168,6 +168,7 @@ meson_options_help() {
+   printf "%s\n" '  qga-vss         build QGA VSS support (broken with MinGW)'
+   printf "%s\n" '  qpl             Query Processing Library support'
+   printf "%s\n" '  rbd             Ceph block device driver'
++  printf "%s\n" '  vitastor        Vitastor block device driver'
+   printf "%s\n" '  rdma            Enable RDMA-based migration'
+   printf "%s\n" '  replication     replication support'
+   printf "%s\n" '  rutabaga-gfx    rutabaga_gfx support'
+@@ -444,6 +445,8 @@ _meson_option_parse() {
+     --disable-qpl) printf "%s" -Dqpl=disabled ;;
+     --enable-rbd) printf "%s" -Drbd=enabled ;;
+     --disable-rbd) printf "%s" -Drbd=disabled ;;
++    --enable-vitastor) printf "%s" -Dvitastor=enabled ;;
++    --disable-vitastor) printf "%s" -Dvitastor=disabled ;;
+     --enable-rdma) printf "%s" -Drdma=enabled ;;
+     --disable-rdma) printf "%s" -Drdma=disabled ;;
+     --enable-relocatable) printf "%s" -Drelocatable=true ;;
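
To make the `BlockdevOptionsVitastor` struct added to `qapi/block-core.json` in the patch above more concrete, here is a minimal sketch that prints a QMP `blockdev-add` command using the new `vitastor` driver. The image name, node name and configuration path are hypothetical placeholders chosen for illustration. Only the `image` and `config-path` fields from the schema are used, and whether that combination is sufficient depends on the driver code in `vitastor.c`, which is not part of this diff.

```shell
#!/bin/sh
# Sketch: print a QMP blockdev-add command for the 'vitastor' block driver.
# IMAGE, NODE and CONFIG are hypothetical placeholders, not values from the patch.
IMAGE="${IMAGE:-testimg}"
NODE="${NODE:-drive0}"
CONFIG="${CONFIG:-/etc/vitastor/vitastor.conf}"

qmp_blockdev_add() {
    # "driver" selects the union branch added by the patch;
    # "image" and "config-path" are optional fields of BlockdevOptionsVitastor.
    cat <<EOF
{ "execute": "blockdev-add",
  "arguments": { "driver": "vitastor",
                 "node-name": "$NODE",
                 "image": "$IMAGE",
                 "config-path": "$CONFIG" } }
EOF
}

qmp_blockdev_add
```

The printed JSON could then be sent to a QMP socket of a QEMU binary built with `--enable-vitastor`, the configure switch this patch adds to `scripts/meson-buildoptions.sh`.
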
@@ -15,7 +15,7 @@ BuildRequires: rh-nodejs12-npm
 BuildRequires: jerasure-devel
 BuildRequires: libisa-l-devel
 BuildRequires: gf-complete-devel
-BuildRequires: libibverbs-devel
+BuildRequires: rdma-core-devel
 BuildRequires: cmake3
 BuildRequires: libnl3-devel
 Requires: vitastor-osd = %{version}-%{release}
@@ -14,7 +14,7 @@ BuildRequires: nodejs >= 10
 BuildRequires: jerasure-devel
 BuildRequires: libisa-l-devel
 BuildRequires: gf-complete-devel
-BuildRequires: libibverbs-devel
+BuildRequires: rdma-core-devel
 BuildRequires: cmake
 BuildRequires: libnl3-devel
 Requires: vitastor-osd = %{version}-%{release}