Compare commits


142 Commits

Author SHA1 Message Date
0c89886374 Release 2.3.0
New features:

- Add a new kernel device mounting method: [ublk](https://vitastor.io/docs/usage/ublk.html).
  It's the fastest method for random IOPS, and it's on par with VDUSE for linear MB/s.
- Stop io_uring waits from being reported as iowait on kernels that support it (6.15+)
- Allow enforcing permissions at the VitastorFS NFS server side
- Add a qemu_file_mirror_path option to the config to allow tricking Veeam into working
- Speed up CRC32C calculation in the OSD by fixing a bug and enabling the AVX512 version
- Support QEMU 10
- Support Debian 13 Trixie and Proxmox 9.0
- Remove the dependency on system liburing in package builds (build it statically)

Bug fixes:

- Fix checksums NEVER BEING ENABLED in vitastor-disk prepare, even when explicitly requested :-D
- Use default uid and gid from NFS AUTH_SYS when creating files
- Fix object bitmaps possibly being corrupted in some rare cases with EC N+2+
- Avoid multiple inflight overwrites to meta blocks - fixes possible data corruption with
  one specific SSD model: Memblaze PBlaze5 910 (github #79)
- Fix a bug in antietcd which was leading to leases sometimes not expiring correctly
- Fix a bug in NFS where ".." entry had its `cookie` equal to 0 instead of 1 (github #78)
- Fix snapshots not being deleted during VM deletion in Proxmox plugin (github #85)
- Fix monitor not filtering OSDs by block size correctly
2025-08-25 21:39:19 +03:00
e79bef8751 Fix pve-qemu-10.0 patch
2025-08-25 21:38:18 +03:00
ad76f84e1c Fix compat.h for even older kernel headers 2025-08-25 21:38:04 +03:00
db827cb34c Add --enforce 1 to docs 2025-08-25 16:05:40 +03:00
e5c6d85ea1 Add Debian trixie to docs 2025-08-25 00:47:23 +03:00
6cc44c1f54 Add qemu_file_mirror_path docs 2025-08-25 00:45:52 +03:00
c20450c1f1 Add build.sh for Debian Trixie 2025-08-24 22:15:55 +03:00
db63e58b3d Add UBLK docs
2025-08-24 16:55:58 +03:00
31b7021330 Implement Vitastor ublk server 2025-08-24 16:55:58 +03:00
2ebe3a468c Mark all symbols hidden by default, export only required ones 2025-08-24 16:55:58 +03:00
9892fccfb0 Disable io_uring waits being reported as iowait 2025-08-24 16:55:58 +03:00
0be86a306d Remove old liburing version support as it's now included 2025-08-24 16:55:58 +03:00
d77a775948 Fix liburing include/compat for older kernels 2025-08-24 16:55:58 +03:00
8cc82bab39 Include liburing in static build 2025-08-24 16:55:58 +03:00
f9d5e33ddd Fix filtering OSDs by block size in monitor
2025-08-24 16:46:36 +03:00
1563932024
f83418d93e Fix snapshots not being deleted during VM deletion in Proxmox plugin (#85)
Co-authored-by: zhu.chengzhen <zhu.chengzhen@jingjiamicro.com>

I accept Vitastor CLA agreement: https://git.yourcmc.ru/vitalif/vitastor/src/branch/master/CLA-en.md
2025-08-24 16:28:12 +03:00
fbf14fb0cb Allow enforcing permissions at the server side
2025-08-24 16:20:19 +03:00
fb1c3e00f4 Use default uid and gid from NFS AUTH_SYS when creating files 2025-08-24 16:03:40 +03:00
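For context, AUTH_SYS credentials (RFC 5531 `authsys_parms`) carry the caller's uid, gid and supplementary gids in every RPC. A minimal sketch of using them as the default owner of a newly created file; the structure and function names are hypothetical, not the actual server code:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical representation of AUTH_SYS credentials (RFC 5531)
struct auth_sys_cred
{
    uint32_t stamp;
    std::string machinename;
    uint32_t uid, gid;           // effective uid/gid sent by the client
    std::vector<uint32_t> gids;  // supplementary groups
};

// When CREATE/MKDIR attributes don't specify an owner, fall back to the
// request's credentials instead of defaulting to root (0:0)
static void default_new_file_owner(const auth_sys_cred & cred,
    uint32_t & uid, uint32_t & gid)
{
    uid = cred.uid;
    gid = cred.gid;
}
```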
ston3lu
d8332171e9 According to the NFSv3 protocol, directory traversal starts with cookie 0: the "." entry comes first and its cookie should be 0, the ".." entry comes second and its cookie should be 1 2025-08-24 14:44:48 +03:00
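A minimal illustration of the corrected cookie numbering — "." first with cookie 0, ".." second with cookie 1 (it was 0 before the fix, github #78), real entries from 2. This is a hypothetical helper, not the actual NFS server code:

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct dir_entry
{
    std::string name;
    uint64_t cookie; // the client resumes traversal *after* this cookie
};

static std::vector<dir_entry> readdir_entries(const std::vector<std::string> & names)
{
    std::vector<dir_entry> out;
    out.push_back({ ".", 0 });
    out.push_back({ "..", 1 }); // was 0 before the fix
    uint64_t cookie = 2;
    for (const auto & n: names)
        out.push_back({ n, cookie++ });
    return out;
}
```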
c24cc9bf0b Bump antietcd version to 1.1.3
2025-08-23 17:45:56 +03:00
9f57c75acf unordered_map flush_versions 2025-08-23 17:45:56 +03:00
53b12641d1 Fix the fix :) 2025-08-23 17:45:56 +03:00
flynn.yang
5c5c8825dc Avoid multiple inflight overwrites to meta blocks
https://github.com/vitalif/vitastor/pull/83

By https://github.com/Flynn049

By submitting this pull request, I accept Vitastor CLA
2025-08-23 17:45:03 +03:00
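A simplified sketch of the serialization idea behind this fix — at most one in-flight write per metadata block, with later overwrites queued until the current one completes. Names and structure are illustrative, not the actual blockstore code:

```cpp
#include <cstdint>
#include <deque>
#include <functional>
#include <map>

struct meta_write_queue
{
    // block number -> overwrites waiting for the in-flight write to finish
    std::map<uint64_t, std::deque<std::function<void()>>> pending;

    // Returns true if the write may be submitted immediately
    bool try_submit(uint64_t block, std::function<void()> write)
    {
        auto it = pending.find(block);
        if (it != pending.end())
        {
            // A write to this block is already in flight: queue, don't race
            it->second.push_back(std::move(write));
            return false;
        }
        pending[block]; // mark the block as having an in-flight write
        return true;
    }

    // Call on write completion: release the block or start the next overwrite
    void complete(uint64_t block)
    {
        auto it = pending.find(block);
        if (it == pending.end())
            return;
        if (it->second.empty())
            pending.erase(it);
        else
        {
            auto next = std::move(it->second.front());
            it->second.pop_front();
            next(); // the block stays marked while this write is in flight
        }
    }
};
```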
3a261ac3fc Support QEMU 10 2025-08-18 10:51:04 +03:00
04514435de Fix checking config in qemu driver 2025-08-12 01:16:57 +03:00
07303020fc Fix enabling checksums in blockstore-disk O_o
2025-08-10 18:04:52 +03:00
feaf7a15cf Use the CRC32C implementation from ISA-L when available - enables the VPCLMULQDQ version with AVX512, which is ~2x faster
2025-08-10 18:03:47 +03:00
29dda5066f Stop calling cpuid repeatedly at runtime 2025-08-10 18:03:22 +03:00
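The general pattern behind this change — probe CPU features once and cache the result instead of executing CPUID on every CRC32C call — can be sketched as follows. This is illustrative code (plain SSE4.2 dispatch), not Vitastor's exact implementation:

```cpp
#include <cpuid.h>      // GCC/Clang: __get_cpuid(), bit_SSE4_2
#include <nmmintrin.h>  // _mm_crc32_u8
#include <cstddef>
#include <cstdint>

// Portable bitwise fallback (reflected CRC32C polynomial 0x82F63B78)
static uint32_t crc32c_sw(uint32_t crc, const uint8_t *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78 & (0 - (crc & 1)));
    }
    return ~crc;
}

// Hardware version, only called when SSE4.2 was detected
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(uint32_t crc, const uint8_t *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++)
        crc = _mm_crc32_u8(crc, buf[i]);
    return ~crc;
}

static bool detect_sse42()
{
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    __get_cpuid(1, &eax, &ebx, &ecx, &edx);
    return (ecx & bit_SSE4_2) != 0;
}

uint32_t crc32c(uint32_t crc, const uint8_t *buf, size_t len)
{
    // C++11 guarantees thread-safe one-time initialization here,
    // so CPUID runs once instead of on every call
    static const bool hw = detect_sse42();
    return hw ? crc32c_hw(crc, buf, len) : crc32c_sw(crc, buf, len);
}
```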
1de53ef7e6 Move crc32c_pad to util/crc32c.c
2025-08-10 18:03:17 +03:00
4793dbe9c3 Use set_immediate() in osd_flush to prevent stack overflows on repeated errors
2025-08-10 17:58:01 +03:00
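The idea of this fix, modeled with a toy event loop: a failed flush that retries by calling itself recurses once per error and can overflow the stack, while deferring the retry through a set_immediate()-style queue keeps the stack depth constant. set_immediate() below is a stand-in, not the actual osd_flush code:

```cpp
#include <functional>
#include <queue>

// Toy "run this callback on the next event loop iteration" queue
static std::queue<std::function<void()>> immediate_queue;

static void set_immediate(std::function<void()> cb)
{
    immediate_queue.push(std::move(cb));
}

static void osd_flush_attempt(int & errors_left)
{
    if (errors_left > 0)
    {
        errors_left--;
        // Bad: osd_flush_attempt(errors_left); -- unbounded recursion.
        // Good: re-enter via the event loop, with a fresh stack frame.
        set_immediate([&errors_left]() { osd_flush_attempt(errors_left); });
    }
}

int main()
{
    int errors_left = 1000000; // would overflow the stack if recursed
    osd_flush_attempt(errors_left);
    while (!immediate_queue.empty())
    {
        auto cb = std::move(immediate_queue.front());
        immediate_queue.pop();
        cb(); // each retry runs from the loop, stack depth stays constant
    }
    return 0;
}
```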
918ea34af2 Remove "Only allow to overwrite part of the bitmap" blockstore API feature
2025-08-10 17:57:04 +03:00
2db8184cd8 Fix bitmap calculation for EC N+1 with the new store and for EC N+2+ with the old store 2025-08-10 17:57:04 +03:00
0e964b3c8c Fix renaming from ddeb 2025-08-09 16:11:22 +03:00
1b9296ff6c Add a qemu_file_mirror_path option to the config to allow tricking Veeam
When qemu_file_mirror_path is set to "/dir/" in /etc/vitastor/vitastor.conf, the QEMU driver
returns "/dir/image_name" as the filename in qemu-img info and in QAPI to trick software
like Veeam, which expects file-based access. After that, vitastor-nfs can be mounted to that
directory to make backups work :-)
2025-08-06 01:15:54 +03:00
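For illustration, a minimal /etc/vitastor/vitastor.conf using this option might look like the following; the `etcd_address` value and the `/mnt/vitastor/` path are placeholders, not values from this commit:

```json
{
  "etcd_address": "10.0.0.1:2379",
  "qemu_file_mirror_path": "/mnt/vitastor/"
}
```

With this in place, `qemu-img info` would report `/mnt/vitastor/<image_name>` as the filename of a Vitastor image, and mounting vitastor-nfs at `/mnt/vitastor` makes that path actually resolvable for file-based tools.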
6bf136c199 Add a note about /etc/apt/preferences to docs 2025-08-05 11:25:06 +03:00
b529f77264 Release 2.2.3
- Support Ubuntu 24.04 Noble
- Fix clients hanging on online PG count change with in-flight operations
- Fix volume_size_info in PVE VitastorPlugin
- Implement bdrv_refresh_filename for the QEMU driver
- Fix RDMA-CM broken in 2.2.0
- Fix possible "invalid %N$ use detected" OSD crashes
- Fix vitastor-nfs build without RDMA
- Fix docker build by adding the forgotten apt/preferences
- Fix libfio_blockstore.so
- Fix "trigger event loop automatically" API version check in the QEMU driver
- Add Content-Type header for prometheus metrics
- Add a patch for libvirt 11.5
2025-07-30 10:50:02 +03:00
bf9519dcdc docker -i 2025-07-30 10:50:02 +03:00
4ba687738b Use archive for debian buster 2025-07-30 10:50:02 +03:00
8427f6fe46 Use the new build method for RPM packages too 2025-07-30 02:38:52 +03:00
efa6bc3e70 Fix docker pull commands in docs 2025-07-30 02:38:52 +03:00
da33e9b12d Add noble repository 2025-07-30 02:38:52 +03:00
Vitaliy Filippov
265127c1a7 Fix vitastor-nfs build without RDMA 2025-07-30 02:38:52 +03:00
2b30acfc1d Do not use podman and do not use the build command to build packages (we are not building a docker image) 2025-07-30 02:38:52 +03:00
7fbc38ef29 Add a build file for ubuntu 24.04 2025-07-29 02:51:59 +03:00
e5070e991a Add forgotten apt/preferences 2025-07-29 01:18:55 +03:00
625552c441 Fix RDMA-CM broken in 2.2.0
2025-07-29 00:14:05 +03:00
78c95c94f6 Use uint8_t* for BS buffers in OSD
2025-07-26 14:13:12 +03:00
488e20bf55 Remove BS_OP_SYNC_STAB_ALL from OSD
2025-07-26 14:13:12 +03:00
25d6281b3e Remove BS_OP_SYNC_STAB_ALL 2025-07-26 14:13:11 +03:00
1676e50b3a Fix debug trace for fio_bs 2025-07-26 14:12:48 +03:00
8049e3c14a Use OP_WRITE_STABLE in bs_fio 2025-07-26 14:12:48 +03:00
93a30efd86 Do not use numbered printf args
2025-07-26 14:11:15 +03:00
83fb121f36 Fix "Trigger event loop automatically" API version check
2025-07-20 17:14:22 +03:00
afc97b757b Implement bdrv_refresh_filename
2025-07-20 16:30:01 +03:00
68905cbf41 Fix online PG count change, add a test for it
2025-07-15 02:33:28 +03:00
3fff667f13 Add Content-Type header for prometheus metrics
2025-07-14 19:51:35 +03:00
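For reference, the Prometheus text exposition format has a registered media type that a /metrics endpoint should send explicitly instead of letting clients guess. A sketch of the header being built (illustrative C++, not the actual code of this change):

```cpp
#include <cstddef>
#include <string>

// Media type of the Prometheus text exposition format
static const std::string PROMETHEUS_CT = "text/plain; version=0.0.4; charset=utf-8";

// Build the response headers for a /metrics reply of a given body length
std::string metrics_headers(size_t body_len)
{
    return "HTTP/1.1 200 OK\r\n"
        "Content-Type: " + PROMETHEUS_CT + "\r\n"
        "Content-Length: " + std::to_string(body_len) + "\r\n\r\n";
}
```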
980aec1d9b Add patch for libvirt 11.5 2025-07-04 23:15:18 +03:00
f515fcce62 Fix volume_size_info in PVE VitastorPlugin 2025-06-18 12:38:09 +03:00
97bb809b54 Release 2.2.2
- Fix a bug introduced in 2.2.0: pg_locks weren't correctly disabled for pools without
  local_reads, which could lead to inactive pools during various operations
- Fix an old bug where OSDs could send sub-operations to incorrect peer OSDs when their
  connections were stopped and reestablished quickly; in 2.2.0 this usually led
  to "sequencing broken" messages in OSD logs
- Fix debug use_sync_send_recv mode
2025-06-07 12:56:48 +03:00
6022a61329 Decouple break_pg_locks from outbound OSD disconnections
2025-06-05 02:48:54 +03:00
a3c1996101 Do not accidentally clear incorrect osd_peer_fds entries
2025-06-05 02:22:13 +03:00
8d2a1f0297 Fix PG lock auto-enabling/auto-disabling in the default configuration 2025-06-05 02:22:01 +03:00
91cbc313c2 Change "on osd -123" logging to "on peer 123" for unknown connections 2025-06-05 02:22:01 +03:00
f0a025428e Postpone read/write handlers using timerfd in the debug use_sync_send_recv mode 2025-06-05 02:22:01 +03:00
67071158bd Cancel outbound operations only in the osd_client_t destructor
This is required to prevent disconnected peers from sometimes receiving messages
intended for other peers - stop_client was freeing the operations even though they
were still referenced by in-progress io_uring requests. This led to
OSDs sometimes receiving garbage and to "broken sequencing" errors in the logs, as the
memory was usually already reallocated for other operations
2025-06-05 02:09:41 +03:00
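A simplified model of the hazard described above — freeing an operation while an io_uring request still references its memory — and of why moving the cleanup to the destructor is safe. The types are illustrative, not the real osd_client_t:

```cpp
#include <vector>

struct osd_op_t
{
    std::vector<char> buffer; // memory an in-flight io_uring SQE points at
};

struct osd_client_t
{
    std::vector<osd_op_t*> outbound_ops; // submitted, CQE not yet received

    // Buggy shape: stop_client() freed operations while the kernel could
    // still complete requests referencing their memory - the allocator may
    // reuse it for another peer's message, which then arrives as garbage
    void stop_client_buggy()
    {
        for (auto op: outbound_ops)
            delete op; // use-after-free once the pending CQE fires
        outbound_ops.clear();
    }

    // Fixed shape: free only in the destructor, which runs after all
    // in-flight requests for this client were cancelled or completed
    ~osd_client_t()
    {
        for (auto op: outbound_ops)
            delete op;
    }
};
```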
cd028612c8 Use a separate osd_client_t::in_osd_num for inbound OSD connections 2025-06-05 02:09:41 +03:00
f390e73dae Log broken sequence numbers in "sequencing" errors 2025-06-05 02:09:41 +03:00
de2539c491 Correct Proxmox version 2025-06-03 01:56:09 +03:00
957a4fce7e Release 2.2.1
- Fix vitastor-disk purge broken after adding the "OSD is still running" check
- Fix iothreads hanging after adding zero-copy send support
- Fix enabling localized reads online (without restarting OSDs) in the default PG lock mode
2025-05-25 01:04:48 +03:00
f201ecdd51 Fix missing mutex unlock with zero-copy and iothreads O_o
2025-05-24 00:56:31 +03:00
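The bug above was a mutex left locked on one code path. As a general C++ note, RAII guards make this class of bug impossible; a sketch under assumed names, not the actual iothread code:

```cpp
#include <mutex>

std::mutex zerocopy_mutex; // assumed name, for illustration only

// The failure mode: an early return skips the manual unlock, the mutex
// stays held forever, and every other thread waiting on it hangs
void submit_buggy(bool fast_path)
{
    zerocopy_mutex.lock();
    if (fast_path)
        return; // oops - never unlocks
    // ... submit the write ...
    zerocopy_mutex.unlock();
}

// The RAII version releases the lock on every path, including early
// returns and exceptions
void submit_fixed(bool fast_path)
{
    std::lock_guard<std::mutex> lock(zerocopy_mutex);
    if (fast_path)
        return; // unlocked automatically here
    // ... submit the write ...
}
```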
4afb617f59 Also zero-init sqe
2025-05-23 21:18:37 +03:00
d3fde0569f Add a test with enabled iothreads
2025-05-23 21:05:18 +03:00
438b64f6c3 Allow enabling PG locks online when changing local_reads in pool configuration
2025-05-23 20:54:47 +03:00
2b0a802ea1 Fix iothreads sometimes hanging after adding zerocopy support
2025-05-23 20:54:03 +03:00
0dd49c1d67 Followup to "allow to purge running OSDs again"
2025-05-22 01:10:05 +03:00
410170db96 Add notes about VNPL in English 2025-05-20 02:12:49 +03:00
7d8523e0e5 Add more notes about VNPL in Russian 2025-05-19 02:41:34 +03:00
db915184c6 Allow to purge running OSDs again, as in 2.1.0 and earlier
All checks were successful
Test / test_rebalance_verify_ec_imm (push) Successful in 1m40s
Test / test_write_no_same (push) Successful in 8s
Test / test_switch_primary (push) Successful in 31s
Test / test_write (push) Successful in 32s
Test / test_write_xor (push) Successful in 36s
Test / test_heal_pg_size_2 (push) Successful in 2m14s
Test / test_heal_local_read (push) Successful in 2m18s
Test / test_heal_ec (push) Successful in 2m17s
Test / test_heal_antietcd (push) Successful in 2m19s
Test / test_heal_csum_32k_dmj (push) Successful in 2m21s
Test / test_heal_csum_32k_dj (push) Successful in 2m21s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_heal_csum_4k_dmj (push) Successful in 2m19s
Test / test_resize (push) Successful in 13s
Test / test_resize_auto (push) Successful in 9s
Test / test_osd_tags (push) Successful in 6s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 9s
Test / test_enospc_xor (push) Successful in 14s
Test / test_enospc_imm_xor (push) Successful in 14s
Test / test_scrub (push) Successful in 14s
Test / test_scrub_zero_osd_2 (push) Successful in 14s
Test / test_scrub_xor (push) Successful in 16s
Test / test_scrub_pg_size_3 (push) Successful in 12s
Test / test_heal_csum_4k_dj (push) Successful in 2m21s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 17s
Test / test_scrub_ec (push) Successful in 14s
Test / test_heal_csum_4k (push) Successful in 2m18s
Test / test_nfs (push) Successful in 10s
Test / test_enospc_imm (push) Successful in 11s
2025-05-11 13:59:28 +03:00
5ae6fea49c Add a note about local reads 2025-05-11 01:23:48 +03:00
95ec750b8c Release 2.2.0
All checks were successful
Test / test_rebalance_verify_ec_imm (push) Successful in 1m43s
Test / test_write_no_same (push) Successful in 7s
Test / test_switch_primary (push) Successful in 34s
Test / test_write (push) Successful in 31s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m16s
Test / test_heal_local_read (push) Successful in 2m17s
Test / test_heal_antietcd (push) Successful in 2m19s
Test / test_heal_csum_32k_dmj (push) Successful in 2m29s
Test / test_heal_csum_32k_dj (push) Successful in 2m29s
Test / test_heal_csum_32k (push) Successful in 2m22s
Test / test_heal_csum_4k_dmj (push) Successful in 2m30s
Test / test_resize (push) Successful in 16s
Test / test_resize_auto (push) Successful in 10s
Test / test_osd_tags (push) Successful in 11s
Test / test_snapshot_pool2 (push) Successful in 16s
Test / test_enospc (push) Successful in 12s
Test / test_enospc_xor (push) Successful in 13s
Test / test_enospc_imm (push) Successful in 11s
Test / test_enospc_imm_xor (push) Successful in 12s
Test / test_scrub (push) Successful in 14s
Test / test_scrub_zero_osd_2 (push) Successful in 14s
Test / test_scrub_xor (push) Successful in 14s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 16s
Test / test_scrub_ec (push) Successful in 18s
Test / test_nfs (push) Successful in 11s
Test / test_heal_csum_4k_dj (push) Successful in 2m20s
Test / test_heal_csum_4k (push) Successful in 2m23s
Test / test_heal_ec (push) Successful in 2m16s
New features:

- [Localized read support](https://vitastor.io/docs/config/pool.html#local_reads) for multi-datacenter setups.
- io_uring-based [zero-copy send support](https://vitastor.io/docs/config/network.html#min_zerocopy_send_size) -
  read the instruction carefully for optimal performance! A config sketch for both options follows this list.
- Improve and speed up data distribution, especially with very large hosts (100+ OSDs).
  Previously, PG optimization speed depended on the number of OSDs; now it depends only on
  the number of failure domains. Distribution over specific OSDs is now also more even and
  becomes strictly more even as you increase the number of PGs.
- Add a [very interesting instruction](https://vitastor.io/en/docs/usage/nfs.html#linux-nfs-write-size) to change NFS_MAX_FILE_IO_SIZE.
- Check operation sequencing and stop connections when it breaks; this should help catch some
  very rare RDMA packet loss problems.
- `vitastor-cli rm-osd` now refuses to remove OSDs which are still up and suggests using `vitastor-disk purge` instead.
- Allow removal of direntries referring to non-existent inodes in VitastorFS.
- Change default vitastor-etcd data dir to /var/lib/etcd/vitastor.
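
For reference, a minimal sketch of the two options highlighted above. This is an illustration, not a recommendation: the pool name `mypool`, the `nearest` read mode and the 32 KB threshold are assumptions - check the linked docs for the exact parameter names, value sets and formats.

```
# Enable localized reads for an existing pool (pool-level parameter;
# mode value is an assumption):
vitastor-cli modify-pool --local_reads nearest mypool

# Enable zero-copy sends for large messages in /etc/vitastor/vitastor.conf
# (network-level parameter, value in bytes; 32768 is illustrative):
#   { "min_zerocopy_send_size": 32768 }
```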

Bug fixes:

- Fix compatibility with ISA-L 2.31+. ⚠️ Very important: please upgrade Vitastor before upgrading ISA-L to 2.31+.
- Fix in-memory state cleanup for incomplete PGs.
- Fix monitor crash with non-existent node_placement nodes.
- Slightly speed up the `vitastor-kv dump` command by adding output buffering.
- Fix theoretically possible slowdowns in OSD sub-operation failure handling code.
- Fix very rare stack overflows in vitastor-kv.
- Fix a possible crash in VitastorFS during handling of file creation race condition.
- Fix `modify-pool -s PG_SIZE`, which didn't work without `--pg_minsize` (see the command sketch after this list).
- Fix marking peer OSDs as alive on receiving data from them via RDMA - in theory,
  the bug could result in instability with RDMA under high load with slow disks.
- Fix a rare OSD crash due to double handle_primary_subop() call.
- Fix latency aggregation in global stats (/vitastor/stats in etcd) - do not sum it.
- Hide "Ran out of journal space" log messages by default.
- Wait for RDMA-CM EVENT_ESTABLISHED after rdma_accept(), handle rdma_accept() before acking the event.
- Fix VitastorFS total & free space numbers being multiplied by an extra 2.
- Fix systemd unit name in make-etcd.
- Do not allow reweight > 1 in vitastor-cli modify-osd.
- Fix docker build.
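
For reference, a hedged sketch of the two CLI invocations affected by the fixes above; the pool name, OSD number and values are illustrative assumptions.

```
# Changing pg_size alone now works without also passing --pg_minsize:
vitastor-cli modify-pool -s 3 mypool

# Reweights above 1 are now rejected; only values between 0 and 1 are accepted:
vitastor-cli modify-osd --reweight 0.5 5
```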
2025-05-11 00:26:08 +03:00
90b1de307b Support local reads in client
All checks were successful
Test / test_write_no_same (push) Successful in 9s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m51s
Test / test_write (push) Successful in 34s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m17s
Test / test_heal_local_read (push) Successful in 2m19s
Test / test_heal_ec (push) Successful in 2m19s
Test / test_heal_antietcd (push) Successful in 2m18s
Test / test_heal_csum_32k_dmj (push) Successful in 2m20s
Test / test_heal_csum_32k_dj (push) Successful in 2m20s
Test / test_heal_csum_32k (push) Successful in 2m17s
Test / test_heal_csum_4k_dmj (push) Successful in 2m19s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 14s
Test / test_osd_tags (push) Successful in 7s
Test / test_snapshot_pool2 (push) Successful in 14s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 14s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 15s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 17s
Test / test_scrub_xor (push) Successful in 16s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 16s
Test / test_nfs (push) Successful in 11s
Test / test_scrub_ec (push) Successful in 15s
Test / test_heal_csum_4k_dj (push) Successful in 2m20s
Test / test_heal_csum_4k (push) Successful in 2m19s
Test / test_rebalance_verify (push) Successful in 1m32s
2025-05-10 16:42:02 +03:00
7e6a95c678 Support primary-reads from clean replicated PGs on secondary OSDs 2025-05-10 16:42:02 +03:00
b2416afb28 Lock PGs on secondary OSDs to allow local reads and guarantee splitbrain prevention 2025-05-10 15:18:00 +03:00
66dc116f60 Cleanup PG_INCOMPLETE peering states
All checks were successful
Test / test_root_node (push) Successful in 7s
Test / test_switch_primary (push) Successful in 33s
Test / test_write (push) Successful in 30s
Test / test_write_no_same (push) Successful in 9s
Test / test_write_xor (push) Successful in 33s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m35s
Test / test_heal_pg_size_2 (push) Successful in 2m15s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_antietcd (push) Successful in 2m16s
Test / test_heal_csum_32k_dmj (push) Successful in 2m21s
Test / test_heal_csum_32k_dj (push) Successful in 2m26s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_resize (push) Successful in 14s
Test / test_resize_auto (push) Successful in 7s
Test / test_snapshot_pool2 (push) Successful in 14s
Test / test_osd_tags (push) Successful in 6s
Test / test_heal_csum_4k_dmj (push) Successful in 2m20s
Test / test_enospc (push) Successful in 11s
Test / test_enospc_xor (push) Successful in 14s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 14s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 13s
Test / test_heal_csum_4k_dj (push) Successful in 2m17s
Test / test_scrub_pg_size_3 (push) Successful in 15s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 16s
Test / test_scrub_ec (push) Successful in 15s
Test / test_nfs (push) Successful in 13s
Test / test_heal_csum_4k (push) Successful in 2m21s
2025-05-10 02:51:26 +03:00
0cb8629ab6 Remove finish_stop_pg shortcut
All checks were successful
Test / test_rebalance_verify_imm (push) Successful in 1m32s
Test / test_rebalance_verify_ec (push) Successful in 1m37s
Test / test_write (push) Successful in 33s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m38s
Test / test_write_no_same (push) Successful in 8s
Test / test_write_xor (push) Successful in 35s
Test / test_heal_pg_size_2 (push) Successful in 2m16s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_antietcd (push) Successful in 2m18s
Test / test_heal_csum_32k_dmj (push) Successful in 2m17s
Test / test_heal_csum_32k_dj (push) Successful in 2m21s
Test / test_heal_csum_32k (push) Successful in 2m19s
Test / test_heal_csum_4k_dmj (push) Successful in 2m21s
Test / test_resize (push) Successful in 11s
Test / test_resize_auto (push) Successful in 8s
Test / test_osd_tags (push) Successful in 6s
Test / test_snapshot_pool2 (push) Successful in 14s
Test / test_heal_csum_4k_dj (push) Successful in 2m29s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 13s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 13s
Test / test_scrub (push) Successful in 12s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 13s
Test / test_scrub_pg_size_3 (push) Successful in 15s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 15s
Test / test_scrub_ec (push) Successful in 14s
Test / test_nfs (push) Successful in 10s
Test / test_heal_csum_4k (push) Successful in 2m18s
2025-05-08 16:14:35 +03:00
b7322a405a Move gethostname_str to utils
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m35s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m33s
Test / test_write_no_same (push) Successful in 9s
Test / test_write (push) Successful in 30s
Test / test_switch_primary (push) Successful in 34s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m16s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_antietcd (push) Successful in 2m18s
Test / test_heal_csum_32k_dmj (push) Successful in 2m21s
Test / test_heal_csum_32k_dj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m18s
Test / test_heal_csum_4k_dmj (push) Successful in 2m18s
Test / test_heal_csum_4k_dj (push) Successful in 2m19s
Test / test_resize_auto (push) Successful in 9s
Test / test_resize (push) Successful in 14s
Test / test_osd_tags (push) Successful in 7s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 13s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 14s
Test / test_scrub_xor (push) Successful in 16s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 18s
Test / test_scrub_ec (push) Successful in 15s
Test / test_nfs (push) Successful in 11s
Test / test_heal_csum_4k (push) Successful in 2m15s
2025-05-05 02:16:06 +03:00
5692630005 Move check_sequencing indication into config response features subkey
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m33s
Test / test_switch_primary (push) Successful in 34s
Test / test_write (push) Successful in 32s
Test / test_write_no_same (push) Successful in 9s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m33s
Test / test_write_xor (push) Successful in 36s
Test / test_heal_pg_size_2 (push) Successful in 2m17s
Test / test_heal_ec (push) Successful in 2m18s
Test / test_heal_antietcd (push) Successful in 2m18s
Test / test_heal_csum_32k_dmj (push) Successful in 2m18s
Test / test_heal_csum_32k_dj (push) Successful in 2m19s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_heal_csum_4k_dmj (push) Successful in 2m17s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 12s
Test / test_osd_tags (push) Successful in 9s
Test / test_heal_csum_4k_dj (push) Successful in 2m17s
Test / test_snapshot_pool2 (push) Successful in 14s
Test / test_enospc (push) Successful in 12s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm (push) Successful in 12s
Test / test_enospc_imm_xor (push) Successful in 14s
Test / test_scrub (push) Successful in 14s
Test / test_scrub_zero_osd_2 (push) Successful in 14s
Test / test_scrub_xor (push) Successful in 13s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 15s
Test / test_nfs (push) Successful in 11s
Test / test_scrub_ec (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m23s
2025-05-05 02:07:50 +03:00
00ced7cea7 Fix monitor crash with non-existing node_placement nodes 2025-05-05 02:05:04 +03:00
ebdb75e287 Fix typo in docker docs
All checks were successful
Test / test_rebalance_verify_ec_imm (push) Successful in 1m46s
Test / test_switch_primary (push) Successful in 31s
Test / test_write_no_same (push) Successful in 8s
Test / test_write (push) Successful in 34s
Test / test_write_xor (push) Successful in 35s
Test / test_heal_pg_size_2 (push) Successful in 2m15s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_antietcd (push) Successful in 2m18s
Test / test_heal_csum_32k_dmj (push) Successful in 2m19s
Test / test_heal_csum_32k_dj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_heal_csum_4k_dmj (push) Successful in 2m19s
Test / test_heal_csum_4k_dj (push) Successful in 2m20s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 13s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_osd_tags (push) Successful in 9s
Test / test_enospc (push) Successful in 12s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm (push) Successful in 11s
Test / test_enospc_imm_xor (push) Successful in 13s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 15s
Test / test_scrub_xor (push) Successful in 15s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 16s
Test / test_scrub_ec (push) Successful in 15s
Test / test_nfs (push) Successful in 13s
Test / test_heal_csum_4k (push) Successful in 2m25s
Test / test_rebalance_verify (push) Successful in 1m33s
2025-05-04 18:09:33 +03:00
f397fe9c6a Add compatibility with ISA-L 2.31+ 2025-05-04 18:09:33 +03:00
28560b4ae5 Write K/V listings in buffered manner
All checks were successful
Test / test_dd (push) Successful in 13s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m39s
Test / test_write_no_same (push) Successful in 9s
Test / test_switch_primary (push) Successful in 33s
Test / test_write (push) Successful in 33s
Test / test_write_xor (push) Successful in 36s
Test / test_heal_pg_size_2 (push) Successful in 2m14s
Test / test_heal_ec (push) Successful in 2m19s
Test / test_heal_antietcd (push) Successful in 2m18s
Test / test_heal_csum_32k_dmj (push) Successful in 2m19s
Test / test_heal_csum_32k_dj (push) Successful in 2m19s
Test / test_heal_csum_32k (push) Successful in 2m21s
Test / test_heal_csum_4k_dmj (push) Successful in 2m19s
Test / test_heal_csum_4k_dj (push) Successful in 2m21s
Test / test_resize_auto (push) Successful in 9s
Test / test_resize (push) Successful in 13s
Test / test_osd_tags (push) Successful in 9s
Test / test_snapshot_pool2 (push) Successful in 16s
Test / test_enospc (push) Successful in 11s
Test / test_enospc_xor (push) Successful in 14s
Test / test_enospc_imm (push) Successful in 12s
Test / test_enospc_imm_xor (push) Successful in 13s
Test / test_scrub (push) Successful in 15s
Test / test_scrub_zero_osd_2 (push) Successful in 14s
Test / test_scrub_xor (push) Successful in 17s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 17s
Test / test_scrub_ec (push) Successful in 16s
Test / test_nfs (push) Successful in 12s
Test / test_heal_csum_4k (push) Successful in 2m19s
2025-05-03 15:06:59 +03:00
2d07449e74 Postpone cb() to set_immediate() to prevent stack overflows in kv_db 2025-05-03 15:06:59 +03:00
80c4e8c20f Add missing wakeup in ringloop->set_immediate to prevent slowdowns in code using set_immediate
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m48s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m45s
Test / test_write_no_same (push) Successful in 7s
Test / test_write (push) Successful in 30s
Test / test_switch_primary (push) Successful in 33s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m17s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_csum_32k_dmj (push) Successful in 2m18s
Test / test_heal_csum_32k (push) Successful in 2m19s
Test / test_heal_csum_32k_dj (push) Successful in 2m25s
Test / test_heal_csum_4k_dmj (push) Successful in 2m17s
Test / test_heal_csum_4k_dj (push) Successful in 2m18s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 12s
Test / test_osd_tags (push) Successful in 7s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 11s
Test / test_enospc_imm (push) Successful in 11s
Test / test_enospc_imm_xor (push) Successful in 12s
Test / test_scrub (push) Successful in 12s
Test / test_scrub_zero_osd_2 (push) Successful in 14s
Test / test_scrub_xor (push) Successful in 13s
Test / test_scrub_pg_size_3 (push) Successful in 13s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 16s
Test / test_scrub_ec (push) Successful in 14s
Test / test_nfs (push) Successful in 10s
Test / test_heal_csum_4k (push) Successful in 2m25s
Test / test_heal_antietcd (push) Successful in 2m39s
2025-05-03 14:40:48 +03:00
2ab0ae3bc9 Check operation sequencing and stop clients when it breaks
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m33s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m35s
Test / test_write_no_same (push) Successful in 7s
Test / test_switch_primary (push) Successful in 32s
Test / test_write (push) Successful in 30s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m16s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_antietcd (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Successful in 2m25s
Test / test_heal_csum_32k_dj (push) Successful in 2m18s
Test / test_heal_csum_32k (push) Successful in 2m16s
Test / test_heal_csum_4k_dmj (push) Successful in 2m19s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 12s
Test / test_heal_csum_4k_dj (push) Successful in 2m16s
Test / test_osd_tags (push) Successful in 7s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 13s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 12s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 13s
Test / test_scrub_pg_size_3 (push) Successful in 13s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 15s
Test / test_scrub_ec (push) Successful in 14s
Test / test_nfs (push) Successful in 11s
Test / test_heal_csum_4k (push) Successful in 2m17s
2025-05-02 17:01:50 +03:00
05e59c1b4f Fix MSG_WAITALL assertion added in the zero-copy patch
All checks were successful
Test / test_switch_primary (push) Successful in 33s
Test / test_write (push) Successful in 36s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m59s
Test / test_write_no_same (push) Successful in 10s
Test / test_write_xor (push) Successful in 39s
Test / test_heal_pg_size_2 (push) Successful in 2m16s
Test / test_heal_ec (push) Successful in 2m18s
Test / test_heal_antietcd (push) Successful in 2m16s
Test / test_heal_csum_32k_dmj (push) Successful in 2m17s
Test / test_rebalance_verify (push) Successful in 1m28s
Test / test_heal_csum_32k_dj (push) Successful in 2m20s
Test / test_heal_csum_4k_dmj (push) Successful in 2m16s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_resize (push) Successful in 13s
Test / test_osd_tags (push) Successful in 6s
Test / test_snapshot_pool2 (push) Successful in 15s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 12s
Test / test_scrub (push) Successful in 12s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 14s
Test / test_heal_csum_4k_dj (push) Successful in 2m19s
Test / test_heal_csum_4k (push) Successful in 2m18s
Test / test_resize_auto (push) Successful in 9s
Test / test_scrub_pg_size_3 (push) Successful in 13s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 15s
Test / test_scrub_ec (push) Successful in 16s
Test / test_nfs (push) Successful in 12s
2025-05-02 17:01:43 +03:00
e6e1c5b962 Check if OSDs are still up in rm-osd
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m49s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m47s
Test / test_write_no_same (push) Successful in 8s
Test / test_switch_primary (push) Successful in 33s
Test / test_write (push) Successful in 34s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m17s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_antietcd (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Successful in 2m19s
Test / test_heal_csum_32k_dj (push) Successful in 2m17s
Test / test_heal_csum_32k (push) Successful in 2m17s
Test / test_heal_csum_4k_dmj (push) Successful in 2m18s
Test / test_heal_csum_4k_dj (push) Successful in 2m17s
Test / test_resize (push) Successful in 13s
Test / test_resize_auto (push) Successful in 8s
Test / test_osd_tags (push) Successful in 7s
Test / test_snapshot_pool2 (push) Successful in 15s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 11s
Test / test_enospc_imm (push) Successful in 11s
Test / test_enospc_imm_xor (push) Successful in 12s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 13s
Test / test_scrub_pg_size_3 (push) Successful in 15s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 17s
Test / test_scrub_ec (push) Successful in 16s
Test / test_nfs (push) Successful in 11s
Test / test_heal_csum_4k (push) Successful in 2m19s
2025-05-02 13:05:59 +03:00
9556eeae45 Implement io_uring zero-copy send support
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m47s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m46s
Test / test_write_no_same (push) Successful in 8s
Test / test_switch_primary (push) Successful in 32s
Test / test_write (push) Successful in 32s
Test / test_write_xor (push) Successful in 35s
Test / test_heal_pg_size_2 (push) Successful in 2m17s
Test / test_heal_ec (push) Successful in 2m17s
Test / test_heal_antietcd (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Successful in 2m17s
Test / test_heal_csum_32k_dj (push) Successful in 2m18s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_heal_csum_4k_dmj (push) Successful in 2m18s
Test / test_heal_csum_4k_dj (push) Successful in 2m18s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 12s
Test / test_osd_tags (push) Successful in 7s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 13s
Test / test_scrub (push) Successful in 15s
Test / test_scrub_zero_osd_2 (push) Successful in 14s
Test / test_scrub_xor (push) Successful in 15s
Test / test_scrub_pg_size_3 (push) Successful in 13s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 17s
Test / test_scrub_ec (push) Successful in 15s
Test / test_nfs (push) Successful in 13s
Test / test_heal_csum_4k (push) Successful in 2m18s
2025-05-01 18:47:10 +03:00
96b5a72630 Allow removal of bad direntries in VitastorFS (direntries referring to non-existent inodes)
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m49s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m45s
Test / test_switch_primary (push) Successful in 32s
Test / test_write_no_same (push) Successful in 10s
Test / test_write (push) Successful in 31s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m15s
Test / test_heal_ec (push) Successful in 2m17s
Test / test_heal_antietcd (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Successful in 2m15s
Test / test_heal_csum_32k_dj (push) Successful in 2m17s
Test / test_heal_csum_4k_dmj (push) Successful in 2m15s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_heal_csum_4k_dj (push) Successful in 2m20s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 14s
Test / test_osd_tags (push) Successful in 8s
Test / test_snapshot_pool2 (push) Successful in 15s
Test / test_enospc (push) Successful in 11s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 13s
Test / test_enospc_imm_xor (push) Successful in 13s
Test / test_scrub (push) Successful in 12s
Test / test_scrub_zero_osd_2 (push) Successful in 14s
Test / test_scrub_xor (push) Successful in 13s
Test / test_scrub_pg_size_3 (push) Successful in 13s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 18s
Test / test_scrub_ec (push) Successful in 16s
Test / test_nfs (push) Successful in 11s
Test / test_heal_csum_4k (push) Successful in 2m16s
2025-05-01 01:14:23 +03:00
ef80f121f6 Fix "duplicate inode during create" deletion in VitastorFS
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m45s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m42s
Test / test_write_no_same (push) Successful in 8s
Test / test_switch_primary (push) Successful in 32s
Test / test_write (push) Successful in 31s
Test / test_write_xor (push) Successful in 35s
Test / test_heal_pg_size_2 (push) Successful in 2m17s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_antietcd (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Successful in 2m17s
Test / test_heal_csum_32k_dj (push) Successful in 2m21s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_heal_csum_4k_dmj (push) Successful in 2m25s
Test / test_heal_csum_4k_dj (push) Successful in 2m18s
Test / test_resize (push) Successful in 13s
Test / test_resize_auto (push) Successful in 8s
Test / test_osd_tags (push) Successful in 9s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 13s
Test / test_enospc_imm (push) Successful in 12s
Test / test_enospc_imm_xor (push) Successful in 12s
Test / test_scrub (push) Successful in 12s
Test / test_scrub_zero_osd_2 (push) Successful in 15s
Test / test_scrub_xor (push) Successful in 15s
Test / test_scrub_pg_size_3 (push) Successful in 13s
Test / test_scrub_ec (push) Successful in 13s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 17s
Test / test_nfs (push) Successful in 11s
Test / test_heal_csum_4k (push) Successful in 2m15s
2025-04-30 20:37:49 +03:00
bbdd1f3aa7 Fix modify-pool -s PG_SIZE without --pg_minsize
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m46s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m43s
Test / test_write_no_same (push) Successful in 9s
Test / test_switch_primary (push) Successful in 33s
Test / test_write (push) Successful in 31s
Test / test_write_xor (push) Successful in 35s
Test / test_heal_ec (push) Successful in 2m17s
Test / test_heal_antietcd (push) Successful in 2m17s
Test / test_heal_csum_32k_dj (push) Successful in 2m18s
Test / test_heal_csum_32k (push) Successful in 2m16s
Test / test_heal_csum_4k_dmj (push) Successful in 2m20s
Test / test_resize (push) Successful in 14s
Test / test_resize_auto (push) Successful in 8s
Test / test_heal_csum_4k_dj (push) Successful in 2m18s
Test / test_osd_tags (push) Successful in 9s
Test / test_snapshot_pool2 (push) Successful in 14s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 14s
Test / test_enospc_imm_xor (push) Successful in 13s
Test / test_scrub (push) Successful in 11s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 14s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 15s
Test / test_nfs (push) Successful in 10s
Test / test_scrub_ec (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m16s
Test / test_heal_pg_size_2 (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Successful in 2m18s
2025-04-28 02:20:54 +03:00
5dd37f519a Fix node folding in case of empty rules (pool with size 1), add a test
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m42s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m42s
Test / test_write_no_same (push) Successful in 8s
Test / test_write (push) Successful in 30s
Test / test_switch_primary (push) Successful in 34s
Test / test_write_xor (push) Successful in 36s
Test / test_heal_pg_size_2 (push) Successful in 2m15s
Test / test_heal_ec (push) Successful in 2m18s
Test / test_heal_antietcd (push) Successful in 2m16s
Test / test_heal_csum_32k_dmj (push) Successful in 2m17s
Test / test_heal_csum_4k_dmj (push) Successful in 2m18s
Test / test_heal_csum_32k (push) Successful in 2m22s
Test / test_heal_csum_4k_dj (push) Successful in 2m20s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 13s
Test / test_osd_tags (push) Successful in 8s
Test / test_enospc (push) Successful in 9s
Test / test_snapshot_pool2 (push) Successful in 17s
Test / test_enospc_xor (push) Successful in 13s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 12s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 14s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 16s
Test / test_scrub_ec (push) Successful in 14s
Test / test_nfs (push) Successful in 12s
Test / test_heal_csum_4k (push) Successful in 2m17s
Test / test_heal_csum_32k_dj (push) Successful in 2m17s
2025-04-28 02:16:49 +03:00
a2278be84d Improve data distribution: solve LP task on failure domains instead of individual OSDs
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m41s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m40s
Test / test_write_no_same (push) Successful in 8s
Test / test_switch_primary (push) Successful in 32s
Test / test_write (push) Successful in 31s
Test / test_write_xor (push) Successful in 35s
Test / test_heal_pg_size_2 (push) Successful in 2m18s
Test / test_heal_ec (push) Successful in 2m19s
Test / test_heal_antietcd (push) Successful in 2m18s
Test / test_heal_csum_32k_dmj (push) Successful in 2m19s
Test / test_heal_csum_32k_dj (push) Successful in 2m18s
Test / test_heal_csum_32k (push) Successful in 2m20s
Test / test_heal_csum_4k_dmj (push) Successful in 2m20s
Test / test_heal_csum_4k_dj (push) Successful in 2m19s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 13s
Test / test_osd_tags (push) Successful in 8s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm (push) Successful in 11s
Test / test_enospc_imm_xor (push) Successful in 12s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 17s
Test / test_scrub_xor (push) Successful in 14s
Test / test_scrub_pg_size_3 (push) Successful in 15s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 15s
Test / test_scrub_ec (push) Successful in 16s
Test / test_nfs (push) Successful in 12s
Test / test_heal_csum_4k (push) Successful in 2m17s
This greatly speeds up PG placement and makes it more uniform, both because the LP task
becomes simpler and because the distribution over individual OSDs is optimised manually
(a toy sketch of the idea follows this entry)
2025-04-27 01:44:46 +03:00
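
To illustrate the idea in the commit above, here is a toy Python sketch, not Vitastor's actual optimiser: PGs are first distributed across failure domains (so the expensive optimisation step scales with the number of domains), and each domain's share is then split over its own OSDs proportionally to weight. All names and numbers are made up.

```python
from collections import defaultdict

# Hypothetical input: OSD id -> (failure domain, weight). Vitastor takes this
# from the placement tree in etcd; these values are invented for illustration.
osds = {
    1: ("host1", 1.0), 2: ("host1", 1.0),
    3: ("host2", 2.0), 4: ("host3", 1.0),
}
pg_count = 256

# Step 1: aggregate weights per failure domain - the optimisation task now has
# one variable per domain instead of one per OSD.
domain_weight = defaultdict(float)
for domain, weight in osds.values():
    domain_weight[domain] += weight
total = sum(domain_weight.values())
domain_share = {d: pg_count * w / total for d, w in domain_weight.items()}

# Step 2: split each domain's share over its OSDs proportionally to weight -
# a cheap per-domain calculation (the "optimised manually" part).
osd_share = {
    osd: domain_share[domain] * weight / domain_weight[domain]
    for osd, (domain, weight) in osds.items()
}

print(osd_share)  # OSD 3 (weight 2.0) gets twice the PGs of OSD 1 (weight 1.0)
```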
1393a2671c Change default vitastor-etcd data dir to /var/lib/etcd/vitastor 2025-04-27 01:44:46 +03:00
9fa8ae5384 Reset OSD ping state on receiving data from it via RDMA
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m42s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m44s
Test / test_write_no_same (push) Successful in 8s
Test / test_switch_primary (push) Successful in 32s
Test / test_write (push) Successful in 31s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m16s
Test / test_heal_ec (push) Successful in 2m16s
Test / test_heal_antietcd (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Successful in 2m18s
Test / test_heal_csum_32k_dj (push) Successful in 2m19s
Test / test_heal_csum_32k (push) Successful in 2m17s
Test / test_heal_csum_4k_dmj (push) Successful in 2m17s
Test / test_heal_csum_4k_dj (push) Successful in 2m17s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 14s
Test / test_osd_tags (push) Successful in 9s
Test / test_snapshot_pool2 (push) Successful in 15s
Test / test_enospc (push) Successful in 11s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm (push) Successful in 12s
Test / test_enospc_imm_xor (push) Successful in 14s
Test / test_scrub (push) Successful in 12s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 13s
Test / test_scrub_pg_size_3 (push) Successful in 13s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 17s
Test / test_nfs (push) Successful in 10s
Test / test_scrub_ec (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m19s
2025-04-26 14:17:09 +03:00
169a35a067 Followup to latency aggregation fix
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m49s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m50s
Test / test_write_no_same (push) Successful in 8s
Test / test_switch_primary (push) Successful in 32s
Test / test_write (push) Successful in 30s
Test / test_write_xor (push) Successful in 33s
Test / test_heal_pg_size_2 (push) Successful in 2m13s
Test / test_heal_ec (push) Successful in 2m18s
Test / test_heal_antietcd (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Successful in 2m17s
Test / test_heal_csum_32k_dj (push) Successful in 2m16s
Test / test_heal_csum_32k (push) Successful in 2m18s
Test / test_heal_csum_4k_dmj (push) Successful in 2m19s
Test / test_heal_csum_4k_dj (push) Successful in 2m19s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 12s
Test / test_osd_tags (push) Successful in 9s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_imm (push) Successful in 9s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm_xor (push) Successful in 15s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 13s
Test / test_scrub_xor (push) Successful in 14s
Test / test_scrub_pg_size_3 (push) Successful in 15s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 15s
Test / test_scrub_ec (push) Successful in 17s
Test / test_nfs (push) Successful in 12s
Test / test_heal_csum_4k (push) Successful in 2m18s
2025-04-26 01:32:46 +03:00
2b2a10581d Prevent double handle_primary_subop in rare cases
Some checks failed
Test / test_write_no_same (push) Successful in 9s
Test / test_switch_primary (push) Successful in 33s
Test / test_write (push) Successful in 31s
Test / test_write_xor (push) Successful in 34s
Test / test_heal_pg_size_2 (push) Successful in 2m16s
Test / test_heal_ec (push) Successful in 2m15s
Test / test_heal_antietcd (push) Successful in 2m16s
Test / test_heal_csum_32k_dmj (push) Successful in 2m16s
Test / test_change_pg_count (push) Failing after 18s
Test / test_change_pg_count_ec (push) Failing after 20s
Test / test_heal_csum_32k_dj (push) Successful in 2m16s
Test / test_heal_csum_32k (push) Successful in 2m17s
Test / test_heal_csum_4k_dmj (push) Successful in 2m17s
Test / test_heal_csum_4k_dj (push) Successful in 2m16s
Test / test_resize_auto (push) Successful in 8s
Test / test_resize (push) Successful in 12s
Test / test_enospc_xor (push) Has been cancelled
Test / test_heal_csum_4k (push) Has been cancelled
Test / test_enospc_imm (push) Has been cancelled
Test / test_enospc_imm_xor (push) Has been cancelled
Test / test_scrub (push) Has been cancelled
Test / test_scrub_zero_osd_2 (push) Has been cancelled
Test / test_scrub_xor (push) Has been cancelled
Test / test_scrub_pg_size_3 (push) Has been cancelled
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Has been cancelled
Test / test_scrub_ec (push) Has been cancelled
Test / test_nfs (push) Has been cancelled
Test / test_enospc (push) Has been cancelled
Test / test_osd_tags (push) Has been cancelled
Test / test_snapshot_pool2 (push) Has been cancelled
2025-04-26 01:16:53 +03:00
10fd51862a Fix latency aggregation in global stats (/vitastor/stats in etcd)
All checks were successful
Test / test_switch_primary (push) Successful in 39s
Test / test_write_no_same (push) Successful in 19s
Test / test_write (push) Successful in 54s
Test / test_rebalance_verify_ec_imm (push) Successful in 5m22s
Test / test_write_xor (push) Successful in 1m3s
Test / test_heal_pg_size_2 (push) Successful in 2m37s
Test / test_heal_ec (push) Successful in 2m39s
Test / test_heal_csum_32k_dmj (push) Successful in 2m41s
Test / test_heal_antietcd (push) Successful in 2m49s
Test / test_heal_csum_32k_dj (push) Successful in 2m55s
Test / test_heal_csum_32k (push) Successful in 2m54s
Test / test_heal_csum_4k_dj (push) Successful in 2m55s
Test / test_heal_csum_4k_dmj (push) Successful in 2m57s
Test / test_resize (push) Successful in 44s
Test / test_resize_auto (push) Successful in 20s
Test / test_osd_tags (push) Successful in 16s
Test / test_snapshot_pool2 (push) Successful in 33s
Test / test_enospc (push) Successful in 23s
Test / test_enospc_xor (push) Successful in 24s
Test / test_enospc_imm (push) Successful in 13s
Test / test_enospc_imm_xor (push) Successful in 14s
Test / test_scrub (push) Successful in 12s
Test / test_scrub_zero_osd_2 (push) Successful in 12s
Test / test_scrub_xor (push) Successful in 15s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 18s
Test / test_nfs (push) Successful in 12s
Test / test_scrub_ec (push) Successful in 18s
Test / test_heal_csum_4k (push) Successful in 2m55s
Test / test_failure_domain (push) Successful in 10s
2025-04-25 00:08:10 +03:00
15d0204f96 Hide "Ran out of journal space" log messages by default
All checks were successful
Test / test_root_node (push) Successful in 14s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m52s
Test / test_write_no_same (push) Successful in 14s
Test / test_write (push) Successful in 37s
Test / test_switch_primary (push) Successful in 39s
Test / test_write_xor (push) Successful in 40s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m22s
Test / test_heal_antietcd (push) Successful in 2m24s
Test / test_heal_csum_32k_dmj (push) Successful in 2m24s
Test / test_heal_csum_32k_dj (push) Successful in 2m21s
Test / test_heal_csum_32k (push) Successful in 2m24s
Test / test_heal_csum_4k_dmj (push) Successful in 2m25s
Test / test_heal_csum_4k_dj (push) Successful in 2m24s
Test / test_resize_auto (push) Successful in 13s
Test / test_resize (push) Successful in 19s
Test / test_snapshot_pool2 (push) Successful in 18s
Test / test_osd_tags (push) Successful in 14s
Test / test_enospc (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 17s
Test / test_enospc_imm (push) Successful in 17s
Test / test_enospc_imm_xor (push) Successful in 19s
Test / test_scrub (push) Successful in 17s
Test / test_scrub_zero_osd_2 (push) Successful in 19s
Test / test_scrub_xor (push) Successful in 19s
Test / test_scrub_pg_size_3 (push) Successful in 18s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 15s
Test / test_heal_csum_4k (push) Successful in 2m19s
2025-04-20 01:21:03 +03:00
21d6e88a1b Add instructions to change NFS_MAX_FILE_IO_SIZE 2025-04-18 13:39:32 +03:00
df2847df2d Wait for RDMA-CM EVENT_ESTABLISHED after rdma_accept(), handle rdma_accept() before acking the event
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m50s
Test / test_dd (push) Successful in 18s
Test / test_write_no_same (push) Successful in 14s
Test / test_switch_primary (push) Successful in 39s
Test / test_write (push) Successful in 38s
Test / test_write_xor (push) Successful in 40s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m22s
Test / test_heal_antietcd (push) Successful in 2m23s
Test / test_heal_csum_32k_dmj (push) Successful in 2m25s
Test / test_heal_csum_32k_dj (push) Successful in 2m24s
Test / test_heal_csum_32k (push) Successful in 2m23s
Test / test_heal_csum_4k_dmj (push) Successful in 2m26s
Test / test_heal_csum_4k_dj (push) Successful in 2m23s
Test / test_resize_auto (push) Successful in 13s
Test / test_resize (push) Successful in 20s
Test / test_snapshot_pool2 (push) Successful in 19s
Test / test_osd_tags (push) Successful in 13s
Test / test_enospc (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 19s
Test / test_enospc_imm (push) Successful in 17s
Test / test_enospc_imm_xor (push) Successful in 18s
Test / test_scrub (push) Successful in 20s
Test / test_scrub_zero_osd_2 (push) Successful in 21s
Test / test_scrub_xor (push) Successful in 19s
Test / test_scrub_pg_size_3 (push) Successful in 20s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 24s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m26s
2025-04-15 15:19:36 +03:00
327c98a4b6 Fix index_tree
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m49s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m49s
Test / test_write_no_same (push) Successful in 14s
Test / test_switch_primary (push) Successful in 38s
Test / test_write (push) Successful in 37s
Test / test_write_xor (push) Successful in 40s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m21s
Test / test_heal_antietcd (push) Successful in 2m23s
Test / test_heal_csum_32k_dmj (push) Successful in 2m27s
Test / test_heal_csum_32k_dj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m22s
Test / test_heal_csum_4k_dmj (push) Successful in 2m24s
Test / test_heal_csum_4k_dj (push) Successful in 2m22s
Test / test_resize (push) Successful in 18s
Test / test_resize_auto (push) Successful in 14s
Test / test_snapshot_pool2 (push) Successful in 19s
Test / test_osd_tags (push) Successful in 13s
Test / test_enospc (push) Successful in 15s
Test / test_enospc_xor (push) Successful in 17s
Test / test_enospc_imm (push) Successful in 17s
Test / test_enospc_imm_xor (push) Successful in 21s
Test / test_scrub (push) Successful in 19s
Test / test_scrub_zero_osd_2 (push) Successful in 20s
Test / test_scrub_xor (push) Successful in 20s
Test / test_scrub_pg_size_3 (push) Successful in 21s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m23s
2025-04-13 16:13:44 +03:00
3cc0abfd81 Fix NFS total & free multiplied by extra 2
All checks were successful
Test / test_dd (push) Successful in 19s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m51s
Test / test_write_no_same (push) Successful in 14s
Test / test_write (push) Successful in 35s
Test / test_switch_primary (push) Successful in 39s
Test / test_write_xor (push) Successful in 42s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m23s
Test / test_heal_antietcd (push) Successful in 2m22s
Test / test_heal_csum_32k_dmj (push) Successful in 2m24s
Test / test_heal_csum_32k_dj (push) Successful in 2m23s
Test / test_heal_csum_32k (push) Successful in 2m26s
Test / test_heal_csum_4k_dmj (push) Successful in 2m26s
Test / test_heal_csum_4k_dj (push) Successful in 2m25s
Test / test_resize_auto (push) Successful in 14s
Test / test_resize (push) Successful in 18s
Test / test_snapshot_pool2 (push) Successful in 19s
Test / test_osd_tags (push) Successful in 13s
Test / test_enospc (push) Successful in 15s
Test / test_enospc_xor (push) Successful in 18s
Test / test_enospc_imm (push) Successful in 16s
Test / test_enospc_imm_xor (push) Successful in 20s
Test / test_scrub (push) Successful in 19s
Test / test_scrub_zero_osd_2 (push) Successful in 19s
Test / test_scrub_xor (push) Successful in 21s
Test / test_scrub_pg_size_3 (push) Successful in 20s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 23s
Test / test_scrub_ec (push) Successful in 21s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m20s
2025-04-12 19:09:26 +03:00
80e5f8ba76 Add missing WITH_RDMACM defines
All checks were successful
Test / test_dd (push) Successful in 18s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m2s
Test / test_write_no_same (push) Successful in 14s
Test / test_switch_primary (push) Successful in 38s
Test / test_write (push) Successful in 36s
Test / test_write_xor (push) Successful in 40s
Test / test_heal_pg_size_2 (push) Successful in 2m22s
Test / test_heal_ec (push) Successful in 2m22s
Test / test_heal_antietcd (push) Successful in 2m23s
Test / test_heal_csum_32k_dmj (push) Successful in 2m24s
Test / test_heal_csum_32k_dj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m22s
Test / test_heal_csum_4k_dmj (push) Successful in 2m25s
Test / test_heal_csum_4k_dj (push) Successful in 2m25s
Test / test_resize (push) Successful in 18s
Test / test_resize_auto (push) Successful in 13s
Test / test_snapshot_pool2 (push) Successful in 22s
Test / test_osd_tags (push) Successful in 15s
Test / test_enospc (push) Successful in 16s
Test / test_enospc_imm (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 18s
Test / test_enospc_imm_xor (push) Successful in 18s
Test / test_scrub (push) Successful in 21s
Test / test_scrub_xor (push) Successful in 19s
Test / test_scrub_zero_osd_2 (push) Successful in 23s
Test / test_scrub_pg_size_3 (push) Successful in 21s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 23s
Test / test_scrub_ec (push) Successful in 22s
Test / test_nfs (push) Successful in 18s
Test / test_heal_csum_4k (push) Successful in 2m22s
2025-04-12 19:06:27 +03:00
4b660f1ce8 Fix systemd unit name in make-etcd
Some checks failed
Test / test_rebalance_verify_ec_imm (push) Successful in 2m4s
Test / test_dd (push) Successful in 21s
Test / test_write_no_same (push) Successful in 14s
Test / test_switch_primary (push) Successful in 37s
Test / test_write (push) Successful in 36s
Test / test_write_xor (push) Successful in 40s
Test / test_heal_pg_size_2 (push) Successful in 2m21s
Test / test_heal_ec (push) Successful in 2m21s
Test / test_heal_antietcd (push) Successful in 2m22s
Test / test_heal_csum_32k_dmj (push) Failing after 2m34s
Test / test_heal_csum_32k_dj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m23s
Test / test_heal_csum_4k_dmj (push) Successful in 2m24s
Test / test_resize_auto (push) Successful in 14s
Test / test_resize (push) Successful in 18s
Test / test_heal_csum_4k_dj (push) Successful in 2m26s
Test / test_osd_tags (push) Successful in 15s
Test / test_snapshot_pool2 (push) Successful in 20s
Test / test_enospc (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 17s
Test / test_enospc_imm (push) Successful in 18s
Test / test_enospc_imm_xor (push) Successful in 19s
Test / test_scrub (push) Successful in 20s
Test / test_scrub_zero_osd_2 (push) Successful in 21s
Test / test_scrub_xor (push) Successful in 19s
Test / test_scrub_pg_size_3 (push) Successful in 20s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 22s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m15s
2025-04-11 02:05:08 +03:00
dfde0e60f0 Do not allow reweight > 1
All checks were successful
Test / test_dd (push) Successful in 20s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m48s
Test / test_write_no_same (push) Successful in 14s
Test / test_write (push) Successful in 36s
Test / test_switch_primary (push) Successful in 39s
Test / test_write_xor (push) Successful in 40s
Test / test_heal_pg_size_2 (push) Successful in 2m21s
Test / test_heal_ec (push) Successful in 2m21s
Test / test_heal_antietcd (push) Successful in 2m23s
Test / test_heal_csum_32k_dmj (push) Successful in 2m24s
Test / test_heal_csum_32k_dj (push) Successful in 2m31s
Test / test_heal_csum_32k (push) Successful in 2m24s
Test / test_heal_csum_4k_dmj (push) Successful in 2m26s
Test / test_heal_csum_4k_dj (push) Successful in 2m25s
Test / test_resize (push) Successful in 18s
Test / test_resize_auto (push) Successful in 13s
Test / test_snapshot_pool2 (push) Successful in 19s
Test / test_osd_tags (push) Successful in 12s
Test / test_enospc (push) Successful in 15s
Test / test_enospc_imm (push) Successful in 14s
Test / test_enospc_xor (push) Successful in 18s
Test / test_enospc_imm_xor (push) Successful in 19s
Test / test_scrub (push) Successful in 17s
Test / test_scrub_zero_osd_2 (push) Successful in 19s
Test / test_scrub_xor (push) Successful in 20s
Test / test_scrub_pg_size_3 (push) Successful in 19s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 23s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m22s
2025-04-05 12:21:14 +03:00
013f688ffe Run check_peer_config on RDMA-CM connections too
All checks were successful
Test / test_switch_primary (push) Successful in 37s
Test / test_write (push) Successful in 37s
Test / test_write_no_same (push) Successful in 15s
Test / test_write_xor (push) Successful in 40s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m46s
Test / test_heal_pg_size_2 (push) Successful in 2m19s
Test / test_heal_ec (push) Successful in 2m23s
Test / test_heal_antietcd (push) Successful in 2m23s
Test / test_heal_csum_32k_dmj (push) Successful in 2m24s
Test / test_heal_csum_32k_dj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m23s
Test / test_resize (push) Successful in 19s
Test / test_resize_auto (push) Successful in 12s
Test / test_snapshot_pool2 (push) Successful in 20s
Test / test_heal_csum_4k_dmj (push) Successful in 2m24s
Test / test_osd_tags (push) Successful in 13s
Test / test_enospc (push) Successful in 15s
Test / test_enospc_xor (push) Successful in 18s
Test / test_enospc_imm (push) Successful in 15s
Test / test_enospc_imm_xor (push) Successful in 18s
Test / test_scrub (push) Successful in 18s
Test / test_heal_csum_4k_dj (push) Successful in 2m24s
Test / test_scrub_zero_osd_2 (push) Successful in 19s
Test / test_scrub_xor (push) Successful in 18s
Test / test_heal_csum_4k (push) Successful in 2m23s
Test / test_scrub_pg_size_3 (push) Successful in 19s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 23s
Test / test_nfs (push) Successful in 16s
Test / test_scrub_ec (push) Successful in 22s
Test / test_etcd_fail_antietcd (push) Successful in 44s
2025-04-02 01:32:28 +03:00
cf9738ddbe Fix docker 2.1.0 build :) 2025-04-01 22:46:22 +03:00
891b2811c7 Release 2.1.0
All checks were successful
Test / test_root_node (push) Successful in 13s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m52s
Test / test_write_no_same (push) Successful in 14s
Test / test_write (push) Successful in 37s
Test / test_switch_primary (push) Successful in 39s
Test / test_write_xor (push) Successful in 42s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m22s
Test / test_heal_antietcd (push) Successful in 2m22s
Test / test_heal_csum_32k_dmj (push) Successful in 2m24s
Test / test_heal_csum_32k_dj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m24s
Test / test_heal_csum_4k_dmj (push) Successful in 2m24s
Test / test_heal_csum_4k_dj (push) Successful in 2m22s
Test / test_resize_auto (push) Successful in 13s
Test / test_resize (push) Successful in 17s
Test / test_snapshot_pool2 (push) Successful in 21s
Test / test_osd_tags (push) Successful in 12s
Test / test_enospc (push) Successful in 16s
Test / test_enospc_imm (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 19s
Test / test_enospc_imm_xor (push) Successful in 19s
Test / test_scrub (push) Successful in 17s
Test / test_scrub_zero_osd_2 (push) Successful in 21s
Test / test_scrub_xor (push) Successful in 20s
Test / test_scrub_pg_size_3 (push) Successful in 20s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 22s
Test / test_scrub_ec (push) Successful in 19s
Test / test_nfs (push) Successful in 17s
Test / test_heal_csum_4k (push) Successful in 2m22s
New features:

- Support separate OSD cluster network - [osd_cluster_network](https://vitastor.io/docs/config/network.html#osd_cluster_network)
  and, in general, multiple OSD networks, including RDMA
- Add an alternative RDMA implementation via RDMA-CM - [use_rdmacm](https://vitastor.io/docs/config/network.html#use_rdmacm),
  required for iWARP and, maybe, for some IB setups (but not for RoCE)
- Change the default PG behaviour: a PG now waits for all of its "up" OSDs to be connected before starting.
  The old behaviour can be restored by enabling the new [allow_net_split](https://vitastor.io/docs/config/osd.html#allow_net_split)
  option (see the configuration sketch after this entry).
- Add a patch for QEMU 9.2

Bug fixes:

- Fix incorrect "has_xxx" PG state names in ls-pgs
- Fix possible QEMU crashes after detaching Vitastor disks (and update all QEMU builds in Vitastor repos)
- Fix clients sometimes spamming OSDs with infinite reconnections when some PGs are offline
- Fall back to TCP on RDMA connection failures
- Add missing logging of RDMA ibv_modify_qp() errors
- Add a minimum interval for etcd_state_client to reload state
2025-04-01 20:16:27 +03:00
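The networking and recovery options named in these release notes can be combined in a single configuration file. Below is a minimal sketch, assuming the standard /etc/vitastor/vitastor.conf location; all subnets and values are illustrative, only the option names come from the notes above:

# Hypothetical vitastor.conf fragment exercising the 2.1.0 options;
# addresses and values are examples, not taken from the release.
cat > /etc/vitastor/vitastor.conf <<'EOF'
{
  "osd_network": "10.0.0.0/24",
  "osd_cluster_network": "10.1.0.0/24",
  "use_rdmacm": true,
  "allow_net_split": false,
  "etcd_min_reload_interval": 1000
}
EOF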
01590df6da Update QEMU version in vitastor-csi Dockerfile 2025-04-01 20:16:27 +03:00
3e5f0be52c Use separate port numbers for RDMA-CM
All checks were successful
Test / test_dd (push) Successful in 20s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m54s
Test / test_write_no_same (push) Successful in 13s
Test / test_switch_primary (push) Successful in 38s
Test / test_write (push) Successful in 37s
Test / test_write_xor (push) Successful in 40s
Test / test_heal_pg_size_2 (push) Successful in 2m21s
Test / test_heal_ec (push) Successful in 2m22s
Test / test_heal_antietcd (push) Successful in 2m23s
Test / test_heal_csum_32k_dmj (push) Successful in 2m24s
Test / test_heal_csum_32k_dj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m23s
Test / test_heal_csum_4k_dmj (push) Successful in 2m24s
Test / test_heal_csum_4k_dj (push) Successful in 2m24s
Test / test_resize (push) Successful in 18s
Test / test_resize_auto (push) Successful in 13s
Test / test_snapshot_pool2 (push) Successful in 20s
Test / test_osd_tags (push) Successful in 13s
Test / test_enospc (push) Successful in 15s
Test / test_enospc_xor (push) Successful in 17s
Test / test_enospc_imm (push) Successful in 16s
Test / test_enospc_imm_xor (push) Successful in 18s
Test / test_scrub (push) Successful in 20s
Test / test_scrub_zero_osd_2 (push) Successful in 21s
Test / test_scrub_xor (push) Successful in 18s
Test / test_scrub_pg_size_3 (push) Successful in 19s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 17s
Test / test_heal_csum_4k (push) Successful in 2m24s
2025-04-01 16:16:03 +03:00
58af897e73 s/listen on/listen to/ :) 2025-04-01 12:07:15 +03:00
dbf9ecd171 Move osd_network to config/network docs 2025-03-31 21:12:09 +03:00
8508e78288 Add an alternative RDMA implementation via RDMA-CM
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m48s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m48s
Test / test_write_no_same (push) Successful in 15s
Test / test_switch_primary (push) Successful in 38s
Test / test_write (push) Successful in 37s
Test / test_write_xor (push) Successful in 41s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m20s
Test / test_heal_antietcd (push) Successful in 2m23s
Test / test_heal_csum_32k_dmj (push) Successful in 2m24s
Test / test_heal_csum_32k_dj (push) Successful in 2m24s
Test / test_heal_csum_32k (push) Successful in 2m22s
Test / test_heal_csum_4k_dmj (push) Successful in 2m24s
Test / test_heal_csum_4k_dj (push) Successful in 2m24s
Test / test_resize (push) Successful in 19s
Test / test_resize_auto (push) Successful in 14s
Test / test_snapshot_pool2 (push) Successful in 19s
Test / test_osd_tags (push) Successful in 14s
Test / test_enospc (push) Successful in 17s
Test / test_enospc_imm (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 19s
Test / test_enospc_imm_xor (push) Successful in 20s
Test / test_scrub (push) Successful in 16s
Test / test_scrub_zero_osd_2 (push) Successful in 20s
Test / test_scrub_xor (push) Successful in 20s
Test / test_scrub_pg_size_3 (push) Successful in 19s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 23s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 15s
Test / test_heal_csum_4k (push) Successful in 2m23s
Required for non-RoCE cards: iWARP and, possibly, Infiniband
2025-03-31 21:01:25 +03:00
f32dea02bf Support multiple RDMA networks 2025-03-31 21:01:25 +03:00
a103065d12 Support multiple OSD networks and separate OSD cluster network 2025-03-31 21:01:15 +03:00
5d2e28d4a9 Remove unused used_max_cqe from nfs_proxy_rdma
Some checks failed
Test / test_dd (push) Successful in 18s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m52s
Test / test_write_no_same (push) Successful in 14s
Test / test_switch_primary (push) Successful in 38s
Test / test_write (push) Successful in 36s
Test / test_write_xor (push) Successful in 41s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m23s
Test / test_heal_antietcd (push) Successful in 2m22s
Test / test_heal_csum_32k_dj (push) Successful in 2m32s
Test / test_heal_csum_32k (push) Successful in 2m23s
Test / test_heal_csum_4k_dmj (push) Successful in 2m25s
Test / test_resize_auto (push) Successful in 13s
Test / test_resize (push) Successful in 18s
Test / test_heal_csum_4k_dj (push) Successful in 2m25s
Test / test_osd_tags (push) Successful in 13s
Test / test_snapshot_pool2 (push) Successful in 21s
Test / test_enospc (push) Successful in 17s
Test / test_enospc_xor (push) Successful in 17s
Test / test_enospc_imm (push) Successful in 15s
Test / test_enospc_imm_xor (push) Successful in 18s
Test / test_scrub (push) Successful in 19s
Test / test_scrub_zero_osd_2 (push) Successful in 19s
Test / test_scrub_xor (push) Successful in 19s
Test / test_scrub_pg_size_3 (push) Successful in 19s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 23s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m17s
Test / test_heal_csum_32k_dmj (push) Failing after 2s
2025-03-30 02:13:24 +03:00
18e14eed11 Fix --pg_count formula in docs/usage/cli 2025-03-29 17:54:53 +03:00
ccc32b9e68 Use TCP on RDMA connection failure
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m54s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m55s
Test / test_write_no_same (push) Successful in 15s
Test / test_switch_primary (push) Successful in 37s
Test / test_write (push) Successful in 37s
Test / test_write_xor (push) Successful in 39s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m23s
Test / test_heal_antietcd (push) Successful in 2m21s
Test / test_heal_csum_32k_dmj (push) Successful in 2m23s
Test / test_heal_csum_32k_dj (push) Successful in 2m23s
Test / test_heal_csum_32k (push) Successful in 2m21s
Test / test_heal_csum_4k_dmj (push) Successful in 2m24s
Test / test_heal_csum_4k_dj (push) Successful in 2m25s
Test / test_resize (push) Successful in 18s
Test / test_resize_auto (push) Successful in 14s
Test / test_snapshot_pool2 (push) Successful in 18s
Test / test_osd_tags (push) Successful in 13s
Test / test_enospc (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 16s
Test / test_enospc_imm (push) Successful in 15s
Test / test_enospc_imm_xor (push) Successful in 19s
Test / test_scrub (push) Successful in 19s
Test / test_scrub_zero_osd_2 (push) Successful in 19s
Test / test_scrub_xor (push) Successful in 18s
Test / test_scrub_pg_size_3 (push) Successful in 18s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 19s
Test / test_nfs (push) Successful in 15s
Test / test_heal_csum_4k (push) Successful in 2m21s
2025-03-23 12:04:23 +03:00
ebaf3fee79 Add an assertion to prevent sending message to TCP channel when switched to RDMA 2025-03-23 12:04:09 +03:00
196d28e987 Fix typo 2025-03-23 12:00:20 +03:00
8f243b2328 Fix qemu buster build and bullseye version 2025-03-23 02:46:52 +03:00
7a835fcd8f Add allow_net_split parameter 2025-03-23 02:12:32 +03:00
8b0389b4e8 Log RDMA ibv_modify_qp() errors
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m53s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m54s
Test / test_write_no_same (push) Successful in 14s
Test / test_switch_primary (push) Successful in 37s
Test / test_write (push) Successful in 36s
Test / test_write_xor (push) Successful in 39s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m21s
Test / test_heal_antietcd (push) Successful in 2m21s
Test / test_heal_csum_32k_dmj (push) Successful in 2m23s
Test / test_heal_csum_32k_dj (push) Successful in 2m21s
Test / test_heal_csum_32k (push) Successful in 2m25s
Test / test_heal_csum_4k_dmj (push) Successful in 2m24s
Test / test_heal_csum_4k_dj (push) Successful in 2m23s
Test / test_resize_auto (push) Successful in 13s
Test / test_resize (push) Successful in 16s
Test / test_snapshot_pool2 (push) Successful in 17s
Test / test_osd_tags (push) Successful in 12s
Test / test_enospc (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 17s
Test / test_enospc_imm (push) Successful in 16s
Test / test_enospc_imm_xor (push) Successful in 19s
Test / test_scrub (push) Successful in 16s
Test / test_scrub_zero_osd_2 (push) Successful in 17s
Test / test_scrub_xor (push) Successful in 19s
Test / test_scrub_pg_size_3 (push) Successful in 18s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 21s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m23s
2025-03-22 15:58:13 +03:00
f544c350ba %l* -> %j*
All checks were successful
Test / test_dd (push) Successful in 19s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m42s
Test / test_write_no_same (push) Successful in 14s
Test / test_switch_primary (push) Successful in 40s
Test / test_write (push) Successful in 38s
Test / test_write_xor (push) Successful in 41s
Test / test_heal_pg_size_2 (push) Successful in 2m22s
Test / test_heal_ec (push) Successful in 2m22s
Test / test_heal_antietcd (push) Successful in 2m25s
Test / test_heal_csum_32k_dmj (push) Successful in 2m25s
Test / test_heal_csum_32k_dj (push) Successful in 2m26s
Test / test_heal_csum_32k (push) Successful in 2m33s
Test / test_heal_csum_4k_dmj (push) Successful in 2m41s
Test / test_heal_csum_4k_dj (push) Successful in 2m37s
Test / test_resize (push) Successful in 17s
Test / test_resize_auto (push) Successful in 14s
Test / test_osd_tags (push) Successful in 13s
Test / test_snapshot_pool2 (push) Successful in 19s
Test / test_enospc (push) Successful in 16s
Test / test_enospc_xor (push) Successful in 18s
Test / test_enospc_imm (push) Successful in 14s
Test / test_enospc_imm_xor (push) Successful in 18s
Test / test_scrub (push) Successful in 16s
Test / test_scrub_zero_osd_2 (push) Successful in 19s
Test / test_scrub_xor (push) Successful in 18s
Test / test_scrub_pg_size_3 (push) Successful in 19s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 21s
Test / test_nfs (push) Successful in 17s
Test / test_heal_csum_4k (push) Successful in 2m22s
2025-03-22 15:32:07 +03:00
4eafb55b5c Add a patch for QEMU 9.2, fix debian bookworm QEMU build 2025-03-22 15:30:52 +03:00
5030396f71 Clear QEMU eventfd handler on vitastor block driver destruction
All checks were successful
Test / test_root_node (push) Successful in 14s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m54s
Test / test_write_no_same (push) Successful in 12s
Test / test_write (push) Successful in 35s
Test / test_switch_primary (push) Successful in 37s
Test / test_write_xor (push) Successful in 39s
Test / test_heal_pg_size_2 (push) Successful in 2m21s
Test / test_heal_ec (push) Successful in 2m22s
Test / test_heal_antietcd (push) Successful in 2m22s
Test / test_heal_csum_32k_dmj (push) Successful in 2m23s
Test / test_heal_csum_32k_dj (push) Successful in 2m24s
Test / test_heal_csum_32k (push) Successful in 2m21s
Test / test_heal_csum_4k_dmj (push) Successful in 2m23s
Test / test_heal_csum_4k_dj (push) Successful in 2m23s
Test / test_resize_auto (push) Successful in 14s
Test / test_resize (push) Successful in 19s
Test / test_snapshot_pool2 (push) Successful in 19s
Test / test_osd_tags (push) Successful in 13s
Test / test_enospc (push) Successful in 14s
Test / test_enospc_xor (push) Successful in 16s
Test / test_enospc_imm (push) Successful in 14s
Test / test_enospc_imm_xor (push) Successful in 19s
Test / test_scrub (push) Successful in 19s
Test / test_scrub_zero_osd_2 (push) Successful in 18s
Test / test_scrub_xor (push) Successful in 21s
Test / test_scrub_pg_size_3 (push) Successful in 19s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 15s
Test / test_heal_csum_4k (push) Successful in 2m21s
2025-03-21 20:47:17 +03:00
be22c363ca Do not skip client_retry_interval on reconnecting OSDs to prevent OSD spam
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m54s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m52s
Test / test_write_no_same (push) Successful in 13s
Test / test_switch_primary (push) Successful in 37s
Test / test_write (push) Successful in 36s
Test / test_write_xor (push) Successful in 39s
Test / test_heal_pg_size_2 (push) Successful in 2m20s
Test / test_heal_ec (push) Successful in 2m21s
Test / test_heal_antietcd (push) Successful in 2m21s
Test / test_heal_csum_32k_dmj (push) Successful in 2m23s
Test / test_heal_csum_32k_dj (push) Successful in 2m23s
Test / test_heal_csum_32k (push) Successful in 2m23s
Test / test_heal_csum_4k_dmj (push) Successful in 2m26s
Test / test_heal_csum_4k_dj (push) Successful in 2m21s
Test / test_resize (push) Successful in 18s
Test / test_resize_auto (push) Successful in 13s
Test / test_snapshot_pool2 (push) Successful in 19s
Test / test_osd_tags (push) Successful in 13s
Test / test_enospc (push) Successful in 17s
Test / test_enospc_xor (push) Successful in 16s
Test / test_enospc_imm (push) Successful in 16s
Test / test_enospc_imm_xor (push) Successful in 18s
Test / test_scrub (push) Successful in 18s
Test / test_scrub_zero_osd_2 (push) Successful in 19s
Test / test_scrub_xor (push) Successful in 20s
Test / test_scrub_pg_size_3 (push) Successful in 18s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 20s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m23s
2025-03-20 00:12:38 +03:00
0f80c87b43 Add a minimum interval for etcd_state_client to reload state
All checks were successful
Test / test_rebalance_verify_ec (push) Successful in 1m55s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m55s
Test / test_write_no_same (push) Successful in 12s
Test / test_write (push) Successful in 35s
Test / test_switch_primary (push) Successful in 38s
Test / test_write_xor (push) Successful in 39s
Test / test_heal_pg_size_2 (push) Successful in 2m21s
Test / test_heal_ec (push) Successful in 2m20s
Test / test_heal_antietcd (push) Successful in 2m22s
Test / test_heal_csum_32k_dmj (push) Successful in 2m22s
Test / test_heal_csum_32k (push) Successful in 2m21s
Test / test_heal_csum_4k_dmj (push) Successful in 2m22s
Test / test_heal_csum_4k_dj (push) Successful in 2m24s
Test / test_resize (push) Successful in 17s
Test / test_resize_auto (push) Successful in 13s
Test / test_snapshot_pool2 (push) Successful in 18s
Test / test_osd_tags (push) Successful in 12s
Test / test_enospc (push) Successful in 15s
Test / test_enospc_xor (push) Successful in 17s
Test / test_enospc_imm (push) Successful in 16s
Test / test_enospc_imm_xor (push) Successful in 17s
Test / test_scrub (push) Successful in 18s
Test / test_scrub_zero_osd_2 (push) Successful in 18s
Test / test_scrub_xor (push) Successful in 19s
Test / test_scrub_pg_size_3 (push) Successful in 18s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_scrub_ec (push) Successful in 21s
Test / test_nfs (push) Successful in 15s
Test / test_heal_csum_4k (push) Successful in 2m14s
Test / test_heal_csum_32k_dj (push) Successful in 2m27s
(To prevent excessive load on etcd during outages)
2025-03-19 02:36:09 +03:00
e0953fd502 Wait for all "up" OSDs to be connected before starting PG 2025-03-19 02:36:09 +03:00
6e0ae47938 Add Proxmox QEMU 9.2 patch 2025-03-19 02:36:02 +03:00
b8f19e85ad Fix pg state formatting in ls-pgs
All checks were successful
Test / test_dd (push) Successful in 17s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m51s
Test / test_write_no_same (push) Successful in 14s
Test / test_write (push) Successful in 34s
Test / test_switch_primary (push) Successful in 38s
Test / test_write_xor (push) Successful in 38s
Test / test_heal_pg_size_2 (push) Successful in 2m19s
Test / test_heal_ec (push) Successful in 2m20s
Test / test_heal_antietcd (push) Successful in 2m21s
Test / test_heal_csum_32k_dmj (push) Successful in 2m23s
Test / test_heal_csum_32k_dj (push) Successful in 2m26s
Test / test_heal_csum_32k (push) Successful in 2m21s
Test / test_heal_csum_4k_dmj (push) Successful in 2m21s
Test / test_heal_csum_4k_dj (push) Successful in 2m23s
Test / test_resize_auto (push) Successful in 12s
Test / test_resize (push) Successful in 17s
Test / test_snapshot_pool2 (push) Successful in 18s
Test / test_osd_tags (push) Successful in 12s
Test / test_enospc (push) Successful in 14s
Test / test_enospc_xor (push) Successful in 19s
Test / test_enospc_imm (push) Successful in 18s
Test / test_enospc_imm_xor (push) Successful in 17s
Test / test_scrub (push) Successful in 17s
Test / test_scrub_zero_osd_2 (push) Successful in 20s
Test / test_scrub_xor (push) Successful in 19s
Test / test_scrub_pg_size_3 (push) Successful in 18s
Test / test_scrub_ec (push) Successful in 18s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 21s
Test / test_nfs (push) Successful in 16s
Test / test_heal_csum_4k (push) Successful in 2m22s
2025-03-17 01:37:58 +03:00
b7636e595f Update version in docker docs 2025-03-16 16:53:57 +03:00
251 changed files with 14122 additions and 1827 deletions

View File

@@ -20,7 +20,7 @@ RUN echo 'deb http://deb.debian.org/debian bullseye-backports main' >> /etc/apt/
RUN apt-get update
RUN apt-get -y install etcd qemu-system-x86 qemu-block-extra qemu-utils fio libasan5 \
liburing1 liburing-dev libgoogle-perftools-dev devscripts libjerasure-dev cmake libibverbs-dev libisal-dev
libgoogle-perftools-dev devscripts libjerasure-dev cmake libibverbs-dev libisal-dev
RUN apt-get -y build-dep fio qemu=`dpkg -s qemu-system-x86|grep ^Version:|awk '{print $2}'`
RUN apt-get update && apt-get -y install jq lp-solve sudo nfs-common fdisk parted
RUN apt-get --download-only source fio qemu=`dpkg -s qemu-system-x86|grep ^Version:|awk '{print $2}'`
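The hunk above drops the distro liburing packages (`liburing1 liburing-dev`) from the test image. If a matching liburing is still needed without system packages, a vendored static build is one option; this is a hypothetical sketch, not build code from this repository:

# Hypothetical static liburing build (upstream configure/make, standard flags).
git clone --depth=1 https://github.com/axboe/liburing.git
cd liburing
./configure --prefix="$PWD/install"
make -j"$(nproc)" && make install
# install/lib/liburing.a can then be linked statically instead of -luring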

View File

@@ -144,6 +144,24 @@ jobs:
          echo ""
        done
  test_change_pg_count_online:
    runs-on: ubuntu-latest
    needs: build
    container: ${{env.TEST_IMAGE}}:${{github.sha}}
    steps:
    - name: Run test
      id: test
      timeout-minutes: 3
      run: /root/vitastor/tests/test_change_pg_count_online.sh
    - name: Print logs
      if: always() && steps.test.outcome == 'failure'
      run: |
        for i in /root/vitastor/testdata/*.log /root/vitastor/testdata/*.txt; do
          echo "-------- $i --------"
          cat $i
          echo ""
        done
  test_change_pg_size:
    runs-on: ubuntu-latest
    needs: build
@@ -684,6 +702,24 @@ jobs:
          echo ""
        done
  test_write_iothreads:
    runs-on: ubuntu-latest
    needs: build
    container: ${{env.TEST_IMAGE}}:${{github.sha}}
    steps:
    - name: Run test
      id: test
      timeout-minutes: 3
      run: TEST_NAME=iothreads GLOBAL_CONFIG=',"client_iothread_count":4' /root/vitastor/tests/test_write.sh
    - name: Print logs
      if: always() && steps.test.outcome == 'failure'
      run: |
        for i in /root/vitastor/testdata/*.log /root/vitastor/testdata/*.txt; do
          echo "-------- $i --------"
          cat $i
          echo ""
        done
  test_write_no_same:
    runs-on: ubuntu-latest
    needs: build
@@ -720,6 +756,24 @@ jobs:
          echo ""
        done
  test_heal_local_read:
    runs-on: ubuntu-latest
    needs: build
    container: ${{env.TEST_IMAGE}}:${{github.sha}}
    steps:
    - name: Run test
      id: test
      timeout-minutes: 10
      run: TEST_NAME=local_read POOLCFG='"local_reads":"random",' /root/vitastor/tests/test_heal.sh
    - name: Print logs
      if: always() && steps.test.outcome == 'failure'
      run: |
        for i in /root/vitastor/testdata/*.log /root/vitastor/testdata/*.txt; do
          echo "-------- $i --------"
          cat $i
          echo ""
        done
  test_heal_ec:
    runs-on: ubuntu-latest
    needs: build
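All three jobs added in this workflow diff share one shape: run a single test script with environment overrides, then dump every log on failure. A rough local equivalent, assuming a checkout at /root/vitastor as in the CI container:

# Local approximation of the test_write_iothreads job above.
cd /root/vitastor
TEST_NAME=iothreads GLOBAL_CONFIG=',"client_iothread_count":4' ./tests/test_write.sh || {
    for i in ./testdata/*.log ./testdata/*.txt; do
        echo "-------- $i --------"
        cat "$i"
        echo ""
    done
}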

View File

@@ -2,6 +2,6 @@ cmake_minimum_required(VERSION 2.8.12)
project(vitastor)
set(VITASTOR_VERSION "2.0.0")
set(VITASTOR_VERSION "2.3.0")
add_subdirectory(src)

View File

@@ -19,7 +19,7 @@ Vitastor is aimed primarily at SSD and SSD+HDD clusters
TCP and RDMA and on good hardware can achieve 4 KB read and write latency of ~0.1 ms,
which is roughly 10 times faster than Ceph and other popular software-defined storage systems.
Vitastor supports a QEMU driver, NBD and NFS protocols, and OpenStack, OpenNebula, Proxmox and Kubernetes drivers.
Vitastor supports a QEMU driver, UBLK, NBD and NFS protocols, and OpenStack, OpenNebula, Proxmox and Kubernetes drivers.
Other drivers can also be implemented easily.
See the documentation links for details. You can start here: [Quick Start](docs/intro/quickstart.ru.md).
@@ -64,8 +64,9 @@ Vitastor supports a QEMU driver, NBD and NFS protocols
- [vitastor-cli](docs/usage/cli.ru.md) (command-line interface)
- [vitastor-disk](docs/usage/disk.ru.md) (disk management)
- [fio](docs/usage/fio.ru.md) for benchmarks
- [NBD](docs/usage/nbd.ru.md) for kernel mounts
- [QEMU and qemu-img](docs/usage/qemu.ru.md)
- [UBLK](docs/usage/ublk.ru.md) for kernel mounts
- [NBD](docs/usage/nbd.ru.md) - old interface for kernel mounts
- [QEMU, qemu-img and VDUSE](docs/usage/qemu.ru.md)
- [NFS](docs/usage/nfs.ru.md) clustered file system and pseudo-FS proxy
- [Administration](docs/usage/admin.ru.md)
- Performance

View File

@@ -19,7 +19,7 @@ supports TCP and RDMA and may achieve 4 KB read and write latency as low as ~0.1
with proper hardware which is ~10 times faster than other popular SDS's like Ceph
or internal systems of public clouds.
Vitastor supports QEMU, NBD, NFS protocols, OpenStack, OpenNebula, Proxmox, Kubernetes drivers.
Vitastor supports QEMU, UBLK, NBD, NFS protocols, OpenStack, OpenNebula, Proxmox, Kubernetes drivers.
More drivers may be created easily.
Read more details in the documentation. You can start from here: [Quick Start](docs/intro/quickstart.en.md).
@@ -64,8 +64,9 @@ Read more details in the documentation. You can start from here: [Quick Start](d
- [vitastor-cli](docs/usage/cli.en.md) (command-line interface)
- [vitastor-disk](docs/usage/disk.en.md) (disk management tool)
- [fio](docs/usage/fio.en.md) for benchmarks
- [NBD](docs/usage/nbd.en.md) for kernel mounts
- [QEMU and qemu-img](docs/usage/qemu.en.md)
- [UBLK](docs/usage/ublk.en.md) for kernel mounts
- [NBD](docs/usage/nbd.en.md) - old interface for kernel mounts
- [QEMU, qemu-img and VDUSE](docs/usage/qemu.en.md)
- [NFS](docs/usage/nfs.en.md) clustered file system and pseudo-FS proxy
- [Administration](docs/usage/admin.en.md)
- Performance

View File

@@ -37,8 +37,8 @@ RUN (echo deb http://vitastor.io/debian bookworm main > /etc/apt/sources.list.d/
wget -q -O /etc/apt/trusted.gpg.d/vitastor.gpg https://vitastor.io/debian/pubkey.gpg && \
apt-get update && \
apt-get install -y vitastor-client && \
wget https://vitastor.io/archive/qemu/qemu-bookworm-8.1.2%2Bds-1%2Bvitastor1/qemu-utils_8.1.2%2Bds-1%2Bvitastor1_amd64.deb && \
wget https://vitastor.io/archive/qemu/qemu-bookworm-8.1.2%2Bds-1%2Bvitastor1/qemu-block-extra_8.1.2%2Bds-1%2Bvitastor1_amd64.deb && \
wget https://vitastor.io/archive/qemu/qemu-bookworm-9.2.2%2Bds-1%2Bvitastor4/qemu-utils_9.2.2%2Bds-1%2Bvitastor4_amd64.deb && \
wget https://vitastor.io/archive/qemu/qemu-bookworm-9.2.2%2Bds-1%2Bvitastor4/qemu-block-extra_9.2.2%2Bds-1%2Bvitastor4_amd64.deb && \
dpkg -x qemu-utils*.deb tmp1 && \
dpkg -x qemu-block-extra*.deb tmp1 && \
cp -a tmp1/usr/bin/qemu-storage-daemon /usr/bin/ && \

View File

@@ -1,4 +1,4 @@
VITASTOR_VERSION ?= v2.0.0
VITASTOR_VERSION ?= v2.3.0
all: build push

View File

@@ -49,7 +49,7 @@ spec:
capabilities:
add: ["SYS_ADMIN"]
allowPrivilegeEscalation: true
image: vitalif/vitastor-csi:v2.0.0
image: vitalif/vitastor-csi:v2.3.0
args:
- "--node=$(NODE_ID)"
- "--endpoint=$(CSI_ENDPOINT)"

View File

@@ -121,7 +121,7 @@ spec:
privileged: true
capabilities:
add: ["SYS_ADMIN"]
image: vitalif/vitastor-csi:v2.0.0
image: vitalif/vitastor-csi:v2.3.0
args:
- "--node=$(NODE_ID)"
- "--endpoint=$(CSI_ENDPOINT)"

View File

@@ -5,7 +5,7 @@ package vitastor
const (
vitastorCSIDriverName = "csi.vitastor.io"
vitastorCSIDriverVersion = "2.0.0"
vitastorCSIDriverVersion = "2.3.0"
)
// Config struct fills the parameters of request or user input

View File

@@ -1,7 +1,4 @@
#!/bin/bash
cat < vitastor.Dockerfile > ../Dockerfile
cd ..
mkdir -p packages
sudo podman build --build-arg DISTRO=debian --build-arg REL=bookworm -v `pwd`/packages:/root/packages -f Dockerfile .
rm Dockerfile
docker build --build-arg DISTRO=debian --build-arg REL=bookworm -t vitastor-buildenv:bookworm -f vitastor-buildenv.Dockerfile .
docker run -i --rm -e REL=bookworm -v `dirname $0`/../:/root/vitastor vitastor-buildenv:bookworm /root/vitastor/debian/vitastor-build.sh

View File

@@ -1,7 +1,4 @@
#!/bin/bash
cat < vitastor.Dockerfile > ../Dockerfile
cd ..
mkdir -p packages
sudo podman build --build-arg DISTRO=debian --build-arg REL=bullseye -v `pwd`/packages:/root/packages -f Dockerfile .
rm Dockerfile
docker build --build-arg DISTRO=debian --build-arg REL=bullseye -t vitastor-buildenv:bullseye -f vitastor-buildenv.Dockerfile .
docker run -i --rm -e REL=bullseye -v `dirname $0`/../:/root/vitastor vitastor-buildenv:bullseye /root/vitastor/debian/vitastor-build.sh

View File

@@ -1,7 +1,4 @@
#!/bin/bash
cat < vitastor.Dockerfile > ../Dockerfile
cd ..
mkdir -p packages
sudo podman build --build-arg DISTRO=debian --build-arg REL=buster -v `pwd`/packages:/root/packages -f Dockerfile .
rm Dockerfile
docker build --build-arg DISTRO=debian --build-arg REL=buster -t vitastor-buildenv:buster -f vitastor-buildenv.Dockerfile .
docker run -i --rm -e REL=buster -v `dirname $0`/../:/root/vitastor vitastor-buildenv:buster /root/vitastor/debian/vitastor-build.sh

debian/build-vitastor-trixie.sh vendored Executable file (4 changed lines)
View File

@@ -0,0 +1,4 @@
#!/bin/bash
docker build --build-arg DISTRO=debian --build-arg REL=trixie -t vitastor-buildenv:trixie -f vitastor-buildenv.Dockerfile .
docker run -i --rm -e REL=trixie -v `dirname $0`/../:/root/vitastor vitastor-buildenv:trixie /root/vitastor/debian/vitastor-build.sh

View File

@@ -1,7 +1,5 @@
#!/bin/bash
# Ubuntu 22.04 Jammy Jellyfish
cat < vitastor.Dockerfile > ../Dockerfile
cd ..
mkdir -p packages
sudo podman build --build-arg DISTRO=ubuntu --build-arg REL=jammy -v `pwd`/packages:/root/packages -f Dockerfile .
rm Dockerfile
docker build --build-arg DISTRO=ubuntu --build-arg REL=jammy -t vitastor-buildenv:jammy -f vitastor-buildenv.Dockerfile .
docker run -i --rm -e REL=jammy -v `dirname $0`/../:/root/vitastor vitastor-buildenv:jammy /root/vitastor/debian/vitastor-build.sh

debian/build-vitastor-ubuntu-noble.sh vendored Executable file (5 changed lines)
View File

@@ -0,0 +1,5 @@
#!/bin/bash
# 24.04 Noble Numbat
docker build --build-arg DISTRO=ubuntu --build-arg REL=noble -t vitastor-buildenv:noble -f vitastor-buildenv.Dockerfile .
docker run -i --rm -e REL=noble -v `dirname $0`/../:/root/vitastor vitastor-buildenv:noble /root/vitastor/debian/vitastor-build.sh

debian/changelog vendored (2 changed lines)
View File

@@ -1,4 +1,4 @@
vitastor (2.0.0-1) unstable; urgency=medium
vitastor (2.3.0-1) unstable; urgency=medium
* Bugfixes

debian/control vendored (2 changed lines)
View File

@@ -2,7 +2,7 @@ Source: vitastor
Section: admin
Priority: optional
Maintainer: Vitaliy Filippov <vitalif@yourcmc.ru>
Build-Depends: debhelper, liburing-dev (>= 0.6), g++ (>= 8), libstdc++6 (>= 8),
Build-Depends: debhelper, g++ (>= 8), libstdc++6 (>= 8),
linux-libc-dev, libgoogle-perftools-dev, libjerasure-dev, libgf-complete-dev,
libibverbs-dev, libisal-dev, cmake, pkg-config, libnl-3-dev, libnl-genl-3-dev,
node-bindings <!nocheck>, node-gyp, node-nan

View File

@@ -10,10 +10,14 @@ ARG REL=
WORKDIR /root
RUN if [ "$REL" = "buster" -o "$REL" = "bullseye" -o "$REL" = "bookworm" ]; then \
echo "deb http://deb.debian.org/debian $REL-backports main" >> /etc/apt/sources.list; \
if [ "$REL" = "buster" ]; then \
echo "deb http://archive.debian.org/debian $REL-backports main" >> /etc/apt/sources.list; \
else \
echo "deb http://deb.debian.org/debian $REL-backports main" >> /etc/apt/sources.list; \
fi; \
echo >> /etc/apt/preferences; \
echo 'Package: *' >> /etc/apt/preferences; \
echo "Pin: release a=$REL-backports" >> /etc/apt/preferences; \
echo "Pin: release n=$REL-backports" >> /etc/apt/preferences; \
echo 'Pin-Priority: 500' >> /etc/apt/preferences; \
fi; \
grep '^deb ' /etc/apt/sources.list | perl -pe 's/^deb/deb-src/' >> /etc/apt/sources.list; \
@@ -22,7 +26,7 @@ RUN if [ "$REL" = "buster" -o "$REL" = "bullseye" -o "$REL" = "bookworm" ]; then
echo 'APT::Install-Suggests false;' >> /etc/apt/apt.conf
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive TZ=Europe/Moscow apt-get -y install fio liburing-dev libgoogle-perftools-dev devscripts
RUN DEBIAN_FRONTEND=noninteractive TZ=Europe/Moscow apt-get -y install fio libgoogle-perftools-dev devscripts
RUN DEBIAN_FRONTEND=noninteractive TZ=Europe/Moscow apt-get -y build-dep qemu
# To build a custom version
#RUN cp /root/packages/qemu-orig/* /root
@@ -56,7 +60,7 @@ RUN set -e; \
quilt add block/vitastor.c; \
cp /root/qemu_driver.c block/vitastor.c; \
quilt refresh; \
V=$(head -n1 debian/changelog | perl -pe 's/5\.2\+dfsg-9/5.2+dfsg-11/; s/^.*\((.*?)(~bpo[\d\+]*)?\).*$/$1/')+vitastor4; \
V=$(head -n1 debian/changelog | perl -pe 's/5\.2\+dfsg-9/5.2+dfsg-11/; s/^.*\((.*?)(\+deb\d+u\d+)?(~bpo[\d\+]*)?\).*$/$1/')+vitastor5; \
if [ "$REL" = bullseye ]; then V=${V}bullseye; fi; \
DEBEMAIL="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v $V 'Plug Vitastor block driver'; \
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage --jobs=auto -sa; \
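The widened version regex in this hunk additionally strips Debian point-release suffixes (`+debNuM`) from the QEMU version before appending the vitastor suffix. A worked example with a hypothetical changelog line:

# Hypothetical changelog line; the substitution is the one from the hunk above.
echo 'qemu (1:9.2.2+ds-1+deb13u1~bpo12+1) bookworm-backports; urgency=medium' \
  | perl -pe 's/^.*\((.*?)(\+deb\d+u\d+)?(~bpo[\d\+]*)?\).*$/$1/'
# prints "1:9.2.2+ds-1": both +deb13u1 and ~bpo12+1 are dropped,
# after which the build script appends +vitastor5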

debian/vitastor-build.sh vendored Executable file (60 changed lines)
View File

@@ -0,0 +1,60 @@
#!/bin/bash
# To be run inside the buildenv docker container
set -e -x
[ -e /usr/lib/x86_64-linux-gnu/pkgconfig/libisal.pc ] || cp /root/vitastor/debian/libisal.pc /usr/lib/x86_64-linux-gnu/pkgconfig
mkdir -p /root/fio-build/
cd /root/fio-build/
rm -rf /root/fio-build/*
dpkg-source -x /root/fio*.dsc
FULLVER=`head -n1 /root/vitastor/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'`
VER=${FULLVER%%-*}
rm -rf /root/vitastor-$VER
mkdir /root/vitastor-$VER
cd /root/vitastor
cp -a $(ls | grep -v packages) /root/vitastor-$VER
rm -rf /root/vitastor/packages/vitastor-$REL
mkdir -p /root/vitastor/packages/vitastor-$REL
mv /root/vitastor-$VER /root/vitastor/packages/vitastor-$REL/
cd /root/vitastor/packages/vitastor-$REL/vitastor-$VER
rm -rf fio
ln -s /root/fio-build/fio-*/ ./fio
FIO=`head -n1 fio/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'`
ls /usr/include/linux/raw.h || cp ./debian/raw.h /usr/include/linux/raw.h
sh copy-fio-includes.sh
rm fio
mkdir -p a b debian/patches
mv fio-copy b/fio
diff -NaurpbB a b > debian/patches/fio-headers.patch || true
echo fio-headers.patch >> debian/patches/series
rm -rf a b
echo "dep:fio=$FIO" > debian/fio_version
cd /root/vitastor/packages/vitastor-$REL/vitastor-$VER
mkdir mon/node_modules
cd mon/node_modules
curl -s https://git.yourcmc.ru/vitalif/antietcd/archive/master.tar.gz | tar -zx
curl -s https://git.yourcmc.ru/vitalif/tinyraft/archive/master.tar.gz | tar -zx
cd /root/vitastor/packages/vitastor-$REL
tar --sort=name --mtime='2020-01-01' --owner=0 --group=0 --exclude=debian -cJf vitastor_$VER.orig.tar.xz vitastor-$VER
cd vitastor-$VER
DEBEMAIL="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v "$FULLVER""$REL" "Rebuild for $REL"
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage --jobs=auto -sa
rm -rf /root/vitastor/packages/vitastor-$REL/vitastor-*/
# Why does ubuntu rename debug packages to *.ddeb?
cd /root/vitastor/packages/vitastor-$REL
if ls *.ddeb >/dev/null; then
    perl -i -pe 's/\.ddeb/.deb/' *.buildinfo *.changes
    for i in *.ddeb; do
        mv $i ${i%%.ddeb}.deb
    done
fi

debian/vitastor-buildenv.Dockerfile vendored Normal file (31 changed lines)
View File

@@ -0,0 +1,31 @@
# Build environment for building Vitastor packages for Debian inside a container
# cd ..
# docker build --build-arg DISTRO=debian --build-arg REL=bullseye -f debian/vitastor.Dockerfile -t vitastor-buildenv:bullseye .
# docker run --rm -e REL=bullseye -v ./:/root/vitastor /root/vitastor/debian/vitastor-build.sh
ARG DISTRO=debian
ARG REL=
FROM $DISTRO:$REL
ARG DISTRO=debian
ARG REL=
WORKDIR /root
RUN set -e -x; \
if [ "$REL" = "buster" ]; then \
perl -i -pe 's/deb.debian.org/archive.debian.org/' /etc/apt/sources.list; \
apt-get update; \
apt-get -y install wget; \
wget https://vitastor.io/debian/pubkey.gpg -O /etc/apt/trusted.gpg.d/vitastor.gpg; \
echo "deb https://vitastor.io/debian $REL main" >> /etc/apt/sources.list; \
fi; \
grep '^deb ' /etc/apt/sources.list | perl -pe 's/^deb/deb-src/' >> /etc/apt/sources.list; \
perl -i -pe 's/Types: deb$/Types: deb deb-src/' /etc/apt/sources.list.d/*.sources || true; \
echo 'APT::Install-Recommends false;' >> /etc/apt/apt.conf; \
echo 'APT::Install-Suggests false;' >> /etc/apt/apt.conf
RUN apt-get update && \
apt-get -y install fio libgoogle-perftools-dev devscripts libjerasure-dev cmake \
libibverbs-dev librdmacm-dev libisal-dev libnl-3-dev libnl-genl-3-dev curl nodejs npm node-nan node-bindings && \
apt-get -y build-dep fio && \
apt-get --download-only source fio

View File

@@ -2,6 +2,7 @@ usr/bin/vita
usr/bin/vitastor-cli
usr/bin/vitastor-rm
usr/bin/vitastor-nbd
usr/bin/vitastor-ublk
usr/bin/vitastor-nfs
usr/bin/vitastor-kv
usr/bin/vitastor-kv-stress

View File

@@ -1,65 +0,0 @@
# Build Vitastor packages for Debian inside a container
# cd ..; podman build --build-arg DISTRO=debian --build-arg REL=bullseye -v `pwd`/packages:/root/packages -f debian/vitastor.Dockerfile .
ARG DISTRO=debian
ARG REL=
FROM $DISTRO:$REL
ARG DISTRO=debian
ARG REL=
WORKDIR /root
RUN set -e -x; \
if [ "$REL" = "buster" ]; then \
apt-get update; \
apt-get -y install wget; \
wget https://vitastor.io/debian/pubkey.gpg -O /etc/apt/trusted.gpg.d/vitastor.gpg; \
echo "deb https://vitastor.io/debian $REL main" >> /etc/apt/sources.list; \
fi; \
grep '^deb ' /etc/apt/sources.list | perl -pe 's/^deb/deb-src/' >> /etc/apt/sources.list; \
perl -i -pe 's/Types: deb$/Types: deb deb-src/' /etc/apt/sources.list.d/debian.sources || true; \
echo 'APT::Install-Recommends false;' >> /etc/apt/apt.conf; \
echo 'APT::Install-Suggests false;' >> /etc/apt/apt.conf
RUN apt-get update && \
apt-get -y install fio liburing-dev libgoogle-perftools-dev devscripts libjerasure-dev cmake \
libibverbs-dev librdmacm-dev libisal-dev libnl-3-dev libnl-genl-3-dev curl nodejs npm node-nan node-bindings && \
apt-get -y build-dep fio && \
apt-get --download-only source fio
ADD . /root/vitastor
RUN set -e -x; \
[ -e /usr/lib/x86_64-linux-gnu/pkgconfig/libisal.pc ] || cp /root/vitastor/debian/libisal.pc /usr/lib/x86_64-linux-gnu/pkgconfig; \
mkdir -p /root/fio-build/; \
cd /root/fio-build/; \
rm -rf /root/fio-build/*; \
dpkg-source -x /root/fio*.dsc; \
mkdir -p /root/packages/vitastor-$REL; \
rm -rf /root/packages/vitastor-$REL/*; \
cd /root/packages/vitastor-$REL; \
FULLVER=$(head -n1 /root/vitastor/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
VER=${FULLVER%%-*}; \
cp -r /root/vitastor vitastor-$VER; \
cd vitastor-$VER; \
ln -s /root/fio-build/fio-*/ ./fio; \
FIO=$(head -n1 fio/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
ls /usr/include/linux/raw.h || cp ./debian/raw.h /usr/include/linux/raw.h; \
sh copy-fio-includes.sh; \
rm fio; \
mkdir -p a b debian/patches; \
mv fio-copy b/fio; \
diff -NaurpbB a b > debian/patches/fio-headers.patch || true; \
echo fio-headers.patch >> debian/patches/series; \
rm -rf a b; \
echo "dep:fio=$FIO" > debian/fio_version; \
cd /root/packages/vitastor-$REL/vitastor-$VER; \
mkdir mon/node_modules; \
cd mon/node_modules; \
curl -s https://git.yourcmc.ru/vitalif/antietcd/archive/master.tar.gz | tar -zx; \
curl -s https://git.yourcmc.ru/vitalif/tinyraft/archive/master.tar.gz | tar -zx; \
cd /root/packages/vitastor-$REL; \
tar --sort=name --mtime='2020-01-01' --owner=0 --group=0 --exclude=debian -cJf vitastor_$VER.orig.tar.xz vitastor-$VER; \
cd vitastor-$VER; \
DEBFULLNAME="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v "$FULLVER""$REL" "Rebuild for $REL"; \
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage --jobs=auto -sa; \
rm -rf /root/packages/vitastor-$REL/vitastor-*/

View File

@@ -1,9 +1,9 @@
VITASTOR_VERSION ?= v2.0.0
VITASTOR_VERSION ?= v2.3.0
all: build push
build:
@docker build --rm -t vitalif/vitastor:$(VITASTOR_VERSION) .
@docker build --no-cache --rm -t vitalif/vitastor:$(VITASTOR_VERSION) .
push:
@docker push vitalif/vitastor:$(VITASTOR_VERSION)

View File

@@ -0,0 +1,3 @@
Package: *
Pin: release n=bookworm-backports
Pin-Priority: 500

View File

@@ -1 +1,2 @@
deb http://vitastor.io/debian bookworm main
deb http://http.debian.net/debian/ bookworm-backports main

View File

@@ -4,7 +4,7 @@
#
# Desired Vitastor version
VITASTOR_VERSION=v2.0.0
VITASTOR_VERSION=v2.3.0
# Additional arguments for all containers
# For example, you may want to specify a custom logging driver here

View File

@@ -24,6 +24,10 @@ affect their interaction with the cluster.
- [nbd_max_devices](#nbd_max_devices)
- [nbd_max_part](#nbd_max_part)
- [osd_nearfull_ratio](#osd_nearfull_ratio)
- [hostname](#hostname)
- [ublk_queue_depth](#ublk_queue_depth)
- [ublk_max_io_size](#ublk_max_io_size)
- [qemu_file_mirror_path](#qemu_file_mirror_path)
## client_iothread_count
@@ -215,3 +219,37 @@ just one OSD becomes 100 % full!
However, unlike in Ceph, 100 % full Vitastor OSDs don't crash (in Ceph they're
unable to start at all), so you'll be able to recover from "out of space" errors
without destroying and recreating OSDs.
## hostname
- Type: string
- Can be changed online: yes
Clients use the host name to find their distance to OSDs when [localized reads](pool.en.md#local_reads)
are enabled. By default, the standard [gethostname](https://man7.org/linux/man-pages/man2/gethostname.2.html)
function is used to determine the host name, but you can also override it with this parameter.
## ublk_queue_depth
- Type: integer
- Default: 256
Default queue depth for [Vitastor ublk servers](../usage/ublk.en.md).
## ublk_max_io_size
- Type: integer
Default maximum I/O size for Vitastor [ublk servers](../usage/ublk.en.md).
If not specified, the larger of 1 MB and the pool block size multiplied by the EC data chunk count is used.
## qemu_file_mirror_path
- Type: string
When set to an FS directory path (for example, `/mnt/vitastor/`), `qemu-img info` and similar
QAPI commands return the name of the image inside this directory as `filename` instead of the
normal `vitastor://?image=abc` URI.
This allows you to mount this path using [vitastor-nfs](../usage/nfs.en.md) and trick
third-party systems like Veeam, which rely on `filename` in the image info but don't support Vitastor.
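The four options documented above can be sketched together in one configuration fragment. Values are illustrative only; the comment shows the documented default computation for ublk_max_io_size:

# Illustrative vitastor.conf fragment; all values are examples.
# ublk_max_io_size default if unset: max(1 MB, block size * EC data chunks),
# e.g. a 128 KB block size with EC 4+2 gives max(1 MB, 512 KB) = 1 MB.
cat > /etc/vitastor/vitastor.conf <<'EOF'
{
  "hostname": "node1",
  "ublk_queue_depth": 256,
  "ublk_max_io_size": 1048576,
  "qemu_file_mirror_path": "/mnt/vitastor/"
}
EOF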

View File

@@ -24,6 +24,10 @@
- [nbd_max_devices](#nbd_max_devices)
- [nbd_max_part](#nbd_max_part)
- [osd_nearfull_ratio](#osd_nearfull_ratio)
- [hostname](#hostname)
- [ublk_queue_depth](#ublk_queue_depth)
- [ublk_max_io_size](#ublk_max_io_size)
- [qemu_file_mirror_path](#qemu_file_mirror_path)
## client_iothread_count
@@ -219,3 +223,40 @@ RDMA and want to increase peak performance
(in Ceph, 100% full OSDs cannot start at all), so you will be able to
recover the cluster after "out of space" errors
without destroying and recreating OSDs.
## hostname
- Type: string
- Can be changed online: yes
Clients use the host name to determine their distance to OSDs when
[local reads](pool.ru.md#local_reads) are enabled. By default, the standard
[gethostname](https://man7.org/linux/man-pages/man2/gethostname.2.html) function is used
to determine the host name, but you can also set it manually with this parameter.
## ublk_queue_depth
- Type: integer
- Default: 256
Default queue depth for [Vitastor ublk servers](../usage/ublk.ru.md).
## ublk_max_io_size
- Type: integer
Maximum I/O request size for [Vitastor ublk servers](../usage/ublk.ru.md).
If not specified, the maximum of 1 MB and the pool block size multiplied by the
number of data chunks of an EC pool is used.
## qemu_file_mirror_path
- Type: string
When this option is set to an FS directory path, `qemu-img info` and similar
QAPI commands return the name of the image inside the given directory in the
`filename` field instead of the usual `vitastor://?image=abc` URI.
This allows you to mount this path using [vitastor-nfs](../usage/nfs.ru.md) and trick
third-party systems like Veeam, which rely on the `filename` field in QEMU image info
but don't support Vitastor.

View File

@@ -74,13 +74,13 @@ Grafana dashboard suitable for this exporter is here: [Vitastor-Grafana-6+.json]
- Type: integer
- Default: 8060
HTTP port for monitors to listen on (including metrics exporter)
HTTP port for monitors to listen to (including metrics exporter)
## mon_http_ip
- Type: string
IP address for monitors to listen on (all addresses by default)
IP address for monitors to listen to (all addresses by default)
## mon_https_cert

View File

@@ -9,9 +9,11 @@
These parameters apply to clients and OSDs and affect network connection logic
between clients, OSDs and etcd.
- [tcp_header_buffer_size](#tcp_header_buffer_size)
- [use_sync_send_recv](#use_sync_send_recv)
- [osd_network](#osd_network)
- [osd_cluster_network](#osd_cluster_network)
- [use_rdma](#use_rdma)
- [use_rdmacm](#use_rdmacm)
- [disable_tcp](#disable_tcp)
- [rdma_device](#rdma_device)
- [rdma_port_num](#rdma_port_num)
- [rdma_gid_index](#rdma_gid_index)
@@ -30,38 +32,63 @@ between clients, OSDs and etcd.
- [etcd_slow_timeout](#etcd_slow_timeout)
- [etcd_keepalive_timeout](#etcd_keepalive_timeout)
- [etcd_ws_keepalive_interval](#etcd_ws_keepalive_interval)
- [etcd_min_reload_interval](#etcd_min_reload_interval)
- [tcp_header_buffer_size](#tcp_header_buffer_size)
- [min_zerocopy_send_size](#min_zerocopy_send_size)
- [use_sync_send_recv](#use_sync_send_recv)
## tcp_header_buffer_size
## osd_network
- Type: integer
- Default: 65536
- Type: string or array of strings
Size of the buffer used to read data using an additional copy. Vitastor
packet headers are 128 bytes, payload is always at least 4 KB, so it is
usually beneficial to try to read multiple packets at once even though
it requires copying the data an additional time. The rest of each packet
is received without an additional copy. You can experiment with this
parameter and see how it affects random IOPS and linear bandwidth.
Network mask of public OSD network(s) (IPv4 or IPv6). Each OSD listens to all
addresses of UP + RUNNING interfaces matching one of these networks, on the
same port. Port is auto-selected except if [bind_port](osd.en.md#bind_port) is
explicitly specified. Bind address(es) may also be overridden manually by
specifying [bind_address](osd.en.md#bind_address). If OSD networks are not specified
at all, OSD just listens to a wildcard address (0.0.0.0).
## use_sync_send_recv
## osd_cluster_network
- Type: boolean
- Default: false
- Type: string or array of strings
If true, synchronous send/recv syscalls are used instead of io_uring for
socket communication. Useless for OSDs because they require io_uring anyway,
but may be required for clients with old kernel versions.
Network mask of separate network(s) (IPv4 or IPv6) to use for OSD
cluster connections. I.e. OSDs will always attempt to use these networks
to connect to other OSDs, while clients will attempt to use networks from
[osd_network](#osd_network).
## use_rdma
- Type: boolean
- Default: true
Try to use RDMA for communication if it's available. Disable if you don't
want Vitastor to use RDMA. TCP-only clients can also talk to an RDMA-enabled
cluster, so disabling RDMA may be needed if clients have RDMA devices,
but they are not connected to the cluster.
Try to use RDMA through libibverbs for communication if it's available.
Disable if you don't want Vitastor to use RDMA. TCP-only clients can also
talk to an RDMA-enabled cluster, so disabling RDMA may be needed if clients
have RDMA devices, but they are not connected to the cluster.
`use_rdma` works with RoCEv1/RoCEv2 networks, but not with iWARP and,
possibly, not with some Infiniband configurations which require RDMA-CM.
Consider `use_rdmacm` for such networks.
## use_rdmacm
- Type: boolean
- Default: false
Use an alternative implementation of RDMA through RDMA-CM (Connection
Manager). Works with all RDMA networks: Infiniband, iWARP and
RoCEv1/RoCEv2, and even allows disabling TCP and running only over RDMA.
OSDs always use random port numbers for RDMA-CM listeners, different
from their TCP ports. `use_rdma` is automatically disabled when
`use_rdmacm` is enabled.
## disable_tcp
- Type: boolean
- Default: true
Fully disable TCP and only use RDMA-CM for OSD communication.
## rdma_device
@@ -92,12 +119,13 @@ PFC (Priority Flow Control) and ECN (Explicit Congestion Notification).
## rdma_port_num
- Type: integer
- Default: 1
RDMA device port number to use. Only for devices that have more than 1 port.
See `phys_port_cnt` in `ibv_devinfo -v` output to determine how many ports
your device has.
Not relevant for RDMA-CM (use_rdmacm).
## rdma_gid_index
- Type: integer
@@ -113,13 +141,14 @@ GID auto-selection is unsupported with libibverbs < v32.
A correct rdma_gid_index for RoCEv2 is usually 1 (IPv6) or 3 (IPv4).
Not relevant for RDMA-CM (use_rdmacm).
## rdma_mtu
- Type: integer
- Default: 4096
RDMA Path MTU to use. Must be 1024, 2048 or 4096. There is usually no
sense to change it from the default 4096.
RDMA Path MTU to use. Must be 1024, 2048 or 4096. Default is to use the
RDMA device's MTU.
## rdma_max_sge
@@ -261,3 +290,63 @@ etcd_report_interval to guarantee that keepalive actually works.
etcd websocket ping interval required to keep the connection alive and
detect disconnections quickly.
## etcd_min_reload_interval
- Type: milliseconds
- Default: 1000
- Can be changed online: yes
Minimum interval for full etcd state reload. Introduced to prevent
excessive load on etcd during outages when etcd can't keep up with event
streams and cancels them.
## tcp_header_buffer_size
- Type: integer
- Default: 65536
Size of the buffer used to read data using an additional copy. Vitastor
packet headers are 128 bytes, payload is always at least 4 KB, so it is
usually beneficial to try to read multiple packets at once even though
it requires copying the data an additional time. The rest of each packet
is received without an additional copy. You can experiment with this
parameter and see how it affects random IOPS and linear bandwidth.
## min_zerocopy_send_size
- Type: integer
- Default: 32768
OSDs and clients will attempt to use io_uring-based zero-copy TCP send
for buffers larger than this number of bytes. Zero-copy send with io_uring is
supported since Linux kernel version 6.1. Support is auto-detected and disabled
automatically when not available. It can also be disabled explicitly by setting
this parameter to a negative value.
Warning! Zero-copy send performance may vary greatly from CPU to CPU and from
one kernel version to another. Generally, it tends to be beneficial only with
larger messages; with smaller messages (say, 4 KB) it may actually be slower.
32 KB is enough for almost all CPUs, but even smaller values are optimal for
some of them: for example, 4 KB is OK for EPYC Milan/Genoa and 12 KB for
Xeon Ice Lake (but please verify it yourself).
Verification instructions:
1. Add `iommu=pt` into your Linux kernel command line and reboot.
2. Upgrade your kernel. For example, it's very important to use 6.11+ with recent AMD EPYCs.
3. Run some tests with the [send-zerocopy liburing example](https://github.com/axboe/liburing/blob/master/examples/send-zerocopy.c)
to find the minimal message size for which zero-copy is optimal.
Use `./send-zerocopy tcp -4 -R` at the server side and
`time ./send-zerocopy tcp -4 -b 0 -s BUFFER_SIZE -D SERVER_IP` at the client side with
`-z 0` (no zero-copy) and `-z 1` (zero-copy), and compare MB/s and used CPU time
(user+system).
## use_sync_send_recv
- Type: boolean
- Default: false
If true, synchronous send/recv syscalls are used instead of io_uring for
socket communication. Useless for OSDs because they require io_uring anyway,
but may be required for clients with old kernel versions.
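The min_zerocopy_send_size verification steps above are easy to wrap in a small sweep. A sketch, assuming the liburing send-zerocopy example binary is in the current directory and the server side is already running `./send-zerocopy tcp -4 -R`:

# Sweep a few buffer sizes with and without zero-copy, per the instructions above.
SERVER_IP=192.0.2.10   # placeholder, replace with the server's address
for size in 4096 12288 32768 65536; do
    for z in 0 1; do
        echo "=== buffer=$size zero-copy=$z ==="
        time ./send-zerocopy tcp -4 -b 0 -s "$size" -z "$z" -D "$SERVER_IP"
    done
done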

View File

@@ -9,9 +9,11 @@
These parameters are used by clients and OSDs and affect the logic of network
interaction between clients, OSDs and etcd.
- [tcp_header_buffer_size](#tcp_header_buffer_size)
- [use_sync_send_recv](#use_sync_send_recv)
- [osd_network](#osd_network)
- [osd_cluster_network](#osd_cluster_network)
- [use_rdma](#use_rdma)
- [use_rdmacm](#use_rdmacm)
- [disable_tcp](#disable_tcp)
- [rdma_device](#rdma_device)
- [rdma_port_num](#rdma_port_num)
- [rdma_gid_index](#rdma_gid_index)
@@ -30,41 +32,63 @@
- [etcd_slow_timeout](#etcd_slow_timeout)
- [etcd_keepalive_timeout](#etcd_keepalive_timeout)
- [etcd_ws_keepalive_interval](#etcd_ws_keepalive_interval)
- [etcd_min_reload_interval](#etcd_min_reload_interval)
- [tcp_header_buffer_size](#tcp_header_buffer_size)
- [min_zerocopy_send_size](#min_zerocopy_send_size)
- [use_sync_send_recv](#use_sync_send_recv)
## tcp_header_buffer_size
## osd_network
- Type: integer
- Default: 65536
- Type: string or array of strings
Size of the buffer used to read data with an additional copy. Vitastor
packets contain 128-byte headers followed by at least 4 KB of data, so for
small I/O it is usually beneficial to read several packets per call, even
though this requires copying the data an extra time. The part of each packet
beyond this parameter's value is read without an additional copy. You can
try changing this parameter and see how it affects random and linear access
performance.
Network mask(s) (IPv4 or IPv6) of the public OSD network(s). Each OSD listens
to the same port on all addresses of UP + RUNNING network interfaces matching
one of the specified networks. The port is selected automatically unless
[bind_port](osd.ru.md#bind_port) is set explicitly. Bind address(es) can also be
overridden explicitly by setting [bind_address](osd.ru.md#bind_address). If OSD
networks are not specified at all, the OSD listens to all addresses (0.0.0.0).
## use_sync_send_recv
## osd_cluster_network
- Type: boolean
- Default: false
- Type: string or array of strings
If true, synchronous send/recv syscalls are used instead of io_uring for
socket communication. This is pointless for OSDs, which require io_uring
anyway, but in principle it may be used for clients with old kernel versions.
Network mask(s) (IPv4 or IPv6) of the separate OSD cluster network(s).
That is, OSDs will always try to use these networks for connections to other
OSDs, while clients will try to use the networks from [osd_network](#osd_network).
## use_rdma
- Type: boolean
- Default: true
Try to use RDMA for communication if devices are available.
Disable it if you don't want Vitastor to use RDMA.
TCP clients can also talk to an RDMA cluster, so disabling RDMA may only be
needed if clients have RDMA devices but they are not connected to the
Vitastor cluster.
Try to use RDMA through libibverbs for communication if devices are
available. Disable it if you don't want Vitastor to use RDMA. TCP clients
can also talk to an RDMA cluster, so disabling RDMA may only be needed
if clients have RDMA devices but they are not connected to the Vitastor
cluster.
`use_rdma` works with RoCEv1/RoCEv2 networks, but doesn't work with iWARP
and may not work with some Infiniband configurations which require RDMA-CM.
Consider enabling `use_rdmacm` for such networks.
## use_rdmacm
- Type: boolean
- Default: false
Use an alternative implementation of RDMA based on RDMA-CM (Connection
Manager). It works with all types of RDMA networks: Infiniband, iWARP and
RoCEv1/RoCEv2, and even allows disabling TCP entirely and running only over
RDMA. OSDs use random port numbers for accepting RDMA-CM connections,
different from their TCP ports. Enabling `use_rdmacm` also automatically
disables the `use_rdma` option.
## disable_tcp
- Type: boolean
- Default: true
Fully disable TCP and use only RDMA-CM for OSD connections.
@@ -96,13 +120,14 @@ Control) and ECN (Explicit Congestion Notification).
## rdma_port_num
- Type: integer
- Default: 1
RDMA device port number to use. Only makes sense for devices with more
than one port. To find out how many ports your adapter has, check
`phys_port_cnt` in the output of `ibv_devinfo -v`.
Not relevant for RDMA-CM (use_rdmacm).
## rdma_gid_index
- Type: integer
@@ -119,13 +144,14 @@ libibverbs < v32.
A correct rdma_gid_index for RoCEv2 is usually 1 (IPv6) or 3 (IPv4).
Not relevant for RDMA-CM (use_rdmacm).
## rdma_mtu
- Type: integer
- Default: 4096
RDMA Path MTU to use. Must be equal to 1024,
2048 or 4096. There is usually no sense to change the default of 4096.
2048 or 4096. The RDMA device's MTU is used by default.
## rdma_max_sge
@@ -271,3 +297,65 @@ etcd_report_interval, чтобы keepalive гарантированно рабо
- Можно менять на лету: да
Интервал проверки живости вебсокет-подключений к etcd.
## etcd_min_reload_interval
- Тип: миллисекунды
- Значение по умолчанию: 1000
- Можно менять на лету: да
Минимальный интервал полной перезагрузки состояния из etcd. Добавлено для
предотвращения избыточной нагрузки на etcd во время отказов, когда etcd не
успевает рассылать потоки событий и отменяет их.
## tcp_header_buffer_size
- Тип: целое число
- Значение по умолчанию: 65536
Размер буфера для чтения данных с дополнительным копированием. Пакеты
Vitastor содержат 128-байтные заголовки, за которыми следуют данные размером
от 4 КБ и для мелких операций ввода-вывода обычно выгодно за 1 вызов читать
сразу несколько пакетов, даже не смотря на то, что это требует лишний раз
скопировать данные. Часть каждого пакета за пределами значения данного
параметра читается без дополнительного копирования. Вы можете попробовать
поменять этот параметр и посмотреть, как он влияет на производительность
случайного и линейного доступа.
## min_zerocopy_send_size
- Тип: целое число
- Значение по умолчанию: 32768
OSD и клиенты будут пробовать использовать TCP-отправку без копирования (zero-copy) на
основе io_uring для буферов, больших, чем это число байт. Отправка без копирования
поддерживается в io_uring, начиная с версии ядра Linux 6.1. Наличие поддержки
проверяется автоматически и zero-copy отключается, когда поддержки нет. Также
её можно отключить явно, установив данный параметр в отрицательное значение.
Внимание! Производительность данной функции может сильно отличаться на разных
процессорах и на разных версиях ядра Linux. В целом, zero-copy обычно быстрее с
большими сообщениями, а с мелкими (например, 4 КБ) zero-copy может быть даже
медленнее. 32 КБ достаточно почти для всех процессоров, но для каких-то можно
использовать даже меньшие значения. Например, для EPYC Milan/Genoa подходит 4 КБ,
а для Xeon Ice Lake - 12 КБ (но, пожалуйста, перепроверьте это сами).
Инструкция по проверке:
1. Добавьте `iommu=pt` в командную строку загрузки вашего ядра Linux и перезагрузитесь.
2. Обновите ядро. Например, для AMD EPYC очень важно использовать версию 6.11+.
3. Позапускайте тесты с помощью [send-zerocopy из примеров liburing](https://github.com/axboe/liburing/blob/master/examples/send-zerocopy.c),
чтобы найти минимальный размер сообщения, для которого zero-copy отправка оптимальна.
Запускайте `./send-zerocopy tcp -4 -R` на стороне сервера и
`time ./send-zerocopy tcp -4 -b 0 -s РАЗМЕР_БУФЕРА -D АДРЕС_СЕРВЕРА` на стороне клиента
с опцией `-z 0` (обычная отправка) и `-z 1` (отправка без копирования), и сравнивайте
скорость в МБ/с и занятое процессорное время (user+system).
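Например, если тесты показали, что на вашем процессоре выгода от zero-copy
начинается уже с 4 КБ, параметр можно задать в `/etc/vitastor/vitastor.conf`
(набросок; значение 4096 здесь условное):
```
{
  "min_zerocopy_send_size": 4096
}
```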
## use_sync_send_recv
- Тип: булево (да/нет)
- Значение по умолчанию: false
Если установлено в истину, то вместо io_uring для передачи данных по сети
будут использоваться обычные синхронные системные вызовы send/recv. Для OSD
это бессмысленно, так как OSD в любом случае нуждается в io_uring, но, в
принципе, это может применяться для клиентов со старыми версиями ядра.

View File

@@ -10,13 +10,12 @@ These parameters only apply to OSDs, are not fixed at the moment of OSD drive
initialization and can be changed - in /etc/vitastor/vitastor.conf or [vitastor-disk update-sb](../usage/disk.en.md#update-sb)
with an OSD restart or, for some of them, even without restarting by updating configuration in etcd.
- [bind_address](#bind_address)
- [bind_port](#bind_port)
- [osd_iothread_count](#osd_iothread_count)
- [etcd_report_interval](#etcd_report_interval)
- [etcd_stats_interval](#etcd_stats_interval)
- [run_primary](#run_primary)
- [osd_network](#osd_network)
- [bind_address](#bind_address)
- [bind_port](#bind_port)
- [autosync_interval](#autosync_interval)
- [autosync_writes](#autosync_writes)
- [recovery_queue_depth](#recovery_queue_depth)
@@ -63,6 +62,26 @@ with an OSD restart or, for some of them, even without restarting by updating co
- [recovery_tune_sleep_cutoff_us](#recovery_tune_sleep_cutoff_us)
- [discard_on_start](#discard_on_start)
- [min_discard_size](#min_discard_size)
- [allow_net_split](#allow_net_split)
- [enable_pg_locks](#enable_pg_locks)
- [pg_lock_retry_interval_ms](#pg_lock_retry_interval_ms)
## bind_address
- Type: string or array of strings
Instead of the network masks ([osd_network](network.en.md#osd_network) and
[osd_cluster_network](network.en.md#osd_cluster_network)), you can also set
OSD listen addresses explicitly using this parameter. May be useful if you
want to start OSDs on interfaces that are not UP + RUNNING.
## bind_port
- Type: integer
By default, OSDs pick random ports to use for incoming connections
automatically. With this option you can set a specific port for a specific
OSD by hand.
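For example, a sketch of a `/etc/vitastor/vitastor.conf` fragment for a host
running a single OSD (the address and port here are made up; with several OSDs
per host, set bind_port individually per OSD, e.g. via `vitastor-disk update-sb`,
to avoid port collisions):
```
{
  "bind_address": ["10.200.1.10"],
  "bind_port": 10200
}
```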
## osd_iothread_count
@@ -106,34 +125,6 @@ debugging purposes. It's possible to implement additional feature for the
monitor which may allow to separate primary and secondary OSDs, but it's
unclear why anyone could need it, so it's not implemented.
## osd_network
- Type: string or array of strings
Network mask of the network (IPv4 or IPv6) to use for OSDs. Note that
although it's possible to specify multiple networks here, this does not
mean that OSDs will create multiple listening sockets - they'll only
pick the first matching address of an UP + RUNNING interface. Separate
networks for cluster and client connections are also not implemented, but
they are mostly useless anyway, so it's not a big deal.
## bind_address
- Type: string
- Default: 0.0.0.0
Instead of the network mask, you can also set OSD listen address explicitly
using this parameter. May be useful if you want to start OSDs on interfaces
that are not UP + RUNNING.
## bind_port
- Type: integer
By default, OSDs pick random ports to use for incoming connections
automatically. With this option you can set a specific port for a specific
OSD by hand.
## autosync_interval
- Type: seconds
@@ -644,3 +635,34 @@ Discard (SSD TRIM) unused data device blocks on every OSD startup.
- Default: 1048576
Minimum consecutive block size to TRIM it.
## allow_net_split
- Type: boolean
- Default: false
Allow "safe" cases of network splits/partitions - allow to start PGs without
connections to some OSDs currently registered as alive in etcd, if the number
of actually connected PG OSDs is at least pg_minsize. That is, allow some OSDs to lose
connectivity with some other OSDs as long as it doesn't break pg_minsize guarantees.
The downside is that it increases the probability of writing data into just pg_minsize
OSDs during failover which can lead to PGs becoming incomplete after additional outages.
The old behaviour in versions up to 2.0.0 was equal to enabled allow_net_split.
## enable_pg_locks
- Type: boolean
Vitastor 2.2.0 introduces a new split-brain prevention layer in addition to etcd:
PG locks. They prevent split-brain even in abnormal theoretical cases
when etcd is extremely laggy. As a new feature, by default, PG locks are only enabled
for pools where they're required - pools with [localized reads](pool.en.md#local_reads).
Use this parameter to enable or disable this function for all pools.
## pg_lock_retry_interval_ms
- Type: milliseconds
- Default: 100
Retry interval for failed PG lock attempts.
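As an illustration, a hypothetical configuration fragment that enables PG locks
for all pools while keeping the strict network split behaviour and the default
lock retry interval:
```
{
  "allow_net_split": false,
  "enable_pg_locks": true,
  "pg_lock_retry_interval_ms": 100
}
```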

View File

@@ -11,13 +11,12 @@
момент с перезапуском OSD в /etc/vitastor/vitastor.conf или [vitastor-disk update-sb](../usage/disk.ru.md#update-sb),
а некоторые и без перезапуска, с помощью изменения конфигурации в etcd.
- [bind_address](#bind_address)
- [bind_port](#bind_port)
- [osd_iothread_count](#osd_iothread_count)
- [etcd_report_interval](#etcd_report_interval)
- [etcd_stats_interval](#etcd_stats_interval)
- [run_primary](#run_primary)
- [osd_network](#osd_network)
- [bind_address](#bind_address)
- [bind_port](#bind_port)
- [autosync_interval](#autosync_interval)
- [autosync_writes](#autosync_writes)
- [recovery_queue_depth](#recovery_queue_depth)
@@ -64,6 +63,26 @@
- [recovery_tune_sleep_cutoff_us](#recovery_tune_sleep_cutoff_us)
- [discard_on_start](#discard_on_start)
- [min_discard_size](#min_discard_size)
- [allow_net_split](#allow_net_split)
- [enable_pg_locks](#enable_pg_locks)
- [pg_lock_retry_interval_ms](#pg_lock_retry_interval_ms)
## bind_address
- Тип: строка или массив строк
Вместо использования масок подсети ([osd_network](network.ru.md#osd_network) и
[osd_cluster_network](network.ru.md#osd_cluster_network)), вы также можете явно
задать адрес(а), на которых будут ожидать соединений OSD, с помощью данного
параметра. Это может быть полезно, например, чтобы запускать OSD на неподнятых
интерфейсах (не UP + RUNNING).
## bind_port
- Тип: целое число
По умолчанию OSD сами выбирают случайные порты для входящих подключений.
С помощью данной опции вы можете задать порт для отдельного OSD вручную.
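Например, набросок фрагмента `/etc/vitastor/vitastor.conf` для хоста с одним OSD
(адрес и порт здесь условные; при нескольких OSD на хосте задавайте bind_port
отдельно для каждого OSD, например, через `vitastor-disk update-sb`, чтобы
порты не конфликтовали):
```
{
  "bind_address": ["10.200.1.10"],
  "bind_port": 10200
}
```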
## osd_iothread_count
@@ -109,34 +128,6 @@ max_etcd_attempts * etcd_quick_timeout.
первичные OSD от вторичных, но пока не понятно, зачем это может кому-то
понадобиться, поэтому это не реализовано.
## osd_network
- Тип: строка или массив строк
Маска подсети (IPv4 или IPv6) для использования для соединений с OSD.
Имейте в виду, что хотя сейчас и можно передать в этот параметр несколько
подсетей, это не означает, что OSD будут создавать несколько слушающих
сокетов - они лишь будут выбирать адрес первого поднятого (состояние UP +
RUNNING), подходящий под заданную маску. Также не реализовано разделение
кластерной и публичной сетей OSD. Правда, от него обычно всё равно довольно
мало толку, так что особенной проблемы в этом нет.
## bind_address
- Тип: строка
- Значение по умолчанию: 0.0.0.0
Этим параметром можно явным образом задать адрес, на котором будет ожидать
соединений OSD (вместо использования маски подсети). Может быть полезно,
например, чтобы запускать OSD на неподнятых интерфейсах (не UP + RUNNING).
## bind_port
- Тип: целое число
По умолчанию OSD сами выбирают случайные порты для входящих подключений.
С помощью данной опции вы можете задать порт для отдельного OSD вручную.
## autosync_interval
- Тип: секунды
@@ -675,3 +666,36 @@ EC (кодов коррекции ошибок) с более, чем 1 диск
- Значение по умолчанию: 1048576
Минимальный размер последовательного блока данных, чтобы освобождать его через TRIM.
## allow_net_split
- Тип: булево (да/нет)
- Значение по умолчанию: false
Разрешить "безопасные" случаи разделений сети - разрешить активировать PG без
соединений к некоторым OSD, помеченным активными в etcd, если общее число активных
OSD в PG составляет как минимум pg_minsize. То есть, разрешать некоторым OSD терять
соединения с некоторыми другими OSD, если это не нарушает гарантий pg_minsize.
Минус такого разрешения в том, что оно повышает вероятность записи данных ровно в
pg_minsize OSD во время переключений, что может потом привести к тому, что PG станут
неполными (incomplete), если упадут ещё какие-то OSD.
Старое поведение в версиях до 2.0.0 было идентично включённому allow_net_split.
## enable_pg_locks
- Тип: булево (да/нет)
В Vitastor 2.2.0 появился новый слой защиты от сплитбрейна в дополнение к etcd -
блокировки PG. Они гарантируют порядок даже в теоретических ненормальных случаях,
когда etcd очень сильно тормозит. Так как функция новая, по умолчанию она включается
только для пулов, в которых она необходима - а именно, в пулах с включёнными
[локальными чтениями](pool.ru.md#local_reads). Ну а с помощью данного параметра
можно включить блокировки PG для всех пулов.
## pg_lock_retry_interval_ms
- Тип: миллисекунды
- Значение по умолчанию: 100
Интервал повтора неудачных попыток блокировки PG.
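Для иллюстрации - условный фрагмент конфигурации, включающий блокировки PG для
всех пулов с сохранением строгого поведения при разделениях сети и интервала
повтора по умолчанию:
```
{
  "allow_net_split": false,
  "enable_pg_locks": true,
  "pg_lock_retry_interval_ms": 100
}
```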

View File

@@ -34,6 +34,7 @@ Parameters:
- [failure_domain](#failure_domain)
- [level_placement](#level_placement)
- [raw_placement](#raw_placement)
- [local_reads](#local_reads)
- [max_osd_combinations](#max_osd_combinations)
- [block_size](#block_size)
- [bitmap_granularity](#bitmap_granularity)
@@ -133,8 +134,8 @@ Pool name.
## scheme
- Type: string
- Required
- One of: "replicated", "xor", "ec" or "jerasure"
- Required
Redundancy scheme used for data in this pool. "jerasure" is an alias for "ec",
both use Reed-Solomon-Vandermonde codes based on ISA-L or jerasure libraries.
@@ -189,6 +190,9 @@ So, pg_minsize regulates the number of failures that a pool can tolerate
without temporary downtime for [osd_out_time](monitor.en.md#osd_out_time),
but at a cost of slightly reduced storage reliability.
See also [allow_net_split](osd.en.md#allow_net_split) and
[PG state descriptions](../usage/admin.en.md#pg-states).
FIXME: pg_minsize behaviour may be changed in the future to only make PGs
read-only instead of deactivating them.
@@ -286,6 +290,30 @@ Examples:
- EC 4+2 in 3 DC: `any, dc=1 host!=1, dc!=1, dc=3 host!=3, dc!=(1,3), dc=5 host!=5`
- 1 replica in fixed DC + 2 in random DCs: `dc?=meow, dc!=1, dc!=(1,2)`
## local_reads
- Type: string
- One of: "primary", "nearest" or "random"
- Default: primary
By default, Vitastor serves all read and write requests from the primary OSD of each PG.
But it can also serve read requests for replicated pools from secondary OSDs in clean PGs
(active or active+left_on_dead) which may be useful if you have OSDs with different network
latency to the client - for example, if you have a cross-datacenter setup.
If you set this parameter to "nearest", clients will try to read from the nearest OSD
in the [Placement Tree](#placement-tree), i.e. from an OSD from the same host or datacenter.
Distance to different OSDs will be calculated based on client hostname, determined
automatically or set manually in the [hostname](client.en.md#hostname) parameter.
If you set this parameter to "random", clients will try to distribute read requests over
all available secondary OSDs. This mode is mainly useful for tests, but probably isn't
really needed in production setups.
[PG locks](osd.en.md#enable_pg_locks) are required for local reads to function. However,
PG locks are enabled automatically by default for pools with enabled local reads, so you
don't have to enable them explicitly.
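For example, a sketch of a replicated pool definition with localized reads in the
etcd key `/vitastor/config/pools` (the pool ID, name and sizes here are made up):
```
{
  "1": {
    "name": "replicated-dc",
    "scheme": "replicated",
    "pg_size": 3,
    "pg_minsize": 2,
    "pg_count": 256,
    "failure_domain": "dc",
    "local_reads": "nearest"
  }
}
```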
## max_osd_combinations
- Type: integer
@@ -321,7 +349,8 @@ Read more about this parameter in [Cluster-Wide Disk Layout Parameters](layout-c
## immediate_commit
- Type: string, one of "all", "small" and "none"
- Type: string
- One of: "all", "small" or "none"
- Default: none
Immediate commit setting for this pool. The value from /vitastor/config/global

View File

@@ -33,6 +33,7 @@
- [failure_domain](#failure_domain)
- [level_placement](#level_placement)
- [raw_placement](#raw_placement)
- [local_reads](#local_reads)
- [max_osd_combinations](#max_osd_combinations)
- [block_size](#block_size)
- [bitmap_granularity](#bitmap_granularity)
@@ -133,8 +134,8 @@ OSD игнорируется и OSD не удаляется из распред
## scheme
- Тип: строка
- Обязательный
- Возможные значения: "replicated", "xor", "ec" или "jerasure"
- Обязательный
Схема избыточности, используемая в данном пуле. "jerasure" - синоним для "ec",
в обеих схемах используются коды Рида-Соломона-Вандермонда, реализованные на
@@ -287,6 +288,30 @@ meow недоступен".
- EC 4+2 в 3 датацентрах: `any, dc=1 host!=1, dc!=1, dc=3 host!=3, dc!=(1,3), dc=5 host!=5`
- 1 копия в фиксированном ДЦ + 2 в других ДЦ: `dc?=meow, dc!=1, dc!=(1,2)`
## local_reads
- Тип: строка
- Возможные значения: "primary", "nearest" или "random"
- По умолчанию: primary
По умолчанию Vitastor обслуживает все запросы чтения и записи с первичного OSD каждой PG.
Однако, в чистых PG (active или active+left_on_dead) реплицированных пулов также есть
возможность обслуживать запросы чтения с вторичных OSD, что может быть полезно, если
у вас сильно отличается время сетевого обращения от клиента к разным OSD - например,
если у вас несколько дата-центров.
Если данный параметр установлен в значение "nearest", клиенты будут стараться читать с
ближайших по [Дереву размещения](#дерево-размещения) OSD, то есть, с OSD с того же хоста
или датацентра. Расстояние до разных OSD будет рассчитываться с помощью имени хоста клиента,
определяемого автоматически или заданного вручную параметром [hostname](client.ru.md#hostname).
Если данный параметр установлен в значение "random", клиенты будут стараться распределять
запросы чтения по всем доступным вторичным OSD. Этот режим в основном полезен для тестов,
но, скорее всего, редко нужен в реальных инсталляциях.
Для работы локальных чтений требуются [блокировки PG](osd.ru.md#enable_pg_locks). Включать
их явно не нужно - они включаются автоматически для пулов с включёнными локальными чтениями.
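Например, условный набросок описания реплицированного пула с включёнными
локальными чтениями в ключе etcd `/vitastor/config/pools` (ID, имя и размеры
пула здесь выбраны для примера):
```
{
  "1": {
    "name": "replicated-dc",
    "scheme": "replicated",
    "pg_size": 3,
    "pg_minsize": 2,
    "pg_count": 256,
    "failure_domain": "dc",
    "local_reads": "nearest"
  }
}
```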
## max_osd_combinations
- Тип: целое число
@@ -324,7 +349,8 @@ meow недоступен".
## immediate_commit
- Тип: строка "all", "small" или "none"
- Тип: строка
- Возможные значения: "all", "small" или "none"
- По умолчанию: none
Настройка мгновенного коммита для данного пула. Если не задана, используется
@@ -413,6 +439,9 @@ OSD с "all".
(например, `s3:standard`) - конкретное содержимое `<name>` пока никак не проверяется
компонентами Vitastor S3.
Смотрите также [allow_net_split](osd.ru.md#allow_net_split) и
[документацию по состояниям PG](../usage/admin.ru.md#состояния-pg).
Все остальные значения used_for_app, кроме начинающихся на `fs:` или `s3:`, не
означают ничего особенного для основных компонентов Vitastor. Поэтому сейчас вы
можете использовать их свободно любым желаемым способом.

View File

@@ -271,3 +271,48 @@
заполненные на 100% OSD вообще не могут стартовать), так что вы сможете
восстановить работу кластера после ошибок отсутствия свободного места
без уничтожения и пересоздания OSD.
- name: hostname
type: string
online: true
info: |
Clients use the host name to determine their distance to OSDs when [localized reads](pool.en.md#local_reads)
are enabled. By default, the standard [gethostname](https://man7.org/linux/man-pages/man2/gethostname.2.html)
function is used to determine the host name, but you can also override it with this parameter.
info_ru: |
Клиенты используют имя хоста для определения расстояния до OSD, когда включены
[локальные чтения](pool.ru.md#local_reads). По умолчанию для определения имени
хоста используется стандартная функция [gethostname](https://man7.org/linux/man-pages/man2/gethostname.2.html),
но вы также можете задать имя хоста вручную данным параметром.
- name: ublk_queue_depth
type: int
default: 256
online: false
info: Default queue depth for [Vitastor ublk servers](../usage/ublk.en.md).
info_ru: Глубина очереди по умолчанию для [ublk-серверов Vitastor](../usage/ublk.ru.md).
- name: ublk_max_io_size
type: int
online: false
info: |
Default maximum I/O size for Vitastor [ublk servers](../usage/ublk.en.md).
If not specified, the larger of 1 MB and the pool block size multiplied by the number of EC data chunks is used.
info_ru: |
Максимальный размер запроса ввода-вывода по умолчанию для [ublk-серверов Vitastor](../usage/ublk.ru.md).
Если не задан, используется максимум из 1 МБ и размера блока пула, умноженного на число частей
данных EC-пула.
- name: qemu_file_mirror_path
type: string
info: |
When set to an FS directory path (for example, `/mnt/vitastor/`), `qemu-img info` and similar
QAPI commands return the name of the image inside this directory as `filename`, instead
of the normal `vitastor://?image=abc` URI.
This path can then be mounted using [vitastor-nfs](../usage/nfs.en.md) to trick
third-party systems like Veeam which rely on `filename` in the image info but don't support Vitastor.
info_ru: |
Если установить эту опцию равной пути к каталогу в ФС, команда `qemu-img info` и подобные
команды QAPI будут возвращать в поле `filename` имя образа внутри заданного каталога вместо
обычного адреса типа `vitastor://?image=abc`.
Это позволяет смонтировать этот путь с помощью [vitastor-nfs](../usage/nfs.ru.md) и обмануть
сторонние системы типа Veeam, которые полагаются на поле `filename` в информации об образе QEMU,
но не поддерживают Vitastor.

View File

@@ -24,6 +24,8 @@
{{../../installation/kubernetes.en.md}}
{{../../installation/s3.en.md}}
{{../../installation/source.en.md}}
{{../../config.en.md|indent=1}}
@@ -54,6 +56,8 @@
{{../../usage/fio.en.md}}
{{../../usage/ublk.en.md}}
{{../../usage/nbd.en.md}}
{{../../usage/qemu.en.md}}

View File

@@ -26,6 +26,8 @@
{{../../installation/source.ru.md}}
{{../../installation/s3.ru.md}}
{{../../config.ru.md|indent=1}}
{{../../config/common.ru.md|indent=2}}
@@ -54,6 +56,8 @@
{{../../usage/fio.ru.md}}
{{../../usage/ublk.ru.md}}
{{../../usage/nbd.ru.md}}
{{../../usage/qemu.ru.md}}

View File

@@ -75,11 +75,11 @@
- name: mon_http_port
type: int
default: 8060
info: HTTP port for monitors to listen on (including metrics exporter)
info_ru: Порт, на котором мониторы принимают HTTP-соединения (в том числе для отдачи метрик)
- name: mon_http_ip
type: string
info: IP address for monitors to listen on (all addresses by default)
info_ru: IP-адрес, на котором мониторы принимают HTTP-соединения (по умолчанию все адреса)
- name: mon_https_cert
type: string

View File

@@ -1,49 +1,78 @@
- name: tcp_header_buffer_size
type: int
default: 65536
- name: osd_network
type: string or array of strings
type_ru: строка или массив строк
info: |
Size of the buffer used to read data using an additional copy. Vitastor
packet headers are 128 bytes, payload is always at least 4 KB, so it is
usually beneficial to try to read multiple packets at once even though
it requires copying the data an additional time. The rest of each packet
is received without an additional copy. You can try to play with this
parameter and see how it affects random iops and linear bandwidth if you
want.
Network mask of public OSD network(s) (IPv4 or IPv6). Each OSD listens on all
addresses of UP + RUNNING interfaces matching one of these networks, on the
same port. The port is auto-selected unless [bind_port](osd.en.md#bind_port) is
specified explicitly. Bind address(es) may also be overridden manually by
specifying [bind_address](osd.en.md#bind_address). If OSD networks are not specified
at all, OSD just listens on the wildcard address (0.0.0.0).
info_ru: |
Размер буфера для чтения данных с дополнительным копированием. Пакеты
Vitastor содержат 128-байтные заголовки, за которыми следуют данные размером
от 4 КБ, и для мелких операций ввода-вывода обычно выгодно за 1 вызов читать
сразу несколько пакетов, даже несмотря на то, что это требует лишний раз
скопировать данные. Часть каждого пакета за пределами значения данного
параметра читается без дополнительного копирования. Вы можете попробовать
поменять этот параметр и посмотреть, как он влияет на производительность
случайного и линейного доступа.
- name: use_sync_send_recv
type: bool
default: false
Маски подсетей (IPv4 или IPv6) публичной сети или сетей OSD. Каждый OSD слушает
один и тот же порт на всех адресах поднятых (UP + RUNNING) сетевых интерфейсов,
соответствующих одной из указанных сетей. Порт выбирается автоматически, если
только [bind_port](osd.ru.md#bind_port) не задан явно. Адреса для подключений можно
также переопределить явно, задав [bind_address](osd.ru.md#bind_address). Если сети OSD
не заданы вообще, OSD слушает все адреса (0.0.0.0).
- name: osd_cluster_network
type: string or array of strings
type_ru: строка или массив строк
info: |
If true, synchronous send/recv syscalls are used instead of io_uring for
socket communication. Useless for OSDs because they require io_uring anyway,
but may be required for clients with old kernel versions.
Network mask of separate network(s) (IPv4 or IPv6) to use for OSD
cluster connections. I.e. OSDs will always attempt to use these networks
to connect to other OSDs, while clients will attempt to use networks from
[osd_network](#osd_network).
info_ru: |
Если установлено в истину, то вместо io_uring для передачи данных по сети
будут использоваться обычные синхронные системные вызовы send/recv. Для OSD
это бессмысленно, так как OSD в любом случае нуждается в io_uring, но, в
принципе, это может применяться для клиентов со старыми версиями ядра.
Маски подсетей (IPv4 или IPv6) отдельной кластерной сети или сетей OSD.
То есть, OSD будут всегда стараться использовать эти сети для соединений
с другими OSD, а клиенты будут стараться использовать сети из [osd_network](#osd_network).
- name: use_rdma
type: bool
default: true
info: |
Try to use RDMA for communication if it's available. Disable if you don't
want Vitastor to use RDMA. TCP-only clients can also talk to an RDMA-enabled
cluster, so disabling RDMA may be needed if clients have RDMA devices,
but they are not connected to the cluster.
Try to use RDMA through libibverbs for communication if it's available.
Disable if you don't want Vitastor to use RDMA. TCP-only clients can also
talk to an RDMA-enabled cluster, so disabling RDMA may be needed if clients
have RDMA devices, but they are not connected to the cluster.
`use_rdma` works with RoCEv1/RoCEv2 networks, but not with iWARP and, possibly,
some Infiniband configurations which require RDMA-CM.
Consider `use_rdmacm` for such networks.
info_ru: |
Пытаться использовать RDMA для связи при наличии доступных устройств.
Отключите, если вы не хотите, чтобы Vitastor использовал RDMA.
TCP-клиенты также могут работать с RDMA-кластером, так что отключать
RDMA может быть нужно только если у клиентов есть RDMA-устройства,
но они не имеют соединения с кластером Vitastor.
Попробовать использовать RDMA через libibverbs для связи при наличии
доступных устройств. Отключите, если вы не хотите, чтобы Vitastor
использовал RDMA. TCP-клиенты также могут работать с RDMA-кластером,
так что отключать RDMA может быть нужно, только если у клиентов есть
RDMA-устройства, но они не имеют соединения с кластером Vitastor.
`use_rdma` работает с RoCEv1/RoCEv2 сетями, но не работает с iWARP и
может не работать с частью конфигураций Infiniband, требующих RDMA-CM.
Рассмотрите включение `use_rdmacm` для таких сетей.
- name: use_rdmacm
type: bool
default: false
info: |
Use an alternative implementation of RDMA through RDMA-CM (Connection
Manager). Works with all RDMA networks: Infiniband, iWARP and
RoCEv1/RoCEv2, and even allows you to disable TCP and run only over RDMA.
OSDs always use random port numbers for RDMA-CM listeners, different
from their TCP ports. `use_rdma` is automatically disabled when
`use_rdmacm` is enabled.
info_ru: |
Использовать альтернативную реализацию RDMA на основе RDMA-CM (Connection
Manager). Работает со всеми типами RDMA-сетей: Infiniband, iWARP и
RoCEv1/RoCEv2, и даже позволяет полностью отключить TCP и работать
только на RDMA. OSD используют случайные номера портов для ожидания
соединений через RDMA-CM, отличающиеся от их TCP-портов. Также при
включении `use_rdmacm` автоматически отключается опция `use_rdma`.
- name: disable_tcp
type: bool
default: true
info: |
Fully disable TCP and only use RDMA-CM for OSD communication.
info_ru: |
Полностью отключить TCP и использовать только RDMA-CM для соединений с OSD.
- name: rdma_device
type: string
info: |
@@ -93,16 +122,19 @@
Control) и ECN (Explicit Congestion Notification).
- name: rdma_port_num
type: int
default: 1
info: |
RDMA device port number to use. Only for devices that have more than 1 port.
See `phys_port_cnt` in `ibv_devinfo -v` output to determine how many ports
your device has.
Not relevant for RDMA-CM (use_rdmacm).
info_ru: |
Номер порта RDMA-устройства, который следует использовать. Имеет смысл
только для устройств, у которых более 1 порта. Чтобы узнать, сколько портов
у вашего адаптера, посмотрите `phys_port_cnt` в выводе команды
`ibv_devinfo -v`.
Опция неприменима к RDMA-CM (use_rdmacm).
- name: rdma_gid_index
type: int
info: |
@@ -116,6 +148,8 @@
GID auto-selection is unsupported with libibverbs < v32.
A correct rdma_gid_index for RoCEv2 is usually 1 (IPv6) or 3 (IPv4).
Not relevant for RDMA-CM (use_rdmacm).
info_ru: |
Номер глобального идентификатора адреса RDMA-устройства, который следует
использовать. Разным gid_index могут соответствовать разные протоколы связи:
@@ -128,15 +162,16 @@
libibverbs < v32.
Правильный rdma_gid_index для RoCEv2, как правило, 1 (IPv6) или 3 (IPv4).
Опция неприменима к RDMA-CM (use_rdmacm).
- name: rdma_mtu
type: int
default: 4096
info: |
RDMA Path MTU to use. Must be 1024, 2048 or 4096. There is usually no
sense to change it from the default 4096.
RDMA Path MTU to use. Must be 1024, 2048 or 4096. Default is to use the
RDMA device's MTU.
info_ru: |
Максимальная единица передачи (Path MTU) для RDMA. Должно быть равно 1024,
2048 или 4096. Обычно нет смысла менять значение по умолчанию, равное 4096.
2048 или 4096. По умолчанию используется значение MTU RDMA-устройства.
- name: rdma_max_sge
type: int
default: 128
@@ -306,3 +341,96 @@
detect disconnections quickly.
info_ru: |
Интервал проверки живости вебсокет-подключений к etcd.
- name: etcd_min_reload_interval
type: ms
default: 1000
online: true
info: |
Minimum interval for full etcd state reload. Introduced to prevent
excessive load on etcd during outages when etcd can't keep up with event
streams and cancels them.
info_ru: |
Минимальный интервал полной перезагрузки состояния из etcd. Добавлено для
предотвращения избыточной нагрузки на etcd во время отказов, когда etcd не
успевает рассылать потоки событий и отменяет их.
- name: tcp_header_buffer_size
type: int
default: 65536
info: |
Size of the buffer used to read data using an additional copy. Vitastor
packet headers are 128 bytes, payload is always at least 4 KB, so it is
usually beneficial to try to read multiple packets at once even though
it requires copying the data an additional time. The rest of each packet
is received without an additional copy. You can try to play with this
parameter and see how it affects random iops and linear bandwidth if you
want.
info_ru: |
Размер буфера для чтения данных с дополнительным копированием. Пакеты
Vitastor содержат 128-байтные заголовки, за которыми следуют данные размером
от 4 КБ, и для мелких операций ввода-вывода обычно выгодно за 1 вызов читать
сразу несколько пакетов, даже несмотря на то, что это требует лишний раз
скопировать данные. Часть каждого пакета за пределами значения данного
параметра читается без дополнительного копирования. Вы можете попробовать
поменять этот параметр и посмотреть, как он влияет на производительность
случайного и линейного доступа.
- name: min_zerocopy_send_size
type: int
default: 32768
info: |
OSDs and clients will attempt to use io_uring-based zero-copy TCP send
for buffers larger than this number of bytes. Zero-copy send with io_uring is
supported since Linux kernel version 6.1. Support is auto-detected and disabled
automatically when not available. It can also be disabled explicitly by setting
this parameter to a negative value.
⚠️ Warning! Zero-copy send performance may vary greatly from CPU to CPU and from
one kernel version to another. Generally, it tends to be beneficial only with larger
messages. With smaller messages (say, 4 KB), it may actually be slower. 32 KB is
enough for almost all CPUs, but even smaller values are optimal for some of them.
For example, 4 KB is OK for EPYC Milan/Genoa and 12 KB is OK for Xeon Ice Lake
(but please verify it yourself).
Verification instructions:
1. Add `iommu=pt` into your Linux kernel command line and reboot.
2. Upgrade your kernel. For example, it's very important to use 6.11+ with recent AMD EPYCs.
3. Run some tests with the [send-zerocopy liburing example](https://github.com/axboe/liburing/blob/master/examples/send-zerocopy.c)
to find the minimal message size for which zero-copy is optimal.
Use `./send-zerocopy tcp -4 -R` at the server side and
`time ./send-zerocopy tcp -4 -b 0 -s BUFFER_SIZE -D SERVER_IP` at the client side with
`-z 0` (no zero-copy) and `-z 1` (zero-copy), and compare MB/s and used CPU time
(user+system).
info_ru: |
OSD и клиенты будут пробовать использовать TCP-отправку без копирования (zero-copy) на
основе io_uring для буферов, больших, чем это число байт. Отправка без копирования
поддерживается в io_uring, начиная с версии ядра Linux 6.1. Наличие поддержки
проверяется автоматически и zero-copy отключается, когда поддержки нет. Также
её можно отключить явно, установив данный параметр в отрицательное значение.
⚠️ Внимание! Производительность данной функции может сильно отличаться на разных
процессорах и на разных версиях ядра Linux. В целом, zero-copy обычно быстрее с
большими сообщениями, а с мелкими (например, 4 КБ) zero-copy может быть даже
медленнее. 32 КБ достаточно почти для всех процессоров, но для каких-то можно
использовать даже меньшие значения. Например, для EPYC Milan/Genoa подходит 4 КБ,
а для Xeon Ice Lake - 12 КБ (но, пожалуйста, перепроверьте это сами).
Инструкция по проверке:
1. Добавьте `iommu=pt` в командную строку загрузки вашего ядра Linux и перезагрузитесь.
2. Обновите ядро. Например, для AMD EPYC очень важно использовать версию 6.11+.
3. Позапускайте тесты с помощью [send-zerocopy из примеров liburing](https://github.com/axboe/liburing/blob/master/examples/send-zerocopy.c),
чтобы найти минимальный размер сообщения, для которого zero-copy отправка оптимальна.
Запускайте `./send-zerocopy tcp -4 -R` на стороне сервера и
`time ./send-zerocopy tcp -4 -b 0 -s РАЗМЕР_БУФЕРА -D АДРЕС_СЕРВЕРА` на стороне клиента
с опцией `-z 0` (обычная отправка) и `-z 1` (отправка без копирования), и сравнивайте
скорость в МБ/с и занятое процессорное время (user+system).
- name: use_sync_send_recv
type: bool
default: false
info: |
If true, synchronous send/recv syscalls are used instead of io_uring for
socket communication. Useless for OSDs because they require io_uring anyway,
but may be required for clients with old kernel versions.
info_ru: |
Если установлено в истину, то вместо io_uring для передачи данных по сети
будут использоваться обычные синхронные системные вызовы send/recv. Для OSD
это бессмысленно, так как OSD в любом случае нуждается в io_uring, но, в
принципе, это может применяться для клиентов со старыми версиями ядра.

View File

@@ -1,3 +1,26 @@
- name: bind_address
type: string or array of strings
type_ru: строка или массив строк
info: |
Instead of the network masks ([osd_network](network.en.md#osd_network) and
[osd_cluster_network](network.en.md#osd_cluster_network)), you can also set
OSD listen addresses explicitly using this parameter. May be useful if you
want to start OSDs on interfaces that are not UP + RUNNING.
info_ru: |
Вместо использования масок подсети ([osd_network](network.ru.md#osd_network) и
[osd_cluster_network](network.ru.md#osd_cluster_network)), вы также можете явно
задать адрес(а), на которых будут ожидать соединений OSD, с помощью данного
параметра. Это может быть полезно, например, чтобы запускать OSD на неподнятых
интерфейсах (не UP + RUNNING).
- name: bind_port
type: int
info: |
By default, OSDs pick random ports to use for incoming connections
automatically. With this option you can set a specific port for a specific
OSD by hand.
info_ru: |
По умолчанию OSD сами выбирают случайные порты для входящих подключений.
С помощью данной опции вы можете задать порт для отдельного OSD вручную.
- name: osd_iothread_count
type: int
default: 0
@@ -56,44 +79,6 @@
реализовать дополнительный режим для монитора, который позволит отделять
первичные OSD от вторичных, но пока не понятно, зачем это может кому-то
понадобиться, поэтому это не реализовано.
- name: osd_network
type: string or array of strings
type_ru: строка или массив строк
info: |
Network mask of the network (IPv4 or IPv6) to use for OSDs. Note that
although it's possible to specify multiple networks here, this does not
mean that OSDs will create multiple listening sockets - they'll only
pick the first matching address of an UP + RUNNING interface. Separate
networks for cluster and client connections are also not implemented, but
they are mostly useless anyway, so it's not a big deal.
info_ru: |
Маска подсети (IPv4 или IPv6) для использования для соединений с OSD.
Имейте в виду, что хотя сейчас и можно передать в этот параметр несколько
подсетей, это не означает, что OSD будут создавать несколько слушающих
сокетов - они лишь будут выбирать адрес первого поднятого (состояние UP +
RUNNING), подходящий под заданную маску. Также не реализовано разделение
кластерной и публичной сетей OSD. Правда, от него обычно всё равно довольно
мало толку, так что особенной проблемы в этом нет.
- name: bind_address
type: string
default: "0.0.0.0"
info: |
Instead of the network mask, you can also set OSD listen address explicitly
using this parameter. May be useful if you want to start OSDs on interfaces
that are not UP + RUNNING.
info_ru: |
Этим параметром можно явным образом задать адрес, на котором будет ожидать
соединений OSD (вместо использования маски подсети). Может быть полезно,
например, чтобы запускать OSD на неподнятых интерфейсах (не UP + RUNNING).
- name: bind_port
type: int
info: |
By default, OSDs pick random ports to use for incoming connections
automatically. With this option you can set a specific port for a specific
OSD by hand.
info_ru: |
По умолчанию OSD сами выбирают случайные порты для входящих подключений.
С помощью данной опции вы можете задать порт для отдельного OSD вручную.
- name: autosync_interval
type: sec
default: 5
@@ -774,3 +759,45 @@
default: 1048576
info: Minimum consecutive block size to TRIM it.
info_ru: Минимальный размер последовательного блока данных, чтобы освобождать его через TRIM.
- name: allow_net_split
type: bool
default: false
info: |
Allow "safe" cases of network splits/partitions - allow to start PGs without
connections to some OSDs currently registered as alive in etcd, if the number
of actually connected PG OSDs is at least pg_minsize. That is, allow some OSDs to lose
connectivity with some other OSDs as long as it doesn't break pg_minsize guarantees.
The downside is that it increases the probability of writing data into just pg_minsize
OSDs during failover which can lead to PGs becoming incomplete after additional outages.
The old behaviour in versions up to 2.0.0 was equal to enabled allow_net_split.
info_ru: |
Разрешить "безопасные" случаи разделений сети - разрешить активировать PG без
соединений к некоторым OSD, помеченным активными в etcd, если общее число активных
OSD в PG составляет как минимум pg_minsize. То есть, разрешать некоторым OSD терять
соединения с некоторыми другими OSD, если это не нарушает гарантий pg_minsize.
Минус такого разрешения в том, что оно повышает вероятность записи данных ровно в
pg_minsize OSD во время переключений, что может потом привести к тому, что PG станут
неполными (incomplete), если упадут ещё какие-то OSD.
Старое поведение в версиях до 2.0.0 было идентично включённому allow_net_split.
- name: enable_pg_locks
type: bool
info: |
Vitastor 2.2.0 introduces a new split-brain prevention layer in addition to etcd:
PG locks. They prevent split-brain even in abnormal theoretical cases
when etcd is extremely laggy. As a new feature, by default, PG locks are only enabled
for pools where they're required - pools with [localized reads](pool.en.md#local_reads).
Use this parameter to enable or disable this function for all pools.
info_ru: |
В Vitastor 2.2.0 появился новый слой защиты от сплитбрейна в дополнение к etcd -
блокировки PG. Они гарантируют порядок даже в теоретических ненормальных случаях,
когда etcd очень сильно тормозит. Так как функция новая, по умолчанию она включается
только для пулов, в которых она необходима - а именно, в пулах с включёнными
[локальными чтениями](pool.ru.md#local_reads). Ну а с помощью данного параметра
можно включить блокировки PG для всех пулов.
- name: pg_lock_retry_interval_ms
type: ms
default: 100
info: Retry interval for failed PG lock attempts.
info_ru: Интервал повтора неудачных попыток блокировки PG.

View File

@@ -26,9 +26,9 @@ at Vitastor Kubernetes operator: https://github.com/Antilles7227/vitastor-operat
The instructions are very simple.
1. Download a Docker image of the desired version: \
`docker pull vitastor:1.10.2`
`docker pull vitalif/vitastor:v2.3.0`
2. Install scripts to the host system: \
`docker run --rm -it -v /etc:/host-etc -v /usr/bin:/host-bin vitastor:1.10.2 install.sh`
`docker run --rm -it -v /etc:/host-etc -v /usr/bin:/host-bin vitalif/vitastor:v2.3.0 install.sh`
3. Reload udev rules: \
`udevadm control --reload-rules`

View File

@@ -25,9 +25,9 @@ Vitastor можно установить в Docker/Podman. При этом etcd,
Инструкция по установке максимально простая.
1. Скачайте Docker-образ желаемой версии: \
`docker pull vitastor:1.10.2`
`docker pull vitalif/vitastor:v2.3.0`
2. Установите скрипты в хост-систему командой: \
`docker run --rm -it -v /etc:/host-etc -v /usr/bin:/host-bin vitastor:1.10.2 install.sh`
`docker run --rm -it -v /etc:/host-etc -v /usr/bin:/host-bin vitalif/vitastor:v2.3.0 install.sh`
3. Перезагрузите правила udev: \
`udevadm control --reload-rules`

View File

@@ -11,12 +11,20 @@
- Trust Vitastor package signing key:
`wget https://vitastor.io/debian/pubkey.gpg -O /etc/apt/trusted.gpg.d/vitastor.gpg`
- Add Vitastor package repository to your /etc/apt/sources.list:
- Debian 12 (Bookworm/Sid): `deb https://vitastor.io/debian bookworm main`
- Debian 13 (Trixie/Sid): `deb https://vitastor.io/debian trixie main`
- Debian 12 (Bookworm): `deb https://vitastor.io/debian bookworm main`
- Debian 11 (Bullseye): `deb https://vitastor.io/debian bullseye main`
- Debian 10 (Buster): `deb https://vitastor.io/debian buster main`
- Ubuntu 22.04 (Jammy): `deb https://vitastor.io/debian jammy main`
- Ubuntu 24.04 (Noble): `deb https://vitastor.io/debian noble main`
- Add `-oldstable` to bookworm/bullseye/buster in this line to install the last
stable version from 0.9.x branch instead of 1.x
- To always prefer vitastor-patched QEMU and Libvirt versions, add the following to `/etc/apt/preferences` (a quick way to check the result is shown after this list):
```
Package: *
Pin: origin "vitastor.io"
Pin-Priority: 501
```
- Install packages: `apt update; apt install vitastor lp-solve etcd linux-image-amd64 qemu-system-x86`
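To check that the pinning took effect, you can inspect package priorities with
`apt-cache policy` - the vitastor.io origin should be listed with priority 501,
for example:
```
apt-cache policy qemu-system-x86
```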
## CentOS
@@ -42,7 +50,6 @@
recommended because io_uring is a relatively new technology and there is
at least one bug which reproduces with io_uring and HP SmartArray
controllers in 5.4
- liburing 0.4 or newer
- lp_solve
- etcd 3.4.15 or newer. Earlier versions won't work because of various bugs,
for example [#12402](https://github.com/etcd-io/etcd/pull/12402).

View File

@@ -11,12 +11,20 @@
- Добавьте ключ репозитория Vitastor:
`wget https://vitastor.io/debian/pubkey.gpg -O /etc/apt/trusted.gpg.d/vitastor.gpg`
- Добавьте репозиторий Vitastor в /etc/apt/sources.list:
- Debian 12 (Bookworm/Sid): `deb https://vitastor.io/debian bookworm main`
- Debian 13 (Trixie/Sid): `deb https://vitastor.io/debian trixie main`
- Debian 12 (Bookworm): `deb https://vitastor.io/debian bookworm main`
- Debian 11 (Bullseye): `deb https://vitastor.io/debian bullseye main`
- Debian 10 (Buster): `deb https://vitastor.io/debian buster main`
- Ubuntu 22.04 (Jammy): `deb https://vitastor.io/debian jammy main`
- Ubuntu 24.04 (Noble): `deb https://vitastor.io/debian noble main`
- Добавьте `-oldstable` к слову bookworm/bullseye/buster в этой строке, чтобы
установить последнюю стабильную версию из ветки 0.9.x вместо 1.x
- Чтобы всегда предпочитались версии пакетов QEMU и Libvirt с патчами Vitastor, добавьте в `/etc/apt/preferences`:
```
Package: *
Pin: origin "vitastor.io"
Pin-Priority: 501
```
- Установите пакеты: `apt update; apt install vitastor lp-solve etcd linux-image-amd64 qemu-system-x86`
## CentOS
@@ -41,7 +49,6 @@
- Ядро Linux 5.4 или новее, для поддержки io_uring. Рекомендуется даже 5.8,
так как io_uring - относительно новый интерфейс и в версиях до 5.8 встречались
некоторые баги, например, зависание с io_uring и контроллером HP SmartArray
- liburing 0.4 или новее
- lp_solve
- etcd 3.4.15 или новее. Более старые версии не будут работать из-за разных багов,
например, [#12402](https://github.com/etcd-io/etcd/pull/12402).

View File

@@ -6,10 +6,10 @@
# Proxmox VE
To enable Vitastor support in Proxmox Virtual Environment (6.4-8.1 are supported):
To enable Vitastor support in Proxmox Virtual Environment (6.4-9.0 are supported):
- Add the corresponding Vitastor Debian repository into sources.list on Proxmox hosts:
bookworm for 8.1, pve8.0 for 8.0, bullseye for 7.4, pve7.3 for 7.3, pve7.2 for 7.2, pve7.1 for 7.1, buster for 6.4
trixie for 9.0+, bookworm for 8.1+, pve8.0 for 8.0, bullseye for 7.4, pve7.3 for 7.3, pve7.2 for 7.2, pve7.1 for 7.1, buster for 6.4
- Install vitastor-client, pve-qemu-kvm, pve-storage-vitastor (* or see note) packages from Vitastor repository
- Define storage in `/etc/pve/storage.cfg` (see below)
- Block network access from VMs to Vitastor network (to OSDs and etcd),

View File

@@ -6,10 +6,10 @@
# Proxmox VE
Чтобы подключить Vitastor к Proxmox Virtual Environment (поддерживаются версии 6.4-8.1):
Чтобы подключить Vitastor к Proxmox Virtual Environment (поддерживаются версии 6.4-9.0):
- Добавьте соответствующий Debian-репозиторий Vitastor в sources.list на хостах Proxmox:
bookworm для 8.1, pve8.0 для 8.0, bullseye для 7.4, pve7.3 для 7.3, pve7.2 для 7.2, pve7.1 для 7.1, buster для 6.4
trixie для 9.0+, bookworm для 8.1+, pve8.0 для 8.0, bullseye для 7.4, pve7.3 для 7.3, pve7.2 для 7.2, pve7.1 для 7.1, buster для 6.4
- Установите пакеты vitastor-client, pve-qemu-kvm, pve-storage-vitastor (* или см. сноску) из репозитория Vitastor
- Определите тип хранилища в `/etc/pve/storage.cfg` (см. ниже)
- Обязательно заблокируйте доступ от виртуальных машин к сети Vitastor (OSD и etcd), т.к. Vitastor (пока) не поддерживает аутентификацию

View File

@@ -15,8 +15,8 @@
- gcc and g++ 8 or newer, clang 10 or newer, or other compiler with C++11 plus
designated initializers support from C++20
- CMake
- liburing, jerasure headers and libraries
- ISA-L, libibverbs headers and libraries (optional)
- jerasure headers and libraries
- ISA-L, libibverbs and librdmacm headers and libraries (optional)
- tcmalloc (google-perftools-dev)
## Basic instructions

View File

@@ -15,8 +15,8 @@
- gcc и g++ >= 8, либо clang >= 10, либо другой компилятор с поддержкой C++11 плюс
назначенных инициализаторов (designated initializers) из C++20
- CMake
- Заголовки и библиотеки liburing, jerasure
- Опционально - заголовки и библиотеки ISA-L, libibverbs
- Заголовки и библиотеки jerasure
- Опционально - заголовки и библиотеки ISA-L, libibverbs, librdmacm
- tcmalloc (google-perftools-dev)
## Базовая инструкция

View File

@@ -125,6 +125,13 @@ All other client-side components are based on the client library:
all current read/write operations to it fail with EPIPE error and are retried by clients.
- After completing all secondary read/write requests, primary OSD sends the response to
the client.
- When [localized reads](../config/pool.en.md#local_reads) are enabled for a PG in a
replicated pool, and the PG is in an active and clean state (active or
active+left_on_dead), the client can send the request to one of the secondary OSDs
instead of the primary. The secondary OSD checks the [PG lock](../config/osd.en.md#enable_pg_locks)
and handles the request locally without communicating with the primary. The PG lock is
required for the secondary OSD to know for sure that the PG is in a clean state and isn't
switching its primary at the moment (a configuration sketch follows below).
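As an illustration, a sketch of a client-side `/etc/vitastor/vitastor.conf` fragment
overriding the [hostname](../config/client.en.md#hostname) used for distance calculation
(the name itself is made up; the pool must also have
[local_reads](../config/pool.en.md#local_reads) set to "nearest" or "random"):
```
{
  "hostname": "hv-dc1-rack2"
}
```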
### Nuances of request handling

View File

@@ -125,6 +125,12 @@
и если любое из этих соединений отключается, PG перезапускается, а все текущие запросы чтения
и записи в неё завершаются с ошибкой EPIPE, после чего повторяются клиентами.
- После завершения всех вторичных операций чтения/записи первичный OSD отправляет ответ клиенту.
- Если в реплицированном пуле включены [локализованные чтения](../config/pool.ru.md#local_reads),
а PG находится в чистом активном состоянии (active или active+left_on_dead), клиент может
послать запрос к одному из вторичных OSD вместо первичного. Вторичный OSD проверяет
[блокировку PG](../config/osd.ru.md#enable_pg_locks) и обрабатывает запрос локально, не
обращаясь к первичному. Блокировка PG здесь нужна, чтобы вторичный OSD мог точно знать,
что PG находится в чистом состоянии и не переключается на другой первичный OSD.
### Особенности обработки запросов

View File

@@ -10,8 +10,17 @@ Copyright (c) Vitaliy Filippov (vitalif [at] yourcmc.ru), 2019+
Join Vitastor Telegram Chat: https://t.me/vitastor
All server-side code (OSD, Monitor and so on) is licensed under the terms of
Vitastor Network Public License 1.1 (VNPL 1.1), a copyleft license based on
License: VNPL 1.1 for server-side code and dual VNPL 1.1 + GPL 2.0+ for client tools.
Server-side code is licensed only under the terms of VNPL.
Client libraries (cluster_client and so on) are dual-licensed under the same
VNPL 1.1 and also GNU GPL 2.0 or later to allow for compatibility with GPLed
software like QEMU and fio.
## VNPL
Vitastor Network Public License 1.1 (VNPL 1.1) is a copyleft license based on
GNU GPLv3.0 with the additional "Network Interaction" clause which requires
opensourcing all programs directly or indirectly interacting with Vitastor
through a computer network and expressly designed to be used in conjunction
@@ -20,18 +29,83 @@ the terms of the same license, but also under the terms of any GPL-Compatible
Free Software License, as listed by the Free Software Foundation.
This is a stricter copyleft license than the Affero GPL.
Please note that VNPL doesn't require you to open the code of proprietary
software running inside a VM if it's not specially designed to be used with
Vitastor.
The idea of VNPL is to extend the copyleft not only to modules linked to Vitastor
code in a single binary file, but also to microservice modules that only interact
with it over the network.
Basically, you can't use the software in a proprietary environment to provide
its functionality to users without opensourcing all intermediary components
standing between the user and Vitastor or purchasing a commercial license
from the author 😀.
Client libraries (cluster_client and so on) are dual-licensed under the same
VNPL 1.1 and also GNU GPL 2.0 or later to allow for compatibility with GPLed
software like QEMU and fio.
At the same time, VNPL doesn't impose any restrictions on software *not specially designed*
to be used with Vitastor, for example, on Windows running inside a VM with a Vitastor disk.
You can find the full text of VNPL-1.1 in the file [VNPL-1.1.txt](../../VNPL-1.1.txt).
GPL 2.0 is also included in this repository as [GPL-2.0.txt](../../GPL-2.0.txt).
## Explanation
Network copyleft is governed by the clause **13. Remote Network Interaction** of VNPL.
A program is considered to be a "Proxy Program" if it meets both conditions:
- It is specially designed to be used with Vitastor. Basically, it means that the program
has some functionality specific to Vitastor and thus "knows" that it works with Vitastor,
not with something random.
- It interacts with Vitastor directly or indirectly through any programming interface,
including API, CLI, network or any wrapper (also considered a Proxy Program itself).
If, in addition to that:
- You give any user an opportunity to interact with Vitastor directly or indirectly through
any computer interface including the network or any number of wrappers (Proxy Programs).
Then VNPL requires you to publish the code of all the above Proxy Programs to all the above users
under the terms of any GPL-compatible license - that is, GPL, LGPL, MIT/BSD or Apache 2,
because "GPL compatibility" is treated as an ability to legally include licensed code in
a GPL application.
So, if you have a "Proxy Program", but it's not open to the user who directly or indirectly
interacts with Vitastor - you are forbidden to use Vitastor under the terms of VNPL and you
need a commercial license which doesn't contain open-source requirements.
## Examples
- Vitastor Kubernetes CSI driver which creates PersistentVolumes by calling `vitastor-cli create`.
- Yes, it interacts with Vitastor through vitastor-cli.
- Yes, it is designed specially for use with Vitastor (it has no sense otherwise).
- So, CSI driver **definitely IS** a Proxy Program and must be published under the terms of
a free software license.
- Windows, installed in a VM with the system disk on Vitastor storage.
- Yes, it interacts with Vitastor indirectly - it reads and writes data through the block
device interface, emulated by QEMU.
- No, it definitely isn't designed specially for use with Vitastor - Windows was created
long before Vitastor and doesn't know anything about it.
- So, Windows **definitely IS NOT** a Proxy Program and VNPL doesn't require to open it.
- Cloud control panel which makes requests to Vitastor Kubernetes CSI driver.
- Yes, it interacts with Vitastor indirectly through the CSI driver, which is a Proxy Program.
- May or may not be designed specially for use with Vitastor. How to determine exactly?
Imagine that Vitastor is replaced with any other storage (for example, a proprietary one).
Do the control panel's functions change in any way? If they do (for example, if snapshots stop working),
then the panel contains specific functionality and thus is designed specially for use with Vitastor.
Otherwise, the panel is universal and isn't designed specially for Vitastor.
- So, whether you are required to open-source the panel also **depends** on whether it
contains specific functionality or not.
## Why?
Because I believe in the spirit of copyleft (Linux wouldn't have become so popular without GPL!)
and, at the same time, I want to have a way to monetize the product.
Existing licenses, including AGPL, are useless for this with an SDS - an SDS is deeply
internal software which is almost always invisible to the user, and thus AGPL doesn't
require anyone to open the code even if they make a proprietary fork.
And, in fact, the current situation, in which GPL is thought to only restrict direct
linking of programs into a single executable file, isn't quite correct. Nowadays, programs
are more often linked with network API calls, not with /usr/bin/ld, and a software product
may consist of dozens of microservices interacting with each other over the network.
That's why VNPL is needed to keep the license sufficiently copyleft.
## License Texts
- VNPL 1.1 in English: [VNPL-1.1.txt](../../VNPL-1.1.txt)
- VNPL 1.1 in Russian: [VNPL-1.1-RU.txt](../../VNPL-1.1-RU.txt)
- GPL 2.0: [GPL-2.0.txt](../../GPL-2.0.txt)

View File

@@ -12,6 +12,14 @@
Лицензия: VNPL 1.1 на серверный код и двойная VNPL 1.1 + GPL 2.0+ на клиентский.
Серверные компоненты распространяются только на условиях VNPL.
Клиентские библиотеки распространяются на условиях двойной лицензии VNPL 1.1
и также на условиях GNU GPL 2.0 или более поздней версии. Так сделано в целях
совместимости с таким ПО, как QEMU и fio.
## VNPL
VNPL - "сетевой копилефт", собственная свободная копилефт-лицензия
Vitastor Network Public License 1.1, основанная на GNU GPL 3.0 с дополнительным
условием "Сетевого взаимодействия", требующим распространять все программы,
@@ -29,9 +37,70 @@ Vitastor Network Public License 1.1, основанная на GNU GPL 3.0 с д
На Windows и любое другое ПО, не разработанное *специально* для использования
вместе с Vitastor, никакие ограничения не накладываются.
Клиентские библиотеки распространяются на условиях двойной лицензии VNPL 1.1
и также на условиях GNU GPL 2.0 или более поздней версии. Так сделано в целях
совместимости с таким ПО, как QEMU и fio.
## Пояснение
Вы можете найти полный текст VNPL 1.1 на английском языке в файле [VNPL-1.1.txt](../../VNPL-1.1.txt),
VNPL 1.1 на русском языке в файле [VNPL-1.1-RU.txt](../../VNPL-1.1-RU.txt), а GPL 2.0 в файле [GPL-2.0.txt](../../GPL-2.0.txt).
Сетевой копилефт регулируется пунктом лицензии **13. Удалённое сетевое взаимодействие**.
Программа считается "прокси-программой", если верны оба условия:
- Она создана специально для работы вместе с Vitastor. По сути это означает, что программа
должна иметь специфичный для Vitastor функционал, то есть, "знать", что она взаимодействует
именно с Vitastor.
- Она прямо или косвенно взаимодействует с Vitastor через абсолютно любой программный
интерфейс, включая любые способы вызова: API, CLI, сеть или через какую-то обёртку (в
свою очередь тоже являющуюся прокси-программой).
Если в дополнение к этому также:
- Вы предоставляете любому пользователю возможность взаимодействовать с Vitastor по сети,
опять-таки, через любой интерфейс или любую серию "обёрток" (прокси-программ)
То, согласно VNPL, вы должны открыть код "прокси-программ" **таким пользователям** на условиях
любой GPL-совместимой лицензии - то есть, GPL, LGPL, MIT/BSD или Apache 2 - "совместимость с GPL"
понимается как возможность включать лицензируемый код в GPL-приложение.
Соответственно, если у вас есть "прокси-программа", но её код не открыт пользователю,
который прямо или косвенно взаимодействует с Vitastor - вам запрещено использовать Vitastor
на условиях VNPL и вам нужна коммерческая лицензия, не содержащая требований об открытии кода.
## Примеры
- Kubernetes CSI-драйвер Vitastor, создающий PersistentVolume с помощью вызова `vitastor-cli create`.
- Да, взаимодействует с Vitastor через vitastor-cli.
- Да, создавался специально для работы с Vitastor (иначе в чём же ещё его смысл).
- Значит, CSI-драйвер **точно считается** "прокси-программой" и должен быть открыт под свободной
лицензией.
- Windows, установленный в виртуальную машину на диске Vitastor.
- Да, взаимодействует с Vitastor "прямо или косвенно" - пишет и читает данные через интерфейс
блочного устройства, эмулируемый QEMU.
- Нет, точно не создан *специально для работы с Vitastor* - когда его создавали, никакого
Vitastor ещё и в помине не было.
- Значит, Windows **точно не считается** "прокси-программой" и на него требования VNPL не распространяются.
- Панель управления облака, делающая запросы к Kubernetes CSI-драйверу Vitastor.
- Да, взаимодействует с Vitastor косвенно через CSI-драйвер, являющийся "прокси-программой".
- Сходу не известно, создавалась ли конкретно для работы с Vitastor. Как понять, да или нет?
Представьте, что Vitastor заменён на любую другую систему хранения (например, на проприетарную).
Работа панели управления изменится? Если да (например, перестанут работать снапшоты) - значит,
панель содержит специфичный функционал и "создана специально для работы с Vitastor".
Если нет - значит, специфичного функционала панель не содержит и в принципе она универсальна.
- Нужно ли открывать панель - **зависит** от того, содержит она специфичный функционал или нет.
## Why so?
Because I believe in the spirit of copyleft licenses (Linux wouldn't have become so popular
without the GPL!) and, at the same time, I want to be able to monetize the product.
At the same time, using even the AGPL for a software-defined storage system is pointless:
it's deeply internal software which the user will almost certainly never see at all, so
nobody would ever have to open their code, even when building a derived product.
And in general, the current situation, in which the effect of the GPL is limited to direct
linking into a single executable, isn't quite correct. These days programs are integrated
through network calls much more often than with /usr/bin/ld, and a combined software
product may consist of several dozen microservices talking over the network.
That's why the VNPL was devised: to preserve a sufficient degree of "copyleftness".
## License texts
- VNPL 1.1 in English: [VNPL-1.1.txt](../../VNPL-1.1.txt)
- VNPL 1.1 in Russian: [VNPL-1.1-RU.txt](../../VNPL-1.1-RU.txt)
- GPL 2.0: [GPL-2.0.txt](../../GPL-2.0.txt)

View File

@@ -25,10 +25,11 @@
- Recovery of degraded blocks
- Rebalancing (data movement between OSDs)
- [Lazy fsync support](../config/layout-cluster.en.md#immediate_commit)
- [Localized read support](../config/pool.en.md#local_reads) for cross-datacenter setup optimization
- Per-OSD and per-image I/O and space usage statistics in etcd
- Snapshots and copy-on-write image clones
- [Write throttling to smooth random write workloads in SSD+HDD configurations](../config/osd.en.md#throttle_small_writes)
- [RDMA/RoCEv2 support via libibverbs](../config/network.en.md#rdma_device)
- RDMA/RoCEv2 support [via libibverbs](../config/network.en.md#use_rdma) or [RDMA-CM](../config/network.en.md#use_rdmacm)
- [Scrubbing](../config/osd.en.md#auto_scrub) (verification of copies)
- [Checksums](../config/layout-osd.en.md#data_csum_type)
- [Client write-back cache](../config/client.en.md#client_enable_writeback)
@@ -51,7 +52,7 @@
- Generic user-space client library
- [Native QEMU driver](../usage/qemu.en.md)
- [Loadable fio engine for benchmarks](../usage/fio.en.md)
- [NBD proxy for kernel mounts](../usage/nbd.en.md)
- [UBLK](../usage/ublk.en.md) and [NBD](../usage/nbd.en.md) servers for kernel mounts
- [Simplified NFS proxy for file-based image access emulation (suitable for VMWare)](../usage/nfs.en.md#pseudo-fs)
## Roadmap

View File

@@ -25,12 +25,13 @@
- Recovery of degraded blocks
- Rebalancing, i.e. data movement between OSDs (disks)
- [Lazy fsync support (no fsync per operation)](../config/layout-cluster.ru.md#immediate_commit)
- [Localized reads](../config/pool.ru.md#local_reads) for multi-datacenter setup optimization
- I/O statistics collection in etcd
- Per-inode I/O and space usage statistics
- Inode naming via metadata stored in etcd
- Snapshots and copy-on-write clones
- [Write throttling to smooth random write workloads in SSD+HDD configurations](../config/osd.ru.md#throttle_small_writes)
- [RDMA/RoCEv2 support via libibverbs](../config/network.ru.md#rdma_device)
- RDMA/RoCEv2 support [via libibverbs](../config/network.ru.md#use_rdma) or [RDMA-CM](../config/network.ru.md#use_rdmacm)
- [Scrubbing](../config/osd.ru.md#auto_scrub) (verification of copies)
- [Checksums](../config/layout-osd.ru.md#data_csum_type)
- [Client write-back cache](../config/client.ru.md#client_enable_writeback)
@@ -53,7 +54,7 @@
- Generic user-space client library
- [QEMU disk driver](../usage/qemu.ru.md)
- [Disk driver for the fio benchmarking tool](../usage/fio.ru.md)
- [NBD proxy for kernel mounts](../usage/nbd.ru.md) ("user-mode block device")
- [UBLK](../usage/ublk.ru.md) and [NBD](../usage/nbd.ru.md) servers for kernel mounts ("user-mode block device")
- [Simplified NFS proxy for file-based image access emulation (suitable for VMWare)](../usage/nfs.ru.md#псевдо-фс)
## Roadmap

View File

@@ -50,7 +50,7 @@ On the monitor hosts:
## Configure OSDs
- Put etcd_address and osd_network into `/etc/vitastor/vitastor.conf`. Example:
- Put etcd_address and [osd_network](../config/network.en.md#osd_network) into `/etc/vitastor/vitastor.conf`. Example:
```
{
"etcd_address": ["10.200.1.10:2379","10.200.1.11:2379","10.200.1.12:2379"],

View File

@@ -50,7 +50,7 @@
## Configure OSDs
- Put etcd_address and osd_network into `/etc/vitastor/vitastor.conf`. For example:
- Put etcd_address and [osd_network](../config/network.ru.md#osd_network) into `/etc/vitastor/vitastor.conf`. For example:
```
{
"etcd_address": ["10.200.1.10:2379","10.200.1.11:2379","10.200.1.12:2379"],

View File

@@ -14,6 +14,7 @@
- [Removing a failed disk](#removing-a-failed-disk)
- [Adding a disk](#adding-a-disk)
- [Restoring from lost pool configuration](#restoring-from-lost-pool-configuration)
- [Incompatibility problems](#incompatibility-problems)
- [Upgrading Vitastor](#upgrading-vitastor)
- [OSD memory usage](#osd-memory-usage)
@@ -35,10 +36,19 @@ PG state consists of exactly 1 base state and an arbitrary number of additional
PG state always includes exactly 1 of the following base states:
- **active** — PG is active and handles user I/O.
- **incomplete** — Not enough OSDs are available to activate this PG. That is, more disks
are lost than it's allowed by the pool's redundancy scheme. For example, if the pool has
pg_size=3 and pg_minsize=1, part of the data may be written only to 1 OSD. If that exact
OSD is lost, PG will become **incomplete**.
- **incomplete** — Not enough OSDs are available to activate this PG. More exactly, that
means one of the following:
- Less than pg_minsize current target OSDs are available for the PG. I.e. more disks
are lost than allowed by the pool's redundancy scheme.
- All OSDs of some of PG's history records are unavailable, or, for EC pools, less
than (pg_size-parity_chunks) OSDs are available in one of the history records.
In other words it means that some data in this PG was written to an OSD set such that
it's currently impossible to read it back because these OSDs are down. For example,
if the pool has pg_size=3 and pg_minsize=1, part of the data may be written only to
1 OSD. If that exact OSD is lost, PG becomes **incomplete**.
- [allow_net_split](../config/osd.en.md#allow_net_split) is disabled (the default) and
the primary OSD of the PG can't connect to some secondary OSDs marked as alive in etcd,
i.e. a network partition happened: OSDs can talk to etcd, but not to some other OSDs.
- **offline** — PG isn't activated by any OSD at all. Either primary OSD isn't set for
this PG at all (if the pool is just created), or an unavailable OSD is set as primary,
or the primary OSD refuses to start this PG (for example, because of wrong block_size),
@@ -157,6 +167,17 @@ done
After that all PGs should peer and find all previous data.
## Incompatibility problems
### ISA-L 2.31
⚠ It is FORBIDDEN to use Vitastor 2.1.0 and earlier versions with ISA-L 2.31 and newer if
you use EC N+K pools and K > 1 on a CPU with GF-NI instruction support, because it WILL
lead to **data loss** during EC recovery.
If you accidentally upgraded ISA-L to 2.31 but didn't upgrade Vitastor and restarted OSDs,
then stop them as soon as possible and either update Vitastor or roll back ISA-L.
## Upgrading Vitastor
Every upcoming Vitastor version is usually compatible with previous both forward

View File

@@ -14,6 +14,7 @@
- [Removing a failed disk](#удаление-неисправного-диска)
- [Adding a disk](#добавление-диска)
- [Restoring from lost pool configuration](#восстановление-потерянной-конфигурации-пулов)
- [Incompatibility problems](#проблемы-несовместимости)
- [Upgrading Vitastor](#обновление-vitastor)
- [OSD memory usage](#потребление-памяти-osd)
@@ -35,10 +36,20 @@
PG state always includes exactly 1 of the following base states:
- **active** — PG is active and handles user I/O.
- **incomplete** — Not enough live OSDs to activate this PG.
That is, more disks are lost than allowed by the pool's redundancy scheme and pg_minsize.
For example, if the pool has pg_size=3 and pg_minsize=1, part of the data may be written to only 1 OSD.
If that exact OSD then goes down, the PG becomes **incomplete**.
- **incomplete** — Not enough live OSDs to activate this PG. More exactly, it means
one of the following:
  - Less than pg_minsize of the PG's current target OSDs are available. In other words,
    more disks are lost than the pool's redundancy scheme allows.
  - All OSDs of one of the PG's history records are unavailable, or, for EC pools, less
    than (pg_size-parity_chunks) OSDs are available in one of the PG's history records.
    In other words, some data of this PG was written to a set of OSDs from which it
    currently can't be read back, because those OSDs are down. For example, if the pool
    has pg_size=3 and pg_minsize=1, part of the data may be written to only 1 OSD.
    If that exact OSD then goes down, the PG becomes **incomplete**.
  - [allow_net_split](../config/osd.ru.md#allow_net_split) is disabled (the default) and
    the primary OSD of this PG can't connect to some of its secondary OSDs which are
    marked as alive in etcd. This means a network partition happened: OSDs can talk
    to etcd, but not to some other OSDs.
- **offline** — PG isn't activated by any OSD at all. Either no primary OSD is assigned
at all (if the pool has just been created), or an unavailable OSD is assigned as primary,
or the assigned OSD refuses to start this PG (for example, because of a block_size mismatch),
@@ -153,6 +164,17 @@ done
After that, all PGs should peer and find all their previous data.
## Incompatibility problems
### ISA-L 2.31
⚠ It is FORBIDDEN to use Vitastor 2.1.0 and earlier versions with ISA-L version 2.31
or newer if you use EC N+K pools with K > 1 on a CPU with GF-NI instruction support,
because it WILL lead to **data loss** during EC recovery.
If you accidentally upgraded ISA-L to 2.31 but didn't upgrade Vitastor and have already
restarted the OSDs, stop them all as soon as possible and either upgrade Vitastor or roll back ISA-L.
## Upgrading Vitastor
Usually every next Vitastor version is compatible with previous versions both forward and backward

View File

@@ -378,11 +378,11 @@ Examples:
Create a pool. Required parameters:
| <!-- --> | <!-- --> |
|--------------------------|---------------------------------------------------------------------------------------|
| `-s R` or `--pg_size R` | Number of replicas for replicated pools |
| `--ec N+K` | Number of data (N) and parity (K) chunks for erasure-coded pools |
| `-n N` or `--pg_count N` | PG count for the new pool (start with 10*<OSD count>/pg_size rounded to a power of 2) |
| <!-- --> | <!-- --> |
|--------------------------|-----------------------------------------------------------------------------------------|
| `-s R` or `--pg_size R` | Number of replicas for replicated pools |
| `--ec N+K` | Number of data (N) and parity (K) chunks for erasure-coded pools |
| `-n N` or `--pg_count N` | PG count for the new pool (start with 10*\<OSD count\>/pg_size rounded to a power of 2) |
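As a quick illustration of the pg_count rule of thumb in the last row, here is a minimal sketch (a hypothetical helper, not part of vitastor-cli):

```
// Sketch only: "10 * OSD count / pg_size, rounded to a power of 2".
function suggestPgCount(osdCount, pgSize)
{
    const raw = 10 * osdCount / pgSize;
    return Math.max(1, 2 ** Math.round(Math.log2(raw)));
}
console.log(suggestPgCount(24, 3)); // 10*24/3 = 80 -> 64
```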
Optional parameters:
@@ -397,6 +397,7 @@ Optional parameters:
| `--immediate_commit none` | Put pool only on OSDs with this or larger immediate_commit (none < small < all) |
| `--level_placement <rules>` | Use additional failure domain rules (example: "dc=112233") |
| `--raw_placement <rules>` | Specify raw PG generation rules ([details](../config/pool.en.md#raw_placement)) |
| `--local_reads primary` | Local read policy for replicated pools: primary, nearest or random |
| `--primary_affinity_tags tags` | Prefer to put primary copies on OSDs with all specified tags |
| `--scrub_interval <time>` | Enable regular scrubbing for this pool. Format: number + unit s/m/h/d/M/y |
| `--used_for_app fs:<name>` | Mark pool as used for VitastorFS with metadata in image `<name>` |

View File

@@ -22,6 +22,8 @@ vitastor-cli - command-line interface for administering Vitastor
- [flatten](#flatten)
- [rm-data](#rm-data)
- [merge-data](#merge-data)
- [describe](#describe)
- [fix](#fix)
- [alloc-osd](#alloc-osd)
- [rm-osd](#rm-osd)
- [osd-tree](#osd-tree)
@@ -393,11 +395,11 @@ OSD PARENT UP SIZE USED% TAGS WEIGHT BLOCK BITMAP
Create a pool. Required parameters:
| <!-- --> | <!-- --> |
|----------------------------|---------------------------------------------------------------------------------------|
| `-s R` or `--pg_size R`    | Number of data copies for replicated pools                                             |
| `--ec N+K`                 | Number of data (N) and parity (K) chunks for erasure-coded pools                       |
| `-n N` or `--pg_count N`   | PG count for the new pool (start with 10*<OSD count>/pg_size rounded to a power of 2)  |
| <!-- --> | <!-- --> |
|----------------------------|------------------------------------------------------------------------------------------|
| `-s R` or `--pg_size R`    | Number of data copies for replicated pools                                                |
| `--ec N+K`                 | Number of data (N) and parity (K) chunks for erasure-coded pools                          |
| `-n N` or `--pg_count N`   | PG count for the new pool (start with 10*\<OSD count\>/pg_size rounded to a power of 2)   |
Optional parameters:
@@ -412,6 +414,7 @@ OSD PARENT UP SIZE USED% TAGS WEIGHT BLOCK BITMAP
| `--immediate_commit none`      | ...only OSDs with this or larger immediate_commit (none < small < all)              |
| `--level_placement <rules>`    | Set additional failure domain rules (example: "dc=112233")                          |
| `--raw_placement <rules>`      | Set low-level PG generation rules ([details](../config/pool.ru.md#raw_placement))   |
| `--local_reads primary`        | Local read policy for replicated pools: primary, nearest or random                  |
| `--primary_affinity_tags tags` | Prefer OSDs with all the given tags for the primary role                            |
| `--scrub_interval <time>`      | Enable regular scrubbing with the given interval (number + unit s/m/h/d/M/y)        |
| `--pg_stripe_size <number>`    | Increase the stripe size used to group objects into PGs                             |

View File

@@ -14,6 +14,9 @@ Commands:
- [upgrade](#upgrade)
- [defrag](#defrag)
⚠️ Important: follow the instructions from [Linux NFS write size](#linux-nfs-write-size)
for optimal Vitastor NFS performance if you use EC and HDD and mount your NFS from Linux.
## Pseudo-FS
Simplified pseudo-FS proxy is used for file-based image access emulation. It's not
@@ -86,6 +89,8 @@ POSIX features currently not implemented in VitastorFS:
instead of actually allocated space
- Access times (`atime`) are not tracked (like `-o noatime`)
- Modification time (`mtime`) is updated lazily every second (like `-o lazytime`)
- Permission enforcement is disabled by default (and the Linux NFS client doesn't
enforce permissions either). Use `--enforce 1` to enable it.
Other notable missing features which should be addressed in the future:
- Inode ID reuse. Currently inode IDs always grow, the limit is 2^48 inodes, so
@@ -100,6 +105,62 @@ Other notable missing features which should be addressed in the future:
in the DB. The FS is implemented in such a way that this garbage doesn't affect its
function, but having a tool to clean it up still seems the right thing to do.
## Linux NFS write size
Linux NFS client (nfs/nfsv3/nfsv4 kernel modules) has a hard-coded maximum I/O size,
currently set to 1 MB - see `rsize` and `wsize` in [man 5 nfs](https://linux.die.net/man/5/nfs).
This means that when you write to a file in an FS mounted over NFS, the maximum write
request size is 1 MB, even in the O_DIRECT mode and even if the original write request
is larger.
However, for optimal linear write performance in Vitastor EC (erasure-coded) pools,
the size of write requests should be a multiple of [block_size](../config/layout-cluster.en.md#block_size),
multiplied by the data chunk count of the pool ([pg_size](../config/pool.en.md#pg_size)-[parity_chunks](../config/pool.en.md#parity_chunks)).
When write requests are smaller or not a multiple of this number, Vitastor has to first
read paired data blocks from disks, calculate new parity blocks and only then write them
back. Obviously this is 2-3 times slower than a simple disk write.
Vitastor HDD setups use 1 MB block_size by default. So, for optimal performance, if
you use EC 2+1 and HDD, you need your NFS client to send 2 MB write requests; with
EC 4+1, 4 MB requests, and so on.
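The same arithmetic as a minimal sketch (a hypothetical helper, only illustrating the rule above):

```
// Sketch: optimal NFS write size for an EC pool = block_size * data chunk count,
// i.e. block_size * (pg_size - parity_chunks).
function optimalWsize(blockSize, pgSize, parityChunks)
{
    return blockSize * (pgSize - parityChunks);
}
console.log(optimalWsize(1048576, 3, 1)); // EC 2+1, 1 MB blocks -> 2097152 (2 MB)
console.log(optimalWsize(1048576, 5, 1)); // EC 4+1, 1 MB blocks -> 4194304 (4 MB)
```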
But Linux NFS client only writes in 1 MB chunks. 😢
The good news is that you can fix it by rebuilding Linux NFS kernel modules 😉 🤩!
You need to change NFS_MAX_FILE_IO_SIZE in nfs_xdr.h and then rebuild and reload modules.
Instructions, using Debian as an example (run as root):
```
# download current Linux kernel headers required to build modules
apt-get install linux-headers-`uname -r`
# replace NFS_MAX_FILE_IO_SIZE with a desired number (here it's 4194304 - 4 MB)
sed -i 's/NFS_MAX_FILE_IO_SIZE\s*.*/NFS_MAX_FILE_IO_SIZE\t(4194304U)/' /lib/modules/`uname -r`/source/include/linux/nfs_xdr.h
# download current Linux kernel source
mkdir linux_src
cd linux_src
apt-get source linux-image-`uname -r`-unsigned
# build NFS modules
cd linux-*/fs/nfs
make -C /lib/modules/`uname -r`/build M=$PWD -j8 modules
make -C /lib/modules/`uname -r`/build M=$PWD modules_install
# move default NFS modules away
mv /lib/modules/`uname -r`/kernel/fs/nfs ~/nfs_orig_`uname -r`
depmod -a
# unload old modules and load the new ones
rmmod nfsv3 nfs
modprobe nfsv3
```
After these (not too complicated 🙂) manipulations, NFS starts to be mounted with the new
wsize and rsize by default, which fixes Vitastor-NFS linear write performance.
## Horizontal scaling
Linux NFS 3.0 client doesn't support built-in scaling or failover, i.e. you can't
@@ -199,4 +260,5 @@ Options:
| `--nfspath <PATH>` | set NFS export path to \<PATH> (default is /) |
| `--pidfile <FILE>` | write process ID to the specified file |
| `--logfile <FILE>` | log to the specified file |
| `--enforce 1` | enforce permissions at the server side (no by default) |
| `--foreground 1` | stay in foreground, do not daemonize |

View File

@@ -14,6 +14,9 @@
- [upgrade](#upgrade)
- [defrag](#defrag)
⚠️ Important: for optimal Vitastor NFS performance in Linux when using HDD and EC
(erasure codes), follow the instructions from the [Linux NFS write size](#размер-записи-linux-nfs) section.
## Pseudo-FS
A simplified pseudo-FS implementation is used to emulate file-based access to block
@@ -88,6 +91,8 @@ JSON format :-). To inspect the DB contents
stat(2), so `du` always shows the sum of file sizes rather than the actually occupied space
- Access times (`atime`) are not tracked (as if the FS was mounted with `-o noatime`)
- Modification times (`mtime`) are tracked asynchronously (as if the FS was mounted with `-o lazytime`)
- Permissions are not checked by the server by default (the Linux NFS client doesn't check
them either). To enable the checks, use the `--enforce 1` option.
Other missing features which should be added in the future:
- Inode number reuse. In the current implementation, inode numbers always
@@ -104,6 +109,66 @@ JSON format :-). To inspect the DB contents
records. The FS is designed so that they don't affect its operation, but for tidiness
it's still worth being able to clean them up.
## Linux NFS write size
The Linux NFS client (the nfs/nfsv3/nfsv4 kernel modules) has a maximum I/O request size
hard-coded to 1 MB - see `rsize` and `wsize` in [man 5 nfs](https://linux.die.net/man/5/nfs).
This means that when you write to a file in an NFS-mounted file system, the maximum
write request size is 1 MB, even in O_DIRECT mode and even if the original write request
was larger.
However, for optimal linear write performance in Vitastor with EC (erasure-coded) pools,
write requests should be a multiple of [block_size](../config/layout-cluster.ru.md#block_size)
multiplied by the number of data chunks of the pool
([pg_size](../config/pool.ru.md#pg_size)-[parity_chunks](../config/pool.ru.md#parity_chunks)).
If write requests are smaller than or not a multiple of that, Vitastor first has to read
the old versions of the paired data blocks from disks, calculate new parity blocks and
only then write them to disks. Naturally, this is 2-3 times slower than a simple disk
write.
At the same time, block_size on hard drives is set to 1 MB by default.
So, if you use EC 2+1 and HDD, for optimal write speed you need the NFS client to write
in 2 MB chunks; with EC 4+1 and HDD, in 4 MB chunks, and so on.
But the Linux NFS client only writes in 1 MB chunks. 😢
This can be fixed, though, by rebuilding the Linux NFS kernel modules 😉 🤩! To do that,
change the value of the NFS_MAX_FILE_IO_SIZE variable in the nfs_xdr.h header file and
then rebuild the NFS modules.
Rebuild instructions, using Debian as an example (run as root):
```
# download headers to build modules for the current Linux kernel
apt-get install linux-headers-`uname -r`
# replace NFS_MAX_FILE_IO_SIZE in the headers with the desired value (here 4194304 - 4 MB)
sed -i 's/NFS_MAX_FILE_IO_SIZE\s*.*/NFS_MAX_FILE_IO_SIZE\t(4194304U)/' /lib/modules/`uname -r`/source/include/linux/nfs_xdr.h
# download the source code of the current kernel
mkdir linux_src
cd linux_src
apt-get source linux-image-`uname -r`-unsigned
# build the NFS modules
cd linux-*/fs/nfs
make -C /lib/modules/`uname -r`/build M=$PWD -j8 modules
make -C /lib/modules/`uname -r`/build M=$PWD modules_install
# move the stock NFS modules out of the way
mv /lib/modules/`uname -r`/kernel/fs/nfs ~/nfs_orig_`uname -r`
depmod -a
# unload the old modules and load the new ones
rmmod nfsv3 nfs
modprobe nfsv3
```
After this (relatively simple 🙂) manipulation, NFS starts to be mounted with the new
wsize and rsize by default, which fixes Vitastor-NFS linear write performance.
## Horizontal scaling
The Linux NFS 3.0 client doesn't support built-in scaling or failover.
@@ -207,4 +272,5 @@ VitastorFS from GPUDirect.
| `--nfspath <PATH>` | set the NFS export path to \<PATH> (default is /) |
| `--pidfile <FILE>` | write the process ID to the specified file |
| `--logfile <FILE>` | log to the specified file |
| `--enforce 1` | enforce permissions at the server side (disabled by default) |
| `--foreground 1` | stay in foreground, do not daemonize |

View File

@@ -130,23 +130,16 @@ Linux kernel, starting with version 5.15, supports a new interface for attaching
to the host - VDUSE (vDPA Device in Userspace). QEMU, starting with 7.2, has support for
exporting QEMU block devices over this protocol using qemu-storage-daemon.
VDUSE is currently the best interface to attach Vitastor disks as kernel devices because:
- It avoids data copies and thus achieves much better performance than [NBD](nbd.en.md)
- It doesn't have NBD timeout problem - the device doesn't die if an operation executes for too long
VDUSE advantages:
- VDUSE copies memory 1 time instead of 2, and is thus faster than [NBD](nbd.en.md) for linear read/write.
- It doesn't have NBD timeout problem - the device doesn't die if an operation executes for too long.
- It doesn't have hung device problem - if the userspace process dies it can be restarted (!)
and block device will continue operation
- It doesn't seem to have the device number limit
and block device will continue operation (UBLK can do it too).
- It doesn't seem to have the device number limit (UBLK also doesn't).
Example performance comparison:
| | direct fio | NBD | VDUSE |
|----------------------|-------------|-------------|-------------|
| linear write | 3.85 GB/s | 1.12 GB/s | 3.85 GB/s |
| 4k random write Q128 | 240000 iops | 120000 iops | 178000 iops |
| 4k random write Q1 | 9500 iops | 7620 iops | 7640 iops |
| linear read | 4.3 GB/s | 1.8 GB/s | 2.85 GB/s |
| 4k random read Q128 | 287000 iops | 140000 iops | 189000 iops |
| 4k random read Q1 | 9600 iops | 7640 iops | 7780 iops |
At the same time, VDUSE may be slower or faster than [UBLK](ublk.en.md) for linear read/write,
and iops-wise it's sometimes even slower than NBD. See example performance comparisons on the [UBLK](ublk.en.md) page.
To try VDUSE you need at least Linux 5.15, built with VDUSE support
(CONFIG_VDPA=m, CONFIG_VDPA_USER=m, CONFIG_VIRTIO_VDPA=m).
@@ -162,10 +155,12 @@ apt-get install linux-headers-`uname -r`
apt-get build-dep linux-image-`uname -r`-unsigned
apt-get source linux-image-`uname -r`-unsigned
cd linux*/drivers/vdpa
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m -j8 modules modules_install
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m -j8 modules
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m modules_install
cat Module.symvers >> /lib/modules/`uname -r`/build/Module.symvers
cd ../virtio
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m -j8 modules modules_install
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m -j8 modules
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m modules_install
depmod -a
```
@@ -191,3 +186,12 @@ To remove the device:
vdpa dev del test1
kill <qemu-storage-daemon_process_PID>
```
## Veeam
The Vitastor QEMU driver has a feature that allows you to trick third-party systems like Veeam
which can't parse qemu-img vitastor URIs: [qemu_file_mirror_path](../config/client.en.md#qemu_file_mirror_path).
To make such systems work, set this option to an FS directory path (for example, `/mnt/vitastor/`) and
mount this directory using [`vitastor-nfs mount --block`](../usage/nfs.en.md). Such systems will then access
your images as files and, hopefully, succeed in doing their normal job :).
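For example, a minimal `/etc/vitastor/vitastor.conf` fragment for this setup might look like this (the mount path is illustrative):

```
{
    "qemu_file_mirror_path": "/mnt/vitastor/"
}
```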

View File

@@ -132,24 +132,16 @@ qemu-system-x86_64 -enable-kvm -m 2048 -M accel=kvm,memory-backend=mem \
to the host - VDUSE (vDPA Device in Userspace), and QEMU, starting with version 7.2,
supports exporting QEMU block devices over this protocol using qemu-storage-daemon.
VDUSE is currently the best interface for attaching Vitastor disks as kernel block
devices, because:
- VDUSE doesn't copy data and therefore achieves much better performance than [NBD](nbd.ru.md)
- It also doesn't have the NBD timeout problem - the device doesn't die if an operation executes for too long
- It also doesn't have the hung device problem - if the handler process dies, it can be
restarted (!) and the block device will continue to work
- It doesn't seem to have a limit on the number of devices attached to the system
VDUSE advantages:
- VDUSE copies memory 1 time instead of 2, and is therefore faster than [NBD](nbd.ru.md) for linear access.
- VDUSE doesn't have the NBD timeout problem - the device doesn't die if an operation executes for too long.
- VDUSE doesn't have the hung device problem - if the handler process dies, it can be
restarted (!) and the block device will continue to work (UBLK supports this too).
- It doesn't seem to have a limit on the number of devices attached to the system (UBLK doesn't have one either).
Example performance comparison:
| | direct fio | NBD | VDUSE |
|----------------------|-------------|-------------|-------------|
| linear write | 3.85 GB/s | 1.12 GB/s | 3.85 GB/s |
| 4k random write Q128 | 240000 iops | 120000 iops | 178000 iops |
| 4k random write Q1 | 9500 iops | 7620 iops | 7640 iops |
| linear read | 4.3 GB/s | 1.8 GB/s | 2.85 GB/s |
| 4k random read Q128 | 287000 iops | 140000 iops | 189000 iops |
| 4k random read Q1 | 9600 iops | 7640 iops | 7780 iops |
However, for linear access VDUSE may be slower than [UBLK](ublk.ru.md) (or faster), and iops-wise
VDUSE is sometimes even slower than NBD. See example performance comparisons on the [UBLK](ublk.ru.md) page.
To try VDUSE you need a Linux kernel of at least version 5.15, built with VDUSE support
(CONFIG_VDPA=m, CONFIG_VDPA_USER=m, CONFIG_VIRTIO_VDPA=m).
@@ -165,10 +157,12 @@ apt-get install linux-headers-`uname -r`
apt-get build-dep linux-image-`uname -r`-unsigned
apt-get source linux-image-`uname -r`-unsigned
cd linux*/drivers/vdpa
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m -j8 modules modules_install
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m -j8 modules
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m modules_install
cat Module.symvers >> /lib/modules/`uname -r`/build/Module.symvers
cd ../virtio
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m -j8 modules modules_install
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m -j8 modules
make -C /lib/modules/`uname -r`/build M=$PWD CONFIG_VDPA=m CONFIG_VDPA_USER=m CONFIG_VIRTIO_VDPA=m modules_install
depmod -a
```
@@ -194,3 +188,12 @@ vdpa dev add name test1 mgmtdev vduse
vdpa dev del test1
kill <qemu-storage-daemon_process_PID>
```
## Veeam
The Vitastor QEMU driver has a feature that allows you to trick third-party systems like Veeam
which can't parse vitastor disk URIs by themselves: [qemu_file_mirror_path](../config/client.ru.md#qemu_file_mirror_path).
To make such systems work, set this option to the path of some directory in an FS
(for example, `/mnt/vitastor/`) and mount that directory using [`vitastor-nfs mount --block`](../usage/nfs.ru.md).
Such systems will then access the images as files and will probably be able to work correctly :).

docs/usage/ublk.en.md (new file, 116 lines)
View File

@@ -0,0 +1,116 @@
[Documentation](../../README.md#documentation) → Usage → UBLK
-----
[Читать на русском](ublk.ru.md)
# UBLK
[ublk](https://docs.kernel.org/block/ublk.html) is a new io_uring-based Linux interface
for user-space block device drivers, available since Linux 6.0.
It's not zero-copy, but it's still a fast implementation that outperforms both [NBD](nbd.en.md)
and [VDUSE](qemu.en.md#vduse) iops-wise, and it may or may not outperform VDUSE in linear I/O MB/s.
ublk also allows recovering devices even if the server (the vitastor-ublk process) dies.
## Example performance comparison
TCP (100G), 3 hosts each with 6 NVMe OSDs, 3 replicas, single client
| | direct fio | NBD | VDUSE | UBLK |
|----------------------|-------------|-------------|------------|-------------|
| linear write | 3807 MB/s | 1832 MB/s | 3226 MB/s | 3027 MB/s |
| linear read | 3067 MB/s | 1885 MB/s | 1800 MB/s | 2076 MB/s |
| 4k random write Q128 | 128624 iops | 91060 iops | 94621 iops | 149450 iops |
| 4k random read Q128 | 117769 iops | 153408 iops | 93157 iops | 171987 iops |
| 4k random write Q1 | 8090 iops | 6442 iops | 6316 iops | 7272 iops |
| 4k random read Q1 | 9474 iops | 7200 iops | 6840 iops | 8038 iops |
RDMA (100G), 3 hosts each with 6 NVMe OSDs, 3 replicas, single client
| | direct fio | NBD | VDUSE | UBLK |
|----------------------|-------------|-------------|-------------|-------------|
| linear write | 6998 MB/s | 1878 MB/s | 4249 MB/s | 3140 MB/s |
| linear read | 8628 MB/s | 3389 MB/s | 5062 MB/s | 3674 MB/s |
| 4k random write Q128 | 222541 iops | 181589 iops | 138281 iops | 218222 iops |
| 4k random read Q128 | 412647 iops | 239987 iops | 151663 iops | 269583 iops |
| 4k random write Q1 | 11601 iops | 8592 iops | 9111 iops | 10000 iops |
| 4k random read Q1 | 10102 iops | 7788 iops | 8111 iops | 8965 iops |
## Commands
vitastor-ublk supports the following commands:
- [map](#map)
- [unmap](#unmap)
- [ls](#ls)
## map
To create a local block device for a Vitastor image run:
```
vitastor-ublk map [/dev/ublkbN] --image testimg
```
It will output a block device name like /dev/ublkb0 which you can then use as a normal disk.
You can also use `--pool <POOL> --inode <INODE> --size <SIZE>` instead of `--image <IMAGE>` if you want.
vitastor-ublk supports all usual Vitastor configuration options like `--config_path <path_to_config>` plus ublk-specific:
* `--recover` \
Recover a mapped device if the previous ublk server is dead.
* `--queue_depth 256` \
Maximum queue size for the device.
* `--max_io_size 1M` \
Maximum single I/O size for the device. Default: `max(1 MB, pool block size * EC part count)`.
* `--readonly` \
Make the device read-only.
* `--hdd` \
Mark the device as rotational.
* `--logfile /path/to/log/file.txt` \
Write log messages to the specified file instead of dropping them (in background mode)
or printing them to the standard output (in foreground mode).
* `--dev_num N` \
Use the specified device /dev/ublkbN instead of automatic selection (alternative syntax
to /dev/ublkbN positional parameter).
* `--foreground 1` \
Stay in foreground, do not daemonize.
Note that `ublk_queue_depth` and `ublk_max_io_size` may also be specified
in `/etc/vitastor/vitastor.conf` or in another configuration file specified with `--config_path`.
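For example, a minimal configuration fragment (the values are illustrative):

```
{
    "ublk_queue_depth": 256,
    "ublk_max_io_size": 4194304
}
```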
## unmap
To unmap the device run:
```
vitastor-ublk unmap /dev/ublkb0
```
## ls
```
vitastor-ublk ls [--json]
```
List mapped images.
Example output (normal format):
```
/dev/ublkb0
image: bench
pid: 584536
/dev/ublkb1
image: bench1
pid: 584546
```
Example output (JSON format):
```
{"/dev/ublkb0": {"image": "bench", "pid": 584536}, "/dev/ublkb1": {"image": "bench1", "pid": 584546}}
```

docs/usage/ublk.ru.md (new file, 121 lines)
View File

@@ -0,0 +1,121 @@
[Documentation](../../README-ru.md#документация) → Usage → UBLK
-----
[Read in English](ublk.en.md)
# UBLK
[ublk](https://docs.kernel.org/block/ublk.html) is a new io_uring-based Linux interface
for implementing block devices in user space, available since Linux 6.0.
ublk also copies memory (i.e. it's not zero-copy), but it still beats both
[NBD](nbd.ru.md) and [VDUSE](qemu.ru.md#vduse) in IOPS, and it can sometimes even beat
VDUSE in linear access speed. ublk also allows reviving devices whose server
(the vitastor-ublk handler process) has died.
## Example performance comparison
TCP (100G), 3 hosts with 6 NVMe OSDs each, 3 replicas, single client
| | direct fio | NBD | VDUSE | UBLK |
|----------------------|-------------|-------------|------------|-------------|
| linear write | 3807 MB/s | 1832 MB/s | 3226 MB/s | 3027 MB/s |
| linear read | 3067 MB/s | 1885 MB/s | 1800 MB/s | 2076 MB/s |
| 4k random write Q128 | 128624 iops | 91060 iops | 94621 iops | 149450 iops |
| 4k random read Q128 | 117769 iops | 153408 iops | 93157 iops | 171987 iops |
| 4k random write Q1 | 8090 iops | 6442 iops | 6316 iops | 7272 iops |
| 4k random read Q1 | 9474 iops | 7200 iops | 6840 iops | 8038 iops |
RDMA (100G), 3 hosts with 6 NVMe OSDs each, 3 replicas, single client
| | direct fio | NBD | VDUSE | UBLK |
|----------------------|-------------|-------------|-------------|-------------|
| linear write | 6998 MB/s | 1878 MB/s | 4249 MB/s | 3140 MB/s |
| linear read | 8628 MB/s | 3389 MB/s | 5062 MB/s | 3674 MB/s |
| 4k random write Q128 | 222541 iops | 181589 iops | 138281 iops | 218222 iops |
| 4k random read Q128 | 412647 iops | 239987 iops | 151663 iops | 269583 iops |
| 4k random write Q1 | 11601 iops | 8592 iops | 9111 iops | 10000 iops |
| 4k random read Q1 | 10102 iops | 7788 iops | 8111 iops | 8965 iops |
## Commands
vitastor-ublk supports the following commands:
- [map](#map)
- [unmap](#unmap)
- [ls](#ls)
## map
To create a local block device for a Vitastor image, run:
```
vitastor-ublk map [/dev/ublkbN] --image testimg
```
The command will print the name of a block device like /dev/ublkb0, which you can then
use as a normal disk.
To refer to an image by inode number, as with other commands, you can use the options
`--pool <POOL> --inode <INODE> --size <SIZE>` instead of `--image testimg`.
vitastor-ublk supports all the usual Vitastor options, for example `--config_path <path_to_config>`,
plus ublk-specific ones:
* `--recover` \
  Recover a previously mapped device whose handler has died.
* `--queue_depth 256` \
  Maximum queue depth for the device.
* `--max_io_size 1M` \
  Maximum single I/O size for the device. Default: `max(1 MB, pool data block size * EC data chunk count)`.
* `--readonly` \
  Map the device in read-only mode.
* `--hdd` \
  Mark the device as a rotational hard drive (sets the rotational flag).
* `--logfile /path/to/log/file.txt` \
  Write log messages to the specified file instead of dropping them (when running in the
  background) or printing them to the standard output (when running in the foreground
  with `--foreground 1`).
* `--dev_num N` \
  Use the specified device `/dev/ublkbN` instead of automatic selection.
* `--foreground 1` \
  Stay in foreground, do not daemonize.
Note that the `ublk_queue_depth` and `ublk_max_io_size` options can also be set in
`/etc/vitastor/vitastor.conf` or in another configuration file specified with `--config_path`.
## unmap
To unmap the device, run:
```
vitastor-ublk unmap /dev/ublkb0
```
## ls
```
vitastor-ublk ls [--json]
```
List mapped devices.
Example output (normal format):
```
/dev/ublkb0
image: bench
pid: 584536
/dev/ublkb1
image: bench1
pid: 584546
```
Example output (JSON format):
```
{"/dev/ublkb0": {"image": "bench", "pid": 584536}, "/dev/ublkb1": {"image": "bench1", "pid": 584546}}
```

View File

@@ -253,7 +253,7 @@ function random_custom_combinations(osd_tree, rules, count, ordered)
for (let i = 1; i < rules.length; i++)
{
const filtered = filter_tree_by_rules(osd_tree, rules[i], selected);
const idx = select_murmur3(filtered.length, i => 'p:'+f.id+':'+filtered[i].id);
const idx = select_murmur3(filtered.length, i => 'p:'+f.id+':'+(filtered[i].name || filtered[i].id));
selected.push(idx == null ? { levels: {}, id: null } : filtered[idx]);
}
const size = selected.filter(s => s.id !== null).length;
@@ -270,7 +270,7 @@ function random_custom_combinations(osd_tree, rules, count, ordered)
for (const item_rules of rules)
{
const filtered = selected.length ? filter_tree_by_rules(osd_tree, item_rules, selected) : first;
const idx = select_murmur3(filtered.length, i => n+':'+filtered[i].id);
const idx = select_murmur3(filtered.length, i => n+':'+(filtered[i].name || filtered[i].id));
selected.push(idx == null ? { levels: {}, id: null } : filtered[idx]);
}
const size = selected.filter(s => s.id !== null).length;
@@ -340,9 +340,9 @@ function filter_tree_by_rules(osd_tree, rules, selected)
}
// Convert from
// node_list = { id: string|number, level: string, size?: number, parent?: string|number }[]
// node_list = { id: string|number, name?: string, level: string, size?: number, parent?: string|number }[]
// to
// node_tree = { [node_id]: { id, level, size?, parent?, children?: child_node_id[], levels: { [level]: id, ... } } }
// node_tree = { [node_id]: { id, name?, level, size?, parent?, children?: child_node[], levels: { [level]: id, ... } } }
function index_tree(node_list)
{
const tree = { '': { children: [], levels: {} } };
@@ -357,7 +357,7 @@ function index_tree(node_list)
tree[parent_id].children = tree[parent_id].children || [];
tree[parent_id].children.push(tree[node.id]);
}
const cur = tree[''].children;
const cur = [ ...tree[''].children ];
for (let i = 0; i < cur.length; i++)
{
cur[i].levels[cur[i].level] = cur[i].id;

mon/lp_optimizer/fold.js (new file, 244 lines)
View File

@@ -0,0 +1,244 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 (see README.md for details)
// Extract OSDs from the lowest affected tree level into a separate (flat) map
// to run PG optimisation on failure domains instead of individual OSDs
//
// node_list = same input as for index_tree()
// rules = [ level, operator, value ][][]
// returns { nodes: new_node_list, leaves: { new_folded_node_id: [ extracted_leaf_nodes... ] } }
function fold_failure_domains(node_list, rules)
{
const interest = {};
for (const level_rules of rules)
{
for (const rule of level_rules)
interest[rule[0]] = true;
}
const max_numeric_id = node_list.reduce((a, c) => a < (0|c.id) ? (0|c.id) : a, 0);
let next_id = max_numeric_id;
const node_map = node_list.reduce((a, c) => { a[c.id||''] = c; return a; }, {});
const old_ids_by_new = {};
const extracted_nodes = {};
let folded = true;
while (folded)
{
const per_parent = {};
for (const node_id in node_map)
{
const node = node_map[node_id];
const p = node.parent || '';
per_parent[p] = per_parent[p]||[];
per_parent[p].push(node);
}
folded = false;
for (const node_id in per_parent)
{
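            // Fold this node if it isn't the root and none of its children
            // are parents themselves or belong to a level used by the rules.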
const fold_node = node_id !== '' && per_parent[node_id].length > 0 && per_parent[node_id].filter(child => per_parent[child.id||''] || interest[child.level]).length == 0;
if (fold_node)
{
const old_node = node_map[node_id];
const new_id = ++next_id;
node_map[new_id] = {
...old_node,
id: new_id,
name: node_id, // for use in murmur3 hashes
size: per_parent[node_id].reduce((a, c) => a + (Number(c.size)||0), 0),
};
delete node_map[node_id];
old_ids_by_new[new_id] = node_id;
extracted_nodes[new_id] = [];
for (const child of per_parent[node_id])
{
if (old_ids_by_new[child.id])
{
extracted_nodes[new_id].push(...extracted_nodes[child.id]);
delete extracted_nodes[child.id];
}
else
extracted_nodes[new_id].push(child);
delete node_map[child.id];
}
folded = true;
}
}
}
return { nodes: Object.values(node_map), leaves: extracted_nodes };
}
// Distribute PGs mapped to "folded" nodes to individual OSDs according to their weights
// folded_pgs = optimize_result.int_pgs before folding
// prev_pgs = optional previous PGs from optimize_change() input
// extracted_nodes = output from fold_failure_domains
function unfold_failure_domains(folded_pgs, prev_pgs, extracted_nodes)
{
const maps = {};
let found = false;
for (const new_id in extracted_nodes)
{
const weights = {};
for (const sub_node of extracted_nodes[new_id])
{
weights[sub_node.id] = sub_node.size;
}
maps[new_id] = { weights, prev: [], next: [], pos: 0 };
found = true;
}
if (!found)
{
return folded_pgs;
}
for (let i = 0; i < folded_pgs.length; i++)
{
for (let j = 0; j < folded_pgs[i].length; j++)
{
if (maps[folded_pgs[i][j]])
{
maps[folded_pgs[i][j]].prev.push(prev_pgs && prev_pgs[i] && prev_pgs[i][j] || 0);
}
}
}
for (const new_id in maps)
{
maps[new_id].next = adjust_distribution(maps[new_id].weights, maps[new_id].prev);
}
const mapped_pgs = [];
for (let i = 0; i < folded_pgs.length; i++)
{
mapped_pgs.push(folded_pgs[i].map(osd => (maps[osd] ? maps[osd].next[maps[osd].pos++] : osd)));
}
return mapped_pgs;
}
// Return the new array of items re-distributed as close as possible to weights in wanted_weights
// wanted_weights = { [key]: weight }
// cur_items = key[]
function adjust_distribution(wanted_weights, cur_items)
{
const item_map = {};
for (let i = 0; i < cur_items.length; i++)
{
const item = cur_items[i];
item_map[item] = (item_map[item] || { target: 0, cur: [] });
item_map[item].cur.push(i);
}
let total_weight = 0;
for (const item in wanted_weights)
{
total_weight += Number(wanted_weights[item]) || 0;
}
for (const item in wanted_weights)
{
const weight = wanted_weights[item] / total_weight * cur_items.length;
if (weight > 0)
{
item_map[item] = (item_map[item] || { target: 0, cur: [] });
item_map[item].target = weight;
}
}
const diff = (item) => (item_map[item].cur.length - item_map[item].target);
const most_underweighted = Object.keys(item_map)
.filter(item => item_map[item].target > 0)
.sort((a, b) => diff(a) - diff(b));
// Items with zero target weight MUST never be selected - remove them
// and remap each of them to a most underweighted item
for (const item in item_map)
{
if (!item_map[item].target)
{
const prev = item_map[item];
delete item_map[item];
for (const idx of prev.cur)
{
const move_to = most_underweighted[0];
item_map[move_to].cur.push(idx);
move_leftmost(most_underweighted, diff);
}
}
}
// Other over-weighted items are only moved if it improves the distribution
while (most_underweighted.length > 1)
{
const first = most_underweighted[0];
const last = most_underweighted[most_underweighted.length-1];
const first_diff = diff(first);
const last_diff = diff(last);
if (Math.abs(first_diff+1)+Math.abs(last_diff-1) < Math.abs(first_diff)+Math.abs(last_diff))
{
item_map[first].cur.push(item_map[last].cur.pop());
move_leftmost(most_underweighted, diff);
move_rightmost(most_underweighted, diff);
}
else
{
break;
}
}
const new_items = new Array(cur_items.length);
for (const item in item_map)
{
for (const idx of item_map[item].cur)
{
new_items[idx] = item;
}
}
return new_items;
}
function move_leftmost(sorted_array, diff)
{
// Re-sort by moving the leftmost item to the right if it changes position
const first = sorted_array[0];
const new_diff = diff(first);
let r = 0;
while (r < sorted_array.length-1 && diff(sorted_array[r+1]) <= new_diff)
r++;
if (r > 0)
{
for (let i = 0; i < r; i++)
sorted_array[i] = sorted_array[i+1];
sorted_array[r] = first;
}
}
function move_rightmost(sorted_array, diff)
{
// Re-sort by moving the rightmost item to the left if it changes position
const last = sorted_array[sorted_array.length-1];
const new_diff = diff(last);
let r = sorted_array.length-1;
while (r > 0 && diff(sorted_array[r-1]) > new_diff)
r--;
if (r < sorted_array.length-1)
{
for (let i = sorted_array.length-1; i > r; i--)
sorted_array[i] = sorted_array[i-1];
sorted_array[r] = last;
}
}
// map previous PGs to folded nodes
function fold_prev_pgs(pgs, extracted_nodes)
{
const unmap = {};
for (const new_id in extracted_nodes)
{
for (const sub_node of extracted_nodes[new_id])
{
unmap[sub_node.id] = new_id;
}
}
const mapped_pgs = [];
for (let i = 0; i < pgs.length; i++)
{
mapped_pgs.push(pgs[i].map(osd => (unmap[osd] || osd)));
}
return mapped_pgs;
}
module.exports = {
fold_failure_domains,
unfold_failure_domains,
adjust_distribution,
fold_prev_pgs,
};
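A minimal usage sketch of the fold/unfold round trip implemented above (the tree here is hypothetical; the real integration is in generate_pool_pgs(), further below):

```
const { fold_failure_domains, unfold_failure_domains } = require('./fold.js');
// Two hosts with two OSDs each; the rules only mention the 'host' level,
// so each host folds into a single leaf with the summed OSD size.
// Folded hosts get new numeric ids 5 and 6 (max numeric OSD id + 1, + 2).
const { nodes, leaves } = fold_failure_domains([
    { id: 1, level: 'osd', size: 1, parent: 'host1' },
    { id: 2, level: 'osd', size: 1, parent: 'host1' },
    { id: 3, level: 'osd', size: 1, parent: 'host2' },
    { id: 4, level: 'osd', size: 1, parent: 'host2' },
    { id: 'host1', level: 'host' },
    { id: 'host2', level: 'host' },
], [ [ [ 'host' ] ] ]);
// A PG placed on the two folded hosts maps back to one concrete OSD per host:
console.log(unfold_failure_domains([ [ 5, 6 ] ], null, leaves));
```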

View File

@@ -98,6 +98,7 @@ async function optimize_initial({ osd_weights, combinator, pg_count, pg_size = 3
score: lp_result.score,
weights: lp_result.vars,
int_pgs,
pg_effsize,
space: eff * pg_effsize,
total_space: total_weight,
};
@@ -409,6 +410,7 @@ async function optimize_change({ prev_pgs: prev_int_pgs, osd_weights, combinator
int_pgs: new_pgs,
differs,
osd_differs,
pg_effsize,
space: pg_effsize * pg_list_space_efficiency(new_pgs, osd_weights, pg_minsize, parity_space),
total_space: total_weight,
};

View File

@@ -0,0 +1,108 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 (see README.md for details)
const assert = require('assert');
const { fold_failure_domains, unfold_failure_domains, adjust_distribution } = require('./fold.js');
const DSL = require('./dsl_pgs.js');
const LPOptimizer = require('./lp_optimizer.js');
const stableStringify = require('../stable-stringify.js');
async function run()
{
// Test run adjust_distribution
console.log('adjust_distribution');
const rand = [];
for (let i = 0; i < 100; i++)
{
rand.push(1 + Math.floor(10*Math.random()));
// or rand.push(0);
}
const adj = adjust_distribution({ 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1 }, rand);
//console.log(rand.join(' '));
console.log(rand.reduce((a, c) => { a[c] = (a[c]||0)+1; return a; }, {}));
//console.log(adj.join(' '));
console.log(adj.reduce((a, c) => { a[c] = (a[c]||0)+1; return a; }, {}));
console.log('Movement: '+rand.reduce((a, c, i) => a+(rand[i] != adj[i] ? 1 : 0), 0)+'/'+rand.length);
console.log('\nfold_failure_domains');
console.log(JSON.stringify(fold_failure_domains(
[
{ id: 1, level: 'osd', size: 1, parent: 'disk1' },
{ id: 2, level: 'osd', size: 2, parent: 'disk1' },
{ id: 'disk1', level: 'disk', parent: 'host1' },
{ id: 'host1', level: 'host', parent: 'dc1' },
{ id: 'dc1', level: 'dc' },
],
[ [ [ 'dc' ], [ 'host' ] ] ]
), 0, 2));
console.log('\nfold_failure_domains empty rules');
console.log(JSON.stringify(fold_failure_domains(
[
{ id: 1, level: 'osd', size: 1, parent: 'disk1' },
{ id: 2, level: 'osd', size: 2, parent: 'disk1' },
{ id: 'disk1', level: 'disk', parent: 'host1' },
{ id: 'host1', level: 'host', parent: 'dc1' },
{ id: 'dc1', level: 'dc' },
],
[]
), 0, 2));
console.log('\noptimize_folded');
// 5 DCs, 2 hosts per DC, 10 OSD per host
const nodes = [];
for (let i = 1; i <= 100; i++)
{
nodes.push({ id: i, level: 'osd', size: 1, parent: 'host'+(1+(0|((i-1)/10))) });
}
for (let i = 1; i <= 10; i++)
{
nodes.push({ id: 'host'+i, level: 'host', parent: 'dc'+(1+(0|((i-1)/2))) });
}
for (let i = 1; i <= 5; i++)
{
nodes.push({ id: 'dc'+i, level: 'dc' });
}
// Check rules
const rules = DSL.parse_level_indexes({ dc: '112233', host: '123456' }, [ 'dc', 'host', 'osd' ]);
assert.deepEqual(rules, [[],[["dc","=",1],["host","!=",[1]]],[["dc","!=",[1]]],[["dc","=",3],["host","!=",[3]]],[["dc","!=",[1,3]]],[["dc","=",5],["host","!=",[5]]]]);
// Check tree folding
const { nodes: folded_nodes, leaves: folded_leaves } = fold_failure_domains(nodes, rules);
const expected_folded = [];
const expected_leaves = {};
for (let i = 1; i <= 10; i++)
{
expected_folded.push({ id: 100+i, name: 'host'+i, level: 'host', size: 10, parent: 'dc'+(1+(0|((i-1)/2))) });
expected_leaves[100+i] = [ ...new Array(10).keys() ].map(k => ({ id: 10*(i-1)+k+1, level: 'osd', size: 1, parent: 'host'+i }));
}
for (let i = 1; i <= 5; i++)
{
expected_folded.push({ id: 'dc'+i, level: 'dc' });
}
assert.equal(stableStringify(folded_nodes), stableStringify(expected_folded));
assert.equal(stableStringify(folded_leaves), stableStringify(expected_leaves));
// Now optimise it
console.log('1000 PGs, EC 112233');
const leaf_weights = folded_nodes.reduce((a, c) => { if (Number(c.id)) { a[c.id] = c.size; } return a; }, {});
let res = await LPOptimizer.optimize_initial({
osd_weights: leaf_weights,
combinator: new DSL.RuleCombinator(folded_nodes, rules, 10000, false),
pg_size: 6,
pg_count: 1000,
ordered: false,
});
LPOptimizer.print_change_stats(res, false);
assert.equal(res.space, 100, 'Initial distribution');
const unfolded_res = { ...res };
unfolded_res.int_pgs = unfold_failure_domains(res.int_pgs, null, folded_leaves);
const osd_weights = nodes.reduce((a, c) => { if (Number(c.id)) { a[c.id] = c.size; } return a; }, {});
unfolded_res.space = unfolded_res.pg_effsize * LPOptimizer.pg_list_space_efficiency(unfolded_res.int_pgs, osd_weights, 0, 1);
LPOptimizer.print_change_stats(unfolded_res, false);
assert.equal(res.space, 100, 'Initial distribution');
}
run().catch(console.error);

View File

@@ -96,6 +96,7 @@ class Mon
}
else
{
res.setHeader('Content-Type', 'text/plain; version=0.0.4; charset=utf-8');
res.write(export_prometheus_metrics(this.state));
}
}

View File

@@ -15,7 +15,7 @@ function get_osd_tree(global_config, state)
const stat = state.osd.stats[osd_num];
const osd_cfg = state.config.osd[osd_num];
let reweight = osd_cfg == null ? 1 : Number(osd_cfg.reweight);
if (reweight < 0 || isNaN(reweight))
if (isNaN(reweight) || reweight < 0 || reweight > 1)
reweight = 1;
if (stat && stat.size && reweight && (state.osd.state[osd_num] || Number(stat.time) >= down_time ||
osd_cfg && osd_cfg.noout))
@@ -87,7 +87,7 @@ function make_hier_tree(global_config, tree)
tree[''] = { children: [] };
for (const node_id in tree)
{
if (node_id === '' || tree[node_id].level === 'osd' && (!tree[node_id].size || tree[node_id].size <= 0))
if (node_id === '' || !(tree[node_id].children||[]).length && (tree[node_id].size||0) <= 0)
{
continue;
}
@@ -107,10 +107,10 @@ function make_hier_tree(global_config, tree)
deleted = 0;
for (const node_id in tree)
{
if (tree[node_id].level !== 'osd' && (!tree[node_id].children || !tree[node_id].children.length))
if (!(tree[node_id].children||[]).length && (tree[node_id].size||0) <= 0)
{
const parent = tree[node_id].parent;
if (parent)
if (parent && tree[parent])
{
tree[parent].children = tree[parent].children.filter(c => c != tree[node_id]);
}
@@ -179,7 +179,7 @@ function filter_osds_by_block_layout(orig_tree, osd_stats, block_size, bitmap_gr
if (orig_tree[osd].level === 'osd')
{
const osd_stat = osd_stats[osd];
if (osd_stat && (osd_stat.bs_block_size && osd_stat.bs_block_size != block_size ||
if (osd_stat && (osd_stat.data_block_size && osd_stat.data_block_size != block_size ||
osd_stat.bitmap_granularity && osd_stat.bitmap_granularity != bitmap_granularity ||
osd_stat.immediate_commit == 'small' && immediate_commit == 'all' ||
osd_stat.immediate_commit == 'none' && immediate_commit != 'none'))

View File

@@ -1,6 +1,6 @@
{
"name": "vitastor-mon",
"version": "2.0.0",
"version": "2.3.0",
"description": "Vitastor SDS monitor service",
"main": "mon-main.js",
"scripts": {
@@ -9,7 +9,7 @@
"author": "Vitaliy Filippov",
"license": "UNLICENSED",
"dependencies": {
"antietcd": "^1.1.2",
"antietcd": "^1.1.3",
"sprintf-js": "^1.1.2",
"ws": "^7.2.5"
},

View File

@@ -3,6 +3,7 @@
const { RuleCombinator } = require('./lp_optimizer/dsl_pgs.js');
const { SimpleCombinator, flatten_tree } = require('./lp_optimizer/simple_pgs.js');
const { fold_failure_domains, unfold_failure_domains, fold_prev_pgs } = require('./lp_optimizer/fold.js');
const { validate_pool_cfg, get_pg_rules } = require('./pool_config.js');
const LPOptimizer = require('./lp_optimizer/lp_optimizer.js');
const { scale_pg_count } = require('./pg_utils.js');
@@ -160,7 +161,6 @@ async function generate_pool_pgs(state, global_config, pool_id, osd_tree, levels
pool_cfg.bitmap_granularity || global_config.bitmap_granularity || 4096,
pool_cfg.immediate_commit || global_config.immediate_commit || 'all'
);
pool_tree = make_hier_tree(global_config, pool_tree);
// First try last_clean_pgs to minimize data movement
let prev_pgs = [];
for (const pg in ((state.history.last_clean_pgs.items||{})[pool_id]||{}))
@@ -175,14 +175,19 @@ async function generate_pool_pgs(state, global_config, pool_id, osd_tree, levels
prev_pgs[pg-1] = [ ...state.pg.config.items[pool_id][pg].osd_set ];
}
}
const use_rules = !global_config.use_old_pg_combinator || pool_cfg.level_placement || pool_cfg.raw_placement;
const rules = use_rules ? get_pg_rules(pool_id, pool_cfg, global_config.placement_levels) : null;
const folded = fold_failure_domains(Object.values(pool_tree), use_rules ? rules : [ [ [ pool_cfg.failure_domain ] ] ]);
// FIXME: Remove/merge make_hier_tree() step somewhere, however it's needed to remove empty nodes
const folded_tree = make_hier_tree(global_config, folded.nodes);
const old_pg_count = prev_pgs.length;
const optimize_cfg = {
osd_weights: Object.values(pool_tree).filter(item => item.level === 'osd').reduce((a, c) => { a[c.id] = c.size; return a; }, {}),
combinator: !global_config.use_old_pg_combinator || pool_cfg.level_placement || pool_cfg.raw_placement
osd_weights: folded.nodes.reduce((a, c) => { if (Number(c.id)) { a[c.id] = c.size; } return a; }, {}),
combinator: use_rules
// new algorithm:
? new RuleCombinator(pool_tree, get_pg_rules(pool_id, pool_cfg, global_config.placement_levels), pool_cfg.max_osd_combinations)
? new RuleCombinator(folded_tree, rules, pool_cfg.max_osd_combinations)
// old algorithm:
: new SimpleCombinator(flatten_tree(pool_tree[''].children, levels, pool_cfg.failure_domain, 'osd'), pool_cfg.pg_size, pool_cfg.max_osd_combinations),
: new SimpleCombinator(flatten_tree(folded_tree[''].children, levels, pool_cfg.failure_domain, 'osd'), pool_cfg.pg_size, pool_cfg.max_osd_combinations),
pg_count: pool_cfg.pg_count,
pg_size: pool_cfg.pg_size,
pg_minsize: pool_cfg.pg_minsize,
@@ -202,12 +207,11 @@ async function generate_pool_pgs(state, global_config, pool_id, osd_tree, levels
for (const pg of prev_pgs)
{
while (pg.length < pool_cfg.pg_size)
{
pg.push(0);
}
}
const folded_prev_pgs = fold_prev_pgs(prev_pgs, folded.leaves);
optimize_result = await LPOptimizer.optimize_change({
prev_pgs,
prev_pgs: folded_prev_pgs,
...optimize_cfg,
});
}
@@ -215,6 +219,10 @@ async function generate_pool_pgs(state, global_config, pool_id, osd_tree, levels
{
optimize_result = await LPOptimizer.optimize_initial(optimize_cfg);
}
optimize_result.int_pgs = unfold_failure_domains(optimize_result.int_pgs, prev_pgs, folded.leaves);
const osd_weights = Object.values(pool_tree).reduce((a, c) => { if (c.level === 'osd') { a[c.id] = c.size; } return a; }, {});
optimize_result.space = optimize_result.pg_effsize * LPOptimizer.pg_list_space_efficiency(optimize_result.int_pgs,
osd_weights, optimize_cfg.pg_minsize, 1);
console.log(`Pool ${pool_id} (${pool_cfg.name || 'unnamed'}):`);
LPOptimizer.print_change_stats(optimize_result);
let pg_effsize = pool_cfg.pg_size;

View File

@@ -40,6 +40,11 @@ async function run()
console.log("/etc/systemd/system/vitastor-etcd.service already exists");
process.exit(1);
}
if (!in_docker && fs.existsSync("/etc/systemd/system/etcd.service"))
{
console.log("/etc/systemd/system/etcd.service already exists");
process.exit(1);
}
const config = JSON.parse(fs.readFileSync(config_path, { encoding: 'utf-8' }));
if (!config.etcd_address)
{
@@ -66,7 +71,7 @@ async function run()
console.log('etcd for Vitastor configured. Run `systemctl enable --now vitastor-etcd` to start etcd');
process.exit(0);
}
await system(`mkdir -p /var/lib/etcd`);
await system(`mkdir -p /var/lib/etcd/vitastor`);
fs.writeFileSync(
"/etc/systemd/system/vitastor-etcd.service",
`[Unit]
@@ -77,14 +82,14 @@ Wants=network-online.target local-fs.target time-sync.target
[Service]
Restart=always
Environment=GOGC=50
ExecStart=etcd --name ${etcd_name} --data-dir /var/lib/etcd \\
ExecStart=etcd --name ${etcd_name} --data-dir /var/lib/etcd/vitastor \\
--snapshot-count 10000 --advertise-client-urls http://${etcds[num]}:2379 --listen-client-urls http://${etcds[num]}:2379 \\
--initial-advertise-peer-urls http://${etcds[num]}:2380 --listen-peer-urls http://${etcds[num]}:2380 \\
--initial-cluster-token vitastor-etcd-1 --initial-cluster ${etcd_cluster} \\
--initial-cluster-state new --max-txn-ops=100000 --max-request-bytes=104857600 \\
--auto-compaction-retention=10 --auto-compaction-mode=revision
- WorkingDirectory=/var/lib/etcd
- ExecStartPre=+chown -R etcd /var/lib/etcd
+ WorkingDirectory=/var/lib/etcd/vitastor
+ ExecStartPre=+chown -R etcd /var/lib/etcd/vitastor
User=etcd
PrivateTmp=false
TasksMax=infinity
@@ -97,8 +102,9 @@ WantedBy=multi-user.target
`);
await system(`useradd etcd`);
await system(`systemctl daemon-reload`);
- await system(`systemctl enable etcd`);
- await system(`systemctl start etcd`);
+ // Disable distribution etcd unit and enable our one
+ await system(`systemctl disable --now etcd`);
+ await system(`systemctl enable --now vitastor-etcd`);
process.exit(0);
}
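
The two hunks above move the etcd data directory to /var/lib/etcd/vitastor and replace the distribution `etcd` unit with `vitastor-etcd`. For hosts that were configured with the old layout, a manual migration along these lines should work; this is a hedged sketch, not part of the commit, and it assumes etcd's default `member` directory name:

# Stop the old unit, move the member data into the new directory,
# then switch units (sketch only; paths assume a default etcd layout)
systemctl stop etcd
mkdir -p /var/lib/etcd/vitastor
mv /var/lib/etcd/member /var/lib/etcd/vitastor/
chown -R etcd /var/lib/etcd/vitastor
systemctl disable --now etcd
systemctl enable --now vitastor-etcd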

View File

@@ -87,11 +87,25 @@ function sum_op_stats(all_osd, prev_stats)
for (const k in derived[type][op])
{
sum_diff[type][op] = sum_diff[type][op] || {};
- sum_diff[type][op][k] = (sum_diff[type][op][k] || 0n) + derived[type][op][k];
+ if (k == 'lat')
+ sum_diff[type][op].lat = (sum_diff[type][op].lat || 0n) + derived[type][op].lat*derived[type][op].iops;
+ else
+ sum_diff[type][op][k] = (sum_diff[type][op][k] || 0n) + derived[type][op][k];
}
}
}
}
+ // Calculate average (weighted by iops) op latency across all OSDs
+ for (const type in sum_diff)
+ {
+ for (const op in sum_diff[type])
+ {
+ if (sum_diff[type][op].lat)
+ {
+ sum_diff[type][op].lat /= sum_diff[type][op].iops;
+ }
+ }
+ }
return sum_diff;
}
@@ -271,8 +285,7 @@ function sum_inode_stats(state, prev_stats)
const op_st = inode_stats[pool_id][inode_num][op];
op_st.bps += op_diff.bps;
op_st.iops += op_diff.iops;
- op_st.lat += op_diff.lat;
- op_st.n_osd = (op_st.n_osd || 0) + 1;
+ op_st.lat += op_diff.lat*op_diff.iops;
}
}
}
@@ -285,11 +298,8 @@ function sum_inode_stats(state, prev_stats)
for (const op of [ 'read', 'write', 'delete' ])
{
const op_st = inode_stats[pool_id][inode_num][op];
- if (op_st.n_osd)
- {
- op_st.lat /= BigInt(op_st.n_osd);
- delete op_st.n_osd;
- }
+ if (op_st.lat)
+ op_st.lat /= op_st.iops;
if (op_st.bps > 0 || op_st.iops > 0)
nonzero = true;
}

View File

@@ -1,6 +1,6 @@
{
"name": "vitastor",
"version": "2.0.0",
"version": "2.3.0",
"description": "Low-level native bindings to Vitastor client library",
"main": "index.js",
"keywords": [

View File

@@ -261,7 +261,7 @@ sub free_image
my ($vtype, $name, $vmid, undef, undef, undef) = $class->parse_volname($volname);
$class->deactivate_volume($storeid, $scfg, $volname);
my $full_list = run_cli($scfg, [ 'ls', '-l' ]);
- my $list = _process_list($scfg, $storeid, $full_list);
+ my $list = _process_list($scfg, $storeid, $full_list, 0);
# Remove image and all its snapshots
my $rm_names = {
map { ($prefix.$_->{name} => 1) }
@@ -269,6 +269,10 @@ sub free_image
@$list
};
my $children = [ grep { $_->{parent_name} && $rm_names->{$_->{parent_name}} } @$full_list ];
+ $children = [ grep {
+ substr($_->{name}, 0, length($prefix.$name)) ne $prefix.$name &&
+ substr($_->{name}, 0, length($prefix.$name)+1) ne $prefix.$name.'@'
+ } @$children ];
die "Image has children: ".join(', ', map {
substr($_->{name}, 0, length $prefix) eq $prefix
? substr($_->{name}, length $prefix)
@@ -288,14 +292,15 @@ sub free_image
sub _process_list
{
- my ($scfg, $storeid, $result) = @_;
+ my ($scfg, $storeid, $result, $skip_snapshot) = @_;
+ $skip_snapshot = 1 if !defined $skip_snapshot;
my $prefix = defined $scfg->{vitastor_prefix} ? $scfg->{vitastor_prefix} : 'pve/';
my $list = [];
foreach my $el (@$result)
{
next if !$el->{name} || length($prefix) && substr($el->{name}, 0, length $prefix) ne $prefix;
my $name = substr($el->{name}, length $prefix);
- next if $name =~ /@/;
+ next if $skip_snapshot && $name =~ /@/;
my ($owner) = $name =~ /^(?:vm|base)-(\d+)-/s;
next if !defined $owner;
my $parent = !defined $el->{parent_name}
@@ -410,8 +415,8 @@ sub volume_size_info
my $prefix = defined $scfg->{vitastor_prefix} ? $scfg->{vitastor_prefix} : 'pve/';
my ($vtype, $name, $vmid) = $class->parse_volname($volname);
my $info = _process_list($scfg, $storeid, run_cli($scfg, [ 'ls', $prefix.$name ]))->[0];
- #return wantarray ? ($size, $format, $used, $parent, $st->ctime) : $size;
- return $info->{size};
+ # (size, format, used, parent, ctime)
+ return wantarray ? ($info->{size}, $info->{format}, $info->{size}, $info->{parent}, 0) : $info->{size};
}
sub volume_resize

View File

@@ -50,7 +50,7 @@ from cinder.volume import configuration
from cinder.volume import driver
from cinder.volume import volume_utils
- VITASTOR_VERSION = '2.0.0'
+ VITASTOR_VERSION = '2.3.0'
LOG = logging.getLogger(__name__)

View File

@@ -0,0 +1,637 @@
diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h
index aaad4a3da1..5f5daa8341 100644
--- a/include/libvirt/libvirt-storage.h
+++ b/include/libvirt/libvirt-storage.h
@@ -326,6 +326,7 @@ typedef enum {
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS = 1 << 17, /* (Since: 1.2.8) */
VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE = 1 << 18, /* (Since: 3.1.0) */
VIR_CONNECT_LIST_STORAGE_POOLS_ISCSI_DIRECT = 1 << 19, /* (Since: 5.6.0) */
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR = 1 << 20, /* (Since: 5.0.0) */
} virConnectListAllStoragePoolsFlags;
int virConnectListAllStoragePools(virConnectPtr conn,
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 1e24e41a48..ce359a4cf8 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -7435,7 +7435,8 @@ virDomainDiskSourceNetworkParse(xmlNodePtr node,
src->configFile = virXPathString("string(./config/@file)", ctxt);
if (src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTP ||
- src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTPS)
+ src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTPS ||
+ src->protocol == VIR_STORAGE_NET_PROTOCOL_VITASTOR)
src->query = virXMLPropString(node, "query");
if (virDomainStorageNetworkParseHosts(node, ctxt, &src->hosts, &src->nhosts) < 0)
@@ -31871,6 +31872,7 @@ virDomainStorageSourceTranslateSourcePool(virStorageSource *src,
case VIR_STORAGE_POOL_MPATH:
case VIR_STORAGE_POOL_RBD:
+ case VIR_STORAGE_POOL_VITASTOR:
case VIR_STORAGE_POOL_SHEEPDOG:
case VIR_STORAGE_POOL_GLUSTER:
case VIR_STORAGE_POOL_LAST:
diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c
index b28af7fa56..d1aae6e43e 100644
--- a/src/conf/domain_validate.c
+++ b/src/conf/domain_validate.c
@@ -504,6 +504,7 @@ virDomainDiskDefValidateSourceChainOne(const virStorageSource *src)
case VIR_STORAGE_NET_PROTOCOL_RBD:
break;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_NBD:
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
@@ -576,7 +577,7 @@ virDomainDiskDefValidateSourceChainOne(const virStorageSource *src)
}
}
- /* internal snapshots and config files are currently supported only with rbd: */
+ /* internal snapshots are currently supported only with rbd: */
if (virStorageSourceGetActualType(src) != VIR_STORAGE_TYPE_NETWORK &&
src->protocol != VIR_STORAGE_NET_PROTOCOL_RBD) {
if (src->snapshot) {
@@ -584,10 +585,14 @@ virDomainDiskDefValidateSourceChainOne(const virStorageSource *src)
_("<snapshot> element is currently supported only with 'rbd' disks"));
return -1;
}
-
+ }
+ /* config files are currently supported only with rbd and vitastor: */
+ if (virStorageSourceGetActualType(src) != VIR_STORAGE_TYPE_NETWORK &&
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_RBD &&
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_VITASTOR) {
if (src->configFile) {
virReportError(VIR_ERR_XML_ERROR, "%s",
- _("<config> element is currently supported only with 'rbd' disks"));
+ _("<config> element is currently supported only with 'rbd' and 'vitastor' disks"));
return -1;
}
}
diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng
index 183dd5db5e..dcc0d1a778 100644
--- a/src/conf/schemas/domaincommon.rng
+++ b/src/conf/schemas/domaincommon.rng
@@ -2066,6 +2066,35 @@
</element>
</define>
+ <define name="diskSourceNetworkProtocolVitastor">
+ <element name="source">
+ <interleave>
+ <attribute name="protocol">
+ <value>vitastor</value>
+ </attribute>
+ <ref name="diskSourceCommon"/>
+ <optional>
+ <attribute name="name"/>
+ </optional>
+ <optional>
+ <attribute name="query"/>
+ </optional>
+ <zeroOrMore>
+ <ref name="diskSourceNetworkHost"/>
+ </zeroOrMore>
+ <optional>
+ <element name="config">
+ <attribute name="file">
+ <ref name="absFilePath"/>
+ </attribute>
+ <empty/>
+ </element>
+ </optional>
+ <empty/>
+ </interleave>
+ </element>
+ </define>
+
<define name="diskSourceNetworkProtocolISCSI">
<element name="source">
<attribute name="protocol">
@@ -2416,6 +2445,7 @@
<ref name="diskSourceNetworkProtocolSimple"/>
<ref name="diskSourceNetworkProtocolVxHS"/>
<ref name="diskSourceNetworkProtocolNFS"/>
+ <ref name="diskSourceNetworkProtocolVitastor"/>
</choice>
</define>
diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c
index 1dc9365bf2..a8a736be81 100644
--- a/src/conf/storage_conf.c
+++ b/src/conf/storage_conf.c
@@ -56,7 +56,7 @@ VIR_ENUM_IMPL(virStoragePool,
"logical", "disk", "iscsi",
"iscsi-direct", "scsi", "mpath",
"rbd", "sheepdog", "gluster",
- "zfs", "vstorage",
+ "zfs", "vstorage", "vitastor",
);
VIR_ENUM_IMPL(virStoragePoolFormatFileSystem,
@@ -242,6 +242,18 @@ static virStoragePoolTypeInfo poolTypeInfo[] = {
.formatToString = virStorageFileFormatTypeToString,
}
},
+ {.poolType = VIR_STORAGE_POOL_VITASTOR,
+ .poolOptions = {
+ .flags = (VIR_STORAGE_POOL_SOURCE_HOST |
+ VIR_STORAGE_POOL_SOURCE_NETWORK |
+ VIR_STORAGE_POOL_SOURCE_NAME),
+ },
+ .volOptions = {
+ .defaultFormat = VIR_STORAGE_FILE_RAW,
+ .formatFromString = virStorageVolumeFormatFromString,
+ .formatToString = virStorageFileFormatTypeToString,
+ }
+ },
{.poolType = VIR_STORAGE_POOL_SHEEPDOG,
.poolOptions = {
.flags = (VIR_STORAGE_POOL_SOURCE_HOST |
@@ -538,6 +550,11 @@ virStoragePoolDefParseSource(xmlXPathContextPtr ctxt,
_("element 'name' is mandatory for RBD pool"));
return -1;
}
+ if (pool_type == VIR_STORAGE_POOL_VITASTOR && source->name == NULL) {
+ virReportError(VIR_ERR_XML_ERROR, "%s",
+ _("element 'name' is mandatory for Vitastor pool"));
+ return -1;
+ }
if (options->formatFromString) {
g_autofree char *format = NULL;
@@ -1127,6 +1144,7 @@ virStoragePoolDefFormatBuf(virBuffer *buf,
/* RBD, Sheepdog, Gluster and Iscsi-direct devices are not local block devs nor
* files, so they don't have a target */
if (def->type != VIR_STORAGE_POOL_RBD &&
+ def->type != VIR_STORAGE_POOL_VITASTOR &&
def->type != VIR_STORAGE_POOL_SHEEPDOG &&
def->type != VIR_STORAGE_POOL_GLUSTER &&
def->type != VIR_STORAGE_POOL_ISCSI_DIRECT) {
diff --git a/src/conf/storage_conf.h b/src/conf/storage_conf.h
index fc67957cfe..720c07ef74 100644
--- a/src/conf/storage_conf.h
+++ b/src/conf/storage_conf.h
@@ -103,6 +103,7 @@ typedef enum {
VIR_STORAGE_POOL_GLUSTER, /* Gluster device */
VIR_STORAGE_POOL_ZFS, /* ZFS */
VIR_STORAGE_POOL_VSTORAGE, /* Virtuozzo Storage */
+ VIR_STORAGE_POOL_VITASTOR, /* Vitastor */
VIR_STORAGE_POOL_LAST,
} virStoragePoolType;
@@ -454,6 +455,7 @@ VIR_ENUM_DECL(virStoragePartedFs);
VIR_CONNECT_LIST_STORAGE_POOLS_SCSI | \
VIR_CONNECT_LIST_STORAGE_POOLS_MPATH | \
VIR_CONNECT_LIST_STORAGE_POOLS_RBD | \
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR | \
VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG | \
VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER | \
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS | \
diff --git a/src/conf/storage_source_conf.c b/src/conf/storage_source_conf.c
index 8a063be244..dd9c7f11a2 100644
--- a/src/conf/storage_source_conf.c
+++ b/src/conf/storage_source_conf.c
@@ -89,6 +89,7 @@ VIR_ENUM_IMPL(virStorageNetProtocol,
"ssh",
"vxhs",
"nfs",
+ "vitastor",
);
@@ -1314,6 +1315,7 @@ virStorageSourceNetworkDefaultPort(virStorageNetProtocol protocol)
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
return 24007;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_RBD:
/* we don't provide a default for RBD */
return 0;
diff --git a/src/conf/storage_source_conf.h b/src/conf/storage_source_conf.h
index ebddf28cd6..873a2be65c 100644
--- a/src/conf/storage_source_conf.h
+++ b/src/conf/storage_source_conf.h
@@ -130,6 +130,7 @@ typedef enum {
VIR_STORAGE_NET_PROTOCOL_SSH,
VIR_STORAGE_NET_PROTOCOL_VXHS,
VIR_STORAGE_NET_PROTOCOL_NFS,
+ VIR_STORAGE_NET_PROTOCOL_VITASTOR,
VIR_STORAGE_NET_PROTOCOL_LAST
} virStorageNetProtocol;
diff --git a/src/conf/virstorageobj.c b/src/conf/virstorageobj.c
index 59fa5da372..4739167f5f 100644
--- a/src/conf/virstorageobj.c
+++ b/src/conf/virstorageobj.c
@@ -1438,6 +1438,7 @@ virStoragePoolObjSourceFindDuplicateCb(const void *payload,
return 1;
break;
+ case VIR_STORAGE_POOL_VITASTOR:
case VIR_STORAGE_POOL_ISCSI_DIRECT:
case VIR_STORAGE_POOL_RBD:
case VIR_STORAGE_POOL_LAST:
@@ -1921,6 +1922,8 @@ virStoragePoolObjMatch(virStoragePoolObj *obj,
(obj->def->type == VIR_STORAGE_POOL_MPATH)) ||
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_RBD) &&
(obj->def->type == VIR_STORAGE_POOL_RBD)) ||
+ (MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR) &&
+ (obj->def->type == VIR_STORAGE_POOL_VITASTOR)) ||
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG) &&
(obj->def->type == VIR_STORAGE_POOL_SHEEPDOG)) ||
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER) &&
diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c
index db7660aac4..561df34709 100644
--- a/src/libvirt-storage.c
+++ b/src/libvirt-storage.c
@@ -94,6 +94,7 @@ virStoragePoolGetConnect(virStoragePoolPtr pool)
* VIR_CONNECT_LIST_STORAGE_POOLS_SCSI
* VIR_CONNECT_LIST_STORAGE_POOLS_MPATH
* VIR_CONNECT_LIST_STORAGE_POOLS_RBD
+ * VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR
* VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG
* VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER
* VIR_CONNECT_LIST_STORAGE_POOLS_ZFS
diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index bdd30dd65a..5353e00b4a 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -1081,6 +1081,7 @@ libxlMakeNetworkDiskSrcStr(virStorageSource *src,
case VIR_STORAGE_NET_PROTOCOL_SSH:
case VIR_STORAGE_NET_PROTOCOL_VXHS:
case VIR_STORAGE_NET_PROTOCOL_NFS:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_LAST:
case VIR_STORAGE_NET_PROTOCOL_NONE:
virReportError(VIR_ERR_NO_SUPPORT,
diff --git a/src/libxl/xen_xl.c b/src/libxl/xen_xl.c
index ec8de30c01..61eab9606d 100644
--- a/src/libxl/xen_xl.c
+++ b/src/libxl/xen_xl.c
@@ -1461,6 +1461,7 @@ xenFormatXLDiskSrcNet(virStorageSource *src)
case VIR_STORAGE_NET_PROTOCOL_SSH:
case VIR_STORAGE_NET_PROTOCOL_VXHS:
case VIR_STORAGE_NET_PROTOCOL_NFS:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_LAST:
case VIR_STORAGE_NET_PROTOCOL_NONE:
virReportError(VIR_ERR_NO_SUPPORT,
diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c
index 32568d4ae6..e625fa0720 100644
--- a/src/qemu/qemu_block.c
+++ b/src/qemu/qemu_block.c
@@ -731,6 +731,38 @@ qemuBlockStorageSourceGetRBDProps(virStorageSource *src,
}
+static virJSONValue *
+qemuBlockStorageSourceGetVitastorProps(virStorageSource *src)
+{
+ virJSONValue *ret = NULL;
+ virStorageNetHostDef *host;
+ size_t i;
+ g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER;
+ g_autofree char *etcd = NULL;
+
+ for (i = 0; i < src->nhosts; i++) {
+ host = src->hosts + i;
+ if ((virStorageNetHostTransport)host->transport != VIR_STORAGE_NET_HOST_TRANS_TCP) {
+ return NULL;
+ }
+ virBufferAsprintf(&buf, i > 0 ? ",%s:%u" : "%s:%u", host->name, host->port);
+ }
+ if (src->nhosts > 0) {
+ etcd = virBufferContentAndReset(&buf);
+ }
+
+ if (virJSONValueObjectAdd(&ret,
+ "S:etcd-host", etcd,
+ "S:etcd-prefix", src->query,
+ "S:config-path", src->configFile,
+ "s:image", src->path,
+ NULL) < 0)
+ return NULL;
+
+ return ret;
+}
+
+
static virJSONValue *
qemuBlockStorageSourceGetSshProps(virStorageSource *src)
{
@@ -1082,6 +1114,12 @@ qemuBlockStorageSourceGetBackendProps(virStorageSource *src,
return NULL;
break;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
+ driver = "vitastor";
+ if (!(fileprops = qemuBlockStorageSourceGetVitastorProps(src)))
+ return NULL;
+ break;
+
case VIR_STORAGE_NET_PROTOCOL_SSH:
driver = "ssh";
if (!(fileprops = qemuBlockStorageSourceGetSshProps(src)))
@@ -1985,6 +2023,7 @@ qemuBlockGetBackingStoreString(virStorageSource *src,
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_RBD:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_VXHS:
case VIR_STORAGE_NET_PROTOCOL_NFS:
case VIR_STORAGE_NET_PROTOCOL_SSH:
@@ -2365,6 +2404,12 @@ qemuBlockStorageSourceCreateGetStorageProps(virStorageSource *src,
return -1;
break;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
+ driver = "vitastor";
+ if (!(location = qemuBlockStorageSourceGetVitastorProps(src)))
+ return -1;
+ break;
+
case VIR_STORAGE_NET_PROTOCOL_SSH:
if (srcPriv->nbdkitProcess) {
/* disk creation not yet supported with nbdkit, and even if it
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 0d2548d8d4..91121d6e1f 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4526,7 +4526,8 @@ qemuDomainValidateStorageSource(virStorageSource *src,
if (src->query &&
(actualType != VIR_STORAGE_TYPE_NETWORK ||
(src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTPS &&
- src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTP))) {
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTP &&
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_VITASTOR))) {
virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
_("query is supported only with HTTP(S) protocols"));
return -1;
@@ -8954,6 +8955,7 @@ qemuDomainPrepareStorageSourceTLS(virStorageSource *src,
break;
case VIR_STORAGE_NET_PROTOCOL_RBD:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index 8128154749..afb339b9b0 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -662,6 +662,7 @@ qemuSnapshotPrepareDiskExternalInactive(virDomainSnapshotDiskDef *snapdisk,
case VIR_STORAGE_NET_PROTOCOL_NONE:
case VIR_STORAGE_NET_PROTOCOL_NBD:
case VIR_STORAGE_NET_PROTOCOL_RBD:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
@@ -887,6 +888,7 @@ qemuSnapshotPrepareDiskInternal(virDomainDiskDef *disk,
case VIR_STORAGE_NET_PROTOCOL_NONE:
case VIR_STORAGE_NET_PROTOCOL_NBD:
case VIR_STORAGE_NET_PROTOCOL_RBD:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index e19e032427..59f91f4710 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1626,6 +1626,7 @@ storageVolLookupByPathCallback(virStoragePoolObj *obj,
case VIR_STORAGE_POOL_GLUSTER:
case VIR_STORAGE_POOL_RBD:
+ case VIR_STORAGE_POOL_VITASTOR:
case VIR_STORAGE_POOL_SHEEPDOG:
case VIR_STORAGE_POOL_ZFS:
case VIR_STORAGE_POOL_LAST:
diff --git a/src/storage_file/storage_source_backingstore.c b/src/storage_file/storage_source_backingstore.c
index 80681924ea..8a3ade9ec0 100644
--- a/src/storage_file/storage_source_backingstore.c
+++ b/src/storage_file/storage_source_backingstore.c
@@ -287,6 +287,75 @@ virStorageSourceParseRBDColonString(const char *rbdstr,
}
+static int
+virStorageSourceParseVitastorColonString(const char *colonstr,
+ virStorageSource *src)
+{
+ char *p, *e, *next;
+ g_autofree char *options = NULL;
+
+ /* optionally skip the "vitastor:" prefix if provided */
+ if (STRPREFIX(colonstr, "vitastor:"))
+ colonstr += strlen("vitastor:");
+
+ options = g_strdup(colonstr);
+
+ p = options;
+ while (*p) {
+ /* find : delimiter or end of string */
+ for (e = p; *e && *e != ':'; ++e) {
+ if (*e == '\\') {
+ e++;
+ if (*e == '\0')
+ break;
+ }
+ }
+ if (*e == '\0') {
+ next = e; /* last kv pair */
+ } else {
+ next = e + 1;
+ *e = '\0';
+ }
+
+ if (STRPREFIX(p, "image=")) {
+ src->path = g_strdup(p + strlen("image="));
+ } else if (STRPREFIX(p, "etcd-prefix=")) {
+ src->query = g_strdup(p + strlen("etcd-prefix="));
+ } else if (STRPREFIX(p, "config-path=")) {
+ src->configFile = g_strdup(p + strlen("config-path="));
+ } else if (STRPREFIX(p, "etcd-host=")) {
+ char *h, *sep;
+
+ h = p + strlen("etcd-host=");
+ while (h < e) {
+ for (sep = h; sep < e; ++sep) {
+ if (*sep == '\\' && (sep[1] == ',' ||
+ sep[1] == ';' ||
+ sep[1] == ' ')) {
+ *sep = '\0';
+ sep += 2;
+ break;
+ }
+ }
+
+ if (virStorageSourceRBDAddHost(src, h) < 0)
+ return -1;
+
+ h = sep;
+ }
+ }
+
+ p = next;
+ }
+
+ if (!src->path) {
+ return -1;
+ }
+
+ return 0;
+}
+
+
static int
virStorageSourceParseNBDColonString(const char *nbdstr,
virStorageSource *src)
@@ -399,6 +468,11 @@ virStorageSourceParseBackingColon(virStorageSource *src,
return -1;
break;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
+ if (virStorageSourceParseVitastorColonString(path, src) < 0)
+ return -1;
+ break;
+
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_LAST:
case VIR_STORAGE_NET_PROTOCOL_NONE:
@@ -975,6 +1049,54 @@ virStorageSourceParseBackingJSONRBD(virStorageSource *src,
return 0;
}
+static int
+virStorageSourceParseBackingJSONVitastor(virStorageSource *src,
+ virJSONValue *json,
+ const char *jsonstr G_GNUC_UNUSED,
+ int opaque G_GNUC_UNUSED)
+{
+ const char *filename;
+ const char *image = virJSONValueObjectGetString(json, "image");
+ const char *conf = virJSONValueObjectGetString(json, "config-path");
+ const char *etcd_prefix = virJSONValueObjectGetString(json, "etcd-prefix");
+ virJSONValue *servers = virJSONValueObjectGetArray(json, "server");
+ size_t nservers;
+ size_t i;
+
+ src->type = VIR_STORAGE_TYPE_NETWORK;
+ src->protocol = VIR_STORAGE_NET_PROTOCOL_VITASTOR;
+
+ /* legacy syntax passed via 'filename' option */
+ if ((filename = virJSONValueObjectGetString(json, "filename")))
+ return virStorageSourceParseVitastorColonString(filename, src);
+
+ if (!image) {
+ virReportError(VIR_ERR_INVALID_ARG, "%s",
+ _("missing image name in Vitastor backing volume "
+ "JSON specification"));
+ return -1;
+ }
+
+ src->path = g_strdup(image);
+ src->configFile = g_strdup(conf);
+ src->query = g_strdup(etcd_prefix);
+
+ if (servers) {
+ nservers = virJSONValueArraySize(servers);
+
+ src->hosts = g_new0(virStorageNetHostDef, nservers);
+ src->nhosts = nservers;
+
+ for (i = 0; i < nservers; i++) {
+ if (virStorageSourceParseBackingJSONInetSocketAddress(src->hosts + i,
+ virJSONValueArrayGet(servers, i)) < 0)
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
static int
virStorageSourceParseBackingJSONRaw(virStorageSource *src,
virJSONValue *json,
@@ -1152,6 +1274,7 @@ static const struct virStorageSourceJSONDriverParser jsonParsers[] = {
{"sheepdog", false, virStorageSourceParseBackingJSONSheepdog, 0},
{"ssh", false, virStorageSourceParseBackingJSONSSH, 0},
{"rbd", false, virStorageSourceParseBackingJSONRBD, 0},
+ {"vitastor", false, virStorageSourceParseBackingJSONVitastor, 0},
{"raw", true, virStorageSourceParseBackingJSONRaw, 0},
{"nfs", false, virStorageSourceParseBackingJSONNFS, 0},
{"vxhs", false, virStorageSourceParseBackingJSONVxHS, 0},
diff --git a/src/test/test_driver.c b/src/test/test_driver.c
index 25335d9002..cf54069fbe 100644
--- a/src/test/test_driver.c
+++ b/src/test/test_driver.c
@@ -7340,6 +7340,7 @@ testStorageVolumeTypeForPool(int pooltype)
case VIR_STORAGE_POOL_ISCSI_DIRECT:
case VIR_STORAGE_POOL_GLUSTER:
case VIR_STORAGE_POOL_RBD:
+ case VIR_STORAGE_POOL_VITASTOR:
return VIR_STORAGE_VOL_NETWORK;
case VIR_STORAGE_POOL_LOGICAL:
case VIR_STORAGE_POOL_DISK:
diff --git a/tests/storagepoolcapsschemadata/poolcaps-fs.xml b/tests/storagepoolcapsschemadata/poolcaps-fs.xml
index eee75af746..8bd0a57bdd 100644
--- a/tests/storagepoolcapsschemadata/poolcaps-fs.xml
+++ b/tests/storagepoolcapsschemadata/poolcaps-fs.xml
@@ -204,4 +204,11 @@
</enum>
</volOptions>
</pool>
+ <pool type='vitastor' supported='no'>
+ <volOptions>
+ <defaultFormat type='raw'/>
+ <enum name='targetFormatType'>
+ </enum>
+ </volOptions>
+ </pool>
</storagepoolCapabilities>
diff --git a/tests/storagepoolcapsschemadata/poolcaps-full.xml b/tests/storagepoolcapsschemadata/poolcaps-full.xml
index 805950a937..852df0de16 100644
--- a/tests/storagepoolcapsschemadata/poolcaps-full.xml
+++ b/tests/storagepoolcapsschemadata/poolcaps-full.xml
@@ -204,4 +204,11 @@
</enum>
</volOptions>
</pool>
+ <pool type='vitastor' supported='yes'>
+ <volOptions>
+ <defaultFormat type='raw'/>
+ <enum name='targetFormatType'>
+ </enum>
+ </volOptions>
+ </pool>
</storagepoolCapabilities>
diff --git a/tests/storagepoolxml2argvtest.c b/tests/storagepoolxml2argvtest.c
index d5c2531ab8..b19308ac38 100644
--- a/tests/storagepoolxml2argvtest.c
+++ b/tests/storagepoolxml2argvtest.c
@@ -57,6 +57,7 @@ testCompareXMLToArgvFiles(bool shouldFail,
case VIR_STORAGE_POOL_GLUSTER:
case VIR_STORAGE_POOL_ZFS:
case VIR_STORAGE_POOL_VSTORAGE:
+ case VIR_STORAGE_POOL_VITASTOR:
case VIR_STORAGE_POOL_LAST:
default:
VIR_TEST_DEBUG("pool type '%s' has no xml2argv test", defTypeStr);
diff --git a/tools/virsh-pool.c b/tools/virsh-pool.c
index 2010ef1356..072e2ff9e8 100644
--- a/tools/virsh-pool.c
+++ b/tools/virsh-pool.c
@@ -1187,6 +1187,9 @@ cmdPoolList(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED)
case VIR_STORAGE_POOL_VSTORAGE:
flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE;
break;
+ case VIR_STORAGE_POOL_VITASTOR:
+ flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR;
+ break;
case VIR_STORAGE_POOL_LAST:
break;
}
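
With this patch applied, a Vitastor disk can be described in libvirt domain XML much like an RBD one. The snippet below is an illustrative example matching the diskSourceNetworkProtocolVitastor schema added above; the image name, etcd address, domain name and config path are placeholders, not values taken from the patch:

# Attach a Vitastor-backed disk to a domain (illustrative values)
cat > vitastor-disk.xml <<'EOF'
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='vitastor' name='testimg' query='/vitastor'>
    <host name='192.168.7.2' port='2379'/>
    <config file='/etc/vitastor/vitastor.conf'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
EOF
virsh attach-device testvm vitastor-disk.xml --persistent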

View File

@@ -0,0 +1,172 @@
Index: pve-qemu-kvm-10.0.2/block/meson.build
===================================================================
--- pve-qemu-kvm-10.0.2.orig/block/meson.build
+++ pve-qemu-kvm-10.0.2/block/meson.build
@@ -126,6 +126,7 @@ foreach m : [
[libnfs, 'nfs', files('nfs.c')],
[libssh, 'ssh', files('ssh.c')],
[rbd, 'rbd', files('rbd.c')],
+ [vitastor, 'vitastor', files('vitastor.c')],
]
if m[0].found()
module_ss = ss.source_set()
Index: pve-qemu-kvm-10.0.2/meson.build
===================================================================
--- pve-qemu-kvm-10.0.2.orig/meson.build
+++ pve-qemu-kvm-10.0.2/meson.build
@@ -1622,6 +1622,26 @@ if not get_option('rbd').auto() or have_
endif
endif
+vitastor = not_found
+if not get_option('vitastor').auto() or have_block
+ libvitastor_client = cc.find_library('vitastor_client', has_headers: ['vitastor_c.h'],
+ required: get_option('vitastor'))
+ if libvitastor_client.found()
+ if cc.links('''
+ #include <vitastor_c.h>
+ int main(void) {
+ vitastor_c_create_qemu(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
+ return 0;
+ }''', dependencies: libvitastor_client)
+ vitastor = declare_dependency(dependencies: libvitastor_client)
+ elif get_option('vitastor').enabled()
+ error('could not link libvitastor_client')
+ else
+ warning('could not link libvitastor_client, disabling')
+ endif
+ endif
+endif
+
glusterfs = not_found
glusterfs_ftruncate_has_stat = false
glusterfs_iocb_has_stat = false
@@ -2514,6 +2534,7 @@ endif
config_host_data.set('CONFIG_OPENGL', opengl.found())
config_host_data.set('CONFIG_PLUGIN', get_option('plugins'))
config_host_data.set('CONFIG_RBD', rbd.found())
+config_host_data.set('CONFIG_VITASTOR', vitastor.found())
config_host_data.set('CONFIG_RDMA', rdma.found())
config_host_data.set('CONFIG_RELOCATABLE', get_option('relocatable'))
config_host_data.set('CONFIG_SAFESTACK', get_option('safe_stack'))
@@ -4812,6 +4833,7 @@ summary_info += {'fdt support': fd
summary_info += {'libcap-ng support': libcap_ng}
summary_info += {'bpf support': libbpf}
summary_info += {'rbd support': rbd}
+summary_info += {'vitastor support': vitastor}
summary_info += {'smartcard support': cacard}
summary_info += {'U2F support': u2f}
summary_info += {'libusb': libusb}
Index: pve-qemu-kvm-10.0.2/meson_options.txt
===================================================================
--- pve-qemu-kvm-10.0.2.orig/meson_options.txt
+++ pve-qemu-kvm-10.0.2/meson_options.txt
@@ -202,6 +202,8 @@ option('pvg', type: 'feature', value: 'a
description: 'macOS paravirtualized graphics support')
option('rbd', type : 'feature', value : 'auto',
description: 'Ceph block device driver')
+option('vitastor', type : 'feature', value : 'auto',
+ description: 'Vitastor block device driver')
option('opengl', type : 'feature', value : 'auto',
description: 'OpenGL support')
option('rdma', type : 'feature', value : 'auto',
Index: pve-qemu-kvm-10.0.2/qapi/block-core.json
===================================================================
--- pve-qemu-kvm-10.0.2.orig/qapi/block-core.json
+++ pve-qemu-kvm-10.0.2/qapi/block-core.json
@@ -3599,7 +3599,7 @@
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
'pbs',
- 'ssh', 'throttle', 'vdi', 'vhdx',
+ 'ssh', 'throttle', 'vdi', 'vhdx', 'vitastor',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
@@ -4725,6 +4725,28 @@
'*server': ['InetSocketAddressBase'] } }
##
+# @BlockdevOptionsVitastor:
+#
+# Driver specific block device options for vitastor
+#
+# @image: Image name
+# @inode: Inode number
+# @pool: Pool ID
+# @size: Desired image size in bytes
+# @config-path: Path to Vitastor configuration
+# @etcd-host: etcd connection address(es)
+# @etcd-prefix: etcd key/value prefix
+##
+{ 'struct': 'BlockdevOptionsVitastor',
+ 'data': { '*inode': 'uint64',
+ '*pool': 'uint64',
+ '*size': 'uint64',
+ '*image': 'str',
+ '*config-path': 'str',
+ '*etcd-host': 'str',
+ '*etcd-prefix': 'str' } }
+
+##
# @ReplicationMode:
#
# An enumeration of replication modes.
@@ -5194,6 +5216,7 @@
'throttle': 'BlockdevOptionsThrottle',
'vdi': 'BlockdevOptionsGenericFormat',
'vhdx': 'BlockdevOptionsGenericFormat',
+ 'vitastor': 'BlockdevOptionsVitastor',
'virtio-blk-vfio-pci':
{ 'type': 'BlockdevOptionsVirtioBlkVfioPci',
'if': 'CONFIG_BLKIO' },
@@ -5674,6 +5697,20 @@
'*encrypt' : 'RbdEncryptionCreateOptions' } }
##
+# @BlockdevCreateOptionsVitastor:
+#
+# Driver specific image creation options for Vitastor.
+#
+# @location: Where to store the new image file. This location cannot
+# point to a snapshot.
+#
+# @size: Size of the virtual disk in bytes
+##
+{ 'struct': 'BlockdevCreateOptionsVitastor',
+ 'data': { 'location': 'BlockdevOptionsVitastor',
+ 'size': 'size' } }
+
+##
# @BlockdevVmdkSubformat:
#
# Subformat options for VMDK images
@@ -5895,6 +5932,7 @@
'ssh': 'BlockdevCreateOptionsSsh',
'vdi': 'BlockdevCreateOptionsVdi',
'vhdx': 'BlockdevCreateOptionsVhdx',
+ 'vitastor': 'BlockdevCreateOptionsVitastor',
'vmdk': 'BlockdevCreateOptionsVmdk',
'vpc': 'BlockdevCreateOptionsVpc'
} }
Index: pve-qemu-kvm-10.0.2/scripts/meson-buildoptions.sh
===================================================================
--- pve-qemu-kvm-10.0.2.orig/scripts/meson-buildoptions.sh
+++ pve-qemu-kvm-10.0.2/scripts/meson-buildoptions.sh
@@ -175,6 +175,7 @@ meson_options_help() {
printf "%s\n" ' qga-vss build QGA VSS support (broken with MinGW)'
printf "%s\n" ' qpl Query Processing Library support'
printf "%s\n" ' rbd Ceph block device driver'
+ printf "%s\n" ' vitastor Vitastor block device driver'
printf "%s\n" ' rdma Enable RDMA-based migration'
printf "%s\n" ' replication replication support'
printf "%s\n" ' rust Rust support'
@@ -458,6 +459,8 @@ _meson_option_parse() {
--disable-qpl) printf "%s" -Dqpl=disabled ;;
--enable-rbd) printf "%s" -Drbd=enabled ;;
--disable-rbd) printf "%s" -Drbd=disabled ;;
+ --enable-vitastor) printf "%s" -Dvitastor=enabled ;;
+ --disable-vitastor) printf "%s" -Dvitastor=disabled ;;
--enable-rdma) printf "%s" -Drdma=enabled ;;
--disable-rdma) printf "%s" -Drdma=disabled ;;
--enable-relocatable) printf "%s" -Drelocatable=true ;;
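
The patch wires Vitastor into QEMU's meson options the same way as rbd. Assuming a pve-qemu tree with this patch applied and the Vitastor client library installed, the driver can be toggled like any other feature; a sketch:

# Build QEMU with the Vitastor driver enabled (paths/versions vary)
./configure --enable-vitastor
# or, for an already-configured build directory:
meson configure build -Dvitastor=enabled
ninja -C build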

View File

@@ -0,0 +1,172 @@
Index: pve-qemu-kvm-9.2.0/block/meson.build
===================================================================
--- pve-qemu-kvm-9.2.0.orig/block/meson.build
+++ pve-qemu-kvm-9.2.0/block/meson.build
@@ -126,6 +126,7 @@ foreach m : [
[libnfs, 'nfs', files('nfs.c')],
[libssh, 'ssh', files('ssh.c')],
[rbd, 'rbd', files('rbd.c')],
+ [vitastor, 'vitastor', files('vitastor.c')],
]
if m[0].found()
module_ss = ss.source_set()
Index: pve-qemu-kvm-9.2.0/meson.build
===================================================================
--- pve-qemu-kvm-9.2.0.orig/meson.build
+++ pve-qemu-kvm-9.2.0/meson.build
@@ -1590,6 +1590,26 @@ if not get_option('rbd').auto() or have_
endif
endif
+vitastor = not_found
+if not get_option('vitastor').auto() or have_block
+ libvitastor_client = cc.find_library('vitastor_client', has_headers: ['vitastor_c.h'],
+ required: get_option('vitastor'))
+ if libvitastor_client.found()
+ if cc.links('''
+ #include <vitastor_c.h>
+ int main(void) {
+ vitastor_c_create_qemu(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
+ return 0;
+ }''', dependencies: libvitastor_client)
+ vitastor = declare_dependency(dependencies: libvitastor_client)
+ elif get_option('vitastor').enabled()
+ error('could not link libvitastor_client')
+ else
+ warning('could not link libvitastor_client, disabling')
+ endif
+ endif
+endif
+
glusterfs = not_found
glusterfs_ftruncate_has_stat = false
glusterfs_iocb_has_stat = false
@@ -2478,6 +2498,7 @@ endif
config_host_data.set('CONFIG_OPENGL', opengl.found())
config_host_data.set('CONFIG_PLUGIN', get_option('plugins'))
config_host_data.set('CONFIG_RBD', rbd.found())
+config_host_data.set('CONFIG_VITASTOR', vitastor.found())
config_host_data.set('CONFIG_RDMA', rdma.found())
config_host_data.set('CONFIG_RELOCATABLE', get_option('relocatable'))
config_host_data.set('CONFIG_SAFESTACK', get_option('safe_stack'))
@@ -4789,6 +4810,7 @@ summary_info += {'fdt support': fd
summary_info += {'libcap-ng support': libcap_ng}
summary_info += {'bpf support': libbpf}
summary_info += {'rbd support': rbd}
+summary_info += {'vitastor support': vitastor}
summary_info += {'smartcard support': cacard}
summary_info += {'U2F support': u2f}
summary_info += {'libusb': libusb}
Index: pve-qemu-kvm-9.2.0/meson_options.txt
===================================================================
--- pve-qemu-kvm-9.2.0.orig/meson_options.txt
+++ pve-qemu-kvm-9.2.0/meson_options.txt
@@ -200,6 +200,8 @@ option('lzo', type : 'feature', value :
description: 'lzo compression support')
option('rbd', type : 'feature', value : 'auto',
description: 'Ceph block device driver')
+option('vitastor', type : 'feature', value : 'auto',
+ description: 'Vitastor block device driver')
option('opengl', type : 'feature', value : 'auto',
description: 'OpenGL support')
option('rdma', type : 'feature', value : 'auto',
Index: pve-qemu-kvm-9.2.0/qapi/block-core.json
===================================================================
--- pve-qemu-kvm-9.2.0.orig/qapi/block-core.json
+++ pve-qemu-kvm-9.2.0/qapi/block-core.json
@@ -3481,7 +3481,7 @@
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
'pbs',
- 'ssh', 'throttle', 'vdi', 'vhdx',
+ 'ssh', 'throttle', 'vdi', 'vhdx', 'vitastor',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
@@ -4592,6 +4592,28 @@
'*server': ['InetSocketAddressBase'] } }
##
+# @BlockdevOptionsVitastor:
+#
+# Driver specific block device options for vitastor
+#
+# @image: Image name
+# @inode: Inode number
+# @pool: Pool ID
+# @size: Desired image size in bytes
+# @config-path: Path to Vitastor configuration
+# @etcd-host: etcd connection address(es)
+# @etcd-prefix: etcd key/value prefix
+##
+{ 'struct': 'BlockdevOptionsVitastor',
+ 'data': { '*inode': 'uint64',
+ '*pool': 'uint64',
+ '*size': 'uint64',
+ '*image': 'str',
+ '*config-path': 'str',
+ '*etcd-host': 'str',
+ '*etcd-prefix': 'str' } }
+
+##
# @ReplicationMode:
#
# An enumeration of replication modes.
@@ -5054,6 +5076,7 @@
'throttle': 'BlockdevOptionsThrottle',
'vdi': 'BlockdevOptionsGenericFormat',
'vhdx': 'BlockdevOptionsGenericFormat',
+ 'vitastor': 'BlockdevOptionsVitastor',
'virtio-blk-vfio-pci':
{ 'type': 'BlockdevOptionsVirtioBlkVfioPci',
'if': 'CONFIG_BLKIO' },
@@ -5501,6 +5524,20 @@
'*encrypt' : 'RbdEncryptionCreateOptions' } }
##
+# @BlockdevCreateOptionsVitastor:
+#
+# Driver specific image creation options for Vitastor.
+#
+# @location: Where to store the new image file. This location cannot
+# point to a snapshot.
+#
+# @size: Size of the virtual disk in bytes
+##
+{ 'struct': 'BlockdevCreateOptionsVitastor',
+ 'data': { 'location': 'BlockdevOptionsVitastor',
+ 'size': 'size' } }
+
+##
# @BlockdevVmdkSubformat:
#
# Subformat options for VMDK images
@@ -5722,6 +5759,7 @@
'ssh': 'BlockdevCreateOptionsSsh',
'vdi': 'BlockdevCreateOptionsVdi',
'vhdx': 'BlockdevCreateOptionsVhdx',
+ 'vitastor': 'BlockdevCreateOptionsVitastor',
'vmdk': 'BlockdevCreateOptionsVmdk',
'vpc': 'BlockdevCreateOptionsVpc'
} }
Index: pve-qemu-kvm-9.2.0/scripts/meson-buildoptions.sh
===================================================================
--- pve-qemu-kvm-9.2.0.orig/scripts/meson-buildoptions.sh
+++ pve-qemu-kvm-9.2.0/scripts/meson-buildoptions.sh
@@ -174,6 +174,7 @@ meson_options_help() {
printf "%s\n" ' qga-vss build QGA VSS support (broken with MinGW)'
printf "%s\n" ' qpl Query Processing Library support'
printf "%s\n" ' rbd Ceph block device driver'
+ printf "%s\n" ' vitastor Vitastor block device driver'
printf "%s\n" ' rdma Enable RDMA-based migration'
printf "%s\n" ' replication replication support'
printf "%s\n" ' rust Rust support'
@@ -455,6 +456,8 @@ _meson_option_parse() {
--disable-qpl) printf "%s" -Dqpl=disabled ;;
--enable-rbd) printf "%s" -Drbd=enabled ;;
--disable-rbd) printf "%s" -Drbd=disabled ;;
+ --enable-vitastor) printf "%s" -Dvitastor=enabled ;;
+ --disable-vitastor) printf "%s" -Dvitastor=disabled ;;
--enable-rdma) printf "%s" -Drdma=enabled ;;
--disable-rdma) printf "%s" -Drdma=disabled ;;
--enable-relocatable) printf "%s" -Drelocatable=true ;;

View File

@@ -0,0 +1,172 @@
diff --git a/block/meson.build b/block/meson.build
index 34b1b2a306..24ca0f1e52 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -114,6 +114,7 @@ foreach m : [
[libnfs, 'nfs', files('nfs.c')],
[libssh, 'ssh', files('ssh.c')],
[rbd, 'rbd', files('rbd.c')],
+ [vitastor, 'vitastor', files('vitastor.c')],
]
if m[0].found()
module_ss = ss.source_set()
diff --git a/meson.build b/meson.build
index 41f68d3806..29eaed9ba4 100644
--- a/meson.build
+++ b/meson.build
@@ -1622,6 +1622,26 @@ if not get_option('rbd').auto() or have_block
endif
endif
+vitastor = not_found
+if not get_option('vitastor').auto() or have_block
+ libvitastor_client = cc.find_library('vitastor_client', has_headers: ['vitastor_c.h'],
+ required: get_option('vitastor'))
+ if libvitastor_client.found()
+ if cc.links('''
+ #include <vitastor_c.h>
+ int main(void) {
+ vitastor_c_create_qemu(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
+ return 0;
+ }''', dependencies: libvitastor_client)
+ vitastor = declare_dependency(dependencies: libvitastor_client)
+ elif get_option('vitastor').enabled()
+ error('could not link libvitastor_client')
+ else
+ warning('could not link libvitastor_client, disabling')
+ endif
+ endif
+endif
+
glusterfs = not_found
glusterfs_ftruncate_has_stat = false
glusterfs_iocb_has_stat = false
@@ -2506,6 +2526,7 @@ endif
config_host_data.set('CONFIG_OPENGL', opengl.found())
config_host_data.set('CONFIG_PLUGIN', get_option('plugins'))
config_host_data.set('CONFIG_RBD', rbd.found())
+config_host_data.set('CONFIG_VITASTOR', vitastor.found())
config_host_data.set('CONFIG_RDMA', rdma.found())
config_host_data.set('CONFIG_RELOCATABLE', get_option('relocatable'))
config_host_data.set('CONFIG_SAFESTACK', get_option('safe_stack'))
@@ -4813,6 +4834,7 @@ summary_info += {'fdt support': fdt_opt == 'internal' ? 'internal' : fdt}
summary_info += {'libcap-ng support': libcap_ng}
summary_info += {'bpf support': libbpf}
summary_info += {'rbd support': rbd}
+summary_info += {'vitastor support': vitastor}
summary_info += {'smartcard support': cacard}
summary_info += {'U2F support': u2f}
summary_info += {'libusb': libusb}
diff --git a/meson_options.txt b/meson_options.txt
index 59d973bca0..a3e7123980 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -202,6 +202,8 @@ option('pvg', type: 'feature', value: 'auto',
description: 'macOS paravirtualized graphics support')
option('rbd', type : 'feature', value : 'auto',
description: 'Ceph block device driver')
+option('vitastor', type : 'feature', value : 'auto',
+ description: 'Vitastor block device driver')
option('opengl', type : 'feature', value : 'auto',
description: 'OpenGL support')
option('rdma', type : 'feature', value : 'auto',
diff --git a/qapi/block-core.json b/qapi/block-core.json
index b1937780e1..a511193620 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3216,7 +3216,7 @@
'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
- 'ssh', 'throttle', 'vdi', 'vhdx',
+ 'ssh', 'throttle', 'vdi', 'vhdx', 'vitastor',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
@@ -4299,6 +4299,28 @@
'*key-secret': 'str',
'*server': ['InetSocketAddressBase'] } }
+##
+# @BlockdevOptionsVitastor:
+#
+# Driver specific block device options for vitastor
+#
+# @image: Image name
+# @inode: Inode number
+# @pool: Pool ID
+# @size: Desired image size in bytes
+# @config-path: Path to Vitastor configuration
+# @etcd-host: etcd connection address(es)
+# @etcd-prefix: etcd key/value prefix
+##
+{ 'struct': 'BlockdevOptionsVitastor',
+ 'data': { '*inode': 'uint64',
+ '*pool': 'uint64',
+ '*size': 'uint64',
+ '*image': 'str',
+ '*config-path': 'str',
+ '*etcd-host': 'str',
+ '*etcd-prefix': 'str' } }
+
##
# @ReplicationMode:
#
@@ -4767,6 +4789,7 @@
'throttle': 'BlockdevOptionsThrottle',
'vdi': 'BlockdevOptionsGenericFormat',
'vhdx': 'BlockdevOptionsGenericFormat',
+ 'vitastor': 'BlockdevOptionsVitastor',
'virtio-blk-vfio-pci':
{ 'type': 'BlockdevOptionsVirtioBlkVfioPci',
'if': 'CONFIG_BLKIO' },
@@ -5240,6 +5263,20 @@
'*cluster-size' : 'size',
'*encrypt' : 'RbdEncryptionCreateOptions' } }
+##
+# @BlockdevCreateOptionsVitastor:
+#
+# Driver specific image creation options for Vitastor.
+#
+# @location: Where to store the new image file. This location cannot
+# point to a snapshot.
+#
+# @size: Size of the virtual disk in bytes
+##
+{ 'struct': 'BlockdevCreateOptionsVitastor',
+ 'data': { 'location': 'BlockdevOptionsVitastor',
+ 'size': 'size' } }
+
##
# @BlockdevVmdkSubformat:
#
@@ -5462,6 +5499,7 @@
'ssh': 'BlockdevCreateOptionsSsh',
'vdi': 'BlockdevCreateOptionsVdi',
'vhdx': 'BlockdevCreateOptionsVhdx',
+ 'vitastor': 'BlockdevCreateOptionsVitastor',
'vmdk': 'BlockdevCreateOptionsVmdk',
'vpc': 'BlockdevCreateOptionsVpc'
} }
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index 3e8e00852b..45aff3b6a9 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -175,6 +175,7 @@ meson_options_help() {
printf "%s\n" ' qga-vss build QGA VSS support (broken with MinGW)'
printf "%s\n" ' qpl Query Processing Library support'
printf "%s\n" ' rbd Ceph block device driver'
+ printf "%s\n" ' vitastor Vitastor block device driver'
printf "%s\n" ' rdma Enable RDMA-based migration'
printf "%s\n" ' replication replication support'
printf "%s\n" ' rust Rust support'
@@ -458,6 +459,8 @@ _meson_option_parse() {
--disable-qpl) printf "%s" -Dqpl=disabled ;;
--enable-rbd) printf "%s" -Drbd=enabled ;;
--disable-rbd) printf "%s" -Drbd=disabled ;;
+ --enable-vitastor) printf "%s" -Dvitastor=enabled ;;
+ --disable-vitastor) printf "%s" -Dvitastor=disabled ;;
--enable-rdma) printf "%s" -Drdma=enabled ;;
--disable-rdma) printf "%s" -Drdma=disabled ;;
--enable-relocatable) printf "%s" -Drelocatable=true ;;

View File

@@ -0,0 +1,172 @@
diff --git a/block/meson.build b/block/meson.build
index f1262ec2ba..3cf3e23f16 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -114,6 +114,7 @@ foreach m : [
[libnfs, 'nfs', files('nfs.c')],
[libssh, 'ssh', files('ssh.c')],
[rbd, 'rbd', files('rbd.c')],
+ [vitastor, 'vitastor', files('vitastor.c')],
]
if m[0].found()
module_ss = ss.source_set()
diff --git a/meson.build b/meson.build
index 147097c652..2486b3aeb5 100644
--- a/meson.build
+++ b/meson.build
@@ -1590,6 +1590,26 @@ if not get_option('rbd').auto() or have_block
endif
endif
+vitastor = not_found
+if not get_option('vitastor').auto() or have_block
+ libvitastor_client = cc.find_library('vitastor_client', has_headers: ['vitastor_c.h'],
+ required: get_option('vitastor'))
+ if libvitastor_client.found()
+ if cc.links('''
+ #include <vitastor_c.h>
+ int main(void) {
+ vitastor_c_create_qemu(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
+ return 0;
+ }''', dependencies: libvitastor_client)
+ vitastor = declare_dependency(dependencies: libvitastor_client)
+ elif get_option('vitastor').enabled()
+ error('could not link libvitastor_client')
+ else
+ warning('could not link libvitastor_client, disabling')
+ endif
+ endif
+endif
+
glusterfs = not_found
glusterfs_ftruncate_has_stat = false
glusterfs_iocb_has_stat = false
@@ -2474,6 +2494,7 @@ endif
config_host_data.set('CONFIG_OPENGL', opengl.found())
config_host_data.set('CONFIG_PLUGIN', get_option('plugins'))
config_host_data.set('CONFIG_RBD', rbd.found())
+config_host_data.set('CONFIG_VITASTOR', vitastor.found())
config_host_data.set('CONFIG_RDMA', rdma.found())
config_host_data.set('CONFIG_RELOCATABLE', get_option('relocatable'))
config_host_data.set('CONFIG_SAFESTACK', get_option('safe_stack'))
@@ -4778,6 +4799,7 @@ summary_info += {'fdt support': fdt_opt == 'internal' ? 'internal' : fdt}
summary_info += {'libcap-ng support': libcap_ng}
summary_info += {'bpf support': libbpf}
summary_info += {'rbd support': rbd}
+summary_info += {'vitastor support': vitastor}
summary_info += {'smartcard support': cacard}
summary_info += {'U2F support': u2f}
summary_info += {'libusb': libusb}
diff --git a/meson_options.txt b/meson_options.txt
index 5eeaf3eee5..b04eda29f9 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -200,6 +200,8 @@ option('lzo', type : 'feature', value : 'auto',
description: 'lzo compression support')
option('rbd', type : 'feature', value : 'auto',
description: 'Ceph block device driver')
+option('vitastor', type : 'feature', value : 'auto',
+ description: 'Vitastor block device driver')
option('opengl', type : 'feature', value : 'auto',
description: 'OpenGL support')
option('rdma', type : 'feature', value : 'auto',
diff --git a/qapi/block-core.json b/qapi/block-core.json
index fd3bcc1c17..41571ac3f9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3212,7 +3212,7 @@
'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
- 'ssh', 'throttle', 'vdi', 'vhdx',
+ 'ssh', 'throttle', 'vdi', 'vhdx', 'vitastor',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
@@ -4295,6 +4295,28 @@
'*key-secret': 'str',
'*server': ['InetSocketAddressBase'] } }
+##
+# @BlockdevOptionsVitastor:
+#
+# Driver specific block device options for vitastor
+#
+# @image: Image name
+# @inode: Inode number
+# @pool: Pool ID
+# @size: Desired image size in bytes
+# @config-path: Path to Vitastor configuration
+# @etcd-host: etcd connection address(es)
+# @etcd-prefix: etcd key/value prefix
+##
+{ 'struct': 'BlockdevOptionsVitastor',
+ 'data': { '*inode': 'uint64',
+ '*pool': 'uint64',
+ '*size': 'uint64',
+ '*image': 'str',
+ '*config-path': 'str',
+ '*etcd-host': 'str',
+ '*etcd-prefix': 'str' } }
+
##
# @ReplicationMode:
#
@@ -4757,6 +4779,7 @@
'throttle': 'BlockdevOptionsThrottle',
'vdi': 'BlockdevOptionsGenericFormat',
'vhdx': 'BlockdevOptionsGenericFormat',
+ 'vitastor': 'BlockdevOptionsVitastor',
'virtio-blk-vfio-pci':
{ 'type': 'BlockdevOptionsVirtioBlkVfioPci',
'if': 'CONFIG_BLKIO' },
@@ -5198,6 +5221,20 @@
'*cluster-size' : 'size',
'*encrypt' : 'RbdEncryptionCreateOptions' } }
+##
+# @BlockdevCreateOptionsVitastor:
+#
+# Driver specific image creation options for Vitastor.
+#
+# @location: Where to store the new image file. This location cannot
+# point to a snapshot.
+#
+# @size: Size of the virtual disk in bytes
+##
+{ 'struct': 'BlockdevCreateOptionsVitastor',
+ 'data': { 'location': 'BlockdevOptionsVitastor',
+ 'size': 'size' } }
+
##
# @BlockdevVmdkSubformat:
#
@@ -5420,6 +5457,7 @@
'ssh': 'BlockdevCreateOptionsSsh',
'vdi': 'BlockdevCreateOptionsVdi',
'vhdx': 'BlockdevCreateOptionsVhdx',
+ 'vitastor': 'BlockdevCreateOptionsVitastor',
'vmdk': 'BlockdevCreateOptionsVmdk',
'vpc': 'BlockdevCreateOptionsVpc'
} }
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index a8066aab03..12e650e7d4 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -174,6 +174,7 @@ meson_options_help() {
printf "%s\n" ' qga-vss build QGA VSS support (broken with MinGW)'
printf "%s\n" ' qpl Query Processing Library support'
printf "%s\n" ' rbd Ceph block device driver'
+ printf "%s\n" ' vitastor Vitastor block device driver'
printf "%s\n" ' rdma Enable RDMA-based migration'
printf "%s\n" ' replication replication support'
printf "%s\n" ' rust Rust support'
@@ -455,6 +456,8 @@ _meson_option_parse() {
--disable-qpl) printf "%s" -Dqpl=disabled ;;
--enable-rbd) printf "%s" -Drbd=enabled ;;
--disable-rbd) printf "%s" -Drbd=disabled ;;
+ --enable-vitastor) printf "%s" -Dvitastor=enabled ;;
+ --disable-vitastor) printf "%s" -Dvitastor=disabled ;;
--enable-rdma) printf "%s" -Drdma=enabled ;;
--disable-rdma) printf "%s" -Drdma=disabled ;;
--enable-relocatable) printf "%s" -Drelocatable=true ;;
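
Once built, the new driver is selectable via -blockdev using the BlockdevOptionsVitastor fields declared above. An illustrative invocation, where the image name and etcd address are placeholders:

# Run a VM from a Vitastor image using the new blockdev driver (illustrative)
qemu-system-x86_64 -accel kvm -m 2048 \
  -blockdev '{"node-name":"vita0","driver":"vitastor","image":"testimg","etcd-host":"192.168.7.2:2379"}' \
  -device virtio-blk-pci,drive=vita0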

View File

@@ -7,22 +7,24 @@ set -e
VITASTOR=$(dirname $0)
VITASTOR=$(realpath "$VITASTOR/..")
- EL=$(rpm --eval '%dist')
- if [ "$EL" = ".el8" ]; then
+ REL=$(rpm --eval '%dist')
+ REL=${REL##.}
+ if [ "$REL" = "el8" ]; then
# CentOS 8
. /opt/rh/gcc-toolset-9/enable
elif [ "$EL" = ".el7" ]; then
elif [ "$REL" = "el7" ]; then
# CentOS 7
. /opt/rh/devtoolset-9/enable
fi
cd ~/rpmbuild/SPECS
rpmbuild -bp fio.spec
cd $VITASTOR
- VER=$(grep ^Version: rpm/vitastor-el7.spec | awk '{print $2}')
+ VER=$(grep ^Version: rpm/vitastor-$REL.spec | awk '{print $2}')
rm -rf fio
ln -s ~/rpmbuild/BUILD/fio*/ fio
sh copy-fio-includes.sh
rm fio
mv fio-copy fio
FIO=`rpm -qi fio | perl -e 'while(<>) { /^Epoch[\s:]+(\S+)/ && print "$1:"; /^Version[\s:]+(\S+)/ && print $1; /^Release[\s:]+(\S+)/ && print "-$1"; }'`
- perl -i -pe 's/(Requires:\s*fio)([^\n]+)?/$1 = '$FIO'/' $VITASTOR/rpm/vitastor-el$EL.spec
- tar --transform "s#^#vitastor-$VER/#" --exclude 'rpm/*.rpm' -czf $VITASTOR/../vitastor-$VER$(rpm --eval '%dist').tar.gz *
+ perl -i -pe 's/(Requires:\s*fio)([^\n]+)?/$1 = '$FIO'/' $VITASTOR/rpm/vitastor-$REL.spec
+ tar --transform "s#^#vitastor-$VER/#" --exclude 'rpm/*.rpm' -czf $VITASTOR/../vitastor-$VER.$REL.tar.gz $(ls | grep -v packages)

rpm/vitastor-build.sh Executable file
View File

@@ -0,0 +1,16 @@
#!/bin/bash
set -e -x
REL=$(rpm --eval '%dist')
REL=${REL##.}
cd /root/vitastor/rpm
./build-tarball.sh
VER=$(grep ^Version: vitastor-$REL.spec | awk '{print $2}')
cp /root/vitastor-$VER.$REL.tar.gz ~/rpmbuild/SOURCES
cp vitastor-$REL.spec ~/rpmbuild/SPECS/vitastor.spec
cd ~/rpmbuild/SPECS/
rpmbuild -ba vitastor.spec
mkdir -p /root/vitastor/packages/vitastor-$REL
rm -rf /root/vitastor/packages/vitastor-$REL/*
cp ~/rpmbuild/RPMS/*/*vitastor* /root/vitastor/packages/vitastor-$REL/
cp ~/rpmbuild/SRPMS/vitastor* /root/vitastor/packages/vitastor-$REL/

View File

@@ -1,5 +1,8 @@
# Build packages for CentOS 7 inside a container
- # cd ..; podman build -t vitastor-el7 -v `pwd`/packages:/root/packages -f rpm/vitastor-el7.Dockerfile .
+ # cd ..
+ # docker build -t vitastor-buildenv:el7 -f rpm/vitastor-el7.Dockerfile .
+ # docker run -i --rm -v ./:/root/vitastor vitastor-buildenv:el7 /root/vitastor/rpm/vitastor-build.sh
# localedef -i ru_RU -f UTF-8 ru_RU.UTF-8
FROM centos:7
@@ -7,7 +10,9 @@ FROM centos:7
WORKDIR /root
RUN rm -f /etc/yum.repos.d/CentOS-Media.repo
RUN sed -i 's/^mirrorlist=/#mirrorlist=/; s!#baseurl=http://mirror.centos.org/centos/\$releasever!baseurl=http://vault.centos.org/7.9.2009!' /etc/yum.repos.d/*.repo
RUN yum -y --enablerepo=extras install centos-release-scl epel-release yum-utils rpm-build
+ RUN perl -i -pe 's!mirrorlist=!#mirrorlist=!s; s!#\s*baseurl=http://mirror.centos.org!baseurl=http://vault.centos.org!' /etc/yum.repos.d/CentOS-SCLo-scl*.repo
+ RUN yum -y install https://vitastor.io/rpms/centos/7/vitastor-release-1.0-1.el7.noarch.rpm
RUN yum -y install devtoolset-9-gcc-c++ devtoolset-9-libatomic-devel gcc make cmake gperftools-devel \
fio rh-nodejs12 jerasure-devel libisa-l-devel gf-complete-devel rdma-core-devel libnl3-devel
@@ -16,32 +21,3 @@ RUN rpm --nomd5 -i fio*.src.rpm
RUN rm -f /etc/yum.repos.d/CentOS-Media.repo
RUN cd ~/rpmbuild/SPECS && yum-builddep -y fio.spec
RUN yum -y install cmake3
- ADD https://vitastor.io/rpms/liburing-el7/liburing-0.7-2.el7.src.rpm /root
- RUN set -e; \
- rpm -i liburing*.src.rpm; \
- cd ~/rpmbuild/SPECS/; \
- . /opt/rh/devtoolset-9/enable; \
- rpmbuild -ba liburing.spec; \
- mkdir -p /root/packages/liburing-el7; \
- rm -rf /root/packages/liburing-el7/*; \
- cp ~/rpmbuild/RPMS/*/liburing* /root/packages/liburing-el7/; \
- cp ~/rpmbuild/SRPMS/liburing* /root/packages/liburing-el7/
- RUN rpm -i `ls /root/packages/liburing-el7/liburing-*.x86_64.rpm | grep -v debug`
- ADD . /root/vitastor
- RUN set -e; \
- cd /root/vitastor/rpm; \
- sh build-tarball.sh; \
- VER=$(grep ^Version: vitastor-el7.spec | awk '{print $2}'); \
- cp /root/vitastor-$VER.el7.tar.gz ~/rpmbuild/SOURCES; \
- cp vitastor-el7.spec ~/rpmbuild/SPECS/vitastor.spec; \
- cd ~/rpmbuild/SPECS/; \
- rpmbuild -ba vitastor.spec; \
- mkdir -p /root/packages/vitastor-el7; \
- rm -rf /root/packages/vitastor-el7/*; \
- cp ~/rpmbuild/RPMS/*/*vitastor* /root/packages/vitastor-el7/; \
- cp ~/rpmbuild/SRPMS/vitastor* /root/packages/vitastor-el7/

View File

@@ -1,13 +1,12 @@
Name: vitastor
- Version: 2.0.0
+ Version: 2.3.0
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
- Source0: vitastor-2.0.0.el7.tar.gz
+ Source0: vitastor-2.3.0.el7.tar.gz
- BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel
BuildRequires: devtoolset-9-gcc-c++
BuildRequires: rh-nodejs12
@@ -35,8 +34,6 @@ size with configurable redundancy (replication or erasure codes/XOR).
Summary: Vitastor - OSD
Requires: libJerasure2
Requires: libisa-l
- Requires: liburing >= 0.6
- Requires: liburing < 2
Requires: vitastor-client = %{version}-%{release}
Requires: util-linux
Requires: parted
@@ -60,8 +57,6 @@ scheduling cluster-level operations.
%package -n vitastor-client
Summary: Vitastor - client
- Requires: liburing >= 0.6
- Requires: liburing < 2
%description -n vitastor-client
@@ -82,7 +77,7 @@ Vitastor library headers for development.
Summary: Vitastor - fio drivers
Group: Development/Libraries
Requires: vitastor-client = %{version}-%{release}
- Requires: fio = 3.7-1.el7
+ Requires: fio = 3.7-2.el7
%description -n vitastor-fio
@@ -169,6 +164,7 @@ chown vitastor:vitastor /var/lib/vitastor
%files -n vitastor-client
%_bindir/vitastor-nbd
+ %_bindir/vitastor-ublk
%_bindir/vitastor-nfs
%_bindir/vitastor-cli
%_bindir/vitastor-rm
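
Since liburing is now built statically into the packages, the resulting RPMs should no longer pull in the system liburing. A quick sanity check against the produced packages; the path assumes the layout created by rpm/vitastor-build.sh above:

# Verify the client package dropped its liburing dependency (illustrative)
rpm -qpR packages/vitastor-el7/vitastor-client-*.rpm | grep -i liburing \
  && echo "still depends on liburing" || echo "no liburing dependency"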

View File

@@ -1,5 +1,7 @@
# Build packages for CentOS 8 inside a container
- # cd ..; podman build -t vitastor-el8 -v `pwd`/packages:/root/packages -f rpm/vitastor-el8.Dockerfile .
+ # cd ..
+ # docker build -t vitastor-buildenv:el8 -f rpm/vitastor-el8.Dockerfile .
+ # docker run -i --rm -v ./:/root/vitastor vitastor-buildenv:el8 /root/vitastor/rpm/vitastor-build.sh
FROM centos:8
@@ -15,32 +17,3 @@ RUN dnf -y install gcc-toolset-9 gcc-toolset-9-gcc-c++ gperftools-devel \
RUN dnf download --source fio
RUN rpm --nomd5 -i fio*.src.rpm
RUN cd ~/rpmbuild/SPECS && dnf builddep -y --enablerepo=powertools --spec fio.spec
- ADD https://vitastor.io/rpms/liburing-el7/liburing-0.7-2.el7.src.rpm /root
- RUN set -e; \
- rpm -i liburing*.src.rpm; \
- cd ~/rpmbuild/SPECS/; \
- . /opt/rh/gcc-toolset-9/enable; \
- rpmbuild -ba liburing.spec; \
- mkdir -p /root/packages/liburing-el8; \
- rm -rf /root/packages/liburing-el8/*; \
- cp ~/rpmbuild/RPMS/*/liburing* /root/packages/liburing-el8/; \
- cp ~/rpmbuild/SRPMS/liburing* /root/packages/liburing-el8/
- RUN rpm -i `ls /root/packages/liburing-el8/liburing-*.x86_64.rpm | grep -v debug`
- ADD . /root/vitastor
- RUN set -e; \
- cd /root/vitastor/rpm; \
- sh build-tarball.sh; \
- VER=$(grep ^Version: vitastor-el8.spec | awk '{print $2}'); \
- cp /root/vitastor-$VER.el8.tar.gz ~/rpmbuild/SOURCES; \
- cp vitastor-el8.spec ~/rpmbuild/SPECS/vitastor.spec; \
- cd ~/rpmbuild/SPECS/; \
- rpmbuild -ba vitastor.spec; \
- mkdir -p /root/packages/vitastor-el8; \
- rm -rf /root/packages/vitastor-el8/*; \
- cp ~/rpmbuild/RPMS/*/*vitastor* /root/packages/vitastor-el8/; \
- cp ~/rpmbuild/SRPMS/vitastor* /root/packages/vitastor-el8/

View File

@@ -1,13 +1,12 @@
Name: vitastor
- Version: 2.0.0
+ Version: 2.3.0
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
Source0: vitastor-2.0.0.el8.tar.gz
Source0: vitastor-2.3.0.el8.tar.gz
BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel
BuildRequires: gcc-toolset-9-gcc-c++
BuildRequires: nodejs >= 10
@@ -34,8 +33,6 @@ size with configurable redundancy (replication or erasure codes/XOR).
Summary: Vitastor - OSD
Requires: libJerasure2
Requires: libisa-l
Requires: liburing >= 0.6
Requires: liburing < 2
Requires: vitastor-client = %{version}-%{release}
Requires: util-linux
Requires: parted
@@ -58,8 +55,6 @@ scheduling cluster-level operations.
%package -n vitastor-client
Summary: Vitastor - client
Requires: liburing >= 0.6
Requires: liburing < 2
%description -n vitastor-client
@@ -80,7 +75,7 @@ Vitastor library headers for development.
Summary: Vitastor - fio drivers
Group: Development/Libraries
Requires: vitastor-client = %{version}-%{release}
Requires: fio = 3.7-3.el8
Requires: fio = 3.19-3.el8
%description -n vitastor-fio
@@ -166,6 +161,7 @@ chown vitastor:vitastor /var/lib/vitastor
%files -n vitastor-client
%_bindir/vitastor-nbd
%_bindir/vitastor-ublk
%_bindir/vitastor-nfs
%_bindir/vitastor-cli
%_bindir/vitastor-rm


@@ -1,5 +1,7 @@
# Build packages for AlmaLinux 9 inside a container
# cd ..; podman build -t vitastor-el9 -v `pwd`/packages:/root/packages -f rpm/vitastor-el9.Dockerfile .
# cd ..
# docker build -t vitastor-buildenv:el9 -f rpm/vitastor-el9.Dockerfile .
# docker run -i --rm -v ./:/root/vitastor vitastor-buildenv:el9 /root/vitastor/rpm/vitastor-build.sh
FROM almalinux:9
@@ -8,22 +10,7 @@ WORKDIR /root
RUN sed -i 's/enabled=0/enabled=1/' /etc/yum.repos.d/*.repo
RUN dnf -y install epel-release dnf-plugins-core
RUN dnf -y install https://vitastor.io/rpms/centos/9/vitastor-release-1.0-1.el9.noarch.rpm
RUN dnf -y install gcc-c++ gperftools-devel fio nodejs rpm-build jerasure-devel libisa-l-devel gf-complete-devel rdma-core-devel libarchive liburing-devel cmake libnl3-devel
RUN dnf -y install gcc-c++ gperftools-devel fio nodejs rpm-build jerasure-devel libisa-l-devel gf-complete-devel rdma-core-devel libarchive cmake libnl3-devel
RUN dnf download --source fio
RUN rpm --nomd5 -i fio*.src.rpm
RUN cd ~/rpmbuild/SPECS && dnf builddep -y --spec fio.spec
ADD . /root/vitastor
RUN set -e; \
cd /root/vitastor/rpm; \
sh build-tarball.sh; \
VER=$(grep ^Version: vitastor-el9.spec | awk '{print $2}'); \
cp /root/vitastor-$VER.el9.tar.gz ~/rpmbuild/SOURCES; \
cp vitastor-el9.spec ~/rpmbuild/SPECS/vitastor.spec; \
cd ~/rpmbuild/SPECS/; \
rpmbuild -ba vitastor.spec; \
mkdir -p /root/packages/vitastor-el9; \
rm -rf /root/packages/vitastor-el9/*; \
cp ~/rpmbuild/RPMS/*/*vitastor* /root/packages/vitastor-el9/; \
cp ~/rpmbuild/SRPMS/vitastor* /root/packages/vitastor-el9/


@@ -1,13 +1,12 @@
Name: vitastor
Version: 2.0.0
Version: 2.3.0
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
Source0: vitastor-2.0.0.el9.tar.gz
Source0: vitastor-2.3.0.el9.tar.gz
BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel
BuildRequires: gcc-c++
BuildRequires: nodejs >= 10
@@ -159,6 +158,7 @@ chown vitastor:vitastor /var/lib/vitastor
%files -n vitastor-client
%_bindir/vitastor-nbd
%_bindir/vitastor-ublk
%_bindir/vitastor-nfs
%_bindir/vitastor-cli
%_bindir/vitastor-rm


@@ -12,6 +12,7 @@ set(WITH_QEMU false CACHE BOOL "Build QEMU driver inside Vitastor source tree")
set(WITH_FIO true CACHE BOOL "Build FIO driver")
set(QEMU_PLUGINDIR qemu CACHE STRING "QEMU plugin directory suffix (qemu-kvm on RHEL)")
set(WITH_ASAN false CACHE BOOL "Build with AddressSanitizer")
set(WITH_SYSTEM_LIBURING false CACHE BOOL "Use system liburing")
if("${CMAKE_INSTALL_PREFIX}" MATCHES "^/usr/local/?$")
if(EXISTS "/etc/debian_version")
set(CMAKE_INSTALL_LIBDIR "lib/${CMAKE_LIBRARY_ARCHITECTURE}")
@@ -19,13 +20,16 @@ if("${CMAKE_INSTALL_PREFIX}" MATCHES "^/usr/local/?$")
set(CMAKE_INSTALL_RPATH "${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}")
endif()
add_definitions(-DVITASTOR_VERSION="2.0.0")
add_definitions(-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Wall -Wno-sign-compare -Wno-comment -Wno-parentheses -Wno-pointer-arith -fdiagnostics-color=always -fno-omit-frame-pointer -I ${CMAKE_SOURCE_DIR}/src)
add_definitions(-DVITASTOR_VERSION="2.3.0")
add_definitions(-D_GNU_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Wall -Wno-sign-compare -Wno-comment -Wno-parentheses -Wno-pointer-arith -fdiagnostics-color=always -fno-omit-frame-pointer -fvisibility=hidden -I ${CMAKE_SOURCE_DIR}/src)
add_link_options(-fno-omit-frame-pointer)
if (${WITH_ASAN})
add_definitions(-fsanitize=address)
add_link_options(-fsanitize=address -fno-omit-frame-pointer)
endif (${WITH_ASAN})
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -fvisibility-inlines-hidden")
set(CMAKE_CXX_FLAGS_MINSIZEREL "${CMAKE_CXX_FLAGS_MINSIZEREL} -fvisibility-inlines-hidden")
set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -fvisibility-inlines-hidden")
set(CMAKE_BUILD_TYPE RelWithDebInfo)
string(REGEX REPLACE "([\\/\\-]O)[^ \t\r\n]*" "\\13" CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE}")
@@ -49,7 +53,6 @@ endmacro(install_symlink)
check_include_file("linux/nbd-netlink.h" HAVE_NBD_NETLINK_H)
find_package(PkgConfig)
pkg_check_modules(LIBURING REQUIRED liburing)
if (${WITH_QEMU})
pkg_check_modules(GLIB REQUIRED glib-2.0)
endif (${WITH_QEMU})
@@ -66,6 +69,15 @@ if (RDMACM_LIBRARIES)
add_definitions(-DWITH_RDMACM)
endif (RDMACM_LIBRARIES)
if (${WITH_SYSTEM_LIBURING})
pkg_check_modules(LIBURING REQUIRED liburing>=2.10)
include_directories(${LIBURING_INCLUDE_DIRS})
else()
include_directories(${CMAKE_SOURCE_DIR}/src/liburing/include)
add_subdirectory(liburing)
set(LIBURING_LIBRARIES uring)
endif (${WITH_SYSTEM_LIBURING})
add_custom_target(build_tests)
add_custom_target(test
COMMAND
@@ -86,7 +98,6 @@ include_directories(
${CMAKE_SOURCE_DIR}/src/test
${CMAKE_SOURCE_DIR}/src/util
/usr/include/jerasure
${LIBURING_INCLUDE_DIRS}
${IBVERBS_INCLUDE_DIRS}
)
@@ -101,7 +112,7 @@ add_subdirectory(test)
### Install
install(TARGETS vitastor-osd vitastor-disk vitastor-nbd vitastor-nfs vitastor-cli vitastor-kv vitastor-kv-stress RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
install(TARGETS vitastor-osd vitastor-disk vitastor-nbd vitastor-ublk vitastor-nfs vitastor-cli vitastor-kv vitastor-kv-stress RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
install_symlink(vitastor-disk ${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_BINDIR}/vitastor-dump-journal)
install_symlink(vitastor-cli ${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_BINDIR}/vitastor-rm)
install_symlink(vitastor-cli ${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_BINDIR}/vita)


@@ -9,6 +9,7 @@ add_library(vitastor_blk SHARED
)
target_link_libraries(vitastor_blk
${LIBURING_LIBRARIES}
${ISAL_LIBRARIES}
tcmalloc_minimal
# for timerfd_manager
vitastor_common


@@ -42,8 +42,7 @@
#define BS_OP_DELETE 6
#define BS_OP_LIST 7
#define BS_OP_ROLLBACK 8
#define BS_OP_SYNC_STAB_ALL 9
#define BS_OP_MAX 9
#define BS_OP_MAX 8
#define BS_OP_PRIVATE_DATA_SIZE 256
@@ -113,14 +112,6 @@ Input:
Output:
- retval = 0 or negative error number (-ENOENT if no such version for stabilize)
## BS_OP_SYNC_STAB_ALL
ONLY FOR TESTS! Sync and mark all unstable object versions as stable, at once.
Input: Nothing except opcode
Output:
- retval = 0 or negative error number (-EINVAL)
## BS_OP_LIST
Get a list of all objects in this Blockstore.
@@ -144,10 +135,10 @@ Output:
*/
struct blockstore_op_t
struct __attribute__ ((visibility("default"))) blockstore_op_t
{
// operation
uint64_t opcode;
uint64_t opcode = 0;
// finish callback
std::function<void (blockstore_op_t*)> callback;
union __attribute__((__packed__))
@@ -171,9 +162,9 @@ struct blockstore_op_t
uint32_t list_stable_limit;
};
};
void *buf;
void *bitmap;
int retval;
void *buf = NULL;
void *bitmap = NULL;
int retval = 0;
uint8_t private_data[BS_OP_PRIVATE_DATA_SIZE];
};
@@ -182,7 +173,7 @@ typedef std::map<std::string, std::string> blockstore_config_t;
class blockstore_impl_t;
class blockstore_t
class __attribute__((visibility("default"))) blockstore_t
{
blockstore_impl_t *impl;
public:

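Worth noting in the blockstore_op_t change above: all public fields now default to zero, so a caller-allocated op can no longer leak stack garbage into the engine. A minimal sketch of driving the public interface under that change (treating enqueue_op() as the submission entry point, and caller-freed list buffers, as assumptions not confirmed by this diff):

```cpp
// Hedged sketch: submit a BS_OP_LIST operation through the public blockstore API.
// Assumes enqueue_op() is the submission entry point and that the listing
// buffer in op->buf is allocated by the blockstore but freed by the caller.
#include "blockstore.h"
#include <cstdio>
#include <cstdlib>

void list_all_objects(blockstore_t *bs)
{
    blockstore_op_t *op = new blockstore_op_t; // opcode/buf/bitmap/retval start zeroed
    op->opcode = BS_OP_LIST;
    op->callback = [](blockstore_op_t *op)
    {
        // retval is negative errno on failure, non-negative on success
        printf("BS_OP_LIST finished: retval=%d\n", op->retval);
        free(op->buf);
        delete op;
    };
    bs->enqueue_op(op);
}
```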

@@ -244,7 +244,7 @@ void blockstore_disk_t::calc_lengths(bool skip_meta_check)
if (new_doesnt_fit)
{
printf("Warning: Using old metadata format without checksums because the new format"
" doesn't fit into provided area (%lu bytes required, %lu bytes available)\n", meta_len, meta_area_size);
" doesn't fit into provided area (%ju bytes required, %ju bytes available)\n", meta_len, meta_area_size);
}
clean_entry_size = clean_entry_v0_size;
meta_len = meta_v0_len;
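The %lu→%ju switch in this hunk is a portability fix: meta_len and meta_area_size are 64-bit, while %lu matches unsigned long, which is only 32 bits on 32-bit targets. %ju consumes an uintmax_t, which is wide enough everywhere; strictly speaking the argument should also be cast to match, as in this sketch:

```cpp
// Why %ju: printing a uint64_t portably across 32- and 64-bit targets.
#include <cstdio>
#include <cstdint>

int main()
{
    uint64_t meta_len = 1ull << 33; // 8 GiB: does not fit in a 32-bit unsigned long
    printf("%ju bytes required\n", (uintmax_t)meta_len); // cast matches %ju exactly
    return 0;
}
```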


@@ -361,6 +361,10 @@ bool journal_flusher_co::loop()
else if (wait_state == 28) goto resume_28;
else if (wait_state == 29) goto resume_29;
else if (wait_state == 30) goto resume_30;
else if (wait_state == 31) goto resume_31;
else if (wait_state == 32) goto resume_32;
else if (wait_state == 33) goto resume_33;
else if (wait_state == 34) goto resume_34;
resume_0:
if (flusher->flush_queue.size() < flusher->min_flusher_count && !flusher->trim_wanted ||
!flusher->flush_queue.size() || !flusher->dequeuing)
@@ -486,13 +490,14 @@ resume_2:
resume_10:
resume_11:
resume_12:
resume_13:
if (fill_incomplete && !clear_incomplete_csum_block_bits(5))
return false;
// Wait for journal data reads if the journal is not inmemory
resume_13:
resume_14:
if (wait_journal_count > 0)
{
wait_state = wait_base+13;
wait_state = wait_base+14;
return false;
}
if (bs->dsk.csum_block_size)
@@ -509,31 +514,26 @@ resume_2:
{
if (it->copy_flags == COPY_BUF_JOURNAL || it->copy_flags == (COPY_BUF_JOURNAL|COPY_BUF_COALESCED))
{
await_sqe(14);
await_sqe(15);
data->iov = (struct iovec){ it->buf, (size_t)it->len };
data->callback = simple_callback_w;
my_uring_prep_writev(
io_uring_prep_writev(
sqe, bs->dsk.data_fd, &data->iov, 1, bs->dsk.data_offset + clean_loc + it->offset
);
wait_count++;
}
}
// Wait for data writes and metadata reads
resume_15:
resume_16:
if (!wait_meta_reads(15))
resume_17:
if (!wait_meta_reads(16))
return false;
// Sync data before writing metadata
resume_17:
resume_18:
resume_19:
if (copy_count && !fsync_batch(false, 17))
resume_20:
if (copy_count && !fsync_batch(false, 18))
return false;
// Modify the new metadata entry
update_metadata_entry();
// Update clean_db - it must be equal to the metadata entry
update_clean_db();
// And write metadata entries
if (old_clean_loc != UINT64_MAX && old_clean_loc != clean_loc)
{
// zero out old metadata entry
@@ -548,26 +548,56 @@ resume_2:
}
}
memset((uint8_t*)meta_old.buf + meta_old.pos*bs->dsk.clean_entry_size, 0, bs->dsk.clean_entry_size);
resume_20:
if (meta_old.sector != meta_new.sector && !write_meta_block(meta_old, 20))
return false;
}
if (meta_old.sector != meta_new.sector)
{
resume_21:
if (!write_meta_block(meta_new, 21))
return false;
if (flusher->inflight_meta_sectors.find(meta_old.sector) != flusher->inflight_meta_sectors.end())
{
wait_state = wait_base+21;
return false;
}
flusher->inflight_meta_sectors.insert(meta_old.sector);
resume_22:
if (!write_meta_block(meta_old, 22))
return false;
resume_23:
if (wait_count > 0)
{
wait_state = wait_base+23;
return false;
}
flusher->inflight_meta_sectors.erase(meta_old.sector);
}
}
resume_24:
if (flusher->inflight_meta_sectors.find(meta_new.sector) != flusher->inflight_meta_sectors.end())
{
wait_state = wait_base+24;
return false;
}
flusher->inflight_meta_sectors.insert(meta_new.sector);
// Modify the new metadata entry
update_metadata_entry();
// Update clean_db - it must be equal to the metadata entry
update_clean_db();
// And write metadata entries
resume_25:
if (!write_meta_block(meta_new, 25))
return false;
resume_26:
if (wait_count > 0)
{
wait_state = wait_base+22;
wait_state = wait_base+26;
return false;
}
flusher->inflight_meta_sectors.erase(meta_new.sector);
// Done, free all buffers
free_buffers();
// And sync metadata (in batches - not per each operation!)
resume_23:
resume_24:
resume_25:
if (!fsync_batch(true, 23))
resume_27:
resume_28:
resume_29:
if (!fsync_batch(true, 27))
return false;
// Free the data block only when metadata is synced
free_data_blocks();
@@ -590,12 +620,12 @@ resume_2:
if (bs->journal_trim_interval && !((++flusher->journal_trim_counter) % bs->journal_trim_interval) ||
flusher->trim_wanted > 0)
{
resume_26:
resume_27:
resume_28:
resume_29:
resume_30:
if (!trim_journal(26))
resume_31:
resume_32:
resume_33:
resume_34:
if (!trim_journal(30))
return false;
}
// All done
@@ -716,7 +746,7 @@ bool journal_flusher_co::write_meta_block(flusher_meta_write_t & meta_block, int
await_sqe(0);
data->iov = (struct iovec){ meta_block.buf, (size_t)bs->dsk.meta_block_size };
data->callback = simple_callback_w;
my_uring_prep_writev(
io_uring_prep_writev(
sqe, bs->dsk.meta_fd, &data->iov, 1, bs->dsk.meta_offset + bs->dsk.meta_block_size + meta_block.sector
);
wait_count++;
@@ -734,6 +764,7 @@ bool journal_flusher_co::clear_incomplete_csum_block_bits(int wait_base)
else if (wait_state == wait_base+5) goto resume_5;
else if (wait_state == wait_base+6) goto resume_6;
else if (wait_state == wait_base+7) goto resume_7;
else if (wait_state == wait_base+8) goto resume_8;
cleared_incomplete = false;
for (auto it = v.begin(); it != v.end(); it++)
{
@@ -754,11 +785,18 @@ bool journal_flusher_co::clear_incomplete_csum_block_bits(int wait_base)
if (!wait_meta_reads(wait_base+0))
return false;
resume_2:
if (wait_journal_count > 0)
if (flusher->inflight_meta_sectors.find(meta_new.sector) != flusher->inflight_meta_sectors.end())
{
wait_state = wait_base+2;
return false;
}
flusher->inflight_meta_sectors.insert(meta_new.sector);
resume_3:
if (wait_journal_count > 0)
{
wait_state = wait_base+3;
return false;
}
// Verify data checksums
for (i = v.size()-1; i >= 0 && (v[i].copy_flags & COPY_BUF_CSUM_FILL); i--)
{
@@ -837,19 +875,20 @@ bool journal_flusher_co::clear_incomplete_csum_block_bits(int wait_base)
}
}
// Write and fsync the modified metadata entry
resume_3:
if (!write_meta_block(meta_new, wait_base+3))
return false;
resume_4:
if (!write_meta_block(meta_new, wait_base+4))
return false;
resume_5:
if (wait_count > 0)
{
wait_state = wait_base+4;
wait_state = wait_base+5;
return false;
}
resume_5:
flusher->inflight_meta_sectors.erase(meta_new.sector);
resume_6:
resume_7:
if (!fsync_batch(true, wait_base+5))
resume_8:
if (!fsync_batch(true, wait_base+6))
return false;
}
return true;
@@ -1090,7 +1129,7 @@ bool journal_flusher_co::read_dirty(int wait_base)
vi.buf = memalign_or_die(MEM_ALIGNMENT, vi.len);
data->iov = (struct iovec){ vi.buf, (size_t)vi.len };
data->callback = simple_callback_r;
my_uring_prep_readv(
io_uring_prep_readv(
sqe, bs->dsk.data_fd, &data->iov, 1, bs->dsk.data_offset + old_clean_loc + vi.offset
);
wait_count++;
@@ -1122,7 +1161,7 @@ bool journal_flusher_co::read_dirty(int wait_base)
await_sqe(1);
data->iov = (struct iovec){ v[i].buf, (size_t)v[i].len };
data->callback = simple_callback_rj;
my_uring_prep_readv(
io_uring_prep_readv(
sqe, bs->dsk.journal_fd, &data->iov, 1, bs->journal.offset + v[i].disk_offset
);
wait_journal_count++;
@@ -1215,7 +1254,7 @@ bool journal_flusher_co::modify_meta_read(uint64_t meta_loc, flusher_meta_write_
data->iov = (struct iovec){ wr.it->second.buf, (size_t)bs->dsk.meta_block_size };
data->callback = simple_callback_r;
wr.submitted = true;
my_uring_prep_readv(
io_uring_prep_readv(
sqe, bs->dsk.meta_fd, &data->iov, 1, bs->dsk.meta_offset + bs->dsk.meta_block_size + wr.sector
);
wait_count++;
@@ -1313,7 +1352,7 @@ bool journal_flusher_co::fsync_batch(bool fsync_meta, int wait_base)
await_sqe(0);
data->iov = { 0 };
data->callback = simple_callback_w;
my_uring_prep_fsync(sqe, fsync_meta ? bs->dsk.meta_fd : bs->dsk.data_fd, IORING_FSYNC_DATASYNC);
io_uring_prep_fsync(sqe, fsync_meta ? bs->dsk.meta_fd : bs->dsk.data_fd, IORING_FSYNC_DATASYNC);
cur_sync->state = 1;
wait_count++;
resume_2:
@@ -1383,7 +1422,7 @@ bool journal_flusher_co::trim_journal(int wait_base)
((journal_entry_start*)flusher->journal_superblock)->crc32 = je_crc32((journal_entry*)flusher->journal_superblock);
data->iov = (struct iovec){ flusher->journal_superblock, (size_t)bs->dsk.journal_block_size };
data->callback = simple_callback_w;
my_uring_prep_writev(sqe, bs->dsk.journal_fd, &data->iov, 1, bs->journal.offset);
io_uring_prep_writev(sqe, bs->dsk.journal_fd, &data->iov, 1, bs->journal.offset);
wait_count++;
resume_2:
if (wait_count > 0)
@@ -1394,7 +1433,7 @@ bool journal_flusher_co::trim_journal(int wait_base)
if (!bs->disable_journal_fsync)
{
await_sqe(3);
my_uring_prep_fsync(sqe, bs->dsk.journal_fd, IORING_FSYNC_DATASYNC);
io_uring_prep_fsync(sqe, bs->dsk.journal_fd, IORING_FSYNC_DATASYNC);
data->iov = { 0 };
data->callback = simple_callback_w;
wait_count++;
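The renumbered resume_NN labels throughout this file belong to a hand-rolled coroutine: journal_flusher_co::loop() returns false whenever it has to wait for I/O and is re-entered later, with wait_state recording which label to jump back to, which is why inserting new wait points forces the renumbering cascade seen above. A stripped-down sketch of the pattern (illustrative names, not the real flusher):

```cpp
// Minimal re-enterable step machine in the style of journal_flusher_co::loop().
struct mini_co
{
    int wait_state = 0; // which resume label to jump to on re-entry
    int wait_count = 0; // async completions still outstanding

    bool loop()
    {
        if (wait_state == 1) goto resume_1;
        // ... submit an async write here; its completion callback
        // decrements wait_count and calls loop() again ...
        wait_count++;
    resume_1:
        if (wait_count > 0)
        {
            wait_state = 1; // remember where to continue
            return false;   // yield until the completion re-enters loop()
        }
        wait_state = 0;     // step done; the next call starts from the top
        return true;
    }
};
```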


@@ -119,7 +119,8 @@ class journal_flusher_t
std::map<uint64_t, meta_sector_t> meta_sectors;
std::deque<object_id> flush_queue;
std::map<object_id, uint64_t> flush_versions; // FIXME: consider unordered_map?
std::unordered_map<object_id, uint64_t> flush_versions;
std::unordered_set<uint64_t> inflight_meta_sectors;
bool try_find_older(std::map<obj_ver_id, dirty_entry>::iterator & dirty_end, obj_ver_id & cur);
bool try_find_other(std::map<obj_ver_id, dirty_entry>::iterator & dirty_end, obj_ver_id & cur);
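The unordered_set added here is the core of the meta-block overwrite fix from the release notes: a flusher coroutine must not start writing a metadata sector while another coroutine's write to the same sector is still in flight. Reduced to its essentials, the guard works like this sketch (everything except inflight_meta_sectors is illustrative):

```cpp
// Illustrative reduction of the inflight_meta_sectors guard used in
// blockstore_flush.cpp: at most one in-flight write per metadata sector.
#include <cstdint>
#include <unordered_set>

struct flusher_shared
{
    std::unordered_set<uint64_t> inflight_meta_sectors;
};

// Returns false while another write to the same sector is pending;
// the real coroutine parks itself (wait_state = wait_base+N) and retries.
bool try_begin_meta_write(flusher_shared & fl, uint64_t sector)
{
    if (fl.inflight_meta_sectors.find(sector) != fl.inflight_meta_sectors.end())
        return false;
    fl.inflight_meta_sectors.insert(sector); // claim the sector
    return true; // safe to submit io_uring_prep_writev() for this sector now
}

void end_meta_write(flusher_shared & fl, uint64_t sector)
{
    fl.inflight_meta_sectors.erase(sector); // release once wait_count drains
}
```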

Some files were not shown because too many files have changed in this diff.