New features:
- RDMA without ODP - much faster and all cards are now supported, not just Mellanox
- VDUSE in CSI - faster, more stable and can even recover after CSI pod restart!
- Reserve journal space for stabilize requests dynamically to prevent stalls under load with EC
- Raise default NBD timeout from 30 to 300 seconds and allow to take it from /etc/vitastor/vitastor.conf
- Remove explicit etcdUrl/etcdPrefix K8S storage class parameter support to prevent
etcd migration issues for volumes created with these parameters
- Support QEMU 8.1 and pve-qemu 8.1
Bug fixes:
- Fix RDMA connection (and thus memory) leak
- Fix rare crashes under load due to incorrect io_uring queue size tracking
- Fix monitor statistics aggregation in case of empty /osd/stats keys
- Fix crash on unknown long argument to vitastor-disk
- Allow trailing comma in JSONs again
- Fix crash on attempts to dump a long listing of objects "to stabilize" or "to rollback" in a slow op
VDUSE has multiple advantages:
- Better performance
- Lack of timeout problems
- And even the ability to recover after restart of the vitastor-csi pod!
ODP is slower than regular RDMA even with memory copy overhead
Example numbers:
- 3950000 random read iops without ODP vs 240000 iops with ODP
- 1447000 random write iops without ODP vs 101000 iops with ODP
Reference: https://tkygtr6.github.io/pub/ISPASS21_slides.pdf
New features:
- Implement CSI volume expansion
- Implement CSI volume snapshots
- CSI driver now requires Kubernetes >= 1.20
Bug fixes:
- Important bug fix for EC: fix EC n+k, k>=2 read recovery in ISA-L version returning
incorrect data when reading at least the second chunk out of multiple missing chunks
without reading the first one. All users of EC n+k, k>=2 should upgrade as soon as
possible, and upgrade should be conducted with downtime: first stop all clients
(VMs/containers), then all OSDs, then upgrade and restart everything.
- Fix unstable statistics aggregation in monitor (affecting vitastor-cli status and df)
- Make udev not wait for OSDs to start during boot
- Do not report negative numbers of offline PGs in vitastor-cli status when changing PG count
- Report both old and new PG counts in vitastor-cli df when changing it
- Fix OSDs sometimes not starting with "The code only supports journal versions 1 and 2,
but it is 2 on disk" error after upgrading from pre-1.0 versions and letting OSDs run
for some time
- Fix monitors sometimes returning old PG count back after OSD configuration changes
- Make monitor PG changes more stable and timeout errors less probable