-
Release 1.4.5 Stable
released this
2024-02-16 10:13:33 +03:00 | 276 commits to master since this release- Fix a write stall caused by incorrect journal trimming introduced in 1.4.4 :)
- Fix PGs sometimes hanging in "starting" state on mass OSD restarts
- Fix a rare crash with "map::at" during OSD pings
- Use new defaults for non-capacitor (desktop) SSDs - improves T1Q256 random write from ~6k iops to ~45k iops
- Make journal_trim_interval configurable
Downloads
-
Release 1.4.4 Stable
released this
2024-02-11 16:23:08 +03:00 | 283 commits to master since this releaseA couple of fixes for EC pools
- Fix a segfault possible on partial EC overwrite in 1234 -> 5030 rebalance scenario
- Fix two problems leading to EC pools stalling on rebalance & parallel sudden stops
of OSDs, for example during a sudden poweroff of a host:- Recovery auto-tuning (1.4.0 feature) could apply too large delays and stall
the EC journal - fixed by limiting delays with a new recovery_tune_sleep_cutoff_us
parameter (10 seconds by default) and applying recovery pauses before write
operations, not after them, to not occupy space in the journal for long time - Dynamic journal space reservation (1.3.0 feature) wasn't accounting new writes
when checking the limit so OSDs could still fill the journal fully and stall -
fixed by including new writes into the limit
- Recovery auto-tuning (1.4.0 feature) could apply too large delays and stall
- Print etcd dbSize instead of dbSizeInUse in status
Downloads
-
Release 1.4.3 Stable
released this
2024-02-09 00:29:31 +03:00 | 293 commits to master since this releaseHotfix for hotfix O:-)
- "Write stall fix" was incomplete and EC write stalls could
continue even on 1.4.2. Now they're finally fixed O:-) - Make monitor ignore statistics of stopped OSDs. Previously if you stopped all
OSDs the last total I/O numbers would remain the same indefinitely
Downloads
- "Write stall fix" was incomplete and EC write stalls could
-
Release 1.4.2 Stable
released this
2024-02-04 02:23:49 +03:00 | 296 commits to master since this release- Log to systemd by default
- Fix excessive autosyncs after every operation with disabled immediate_commit (introduced in 1.1.0)
- Fix a possible write stall with EC due to the lack of OSD wakeup after stabilizing previous writes
- Change sync operation semantics as a final fix to possible write stalls with EC and disabled immediate_commit
- Sync after deleting data in CLI rm / rm-data if immediate_commit is disabled
- Fix OSDs ignoring syncs & autosyncs for delete operations
- Fix OSD space reporting sometimes adding garbage zeros for deleted inodes (causing extra pool/stats etcd keys for deleted pools)
- Speed up monitor failover - change default etcd_mon_ttl from 30 to 5 seconds
- Speed up operation retries - change default up_wait_retry_interval to 50 ms
- Add patch for libvirt 9.10
Downloads
-
Release 1.4.1 Stable
released this
2024-01-18 02:31:42 +03:00 | 308 commits to master since this release- Fix a monitor crash on primary OSD switching introduced in 1.4.0
- Fix "partly outside array bounds" warnings for GCC 12 in cpp-btree
- Fix a realloc memory leak in theory possible with too large listings (OSD_OP_LIST)
Downloads
-
Release 1.4.0 Stable
released this
2024-01-12 01:28:33 +03:00 | 316 commits to master since this releaseNew features:
- Intelligent recovery/rebalance speed auto-tuning to reduce its impact on clients (see README -> Features)
- Auto-restoration of dead VDUSE daemons in CSI plugin
- Add vitastor-disk update-sb command
- Update QEMU for Debian Bookworm to 8.1 and use it for CSI plugin
Bug fixes:
- Fix pools SOMETIMES staying inactive after stopping a node due to OSDs not reacting
to PG state changes caused by incorrect full reload of state from etcd on reconnection - Make monitors retry pool configuration changes quickier which fixes them being unable
to apply changes when an ongoing rebalance is quickly making a lot of PGs clean - Fix CSI plugin not accepting array of strings as etcd address in /etc/vitastor/vitastor.conf
- Allow multiple interfaces with the same IP address, for "simple routed" full mesh network
- Do not ignore loopback addresses for OSD network (to make ECMP setups with frr possible)
- Fix a rare client crash during OSD reconnections
- Only treat data partitions as existing OSDs in vitastor-disk prepare
- Remove etcd parameter from default command examples
- Fix reported free space sometimes changing non-immediately after deletion of data from OSDs
- Fix a possible OSD crash on print_slow when bs_op is NULL
- Use the same etcd_ws_keepalive_interval in mon as in OSD
- Fix mon not using values from config when /config/global is not present
- Remove pve-storage-portal-dns-list format for vitastor_etcd_address
- Parse log_level in cluster_client
- Fix vitastor-nbd image existence check not working because of non-zeroed inode_watch fields
- Do not warn on EPIPE in client unless log_level is raised explicitly
- Fix incorrect error in CSI when searching for the device in /sys
- Remove 2 last prints to stdout in etcd_state_client
- Fix a possible OSD crash when checking corrupted journal entries
Downloads
-
Release 1.3.1 Stable
released this
2023-12-04 18:35:09 +03:00 | 360 commits to master since this releaseHotfix to 1.3.0 - new "journal space reservation" had a bug which
caused OSDs to crash with EC and without immediate_commit.Downloads
-
Release 1.3.0 Stable
released this
2023-12-04 02:36:43 +03:00 | 362 commits to master since this releaseNew features:
- RDMA without ODP - much faster and all cards are now supported, not just Mellanox
- VDUSE in CSI - faster, more stable and can even recover after CSI pod restart!
- Reserve journal space for stabilize requests dynamically to prevent stalls under load with EC
- Raise default NBD timeout from 30 to 300 seconds and allow to take it from /etc/vitastor/vitastor.conf
- Remove explicit etcdUrl/etcdPrefix K8S storage class parameter support to prevent
etcd migration issues for volumes created with these parameters - Support QEMU 8.1 and pve-qemu 8.1
Bug fixes:
- Fix RDMA connection (and thus memory) leak
- Fix rare crashes under load due to incorrect io_uring queue size tracking
- Fix monitor statistics aggregation in case of empty /osd/stats keys
- Fix crash on unknown long argument to vitastor-disk
- Allow trailing comma in JSONs again
- Fix crash on attempts to dump a long listing of objects "to stabilize" or "to rollback" in a slow op
Downloads
-
Release 1.2.0 Stable
released this
2023-11-05 01:48:57 +03:00 | 382 commits to master since this releaseNew features:
- Implement CSI volume expansion
- Implement CSI volume snapshots
- CSI driver now requires Kubernetes >= 1.20
Bug fixes:
- Important bug fix for EC: fix EC n+k, k>=2 read recovery in ISA-L version returning incorrect data when reading at least the second chunk out of multiple missing chunks without reading the first one. All users of EC n+k, k>=2 should upgrade as soon as possible, and upgrade should be conducted with downtime: first stop all clients (VMs/containers), then all OSDs, then upgrade and restart everything.
- Fix unstable statistics aggregation in monitor (affecting vitastor-cli status and df)
- Make udev not wait for OSDs to start during boot
- Do not report negative numbers of offline PGs in vitastor-cli status when changing PG count
- Report both old and new PG counts in vitastor-cli df when changing it
- Fix OSDs sometimes not starting with "The code only supports journal versions 1 and 2, but it is 2 on disk" error after upgrading from pre-1.0 versions and letting OSDs run for some time
- Fix monitors sometimes returning old PG count back after OSD configuration changes
- Make monitor PG changes more stable and timeout errors less probable
Downloads
-
Release 1.1.0 Stable
released this
2023-10-28 00:33:06 +03:00 | 409 commits to master since this releaseNew features:
- Implement client writeback cache
- Add the third I/O mode: O_DIRECT|O_SYNC (good for Optane)
- Reduce load on etcd by splitting OSD lease and statistics reporting intervals: etcd_stats_interval (default 30 sec)
- Make MON automatically filter OSDs by layout (block_size/immediate_commit/bitmap_granularity) to prevent "refusing to start PGs of this pool" errors on misconfiguration
- Support running fio benchmarks on systems without io_uring
- Make QEMU driver compatible with QEMU 8.1
- Document usage of vhost-user-blk
Bug fixes:
- Fix resizing disks in QEMU driver (for example, in Proxmox)
- Fix "unexpected result" in Proxmox driver by making CLI flush output on exit
- Remove unneeded block_size mismatch warnings on pools without matching PGs
- Fix possible segfault in vitastor-cli ls -l (usually with deleted pools)
- Fix QEMU driver compatibility with systems without io_uring
- Fix monitor eating 100% CPU when etcd is down (caused by infinite retries)
- Fix potential incorrect write processing with snapshots (not caught in tests but could probably lead to client hangs)
- Fix buffer insertion in cluster_client (not caught in tests but could probably lead to incorrect writes in rare cases)
- Fix rare OSD crash during sync operation processing
- Fix a reenterability issue in cluster_client not reproducible in QEMU/fio, but reproducible with the currently developed K/V database implementation
- Fix deletion of the first modified object - OSDs could crash if you modified the same object a lot of times, then deleted it, and then modified it again
- Fix the fio_sec_osd test tool
Downloads