• v1.4.5 f882c7dd87

    Release 1.4.5 Stable

    vitalif released this 2024-02-16 10:13:33 +03:00 | 276 commits to master since this release

    • Fix a write stall caused by incorrect journal trimming introduced in 1.4.4 :)
    • Fix PGs sometimes hanging in "starting" state on mass OSD restarts
    • Fix a rare crash with "map::at" during OSD pings
    • Use new defaults for non-capacitor (desktop) SSDs - improves T1Q256 random write from ~6k iops to ~45k iops
    • Make journal_trim_interval configurable
     
  • v1.4.4 c777a0041a

    Release 1.4.4 Stable

    vitalif released this 2024-02-11 16:23:08 +03:00 | 283 commits to master since this release

    A couple of fixes for EC pools

    • Fix a segfault possible on partial EC overwrite in 1234 -> 5030 rebalance scenario
    • Fix two problems leading to EC pools stalling on rebalance and parallel sudden stops
      of OSDs, for example during a sudden poweroff of a host:
      • Recovery auto-tuning (1.4.0 feature) could apply delays that were too large and
        stall the EC journal - fixed by limiting delays with a new recovery_tune_sleep_cutoff_us
        parameter (10 seconds by default) and by applying recovery pauses before write
        operations, not after them, so that they do not occupy journal space for a long time
      • Dynamic journal space reservation (1.3.0 feature) did not account for new writes
        when checking the limit, so OSDs could still fill the journal completely and stall -
        fixed by counting new writes against the limit
    • Print etcd dbSize instead of dbSizeInUse in status
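
The two EC stall fixes above can be pictured with a small sketch. It is not Vitastor's code: the function names and the byte-counting model are hypothetical. The notes only say that delays are "limited", so clamping to the cutoff is an assumption here. The second pair of functions contrasts a limit check that ignores the incoming write with one that counts it.

```python
# Illustrative sketch only: names and structure are hypothetical, not Vitastor's code.

# 10 seconds in microseconds, the default quoted in the release notes.
RECOVERY_TUNE_SLEEP_CUTOFF_US = 10_000_000


def limited_recovery_sleep_us(tuned_sleep_us: int,
                              cutoff_us: int = RECOVERY_TUNE_SLEEP_CUTOFF_US) -> int:
    """Cap an auto-tuned recovery delay (clamping is an assumption)."""
    return max(0, min(tuned_sleep_us, cutoff_us))


def fits_in_journal_buggy(used: int, reserved: int, new_write: int, capacity: int) -> bool:
    """Pre-fix behaviour as described: the new write is not counted."""
    return used + reserved <= capacity


def fits_in_journal(used: int, reserved: int, new_write: int, capacity: int) -> bool:
    """Fixed check: the incoming write is counted against the limit."""
    return used + reserved + new_write <= capacity


if __name__ == "__main__":
    print(limited_recovery_sleep_us(25_000_000))
    print(fits_in_journal_buggy(60, 30, 20, 100), fits_in_journal(60, 30, 20, 100))
```

With 60 units used, 30 reserved and a 20-unit write against a capacity of 100, the first check admits the write and overfills the journal, while the second rejects it.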
     
  • v1.4.3 27e9f244ec

    Release 1.4.3 Stable

    vitalif released this 2024-02-09 00:29:31 +03:00 | 293 commits to master since this release

    Hotfix for hotfix O:-)

    • "Write stall fix" was incomplete and EC write stalls could
      continue even on 1.4.2. Now they're finally fixed O:-)
    • Make the monitor ignore statistics of stopped OSDs. Previously, if you stopped all
      OSDs, the last total I/O numbers would remain the same indefinitely
     
  • v1.4.2 016115c0d4

    Release 1.4.2 Stable

    vitalif released this 2024-02-04 02:23:49 +03:00 | 296 commits to master since this release

    • Log to systemd by default
    • Fix excessive autosyncs after every operation with disabled immediate_commit (introduced in 1.1.0)
    • Fix a possible write stall with EC due to the lack of OSD wakeup after stabilizing previous writes
    • Change sync operation semantics as a final fix to possible write stalls with EC and disabled immediate_commit
    • Sync after deleting data in CLI rm / rm-data if immediate_commit is disabled
    • Fix OSDs ignoring syncs & autosyncs for delete operations
    • Fix OSD space reporting sometimes adding garbage zeros for deleted inodes (causing extra pool/stats etcd keys for deleted pools)
    • Speed up monitor failover - change default etcd_mon_ttl from 30 to 5 seconds
    • Speed up operation retries - change default up_wait_retry_interval to 50 ms
    • Add patch for libvirt 9.10
     
  • v1.4.1 ba55f91409

    Release 1.4.1 Stable

    vitalif released this 2024-01-18 02:31:42 +03:00 | 308 commits to master since this release

    • Fix a monitor crash on primary OSD switching introduced in 1.4.0
    • Fix "partly outside array bounds" warnings for GCC 12 in cpp-btree
    • Fix a realloc memory leak that was theoretically possible with very large listings (OSD_OP_LIST)
     
  • v1.4.0 5280d1d561

    Release 1.4.0 Stable

    vitalif released this 2024-01-12 01:28:33 +03:00 | 316 commits to master since this release

    New features:

    • Intelligent recovery/rebalance speed auto-tuning to reduce its impact on clients (see README -> Features)
    • Auto-restoration of dead VDUSE daemons in CSI plugin
    • Add vitastor-disk update-sb command
    • Update QEMU for Debian Bookworm to 8.1 and use it for CSI plugin

    Bug fixes:

    • Fix pools SOMETIMES staying inactive after stopping a node due to OSDs not reacting
      to PG state changes caused by incorrect full reload of state from etcd on reconnection
    • Make monitors retry pool configuration changes more quickly, which fixes them being
      unable to apply changes when an ongoing rebalance is quickly making a lot of PGs clean
    • Fix CSI plugin not accepting array of strings as etcd address in /etc/vitastor/vitastor.conf
    • Allow multiple interfaces with the same IP address, for "simple routed" full mesh network
    • Do not ignore loopback addresses for OSD network (to make ECMP setups with frr possible)
    • Fix a rare client crash during OSD reconnections
    • Only treat data partitions as existing OSDs in vitastor-disk prepare
    • Remove etcd parameter from default command examples
    • Fix reported free space sometimes not updating immediately after data is deleted from OSDs
    • Fix a possible OSD crash on print_slow when bs_op is NULL
    • Use the same etcd_ws_keepalive_interval in mon as in OSD
    • Fix mon not using values from config when /config/global is not present
    • Remove pve-storage-portal-dns-list format for vitastor_etcd_address
    • Parse log_level in cluster_client
    • Fix vitastor-nbd image existence check not working because of non-zeroed inode_watch fields
    • Do not warn on EPIPE in client unless log_level is raised explicitly
    • Fix incorrect error in CSI when searching for the device in /sys
    • Remove the last 2 prints to stdout in etcd_state_client
    • Fix a possible OSD crash when checking corrupted journal entries
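
As an illustration of the CSI fix for array-valued etcd addresses, here is a minimal sketch of accepting both forms. The key name `etcd_address`, the comma-separated string form, and the helper `normalize_etcd_address` are assumptions made for this example. The actual CSI plugin code is not shown in these notes and may differ.

```python
import json
from typing import List, Union


def normalize_etcd_address(value: Union[str, List[str]]) -> List[str]:
    """Return a list of endpoints from either a string or an array of strings.

    Hypothetical helper: treating a string as comma-separated is an assumption.
    """
    if isinstance(value, str):
        parts = value.split(",")
    elif isinstance(value, list) and all(isinstance(v, str) for v in value):
        parts = value
    else:
        raise TypeError("etcd_address must be a string or an array of strings")
    # Trim whitespace and drop empty entries such as a trailing ",".
    return [p.strip() for p in parts if p.strip()]


if __name__ == "__main__":
    # Example config text; the addresses are placeholders.
    config = json.loads('{"etcd_address": ["10.0.0.1:2379", "10.0.0.2:2379"]}')
    print(normalize_etcd_address(config["etcd_address"]))
```

Normalizing to a list at the point where the config is read keeps the rest of the code independent of which form the user wrote.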
     
  • v1.3.1 a1c7cc3d8d

    Release 1.3.1 Stable

    vitalif released this 2023-12-04 18:35:09 +03:00 | 360 commits to master since this release

    Hotfix to 1.3.0 - new "journal space reservation" had a bug which
    caused OSDs to crash with EC and without immediate_commit.

     
  • v1.3.0 7972502eaf

    Release 1.3.0 Stable

    vitalif released this 2023-12-04 02:36:43 +03:00 | 362 commits to master since this release

    New features:

    • RDMA without ODP - much faster and all cards are now supported, not just Mellanox
    • VDUSE in CSI - faster, more stable and can even recover after CSI pod restart!
    • Reserve journal space for stabilize requests dynamically to prevent stalls under load with EC
    • Raise default NBD timeout from 30 to 300 seconds and allow it to be read from /etc/vitastor/vitastor.conf
    • Remove explicit etcdUrl/etcdPrefix K8S storage class parameter support to prevent
      etcd migration issues for volumes created with these parameters
    • Support QEMU 8.1 and pve-qemu 8.1

    Bug fixes:

    • Fix RDMA connection (and thus memory) leak
    • Fix rare crashes under load due to incorrect io_uring queue size tracking
    • Fix monitor statistics aggregation in case of empty /osd/stats keys
    • Fix crash on unknown long argument to vitastor-disk
    • Allow trailing comma in JSONs again
    • Fix crash on attempts to dump a long listing of objects "to stabilize" or "to rollback" in a slow op
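
For context on the "trailing comma" item above: strict JSON parsers reject input such as `{"a": 1,}`. The sketch below shows one way a lenient reader could tolerate it, by removing commas that directly precede a closing bracket outside of string literals. It only illustrates the idea and makes no claim about how Vitastor's own JSON parsing works. The function `strip_trailing_commas` is hypothetical.

```python
import json


def strip_trailing_commas(text: str) -> str:
    """Remove commas that directly precede '}' or ']' outside of strings."""
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
            out.append(ch)
        elif ch in "}]":
            # Look back past whitespace; a comma found there is a trailing comma.
            i = len(out) - 1
            while i >= 0 and out[i] in " \t\r\n":
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i]
            out.append(ch)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    raw = '{"a": [1, 2, ], "b": ",]",}'
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        print("strict parser rejects it:", exc.msg)
    print(json.loads(strip_trailing_commas(raw)))
```

The string value `",]"` is left untouched because the scan tracks whether it is inside a string literal.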
     
  • v1.2.0 5524dbdab7

    Release 1.2.0 Stable

    vitalif released this 2023-11-05 01:48:57 +03:00 | 382 commits to master since this release

    New features:

    • Implement CSI volume expansion
    • Implement CSI volume snapshots
    • CSI driver now requires Kubernetes >= 1.20

    Bug fixes:

    • Important bug fix for EC: fix EC n+k, k>=2 read recovery in the ISA-L version returning incorrect data when reading at least the second chunk out of multiple missing chunks without reading the first one. All users of EC n+k, k>=2 should upgrade as soon as possible, and the upgrade should be performed with downtime: first stop all clients (VMs/containers), then all OSDs, then upgrade and restart everything.
    • Fix unstable statistics aggregation in monitor (affecting vitastor-cli status and df)
    • Make udev not wait for OSDs to start during boot
    • Do not report negative numbers of offline PGs in vitastor-cli status when changing PG count
    • Report both old and new PG counts in vitastor-cli df when changing it
    • Fix OSDs sometimes not starting with "The code only supports journal versions 1 and 2, but it is 2 on disk" error after upgrading from pre-1.0 versions and letting OSDs run for some time
    • Fix monitors sometimes reverting to the old PG count after OSD configuration changes
    • Make monitor PG changes more stable and timeout errors less probable
     
  • v1.1.0 8222e3c77d

    Release 1.1.0 Stable

    vitalif released this 2023-10-28 00:33:06 +03:00 | 409 commits to master since this release

    New features:

    • Implement client writeback cache
    • Add a third I/O mode: O_DIRECT|O_SYNC (good for Optane)
    • Reduce load on etcd by splitting OSD lease and statistics reporting intervals: etcd_stats_interval (default 30 sec)
    • Make MON automatically filter OSDs by layout (block_size/immediate_commit/bitmap_granularity) to prevent "refusing to start PGs of this pool" errors on misconfiguration
    • Support running fio benchmarks on systems without io_uring
    • Make QEMU driver compatible with QEMU 8.1
    • Document usage of vhost-user-blk

    Bug fixes:

    • Fix resizing disks in QEMU driver (for example, in Proxmox)
    • Fix "unexpected result" in Proxmox driver by making CLI flush output on exit
    • Remove unneeded block_size mismatch warnings on pools without matching PGs
    • Fix possible segfault in vitastor-cli ls -l (usually with deleted pools)
    • Fix QEMU driver compatibility with systems without io_uring
    • Fix monitor eating 100% CPU when etcd is down (caused by infinite retries)
    • Fix potential incorrect write processing with snapshots (not caught in tests but could probably lead to client hangs)
    • Fix buffer insertion in cluster_client (not caught in tests but could probably lead to incorrect writes in rare cases)
    • Fix rare OSD crash during sync operation processing
    • Fix a reentrancy issue in cluster_client not reproducible in QEMU/fio, but reproducible with the K/V database implementation currently in development
    • Fix deletion of the first modified object - OSDs could crash if you modified the same object many times, then deleted it, and then modified it again
    • Fix the fio_sec_osd test tool