Fixes two bugs found during HDD testing :-)
1) OSD crashed with "BUG: Attempt to overwrite used offset of the journal" during
`fio -bs=900k -iodepth=128` test with 16 MB journal
2) OSD stalled during `fio -bs=512k -iodepth=128` test with 64 MB journal
Before this change, OSDs almost always died when one of the etcds was restarted,
even though the rest of them was still in quorum and the lease was still active
- Add patch for libvirt 9.0
- Add support for Proxmox VE 8.0
- Fix compatibility of the QEMU driver with iothread (QEMU rebuilds are coming)
- Fix vitastor-cli rm-data/rm/merge hanging when some OSDs are down.
Allow deletions in unclean cluster at the cost of some data possibly
"reappearing" when those OSDs start back. In that case you can just repeat
the deletion request using rm-data.
- A bunch of bug fixes for snapshots:
- Fix snapshot reads often not working at all with snapshot chain size > 2
- Fix optimized snapshot data merge (children to parent)
- Fix updating of image name index key during optimized merge
- Fix auto-selection preventing the use of optimized merge with only 1 snapshot
- Fix incorrect CAS retries during snapshot merge
- Fix snapshot merge progress reporting
- Fix primary_read bitmap buffers use-after-free which could lead to
incorrect allocation map reads
- Remove /usr/local/bin path from make-etcd
- Some documentation fixes
- Measure and report scrub I/O statistics in vitastor-cli status
- Make aggregated statistics in vitastor-cli status much smoother
(first derive, then sum instead of first summing and then deriving)
- Fix an old rare bug leading to journal corruption
(try to use scrub if you think you're affected...)
- Do not start EC PGs without at least <data chunks> OSDs in each old set
(prevents spurious read errors with EC during reconnections/restarts)
- Fix failed assert(!scrub_list_op) on OSD restart with pending scrubs
- Fix future planned scrubs not starting because of incorrect time comparison
- Build packages for Debian 12 (Bookworm)
The consequence of this issue was that in some very rare cases (only reproduced
under load in CI when running 4+ tests in parallel) small write data written to
journal could overwrite journal entries.
Also add an assert-type safety check to be able to catch this issue in the
future again in case of a regression.
- Fix "Client XX command out of sync" messages sometimes happening on OSD reconnections
- Fix a bug where EC reads parallel with writes to the same object failed with -ERANGE error
- Slightly reduce the amount of metadata writes during journal flushing
- Correctly unmap NBD volumes when Proxmox forces map_volume use (with SWTPM and maybe some other cases)