Commit Graph

1198 Commits (v0.8.4)

Author SHA1 Message Date
Vitaliy Filippov 81fc8bb94c Release 0.8.4
New features:
- Implement QCOW2 image/snapshot export via qemu-img (bdrv_co_block_status in the driver)
- Remove OSDs from PG history during `vitastor-cli rm-osd` to prevent `left_on_dead` PG states after deletion
- Add a new recovery_pg_switch setting to mix all PGs during recovery, to almost
  fully reduce the probability of ENOSPC during rebalance
- Introduce partial ENOSPC ("OSD is full") handling - now ENOSPC doesn't turn
  into cascades of crashes
- Add migration support to Proxmox VE Vitastor driver
- Track last_clean_pgs on a per-pool basis thus reducing data movement in a cluster
  with pools remaining unclean/degraded for a long time

Bug fixes:
- Fix a bug where monitor could generate degraded PGs if one of the hosts had no OSDs
- Fix a bug where monitor could skip PG redistribution with a lot of OSDs in cluster
- Report PG history synchronously on the first write, which improves PG consistency
  and availability at the same time, because history now gets reported correctly
  and doesn't get reported without the need for it
- Fix possible write and recovery stalls which could happen in a cluster with both EC and replicated pools
- Make OSD and monitors sanitize & deduplicate PG history items in etcd
- Fix non-working OSD peer config safety check
- Fix a rare journal flush stall where flushing wasn't activated with full journal, but with empty flush queue
- Fix builds without ISA-L (jerasure-only) crashing with EC N+K, K>=2 due to the lack of 16-byte buffer alignment
- Fix a possible crash for EC N+K, K>=2 when calculating a parity chunk with previous parity chunk missing
- Fix a bug where vitastor-disk purge with suppressed warnings didn't work
2023-01-13 23:59:54 +03:00
Vitaliy Filippov bc465c16de Fix arithmetic on void* for clang 2023-01-13 23:58:42 +03:00
Vitaliy Filippov 8763e9211c Fix qemu driver compilation warning/error 2023-01-13 23:44:39 +03:00
Vitaliy Filippov 9e1a80bd17 Replace apt-key with trusted.gpg.d 2023-01-13 19:51:47 +03:00
Vitaliy Filippov 3e280f2f08 Mark vitastor as shared storage in PVE driver 2023-01-13 01:36:30 +03:00
Vitaliy Filippov fe87b4076b Fix backwards compatibility in cluster_client 2023-01-12 02:37:31 +03:00
Vitaliy Filippov a38957c1a7 Skip empty hosts in lp-optimizer 2023-01-09 16:26:16 +03:00
Vitaliy Filippov 137309cf29 Implement bdrv_co_block_status for snapshot export support 2023-01-07 17:06:58 +03:00
Vitaliy Filippov 373f9d0387 Try to re-peer PGs on history change 2023-01-06 12:46:44 +03:00
Vitaliy Filippov c4516ea971 Also remove deleted OSD from PG configuration and last_clean_pgs 2023-01-06 12:46:44 +03:00
Vitaliy Filippov 91065c80fc Try to prevent left_on_dead when deleting OSDs by removing them from PG history 2023-01-06 12:46:43 +03:00
Vitaliy Filippov 0f6b946add Time changes with every stat change, do not schedule checks based on it 2023-01-05 13:54:16 +03:00
Vitaliy Filippov 465cbf0b2f Do not re-schedule recheck indefinitely, run it after mon_change_timeout in any case 2023-01-05 13:48:06 +03:00
Vitaliy Filippov 41add50e4e Track last_clean_pgs on a per-pool basis 2023-01-03 02:20:50 +03:00
Vitaliy Filippov 02e7be7dc9 Prevent reenterability side effects during PG history operation resume 2023-01-03 02:20:50 +03:00
Vitaliy Filippov 73940adf07 Prioritize EC (non-instantly-stable) operations under journal pressure
This reduces the probability of hitting OSD stalls with EC due to "deadlocks"
where two parallel write operations wait for each other to complete
2023-01-03 00:05:45 +03:00
Vitaliy Filippov e950c024d3 Do not sync peer OSDs before listing
Sync before listing was added to wait for all PG writes possibly left in queue
from the previous master to finish before listing it

But in fact it may block the cluster when EC is used and some unstable writes
are left in the queue - they block journal flushing, rollback/stabilize is
required to unblock them, but rollback/stabilize may only happen after PG is
peered. But peering needs listings, listings are requested only after sync, and
sync itself waits for currently blocked writes waiting in the queue
2023-01-03 00:05:45 +03:00
Vitaliy Filippov 71d6d9f868 Fix possible crash on ENOSPC during operation cancel in blockstore 2023-01-03 00:05:45 +03:00
Vitaliy Filippov a4dfa519af Report PG history synchronously during write
This has 2 effects:
1) OSD sets aren't added into PG history until actual write attempts anymore
   which removes unneeded extra osd_sets in PG history
2) New OSD sets are reported synchronously and can't be lost on PG restarts
   happening at the same time with reconfiguration
2023-01-01 23:41:05 +03:00
Vitaliy Filippov 37a6aff2fa Write OSD numbers always as numbers in mon 2023-01-01 23:17:42 +03:00
Vitaliy Filippov 67019f5b02 Make OSD sort & sanitize PG history items 2023-01-01 23:17:42 +03:00
Vitaliy Filippov 0593e5c21c Fix OSD peer config safety check 2022-12-31 02:24:42 +03:00
Vitaliy Filippov 998e24adf8 Add a new recovery_pg_switch setting to mix all PGs during recovery 2022-12-30 02:03:33 +03:00
Vitaliy Filippov d7bd36dc32 Fix another rare journal flush stall 2022-12-30 02:03:33 +03:00
Vitaliy Filippov cf5c562800 Log all object locations when peering PGs 2022-12-30 02:03:33 +03:00
Vitaliy Filippov 629200b0cc Return ENOSPC as the primary OSD 2022-12-30 02:03:33 +03:00
Vitaliy Filippov 3589ccec22 Do not disconnect peer on ENOSPC during write 2022-12-30 01:54:25 +03:00
Vitaliy Filippov 8d55a1e780 Build osd_rmw_test both with and without ISA-L 2022-12-29 19:13:57 +03:00
Vitaliy Filippov 65f6b3a4eb Fix jerasure crashing on bitmap calculation/restoration due to the lack of 16-byte alignment 2022-12-29 19:13:57 +03:00
Vitaliy Filippov fd216eac77 Add a test for missing parity chunk calculation 2022-12-29 19:13:57 +03:00
Vitaliy Filippov 61fca7c426 Fix crash when calculating a parity chunk with previous parity chunk missing (test coming shortly) 2022-12-29 19:13:57 +03:00
Vitaliy Filippov 1c29ed80b9 Fix quote in docs :) 2022-12-28 18:08:53 +03:00
Vitaliy Filippov 68f3fb795e Suppress warnings in vitastor-disk purge correctly 2022-12-27 11:09:19 +03:00
Vitaliy Filippov fa90f287da Release 0.8.3
- Implement a new "vitastor-disk purge" command to remove OSDs with safety checks
- Implement a new "vitastor-cli rm-osd" command to only remove OSD metadata from etcd
- Fix a bug where the monitor could ignore OSD removal and other /osd/stats key changes
- Fix a bug where garbage could be returned when reading objects being written at the same time
- Fix a rare write stall where journal space could be not reclaimed where there
  were no new operations in the flush queue
- Fix a rare peering stall caused by a previous long listing operations queues limiting attempt
- Fix total object count statistic in OSD on object creation
- Add missing offset&len into vitastor-disk dump-journal for big_writes, fix JSON format
- Make vitastor-cli print help on missing command
- Make vitastor-cli translate all '-' to '_' in CLI options
2022-12-27 02:40:55 +03:00
Vitaliy Filippov 795020674d Loop journal flusher when the queue is empty but there is a trim request 2022-12-27 02:28:20 +03:00
Vitaliy Filippov 8e12285629 Fix vitastor-disk purge (now it works) 2022-12-27 02:28:20 +03:00
Vitaliy Filippov b9b50ab4cc Implement vitastor-disk purge command 2022-12-26 02:48:48 +03:00
Vitaliy Filippov 0d8625f92d Make vitastor-cli print help on missing command 2022-12-26 02:48:48 +03:00
Vitaliy Filippov 2f3c2c5140 Implement safety check for OSD removal, translate all '-' to '_' in cli options
'-' to '_' translation fixes a bug with create --image_meta
2022-12-26 02:48:48 +03:00
Vitaliy Filippov 4ebdd02b0f Remove LIST op limiter
It doesn't prevent OSD slow ops but may itself lead to stalls :)
2022-12-26 02:48:48 +03:00
Vitaliy Filippov bf6fdc4141 Check add/rm osd with 2048 PGs 2022-12-26 02:48:48 +03:00
Vitaliy Filippov c2244331e6 Add vitastor-cli rm-osd command 2022-12-26 02:48:48 +03:00
Vitaliy Filippov 3de57e87b1 Recheck OSD tree in monitor on /osd/stats changes 2022-12-26 02:48:48 +03:00
Vitaliy Filippov 2d4cc688b2 Add a remove-osd test 2022-12-26 02:48:48 +03:00
Vitaliy Filippov 31bd1ec145 Fix object creation check for statistics 2022-12-21 02:51:11 +03:00
Vitaliy Filippov c08d1f2dfe Add missing offset&len into big_writes journal dump, fix commas again 2022-12-21 02:51:11 +03:00
Vitaliy Filippov 1d80bcc8d0 Fix blockstore returning garbage for unstable reads if there is an in-flight version
"In-flight" versions are added into dirty_db when writes are enqueued. And they
weren't ignored by subsequent reads even though they didn't have data location yet.
This bug was leading to test_heal.sh not passing sometimes with replicated setups.
2022-12-21 02:48:24 +03:00
Vitaliy Filippov 5ef8bed75f Release 0.8.2
- Fix QEMU driver compatibility with QEMU 7.0 and < 2.9
- Add patches for pve-qemu-kvm 7.1 (PVE 7.3) and pve-qemu-kvm 6.2 (PVE 7.2)
- Fix Proxmox driver location in the pve-storage-vitastor package
- Disable HDD autodetection in non-hybrid mode
- Explicitly warn about a buggy kernels on -EAGAIN in io_uring
- Final fix for the lack of zeroing out of old metadata entries
  (do not crash with "big_write journal_entry was allocated over another object"
  in some cases after an unclean OSD shutdown)
- Wait for data writes before fsyncing data if data fsync is enabled
- Never try to wait for free space inside blockstore thus stalling OSDs
- Fix a rare crash in osd_peering due to callback ordering
- Fix a rare duplication of ping & op message IDs
- Fix a rare use-after-free during pings
- Add --force to vitastor-disk read-sb
- Make vitastor-disk dump metadata object IDs in hex, add forgotten commas
- Fix vitastor-disk SCSI disk cache check
2022-12-17 17:54:13 +03:00
Vitaliy Filippov 8669998e5e Fix discard_list_subop() for local ops 2022-12-17 17:54:13 +03:00
Vitaliy Filippov b457327e77 Oops. Fix metadata read after fixes :-) 2022-12-17 17:31:57 +03:00