Commit Graph

39 Commits (master)

Author SHA1 Message Date
Vitaliy Filippov 3aee37eadd Allow to disable per-inode stats for VitastorFS pools 2024-03-16 13:24:36 +03:00
Vitaliy Filippov f20564b44b Fix 32-bit build warnings (99.9% in printf) 2024-02-22 12:22:16 +03:00
Vitaliy Filippov 5d3317e4f2 Followup to 1.4.2 write stall fix - sadly, the previous version was not working correctly :)
Test / test_move_reappear (push) Successful in 19s Details
Test / test_snapshot_chain (push) Successful in 1m21s Details
Test / test_snapshot_down (push) Successful in 23s Details
Test / test_snapshot_chain_ec (push) Successful in 1m50s Details
Test / test_snapshot_down_ec (push) Successful in 22s Details
Test / test_splitbrain (push) Successful in 16s Details
Test / test_etcd_fail (push) Successful in 6m42s Details
Test / test_rebalance_verify_imm (push) Successful in 2m19s Details
Test / test_rebalance_verify (push) Successful in 4m7s Details
Test / test_switch_primary (push) Successful in 36s Details
Test / test_write (push) Successful in 35s Details
Test / test_rebalance_verify_ec (push) Successful in 4m6s Details
Test / test_write_no_same (push) Successful in 22s Details
Test / test_write_xor (push) Successful in 1m34s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 6m7s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m7s Details
Test / test_heal_csum_32k_dj (push) Successful in 4m59s Details
Test / test_heal_csum_32k (push) Successful in 5m4s Details
Test / test_heal_csum_4k_dmj (push) Successful in 5m59s Details
Test / test_scrub (push) Successful in 1m9s Details
Test / test_scrub_zero_osd_2 (push) Successful in 37s Details
Test / test_scrub_xor (push) Successful in 52s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m5s Details
Test / test_heal_csum_4k_dj (push) Successful in 5m12s Details
Test / test_heal_csum_4k (push) Successful in 5m1s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m48s Details
Test / test_scrub_ec (push) Successful in 19s Details
Test / test_interrupted_rebalance (push) Successful in 1m38s Details
Test / test_heal_pg_size_2 (push) Successful in 3m20s Details
Test / test_heal_ec (push) Successful in 3m3s Details
2024-02-08 19:34:29 +03:00
Vitaliy Filippov 1cec62d25d Sync only completed writes
Test / test_move_reappear (push) Successful in 21s Details
Test / test_rm (push) Successful in 16s Details
Test / test_snapshot_down (push) Successful in 25s Details
Test / test_snapshot_down_ec (push) Successful in 35s Details
Test / test_splitbrain (push) Successful in 24s Details
Test / test_interrupted_rebalance (push) Successful in 5m14s Details
Test / test_snapshot_chain (push) Successful in 2m50s Details
Test / test_rebalance_verify_imm (push) Successful in 2m47s Details
Test / test_rebalance_verify (push) Successful in 3m42s Details
Test / test_switch_primary (push) Successful in 33s Details
Test / test_write (push) Successful in 42s Details
Test / test_write_xor (push) Successful in 44s Details
Test / test_rebalance_verify_ec_imm (push) Successful in 2m52s Details
Test / test_write_no_same (push) Successful in 15s Details
Test / test_rebalance_verify_ec (push) Successful in 4m19s Details
Test / test_heal_ec (push) Successful in 6m20s Details
Test / test_heal_csum_32k (push) Successful in 3m29s Details
Test / test_scrub (push) Successful in 1m24s Details
Test / test_scrub_zero_osd_2 (push) Successful in 1m11s Details
Test / test_heal_csum_4k_dmj (push) Successful in 4m23s Details
Test / test_scrub_xor (push) Successful in 1m9s Details
Test / test_heal_csum_4k_dj (push) Successful in 5m29s Details
Test / test_heal_csum_4k (push) Successful in 5m36s Details
Test / test_scrub_pg_size_3 (push) Successful in 1m53s Details
Test / test_scrub_ec (push) Successful in 29s Details
Test / test_heal_pg_size_2 (push) Successful in 3m9s Details
Test / test_heal_csum_32k_dmj (push) Successful in 4m13s Details
Test / test_heal_csum_32k_dj (push) Successful in 4m17s Details
Test / test_snapshot_chain_ec (push) Successful in 1m25s Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Failing after 24s Details
Should be a final remaining fix to EC + non-capacitor (non-immediate-commit) write hangs :).

First it was breaking non-EC ("instantly stable") writes because they sometimes
complete out of order which was leading to the following error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  BUG: Unexpected dirty_entry 1000000000001:29480000 v65540 unstable state during flush: 0x151

But it is easily fixed by scanning previous and next dirty_entries in mark_stable.
2024-01-27 15:17:22 +03:00
Vitaliy Filippov 5d9d6f32a0 Fix common realloc memory leak mistakes found by cppcheck 2024-01-13 01:30:28 +03:00
Vitaliy Filippov f600cc07b0 Autosync in blockstore every autosync_writes, too 2023-09-16 17:52:17 +03:00
Vitaliy Filippov 5cadb170b9 Fix possible OSD crash during sync due to missing min_flushed_journal_sector reset 2023-09-16 17:52:17 +03:00
Vitaliy Filippov a8464c19af Support keeping checksums on disk (not in memory)
Definitely beneficial for SSD+HDD setups
2023-07-29 12:17:18 +03:00
Vitaliy Filippov c5274f655b ...and partially remove the perversion with bitmap inlining 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 0b0405d115 Implement bitmap-granular (4k) metadata & data checksums 2023-07-29 12:17:18 +03:00
Vitaliy Filippov a6d846863b Add min/max stripe and limit to OP_LIST 2023-05-20 23:19:39 +03:00
Vitaliy Filippov a409598b16 Wait for free space again, but count on big_write flushes instead of just flusher activity 2023-05-10 01:51:02 +03:00
Vitaliy Filippov 0fbf4c6a08 Selectively sync nonsynced objects on STABILIZE/ROLLBACK (fix for github issue #51) 2023-04-08 02:44:02 +03:00
Vitaliy Filippov d06ed2b0e7 Implement online config update 2023-03-26 19:21:50 +03:00
Vitaliy Filippov 1a1ba0d1e7 Add set_immediate to ringloop and use it for bs/osd ops to prevent reenterability issues 2023-02-09 17:37:26 +03:00
Vitaliy Filippov e950c024d3 Do not sync peer OSDs before listing
Sync before listing was added to wait for all PG writes possibly left in queue
from the previous master to finish before listing it

But in fact it may block the cluster when EC is used and some unstable writes
are left in the queue - they block journal flushing, rollback/stabilize is
required to unblock them, but rollback/stabilize may only happen after PG is
peered. But peering needs listings, listings are requested only after sync, and
sync itself waits for currently blocked writes waiting in the queue
2023-01-03 00:05:45 +03:00
Vitaliy Filippov 71d6d9f868 Fix possible crash on ENOSPC during operation cancel in blockstore 2023-01-03 00:05:45 +03:00
Vitaliy Filippov 4ebdd02b0f Remove LIST op limiter
It doesn't prevent OSD slow ops but may itself lead to stalls :)
2022-12-26 02:48:48 +03:00
Vitaliy Filippov 552e207d2b Explicitly print errors about -EAGAIN in io_uring 2022-12-17 15:49:49 +03:00
Vitaliy Filippov cb437913d3 Never try to wait for free space inside blockstore 2022-12-12 00:27:05 +03:00
Vitaliy Filippov dfd80626bd Extract disk opening functions to separate module 2022-07-15 01:38:30 +03:00
Vitaliy Filippov 73a363bf92 Rename some variables and constants 2022-07-15 01:38:30 +03:00
Vitaliy Filippov 839ec9e6e0 Shard clean_db by PGs to speedup listings 2022-02-20 00:21:24 +03:00
Vitaliy Filippov 36c276358b Attempt to fix "head-of-line blocking" by LIST operations 2022-02-18 01:31:45 +03:00
Vitaliy Filippov df0cd85352 Fix another part of the "async sqe clear" bug (followup to d9857a5340) 2022-02-01 01:14:56 +03:00
Vitaliy Filippov 7bdd92ca4f Fix build under clang and some warnings
Build problems fixed:
- void* pointer arithmetic which is a GNU extension (works as byte*)
- "variable size object may not be initialized" which is OK under GCC
- nullptr_t related error in json11 (it lacks 'operator <' in clang)

Warnings fixed:
- empty nested struct initializer { 0 } replaced by {}
- removed several unused lambda captures
2022-01-16 00:02:54 +03:00
Vitaliy Filippov f93491bc6c Implement journal write batching and slightly refactor journal writes
Slightly reduces WA. For example, in 4K T1Q128 replicated randwrite tests
WA is reduced from ~3.6 to ~3.1, in T1Q64 from ~3.8 to ~3.4.

Only effective without no_same_sector_overwrites.
2021-12-16 00:27:17 +03:00
Vitaliy Filippov e74af9745e Print journal flusher diagnostics on slow ops 2021-07-17 16:13:41 +03:00
Vitaliy Filippov 2ab423d4ef Implement journaled write throttling for the SSD+HDD case 2021-04-10 17:44:12 +03:00
Vitaliy Filippov d6524670e1 Introduce data distribution locality 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 6909807068 Allow to start the OSD just to flush the journal completely 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 54f2353f24 Use bitmap granularity for alignment checks 2021-04-03 14:36:04 +03:00
Vitaliy Filippov 8f8b90be7a Add min_flusher_count configuration 2021-04-03 00:53:28 +03:00
Vitaliy Filippov c5fb1d5987 Do not duplicate blockstore operations when io_uring fills up
This bug was leading to OSDs dying with "Assertion `fulfilled == read_op->len' failed"
when testing fio -rw=randread -numjobs=8 -iodepth=128
2021-03-16 12:48:26 +03:00
Vitaliy Filippov d1526b415f Correctly resume writes when OSD is full to return an error 2021-03-13 17:19:45 +03:00
Vitaliy Filippov 36c935ace6 Use std::vector for the blockstore submission queue 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 0d8b5e2ef9 Remove unused enqueue_op_first() 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 98f1e2c277 Rework write/sync ordering
Make syncs wait for all previous writes because it's the only way
to make sure that OSDs do not receive incomplete writes in LIST results
during peering when some writes are still in progress.

Also simplify blockstore submission queue logic.
2021-03-08 17:04:10 +03:00
Vitaliy Filippov bf9a175efc Move C/C++ sources to src subdirectory 2021-02-25 23:59:03 +03:00