57 Commits (master)

Author SHA1 Message Date
Vitaliy Filippov a2994ecd0d Fix flusher possibly not trimming journal on rollback 2024-04-05 23:14:39 +03:00
Vitaliy Filippov f20564b44b Fix 32-bit build warnings (99.9% in printf) 2024-02-22 12:22:16 +03:00
Vitaliy Filippov 6cfe38ec04 Followup to empty cur.oid as stop condition for forced trim fix 2024-02-20 15:56:38 +03:00
Vitaliy Filippov 9db2196aef Make journal_trim_interval configurable 2024-02-15 23:38:51 +03:00
Vitaliy Filippov 8d6ae662fe Use empty cur.oid as stop condition for forced trim, not journal_trim_counter 2024-02-15 23:27:17 +03:00
Vitaliy Filippov 38ba76e893 Fix flusher sometimes being unable to trim journal when the flush queue is empty 2024-02-11 13:42:51 +03:00
Vitaliy Filippov e15b6e7805 Fix "cannot be narrowed" in clang
2023-11-04 18:14:44 +03:00
Vitaliy Filippov 4819854064 Fix OSDs incorrectly updating journal superblock after upgrade to 1.x from pre-1.x and refusing to start after it
2023-11-04 15:02:24 +03:00
Vitaliy Filippov 1a4ceb420d Track used blocks, not object versions 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 4181add1f4 Remove creepy "metadata copying" during overwrite
Instead, just skip checksum verification for objects that are currently being mutated.
When clean data is modified during a flush in parallel with a read request,
that request may read a mix of old and new data. It may even read a mix of
multiple flushed versions if it lasts too long. Attempts to verify such reads
using temporary copies of metadata make the algorithm too complex and creepy.
2023-07-29 12:17:18 +03:00
Vitaliy Filippov a8464c19af Support keeping checksums on disk (not in memory)
Definitely beneficial for SSD+HDD setups
2023-07-29 12:17:18 +03:00
Vitaliy Filippov 7bfb1639ea Use find_holes() in flusher for unification 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 9357e5293e Call fill_partial_checksum_blocks() correctly in regard to COPY_BUF_CSUM_FILL 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 12851dc07d Wait for journal reads before checking them in clear_incomplete_csum_block_bits 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 71674d00cf Fix journal data checksum mangling on corrupted block overwrite 2023-07-29 12:17:18 +03:00
Vitaliy Filippov c5274f655b ...and partially remove the perversion with bitmap inlining 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 45e07d6294 Sadly we have to refcount dyn_data... 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 874a766b62 Rename meta_version to meta_format 2023-07-29 12:17:18 +03:00
Vitaliy Filippov e42975ffd1 Fix wait_journal_count not being zeroed 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 92c6e16eba Fix checksum verification in big_write journal reads 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 213a9ccb4d Verify checksums during journal reads 2023-07-29 12:17:18 +03:00
Vitaliy Filippov a166147110 Add backwards compatibility with non-checksum metadata and journal formats 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 7d532880c3 Implement large csum_block_size support (more than 4k) + refactor blockstore_flush 2023-07-29 12:17:18 +03:00
Vitaliy Filippov 0b0405d115 Implement bitmap-granular (4k) metadata & data checksums 2023-07-29 12:17:18 +03:00
Vitaliy Filippov b7e4d0c9bf Fix journal dirty_start position tracking and some debug prints
Fixes two bugs found during HDD testing :-)
1) OSD crashed with "BUG: Attempt to overwrite used offset of the journal" during
   `fio -bs=900k -iodepth=128` test with 16 MB journal
2) OSD stalled during `fio -bs=512k -iodepth=128` test with 64 MB journal
2023-07-09 01:17:55 +03:00
Vitaliy Filippov 86b4682975 Put get_trim_pos into the "critical section". Fixes rare journal corruption issue
The consequence of this issue was that in some very rare cases (only reproduced
under load in CI when running 4+ tests in parallel) small write data written to
journal could overwrite journal entries.

Also add an assert-type safety check to catch this issue again in case of a
future regression.
2023-06-17 00:06:42 +03:00
Vitaliy Filippov f9fbea25a4 Remove double write when old and new locations are in the same metadata block
Also add another metadata entry fool-safety check which, ideally, will never fire %)
2023-06-03 00:47:10 +03:00
Vitaliy Filippov b74ccb613c Fix another variant of flusher sync-waiting stall 2023-04-24 00:44:41 +03:00
Vitaliy Filippov d7bd36dc32 Fix another rare journal flush stall 2022-12-30 02:03:33 +03:00
Vitaliy Filippov 795020674d Loop journal flusher when the queue is empty but there is a trim request 2022-12-27 02:28:20 +03:00
Vitaliy Filippov 49b88b01f9 Fix clang build 2022-12-17 16:25:26 +03:00
Vitaliy Filippov 552e207d2b Explicitly print errors about -EAGAIN in io_uring 2022-12-17 15:49:49 +03:00
Vitaliy Filippov 1a93e3f33a Wait for data writes before fsyncing data if data fsync is enabled 2022-12-16 20:46:55 +03:00
Vitaliy Filippov a276a1f737 Do not copy journal data an additional time when flushing 2022-11-20 00:50:13 +03:00
Vitaliy Filippov ea632367e9 Do not alter dsk.meta_offset/len to skip superblock 2022-07-15 01:38:30 +03:00
Vitaliy Filippov dfd80626bd Extract disk opening functions to separate module 2022-07-15 01:38:30 +03:00
Vitaliy Filippov 839ec9e6e0 Shard clean_db by PGs to speed up listings 2022-02-20 00:21:24 +03:00
Vitaliy Filippov 7bdd92ca4f Fix build under clang and some warnings
Build problems fixed:
- void* pointer arithmetic which is a GNU extension (works as byte*)
- "variable size object may not be initialized" which is OK under GCC
- nullptr_t related error in json11 (it lacks 'operator <' in clang)

Warnings fixed:
- empty nested struct initializer { 0 } replaced by {}
- removed several unused lambda captures
2022-01-16 00:02:54 +03:00
Vitaliy Filippov c6d104ecd6 Print object version on fatal overwrite 2021-12-14 01:57:04 +03:00
Vitaliy Filippov 8398ad0117 Fix #36 - Fix old version data sometimes overriding new version data
Reproduction case:
- v3 = (offset 4kb, length 16kb)
- v2 = (offset 24kb, length 16kb)
- v1 = (offset 16kb, length 16kb)
- At the third step it was inserting 16..24kb instead of 20..24kb
2021-11-27 01:17:45 +03:00
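The reproduction case in 8398ad0117 can be checked by hand with a small sketch. It assumes versions are applied newest-first and that each older version may only fill byte ranges not already covered by a newer one. The helper names `uncovered` and `merge_versions` are hypothetical, the ranges are half-open and given in KB, and this is not the blockstore's actual code.

```python
def uncovered(start, end, covered):
    """Return the parts of [start, end) not overlapped by any range in `covered`."""
    gaps = []
    pos = start
    for c_start, c_end in sorted(covered):
        if c_end <= pos:
            continue  # covered range lies entirely before the current position
        if c_start >= end:
            break  # covered range lies entirely after the requested range
        if c_start > pos:
            gaps.append((pos, c_start))  # hole before this covered range
        pos = max(pos, c_end)
        if pos >= end:
            break
    if pos < end:
        gaps.append((pos, end))  # trailing hole
    return gaps


def merge_versions(versions):
    """Take (offset_kb, length_kb) pairs, newest first, and return the ranges
    each version contributes once newer versions have taken precedence."""
    covered = []
    contributed = []
    for offset, length in versions:
        parts = uncovered(offset, offset + length, covered)
        contributed.append(parts)
        covered.extend(parts)
    return contributed


# v3, v2, v1 from the reproduction case, newest first
versions = [(4, 16), (24, 16), (16, 16)]
result = merge_versions(versions)
for name, parts in zip(("v3", "v2", "v1"), result):
    print(name, parts)
```

Under these assumptions v1 contributes only 20..24 KB, which matches the expected range stated in the commit message. Inserting 16..24 KB would overwrite the 16..20 KB part that belongs to v3.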
Vitaliy Filippov 28bd94d2c2 Make diagnostics slightly better 2021-07-18 01:24:38 +03:00
Vitaliy Filippov 148ff04aa8 Do not lose flusher queue entries when an "older object rescan" happens in parallel with flushing of an older version of another object 2021-07-18 01:20:54 +03:00
Vitaliy Filippov e74af9745e Print journal flusher diagnostics on slow ops 2021-07-17 16:13:41 +03:00
Vitaliy Filippov f684d9101a Refuse to start with old journal version 2021-04-10 17:44:12 +03:00
Vitaliy Filippov ab39ce2bbb Switch back to clean_entry_bitmap_size instead of entry_attr_size because of changed bitmap handling 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 6107a4d07b Add "external" bitmap support to blockstore 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 95c29b9dc3 Add "external" bitmap support to osd_rmw 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 6909807068 Allow starting the OSD just to flush the journal completely 2021-04-10 17:44:12 +03:00
Vitaliy Filippov 52097c4856 Stop flushing when fewer than min_flusher_count operations are available (unless a trim is forced) 2021-04-03 00:53:28 +03:00
Vitaliy Filippov 8f8b90be7a Add min_flusher_count configuration 2021-04-03 00:53:28 +03:00