Test / test_write_xor (push) Successful in 44s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m52s
Test / test_write_no_same (push) Successful in 15s
Test / test_rebalance_verify_ec (push) Successful in 4m19s
Test / test_heal_ec (push) Successful in 6m20s
Test / test_heal_csum_32k (push) Successful in 3m29s
Test / test_scrub (push) Successful in 1m24s
Test / test_scrub_zero_osd_2 (push) Successful in 1m11s
Test / test_heal_csum_4k_dmj (push) Successful in 4m23s
Test / test_scrub_xor (push) Successful in 1m9s
Test / test_heal_csum_4k_dj (push) Successful in 5m29s
Test / test_heal_csum_4k (push) Successful in 5m36s
Test / test_scrub_pg_size_3 (push) Successful in 1m53s
Test / test_scrub_ec (push) Successful in 29s
Test / test_heal_pg_size_2 (push) Successful in 3m9s
Test / test_heal_csum_32k_dmj (push) Successful in 4m13s
Test / test_heal_csum_32k_dj (push) Successful in 4m17s
Test / test_snapshot_chain_ec (push) Successful in 1m25s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Failing after 24s
This should be the last remaining fix for EC + non-capacitor (non-immediate-commit) write hangs :).
At first it broke non-EC ("instantly stable") writes, because they sometimes
complete out of order, which led to the following error:
terminate called after throwing an instance of 'std::runtime_error'
what(): BUG: Unexpected dirty_entry 1000000000001:29480000 v65540 unstable state during flush: 0x151
But that is easily fixed by scanning the previous and next dirty_entries in mark_stable.
Sync before listing was added to wait for all PG writes possibly left in the queue
from the previous master to finish before listing it.
But in fact it may block the cluster when EC is used and some unstable writes
are left in the queue: they block journal flushing, and rollback/stabilize is
required to unblock them, but rollback/stabilize may only happen after the PG is
peered. Peering needs listings, listings are requested only after sync, and
sync itself waits for the currently blocked writes in the queue.
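The circular wait described above can be written down as a tiny wait-for graph. This is only an illustration: the step labels are made up for the sketch, and each step is assumed to wait for exactly one other step.

```cpp
#include <map>
#include <set>
#include <string>

// Each step maps to the single step it waits for (labels are illustrative).
using WaitsFor = std::map<std::string, std::string>;

// The dependencies from the description, with sync-before-list in place.
WaitsFor sync_before_list_deps() {
    return {
        {"list", "sync"},
        {"sync", "unstable writes"},
        {"unstable writes", "rollback/stabilize"},
        {"rollback/stabilize", "peering"},
        {"peering", "list"},
    };
}

// Follow the chain from `node`; returns true if some step is visited twice,
// which means that nothing in the chain can ever make progress.
bool has_wait_cycle(const WaitsFor& deps, std::string node) {
    std::set<std::string> seen;
    while (seen.insert(node).second) {
        auto it = deps.find(node);
        if (it == deps.end())
            return false;
        node = it->second;
    }
    return true;
}
```

has_wait_cycle(sync_before_list_deps(), "sync") returns true. Erasing the "list" entry, so that listing no longer waits for sync, leaves a chain that ends at "list" and the function returns false.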
Build problems fixed:
- void* pointer arithmetic which is a GNU extension (works as byte*)
- "variable size object may not be initialized" which is OK under GCC
- nullptr_t related error in json11 (it lacks 'operator <' in clang)
Warnings fixed:
- empty nested struct initializer { 0 } replaced by {}
- removed several unused lambda captures
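A hedged illustration of the portable forms for three of the items above. The helper and struct names are hypothetical, and std::vector is just one possible replacement for an initialized variable-size array, not necessarily the one used in the fix:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// void* arithmetic is a GNU extension: cast to a byte pointer first,
// which gives the same byte-wise offset in standard C++.
inline void* advance_bytes(void* buf, size_t offset) {
    return static_cast<uint8_t*>(buf) + offset;
}

// `uint8_t buf[n] = { 0 };` with a runtime n is rejected by clang
// ("variable-sized object may not be initialized").
// A vector of n elements is zero-filled and portable.
inline std::vector<uint8_t> make_zeroed(size_t n) {
    return std::vector<uint8_t>(n);
}

// Hypothetical struct with a nested struct as its first member.
struct Inner { int a; int b; };
struct Outer { Inner inner; int count; };

// `Outer o = { 0 };` may warn about missing braces around the nested
// struct; `{}` value-initializes every member without the warning.
inline Outer make_outer() {
    Outer o = {};
    return o;
}
```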
Slightly reduces write amplification (WA). For example, in 4K T1Q128 replicated
randwrite tests WA is reduced from ~3.6 to ~3.1, and in T1Q64 from ~3.8 to ~3.4.
Only effective without no_same_sector_overwrites.
Make syncs wait for all previous writes because it's the only way
to make sure that OSDs do not receive incomplete writes in LIST results
during peering when some writes are still in progress.
Also simplify blockstore submission queue logic.
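The ordering rule can be sketched as a check over the submission queue. Op, OpType and sync_can_start are hypothetical names for this illustration, not the real blockstore structures:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified queue entry.
enum class OpType { Write, Sync };

struct Op {
    OpType type;
    bool done;
};

// A sync may start only when every write queued before it has completed,
// so nothing half-written can be observed once the sync finishes.
bool sync_can_start(const std::vector<Op>& queue, size_t sync_index) {
    for (size_t i = 0; i < sync_index && i < queue.size(); i++) {
        if (queue[i].type == OpType::Write && !queue[i].done)
            return false;
    }
    return true;
}
```

For a queue of [write done, write in progress, sync] the sync at index 2 has to wait. Once the second write completes, sync_can_start returns true.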