Vitaliy Filippov
38ba76e893
Fix flusher sometimes being unable to trim journal when the flush queue is empty
2024-02-11 13:42:51 +03:00
Vitaliy Filippov
1e3c4edea0
Print etcd dbSize instead of dbSizeInUse in status
2024-02-11 13:42:51 +03:00
Vitaliy Filippov
e7ac855b07
Fix that EC segfault (1234 -> 5030 partial overwrite)
2024-02-11 13:42:51 +03:00
Vitaliy Filippov
c53357ac45
Add a test for EC segfault with partial overwrite in 1234 -> 5030 rebalance scenario
2024-02-11 13:42:51 +03:00
Vitaliy Filippov
27e9f244ec
Release 1.4.3
Hotfix for hotfix O:-)
- "Write stall fix" was incomplete and EC write stalls could
continue even on 1.4.2. Now they're finally fixed O:-)
- Make monitor ignore statistics of stopped OSDs. Previously, if you stopped all
OSDs, the last total I/O numbers would remain the same indefinitely.
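As an illustration of the monitor change described above, here is a minimal sketch in Python. It is not the monitor's actual code: the function name `aggregate_io_stats`, the counter names, and the data layout are all hypothetical. It only shows the idea of skipping statistics reported by OSDs that are no longer up, so that totals fall to zero when every OSD is stopped.

```python
from typing import Dict, Set


def aggregate_io_stats(
    osd_stats: Dict[int, Dict[str, int]],
    up_osds: Set[int],
) -> Dict[str, int]:
    """Sum per-OSD counters, skipping OSDs that are not currently up.

    osd_stats maps OSD number -> {counter name -> value} (hypothetical layout).
    up_osds is the set of OSD numbers that are currently running.
    """
    total: Dict[str, int] = {}
    for osd_num, counters in osd_stats.items():
        if osd_num not in up_osds:
            # Stale numbers left behind by a stopped OSD: ignore them.
            continue
        for name, value in counters.items():
            total[name] = total.get(name, 0) + value
    return total


# Hypothetical sample data: OSD 2 is stopped but its last stats are still stored.
stats = {
    1: {"read_iops": 100, "write_iops": 50},
    2: {"read_iops": 30, "write_iops": 20},
}
print(aggregate_io_stats(stats, {1}))
print(aggregate_io_stats(stats, set()))
```

With only OSD 1 up, the totals equal OSD 1's counters; with no OSDs up, the result is empty instead of repeating the last known numbers.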
2024-02-09 00:29:31 +03:00
Vitaliy Filippov
8e25a28a08
Ignore down OSDs in monitor statistics aggregation
2024-02-09 00:22:36 +03:00
Vitaliy Filippov
5d3317e4f2
Followup to 1.4.2 write stall fix - sadly, the previous version was not working correctly :)
2024-02-08 19:34:29 +03:00
Vitaliy Filippov
016115c0d4
Release 1.4.2
- Log to systemd by default
- Fix excessive autosyncs after every operation with disabled immediate_commit (introduced in 1.1.0)
- Fix a possible write stall with EC due to the lack of OSD wakeup after stabilizing previous writes
- Change sync operation semantics as a final fix to possible write stalls with EC and disabled immediate_commit
- Sync after deleting data in CLI rm / rm-data if immediate_commit is disabled
- Fix OSDs ignoring syncs & autosyncs for delete operations
- Fix OSD space reporting sometimes adding garbage zeros for deleted inodes (causing extra pool/stats etcd keys for deleted pools)
- Speed up monitor failover - change default etcd_mon_ttl from 30 to 5 seconds
- Speed up operation retries - change default up_wait_retry_interval to 50 ms
- Add patch for libvirt 9.10
2024-02-04 02:23:49 +03:00
Vitaliy Filippov
e026de95d5
Log to systemd by default
2024-02-04 01:21:31 +03:00
Vitaliy Filippov
77c10fd1f8
In fact, do not autosync blockstore when autosync_writes=0
2024-02-03 20:37:36 +03:00
Vitaliy Filippov
581d02e581
Mark secondary OSDs with deletions as dirty to not forget to sync & autosync them
2024-02-03 20:31:08 +03:00
Vitaliy Filippov
f03a9db4d9
Fix OSD space reporting sometimes adding garbage zeros for deleted inodes (causing extra pool/stats etcd keys for deleted pools)
2024-02-03 20:31:08 +03:00
Vitaliy Filippov
cb9c30bc31
Sync after sending all deletes to each PG in cli rm-data
2024-02-03 20:31:08 +03:00
Vitaliy Filippov
a86a380d20
Fix invalid parsing of autosync_writes in blockstore leading to autosyncs after every operation with disabled immediate_commit :D
2024-02-03 20:31:08 +03:00
Vitaliy Filippov
d2b43cb118
Change default etcd_mon_ttl
2024-01-29 23:45:19 +03:00
Vitaliy Filippov
cc76e6876b
Fix flapping "scrub" test
2024-01-28 14:59:33 +03:00
Vitaliy Filippov
1cec62d25d
Sync only completed writes
Should be a final remaining fix to EC + non-capacitor (non-immediate-commit) write hangs :).
At first it broke non-EC ("instantly stable") writes because they sometimes
complete out of order, which led to the following error:
terminate called after throwing an instance of 'std::runtime_error'
what(): BUG: Unexpected dirty_entry 1000000000001:29480000 v65540 unstable state during flush: 0x151
But this is easily fixed by scanning the previous and next dirty_entries in mark_stable.
2024-01-27 15:17:22 +03:00
Vitaliy Filippov
1c322b33ed
Change default up_wait_retry_interval to 50 ms
2024-01-26 01:51:08 +03:00
Vitaliy Filippov
d27524f441
Add patch for libvirt 9.10
2024-01-25 01:09:12 +03:00
Vitaliy Filippov
ba55f91409
Release 1.4.1
- Fix a monitor crash on primary OSD switching introduced in 1.4.0
- Fix "partly outside array bounds" warnings for GCC 12 in cpp-btree
- Fix a realloc memory leak that was theoretically possible with very large listings (OSD_OP_LIST)
2024-01-18 02:31:42 +03:00
Vitaliy Filippov
80aac39513
Add detailed formula for theoretical EC N+K random write performance
2024-01-18 00:36:32 +03:00
Vitaliy Filippov
2aa5aa7ab6
Add a test for simple master switching without PG reconfiguration
Also use osd_out_time:1 only in select tests and restart mon in tests only on connection errors
2024-01-17 00:19:01 +03:00
Vitaliy Filippov
3ca3b8a8d8
Fix recheck_pgs bug introduced in 1.4.0
2024-01-16 23:49:21 +03:00
Vitaliy Filippov
2cf649eba6
Fix "partly outside array bounds" warnings for GCC 12 in cpp-btree
2024-01-15 03:04:33 +03:00
Vitaliy Filippov
5935640a4a
Add CLA PR form
2024-01-14 16:48:24 +03:00
Vitaliy Filippov
d00d4dbac0
Initialize mod_revision field in etcd_state_client
Test / test_interrupted_rebalance_ec (push) Successful in 2m28s
Test / test_rm (push) Successful in 17s
Test / test_move_reappear (push) Successful in 29s
Test / test_snapshot_down (push) Successful in 26s
Test / test_snapshot_down_ec (push) Successful in 26s
Test / test_splitbrain (push) Successful in 16s
Test / test_snapshot_chain (push) Successful in 2m0s
Test / test_rebalance_verify_imm (push) Successful in 2m28s
Test / test_rebalance_verify (push) Successful in 3m0s
Test / test_rebalance_verify_ec (push) Successful in 3m14s
Test / test_write_no_same (push) Successful in 13s
Test / test_rebalance_verify_ec_imm (push) Successful in 3m7s
Test / test_heal_pg_size_2 (push) Successful in 3m33s
Test / test_heal_ec (push) Successful in 4m40s
Test / test_heal_csum_32k_dj (push) Successful in 5m40s
Test / test_heal_csum_32k (push) Successful in 6m8s
Test / test_scrub (push) Successful in 1m4s
Test / test_scrub_zero_osd_2 (push) Successful in 47s
Test / test_heal_csum_4k_dmj (push) Successful in 6m33s
Test / test_heal_csum_4k_dj (push) Successful in 6m28s
Test / test_scrub_xor (push) Successful in 44s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m2s
Test / test_scrub_ec (push) Successful in 42s
Test / test_scrub_pg_size_3 (push) Successful in 1m38s
Test / test_heal_csum_4k (push) Successful in 5m56s
Test / test_interrupted_rebalance (push) Successful in 1m53s
Test / test_snapshot_chain_ec (push) Failing after 3m17s
Test / test_write (push) Failing after 3m15s
Test / test_heal_csum_32k_dmj (push) Successful in 4m6s
Test / test_write_xor (push) Failing after 3m11s
2024-01-13 01:30:28 +03:00
Vitaliy Filippov
5d9d6f32a0
Fix common realloc memory leak mistakes found by cppcheck
2024-01-13 01:30:28 +03:00
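The realloc mistake that cppcheck usually reports is assigning the result of realloc() straight back to the only pointer to the block. A minimal sketch of the leak-free pattern follows; `grow_buffer` is a hypothetical helper written for illustration and is not taken from the Vitastor source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not from the Vitastor code base): resizes a
// malloc()-allocated buffer to new_size bytes.
// Returns true on success. On failure *buf is left untouched, so the
// caller still owns the old block and can free() it.
bool grow_buffer(void **buf, std::size_t new_size)
{
    assert(buf != nullptr && new_size > 0);
    // Leaky pattern: *buf = realloc(*buf, new_size);
    // If realloc() fails it returns NULL without freeing the old block,
    // and the only pointer to that block has just been overwritten.
    void *tmp = std::realloc(*buf, new_size);
    if (!tmp)
        return false;
    *buf = tmp;
    return true;
}
```

The caller checks the return value and, on failure, either keeps using the old buffer or frees it explicitly.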
Vitaliy Filippov
5280d1d561
Release 1.4.0
...
Test / test_snapshot (push) Successful in 26s
Test / test_snapshot_ec (push) Successful in 26s
Test / test_rm (push) Successful in 16s
Test / test_move_reappear (push) Successful in 24s
Test / test_snapshot_down (push) Successful in 26s
Test / test_snapshot_down_ec (push) Successful in 30s
Test / test_splitbrain (push) Successful in 28s
Test / test_snapshot_chain (push) Successful in 2m41s
Test / test_rebalance_verify_imm (push) Successful in 2m48s
Test / test_rebalance_verify (push) Successful in 3m28s
Test / test_write (push) Successful in 47s
Test / test_write_no_same (push) Successful in 14s
Test / test_rebalance_verify_ec_imm (push) Successful in 3m5s
Test / test_rebalance_verify_ec (push) Successful in 3m41s
Test / test_heal_pg_size_2 (push) Successful in 3m45s
Test / test_heal_csum_32k_dmj (push) Successful in 4m52s
Test / test_heal_ec (push) Successful in 5m11s
Test / test_heal_csum_32k_dj (push) Successful in 5m42s
Test / test_heal_csum_32k (push) Successful in 5m56s
Test / test_scrub (push) Successful in 1m25s
Test / test_scrub_zero_osd_2 (push) Successful in 1m18s
Test / test_scrub_xor (push) Successful in 42s
Test / test_heal_csum_4k_dmj (push) Successful in 6m49s
Test / test_heal_csum_4k_dj (push) Successful in 6m32s
Test / test_heal_csum_4k (push) Successful in 5m31s
Test / test_scrub_ec (push) Successful in 50s
Test / test_scrub_pg_size_3 (push) Successful in 1m2s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m5s
Test / test_snapshot_chain_ec (push) Successful in 1m21s
Test / test_write_xor (push) Successful in 36s
New features:
- Intelligent recovery/rebalance speed auto-tuning to reduce its impact on clients (see README -> Features)
- Auto-restoration of dead VDUSE daemons in CSI plugin
- Add vitastor-disk update-sb command
- Update QEMU for Debian Bookworm to 8.1 and use it for CSI plugin
Bug fixes:
- Fix pools SOMETIMES staying inactive after stopping a node: OSDs did not react
to PG state changes because of an incorrect full reload of state from etcd on reconnection
- Make monitors retry pool configuration changes more quickly, which fixes them being unable
to apply changes when an ongoing rebalance is quickly making a lot of PGs clean
- Fix CSI plugin not accepting array of strings as etcd address in /etc/vitastor/vitastor.conf
- Allow multiple interfaces with the same IP address, for "simple routed" full mesh network
- Do not ignore loopback addresses for OSD network (to make ECMP setups with frr possible)
- Fix a rare client crash during OSD reconnections
- Only treat data partitions as existing OSDs in vitastor-disk prepare
- Remove etcd parameter from default command examples
- Fix reported free space sometimes not changing immediately after deletion of data from OSDs
- Fix a possible OSD crash on print_slow when bs_op is NULL
- Use the same etcd_ws_keepalive_interval in mon as in OSD
- Fix mon not using values from config when /config/global is not present
- Remove pve-storage-portal-dns-list format for vitastor_etcd_address
- Parse log_level in cluster_client
- Fix vitastor-nbd image existence check not working because of non-zeroed inode_watch fields
- Do not warn on EPIPE in client unless log_level is raised explicitly
- Fix incorrect error in CSI when searching for the device in /sys
- Remove the last 2 prints to stdout in etcd_state_client
- Fix a possible OSD crash when checking corrupted journal entries
2024-01-12 01:28:33 +03:00
Vitaliy Filippov
317b0feb0a
Add a note about VDUSE daemon auto-restart
2024-01-12 01:27:36 +03:00
Vitaliy Filippov
247f0552db
Fix debug log "killing..." in CSI
2024-01-10 01:19:34 +03:00
Vitaliy Filippov
2f228fa96a
Only treat data partitions as existing OSDs in vitastor-disk prepare
Test / test_interrupted_rebalance_ec (push) Successful in 2m40s
Test / test_rm (push) Successful in 31s
Test / test_move_reappear (push) Successful in 39s
Test / test_snapshot_down (push) Successful in 26s
Test / test_interrupted_rebalance (push) Successful in 4m42s
Test / test_snapshot_down_ec (push) Successful in 26s
Test / test_splitbrain (push) Successful in 22s
Test / test_snapshot_chain (push) Failing after 3m17s
Test / test_snapshot_chain_ec (push) Failing after 3m13s
Test / test_rebalance_verify_imm (push) Successful in 2m51s
Test / test_write (push) Successful in 37s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m37s
Test / test_write_no_same (push) Successful in 18s
Test / test_rebalance_verify_ec (push) Successful in 3m20s
Test / test_write_xor (push) Failing after 3m8s
Test / test_rebalance_verify (push) Successful in 8m20s
Test / test_heal_pg_size_2 (push) Successful in 4m17s
Test / test_heal_ec (push) Successful in 4m59s
Test / test_heal_csum_32k_dmj (push) Successful in 4m15s
Test / test_heal_csum_32k_dj (push) Successful in 5m35s
Test / test_heal_csum_32k (push) Successful in 6m47s
Test / test_heal_csum_4k_dmj (push) Successful in 6m49s
Test / test_scrub (push) Successful in 1m2s
Test / test_scrub_zero_osd_2 (push) Successful in 45s
Test / test_scrub_xor (push) Successful in 40s
Test / test_heal_csum_4k_dj (push) Successful in 7m16s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m9s
Test / test_scrub_ec (push) Successful in 45s
Test / test_heal_csum_4k (push) Successful in 5m26s
Test / test_scrub_pg_size_3 (push) Successful in 1m38s
2023-12-31 11:46:47 +03:00
Vitaliy Filippov
2f6b9c0306
Remove etcd parameter from default command examples
2023-12-31 02:50:41 +03:00
Vitaliy Filippov
48b5f871e0
Add Contributor License Agreement in Russian and English
2023-12-31 01:23:52 +03:00
Vitaliy Filippov
c17f76a3e4
Add documentation for recovery auto-tuning
Test / test_snapshot_ec (push) Successful in 26s
Test / test_move_reappear (push) Successful in 19s
Test / test_rm (push) Successful in 15s
Test / test_snapshot_down (push) Successful in 24s
Test / test_snapshot_down_ec (push) Successful in 26s
Test / test_snapshot_chain (push) Successful in 1m50s
Test / test_splitbrain (push) Successful in 52s
Test / test_snapshot_chain_ec (push) Successful in 2m31s
Test / test_rebalance_verify_imm (push) Successful in 2m28s
Test / test_rebalance_verify (push) Successful in 3m25s
Test / test_rebalance_verify_ec (push) Successful in 3m31s
Test / test_write (push) Successful in 1m17s
Test / test_write_no_same (push) Successful in 17s
Test / test_rebalance_verify_ec_imm (push) Successful in 3m36s
Test / test_heal_pg_size_2 (push) Successful in 4m12s
Test / test_heal_ec (push) Successful in 5m20s
Test / test_heal_csum_32k_dmj (push) Successful in 4m36s
Test / test_heal_csum_32k_dj (push) Successful in 6m11s
Test / test_heal_csum_32k (push) Successful in 6m13s
Test / test_scrub (push) Successful in 56s
Test / test_scrub_zero_osd_2 (push) Successful in 1m6s
Test / test_heal_csum_4k_dj (push) Successful in 6m31s
Test / test_heal_csum_4k_dmj (push) Successful in 6m58s
Test / test_scrub_xor (push) Successful in 43s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m10s
Test / test_scrub_ec (push) Successful in 49s
Test / test_scrub_pg_size_3 (push) Successful in 1m40s
Test / test_heal_csum_4k (push) Successful in 5m59s
Test / test_write_xor (push) Successful in 34s
Test / test_interrupted_rebalance (push) Successful in 1m19s
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
a6ab54b1ba
Do not allow negative util_low/high
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
99ee8596ea
Rename min/max_util to util_low/high
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
c4928e6ecd
Protect from try_send completing the operation immediately
...
Fixes a possible use-after-free in case of continue_ops() calling try_send(),
then connect_peer() -> set_timer() -> trigger_nearest() -> handle_op_part() -> continue_ops() again
2023-12-31 01:23:17 +03:00
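The call chain in the commit above is a re-entrancy hazard: continue_ops() is entered again while an outer call is still on the stack and may still touch an operation the inner call has already finished. One generic way to guard against that is to defer nested calls, sketched below. `op_queue_t` and its members are hypothetical names for illustration only; this is not the fix actually applied in Vitastor.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical operation queue with a re-entrancy guard.
class op_queue_t
{
    std::deque<std::function<void()>> ops;
    bool running = false; // true while continue_ops() is on the stack
    bool rerun = false;   // set when a nested call was deferred
public:
    void push(std::function<void()> op)
    {
        ops.push_back(std::move(op));
    }
    void continue_ops()
    {
        if (running)
        {
            // Nested call: do not touch the queue here, let the outer
            // invocation pick up the new work after the current op returns.
            rerun = true;
            return;
        }
        running = true;
        do
        {
            rerun = false;
            while (!ops.empty())
            {
                // Detach the op from the queue before running it, so nothing
                // executed inside op() can invalidate what we are running.
                std::function<void()> op = std::move(ops.front());
                ops.pop_front();
                op();
            }
        } while (rerun);
        running = false;
    }
    std::size_t pending() const
    {
        return ops.size();
    }
};
```

With this guard, an operation that indirectly triggers continue_ops() only sets a flag; processing continues in the outer loop, so the call depth never exceeds one.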
Vitaliy Filippov
ec7dcd1be5
Do not apply very large recovery pauses during tests
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
e600bbc151
Fix flapping move_reappear test by adding an fsync before stopping PG
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
8b8c1179a7
Use a separate used_blocks counter for free space stats to hide possibly delayed on-flush deallocation
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
d5a6fa6dd7
Fix possible crash on print_slow when bs_op is NULL
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
f757a35a8d
Retry PG changes without re-running lpsolve when pool configuration and OSD tree don't change
...
OSDs often change their /pg/history keys during rebalance, so the monitor receives additional
transaction failures from etcd if it re-runs lpsolve, which may sometimes even leave the monitor
unable to apply PG changes at all until the rebalance completes
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
1edf86ed26
Aggregate recovery delay using simple mean over last 10 observations (EWMA is shit)
2023-12-31 01:23:17 +03:00
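A simple mean over the last 10 observations, as named in the commit above, can be kept in constant time per sample with a ring buffer and a running sum. The sketch below assumes integer delay samples; `window_mean_t` and its methods are hypothetical names, not the structure used in the OSD code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: arithmetic mean over the last `window` observations.
class window_mean_t
{
    std::vector<uint64_t> values; // ring buffer, zero-filled initially
    std::size_t pos = 0;          // slot that the next observation overwrites
    std::size_t count = 0;        // number of valid observations (<= window)
    uint64_t sum = 0;             // sum of the valid observations
public:
    explicit window_mean_t(std::size_t window = 10): values(window, 0)
    {
        assert(window > 0);
    }
    void add(uint64_t v)
    {
        // Until the window fills up, the overwritten slot still holds 0,
        // so subtracting it does not change the sum.
        sum -= values[pos];
        values[pos] = v;
        sum += v;
        pos = (pos + 1) % values.size();
        if (count < values.size())
            count++;
    }
    uint64_t mean() const
    {
        return count ? sum / count : 0;
    }
};
```

Unlike an exponentially weighted average, every sample older than the window stops influencing the result entirely once it is overwritten.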
Vitaliy Filippov
5ca7cde612
Experiment/WIP: Try to track "secondary" recovery ops separately
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
751935ddd8
WIP Auto-tune recovery speed
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
d84dee7098
Track recovery op latencies + refactor into a structure
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
dcc76eee15
Add a parity chunk count change test script
2023-12-26 23:48:41 +03:00
Vitaliy Filippov
2f38adeb3d
Restart dead VDUSE daemons at regular intervals
2023-12-24 12:58:50 +03:00
Vitaliy Filippov
f72f14e6a7
Clear old PG states, history, and OSD states on etcd state reload
...
Test / test_snapshot_ec (push) Successful in 30s
Test / test_interrupted_rebalance_ec_imm (push) Successful in 1m24s
Test / test_rm (push) Successful in 16s
Test / test_snapshot_down (push) Successful in 23s
Test / test_snapshot_down_ec (push) Successful in 25s
Test / test_splitbrain (push) Successful in 21s
Test / test_snapshot_chain (push) Successful in 2m24s
Test / test_snapshot_chain_ec (push) Successful in 3m5s
Test / test_rebalance_verify_imm (push) Successful in 3m21s
Test / test_write (push) Successful in 36s
Test / test_rebalance_verify (push) Successful in 4m12s
Test / test_write_no_same (push) Successful in 15s
Test / test_write_xor (push) Successful in 52s
Test / test_rebalance_verify_ec_imm (push) Successful in 4m29s
Test / test_rebalance_verify_ec (push) Successful in 5m25s
Test / test_heal_pg_size_2 (push) Successful in 4m10s
Test / test_heal_ec (push) Successful in 4m46s
Test / test_heal_csum_32k_dmj (push) Successful in 5m31s
Test / test_heal_csum_32k_dj (push) Successful in 5m41s
Test / test_heal_csum_32k (push) Successful in 6m41s
Test / test_scrub (push) Successful in 1m13s
Test / test_heal_csum_4k_dmj (push) Successful in 6m53s
Test / test_scrub_xor (push) Successful in 54s
Test / test_scrub_zero_osd_2 (push) Successful in 58s
Test / test_heal_csum_4k_dj (push) Successful in 6m27s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m15s
Test / test_scrub_pg_size_3 (push) Successful in 1m27s
Test / test_heal_csum_4k (push) Successful in 6m20s
Test / test_scrub_ec (push) Successful in 29s
Test / test_move_reappear (push) Successful in 17s
Also add protection from etcd watcher messages being split into multiple websocket
messages - I'm not sure if etcd actually does that, but it's better to have extra
protection anyway.
Also check that all etcd watchers are started in the keepalive routine, otherwise
it sometimes tries to revive etcd watchers starting with revision=1 which obviously
always fails because this revision is nearly always compacted.
All these changes should fix an old, rarely reproduced bug where OSDs SOMETIMES
didn't react to PG config changes, which led to offline pools on node reboot.
It happened on the full reload of state from etcd.
2023-12-24 02:02:13 +03:00
Vitaliy Filippov
1299373988
Use the same etcd_ws_keepalive_interval in OSD and mon
Test / test_snapshot_ec (push) Successful in 33s
Test / test_interrupted_rebalance_ec (push) Successful in 1m58s
Test / test_move_reappear (push) Successful in 22s
Test / test_rm (push) Successful in 16s
Test / test_snapshot_down (push) Successful in 32s
Test / test_snapshot_down_ec (push) Successful in 32s
Test / test_splitbrain (push) Successful in 25s
Test / test_snapshot_chain (push) Successful in 2m36s
Test / test_snapshot_chain_ec (push) Failing after 3m8s
Test / test_rebalance_verify_imm (push) Successful in 2m58s
Test / test_rebalance_verify (push) Successful in 3m55s
Test / test_write (push) Successful in 39s
Test / test_write_no_same (push) Successful in 15s
Test / test_rebalance_verify_ec_imm (push) Successful in 3m18s
Test / test_rebalance_verify_ec (push) Successful in 4m8s
Test / test_write_xor (push) Failing after 3m11s
Test / test_heal_pg_size_2 (push) Successful in 3m47s
Test / test_heal_csum_32k_dmj (push) Successful in 4m58s
Test / test_heal_ec (push) Successful in 6m21s
Test / test_heal_csum_32k_dj (push) Successful in 6m11s
Test / test_heal_csum_32k (push) Successful in 6m22s
Test / test_scrub (push) Successful in 1m17s
Test / test_scrub_zero_osd_2 (push) Successful in 1m17s
Test / test_heal_csum_4k_dmj (push) Successful in 6m35s
Test / test_scrub_xor (push) Successful in 57s
Test / test_heal_csum_4k_dj (push) Successful in 6m27s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m3s
Test / test_scrub_pg_size_3 (push) Successful in 1m33s
Test / test_scrub_ec (push) Successful in 44s
Test / test_heal_csum_4k (push) Successful in 6m9s
2023-12-23 20:07:29 +03:00