Compare commits

...

23 Commits

Author SHA1 Message Date
27e9f244ec Release 1.4.3
All checks were successful
Test / test_move_reappear (push) Successful in 22s
Test / test_rm (push) Successful in 15s
Test / test_snapshot_down (push) Successful in 36s
Test / test_snapshot_down_ec (push) Successful in 30s
Test / test_interrupted_rebalance (push) Successful in 5m3s
Test / test_splitbrain (push) Successful in 20s
Test / test_snapshot_chain (push) Successful in 3m1s
Test / test_snapshot_chain_ec (push) Successful in 3m13s
Test / test_rebalance_verify_imm (push) Successful in 3m0s
Test / test_rebalance_verify (push) Successful in 3m29s
Test / test_switch_primary (push) Successful in 37s
Test / test_write (push) Successful in 44s
Test / test_write_xor (push) Successful in 39s
Test / test_write_no_same (push) Successful in 16s
Test / test_rebalance_verify_ec_imm (push) Successful in 4m13s
Test / test_rebalance_verify_ec (push) Successful in 5m31s
Test / test_heal_ec (push) Successful in 4m54s
Test / test_heal_csum_32k_dj (push) Successful in 5m25s
Test / test_heal_csum_32k (push) Successful in 6m8s
Test / test_heal_csum_4k_dmj (push) Successful in 6m17s
Test / test_scrub (push) Successful in 1m8s
Test / test_scrub_zero_osd_2 (push) Successful in 55s
Test / test_scrub_xor (push) Successful in 45s
Test / test_heal_csum_4k_dj (push) Successful in 6m22s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m11s
Test / test_scrub_ec (push) Successful in 46s
Test / test_scrub_pg_size_3 (push) Successful in 1m39s
Test / test_heal_csum_4k (push) Successful in 6m8s
Test / test_heal_csum_32k_dmj (push) Successful in 4m15s
Test / test_heal_pg_size_2 (push) Successful in 4m41s
Hotfix for hotfix O:-)

- "Write stall fix" was incomplete and EC write stalls could
  continue even on 1.4.2. Now they're finally fixed O:-)
- Make monitor ignore statistics of stopped OSDs. Previously if you stopped all
  OSDs the last total I/O numbers would remain the same indefinitely
2024-02-09 00:29:31 +03:00
8e25a28a08 Ignore down OSDs in monitor statistics aggregation
Some checks failed
Test / test_move_reappear (push) Successful in 20s
Test / test_interrupted_rebalance_ec (push) Successful in 2m50s
Test / test_snapshot_down (push) Successful in 22s
Test / test_snapshot_down_ec (push) Successful in 25s
Test / test_splitbrain (push) Successful in 18s
Test / test_snapshot_chain (push) Successful in 2m10s
Test / test_snapshot_chain_ec (push) Failing after 3m8s
Test / test_rebalance_verify (push) Successful in 3m6s
Test / test_interrupted_rebalance (push) Failing after 10m52s
Test / test_rebalance_verify_imm (push) Successful in 5m28s
Test / test_switch_primary (push) Successful in 37s
Test / test_write (push) Successful in 42s
Test / test_write_xor (push) Successful in 38s
Test / test_write_no_same (push) Successful in 16s
Test / test_rebalance_verify_ec (push) Successful in 6m7s
Test / test_rebalance_verify_ec_imm (push) Successful in 6m3s
Test / test_heal_pg_size_2 (push) Successful in 4m12s
Test / test_heal_ec (push) Successful in 5m20s
Test / test_heal_csum_32k_dmj (push) Successful in 4m53s
Test / test_heal_csum_32k_dj (push) Successful in 5m23s
Test / test_heal_csum_32k (push) Successful in 5m59s
Test / test_scrub_zero_osd_2 (push) Has been cancelled
Test / test_scrub_xor (push) Has been cancelled
Test / test_heal_csum_4k_dmj (push) Has been cancelled
Test / test_scrub_pg_size_3 (push) Has been cancelled
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Has been cancelled
Test / test_scrub_ec (push) Has been cancelled
Test / test_scrub (push) Has been cancelled
Test / test_heal_csum_4k_dj (push) Has been cancelled
Test / test_heal_csum_4k (push) Has been cancelled
2024-02-09 00:22:36 +03:00
5d3317e4f2 Followup to 1.4.2 write stall fix - sadly, the previous version was not working correctly :)
All checks were successful
Test / test_move_reappear (push) Successful in 19s
Test / test_snapshot_chain (push) Successful in 1m21s
Test / test_snapshot_down (push) Successful in 23s
Test / test_snapshot_chain_ec (push) Successful in 1m50s
Test / test_snapshot_down_ec (push) Successful in 22s
Test / test_splitbrain (push) Successful in 16s
Test / test_etcd_fail (push) Successful in 6m42s
Test / test_rebalance_verify_imm (push) Successful in 2m19s
Test / test_rebalance_verify (push) Successful in 4m7s
Test / test_switch_primary (push) Successful in 36s
Test / test_write (push) Successful in 35s
Test / test_rebalance_verify_ec (push) Successful in 4m6s
Test / test_write_no_same (push) Successful in 22s
Test / test_write_xor (push) Successful in 1m34s
Test / test_rebalance_verify_ec_imm (push) Successful in 6m7s
Test / test_heal_csum_32k_dmj (push) Successful in 4m7s
Test / test_heal_csum_32k_dj (push) Successful in 4m59s
Test / test_heal_csum_32k (push) Successful in 5m4s
Test / test_heal_csum_4k_dmj (push) Successful in 5m59s
Test / test_scrub (push) Successful in 1m9s
Test / test_scrub_zero_osd_2 (push) Successful in 37s
Test / test_scrub_xor (push) Successful in 52s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m5s
Test / test_heal_csum_4k_dj (push) Successful in 5m12s
Test / test_heal_csum_4k (push) Successful in 5m1s
Test / test_scrub_pg_size_3 (push) Successful in 1m48s
Test / test_scrub_ec (push) Successful in 19s
Test / test_interrupted_rebalance (push) Successful in 1m38s
Test / test_heal_pg_size_2 (push) Successful in 3m20s
Test / test_heal_ec (push) Successful in 3m3s
2024-02-08 19:34:29 +03:00
016115c0d4 Release 1.4.2
All checks were successful
Test / test_rm (push) Successful in 16s
Test / test_move_reappear (push) Successful in 20s
Test / test_snapshot_down (push) Successful in 25s
Test / test_snapshot_down_ec (push) Successful in 39s
Test / test_interrupted_rebalance (push) Successful in 4m52s
Test / test_splitbrain (push) Successful in 20s
Test / test_snapshot_chain (push) Successful in 3m11s
Test / test_rebalance_verify_imm (push) Successful in 3m16s
Test / test_rebalance_verify (push) Successful in 3m45s
Test / test_switch_primary (push) Successful in 36s
Test / test_write (push) Successful in 40s
Test / test_write_xor (push) Successful in 40s
Test / test_write_no_same (push) Successful in 17s
Test / test_rebalance_verify_ec_imm (push) Successful in 3m8s
Test / test_rebalance_verify_ec (push) Successful in 5m57s
Test / test_heal_pg_size_2 (push) Successful in 4m22s
Test / test_heal_csum_32k_dmj (push) Successful in 4m20s
Test / test_heal_ec (push) Successful in 5m54s
Test / test_heal_csum_32k_dj (push) Successful in 5m24s
Test / test_heal_csum_32k (push) Successful in 6m3s
Test / test_heal_csum_4k_dmj (push) Successful in 5m54s
Test / test_scrub_zero_osd_2 (push) Successful in 53s
Test / test_scrub (push) Successful in 55s
Test / test_heal_csum_4k_dj (push) Successful in 6m14s
Test / test_scrub_xor (push) Successful in 1m1s
Test / test_scrub_pg_size_3 (push) Successful in 1m50s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 57s
Test / test_scrub_ec (push) Successful in 52s
Test / test_heal_csum_4k (push) Successful in 5m47s
Test / test_snapshot_chain_ec (push) Successful in 1m24s
- Log to systemd by default
- Fix excessive autosyncs after every operation with disabled immediate_commit (introduced in 1.1.0)
- Fix a possible write stall with EC due to the lack of OSD wakeup after stabilizing previous writes
- Change sync operation semantics as a final fix to possible write stalls with EC and disabled immediate_commit
- Sync after deleting data in CLI rm / rm-data if immediate_commit is disabled
- Fix OSDs ignoring syncs & autosyncs for delete operations
- Fix OSD space reporting sometimes adding garbage zeros for deleted inodes (causing extra pool/stats etcd keys for deleted pools)
- Speed up monitor failover - change default etcd_mon_ttl from 30 to 5 seconds
- Speed up operation retries - change default up_wait_retry_interval to 50 ms
- Add patch for libvirt 9.10
2024-02-04 02:23:49 +03:00
e026de95d5 Log to systemd by default
Some checks failed
Test / test_move_reappear (push) Successful in 20s
Test / test_etcd_fail (push) Successful in 5m19s
Test / test_snapshot_chain (push) Successful in 1m26s
Test / test_snapshot_down (push) Successful in 26s
Test / test_snapshot_down_ec (push) Successful in 28s
Test / test_splitbrain (push) Successful in 19s
Test / test_snapshot_chain_ec (push) Failing after 3m8s
Test / test_interrupted_rebalance (push) Successful in 7m44s
Test / test_rebalance_verify_imm (push) Successful in 3m11s
Test / test_switch_primary (push) Successful in 34s
Test / test_write (push) Successful in 34s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m41s
Test / test_rebalance_verify_ec (push) Successful in 3m18s
Test / test_write_no_same (push) Successful in 22s
Test / test_write_xor (push) Successful in 1m41s
Test / test_heal_pg_size_2 (push) Failing after 3m54s
Test / test_rebalance_verify (push) Successful in 9m38s
Test / test_heal_csum_32k_dmj (push) Successful in 4m4s
Test / test_heal_csum_32k_dj (push) Successful in 4m23s
Test / test_heal_csum_32k (push) Successful in 5m24s
Test / test_heal_ec (push) Failing after 10m18s
Test / test_heal_csum_4k_dmj (push) Successful in 5m31s
Test / test_scrub (push) Successful in 1m18s
Test / test_scrub_zero_osd_2 (push) Successful in 1m0s
Test / test_scrub_xor (push) Successful in 51s
Test / test_heal_csum_4k_dj (push) Successful in 5m10s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 48s
Test / test_scrub_pg_size_3 (push) Successful in 1m59s
Test / test_scrub_ec (push) Successful in 48s
Test / test_heal_csum_4k (push) Successful in 4m39s
2024-02-04 01:21:31 +03:00
77c10fd1f8 In fact, do not autosync blockstore when autosync_writes=0
All checks were successful
Test / test_move_reappear (push) Successful in 19s
Test / test_rm (push) Successful in 14s
Test / test_snapshot_down (push) Successful in 24s
Test / test_snapshot_down_ec (push) Successful in 25s
Test / test_splitbrain (push) Successful in 17s
Test / test_snapshot_chain (push) Successful in 1m57s
Test / test_snapshot_chain_ec (push) Successful in 2m41s
Test / test_rebalance_verify (push) Successful in 3m5s
Test / test_rebalance_verify_imm (push) Successful in 2m26s
Test / test_switch_primary (push) Successful in 45s
Test / test_write (push) Successful in 33s
Test / test_write_xor (push) Successful in 33s
Test / test_rebalance_verify_ec (push) Successful in 3m42s
Test / test_write_no_same (push) Successful in 14s
Test / test_rebalance_verify_ec_imm (push) Successful in 4m57s
Test / test_heal_csum_32k_dmj (push) Successful in 4m24s
Test / test_heal_csum_32k_dj (push) Successful in 4m29s
Test / test_heal_csum_4k_dmj (push) Successful in 5m10s
Test / test_heal_csum_32k (push) Successful in 5m13s
Test / test_scrub (push) Successful in 1m5s
Test / test_scrub_zero_osd_2 (push) Successful in 1m1s
Test / test_scrub_xor (push) Successful in 1m2s
Test / test_heal_csum_4k_dj (push) Successful in 5m2s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 57s
Test / test_scrub_ec (push) Successful in 50s
Test / test_scrub_pg_size_3 (push) Successful in 2m1s
Test / test_heal_csum_4k (push) Successful in 4m40s
Test / test_interrupted_rebalance (push) Successful in 1m38s
Test / test_heal_pg_size_2 (push) Successful in 4m2s
Test / test_heal_ec (push) Successful in 5m17s
2024-02-03 20:37:36 +03:00
581d02e581 Mark secondary OSDs with deletions as dirty to not forget to sync & autosync them
Some checks reported warnings
Test / test_change_pg_count (push) Has been cancelled
Test / test_rm (push) Has been cancelled
Test / test_snapshot_chain (push) Has been cancelled
Test / test_snapshot_chain_ec (push) Has been cancelled
Test / test_snapshot_down (push) Has been cancelled
Test / test_snapshot_down_ec (push) Has been cancelled
Test / test_splitbrain (push) Has been cancelled
Test / test_rebalance_verify (push) Has been cancelled
Test / test_rebalance_verify_imm (push) Has been cancelled
Test / test_rebalance_verify_ec (push) Has been cancelled
Test / test_rebalance_verify_ec_imm (push) Has been cancelled
Test / test_switch_primary (push) Has been cancelled
Test / test_write (push) Has been cancelled
Test / test_write_xor (push) Has been cancelled
Test / test_write_no_same (push) Has been cancelled
Test / test_heal_pg_size_2 (push) Has been cancelled
Test / test_heal_ec (push) Has been cancelled
Test / test_heal_csum_32k_dmj (push) Has been cancelled
Test / test_cas (push) Has been cancelled
Test / test_heal_csum_32k_dj (push) Has been cancelled
Test / test_heal_csum_32k (push) Has been cancelled
Test / test_heal_csum_4k_dmj (push) Has been cancelled
Test / test_heal_csum_4k_dj (push) Has been cancelled
Test / test_heal_csum_4k (push) Has been cancelled
Test / test_scrub (push) Has been cancelled
Test / test_scrub_zero_osd_2 (push) Has been cancelled
Test / test_scrub_xor (push) Has been cancelled
Test / test_scrub_pg_size_3 (push) Has been cancelled
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Has been cancelled
Test / test_scrub_ec (push) Has been cancelled
2024-02-03 20:31:08 +03:00
f03a9db4d9 Fix OSD space reporting sometimes adding garbage zeros for deleted inodes (causing extra pool/stats etcd keys for deleted pools) 2024-02-03 20:31:08 +03:00
cb9c30bc31 Sync after sending all deletes to each PG in cli rm-data 2024-02-03 20:31:08 +03:00
a86a380d20 Fix invalid parsing of autosync_writes in blockstore leading to autosyncs after every operation with disabled immediate_commit :D 2024-02-03 20:31:08 +03:00
d2b43cb118 Change default etcd_mon_ttl
Some checks failed
Test / test_move_reappear (push) Successful in 35s
Test / test_interrupted_rebalance_ec (push) Successful in 3m29s
Test / test_interrupted_rebalance (push) Successful in 4m47s
Test / test_snapshot_down (push) Successful in 29s
Test / test_snapshot_down_ec (push) Successful in 25s
Test / test_splitbrain (push) Successful in 24s
Test / test_snapshot_chain (push) Successful in 2m46s
Test / test_snapshot_chain_ec (push) Failing after 3m10s
Test / test_rebalance_verify_imm (push) Successful in 4m24s
Test / test_rebalance_verify (push) Successful in 4m54s
Test / test_switch_primary (push) Successful in 35s
Test / test_rebalance_verify_ec_imm (push) Successful in 3m38s
Test / test_write (push) Successful in 46s
Test / test_write_xor (push) Successful in 49s
Test / test_write_no_same (push) Successful in 18s
Test / test_rebalance_verify_ec (push) Successful in 7m14s
Test / test_heal_pg_size_2 (push) Successful in 4m10s
Test / test_heal_csum_32k_dmj (push) Successful in 4m10s
Test / test_heal_csum_32k_dj (push) Successful in 4m52s
Test / test_heal_csum_32k (push) Successful in 5m20s
Test / test_heal_csum_4k_dmj (push) Successful in 5m8s
Test / test_heal_ec (push) Failing after 10m21s
Test / test_scrub (push) Successful in 1m2s
Test / test_scrub_xor (push) Successful in 54s
Test / test_scrub_zero_osd_2 (push) Successful in 1m4s
Test / test_heal_csum_4k_dj (push) Successful in 4m48s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m26s
Test / test_scrub_ec (push) Successful in 50s
Test / test_scrub_pg_size_3 (push) Failing after 2m5s
Test / test_heal_csum_4k (push) Successful in 4m33s
2024-01-29 23:45:19 +03:00
cc76e6876b Fix flapping "scrub" test
All checks were successful
Test / test_rm (push) Successful in 16s
Test / test_interrupted_rebalance_ec (push) Successful in 2m53s
Test / test_snapshot_down (push) Successful in 29s
Test / test_snapshot_down_ec (push) Successful in 38s
Test / test_splitbrain (push) Successful in 25s
Test / test_interrupted_rebalance (push) Successful in 5m46s
Test / test_snapshot_chain (push) Successful in 2m59s
Test / test_rebalance_verify_imm (push) Successful in 2m36s
Test / test_rebalance_verify (push) Successful in 3m22s
Test / test_switch_primary (push) Successful in 36s
Test / test_write (push) Successful in 32s
Test / test_write_no_same (push) Successful in 17s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m46s
Test / test_write_xor (push) Successful in 40s
Test / test_rebalance_verify_ec (push) Successful in 4m46s
Test / test_heal_pg_size_2 (push) Successful in 4m23s
Test / test_heal_csum_32k_dmj (push) Successful in 4m40s
Test / test_heal_csum_32k_dj (push) Successful in 6m45s
Test / test_scrub (push) Successful in 1m1s
Test / test_scrub_zero_osd_2 (push) Successful in 43s
Test / test_scrub_xor (push) Successful in 35s
Test / test_heal_csum_4k (push) Successful in 4m14s
Test / test_scrub_pg_size_3 (push) Successful in 1m19s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 35s
Test / test_scrub_ec (push) Successful in 21s
Test / test_heal_csum_32k (push) Successful in 4m48s
Test / test_heal_csum_4k_dmj (push) Successful in 4m48s
Test / test_heal_csum_4k_dj (push) Successful in 4m27s
Test / test_snapshot_chain_ec (push) Successful in 2m29s
Test / test_heal_ec (push) Successful in 3m7s
2024-01-28 14:59:33 +03:00
1cec62d25d Sync only completed writes
Some checks failed
Test / test_move_reappear (push) Successful in 21s
Test / test_rm (push) Successful in 16s
Test / test_snapshot_down (push) Successful in 25s
Test / test_snapshot_down_ec (push) Successful in 35s
Test / test_splitbrain (push) Successful in 24s
Test / test_interrupted_rebalance (push) Successful in 5m14s
Test / test_snapshot_chain (push) Successful in 2m50s
Test / test_rebalance_verify_imm (push) Successful in 2m47s
Test / test_rebalance_verify (push) Successful in 3m42s
Test / test_switch_primary (push) Successful in 33s
Test / test_write (push) Successful in 42s
Test / test_write_xor (push) Successful in 44s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m52s
Test / test_write_no_same (push) Successful in 15s
Test / test_rebalance_verify_ec (push) Successful in 4m19s
Test / test_heal_ec (push) Successful in 6m20s
Test / test_heal_csum_32k (push) Successful in 3m29s
Test / test_scrub (push) Successful in 1m24s
Test / test_scrub_zero_osd_2 (push) Successful in 1m11s
Test / test_heal_csum_4k_dmj (push) Successful in 4m23s
Test / test_scrub_xor (push) Successful in 1m9s
Test / test_heal_csum_4k_dj (push) Successful in 5m29s
Test / test_heal_csum_4k (push) Successful in 5m36s
Test / test_scrub_pg_size_3 (push) Successful in 1m53s
Test / test_scrub_ec (push) Successful in 29s
Test / test_heal_pg_size_2 (push) Successful in 3m9s
Test / test_heal_csum_32k_dmj (push) Successful in 4m13s
Test / test_heal_csum_32k_dj (push) Successful in 4m17s
Test / test_snapshot_chain_ec (push) Successful in 1m25s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Failing after 24s
Should be a final remaining fix to EC + non-capacitor (non-immediate-commit) write hangs :).

First it was breaking non-EC ("instantly stable") writes because they sometimes
complete out of order which was leading to the following error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  BUG: Unexpected dirty_entry 1000000000001:29480000 v65540 unstable state during flush: 0x151

But it is easily fixed by scanning previous and next dirty_entries in mark_stable.
2024-01-27 15:17:22 +03:00
1c322b33ed Change default up_wait_retry_interval to 50 ms
Some checks failed
Test / test_rm (push) Successful in 14s
Test / test_interrupted_rebalance_ec (push) Successful in 3m59s
Test / test_snapshot_chain (push) Successful in 1m34s
Test / test_snapshot_down (push) Successful in 25s
Test / test_snapshot_down_ec (push) Successful in 29s
Test / test_splitbrain (push) Successful in 19s
Test / test_snapshot_chain_ec (push) Successful in 2m35s
Test / test_interrupted_rebalance (push) Successful in 8m15s
Test / test_rebalance_verify_imm (push) Successful in 3m54s
Test / test_switch_primary (push) Successful in 36s
Test / test_write (push) Successful in 35s
Test / test_rebalance_verify_ec (push) Successful in 4m48s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m51s
Test / test_write_no_same (push) Successful in 14s
Test / test_write_xor (push) Failing after 3m9s
Test / test_heal_pg_size_2 (push) Successful in 3m55s
Test / test_heal_ec (push) Successful in 3m50s
Test / test_rebalance_verify (push) Failing after 9m30s
Test / test_heal_csum_32k_dmj (push) Failing after 5m40s
Test / test_heal_csum_32k_dj (push) Successful in 6m12s
Test / test_heal_csum_32k (push) Successful in 6m25s
Test / test_heal_csum_4k_dmj (push) Successful in 6m56s
Test / test_scrub (push) Successful in 1m4s
Test / test_scrub_zero_osd_2 (push) Successful in 55s
Test / test_scrub_xor (push) Successful in 56s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m19s
Test / test_scrub_pg_size_3 (push) Failing after 2m14s
Test / test_heal_csum_4k_dj (push) Successful in 5m53s
Test / test_scrub_ec (push) Successful in 1m1s
Test / test_heal_csum_4k (push) Successful in 5m17s
2024-01-26 01:51:08 +03:00
d27524f441 Add patch for libvirt 9.10 2024-01-25 01:09:12 +03:00
ba55f91409 Release 1.4.1
Some checks failed
Test / test_move_reappear (push) Successful in 22s
Test / test_snapshot_chain (push) Successful in 1m27s
Test / test_interrupted_rebalance_ec (push) Successful in 4m41s
Test / test_snapshot_down (push) Successful in 25s
Test / test_snapshot_chain_ec (push) Successful in 2m0s
Test / test_splitbrain (push) Successful in 18s
Test / test_snapshot_down_ec (push) Successful in 25s
Test / test_rebalance_verify_ec (push) Failing after 2m21s
Test / test_rebalance_verify_imm (push) Successful in 2m30s
Test / test_switch_primary (push) Successful in 39s
Test / test_write (push) Successful in 35s
Test / test_interrupted_rebalance (push) Failing after 10m8s
Test / test_write_xor (push) Successful in 36s
Test / test_write_no_same (push) Successful in 17s
Test / test_rebalance_verify_ec_imm (push) Successful in 4m4s
Test / test_heal_pg_size_2 (push) Successful in 3m55s
Test / test_rebalance_verify (push) Successful in 8m31s
Test / test_heal_ec (push) Successful in 5m9s
Test / test_heal_csum_32k_dmj (push) Successful in 4m27s
Test / test_heal_csum_32k (push) Successful in 5m42s
Test / test_heal_csum_32k_dj (push) Successful in 6m1s
Test / test_scrub (push) Successful in 59s
Test / test_scrub_zero_osd_2 (push) Successful in 38s
Test / test_heal_csum_4k_dmj (push) Successful in 7m5s
Test / test_scrub_xor (push) Successful in 58s
Test / test_heal_csum_4k_dj (push) Successful in 6m25s
Test / test_scrub_ec (push) Failing after 42s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m32s
Test / test_scrub_pg_size_3 (push) Successful in 1m38s
Test / test_heal_csum_4k (push) Successful in 5m38s
- Fix a monitor crash on primary OSD switching introduced in 1.4.0
- Fix "partly outside array bounds" warnings for GCC 12 in cpp-btree
- Fix a realloc memory leak in theory possible with too large listings (OSD_OP_LIST)
2024-01-18 02:31:42 +03:00
80aac39513 Add detailed formula for theoretical EC N+K random write performance 2024-01-18 00:36:32 +03:00
2aa5aa7ab6 Add a test for simple master switching without PG reconfiguration
All checks were successful
Test / test_move_reappear (push) Successful in 20s
Test / test_snapshot_chain (push) Successful in 1m27s
Test / test_snapshot_down (push) Successful in 23s
Test / test_snapshot_chain_ec (push) Successful in 1m56s
Test / test_snapshot_down_ec (push) Successful in 23s
Test / test_splitbrain (push) Successful in 17s
Test / test_interrupted_rebalance_ec (push) Successful in 6m40s
Test / test_interrupted_rebalance (push) Successful in 8m12s
Test / test_rebalance_verify_imm (push) Successful in 3m12s
Test / test_switch_primary (push) Successful in 34s
Test / test_write (push) Successful in 46s
Test / test_rebalance_verify_ec (push) Successful in 3m18s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m42s
Test / test_write_no_same (push) Successful in 15s
Test / test_rebalance_verify (push) Successful in 6m36s
Test / test_heal_ec (push) Successful in 5m2s
Test / test_heal_csum_32k_dmj (push) Successful in 4m33s
Test / test_heal_csum_32k_dj (push) Successful in 5m58s
Test / test_heal_csum_32k (push) Successful in 6m6s
Test / test_scrub (push) Successful in 47s
Test / test_heal_csum_4k_dmj (push) Successful in 6m17s
Test / test_scrub_zero_osd_2 (push) Successful in 43s
Test / test_scrub_xor (push) Successful in 47s
Test / test_heal_csum_4k_dj (push) Successful in 6m44s
Test / test_scrub_ec (push) Successful in 41s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m18s
Test / test_scrub_pg_size_3 (push) Successful in 2m11s
Test / test_heal_csum_4k (push) Successful in 6m12s
Test / test_heal_pg_size_2 (push) Successful in 3m16s
Test / test_write_xor (push) Successful in 34s
Also use osd_out_time:1 only in select tests and restart mon in tests only on connection errors
2024-01-17 00:19:01 +03:00
3ca3b8a8d8 Fix recheck_pgs bug introduced in 1.4.0
Some checks failed
Test / test_rm (push) Successful in 14s
Test / test_interrupted_rebalance_ec (push) Successful in 3m27s
Test / test_snapshot_chain (push) Successful in 1m24s
Test / test_snapshot_down (push) Successful in 25s
Test / test_snapshot_chain_ec (push) Successful in 1m54s
Test / test_snapshot_down_ec (push) Successful in 20s
Test / test_splitbrain (push) Successful in 15s
Test / test_rebalance_verify_imm (push) Successful in 2m42s
Test / test_etcd_fail (push) Failing after 10m8s
Test / test_interrupted_rebalance (push) Failing after 10m9s
Test / test_write (push) Successful in 1m22s
Test / test_rebalance_verify_ec (push) Failing after 1m51s
Test / test_write_no_same (push) Successful in 16s
Test / test_rebalance_verify_ec_imm (push) Successful in 3m27s
Test / test_write_xor (push) Failing after 3m13s
Test / test_heal_pg_size_2 (push) Successful in 3m22s
Test / test_rebalance_verify (push) Failing after 10m9s
Test / test_heal_ec (push) Successful in 4m41s
Test / test_heal_csum_32k_dmj (push) Successful in 4m42s
Test / test_heal_csum_32k_dj (push) Successful in 4m58s
Test / test_heal_csum_32k (push) Successful in 6m34s
Test / test_scrub (push) Successful in 54s
Test / test_heal_csum_4k_dmj (push) Successful in 6m56s
Test / test_scrub_zero_osd_2 (push) Successful in 49s
Test / test_heal_csum_4k_dj (push) Successful in 6m1s
Test / test_scrub_ec (push) Has been cancelled
Test / test_heal_csum_4k (push) Has been cancelled
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Has been cancelled
Test / test_scrub_xor (push) Has been cancelled
Test / test_scrub_pg_size_3 (push) Has been cancelled
2024-01-16 23:49:21 +03:00
2cf649eba6 Fix "partly outside array bounds" warnings for GCC 12 in cpp-btree
Some checks failed
Test / test_move_reappear (push) Successful in 19s
Test / test_rm (push) Successful in 13s
Test / test_snapshot_down (push) Successful in 25s
Test / test_snapshot_down_ec (push) Successful in 25s
Test / test_splitbrain (push) Successful in 19s
Test / test_snapshot_chain (push) Successful in 2m13s
Test / test_interrupted_rebalance (push) Successful in 7m36s
Test / test_rebalance_verify (push) Successful in 3m35s
Test / test_write (push) Successful in 1m0s
Test / test_rebalance_verify_imm (push) Successful in 4m4s
Test / test_rebalance_verify_ec (push) Successful in 3m13s
Test / test_rebalance_verify_ec_imm (push) Successful in 2m35s
Test / test_write_no_same (push) Successful in 14s
Test / test_heal_pg_size_2 (push) Successful in 4m32s
Test / test_heal_csum_32k_dmj (push) Successful in 4m29s
Test / test_heal_ec (push) Successful in 5m47s
Test / test_heal_csum_32k_dj (push) Successful in 5m47s
Test / test_heal_csum_4k_dmj (push) Successful in 6m4s
Test / test_heal_csum_32k (push) Successful in 6m19s
Test / test_scrub (push) Successful in 56s
Test / test_scrub_zero_osd_2 (push) Successful in 43s
Test / test_heal_csum_4k_dj (push) Successful in 6m14s
Test / test_scrub_xor (push) Successful in 53s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 57s
Test / test_scrub_ec (push) Successful in 47s
Test / test_heal_csum_4k (push) Successful in 5m56s
Test / test_minsize_1 (push) Successful in 14s
Test / test_scrub_pg_size_3 (push) Successful in 46s
Test / test_snapshot_chain_ec (push) Successful in 1m40s
Test / test_write_xor (push) Failing after 3m6s
2024-01-15 03:04:33 +03:00
5935640a4a Add CLA PR form 2024-01-14 16:48:24 +03:00
d00d4dbac0 Initialize mod_revision field in etcd_state_client
Some checks failed
Test / test_interrupted_rebalance_ec (push) Successful in 2m28s
Test / test_rm (push) Successful in 17s
Test / test_move_reappear (push) Successful in 29s
Test / test_snapshot_down (push) Successful in 26s
Test / test_snapshot_down_ec (push) Successful in 26s
Test / test_splitbrain (push) Successful in 16s
Test / test_snapshot_chain (push) Successful in 2m0s
Test / test_rebalance_verify_imm (push) Successful in 2m28s
Test / test_rebalance_verify (push) Successful in 3m0s
Test / test_rebalance_verify_ec (push) Successful in 3m14s
Test / test_write_no_same (push) Successful in 13s
Test / test_rebalance_verify_ec_imm (push) Successful in 3m7s
Test / test_heal_pg_size_2 (push) Successful in 3m33s
Test / test_heal_ec (push) Successful in 4m40s
Test / test_heal_csum_32k_dj (push) Successful in 5m40s
Test / test_heal_csum_32k (push) Successful in 6m8s
Test / test_scrub (push) Successful in 1m4s
Test / test_scrub_zero_osd_2 (push) Successful in 47s
Test / test_heal_csum_4k_dmj (push) Successful in 6m33s
Test / test_heal_csum_4k_dj (push) Successful in 6m28s
Test / test_scrub_xor (push) Successful in 44s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 1m2s
Test / test_scrub_ec (push) Successful in 42s
Test / test_scrub_pg_size_3 (push) Successful in 1m38s
Test / test_heal_csum_4k (push) Successful in 5m56s
Test / test_interrupted_rebalance (push) Successful in 1m53s
Test / test_snapshot_chain_ec (push) Failing after 3m17s
Test / test_write (push) Failing after 3m15s
Test / test_heal_csum_32k_dmj (push) Successful in 4m6s
Test / test_write_xor (push) Failing after 3m11s
2024-01-13 01:30:28 +03:00
5d9d6f32a0 Fix common realloc memory leak mistakes found by cppcheck 2024-01-13 01:30:28 +03:00
49 changed files with 956 additions and 106 deletions

View File

@@ -532,6 +532,24 @@ jobs:
echo ""
done
test_switch_primary:
runs-on: ubuntu-latest
needs: build
container: ${{env.TEST_IMAGE}}:${{github.sha}}
steps:
- name: Run test
id: test
timeout-minutes: 3
run: /root/vitastor/tests/test_switch_primary.sh
- name: Print logs
if: always() && steps.test.outcome == 'failure'
run: |
for i in /root/vitastor/testdata/*.log /root/vitastor/testdata/*.txt; do
echo "-------- $i --------"
cat $i
echo ""
done
test_write:
runs-on: ubuntu-latest
needs: build

View File

@@ -38,7 +38,7 @@ in the offer.
on behalf of third parties, including on behalf of his employer.
2. Subject of the Agreement. \
2.1. Subject of the Agreement shall be the Contributions sent to the Author by Contributors.
2.1. Subject of the Agreement shall be the Contributions sent to the Author by Contributors. \
2.2. The Contributor grants to the Author the right to use Contributions at his own
discretion and without any necessity to get a prior approval from Contributor or
any other third party in any way, under a simple (non-exclusive), royalty-free,
@@ -86,7 +86,7 @@ in the offer.
of their provision to the Author. \
5.2. The Contributor represents and warrants that he legally owns exclusive
intellectual property rights to the Contributions. \
5.3. The Contributor represents and warrants that any further use of \
5.3. The Contributor represents and warrants that any further use of
Contributions by the Author as provided by Contributor under the terms
of the Agreement does not infringe on intellectual and other rights and
legitimate interests of third parties. \

View File

@@ -2,6 +2,6 @@ cmake_minimum_required(VERSION 2.8.12)
project(vitastor)
set(VERSION "1.4.0")
set(VERSION "1.4.3")
add_subdirectory(src)

View File

@@ -1,4 +1,4 @@
VERSION ?= v1.4.0
VERSION ?= v1.4.3
all: build push

View File

@@ -49,7 +49,7 @@ spec:
capabilities:
add: ["SYS_ADMIN"]
allowPrivilegeEscalation: true
image: vitalif/vitastor-csi:v1.4.0
image: vitalif/vitastor-csi:v1.4.3
args:
- "--node=$(NODE_ID)"
- "--endpoint=$(CSI_ENDPOINT)"

View File

@@ -121,7 +121,7 @@ spec:
privileged: true
capabilities:
add: ["SYS_ADMIN"]
image: vitalif/vitastor-csi:v1.4.0
image: vitalif/vitastor-csi:v1.4.3
args:
- "--node=$(NODE_ID)"
- "--endpoint=$(CSI_ENDPOINT)"

View File

@@ -5,7 +5,7 @@ package vitastor
const (
vitastorCSIDriverName = "csi.vitastor.io"
vitastorCSIDriverVersion = "1.4.0"
vitastorCSIDriverVersion = "1.4.3"
)
// Config struct fills the parameters of request or user input

2
debian/changelog vendored
View File

@@ -1,4 +1,4 @@
vitastor (1.4.0-1) unstable; urgency=medium
vitastor (1.4.3-1) unstable; urgency=medium
* Bugfixes

View File

@@ -35,8 +35,8 @@ RUN set -e -x; \
mkdir -p /root/packages/vitastor-$REL; \
rm -rf /root/packages/vitastor-$REL/*; \
cd /root/packages/vitastor-$REL; \
cp -r /root/vitastor vitastor-1.4.0; \
cd vitastor-1.4.0; \
cp -r /root/vitastor vitastor-1.4.3; \
cd vitastor-1.4.3; \
ln -s /root/fio-build/fio-*/ ./fio; \
FIO=$(head -n1 fio/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
ls /usr/include/linux/raw.h || cp ./debian/raw.h /usr/include/linux/raw.h; \
@@ -49,8 +49,8 @@ RUN set -e -x; \
rm -rf a b; \
echo "dep:fio=$FIO" > debian/fio_version; \
cd /root/packages/vitastor-$REL; \
tar --sort=name --mtime='2020-01-01' --owner=0 --group=0 --exclude=debian -cJf vitastor_1.4.0.orig.tar.xz vitastor-1.4.0; \
cd vitastor-1.4.0; \
tar --sort=name --mtime='2020-01-01' --owner=0 --group=0 --exclude=debian -cJf vitastor_1.4.3.orig.tar.xz vitastor-1.4.3; \
cd vitastor-1.4.3; \
V=$(head -n1 debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
DEBFULLNAME="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v "$V""$REL" "Rebuild for $REL"; \
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage --jobs=auto -sa; \

View File

@@ -1,7 +1,7 @@
- name: etcd_mon_ttl
type: sec
min: 10
default: 30
min: 5
default: 1
info: Monitor etcd lease refresh interval in seconds
info_ru: Интервал обновления etcd резервации (lease) монитором
- name: etcd_mon_timeout

View File

@@ -245,8 +245,8 @@
повторная попытка соединения.
- name: up_wait_retry_interval
type: ms
min: 50
default: 500
min: 10
default: 50
online: true
info: |
OSDs respond to clients with a special error code when they receive I/O

View File

@@ -11,19 +11,26 @@ Replicated setups:
- Single-threaded write+fsync latency:
- With immediate commit: 2 network roundtrips + 1 disk write.
- With lazy commit: 4 network roundtrips + 1 disk write + 1 disk flush.
- Saturated parallel read iops: min(network bandwidth, sum(disk read iops)).
- Saturated parallel write iops: min(network bandwidth, sum(disk write iops / number of replicas / write amplification)).
- Linear read: `min(total network bandwidth, sum(disk read MB/s))`.
- Linear write: `min(total network bandwidth, sum(disk write MB/s / number of replicas))`.
- Saturated parallel read iops: `min(total network bandwidth, sum(disk read iops))`.
- Saturated parallel write iops: `min(total network bandwidth / number of replicas, sum(disk write iops / number of replicas / (write amplification = 4)))`.
EC/XOR setups:
EC/XOR setups (EC N+K):
- Single-threaded (T1Q1) read latency: 1.5 network roundtrips + 1 disk read.
- Single-threaded write+fsync latency:
- With immediate commit: 3.5 network roundtrips + 1 disk read + 2 disk writes.
- With lazy commit: 5.5 network roundtrips + 1 disk read + 2 disk writes + 2 disk fsyncs.
- 0.5 in actually (k-1)/k which means that an additional roundtrip doesn't happen when
- 0.5 in actually `(N-1)/N` which means that an additional roundtrip doesn't happen when
the read sub-operation can be served locally.
- Saturated parallel read iops: min(network bandwidth, sum(disk read iops)).
- Saturated parallel write iops: min(network bandwidth, sum(disk write iops * number of data drives / (number of data + parity drives) / write amplification)).
In fact, you should put disk write iops under the condition of ~10% reads / ~90% writes in this formula.
- Linear read: `min(total network bandwidth, sum(disk read MB/s))`.
- Linear write: `min(total network bandwidth, sum(disk write MB/s * N/(N+K)))`.
- Saturated parallel read iops: `min(total network bandwidth, sum(disk read iops))`.
- Saturated parallel write iops: roughly `total iops / (N+K) / WA`. More exactly,
`min(total network bandwidth * N/(N+K), sum(disk randrw iops / (N*4 + K*5 + 1)))` with
random read/write mix corresponding to `(N-1)/(N*4 + K*5 + 1)*100 % reads`.
- For example, with EC 2+1 it is: `(7% randrw iops) / 14`.
- With EC 6+3 it is: `(12.5% randrw iops) / 40`.
Write amplification for 4 KB blocks is usually 3-5 in Vitastor:
1. Journal block write

View File

@@ -11,20 +11,27 @@
- Запись+fsync в 1 поток:
- С мгновенным сбросом: 2 RTT + 1 запись.
- С отложенным ("ленивым") сбросом: 4 RTT + 1 запись + 1 fsync.
- Параллельное чтение: сумма IOPS всех дисков либо производительность сети, если в сеть упрётся раньше.
- Параллельная запись: сумма IOPS всех дисков / число реплик / WA либо производительность сети, если в сеть упрётся раньше.
- Линейное чтение: сумма МБ/с чтения всех дисков, либо общая производительность сети (сумма пропускной способности сети всех нод), если в сеть упрётся раньше.
- Линейная запись: сумма МБ/с записи всех дисков / число реплик, либо производительность сети / число реплик, если в сеть упрётся раньше.
- Параллельное случайное мелкое чтение: сумма IOPS чтения всех дисков, либо производительность сети, если в сеть упрётся раньше.
- Параллельная случайная мелкая запись: сумма IOPS записи всех дисков / число реплик / WA, либо производительность сети / число реплик, если в сеть упрётся раньше.
При использовании кодов коррекции ошибок (EC):
При использовании кодов коррекции ошибок (EC N+K):
- Задержка чтения в 1 поток (T1Q1): 1.5 RTT + 1 чтение.
- Запись+fsync в 1 поток:
- С мгновенным сбросом: 3.5 RTT + 1 чтение + 2 записи.
- С отложенным ("ленивым") сбросом: 5.5 RTT + 1 чтение + 2 записи + 2 fsync.
- Под 0.5 на самом деле подразумевается (k-1)/k, где k - число дисков данных,
- Под 0.5 на самом деле подразумевается (N-1)/N, где N - число дисков данных,
что означает, что дополнительное обращение по сети не нужно, когда операция
чтения обслуживается локально.
- Параллельное чтение: сумма IOPS всех дисков либо производительность сети, если в сеть упрётся раньше.
- Параллельная запись: сумма IOPS всех дисков / общее число дисков данных и чётности / WA либо производительность сети, если в сеть упрётся раньше.
Примечание: IOPS дисков в данном случае надо брать в смешанном режиме чтения/записи в пропорции, аналогичной формулам выше.
- Линейное чтение: сумма МБ/с чтения всех дисков, либо общая производительность сети, если в сеть упрётся раньше.
- Линейная запись: сумма МБ/с записи всех дисков * N/(N+K), либо производительность сети * N / (N+K), если в сеть упрётся раньше.
- Параллельное случайное мелкое чтение: сумма IOPS чтения всех дисков либо производительность сети, если в сеть упрётся раньше.
- Параллельная случайная мелкая запись: грубо `(сумма IOPS / (N+K) / WA)`. Если точнее, то:
сумма смешанного IOPS всех дисков при `(N-1)/(N*4 + K*5 + 1)*100 %` чтения, делённая на `(N*4 + K*5 + 1)`.
Либо, производительность сети * N/(N+K), если в сеть упрётся раньше.
- Например, при EC 2+1 это: `(сумма IOPS при 7% чтения) / 14`.
- При EC 6+3 это: `(сумма IOPS при 12.5% чтения) / 40`.
WA (мультипликатор записи) для 4 КБ блоков в Vitastor обычно составляет 3-5:
1. Запись метаданных в журнал

View File

@@ -55,7 +55,7 @@ const etcd_tree = {
// etcd connection - configurable online
etcd_address: "10.0.115.10:2379/v3",
// mon
etcd_mon_ttl: 30, // min: 10
etcd_mon_ttl: 5, // min: 1
etcd_mon_timeout: 1000, // ms. min: 0
etcd_mon_retries: 5, // min: 0
mon_change_timeout: 1000, // ms. min: 100
@@ -92,7 +92,7 @@ const etcd_tree = {
peer_connect_timeout: 5, // seconds. min: 1
osd_idle_timeout: 5, // seconds. min: 1
osd_ping_timeout: 5, // seconds. min: 1
up_wait_retry_interval: 500, // ms. min: 50
up_wait_retry_interval: 50, // ms. min: 10
max_etcd_attempts: 5,
etcd_quick_timeout: 1000, // ms
etcd_slow_timeout: 5000, // ms
@@ -390,7 +390,8 @@ class Mon
{
constructor(config)
{
this.die = (e) => this._die(e);
this.failconnect = (e) => this._die(e, 2);
this.die = (e) => this._die(e, 1);
if (fs.existsSync(config.config_path||'/etc/vitastor/vitastor.conf'))
{
config = {
@@ -479,10 +480,10 @@ class Mon
check_config()
{
this.config.etcd_mon_ttl = Number(this.config.etcd_mon_ttl) || 30;
if (this.config.etcd_mon_ttl < 10)
this.config.etcd_mon_ttl = Number(this.config.etcd_mon_ttl) || 5;
if (this.config.etcd_mon_ttl < 1)
{
this.config.etcd_mon_ttl = 10;
this.config.etcd_mon_ttl = 1;
}
this.config.etcd_mon_timeout = Number(this.config.etcd_mon_timeout) || 0;
if (this.config.etcd_mon_timeout <= 0)
@@ -604,7 +605,7 @@ class Mon
}
if (!this.ws)
{
this.die('Failed to open etcd watch websocket');
this.failconnect('Failed to open etcd watch websocket');
}
const cur_addr = this.selected_etcd_url;
this.ws_alive = true;
@@ -674,7 +675,12 @@ class Mon
{
this.parse_kv(e.kv);
const key = e.kv.key.substr(this.etcd_prefix.length);
if (key.substr(0, 11) == '/osd/stats/' || key.substr(0, 10) == '/pg/stats/' || key.substr(0, 16) == '/osd/inodestats/')
if (key.substr(0, 11) == '/osd/state/')
{
stats_changed = true;
changed = true;
}
else if (key.substr(0, 11) == '/osd/stats/' || key.substr(0, 10) == '/pg/stats/' || key.substr(0, 16) == '/osd/inodestats/')
{
stats_changed = true;
}
@@ -791,9 +797,9 @@ class Mon
const res = await this.etcd_call('/lease/keepalive', { ID: this.etcd_lease_id }, this.config.etcd_mon_timeout, this.config.etcd_mon_retries);
if (!res.result.TTL)
{
this.die('Lease expired');
this.failconnect('Lease expired');
}
}, this.config.etcd_mon_timeout);
}, this.config.etcd_mon_ttl*1000);
if (!this.signals_set)
{
process.on('SIGINT', this.on_stop_cb);
@@ -1414,7 +1420,14 @@ class Mon
}
if (changed)
{
await this.save_pg_config(new_config_pgs);
const ok = await this.save_pg_config(new_config_pgs);
if (ok)
console.log('PG configuration successfully changed');
else
{
console.log('Someone changed PG configuration while we also tried to change it. Retrying in '+this.config.mon_change_timeout+' ms');
this.schedule_recheck();
}
}
}
this.recheck_pgs_active = false;
@@ -1495,6 +1508,11 @@ class Mon
this.save_new_pgs_txn(new_config_pgs, etcd_request, pool_id, up_osds, osd_tree, real_prev_pgs, pool_res.pgs, pg_history);
}
new_config_pgs.hash = tree_hash;
return await this.save_pg_config(new_config_pgs, etcd_request);
}
async save_pg_config(new_config_pgs, etcd_request = { compare: [], success: [] })
{
etcd_request.compare.push(
{ key: b64(this.etcd_prefix+'/mon/master'), target: 'LEASE', lease: ''+this.etcd_lease_id },
{ key: b64(this.etcd_prefix+'/config/pgs'), target: 'MOD', mod_revision: ''+this.etcd_watch_revision, result: 'LESS' },
@@ -1622,9 +1640,13 @@ class Mon
}
const sum_diff = { op_stats: {}, subop_stats: {}, recovery_stats: {} };
// Sum derived values instead of deriving summed
for (const osd in this.state.osd.stats)
for (const osd in this.state.osd.state)
{
const derived = this.prev_stats.osd_diff[osd];
if (!this.state.osd.state[osd] || !derived)
{
continue;
}
for (const type in sum_diff)
{
for (const op in derived[type]||{})
@@ -1725,9 +1747,13 @@ class Mon
const used = this.state.pool.stats[pool_id].used_raw_tb;
this.state.pool.stats[pool_id].used_raw_tb = Number(used)/1024/1024/1024/1024;
}
for (const osd_num in this.state.osd.inodestats)
for (const osd_num in this.state.osd.state)
{
const ist = this.state.osd.inodestats[osd_num];
if (!ist || !this.state.osd.state[osd_num])
{
continue;
}
for (const pool_id in ist)
{
inode_stats[pool_id] = inode_stats[pool_id] || {};
@@ -1743,9 +1769,14 @@ class Mon
}
}
}
for (const osd in this.prev_stats.osd_diff)
for (const osd in this.state.osd.state)
{
for (const pool_id in this.prev_stats.osd_diff[osd].inode_stats)
const osd_diff = this.prev_stats.osd_diff[osd];
if (!osd_diff || !this.state.osd.state[osd])
{
continue;
}
for (const pool_id in osd_diff.inode_stats)
{
for (const inode_num in this.prev_stats.osd_diff[osd].inode_stats[pool_id])
{
@@ -1985,14 +2016,14 @@ class Mon
return res.json;
}
}
this.die();
this.failconnect();
}
_die(err)
_die(err, code)
{
// In fact we can just try to rejoin
console.error(new Error(err || 'Cluster connection failed'));
process.exit(1);
process.exit(code || 2);
}
local_ips(all)

View File

@@ -1,6 +1,6 @@
{
"name": "vitastor-mon",
"version": "1.4.0",
"version": "1.4.3",
"description": "Vitastor SDS monitor service",
"main": "mon-main.js",
"scripts": {

View File

@@ -8,7 +8,9 @@ PartOf=vitastor.target
LimitNOFILE=1048576
LimitNPROC=1048576
LimitMEMLOCK=infinity
ExecStart=bash -c 'exec vitastor-disk exec-osd /dev/vitastor/osd%i-data >>/var/log/vitastor/osd%i.log 2>&1'
# Use the following for direct logs to files
#ExecStart=bash -c 'exec vitastor-disk exec-osd /dev/vitastor/osd%i-data >>/var/log/vitastor/osd%i.log 2>&1'
ExecStart=vitastor-disk exec-osd /dev/vitastor/osd%i-data
ExecStartPre=+vitastor-disk pre-exec /dev/vitastor/osd%i-data
WorkingDirectory=/
User=vitastor

View File

@@ -50,7 +50,7 @@ from cinder.volume import configuration
from cinder.volume import driver
from cinder.volume import volume_utils
VERSION = '1.4.0'
VERSION = '1.4.3'
LOG = logging.getLogger(__name__)

View File

@@ -0,0 +1,643 @@
commit c1cd026e211e94b120028e7c98a6e4ce5afe9846
Author: Vitaliy Filippov <vitalif@yourcmc.ru>
Date: Wed Jan 24 22:04:50 2024 +0300
Add Vitastor support
diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h
index aaad4a3da1..5f5daa8341 100644
--- a/include/libvirt/libvirt-storage.h
+++ b/include/libvirt/libvirt-storage.h
@@ -326,6 +326,7 @@ typedef enum {
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS = 1 << 17, /* (Since: 1.2.8) */
VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE = 1 << 18, /* (Since: 3.1.0) */
VIR_CONNECT_LIST_STORAGE_POOLS_ISCSI_DIRECT = 1 << 19, /* (Since: 5.6.0) */
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR = 1 << 20, /* (Since: 5.0.0) */
} virConnectListAllStoragePoolsFlags;
int virConnectListAllStoragePools(virConnectPtr conn,
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 22ad43e1d7..56c81d6852 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -7185,7 +7185,8 @@ virDomainDiskSourceNetworkParse(xmlNodePtr node,
src->configFile = virXPathString("string(./config/@file)", ctxt);
if (src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTP ||
- src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTPS)
+ src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTPS ||
+ src->protocol == VIR_STORAGE_NET_PROTOCOL_VITASTOR)
src->query = virXMLPropString(node, "query");
if (virDomainStorageNetworkParseHosts(node, ctxt, &src->hosts, &src->nhosts) < 0)
@@ -30618,6 +30619,7 @@ virDomainStorageSourceTranslateSourcePool(virStorageSource *src,
case VIR_STORAGE_POOL_MPATH:
case VIR_STORAGE_POOL_RBD:
+ case VIR_STORAGE_POOL_VITASTOR:
case VIR_STORAGE_POOL_SHEEPDOG:
case VIR_STORAGE_POOL_GLUSTER:
case VIR_STORAGE_POOL_LAST:
diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c
index c72108886e..c739ed6c43 100644
--- a/src/conf/domain_validate.c
+++ b/src/conf/domain_validate.c
@@ -495,6 +495,7 @@ virDomainDiskDefValidateSourceChainOne(const virStorageSource *src)
case VIR_STORAGE_NET_PROTOCOL_RBD:
break;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_NBD:
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
@@ -541,7 +542,7 @@ virDomainDiskDefValidateSourceChainOne(const virStorageSource *src)
}
}
- /* internal snapshots and config files are currently supported only with rbd: */
+ /* internal snapshots are currently supported only with rbd: */
if (virStorageSourceGetActualType(src) != VIR_STORAGE_TYPE_NETWORK &&
src->protocol != VIR_STORAGE_NET_PROTOCOL_RBD) {
if (src->snapshot) {
@@ -549,10 +550,15 @@ virDomainDiskDefValidateSourceChainOne(const virStorageSource *src)
_("<snapshot> element is currently supported only with 'rbd' disks"));
return -1;
}
+ }
+ /* config files are currently supported only with rbd and vitastor: */
+ if (virStorageSourceGetActualType(src) != VIR_STORAGE_TYPE_NETWORK &&
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_RBD &&
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_VITASTOR) {
if (src->configFile) {
virReportError(VIR_ERR_XML_ERROR, "%s",
- _("<config> element is currently supported only with 'rbd' disks"));
+ _("<config> element is currently supported only with 'rbd' and 'vitastor' disks"));
return -1;
}
}
diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng
index b98a2ae602..7d7a872e01 100644
--- a/src/conf/schemas/domaincommon.rng
+++ b/src/conf/schemas/domaincommon.rng
@@ -1997,6 +1997,35 @@
</element>
</define>
+ <define name="diskSourceNetworkProtocolVitastor">
+ <element name="source">
+ <interleave>
+ <attribute name="protocol">
+ <value>vitastor</value>
+ </attribute>
+ <ref name="diskSourceCommon"/>
+ <optional>
+ <attribute name="name"/>
+ </optional>
+ <optional>
+ <attribute name="query"/>
+ </optional>
+ <zeroOrMore>
+ <ref name="diskSourceNetworkHost"/>
+ </zeroOrMore>
+ <optional>
+ <element name="config">
+ <attribute name="file">
+ <ref name="absFilePath"/>
+ </attribute>
+ <empty/>
+ </element>
+ </optional>
+ <empty/>
+ </interleave>
+ </element>
+ </define>
+
<define name="diskSourceNetworkProtocolISCSI">
<element name="source">
<attribute name="protocol">
@@ -2347,6 +2376,7 @@
<ref name="diskSourceNetworkProtocolSimple"/>
<ref name="diskSourceNetworkProtocolVxHS"/>
<ref name="diskSourceNetworkProtocolNFS"/>
+ <ref name="diskSourceNetworkProtocolVitastor"/>
</choice>
</define>
diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c
index 68842004b7..1d69a788b6 100644
--- a/src/conf/storage_conf.c
+++ b/src/conf/storage_conf.c
@@ -56,7 +56,7 @@ VIR_ENUM_IMPL(virStoragePool,
"logical", "disk", "iscsi",
"iscsi-direct", "scsi", "mpath",
"rbd", "sheepdog", "gluster",
- "zfs", "vstorage",
+ "zfs", "vstorage", "vitastor",
);
VIR_ENUM_IMPL(virStoragePoolFormatFileSystem,
@@ -242,6 +242,18 @@ static virStoragePoolTypeInfo poolTypeInfo[] = {
.formatToString = virStorageFileFormatTypeToString,
}
},
+ {.poolType = VIR_STORAGE_POOL_VITASTOR,
+ .poolOptions = {
+ .flags = (VIR_STORAGE_POOL_SOURCE_HOST |
+ VIR_STORAGE_POOL_SOURCE_NETWORK |
+ VIR_STORAGE_POOL_SOURCE_NAME),
+ },
+ .volOptions = {
+ .defaultFormat = VIR_STORAGE_FILE_RAW,
+ .formatFromString = virStorageVolumeFormatFromString,
+ .formatToString = virStorageFileFormatTypeToString,
+ }
+ },
{.poolType = VIR_STORAGE_POOL_SHEEPDOG,
.poolOptions = {
.flags = (VIR_STORAGE_POOL_SOURCE_HOST |
@@ -538,6 +550,11 @@ virStoragePoolDefParseSource(xmlXPathContextPtr ctxt,
_("element 'name' is mandatory for RBD pool"));
return -1;
}
+ if (pool_type == VIR_STORAGE_POOL_VITASTOR && source->name == NULL) {
+ virReportError(VIR_ERR_XML_ERROR, "%s",
+ _("element 'name' is mandatory for Vitastor pool"));
+ return -1;
+ }
if (options->formatFromString) {
g_autofree char *format = NULL;
@@ -1127,6 +1144,7 @@ virStoragePoolDefFormatBuf(virBuffer *buf,
/* RBD, Sheepdog, Gluster and Iscsi-direct devices are not local block devs nor
* files, so they don't have a target */
if (def->type != VIR_STORAGE_POOL_RBD &&
+ def->type != VIR_STORAGE_POOL_VITASTOR &&
def->type != VIR_STORAGE_POOL_SHEEPDOG &&
def->type != VIR_STORAGE_POOL_GLUSTER &&
def->type != VIR_STORAGE_POOL_ISCSI_DIRECT) {
diff --git a/src/conf/storage_conf.h b/src/conf/storage_conf.h
index fc67957cfe..720c07ef74 100644
--- a/src/conf/storage_conf.h
+++ b/src/conf/storage_conf.h
@@ -103,6 +103,7 @@ typedef enum {
VIR_STORAGE_POOL_GLUSTER, /* Gluster device */
VIR_STORAGE_POOL_ZFS, /* ZFS */
VIR_STORAGE_POOL_VSTORAGE, /* Virtuozzo Storage */
+ VIR_STORAGE_POOL_VITASTOR, /* Vitastor */
VIR_STORAGE_POOL_LAST,
} virStoragePoolType;
@@ -454,6 +455,7 @@ VIR_ENUM_DECL(virStoragePartedFs);
VIR_CONNECT_LIST_STORAGE_POOLS_SCSI | \
VIR_CONNECT_LIST_STORAGE_POOLS_MPATH | \
VIR_CONNECT_LIST_STORAGE_POOLS_RBD | \
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR | \
VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG | \
VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER | \
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS | \
diff --git a/src/conf/storage_source_conf.c b/src/conf/storage_source_conf.c
index f974a521b1..cd394d0a9f 100644
--- a/src/conf/storage_source_conf.c
+++ b/src/conf/storage_source_conf.c
@@ -88,6 +88,7 @@ VIR_ENUM_IMPL(virStorageNetProtocol,
"ssh",
"vxhs",
"nfs",
+ "vitastor",
);
@@ -1301,6 +1302,7 @@ virStorageSourceNetworkDefaultPort(virStorageNetProtocol protocol)
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
return 24007;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_RBD:
/* we don't provide a default for RBD */
return 0;
diff --git a/src/conf/storage_source_conf.h b/src/conf/storage_source_conf.h
index 5e7d127453..283709eeb3 100644
--- a/src/conf/storage_source_conf.h
+++ b/src/conf/storage_source_conf.h
@@ -129,6 +129,7 @@ typedef enum {
VIR_STORAGE_NET_PROTOCOL_SSH,
VIR_STORAGE_NET_PROTOCOL_VXHS,
VIR_STORAGE_NET_PROTOCOL_NFS,
+ VIR_STORAGE_NET_PROTOCOL_VITASTOR,
VIR_STORAGE_NET_PROTOCOL_LAST
} virStorageNetProtocol;
diff --git a/src/conf/virstorageobj.c b/src/conf/virstorageobj.c
index 59fa5da372..4739167f5f 100644
--- a/src/conf/virstorageobj.c
+++ b/src/conf/virstorageobj.c
@@ -1438,6 +1438,7 @@ virStoragePoolObjSourceFindDuplicateCb(const void *payload,
return 1;
break;
+ case VIR_STORAGE_POOL_VITASTOR:
case VIR_STORAGE_POOL_ISCSI_DIRECT:
case VIR_STORAGE_POOL_RBD:
case VIR_STORAGE_POOL_LAST:
@@ -1921,6 +1922,8 @@ virStoragePoolObjMatch(virStoragePoolObj *obj,
(obj->def->type == VIR_STORAGE_POOL_MPATH)) ||
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_RBD) &&
(obj->def->type == VIR_STORAGE_POOL_RBD)) ||
+ (MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR) &&
+ (obj->def->type == VIR_STORAGE_POOL_VITASTOR)) ||
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG) &&
(obj->def->type == VIR_STORAGE_POOL_SHEEPDOG)) ||
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER) &&
diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c
index db7660aac4..561df34709 100644
--- a/src/libvirt-storage.c
+++ b/src/libvirt-storage.c
@@ -94,6 +94,7 @@ virStoragePoolGetConnect(virStoragePoolPtr pool)
* VIR_CONNECT_LIST_STORAGE_POOLS_SCSI
* VIR_CONNECT_LIST_STORAGE_POOLS_MPATH
* VIR_CONNECT_LIST_STORAGE_POOLS_RBD
+ * VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR
* VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG
* VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER
* VIR_CONNECT_LIST_STORAGE_POOLS_ZFS
diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 62e1be6672..71a1d42896 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -979,6 +979,7 @@ libxlMakeNetworkDiskSrcStr(virStorageSource *src,
case VIR_STORAGE_NET_PROTOCOL_SSH:
case VIR_STORAGE_NET_PROTOCOL_VXHS:
case VIR_STORAGE_NET_PROTOCOL_NFS:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_LAST:
case VIR_STORAGE_NET_PROTOCOL_NONE:
virReportError(VIR_ERR_NO_SUPPORT,
diff --git a/src/libxl/xen_xl.c b/src/libxl/xen_xl.c
index f175359307..8efcf4c329 100644
--- a/src/libxl/xen_xl.c
+++ b/src/libxl/xen_xl.c
@@ -1456,6 +1456,7 @@ xenFormatXLDiskSrcNet(virStorageSource *src)
case VIR_STORAGE_NET_PROTOCOL_SSH:
case VIR_STORAGE_NET_PROTOCOL_VXHS:
case VIR_STORAGE_NET_PROTOCOL_NFS:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_LAST:
case VIR_STORAGE_NET_PROTOCOL_NONE:
virReportError(VIR_ERR_NO_SUPPORT,
diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c
index 7e9daf0bdc..825b4a3006 100644
--- a/src/qemu/qemu_block.c
+++ b/src/qemu/qemu_block.c
@@ -758,6 +758,38 @@ qemuBlockStorageSourceGetRBDProps(virStorageSource *src,
}
+static virJSONValue *
+qemuBlockStorageSourceGetVitastorProps(virStorageSource *src)
+{
+ virJSONValue *ret = NULL;
+ virStorageNetHostDef *host;
+ size_t i;
+ g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER;
+ g_autofree char *etcd = NULL;
+
+ for (i = 0; i < src->nhosts; i++) {
+ host = src->hosts + i;
+ if ((virStorageNetHostTransport)host->transport != VIR_STORAGE_NET_HOST_TRANS_TCP) {
+ return NULL;
+ }
+ virBufferAsprintf(&buf, i > 0 ? ",%s:%u" : "%s:%u", host->name, host->port);
+ }
+ if (src->nhosts > 0) {
+ etcd = virBufferContentAndReset(&buf);
+ }
+
+ if (virJSONValueObjectAdd(&ret,
+ "S:etcd-host", etcd,
+ "S:etcd-prefix", src->query,
+ "S:config-path", src->configFile,
+ "s:image", src->path,
+ NULL) < 0)
+ return NULL;
+
+ return ret;
+}
+
+
static virJSONValue *
qemuBlockStorageSourceGetSheepdogProps(virStorageSource *src)
{
@@ -1140,6 +1172,12 @@ qemuBlockStorageSourceGetBackendProps(virStorageSource *src,
return NULL;
break;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
+ driver = "vitastor";
+ if (!(fileprops = qemuBlockStorageSourceGetVitastorProps(src)))
+ return NULL;
+ break;
+
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
driver = "sheepdog";
if (!(fileprops = qemuBlockStorageSourceGetSheepdogProps(src)))
@@ -2032,6 +2070,7 @@ qemuBlockGetBackingStoreString(virStorageSource *src,
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_RBD:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_VXHS:
case VIR_STORAGE_NET_PROTOCOL_NFS:
case VIR_STORAGE_NET_PROTOCOL_SSH:
@@ -2415,6 +2454,12 @@ qemuBlockStorageSourceCreateGetStorageProps(virStorageSource *src,
return -1;
break;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
+ driver = "vitastor";
+ if (!(location = qemuBlockStorageSourceGetVitastorProps(src)))
+ return -1;
+ break;
+
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
driver = "sheepdog";
if (!(location = qemuBlockStorageSourceGetSheepdogProps(src)))
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 953808fcfe..62860283d8 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -5215,7 +5215,8 @@ qemuDomainValidateStorageSource(virStorageSource *src,
if (src->query &&
(actualType != VIR_STORAGE_TYPE_NETWORK ||
(src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTPS &&
- src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTP))) {
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTP &&
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_VITASTOR))) {
virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
_("query is supported only with HTTP(S) protocols"));
return -1;
@@ -10340,6 +10341,7 @@ qemuDomainPrepareStorageSourceTLS(virStorageSource *src,
break;
case VIR_STORAGE_NET_PROTOCOL_RBD:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index 73ff533827..e9c799ca8f 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -423,6 +423,7 @@ qemuSnapshotPrepareDiskExternalInactive(virDomainSnapshotDiskDef *snapdisk,
case VIR_STORAGE_NET_PROTOCOL_NONE:
case VIR_STORAGE_NET_PROTOCOL_NBD:
case VIR_STORAGE_NET_PROTOCOL_RBD:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
@@ -648,6 +649,7 @@ qemuSnapshotPrepareDiskInternal(virDomainDiskDef *disk,
case VIR_STORAGE_NET_PROTOCOL_NONE:
case VIR_STORAGE_NET_PROTOCOL_NBD:
case VIR_STORAGE_NET_PROTOCOL_RBD:
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 314fe930e0..fb615a8b4e 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1626,6 +1626,7 @@ storageVolLookupByPathCallback(virStoragePoolObj *obj,
case VIR_STORAGE_POOL_GLUSTER:
case VIR_STORAGE_POOL_RBD:
+ case VIR_STORAGE_POOL_VITASTOR:
case VIR_STORAGE_POOL_SHEEPDOG:
case VIR_STORAGE_POOL_ZFS:
case VIR_STORAGE_POOL_LAST:
diff --git a/src/storage_file/storage_source_backingstore.c b/src/storage_file/storage_source_backingstore.c
index 80681924ea..8a3ade9ec0 100644
--- a/src/storage_file/storage_source_backingstore.c
+++ b/src/storage_file/storage_source_backingstore.c
@@ -287,6 +287,75 @@ virStorageSourceParseRBDColonString(const char *rbdstr,
}
+static int
+virStorageSourceParseVitastorColonString(const char *colonstr,
+ virStorageSource *src)
+{
+ char *p, *e, *next;
+ g_autofree char *options = NULL;
+
+ /* optionally skip the "vitastor:" prefix if provided */
+ if (STRPREFIX(colonstr, "vitastor:"))
+ colonstr += strlen("vitastor:");
+
+ options = g_strdup(colonstr);
+
+ p = options;
+ while (*p) {
+ /* find : delimiter or end of string */
+ for (e = p; *e && *e != ':'; ++e) {
+ if (*e == '\\') {
+ e++;
+ if (*e == '\0')
+ break;
+ }
+ }
+ if (*e == '\0') {
+ next = e; /* last kv pair */
+ } else {
+ next = e + 1;
+ *e = '\0';
+ }
+
+ if (STRPREFIX(p, "image=")) {
+ src->path = g_strdup(p + strlen("image="));
+ } else if (STRPREFIX(p, "etcd-prefix=")) {
+ src->query = g_strdup(p + strlen("etcd-prefix="));
+ } else if (STRPREFIX(p, "config-path=")) {
+ src->configFile = g_strdup(p + strlen("config-path="));
+ } else if (STRPREFIX(p, "etcd-host=")) {
+ char *h, *sep;
+
+ h = p + strlen("etcd-host=");
+ while (h < e) {
+ for (sep = h; sep < e; ++sep) {
+ if (*sep == '\\' && (sep[1] == ',' ||
+ sep[1] == ';' ||
+ sep[1] == ' ')) {
+ *sep = '\0';
+ sep += 2;
+ break;
+ }
+ }
+
+ if (virStorageSourceRBDAddHost(src, h) < 0)
+ return -1;
+
+ h = sep;
+ }
+ }
+
+ p = next;
+ }
+
+ if (!src->path) {
+ return -1;
+ }
+
+ return 0;
+}
+
+
static int
virStorageSourceParseNBDColonString(const char *nbdstr,
virStorageSource *src)
@@ -399,6 +468,11 @@ virStorageSourceParseBackingColon(virStorageSource *src,
return -1;
break;
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
+ if (virStorageSourceParseVitastorColonString(path, src) < 0)
+ return -1;
+ break;
+
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
case VIR_STORAGE_NET_PROTOCOL_LAST:
case VIR_STORAGE_NET_PROTOCOL_NONE:
@@ -975,6 +1049,54 @@ virStorageSourceParseBackingJSONRBD(virStorageSource *src,
return 0;
}
+static int
+virStorageSourceParseBackingJSONVitastor(virStorageSource *src,
+ virJSONValue *json,
+ const char *jsonstr G_GNUC_UNUSED,
+ int opaque G_GNUC_UNUSED)
+{
+ const char *filename;
+ const char *image = virJSONValueObjectGetString(json, "image");
+ const char *conf = virJSONValueObjectGetString(json, "config-path");
+ const char *etcd_prefix = virJSONValueObjectGetString(json, "etcd-prefix");
+ virJSONValue *servers = virJSONValueObjectGetArray(json, "server");
+ size_t nservers;
+ size_t i;
+
+ src->type = VIR_STORAGE_TYPE_NETWORK;
+ src->protocol = VIR_STORAGE_NET_PROTOCOL_VITASTOR;
+
+ /* legacy syntax passed via 'filename' option */
+ if ((filename = virJSONValueObjectGetString(json, "filename")))
+ return virStorageSourceParseVitastorColonString(filename, src);
+
+ if (!image) {
+ virReportError(VIR_ERR_INVALID_ARG, "%s",
+ _("missing image name in Vitastor backing volume "
+ "JSON specification"));
+ return -1;
+ }
+
+ src->path = g_strdup(image);
+ src->configFile = g_strdup(conf);
+ src->query = g_strdup(etcd_prefix);
+
+ if (servers) {
+ nservers = virJSONValueArraySize(servers);
+
+ src->hosts = g_new0(virStorageNetHostDef, nservers);
+ src->nhosts = nservers;
+
+ for (i = 0; i < nservers; i++) {
+ if (virStorageSourceParseBackingJSONInetSocketAddress(src->hosts + i,
+ virJSONValueArrayGet(servers, i)) < 0)
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
static int
virStorageSourceParseBackingJSONRaw(virStorageSource *src,
virJSONValue *json,
@@ -1152,6 +1274,7 @@ static const struct virStorageSourceJSONDriverParser jsonParsers[] = {
{"sheepdog", false, virStorageSourceParseBackingJSONSheepdog, 0},
{"ssh", false, virStorageSourceParseBackingJSONSSH, 0},
{"rbd", false, virStorageSourceParseBackingJSONRBD, 0},
+ {"vitastor", false, virStorageSourceParseBackingJSONVitastor, 0},
{"raw", true, virStorageSourceParseBackingJSONRaw, 0},
{"nfs", false, virStorageSourceParseBackingJSONNFS, 0},
{"vxhs", false, virStorageSourceParseBackingJSONVxHS, 0},
diff --git a/src/test/test_driver.c b/src/test/test_driver.c
index e87d7cfd44..ccc05d7aae 100644
--- a/src/test/test_driver.c
+++ b/src/test/test_driver.c
@@ -7335,6 +7335,7 @@ testStorageVolumeTypeForPool(int pooltype)
case VIR_STORAGE_POOL_ISCSI_DIRECT:
case VIR_STORAGE_POOL_GLUSTER:
case VIR_STORAGE_POOL_RBD:
+ case VIR_STORAGE_POOL_VITASTOR:
return VIR_STORAGE_VOL_NETWORK;
case VIR_STORAGE_POOL_LOGICAL:
case VIR_STORAGE_POOL_DISK:
diff --git a/tests/storagepoolcapsschemadata/poolcaps-fs.xml b/tests/storagepoolcapsschemadata/poolcaps-fs.xml
index eee75af746..8bd0a57bdd 100644
--- a/tests/storagepoolcapsschemadata/poolcaps-fs.xml
+++ b/tests/storagepoolcapsschemadata/poolcaps-fs.xml
@@ -204,4 +204,11 @@
</enum>
</volOptions>
</pool>
+ <pool type='vitastor' supported='no'>
+ <volOptions>
+ <defaultFormat type='raw'/>
+ <enum name='targetFormatType'>
+ </enum>
+ </volOptions>
+ </pool>
</storagepoolCapabilities>
diff --git a/tests/storagepoolcapsschemadata/poolcaps-full.xml b/tests/storagepoolcapsschemadata/poolcaps-full.xml
index 805950a937..852df0de16 100644
--- a/tests/storagepoolcapsschemadata/poolcaps-full.xml
+++ b/tests/storagepoolcapsschemadata/poolcaps-full.xml
@@ -204,4 +204,11 @@
</enum>
</volOptions>
</pool>
+ <pool type='vitastor' supported='yes'>
+ <volOptions>
+ <defaultFormat type='raw'/>
+ <enum name='targetFormatType'>
+ </enum>
+ </volOptions>
+ </pool>
</storagepoolCapabilities>
diff --git a/tests/storagepoolxml2argvtest.c b/tests/storagepoolxml2argvtest.c
index e8e40d695e..db55fe5f3a 100644
--- a/tests/storagepoolxml2argvtest.c
+++ b/tests/storagepoolxml2argvtest.c
@@ -65,6 +65,7 @@ testCompareXMLToArgvFiles(bool shouldFail,
case VIR_STORAGE_POOL_GLUSTER:
case VIR_STORAGE_POOL_ZFS:
case VIR_STORAGE_POOL_VSTORAGE:
+ case VIR_STORAGE_POOL_VITASTOR:
case VIR_STORAGE_POOL_LAST:
default:
VIR_TEST_DEBUG("pool type '%s' has no xml2argv test", defTypeStr);
diff --git a/tools/virsh-pool.c b/tools/virsh-pool.c
index 36f00cf643..5f5bd3464e 100644
--- a/tools/virsh-pool.c
+++ b/tools/virsh-pool.c
@@ -1223,6 +1223,9 @@ cmdPoolList(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED)
case VIR_STORAGE_POOL_VSTORAGE:
flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE;
break;
+ case VIR_STORAGE_POOL_VITASTOR:
+ flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR;
+ break;
case VIR_STORAGE_POOL_LAST:
break;
}

28
pull_request_template.yml Normal file
View File

@@ -0,0 +1,28 @@
name: Pull Request
about: Submit a pull request
body:
- type: textarea
id: description
attributes:
label: Description
description: Describe your pull request
placeholder: ""
value: ""
validations:
required: true
- type: input
id: author
attributes:
label: Contributor Name
description: Contributor Name or Company Details if the Contributor is a company
placeholder: ""
validations:
required: false
- type: checkboxes
id: terms
attributes:
label: CLA
description: By submitting this pull request, I accept [Vitastor CLA](https://git.yourcmc.ru/vitalif/vitastor/src/branch/master/CLA-en.md)
options:
- label: "I accept Vitastor CLA agreement: https://git.yourcmc.ru/vitalif/vitastor/src/branch/master/CLA-en.md"
required: true

View File

@@ -24,4 +24,4 @@ rm fio
mv fio-copy fio
FIO=`rpm -qi fio | perl -e 'while(<>) { /^Epoch[\s:]+(\S+)/ && print "$1:"; /^Version[\s:]+(\S+)/ && print $1; /^Release[\s:]+(\S+)/ && print "-$1"; }'`
perl -i -pe 's/(Requires:\s*fio)([^\n]+)?/$1 = '$FIO'/' $VITASTOR/rpm/vitastor-el$EL.spec
tar --transform 's#^#vitastor-1.4.0/#' --exclude 'rpm/*.rpm' -czf $VITASTOR/../vitastor-1.4.0$(rpm --eval '%dist').tar.gz *
tar --transform 's#^#vitastor-1.4.3/#' --exclude 'rpm/*.rpm' -czf $VITASTOR/../vitastor-1.4.3$(rpm --eval '%dist').tar.gz *

View File

@@ -36,7 +36,7 @@ ADD . /root/vitastor
RUN set -e; \
cd /root/vitastor/rpm; \
sh build-tarball.sh; \
cp /root/vitastor-1.4.0.el7.tar.gz ~/rpmbuild/SOURCES; \
cp /root/vitastor-1.4.3.el7.tar.gz ~/rpmbuild/SOURCES; \
cp vitastor-el7.spec ~/rpmbuild/SPECS/vitastor.spec; \
cd ~/rpmbuild/SPECS/; \
rpmbuild -ba vitastor.spec; \

View File

@@ -1,11 +1,11 @@
Name: vitastor
Version: 1.4.0
Version: 1.4.3
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
Source0: vitastor-1.4.0.el7.tar.gz
Source0: vitastor-1.4.3.el7.tar.gz
BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel

View File

@@ -35,7 +35,7 @@ ADD . /root/vitastor
RUN set -e; \
cd /root/vitastor/rpm; \
sh build-tarball.sh; \
cp /root/vitastor-1.4.0.el8.tar.gz ~/rpmbuild/SOURCES; \
cp /root/vitastor-1.4.3.el8.tar.gz ~/rpmbuild/SOURCES; \
cp vitastor-el8.spec ~/rpmbuild/SPECS/vitastor.spec; \
cd ~/rpmbuild/SPECS/; \
rpmbuild -ba vitastor.spec; \

View File

@@ -1,11 +1,11 @@
Name: vitastor
Version: 1.4.0
Version: 1.4.3
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
Source0: vitastor-1.4.0.el8.tar.gz
Source0: vitastor-1.4.3.el8.tar.gz
BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel

View File

@@ -18,7 +18,7 @@ ADD . /root/vitastor
RUN set -e; \
cd /root/vitastor/rpm; \
sh build-tarball.sh; \
cp /root/vitastor-1.4.0.el9.tar.gz ~/rpmbuild/SOURCES; \
cp /root/vitastor-1.4.3.el9.tar.gz ~/rpmbuild/SOURCES; \
cp vitastor-el9.spec ~/rpmbuild/SPECS/vitastor.spec; \
cd ~/rpmbuild/SPECS/; \
rpmbuild -ba vitastor.spec; \

View File

@@ -1,11 +1,11 @@
Name: vitastor
Version: 1.4.0
Version: 1.4.3
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
Source0: vitastor-1.4.0.el9.tar.gz
Source0: vitastor-1.4.3.el9.tar.gz
BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel

View File

@@ -16,7 +16,7 @@ if("${CMAKE_INSTALL_PREFIX}" MATCHES "^/usr/local/?$")
set(CMAKE_INSTALL_RPATH "${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}")
endif()
add_definitions(-DVERSION="1.4.0")
add_definitions(-DVERSION="1.4.3")
add_definitions(-Wall -Wno-sign-compare -Wno-comment -Wno-parentheses -Wno-pointer-arith -fdiagnostics-color=always -fno-omit-frame-pointer -I ${CMAKE_SOURCE_DIR}/src)
add_link_options(-fno-omit-frame-pointer)
if (${WITH_ASAN})

View File

@@ -163,20 +163,10 @@ void blockstore_impl_t::loop()
}
else if (op->opcode == BS_OP_SYNC)
{
// wait for all small writes to be submitted
// wait for all big writes to complete, submit data device fsync
// sync only completed writes?
// wait for the data device fsync to complete, then submit journal writes for big writes
// then submit an fsync operation
if (has_writes)
{
// Can't submit SYNC before previous writes
continue;
}
wr_st = continue_sync(op);
if (wr_st != 2)
{
has_writes = wr_st > 0 ? 1 : 2;
}
}
else if (op->opcode == BS_OP_STABLE)
{
@@ -205,6 +195,10 @@ void blockstore_impl_t::loop()
// ring is full, stop submission
break;
}
else if (PRIV(op)->wait_for == WAIT_JOURNAL)
{
PRIV(op)->wait_detail2 = (unstable_writes.size()+unstable_unsynced);
}
}
}
if (op_idx != new_idx)
@@ -283,7 +277,8 @@ void blockstore_impl_t::check_wait(blockstore_op_t *op)
}
else if (PRIV(op)->wait_for == WAIT_JOURNAL)
{
if (journal.used_start == PRIV(op)->wait_detail)
if (journal.used_start == PRIV(op)->wait_detail &&
(unstable_writes.size()+unstable_unsynced) == PRIV(op)->wait_detail2)
{
// do not submit
#ifdef BLOCKSTORE_DEBUG
@@ -558,13 +553,14 @@ void blockstore_impl_t::process_list(blockstore_op_t *op)
if (stable_count >= stable_alloc)
{
stable_alloc *= 2;
stable = (obj_ver_id*)realloc(stable, sizeof(obj_ver_id) * stable_alloc);
if (!stable)
obj_ver_id* nst = (obj_ver_id*)realloc(stable, sizeof(obj_ver_id) * stable_alloc);
if (!nst)
{
op->retval = -ENOMEM;
FINISH_OP(op);
return;
}
stable = nst;
}
stable[stable_count++] = {
.oid = clean_it->first,
@@ -642,8 +638,8 @@ void blockstore_impl_t::process_list(blockstore_op_t *op)
if (stable_count >= stable_alloc)
{
stable_alloc += 32768;
stable = (obj_ver_id*)realloc(stable, sizeof(obj_ver_id) * stable_alloc);
if (!stable)
obj_ver_id *nst = (obj_ver_id*)realloc(stable, sizeof(obj_ver_id) * stable_alloc);
if (!nst)
{
if (unstable)
free(unstable);
@@ -651,6 +647,7 @@ void blockstore_impl_t::process_list(blockstore_op_t *op)
FINISH_OP(op);
return;
}
stable = nst;
}
stable[stable_count++] = dirty_it->first;
}
@@ -666,8 +663,8 @@ void blockstore_impl_t::process_list(blockstore_op_t *op)
if (unstable_count >= unstable_alloc)
{
unstable_alloc += 32768;
unstable = (obj_ver_id*)realloc(unstable, sizeof(obj_ver_id) * unstable_alloc);
if (!unstable)
obj_ver_id *nst = (obj_ver_id*)realloc(unstable, sizeof(obj_ver_id) * unstable_alloc);
if (!nst)
{
if (stable)
free(stable);
@@ -675,6 +672,7 @@ void blockstore_impl_t::process_list(blockstore_op_t *op)
FINISH_OP(op);
return;
}
unstable = nst;
}
unstable[unstable_count++] = dirty_it->first;
}
@@ -694,8 +692,8 @@ void blockstore_impl_t::process_list(blockstore_op_t *op)
if (stable_count+unstable_count > stable_alloc)
{
stable_alloc = stable_count+unstable_count;
stable = (obj_ver_id*)realloc(stable, sizeof(obj_ver_id) * stable_alloc);
if (!stable)
obj_ver_id *nst = (obj_ver_id*)realloc(stable, sizeof(obj_ver_id) * stable_alloc);
if (!nst)
{
if (unstable)
free(unstable);
@@ -703,6 +701,7 @@ void blockstore_impl_t::process_list(blockstore_op_t *op)
FINISH_OP(op);
return;
}
stable = nst;
}
// Copy unstable entries
for (int i = 0; i < unstable_count; i++)

View File

@@ -55,6 +55,7 @@
#define IS_JOURNAL(st) (((st) & 0x0F) == BS_ST_SMALL_WRITE)
#define IS_BIG_WRITE(st) (((st) & 0x0F) == BS_ST_BIG_WRITE)
#define IS_DELETE(st) (((st) & 0x0F) == BS_ST_DELETE)
#define IS_INSTANT(st) (((st) & BS_ST_TYPE_MASK) == BS_ST_DELETE || ((st) & BS_ST_INSTANT))
#define BS_SUBMIT_CHECK_SQES(n) \
if (ringloop->sqes_left() < (n))\
@@ -201,7 +202,7 @@ struct blockstore_op_private_t
{
// Wait status
int wait_for;
uint64_t wait_detail;
uint64_t wait_detail, wait_detail2;
int pending_ops;
int op_state;
@@ -377,7 +378,7 @@ class blockstore_impl_t
// Stabilize
int dequeue_stable(blockstore_op_t *op);
int continue_stable(blockstore_op_t *op);
void mark_stable(const obj_ver_id & ov, bool forget_dirty = false);
void mark_stable(obj_ver_id ov, bool forget_dirty = false);
void stabilize_object(object_id oid, uint64_t max_ver);
blockstore_op_t* selective_sync(blockstore_op_t *op);
int split_stab_op(blockstore_op_t *op, std::function<int(obj_ver_id v)> decider);

View File

@@ -19,7 +19,7 @@ void blockstore_impl_t::parse_config(blockstore_config_t & config, bool init)
throttle_target_mbs = strtoull(config["throttle_target_mbs"].c_str(), NULL, 10);
throttle_target_parallelism = strtoull(config["throttle_target_parallelism"].c_str(), NULL, 10);
throttle_threshold_us = strtoull(config["throttle_threshold_us"].c_str(), NULL, 10);
if (config.find("autosync_writes") != config.end())
if (config["autosync_writes"] != "")
{
autosync_writes = strtoull(config["autosync_writes"].c_str(), NULL, 10);
}

View File

@@ -412,11 +412,40 @@ resume_4:
return 2;
}
void blockstore_impl_t::mark_stable(const obj_ver_id & v, bool forget_dirty)
void blockstore_impl_t::mark_stable(obj_ver_id v, bool forget_dirty)
{
auto dirty_it = dirty_db.find(v);
if (dirty_it != dirty_db.end())
{
if (IS_INSTANT(dirty_it->second.state))
{
// 'Instant' (non-EC) operations may complete and try to become stable out of order. Prevent it.
auto back_it = dirty_it;
while (back_it != dirty_db.begin())
{
back_it--;
if (back_it->first.oid != v.oid)
{
break;
}
if (!IS_STABLE(back_it->second.state))
{
// There are preceding unstable versions, can't flush <v>
return;
}
}
while (true)
{
dirty_it++;
if (dirty_it == dirty_db.end() || dirty_it->first.oid != v.oid ||
!IS_SYNCED(dirty_it->second.state))
{
dirty_it--;
break;
}
v.version = dirty_it->first.version;
}
}
while (1)
{
bool was_stable = IS_STABLE(dirty_it->second.state);

View File

@@ -85,8 +85,7 @@ int blockstore_impl_t::continue_sync(blockstore_op_t *op)
left--;
auto & dirty_entry = dirty_db.at(sbw);
uint64_t dyn_size = dsk.dirty_dyn_size(dirty_entry.offset, dirty_entry.len);
if (!space_check.check_available(op, 1, sizeof(journal_entry_big_write) + dyn_size,
(unstable_writes.size()+unstable_unsynced)*journal.block_size))
if (!space_check.check_available(op, 1, sizeof(journal_entry_big_write) + dyn_size, 0))
{
return 0;
}

View File

@@ -129,7 +129,7 @@ bool blockstore_impl_t::enqueue_write(blockstore_op_t *op)
}
bool imm = (op->len < dsk.data_block_size ? (immediate_commit != IMMEDIATE_NONE) : (immediate_commit == IMMEDIATE_ALL));
if (wait_big && !is_del && !deleted && op->len < dsk.data_block_size && !imm ||
!imm && unsynced_queued_ops >= autosync_writes)
!imm && autosync_writes && unsynced_queued_ops >= autosync_writes)
{
// Issue an additional sync so that the previous big write can reach the journal
blockstore_op_t *sync_op = new blockstore_op_t;
@@ -593,7 +593,7 @@ resume_4:
#endif
bool is_big = (dirty_it->second.state & BS_ST_TYPE_MASK) == BS_ST_BIG_WRITE;
bool imm = is_big ? (immediate_commit == IMMEDIATE_ALL) : (immediate_commit != IMMEDIATE_NONE);
bool is_instant = ((dirty_it->second.state & BS_ST_TYPE_MASK) == BS_ST_DELETE || (dirty_it->second.state & BS_ST_INSTANT));
bool is_instant = IS_INSTANT(dirty_it->second.state);
if (imm)
{
auto & unstab = unstable_writes[op->oid];

View File

@@ -17,6 +17,7 @@ struct rm_pg_t
uint64_t obj_count = 0, obj_done = 0;
int state = 0;
int in_flight = 0;
bool synced = false;
};
struct rm_inode_t
@@ -48,6 +49,7 @@ struct rm_inode_t
.objects = objects,
.obj_count = objects.size(),
.obj_done = 0,
.synced = parent->cli->get_immediate_commit(inode),
});
if (min_offset == 0)
{
@@ -151,6 +153,37 @@ struct rm_inode_t
}
cur_list->obj_pos++;
}
if (cur_list->in_flight == 0 && cur_list->obj_pos == cur_list->objects.end() &&
!cur_list->synced)
{
osd_op_t *op = new osd_op_t();
op->op_type = OSD_OP_OUT;
op->peer_fd = parent->cli->msgr.osd_peer_fds.at(cur_list->rm_osd_num);
op->req = (osd_any_op_t){
.sync = {
.header = {
.magic = SECONDARY_OSD_OP_MAGIC,
.id = parent->cli->next_op_id(),
.opcode = OSD_OP_SYNC,
},
},
};
op->callback = [this, cur_list](osd_op_t *op)
{
cur_list->in_flight--;
cur_list->synced = true;
if (op->reply.hdr.retval < 0)
{
fprintf(stderr, "Failed to sync OSD %lu (retval=%ld)\n",
cur_list->rm_osd_num, op->reply.hdr.retval);
error_count++;
}
delete op;
continue_delete();
};
cur_list->in_flight++;
parent->cli->msgr.outbox_push(op);
}
}
void continue_delete()
@@ -161,7 +194,8 @@ struct rm_inode_t
}
for (int i = 0; i < lists.size(); i++)
{
if (!lists[i]->in_flight && lists[i]->obj_pos == lists[i]->objects.end())
if (!lists[i]->in_flight && lists[i]->obj_pos == lists[i]->objects.end() &&
lists[i]->synced)
{
delete lists[i];
lists.erase(lists.begin()+i, lists.begin()+i+1);
@@ -187,7 +221,7 @@ struct rm_inode_t
{
fprintf(stderr, "\n");
}
if (parent->progress && (total_done < total_count || inactive_osds.size() > 0))
if (parent->progress && (total_done < total_count || inactive_osds.size() > 0 || error_count > 0))
{
fprintf(
stderr, "Warning: Pool:%u,ID:%lu inode data may not have been fully removed.\n"

View File

@@ -352,13 +352,13 @@ void cluster_client_t::on_load_config_hook(json11::Json::object & etcd_global_co
// up_wait_retry_interval
up_wait_retry_interval = config["up_wait_retry_interval"].uint64_value();
if (!up_wait_retry_interval)
{
up_wait_retry_interval = 500;
}
else if (up_wait_retry_interval < 50)
{
up_wait_retry_interval = 50;
}
else if (up_wait_retry_interval < 10)
{
up_wait_retry_interval = 10;
}
// log_level
log_level = config["log_level"].uint64_value();
msgr.parse_config(config);

View File

@@ -28,7 +28,7 @@ struct etcd_kv_t
{
std::string key;
json11::Json value;
uint64_t mod_revision;
uint64_t mod_revision = 0;
};
struct pg_config_t

View File

@@ -22,7 +22,7 @@ static blockstore_config_t json_to_bs(const json11::Json::object & config)
{
if (kv.second.is_string())
bs[kv.first] = kv.second.string_value();
else
else if (!kv.second.is_null())
bs[kv.first] = kv.second.dump();
}
return bs;
@@ -194,7 +194,8 @@ void osd_t::parse_config(bool init)
if (autosync_interval > MAX_AUTOSYNC_INTERVAL)
autosync_interval = DEFAULT_AUTOSYNC_INTERVAL;
}
if (!config["autosync_writes"].is_null())
if (config["autosync_writes"].is_number() ||
config["autosync_writes"].string_value() != "")
{
// Allow to set it to 0
autosync_writes = config["autosync_writes"].uint64_value();

View File

@@ -262,7 +262,8 @@ void osd_t::report_statistics()
for (auto st_it = inode_stats.begin(); st_it != inode_stats.end(); )
{
auto & kv = *st_it;
if (!bs_inode_space[kv.first])
auto spc_it = bs_inode_space.find(kv.first);
if (spc_it == bs_inode_space.end() || !spc_it->second) // prevent autovivification
{
// Is it an empty inode?
if (!tv_now.tv_sec)

View File

@@ -706,6 +706,26 @@ resume_5:
remove_object_from_state(op_data->oid, &op_data->object_state, pg);
deref_object_state(pg, &op_data->object_state, true);
}
// Mark PG and OSDs as dirty
for (auto & chunk: (op_data->object_state ? op_data->object_state->osd_set : pg.cur_loc_set))
{
this->dirty_osds.insert(chunk.osd_num);
}
for (auto cl_it = msgr.clients.find(cur_op->peer_fd); cl_it != msgr.clients.end(); )
{
cl_it->second->dirty_pgs.insert({ .pool_id = pg.pool_id, .pg_num = pg.pg_num });
break;
}
dirty_pgs.insert({ .pool_id = pg.pool_id, .pg_num = pg.pg_num });
if (immediate_commit == IMMEDIATE_NONE)
{
unstable_write_count++;
if (unstable_write_count >= autosync_writes)
{
unstable_write_count = 0;
autosync();
}
}
pg.total_count--;
cur_op->reply.hdr.retval = 0;
continue_others:

View File

@@ -6,7 +6,7 @@ includedir=${prefix}/@CMAKE_INSTALL_INCLUDEDIR@
Name: Vitastor
Description: Vitastor client library
Version: 1.4.0
Version: 1.4.3
Libs: -L${libdir} -lvitastor_client
Cflags: -I${includedir}

View File

@@ -10,6 +10,7 @@ SCHEME=${SCHEME:-replicated}
# OFFSET_ARGS
# PG_SIZE
# PG_MINSIZE
# GLOBAL_CONFIG
if [ "$SCHEME" = "ec" ]; then
OSD_COUNT=${OSD_COUNT:-5}
@@ -19,10 +20,10 @@ fi
if [ "$IMMEDIATE_COMMIT" != "" ]; then
NO_SAME="--journal_no_same_sector_overwrites true --journal_sector_buffer_count 1024 --disable_data_fsync 1 --immediate_commit all --log_level 10 --etcd_stats_interval 5"
$ETCDCTL put /vitastor/config/global '{"recovery_queue_depth":1,"recovery_tune_util_low":1,"osd_out_time":1,"immediate_commit":"all","client_enable_writeback":true}'
$ETCDCTL put /vitastor/config/global '{"recovery_queue_depth":1,"recovery_tune_util_low":1,"immediate_commit":"all","client_enable_writeback":true,"client_max_writeback_iodepth":32'$GLOBAL_CONFIG'}'
else
NO_SAME="--journal_sector_buffer_count 1024 --log_level 10 --etcd_stats_interval 5"
$ETCDCTL put /vitastor/config/global '{"recovery_queue_depth":1,"recovery_tune_util_low":1,"osd_out_time":1,"client_enable_writeback":true}'
$ETCDCTL put /vitastor/config/global '{"recovery_queue_depth":1,"recovery_tune_util_low":1,"client_enable_writeback":true,"client_max_writeback_iodepth":32'$GLOBAL_CONFIG'}'
fi
start_osd_on()
@@ -53,7 +54,7 @@ for i in $(seq 1 $OSD_COUNT); do
start_osd $i
done
(while true; do node mon/mon-main.js --etcd_address $ETCD_URL --etcd_prefix "/vitastor" --verbose 1 || true; done) >>./testdata/mon.log 2>&1 &
(while true; do set +e; node mon/mon-main.js --etcd_address $ETCD_URL --etcd_prefix "/vitastor" --verbose 1; if [[ $? -ne 2 ]]; then break; fi; done) >>./testdata/mon.log 2>&1 &
MON_PID=$!
if [ "$SCHEME" = "ec" ]; then

View File

@@ -45,6 +45,8 @@ IMMEDIATE_COMMIT=1 ./test_rebalance_verify.sh
SCHEME=ec ./test_rebalance_verify.sh
SCHEME=ec IMMEDIATE_COMMIT=1 ./test_rebalance_verify.sh
./test_switch_primary.sh
./test_write.sh
SCHEME=xor ./test_write.sh

View File

@@ -1,7 +1,7 @@
#!/bin/bash -ex
PG_COUNT=2048
GLOBAL_CONFIG=',"osd_out_time":1'
. `dirname $0`/run_3osds.sh
LD_PRELOAD="build/src/libfio_vitastor.so" \

View File

@@ -9,6 +9,7 @@ if [[ "$SCHEME" = "ec" ]]; then
fi
OSD_COUNT=${OSD_COUNT:-7}
PG_COUNT=32
GLOBAL_CONFIG=',"osd_out_time":1'
. `dirname $0`/run_3osds.sh
check_qemu

View File

@@ -2,6 +2,7 @@
PG_MINSIZE=1
SCHEME=replicated
GLOBAL_CONFIG=',"osd_out_time":1'
. `dirname $0`/run_3osds.sh

View File

@@ -20,6 +20,9 @@ LD_PRELOAD="build/src/libfio_vitastor.so" \
fio -thread -name=test -ioengine=build/src/libfio_vitastor.so -bs=1M -direct=1 -iodepth=4 \
-mirror_file=./testdata/mirror.bin -end_fsync=1 -rw=write -etcd=$ETCD_URL -image=testimg
# Save PG primary
primary=$($ETCDCTL get --print-value-only /vitastor/config/pgs | jq -r '.items["1"]["1"].primary')
# Intentionally corrupt OSD data and restart it
zero_osd_pid=OSD${ZERO_OSD}_PID
kill ${!zero_osd_pid}
@@ -34,6 +37,9 @@ start_osd $ZERO_OSD
# Wait until start
wait_up 10
# Wait until PG is back on the same primary
wait_condition 10 "$ETCDCTL"$' get --print-value-only /vitastor/config/pgs | jq -s -e \'.[0].items["1"]["1"].primary == "'$primary'"'"'"
# Trigger scrub
$ETCDCTL put /vitastor/pg/history/1/1 `$ETCDCTL get --print-value-only /vitastor/pg/history/1/1 | jq -s -c '(.[0] // {}) + {"next_scrub":1}'`

View File

@@ -4,6 +4,7 @@ OSD_COUNT=2
PG_SIZE=2
PG_MINSIZE=1
SCHEME=replicated
GLOBAL_CONFIG=',"osd_out_time":1'
. `dirname $0`/run_3osds.sh

18
tests/test_switch_primary.sh Executable file
View File

@@ -0,0 +1,18 @@
#!/bin/bash -ex
. `dirname $0`/run_3osds.sh
primary=$($ETCDCTL get --print-value-only /vitastor/config/pgs | jq -r '.items["1"]["1"].primary')
primary_pid=OSD${primary}_PID
kill -9 ${!primary_pid}
sleep 15
wait_condition 10 "$ETCDCTL get --print-value-only /vitastor/config/pgs | jq -s -e '.[0].items[\"1\"][\"1\"].primary != \"$primary\"'"
newprim=$($ETCDCTL get --print-value-only /vitastor/config/pgs | jq -r '.items["1"]["1"].primary')
if [ "$newprim" = "$primary" ]; then
format_error Primary not switched
fi
format_green OK