Compare commits

1117 Commits

Author SHA1 Message Date
Vitaliy Filippov 9c405009f3 Use randrw in test_heal
2 days ago
Vitaliy Filippov f9fbea25a4 Remove double write when old and new locations are in the same metadata block
Also add another foolproofing check for metadata entries which, ideally, will never fire %)
2 days ago
Vitaliy Filippov 2c9a10d081 Fix an idiotic bug leading to failed reads with -ERANGE with EC :D 2 days ago
Vitaliy Filippov 150968070f Slightly improve some debug prints
1 week ago
Vitaliy Filippov cdfc74665b Close client FDs only when destroying the client, after handling all async reads/writes
Fixes "Client XX command out of sync" sometimes happening on reconnections
2 weeks ago
Vitaliy Filippov 3f60fecd7c Fix typo 2 weeks ago
Vitaliy Filippov 3b4cf29e65 Release 0.9.0
New features:
- Scrubbing! Check documentation: [auto_scrub](src/branch/master/docs/config/osd.en.md#auto_scrub)
- Document online-updatable configuration parameters

Bug fixes:
- Fix NaN during PG optimisation if there are nonexistent OSDs in node_placement
- Fix monitor crash on pool deletion
- Clear journal_device and meta_device before initialising the next OSD in automatic mode
- Sync unsynced deletes before overwriting them with a lower version
  (reproduced mostly/only after scrubbing)
2 weeks ago
Vitaliy Filippov eeaba11ebd Use fio 3.27-8 for alma9 2 weeks ago
Vitaliy Filippov aea567cfbd Slightly improve scrub docs
2 weeks ago
Vitaliy Filippov ce02f47de6 Allow disabling scrub_find_best 2 weeks ago
Vitaliy Filippov 5fd3208616 Add version archive link to docs 2 weeks ago
Vitaliy Filippov 5997b76535 Remove -runtime=10 from fio params in test_scrub, it was breaking the test in CI :D
2 weeks ago
Vitaliy Filippov f1961157f0 Fix brute-force error locator for EC n+k with k > 2
2 weeks ago
Vitaliy Filippov 88c1ba0790 Fix compile errors with gcc 10
2 weeks ago
Vitaliy Filippov b5bd611683 Add scrub tests to CI 2 weeks ago
Vitaliy Filippov fa90b5a4e7 Schedule automatic scrubs correctly (not just after previous scrub) 2 weeks ago
Vitaliy Filippov 8d40ad99a6 Add scrub documentation 2 weeks ago
Vitaliy Filippov 3475772b07 Add configuration online update documentation 2 weeks ago
Vitaliy Filippov 25fcedf6e7 Enable vitastor-cli fix in test 2 weeks ago
Vitaliy Filippov 6ca20aa194 Allow scrub to fix corrupted object states 2 weeks ago
Vitaliy Filippov 4bfd994341 Sync unsynced deletes before overwriting them with a lower version 2 weeks ago
Vitaliy Filippov 59e959dcbb Do not die when "different versions are returned from subops" 2 weeks ago
Vitaliy Filippov a9581f0739 Handle dirty deletes during read correctly O_o 2 weeks ago
Vitaliy Filippov 105a405b0a Implement vitastor-cli fix 2 weeks ago
Vitaliy Filippov d55d7d5326 Add scrub test 2 weeks ago
Vitaliy Filippov 0e5d0e02a9 Add "vitastor-cli describe" command 2 weeks ago
Vitaliy Filippov 0439981a66 Implement "describe object(s)" operation
Required to implement fixing inconsistent objects in vitastor-cli
2 weeks ago
Vitaliy Filippov 6648f6bb6e Implement ambiguity detection during scrub 2 weeks ago
Vitaliy Filippov 281be547eb Implement brute-force error locator for EC 2 weeks ago
Vitaliy Filippov 0c78dd7178 Add no_scrub flag 2 weeks ago
Vitaliy Filippov 3c924397e7 Store next scrub timestamp instead of last scrub timestamp 2 weeks ago
Vitaliy Filippov c3bd26193d Implement PG scrub runner 2 weeks ago
Vitaliy Filippov 43b77d7619 Implement scrubbing "data path" - OSD_OP_SCRUB 2 weeks ago
Vitaliy Filippov a6d846863b Add min/max stripe and limit to OP_LIST 2 weeks ago
Vitaliy Filippov 8dc427b43c Retry failed reads (including chained and RMW) from other replicas 2 weeks ago
Vitaliy Filippov bf2112653b Refcount object_states 2 weeks ago
Vitaliy Filippov 0538a484b3 Add corrupted object state 2 weeks ago
Vitaliy Filippov 97720fa6b4 Remove unused capture
2 weeks ago
Vitaliy Filippov e60e352df6 Improve vitastor-nbd documentation 2 weeks ago
Vitaliy Filippov 98077a1712 Remove unused dependencies from CSI 3 weeks ago
Vitaliy Filippov 1c7d53996d Reweight only 2 OSDs to zero in test_rebalance_verify, otherwise the test does not pass with EC 3+2
3 weeks ago
Vitaliy Filippov 2ca07b1ea7 Raise timeout in test_rebalance_verify
3 weeks ago
Vitaliy Filippov 022176aa98 Fix NaN during PG optimisation if there are nonexistent OSDs in node_placement
3 weeks ago
Vitaliy Filippov 120e3fa7bc Fix pool deletion
3 weeks ago
Vitaliy Filippov 629999f789 Clear journal_device and meta_device before initialising the next OSD in automatic mode 3 weeks ago
Vitaliy Filippov 93eca11ba2 Fix rhel 9 installation docs 3 weeks ago
Vitaliy Filippov 5a9e1ede52 Release 0.8.9
- The tests are now stable and run in a CI system based on Gitea CI
- The release includes final bug fixes for EC:
  - Implement missing EC recovery of allocation bitmap when built with ISA-L
  - Fix broken snapshot export with EC (allocation bitmap reads were giving incorrect results previously)
- Also fixed bugs manifesting under heavy load:
  - Fix monitor possibly applying incorrect PG history on retries
  - Fix monitor incorrectly changing PG count when last_clean_pgs contains fewer PGs than the new number
  - Allow writes to wait for free space again, but now correctly (previously dropped in 0.8.2)
  - Fix a rare segfault in client (handle client stop during incoming stream handling in 1 more place)
  - Make monitor correctly handle etcd connection errors - it could die instead of connecting to another etcd
  - Fix OSD rarely being unable to report PG states after a PG was taken over by another OSD
- Fixed return code for incomplete EC objects (now EIO) and made cluster client retry this error
- Made other small changes for tests: timeouts, nice/ionice for etcd, waiting conditions, NBD device checks and so on
3 weeks ago
Vitaliy Filippov 1c9a188600 Add tests to CI
3 weeks ago
Vitaliy Filippov de3e609166 Add a FIXME about QEMU driver thread safety 3 weeks ago
Vitaliy Filippov 11481170f5 Add a FIXME about ENOSPC 3 weeks ago
Vitaliy Filippov e69d459d43 Allow rebalance to start in test_interrupted_rebalance, raise etcd start timeout 3 weeks ago
Vitaliy Filippov da82754baa Wait for conditions in test_move_reappear instead of waiting a fixed amount of time 3 weeks ago
Vitaliy Filippov d356aca030 Add missing $NO_SAME OSD argument to test_splitbrain 3 weeks ago
Vitaliy Filippov 04a273d213 Raise NBD timeout in tests 3 weeks ago
Vitaliy Filippov 6442010f93 Skip offline PGs during state reporting when the state is already deleted or taken over by another OSD
This fixes OSDs being unable to report PG states in rare conditions
3 weeks ago
Vitaliy Filippov 6f4dc16c59 Handle etcd connection errors correctly in mon (unhandled error events) 4 weeks ago
Vitaliy Filippov ce4a8067b5 Handle client stop during incoming stream handling in 1 more place 4 weeks ago
Vitaliy Filippov e431ecb715 Make tests more stable in CI 4 weeks ago
Vitaliy Filippov 8cac795445 Return EIO instead of EINVAL for incomplete EC objects 4 weeks ago
Vitaliy Filippov a409598b16 Wait for free space again, but count on big_write flushes instead of just flusher activity 4 weeks ago
Vitaliy Filippov f4c6765522 Ignore ENOENT in epoll_ctl 4 weeks ago
Vitaliy Filippov ad2916068a Fix test_add_osd rebalance timeout check 4 weeks ago
Vitaliy Filippov 321cb435a6 Fix monitor incorrectly changing PG count when last_clean_pgs contains fewer PGs than the new number 4 weeks ago
Vitaliy Filippov cfcf4f4355 Support checking /dev/nbdX nodes in Docker 4 weeks ago
Vitaliy Filippov e0fb17bfee Make etcd more stable in tests (add ionice and raise timeout) 4 weeks ago
Vitaliy Filippov 5b9031fecc Fix monitor possibly applying incorrect PG history under heavy load
The monitor could deceive itself by immediately saving in memory PG configuration
changes which weren't applied to etcd yet, and then apply incorrect PG history
changes the next time if the first update failed.

This usually only happened under heavy load and was caught in CI. :-)
4 weeks ago
Vitaliy Filippov 5da1d8e1b5 Fix EC just-bitmap reads (len=0) (fixes SCHEME=ec test_snapshot.sh) 4 weeks ago
Vitaliy Filippov 44f86f1999 Add a basic EC 2+2 recovery test (not really required, but let it be there) 4 weeks ago
Vitaliy Filippov 2d9a80c6f6 Implement missing bitmap recovery with ISA-L \(°□°)/ 4 weeks ago
Vitaliy Filippov 5e295e346e Do not make vitastor-mon part of vitastor.target 1 month ago
Vitaliy Filippov d9c0898b7c Notes about config and vitastor-disk cache status 1 month ago
Vitaliy Filippov 04cfb48361 Add a note about PVE 7.4 1 month ago
Vitaliy Filippov ab615849d6 Release 0.8.8
- Fix vitastor-cli rm/rm-data broken in 0.8.6 (missing messenger initialization)
- Prepare OSD read handler for upcoming version with scrub - allow "secondary reads" to return errors
- Fix OSDs re-peering PGs infinitely with a big number of PGs (reproduced in test_add_osd)
- Fix another variant of flusher sync-waiting stall (reproduced in test_write)
- Fix other tests in tests/ (will add them to Gitea CI soon)
- Add patches for QEMU 6.2-8.0
- Fix QEMU driver compatibility with QEMU 8.0
- Build packages for RHEL 9 clones (based on AlmaLinux 9)
1 month ago
Vitaliy Filippov 38be9a49c0 Add AlmaLinux 9 build to documentation 1 month ago
Vitaliy Filippov 7d6bf84a3e Add scripts/meson-buildoptions.sh to QEMU patches 1 month ago
Vitaliy Filippov 41a40a4123 Add QEMU spec patch for Alma/Rocky/RH 9 1 month ago
Vitaliy Filippov b94587ef0e Fix some build warnings 1 month ago
Vitaliy Filippov 2a2f4f6738 Add AlmaLinux 9 build 1 month ago
Vitaliy Filippov c768a9015f Fix QEMU driver compatibility with QEMU 8.0 1 month ago
Vitaliy Filippov 0d9e10cf96 Add patches for QEMU 6.2-8.0 1 month ago
Vitaliy Filippov b74ccb613c Fix another variant of flusher sync-waiting stall 1 month ago
Vitaliy Filippov 5052174918 Fix test_write_no_same (too large image) 1 month ago
Vitaliy Filippov eec9cf5575 Fix test_snapshot.sh - qemu-img requires explicit backing_fmt 1 month ago
Vitaliy Filippov a04dab0840 Initialize messenger in cluster_client listings 1 month ago
Vitaliy Filippov 160863f707 Print op pointer values in slow log 1 month ago
Vitaliy Filippov 2f16c32eb4 Fix test_minsize_1 (left_on_dead) 1 month ago
Vitaliy Filippov 2877cd0adb Allow OP_SEC_READ to return errors (do not hang the connection) 1 month ago
Vitaliy Filippov 480509f5b9 Fix pg_data_size > 1 for replicas (harmless bug) 1 month ago
Vitaliy Filippov 46462da45e Preload own PG history updates to fix PG state loop possibly applying the old metadata version 1 month ago
Vitaliy Filippov 024c8658f6 Fix missing } in quick start documentation 2 months ago
Vitaliy Filippov 7e958afeda Release 0.8.7
This release includes a bunch of important bugfixes for erasure-coded setups
with disabled immediate_commit. After these fixes, "test_heal" OSD killing test
now passes fine with EC:

- Fix cluster write stalls with "Error while doing flush on OSD xx: -16 (Device or resource busy)"
  in OSD logs possible in EC setups with disabled immediate_commit by selectively
  syncing nonsynced objects on STABILIZE/ROLLBACK (https://github.com/vitalif/vitastor/issues/51)
- Fix other EC + disabled immediate_commit problems:
  - Fix "opcode=5 retval=-2" errors happening on SYNC retries
  - Fix non-working "pagination" during PG dirty object flushing
  - Fix write operations not continued correctly after dirty object flushing
- Fix incorrect parity read-modify-write calculation when writing into a lost chunk
- Fix OSDs losing left_on_dead PG state of non-clean PGs and thus not removing junk data in the cluster
- Fix a small memory leak caused by bad indexing of EC recovery matrices
- Fix a rare use-after-free in cluster_client caused by a reenterability issue
- Fix vitastor-cli create command syntax in the CSI driver
- Allow starting OSDs without local store for tests
- Fix memory allocation error in disk_tool_meta for non-standard metadata block sizes
- Fix delete operations received before loading pool metadata crashing OSDs with "null pointer exception"
- Improve "theoretical performance" Russian documentation

New features:

- Implement online configuration update for some parameters. Documentation is coming soon :)
2 months ago
Vitaliy Filippov 2f5e769a29 Fix a small memory leak caused by bad indexing of EC recovery matrices 2 months ago
Vitaliy Filippov 28d5e53c6c Add test_heal to run_tests 2 months ago
Vitaliy Filippov d9f55f11d8 More logs (log_level 10), append to log instead of overwriting on restart in tests 2 months ago
Vitaliy Filippov 3237014608 Fix incorrect parity read-modify-write calculation when writing into a lost chunk 2 months ago
Vitaliy Filippov baaf8f6f44 Fix write operations not continued correctly after flush 2 months ago
Vitaliy Filippov 1d83fdcd17 Add debug logs to osd_flush 2 months ago
Vitaliy Filippov 0ddd787c38 Fix non-working "pagination" during PG dirty object flushing 2 months ago
Vitaliy Filippov 6eff3a60a5 Do not lose left_on_dead PG state of non-clean PGs 2 months ago
Vitaliy Filippov 888a6975ab Fix a rare use-after-free in cluster_client caused by a reenterability issue 2 months ago
Vitaliy Filippov cd1e890bd4 Fix "opcode=5 retval=-2" errors sometimes possible with EC 2 months ago
Vitaliy Filippov 0fbf4c6a08 Selectively sync nonsynced objects on STABILIZE/ROLLBACK (fix for github issue #51) 2 months ago
Vitaliy Filippov d06ed2b0e7 Implement online config update 2 months ago
Vitaliy Filippov 3bbc46543d Fix vitastor-cli create syntax 3 months ago
Vitaliy Filippov 2fb0c85618 Allow starting OSDs without local store (only for tests) 3 months ago
Vitaliy Filippov d81a6c04fc Update cmake min version so it does not complain about deprecation 3 months ago
Vitaliy Filippov 7b35801647 Fix possible bad realloc in disk_tool_meta for non-standard metadata block sizes 3 months ago
Vitaliy Filippov f3228d5c07 Fix typo (did not affect execution though) 3 months ago
Vitaliy Filippov 18366f5055 Fix read/write return type in rw_blocking 3 months ago
Vitaliy Filippov 851507c147 Add missing close() in test stubs 3 months ago
Vitaliy Filippov 9aaad28488 Fix "null pointer exception" for unhandled OSD_OP_DELETEs (when pool is not loaded yet) 3 months ago
Vitaliy Filippov dd57d086fe Add a missing part of the "theoretical performance" to the Russian version 3 months ago
Vitaliy Filippov 8810eae8fb Release 0.8.6
Important fixes:

- Fix possibly incorrect EC parity chunk updates with EC n+k, k > 1, when
  the first parity chunk is missing

Minor fixes and improvements:

- Fix incorrect EC free space statistics in vitastor-cli df output
- Speedup vitastor-cli startup in clusters with RDMA
- Remove unused PG "peered" state (previously used to update PG epoch)
- Use sfdisk with just --json in vitastor-disk (--dump --json isn't needed)
- Allow trailing comma in sfdisk output (fixes sfdisk 2.36 compatibility)
- Slightly improve RDMA send/receive code
- Reduce RDMA memory consumption by default (rdma_max_recv/send = 16/8)
- Use vitastor-cli instead of direct etcd interaction in the CSI driver
3 months ago
Vitaliy Filippov c1365f46c9 Use vitastor-cli instead of direct etcd interaction in the CSI driver 3 months ago
Vitaliy Filippov 14d6acbcba Set default rdma_max_recv/send to 16/8, fix documentation 3 months ago
Vitaliy Filippov 1e307069bc Fix missing parity chunk calculation for EC n+k, k > 1 and first parity chunk missing 3 months ago
Vitaliy Filippov c3e80abad7 Allow sending more than 1 operation at a time 3 months ago
Vitaliy Filippov 138ffe4032 Reuse incoming RDMA buffers 3 months ago
Vitaliy Filippov 8139a34e97 Fix json11: allow trailing comma 3 months ago
Vitaliy Filippov 4ab630b44d Use just sfdisk --json, --dump is not needed 3 months ago
Vitaliy Filippov 2c8241b7db Remove PG "peered" state 3 months ago
Vitaliy Filippov 36a7dd3671 Move tests to "make test" 3 months ago
Vitaliy Filippov 936122bbcf Initialize msgr lazily in client to speedup vitastor-cli with RDMA enabled 4 months ago
Vitaliy Filippov 1a1ba0d1e7 Add set_immediate to ringloop and use it for bs/osd ops to prevent reenterability issues 4 months ago
Vitaliy Filippov 3d09c9cec7 Remove unused wait_sqe() from ringloop 4 months ago
Vitaliy Filippov 3d08a1ad6c Fix cluster_client test after last reenterability fixes 4 months ago
Vitaliy Filippov 499881d81c Fix typo 4 months ago
Vitaliy Filippov aba93b951b Fix incorrect EC free space statistics in vitastor-cli df output 4 months ago
Vitaliy Filippov d125fb1f30 Release 0.8.5
- Fix a possible "double free" bug in the client library happening on OSD restart
- Fix a possible write hang on PG history update when only epoch is changed
- Fix incorrect systemd target "local.target" in mon/make-etcd
- Allow the "content" option in the PVE storage plugin so that containers can be enabled
- Build client library without tcmalloc which fixes "attempt to free invalid pointer"
  errors when, for example, trying to run QEMU with both Vitastor and Ceph RBD disks
4 months ago
Vitaliy Filippov 9d3fd72298 Require liburing < 2 in rpm specs 4 months ago
Vitaliy Filippov 8b552a01f9 Do not retry successful operation parts in client (could lead to "double free" bugs) 4 months ago
Vitaliy Filippov 0385b2f9e8 Fix write hangs on PG epoch update - always set pg.history_changed to true 4 months ago
Vitaliy Filippov 749c837045 Replace non-existing local.target with multi-user.target 4 months ago
Vitaliy Filippov 98001d845b Remove version from vitastor-release.rpm links 4 months ago
Vitaliy Filippov c96bcae74b Allow the "content" option in the PVE storage plugin so that containers can be enabled 5 months ago
Vitaliy Filippov 9f4e34a8cc Build client library without tcmalloc
Fixes "[src/tcmalloc.cc:332] Attempt to free invalid pointer ..." when trying
to run QEMU with both Vitastor and Ceph RBD disks and other possible allocator
collisions.
5 months ago
Vitaliy Filippov 81fc8bb94c Release 0.8.4
New features:
- Implement QCOW2 image/snapshot export via qemu-img (bdrv_co_block_status in the driver)
- Remove OSDs from PG history during `vitastor-cli rm-osd` to prevent `left_on_dead` PG states after deletion
- Add a new recovery_pg_switch setting to mix all PGs during recovery, which
  nearly eliminates the probability of ENOSPC during rebalance
- Introduce partial ENOSPC ("OSD is full") handling - now ENOSPC doesn't turn
  into cascades of crashes
- Add migration support to Proxmox VE Vitastor driver
- Track last_clean_pgs on a per-pool basis thus reducing data movement in a cluster
  with pools remaining unclean/degraded for a long time

Bug fixes:
- Fix a bug where monitor could generate degraded PGs if one of the hosts had no OSDs
- Fix a bug where monitor could skip PG redistribution with a lot of OSDs in cluster
- Report PG history synchronously on the first write, which improves PG consistency
  and availability at the same time, because history is now reported reliably
  and only when it is actually needed
- Fix possible write and recovery stalls which could happen in a cluster with both EC and replicated pools
- Make OSD and monitors sanitize & deduplicate PG history items in etcd
- Fix non-working OSD peer config safety check
- Fix a rare journal flush stall where flushing wasn't activated when the journal was full but the flush queue was empty
- Fix builds without ISA-L (jerasure-only) crashing with EC N+K, K>=2 due to the lack of 16-byte buffer alignment
- Fix a possible crash for EC N+K, K>=2 when calculating a parity chunk with previous parity chunk missing
- Fix a bug where vitastor-disk purge with suppressed warnings didn't work
5 months ago
Vitaliy Filippov bc465c16de Fix arithmetic on void* for clang 5 months ago
Vitaliy Filippov 8763e9211c Fix qemu driver compilation warning/error 5 months ago
Vitaliy Filippov 9e1a80bd17 Replace apt-key with trusted.gpg.d 5 months ago
Vitaliy Filippov 3e280f2f08 Mark vitastor as shared storage in PVE driver 5 months ago
Vitaliy Filippov fe87b4076b Fix backwards compatibility in cluster_client 5 months ago
Vitaliy Filippov a38957c1a7 Skip empty hosts in lp-optimizer 5 months ago
Vitaliy Filippov 137309cf29 Implement bdrv_co_block_status for snapshot export support 5 months ago
Vitaliy Filippov 373f9d0387 Try to re-peer PGs on history change 5 months ago
Vitaliy Filippov c4516ea971 Also remove deleted OSD from PG configuration and last_clean_pgs 5 months ago
Vitaliy Filippov 91065c80fc Try to prevent left_on_dead when deleting OSDs by removing them from PG history 5 months ago
Vitaliy Filippov 0f6b946add Time changes with every stat change, do not schedule checks based on it 5 months ago
Vitaliy Filippov 465cbf0b2f Do not re-schedule recheck indefinitely, run it after mon_change_timeout in any case 5 months ago
Vitaliy Filippov 41add50e4e Track last_clean_pgs on a per-pool basis 5 months ago
Vitaliy Filippov 02e7be7dc9 Prevent reenterability side effects during PG history operation resume 5 months ago
Vitaliy Filippov 73940adf07 Prioritize EC (non-instantly-stable) operations under journal pressure
This reduces the probability of hitting OSD stalls with EC due to "deadlocks"
where two parallel write operations wait for each other to complete
5 months ago
Vitaliy Filippov e950c024d3 Do not sync peer OSDs before listing
Sync before listing was added to wait for all PG writes possibly left in the queue
from the previous master to finish before listing it.

But in fact it may block the cluster when EC is used and some unstable writes
are left in the queue: they block journal flushing, and rollback/stabilize is
required to unblock them, but rollback/stabilize may only happen after the PG is
peered. Peering needs listings, listings are requested only after sync, and
sync itself waits for the blocked writes in the queue, so nothing can proceed.
5 months ago
Vitaliy Filippov 71d6d9f868 Fix possible crash on ENOSPC during operation cancel in blockstore 5 months ago
Vitaliy Filippov a4dfa519af Report PG history synchronously during write
This has 2 effects:
1) OSD sets aren't added into PG history until actual write attempts anymore
   which removes unneeded extra osd_sets in PG history
2) New OSD sets are reported synchronously and can't be lost on PG restarts
   happening at the same time as reconfiguration
5 months ago
Vitaliy Filippov 37a6aff2fa Write OSD numbers always as numbers in mon 5 months ago
Vitaliy Filippov 67019f5b02 Make OSD sort & sanitize PG history items 5 months ago
Vitaliy Filippov 0593e5c21c Fix OSD peer config safety check 5 months ago
Vitaliy Filippov 998e24adf8 Add a new recovery_pg_switch setting to mix all PGs during recovery 5 months ago
Vitaliy Filippov d7bd36dc32 Fix another rare journal flush stall 5 months ago
Vitaliy Filippov cf5c562800 Log all object locations when peering PGs 5 months ago
Vitaliy Filippov 629200b0cc Return ENOSPC as the primary OSD 5 months ago
Vitaliy Filippov 3589ccec22 Do not disconnect peer on ENOSPC during write 5 months ago
Vitaliy Filippov 8d55a1e780 Build osd_rmw_test both with and without ISA-L 5 months ago
Vitaliy Filippov 65f6b3a4eb Fix jerasure crashing on bitmap calculation/restoration due to the lack of 16-byte alignment 5 months ago
Vitaliy Filippov fd216eac77 Add a test for missing parity chunk calculation 5 months ago
Vitaliy Filippov 61fca7c426 Fix crash when calculating a parity chunk with previous parity chunk missing (test coming shortly) 5 months ago
Vitaliy Filippov 1c29ed80b9 Fix quote in docs :) 5 months ago
Vitaliy Filippov 68f3fb795e Suppress warnings in vitastor-disk purge correctly 5 months ago
Vitaliy Filippov fa90f287da Release 0.8.3
- Implement a new "vitastor-disk purge" command to remove OSDs with safety checks
- Implement a new "vitastor-cli rm-osd" command to only remove OSD metadata from etcd
- Fix a bug where the monitor could ignore OSD removal and other /osd/stats key changes
- Fix a bug where garbage could be returned when reading objects being written at the same time
- Fix a rare write stall where journal space could fail to be reclaimed when there
  were no new operations in the flush queue
- Fix a rare peering stall caused by an earlier attempt to limit the queues of long listing operations
- Fix total object count statistic in OSD on object creation
- Add missing offset&len into vitastor-disk dump-journal for big_writes, fix JSON format
- Make vitastor-cli print help on missing command
- Make vitastor-cli translate all '-' to '_' in CLI options
5 months ago
Vitaliy Filippov 795020674d Loop journal flusher when the queue is empty but there is a trim request 5 months ago
Vitaliy Filippov 8e12285629 Fix vitastor-disk purge (now it works) 5 months ago
Vitaliy Filippov b9b50ab4cc Implement vitastor-disk purge command 5 months ago
Vitaliy Filippov 0d8625f92d Make vitastor-cli print help on missing command 5 months ago
Vitaliy Filippov 2f3c2c5140 Implement safety check for OSD removal, translate all '-' to '_' in cli options
'-' to '_' translation fixes a bug with create --image_meta
5 months ago
Vitaliy Filippov 4ebdd02b0f Remove LIST op limiter
It doesn't prevent OSD slow ops but may itself lead to stalls :)
5 months ago
Vitaliy Filippov bf6fdc4141 Check add/rm osd with 2048 PGs 5 months ago
Vitaliy Filippov c2244331e6 Add vitastor-cli rm-osd command 5 months ago
Vitaliy Filippov 3de57e87b1 Recheck OSD tree in monitor on /osd/stats changes 5 months ago
Vitaliy Filippov 2d4cc688b2 Add a remove-osd test 5 months ago
Vitaliy Filippov 31bd1ec145 Fix object creation check for statistics 6 months ago
Vitaliy Filippov c08d1f2dfe Add missing offset&len into big_writes journal dump, fix commas again 6 months ago
Vitaliy Filippov 1d80bcc8d0 Fix blockstore returning garbage for unstable reads if there is an in-flight version
"In-flight" versions are added into dirty_db when writes are enqueued. And they
weren't ignored by subsequent reads even though they didn't have data location yet.
This bug was leading to test_heal.sh not passing sometimes with replicated setups.
6 months ago
Vitaliy Filippov 5ef8bed75f Release 0.8.2
- Fix QEMU driver compatibility with QEMU 7.0 and < 2.9
- Add patches for pve-qemu-kvm 7.1 (PVE 7.3) and pve-qemu-kvm 6.2 (PVE 7.2)
- Fix Proxmox driver location in the pve-storage-vitastor package
- Disable HDD autodetection in non-hybrid mode
- Explicitly warn about buggy kernels on -EAGAIN in io_uring
- Final fix for the lack of zeroing out of old metadata entries
  (do not crash with "big_write journal_entry was allocated over another object"
  in some cases after an unclean OSD shutdown)
- Wait for data writes before fsyncing data if data fsync is enabled
- Never try to wait for free space inside blockstore thus stalling OSDs
- Fix a rare crash in osd_peering due to callback ordering
- Fix a rare duplication of ping & op message IDs
- Fix a rare use-after-free during pings
- Add --force to vitastor-disk read-sb
- Make vitastor-disk dump metadata object IDs in hex, add forgotten commas
- Fix vitastor-disk SCSI disk cache check
6 months ago
Vitaliy Filippov 8669998e5e Fix discard_list_subop() for local ops 6 months ago
Vitaliy Filippov b457327e77 Oops. Fix metadata read after fixes :-) 6 months ago
Vitaliy Filippov f7fa9d5e34 Fix SCSI device cache type check 6 months ago
Vitaliy Filippov 49b88b01f9 Fix clang build 6 months ago
Vitaliy Filippov 71688bcb59 Disable HDD autodetection in non-hybrid mode 6 months ago
Vitaliy Filippov 552e207d2b Explicitly print errors about -EAGAIN in io_uring 6 months ago
Vitaliy Filippov 5464821fa5 Final fix for the lack of zeroing out of old metadata entries
If a crash occurs while flushing a redirect-write, the disk may end up
containing both the old and the new metadata entries. This is OK in itself.
Prior to 0.8.0, OSDs started without problems after this situation, but then
crashed after some more overwrites with a "tried to overwrite non-zero
metadata entry" error. 0.8.0 introduced a change that was intended to fix
this, but instead it prevented OSDs from starting at all, now because of a
"big_write journal_entry was allocated over another object" error... :-)

This change finally fixes the original issue.

Followup to 54ef2c389f
6 months ago
Vitaliy Filippov 6917a32ca8 Add --force to vitastor-disk read-sb 6 months ago
Vitaliy Filippov f8722a8bd5 Dump meta in hex 6 months ago
Vitaliy Filippov 9c2f69c9fa Add forgotten commas to vitastor-disk dump-journal 6 months ago
Vitaliy Filippov 1a93e3f33a Wait for data writes before fsyncing data if data fsync is enabled 6 months ago
Vitaliy Filippov 3f35744052 Fix compatibility with QEMU aio_set_fd_handler signatures in 7.0 and < 2.9 6 months ago
Vitaliy Filippov 66f14ac019 Update notes about Proxmox 7.1-7.3 6 months ago
Vitaliy Filippov 1364009931 Add patches for pve-qemu-kvm 7.1 (PVE 7.3) and pve-qemu-kvm 6.2 (PVE 7.2) 6 months ago
Vitaliy Filippov d7e30b8353 Fix pve-storage-vitastor filename 6 months ago
Vitaliy Filippov cb437913d3 Never try to wait for free space inside blockstore 6 months ago
Vitaliy Filippov 472bce58ab Fix rare crash in osd_peering due to callback ordering 6 months ago
Vitaliy Filippov 7a71e7ef01 Fix possible duplication of ping & op message IDs 6 months ago
Vitaliy Filippov c71e5e7bbd Fix possible use-after-free during pings 6 months ago
Vitaliy Filippov 8fdf30b21f Release 0.8.1
- Remove an additional data copy operation when flushing journal (should
  slightly increase write performance)
- Fix a bug where new writes in the inmemory_journal=false mode could overwrite
  the data currently being read by a parallel read operation
- Fix degraded parity writes for EC N+K when K>1 where the bug could also lead
  to an "assertion failed" error
- Fix missing journal space check for "big" writes which could lead to
  "prefill_single_journal_entry(): assertion failed..." error in OSD
- Fix possible "assertion failed: next->prev_wait >= 0" in client in rare cases
- Fix missing "len" field in vitastor-disk write-journal big_writes
- Fix possible crash of a full OSD (ENOSPC)
- Fix CSI build scripts to include newest packages every time
- Fix CSI endpoint in the liveness probe manifest
7 months ago
Vitaliy Filippov 238037ae31 Make journal trimmer wait until reads are completed when inmemory_journal is false
Without this, new writes may in theory overwrite journal data being read at that time
7 months ago
Vitaliy Filippov 09a8864686 Fix degraded parity writes for EC N+K when K>1
Fixes possible `calc_rmw_parity_ec(): Assertion `bufs[i][curbuf[i]].buf' failed` error
7 months ago
Vitaliy Filippov 6e6f6ecbb0 Add missing journal space check for big_writes
Fixes possible `prefill_single_journal_entry(): Assertion `!journal.sector_info[journal.cur_sector].flush_count' failed` error
7 months ago
Vitaliy Filippov 9491f81419 Add missing documentation for OSD tags 7 months ago
Vitaliy Filippov 44c2b30167 Take newest packages every time when rebuilding CSI 7 months ago
Vitaliy Filippov bf8a0581cd Fix possible "assertion failed: next->prev_wait >= 0" in client 7 months ago
Vitaliy Filippov 5953942042 Add crc32c test utility 7 months ago
Vitaliy Filippov a276a1f737 Do not copy journal data additional time when flushing 7 months ago
Vitaliy Filippov cc24e5796e Add a FIXME 7 months ago
Vitaliy Filippov 6e26732e6a Fix skipped "len" field in vitastor-disk write-journal big_writes 7 months ago
Vitaliy Filippov b4edc79449 Fix possible segfault on ENOSPC 7 months ago
Vitaliy Filippov 5f26887d32 Fix csi endpoint in liveness probe 7 months ago
Vitaliy Filippov 11ec9ad874 Release 0.8.0
- Implement automatic OSD activation via udev and simple on-disk superblock storage
- Add a new `vitastor-disk` tool and merge all disk-related functionality there.
  Now it can prepare new OSD disks, upgrade plain old systemd units to the new scheme,
  resize OSD data area, manage OSD services by disk paths, manage superblocks,
  automatically check and disable disk cache, dump and write back journal and metadata.
- Add a documentation section about `vitastor-disk` (read it if you want details!)
- Install systemd services during package installation instead of the older method
  of manually creating them via separate shell scripts
- Add a new `make-etcd` script that reuses /etc/vitastor/vitastor.conf to configure etcd
- Allow configuring block_size, bitmap_granularity and immediate_commit per-pool
- Fix "fatal error: tried to overwrite non-zero metadata entry" which was possible
  in some cases after unclean OSD shutdown (caused by old metadata entries not being zeroed)
9 months ago
Vitaliy Filippov 83bb6598dc Fix fsync autodetection for the single-device mode 9 months ago
Vitaliy Filippov 150f369346 Hotfixes for vitastor-disk prepare: max_other, get device size, older sfdisk 9 months ago
Vitaliy Filippov 8d9a5fde15 Fix docs (block_size vs object_size) 9 months ago
Vitaliy Filippov 9ccc607ab9 Fix parse_size 9 months ago
Vitaliy Filippov 8972878c77 Fix make-etcd for ip:port 9 months ago
Vitaliy Filippov 2a1da88253 Create /etc/vitastor during package installation 9 months ago
Vitaliy Filippov 2f13f347b0 Fix space stats in mon 9 months ago
Vitaliy Filippov 9453db0e99 Add a newer make-etcd.js 9 months ago
Vitaliy Filippov a828a1233d Remove old make-osd scripts 9 months ago
Vitaliy Filippov 9481456dfe Automatically check whether to disable cache during prepare 9 months ago
Vitaliy Filippov bd11db5d0a Add vitastor-mon.service, vitastor.target, create user and log directory during package installation 9 months ago
Vitaliy Filippov 68ebe5993a Fix partition reuse 9 months ago
Vitaliy Filippov eecbfb66ce Remove the old make-osd.sh script from packages 9 months ago
Vitaliy Filippov a537db8909 Add documentation for the new "vitastor-disk" tool 10 months ago
Vitaliy Filippov 54ef2c389f Followup to the "tried to overwrite" fix: also handle it in case of inmemory_meta == false 10 months ago
Vitaliy Filippov 153c73574a Refactor blockstore_init_meta into slightly more obvious code 10 months ago
Vitaliy Filippov d83580bd68 Fix "tried to overwrite non-zero metadata entry" which occurred when, during
a previous metadata flush, writing the new entry was completed but zeroing out the old one wasn't
10 months ago
Vitaliy Filippov 29b40aba93 Add write-meta command (for debug) 10 months ago
Vitaliy Filippov a52f2b0e8f Add write-journal command (for debug) 10 months ago
Vitaliy Filippov 1407db9c08 Fix vitastor-disk prepare bugs 10 months ago
Vitaliy Filippov c0d5e83fb8 Run partprobe when partitions do not appear 10 months ago
Vitaliy Filippov 40d8d65188 Rewrite upgrade-simple to C++ 10 months ago
Vitaliy Filippov a16263e88c Fix bugs in the upgrade script and in the udev startup script 10 months ago
Vitaliy Filippov e62bab1b39 Add systemd unit for udev deployments 10 months ago
Vitaliy Filippov cb4e3a118d Fix warning 10 months ago
Vitaliy Filippov b1e39b5dea Split disk_tool.cpp into separate files 10 months ago
Vitaliy Filippov 1170319431 Finish vitastor-disk prepare in theory 10 months ago
Vitaliy Filippov 2e0a2221eb vitastor-disk prepare: WIP second form of the command 10 months ago
Vitaliy Filippov 5a10d135f3 Allow configuring block_size, bitmap_granularity and immediate_commit per-pool 10 months ago
Vitaliy Filippov 4c9aaa8a86 vitastor-disk prepare: implement first form of the command 10 months ago
Vitaliy Filippov ae99ee6266 Rename base64.{cpp,h} to str_util 10 months ago
Vitaliy Filippov 5af75f7d78 Implement vitastor-cli and vitastor-disk --help <command> 10 months ago
Vitaliy Filippov 7dc6f10ea1 Add read-sb command 10 months ago
Vitaliy Filippov 6fde9950d6 Implement upgrade tool from "simple" units to superblock+udev deployments 10 months ago
Vitaliy Filippov 76dd0fdcea Implement pre-exec command with on-start OSD checks 11 months ago
Vitaliy Filippov 5acc19bbd5 Implement systemctl start/stop and other commands 11 months ago
Vitaliy Filippov d5ca4e1f90 Add exec-osd command 11 months ago
Vitaliy Filippov 67e04f789f Add write-sb (superblock) command 11 months ago
Vitaliy Filippov 837407a84c Add udev import command 11 months ago
Vitaliy Filippov 1fe5908899 WIP OSD activation from superblock 11 months ago
Vitaliy Filippov dcc6d546be Move simple-offsets into vitastor-disk, too 11 months ago
Vitaliy Filippov 85fa389557 Add a test for disk-tool resize 11 months ago
Vitaliy Filippov dfa433c63b Add JSON format to dump-journal 11 months ago
Vitaliy Filippov cf487c95aa Fix resizer 11 months ago
Vitaliy Filippov b10656ca09 Parse new disk params in disk_tool resizer 11 months ago
Vitaliy Filippov ea632367e9 Do not alter dsk.meta_offset/len to skip superblock 11 months ago
Vitaliy Filippov 4d777c6729 Set journal/meta devices to data device explicitly instead of "" 11 months ago
Vitaliy Filippov 0c404c5074 Use blockstore_disk in disk_tool 11 months ago
Vitaliy Filippov dfd80626bd Extract disk opening functions to separate module 11 months ago
Vitaliy Filippov 30907852c2 Use simple std::map for the config 11 months ago
Vitaliy Filippov 078ed5b116 WIP Data area resize tool 11 months ago
Vitaliy Filippov 73a363bf92 Rename some variables and constants 11 months ago
Vitaliy Filippov b0e86ca643 Merge dump-journal and dump-meta into the new "vitastor-disk" tool 11 months ago
Vitaliy Filippov 8800afb649 Fix void* arithmetic again 11 months ago
Vitaliy Filippov c10c90f620 Swap cli.en.md and cli.ru.md contents O_o 11 months ago
Vitaliy Filippov e20cdd13b6 Fix simple-offsets return value 11 months ago
Vitaliy Filippov d29b5d2d04 Add Russian translation of VNPL-1.1 12 months ago
Vitaliy Filippov 65b0e8e940 Fix typo in VNPL-1.1 12 months ago
Vitaliy Filippov bce357e2a5 Do not read all metadata into memory when dumping 12 months ago
Vitaliy Filippov 0876ca09cd Fix dumper includes and print format 12 months ago
Vitaliy Filippov dac12d8a4c Implement metadata dump tool 12 months ago
Vitaliy Filippov 1eec4407ab Fix inode creation when /index/maxid is out of sync 1 year ago
huy 3b7c6dcac2 Fix volume creation from snapshots in Cinder driver 1 year ago
Vitaliy Filippov 342517d126 Fix typo 1 year ago
Vitaliy Filippov 675bc12a13 Add extern "C" for systems like Gentoo which miss it in jerasure includes 1 year ago
Vitaliy Filippov 101592bbff Release 0.7.1
- Add ISA-L erasure code implementation, now used automatically instead of jerasure when available
- Fix listings sending too many parallel requests to OSDs
- Fix rm-data crashing with --wait-list
- Remove empty inodes from statistics and `ls` output <inode_vanish_time> seconds after deletion
- Make monitor delete pool statistics when the pool is deleted and thus remove them from `df` output
- Log multiple etcd addresses in OSD logs correctly
- Fix true/false parsing in json configs like no_recovery/no_rebalance
- Show no_recovery, no_rebalance, readonly flags in status
1 year ago
Vitaliy Filippov be4087d9d2 Add a FIXME to test_interrupted_rebalance 1 year ago
Vitaliy Filippov 404e43dd2d Note that ISA-L does not need to be enabled separately 1 year ago
Vitaliy Filippov 87613ed590 Add ISA-L into RPM specs 1 year ago
Vitaliy Filippov 2a2e914ef9 Show no_recovery, no_rebalance and readonly flags in status 1 year ago
Vitaliy Filippov 0cdc9292c8 Fix true/false parsing in json configs like no_recovery/no_rebalance 1 year ago
Vitaliy Filippov 3e1b03bb5c Show all etcd addresses in the "reporting to..." message 1 year ago
Vitaliy Filippov 36e851505a Make monitor delete pool statistics when the pool is deleted 1 year ago
Vitaliy Filippov 1efbbb0c36 Make deleted inodes vanish from statistics after 60 seconds 1 year ago
Vitaliy Filippov 088dd15449 Exclude empty inodes from stats 1 year ago
Vitaliy Filippov 4a531d7b8b Fix listings sending too many parallel requests to OSDs, fix rm-data crashing with --wait-list 1 year ago
Vitaliy Filippov a0cae4c180 Rename "jerasure" to "ec" in pool configuration, function names, fix documentation and Debian build scripts
Old pool configurations with "jerasure" also remain supported as an alias for "ec"
1 year ago
Vitaliy Filippov c4eb46600d Merge run_3osds and run_7osds scripts 1 year ago
Vitaliy Filippov 21b306e25f Add ISA-L support 1 year ago
Vitaliy Filippov d8313e939a Release 0.7.0
- Add documentation! :-) in Russian and English
- Implement an NFS proxy for file-based access emulation to Vitastor
  images for non-QEMU based hypervisors like VMWare, as a better way
  than iSCSI
- Implement "primary affinity tags"
- Add a patch for libvirt 6.0
- Fix free_down_raw in cli status
- Fix a rare bug where OSDs could drop unrelated connections on errors
1 year ago
huynnp911 3e92c3f082 Add patch for libvirt 6.0 1 year ago
Vitaliy Filippov 82b9f4c52d Add a test with OSD kills 1 year ago
Vitaliy Filippov 2bdf415eb3 Fix unknown OSD numbers on error 1 year ago
Vitaliy Filippov f826831282 Describe OSD placement tree and reweights 1 year ago
Vitaliy Filippov 5d47bbe04c Add documentation 1 year ago
Vitaliy Filippov 93a9f1ef89 Fix NFS socket read hangs 1 year ago
Vitaliy Filippov 2697aae909 Fix free_down_raw in cli status 1 year ago
Vitaliy Filippov 6b69db73ac Remove getrandom() usage 1 year ago
Vitaliy Filippov d48a824846 Fix some warnings 1 year ago
Vitaliy Filippov 40985282ff Fix build under GCC 8 1 year ago
Vitaliy Filippov acf403e886 Add install target for NFS proxy 1 year ago
Vitaliy Filippov cf03b9c84d Implement "primary affinity tags" 1 year ago
Vitaliy Filippov 7c2379d458 Simplified NFS proxy based on own NFS/XDR implementation 1 year ago
Vitaliy Filippov a2189100dd Make CLI functions usable in library form
Return results and errors in a variable instead of just printing them,
separate vitastor-cli main() from cli_tool_t, move positional argument
parsing to CLI main from command implementations.
1 year ago
Vitaliy Filippov bb84379db6 Release 0.6.17
- Fix incorrect reading of extra metadata block leading to extra unknown objects in stats
- Fix CSI driver volumeMode: Block support
- Add block PVC and pod examples
- Fix build under 32 bit architectures
- Fix slow connection ramp-up caused by up_wait_retry_interval
1 year ago
Vitaliy Filippov 714dda8151 Fix slow connection ramp-up caused by up_wait_retry_interval pausing operations on first connection attempt 1 year ago
Vitaliy Filippov 834554c523 LD_PRELOAD=libasan.so.5 fio in tests fails when vitastor is built with ASan 1 year ago
Vitaliy Filippov e718116f54 Fix incorrect reading of extra metadata block 1 year ago
Vitaliy Filippov 98e3528a14 Add block PVC and pod examples 1 year ago
Vitaliy Filippov 8e88f77101 Fix CSI driver volumeMode: Block support 1 year ago
Vitaliy Filippov caa2cc2e6c Fix 32bit build error 1 year ago
Vitaliy Filippov 842ba8b831 Use (uint64_t)1 instead of 1l / 1ul 1 year ago
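A minimal sketch of why a cast such as `(uint64_t)1` is safer than `1l` / `1ul` in shift expressions, presumably the motivation for the commit above. It assumes a target where `long` is 32 bits wide, as on typical 32-bit Linux builds. The helper name `bit_mask` is hypothetical and does not come from the Vitastor source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns a 64-bit value with only bit `n` set.
// Casting the literal to uint64_t before shifting keeps the shift 64 bits
// wide on every platform. With `1l << n` or `1ul << n` the shift is done in
// `long`, which is only 32 bits wide on many 32-bit targets, so any n >= 32
// would be undefined behaviour there.
static inline uint64_t bit_mask(unsigned n)
{
    assert(n < 64);
    return (uint64_t)1 << n;
}
```

For example, `bit_mask(40)` yields 1099511627776 (2^40) regardless of the width of `long`.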
Vitaliy Filippov 1493823f9e Note about starting monitors 1 year ago
Vitaliy Filippov c857272f44 Comment: epoch is uint64_t 1 year ago
Vitaliy Filippov 340a4b4f27 Release 0.6.16
- Implement `vitastor-cli status` (print cluster status) command
- Add a new `make-osd-hybrid.js` script to quickly prepare a lot of hybrid (HDD+SSD) OSDs
- Implement snapshot deletion for Cinder driver (only works in a healthy cluster)
- Fix a huge :) bug causing reads to return all zeroes during rebalance. Add a test to prevent it in the future
- Disconnect NBD proxy correctly without leaving a zombie [vitastor-nbd] process in D state
- Fix a rare write hang appearing with small write throttling enabled
1 year ago
Vitaliy Filippov 5118980315 Add a script to run all tests 1 year ago
Vitaliy Filippov d71cc174e3 Implement CLI status command 1 year ago
Vitaliy Filippov 0eb929f1ba Fix change_pg_count test (statistic reporting may take some time) 1 year ago
Vitaliy Filippov 83146fa3e2 Fix the same HUGE bug for regular reads during rebalance 1 year ago
Vitaliy Filippov 15dcaf7903 Add the same "rebalance" test with regular reads 1 year ago
Vitaliy Filippov cd18ef7323 Disconnect NBD proxy correctly without leaving a zombie [vitastor-nbd] process in D state 1 year ago
Vitaliy Filippov 39531ef1a6 Fix incorrect chained reads during rebalance (the bug detected by test_rebalance_verify.sh) 1 year ago
Vitaliy Filippov d334914948 Fix the test so it actually fails indicating a bug :-) 1 year ago
Vitaliy Filippov c373425562 Fix nbd log 1 year ago
Vitaliy Filippov 3615e57879 Register standby monitors in etcd in /mon/member 1 year ago
Vitaliy Filippov 0edc6fe5a6 Add notes about the new script 1 year ago
Vitaliy Filippov 9c30df83e3 Fix a HUGE :) bug in NBD proxy
The bug could result in corrupted data on large writes
1 year ago
Vitaliy Filippov a420c77107 Add rebalance-verify test 1 year ago
Vitaliy Filippov 4100d829c7 Allow overriding the log file for daemonized NBD proxy 1 year ago
Vitaliy Filippov 79ebda933e Fix a write hang with throttling due to timer reenterability / triggerability 1 year ago
Vitaliy Filippov 65d08e067e Add a script for preparing hybrid (HDD+SSD) OSDs 1 year ago
Vitaliy Filippov d289753df4 Implement snapshot deletion for Cinder driver 1 year ago
Vitaliy Filippov 85298ddae2 Release 0.6.15
- Make peering much faster in medium to large clusters
- Fix a reenterability issue which could rarely lead to peering process hangs
1 year ago
Vitaliy Filippov e23296a327 Rename cli_rm -> cli_rm_data, cli_snap_rm -> cli_rm 1 year ago
Vitaliy Filippov 839ec9e6e0 Shard clean_db by PGs to speedup listings 1 year ago
Vitaliy Filippov 7cbfdff41a Replace some throws with force_stop 1 year ago
Vitaliy Filippov 951272f27f Try to process PG one after another 1 year ago
Vitaliy Filippov a3fb1d4c98 Fix reenterability around set_timer 1 year ago
Vitaliy Filippov 88402e6eb6 Move next_request to run_cb_and_clear 1 year ago
Vitaliy Filippov 390239c51b Don't terminate HTTP requests with timeouts if response is already available in the socket 1 year ago
Vitaliy Filippov b7b2adfa32 Fix http client not continuing requests in case of failure to connect 1 year ago
Vitaliy Filippov 36c276358b Attempt to fix "head-of-line blocking" by LIST operations 1 year ago
Vitaliy Filippov 117d6f0612 Release 0.6.14
- Fix IPv6 address parsing
- Fix "cannot read bytes of undefined" in the monitor on a fresh DB
- Fix possible hangs of read requests on OSD restarts without immediate_commit=all mode
- Fix OSDs skipping misplaced recovery in some cases
- Fix OSDs possibly dying with "map::at" errors when other OSDs are stopped
- Fix division by zero in ls if all pool OSDs are down
1 year ago
Vitaliy Filippov 7d79c58095 Use the larger sockaddr_storage structure 1 year ago
Vitaliy Filippov 46d2bc100f Add some tolerance to stat calculation so it does not fail on a fresh DB 1 year ago
Vitaliy Filippov 732e2804e9 Fix operation dependency counter underflow for reads without immediate_commit=all mode 1 year ago
Vitaliy Filippov abaec2008c Fix OSDs missing misplaced recovery 1 year ago
Vitaliy Filippov 8129d238a4 Different fio versions have different types for xfer_buflen, but Vitastor does not support 128-bit offsets anyway 1 year ago
Vitaliy Filippov 61ebed144a Fix OSDs possibly dying with "map::at" errors when other OSDs are stopped 1 year ago
Vitaliy Filippov 9d3ba113aa Extract bind socket code into a utility function 1 year ago
Vitaliy Filippov 9788045dc9 Fix division by zero in ls if all pool OSDs are down 1 year ago
Vitaliy Filippov d6b0d29af6 4k MEM_ALIGNMENT 1 year ago
Vitaliy Filippov 36f352f06f Release 0.6.13
- Fix possible client hangs on OSD restarts (the bug affected versions from 0.5.11)
- Fix io_uring-related "Assertion `sqe != NULL' failed" crashes possible
  on some kernels (0.6.11 made this bug more likely)
- Fix timeout=0 in NBD proxy
- Fix build under CentOS 7
1 year ago
Vitaliy Filippov 318cc463c2 Fix warnings 1 year ago
Vitaliy Filippov 145e5cfb86 MCL_ONFAULT is not available under CentOS 7 1 year ago
Vitaliy Filippov 73ae578981 Add osd_memlock option 1 year ago
Vitaliy Filippov 20ee4ed758 Update some parameter docs 1 year ago
Vitaliy Filippov 63de79d1b2 Change > to | to preserve newlines 1 year ago
Vitaliy Filippov f712967079 And one more sqe starvation fix 1 year ago
Vitaliy Filippov df0cd85352 Fix another part of the "async sqe clear" bug (followup to d9857a5340) 1 year ago
Vitaliy Filippov ebaf4d7a72 Fix compatibility with fio 3.28+ 1 year ago
Vitaliy Filippov d4bc10542c Fix compatibility with liburing >= 2.1 where it only has __pad2[2] 1 year ago
Vitaliy Filippov 140309620a Free recv_buf in nbd_proxy 1 year ago
Vitaliy Filippov 0a610ee943 Destroy the client after completing CLI command 1 year ago
Vitaliy Filippov f3ce166064 Do not print nan% in df when a pool has no available OSDs 1 year ago
Vitaliy Filippov 717d303370 Handle get_sqe failures, don't die with "will fall out of sync" in epoll_manager
The problem is that in recent kernels io_uring may return completions BEFORE
clearing the submission queue. For example, if the queue capacity is 512 and
512 requests were submitted, then after one of them completes the queue "should
have" 1 free slot by the time the completion is processed. But sometimes it
doesn't, because io_uring doesn't always clear the submission queue before
sending the CQE :-/
1 year ago
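The failure mode described in the commit message above can be sketched as follows. This is not the Vitastor code and does not use liburing: `toy_ring_t` and `submitter_t` are hypothetical stand-ins. A toy fixed-capacity queue plays the role of the submission queue, and `try_get_slot()` returning false mimics `io_uring_get_sqe()` returning NULL. The point shown is only the general technique of deferring an operation and retrying later instead of aborting when no slot is free.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Toy stand-in for an io_uring submission queue (NOT liburing)
struct toy_ring_t
{
    size_t capacity;
    size_t used = 0;

    explicit toy_ring_t(size_t cap): capacity(cap) {}

    // Returns false when no slot is free, like io_uring_get_sqe() returning NULL
    bool try_get_slot()
    {
        if (used >= capacity)
            return false;
        used++;
        return true;
    }

    // Models the kernel consuming one submitted entry and freeing its slot
    void release_slot()
    {
        assert(used > 0);
        used--;
    }
};

// Hypothetical submitter: defers operations instead of dying when the queue is full
struct submitter_t
{
    toy_ring_t *ring;
    std::deque<std::function<void()>> deferred;

    explicit submitter_t(toy_ring_t *r): ring(r) {}

    void submit(std::function<void()> op)
    {
        // Keep ordering: if something is already waiting, queue behind it
        if (!deferred.empty() || !ring->try_get_slot())
        {
            deferred.push_back(std::move(op));
            return;
        }
        op();
    }

    // Called later (e.g. on the next event loop iteration) to retry waiting operations
    void retry_deferred()
    {
        while (!deferred.empty() && ring->try_get_slot())
        {
            auto op = std::move(deferred.front());
            deferred.pop_front();
            op();
        }
    }
};
```

With a ring of capacity 2, submitting three operations runs two immediately and defers the third. It only runs once a slot has actually been released and `retry_deferred()` is called again, which is the situation where a completion arrives before the slot is free.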
Vitaliy Filippov d9857a5340 Check for SQEs, not for completions
Should finally fix the "Assertion `sqe != NULL' failed" crash introduced by the
journaling refactor in 0.6.11...
1 year ago
Vitaliy Filippov eb5d9153e8 Fix build under CentOS 7 1 year ago
Vitaliy Filippov ae6d1ed1d5 Remove completed items 1 year ago
Vitaliy Filippov d123e58ea3 Fix yaml syntax - remove ` in default 1 year ago
Vitaliy Filippov d9869d8116 Add parameter documentation 1 year ago
Vitaliy Filippov 4047ca606f Add missing cancel_op(currently being read op) when stopping a client
Fixes possible client hangs after stopping and restarting an OSD.
Hangs happened when a connection was closed in the middle of reading a READ
operation reply from the network. In this case the operation being read was
in read_op and the client didn't free it when closing the connection.

Test case for msgr_read.cpp:
- Partially read reply for a READ operation
- stop_client()
- Check that the READ operation returns EPIPE

The bug was actually introduced in 0.5.11.
1 year ago
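The test case outlined in the commit message above can be sketched with heavily simplified, hypothetical types. `toy_op_t`, `toy_client_t` and the bodies of `cancel_op()` and `stop_client()` below are illustrative assumptions, not the messenger code from msgr_read.cpp. Reporting the error as a negative errno value is also an assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <vector>

// Hypothetical, heavily simplified operation type
struct toy_op_t
{
    int retval = 0;
    bool done = false;
    std::function<void(toy_op_t*)> callback;
};

// Hypothetical client connection state
struct toy_client_t
{
    std::vector<toy_op_t*> sent_ops; // operations still waiting for a reply
    toy_op_t *read_op = nullptr;     // operation whose reply is partially read
};

// Complete an operation with EPIPE so its owner is not left waiting forever
static void cancel_op(toy_op_t *op)
{
    assert(!op->done); // an operation must not be completed twice
    op->retval = -EPIPE;
    op->done = true;
    if (op->callback)
        op->callback(op);
}

static void stop_client(toy_client_t & cl)
{
    for (auto op: cl.sent_ops)
        cancel_op(op);
    cl.sent_ops.clear();
    // The step the commit describes as missing: the operation currently being
    // read must be cancelled too, otherwise its callback is never invoked
    if (cl.read_op)
    {
        cancel_op(cl.read_op);
        cl.read_op = nullptr;
    }
}
```

In this sketch, setting `read_op` to an in-flight operation and then calling `stop_client()` completes that operation with `-EPIPE` and invokes its callback exactly once.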
Vitaliy Filippov 218e294e9c > 0, of course 1 year ago
Vitaliy Filippov c1929cabe0 Release 0.6.12
etcd connection stability, clang & elbrus support

- Fix build under Clang and Elbrus LCC compilers, making Vitastor compatible
  with Elbrus CPUs :)
- Completely fix the bug where OSDs didn't connect to peers and incorrectly marked
  PGs as incomplete
- Limit I/O depth for deletes the same way as for small writes. Makes OSD crashes
  with "Assertion failed: sqe != NULL" during image deletion go away
- Fix a very old, but rare, journaling bug (credits to https://github.com/mirrorll)
- Fix flushing of unclean journaled objects leading to OSDs sometimes hanging
  after failover in EC setups (bug was introduced in 0.6.7)
- Fix several problems that could prevent smooth operation of a Vitastor cluster
  under the condition of partial etcd failure:
  - OSDs could randomly fail due to too strict error handling
  - New clients and OSDs could be unable to start because of the lack of retries
  - CLI could fail some commands because of the lack of retries
  - Monitor could stop receiving state updates because of the lack of websocket pings
- Fix monitor being unable to rebalance PGs after a downscale of pool pg_size (3->2)
- Exit with failure when trying to nbd map or benchmark a non-existing image
- Use HTTP keep-alive for etcd connections
- Allow configuring etcd request timeouts and retries
- Allow configuring NBD timeout, max devices and partitions, and set the default to
  up to 64 devices with up to 3 partitions each
1 year ago
Vitaliy Filippov cc6b24e03a Allow to configure NBD timeout, max devices and partitions
Also set default NBD devices/partitions to 64/3; the Linux default is 16/16, which is way too low
1 year ago
Vitaliy Filippov 0757ba630a Do not happily NBD "map" non-existing images, do not try to benchmark them too 1 year ago
Vitaliy Filippov 2a0b881685 Respect max_write_iodepth for deletes 1 year ago
Vitaliy Filippov 9a15b843ff Do not set pg_real_size to 0 1 year ago
Vitaliy Filippov 8dc1ffb13b Try to connect with PG peers before deciding it's incomplete :)
I already attempted to fix it in 0.6.11, but the fix turned out to be
only partial :)
1 year ago
Vitaliy Filippov ba63af49b4 Add etcd retries everywhere (they were missing in some places) 1 year ago
Vitaliy Filippov 31b9c683ee Fix flushing of unclean objects
This was preventing OSD failover when there were some unclean objects.
Bug was introduced in aa436027c8
1 year ago
Vitaliy Filippov