Vitaliy Filippov
f9fbea25a4
Remove double write when old and new locations are in the same metadata block
...
Also add another metadata entry fool-safety check which, ideally, will never fire %)
2023-06-03 00:47:10 +03:00
Vitaliy Filippov
2c9a10d081
Fix an idiotic bug leading to failed reads with -ERANGE with EC :D
2023-06-03 00:44:52 +03:00
Vitaliy Filippov
150968070f
Slightly improve some debug prints
2023-05-29 01:04:16 +03:00
Vitaliy Filippov
cdfc74665b
Close client FDs only when destroying the client, after handling all async reads/writes
...
Fixes "Client XX command out of sync" sometimes happening on reconnections
2023-05-25 00:52:43 +03:00
Vitaliy Filippov
3b4cf29e65
Release 0.9.0
...
New features:
- Scrubbing! Check documentation: [auto_scrub](src/branch/master/docs/config/osd.en.md#auto_scrub)
- Document online-updatable configuration parameters
Bug fixes:
- Fix NaN during PG optimisation if there are nonexisting OSDs in node_placement
- Fix monitor crash on pool deletion
- Clear journal_device and meta_device before initialising the next OSD in automatic mode
- Sync unsynced deletes before overwriting them with a lower version
  (reproduced mostly/only after scrubbing)
2023-05-21 15:07:14 +03:00
Vitaliy Filippov
ce02f47de6
Allow disabling scrub_find_best
2023-05-21 12:33:38 +03:00
Vitaliy Filippov
f1961157f0
Fix brute-force error locator for EC n+k with k > 2
2023-05-21 00:57:14 +03:00
Vitaliy Filippov
88c1ba0790
Fix compile errors with gcc 10
2023-05-20 23:20:09 +03:00
Vitaliy Filippov
fa90b5a4e7
Schedule automatic scrubs correctly (not just after previous scrub)
2023-05-20 23:20:09 +03:00
Vitaliy Filippov
8d40ad99a6
Add scrub documentation
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
6ca20aa194
Allow scrub to fix corrupted object states
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
4bfd994341
Sync unsynced deletes before overwriting them with a lower version
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
59e959dcbb
Do not die when "different versions are returned from subops"
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
a9581f0739
Handle dirty deletes during read correctly O_o
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
105a405b0a
Implement vitastor-cli fix
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
0e5d0e02a9
Add "vitastor-cli describe" command
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
0439981a66
Implement "describe object(s)" operation
...
Required to implement fixing inconsistent objects in vitastor-cli
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
6648f6bb6e
Implement ambiguity detection during scrub
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
281be547eb
Implement brute-force error locator for EC
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
0c78dd7178
Add no_scrub flag
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
3c924397e7
Store next scrub timestamp instead of last scrub timestamp
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
c3bd26193d
Implement PG scrub runner
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
43b77d7619
Implement scrubbing "data path" - OSD_OP_SCRUB
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
a6d846863b
Add min/max stripe and limit to OP_LIST
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
8dc427b43c
Retry failed reads (including chained and RMW) from other replicas
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
bf2112653b
Refcount object_states
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
0538a484b3
Add corrupted object state
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
97720fa6b4
Remove unused capture
2023-05-20 22:58:51 +03:00
Vitaliy Filippov
e60e352df6
Improve vitastor-nbd documentation
2023-05-20 22:58:51 +03:00
Vitaliy Filippov
629999f789
Clear journal_device and meta_device before initialising the next OSD in automatic mode
2023-05-15 23:58:55 +03:00
Vitaliy Filippov
5a9e1ede52
Release 0.8.9
...
- The tests are now stable and run in a CI system based on Gitea CI
- The release includes final bug fixes for EC:
  - Implement missing EC recovery of allocation bitmap when built with ISA-L
  - Fix broken snapshot export with EC (allocation bitmap reads were giving incorrect results previously)
- Also fixed bugs manifesting under heavy load:
  - Fix monitor possibly applying incorrect PG history on retries
  - Fix monitor incorrectly changing PG count when last_clean_pgs contains fewer PGs than the new number
  - Allow writes to wait for free space again, but now correctly (previously dropped in 0.8.2)
  - Fix a rare segfault in client (handle client stop during incoming stream handling in 1 more place)
  - Make monitor correctly handle etcd connection errors - it could die instead of connecting to another etcd
  - Fix OSD rarely being unable to report PG states after a PG was taken over by another OSD
- Fixed return code for incomplete EC objects (now EIO) and made cluster client retry this error
- Made other small changes for tests: timeouts, nice/ionice for etcd, waiting conditions, NBD device checks and so on
2023-05-14 01:25:09 +03:00
Vitaliy Filippov
de3e609166
Add a FIXME about QEMU driver thread safety
2023-05-14 00:06:09 +03:00
Vitaliy Filippov
11481170f5
Add a FIXME about ENOSPC
2023-05-13 23:59:44 +03:00
Vitaliy Filippov
6442010f93
Skip offline PGs during state reporting when the state is already deleted or taken over by another OSD
...
This fixes OSDs being unable to report PG states in rare conditions
2023-05-12 23:17:45 +03:00
Vitaliy Filippov
ce4a8067b5
Handle client stop during incoming stream handling in 1 more place
2023-05-11 01:53:41 +03:00
Vitaliy Filippov
8cac795445
Return EIO instead of EINVAL for incomplete EC objects
2023-05-11 01:15:23 +03:00
Vitaliy Filippov
a409598b16
Wait for free space again, but count on big_write flushes instead of just flusher activity
2023-05-10 01:51:02 +03:00
Vitaliy Filippov
f4c6765522
Ignore ENOENT in epoll_ctl
2023-05-08 20:39:20 +03:00
Vitaliy Filippov
5da1d8e1b5
Fix EC just-bitmap reads (len=0) (fixes SCHEME=ec test_snapshot.sh)
2023-05-07 14:00:08 +03:00
Vitaliy Filippov
44f86f1999
Add a basic EC 2+2 recovery test (not really required, but let it be there)
2023-05-07 11:26:27 +03:00
Vitaliy Filippov
2d9a80c6f6
Implement missing bitmap recovery with ISA-L \(°□°)/
2023-05-07 11:25:51 +03:00
Vitaliy Filippov
ab615849d6
Release 0.8.8
...
- Fix vitastor-cli rm/rm-data broken in 0.8.6 (missing messenger initialization)
- Prepare OSD read handler for upcoming version with scrub - allow "secondary reads" to return errors
- Fix OSDs re-peering PGs infinitely with a big number of PGs (reproduced in test_add_osd)
- Fix another variant of flusher sync-waiting stall (reproduced in test_write)
- Fix other tests in tests/ (will add them to Gitea CI soon)
- Add patches for QEMU 6.2-8.0
- Fix QEMU driver compatibility with QEMU 8.0
- Build packages for RHEL 9 clones (based on AlmaLinux 9)
2023-04-28 11:22:00 +03:00
Vitaliy Filippov
b94587ef0e
Fix some build warnings
2023-04-28 00:44:27 +03:00
Vitaliy Filippov
c768a9015f
Fix QEMU driver compatibility with QEMU 8.0
2023-04-25 11:20:21 +03:00
Vitaliy Filippov
b74ccb613c
Fix another variant of flusher sync-waiting stall
2023-04-24 00:44:41 +03:00
Vitaliy Filippov
a04dab0840
Initialize messenger in cluster_client listings
2023-04-24 00:44:41 +03:00
Vitaliy Filippov
160863f707
Print op pointer values in slow log
2023-04-23 17:54:00 +03:00
Vitaliy Filippov
2877cd0adb
Allow OP_SEC_READ to return errors (do not hang the connection)
2023-04-23 17:54:00 +03:00
Vitaliy Filippov
480509f5b9
Fix pg_data_size > 1 for replicas (harmless bug)
2023-04-23 01:50:42 +03:00
Vitaliy Filippov
46462da45e
Preload own PG history updates to fix PG state loop possibly applying the old metadata version
2023-04-23 01:50:30 +03:00