Vitaliy Filippov
a9ef9a86c0
Add update() API to kv_db
2024-01-06 17:14:42 +03:00
Vitaliy Filippov
25832cb7e4
Fix eviction when random_pos selects the end
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
e6326c6539
Implement min/max list_count to make listings during performance test reasonable
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
e32f382815
Fix and improve parallel allocation
- Do not try to allocate more DB blocks in an inode block until it's "confirmed" and "locked" by the first write
- Do not recheck for new zero DB blocks on first write into an inode block - a CAS failure means someone else is already writing into it
- Throw new allocation blocks away regardless of whether the known_version is 0 on a CAS failure
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
fb23d94000
Implement key_prefix for K/V stress test
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
ee462c2dad
More fixes
- do not overwrite a block with older version if known version is newer
  (read may start before update and end after update)
- invalidated block versions can't be remembered and trusted
- right boundary for split blocks is right_half when diving down, not key_lt
- restart update also when block is "invalidated", not just on version mismatch
- copy callback in listings to avoid closure destruction bugs too
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
16e4c767f1
Add logging and one more assert
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
9e2b677499
Make get_block() wait for updating when unrelated block is found along the path
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
fd57096d2d
Fix a race condition where changed blocks were parsed over existing cached blocks, producing a mix of data
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
e5ae907256
Simplify code by removing an unneeded "optimisation"
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
64fd6f1c56
Add kv_log_level, print warnings on level 1, trace ops on level 10
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
6e76e09d16
Fix duplicate keys in listings on parallel updates -- do not rewind key "iterator position"
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
0964aeebd2
Implement key suffix to avoid collisions of multiple test workers
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
facff20ca1
Do not complain on empty first block
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
16e09745f0
Add JSON output for stress-tester
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
442f44a64f
Print total stats
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
c67e3d56cb
Do not send more than op_count operations (fix segfault on finish)
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
de41e46335
Add some more resiliency to serialize()
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
bf9a279ff9
Invalidate blocks being updated too
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
b7a41e6394
Change new block allocation method: make each writer choose multiple empty PG blocks and place blocks in them
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
4175cb3720
Remove blocks from cache on unsuccessful updates
2024-01-02 13:24:15 +03:00
Vitaliy Filippov
3a4b71b0cd
Allow tracking multiple updates per block (it should never happen though)
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
34969c5919
Do not call stop_updating after failed write_new_block and after clear_block (both delete the item)
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
02a8df6586
Track versions of parent blocks and recheck if changed during update
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
86c6482cf3
Fix resume_split condition (key_lt can also be "")
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
4dd68c543c
Experiment: transform offsets for better sharding
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
10ad96c56c
More post-stress-test fixes
- Prevent _split types of new blocks
- Stop updating new blocks only after the whole update, otherwise pointers
  may become invalid
- Use recheck_none for updates initially
- Use UINT64_MAX as initial block version when postponing ops, otherwise the
  check fails when the block is initially empty. This for example leads to
  writing both leaf items & block pointers (which is incorrect) into the root
  block when starting stress-test with --parallelism 32
- Fix -EINTR comparison
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
6e451117ce
Print operation statistics
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
85f35bdf30
K/V fixes after stress-test :-)
- track block versions correctly - per inode block (128 KB) instead of tree block (4 KB)
- prevent multiple parallel CAS writes of the same inode block
- add logging for EILSEQ which means invalid data in the tree
- fix get_block updated flag which was true for blocks already in cache and was leading to infinite loops on "unrelated block" errors
- apply changes to blocks in cache only after successful writes (using "virtual changes")
- do not replace cached block with an older version from disk
- recheck "unrelated blocks" (read/update collisions) until data stops changing
- track tree path correctly - do not treat split block as parent of its right half
- correctly move blocks when finding new empty place on disk
- restart updates from the beginning when one of blocks is changed by a parallel update
- fix delete, which used the SET opcode and set the key to an empty value instead of removing it
- prevent changing the same key more than 1 time in parallel
- fix listing verification
- resume continue_updates in update_find (required because it uses continue_update itself)
- add allow_old_cached parameter to get()
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
09adaf62fd
Implement K/V DB stress tester
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
af93f8323c
Evict blocks based on memory limit & block usage
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
3d29c76ff4
Track blocks per level
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
19275379c1
Track block level
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
96ad3c7c50
Experimental B-Tree Vitastor embedded K/V database implementation!
2024-01-02 13:24:14 +03:00
Vitaliy Filippov
2f228fa96a
Only treat data partitions as existing OSDs in vitastor-disk prepare
2023-12-31 11:46:47 +03:00
Vitaliy Filippov
2f6b9c0306
Remove etcd parameter from default command examples
2023-12-31 02:50:41 +03:00
Vitaliy Filippov
48b5f871e0
Add Contributor License Agreement in Russian and English
2023-12-31 01:23:52 +03:00
Vitaliy Filippov
c17f76a3e4
Add documentation for recovery auto-tuning
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
a6ab54b1ba
Do not allow negative util_low/high
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
99ee8596ea
Rename min/max_util to util_low/high
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
c4928e6ecd
Protect from try_send completing the operation immediately
Fixes a possible use-after-free in case of continue_ops() calling try_send(),
then connect_peer() -> set_timer() -> trigger_nearest() -> handle_op_part() -> continue_ops() again
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
ec7dcd1be5
Do not apply very large recovery pauses during tests
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
e600bbc151
Fix flapping move_reappear test by adding an fsync before stopping PG
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
8b8c1179a7
Use a separate used_blocks counter for free space stats to hide possibly delayed on-flush deallocation
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
d5a6fa6dd7
Fix possible crash on print_slow when bs_op is NULL
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
f757a35a8d
Retry PG changes without re-running lpsolve when pool configuration and OSD tree don't change
OSDs often change their /pg/history keys during rebalance, so the monitor receives additional
transaction failures from etcd if it re-runs lpsolve, which may sometimes even leave the monitor
unable to apply PG changes at all until the rebalance completes
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
1edf86ed26
Aggregate recovery delay using simple mean over last 10 observations (EWMA is shit)
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
5ca7cde612
Experiment/WIP: Try to track "secondary" recovery ops separately
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
751935ddd8
WIP Auto-tune recovery speed
2023-12-31 01:23:17 +03:00
Vitaliy Filippov
d84dee7098
Track recovery op latencies + refactor into a structure
2023-12-31 01:23:17 +03:00