Vitaliy Filippov
6213fbd8c6
Fix NFS shared/aligned write FIXMEs
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
3aee37eadd
Allow to disable per-inode stats for VitastorFS pools
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
ecfc753e93
Add basic NFS tests, fix bugs
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
a574f9ad71
Return block NFS implementation back as an option too
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
7c235c9103
Move KV FS header into a separate file
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
e5bb986164
Implement packing small files into shared inodes
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
181795d748
Split new NFS proxy implementation into multiple files
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
8cdc38805b
WIP VitastorFS with metadata storage in VitastorKV
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
0cd455d17f
First just recheck version without actually re-reading block in vitastor-kv
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
32ba653ba6
Fix vitastor-kv hang on reopen & unfinished closed listing
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
231d4b15fc
Add loadable dump format to vitastor-kv (dump)
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
9dc4d5fd7b
Fix freeing r/w buffers on errors in kv_db
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
e58538fa47
Fix eviction when random_pos selects the end
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
11ac9e7024
Implement min/max list_count to make listings during performance test reasonable
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
511bc3df1c
Fix and improve parallel allocation
- Do not try to allocate more DB blocks in an inode block until it's "confirmed" and "locked" by the first write
- Do not recheck for new zero DB blocks on first write into an inode block - a CAS failure means someone else is already writing into it
- Throw new allocation blocks away regardless of whether the known_version is 0 on a CAS failure
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
a64f0d1f73
Implement key_prefix for K/V stress test
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
ec5f7c6b87
More fixes
- do not overwrite a block with older version if known version is newer (read may start before update and end after update)
- invalidated block versions can't be remembered and trusted
- right boundary for split blocks is right_half when diving down, not key_lt
- restart update also when block is "invalidated", not just on version mismatch
- copy callback in listings to avoid closure destruction bugs too
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
3ebed9a749
Add logging and one more assert
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
eab67a6e8f
Make get_block() wait for updating when unrelated block is found along the path
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
20993d9b7a
Fix a race condition where changed blocks were parsed over existing cached blocks, producing a mix of data
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
5cf9b343c0
Simplify code by removing an unneeded "optimisation"
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
79ae0aadcd
Add kv_log_level, print warnings on level 1, trace ops on level 10
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
605afc3583
Fix duplicate keys in listings on parallel updates -- do not rewind key "iterator position"
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
c0681d8242
Implement key suffix to avoid collisions of multiple test workers
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
763e77b4f4
Do not complain on empty first block
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
19426aa4c5
Add JSON output for stress-tester
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
08f586bcec
Print total stats
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
f1cd87473a
Do not send more than op_count operations (fix segfault on finish)
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
1bd8d2da56
Add some more resiliency to serialize()
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
a7396d2baf
Invalidate blocks being updated too
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
e98a38810d
Change new block allocation method: make each writer choose multiple empty PG blocks and place blocks in them
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
28c4324c36
Remove blocks from cache on unsuccessful updates
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
31ec3fa8f5
Allow to track multiple updates per block (it should never happen though)
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
e4fa26f60a
Do not call stop_updating after failed write_new_block and after clear_block (both delete the item)
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
59ae27f9e5
Track versions of parent blocks and recheck if changed during update
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
2c6a301d9b
Fix resume_split condition (key_lt can also be "")
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
01558349f8
Experiment: transform offsets for better sharding
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
36f4717d0d
More post-stress-test fixes
- Prevent _split types of new blocks
- Stop updating new blocks only after the whole update, otherwise pointers may become invalid
- Use recheck_none for updates initially
- Use UINT64_MAX as initial block version when postponing ops, otherwise the check fails when the block is initially empty. This for example leads to writing both leaf items & block pointers (which is incorrect) into the root block when starting stress-test with --parallelism 32
- Fix -EINTR comparison
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
babaf2a0ce
Print operation statistics
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
5773f1a375
K/V fixes after stress-test :-)
- track block versions correctly - per inode block (128kb) instead of tree block (4kb)
- prevent multiple parallel CAS writes of the same inode block
- add logging for EILSEQ which means invalid data in the tree
- fix get_block updated flag which was true for blocks already in cache and was leading to infinite loops on "unrelated block" errors
- apply changes to blocks in cache only after successful writes (using "virtual changes")
- do not replace cached block with an older version from disk
- recheck "unrelated blocks" (read/update collisions) until data stops changing
- track tree path correctly - do not treat split block as parent of its right half
- correctly move blocks when finding new empty place on disk
- restart updates from the beginning when one of blocks is changed by a parallel update
- fix delete using SET opcode and setting key to the empty value instead
- prevent changing the same key more than 1 time in parallel
- fix listing verification
- resume continue_updates in update_find (required because it uses continue_update itself)
- add allow_old_cached parameter to get()
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
57222a9f79
Implement K/V DB stress tester
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
61ef000c6e
Evict blocks based on memory limit & block usage
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
7d5e1cc393
Track blocks per level
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
5e7f27a02d
Track block level
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
fd1d8a8520
Experimental B-Tree Vitastor embedded K/V database implementation!
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
c364e14c40
Stop then retry, not retry then stop
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
3ebbfa0428
Fix another rare OSD hang on zeroing out entries on start
2024-03-16 13:24:36 +03:00
Vitaliy Filippov
aa79d1db1c
Fix incorrect "changing scheme" message in modify-pool
2024-03-06 00:41:35 +03:00
Vitaliy Filippov
a1fecb7eff
Move callback away when calling it in cluster_client
2024-03-06 00:41:35 +03:00
Vitaliy Filippov
ff74b19423
Fix rare OSD hang on zeroing out bad entries on start
2024-03-06 00:41:35 +03:00