
25 Commits (kv)

Author SHA1 Message Date
Vitaliy Filippov f285cfc483 Fix eviction when random_pos selects the end 2023-12-01 01:43:03 +03:00
Vitaliy Filippov 9f6d09428d Fix and improve parallel allocation
- Do not try to allocate more DB blocks in an inode block until it's "confirmed" and "locked" by the first write
- Do not recheck for new zero DB blocks on first write into an inode block - a CAS failure means someone else is already writing into it
- Throw new allocation blocks away regardless of whether the known_version is 0 on a CAS failure
2023-12-01 01:17:04 +03:00
Vitaliy Filippov 13e2d3ce7c More fixes
- do not overwrite a block with older version if known version is newer
  (read may start before update and end after update)
- invalidated block versions can't be remembered and trusted
- right boundary for split blocks is right_half when diving down, not key_lt
- restart update also when block is "invalidated", not just on version mismatch
- copy callback in listings to avoid closure destruction bugs too
2023-12-01 01:17:04 +03:00
Vitaliy Filippov c5b00f897a Add logging and one more assert 2023-12-01 01:17:04 +03:00
Vitaliy Filippov e847e26912 Make get_block() wait for updating when unrelated block is found along the path 2023-12-01 01:17:04 +03:00
Vitaliy Filippov 3393463466 Fix a race condition where changed blocks were parsed over existing cached blocks, producing a mix of data 2023-12-01 01:17:04 +03:00
Vitaliy Filippov bd96a6194a Simplify code by removing an unneeded "optimisation" 2023-12-01 01:17:04 +03:00
Vitaliy Filippov 601fe10c28 Add kv_log_level, print warnings on level 1, trace ops on level 10 2023-12-01 01:17:04 +03:00
Vitaliy Filippov 63dbc9ca85 Fix duplicate keys in listings on parallel updates -- do not rewind key "iterator position" 2023-12-01 01:17:04 +03:00
Vitaliy Filippov ce52c5589e Do not complain on empty first block 2023-12-01 01:17:04 +03:00
Vitaliy Filippov 4ac7e096fd Add some more resiliency to serialize() 2023-12-01 01:17:04 +03:00
Vitaliy Filippov b6171a4599 Invalidate blocks being updated too 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 28045f230c Change new block allocation method: make each writer choose multiple empty PG blocks and place blocks in them 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 10e867880f Remove blocks from cache on unsuccessful updates 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 012462171a Allow tracking multiple updates per block (it should never happen though) 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 904793cdab Do not call stop_updating after failed write_new_block and after clear_block (both delete the item) 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 45c01db2de Track versions of parent blocks and recheck if changed during update 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 8c9206cecd Fix resume_split condition (key_lt can also be "") 2023-12-01 01:17:03 +03:00
Vitaliy Filippov e8c46ededa Experiment: transform offsets for better sharding 2023-12-01 01:17:03 +03:00
Vitaliy Filippov e9b321a0e0 More post-stress-test fixes
- Prevent _split types of new blocks
- Stop updating new blocks only after the whole update, otherwise pointers
  may become invalid
- Use recheck_none for updates initially
- Use UINT64_MAX as initial block version when postponing ops, otherwise the
  check fails when the block is initially empty. This for example leads to
  writing both leaf items & block pointers (which is incorrect) into the root
  block when starting stress-test with --parallelism 32
- Fix -EINTR comparison
2023-12-01 01:17:03 +03:00
Vitaliy Filippov 29d8c9b6f3 K/V fixes after stress-test :-)
- track block versions correctly - per inode block (128kb) instead of tree block (4kb)
- prevent multiple parallel CAS writes of the same inode block
- add logging for EILSEQ which means invalid data in the tree
- fix get_block updated flag which was true for blocks already in cache and was leading to infinite loops on "unrelated block" errors
- apply changes to blocks in cache only after successful writes (using "virtual changes")
- do not replace cached block with an older version from disk
- recheck "unrelated blocks" (read/update collisions) until data stops changing
- track tree path correctly - do not treat split block as parent of its right half
- correctly move blocks when finding new empty place on disk
- restart updates from the beginning when one of blocks is changed by a parallel update
- fix delete using SET opcode and setting key to the empty value instead
- prevent changing the same key more than 1 time in parallel
- fix listing verification
- resume continue_updates in update_find (required because it uses continue_update itself)
- add allow_old_cached parameter to get()
2023-12-01 01:17:03 +03:00
Vitaliy Filippov 987b005356 Evict blocks based on memory limit & block usage 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 41754b748b Track blocks per level 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 31913256f3 Track block level 2023-12-01 01:17:03 +03:00
Vitaliy Filippov 0ee36baed7 Experimental B-Tree Vitastor embedded K/V database implementation! 2023-12-01 01:17:03 +03:00
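
Several of the fixes above ("do not replace cached block with an older version from disk", "do not overwrite a block with older version if known version is newer") describe the same rule: a read that started before an update may finish after it, so its result must not overwrite newer cached data. The following is a minimal sketch of that rule only. The names block_cache_t, cached_block_t and put_from_disk are hypothetical and are not taken from the Vitastor source; the real implementation may track versions differently (the log mentions per-inode-block versions).

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical cached block; illustrative only, not Vitastor's actual type
struct cached_block_t
{
    uint64_t version;
    std::string data;
};

// Hypothetical cache keyed by block offset
struct block_cache_t
{
    std::map<uint64_t, cached_block_t> blocks;

    // Store a block read from disk unless the cache already holds
    // a strictly newer version. Returns true if the cache was updated.
    bool put_from_disk(uint64_t offset, uint64_t disk_version, const std::string & data)
    {
        auto it = blocks.find(offset);
        if (it != blocks.end() && it->second.version > disk_version)
        {
            // The read started before a newer update landed: drop the stale result
            return false;
        }
        cached_block_t & blk = blocks[offset];
        blk.version = disk_version;
        blk.data = data;
        return true;
    }
};
```

In this sketch a read result with an equal version is accepted, on the assumption that equal versions carry identical data.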