vitastor

Commit Graph

Author	SHA1	Message	Date
Vitaliy Filippov	f70da82317	Add loadjson command to vitastor-kv	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	e42148f347	Allow to specify KV commands on command line	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	c289584469	Add JSON dump format	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	018e89f867	Erase verf key left from creation from ientries on every modification	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	603dc68f11	Implement async mtime change	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	7b12342933	Allow to specify additional NFS mount options	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	44bf0f16ee	Fix malloc/free in nfs_kv_read/write	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	5b747c12ec	Check if already mounted before mounting	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	05f5f46162	Fix zero used space, update mtime when moving/changing inode	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	b5604191c8	Ignore ECANCELED in nfs-proxy (happens in io_uring on fork)	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	e871de27de	Support unaligned shared_offsets, align shared file data instead of header	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	f600ce98e2	Implement auto-unmount local NFS server mode for vitastor-nfs	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	57605a5c13	Return error on failed shrink	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	29bd4561bb	Implement rename over an existing file/directory	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	7142460ec8	Support --logfile in nfs-proxy	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	d03f19ebe5	Fix shared file overlap, add FIXMEs	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	88f9d18be3	Create inode, then direntry, not direntry, then inode; retry ID collisions	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	6213fbd8c6	Fix NFS shared/aligned write FIXMEs	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	3aee37eadd	Allow to disable per-inode stats for VitastorFS pools	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	ecfc753e93	Add basic NFS tests, fix bugs	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	a574f9ad71	Return block NFS implementation back as an option too	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	7c235c9103	Move KV FS header into a separate file	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	e5bb986164	Implement packing small files into shared inodes	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	181795d748	Split new NFS proxy implementation into multiple files	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	8cdc38805b	WIP VitastorFS with metadata storage in VitastorKV	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	0cd455d17f	First just recheck version without actually re-reading block in vitastor-kv	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	32ba653ba6	Fix vitastor-kv hang on reopen & unfinished closed listing	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	231d4b15fc	Add loadable dump format to vitastor-kv (dump)	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	9dc4d5fd7b	Fix freeing r/w buffers on errors in kv_db	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	e58538fa47	Fix eviction when random_pos selects the end	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	11ac9e7024	Implement min/max list_count to make listings during performance test reasonable	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	511bc3df1c	Fix and improve parallel allocation - Do not try to allocate more DB blocks in an inode block until it's "confirmed" and "locked" by the first write - Do not recheck for new zero DB blocks on first write into an inode block - a CAS failure means someone else is already writing into it - Throw new allocation blocks away regardless of whether the known_version is 0 on a CAS failure	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	a64f0d1f73	Implement key_prefix for K/V stress test	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	ec5f7c6b87	More fixes - do not overwrite a block with older version if known version is newer (read may start before update and end after update) - invalidated block versions can't be remembered and trusted - right boundary for split blocks is right_half when diving down, not key_lt - restart update also when block is "invalidated", not just on version mismatch - copy callback in listings to avoid closure destruction bugs too	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	3ebed9a749	Add logging and one more assert	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	eab67a6e8f	Make get_block() wait for updating when unrelated block is found along the path	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	20993d9b7a	Fix a race condition where changed blocks were parsed over existing cached blocks and getting a mix of data	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	5cf9b343c0	Simplify code by removing an unneeded "optimisation"	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	79ae0aadcd	Add kv_log_level, print warnings on level 1, trace ops on level 10	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	605afc3583	Fix duplicate keys in listings on parallel updates -- do not rewind key "iterator position"	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	c0681d8242	Implement key suffix to avoid collisions of multiple test workers	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	763e77b4f4	Do not complain on empty first block	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	19426aa4c5	Add JSON output for stress-tester	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	08f586bcec	Print total stats	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	f1cd87473a	Do not send more than op_count operations (fix segfault on finish)	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	1bd8d2da56	Add some more resiliency to serialize()	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	a7396d2baf	Invalidate blocks being updated too	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	e98a38810d	Change new block allocation method: make each writer choose multiple empty PG blocks and place blocks in them	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	28c4324c36	Remove blocks from cache on unsuccessful updates	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	31ec3fa8f5	Allow to track multiple updates per block (it should never happen though)	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	e4fa26f60a	Do not call stop_updating after failed write_new_block and after clear_block (both delete the item)	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	59ae27f9e5	Track versions of parent blocks and recheck if changed during update	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	2c6a301d9b	Fix resume_split condition (key_lt can also be "")	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	01558349f8	Experiment: transform offsets for better sharding	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	36f4717d0d	More post-stress-test fixes - Prevent _split types of new blocks - Stop updating new blocks only after the whole update, otherwise pointers may become invalid - Use recheck_none for updates initially - Use UINT64_MAX as initial block version when postponing ops, otherwise the check fails when the block is initially empty. This for example leads to writing both leaf items & block pointers (which is incorrect) into the root block when starting stress-test with --parallelism 32 - Fix -EINTR comparison	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	babaf2a0ce	Print operation statistics	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	5773f1a375	K/V fixes after stress-test :-) - track block versions correctly - per inode block (128kb) instead of tree block (4kb) - prevent multiple parallel CAS writes of the same inode block - add logging for EILSEQ which means invalid data in the tree - fix get_block updated flag which was true for blocks already in cache and was leading to infinite loops on "unrelated block" errors - apply changes to blocks in cache only after successful writes (using "virtual changes") - do not replace cached block with an older version from disk - recheck "unrelated blocks" (read/update collisions) until data stops changing - track tree path correctly - do not treat split block as parent of its right half - correctly move blocks when finding new empty place on disk - restart updates from the beginning when one of blocks is changed by a parallel update - fix delete using SET opcode and setting key to the empty value instead - prevent changing the same key more than 1 time in parallel - fix listing verification - resume continue_updates in update_find (required because it uses continue_update itself) - add allow_old_cached parameter to get()	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	57222a9f79	Implement K/V DB stress tester	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	61ef000c6e	Evict blocks based on memory limit & block usage	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	7d5e1cc393	Track blocks per level	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	5e7f27a02d	Track block level	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	fd1d8a8520	Experimental B-Tree Vitastor embedded K/V database implementation!	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	c364e14c40	Stop then retry, not retry then stop	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	3ebbfa0428	Fix another rare OSD hang on zeroing out entries on start	2024-03-16 13:24:36 +03:00
Vitaliy Filippov	aa79d1db1c	Fix incorrect "changing scheme" message in modify-pool	2024-03-06 00:41:35 +03:00
Vitaliy Filippov	a1fecb7eff	Move callback away when calling it in cluster_client	2024-03-06 00:41:35 +03:00
Vitaliy Filippov	ff74b19423	Fix rare OSD hang on zeroing out bad entries on start	2024-03-06 00:41:35 +03:00
Vitaliy Filippov	4cf6dceed7	Merge branch 'rel-1.4'	2024-02-29 09:59:01 +03:00
Vitaliy Filippov	38b8963330	Release 1.4.8 - Do not use \r if output is not a terminal (should fix unexpected job output in proxmox) - Fix rm/rm-data error return code, add --down-ok option to bypass the error - Add EIO retry timeout and allow to disable these retries, rename up_wait_retry_interval to client_retry_interval - Add ubuntu jammy build - Wait for blockstore initialisation before starting OSD (prevent timeouts when init takes time) - Fix a rare use-after-free in automatic sync after delete in blockstore	2024-02-29 09:58:34 +03:00
Vitaliy Filippov	77167e2920	Do not use \r if output is not a terminal	2024-02-29 00:21:17 +03:00
Vitaliy Filippov	5af23672d0	Fix rm/rm-data error return code, add --down-ok option to bypass the error	2024-02-29 00:20:10 +03:00
Vitaliy Filippov	6bf1f539a6	Add EIO retry timeout and allow to disable these retries, rename up_wait_retry_interval to client_retry_interval	2024-02-28 13:10:02 +03:00
Vitaliy Filippov	4eab26f968	Add documentation and a very basic test for pool management commands	2024-02-28 13:08:04 +03:00
Vitaliy Filippov	86243b7101	Rework & fix pool-create / pool-modify / pool-ls	2024-02-28 13:08:04 +03:00
idelson	dc92851322	vitastor-cli: add commands to control pools: pool-create, pool-ls, pool-modify, pool-rm PR #59 - https://github.com/vitalif/vitastor/pull/58/commits By MIND Software LLC By submitting this pull request, I accept Vitastor CLA	2024-02-28 13:08:04 +03:00
Vitaliy Filippov	fc413038d1	Wait for blockstore initialisation before starting OSD	2024-02-27 02:20:04 +03:00
Vitaliy Filippov	1bc0b5aab3	Fix a rare use-after-free in automatic sync after delete in blockstore ASan report: [0] READ of size 16 at operator() /root/vitastor/src/blockstore_write.cpp:100 ...[5] blockstore_impl_t::ack_sync(blockstore_op_t*) /root/vitastor/src/blockstore_sync.cpp:232	2024-02-24 00:06:36 +03:00
Vitaliy Filippov	5e934264cf	Release 1.4.7 - Fix another old "BUG: Attempt to overwrite used offset" in a very simple case: bs=4k rw=write iodepth=16 from OSD start; add this case to tests - Fix a rare crash with "unexpected state during flush: 0x51" possible with EC since 1.4.2 during rebalance and OSD outages - Fix a rare write stall with EC & immediate_commit=none caused by sync operations reserving unneeded space in the journal - Fix 32-bit build warnings, most in printf/scanf format strings	2024-02-22 12:45:52 +03:00
Vitaliy Filippov	f20564b44b	Fix 32-bit build warnings (99.9% in printf)	2024-02-22 12:22:16 +03:00
Vitaliy Filippov	b3c15db331	32M journal by default in simple-offsets	2024-02-21 15:25:02 +03:00
Vitaliy Filippov	685bcd6ef9	Do not reserve extra space for big_writes during sync - sync itself is needed to commit and clear them	2024-02-21 13:00:14 +03:00
Vitaliy Filippov	3eb389b321	Supposed fix for "unexpected state during flush: 0x51" with EC	2024-02-21 01:32:06 +03:00
Vitaliy Filippov	3d16cde23c	Fix assertions, add small sequential write test	2024-02-20 19:41:48 +03:00
Vitaliy Filippov	c6406d67fc	Fix journal space_check incorrectly checking for space at the beginning	2024-02-20 19:40:56 +03:00
Vitaliy Filippov	f87964861d	Release 1.4.6 Unwavering stabilization of 1.4.x, continued :-) - Include the accidentally lost part of 1.4.5 journal trimming fix - Fix a possible OSD crash with "BUG: Attempt to overwrite used offset" which was probably present for long time, but became apparent after fixing flapping tests in CI - Fix remaining flapping tests in CI. It was the first time when tests actually passed without retries :-)	2024-02-20 17:01:26 +03:00
Vitaliy Filippov	7048228678	Supposed fix for "BUG: Attempt to overwrite used offset"	2024-02-20 15:56:48 +03:00
Vitaliy Filippov	ea73857450	Add asserts to catch "BUG: Attempt to overwrite used offset"	2024-02-20 15:56:48 +03:00
Vitaliy Filippov	6cfe38ec04	Followup to empty cur.oid as stop condition for forced trim fix	2024-02-20 15:56:38 +03:00
Vitaliy Filippov	f882c7dd87	Release 1.4.5 - Fix a write stall caused by incorrect journal trimming introduced in 1.4.4 :) - Fix PGs sometimes hanging in "starting" state on mass OSD restarts - Fix a rare crash with "map::at" during OSD pings - Use new defaults for non-capacitor (desktop) SSDs - improves T1Q256 random write from ~6k iops to ~45k iops - Make journal_trim_interval configurable	2024-02-16 10:13:33 +03:00
Vitaliy Filippov	26dd863c8d	Fix sometimes possible crash on clients.at() during pings	2024-02-16 10:13:33 +03:00
Vitaliy Filippov	2ae859fbc6	Use min/max_flusher_count=32/256, 128M journal and autosync_writes=512 for non-capacitor SSDs by default	2024-02-16 10:13:33 +03:00
Vitaliy Filippov	8389c0f33b	Fix PGs sometimes hanging in "starting" state on mass OSD restarts	2024-02-15 23:38:52 +03:00
Vitaliy Filippov	9db2196aef	Make journal_trim_interval configurable	2024-02-15 23:38:51 +03:00
Vitaliy Filippov	8d6ae662fe	Use empty cur.oid as stop condition for forced trim, not journal_trim_counter	2024-02-15 23:27:17 +03:00
Vitaliy Filippov	c777a0041a	Release 1.4.4 A couple of fixes for EC pools - Fix a segfault possible on partial EC overwrite in 1234 -> 5030 rebalance scenario - Fix two problems leading to EC pools stalling on rebalance & parallel sudden stops of OSDs, for example during a sudden poweroff of a host: - Recovery auto-tuning (1.4.0 feature) could apply too large delays and stall the EC journal - fixed by limiting delays with a new recovery_tune_sleep_cutoff_us parameter (10 seconds by default) and applying recovery pauses before write operations, not after them, to not occupy space in the journal for long time - Dynamic journal space reservation (1.3.0 feature) wasn't accounting new writes when checking the limit so OSDs could still fill the journal fully and stall - fixed by including new writes into the limit - Print etcd dbSize instead of dbSizeInUse in status	2024-02-11 16:23:08 +03:00
Vitaliy Filippov	978bdc128a	Apply recovery pause before writes, after commits, and do not apply it to syncs to not block EC pools from functioning	2024-02-11 16:13:52 +03:00
Vitaliy Filippov	bb2f395f1e	Add cutoff threshold for recovery auto-tuning	2024-02-11 16:13:52 +03:00
Vitaliy Filippov	b127da40f7	Add a FIXME about incomplete PGs	2024-02-11 13:42:51 +03:00
Vitaliy Filippov	ca34a6047a	Fix dynamic journal space reservation: include the new write itself, too	2024-02-11 13:42:51 +03:00
Vitaliy Filippov	38ba76e893	Fix flusher sometimes being unable to trim journal when the flush queue is empty	2024-02-11 13:42:51 +03:00

870 Commits (ea0d72289c8d7a0964cf8f93568de1d3bf5e9382)