vitastor

Author	SHA1	Message	Date
Vitaliy Filippov	27d0d5b06a	Reads do not have to wait for buffer flushes anymore	2023-09-16 17:52:17 +03:00
Vitaliy Filippov	33950c1ec8	Fix fio_sec_osd attr_len	2023-09-16 17:49:10 +03:00
Vitaliy Filippov	cc0fdc6253	Remove erroneous block_size mismatch warnings on pools without matching PGs	2023-09-08 23:19:04 +03:00
Vitaliy Filippov	79ecd59b10	Flush STDOUT and STDERR before exiting from cli to fix Proxmox "Unexpected result"	2023-09-07 17:30:26 +03:00
Vitaliy Filippov	b7d398be5b	Fix sscanf validation usage (field count instead of null_byte == 0)	2023-09-07 02:34:35 +03:00
Vitaliy Filippov	85e9f67d9d	Add supported_truncate_flags	2023-09-06 17:37:52 +03:00
Vitaliy Filippov	79c6d6f323	Make QEMU driver compatible with QEMU 8.1	2023-08-24 02:23:55 +03:00
Vitaliy Filippov	ae760dbc1d	Fix co_truncate size division by BDRV_SECTOR_SIZE	2023-08-24 01:55:35 +03:00
Vitaliy Filippov	65487da4b1	Do not include msgr_rdma.h into messenger.h	2023-08-24 01:55:35 +03:00
Vitaliy Filippov	7862282938	Extract validation to check_rw(), remove duplicate code with OP_SYNC	2023-08-13 23:49:52 +03:00
Vitaliy Filippov	30ce2bd951	Fix buffer insert in cluster_client	2023-08-12 11:08:50 +03:00
Vitaliy Filippov	b1a0afd10a	Aggregate buffer flushes	2023-08-11 11:26:13 +03:00
Vitaliy Filippov	85b6134910	Return dirty buffers on read in client. Required at least to return buffers when they need to be replayed, but until they are actually replayed	2023-08-09 00:57:08 +03:00
Vitaliy Filippov	b1b07a393d	Fix incorrect marking op parts as done with snapshots (could probably lead to client hangs)	2023-08-09 00:57:08 +03:00
Vitaliy Filippov	7333022adf	Add a third I/O mode: O_DIRECT|O_SYNC, change parameters to data_io/meta_io/journal_io	2023-08-09 00:57:08 +03:00
Vitaliy Filippov	6acf562e01	Release 1.0.0 New features: - Data and metadata checksums! - Metadata checksums are always used with new disk format - Data checksums can be turned on with --data_csum_type crc32c for new OSDs - Checksum block size can be configured - inmemory_metadata now also affects keeping checksums in memory - Linux page cache I/O caching support which can be enabled separately for data, metadata (including checksums) and journal (O_SYNC instead of O_DIRECT) - Details [here](https://git.yourcmc.ru/vitalif/vitastor/src/branch/master/docs/config/layout-osd.en.md#data_csum_type) - Backwards compatibility is preserved, you can use new OSDs with old disks Release also includes bug fixes from [0.9.6](https://git.yourcmc.ru/vitalif/vitastor/releases/tag/v0.9.6). 0.9.6 is moved to "-oldstable" repositories and will be available for some additional time.	2023-07-29 18:57:19 +03:00
Vitaliy Filippov	564df2eb5d	Support using buffered I/O with O_SYNC instead of direct I/O	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	1a4ceb420d	Track used blocks, not object versions	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	21b5124a4b	Document data_csum_type and csum_block_size parameters	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	4181add1f4	Remove creepy "metadata copying" during overwrite. Instead of it, just do not verify checksums of currently mutated objects. When clean data modification during flush runs in parallel to a read request, that request may read a mix of old and new data. It may even read a mix of multiple flushed versions if it lasts too long... And attempts to verify it using temporary copies of metadata make the algorithm too complex and creepy.	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	a8464c19af	Support keeping checksums on disk (not in memory). Definitely beneficial for SSD+HDD setups	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	3c8e4c6b72	Use clean_dyn_size for space check	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	8ef4cf89dc	Log more details about checksum mismatch in big_writes	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	7bfb1639ea	Use find_holes() in flusher for unification	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	628e481c32	Fill journal header to know checksum type & size when dumping journal with --all	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	af6f2046fc	Fix journal read checksum verification with inmemory_journal=false	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	9357e5293e	Call fill_partial_checksum_blocks() correctly in regard to COPY_BUF_CSUM_FILL	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	12851dc07d	Wait for journal reads before checking them in clear_incomplete_csum_block_bits	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	d6ee1ca17c	Use zero checksum size for zero-length writes	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	71674d00cf	Fix journal data checksum mangling on corrupted block overwrite	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	ddb078d5a7	Check journal entry size when checking block checksums	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	d22d56f90a	Fix journal data checksum verification on start	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	eb1331a079	Add more details to "journal entry data is corrupt" messages	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	c5274f655b	...and partially remove the perversion with bitmap inlining	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	45e07d6294	Sadly we have to refcount dyn_data...	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	a8ee391e05	Fix clean block checksum read	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	de48fa3fd2	Allow to forcibly set meta_format	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	874a766b62	Rename meta_version to meta_format	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	384bd8e28f	Support old metadata format in vitastor-disk dump-meta	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	430994f48a	Fix journal big_write simple reads after checksum changes	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	b909d81f41	Fix bitmap-granular checksums	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	e42975ffd1	Fix wait_journal_count not being zeroed	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	93778324e5	Rewrite and fix find_holes into a more obvious version	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	eeb6727170	Fix missing checksum read offset	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	92c6e16eba	Fix checksum verification in big_write journal reads	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	213a9ccb4d	Verify checksums during journal reads	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	a166147110	Add backwards compatibility with non-checksum metadata and journal formats	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	7d532880c3	Implement large csum_block_size support (more than 4k) + refactor blockstore_flush	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	0b0405d115	Implement bitmap-granular (4k) metadata & data checksums	2023-07-29 12:17:18 +03:00
Vitaliy Filippov	e651c93a90	Release 0.9.6 - Fix vitastor-disk partition zeroing (sometimes it was writing garbage instead of zeroes) - Fix incorrect EC space statistics in `vitastor-cli status` - Several bug fixes for NFS: - Add . and .. in NFS directory listings - Return FILE_SYNC from NFS writes if immediate_commit is enabled - Return the same "verifier" in NFS COMMIT as in NFS WRITE - Make parallel NFS extending writes work correctly, without conflicts - Handle parallel NFS extending writes without imposing extra load on etcd - Support UTF-8 in vitastor-cli table output - Also allow "0" and "no" as false for inmemory_metadata and inmemory_journal - Use HDD defaults for HDD-only in automatic `vitastor-disk prepare` mode	2023-07-29 10:54:00 +03:00
Vitaliy Filippov	988e90be69	Fix vitastor-disk partition zeroing (it was writing random garbage instead of zeroes :D)	2023-07-28 12:29:07 +03:00
Vitaliy Filippov	700e0e9bff	Handle parallel NFS extending writes without imposing extra load on etcd	2023-07-27 02:26:17 +00:00
Vitaliy Filippov	ab0ca7c00f	Return FILE_SYNC from NFS writes if immediate_commit is enabled	2023-07-26 02:09:47 +03:00
Vitaliy Filippov	f153bc950b	Return the same "verifier" in NFS COMMIT as in NFS WRITE. This fixes buffered (not O_DIRECT) NFS writes in Linux - previously they were hanging in an infinite loop because COMMIT didn't return the same verifier as previous WRITEs, and NFS kernel client was infinitely retrying the same writes. Also this probably allows for correct NFS failover, at least for the same buffered writes, because NFS clients repeat all write requests until a COMMIT confirms them.	2023-07-26 02:09:47 +03:00
Vitaliy Filippov	425ff8818d	Add . and .. in NFS directory listings. MC, for example, hangs with infinite listing retries without them	2023-07-26 02:09:47 +03:00
Vitaliy Filippov	9e287a7778	Handle extending writes correctly in NFS proxy. Previously, multiple parallel writes extending file size through NFS were racing with each other and triggering deletions of part of the written data. I.e. if you mounted vitastor-nfs and just copied a file into it in MC then you could end up with only a part of the file actually written	2023-07-26 02:09:43 +03:00
Vitaliy Filippov	f52f58b9e9	Support UTF-8 in vitastor-cli table output	2023-07-25 01:48:57 +00:00
Vitaliy Filippov	1fe6b0c0e2	Also allow "0" and "no" as false for inmemory_metadata and inmemory_journal	2023-07-25 01:48:57 +00:00
Vitaliy Filippov	e4237e9ed8	Enable HDD defaults for HDD-only in automatic `vitastor-disk prepare` mode	2023-07-23 02:33:22 +03:00
Vitaliy Filippov	10a5fd6abb	Release 0.9.5. A hotfix to 0.9.4 containing only one bugfix: 100% CPU usage in the new QEMU driver caused by the lack of eventfd reset on io_uring event handling :)	2023-07-21 00:04:41 +03:00
Vitaliy Filippov	1c316ef350	Reset eventfd on every ringloop::loop()	2023-07-21 00:04:41 +03:00
Vitaliy Filippov	0b2d12eef1	Remove has_work, it was unnecessary	2023-07-21 00:04:37 +03:00
Vitaliy Filippov	1c10430ae1	Release 0.9.4 - Improve QEMU driver performance by integrating io_uring in it (up to 1.5x total iops improvement) - Fix QEMU driver deadlocks which started to reproduce in qemu-img after iothread fixes - Fix `vitastor-cli status` reporting more etcds than actually exists (fix etcd address duplication in config on reload) - Fix `vitastor-cli ls` crashing on inodes in non-existing pools - Delete old garbage /pool/stats/ keys for non-existing (deleted) pools - Reduce memory usage of etcds initialized by make-etcd script - Fix OSDs almost always crashing on etcd restart due to "revisions were compacted" (support reloading state from etcd) - Fix a crash and a stall possible mostly in HDD setups with small journal and big (512k, 900k) random writes - Add notes about HDDs to documentation. You are officially allowed to use HDD-only Vitastor with HGST/Toshiba/EXOS :)	2023-07-19 02:50:30 +03:00
Vitaliy Filippov	d0e257ee81	Fix non-existing pool handling in `vitastor-cli ls`	2023-07-18 23:52:02 +03:00
Vitaliy Filippov	9815d70ffc	It is impossible to use io_uring with older vitastor-client because it does not have vitastor_c_uring_has_work()	2023-07-18 23:37:53 +03:00
Vitaliy Filippov	4a4627dcab	Do not use bool in C library	2023-07-18 23:37:53 +03:00
Vitaliy Filippov	ba7427020e	Fix deadlocks possible in qemu-img after fixing iothread. Deadlock was caused by switching QEMU coroutines directly inside vitastor_co_read_bitmap_cb() callback. The correct way is to schedule a BH /BH is a QEMU term for setImmediate() :)/, same as in read and write callbacks.	2023-07-18 23:32:16 +03:00
Vitaliy Filippov	ac7b834af3	Disable journal_no_same_sector_overwrites by default for HDD-only	2023-07-10 00:34:35 +03:00
Vitaliy Filippov	57ad4c3636	Add a note about HDD, enable throttling only for hybrid OSDs	2023-07-09 12:45:11 +03:00
Vitaliy Filippov	b7e4d0c9bf	Fix journal dirty_start position tracking and some debug prints. Fixes two bugs found during HDD testing :-) 1) OSD crashed with "BUG: Attempt to overwrite used offset of the journal" during `fio -bs=900k -iodepth=128` test with 16 MB journal 2) OSD stalled during `fio -bs=512k -iodepth=128` test with 64 MB journal	2023-07-09 01:17:55 +03:00
Vitaliy Filippov	161a23c966	Support reloading state when etcd says "revisions were compacted". Before this change, OSDs almost always died when one of the etcds was restarted, even though the rest of them was still in quorum and the lease was still active	2023-07-07 01:33:48 +03:00
Vitaliy Filippov	45c0694853	Clear etcd_local addresses on reload and also skip duplicates	2023-07-06 00:39:39 +03:00
Vitaliy Filippov	30ac899074	Make QEMU driver compatible with older vitastor_client and with systems without io_uring	2023-07-04 15:51:43 +03:00
Vitaliy Filippov	2348d39cf4	Avoid repeated qemu_uring_handlers, add 2.0-2.7 compatibility	2023-07-04 00:28:23 +03:00
Vitaliy Filippov	3de7929fe5	Integrate v2 - direct epoll	2023-07-04 00:28:23 +03:00
Vitaliy Filippov	07b2196bc2	Integrate QEMU driver with io_uring	2023-07-04 00:28:23 +03:00
Vitaliy Filippov	a612cdca47	Release 0.9.3 - Add patch for libvirt 9.0 - Add support for Proxmox VE 8.0 - Fix compatibility of the QEMU driver with iothread (QEMU rebuilds are coming) - Fix vitastor-cli rm-data/rm/merge hanging when some OSDs are down. Allow deletions in unclean cluster at the cost of some data possibly "reappearing" when those OSDs start back. In that case you can just repeat the deletion request using rm-data. - A bunch of bug fixes for snapshots: - Fix snapshot reads often not working at all with snapshot chain size > 2 - Fix optimized snapshot data merge (children to parent) - Fix updating of image name index key during optimized merge - Fix auto-selection preventing the use of optimized merge with only 1 snapshot - Fix incorrect CAS retries during snapshot merge - Fix snapshot merge progress reporting - Fix primary_read bitmap buffers use-after-free which could lead to incorrect allocation map reads - Remove /usr/local/bin path from make-etcd - Some documentation fixes	2023-07-01 00:25:58 +03:00
Vitaliy Filippov	c8d61568b5	Fix primary_read bitmap buffers being freed too early (use-after-free)	2023-06-30 12:47:45 +03:00
Vitaliy Filippov	84ed3c6395	Fix CAS retries during snapshot merge	2023-06-30 02:30:23 +03:00
Vitaliy Filippov	a7b57386c0	Do not print last subcommand result twice during "inverse" snapshot merge	2023-06-30 02:07:10 +03:00
Vitaliy Filippov	9d4ea5f764	Fix inverse parent selection which prevented the use of optimized merge in case of only 1 snapshot	2023-06-30 01:39:11 +03:00
Vitaliy Filippov	000e4944ec	Remove "inverse parent" image name index key from etcd during snapshot merge	2023-06-30 01:23:30 +03:00
Vitaliy Filippov	8426616d89	Warn about unfinished deletions in rm-data	2023-06-30 01:18:25 +03:00
Vitaliy Filippov	1a841344ec	Print progress of all operations during snapshot merge	2023-06-30 01:13:47 +03:00
Vitaliy Filippov	8603b5cb1d	Do not hang on inactive OSDs during delete, report and skip them instead	2023-06-30 00:15:16 +03:00
Vitaliy Filippov	878ccbb6ea	Fix snapshot chain "down-merge" ("up-merge" worked well...)	2023-06-29 00:47:21 +03:00
Vitaliy Filippov	63c2b9832c	Fix chained (snapshot) reads often not working at all with chain size > 2	2023-06-28 18:54:03 +03:00
Vitaliy Filippov	a11ca56fb1	Fix compatibility of the QEMU driver with iothread	2023-06-21 02:11:28 +03:00
Vitaliy Filippov	b84927b340	Fix \n in nbd_proxy	2023-06-19 01:48:58 +03:00
Vitaliy Filippov	926be372fd	Release 0.9.2 - Measure and report scrub I/O statistics in vitastor-cli status - Make aggregated statistics in vitastor-cli status much smoother (first derive, then sum instead of first summing and then deriving) - Fix an old rare bug leading to journal corruption (try to use scrub if you think you're affected...) - Do not start EC PGs without at least <data chunks> OSDs in each old set (prevents spurious read errors with EC during reconnections/restarts) - Fix failed assert(!scrub_list_op) on OSD restart with pending scrubs - Fix future planned scrubs not starting because of incorrect time comparison - Build packages for Debian 12 (Bookworm)	2023-06-18 19:44:33 +03:00
Vitaliy Filippov	c74a424930	Report scrub I/O in vitastor-cli status	2023-06-17 21:11:21 +03:00
Vitaliy Filippov	32f2c4dd27	Measure scrub statistics	2023-06-17 20:56:26 +03:00
Vitaliy Filippov	3ad16b9a1a	Fix auto_scrubs not starting because of < vs <= =))	2023-06-17 17:32:21 +03:00
Vitaliy Filippov	1c2df841c2	Fix failed assert(!scrub_list_op) on OSD restart with pending scrubs	2023-06-17 17:02:54 +03:00
Vitaliy Filippov	aa5dacc7a9	Do not start EC PGs without at least pg_data_size connections to old OSDs from each set	2023-06-17 02:16:30 +03:00
Vitaliy Filippov	4fdc49bdc7	Add another assert-type check (it does not fire, just as a safety measure for the future)	2023-06-17 00:07:22 +03:00
Vitaliy Filippov	86b4682975	Put get_trim_pos into the "critical section". Fixes rare journal corruption issue. The consequence of this issue was that in some very rare cases (only reproduced under load in CI when running 4+ tests in parallel) small write data written to journal could overwrite journal entries. Also add an assert-type safety check to be able to catch this issue in the future again in case of a regression.	2023-06-17 00:06:42 +03:00
Vitaliy Filippov	bdd48e4cf1	Release 0.9.1 - Fix "Client XX command out of sync" messages sometimes happening on OSD reconnections - Fix a bug where EC reads parallel with writes to the same object failed with -ERANGE error - Slightly reduce the amount of metadata writes during journal flushing - Correctly unmap NBD volumes when Proxmox forces map_volume use (with SWTPM and maybe some other cases)	2023-06-10 11:42:49 +03:00
Vitaliy Filippov	f9fbea25a4	Remove double write when old and new locations are in the same metadata block. Also add another metadata entry fool-safety check which, ideally, will never fire %)	2023-06-03 00:47:10 +03:00
Vitaliy Filippov	2c9a10d081	Fix an idiotic bug leading to failed reads with -ERANGE with EC :D	2023-06-03 00:44:52 +03:00

688 Commits (recovery-autotune)