vitastor

Author	SHA1	Message	Date
Vitaliy Filippov	c777a0041a	Release 1.4.4 A couple of fixes for EC pools - Fix a segfault possible on partial EC overwrite in 1234 -> 5030 rebalance scenario - Fix two problems leading to EC pools stalling on rebalance & parallel sudden stops of OSDs, for example during a sudden poweroff of a host: - Recovery auto-tuning (1.4.0 feature) could apply too large delays and stall the EC journal - fixed by limiting delays with a new recovery_tune_sleep_cutoff_us parameter (10 seconds by default) and applying recovery pauses before write operations, not after them, to not occupy space in the journal for a long time - Dynamic journal space reservation (1.3.0 feature) wasn't accounting for new writes when checking the limit so OSDs could still fill the journal fully and stall - fixed by including new writes in the limit - Print etcd dbSize instead of dbSizeInUse in status	2024-02-11 16:23:08 +03:00
Vitaliy Filippov	27e9f244ec	Release 1.4.3 Hotfix for hotfix O:-) - "Write stall fix" was incomplete and EC write stalls could continue even on 1.4.2. Now they're finally fixed O:-) - Make monitor ignore statistics of stopped OSDs. Previously if you stopped all OSDs the last total I/O numbers would remain the same indefinitely	2024-02-09 00:29:31 +03:00
Vitaliy Filippov	8e25a28a08	Ignore down OSDs in monitor statistics aggregation	2024-02-09 00:22:36 +03:00
Vitaliy Filippov	016115c0d4	Release 1.4.2 - Log to systemd by default - Fix excessive autosyncs after every operation with disabled immediate_commit (introduced in 1.1.0) - Fix a possible write stall with EC due to the lack of OSD wakeup after stabilizing previous writes - Change sync operation semantics as a final fix to possible write stalls with EC and disabled immediate_commit - Sync after deleting data in CLI rm / rm-data if immediate_commit is disabled - Fix OSDs ignoring syncs & autosyncs for delete operations - Fix OSD space reporting sometimes adding garbage zeros for deleted inodes (causing extra pool/stats etcd keys for deleted pools) - Speed up monitor failover - change default etcd_mon_ttl from 30 to 5 seconds - Speed up operation retries - change default up_wait_retry_interval to 50 ms - Add patch for libvirt 9.10	2024-02-04 02:23:49 +03:00
Vitaliy Filippov	e026de95d5	Log to systemd by default	2024-02-04 01:21:31 +03:00
Vitaliy Filippov	d2b43cb118	Change default etcd_mon_ttl	2024-01-29 23:45:19 +03:00
Vitaliy Filippov	1c322b33ed	Change default up_wait_retry_interval to 50 ms	2024-01-26 01:51:08 +03:00
Vitaliy Filippov	ba55f91409	Release 1.4.1 - Fix a monitor crash on primary OSD switching introduced in 1.4.0 - Fix "partly outside array bounds" warnings for GCC 12 in cpp-btree - Fix a realloc memory leak in theory possible with too large listings (OSD_OP_LIST)	2024-01-18 02:31:42 +03:00
Vitaliy Filippov	2aa5aa7ab6	Add a test for simple master switching without PG reconfiguration Also use osd_out_time:1 only in select tests and restart mon in tests only on connection errors	2024-01-17 00:19:01 +03:00
Vitaliy Filippov	3ca3b8a8d8	Fix recheck_pgs bug introduced in 1.4.0	2024-01-16 23:49:21 +03:00
Vitaliy Filippov	5280d1d561	Release 1.4.0 New features: - Intelligent recovery/rebalance speed auto-tuning to reduce its impact on clients (see README -> Features) - Auto-restoration of dead VDUSE daemons in CSI plugin - Add vitastor-disk update-sb command - Update QEMU for Debian Bookworm to 8.1 and use it for CSI plugin Bug fixes: - Fix pools SOMETIMES staying inactive after stopping a node due to OSDs not reacting to PG state changes caused by incorrect full reload of state from etcd on reconnection - Make monitors retry pool configuration changes more quickly which fixes them being unable to apply changes when an ongoing rebalance is quickly making a lot of PGs clean - Fix CSI plugin not accepting array of strings as etcd address in /etc/vitastor/vitastor.conf - Allow multiple interfaces with the same IP address, for "simple routed" full mesh network - Do not ignore loopback addresses for OSD network (to make ECMP setups with frr possible) - Fix a rare client crash during OSD reconnections - Only treat data partitions as existing OSDs in vitastor-disk prepare - Remove etcd parameter from default command examples - Fix reported free space sometimes changing non-immediately after deletion of data from OSDs - Fix a possible OSD crash on print_slow when bs_op is NULL - Use the same etcd_ws_keepalive_interval in mon as in OSD - Fix mon not using values from config when /config/global is not present - Remove pve-storage-portal-dns-list format for vitastor_etcd_address - Parse log_level in cluster_client - Fix vitastor-nbd image existence check not working because of non-zeroed inode_watch fields - Do not warn on EPIPE in client unless log_level is raised explicitly - Fix incorrect error in CSI when searching for the device in /sys - Remove 2 last prints to stdout in etcd_state_client - Fix a possible OSD crash when checking corrupted journal entries	2024-01-12 01:28:33 +03:00
Vitaliy Filippov	99ee8596ea	Rename min/max_util to util_low/high	2023-12-31 01:23:17 +03:00
Vitaliy Filippov	f757a35a8d	Retry PG changes without re-running lpsolve when pool configuration and OSD tree don't change OSDs often change their /pg/history keys during rebalance, so monitor receives additional transaction failures from etcd if it re-runs lpsolve which sometimes may even lead to monitor being unable to apply PG changes at all until rebalance completes	2023-12-31 01:23:17 +03:00
Vitaliy Filippov	1edf86ed26	Aggregate recovery delay using simple mean over last 10 observations (EWMA is shit)	2023-12-31 01:23:17 +03:00
Vitaliy Filippov	751935ddd8	WIP Auto-tune recovery speed	2023-12-31 01:23:17 +03:00
Vitaliy Filippov	1299373988	Use the same etcd_ws_keepalive_interval in OSD and mon	2023-12-23 20:07:29 +03:00
Vitaliy Filippov	4ece4dfdd0	Fix mon not using values from config when /config/global is not present	2023-12-22 02:25:09 +03:00
Vitaliy Filippov	a1c7cc3d8d	Release 1.3.1 Hotfix to 1.3.0 - new "journal space reservation" had a bug which caused OSDs to crash with EC and without immediate_commit.	2023-12-04 18:35:09 +03:00
Vitaliy Filippov	7972502eaf	Release 1.3.0 New features: - RDMA without ODP - much faster and all cards are now supported, not just Mellanox - VDUSE in CSI - faster, more stable and can even recover after CSI pod restart! - Reserve journal space for stabilize requests dynamically to prevent stalls under load with EC - Raise default NBD timeout from 30 to 300 seconds and allow to take it from /etc/vitastor/vitastor.conf - Remove explicit etcdUrl/etcdPrefix K8S storage class parameter support to prevent etcd migration issues for volumes created with these parameters - Support QEMU 8.1 and pve-qemu 8.1 Bug fixes: - Fix RDMA connection (and thus memory) leak - Fix rare crashes under load due to incorrect io_uring queue size tracking - Fix monitor statistics aggregation in case of empty /osd/stats keys - Fix crash on unknown long argument to vitastor-disk - Allow trailing comma in JSONs again - Fix crash on attempts to dump a long listing of objects "to stabilize" or "to rollback" in a slow op	2023-12-04 02:36:43 +03:00
Vitaliy Filippov	7da4868b37	Fix monitor statistics aggregation in case of empty /osd/stats keys	2023-11-24 01:05:21 +03:00
Vitaliy Filippov	5524dbdab7	Release 1.2.0 New features: - Implement CSI volume expansion - Implement CSI volume snapshots - CSI driver now requires Kubernetes >= 1.20 Bug fixes: - Important bug fix for EC: fix EC n+k, k>=2 read recovery in ISA-L version returning incorrect data when reading at least the second chunk out of multiple missing chunks without reading the first one. All users of EC n+k, k>=2 should upgrade as soon as possible, and upgrade should be conducted with downtime: first stop all clients (VMs/containers), then all OSDs, then upgrade and restart everything. - Fix unstable statistics aggregation in monitor (affecting vitastor-cli status and df) - Make udev not wait for OSDs to start during boot - Do not report negative numbers of offline PGs in vitastor-cli status when changing PG count - Report both old and new PG counts in vitastor-cli df when changing it - Fix OSDs sometimes not starting with "The code only supports journal versions 1 and 2, but it is 2 on disk" error after upgrading from pre-1.0 versions and letting OSDs run for some time - Fix monitors sometimes returning old PG count back after OSD configuration changes - Make monitor PG changes more stable and timeout errors less probable	2023-11-05 01:48:57 +03:00
Vitaliy Filippov	0e888e6c60	Prevent spamming etcd with last_clean_pgs update requests	2023-11-05 00:12:00 +03:00
Vitaliy Filippov	408c21d8f0	Scale last_clean_pgs PG count even if current PGs already contain the new number of PGs	2023-11-04 23:45:59 +03:00
Vitaliy Filippov	43cb9ae212	Prevent multiple parallel recheck_pgs in case of timeouts	2023-11-04 20:59:56 +03:00
Vitaliy Filippov	1fe678e57b	Add --no-block to udev rule	2023-10-30 12:18:29 +03:00
Vitaliy Filippov	2e592a2f22	Fix undefined variable "timeout"	2023-10-29 01:30:55 +03:00
Vitaliy Filippov	b92f644e3a	Fix statistics aggregation, calculate inode stats by first deriving per-OSD stats, too	2023-10-29 01:30:55 +03:00
Vitaliy Filippov	8222e3c77d	Release 1.1.0 New features: - Implement [client writeback cache](docs/config/client.en.md#client_enable_writeback) - Add the third I/O mode: [O_DIRECT\|O_SYNC](docs/config/osd.en.md#data_io) (good for Optane) - Reduce load on etcd by splitting OSD lease and statistics reporting intervals: [etcd_stats_interval](docs/config/osd.en.md#etcd_stats_interval) (default 30 sec) - Make MON automatically filter OSDs by layout (block_size/immediate_commit/bitmap_granularity) to prevent "refusing to start PGs of this pool" errors on misconfiguration - Support running fio benchmarks on systems without io_uring - Make QEMU driver compatible with QEMU 8.1 - Document usage of [vhost-user-blk](docs/usage/qemu.en.md#vhost-user-blk) Bug fixes: - Fix resizing disks in QEMU driver (for example, in Proxmox) - Fix "unexpected result" in Proxmox driver by making CLI flush output on exit - Remove unneeded block_size mismatch warnings on pools without matching PGs - Fix possible segfault in vitastor-cli ls -l (usually with deleted pools) - Fix QEMU driver compatibility with systems without io_uring - Fix monitor eating 100% CPU when etcd is down (caused by infinite retries) - Fix potential incorrect write processing with snapshots (not caught in tests but could probably lead to client hangs) - Fix buffer insertion in cluster_client (not caught in tests but could probably lead to incorrect writes in rare cases) - Fix rare OSD crash during sync operation processing - Fix a reenterability issue in cluster_client not reproducible in QEMU/fio, but reproducible with the currently developed K/V database implementation - Fix deletion of the first modified object - OSDs could crash if you modified the same object a lot of times, then deleted it, and then modified it again - Fix the fio_sec_osd test tool	2023-10-28 00:33:06 +03:00
Vitaliy Filippov	be7e76f849	Split etcd_stats_interval out of etcd_report_interval	2023-10-27 01:26:26 +03:00
Vitaliy Filippov	38db53f5ee	Implement client writeback cache - Disabled by default, enable with client_enable_writeback=true - Even then only enabled in FIO when -direct is disabled and in QEMU when block device cache is enabled in settings - Can also be enabled in other clients like vitastor-cli using parameter client_writeback_allowed=true, but not recommended	2023-09-16 17:52:17 +03:00
Vitaliy Filippov	ff479a102d	Make MON filter OSDs by block layout to prevent "refusing to start PGs of this pool" errors on misconfiguration	2023-09-16 17:52:17 +03:00
Vitaliy Filippov	ab8627c9fa	Fix monitor retrying failed etcd connection in an infinite loop without pauses	2023-08-09 00:57:08 +03:00
Vitaliy Filippov	25a15d24cf	Fix incorrect EC space statistics in `vitastor-cli status`	2023-07-27 02:26:17 +00:00
Vitaliy Filippov	2f999d8607	Reduce etcd memory usage With default --snapshot-count 100000 and GOGC=100 it easily reaches 6.6 GB even when we only store 1-2 MB of data in it	2023-07-06 00:46:26 +03:00
Vitaliy Filippov	d007a374f2	Delete extra /pool/stats/ keys for non-existing pools	2023-07-06 00:40:13 +03:00
Vitaliy Filippov	f12b8e45a9	Remove /usr/local/bin path from make-etcd	2023-06-29 23:49:31 +03:00
Vitaliy Filippov	a4186e20aa	First derive, then sum per-OSD statistics instead of first summing and then deriving This makes statistics reported by vitastor-cli status much smoother	2023-06-18 01:32:24 +03:00
Vitaliy Filippov	aea567cfbd	Slightly improve scrub docs	2023-05-21 12:52:30 +03:00
Vitaliy Filippov	ce02f47de6	Allow to disable scrub_find_best	2023-05-21 12:33:38 +03:00
Vitaliy Filippov	8d40ad99a6	Add scrub documentation	2023-05-20 23:19:39 +03:00
Vitaliy Filippov	3475772b07	Add configuration online update documentation	2023-05-20 23:19:39 +03:00
Vitaliy Filippov	6648f6bb6e	Implement ambiguity detection during scrub	2023-05-20 23:19:39 +03:00
Vitaliy Filippov	3c924397e7	Store next scrub timestamp instead of last scrub timestamp	2023-05-20 23:19:39 +03:00
Vitaliy Filippov	c3bd26193d	Implement PG scrub runner	2023-05-20 23:19:39 +03:00
Vitaliy Filippov	0538a484b3	Add corrupted object state	2023-05-20 23:19:39 +03:00
Vitaliy Filippov	022176aa98	Fix NaN during PG optimisation if there are nonexisting OSDs in node_placement	2023-05-17 01:20:30 +03:00
Vitaliy Filippov	120e3fa7bc	Fix pool deletion	2023-05-17 00:45:59 +03:00
Vitaliy Filippov	6f4dc16c59	Handle etcd connection errors correctly in mon (unhandled error events)	2023-05-11 11:02:44 +03:00
Vitaliy Filippov	321cb435a6	Fix monitor incorrectly changing PG count when last_clean_pgs contains fewer PGs than the new number	2023-05-08 20:39:20 +03:00
Vitaliy Filippov	5b9031fecc	Fix monitor possibly applying incorrect PG history under heavy load Monitor could deceive itself by immediately saving PG configuration changes which weren't applied to etcd yet in memory, and apply incorrect PG history changes next time if the first update fails. This usually only happened under heavy load and was caught in CI. :-)	2023-05-07 23:23:00 +03:00

168 Commits (master)
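
Commits a4186e20aa and b92f644e3a describe deriving per-OSD statistics first and summing afterwards. The sketch below is only an illustration of that ordering, not the monitor's actual code: the `Sample` type, both function names and the sample numbers are hypothetical. It assumes each OSD reports a cumulative counter with its own timestamp, so that dividing each OSD's delta by its own interval avoids mixing reports of different ages.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical per-OSD report: a cumulative counter and when it was taken."""
    timestamp: float  # seconds
    ops: int          # cumulative operation count


def derive_then_sum(prev, cur):
    """Compute each OSD's rate over its own interval, then add the rates."""
    total = 0.0
    for osd, c in cur.items():
        p = prev.get(osd)
        # Skip OSDs without a previous sample or without a newer report.
        if p is None or c.timestamp <= p.timestamp:
            continue
        total += (c.ops - p.ops) / (c.timestamp - p.timestamp)
    return total


def sum_then_derive(prev, cur, prev_time, cur_time):
    """Add all counters first, then divide by one shared interval."""
    delta = sum(s.ops for s in cur.values()) - sum(s.ops for s in prev.values())
    return delta / (cur_time - prev_time)


# Two OSDs, both really doing 100 ops/s; OSD 2's latest report is 5 seconds old.
prev = {1: Sample(0.0, 0), 2: Sample(0.0, 0)}
cur = {1: Sample(10.0, 1000), 2: Sample(5.0, 500)}

per_osd = derive_then_sum(prev, cur)
aggregate = sum_then_derive(prev, cur, 0.0, 10.0)
print(per_osd, aggregate)
```

With these made-up numbers the per-OSD variant yields 200 ops/s, while summing first yields 150 ops/s because the stale report is divided by the full 10-second interval. Whether this is the exact cause of the jitter the commit message mentions is an assumption.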