vitastor

Commit Graph

Author	SHA1	Message	Date
Vitaliy Filippov	5a9e1ede52	Release 0.8.9 - The tests are now stable and run in a CI system based on Gitea CI - The release includes final bug fixes for EC: - Implement missing EC recovery of allocation bitmap when built with ISA-L - Fix broken snapshot export with EC (allocation bitmap reads were giving incorrect results previously) - Also fixed bugs manifesting under heavy load: - Fix monitor possibly applying incorrect PG history on retries - Fix monitor incorrectly changing PG count when last_clean_pgs contains less PGs than the new number - Allow writes to wait for free space again, but now correctly (previously dropped in 0.8.2) - Fix a rare segfault in client (handle client stop during incoming stream handling in 1 more place) - Make monitor correctly handle etcd connection errors - it could die instead of connecting to another etcd - Fix OSD rarely being unable to report PG states after a PG was taken over by another OSD - Fixed return code for incomplete EC objects (now EIO) and made cluster client retry this error - Made other small changes for tests: timeouts, nice/ionice for etcd, waiting conditions, NBD device checks and so on	2023-05-14 01:25:09 +03:00
Vitaliy Filippov	de3e609166	Add a FIXME about QEMU driver thread safety	2023-05-14 00:06:09 +03:00
Vitaliy Filippov	11481170f5	Add a FIXME about ENOSPC	2023-05-13 23:59:44 +03:00
Vitaliy Filippov	6442010f93	Skip offline PGs during state reporting when the state is already deleted or taken over by another OSD This fixes OSDs being unable to report PG states in rare conditions	2023-05-12 23:17:45 +03:00
Vitaliy Filippov	ce4a8067b5	Handle client stop during incoming stream handling in 1 more place	2023-05-11 01:53:41 +03:00
Vitaliy Filippov	8cac795445	Return EIO instead of EINVAL for incomplete EC objects	2023-05-11 01:15:23 +03:00
Vitaliy Filippov	a409598b16	Wait for free space again, but count on big_write flushes instead of just flusher activity	2023-05-10 01:51:02 +03:00
Vitaliy Filippov	f4c6765522	Ignore ENOENT in epoll_ctl	2023-05-08 20:39:20 +03:00
Vitaliy Filippov	5da1d8e1b5	Fix EC just-bitmap reads (len=0) (fixes SCHEME=ec test_snapshot.sh)	2023-05-07 14:00:08 +03:00
Vitaliy Filippov	44f86f1999	Add a basic EC 2+2 recovery test (not really required, but let it be there)	2023-05-07 11:26:27 +03:00
Vitaliy Filippov	2d9a80c6f6	Implement missing bitmap recovery with ISA-L \(°□°)/	2023-05-07 11:25:51 +03:00
Vitaliy Filippov	ab615849d6	Release 0.8.8 - Fix vitastor-cli rm/rm-data broken in 0.8.6 (missing messenger initialization) - Prepare OSD read handler for upcoming version with scrub - allow "secondary reads" to return errors - Fix OSDs re-peering PGs infinitely with a big number of PGs (reproduced in test_add_osd) - Fix another variant of flusher sync-waiting stall (reproduced in test_write) - Fix other tests in tests/ (will add them to Gitea CI soon) - Add patches for QEMU 6.2-8.0 - Fix QEMU driver compatibility with QEMU 8.0 - Build packages for RHEL 9 clones (based on AlmaLinux 9)	2023-04-28 11:22:00 +03:00
Vitaliy Filippov	b94587ef0e	Fix some build warnings	2023-04-28 00:44:27 +03:00
Vitaliy Filippov	c768a9015f	Fix QEMU driver compatibility with QEMU 8.0	2023-04-25 11:20:21 +03:00
Vitaliy Filippov	b74ccb613c	Fix another variant of flusher sync-waiting stall	2023-04-24 00:44:41 +03:00
Vitaliy Filippov	a04dab0840	Initialize messenger in cluster_client listings	2023-04-24 00:44:41 +03:00
Vitaliy Filippov	160863f707	Print op pointer values in slow log	2023-04-23 17:54:00 +03:00
Vitaliy Filippov	2877cd0adb	Allow OP_SEC_READ to return errors (do not hang the connection)	2023-04-23 17:54:00 +03:00
Vitaliy Filippov	480509f5b9	Fix pg_data_size > 1 for replicas (harmless bug)	2023-04-23 01:50:42 +03:00
Vitaliy Filippov	46462da45e	Preload own PG history updates to fix PG state loop possibly applying the old metadata version	2023-04-23 01:50:30 +03:00
Vitaliy Filippov	7e958afeda	Release 0.8.7 This release includes a bunch of important bugfixes for erasure-coded setups with disabled immediate_commit. After these fixes, "test_heal" OSD killing test now passes fine with EC: - Fix cluster write stalls with "Error while doing flush on OSD xx: -16 (Device or resource busy)" in OSD logs possible in EC setups with disabled immediate_commit by selectively syncing nonsynced objects on STABILIZE/ROLLBACK (https://github.com/vitalif/vitastor/issues/51) - Fix other EC + disabled immediate_commit problems: - Fix "opcode=5 retval=-2" errors happening on SYNC retries - Fix non-working "pagination" during PG dirty object flushing - Fix write operations not continued correctly after dirty object flushing - Fix incorrect parity read-modify-write calculation when writing into a lost chunk - Fix OSDs losing left_on_dead PG state of non-clean PGs and thus not removing junk data in the cluster - Fix a small memory leak caused by bad indexing of EC recovery matrices - Fix a rare use-after-free in cluster_client caused by a reenterability issue - Fix vitastor-cli create command syntax in the CSI driver - Allow to start OSDs without local store for tests - Fix memory allocation error in disk_tool_meta for non-standard metadata block sizes - Fix delete operations received before loading pool metadata crashing OSDs with "null pointer exception" - Improve "theoretical performance" Russian documentation New features: - Implement online configuration update for some parameters. Documentation is coming soon :)	2023-04-11 02:11:57 +03:00
Vitaliy Filippov	2f5e769a29	Fix a small memory leak caused by bad indexing of EC recovery matrices	2023-04-11 00:30:36 +03:00
Vitaliy Filippov	3237014608	Fix incorrect parity read-modify-write calculation when writing into a lost chunk	2023-04-09 02:06:10 +03:00
Vitaliy Filippov	baaf8f6f44	Fix write operations not continued correctly after flush	2023-04-09 02:06:10 +03:00
Vitaliy Filippov	1d83fdcd17	Add debug logs to osd_flush	2023-04-09 02:06:10 +03:00
Vitaliy Filippov	0ddd787c38	Fix non-working "pagination" during PG dirty object flushing	2023-04-08 02:44:02 +03:00
Vitaliy Filippov	6eff3a60a5	Do not lose left_on_dead PG state of non-clean PGs	2023-04-08 02:44:02 +03:00
Vitaliy Filippov	888a6975ab	Fix a rare use-after-free in cluster_client caused by a reenterability issue	2023-04-08 02:44:02 +03:00
Vitaliy Filippov	cd1e890bd4	Fix "opcode=5 retval=-2" errors sometimes possible with EC	2023-04-08 02:44:02 +03:00
Vitaliy Filippov	0fbf4c6a08	Selectively sync nonsynced objects on STABILIZE/ROLLBACK (fix for github issue #51)	2023-04-08 02:44:02 +03:00
Vitaliy Filippov	d06ed2b0e7	Implement online config update	2023-03-26 19:21:50 +03:00
Vitaliy Filippov	2fb0c85618	Allow to start OSDs without local store (only for tests)	2023-03-15 01:13:59 +03:00
Vitaliy Filippov	d81a6c04fc	Update cmake min version so it does not complain about deprecation	2023-03-15 01:08:23 +03:00
Vitaliy Filippov	7b35801647	Fix possible bad realloc in disk_tool_meta for non-standard metadata block sizes	2023-03-15 01:08:23 +03:00
Vitaliy Filippov	f3228d5c07	Fix typo (did not affect execution though)	2023-03-15 01:08:23 +03:00
Vitaliy Filippov	18366f5055	Fix read/write return type in rw_blocking	2023-03-15 01:08:14 +03:00
Vitaliy Filippov	851507c147	Add missing close() in test stubs	2023-03-15 00:23:56 +03:00
Vitaliy Filippov	9aaad28488	Fix "null pointer exception" for unhandled OSD_OP_DELETEs (when pool is not loaded yet)	2023-03-02 11:16:39 +03:00
Vitaliy Filippov	8810eae8fb	Release 0.8.6 Important fixes: - Fix possibly incorrect EC parity chunk updates with EC n+k, k > 1 and when the first parity chunk is missing Minor fixes and improvements: - Fix incorrect EC free space statistics in vitastor-cli df output - Speedup vitastor-cli startup in clusters with RDMA - Remove unused PG "peered" state (previously used to update PG epoch) - Use sfdisk with just --json in vitastor-disk (--dump --json isn't needed) - Allow trailing comma in sfdisk output (fixes sfdisk 2.36 compatibility) - Slightly improve RDMA send/receive code - Reduce RDMA memory consumption by default (rdma_max_recv/send = 16/8) - Use vitastor-cli instead of direct etcd interaction in the CSI driver	2023-02-28 11:18:48 +03:00
Vitaliy Filippov	14d6acbcba	Set default rdma_max_recv/send to 16/8, fix documentation	2023-02-28 11:00:56 +03:00
Vitaliy Filippov	1e307069bc	Fix missing parity chunk calculation for EC n+k, k > 1 and first parity chunk missing	2023-02-28 02:40:19 +03:00
Vitaliy Filippov	c3e80abad7	Allow to send more than 1 operation at a time	2023-02-26 02:01:04 +03:00
Vitaliy Filippov	138ffe4032	Reuse incoming RDMA buffers	2023-02-26 00:55:01 +03:00
Vitaliy Filippov	4ab630b44d	Use just sfdisk --json, --dump is not needed	2023-02-23 00:55:47 +03:00
Vitaliy Filippov	2c8241b7db	Remove PG "peered" state	2023-02-21 01:30:42 +03:00
Vitaliy Filippov	36a7dd3671	Move tests to "make test"	2023-02-21 01:30:42 +03:00
Vitaliy Filippov	936122bbcf	Initialize msgr lazily in client to speedup vitastor-cli with RDMA enabled	2023-02-19 18:59:07 +03:00
Vitaliy Filippov	1a1ba0d1e7	Add set_immediate to ringloop and use it for bs/osd ops to prevent reenterability issues	2023-02-09 17:37:26 +03:00
Vitaliy Filippov	3d09c9cec7	Remove unused wait_sqe() from ringloop	2023-02-09 17:37:26 +03:00
Vitaliy Filippov	3d08a1ad6c	Fix cluster_client test after last reenterability fixes	2023-02-05 01:47:32 +03:00
Vitaliy Filippov	aba93b951b	Fix incorrect EC free space statistics in vitastor-cli df output	2023-01-26 02:04:29 +03:00
Vitaliy Filippov	d125fb1f30	Release 0.8.5 - Fix a possible "double free" bug in the client library happening on OSD restart - Fix a possible write hang on PG history update when only epoch is changed - Fix incorrect systemd target "local.target" in mon/make-etcd - Allow "content" option in PVE storage plugin to allow to enable containers - Build client library without tcmalloc which fixes "attempt to free invalid pointer" errors when, for example, trying to run QEMU with both Vitastor and Ceph RBD disks	2023-01-25 01:43:49 +03:00
Vitaliy Filippov	8b552a01f9	Do not retry successful operation parts in client (could lead to "double free" bugs)	2023-01-25 01:30:36 +03:00
Vitaliy Filippov	0385b2f9e8	Fix write hangs on PG epoch update - always set pg.history_changed to true	2023-01-25 01:30:15 +03:00
Vitaliy Filippov	9f4e34a8cc	Build client library without tcmalloc Fixes "[src/tcmalloc.cc:332] Attempt to free invalid pointer ..." when trying to run QEMU with both Vitastor and Ceph RBD disks and other possible allocator collisions.	2023-01-15 00:01:11 +03:00
Vitaliy Filippov	81fc8bb94c	Release 0.8.4 New features: - Implement QCOW2 image/snapshot export via qemu-img (bdrv_co_block_status in the driver) - Remove OSDs from PG history during `vitastor-cli rm-osd` to prevent `left_on_dead` PG states after deletion - Add a new recovery_pg_switch setting to mix all PGs during recovery, to almost fully reduce the probability of ENOSPC during rebalance - Introduce partial ENOSPC ("OSD is full") handling - now ENOSPC doesn't turn into cascades of crashes - Add migration support to Proxmox VE Vitastor driver - Track last_clean_pgs on a per-pool basis thus reducing data movement in a cluster with pools remaining unclean/degraded for a long time Bug fixes: - Fix a bug where monitor could generate degraded PGs if one of the hosts had no OSDs - Fix a bug where monitor could skip PG redistribution with a lot of OSDs in cluster - Report PG history synchronously on the first write, which improves PG consistency and availability at the same time, because history now gets reported correctly and doesn't get reported without the need for it - Fix possible write and recovery stalls which could happen in a cluster with both EC and replicated pools - Make OSD and monitors sanitize & deduplicate PG history items in etcd - Fix non-working OSD peer config safety check - Fix a rare journal flush stall where flushing wasn't activated with full journal, but with empty flush queue - Fix builds without ISA-L (jerasure-only) crashing with EC N+K, K>=2 due to the lack of 16-byte buffer alignment - Fix a possible crash for EC N+K, K>=2 when calculating a parity chunk with previous parity chunk missing - Fix a bug where vitastor-disk purge with suppressed warnings didn't work	2023-01-13 23:59:54 +03:00
Vitaliy Filippov	bc465c16de	Fix arithmetic on void* for clang	2023-01-13 23:58:42 +03:00
Vitaliy Filippov	8763e9211c	Fix qemu driver compilation warning/error	2023-01-13 23:44:39 +03:00
Vitaliy Filippov	fe87b4076b	Fix backwards compatibility in cluster_client	2023-01-12 02:37:31 +03:00
Vitaliy Filippov	137309cf29	Implement bdrv_co_block_status for snapshot export support	2023-01-07 17:06:58 +03:00
Vitaliy Filippov	373f9d0387	Try to re-peer PGs on history change	2023-01-06 12:46:44 +03:00
Vitaliy Filippov	c4516ea971	Also remove deleted OSD from PG configuration and last_clean_pgs	2023-01-06 12:46:44 +03:00
Vitaliy Filippov	91065c80fc	Try to prevent left_on_dead when deleting OSDs by removing them from PG history	2023-01-06 12:46:43 +03:00
Vitaliy Filippov	02e7be7dc9	Prevent reenterability side effects during PG history operation resume	2023-01-03 02:20:50 +03:00
Vitaliy Filippov	73940adf07	Prioritize EC (non-instantly-stable) operations under journal pressure This reduces the probability of hitting OSD stalls with EC due to "deadlocks" where two parallel write operations wait for each other to complete	2023-01-03 00:05:45 +03:00
Vitaliy Filippov	e950c024d3	Do not sync peer OSDs before listing Sync before listing was added to wait for all PG writes possibly left in queue from the previous master to finish before listing it But in fact it may block the cluster when EC is used and some unstable writes are left in the queue - they block journal flushing, rollback/stabilize is required to unblock them, but rollback/stabilize may only happen after PG is peered. But peering needs listings, listings are requested only after sync, and sync itself waits for currently blocked writes waiting in the queue	2023-01-03 00:05:45 +03:00
Vitaliy Filippov	71d6d9f868	Fix possible crash on ENOSPC during operation cancel in blockstore	2023-01-03 00:05:45 +03:00
Vitaliy Filippov	a4dfa519af	Report PG history synchronously during write This has 2 effects: 1) OSD sets aren't added into PG history until actual write attempts anymore which removes unneeded extra osd_sets in PG history 2) New OSD sets are reported synchronously and can't be lost on PG restarts happening at the same time with reconfiguration	2023-01-01 23:41:05 +03:00
Vitaliy Filippov	67019f5b02	Make OSD sort & sanitize PG history items	2023-01-01 23:17:42 +03:00
Vitaliy Filippov	0593e5c21c	Fix OSD peer config safety check	2022-12-31 02:24:42 +03:00
Vitaliy Filippov	998e24adf8	Add a new recovery_pg_switch setting to mix all PGs during recovery	2022-12-30 02:03:33 +03:00
Vitaliy Filippov	d7bd36dc32	Fix another rare journal flush stall	2022-12-30 02:03:33 +03:00
Vitaliy Filippov	cf5c562800	Log all object locations when peering PGs	2022-12-30 02:03:33 +03:00
Vitaliy Filippov	629200b0cc	Return ENOSPC as the primary OSD	2022-12-30 02:03:33 +03:00
Vitaliy Filippov	3589ccec22	Do not disconnect peer on ENOSPC during write	2022-12-30 01:54:25 +03:00
Vitaliy Filippov	8d55a1e780	Build osd_rmw_test both with and without ISA-L	2022-12-29 19:13:57 +03:00
Vitaliy Filippov	65f6b3a4eb	Fix jerasure crashing on bitmap calculation/restoration due to the lack of 16-byte alignment	2022-12-29 19:13:57 +03:00
Vitaliy Filippov	fd216eac77	Add a test for missing parity chunk calculation	2022-12-29 19:13:57 +03:00
Vitaliy Filippov	61fca7c426	Fix crash when calculating a parity chunk with previous parity chunk missing (test coming shortly)	2022-12-29 19:13:57 +03:00
Vitaliy Filippov	68f3fb795e	Suppress warnings in vitastor-disk purge correctly	2022-12-27 11:09:19 +03:00
Vitaliy Filippov	fa90f287da	Release 0.8.3 - Implement a new "vitastor-disk purge" command to remove OSDs with safety checks - Implement a new "vitastor-cli rm-osd" command to only remove OSD metadata from etcd - Fix a bug where the monitor could ignore OSD removal and other /osd/stats key changes - Fix a bug where garbage could be returned when reading objects being written at the same time - Fix a rare write stall where journal space could be not reclaimed where there were no new operations in the flush queue - Fix a rare peering stall caused by a previous long listing operations queues limiting attempt - Fix total object count statistic in OSD on object creation - Add missing offset&len into vitastor-disk dump-journal for big_writes, fix JSON format - Make vitastor-cli print help on missing command - Make vitastor-cli translate all '-' to '_' in CLI options	2022-12-27 02:40:55 +03:00
Vitaliy Filippov	795020674d	Loop journal flusher when the queue is empty but there is a trim request	2022-12-27 02:28:20 +03:00
Vitaliy Filippov	8e12285629	Fix vitastor-disk purge (now it works)	2022-12-27 02:28:20 +03:00
Vitaliy Filippov	b9b50ab4cc	Implement vitastor-disk purge command	2022-12-26 02:48:48 +03:00
Vitaliy Filippov	0d8625f92d	Make vitastor-cli print help on missing command	2022-12-26 02:48:48 +03:00
Vitaliy Filippov	2f3c2c5140	Implement safety check for OSD removal, translate all '-' to '_' in cli options '-' to '_' translation fixes a bug with create --image_meta	2022-12-26 02:48:48 +03:00
Vitaliy Filippov	4ebdd02b0f	Remove LIST op limiter It doesn't prevent OSD slow ops but may itself lead to stalls :)	2022-12-26 02:48:48 +03:00
Vitaliy Filippov	c2244331e6	Add vitastor-cli rm-osd command	2022-12-26 02:48:48 +03:00
Vitaliy Filippov	31bd1ec145	Fix object creation check for statistics	2022-12-21 02:51:11 +03:00
Vitaliy Filippov	c08d1f2dfe	Add missing offset&len into big_writes journal dump, fix commas again	2022-12-21 02:51:11 +03:00
Vitaliy Filippov	1d80bcc8d0	Fix blockstore returning garbage for unstable reads if there is an in-flight version "In-flight" versions are added into dirty_db when writes are enqueued. And they weren't ignored by subsequent reads even though they didn't have data location yet. This bug was leading to test_heal.sh not passing sometimes with replicated setups.	2022-12-21 02:48:24 +03:00
Vitaliy Filippov	5ef8bed75f	Release 0.8.2 - Fix QEMU driver compatibility with QEMU 7.0 and < 2.9 - Add patches for pve-qemu-kvm 7.1 (PVE 7.3) and pve-qemu-kvm 6.2 (PVE 7.2) - Fix Proxmox driver location in the pve-storage-vitastor package - Disable HDD autodetection in non-hybrid mode - Explicitly warn about a buggy kernels on -EAGAIN in io_uring - Final fix for the lack of zeroing out of old metadata entries (do not crash with "big_write journal_entry was allocated over another object" in some cases after an unclean OSD shutdown) - Wait for data writes before fsyncing data if data fsync is enabled - Never try to wait for free space inside blockstore thus stalling OSDs - Fix a rare crash in osd_peering due to callback ordering - Fix a rare duplication of ping & op message IDs - Fix a rare use-after-free during pings - Add --force to vitastor-disk read-sb - Make vitastor-disk dump metadata object IDs in hex, add forgotten commas - Fix vitastor-disk SCSI disk cache check	2022-12-17 17:54:13 +03:00
Vitaliy Filippov	8669998e5e	Fix discard_list_subop() for local ops	2022-12-17 17:54:13 +03:00
Vitaliy Filippov	b457327e77	Oops. Fix metadata read after fixes :-)	2022-12-17 17:31:57 +03:00
Vitaliy Filippov	f7fa9d5e34	Fix SCSI device cache type check	2022-12-17 17:31:57 +03:00
Vitaliy Filippov	49b88b01f9	Fix clang build	2022-12-17 16:25:26 +03:00
Vitaliy Filippov	71688bcb59	Disable HDD autodetection in non-hybrid mode	2022-12-17 16:12:15 +03:00
Vitaliy Filippov	552e207d2b	Explicitly print errors about -EAGAIN in io_uring	2022-12-17 15:49:49 +03:00
Vitaliy Filippov	5464821fa5	Final fix for the lack of zeroing out of old metadata entries If a crash occurs during flushing a redirect-write it may happen so that the disk contains both old and new metadata entries. This is OK, but prior to 0.8.0 after this situation OSDs started without problem, but then they crashed after some more overwrites with a "tried to overwrite non-zero metadata entry" error. 0.8.0 introduced a change that was intended to fix this situation, but rather than fixing it it prevented OSDs from starting, now because of a "big_write journal_entry was allocated over another object" error... :-) This change finally fixes the original issue. Followup to `54ef2c389f`	2022-12-17 14:50:31 +03:00
Vitaliy Filippov	6917a32ca8	Add --force to vitastor-disk read-sb	2022-12-17 02:47:15 +03:00
Vitaliy Filippov	f8722a8bd5	Dump meta in hex	2022-12-17 01:50:38 +03:00
Vitaliy Filippov	9c2f69c9fa	Add forgotten commas to vitastor-disk dump-journal	2022-12-17 01:22:58 +03:00
Vitaliy Filippov	1a93e3f33a	Wait for data writes before fsyncing data if data fsync is enabled	2022-12-16 20:46:55 +03:00
Vitaliy Filippov	3f35744052	Fix compatibility with QEMU aio_set_fd_handler signatures in 7.0 and < 2.9	2022-12-15 19:17:17 +03:00
Vitaliy Filippov	cb437913d3	Never try to wait for free space inside blockstore	2022-12-12 00:27:05 +03:00
Vitaliy Filippov	472bce58ab	Fix rare crash in osd_peering due to callback ordering	2022-12-12 00:27:05 +03:00
Vitaliy Filippov	7a71e7ef01	Fix possible duplication of ping & op message IDs	2022-12-04 00:16:47 +03:00
Vitaliy Filippov	c71e5e7bbd	Fix possible use-after-free during pings	2022-12-04 00:16:47 +03:00
Vitaliy Filippov	8fdf30b21f	Release 0.8.1 - Remove an additional data copy operation when flushing journal (should slightly increase write performance) - Fix a bug where new writes in the inmemory_journal=false mode could overwrite the data currently read by a parallel read operation - Fix degraded parity writes for EC N+K when K>1 where the bug could also lead to an "assertion failed" error - Fix missing journal space check for "big" writes which could lead to "prefill_single_journal_entry(): assertion failed..." error in OSD - Fix possible "assertion failed: next->prev_wait >= 0" in client in rare cases - Fix missing "len" field in vitastor-disk write-journal big_writes - Fix possible crash of a full OSD (ENOSPC) - Fix CSI build scripts to include newest packages every time - Fix CSI endpoint in the liveness probe manifest	2022-11-20 11:44:09 +03:00
Vitaliy Filippov	238037ae31	Make journal trimmer wait until reads are completed when inmemory_journal is false Without this new writes may in theory overwrite journal data being read at that time	2022-11-20 01:49:21 +03:00
Vitaliy Filippov	09a8864686	Fix degraded parity writes for EC N+K when K>1 Fixes possible `calc_rmw_parity_ec(): Assertion `bufs[i][curbuf[i]].buf' failed` error	2022-11-20 00:50:13 +03:00
Vitaliy Filippov	6e6f6ecbb0	Add missing journal space check for big_writes Fixes possible `prefill_single_journal_entry(): Assertion `!journal.sector_info[journal.cur_sector].flush_count' failed` error	2022-11-20 00:50:13 +03:00
Vitaliy Filippov	bf8a0581cd	Fix possible "assertion failed: next->prev_wait >= 0" in client	2022-11-20 00:50:13 +03:00
Vitaliy Filippov	5953942042	Add crc32c test utility	2022-11-20 00:50:13 +03:00
Vitaliy Filippov	a276a1f737	Do not copy journal data additional time when flushing	2022-11-20 00:50:13 +03:00
Vitaliy Filippov	cc24e5796e	Add a FIXME	2022-11-20 00:50:09 +03:00
Vitaliy Filippov	6e26732e6a	Fix skipped "len" field in vitastor-disk write-journal big_writes	2022-11-12 12:01:40 +03:00
Vitaliy Filippov	b4edc79449	Fix possible segfault on ENOSPC	2022-11-12 11:59:43 +03:00
Vitaliy Filippov	11ec9ad874	Release 0.8.0 - Implement automatic OSD activation via udev and simple on-disk superblock storage - Add a new `vitastor-disk` tool and merge all disk-related functionality there. Now it can prepare new OSD disks, upgrade plain old systemd units to the new scheme, resize OSD data area, manage OSD services by disk paths, manage superblocks, automatically check and disable disk cache, dump and write back journal and metadata. - Add a documentation section about `vitastor-disk` (read it if you want details!) - Install systemd services during package installation instead of the older method of manually creating them via separate shell scripts - Add a new `make-etcd` script that reuses /etc/vitastor/vitastor.conf to configure etcd - Allow to configure block_size, bitmap_granularity and immediate_commit per-pool - Fix "fatal error: tried to overwrite non-zero metadata entry" which was possible in some cases after unclean OSD shutdown (caused by old metadata entries not being zeroed)	2022-09-05 13:51:20 +03:00
Vitaliy Filippov	83bb6598dc	Fix fsync autodetection for the single-device mode	2022-09-05 13:51:20 +03:00
Vitaliy Filippov	150f369346	Hotfixes for vitastor-disk prepare: max_other, get device size, older sfdisk	2022-09-05 12:48:27 +03:00
Vitaliy Filippov	8d9a5fde15	Fix docs (block_size vs object_size)	2022-09-04 14:47:04 +03:00
Vitaliy Filippov	9ccc607ab9	Fix parse_size	2022-09-04 14:20:56 +03:00
Vitaliy Filippov	9481456dfe	Automatically check whether to disable cache during prepare	2022-09-03 02:04:21 +03:00
Vitaliy Filippov	68ebe5993a	Fix partition reuse	2022-09-02 23:32:25 +03:00
Vitaliy Filippov	a537db8909	Add documentation for the new "vitastor-disk" tool	2022-08-22 00:31:30 +03:00
Vitaliy Filippov	54ef2c389f	Followup to the "tried to overwrite" fix: also handle it in case of inmemory_meta == false	2022-08-21 01:28:29 +03:00
Vitaliy Filippov	153c73574a	Refactor blockstore_init_meta into slightly more obvious code	2022-08-21 01:21:13 +03:00
Vitaliy Filippov	d83580bd68	Fix "tried to overwrite non-zero metadata entry" when during a previous metadata flush writing new entry is completed, but zeroing out an old one isn't	2022-08-21 00:31:18 +03:00
Vitaliy Filippov	29b40aba93	Add write-meta command (for debug)	2022-08-20 23:56:57 +03:00
Vitaliy Filippov	a52f2b0e8f	Add write-journal command (for debug)	2022-08-20 14:05:53 +03:00
Vitaliy Filippov	1407db9c08	Fix vitastor-disk prepare bugs	2022-08-19 02:22:54 +03:00
Vitaliy Filippov	c0d5e83fb8	Run partprobe when partitions do not appear	2022-08-18 02:05:16 +03:00
Vitaliy Filippov	40d8d65188	Rewrite upgrade-simple to C++	2022-08-18 01:31:31 +03:00
Vitaliy Filippov	a16263e88c	Fix bugs in the upgrade script and in the udev startup script	2022-08-17 10:28:34 +03:00
Vitaliy Filippov	cb4e3a118d	Fix warning	2022-08-15 00:18:21 +03:00
Vitaliy Filippov	b1e39b5dea	Split disk_tool.cpp into separate files	2022-08-14 02:37:01 +03:00
Vitaliy Filippov	1170319431	Finish vitastor-disk prepare in theory	2022-08-14 02:13:24 +03:00
Vitaliy Filippov	2e0a2221eb	vitastor-disk prepare: WIP second form command of the command	2022-08-12 01:58:28 +03:00
Vitaliy Filippov	5a10d135f3	Allow to configure block_size, bitmap_granularity and immediate_commit per-pool	2022-08-11 01:56:33 +03:00
Vitaliy Filippov	4c9aaa8a86	vitastor-disk prepare: implement first form of the command	2022-08-09 01:29:29 +03:00
Vitaliy Filippov	ae99ee6266	Rename base64.{cpp.h} to str_util	2022-07-31 01:12:37 +03:00
Vitaliy Filippov	5af75f7d78	Implement vitastor-cli and vitastor-disk --help <command>	2022-07-31 01:10:05 +03:00
Vitaliy Filippov	7dc6f10ea1	Add read-sb command	2022-07-28 00:14:23 +03:00
Vitaliy Filippov	76dd0fdcea	Implement pre-exec command with on-start OSD checks	2022-07-24 15:09:45 +03:00
Vitaliy Filippov	5acc19bbd5	Implement systemctl start/stop and other commands	2022-07-23 02:18:40 +03:00
Vitaliy Filippov	d5ca4e1f90	Add exec-osd command	2022-07-22 02:17:24 +03:00
Vitaliy Filippov	67e04f789f	Add write-sb (superblock) command	2022-07-19 01:14:31 +03:00
Vitaliy Filippov	837407a84c	Add udev import command	2022-07-19 01:14:31 +03:00
Vitaliy Filippov	1fe5908899	WIP OSD activation from superblock	2022-07-17 02:14:50 +03:00
Vitaliy Filippov	dcc6d546be	Move simple-offsets into vitastor-disk, too	2022-07-15 02:19:35 +03:00
Vitaliy Filippov	85fa389557	Add a test for disk-tool resize	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	dfa433c63b	Add JSON format to dump-journal	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	cf487c95aa	Fix resizer	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	b10656ca09	Parse new disk params in disk_tool resizer	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	ea632367e9	Do not alter dsk.meta_offset/len to skip superblock	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	4d777c6729	Set journal/meta devices to data device explicitly instead of ""	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	0c404c5074	Use blockstore_disk in disk_tool	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	dfd80626bd	Extract disk opening functions to separate module	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	30907852c2	Use simple std::map for the config	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	078ed5b116	WIP Data area resize tool	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	73a363bf92	Rename some variables and constants	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	b0e86ca643	Merge dump-journal and dump-meta into the new "vitastor-disk" tool	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	8800afb649	Fix void* arithmetic again	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	e20cdd13b6	Fix simple-offsets return value	2022-07-15 01:38:30 +03:00
Vitaliy Filippov	bce357e2a5	Do not read all metadata into memory when dumping	2022-06-13 01:26:30 +03:00
Vitaliy Filippov	0876ca09cd	Fix dumper includes and print format	2022-06-11 00:30:44 +03:00
Vitaliy Filippov	dac12d8a4c	Implement metadata dump tool	2022-06-10 18:50:09 +03:00
Vitaliy Filippov	1eec4407ab	Fix inode creation when /index/maxid is out of sync	2022-06-06 16:35:51 +03:00
Vitaliy Filippov	675bc12a13	Add extern "C" for systems like Gentoo which miss it in jerasure includes	2022-06-05 00:33:38 +03:00
Vitaliy Filippov	101592bbff	Release 0.7.1 - Add ISA-L erasure code implementation, now used automatically instead of jerasure when available - Fix listings sending too many parallel requests to OSDs - Fix rm-data crashing with --wait-list - Remove empty inodes from statistics and `ls` output, after <inode_vanish_time> seconds after deletion - Make monitor delete pool statistics when the pool is deleted and thus remove them from `df` output - Log multiple etcd addresses in OSD logs correctly - Fix true/false parsing in json configs like no_recovery/no_rebalance - Show no_recovery, no_rebalance, readonly flags in status	2022-06-05 00:07:24 +03:00
Vitaliy Filippov	87613ed590	Add ISA-L into RPM specs	2022-06-04 13:27:06 +03:00
Vitaliy Filippov	2a2e914ef9	Show no_recovery, no_rebalance and readonly flags in status	2022-06-04 13:27:06 +03:00
Vitaliy Filippov	0cdc9292c8	Fix true/false parsing in json configs like no_recovery/no_rebalance	2022-06-04 13:27:06 +03:00
Vitaliy Filippov	3e1b03bb5c	Show all etcd addresses in the "reporting to..." message	2022-06-04 13:27:06 +03:00
Vitaliy Filippov	1efbbb0c36	Make deleted inodes vanish from statistics after 60 seconds	2022-06-04 13:27:06 +03:00
Vitaliy Filippov	088dd15449	Exclude empty inodes from stats	2022-06-04 00:18:17 +03:00
Vitaliy Filippov	4a531d7b8b	Fix listings sending too many parallel requests to OSDs, fix rm-data crashing with --wait-list	2022-06-03 23:36:37 +03:00
Vitaliy Filippov	a0cae4c180	Rename "jerasure" to "ec" in pool configuration, function names, fix documentation and Debian build scripts Old pool configurations with "jerasure" also remain supported as an alias for "ec"	2022-06-03 15:40:00 +03:00
Vitaliy Filippov	21b306e25f	Add ISA-L support	2022-06-02 01:47:33 +03:00
Vitaliy Filippov	d8313e939a	Release 0.7.0 - Add documentation! :-) in Russian and English - Implement an NFS proxy for file-based access emulation to Vitastor images for non-QEMU based hypervisors like VMWare, as a better way than iSCSI - Implement "primary affinity tags" - Add a patch for libvirt 6.0 - Fix free_down_raw in cli status - Fix a rare bug where OSDs could drop unrelated connections on errors	2022-05-29 23:39:53 +03:00
Vitaliy Filippov	82b9f4c52d	Add a test with OSD kills	2022-05-28 00:51:14 +03:00
Vitaliy Filippov	2bdf415eb3	Fix unknown OSD numbers on error	2022-05-28 00:51:14 +03:00
Vitaliy Filippov	5d47bbe04c	Add documentation	2022-05-17 01:10:49 +03:00
Vitaliy Filippov	93a9f1ef89	Fix NFS socket read hangs	2022-05-11 21:06:56 +03:00
Vitaliy Filippov	2697aae909	Fix free_down_raw in cli status	2022-05-11 18:08:45 +03:00
Vitaliy Filippov	6b69db73ac	Remove getrandom() usage	2022-05-11 11:25:20 +03:00
Vitaliy Filippov	d48a824846	Fix some warnings	2022-05-10 12:42:58 +03:00
Vitaliy Filippov	40985282ff	Fix build under GCC 8	2022-05-10 12:26:47 +03:00
Vitaliy Filippov	acf403e886	Add install target for NFS proxy	2022-05-10 10:43:17 +03:00
Vitaliy Filippov	7c2379d458	Simplified NFS proxy based on own NFS/XDR implementation	2022-05-07 01:01:20 +03:00
Vitaliy Filippov	a2189100dd	Make CLI functions usable in library form Return results and errors in a variable instead of just printing them, separate vitastor-cli main() from cli_tool_t, move positional argument parsing to CLI main from command implementations.	2022-05-06 02:18:32 +03:00
Vitaliy Filippov	bb84379db6	Release 0.6.17 - Fix incorrect reading of extra metadata block leading to extra unknown objects in stats - Fix CSI driver volumeMode: Block support - Add block PVC and pod examples - Fix build under 32 bit architectures - Fix slow connection ramp-up caused by up_wait_retry_interval	2022-05-06 02:18:01 +03:00
Vitaliy Filippov	714dda8151	Fix slow connection ramp-up caused by up_wait_retry_interval pausing operations on first connection attempt	2022-05-06 02:12:08 +03:00
Vitaliy Filippov	e718116f54	Fix incorrect reading of extra metadata block	2022-04-21 02:52:21 +03:00
Vitaliy Filippov	caa2cc2e6c	Fix 32bit build error	2022-04-16 01:48:24 +03:00
Vitaliy Filippov	842ba8b831	Use (uint64_t)1 instead of 1l / 1ul	2022-04-16 01:48:14 +03:00
Vitaliy Filippov	340a4b4f27	Release 0.6.16 - Implement `vitastor-cli status` (print cluster status) command - Add a new `make-osd-hybrid.js` script to quickly prepare a lot of hybrid (HDD+SSD) OSDs - Implement snapshot deletion for Cinder driver (only works in a healthy cluster) - Fix a huge :) bug causing reads to return all zeroes during rebalance. Add a test to prevent it in the future - Disconnect NBD proxy correctly without leaving a zombie [vitastor-nbd] process in D state - Fix a rare write hang appearing with small write throttling enabled	2022-04-09 01:16:52 +03:00
Vitaliy Filippov	d71cc174e3	Implement CLI status command	2022-04-09 00:25:51 +03:00
Vitaliy Filippov	83146fa3e2	Fix the same HUGE bug for regular reads during rebalance	2022-04-08 11:50:09 +03:00
Vitaliy Filippov	cd18ef7323	Disconnect NBD proxy correctly without leaving a zombie [vitastor-nbd] process in D state	2022-04-07 16:03:35 +03:00
Vitaliy Filippov	39531ef1a6	Fix incorrect chained reads during rebalance (the bug detected by test_rebalance_verify.sh)	2022-04-07 15:56:58 +03:00
Vitaliy Filippov	c373425562	Fix nbd log	2022-04-07 15:55:38 +03:00
Vitaliy Filippov	9c30df83e3	Fix a HUGE :) bug in NBD proxy The bug could result in corrupted data on large writes	2022-04-03 10:42:06 +03:00
Vitaliy Filippov	4100d829c7	Allow to override log file for daemonized NBD proxy	2022-04-03 02:41:04 +03:00
Vitaliy Filippov	79ebda933e	Fix a write hang with throttling due to timer reenterability / triggerability	2022-03-28 01:42:06 +03:00
Vitaliy Filippov	85298ddae2	Release 0.6.15 - Make peering much faster in medium to large clusters - Fix a reenterability issue which could rarely lead to peering process hangs	2022-03-06 19:16:34 +03:00
Vitaliy Filippov	e23296a327	Rename cli_rm -> cli_rm_data, cli_snap_rm -> cli_rm	2022-02-24 14:34:14 +03:00
Vitaliy Filippov	839ec9e6e0	Shard clean_db by PGs to speedup listings	2022-02-20 00:21:24 +03:00
Vitaliy Filippov	7cbfdff41a	Replace some throws with force_stop	2022-02-20 00:21:19 +03:00
Vitaliy Filippov	951272f27f	Try to process PG one after another	2022-02-19 19:25:55 +03:00
Vitaliy Filippov	a3fb1d4c98	Fix reenterability around set_timer	2022-02-19 18:28:12 +03:00
Vitaliy Filippov	88402e6eb6	Move next_request to run_cb_and_clear	2022-02-19 16:59:03 +03:00
Vitaliy Filippov	390239c51b	Don't terminate HTTP requests with timeouts if response is already available in the socket	2022-02-19 13:37:12 +03:00
Vitaliy Filippov	b7b2adfa32	Fix http client not continuing requests in case of failure to connect	2022-02-19 13:36:26 +03:00
Vitaliy Filippov	36c276358b	Attempt to fix "head-of-line blocking" by LIST operations	2022-02-18 01:31:45 +03:00
Vitaliy Filippov	117d6f0612	Release 0.6.14 - Fix IPv6 address parsing - Fix "cannot read bytes of undefined" in the monitor on a fresh DB - Fix possible hangs of read requests on OSD restarts without immediate_commit=all mode - Fix OSDs skipping misplaced recovery in some cases - Fix OSDs possibly dying with "map::at" errors when other OSDs are stopped - Fix division by zero in ls if all pool OSDs are down	2022-02-17 14:43:44 +03:00
Vitaliy Filippov	7d79c58095	Use the larger sockaddr_storage structure	2022-02-12 11:22:56 +03:00
Vitaliy Filippov	732e2804e9	Fix operation dependency counter underflow for reads without immediate_commit=all mode	2022-02-11 10:54:11 +03:00
Vitaliy Filippov	abaec2008c	Fix OSDs missing misplaced recovery	2022-02-11 01:00:24 +03:00
Vitaliy Filippov	8129d238a4	Different fio versions have different types for xfer_buflen, but Vitastor anyway does not support 128-bit offsets	2022-02-10 01:21:04 +03:00
Vitaliy Filippov	61ebed144a	Fix OSDs possibly dying with "map::at" errors when other OSDs are stopped	2022-02-09 10:35:29 +03:00
Vitaliy Filippov	9d3ba113aa	Extract bind socket code into a utility function	2022-02-06 00:39:52 +03:00
Vitaliy Filippov	9788045dc9	Fix division by zero in ls if all pool OSDs are down	2022-02-05 17:03:37 +03:00
Vitaliy Filippov	d6b0d29af6	4k MEM_ALIGNMENT	2022-02-05 17:03:37 +03:00
Vitaliy Filippov	36f352f06f	Release 0.6.13 - Fix client hangs possible on OSD restarts (bug affected versions from 0.5.11) - Fix "Assertion `sqe != NULL' failed" io_uring-related crashes possible on some kernels (0.6.11 increased probability of this bug) - Fix timeout=0 in NBD proxy - Fix build under centos 7	2022-02-03 01:50:30 +03:00
Vitaliy Filippov	318cc463c2	Fix warnings	2022-02-03 01:50:30 +03:00
Vitaliy Filippov	145e5cfb86	MCL_ONFAULT is not available under centos 7	2022-02-03 01:42:19 +03:00
Vitaliy Filippov	73ae578981	Add osd_memlock option	2022-02-02 01:40:22 +03:00
Vitaliy Filippov	f712967079	And one more sqe starvation fix	2022-02-01 02:50:16 +03:00
Vitaliy Filippov	df0cd85352	Fix another part of the "async sqe clear" bug (followup to `d9857a5340`)	2022-02-01 01:14:56 +03:00
Vitaliy Filippov	ebaf4d7a72	Fix compatibility with fio 3.28+	2022-01-31 23:39:14 +03:00
Vitaliy Filippov	d4bc10542c	Fix compatibility with liburing >= 2.1 where it only has __pad2[2]	2022-01-31 22:49:40 +03:00
Vitaliy Filippov	140309620a	Free recv_buf in nbd_proxy	2022-01-31 20:37:58 +03:00
Vitaliy Filippov	0a610ee943	Destroy the client after completing CLI command	2022-01-31 18:27:04 +03:00
Vitaliy Filippov	f3ce166064	Do not print nan% in df when a pool has no available OSDs	2022-01-31 18:23:57 +03:00
Vitaliy Filippov	717d303370	Handle get_sqe failures, don't die with "will fall out of sync" in epoll_manager The problem is that in recent kernels io_uring may return completions BEFORE clearing the submission queue. For example, if its capacity is 512 and there were 512 requests, one of which completed, then when the request completion is processed the queue "should have" 1 free slot. But sometimes it doesn't because io_uring doesn't always clear the submission queue before sending CQE :-/	2022-01-31 02:52:20 +03:00
Vitaliy Filippov	d9857a5340	Check for SQEs, not for completions Should finally fix Assertion `sqe != NULL' failed introduced after journaling refactor in 0.6.11...	2022-01-31 02:19:10 +03:00
Vitaliy Filippov	eb5d9153e8	Fix build under centos 7	2022-01-30 20:29:44 +03:00
Vitaliy Filippov	4047ca606f	Add missing cancel_op(currently being read op) when stopping a client Fixes client hangs possible after stopping & restarting an osd. Hangs happened when a connection was closed in the middle of reading a READ operation reply from the network. In this case the operation being read was in read_op and the client didn't free it when closing the connection. Test case for msgr_read.cpp: - Partially read reply for a READ operation - stop_client() - Check that the READ operation returns EPIPE The bug was actually introduced in 0.5.11.	2022-01-28 01:53:52 +03:00
Vitaliy Filippov	218e294e9c	> 0, of course	2022-01-24 13:36:09 +03:00
Vitaliy Filippov	c1929cabe0	Release 0.6.12 etcd connection stability, clang & elbrus support - Fix build under CLang and Elbrus LCC compilers, making Vitastor compatible with Elbrus CPUs :) - Completely fix the bug where OSDs didn't connect to peers and incorrectly marked PGs as incomplete - Limit I/O depth for deletes the same way as for small writes. Makes OSD crashes with "Assertion failed: sqe != NULL" during image deletion go away - Fix a very old, but rare, journaling bug (credits to https://github.com/mirrorll) - Fix flushing of unclean journaled objects leading to OSDs sometimes hanging after failover in EC setups (bug was introduced in 0.6.7) - Fix several problems that could prevent smooth operation of a Vitastor cluster under the condition of partial etcd failure: - OSDs could randomly fail due to too strict error handling - New clients and OSDs could be unable to start because of the lack of retries - CLI could fail some commands because of the lack of retries - Monitor could stop receiving state updates because of the lack of websocket pings - Fix monitor being unable to rebalance PGs after a downscale of pool pg_size (3->2) - Exit with failure when trying to nbd map or benchmark a non-existing image - Use HTTP keep-alive for etcd connections - Allow to configure etcd request timeouts and retries - Allow to configure NBD timeout, max devices and partitions, and set default to up to 64 devices with up to 3 partitions each	2022-01-24 01:15:25 +03:00
Vitaliy Filippov	cc6b24e03a	Allow to configure NBD timeout, max devices and partitions Also set default NBD devices/partitions to 64/3, Linux default is 16/16 which is way too low	2022-01-24 01:15:19 +03:00
Vitaliy Filippov	0757ba630a	Do not happily NBD "map" non-existing images, do not try to benchmark them too	2022-01-23 23:03:42 +03:00
Vitaliy Filippov	2a0b881685	Respect max_write_iodepth for deletes	2022-01-23 22:05:23 +03:00
Vitaliy Filippov	8dc1ffb13b	Try to connect with PG peers before deciding it's incomplete :) I already attempted to fix it in 0.6.11, but it happened so that the fix was only partial :)	2022-01-23 19:19:26 +03:00
Vitaliy Filippov	ba63af49b4	Add etcd retries everywhere (they were missing in some places)	2022-01-23 17:21:48 +03:00
Vitaliy Filippov	31b9c683ee	Fix flushing of unclean objects This was preventing OSD failover when there were some unclean objects. Bug was introduced in `aa436027c8`	2022-01-23 00:45:11 +03:00
Vitaliy Filippov	3abcac058f	Check for double response_callback call more	2022-01-23 00:26:20 +03:00
Vitaliy Filippov	e01c4db702	Add paranoid if()s to prevent accidental double free of etcd_watch_ws	2022-01-23 00:16:09 +03:00
Vitaliy Filippov	a5cf06acd0	Remove etcd timeout and keepalive interval hardcode	2022-01-23 00:00:00 +03:00
Vitaliy Filippov	9c3653b1e1	Handle EINTR	2022-01-22 23:59:37 +03:00
Vitaliy Filippov	7920414bee	Fix build under older gcc (debian buster)	2022-01-20 10:34:52 +03:00
Vitaliy Filippov	098e369a3b	Fix rand initialization, add etcd connection/disconnection logging	2022-01-20 00:45:49 +03:00
Vitaliy Filippov	a43ef525a2	Remove two last end()s from http_client (should have been removed in the keepalive patch)	2022-01-20 00:44:18 +03:00
Mikhail Koshel	d798e0821e	#1 fix deps	2022-01-18 13:30:53 +06:00
Vitaliy Filippov	e591a3e9f7	Include sys/stat.h in messenger.cpp No idea why, but it builds without it on x86 and does not build on e2k	2022-01-17 13:43:29 +03:00
Vitaliy Filippov	77cc18420a	Fix leaks detected by clang scan-build (only 1 of 4 may be important though)	2022-01-16 00:11:59 +03:00
Vitaliy Filippov	7bdd92ca4f	Fix build under clang and some warnings Build problems fixed: - void* pointer arithmetic which is a GNU extension (works as byte*) - "variable size object may not be initialized" which is OK under GCC - nullptr_t related error in json11 (it lacks 'operator <' in clang) Warnings fixed: - empty nested struct initializer { 0 } replaced by {} - removed several unused lambda captures	2022-01-16 00:02:54 +03:00
Vitaliy Filippov	515a2e6e33	Only die when detecting a real race condition, not just a CAS failure	2022-01-05 17:05:25 +03:00
Vitaliy Filippov	9c6168bf17	Remove fill_parsed_response	2022-01-03 20:08:26 +03:00
Vitaliy Filippov	5473d5b4a2	Rework HTTP client to use keepalive, move getifaddr_list to addr_util	2022-01-03 14:52:01 +03:00
Vitaliy Filippov	c3304bce27	Merge pull request #38 from mirrorll/master journal check_available error	2021-12-31 12:45:16 +03:00
Vitaliy Filippov	b9f5c2a823	Support zero-copy send in fio_sec_osd to allow testing it Preliminary results: - CPU usage drops significantly. For example, in T1Q8 128K write test against stub_uring_osd with 10G network and Athlon X4 860k CPU it drops from 100% to 30% - Latency becomes slightly worse. In T1Q1 4K write test in the same environment latency increases from 56 to 63 us. - Small write throughput also becomes slightly worse. In T1Q128 4K write test against stub iops decreases from 138k to ~110k (unstable, fluctuates 100k..120k). Note that this is without io_uring, of course.	2021-12-27 02:12:44 +03:00
Vitaliy Filippov	e9d2f79aa7	Support reading bitmaps in fio_sec_osd	2021-12-27 02:12:44 +03:00
Vitaliy Filippov	0785bdf8b3	Release 0.6.11 - Slightly reduce journaling write amplification (requires no_same_sector_overwrites=false) - Fix listen_backlog (it was 0) because it could more than halve OSD socket send speed - Support IPv6 OSD addresses - Do not try to initialize client in simple-offsets - Fix OSDs sometimes marking PGs incomplete instead of trying to connect with peers - Allow to configure OSD placement in node_placement - Allow to run with 4k sector size block devices. Natural, but it was forbidden	2021-12-26 21:11:24 +03:00
Vitaliy Filippov	b57e44748b	Send 4 byte bitmap in stub_uring_osd	2021-12-25 11:38:13 +03:00
Vitaliy Filippov	1bbe62f29c	Fix uninitialized listen_backlog which was leading to REALLY SLOW send speeds!!!	2021-12-25 11:38:13 +03:00
lihai	3061c30132	journal check_available error	2021-12-21 09:39:58 +08:00
Vitaliy Filippov	20a4406acc	Support IPv6 OSD addresses	2021-12-19 10:42:17 +03:00
Vitaliy Filippov	f93491bc6c	Implement journal write batching and slightly refactor journal writes Slightly reduces WA. For example, in 4K T1Q128 replicated randwrite tests WA is reduced from ~3.6 to ~3.1, in T1Q64 from ~3.8 to ~3.4. Only effective without no_same_sector_overwrites.	2021-12-16 00:27:17 +03:00
Vitaliy Filippov	999bed8514	Fix opening regular files as blockstore	2021-12-15 02:08:58 +03:00
Vitaliy Filippov	3f33095fd7	Do not try to initialize client in simple-offsets	2021-12-15 02:07:27 +03:00
Vitaliy Filippov	dd74c5ce1b	Fix OSDs marking PGs incomplete instead of trying to connect with peers	2021-12-14 01:57:51 +03:00
Vitaliy Filippov	c6d104ecd6	Print object version on fatal overwrite	2021-12-14 01:57:04 +03:00
Vitaliy Filippov	e544aef7d0	Fix test rw_blocking	2021-12-12 23:24:50 +03:00
Vitaliy Filippov	616c18c786	Fix stub_uring_osd	2021-12-12 23:06:11 +03:00
Vitaliy Filippov	2c7556e536	Allow to run with 4k sector size. Natural, but it was forbidden	2021-12-11 22:03:16 +00:00
Vitaliy Filippov	2020608a39	Release 0.6.10 - Implement a storage plugin for Proxmox. Now you can use Vitastor with Proxmox! - Implement `vitastor-cli df` (pool space usage statistics) command - Add glob pattern support for `vitastor-cli ls` - Fix several bugs in other CLI commands (resize, create --parent, modify --readonly) - Use 512 byte logical block size in QEMU driver by default (and thus don't require to set it in QEMU options)	2021-12-10 21:40:12 +03:00
Vitaliy Filippov	f54ff6ad5d	Do not crash in simple-offsets when some options are empty, too	2021-12-10 12:27:25 +03:00
Vitaliy Filippov	b376ef2ed9	Do not crash on empty matched_addrs	2021-12-10 11:40:59 +03:00
Vitaliy Filippov	5a234588b9	Do not die when invoked via `vita` symlink	2021-12-10 02:45:16 +03:00
Vitaliy Filippov	0ee5e0a7fe	Implement vitastor-cli df command	2021-12-10 02:37:02 +03:00
Vitaliy Filippov	3482bb0860	Fix readonly/readwrite option parsing	2021-12-10 00:52:59 +03:00
Vitaliy Filippov	526995f486	Do not skip empty iops in listings	2021-12-10 00:52:59 +03:00
Vitaliy Filippov	8dfbd7943c	Use logical block size = 512 bytes by default	2021-12-08 23:43:40 +03:00
Vitaliy Filippov	3a83a32cb7	Aaand now fix create --parent :D	2021-12-08 23:00:34 +03:00
Vitaliy Filippov	20d5ed799a	Add glob pattern matching for ls	2021-12-08 23:00:34 +03:00
Vitaliy Filippov	b262938bca	Fix naggy "Failed to get RDMA device list: Unknown error -38"	2021-12-08 02:02:30 +03:00
Vitaliy Filippov	c3c2e68cc1	Now fix resize command :D	2021-12-05 01:38:08 +03:00
Vitaliy Filippov	aa1e21dd99	Release 0.6.9 New features: - Build Vitastor driver as part of QEMU - Implement renaming images in CLI (vitastor-cli modify --rename) - Add vitastor-cli alloc-osd and simple-offsets commands and use them in make-osd, thus removing the dependency on etcdctl - Make monitor remove stale deleted inode statistics from etcd automatically - Implement OSD address selection from a subnet, thus removing the need to specify OSD addresses in startup scripts explicitly Bug fixes: - Fix client failover in case of etcd shutdown or crash (make client survive etcd failures) - Stick to the last live etcd in OSD and mon to prevent random failures when one of etcds is down - Fix incorrect copying of data from journal to the data device which could lead to data corruption - Prefer local etcd IPs in OSD - Remove the total PG count restriction in optimize_change which was sometimes leading to inability to redistribute PGs over OSDs - Fix error response parsing on a failed pg state report - Fix slow linear writes with RDMA by changing default buffer settings - Fix possible 'TypeError' in openstack nova when using Vitastor cinder driver - Fix bugs in vitastor-cli create, ls, rm, modify commands Patch changes: - Add a patch for libvirt 7.6 - Add patches for QEMU 6.0 and 6.1 - Fix config file path XML location parsing in libvirt patches - Replace _ with - in QEMU options - Fix possible 'TypeError' in openstack nova when using Vitastor cinder driver - Fix possible crashes of QEMU block driver in case of incorrect options	2021-12-03 10:58:54 +03:00
Vitaliy Filippov	9fca01dc62	Add a forgotten return statement	2021-12-03 00:41:49 +03:00
Vitaliy Filippov	0bd3a94efd	Use qdict_get_try_int because qdict_get_int may segfault on a missing key	2021-12-03 00:22:17 +03:00
Vitaliy Filippov	5fe3a40416	More fixes for QEMU 2.x :)	2021-12-02 02:25:50 +03:00
Vitaliy Filippov	15957b7d13	Update QEMU 4.2 patch and CentOS 7 QEMU 4.2 spec patch	2021-12-02 01:03:19 +03:00
Vitaliy Filippov	5859f913fc	Fix client failover in case of etcd shutdown or crash	2021-12-01 00:33:02 +03:00
Vitaliy Filippov	92362027a8	Build vitastor driver as part of the QEMU package by default Old behaviour can be restored with cmake var WITH_QEMU=true	2021-11-29 02:05:26 +03:00
Vitaliy Filippov	c4aeeda143	Fix index removal in vitastor-cli rm	2021-11-29 02:00:05 +03:00
Vitaliy Filippov	24f0f8278a	Fix modify --readwrite	2021-11-29 01:52:21 +03:00
Vitaliy Filippov	95496d0845	Implement renaming images in CLI (vitastor-cli modify --rename)	2021-11-28 22:38:57 +03:00
Vitaliy Filippov	94b1f09ef2	Create snapshots in the same pool by default	2021-11-28 21:50:42 +03:00
Vitaliy Filippov	7a0b5212fe	Exit if unable to restart watches FIXME: It's probably not OK for the client to exit in this case	2021-11-28 01:43:31 +03:00
Vitaliy Filippov	ce5b6253ab	Make OSDs stick to the last successful etcd address Previously OSDs were selecting a new random etcd from the cluster on every request so they were failing randomly when part of etcds was down	2021-11-27 23:48:56 +03:00
Vitaliy Filippov	8398ad0117	Fix #36 - Fix old version data sometimes overriding new version data Reproduction case: - v3 = (offset 4kb, length 16kb) - v2 = (offset 24kb, length 16kb) - v1 = (offset 16kb, length 16kb) - At the third step it was inserting 16..24kb instead of 20..24kb	2021-11-27 01:17:45 +03:00
Vitaliy Filippov	fea451b4db	Prefer local etcd in OSD	2021-11-27 00:36:53 +03:00
Vitaliy Filippov	7b7f20fb89	Merge pull request #34 from mirrorll/master report pg state failed	2021-11-25 10:26:42 +03:00
Vitaliy Filippov	300d507026	Fix capture of out in alloc_osd	2021-11-25 10:20:01 +03:00
harley	6886171289	report pg state failed after report pg state failed parse response error	2021-11-25 09:34:34 +08:00
Vitaliy Filippov	43f8ea47a0	Ok, something is not allowed somewhere in C99	2021-11-24 11:28:10 +03:00
Vitaliy Filippov	6e0e172e15	Implement OSD address selection from a specified subnet	2021-11-23 21:59:26 +03:00
Vitaliy Filippov	879fe9b2b4	Add a patch for qemu 6.1 and replace _ with - in qemu options	2021-11-21 16:24:30 +03:00
Vitaliy Filippov	660c3f7b0d	Change default RDMA settings to 128x 129K buffers 129K to leave extra space for the header The problem with 8x 1M buffers is that the following happens with, for example, 2 OSDs and 4M T1Q1 write: - Server posts 8 receives - Client posts 8 sends - WRs are processed by the RDMA stack, but the OSD doesn't have the time to handle them and doesn't refill buffers - Client posts 1 more send - RNR retransmission happens and performance drops to zero Overall it seems that RDMA support should be reworked to use real 'RDMA' operations i.e. operations writing into remote memory. This has an additional advantage of avoiding a copy at the receive side of the OSD.	2021-11-21 12:05:52 +03:00
Vitaliy Filippov	f0ebfae3b8	Fix vitastor-cli alloc-osd, use vitastor-cli in make-osd.sh	2021-11-21 00:01:03 +03:00
Vitaliy Filippov	eb7ad2c114	Fix empty size syntax, use C version of simple-offsets in tests	2021-11-20 23:51:26 +03:00
Vitaliy Filippov	cd21ff0b6a	Rewrite simple-offsets.js in C/C++	2021-11-19 02:39:56 +03:00
Vitaliy Filippov	d3903f039c	Implement alloc-osd (allocate a new OSD number) command	2021-11-19 02:39:37 +03:00
Vitaliy Filippov	c5029961ea	Oops. Fix vitastor-cli ls	2021-11-16 12:39:41 +03:00
Vitaliy Filippov	920345f7b6	Release 0.6.8 - Build separate packages for OSD, monitor, client, C header, fio and QEMU drivers instead of one package which included everything	2021-11-15 00:49:21 +03:00
Vitaliy Filippov	75b47a6298	Generate pkg-config file	2021-11-15 00:49:21 +03:00
Vitaliy Filippov	7eabc364bf	Release 0.6.7 - Implement CLI commands for listing, viewing I/O statistics, creating, snapshotting, cloning, resizing and modifying images. All these operations are covered by 3 commands: ls, create, modify - Implement an important fix to prior OSD set tracking for PGs. The previous version had an issue which could lead to data loss due to an OSD with older copy of the data thinking it has the newest copy - Fix I/O statistics aggregation in the monitor - Several minor fixes for Cinder driver - Fix QEMU driver to be compatible with QEMU 2.x > 2.0 - Fix stalls sometimes possible in configurations without immediate_commit due to insufficient amount of automatic internal fsync operations - Add `vita` alias for `vitastor-cli`	2021-11-13 23:23:55 +03:00
Vitaliy Filippov	a346f84c69	Allow to show only specific images in listing	2021-11-13 23:23:55 +03:00
Vitaliy Filippov	71a0c1a7b9	Fix list sorting	2021-11-13 23:23:55 +03:00
Vitaliy Filippov	110b39900b	Rename the new "set" command to "modify"	2021-11-13 22:39:17 +03:00
Vitaliy Filippov	42479b4590	Fix vitastor-nbd list, add ls alias	2021-11-13 22:39:17 +03:00
Vitaliy Filippov	6e82044e84	Add `vita` symlink	2021-11-13 22:39:17 +03:00
Vitaliy Filippov	2cb3e84882	Implement CLI set (resize, change readonly status) command	2021-11-13 22:39:17 +03:00
Vitaliy Filippov	aa436027c8	Report pg/history from OSD on every degraded activation Required to prevent data loss due to activation of an OSD with older data when PG OSD set change doesn't occur. I.e. fixes the simplest case: - Run 2 OSDs with 1 PG - Start writing into the PG - Stop OSD 2 - Stop OSD 1 - Start OSD 2 After this change the PG will refuse to start after the last step.	2021-11-13 22:39:17 +03:00
Vitaliy Filippov	577a563b91	Allow to disable colored output	2021-11-11 01:41:58 +03:00
Vitaliy Filippov	e4efa2c08a	Improve vitastor-cli ls - show I/O statistics, allow to sort & limit output	2021-11-11 01:41:58 +03:00
Vitaliy Filippov	d528cd77f1	Fix install_symlink	2021-11-09 16:42:29 +03:00
Vitaliy Filippov	4d43774cbb	Use 5s etcd_report_interval by default	2021-11-09 01:27:12 +03:00
Vitaliy Filippov	a1488f7217	Fix qemu_driver to build with QEMU 2.x (previously it was only correct for QEMU 2.0)	2021-11-08 23:07:31 +03:00
Vitaliy Filippov	404e07d365	Implement image/snapshot/clone creation and listing by pool	2021-11-07 01:01:07 +03:00
Vitaliy Filippov	b3dcee0d43	Also print "bare" inodes with missing config if they occupy space	2021-11-06 14:56:41 +03:00
Vitaliy Filippov	609bd4eb59	Remove naggy RDMA messages when log level is zero	2021-11-06 14:36:23 +03:00
Vitaliy Filippov	8e445ddc9a	Begin to implement CLI: implement listing, add help, add create stub	2021-11-06 14:32:19 +03:00
Tân Lê	e889ac4209	Fix building QEMU 3.1	2021-11-05 13:45:51 +07:00
Vitaliy Filippov	cfe8de9b84	Autosync based on number of unstable ops to prevent journal stalls	2021-10-30 14:26:48 +03:00
Vitaliy Filippov	fb2f7a0d3c	Release 0.6.6 - New command-line tool: vitastor-cli - Implement layer (snapshot/clone) merge and delete - Remove 'bool' from the C header - Fix a very rare flusher stall - More diagnostics now printed for slow ops in the log	2021-10-19 02:26:37 +03:00
Vitaliy Filippov	38d85da19a	Fix build for older gcc	2021-10-19 02:26:37 +03:00
Vitaliy Filippov	89dcda1fed	Remove "bool" from the C header	2021-10-18 01:49:07 +03:00
Vitaliy Filippov	1526e2055e	Do not crash with RDMA when receiving garbage, free RDMA buffers when connection is closed	2021-10-15 23:56:22 +03:00
Vitaliy Filippov	74cb3911db	Rebase children of the "inverse" child when it is removed, change /index/image/%s keys during metadata ops	2021-09-26 13:41:48 +03:00
Vitaliy Filippov	d5efbbb6b9	Rename commands and add CLI help	2021-09-26 13:14:36 +03:00
Vitaliy Filippov	4319091bd3	Implement "inverse merge" optimisation	2021-09-26 12:59:04 +03:00
Vitaliy Filippov	6d307d5391	Ignore "readonly" flag when merging snapshots	2021-09-26 11:32:42 +03:00
Vitaliy Filippov	065dfef683	Rename vitastor-cmd to vitastor-cli	2021-09-26 00:52:05 +03:00
Vitaliy Filippov	4d6b85fe67	Split one big cmd.cpp into multiple files	2021-09-26 00:48:08 +03:00
Vitaliy Filippov	2dd2f29f46	Move get_inode_cfg to cli_tool_t	2021-09-25 23:36:45 +03:00
Vitaliy Filippov	fc3a1e076a	Fix minor bugs in snapshot removal, check it in tests	2021-09-25 19:30:29 +03:00
Vitaliy Filippov	3a3e168c42	Implement high-level snapshot flatten and remove commands	2021-09-25 01:36:44 +03:00
Vitaliy Filippov	95c55da0ad	Implement merge with CAS	2021-08-01 20:06:05 +03:00
Vitaliy Filippov	5cf1157f16	Return real version on CAS failure	2021-08-01 20:05:19 +03:00
Vitaliy Filippov	acf637950c	Implement layer merge A new command merges multiple snapshot/clone layers into one of them, so merged layers can be deleted after this procedure	2021-07-31 00:23:30 +03:00
Vitaliy Filippov	a02b02eb04	Use new listing methods in rm_inode	2021-07-20 00:19:34 +03:00
Vitaliy Filippov	7d3d696110	Implement object listing with controllable parallelism in cluster_client	2021-07-20 00:19:34 +03:00
Vitaliy Filippov	712576ca75	Merge pull request #13 from lnsyyj/wip-vitastor-debug fix BLOCKSTORE_DEBUG, error: ‘dirty_it’ was not declared in this scope	2021-07-18 01:25:05 +03:00
Vitaliy Filippov	28bd94d2c2	Make diagnostics slightly better	2021-07-18 01:24:38 +03:00
Vitaliy Filippov	148ff04aa8	Do not lose flusher queue entries when an "older object rescan" happens in parallel with flushing of an older version of another object	2021-07-18 01:20:54 +03:00
JiangYu	e86df4a2a2	fix BLOCKSTORE_DEBUG, error: ‘dirty_it’ was not declared in this scope Signed-off-by: JiangYu <lnsyyj@hotmail.com>	2021-07-18 00:46:05 +08:00
Vitaliy Filippov	e74af9745e	Print journal flusher diagnostics on slow ops	2021-07-17 16:13:41 +03:00
Vitaliy Filippov	0e0509e3da	Dump op states in slow operation log	2021-07-16 01:58:50 +03:00
Vitaliy Filippov	cb282d25e0	Release 0.6.5 - Basic support for OpenStack: Cinder driver, patches for Nova and libvirt - Add missing "image" and "config_path" QEMU options - Calculate aggregate per-pool statistics in monitor - Implement writes with Check-And-Set semantics - Add a C wrapper library with public header	2021-07-10 11:01:21 +03:00
Vitaliy Filippov	b52dd6843a	Rename qemu_rbd_unescape and qemu_rbd_next_tok to _vitastor_	2021-07-03 23:14:44 +03:00
Vitaliy Filippov	b66160a7ad	Aggregate per-pool statistics in mon	2021-07-03 23:14:44 +03:00
Vitaliy Filippov	aad7792d3f	Check for loops in parent inode chains	2021-06-20 00:23:03 +03:00
Vitaliy Filippov	6ca8afffe5	Add CAS version parameter to the C wrapper	2021-06-19 01:00:52 +03:00
Vitaliy Filippov	511a89948b	Rework qemu_proxy into a C wrapper library with public header	2021-06-19 00:39:11 +03:00
Vitaliy Filippov	3de553ecd7	Add a test for CAS write operation	2021-06-15 00:12:35 +03:00
Vitaliy Filippov	891250d355	Implement CAS writes From now on, reads will return the server-side object version numbers and writes and deletes will have an additional "version" parameter which, if set to a non-zero value, will be atomically compared with the current version of the object plus 1 and the modification will fail if it doesn't match. This feature opens the road to correct online flattening of snapshot layers and other interesting things.	2021-06-15 00:12:35 +03:00
Vitaliy Filippov	f9fe72d40a	Release 0.6.4 - Implement a basic Kubernetes CSI driver - Minor fixes for vitastor-nbd - Fix build without RDMA broken in 0.6.3	2021-05-16 01:38:01 +03:00
Vitaliy Filippov	eaac1fc5d1	Log to stderr in etcd_state_client, too	2021-05-16 01:09:25 +03:00
Vitaliy Filippov	57be1923d3	Daemonize NBD_DO_IT process, correctly cleanup unmounted NBD clients	2021-05-16 01:09:25 +03:00
Vitaliy Filippov	c467acc388	Fix /v3 appendage to etcd URLs without /v3	2021-05-15 19:22:24 +03:00
Vitaliy Filippov	bf591ba3ee	Fix nbd module load check	2021-05-15 19:22:24 +03:00
Vitaliy Filippov	699a0fbbc7	Log to stderr instead of stdout in client	2021-05-15 19:22:24 +03:00
Vitaliy Filippov	6b2dd50f27	Fix build without RDMA	2021-05-08 18:20:43 +03:00
Vitaliy Filippov	caf2f3c56f	Release 0.6.3 - RDMA support - Client performance optimisations (4k randread ~120k -> ~180k on 1 core) - JSON configuration file (/etc/vitastor/vitastor.conf) support - Bug fixes	2021-05-02 17:47:43 +03:00
Vitaliy Filippov	9174f188b1	Build packages with libibverbs For CentOS 7 it also requires newer rdma-core as CentOS 7's native version doesn't have implicit ODP support. The updated version is already uploaded into the vitastor repo.	2021-05-02 17:47:16 +03:00
Vitaliy Filippov	d3978c6d0e	Do not print RDMA connection messages when log_level=0 By the way, it's 1 by default in the OSD, so these messages will still be there in OSD logs	2021-05-01 00:26:09 +03:00
Vitaliy Filippov	4a7365660d	Do not wait for down OSDs during sync Fixes a hang introduced in 0.5.11 in the non-immediate_commit mode	2021-05-01 00:26:07 +03:00
Vitaliy Filippov	818ae5d61d	Some config parsing fixes	2021-05-01 00:20:01 +03:00
Vitaliy Filippov	f6f35f4127	Pass options correctly to not override /etc/vitastor/vitastor.conf	2021-04-30 01:17:44 +03:00
Vitaliy Filippov	72aa2fd819	Make OSD and client read common configuration from /etc/vitastor/vitastor.conf	2021-04-30 01:11:27 +03:00
Vitaliy Filippov	5010b0dd75	Use json11 instead of blockstore_config_t	2021-04-30 00:52:46 +03:00
Vitaliy Filippov	483c5ab380	Negotiate max_msg instead of max_sge, make buffer settings more conservative :-)	2021-04-29 11:10:35 +03:00
Vitaliy Filippov	6a6fd6544d	Add RDMA options to the QEMU driver	2021-04-29 11:02:49 +03:00
Vitaliy Filippov	971aa4ae4f	Implement RDMA receive with memory copying (send remains zero-copy) This is the simplest and, as usual, the best implementation :) 100% zero-copy implementation is also possible (see rdma-zerocopy branch), but it requires to create A LOT of queues (~128 per client) to use QPN as a 'tag' because of the lack of receive tags and the server may simply run out of queues. Hardware limit is 262144 on Mellanox ConnectX-4 which amounts to only 2048 'connections' per host. And even with that amount of queues it's still less optimal than the non-zerocopy one. In fact, newest hardware like Mellanox ConnectX-5 does have Tag Matching support, but it's still unsuitable for us because it doesn't support scatter/gather (tm_caps.max_sge=1).	2021-04-29 02:34:45 +03:00
Vitaliy Filippov	9e6cbc6ebc	Negotiate max_sge between RDMA client & server	2021-04-29 02:15:20 +03:00
Vitaliy Filippov	ce777319c3	WIP RDMA support Basic naive implementation works, but it's highly non-optimal as RNR retransmissions occur all the time. RDMA expects the receiver to always have place for incoming WRs...	2021-04-29 02:03:54 +03:00
Vitaliy Filippov	f8ff39b0ab	Rework continue_ops() to remove a CPU hot spot This rework increases fio -rw=randread -iodepth=128 result from ~120k to ~180k iops :)	2021-04-29 01:50:13 +03:00
Vitaliy Filippov	d749159585	Linked list experiment Rework client operation queue from a vector to a linked list. This is required to rework continue_ops() as its current implementation consumes ~25% of client process CPU.	2021-04-29 01:47:33 +03:00
Vitaliy Filippov	9703773a63	Fix has_flushes setting	2021-04-28 23:40:44 +03:00
Vitaliy Filippov	5d8d486f7c	Add SOVERSION	2021-04-20 01:01:32 +03:00
Vitaliy Filippov	2b546cdd55	Link vitastor_blk with vitastor_common for timerfd_manager_t Not really required to operate, but fixes a verify-elf error	2021-04-20 00:51:53 +03:00
Vitaliy Filippov	bd7b177707	Report sensitive configuration values instead of the configuration source	2021-04-17 23:11:16 +03:00
Vitaliy Filippov	82e6aff17b	Support mapping NBD by the image name	2021-04-17 17:39:55 +03:00
Vitaliy Filippov	57e2c503f7	Rename osd_t::c_cli to msgr	2021-04-17 16:32:09 +03:00
Vitaliy Filippov	715bc8d53d	Release 0.6.2 - Fix a possible crash during SYNC when journal fsyncs are enabled - Fix a memory leak in the chained read implementation	2021-04-15 23:40:06 +03:00
Vitaliy Filippov	0af077701c	Fix a possible crash during SYNC when journal fsyncs are enabled	2021-04-15 02:01:50 +03:00


860 Commits (master)