vitastor

Author	SHA1	Message	Date
Vitaliy Filippov	df0cd85352	Fix another part of the "async sqe clear" bug (followup to `d9857a5340`)	2022-02-01 01:14:56 +03:00
Vitaliy Filippov	ebaf4d7a72	Fix compatibility with fio 3.28+	2022-01-31 23:39:14 +03:00
Vitaliy Filippov	d4bc10542c	Fix compatibility with liburing >= 2.1 where it only has __pad2[2]	2022-01-31 22:49:40 +03:00
Vitaliy Filippov	140309620a	Free recv_buf in nbd_proxy	2022-01-31 20:37:58 +03:00
Vitaliy Filippov	0a610ee943	Destroy the client after completing CLI command	2022-01-31 18:27:04 +03:00
Vitaliy Filippov	f3ce166064	Do not print nan% in df when a pool has no available OSDs	2022-01-31 18:23:57 +03:00
Vitaliy Filippov	717d303370	Handle get_sqe failures, don't die with "will fall out of sync" in epoll_manager. The problem is that in recent kernels io_uring may return completions BEFORE clearing the submission queue. For example, if its capacity is 512 and there were 512 requests, one of which completed, the queue "should have" 1 free slot when that completion is processed. But sometimes it doesn't, because io_uring doesn't always clear the submission queue before sending the CQE :-/	2022-01-31 02:52:20 +03:00
Vitaliy Filippov	d9857a5340	Check for SQEs, not for completions Should finally fix Assertion `sqe != NULL' failed introduced after journaling refactor in 0.6.11...	2022-01-31 02:19:10 +03:00
Vitaliy Filippov	eb5d9153e8	Fix build under centos 7	2022-01-30 20:29:44 +03:00
Vitaliy Filippov	ae6d1ed1d5	Remove completed items	2022-01-30 20:20:06 +03:00
Vitaliy Filippov	d123e58ea3	Fix yaml syntax - remove ` in default	2022-01-29 02:08:48 +03:00
Vitaliy Filippov	d9869d8116	Add parameter documentation	2022-01-28 02:45:54 +03:00
Vitaliy Filippov	4047ca606f	Add missing cancel_op(currently being read op) when stopping a client Fixes client hangs possible after stopping & restarting an osd. Hangs happened when a connection was closed in the middle of reading a READ operation reply from the network. In this case the operation being read was in read_op and the client didn't free it when closing the connection. Test case for msgr_read.cpp: - Partially read reply for a READ operation - stop_client() - Check that the READ operation returns EPIPE The bug was actually introduced in 0.5.11.	2022-01-28 01:53:52 +03:00
Vitaliy Filippov	218e294e9c	> 0, of course	2022-01-24 13:36:09 +03:00
Vitaliy Filippov	c1929cabe0	Release 0.6.12 etcd connection stability, clang & elbrus support - Fix build under Clang and Elbrus LCC compilers, making Vitastor compatible with Elbrus CPUs :) - Completely fix the bug where OSDs didn't connect to peers and incorrectly marked PGs as incomplete - Limit I/O depth for deletes the same way as for small writes. Makes OSD crashes with "Assertion failed: sqe != NULL" during image deletion go away - Fix a very old, but rare, journaling bug (credits to https://github.com/mirrorll) - Fix flushing of unclean journaled objects leading to OSDs sometimes hanging after failover in EC setups (bug was introduced in 0.6.7) - Fix several problems that could prevent smooth operation of a Vitastor cluster under the condition of partial etcd failure: - OSDs could randomly fail due to too strict error handling - New clients and OSDs could be unable to start because of the lack of retries - CLI could fail some commands because of the lack of retries - Monitor could stop receiving state updates because of the lack of websocket pings - Fix monitor being unable to rebalance PGs after a downscale of pool pg_size (3->2) - Exit with failure when trying to nbd map or benchmark a non-existing image - Use HTTP keep-alive for etcd connections - Allow to configure etcd request timeouts and retries - Allow to configure NBD timeout, max devices and partitions, and set default to up to 64 devices with up to 3 partitions each	2022-01-24 01:15:25 +03:00
Vitaliy Filippov	cc6b24e03a	Allow to configure NBD timeout, max devices and partitions Also set default NBD devices/partitions to 64/3, Linux default is 16/16 which is way too low	2022-01-24 01:15:19 +03:00
Vitaliy Filippov	0757ba630a	Do not happily NBD "map" non-existing images, do not try to benchmark them too	2022-01-23 23:03:42 +03:00
Vitaliy Filippov	2a0b881685	Respect max_write_iodepth for deletes	2022-01-23 22:05:23 +03:00
Vitaliy Filippov	9a15b843ff	Do not set pg_real_size to 0	2022-01-23 20:15:04 +03:00
Vitaliy Filippov	8dc1ffb13b	Try to connect with PG peers before deciding it's incomplete :) I already attempted to fix it in 0.6.11, but the fix turned out to be only partial :)	2022-01-23 19:19:26 +03:00
Vitaliy Filippov	ba63af49b4	Add etcd retries everywhere (they were missing in some places)	2022-01-23 17:21:48 +03:00
Vitaliy Filippov	31b9c683ee	Fix flushing of unclean objects This was preventing OSD failover when there were some unclean objects. Bug was introduced in `aa436027c8`	2022-01-23 00:45:11 +03:00
Vitaliy Filippov	3abcac058f	Check for double response_callback call more	2022-01-23 00:26:20 +03:00
Vitaliy Filippov	e01c4db702	Add paranoid if()s to prevent accidental double free of etcd_watch_ws	2022-01-23 00:16:09 +03:00
Vitaliy Filippov	a5cf06acd0	Remove etcd timeout and keepalive interval hardcode	2022-01-23 00:00:00 +03:00
Vitaliy Filippov	9c3653b1e1	Handle EINTR	2022-01-22 23:59:37 +03:00
Vitaliy Filippov	23e578b6a2	Fix common.sh	2022-01-21 01:51:25 +03:00
Vitaliy Filippov	7920414bee	Fix build under older gcc (debian buster)	2022-01-20 10:34:52 +03:00
Vitaliy Filippov	098e369a3b	Fix rand initialization, add etcd connection/disconnection logging	2022-01-20 00:45:49 +03:00
Vitaliy Filippov	a43ef525a2	Remove two last end()s from http_client (should have been removed in the keepalive patch)	2022-01-20 00:44:18 +03:00
Vitaliy Filippov	8a6b07d8f7	Add a 2/5 etcd failure test	2022-01-20 00:43:22 +03:00
Vitaliy Filippov	2c930d55fb	Merge pull request #41 from promobit-bitblaze/1-small-fix #1 fix deps	2022-01-18 11:19:08 +03:00
Mikhail Koshel	d798e0821e	#1 fix deps	2022-01-18 13:30:53 +06:00
Vitaliy Filippov	e591a3e9f7	Include sys/stat.h in messenger.cpp No idea why, but it builds without it on x86 and does not build on e2k	2022-01-17 13:43:29 +03:00
Vitaliy Filippov	77cc18420a	Fix leaks detected by clang scan-build (only 1 of 4 may be important though)	2022-01-16 00:11:59 +03:00
Vitaliy Filippov	7bdd92ca4f	Fix build under clang and some warnings Build problems fixed: - void* pointer arithmetic which is a GNU extension (works as byte*) - "variable size object may not be initialized" which is OK under GCC - nullptr_t related error in json11 (it lacks 'operator <' in clang) Warnings fixed: - empty nested struct initializer { 0 } replaced by {} - removed several unused lambda captures	2022-01-16 00:02:54 +03:00
Vitaliy Filippov	8f64fc61e7	Ignore empty events in mon	2022-01-08 11:41:00 +03:00
Vitaliy Filippov	4a9f001d9e	Make mon also ping etcd websockets regularly	2022-01-05 17:28:51 +03:00
Vitaliy Filippov	8c908316d9	Add a test with an OSD being added	2022-01-05 17:06:24 +03:00
Vitaliy Filippov	515a2e6e33	Only die when detecting a real race condition, not just a CAS failure	2022-01-05 17:05:25 +03:00
Vitaliy Filippov	68b6763ebe	Add asserts for lp-optimizer tests, pass `ordered` from the monitor	2022-01-03 20:37:07 +03:00
Vitaliy Filippov	9c6168bf17	Remove fill_parsed_response	2022-01-03 20:08:26 +03:00
Vitaliy Filippov	08e467270a	Fix pg_size changing from 3 to 2	2022-01-03 17:56:54 +03:00
Vitaliy Filippov	5473d5b4a2	Rework HTTP client to use keepalive, move getifaddr_list to addr_util	2022-01-03 14:52:01 +03:00
Vitaliy Filippov	c3304bce27	Merge pull request #38 from mirrorll/master journal check_available error	2021-12-31 12:45:16 +03:00
Vitaliy Filippov	ec2852c598	Add minsize_1 test	2021-12-28 10:54:36 +03:00
Vitaliy Filippov	b9f5c2a823	Support zero-copy send in fio_sec_osd to allow testing it Preliminary results: - CPU usage drops significantly. For example, in T1Q8 128K write test against stub_uring_osd with 10G network and Athlon X4 860k CPU it drops from 100% to 30% - Latency becomes slightly worse. In T1Q1 4K write test in the same environment latency increases from 56 to 63 us. - Small write throughput also becomes slightly worse. In T1Q128 4K write test against stub iops decreases from 138k to ~110k (unstable, fluctuates 100k..120k). Note that this is without io_uring, of course.	2021-12-27 02:12:44 +03:00
Vitaliy Filippov	e9d2f79aa7	Support reading bitmaps in fio_sec_osd	2021-12-27 02:12:44 +03:00
Vitaliy Filippov	0785bdf8b3	Release 0.6.11 - Slightly reduce journaling write amplification (requires no_same_sector_overwrites=false) - Fix listen_backlog (it was 0) because it could more than halve OSD socket send speed - Support IPv6 OSD addresses - Do not try to initialize client in simple-offsets - Fix OSDs sometimes marking PGs incomplete instead of trying to connect with peers - Allow to configure OSD placement in node_placement - Allow to run with 4k sector size block devices. Natural, but it was forbidden	2021-12-26 21:11:24 +03:00
Vitaliy Filippov	b57e44748b	Send 4 byte bitmap in stub_uring_osd	2021-12-25 11:38:13 +03:00

1068 Commits (30907852c2b859f0df20e61dbf7794c919779f9c)