vitastor

Commit Graph

Author	SHA1	Message	Date
Vitaliy Filippov	829381b335	Extract some definitions to msgr_op.{cpp,h}	2021-04-03 14:36:04 +03:00
Vitaliy Filippov	54f2353f24	Use bitmap granularity for alignment checks	2021-04-03 14:36:04 +03:00
Vitaliy Filippov	e47f6fba60	Remove cluster_client_t::stop()	2021-04-03 14:35:42 +03:00
Vitaliy Filippov	883bf84a16	Fix build	2021-04-03 01:47:15 +03:00
Vitaliy Filippov	52097c4856	Stop flushing when less than min_flusher_count operations are available (unless a trim is forced)	2021-04-03 00:53:28 +03:00
Vitaliy Filippov	e1355cbc74	Report failed operation name in cluster_client	2021-04-03 00:53:28 +03:00
Vitaliy Filippov	8f8b90be7a	Add min_flusher_count configuration	2021-04-03 00:53:28 +03:00
Vitaliy Filippov	ad9f619370	Skip double allocs when reading journal	2021-04-03 00:53:28 +03:00
Vitaliy Filippov	f4769ba7c7	Collapse create+delete journal entry pairs if they're already flushed Old journal replay mechanism could lead to a double allocation of the same block and a "Fatal error: tried to overwrite non-zero metadata entry"	2021-04-03 00:53:28 +03:00
Vitaliy Filippov	843b7052d2	Add an assertion when clearing deleted metadata entries, add debug details when freeing blocks	2021-04-03 00:53:28 +03:00
Vitaliy Filippov	df99e232ee	Deduplicate osd_sets in pg history + raise request size limit for etcd	2021-04-03 00:53:28 +03:00
Vitaliy Filippov	3a40fa4127	Fix monitor errors in case of OSD removal	2021-03-27 01:15:18 +03:00
Vitaliy Filippov	4095bcc558	Do not ignore object deletion journal entries when they are preceded by a big write	2021-03-25 11:00:10 +03:00
Vitaliy Filippov	564d64e271	Add some details for debug prints	2021-03-25 11:00:10 +03:00
Vitaliy Filippov	cf54741c95	Followup to `05db1308aa` Don't do anything with the object state after errors because it's freed by PG re-peer in this case	2021-03-25 11:00:10 +03:00
Vitaliy Filippov	18a5fafa2a	Fix rollback	2021-03-25 02:41:58 +03:00
Vitaliy Filippov	06f4978085	Fix fsync check in blockstore_flush (data fsyncs were disabled instead of journal fsyncs)	2021-03-25 02:41:58 +03:00
Vitaliy Filippov	7ebf1588c5	Check for immediate_commit==small in the OSD code	2021-03-25 02:41:58 +03:00
Vitaliy Filippov	b0ad1e1e6d	Remember writes as "unsynced" only after completing them Previously BS_OP_SYNC could take unfinished writes and add them into the journal before they were actually completed. This was leading to crashes with the message "BUG: Unexpected dirty_entry 2000000000001:9f2a0000 v3 unstable state during flush: 338"	2021-03-25 02:41:58 +03:00
Vitaliy Filippov	0949f08407	Extract osd_primary write and sync code into separate files	2021-03-24 14:20:56 +03:00
Vitaliy Filippov	04a1f18fa5	Assign .req as a whole to always zero out the remaining part Also clear .reply before processing the operation	2021-03-24 14:20:56 +03:00
Vitaliy Filippov	cf9a641d66	Skip disconnected OSDs during sync	2021-03-24 14:20:56 +03:00
Vitaliy Filippov	05db1308aa	Fix two potential read/write ordering problems (even though not yet seen in tests) - Write operations could be 'stabilized' and previous versions could be purged from OSDs before the removal of version_override and following reads could potentially hit different version in EC pools - Object was marked clean after completing the delete during recovery, so reads could in theory hit a deleted version and return nothing	2021-03-24 14:20:56 +03:00
Vitaliy Filippov	98b54ca948	Don't try to "recover" misplaced objects if it would make them degraded	2021-03-21 01:37:23 +03:00
Vitaliy Filippov	23225c5e62	Do not run ping on clients that are not yet connected	2021-03-21 01:37:23 +03:00
Vitaliy Filippov	7e6e1a5a82	Release 0.5.10 The version seems to be stable after this bunch of fixes :) - Fix delete & write operation ordering during rebalance to not lose objects in the immediate_commit=off mode - Fix a possible crash caused by very high iodepths - Re-distribute PG primaries over OSDs that come up after a short downtime - Allow to specify etcd URLs for OSDs with http://, do not die with a strange error if -etcd option is missing for fio - Fix a journal flushing deadlock which sometimes occurred in the immediate_commit=off mode - Fix a bug where OSDs could hang if the data device filled up - Fix an allocator bug where it was unable to allocate up to last (n%64) data device blocks - Fix monitor crash that occurred on removal of some etcd keys - Fix a bug where PGs could remain incomplete due to incorrect PG history with just zeroes in osd_sets	2021-03-16 12:48:26 +03:00
Vitaliy Filippov	435045751d	Delete objects only after a SYNC during rebalance in the non-immediate_commit mode Previously OSDs could commit deletes before writes during recovery or rebalance in the "lazy fsync" (immediate_commit=off) mode which could result in lost objects	2021-03-16 12:48:26 +03:00
Vitaliy Filippov	c5fb1d5987	Do not duplicate blockstore operations when io_uring fills up This bug was leading to OSDs dying with "Assertion `fulfilled == read_op->len' failed" when testing fio -rw=randread -numjobs=8 -iodepth=128	2021-03-16 12:48:26 +03:00
Vitaliy Filippov	9f59381bea	Re-distribute PG primaries over OSDs that come up after a short downtime	2021-03-16 12:48:26 +03:00
Vitaliy Filippov	9ac7e75178	Allow to specify etcd URLs for OSDs with http://, do not die with a strange error if -etcd option is missing for fio	2021-03-16 12:48:26 +03:00
Vitaliy Filippov	88671cf745	Fix a bug causing all flushers to wait for an fsync without actually trying to do it This happened because flusher_count became dynamic and fsync_batch() was comparing the number of flushers currently ready to do an fsync with the maximum number of flushers. Also the number wasn't rechecked on every loop which was also incorrect. Now the interrupted_rebalance test passes even without IMMEDIATE_COMMIT=1.	2021-03-13 17:27:29 +03:00
Vitaliy Filippov	fe1749c427	Fix the multiple_interrupted_rebalance test	2021-03-13 17:19:45 +03:00
Vitaliy Filippov	ceb9c28de7	Set default log_level before passing config to etcd_state_client	2021-03-13 17:19:45 +03:00
Vitaliy Filippov	299d7d7c95	Use common macro for get_sqe	2021-03-13 17:19:45 +03:00
Vitaliy Filippov	d1526b415f	Correctly resume writes when OSD is full to return an error	2021-03-13 17:19:45 +03:00
Vitaliy Filippov	f49fd53d55	Fix a bug where allocator was unable to allocate up to last (n%64) blocks, add tests for it	2021-03-13 02:19:02 +03:00
Vitaliy Filippov	dd76eda5e5	Test multiple interrupted rebalancings Currently only passes with immediate_commit=all configuration (env variable IMMEDIATE_COMMIT=1 for the bash script)	2021-03-12 12:55:44 +03:00
Vitaliy Filippov	87dbd8fa57	Use empty hash as the default value for some etcd keys in the monitor	2021-03-12 12:40:15 +03:00
Vitaliy Filippov	b44f49aab2	Ignore zero OSDs in history osd_sets	2021-03-12 12:40:15 +03:00
Vitaliy Filippov	036555638e	Release 0.5.9 - Fix two monitor bugs which led to objects being "logically lost" (physically present on some secondary OSDs while primary doesn't know about it) after multiple interrupted rebalancings - Implement "no_recovery" and "no_rebalance" flags	2021-03-11 00:39:10 +03:00
Vitaliy Filippov	af5155fcd9	Implement "no_recovery" and "no_rebalance" flags	2021-03-11 00:36:31 +03:00
Vitaliy Filippov	0d2efbecc9	Preserve previous PG history when changing PG distribution Fixes incorrect PG history in case when a new rebalance is started before the finish of the previous one which could make primary OSDs unable to locate some objects on some secondaries.	2021-03-11 00:16:10 +03:00
Vitaliy Filippov	e62e8b6bae	Use real pg configuration instead of the "last clean" one for generating PG history Basically fixes the bug introduced in 0.5.7 where a rebalance interrupted by the monitor could result in forgetting objects moved to the new place	2021-03-10 02:01:44 +03:00
Vitaliy Filippov	c4ba24c305	Do not print ping op latency	2021-03-10 02:01:44 +03:00
Vitaliy Filippov	19e47a0279	Release 0.5.8 - Add heartbeats (fixes failover in case of network issues or offline nodes) - Fix a bug where a PG could incorrectly become listed as 'incomplete' if historical osd_sets included a set with the PG's primary OSD as the only alive one - Use osd_out_time = 10 minutes by default instead of 30 minutes - Make monitors stick to a single selected etcd URL on start and not try to select random ones on every request - this was leading to etcd interaction errors when some etcds were unavailable	2021-03-09 02:38:17 +03:00
Vitaliy Filippov	bd178ac20f	Fix history osd_set check - local OSD is always available!	2021-03-09 02:18:18 +03:00
Vitaliy Filippov	7006875a24	Make monitor stick to one etcd until the restart	2021-03-09 02:15:38 +03:00
Vitaliy Filippov	ad577c4aac	Add PING operation and timeouts to detect OSD failures when a host goes down	2021-03-09 02:15:38 +03:00
Vitaliy Filippov	836635c518	Use osd_out_time = 10 minutes by default	2021-03-09 02:15:38 +03:00
Vitaliy Filippov	88a03f4e98	Release 0.5.7 - Fix multiple bugs leading to OSDs sometimes being unable to correctly activate PGs when a lot of PG peering events occurred in a small amount of time - Fix a bug where OSDs could list incomplete object versions during peering. The bug manifested with "local rollback operation failed" messages in OSD logs - Fix a bug where misplaced chunks for degraded and incomplete objects were not removed from extra OSDs during recovery - Fix incorrect PG history configuration resulting in OSDs being unable to find some of the objects after a PG count change - Simplify block layer write ordering logic - Avoid extra data move when a lot of OSDs are first stopped for long time and then restarted - Fix incorrect degraded & misplaced object statistics after a completed rebalance - Fix incorrect usage of pg_minsize instead of the minimal possible object chunk count in EC pools	2021-03-08 23:37:02 +03:00


645 Commits (829381b3353ce3dbbaeab386b78c71c6c0bc6681)
