vitastor

Commit Graph

Author	SHA1	Message	Date
Vitaliy Filippov	f72f14e6a7	Clear old PG states, history, and OSD states on etcd state reload. Also add protection from etcd watcher messages being split into multiple websocket messages - I'm not sure if etcd actually does that, but it's better to have extra protection anyway. Also check that all etcd watchers are started in the keepalive routine, otherwise it sometimes tries to revive etcd watchers starting with revision=1 which obviously always fails because this revision is nearly always compacted. All these changes should fix an old rarely reproduced bug where SOMETIMES OSDs didn't react to PG config changes which was leading to offline pools on node reboot. It happened on the full reload of state from etcd.	2023-12-24 02:02:13 +03:00
Vitaliy Filippov	7cea642f4a	Fix vitastor-nbd image existence check not working because of non-zeroed inode_watch fields	2023-12-19 01:11:37 +03:00
Vitaliy Filippov	3c924397e7	Store next scrub timestamp instead of last scrub timestamp	2023-05-20 23:19:39 +03:00
Vitaliy Filippov	c3bd26193d	Implement PG scrub runner	2023-05-20 23:19:39 +03:00
Vitaliy Filippov	d06ed2b0e7	Implement online config update	2023-03-26 19:21:50 +03:00
Vitaliy Filippov	5a10d135f3	Allow to configure block_size, bitmap_granularity and immediate_commit per-pool	2022-08-11 01:56:33 +03:00
Vitaliy Filippov	7c2379d458	Simplified NFS proxy based on own NFS/XDR implementation	2022-05-07 01:01:20 +03:00
Vitaliy Filippov	a2189100dd	Make CLI functions usable in library form. Return results and errors in a variable instead of just printing them, separate vitastor-cli main() from cli_tool_t, move positional argument parsing to CLI main from command implementations.	2022-05-06 02:18:32 +03:00
Vitaliy Filippov	d71cc174e3	Implement CLI status command	2022-04-09 00:25:51 +03:00
Vitaliy Filippov	ba63af49b4	Add etcd retries everywhere (they were missing in some places)	2022-01-23 17:21:48 +03:00
Vitaliy Filippov	a5cf06acd0	Remove etcd timeout and keepalive interval hardcode	2022-01-23 00:00:00 +03:00
Vitaliy Filippov	098e369a3b	Fix rand initialization, add etcd connection/disconnection logging	2022-01-20 00:45:49 +03:00
Vitaliy Filippov	5473d5b4a2	Rework HTTP client to use keepalive, move getifaddr_list to addr_util	2022-01-03 14:52:01 +03:00
Vitaliy Filippov	5859f913fc	Fix client failover in case of etcd shutdown or crash	2021-12-01 00:33:02 +03:00
Vitaliy Filippov	ce5b6253ab	Make OSDs stick to the last successful etcd address. Previously OSDs were selecting a new random etcd from the cluster on every request so they were failing randomly when part of etcds was down.	2021-11-27 23:48:56 +03:00
Vitaliy Filippov	fea451b4db	Prefer local etcd in OSD	2021-11-27 00:36:53 +03:00
Vitaliy Filippov	8e445ddc9a	Begin to implement CLI: implement listing, add help, add create stub	2021-11-06 14:32:19 +03:00
Vitaliy Filippov	74cb3911db	Rebase children of the "inverse" child when it is removed, change /index/image/%s keys during metadata ops	2021-09-26 13:41:48 +03:00
Vitaliy Filippov	5010b0dd75	Use json11 instead of blockstore_config_t	2021-04-30 00:52:46 +03:00
Vitaliy Filippov	6950b8e3a0	Watch inode metadata revisions	2021-04-10 17:44:12 +03:00
Vitaliy Filippov	2612d3198a	Introduce image names and metadata storage in etcd. Each inode has: image name, parent inode number & pool, size and readonly flag. Snapshots are created by switching image name to a different inode number while using the older inode as parent.	2021-04-10 17:44:12 +03:00
Vitaliy Filippov	691f066055	Actual snapshot support (untested)	2021-04-10 17:44:12 +03:00
Vitaliy Filippov	a48e2bbf18	Fix write replay ordering when immediate_commit != all. Previous implementation didn't respect write ordering and could lead to corrupted data when restarting writes after an OSD outage. Also rework cluster_client queueing logic and add tests for it to verify the correct behaviour.	2021-04-03 14:51:52 +03:00
Vitaliy Filippov	3e162d95a0	Remove http_client.h include from etcd_state_client.h	2021-04-03 14:36:04 +03:00
Vitaliy Filippov	9ac7e75178	Allow to specify etcd URLs for OSDs with http://, do not die with a strange error if -etcd option is missing for fio	2021-03-16 12:48:26 +03:00
Vitaliy Filippov	bc742ccf8c	Fix a small memory leak in etcd_state_client	2021-03-08 17:04:10 +03:00
Vitaliy Filippov	bf9a175efc	Move C/C++ sources to src subdirectory	2021-02-25 23:59:03 +03:00

27 Commits (f72f14e6a7dc0ee62c3781501dbf68aa31b78556)