vitastor

Commit Graph

Author	SHA1	Message	Date
Vitaliy Filippov	085c145a18	Document etcd data (to-be state with pools) at least in some form	2020-09-01 16:29:45 +03:00
Vitaliy Filippov	30da4bddbe	Extract scale_pg_count into a separate file	2020-09-01 16:18:58 +03:00
Vitaliy Filippov	14b4a4617e	(re)move placement_tree	2020-09-01 16:18:58 +03:00
Vitaliy Filippov	3932c9b2e2	Add WRITE_STABLE to the secondary OSD for the upcoming replication support	2020-09-01 16:18:58 +03:00
Vitaliy Filippov	2e8c69fc5b	Rename OSD_OP_SECONDARY_* to OSD_OP_SEC_*	2020-08-31 23:57:50 +03:00
Vitaliy Filippov	a86788fe3b	Support optimizing for the case when parity chunks occupy more space than data chunks. Mostly as an experiment because the problem solved by this commit comes from Ceph's EC+compression implementation details and I'm not sure if my implementation will be the same	2020-08-17 01:44:19 +03:00
Vitaliy Filippov	95ebfad283	Final name is Vitastor	2020-08-03 23:50:59 +03:00
Vitaliy Filippov	6022f28dc9	Add pseudo-random PG generation	2020-07-07 23:13:07 +03:00
Vitaliy Filippov	9d10a4d057	Support arbitrary pg_size in LPOptimizer	2020-07-05 20:28:05 +03:00
Vitaliy Filippov	ec7acc8f3a	Add WRITE_STABLE operation for future replication support	2020-07-05 01:48:02 +03:00
Vitaliy Filippov	416a80b099	Make blockstore object state a combination of type and workflow	2020-07-04 22:20:32 +03:00
Vitaliy Filippov	a7929931eb	Implement PG epochs to prevent the "version split". The "version split" is when: 1) A block is written to 1 OSD out of 3, all of them die; 2) OSDs 2 and 3 come up, the same block is written to both of them; 3) The remaining OSD comes up. Now all 3 OSDs have the same version of the same object, but with different data.	2020-07-04 00:55:27 +03:00
Vitaliy Filippov	e680d6c1c3	Rename reconstruct_stripe and calc_rmw_parity to indicate that they are only for XOR N+1	2020-06-30 10:40:43 +03:00
Vitaliy Filippov	9b33f598d3	Fix two more cluster client bugs: 1) Sync could delete an unfinished write due to the lack of ordering (fixed by introducing syncing_writes); 2) Writes could be postponed indefinitely due to bad resuming of operations after a sync	2020-06-27 02:13:35 +03:00
Vitaliy Filippov	592bcd3699	Fix QEMU driver bugs (QEMU and qemu-img now work! hooray!)	2020-06-26 18:25:43 +03:00
Vitaliy Filippov	5e1e39633d	Implement QEMU block driver	2020-06-25 11:59:43 +03:00
Vitaliy Filippov	41c2655edd	Disconnect sockets when read returns zero	2020-06-24 01:32:19 +03:00
Vitaliy Filippov	d68370304e	Support iovecs in cluster_client_t	2020-06-24 01:31:48 +03:00
Vitaliy Filippov	a22d9f38aa	Only use EPOLLOUT while connecting	2020-06-23 20:18:31 +03:00
Vitaliy Filippov	8736b3ad32	Add destructors, make ringloop optional in cluster_client_t	2020-06-23 20:10:33 +03:00
Vitaliy Filippov	62343c8022	Allow to turn synchronous recvmsg/sendmsg on with a config option	2020-06-23 01:15:07 +03:00
Vitaliy Filippov	9abaf5b735	Use epoll_manager in osd	2020-06-20 01:28:18 +03:00
Vitaliy Filippov	badf68c039	Support iovecs for read operations	2020-06-19 19:47:05 +03:00
Vitaliy Filippov	0f6d193d73	Postpone op callbacks to the end of handle_read(), fix a bug where primary OSD could reply -EPIPE with data to a read operation	2020-06-16 01:36:38 +03:00
Vitaliy Filippov	27ee14a4e6	Fix bugs in cluster_client	2020-06-16 00:08:45 +03:00
Vitaliy Filippov	64afec03ec	In theory, implement syncs and replay for the non-immediate commit mode	2020-06-15 00:04:16 +03:00
Vitaliy Filippov	4dde8b8a42	Oops, fix fio_sec_osd block_order parsing	2020-06-09 00:52:00 +03:00
Vitaliy Filippov	f5ccb154af	Benchmark reads in stub_bench, too	2020-06-08 01:54:44 +03:00
Vitaliy Filippov	73c80e2c39	Move accept_connections() to osd_messenger_t, add a simple uring OSD stub	2020-06-08 01:32:16 +03:00
Vitaliy Filippov	437dc5b630	Implement a FIO engine for testing cluster I/O	2020-06-07 00:30:15 +03:00
Vitaliy Filippov	226f5a2945	Allow to override block_size in fio_sec_osd	2020-06-07 00:10:13 +03:00
Vitaliy Filippov	2187d06eac	Add a parameter to pass the initial config to client	2020-06-07 00:10:12 +03:00
Vitaliy Filippov	c573bc6bb3	(Probably almost) implement cluster client	2020-06-07 00:09:36 +03:00
Vitaliy Filippov	2f6cf605a1	Rename cluster_client to osd_messenger	2020-06-04 12:57:54 +03:00
Vitaliy Filippov	05ea97119f	Fix BS_OP_LIST to account for deleted objects: only list the newest stable entry of each object. This allows list responses to be unaffected by journal flushes, which, in turn, fixes PG peering when a peer OSD is replaying journal and journal contains deletions	2020-06-02 23:52:48 +03:00
Vitaliy Filippov	571be0f380	Make deletions instantly stable. The "2-phase" (write->stabilize) process is pointless for deletions because it doesn't protect us from incomplete objects. This happens because it removes the version information from metadata after stabilization. Deletions require a "3-phase" process with a potentially very long 3rd phase. So, deletions will be allowed to generate degraded and incomplete objects, and so that this does not affect users' ability to delete something, the cluster will allow deleting whole inodes while storing a list of them in etcd. Proper TRIM will be impossible until the implementation of the aforementioned "3-phase" process, though. By the way, this change also fixes a possible write stall after rebalancing which was caused by the lack of "stabilize delete" operations.	2020-06-02 23:45:22 +03:00
Vitaliy Filippov	985c309d7f	Remove duplicate code between blockstore_{rollback,stable} and blockstore_init	2020-06-02 20:37:00 +03:00
Vitaliy Filippov	a56f8cd14e	Simplify handle_primary_subop() arguments	2020-06-02 18:44:23 +03:00
Vitaliy Filippov	46e111272f	Replace assert(this_it == cur_op) with if() for the case of PG repeering	2020-06-02 14:30:57 +03:00
Vitaliy Filippov	165c204555	Fix BS_OP_DELETE (the implementation was untested up to this point)	2020-06-02 14:26:01 +03:00
Vitaliy Filippov	af5cd45071	Oh crap, got SIGPIPE. Add MSG_NOSIGNAL	2020-06-02 11:41:08 +03:00
Vitaliy Filippov	c3fe9ad0d1	Fix rebalancing writes (add a forgotten state resume)	2020-06-02 01:26:14 +03:00
Vitaliy Filippov	0fcdeae18b	Do not die if a peer is already stopped on flush error	2020-06-01 23:07:08 +03:00
Vitaliy Filippov	e6a4b634f8	Fix possible write stall. The stall occurred during fio Q=128 random write tests with low flusher_count (4). It was caused by flushers being unable to flush the beginning of the journal because it contained older writes to an object that also had writes in the very end of the journal, after dirty_start.	2020-06-01 16:18:23 +03:00
Vitaliy Filippov	c22e096943	Output journal offsets in debug trace in hex, add detailed "still waiting" messages	2020-06-01 16:18:19 +03:00
Vitaliy Filippov	45b1c2fbf1	Fix canceling of write operations on PG re-peer (which led to use-after-free, too...)	2020-06-01 16:18:14 +03:00
Vitaliy Filippov	3469bead67	Protect "delete this" with a stack refcounter (to fix use-after-free, too, but "delete this" was a time bomb anyway)	2020-06-01 16:18:09 +03:00
Vitaliy Filippov	3a5d488f19	Fix use-after-free in osd_flush.cpp	2020-06-01 01:56:24 +03:00
Vitaliy Filippov	73e4e30b1f	Auto-generate C++ header dependencies	2020-06-01 00:25:25 +03:00
Vitaliy Filippov	5feff1ffb9	Slightly cleanup socket send/receive code	2020-05-31 15:03:27 +03:00

449 Commits (94efb54feb510c94251f062f5409b96f9e3d169a)