Commit Graph

381 Commits (a22d9f38aa3233618871da8a01fe649ef5c6890a)

Author SHA1 Message Date
Vitaliy Filippov a22d9f38aa Only use EPOLLOUT while connecting 2020-06-23 20:18:31 +03:00
Vitaliy Filippov 8736b3ad32 Add destructors, make ringloop optional in cluster_client_t 2020-06-23 20:10:33 +03:00
Vitaliy Filippov 62343c8022 Allow turning on synchronous recvmsg/sendmsg with a config option 2020-06-23 01:15:07 +03:00
Vitaliy Filippov 9abaf5b735 Use epoll_manager in osd 2020-06-20 01:28:18 +03:00
Vitaliy Filippov badf68c039 Support iovecs for read operations 2020-06-19 19:47:05 +03:00
Vitaliy Filippov 0f6d193d73 Postpone op callbacks to the end of handle_read(), fix a bug where primary OSD could reply -EPIPE with data to a read operation 2020-06-16 01:36:38 +03:00
Vitaliy Filippov 27ee14a4e6 Fix bugs in cluster_client 2020-06-16 00:08:45 +03:00
Vitaliy Filippov 64afec03ec In theory, implement syncs and replay for the non-immediate commit mode 2020-06-15 00:04:16 +03:00
Vitaliy Filippov 4dde8b8a42 Oops, fix fio_sec_osd block_order parsing 2020-06-09 00:52:00 +03:00
Vitaliy Filippov f5ccb154af Benchmark reads in stub_bench, too 2020-06-08 01:54:44 +03:00
Vitaliy Filippov 73c80e2c39 Move accept_connections() to osd_messenger_t, add a simple uring OSD stub 2020-06-08 01:32:16 +03:00
Vitaliy Filippov 437dc5b630 Implement a FIO engine for testing cluster I/O 2020-06-07 00:30:15 +03:00
Vitaliy Filippov 226f5a2945 Allow overriding block_size in fio_sec_osd 2020-06-07 00:10:13 +03:00
Vitaliy Filippov 2187d06eac Add a parameter to pass the initial config to client 2020-06-07 00:10:12 +03:00
Vitaliy Filippov c573bc6bb3 (Probably almost) implement cluster client 2020-06-07 00:09:36 +03:00
Vitaliy Filippov 2f6cf605a1 Rename cluster_client to osd_messenger 2020-06-04 12:57:54 +03:00
Vitaliy Filippov 05ea97119f Fix BS_OP_LIST to account for deleted objects: only list the newest stable entry of each object
This allows list responses to be unaffected by journal flushes, which, in turn,
fixes PG peering when a peer OSD is replaying the journal and the journal contains deletions
2020-06-02 23:52:48 +03:00
Vitaliy Filippov 571be0f380 Make deletions instantly stable
The "2-phase" (write->stabilize) process is pointless for deletions because it
doesn't protect us from incomplete objects. This happens because it removes
the version information from metadata after stabilization. Deletions require
a "3-phase" process with a potentially very long 3rd phase.

So, deletions will be allowed to generate degraded and incomplete objects,
and so that this does not affect users' ability to delete something, the cluster
will allow deleting whole inodes while storing a list of them in etcd.
Proper TRIM will be impossible until the aforementioned "3-phase" process
is implemented, though.

By the way, this change also fixes a possible write stall after rebalancing
which was caused by the lack of "stabilize delete" operations.
2020-06-02 23:45:22 +03:00
Vitaliy Filippov 985c309d7f Remove duplicate code between blockstore_{rollback,stable} and blockstore_init 2020-06-02 20:37:00 +03:00
Vitaliy Filippov a56f8cd14e Simplify handle_primary_subop() arguments 2020-06-02 18:44:23 +03:00
Vitaliy Filippov 46e111272f Replace assert(this_it == cur_op) with if() for the case of PG repeering 2020-06-02 14:30:57 +03:00
Vitaliy Filippov 165c204555 Fix BS_OP_DELETE (the implementation was untested up to this point) 2020-06-02 14:26:01 +03:00
Vitaliy Filippov af5cd45071 Oh crap, got SIGPIPE. Add MSG_NOSIGNAL 2020-06-02 11:41:08 +03:00
Vitaliy Filippov c3fe9ad0d1 Fix rebalancing writes (add a forgotten state resume) 2020-06-02 01:26:14 +03:00
Vitaliy Filippov 0fcdeae18b Do not die if a peer is already stopped on flush error 2020-06-01 23:07:08 +03:00
Vitaliy Filippov e6a4b634f8 Fix possible write stall
The stall occurred during fio Q=128 random write tests with low flusher_count (4).
It was caused by flushers being unable to flush the beginning of the journal
because it contained older writes to an object that also had writes at the very end
of the journal, after dirty_start.
2020-06-01 16:18:23 +03:00
Vitaliy Filippov c22e096943 Output journal offsets in debug trace in hex, add detailed "still waiting" messages 2020-06-01 16:18:19 +03:00
Vitaliy Filippov 45b1c2fbf1 Fix canceling of write operations on PG re-peer (which led to use-after-free, too...) 2020-06-01 16:18:14 +03:00
Vitaliy Filippov 3469bead67 Protect "delete this" with a stack refcounter
(to fix use-after-free, too, but "delete this" was a time bomb anyway)
2020-06-01 16:18:09 +03:00
Vitaliy Filippov 3a5d488f19 Fix use-after-free in osd_flush.cpp 2020-06-01 01:56:24 +03:00
Vitaliy Filippov 73e4e30b1f Auto-generate C++ header dependencies 2020-06-01 00:25:25 +03:00
Vitaliy Filippov 5feff1ffb9 Slightly clean up socket send/receive code 2020-05-31 15:03:27 +03:00
Vitaliy Filippov b466e215f0 Fix queued OP_SYNC execution 2020-05-27 13:55:25 +03:00
Vitaliy Filippov 36f995367f Fix bind_address reporting 2020-05-27 10:58:40 +03:00
Vitaliy Filippov 0aca6e9ca8 Extract peer connect and read-write loop into a separate file (to be shared with the client library) 2020-05-26 22:11:30 +03:00
Vitaliy Filippov fa98be6bc0 Allow specifying multiple etcd addresses 2020-05-25 16:30:05 +03:00
Vitaliy Filippov 256a7f2667 Free op->bs_op manually 2020-05-25 15:31:22 +03:00
Vitaliy Filippov 79bf57b6e2 Allow overriding pg_stripe_size 2020-05-25 15:31:22 +03:00
Vitaliy Filippov 53f6aba3e6 Die when journal_sector_buffer_count is too small 2020-05-24 17:26:47 +03:00
Vitaliy Filippov 36595eb669 Print "Ran out of journal sector buffers" warning 2020-05-24 16:48:50 +03:00
Vitaliy Filippov e09d0e0678 Several bug fixes
- Do not block flock() requests
- Fix stop_client(0) attempts leading to std::bad_function_call
- Fix degraded writes crashing due to an unset stripes[i].missing (at least with a missing parity device)
- Fix recovery B/W reporting
2020-05-24 01:51:35 +03:00
Vitaliy Filippov d1602b50b3 Fix BS_OP_ROLLBACK removing an incorrect version
Instead of only removing versions with oid == X and version > Y it was
also removing the previous version in list (with the previous oid or
with version == Y)
2020-05-24 01:51:28 +03:00
Vitaliy Filippov 7df384031a Re-peer PGs after stopping the peer
Fixes the bug where two peers killed at once led to PG state PG_DEGRADED|PG_HAS_INCOMPLETE instead of PG_INCOMPLETE
2020-05-23 18:45:12 +03:00
Vitaliy Filippov e614a98543 Add a sad FIXME :-) 2020-05-23 15:43:37 +03:00
Vitaliy Filippov 01dd3ef89e Fix timerfd_manager triggering of multiple timers at the same time 2020-05-23 15:43:37 +03:00
Vitaliy Filippov cdccc23aff Print [OSD $osd_num] in stats, print B/W only for ops that log bytes 2020-05-23 15:43:37 +03:00
Vitaliy Filippov 700428829a Fix autosync_interval default not being set when autosync_interval is omitted from the config 2020-05-23 15:43:37 +03:00
Vitaliy Filippov 6488d0044a Ignore EPOLL_CTL_DEL ENOENT, fix detection of the rollback version 2020-05-23 15:43:37 +03:00
Vitaliy Filippov 393fe75900 Fix creepy (osd_op_t*)(long) casts 2020-05-23 15:43:37 +03:00
Vitaliy Filippov f036eecf1c Fix osd_rmw object recovery case (len==0) 2020-05-23 15:43:37 +03:00