Vitaliy Filippov
c573bc6bb3
(Probably almost) implement cluster client
2020-06-07 00:09:36 +03:00
Vitaliy Filippov
571be0f380
Make deletions instantly stable
The "2-phase" (write->stabilize) process is pointless for deletions because it
doesn't protect us from incomplete objects. This happens because it removes
the version information from metadata after stabilization. Deletions require a
"3-phase" process with a potentially very long 3rd phase.
So, deletions will be allowed to generate degraded and incomplete objects,
and so that this does not affect users' ability to delete something, the cluster
will allow deleting whole inodes while storing a list of them in etcd.
Proper TRIM will be impossible until the aforementioned "3-phase" process
is implemented, though.
By the way, this change also fixes a possible write stall after rebalancing
which was caused by the lack of "stabilize delete" operations.
2020-06-02 23:45:22 +03:00
Vitaliy Filippov
46e111272f
Replace assert(this_it == cur_op) with if() for the case of PG repeering
2020-06-02 14:30:57 +03:00
Vitaliy Filippov
c3fe9ad0d1
Fix rebalancing writes (add a forgotten state resume)
2020-06-02 01:26:14 +03:00
Vitaliy Filippov
45b1c2fbf1
Fix canceling of write operations on PG re-peer (which led to use-after-free, too...)
2020-06-01 16:18:14 +03:00
Vitaliy Filippov
b466e215f0
Fix queued OP_SYNC execution
2020-05-27 13:55:25 +03:00
Vitaliy Filippov
0aca6e9ca8
Extract peer connect and read-write loop into a separate file (to be shared with the client library)
2020-05-26 22:11:30 +03:00
Vitaliy Filippov
e09d0e0678
Several bug fixes
- Do not block flock() requests
- Fix stop_client(0) attempts leading to std::bad_function_call
- Fix degraded writes crashing due to an unset stripes[i].missing (at least with a missing parity device)
- Fix recovery B/W reporting
2020-05-24 01:51:35 +03:00
Vitaliy Filippov
393fe75900
Fix creepy (osd_op_t*)(long) casts
2020-05-23 15:43:37 +03:00
Vitaliy Filippov
19f25c7cd5
Handle integer overflow of the op_stat_count
2020-05-15 01:37:17 +03:00
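One way to handle an overflowing operation counter is sketched below. This is not taken from the commit itself: `op_stats_t`, `add_op_stat` and the field names are hypothetical, and the reset-on-wraparound policy is an assumption chosen so that the count and the accumulated sum always describe the same set of samples.

```cpp
#include <cstdint>

// Hypothetical running statistics for one operation type.
struct op_stats_t
{
    uint64_t count = 0;    // number of finished operations
    uint64_t usec_sum = 0; // total latency in microseconds
};

// Record one finished operation. Unsigned arithmetic wraps around, so if
// the counter becomes 0 after the increment, both fields are restarted
// from the current sample to keep the average (usec_sum / count) meaningful.
inline void add_op_stat(op_stats_t & st, uint64_t usec)
{
    st.count++;
    if (st.count == 0)
    {
        st.count = 1;
        st.usec_sum = 0;
    }
    st.usec_sum += usec;
}
```

Resetting both fields together loses the old history, but it avoids dividing a large sum by a tiny count right after the wraparound.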
Vitaliy Filippov
5084ff7c6c
Measure & report recovery op count and bandwidth
2020-05-15 01:29:15 +03:00
Vitaliy Filippov
e8149e5848
Implement OSD_OP_DELETE
2020-05-05 00:39:51 +03:00
Vitaliy Filippov
00cf24fbd7
Split osd_primary.cpp
2020-05-03 11:04:20 +03:00
Vitaliy Filippov
bd0fe6e4cc
Fix PGs not stopping during sync, fix autovivification of erased PGs during state reporting
2020-05-01 01:33:14 +03:00
Vitaliy Filippov
7b57eeeeb3
Implement PG state locking and PG moving in response to etcd events
2020-04-29 22:23:38 +03:00
Vitaliy Filippov
37b27c3025
Implement basic OSD status reporting to Consul
2020-04-14 14:52:06 +03:00
Vitaliy Filippov
298b013eae
Add simple http request function
2020-04-11 12:05:58 +03:00
Vitaliy Filippov
0880a77c1a
Add 2 FIXMEs for the future
2020-04-06 00:55:47 +03:00
Vitaliy Filippov
d59be0e8b4
Delete misplaced chunks after moving the object, reset object state in primary_write
2020-04-05 15:51:22 +03:00
Vitaliy Filippov
cf7de0f181
(Almost) Implement misplaced recovery, integrating it into calc_rmw()
2020-04-05 15:50:53 +03:00
Vitaliy Filippov
dfb6e15eaa
Implement graceful stopping of PGs
2020-04-03 13:03:42 +03:00
Vitaliy Filippov
afe2e76c87
Implement regular automatic syncs, split osd_t constructor into some methods
2020-04-02 22:16:46 +03:00
Vitaliy Filippov
0f43f6d3f6
Fix crashes, print some stats
Notably:
- fix the crash caused by `delete op` inside a lambda callback (it frees the
  lambda itself, which results in a use-after-free with g++)
- fix stop_client() reentrancy
- fix a bug in the blockstore layer which resulted in always returning version=0
  for zero-length reads
- change error codes for blockstore_stabilize
2020-03-31 17:55:31 +03:00
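The `delete op` crash mentioned in the commit above can be illustrated with a minimal sketch. The names `op_t` and `finish_op` are hypothetical and are not claimed to match the actual fix. The idea shown is one common remedy: when an object owns the `std::function` that is currently executing, deleting the object from inside the callback destroys the running closure, so the callback is moved into a local variable before it is invoked.

```cpp
#include <functional>
#include <utility>

// Hypothetical operation object that owns its completion callback.
struct op_t
{
    int retval = 0;
    std::function<void(op_t*)> callback;
};

// Unsafe pattern: calling op->callback(op) directly while the callback
// does `delete op`. The delete destroys op->callback, i.e. the closure
// that is still running, so any later access to its captures is a
// use-after-free.

// Safer pattern: move the callback out of the object first, so the
// closure lives in a local variable for the duration of the call.
inline void finish_op(op_t *op, int retval)
{
    op->retval = retval;
    std::function<void(op_t*)> cb = std::move(op->callback);
    cb(op); // the callback may `delete op` without destroying itself
}
```

With this pattern a callback such as `[&seen](op_t *o) { int r = o->retval; delete o; seen = r; }` can still touch its captures after the delete, because the closure is owned by `cb` rather than by the deleted object.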
Vitaliy Filippov
92c800bb64
Forget unstable writes when re-peering, rename parity_block_size -> pg_stripe_size, pg_parity_size -> pg_block_size
2020-03-31 02:09:25 +03:00
Vitaliy Filippov
8a8b619875
Handle secondary OSD connection errors [in theory]
2020-03-30 19:51:34 +03:00
Vitaliy Filippov
43fe1d88e7
Fix memory leaks with subops, fix recovery crashes
2020-03-28 19:09:20 +03:00
Vitaliy Filippov
1b30120918
Fix stripe reconstruction in recovery, only write modified object parts
2020-03-28 13:58:42 +03:00
Vitaliy Filippov
c0a22d825d
Fix degraded object recovery (it seems to work now)
2020-03-25 02:17:41 +03:00
Vitaliy Filippov
250f22c0b6
Implement basic degraded object recovery (integrated into primary_write)
2020-03-25 01:17:50 +03:00
Vitaliy Filippov
dbd8418798
Reply using a single finish_op() method, allow calling OSD ops from inside the OSD
2020-03-24 00:18:52 +03:00
Vitaliy Filippov
21d0b06959
Implement flushing (stabilize/rollback) of unstable entries on start of the PG
2020-03-14 02:49:34 +03:00
Vitaliy Filippov
31f9445030
Use immediate_commit to benefit the primary OSD
2020-03-10 02:20:16 +03:00
Vitaliy Filippov
56765ab750
Send all iovecs at once
2020-02-29 02:27:19 +03:00
Vitaliy Filippov
2be4824a7a
Fix a small memory leak and BS_OP_SYNC mishandling; fio no longer hangs during the primary-osd test
2020-02-28 01:46:39 +03:00
Vitaliy Filippov
1733de2db6
Test & fix single-PG primary OSD
- Add support for benchmarking a single primary OSD in fio_sec_osd
- Do not wait for the next event in flushers (return resume_0 back)
- Fix flushing of zero-length writes
- Print PG object count when peering
- Print journal free space when starting and when congested
2020-02-26 19:05:29 +03:00
Vitaliy Filippov
df66a76ce2
...and make it work :)
2020-02-25 22:52:03 +03:00
Vitaliy Filippov
a406c62a71
Implement basic primary-sync-stabilize
2020-02-25 20:10:21 +03:00
Vitaliy Filippov
74673c761f
Make basic primary-write work
2020-02-25 02:55:58 +03:00
Vitaliy Filippov
09588a349f
Transform primary_r/w into "coroutines"
2020-02-24 02:40:52 +03:00
Vitaliy Filippov
4c0178f180
Fix some memory freeing
2020-02-24 01:04:23 +03:00
Vitaliy Filippov
5dd04abbac
Make bs_op a pointer
2020-02-23 23:46:00 +03:00
Vitaliy Filippov
88e56a564f
Rename osd_read_stripe_t to osd_rmw_stripe_t
2020-02-23 23:43:57 +03:00
Vitaliy Filippov
4a52a15564
Rename osd_op_t.op to req
2020-02-23 23:21:17 +03:00
Vitaliy Filippov
d4fd9d982a
Implement read-modify-write calculation and extract it into a separate file
2020-02-23 02:11:43 +03:00
Vitaliy Filippov
ffe073473a
Remove the hardcoded EC(2+1) scheme (EC(k+1) is now supported), fix some bugs
2020-02-13 19:13:17 +03:00
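As an illustration of what an EC(k+1) scheme involves, here is a minimal sketch. It assumes that the single parity chunk is the bytewise XOR of the k data chunks, which is the usual choice for one parity chunk but is not confirmed by this log. `chunk_t`, `xor_chunks` and `reconstruct` are hypothetical names, and all chunks are assumed to have equal length.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using chunk_t = std::vector<uint8_t>;

// XOR all chunks together (chunks are assumed to be of equal length).
// Applied to the k data chunks, this yields the parity chunk.
inline chunk_t xor_chunks(const std::vector<chunk_t> & chunks)
{
    chunk_t out(chunks.empty() ? 0 : chunks[0].size(), 0);
    for (const chunk_t & c : chunks)
        for (size_t i = 0; i < out.size(); i++)
            out[i] ^= c[i];
    return out;
}

// Rebuild the chunk at index `missing` from the remaining chunks.
// `stripe` holds k data chunks followed by 1 parity chunk; the content
// stored at `missing` is ignored.
inline chunk_t reconstruct(const std::vector<chunk_t> & stripe, size_t missing)
{
    std::vector<chunk_t> present;
    for (size_t i = 0; i < stripe.size(); i++)
        if (i != missing)
            present.push_back(stripe[i]);
    return xor_chunks(present);
}
```

Because XOR is its own inverse, the same routine recovers either a lost data chunk or a lost parity chunk, and nothing in it depends on k being 2.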
Vitaliy Filippov
a66b34e04d
Implement event-driven PG peering
2020-02-11 13:41:34 +03:00
Vitaliy Filippov
1513d0490a
Test and fix degraded-read
2020-02-09 19:17:35 +03:00
Vitaliy Filippov
97d3fc593c
Test and fix primary-read
2020-02-09 19:17:32 +03:00
Vitaliy Filippov
235d15422c
Mostly finish primary-OSD-read
2020-02-03 14:18:21 +03:00
Vitaliy Filippov
9fb2d3f840
Fill out the rest of the degraded read logic; now we need to make it a "coroutine"
2020-02-02 00:05:56 +03:00