75 Commits (1bc08174f99e9e029affeb8a73844cefea49d6b2)

Author SHA1 Message Date
Vitaliy Filippov 1bc08174f9 Sync before listing objects so flushes do not fail thereafter 2020-05-01 12:56:49 +03:00
Vitaliy Filippov cd87333091 Fix PG state comparison leading to unclean PGs not flushing
(a & b == b) -> ((a & b) == b) !
2020-05-01 12:56:46 +03:00
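The precedence problem named in the commit above can be reproduced in isolation. In C++, `==` binds tighter than `&`, so `a & b == b` parses as `a & (b == b)`, which reduces to `a & 1`. The sketch below uses hypothetical flag names and helper functions (`PG_ACTIVE`, `PG_DEGRADED`, `has_all_buggy`, `has_all_fixed`); they are illustrative only and are not taken from the repository.

```cpp
#include <cstdint>

// Hypothetical state flags, for illustration only (not the project's real constants).
constexpr uint32_t PG_ACTIVE   = 1u << 0;
constexpr uint32_t PG_DEGRADED = 1u << 1;

// Buggy form: parsed as state & (mask == mask), i.e. state & 1.
// Compilers typically warn here and suggest adding parentheses.
constexpr bool has_all_buggy(uint32_t state, uint32_t mask) {
    return state & mask == mask;
}

// Fixed form: mask the state first, then compare with the mask.
constexpr bool has_all_fixed(uint32_t state, uint32_t mask) {
    return (state & mask) == mask;
}

// The buggy check only looks at bit 0 of state, whatever the mask is.
static_assert(!has_all_buggy(PG_DEGRADED, PG_DEGRADED), "false negative");
static_assert(has_all_buggy(PG_ACTIVE, PG_ACTIVE | PG_DEGRADED), "false positive");

// The fixed check reports whether all mask bits are set.
static_assert(has_all_fixed(PG_DEGRADED, PG_DEGRADED), "all bits present");
static_assert(!has_all_fixed(PG_ACTIVE, PG_ACTIVE | PG_DEGRADED), "one bit missing");
```

With the buggy form, a state is matched only when its lowest bit happens to be set, which is consistent with some states being compared incorrectly as the commit describes.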
Vitaliy Filippov 7b57eeeeb3 Implement PG state locking and PG moving in response to etcd events 2020-04-29 22:23:38 +03:00
Vitaliy Filippov 268b497c0b Implement simple websocket client 2020-04-25 23:11:50 +03:00
Vitaliy Filippov 35481925b1 Implement very simple HTTP streaming to handle etcd watches 2020-04-25 01:35:52 +03:00
Vitaliy Filippov caa01c6aaf Acquire etcd leases, prevent starting two OSDs with the same number 2020-04-25 01:35:52 +03:00
Vitaliy Filippov 0f2b8dbf6f Use a single timerfd_manager for all timers 2020-04-25 01:35:49 +03:00
Vitaliy Filippov 4f42e9659e Use etcd instead of Consul 2020-04-24 01:03:55 +03:00
Vitaliy Filippov 2a640ba2e8 Remove range port selection (leads to races) 2020-04-21 00:10:59 +03:00
Vitaliy Filippov 6a21ea207e Check peer config (at least, number) after connecting 2020-04-21 00:08:54 +03:00
Vitaliy Filippov 642802b595 Auto-select port numbers 2020-04-20 17:45:27 +03:00
Vitaliy Filippov ff38b464a5 Add consul & connect timeouts, report state before loading PGs, move init_primary to osd_cluster 2020-04-20 15:43:07 +03:00
Vitaliy Filippov f95299b769 Take PG history into account when starting PGs 2020-04-19 00:20:18 +03:00
Vitaliy Filippov 2a8e40835e Fix reporting to Consul, report even if we are purely secondary 2020-04-17 01:59:06 +03:00
Vitaliy Filippov 309486d746 Implement loading PGs from Consul (in theory) 2020-04-16 23:22:32 +03:00
Vitaliy Filippov 089b4eb208 Retry consul connection attempts and then die 2020-04-15 15:33:18 +03:00
Vitaliy Filippov 37b27c3025 Implement basic OSD status reporting to Consul 2020-04-14 14:52:06 +03:00
Vitaliy Filippov d11e8dcb5e Do not flush or recover in readonly mode 2020-04-11 12:06:18 +03:00
Vitaliy Filippov 298b013eae Add simple http request function 2020-04-11 12:05:58 +03:00
Vitaliy Filippov d59be0e8b4 Delete misplaced chunks after moving the object, reset object state in primary_write 2020-04-05 15:51:22 +03:00
Vitaliy Filippov 6212195440 Implement parallel recovery 2020-04-04 19:23:12 +03:00
Vitaliy Filippov dfb6e15eaa Implement graceful stopping of PGs 2020-04-03 13:03:42 +03:00
Vitaliy Filippov afe2e76c87 Implement regular automatic syncs, split osd_t constructor into some methods 2020-04-02 22:16:46 +03:00
Vitaliy Filippov 0f43f6d3f6 Fix crashes, print some stats
Notably:
- fix the `delete op` inside lambda callback crash (it frees the lambda itself
  which results in use-after-free with g++)
- fix stop_client() reenterability
- fix a bug in the blockstore layer which resulted in always returning version=0
  for zero-length reads
- change error codes for blockstore_stabilize
2020-03-31 17:55:31 +03:00
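The first item in the commit above describes a general C++ hazard: if an operation object owns its completion callback and the callback deletes the operation, the closure is destroyed while it is still executing. One common way to avoid this, shown here only as a sketch, is to move the callback out of the object before invoking it. The `op_t` struct and `finish_op` helper below are hypothetical stand-ins; the commit does not say how the actual fix was done, and the repository's real types may differ.

```cpp
#include <functional>
#include <utility>

// Hypothetical operation type, for illustration only.
struct op_t {
    std::function<void(op_t*)> callback; // completion callback owned by the op
    int result = 0;
};

// Unsafe pattern (not compiled here): calling op->callback(op) directly when
// the callback does `delete op` destroys the std::function, and with it the
// lambda's captures, while the lambda body is still running.

// Safer pattern: move the callback into a local first. Deleting the op inside
// the callback then destroys only the moved-from member, and the closure
// stays alive until this function returns.
void finish_op(op_t *op) {
    auto cb = std::move(op->callback);
    cb(op);
}
```

A caller can then free the operation from inside its own callback, for example `op->callback = [&seen](op_t *o) { seen = o->result; delete o; };` followed by `finish_op(op);`, without the closure being destroyed mid-call.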
Vitaliy Filippov 92c800bb64 Forget unstable writes when re-peering, rename parity_block_size -> pg_stripe_size, pg_parity_size -> pg_block_size 2020-03-31 02:09:25 +03:00
Vitaliy Filippov 8a8b619875 Handle secondary OSD connection errors [in theory] 2020-03-30 19:51:34 +03:00
Vitaliy Filippov c0a22d825d Fix degraded object recovery (it seems to work now) 2020-03-25 02:17:41 +03:00
Vitaliy Filippov 250f22c0b6 Implement basic degraded object recovery (integrated into primary_write) 2020-03-25 01:17:50 +03:00
Vitaliy Filippov dbd8418798 Reply using a single finish_op() method, allow to call OSD ops from inside the OSD 2020-03-24 00:18:52 +03:00
Vitaliy Filippov 21d0b06959 Implement flushing (stabilize/rollback) of unstable entries on start of the PG 2020-03-14 02:49:34 +03:00
Vitaliy Filippov 31f9445030 Use immediate_commit to benefit the primary OSD 2020-03-10 02:20:16 +03:00
Vitaliy Filippov 8315407558 Incoming data pre-buffering 2020-03-04 17:34:45 +03:00
Vitaliy Filippov b27ad550cf Use btree_map instead of sparsepp 2020-03-04 17:12:27 +03:00
Vitaliy Filippov 20125db181 Use clock_gettime() 2020-03-03 00:54:42 +03:00
Vitaliy Filippov 79839ec31d Start sending immediately instead of waiting for another loop 2020-03-02 00:20:28 +03:00
Vitaliy Filippov 56765ab750 Send all iovecs at once 2020-02-29 02:27:19 +03:00
Vitaliy Filippov fd05e13bc4 Use EPOLLET
Its latency is slightly better, too
2020-02-29 01:56:59 +03:00
Vitaliy Filippov c41fd7ea18 Measure sending subops with data 2020-02-29 01:46:03 +03:00
Vitaliy Filippov c6334afc94 Measure OSD op/subop latency
Something is wrong: loopback RTT between OSDs is sometimes as high as 70us (should be 20us or less probably)
2020-02-28 12:26:49 +03:00
Vitaliy Filippov 2be4824a7a Fix a small memory leak and BS_OP_SYNC mishandling, now fio does not hang during primary-osd test 2020-02-28 01:46:39 +03:00
Vitaliy Filippov a406c62a71 Implement basic primary-sync-stabilize 2020-02-25 20:10:21 +03:00
Vitaliy Filippov 74673c761f Make basic primary-write work 2020-02-25 02:55:58 +03:00
Vitaliy Filippov 09588a349f Transform primary_r/w into "coroutines" 2020-02-24 02:40:52 +03:00
Vitaliy Filippov 4c0178f180 Fix some memory freeing 2020-02-24 01:04:23 +03:00
Vitaliy Filippov 5dd04abbac Make bs_op pointer 2020-02-23 23:46:00 +03:00
Vitaliy Filippov 88e56a564f Rename osd_read_stripe_t to osd_rmw_stripe_t 2020-02-23 23:43:57 +03:00
Vitaliy Filippov 4a52a15564 Rename osd_op_t.op to req 2020-02-23 23:21:17 +03:00
Vitaliy Filippov 72a89be912 Move uint8_t[] buffers into any_op_t/any_reply_t 2020-02-23 23:21:17 +03:00
Vitaliy Filippov d4fd9d982a Implement read-modify-write calculation and extract it into a separate file 2020-02-23 02:11:43 +03:00
Vitaliy Filippov ffe073473a Remove hardcode of the EC(2+1) scheme, now it supports EC(k+1), fix some bugs 2020-02-13 19:13:17 +03:00