Vitaliy Filippov
0471b09b9c
Add license notices to all source code files
2020-09-17 23:07:06 +03:00
Vitaliy Filippov
53832d184a
Allow to use lazy sync with replicated pools
2020-09-06 12:08:44 +03:00
Vitaliy Filippov
44973e7f27
Fix replicated pool bugs
2020-09-05 21:45:04 +03:00
Vitaliy Filippov
4f9b5286a0
Add replicated pool support to OSD logic
...in theory :-D now it needs some testing
2020-09-05 01:42:11 +03:00
Vitaliy Filippov
168cc2c803
Add pool support to OSD, part 1
This just fixes all the code so it builds and works like before,
but doesn't yet bring the support for replicated pools.
2020-09-04 17:04:17 +03:00
Vitaliy Filippov
3932c9b2e2
Add WRITE_STABLE to the secondary OSD for the upcoming replication support
2020-09-01 16:18:58 +03:00
Vitaliy Filippov
a7929931eb
Implement PG epochs to prevent the "version split"
The "version split" is when:
- A block is written to 1 OSD out of 3, all of them die
- OSDs 2 and 3 come up, the same block is written to both of them
- The remaining OSD comes up. Now all 3 OSDs have the same version of the same object,
but with different data.
2020-07-04 00:55:27 +03:00
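A minimal sketch of the idea, not the actual implementation: one way to rule out the version split described above is to pack a PG epoch into the high bits of every object version, so that a write made after the PG is re-activated in a newer epoch always gets a version strictly greater than anything written in older epochs. The names `COUNTER_BITS`, `make_version` and `next_version`, and the 16/48-bit split, are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: high 16 bits = PG epoch, low 48 bits = per-object counter.
constexpr int COUNTER_BITS = 48;
constexpr uint64_t COUNTER_MASK = (1ULL << COUNTER_BITS) - 1;

constexpr uint64_t make_version(uint64_t epoch, uint64_t counter)
{
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK);
}

// Version for the next write, given the newest version known to the OSDs
// that are currently alive and the current PG epoch.
constexpr uint64_t next_version(uint64_t last_seen, uint64_t pg_epoch)
{
    const uint64_t epoch_floor = make_version(pg_epoch, 0);
    return (last_seen < epoch_floor ? epoch_floor : last_seen) + 1;
}

// Replays the scenario from the commit message with hypothetical numbers.
inline void replay_version_split_scenario()
{
    // Epoch 1: the block reaches only OSD 1, then all three OSDs die.
    const uint64_t lost_write = make_version(1, 5);
    // OSDs 2 and 3 still remember the previous version of the block.
    const uint64_t last_on_2_and_3 = make_version(1, 4);
    // Without epochs, the next write would reuse the version of the lost write.
    assert(last_on_2_and_3 + 1 == lost_write);
    // With the PG epoch bumped to 2 on re-activation, the new write is distinct
    // and compares as newer once OSD 1 comes back.
    const uint64_t new_write = next_version(last_on_2_and_3, 2);
    assert(new_write == make_version(2, 1));
    assert(new_write > lost_write);
}
```

With such a layout, plain integer comparison of versions is enough to tell which copy is newer once the stale OSD returns.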
Vitaliy Filippov
9abaf5b735
Use epoll_manager in osd
2020-06-20 01:28:18 +03:00
Vitaliy Filippov
73c80e2c39
Move accept_connections() to osd_messenger_t, add a simple uring OSD stub
2020-06-08 01:32:16 +03:00
Vitaliy Filippov
2f6cf605a1
Rename cluster_client to osd_messenger
2020-06-04 12:57:54 +03:00
Vitaliy Filippov
571be0f380
Make deletions instantly stable
The "2-phase" (write->stabilize) process is pointless for deletions because it
doesn't protect us from incomplete objects: stabilization removes
the version information from metadata. Deletions would require a
"3-phase" process with a potentially very long 3rd phase.
So deletions are allowed to generate degraded and incomplete objects.
To keep this from affecting users' ability to delete data, the cluster
will allow deleting whole inodes while storing a list of them in etcd.
Proper TRIM will be impossible until the aforementioned
"3-phase" process is implemented, though.
By the way, this change also fixes a possible write stall after rebalancing
which was caused by the lack of "stabilize delete" operations.
2020-06-02 23:45:22 +03:00
Vitaliy Filippov
a56f8cd14e
Simplify handle_primary_subop() arguments
2020-06-02 18:44:23 +03:00
Vitaliy Filippov
45b1c2fbf1
Fix canceling of write operations on PG re-peer (which led to use-after-free, too...)
2020-06-01 16:18:14 +03:00
Vitaliy Filippov
5feff1ffb9
Slightly cleanup socket send/receive code
2020-05-31 15:03:27 +03:00
Vitaliy Filippov
0aca6e9ca8
Extract peer connect and read-write loop into a separate file (to be shared with the client library)
2020-05-26 22:11:30 +03:00
Vitaliy Filippov
fa98be6bc0
Allow to specify multiple etcd addresses
2020-05-25 16:30:05 +03:00
Vitaliy Filippov
79bf57b6e2
Allow to override pg_stripe_size
2020-05-25 15:31:22 +03:00
Vitaliy Filippov
6488d0044a
Ignore EPOLL_CTL_DEL ENOENT, fix detection of the rollback version
2020-05-23 15:43:37 +03:00
Vitaliy Filippov
393fe75900
Fix creepy (osd_op_t*)(long) casts
2020-05-23 15:43:37 +03:00
Vitaliy Filippov
e56909fb45
Remove tv_send (unused) and timerfd_interval from blockstore
2020-05-22 15:57:08 +03:00
Vitaliy Filippov
9f842ec9a5
Remove connect callback because it is always the same
2020-05-22 12:45:12 +03:00
Vitaliy Filippov
f6a01a4819
Extract "state-watching" etcd client into a separate file
2020-05-22 12:38:40 +03:00
Vitaliy Filippov
6202260018
Extract HTTP client functions from osd_t
2020-05-21 11:39:01 +03:00
Vitaliy Filippov
a61ede9951
Remove io_uring usage from osd_http and timerfd_manager
For better future interoperability with external event loops such as QEMU's
2020-05-21 01:25:38 +03:00
Vitaliy Filippov
c2c2eefea4
Duplicate host in osd/state and osd/stats, take PGs from /config/pgs.items
2020-05-15 01:29:15 +03:00
Vitaliy Filippov
5084ff7c6c
Measure & report recovery op count and bandwidth
2020-05-15 01:29:15 +03:00
Vitaliy Filippov
f71d0c117b
Measure & report op bandwidth, include local blockstore ops in stats
2020-05-11 02:58:13 +03:00
Vitaliy Filippov
e8149e5848
Implement OSD_OP_DELETE
2020-05-05 00:39:51 +03:00
Vitaliy Filippov
6355b968f4
Track osd_set history and all_peers separately
2020-05-04 15:28:07 +03:00
Vitaliy Filippov
1bc08174f9
Sync before listing objects so flushes do not fail thereafter
2020-05-01 12:56:49 +03:00
Vitaliy Filippov
cd87333091
Fix PG state comparison leading to unclean PGs not flushing
(a & b == b) -> ((a & b) == b) !
2020-05-01 12:56:46 +03:00
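To illustrate the class of bug named in this commit: in C++ `==` binds tighter than `&`, so `a & b == b` parses as `a & (b == b)`, which is just `a & 1`. The sketch below uses made-up flag names (`FLAG_ACTIVE`, `FLAG_UNCLEAN`) and helper functions that are assumptions for illustration, not the project's actual PG state constants.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical state flags, for illustration only.
constexpr uint32_t FLAG_ACTIVE  = 1u << 0;
constexpr uint32_t FLAG_UNCLEAN = 1u << 1;

// What `state & mask == mask` really computes: the comparison is evaluated
// first and yields true (1), so only the lowest bit of `state` is tested.
constexpr bool has_all_buggy(uint32_t state, uint32_t mask)
{
    return (state & static_cast<uint32_t>(mask == mask)) != 0;
}

// The intended check: every bit of `mask` is set in `state`.
constexpr bool has_all_fixed(uint32_t state, uint32_t mask)
{
    return (state & mask) == mask;
}

// A state with only the unclean bit set is missed by the buggy form...
static_assert(!has_all_buggy(FLAG_UNCLEAN, FLAG_UNCLEAN), "bit 1 is never examined");
static_assert(has_all_fixed(FLAG_UNCLEAN, FLAG_UNCLEAN), "fixed form sees the flag");
// ...while a state without it is wrongly reported as matching.
static_assert(has_all_buggy(FLAG_ACTIVE, FLAG_UNCLEAN), "false positive via bit 0");
static_assert(!has_all_fixed(FLAG_ACTIVE, FLAG_UNCLEAN), "fixed form rejects it");
```

GCC and Clang flag the unparenthesized form with `-Wparentheses` (enabled by `-Wall`), which is a cheap way to catch it early.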
Vitaliy Filippov
7b57eeeeb3
Implement PG state locking and PG moving in response to etcd events
2020-04-29 22:23:38 +03:00
Vitaliy Filippov
268b497c0b
Implement simple websocket client
2020-04-25 23:11:50 +03:00
Vitaliy Filippov
35481925b1
Implement very simple HTTP streaming to handle etcd watches
2020-04-25 01:35:52 +03:00
Vitaliy Filippov
caa01c6aaf
Acquire etcd leases, prevent starting two OSDs with the same number
2020-04-25 01:35:52 +03:00
Vitaliy Filippov
0f2b8dbf6f
Use a single timerfd_manager for all timers
2020-04-25 01:35:49 +03:00
Vitaliy Filippov
4f42e9659e
Use etcd instead of Consul
2020-04-24 01:03:55 +03:00
Vitaliy Filippov
2a640ba2e8
Remove range port selection (leads to races)
2020-04-21 00:10:59 +03:00
Vitaliy Filippov
6a21ea207e
Check peer config (at least, number) after connecting
2020-04-21 00:08:54 +03:00
Vitaliy Filippov
642802b595
Auto-select port numbers
2020-04-20 17:45:27 +03:00
Vitaliy Filippov
ff38b464a5
Add consul & connect timeouts, report state before loading PGs, move init_primary to osd_cluster
2020-04-20 15:43:07 +03:00
Vitaliy Filippov
f95299b769
Take PG history into account when starting PGs
2020-04-19 00:20:18 +03:00
Vitaliy Filippov
2a8e40835e
Fix reporting to Consul, report even if we are purely secondary
2020-04-17 01:59:06 +03:00
Vitaliy Filippov
309486d746
Implement loading PGs from Consul (in theory)
2020-04-16 23:22:32 +03:00
Vitaliy Filippov
089b4eb208
Retry consul connection attempts and then die
2020-04-15 15:33:18 +03:00
Vitaliy Filippov
37b27c3025
Implement basic OSD status reporting to Consul
2020-04-14 14:52:06 +03:00
Vitaliy Filippov
d11e8dcb5e
Do not flush or recover in readonly mode
2020-04-11 12:06:18 +03:00
Vitaliy Filippov
298b013eae
Add simple http request function
2020-04-11 12:05:58 +03:00
Vitaliy Filippov
d59be0e8b4
Delete misplaced chunks after moving the object, reset object state in primary_write
2020-04-05 15:51:22 +03:00
Vitaliy Filippov
6212195440
Implement parallel recovery
2020-04-04 19:23:12 +03:00