Commit Graph

14 Commits (225eb2fe3dc719e92ba4237175a13cf874269bb5)

Author SHA1 Message Date
Vitaliy Filippov 225eb2fe3d Support RDMA without ODP by stupidly copying memory. Disable ODP by default
ODP is slower than regular RDMA even with memory copy overhead

Example numbers:
- 3950000 random read iops without ODP vs 240000 iops with ODP
- 1447000 random write iops without ODP vs 101000 iops with ODP

Reference: https://tkygtr6.github.io/pub/ISPASS21_slides.pdf
2023-11-12 15:03:47 +03:00
Vitaliy Filippov b7d398be5b Fix sscanf validation usage (field count instead of null_byte == 0)
Test / test_minsize_1 (push) Has been cancelled Details
Test / test_move_reappear (push) Has been cancelled Details
Test / test_rm (push) Has been cancelled Details
Test / test_snapshot_chain (push) Has been cancelled Details
Test / test_snapshot_chain_ec (push) Has been cancelled Details
Test / test_snapshot_down (push) Has been cancelled Details
Test / test_snapshot_down_ec (push) Has been cancelled Details
Test / test_splitbrain (push) Has been cancelled Details
Test / test_rebalance_verify (push) Has been cancelled Details
Test / build (push) Has been cancelled Details
Test / test_rebalance_verify_imm (push) Has been cancelled Details
Test / test_rebalance_verify_ec (push) Has been cancelled Details
Test / test_rebalance_verify_ec_imm (push) Has been cancelled Details
Test / test_write (push) Has been cancelled Details
Test / test_write_xor (push) Has been cancelled Details
Test / test_write_no_same (push) Has been cancelled Details
Test / test_heal_pg_size_2 (push) Has been cancelled Details
Test / test_heal_ec (push) Has been cancelled Details
Test / test_heal_csum_32k_dmj (push) Has been cancelled Details
Test / test_heal_csum_32k_dj (push) Has been cancelled Details
Test / test_heal_csum_32k (push) Has been cancelled Details
Test / test_heal_csum_4k_dmj (push) Has been cancelled Details
Test / test_heal_csum_4k_dj (push) Has been cancelled Details
Test / test_heal_csum_4k (push) Has been cancelled Details
Test / test_scrub (push) Has been cancelled Details
Test / test_scrub_zero_osd_2 (push) Has been cancelled Details
Test / test_scrub_xor (push) Has been cancelled Details
Test / test_scrub_pg_size_3 (push) Has been cancelled Details
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Has been cancelled Details
Test / test_scrub_ec (push) Has been cancelled Details
2023-09-07 02:34:35 +03:00
Vitaliy Filippov c3e80abad7 Allow to send more than 1 operation at a time 2023-02-26 02:01:04 +03:00
Vitaliy Filippov 138ffe4032 Reuse incoming RDMA buffers 2023-02-26 00:55:01 +03:00
Vitaliy Filippov 098e369a3b Fix rand initialization, add etcd connection/disconnection logging 2022-01-20 00:45:49 +03:00
Vitaliy Filippov 7bdd92ca4f Fix build under clang and some warnings
Build problems fixed:
- void* pointer arithmetic which is a GNU extension (works as byte*)
- "variable size object may not be initialized" which is OK under GCC
- nullptr_t related error in json11 (it lacks 'operator <' in clang)

Warnings fixed:
- empty nested struct initializer { 0 } replaced by {}
- removed several unused lambda captures
2022-01-16 00:02:54 +03:00
Vitaliy Filippov b262938bca Fix naggy "Failed to get RDMA device list: Unknown error -38" 2021-12-08 02:02:30 +03:00
Vitaliy Filippov 609bd4eb59 Remove naggy RDMA messages when log level is zero 2021-11-06 14:36:23 +03:00
Vitaliy Filippov 1526e2055e Do not crash with RDMA when receiving garbage, free RDMA buffers when connection is closed 2021-10-15 23:56:22 +03:00
Vitaliy Filippov 699a0fbbc7 Log to stderr instead of stdout in client 2021-05-15 19:22:24 +03:00
Vitaliy Filippov 483c5ab380 Negotiate max_msg instead of max_sge, make buffer settings more conservative :-) 2021-04-29 11:10:35 +03:00
Vitaliy Filippov 971aa4ae4f Implement RDMA receive with memory copying (send remains zero-copy)
This is the simplest and, as usual, the best implementation :)

100% zero-copy implementation is also possible (see rdma-zerocopy branch),
but it requires to create A LOT of queues (~128 per client) to use QPN as a 'tag'
because of the lack of receive tags and the server may simply run out of queues.
Hardware limit is 262144 on Mellanox ConnectX-4 which amounts to only 2048
'connections' per host. And even with that amount of queues it's still less optimal
than the non-zerocopy one.

In fact, newest hardware like Mellanox ConnectX-5 does have Tag Matching
support, but it's still unsuitable for us because it doesn't support scatter/gather
(tm_caps.max_sge=1).
2021-04-29 02:34:45 +03:00
Vitaliy Filippov 9e6cbc6ebc Negotiate max_sge between RDMA client & server 2021-04-29 02:15:20 +03:00
Vitaliy Filippov ce777319c3 WIP RDMA support
Basic naive implementation works, but it's highly non-optimal as
RNR retransmissions occur all the time. RDMA expects the receiver
to always have place for incoming WRs...
2021-04-29 02:03:54 +03:00