Vitaliy Filippov
5d9d6f32a0
Fix common realloc memory leak mistakes found by cppcheck
2024-01-13 01:30:28 +03:00
Vitaliy Filippov
f600cc07b0
Autosync in blockstore every autosync_writes, too
2023-09-16 17:52:17 +03:00
Vitaliy Filippov
5cadb170b9
Fix possible OSD crash during sync due to missing min_flushed_journal_sector reset
2023-09-16 17:52:17 +03:00
Vitaliy Filippov
a8464c19af
Support keeping checksums on disk (not in memory)
...
Definitely beneficial for SSD+HDD setups
2023-07-29 12:17:18 +03:00
Vitaliy Filippov
c5274f655b
...and partially remove the perversion with bitmap inlining
2023-07-29 12:17:18 +03:00
Vitaliy Filippov
0b0405d115
Implement bitmap-granular (4k) metadata & data checksums
2023-07-29 12:17:18 +03:00
Vitaliy Filippov
a6d846863b
Add min/max stripe and limit to OP_LIST
2023-05-20 23:19:39 +03:00
Vitaliy Filippov
a409598b16
Wait for free space again, but count on big_write flushes instead of just flusher activity
2023-05-10 01:51:02 +03:00
Vitaliy Filippov
0fbf4c6a08
Selectively sync nonsynced objects on STABILIZE/ROLLBACK (fix for github issue #51 )
2023-04-08 02:44:02 +03:00
Vitaliy Filippov
d06ed2b0e7
Implement online config update
2023-03-26 19:21:50 +03:00
Vitaliy Filippov
1a1ba0d1e7
Add set_immediate to ringloop and use it for bs/osd ops to prevent reenterability issues
2023-02-09 17:37:26 +03:00
Vitaliy Filippov
e950c024d3
Do not sync peer OSDs before listing
...
Sync before listing was added to wait for all PG writes possibly left in queue
from the previous master to finish before listing it
But in fact it may block the cluster when EC is used and some unstable writes
are left in the queue - they block journal flushing, rollback/stabilize is
required to unblock them, but rollback/stabilize may only happen after PG is
peered. But peering needs listings, listings are requested only after sync, and
sync itself waits for currently blocked writes waiting in the queue
2023-01-03 00:05:45 +03:00
Vitaliy Filippov
71d6d9f868
Fix possible crash on ENOSPC during operation cancel in blockstore
2023-01-03 00:05:45 +03:00
Vitaliy Filippov
4ebdd02b0f
Remove LIST op limiter
...
It doesn't prevent OSD slow ops but may itself lead to stalls :)
2022-12-26 02:48:48 +03:00
Vitaliy Filippov
552e207d2b
Explicitly print errors about -EAGAIN in io_uring
2022-12-17 15:49:49 +03:00
Vitaliy Filippov
cb437913d3
Never try to wait for free space inside blockstore
2022-12-12 00:27:05 +03:00
Vitaliy Filippov
dfd80626bd
Extract disk opening functions to separate module
2022-07-15 01:38:30 +03:00
Vitaliy Filippov
73a363bf92
Rename some variables and constants
2022-07-15 01:38:30 +03:00
Vitaliy Filippov
839ec9e6e0
Shard clean_db by PGs to speedup listings
2022-02-20 00:21:24 +03:00
Vitaliy Filippov
36c276358b
Attempt to fix "head-of-line blocking" by LIST operations
2022-02-18 01:31:45 +03:00
Vitaliy Filippov
df0cd85352
Fix another part of the "async sqe clear" bug (followup to d9857a5340
)
2022-02-01 01:14:56 +03:00
Vitaliy Filippov
7bdd92ca4f
Fix build under clang and some warnings
...
Build problems fixed:
- void* pointer arithmetic which is a GNU extension (works as byte*)
- "variable size object may not be initialized" which is OK under GCC
- nullptr_t related error in json11 (it lacks 'operator <' in clang)
Warnings fixed:
- empty nested struct initializer { 0 } replaced by {}
- removed several unused lambda captures
2022-01-16 00:02:54 +03:00
Vitaliy Filippov
f93491bc6c
Implement journal write batching and slightly refactor journal writes
...
Slightly reduces WA. For example, in 4K T1Q128 replicated randwrite tests
WA is reduced from ~3.6 to ~3.1, in T1Q64 from ~3.8 to ~3.4.
Only effective without no_same_sector_overwrites.
2021-12-16 00:27:17 +03:00
Vitaliy Filippov
e74af9745e
Print journal flusher diagnostics on slow ops
2021-07-17 16:13:41 +03:00
Vitaliy Filippov
2ab423d4ef
Implement journaled write throttling for the SSD+HDD case
2021-04-10 17:44:12 +03:00
Vitaliy Filippov
d6524670e1
Introduce data distribution locality
2021-04-10 17:44:12 +03:00
Vitaliy Filippov
6909807068
Allow to start the OSD just to flush the journal completely
2021-04-10 17:44:12 +03:00
Vitaliy Filippov
54f2353f24
Use bitmap granularity for alignment checks
2021-04-03 14:36:04 +03:00
Vitaliy Filippov
8f8b90be7a
Add min_flusher_count configuration
2021-04-03 00:53:28 +03:00
Vitaliy Filippov
c5fb1d5987
Do not duplicate blockstore operations when io_uring fills up
...
This bug was leading to OSDs dying with "Assertion `fulfilled == read_op->len' failed"
when testing fio -rw=randread -numjobs=8 -iodepth=128
2021-03-16 12:48:26 +03:00
Vitaliy Filippov
d1526b415f
Correctly resume writes when OSD is full to return an error
2021-03-13 17:19:45 +03:00
Vitaliy Filippov
36c935ace6
Use std::vector for the blockstore submission queue
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
0d8b5e2ef9
Remove unused enqueue_op_first()
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
98f1e2c277
Rework write/sync ordering
...
Make syncs wait for all previous writes because it's the only way
to make sure that OSDs do not receive incomplete writes in LIST results
during peering when some writes are still in progress.
Also simplify blockstore submission queue logic.
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
bf9a175efc
Move C/C++ sources to src subdirectory
2021-02-25 23:59:03 +03:00