Vitaliy Filippov
ec7acc8f3a
Add WRITE_STABLE operation for future replication support
2020-07-05 01:48:02 +03:00
Vitaliy Filippov
416a80b099
Make blockstore object state a combination of type and workflow
2020-07-04 22:20:32 +03:00
Vitaliy Filippov
571be0f380
Make deletions instantly stable
...
The "2-phase" (write->stabilize) process is pointless for deletions because it
doesn't protect us from incomplete objects. This happens because the version
information is removed from metadata after stabilization. Deletions require
a "3-phase" process with a potentially very long 3rd phase.
So, deletions will be allowed to generate degraded and incomplete objects,
and so that this does not affect users' ability to delete something, the cluster
will allow deleting whole inodes while storing a list of them in etcd.
Proper TRIM will be impossible until the aforementioned "3-phase" process
is implemented, though.
By the way, this change also fixes a possible write stall after rebalancing
which was caused by the lack of "stabilize delete" operations.
2020-06-02 23:45:22 +03:00
Vitaliy Filippov
985c309d7f
Remove duplicate code between blockstore_{rollback,stable} and blockstore_init
2020-06-02 20:37:00 +03:00
Vitaliy Filippov
165c204555
Fix BS_OP_DELETE (the implementation was untested up to this point)
2020-06-02 14:26:01 +03:00
Vitaliy Filippov
c22e096943
Output journal offsets in debug trace in hex, add detailed "still waiting" messages
2020-06-01 16:18:19 +03:00
Vitaliy Filippov
21d0b06959
Implement flushing (stabilize/rollback) of unstable entries on start of the PG
2020-03-14 02:49:34 +03:00
Vitaliy Filippov
eba053febe
Do not start small writes before finishing the last big write to the same object
2020-03-12 02:15:01 +03:00
Vitaliy Filippov
3f522c66e6
Implement immediate commit mode
2020-03-10 01:59:15 +03:00
Vitaliy Filippov
c863543bfe
Fix possible journal corruption caused by concurrent flushing and writing of the same journal sector
2020-03-08 01:21:19 +03:00
Vitaliy Filippov
844cacd357
Allow incorrectly forbidden BS_OP_LIST in readonly mode
2020-03-06 02:29:39 +03:00
Vitaliy Filippov
1733de2db6
Test & fix single-PG primary OSD
...
- Add support for benchmarking single primary OSD in fio_sec_osd
- Do not wait for the next event in flushers (return resume_0 back)
- Fix flushing of zero-length writes
- Print PG object count when peering
- Print journal free space when starting and when congested
2020-02-26 19:05:29 +03:00
Vitaliy Filippov
74673c761f
Make basic primary-write work
2020-02-25 02:55:58 +03:00
Vitaliy Filippov
dcc9e75c63
Wait for write completion before fsync in blockstore_init
2020-01-29 16:40:21 +03:00
Vitaliy Filippov
47663bd1dc
Add (empty) osd_primary.cpp, rename osd_read to osd_receive, add FIXMEs for fsync
2020-01-28 22:40:50 +03:00
Vitaliy Filippov
2b09710d6f
Implement blockstore rollback operation
...
The rollback operation is required for the primary OSD to kill unstable
object versions on peer OSDs so they don't occupy journal space
2020-01-24 20:18:14 +03:00
Vitaliy Filippov
d0ab2a20b2
Make fsync flags separate for data, metadata and journal
2020-01-17 13:41:37 +03:00
Vitaliy Filippov
43f6cfeb73
Extract alignments to options
2020-01-16 00:54:25 +03:00
Vitaliy Filippov
36d8c8724f
Fix sparse reads using bitmap, fix journal replay (we could sometimes lose its end)
2020-01-12 23:38:33 +03:00
Vitaliy Filippov
cf819eb442
Implement sparse block bitmap to avoid zero-fill
2020-01-12 02:55:32 +03:00
Vitaliy Filippov
bf3eecc159
Extract 512 to constants
2020-01-06 14:11:47 +03:00
Vitaliy Filippov
a7e74670a5
Split blockstore implementation and interface header
2019-12-15 14:57:18 +03:00
Vitaliy Filippov
76caecf7c7
Inmemory metadata mode
2019-12-02 15:42:42 +03:00
Vitaliy Filippov
f4d06ba102
OP_DELETE flushing
2019-12-02 02:41:14 +03:00
Vitaliy Filippov
00eeedae90
Add "fsync disabled" mode
2019-12-01 16:41:07 +03:00
Vitaliy Filippov
76655929c4
Add readonly flag
2019-12-01 16:41:07 +03:00
Vitaliy Filippov
9260cd263a
Verify data crc32 when reading journal
2019-11-30 23:32:10 +03:00
Vitaliy Filippov
2039df76a5
Fix journal reading and make it more similar to writing :)
2019-11-30 02:27:31 +03:00
Vitaliy Filippov
40781c67b2
Trim journal on start
2019-11-29 02:13:32 +03:00
Vitaliy Filippov
45f34fb3b2
Fix linear overwrite, make metadata writes ordered, ignore older entries when recovering journal
2019-11-28 22:36:38 +03:00
Vitaliy Filippov
b6fff5a77e
Fix metadata area size calculation, print free space, wait for free space
...
FIXME: Now it crashes with -ENOSPC on linear overwrite
2019-11-28 20:23:27 +03:00
Vitaliy Filippov
9fa0d3325f
Support inmemory journal
2019-11-28 18:06:50 +03:00
Vitaliy Filippov
e1ac4dba23
Fix safe stop procedure
2019-11-28 02:27:17 +03:00
Vitaliy Filippov
35a6ed728d
Fix another stall due to bad unstable_writes tracking, do not try to write beyond the end of the journal
2019-11-28 00:28:08 +03:00
Vitaliy Filippov
2630e2e3b9
Fix metadata partition length, fix journal allocation at the end
2019-11-27 19:39:18 +03:00
Vitaliy Filippov
876231d26b
no new
2019-11-27 18:14:01 +03:00
Vitaliy Filippov
ce5cd13bc8
Use fdatasync (just for testing over an FS)
2019-11-27 02:41:30 +03:00
Vitaliy Filippov
ff7469ee91
Make allocator a class
2019-11-27 00:50:57 +03:00
Vitaliy Filippov
a6770f619a
Fix crash while reading metadata
2019-11-26 12:06:42 +03:00
Vitaliy Filippov
50cf3667fa
Track unstable writes
2019-11-25 01:16:34 +03:00
Vitaliy Filippov
82a2b8e7d9
Fix some extra bugs and it seems now it is even able to trim the journal
2019-11-22 12:08:44 +03:00
Vitaliy Filippov
7e87290fca
Clear second sector of the journal, init iov for callbacks
2019-11-21 22:06:00 +03:00
Vitaliy Filippov
201eeb8516
Rewrite metadata_init to the same "goto-coroutine" style
2019-11-21 21:51:52 +03:00
Vitaliy Filippov
2b12428cb1
Debug OP_STABLE so the basic case passes without problem
2019-11-21 02:09:18 +03:00
Vitaliy Filippov
299b7288d5
Fix journal loading
2019-11-21 00:52:52 +03:00
Vitaliy Filippov
eb55b2fe20
Initialize sector 0 of the journal
2019-11-19 20:03:19 +03:00
Vitaliy Filippov
b5f04c58ff
Rewrite journal_init to the "goto-coroutine" style
2019-11-19 19:50:58 +03:00
Vitaliy Filippov
8c690c76ec
Wakeup ring loop
2019-11-18 14:08:11 +03:00
Vitaliy Filippov
e40a71b2ce
Check result to be equal to iov_len
2019-11-18 02:09:34 +03:00
Vitaliy Filippov
c2de733e35
Copy io_uring_prep_* to my_uring_prep_* so they do not clear user_data
2019-11-17 21:39:30 +03:00