Commit Graph

36 Commits (959089b919e0195c427785f9f1a14d8a1f434e80)

Author SHA1 Message Date
Vitaliy Filippov 5fbe36198a Fix journal trimming
1) Update journal's used_start in memory only after updating journal superblock.
Doing the opposite is incorrect because part of the journal will be lost if writers
overwrite its old beginning.

2) Sync journal device after updating the superblock.

3) Do not trim in rollback and init, because trimming there would also require
updating the superblock. The only reason to trim in both those places was
to unblock writers, and a guaranteed unblocking method will follow in the next
commit :)
2020-10-24 01:08:33 +03:00
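
The ordering described in points 1 and 2 of the commit above can be illustrated with a minimal sketch. It is not the blockstore's actual code: the `Journal` class, its one-field superblock layout and the method names are all hypothetical, chosen only to show "persist, sync, then update memory".

```python
import os
import struct

# Hypothetical superblock layout: a single little-endian 64-bit used_start offset.
SUPERBLOCK_FORMAT = "<Q"


class Journal:
    """Toy journal whose superblock stores only used_start (an assumption)."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.used_start = 0  # in-memory copy that writers consult
        self._persist_superblock(0)

    def _persist_superblock(self, new_start):
        # Write the superblock at offset 0, then sync the journal device.
        os.pwrite(self.fd, struct.pack(SUPERBLOCK_FORMAT, new_start), 0)
        os.fsync(self.fd)

    def trim(self, new_start):
        # The on-disk superblock is updated and synced first. Only afterwards
        # does the in-memory used_start move, so writers cannot reuse the old
        # beginning of the journal while the disk still points at it.
        self._persist_superblock(new_start)
        self.used_start = new_start

    def read_superblock(self):
        raw = os.pread(self.fd, struct.calcsize(SUPERBLOCK_FORMAT), 0)
        return struct.unpack(SUPERBLOCK_FORMAT, raw)[0]

    def close(self):
        os.close(self.fd)
```

If `_persist_superblock` raises, `used_start` keeps its old value, so in this sketch the in-memory state never runs ahead of what is on disk.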
Vitaliy Filippov f011e0c675 Do not block stabilize by list and list by write 2020-10-22 22:13:40 +00:00
Vitaliy Filippov 0471b09b9c Add license notices to all source code files 2020-09-17 23:07:06 +03:00
Vitaliy Filippov de6919b02b Add option to disable multiple overwrites of the same journal sector
This makes sense for some SSDs like Intel D3-4510 because they don't
like overwrites of the same sector:

$ fio -direct=1 -rw=write -bs=4k -size=4k -loops=100000 -iodepth=1
  write: IOPS=3142, BW=12.3MiB/s (12.9MB/s)(97.9MiB/7977msec)

$ fio -direct=1 -rw=write -bs=4k -size=128k -loops=100000 -iodepth=1
  write: IOPS=20.8k, BW=81.4MiB/s (85.3MB/s)(543MiB/6675msec)
2020-09-13 00:37:39 +03:00
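
One way such an option could affect journal sector selection is sketched below. This is an assumption-laden illustration, not the project's implementation: `SectorState`, `pick_sector` and the flag name `no_same_sector_overwrites` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SectorState:
    index: int       # position of the sector in the journal
    used_bytes: int  # bytes already filled with journal entries
    written: bool    # True once this sector has been submitted to disk


def pick_sector(cur, entry_size, sector_size, sector_count,
                no_same_sector_overwrites):
    """Return the sector that should receive the next journal entry."""
    full = cur.used_bytes + entry_size > sector_size
    # With the option enabled, a sector that already went to disk is never
    # written again, even if it still has free space.
    if full or (no_same_sector_overwrites and cur.written):
        return SectorState((cur.index + 1) % sector_count, 0, False)
    return cur
```

With the flag off, a partially filled sector that was already written is reused, which means rewriting the same sector; with the flag on, the writer moves to the next sector at the cost of leaving some journal space unused.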
Vitaliy Filippov ec7acc8f3a Add WRITE_STABLE operation for future replication support 2020-07-05 01:48:02 +03:00
Vitaliy Filippov 53f6aba3e6 Die when journal_sector_buffer_count is too small 2020-05-24 17:26:47 +03:00
Vitaliy Filippov f3a7ccff50 Use 4K blockstore block by default, use MEM_ALIGNMENT in osd code 2020-04-14 19:19:56 +03:00
Vitaliy Filippov 21d0b06959 Implement flushing (stabilize/rollback) of unstable entries on start of the PG 2020-03-14 02:49:34 +03:00
Vitaliy Filippov c863543bfe Fix possible journal corruption caused by concurrent flushing and writing of the same journal sector 2020-03-08 01:21:19 +03:00
Vitaliy Filippov 2b09710d6f Implement blockstore rollback operation
Rollback operation is required for the primary OSD to kill unstable
object versions in OSD peers so they don't occupy journal space
2020-01-24 20:18:14 +03:00
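
The idea of rollback described above can be sketched roughly as follows. The data model (plain dictionaries mapping an object ID to version numbers) and the `rollback` signature are hypothetical and chosen only for illustration.

```python
def rollback(unstable, stable, oid, target_version):
    """Drop unstable versions of `oid` that are newer than `target_version`.

    `unstable` maps object IDs to lists of unstable version numbers and
    `stable` maps object IDs to the latest stable version (both assumed).
    Returns the number of versions removed.
    """
    if target_version < stable.get(oid, 0):
        raise ValueError("cannot roll back below the stable version")
    versions = unstable.get(oid, [])
    kept = [v for v in versions if v <= target_version]
    if kept:
        unstable[oid] = kept
    else:
        unstable.pop(oid, None)
    # Each removed version stands for journal space that can be reclaimed.
    return len(versions) - len(kept)
```

For example, with stable version 5 and unstable versions 6, 7 and 8, rolling back to version 6 removes versions 7 and 8 and keeps 6.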
Vitaliy Filippov 43f6cfeb73 Extract alignments to options 2020-01-16 00:54:25 +03:00
Vitaliy Filippov a3d3949dce Do not overwrite same journal sector multiple times
It doesn't reduce actual write amplification (WA), but it reduces tail latency (Q=32, percentiles 10% / 50% / 90% / 99% / 99.95%):
- write: 766us/979us/1090us/1303us/1729us vs 1074us/1450us/2212us/3261us/4113us
- sync: 701us/881us/1188us/1762us/2540us vs 269us/955us/1663us/2638us/4146us
2020-01-15 02:53:01 +03:00
Vitaliy Filippov 36d8c8724f Fix sparse reads using bitmap, fix journal replay (we could sometimes lose its end) 2020-01-12 23:38:33 +03:00
Vitaliy Filippov 4b05bde3a2 Block writes earlier than sync/stabilize would be blocked, too 2020-01-10 20:05:17 +03:00
Vitaliy Filippov bf3eecc159 Extract 512 to constants 2020-01-06 14:11:47 +03:00
Vitaliy Filippov a7e74670a5 Split blockstore implementation and interface header 2019-12-15 14:57:18 +03:00
Vitaliy Filippov 749ab6e2c6 Rename blockstore_operation to blockstore_op_t 2019-12-15 14:57:18 +03:00
Vitaliy Filippov 9260cd263a Verify data crc32 when reading journal 2019-11-30 23:32:10 +03:00
Vitaliy Filippov 40781c67b2 Trim journal on start 2019-11-29 02:13:32 +03:00
Vitaliy Filippov 9fa0d3325f Support in-memory journal 2019-11-28 18:06:50 +03:00
Vitaliy Filippov 78807eb244 Fix journal space check (do not overwrite the beginning of the journal) 2019-11-27 11:35:11 +03:00
Vitaliy Filippov 74d8ea2f01 Calculate data crc32c 2019-11-27 02:20:38 +03:00
Vitaliy Filippov d0fdcbd7ff Add optimized crc32c 2019-11-25 02:30:06 +03:00
Vitaliy Filippov 82a2b8e7d9 Fix some extra bugs and it seems now it is even able to trim the journal 2019-11-22 12:08:44 +03:00
Vitaliy Filippov 299b7288d5 Fix journal loading 2019-11-21 00:52:52 +03:00
Vitaliy Filippov 3bfa2f5f39 Fix io_uring submission, journal sector selection 2019-11-19 18:07:40 +03:00
Vitaliy Filippov a4aaa3c7c7 First implementation of journal trimming
In theory it's possible to start testing blockstore at this point!
2019-11-15 16:12:55 +03:00
Vitaliy Filippov 0627dd0f5e Used journal sector tracking 2019-11-15 02:04:19 +03:00
Vitaliy Filippov 1c6b9778a4 Handle all io_uring events using lambdas 2019-11-13 22:46:42 +03:00
Vitaliy Filippov ae77a228c7 Rename big_write.block to location 2019-11-12 20:58:27 +03:00
Vitaliy Filippov 46e96c5128 Remove duplicate journal buffer submission code 2019-11-11 18:38:57 +03:00
Vitaliy Filippov 8edb9e9d6f Remove duplicate journal writing code (and fix it at the same time) 2019-11-11 00:28:14 +03:00
Vitaliy Filippov 5330461029 Move blockstore journal fields to journal_t, implement multiple write buffers for journal sectors 2019-11-07 23:42:24 +03:00
Vitaliy Filippov c959948c82 Finish journal reader 2019-11-04 20:18:52 +03:00
Vitaliy Filippov e1c92d2227 Begin journal init reader 2019-11-04 01:42:53 +03:00
Vitaliy Filippov f4705d81d7 Split into multiple files, begin init_loop, adjust read 2019-11-03 02:30:11 +03:00