36 Commits (5596ad89978e3be27c536ac975a598626b3f3310)

Author SHA1 Message Date
Vitaliy Filippov 220bda0667 Fix possible buffer over(under)flow when handling LIST 2020-10-23 02:17:44 +03:00
Vitaliy Filippov f011e0c675 Do not block stabilize by list and list by write 2020-10-22 22:13:40 +00:00
Vitaliy Filippov 0471b09b9c Add license notices to all source code files 2020-09-17 23:07:06 +03:00
Vitaliy Filippov e051db5a73 Check for unsuccessful memory allocations 2020-09-05 01:42:11 +03:00
Vitaliy Filippov 0918ea08fa Implement min/max inode filters in LIST operation 2020-09-02 14:42:40 +03:00
Vitaliy Filippov ec7acc8f3a Add WRITE_STABLE operation for future replication support 2020-07-05 01:48:02 +03:00
Vitaliy Filippov 05ea97119f Fix BS_OP_LIST to account for deleted objects: only list the newest stable entry of each object
This allows list responses to be unaffected by journal flushes, which, in turn,
fixes PG peering when a peer OSD is replaying journal and journal contains deletions
2020-06-02 23:52:48 +03:00
Vitaliy Filippov 165c204555 Fix BS_OP_DELETE (the implementation was untested up to this point) 2020-06-02 14:26:01 +03:00
Vitaliy Filippov e6a4b634f8 Fix possible write stall
The stall occurred during fio Q=128 random write tests with low flusher_count (4).
It was caused by flushers being unable to flush the beginning of the journal
because it contained older writes to an object that also had writes in the very end
of the journal, after dirty_start.
2020-06-01 16:18:23 +03:00
Vitaliy Filippov c22e096943 Output journal offsets in debug trace in hex, add detailed "still waiting" messages 2020-06-01 16:18:19 +03:00
Vitaliy Filippov 0f43f6d3f6 Fix crashes, print some stats
Notably:
- fix the `delete op` inside lambda callback crash (it frees the lambda itself
  which results in use-after-free with g++)
- fix stop_client() reenterability
- fix a bug in the blockstore layer which resulted in always returning version=0
  for zero-length reads
- change error codes for blockstore_stabilize
2020-03-31 17:55:31 +03:00
Vitaliy Filippov 92c800bb64 Forget unstable writes when re-peering, rename parity_block_size -> pg_stripe_size, pg_parity_size -> pg_block_size 2020-03-31 02:09:25 +03:00
Vitaliy Filippov 46f9bd2a69 Make blockstore list operation return consistent snapshots 2020-03-14 02:10:25 +03:00
Vitaliy Filippov 6982fe1255 Do not block reads by previous unfinished writes 2020-03-13 21:28:49 +03:00
Vitaliy Filippov 3dd1b22d55 Fix segfault with concurrent OP_SYNCs 2020-03-10 17:00:23 +03:00
Vitaliy Filippov 3f522c66e6 Implement immediate commit mode 2020-03-10 01:59:15 +03:00
Vitaliy Filippov c3737ae3ff Add journal fsync to stabilize/rollback 2020-03-09 00:35:58 +03:00
Vitaliy Filippov 844cacd357 Allow incorrectly forbidden BS_OP_LIST in readonly mode 2020-03-06 02:29:39 +03:00
Vitaliy Filippov 9cb07d844b Make [un]register_consumer operate on pointers, rename get_loop_again() to has_work() 2020-03-04 21:00:20 +03:00
Vitaliy Filippov 2be4824a7a Fix a small memory leak and BS_OP_SYNC mishandling, now fio does not hang during primary-osd test 2020-02-28 01:46:39 +03:00
Vitaliy Filippov c71b67f2f7 Move SYNC_STAB_ALL into blockstore implementation 2020-02-23 23:43:57 +03:00
Vitaliy Filippov d4fd9d982a Implement read-modify-write calculation and extract it into a separate file 2020-02-23 02:11:43 +03:00
Vitaliy Filippov ffe073473a Remove hardcode of the EC(2+1) scheme, now it supports EC(k+1), fix some bugs 2020-02-13 19:13:17 +03:00
Vitaliy Filippov 47663bd1dc Add (empty) osd_primary.cpp, rename osd_read to osd_receive, add FIXMEs for fsync 2020-01-28 22:40:50 +03:00
Vitaliy Filippov 2b09710d6f Implement blockstore rollback operation
Rollback operation is required for the primary OSD to kill unstable
object versions in OSD peers so they don't occupy journal space
2020-01-24 20:18:14 +03:00
Vitaliy Filippov a8bc44064d Read object lists from peers and own blockstore 2020-01-22 02:36:14 +03:00
Vitaliy Filippov 43f6cfeb73 Extract alignments to options 2020-01-16 00:54:25 +03:00
Vitaliy Filippov a3d3949dce Do not overwrite same journal sector multiple times
It doesn't reduce actual WA, but it reduces tail latency (Q=32, 10% / 50% / 90% / 99% / 99.95%):
- write: 766us/979us/1090us/1303us/1729us vs 1074us/1450us/2212us/3261us/4113us
- sync: 701us/881us/1188us/1762us/2540us vs 269us/955us/1663us/2638us/4146us
2020-01-15 02:53:01 +03:00
Vitaliy Filippov cf819eb442 Implement sparse block bitmap to avoid zero-fill 2020-01-12 02:55:32 +03:00
Vitaliy Filippov 4b05bde3a2 Block writes earlier than sync/stabilize would be blocked, too 2020-01-10 20:05:17 +03:00
Vitaliy Filippov b3f2102f33 Add queue stall tracking 2020-01-10 01:23:46 +03:00
Vitaliy Filippov bf3eecc159 Extract 512 to constants 2020-01-06 14:11:47 +03:00
Vitaliy Filippov e88ad3f2ff Implement object list operation in blockstore 2019-12-19 20:50:20 +03:00
Vitaliy Filippov d3d21e6e0f Rename OP_ to BS_OP_ 2019-12-19 13:56:26 +03:00
Vitaliy Filippov 19abe6227e Fix submission ring overflow & ring_data_t reuse conflicts 2019-12-17 11:26:17 +03:00
Vitaliy Filippov a7e74670a5 Split blockstore implementation and interface header 2019-12-15 14:57:18 +03:00
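
The message of commit 05ea97119f above says that BS_OP_LIST should "only list the newest stable entry of each object" and should account for deleted objects. A minimal sketch of that filtering rule follows. All names here (`ObjectId`, `Entry`, `Listed`, `list_newest_stable`) are hypothetical and are not taken from the blockstore source. The sketch assumes that an object whose newest stable entry is a deletion is left out of the listing, and it ignores unstable entries entirely.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical types for illustration only; not the real blockstore structures.
struct ObjectId {
    uint64_t inode;
    uint64_t stripe;
    bool operator<(const ObjectId &o) const {
        return std::tie(inode, stripe) < std::tie(o.inode, o.stripe);
    }
};

struct Entry {
    ObjectId oid;
    uint64_t version;
    bool stable;     // false for writes that are not stabilized yet
    bool is_delete;  // true if this entry records a deletion
};

struct Listed {
    ObjectId oid;
    uint64_t version;
};

// Returns at most one item per object: its newest stable entry.
// Assumption: if that newest stable entry is a deletion, the object is omitted.
std::vector<Listed> list_newest_stable(const std::vector<Entry> &entries)
{
    std::map<ObjectId, Entry> newest;
    for (const Entry &e : entries) {
        if (!e.stable)
            continue;  // unstable entries never affect the stable listing
        auto it = newest.find(e.oid);
        if (it == newest.end())
            newest.emplace(e.oid, e);
        else if (e.version > it->second.version)
            it->second = e;  // keep only the highest stable version
    }
    std::vector<Listed> result;
    for (const auto &kv : newest) {
        if (!kv.second.is_delete)
            result.push_back({kv.first, kv.second.version});
    }
    return result;
}
```

Because the result depends only on the set of stable entries and not on where they are stored, the same input gives the same listing before and after older entries are moved out of the journal, which is the property the commit message describes.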