Fixes a crash in the following scenario:
- the client issues a delete request (the object version is at least 2)
- the OSD has time to flush it to the metadata, but not to move the journal start pointer on disk
- the client overwrites the same object, which gets version number 1 again
- the OSD is restarted and sees delete(v=2), big_write(v=1) in the journal
- the dirty_db sequence gets broken and the OSD crashes with assert("Writes and deletes shouldn't happen at the same time")
This is the new guaranteed unblocking method. It replaces the old trims
in init and rollback, and also fixes a possible stall when only a few
writes at the beginning of the journal are flushed without triggering
a subsequent trim.
1) Update the journal's used_start in memory only after updating the journal superblock.
Doing it in the opposite order is incorrect: writers may then overwrite the old beginning
of the journal while the on-disk superblock still points to it, and that part of the journal will be lost.
2) Sync journal device after updating the superblock.
3) Do not trim in rollback and init because trimming there would also require
updating the superblock. The only reason to trim in both places was
to unblock writers, and a guaranteed unblocking method will follow in the next
commit :)
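
The ordering from points 1) and 2) can be sketched as follows. This is a minimal model under stated assumptions: FakeJournalDevice, Journal::trim and the helper functions are hypothetical names for illustration, not the actual blockstore code, and the "device" is just two integers that separate written-but-unsynced state from durable state.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the journal device (hypothetical, not a real block device API)
struct FakeJournalDevice
{
    uint64_t cached_start = 0;  // superblock value written but not yet synced
    uint64_t durable_start = 0; // superblock value that would survive a crash
    bool fail_sync = false;     // set to true to simulate a sync error

    void write_superblock(uint64_t new_start) { cached_start = new_start; }

    bool sync()
    {
        if (fail_sync)
            return false;
        durable_start = cached_start;
        return true;
    }
};

struct Journal
{
    FakeJournalDevice dev;
    uint64_t used_start = 0; // in-memory start; writers may reuse space before it

    // Move the journal start to new_start.
    // On failure used_start is left untouched, so writers cannot
    // overwrite data that the durable superblock still points to.
    bool trim(uint64_t new_start)
    {
        dev.write_superblock(new_start); // 1) update the superblock
        if (!dev.sync())                 // 2) sync the journal device
            return false;
        used_start = new_start;          // 3) only now release space to writers
        assert(used_start == dev.durable_start);
        return true;
    }
};

// Helper: in-memory start after a successful trim to offset 4096
uint64_t start_after_successful_trim()
{
    Journal j;
    j.trim(4096);
    return j.used_start;
}

// Helper: in-memory start when the sync step fails
uint64_t start_after_failed_sync()
{
    Journal j;
    j.dev.fail_sync = true;
    j.trim(4096);
    return j.used_start;
}
```

In this model the in-memory used_start never runs ahead of the durable superblock value, which is the property that points 1) and 2) are meant to guarantee.
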
The "2-phase" (write->stabilize) process is pointless for deletions because it
doesn't protect us from incomplete objects: it removes the version information
from metadata after stabilization. Deletions require a "3-phase" process
with a potentially very long 3rd phase.
So, deletions will be allowed to generate degraded and incomplete objects.
To keep this from affecting users' ability to delete something, the cluster
will allow deleting whole inodes while storing a list of them in etcd.
Proper TRIM will be impossible until the implementation of the aforementioned
"3-phase" process, though.
By the way, this change also fixes a possible write stall after rebalancing
which was caused by the lack of "stabilize delete" operations.
- Add support for benchmarking a single primary OSD in fio_sec_osd
- Do not wait for the next event in flushers (bring resume_0 back)
- Fix flushing of zero-length writes
- Print PG object count when peering
- Print journal free space when starting and when congested