Vitaliy Filippov
6875a838e0
Capture all by value in qemu_proxy
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
ee44f64927
Introduce image names and metadata storage in etcd
...
Each inode has: image name, parent inode number & pool, size and readonly flag
Snapshots are created by switching image name to a different inode number
while using the older inode as parent.
2021-03-16 12:48:36 +03:00
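The inode metadata and snapshot scheme described in the commit above can be illustrated roughly as follows. This is a minimal in-memory sketch, not the project's code: the type and member names (`inode_meta_sketch`, `image_index_sketch`, `snapshot()`) are hypothetical, nothing here talks to etcd, and renaming the older inode to the snapshot name and marking it read-only are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical per-inode record: image name, parent inode number & pool,
// size and readonly flag (the fields listed in the commit message).
struct inode_meta_sketch
{
    std::string name;
    uint64_t parent_pool = 0;   // 0 = no parent (assumption)
    uint64_t parent_inode = 0;  // 0 = no parent (assumption)
    uint64_t size = 0;
    bool readonly = false;
};

// Hypothetical in-memory index standing in for the metadata storage.
struct image_index_sketch
{
    std::map<uint64_t, inode_meta_sketch> inodes;
    std::map<std::string, uint64_t> by_name;
    uint64_t pool_id = 1;
    uint64_t next_inode = 1;

    uint64_t create(const std::string & name, uint64_t size)
    {
        uint64_t id = next_inode++;
        inode_meta_sketch meta;
        meta.name = name;
        meta.size = size;
        inodes[id] = meta;
        by_name[name] = id;
        return id;
    }

    // Switch the image name to a new inode that uses the older inode as parent.
    uint64_t snapshot(const std::string & image, const std::string & snap_name)
    {
        auto it = by_name.find(image);
        if (it == by_name.end())
            throw std::out_of_range("no such image: " + image);
        uint64_t old_id = it->second;
        inode_meta_sketch & old_meta = inodes.at(old_id);
        inode_meta_sketch fresh;
        fresh.name = image;
        fresh.parent_pool = pool_id;
        fresh.parent_inode = old_id;
        fresh.size = old_meta.size;
        // Assumption: the older inode takes the snapshot name and becomes read-only.
        old_meta.name = snap_name;
        old_meta.readonly = true;
        uint64_t new_id = next_inode++;
        inodes[new_id] = fresh;
        by_name[snap_name] = old_id;
        by_name[image] = new_id;
        return new_id;
    }
};
```

In this sketch, writes after `snapshot()` would go to the new inode while reads of unwritten ranges would fall through to the parent; that read path is not modelled here.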
Vitaliy Filippov
abf0611d93
Go back to using clean_entry_bitmap_size instead of entry_attr_size because of changed bitmap handling
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
edbf0eb040
Add a test for snapshots, fix bugs. Now the test passes
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
18f71b059a
Fix part bitmap addresses
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
2db2ed22ea
Fix several snapshot I/O bugs
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
aa7699da24
Fix subop generation for snapshot implementation
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
853ecba780
Actual snapshot support (untested)
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
2f9c76b8fc
Report inode I/O statistics, aggregate them in the monitor
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
8da7f26459
Report inode space usage statistics to etcd, aggregate them in the monitor
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
9998b50c7e
Add inode space usage statistics tracking to blockstore
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
0422d94a70
Send bitmaps with primary-reads, actually read bitmaps for READ ops
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
ff2208ae70
Allocate bitmaps along with stripes to avoid memory fragmentation
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
ae54dddb0c
Remove cryptic bitmap inlining from bs_op_t and osd_op_t, use bitmap in primary OSD code
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
bfc175fe0f
Add "external" bitmap support to the secondary OSD protocol
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
07e10210b6
Use bitmap granularity for alignment checks
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
221b728fc9
Add "external" bitmap support to blockstore
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
6625aaae00
Add "external" bitmap support to osd_rmw
2021-03-16 12:48:36 +03:00
Vitaliy Filippov
435045751d
Delete objects only after a SYNC during rebalance in the non-immediate_commit mode
...
Previously, OSDs could commit deletes before writes during recovery or rebalance
in the "lazy fsync" (immediate_commit=off) mode, which could result in lost objects
2021-03-16 12:48:26 +03:00
Vitaliy Filippov
c5fb1d5987
Do not duplicate blockstore operations when io_uring fills up
...
This bug caused OSDs to die with "Assertion `fulfilled == read_op->len' failed"
when testing with fio -rw=randread -numjobs=8 -iodepth=128
2021-03-16 12:48:26 +03:00
Vitaliy Filippov
9ac7e75178
Allow specifying etcd URLs for OSDs with the http:// prefix, do not die with a strange error if the -etcd option is missing for fio
2021-03-16 12:48:26 +03:00
Vitaliy Filippov
88671cf745
Fix a bug causing all flushers to wait for an fsync without actually trying to do it
...
This happened because flusher_count became dynamic while fsync_batch() was still comparing the number
of flushers currently ready to do an fsync with the maximum number of flushers. The number also
wasn't rechecked on every loop iteration, which was incorrect as well.
Now the interrupted_rebalance test passes even without IMMEDIATE_COMMIT=1.
2021-03-13 17:27:29 +03:00
Vitaliy Filippov
ceb9c28de7
Set default log_level before passing config to etcd_state_client
2021-03-13 17:19:45 +03:00
Vitaliy Filippov
299d7d7c95
Use common macro for get_sqe
2021-03-13 17:19:45 +03:00
Vitaliy Filippov
d1526b415f
Correctly resume writes when OSD is full to return an error
2021-03-13 17:19:45 +03:00
Vitaliy Filippov
f49fd53d55
Fix a bug where the allocator was unable to allocate up to the last (n%64) blocks, add tests for it
2021-03-13 02:19:02 +03:00
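To illustrate the kind of edge case the allocator fix above refers to, here is a minimal flat-bitmap sketch in which the block count is not a multiple of 64. It is not the project's allocator: the class name `bitmap_allocator_sketch` and its layout are hypothetical. The point is only that the last, partially used 64-bit word must still hand out its valid blocks, while the padding bits beyond the end are never returned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat bitmap allocator; a set bit means "block in use".
class bitmap_allocator_sketch
{
    std::vector<uint64_t> words;
public:
    explicit bitmap_allocator_sketch(uint64_t block_count)
        : words((block_count + 63) / 64, 0)
    {
        // Mark only the padding bits past the end as used, so the
        // (block_count % 64) real blocks in the last word stay allocatable.
        uint64_t tail = block_count % 64;
        if (tail != 0)
            words.back() = ~uint64_t(0) << tail;
    }

    // Returns the lowest free block number, or UINT64_MAX if none is left.
    uint64_t allocate()
    {
        for (size_t i = 0; i < words.size(); i++)
        {
            if (words[i] == UINT64_MAX)
                continue;
            for (unsigned bit = 0; bit < 64; bit++)
            {
                uint64_t mask = uint64_t(1) << bit;
                if (!(words[i] & mask))
                {
                    words[i] |= mask;
                    return (uint64_t)i * 64 + bit;
                }
            }
        }
        return UINT64_MAX;
    }
};
```

With 70 blocks, for example, the second word holds 6 valid blocks (64 to 69), and a test for this case checks that all 70 can be allocated before the allocator reports that it is full.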
Vitaliy Filippov
b44f49aab2
Ignore zero OSDs in history osd_sets
2021-03-12 12:40:15 +03:00
Vitaliy Filippov
af5155fcd9
Implement "no_recovery" and "no_rebalance" flags
2021-03-11 00:36:31 +03:00
Vitaliy Filippov
c4ba24c305
Do not print ping op latency
2021-03-10 02:01:44 +03:00
Vitaliy Filippov
bd178ac20f
Fix history osd_set check - local OSD is always available!
2021-03-09 02:18:18 +03:00
Vitaliy Filippov
ad577c4aac
Add PING operation and timeouts to detect OSD failures when a host goes down
2021-03-09 02:15:38 +03:00
Vitaliy Filippov
e91ff2a9ec
Only forget offline PGs if their state is not changed during reporting
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
086667f568
Do not check PG state key ownership if it doesn't exist yet
...
This fixes the bug where OSDs sometimes kept trying to report updated PG states
indefinitely without success when PGs transitioned from 'starting' to 'peering' too fast
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
1be94da437
Check & remove extra chunks for degraded / incomplete objects, too
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
80e12358a2
Use pg_data_size instead of pg_minsize for object state calculation
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
36c935ace6
Use std::vector for the blockstore submission queue
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
0d8b5e2ef9
Remove unused enqueue_op_first()
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
98f1e2c277
Rework write/sync ordering
...
Make syncs wait for all previous writes because it's the only way
to make sure that OSDs do not receive incomplete writes in LIST results
during peering when some writes are still in progress.
Also simplify blockstore submission queue logic.
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
21e7686037
Fix possible "assertion failed: pg.inflight >= 0" error during PG stop
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
ab21a1908b
Check for the dirty PG flag when trying to continue to stop it after sync
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
30d1ccd43e
Fix an infinite loop when discarding list operations during stop_pg()
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
8bdd6d8d78
Reset PG state when stopping them
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
09b3e4e789
Fix OSDs being unable to stop PGs that are 'peering', not 'active'
...
This sometimes led to incorrect misplaced and degraded object count statistics
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
bc742ccf8c
Fix a small memory leak in etcd_state_client
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
314b20437b
Do not break subsequent small writes badly when a big write is canceled
2021-03-08 17:04:10 +03:00
Vitaliy Filippov
29d8ac8b1b
Do not report statistics for the empty operation
2021-03-01 16:20:57 +03:00
Vitaliy Filippov
6155b23a7e
Replace pgs[id] with pgs.at(id) to prevent accidental auto-vivification
2021-02-28 19:36:59 +03:00
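The auto-vivification mentioned in the commit above is standard `std::map` behaviour: `operator[]` default-constructs and inserts a missing element, whereas `at()` throws `std::out_of_range` and never inserts. A small sketch follows; `pg_stub_t` and the helper functions are hypothetical stand-ins, not the project's types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>

// Hypothetical stand-in for a PG record.
struct pg_stub_t
{
    int state = 0;
};

// operator[] inserts a default-constructed element for a missing key,
// so a plain lookup on this copy of the map can grow it.
inline size_t size_after_bracket_lookup(std::map<int, pg_stub_t> pgs, int id)
{
    (void)pgs[id];
    return pgs.size();
}

// at() leaves the map untouched and throws for a missing key.
inline bool at_throws_for_missing(const std::map<int, pg_stub_t> & pgs, int id)
{
    try
    {
        (void)pgs.at(id);
        return false;
    }
    catch (const std::out_of_range &)
    {
        return true;
    }
}
```

Note that `at()` is also usable on a `const` map, while `operator[]` is not, which is another way such accidental insertions get caught at compile time.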
Vitaliy Filippov
46e79f3306
Wait for PGs to become clean before stopping them
2021-02-28 19:36:59 +03:00
Vitaliy Filippov
41fd14e024
Fix deletes not increasing write_iodepth
2021-02-28 19:36:59 +03:00
Vitaliy Filippov
2d73b19a6c
Fix online PG count change bugs
2021-02-25 23:59:33 +03:00