Commit Graph

599 Commits (7006875a2411dda11dd19d29d2f2b75488feea7a)

Author SHA1 Message Date
Vitaliy Filippov 7006875a24 Make monitor stick to one etcd until the restart 2021-03-09 02:15:38 +03:00
Vitaliy Filippov ad577c4aac Add PING operation and timeouts to detect OSD failures when a host goes down 2021-03-09 02:15:38 +03:00
Vitaliy Filippov 836635c518 Use osd_out_time = 10 minutes by default 2021-03-09 02:15:38 +03:00
Vitaliy Filippov 88a03f4e98 Release 0.5.7
- Fix multiple bugs leading to OSDs sometimes being unable to correctly activate PGs
  when a lot of PG peering events occurred in a small amount of time
- Fix a bug where OSDs could list incomplete object versions during peering. The bug
  manifested with "local rollback operation failed" messages in OSD logs
- Fix a bug where misplaced chunks for degraded and incomplete objects were not removed
  from extra OSDs during recovery
- Fix incorrect PG history configuration resulting in OSDs being unable to find some
  of the objects after a PG count change
- Simplify block layer write ordering logic
- Avoid extra data move when a lot of OSDs are first stopped for long time and then restarted
- Fix incorrect degraded & misplaced object statistics after a completed rebalance
- Fix incorrect usage of pg_minsize instead of the minimal possible object chunk count in EC pools
2021-03-08 23:37:02 +03:00
Vitaliy Filippov 2a5036669d Fix PG count change procedure
In previous versions PG histories were calculated incorrectly during
PG count change which led to objects being lost on OSDs not in PG's osd set.
2021-03-08 23:15:58 +03:00
Vitaliy Filippov 2e0c853180 Make test_change_pg_count check if any objects are lost during the test 2021-03-08 23:15:07 +03:00
Vitaliy Filippov e91ff2a9ec Only forget offline PGs if their state is not changed during reporting 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 086667f568 Do not check PG state key ownership if it doesn't exist yet
This fixes the bug where OSDs were sometimes trying to report updated PG states
infinitely without luck when PGs transitioned from 'starting' to 'peering' too fast
2021-03-08 17:04:10 +03:00
Vitaliy Filippov 73ce20e246 Add a test for the "reappear after move" case 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 1be94da437 Check & remove extra chunks for degraded / incomplete objects, too 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 80e12358a2 Use pg_data_size instead of pg_minsize for object state calculation 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 36c935ace6 Use std::vector for the blockstore submission queue 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 0d8b5e2ef9 Remove unused enqueue_op_first() 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 98f1e2c277 Rework write/sync ordering
Make syncs wait for all previous writes because it's the only way
to make sure that OSDs do not receive incomplete writes in LIST results
during peering when some writes are still in progress.

Also simplify blockstore submission queue logic.
2021-03-08 17:04:10 +03:00
Vitaliy Filippov 21e7686037 Fix possible "assertion failed: pg.inflight >= 0" error during PG stop 2021-03-08 17:04:10 +03:00
Vitaliy Filippov ab21a1908b Check for the dirty PG flag when trying to continue to stop it after sync 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 30d1ccd43e Fix an infinite loop when discarding list operations during stop_pg() 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 8bdd6d8d78 Reset PG state when stopping them 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 09b3e4e789 Fix OSDs being unable to stop PGs that are 'peering', not 'active'
This was sometimes leading to incorrect misplaced and degraded object count statistics
2021-03-08 17:04:10 +03:00
Vitaliy Filippov 07912fd670 Use history/last_clean_pgs to avoid extra data move when observing a series of changes in the cluster 2021-03-08 17:04:10 +03:00
Vitaliy Filippov bc742ccf8c Fix a small memory leak in etcd_state_client 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 314b20437b Do not break subsequent small writes badly when a big write is canceled 2021-03-08 17:04:10 +03:00
Vitaliy Filippov 29bac892ad Add .gitignore 2021-03-08 17:04:10 +03:00
Vitaliy Filippov cf7547faf3 Fix *.sh build scripts 2021-03-02 02:17:11 +03:00
Vitaliy Filippov ab90ed747f Release 0.5.6
- Fix operation statistics
- Fix a rebalance hang introduced in 0.5.5
- Test PG count changes with actual data moving
- Fix a possible 'unexpected pg state: 0' error during PG count change
2021-03-01 16:26:04 +03:00
Vitaliy Filippov 29d8ac8b1b Do not report statistics for the empty operation 2021-03-01 16:20:57 +03:00
Vitaliy Filippov 97795ea1b1 Use pg_minsize=2 in the pg_count change test
Also don't check for has_degraded because it's not a bug that objects
are _temporarily_ listed as degraded during PG peering as it's not
required for the new primary to connect to _all_ older peers to start
peering. The test may be improved in the future by temporarily disabling
degraded recovery during it and returning the has_degraded check back.
2021-03-01 16:18:08 +03:00
Vitaliy Filippov 24e7075f08 Fix monitor's statistics aggregation 2021-02-28 19:51:16 +03:00
Vitaliy Filippov 6155b23a7e Replace pgs[id] with to prevent accidental auto-vivification 2021-02-28 19:36:59 +03:00
Vitaliy Filippov 7d49706c07 Improve the pg_count change test: add more OSDs and actually move data between them 2021-02-28 19:36:59 +03:00
Vitaliy Filippov 46e79f3306 Wait for PGs to become clean before stopping them 2021-02-28 19:36:59 +03:00
Vitaliy Filippov 41fd14e024 Fix deletes not increasing write_iodepth 2021-02-28 19:36:59 +03:00
Vitaliy Filippov bb2d9a3afe Release 0.5.5
- Transition to CMake build system
- Fix Monitor being unable to change PG sizes
- Fix PG optimizer not using some OSDs in some cases
- Fix inability to change PG count online
- Improve journal flusher performance
- Add a little better systemd unit generator
- Use w=8 with jerasure (breaking change for EC pools)
2021-02-26 01:59:18 +03:00
Vitaliy Filippov e899ed2c25 Make OSDs with 256 flushers (as they are now dynamic) 2021-02-26 01:59:18 +03:00
Vitaliy Filippov e21b14b72c Fix rpm specs for building with CMake 2021-02-26 01:59:18 +03:00
Vitaliy Filippov 5af8eddaa9 Add the remaining build script for Debian 2021-02-26 01:59:18 +03:00
Vitaliy Filippov 4f5a94c07a Modify instructions for the CMake build 2021-02-26 00:28:57 +03:00
Vitaliy Filippov e16b87ecc8 Rename random_combinations() parameter from "unordered" to "ordered" as it's more correct 2021-02-25 23:59:34 +03:00
Vitaliy Filippov fcb4aa0a11 Fix Monitor being unable to change PG sizes 2021-02-25 23:59:34 +03:00
Vitaliy Filippov 12adfa470c Add a test for changing PG size 2021-02-25 23:59:33 +03:00
Vitaliy Filippov 7f15e0c084 Add a simple test for the PG optimizer 2021-02-25 23:59:33 +03:00
Vitaliy Filippov 08d4bef419 Fix PG optimizer removing PGs without adding new ones
This happened when the distribution was already valid for the current OSD tree,
but didn't use all OSDs. For example, OSDs 1 2 3 and all PGs equal to [ 1, 2 ]
remained unchanged.
2021-02-25 23:59:33 +03:00
Vitaliy Filippov 2d73b19a6c Fix online PG count change bugs 2021-02-25 23:59:33 +03:00
Vitaliy Filippov 69c87009e9 Add a test for changing PG count 2021-02-25 23:59:33 +03:00
Vitaliy Filippov c974cb539c Make flusher_count adaptive and limit write iodepth 2021-02-25 23:59:33 +03:00
Vitaliy Filippov 00e98f64f3 A little better systemd unit generator 2021-02-25 23:59:33 +03:00
Vitaliy Filippov 91a70dfb1b Add a test for the no_same_sector_overwrites mode 2021-02-25 23:59:33 +03:00
Vitaliy Filippov 178388ac8c Use packages/ subdir instead of build/ for Docker package builds 2021-02-25 23:59:04 +03:00
Vitaliy Filippov bf9a175efc Move C/C++ sources to src subdirectory 2021-02-25 23:59:03 +03:00
Vitaliy Filippov 08aed962de Use CMake 2021-02-25 23:58:08 +03:00