78 Commits (098e369a3b9bd0abcd6371f3f7fb356bb8120e14)

Author SHA1 Message Date
Vitaliy Filippov 8f64fc61e7 Ignore empty events in mon 1 year ago
Vitaliy Filippov 4a9f001d9e Make mon also ping etcd websockets regularly 1 year ago
Vitaliy Filippov 68b6763ebe Add asserts for lp-optimizer tests, pass `ordered` from the monitor 1 year ago
Vitaliy Filippov 08e467270a Fix pg_size changing from 3 to 2 1 year ago
Vitaliy Filippov 5473d5b4a2 Rework HTTP client to use keepalive, move getifaddr_list to addr_util 1 year ago
Vitaliy Filippov fa687d3878 Allow to configure OSD placement in node_placement 1 year ago
Vitaliy Filippov 32b1312abb Remove stale deleted inode statistics in monitor 2 years ago
Vitaliy Filippov d5c8fde5de Remove kludgy $IP and $ETCD_MON parsing from make-osd.sh, suggest to use vitastor.conf 2 years ago
Vitaliy Filippov 7a0b5212fe Exit if unable to restart watches
FIXME: It's probably not OK for the client to exit in this case
2 years ago
Vitaliy Filippov a8f5c71ae8 Use the same etcd address selection algorithm in the monitor 2 years ago
Vitaliy Filippov 6e12aca53b Remove the total PG count restriction in optimize_change which sometimes led to infeasible problems 2 years ago
Vitaliy Filippov 6e0e172e15 Implement OSD address selection from a specified subnet 2 years ago
Vitaliy Filippov f0ebfae3b8 Fix vitastor-cli alloc-osd, use vitastor-cli in make-osd.sh 2 years ago
Vitaliy Filippov 66fe1a469b Additionally balance parity chunks over OSDs using round-robin when generating initial distribution 2 years ago
Vitaliy Filippov aa436027c8 Report pg/history from OSD on every degraded activation
Required to prevent data loss due to activation of an OSD with older data
when PG OSD set change doesn't occur. I.e. fixes the simplest case:
- Run 2 OSDs with 1 PG
- Start writing into the PG
- Stop OSD 2
- Stop OSD 1
- Start OSD 2

After this change the PG will refuse to start after the last step.
2 years ago
Vitaliy Filippov 0f3f0a9d29 Calculate average statistics in mon, remove buggy "fix_stat_overflows" 2 years ago
Vitaliy Filippov 6e6f407df3 Simplify & fix monitor stats aggregation 2 years ago
Vitaliy Filippov 4d43774cbb Use 5s etcd_report_interval by default 2 years ago
Vitaliy Filippov ffb06536ff Revoke lease in mon on SIGINT & SIGTERM, fix raw_to_usable calculation 2 years ago
Vitaliy Filippov cfe8de9b84 Autosync based on number of unstable ops to prevent journal stalls 2 years ago
Vitaliy Filippov da99686a15 Correctly aggregate pool statistics for unknown pools 2 years ago
Vitaliy Filippov b66160a7ad Aggregate per-pool statistics in mon 2 years ago
Vitaliy Filippov dfdf5c1f9c Fix comments in mon.js 2 years ago
Vitaliy Filippov 6810e93c3f Add RDMA options to mon.js list 2 years ago
Vitaliy Filippov 2a02f3c4c7 Add metadata superblock and check it on start
Refuse to start if the superblock is missing or bad version;
zero out the metadata area when initializing superblock.
2 years ago
Vitaliy Filippov 82c1a7ec67 Fix statistics reporting, split inode number into pool & inode 2 years ago
Vitaliy Filippov 2612d3198a Introduce image names and metadata storage in etcd
Each inode has: image name, parent inode number & pool, size and readonly flag

Snapshots are created by switching image name to a different inode number
while using the older inode as parent.
2 years ago
Vitaliy Filippov d0c2e31312 Add a test for snapshots, fix bugs. Now the test passes 2 years ago
Vitaliy Filippov 691f066055 Actual snapshot support (untested) 2 years ago
Vitaliy Filippov ffe1cd4c79 Report inode I/O statistics, aggregate it in the monitor 2 years ago
Vitaliy Filippov 4ae1b84c67 Report inode space usage statistics to etcd, aggregate it in the monitor 2 years ago
Vitaliy Filippov 97efb9e299 Do not crash on PG re-peering events when operations are in progress 2 years ago
Vitaliy Filippov 75a6a556b5 Shuffle PGs for better data device utilisation 2 years ago
Vitaliy Filippov 8f8b90be7a Add min_flusher_count configuration 2 years ago
Vitaliy Filippov df99e232ee Deduplicate osd_sets in pg history + raise request size limit for etcd 2 years ago
Vitaliy Filippov 3a40fa4127 Fix monitor errors in case of OSD removal 2 years ago
Vitaliy Filippov 435045751d Delete objects only after a SYNC during rebalance in the non-immediate_commit mode
Previously OSDs could commit deletes before writes during recovery or rebalance
in the "lazy fsync" (immediate_commit=off) mode, which could result in lost objects
2 years ago
Vitaliy Filippov 9f59381bea Re-distribute PG primaries over OSDs that come up after a short downtime 2 years ago
Vitaliy Filippov 87dbd8fa57 Use empty hash as the default value for some etcd keys in the monitor 2 years ago
Vitaliy Filippov b44f49aab2 Ignore zero OSDs in history osd_sets 2 years ago
Vitaliy Filippov af5155fcd9 Implement "no_recovery" and "no_rebalance" flags 2 years ago
Vitaliy Filippov 0d2efbecc9 Preserve previous PG history when changing PG distribution
Fixes incorrect PG history when a new rebalance is started
before the previous one finishes, which could make primary OSDs unable
to locate some objects on some secondaries.
2 years ago
Vitaliy Filippov e62e8b6bae Use real pg configuration instead of the "last clean" one for generating PG history
Basically fixes the bug introduced in 0.5.7 where a rebalance interrupted
by the monitor could result in forgetting objects moved to the new place
2 years ago
Vitaliy Filippov 7006875a24 Make monitor stick to one etcd until the restart 2 years ago
Vitaliy Filippov 836635c518 Use osd_out_time = 10 minutes by default 2 years ago
Vitaliy Filippov 2a5036669d Fix PG count change procedure
In previous versions PG histories were calculated incorrectly during
PG count change which led to objects being lost on OSDs not in PG's osd set.
2 years ago
Vitaliy Filippov 07912fd670 Use history/last_clean_pgs to avoid extra data move when observing a series of changes in the cluster 2 years ago
Vitaliy Filippov 24e7075f08 Fix monitor's statistics aggregation 2 years ago
Vitaliy Filippov e899ed2c25 Make OSDs with 256 flushers (as they are now dynamic) 2 years ago
Vitaliy Filippov e16b87ecc8 Rename random_combinations() parameter from "unordered" to "ordered" as it's more correct 2 years ago