1622 Commits (bb2f395f1e0a4c448c5253af0feb9fa22187fca6)

Author SHA1 Message Date
Vitaliy Filippov 2f38adeb3d Restart dead VDUSE daemons at regular intervals 2023-12-24 12:58:50 +03:00
Vitaliy Filippov f72f14e6a7 Clear old PG states, history, and OSD states on etcd state reload
Also add protection from etcd watcher messages being split into multiple websocket
messages - I'm not sure if etcd actually does that, but it's better to have extra
protection anyway.

Also check that all etcd watchers are started in the keepalive routine, otherwise
it sometimes tries to revive etcd watchers starting with revision=1 which obviously
always fails because this revision is nearly always compacted.

All these changes should fix an old rarely reproduced bug where SOMETIMES OSDs
didn't react to PG config changes which was leading to offline pools on node reboot.
It happened on the full reload of state from etcd.
2023-12-24 02:02:13 +03:00
Vitaliy Filippov 1299373988 Use the same etcd_ws_keepalive_interval in OSD and mon 2023-12-23 20:07:29 +03:00
Vitaliy Filippov 178bb0e701 Prevent re-entry into timerfd set_nearest 2023-12-22 02:32:40 +03:00
Vitaliy Filippov 4ece4dfdd0 Fix mon not using values from config when /config/global is not present 2023-12-22 02:25:09 +03:00
Vitaliy Filippov 95631773b6 Remove pve-storage-portal-dns-list format for vitastor_etcd_address 2023-12-20 02:22:06 +03:00
Vitaliy Filippov 7239cfb91a Parse log_level in cluster_client 2023-12-20 02:21:23 +03:00
Vitaliy Filippov 7cea642f4a Fix vitastor-nbd image existence check not working because of non-zeroed inode_watch fields 2023-12-19 01:11:37 +03:00
Vitaliy Filippov dc615403d9 Do not warn on EPIPE in client unless log_level is raised explicitly 2023-12-17 13:42:26 +03:00
Vitaliy Filippov 1a704e06ab Allow multiple interfaces with the same IP address, for "simple routed" full mesh network 2023-12-17 13:25:56 +03:00
Vitaliy Filippov 575475de71 Do not ignore loopback addresses for OSD network (to make ECMP setups with frr possible) 2023-12-17 11:55:13 +03:00
Vitaliy Filippov aca2bef15f Add vitastor-disk update-sb command 2023-12-14 01:11:42 +03:00
Vitaliy Filippov 4dd6e89263 Change qemu to qemu-system-x86 in docs 2023-12-14 01:01:00 +03:00
Vitaliy Filippov 9bac99ffb6 Fix incorrect error in CSI when searching for the device in /sys 2023-12-14 01:00:32 +03:00
Vitaliy Filippov 62ed130960 Support building qemu 8.1 from bookworm-backports 2023-12-10 00:34:13 +03:00
Vitaliy Filippov 9c7755b6e8 Use qemu-storage-daemon from QEMU 8.1.2 for CSI 2023-12-08 00:10:12 +03:00
Vitaliy Filippov 691ebd991a Move 2 last log printfs to stderr from stdout in etcd_state_client 2023-12-08 00:01:52 +03:00
Vitaliy Filippov 6d5df908a3 Fix possible out of bounds when checking invalid journal entries 2023-12-08 00:01:07 +03:00
Vitaliy Filippov fa87769ed8 Correct config options in vduse docs 2023-12-06 02:09:04 +03:00
Vitaliy Filippov 2ce8292803 Also log when killing process 2023-12-06 01:06:53 +03:00
Vitaliy Filippov 7f8f7ded52 Check for empty output of vitastor-nbd map (just in case) 2023-12-06 01:01:14 +03:00
Vitaliy Filippov 68553eabbb Log executed CLI commands 2023-12-06 00:48:12 +03:00
Vitaliy Filippov 3147c5c8d5 Remove internal error wrapping 2023-12-06 00:39:42 +03:00
Vitaliy Filippov 576e2ae608 Fix etcd_address check in CSI 2023-12-06 00:28:21 +03:00
Vitaliy Filippov a1c7cc3d8d Release 1.3.1
Hotfix to 1.3.0 - new "journal space reservation" had a bug which
caused OSDs to crash with EC and without immediate_commit.
2023-12-04 18:35:09 +03:00
Vitaliy Filippov a5e3dfbc5a Oops, 1.3.0 needs a hotfix 2023-12-04 13:45:54 +03:00
Vitaliy Filippov 7972502eaf Release 1.3.0
New features:
- RDMA without ODP - much faster and all cards are now supported, not just Mellanox
- VDUSE in CSI - faster, more stable and can even recover after CSI pod restart!
- Reserve journal space for stabilize requests dynamically to prevent stalls under load with EC
- Raise default NBD timeout from 30 to 300 seconds and allow to take it from /etc/vitastor/vitastor.conf
- Remove explicit etcdUrl/etcdPrefix K8S storage class parameter support to prevent
  etcd migration issues for volumes created with these parameters
- Support QEMU 8.1 and pve-qemu 8.1

Bug fixes:
- Fix RDMA connection (and thus memory) leak
- Fix rare crashes under load due to incorrect io_uring queue size tracking
- Fix monitor statistics aggregation in case of empty /osd/stats keys
- Fix crash on unknown long argument to vitastor-disk
- Allow trailing comma in JSONs again
- Fix crash on attempts to dump a long listing of objects "to stabilize" or "to rollback" in a slow op
2023-12-04 02:36:43 +03:00
Vitaliy Filippov e57b7203b8 Use cmake3 on RHEL 7 2023-12-04 02:36:29 +03:00
Vitaliy Filippov c8a179dcda Note that Proxmox 8.1 is supported 2023-12-04 02:20:33 +03:00
Vitaliy Filippov 845454742d Fix warning with QEMU 8.1 2023-12-04 01:59:07 +03:00
Vitaliy Filippov d65512bd80 Add patches for QEMU 8.1 2023-12-04 01:56:17 +03:00
Vitaliy Filippov 53de2bbd0f Support VDUSE in CSI
VDUSE has multiple advantages:
- Better performance
- Lack of timeout problems
- And even the ability to recover after restart of the vitastor-csi pod!
2023-12-04 00:41:24 +03:00
Vitaliy Filippov 628aa59574 Raise default NBD timeout from 30 to 300 seconds and allow to take it from /etc/vitastor/vitastor.conf 2023-12-02 14:11:14 +03:00
Vitaliy Filippov 037cf64a47 Remove explicit etcdUrl/etcdPrefix from volume parameters 2023-12-02 13:26:00 +03:00
Vitaliy Filippov 19e2d9d6fa Fix crash on unknown long argument to vitastor-disk 2023-12-01 00:55:51 +03:00
Vitaliy Filippov bfc7e61909 Add more notes + performance comparison about VDUSE 2023-11-25 02:25:56 +03:00
Vitaliy Filippov 7da4868b37 Fix monitor statistics aggregation in case of empty /osd/stats keys 2023-11-24 01:05:21 +03:00
Vitaliy Filippov b5c020ce0b Use io_uring SQ size for ringloop capacity - otherwise get_sqe could return NULL when space_left() was > 0 under load
Raise default io_uring size to 1024 for the same effective capacity as previously
2023-11-20 03:04:06 +03:00
Vitaliy Filippov 6b33ae973d %d -> %lu 2023-11-20 03:02:26 +03:00
Vitaliy Filippov cf36445359 Reserve journal space for stabilize requests dynamically to prevent stalls 2023-11-20 03:01:57 +03:00
Vitaliy Filippov 3fd873d263 Add -fno-omit-frame-pointer by default 2023-11-20 02:59:54 +03:00
Vitaliy Filippov a00e8ae9ed Fix mismatch journal pos format in vitastor-disk 2023-11-19 15:19:54 +03:00
Vitaliy Filippov 75674545dc Limit the number of printed object versions in slow op dump (otherwise it may overflow the fixed buffer) 2023-11-13 01:10:28 +03:00
Vitaliy Filippov 225eb2fe3d Support RDMA without ODP by stupidly copying memory. Disable ODP by default
ODP is slower than regular RDMA even with memory copy overhead

Example numbers:
- 3950000 random read iops without ODP vs 240000 iops with ODP
- 1447000 random write iops without ODP vs 101000 iops with ODP

Reference: https://tkygtr6.github.io/pub/ISPASS21_slides.pdf
2023-11-12 15:03:47 +03:00
Vitaliy Filippov 7e82573ed0 Fix RDMA connection leak which was preventing stable functioning of RDMA :) 2023-11-11 23:40:47 +03:00
Vitaliy Filippov 12a6bed2d5 Return the new accidentally rolled back json11 commit ("allow trailing comma") 2023-11-07 15:49:23 +03:00
Vitaliy Filippov 5524dbdab7 Release 1.2.0
New features:

- Implement CSI volume expansion
- Implement CSI volume snapshots
- CSI driver now requires Kubernetes >= 1.20

Bug fixes:

- Important bug fix for EC: fix EC n+k, k>=2 read recovery in ISA-L version returning
  incorrect data when reading at least the second chunk out of multiple missing chunks
  without reading the first one. All users of EC n+k, k>=2 should upgrade as soon as
  possible, and upgrade should be conducted with downtime: first stop all clients
  (VMs/containers), then all OSDs, then upgrade and restart everything.
- Fix unstable statistics aggregation in monitor (affecting vitastor-cli status and df)
- Make udev not wait for OSDs to start during boot
- Do not report negative numbers of offline PGs in vitastor-cli status when changing PG count
- Report both old and new PG counts in vitastor-cli df when changing it
- Fix OSDs sometimes not starting with "The code only supports journal versions 1 and 2,
  but it is 2 on disk" error after upgrading from pre-1.0 versions and letting OSDs run
  for some time
- Fix monitors sometimes returning old PG count back after OSD configuration changes
- Make monitor PG changes more stable and timeout errors less probable
2023-11-05 01:48:57 +03:00
Vitaliy Filippov cd3dec06ac Remove spaces from old->new PG count in df 2023-11-05 01:45:45 +03:00
Vitaliy Filippov 371d79e059 Document vitastor-csi features 2023-11-05 01:05:26 +03:00
Vitaliy Filippov 0e888e6c60 Prevent spamming etcd with last_clean_pgs update requests 2023-11-05 00:12:00 +03:00