Compare commits

...

79 Commits

Author SHA1 Message Date
huy b57a40a67e Update cinder driver workaround 2022-02-08 16:45:55 +07:00
huy fa95a26109 Update cinder driver workaround 2022-02-08 16:44:21 +07:00
huy ec44cb09e7 Update cinder driver workaround 2022-02-08 16:36:54 +07:00
huy 25fccefe6d patch cinder driver 2021-12-22 10:44:07 +07:00
Vitaliy Filippov 20a4406acc Support IPv6 OSD addresses 2021-12-19 10:42:17 +03:00
Vitaliy Filippov f93491bc6c Implement journal write batching and slightly refactor journal writes
Slightly reduces WA. For example, in 4K T1Q128 replicated randwrite tests
WA is reduced from ~3.6 to ~3.1, in T1Q64 from ~3.8 to ~3.4.

Only effective without no_same_sector_overwrites.
2021-12-16 00:27:17 +03:00
Vitaliy Filippov 999bed8514 Fix opening regular files as blockstore 2021-12-15 02:08:58 +03:00
Vitaliy Filippov 3f33095fd7 Do not try to initialize client in simple-offsets 2021-12-15 02:07:27 +03:00
Vitaliy Filippov dd74c5ce1b Fix OSDs marking PGs incomplete instead of trying to connect with peers 2021-12-14 01:57:51 +03:00
Vitaliy Filippov c6d104ecd6 Print object version on fatal overwrite 2021-12-14 01:57:04 +03:00
Vitaliy Filippov e544aef7d0 Fix test rw_blocking 2021-12-12 23:24:50 +03:00
Vitaliy Filippov 616c18c786 Fix stub_uring_osd 2021-12-12 23:06:11 +03:00
Vitaliy Filippov fa687d3878 Allow to configure OSD placement in node_placement 2021-12-12 01:25:45 +03:00
Vitaliy Filippov 2c7556e536 Allow to run with 4k sector size. Natural, but it was forbidden 2021-12-11 22:03:16 +00:00
Vitaliy Filippov 2020608a39 Release 0.6.10
- Implement a storage plugin for Proxmox. Now you can use Vitastor with Proxmox!
- Implement `vitastor-cli df` (pool space usage statistics) command
- Add glob pattern support for `vitastor-cli ls`
- Fix several bugs in other CLI commands (resize, create --parent, modify --readonly)
- Use 512 byte logical block size in QEMU driver by default (and thus don't require to set it in QEMU options)
2021-12-10 21:40:12 +03:00
Vitaliy Filippov 139b98d80f Exclude block/vitastor.c from patches and add script to easily re-add it 2021-12-10 21:38:36 +03:00
Vitaliy Filippov f54ff6ad5d Do not crash in simple-offsets when some options are empty, too 2021-12-10 12:27:25 +03:00
Vitaliy Filippov b376ef2ed9 Do not crash on empty matched_addrs 2021-12-10 11:40:59 +03:00
Vitaliy Filippov 5a234588b9 Do not die when invoked via `vita` symlink 2021-12-10 02:45:16 +03:00
Vitaliy Filippov b82c30328f Use vitastor-cli df to show pool stats in Proxmox 2021-12-10 02:42:31 +03:00
Vitaliy Filippov 0ee5e0a7fe Implement vitastor-cli df command 2021-12-10 02:37:02 +03:00
Vitaliy Filippov 0a1640d169 Some important fixes for our new Proxmox driver 2021-12-10 01:17:06 +03:00
Vitaliy Filippov 3482bb0860 Fix readonly/readwrite option parsing 2021-12-10 00:52:59 +03:00
Vitaliy Filippov 526995f486 Do not skip empty iops in listings 2021-12-10 00:52:59 +03:00
Vitaliy Filippov 073b505928 Package Proxmox plugin as pve-storage-vitastor 2021-12-10 00:22:45 +03:00
Vitaliy Filippov a8b21a22d0 Add patch for pve-qemu 6.1 2021-12-09 02:57:43 +03:00
Vitaliy Filippov 0b1ffba62b Add Proxmox storage driver 2021-12-09 02:26:54 +03:00
Vitaliy Filippov 8dfbd7943c Use logical block size = 512 bytes by default 2021-12-08 23:43:40 +03:00
Vitaliy Filippov 39e7f98e54 Allow to change etcd IP in tests 2021-12-08 23:00:48 +03:00
Vitaliy Filippov 3a83a32cb7 Aaand now fix create --parent :D 2021-12-08 23:00:34 +03:00
Vitaliy Filippov 20d5ed799a Add glob pattern matching for ls 2021-12-08 23:00:34 +03:00
Vitaliy Filippov b262938bca Fix naggy "Failed to get RDMA device list: Unknown error -38" 2021-12-08 02:02:30 +03:00
Vitaliy Filippov 7e54242251 Add patches for Proxmox QEMU 5.1 and 52 2021-12-05 17:45:01 +03:00
Vitaliy Filippov c3c2e68cc1 Now fix resize command :D 2021-12-05 01:38:08 +03:00
Vitaliy Filippov aa1e21dd99 Release 0.6.9
New features:
- Build Vitastor driver as part of QEMU
- Implement renaming images in CLI (vitastor-cli modify --rename)
- Add vitastor-cli alloc-osd and simple-offsets commands and use them in make-osd,
  thus removing the dependency on etcdctl
- Make monitor remove stale deleted inode statistics from etcd automatically
- Implement OSD address selection from a subnet, thus removing the need to specify
  OSD addresses in startup scripts explicitly

Bug fixes:
- Fix client failover in case of etcd shutdown or crash (make client survive etcd failures)
- Stick to the last live etcd in OSD and mon to prevent random failures when one of etcds is down
- Fix incorrect copying of data from journal to the data device which could lead to data corruption
- Prefer local etcd IPs in OSD
- Remove the total PG count restriction in optimize_change which was sometimes leading
  to inability to redistribute PGs over OSDs
- Fix error response parsing on a failed pg state report
- Fix slow linear writes with RDMA by changing default buffer settings
- Fix possible 'TypeError' in openstack nova when using Vitastor cinder driver
- Fix bugs in vitastor-cli create, ls, rm, modify commands

Patch changes:
- Add a patch for libvirt 7.6
- Add patches for QEMU 6.0 and 6.1
- Fix config file path XML location parsing in libvirt patches
- Replace _ with - in QEMU options
- Fix possible 'TypeError' in openstack nova when using Vitastor cinder driver
- Fix possible crashes of QEMU block driver in case of incorrect options
2021-12-03 10:58:54 +03:00
Vitaliy Filippov f4b57d487f Remove +deb10u1 from libvirt version 2021-12-03 10:56:44 +03:00
Vitaliy Filippov 711ecd2f8e Add a Dockerfile to build libvirt 2021-12-03 03:00:10 +03:00
Vitaliy Filippov 9fca01dc62 Add a forgotten return statement 2021-12-03 00:41:49 +03:00
Vitaliy Filippov 0bd3a94efd Use qdict_get_try_int because qdict_get_int may segfault on a missing key 2021-12-03 00:22:17 +03:00
Vitaliy Filippov 9ffdeef93b Install the built liburing version in el8 dockerfile 2021-12-03 00:04:18 +03:00
Vitaliy Filippov 589892d501 Fix rpm dockerfiles 2021-12-02 10:19:47 +03:00
Vitaliy Filippov 5fe3a40416 More fixes for QEMU 2.x :) 2021-12-02 02:25:50 +03:00
Vitaliy Filippov a453db9c8e An attempt to automatically build patched specs inside Docker is mostly broken for now 2021-12-02 01:51:48 +03:00
Vitaliy Filippov e6498a52ca Rename 4.2 el7 spec patch 2021-12-02 01:47:44 +03:00
Vitaliy Filippov 4bc41aed9d Add patches for QEMU 6.0 and for 6.0 RPM spec 2021-12-02 01:47:20 +03:00
Vitaliy Filippov 4da51f9c4c Update QEMU 3.1 patch 2021-12-02 01:29:21 +03:00
Vitaliy Filippov c6cee6f734 Update QEMU 5.0 patch 2021-12-02 01:27:00 +03:00
Vitaliy Filippov 6fc08f5581 Update CentOS 8 QEMU 4.2 spec patch 2021-12-02 01:23:43 +03:00
Vitaliy Filippov 15957b7d13 Update QEMU 4.2 patch and CentOS 7 QEMU 4.2 spec patch 2021-12-02 01:03:19 +03:00
Vitaliy Filippov 09a3987e83 Remove vitastor-qemu from RPM specs 2021-12-01 23:42:20 +03:00
Vitaliy Filippov cd6820c439 Update QEMU 5.1/5.2 patch to include internal vitastor driver 2021-12-01 02:08:02 +03:00
Vitaliy Filippov dcd8f5e76c Remove qemu shenanigans from vitastor build dockerfile 2021-12-01 02:00:14 +03:00
Vitaliy Filippov 5859f913fc Fix client failover in case of etcd shutdown or crash 2021-12-01 00:33:02 +03:00
Vitaliy Filippov cac6a1d8d1 Don't need to download fio in qemu dockerfile anymore 2021-11-30 22:24:48 +03:00
Vitaliy Filippov a0c32e7de9 Update patch for libvirt 7.6 2021-11-30 10:23:51 +03:00
Vitaliy Filippov 8b37610dd0 Rename patch to Nova 23 because it's actually closer 2021-11-30 02:05:24 +03:00
Vitaliy Filippov ae82ca3b08 Pass config path in <config file="" /> element instead of an attribute
libvirt has always had config in the element for Ceph, we don't want to break it
2021-11-30 02:02:07 +03:00
Vitaliy Filippov 92362027a8 Build vitastor driver as part of the QEMU package by default
Old behaviour can be restored with cmake var WITH_QEMU=true
2021-11-29 02:05:26 +03:00
Vitaliy Filippov c4aeeda143 Fix index removal in vitastor-cli rm 2021-11-29 02:00:05 +03:00
Vitaliy Filippov 24f0f8278a Fix modify --readwrite 2021-11-29 01:52:21 +03:00
Vitaliy Filippov 95496d0845 Implement renaming images in CLI (vitastor-cli modify --rename) 2021-11-28 22:38:57 +03:00
Vitaliy Filippov 94b1f09ef2 Create snapshots in the same pool by default 2021-11-28 21:50:42 +03:00
Vitaliy Filippov 32b1312abb Remove stale deleted inode statistics in monitor 2021-11-28 21:02:05 +03:00
Vitaliy Filippov d5c8fde5de Remove kludgy $IP and $ETCD_MON parsing from make-osd.sh, suggest to use vitastor.conf 2021-11-28 18:27:05 +03:00
Vitaliy Filippov 7a0b5212fe Exit if unable to restart watches
FIXME: It's probably not OK for the client to exit in this case
2021-11-28 01:43:31 +03:00
Vitaliy Filippov a8f5c71ae8 Use the same etcd address selection algorithm in the monitor 2021-11-28 01:19:42 +03:00
Vitaliy Filippov ce5b6253ab Make OSDs stick to the last successful etcd address
Previously OSDs were selecting a new random etcd from the cluster
on every request so they were failing randomly when part of etcds was down
2021-11-27 23:48:56 +03:00
Vitaliy Filippov 8398ad0117 Fix #36 - Fix old version data sometimes overriding new version data
Reproduction case:
- v3 = (offset 4kb, length 16kb)
- v2 = (offset 24kb, length 16kb)
- v1 = (offset 16kb, length 16kb)
- At the third step it was inserting 16..24kb instead of 20..24kb
2021-11-27 01:17:45 +03:00
Vitaliy Filippov fea451b4db Prefer local etcd in OSD 2021-11-27 00:36:53 +03:00
Vitaliy Filippov 6e12aca53b Remove the total PG count restriction in optimize_change which was leading to unfeasible problems sometimes 2021-11-26 23:05:37 +03:00
Vitaliy Filippov 8b007d531f
Merge pull request #33 from moly7x/fix-TypeError
Fixed TypeError
2021-11-25 14:47:31 +03:00
Vitaliy Filippov 7b7f20fb89
Merge pull request #34 from mirrorll/master
report pg state failed
2021-11-25 10:26:42 +03:00
Vitaliy Filippov 300d507026 Fix capture of out in alloc_osd 2021-11-25 10:20:01 +03:00
harley 6886171289
report pg state failed
after report pg state failed parse response error
2021-11-25 09:34:34 +08:00
Vitaliy Filippov 43f8ea47a0 Ok, something is not allowed somewhere in C99 2021-11-24 11:28:10 +03:00
Vitaliy Filippov 6e0e172e15 Implement OSD address selection from a specified subnet 2021-11-23 21:59:26 +03:00
Vitaliy Filippov 655a2c871d Move make-osd.sh into vitastor-client package 2021-11-21 16:33:33 +03:00
Vitaliy Filippov 879fe9b2b4 Add a patch for qemu 6.1 and replace _ with - in qemu options 2021-11-21 16:24:30 +03:00
Tân Lê b4235b4edf
Fixed TypeError
Fix `TypeError: Argument must be bytes or unicode, got 'int'` in nova-compute.
2021-11-19 13:37:47 +07:00
94 changed files with 5029 additions and 994 deletions

View File

@ -2,6 +2,6 @@ cmake_minimum_required(VERSION 2.8)
project(vitastor)
set(VERSION "0.6.8")
set(VERSION "0.6.10")
add_subdirectory(src)

View File

@ -51,13 +51,14 @@ Vitastor на данный момент находится в статусе п
- Базовая поддержка OpenStack: драйвер Cinder, патчи для Nova и libvirt
- Слияние снапшотов (vitastor-cli {snap-rm,flatten,merge})
- Консольный интерфейс для управления образами (vitastor-cli {ls,create,modify})
- Плагин для Proxmox
## Планы развития
- Поддержка удаления снапшотов (слияния слоёв)
- Более корректные скрипты разметки дисков и автоматического запуска OSD
- Другие инструменты администрирования
- Плагины для OpenNebula, Proxmox и других облачных систем
- Плагины для OpenNebula и других облачных систем
- iSCSI-прокси
- Более быстрое переключение при отказах
- Фоновая проверка целостности без контрольных сумм (сверка реплик)
@ -403,12 +404,21 @@ Vitastor с однопоточной NBD прокси на том же стен
в этом случае пострадает.
- Быстрая сеть, минимум 10 гбит/с
- Для наилучшей производительности нужно отключить энергосбережение CPU: `cpupower idle-set -D 0 && cpupower frequency-set -g performance`.
- Пропишите нужные вам значения вверху файлов `/usr/lib/vitastor/mon/make-units.sh` и `/usr/lib/vitastor/mon/make-osd.sh`.
- Создайте юниты systemd для etcd и мониторов: `/usr/lib/vitastor/mon/make-units.sh`
- Создайте юниты для OSD: `/usr/lib/vitastor/mon/make-osd.sh /dev/disk/by-partuuid/XXX [/dev/disk/by-partuuid/YYY ...]`
- Вы можете поменять параметры OSD в юнитах systemd. Смысл некоторых параметров:
- На хостах мониторов:
- Пропишите нужные вам значения в файле `/usr/lib/vitastor/mon/make-units.sh`
- Создайте юниты systemd для etcd и мониторов: `/usr/lib/vitastor/mon/make-units.sh`
- Пропишите etcd_address и osd_network в `/etc/vitastor/vitastor.conf`. Например:
```
{
"etcd_address": ["10.200.1.10:2379","10.200.1.11:2379","10.200.1.12:2379"],
"osd_network": "10.200.1.0/24"
}
```
- Создайте юниты systemd для OSD: `/usr/lib/vitastor/make-osd.sh /dev/disk/by-partuuid/XXX [/dev/disk/by-partuuid/YYY ...]`
- Вы можете менять параметры OSD в юнитах systemd или в `vitastor.conf`. Смысл некоторых параметров:
- `disable_data_fsync 1` - отключает fsync, используется с SSD с конденсаторами.
- `immediate_commit all` - используется с SSD с конденсаторами.
Внимание: если установлено, также нужно установить его в то же значение в etcd в /vitastor/config/global
- `disable_device_lock 1` - отключает блокировку файла устройства, нужно, только если вы запускаете
несколько OSD на одном блочном устройстве.
- `flusher_count 256` - "flusher" - микропоток, удаляющий старые данные из журнала.
@ -528,6 +538,75 @@ for i in ./???-*.yaml; do kubectl apply -f $i; done
После этого вы сможете создавать PersistentVolume. Пример смотрите в файле [csi/deploy/example-pvc.yaml](csi/deploy/example-pvc.yaml).
### OpenStack
Чтобы подключить Vitastor к OpenStack:
- Установите пакеты vitastor-client, libvirt и QEMU из DEB или RPM репозитория Vitastor
- Примените патч `patches/nova-21.diff` или `patches/nova-23.diff` к вашей инсталляции Nova.
nova-21.diff подходит для Nova 21-22, nova-23.diff подходит для Nova 23-24.
- Скопируйте `patches/cinder-vitastor.py` в инсталляцию Cinder как `cinder/volume/drivers/vitastor.py`
- Создайте тип томов в cinder.conf (см. ниже)
- Обязательно заблокируйте доступ от виртуальных машин к сети Vitastor (OSD и etcd), т.к. Vitastor (пока) не поддерживает аутентификацию
- Перезапустите Cinder и Nova
Пример конфигурации Cinder:
```
[DEFAULT]
enabled_backends = lvmdriver-1, vitastor-testcluster
# ...
[vitastor-testcluster]
volume_driver = cinder.volume.drivers.vitastor.VitastorDriver
volume_backend_name = vitastor-testcluster
image_volume_cache_enabled = True
volume_clear = none
vitastor_etcd_address = 192.168.7.2:2379
vitastor_etcd_prefix =
vitastor_config_path = /etc/vitastor/vitastor.conf
vitastor_pool_id = 1
image_upload_use_cinder_backend = True
```
Чтобы помещать в Vitastor Glance-образы, нужно использовать
[https://docs.openstack.org/cinder/pike/admin/blockstorage-volume-backed-image.html](образы на основе томов Cinder),
однако, поддержка этой функции ещё не проверялась.
### Proxmox
Чтобы подключить Vitastor к Proxmox Virtual Environment (поддерживаются версии 6.4 и 7.1):
- Добавьте соответствующий Debian-репозиторий Vitastor в sources.list на хостах Proxmox
(buster для 6.4, bullseye для 7.1)
- Установите пакеты vitastor-client, pve-qemu-kvm, pve-storage-vitastor (* или см. сноску) из репозитория Vitastor
- Определите тип хранилища в `/etc/pve/storage.cfg` (см. ниже)
- Обязательно заблокируйте доступ от виртуальных машин к сети Vitastor (OSD и etcd), т.к. Vitastor (пока) не поддерживает аутентификацию
- Перезапустите демон Proxmox: `systemctl restart pvedaemon`
Пример `/etc/pve/storage.cfg` (единственная обязательная опция - vitastor_pool, все остальные
перечислены внизу для понимания значений по умолчанию):
```
vitastor: vitastor
# Пул, в который будут помещаться образы дисков
vitastor_pool testpool
# Путь к файлу конфигурации
vitastor_config_path /etc/vitastor/vitastor.conf
# Адрес(а) etcd, нужны, только если не указаны в vitastor.conf
vitastor_etcd_address 192.168.7.2:2379/v3
# Префикс ключей метаданных в etcd
vitastor_etcd_prefix /vitastor
# Префикс имён образов
vitastor_prefix pve/
# Монтировать образы через NBD прокси, через ядро (нужно только для контейнеров)
vitastor_nbd 0
```
\* Примечание: вместо установки пакета pve-storage-vitastor вы можете вручную скопировать файл
[patches/PVE_VitastorPlugin.pm](patches/PVE_VitastorPlugin.pm) на хосты Proxmox как
`/usr/share/perl5/PVE/Storage/Custom/VitastorPlugin.pm`.
## Известные проблемы
- Запросы удаления объектов могут в данный момент приводить к "неполным" объектам в EC-пулах,

View File

@ -45,6 +45,7 @@ breaking changes in the future. However, the following is implemented:
- Basic OpenStack support: Cinder driver, Nova and libvirt patches
- Snapshot merge tool (vitastor-cli {snap-rm,flatten,merge})
- Image management CLI (vitastor-cli {ls,create,modify})
- Proxmox storage plugin
## Roadmap
@ -356,13 +357,21 @@ and calculate disk offsets almost by hand. This will be fixed in near future.
with lazy fsync, but prepare for inferior single-thread latency.
- Get a fast network (at least 10 Gbit/s).
- Disable CPU powersaving: `cpupower idle-set -D 0 && cpupower frequency-set -g performance`.
- Check `/usr/lib/vitastor/mon/make-units.sh` and `/usr/lib/vitastor/mon/make-osd.sh` and
put desired values into the variables at the top of these files.
- Create systemd units for the monitor and etcd: `/usr/lib/vitastor/mon/make-units.sh`
- On the monitor hosts:
- Edit variables at the top of `/usr/lib/vitastor/mon/make-units.sh` to desired values.
- Create systemd units for the monitor and etcd: `/usr/lib/vitastor/mon/make-units.sh`
- Put etcd_address and osd_network into `/etc/vitastor/vitastor.conf`. Example:
```
{
"etcd_address": ["10.200.1.10:2379","10.200.1.11:2379","10.200.1.12:2379"],
"osd_network": "10.200.1.0/24"
}
```
- Create systemd units for your OSDs: `/usr/lib/vitastor/mon/make-osd.sh /dev/disk/by-partuuid/XXX [/dev/disk/by-partuuid/YYY ...]`
- You can edit the units and change OSD configuration. Notable configuration variables:
- You can change OSD configuration in units or in `vitastor.conf`. Notable configuration variables:
- `disable_data_fsync 1` - only safe with server-grade drives with capacitors.
- `immediate_commit all` - use this if all your drives are server-grade.
If all OSDs have it set to all then you should also put the same value in etcd into /vitastor/config/global
- `disable_device_lock 1` - only required if you run multiple OSDs on one block device.
- `flusher_count 256` - flusher is a micro-thread that removes old data from the journal.
You don't have to worry about this parameter anymore, 256 is enough.
@ -478,6 +487,73 @@ for i in ./???-*.yaml; do kubectl apply -f $i; done
After that you'll be able to create PersistentVolumes. See example in [csi/deploy/example-pvc.yaml](csi/deploy/example-pvc.yaml).
### OpenStack
To enable Vitastor support in an OpenStack installation:
- Install vitastor-client, patched QEMU and libvirt packages from Vitastor DEB or RPM repository
- Use `patches/nova-21.diff` or `patches/nova-23.diff` to patch your Nova installation.
Patch 21 fits Nova 21-22, patch 23 fits Nova 23-24.
- Install `patches/cinder-vitastor.py` as `..../cinder/volume/drivers/vitastor.py`
- Define a volume type in cinder.conf (see below)
- Block network access from VMs to Vitastor network (to OSDs and etcd), because Vitastor doesn't support authentication (yet)
- Restart Cinder and Nova
Cinder volume type configuration example:
```
[DEFAULT]
enabled_backends = lvmdriver-1, vitastor-testcluster
# ...
[vitastor-testcluster]
volume_driver = cinder.volume.drivers.vitastor.VitastorDriver
volume_backend_name = vitastor-testcluster
image_volume_cache_enabled = True
volume_clear = none
vitastor_etcd_address = 192.168.7.2:2379
vitastor_etcd_prefix =
vitastor_config_path = /etc/vitastor/vitastor.conf
vitastor_pool_id = 1
image_upload_use_cinder_backend = True
```
To put Glance images in Vitastor, use [https://docs.openstack.org/cinder/pike/admin/blockstorage-volume-backed-image.html](volume-backed images),
although the support has not been verified yet.
### Proxmox
To enable Vitastor support in Proxmox Virtual Environment (6.4 and 7.1 are supported):
- Add the corresponding Vitastor Debian repository into sources.list on Proxmox hosts
(buster for 6.4, bullseye for 7.1)
- Install vitastor-client, pve-qemu-kvm, pve-storage-vitastor (* or see note) packages from Vitastor repository
- Define storage in `/etc/pve/storage.cfg` (see below)
- Block network access from VMs to Vitastor network (to OSDs and etcd), because Vitastor doesn't support authentication (yet)
- Restart pvedaemon: `systemctl restart pvedaemon`
`/etc/pve/storage.cfg` example (the only required option is vitastor_pool, all others
are listed below with their default values):
```
vitastor: vitastor
# pool to put new images into
vitastor_pool testpool
# path to the configuration file
vitastor_config_path /etc/vitastor/vitastor.conf
# etcd address(es), required only if missing in the configuration file
vitastor_etcd_address 192.168.7.2:2379/v3
# prefix for keys in etcd
vitastor_etcd_prefix /vitastor
# prefix for images
vitastor_prefix pve/
# use NBD mounter (only required for containers)
vitastor_nbd 0
```
\* Note: you can also manually copy [patches/PVE_VitastorPlugin.pm](patches/PVE_VitastorPlugin.pm) to Proxmox hosts
as `/usr/share/perl5/PVE/Storage/Custom/VitastorPlugin.pm` instead of installing pve-storage-vitastor.
## Known Problems
- Object deletion requests may currently lead to 'incomplete' objects in EC pools

View File

@ -1,4 +1,4 @@
VERSION ?= v0.6.8
VERSION ?= v0.6.10
all: build push

View File

@ -49,7 +49,7 @@ spec:
capabilities:
add: ["SYS_ADMIN"]
allowPrivilegeEscalation: true
image: vitalif/vitastor-csi:v0.6.8
image: vitalif/vitastor-csi:v0.6.10
args:
- "--node=$(NODE_ID)"
- "--endpoint=$(CSI_ENDPOINT)"

View File

@ -116,7 +116,7 @@ spec:
privileged: true
capabilities:
add: ["SYS_ADMIN"]
image: vitalif/vitastor-csi:v0.6.8
image: vitalif/vitastor-csi:v0.6.10
args:
- "--node=$(NODE_ID)"
- "--endpoint=$(CSI_ENDPOINT)"

View File

@ -5,7 +5,7 @@ package vitastor
const (
vitastorCSIDriverName = "csi.vitastor.io"
vitastorCSIDriverVersion = "0.6.8"
vitastorCSIDriverVersion = "0.6.10"
)
// Config struct fills the parameters of request or user input

2
debian/changelog vendored
View File

@ -1,4 +1,4 @@
vitastor (0.6.8-1) unstable; urgency=medium
vitastor (0.6.10-1) unstable; urgency=medium
* RDMA support
* Bugfixes

10
debian/control vendored
View File

@ -9,7 +9,7 @@ Rules-Requires-Root: no
Package: vitastor
Architecture: amd64
Depends: vitastor-osd, vitastor-mon, vitastor-client, vitastor-client-dev, vitastor-fio, vitastor-qemu
Depends: vitastor-osd, vitastor-mon, vitastor-client, vitastor-client-dev, vitastor-fio
Description: Vitastor, a fast software-defined clustered block storage
Vitastor is a small, simple and fast clustered block storage (storage for VM drives),
architecturally similar to Ceph which means strong consistency, primary-replication,
@ -48,8 +48,8 @@ Depends: ${shlibs:Depends}, ${misc:Depends}, vitastor-client (= ${binary:Version
Description: Vitastor, a fast software-defined clustered block storage - fio drivers
Vitastor fio drivers for benchmarking.
Package: vitastor-qemu
Package: pve-storage-vitastor
Architecture: amd64
Depends: ${shlibs:Depends}, ${misc:Depends}, vitastor-client (= ${binary:Version}), qemu (= ${dep:qemu})
Description: Vitastor, a fast software-defined clustered block storage - QEMU driver
Vitastor QEMU block device driver.
Depends: ${shlibs:Depends}, ${misc:Depends}, vitastor-client (= ${binary:Version})
Description: Vitastor Proxmox Virtual Environment storage plugin
Vitastor storage plugin for Proxmox Virtual Environment.

40
debian/libvirt.Dockerfile vendored Normal file
View File

@ -0,0 +1,40 @@
# Build patched libvirt for Debian Buster or Bullseye/Sid inside a container
# cd ..; podman build --build-arg REL=bullseye -v `pwd`/packages:/root/packages -f debian/libvirt.Dockerfile .
ARG REL=
FROM debian:$REL
ARG REL=
WORKDIR /root
RUN if [ "$REL" = "buster" -o "$REL" = "bullseye" ]; then \
echo "deb http://deb.debian.org/debian $REL-backports main" >> /etc/apt/sources.list; \
echo >> /etc/apt/preferences; \
echo 'Package: *' >> /etc/apt/preferences; \
echo "Pin: release a=$REL-backports" >> /etc/apt/preferences; \
echo 'Pin-Priority: 500' >> /etc/apt/preferences; \
fi; \
grep '^deb ' /etc/apt/sources.list | perl -pe 's/^deb/deb-src/' >> /etc/apt/sources.list; \
echo 'APT::Install-Recommends false;' >> /etc/apt/apt.conf; \
echo 'APT::Install-Suggests false;' >> /etc/apt/apt.conf
RUN apt-get update; apt-get -y install devscripts
RUN apt-get -y build-dep libvirt0
RUN apt-get -y install libglusterfs-dev
RUN apt-get --download-only source libvirt
ADD patches/libvirt-5.0-vitastor.diff patches/libvirt-7.0-vitastor.diff patches/libvirt-7.5-vitastor.diff patches/libvirt-7.6-vitastor.diff /root
RUN set -e; \
mkdir -p /root/packages/libvirt-$REL; \
rm -rf /root/packages/libvirt-$REL/*; \
cd /root/packages/libvirt-$REL; \
dpkg-source -x /root/libvirt*.dsc; \
D=$(ls -d libvirt-*/); \
V=$(ls -d libvirt-*/ | perl -pe 's/libvirt-(\d+\.\d+).*/$1/'); \
cp /root/libvirt-$V-vitastor.diff $D/debian/patches; \
echo libvirt-$V-vitastor.diff >> $D/debian/patches/series; \
cd $D; \
V=$(head -n1 debian/changelog | perl -pe 's/^.*\((.*?)(~bpo[\d\+]*)?(\+deb[u\d]+)?\).*$/$1/')+vitastor2; \
DEBEMAIL="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v $V 'Add Vitastor support'; \
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage --jobs=auto -sa; \
rm -rf /root/packages/libvirt-$REL/$D

View File

@ -7,11 +7,11 @@ ARG REL=
WORKDIR /root
RUN if [ "$REL" = "buster" ]; then \
echo 'deb http://deb.debian.org/debian buster-backports main' >> /etc/apt/sources.list; \
RUN if [ "$REL" = "buster" -o "$REL" = "bullseye" ]; then \
echo "deb http://deb.debian.org/debian $REL-backports main" >> /etc/apt/sources.list; \
echo >> /etc/apt/preferences; \
echo 'Package: *' >> /etc/apt/preferences; \
echo 'Pin: release a=buster-backports' >> /etc/apt/preferences; \
echo "Pin: release a=$REL-backports" >> /etc/apt/preferences; \
echo 'Pin-Priority: 500' >> /etc/apt/preferences; \
fi; \
grep '^deb ' /etc/apt/sources.list | perl -pe 's/^deb/deb-src/' >> /etc/apt/sources.list; \
@ -21,28 +21,41 @@ RUN if [ "$REL" = "buster" ]; then \
RUN apt-get update
RUN apt-get -y install qemu fio liburing1 liburing-dev libgoogle-perftools-dev devscripts
RUN apt-get -y build-dep qemu
RUN apt-get -y build-dep fio
# To build a custom version
#RUN cp /root/packages/qemu-orig/* /root
RUN apt-get --download-only source qemu
RUN apt-get --download-only source fio
ADD patches/qemu-5.0-vitastor.patch patches/qemu-5.1-vitastor.patch /root/vitastor/patches/
ADD patches/qemu-5.0-vitastor.patch patches/qemu-5.1-vitastor.patch patches/qemu-6.1-vitastor.patch src/qemu_driver.c /root/vitastor/patches/
RUN set -e; \
apt-get install -y wget; \
wget -q -O /etc/apt/trusted.gpg.d/vitastor.gpg https://vitastor.io/debian/pubkey.gpg; \
(echo deb http://vitastor.io/debian $REL main > /etc/apt/sources.list.d/vitastor.list); \
(echo "APT::Install-Recommends false;" > /etc/apt/apt.conf) && \
apt-get update; \
apt-get install -y vitastor-client vitastor-client-dev quilt; \
mkdir -p /root/packages/qemu-$REL; \
rm -rf /root/packages/qemu-$REL/*; \
cd /root/packages/qemu-$REL; \
dpkg-source -x /root/qemu*.dsc; \
if [ -d /root/packages/qemu-$REL/qemu-5.0 ]; then \
cp /root/vitastor/patches/qemu-5.0-vitastor.patch /root/packages/qemu-$REL/qemu-5.0/debian/patches; \
echo qemu-5.0-vitastor.patch >> /root/packages/qemu-$REL/qemu-5.0/debian/patches/series; \
if ls -d /root/packages/qemu-$REL/qemu-5.0*; then \
D=$(ls -d /root/packages/qemu-$REL/qemu-5.0*); \
cp /root/vitastor/patches/qemu-5.0-vitastor.patch $D/debian/patches; \
echo qemu-5.0-vitastor.patch >> $D/debian/patches/series; \
elif ls /root/packages/qemu-$REL/qemu-6.1*; then \
D=$(ls -d /root/packages/qemu-$REL/qemu-6.1*); \
cp /root/vitastor/patches/qemu-6.1-vitastor.patch $D/debian/patches; \
echo qemu-6.1-vitastor.patch >> $D/debian/patches/series; \
else \
cp /root/vitastor/patches/qemu-5.1-vitastor.patch /root/packages/qemu-$REL/qemu-*/debian/patches; \
P=`ls -d /root/packages/qemu-$REL/qemu-*/debian/patches`; \
echo qemu-5.1-vitastor.patch >> $P/series; \
fi; \
cd /root/packages/qemu-$REL/qemu-*/; \
quilt push -a; \
quilt add block/vitastor.c; \
cp /root/vitastor/patches/qemu_driver.c block/vitastor.c; \
quilt refresh; \
V=$(head -n1 debian/changelog | perl -pe 's/^.*\((.*?)(~bpo[\d\+]*)?\).*$/$1/')+vitastor1; \
DEBFULLNAME="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v $V 'Plug Vitastor block driver'; \
DEBEMAIL="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v $V 'Plug Vitastor block driver'; \
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage --jobs=auto -sa; \
rm -rf /root/packages/qemu-$REL/qemu-*/

1
debian/pve-storage-vitastor.install vendored Normal file
View File

@ -0,0 +1 @@
patches/PVE_VitastorPlugin.pm usr/share/perl5/PVE/Storage/Custom/VitastorPlugin.pm

1
debian/qemu_version vendored
View File

@ -1 +0,0 @@
dep:qemu=1:5.2+dfsg-10+vitastor1

19
debian/raw.h vendored Normal file
View File

@ -0,0 +1,19 @@
/* Removed in Linux 5.14 */
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef __LINUX_RAW_H
#define __LINUX_RAW_H
#include <linux/types.h>
#define RAW_SETBIND _IO( 0xac, 0 )
#define RAW_GETBIND _IO( 0xac, 1 )
struct raw_config_request
{
int raw_minor;
__u64 block_major;
__u64 block_minor;
};
#endif /* __LINUX_RAW_H */

2
debian/rules vendored
View File

@ -6,5 +6,5 @@ export DH_VERBOSE = 1
override_dh_installdeb:
cat debian/fio_version >> debian/vitastor-fio.substvars
cat debian/qemu_version >> debian/vitastor-qemu.substvars
[ -f debian/qemu_version ] && (cat debian/qemu_version >> debian/vitastor-qemu.substvars) || true
dh_installdeb

View File

@ -3,3 +3,4 @@ usr/bin/vitastor-cli
usr/bin/vitastor-rm
usr/bin/vitastor-nbd
usr/lib/*/libvitastor*.so*
mon/make-osd.sh /usr/lib/vitastor

View File

@ -1,3 +1,2 @@
usr/bin/vitastor-osd
usr/bin/vitastor-dump-journal
mon/make-osd.sh /usr/lib/vitastor

View File

@ -1 +0,0 @@
usr/lib/*/qemu/*

View File

@ -7,11 +7,11 @@ ARG REL=
WORKDIR /root
RUN if [ "$REL" = "buster" ]; then \
echo 'deb http://deb.debian.org/debian buster-backports main' >> /etc/apt/sources.list; \
RUN if [ "$REL" = "buster" -o "$REL" = "bullseye" ]; then \
echo "deb http://deb.debian.org/debian $REL-backports main" >> /etc/apt/sources.list; \
echo >> /etc/apt/preferences; \
echo 'Package: *' >> /etc/apt/preferences; \
echo 'Pin: release a=buster-backports' >> /etc/apt/preferences; \
echo "Pin: release a=$REL-backports" >> /etc/apt/preferences; \
echo 'Pin-Priority: 500' >> /etc/apt/preferences; \
fi; \
grep '^deb ' /etc/apt/sources.list | perl -pe 's/^deb/deb-src/' >> /etc/apt/sources.list; \
@ -19,10 +19,8 @@ RUN if [ "$REL" = "buster" ]; then \
echo 'APT::Install-Suggests false;' >> /etc/apt/apt.conf
RUN apt-get update
RUN apt-get -y install qemu fio liburing1 liburing-dev libgoogle-perftools-dev devscripts
RUN apt-get -y build-dep qemu
RUN apt-get -y install fio liburing1 liburing-dev libgoogle-perftools-dev devscripts
RUN apt-get -y build-dep fio
RUN apt-get --download-only source qemu
RUN apt-get --download-only source fio
RUN apt-get update && apt-get -y install libjerasure-dev cmake libibverbs-dev
@ -32,37 +30,25 @@ RUN set -e -x; \
cd /root/fio-build/; \
rm -rf /root/fio-build/*; \
dpkg-source -x /root/fio*.dsc; \
cd /root/packages/qemu-$REL/; \
rm -rf qemu*/; \
dpkg-source -x qemu*.dsc; \
cd /root/packages/qemu-$REL/qemu*/; \
debian/rules b/configure-stamp; \
cd b/qemu; \
make -j8 qapi/qapi-builtin-types.h; \
mkdir -p /root/packages/vitastor-$REL; \
rm -rf /root/packages/vitastor-$REL/*; \
cd /root/packages/vitastor-$REL; \
cp -r /root/vitastor vitastor-0.6.8; \
ln -s /root/packages/qemu-$REL/qemu-*/ vitastor-0.6.8/qemu; \
ln -s /root/fio-build/fio-*/ vitastor-0.6.8/fio; \
cd vitastor-0.6.8; \
cp -r /root/vitastor vitastor-0.6.10; \
cd vitastor-0.6.10; \
ln -s /root/fio-build/fio-*/ ./fio; \
FIO=$(head -n1 fio/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
QEMU=$(head -n1 qemu/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
sh copy-qemu-includes.sh; \
ls /usr/include/linux/raw.h || cp ./debian/raw.h /usr/include/linux/raw.h; \
sh copy-fio-includes.sh; \
rm qemu fio; \
rm fio; \
mkdir -p a b debian/patches; \
mv qemu-copy b/qemu; \
mv fio-copy b/fio; \
diff -NaurpbB a b > debian/patches/qemu-fio-headers.patch || true; \
echo qemu-fio-headers.patch >> debian/patches/series; \
diff -NaurpbB a b > debian/patches/fio-headers.patch || true; \
echo fio-headers.patch >> debian/patches/series; \
rm -rf a b; \
rm -rf /root/packages/qemu-$REL/qemu*/; \
echo "dep:fio=$FIO" > debian/fio_version; \
echo "dep:qemu=$QEMU" > debian/qemu_version; \
cd /root/packages/vitastor-$REL; \
tar --sort=name --mtime='2020-01-01' --owner=0 --group=0 --exclude=debian -cJf vitastor_0.6.8.orig.tar.xz vitastor-0.6.8; \
cd vitastor-0.6.8; \
tar --sort=name --mtime='2020-01-01' --owner=0 --group=0 --exclude=debian -cJf vitastor_0.6.10.orig.tar.xz vitastor-0.6.10; \
cd vitastor-0.6.10; \
V=$(head -n1 debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
DEBFULLNAME="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v "$V""$REL" "Rebuild for $REL"; \
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage --jobs=auto -sa; \

View File

@ -293,11 +293,6 @@ async function optimize_change({ prev_pgs: prev_int_pgs, osd_tree, pg_size = 3,
lp += 'max: '+all_pg_names.map(pg_name => (
prev_weights[pg_name] ? `${pg_size+1}*add_${pg_name} - ${pg_size+1}*del_${pg_name}` : `${pg_size+1-move_weights[pg_name]}*${pg_name}`
)).join(' + ')+';\n';
lp += all_pg_names
.map(pg_name => (prev_weights[pg_name] ? `add_${pg_name} - del_${pg_name}` : `${pg_name}`))
.join(' + ')+' = '+(pg_count
- Object.keys(prev_weights).reduce((a, old_pg_name) => (a + (all_pgs_hash[old_pg_name] ? prev_weights[old_pg_name] : 0)), 0)
)+';\n';
for (const osd in pg_per_osd)
{
if (osd !== NO_OSD)

View File

@ -4,18 +4,16 @@
# Copyright (c) Vitaliy Filippov, 2019+
# License: MIT
# USAGE: ./make-osd.sh /dev/disk/by-partuuid/xxx [ /dev/disk/by-partuuid/yyy]...
IP_SUBSTR="10.200.1."
ETCD_HOSTS="etcd0=http://10.200.1.10:2380,etcd1=http://10.200.1.11:2380,etcd2=http://10.200.1.12:2380"
# USAGE:
# 1) Put etcd_address and osd_network into /etc/vitastor/vitastor.conf. Example:
# {
# "etcd_address":["http://10.200.1.10:2379/v3","http://10.200.1.11:2379/v3","http://10.200.1.12:2379/v3"],
# "osd_network":"10.200.1.0/24"
# }
# 2) Run ./make-osd.sh /dev/disk/by-partuuid/xxx [ /dev/disk/by-partuuid/yyy]...
set -e -x
IP=`ip -json a s | jq -r '.[].addr_info[] | select(.local | startswith("'$IP_SUBSTR'")) | .local'`
[ "$IP" != "" ] || exit 1
ETCD_MON=$(echo $ETCD_HOSTS | perl -pe 's/:2380/:2379/g; s/etcd\d*=//g;')
D=`dirname $0`
# Create OSDs on all passed devices
for DEV in $*; do
@ -39,8 +37,6 @@ LimitNOFILE=1048576
LimitNPROC=1048576
LimitMEMLOCK=infinity
ExecStart=/usr/bin/vitastor-osd \\
--etcd_address $IP:2379/v3 \\
--bind_address $IP \\
--osd_num $OSD_NUM \\
--disable_data_fsync 1 \\
--immediate_commit all \\

View File

@ -9,17 +9,18 @@ const options = {};
for (let i = 2; i < process.argv.length; i++)
{
if (process.argv[i].substr(0, 2) == '--')
if (process.argv[i] === '-h' || process.argv[i] === '--help')
{
console.error('USAGE: '+process.argv[0]+' '+process.argv[1]+' [--verbose 1]'+
' [--etcd_address "http://127.0.0.1:2379,..."] [--config_file /etc/vitastor/vitastor.conf]'+
' [--etcd_prefix "/vitastor"] [--etcd_start_timeout 5]');
process.exit();
}
else if (process.argv[i].substr(0, 2) == '--')
{
options[process.argv[i].substr(2)] = process.argv[i+1];
i++;
}
}
if (!options.etcd_url)
{
console.error('USAGE: '+process.argv[0]+' '+process.argv[1]+' --etcd_url "http://127.0.0.1:2379,..." --etcd_prefix "/vitastor" --etcd_start_timeout 5 [--verbose 1]');
process.exit();
}
new Mon(options).start().catch(e => { console.error(e); process.exit(); });
new Mon(options).start().catch(e => { console.error(e); process.exit(1); });

View File

@ -1,6 +1,7 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 (see README.md for details)
const fs = require('fs');
const http = require('http');
const crypto = require('crypto');
const os = require('os');
@ -85,6 +86,7 @@ const etcd_tree = {
// osd
etcd_report_interval: 5,
run_primary: true,
osd_network: null, // "192.168.7.0/24" or an array of masks
bind_address: "0.0.0.0",
bind_port: 0,
autosync_interval: 5,
@ -322,24 +324,53 @@ class Mon
{
constructor(config)
{
// FIXME: Maybe prefer local etcd
this.etcd_urls = [];
for (let url of config.etcd_url.split(/,/))
this.die = (e) => this._die(e);
if (fs.existsSync(config.config_path||'/etc/vitastor/vitastor.conf'))
{
let scheme = 'http';
url = url.trim().replace(/^(https?):\/\//, (m, m1) => { scheme = m1; return ''; });
if (!/\/[^\/]/.exec(url))
url += '/v3';
this.etcd_urls.push(scheme+'://'+url);
config = {
...JSON.parse(fs.readFileSync(config.config_path||'/etc/vitastor/vitastor.conf', { encoding: 'utf-8' })),
...config,
};
}
this.parse_etcd_addresses(config.etcd_address||config.etcd_url);
this.verbose = config.verbose || 0;
this.initConfig = config;
this.config = {};
this.etcd_prefix = config.etcd_prefix || '/vitastor';
this.etcd_prefix = this.etcd_prefix.replace(/\/\/+/g, '/').replace(/^\/?(.*[^\/])\/?$/, '/$1');
this.etcd_start_timeout = (config.etcd_start_timeout || 5) * 1000;
this.state = JSON.parse(JSON.stringify(this.constructor.etcd_tree));
this.signals_set = false;
this.on_stop_cb = () => this.on_stop().catch(console.error);
this.on_stop_cb = () => this.on_stop(0).catch(console.error);
}
parse_etcd_addresses(addrs)
{
const is_local_ip = this.local_ips(true).reduce((a, c) => { a[c] = true; return a; }, {});
this.etcd_local = [];
this.etcd_urls = [];
this.selected_etcd_url = null;
this.etcd_urls_to_try = [];
if (!(addrs instanceof Array))
addrs = addrs ? (''+(addrs||'')).split(/,/) : [];
if (!addrs.length)
{
console.error('Vitastor etcd address(es) not specified. Please set on the command line or in the config file');
process.exit(1);
}
for (let url of addrs)
{
let scheme = 'http';
url = url.trim().replace(/^(https?):\/\//, (m, m1) => { scheme = m1; return ''; });
const slash = url.indexOf('/');
const colon = url.indexOf(':');
const is_local = is_local_ip[colon >= 0 ? url.substr(0, colon) : (slash >= 0 ? url.substr(0, slash) : url)];
url = scheme+'://'+(slash >= 0 ? url : url+'/v3');
if (is_local)
this.etcd_local.push(url);
else
this.etcd_urls.push(url);
}
}
async start()
@ -410,6 +441,31 @@ class Mon
}
}
pick_next_etcd()
{
if (this.selected_etcd_url)
return this.selected_etcd_url;
if (!this.etcd_urls_to_try || !this.etcd_urls_to_try.length)
{
this.etcd_urls_to_try = [ ...this.etcd_local ];
const others = [ ...this.etcd_urls ];
while (others.length)
{
const url = others.splice(0|(others.length*Math.random()), 1);
this.etcd_urls_to_try.push(url[0]);
}
}
this.selected_etcd_url = this.etcd_urls_to_try.shift();
return this.selected_etcd_url;
}
restart_watcher(cur_addr)
{
if (this.selected_etcd_url == cur_addr)
this.selected_etcd_url = null;
this.start_watcher(this.config.etcd_mon_retries).catch(this.die);
}
async start_watcher(retries)
{
let retry = 0;
@ -419,7 +475,8 @@ class Mon
}
while (retries < 0 || retry < retries)
{
const base = 'ws'+this.etcd_urls[Math.floor(Math.random()*this.etcd_urls.length)].substr(4);
const cur_addr = this.pick_next_etcd();
const base = 'ws'+cur_addr.substr(4);
const ok = await new Promise((ok, no) =>
{
const timer_id = setTimeout(() =>
@ -442,9 +499,9 @@ class Mon
});
});
if (ok)
{
break;
}
if (this.selected_etcd_url == cur_addr)
this.selected_etcd_url = null;
this.ws = null;
retry++;
}
@ -452,6 +509,8 @@ class Mon
{
this.die('Failed to open etcd watch websocket');
}
const cur_addr = this.selected_etcd_url;
this.ws.on('error', () => this.restart_watcher(cur_addr));
this.ws.send(JSON.stringify({
create_request: {
key: b64(this.etcd_prefix+'/'),
@ -471,12 +530,25 @@ class Mon
catch (e)
{
}
if (!data || !data.result || !data.result.events)
if (!data || !data.result)
{
if (!data || !data.result || !data.result.watch_id)
console.error('Unknown message received from watch websocket: '+msg);
}
else if (data.result.canceled)
{
// etcd watch canceled
if (data.result.compact_revision)
{
console.error('Garbage received from watch websocket: '+msg);
// we may miss events if we proceed
console.error('Revisions before '+data.result.compact_revision+' were compacted by etcd, exiting');
this.on_stop(1);
}
console.error('Watch canceled by etcd, reason: '+data.result.cancel_reason+', exiting');
this.on_stop(1);
}
else if (data.result.created)
{
// etcd watch created
}
else
{
@ -509,7 +581,7 @@ class Mon
}
if (pg_states_changed)
{
this.save_last_clean().catch(console.error);
this.save_last_clean().catch(this.die);
}
if (stats_changed)
{
@ -580,11 +652,11 @@ class Mon
}
}
async on_stop()
async on_stop(status)
{
clearInterval(this.lease_timer);
await this.etcd_call('/lease/revoke', { ID: this.etcd_lease_id }, this.config.etcd_mon_timeout, this.config.etcd_mon_retries);
process.exit(0);
process.exit(status);
}
async become_master()
@ -637,10 +709,13 @@ class Mon
for (const node_id in this.state.config.node_placement||{})
{
const node_cfg = this.state.config.node_placement[node_id];
if (!node_id || /^\d/.exec(node_id) ||
!node_cfg.level || !levels[node_cfg.level])
if (/^\d+$/.exec(node_id))
{
// All nodes must have non-empty non-numeric IDs and valid levels
node_cfg.level = 'osd';
}
if (!node_id || !node_cfg.level || !levels[node_cfg.level])
{
// All nodes must have non-empty IDs and valid levels
continue;
}
tree[node_id] = { id: node_id, level: node_cfg.level, parent: node_cfg.parent, children: [] };
@ -673,10 +748,10 @@ class Mon
.reduce((a, c) => { a[c] = true; return a; }, {});
}
delete tree[osd_num].children;
if (!tree[tree[osd_num].parent])
if (!tree[stat.host])
{
tree[tree[osd_num].parent] = {
id: tree[osd_num].parent,
tree[stat.host] = {
id: stat.host,
level: 'host',
parent: null,
children: [],
@ -1195,7 +1270,7 @@ class Mon
this.recheck_timer = setTimeout(() =>
{
this.recheck_timer = null;
this.recheck_pgs().catch(console.error);
this.recheck_pgs().catch(this.die);
}, this.config.mon_change_timeout || 1000);
}
@ -1342,6 +1417,7 @@ class Mon
{
for (const inode_num in inode_stats[pool_id])
{
let nonzero = inode_stats[pool_id][inode_num].raw_used > 0;
for (const op of [ 'read', 'write', 'delete' ])
{
const op_st = inode_stats[pool_id][inode_num][op];
@ -1349,6 +1425,13 @@ class Mon
op_st.bps = prev_st ? (op_st.bytes - prev_st.bytes) * 1000n / tm : 0;
op_st.iops = prev_st ? (op_st.count - prev_st.count) * 1000n / tm : 0;
op_st.lat = prev_st ? (op_st.usec - prev_st.usec) / ((op_st.count - prev_st.count) || 1n) : 0;
if (op_st.bps > 0 || op_st.iops > 0 || op_st.lat > 0)
nonzero = true;
}
if (!nonzero && (!this.state.config.inode[pool_id] || !this.state.config.inode[pool_id][inode_num]))
{
// Deleted inode (no data, no I/O, no config)
delete inode_stats[pool_id][inode_num];
}
}
}
@ -1397,6 +1480,18 @@ class Mon
} });
}
}
for (const pool_id in this.state.inode.stats)
{
for (const inode_num in this.state.inode.stats[pool_id])
{
if (!inode_stats[pool_id] || !inode_stats[pool_id][inode_num])
{
txn.push({ requestDeleteRange: {
key: b64(this.etcd_prefix+'/inode/stats/'+pool_id+'/'+inode_num),
} });
}
}
}
for (const pool_id in this.state.pool.stats)
{
const pool_stats = { ...this.state.pool.stats[pool_id] };
@ -1462,7 +1557,7 @@ class Mon
cur[key_parts[key_parts.length-1]] = kv.value;
if (key === 'config/global')
{
this.config = this.state.config.global;
this.config = { ...this.initConfig, ...this.state.config.global };
this.check_config();
for (const osd_num in this.state.osd.stats)
{
@ -1499,12 +1594,15 @@ class Mon
}
while (retries < 0 || retry < retries)
{
const base = this.etcd_urls[Math.floor(Math.random()*this.etcd_urls.length)];
retry++;
const base = this.pick_next_etcd();
const res = await POST(base+path, body, timeout);
if (res.error)
{
console.error('etcd returned error: '+res.error);
break;
if (this.selected_etcd_url == base)
this.selected_etcd_url = null;
console.error('failed to query etcd: '+res.error);
continue;
}
if (res.json)
{
@ -1513,26 +1611,20 @@ class Mon
console.error('etcd returned error: '+res.json.error);
break;
}
if (this.etcd_urls.length > 1)
{
// Stick to the same etcd for the rest of calls
this.etcd_urls = [ base ];
}
return res.json;
}
retry++;
}
this.die();
}
die(err)
_die(err)
{
// In fact we can just try to rejoin
console.error(new Error(err || 'Cluster connection failed'));
process.exit(1);
}
local_ips()
local_ips(all)
{
const ips = [];
const ifaces = os.networkInterfaces();
@ -1540,7 +1632,7 @@ class Mon
{
for (const iface of ifaces[ifname])
{
if (iface.family == 'IPv4' && !iface.internal)
if (iface.family == 'IPv4' && !iface.internal || all)
{
ips.push(iface.address);
}

View File

@ -0,0 +1,33 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 (see README.md for details)
const LPOptimizer = require('./lp-optimizer.js');
const osd_tree = {
100: {
1: 0.1,
2: 0.1,
3: 0.1,
},
200: {
4: 0.1,
5: 0.1,
6: 0.1,
},
};
async function run()
{
let res;
console.log('256 PGs, 3+3 OSDs, size=2');
res = await LPOptimizer.optimize_initial({ osd_tree, pg_size: 2, pg_count: 256 });
LPOptimizer.print_change_stats(res, false);
// Should NOT fail with the "unfeasible or unbounded" exception
console.log('\nRemoving osd.2');
delete osd_tree[100][2];
res = await LPOptimizer.optimize_change({ prev_pgs: res.int_pgs, osd_tree, pg_size: 2 });
LPOptimizer.print_change_stats(res, false);
}
run().catch(console.error);

View File

@ -0,0 +1,503 @@
# Install as /usr/share/perl5/PVE/Storage/Custom/VitastorPlugin.pm
# Proxmox Vitastor Driver
# Copyright (c) Vitaliy Filippov, 2021+
# License: VNPL-1.1 or GNU AGPLv3.0
package PVE::Storage::Custom::VitastorPlugin;
use strict;
use warnings;
use JSON;
use PVE::Storage::Plugin;
use PVE::Tools qw(run_command);
use base qw(PVE::Storage::Plugin);
sub api
{
# Trick it :)
return PVE::Storage->APIVER;
}
sub run_cli
{
my ($scfg, $cmd, %args) = @_;
my $retval;
my $stderr = '';
my $errmsg = $args{errmsg} ? $args{errmsg}.": " : "vitastor-cli error: ";
my $json = delete $args{json};
$json = 1 if !defined $json;
my $binary = delete $args{binary};
$binary = '/usr/bin/vitastor-cli' if !defined $binary;
if (!exists($args{errfunc}))
{
$args{errfunc} = sub
{
my $line = shift;
print STDERR $line;
*STDERR->flush();
$stderr .= $line;
};
}
if (!exists($args{outfunc}))
{
$retval = '';
$args{outfunc} = sub { $retval .= shift };
if ($json)
{
unshift @$cmd, '--json';
}
}
if ($scfg->{vitastor_etcd_address})
{
unshift @$cmd, '--etcd_address', $scfg->{vitastor_etcd_address};
}
if ($scfg->{vitastor_config_path})
{
unshift @$cmd, '--config_path', $scfg->{vitastor_config_path};
}
unshift @$cmd, $binary;
eval { run_command($cmd, %args); };
if (my $err = $@)