forked from vitalif/vitastor
Compare commits
44 Commits
rdma-zeroc...master
Author | SHA1 | Date |
---|---|---|
Vitaliy Filippov | cb282d25e0 | |
Vitaliy Filippov | 8b2a4c9539 | |
Vitaliy Filippov | b66a079892 | |
Vitaliy Filippov | e90bbe6385 | |
Vitaliy Filippov | 4be761254c | |
Vitaliy Filippov | 7a45c5f86c | |
Vitaliy Filippov | bff413584d | |
Vitaliy Filippov | bb31050ab5 | |
Vitaliy Filippov | b52dd6843a | |
Vitaliy Filippov | b66160a7ad | |
Vitaliy Filippov | 30bb602681 | |
Vitaliy Filippov | eb0a3adafc | |
Vitaliy Filippov | 24301b116c | |
Vitaliy Filippov | 1d00c17d68 | |
Vitaliy Filippov | 24f19c4b80 | |
Vitaliy Filippov | dfdf5c1f9c | |
Vitaliy Filippov | aad7792d3f | |
Vitaliy Filippov | 6ca8afffe5 | |
Vitaliy Filippov | 511a89948b | |
Vitaliy Filippov | 3de553ecd7 | |
Vitaliy Filippov | 9c45d43e74 | |
Vitaliy Filippov | 891250d355 | |
Vitaliy Filippov | f9fe72d40a | |
Vitaliy Filippov | 10ee4f7c1d | |
Vitaliy Filippov | fd8244699b | |
Vitaliy Filippov | eaac1fc5d1 | |
Vitaliy Filippov | 57be1923d3 | |
Vitaliy Filippov | c467acc388 | |
Vitaliy Filippov | bf591ba3ee | |
Vitaliy Filippov | 699a0fbbc7 | |
Vitaliy Filippov | 6b2dd50f27 | |
Vitaliy Filippov | caf2f3c56f | |
Vitaliy Filippov | 9174f188b1 | |
Vitaliy Filippov | d3978c6d0e | |
Vitaliy Filippov | 4a7365660d | |
Vitaliy Filippov | 818ae5d61d | |
Vitaliy Filippov | 6810e93c3f | |
Vitaliy Filippov | f6f35f4127 | |
Vitaliy Filippov | 72aa2fd819 | |
Vitaliy Filippov | 5010b0dd75 | |
Vitaliy Filippov | 483c5ab380 | |
Vitaliy Filippov | 6a6fd6544d | |
Vitaliy Filippov | 971aa4ae4f | |
Vitaliy Filippov | 9e6cbc6ebc |
@@ -2,6 +2,6 @@ cmake_minimum_required(VERSION 2.8)
 
 project(vitastor)
 
-set(VERSION "0.6.2")
+set(VERSION "0.6.5")
 
 add_subdirectory(src)
README-ru.md (28 changed lines)
@@ -22,7 +22,6 @@ Vitastor на данный момент находится в статусе п
 
 Однако следующее уже реализовано:
 
-0.5.x (стабильная версия):
 - Базовая часть - надёжное кластерное блочное хранилище без единой точки отказа
 - Производительность ;-D
 - Несколько схем отказоустойчивости: репликация, XOR n+1 (1 диск чётности), коды коррекции ошибок
@@ -43,24 +42,26 @@ Vitastor на данный момент находится в статусе п
 - NBD-прокси для монтирования образов ядром ("блочное устройство в режиме пользователя")
 - Утилита удаления образов/инодов (vitastor-rm)
 - Пакеты для Debian и CentOS
-
-0.6.x (master-ветка):
 - Статистика операций ввода/вывода и занятого места в разрезе инодов
 - Именование инодов через хранение их метаданных в etcd
 - Снапшоты и copy-on-write клоны
 - Сглаживание производительности случайной записи в SSD+HDD конфигурациях
+- Поддержка RDMA/RoCEv2 через libibverbs
+- CSI-плагин для Kubernetes
+- Базовая поддержка OpenStack: драйвер Cinder, патчи для Nova и libvirt
 
 ## Планы развития
 
+- Поддержка удаления снапшотов (слияния слоёв)
 - Более корректные скрипты разметки дисков и автоматического запуска OSD
 - Другие инструменты администрирования
-- Плагины для OpenStack, Kubernetes, OpenNebula, Proxmox и других облачных систем
+- Плагины для OpenNebula, Proxmox и других облачных систем
 - iSCSI-прокси
 - Более быстрое переключение при отказах
 - Фоновая проверка целостности без контрольных сумм (сверка реплик)
 - Контрольные суммы
 - Поддержка SSD-кэширования (tiered storage)
-- Поддержка RDMA и NVDIMM
+- Поддержка NVDIMM
 - Web-интерфейс
 - Возможно, сжатие
 - Возможно, поддержка кэширования данных через системный page cache
@@ -371,7 +372,7 @@ Vitastor с однопоточной NBD прокси на том же стен
 - Установите gcc и g++ 8.x или новее.
 - Склонируйте данный репозиторий с подмодулями: `git clone https://yourcmc.ru/git/vitalif/vitastor/`.
 - Желательно пересобрать QEMU с патчем, который делает необязательным запуск через LD_PRELOAD.
-См `qemu-*.*-vitastor.patch` - выберите версию, наиболее близкую вашей версии QEMU.
+См `patches/qemu-*.*-vitastor.patch` - выберите версию, наиболее близкую вашей версии QEMU.
 - Установите QEMU 3.0 или новее, возьмите исходные коды установленного пакета, начните его пересборку,
 через некоторое время остановите её и скопируйте следующие заголовки:
 - `<qemu>/include` → `<vitastor>/qemu/include`
@@ -510,6 +511,21 @@ vitastor-nbd map --etcd_address 10.115.0.10:2379/v3 --image testimg
 Для обращения по номеру инода, аналогично другим командам, можно использовать опции
 `--pool <POOL> --inode <INODE> --size <SIZE>` вместо `--image testimg`.
 
+### Kubernetes
+
+У Vitastor есть CSI-плагин для Kubernetes, поддерживающий RWO-тома.
+
+Для установки возьмите манифесты из директории [csi/deploy/](csi/deploy/), поместите
+вашу конфигурацию подключения к Vitastor в [csi/deploy/001-csi-config-map.yaml](001-csi-config-map.yaml),
+настройте StorageClass в [csi/deploy/009-storage-class.yaml](009-storage-class.yaml)
+и примените все `NNN-*.yaml` к вашей инсталляции Kubernetes.
+
+```
+for i in ./???-*.yaml; do kubectl apply -f $i; done
+```
+
+После этого вы сможете создавать PersistentVolume. Пример смотрите в файле [csi/deploy/example-pvc.yaml](csi/deploy/example-pvc.yaml).
+
 ## Известные проблемы
 
 - Запросы удаления объектов могут в данный момент приводить к "неполным" объектам в EC-пулах,
README.md (28 changed lines)
@@ -16,7 +16,6 @@ with configurable redundancy (replication or erasure codes/XOR).
 Vitastor is currently a pre-release, a lot of features are missing and you can still expect
 breaking changes in the future. However, the following is implemented:
 
-0.5.x (stable):
 - Basic part: highly-available block storage with symmetric clustering and no SPOF
 - Performance ;-D
 - Multiple redundancy schemes: Replication, XOR n+1, Reed-Solomon erasure codes
@@ -37,24 +36,26 @@ breaking changes in the future. However, the following is implemented:
 - NBD proxy for kernel mounts
 - Inode removal tool (vitastor-rm)
 - Packaging for Debian and CentOS
-
-0.6.x (master):
 - Per-inode I/O and space usage statistics
 - Inode metadata storage in etcd
 - Snapshots and copy-on-write image clones
 - Write throttling to smooth random write workloads in SSD+HDD configurations
+- RDMA/RoCEv2 support via libibverbs
+- CSI plugin for Kubernetes
+- Basic OpenStack support: Cinder driver, Nova and libvirt patches
 
 ## Roadmap
 
+- Snapshot deletion (layer merge) support
 - Better OSD creation and auto-start tools
 - Other administrative tools
-- Plugins for OpenStack, Kubernetes, OpenNebula, Proxmox and other cloud systems
+- Plugins for OpenNebula, Proxmox and other cloud systems
 - iSCSI proxy
 - Faster failover
 - Scrubbing without checksums (verification of replicas)
 - Checksums
 - Tiered storage
-- RDMA and NVDIMM support
+- NVDIMM support
 - Web GUI
 - Compression (possibly)
 - Read caching using system page cache (possibly)
@@ -339,7 +340,7 @@ Vitastor with single-thread NBD on the same hardware:
 * For QEMU 2.0+: `<qemu>/qapi-types.h` → `<vitastor>/qemu/b/qemu/qapi-types.h`
 - `config-host.h` and `qapi` are required because they contain generated headers
 - You can also rebuild QEMU with a patch that makes LD_PRELOAD unnecessary to load vitastor driver.
-See `qemu-*.*-vitastor.patch`.
+See `patches/qemu-*.*-vitastor.patch`.
 - Install fio 3.7 or later, get its source and symlink it into `<vitastor>/fio`.
 - Build & install Vitastor with `mkdir build && cd build && cmake .. && make -j8 && make install`.
 Pay attention to the `QEMU_PLUGINDIR` cmake option - it must be set to `qemu-kvm` on RHEL.
@@ -460,6 +461,21 @@ It will output the device name, like /dev/nbd0 which you can then format and mou
 
 Again, you can use `--pool <POOL> --inode <INODE> --size <SIZE>` insteaf of `--image <IMAGE>` if you want.
 
+### Kubernetes
+
+Vitastor has a CSI plugin for Kubernetes which supports RWO volumes.
+
+To deploy it, take manifests from [csi/deploy/](csi/deploy/) directory, put your
+Vitastor configuration in [csi/deploy/001-csi-config-map.yaml](001-csi-config-map.yaml),
+configure storage class in [csi/deploy/009-storage-class.yaml](009-storage-class.yaml)
+and apply all `NNN-*.yaml` manifests to your Kubernetes installation:
+
+```
+for i in ./???-*.yaml; do kubectl apply -f $i; done
+```
+
+After that you'll be able to create PersistentVolumes. See example in [csi/deploy/example-pvc.yaml](csi/deploy/example-pvc.yaml).
+
 ## Known Problems
 
 - Object deletion requests may currently lead to 'incomplete' objects in EC pools
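Reviewer note: the `for i in ./???-*.yaml` loop added to the README relies on the shell expanding the glob in sorted order, so the three-digit prefix decides the order in which manifests are applied, and `example-pvc.yaml` is skipped because it does not match the pattern. A minimal Python sketch of that selection, assuming a made-up directory listing: only `001-csi-config-map.yaml`, `009-storage-class.yaml` and `example-pvc.yaml` are named in the README, while `000-namespace.yaml` and `003-daemonset.yaml` are hypothetical names used for illustration.

```python
from fnmatch import fnmatchcase


def manifests_in_apply_order(names):
    """Return the names matching ???-*.yaml, sorted like a shell glob in the C locale."""
    return sorted(n for n in names if fnmatchcase(n, "???-*.yaml"))


# Hypothetical listing of csi/deploy/ (two of these names are assumptions, see above).
listing = [
    "009-storage-class.yaml",
    "example-pvc.yaml",
    "001-csi-config-map.yaml",
    "003-daemonset.yaml",
    "000-namespace.yaml",
]

for name in manifests_in_apply_order(listing):
    print("kubectl apply -f", name)
```

The sketch only prints the commands; it does not call `kubectl`.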
@@ -0,0 +1,3 @@
+vitastor-csi
+go.sum
+Dockerfile
@@ -0,0 +1,32 @@
+# Compile stage
+FROM golang:buster AS build
+
+ADD go.mod /app/
+RUN cd /app; CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go mod download -x
+ADD . /app
+RUN perl -i -e '$/ = undef; while(<>) { s/\n\s*(\{\s*\n)/$1\n/g; s/\}(\s*\n\s*)else\b/$1} else/g; print; }' `find /app -name '*.go'`
+RUN cd /app; CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build -o vitastor-csi
+
+# Final stage
+FROM debian:buster
+
+LABEL maintainers="Vitaliy Filippov <vitalif@yourcmc.ru>"
+LABEL description="Vitastor CSI Driver"
+
+ENV NODE_ID=""
+ENV CSI_ENDPOINT=""
+
+RUN apt-get update && \
+    apt-get install -y wget && \
+    wget -q -O /etc/apt/trusted.gpg.d/vitastor.gpg https://vitastor.io/debian/pubkey.gpg && \
+    (echo deb http://vitastor.io/debian buster main > /etc/apt/sources.list.d/vitastor.list) && \
+    (echo deb http://deb.debian.org/debian buster-backports main > /etc/apt/sources.list.d/backports.list) && \
+    (echo "APT::Install-Recommends false;" > /etc/apt/apt.conf) && \
+    apt-get update && \
+    apt-get install -y e2fsprogs xfsprogs vitastor kmod && \
+    apt-get clean && \
+    (echo options nbd nbds_max=128 > /etc/modprobe.d/nbd.conf)
+
+COPY --from=build /app/vitastor-csi /bin/
+
+ENTRYPOINT ["/bin/vitastor-csi"]
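Reviewer note: the `perl -i` step in the compile stage rewrites the Go sources before `go build`. It moves an opening brace that sits alone on a line up to the end of the previous line, and it pulls `}` down next to an `else` that starts the following line; Go requires both forms. In the simple case below each substitution puts back the newline it consumes, so the line count stays the same. A minimal Python sketch of the same two regular expressions, for illustration only (the function name `allman_to_go_braces` is hypothetical and not part of the repository):

```python
import re


def allman_to_go_braces(src: str) -> str:
    # Move a brace that sits alone on the next line up to the previous line;
    # a newline is re-added after the brace.
    src = re.sub(r"\n\s*(\{\s*\n)", r"\1\n", src)
    # Turn "}<newline>else" into "<newline>} else" so that else follows the closing brace.
    src = re.sub(r"\}(\s*\n\s*)else\b", r"\1} else", src)
    return src


sample = "if x\n{\n    a()\n}\nelse\n{\n    b()\n}\n"
converted = allman_to_go_braces(sample)
print(converted)
```

Blank lines between a statement and its opening brace would be swallowed by the first pattern, so the line-count property holds only when the brace directly follows the statement line.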
@@ -0,0 +1,9 @@
+VERSION ?= v0.6.5
+
+all: build push
+
+build:
+	@docker build --rm -t vitalif/vitastor-csi:$(VERSION) .
+
+push:
+	@docker push vitalif/vitastor-csi:$(VERSION)
@@ -0,0 +1,5 @@
+---
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: vitastor-system
@@ -0,0 +1,9 @@
+---
+apiVersion: v1
+kind: ConfigMap
+data:
+  vitastor.conf: |-
+    {"etcd_address":"http://192.168.7.2:2379","etcd_prefix":"/vitastor"}
+metadata:
+  namespace: vitastor-system
+  name: vitastor-config
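Reviewer note: the `vitastor.conf` key of this ConfigMap carries a single-line JSON document, which the manifests further down mount at `/etc/vitastor`. A minimal Python sketch of reading such a payload, assuming nothing about how the CSI plugin itself parses the file (the helper `load_vitastor_conf` and its check for the two keys are hypothetical):

```python
import json


def load_vitastor_conf(text: str) -> dict:
    """Parse the JSON payload and check the two keys used in the ConfigMap above."""
    conf = json.loads(text)
    for key in ("etcd_address", "etcd_prefix"):
        if key not in conf:
            raise KeyError(f"missing {key!r} in vitastor.conf")
    return conf


payload = '{"etcd_address":"http://192.168.7.2:2379","etcd_prefix":"/vitastor"}'
conf = load_vitastor_conf(payload)
print(conf["etcd_address"], conf["etcd_prefix"])
```

Running a check like this before `kubectl apply` catches a malformed payload early; which keys are actually mandatory is an assumption here.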
@@ -0,0 +1,37 @@
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-nodeplugin
+---
+kind: ClusterRole
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-nodeplugin
+rules:
+  - apiGroups: [""]
+    resources: ["nodes"]
+    verbs: ["get"]
+  # allow to read Vault Token and connection options from the Tenants namespace
+  - apiGroups: [""]
+    resources: ["secrets"]
+    verbs: ["get"]
+  - apiGroups: [""]
+    resources: ["configmaps"]
+    verbs: ["get"]
+---
+kind: ClusterRoleBinding
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-nodeplugin
+subjects:
+  - kind: ServiceAccount
+    name: vitastor-csi-nodeplugin
+    namespace: vitastor-system
+roleRef:
+  kind: ClusterRole
+  name: vitastor-csi-nodeplugin
+  apiGroup: rbac.authorization.k8s.io
@@ -0,0 +1,72 @@
+---
+apiVersion: policy/v1beta1
+kind: PodSecurityPolicy
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-nodeplugin-psp
+spec:
+  allowPrivilegeEscalation: true
+  allowedCapabilities:
+    - 'SYS_ADMIN'
+  fsGroup:
+    rule: RunAsAny
+  privileged: true
+  hostNetwork: true
+  hostPID: true
+  runAsUser:
+    rule: RunAsAny
+  seLinux:
+    rule: RunAsAny
+  supplementalGroups:
+    rule: RunAsAny
+  volumes:
+    - 'configMap'
+    - 'emptyDir'
+    - 'projected'
+    - 'secret'
+    - 'downwardAPI'
+    - 'hostPath'
+  allowedHostPaths:
+    - pathPrefix: '/dev'
+      readOnly: false
+    - pathPrefix: '/run/mount'
+      readOnly: false
+    - pathPrefix: '/sys'
+      readOnly: false
+    - pathPrefix: '/lib/modules'
+      readOnly: true
+    - pathPrefix: '/var/lib/kubelet/pods'
+      readOnly: false
+    - pathPrefix: '/var/lib/kubelet/plugins/csi.vitastor.io'
+      readOnly: false
+    - pathPrefix: '/var/lib/kubelet/plugins_registry'
+      readOnly: false
+    - pathPrefix: '/var/lib/kubelet/plugins'
+      readOnly: false
+
+---
+kind: Role
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-nodeplugin-psp
+rules:
+  - apiGroups: ['policy']
+    resources: ['podsecuritypolicies']
+    verbs: ['use']
+    resourceNames: ['vitastor-csi-nodeplugin-psp']
+
+---
+kind: RoleBinding
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-nodeplugin-psp
+subjects:
+  - kind: ServiceAccount
+    name: vitastor-csi-nodeplugin
+    namespace: vitastor-system
+roleRef:
+  kind: Role
+  name: vitastor-csi-nodeplugin-psp
+  apiGroup: rbac.authorization.k8s.io
@@ -0,0 +1,140 @@
+---
+kind: DaemonSet
+apiVersion: apps/v1
+metadata:
+  namespace: vitastor-system
+  name: csi-vitastor
+spec:
+  selector:
+    matchLabels:
+      app: csi-vitastor
+  template:
+    metadata:
+      namespace: vitastor-system
+      labels:
+        app: csi-vitastor
+    spec:
+      serviceAccountName: vitastor-csi-nodeplugin
+      hostNetwork: true
+      hostPID: true
+      priorityClassName: system-node-critical
+      # to use e.g. Rook orchestrated cluster, and mons' FQDN is
+      # resolved through k8s service, set dns policy to cluster first
+      dnsPolicy: ClusterFirstWithHostNet
+      containers:
+        - name: driver-registrar
+          # This is necessary only for systems with SELinux, where
+          # non-privileged sidecar containers cannot access unix domain socket
+          # created by privileged CSI driver container.
+          securityContext:
+            privileged: true
+          image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.2.0
+          args:
+            - "--v=5"
+            - "--csi-address=/csi/csi.sock"
+            - "--kubelet-registration-path=/var/lib/kubelet/plugins/csi.vitastor.io/csi.sock"
+          env:
+            - name: KUBE_NODE_NAME
+              valueFrom:
+                fieldRef:
+                  fieldPath: spec.nodeName
+          volumeMounts:
+            - name: socket-dir
+              mountPath: /csi
+            - name: registration-dir
+              mountPath: /registration
+        - name: csi-vitastor
+          securityContext:
+            privileged: true
+            capabilities:
+              add: ["SYS_ADMIN"]
+            allowPrivilegeEscalation: true
+          image: vitalif/vitastor-csi:v0.6.5
+          args:
+            - "--node=$(NODE_ID)"
+            - "--endpoint=$(CSI_ENDPOINT)"
+          env:
+            - name: NODE_ID
+              valueFrom:
+                fieldRef:
+                  fieldPath: spec.nodeName
+            - name: CSI_ENDPOINT
+              value: unix:///csi/csi.sock
+          imagePullPolicy: "IfNotPresent"
+          ports:
+            - containerPort: 9898
+              name: healthz
+              protocol: TCP
+          livenessProbe:
+            failureThreshold: 5
+            httpGet:
+              path: /healthz
+              port: healthz
+            initialDelaySeconds: 10
+            timeoutSeconds: 3
+            periodSeconds: 2
+          volumeMounts:
+            - name: socket-dir
+              mountPath: /csi
+            - mountPath: /dev
+              name: host-dev
+            - mountPath: /sys
+              name: host-sys
+            - mountPath: /run/mount
+              name: host-mount
+            - mountPath: /lib/modules
+              name: lib-modules
+              readOnly: true
+            - name: vitastor-config
+              mountPath: /etc/vitastor
+            - name: plugin-dir
+              mountPath: /var/lib/kubelet/plugins
+              mountPropagation: "Bidirectional"
+            - name: mountpoint-dir
+              mountPath: /var/lib/kubelet/pods
+              mountPropagation: "Bidirectional"
+        - name: liveness-probe
+          securityContext:
+            privileged: true
+          image: quay.io/k8scsi/livenessprobe:v1.1.0
+          args:
+            - "--csi-address=$(CSI_ENDPOINT)"
+            - "--health-port=9898"
+          env:
+            - name: CSI_ENDPOINT
+              value: unix://csi/csi.sock
+          volumeMounts:
+            - mountPath: /csi
+              name: socket-dir
+      volumes:
+        - name: socket-dir
+          hostPath:
+            path: /var/lib/kubelet/plugins/csi.vitastor.io
+            type: DirectoryOrCreate
+        - name: plugin-dir
+          hostPath:
+            path: /var/lib/kubelet/plugins
+            type: Directory
+        - name: mountpoint-dir
+          hostPath:
+            path: /var/lib/kubelet/pods
+            type: DirectoryOrCreate
+        - name: registration-dir
+          hostPath:
+            path: /var/lib/kubelet/plugins_registry/
+            type: Directory
+        - name: host-dev
+          hostPath:
+            path: /dev
+        - name: host-sys
+          hostPath:
+            path: /sys
+        - name: host-mount
+          hostPath:
+            path: /run/mount
+        - name: lib-modules
+          hostPath:
+            path: /lib/modules
+        - name: vitastor-config
+          configMap:
+            name: vitastor-config
@@ -0,0 +1,102 @@
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-provisioner
+
+---
+kind: ClusterRole
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: vitastor-system
+  name: vitastor-external-provisioner-runner
+rules:
+  - apiGroups: [""]
+    resources: ["nodes"]
+    verbs: ["get", "list", "watch"]
+  - apiGroups: [""]
+    resources: ["secrets"]
+    verbs: ["get", "list", "watch"]
+  - apiGroups: [""]
+    resources: ["events"]
+    verbs: ["list", "watch", "create", "update", "patch"]
+  - apiGroups: [""]
+    resources: ["persistentvolumes"]
+    verbs: ["get", "list", "watch", "create", "update", "delete", "patch"]
+  - apiGroups: [""]
+    resources: ["persistentvolumeclaims"]
+    verbs: ["get", "list", "watch", "update"]
+  - apiGroups: [""]
+    resources: ["persistentvolumeclaims/status"]
+    verbs: ["update", "patch"]
+  - apiGroups: ["storage.k8s.io"]
+    resources: ["storageclasses"]
+    verbs: ["get", "list", "watch"]
+  - apiGroups: ["snapshot.storage.k8s.io"]
+    resources: ["volumesnapshots"]
+    verbs: ["get", "list"]
+  - apiGroups: ["snapshot.storage.k8s.io"]
+    resources: ["volumesnapshotcontents"]
+    verbs: ["create", "get", "list", "watch", "update", "delete"]
+  - apiGroups: ["snapshot.storage.k8s.io"]
+    resources: ["volumesnapshotclasses"]
+    verbs: ["get", "list", "watch"]
+  - apiGroups: ["storage.k8s.io"]
+    resources: ["volumeattachments"]
+    verbs: ["get", "list", "watch", "update", "patch"]
+  - apiGroups: ["storage.k8s.io"]
+    resources: ["volumeattachments/status"]
+    verbs: ["patch"]
+  - apiGroups: ["storage.k8s.io"]
+    resources: ["csinodes"]
+    verbs: ["get", "list", "watch"]
+  - apiGroups: ["snapshot.storage.k8s.io"]
+    resources: ["volumesnapshotcontents/status"]
+    verbs: ["update"]
+  - apiGroups: [""]
+    resources: ["configmaps"]
+    verbs: ["get"]
+---
+kind: ClusterRoleBinding
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-provisioner-role
+subjects:
+  - kind: ServiceAccount
+    name: vitastor-csi-provisioner
+    namespace: vitastor-system
+roleRef:
+  kind: ClusterRole
+  name: vitastor-external-provisioner-runner
+  apiGroup: rbac.authorization.k8s.io
+
+---
+kind: Role
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: vitastor-system
+  name: vitastor-external-provisioner-cfg
+rules:
+  - apiGroups: [""]
+    resources: ["configmaps"]
+    verbs: ["get", "list", "watch", "create", "update", "delete"]
+  - apiGroups: ["coordination.k8s.io"]
+    resources: ["leases"]
+    verbs: ["get", "watch", "list", "delete", "update", "create"]
+
+---
+kind: RoleBinding
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  name: vitastor-csi-provisioner-role-cfg
+  namespace: vitastor-system
+subjects:
+  - kind: ServiceAccount
+    name: vitastor-csi-provisioner
+    namespace: vitastor-system
+roleRef:
+  kind: Role
+  name: vitastor-external-provisioner-cfg
+  apiGroup: rbac.authorization.k8s.io
@@ -0,0 +1,60 @@
+---
+apiVersion: policy/v1beta1
+kind: PodSecurityPolicy
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-provisioner-psp
+spec:
+  allowPrivilegeEscalation: true
+  allowedCapabilities:
+    - 'SYS_ADMIN'
+  fsGroup:
+    rule: RunAsAny
+  privileged: true
+  runAsUser:
+    rule: RunAsAny
+  seLinux:
+    rule: RunAsAny
+  supplementalGroups:
+    rule: RunAsAny
+  volumes:
+    - 'configMap'
+    - 'emptyDir'
+    - 'projected'
+    - 'secret'
+    - 'downwardAPI'
+    - 'hostPath'
+  allowedHostPaths:
+    - pathPrefix: '/dev'
+      readOnly: false
+    - pathPrefix: '/sys'
+      readOnly: false
+    - pathPrefix: '/lib/modules'
+      readOnly: true
+
+---
+kind: Role
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  namespace: vitastor-system
+  name: vitastor-csi-provisioner-psp
+rules:
+  - apiGroups: ['policy']
+    resources: ['podsecuritypolicies']
+    verbs: ['use']
+    resourceNames: ['vitastor-csi-provisioner-psp']
+
+---
+kind: RoleBinding
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  name: vitastor-csi-provisioner-psp
+  namespace: vitastor-system
+subjects:
+  - kind: ServiceAccount
+    name: vitastor-csi-provisioner
+    namespace: vitastor-system
+roleRef:
+  kind: Role
+  name: vitastor-csi-provisioner-psp
+  apiGroup: rbac.authorization.k8s.io
@@ -0,0 +1,159 @@
+---
+kind: Service
+apiVersion: v1
+metadata:
+  namespace: vitastor-system
+  name: csi-vitastor-provisioner
+  labels:
+    app: csi-metrics
+spec:
+  selector:
+    app: csi-vitastor-provisioner
+  ports:
+    - name: http-metrics
+      port: 8080
+      protocol: TCP
+      targetPort: 8680
+
+---
+kind: Deployment
+apiVersion: apps/v1
+metadata:
+  namespace: vitastor-system
+  name: csi-vitastor-provisioner
+spec:
+  replicas: 3
+  selector:
+    matchLabels:
+      app: csi-vitastor-provisioner
+  template:
+    metadata:
+      namespace: vitastor-system
+      labels:
+        app: csi-vitastor-provisioner
+    spec:
+      affinity:
+        podAntiAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            - labelSelector:
+                matchExpressions:
+                  - key: app
+                    operator: In
+                    values:
+                      - csi-vitastor-provisioner
+              topologyKey: "kubernetes.io/hostname"
+      serviceAccountName: vitastor-csi-provisioner
+      priorityClassName: system-cluster-critical
+      containers:
+        - name: csi-provisioner
+          image: k8s.gcr.io/sig-storage/csi-provisioner:v2.2.0
+          args:
+            - "--csi-address=$(ADDRESS)"
+            - "--v=5"
+            - "--timeout=150s"
+            - "--retry-interval-start=500ms"
+            - "--leader-election=true"
+            # set it to true to use topology based provisioning
+            - "--feature-gates=Topology=false"
+            # if fstype is not specified in storageclass, ext4 is default
+            - "--default-fstype=ext4"
+            - "--extra-create-metadata=true"
+          env:
+            - name: ADDRESS
+              value: unix:///csi/csi-provisioner.sock
+          imagePullPolicy: "IfNotPresent"
+          volumeMounts:
+            - name: socket-dir
+              mountPath: /csi
+        - name: csi-snapshotter
+          image: k8s.gcr.io/sig-storage/csi-snapshotter:v4.0.0
+          args:
+            - "--csi-address=$(ADDRESS)"
+            - "--v=5"
+            - "--timeout=150s"
+            - "--leader-election=true"
+          env:
+            - name: ADDRESS
+              value: unix:///csi/csi-provisioner.sock
+          imagePullPolicy: "IfNotPresent"
+          securityContext:
+            privileged: true
+          volumeMounts:
+            - name: socket-dir
+              mountPath: /csi
+        - name: csi-attacher
+          image: k8s.gcr.io/sig-storage/csi-attacher:v3.1.0
+          args:
+            - "--v=5"
+            - "--csi-address=$(ADDRESS)"
+            - "--leader-election=true"
+            - "--retry-interval-start=500ms"
+          env:
+            - name: ADDRESS
+              value: /csi/csi-provisioner.sock
+          imagePullPolicy: "IfNotPresent"
+          volumeMounts:
+            - name: socket-dir
+              mountPath: /csi
+        - name: csi-resizer
+          image: k8s.gcr.io/sig-storage/csi-resizer:v1.1.0
+          args:
+            - "--csi-address=$(ADDRESS)"
+            - "--v=5"
+            - "--timeout=150s"
+            - "--leader-election"
+            - "--retry-interval-start=500ms"
+            - "--handle-volume-inuse-error=false"
+          env:
+            - name: ADDRESS
+              value: unix:///csi/csi-provisioner.sock
+          imagePullPolicy: "IfNotPresent"
+          volumeMounts:
+            - name: socket-dir
+              mountPath: /csi
+        - name: csi-vitastor
+          securityContext:
+            privileged: true
+            capabilities:
+              add: ["SYS_ADMIN"]
+          image: vitalif/vitastor-csi:v0.6.5
+          args:
+            - "--node=$(NODE_ID)"
+            - "--endpoint=$(CSI_ENDPOINT)"
+          env:
+            - name: NODE_ID
+              valueFrom:
+                fieldRef:
+                  fieldPath: spec.nodeName
+            - name: CSI_ENDPOINT
+              value: unix:///csi/csi-provisioner.sock
+          imagePullPolicy: "IfNotPresent"
+          volumeMounts:
+            - name: socket-dir
+              mountPath: /csi
+            - mountPath: /dev
+              name: host-dev
+            - mountPath: /sys
+              name: host-sys
+            - mountPath: /lib/modules
+              name: lib-modules
+              readOnly: true
+            - name: vitastor-config
+              mountPath: /etc/vitastor
+      volumes:
+        - name: host-dev
+          hostPath:
+            path: /dev
+        - name: host-sys
+          hostPath:
+            path: /sys
+        - name: lib-modules
+          hostPath:
+            path: /lib/modules
|
||||||
|
- name: socket-dir
|
||||||
|
emptyDir: {
|
||||||
|
medium: "Memory"
|
||||||
|
}
|
||||||
|
- name: vitastor-config
|
||||||
|
configMap:
|
||||||
|
name: vitastor-config
|
|
@ -0,0 +1,11 @@
|
||||||
|
---
|
||||||
|
# if Kubernetes version is less than 1.18 change
|
||||||
|
# apiVersion to storage.k8s.io/v1beta1
|
||||||
|
apiVersion: storage.k8s.io/v1
|
||||||
|
kind: CSIDriver
|
||||||
|
metadata:
|
||||||
|
namespace: vitastor-system
|
||||||
|
name: csi.vitastor.io
|
||||||
|
spec:
|
||||||
|
attachRequired: true
|
||||||
|
podInfoOnMount: false
|
|
@ -0,0 +1,19 @@
|
||||||
|
---
|
||||||
|
apiVersion: storage.k8s.io/v1
|
||||||
|
kind: StorageClass
|
||||||
|
metadata:
|
||||||
|
namespace: vitastor-system
|
||||||
|
name: vitastor
|
||||||
|
annotations:
|
||||||
|
storageclass.kubernetes.io/is-default-class: "true"
|
||||||
|
provisioner: csi.vitastor.io
|
||||||
|
volumeBindingMode: Immediate
|
||||||
|
parameters:
|
||||||
|
etcdVolumePrefix: ""
|
||||||
|
poolId: "1"
|
||||||
|
# you can choose another configuration file if it is present in the config map
|
||||||
|
#configPath: "/etc/vitastor/vitastor.conf"
|
||||||
|
# you can also specify etcdUrl here, for example to connect to another Vitastor cluster
|
||||||
|
# multiple etcdUrls may be specified, separated by commas
|
||||||
|
#etcdUrl: "http://192.168.7.2:2379"
|
||||||
|
#etcdPrefix: "/vitastor"
|
|
@ -0,0 +1,12 @@
|
||||||
|
---
|
||||||
|
apiVersion: v1
|
||||||
|
kind: PersistentVolumeClaim
|
||||||
|
metadata:
|
||||||
|
name: test-vitastor-pvc
|
||||||
|
spec:
|
||||||
|
storageClassName: vitastor
|
||||||
|
accessModes:
|
||||||
|
- ReadWriteOnce
|
||||||
|
resources:
|
||||||
|
requests:
|
||||||
|
storage: 10Gi
|
|
@ -0,0 +1,35 @@
|
||||||
|
module vitastor.io/csi
|
||||||
|
|
||||||
|
go 1.15
|
||||||
|
|
||||||
|
require (
|
||||||
|
github.com/container-storage-interface/spec v1.4.0
|
||||||
|
github.com/coreos/bbolt v0.0.0-00010101000000-000000000000 // indirect
|
||||||
|
github.com/coreos/etcd v3.3.25+incompatible // indirect
|
||||||
|
github.com/coreos/go-semver v0.3.0 // indirect
|
||||||
|
github.com/coreos/go-systemd v0.0.0-20191104093116-d3cd4ed1dbcf // indirect
|
||||||
|
github.com/coreos/pkg v0.0.0-20180928190104-399ea9e2e55f // indirect
|
||||||
|
github.com/dustin/go-humanize v1.0.0 // indirect
|
||||||
|
github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
|
||||||
|
github.com/gorilla/websocket v1.4.2 // indirect
|
||||||
|
github.com/grpc-ecosystem/go-grpc-middleware v1.3.0 // indirect
|
||||||
|
github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0 // indirect
|
||||||
|
github.com/grpc-ecosystem/grpc-gateway v1.16.0 // indirect
|
||||||
|
github.com/jonboulle/clockwork v0.2.2 // indirect
|
||||||
|
github.com/kubernetes-csi/csi-lib-utils v0.9.1
|
||||||
|
github.com/soheilhy/cmux v0.1.5 // indirect
|
||||||
|
github.com/tmc/grpc-websocket-proxy v0.0.0-20201229170055-e5319fda7802 // indirect
|
||||||
|
github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2 // indirect
|
||||||
|
go.etcd.io/bbolt v0.0.0-00010101000000-000000000000 // indirect
|
||||||
|
go.etcd.io/etcd v3.3.25+incompatible
|
||||||
|
golang.org/x/net v0.0.0-20201202161906-c7110b5ffcbb
|
||||||
|
google.golang.org/grpc v1.33.1
|
||||||
|
k8s.io/klog v1.0.0
|
||||||
|
k8s.io/utils v0.0.0-20210305010621-2afb4311ab10
|
||||||
|
)
|
||||||
|
|
||||||
|
replace github.com/coreos/bbolt => go.etcd.io/bbolt v1.3.5
|
||||||
|
|
||||||
|
replace go.etcd.io/bbolt => github.com/coreos/bbolt v1.3.5
|
||||||
|
|
||||||
|
replace google.golang.org/grpc => google.golang.org/grpc v1.25.1
|
|
@ -0,0 +1,22 @@
|
||||||
|
// Copyright (c) Vitaliy Filippov, 2019+
|
||||||
|
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
|
||||||
|
|
||||||
|
package vitastor
|
||||||
|
|
||||||
|
const (
|
||||||
|
vitastorCSIDriverName = "csi.vitastor.io"
|
||||||
|
vitastorCSIDriverVersion = "0.6.5"
|
||||||
|
)
|
||||||
|
|
||||||
|
// Config holds the driver parameters taken from the request or from user input
|
||||||
|
type Config struct
|
||||||
|
{
|
||||||
|
Endpoint string
|
||||||
|
NodeID string
|
||||||
|
}
|
||||||
|
|
||||||
|
// NewConfig returns an empty Config used to initialize a new driver
|
||||||
|
func NewConfig() *Config
|
||||||
|
{
|
||||||
|
return &Config{}
|
||||||
|
}
|
|
@ -0,0 +1,530 @@
|
||||||
|
// Copyright (c) Vitaliy Filippov, 2019+
|
||||||
|
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
|
||||||
|
|
||||||
|
package vitastor
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"encoding/json"
|
||||||
|
"strings"
|
||||||
|
"bytes"
|
||||||
|
"strconv"
|
||||||
|
"time"
|
||||||
|
"fmt"
|
||||||
|
"os"
|
||||||
|
"os/exec"
|
||||||
|
"io/ioutil"
|
||||||
|
|
||||||
|
"github.com/kubernetes-csi/csi-lib-utils/protosanitizer"
|
||||||
|
"k8s.io/klog"
|
||||||
|
|
||||||
|
"google.golang.org/grpc/codes"
|
||||||
|
"google.golang.org/grpc/status"
|
||||||
|
|
||||||
|
"go.etcd.io/etcd/clientv3"
|
||||||
|
|
||||||
|
"github.com/container-storage-interface/spec/lib/go/csi"
|
||||||
|
)
|
||||||
|
|
||||||
|
const (
|
||||||
|
KB int64 = 1024
|
||||||
|
MB int64 = 1024 * KB
|
||||||
|
GB int64 = 1024 * MB
|
||||||
|
TB int64 = 1024 * GB
|
||||||
|
ETCD_TIMEOUT time.Duration = 15*time.Second
|
||||||
|
)
|
||||||
|
|
||||||
|
type InodeIndex struct
|
||||||
|
{
|
||||||
|
Id uint64 `json:"id"`
|
||||||
|
PoolId uint64 `json:"pool_id"`
|
||||||
|
}
|
||||||
|
|
||||||
|
type InodeConfig struct
|
||||||
|
{
|
||||||
|
Name string `json:"name"`
|
||||||
|
Size uint64 `json:"size,omitempty"`
|
||||||
|
ParentPool uint64 `json:"parent_pool,omitempty"`
|
||||||
|
ParentId uint64 `json:"parent_id,omitempty"`
|
||||||
|
Readonly bool `json:"readonly,omitempty"`
|
||||||
|
}
|
||||||
|
|
||||||
|
type ControllerServer struct
|
||||||
|
{
|
||||||
|
*Driver
|
||||||
|
}
|
||||||
|
|
||||||
|
// NewControllerServer creates a new controller server instance
|
||||||
|
func NewControllerServer(driver *Driver) *ControllerServer
|
||||||
|
{
|
||||||
|
return &ControllerServer{
|
||||||
|
Driver: driver,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func GetConnectionParams(params map[string]string) (map[string]string, []string, string)
|
||||||
|
{
|
||||||
|
ctxVars := make(map[string]string)
|
||||||
|
configPath := params["configPath"]
|
||||||
|
if (configPath == "")
|
||||||
|
{
|
||||||
|
configPath = "/etc/vitastor/vitastor.conf"
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
ctxVars["configPath"] = configPath
|
||||||
|
}
|
||||||
|
config := make(map[string]interface{})
|
||||||
|
if configFD, err := os.Open(configPath); err == nil
|
||||||
|
{
|
||||||
|
defer configFD.Close()
|
||||||
|
data, _ := ioutil.ReadAll(configFD)
|
||||||
|
json.Unmarshal(data, &config)
|
||||||
|
}
|
||||||
|
// Try to load prefix & etcd URL from the config
|
||||||
|
var etcdUrl []string
|
||||||
|
if (params["etcdUrl"] != "")
|
||||||
|
{
|
||||||
|
ctxVars["etcdUrl"] = params["etcdUrl"]
|
||||||
|
etcdUrl = strings.Split(params["etcdUrl"], ",")
|
||||||
|
}
|
||||||
|
if (len(etcdUrl) == 0)
|
||||||
|
{
|
||||||
|
switch config["etcd_address"].(type)
|
||||||
|
{
|
||||||
|
case string:
|
||||||
|
etcdUrl = strings.Split(config["etcd_address"].(string), ",")
|
||||||
|
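case []interface{}:
|
||||||
|
// encoding/json decodes JSON arrays into []interface{}, never []string
|
||||||
|
for _, url := range config["etcd_address"].([]interface{})
|
||||||
|
{
|
||||||
|
if s, ok := url.(string); ok
|
||||||
|
{
|
||||||
|
etcdUrl = append(etcdUrl, s)
|
||||||
|
}
|
||||||
|
}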
|
||||||
|
}
|
||||||
|
}
|
||||||
|
etcdPrefix := params["etcdPrefix"]
|
||||||
|
if (etcdPrefix == "")
|
||||||
|
{
|
||||||
|
etcdPrefix, _ = config["etcd_prefix"].(string)
|
||||||
|
if (etcdPrefix == "")
|
||||||
|
{
|
||||||
|
etcdPrefix = "/vitastor"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
ctxVars["etcdPrefix"] = etcdPrefix
|
||||||
|
}
|
||||||
|
return ctxVars, etcdUrl, etcdPrefix
|
||||||
|
}
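
The `etcd_address` lookup above accepts either a comma-separated string or a list. The following is a minimal standalone sketch of that lookup, assuming the config file is decoded with `encoding/json` into a `map[string]interface{}` as in `GetConnectionParams`. The helper name `parseEtcdAddress` and the sample addresses are hypothetical and not part of the driver. Note that `encoding/json` represents JSON arrays as `[]interface{}`, so the sketch matches on that type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseEtcdAddress is a hypothetical helper (not part of the driver) that
// extracts etcd endpoints from a config decoded into map[string]interface{}.
func parseEtcdAddress(config map[string]interface{}) []string {
	var urls []string
	switch v := config["etcd_address"].(type) {
	case string:
		// A single string may hold several comma-separated endpoints.
		urls = strings.Split(v, ",")
	case []interface{}:
		// JSON arrays arrive as []interface{}; keep only string items.
		for _, item := range v {
			if s, ok := item.(string); ok {
				urls = append(urls, s)
			}
		}
	}
	return urls
}

func main() {
	// Sample configs are made up for illustration.
	for _, raw := range []string{
		`{"etcd_address": "10.0.0.1:2379,10.0.0.2:2379"}`,
		`{"etcd_address": ["10.0.0.1:2379", "10.0.0.2:2379"]}`,
	} {
		config := make(map[string]interface{})
		if err := json.Unmarshal([]byte(raw), &config); err != nil {
			panic(err)
		}
		fmt.Println(parseEtcdAddress(config))
	}
}
```

Both sample configs should yield the same two endpoints; a config without `etcd_address` yields a nil slice, which the callers above report as a missing etcd URL.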
|
||||||
|
|
||||||
|
// CreateVolume creates the volume, or returns the existing one if it is already present
|
||||||
|
func (cs *ControllerServer) CreateVolume(ctx context.Context, req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error)
|
||||||
|
{
|
||||||
|
klog.Infof("received controller create volume request %+v", protosanitizer.StripSecrets(req))
|
||||||
|
if (req == nil)
|
||||||
|
{
|
||||||
|
return nil, status.Errorf(codes.InvalidArgument, "request cannot be empty")
|
||||||
|
}
|
||||||
|
if (req.GetName() == "")
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.InvalidArgument, "name is a required field")
|
||||||
|
}
|
||||||
|
volumeCapabilities := req.GetVolumeCapabilities()
|
||||||
|
if (volumeCapabilities == nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.InvalidArgument, "volume capabilities is a required field")
|
||||||
|
}
|
||||||
|
|
||||||
|
etcdVolumePrefix := req.Parameters["etcdVolumePrefix"]
|
||||||
|
poolId, _ := strconv.ParseUint(req.Parameters["poolId"], 10, 64)
|
||||||
|
if (poolId == 0)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.InvalidArgument, "poolId is missing in storage class configuration")
|
||||||
|
}
|
||||||
|
|
||||||
|
volName := etcdVolumePrefix + req.GetName()
|
||||||
|
volSize := 1 * GB
|
||||||
|
if capRange := req.GetCapacityRange(); capRange != nil
|
||||||
|
{
|
||||||
|
volSize = ((capRange.GetRequiredBytes() + MB - 1) / MB) * MB
|
||||||
|
}
|
||||||
|
|
||||||
|
// FIXME: The following should PROBABLY be implemented externally in a management tool
|
||||||
|
|
||||||
|
ctxVars, etcdUrl, etcdPrefix := GetConnectionParams(req.Parameters)
|
||||||
|
if (len(etcdUrl) == 0)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.InvalidArgument, "no etcdUrl in storage class configuration and no etcd_address in vitastor.conf")
|
||||||
|
}
|
||||||
|
|
||||||
|
// Connect to etcd
|
||||||
|
cli, err := clientv3.New(clientv3.Config{
|
||||||
|
DialTimeout: ETCD_TIMEOUT,
|
||||||
|
Endpoints: etcdUrl,
|
||||||
|
})
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to connect to etcd at "+strings.Join(etcdUrl, ",")+": "+err.Error())
|
||||||
|
}
|
||||||
|
defer cli.Close()
|
||||||
|
|
||||||
|
var imageId uint64 = 0
|
||||||
|
for
|
||||||
|
{
|
||||||
|
// Check if the image exists
|
||||||
|
ctx, cancel := context.WithTimeout(context.Background(), ETCD_TIMEOUT)
|
||||||
|
resp, err := cli.Get(ctx, etcdPrefix+"/index/image/"+volName)
|
||||||
|
cancel()
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to read key from etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
if (len(resp.Kvs) > 0)
|
||||||
|
{
|
||||||
|
kv := resp.Kvs[0]
|
||||||
|
var v InodeIndex
|
||||||
|
err := json.Unmarshal(kv.Value, &v)
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "invalid /index/image/"+volName+" key in etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
poolId = v.PoolId
|
||||||
|
imageId = v.Id
|
||||||
|
inodeCfgKey := fmt.Sprintf("/config/inode/%d/%d", poolId, imageId)
|
||||||
|
ctx, cancel := context.WithTimeout(context.Background(), ETCD_TIMEOUT)
|
||||||
|
resp, err := cli.Get(ctx, etcdPrefix+inodeCfgKey)
|
||||||
|
cancel()
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to read key from etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
if (len(resp.Kvs) == 0)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "missing "+inodeCfgKey+" key in etcd")
|
||||||
|
}
|
||||||
|
var inodeCfg InodeConfig
|
||||||
|
err = json.Unmarshal(resp.Kvs[0].Value, &inodeCfg)
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "invalid "+inodeCfgKey+" key in etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
if (inodeCfg.Size < uint64(volSize))
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "image "+volName+" is already created, but size is less than expected")
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
// Find a free ID
|
||||||
|
// Create image metadata in a transaction verifying that the image doesn't exist yet AND ID is still free
|
||||||
|
maxIdKey := fmt.Sprintf("%s/index/maxid/%d", etcdPrefix, poolId)
|
||||||
|
ctx, cancel := context.WithTimeout(context.Background(), ETCD_TIMEOUT)
|
||||||
|
resp, err := cli.Get(ctx, maxIdKey)
|
||||||
|
cancel()
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to read key from etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
var modRev int64
|
||||||
|
var nextId uint64
|
||||||
|
if (len(resp.Kvs) > 0)
|
||||||
|
{
|
||||||
|
var err error
|
||||||
|
nextId, err = strconv.ParseUint(string(resp.Kvs[0].Value), 10, 64)
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, maxIdKey+" contains invalid ID")
|
||||||
|
}
|
||||||
|
modRev = resp.Kvs[0].ModRevision
|
||||||
|
nextId++
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
nextId = 1
|
||||||
|
}
|
||||||
|
inodeIdxJson, _ := json.Marshal(InodeIndex{
|
||||||
|
Id: nextId,
|
||||||
|
PoolId: poolId,
|
||||||
|
})
|
||||||
|
inodeCfgJson, _ := json.Marshal(InodeConfig{
|
||||||
|
Name: volName,
|
||||||
|
Size: uint64(volSize),
|
||||||
|
})
|
||||||
|
ctx, cancel = context.WithTimeout(context.Background(), ETCD_TIMEOUT)
|
||||||
|
txnResp, err := cli.Txn(ctx).If(
|
||||||
|
clientv3.Compare(clientv3.ModRevision(fmt.Sprintf("%s/index/maxid/%d", etcdPrefix, poolId)), "=", modRev),
|
||||||
|
clientv3.Compare(clientv3.CreateRevision(fmt.Sprintf("%s/index/image/%s", etcdPrefix, volName)), "=", 0),
|
||||||
|
clientv3.Compare(clientv3.CreateRevision(fmt.Sprintf("%s/config/inode/%d/%d", etcdPrefix, poolId, nextId)), "=", 0),
|
||||||
|
).Then(
|
||||||
|
clientv3.OpPut(fmt.Sprintf("%s/index/maxid/%d", etcdPrefix, poolId), fmt.Sprintf("%d", nextId)),
|
||||||
|
clientv3.OpPut(fmt.Sprintf("%s/index/image/%s", etcdPrefix, volName), string(inodeIdxJson)),
|
||||||
|
clientv3.OpPut(fmt.Sprintf("%s/config/inode/%d/%d", etcdPrefix, poolId, nextId), string(inodeCfgJson)),
|
||||||
|
).Commit()
|
||||||
|
cancel()
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to commit transaction in etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
if (txnResp.Succeeded)
|
||||||
|
{
|
||||||
|
imageId = nextId
|
||||||
|
break
|
||||||
|
}
|
||||||
|
// Start over if the transaction fails
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
ctxVars["name"] = volName
|
||||||
|
volumeIdJson, _ := json.Marshal(ctxVars)
|
||||||
|
return &csi.CreateVolumeResponse{
|
||||||
|
Volume: &csi.Volume{
|
||||||
|
// Ugly, but VolumeContext isn't passed to DeleteVolume :-(
|
||||||
|
VolumeId: string(volumeIdJson),
|
||||||
|
CapacityBytes: volSize,
|
||||||
|
},
|
||||||
|
}, nil
|
||||||
|
}
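
`CreateVolume` rounds the requested capacity up to a whole number of megabytes with `((required + MB - 1) / MB) * MB`. A minimal sketch of that arithmetic follows; the helper name `roundUpToMB` is hypothetical, and the constants simply mirror the ones declared at the top of the file.

```go
package main

import "fmt"

const (
	KB int64 = 1024
	MB int64 = 1024 * KB
	GB int64 = 1024 * MB
)

// roundUpToMB is a hypothetical helper that rounds a byte count up to the
// next multiple of MB using integer arithmetic only.
func roundUpToMB(requiredBytes int64) int64 {
	return ((requiredBytes + MB - 1) / MB) * MB
}

func main() {
	for _, n := range []int64{1, MB, MB + 1, 10 * GB} {
		fmt.Printf("%d -> %d\n", n, roundUpToMB(n))
	}
}
```

With this formula a request of 0 bytes stays 0, so a capacity range that sets no required size would appear to produce a zero-sized image rather than the 1 GB default; whether callers ever send such a request is not something the code above shows.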
|
||||||
|
|
||||||
|
// DeleteVolume deletes the given volume
|
||||||
|
func (cs *ControllerServer) DeleteVolume(ctx context.Context, req *csi.DeleteVolumeRequest) (*csi.DeleteVolumeResponse, error)
|
||||||
|
{
|
||||||
|
klog.Infof("received controller delete volume request %+v", protosanitizer.StripSecrets(req))
|
||||||
|
if (req == nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.InvalidArgument, "request cannot be empty")
|
||||||
|
}
|
||||||
|
|
||||||
|
ctxVars := make(map[string]string)
|
||||||
|
err := json.Unmarshal([]byte(req.VolumeId), &ctxVars)
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "volume ID not in JSON format")
|
||||||
|
}
|
||||||
|
volName := ctxVars["name"]
|
||||||
|
|
||||||
|
_, etcdUrl, etcdPrefix := GetConnectionParams(ctxVars)
|
||||||
|
if (len(etcdUrl) == 0)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.InvalidArgument, "no etcdUrl in storage class configuration and no etcd_address in vitastor.conf")
|
||||||
|
}
|
||||||
|
|
||||||
|
cli, err := clientv3.New(clientv3.Config{
|
||||||
|
DialTimeout: ETCD_TIMEOUT,
|
||||||
|
Endpoints: etcdUrl,
|
||||||
|
})
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to connect to etcd at "+strings.Join(etcdUrl, ",")+": "+err.Error())
|
||||||
|
}
|
||||||
|
defer cli.Close()
|
||||||
|
|
||||||
|
// Find inode by name
|
||||||
|
ctx, cancel := context.WithTimeout(context.Background(), ETCD_TIMEOUT)
|
||||||
|
resp, err := cli.Get(ctx, etcdPrefix+"/index/image/"+volName)
|
||||||
|
cancel()
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to read key from etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
if (len(resp.Kvs) == 0)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.NotFound, "volume "+volName+" does not exist")
|
||||||
|
}
|
||||||
|
var idx InodeIndex
|
||||||
|
err = json.Unmarshal(resp.Kvs[0].Value, &idx)
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "invalid /index/image/"+volName+" key in etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
|
||||||
|
// Get inode config
|
||||||
|
inodeCfgKey := fmt.Sprintf("%s/config/inode/%d/%d", etcdPrefix, idx.PoolId, idx.Id)
|
||||||
|
ctx, cancel = context.WithTimeout(context.Background(), ETCD_TIMEOUT)
|
||||||
|
resp, err = cli.Get(ctx, inodeCfgKey)
|
||||||
|
cancel()
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to read key from etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
if (len(resp.Kvs) == 0)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.NotFound, "volume "+volName+" does not exist")
|
||||||
|
}
|
||||||
|
var inodeCfg InodeConfig
|
||||||
|
err = json.Unmarshal(resp.Kvs[0].Value, &inodeCfg)
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "invalid "+inodeCfgKey+" key in etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
|
||||||
|
// Delete inode data by invoking vitastor-rm
|
||||||
|
args := []string{
|
||||||
|
"--etcd_address", strings.Join(etcdUrl, ","),
|
||||||
|
"--pool", fmt.Sprintf("%d", idx.PoolId),
|
||||||
|
"--inode", fmt.Sprintf("%d", idx.Id),
|
||||||
|
}
|
||||||
|
if (ctxVars["configPath"] != "")
|
||||||
|
{
|
||||||
|
args = append(args, "--config_path", ctxVars["configPath"])
|
||||||
|
}
|
||||||
|
c := exec.Command("/usr/bin/vitastor-rm", args...)
|
||||||
|
var stderr bytes.Buffer
|
||||||
|
c.Stdout = nil
|
||||||
|
c.Stderr = &stderr
|
||||||
|
err = c.Run()
|
||||||
|
stderrStr := stderr.String()
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
klog.Errorf("vitastor-rm failed: %s, status %s\n", stderrStr, err)
|
||||||
|
return nil, status.Error(codes.Internal, stderrStr+" (status "+err.Error()+")")
|
||||||
|
}
|
||||||
|
|
||||||
|
// Delete inode config in etcd
|
||||||
|
ctx, cancel = context.WithTimeout(context.Background(), ETCD_TIMEOUT)
|
||||||
|
txnResp, err := cli.Txn(ctx).Then(
|
||||||
|
clientv3.OpDelete(fmt.Sprintf("%s/index/image/%s", etcdPrefix, volName)),
|
||||||
|
clientv3.OpDelete(fmt.Sprintf("%s/config/inode/%d/%d", etcdPrefix, idx.PoolId, idx.Id)),
|
||||||
|
).Commit()
|
||||||
|
cancel()
|
||||||
|
if (err != nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to delete keys in etcd: "+err.Error())
|
||||||
|
}
|
||||||
|
if (!txnResp.Succeeded)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Internal, "failed to delete keys in etcd: transaction failed")
|
||||||
|
}
|
||||||
|
|
||||||
|
return &csi.DeleteVolumeResponse{}, nil
|
||||||
|
}
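
`CreateVolume` packs the connection parameters and the image name into the volume ID as JSON, and `DeleteVolume` unpacks them again. A minimal sketch of that round trip, assuming only `encoding/json`; the helper names `encodeVolumeID` and `decodeVolumeID` and the sample values are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeVolumeID is a hypothetical helper: it stores the image name next to
// the connection parameters and serializes the map as the volume ID.
func encodeVolumeID(ctxVars map[string]string, volName string) (string, error) {
	ctxVars["name"] = volName
	data, err := json.Marshal(ctxVars)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// decodeVolumeID is the hypothetical inverse used on the delete path.
func decodeVolumeID(volumeID string) (map[string]string, error) {
	ctxVars := make(map[string]string)
	if err := json.Unmarshal([]byte(volumeID), &ctxVars); err != nil {
		return nil, err
	}
	return ctxVars, nil
}

func main() {
	// Sample parameter values are made up for illustration.
	id, err := encodeVolumeID(map[string]string{"etcdPrefix": "/vitastor"}, "pvc-123")
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
	vars, err := decodeVolumeID(id)
	if err != nil {
		panic(err)
	}
	fmt.Println(vars["name"], vars["etcdPrefix"])
}
```

Because `json.Marshal` writes map keys in sorted order, the same parameters always produce the same volume ID string.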
|
||||||
|
|
||||||
|
// ControllerPublishVolume returns an Unimplemented error
|
||||||
|
func (cs *ControllerServer) ControllerPublishVolume(ctx context.Context, req *csi.ControllerPublishVolumeRequest) (*csi.ControllerPublishVolumeResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// ControllerUnpublishVolume returns an Unimplemented error
|
||||||
|
func (cs *ControllerServer) ControllerUnpublishVolume(ctx context.Context, req *csi.ControllerUnpublishVolumeRequest) (*csi.ControllerUnpublishVolumeResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// ValidateVolumeCapabilities checks whether the volume capabilities requested are supported.
|
||||||
|
func (cs *ControllerServer) ValidateVolumeCapabilities(ctx context.Context, req *csi.ValidateVolumeCapabilitiesRequest) (*csi.ValidateVolumeCapabilitiesResponse, error)
|
||||||
|
{
|
||||||
|
klog.Infof("received controller validate volume capability request %+v", protosanitizer.StripSecrets(req))
|
||||||
|
if (req == nil)
|
||||||
|
{
|
||||||
|
return nil, status.Errorf(codes.InvalidArgument, "request is nil")
|
||||||
|
}
|
||||||
|
volumeID := req.GetVolumeId()
|
||||||
|
if (volumeID == "")
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.InvalidArgument, "volumeId is empty")
|
||||||
|
}
|
||||||
|
volumeCapabilities := req.GetVolumeCapabilities()
|
||||||
|
if (volumeCapabilities == nil)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.InvalidArgument, "volumeCapabilities is nil")
|
||||||
|
}
|
||||||
|
|
||||||
|
var volumeCapabilityAccessModes []*csi.VolumeCapability_AccessMode
|
||||||
|
for _, mode := range []csi.VolumeCapability_AccessMode_Mode{
|
||||||
|
csi.VolumeCapability_AccessMode_SINGLE_NODE_WRITER,
|
||||||
|
csi.VolumeCapability_AccessMode_MULTI_NODE_MULTI_WRITER,
|
||||||
|
} {
|
||||||
|
volumeCapabilityAccessModes = append(volumeCapabilityAccessModes, &csi.VolumeCapability_AccessMode{Mode: mode})
|
||||||
|
}
|
||||||
|
|
||||||
|
capabilitySupport := false
|
||||||
|
for _, capability := range volumeCapabilities
|
||||||
|
{
|
||||||
|
for _, volumeCapabilityAccessMode := range volumeCapabilityAccessModes
|
||||||
|
{
|
||||||
|
if (volumeCapabilityAccessMode.Mode == capability.GetAccessMode().GetMode())
|
||||||
|
{
|
||||||
|
capabilitySupport = true
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!capabilitySupport)
|
||||||
|
{
|
||||||
|
return nil, status.Errorf(codes.NotFound, "%v not supported", req.GetVolumeCapabilities())
|
||||||
|
}
|
||||||
|
|
||||||
|
return &csi.ValidateVolumeCapabilitiesResponse{
|
||||||
|
Confirmed: &csi.ValidateVolumeCapabilitiesResponse_Confirmed{
|
||||||
|
VolumeCapabilities: req.VolumeCapabilities,
|
||||||
|
},
|
||||||
|
}, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// ListVolumes returns a list of volumes
|
||||||
|
func (cs *ControllerServer) ListVolumes(ctx context.Context, req *csi.ListVolumesRequest) (*csi.ListVolumesResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// GetCapacity returns the capacity of the storage pool
|
||||||
|
func (cs *ControllerServer) GetCapacity(ctx context.Context, req *csi.GetCapacityRequest) (*csi.GetCapacityResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// ControllerGetCapabilities returns the capabilities of the controller service.
|
||||||
|
func (cs *ControllerServer) ControllerGetCapabilities(ctx context.Context, req *csi.ControllerGetCapabilitiesRequest) (*csi.ControllerGetCapabilitiesResponse, error)
|
||||||
|
{
|
||||||
|
functionControllerServerCapabilities := func(cap csi.ControllerServiceCapability_RPC_Type) *csi.ControllerServiceCapability
|
||||||
|
{
|
||||||
|
return &csi.ControllerServiceCapability{
|
||||||
|
Type: &csi.ControllerServiceCapability_Rpc{
|
||||||
|
Rpc: &csi.ControllerServiceCapability_RPC{
|
||||||
|
Type: cap,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
var controllerServerCapabilities []*csi.ControllerServiceCapability
|
||||||
|
for _, capability := range []csi.ControllerServiceCapability_RPC_Type{
|
||||||
|
csi.ControllerServiceCapability_RPC_CREATE_DELETE_VOLUME,
|
||||||
|
csi.ControllerServiceCapability_RPC_LIST_VOLUMES,
|
||||||
|
csi.ControllerServiceCapability_RPC_EXPAND_VOLUME,
|
||||||
|
csi.ControllerServiceCapability_RPC_CREATE_DELETE_SNAPSHOT,
|
||||||
|
} {
|
||||||
|
controllerServerCapabilities = append(controllerServerCapabilities, functionControllerServerCapabilities(capability))
|
||||||
|
}
|
||||||
|
|
||||||
|
return &csi.ControllerGetCapabilitiesResponse{
|
||||||
|
Capabilities: controllerServerCapabilities,
|
||||||
|
}, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// CreateSnapshot creates a snapshot of an existing PV
|
||||||
|
func (cs *ControllerServer) CreateSnapshot(ctx context.Context, req *csi.CreateSnapshotRequest) (*csi.CreateSnapshotResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// DeleteSnapshot deletes the provided snapshot of a PV
|
||||||
|
func (cs *ControllerServer) DeleteSnapshot(ctx context.Context, req *csi.DeleteSnapshotRequest) (*csi.DeleteSnapshotResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// ListSnapshots lists the snapshots of a PV
|
||||||
|
func (cs *ControllerServer) ListSnapshots(ctx context.Context, req *csi.ListSnapshotsRequest) (*csi.ListSnapshotsResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// ControllerExpandVolume resizes a volume
|
||||||
|
func (cs *ControllerServer) ControllerExpandVolume(ctx context.Context, req *csi.ControllerExpandVolumeRequest) (*csi.ControllerExpandVolumeResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
// ControllerGetVolume gets volume info
|
||||||
|
func (cs *ControllerServer) ControllerGetVolume(ctx context.Context, req *csi.ControllerGetVolumeRequest) (*csi.ControllerGetVolumeResponse, error)
|
||||||
|
{
|
||||||
|
return nil, status.Error(codes.Unimplemented, "")
|
||||||
|
}
|
|
@ -0,0 +1,137 @@
|
||||||
|
/*
|
||||||
|
Copyright 2017 The Kubernetes Authors.
|
||||||
|
|
||||||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
you may not use this file except in compliance with the License.
|
||||||
|
You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing, software
|
||||||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
See the License for the specific language governing permissions and
|
||||||
|
limitations under the License.
|
||||||
|
*/
|
||||||
|
|
||||||
|
package vitastor
|
||||||
|
|
||||||
|
import (
|
||||||
|
"fmt"
|
||||||
|
"net"
|
||||||
|
"os"
|
||||||
|
"strings"
|
||||||
|
"sync"
|
||||||
|
|
||||||
|
"github.com/golang/glog"
|
||||||
|
"golang.org/x/net/context"
|
||||||
|
"google.golang.org/grpc"
|
||||||
|
|
||||||
|
"github.com/container-storage-interface/spec/lib/go/csi"
|
||||||
|
"github.com/kubernetes-csi/csi-lib-utils/protosanitizer"
|
||||||
|
)
|
||||||
|
|
||||||
|
// NonBlockingGRPCServer defines the non-blocking gRPC server interface
|
||||||
|
type NonBlockingGRPCServer interface {
|
||||||
|
// Start services at the endpoint
|
||||||
|
Start(endpoint string, ids csi.IdentityServer, cs csi.ControllerServer, ns csi.NodeServer)
|
||||||
|
// Waits for the service to stop
|
||||||
|
Wait()
|
||||||
|
// Stops the service gracefully
|
||||||
|
Stop()
|
||||||
|
// Stops the service forcefully
|
||||||
|
ForceStop()
|
||||||
|
}
|
||||||
|
|
||||||
|
func NewNonBlockingGRPCServer() NonBlockingGRPCServer {
|
||||||
|
return &nonBlockingGRPCServer{}
|
||||||
|
}
|
||||||
|
|
||||||
|
// nonBlockingGRPCServer implements NonBlockingGRPCServer
|
||||||
|
type nonBlockingGRPCServer struct {
|
||||||
|
wg sync.WaitGroup
|
||||||
|
server *grpc.Server
|
||||||
|
}
|
||||||
|
|
||||||
|
func (s *nonBlockingGRPCServer) Start(endpoint string, ids csi.IdentityServer, cs csi.ControllerServer, ns csi.NodeServer) {
|
||||||
|
|
||||||
|
s.wg.Add(1)
|
||||||
|
|
||||||
|
go s.serve(endpoint, ids, cs, ns)
|
||||||
|
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
func (s *nonBlockingGRPCServer) Wait() {
|
||||||
|
s.wg.Wait()
|
||||||
|
}
|
||||||
|
|
||||||
|
func (s *nonBlockingGRPCServer) Stop() {
|
||||||
|
s.server.GracefulStop()
|
||||||
|
}
|
||||||
|
|
||||||
|
func (s *nonBlockingGRPCServer) ForceStop() {
|
||||||
|
s.server.Stop()
|
||||||
|
}
|
||||||
|
|
||||||
|
func (s *nonBlockingGRPCServer) serve(endpoint string, ids csi.IdentityServer, cs csi.ControllerServer, ns csi.NodeServer) {
|
||||||
|
|
||||||
|
proto, addr, err := ParseEndpoint(endpoint)
|
||||||
|
if err != nil {
|
||||||
|
glog.Fatal(err.Error())
|
||||||
|
}
|
||||||
|
|
||||||
|
if proto == "unix" {
|
||||||
|
addr = "/" + addr
|
||||||
|
if err := os.Remove(addr); err != nil && !os.IsNotExist(err) {
|
||||||
|
glog.Fatalf("Failed to remove %s, error: %s", addr, err.Error())
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
listener, err := net.Listen(proto, addr)
|
||||||
|
if err != nil {
|
||||||
|
glog.Fatalf("Failed to listen: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
opts := []grpc.ServerOption{
|
||||||
|
grpc.UnaryInterceptor(logGRPC),
|
||||||
|
}
|
||||||
|
server := grpc.NewServer(opts...)
|
||||||
|
s.server = server
|
||||||
|
|
||||||
|
if ids != nil {
|
||||||
|
csi.RegisterIdentityServer(server, ids)
|
||||||
|
}
|
||||||
|
if cs != nil {
|
||||||
|
csi.RegisterControllerServer(server, cs)
|
||||||
|
}
|
||||||
|
if ns != nil {
|
||||||
|
csi.RegisterNodeServer(server, ns)
|
||||||
|
}
|
||||||
|
|
||||||
|
glog.Infof("Listening for connections on address: %#v", listener.Addr())
|
||||||
|
|
||||||
|
server.Serve(listener)
|
||||||
|
}
|
||||||
|
|
||||||
|
func ParseEndpoint(ep string) (string, string, error) {
|
||||||
|
if strings.HasPrefix(strings.ToLower(ep), "unix://") || strings.HasPrefix(strings.ToLower(ep), "tcp://") {
|
||||||
|
s := strings.SplitN(ep, "://", 2)
|
||||||
|
if s[1] != "" {
|
||||||
|
return s[0], s[1], nil
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return "", "", fmt.Errorf("Invalid endpoint: %v", ep)
|
||||||
|
}
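
A short usage sketch for `ParseEndpoint`, with the function copied from above so that the block is self-contained. The `tcp://` address is a made-up sample value; the unix socket path is the one used in the deployment manifest.

```go
package main

import (
	"fmt"
	"strings"
)

// ParseEndpoint is copied from the listing above: it splits an endpoint of
// the form unix://... or tcp://... into a protocol and an address.
func ParseEndpoint(ep string) (string, string, error) {
	if strings.HasPrefix(strings.ToLower(ep), "unix://") || strings.HasPrefix(strings.ToLower(ep), "tcp://") {
		s := strings.SplitN(ep, "://", 2)
		if s[1] != "" {
			return s[0], s[1], nil
		}
	}
	return "", "", fmt.Errorf("Invalid endpoint: %v", ep)
}

func main() {
	for _, ep := range []string{
		"unix:///csi/csi-provisioner.sock",
		"tcp://127.0.0.1:10000",
		"unix://",
		"csi.sock",
	} {
		proto, addr, err := ParseEndpoint(ep)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("proto=%s addr=%s\n", proto, addr)
	}
}
```

For the manifest's `unix:///csi/csi-provisioner.sock` the address part already begins with a slash, so after `serve` prepends another one the listener path becomes `//csi/csi-provisioner.sock`; Linux is expected to resolve that the same way as the single-slash path.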
|
||||||
|
|
||||||
|
func logGRPC(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
|
||||||
|
glog.V(3).Infof("GRPC call: %s", info.FullMethod)
|
||||||
|
glog.V(5).Infof("GRPC request: %s", protosanitizer.StripSecrets(req))
|
||||||
|
resp, err := handler(ctx, req)
|
||||||
|
if err != nil {
|
||||||
|
glog.Errorf("GRPC error: %v", err)
|
||||||
|
} else {
|
||||||
|
glog.V(5).Infof("GRPC response: %s", protosanitizer.StripSecrets(resp))
|
||||||
|
}
|
||||||
|
return resp, err
|
||||||
|
}
|
|
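The endpoint handling above can be tried in isolation. The following is a minimal sketch, assuming `ParseEndpoint` is copied unchanged from the diff; `listenAddress` is a hypothetical helper (not part of the commit) that repeats the `"/" + addr` adjustment `serve()` applies to unix sockets, and the endpoint strings are made-up examples.

```go
package main

import (
    "fmt"
    "strings"
)

// ParseEndpoint is copied from the diff: it accepts "unix://..." and
// "tcp://..." (the scheme prefix is matched case-insensitively) and
// rejects anything else, including an empty address part.
func ParseEndpoint(ep string) (string, string, error) {
    if strings.HasPrefix(strings.ToLower(ep), "unix://") || strings.HasPrefix(strings.ToLower(ep), "tcp://") {
        s := strings.SplitN(ep, "://", 2)
        if s[1] != "" {
            return s[0], s[1], nil
        }
    }
    return "", "", fmt.Errorf("Invalid endpoint: %v", ep)
}

// listenAddress is a hypothetical helper (an assumption, not in the diff)
// mirroring what serve() does before net.Listen: unix socket paths get a
// leading slash, TCP addresses are used as-is.
func listenAddress(proto, addr string) string {
    if proto == "unix" {
        return "/" + addr
    }
    return addr
}

func main() {
    // Example endpoints are illustrative only.
    for _, ep := range []string{"unix://csi/csi.sock", "tcp://127.0.0.1:10000", "unix://", "http://example"} {
        proto, addr, err := ParseEndpoint(ep)
        if err != nil {
            fmt.Println("error:", err)
            continue
        }
        fmt.Printf("%s: listen on %s %s\n", ep, proto, listenAddress(proto, addr))
    }
}
```

Note that the scheme is returned in its original case, while `serve()` compares it against lowercase `"unix"`, so the sketch sticks to lowercase schemes.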
@ -0,0 +1,60 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)

package vitastor

import (
    "context"

    "github.com/kubernetes-csi/csi-lib-utils/protosanitizer"
    "k8s.io/klog"

    "github.com/container-storage-interface/spec/lib/go/csi"
)

// IdentityServer struct of Vitastor CSI driver with supported methods of CSI identity server spec.
type IdentityServer struct
{
    *Driver
}

// NewIdentityServer creates a new identity server instance
func NewIdentityServer(driver *Driver) *IdentityServer
{
    return &IdentityServer{
        Driver: driver,
    }
}

// GetPluginInfo returns metadata of the plugin
func (is *IdentityServer) GetPluginInfo(ctx context.Context, req *csi.GetPluginInfoRequest) (*csi.GetPluginInfoResponse, error)
{
    klog.Infof("received identity plugin info request %+v", protosanitizer.StripSecrets(req))
    return &csi.GetPluginInfoResponse{
        Name: vitastorCSIDriverName,
        VendorVersion: vitastorCSIDriverVersion,
    }, nil
}

// GetPluginCapabilities returns available capabilities of the plugin
func (is *IdentityServer) GetPluginCapabilities(ctx context.Context, req *csi.GetPluginCapabilitiesRequest) (*csi.GetPluginCapabilitiesResponse, error)
{
    klog.Infof("received identity plugin capabilities request %+v", protosanitizer.StripSecrets(req))
    return &csi.GetPluginCapabilitiesResponse{
        Capabilities: []*csi.PluginCapability{
            {
                Type: &csi.PluginCapability_Service_{
                    Service: &csi.PluginCapability_Service{
                        Type: csi.PluginCapability_Service_CONTROLLER_SERVICE,
                    },
                },
            },
        },
    }, nil
}

// Probe returns the health and readiness of the plugin
func (is *IdentityServer) Probe(ctx context.Context, req *csi.ProbeRequest) (*csi.ProbeResponse, error)
{
    return &csi.ProbeResponse{}, nil
}
@ -0,0 +1,279 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)

package vitastor

import (
    "context"
    "os"
    "os/exec"
    "encoding/json"
    "strings"
    "bytes"

    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
    "k8s.io/utils/mount"
    utilexec "k8s.io/utils/exec"

    "github.com/container-storage-interface/spec/lib/go/csi"
    "github.com/kubernetes-csi/csi-lib-utils/protosanitizer"
    "k8s.io/klog"
)

// NodeServer struct of Vitastor CSI driver with supported methods of CSI node server spec.
type NodeServer struct
{
    *Driver
    mounter mount.Interface
}

// NewNodeServer creates a new node server instance
func NewNodeServer(driver *Driver) *NodeServer
{
    return &NodeServer{
        Driver: driver,
        mounter: mount.New(""),
    }
}

// NodeStageVolume mounts the volume to a staging path on the node.
func (ns *NodeServer) NodeStageVolume(ctx context.Context, req *csi.NodeStageVolumeRequest) (*csi.NodeStageVolumeResponse, error)
{
    return &csi.NodeStageVolumeResponse{}, nil
}

// NodeUnstageVolume unstages the volume from the staging path
func (ns *NodeServer) NodeUnstageVolume(ctx context.Context, req *csi.NodeUnstageVolumeRequest) (*csi.NodeUnstageVolumeResponse, error)
{
    return &csi.NodeUnstageVolumeResponse{}, nil
}

// Contains reports whether list contains the string s
func Contains(list []string, s string) bool
{
    for i := 0; i < len(list); i++
    {
        if (list[i] == s)
        {
            return true
        }
    }
    return false
}

// NodePublishVolume mounts the volume mounted to the staging path to the target path
func (ns *NodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error)
{
    klog.Infof("received node publish volume request %+v", protosanitizer.StripSecrets(req))

    targetPath := req.GetTargetPath()

    // Check that it's not already mounted
    free, error := mount.IsNotMountPoint(ns.mounter, targetPath)
    if (error != nil)
    {
        if (os.IsNotExist(error))
        {
            error := os.MkdirAll(targetPath, 0777)
            if (error != nil)
            {
                return nil, status.Error(codes.Internal, error.Error())
            }
            free = true
        }
        else
        {
            return nil, status.Error(codes.Internal, error.Error())
        }
    }
    if (!free)
    {
        return &csi.NodePublishVolumeResponse{}, nil
    }

    ctxVars := make(map[string]string)
    err := json.Unmarshal([]byte(req.VolumeId), &ctxVars)
    if (err != nil)
    {
        return nil, status.Error(codes.Internal, "volume ID not in JSON format")
    }
    volName := ctxVars["name"]

    _, etcdUrl, etcdPrefix := GetConnectionParams(ctxVars)
    if (len(etcdUrl) == 0)
    {
        return nil, status.Error(codes.InvalidArgument, "no etcdUrl in storage class configuration and no etcd_address in vitastor.conf")
    }

    // Map NBD device
    // FIXME: Check if already mapped
    args := []string{
        "map", "--etcd_address", strings.Join(etcdUrl, ","),
        "--etcd_prefix", etcdPrefix,
        "--image", volName,
    };
    if (ctxVars["configPath"] != "")
    {
        args = append(args, "--config_path", ctxVars["configPath"])
    }
    if (req.GetReadonly())
    {
        args = append(args, "--readonly", "1")
    }
    c := exec.Command("/usr/bin/vitastor-nbd", args...)
    var stdout, stderr bytes.Buffer
    c.Stdout, c.Stderr = &stdout, &stderr
    err = c.Run()
    stdoutStr, stderrStr := string(stdout.Bytes()), string(stderr.Bytes())
    if (err != nil)
    {
        klog.Errorf("vitastor-nbd map failed: %s, status %s\n", stdoutStr+stderrStr, err)
        return nil, status.Error(codes.Internal, stdoutStr+stderrStr+" (status "+err.Error()+")")
    }
    devicePath := strings.TrimSpace(stdoutStr)

    // Check existing format
    diskMounter := &mount.SafeFormatAndMount{Interface: ns.mounter, Exec: utilexec.New()}
    existingFormat, err := diskMounter.GetDiskFormat(devicePath)
    if (err != nil)
    {
        klog.Errorf("failed to get disk format for path %s, error: %v", devicePath, err)
        // unmap NBD device
        unmapOut, unmapErr := exec.Command("/usr/bin/vitastor-nbd", "unmap", devicePath).CombinedOutput()
        if (unmapErr != nil)
        {
            klog.Errorf("failed to unmap NBD device %s: %s, error: %v", devicePath, unmapOut, unmapErr)
        }
        return nil, err
    }

    // Format the device (ext4 or xfs)
    fsType := req.GetVolumeCapability().GetMount().GetFsType()
    isBlock := req.GetVolumeCapability().GetBlock() != nil
    opt := req.GetVolumeCapability().GetMount().GetMountFlags()
    opt = append(opt, "_netdev")
    if ((req.VolumeCapability.AccessMode.Mode == csi.VolumeCapability_AccessMode_MULTI_NODE_READER_ONLY ||
        req.VolumeCapability.AccessMode.Mode == csi.VolumeCapability_AccessMode_SINGLE_NODE_READER_ONLY) &&
        !Contains(opt, "ro"))
    {
        opt = append(opt, "ro")
    }
    if (fsType == "xfs")
    {
        opt = append(opt, "nouuid")
    }
    readOnly := Contains(opt, "ro")
    if (existingFormat == "" && !readOnly)
    {
        args := []string{}
        switch fsType
        {
            case "ext4":
                args = []string{"-m0", "-Enodiscard,lazy_itable_init=1,lazy_journal_init=1", devicePath}
            case "xfs":
                args = []string{"-K", devicePath}
        }
        if (len(args) > 0)
        {
            cmdOut, cmdErr := diskMounter.Exec.Command("mkfs."+fsType, args...).CombinedOutput()
            if (cmdErr != nil)
            {
                klog.Errorf("failed to run mkfs error: %v, output: %v", cmdErr, string(cmdOut))
                // unmap NBD device
                unmapOut, unmapErr := exec.Command("/usr/bin/vitastor-nbd", "unmap", devicePath).CombinedOutput()
                if (unmapErr != nil)
                {
                    klog.Errorf("failed to unmap NBD device %s: %s, error: %v", devicePath, unmapOut, unmapErr)
                }
                return nil, status.Error(codes.Internal, cmdErr.Error())
            }
        }
    }
    if (isBlock)
    {
        opt = append(opt, "bind")
        err = diskMounter.Mount(devicePath, targetPath, fsType, opt)
    }
    else
    {
        err = diskMounter.FormatAndMount(devicePath, targetPath, fsType, opt)
    }
    if (err != nil)
    {
        klog.Errorf(
            "failed to mount device path (%s) to path (%s) for volume (%s) error: %s",
            devicePath, targetPath, volName, err,
        )
        // unmap NBD device
        unmapOut, unmapErr := exec.Command("/usr/bin/vitastor-nbd", "unmap", devicePath).CombinedOutput()
        if (unmapErr != nil)
        {
            klog.Errorf("failed to unmap NBD device %s: %s, error: %v", devicePath, unmapOut, unmapErr)
        }
        return nil, status.Error(codes.Internal, err.Error())
    }
    return &csi.NodePublishVolumeResponse{}, nil
}

// NodeUnpublishVolume unmounts the volume from the target path
func (ns *NodeServer) NodeUnpublishVolume(ctx context.Context, req *csi.NodeUnpublishVolumeRequest) (*csi.NodeUnpublishVolumeResponse, error)
{
    klog.Infof("received node unpublish volume request %+v", protosanitizer.StripSecrets(req))
    targetPath := req.GetTargetPath()
    devicePath, refCount, err := mount.GetDeviceNameFromMount(ns.mounter, targetPath)
    if (err != nil)
    {
        if (os.IsNotExist(err))
        {
            return nil, status.Error(codes.NotFound, "Target path not found")
        }
        return nil, status.Error(codes.Internal, err.Error())
    }
    if (devicePath == "")
    {
        return nil, status.Error(codes.NotFound, "Volume not mounted")
    }
    // unmount
    err = mount.CleanupMountPoint(targetPath, ns.mounter, false)
    if (err != nil)
    {
        return nil, status.Error(codes.Internal, err.Error())
    }
    // unmap NBD device
    if (refCount == 1)
    {
        unmapOut, unmapErr := exec.Command("/usr/bin/vitastor-nbd", "unmap", devicePath).CombinedOutput()
        if (unmapErr != nil)
        {
            klog.Errorf("failed to unmap NBD device %s: %s, error: %v", devicePath, unmapOut, unmapErr)
        }
    }
    return &csi.NodeUnpublishVolumeResponse{}, nil
}

// NodeGetVolumeStats returns volume capacity statistics available for the volume
func (ns *NodeServer) NodeGetVolumeStats(ctx context.Context, req *csi.NodeGetVolumeStatsRequest) (*csi.NodeGetVolumeStatsResponse, error)
{
    return nil, status.Error(codes.Unimplemented, "")
}

// NodeExpandVolume expands the file system on the node
func (ns *NodeServer) NodeExpandVolume(ctx context.Context, req *csi.NodeExpandVolumeRequest) (*csi.NodeExpandVolumeResponse, error)
{
    return nil, status.Error(codes.Unimplemented, "")
}

// NodeGetCapabilities returns the supported capabilities of the node server
func (ns *NodeServer) NodeGetCapabilities(ctx context.Context, req *csi.NodeGetCapabilitiesRequest) (*csi.NodeGetCapabilitiesResponse, error)
{
    return &csi.NodeGetCapabilitiesResponse{}, nil
}

// NodeGetInfo returns NodeGetInfoResponse for CO.
func (ns *NodeServer) NodeGetInfo(ctx context.Context, req *csi.NodeGetInfoRequest) (*csi.NodeGetInfoResponse, error)
{
    klog.Infof("received node get info request %+v", protosanitizer.StripSecrets(req))
    return &csi.NodeGetInfoResponse{
        NodeId: ns.NodeID,
    }, nil
}
@ -0,0 +1,36 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)

package vitastor

import (
    "k8s.io/klog"
)

type Driver struct
{
    *Config
}

// NewDriver creates a new driver instance
func NewDriver(config *Config) (*Driver, error)
{
    if (config == nil)
    {
        klog.Errorf("Vitastor CSI driver initialization failed")
        return nil, nil
    }
    driver := &Driver{
        Config: config,
    }
    klog.Infof("Vitastor CSI driver initialized")
    return driver, nil
}

// Run starts the gRPC server and waits for it to finish
func (driver *Driver) Run()
{
    server := NewNonBlockingGRPCServer()
    server.Start(driver.Endpoint, NewIdentityServer(driver), NewControllerServer(driver), NewNodeServer(driver))
    server.Wait()
}
@ -0,0 +1,39 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)

package main

import (
    "flag"
    "fmt"
    "os"
    "k8s.io/klog"
    "vitastor.io/csi/src"
)

func main()
{
    var config = vitastor.NewConfig()
    flag.StringVar(&config.Endpoint, "endpoint", "", "CSI endpoint")
    flag.StringVar(&config.NodeID, "node", "", "Node ID")
    flag.Parse()
    if (config.Endpoint == "")
    {
        config.Endpoint = os.Getenv("CSI_ENDPOINT")
    }
    if (config.NodeID == "")
    {
        config.NodeID = os.Getenv("NODE_ID")
    }
    if (config.Endpoint == "" && config.NodeID == "")
    {
        fmt.Fprintf(os.Stderr, "Please set -endpoint and -node / CSI_ENDPOINT & NODE_ID env vars\n")
        os.Exit(1)
    }
    drv, err := vitastor.NewDriver(config)
    if (err != nil)
    {
        klog.Fatalln(err)
    }
    drv.Run()
}
@ -1,8 +1,18 @@
-vitastor (0.6.2-1) unstable; urgency=medium
+vitastor (0.6.5-1) unstable; urgency=medium

+* RDMA support
 * Bugfixes

--- Vitaliy Filippov <vitalif@yourcmc.ru> Tue, 02 Feb 2021 23:01:24 +0300
+-- Vitaliy Filippov <vitalif@yourcmc.ru> Sat, 01 May 2021 18:46:10 +0300
+
+vitastor (0.6.0-1) unstable; urgency=medium
+
+* Snapshots and Copy-on-Write clones
+* Image metadata in etcd (name, size)
+* Image I/O and space statistics in etcd
+* Write throttling for smoothing random write workloads in SSD+HDD configurations
+
+-- Vitaliy Filippov <vitalif@yourcmc.ru> Sun, 11 Apr 2021 00:49:18 +0300

 vitastor (0.5.1-1) unstable; urgency=medium

@ -2,7 +2,7 @@ Source: vitastor
 Section: admin
 Priority: optional
 Maintainer: Vitaliy Filippov <vitalif@yourcmc.ru>
-Build-Depends: debhelper, liburing-dev (>= 0.6), g++ (>= 8), libstdc++6 (>= 8), linux-libc-dev, libgoogle-perftools-dev, libjerasure-dev, libgf-complete-dev
+Build-Depends: debhelper, liburing-dev (>= 0.6), g++ (>= 8), libstdc++6 (>= 8), linux-libc-dev, libgoogle-perftools-dev, libjerasure-dev, libgf-complete-dev, libibverbs-dev
 Standards-Version: 4.5.0
 Homepage: https://vitastor.io/
 Rules-Requires-Root: no
@ -11,6 +11,10 @@ RUN if [ "$REL" = "buster" ]; then \
 echo 'Package: *' >> /etc/apt/preferences; \
 echo 'Pin: release a=buster-backports' >> /etc/apt/preferences; \
 echo 'Pin-Priority: 500' >> /etc/apt/preferences; \
+echo >> /etc/apt/preferences; \
+echo 'Package: libglvnd* libgles* libglx* libgl1 libegl* libopengl* mesa*' >> /etc/apt/preferences; \
+echo 'Pin: release a=buster-backports' >> /etc/apt/preferences; \
+echo 'Pin-Priority: 50' >> /etc/apt/preferences; \
 fi; \
 grep '^deb ' /etc/apt/sources.list | perl -pe 's/^deb/deb-src/' >> /etc/apt/sources.list; \
 echo 'APT::Install-Recommends false;' >> /etc/apt/apt.conf; \
@ -20,20 +24,22 @@ RUN apt-get update
 RUN apt-get -y install qemu fio liburing1 liburing-dev libgoogle-perftools-dev devscripts
 RUN apt-get -y build-dep qemu
 RUN apt-get -y build-dep fio
+# To build a custom version
+#RUN cp /root/packages/qemu-orig/* /root
 RUN apt-get --download-only source qemu
 RUN apt-get --download-only source fio

-ADD qemu-5.0-vitastor.patch qemu-5.1-vitastor.patch /root/vitastor/
+ADD patches/qemu-5.0-vitastor.patch patches/qemu-5.1-vitastor.patch /root/vitastor/patches/
 RUN set -e; \
 mkdir -p /root/packages/qemu-$REL; \
 rm -rf /root/packages/qemu-$REL/*; \
 cd /root/packages/qemu-$REL; \
 dpkg-source -x /root/qemu*.dsc; \
 if [ -d /root/packages/qemu-$REL/qemu-5.0 ]; then \
-cp /root/vitastor/qemu-5.0-vitastor.patch /root/packages/qemu-$REL/qemu-5.0/debian/patches; \
+cp /root/vitastor/patches/qemu-5.0-vitastor.patch /root/packages/qemu-$REL/qemu-5.0/debian/patches; \
 echo qemu-5.0-vitastor.patch >> /root/packages/qemu-$REL/qemu-5.0/debian/patches/series; \
 else \
-cp /root/vitastor/qemu-5.1-vitastor.patch /root/packages/qemu-$REL/qemu-*/debian/patches; \
+cp /root/vitastor/patches/qemu-5.1-vitastor.patch /root/packages/qemu-$REL/qemu-*/debian/patches; \
 P=`ls -d /root/packages/qemu-$REL/qemu-*/debian/patches`; \
 echo qemu-5.1-vitastor.patch >> $P/series; \
 fi; \
@ -22,7 +22,7 @@ RUN apt-get -y build-dep qemu
 RUN apt-get -y build-dep fio
 RUN apt-get --download-only source qemu
 RUN apt-get --download-only source fio
-RUN apt-get -y install libjerasure-dev cmake
+RUN apt-get update && apt-get -y install libjerasure-dev cmake libibverbs-dev

 ADD . /root/vitastor
 RUN set -e -x; \
@ -40,10 +40,10 @@ RUN set -e -x; \
 mkdir -p /root/packages/vitastor-$REL; \
 rm -rf /root/packages/vitastor-$REL/*; \
 cd /root/packages/vitastor-$REL; \
-cp -r /root/vitastor vitastor-0.6.2; \
+cp -r /root/vitastor vitastor-0.6.5; \
-ln -s /root/packages/qemu-$REL/qemu-*/ vitastor-0.6.2/qemu; \
+ln -s /root/packages/qemu-$REL/qemu-*/ vitastor-0.6.5/qemu; \
-ln -s /root/fio-build/fio-*/ vitastor-0.6.2/fio; \
+ln -s /root/fio-build/fio-*/ vitastor-0.6.5/fio; \
-cd vitastor-0.6.2; \
+cd vitastor-0.6.5; \
 FIO=$(head -n1 fio/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
 QEMU=$(head -n1 qemu/debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
 sh copy-qemu-includes.sh; \
@ -59,8 +59,8 @@ RUN set -e -x; \
 echo "dep:fio=$FIO" > debian/substvars; \
 echo "dep:qemu=$QEMU" >> debian/substvars; \
 cd /root/packages/vitastor-$REL; \
-tar --sort=name --mtime='2020-01-01' --owner=0 --group=0 --exclude=debian -cJf vitastor_0.6.2.orig.tar.xz vitastor-0.6.2; \
+tar --sort=name --mtime='2020-01-01' --owner=0 --group=0 --exclude=debian -cJf vitastor_0.6.5.orig.tar.xz vitastor-0.6.5; \
-cd vitastor-0.6.2; \
+cd vitastor-0.6.5; \
 V=$(head -n1 debian/changelog | perl -pe 's/^.*\((.*?)\).*$/$1/'); \
 DEBFULLNAME="Vitaliy Filippov <vitalif@yourcmc.ru>" dch -D $REL -v "$V""$REL" "Rebuild for $REL"; \
 DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage --jobs=auto -sa; \
@ -244,6 +244,7 @@ async function optimize_change({ prev_pgs: prev_int_pgs, osd_tree, pg_size = 3,
 {
 return null;
 }
+// FIXME: use parity_chunks with parity_space instead of pg_minsize
 const pg_effsize = Math.min(pg_minsize, Object.keys(osd_tree).length)
 + Math.max(0, Math.min(pg_size, Object.keys(osd_tree).length) - pg_minsize) * parity_space;
 const pg_count = prev_int_pgs.length;
93
mon/mon.js
|
@ -36,11 +36,19 @@ const etcd_allow = new RegExp('^'+[
 'history/last_clean_pgs',
 'inode/stats/[1-9]\\d*/[1-9]\\d*',
 'stats',
+'index/image/.*',
+'index/maxid/[1-9]\\d*',
 ].join('$|^')+'$');

 const etcd_tree = {
 config: {
 /* global: {
+// WARNING: NOT ALL OF THESE ARE ACTUALLY CONFIGURABLE HERE
+// THIS IS JUST A POOR MAN'S CONFIG DOCUMENTATION
+// etcd connection
+config_path: "/etc/vitastor/vitastor.conf",
+etcd_address: "10.0.115.10:2379/v3",
+etcd_prefix: "/vitastor",
 // mon
 etcd_mon_ttl: 30, // min: 10
 etcd_mon_timeout: 1000, // ms. min: 0
@ -50,7 +58,17 @@ const etcd_tree = {
 osd_out_time: 600, // seconds. min: 0
 placement_levels: { datacenter: 1, rack: 2, host: 3, osd: 4, ... },
 // client and osd
+tcp_header_buffer_size: 65536,
 use_sync_send_recv: false,
+use_rdma: true,
+rdma_device: null, // for example, "rocep5s0f0"
+rdma_port_num: 1,
+rdma_gid_index: 0,
+rdma_mtu: 4096,
+rdma_max_sge: 128,
+rdma_max_send: 32,
+rdma_max_recv: 8,
+rdma_max_msg: 1048576,
 log_level: 0,
 block_size: 131072,
 disk_alignment: 4096,
@ -241,14 +259,26 @@ const etcd_tree = {
 },
 inode: {
 stats: {
-/* <inode_t>: {
+/* <pool_id>: {
+<inode_t>: {
 raw_used: uint64_t, // raw used bytes on OSDs
 read: { count: uint64_t, usec: uint64_t, bytes: uint64_t },
 write: { count: uint64_t, usec: uint64_t, bytes: uint64_t },
 delete: { count: uint64_t, usec: uint64_t, bytes: uint64_t },
+},
 }, */
 },
 },
+pool: {
+stats: {
+/* <pool_id>: {
+used_raw_tb: float, // used raw space in the pool
+total_raw_tb: float, // maximum amount of space in the pool
+raw_to_usable: float, // raw to usable ratio
+space_efficiency: float, // 0..1
+} */
+},
+},
 stats: {
 /* op_stats: {
 <string>: { count: uint64_t, usec: uint64_t, bytes: uint64_t },
@ -271,6 +301,17 @@ const etcd_tree = {
 history: {
 last_clean_pgs: {},
 },
+index: {
+image: {
+/* <name>: {
+id: uint64_t,
+pool_id: uint64_t,
+}, */
+},
+maxid: {
+/* <pool_id>: uint64_t, */
+},
+},
 };

 // FIXME Split into several files
@ -345,6 +386,11 @@ class Mon
 {
 this.config.mon_stats_timeout = 100;
 }
+this.config.mon_stats_interval = Number(this.config.mon_stats_interval) || 5000;
+if (this.config.mon_stats_interval < 100)
+{
+this.config.mon_stats_interval = 100;
+}
 // After this number of seconds, a dead OSD will be removed from PG distribution
 this.config.osd_out_time = Number(this.config.osd_out_time) || 0;
 if (!this.config.osd_out_time)
@ -1009,6 +1055,17 @@ class Mon
 } });
 }
 LPOptimizer.print_change_stats(optimize_result);
+const pg_effsize = Math.min(pool_cfg.pg_size, Object.keys(pool_tree).length);
+this.state.pool.stats[pool_id] = {
+used_raw_tb: (this.state.pool.stats[pool_id]||{}).used_raw_tb || 0,
+total_raw_tb: optimize_result.space,
+raw_to_usable: pg_effsize / (pool_cfg.pg_size - (pool_cfg.parity_chunks||0)),
+space_efficiency: optimize_result.space/(optimize_result.total_space||1),
+};
+etcd_request.success.push({ requestPut: {
+key: b64(this.etcd_prefix+'/pool/stats/'+pool_id),
+value: b64(JSON.stringify(this.state.pool.stats[pool_id])),
+} });
 this.save_new_pgs_txn(etcd_request, pool_id, up_osds, real_prev_pgs, optimize_result.int_pgs, pg_history);
 }
 this.state.config.pgs.hash = tree_hash;
@ -1115,7 +1172,7 @@ class Mon
 }, this.config.mon_change_timeout || 1000);
 }

-sum_stats()
+sum_op_stats()
 {
 const op_stats = {}, subop_stats = {}, recovery_stats = {};
 for (const osd in this.state.osd.stats)
@ -1176,18 +1233,31 @@ class Mon
 write: { count: 0n, usec: 0n, bytes: 0n },
 delete: { count: 0n, usec: 0n, bytes: 0n },
 });
+for (const pool_id in this.state.config.pools)
+{
+this.state.pool.stats[pool_id] = this.state.pool.stats[pool_id] || {};
+this.state.pool.stats[pool_id].used_raw_tb = 0n;
+}
 for (const osd_num in this.state.osd.space)
 {
 for (const pool_id in this.state.osd.space[osd_num])
 {
+this.state.pool.stats[pool_id] = this.state.pool.stats[pool_id] || { used_raw_tb: 0n };
 inode_stats[pool_id] = inode_stats[pool_id] || {};
 for (const inode_num in this.state.osd.space[osd_num][pool_id])
 {
+const u = BigInt(this.state.osd.space[osd_num][pool_id][inode_num]||0);
 inode_stats[pool_id][inode_num] = inode_stats[pool_id][inode_num] || inode_stub();
-inode_stats[pool_id][inode_num].raw_used += BigInt(this.state.osd.space[osd_num][pool_id][inode_num]||0);
+inode_stats[pool_id][inode_num].raw_used += u;
+this.state.pool.stats[pool_id].used_raw_tb += u;
 }
 }
 }
+for (const pool_id in this.state.config.pools)
+{
+const used = this.state.pool.stats[pool_id].used_raw_tb;
+this.state.pool.stats[pool_id].used_raw_tb = Number(used)/1024/1024/1024/1024;
+}
 for (const osd_num in this.state.osd.inodestats)
 {
 const ist = this.state.osd.inodestats[osd_num];
@ -1259,7 +1329,7 @@ class Mon
|
||||||
async update_total_stats()
|
async update_total_stats()
|
||||||
{
|
{
|
||||||
const txn = [];
|
const txn = [];
|
||||||
const stats = this.sum_stats();
|
const stats = this.sum_op_stats();
|
||||||
const object_counts = this.sum_object_counts();
|
const object_counts = this.sum_object_counts();
|
||||||
const inode_stats = this.sum_inode_stats();
|
const inode_stats = this.sum_inode_stats();
|
||||||
this.fix_stat_overflows(stats, (this.prev_stats = this.prev_stats || {}));
|
this.fix_stat_overflows(stats, (this.prev_stats = this.prev_stats || {}));
|
||||||
|
@ -1278,6 +1348,13 @@ class Mon
|
||||||
} });
|
} });
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
for (const pool_id in this.state.pool.stats)
|
||||||
|
{
|
||||||
|
txn.push({ requestPut: {
|
||||||
|
key: b64(this.etcd_prefix+'/pool/stats/'+pool_id),
|
||||||
|
value: b64(JSON.stringify(this.state.pool.stats[pool_id])),
|
||||||
|
} });
|
||||||
|
}
|
||||||
if (txn.length)
|
if (txn.length)
|
||||||
{
|
{
|
||||||
await this.etcd_call('/kv/txn', { success: txn }, this.config.etcd_mon_timeout, 0);
|
await this.etcd_call('/kv/txn', { success: txn }, this.config.etcd_mon_timeout, 0);
|
||||||
|
@ -1291,11 +1368,17 @@ class Mon
|
||||||
clearTimeout(this.stats_timer);
|
clearTimeout(this.stats_timer);
|
||||||
this.stats_timer = null;
|
this.stats_timer = null;
|
||||||
}
|
}
|
||||||
|
let sleep = (this.stats_update_next||0) - Date.now();
|
||||||
|
if (sleep < this.config.mon_stats_timeout)
|
||||||
|
{
|
||||||
|
sleep = this.config.mon_stats_timeout;
|
||||||
|
}
|
||||||
this.stats_timer = setTimeout(() =>
|
this.stats_timer = setTimeout(() =>
|
||||||
{
|
{
|
||||||
this.stats_timer = null;
|
this.stats_timer = null;
|
||||||
|
this.stats_update_next = Date.now() + this.config.mon_stats_interval;
|
||||||
this.update_total_stats().catch(console.error);
|
this.update_total_stats().catch(console.error);
|
||||||
}, this.config.mon_stats_timeout || 1000);
|
}, sleep);
|
||||||
}
|
}
|
||||||
|
|
||||||
parse_kv(kv)
|
parse_kv(kv)
|
||||||
|
|
|
@ -0,0 +1,948 @@
|
||||||
|
# Vitastor Driver for OpenStack Cinder
|
||||||
|
#
|
||||||
|
# --------------------------------------------
|
||||||
|
# Install as cinder/volume/drivers/vitastor.py
|
||||||
|
# --------------------------------------------
|
||||||
|
#
|
||||||
|
# Copyright 2020 Vitaliy Filippov
|
||||||
|
#
|
||||||
|
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||||
|
# not use this file except in compliance with the License. You may obtain
|
||||||
|
# a copy of the License at
|
||||||
|
#
|
||||||
|
# http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
#
|
||||||
|
# Unless required by applicable law or agreed to in writing, software
|
||||||
|
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||||
|
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||||
|
# License for the specific language governing permissions and limitations
|
||||||
|
# under the License.
|
||||||
|
"""Cinder Vitastor Driver"""
|
||||||
|
|
||||||
|
import binascii
|
||||||
|
import base64
|
||||||
|
import errno
|
||||||
|
import json
|
||||||
|
import math
|
||||||
|
import os
|
||||||
|
import tempfile
|
||||||
|
|
||||||
|
from castellan import key_manager
|
||||||
|
from oslo_config import cfg
|
||||||
|
from oslo_log import log as logging
|
||||||
|
from oslo_service import loopingcall
|
||||||
|
from oslo_concurrency import processutils
|
||||||
|
from oslo_utils import encodeutils
|
||||||
|
from oslo_utils import excutils
|
||||||
|
from oslo_utils import fileutils
|
||||||
|
from oslo_utils import units
|
||||||
|
import six
|
||||||
|
from six.moves.urllib import request
|
||||||
|
|
||||||
|
from cinder import exception
|
||||||
|
from cinder.i18n import _
|
||||||
|
from cinder.image import image_utils
|
||||||
|
from cinder import interface
|
||||||
|
from cinder import objects
|
||||||
|
from cinder.objects import fields
|
||||||
|
from cinder import utils
|
||||||
|
from cinder.volume import configuration
|
||||||
|
from cinder.volume import driver
|
||||||
|
from cinder.volume import volume_utils
|
||||||
|
|
||||||
|
VERSION = '0.6.5'
|
||||||
|
|
||||||
|
LOG = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
VITASTOR_OPTS = [
|
||||||
|
cfg.StrOpt(
|
||||||
|
'vitastor_config_path',
|
||||||
|
default='/etc/vitastor/vitastor.conf',
|
||||||
|
help='Vitastor configuration file path'
|
||||||
|
),
|
||||||
|
cfg.StrOpt(
|
||||||
|
'vitastor_etcd_address',
|
||||||
|
default='',
|
||||||
|
help='Vitastor etcd address(es)'),
|
||||||
|
cfg.StrOpt(
|
||||||
|
'vitastor_etcd_prefix',
|
||||||
|
default='/vitastor',
|
||||||
|
help='Vitastor etcd prefix'
|
||||||
|
),
|
||||||
|
cfg.StrOpt(
|
||||||
|
'vitastor_pool_id',
|
||||||
|
default='',
|
||||||
|
help='Vitastor pool ID to use for volumes'
|
||||||
|
),
|
||||||
|
# FIXME exclusive_cinder_pool ?
|
||||||
|
]
|
||||||
|
|
||||||
|
CONF = cfg.CONF
|
||||||
|
CONF.register_opts(VITASTOR_OPTS, group = configuration.SHARED_CONF_GROUP)
|
||||||
|
|
||||||
|
class VitastorDriverException(exception.VolumeDriverException):
|
||||||
|
message = _("Vitastor Cinder driver failure: %(reason)s")
|
||||||
|
|
||||||
|
@interface.volumedriver
|
||||||
|
class VitastorDriver(driver.CloneableImageVD,
|
||||||
|
driver.ManageableVD, driver.ManageableSnapshotsVD,
|
||||||
|
driver.BaseVD):
|
||||||
|
"""Implements Vitastor volume commands."""
|
||||||
|
|
||||||
|
cfg = {}
|
||||||
|
_etcd_urls = []
|
||||||
|
|
||||||
|
def __init__(self, active_backend_id = None, *args, **kwargs):
|
||||||
|
super(VitastorDriver, self).__init__(*args, **kwargs)
|
||||||
|
self.configuration.append_config_values(VITASTOR_OPTS)
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def get_driver_options(cls):
|
||||||
|
additional_opts = cls._get_oslo_driver_opts(
|
||||||
|
'reserved_percentage',
|
||||||
|
'max_over_subscription_ratio',
|
||||||
|
'volume_dd_blocksize'
|
||||||
|
)
|
||||||
|
return VITASTOR_OPTS + additional_opts
|
||||||
|
|
||||||
|
def do_setup(self, context):
|
||||||
|
"""Performs initialization steps that could raise exceptions."""
|
||||||
|
super(VitastorDriver, self).do_setup(context)
|
||||||
|
# Make sure configuration is in UTF-8
|
||||||
|
for attr in [ 'config_path', 'etcd_address', 'etcd_prefix', 'pool_id' ]:
|
||||||
|
val = self.configuration.safe_get('vitastor_'+attr)
|
||||||
|
if val is not None:
|
||||||
|
self.cfg[attr] = utils.convert_str(val)
|
||||||
|
self.cfg = self._load_config(self.cfg)
|
||||||
|
|
||||||
|
def _load_config(self, cfg):
|
||||||
|
# Try to load configuration file
|
||||||
|
try:
|
||||||
|
f = open(cfg['config_path'] or '/etc/vitastor/vitastor.conf')
|
||||||
|
conf = json.loads(f.read())
|
||||||
|
f.close()
|
||||||
|
for k in conf:
|
||||||
|
cfg[k] = cfg.get(k, conf[k])
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
if isinstance(cfg['etcd_address'], str):
|
||||||
|
cfg['etcd_address'] = cfg['etcd_address'].split(',')
|
||||||
|
# Sanitize etcd URLs
|
||||||
|
for i, etcd_url in enumerate(cfg['etcd_address']):
|
||||||
|
ssl = False
|
||||||
|
if etcd_url.lower().startswith('http://'):
|
||||||
|
etcd_url = etcd_url[7:]
|
||||||
|
elif etcd_url.lower().startswith('https://'):
|
||||||
|
etcd_url = etcd_url[8:]
|
||||||
|
ssl = True
|
||||||
|
if etcd_url.find('/') < 0:
|
||||||
|
etcd_url += '/v3'
|
||||||
|
if ssl:
|
||||||
|
etcd_url = 'https://'+etcd_url
|
||||||
|
else:
|
||||||
|
etcd_url = 'http://'+etcd_url
|
||||||
|
cfg['etcd_address'][i] = etcd_url
|
||||||
|
return cfg
|
||||||
|
|
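The URL clean-up performed by `_load_config` can be looked at in isolation. The following is a minimal sketch of the same rules (strip the scheme, default to the `/v3` API path, default to plain HTTP); the function name `normalize_etcd_url` is hypothetical and is not part of the driver.

```python
def normalize_etcd_url(etcd_url):
    """Return an http(s) URL with an API path, defaulting to http and /v3."""
    scheme = 'http://'
    lower = etcd_url.lower()
    if lower.startswith('http://'):
        etcd_url = etcd_url[7:]
    elif lower.startswith('https://'):
        etcd_url = etcd_url[8:]
        scheme = 'https://'
    # No path given: assume the default etcd v3 JSON gateway path
    if '/' not in etcd_url:
        etcd_url += '/v3'
    return scheme + etcd_url


if __name__ == '__main__':
    for addr in ['10.0.0.1:2379', 'https://etcd.local:2379', 'HTTP://host:2379/v3beta']:
        print(addr, '->', normalize_etcd_url(addr))
```

An address that already carries a path is left as given, so a gateway mounted somewhere other than `/v3` can still be configured explicitly.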
||||||
|
def check_for_setup_error(self):
|
||||||
|
"""Returns an error if prerequisites aren't met."""
|
||||||
|
|
||||||
|
def _encode_etcd_key(self, key):
|
||||||
|
if not isinstance(key, bytes):
|
||||||
|
key = str(key).encode('utf-8')
|
||||||
|
return base64.b64encode(self.cfg['etcd_prefix'].encode('utf-8')+b'/'+key).decode('utf-8')
|
||||||
|
|
||||||
|
def _encode_etcd_value(self, value):
|
||||||
|
if not isinstance(value, bytes):
|
||||||
|
value = str(value).encode('utf-8')
|
||||||
|
return base64.b64encode(value).decode('utf-8')
|
||||||
|
|
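The two encoders above exist because the etcd JSON gateway expects keys and values as base64 strings. A minimal round-trip sketch, assuming a prefix of `/vitastor`; the helper names `encode_etcd_key` and `decode_etcd_key` are hypothetical stand-ins for the driver's methods:

```python
import base64


def encode_etcd_key(prefix, key):
    """Prepend the cluster prefix and base64-encode the result."""
    if not isinstance(key, bytes):
        key = str(key).encode('utf-8')
    return base64.b64encode(prefix.encode('utf-8') + b'/' + key).decode('utf-8')


def decode_etcd_key(prefix, encoded):
    """Reverse of encode_etcd_key: decode base64 and strip the prefix."""
    key = base64.b64decode(encoded.encode('utf-8')).decode('utf-8')
    if key.startswith(prefix + '/'):
        key = key[len(prefix) + 1:]
    return key


if __name__ == '__main__':
    enc = encode_etcd_key('/vitastor', 'index/image/vol1')
    print(enc, '->', decode_etcd_key('/vitastor', enc))
```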
||||||
|
def _encode_etcd_requests(self, obj):
|
||||||
|
for v in obj:
|
||||||
|
for rt in v:
|
||||||
|
if 'key' in v[rt]:
|
||||||
|
v[rt]['key'] = self._encode_etcd_key(v[rt]['key'])
|
||||||
|
if 'range_end' in v[rt]:
|
||||||
|
v[rt]['range_end'] = self._encode_etcd_key(v[rt]['range_end'])
|
||||||
|
if 'value' in v[rt]:
|
||||||
|
v[rt]['value'] = self._encode_etcd_value(v[rt]['value'])
|
||||||
|
|
||||||
|
def _etcd_txn(self, params):
|
||||||
|
if 'compare' in params:
|
||||||
|
for v in params['compare']:
|
||||||
|
if 'key' in v:
|
||||||
|
v['key'] = self._encode_etcd_key(v['key'])
|
||||||
|
if 'failure' in params:
|
||||||
|
self._encode_etcd_requests(params['failure'])
|
||||||
|
if 'success' in params:
|
||||||
|
self._encode_etcd_requests(params['success'])
|
||||||
|
body = json.dumps(params).encode('utf-8')
|
||||||
|
headers = {
|
||||||
|
'Content-Type': 'application/json'
|
||||||
|
}
|
||||||
|
err = None
|
||||||
|
for etcd_url in self.cfg['etcd_address']:
|
||||||
|
try:
|
||||||
|
resp = request.urlopen(request.Request(etcd_url+'/kv/txn', body, headers), timeout = 5)
|
||||||
|
data = json.loads(resp.read())
|
||||||
|
if 'responses' not in data:
|
||||||
|
data['responses'] = []
|
||||||
|
for i, resp in enumerate(data['responses']):
|
||||||
|
if 'response_range' in resp:
|
||||||
|
if 'kvs' not in resp['response_range']:
|
||||||
|
resp['response_range']['kvs'] = []
|
||||||
|
for kv in resp['response_range']['kvs']:
|
||||||
|
kv['key'] = base64.b64decode(kv['key'].encode('utf-8')).decode('utf-8')
|
||||||
|
if kv['key'].startswith(self.cfg['etcd_prefix']+'/'):
|
||||||
|
kv['key'] = kv['key'][len(self.cfg['etcd_prefix'])+1 : ]
|
||||||
|
kv['value'] = json.loads(base64.b64decode(kv['value'].encode('utf-8')))
|
||||||
|
if len(resp.keys()) != 1:
|
||||||
|
LOG.error('unknown responses['+str(i)+'] format: '+json.dumps(resp))
|
||||||
|
else:
|
||||||
|
resp = data['responses'][i] = resp[list(resp.keys())[0]]
|
||||||
|
return data
|
||||||
|
except Exception as e:
|
||||||
|
LOG.exception('error calling etcd transaction: '+body.decode('utf-8')+'\nerror: '+str(e))
|
||||||
|
err = e
|
||||||
|
raise err or VitastorDriverException(reason = 'no etcd addresses configured')
|
||||||
|
|
||||||
|
def _etcd_foreach(self, prefix, add_fn):
|
||||||
|
total = 0
|
||||||
|
batch = 1000
|
||||||
|
begin = prefix+'/'
|
||||||
|
while True:
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': {
|
||||||
|
'key': begin,
|
||||||
|
'range_end': prefix+'0',
|
||||||
|
'limit': batch+1,
|
||||||
|
} },
|
||||||
|
] })
|
||||||
|
i = 0
|
||||||
|
while i < batch and i < len(resp['responses'][0]['kvs']):
|
||||||
|
kv = resp['responses'][0]['kvs'][i]
|
||||||
|
add_fn(kv)
|
||||||
|
i += 1
|
||||||
|
if len(resp['responses'][0]['kvs']) <= batch:
|
||||||
|
break
|
||||||
|
begin = resp['responses'][0]['kvs'][batch]['key']
|
||||||
|
return total
|
||||||
|
|
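`_etcd_foreach` pages through a key range by asking for `batch+1` entries and using the extra entry as the start key of the next page. The sketch below shows that pattern against a plain dictionary; `range_query` is a toy stand-in for an etcd range request, and both function names are hypothetical. Unlike the method above, the sketch also counts the entries it visits.

```python
def range_query(store, begin, end, limit):
    """Toy range request: sorted keys in [begin, end), at most `limit` of them."""
    keys = sorted(k for k in store if begin <= k < end)
    return [(k, store[k]) for k in keys[:limit]]


def foreach_prefix(store, prefix, add_fn, batch=1000):
    """Call add_fn for every key under prefix+'/', fetching `batch` keys per page."""
    total = 0
    begin = prefix + '/'
    end = prefix + '0'  # '0' is the character that follows '/' in ASCII
    while True:
        kvs = range_query(store, begin, end, batch + 1)
        for kv in kvs[:batch]:
            add_fn(kv)
            total += 1
        if len(kvs) <= batch:
            break
        # The extra entry was not processed; it becomes the next start key
        begin = kvs[batch][0]
    return total


if __name__ == '__main__':
    demo = {'config/inode/1/%d' % i: {'size': i} for i in range(7)}
    seen = []
    print(foreach_prefix(demo, 'config/inode', seen.append, batch=3), 'keys visited')
```

Requesting one entry more than the page size avoids a final empty request: a short page signals the end of the range.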
||||||
|
def _update_volume_stats(self):
|
||||||
|
location_info = json.dumps({
|
||||||
|
'config': self.configuration.vitastor_config_path,
|
||||||
|
'etcd_address': self.configuration.vitastor_etcd_address,
|
||||||
|
'etcd_prefix': self.configuration.vitastor_etcd_prefix,
|
||||||
|
'pool_id': self.configuration.vitastor_pool_id,
|
||||||
|
})
|
||||||
|
|
||||||
|
stats = {
|
||||||
|
'vendor_name': 'Vitastor',
|
||||||
|
'driver_version': VERSION,
|
||||||
|
'storage_protocol': 'vitastor',
|
||||||
|
'total_capacity_gb': 'unknown',
|
||||||
|
'free_capacity_gb': 'unknown',
|
||||||
|
# FIXME check if safe_get is required
|
||||||
|
'reserved_percentage': self.configuration.safe_get('reserved_percentage'),
|
||||||
|
'multiattach': True,
|
||||||
|
'thin_provisioning_support': True,
|
||||||
|
'max_over_subscription_ratio': self.configuration.safe_get('max_over_subscription_ratio'),
|
||||||
|
'location_info': location_info,
|
||||||
|
'backend_state': 'down',
|
||||||
|
'volume_backend_name': self.configuration.safe_get('volume_backend_name') or 'vitastor',
|
||||||
|
'replication_enabled': False,
|
||||||
|
}
|
||||||
|
|
||||||
|
try:
|
||||||
|
pool_stats = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'pool/stats/'+str(self.cfg['pool_id']) } }
|
||||||
|
] })
|
||||||
|
total_provisioned = 0
|
||||||
|
def add_total(kv):
|
||||||
|
nonlocal total_provisioned
|
||||||
|
if kv['key'].find('@') >= 0:
|
||||||
|
total_provisioned += kv['value']['size']
|
||||||
|
self._etcd_foreach('config/inode/'+str(self.cfg['pool_id']), lambda kv: add_total(kv))
|
||||||
|
stats['provisioned_capacity_gb'] = round(total_provisioned/1024.0/1024.0/1024.0, 2)
|
||||||
|
pool_stats = pool_stats['responses'][0]['kvs']
|
||||||
|
if len(pool_stats):
|
||||||
|
pool_stats = pool_stats[0]['value']
|
||||||
|
stats['free_capacity_gb'] = round(1024.0*(pool_stats['total_raw_tb']-pool_stats['used_raw_tb'])/pool_stats['raw_to_usable'], 2)
|
||||||
|
stats['total_capacity_gb'] = round(1024.0*pool_stats['total_raw_tb'], 2)
|
||||||
|
stats['backend_state'] = 'up'
|
||||||
|
except Exception as e:
|
||||||
|
# just log and return unknown capacities
|
||||||
|
LOG.exception('error getting vitastor pool stats: '+str(e))
|
||||||
|
|
||||||
|
self._stats = stats
|
||||||
|
|
||||||
|
def _next_id(self, resp):
|
||||||
|
if len(resp['kvs']) == 0:
|
||||||
|
return (1, 0)
|
||||||
|
else:
|
||||||
|
return (1 + resp['kvs'][0]['value'], resp['kvs'][0]['mod_revision'])
|
||||||
|
|
||||||
|
def create_volume(self, volume):
|
||||||
|
"""Creates a logical volume."""
|
||||||
|
|
||||||
|
size = int(volume.size) * units.Gi
|
||||||
|
# FIXME: Check if convert_str is really required
|
||||||
|
vol_name = utils.convert_str(volume.name)
|
||||||
|
if vol_name.find('@') >= 0 or vol_name.find('/') >= 0:
|
||||||
|
raise exception.VolumeBackendAPIException(data = '@ and / are forbidden in volume and snapshot names')
|
||||||
|
|
||||||
|
LOG.debug("creating volume '%s'", vol_name)
|
||||||
|
|
||||||
|
self._create_image(vol_name, { 'size': size })
|
||||||
|
|
||||||
|
if volume.encryption_key_id:
|
||||||
|
self._create_encrypted_volume(volume, volume.obj_context)
|
||||||
|
|
||||||
|
volume_update = {}
|
||||||
|
return volume_update
|
||||||
|
|
||||||
|
def _create_encrypted_volume(self, volume, context):
|
||||||
|
"""Create a new LUKS encrypted image directly in Vitastor."""
|
||||||
|
vol_name = utils.convert_str(volume.name)
|
||||||
|
f, opts = self._encrypt_opts(volume, context)
|
||||||
|
# FIXME: Check if it works at all :-)
|
||||||
|
self._execute(
|
||||||
|
'qemu-img', 'create', '-f', 'luks', *opts,
|
||||||
|
'vitastor:image='+vol_name.replace(':', '\\:')+self._qemu_args(),
|
||||||
|
'%sM' % (volume.size * 1024)
|
||||||
|
)
|
||||||
|
f.close()
|
||||||
|
|
||||||
|
def _encrypt_opts(self, volume, context):
|
||||||
|
encryption = volume_utils.check_encryption_provider(self.db, volume, context)
|
||||||
|
# Fetch the key associated with the volume and decode the passphrase
|
||||||
|
keymgr = key_manager.API(CONF)
|
||||||
|
key = keymgr.get(context, encryption['encryption_key_id'])
|
||||||
|
passphrase = binascii.hexlify(key.get_encoded()).decode('utf-8')
|
||||||
|
# Decode the dm-crypt style cipher spec into something qemu-img can use
|
||||||
|
cipher_spec = image_utils.decode_cipher(encryption['cipher'], encryption['key_size'])
|
||||||
|
tmp_dir = volume_utils.image_conversion_dir()
|
||||||
|
f = tempfile.NamedTemporaryFile(prefix = 'luks_', dir = tmp_dir)
|
||||||
|
f.write(passphrase.encode('utf-8'))
|
||||||
|
f.flush()
|
||||||
|
return (f, [
|
||||||
|
'--object', 'secret,id=luks_sec,format=raw,file=%(passfile)s' % {'passfile': f.name},
|
||||||
|
'-o', 'key-secret=luks_sec,cipher-alg=%(cipher_alg)s,cipher-mode=%(cipher_mode)s,ivgen-alg=%(ivgen_alg)s' % cipher_spec,
|
||||||
|
])
|
||||||
|
|
||||||
|
def create_snapshot(self, snapshot):
|
||||||
|
"""Creates a volume snapshot."""
|
||||||
|
|
||||||
|
vol_name = utils.convert_str(snapshot.volume_name)
|
||||||
|
snap_name = utils.convert_str(snapshot.name)
|
||||||
|
if snap_name.find('@') >= 0 or snap_name.find('/') >= 0:
|
||||||
|
raise exception.VolumeBackendAPIException(data = '@ and / are forbidden in volume and snapshot names')
|
||||||
|
self._create_snapshot(vol_name, vol_name+'@'+snap_name)
|
||||||
|
|
||||||
|
def snapshot_revert_use_temp_snapshot(self):
|
||||||
|
"""Disable the use of a temporary snapshot on revert."""
|
||||||
|
return False
|
||||||
|
|
||||||
|
def revert_to_snapshot(self, context, volume, snapshot):
|
||||||
|
"""Revert a volume to a given snapshot."""
|
||||||
|
|
||||||
|
# FIXME Delete the image, then recreate it from the snapshot
|
||||||
|
|
||||||
|
def delete_snapshot(self, snapshot):
|
||||||
|
"""Deletes a snapshot."""
|
||||||
|
|
||||||
|
vol_name = utils.convert_str(snapshot.volume_name)
|
||||||
|
snap_name = utils.convert_str(snapshot.name)
|
||||||
|
|
||||||
|
# Find the snapshot
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'index/image/'+vol_name+'@'+snap_name } },
|
||||||
|
] })
|
||||||
|
if len(resp['responses'][0]['kvs']) == 0:
|
||||||
|
raise exception.SnapshotNotFound(snapshot_id = snap_name)
|
||||||
|
inode_id = int(resp['responses'][0]['kvs'][0]['value']['id'])
|
||||||
|
pool_id = int(resp['responses'][0]['kvs'][0]['value']['pool_id'])
|
||||||
|
parents = {}
|
||||||
|
parents[(pool_id << 48) | (inode_id & 0xffffffffffff)] = True
|
||||||
|
|
||||||
|
# Check if there are child volumes
|
||||||
|
children = self._child_count(parents)
|
||||||
|
if children > 0:
|
||||||
|
raise exception.SnapshotIsBusy(snapshot_name = snap_name)
|
||||||
|
|
||||||
|
# FIXME: We can't delete snapshots because we can't merge layers yet
|
||||||
|
raise exception.VolumeBackendAPIException(data = 'Snapshot delete (layer merge) is not implemented yet')
|
||||||
|
|
||||||
|
def _child_count(self, parents):
|
||||||
|
children = 0
|
||||||
|
def add_child(kv):
|
||||||
|
nonlocal children
|
||||||
|
children += self._check_parent(kv, parents)
|
||||||
|
self._etcd_foreach('config/inode', lambda kv: add_child(kv))
|
||||||
|
return children
|
||||||
|
|
||||||
|
def _check_parent(self, kv, parents):
|
||||||
|
if 'parent_id' not in kv['value']:
|
||||||
|
return 0
|
||||||
|
parent_id = kv['value']['parent_id']
|
||||||
|
_, _, pool_id, inode_id = kv['key'].split('/')
|
||||||
|
parent_pool_id = pool_id
|
||||||
|
if 'parent_pool_id' in kv['value'] and kv['value']['parent_pool_id']:
|
||||||
|
parent_pool_id = kv['value']['parent_pool_id']
|
||||||
|
inode = (int(pool_id) << 48) | (int(inode_id) & 0xffffffffffff)
|
||||||
|
parent = (int(parent_pool_id) << 48) | (int(parent_id) & 0xffffffffffff)
|
||||||
|
if parent in parents and inode not in parents:
|
||||||
|
return 1
|
||||||
|
return 0
|
||||||
|
|
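Several methods combine a pool ID and an inode ID into one integer with `(pool_id << 48) | (inode_id & 0xffffffffffff)`. A small sketch of that packing and its inverse follows; the helper names are hypothetical and `unpack_inode` has no counterpart in the driver.

```python
INODE_BITS = 48
INODE_MASK = (1 << INODE_BITS) - 1  # 0xffffffffffff


def pack_inode(pool_id, inode_id):
    """Pool ID goes into the bits above bit 47, the inode ID into the low 48 bits."""
    return (int(pool_id) << INODE_BITS) | (int(inode_id) & INODE_MASK)


def unpack_inode(packed):
    """Split a packed number back into (pool_id, inode_id)."""
    return packed >> INODE_BITS, packed & INODE_MASK


if __name__ == '__main__':
    packed = pack_inode(1, 5)
    print(hex(packed), unpack_inode(packed))
```

Because etcd keys are split into strings, `pack_inode` accepts either integers or numeric strings.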
||||||
|
def create_cloned_volume(self, volume, src_vref):
|
||||||
|
"""Create a cloned volume from another volume."""
|
||||||
|
|
||||||
|
size = int(volume.size) * units.Gi
|
||||||
|
src_name = utils.convert_str(src_vref.name)
|
||||||
|
dest_name = utils.convert_str(volume.name)
|
||||||
|
if dest_name.find('@') >= 0 or dest_name.find('/') >= 0:
|
||||||
|
raise exception.VolumeBackendAPIException(data = '@ and / are forbidden in volume and snapshot names')
|
||||||
|
|
||||||
|
# FIXME Do full copy if requested (cfg.disable_clone)
|
||||||
|
|
||||||
|
if src_vref.admin_metadata.get('readonly') == 'True':
|
||||||
|
# source volume is a volume-image cache entry or other readonly volume
|
||||||
|
# clone without intermediate snapshot
|
||||||
|
src = self._get_image(src_name)
|
||||||
|
LOG.debug("creating image '%s' from '%s'", dest_name, src_name)
|
||||||
|
new_cfg = self._create_image(dest_name, {
|
||||||
|
'size': size,
|
||||||
|
'parent_id': src['idx']['id'],
|
||||||
|
'parent_pool_id': src['idx']['pool_id'],
|
||||||
|
})
|
||||||
|
return {}
|
||||||
|
|
||||||
|
clone_snap = "%s@%s.clone_snap" % (src_name, dest_name)
|
||||||
|
make_img = True
|
||||||
|
if (volume.display_name and
|
||||||
|
volume.display_name.startswith('image-') and
|
||||||
|
src_vref.project_id != volume.project_id):
|
||||||
|
# idiotic openstack creates image-volume cache entries
|
||||||
|
# as clones of normal VM volumes... :-X prevent it :-D
|
||||||
|
clone_snap = dest_name
|
||||||
|
make_img = False
|
||||||
|
|
||||||
|
LOG.debug("creating layer '%s' under '%s'", clone_snap, src_name)
|
||||||
|
new_cfg = self._create_snapshot(src_name, clone_snap, True)
|
||||||
|
if make_img:
|
||||||
|
# Then create a clone from it
|
||||||
|
new_cfg = self._create_image(dest_name, {
|
||||||
|
'size': size,
|
||||||
|
'parent_id': new_cfg['parent_id'],
|
||||||
|
'parent_pool_id': new_cfg['parent_pool_id'],
|
||||||
|
})
|
||||||
|
|
||||||
|
return {}
|
||||||
|
|
||||||
|
def create_volume_from_snapshot(self, volume, snapshot):
|
||||||
|
"""Creates a cloned volume from an existing snapshot."""
|
||||||
|
|
||||||
|
vol_name = utils.convert_str(volume.name)
|
||||||
|
snap_name = utils.convert_str(snapshot.name)
|
||||||
|
|
||||||
|
snap = self._get_image(vol_name+'@'+snap_name)
|
||||||
|
if not snap:
|
||||||
|
raise exception.SnapshotNotFound(snapshot_id = snap_name)
|
||||||
|
snap_inode_id = int(snap['idx']['id'])
|
||||||
|
snap_pool_id = int(snap['idx']['pool_id'])
|
||||||
|
|
||||||
|
size = snap['cfg']['size']
|
||||||
|
if int(volume.size):
|
||||||
|
size = int(volume.size) * units.Gi
|
||||||
|
new_cfg = self._create_image(vol_name, {
|
||||||
|
'size': size,
|
||||||
|
'parent_id': snap['idx']['id'],
|
||||||
|
'parent_pool_id': snap['idx']['pool_id'],
|
||||||
|
})
|
||||||
|
|
||||||
|
return {}
|
||||||
|
|
||||||
|
def _vitastor_args(self):
|
||||||
|
args = []
|
||||||
|
for k in [ 'config_path', 'etcd_address', 'etcd_prefix' ]:
|
||||||
|
v = self.configuration.safe_get('vitastor_'+k)
|
||||||
|
if v:
|
||||||
|
args.extend(['--'+k, v])
|
||||||
|
return args
|
||||||
|
|
||||||
|
def _qemu_args(self):
|
||||||
|
args = ''
|
||||||
|
for k in [ 'config_path', 'etcd_address', 'etcd_prefix' ]:
|
||||||
|
v = self.configuration.safe_get('vitastor_'+k)
|
||||||
|
kk = k
|
||||||
|
if kk == 'etcd_address':
|
||||||
|
# FIXME use etcd_address in qemu driver
|
||||||
|
kk = 'etcd_host'
|
||||||
|
if v:
|
||||||
|
args += ':'+kk+'='+v.replace(':', '\\:')
|
||||||
|
return args
|
||||||
|
|
||||||
|
def delete_volume(self, volume):
|
||||||
|
"""Deletes a logical volume."""
|
||||||
|
|
||||||
|
vol_name = utils.convert_str(volume.name)
|
||||||
|
|
||||||
|
# Find the volume and all its snapshots
|
||||||
|
range_end = b'index/image/' + vol_name.encode('utf-8')
|
||||||
|
range_end = range_end[0 : len(range_end)-1] + six.int2byte(range_end[len(range_end)-1] + 1)
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'index/image/'+vol_name, 'range_end': range_end } },
|
||||||
|
] })
|
||||||
|
if len(resp['responses'][0]['kvs']) == 0:
|
||||||
|
# already deleted
|
||||||
|
LOG.info("volume %s no longer exists in backend", vol_name)
|
||||||
|
return
|
||||||
|
layers = resp['responses'][0]['kvs']
|
||||||
|
layer_ids = {}
|
||||||
|
for kv in layers:
|
||||||
|
inode_id = int(kv['value']['id'])
|
||||||
|
pool_id = int(kv['value']['pool_id'])
|
||||||
|
inode_pool_id = (pool_id << 48) | (inode_id & 0xffffffffffff)
|
||||||
|
layer_ids[inode_pool_id] = True
|
||||||
|
|
||||||
|
# Check if the volume has clones and raise 'busy' if so
|
||||||
|
children = self._child_count(layer_ids)
|
||||||
|
if children > 0:
|
||||||
|
raise exception.VolumeIsBusy(volume_name = vol_name)
|
||||||
|
|
||||||
|
# Clear data
|
||||||
|
for kv in layers:
|
||||||
|
args = [
|
||||||
|
'vitastor-rm', '--pool', str(kv['value']['pool_id']),
|
||||||
|
'--inode', str(kv['value']['id']), '--progress', '0',
|
||||||
|
*(self._vitastor_args())
|
||||||
|
]
|
||||||
|
try:
|
||||||
|
self._execute(*args)
|
||||||
|
except processutils.ProcessExecutionError as exc:
|
||||||
|
LOG.error("Failed to remove layer %s: %s", kv['key'], exc)
|
||||||
|
raise exception.VolumeBackendAPIException(data = exc.stderr)
|
||||||
|
|
||||||
|
# Delete all layers from etcd
|
||||||
|
requests = []
|
||||||
|
for kv in layers:
|
||||||
|
requests.append({ 'request_delete_range': { 'key': kv['key'] } })
|
||||||
|
requests.append({ 'request_delete_range': { 'key': 'config/inode/'+str(kv['value']['pool_id'])+'/'+str(kv['value']['id']) } })
|
||||||
|
self._etcd_txn({ 'success': requests })
|
||||||
|
|
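`delete_volume` builds the end of its key range by incrementing the last byte of the start key, so that one request returns the volume and its `@snapshot` entries. A hedged sketch of that computation is shown here; the function name `prefix_range_end` is hypothetical, and the carry over trailing `0xff` bytes is an addition that the method above does not perform.

```python
def prefix_range_end(prefix):
    """Smallest byte string greater than every key that starts with `prefix`."""
    data = bytearray(prefix)
    while data:
        if data[-1] < 0xff:
            data[-1] += 1
            return bytes(data)
        # A trailing 0xff cannot be incremented: drop it and carry
        data.pop()
    raise ValueError('prefix has no upper bound')


if __name__ == '__main__':
    print(prefix_range_end(b'index/image/vol'))
```

Note that such a range matches every key sharing the prefix, not only `name@snapshot` entries, so it relies on volume names not being prefixes of one another.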
||||||
|
def retype(self, context, volume, new_type, diff, host):
|
||||||
|
"""Change extra type specifications for a volume."""
|
||||||
|
|
||||||
|
# FIXME Maybe (in the future) support multiple pools as different types
|
||||||
|
return True, {}
|
||||||
|
|
||||||
|
def ensure_export(self, context, volume):
|
||||||
|
"""Synchronously recreates an export for a logical volume."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
def create_export(self, context, volume, connector):
|
||||||
|
"""Exports the volume."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
def remove_export(self, context, volume):
|
||||||
|
"""Removes an export for a logical volume."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
def _create_image(self, vol_name, cfg):
|
||||||
|
pool_s = str(self.cfg['pool_id'])
|
||||||
|
image_id = 0
|
||||||
|
while image_id == 0:
|
||||||
|
# check if the image already exists and find a free ID
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'index/image/'+vol_name } },
|
||||||
|
{ 'request_range': { 'key': 'index/maxid/'+pool_s } },
|
||||||
|
] })
|
||||||
|
if len(resp['responses'][0]['kvs']) > 0:
|
||||||
|
# already exists
|
||||||
|
raise exception.VolumeBackendAPIException(data = 'Volume '+vol_name+' already exists')
|
||||||
|
image_id, id_mod = self._next_id(resp['responses'][1])
|
||||||
|
# try to create the image
|
||||||
|
resp = self._etcd_txn({ 'compare': [
|
||||||
|
{ 'target': 'MOD', 'mod_revision': id_mod, 'key': 'index/maxid/'+pool_s },
|
||||||
|
{ 'target': 'VERSION', 'version': 0, 'key': 'index/image/'+vol_name },
|
||||||
|
{ 'target': 'VERSION', 'version': 0, 'key': 'config/inode/'+pool_s+'/'+str(image_id) },
|
||||||
|
], 'success': [
|
||||||
|
{ 'request_put': { 'key': 'index/maxid/'+pool_s, 'value': image_id } },
|
||||||
|
{ 'request_put': { 'key': 'index/image/'+vol_name, 'value': json.dumps({
|
||||||
|
'id': image_id, 'pool_id': self.cfg['pool_id']
|
||||||
|
}) } },
|
||||||
|
{ 'request_put': { 'key': 'config/inode/'+pool_s+'/'+str(image_id), 'value': json.dumps({
|
||||||
|
**cfg, 'name': vol_name,
|
||||||
|
}) } },
|
||||||
|
] })
|
||||||
|
if not resp.get('succeeded'):
|
||||||
|
# repeat
|
||||||
|
image_id = 0
|
||||||
|
|
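`_create_image` is an optimistic retry loop: read the current maximum ID, then write the new keys in a transaction that only succeeds if nothing changed in between. The sketch below reproduces that idea with a toy in-memory store rather than etcd. `TinyStore` and `allocate_image` are hypothetical names, and the toy uses a modification revision of 0 to mean "key absent", where the driver compares `VERSION` against 0.

```python
class TinyStore:
    """Toy key-value store with per-key modification revisions (not etcd)."""

    def __init__(self):
        self.data = {}  # key -> (value, mod_revision)
        self.revision = 0

    def get(self, key):
        return self.data.get(key, (None, 0))

    def txn(self, compare, puts):
        """Apply all puts only if every (key, expected_mod_revision) matches."""
        for key, expected in compare:
            if self.get(key)[1] != expected:
                return False
        self.revision += 1
        for key, value in puts:
            self.data[key] = (value, self.revision)
        return True


def allocate_image(store, pool_id, name):
    """Allocate the next inode ID in the pool and register the image under it."""
    maxid_key = 'index/maxid/%s' % pool_id
    index_key = 'index/image/' + name
    while True:
        if store.get(index_key)[0] is not None:
            raise ValueError('image %s already exists' % name)
        max_id, id_mod = store.get(maxid_key)
        new_id = (max_id or 0) + 1
        inode_key = 'config/inode/%s/%d' % (pool_id, new_id)
        ok = store.txn(
            compare=[(maxid_key, id_mod), (index_key, 0), (inode_key, 0)],
            puts=[
                (maxid_key, new_id),
                (index_key, {'id': new_id, 'pool_id': pool_id}),
                (inode_key, {'name': name}),
            ],
        )
        if ok:
            return new_id
        # Another writer changed one of the keys first: read again and retry


if __name__ == '__main__':
    s = TinyStore()
    print(allocate_image(s, 1, 'vol-a'), allocate_image(s, 1, 'vol-b'))
```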
||||||
|
def _create_snapshot(self, vol_name, snap_vol_name, allow_existing = False):
|
||||||
|
while True:
|
||||||
|
# check if the image already exists and snapshot doesn't
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'index/image/'+vol_name } },
|
||||||
|
{ 'request_range': { 'key': 'index/image/'+snap_vol_name } },
|
||||||
|
] })
|
||||||
|
if len(resp['responses'][0]['kvs']) == 0:
|
||||||
|
raise exception.VolumeBackendAPIException(data = 'Volume '+vol_name+' does not exist')
|
||||||
|
if len(resp['responses'][1]['kvs']) > 0:
|
||||||
|
if allow_existing:
|
||||||
|
snap_idx = resp['responses'][1]['kvs'][0]['value']
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'config/inode/'+str(snap_idx['pool_id'])+'/'+str(snap_idx['id']) } },
|
||||||
|
] })
|
||||||
|
if len(resp['responses'][0]['kvs']) == 0:
|
||||||
|
raise exception.VolumeBackendAPIException(data =
|
||||||
|
'Volume '+snap_vol_name+' is already indexed, but does not exist'
|
||||||
|
)
|
||||||
|
return resp['responses'][0]['kvs'][0]['value']
|
||||||
|
raise exception.VolumeBackendAPIException(
|
||||||
|
data = 'Volume '+snap_vol_name+' already exists'
|
||||||
|
)
|
||||||
|
vol_idx = resp['responses'][0]['kvs'][0]['value']
|
||||||
|
vol_idx_mod = resp['responses'][0]['kvs'][0]['mod_revision']
|
||||||
|
# get image inode config and find a new ID
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'config/inode/'+str(vol_idx['pool_id'])+'/'+str(vol_idx['id']) } },
|
||||||
|
{ 'request_range': { 'key': 'index/maxid/'+str(self.cfg['pool_id']) } },
|
||||||
|
] })
|
||||||
|
if len(resp['responses'][0]['kvs']) == 0:
|
||||||
|
raise exception.VolumeBackendAPIException(data = 'Volume '+vol_name+' does not exist')
|
||||||
|
vol_cfg = resp['responses'][0]['kvs'][0]['value']
|
||||||
|
vol_mod = resp['responses'][0]['kvs'][0]['mod_revision']
|
||||||
|
new_id, id_mod = self._next_id(resp['responses'][1])
|
||||||
|
# try to redirect image to the new inode
|
||||||
|
new_cfg = {
|
||||||
|
**vol_cfg, 'name': vol_name, 'parent_id': vol_idx['id'], 'parent_pool_id': vol_idx['pool_id']
|
||||||
|
}
|
||||||
|
resp = self._etcd_txn({ 'compare': [
|
||||||
|
{ 'target': 'MOD', 'mod_revision': vol_idx_mod, 'key': 'index/image/'+vol_name },
|
||||||
|
{ 'target': 'MOD', 'mod_revision': vol_mod, 'key': 'config/inode/'+str(vol_idx['pool_id'])+'/'+str(vol_idx['id']) },
|
||||||
|
{ 'target': 'MOD', 'mod_revision': id_mod, 'key': 'index/maxid/'+str(self.cfg['pool_id']) },
|
||||||
|
{ 'target': 'VERSION', 'version': 0, 'key': 'index/image/'+snap_vol_name },
|
||||||
|
{ 'target': 'VERSION', 'version': 0, 'key': 'config/inode/'+str(self.cfg['pool_id'])+'/'+str(new_id) },
|
||||||
|
], 'success': [
|
||||||
|
{ 'request_put': { 'key': 'index/maxid/'+str(self.cfg['pool_id']), 'value': new_id } },
|
||||||
|
{ 'request_put': { 'key': 'index/image/'+vol_name, 'value': json.dumps({
|
||||||
|
'id': new_id, 'pool_id': self.cfg['pool_id']
|
||||||
|
}) } },
|
||||||
|
{ 'request_put': { 'key': 'config/inode/'+str(self.cfg['pool_id'])+'/'+str(new_id), 'value': json.dumps(new_cfg) } },
|
||||||
|
{ 'request_put': { 'key': 'index/image/'+snap_vol_name, 'value': json.dumps({
|
||||||
|
'id': vol_idx['id'], 'pool_id': vol_idx['pool_id']
|
||||||
|
}) } },
|
||||||
|
{ 'request_put': { 'key': 'config/inode/'+str(vol_idx['pool_id'])+'/'+str(vol_idx['id']), 'value': json.dumps({
|
||||||
|
**vol_cfg, 'name': snap_vol_name, 'readonly': True
|
||||||
|
}) } }
|
||||||
|
] })
|
||||||
|
if resp.get('succeeded'):
|
||||||
|
return new_cfg
|
||||||
|
|
||||||
|
def initialize_connection(self, volume, connector):
|
||||||
|
data = {
|
||||||
|
'driver_volume_type': 'vitastor',
|
||||||
|
'data': {
|
||||||
|
'config_path': self.configuration.vitastor_config_path,
|
||||||
|
'etcd_address': self.configuration.vitastor_etcd_address,
|
||||||
|
'etcd_prefix': self.configuration.vitastor_etcd_prefix,
|
||||||
|
'name': volume.name,
|
||||||
|
'logical_block_size': 512,
|
||||||
|
'physical_block_size': 4096,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
LOG.debug('connection data: %s', data)
|
||||||
|
return data
|
||||||
|
|
||||||
|
def terminate_connection(self, volume, connector, **kwargs):
|
||||||
|
pass
|
||||||
|
|
||||||
|
def clone_image(self, context, volume, image_location, image_meta, image_service):
|
||||||
|
if image_location:
|
||||||
|
# Note: image_location[0] is glance image direct_url.
|
||||||
|
# image_location[1] contains the list of all locations (including
|
||||||
|
# direct_url) or None if show_multiple_locations is False in
|
||||||
|
# glance configuration.
|
||||||
|
if image_location[1]:
|
||||||
|
url_locations = [location['url'] for location in image_location[1]]
|
||||||
|
else:
|
||||||
|
url_locations = [image_location[0]]
|
||||||
|
# iterate all locations to look for a cloneable one.
|
||||||
|
for url_location in url_locations:
|
||||||
|
if url_location and url_location.startswith('cinder://'):
|
||||||
|
# The idea is to use cinder://<volume-id> Glance volumes as base images
|
||||||
|
base_vol = self.db.volume_get(context, url_location[len('cinder://') : ])
|
||||||
|
if not base_vol or base_vol.volume_type_id != volume.volume_type_id:
|
||||||
|
continue
|
||||||
|
size = int(volume.size) * units.Gi
|
||||||
|
dest_name = utils.convert_str(volume.name)
|
||||||
|
# Find or create the base snapshot
|
||||||
|
snap_cfg = self._create_snapshot(base_vol.name, base_vol.name+'@.clone_snap', True)
|
||||||
|
# Then create a clone from it
|
||||||
|
new_cfg = self._create_image(dest_name, {
|
||||||
|
'size': size,
|
||||||
|
'parent_id': snap_cfg['parent_id'],
|
||||||
|
'parent_pool_id': snap_cfg['parent_pool_id'],
|
||||||
|
})
|
||||||
|
return ({}, True)
|
||||||
|
return ({}, False)
|
||||||
|
|
||||||
|
def copy_image_to_encrypted_volume(self, context, volume, image_service, image_id):
|
||||||
|
self.copy_image_to_volume(context, volume, image_service, image_id, encrypted = True)
|
||||||
|
|
||||||
|
def copy_image_to_volume(self, context, volume, image_service, image_id, encrypted = False):
|
||||||
|
tmp_dir = volume_utils.image_conversion_dir()
|
||||||
|
with tempfile.NamedTemporaryFile(dir = tmp_dir) as tmp:
|
||||||
|
image_utils.fetch_to_raw(
|
||||||
|
context, image_service, image_id, tmp.name,
|
||||||
|
self.configuration.volume_dd_blocksize, size = volume.size
|
||||||
|
)
|
||||||
|
out_format = [ '-O', 'raw' ]
|
||||||
|
if encrypted:
|
||||||
|
key_file, opts = self._encrypt_opts(volume, context)
|
||||||
|
out_format = [ '-O', 'luks', *opts ]
|
||||||
|
dest_name = utils.convert_str(volume.name)
|
||||||
|
self._try_execute(
|
||||||
|
'qemu-img', 'convert', '-f', 'raw', tmp.name, *out_format,
|
||||||
|
'vitastor:image='+dest_name.replace(':', '\\:')+self._qemu_args()
|
||||||
|
)
|
||||||
|
if encrypted:
|
||||||
|
key_file.close()
|
||||||
|
|
||||||
|
def copy_volume_to_image(self, context, volume, image_service, image_meta):
|
||||||
|
tmp_dir = volume_utils.image_conversion_dir()
|
||||||
|
tmp_file = os.path.join(tmp_dir, volume.name + '-' + image_meta['id'])
|
||||||
|
with fileutils.remove_path_on_error(tmp_file):
|
||||||
|
vol_name = utils.convert_str(volume.name)
|
||||||
|
self._try_execute(
|
||||||
|
'qemu-img', 'convert', '-f', 'raw',
|
||||||
|
'vitastor:image='+vol_name.replace(':', '\\:')+self._qemu_args(),
|
||||||
|
'-O', 'raw', tmp_file
|
||||||
|
)
|
||||||
|
# FIXME: Copy directly if the destination image is also in Vitastor
|
||||||
|
volume_utils.upload_volume(context, image_service, image_meta, tmp_file, volume)
|
||||||
|
os.unlink(tmp_file)
|
||||||
|
|
||||||
|
def _get_image(self, vol_name):
|
||||||
|
# find the image
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'index/image/'+vol_name } },
|
||||||
|
] })
|
||||||
|
if len(resp['responses'][0]['kvs']) == 0:
|
||||||
|
return None
|
||||||
|
vol_idx = resp['responses'][0]['kvs'][0]['value']
|
||||||
|
vol_idx_mod = resp['responses'][0]['kvs'][0]['mod_revision']
|
||||||
|
# get image inode config
|
||||||
|
resp = self._etcd_txn({ 'success': [
|
||||||
|
{ 'request_range': { 'key': 'config/inode/'+str(vol_idx['pool_id'])+'/'+str(vol_idx['id']) } },
|
||||||
|
] })
|
||||||
|
if len(resp['responses'][0]['kvs']) == 0:
|
||||||
|
return None
|
||||||
|
vol_cfg = resp['responses'][0]['kvs'][0]['value']
|
||||||
|
vol_cfg_mod = resp['responses'][0]['kvs'][0]['mod_revision']
|
||||||
|
return {
|
||||||
|
'cfg': vol_cfg,
|
||||||
|
'cfg_mod': vol_cfg_mod,
|
||||||
|
'idx': vol_idx,
|
||||||
|
'idx_mod': vol_idx_mod,
|
||||||
|
}
|
||||||
|
|
||||||
|
def extend_volume(self, volume, new_size):
|
||||||
|
"""Extend an existing volume."""
|
||||||
|
vol_name = utils.convert_str(volume.name)
|
||||||
|
while True:
|
||||||
|
vol = self._get_image(vol_name)
|
||||||
|
if not vol:
|
||||||
|
raise exception.VolumeBackendAPIException(data = 'Volume '+vol_name+' does not exist')
|
||||||
|
# change size
|
||||||
|
size = int(new_size) * units.Gi
|
||||||
|
if size == vol['cfg']['size']:
|
||||||
|
break
|
||||||
|
resp = self._etcd_txn({ 'compare': [ {
|
||||||
|
'target': 'MOD',
|
||||||
|
'mod_revision': vol['cfg_mod'],
|
||||||
|
'key': 'config/inode/'+str(vol['idx']['pool_id'])+'/'+str(vol['idx']['id']),
|
||||||
|
} ], 'success': [
|
||||||
|
{ 'request_put': {
|
||||||
|
'key': 'config/inode/'+str(vol['idx']['pool_id'])+'/'+str(vol['idx']['id']),
|
||||||
|
'value': json.dumps({ **vol['cfg'], 'size': size }),
|
||||||
|
} },
|
||||||
|
] })
|
||||||
|
if resp.get('succeeded'):
|
||||||
|
break
|
||||||
|
LOG.debug(
|
||||||
|
"Extend volume from %(old_size)s GB to %(new_size)s GB.",
|
||||||
|
{'old_size': volume.size, 'new_size': new_size}
|
||||||
|
)
|
||||||
|
|
||||||
|
    def _add_manageable_volume(self, kv, manageable_volumes, cinder_ids):
        cfg = kv['value']
        if kv['key'].find('@') >= 0:
            # Skip snapshots: their names contain '@'
            return
        image_id = volume_utils.extract_id_from_volume_name(cfg['name'])
        image_info = {
            # The reference is the image name stored in the inode config
            'reference': {'source-name': cfg['name']},
            # Size in whole GiB, rounded up
            'size': int(math.ceil(float(cfg['size']) / units.Gi)),
            'cinder_id': None,
            'extra_info': None,
        }
        if image_id in cinder_ids:
            image_info['cinder_id'] = image_id
            image_info['safe_to_manage'] = False
            image_info['reason_not_safe'] = 'already managed'
        else:
            image_info['safe_to_manage'] = True
            image_info['reason_not_safe'] = None
        manageable_volumes.append(image_info)
|
||||||
|
|
||||||
|
def get_manageable_volumes(self, cinder_volumes, marker, limit, offset, sort_keys, sort_dirs):
|
||||||
|
manageable_volumes = []
|
||||||
|
cinder_ids = [resource['id'] for resource in cinder_volumes]
|
||||||
|
|
||||||
|
# List all volumes
|
||||||
|
# FIXME: It's possible to use pagination in our case, but.. do we want it?
|
||||||
|
self._etcd_foreach('config/inode/'+str(self.cfg['pool_id']),
|
||||||
|
lambda kv: self._add_manageable_volume(kv, manageable_volumes, cinder_ids))
|
||||||
|
|
||||||
|
return volume_utils.paginate_entries_list(
|
||||||
|
manageable_volumes, marker, limit, offset, sort_keys, sort_dirs)
|
||||||
|
|
||||||
|
    def _get_existing_name(self, existing_ref):
        # Accept either a bare name or a {'source-name': <name>} dict
        if not isinstance(existing_ref, dict):
            existing_ref = {"source-name": existing_ref}
        if 'source-name' not in existing_ref:
            reason = _('Reference must contain source-name element.')
            raise exception.ManageExistingInvalidReference(existing_ref=existing_ref, reason=reason)
        src_name = utils.convert_str(existing_ref['source-name'])
        if not src_name:
            reason = _('Reference must contain source-name element.')
            raise exception.ManageExistingInvalidReference(existing_ref=existing_ref, reason=reason)
        return src_name
|
||||||
|
|
||||||
|
def manage_existing_get_size(self, volume, existing_ref):
|
||||||
|
"""Return size of an existing image for manage_existing.
|
||||||
|
|
||||||
|
:param volume: volume ref info to be set
|
||||||
|
:param existing_ref: {'source-name': <image name>}
|
||||||
|
"""
|
||||||
|
src_name = self._get_existing_name(existing_ref)
|
||||||
|
vol = self._get_image(src_name)
|
||||||
|
if not vol:
|
||||||
|
raise exception.VolumeBackendAPIException(data = 'Volume '+src_name+' does not exist')
|
||||||
|
return int(math.ceil(float(vol['cfg']['size']) / units.Gi))
|
||||||
|
|
||||||
|
def manage_existing(self, volume, existing_ref):
|
||||||
|
"""Manages an existing image.
|
||||||
|
|
||||||
|
Renames the image to match the expected name for the volume.
|
||||||
|
|
||||||
|
:param volume: volume ref info to be set
|
||||||
|
:param existing_ref: {'source-name': <image name>}
|
||||||
|
"""
|
||||||
|
from_name = self._get_existing_name(existing_ref)
|
||||||
|
to_name = utils.convert_str(volume.name)
|
||||||
|
self._rename(from_name, to_name)
|
||||||
|
|
||||||
|
def _rename(self, from_name, to_name):
|
||||||
|
while True:
|
||||||
|
vol = self._get_image(from_name)
|
||||||
|
if not vol:
|
||||||
|
raise exception.VolumeBackendAPIException(data = 'Volume '+from_name+' does not exist')
|
||||||
|
to = self._get_image(to_name)
|
||||||
|
if to:
|
||||||
|
raise exception.VolumeBackendAPIException(data = 'Volume '+to_name+' already exists')
|
||||||
|
resp = self._etcd_txn({ 'compare': [
|
||||||
|
{ 'target': 'MOD', 'mod_revision': vol['idx_mod'], 'key': 'index/image/'+vol['cfg']['name'] },
|
||||||
|
{ 'target': 'MOD', 'mod_revision': vol['cfg_mod'], 'key': 'config/inode/'+str(vol['idx']['pool_id'])+'/'+str(vol['idx']['id']) },
|
||||||
|
{ 'target': 'VERSION', 'version': 0, 'key': 'index/image/'+to_name },
|
||||||
|
], 'success': [
|
||||||
|
{ 'request_delete_range': { 'key': 'index/image/'+vol['cfg']['name'] } },
|
||||||
|
{ 'request_put': { 'key': 'index/image/'+to_name, 'value': json.dumps(vol['idx']) } },
|
||||||
|
{ 'request_put': { 'key': 'config/inode/'+str(vol['idx']['pool_id'])+'/'+str(vol['idx']['id']),
|
||||||
|
'value': json.dumps({ **vol['cfg'], 'name': to_name }) } },
|
||||||
|
] })
|
||||||
|
if resp.get('succeeded'):
|
||||||
|
break
|
||||||
|
|
||||||
|
def unmanage(self, volume):
|
||||||
|
pass
|
||||||
|
|
||||||
|
    def _add_manageable_snapshot(self, kv, manageable_snapshots, cinder_ids):
        cfg = kv['value']
        dog = kv['key'].find('@')
        if dog < 0:
            # Not a snapshot: snapshot names have the form <image>@<snapshot>
            return
        image_name = kv['key'][0 : dog]
        snap_name = kv['key'][dog+1 : ]
        snapshot_id = volume_utils.extract_id_from_snapshot_name(snap_name)
        snapshot_info = {
            'reference': {'source-name': snap_name},
            # Size in whole GiB, rounded up
            'size': int(math.ceil(float(cfg['size']) / units.Gi)),
            'cinder_id': None,
            'extra_info': None,
            'safe_to_manage': False,
            'reason_not_safe': None,
            'source_reference': {'source-name': image_name}
        }
        if snapshot_id in cinder_ids:
            # Exclude snapshots already managed.
            snapshot_info['reason_not_safe'] = ('already managed')
            snapshot_info['cinder_id'] = snapshot_id
        elif snap_name.endswith('.clone_snap'):
            # Exclude clone snapshot.
            snapshot_info['reason_not_safe'] = ('used for clone snap')
        else:
            snapshot_info['safe_to_manage'] = True
        manageable_snapshots.append(snapshot_info)
|
||||||
|
|
||||||
|
    def get_manageable_snapshots(self, cinder_snapshots, marker, limit, offset, sort_keys, sort_dirs):
        """List manageable snapshots in Vitastor."""
        manageable_snapshots = []
        cinder_snapshot_ids = [resource['id'] for resource in cinder_snapshots]
        # List all inodes in the pool and keep only the snapshots
        # FIXME: It's possible to use pagination in our case, but.. do we want it?
        self._etcd_foreach('config/inode/'+str(self.cfg['pool_id']),
            lambda kv: self._add_manageable_snapshot(kv, manageable_snapshots, cinder_snapshot_ids))
        return volume_utils.paginate_entries_list(
            manageable_snapshots, marker, limit, offset, sort_keys, sort_dirs)
|
||||||
|
|
||||||
|
    def manage_existing_snapshot_get_size(self, snapshot, existing_ref):
        """Return size of an existing snapshot for manage_existing_snapshot.

        :param snapshot: snapshot ref info to be set
        :param existing_ref: {'source-name': <name of snapshot>}
        """
        vol_name = utils.convert_str(snapshot.volume_name)
        snap_name = self._get_existing_name(existing_ref)
        vol = self._get_image(vol_name+'@'+snap_name)
        if not vol:
            raise exception.ManageExistingInvalidReference(
                existing_ref=existing_ref, reason='Specified snapshot does not exist.'
            )
        # Size in whole GiB, rounded up
        return int(math.ceil(float(vol['cfg']['size']) / units.Gi))
|
||||||
|
|
||||||
|
def manage_existing_snapshot(self, snapshot, existing_ref):
|
||||||
|
"""Manages an existing snapshot.
|
||||||
|
|
||||||
|
Renames the snapshot to match the expected name for the snapshot.
|
||||||
|
Error checking done by manage_existing_snapshot_get_size is not repeated.
|
||||||
|
|
||||||
|
:param snapshot: snapshot ref info to be set
|
||||||
|
:param existing_ref: {'source-name': <name of snapshot>}
|
||||||
|
"""
|
||||||
|
vol_name = utils.convert_str(snapshot.volume_name)
|
||||||
|
snap_name = self._get_existing_name(existing_ref)
|
||||||
|
from_name = vol_name+'@'+snap_name
|
||||||
|
to_name = vol_name+'@'+utils.convert_str(snapshot.name)
|
||||||
|
self._rename(from_name, to_name)
|
||||||
|
|
||||||
|
def unmanage_snapshot(self, snapshot):
|
||||||
|
"""Removes the specified snapshot from Cinder management."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
def _dumps(self, obj):
|
||||||
|
return json.dumps(obj, separators=(',', ':'), sort_keys=True)
|
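The driver methods above (`extend_volume`, `_rename`, snapshot creation) all follow the same pattern: read the current value and its `mod_revision`, then submit a transaction that only applies if the revision is unchanged, and retry otherwise. Below is a minimal sketch of that optimistic-locking loop. `FakeEtcd` and `resize` are hypothetical names invented for this illustration; the in-memory store only imitates the compare/success semantics and is not the etcd API or the driver's `_etcd_txn` helper.

```python
import json

class FakeEtcd:
    """Hypothetical in-memory stand-in for etcd: key -> (value, mod_revision)."""
    def __init__(self):
        self.kv = {}
        self.revision = 0

    def get(self, key):
        # Returns (value, mod_revision) or None if the key does not exist
        return self.kv.get(key)

    def txn(self, compare, success):
        # Apply the puts only if every compared key still has the expected revision
        for c in compare:
            cur = self.kv.get(c['key'])
            cur_mod = cur[1] if cur else 0
            if cur_mod != c['mod_revision']:
                return {'succeeded': False}
        self.revision += 1
        for put in success:
            self.kv[put['key']] = (put['value'], self.revision)
        return {'succeeded': True}

def resize(etcd, key, new_size, max_attempts=10):
    """Read-modify-write with optimistic locking; retry on a concurrent update."""
    for _ in range(max_attempts):
        cur = etcd.get(key)
        if cur is None:
            raise KeyError(key)
        value, mod = cur
        cfg = json.loads(value)
        if cfg['size'] == new_size:
            return cfg  # nothing to do
        new_cfg = {**cfg, 'size': new_size}
        resp = etcd.txn(
            compare=[{'key': key, 'mod_revision': mod}],
            success=[{'key': key, 'value': json.dumps(new_cfg)}],
        )
        if resp.get('succeeded'):
            return new_cfg
    raise RuntimeError('too many conflicting updates')
```

The bounded `max_attempts` is a choice made for this sketch; the driver code shown above loops with `while True` instead.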
|
@ -0,0 +1,23 @@
# Devstack configuration for bridged networking

[[local|localrc]]
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
HOST_IP=10.0.2.15
Q_USE_SECGROUP=True
FLOATING_RANGE="10.0.2.0/24"
IPV4_ADDRS_SAFE_TO_USE="10.0.5.0/24"
Q_FLOATING_ALLOCATION_POOL=start=10.0.2.50,end=10.0.2.100
PUBLIC_NETWORK_GATEWAY=10.0.2.2
PUBLIC_INTERFACE=ens3
Q_USE_PROVIDERNET_FOR_PUBLIC=True
Q_AGENT=linuxbridge
Q_ML2_PLUGIN_MECHANISM_DRIVERS=linuxbridge
LB_PHYSICAL_INTERFACE=ens3
PUBLIC_PHYSICAL_NETWORK=default
LB_INTERFACE_MAPPINGS=default:ens3
Q_SERVICE_PLUGIN_CLASSES=
Q_ML2_PLUGIN_TYPE_DRIVERS=flat
Q_ML2_PLUGIN_EXT_DRIVERS=
|
@ -0,0 +1,609 @@
|
||||||
|
commit bd283191b3e7a4c6d1c100d3d96e348a1ebffe55
|
||||||
|
Author: Vitaliy Filippov <vitalif@yourcmc.ru>
|
||||||
|
Date: Sun Jun 27 12:52:40 2021 +0300
|
||||||
|
|
||||||
|
Add Vitastor support
|
||||||
|
|
||||||
|
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
|
||||||
|
index aa50eac..082b4f8 100644
|
||||||
|
--- a/docs/schemas/domaincommon.rng
|
||||||
|
+++ b/docs/schemas/domaincommon.rng
|
||||||
|
@@ -1728,6 +1728,35 @@
|
||||||
|
</element>
|
||||||
|
</define>
|
||||||
|
|
||||||
|
+ <define name="diskSourceNetworkProtocolVitastor">
|
||||||
|
+ <element name="source">
|
||||||
|
+ <interleave>
|
||||||
|
+ <attribute name="protocol">
|
||||||
|
+ <value>vitastor</value>
|
||||||
|
+ </attribute>
|
||||||
|
+ <ref name="diskSourceCommon"/>
|
||||||
|
+ <optional>
|
||||||
|
+ <attribute name="name"/>
|
||||||
|
+ </optional>
|
||||||
|
+ <optional>
|
||||||
|
+ <attribute name="query"/>
|
||||||
|
+ </optional>
|
||||||
|
+ <zeroOrMore>
|
||||||
|
+ <ref name="diskSourceNetworkHost"/>
|
||||||
|
+ </zeroOrMore>
|
||||||
|
+ <optional>
|
||||||
|
+ <element name="config">
|
||||||
|
+ <attribute name="file">
|
||||||
|
+ <ref name="absFilePath"/>
|
||||||
|
+ </attribute>
|
||||||
|
+ <empty/>
|
||||||
|
+ </element>
|
||||||
|
+ </optional>
|
||||||
|
+ <empty/>
|
||||||
|
+ </interleave>
|
||||||
|
+ </element>
|
||||||
|
+ </define>
|
||||||
|
+
|
||||||
|
<define name="diskSourceNetworkProtocolISCSI">
|
||||||
|
<element name="source">
|
||||||
|
<attribute name="protocol">
|
||||||
|
@@ -1851,6 +1880,7 @@
|
||||||
|
<ref name="diskSourceNetworkProtocolHTTP"/>
|
||||||
|
<ref name="diskSourceNetworkProtocolSimple"/>
|
||||||
|
<ref name="diskSourceNetworkProtocolVxHS"/>
|
||||||
|
+ <ref name="diskSourceNetworkProtocolVitastor"/>
|
||||||
|
</choice>
|
||||||
|
</define>
|
||||||
|
|
||||||
|
diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h
|
||||||
|
index 4bf2b5f..dbc011b 100644
|
||||||
|
--- a/include/libvirt/libvirt-storage.h
|
||||||
|
+++ b/include/libvirt/libvirt-storage.h
|
||||||
|
@@ -240,6 +240,7 @@ typedef enum {
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER = 1 << 16,
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS = 1 << 17,
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE = 1 << 18,
|
||||||
|
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR = 1 << 20,
|
||||||
|
} virConnectListAllStoragePoolsFlags;
|
||||||
|
|
||||||
|
int virConnectListAllStoragePools(virConnectPtr conn,
|
||||||
|
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
|
||||||
|
index 222bb8c..685d255 100644
|
||||||
|
--- a/src/conf/domain_conf.c
|
||||||
|
+++ b/src/conf/domain_conf.c
|
||||||
|
@@ -8653,6 +8653,10 @@ virDomainDiskSourceNetworkParse(xmlNodePtr node,
|
||||||
|
goto cleanup;
|
||||||
|
}
|
||||||
|
|
||||||
|
+ if (src->protocol == VIR_STORAGE_NET_PROTOCOL_VITASTOR) {
|
||||||
|
+ src->relPath = virXMLPropString(node, "query");
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
if ((haveTLS = virXMLPropString(node, "tls")) &&
|
||||||
|
(src->haveTLS = virTristateBoolTypeFromString(haveTLS)) <= 0) {
|
||||||
|
virReportError(VIR_ERR_XML_ERROR,
|
||||||
|
@@ -23849,6 +23853,10 @@ virDomainDiskSourceFormatNetwork(virBufferPtr attrBuf,
|
||||||
|
|
||||||
|
virBufferEscapeString(attrBuf, " name='%s'", path ? path : src->path);
|
||||||
|
|
||||||
|
+ if (src->protocol == VIR_STORAGE_NET_PROTOCOL_VITASTOR && src->relPath != NULL) {
|
||||||
|
+ virBufferEscapeString(attrBuf, " query='%s'", src->relPath);
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
VIR_FREE(path);
|
||||||
|
|
||||||
|
if (src->haveTLS != VIR_TRISTATE_BOOL_ABSENT &&
|
||||||
|
@@ -30930,6 +30938,7 @@ virDomainDiskTranslateSourcePool(virDomainDiskDefPtr def)
|
||||||
|
|
||||||
|
case VIR_STORAGE_POOL_MPATH:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c
|
||||||
|
index 55db7a9..7cbe937 100644
|
||||||
|
--- a/src/conf/storage_conf.c
|
||||||
|
+++ b/src/conf/storage_conf.c
|
||||||
|
@@ -58,7 +58,7 @@ VIR_ENUM_IMPL(virStoragePool,
|
||||||
|
"logical", "disk", "iscsi",
|
||||||
|
"iscsi-direct", "scsi", "mpath",
|
||||||
|
"rbd", "sheepdog", "gluster",
|
||||||
|
- "zfs", "vstorage")
|
||||||
|
+ "zfs", "vstorage", "vitastor")
|
||||||
|
|
||||||
|
VIR_ENUM_IMPL(virStoragePoolFormatFileSystem,
|
||||||
|
VIR_STORAGE_POOL_FS_LAST,
|
||||||
|
@@ -232,6 +232,18 @@ static virStoragePoolTypeInfo poolTypeInfo[] = {
|
||||||
|
.formatToString = virStorageFileFormatTypeToString,
|
||||||
|
}
|
||||||
|
},
|
||||||
|
+ {.poolType = VIR_STORAGE_POOL_VITASTOR,
|
||||||
|
+ .poolOptions = {
|
||||||
|
+ .flags = (VIR_STORAGE_POOL_SOURCE_HOST |
|
||||||
|
+ VIR_STORAGE_POOL_SOURCE_NETWORK |
|
||||||
|
+ VIR_STORAGE_POOL_SOURCE_NAME),
|
||||||
|
+ },
|
||||||
|
+ .volOptions = {
|
||||||
|
+ .defaultFormat = VIR_STORAGE_FILE_RAW,
|
||||||
|
+ .formatFromString = virStorageVolumeFormatFromString,
|
||||||
|
+ .formatToString = virStorageFileFormatTypeToString,
|
||||||
|
+ }
|
||||||
|
+ },
|
||||||
|
{.poolType = VIR_STORAGE_POOL_SHEEPDOG,
|
||||||
|
.poolOptions = {
|
||||||
|
.flags = (VIR_STORAGE_POOL_SOURCE_HOST |
|
||||||
|
@@ -434,6 +446,11 @@ virStoragePoolDefParseSource(xmlXPathContextPtr ctxt,
|
||||||
|
_("element 'name' is mandatory for RBD pool"));
|
||||||
|
goto cleanup;
|
||||||
|
}
|
||||||
|
+ if (pool_type == VIR_STORAGE_POOL_VITASTOR && source->name == NULL) {
|
||||||
|
+ virReportError(VIR_ERR_XML_ERROR, "%s",
|
||||||
|
+ _("element 'name' is mandatory for Vitastor pool"));
|
||||||
|
+ goto cleanup;
|
||||||
|
+ }
|
||||||
|
|
||||||
|
if (options->formatFromString) {
|
||||||
|
char *format = virXPathString("string(./format/@type)", ctxt);
|
||||||
|
@@ -1009,6 +1026,7 @@ virStoragePoolDefFormatBuf(virBufferPtr buf,
|
||||||
|
/* RBD, Sheepdog, Gluster and Iscsi-direct devices are not local block devs nor
|
||||||
|
* files, so they don't have a target */
|
||||||
|
if (def->type != VIR_STORAGE_POOL_RBD &&
|
||||||
|
+ def->type != VIR_STORAGE_POOL_VITASTOR &&
|
||||||
|
def->type != VIR_STORAGE_POOL_SHEEPDOG &&
|
||||||
|
def->type != VIR_STORAGE_POOL_GLUSTER &&
|
||||||
|
def->type != VIR_STORAGE_POOL_ISCSI_DIRECT) {
|
||||||
|
diff --git a/src/conf/storage_conf.h b/src/conf/storage_conf.h
|
||||||
|
index dc0aa2a..ed4983d 100644
|
||||||
|
--- a/src/conf/storage_conf.h
|
||||||
|
+++ b/src/conf/storage_conf.h
|
||||||
|
@@ -91,6 +91,7 @@ typedef enum {
|
||||||
|
VIR_STORAGE_POOL_GLUSTER, /* Gluster device */
|
||||||
|
VIR_STORAGE_POOL_ZFS, /* ZFS */
|
||||||
|
VIR_STORAGE_POOL_VSTORAGE, /* Virtuozzo Storage */
|
||||||
|
+ VIR_STORAGE_POOL_VITASTOR, /* Vitastor */
|
||||||
|
|
||||||
|
VIR_STORAGE_POOL_LAST,
|
||||||
|
} virStoragePoolType;
|
||||||
|
@@ -422,6 +423,7 @@ VIR_ENUM_DECL(virStoragePartedFs)
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_SCSI | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_MPATH | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_RBD | \
|
||||||
|
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS | \
|
||||||
|
diff --git a/src/conf/virstorageobj.c b/src/conf/virstorageobj.c
|
||||||
|
index 6ea6a97..3ba45b9 100644
|
||||||
|
--- a/src/conf/virstorageobj.c
|
||||||
|
+++ b/src/conf/virstorageobj.c
|
||||||
|
@@ -1478,6 +1478,7 @@ virStoragePoolObjSourceFindDuplicateCb(const void *payload,
|
||||||
|
return 1;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
break;
|
||||||
|
@@ -1971,6 +1972,8 @@ virStoragePoolObjMatch(virStoragePoolObjPtr obj,
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_MPATH)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_RBD) &&
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_RBD)) ||
|
||||||
|
+ (MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR) &&
|
||||||
|
+ (obj->def->type == VIR_STORAGE_POOL_VITASTOR)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG) &&
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_SHEEPDOG)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER) &&
|
||||||
|
diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c
|
||||||
|
index 2ea3e94..d5d2273 100644
|
||||||
|
--- a/src/libvirt-storage.c
|
||||||
|
+++ b/src/libvirt-storage.c
|
||||||
|
@@ -92,6 +92,7 @@ virStoragePoolGetConnect(virStoragePoolPtr pool)
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_SCSI
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_MPATH
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_RBD
|
||||||
|
+ * VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG
|
||||||
|
*
|
||||||
|
* Returns the number of storage pools found or -1 and sets @pools to
|
||||||
|
diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
|
||||||
|
index 73e988a..ab7bb81 100644
|
||||||
|
--- a/src/libxl/libxl_conf.c
|
||||||
|
+++ b/src/libxl/libxl_conf.c
|
||||||
|
@@ -905,6 +905,7 @@ libxlMakeNetworkDiskSrcStr(virStorageSourcePtr src,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
virReportError(VIR_ERR_NO_SUPPORT,
|
||||||
|
diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c
|
||||||
|
index cbf0aa4..096700d 100644
|
||||||
|
--- a/src/qemu/qemu_block.c
|
||||||
|
+++ b/src/qemu/qemu_block.c
|
||||||
|
@@ -959,6 +959,42 @@ qemuBlockStorageSourceGetRBDProps(virStorageSourcePtr src)
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
+static virJSONValuePtr
|
||||||
|
+qemuBlockStorageSourceGetVitastorProps(virStorageSource *src)
|
||||||
|
+{
|
||||||
|
+ virJSONValuePtr ret = NULL;
|
||||||
|
+ virStorageNetHostDefPtr host;
|
||||||
|
+ size_t i;
|
||||||
|
+ virBuffer buf = VIR_BUFFER_INITIALIZER;
|
||||||
|
+ char *etcd = NULL;
|
||||||
|
+
|
||||||
|
+ for (i = 0; i < src->nhosts; i++) {
|
||||||
|
+ host = src->hosts + i;
|
||||||
|
+ if ((virStorageNetHostTransport)host->transport != VIR_STORAGE_NET_HOST_TRANS_TCP) {
|
||||||
|
+ goto cleanup;
|
||||||
|
+ }
|
||||||
|
+ virBufferAsprintf(&buf, i > 0 ? ",%s:%u" : "%s:%u", host->name, host->port);
|
||||||
|
+ }
|
||||||
|
+ if (src->nhosts > 0) {
|
||||||
|
+ etcd = virBufferContentAndReset(&buf);
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (virJSONValueObjectCreate(&ret,
|
||||||
|
+ "s:driver", "vitastor",
|
||||||
|
+ "S:etcd_host", etcd,
|
||||||
|
+ "S:etcd_prefix", src->relPath,
|
||||||
|
+ "S:config_path", src->configFile,
|
||||||
|
+ "s:image", src->path,
|
||||||
|
+ NULL) < 0)
|
||||||
|
+ goto cleanup;
|
||||||
|
+
|
||||||
|
+cleanup:
|
||||||
|
+ VIR_FREE(etcd);
|
||||||
|
+ virBufferFreeAndReset(&buf);
|
||||||
|
+ return ret;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+
|
||||||
|
static virJSONValuePtr
|
||||||
|
qemuBlockStorageSourceGetSheepdogProps(virStorageSourcePtr src)
|
||||||
|
{
|
||||||
|
@@ -1174,6 +1210,11 @@ qemuBlockStorageSourceGetBackendProps(virStorageSourcePtr src,
|
||||||
|
return NULL;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ if (!(fileprops = qemuBlockStorageSourceGetVitastorProps(src)))
|
||||||
|
+ return NULL;
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
if (!(fileprops = qemuBlockStorageSourceGetSheepdogProps(src)))
|
||||||
|
return NULL;
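As a rough illustration of what `qemuBlockStorageSourceGetVitastorProps` produces, the sketch below builds the same kind of property object in Python: the configured hosts are joined into one comma-separated `etcd_host` string, and optional fields are left out when unset. The function name `vitastor_props` and its parameters are hypothetical; the key names are taken from the patch above, and treating the `S:` fields as "omitted when NULL" is an assumption about libvirt's JSON helper.

```python
def vitastor_props(image, hosts=(), etcd_prefix=None, config_path=None):
    """Build a dict resembling the blockdev properties for a Vitastor disk.

    hosts is a sequence of (name, port) tuples for the etcd endpoints.
    """
    props = {'driver': 'vitastor', 'image': image}
    if hosts:
        # Same layout as the C loop: "host1:port1,host2:port2"
        props['etcd_host'] = ','.join('%s:%d' % (name, port) for name, port in hosts)
    if etcd_prefix is not None:
        props['etcd_prefix'] = etcd_prefix
    if config_path is not None:
        props['config_path'] = config_path
    return props
```

For example, `vitastor_props('vm1', [('10.0.0.1', 2379), ('10.0.0.2', 2379)])` yields a dict with `driver`, `image`, and a single `etcd_host` entry listing both endpoints.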
|
||||||
|
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
|
||||||
|
index 822d5f8..e375cef 100644
|
||||||
|
--- a/src/qemu/qemu_command.c
|
||||||
|
+++ b/src/qemu/qemu_command.c
|
||||||
|
@@ -975,6 +975,43 @@ qemuBuildNetworkDriveStr(virStorageSourcePtr src,
|
||||||
|
ret = virBufferContentAndReset(&buf);
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ if (strchr(src->path, ':')) {
|
||||||
|
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
|
||||||
|
+ _("':' not allowed in Vitastor source volume name '%s'"),
|
||||||
|
+ src->path);
|
||||||
|
+ return NULL;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ virBufferStrcat(&buf, "vitastor:image=", src->path, NULL);
|
||||||
|
+
|
||||||
|
+ if (src->nhosts > 0) {
|
||||||
|
+ virBufferAddLit(&buf, ":etcd_host=");
|
||||||
|
+ for (i = 0; i < src->nhosts; i++) {
|
||||||
|
+ if (i)
|
||||||
|
+ virBufferAddLit(&buf, ",");
|
||||||
|
+
|
||||||
|
+ /* assume host containing : is ipv6 */
|
||||||
|
+ if (strchr(src->hosts[i].name, ':'))
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", "[%s]",
|
||||||
|
+ src->hosts[i].name);
|
||||||
|
+ else
|
||||||
|
+ virBufferAsprintf(&buf, "%s", src->hosts[i].name);
|
||||||
|
+
|
||||||
|
+ if (src->hosts[i].port)
|
||||||
|
+ virBufferAsprintf(&buf, "\\:%u", src->hosts[i].port);
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (src->configFile)
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", ":config_path=%s", src->configFile);
|
||||||
|
+
|
||||||
|
+ if (src->relPath)
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", ":etcd_prefix=%s", src->relPath);
|
||||||
|
+
|
||||||
|
+ ret = virBufferContentAndReset(&buf);
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
|
||||||
|
_("VxHS protocol does not support URI syntax"));
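The legacy drive string built in the `VIR_STORAGE_NET_PROTOCOL_VITASTOR` case is a colon-separated list of `key=value` pairs in which literal colons inside values are escaped with a backslash. The sketch below mirrors that layout in Python and adds a matching splitter. Both function names are hypothetical, and the unescaping step in the splitter is an assumption made for the round trip; it is not taken from the patch.

```python
def build_colon_string(image, hosts=(), config_path=None, etcd_prefix=None):
    """Build 'vitastor:image=...:etcd_host=...' with ':' escaped inside values."""
    if ':' in image:
        raise ValueError("':' not allowed in Vitastor image name %r" % image)

    def esc(s):
        return s.replace(':', '\\:')

    out = 'vitastor:image=' + image
    if hosts:
        parts = []
        for name, port in hosts:
            # A host name containing ':' is assumed to be an IPv6 address
            item = '[' + esc(name) + ']' if ':' in name else name
            if port:
                item += '\\:%d' % port
            parts.append(item)
        out += ':etcd_host=' + ','.join(parts)
    if config_path:
        out += ':config_path=' + esc(config_path)
    if etcd_prefix:
        out += ':etcd_prefix=' + esc(etcd_prefix)
    return out

def split_colon_string(s):
    """Split on unescaped ':' and return the key=value pairs as a dict."""
    if s.startswith('vitastor:'):
        s = s[len('vitastor:'):]
    fields, cur, i = [], '', 0
    while i < len(s):
        ch = s[i]
        if ch == '\\' and i + 1 < len(s):
            cur += s[i + 1]  # keep the escaped character, drop the backslash
            i += 2
        elif ch == ':':
            fields.append(cur)
            cur = ''
            i += 1
        else:
            cur += ch
            i += 1
    fields.append(cur)
    opts = {}
    for field in fields:
        if '=' in field:
            key, value = field.split('=', 1)
            opts[key] = value
    return opts
```

Round-tripping a string through both functions is a quick way to check that the option names used by the builder and the parser agree.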
|
||||||
|
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
|
||||||
|
index ec6b340..f399efa 100644
|
||||||
|
--- a/src/qemu/qemu_domain.c
|
||||||
|
+++ b/src/qemu/qemu_domain.c
|
||||||
|
@@ -10881,6 +10881,7 @@ qemuDomainPrepareStorageSourceTLS(virStorageSourcePtr src,
|
||||||
|
break;
|
||||||
|
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
|
||||||
|
index 1d96170..2d24396 100644
|
||||||
|
--- a/src/qemu/qemu_driver.c
|
||||||
|
+++ b/src/qemu/qemu_driver.c
|
||||||
|
@@ -14687,6 +14687,7 @@ qemuDomainSnapshotPrepareDiskExternalInactive(virDomainSnapshotDiskDefPtr snapdi
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_TFTP:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
virReportError(VIR_ERR_INTERNAL_ERROR,
|
||||||
|
_("external inactive snapshots are not supported on "
|
||||||
|
@@ -14764,6 +14765,7 @@ qemuDomainSnapshotPrepareDiskExternalActive(virDomainSnapshotDiskDefPtr snapdisk
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_TFTP:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
virReportError(VIR_ERR_INTERNAL_ERROR,
|
||||||
|
_("external active snapshots are not supported on "
|
||||||
|
@@ -14887,6 +14889,7 @@ qemuDomainSnapshotPrepareDiskInternal(virDomainDiskDefPtr disk,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_TFTP:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
virReportError(VIR_ERR_INTERNAL_ERROR,
|
||||||
|
_("internal inactive snapshots are not supported on "
|
||||||
|
diff --git a/src/qemu/qemu_parse_command.c b/src/qemu/qemu_parse_command.c
|
||||||
|
index c4650f0..551da41 100644
|
||||||
|
--- a/src/qemu/qemu_parse_command.c
|
||||||
|
+++ b/src/qemu/qemu_parse_command.c
|
||||||
|
@@ -2184,6 +2184,7 @@ qemuParseCommandLine(virFileCachePtr capsCache,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_TFTP:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
/* ignored for now */
|
||||||
|
break;
|
||||||
|
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
|
||||||
|
index 4a13e90..33301c7 100644
|
||||||
|
--- a/src/storage/storage_driver.c
|
||||||
|
+++ b/src/storage/storage_driver.c
|
||||||
|
@@ -1568,6 +1568,7 @@ storageVolLookupByPathCallback(virStoragePoolObjPtr obj,
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
case VIR_STORAGE_POOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_POOL_ZFS:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
ignore_value(VIR_STRDUP(stable_path, data->path));
|
||||||
|
break;
|
||||||
|
diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c
|
||||||
|
index bd4b027..b323cd6 100644
|
||||||
|
--- a/src/util/virstoragefile.c
|
||||||
|
+++ b/src/util/virstoragefile.c
|
||||||
|
@@ -84,7 +84,8 @@ VIR_ENUM_IMPL(virStorageNetProtocol, VIR_STORAGE_NET_PROTOCOL_LAST,
|
||||||
|
"ftps",
|
||||||
|
"tftp",
|
||||||
|
"ssh",
|
||||||
|
- "vxhs")
|
||||||
|
+ "vxhs",
|
||||||
|
+ "vitastor")
|
||||||
|
|
||||||
|
VIR_ENUM_IMPL(virStorageNetHostTransport, VIR_STORAGE_NET_HOST_TRANS_LAST,
|
||||||
|
"tcp",
|
||||||
|
@@ -2839,6 +2840,83 @@ virStorageSourceParseRBDColonString(const char *rbdstr,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
+static int
|
||||||
|
+virStorageSourceParseVitastorColonString(const char *colonstr,
|
||||||
|
+ virStorageSourcePtr src)
|
||||||
|
+{
|
||||||
|
+ char *p, *e, *next;
|
||||||
|
+ char *options = NULL;
|
||||||
|
+
|
||||||
|
+ /* optionally skip the "vitastor:" prefix if provided */
|
||||||
|
+ if (STRPREFIX(colonstr, "vitastor:"))
|
||||||
|
+ colonstr += strlen("vitastor:");
|
||||||
|
+
|
||||||
|
+ if (VIR_STRDUP(options, colonstr) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+
|
||||||
|
+ p = options;
|
||||||
|
+ while (*p) {
|
||||||
|
+ /* find : delimiter or end of string */
|
||||||
|
+ for (e = p; *e && *e != ':'; ++e) {
|
||||||
|
+ if (*e == '\\') {
|
||||||
|
+ e++;
|
||||||
|
+ if (*e == '\0')
|
||||||
|
+ break;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+ if (*e == '\0') {
|
||||||
|
+ next = e; /* last kv pair */
|
||||||
|
+ } else {
|
||||||
|
+ next = e + 1;
|
||||||
|
+ *e = '\0';
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (STRPREFIX(p, "image=")) {
|
||||||
|
+ if (VIR_STRDUP(src->path, p + strlen("image=")) < 0)
|
||||||
|
+ goto error;
|
||||||
|
+ } else if (STRPREFIX(p, "etcd_prefix=")) {
|
||||||
|
+ if (VIR_STRDUP(src->relPath, p + strlen("etcd_prefix=")) < 0)
|
||||||
|
+ goto error;
|
||||||
|
+ } else if (STRPREFIX(p, "config_path=")) {
|
||||||
|
+ if (VIR_STRDUP(src->configFile, p + strlen("config_path=")) < 0)
|
||||||
|
+ goto error;
|
||||||
|
+ } else if (STRPREFIX(p, "etcd_host=")) {
|
||||||
|
+ char *h, *sep;
|
||||||
|
+
|
||||||
|
+ h = p + strlen("etcd_host=");
|
||||||
|
+ while (h < e) {
|
||||||
|
+ for (sep = h; sep < e; ++sep) {
|
||||||
|
+ if (*sep == '\\' && (sep[1] == ',' ||
|
||||||
|
+ sep[1] == ';' ||
|
||||||
|
+ sep[1] == ' ')) {
|
||||||
|
+ *sep = '\0';
|
||||||
|
+ sep += 2;
|
||||||
|
+ break;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (virStorageSourceRBDAddHost(src, h) < 0)
|
||||||
|
+ goto error;
|
||||||
|
+
|
||||||
|
+ h = sep;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ p = next;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (!src->path)
|
||||||
|
+ goto error;
|
||||||
|
+
|
||||||
|
+ VIR_FREE(options);
|
||||||
|
+ return 0;
|
||||||
|
+
|
||||||
|
+error:
|
||||||
|
+ VIR_FREE(options);
|
||||||
|
+ return -1;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+
|
||||||
|
static int
|
||||||
|
virStorageSourceParseNBDColonString(const char *nbdstr,
|
||||||
|
virStorageSourcePtr src)
|
||||||
|
@@ -2942,6 +3020,11 @@ virStorageSourceParseBackingColon(virStorageSourcePtr src,
|
||||||
|
goto cleanup;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ if (virStorageSourceParseVitastorColonString(path, src) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
@@ -3441,6 +3524,56 @@ virStorageSourceParseBackingJSONRBD(virStorageSourcePtr src,
|
||||||
|
return ret;
|
||||||
|
}
|
||||||
|
|
||||||
|
+static int
|
||||||
|
+virStorageSourceParseBackingJSONVitastor(virStorageSourcePtr src,
|
||||||
|
+ virJSONValuePtr json,
|
||||||
|
+ int opaque ATTRIBUTE_UNUSED)
|
||||||
|
+{
|
||||||
|
+ const char *filename;
|
||||||
|
+ const char *image = virJSONValueObjectGetString(json, "image");
|
||||||
|
+ const char *conf = virJSONValueObjectGetString(json, "config_path");
|
||||||
|
+ const char *etcd_prefix = virJSONValueObjectGetString(json, "etcd_prefix");
|
||||||
|
+ virJSONValuePtr servers = virJSONValueObjectGetArray(json, "server");
|
||||||
|
+ size_t nservers;
|
||||||
|
+ size_t i;
|
||||||
|
+
|
||||||
|
+ src->type = VIR_STORAGE_TYPE_NETWORK;
|
||||||
|
+ src->protocol = VIR_STORAGE_NET_PROTOCOL_VITASTOR;
|
||||||
|
+
|
||||||
|
+ /* legacy syntax passed via 'filename' option */
|
||||||
|
+ if ((filename = virJSONValueObjectGetString(json, "filename")))
|
||||||
|
+ return virStorageSourceParseVitastorColonString(filename, src);
|
||||||
|
+
|
||||||
|
+ if (!image) {
|
||||||
|
+ virReportError(VIR_ERR_INVALID_ARG, "%s",
|
||||||
|
+ _("missing image name in Vitastor backing volume "
|
||||||
|
+ "JSON specification"));
|
||||||
|
+ return -1;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (VIR_STRDUP(src->path, image) < 0 ||
|
||||||
|
+ VIR_STRDUP(src->configFile, conf) < 0 ||
|
||||||
|
+ VIR_STRDUP(src->relPath, etcd_prefix) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+
|
||||||
|
+ if (servers) {
|
||||||
|
+ nservers = virJSONValueArraySize(servers);
|
||||||
|
+
|
||||||
|
+ if (VIR_ALLOC_N(src->hosts, nservers) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+
|
||||||
|
+ src->nhosts = nservers;
|
||||||
|
+
|
||||||
|
+ for (i = 0; i < nservers; i++) {
|
||||||
|
+ if (virStorageSourceParseBackingJSONInetSocketAddress(src->hosts + i,
|
||||||
|
+ virJSONValueArrayGet(servers, i)) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ return 0;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
static int
|
||||||
|
virStorageSourceParseBackingJSONRaw(virStorageSourcePtr src,
|
||||||
|
virJSONValuePtr json,
|
||||||
|
@@ -3507,6 +3640,7 @@ static const struct virStorageSourceJSONDriverParser jsonParsers[] = {
|
||||||
|
{"sheepdog", virStorageSourceParseBackingJSONSheepdog, 0},
|
||||||
|
{"ssh", virStorageSourceParseBackingJSONSSH, 0},
|
||||||
|
{"rbd", virStorageSourceParseBackingJSONRBD, 0},
|
||||||
|
+ {"vitastor", virStorageSourceParseBackingJSONVitastor, 0},
|
||||||
|
{"raw", virStorageSourceParseBackingJSONRaw, 0},
|
||||||
|
{"vxhs", virStorageSourceParseBackingJSONVxHS, 0},
|
||||||
|
};
|
||||||
|
@@ -4276,6 +4410,7 @@ virStorageSourceNetworkDefaultPort(virStorageNetProtocol protocol)
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
return 24007;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
/* we don't provide a default for RBD */
|
||||||
|
return 0;
|
||||||
|
diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h
|
||||||
|
index 1d6161a..8d83bf3 100644
|
||||||
|
--- a/src/util/virstoragefile.h
|
||||||
|
+++ b/src/util/virstoragefile.h
|
||||||
|
@@ -134,6 +134,7 @@ typedef enum {
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_TFTP,
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_SSH,
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_VXHS,
|
||||||
|
+ VIR_STORAGE_NET_PROTOCOL_VITASTOR,
|
||||||
|
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_LAST
|
||||||
|
} virStorageNetProtocol;
|
||||||
|
diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c
|
||||||
|
index accfc3a..a18f9c3 100644
|
||||||
|
--- a/src/xenconfig/xen_xl.c
|
||||||
|
+++ b/src/xenconfig/xen_xl.c
|
||||||
|
@@ -1535,6 +1535,7 @@ xenFormatXLDiskSrcNet(virStorageSourcePtr src)
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
virReportError(VIR_ERR_NO_SUPPORT,
|
||||||
|
diff --git a/tools/virsh-pool.c b/tools/virsh-pool.c
|
||||||
|
index 70ca39b..9caef51 100644
|
||||||
|
--- a/tools/virsh-pool.c
|
||||||
|
+++ b/tools/virsh-pool.c
|
||||||
|
@@ -1219,6 +1219,9 @@ cmdPoolList(vshControl *ctl, const vshCmd *cmd ATTRIBUTE_UNUSED)
|
||||||
|
case VIR_STORAGE_POOL_VSTORAGE:
|
||||||
|
flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE;
|
||||||
|
break;
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
+ flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR;
|
||||||
|
+ break;
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
break;
|
||||||
|
}
|
|
@ -0,0 +1,657 @@
|
||||||
|
commit 41cdfe8317d98f70aadedfdbb381effed2641bdd
|
||||||
|
Author: Vitaliy Filippov <vitalif@yourcmc.ru>
|
||||||
|
Date: Fri Jul 9 01:31:57 2021 +0300
|
||||||
|
|
||||||
|
Add Vitastor support
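For illustration, a disk definition that the `diskSourceNetworkProtocolVitastor` schema rule in this patch is meant to accept could look like the sketch below. The image name, etcd address, etcd prefix and config path are placeholder values, and the surrounding `<disk>`, `<driver>` and `<target>` elements are ordinary libvirt domain XML rather than anything this patch adds. Per the code in this patch, `name` is passed to QEMU as the `image` option, `query` as `etcd_prefix`, the `<host>` entries are joined into `etcd_host`, and `<config file=.../>` becomes `config_path`.

```xml
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <!-- name: Vitastor image name (placeholder); must not contain ':' -->
  <!-- query: etcd key prefix (placeholder) -->
  <source protocol='vitastor' name='testimg' query='/vitastor'>
    <!-- etcd endpoint (placeholder address; 2379 is the usual etcd client port) -->
    <host name='10.0.0.1' port='2379'/>
    <!-- the schema requires an absolute path here (absFilePath) -->
    <config file='/etc/vitastor/vitastor.conf'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
```

All children of `<source>` other than the `protocol` attribute are declared optional in the schema, so a configuration that relies only on the config file could presumably omit the `<host>` elements and `query`.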
|
||||||
|
|
||||||
|
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
|
||||||
|
index 7dc419b..875433b 100644
|
||||||
|
--- a/docs/schemas/domaincommon.rng
|
||||||
|
+++ b/docs/schemas/domaincommon.rng
|
||||||
|
@@ -1827,6 +1827,35 @@
|
||||||
|
</element>
|
||||||
|
</define>
|
||||||
|
|
||||||
|
+ <define name="diskSourceNetworkProtocolVitastor">
|
||||||
|
+ <element name="source">
|
||||||
|
+ <interleave>
|
||||||
|
+ <attribute name="protocol">
|
||||||
|
+ <value>vitastor</value>
|
||||||
|
+ </attribute>
|
||||||
|
+ <ref name="diskSourceCommon"/>
|
||||||
|
+ <optional>
|
||||||
|
+ <attribute name="name"/>
|
||||||
|
+ </optional>
|
||||||
|
+ <optional>
|
||||||
|
+ <attribute name="query"/>
|
||||||
|
+ </optional>
|
||||||
|
+ <zeroOrMore>
|
||||||
|
+ <ref name="diskSourceNetworkHost"/>
|
||||||
|
+ </zeroOrMore>
|
||||||
|
+ <optional>
|
||||||
|
+ <element name="config">
|
||||||
|
+ <attribute name="file">
|
||||||
|
+ <ref name="absFilePath"/>
|
||||||
|
+ </attribute>
|
||||||
|
+ <empty/>
|
||||||
|
+ </element>
|
||||||
|
+ </optional>
|
||||||
|
+ <empty/>
|
||||||
|
+ </interleave>
|
||||||
|
+ </element>
|
||||||
|
+ </define>
|
||||||
|
+
|
||||||
|
<define name="diskSourceNetworkProtocolISCSI">
|
||||||
|
<element name="source">
|
||||||
|
<attribute name="protocol">
|
||||||
|
@@ -2083,6 +2112,7 @@
|
||||||
|
<ref name="diskSourceNetworkProtocolSimple"/>
|
||||||
|
<ref name="diskSourceNetworkProtocolVxHS"/>
|
||||||
|
<ref name="diskSourceNetworkProtocolNFS"/>
|
||||||
|
+ <ref name="diskSourceNetworkProtocolVitastor"/>
|
||||||
|
</choice>
|
||||||
|
</define>
|
||||||
|
|
||||||
|
diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h
|
||||||
|
index 089e1e0..d7e7ef4 100644
|
||||||
|
--- a/include/libvirt/libvirt-storage.h
|
||||||
|
+++ b/include/libvirt/libvirt-storage.h
|
||||||
|
@@ -245,6 +245,7 @@ typedef enum {
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS = 1 << 17,
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE = 1 << 18,
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_ISCSI_DIRECT = 1 << 19,
|
||||||
|
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR = 1 << 20,
|
||||||
|
} virConnectListAllStoragePoolsFlags;
|
||||||
|
|
||||||
|
int virConnectListAllStoragePools(virConnectPtr conn,
|
||||||
|
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
|
||||||
|
index 01b7187..c6e9702 100644
|
||||||
|
--- a/src/conf/domain_conf.c
|
||||||
|
+++ b/src/conf/domain_conf.c
|
||||||
|
@@ -8261,7 +8261,8 @@ virDomainDiskSourceNetworkParse(xmlNodePtr node,
|
||||||
|
src->configFile = virXPathString("string(./config/@file)", ctxt);
|
||||||
|
|
||||||
|
if (src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTP ||
|
||||||
|
- src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTPS)
|
||||||
|
+ src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTPS ||
|
||||||
|
+ src->protocol == VIR_STORAGE_NET_PROTOCOL_VITASTOR)
|
||||||
|
src->query = virXMLPropString(node, "query");
|
||||||
|
|
||||||
|
if (virDomainStorageNetworkParseHosts(node, ctxt, &src->hosts, &src->nhosts) < 0)
|
||||||
|
@@ -31392,6 +31393,7 @@ virDomainStorageSourceTranslateSourcePool(virStorageSourcePtr src,
|
||||||
|
|
||||||
|
case VIR_STORAGE_POOL_MPATH:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c
|
||||||
|
index 0c50529..fe97574 100644
|
||||||
|
--- a/src/conf/storage_conf.c
|
||||||
|
+++ b/src/conf/storage_conf.c
|
||||||
|
@@ -60,7 +60,7 @@ VIR_ENUM_IMPL(virStoragePool,
|
||||||
|
"logical", "disk", "iscsi",
|
||||||
|
"iscsi-direct", "scsi", "mpath",
|
||||||
|
"rbd", "sheepdog", "gluster",
|
||||||
|
- "zfs", "vstorage",
|
||||||
|
+ "zfs", "vstorage", "vitastor",
|
||||||
|
);
|
||||||
|
|
||||||
|
VIR_ENUM_IMPL(virStoragePoolFormatFileSystem,
|
||||||
|
@@ -249,6 +249,18 @@ static virStoragePoolTypeInfo poolTypeInfo[] = {
|
||||||
|
.formatToString = virStorageFileFormatTypeToString,
|
||||||
|
}
|
||||||
|
},
|
||||||
|
+ {.poolType = VIR_STORAGE_POOL_VITASTOR,
|
||||||
|
+ .poolOptions = {
|
||||||
|
+ .flags = (VIR_STORAGE_POOL_SOURCE_HOST |
|
||||||
|
+ VIR_STORAGE_POOL_SOURCE_NETWORK |
|
||||||
|
+ VIR_STORAGE_POOL_SOURCE_NAME),
|
||||||
|
+ },
|
||||||
|
+ .volOptions = {
|
||||||
|
+ .defaultFormat = VIR_STORAGE_FILE_RAW,
|
||||||
|
+ .formatFromString = virStorageVolumeFormatFromString,
|
||||||
|
+ .formatToString = virStorageFileFormatTypeToString,
|
||||||
|
+ }
|
||||||
|
+ },
|
||||||
|
{.poolType = VIR_STORAGE_POOL_SHEEPDOG,
|
||||||
|
.poolOptions = {
|
||||||
|
.flags = (VIR_STORAGE_POOL_SOURCE_HOST |
|
||||||
|
@@ -551,6 +563,11 @@ virStoragePoolDefParseSource(xmlXPathContextPtr ctxt,
|
||||||
|
_("element 'name' is mandatory for RBD pool"));
|
||||||
|
goto cleanup;
|
||||||
|
}
|
||||||
|
+ if (pool_type == VIR_STORAGE_POOL_VITASTOR && source->name == NULL) {
|
||||||
|
+ virReportError(VIR_ERR_XML_ERROR, "%s",
|
||||||
|
+ _("element 'name' is mandatory for Vitastor pool"));
|
||||||
|
+ goto cleanup;
|
||||||
|
+ }
|
||||||
|
|
||||||
|
if (options->formatFromString) {
|
||||||
|
g_autofree char *format = NULL;
|
||||||
|
@@ -1217,6 +1234,7 @@ virStoragePoolDefFormatBuf(virBufferPtr buf,
|
||||||
|
/* RBD, Sheepdog, Gluster and Iscsi-direct devices are not local block devs nor
|
||||||
|
* files, so they don't have a target */
|
||||||
|
if (def->type != VIR_STORAGE_POOL_RBD &&
|
||||||
|
+ def->type != VIR_STORAGE_POOL_VITASTOR &&
|
||||||
|
def->type != VIR_STORAGE_POOL_SHEEPDOG &&
|
||||||
|
def->type != VIR_STORAGE_POOL_GLUSTER &&
|
||||||
|
def->type != VIR_STORAGE_POOL_ISCSI_DIRECT) {
|
||||||
|
diff --git a/src/conf/storage_conf.h b/src/conf/storage_conf.h
|
||||||
|
index ffd406e..8868a05 100644
|
||||||
|
--- a/src/conf/storage_conf.h
|
||||||
|
+++ b/src/conf/storage_conf.h
|
||||||
|
@@ -110,6 +110,7 @@ typedef enum {
|
||||||
|
VIR_STORAGE_POOL_GLUSTER, /* Gluster device */
|
||||||
|
VIR_STORAGE_POOL_ZFS, /* ZFS */
|
||||||
|
VIR_STORAGE_POOL_VSTORAGE, /* Virtuozzo Storage */
|
||||||
|
+ VIR_STORAGE_POOL_VITASTOR, /* Vitastor */
|
||||||
|
|
||||||
|
VIR_STORAGE_POOL_LAST,
|
||||||
|
} virStoragePoolType;
|
||||||
|
@@ -474,6 +475,7 @@ VIR_ENUM_DECL(virStoragePartedFs);
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_SCSI | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_MPATH | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_RBD | \
|
||||||
|
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS | \
|
||||||
|
diff --git a/src/conf/virstorageobj.c b/src/conf/virstorageobj.c
|
||||||
|
index 9fe8b3f..bf595b0 100644
|
||||||
|
--- a/src/conf/virstorageobj.c
|
||||||
|
+++ b/src/conf/virstorageobj.c
|
||||||
|
@@ -1491,6 +1491,7 @@ virStoragePoolObjSourceFindDuplicateCb(const void *payload,
|
||||||
|
return 1;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
break;
|
||||||
|
@@ -1990,6 +1991,8 @@ virStoragePoolObjMatch(virStoragePoolObjPtr obj,
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_MPATH)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_RBD) &&
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_RBD)) ||
|
||||||
|
+ (MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR) &&
|
||||||
|
+ (obj->def->type == VIR_STORAGE_POOL_VITASTOR)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG) &&
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_SHEEPDOG)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER) &&
|
||||||
|
diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c
|
||||||
|
index 2a7cdca..f756be1 100644
|
||||||
|
--- a/src/libvirt-storage.c
|
||||||
|
+++ b/src/libvirt-storage.c
|
||||||
|
@@ -92,6 +92,7 @@ virStoragePoolGetConnect(virStoragePoolPtr pool)
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_SCSI
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_MPATH
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_RBD
|
||||||
|
+ * VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_ZFS
|
||||||
|
diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
|
||||||
|
index 6a8ae27..a735bc6 100644
|
||||||
|
--- a/src/libxl/libxl_conf.c
|
||||||
|
+++ b/src/libxl/libxl_conf.c
|
||||||
|
@@ -942,6 +942,7 @@ libxlMakeNetworkDiskSrcStr(virStorageSourcePtr src,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NFS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
virReportError(VIR_ERR_NO_SUPPORT,
|
||||||
|
diff --git a/src/libxl/xen_xl.c b/src/libxl/xen_xl.c
|
||||||
|
index 17b93d0..c5a0084 100644
|
||||||
|
--- a/src/libxl/xen_xl.c
|
||||||
|
+++ b/src/libxl/xen_xl.c
|
||||||
|
@@ -1601,6 +1601,7 @@ xenFormatXLDiskSrcNet(virStorageSourcePtr src)
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NFS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
virReportError(VIR_ERR_NO_SUPPORT,
|
||||||
|
diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c
|
||||||
|
index f9c6da2..922dde5 100644
|
||||||
|
--- a/src/qemu/qemu_block.c
|
||||||
|
+++ b/src/qemu/qemu_block.c
|
||||||
|
@@ -938,6 +938,38 @@ qemuBlockStorageSourceGetRBDProps(virStorageSourcePtr src,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
+static virJSONValuePtr
|
||||||
|
+qemuBlockStorageSourceGetVitastorProps(virStorageSource *src)
|
||||||
|
+{
|
||||||
|
+ virJSONValuePtr ret = NULL;
|
||||||
|
+ virStorageNetHostDefPtr host;
|
||||||
|
+ size_t i;
|
||||||
|
+ g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER;
|
||||||
|
+ g_autofree char *etcd = NULL;
|
||||||
|
+
|
||||||
|
+ for (i = 0; i < src->nhosts; i++) {
|
||||||
|
+ host = src->hosts + i;
|
||||||
|
+ if ((virStorageNetHostTransport)host->transport != VIR_STORAGE_NET_HOST_TRANS_TCP) {
|
||||||
|
+ return NULL;
|
||||||
|
+ }
|
||||||
|
+ virBufferAsprintf(&buf, i > 0 ? ",%s:%u" : "%s:%u", host->name, host->port);
|
||||||
|
+ }
|
||||||
|
+ if (src->nhosts > 0) {
|
||||||
|
+ etcd = virBufferContentAndReset(&buf);
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (virJSONValueObjectCreate(&ret,
|
||||||
|
+ "S:etcd_host", etcd,
|
||||||
|
+ "S:etcd_prefix", src->query,
|
||||||
|
+ "S:config_path", src->configFile,
|
||||||
|
+ "s:image", src->path,
|
||||||
|
+ NULL) < 0)
|
||||||
|
+ return NULL;
|
||||||
|
+
|
||||||
|
+ return ret;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+
|
||||||
|
static virJSONValuePtr
|
||||||
|
qemuBlockStorageSourceGetSheepdogProps(virStorageSourcePtr src)
|
||||||
|
{
|
||||||
|
@@ -1224,6 +1256,12 @@ qemuBlockStorageSourceGetBackendProps(virStorageSourcePtr src,
|
||||||
|
return NULL;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ driver = "vitastor";
|
||||||
|
+ if (!(fileprops = qemuBlockStorageSourceGetVitastorProps(src)))
|
||||||
|
+ return NULL;
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
driver = "sheepdog";
|
||||||
|
if (!(fileprops = qemuBlockStorageSourceGetSheepdogProps(src)))
|
||||||
|
@@ -2183,6 +2221,7 @@ qemuBlockGetBackingStoreString(virStorageSourcePtr src,
|
||||||
|
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NFS:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
@@ -2560,6 +2599,12 @@ qemuBlockStorageSourceCreateGetStorageProps(virStorageSourcePtr src,
|
||||||
|
return -1;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ driver = "vitastor";
|
||||||
|
+ if (!(location = qemuBlockStorageSourceGetVitastorProps(src)))
|
||||||
|
+ return -1;
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
driver = "sheepdog";
|
||||||
|
if (!(location = qemuBlockStorageSourceGetSheepdogProps(src)))
|
||||||
|
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
|
||||||
|
index 6f970a3..10b39ca 100644
|
||||||
|
--- a/src/qemu/qemu_command.c
|
||||||
|
+++ b/src/qemu/qemu_command.c
|
||||||
|
@@ -1034,6 +1034,43 @@ qemuBuildNetworkDriveStr(virStorageSourcePtr src,
|
||||||
|
ret = virBufferContentAndReset(&buf);
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ if (strchr(src->path, ':')) {
|
||||||
|
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
|
||||||
|
+ _("':' not allowed in Vitastor source volume name '%s'"),
|
||||||
|
+ src->path);
|
||||||
|
+ return NULL;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ virBufferStrcat(&buf, "vitastor:image=", src->path, NULL);
|
||||||
|
+
|
||||||
|
+ if (src->nhosts > 0) {
|
||||||
|
+ virBufferAddLit(&buf, ":etcd_host=");
|
||||||
|
+ for (i = 0; i < src->nhosts; i++) {
|
||||||
|
+ if (i)
|
||||||
|
+ virBufferAddLit(&buf, ",");
|
||||||
|
+
|
||||||
|
+ /* assume host containing : is ipv6 */
|
||||||
|
+ if (strchr(src->hosts[i].name, ':'))
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", "[%s]",
|
||||||
|
+ src->hosts[i].name);
|
||||||
|
+ else
|
||||||
|
+ virBufferAsprintf(&buf, "%s", src->hosts[i].name);
|
||||||
|
+
|
||||||
|
+ if (src->hosts[i].port)
|
||||||
|
+ virBufferAsprintf(&buf, "\\:%u", src->hosts[i].port);
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (src->configFile)
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", ":config_path=%s", src->configFile);
|
||||||
|
+
|
||||||
|
+ if (src->query)
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", ":etcd_prefix=%s", src->query);
|
||||||
|
+
|
||||||
|
+ ret = virBufferContentAndReset(&buf);
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
|
||||||
|
_("VxHS protocol does not support URI syntax"));
|
||||||
|
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
|
||||||
|
index 0765dc7..4cff344 100644
|
||||||
|
--- a/src/qemu/qemu_domain.c
|
||||||
|
+++ b/src/qemu/qemu_domain.c
|
||||||
|
@@ -4610,7 +4610,8 @@ qemuDomainValidateStorageSource(virStorageSourcePtr src,
|
||||||
|
if (src->query &&
|
||||||
|
(actualType != VIR_STORAGE_TYPE_NETWORK ||
|
||||||
|
(src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTPS &&
|
||||||
|
- src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTP))) {
|
||||||
|
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTP &&
|
||||||
|
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_VITASTOR))) {
|
||||||
|
virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
|
||||||
|
_("query is supported only with HTTP(S) protocols"));
|
||||||
|
return -1;
|
||||||
|
@@ -9704,6 +9705,7 @@ qemuDomainPrepareStorageSourceTLS(virStorageSourcePtr src,
|
||||||
|
break;
|
||||||
|
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
|
||||||
|
index ee333c3..674aa58 100644
|
||||||
|
--- a/src/qemu/qemu_snapshot.c
|
||||||
|
+++ b/src/qemu/qemu_snapshot.c
|
||||||
|
@@ -403,6 +403,7 @@ qemuSnapshotPrepareDiskExternalInactive(virDomainSnapshotDiskDefPtr snapdisk,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NBD:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
@@ -493,6 +494,7 @@ qemuSnapshotPrepareDiskExternalActive(virDomainObjPtr vm,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NBD:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_HTTP:
|
||||||
|
@@ -623,6 +625,7 @@ qemuSnapshotPrepareDiskInternal(virDomainDiskDefPtr disk,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NBD:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
|
||||||
|
index 16bc53a..1e5d820 100644
|
||||||
|
--- a/src/storage/storage_driver.c
|
||||||
|
+++ b/src/storage/storage_driver.c
|
||||||
|
@@ -1645,6 +1645,7 @@ storageVolLookupByPathCallback(virStoragePoolObjPtr obj,
|
||||||
|
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_POOL_ZFS:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
diff --git a/src/test/test_driver.c b/src/test/test_driver.c
|
||||||
|
index 29c4c86..a27ad94 100644
|
||||||
|
--- a/src/test/test_driver.c
|
||||||
|
+++ b/src/test/test_driver.c
|
||||||
|
@@ -7096,6 +7096,7 @@ testStorageVolumeTypeForPool(int pooltype)
|
||||||
|
case VIR_STORAGE_POOL_ISCSI_DIRECT:
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
return VIR_STORAGE_VOL_NETWORK;
|
||||||
|
case VIR_STORAGE_POOL_LOGICAL:
|
||||||
|
case VIR_STORAGE_POOL_DISK:
|
||||||
|
diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c
|
||||||
|
index 0d3c2af..36e3afc 100644
|
||||||
|
--- a/src/util/virstoragefile.c
|
||||||
|
+++ b/src/util/virstoragefile.c
|
||||||
|
@@ -91,6 +91,7 @@ VIR_ENUM_IMPL(virStorageNetProtocol,
|
||||||
|
"ssh",
|
||||||
|
"vxhs",
|
||||||
|
"nfs",
|
||||||
|
+ "vitastor",
|
||||||
|
);
|
||||||
|
|
||||||
|
VIR_ENUM_IMPL(virStorageNetHostTransport,
|
||||||
|
@@ -2880,6 +2881,75 @@ virStorageSourceParseRBDColonString(const char *rbdstr,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
+static int
|
||||||
|
+virStorageSourceParseVitastorColonString(const char *colonstr,
|
||||||
|
+ virStorageSourcePtr src)
|
||||||
|
+{
|
||||||
|
+ char *p, *e, *next;
|
||||||
|
+ g_autofree char *options = NULL;
|
||||||
|
+
|
||||||
|
+ /* optionally skip the "vitastor:" prefix if provided */
|
||||||
|
+ if (STRPREFIX(colonstr, "vitastor:"))
|
||||||
|
+ colonstr += strlen("vitastor:");
|
||||||
|
+
|
||||||
|
+ options = g_strdup(colonstr);
|
||||||
|
+
|
||||||
|
+ p = options;
|
||||||
|
+ while (*p) {
|
||||||
|
+ /* find : delimiter or end of string */
|
||||||
|
+ for (e = p; *e && *e != ':'; ++e) {
|
||||||
|
+ if (*e == '\\') {
|
||||||
|
+ e++;
|
||||||
|
+ if (*e == '\0')
|
||||||
|
+ break;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+ if (*e == '\0') {
|
||||||
|
+ next = e; /* last kv pair */
|
||||||
|
+ } else {
|
||||||
|
+ next = e + 1;
|
||||||
|
+ *e = '\0';
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (STRPREFIX(p, "image=")) {
|
||||||
|
+ src->path = g_strdup(p + strlen("image="));
|
||||||
|
+ } else if (STRPREFIX(p, "etcd_prefix=")) {
|
||||||
|
+ src->query = g_strdup(p + strlen("etcd_prefix="));
|
||||||
|
+ } else if (STRPREFIX(p, "config_path=")) {
|
||||||
|
+ src->configFile = g_strdup(p + strlen("config_path="));
|
||||||
|
+ } else if (STRPREFIX(p, "etcd_host=")) {
|
||||||
|
+ char *h, *sep;
|
||||||
|
+
|
||||||
|
+ h = p + strlen("etcd_host=");
|
||||||
|
+ while (h < e) {
|
||||||
|
+ for (sep = h; sep < e; ++sep) {
|
||||||
|
+ if (*sep == '\\' && (sep[1] == ',' ||
|
||||||
|
+ sep[1] == ';' ||
|
||||||
|
+ sep[1] == ' ')) {
|
||||||
|
+ *sep = '\0';
|
||||||
|
+ sep += 2;
|
||||||
|
+ break;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (virStorageSourceRBDAddHost(src, h) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+
|
||||||
|
+ h = sep;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ p = next;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (!src->path) {
|
||||||
|
+ return -1;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ return 0;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+
|
||||||
|
static int
|
||||||
|
virStorageSourceParseNBDColonString(const char *nbdstr,
|
||||||
|
virStorageSourcePtr src)
|
||||||
|
@@ -2992,6 +3062,11 @@ virStorageSourceParseBackingColon(virStorageSourcePtr src,
|
||||||
|
return -1;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ if (virStorageSourceParseVitastorColonString(path, src) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
@@ -3581,6 +3656,54 @@ virStorageSourceParseBackingJSONRBD(virStorageSourcePtr src,
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
+static int
|
||||||
|
+virStorageSourceParseBackingJSONVitastor(virStorageSourcePtr src,
|
||||||
|
+ virJSONValuePtr json,
|
||||||
|
+ const char *jsonstr G_GNUC_UNUSED,
|
||||||
|
+ int opaque G_GNUC_UNUSED)
|
||||||
|
+{
|
||||||
|
+ const char *filename;
|
||||||
|
+ const char *image = virJSONValueObjectGetString(json, "image");
|
||||||
|
+ const char *conf = virJSONValueObjectGetString(json, "config_path");
|
||||||
|
+ const char *etcd_prefix = virJSONValueObjectGetString(json, "etcd_prefix");
|
||||||
|
+ virJSONValuePtr servers = virJSONValueObjectGetArray(json, "server");
|
||||||
|
+ size_t nservers;
|
||||||
|
+ size_t i;
|
||||||
|
+
|
||||||
|
+ src->type = VIR_STORAGE_TYPE_NETWORK;
|
||||||
|
+ src->protocol = VIR_STORAGE_NET_PROTOCOL_VITASTOR;
|
||||||
|
+
|
||||||
|
+ /* legacy syntax passed via 'filename' option */
|
||||||
|
+ if ((filename = virJSONValueObjectGetString(json, "filename")))
|
||||||
|
+ return virStorageSourceParseVitastorColonString(filename, src);
|
||||||
|
+
|
||||||
|
+ if (!image) {
|
||||||
|
+ virReportError(VIR_ERR_INVALID_ARG, "%s",
|
||||||
|
+ _("missing image name in Vitastor backing volume "
|
||||||
|
+ "JSON specification"));
|
||||||
|
+ return -1;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ src->path = g_strdup(image);
|
||||||
|
+ src->configFile = g_strdup(conf);
|
||||||
|
+ src->query = g_strdup(etcd_prefix);
|
||||||
|
+
|
||||||
|
+ if (servers) {
|
||||||
|
+ nservers = virJSONValueArraySize(servers);
|
||||||
|
+
|
||||||
|
+ src->hosts = g_new0(virStorageNetHostDef, nservers);
|
||||||
|
+ src->nhosts = nservers;
|
||||||
|
+
|
||||||
|
+ for (i = 0; i < nservers; i++) {
|
||||||
|
+ if (virStorageSourceParseBackingJSONInetSocketAddress(src->hosts + i,
|
||||||
|
+ virJSONValueArrayGet(servers, i)) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ return 0;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
static int
|
||||||
|
virStorageSourceParseBackingJSONRaw(virStorageSourcePtr src,
|
||||||
|
virJSONValuePtr json,
|
||||||
|
@@ -3759,6 +3882,7 @@ static const struct virStorageSourceJSONDriverParser jsonParsers[] = {
|
||||||
|
{"sheepdog", false, virStorageSourceParseBackingJSONSheepdog, 0},
|
||||||
|
{"ssh", false, virStorageSourceParseBackingJSONSSH, 0},
|
||||||
|
{"rbd", false, virStorageSourceParseBackingJSONRBD, 0},
|
||||||
|
+ {"vitastor", false, virStorageSourceParseBackingJSONVitastor, 0},
|
||||||
|
{"raw", true, virStorageSourceParseBackingJSONRaw, 0},
|
||||||
|
{"nfs", false, virStorageSourceParseBackingJSONNFS, 0},
|
||||||
|
{"vxhs", false, virStorageSourceParseBackingJSONVxHS, 0},
|
||||||
|
@@ -4503,6 +4627,7 @@ virStorageSourceNetworkDefaultPort(virStorageNetProtocol protocol)
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
return 24007;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
/* we don't provide a default for RBD */
|
||||||
|
return 0;
|
||||||
|
diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h
|
||||||
|
index 5689c39..3eb4e3c 100644
|
||||||
|
--- a/src/util/virstoragefile.h
|
||||||
|
+++ b/src/util/virstoragefile.h
|
||||||
|
@@ -136,6 +136,7 @@ typedef enum {
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_SSH,
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_VXHS,
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_NFS,
|
||||||
|
+ VIR_STORAGE_NET_PROTOCOL_VITASTOR,
|
||||||
|
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_LAST
|
||||||
|
} virStorageNetProtocol;
|
||||||
|
diff --git a/tests/storagepoolcapsschemadata/poolcaps-fs.xml b/tests/storagepoolcapsschemadata/poolcaps-fs.xml
|
||||||
|
index eee75af..8bd0a57 100644
|
||||||
|
--- a/tests/storagepoolcapsschemadata/poolcaps-fs.xml
|
||||||
|
+++ b/tests/storagepoolcapsschemadata/poolcaps-fs.xml
|
||||||
|
@@ -204,4 +204,11 @@
|
||||||
|
</enum>
|
||||||
|
</volOptions>
|
||||||
|
</pool>
|
||||||
|
+ <pool type='vitastor' supported='no'>
|
||||||
|
+ <volOptions>
|
||||||
|
+ <defaultFormat type='raw'/>
|
||||||
|
+ <enum name='targetFormatType'>
|
||||||
|
+ </enum>
|
||||||
|
+ </volOptions>
|
||||||
|
+ </pool>
|
||||||
|
</storagepoolCapabilities>
|
||||||
|
diff --git a/tests/storagepoolcapsschemadata/poolcaps-full.xml b/tests/storagepoolcapsschemadata/poolcaps-full.xml
|
||||||
|
index 805950a..852df0d 100644
|
||||||
|
--- a/tests/storagepoolcapsschemadata/poolcaps-full.xml
|
||||||
|
+++ b/tests/storagepoolcapsschemadata/poolcaps-full.xml
|
||||||
|
@@ -204,4 +204,11 @@
|
||||||
|
</enum>
|
||||||
|
</volOptions>
|
||||||
|
</pool>
|
||||||
|
+ <pool type='vitastor' supported='yes'>
|
||||||
|
+ <volOptions>
|
||||||
|
+ <defaultFormat type='raw'/>
|
||||||
|
+ <enum name='targetFormatType'>
|
||||||
|
+ </enum>
|
||||||
|
+ </volOptions>
|
||||||
|
+ </pool>
|
||||||
|
</storagepoolCapabilities>
|
||||||
|
diff --git a/tests/storagepoolxml2argvtest.c b/tests/storagepoolxml2argvtest.c
|
||||||
|
index 967d1f2..1e8ff7a 100644
|
||||||
|
--- a/tests/storagepoolxml2argvtest.c
|
||||||
|
+++ b/tests/storagepoolxml2argvtest.c
|
||||||
|
@@ -68,6 +68,7 @@ testCompareXMLToArgvFiles(bool shouldFail,
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_ZFS:
|
||||||
|
case VIR_STORAGE_POOL_VSTORAGE:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
default:
|
||||||
|
VIR_TEST_DEBUG("pool type '%s' has no xml2argv test", defTypeStr);
|
||||||
|
diff --git a/tools/virsh-pool.c b/tools/virsh-pool.c
|
||||||
|
index 7835fa6..8841fcf 100644
|
||||||
|
--- a/tools/virsh-pool.c
|
||||||
|
+++ b/tools/virsh-pool.c
|
||||||
|
@@ -1237,6 +1237,9 @@ cmdPoolList(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED)
|
||||||
|
case VIR_STORAGE_POOL_VSTORAGE:
|
||||||
|
flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE;
|
||||||
|
break;
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
+ flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR;
|
||||||
|
+ break;
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
break;
|
||||||
|
}
|
|
@ -0,0 +1,661 @@
|
||||||
|
commit c6e1958a1b4974828e8e5852beb252ce6594e670
|
||||||
|
Author: Vitaliy Filippov <vitalif@yourcmc.ru>
|
||||||
|
Date: Mon Jun 28 01:20:19 2021 +0300
|
||||||
|
|
||||||
|
Add Vitastor support
|
||||||
|
|
||||||
|
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
|
||||||
|
index 5ea14b6..a9df168 100644
|
||||||
|
--- a/docs/schemas/domaincommon.rng
|
||||||
|
+++ b/docs/schemas/domaincommon.rng
|
||||||
|
@@ -1859,6 +1859,35 @@
|
||||||
|
</element>
|
||||||
|
</define>
|
||||||
|
|
||||||
|
+ <define name="diskSourceNetworkProtocolVitastor">
|
||||||
|
+ <element name="source">
|
||||||
|
+ <interleave>
|
||||||
|
+ <attribute name="protocol">
|
||||||
|
+ <value>vitastor</value>
|
||||||
|
+ </attribute>
|
||||||
|
+ <ref name="diskSourceCommon"/>
|
||||||
|
+ <optional>
|
||||||
|
+ <attribute name="name"/>
|
||||||
|
+ </optional>
|
||||||
|
+ <optional>
|
||||||
|
+ <attribute name="query"/>
|
||||||
|
+ </optional>
|
||||||
|
+ <zeroOrMore>
|
||||||
|
+ <ref name="diskSourceNetworkHost"/>
|
||||||
|
+ </zeroOrMore>
|
||||||
|
+ <optional>
|
||||||
|
+ <element name="config">
|
||||||
|
+ <attribute name="file">
|
||||||
|
+ <ref name="absFilePath"/>
|
||||||
|
+ </attribute>
|
||||||
|
+ <empty/>
|
||||||
|
+ </element>
|
||||||
|
+ </optional>
|
||||||
|
+ <empty/>
|
||||||
|
+ </interleave>
|
||||||
|
+ </element>
|
||||||
|
+ </define>
|
||||||
|
+
|
||||||
|
<define name="diskSourceNetworkProtocolISCSI">
|
||||||
|
<element name="source">
|
||||||
|
<attribute name="protocol">
|
||||||
|
@@ -2115,6 +2144,7 @@
|
||||||
|
<ref name="diskSourceNetworkProtocolSimple"/>
|
||||||
|
<ref name="diskSourceNetworkProtocolVxHS"/>
|
||||||
|
<ref name="diskSourceNetworkProtocolNFS"/>
|
||||||
|
+ <ref name="diskSourceNetworkProtocolVitastor"/>
|
||||||
|
</choice>
|
||||||
|
</define>
|
||||||
|
|
||||||
|
diff --git a/include/libvirt/libvirt-storage.h b/include/libvirt/libvirt-storage.h
|
||||||
|
index 089e1e0..d7e7ef4 100644
|
||||||
|
--- a/include/libvirt/libvirt-storage.h
|
||||||
|
+++ b/include/libvirt/libvirt-storage.h
|
||||||
|
@@ -245,6 +245,7 @@ typedef enum {
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS = 1 << 17,
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE = 1 << 18,
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_ISCSI_DIRECT = 1 << 19,
|
||||||
|
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR = 1 << 20,
|
||||||
|
} virConnectListAllStoragePoolsFlags;
|
||||||
|
|
||||||
|
int virConnectListAllStoragePools(virConnectPtr conn,
|
||||||
|
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
|
||||||
|
index d78f846..f7222e3 100644
|
||||||
|
--- a/src/conf/domain_conf.c
|
||||||
|
+++ b/src/conf/domain_conf.c
|
||||||
|
@@ -8251,7 +8251,8 @@ virDomainDiskSourceNetworkParse(xmlNodePtr node,
|
||||||
|
src->configFile = virXPathString("string(./config/@file)", ctxt);
|
||||||
|
|
||||||
|
if (src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTP ||
|
||||||
|
- src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTPS)
|
||||||
|
+ src->protocol == VIR_STORAGE_NET_PROTOCOL_HTTPS ||
|
||||||
|
+ src->protocol == VIR_STORAGE_NET_PROTOCOL_VITASTOR)
|
||||||
|
src->query = virXMLPropString(node, "query");
|
||||||
|
|
||||||
|
if (virDomainStorageNetworkParseHosts(node, ctxt, &src->hosts, &src->nhosts) < 0)
|
||||||
|
@@ -30775,6 +30776,7 @@ virDomainStorageSourceTranslateSourcePool(virStorageSource *src,
|
||||||
|
|
||||||
|
case VIR_STORAGE_POOL_MPATH:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
diff --git a/src/conf/storage_conf.c b/src/conf/storage_conf.c
|
||||||
|
index 2aa9a3d..166ca1f 100644
|
||||||
|
--- a/src/conf/storage_conf.c
|
||||||
|
+++ b/src/conf/storage_conf.c
|
||||||
|
@@ -60,7 +60,7 @@ VIR_ENUM_IMPL(virStoragePool,
|
||||||
|
"logical", "disk", "iscsi",
|
||||||
|
"iscsi-direct", "scsi", "mpath",
|
||||||
|
"rbd", "sheepdog", "gluster",
|
||||||
|
- "zfs", "vstorage",
|
||||||
|
+ "zfs", "vstorage", "vitastor",
|
||||||
|
);
|
||||||
|
|
||||||
|
VIR_ENUM_IMPL(virStoragePoolFormatFileSystem,
|
||||||
|
@@ -246,6 +246,18 @@ static virStoragePoolTypeInfo poolTypeInfo[] = {
|
||||||
|
.formatToString = virStorageFileFormatTypeToString,
|
||||||
|
}
|
||||||
|
},
|
||||||
|
+ {.poolType = VIR_STORAGE_POOL_VITASTOR,
|
||||||
|
+ .poolOptions = {
|
||||||
|
+ .flags = (VIR_STORAGE_POOL_SOURCE_HOST |
|
||||||
|
+ VIR_STORAGE_POOL_SOURCE_NETWORK |
|
||||||
|
+ VIR_STORAGE_POOL_SOURCE_NAME),
|
||||||
|
+ },
|
||||||
|
+ .volOptions = {
|
||||||
|
+ .defaultFormat = VIR_STORAGE_FILE_RAW,
|
||||||
|
+ .formatFromString = virStorageVolumeFormatFromString,
|
||||||
|
+ .formatToString = virStorageFileFormatTypeToString,
|
||||||
|
+ }
|
||||||
|
+ },
|
||||||
|
{.poolType = VIR_STORAGE_POOL_SHEEPDOG,
|
||||||
|
.poolOptions = {
|
||||||
|
.flags = (VIR_STORAGE_POOL_SOURCE_HOST |
|
||||||
|
@@ -546,6 +558,11 @@ virStoragePoolDefParseSource(xmlXPathContextPtr ctxt,
|
||||||
|
_("element 'name' is mandatory for RBD pool"));
|
||||||
|
return -1;
|
||||||
|
}
|
||||||
|
+ if (pool_type == VIR_STORAGE_POOL_VITASTOR && source->name == NULL) {
|
||||||
|
+ virReportError(VIR_ERR_XML_ERROR, "%s",
|
||||||
|
+ _("element 'name' is mandatory for Vitastor pool"));
|
||||||
|
+ return -1;
|
||||||
|
+ }
|
||||||
|
|
||||||
|
if (options->formatFromString) {
|
||||||
|
g_autofree char *format = NULL;
|
||||||
|
@@ -1182,6 +1199,7 @@ virStoragePoolDefFormatBuf(virBuffer *buf,
|
||||||
|
/* RBD, Sheepdog, Gluster and Iscsi-direct devices are not local block devs nor
|
||||||
|
* files, so they don't have a target */
|
||||||
|
if (def->type != VIR_STORAGE_POOL_RBD &&
|
||||||
|
+ def->type != VIR_STORAGE_POOL_VITASTOR &&
|
||||||
|
def->type != VIR_STORAGE_POOL_SHEEPDOG &&
|
||||||
|
def->type != VIR_STORAGE_POOL_GLUSTER &&
|
||||||
|
def->type != VIR_STORAGE_POOL_ISCSI_DIRECT) {
|
||||||
|
diff --git a/src/conf/storage_conf.h b/src/conf/storage_conf.h
|
||||||
|
index 76efaac..928149a 100644
|
||||||
|
--- a/src/conf/storage_conf.h
|
||||||
|
+++ b/src/conf/storage_conf.h
|
||||||
|
@@ -106,6 +106,7 @@ typedef enum {
|
||||||
|
VIR_STORAGE_POOL_GLUSTER, /* Gluster device */
|
||||||
|
VIR_STORAGE_POOL_ZFS, /* ZFS */
|
||||||
|
VIR_STORAGE_POOL_VSTORAGE, /* Virtuozzo Storage */
|
||||||
|
+ VIR_STORAGE_POOL_VITASTOR, /* Vitastor */
|
||||||
|
|
||||||
|
VIR_STORAGE_POOL_LAST,
|
||||||
|
} virStoragePoolType;
|
||||||
|
@@ -465,6 +466,7 @@ VIR_ENUM_DECL(virStoragePartedFs);
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_SCSI | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_MPATH | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_RBD | \
|
||||||
|
+ VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER | \
|
||||||
|
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS | \
|
||||||
|
diff --git a/src/conf/storage_source_conf.c b/src/conf/storage_source_conf.c
|
||||||
|
index 5ca06fa..05ded49 100644
|
||||||
|
--- a/src/conf/storage_source_conf.c
|
||||||
|
+++ b/src/conf/storage_source_conf.c
|
||||||
|
@@ -85,6 +85,7 @@ VIR_ENUM_IMPL(virStorageNetProtocol,
|
||||||
|
"ssh",
|
||||||
|
"vxhs",
|
||||||
|
"nfs",
|
||||||
|
+ "vitastor",
|
||||||
|
);
|
||||||
|
|
||||||
|
|
||||||
|
@@ -1262,6 +1263,7 @@ virStorageSourceNetworkDefaultPort(virStorageNetProtocol protocol)
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
return 24007;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
/* we don't provide a default for RBD */
|
||||||
|
return 0;
|
||||||
|
diff --git a/src/conf/storage_source_conf.h b/src/conf/storage_source_conf.h
|
||||||
|
index 389c7b5..dbf02e3 100644
|
||||||
|
--- a/src/conf/storage_source_conf.h
|
||||||
|
+++ b/src/conf/storage_source_conf.h
|
||||||
|
@@ -127,6 +127,7 @@ typedef enum {
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_SSH,
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_VXHS,
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_NFS,
|
||||||
|
+ VIR_STORAGE_NET_PROTOCOL_VITASTOR,
|
||||||
|
|
||||||
|
VIR_STORAGE_NET_PROTOCOL_LAST
|
||||||
|
} virStorageNetProtocol;
|
||||||
|
diff --git a/src/conf/virstorageobj.c b/src/conf/virstorageobj.c
|
||||||
|
index 24957d6..4520a73 100644
|
||||||
|
--- a/src/conf/virstorageobj.c
|
||||||
|
+++ b/src/conf/virstorageobj.c
|
||||||
|
@@ -1487,6 +1487,7 @@ virStoragePoolObjSourceFindDuplicateCb(const void *payload,
|
||||||
|
return 1;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
break;
|
||||||
|
@@ -1986,6 +1987,8 @@ virStoragePoolObjMatch(virStoragePoolObj *obj,
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_MPATH)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_RBD) &&
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_RBD)) ||
|
||||||
|
+ (MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR) &&
|
||||||
|
+ (obj->def->type == VIR_STORAGE_POOL_VITASTOR)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG) &&
|
||||||
|
(obj->def->type == VIR_STORAGE_POOL_SHEEPDOG)) ||
|
||||||
|
(MATCH(VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER) &&
|
||||||
|
diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c
|
||||||
|
index 2a7cdca..f756be1 100644
|
||||||
|
--- a/src/libvirt-storage.c
|
||||||
|
+++ b/src/libvirt-storage.c
|
||||||
|
@@ -92,6 +92,7 @@ virStoragePoolGetConnect(virStoragePoolPtr pool)
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_SCSI
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_MPATH
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_RBD
|
||||||
|
+ * VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_SHEEPDOG
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_GLUSTER
|
||||||
|
* VIR_CONNECT_LIST_STORAGE_POOLS_ZFS
|
||||||
|
diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
|
||||||
|
index 56cb9ab..dfb31b9 100644
|
||||||
|
--- a/src/libxl/libxl_conf.c
|
||||||
|
+++ b/src/libxl/libxl_conf.c
|
||||||
|
@@ -972,6 +972,7 @@ libxlMakeNetworkDiskSrcStr(virStorageSource *src,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NFS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
virReportError(VIR_ERR_NO_SUPPORT,
|
||||||
|
diff --git a/src/libxl/xen_xl.c b/src/libxl/xen_xl.c
|
||||||
|
index c0905b0..c172378 100644
|
||||||
|
--- a/src/libxl/xen_xl.c
|
||||||
|
+++ b/src/libxl/xen_xl.c
|
||||||
|
@@ -1540,6 +1540,7 @@ xenFormatXLDiskSrcNet(virStorageSource *src)
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NFS:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
virReportError(VIR_ERR_NO_SUPPORT,
|
||||||
|
diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c
|
||||||
|
index 6627d04..c33f428 100644
|
||||||
|
--- a/src/qemu/qemu_block.c
|
||||||
|
+++ b/src/qemu/qemu_block.c
|
||||||
|
@@ -928,6 +928,38 @@ qemuBlockStorageSourceGetRBDProps(virStorageSource *src,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
+static virJSONValue *
|
||||||
|
+qemuBlockStorageSourceGetVitastorProps(virStorageSource *src)
|
||||||
|
+{
|
||||||
|
+ virJSONValuePtr ret = NULL;
|
||||||
|
+ virStorageNetHostDefPtr host;
|
||||||
|
+ size_t i;
|
||||||
|
+ g_auto(virBuffer) buf = VIR_BUFFER_INITIALIZER;
|
||||||
|
+ g_autofree char *etcd = NULL;
|
||||||
|
+
|
||||||
|
+ for (i = 0; i < src->nhosts; i++) {
|
||||||
|
+ host = src->hosts + i;
|
||||||
|
+ if ((virStorageNetHostTransport)host->transport != VIR_STORAGE_NET_HOST_TRANS_TCP) {
|
||||||
|
+ return NULL;
|
||||||
|
+ }
|
||||||
|
+ virBufferAsprintf(&buf, i > 0 ? ",%s:%u" : "%s:%u", host->name, host->port);
|
||||||
|
+ }
|
||||||
|
+ if (src->nhosts > 0) {
|
||||||
|
+ etcd = virBufferContentAndReset(&buf);
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (virJSONValueObjectCreate(&ret,
|
||||||
|
+ "S:etcd_host", etcd,
|
||||||
|
+ "S:etcd_prefix", src->query,
|
||||||
|
+ "S:config_path", src->configFile,
|
||||||
|
+ "s:image", src->path,
|
||||||
|
+ NULL) < 0)
|
||||||
|
+ return NULL;
|
||||||
|
+
|
||||||
|
+ return ret;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
+
|
||||||
|
static virJSONValue *
|
||||||
|
qemuBlockStorageSourceGetSheepdogProps(virStorageSource *src)
|
||||||
|
{
|
||||||
|
@@ -1218,6 +1250,12 @@ qemuBlockStorageSourceGetBackendProps(virStorageSource *src,
|
||||||
|
return NULL;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ driver = "vitastor";
|
||||||
|
+ if (!(fileprops = qemuBlockStorageSourceGetVitastorProps(src)))
|
||||||
|
+ return NULL;
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
driver = "sheepdog";
|
||||||
|
if (!(fileprops = qemuBlockStorageSourceGetSheepdogProps(src)))
|
||||||
|
@@ -2231,6 +2269,7 @@ qemuBlockGetBackingStoreString(virStorageSource *src,
|
||||||
|
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NFS:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SSH:
|
||||||
|
@@ -2608,6 +2647,12 @@ qemuBlockStorageSourceCreateGetStorageProps(virStorageSource *src,
|
||||||
|
return -1;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ driver = "vitastor";
|
||||||
|
+ if (!(location = qemuBlockStorageSourceGetVitastorProps(src)))
|
||||||
|
+ return -1;
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
driver = "sheepdog";
|
||||||
|
if (!(location = qemuBlockStorageSourceGetSheepdogProps(src)))
|
||||||
|
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
|
||||||
|
index ea51369..8258632 100644
|
||||||
|
--- a/src/qemu/qemu_command.c
|
||||||
|
+++ b/src/qemu/qemu_command.c
|
||||||
|
@@ -1074,6 +1074,43 @@ qemuBuildNetworkDriveStr(virStorageSource *src,
|
||||||
|
ret = virBufferContentAndReset(&buf);
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ if (strchr(src->path, ':')) {
|
||||||
|
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
|
||||||
|
+ _("':' not allowed in Vitastor source volume name '%s'"),
|
||||||
|
+ src->path);
|
||||||
|
+ return NULL;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ virBufferStrcat(&buf, "vitastor:image=", src->path, NULL);
|
||||||
|
+
|
||||||
|
+ if (src->nhosts > 0) {
|
||||||
|
+ virBufferAddLit(&buf, ":etcd_host=");
|
||||||
|
+ for (i = 0; i < src->nhosts; i++) {
|
||||||
|
+ if (i)
|
||||||
|
+ virBufferAddLit(&buf, ",");
|
||||||
|
+
|
||||||
|
+ /* assume host containing : is ipv6 */
|
||||||
|
+ if (strchr(src->hosts[i].name, ':'))
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", "[%s]",
|
||||||
|
+ src->hosts[i].name);
|
||||||
|
+ else
|
||||||
|
+ virBufferAsprintf(&buf, "%s", src->hosts[i].name);
|
||||||
|
+
|
||||||
|
+ if (src->hosts[i].port)
|
||||||
|
+ virBufferAsprintf(&buf, "\\:%u", src->hosts[i].port);
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ if (src->configFile)
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", ":config_path=%s", src->configFile);
|
||||||
|
+
|
||||||
|
+ if (src->query)
|
||||||
|
+ virBufferEscape(&buf, '\\', ":", ":etcd_prefix=%s", src->query);
|
||||||
|
+
|
||||||
|
+ ret = virBufferContentAndReset(&buf);
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_VXHS:
|
||||||
|
virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
|
||||||
|
_("VxHS protocol does not support URI syntax"));
|
||||||
|
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
|
||||||
|
index fc60e15..5ab410d 100644
|
||||||
|
--- a/src/qemu/qemu_domain.c
|
||||||
|
+++ b/src/qemu/qemu_domain.c
|
||||||
|
@@ -4829,7 +4829,8 @@ qemuDomainValidateStorageSource(virStorageSource *src,
|
||||||
|
if (src->query &&
|
||||||
|
(actualType != VIR_STORAGE_TYPE_NETWORK ||
|
||||||
|
(src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTPS &&
|
||||||
|
- src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTP))) {
|
||||||
|
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_HTTP &&
|
||||||
|
+ src->protocol != VIR_STORAGE_NET_PROTOCOL_VITASTOR))) {
|
||||||
|
virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
|
||||||
|
_("query is supported only with HTTP(S) protocols"));
|
||||||
|
return -1;
|
||||||
|
@@ -10027,6 +10028,7 @@ qemuDomainPrepareStorageSourceTLS(virStorageSource *src,
|
||||||
|
break;
|
||||||
|
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
|
||||||
|
index 4e74ddd..14e5f2e 100644
|
||||||
|
--- a/src/qemu/qemu_snapshot.c
|
||||||
|
+++ b/src/qemu/qemu_snapshot.c
|
||||||
|
@@ -402,6 +402,7 @@ qemuSnapshotPrepareDiskExternalInactive(virDomainSnapshotDiskDef *snapdisk,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NBD:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
@@ -494,6 +495,7 @@ qemuSnapshotPrepareDiskExternalActive(virDomainObj *vm,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NBD:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_HTTP:
|
||||||
|
@@ -647,6 +649,7 @@ qemuSnapshotPrepareDiskInternal(virDomainDiskDef *disk,
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NBD:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_RBD:
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_ISCSI:
|
||||||
|
diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
|
||||||
|
index c2ff4b8..70d0689 100644
|
||||||
|
--- a/src/storage/storage_driver.c
|
||||||
|
+++ b/src/storage/storage_driver.c
|
||||||
|
@@ -1644,6 +1644,7 @@ storageVolLookupByPathCallback(virStoragePoolObj *obj,
|
||||||
|
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_POOL_ZFS:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
diff --git a/src/storage_file/storage_source_backingstore.c b/src/storage_file/storage_source_backingstore.c
|
||||||
|
index e48ae72..d7a9b72 100644
|
||||||
|
--- a/src/storage_file/storage_source_backingstore.c
|
||||||
|
+++ b/src/storage_file/storage_source_backingstore.c
|
||||||
|
@@ -284,6 +284,75 @@ virStorageSourceParseRBDColonString(const char *rbdstr,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
+static int
+virStorageSourceParseVitastorColonString(const char *colonstr,
+                                         virStorageSource *src)
+{
+    char *p, *e, *next;
+    g_autofree char *options = NULL;
+
+    /* optionally skip the "vitastor:" prefix if provided */
+    if (STRPREFIX(colonstr, "vitastor:"))
+        colonstr += strlen("vitastor:");
+
+    options = g_strdup(colonstr);
+
+    p = options;
+    while (*p) {
+        /* find : delimiter or end of string */
+        for (e = p; *e && *e != ':'; ++e) {
+            if (*e == '\\') {
+                e++;
+                if (*e == '\0')
+                    break;
+            }
+        }
+        if (*e == '\0') {
+            next = e; /* last kv pair */
+        } else {
+            next = e + 1;
+            *e = '\0';
+        }
+
+        if (STRPREFIX(p, "image=")) {
+            src->path = g_strdup(p + strlen("image="));
+        } else if (STRPREFIX(p, "etcd_prefix=")) {
+            src->query = g_strdup(p + strlen("etcd_prefix="));
+        } else if (STRPREFIX(p, "config_path=")) {
+            src->configFile = g_strdup(p + strlen("config_path="));
+        } else if (STRPREFIX(p, "etcd_host=")) {
+            char *h, *sep;
+
+            h = p + strlen("etcd_host=");
+            while (h < e) {
+                for (sep = h; sep < e; ++sep) {
+                    if (*sep == '\\' && (sep[1] == ',' ||
+                                         sep[1] == ';' ||
+                                         sep[1] == ' ')) {
+                        *sep = '\0';
+                        sep += 2;
+                        break;
+                    }
+                }
+
+                if (virStorageSourceRBDAddHost(src, h) < 0)
+                    return -1;
+
+                h = sep;
+            }
+        }
+
+        p = next;
+    }
+
+    if (!src->path) {
+        return -1;
+    }
+
+    return 0;
+}
|
||||||
|
+
|
||||||
|
+
|
||||||
|
static int
|
||||||
|
virStorageSourceParseNBDColonString(const char *nbdstr,
|
||||||
|
virStorageSource *src)
|
||||||
|
@@ -396,6 +465,11 @@ virStorageSourceParseBackingColon(virStorageSource *src,
|
||||||
|
return -1;
|
||||||
|
break;
|
||||||
|
|
||||||
|
+ case VIR_STORAGE_NET_PROTOCOL_VITASTOR:
|
||||||
|
+ if (virStorageSourceParseVitastorColonString(path, src) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+ break;
|
||||||
|
+
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_LAST:
|
||||||
|
case VIR_STORAGE_NET_PROTOCOL_NONE:
|
||||||
|
@@ -984,6 +1058,54 @@ virStorageSourceParseBackingJSONRBD(virStorageSource *src,
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
+static int
|
||||||
|
+virStorageSourceParseBackingJSONVitastor(virStorageSource *src,
|
||||||
|
+ virJSONValue *json,
|
||||||
|
+ const char *jsonstr G_GNUC_UNUSED,
|
||||||
|
+ int opaque G_GNUC_UNUSED)
|
||||||
|
+{
|
||||||
|
+ const char *filename;
|
||||||
|
+ const char *image = virJSONValueObjectGetString(json, "image");
|
||||||
|
+ const char *conf = virJSONValueObjectGetString(json, "config_path");
|
||||||
|
+ const char *etcd_prefix = virJSONValueObjectGetString(json, "etcd_prefix");
|
||||||
|
+ virJSONValue *servers = virJSONValueObjectGetArray(json, "server");
|
||||||
|
+ size_t nservers;
|
||||||
|
+ size_t i;
|
||||||
|
+
|
||||||
|
+ src->type = VIR_STORAGE_TYPE_NETWORK;
|
||||||
|
+ src->protocol = VIR_STORAGE_NET_PROTOCOL_VITASTOR;
|
||||||
|
+
|
||||||
|
+ /* legacy syntax passed via 'filename' option */
|
||||||
|
+ if ((filename = virJSONValueObjectGetString(json, "filename")))
|
||||||
|
+ return virStorageSourceParseVitastorColonString(filename, src);
|
||||||
|
+
|
||||||
|
+ if (!image) {
|
||||||
|
+ virReportError(VIR_ERR_INVALID_ARG, "%s",
|
||||||
|
+ _("missing image name in Vitastor backing volume "
|
||||||
|
+ "JSON specification"));
|
||||||
|
+ return -1;
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ src->path = g_strdup(image);
|
||||||
|
+ src->configFile = g_strdup(conf);
|
||||||
|
+ src->query = g_strdup(etcd_prefix);
|
||||||
|
+
|
||||||
|
+ if (servers) {
|
||||||
|
+ nservers = virJSONValueArraySize(servers);
|
||||||
|
+
|
||||||
|
+ src->hosts = g_new0(virStorageNetHostDef, nservers);
|
||||||
|
+ src->nhosts = nservers;
|
||||||
|
+
|
||||||
|
+ for (i = 0; i < nservers; i++) {
|
||||||
|
+ if (virStorageSourceParseBackingJSONInetSocketAddress(src->hosts + i,
|
||||||
|
+ virJSONValueArrayGet(servers, i)) < 0)
|
||||||
|
+ return -1;
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
+
|
||||||
|
+ return 0;
|
||||||
|
+}
|
||||||
|
+
|
||||||
|
static int
|
||||||
|
virStorageSourceParseBackingJSONRaw(virStorageSource *src,
|
||||||
|
virJSONValue *json,
|
||||||
|
@@ -1162,6 +1284,7 @@ static const struct virStorageSourceJSONDriverParser jsonParsers[] = {
|
||||||
|
{"sheepdog", false, virStorageSourceParseBackingJSONSheepdog, 0},
|
||||||
|
{"ssh", false, virStorageSourceParseBackingJSONSSH, 0},
|
||||||
|
{"rbd", false, virStorageSourceParseBackingJSONRBD, 0},
|
||||||
|
+ {"vitastor", false, virStorageSourceParseBackingJSONVitastor, 0},
|
||||||
|
{"raw", true, virStorageSourceParseBackingJSONRaw, 0},
|
||||||
|
{"nfs", false, virStorageSourceParseBackingJSONNFS, 0},
|
||||||
|
{"vxhs", false, virStorageSourceParseBackingJSONVxHS, 0},
|
||||||
|
diff --git a/src/test/test_driver.c b/src/test/test_driver.c
|
||||||
|
index ef0ddab..2173dc3 100644
|
||||||
|
--- a/src/test/test_driver.c
|
||||||
|
+++ b/src/test/test_driver.c
|
||||||
|
@@ -7131,6 +7131,7 @@ testStorageVolumeTypeForPool(int pooltype)
|
||||||
|
case VIR_STORAGE_POOL_ISCSI_DIRECT:
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_RBD:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
return VIR_STORAGE_VOL_NETWORK;
|
||||||
|
case VIR_STORAGE_POOL_LOGICAL:
|
||||||
|
case VIR_STORAGE_POOL_DISK:
|
||||||
|
diff --git a/tests/storagepoolcapsschemadata/poolcaps-fs.xml b/tests/storagepoolcapsschemadata/poolcaps-fs.xml
|
||||||
|
index eee75af..8bd0a57 100644
|
||||||
|
--- a/tests/storagepoolcapsschemadata/poolcaps-fs.xml
|
||||||
|
+++ b/tests/storagepoolcapsschemadata/poolcaps-fs.xml
|
||||||
|
@@ -204,4 +204,11 @@
|
||||||
|
</enum>
|
||||||
|
</volOptions>
|
||||||
|
</pool>
|
||||||
|
+ <pool type='vitastor' supported='no'>
|
||||||
|
+ <volOptions>
|
||||||
|
+ <defaultFormat type='raw'/>
|
||||||
|
+ <enum name='targetFormatType'>
|
||||||
|
+ </enum>
|
||||||
|
+ </volOptions>
|
||||||
|
+ </pool>
|
||||||
|
</storagepoolCapabilities>
|
||||||
|
diff --git a/tests/storagepoolcapsschemadata/poolcaps-full.xml b/tests/storagepoolcapsschemadata/poolcaps-full.xml
|
||||||
|
index 805950a..852df0d 100644
|
||||||
|
--- a/tests/storagepoolcapsschemadata/poolcaps-full.xml
|
||||||
|
+++ b/tests/storagepoolcapsschemadata/poolcaps-full.xml
|
||||||
|
@@ -204,4 +204,11 @@
|
||||||
|
</enum>
|
||||||
|
</volOptions>
|
||||||
|
</pool>
|
||||||
|
+ <pool type='vitastor' supported='yes'>
|
||||||
|
+ <volOptions>
|
||||||
|
+ <defaultFormat type='raw'/>
|
||||||
|
+ <enum name='targetFormatType'>
|
||||||
|
+ </enum>
|
||||||
|
+ </volOptions>
|
||||||
|
+ </pool>
|
||||||
|
</storagepoolCapabilities>
|
||||||
|
diff --git a/tests/storagepoolxml2argvtest.c b/tests/storagepoolxml2argvtest.c
|
||||||
|
index 449b745..7f95cc8 100644
|
||||||
|
--- a/tests/storagepoolxml2argvtest.c
|
||||||
|
+++ b/tests/storagepoolxml2argvtest.c
|
||||||
|
@@ -68,6 +68,7 @@ testCompareXMLToArgvFiles(bool shouldFail,
|
||||||
|
case VIR_STORAGE_POOL_GLUSTER:
|
||||||
|
case VIR_STORAGE_POOL_ZFS:
|
||||||
|
case VIR_STORAGE_POOL_VSTORAGE:
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
default:
|
||||||
|
VIR_TEST_DEBUG("pool type '%s' has no xml2argv test", defTypeStr);
|
||||||
|
diff --git a/tools/virsh-pool.c b/tools/virsh-pool.c
|
||||||
|
index 18f3839..c8e1436 100644
|
||||||
|
--- a/tools/virsh-pool.c
|
||||||
|
+++ b/tools/virsh-pool.c
|
||||||
|
@@ -1231,6 +1231,9 @@ cmdPoolList(vshControl *ctl, const vshCmd *cmd G_GNUC_UNUSED)
|
||||||
|
case VIR_STORAGE_POOL_VSTORAGE:
|
||||||
|
flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE;
|
||||||
|
break;
|
||||||
|
+ case VIR_STORAGE_POOL_VITASTOR:
|
||||||
|
+ flags |= VIR_CONNECT_LIST_STORAGE_POOLS_VITASTOR;
|
||||||
|
+ break;
|
||||||
|
case VIR_STORAGE_POOL_LAST:
|
||||||
|
break;
|
||||||
|
}
|
|
@ -0,0 +1,32 @@
<!-- Example libvirt VM configuration with Vitastor disk -->
<domain type='kvm'>
    <name>debian9</name>
    <uuid>96f277fb-fd9c-49da-bf21-a5cfd54eb162</uuid>
    <memory unit="KiB">524288</memory>
    <currentMemory>524288</currentMemory>
    <vcpu>1</vcpu>
    <os>
        <type arch='x86_64'>hvm</type>
        <boot dev='hd' />
    </os>
    <devices>
        <emulator>/usr/bin/qemu-system-x86_64</emulator>
        <disk type='network' device='disk'>
            <target dev='vda' bus='virtio' />
            <driver name='qemu' type='raw' />
            <!-- name is Vitastor image name -->
            <!-- config (optional) is the path to Vitastor's configuration file -->
            <!-- query (optional) is Vitastor's etcd_prefix -->
            <source protocol='vitastor' name='debian9' query='/vitastor' config='/etc/vitastor/vitastor.conf'>
                <!-- hosts = etcd addresses -->
                <host name='192.168.7.2' port='2379' />
            </source>
            <!-- required because Vitastor only supports 4k physical sectors -->
            <blockio physical_block_size="4096" logical_block_size="512" />
        </disk>
        <interface type='network'>
            <source network='default' />
        </interface>
        <graphics type='vnc' port='-1' />
    </devices>
</domain>
|
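The disk `<source>` element in the example configuration takes a required `name` plus optional `query` and `config` attributes and one `<host>` child per etcd address. A minimal sketch of generating such an element follows, assuming Python's standard `xml.etree.ElementTree`; the helper name `build_vitastor_source` and its parameters are hypothetical and are not part of libvirt or nova.

```python
import xml.etree.ElementTree as ET


def build_vitastor_source(name, hosts, etcd_prefix=None, config_path=None):
    """Build a <source protocol='vitastor'> element (illustrative helper only)."""
    source = ET.Element("source", protocol="vitastor", name=name)
    # Optional attributes are set only when a value is given, so that
    # None is never passed as an attribute value.
    if etcd_prefix is not None:
        source.set("query", etcd_prefix)
    if config_path is not None:
        source.set("config", config_path)
    # One <host> child per etcd address.
    for host, port in hosts:
        ET.SubElement(source, "host", name=host, port=str(port))
    return source


source = build_vitastor_source(
    "debian9",
    [("192.168.7.2", 2379)],
    etcd_prefix="/vitastor",
    config_path="/etc/vitastor/vitastor.conf",
)
print(ET.tostring(source, encoding="unicode"))
```

Setting the optional attributes conditionally keeps the element valid when only the image name is known, which matches the "(optional)" notes in the example's comments.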
@ -0,0 +1,287 @@
|
||||||
|
diff --git a/nova/virt/image/model.py b/nova/virt/image/model.py
|
||||||
|
index 971f7e9c07..70ed70d5e2 100644
|
||||||
|
--- a/nova/virt/image/model.py
|
||||||
|
+++ b/nova/virt/image/model.py
|
||||||
|
@@ -129,3 +129,22 @@ class RBDImage(Image):
|
||||||
|
self.user = user
|
||||||
|
self.password = password
|
||||||
|
self.servers = servers
|
||||||
|
+
|
||||||
|
+
|
||||||
|
+class VitastorImage(Image):
|
||||||
|
+ """Class for images in a remote Vitastor cluster"""
|
||||||
|
+
|
||||||
|
+ def __init__(self, name, etcd_address = None, etcd_prefix = None, config_path = None):
|
||||||
|
+ """Create a new Vitastor image object
|
||||||
|
+
|
||||||
|
+ :param name: name of the image
|
||||||
|
+ :param etcd_address: etcd URL(s) (optional)
|
||||||
|
+ :param etcd_prefix: etcd prefix (optional)
|
||||||
|
+ :param config_path: path to the configuration (optional)
|
||||||
|
+ """
|
||||||
|
+ super(VitastorImage, self).__init__(FORMAT_RAW)
|
||||||
|
+
|
||||||
|
+ self.name = name
|
||||||
|
+ self.etcd_address = etcd_address
|
||||||
|
+ self.etcd_prefix = etcd_prefix
|
||||||
|
+ self.config_path = config_path
|
||||||
|
diff --git a/nova/virt/images.py b/nova/virt/images.py
|
||||||
|
index 5358f3766a..ebe3d6effb 100644
|
||||||
|
--- a/nova/virt/images.py
|
||||||
|
+++ b/nova/virt/images.py
|
||||||
|
@@ -41,7 +41,7 @@ IMAGE_API = glance.API()
|
||||||
|
|
||||||
|
def qemu_img_info(path, format=None):
|
||||||
|
"""Return an object containing the parsed output from qemu-img info."""
|
||||||
|
- if not os.path.exists(path) and not path.startswith('rbd:'):
|
||||||
|
+ if not os.path.exists(path) and not path.startswith('rbd:') and not path.startswith('vitastor:'):
|
||||||
|
raise exception.DiskNotFound(location=path)
|
||||||
|
|
||||||
|
info = nova.privsep.qemu.unprivileged_qemu_img_info(path, format=format)
|
||||||
|
@@ -50,7 +50,7 @@ def qemu_img_info(path, format=None):
|
||||||
|
|
||||||
|
def privileged_qemu_img_info(path, format=None, output_format='json'):
|
||||||
|
"""Return an object containing the parsed output from qemu-img info."""
|
||||||
|
- if not os.path.exists(path) and not path.startswith('rbd:'):
|
||||||
|
+ if not os.path.exists(path) and not path.startswith('rbd:') and not path.startswith('vitastor:'):
|
||||||
|
raise exception.DiskNotFound(location=path)
|
||||||
|
|
||||||
|
info = nova.privsep.qemu.privileged_qemu_img_info(path, format=format)
|
||||||
|
diff --git a/nova/virt/libvirt/config.py b/nova/virt/libvirt/config.py
|
||||||
|
index f9475776b3..51573fe41d 100644
|
||||||
|
--- a/nova/virt/libvirt/config.py
|
||||||
|
+++ b/nova/virt/libvirt/config.py
|
||||||
|
@@ -1060,6 +1060,8 @@ class LibvirtConfigGuestDisk(LibvirtConfigGuestDevice):
|
||||||
|
self.driver_iommu = False
|
||||||
|
self.source_path = None
|
||||||
|
self.source_protocol = None
|
||||||
|
+ self.source_query = None
|
||||||
|
+ self.source_config = None
|
||||||
|
self.source_name = None
|
||||||
|
self.source_hosts = []
|
||||||
|
self.source_ports = []
|
||||||
|
@@ -1186,7 +1188,8 @@ class LibvirtConfigGuestDisk(LibvirtConfigGuestDevice):
|
||||||
|
elif self.source_type == "mount":
|
||||||
|
dev.append(etree.Element("source", dir=self.source_path))
|
||||||
|
elif self.source_type == "network" and self.source_protocol:
|
||||||
|
- source = etree.Element("source", protocol=self.source_protocol)
|
||||||
|
+ source = etree.Element("source", protocol=self.source_protocol,
|
||||||
|
+ query=self.source_query, config=self.source_config)
|
||||||
|
if self.source_name is not None:
|
||||||
|
source.set('name', self.source_name)
|
||||||
|
hosts_info = zip(self.source_hosts, self.source_ports)
|
||||||
|
diff --git a/nova/virt/libvirt/driver.py b/nova/virt/libvirt/driver.py
|
||||||
|
index 391231c527..34dc60dcdd 100644
|
||||||
|
--- a/nova/virt/libvirt/driver.py
|
||||||
|
+++ b/nova/virt/libvirt/driver.py
|
||||||
|
@@ -179,6 +179,7 @@ VOLUME_DRIVERS = {
|
||||||
|
'local': 'nova.virt.libvirt.volume.volume.LibvirtVolumeDriver',
|
||||||
|
'fake': 'nova.virt.libvirt.volume.volume.LibvirtFakeVolumeDriver',
|
||||||
|
'rbd': 'nova.virt.libvirt.volume.net.LibvirtNetVolumeDriver',
|
||||||
|
+ 'vitastor': 'nova.virt.libvirt.volume.vitastor.LibvirtVitastorVolumeDriver',
|
||||||
|
'nfs': 'nova.virt.libvirt.volume.nfs.LibvirtNFSVolumeDriver',
|
||||||
|
'smbfs': 'nova.virt.libvirt.volume.smbfs.LibvirtSMBFSVolumeDriver',
|
||||||
|
'fibre_channel': 'nova.virt.libvirt.volume.fibrechannel.LibvirtFibreChannelVolumeDriver', # noqa:E501
|
||||||
|
@@ -385,10 +386,10 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||||
|
# This prevents the risk of one test setting a capability
|
||||||
|
# which bleeds over into other tests.
|
||||||
|
|
||||||
|
- # LVM and RBD require raw images. If we are not configured to
|
||||||
|
+ # LVM, RBD and Vitastor require raw images. If we are not configured to
|
||||||
|
# force convert images into raw format, then we _require_ raw
|
||||||
|
# images only.
|
||||||
|
- raw_only = ('rbd', 'lvm')
|
||||||
|
+ raw_only = ('rbd', 'lvm', 'vitastor')
|
||||||
|
requires_raw_image = (CONF.libvirt.images_type in raw_only and
|
||||||
|
not CONF.force_raw_images)
|
||||||
|
requires_ploop_image = CONF.libvirt.virt_type == 'parallels'
|
||||||
|
@@ -775,12 +776,12 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||||
|
# Some imagebackends are only able to import raw disk images,
|
||||||
|
# and will fail if given any other format. See the bug
|
||||||
|
# https://bugs.launchpad.net/nova/+bug/1816686 for more details.
|
||||||
|
- if CONF.libvirt.images_type in ('rbd',):
|
||||||
|
+ if CONF.libvirt.images_type in ('rbd', 'vitastor'):
|
||||||
|
if not CONF.force_raw_images:
|
||||||
|
msg = _("'[DEFAULT]/force_raw_images = False' is not "
|
||||||
|
- "allowed with '[libvirt]/images_type = rbd'. "
|
||||||
|
+ "allowed with '[libvirt]/images_type = rbd' or 'vitastor'. "
|
||||||
|
"Please check the two configs and if you really "
|
||||||
|
- "do want to use rbd as images_type, set "
|
||||||
|
+ "do want to use rbd or vitastor as images_type, set "
|
||||||
|
"force_raw_images to True.")
|
||||||
|
raise exception.InvalidConfiguration(msg)
|
||||||
|
|
||||||
|
@@ -2603,6 +2604,16 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||||
|
if connection_info['data'].get('auth_enabled'):
|
||||||
|
username = connection_info['data']['auth_username']
|
||||||
|
path = f"rbd:{volume_name}:id={username}"
|
||||||
|
+ elif connection_info['driver_volume_type'] == 'vitastor':
|
||||||
|
+ volume_name = connection_info['data']['name']
|
||||||
|
+ path = 'vitastor:image='+volume_name.replace(':', '\\:')
|
||||||
|
+ for k in [ 'config_path', 'etcd_address', 'etcd_prefix' ]:
|
||||||
|
+ if k in connection_info['data']:
|
||||||
|
+ kk = k
|
||||||
|
+ if kk == 'etcd_address':
|
||||||
|
+ # FIXME use etcd_address in qemu driver
|
||||||
|
+ kk = 'etcd_host'
|
||||||
|
+ path += ":"+kk+"="+connection_info['data'][k].replace(':', '\\:')
|
||||||
|
else:
|
||||||
|
path = 'unknown'
|
||||||
|
raise exception.DiskNotFound(location='unknown')
|
||||||
|
@@ -2827,8 +2838,8 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||||
|
|
||||||
|
image_format = CONF.libvirt.snapshot_image_format or source_type
|
||||||
|
|
||||||
|
- # NOTE(bfilippov): save lvm and rbd as raw
|
||||||
|
- if image_format == 'lvm' or image_format == 'rbd':
|
||||||
|
+ # NOTE(bfilippov): save lvm, rbd and vitastor as raw
|
||||||
|
+ if image_format == 'lvm' or image_format == 'rbd' or image_format == 'vitastor':
|
||||||
|
image_format = 'raw'
|
||||||
|
|
||||||
|
metadata = self._create_snapshot_metadata(instance.image_meta,
|
||||||
|
@@ -2899,7 +2910,7 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||||
|
expected_state=task_states.IMAGE_UPLOADING)
|
||||||
|
|
||||||
|
# TODO(nic): possibly abstract this out to the root_disk
|
||||||
|
- if source_type == 'rbd' and live_snapshot:
|
||||||
|
+ if (source_type == 'rbd' or source_type == 'vitastor') and live_snapshot:
|
||||||
|
# Standard snapshot uses qemu-img convert from RBD which is
|
||||||
|
# not safe to run with live_snapshot.
|
||||||
|
live_snapshot = False
|
||||||
|
@@ -4099,7 +4110,7 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||||
|
# cleanup rescue volume
|
||||||
|
lvm.remove_volumes([lvmdisk for lvmdisk in self._lvm_disks(instance)
|
||||||
|
if lvmdisk.endswith('.rescue')])
|
||||||
|
- if CONF.libvirt.images_type == 'rbd':
|
||||||
|
+ if CONF.libvirt.images_type == 'rbd' or CONF.libvirt.images_type == 'vitastor':
|
||||||
|
filter_fn = lambda disk: (disk.startswith(instance.uuid) and
|
||||||
|
disk.endswith('.rescue'))
|
||||||
|
rbd_utils.RBDDriver().cleanup_volumes(filter_fn)
|
||||||
|
@@ -4356,6 +4367,8 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||||
|
# TODO(mikal): there is a bug here if images_type has
|
||||||
|
# changed since creation of the instance, but I am pretty
|
||||||
|
# sure that this bug already exists.
|
||||||
|
+ if CONF.libvirt.images_type == 'vitastor':
|
||||||
|
+ return 'vitastor'
|
||||||
|
return 'rbd' if CONF.libvirt.images_type == 'rbd' else 'raw'
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
@@ -4764,10 +4777,10 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||||
|
finally:
|
||||||
|
# NOTE(mikal): if the config drive was imported into RBD,
|
||||||
|
# then we no longer need the local copy
|
||||||
|
- if CONF.libvirt.images_type == 'rbd':
|
||||||
|
+ if CONF.libvirt.images_type == 'rbd' or CONF.libvirt.images_type == 'vitastor':
|
||||||
|
LOG.info('Deleting local config drive %(path)s '
|
||||||
|
- 'because it was imported into RBD.',
|
||||||
|
- {'path': config_disk_local_path},
|
||||||
|
+ 'because it was imported into %(type)s.',
|
||||||
|
+ {'path': config_disk_local_path, 'type': CONF.libvirt.images_type},
|
||||||
|
instance=instance)
|
||||||
|
os.unlink(config_disk_local_path)
|
||||||
|
|
||||||
|
diff --git a/nova/virt/libvirt/utils.py b/nova/virt/libvirt/utils.py
|
||||||
|
index da2a6e8b8a..52c02e72f1 100644
|
||||||
|
--- a/nova/virt/libvirt/utils.py
|
||||||
|
+++ b/nova/virt/libvirt/utils.py
|
||||||
|
@@ -340,6 +340,10 @@ def find_disk(guest: libvirt_guest.Guest) -> ty.Tuple[str, ty.Optional[str]]:
|
||||||
|
disk_path = disk.source_name
|
||||||
|
if disk_path:
|
||||||
|
disk_path = 'rbd:' + disk_path
|
||||||
|
+ elif not disk_path and disk.source_protocol == 'vitastor':
|
||||||
|
+ disk_path = disk.source_name
|
||||||
|
+ if disk_path:
|
||||||
|
+ disk_path = 'vitastor:' + disk_path
|
||||||
|
|
||||||
|
if not disk_path:
|
||||||
|
raise RuntimeError(_("Can't retrieve root device path "
|
||||||
|
@@ -354,6 +358,8 @@ def get_disk_type_from_path(path: str) -> ty.Optional[str]:
|
||||||
|
return 'lvm'
|
||||||
|
elif path.startswith('rbd:'):
|
||||||
|
return 'rbd'
|
||||||
|
+ elif path.startswith('vitastor:'):
|
||||||
|
+ return 'vitastor'
|
||||||
|
elif (os.path.isdir(path) and
|
||||||
|
os.path.exists(os.path.join(path, "DiskDescriptor.xml"))):
|
||||||
|
return 'ploop'
|
||||||
|
diff --git a/nova/virt/libvirt/volume/vitastor.py b/nova/virt/libvirt/volume/vitastor.py
new file mode 100644
index 0000000000..0256df62c1
--- /dev/null
+++ b/nova/virt/libvirt/volume/vitastor.py
@@ -0,0 +1,75 @@
+# Copyright (c) 2021+, Vitaliy Filippov <vitalif@yourcmc.ru>
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+from os_brick import exception as os_brick_exception
+from os_brick import initiator
+from os_brick.initiator import connector
+from oslo_log import log as logging
+
+import nova.conf
+from nova import utils
+from nova.virt.libvirt.volume import volume as libvirt_volume
+
+
+CONF = nova.conf.CONF
+LOG = logging.getLogger(__name__)
+
+
+class LibvirtVitastorVolumeDriver(libvirt_volume.LibvirtBaseVolumeDriver):
+    """Driver to attach Vitastor volumes to libvirt."""
+    def __init__(self, host):
+        super(LibvirtVitastorVolumeDriver, self).__init__(host, is_block_dev=False)
+
+    def connect_volume(self, connection_info, instance):
+        pass
+
+    def disconnect_volume(self, connection_info, instance):
+        pass
+
+    def get_config(self, connection_info, disk_info):
+        """Returns xml for libvirt."""
+        conf = super(LibvirtVitastorVolumeDriver, self).get_config(connection_info, disk_info)
+        conf.source_type = 'network'
+        conf.source_protocol = 'vitastor'
+        conf.source_name = connection_info['data'].get('name')
+        conf.source_query = connection_info['data'].get('etcd_prefix') or None
+        conf.source_config = connection_info['data'].get('config_path') or None
+        conf.source_hosts = []
+        conf.source_ports = []
+        addresses = connection_info['data'].get('etcd_address', '')
+        if addresses:
+            if not isinstance(addresses, list):
+                addresses = addresses.split(',')
+            for addr in addresses:
+                if addr.startswith('https://'):
+                    raise NotImplementedError('Vitastor block driver does not support SSL for etcd communication yet')
+                if addr.startswith('http://'):
+                    addr = addr[7:]
+                addr = addr.rstrip('/')
+                if addr.endswith('/v3'):
+                    addr = addr[0:-3]
+                p = addr.find('/')
+                if p > 0:
+                    raise NotImplementedError('libvirt does not support custom URL paths for Vitastor etcd yet. Use /etc/vitastor/vitastor.conf')
+                p = addr.find(':')
+                port = '2379'
+                if p > 0:
+                    port = addr[p+1:]
+                    addr = addr[0:p]
+                conf.source_hosts.append(addr)
+                conf.source_ports.append(port)
+        return conf
+
+    def extend_volume(self, connection_info, instance, requested_size):
+        raise NotImplementedError
|
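The address handling in `get_config()` of the Vitastor volume driver can be read as a small standalone routine: strip the `http://` scheme, drop a trailing `/v3`, reject `https://` and custom URL paths, then split host and port with 2379 as the default. The sketch below restates those steps outside nova so they can be tried in isolation. The function name `parse_etcd_addresses` is hypothetical, and the use of `str.partition` is a simplification of the patch's `find(':')` logic, so corner cases such as an empty host may behave differently; IPv6 literals are not handled.

```python
def parse_etcd_addresses(addresses, default_port="2379"):
    """Split etcd URL(s) into (host, port) string pairs.

    Illustrative restatement of the parsing steps in the patch;
    not the driver's actual code.
    """
    if not addresses:
        return []
    # Accept either a list or a comma-separated string.
    if not isinstance(addresses, list):
        addresses = addresses.split(",")
    result = []
    for addr in addresses:
        if addr.startswith("https://"):
            raise NotImplementedError("SSL for etcd is not supported")
        if addr.startswith("http://"):
            addr = addr[len("http://"):]
        addr = addr.rstrip("/")
        # A trailing /v3 API suffix is dropped; any other path is rejected.
        if addr.endswith("/v3"):
            addr = addr[:-3]
        if "/" in addr:
            raise NotImplementedError("custom URL paths are not supported")
        host, sep, port = addr.partition(":")
        result.append((host, port if sep else default_port))
    return result


pairs = parse_etcd_addresses("http://192.168.7.2:2379/v3,10.0.0.5")
print(pairs)
```

Returning pairs rather than two parallel lists is a choice made for the sketch only; the patch fills `conf.source_hosts` and `conf.source_ports` separately.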
|
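The nova driver change also builds a `vitastor:` path string for `qemu-img`: options are joined with colons, literal colons inside values are escaped with a backslash, and `etcd_address` is passed under the name `etcd_host`. A minimal sketch of that construction is given here, assuming a dict shaped like `connection_info['data']`; the function name `build_vitastor_qemu_path` is hypothetical.

```python
def build_vitastor_qemu_path(data):
    """Build a 'vitastor:...' path from a connection_info['data']-style dict.

    Illustrative only; mirrors the string-building steps of the patch.
    """
    def escape(value):
        # Colons separate options in the path, so literal colons are escaped.
        return str(value).replace(":", "\\:")

    path = "vitastor:image=" + escape(data["name"])
    for key in ("config_path", "etcd_address", "etcd_prefix"):
        if key in data:
            # The patch passes etcd_address to QEMU as etcd_host.
            option = "etcd_host" if key == "etcd_address" else key
            path += ":" + option + "=" + escape(data[key])
    return path


path = build_vitastor_qemu_path({
    "name": "debian9",
    "etcd_address": "192.168.7.2:2379",
    "etcd_prefix": "/vitastor",
})
print(path)
```

Keys that are absent from the dict are simply left out of the path, so an image referenced only by name yields `vitastor:image=<name>`.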
@ -11,7 +11,7 @@ Index: qemu-3.1+dfsg/qapi/block-core.json
|
||||||
'host_cdrom', 'host_device', 'http', 'https', 'iscsi', 'luks',
|
'host_cdrom', 'host_device', 'http', 'https', 'iscsi', 'luks',
|
||||||
'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow',
|
'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow',
|
||||||
'qcow2', 'qed', 'quorum', 'raw', 'rbd', 'replication', 'sheepdog',
|
'qcow2', 'qed', 'quorum', 'raw', 'rbd', 'replication', 'sheepdog',
|
||||||
@@ -3367,6 +3367,24 @@
|
@@ -3367,6 +3367,28 @@
|
||||||
'*tag': 'str' } }
|
'*tag': 'str' } }
|
||||||
|
|
||||||
##
|
##
|
||||||
|
@ -19,17 +19,21 @@ Index: qemu-3.1+dfsg/qapi/block-core.json
|
||||||
+#
|
+#
|
||||||
+# Driver specific block device options for vitastor
|
+# Driver specific block device options for vitastor
|
||||||
+#
|
+#
|
||||||
|
+# @image: Image name
|
||||||
+# @inode: Inode number
|
+# @inode: Inode number
|
||||||
+# @pool: Pool ID
|
+# @pool: Pool ID
|
||||||
+# @size: Desired image size in bytes
|
+# @size: Desired image size in bytes
|
||||||
+# @etcd_host: etcd connection address
|
+# @config_path: Path to Vitastor configuration
|
||||||
|
+# @etcd_host: etcd connection address(es)
|
||||||
+# @etcd_prefix: etcd key/value prefix
|
+# @etcd_prefix: etcd key/value prefix
|
||||||
+##
|
+##
|
||||||
+{ 'struct': 'BlockdevOptionsVitastor',
|
+{ 'struct': 'BlockdevOptionsVitastor',
|
||||||
+ 'data': { 'inode': 'uint64',
|
+ 'data': { '*inode': 'uint64',
|
||||||
+ 'pool': 'uint64',
|
+ '*pool': 'uint64',
|
||||||
+ 'size': 'uint64',
|
+ '*size': 'uint64',
|
||||||
+ 'etcd_host': 'str',
|
+ '*image': 'str',
|
||||||
|
+ '*config_path': 'str',
|
||||||
|
+ '*etcd_host': 'str',
|
||||||
+ '*etcd_prefix': 'str' } }
|
+ '*etcd_prefix': 'str' } }
|
||||||
+
|
+
|
||||||
+##
|
+##
|
|
@ -11,7 +11,7 @@ Index: qemu/qapi/block-core.json
|
||||||
'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
|
'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
|
||||||
|
|
||||||
##
|
##
|
||||||
@@ -3725,6 +3725,24 @@
|
@@ -3725,6 +3725,28 @@
|
||||||
'*tag': 'str' } }
|
'*tag': 'str' } }
|
||||||
|
|
||||||
##
|
##
|
||||||
|
@ -19,17 +19,21 @@ Index: qemu/qapi/block-core.json
|
||||||
+#
|
+#
|
||||||
+# Driver specific block device options for vitastor
|
+# Driver specific block device options for vitastor
|
||||||
+#
|
+#
|
||||||
|
+# @image: Image name
|
||||||
+# @inode: Inode number
|
+# @inode: Inode number
|
||||||
+# @pool: Pool ID
|
+# @pool: Pool ID
|
||||||
+# @size: Desired image size in bytes
|
+# @size: Desired image size in bytes
|
||||||
+# @etcd_host: etcd connection address
|
+# @config_path: Path to Vitastor configuration
|
||||||
|
+# @etcd_host: etcd connection address(es)
|
||||||
+# @etcd_prefix: etcd key/value prefix
|
+# @etcd_prefix: etcd key/value prefix
|
||||||
+##
|
+##
|
||||||
+{ 'struct': 'BlockdevOptionsVitastor',
|
+{ 'struct': 'BlockdevOptionsVitastor',
|
||||||
+ 'data': { 'inode': 'uint64',
|
+ 'data': { '*inode': 'uint64',
|
||||||
+ 'pool': 'uint64',
|
+ '*pool': 'uint64',
|
||||||
+ 'size': 'uint64',
|
+ '*size': 'uint64',
|
||||||
+ 'etcd_host': 'str',
|
+ '*image': 'str',
|
||||||
|
+ '*config_path': 'str',
|
||||||
|
+ '*etcd_host': 'str',
|
||||||
+ '*etcd_prefix': 'str' } }
|
+ '*etcd_prefix': 'str' } }
|
||||||
+
|
+
|
||||||
+##
|
+##
|
|
@ -11,7 +11,7 @@ Index: qemu/qapi/block-core.json
|
||||||
'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
|
'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
|
||||||
|
|
||||||
##
|
##
|
||||||
@@ -3635,6 +3635,24 @@
|
@@ -3635,6 +3635,28 @@
|
||||||
'*tag': 'str' } }
|
'*tag': 'str' } }
|
||||||
|
|
||||||
##
|
##
|
||||||
|
@ -19,17 +19,21 @@ Index: qemu/qapi/block-core.json
|
||||||
+#
|
+#
|
||||||
+# Driver specific block device options for vitastor
|
+# Driver specific block device options for vitastor
|
||||||
+#
|
+#
|
||||||
|
+# @image: Image name
|
||||||
+# @inode: Inode number
|
+# @inode: Inode number
|
||||||
+# @pool: Pool ID
|
+# @pool: Pool ID
|
||||||
+# @size: Desired image size in bytes
|
+# @size: Desired image size in bytes
|
||||||
+# @etcd_host: etcd connection address
|
+# @config_path: Path to Vitastor configuration
|
||||||
|
+# @etcd_host: etcd connection address(es)
|
||||||
+# @etcd_prefix: etcd key/value prefix
|
+# @etcd_prefix: etcd key/value prefix
|
||||||
+##
|
+##
|
||||||
+{ 'struct': 'BlockdevOptionsVitastor',
|
+{ 'struct': 'BlockdevOptionsVitastor',
|
||||||
+ 'data': { 'inode': 'uint64',
|
+ 'data': { '*inode': 'uint64',
|
||||||
+ 'pool': 'uint64',
|
+ '*pool': 'uint64',
|
||||||
+ 'size': 'uint64',
|
+ '*size': 'uint64',
|
||||||
+ 'etcd_host': 'str',
|
+ '*image': 'str',
|
||||||
|
+ '*config_path': 'str',
|
||||||
|
+ '*etcd_host': 'str',
|
||||||
+ '*etcd_prefix': 'str' } }
|
+ '*etcd_prefix': 'str' } }
|
||||||
+
|
+
|
||||||
+##
|
+##
|
|
@ -11,7 +11,7 @@ Index: qemu-5.1+dfsg/qapi/block-core.json
|
||||||
'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
|
'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
|
||||||
|
|
||||||
##
|
##
|
||||||
@@ -3644,6 +3644,24 @@
|
@@ -3644,6 +3644,28 @@
|
||||||
'*tag': 'str' } }
|
'*tag': 'str' } }
|
||||||
|
|
||||||
##
|
##
|
||||||
|
@ -19,17 +19,21 @@ Index: qemu-5.1+dfsg/qapi/block-core.json
|
||||||
+#
|
+#
|
||||||
+# Driver specific block device options for vitastor
|
+# Driver specific block device options for vitastor
|
||||||
+#
|
+#
|
||||||
|
+# @image: Image name
|
||||||
+# @inode: Inode number
|
+# @inode: Inode number
|
||||||
+# @pool: Pool ID
|
+# @pool: Pool ID
|
||||||
+# @size: Desired image size in bytes
|
+# @size: Desired image size in bytes
|
||||||
+# @etcd_host: etcd connection address
|
+# @config_path: Path to Vitastor configuration
|
||||||
|
+# @etcd_host: etcd connection address(es)
|
||||||
+# @etcd_prefix: etcd key/value prefix
|
+# @etcd_prefix: etcd key/value prefix
|
||||||
+##
|
+##
|
||||||
+{ 'struct': 'BlockdevOptionsVitastor',
|
+{ 'struct': 'BlockdevOptionsVitastor',
|
||||||
+ 'data': { 'inode': 'uint64',
|
+ 'data': { '*inode': 'uint64',
|
||||||
+ 'pool': 'uint64',
|
+ '*pool': 'uint64',
|
||||||
+ 'size': 'uint64',
|
+ '*size': 'uint64',
|
||||||
+ 'etcd_host': 'str',
|
+ '*image': 'str',
|
||||||
|
+ '*config_path': 'str',
|
||||||
|
+ '*etcd_host': 'str',
|
||||||
+ '*etcd_prefix': 'str' } }
|
+ '*etcd_prefix': 'str' } }
|
||||||
+
|
+
|
||||||
+##
|
+##
|
|
@ -48,4 +48,4 @@ FIO=`rpm -qi fio | perl -e 'while(<>) { /^Epoch[\s:]+(\S+)/ && print "$1:"; /^Ve
|
||||||
QEMU=`rpm -qi qemu qemu-kvm | perl -e 'while(<>) { /^Epoch[\s:]+(\S+)/ && print "$1:"; /^Version[\s:]+(\S+)/ && print $1; /^Release[\s:]+(\S+)/ && print "-$1"; }'`
|
QEMU=`rpm -qi qemu qemu-kvm | perl -e 'while(<>) { /^Epoch[\s:]+(\S+)/ && print "$1:"; /^Version[\s:]+(\S+)/ && print $1; /^Release[\s:]+(\S+)/ && print "-$1"; }'`
|
||||||
perl -i -pe 's/(Requires:\s*fio)([^\n]+)?/$1 = '$FIO'/' $VITASTOR/rpm/vitastor-el$EL.spec
|
perl -i -pe 's/(Requires:\s*fio)([^\n]+)?/$1 = '$FIO'/' $VITASTOR/rpm/vitastor-el$EL.spec
|
||||||
perl -i -pe 's/(Requires:\s*qemu(?:-kvm)?)([^\n]+)?/$1 = '$QEMU'/' $VITASTOR/rpm/vitastor-el$EL.spec
|
perl -i -pe 's/(Requires:\s*qemu(?:-kvm)?)([^\n]+)?/$1 = '$QEMU'/' $VITASTOR/rpm/vitastor-el$EL.spec
|
||||||
tar --transform 's#^#vitastor-0.6.2/#' --exclude 'rpm/*.rpm' -czf $VITASTOR/../vitastor-0.6.2$(rpm --eval '%dist').tar.gz *
|
tar --transform 's#^#vitastor-0.6.5/#' --exclude 'rpm/*.rpm' -czf $VITASTOR/../vitastor-0.6.5$(rpm --eval '%dist').tar.gz *
|
||||||
|
|
|
@ -11,7 +11,7 @@ RUN rm -rf /var/lib/dnf/*; dnf download --disablerepo='*' --enablerepo='centos-a
|
||||||
RUN rpm --nomd5 -i qemu*.src.rpm
|
RUN rpm --nomd5 -i qemu*.src.rpm
|
||||||
RUN cd ~/rpmbuild/SPECS && dnf builddep -y --enablerepo=PowerTools --spec qemu-kvm.spec
|
RUN cd ~/rpmbuild/SPECS && dnf builddep -y --enablerepo=PowerTools --spec qemu-kvm.spec
|
||||||
|
|
||||||
ADD qemu-*-vitastor.patch /root/vitastor/
|
ADD patches/qemu-*-vitastor.patch /root/vitastor/patches/
|
||||||
|
|
||||||
RUN set -e; \
|
RUN set -e; \
|
||||||
mkdir -p /root/packages/qemu-el8; \
|
mkdir -p /root/packages/qemu-el8; \
|
||||||
|
@ -25,7 +25,7 @@ RUN set -e; \
|
||||||
echo "Patch$((PN+1)): qemu-4.2-vitastor.patch" >> qemu-kvm.spec; \
|
echo "Patch$((PN+1)): qemu-4.2-vitastor.patch" >> qemu-kvm.spec; \
|
||||||
tail -n +2 xx01 >> qemu-kvm.spec; \
|
tail -n +2 xx01 >> qemu-kvm.spec; \
|
||||||
perl -i -pe 's/(^Release:\s*\d+)/$1.vitastor/' qemu-kvm.spec; \
|
perl -i -pe 's/(^Release:\s*\d+)/$1.vitastor/' qemu-kvm.spec; \
|
||||||
cp /root/vitastor/qemu-4.2-vitastor.patch ~/rpmbuild/SOURCES; \
|
cp /root/vitastor/patches/qemu-4.2-vitastor.patch ~/rpmbuild/SOURCES; \
|
||||||
rpmbuild --nocheck -ba qemu-kvm.spec; \
|
rpmbuild --nocheck -ba qemu-kvm.spec; \
|
||||||
cp ~/rpmbuild/RPMS/*/*qemu* /root/packages/qemu-el8/; \
|
cp ~/rpmbuild/RPMS/*/*qemu* /root/packages/qemu-el8/; \
|
||||||
cp ~/rpmbuild/SRPMS/*qemu* /root/packages/qemu-el8/
|
cp ~/rpmbuild/SRPMS/*qemu* /root/packages/qemu-el8/
|
||||||
|
|
|
@ -15,8 +15,9 @@ RUN yumdownloader --disablerepo=centos-sclo-rh --source fio
|
||||||
RUN rpm --nomd5 -i qemu*.src.rpm
|
RUN rpm --nomd5 -i qemu*.src.rpm
|
||||||
RUN rpm --nomd5 -i fio*.src.rpm
|
RUN rpm --nomd5 -i fio*.src.rpm
|
||||||
RUN rm -f /etc/yum.repos.d/CentOS-Media.repo
|
RUN rm -f /etc/yum.repos.d/CentOS-Media.repo
|
||||||
RUN cd ~/rpmbuild/SPECS && yum-builddep -y --enablerepo='*' --disablerepo=centos-sclo-rh --disablerepo=centos-sclo-rh-source --disablerepo=centos-sclo-sclo-testing qemu-kvm.spec
|
RUN cd ~/rpmbuild/SPECS && yum-builddep -y qemu-kvm.spec
|
||||||
RUN cd ~/rpmbuild/SPECS && yum-builddep -y --enablerepo='*' --disablerepo=centos-sclo-rh --disablerepo=centos-sclo-rh-source --disablerepo=centos-sclo-sclo-testing fio.spec
|
RUN cd ~/rpmbuild/SPECS && yum-builddep -y fio.spec
|
||||||
|
RUN yum -y install rdma-core-devel
|
||||||
|
|
||||||
ADD https://vitastor.io/rpms/liburing-el7/liburing-0.7-2.el7.src.rpm /root
|
ADD https://vitastor.io/rpms/liburing-el7/liburing-0.7-2.el7.src.rpm /root
|
||||||
|
|
||||||
|
@ -37,7 +38,7 @@ ADD . /root/vitastor
|
||||||
RUN set -e; \
|
RUN set -e; \
|
||||||
cd /root/vitastor/rpm; \
|
cd /root/vitastor/rpm; \
|
||||||
sh build-tarball.sh; \
|
sh build-tarball.sh; \
|
||||||
cp /root/vitastor-0.6.2.el7.tar.gz ~/rpmbuild/SOURCES; \
|
cp /root/vitastor-0.6.5.el7.tar.gz ~/rpmbuild/SOURCES; \
|
||||||
cp vitastor-el7.spec ~/rpmbuild/SPECS/vitastor.spec; \
|
cp vitastor-el7.spec ~/rpmbuild/SPECS/vitastor.spec; \
|
||||||
cd ~/rpmbuild/SPECS/; \
|
cd ~/rpmbuild/SPECS/; \
|
||||||
rpmbuild -ba vitastor.spec; \
|
rpmbuild -ba vitastor.spec; \
|
||||||
|
|
|
@ -1,11 +1,11 @@
|
||||||
Name: vitastor
|
Name: vitastor
|
||||||
Version: 0.6.2
|
Version: 0.6.5
|
||||||
Release: 1%{?dist}
|
Release: 1%{?dist}
|
||||||
Summary: Vitastor, a fast software-defined clustered block storage
|
Summary: Vitastor, a fast software-defined clustered block storage
|
||||||
|
|
||||||
License: Vitastor Network Public License 1.1
|
License: Vitastor Network Public License 1.1
|
||||||
URL: https://vitastor.io/
|
URL: https://vitastor.io/
|
||||||
Source0: vitastor-0.6.2.el7.tar.gz
|
Source0: vitastor-0.6.5.el7.tar.gz
|
||||||
|
|
||||||
BuildRequires: liburing-devel >= 0.6
|
BuildRequires: liburing-devel >= 0.6
|
||||||
BuildRequires: gperftools-devel
|
BuildRequires: gperftools-devel
|
||||||
|
@ -14,6 +14,7 @@ BuildRequires: rh-nodejs12
|
||||||
BuildRequires: rh-nodejs12-npm
|
BuildRequires: rh-nodejs12-npm
|
||||||
BuildRequires: jerasure-devel
|
BuildRequires: jerasure-devel
|
||||||
BuildRequires: gf-complete-devel
|
BuildRequires: gf-complete-devel
|
||||||
|
BuildRequires: libibverbs-devel
|
||||||
BuildRequires: cmake
|
BuildRequires: cmake
|
||||||
Requires: fio = 3.7-1.el7
|
Requires: fio = 3.7-1.el7
|
||||||
Requires: qemu-kvm = 2.0.0-1.el7.6
|
Requires: qemu-kvm = 2.0.0-1.el7.6
|
||||||
|
@ -61,8 +62,9 @@ cp -r mon %buildroot/usr/lib/vitastor/mon
|
||||||
%_libdir/libfio_vitastor.so
|
%_libdir/libfio_vitastor.so
|
||||||
%_libdir/libfio_vitastor_blk.so
|
%_libdir/libfio_vitastor_blk.so
|
||||||
%_libdir/libfio_vitastor_sec.so
|
%_libdir/libfio_vitastor_sec.so
|
||||||
%_libdir/libvitastor_blk.so
|
%_libdir/libvitastor_blk.so*
|
||||||
%_libdir/libvitastor_client.so
|
%_libdir/libvitastor_client.so*
|
||||||
|
%_includedir/vitastor_c.h
|
||||||
/usr/lib/vitastor
|
/usr/lib/vitastor
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -15,6 +15,7 @@ RUN rpm --nomd5 -i qemu*.src.rpm
|
||||||
RUN rpm --nomd5 -i fio*.src.rpm
|
RUN rpm --nomd5 -i fio*.src.rpm
|
||||||
RUN cd ~/rpmbuild/SPECS && dnf builddep -y --enablerepo=powertools --spec qemu-kvm.spec
|
RUN cd ~/rpmbuild/SPECS && dnf builddep -y --enablerepo=powertools --spec qemu-kvm.spec
|
||||||
RUN cd ~/rpmbuild/SPECS && dnf builddep -y --enablerepo=powertools --spec fio.spec && dnf install -y cmake
|
RUN cd ~/rpmbuild/SPECS && dnf builddep -y --enablerepo=powertools --spec fio.spec && dnf install -y cmake
|
||||||
|
RUN yum -y install libibverbs-devel libarchive
|
||||||
|
|
||||||
ADD https://vitastor.io/rpms/liburing-el7/liburing-0.7-2.el7.src.rpm /root
|
ADD https://vitastor.io/rpms/liburing-el7/liburing-0.7-2.el7.src.rpm /root
|
||||||
|
|
||||||
|
@ -35,7 +36,7 @@ ADD . /root/vitastor
|
||||||
RUN set -e; \
|
RUN set -e; \
|
||||||
cd /root/vitastor/rpm; \
|
cd /root/vitastor/rpm; \
|
||||||
sh build-tarball.sh; \
|
sh build-tarball.sh; \
|
||||||
cp /root/vitastor-0.6.2.el8.tar.gz ~/rpmbuild/SOURCES; \
|
cp /root/vitastor-0.6.5.el8.tar.gz ~/rpmbuild/SOURCES; \
|
||||||
cp vitastor-el8.spec ~/rpmbuild/SPECS/vitastor.spec; \
|
cp vitastor-el8.spec ~/rpmbuild/SPECS/vitastor.spec; \
|
||||||
cd ~/rpmbuild/SPECS/; \
|
cd ~/rpmbuild/SPECS/; \
|
||||||
rpmbuild -ba vitastor.spec; \
|
rpmbuild -ba vitastor.spec; \
|
||||||
|
|
|
@ -1,11 +1,11 @@
|
||||||
Name: vitastor
|
Name: vitastor
|
||||||
Version: 0.6.2
|
Version: 0.6.5
|
||||||
Release: 1%{?dist}
|
Release: 1%{?dist}
|
||||||
Summary: Vitastor, a fast software-defined clustered block storage
|
Summary: Vitastor, a fast software-defined clustered block storage
|
||||||
|
|
||||||
License: Vitastor Network Public License 1.1
|
License: Vitastor Network Public License 1.1
|
||||||
URL: https://vitastor.io/
|
URL: https://vitastor.io/
|
||||||
Source0: vitastor-0.6.2.el8.tar.gz
|
Source0: vitastor-0.6.5.el8.tar.gz
|
||||||
|
|
||||||
BuildRequires: liburing-devel >= 0.6
|
BuildRequires: liburing-devel >= 0.6
|
||||||
BuildRequires: gperftools-devel
|
BuildRequires: gperftools-devel
|
||||||
|
@ -13,6 +13,7 @@ BuildRequires: gcc-toolset-9-gcc-c++
|
||||||
BuildRequires: nodejs >= 10
|
BuildRequires: nodejs >= 10
|
||||||
BuildRequires: jerasure-devel
|
BuildRequires: jerasure-devel
|
||||||
BuildRequires: gf-complete-devel
|
BuildRequires: gf-complete-devel
|
||||||
|
BuildRequires: libibverbs-devel
|
||||||
BuildRequires: cmake
|
BuildRequires: cmake
|
||||||
Requires: fio = 3.7-3.el8
|
Requires: fio = 3.7-3.el8
|
||||||
Requires: qemu-kvm = 4.2.0-29.el8.6
|
Requires: qemu-kvm = 4.2.0-29.el8.6
|
||||||
@@ -58,8 +59,9 @@ cp -r mon %buildroot/usr/lib/vitastor
 %_libdir/libfio_vitastor.so
 %_libdir/libfio_vitastor_blk.so
 %_libdir/libfio_vitastor_sec.so
-%_libdir/libvitastor_blk.so
-%_libdir/libvitastor_client.so
+%_libdir/libvitastor_blk.so*
+%_libdir/libvitastor_client.so*
+%_includedir/vitastor_c.h
 /usr/lib/vitastor
 
 
|
|
|
@@ -4,6 +4,8 @@ project(vitastor)
 
 include(GNUInstallDirs)
 
+set(WITH_QEMU true CACHE BOOL "Build QEMU driver")
+set(WITH_FIO true CACHE BOOL "Build FIO driver")
 set(QEMU_PLUGINDIR qemu CACHE STRING "QEMU plugin directory suffix (qemu-kvm on RHEL)")
 set(WITH_ASAN false CACHE BOOL "Build with AddressSanitizer")
 if("${CMAKE_INSTALL_PREFIX}" MATCHES "^/usr/local/?$")
@@ -13,7 +15,7 @@ if("${CMAKE_INSTALL_PREFIX}" MATCHES "^/usr/local/?$")
     set(CMAKE_INSTALL_RPATH "${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}")
 endif()
 
-add_definitions(-DVERSION="0.6.2")
+add_definitions(-DVERSION="0.6.5")
 add_definitions(-Wall -Wno-sign-compare -Wno-comment -Wno-parentheses -Wno-pointer-arith -I ${CMAKE_SOURCE_DIR}/src)
 if (${WITH_ASAN})
     add_definitions(-fsanitize=address -fno-omit-frame-pointer)
@@ -36,7 +38,9 @@ string(REGEX REPLACE "([\\/\\-]D) *NDEBUG" "" CMAKE_C_FLAGS_RELWITHDEBINFO "${CM
 
 find_package(PkgConfig)
 pkg_check_modules(LIBURING REQUIRED liburing)
+if (${WITH_QEMU})
 pkg_check_modules(GLIB REQUIRED glib-2.0)
+endif (${WITH_QEMU})
 pkg_check_modules(IBVERBS libibverbs)
 if (IBVERBS_LIBRARIES)
     add_definitions(-DWITH_RDMA)
@@ -62,6 +66,7 @@ target_link_libraries(vitastor_blk
 )
 set_target_properties(vitastor_blk PROPERTIES VERSION ${VERSION} SOVERSION 0)
 
+if (${WITH_FIO})
 # libfio_vitastor_blk.so
 add_library(fio_vitastor_blk SHARED
     fio_engine.cpp
@@ -70,16 +75,18 @@ add_library(fio_vitastor_blk SHARED
 target_link_libraries(fio_vitastor_blk
     vitastor_blk
 )
+endif (${WITH_FIO})
 
 # libvitastor_common.a
+set(MSGR_RDMA "")
+if (IBVERBS_LIBRARIES)
+    set(MSGR_RDMA "msgr_rdma.cpp")
+endif (IBVERBS_LIBRARIES)
 add_library(vitastor_common STATIC
     epoll_manager.cpp etcd_state_client.cpp
     messenger.cpp msgr_stop.cpp msgr_op.cpp msgr_send.cpp msgr_receive.cpp ringloop.cpp ../json11/json11.cpp
-    http_client.cpp osd_ops.cpp pg_states.cpp timerfd_manager.cpp base64.cpp
+    http_client.cpp osd_ops.cpp pg_states.cpp timerfd_manager.cpp base64.cpp ${MSGR_RDMA}
 )
-if (IBVERBS_LIBRARIES)
-    target_sources(vitastor_common PRIVATE msgr_rdma.cpp)
-endif (IBVERBS_LIBRARIES)
 target_compile_options(vitastor_common PUBLIC -fPIC)
 
 # vitastor-osd
@@ -95,6 +102,7 @@ target_link_libraries(vitastor-osd
     ${IBVERBS_LIBRARIES}
 )
 
+if (${WITH_FIO})
 # libfio_vitastor_sec.so
 add_library(fio_vitastor_sec SHARED
     fio_sec_osd.cpp
@@ -103,11 +111,14 @@ add_library(fio_vitastor_sec SHARED
 target_link_libraries(fio_vitastor_sec
     tcmalloc_minimal
 )
+endif (${WITH_FIO})
 
 # libvitastor_client.so
 add_library(vitastor_client SHARED
     cluster_client.cpp
+    vitastor_c.cpp
 )
+set_target_properties(vitastor_client PROPERTIES PUBLIC_HEADER "vitastor_c.h")
 target_link_libraries(vitastor_client
     vitastor_common
     tcmalloc_minimal
@@ -116,6 +127,7 @@ target_link_libraries(vitastor_client
 )
 set_target_properties(vitastor_client PROPERTIES VERSION ${VERSION} SOVERSION 0)
 
+if (${WITH_FIO})
 # libfio_vitastor.so
 add_library(fio_vitastor SHARED
     fio_cluster.cpp
@@ -123,6 +135,7 @@ add_library(fio_vitastor SHARED
 target_link_libraries(fio_vitastor
     vitastor_client
 )
+endif (${WITH_FIO})
 
 # vitastor-nbd
 add_executable(vitastor-nbd
@@ -145,27 +158,24 @@ add_executable(vitastor-dump-journal
     dump_journal.cpp crc32c.c
 )
 
+if (${WITH_QEMU})
 # qemu_driver.so
-add_library(qemu_proxy STATIC qemu_proxy.cpp)
-target_compile_options(qemu_proxy PUBLIC -fPIC)
-target_include_directories(qemu_proxy PUBLIC
+add_library(qemu_vitastor SHARED
+    qemu_driver.c
+)
+target_include_directories(qemu_vitastor PUBLIC
     ../qemu/b/qemu
     ../qemu/include
     ${GLIB_INCLUDE_DIRS}
 )
-target_link_libraries(qemu_proxy
-    vitastor_client
-)
-add_library(qemu_vitastor SHARED
-    qemu_driver.c
-)
 target_link_libraries(qemu_vitastor
-    qemu_proxy
+    vitastor_client
 )
 set_target_properties(qemu_vitastor PROPERTIES
     PREFIX ""
     OUTPUT_NAME "block-vitastor"
 )
+endif (${WITH_QEMU})
 
 ### Test stubs
 
@@ -199,6 +209,14 @@ target_link_libraries(osd_peering_pg_test tcmalloc_minimal)
 # test_allocator
 add_executable(test_allocator test_allocator.cpp allocator.cpp)
 
+# test_cas
+add_executable(test_cas
+    test_cas.cpp
+)
+target_link_libraries(test_cas
+    vitastor_client
+)
+
 # test_cluster_client
 add_executable(test_cluster_client
     test_cluster_client.cpp
@@ -217,5 +235,14 @@ target_include_directories(test_cluster_client PUBLIC ${CMAKE_SOURCE_DIR}/src/mo
 ### Install
 
 install(TARGETS vitastor-osd vitastor-dump-journal vitastor-nbd vitastor-rm RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
-install(TARGETS fio_vitastor fio_vitastor_blk fio_vitastor_sec vitastor_blk vitastor_client LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
+install(
+    TARGETS vitastor_blk vitastor_client
+    LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
+    PUBLIC_HEADER DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}
+)
+if (${WITH_FIO})
+    install(TARGETS fio_vitastor fio_vitastor_blk fio_vitastor_sec LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
+endif (${WITH_FIO})
+if (${WITH_QEMU})
 install(TARGETS qemu_vitastor LIBRARY DESTINATION /usr/${CMAKE_INSTALL_LIBDIR}/${QEMU_PLUGINDIR})
+endif (${WITH_QEMU})
|
|
|
@@ -43,11 +43,6 @@ int blockstore_t::read_bitmap(object_id oid, uint64_t target_version, void *bitm
     return impl->read_bitmap(oid, target_version, bitmap, result_version);
 }
 
-std::unordered_map<object_id, uint64_t> & blockstore_t::get_unstable_writes()
-{
-    return impl->unstable_writes;
-}
-
 std::map<uint64_t, uint64_t> & blockstore_t::get_inode_space_stats()
 {
     return impl->inode_space_stats;
|
|
|
@@ -183,9 +183,6 @@ public:
     // Simplified synchronous operation: get object bitmap & current version
     int read_bitmap(object_id oid, uint64_t target_version, void *bitmap, uint64_t *result_version = NULL);
 
-    // Unstable writes are added here (map of object_id -> version)
-    std::unordered_map<object_id, uint64_t> & get_unstable_writes();
-
     // Get per-inode space usage statistics
     std::map<uint64_t, uint64_t> & get_inode_space_stats();
 
|
|
|
@@ -16,6 +16,8 @@
 
 cluster_client_t::cluster_client_t(ring_loop_t *ringloop, timerfd_manager_t *tfd, json11::Json & config)
 {
+    config = osd_messenger_t::read_config(config);
+
     this->ringloop = ringloop;
     this->tfd = tfd;
     this->config = config;
@@ -49,7 +51,7 @@ cluster_client_t::cluster_client_t(ring_loop_t *ringloop, timerfd_manager_t *tfd
     msgr.exec_op = [this](osd_op_t *op)
     {
         // Garbage in
-        printf("Incoming garbage from peer %d\n", op->peer_fd);
+        fprintf(stderr, "Incoming garbage from peer %d\n", op->peer_fd);
         msgr.stop_client(op->peer_fd);
         delete op;
     };
@@ -631,6 +633,13 @@ resume_1:
     // Slice the operation into parts
     slice_rw(op);
     op->needs_reslice = false;
+    if (op->opcode == OSD_OP_WRITE && op->version && op->parts.size() > 1)
+    {
+        // Atomic writes to multiple stripes are unsupported
+        op->retval = -EINVAL;
+        erase_op(op);
+        return 1;
+    }
 resume_2:
     // Send unsent parts, if they're not subject to change
     op->state = 3;
@@ -686,13 +695,16 @@ resume_3:
     // Check parent inode
     auto ino_it = st_cli.inode_config.find(op->cur_inode);
     while (ino_it != st_cli.inode_config.end() && ino_it->second.parent_id &&
-        INODE_POOL(ino_it->second.parent_id) == INODE_POOL(op->cur_inode))
+        INODE_POOL(ino_it->second.parent_id) == INODE_POOL(op->cur_inode) &&
+        // Check for loops
+        ino_it->second.parent_id != op->inode)
     {
         // Skip parents from the same pool
         ino_it = st_cli.inode_config.find(ino_it->second.parent_id);
     }
     if (ino_it != st_cli.inode_config.end() &&
-        ino_it->second.parent_id)
+        ino_it->second.parent_id &&
+        ino_it->second.parent_id != op->inode)
     {
         // Continue reading from the parent inode
         op->cur_inode = ino_it->second.parent_id;
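The guard added in this hunk stops the parent walk when it would come back to the inode the operation started from. A minimal sketch of that walk, assuming a simplified `parent_of` map in place of `st_cli.inode_config` and a hypothetical `inode_pool()` helper that takes the pool ID from the top 16 bits in place of `INODE_POOL` (both are illustrative and not taken from the diff):

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for INODE_POOL(): pool ID in the top 16 bits (assumption)
static inline uint64_t inode_pool(uint64_t inode)
{
    return inode >> 48;
}

// Returns the next inode to read from, or 0 when the chain ends
// or when following it would lead back to `start`.
// `parent_of` maps inode -> parent inode (0 means "no parent").
uint64_t next_parent(const std::map<uint64_t, uint64_t> & parent_of, uint64_t start, uint64_t cur)
{
    auto it = parent_of.find(cur);
    // Skip parents from the same pool, but never step back onto `start`
    while (it != parent_of.end() && it->second &&
        inode_pool(it->second) == inode_pool(cur) &&
        it->second != start)
    {
        it = parent_of.find(it->second);
    }
    if (it != parent_of.end() && it->second && it->second != start)
        return it->second;
    return 0;
}
```

In this sketch only cycles that pass through the starting inode are detected; a cycle among other inodes of the same pool would still keep the `while` loop running.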
@@ -920,6 +932,7 @@ bool cluster_client_t::try_send(cluster_op_t *op, int i)
         .offset = part->offset,
         .len = part->len,
         .meta_revision = meta_rev,
+        .version = op->opcode == OSD_OP_WRITE ? op->version : 0,
     } },
     .bitmap = op->opcode == OSD_OP_WRITE ? NULL : op->part_bitmaps + pg_bitmap_size*i,
     .bitmap_len = (unsigned)(op->opcode == OSD_OP_WRITE ? 0 : pg_bitmap_size),
@@ -952,13 +965,14 @@ int cluster_client_t::continue_sync(cluster_op_t *op)
         return 1;
     }
     // Check that all OSD connections are still alive
-    for (auto sync_osd: dirty_osds)
+    for (auto do_it = dirty_osds.begin(); do_it != dirty_osds.end(); )
     {
+        osd_num_t sync_osd = *do_it;
         auto peer_it = msgr.osd_peer_fds.find(sync_osd);
         if (peer_it == msgr.osd_peer_fds.end())
-        {
-            return 0;
-        }
+            dirty_osds.erase(do_it++);
+        else
+            do_it++;
     }
     // Post sync to affected OSDs
     for (auto & prev_op: dirty_buffers)
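The rewritten loop removes disconnected OSDs from `dirty_osds` while iterating over it, which a range-based `for` cannot do safely. A minimal sketch of the same erase-while-iterating pattern, assuming `dirty_osds` is a `std::set` and `osd_peer_fds` a `std::map` (the diff does not show their declarations; `drop_disconnected` is a hypothetical name):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>

typedef uint64_t osd_num_t;

// Removes every OSD from `dirty` that has no entry in `peer_fds`.
// Returns the number of removed entries.
size_t drop_disconnected(std::set<osd_num_t> & dirty, const std::map<osd_num_t, int> & peer_fds)
{
    size_t removed = 0;
    for (auto it = dirty.begin(); it != dirty.end(); )
    {
        if (peer_fds.find(*it) == peer_fds.end())
        {
            // erase() invalidates only the erased iterator; the post-increment
            // moves `it` to the next element before the erase happens
            dirty.erase(it++);
            removed++;
        }
        else
            it++;
    }
    return removed;
}
```

Since C++11 `it = dirty.erase(it)` is an equivalent spelling, because `std::set::erase` returns the iterator following the removed element.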
@@ -1069,10 +1083,6 @@ void cluster_client_t::handle_op_part(cluster_op_part_t *part)
     if (part->op.reply.hdr.retval != expected)
     {
         // Operation failed, retry
-        printf(
-            "%s operation failed on OSD %lu: retval=%ld (expected %d), dropping connection\n",
-            osd_op_names[part->op.req.hdr.opcode], part->osd_num, part->op.reply.hdr.retval, expected
-        );
         if (part->op.reply.hdr.retval == -EPIPE)
         {
             // Mark op->up_wait = true before stopping the client
@@ -1091,7 +1101,14 @@ void cluster_client_t::handle_op_part(cluster_op_part_t *part)
             // Don't overwrite other errors with -EPIPE
             op->retval = part->op.reply.hdr.retval;
         }
+        if (op->retval != -EINTR && op->retval != -EIO)
+        {
+            fprintf(
+                stderr, "%s operation failed on OSD %lu: retval=%ld (expected %d), dropping connection\n",
+                osd_op_names[part->op.req.hdr.opcode], part->osd_num, part->op.reply.hdr.retval, expected
+            );
             msgr.stop_client(part->op.peer_fd);
+        }
         part->flags |= PART_ERROR;
     }
     else
@@ -1103,6 +1120,7 @@ void cluster_client_t::handle_op_part(cluster_op_part_t *part)
         if (op->opcode == OSD_OP_READ)
         {
             copy_part_bitmap(op, part);
+            op->version = op->parts.size() == 1 ? part->op.reply.rw.version : 0;
         }
     }
     if (op->inflight_count == 0)
|
|
|
@@ -31,6 +31,9 @@ struct cluster_op_t
     uint64_t inode;
     uint64_t offset;
     uint64_t len;
+    // for reads and writes within a single object (stripe),
+    // reads can return current version and writes can use "CAS" semantics
+    uint64_t version = 0;
     int retval;
     osd_op_buf_list_t iov;
     std::function<void(cluster_op_t*)> callback;
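The new `version` field carries compare-and-set ("CAS") semantics for single-stripe writes, with `0` meaning an unconditional write. A minimal in-memory sketch of how such a check might behave, assuming the write is accepted only when the supplied version is exactly one greater than the stored one and that a mismatch is reported as `-EINTR` (both are assumptions; the diff shows only the field and that `-EINTR` is treated as non-fatal). `versioned_object` and `cas_write` are hypothetical names:

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <string>

// Hypothetical object holding a payload and its current version
struct versioned_object
{
    uint64_t version = 0;
    std::string data;
};

// version == 0: unconditional write.
// version != 0: write only if it equals the stored version + 1 (assumed rule).
// Returns 0 on success, -EINTR on version mismatch (assumed error code).
int cas_write(versioned_object & obj, uint64_t version, const std::string & data)
{
    if (version != 0 && version != obj.version + 1)
        return -EINTR;
    obj.version++;
    obj.data = data;
    return 0;
}
```

A caller would read the object to learn its current version, then write with that version plus one and retry the read-modify-write cycle on `-EINTR`.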
|
|
|
@@ -35,7 +35,7 @@ etcd_kv_t etcd_state_client_t::parse_etcd_kv(const json11::Json & kv_json)
     kv.value = json_text == "" ? json11::Json() : json11::Json::parse(json_text, json_err);
     if (json_err != "")
     {
-        printf("Bad JSON in etcd key %s: %s (value: %s)\n", kv.key.c_str(), json_err.c_str(), json_text.c_str());
+        fprintf(stderr, "Bad JSON in etcd key %s: %s (value: %s)\n", kv.key.c_str(), json_err.c_str(), json_text.c_str());
         kv.key = "";
     }
     else
@@ -50,6 +50,11 @@ void etcd_state_client_t::etcd_txn(json11::Json txn, int timeout, std::function<
 
 void etcd_state_client_t::etcd_call(std::string api, json11::Json payload, int timeout, std::function<void(std::string, json11::Json)> callback)
 {
+    if (!etcd_addresses.size())
+    {
+        fprintf(stderr, "etcd_address is missing in Vitastor configuration\n");
+        exit(1);
+    }
     std::string etcd_address = etcd_addresses[rand() % etcd_addresses.size()];
     std::string etcd_api_path;
     int pos = etcd_address.find('/');
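The new check matters because `rand() % etcd_addresses.size()` divides by zero when the list is empty, which is undefined behavior. A minimal sketch of the guarded pick, assuming `etcd_addresses` is a `std::vector<std::string>` (consistent with the `push_back` and indexing in the diff, but not shown there). The hypothetical `pick_etcd_address` returns `false` where the patched code prints an error and calls `exit(1)`, purely so the sketch can be tested:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Picks a random address from the list.
// Returns false and leaves `out` untouched when the list is empty.
bool pick_etcd_address(const std::vector<std::string> & addresses, std::string & out)
{
    if (!addresses.size())
    {
        // Without this guard, rand() % 0 below would be undefined behavior
        return false;
    }
    out = addresses[rand() % addresses.size()];
    return true;
}
```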
@@ -76,16 +81,16 @@ void etcd_state_client_t::add_etcd_url(std::string addr)
             addr = addr.substr(7);
         else if (strtolower(addr.substr(0, 8)) == "https://")
         {
-            printf("HTTPS is unsupported for etcd. Either use plain HTTP or setup a local proxy for etcd interaction\n");
+            fprintf(stderr, "HTTPS is unsupported for etcd. Either use plain HTTP or setup a local proxy for etcd interaction\n");
             exit(1);
         }
-        if (addr.find('/') < 0)
+        if (addr.find('/') == std::string::npos)
             addr += "/v3";
         this->etcd_addresses.push_back(addr);
     }
 }
 
-void etcd_state_client_t::parse_config(json11::Json & config)
+void etcd_state_client_t::parse_config(const json11::Json & config)
 {
     this->etcd_addresses.clear();
     if (config["etcd_address"].is_string())
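The `find('/') < 0` to `== std::string::npos` change in this hunk is a real bug fix: `std::string::find` returns an unsigned `size_t`, so the old comparison was never true and `/v3` was never appended. A minimal sketch of the corrected check, using a hypothetical helper `with_default_api_path` that is not part of the diff:

```cpp
#include <cassert>
#include <string>

// Appends the default "/v3" API path only when the address has no path component
std::string with_default_api_path(std::string addr)
{
    // find() returns std::string::npos (the largest size_t value) when nothing
    // is found, so the result must be compared with npos, not with 0
    if (addr.find('/') == std::string::npos)
        addr += "/v3";
    return addr;
}
```

An address such as `10.0.0.1:2379` gets `/v3` appended, while `10.0.0.1:2379/v3` is returned unchanged.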
@@ -122,6 +127,11 @@ void etcd_state_client_t::parse_config(json11::Json & config)
 
 void etcd_state_client_t::start_etcd_watcher()
 {
+    if (!etcd_addresses.size())
+    {
+        fprintf(stderr, "etcd_address is missing in Vitastor configuration\n");
+        exit(1);
+    }
     std::string etcd_address = etcd_addresses[rand() % etcd_addresses.size()];
     std::string etcd_api_path;
     int pos = etcd_address.find('/');
@@ -139,7 +149,7 @@ void etcd_state_client_t::start_etcd_watcher()
     json11::Json data = json11::Json::parse(msg->body, json_err);
     if (json_err != "")
     {
-        printf("Bad JSON in etcd event: %s, ignoring event\n", json_err.c_str());
+        fprintf(stderr, "Bad JSON in etcd event: %s, ignoring event\n", json_err.c_str());
     }
     else
     {
@@ -165,7 +175,7 @@ void etcd_state_client_t::start_etcd_watcher()
     {
         if (this->log_level > 3)
         {
-            printf("Incoming event: %s -> %s\n", kv.first.c_str(), kv.second.value.dump().c_str());
+            fprintf(stderr, "Incoming event: %s -> %s\n", kv.first.c_str(), kv.second.value.dump().c_str());
         }
         parse_state(kv.second);
     }
@@ -240,7 +250,7 @@ void etcd_state_client_t::load_global_config()
     {
         if (err != "")
         {
-            printf("Error reading OSD configuration from etcd: %s\n", err.c_str());
+            fprintf(stderr, "Error reading OSD configuration from etcd: %s\n", err.c_str());
             tfd->set_timer(ETCD_SLOW_TIMEOUT, false, [this](int timer_id)
             {
                 load_global_config();
@@ -313,7 +323,7 @@ void etcd_state_client_t::load_pgs()
     {
         if (err != "")
         {
-            printf("Error loading PGs from etcd: %s\n", err.c_str());
+            fprintf(stderr, "Error loading PGs from etcd: %s\n", err.c_str());
             tfd->set_timer(ETCD_SLOW_TIMEOUT, false, [this](int timer_id)
             {
                 load_pgs();
@@ -342,7 +352,7 @@ void etcd_state_client_t::load_pgs()
     });
 }
 #else
-void etcd_state_client_t::parse_config(json11::Json & config)
+void etcd_state_client_t::parse_config(const json11::Json & config)
 {
 }
 
@@ -376,7 +386,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     sscanf(pool_item.first.c_str(), "%u%c", &pool_id, &null_byte);
     if (!pool_id || pool_id >= POOL_ID_MAX || null_byte != 0)
     {
-        printf("Pool ID %s is invalid (must be a number less than 0x%x), skipping pool\n", pool_item.first.c_str(), POOL_ID_MAX);
+        fprintf(stderr, "Pool ID %s is invalid (must be a number less than 0x%x), skipping pool\n", pool_item.first.c_str(), POOL_ID_MAX);
         continue;
     }
     pc.id = pool_id;
@@ -384,7 +394,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     pc.name = pool_item.second["name"].string_value();
     if (pc.name == "")
     {
-        printf("Pool %u has empty name, skipping pool\n", pool_id);
+        fprintf(stderr, "Pool %u has empty name, skipping pool\n", pool_id);
         continue;
     }
     // Failure Domain
@@ -398,7 +408,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
         pc.scheme = POOL_SCHEME_JERASURE;
     else
     {
-        printf("Pool %u has invalid coding scheme (one of \"xor\", \"replicated\" or \"jerasure\" required), skipping pool\n", pool_id);
+        fprintf(stderr, "Pool %u has invalid coding scheme (one of \"xor\", \"replicated\" or \"jerasure\" required), skipping pool\n", pool_id);
         continue;
     }
     // PG Size
@@ -408,7 +418,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
         (pc.scheme == POOL_SCHEME_XOR || pc.scheme == POOL_SCHEME_JERASURE) ||
         pool_item.second["pg_size"].uint64_value() > 256)
     {
-        printf("Pool %u has invalid pg_size, skipping pool\n", pool_id);
+        fprintf(stderr, "Pool %u has invalid pg_size, skipping pool\n", pool_id);
         continue;
     }
     // Parity Chunks
@@ -417,7 +427,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     {
         if (pc.parity_chunks > 1)
         {
-            printf("Pool %u has invalid parity_chunks (must be 1), skipping pool\n", pool_id);
+            fprintf(stderr, "Pool %u has invalid parity_chunks (must be 1), skipping pool\n", pool_id);
             continue;
         }
         pc.parity_chunks = 1;
@@ -425,7 +435,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     if (pc.scheme == POOL_SCHEME_JERASURE &&
         (pc.parity_chunks < 1 || pc.parity_chunks > pc.pg_size-2))
     {
-        printf("Pool %u has invalid parity_chunks (must be between 1 and pg_size-2), skipping pool\n", pool_id);
+        fprintf(stderr, "Pool %u has invalid parity_chunks (must be between 1 and pg_size-2), skipping pool\n", pool_id);
         continue;
     }
     // PG MinSize
@@ -434,14 +444,14 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
         (pc.scheme == POOL_SCHEME_XOR || pc.scheme == POOL_SCHEME_JERASURE) &&
         pc.pg_minsize < (pc.pg_size-pc.parity_chunks))
     {
-        printf("Pool %u has invalid pg_minsize, skipping pool\n", pool_id);
+        fprintf(stderr, "Pool %u has invalid pg_minsize, skipping pool\n", pool_id);
         continue;
     }
     // PG Count
     pc.pg_count = pool_item.second["pg_count"].uint64_value();
     if (pc.pg_count < 1)
     {
-        printf("Pool %u has invalid pg_count, skipping pool\n", pool_id);
+        fprintf(stderr, "Pool %u has invalid pg_count, skipping pool\n", pool_id);
         continue;
     }
     // Max OSD Combinations
@@ -450,7 +460,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
         pc.max_osd_combinations = 10000;
     if (pc.max_osd_combinations > 0 && pc.max_osd_combinations < 100)
     {
-        printf("Pool %u has invalid max_osd_combinations (must be at least 100), skipping pool\n", pool_id);
+        fprintf(stderr, "Pool %u has invalid max_osd_combinations (must be at least 100), skipping pool\n", pool_id);
         continue;
     }
     // PG Stripe Size
@@ -468,7 +478,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     {
         if (pg_item.second.target_set.size() != parsed_cfg.pg_size)
         {
-            printf("Pool %u PG %u configuration is invalid: osd_set size %lu != pool pg_size %lu\n",
+            fprintf(stderr, "Pool %u PG %u configuration is invalid: osd_set size %lu != pool pg_size %lu\n",
                 pool_id, pg_item.first, pg_item.second.target_set.size(), parsed_cfg.pg_size);
             pg_item.second.pause = true;
         }
@@ -491,7 +501,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     sscanf(pool_item.first.c_str(), "%u%c", &pool_id, &null_byte);
     if (!pool_id || pool_id >= POOL_ID_MAX || null_byte != 0)
     {
-        printf("Pool ID %s is invalid in PG configuration (must be a number less than 0x%x), skipping pool\n", pool_item.first.c_str(), POOL_ID_MAX);
+        fprintf(stderr, "Pool ID %s is invalid in PG configuration (must be a number less than 0x%x), skipping pool\n", pool_item.first.c_str(), POOL_ID_MAX);
         continue;
     }
     for (auto & pg_item: pool_item.second.object_items())
@@ -500,7 +510,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     sscanf(pg_item.first.c_str(), "%u%c", &pg_num, &null_byte);
     if (!pg_num || null_byte != 0)
     {
-        printf("Bad key in pool %u PG configuration: %s (must be a number), skipped\n", pool_id, pg_item.first.c_str());
+        fprintf(stderr, "Bad key in pool %u PG configuration: %s (must be a number), skipped\n", pool_id, pg_item.first.c_str());
         continue;
     }
     auto & parsed_cfg = this->pool_config[pool_id].pg_config[pg_num];
@@ -514,7 +524,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     }
     if (parsed_cfg.target_set.size() != pool_config[pool_id].pg_size)
     {
-        printf("Pool %u PG %u configuration is invalid: osd_set size %lu != pool pg_size %lu\n",
+        fprintf(stderr, "Pool %u PG %u configuration is invalid: osd_set size %lu != pool pg_size %lu\n",
             pool_id, pg_num, parsed_cfg.target_set.size(), pool_config[pool_id].pg_size);
         parsed_cfg.pause = true;
     }
@@ -527,8 +537,8 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
     {
         if (pg_it->second.exists && pg_it->first != ++n)
         {
-            printf(
-                "Invalid pool %u PG configuration: PG numbers don't cover whole 1..%lu range\n",
+            fprintf(
+                stderr, "Invalid pool %u PG configuration: PG numbers don't cover whole 1..%lu range\n",
                 pool_item.second.id, pool_item.second.pg_config.size()
             );
             for (pg_it = pool_item.second.pg_config.begin(); pg_it != pool_item.second.pg_config.end(); pg_it++)
|
@ -551,7 +561,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
|
||||||
sscanf(key.c_str() + etcd_prefix.length()+12, "%u/%u%c", &pool_id, &pg_num, &null_byte);
|
sscanf(key.c_str() + etcd_prefix.length()+12, "%u/%u%c", &pool_id, &pg_num, &null_byte);
|
||||||
if (!pool_id || pool_id >= POOL_ID_MAX || !pg_num || null_byte != 0)
|
if (!pool_id || pool_id >= POOL_ID_MAX || !pg_num || null_byte != 0)
|
||||||
{
|
{
|
||||||
printf("Bad etcd key %s, ignoring\n", key.c_str());
|
fprintf(stderr, "Bad etcd key %s, ignoring\n", key.c_str());
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
|
@ -590,7 +600,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
|
||||||
sscanf(key.c_str() + etcd_prefix.length()+10, "%u/%u%c", &pool_id, &pg_num, &null_byte);
|
sscanf(key.c_str() + etcd_prefix.length()+10, "%u/%u%c", &pool_id, &pg_num, &null_byte);
|
||||||
if (!pool_id || pool_id >= POOL_ID_MAX || !pg_num || null_byte != 0)
|
if (!pool_id || pool_id >= POOL_ID_MAX || !pg_num || null_byte != 0)
|
||||||
{
|
{
|
||||||
printf("Bad etcd key %s, ignoring\n", key.c_str());
|
fprintf(stderr, "Bad etcd key %s, ignoring\n", key.c_str());
|
||||||
}
|
}
|
||||||
else if (value.is_null())
|
else if (value.is_null())
|
||||||
{
|
{
|
||||||
|
@ -614,7 +624,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
|
||||||
}
|
}
|
||||||
if (i >= pg_state_bit_count)
|
if (i >= pg_state_bit_count)
|
||||||
{
|
{
|
||||||
printf("Unexpected pool %u PG %u state keyword in etcd: %s\n", pool_id, pg_num, e.dump().c_str());
|
fprintf(stderr, "Unexpected pool %u PG %u state keyword in etcd: %s\n", pool_id, pg_num, e.dump().c_str());
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -623,7 +633,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
|
||||||
(state & PG_PEERING) && state != PG_PEERING ||
|
(state & PG_PEERING) && state != PG_PEERING ||
|
||||||
(state & PG_INCOMPLETE) && state != PG_INCOMPLETE)
|
(state & PG_INCOMPLETE) && state != PG_INCOMPLETE)
|
||||||
{
|
{
|
||||||
printf("Unexpected pool %u PG %u state in etcd: primary=%lu, state=%s\n", pool_id, pg_num, cur_primary, value["state"].dump().c_str());
|
fprintf(stderr, "Unexpected pool %u PG %u state in etcd: primary=%lu, state=%s\n", pool_id, pg_num, cur_primary, value["state"].dump().c_str());
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
this->pool_config[pool_id].pg_config[pg_num].cur_primary = cur_primary;
|
this->pool_config[pool_id].pg_config[pg_num].cur_primary = cur_primary;
|
||||||
|
@ -661,7 +671,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
|
||||||
sscanf(key.c_str() + etcd_prefix.length()+14, "%lu/%lu%c", &pool_id, &inode_num, &null_byte);
|
sscanf(key.c_str() + etcd_prefix.length()+14, "%lu/%lu%c", &pool_id, &inode_num, &null_byte);
|
||||||
if (!pool_id || pool_id >= POOL_ID_MAX || !inode_num || (inode_num >> (64-POOL_ID_BITS)) || null_byte != 0)
|
if (!pool_id || pool_id >= POOL_ID_MAX || !inode_num || (inode_num >> (64-POOL_ID_BITS)) || null_byte != 0)
|
||||||
{
|
{
|
||||||
printf("Bad etcd key %s, ignoring\n", key.c_str());
|
fprintf(stderr, "Bad etcd key %s, ignoring\n", key.c_str());
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
|
@ -696,8 +706,8 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
|
||||||
parent_inode_num |= pool_id << (64-POOL_ID_BITS);
|
parent_inode_num |= pool_id << (64-POOL_ID_BITS);
|
||||||
else if (parent_pool_id >= POOL_ID_MAX)
|
else if (parent_pool_id >= POOL_ID_MAX)
|
||||||
{
|
{
|
||||||
printf(
|
fprintf(
|
||||||
"Inode %lu/%lu parent_pool value is invalid, ignoring parent setting\n",
|
stderr, "Inode %lu/%lu parent_pool value is invalid, ignoring parent setting\n",
|
||||||
inode_num >> (64-POOL_ID_BITS), inode_num & ((1l << (64-POOL_ID_BITS)) - 1)
|
inode_num >> (64-POOL_ID_BITS), inode_num & ((1l << (64-POOL_ID_BITS)) - 1)
|
||||||
);
|
);
|
||||||
parent_inode_num = 0;
|
parent_inode_num = 0;
|
||||||
|
|
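Note on the key validation used throughout `parse_state()` above: the `"%u%c"` format makes `sscanf` store any character that follows the number, so a non-zero trailing byte marks a malformed key. A minimal standalone sketch of that check follows; `parse_positive_key` is a hypothetical helper name and is not part of Vitastor.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not in the Vitastor sources): mirrors the
// `!pg_num || null_byte != 0` check from the diff above.
// Returns true only for a non-zero number with nothing after it.
bool parse_positive_key(const std::string & key, unsigned & num)
{
    char trailing = 0;
    num = 0;
    // If anything follows the number, %c stores it in `trailing`.
    // On a matching failure sscanf leaves both variables untouched (zero).
    sscanf(key.c_str(), "%u%c", &num, &trailing);
    return num != 0 && trailing == 0;
}
```

Under this check `"12"` is accepted, while `"12a"`, `"0"`, `"abc"` and the empty string are rejected. Note that `%u` itself still skips leading whitespace and accepts a sign, so this is a trailing-garbage check rather than a strict digits-only check.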
|
@ -106,7 +106,7 @@ public:
|
||||||
void load_global_config();
|
void load_global_config();
|
||||||
void load_pgs();
|
void load_pgs();
|
||||||
void parse_state(const etcd_kv_t & kv);
|
void parse_state(const etcd_kv_t & kv);
|
||||||
void parse_config(json11::Json & config);
|
void parse_config(const json11::Json & config);
|
||||||
inode_watch_t* watch_inode(std::string name);
|
inode_watch_t* watch_inode(std::string name);
|
||||||
void close_watch(inode_watch_t* watch);
|
void close_watch(inode_watch_t* watch);
|
||||||
~etcd_state_client_t();
|
~etcd_state_client_t();
|
||||||
|
|
|
@ -24,28 +24,25 @@
|
||||||
#include <netinet/tcp.h>
|
#include <netinet/tcp.h>
|
||||||
|
|
||||||
#include <vector>
|
#include <vector>
|
||||||
#include <unordered_map>
|
|
||||||
|
|
||||||
#include "epoll_manager.h"
|
#include "vitastor_c.h"
|
||||||
#include "cluster_client.h"
|
|
||||||
#include "fio_headers.h"
|
#include "fio_headers.h"
|
||||||
|
|
||||||
struct sec_data
|
struct sec_data
|
||||||
{
|
{
|
||||||
ring_loop_t *ringloop = NULL;
|
vitastor_c *cli = NULL;
|
||||||
epoll_manager_t *epmgr = NULL;
|
void *watch = NULL;
|
||||||
cluster_client_t *cli = NULL;
|
|
||||||
inode_watch_t *watch = NULL;
|
|
||||||
bool last_sync = false;
|
bool last_sync = false;
|
||||||
/* The list of completed io_u structs. */
|
/* The list of completed io_u structs. */
|
||||||
std::vector<io_u*> completed;
|
std::vector<io_u*> completed;
|
||||||
uint64_t op_n = 0, inflight = 0;
|
uint64_t inflight = 0;
|
||||||
bool trace = false;
|
bool trace = false;
|
||||||
};
|
};
|
||||||
|
|
||||||
struct sec_options
|
struct sec_options
|
||||||
{
|
{
|
||||||
int __pad;
|
int __pad;
|
||||||
|
char *config_path = NULL;
|
||||||
char *etcd_host = NULL;
|
char *etcd_host = NULL;
|
||||||
char *etcd_prefix = NULL;
|
char *etcd_prefix = NULL;
|
||||||
char *image = NULL;
|
char *image = NULL;
|
||||||
|
@ -54,12 +51,22 @@ struct sec_options
|
||||||
int cluster_log = 0;
|
int cluster_log = 0;
|
||||||
int trace = 0;
|
int trace = 0;
|
||||||
int use_rdma = 0;
|
int use_rdma = 0;
|
||||||
|
char *rdma_device = NULL;
|
||||||
int rdma_port_num = 0;
|
int rdma_port_num = 0;
|
||||||
int rdma_gid_index = 0;
|
int rdma_gid_index = 0;
|
||||||
int rdma_mtu = 0;
|
int rdma_mtu = 0;
|
||||||
};
|
};
|
||||||
|
|
||||||
static struct fio_option options[] = {
|
static struct fio_option options[] = {
|
||||||
|
{
|
||||||
|
.name = "conf",
|
||||||
|
.lname = "Vitastor config path",
|
||||||
|
.type = FIO_OPT_STR_STORE,
|
||||||
|
.off1 = offsetof(struct sec_options, config_path),
|
||||||
|
.help = "Vitastor config path",
|
||||||
|
.category = FIO_OPT_C_ENGINE,
|
||||||
|
.group = FIO_OPT_G_FILENAME,
|
||||||
|
},
|
||||||
{
|
{
|
||||||
.name = "etcd",
|
.name = "etcd",
|
||||||
.lname = "etcd address",
|
.lname = "etcd address",
|
||||||
|
@ -127,10 +134,49 @@ static struct fio_option options[] = {
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
.name = "use_rdma",
|
.name = "use_rdma",
|
||||||
.lname = "OSD trace",
|
.lname = "Use RDMA",
|
||||||
.type = FIO_OPT_BOOL,
|
.type = FIO_OPT_BOOL,
|
||||||
.off1 = offsetof(struct sec_options, use_rdma),
|
.off1 = offsetof(struct sec_options, use_rdma),
|
||||||
.help = "Use RDMA",
|
.help = "Use RDMA",
|
||||||
|
.def = "-1",
|
||||||
|
.category = FIO_OPT_C_ENGINE,
|
||||||
|
.group = FIO_OPT_G_FILENAME,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
.name = "rdma_device",
|
||||||
|
.lname = "RDMA device name",
|
||||||
|
.type = FIO_OPT_STR_STORE,
|
||||||
|
.off1 = offsetof(struct sec_options, rdma_device),
|
||||||
|
.help = "RDMA device name",
|
||||||
|
.category = FIO_OPT_C_ENGINE,
|
||||||
|
.group = FIO_OPT_G_FILENAME,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
.name = "rdma_port_num",
|
||||||
|
.lname = "RDMA port number",
|
||||||
|
.type = FIO_OPT_INT,
|
||||||
|
.off1 = offsetof(struct sec_options, rdma_port_num),
|
||||||
|
.help = "RDMA port number",
|
||||||
|
.def = "0",
|
||||||
|
.category = FIO_OPT_C_ENGINE,
|
||||||
|
.group = FIO_OPT_G_FILENAME,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
.name = "rdma_gid_index",
|
||||||
|
.lname = "RDMA gid index",
|
||||||
|
.type = FIO_OPT_INT,
|
||||||
|
.off1 = offsetof(struct sec_options, rdma_gid_index),
|
||||||
|
.help = "RDMA gid index",
|
||||||
|
.def = "0",
|
||||||
|
.category = FIO_OPT_C_ENGINE,
|
||||||
|
.group = FIO_OPT_G_FILENAME,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
.name = "rdma_mtu",
|
||||||
|
.lname = "RDMA path MTU",
|
||||||
|
.type = FIO_OPT_INT,
|
||||||
|
.off1 = offsetof(struct sec_options, rdma_mtu),
|
||||||
|
.help = "RDMA path MTU",
|
||||||
.def = "0",
|
.def = "0",
|
||||||
.category = FIO_OPT_C_ENGINE,
|
.category = FIO_OPT_C_ENGINE,
|
||||||
.group = FIO_OPT_G_FILENAME,
|
.group = FIO_OPT_G_FILENAME,
|
||||||
|
@ -140,17 +186,17 @@ static struct fio_option options[] = {
|
||||||
},
|
},
|
||||||
};
|
};
|
||||||
|
|
||||||
|
static void watch_callback(void *opaque, long watch)
|
||||||
|
{
|
||||||
|
struct sec_data *bsd = (struct sec_data*)opaque;
|
||||||
|
bsd->watch = (void*)watch;
|
||||||
|
}
|
||||||
|
|
||||||
static int sec_setup(struct thread_data *td)
|
static int sec_setup(struct thread_data *td)
|
||||||
{
|
{
|
||||||
sec_options *o = (sec_options*)td->eo;
|
sec_options *o = (sec_options*)td->eo;
|
||||||
sec_data *bsd;
|
sec_data *bsd;
|
||||||
|
|
||||||
if (!o->etcd_host)
|
|
||||||
{
|
|
||||||
td_verror(td, EINVAL, "etcd address is missing");
|
|
||||||
return 1;
|
|
||||||
}
|
|
||||||
|
|
||||||
bsd = new sec_data;
|
bsd = new sec_data;
|
||||||
if (!bsd)
|
if (!bsd)
|
||||||
{
|
{
|
||||||
|
@ -166,13 +212,6 @@ static int sec_setup(struct thread_data *td)
|
||||||
td->o.open_files++;
|
td->o.open_files++;
|
||||||
}
|
}
|
||||||
|
|
||||||
json11::Json cfg = json11::Json::object {
|
|
||||||
{ "etcd_address", std::string(o->etcd_host) },
|
|
||||||
{ "etcd_prefix", std::string(o->etcd_prefix ? o->etcd_prefix : "/vitastor") },
|
|
||||||
{ "log_level", o->cluster_log },
|
|
||||||
{ "use_rdma", o->use_rdma },
|
|
||||||
};
|
|
||||||
|
|
||||||
if (!o->image)
|
if (!o->image)
|
||||||
{
|
{
|
||||||
if (!(o->inode & ((1l << (64-POOL_ID_BITS)) - 1)))
|
if (!(o->inode & ((1l << (64-POOL_ID_BITS)) - 1)))
|
||||||
|
@ -194,20 +233,20 @@ static int sec_setup(struct thread_data *td)
|
||||||
{
|
{
|
||||||
o->inode = 0;
|
o->inode = 0;
|
||||||
}
|
}
|
||||||
bsd->ringloop = new ring_loop_t(512);
|
bsd->cli = vitastor_c_create_uring(o->config_path, o->etcd_host, o->etcd_prefix,
|
||||||
bsd->epmgr = new epoll_manager_t(bsd->ringloop);
|
o->use_rdma, o->rdma_device, o->rdma_port_num, o->rdma_gid_index, o->rdma_mtu, o->cluster_log);
|
||||||
bsd->cli = new cluster_client_t(bsd->ringloop, bsd->epmgr->tfd, cfg);
|
|
||||||
if (o->image)
|
if (o->image)
|
||||||
{
|
{
|
||||||
while (!bsd->cli->is_ready())
|
bsd->watch = NULL;
|
||||||
|
vitastor_c_watch_inode(bsd->cli, o->image, watch_callback, bsd);
|
||||||
|
while (true)
|
||||||
{
|
{
|
||||||
bsd->ringloop->loop();
|
vitastor_c_uring_handle_events(bsd->cli);
|
||||||
if (bsd->cli->is_ready())
|
if (bsd->watch)
|
||||||
break;
|
break;
|
||||||
bsd->ringloop->wait();
|
vitastor_c_uring_wait_events(bsd->cli);
|
||||||
}
|
}
|
||||||
bsd->watch = bsd->cli->st_cli.watch_inode(std::string(o->image));
|
td->files[0]->real_file_size = vitastor_c_inode_get_size(bsd->watch);
|
||||||
td->files[0]->real_file_size = bsd->watch->cfg.size;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
bsd->trace = o->trace ? true : false;
|
bsd->trace = o->trace ? true : false;
|
||||||
|
@ -222,11 +261,9 @@ static void sec_cleanup(struct thread_data *td)
|
||||||
{
|
{
|
||||||
if (bsd->watch)
|
if (bsd->watch)
|
||||||
{
|
{
|
||||||
bsd->cli->st_cli.close_watch(bsd->watch);
|
vitastor_c_close_watch(bsd->cli, bsd->watch);
|
||||||
}
|
}
|
||||||
delete bsd->cli;
|
vitastor_c_destroy(bsd->cli);
|
||||||
delete bsd->epmgr;
|
|
||||||
delete bsd->ringloop;
|
|
||||||
delete bsd;
|
delete bsd;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -237,12 +274,31 @@ static int sec_init(struct thread_data *td)
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static void io_callback(void *opaque, long retval)
|
||||||
|
{
|
||||||
|
struct io_u *io = (struct io_u*)opaque;
|
||||||
|
io->error = retval < 0 ? -retval : 0;
|
||||||
|
sec_data *bsd = (sec_data*)io->engine_data;
|
||||||
|
bsd->inflight--;
|
||||||
|
bsd->completed.push_back(io);
|
||||||
|
if (bsd->trace)
|
||||||
|
{
|
||||||
|
printf("--- %s 0x%lx retval=%ld\n", io->ddir == DDIR_READ ? "READ" :
|
||||||
|
(io->ddir == DDIR_WRITE ? "WRITE" : "SYNC"), (uint64_t)io, retval);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
static void read_callback(void *opaque, long retval, uint64_t version)
|
||||||
|
{
|
||||||
|
io_callback(opaque, retval);
|
||||||
|
}
|
||||||
|
|
||||||
/* Begin read or write request. */
|
/* Begin read or write request. */
|
||||||
static enum fio_q_status sec_queue(struct thread_data *td, struct io_u *io)
|
static enum fio_q_status sec_queue(struct thread_data *td, struct io_u *io)
|
||||||
{
|
{
|
||||||
sec_options *opt = (sec_options*)td->eo;
|
sec_options *opt = (sec_options*)td->eo;
|
||||||
sec_data *bsd = (sec_data*)td->io_ops_data;
|
sec_data *bsd = (sec_data*)td->io_ops_data;
|
||||||
int n = bsd->op_n;
|
struct iovec iov;
|
||||||
|
|
||||||
fio_ro_check(td, io);
|
fio_ro_check(td, io);
|
||||||
if (io->ddir == DDIR_SYNC && bsd->last_sync)
|
if (io->ddir == DDIR_SYNC && bsd->last_sync)
|
||||||
|
@ -251,32 +307,29 @@ static enum fio_q_status sec_queue(struct thread_data *td, struct io_u *io)
|
||||||
}
|
}
|
||||||
|
|
||||||
io->engine_data = bsd;
|
io->engine_data = bsd;
|
||||||
cluster_op_t *op = new cluster_op_t;
|
io->error = 0;
|
||||||
|
bsd->inflight++;
|
||||||
|
|
||||||
op->inode = opt->image ? bsd->watch->cfg.num : opt->inode;
|
uint64_t inode = opt->image ? vitastor_c_inode_get_num(bsd->watch) : opt->inode;
|
||||||
switch (io->ddir)
|
switch (io->ddir)
|
||||||
{
|
{
|
||||||
case DDIR_READ:
|
case DDIR_READ:
|
||||||
op->opcode = OSD_OP_READ;
|
iov = { .iov_base = io->xfer_buf, .iov_len = io->xfer_buflen };
|
||||||
op->offset = io->offset;
|
vitastor_c_read(bsd->cli, inode, io->offset, io->xfer_buflen, &iov, 1, read_callback, io);
|
||||||
op->len = io->xfer_buflen;
|
|
||||||
op->iov.push_back(io->xfer_buf, io->xfer_buflen);
|
|
||||||
bsd->last_sync = false;
|
bsd->last_sync = false;
|
||||||
break;
|
break;
|
||||||
case DDIR_WRITE:
|
case DDIR_WRITE:
|
||||||
if (opt->image && bsd->watch->cfg.readonly)
|
if (opt->image && vitastor_c_inode_get_readonly(bsd->watch))
|
||||||
{
|
{
|
||||||
io->error = EROFS;
|
io->error = EROFS;
|
||||||
return FIO_Q_COMPLETED;
|
return FIO_Q_COMPLETED;
|
||||||
}
|
}
|
||||||
op->opcode = OSD_OP_WRITE;
|
iov = { .iov_base = io->xfer_buf, .iov_len = io->xfer_buflen };
|
||||||
op->offset = io->offset;
|
vitastor_c_write(bsd->cli, inode, io->offset, io->xfer_buflen, 0, &iov, 1, io_callback, io);
|
||||||
op->len = io->xfer_buflen;
|
|
||||||
op->iov.push_back(io->xfer_buf, io->xfer_buflen);
|
|
||||||
bsd->last_sync = false;
|
bsd->last_sync = false;
|
||||||
break;
|
break;
|
||||||
case DDIR_SYNC:
|
case DDIR_SYNC:
|
||||||
op->opcode = OSD_OP_SYNC;
|
vitastor_c_sync(bsd->cli, io_callback, io);
|
||||||
bsd->last_sync = true;
|
bsd->last_sync = true;
|
||||||
break;
|
break;
|
||||||
default:
|
default:
|
||||||
|
@ -284,39 +337,20 @@ static enum fio_q_status sec_queue(struct thread_data *td, struct io_u *io)
|
||||||
return FIO_Q_COMPLETED;
|
return FIO_Q_COMPLETED;
|
||||||
}
|
}
|
||||||
|
|
||||||
op->callback = [io, n](cluster_op_t *op)
|
|
||||||
{
|
|
||||||
io->error = op->retval < 0 ? -op->retval : 0;
|
|
||||||
sec_data *bsd = (sec_data*)io->engine_data;
|
|
||||||
bsd->inflight--;
|
|
||||||
bsd->completed.push_back(io);
|
|
||||||
if (bsd->trace)
|
|
||||||
{
|
|
||||||
printf("--- %s n=%d retval=%d\n", io->ddir == DDIR_READ ? "READ" :
|
|
||||||
(io->ddir == DDIR_WRITE ? "WRITE" : "SYNC"), n, op->retval);
|
|
||||||
}
|
|
||||||
delete op;
|
|
||||||
};
|
|
||||||
|
|
||||||
if (opt->trace)
|
if (opt->trace)
|
||||||
{
|
{
|
||||||
if (io->ddir == DDIR_SYNC)
|
if (io->ddir == DDIR_SYNC)
|
||||||
{
|
{
|
||||||
printf("+++ SYNC # %d\n", n);
|
printf("+++ SYNC 0x%lx\n", (uint64_t)io);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
printf("+++ %s # %d 0x%llx+%llx\n",
|
printf("+++ %s 0x%lx 0x%llx+%llx\n",
|
||||||
io->ddir == DDIR_READ ? "READ" : "WRITE",
|
io->ddir == DDIR_READ ? "READ" : "WRITE",
|
||||||
n, io->offset, io->xfer_buflen);
|
(uint64_t)io, io->offset, io->xfer_buflen);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
io->error = 0;
|
|
||||||
bsd->inflight++;
|
|
||||||
bsd->op_n++;
|
|
||||||
bsd->cli->execute(op);
|
|
||||||
|
|
||||||
if (io->error != 0)
|
if (io->error != 0)
|
||||||
return FIO_Q_COMPLETED;
|
return FIO_Q_COMPLETED;
|
||||||
return FIO_Q_QUEUED;
|
return FIO_Q_QUEUED;
|
||||||
|
@ -327,10 +361,10 @@ static int sec_getevents(struct thread_data *td, unsigned int min, unsigned int
|
||||||
sec_data *bsd = (sec_data*)td->io_ops_data;
|
sec_data *bsd = (sec_data*)td->io_ops_data;
|
||||||
while (true)
|
while (true)
|
||||||
{
|
{
|
||||||
bsd->ringloop->loop();
|
vitastor_c_uring_handle_events(bsd->cli);
|
||||||
if (bsd->completed.size() >= min)
|
if (bsd->completed.size() >= min)
|
||||||
break;
|
break;
|
||||||
bsd->ringloop->wait();
|
vitastor_c_uring_wait_events(bsd->cli);
|
||||||
}
|
}
|
||||||
return bsd->completed.size();
|
return bsd->completed.size();
|
||||||
}
|
}
|
||||||
|
|
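Note on the callback style the fio engine switches to above: completions arrive through a plain function pointer plus an opaque `void*`, and a negative `retval` is turned into a positive errno value, as in `io_callback()`. The sketch below illustrates only that pattern. `fake_client`, `request` and `on_complete` are hypothetical stand-ins, not the `vitastor_c` API, and nothing here performs real I/O.

```cpp
#include <cerrno>
#include <vector>

// Hypothetical request record, standing in for fio's io_u
struct request
{
    int error = 0;
    bool done = false;
};

typedef void (*completion_cb)(void *opaque, long retval);

// Hypothetical client: queues completions and delivers them only
// when the caller pumps the event loop
struct fake_client
{
    struct pending { completion_cb cb; void *opaque; long retval; };
    std::vector<pending> queue;

    void submit(long retval, completion_cb cb, void *opaque)
    {
        queue.push_back({ cb, opaque, retval });
    }

    void handle_events()
    {
        // Swap first so callbacks may safely submit new requests
        std::vector<pending> ready;
        ready.swap(queue);
        for (auto & p: ready)
            p.cb(p.opaque, p.retval);
    }
};

// Same convention as io_callback() in the diff:
// negative retval becomes a positive error code, anything else means success
static void on_complete(void *opaque, long retval)
{
    request *req = (request*)opaque;
    req->error = retval < 0 ? (int)-retval : 0;
    req->done = true;
}
```

A caller would submit with `client.submit(-EIO, on_complete, &req)` and then see `req.error == EIO` only after `client.handle_events()` has run, which matches the submit-then-poll loop in `sec_getevents()`.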
|
@ -21,11 +21,13 @@ void osd_messenger_t::init()
|
||||||
);
|
);
|
||||||
if (!rdma_context)
|
if (!rdma_context)
|
||||||
{
|
{
|
||||||
printf("[OSD %lu] Couldn't initialize RDMA, proceeding with TCP only\n", osd_num);
|
fprintf(stderr, "[OSD %lu] Couldn't initialize RDMA, proceeding with TCP only\n", osd_num);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
printf("[OSD %lu] RDMA initialized successfully\n", osd_num);
|
rdma_max_sge = rdma_max_sge < rdma_context->attrx.orig_attr.max_sge
|
||||||
|
? rdma_max_sge : rdma_context->attrx.orig_attr.max_sge;
|
||||||
|
fprintf(stderr, "[OSD %lu] RDMA initialized successfully\n", osd_num);
|
||||||
fcntl(rdma_context->channel->fd, F_SETFL, fcntl(rdma_context->channel->fd, F_GETFL, 0) | O_NONBLOCK);
|
fcntl(rdma_context->channel->fd, F_SETFL, fcntl(rdma_context->channel->fd, F_GETFL, 0) | O_NONBLOCK);
|
||||||
tfd->set_fd_handler(rdma_context->channel->fd, false, [this](int notify_fd, int epoll_events)
|
tfd->set_fd_handler(rdma_context->channel->fd, false, [this](int notify_fd, int epoll_events)
|
||||||
{
|
{
|
||||||
|
@ -53,7 +55,7 @@ void osd_messenger_t::init()
|
||||||
if (!cl->ping_time_remaining)
|
if (!cl->ping_time_remaining)
|
||||||
{
|
{
|
||||||
// Ping timed out, stop the client
|
// Ping timed out, stop the client
|
||||||
printf("Ping timed out for OSD %lu (client %d), disconnecting peer\n", cl->osd_num, cl->peer_fd);
|
fprintf(stderr, "Ping timed out for OSD %lu (client %d), disconnecting peer\n", cl->osd_num, cl->peer_fd);
|
||||||
to_stop.push_back(cl->peer_fd);
|
to_stop.push_back(cl->peer_fd);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -80,7 +82,7 @@ void osd_messenger_t::init()
|
||||||
delete op;
|
delete op;
|
||||||
if (fail_fd >= 0)
|
if (fail_fd >= 0)
|
||||||
{
|
{
|
||||||
printf("Ping failed for OSD %lu (client %d), disconnecting peer\n", cl->osd_num, cl->peer_fd);
|
fprintf(stderr, "Ping failed for OSD %lu (client %d), disconnecting peer\n", cl->osd_num, cl->peer_fd);
|
||||||
stop_client(fail_fd, true);
|
stop_client(fail_fd, true);
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
@ -129,39 +131,46 @@ void osd_messenger_t::parse_config(const json11::Json & config)
|
||||||
{
|
{
|
||||||
#ifdef WITH_RDMA
|
#ifdef WITH_RDMA
|
||||||
if (!config["use_rdma"].is_null())
|
if (!config["use_rdma"].is_null())
|
||||||
|
{
|
||||||
|
// RDMA is on by default in RDMA-enabled builds
|
||||||
this->use_rdma = config["use_rdma"].bool_value() || config["use_rdma"].uint64_value() != 0;
|
this->use_rdma = config["use_rdma"].bool_value() || config["use_rdma"].uint64_value() != 0;
|
||||||
|
}
|
||||||
this->rdma_device = config["rdma_device"].string_value();
|
this->rdma_device = config["rdma_device"].string_value();
|
||||||
this->rdma_port_num = (uint8_t)config["rdma_port_num"].uint64_value();
|
this->rdma_port_num = (uint8_t)config["rdma_port_num"].uint64_value();
|
||||||
if (!this->rdma_port_num)
|
if (!this->rdma_port_num)
|
||||||
this->rdma_port_num = 1;
|
this->rdma_port_num = 1;
|
||||||
this->rdma_gid_index = (uint8_t)config["rdma_gid_index"].uint64_value();
|
this->rdma_gid_index = (uint8_t)config["rdma_gid_index"].uint64_value();
|
||||||
this->rdma_mtu = (uint32_t)config["rdma_mtu"].uint64_value();
|
this->rdma_mtu = (uint32_t)config["rdma_mtu"].uint64_value();
|
||||||
|
this->rdma_max_sge = config["rdma_max_sge"].uint64_value();
|
||||||
|
if (!this->rdma_max_sge)
|
||||||
|
this->rdma_max_sge = 128;
|
||||||
|
this->rdma_max_send = config["rdma_max_send"].uint64_value();
|
||||||
|
if (!this->rdma_max_send)
|
||||||
|
this->rdma_max_send = 32;
|
||||||
|
this->rdma_max_recv = config["rdma_max_recv"].uint64_value();
|
||||||
|
if (!this->rdma_max_recv)
|
||||||
|
this->rdma_max_recv = 8;
|
||||||
|
this->rdma_max_msg = config["rdma_max_msg"].uint64_value();
|
||||||
|
if (!this->rdma_max_msg || this->rdma_max_msg > 128*1024*1024)
|
||||||
|
this->rdma_max_msg = 1024*1024;
|
||||||
#endif
|
#endif
|
||||||
this->bs_bitmap_granularity = strtoull(config["bitmap_granularity"].string_value().c_str(), NULL, 10);
|
this->receive_buffer_size = (uint32_t)config["tcp_header_buffer_size"].uint64_value();
|
||||||
if (!this->bs_bitmap_granularity)
|
if (!this->receive_buffer_size || this->receive_buffer_size > 1024*1024*1024)
|
||||||
this->bs_bitmap_granularity = DEFAULT_BITMAP_GRANULARITY;
|
this->receive_buffer_size = 65536;
|
||||||
this->use_sync_send_recv = config["use_sync_send_recv"].bool_value() ||
|
this->use_sync_send_recv = config["use_sync_send_recv"].bool_value() ||
|
||||||
config["use_sync_send_recv"].uint64_value();
|
config["use_sync_send_recv"].uint64_value();
|
||||||
this->peer_connect_interval = config["peer_connect_interval"].uint64_value();
|
this->peer_connect_interval = config["peer_connect_interval"].uint64_value();
|
||||||
if (!this->peer_connect_interval)
|
if (!this->peer_connect_interval)
|
||||||
{
|
this->peer_connect_interval = 5;
|
||||||
this->peer_connect_interval = DEFAULT_PEER_CONNECT_INTERVAL;
|
|
||||||
}
|
|
||||||
this->peer_connect_timeout = config["peer_connect_timeout"].uint64_value();
|
this->peer_connect_timeout = config["peer_connect_timeout"].uint64_value();
|
||||||
if (!this->peer_connect_timeout)
|
if (!this->peer_connect_timeout)
|
||||||
{
|
this->peer_connect_timeout = 5;
|
||||||
this->peer_connect_timeout = DEFAULT_PEER_CONNECT_TIMEOUT;
|
|
||||||
}
|
|
||||||
this->osd_idle_timeout = config["osd_idle_timeout"].uint64_value();
|
this->osd_idle_timeout = config["osd_idle_timeout"].uint64_value();
|
||||||
if (!this->osd_idle_timeout)
|
if (!this->osd_idle_timeout)
|
||||||
{
|
this->osd_idle_timeout = 5;
|
||||||
this->osd_idle_timeout = DEFAULT_OSD_PING_TIMEOUT;
|
|
||||||
}
|
|
||||||
this->osd_ping_timeout = config["osd_ping_timeout"].uint64_value();
|
this->osd_ping_timeout = config["osd_ping_timeout"].uint64_value();
|
||||||
if (!this->osd_ping_timeout)
|
if (!this->osd_ping_timeout)
|
||||||
{
|
this->osd_ping_timeout = 5;
|
||||||
this->osd_ping_timeout = DEFAULT_OSD_PING_TIMEOUT;
|
|
||||||
}
|
|
||||||
this->log_level = config["log_level"].uint64_value();
|
this->log_level = config["log_level"].uint64_value();
|
||||||
}
|
}
|
||||||
|
|
||||||
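Note on the defaulting pattern in `parse_config()` above: each numeric option is read, and a zero (missing) or out-of-range value is replaced by a built-in default, for example `rdma_max_msg` falling back to 1 MiB when it is unset or above 128 MiB. A minimal sketch of that rule, assuming a plain `std::map` in place of `json11::Json`; `get_or_default` is a hypothetical helper and not a Vitastor function.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: returns cfg[key], or `def` when the key is
// missing, zero, or larger than `max`
uint64_t get_or_default(const std::map<std::string, uint64_t> & cfg,
    const std::string & key, uint64_t def, uint64_t max = UINT64_MAX)
{
    auto it = cfg.find(key);
    uint64_t v = it != cfg.end() ? it->second : 0;
    if (!v || v > max)
        v = def;
    return v;
}
```

With this rule an explicit `0` in the configuration cannot be told apart from an absent key, which is also the behaviour of the code in the diff.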
|
@ -252,7 +261,7 @@ void osd_messenger_t::try_connect_peer_addr(osd_num_t peer_osd, const char *peer
|
||||||
{
|
{
|
||||||
osd_num_t peer_osd = clients.at(peer_fd)->osd_num;
|
osd_num_t peer_osd = clients.at(peer_fd)->osd_num;
|
||||||
stop_client(peer_fd, true);
|
stop_client(peer_fd, true);
|
||||||
on_connect_peer(peer_osd, -EIO);
|
on_connect_peer(peer_osd, -EPIPE);
|
||||||
return;
|
return;
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
@ -296,7 +305,7 @@ void osd_messenger_t::handle_peer_epoll(int peer_fd, int epoll_events)
|
||||||
if (epoll_events & EPOLLRDHUP)
|
if (epoll_events & EPOLLRDHUP)
|
||||||
{
|
{
|
||||||
// Stop client
|
// Stop client
|
||||||
printf("[OSD %lu] client %d disconnected\n", this->osd_num, peer_fd);
|
fprintf(stderr, "[OSD %lu] client %d disconnected\n", this->osd_num, peer_fd);
|
||||||
stop_client(peer_fd, true);
|
stop_client(peer_fd, true);
|
||||||
}
|
}
|
||||||
else if (epoll_events & EPOLLIN)
|
else if (epoll_events & EPOLLIN)
|
||||||
|
@ -321,7 +330,7 @@ void osd_messenger_t::on_connect_peer(osd_num_t peer_osd, int peer_fd)
|
||||||
wp.connecting = false;
|
wp.connecting = false;
|
||||||
if (peer_fd < 0)
|
if (peer_fd < 0)
|
||||||
{
|
{
|
||||||
printf("Failed to connect to peer OSD %lu address %s port %d: %s\n", peer_osd, wp.cur_addr.c_str(), wp.cur_port, strerror(-peer_fd));
|
fprintf(stderr, "Failed to connect to peer OSD %lu address %s port %d: %s\n", peer_osd, wp.cur_addr.c_str(), wp.cur_port, strerror(-peer_fd));
|
||||||
if (wp.address_changed)
|
if (wp.address_changed)
|
||||||
{
|
{
|
||||||
wp.address_changed = false;
|
wp.address_changed = false;
|
||||||
|
@ -348,7 +357,7 @@ void osd_messenger_t::on_connect_peer(osd_num_t peer_osd, int peer_fd)
|
||||||
}
|
}
|
||||||
if (log_level > 0)
|
if (log_level > 0)
|
||||||
{
|
{
|
||||||
printf("[OSD %lu] Connected with peer OSD %lu (client %d)\n", osd_num, peer_osd, peer_fd);
|
fprintf(stderr, "[OSD %lu] Connected with peer OSD %lu (client %d)\n", osd_num, peer_osd, peer_fd);
|
||||||
}
|
}
|
||||||
wanted_peers.erase(peer_osd);
|
wanted_peers.erase(peer_osd);
|
||||||
repeer_pgs(peer_osd);
|
repeer_pgs(peer_osd);
|
||||||
|
@ -356,9 +365,6 @@ void osd_messenger_t::on_connect_peer(osd_num_t peer_osd, int peer_fd)
|
||||||
|
|
||||||
void osd_messenger_t::check_peer_config(osd_client_t *cl)
|
void osd_messenger_t::check_peer_config(osd_client_t *cl)
|
||||||
{
|
{
|
||||||
#ifdef WITH_RDMA
|
|
||||||
msgr_rdma_connection_t *rdma_conn = NULL;
|
|
||||||
#endif
|
|
||||||
osd_op_t *op = new osd_op_t();
|
osd_op_t *op = new osd_op_t();
|
||||||
op->op_type = OSD_OP_OUT;
|
op->op_type = OSD_OP_OUT;
|
||||||
op->peer_fd = cl->peer_fd;
|
op->peer_fd = cl->peer_fd;
|
||||||
|
@ -374,11 +380,12 @@ void osd_messenger_t::check_peer_config(osd_client_t *cl)
|
||||||
#ifdef WITH_RDMA
|
#ifdef WITH_RDMA
|
||||||
if (rdma_context)
|
if (rdma_context)
|
||||||
{
|
{
|
||||||
cl->rdma_conn = msgr_rdma_connection_t::create(rdma_context, max_rdma_send, max_rdma_recv, max_rdma_sge);
|
cl->rdma_conn = msgr_rdma_connection_t::create(rdma_context, rdma_max_send, rdma_max_recv, rdma_max_sge, rdma_max_msg);
|
||||||
if (cl->rdma_conn)
|
if (cl->rdma_conn)
|
||||||
{
|
{
|
||||||
json11::Json payload = json11::Json::object {
|
json11::Json payload = json11::Json::object {
|
||||||
{ "connect_rdma", cl->rdma_conn->addr.to_string() },
|
{ "connect_rdma", cl->rdma_conn->addr.to_string() },
|
||||||
|
{ "rdma_max_msg", cl->rdma_conn->max_msg },
|
||||||
};
|
};
|
||||||
std::string payload_str = payload.dump();
|
std::string payload_str = payload.dump();
|
||||||
op->req.show_conf.json_len = payload_str.size();
|
op->req.show_conf.json_len = payload_str.size();
|
||||||
|
@ -388,11 +395,7 @@ void osd_messenger_t::check_peer_config(osd_client_t *cl)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
op->callback = [this, cl
|
op->callback = [this, cl](osd_op_t *op)
|
||||||
#ifdef WITH_RDMA
|
|
||||||
, rdma_conn
|
|
||||||
#endif
|
|
||||||
](osd_op_t *op)
|
|
||||||
{
|
{
|
||||||
std::string json_err;
|
std::string json_err;
|
||||||
json11::Json config;
|
json11::Json config;
|
||||||
|
@ -400,7 +403,7 @@ void osd_messenger_t::check_peer_config(osd_client_t *cl)
|
||||||
if (op->reply.hdr.retval < 0)
|
if (op->reply.hdr.retval < 0)
|
||||||
{
|
{
|
||||||
err = true;
|
err = true;
|
||||||
printf("Failed to get config from OSD %lu (retval=%ld), disconnecting peer\n", cl->osd_num, op->reply.hdr.retval);
|
fprintf(stderr, "Failed to get config from OSD %lu (retval=%ld), disconnecting peer\n", cl->osd_num, op->reply.hdr.retval);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
|
@ -408,18 +411,18 @@ void osd_messenger_t::check_peer_config(osd_client_t *cl)
|
||||||
if (json_err != "")
|
if (json_err != "")
|
||||||
{
|
{
|
||||||
err = true;
|
err = true;
|
||||||
printf("Failed to get config from OSD %lu: bad JSON: %s, disconnecting peer\n", cl->osd_num, json_err.c_str());
|
fprintf(stderr, "Failed to get config from OSD %lu: bad JSON: %s, disconnecting peer\n", cl->osd_num, json_err.c_str());
|
||||||
}
|
}
|
||||||
else if (config["osd_num"].uint64_value() != cl->osd_num)
|
else if (config["osd_num"].uint64_value() != cl->osd_num)
|
||||||
{
|
{
|
||||||
err = true;
|
err = true;
|
||||||
printf("Connected to OSD %lu instead of OSD %lu, peer state is outdated, disconnecting peer\n", config["osd_num"].uint64_value(), cl->osd_num);
|
fprintf(stderr, "Connected to OSD %lu instead of OSD %lu, peer state is outdated, disconnecting peer\n", config["osd_num"].uint64_value(), cl->osd_num);
|
||||||
}
|
}
|
||||||
else if (config["protocol_version"].uint64_value() != OSD_PROTOCOL_VERSION)
|
else if (config["protocol_version"].uint64_value() != OSD_PROTOCOL_VERSION)
|
||||||
{
|
{
|
||||||
err = true;
|
err = true;
|
||||||
printf(
|
fprintf(
|
||||||
"OSD %lu protocol version is %lu, but only version %u is supported.\n"
|
stderr, "OSD %lu protocol version is %lu, but only version %u is supported.\n"
|
||||||
" If you need to upgrade from 0.5.x please request it via the issue tracker.\n",
|
" If you need to upgrade from 0.5.x please request it via the issue tracker.\n",
|
||||||
cl->osd_num, config["protocol_version"].uint64_value(), OSD_PROTOCOL_VERSION
|
cl->osd_num, config["protocol_version"].uint64_value(), OSD_PROTOCOL_VERSION
|
||||||
);
|
);
|
||||||
|
@ -440,8 +443,8 @@ void osd_messenger_t::check_peer_config(osd_client_t *cl)
|
||||||
if (!msgr_rdma_address_t::from_string(config["rdma_address"].string_value().c_str(), &addr) ||
|
if (!msgr_rdma_address_t::from_string(config["rdma_address"].string_value().c_str(), &addr) ||
|
||||||
cl->rdma_conn->connect(&addr) != 0)
|
cl->rdma_conn->connect(&addr) != 0)
|
||||||
{
|
{
|
||||||
printf(
|
fprintf(
|
||||||
"Failed to connect to OSD %lu (address %s) using RDMA\n",
|
stderr, "Failed to connect to OSD %lu (address %s) using RDMA\n",
|
||||||
cl->osd_num, config["rdma_address"].string_value().c_str()
|
cl->osd_num, config["rdma_address"].string_value().c_str()
|
||||||
);
|
);
|
||||||
delete cl->rdma_conn;
|
delete cl->rdma_conn;
|
||||||
|
@ -455,7 +458,15 @@ void osd_messenger_t::check_peer_config(osd_client_t *cl)
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
printf("Connected to OSD %lu using RDMA\n", cl->osd_num);
|
uint64_t server_max_msg = config["rdma_max_msg"].uint64_value();
|
||||||
|
if (cl->rdma_conn->max_msg > server_max_msg)
|
||||||
|
{
|
||||||
|
cl->rdma_conn->max_msg = server_max_msg;
|
||||||
|
}
|
||||||
|
if (log_level > 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Connected to OSD %lu using RDMA\n", cl->osd_num);
|
||||||
|
}
|
||||||
cl->peer_state = PEER_RDMA;
|
cl->peer_state = PEER_RDMA;
|
||||||
tfd->set_fd_handler(cl->peer_fd, false, NULL);
|
tfd->set_fd_handler(cl->peer_fd, false, NULL);
|
||||||
// Add the initial receive request
|
// Add the initial receive request
|
||||||
|
@ -480,7 +491,7 @@ void osd_messenger_t::accept_connections(int listen_fd)
|
||||||
{
|
{
|
||||||
assert(peer_fd != 0);
|
assert(peer_fd != 0);
|
||||||
char peer_str[256];
|
char peer_str[256];
|
||||||
printf("[OSD %lu] new client %d: connection from %s port %d\n", this->osd_num, peer_fd,
|
fprintf(stderr, "[OSD %lu] new client %d: connection from %s port %d\n", this->osd_num, peer_fd,
|
||||||
inet_ntop(AF_INET, &addr.sin_addr, peer_str, 256), ntohs(addr.sin_port));
|
inet_ntop(AF_INET, &addr.sin_addr, peer_str, 256), ntohs(addr.sin_port));
|
||||||
fcntl(peer_fd, F_SETFL, fcntl(peer_fd, F_GETFL, 0) | O_NONBLOCK);
|
fcntl(peer_fd, F_SETFL, fcntl(peer_fd, F_GETFL, 0) | O_NONBLOCK);
|
||||||
int one = 1;
|
int one = 1;
|
||||||
|
@ -505,7 +516,58 @@ void osd_messenger_t::accept_connections(int listen_fd)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#ifdef WITH_RDMA
|
||||||
bool osd_messenger_t::is_rdma_enabled()
|
bool osd_messenger_t::is_rdma_enabled()
|
||||||
{
|
{
|
||||||
return rdma_context != NULL;
|
return rdma_context != NULL;
|
||||||
}
|
}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
json11::Json osd_messenger_t::read_config(const json11::Json & config)
|
||||||
|
{
|
||||||
|
const char *config_path = config["config_path"].string_value() != ""
|
||||||
|
? config["config_path"].string_value().c_str() : VITASTOR_CONFIG_PATH;
|
||||||
|
int fd = open(config_path, O_RDONLY);
|
||||||
|
if (fd < 0)
|
||||||
|
{
|
||||||
|
if (errno != ENOENT)
|
||||||
|
fprintf(stderr, "Error reading %s: %s\n", config_path, strerror(errno));
|
||||||
|
return config;
|
||||||
|
}
|
||||||
|
struct stat st;
|
||||||
|
if (fstat(fd, &st) != 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Error reading %s: %s\n", config_path, strerror(errno));
|
||||||
|
close(fd);
|
||||||
|
return config;
|
||||||
|
}
|
||||||
|
std::string buf;
|
||||||
|
buf.resize(st.st_size);
|
||||||
|
int done = 0;
|
||||||
|
while (done < st.st_size)
|
||||||
|
{
|
||||||
|
int r = read(fd, (void*)buf.data()+done, st.st_size-done);
|
||||||
|
if (r < 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Error reading %s: %s\n", config_path, strerror(errno));
|
||||||
|
close(fd);
|
||||||
|
return config;
|
||||||
|
}
|
||||||
|
done += r;
|
||||||
|
}
|
||||||
|
close(fd);
|
||||||
|
std::string json_err;
|
||||||
|
json11::Json::object file_config = json11::Json::parse(buf, json_err).object_items();
|
||||||
|
if (json_err != "")
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Invalid JSON in %s: %s\n", config_path, json_err.c_str());
|
||||||
|
return config;
|
||||||
|
}
|
||||||
|
file_config.erase("config_path");
|
||||||
|
file_config.erase("osd_num");
|
||||||
|
for (auto kv: config.object_items())
|
||||||
|
{
|
||||||
|
file_config[kv.first] = kv.second;
|
||||||
|
}
|
||||||
|
return file_config;
|
||||||
|
}
|
||||||
|
|
|
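Review note on `read_config()` above: values read from the config file act as defaults, explicitly passed options override them, and `config_path` and `osd_num` are never taken from the file. A minimal sketch of that precedence rule follows. It assumes a flat `std::map` as a stand-in for `json11::Json::object`; `config_map` and `merge_config` are hypothetical names and are not part of the diff.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for json11::Json::object: a flat string-to-string map
using config_map = std::map<std::string, std::string>;

// Merge file-level settings with explicitly passed ones.
// Explicit settings win; "config_path" and "osd_num" are dropped from the file part.
config_map merge_config(config_map file_config, const config_map & explicit_config)
{
    file_config.erase("config_path");
    file_config.erase("osd_num");
    for (const auto & kv: explicit_config)
    {
        file_config[kv.first] = kv.second;
    }
    return file_config;
}
```

With this ordering, a key such as `log_level` given on the command line replaces the value from the file, while keys present only in the file are kept.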
@ -33,10 +33,8 @@
|
||||||
#define PEER_RDMA 4
|
#define PEER_RDMA 4
|
||||||
#define PEER_STOPPED 5
|
#define PEER_STOPPED 5
|
||||||
|
|
||||||
#define DEFAULT_PEER_CONNECT_INTERVAL 5
|
|
||||||
#define DEFAULT_PEER_CONNECT_TIMEOUT 5
|
|
||||||
#define DEFAULT_OSD_PING_TIMEOUT 5
|
|
||||||
#define DEFAULT_BITMAP_GRANULARITY 4096
|
#define DEFAULT_BITMAP_GRANULARITY 4096
|
||||||
|
#define VITASTOR_CONFIG_PATH "/etc/vitastor/vitastor.conf"
|
||||||
|
|
||||||
#define MSGR_SENDP_HDR 1
|
#define MSGR_SENDP_HDR 1
|
||||||
#define MSGR_SENDP_FREE 2
|
#define MSGR_SENDP_FREE 2
|
||||||
|
@ -122,13 +120,11 @@ struct osd_messenger_t
|
||||||
protected:
|
protected:
|
||||||
int keepalive_timer_id = -1;
|
int keepalive_timer_id = -1;
|
||||||
|
|
||||||
// FIXME: make receive_buffer_size configurable
|
uint32_t receive_buffer_size = 0;
|
||||||
int receive_buffer_size = 64*1024;
|
int peer_connect_interval = 0;
|
||||||
int peer_connect_interval = DEFAULT_PEER_CONNECT_INTERVAL;
|
int peer_connect_timeout = 0;
|
||||||
int peer_connect_timeout = DEFAULT_PEER_CONNECT_TIMEOUT;
|
int osd_idle_timeout = 0;
|
||||||
int osd_idle_timeout = DEFAULT_OSD_PING_TIMEOUT;
|
int osd_ping_timeout = 0;
|
||||||
int osd_ping_timeout = DEFAULT_OSD_PING_TIMEOUT;
|
|
||||||
uint32_t bs_bitmap_granularity = 0;
|
|
||||||
int log_level = 0;
|
int log_level = 0;
|
||||||
bool use_sync_send_recv = false;
|
bool use_sync_send_recv = false;
|
||||||
|
|
||||||
|
@ -137,7 +133,8 @@ protected:
|
||||||
std::string rdma_device;
|
std::string rdma_device;
|
||||||
uint64_t rdma_port_num = 1, rdma_gid_index = 0, rdma_mtu = 0;
|
uint64_t rdma_port_num = 1, rdma_gid_index = 0, rdma_mtu = 0;
|
||||||
msgr_rdma_context_t *rdma_context = NULL;
|
msgr_rdma_context_t *rdma_context = NULL;
|
||||||
int max_rdma_sge = 128, max_rdma_send = 32, max_rdma_recv = 32;
|
uint64_t rdma_max_sge = 0, rdma_max_send = 0, rdma_max_recv = 8;
|
||||||
|
uint64_t rdma_max_msg = 0;
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
std::vector<int> read_ready_clients;
|
std::vector<int> read_ready_clients;
|
||||||
|
@ -168,9 +165,11 @@ public:
|
||||||
void accept_connections(int listen_fd);
|
void accept_connections(int listen_fd);
|
||||||
~osd_messenger_t();
|
~osd_messenger_t();
|
||||||
|
|
||||||
|
static json11::Json read_config(const json11::Json & config);
|
||||||
|
|
||||||
#ifdef WITH_RDMA
|
#ifdef WITH_RDMA
|
||||||
bool is_rdma_enabled();
|
bool is_rdma_enabled();
|
||||||
bool connect_rdma(int peer_fd, std::string rdma_address);
|
bool connect_rdma(int peer_fd, std::string rdma_address, uint64_t client_max_msg);
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
protected:
|
protected:
|
||||||
|
@ -188,6 +187,7 @@ protected:
|
||||||
void handle_send(int result, osd_client_t *cl);
|
void handle_send(int result, osd_client_t *cl);
|
||||||
|
|
||||||
bool handle_read(int result, osd_client_t *cl);
|
bool handle_read(int result, osd_client_t *cl);
|
||||||
|
bool handle_read_buffer(osd_client_t *cl, void *curbuf, int remain);
|
||||||
bool handle_finished_read(osd_client_t *cl);
|
bool handle_finished_read(osd_client_t *cl);
|
||||||
void handle_op_hdr(osd_client_t *cl);
|
void handle_op_hdr(osd_client_t *cl);
|
||||||
bool handle_reply_hdr(osd_client_t *cl);
|
bool handle_reply_hdr(osd_client_t *cl);
|
||||||
|
|
|
@ -42,3 +42,8 @@ void osd_messenger_t::read_requests()
|
||||||
void osd_messenger_t::send_replies()
|
void osd_messenger_t::send_replies()
|
||||||
{
|
{
|
||||||
}
|
}
|
||||||
|
|
||||||
|
json11::Json osd_messenger_t::read_config(const json11::Json & config)
|
||||||
|
{
|
||||||
|
return config;
|
||||||
|
}
|
||||||
|
|
|
@ -76,7 +76,7 @@ struct osd_op_buf_list_t
|
||||||
buf = (iovec*)malloc(sizeof(iovec) * alloc);
|
buf = (iovec*)malloc(sizeof(iovec) * alloc);
|
||||||
if (!buf)
|
if (!buf)
|
||||||
{
|
{
|
||||||
printf("Failed to allocate %lu bytes\n", sizeof(iovec) * alloc);
|
fprintf(stderr, "Failed to allocate %lu bytes\n", sizeof(iovec) * alloc);
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
memcpy(buf, inline_buf, sizeof(iovec) * old);
|
memcpy(buf, inline_buf, sizeof(iovec) * old);
|
||||||
|
@ -87,7 +87,7 @@ struct osd_op_buf_list_t
|
||||||
buf = (iovec*)realloc(buf, sizeof(iovec) * alloc);
|
buf = (iovec*)realloc(buf, sizeof(iovec) * alloc);
|
||||||
if (!buf)
|
if (!buf)
|
||||||
{
|
{
|
||||||
printf("Failed to allocate %lu bytes\n", sizeof(iovec) * alloc);
|
fprintf(stderr, "Failed to allocate %lu bytes\n", sizeof(iovec) * alloc);
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -109,7 +109,7 @@ struct osd_op_buf_list_t
|
||||||
buf = (iovec*)malloc(sizeof(iovec) * alloc);
|
buf = (iovec*)malloc(sizeof(iovec) * alloc);
|
||||||
if (!buf)
|
if (!buf)
|
||||||
{
|
{
|
||||||
printf("Failed to allocate %lu bytes\n", sizeof(iovec) * alloc);
|
fprintf(stderr, "Failed to allocate %lu bytes\n", sizeof(iovec) * alloc);
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
memcpy(buf, inline_buf, sizeof(iovec)*old);
|
memcpy(buf, inline_buf, sizeof(iovec)*old);
|
||||||
|
@ -120,7 +120,7 @@ struct osd_op_buf_list_t
|
||||||
buf = (iovec*)realloc(buf, sizeof(iovec) * alloc);
|
buf = (iovec*)realloc(buf, sizeof(iovec) * alloc);
|
||||||
if (!buf)
|
if (!buf)
|
||||||
{
|
{
|
||||||
printf("Failed to allocate %lu bytes\n", sizeof(iovec) * alloc);
|
fprintf(stderr, "Failed to allocate %lu bytes\n", sizeof(iovec) * alloc);
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
@ -1,3 +1,6 @@
|
||||||
|
// Copyright (c) Vitaliy Filippov, 2019+
|
||||||
|
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
|
||||||
|
|
||||||
#include <stdio.h>
|
#include <stdio.h>
|
||||||
#include <stdlib.h>
|
#include <stdlib.h>
|
||||||
#include "msgr_rdma.h"
|
#include "msgr_rdma.h"
|
||||||
|
@ -163,7 +166,8 @@ cleanup:
|
||||||
return NULL;
|
return NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
msgr_rdma_connection_t *msgr_rdma_connection_t::create(msgr_rdma_context_t *ctx, uint32_t max_send, uint32_t max_recv, uint32_t max_sge)
|
msgr_rdma_connection_t *msgr_rdma_connection_t::create(msgr_rdma_context_t *ctx, uint32_t max_send,
|
||||||
|
uint32_t max_recv, uint32_t max_sge, uint32_t max_msg)
|
||||||
{
|
{
|
||||||
msgr_rdma_connection_t *conn = new msgr_rdma_connection_t;
|
msgr_rdma_connection_t *conn = new msgr_rdma_connection_t;
|
||||||
|
|
||||||
|
@ -173,6 +177,7 @@ msgr_rdma_connection_t *msgr_rdma_connection_t::create(msgr_rdma_context_t *ctx,
|
||||||
conn->max_send = max_send;
|
conn->max_send = max_send;
|
||||||
conn->max_recv = max_recv;
|
conn->max_recv = max_recv;
|
||||||
conn->max_sge = max_sge;
|
conn->max_sge = max_sge;
|
||||||
|
conn->max_msg = max_msg;
|
||||||
|
|
||||||
ctx->used_max_cqe += max_send+max_recv;
|
ctx->used_max_cqe += max_send+max_recv;
|
||||||
if (ctx->used_max_cqe > ctx->max_cqe)
|
if (ctx->used_max_cqe > ctx->max_cqe)
|
||||||
|
@ -293,21 +298,25 @@ int msgr_rdma_connection_t::connect(msgr_rdma_address_t *dest)
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
bool osd_messenger_t::connect_rdma(int peer_fd, std::string rdma_address)
|
bool osd_messenger_t::connect_rdma(int peer_fd, std::string rdma_address, uint64_t client_max_msg)
|
||||||
{
|
{
|
||||||
// Try to connect to the peer using RDMA
|
// Try to connect to the peer using RDMA
|
||||||
msgr_rdma_address_t addr;
|
msgr_rdma_address_t addr;
|
||||||
if (msgr_rdma_address_t::from_string(rdma_address.c_str(), &addr))
|
if (msgr_rdma_address_t::from_string(rdma_address.c_str(), &addr))
|
||||||
{
|
{
|
||||||
auto rdma_conn = msgr_rdma_connection_t::create(rdma_context, max_rdma_send, max_rdma_recv, max_rdma_sge);
|
if (client_max_msg > rdma_max_msg)
|
||||||
|
{
|
||||||
|
client_max_msg = rdma_max_msg;
|
||||||
|
}
|
||||||
|
auto rdma_conn = msgr_rdma_connection_t::create(rdma_context, rdma_max_send, rdma_max_recv, rdma_max_sge, client_max_msg);
|
||||||
if (rdma_conn)
|
if (rdma_conn)
|
||||||
{
|
{
|
||||||
int r = rdma_conn->connect(&addr);
|
int r = rdma_conn->connect(&addr);
|
||||||
if (r != 0)
|
if (r != 0)
|
||||||
{
|
{
|
||||||
delete rdma_conn;
|
delete rdma_conn;
|
||||||
printf(
|
fprintf(
|
||||||
"Failed to connect RDMA queue pair to %s (client %d)\n",
|
stderr, "Failed to connect RDMA queue pair to %s (client %d)\n",
|
||||||
addr.to_string().c_str(), peer_fd
|
addr.to_string().c_str(), peer_fd
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
@ -337,7 +346,7 @@ static void try_send_rdma_wr(osd_client_t *cl, ibv_sge *sge, int op_sge)
|
||||||
int err = ibv_post_send(cl->rdma_conn->qp, &wr, &bad_wr);
|
int err = ibv_post_send(cl->rdma_conn->qp, &wr, &bad_wr);
|
||||||
if (err || bad_wr)
|
if (err || bad_wr)
|
||||||
{
|
{
|
||||||
printf("RDMA send failed: %s\n", strerror(err));
|
fprintf(stderr, "RDMA send failed: %s\n", strerror(err));
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
cl->rdma_conn->cur_send++;
|
cl->rdma_conn->cur_send++;
|
||||||
|
@ -351,46 +360,23 @@ bool osd_messenger_t::try_send_rdma(osd_client_t *cl)
|
||||||
// Only send one batch at a time
|
// Only send one batch at a time
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
int op_size = 0, op_sge = 0, op_max = rc->max_sge*bs_bitmap_granularity;
|
uint64_t op_size = 0, op_sge = 0;
|
||||||
// FIXME: rc->max_sge should be negotiated between client & server
|
|
||||||
ibv_sge sge[rc->max_sge];
|
ibv_sge sge[rc->max_sge];
|
||||||
while (rc->send_pos < cl->send_list.size())
|
while (rc->send_pos < cl->send_list.size())
|
||||||
{
|
{
|
||||||
iovec & iov = cl->send_list[rc->send_pos];
|
iovec & iov = cl->send_list[rc->send_pos];
|
||||||
if (cl->outbox[rc->send_pos].flags & MSGR_SENDP_HDR)
|
if (op_size >= rc->max_msg || op_sge >= rc->max_sge)
|
||||||
{
|
|
||||||
if (op_sge > 0)
|
|
||||||
{
|
{
|
||||||
try_send_rdma_wr(cl, sge, op_sge);
|
try_send_rdma_wr(cl, sge, op_sge);
|
||||||
op_sge = 0;
|
op_sge = 0;
|
||||||
op_size = 0;
|
op_size = 0;
|
||||||
if (rc->cur_send >= rc->max_send)
|
if (rc->cur_send >= rc->max_send)
|
||||||
break;
|
|
||||||
}
|
|
||||||
assert(rc->send_buf_pos == 0);
|
|
||||||
sge[0] = {
|
|
||||||
.addr = (uintptr_t)iov.iov_base,
|
|
||||||
.length = (uint32_t)iov.iov_len,
|
|
||||||
.lkey = rc->ctx->mr->lkey,
|
|
||||||
};
|
|
||||||
try_send_rdma_wr(cl, sge, 1);
|
|
||||||
rc->send_pos++;
|
|
||||||
if (rc->cur_send >= rc->max_send)
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
{
|
||||||
if (op_size >= op_max || op_sge >= rc->max_sge)
|
|
||||||
{
|
|
||||||
try_send_rdma_wr(cl, sge, op_sge);
|
|
||||||
op_sge = 0;
|
|
||||||
op_size = 0;
|
|
||||||
if (rc->cur_send >= rc->max_send)
|
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
// Fragment all messages into parts no longer than (max_sge*4k) = 120k on ConnectX-4
|
}
|
||||||
// Otherwise the client may not be able to receive them in small parts
|
uint32_t len = (uint32_t)(op_size+iov.iov_len-rc->send_buf_pos < rc->max_msg
|
||||||
uint32_t len = (uint32_t)(op_size+iov.iov_len-rc->send_buf_pos < op_max ? iov.iov_len-rc->send_buf_pos : op_max-op_size);
|
? iov.iov_len-rc->send_buf_pos : rc->max_msg-op_size);
|
||||||
sge[op_sge++] = {
|
sge[op_sge++] = {
|
||||||
.addr = (uintptr_t)(iov.iov_base+rc->send_buf_pos),
|
.addr = (uintptr_t)(iov.iov_base+rc->send_buf_pos),
|
||||||
.length = len,
|
.length = len,
|
||||||
|
@ -404,7 +390,6 @@ bool osd_messenger_t::try_send_rdma(osd_client_t *cl)
|
||||||
rc->send_buf_pos = 0;
|
rc->send_buf_pos = 0;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
|
||||||
if (op_sge > 0)
|
if (op_sge > 0)
|
||||||
{
|
{
|
||||||
try_send_rdma_wr(cl, sge, op_sge);
|
try_send_rdma_wr(cl, sge, op_sge);
|
||||||
|
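Review note on the `try_send_rdma()` hunk above: the new loop no longer treats headers specially. It packs buffers into one work request until either `max_msg` bytes or `max_sge` segments are reached, and cuts a buffer when it does not fit into the remaining space. The sketch below reproduces only that splitting rule on plain lengths. `split_messages` is a hypothetical helper, it ignores the `max_send` limit, and it assumes both limits are nonzero.

```cpp
#include <cstdint>
#include <vector>

// Split a list of buffer lengths into messages of at most max_msg bytes
// and at most max_sge segments each. Returns the segment lengths per message.
// Assumes max_msg > 0 and max_sge > 0.
std::vector<std::vector<uint64_t>> split_messages(
    const std::vector<uint64_t> & bufs, uint64_t max_msg, uint64_t max_sge)
{
    std::vector<std::vector<uint64_t>> msgs;
    std::vector<uint64_t> cur;
    uint64_t cur_size = 0;
    for (uint64_t buf_len: bufs)
    {
        uint64_t pos = 0;
        while (pos < buf_len)
        {
            // Flush the current message when it is full by bytes or by segments
            if (cur_size >= max_msg || cur.size() >= max_sge)
            {
                msgs.push_back(cur);
                cur.clear();
                cur_size = 0;
            }
            // Take the rest of the buffer, or only what still fits
            uint64_t len = buf_len - pos;
            if (len > max_msg - cur_size)
            {
                len = max_msg - cur_size;
            }
            cur.push_back(len);
            cur_size += len;
            pos += len;
        }
    }
    if (!cur.empty())
    {
        msgs.push_back(cur);
    }
    return msgs;
}
```

Since no message exceeds `max_msg` bytes, the receiving side can post buffers of exactly that size.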
@ -423,7 +408,7 @@ static void try_recv_rdma_wr(osd_client_t *cl, ibv_sge *sge, int op_sge)
|
||||||
int err = ibv_post_recv(cl->rdma_conn->qp, &wr, &bad_wr);
|
int err = ibv_post_recv(cl->rdma_conn->qp, &wr, &bad_wr);
|
||||||
if (err || bad_wr)
|
if (err || bad_wr)
|
||||||
{
|
{
|
||||||
printf("RDMA receive failed: %s\n", strerror(err));
|
fprintf(stderr, "RDMA receive failed: %s\n", strerror(err));
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
cl->rdma_conn->cur_recv++;
|
cl->rdma_conn->cur_recv++;
|
||||||
|
@ -432,53 +417,16 @@ static void try_recv_rdma_wr(osd_client_t *cl, ibv_sge *sge, int op_sge)
|
||||||
bool osd_messenger_t::try_recv_rdma(osd_client_t *cl)
|
bool osd_messenger_t::try_recv_rdma(osd_client_t *cl)
|
||||||
{
|
{
|
||||||
auto rc = cl->rdma_conn;
|
auto rc = cl->rdma_conn;
|
||||||
if (rc->cur_recv > 0)
|
while (rc->cur_recv < rc->max_recv)
|
||||||
{
|
{
|
||||||
return true;
|
void *buf = malloc_or_die(rc->max_msg);
|
||||||
}
|
rc->recv_buffers.push_back(buf);
|
||||||
if (!cl->recv_list.get_size())
|
ibv_sge sge = {
|
||||||
{
|
.addr = (uintptr_t)buf,
|
||||||
cl->recv_list.reset();
|
.length = (uint32_t)rc->max_msg,
|
||||||
cl->read_op = new osd_op_t;
|
|
||||||
cl->read_op->peer_fd = cl->peer_fd;
|
|
||||||
cl->read_op->op_type = OSD_OP_IN;
|
|
||||||
cl->recv_list.push_back(cl->read_op->req.buf, OSD_PACKET_SIZE);
|
|
||||||
cl->read_remaining = OSD_PACKET_SIZE;
|
|
||||||
cl->read_state = CL_READ_HDR;
|
|
||||||
}
|
|
||||||
int op_size = 0, op_sge = 0, op_max = rc->max_sge*bs_bitmap_granularity;
|
|
||||||
iovec *segments = cl->recv_list.get_iovec();
|
|
||||||
// FIXME: rc->max_sge should be negotiated between client & server
|
|
||||||
ibv_sge sge[rc->max_sge];
|
|
||||||
while (rc->recv_pos < cl->recv_list.get_size())
|
|
||||||
{
|
|
||||||
iovec & iov = segments[rc->recv_pos];
|
|
||||||
if (op_size >= op_max || op_sge >= rc->max_sge)
|
|
||||||
{
|
|
||||||
try_recv_rdma_wr(cl, sge, op_sge);
|
|
||||||
op_sge = 0;
|
|
||||||
op_size = 0;
|
|
||||||
if (rc->cur_recv >= rc->max_recv)
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
// Receive in identical (max_sge*4k) fragments
|
|
||||||
uint32_t len = (uint32_t)(op_size+iov.iov_len-rc->recv_buf_pos < op_max ? iov.iov_len-rc->recv_buf_pos : op_max-op_size);
|
|
||||||
sge[op_sge++] = {
|
|
||||||
.addr = (uintptr_t)(iov.iov_base+rc->recv_buf_pos),
|
|
||||||
.length = len,
|
|
||||||
.lkey = rc->ctx->mr->lkey,
|
.lkey = rc->ctx->mr->lkey,
|
||||||
};
|
};
|
||||||
op_size += len;
|
try_recv_rdma_wr(cl, &sge, 1);
|
||||||
rc->recv_buf_pos += len;
|
|
||||||
if (rc->recv_buf_pos >= iov.iov_len)
|
|
||||||
{
|
|
||||||
rc->recv_pos++;
|
|
||||||
rc->recv_buf_pos = 0;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (op_sge > 0)
|
|
||||||
{
|
|
||||||
try_recv_rdma_wr(cl, sge, op_sge);
|
|
||||||
}
|
}
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
@ -497,7 +445,7 @@ void osd_messenger_t::handle_rdma_events()
|
||||||
}
|
}
|
||||||
if (ibv_req_notify_cq(rdma_context->cq, 0) != 0)
|
if (ibv_req_notify_cq(rdma_context->cq, 0) != 0)
|
||||||
{
|
{
|
||||||
printf("Failed to request RDMA completion notification, exiting\n");
|
fprintf(stderr, "Failed to request RDMA completion notification, exiting\n");
|
||||||
exit(1);
|
exit(1);
|
||||||
}
|
}
|
||||||
ibv_wc wc[RDMA_EVENTS_AT_ONCE];
|
ibv_wc wc[RDMA_EVENTS_AT_ONCE];
|
||||||
|
@ -517,37 +465,23 @@ void osd_messenger_t::handle_rdma_events()
|
||||||
osd_client_t *cl = cl_it->second;
|
osd_client_t *cl = cl_it->second;
|
||||||
if (wc[i].status != IBV_WC_SUCCESS)
|
if (wc[i].status != IBV_WC_SUCCESS)
|
||||||
{
|
{
|
||||||
printf("RDMA work request failed for client %d", client_id);
|
fprintf(stderr, "RDMA work request failed for client %d", client_id);
|
||||||
if (cl->osd_num)
|
if (cl->osd_num)
|
||||||
{
|
{
|
||||||
printf(" (OSD %lu)", cl->osd_num);
|
fprintf(stderr, " (OSD %lu)", cl->osd_num);
|
||||||
}
|
}
|
||||||
printf(" with status: %s, stopping client\n", ibv_wc_status_str(wc[i].status));
|
fprintf(stderr, " with status: %s, stopping client\n", ibv_wc_status_str(wc[i].status));
|
||||||
stop_client(client_id);
|
stop_client(client_id);
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
if (!is_send)
|
if (!is_send)
|
||||||
{
|
{
|
||||||
cl->rdma_conn->cur_recv--;
|
cl->rdma_conn->cur_recv--;
|
||||||
if (!cl->rdma_conn->cur_recv)
|
handle_read_buffer(cl, cl->rdma_conn->recv_buffers[0], wc[i].byte_len);
|
||||||
{
|
free(cl->rdma_conn->recv_buffers[0]);
|
||||||
cl->recv_list.done += cl->rdma_conn->recv_pos;
|
cl->rdma_conn->recv_buffers.erase(cl->rdma_conn->recv_buffers.begin(), cl->rdma_conn->recv_buffers.begin()+1);
|
||||||
cl->rdma_conn->recv_pos = 0;
|
|
||||||
if (!cl->recv_list.get_size())
|
|
||||||
{
|
|
||||||
cl->read_remaining = 0;
|
|
||||||
if (handle_finished_read(cl))
|
|
||||||
{
|
|
||||||
try_recv_rdma(cl);
|
try_recv_rdma(cl);
|
||||||
}
|
}
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
// Continue to receive data
|
|
||||||
try_recv_rdma(cl);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
cl->rdma_conn->cur_send--;
|
cl->rdma_conn->cur_send--;
|
||||||
|
|
|
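Review note on the receive path above: both peers clamp `max_msg` to the smaller of the two advertised values, `try_recv_rdma()` keeps `max_recv` buffers of `max_msg` bytes posted, and each completion consumes `recv_buffers[0]`, which relies on receives completing in posting order. A rough sketch of this bookkeeping without any verbs calls is shown below. `negotiate_max_msg` and `recv_pool_t` are hypothetical names introduced for illustration only.

```cpp
#include <cstdint>
#include <cstdlib>
#include <deque>

// Both peers advertise a maximum message size; the smaller one is used
uint64_t negotiate_max_msg(uint64_t client_max_msg, uint64_t server_max_msg)
{
    return client_max_msg < server_max_msg ? client_max_msg : server_max_msg;
}

// Hypothetical receive-side state: a FIFO of posted buffers
struct recv_pool_t
{
    uint64_t max_msg = 0;
    size_t max_recv = 0;
    std::deque<void*> posted;

    // Post buffers until max_recv of them are outstanding
    void refill()
    {
        while (posted.size() < max_recv)
        {
            void *buf = std::malloc(max_msg);
            if (!buf)
            {
                std::abort();
            }
            posted.push_back(buf);
        }
    }

    // A completion is matched with the oldest posted buffer (in-order assumption)
    void complete_one()
    {
        if (posted.empty())
        {
            return;
        }
        std::free(posted.front());
        posted.pop_front();
    }

    ~recv_pool_t()
    {
        for (void *buf: posted)
        {
            std::free(buf);
        }
    }
};
```

In the diff the buffer is handed to `handle_read_buffer()` before being freed; the sketch leaves that step out. A `std::deque` is used here only because removing from the front is cheap, whereas the diff uses a `std::vector`.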
@ -1,3 +1,6 @@
|
||||||
|
// Copyright (c) Vitaliy Filippov, 2019+
|
||||||
|
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
|
||||||
|
|
||||||
#pragma once
|
#pragma once
|
||||||
#include <infiniband/verbs.h>
|
#include <infiniband/verbs.h>
|
||||||
#include <string>
|
#include <string>
|
||||||
|
@ -43,11 +46,13 @@ struct msgr_rdma_connection_t
|
||||||
msgr_rdma_address_t addr;
|
msgr_rdma_address_t addr;
|
||||||
int max_send = 0, max_recv = 0, max_sge = 0;
|
int max_send = 0, max_recv = 0, max_sge = 0;
|
||||||
int cur_send = 0, cur_recv = 0;
|
int cur_send = 0, cur_recv = 0;
|
||||||
|
uint64_t max_msg = 0;
|
||||||
|
|
||||||
int send_pos = 0, send_buf_pos = 0;
|
int send_pos = 0, send_buf_pos = 0;
|
||||||
int recv_pos = 0, recv_buf_pos = 0;
|
int recv_pos = 0, recv_buf_pos = 0;
|
||||||
|
std::vector<void*> recv_buffers;
|
||||||
|
|
||||||
~msgr_rdma_connection_t();
|
~msgr_rdma_connection_t();
|
||||||
static msgr_rdma_connection_t *create(msgr_rdma_context_t *ctx, uint32_t max_send, uint32_t max_recv, uint32_t max_sge);
|
static msgr_rdma_connection_t *create(msgr_rdma_context_t *ctx, uint32_t max_send, uint32_t max_recv, uint32_t max_sge, uint32_t max_msg);
|
||||||
int connect(msgr_rdma_address_t *dest);
|
int connect(msgr_rdma_address_t *dest);
|
||||||
};
|
};
|
||||||
|
|
|
@ -72,7 +72,7 @@ bool osd_messenger_t::handle_read(int result, osd_client_t *cl)
|
||||||
// this is a client socket, so don't panic on error. just disconnect it
|
// this is a client socket, so don't panic on error. just disconnect it
|
||||||
if (result != 0)
|
if (result != 0)
|
||||||
{
|
{
|
||||||
printf("Client %d socket read error: %d (%s). Disconnecting client\n", cl->peer_fd, -result, strerror(-result));
|
fprintf(stderr, "Client %d socket read error: %d (%s). Disconnecting client\n", cl->peer_fd, -result, strerror(-result));
|
||||||
}
|
}
|
||||||
stop_client(cl->peer_fd);
|
stop_client(cl->peer_fd);
|
||||||
return false;
|
return false;
|
||||||
|
@ -90,10 +90,39 @@ bool osd_messenger_t::handle_read(int result, osd_client_t *cl)
|
||||||
if (result > 0)
|
if (result > 0)
|
||||||
{
|
{
|
||||||
if (cl->read_iov.iov_base == cl->in_buf)
|
if (cl->read_iov.iov_base == cl->in_buf)
|
||||||
|
{
|
||||||
|
if (!handle_read_buffer(cl, cl->in_buf, result))
|
||||||
|
{
|
||||||
|
goto fin;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
// Long data
|
||||||
|
cl->read_remaining -= result;
|
||||||
|
cl->recv_list.eat(result);
|
||||||
|
if (cl->recv_list.done >= cl->recv_list.count)
|
||||||
|
{
|
||||||
|
handle_finished_read(cl);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (result >= cl->read_iov.iov_len)
|
||||||
|
{
|
||||||
|
ret = true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
fin:
|
||||||
|
for (auto cb: set_immediate)
|
||||||
|
{
|
||||||
|
cb();
|
||||||
|
}
|
||||||
|
set_immediate.clear();
|
||||||
|
return ret;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool osd_messenger_t::handle_read_buffer(osd_client_t *cl, void *curbuf, int remain)
|
||||||
{
|
{
|
||||||
// Compose operation(s) from the buffer
|
// Compose operation(s) from the buffer
|
||||||
int remain = result;
|
|
||||||
void *curbuf = cl->in_buf;
|
|
||||||
while (remain > 0)
|
while (remain > 0)
|
||||||
{
|
{
|
||||||
if (!cl->read_op)
|
if (!cl->read_op)
|
||||||
|
@ -130,33 +159,11 @@ bool osd_messenger_t::handle_read(int result, osd_client_t *cl)
|
||||||
{
|
{
|
||||||
if (!handle_finished_read(cl))
|
if (!handle_finished_read(cl))
|
||||||
{
|
{
|
||||||
goto fin;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
return true;
|
||||||
else
|
|
||||||
{
|
|
||||||
// Long data
|
|
||||||
cl->read_remaining -= result;
|
|
||||||
cl->recv_list.eat(result);
|
|
||||||
if (cl->recv_list.done >= cl->recv_list.count)
|
|
||||||
{
|
|
||||||
handle_finished_read(cl);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (result >= cl->read_iov.iov_len)
|
|
||||||
{
|
|
||||||
ret = true;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
fin:
|
|
||||||
for (auto cb: set_immediate)
|
|
||||||
{
|
|
||||||
cb();
|
|
||||||
}
|
|
||||||
set_immediate.clear();
|
|
||||||
return ret;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
bool osd_messenger_t::handle_finished_read(osd_client_t *cl)
|
bool osd_messenger_t::handle_finished_read(osd_client_t *cl)
|
||||||
|
@ -170,7 +177,7 @@ bool osd_messenger_t::handle_finished_read(osd_client_t *cl)
|
||||||
handle_op_hdr(cl);
|
handle_op_hdr(cl);
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
printf("Received garbage: magic=%lx id=%lu opcode=%lx from %d\n", cl->read_op->req.hdr.magic, cl->read_op->req.hdr.id, cl->read_op->req.hdr.opcode, cl->peer_fd);
|
fprintf(stderr, "Received garbage: magic=%lx id=%lu opcode=%lx from %d\n", cl->read_op->req.hdr.magic, cl->read_op->req.hdr.id, cl->read_op->req.hdr.opcode, cl->peer_fd);
|
||||||
stop_client(cl->peer_fd);
|
stop_client(cl->peer_fd);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
@ -285,7 +292,7 @@ bool osd_messenger_t::handle_reply_hdr(osd_client_t *cl)
|
||||||
if (req_it == cl->sent_ops.end())
|
if (req_it == cl->sent_ops.end())
|
||||||
{
|
{
|
||||||
// Command out of sync. Drop connection
|
// Command out of sync. Drop connection
|
||||||
printf("Client %d command out of sync: id %lu\n", cl->peer_fd, cl->read_op->req.hdr.id);
|
fprintf(stderr, "Client %d command out of sync: id %lu\n", cl->peer_fd, cl->read_op->req.hdr.id);
|
||||||
stop_client(cl->peer_fd);
|
stop_client(cl->peer_fd);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
@ -300,7 +307,7 @@ bool osd_messenger_t::handle_reply_hdr(osd_client_t *cl)
|
||||||
if (op->reply.hdr.retval >= 0 && (op->reply.hdr.retval != expected_size || bmp_len > op->bitmap_len))
|
if (op->reply.hdr.retval >= 0 && (op->reply.hdr.retval != expected_size || bmp_len > op->bitmap_len))
|
||||||
{
|
{
|
||||||
// Check reply length to not overflow the buffer
|
// Check reply length to not overflow the buffer
|
||||||
printf("Client %d read reply of different length: expected %u+%u, got %ld+%u\n",
|
fprintf(stderr, "Client %d read reply of different length: expected %u+%u, got %ld+%u\n",
|
||||||
cl->peer_fd, expected_size, op->bitmap_len, op->reply.hdr.retval, bmp_len);
|
cl->peer_fd, expected_size, op->bitmap_len, op->reply.hdr.retval, bmp_len);
|
||||||
cl->sent_ops[op->req.hdr.id] = op;
|
cl->sent_ops[op->req.hdr.id] = op;
|
||||||
stop_client(cl->peer_fd);
|
stop_client(cl->peer_fd);
|
||||||
|
|
|
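Review note on the `handle_read_buffer()` split above: the parsing loop now takes a buffer pointer and a length as parameters instead of reading `cl->in_buf` directly, so the same code can consume data from a socket read or from an RDMA receive buffer. The sketch below illustrates that idea with a simplified reader that assembles fixed-size packets from chunks of any length. `packet_reader_t` and `PACKET_SIZE` are hypothetical, and the real code also handles payloads that follow the header.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical fixed packet size for the sketch (not Vitastor's OSD_PACKET_SIZE)
static const size_t PACKET_SIZE = 8;

struct packet_reader_t
{
    unsigned char cur[PACKET_SIZE];
    size_t filled = 0;
    std::vector<std::vector<unsigned char>> packets;

    // Consume a chunk of any length, regardless of which transport produced it
    void feed(const void *buf, size_t remain)
    {
        const unsigned char *p = (const unsigned char*)buf;
        while (remain > 0)
        {
            // Copy as much as is needed to complete the current packet
            size_t n = PACKET_SIZE - filled;
            if (n > remain)
            {
                n = remain;
            }
            std::memcpy(cur + filled, p, n);
            filled += n;
            p += n;
            remain -= n;
            if (filled == PACKET_SIZE)
            {
                packets.emplace_back(cur, cur + PACKET_SIZE);
                filled = 0;
            }
        }
    }
};
```

A packet that straddles two chunks is completed on the second `feed()` call, which is why the reader keeps its partial state between calls.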
@ -227,7 +227,7 @@ void osd_messenger_t::handle_send(int result, osd_client_t *cl)
|
||||||
if (result < 0 && result != -EAGAIN)
|
if (result < 0 && result != -EAGAIN)
|
||||||
{
|
{
|
||||||
// this is a client socket, so don't panic. just disconnect it
|
// this is a client socket, so don't panic. just disconnect it
|
||||||
printf("Client %d socket write error: %d (%s). Disconnecting client\n", cl->peer_fd, -result, strerror(-result));
|
fprintf(stderr, "Client %d socket write error: %d (%s). Disconnecting client\n", cl->peer_fd, -result, strerror(-result));
|
||||||
stop_client(cl->peer_fd);
|
stop_client(cl->peer_fd);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
@ -272,7 +272,10 @@ void osd_messenger_t::handle_send(int result, osd_client_t *cl)
|
||||||
{
|
{
|
||||||
// FIXME: Do something better than just forgetting the FD
|
// FIXME: Do something better than just forgetting the FD
|
||||||
// FIXME: Ignore pings during RDMA state transition
|
// FIXME: Ignore pings during RDMA state transition
|
||||||
printf("Successfully connected with client %d using RDMA\n", cl->peer_fd);
|
if (log_level > 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Successfully connected with client %d using RDMA\n", cl->peer_fd);
|
||||||
|
}
|
||||||
cl->peer_state = PEER_RDMA;
|
cl->peer_state = PEER_RDMA;
|
||||||
tfd->set_fd_handler(cl->peer_fd, false, NULL);
|
tfd->set_fd_handler(cl->peer_fd, false, NULL);
|
||||||
// Add the initial receive request
|
// Add the initial receive request
|
||||||
|
|
|
@ -58,11 +58,11 @@ void osd_messenger_t::stop_client(int peer_fd, bool force)
|
||||||
{
|
{
|
||||||
if (cl->osd_num)
|
if (cl->osd_num)
|
||||||
{
|
{
|
||||||
printf("[OSD %lu] Stopping client %d (OSD peer %lu)\n", osd_num, peer_fd, cl->osd_num);
|
fprintf(stderr, "[OSD %lu] Stopping client %d (OSD peer %lu)\n", osd_num, peer_fd, cl->osd_num);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
printf("[OSD %lu] Stopping client %d (regular client)\n", osd_num, peer_fd);
|
fprintf(stderr, "[OSD %lu] Stopping client %d (regular client)\n", osd_num, peer_fd);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
// First set state to STOPPED so another stop_client() call doesn't try to free it again
|
// First set state to STOPPED so another stop_client() call doesn't try to free it again
|
||||||
|
|
|
@ -10,6 +10,7 @@
|
||||||
#include <netinet/tcp.h>
|
#include <netinet/tcp.h>
|
||||||
#include <arpa/inet.h>
|
#include <arpa/inet.h>
|
||||||
#include <sys/un.h>
|
#include <sys/un.h>
|
||||||
|
#include <sys/epoll.h>
|
||||||
#include <unistd.h>
|
#include <unistd.h>
|
||||||
#include <fcntl.h>
|
#include <fcntl.h>
|
||||||
#include <signal.h>
|
#include <signal.h>
|
||||||
|
@ -116,7 +117,7 @@ public:
|
||||||
"Vitastor NBD proxy\n"
|
"Vitastor NBD proxy\n"
|
||||||
"(c) Vitaliy Filippov, 2020-2021 (VNPL-1.1)\n\n"
|
"(c) Vitaliy Filippov, 2020-2021 (VNPL-1.1)\n\n"
|
||||||
"USAGE:\n"
|
"USAGE:\n"
|
||||||
" %s map --etcd_address <etcd_address> (--image <image> | --pool <pool> --inode <inode> --size <size in bytes>)\n"
|
" %s map [--etcd_address <etcd_address>] (--image <image> | --pool <pool> --inode <inode> --size <size in bytes>)\n"
|
||||||
" %s unmap /dev/nbd0\n"
|
" %s unmap /dev/nbd0\n"
|
||||||
" %s list [--json]\n",
|
" %s list [--json]\n",
|
||||||
exe_name, exe_name, exe_name
|
exe_name, exe_name, exe_name
|
||||||
|
@ -146,11 +147,6 @@ public:
|
||||||
void start(json11::Json cfg)
|
void start(json11::Json cfg)
|
||||||
{
|
{
|
||||||
// Check options
|
// Check options
|
||||||
if (cfg["etcd_address"].string_value() == "")
|
|
||||||
{
|
|
||||||
fprintf(stderr, "etcd_address is missing\n");
|
|
||||||
exit(1);
|
|
||||||
}
|
|
||||||
if (cfg["image"].string_value() != "")
|
if (cfg["image"].string_value() != "")
|
||||||
{
|
{
|
||||||
// Use image name
|
// Use image name
|
||||||
|
@ -205,9 +201,10 @@ public:
|
||||||
fcntl(sockfd[0], F_SETFL, fcntl(sockfd[0], F_GETFL, 0) | O_NONBLOCK);
|
fcntl(sockfd[0], F_SETFL, fcntl(sockfd[0], F_GETFL, 0) | O_NONBLOCK);
|
||||||
nbd_fd = sockfd[0];
|
nbd_fd = sockfd[0];
|
||||||
load_module();
|
load_module();
|
||||||
|
bool bg = cfg["foreground"].is_null();
|
||||||
if (!cfg["dev_num"].is_null())
|
if (!cfg["dev_num"].is_null())
|
||||||
{
|
{
|
||||||
if (run_nbd(sockfd, cfg["dev_num"].int64_value(), device_size, NBD_FLAG_SEND_FLUSH, 30) < 0)
|
if (run_nbd(sockfd, cfg["dev_num"].int64_value(), device_size, NBD_FLAG_SEND_FLUSH, 30, bg) < 0)
|
||||||
{
|
{
|
||||||
perror("run_nbd");
|
perror("run_nbd");
|
||||||
exit(1);
|
exit(1);
|
||||||
|
@ -219,7 +216,7 @@ public:
|
||||||
int i = 0;
|
int i = 0;
|
||||||
while (true)
|
while (true)
|
||||||
{
|
{
|
||||||
int r = run_nbd(sockfd, i, device_size, NBD_FLAG_SEND_FLUSH, 30);
|
int r = run_nbd(sockfd, i, device_size, NBD_FLAG_SEND_FLUSH, 30, bg);
|
||||||
if (r == 0)
|
if (r == 0)
|
||||||
{
|
{
|
||||||
printf("/dev/nbd%d\n", i);
|
printf("/dev/nbd%d\n", i);
|
||||||
|
@ -242,7 +239,7 @@ public:
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if (cfg["foreground"].is_null())
|
if (bg)
|
||||||
{
|
{
|
||||||
daemonize();
|
daemonize();
|
||||||
}
|
}
|
||||||
|
@ -259,22 +256,47 @@ public:
|
||||||
};
|
};
|
||||||
ringloop->register_consumer(&consumer);
|
ringloop->register_consumer(&consumer);
|
||||||
// Add FD to epoll
|
// Add FD to epoll
|
||||||
epmgr->tfd->set_fd_handler(sockfd[0], false, [this](int peer_fd, int epoll_events)
|
bool stop = false;
|
||||||
|
epmgr->tfd->set_fd_handler(sockfd[0], false, [this, &stop](int peer_fd, int epoll_events)
|
||||||
|
{
|
||||||
|
if (epoll_events & EPOLLRDHUP)
|
||||||
|
{
|
||||||
|
close(peer_fd);
|
||||||
|
stop = true;
|
||||||
|
}
|
||||||
|
else
|
||||||
{
|
{
|
||||||
read_ready++;
|
read_ready++;
|
||||||
submit_read();
|
submit_read();
|
||||||
|
}
|
||||||
});
|
});
|
||||||
while (1)
|
while (!stop)
|
||||||
{
|
{
|
||||||
ringloop->loop();
|
ringloop->loop();
|
||||||
ringloop->wait();
|
ringloop->wait();
|
||||||
}
|
}
|
||||||
// FIXME: Cleanup when exiting
|
stop = false;
|
||||||
|
cluster_op_t *close_sync = new cluster_op_t;
|
||||||
|
close_sync->opcode = OSD_OP_SYNC;
|
||||||
|
close_sync->callback = [this, &stop](cluster_op_t *op)
|
||||||
|
{
|
||||||
|
stop = true;
|
||||||
|
delete op;
|
||||||
|
};
|
||||||
|
cli->execute(close_sync);
|
||||||
|
while (!stop)
|
||||||
|
{
|
||||||
|
ringloop->loop();
|
||||||
|
ringloop->wait();
|
||||||
|
}
|
||||||
|
delete cli;
|
||||||
|
delete epmgr;
|
||||||
|
delete ringloop;
|
||||||
}
|
}
|
||||||
|
|
||||||
void load_module()
|
void load_module()
|
||||||
{
|
{
|
||||||
if (access("/sys/module/nbd", F_OK))
|
if (access("/sys/module/nbd", F_OK) == 0)
|
||||||
{
|
{
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
@ -416,7 +438,7 @@ public:
|
||||||
}
|
}
|
||||||
|
|
||||||
protected:
|
protected:
|
||||||
int run_nbd(int sockfd[2], int dev_num, uint64_t size, uint64_t flags, unsigned timeout)
|
int run_nbd(int sockfd[2], int dev_num, uint64_t size, uint64_t flags, unsigned timeout, bool bg)
|
||||||
{
|
{
|
||||||
// Check handle size
|
// Check handle size
|
||||||
assert(sizeof(cur_req.handle) == 8);
|
assert(sizeof(cur_req.handle) == 8);
|
||||||
|
@ -464,11 +486,14 @@ protected:
|
||||||
{
|
{
|
||||||
// Run in child
|
// Run in child
|
||||||
close(sockfd[0]);
|
close(sockfd[0]);
|
||||||
|
if (bg)
|
||||||
|
{
|
||||||
|
daemonize();
|
||||||
|
}
|
||||||
r = ioctl(nbd, NBD_DO_IT);
|
r = ioctl(nbd, NBD_DO_IT);
|
||||||
if (r < 0)
|
if (r < 0)
|
||||||
{
|
{
|
||||||
fprintf(stderr, "NBD device terminated with error: %s\n", strerror(errno));
|
fprintf(stderr, "NBD device terminated with error: %s\n", strerror(errno));
|
||||||
kill(getppid(), SIGTERM);
|
|
||||||
}
|
}
|
||||||
close(sockfd[1]);
|
close(sockfd[1]);
|
||||||
ioctl(nbd, NBD_CLEAR_QUE);
|
ioctl(nbd, NBD_CLEAR_QUE);
|
||||||
|
|
98
src/osd.cpp
|
@ -10,31 +10,39 @@
|
||||||
#include "osd.h"
|
#include "osd.h"
|
||||||
#include "http_client.h"
|
#include "http_client.h"
|
||||||
|
|
||||||
osd_t::osd_t(blockstore_config_t & config, ring_loop_t *ringloop)
|
static blockstore_config_t json_to_bs(const json11::Json::object & config)
|
||||||
{
|
{
|
||||||
bs_block_size = strtoull(config["block_size"].c_str(), NULL, 10);
|
blockstore_config_t bs;
|
||||||
bs_bitmap_granularity = strtoull(config["bitmap_granularity"].c_str(), NULL, 10);
|
for (auto kv: config)
|
||||||
if (!bs_block_size)
|
{
|
||||||
bs_block_size = DEFAULT_BLOCK_SIZE;
|
if (kv.second.is_string())
|
||||||
if (!bs_bitmap_granularity)
|
bs[kv.first] = kv.second.string_value();
|
||||||
bs_bitmap_granularity = DEFAULT_BITMAP_GRANULARITY;
|
else
|
||||||
clean_entry_bitmap_size = bs_block_size / bs_bitmap_granularity / 8;
|
bs[kv.first] = kv.second.dump();
|
||||||
|
}
|
||||||
|
return bs;
|
||||||
|
}
|
||||||
|
|
||||||
|
osd_t::osd_t(const json11::Json & config, ring_loop_t *ringloop)
|
||||||
|
{
|
||||||
zero_buffer_size = 1<<20;
|
zero_buffer_size = 1<<20;
|
||||||
zero_buffer = malloc_or_die(zero_buffer_size);
|
zero_buffer = malloc_or_die(zero_buffer_size);
|
||||||
memset(zero_buffer, 0, zero_buffer_size);
|
memset(zero_buffer, 0, zero_buffer_size);
|
||||||
|
|
||||||
this->config = config;
|
|
||||||
this->ringloop = ringloop;
|
this->ringloop = ringloop;
|
||||||
|
|
||||||
|
this->config = msgr.read_config(config).object_items();
|
||||||
|
if (this->config.find("log_level") == this->config.end())
|
||||||
|
this->config["log_level"] = 1;
|
||||||
|
parse_config(this->config);
|
||||||
|
|
||||||
epmgr = new epoll_manager_t(ringloop);
|
epmgr = new epoll_manager_t(ringloop);
|
||||||
// FIXME: Use timerfd_interval based directly on io_uring
|
// FIXME: Use timerfd_interval based directly on io_uring
|
||||||
this->tfd = epmgr->tfd;
|
this->tfd = epmgr->tfd;
|
||||||
|
|
||||||
// FIXME: Create Blockstore from on-disk superblock config and check it against the OSD cluster config
|
// FIXME: Create Blockstore from on-disk superblock config and check it against the OSD cluster config
|
||||||
this->bs = new blockstore_t(config, ringloop, tfd);
|
auto bs_cfg = json_to_bs(this->config);
|
||||||
|
this->bs = new blockstore_t(bs_cfg, ringloop, tfd);
|
||||||
parse_config(config);
|
|
||||||
|
|
||||||
this->tfd->set_timer(print_stats_interval*1000, true, [this](int timer_id)
|
this->tfd->set_timer(print_stats_interval*1000, true, [this](int timer_id)
|
||||||
{
|
{
|
||||||
|
@ -66,63 +74,71 @@ osd_t::~osd_t()
|
||||||
free(zero_buffer);
|
free(zero_buffer);
|
||||||
}
|
}
|
||||||
|
|
||||||
void osd_t::parse_config(blockstore_config_t & config)
|
void osd_t::parse_config(const json11::Json & config)
|
||||||
{
|
{
|
||||||
if (config.find("log_level") == config.end())
|
st_cli.parse_config(config);
|
||||||
config["log_level"] = "1";
|
msgr.parse_config(config);
|
||||||
log_level = strtoull(config["log_level"].c_str(), NULL, 10);
|
// OSD number
|
||||||
// Initial startup configuration
|
osd_num = config["osd_num"].uint64_value();
|
||||||
json11::Json json_config = json11::Json(config);
|
|
||||||
st_cli.parse_config(json_config);
|
|
||||||
etcd_report_interval = strtoull(config["etcd_report_interval"].c_str(), NULL, 10);
|
|
||||||
if (etcd_report_interval <= 0)
|
|
||||||
etcd_report_interval = 30;
|
|
||||||
osd_num = strtoull(config["osd_num"].c_str(), NULL, 10);
|
|
||||||
if (!osd_num)
|
if (!osd_num)
|
||||||
throw std::runtime_error("osd_num is required in the configuration");
|
throw std::runtime_error("osd_num is required in the configuration");
|
||||||
msgr.osd_num = osd_num;
|
msgr.osd_num = osd_num;
|
||||||
|
// Vital Blockstore parameters
|
||||||
|
bs_block_size = config["block_size"].uint64_value();
|
||||||
|
if (!bs_block_size)
|
||||||
|
bs_block_size = DEFAULT_BLOCK_SIZE;
|
||||||
|
bs_bitmap_granularity = config["bitmap_granularity"].uint64_value();
|
||||||
|
if (!bs_bitmap_granularity)
|
||||||
|
bs_bitmap_granularity = DEFAULT_BITMAP_GRANULARITY;
|
||||||
|
clean_entry_bitmap_size = bs_block_size / bs_bitmap_granularity / 8;
|
||||||
|
// Bind address
|
||||||
|
bind_address = config["bind_address"].string_value();
|
||||||
|
if (bind_address == "")
|
||||||
|
bind_address = "0.0.0.0";
|
||||||
|
bind_port = config["bind_port"].uint64_value();
|
||||||
|
if (bind_port <= 0 || bind_port > 65535)
|
||||||
|
bind_port = 0;
|
||||||
|
// OSD configuration
|
||||||
|
log_level = config["log_level"].uint64_value();
|
||||||
|
etcd_report_interval = config["etcd_report_interval"].uint64_value();
|
||||||
|
if (etcd_report_interval <= 0)
|
||||||
|
etcd_report_interval = 30;
|
||||||
|
readonly = config["readonly"] == "true" || config["readonly"] == "1" || config["readonly"] == "yes";
|
||||||
run_primary = config["run_primary"] != "false" && config["run_primary"] != "0" && config["run_primary"] != "no";
|
run_primary = config["run_primary"] != "false" && config["run_primary"] != "0" && config["run_primary"] != "no";
|
||||||
no_rebalance = config["no_rebalance"] == "true" || config["no_rebalance"] == "1" || config["no_rebalance"] == "yes";
|
no_rebalance = config["no_rebalance"] == "true" || config["no_rebalance"] == "1" || config["no_rebalance"] == "yes";
|
||||||
no_recovery = config["no_recovery"] == "true" || config["no_recovery"] == "1" || config["no_recovery"] == "yes";
|
no_recovery = config["no_recovery"] == "true" || config["no_recovery"] == "1" || config["no_recovery"] == "yes";
|
||||||
allow_test_ops = config["allow_test_ops"] == "true" || config["allow_test_ops"] == "1" || config["allow_test_ops"] == "yes";
|
allow_test_ops = config["allow_test_ops"] == "true" || config["allow_test_ops"] == "1" || config["allow_test_ops"] == "yes";
|
||||||
// Cluster configuration
|
|
||||||
bind_address = config["bind_address"];
|
|
||||||
if (bind_address == "")
|
|
||||||
bind_address = "0.0.0.0";
|
|
||||||
bind_port = stoull_full(config["bind_port"]);
|
|
||||||
if (bind_port <= 0 || bind_port > 65535)
|
|
||||||
bind_port = 0;
|
|
||||||
if (config["immediate_commit"] == "all")
|
if (config["immediate_commit"] == "all")
|
||||||
immediate_commit = IMMEDIATE_ALL;
|
immediate_commit = IMMEDIATE_ALL;
|
||||||
else if (config["immediate_commit"] == "small")
|
else if (config["immediate_commit"] == "small")
|
||||||
immediate_commit = IMMEDIATE_SMALL;
|
immediate_commit = IMMEDIATE_SMALL;
|
||||||
if (config.find("autosync_interval") != config.end())
|
else
|
||||||
|
immediate_commit = IMMEDIATE_NONE;
|
||||||
|
if (!config["autosync_interval"].is_null())
|
||||||
{
|
{
|
||||||
autosync_interval = strtoull(config["autosync_interval"].c_str(), NULL, 10);
|
// Allow to set it to 0
|
||||||
|
autosync_interval = config["autosync_interval"].uint64_value();
|
||||||
if (autosync_interval > MAX_AUTOSYNC_INTERVAL)
|
if (autosync_interval > MAX_AUTOSYNC_INTERVAL)
|
||||||
autosync_interval = DEFAULT_AUTOSYNC_INTERVAL;
|
autosync_interval = DEFAULT_AUTOSYNC_INTERVAL;
|
||||||
}
|
}
|
||||||
if (config.find("client_queue_depth") != config.end())
|
if (!config["client_queue_depth"].is_null())
|
||||||
{
|
{
|
||||||
client_queue_depth = strtoull(config["client_queue_depth"].c_str(), NULL, 10);
|
client_queue_depth = config["client_queue_depth"].uint64_value();
|
||||||
if (client_queue_depth < 128)
|
if (client_queue_depth < 128)
|
||||||
client_queue_depth = 128;
|
client_queue_depth = 128;
|
||||||
}
|
}
|
||||||
recovery_queue_depth = strtoull(config["recovery_queue_depth"].c_str(), NULL, 10);
|
recovery_queue_depth = config["recovery_queue_depth"].uint64_value();
|
||||||
if (recovery_queue_depth < 1 || recovery_queue_depth > MAX_RECOVERY_QUEUE)
|
if (recovery_queue_depth < 1 || recovery_queue_depth > MAX_RECOVERY_QUEUE)
|
||||||
recovery_queue_depth = DEFAULT_RECOVERY_QUEUE;
|
recovery_queue_depth = DEFAULT_RECOVERY_QUEUE;
|
||||||
recovery_sync_batch = strtoull(config["recovery_sync_batch"].c_str(), NULL, 10);
|
recovery_sync_batch = config["recovery_sync_batch"].uint64_value();
|
||||||
if (recovery_sync_batch < 1 || recovery_sync_batch > MAX_RECOVERY_QUEUE)
|
if (recovery_sync_batch < 1 || recovery_sync_batch > MAX_RECOVERY_QUEUE)
|
||||||
recovery_sync_batch = DEFAULT_RECOVERY_BATCH;
|
recovery_sync_batch = DEFAULT_RECOVERY_BATCH;
|
||||||
if (config["readonly"] == "true" || config["readonly"] == "1" || config["readonly"] == "yes")
|
print_stats_interval = config["print_stats_interval"].uint64_value();
|
||||||
readonly = true;
|
|
||||||
print_stats_interval = strtoull(config["print_stats_interval"].c_str(), NULL, 10);
|
|
||||||
if (!print_stats_interval)
|
if (!print_stats_interval)
|
||||||
print_stats_interval = 3;
|
print_stats_interval = 3;
|
||||||
slow_log_interval = strtoull(config["slow_log_interval"].c_str(), NULL, 10);
|
slow_log_interval = config["slow_log_interval"].uint64_value();
|
||||||
if (!slow_log_interval)
|
if (!slow_log_interval)
|
||||||
slow_log_interval = 10;
|
slow_log_interval = 10;
|
||||||
msgr.parse_config(json_config);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
void osd_t::bind_socket()
|
void osd_t::bind_socket()
|
||||||
|
|
|
@ -92,7 +92,7 @@ class osd_t
|
||||||
{
|
{
|
||||||
// config
|
// config
|
||||||
|
|
||||||
blockstore_config_t config;
|
json11::Json::object config;
|
||||||
int etcd_report_interval = 30;
|
int etcd_report_interval = 30;
|
||||||
|
|
||||||
bool readonly = false;
|
bool readonly = false;
|
||||||
|
@ -167,7 +167,7 @@ class osd_t
|
||||||
uint64_t recovery_stat_bytes[2][2] = { 0 };
|
uint64_t recovery_stat_bytes[2][2] = { 0 };
|
||||||
|
|
||||||
// cluster connection
|
// cluster connection
|
||||||
void parse_config(blockstore_config_t & config);
|
void parse_config(const json11::Json & config);
|
||||||
void init_cluster();
|
void init_cluster();
|
||||||
void on_change_osd_state_hook(osd_num_t peer_osd);
|
void on_change_osd_state_hook(osd_num_t peer_osd);
|
||||||
void on_change_pg_history_hook(pool_id_t pool_id, pg_num_t pg_num);
|
void on_change_pg_history_hook(pool_id_t pool_id, pg_num_t pg_num);
|
||||||
|
@ -268,7 +268,7 @@ class osd_t
|
||||||
}
|
}
|
||||||
|
|
||||||
public:
|
public:
|
||||||
osd_t(blockstore_config_t & config, ring_loop_t *ringloop);
|
osd_t(const json11::Json & config, ring_loop_t *ringloop);
|
||||||
~osd_t();
|
~osd_t();
|
||||||
void force_stop(int exitcode);
|
void force_stop(int exitcode);
|
||||||
bool shutdown();
|
bool shutdown();
|
||||||
|
|
|
@ -21,7 +21,7 @@ void osd_t::init_cluster()
|
||||||
{
|
{
|
||||||
// Test version of clustering code with 1 pool, 1 PG and 2 peers
|
// Test version of clustering code with 1 pool, 1 PG and 2 peers
|
||||||
// Example: peers = 2:127.0.0.1:11204,3:127.0.0.1:11205
|
// Example: peers = 2:127.0.0.1:11204,3:127.0.0.1:11205
|
||||||
std::string peerstr = config["peers"];
|
std::string peerstr = config["peers"].string_value();
|
||||||
while (peerstr.size())
|
while (peerstr.size())
|
||||||
{
|
{
|
||||||
int pos = peerstr.find(',');
|
int pos = peerstr.find(',');
|
||||||
|
@ -340,21 +340,10 @@ void osd_t::on_change_pg_history_hook(pool_id_t pool_id, pg_num_t pg_num)
|
||||||
|
|
||||||
void osd_t::on_load_config_hook(json11::Json::object & global_config)
|
void osd_t::on_load_config_hook(json11::Json::object & global_config)
|
||||||
{
|
{
|
||||||
blockstore_config_t osd_config = this->config;
|
json11::Json::object osd_config = this->config;
|
||||||
for (auto & cfg_var: global_config)
|
for (auto & kv: global_config)
|
||||||
{
|
if (osd_config.find(kv.first) == osd_config.end())
|
||||||
if (this->config.find(cfg_var.first) == this->config.end())
|
osd_config[kv.first] = kv.second;
|
||||||
{
|
|
||||||
if (cfg_var.second.is_string())
|
|
||||||
{
|
|
||||||
osd_config[cfg_var.first] = cfg_var.second.string_value();
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
osd_config[cfg_var.first] = cfg_var.second.dump();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
parse_config(osd_config);
|
parse_config(osd_config);
|
||||||
bind_socket();
|
bind_socket();
|
||||||
acquire_lease();
|
acquire_lease();
|
||||||
|
@ -380,7 +369,7 @@ void osd_t::acquire_lease()
|
||||||
etcd_lease_id = data["ID"].string_value();
|
etcd_lease_id = data["ID"].string_value();
|
||||||
create_osd_state();
|
create_osd_state();
|
||||||
});
|
});
|
||||||
printf("[OSD %lu] reporting to etcd at %s every %d seconds\n", this->osd_num, config["etcd_address"].c_str(), etcd_report_interval);
|
printf("[OSD %lu] reporting to etcd at %s every %d seconds\n", this->osd_num, config["etcd_address"].string_value().c_str(), etcd_report_interval);
|
||||||
tfd->set_timer(etcd_report_interval*1000, true, [this](int timer_id)
|
tfd->set_timer(etcd_report_interval*1000, true, [this](int timer_id)
|
||||||
{
|
{
|
||||||
renew_lease();
|
renew_lease();
|
||||||
|
|
|
@ -29,13 +29,13 @@ int main(int narg, char *args[])
|
||||||
perror("BUG: too small packet size");
|
perror("BUG: too small packet size");
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
blockstore_config_t config;
|
json11::Json::object config;
|
||||||
for (int i = 1; i < narg; i++)
|
for (int i = 1; i < narg; i++)
|
||||||
{
|
{
|
||||||
if (args[i][0] == '-' && args[i][1] == '-' && i < narg-1)
|
if (args[i][0] == '-' && args[i][1] == '-' && i < narg-1)
|
||||||
{
|
{
|
||||||
char *opt = args[i]+2;
|
char *opt = args[i]+2;
|
||||||
config[opt] = args[++i];
|
config[std::string(opt)] = std::string(args[++i]);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
signal(SIGINT, handle_sigint);
|
signal(SIGINT, handle_sigint);
|
||||||
|
|
|
@ -191,6 +191,9 @@ struct __attribute__((__packed__)) osd_op_rw_t
|
||||||
uint32_t flags;
|
uint32_t flags;
|
||||||
// inode metadata revision
|
// inode metadata revision
|
||||||
uint64_t meta_revision;
|
uint64_t meta_revision;
|
||||||
|
// object version for atomic "CAS" (compare-and-set) writes
|
||||||
|
// writes and deletes fail with -EINTR if object version differs from (version-1)
|
||||||
|
uint64_t version;
|
||||||
};
|
};
|
||||||
|
|
||||||
struct __attribute__((__packed__)) osd_reply_rw_t
|
struct __attribute__((__packed__)) osd_reply_rw_t
|
||||||
|
@ -199,6 +202,8 @@ struct __attribute__((__packed__)) osd_reply_rw_t
|
||||||
// for reads: bitmap length
|
// for reads: bitmap length
|
||||||
uint32_t bitmap_len;
|
uint32_t bitmap_len;
|
||||||
uint32_t pad0;
|
uint32_t pad0;
|
||||||
|
// for reads: object version
|
||||||
|
uint64_t version;
|
||||||
};
|
};
|
||||||
|
|
||||||
// sync to the primary OSD
|
// sync to the primary OSD
|
||||||
|
|
|
@ -67,7 +67,9 @@ bool osd_t::prepare_primary_rw(osd_op_t *cur_op)
|
||||||
}
|
}
|
||||||
// Find parents from the same pool. Optimized reads only work within pools
|
// Find parents from the same pool. Optimized reads only work within pools
|
||||||
while (inode_it != st_cli.inode_config.end() && inode_it->second.parent_id &&
|
while (inode_it != st_cli.inode_config.end() && inode_it->second.parent_id &&
|
||||||
INODE_POOL(inode_it->second.parent_id) == pg_it->second.pool_id)
|
INODE_POOL(inode_it->second.parent_id) == pg_it->second.pool_id &&
|
||||||
|
// Check for loops
|
||||||
|
inode_it->second.parent_id != cur_op->req.rw.inode)
|
||||||
{
|
{
|
||||||
chain_size++;
|
chain_size++;
|
||||||
inode_it = st_cli.inode_config.find(inode_it->second.parent_id);
|
inode_it = st_cli.inode_config.find(inode_it->second.parent_id);
|
||||||
|
@ -123,7 +125,10 @@ bool osd_t::prepare_primary_rw(osd_op_t *cur_op)
|
||||||
int chain_num = 0;
|
int chain_num = 0;
|
||||||
op_data->read_chain[chain_num++] = cur_op->req.rw.inode;
|
op_data->read_chain[chain_num++] = cur_op->req.rw.inode;
|
||||||
auto inode_it = st_cli.inode_config.find(cur_op->req.rw.inode);
|
auto inode_it = st_cli.inode_config.find(cur_op->req.rw.inode);
|
||||||
while (inode_it != st_cli.inode_config.end() && inode_it->second.parent_id)
|
while (inode_it != st_cli.inode_config.end() && inode_it->second.parent_id &&
|
||||||
|
INODE_POOL(inode_it->second.parent_id) == pg_it->second.pool_id &&
|
||||||
|
// Check for loops
|
||||||
|
inode_it->second.parent_id != cur_op->req.rw.inode)
|
||||||
{
|
{
|
||||||
op_data->read_chain[chain_num++] = inode_it->second.parent_id;
|
op_data->read_chain[chain_num++] = inode_it->second.parent_id;
|
||||||
inode_it = st_cli.inode_config.find(inode_it->second.parent_id);
|
inode_it = st_cli.inode_config.find(inode_it->second.parent_id);
|
||||||
|
@ -222,6 +227,7 @@ resume_2:
|
||||||
finish_op(cur_op, op_data->epipe > 0 ? -EPIPE : -EIO);
|
finish_op(cur_op, op_data->epipe > 0 ? -EPIPE : -EIO);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
cur_op->reply.rw.version = op_data->fact_ver;
|
||||||
cur_op->reply.rw.bitmap_len = op_data->pg_data_size * clean_entry_bitmap_size;
|
cur_op->reply.rw.bitmap_len = op_data->pg_data_size * clean_entry_bitmap_size;
|
||||||
if (op_data->degraded)
|
if (op_data->degraded)
|
||||||
{
|
{
|
||||||
|
@ -343,6 +349,12 @@ resume_3:
|
||||||
pg_cancel_write_queue(pg, cur_op, op_data->oid, op_data->epipe > 0 ? -EPIPE : -EIO);
|
pg_cancel_write_queue(pg, cur_op, op_data->oid, op_data->epipe > 0 ? -EPIPE : -EIO);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
// Check CAS version
|
||||||
|
if (cur_op->req.rw.version && op_data->fact_ver != (cur_op->req.rw.version-1))
|
||||||
|
{
|
||||||
|
cur_op->reply.hdr.retval = -EINTR;
|
||||||
|
goto continue_others;
|
||||||
|
}
|
||||||
// Save version override for parallel reads
|
// Save version override for parallel reads
|
||||||
pg.ver_override[op_data->oid] = op_data->fact_ver;
|
pg.ver_override[op_data->oid] = op_data->fact_ver;
|
||||||
// Submit deletes
|
// Submit deletes
|
||||||
|
@ -370,6 +382,8 @@ resume_5:
|
||||||
free_object_state(pg, &op_data->object_state);
|
free_object_state(pg, &op_data->object_state);
|
||||||
}
|
}
|
||||||
pg.total_count--;
|
pg.total_count--;
|
||||||
|
cur_op->reply.hdr.retval = 0;
|
||||||
|
continue_others:
|
||||||
osd_op_t *next_op = NULL;
|
osd_op_t *next_op = NULL;
|
||||||
auto next_it = pg.write_queue.find(op_data->oid);
|
auto next_it = pg.write_queue.find(op_data->oid);
|
||||||
if (next_it != pg.write_queue.end() && next_it->second == cur_op)
|
if (next_it != pg.write_queue.end() && next_it->second == cur_op)
|
||||||
|
@ -378,7 +392,7 @@ resume_5:
|
||||||
if (next_it != pg.write_queue.end() && next_it->first == op_data->oid)
|
if (next_it != pg.write_queue.end() && next_it->first == op_data->oid)
|
||||||
next_op = next_it->second;
|
next_op = next_it->second;
|
||||||
}
|
}
|
||||||
finish_op(cur_op, cur_op->req.rw.len);
|
finish_op(cur_op, cur_op->reply.hdr.retval);
|
||||||
if (next_op)
|
if (next_op)
|
||||||
{
|
{
|
||||||
// Continue next write to the same object
|
// Continue next write to the same object
|
||||||
|
|
|
@ -65,7 +65,10 @@ int osd_t::read_bitmaps(osd_op_t *cur_op, pg_t & pg, int base_state)
|
||||||
auto vo_it = pg.ver_override.find(cur_oid);
|
auto vo_it = pg.ver_override.find(cur_oid);
|
||||||
auto read_version = (vo_it != pg.ver_override.end() ? vo_it->second : UINT64_MAX);
|
auto read_version = (vo_it != pg.ver_override.end() ? vo_it->second : UINT64_MAX);
|
||||||
// Read bitmap synchronously from the local database
|
// Read bitmap synchronously from the local database
|
||||||
bs->read_bitmap(cur_oid, read_version, op_data->snapshot_bitmaps + chain_num*clean_entry_bitmap_size, NULL);
|
bs->read_bitmap(
|
||||||
|
cur_oid, read_version, op_data->snapshot_bitmaps + chain_num*clean_entry_bitmap_size,
|
||||||
|
!chain_num ? &cur_op->reply.rw.version : NULL
|
||||||
|
);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
|
@ -228,7 +231,10 @@ int osd_t::submit_bitmap_subops(osd_op_t *cur_op, pg_t & pg)
|
||||||
// Read bitmap synchronously from the local database
|
// Read bitmap synchronously from the local database
|
||||||
for (int j = prev; j <= i; j++)
|
for (int j = prev; j <= i; j++)
|
||||||
{
|
{
|
||||||
bs->read_bitmap((*bitmap_requests)[j].oid, (*bitmap_requests)[j].version, (*bitmap_requests)[j].bmp_buf, NULL);
|
bs->read_bitmap(
|
||||||
|
(*bitmap_requests)[j].oid, (*bitmap_requests)[j].version, (*bitmap_requests)[j].bmp_buf,
|
||||||
|
(*bitmap_requests)[j].oid.inode == cur_op->req.rw.inode ? &cur_op->reply.rw.version : NULL
|
||||||
|
);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
|
@ -264,6 +270,10 @@ int osd_t::submit_bitmap_subops(osd_op_t *cur_op, pg_t & pg)
|
||||||
for (int j = prev; j <= i; j++)
|
for (int j = prev; j <= i; j++)
|
||||||
{
|
{
|
||||||
memcpy((*bitmap_requests)[j].bmp_buf, cur_buf, clean_entry_bitmap_size);
|
memcpy((*bitmap_requests)[j].bmp_buf, cur_buf, clean_entry_bitmap_size);
|
||||||
|
if ((*bitmap_requests)[j].oid.inode == cur_op->req.rw.inode)
|
||||||
|
{
|
||||||
|
memcpy(&cur_op->reply.rw.version, cur_buf-8, 8);
|
||||||
|
}
|
||||||
cur_buf += 8 + clean_entry_bitmap_size;
|
cur_buf += 8 + clean_entry_bitmap_size;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
@ -96,6 +96,12 @@ resume_3:
|
||||||
pg_cancel_write_queue(pg, cur_op, op_data->oid, op_data->epipe > 0 ? -EPIPE : -EIO);
|
pg_cancel_write_queue(pg, cur_op, op_data->oid, op_data->epipe > 0 ? -EPIPE : -EIO);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
// Check CAS version
|
||||||
|
if (cur_op->req.rw.version && op_data->fact_ver != (cur_op->req.rw.version-1))
|
||||||
|
{
|
||||||
|
cur_op->reply.hdr.retval = -EINTR;
|
||||||
|
goto continue_others;
|
||||||
|
}
|
||||||
if (op_data->scheme == POOL_SCHEME_REPLICATED)
|
if (op_data->scheme == POOL_SCHEME_REPLICATED)
|
||||||
{
|
{
|
||||||
// Set bitmap bits
|
// Set bitmap bits
|
||||||
|
@ -265,7 +271,7 @@ continue_others:
|
||||||
next_op = next_it->second;
|
next_op = next_it->second;
|
||||||
}
|
}
|
||||||
// finish_op would invalidate next_it if it cleared pg.write_queue, but it doesn't do that :)
|
// finish_op would invalidate next_it if it cleared pg.write_queue, but it doesn't do that :)
|
||||||
finish_op(cur_op, cur_op->req.rw.len);
|
finish_op(cur_op, cur_op->reply.hdr.retval);
|
||||||
if (next_op)
|
if (next_op)
|
||||||
{
|
{
|
||||||
// Continue next write to the same object
|
// Continue next write to the same object
|
||||||
|
|
|
@ -169,11 +169,12 @@ void osd_t::exec_show_config(osd_op_t *cur_op)
|
||||||
if (req_json["connect_rdma"].is_string())
|
if (req_json["connect_rdma"].is_string())
|
||||||
{
|
{
|
||||||
// Peer is trying to connect using RDMA, try to satisfy him
|
// Peer is trying to connect using RDMA, try to satisfy him
|
||||||
bool ok = msgr.connect_rdma(cur_op->peer_fd, req_json["connect_rdma"].string_value());
|
bool ok = msgr.connect_rdma(cur_op->peer_fd, req_json["connect_rdma"].string_value(), req_json["rdma_max_msg"].uint64_value());
|
||||||
if (ok)
|
if (ok)
|
||||||
{
|
{
|
||||||
wire_config["rdma_connected"] = true;
|
auto rc = msgr.clients.at(cur_op->peer_fd)->rdma_conn;
|
||||||
wire_config["rdma_address"] = msgr.clients.at(cur_op->peer_fd)->rdma_conn->addr.to_string();
|
wire_config["rdma_address"] = rc->addr.to_string();
|
||||||
|
wire_config["rdma_max_msg"] = rc->max_msg;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
@ -26,7 +26,7 @@
|
||||||
#define qobject_unref QDECREF
|
#define qobject_unref QDECREF
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
#include "qemu_proxy.h"
|
#include "vitastor_c.h"
|
||||||
|
|
||||||
void qemu_module_dummy(void)
|
void qemu_module_dummy(void)
|
||||||
{
|
{
|
||||||
|
@ -40,6 +40,7 @@ typedef struct VitastorClient
|
||||||
{
|
{
|
||||||
void *proxy;
|
void *proxy;
|
||||||
void *watch;
|
void *watch;
|
||||||
|
char *config_path;
|
||||||
char *etcd_host;
|
char *etcd_host;
|
||||||
char *etcd_prefix;
|
char *etcd_prefix;
|
||||||
char *image;
|
char *image;
|
||||||
|
@ -47,6 +48,11 @@ typedef struct VitastorClient
|
||||||
uint64_t pool;
|
uint64_t pool;
|
||||||
uint64_t size;
|
uint64_t size;
|
||||||
long readonly;
|
long readonly;
|
||||||
|
int use_rdma;
|
||||||
|
char *rdma_device;
|
||||||
|
int rdma_port_num;
|
||||||
|
int rdma_gid_index;
|
||||||
|
int rdma_mtu;
|
||||||
QemuMutex mutex;
|
QemuMutex mutex;
|
||||||
} VitastorClient;
|
} VitastorClient;
|
||||||
|
|
||||||
|
@ -60,10 +66,11 @@ typedef struct VitastorRPC
|
||||||
} VitastorRPC;
|
} VitastorRPC;
|
||||||
|
|
||||||
static void vitastor_co_init_task(BlockDriverState *bs, VitastorRPC *task);
|
static void vitastor_co_init_task(BlockDriverState *bs, VitastorRPC *task);
|
||||||
static void vitastor_co_generic_bh_cb(long retval, void *opaque);
|
static void vitastor_co_generic_bh_cb(void *opaque, long retval);
|
||||||
|
static void vitastor_co_read_cb(void *opaque, long retval, uint64_t version);
|
||||||
static void vitastor_close(BlockDriverState *bs);
|
static void vitastor_close(BlockDriverState *bs);
|
||||||
|
|
||||||
static char *qemu_rbd_next_tok(char *src, char delim, char **p)
|
static char *qemu_vitastor_next_tok(char *src, char delim, char **p)
|
||||||
{
|
{
|
||||||
char *end;
|
char *end;
|
||||||
*p = NULL;
|
*p = NULL;
|
||||||
|
@ -82,7 +89,7 @@ static char *qemu_rbd_next_tok(char *src, char delim, char **p)
|
||||||
return src;
|
return src;
|
||||||
}
|
}
|
||||||
|
|
||||||
static void qemu_rbd_unescape(char *src)
|
static void qemu_vitastor_unescape(char *src)
|
||||||
{
|
{
|
||||||
char *p;
|
char *p;
|
||||||
for (p = src; *src; ++src, ++p)
|
for (p = src; *src; ++src, ++p)
|
||||||
|
@ -95,7 +102,8 @@ static void qemu_rbd_unescape(char *src)
|
||||||
}
|
}
|
||||||
|
|
||||||
// vitastor[:key=value]*
|
// vitastor[:key=value]*
|
||||||
// vitastor:etcd_host=127.0.0.1:inode=1:pool=1
|
// vitastor[:etcd_host=127.0.0.1]:inode=1:pool=1[:rdma_gid_index=3]
|
||||||
|
// vitastor:config_path=/etc/vitastor/vitastor.conf:image=testimg
|
||||||
static void vitastor_parse_filename(const char *filename, QDict *options, Error **errp)
|
static void vitastor_parse_filename(const char *filename, QDict *options, Error **errp)
|
||||||
{
|
{
|
||||||
const char *start;
|
const char *start;
|
||||||
|
@ -114,16 +122,22 @@ static void vitastor_parse_filename(const char *filename, QDict *options, Error
|
||||||
while (p)
|
while (p)
|
||||||
{
|
{
|
||||||
char *name, *value;
|
char *name, *value;
|
||||||
name = qemu_rbd_next_tok(p, '=', &p);
|
name = qemu_vitastor_next_tok(p, '=', &p);
|
||||||
if (!p)
|
if (!p)
|
||||||
{
|
{
|
||||||
error_setg(errp, "conf option %s has no value", name);
|
error_setg(errp, "conf option %s has no value", name);
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
qemu_rbd_unescape(name);
|
qemu_vitastor_unescape(name);
|
||||||
value = qemu_rbd_next_tok(p, ':', &p);
|
value = qemu_vitastor_next_tok(p, ':', &p);
|
||||||
qemu_rbd_unescape(value);
|
qemu_vitastor_unescape(value);
|
||||||
if (!strcmp(name, "inode") || !strcmp(name, "pool") || !strcmp(name, "size"))
|
if (!strcmp(name, "inode") ||
|
||||||
|
!strcmp(name, "pool") ||
|
||||||
|
!strcmp(name, "size") ||
|
||||||
|
!strcmp(name, "use_rdma") ||
|
||||||
|
!strcmp(name, "rdma_port_num") ||
|
||||||
|
!strcmp(name, "rdma_gid_index") ||
|
||||||
|
!strcmp(name, "rdma_mtu"))
|
||||||
{
|
{
|
||||||
unsigned long long num_val;
|
unsigned long long num_val;
|
||||||
if (parse_uint_full(value, &num_val, 0))
|
if (parse_uint_full(value, &num_val, 0))
|
||||||
|
@ -157,11 +171,6 @@ static void vitastor_parse_filename(const char *filename, QDict *options, Error
|
||||||
goto out;
|
goto out;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if (!qdict_get_str(options, "etcd_host"))
|
|
||||||
{
|
|
||||||
error_setg(errp, "etcd_host is missing");
|
|
||||||
goto out;
|
|
||||||
}
|
|
||||||
|
|
||||||
out:
|
out:
|
||||||
g_free(buf);
|
g_free(buf);
|
||||||
|
@ -175,7 +184,7 @@ static void coroutine_fn vitastor_co_get_metadata(VitastorRPC *task)
|
||||||
task->co = qemu_coroutine_self();
|
task->co = qemu_coroutine_self();
|
||||||
|
|
||||||
qemu_mutex_lock(&client->mutex);
|
qemu_mutex_lock(&client->mutex);
|
||||||
vitastor_proxy_watch_metadata(client->proxy, client->image, vitastor_co_generic_bh_cb, task);
|
vitastor_c_watch_inode(client->proxy, client->image, vitastor_co_generic_bh_cb, task);
|
||||||
qemu_mutex_unlock(&client->mutex);
|
qemu_mutex_unlock(&client->mutex);
|
||||||
|
|
||||||
while (!task->complete)
|
while (!task->complete)
|
||||||
|
@ -189,9 +198,19 @@ static int vitastor_file_open(BlockDriverState *bs, QDict *options, int flags, E
|
||||||
VitastorClient *client = bs->opaque;
|
VitastorClient *client = bs->opaque;
|
||||||
int64_t ret = 0;
|
int64_t ret = 0;
|
||||||
qemu_mutex_init(&client->mutex);
|
qemu_mutex_init(&client->mutex);
|
||||||
|
client->config_path = g_strdup(qdict_get_try_str(options, "config_path"));
|
||||||
|
// FIXME: Rename to etcd_address
|
||||||
client->etcd_host = g_strdup(qdict_get_try_str(options, "etcd_host"));
|
client->etcd_host = g_strdup(qdict_get_try_str(options, "etcd_host"));
|
||||||
client->etcd_prefix = g_strdup(qdict_get_try_str(options, "etcd_prefix"));
|
client->etcd_prefix = g_strdup(qdict_get_try_str(options, "etcd_prefix"));
|
||||||
client->proxy = vitastor_proxy_create(bdrv_get_aio_context(bs), client->etcd_host, client->etcd_prefix);
|
client->use_rdma = qdict_get_try_int(options, "use_rdma", -1);
|
||||||
|
client->rdma_device = g_strdup(qdict_get_try_str(options, "rdma_device"));
|
||||||
|
client->rdma_port_num = qdict_get_try_int(options, "rdma_port_num", 0);
|
||||||
|
client->rdma_gid_index = qdict_get_try_int(options, "rdma_gid_index", 0);
|
||||||
|
client->rdma_mtu = qdict_get_try_int(options, "rdma_mtu", 0);
|
||||||
|
client->proxy = vitastor_c_create_qemu(
|
||||||
|
(QEMUSetFDHandler*)aio_set_fd_handler, bdrv_get_aio_context(bs), client->config_path, client->etcd_host, client->etcd_prefix,
|
||||||
|
client->use_rdma, client->rdma_device, client->rdma_port_num, client->rdma_gid_index, client->rdma_mtu, 0
|
||||||
|
);
|
||||||
client->image = g_strdup(qdict_get_try_str(options, "image"));
|
client->image = g_strdup(qdict_get_try_str(options, "image"));
|
||||||
client->readonly = (flags & BDRV_O_RDWR) ? 1 : 0;
|
client->readonly = (flags & BDRV_O_RDWR) ? 1 : 0;
|
||||||
if (client->image)
|
if (client->image)
|
||||||
|
@ -210,9 +229,9 @@ static int vitastor_file_open(BlockDriverState *bs, QDict *options, int flags, E
|
||||||
}
|
}
|
||||||
BDRV_POLL_WHILE(bs, !task.complete);
|
BDRV_POLL_WHILE(bs, !task.complete);
|
||||||
client->watch = (void*)task.ret;
|
client->watch = (void*)task.ret;
|
||||||
client->readonly = client->readonly || vitastor_proxy_get_readonly(client->watch);
|
client->readonly = client->readonly || vitastor_c_inode_get_readonly(client->watch);
|
||||||
client->size = vitastor_proxy_get_size(client->watch);
|
client->size = vitastor_c_inode_get_size(client->watch);
|
||||||
if (!vitastor_proxy_get_inode_num(client->watch))
|
if (!vitastor_c_inode_get_num(client->watch))
|
||||||
{
|
{
|
||||||
error_setg(errp, "image does not exist");
|
error_setg(errp, "image does not exist");
|
||||||
vitastor_close(bs);
|
vitastor_close(bs);
|
||||||
|
@ -241,6 +260,12 @@ static int vitastor_file_open(BlockDriverState *bs, QDict *options, int flags, E
|
||||||
}
|
}
|
||||||
bs->total_sectors = client->size / BDRV_SECTOR_SIZE;
|
bs->total_sectors = client->size / BDRV_SECTOR_SIZE;
|
||||||
//client->aio_context = bdrv_get_aio_context(bs);
|
//client->aio_context = bdrv_get_aio_context(bs);
|
||||||
|
qdict_del(options, "use_rdma");
|
||||||
|
qdict_del(options, "rdma_mtu");
|
||||||
|
qdict_del(options, "rdma_gid_index");
|
||||||
|
qdict_del(options, "rdma_port_num");
|
||||||
|
qdict_del(options, "rdma_device");
|
||||||
|
qdict_del(options, "config_path");
|
||||||
qdict_del(options, "etcd_host");
|
qdict_del(options, "etcd_host");
|
||||||
qdict_del(options, "etcd_prefix");
|
qdict_del(options, "etcd_prefix");
|
||||||
qdict_del(options, "image");
|
qdict_del(options, "image");
|
||||||
|
@ -253,8 +278,11 @@ static int vitastor_file_open(BlockDriverState *bs, QDict *options, int flags, E
|
||||||
static void vitastor_close(BlockDriverState *bs)
|
static void vitastor_close(BlockDriverState *bs)
|
||||||
{
|
{
|
||||||
VitastorClient *client = bs->opaque;
|
VitastorClient *client = bs->opaque;
|
||||||
vitastor_proxy_destroy(client->proxy);
|
vitastor_c_destroy(client->proxy);
|
||||||
qemu_mutex_destroy(&client->mutex);
|
qemu_mutex_destroy(&client->mutex);
|
||||||
|
if (client->config_path)
|
||||||
|
g_free(client->config_path);
|
||||||
|
if (client->etcd_host)
|
||||||
g_free(client->etcd_host);
|
g_free(client->etcd_host);
|
||||||
if (client->etcd_prefix)
|
if (client->etcd_prefix)
|
||||||
g_free(client->etcd_prefix);
|
g_free(client->etcd_prefix);
|
||||||
|
@ -365,7 +393,7 @@ static void vitastor_co_init_task(BlockDriverState *bs, VitastorRPC *task)
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
static void vitastor_co_generic_bh_cb(long retval, void *opaque)
|
static void vitastor_co_generic_bh_cb(void *opaque, long retval)
|
||||||
{
|
{
|
||||||
VitastorRPC *task = opaque;
|
VitastorRPC *task = opaque;
|
||||||
task->ret = retval;
|
task->ret = retval;
|
||||||
|
@ -381,6 +409,11 @@ static void vitastor_co_generic_bh_cb(long retval, void *opaque)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static void vitastor_co_read_cb(void *opaque, long retval, uint64_t version)
|
||||||
|
{
|
||||||
|
vitastor_co_generic_bh_cb(opaque, retval);
|
||||||
|
}
|
||||||
|
|
||||||
static int coroutine_fn vitastor_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, QEMUIOVector *iov, int flags)
|
static int coroutine_fn vitastor_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, QEMUIOVector *iov, int flags)
|
||||||
{
|
{
|
||||||
VitastorClient *client = bs->opaque;
|
VitastorClient *client = bs->opaque;
|
||||||
|
@ -388,9 +421,9 @@ static int coroutine_fn vitastor_co_preadv(BlockDriverState *bs, uint64_t offset
|
||||||
vitastor_co_init_task(bs, &task);
|
vitastor_co_init_task(bs, &task);
|
||||||
task.iov = iov;
|
task.iov = iov;
|
||||||
|
|
||||||
uint64_t inode = client->watch ? vitastor_proxy_get_inode_num(client->watch) : client->inode;
|
uint64_t inode = client->watch ? vitastor_c_inode_get_num(client->watch) : client->inode;
|
||||||
qemu_mutex_lock(&client->mutex);
|
qemu_mutex_lock(&client->mutex);
|
||||||
vitastor_proxy_rw(0, client->proxy, inode, offset, bytes, iov->iov, iov->niov, vitastor_co_generic_bh_cb, &task);
|
vitastor_c_read(client->proxy, inode, offset, bytes, iov->iov, iov->niov, vitastor_co_read_cb, &task);
|
||||||
qemu_mutex_unlock(&client->mutex);
|
qemu_mutex_unlock(&client->mutex);
|
||||||
|
|
||||||
while (!task.complete)
|
while (!task.complete)
|
||||||
|
@ -408,9 +441,9 @@ static int coroutine_fn vitastor_co_pwritev(BlockDriverState *bs, uint64_t offse
|
||||||
vitastor_co_init_task(bs, &task);
|
vitastor_co_init_task(bs, &task);
|
||||||
task.iov = iov;
|
task.iov = iov;
|
||||||
|
|
||||||
uint64_t inode = client->watch ? vitastor_proxy_get_inode_num(client->watch) : client->inode;
|
uint64_t inode = client->watch ? vitastor_c_inode_get_num(client->watch) : client->inode;
|
||||||
qemu_mutex_lock(&client->mutex);
|
qemu_mutex_lock(&client->mutex);
|
||||||
vitastor_proxy_rw(1, client->proxy, inode, offset, bytes, iov->iov, iov->niov, vitastor_co_generic_bh_cb, &task);
|
vitastor_c_write(client->proxy, inode, offset, bytes, 0, iov->iov, iov->niov, vitastor_co_generic_bh_cb, &task);
|
||||||
qemu_mutex_unlock(&client->mutex);
|
qemu_mutex_unlock(&client->mutex);
|
||||||
|
|
||||||
while (!task.complete)
|
while (!task.complete)
|
||||||
|
@ -440,7 +473,7 @@ static int coroutine_fn vitastor_co_flush(BlockDriverState *bs)
|
||||||
vitastor_co_init_task(bs, &task);
|
vitastor_co_init_task(bs, &task);
|
||||||
|
|
||||||
qemu_mutex_lock(&client->mutex);
|
qemu_mutex_lock(&client->mutex);
|
||||||
vitastor_proxy_sync(client->proxy, vitastor_co_generic_bh_cb, &task);
|
vitastor_c_sync(client->proxy, vitastor_co_generic_bh_cb, &task);
|
||||||
qemu_mutex_unlock(&client->mutex);
|
qemu_mutex_unlock(&client->mutex);
|
||||||
|
|
||||||
while (!task.complete)
|
while (!task.complete)
|
||||||
|
@ -478,6 +511,7 @@ static QEMUOptionParameter vitastor_create_opts[] = {
|
||||||
static const char *vitastor_strong_runtime_opts[] = {
|
static const char *vitastor_strong_runtime_opts[] = {
|
||||||
"inode",
|
"inode",
|
||||||
"pool",
|
"pool",
|
||||||
|
"config_path",
|
||||||
"etcd_host",
|
"etcd_host",
|
||||||
"etcd_prefix",
|
"etcd_prefix",
|
||||||
|
|
||||||
|
|
|
@ -1,163 +0,0 @@
|
||||||
// Copyright (c) Vitaliy Filippov, 2019+
|
|
||||||
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
|
|
||||||
|
|
||||||
// C-C++ proxy for the QEMU driver
|
|
||||||
// (QEMU headers don't compile with g++)
|
|
||||||
|
|
||||||
#include <sys/epoll.h>
|
|
||||||
|
|
||||||
#include "cluster_client.h"
|
|
||||||
|
|
||||||
typedef void* AioContext;
|
|
||||||
#include "qemu_proxy.h"
|
|
||||||
|
|
||||||
extern "C"
|
|
||||||
{
|
|
||||||
// QEMU
|
|
||||||
typedef void IOHandler(void *opaque);
|
|
||||||
void aio_set_fd_handler(AioContext *ctx, int fd, int is_external, IOHandler *fd_read, IOHandler *fd_write, void *poll_fn, void *opaque);
|
|
||||||
}
|
|
||||||
|
|
||||||
struct QemuProxyData
|
|
||||||
{
|
|
||||||
int fd;
|
|
||||||
std::function<void(int, int)> callback;
|
|
||||||
};
|
|
||||||
|
|
||||||
class QemuProxy
|
|
||||||
{
|
|
||||||
std::map<int, QemuProxyData> handlers;
|
|
||||||
|
|
||||||
public:
|
|
||||||
|
|
||||||
timerfd_manager_t *tfd;
|
|
||||||
cluster_client_t *cli;
|
|
||||||
AioContext *ctx;
|
|
||||||
|
|
||||||
QemuProxy(AioContext *ctx, const char *etcd_host, const char *etcd_prefix)
|
|
||||||
{
|
|
||||||
this->ctx = ctx;
|
|
||||||
json11::Json cfg = json11::Json::object {
|
|
||||||
{ "etcd_address", std::string(etcd_host) },
|
|
||||||
{ "etcd_prefix", std::string(etcd_prefix ? etcd_prefix : "/vitastor") },
|
|
||||||
};
|
|
||||||
tfd = new timerfd_manager_t([this](int fd, bool wr, std::function<void(int, int)> callback) { set_fd_handler(fd, wr, callback); });
|
|
||||||
cli = new cluster_client_t(NULL, tfd, cfg);
|
|
||||||
}
|
|
||||||
|
|
||||||
~QemuProxy()
|
|
||||||
{
|
|
||||||
delete cli;
|
|
||||||
delete tfd;
|
|
||||||
}
|
|
||||||
|
|
||||||
void set_fd_handler(int fd, bool wr, std::function<void(int, int)> callback)
|
|
||||||
{
|
|
||||||
if (callback != NULL)
|
|
||||||
{
|
|
||||||
handlers[fd] = { .fd = fd, .callback = callback };
|
|
||||||
aio_set_fd_handler(ctx, fd, false, &QemuProxy::read_handler, wr ? &QemuProxy::write_handler : NULL, NULL, &handlers[fd]);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
handlers.erase(fd);
|
|
||||||
aio_set_fd_handler(ctx, fd, false, NULL, NULL, NULL, NULL);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
static void read_handler(void *opaque)
|
|
||||||
{
|
|
||||||
QemuProxyData *data = (QemuProxyData *)opaque;
|
|
||||||
data->callback(data->fd, EPOLLIN);
|
|
||||||
}
|
|
||||||
|
|
||||||
static void write_handler(void *opaque)
|
|
||||||
{
|
|
||||||
QemuProxyData *data = (QemuProxyData *)opaque;
|
|
||||||
data->callback(data->fd, EPOLLOUT);
|
|
||||||
}
|
|
||||||
};
|
|
||||||
|
|
||||||
extern "C" {
|
|
||||||
|
|
||||||
void* vitastor_proxy_create(AioContext *ctx, const char *etcd_host, const char *etcd_prefix)
|
|
||||||
{
|
|
||||||
QemuProxy *p = new QemuProxy(ctx, etcd_host, etcd_prefix);
|
|
||||||
return p;
|
|
||||||
}
|
|
||||||
|
|
||||||
void vitastor_proxy_destroy(void *client)
|
|
||||||
{
|
|
||||||
QemuProxy *p = (QemuProxy*)client;
|
|
||||||
delete p;
|
|
||||||
}
|
|
||||||
|
|
||||||
void vitastor_proxy_rw(int write, void *client, uint64_t inode, uint64_t offset, uint64_t len,
|
|
||||||
iovec *iov, int iovcnt, VitastorIOHandler cb, void *opaque)
|
|
||||||
{
|
|
||||||
QemuProxy *p = (QemuProxy*)client;
|
|
||||||
cluster_op_t *op = new cluster_op_t;
|
|
||||||
op->opcode = write ? OSD_OP_WRITE : OSD_OP_READ;
|
|
||||||
op->inode = inode;
|
|
||||||
op->offset = offset;
|
|
||||||
op->len = len;
|
|
||||||
for (int i = 0; i < iovcnt; i++)
|
|
||||||
{
|
|
||||||
op->iov.push_back(iov[i].iov_base, iov[i].iov_len);
|
|
||||||
}
|
|
||||||
op->callback = [cb, opaque](cluster_op_t *op)
|
|
||||||
{
|
|
||||||
cb(op->retval, opaque);
|
|
||||||
delete op;
|
|
||||||
};
|
|
||||||
p->cli->execute(op);
|
|
||||||
}
|
|
||||||
|
|
||||||
void vitastor_proxy_sync(void *client, VitastorIOHandler cb, void *opaque)
|
|
||||||
{
|
|
||||||
QemuProxy *p = (QemuProxy*)client;
|
|
||||||
cluster_op_t *op = new cluster_op_t;
|
|
||||||
op->opcode = OSD_OP_SYNC;
|
|
||||||
op->callback = [cb, opaque](cluster_op_t *op)
|
|
||||||
{
|
|
||||||
cb(op->retval, opaque);
|
|
||||||
delete op;
|
|
||||||
};
|
|
||||||
p->cli->execute(op);
|
|
||||||
}
|
|
||||||
|
|
||||||
void vitastor_proxy_watch_metadata(void *client, char *image, VitastorIOHandler cb, void *opaque)
|
|
||||||
{
|
|
||||||
QemuProxy *p = (QemuProxy*)client;
|
|
||||||
p->cli->on_ready([=]()
|
|
||||||
{
|
|
||||||
auto watch = p->cli->st_cli.watch_inode(std::string(image));
|
|
||||||
cb((long)watch, opaque);
|
|
||||||
});
|
|
||||||
}
|
|
||||||
|
|
||||||
void vitastor_proxy_close_watch(void *client, void *watch)
|
|
||||||
{
|
|
||||||
QemuProxy *p = (QemuProxy*)client;
|
|
||||||
p->cli->st_cli.close_watch((inode_watch_t*)watch);
|
|
||||||
}
|
|
||||||
|
|
||||||
uint64_t vitastor_proxy_get_size(void *watch_ptr)
|
|
||||||
{
|
|
||||||
inode_watch_t *watch = (inode_watch_t*)watch_ptr;
|
|
||||||
return watch->cfg.size;
|
|
||||||
}
|
|
||||||
|
|
||||||
uint64_t vitastor_proxy_get_inode_num(void *watch_ptr)
|
|
||||||
{
|
|
||||||
inode_watch_t *watch = (inode_watch_t*)watch_ptr;
|
|
||||||
return watch->cfg.num;
|
|
||||||
}
|
|
||||||
|
|
||||||
int vitastor_proxy_get_readonly(void *watch_ptr)
|
|
||||||
{
|
|
||||||
inode_watch_t *watch = (inode_watch_t*)watch_ptr;
|
|
||||||
return watch->cfg.readonly;
|
|
||||||
}
|
|
||||||
|
|
||||||
}
|
|
|
@ -1,34 +0,0 @@
|
||||||
// Copyright (c) Vitaliy Filippov, 2019+
|
|
||||||
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
|
|
||||||
|
|
||||||
#ifndef VITASTOR_QEMU_PROXY_H
|
|
||||||
#define VITASTOR_QEMU_PROXY_H
|
|
||||||
|
|
||||||
#ifndef POOL_ID_BITS
|
|
||||||
#define POOL_ID_BITS 16
|
|
||||||
#endif
|
|
||||||
#include <stdint.h>
|
|
||||||
#include <sys/uio.h>
|
|
||||||
|
|
||||||
#ifdef __cplusplus
|
|
||||||
extern "C" {
|
|
||||||
#endif
|
|
||||||
|
|
||||||
// Our exports
|
|
||||||
typedef void VitastorIOHandler(long retval, void *opaque);
|
|
||||||
void* vitastor_proxy_create(AioContext *ctx, const char *etcd_host, const char *etcd_prefix);
|
|
||||||
void vitastor_proxy_destroy(void *client);
|
|
||||||
void vitastor_proxy_rw(int write, void *client, uint64_t inode, uint64_t offset, uint64_t len,
|
|
||||||
struct iovec *iov, int iovcnt, VitastorIOHandler cb, void *opaque);
|
|
||||||
void vitastor_proxy_sync(void *client, VitastorIOHandler cb, void *opaque);
|
|
||||||
void vitastor_proxy_watch_metadata(void *client, char *image, VitastorIOHandler cb, void *opaque);
|
|
||||||
void vitastor_proxy_close_watch(void *client, void *watch);
|
|
||||||
uint64_t vitastor_proxy_get_size(void *watch);
|
|
||||||
uint64_t vitastor_proxy_get_inode_num(void *watch);
|
|
||||||
int vitastor_proxy_get_readonly(void *watch);
|
|
||||||
|
|
||||||
#ifdef __cplusplus
|
|
||||||
}
|
|
||||||
#endif
|
|
||||||
|
|
||||||
#endif
|
|
|
@ -87,7 +87,7 @@ public:
|
||||||
"Vitastor inode removal tool\n"
|
"Vitastor inode removal tool\n"
|
||||||
"(c) Vitaliy Filippov, 2020 (VNPL-1.1)\n\n"
|
"(c) Vitaliy Filippov, 2020 (VNPL-1.1)\n\n"
|
||||||
"USAGE:\n"
|
"USAGE:\n"
|
||||||
" %s --etcd_address <etcd_address> --pool <pool> --inode <inode> [--wait-list]\n",
|
" %s [--etcd_address <etcd_address>] --pool <pool> --inode <inode> [--wait-list]\n",
|
||||||
exe_name
|
exe_name
|
||||||
);
|
);
|
||||||
exit(0);
|
exit(0);
|
||||||
|
@ -95,11 +95,6 @@ public:
|
||||||
|
|
||||||
void run(json11::Json cfg)
|
void run(json11::Json cfg)
|
||||||
{
|
{
|
||||||
if (cfg["etcd_address"].string_value() == "")
|
|
||||||
{
|
|
||||||
fprintf(stderr, "etcd_address is missing\n");
|
|
||||||
exit(1);
|
|
||||||
}
|
|
||||||
inode = cfg["inode"].uint64_value();
|
inode = cfg["inode"].uint64_value();
|
||||||
pool_id = cfg["pool"].uint64_value();
|
pool_id = cfg["pool"].uint64_value();
|
||||||
if (pool_id)
|
if (pool_id)
|
||||||
|
|
|
@ -0,0 +1,135 @@
|
||||||
|
// Copyright (c) Vitaliy Filippov, 2019+
|
||||||
|
// License: VNPL-1.1 (see README.md for details)
|
||||||
|
|
||||||
|
#include <stdio.h>
|
||||||
|
#include <stdlib.h>
|
||||||
|
|
||||||
|
#include "epoll_manager.h"
|
||||||
|
#include "cluster_client.h"
|
||||||
|
|
||||||
|
void send_read(cluster_client_t *cli, uint64_t inode, std::function<void(int, uint64_t)> cb)
|
||||||
|
{
|
||||||
|
cluster_op_t *op = new cluster_op_t();
|
||||||
|
op->opcode = OSD_OP_READ;
|
||||||
|
op->inode = inode;
|
||||||
|
op->offset = 0;
|
||||||
|
op->len = 4096;
|
||||||
|
op->iov.push_back(malloc_or_die(op->len), op->len);
|
||||||
|
op->callback = [cb](cluster_op_t *op)
|
||||||
|
{
|
||||||
|
uint64_t version = op->version;
|
||||||
|
int retval = op->retval;
|
||||||
|
if (retval == op->len)
|
||||||
|
retval = 0;
|
||||||
|
free(op->iov.buf[0].iov_base);
|
||||||
|
delete op;
|
||||||
|
if (cb != NULL)
|
||||||
|
cb(retval, version);
|
||||||
|
};
|
||||||
|
cli->execute(op);
|
||||||
|
}
|
||||||
|
|
||||||
|
void send_write(cluster_client_t *cli, uint64_t inode, int byte, uint64_t version, std::function<void(int)> cb)
|
||||||
|
{
|
||||||
|
cluster_op_t *op = new cluster_op_t();
|
||||||
|
op->opcode = OSD_OP_WRITE;
|
||||||
|
op->inode = inode;
|
||||||
|
op->offset = 0;
|
||||||
|
op->len = 4096;
|
||||||
|
op->version = version;
|
||||||
|
op->iov.push_back(malloc_or_die(op->len), op->len);
|
||||||
|
memset(op->iov.buf[0].iov_base, byte, op->len);
|
||||||
|
op->callback = [cb](cluster_op_t *op)
|
||||||
|
{
|
||||||
|
int retval = op->retval;
|
||||||
|
if (retval == op->len)
|
||||||
|
retval = 0;
|
||||||
|
free(op->iov.buf[0].iov_base);
|
||||||
|
delete op;
|
||||||
|
if (cb != NULL)
|
||||||
|
cb(retval);
|
||||||
|
};
|
||||||
|
cli->execute(op);
|
||||||
|
}
|
||||||
|
|
||||||
|
int main(int narg, char *args[])
|
||||||
|
{
|
||||||
|
json11::Json::object cfgo;
|
||||||
|
for (int i = 1; i < narg; i++)
|
||||||
|
{
|
||||||
|
if (args[i][0] == '-' && args[i][1] == '-')
|
||||||
|
{
|
||||||
|
const char *opt = args[i]+2;
|
||||||
|
cfgo[opt] = i == narg-1 ? "1" : args[++i];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
json11::Json cfg(cfgo);
|
||||||
|
uint64_t inode = (cfg["pool_id"].uint64_value() << (64-POOL_ID_BITS))
|
||||||
|
| cfg["inode_id"].uint64_value();
|
||||||
|
uint64_t base_ver = 0;
|
||||||
|
// Create client
|
||||||
|
auto ringloop = new ring_loop_t(512);
|
||||||
|
auto epmgr = new epoll_manager_t(ringloop);
|
||||||
|
auto cli = new cluster_client_t(ringloop, epmgr->tfd, cfg);
|
||||||
|
cli->on_ready([&]()
|
||||||
|
{
|
||||||
|
send_read(cli, inode, [&](int r, uint64_t v)
|
||||||
|
{
|
||||||
|
if (r < 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Initial read operation failed\n");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
base_ver = v;
|
||||||
|
// CAS v=1 = compare with zero, non-existing object
|
||||||
|
send_write(cli, inode, 0x01, base_ver+1, [&](int r)
|
||||||
|
{
|
||||||
|
if (r < 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "CAS for non-existing object failed\n");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
// Check that read returns the new version
|
||||||
|
send_read(cli, inode, [&](int r, uint64_t v)
|
||||||
|
{
|
||||||
|
if (r < 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Read operation failed after write\n");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
if (v != base_ver+1)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Read operation failed to return the new version number\n");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
// CAS v=2 = compare with v=1, existing object
|
||||||
|
send_write(cli, inode, 0x02, base_ver+2, [&](int r)
|
||||||
|
{
|
||||||
|
if (r < 0)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "CAS for existing object failed\n");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
// CAS v=2 again = compare with v=1, but version is 2. Must fail with -EINTR
|
||||||
|
send_write(cli, inode, 0x03, base_ver+2, [&](int r)
|
||||||
|
{
|
||||||
|
if (r != -EINTR)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "CAS conflict detection failed\n");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
printf("Basic CAS test succeeded\n");
|
||||||
|
exit(0);
|
||||||
|
});
|
||||||
|
});
|
||||||
|
});
|
||||||
|
});
|
||||||
|
});
|
||||||
|
});
|
||||||
|
while (1)
|
||||||
|
{
|
||||||
|
ringloop->loop();
|
||||||
|
ringloop->wait();
|
||||||
|
}
|
||||||
|
return 0;
|
||||||
|
}
|
|
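Reviewer note: the test above appears to rely on one rule, namely that a write carrying a version is applied only when that version is exactly one more than the object's current version, and otherwise fails with `-EINTR`. The sketch below models that rule in memory so the expected sequence is easy to follow. `versioned_object_t`, `cas_write` and `run_cas_sequence` are hypothetical names, not Vitastor code, and treating version 0 as "no check" is an assumption based on the QEMU driver passing 0 as `check_version` for ordinary writes.

```cpp
#include <cerrno>
#include <cstdint>

// Hypothetical in-memory stand-in for one versioned object (not Vitastor code).
struct versioned_object_t
{
    uint64_t version = 0; // 0 means the object does not exist yet
    int fill_byte = 0;    // stands in for the object's data
};

// Assumed rule: a write carrying new_version is applied only when
// new_version == current version + 1, otherwise it fails with -EINTR.
// new_version == 0 is treated as "no version check" (assumption).
int cas_write(versioned_object_t & obj, int fill_byte, uint64_t new_version)
{
    if (new_version != 0 && new_version != obj.version + 1)
        return -EINTR;
    obj.version++;
    obj.fill_byte = fill_byte;
    return 0;
}

// Replays the sequence from the test: create, update, then a stale write.
// Returns 0 when every step behaves as the test expects.
int run_cas_sequence()
{
    versioned_object_t obj;
    uint64_t base_ver = obj.version;                 // initial read
    if (cas_write(obj, 0x01, base_ver + 1) != 0)     // CAS on a non-existing object
        return 1;
    if (obj.version != base_ver + 1)                 // read must see the new version
        return 2;
    if (cas_write(obj, 0x02, base_ver + 2) != 0)     // CAS on an existing object
        return 3;
    if (cas_write(obj, 0x03, base_ver + 2) != -EINTR) // same version again: conflict
        return 4;
    return obj.fill_byte == 0x02 ? 0 : 5;            // rejected write left data intact
}
```

In this model the third write is rejected without touching the stored data, which is the property the "CAS conflict detection" step checks.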
@ -0,0 +1,254 @@
|
||||||
|
// Copyright (c) Vitaliy Filippov, 2019+
|
||||||
|
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
|
||||||
|
|
||||||
|
// Simplified C client library for QEMU, fio and other external drivers
|
||||||
|
// Also acts as a C-C++ proxy for the QEMU driver (QEMU headers don't compile with g++)
|
||||||
|
|
||||||
|
#include <sys/epoll.h>
|
||||||
|
|
||||||
|
#include "ringloop.h"
|
||||||
|
#include "epoll_manager.h"
|
||||||
|
#include "cluster_client.h"
|
||||||
|
|
||||||
|
#include "vitastor_c.h"
|
||||||
|
|
||||||
|
struct vitastor_qemu_fd_t
|
||||||
|
{
|
||||||
|
int fd;
|
||||||
|
std::function<void(int, int)> callback;
|
||||||
|
};
|
||||||
|
|
||||||
|
struct vitastor_c
|
||||||
|
{
|
||||||
|
std::map<int, vitastor_qemu_fd_t> handlers;
|
||||||
|
ring_loop_t *ringloop = NULL;
|
||||||
|
epoll_manager_t *epmgr = NULL;
|
||||||
|
timerfd_manager_t *tfd = NULL;
|
||||||
|
cluster_client_t *cli = NULL;
|
||||||
|
|
||||||
|
QEMUSetFDHandler *aio_set_fd_handler = NULL;
|
||||||
|
void *aio_ctx = NULL;
|
||||||
|
};
|
||||||
|
|
||||||
|
extern "C" {
|
||||||
|
|
||||||
|
static json11::Json vitastor_c_common_config(const char *config_path, const char *etcd_host, const char *etcd_prefix,
|
||||||
|
int use_rdma, const char *rdma_device, int rdma_port_num, int rdma_gid_index, int rdma_mtu, int log_level)
|
||||||
|
{
|
||||||
|
json11::Json::object cfg;
|
||||||
|
if (config_path)
|
||||||
|
cfg["config_path"] = std::string(config_path);
|
||||||
|
if (etcd_host)
|
||||||
|
cfg["etcd_address"] = std::string(etcd_host);
|
||||||
|
if (etcd_prefix)
|
||||||
|
cfg["etcd_prefix"] = std::string(etcd_prefix);
|
||||||
|
// -1 means unspecified
|
||||||
|
if (use_rdma >= 0)
|
||||||
|
cfg["use_rdma"] = use_rdma > 0;
|
||||||
|
if (rdma_device)
|
||||||
|
cfg["rdma_device"] = std::string(rdma_device);
|
||||||
|
if (rdma_port_num)
|
||||||
|
cfg["rdma_port_num"] = rdma_port_num;
|
||||||
|
if (rdma_gid_index)
|
||||||
|
cfg["rdma_gid_index"] = rdma_gid_index;
|
||||||
|
if (rdma_mtu)
|
||||||
|
cfg["rdma_mtu"] = rdma_mtu;
|
||||||
|
if (log_level)
|
||||||
|
cfg["log_level"] = log_level;
|
||||||
|
return json11::Json(cfg);
|
||||||
|
}
|
||||||
|
|
||||||
|
static void vitastor_c_read_handler(void *opaque)
|
||||||
|
{
|
||||||
|
vitastor_qemu_fd_t *data = (vitastor_qemu_fd_t *)opaque;
|
||||||
|
data->callback(data->fd, EPOLLIN);
|
||||||
|
}
|
||||||
|
|
||||||
|
static void vitastor_c_write_handler(void *opaque)
|
||||||
|
{
|
||||||
|
vitastor_qemu_fd_t *data = (vitastor_qemu_fd_t *)opaque;
|
||||||
|
data->callback(data->fd, EPOLLOUT);
|
||||||
|
}
|
||||||
|
|
||||||
|
vitastor_c *vitastor_c_create_qemu(QEMUSetFDHandler *aio_set_fd_handler, void *aio_context,
|
||||||
|
const char *config_path, const char *etcd_host, const char *etcd_prefix,
|
||||||
|
bool use_rdma, const char *rdma_device, int rdma_port_num, int rdma_gid_index, int rdma_mtu, int log_level)
|
||||||
|
{
|
||||||
|
json11::Json cfg_json = vitastor_c_common_config(
|
||||||
|
config_path, etcd_host, etcd_prefix, use_rdma,
|
||||||
|
rdma_device, rdma_port_num, rdma_gid_index, rdma_mtu, log_level
|
||||||
|
);
|
||||||
|
vitastor_c *self = new vitastor_c;
|
||||||
|
self->aio_set_fd_handler = aio_set_fd_handler;
|
||||||
|
self->aio_ctx = aio_context;
|
||||||
|
self->tfd = new timerfd_manager_t([self](int fd, bool wr, std::function<void(int, int)> callback)
|
||||||
|
{
|
||||||
|
if (callback != NULL)
|
||||||
|
{
|
||||||
|
self->handlers[fd] = { .fd = fd, .callback = callback };
|
||||||
|
self->aio_set_fd_handler(self->aio_ctx, fd, false,
|
||||||
|
vitastor_c_read_handler, wr ? vitastor_c_write_handler : NULL, NULL, &self->handlers[fd]);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
self->handlers.erase(fd);
|
||||||
|
self->aio_set_fd_handler(self->aio_ctx, fd, false, NULL, NULL, NULL, NULL);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
self->cli = new cluster_client_t(NULL, self->tfd, cfg_json);
|
||||||
|
return self;
|
||||||
|
}
|
||||||
|
|
||||||
|
vitastor_c *vitastor_c_create_uring(const char *config_path, const char *etcd_host, const char *etcd_prefix,
|
||||||
|
int use_rdma, const char *rdma_device, int rdma_port_num, int rdma_gid_index, int rdma_mtu, int log_level)
|
||||||
|
{
|
||||||
|
json11::Json cfg_json = vitastor_c_common_config(
|
||||||
|
config_path, etcd_host, etcd_prefix, use_rdma,
|
||||||
|
rdma_device, rdma_port_num, rdma_gid_index, rdma_mtu, log_level
|
||||||
|
);
|
||||||
|
vitastor_c *self = new vitastor_c;
|
||||||
|
self->ringloop = new ring_loop_t(512);
|
||||||
|
self->epmgr = new epoll_manager_t(self->ringloop);
|
||||||
|
self->cli = new cluster_client_t(self->ringloop, self->epmgr->tfd, cfg_json);
|
||||||
|
return self;
|
||||||
|
}
|
||||||
|
|
||||||
|
vitastor_c *vitastor_c_create_uring_json(const char **options, int options_len)
|
||||||
|
{
|
||||||
|
json11::Json::object cfg;
|
||||||
|
for (int i = 0; i < options_len-1; i += 2)
|
||||||
|
{
|
||||||
|
cfg[options[i]] = std::string(options[i+1]);
|
||||||
|
}
|
||||||
|
json11::Json cfg_json(cfg);
|
||||||
|
vitastor_c *self = new vitastor_c;
|
||||||
|
self->ringloop = new ring_loop_t(512);
|
||||||
|
self->epmgr = new epoll_manager_t(self->ringloop);
|
||||||
|
self->cli = new cluster_client_t(self->ringloop, self->epmgr->tfd, cfg_json);
|
||||||
|
return self;
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_destroy(vitastor_c *client)
|
||||||
|
{
|
||||||
|
delete client->cli;
|
||||||
|
if (client->epmgr)
|
||||||
|
delete client->epmgr;
|
||||||
|
else
|
||||||
|
delete client->tfd;
|
||||||
|
if (client->ringloop)
|
||||||
|
delete client->ringloop;
|
||||||
|
delete client;
|
||||||
|
}
|
||||||
|
|
||||||
|
int vitastor_c_is_ready(vitastor_c *client)
|
||||||
|
{
|
||||||
|
return client->cli->is_ready();
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_uring_wait_ready(vitastor_c *client)
|
||||||
|
{
|
||||||
|
while (!client->cli->is_ready())
|
||||||
|
{
|
||||||
|
client->ringloop->loop();
|
||||||
|
if (client->cli->is_ready())
|
||||||
|
break;
|
||||||
|
client->ringloop->wait();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_uring_handle_events(vitastor_c *client)
|
||||||
|
{
|
||||||
|
client->ringloop->loop();
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_uring_wait_events(vitastor_c *client)
|
||||||
|
{
|
||||||
|
client->ringloop->wait();
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_read(vitastor_c *client, uint64_t inode, uint64_t offset, uint64_t len,
|
||||||
|
struct iovec *iov, int iovcnt, VitastorReadHandler cb, void *opaque)
|
||||||
|
{
|
||||||
|
cluster_op_t *op = new cluster_op_t;
|
||||||
|
op->opcode = OSD_OP_READ;
|
||||||
|
op->inode = inode;
|
||||||
|
op->offset = offset;
|
||||||
|
op->len = len;
|
||||||
|
for (int i = 0; i < iovcnt; i++)
|
||||||
|
{
|
||||||
|
op->iov.push_back(iov[i].iov_base, iov[i].iov_len);
|
||||||
|
}
|
||||||
|
op->callback = [cb, opaque](cluster_op_t *op)
|
||||||
|
{
|
||||||
|
cb(opaque, op->retval, op->version);
|
||||||
|
delete op;
|
||||||
|
};
|
||||||
|
client->cli->execute(op);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_write(vitastor_c *client, uint64_t inode, uint64_t offset, uint64_t len, uint64_t check_version,
|
||||||
|
struct iovec *iov, int iovcnt, VitastorIOHandler cb, void *opaque)
|
||||||
|
{
|
||||||
|
cluster_op_t *op = new cluster_op_t;
|
||||||
|
op->opcode = OSD_OP_WRITE;
|
||||||
|
op->inode = inode;
|
||||||
|
op->offset = offset;
|
||||||
|
op->len = len;
|
||||||
|
op->version = check_version;
|
||||||
|
for (int i = 0; i < iovcnt; i++)
|
||||||
|
{
|
||||||
|
op->iov.push_back(iov[i].iov_base, iov[i].iov_len);
|
||||||
|
}
|
||||||
|
op->callback = [cb, opaque](cluster_op_t *op)
|
||||||
|
{
|
||||||
|
cb(opaque, op->retval);
|
||||||
|
delete op;
|
||||||
|
};
|
||||||
|
client->cli->execute(op);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_sync(vitastor_c *client, VitastorIOHandler cb, void *opaque)
|
||||||
|
{
|
||||||
|
cluster_op_t *op = new cluster_op_t;
|
||||||
|
op->opcode = OSD_OP_SYNC;
|
||||||
|
op->callback = [cb, opaque](cluster_op_t *op)
|
||||||
|
{
|
||||||
|
cb(opaque, op->retval);
|
||||||
|
delete op;
|
||||||
|
};
|
||||||
|
client->cli->execute(op);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_watch_inode(vitastor_c *client, char *image, VitastorIOHandler cb, void *opaque)
|
||||||
|
{
|
||||||
|
client->cli->on_ready([=]()
|
||||||
|
{
|
||||||
|
auto watch = client->cli->st_cli.watch_inode(std::string(image));
|
||||||
|
cb(opaque, (long)watch);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
void vitastor_c_close_watch(vitastor_c *client, void *handle)
|
||||||
|
{
|
||||||
|
client->cli->st_cli.close_watch((inode_watch_t*)handle);
|
||||||
|
}
|
||||||
|
|
||||||
|
uint64_t vitastor_c_inode_get_size(void *handle)
|
||||||
|
{
|
||||||
|
inode_watch_t *watch = (inode_watch_t*)handle;
|
||||||
|
return watch->cfg.size;
|
||||||
|
}
|
||||||
|
|
||||||
|
uint64_t vitastor_c_inode_get_num(void *handle)
|
||||||
|
{
|
||||||
|
inode_watch_t *watch = (inode_watch_t*)handle;
|
||||||
|
return watch->cfg.num;
|
||||||
|
}
|
||||||
|
|
||||||
|
int vitastor_c_inode_get_readonly(void *handle)
|
||||||
|
{
|
||||||
|
inode_watch_t *watch = (inode_watch_t*)handle;
|
||||||
|
return watch->cfg.readonly;
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
|
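Reviewer note: `vitastor_c_create_uring_json` takes its options as a flat array of alternating keys and values, and the loop bound `i < options_len-1` means a trailing key without a value is silently dropped. The sketch below reproduces only that pairing logic with a `std::map` in place of `json11::Json::object`. `parse_flat_options` is a hypothetical helper name used for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical helper: pairs up a flat {key, value, key, value, ...} array
// the same way the loop in vitastor_c_create_uring_json does.
std::map<std::string, std::string> parse_flat_options(const char **options, int options_len)
{
    std::map<std::string, std::string> cfg;
    // Stop one short of the end so that options[i+1] is always in range;
    // a dangling key at the end is therefore ignored.
    for (int i = 0; i < options_len-1; i += 2)
    {
        cfg[options[i]] = std::string(options[i+1]);
    }
    return cfg;
}
```

Note that every value is stored as a string here, as in the diff, so numeric options such as `log_level` would arrive as `"1"` rather than `1`.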
@ -0,0 +1,55 @@
|
||||||
|
// Copyright (c) Vitaliy Filippov, 2019+
|
||||||
|
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
|
||||||
|
|
||||||
|
// Simplified C client library for QEMU, fio and other external drivers
|
||||||
|
|
||||||
|
#ifndef VITASTOR_QEMU_PROXY_H
|
||||||
|
#define VITASTOR_QEMU_PROXY_H
|
||||||
|
|
||||||
|
#ifndef POOL_ID_BITS
|
||||||
|
#define POOL_ID_BITS 16
|
||||||
|
#endif
|
||||||
|
#include <stdint.h>
|
||||||
|
#include <sys/uio.h>
|
||||||
|
|
||||||
|
#ifdef __cplusplus
|
||||||
|
extern "C" {
|
||||||
|
#endif
|
||||||
|
|
||||||
|
struct vitastor_c;
|
||||||
|
typedef struct vitastor_c vitastor_c;
|
||||||
|
|
||||||
|
typedef void VitastorReadHandler(void *opaque, long retval, uint64_t version);
|
||||||
|
typedef void VitastorIOHandler(void *opaque, long retval);
|
||||||
|
|
||||||
|
// QEMU
|
||||||
|
typedef void IOHandler(void *opaque);
|
||||||
|
typedef void QEMUSetFDHandler(void *ctx, int fd, int is_external, IOHandler *fd_read, IOHandler *fd_write, void *poll_fn, void *opaque);
|
||||||
|
|
||||||
|
vitastor_c *vitastor_c_create_qemu(QEMUSetFDHandler *aio_set_fd_handler, void *aio_context,
|
||||||
|
const char *config_path, const char *etcd_host, const char *etcd_prefix,
|
||||||
|
bool use_rdma, const char *rdma_device, int rdma_port_num, int rdma_gid_index, int rdma_mtu, int log_level);
|
||||||
|
vitastor_c *vitastor_c_create_uring(const char *config_path, const char *etcd_host, const char *etcd_prefix,
|
||||||
|
int use_rdma, const char *rdma_device, int rdma_port_num, int rdma_gid_index, int rdma_mtu, int log_level);
|
||||||
|
vitastor_c *vitastor_c_create_uring_json(const char **options, int options_len);
|
||||||
|
void vitastor_c_destroy(vitastor_c *client);
|
||||||
|
int vitastor_c_is_ready(vitastor_c *client);
|
||||||
|
void vitastor_c_uring_wait_ready(vitastor_c *client);
|
||||||
|
void vitastor_c_uring_handle_events(vitastor_c *client);
|
||||||
|
void vitastor_c_uring_wait_events(vitastor_c *client);
|
||||||
|
void vitastor_c_read(vitastor_c *client, uint64_t inode, uint64_t offset, uint64_t len,
|
||||||
|
struct iovec *iov, int iovcnt, VitastorReadHandler cb, void *opaque);
|
||||||
|
void vitastor_c_write(vitastor_c *client, uint64_t inode, uint64_t offset, uint64_t len, uint64_t check_version,
|
||||||
|
struct iovec *iov, int iovcnt, VitastorIOHandler cb, void *opaque);
|
||||||
|
void vitastor_c_sync(vitastor_c *client, VitastorIOHandler cb, void *opaque);
|
||||||
|
void vitastor_c_watch_inode(vitastor_c *client, char *image, VitastorIOHandler cb, void *opaque);
|
||||||
|
void vitastor_c_close_watch(vitastor_c *client, void *handle);
|
||||||
|
uint64_t vitastor_c_inode_get_size(void *handle);
|
||||||
|
uint64_t vitastor_c_inode_get_num(void *handle);
|
||||||
|
int vitastor_c_inode_get_readonly(void *handle);
|
||||||
|
|
||||||
|
#ifdef __cplusplus
|
||||||
|
}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
#endif
|
|
@@ -0,0 +1,43 @@
+#!/bin/bash -ex
+
+. `dirname $0`/common.sh
+
+OSD_SIZE=${OSD_SIZE:-1024}
+
+dd if=/dev/zero of=./testdata/test_osd1.bin bs=1024 count=1 seek=$((OSD_SIZE*1024-1))
+dd if=/dev/zero of=./testdata/test_osd2.bin bs=1024 count=1 seek=$((OSD_SIZE*1024-1))
+dd if=/dev/zero of=./testdata/test_osd3.bin bs=1024 count=1 seek=$((OSD_SIZE*1024-1))
+
+build/src/vitastor-osd --osd_num 1 --bind_address 127.0.0.1 $OSD_ARGS --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd1.bin 2>/dev/null) &>./testdata/osd1.log &
+OSD1_PID=$!
+build/src/vitastor-osd --osd_num 2 --bind_address 127.0.0.1 $OSD_ARGS --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd2.bin 2>/dev/null) &>./testdata/osd2.log &
+OSD2_PID=$!
+build/src/vitastor-osd --osd_num 3 --bind_address 127.0.0.1 $OSD_ARGS --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd3.bin 2>/dev/null) &>./testdata/osd3.log &
+OSD3_PID=$!
+
+cd mon
+npm install
+cd ..
+node mon/mon-main.js --etcd_url http://$ETCD_URL --etcd_prefix "/vitastor" &>./testdata/mon.log &
+MON_PID=$!
+
+if [ -n "$GLOBAL_CONF" ]; then
+    $ETCDCTL put /vitastor/config/global "$GLOBAL_CONF"
+fi
+
+$ETCDCTL put /vitastor/config/pools '{"1":{"name":"testpool","scheme":"xor","pg_size":3,"pg_minsize":2,"parity_chunks":1,"pg_count":1,"failure_domain":"osd"}}'
+
+sleep 2
+
+if ! ($ETCDCTL get /vitastor/config/pgs --print-value-only | jq -s -e '(. | length) != 0 and (.[0].items["1"]["1"].osd_set | sort) == ["1","2","3"]'); then
+    format_error "FAILED: 1 PG NOT CONFIGURED"
+fi
+
+if ! ($ETCDCTL get /vitastor/pg/state/1/1 --print-value-only | jq -s -e '(. | length) != 0 and .[0].state == ["active"]'); then
+    format_error "FAILED: 1 PG NOT UP"
+fi
+
+if ! cmp build/src/block-vitastor.so /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so; then
+    sudo rm -f /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so
+    sudo ln -s "$(realpath .)/build/src/block-vitastor.so" /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so
+fi
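The `dd` lines in the hunk above write a single 1 KiB block at block offset `OSD_SIZE*1024-1`, so each image file ends at exactly `OSD_SIZE` MiB without the whole range being written (on most filesystems the result is sparse). A minimal sketch of the same arithmetic follows; the scratch file from `mktemp` and the small default of 4 MiB are assumptions made for the demo and are not part of the test suite.

```shell
#!/bin/bash
set -e

# Same default-with-override pattern as the script, with a small demo default
OSD_SIZE=${OSD_SIZE:-4}

# Hypothetical scratch file (the real tests use ./testdata/test_osdN.bin)
img=$(mktemp)

# One 1 KiB block written at block offset OSD_SIZE*1024-1:
# the file then ends at (OSD_SIZE*1024-1+1)*1024 bytes = OSD_SIZE MiB
dd if=/dev/zero of="$img" bs=1024 count=1 seek=$((OSD_SIZE*1024-1)) 2>/dev/null

expected=$((OSD_SIZE*1024*1024))
apparent=$(( $(wc -c < "$img") ))
echo "expected=$expected apparent=$apparent"

rm -f "$img"
```

Setting `OSD_SIZE` before running the sketch changes the resulting size in the same way the callers of the shared script would.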
@@ -0,0 +1,7 @@
+#!/bin/bash -ex
+
+. `dirname $0`/run_3osds.sh
+
+build/src/test_cas --pool_id 1 --inode_id 1 --etcd_address $ETCD_URL
+
+format_green OK
@@ -1,40 +1,6 @@
 #!/bin/bash -ex
 
-. `dirname $0`/common.sh
+. `dirname $0`/run_3osds.sh
 
-dd if=/dev/zero of=./testdata/test_osd1.bin bs=1024 count=1 seek=$((1024*1024-1))
-dd if=/dev/zero of=./testdata/test_osd2.bin bs=1024 count=1 seek=$((1024*1024-1))
-dd if=/dev/zero of=./testdata/test_osd3.bin bs=1024 count=1 seek=$((1024*1024-1))
-
-build/src/vitastor-osd --osd_num 1 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd1.bin 2>/dev/null) &>./testdata/osd1.log &
-OSD1_PID=$!
-build/src/vitastor-osd --osd_num 2 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd2.bin 2>/dev/null) &>./testdata/osd2.log &
-OSD2_PID=$!
-build/src/vitastor-osd --osd_num 3 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd3.bin 2>/dev/null) &>./testdata/osd3.log &
-OSD3_PID=$!
-
-cd mon
-npm install
-cd ..
-node mon/mon-main.js --etcd_url http://$ETCD_URL --etcd_prefix "/vitastor" &>./testdata/mon.log &
-MON_PID=$!
-
-$ETCDCTL put /vitastor/config/pools '{"1":{"name":"testpool","scheme":"xor","pg_size":3,"pg_minsize":2,"parity_chunks":1,"pg_count":1,"failure_domain":"osd"}}'
-
-sleep 2
-
-if ! ($ETCDCTL get /vitastor/config/pgs --print-value-only | jq -s -e '(. | length) != 0 and (.[0].items["1"]["1"].osd_set | sort) == ["1","2","3"]'); then
-    format_error "FAILED: 1 PG NOT CONFIGURED"
-fi
-
-if ! ($ETCDCTL get /vitastor/pg/state/1/1 --print-value-only | jq -s -e '(. | length) != 0 and .[0].state == ["active"]'); then
-    format_error "FAILED: 1 PG NOT UP"
-fi
-
-if ! cmp build/src/block-vitastor.so /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so; then
-    sudo rm -f /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so
-    sudo ln -s "$(realpath .)/build/src/block-vitastor.so" /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so
-fi
-
 # Test basic write and snapshot
 
@@ -1,40 +1,8 @@
 #!/bin/bash -ex
 
-. `dirname $0`/common.sh
+OSD_SIZE=2048
 
-dd if=/dev/zero of=./testdata/test_osd1.bin bs=2048 count=1 seek=$((1024*1024-1))
-dd if=/dev/zero of=./testdata/test_osd2.bin bs=2048 count=1 seek=$((1024*1024-1))
-dd if=/dev/zero of=./testdata/test_osd3.bin bs=2048 count=1 seek=$((1024*1024-1))
-
-build/src/vitastor-osd --osd_num 1 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd1.bin 2>/dev/null) &>./testdata/osd1.log &
-OSD1_PID=$!
-build/src/vitastor-osd --osd_num 2 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd2.bin 2>/dev/null) &>./testdata/osd2.log &
-OSD2_PID=$!
-build/src/vitastor-osd --osd_num 3 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd3.bin 2>/dev/null) &>./testdata/osd3.log &
-OSD3_PID=$!
-
-cd mon
-npm install
-cd ..
-node mon/mon-main.js --etcd_url http://$ETCD_URL --etcd_prefix "/vitastor" &>./testdata/mon.log &
-MON_PID=$!
-
-$ETCDCTL put /vitastor/config/pools '{"1":{"name":"testpool","scheme":"xor","pg_size":3,"pg_minsize":2,"parity_chunks":1,"pg_count":1,"failure_domain":"osd"}}'
-
-sleep 2
-
-if ! ($ETCDCTL get /vitastor/config/pgs --print-value-only | jq -s -e '(. | length) != 0 and (.[0].items["1"]["1"].osd_set | sort) == ["1","2","3"]'); then
-    format_error "FAILED: 1 PG NOT CONFIGURED"
-fi
-
-if ! ($ETCDCTL get /vitastor/pg/state/1/1 --print-value-only | jq -s -e '(. | length) != 0 and .[0].state == ["active"]'); then
-    format_error "FAILED: 1 PG NOT UP"
-fi
-
-if ! cmp build/src/block-vitastor.so /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so; then
-    sudo rm -f /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so
-    sudo ln -s "$(realpath .)/build/src/block-vitastor.so" /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so
-fi
+. `dirname $0`/run_3osds.sh
 
 $ETCDCTL put /vitastor/config/inode/1/1 '{"name":"debian9","size":'$((2048*1024*1024))'}'
 
@@ -46,7 +14,7 @@ $ETCDCTL put /vitastor/config/inode/1/1 '{"name":"debian9@0","size":'$((2048*102
 $ETCDCTL put /vitastor/config/inode/1/2 '{"parent_id":1,"name":"debian9","size":'$((2048*1024*1024))'}'
 
 qemu-system-x86_64 -enable-kvm -m 1024 \
-    -drive 'file=vitastor:etcd_host=127.0.0.1\:$ETCD_PORT/v3:image=debian9',format=raw,if=none,id=drive-virtio-disk0,cache=none \
+    -drive 'file=vitastor:etcd_host=127.0.0.1\:'$ETCD_PORT'/v3:image=debian9',format=raw,if=none,id=drive-virtio-disk0,cache=none \
     -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,write-cache=off,physical_block_size=4096,logical_block_size=512 \
     -vnc 0.0.0.0:0
 
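The one-line change to the `-drive` argument above is a quoting fix: inside single quotes the shell does not expand `$ETCD_PORT`, so the old command passed the literal text `$ETCD_PORT` to QEMU. Closing the quote before the variable and reopening it afterwards lets the value through. A small sketch of the difference, where the port number is a made-up value for illustration only:

```shell
#!/bin/bash
# Hypothetical port value, chosen only for this demo
ETCD_PORT=12379

# Old form: the variable sits inside single quotes and stays literal
old='file=vitastor:etcd_host=127.0.0.1\:$ETCD_PORT/v3:image=debian9'

# New form: the quote is closed around the variable, so it is expanded
new='file=vitastor:etcd_host=127.0.0.1\:'$ETCD_PORT'/v3:image=debian9'

echo "old: $old"
echo "new: $new"
```

The backslash before the colon is untouched in both forms, since single quotes keep it literal.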
@@ -1,44 +1,10 @@
 #!/bin/bash -ex
 
-. `dirname $0`/common.sh
+. `dirname $0`/run_3osds.sh
 
-dd if=/dev/zero of=./testdata/test_osd1.bin bs=1024 count=1 seek=$((1024*1024-1))
-dd if=/dev/zero of=./testdata/test_osd2.bin bs=1024 count=1 seek=$((1024*1024-1))
-dd if=/dev/zero of=./testdata/test_osd3.bin bs=1024 count=1 seek=$((1024*1024-1))
-
-build/src/vitastor-osd --osd_num 1 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd1.bin 2>/dev/null) &>./testdata/osd1.log &
-OSD1_PID=$!
-build/src/vitastor-osd --osd_num 2 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd2.bin 2>/dev/null) &>./testdata/osd2.log &
-OSD2_PID=$!
-build/src/vitastor-osd --osd_num 3 --bind_address 127.0.0.1 --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd3.bin 2>/dev/null) &>./testdata/osd3.log &
-OSD3_PID=$!
-
-cd mon
-npm install
-cd ..
-node mon/mon-main.js --etcd_url http://$ETCD_URL --etcd_prefix "/vitastor" &>./testdata/mon.log &
-MON_PID=$!
-
-$ETCDCTL put /vitastor/config/pools '{"1":{"name":"testpool","scheme":"xor","pg_size":3,"pg_minsize":2,"parity_chunks":1,"pg_count":1,"failure_domain":"osd"}}'
-
-sleep 2
-
-if ! ($ETCDCTL get /vitastor/config/pgs --print-value-only | jq -s -e '(. | length) != 0 and (.[0].items["1"]["1"].osd_set | sort) == ["1","2","3"]'); then
-    format_error "FAILED: 1 PG NOT CONFIGURED"
-fi
-
-if ! ($ETCDCTL get /vitastor/pg/state/1/1 --print-value-only | jq -s -e '(. | length) != 0 and .[0].state == ["active"]'); then
-    format_error "FAILED: 1 PG NOT UP"
-fi
-
 #LD_PRELOAD=libasan.so.5 \
 #    fio -thread -name=test -ioengine=build/src/libfio_vitastor_sec.so -bs=4k -fsync=128 `$ETCDCTL get /vitastor/osd/state/1 --print-value-only | jq -r '"-host="+.addresses[0]+" -port="+(.port|tostring)'` -rw=write -size=32M
 
-if ! cmp build/src/block-vitastor.so /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so; then
-    sudo rm -f /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so
-    sudo ln -s "$(realpath .)/build/src/block-vitastor.so" /usr/lib/x86_64-linux-gnu/qemu/block-vitastor.so
-fi
-
 # A lot of parallel syncs was crashing the primary OSD at some point
 
 LD_PRELOAD=libasan.so.5 \
@@ -1,40 +1,10 @@
 #!/bin/bash -ex
 # Test the `no_same_sector_overwrites` mode
 
-. `dirname $0`/common.sh
+OSD_ARGS="--journal_no_same_sector_overwrites true --journal_sector_buffer_count 1024 --disable_data_fsync 1 --immediate_commit all"
+GLOBAL_CONF='{"immediate_commit":"all"}'
 
-dd if=/dev/zero of=./testdata/test_osd1.bin bs=1024 count=1 seek=$((1024*1024-1))
-dd if=/dev/zero of=./testdata/test_osd2.bin bs=1024 count=1 seek=$((1024*1024-1))
-dd if=/dev/zero of=./testdata/test_osd3.bin bs=1024 count=1 seek=$((1024*1024-1))
-
-NO_SAME="--journal_no_same_sector_overwrites true --journal_sector_buffer_count 1024 --disable_data_fsync 1 --immediate_commit all"
-
-build/src/vitastor-osd --osd_num 1 --bind_address 127.0.0.1 $NO_SAME --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd1.bin 2>/dev/null) &>./testdata/osd1.log &
-OSD1_PID=$!
-build/src/vitastor-osd --osd_num 2 --bind_address 127.0.0.1 $NO_SAME --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd2.bin 2>/dev/null) &>./testdata/osd2.log &
-OSD2_PID=$!
-build/src/vitastor-osd --osd_num 3 --bind_address 127.0.0.1 $NO_SAME --etcd_address $ETCD_URL $(node mon/simple-offsets.js --format options --device ./testdata/test_osd3.bin 2>/dev/null) &>./testdata/osd3.log &
-OSD3_PID=$!
-
-cd mon
-npm install
-cd ..
-node mon/mon-main.js --etcd_url http://$ETCD_URL --etcd_prefix "/vitastor" &>./testdata/mon.log &
-MON_PID=$!
-
-$ETCDCTL put /vitastor/config/global '{"immediate_commit":"all"}'
-
-$ETCDCTL put /vitastor/config/pools '{"1":{"name":"testpool","scheme":"xor","pg_size":3,"pg_minsize":2,"parity_chunks":1,"pg_count":1,"failure_domain":"osd"}}'
-
-sleep 2
-
-if ! ($ETCDCTL get /vitastor/config/pgs --print-value-only | jq -s -e '(. | length) != 0 and (.[0].items["1"]["1"].osd_set | sort) == ["1","2","3"]'); then
-    format_error "FAILED: 1 PG NOT CONFIGURED"
-fi
-
-if ! ($ETCDCTL get /vitastor/pg/state/1/1 --print-value-only | jq -s -e '(. | length) != 0 and .[0].state == ["active"]'); then
-    format_error "FAILED: 1 PG NOT UP"
-fi
+. `dirname $0`/run_3osds.sh
 
 #LSAN_OPTIONS=report_objects=true:suppressions=`pwd`/testdata/lsan-suppress.txt LD_PRELOAD=libasan.so.5 \
 #    fio -thread -name=test -ioengine=build/src/libfio_vitastor_sec.so -bs=4k -fsync=128 `$ETCDCTL get /vitastor/osd/state/1 --print-value-only | jq -r '"-host="+.addresses[0]+" -port="+(.port|tostring)'` -rw=write -size=32M
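The refactored tests configure the shared setup by assigning `OSD_SIZE`, `OSD_ARGS` and `GLOBAL_CONF` before sourcing it; unset variables fall back to a default (`${OSD_SIZE:-1024}`) or skip an optional step (`[ -n "$GLOBAL_CONF" ]`). The sketch below imitates that pattern with a stand-in function. `run_setup` is a hypothetical name used only here, and it prints what it would do instead of starting OSDs or talking to etcd.

```shell
#!/bin/bash
# run_setup is a hypothetical stand-in for sourcing the shared setup script
run_setup() {
    size=${OSD_SIZE:-1024}                     # default when the caller sets nothing
    echo "osd args: [${OSD_ARGS}] size: ${size}"
    if [ -n "$GLOBAL_CONF" ]; then             # optional step, only when configured
        echo "would put global config: $GLOBAL_CONF"
    fi
}

# Caller 1: no overrides, so the defaults apply
unset OSD_SIZE OSD_ARGS GLOBAL_CONF
defaults=$(run_setup)

# Caller 2: overrides assigned before the call, as the modified tests do
OSD_SIZE=2048
GLOBAL_CONF='{"immediate_commit":"all"}'
overridden=$(run_setup)

echo "$defaults"
echo "$overridden"
```

Keeping the overrides as plain variables means a test needs only a few assignments ahead of the `.` line, which matches the shape of the hunks above.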