Compare commits

...

26 Commits

Author SHA1 Message Date
79719e44ac Release 1.9.2
All checks were successful
Test / test_root_node (push) Successful in 8s
Test / test_dd (push) Successful in 12s
Test / test_rebalance_verify_ec (push) Successful in 1m40s
Test / test_rebalance_verify_ec_imm (push) Successful in 1m41s
Test / test_write_no_same (push) Successful in 7s
Test / test_switch_primary (push) Successful in 31s
Test / test_write (push) Successful in 30s
Test / test_write_xor (push) Successful in 36s
Test / test_heal_pg_size_2 (push) Successful in 2m15s
Test / test_heal_ec (push) Successful in 2m15s
Test / test_heal_antietcd (push) Successful in 2m16s
Test / test_heal_csum_32k_dmj (push) Successful in 2m17s
Test / test_heal_csum_32k_dj (push) Successful in 2m16s
Test / test_heal_csum_32k (push) Successful in 2m18s
Test / test_heal_csum_4k_dmj (push) Successful in 2m17s
Test / test_heal_csum_4k_dj (push) Successful in 2m17s
Test / test_osd_tags (push) Successful in 7s
Test / test_snapshot_pool2 (push) Successful in 13s
Test / test_enospc (push) Successful in 10s
Test / test_enospc_xor (push) Successful in 12s
Test / test_enospc_imm (push) Successful in 10s
Test / test_enospc_imm_xor (push) Successful in 13s
Test / test_scrub (push) Successful in 13s
Test / test_scrub_zero_osd_2 (push) Successful in 12s
Test / test_scrub_xor (push) Successful in 13s
Test / test_scrub_pg_size_3 (push) Successful in 14s
Test / test_scrub_pg_size_6_pg_minsize_4_osd_count_6_ec (push) Successful in 15s
Test / test_scrub_ec (push) Successful in 13s
Test / test_nfs (push) Successful in 11s
Test / test_heal_csum_4k (push) Successful in 2m14s
New features:
- Support resizing normal vitastor-disk partitions and moving journal/metadata: [vitastor-disk resize](https://vitastor.io/docs/usage/disk.html#resize)
- Support simple forms of vitastor-disk {dump,write}-{meta,journal} for OSD partitions

Bug fixes:
- Fix block RWX volumes broken after introducing stage/unstage support
- Do not allow creating non-block RWX volumes in CSI
- Fix vitastor-disk prepare not seeing the newly created partition in rare cases
- Fix non-array tags not showing up in ls-osd/osd-tree
- Make OpenNebula oned.conf patching during installation smarter
- Fix iseek option in vitastor-cli dd not working
- Validate conv=, iflag=, oflag= options in vitastor-cli dd
- Fix vitastor-disk write-meta not writing header checksum to the disk
- Fix JSON format in vitastor-disk dump-meta
- Fix read_chain_bitmap not working for snapshot in another pool
- Fix a possible OSD crash during parallel read & write to an image with snapshots
- Several followups to the READ_CHAIN_BITMAP fix: avoid data reads, fix possible overflow in is_zero(), fix bitmap size
2024-10-20 01:49:13 +03:00
f5626655df Add new disk command docs 2024-10-20 01:47:46 +03:00
7e2dde2702 Fix block RWX volumes broken after introducing stage/unstage support 2024-10-19 11:56:56 +03:00
3b0ab317cf Validate non-block RWX in CSI 2024-10-18 01:55:38 +03:00
18eb99c494 Implement resizing partitions created with vitastor-disk 2024-10-18 01:55:19 +03:00
4e8a1a8895 Run partprobe in add_partition() if /dev/disk/by-partuuid symlink is not present 2024-10-12 18:07:53 +03:00
d27a8bdabc Make get_parent_device return full path 2024-10-12 13:44:52 +03:00
ebd616e42f Extract clear_osd_superblock() 2024-10-12 13:44:52 +03:00
b18d296e01 Extract check_existing_partition(), get_device_size() 2024-10-12 13:44:52 +03:00
a03508320e Move json_is_true/json_is_false to json_util.cpp
All checks were successful
2024-10-12 00:40:39 +03:00
c9ccc790ec Fix non-array tags not showing up in ls-osd/osd-tree
All checks were successful
2024-10-11 18:33:35 +03:00
db2d9c5b3d Fix tables in NFS doc 2024-10-08 00:20:10 +03:00
09f15f44c9 Fix Toshiba MG and VDUSE Debian kernel note in docs 2024-10-08 00:17:14 +03:00
c5a58c2e81 Support reading parameters automatically from the superblock in vitastor-disk {dump,write}-{meta,journal}
Some checks reported warnings
Test / build and all 29 test jobs (push) Have been cancelled
2024-10-07 02:21:58 +03:00
30e7c2ad1e Add custom OpenNebula oned.conf patcher (it uses a SHITTY configuration file format) 2024-10-06 13:46:05 +03:00
2e76ceabbe Fix iseek option in vitastor-cli dd 2024-10-05 18:25:38 +03:00
3df088c207 Validate conv=, iflag=, oflag= options in vitastor-cli dd 2024-10-05 18:02:36 +03:00
d882a19eab Fix vitastor-disk write-meta not writing header checksum to the disk... 2024-10-05 17:32:55 +03:00
702be3da7a Fix JSON format in vitastor-disk dump-meta 2024-10-05 16:08:34 +03:00
99533e1c2f Fix .yml links 2024-10-02 00:38:07 +03:00
a6cceb43bf Fix read_chain_bitmap not working for snapshot in another pool
All checks were successful
2024-10-02 00:24:48 +03:00
745d89459a Fix link, add title 2024-09-29 22:05:56 +03:00
48f023292d Fix extra data reads on read_chain
All checks were successful
2024-09-21 17:05:42 +03:00
b58bf3ada5 Fix possible OSD crash during parallel read & write to an image with snapshots
All checks were successful
OSDs could crash with the following "assertion failed" message. The crash did not affect
data; it was caused by the OSD thinking that upper blocks were full when they were not.
Reproducing it without introducing artificial delays is hard, because you have to force the
OSD to read an object that has an enqueued but not yet handled write which fills a
previously non-full bitmap. O_o.

```
vitastor-osd: ./src/osd/osd_primary_chain.cpp:613: void osd_t::send_chained_read_results(pg_t&, osd_op_t*): Assertion `stripes[role].read_buf' failed.
```
2024-09-21 13:44:36 +03:00
f18a749324 READ_CHAIN fix was incomplete :-)
Some checks failed
Test / test_heal_csum_32k_dmj (push) Failing after 2m22s
2024-09-21 13:40:31 +03:00
6e9307c522 Fix possible overflow in is_zero() 2024-09-21 13:40:10 +03:00
68 changed files with 1191 additions and 325 deletions

View File

@@ -828,6 +828,24 @@ jobs:
             echo ""
           done
+  test_snapshot_pool2:
+    runs-on: ubuntu-latest
+    needs: build
+    container: ${{env.TEST_IMAGE}}:${{github.sha}}
+    steps:
+    - name: Run test
+      id: test
+      timeout-minutes: 3
+      run: /root/vitastor/tests/test_snapshot_pool2.sh
+    - name: Print logs
+      if: always() && steps.test.outcome == 'failure'
+      run: |
+        for i in /root/vitastor/testdata/*.log /root/vitastor/testdata/*.txt; do
+          echo "-------- $i --------"
+          cat $i
+          echo ""
+        done
   test_osd_tags:
     runs-on: ubuntu-latest
     needs: build

View File

@@ -2,6 +2,6 @@ cmake_minimum_required(VERSION 2.8.12)
 project(vitastor)
-set(VITASTOR_VERSION "1.9.1")
+set(VITASTOR_VERSION "1.9.2")
 add_subdirectory(src)

View File

@@ -1,4 +1,4 @@
-## Vitastor
+# Vitastor
 [Read English version](README.md)
@@ -22,7 +22,7 @@ TCP и RDMA и на хорошем железе может достигать з
 Vitastor поддерживает QEMU-драйвер, протоколы NBD и NFS, драйверы OpenStack, OpenNebula, Proxmox, Kubernetes.
 Другие драйверы могут также быть легко реализованы.
-Подробности смотрите в документации по ссылкам ниже.
+Подробности смотрите в документации по ссылкам. Можете начать отсюда: [Быстрый старт](docs/intro/quickstart.ru.md).
 ## Презентации и записи докладов
@@ -51,7 +51,7 @@ Vitastor поддерживает QEMU-драйвер, протоколы NBD и
 - Параметры
 - [Общие](docs/config/common.ru.md)
 - [Сетевые](docs/config/network.ru.md)
-- [Клиентский код](docs/config/client.en.md)
+- [Клиентский код](docs/config/client.ru.md)
 - [Глобальные дисковые параметры](docs/config/layout-cluster.ru.md)
 - [Дисковые параметры OSD](docs/config/layout-osd.ru.md)
 - [Прочие параметры OSD](docs/config/osd.ru.md)

View File

@@ -22,7 +22,7 @@ or internal systems of public clouds.
 Vitastor supports QEMU, NBD, NFS protocols, OpenStack, OpenNebula, Proxmox, Kubernetes drivers.
 More drivers may be created easily.
-Read more details below in the documentation.
+Read more details in the documentation. You can start from here: [Quick Start](docs/intro/quickstart.en.md).
 ## Talks and presentations

View File

@@ -1,4 +1,4 @@
-VITASTOR_VERSION ?= v1.9.1
+VITASTOR_VERSION ?= v1.9.2
 all: build push

View File

@@ -49,7 +49,7 @@ spec:
 capabilities:
 add: ["SYS_ADMIN"]
 allowPrivilegeEscalation: true
-image: vitalif/vitastor-csi:v1.9.1
+image: vitalif/vitastor-csi:v1.9.2
 args:
 - "--node=$(NODE_ID)"
 - "--endpoint=$(CSI_ENDPOINT)"

View File

@@ -121,7 +121,7 @@ spec:
 privileged: true
 capabilities:
 add: ["SYS_ADMIN"]
-image: vitalif/vitastor-csi:v1.9.1
+image: vitalif/vitastor-csi:v1.9.2
 args:
 - "--node=$(NODE_ID)"
 - "--endpoint=$(CSI_ENDPOINT)"

View File

@@ -3,10 +3,10 @@ module vitastor.io/csi
 go 1.15
 require (
-	github.com/container-storage-interface/spec v1.4.0
+	github.com/container-storage-interface/spec v1.8.0
 	github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
 	github.com/kubernetes-csi/csi-lib-utils v0.9.1
-	golang.org/x/net v0.0.0-20201202161906-c7110b5ffcbb
+	golang.org/x/net v0.7.0
 	golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1 // indirect
 	google.golang.org/grpc v1.33.1
 	google.golang.org/protobuf v1.24.0

View File

@@ -41,8 +41,8 @@ github.com/chzyer/logex v1.1.10/go.mod h1:+Ywpsq7O8HXn0nuIou7OrIPyXbp3wmkHB+jjWR
github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e/go.mod h1:nSuG5e5PlCu98SY8svDHJxuZscDgtXS6KTTbou5AhLI=
github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1/go.mod h1:Q3SI9o4m/ZMnBNeIyt5eFwwo7qiLfzFZmjNmxjkiQlU=
github.com/container-storage-interface/spec v1.2.0/go.mod h1:6URME8mwIBbpVyZV93Ce5St17xBiQJQY67NDsuohiy4=
github.com/container-storage-interface/spec v1.4.0 h1:ozAshSKxpJnYUfmkpZCTYyF/4MYeYlhdXbAvPvfGmkg=
github.com/container-storage-interface/spec v1.4.0/go.mod h1:6URME8mwIBbpVyZV93Ce5St17xBiQJQY67NDsuohiy4=
github.com/container-storage-interface/spec v1.8.0 h1:D0vhF3PLIZwlwZEf2eNbpujGCNwspwTYf2idJRJx4xI=
github.com/container-storage-interface/spec v1.8.0/go.mod h1:ROLik+GhPslwwWRNFF1KasPzroNARibH2rfz1rkg4H0=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
@@ -182,6 +182,7 @@ github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UV
github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4=
github.com/stretchr/testify v1.5.1 h1:nOGnQDM7FYENwehXlg/kFVnos3rEvtKTjRvOWSzb6H4=
github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
go.opencensus.io v0.21.0/go.mod h1:mSImk1erAIZhrmZN+AvHh14ztQfjbGwt4TtuofqLduU=
go.opencensus.io v0.22.0/go.mod h1:+kGneAE2xo2IficOXnaByMWTGM9T73dGwxeWcUqIpI8=
go.opencensus.io v0.22.2/go.mod h1:yxeiOL68Rb0Xd1ddK5vPZ/oVn4vY4Ynel7k9FzqtOIw=
@@ -195,6 +196,7 @@ golang.org/x/crypto v0.0.0-20190605123033-f99c8df09eb5/go.mod h1:yigFU9vqHzYiE8U
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20191206172530-e9b2fee46413/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
golang.org/x/exp v0.0.0-20190510132918-efd6b22b2522/go.mod h1:ZjyILWgesfNpC6sMxTJOJm9Kp84zZh5NQWvqDGG3Qr8=
@@ -213,6 +215,7 @@ golang.org/x/mobile v0.0.0-20190719004257-d2bd2a29d028/go.mod h1:E/iHnbuqvinMTCc
golang.org/x/mod v0.0.0-20190513183733-4bf6d317e70e/go.mod h1:mXi4GBBbnImb6dmsKGUJ2LatrhH/nqhxcFungHvyanc=
golang.org/x/mod v0.1.0/go.mod h1:0QHyrYULN0/3qlju5TqG8bIK38QM8yzMo5ekMj3DlcY=
golang.org/x/mod v0.1.1-0.20191105210325-c90efee705ee/go.mod h1:QqPTAvyqsEbceGzBzNggFXnrqF1CaUcvgkdR5Ot7KZg=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20180906233101-161cd47e91fd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20181114220301-adae6a3d119a/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
@@ -228,8 +231,10 @@ golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLL
golang.org/x/net v0.0.0-20191209160850-c0dbc17a3553/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200324143707-d3edc9973b7e/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A=
golang.org/x/net v0.0.0-20200707034311-ab3426394381/go.mod h1:/O7V0waA8r7cgGh81Ro3o1hOxt32SMVPicZroKQ2sZA=
golang.org/x/net v0.0.0-20201202161906-c7110b5ffcbb h1:eBmm0M9fYhWpKZLjQUUKka/LtIxf46G4fxeEz5KJr9U=
golang.org/x/net v0.0.0-20201202161906-c7110b5ffcbb/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
golang.org/x/net v0.7.0 h1:rJrUqqhjsgNp7KqAIc25s9pZnjU7TUcSY7HcVZjdn1g=
golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U=
golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
@@ -240,6 +245,7 @@ golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJ
golang.org/x/sync v0.0.0-20190227155943-e225da77a7e6/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20180909124046-d0be0721c37e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20181116152217-5ac8a444bdc5/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
@@ -259,13 +265,22 @@ golang.org/x/sys v0.0.0-20200302150141-5c8b2ff67527/go.mod h1:h1NjWce9XRLGQEsW7w
golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200615200032-f1bc736245b1/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200622214017-ed371f2e16b4/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f h1:+Nyd8tzPX9R7BWHguqsrbFdRx3WQ/1ib8I44HXV5yTA=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.5.0 h1:MUK/U/4lj1t1oPg0HfuXDN/Z1wv31ZJ/YcPiGccS4DU=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.1-0.20180807135948-17ff2d5776d2/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk=
golang.org/x/text v0.3.3 h1:cokOdA+Jmi5PJGXLlLllQSgYigAEfHXJAERHVMaCc2k=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.7.0 h1:4BRB4x83lYWy72KwLD/qYDuTu7q9PjSagHvijDw7cLo=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20191024005414-555d28b269f0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
@@ -286,8 +301,10 @@ golang.org/x/tools v0.0.0-20190628153133-6cdbf07be9d0/go.mod h1:/rFqwRUd4F7ZHNgw
golang.org/x/tools v0.0.0-20190816200558-6889da9d5479/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20190911174233-4f2ddba30aff/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191012152004-8de300cfc20a/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191125144606-a911d9008d1f/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20191227053925-7b8e75db28f4/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=

View File

@@ -5,7 +5,7 @@ package vitastor
 const (
     vitastorCSIDriverName = "csi.vitastor.io"
-    vitastorCSIDriverVersion = "1.9.1"
+    vitastorCSIDriverVersion = "1.9.2"
 )
 // Config struct fills the parameters of request or user input

View File

@@ -158,6 +158,12 @@ func (cs *ControllerServer) CreateVolume(ctx context.Context, req *csi.CreateVol
         return nil, status.Error(codes.InvalidArgument, "volume capabilities is a required field")
     }
+    err := cs.checkCaps(volumeCapabilities)
+    if (err != nil)
+    {
+        return nil, err
+    }
     etcdVolumePrefix := req.Parameters["etcdVolumePrefix"]
     poolId, _ := strconv.ParseUint(req.Parameters["poolId"], 10, 64)
     if (poolId == 0)
@@ -301,13 +307,44 @@ func (cs *ControllerServer) ValidateVolumeCapabilities(ctx context.Context, req
         return nil, status.Error(codes.InvalidArgument, "volumeCapabilities is nil")
     }
+    err := cs.checkCaps(volumeCapabilities)
+    if (err != nil)
+    {
+        return nil, err
+    }
+    return &csi.ValidateVolumeCapabilitiesResponse{
+        Confirmed: &csi.ValidateVolumeCapabilitiesResponse_Confirmed{
+            VolumeCapabilities: req.VolumeCapabilities,
+        },
+    }, nil
+}
+func (cs *ControllerServer) checkCaps(volumeCapabilities []*csi.VolumeCapability) error
+{
     var volumeCapabilityAccessModes []*csi.VolumeCapability_AccessMode
     for _, mode := range []csi.VolumeCapability_AccessMode_Mode{
         csi.VolumeCapability_AccessMode_SINGLE_NODE_WRITER,
-        csi.VolumeCapability_AccessMode_MULTI_NODE_MULTI_WRITER,
+        csi.VolumeCapability_AccessMode_SINGLE_NODE_READER_ONLY,
+        csi.VolumeCapability_AccessMode_MULTI_NODE_READER_ONLY,
+        csi.VolumeCapability_AccessMode_SINGLE_NODE_SINGLE_WRITER,
+        csi.VolumeCapability_AccessMode_SINGLE_NODE_MULTI_WRITER,
     } {
         volumeCapabilityAccessModes = append(volumeCapabilityAccessModes, &csi.VolumeCapability_AccessMode{Mode: mode})
     }
+    for _, capability := range volumeCapabilities
+    {
+        if (capability.GetBlock() != nil)
+        {
+            for _, mode := range []csi.VolumeCapability_AccessMode_Mode{
+                csi.VolumeCapability_AccessMode_MULTI_NODE_SINGLE_WRITER,
+                csi.VolumeCapability_AccessMode_MULTI_NODE_MULTI_WRITER,
+            } {
+                volumeCapabilityAccessModes = append(volumeCapabilityAccessModes, &csi.VolumeCapability_AccessMode{Mode: mode})
+            }
+            break
+        }
+    }
     capabilitySupport := false
     for _, capability := range volumeCapabilities
@@ -323,14 +360,10 @@ func (cs *ControllerServer) ValidateVolumeCapabilities(ctx context.Context, req
     if (!capabilitySupport)
     {
-        return nil, status.Errorf(codes.NotFound, "%v not supported", req.GetVolumeCapabilities())
+        return status.Errorf(codes.NotFound, "%v not supported", volumeCapabilities)
     }
-    return &csi.ValidateVolumeCapabilitiesResponse{
-        Confirmed: &csi.ValidateVolumeCapabilitiesResponse_Confirmed{
-            VolumeCapabilities: req.VolumeCapabilities,
-        },
-    }, nil
+    return nil
 }
 // ListVolumes returns a list of volumes
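
The access-mode rule that `checkCaps` introduces in the diff above can be illustrated with a reduced, self-contained model. This is only a sketch under stated assumptions: `AccessMode`, `Capability` and `errUnsupported` are hypothetical stand-ins for the CSI spec types, and the final "accept if at least one requested mode is allowed" loop is an assumption, because that part of the function is not visible in the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// AccessMode is a simplified stand-in for csi.VolumeCapability_AccessMode_Mode.
type AccessMode int

const (
	SingleNodeWriter AccessMode = iota
	SingleNodeReaderOnly
	MultiNodeReaderOnly
	SingleNodeSingleWriter
	SingleNodeMultiWriter
	MultiNodeSingleWriter
	MultiNodeMultiWriter
)

// Capability is a hypothetical, reduced model of csi.VolumeCapability:
// only the access mode and the block/filesystem distinction are kept.
type Capability struct {
	Mode    AccessMode
	IsBlock bool
}

var errUnsupported = errors.New("requested capabilities not supported")

// checkCaps builds the set of allowed access modes the same way the diff does:
// a base set for every volume, plus the multi-node writer modes only when
// at least one requested capability is a block volume.
func checkCaps(caps []Capability) error {
	allowed := map[AccessMode]bool{
		SingleNodeWriter:       true,
		SingleNodeReaderOnly:   true,
		MultiNodeReaderOnly:    true,
		SingleNodeSingleWriter: true,
		SingleNodeMultiWriter:  true,
	}
	for _, c := range caps {
		if c.IsBlock {
			allowed[MultiNodeSingleWriter] = true
			allowed[MultiNodeMultiWriter] = true
			break
		}
	}
	// Assumption: one matching capability is enough to accept the request.
	for _, c := range caps {
		if allowed[c.Mode] {
			return nil
		}
	}
	return errUnsupported
}

func main() {
	// RWX on a block volume is accepted, RWX on a filesystem volume is rejected.
	fmt.Println(checkCaps([]Capability{{Mode: MultiNodeMultiWriter, IsBlock: true}}))
	fmt.Println(checkCaps([]Capability{{Mode: MultiNodeMultiWriter, IsBlock: false}}))
}
```

Keeping the check in one helper means `CreateVolume` and `ValidateVolumeCapabilities` reject the same requests, which matches how both call `cs.checkCaps` in the diff.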

View File

@@ -228,6 +228,26 @@ func (ns *NodeServer) NodeStageVolume(ctx context.Context, req *csi.NodeStageVol
     // Check that it's not already mounted
     _, err = mount.IsNotMountPoint(ns.mounter, targetPath)
+    if (err == nil)
+    {
+        var finfo os.FileInfo
+        finfo, err = os.Stat(targetPath)
+        if (err != nil)
+        {
+            klog.Errorf("failed to stat %s: %v", targetPath, err)
+            return nil, err
+        }
+        if (finfo.IsDir() != (!isBlock))
+        {
+            err = os.Remove(targetPath)
+            if (err != nil)
+            {
+                klog.Errorf("failed to remove %s (to recreate it with correct type): %v", targetPath, err)
+                return nil, err
+            }
+            err = os.ErrNotExist
+        }
+    }
     if (err != nil)
     {
         if (os.IsNotExist(err))
@@ -385,7 +405,7 @@ func (ns *NodeServer) NodeUnstageVolume(ctx context.Context, req *csi.NodeUnstag
     defer ns.unlockVolume(ctxVars["configPath"]+":"+volName)
     targetPath := req.GetStagingTargetPath()
-    devicePath, refCount, err := mount.GetDeviceNameFromMount(ns.mounter, targetPath)
+    devicePath, _, err := mount.GetDeviceNameFromMount(ns.mounter, targetPath)
     if (err != nil)
     {
         if (os.IsNotExist(err))
@@ -402,6 +422,16 @@ func (ns *NodeServer) NodeUnstageVolume(ctx context.Context, req *csi.NodeUnstag
         return &csi.NodeUnstageVolumeResponse{}, nil
     }
+    refList, err := ns.mounter.GetMountRefs(targetPath)
+    if (err != nil)
+    {
+        return nil, err
+    }
+    if (len(refList) > 0)
+    {
+        klog.Warningf("%s is still referenced: %v", targetPath, refList)
+    }
     // unmount
     err = mount.CleanupMountPoint(targetPath, ns.mounter, false)
     if (err != nil)
@@ -410,7 +440,7 @@ func (ns *NodeServer) NodeUnstageVolume(ctx context.Context, req *csi.NodeUnstag
     }
     // unmap device
-    if (refCount == 1)
+    if (len(refList) == 0)
     {
         if (!ns.useVduse)
         {

debian/changelog

@@ -1,4 +1,4 @@
vitastor (1.9.1-1) unstable; urgency=medium
vitastor (1.9.2-1) unstable; urgency=medium
* Bugfixes


@@ -106,8 +106,8 @@ SSD cache or "media-cache" - for example, a lot of Seagate EXOS drives have
it (they have internal SSD cache even though it's not stated in datasheets).
Setting this parameter to "all" or "small" in OSD parameters requires enabling
[disable_journal_fsync](layout-osd.en.yml#disable_journal_fsync) and
[disable_meta_fsync](layout-osd.en.yml#disable_meta_fsync), setting it to
"all" also requires enabling [disable_data_fsync](layout-osd.en.yml#disable_data_fsync).
[disable_journal_fsync](layout-osd.en.md#disable_journal_fsync) and
[disable_meta_fsync](layout-osd.en.md#disable_meta_fsync), setting it to
"all" also requires enabling [disable_data_fsync](layout-osd.en.md#disable_data_fsync).
vitastor-disk tries to do that by default, first checking/disabling drive cache.
If it can't disable drive cache, OSDs get initialized with "none".


@@ -112,6 +112,6 @@ HDD-дисках с внутренним SSD или "медиа" кэшем - н
указано в спецификациях).
Указание "all" или "small" в настройках / командной строке OSD требует
включения [disable_journal_fsync](layout-osd.ru.yml#disable_journal_fsync) и
[disable_meta_fsync](layout-osd.ru.yml#disable_meta_fsync), значение "all"
также требует включения [disable_data_fsync](layout-osd.ru.yml#disable_data_fsync).
включения [disable_journal_fsync](layout-osd.ru.md#disable_journal_fsync) и
[disable_meta_fsync](layout-osd.ru.md#disable_meta_fsync), значение "all"
также требует включения [disable_data_fsync](layout-osd.ru.md#disable_data_fsync).


@@ -55,7 +55,7 @@ Examples:
OSD placement tree is set in a separate etcd key `/vitastor/config/node_placement`
in the following JSON format:
`
```
{
"<node name or OSD number>": {
"level": "<level>",
@@ -63,7 +63,7 @@ in the following JSON format:
},
...
}
`
```
Here, if a node name is a number then it is assumed to refer to an OSD.
Level of the OSD is always "osd" and cannot be overridden. You may only


@@ -54,7 +54,7 @@
Дерево размещения OSD задаётся в отдельном ключе etcd `/vitastor/config/node_placement`
в следующем JSON-формате:
`
```
{
"<имя узла или номер OSD>": {
"level": "<уровень>",
@@ -62,7 +62,7 @@
},
...
}
`
```
Здесь, если название узла - число, считается, что это OSD. Уровень OSD
всегда равен "osd" и не может быть переопределён. Для OSD вы можете только


@@ -97,9 +97,9 @@
it (they have internal SSD cache even though it's not stated in datasheets).
Setting this parameter to "all" or "small" in OSD parameters requires enabling
[disable_journal_fsync](layout-osd.en.yml#disable_journal_fsync) and
[disable_meta_fsync](layout-osd.en.yml#disable_meta_fsync), setting it to
"all" also requires enabling [disable_data_fsync](layout-osd.en.yml#disable_data_fsync).
[disable_journal_fsync](layout-osd.en.md#disable_journal_fsync) and
[disable_meta_fsync](layout-osd.en.md#disable_meta_fsync), setting it to
"all" also requires enabling [disable_data_fsync](layout-osd.en.md#disable_data_fsync).
vitastor-disk tries to do that by default, first checking/disabling drive cache.
If it can't disable drive cache, OSDs get initialized with "none".
info_ru: |
@@ -156,6 +156,6 @@
указано в спецификациях).
Указание "all" или "small" в настройках / командной строке OSD требует
включения [disable_journal_fsync](layout-osd.ru.yml#disable_journal_fsync) и
[disable_meta_fsync](layout-osd.ru.yml#disable_meta_fsync), значение "all"
также требует включения [disable_data_fsync](layout-osd.ru.yml#disable_data_fsync).
включения [disable_journal_fsync](layout-osd.ru.md#disable_journal_fsync) и
[disable_meta_fsync](layout-osd.ru.md#disable_meta_fsync), значение "all"
также требует включения [disable_data_fsync](layout-osd.ru.md#disable_data_fsync).


@@ -4,6 +4,8 @@
[Читать на русском](opennebula.ru.md)
# OpenNebula
## Automatic Installation
OpenNebula plugin is packaged as `vitastor-opennebula` Debian and RPM package since Vitastor 1.9.0. So:


@@ -4,6 +4,8 @@
[Read in English](opennebula.en.md)
# OpenNebula
## Автоматическая установка
Плагин OpenNebula Vitastor распространяется как Debian и RPM пакет `vitastor-opennebula`, начиная с версии Vitastor 1.9.0. Так что:


@@ -22,7 +22,7 @@
использовать и десктопные SSD, включив режим отложенного fsync, но производительность будет хуже.
О конденсаторах читайте [здесь](../config/layout-cluster.ru.md#immediate_commit).
- Если хотите использовать HDD, берите современные модели с Media или SSD кэшем - HGST Ultrastar,
Toshiba MG08, Seagate EXOS или что-то похожее. Если такого кэша у ваших дисков нет,
Toshiba MG, Seagate EXOS или что-то похожее. Если такого кэша у ваших дисков нет,
обязательно возьмите SSD под метаданные и журнал (маленькие, буквально 2 ГБ на 1 ТБ HDD-места).
- Возьмите быструю сеть, минимум 10 гбит/с. Идеал - что-то вроде Mellanox ConnectX-4 с RoCEv2.
- Для лучшей производительности отключите энергосбережение CPU: `cpupower idle-set -D 0 && cpupower frequency-set -g performance`.
@@ -33,7 +33,7 @@
- SATA SSD: Micron 5100/5200/5300/5400, Samsung PM863/PM883/PM893, Intel D3-S4510/4520/4610/4620, Kingston DC500M
- NVMe: Micron 9100/9200/9300/9400, Micron 7300/7450, Samsung PM983/PM9A3, Samsung PM1723/1735/1743,
Intel DC-P3700/P4500/P4600, Intel D7-P5500/P5600, Intel Optane, Kingston DC1000B/DC1500M
- HDD: HGST Ultrastar, Toshiba MG06/MG07/MG08, Seagate EXOS
- HDD: HGST Ultrastar, Toshiba MG, Seagate EXOS
## Настройте мониторы


@@ -13,6 +13,7 @@ It supports the following commands:
- [prepare](#prepare)
- [upgrade-simple](#upgrade-simple)
- [resize](#resize)
- [raw-resize](#raw-resize)
- [start/stop/restart/enable/disable](#start/stop/restart/enable/disable)
- [purge](#purge)
- [read-sb](#read-sb)
@@ -127,25 +128,49 @@ Requires the `sfdisk` utility.
## resize
`vitastor-disk resize <ALL_OSD_PARAMETERS> <NEW_LAYOUT> [--iodepth 32]`
`vitastor-disk resize <osd_num>|<osd_device> [OPTIONS]`
Resize data area and/or rewrite/move journal and metadata.
Resize data area and/or move journal and metadata:
| <!-- --> | <!-- --> |
|---------------------------|----------------------------------------|
| `--move-journal TARGET` | move journal to `TARGET` |
| `--move-meta TARGET` | move metadata to `TARGET` |
| `--journal-size NEW_SIZE` | resize journal to `NEW_SIZE` |
| `--data-size NEW_SIZE` | resize data device to `NEW_SIZE` |
| `--dry-run` | only show new layout, do not apply it |
`NEW_SIZE` may include k/m/g/t suffixes.
`TARGET` may be one of:
| <!-- --> | <!-- --> |
|----------------|--------------------------------------------------------------------------|
| `<partition>` | move journal/metadata to an existing GPT partition |
| `<raw_device>` | create a GPT partition on `<raw_device>` and move journal/metadata to it |
| `""` | (empty string) move journal/metadata back to the data device |
## raw-resize
`vitastor-disk raw-resize <ALL_OSD_PARAMETERS> <NEW_LAYOUT> [--iodepth 32]`
Resize data area and/or rewrite/move journal and metadata (manual format).
`ALL_OSD_PARAMETERS` must include all (at least all disk-related)
parameters from OSD command line (i.e. from systemd unit or superblock).
`NEW_LAYOUT` may include new disk layout parameters:
```
--new_data_offset SIZE resize data area so it starts at SIZE
--new_data_len SIZE resize data area to SIZE bytes
--new_meta_device PATH use PATH for new metadata
--new_meta_offset SIZE make new metadata area start at SIZE
--new_meta_len SIZE make new metadata area SIZE bytes long
--new_journal_device PATH use PATH for new journal
--new_journal_offset SIZE make new journal area start at SIZE
--new_journal_len SIZE make new journal area SIZE bytes long
```
| <!-- --> | <!-- --> |
|-----------------------------|-------------------------------------------|
| `--new_data_offset SIZE` | resize data area so it starts at `SIZE` |
| `--new_data_len SIZE` | resize data area to `SIZE` bytes |
| `--new_meta_device PATH` | use `PATH` for new metadata |
| `--new_meta_offset SIZE` | make new metadata area start at `SIZE` |
| `--new_meta_len SIZE` | make new metadata area `SIZE` bytes long |
| `--new_journal_device PATH` | use `PATH` for new journal |
| `--new_journal_offset SIZE` | make new journal area start at `SIZE` |
| `--new_journal_len SIZE` | make new journal area `SIZE` bytes long |
SIZE may include k/m/g/t suffixes. If any of the new layout parameter
options are not specified, old values will be used.
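As an illustration of the k/m/g/t suffix convention mentioned above, here is a minimal Python sketch of a size parser. It is not vitastor-disk's own parser: the function name `parse_size`, the case-insensitive matching, and the use of binary multiples (1k = 1024 bytes) are assumptions made for the example.

```python
import re

# Assumed binary multipliers; the real tool may interpret suffixes differently
_MULTIPLIERS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text):
    """Convert a string such as '32m' or '4096' to a number of bytes."""
    m = re.fullmatch(r"(\d+)([kmgt]?)", text.strip().lower())
    if not m:
        raise ValueError("invalid size: " + repr(text))
    return int(m.group(1)) * _MULTIPLIERS[m.group(2)]


print(parse_size("32m"))
```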
@@ -217,10 +242,14 @@ Intended for use from startup scripts (i.e. from systemd units).
## dump-journal
`vitastor-disk dump-journal [OPTIONS] <osd_device>`
`vitastor-disk dump-journal [OPTIONS] <journal_file> <journal_block_size> <offset> <size>`
Dump journal in human-readable or JSON (if `--json` is specified) format.
You can specify any OSD device (data, metadata or journal), or the layout manually.
Options:
```
@@ -233,23 +262,35 @@ Options:
## write-journal
`vitastor-disk write-journal <osd_device>`
`vitastor-disk write-journal <journal_file> <journal_block_size> <bitmap_size> <offset> <size>`
Write journal from JSON taken from standard input in the same format as produced by
`dump-journal --json --format data`.
You can specify any OSD device (data, metadata or journal), or the layout manually.
## dump-meta
`vitastor-disk dump-meta <osd_device>`
`vitastor-disk dump-meta <meta_file> <meta_block_size> <offset> <size>`
Dump metadata in JSON format.
You can specify any OSD device (data, metadata or journal), or the layout manually.
## write-meta
`vitastor-disk write-meta <osd_device>`
`vitastor-disk write-meta <meta_file> <offset> <size>`
Write metadata from JSON taken from standard input in the same format as produced by `dump-meta`.
You can specify any OSD device (data, metadata or journal), or the layout manually.
## simple-offsets
`vitastor-disk simple-offsets <device>`


@@ -13,6 +13,7 @@ vitastor-disk - инструмент командной строки для уп
- [prepare](#prepare)
- [upgrade-simple](#upgrade-simple)
- [resize](#resize)
- [raw-resize](#raw-resize)
- [start/stop/restart/enable/disable](#start/stop/restart/enable/disable)
- [purge](#purge)
- [read-sb](#read-sb)
@@ -129,27 +130,51 @@ throttle_target_mbs, throttle_target_parallelism, throttle_threshold_us.
## resize
`vitastor-disk resize <ALL_OSD_PARAMETERS> <NEW_LAYOUT> [--iodepth 32]`
`vitastor-disk resize <osd_num>|<osd_device> [OPTIONS]`
Изменить размер области данных и/или переместить журнал и метаданные.
Изменить размер области данных и/или переместить журнал и метаданные:
В `ALL_OSD_PARAMETERS` нужно указать все относящиеся к диску параметры OSD
| <!-- --> | <!-- --> |
|-------------------------------|------------------------------------------------|
| `--move-journal ЦЕЛЬ` | переместить журнал на `ЦЕЛЬ` |
| `--move-meta ЦЕЛЬ` | переместить метаданные на `ЦЕЛЬ` |
| `--journal-size НОВЫЙ_РАЗМЕР` | изменить размер журнала на `НОВЫЙ_РАЗМЕР` |
| `--data-size НОВЫЙ_РАЗМЕР` | изменить размер диска данных на `НОВЫЙ_РАЗМЕР` |
| `--dry-run` | показать новые параметры, но не применять их |
`НОВЫЙ_РАЗМЕР` может быть указан с суффиксами k/m/g/t (кило/мега/гига/терабайт).
`ЦЕЛЬ` может быть одним из:
| <!-- --> | <!-- --> |
|-----------------|-------------------------------------------------------------------------------------|
| `<раздел>` | переместить журнал/метаданные на существующий GPT-раздел |
| `<полный_диск>` | создать GPT-раздел на диске `<полный_диск>` и переместить журнал/метаданные на него |
| `""` | (пустая строка) переместить журнал/метаданные обратно на диск данных |
## raw-resize
`vitastor-disk raw-resize <ВСЕ_ПАРАМЕТРЫ_OSD> <НОВЫЕ_РАЗМЕРЫ> [--iodepth 32]`
Изменить размер области данных и/или переместить журнал и метаданные (ручной формат).
В `ВСЕ_ПАРАМЕТРЫ_OSD` нужно указать все относящиеся к диску параметры OSD
из суперблока OSD или из файла сервиса systemd (в старых версиях).
В `NEW_LAYOUT` нужно указать новые параметры расположения данных:
В `НОВЫЕ_РАЗМЕРЫ` нужно указать новые параметры расположения данных:
```
--new_data_offset РАЗМЕР сдвинуть начало области данных на РАЗМЕР байт
--new_data_len РАЗМЕР изменить размер области данных до РАЗМЕР байт
--new_meta_device ПУТЬ использовать ПУТЬ как новое устройство метаданных
--new_meta_offset РАЗМЕР разместить новые метаданные по смещению РАЗМЕР байт
--new_meta_len РАЗМЕР сделать новые метаданные размером РАЗМЕР байт
--new_journal_device ПУТЬ использовать ПУТЬ как новое устройство журнала
--new_journal_offset РАЗМЕР разместить новый журнал по смещению РАЗМЕР байт
--new_journal_len РАЗМЕР сделать новый журнал размером РАЗМЕР байт
```
| <!-- --> | <!-- --> |
|-------------------------------|-------------------------------------------------------|
| `--new_data_offset РАЗМЕР` | сдвинуть начало области данных на `РАЗМЕР` байт |
| `--new_data_len РАЗМЕР` | изменить размер области данных до `РАЗМЕР` байт |
| `--new_meta_device ПУТЬ` | использовать `ПУТЬ` как новое устройство метаданных |
| `--new_meta_offset РАЗМЕР` | разместить новые метаданные по смещению `РАЗМЕР` байт |
| `--new_meta_len РАЗМЕР` | сделать новые метаданные размером `РАЗМЕР` байт |
| `--new_journal_device ПУТЬ` | использовать `ПУТЬ` как новое устройство журнала |
| `--new_journal_offset РАЗМЕР` | разместить новый журнал по смещению `РАЗМЕР` байт |
| `--new_journal_len РАЗМЕР` | сделать новый журнал размером `РАЗМЕР` байт |
РАЗМЕР может быть указан с суффиксами k/m/g/t. Если любой из новых параметров
`РАЗМЕР` может быть указан с суффиксами k/m/g/t. Если любой из новых параметров
расположения не указан, он принимается равным старому значению.
## start/stop/restart/enable/disable
@@ -224,10 +249,15 @@ OSD отключены fsync-и.
## dump-journal
`vitastor-disk dump-journal <osd_device>`
`vitastor-disk dump-journal [OPTIONS] <journal_file> <journal_block_size> <offset> <size>`
Вывести журнал в человекочитаемом или в JSON (с опцией `--json`) виде.
Вы можете указать любой раздел OSD - данных, журнала или метаданных - либо указать все
параметры расположения вручную.
Опции:
```
@@ -240,22 +270,37 @@ OSD отключены fsync-и.
## write-journal
`vitastor-disk write-journal <osd_device>`
`vitastor-disk write-journal <journal_file> <journal_block_size> <bitmap_size> <offset> <size>`
Записать журнал из JSON со стандартного ввода в формате, аналогичном `dump-journal --json --format data`.
Вы можете указать любой раздел OSD - данных, журнала или метаданных - либо указать все
параметры расположения вручную.
## dump-meta
`vitastor-disk dump-meta <osd_device>`
`vitastor-disk dump-meta <meta_file> <meta_block_size> <offset> <size>`
Вывести метаданные в формате JSON.
Вы можете указать любой раздел OSD - данных, журнала или метаданных - либо указать все
параметры расположения вручную.
## write-meta
`vitastor-disk write-meta <osd_device>`
`vitastor-disk write-meta <meta_file> <offset> <size>`
Записать метаданные из JSON со стандартного ввода в формате, аналогичном `dump-meta`.
Вы можете указать любой раздел OSD - данных, журнала или метаданных - либо указать все
параметры расположения вручную.
## simple-offsets
`vitastor-disk simple-offsets <device>`


@@ -156,17 +156,17 @@ behind. Defragmentation removes garbage and moves data still in use to new volum
Options:
| <!-- --> | <!-- --> |
|--------------------------|------------------------------------------------------------------------ |
| --volume_untouched 86400 | Defragment volumes last appended to at least this number of seconds ago |
| --defrag_percent 50 | Defragment volumes with at least this % of removed data |
| --defrag_block_count 16 | Read this number of pool blocks at once during defrag |
| --defrag_iodepth 16 | Move up to this number of files in parallel during defrag |
| --trace | Print verbose defragmentation status |
| --dry-run | Skip modifications, only print status |
| --recalc-stats | Recalculate all volume statistics |
| --include-empty | Include old and empty volumes; make sure to restart NFS servers before using it |
| --no-rm | Move, but do not delete data |
| <!-- --> | <!-- --> |
|----------------------------|------------------------------------------------------------------------ |
| `--volume_untouched 86400` | Defragment volumes last appended to at least this number of seconds ago |
| `--defrag_percent 50` | Defragment volumes with at least this % of removed data |
| `--defrag_block_count 16` | Read this number of pool blocks at once during defrag |
| `--defrag_iodepth 16` | Move up to this number of files in parallel during defrag |
| `--trace` | Print verbose defragmentation status |
| `--dry-run` | Skip modifications, only print status |
| `--recalc-stats` | Recalculate all volume statistics |
| `--include-empty` | Include old and empty volumes; make sure to restart NFS servers before using it |
| `--no-rm` | Move, but do not delete data |
## Common options


@@ -164,17 +164,17 @@ JSON-формате :-). Для инспекции содержимого БД
Опции:
| <!-- --> | <!-- --> |
|--------------------------|------------------------------------------------------------------------ |
| --volume_untouched 86400 | Дефрагментировать только тома, в которые уже не писали это число секунд |
| --defrag_percent 50 | Дефрагментировать только тома, в которых этот % данных удалён |
| --defrag_block_count 16 | Читать это количество блоков пула за один раз |
| --defrag_iodepth 16 | Перемещать одновременно до этого числа файлов |
| --trace | Печатать детальную статистику дефрагментации |
| --dry-run | Не производить никаких изменений, только описать выполняемые действия |
| --recalc-stats | Пересчитать и сохранить статистику всех томов |
| --include-empty | Дефрагментировать старые и пустые тома; обязательно перезапустите NFS-сервера после использования этой опции |
| --no-rm | Перемещать, но не удалять данные |
| <!-- --> | <!-- --> |
|----------------------------|------------------------------------------------------------------------ |
| `--volume_untouched 86400` | Дефрагментировать только тома, в которые уже не писали это число секунд |
| `--defrag_percent 50` | Дефрагментировать только тома, в которых этот % данных удалён |
| `--defrag_block_count 16` | Читать это количество блоков пула за один раз |
| `--defrag_iodepth 16` | Перемещать одновременно до этого числа файлов |
| `--trace` | Печатать детальную статистику дефрагментации |
| `--dry-run` | Не производить никаких изменений, только описать выполняемые действия |
| `--recalc-stats` | Пересчитать и сохранить статистику всех томов |
| `--include-empty` | Дефрагментировать старые и пустые тома; обязательно перезапустите NFS-сервера после использования этой опции |
| `--no-rm` | Перемещать, но не удалять данные |
## Общие опции


@@ -151,9 +151,9 @@ Example performance comparison:
To try VDUSE you need at least Linux 5.15, built with VDUSE support
(CONFIG_VDPA=m, CONFIG_VDPA_USER=m, CONFIG_VIRTIO_VDPA=m).
Debian Linux kernels have these options disabled by now, so if you want to try it on Debian,
use a kernel from Ubuntu [kernel-ppa/mainline](https://kernel.ubuntu.com/~kernel-ppa/mainline/), Proxmox,
or build modules for Debian kernel manually:
Debian Linux kernels had these options disabled until 6.6, so make sure you install a newer kernel
(from bookworm-backports, trixie or newer Debian version) if you want to try VDUSE. You can also
build modules for an existing kernel manually:
```
mkdir build


@@ -154,9 +154,9 @@ VDUSE - на данный момент лучший интерфейс для п
Чтобы попробовать VDUSE, вам нужно ядро Linux как минимум версии 5.15, собранное с поддержкой
VDUSE (CONFIG_VDPA=m, CONFIG_VDPA_USER=m, CONFIG_VIRTIO_VDPA=m).
В ядрах в Debian Linux поддержка пока отключена по умолчанию, так что чтобы попробовать VDUSE
на Debian, поставьте ядро из Ubuntu [kernel-ppa/mainline](https://kernel.ubuntu.com/~kernel-ppa/mainline/),
из Proxmox или соберите модули для ядра Debian вручную:
В ядрах в Debian Linux эти опции включены, только начиная с 6.6, так что установите свежее ядро
из bookworm-backports, trixie или из более новой версии Debian, если хотите попробовать VDUSE.
Либо же вы можете самостоятельно собрать модули для установленного ядра:
```
mkdir build


@@ -1,6 +1,6 @@
{
"name": "vitastor-mon",
"version": "1.9.1",
"version": "1.9.2",
"description": "Vitastor SDS monitor service",
"main": "mon-main.js",
"scripts": {


@@ -3,7 +3,9 @@
set -e
reapply_patch() {
if ! patch -f --dry-run -F 0 -R $1 < $2 >/dev/null; then
if ! [[ -e $1 ]]; then
echo "$1 does not exist, OpenNebula is not installed"
elif ! patch -f --dry-run -F 0 -R $1 < $2 >/dev/null; then
already_applied=0
if ! patch --no-backup-if-mismatch -r - -F 0 -f $1 < $2; then
applied_ok=0
@@ -15,8 +17,13 @@ echo "Reapplying Vitastor patches to OpenNebula's oned.conf, vmm_execrc and down
already_applied=1
applied_ok=1
reapply_patch /var/lib/one/remotes/datastore/downloader.sh /var/lib/one/remotes/datastore/vitastor/downloader-vitastor.sh.diff
reapply_patch /etc/one/oned.conf /var/lib/one/remotes/datastore/vitastor/oned.conf.diff
reapply_patch /etc/one/vmm_exec/vmm_execrc /var/lib/one/remotes/datastore/vitastor/vmm_execrc.diff
if [[ -e /etc/one/oned.conf ]]; then
if ! /var/lib/one/remotes/datastore/vitastor/patch-oned-conf.py /etc/one/oned.conf; then
applied_ok=0
already_applied=0
fi
fi
if [[ "$already_applied" = 1 ]]; then
echo "OK: Vitastor OpenNebula patches are already applied"
elif [[ "$applied_ok" = 1 ]]; then


@@ -0,0 +1,115 @@
#!/usr/bin/env python3
# Patch /etc/one/oned.conf for Vitastor support
# -s = also enable save.vitastor/restore.vitastor overrides

import re
import os
import sys

class Fixer:
    save_restore = 0

    def require_sub_cb(self, m, cb):
        self.found = 1
        return cb(m)

    def require_sub(self, regexp, cb, text, error):
        self.found = 0
        new_text = re.sub(regexp, lambda m: self.require_sub_cb(m, cb), text)
        if not self.found and error:
            self.errors.append(error)
        return new_text

    def fix(self, oned_conf):
        self.errors = []
        self.kvm_found = 0
        oned_conf = self.require_sub(r'((?:^|\n)[ \t]*VM_MAD\s*=\s*\[)([^\]]+)\]', lambda m: m.group(1)+self.fix_vm_mad(m.group(2))+']', oned_conf, 'VM_MAD not found')
        if not self.kvm_found:
            self.errors.append("VM_MAD[NAME=kvm].ARGUMENTS not found")
        oned_conf = self.require_sub(r'((?:^|\n)[ \t]*TM_MAD\s*=\s*\[)([^\]]+)\]', lambda m: m.group(1)+self.fix_tm_mad(m.group(2))+']', oned_conf, 'TM_MAD not found')
        oned_conf = self.require_sub(r'((?:^|\n)[ \t]*DATASTORE_MAD\s*=\s*\[)([^\]]+)\]', lambda m: m.group(1)+self.fix_datastore_mad(m.group(2))+']', oned_conf, 'DATASTORE_MAD not found')
        if oned_conf[-1:] != '\n':
            oned_conf += '\n'
        if not re.compile(r'(^|\n)[ \t]*INHERIT_DATASTORE_ATTR\s*=\s*"VITASTOR_CONF"').search(oned_conf):
            oned_conf += '\nINHERIT_DATASTORE_ATTR="VITASTOR_CONF"\n'
        if not re.compile(r'(^|\n)[ \t]*INHERIT_DATASTORE_ATTR\s*=\s*"IMAGE_PREFIX"').search(oned_conf):
            oned_conf += '\nINHERIT_DATASTORE_ATTR="IMAGE_PREFIX"\n'
        if not re.compile(r'(^|\n)[ \t]*TM_MAD_CONF\s*=\s*\[[^\]]*NAME\s*=\s*"vitastor"').search(oned_conf):
            oned_conf += ('\nTM_MAD_CONF = [\n'+
                '    NAME = "vitastor", LN_TARGET = "NONE", CLONE_TARGET = "SELF", SHARED = "YES",\n'+
                '    DS_MIGRATE = "NO", DRIVER = "raw", ALLOW_ORPHANS="format",\n'+
                '    TM_MAD_SYSTEM = "ssh,shared", LN_TARGET_SSH = "SYSTEM", CLONE_TARGET_SSH = "SYSTEM",\n'+
                '    DISK_TYPE_SSH = "FILE", LN_TARGET_SHARED = "NONE",\n'+
                '    CLONE_TARGET_SHARED = "SELF", DISK_TYPE_SHARED = "FILE"\n'+
                ']\n')
        if not re.compile(r'(^|\n)[ \t]*DS_MAD_CONF\s*=\s*\[[^\]]*NAME\s*=\s*"vitastor"').search(oned_conf):
            oned_conf += ('\nDS_MAD_CONF = [\n'+
                '    NAME = "vitastor",\n'+
                '    REQUIRED_ATTRS = "DISK_TYPE,BRIDGE_LIST",\n'+
                '    PERSISTENT_ONLY = "NO",\n'+
                '    MARKETPLACE_ACTIONS = "export"\n'+
                ']\n')
        return oned_conf

    def fix_vm_mad(self, vm_mad_params):
        if re.compile(r'\bNAME\s*=\s*"kvm"').search(vm_mad_params):
            vm_mad_params = re.sub(r'\b(ARGUMENTS\s*=\s*")([^"]+)"', lambda m: m.group(1)+self.fix_vm_mad_args(m.group(2))+'"', vm_mad_params)
            self.kvm_found = 1
        return vm_mad_params

    def fix_vm_mad_args(self, args):
        args = self.fix_vm_mad_override(args, 'deploy')
        if self.save_restore:
            args = self.fix_vm_mad_override(args, 'save')
            args = self.fix_vm_mad_override(args, 'restore')
        return args

    def fix_vm_mad_override(self, args, override):
        m = re.compile(r'-l (\S+)').search(args)
        if m and re.compile(override+'='+override+'.vitastor').search(m.group(1)):
            return args
        elif m and re.compile(override+'=').search(m.group(1)):
            self.errors.append(override+"= is already overridden in -l option in VM_MAD[NAME=kvm].ARGUMENTS")
            return args
        elif m:
            return self.require_sub(r'-l (\S+)', lambda m: '-l '+m.group(1)+','+override+'='+override+'.vitastor', args, '-l option not found in VM_MAD[NAME=kvm].ARGUMENTS')
        else:
            return args+' -l '+override+'='+override+'.vitastor'

    def fix_tm_mad(self, params):
        return self.require_sub(r'\b(ARGUMENTS\s*=\s*")([^"]+)"', lambda m: m.group(1)+self.fix_tm_mad_args('d', m.group(2), "TM_MAD")+'"', params, "TM_MAD.ARGUMENTS not found")

    def fix_tm_mad_args(self, opt, args, v):
        return self.require_sub('(-'+opt+r') (\S+)', lambda m: self.fix_tm_mad_arg(m), args, "-"+opt+" option not found in "+v+".ARGUMENTS")

    def fix_tm_mad_arg(self, m):
        a = m.group(2).split(',')
        if 'vitastor' not in a:
            a += [ 'vitastor' ]
        return m.group(1)+' '+(','.join(a))

    def fix_datastore_mad(self, params):
        params = self.require_sub(r'\b(ARGUMENTS\s*=\s*")([^"]+)"', lambda m: m.group(1)+self.fix_tm_mad_args('d', m.group(2), "DATASTORE_MAD")+'"', params, "DATASTORE_MAD.ARGUMENTS not found")
        return self.require_sub(r'\b(ARGUMENTS\s*=\s*")([^"]+)"', lambda m: m.group(1)+self.fix_tm_mad_args('s', m.group(2), "DATASTORE_MAD")+'"', params, "")

fixer = Fixer()
oned_conf_file = ''
for arg in sys.argv[1:]:
    if arg == '-s':
        fixer.save_restore = 1
    else:
        oned_conf_file = arg
        break
if not oned_conf_file:
    sys.stderr.write("USAGE: ./patch-oned-conf.py [-s] /etc/one/oned.conf\n-s means also enable save.vitastor/restore.vitastor overrides\n")
    sys.exit(1)
with open(oned_conf_file, 'r') as fd:
    oned_conf = fd.read()
new_conf = fixer.fix(oned_conf)
if new_conf != oned_conf:
    os.rename(oned_conf_file, oned_conf_file+'.bak')
    with open(oned_conf_file, 'w') as fd:
        fd.write(new_conf)
if len(fixer.errors) > 0:
    sys.stderr.write("ERROR: Failed to patch "+oned_conf_file+", patch it manually. Errors:\n- "+('\n- '.join(fixer.errors))+'\n')
    sys.exit(1)
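The core of the script above is the step that appends `vitastor` to a comma-separated driver list after an option such as `-d` or `-s`. A minimal standalone sketch of that step follows. The helper name `add_driver` and the sample `ARGUMENTS` string are made up for illustration and are not taken from a real `oned.conf`.

```python
import re


def add_driver(arguments, opt="d", driver="vitastor"):
    """Append `driver` to the comma-separated list following `-<opt>`, once."""
    def repl(m):
        drivers = m.group(2).split(",")
        if driver not in drivers:
            drivers.append(driver)
        return m.group(1) + " " + ",".join(drivers)
    # Same pattern shape as fix_tm_mad_args(): option, one space, non-space list
    return re.sub("(-" + opt + r") (\S+)", repl, arguments)


args = "-t 15 -d dummy,fs,ceph"  # hypothetical ARGUMENTS value
patched = add_driver(args)
print(patched)
# Running it again leaves the string unchanged, so the patch is idempotent
print(add_driver(patched) == patched)
```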


@@ -50,7 +50,7 @@ from cinder.volume import configuration
from cinder.volume import driver
from cinder.volume import volume_utils
VITASTOR_VERSION = '1.9.1'
VITASTOR_VERSION = '1.9.2'
LOG = logging.getLogger(__name__)


@@ -1,11 +1,11 @@
Name: vitastor
Version: 1.9.1
Version: 1.9.2
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
Source0: vitastor-1.9.1.el7.tar.gz
Source0: vitastor-1.9.2.el7.tar.gz
BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel


@@ -1,11 +1,11 @@
Name: vitastor
Version: 1.9.1
Version: 1.9.2
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
Source0: vitastor-1.9.1.el8.tar.gz
Source0: vitastor-1.9.2.el8.tar.gz
BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel


@@ -1,11 +1,11 @@
Name: vitastor
Version: 1.9.1
Version: 1.9.2
Release: 1%{?dist}
Summary: Vitastor, a fast software-defined clustered block storage
License: Vitastor Network Public License 1.1
URL: https://vitastor.io/
Source0: vitastor-1.9.1.el9.tar.gz
Source0: vitastor-1.9.2.el9.tar.gz
BuildRequires: liburing-devel >= 0.6
BuildRequires: gperftools-devel


@@ -19,7 +19,7 @@ if("${CMAKE_INSTALL_PREFIX}" MATCHES "^/usr/local/?$")
set(CMAKE_INSTALL_RPATH "${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}")
endif()
add_definitions(-DVITASTOR_VERSION="1.9.1")
add_definitions(-DVITASTOR_VERSION="1.9.2")
add_definitions(-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Wall -Wno-sign-compare -Wno-comment -Wno-parentheses -Wno-pointer-arith -fdiagnostics-color=always -fno-omit-frame-pointer -I ${CMAKE_SOURCE_DIR}/src)
add_link_options(-fno-omit-frame-pointer)
if (${WITH_ASAN})


@@ -993,7 +993,8 @@ int blockstore_impl_t::read_bitmap(object_id oid, uint64_t target_version, void
{
while (dirty_it->first.oid == oid)
{
if (target_version >= dirty_it->first.version)
// Condition has to be the same as in dequeue_read()
if (!IS_IN_FLIGHT(dirty_it->second.state) && target_version >= dirty_it->first.version)
{
if (result_version)
*result_version = dirty_it->first.version;


@@ -10,7 +10,7 @@ endif (IBVERBS_LIBRARIES)
add_library(vitastor_common STATIC
../util/epoll_manager.cpp etcd_state_client.cpp messenger.cpp ../util/addr_util.cpp
msgr_stop.cpp msgr_op.cpp msgr_send.cpp msgr_receive.cpp ../util/ringloop.cpp ../../json11/json11.cpp
http_client.cpp osd_ops.cpp pg_states.cpp ../util/timerfd_manager.cpp ../util/str_util.cpp ${MSGR_RDMA}
http_client.cpp osd_ops.cpp pg_states.cpp ../util/timerfd_manager.cpp ../util/str_util.cpp ../util/json_util.cpp ${MSGR_RDMA}
)
target_link_libraries(vitastor_common pthread)
target_compile_options(vitastor_common PUBLIC -fPIC)
@@ -88,7 +88,7 @@ add_executable(test_cluster_client
EXCLUDE_FROM_ALL
../test/test_cluster_client.cpp
pg_states.cpp osd_ops.cpp cluster_client.cpp cluster_client_list.cpp cluster_client_wb.cpp msgr_op.cpp ../test/mock/messenger.cpp msgr_stop.cpp
etcd_state_client.cpp ../util/timerfd_manager.cpp ../util/str_util.cpp ../../json11/json11.cpp
etcd_state_client.cpp ../util/timerfd_manager.cpp ../util/str_util.cpp ../util/json_util.cpp ../../json11/json11.cpp
)
target_compile_definitions(test_cluster_client PUBLIC -D__MOCK__)
target_include_directories(test_cluster_client BEFORE PUBLIC ${CMAKE_SOURCE_DIR}/src/test/mock)


@@ -4,7 +4,7 @@
#include <stdexcept>
#include <assert.h>
#include "cluster_client_impl.h"
#include "http_client.h" // json_is_true
#include "json_util.h"
cluster_client_t::cluster_client_t(ring_loop_t *ringloop, timerfd_manager_t *tfd, json11::Json config)
{
@@ -955,7 +955,7 @@ void cluster_client_t::slice_rw(cluster_op_t *op)
? (stripe + pg_block_size) : (op->offset + op->len);
op->parts[i].iov.reset();
op->parts[i].flags = 0;
if (op->cur_inode != op->inode || op->opcode == OSD_OP_READ && dirty_copied)
if (op->opcode != OSD_OP_READ_CHAIN_BITMAP && op->cur_inode != op->inode || op->opcode == OSD_OP_READ && dirty_copied)
{
// Read remaining parts from upper layers
uint64_t prev = begin, cur = begin;


@@ -15,6 +15,7 @@
#include "addr_util.h"
#include "str_util.h"
#include "json_util.h"
#include "json11/json11.hpp"
#include "http_client.h"
#include "timerfd_manager.h"
@@ -724,22 +725,3 @@ static bool ws_parse_frame(std::string & buf, int & type, std::string & res)
buf = buf.substr(hdr+len);
return true;
}
// FIXME: move to utils
bool json_is_true(const json11::Json & val)
{
if (val.is_string())
return val == "true" || val == "yes" || val == "1";
return val.bool_value();
}
bool json_is_false(const json11::Json & val)
{
if (val.is_string())
return val.string_value() == "false" || val.string_value() == "no" || val.string_value() == "0";
if (val.is_number())
return val.number_value() == 0;
if (val.is_bool())
return !val.bool_value();
return false;
}
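
Reviewer note: the two helpers removed above accept several spellings for boolean flags, and judging by the new `json_util.h` includes they presumably now live in `../util/json_util.cpp`. A minimal standalone sketch of the string branch only, assuming plain `std::string` input instead of `json11::Json` so that it builds without dependencies. The names `str_is_true`, `str_is_false` and `truthy_examples` are hypothetical and not part of Vitastor:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the string branch of json_is_true()
bool str_is_true(const std::string & val)
{
    return val == "true" || val == "yes" || val == "1";
}

// Hypothetical stand-in for the string branch of json_is_false()
bool str_is_false(const std::string & val)
{
    return val == "false" || val == "no" || val == "0";
}

// Usage example: the two predicates are not complements of each other
void truthy_examples()
{
    assert(str_is_true("yes") && !str_is_false("yes"));
    assert(str_is_false("0") && !str_is_true("0"));
    // An empty or unrecognised string is neither "true" nor "false"
    assert(!str_is_true("") && !str_is_false(""));
}
```

Because an unrecognised string matches neither predicate, callers can tell "explicitly disabled" apart from "not set", which is likely why both helpers exist.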

View File

@@ -48,9 +48,3 @@ void http_request(http_co_t *handler, const std::string & host, const std::strin
const http_options_t & options, std::function<void(const http_response_t *response)> response_callback);
void http_post_message(http_co_t *handler, int type, const std::string & msg);
void http_close(http_co_t *co);
// Utils
std::string strtolower(const std::string & in);
// FIXME: move to json11
bool json_is_true(const json11::Json & val);
bool json_is_false(const json11::Json & val);

View File

@@ -6,7 +6,7 @@ includedir=${prefix}/@CMAKE_INSTALL_INCLUDEDIR@
Name: Vitastor
Description: Vitastor client library
Version: 1.9.1
Version: 1.9.2
Libs: -L${libdir} -lvitastor_client
Cflags: -I${includedir}

View File

@@ -369,6 +369,7 @@ struct cli_dd_t
{
cli_tool_t *parent;
std::vector<std::string> conv, iflag, oflag;
dd_in_info_t iinfo;
dd_out_info_t oinfo;
@@ -766,6 +767,49 @@ struct cli_dd_t
goto resume_3;
else if (state == 4)
goto resume_4;
for (int i = 0; i < conv.size(); i++)
{
if (conv[i] == "nofsync")
oinfo.end_fsync = false;
else if (conv[i] == "trunc")
oinfo.out_trunc = true;
else if (conv[i] == "nocreat")
oinfo.out_create = false;
else if (conv[i] == "noerror")
ignore_errors = true;
else if (conv[i] == "nosparse")
write_zero = true;
else
{
result = (cli_result_t){ .err = EINVAL, .text = "Unknown option conv="+conv[i] };
state = 100;
return;
}
}
for (int i = 0; i < iflag.size(); i++)
{
if (iflag[i] == "direct")
iinfo.in_direct = true;
else
{
result = (cli_result_t){ .err = EINVAL, .text = "Unknown option iflag="+iflag[i] };
state = 100;
return;
}
}
for (int i = 0; i < oflag.size(); i++)
{
if (oflag[i] == "direct")
oinfo.out_direct = true;
else if (oflag[i] == "append")
oinfo.out_append = true;
else
{
result = (cli_result_t){ .err = EINVAL, .text = "Unknown option oflag="+oflag[i] };
state = 100;
return;
}
}
if ((oinfo.oimg != "" && oinfo.ofile != "") || (iinfo.iimg != "" && iinfo.ifile != ""))
{
result = (cli_result_t){ .err = EINVAL, .text = "Image and file can't be specified at the same time" };
@@ -908,6 +952,18 @@ static uint64_t parse_blocks(json11::Json v, uint64_t bs, uint64_t def)
return res;
}
static std::vector<std::string> explode_json(const std::string & sep, json11::Json opt)
{
if (opt.is_array())
{
std::vector<std::string> arr;
for (auto & item: opt.array_items())
arr.push_back(item.as_string());
return arr;
}
return explode(sep, opt.as_string(), true);
}
std::function<bool(cli_result_t &)> cli_tool_t::start_dd(json11::Json cfg)
{
auto dd = new cli_dd_t();
@@ -923,7 +979,7 @@ std::function<bool(cli_result_t &)> cli_tool_t::start_dd(json11::Json cfg)
dd->oseek = parse_blocks(cfg["oseek"], dd->blocksize, 0);
if (!dd->oseek)
dd->oseek = parse_blocks(cfg["seek"], dd->blocksize, 0);
dd->iseek = parse_blocks(cfg["oseek"], dd->blocksize, 0);
dd->iseek = parse_blocks(cfg["iseek"], dd->blocksize, 0);
if (!dd->iseek)
dd->iseek = parse_blocks(cfg["skip"], dd->blocksize, 0);
dd->iodepth = cfg["iodepth"].uint64_value();
@@ -935,25 +991,9 @@ std::function<bool(cli_result_t &)> cli_tool_t::start_dd(json11::Json cfg)
progress = true;
dd->iinfo.detect_size = cfg["size"].is_null();
dd->oinfo.out_size = parse_size(cfg["size"].as_string());
std::vector<std::string> conv = explode(",", cfg["conv"].string_value(), true);
if (std::find(conv.begin(), conv.end(), "nofsync") != conv.end())
dd->oinfo.end_fsync = false;
if (std::find(conv.begin(), conv.end(), "trunc") != conv.end())
dd->oinfo.out_trunc = true;
if (std::find(conv.begin(), conv.end(), "nocreat") != conv.end())
dd->oinfo.out_create = false;
if (std::find(conv.begin(), conv.end(), "noerror") != conv.end())
dd->ignore_errors = true;
if (std::find(conv.begin(), conv.end(), "nosparse") != conv.end())
dd->write_zero = true;
conv = explode(",", cfg["iflag"].string_value(), true);
if (std::find(conv.begin(), conv.end(), "direct") != conv.end())
dd->iinfo.in_direct = true;
conv = explode(",", cfg["oflag"].string_value(), true);
if (std::find(conv.begin(), conv.end(), "direct") != conv.end())
dd->oinfo.out_direct = true;
if (std::find(conv.begin(), conv.end(), "append") != conv.end())
dd->oinfo.out_append = true;
dd->conv = explode_json(",", cfg["conv"]);
dd->iflag = explode_json(",", cfg["iflag"]);
dd->oflag = explode_json(",", cfg["oflag"]);
return [dd](cli_result_t & result)
{
dd->loop();
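
Reviewer note: the hunks above move `conv`/`iflag`/`oflag` handling into the dd state machine and add an explicit "Unknown option" error instead of silently ignoring unrecognised values. A minimal sketch of that validation idea, assuming a comma-separated string as input. `split_list`, `first_unknown` and `DD_CONV_OPTIONS` are hypothetical names, and `split_list` is not Vitastor's `explode()`, whose exact behaviour is not shown in this diff:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// conv= values recognised by the loop in the hunk above
const std::set<std::string> DD_CONV_OPTIONS = { "nofsync", "trunc", "nocreat", "noerror", "nosparse" };

// Split str on sep, dropping empty items (hypothetical helper)
std::vector<std::string> split_list(const std::string & str, char sep)
{
    std::vector<std::string> items;
    size_t pos = 0;
    while (pos <= str.size())
    {
        size_t next = str.find(sep, pos);
        if (next == std::string::npos)
            next = str.size();
        if (next > pos)
            items.push_back(str.substr(pos, next-pos));
        pos = next+1;
    }
    return items;
}

// Return the first item that is not in known, or "" if all items are recognised
std::string first_unknown(const std::vector<std::string> & items, const std::set<std::string> & known)
{
    for (const auto & item: items)
        if (known.find(item) == known.end())
            return item;
    return "";
}

// Usage example
void dd_conv_examples()
{
    assert(first_unknown(split_list("nofsync,trunc", ','), DD_CONV_OPTIONS) == "");
    assert(first_unknown(split_list("trunc,bogus", ','), DD_CONV_OPTIONS) == "bogus");
}
```

Returning the offending item lets the caller build a message of the form `Unknown option conv=<item>`, as the new code does.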

View File

@@ -5,6 +5,7 @@
#include "cluster_client.h"
#include "pg_states.h"
#include "str_util.h"
#include "json_util.h"
struct cli_fix_t
{

View File

@@ -21,6 +21,3 @@ template<class T> void remove_duplicates(std::vector<T> & ret)
}
ret.resize(j+1);
}
// from http_client.cpp...
bool json_is_false(const json11::Json & val);

View File

@@ -4,6 +4,7 @@
#include "cli.h"
#include "cluster_client.h"
#include "str_util.h"
#include "json_util.h"
#include "http_client.h"
// Reweight OSD, change tags or set noout flag

View File

@@ -156,6 +156,8 @@ resume_1:
for (auto & jtag: osd_cfg["tags"].array_items())
osd.tags.push_back(jtag.string_value());
}
else if (osd_cfg["tags"].is_string())
osd.tags.push_back(osd_cfg["tags"].string_value());
osd.noout = osd_cfg["noout"].bool_value();
}
auto np_it = node_placement.find(std::to_string(osd.num));

View File

@@ -4,6 +4,7 @@
#include "cli.h"
#include "cluster_client.h"
#include "str_util.h"
#include "json_util.h"
#include "pg_states.h"
#include "http_client.h"

View File

@@ -5,8 +5,9 @@ project(vitastor)
# vitastor-disk
add_executable(vitastor-disk
disk_tool.cpp disk_simple_offsets.cpp
disk_tool_journal.cpp disk_tool_meta.cpp disk_tool_prepare.cpp disk_tool_resize.cpp disk_tool_udev.cpp disk_tool_utils.cpp disk_tool_upgrade.cpp
../util/crc32c.c ../util/str_util.cpp ../../json11/json11.cpp ../util/rw_blocking.cpp ../util/allocator.cpp ../util/ringloop.cpp ../blockstore/blockstore_disk.cpp
disk_tool_journal.cpp disk_tool_meta.cpp disk_tool_prepare.cpp disk_tool_resize.cpp
disk_tool_resize_auto.cpp disk_tool_udev.cpp disk_tool_utils.cpp disk_tool_upgrade.cpp
../util/crc32c.c ../util/str_util.cpp ../util/json_util.cpp ../../json11/json11.cpp ../util/rw_blocking.cpp ../util/allocator.cpp ../util/ringloop.cpp ../blockstore/blockstore_disk.cpp
)
target_link_libraries(vitastor-disk
tcmalloc_minimal

View File

@@ -92,8 +92,22 @@ static const char *help_text =
" \n"
" Requires the `sfdisk` utility.\n"
"\n"
"vitastor-disk resize <ALL_OSD_PARAMETERS> <NEW_LAYOUT> [--iodepth 32]\n"
" Resize data area and/or rewrite/move journal and metadata\n"
"vitastor-disk resize <osd_num>|<osd_device> [OPTIONS]\n"
" Resize data area and/or move journal and metadata:\n"
" --move-journal TARGET move journal to TARGET\n"
" --move-meta TARGET move metadata to TARGET\n"
" --journal-size NEW_SIZE resize journal to NEW_SIZE\n"
" --data-size NEW_SIZE resize data device to NEW_SIZE\n"
" --dry-run only show new layout, do not apply it\n"
" \n"
" NEW_SIZE may include k/m/g/t suffixes.\n"
" TARGET may be one of:\n"
" <partition> move journal/metadata to an existing GPT partition\n"
" <raw_device> create a GPT partition on <raw_device> and move journal/metadata to it\n"
" \"\" (empty string) move journal/metadata back to the data device\n"
"\n"
"vitastor-disk raw-resize <ALL_OSD_PARAMETERS> <NEW_LAYOUT> [--iodepth 32]\n"
" Resize data area and/or rewrite/move journal and metadata (manual format).\n"
" ALL_OSD_PARAMETERS must include all (at least all disk-related)\n"
" parameters from OSD command line (i.e. from systemd unit or superblock).\n"
" NEW_LAYOUT may include new disk layout parameters:\n"
@@ -143,8 +157,10 @@ static const char *help_text =
" For now, this only checks that device cache is in write-through mode if fsync is disabled.\n"
" Intended for use from startup scripts (i.e. from systemd units).\n"
"\n"
"vitastor-disk dump-journal [OPTIONS] <osd_device>\n"
"vitastor-disk dump-journal [OPTIONS] <journal_file> <journal_block_size> <offset> <size>\n"
" Dump journal in human-readable or JSON (if --json is specified) format.\n"
" Dump journal in text or JSON (if --json is specified) format.\n"
" You can specify any OSD device (data, metadata or journal), or the layout manually.\n"
" Options:\n"
" --all Scan the whole journal area for entries and dump them, even outdated ones\n"
" --json Dump journal in JSON format\n"
@@ -152,16 +168,21 @@ static const char *help_text =
" --format data Same as \"entries\", but also include small write data\n"
" --format blocks Dump as an array of journal blocks each containing array of entries\n"
"\n"
"vitastor-disk write-journal <osd_device>\n"
"vitastor-disk write-journal <journal_file> <journal_block_size> <bitmap_size> <offset> <size>\n"
" Write journal from JSON taken from standard input in the same format as produced by\n"
" `dump-journal --json --format data`.\n"
" You can specify any OSD device (data, metadata or journal), or the layout manually.\n"
"\n"
"vitastor-disk dump-meta <osd_device>\n"
"vitastor-disk dump-meta <meta_file> <meta_block_size> <offset> <size>\n"
" Dump metadata in JSON format.\n"
" You can specify any OSD device (data, metadata or journal), or the layout manually.\n"
"\n"
"vitastor-disk write-meta <osd_device>\n"
"vitastor-disk write-meta <meta_file> <offset> <size>\n"
" Write metadata from JSON taken from standard input in the same format as produced by\n"
" `dump-meta`. Intended for debugging.\n"
" Write metadata from JSON taken from standard input in the same format as produced by `dump-meta`.\n"
" You can specify any OSD device (data, metadata or journal), or the layout manually.\n"
"\n"
"vitastor-disk simple-offsets <device>\n"
" Calculate offsets for old simple&stupid (no superblock) OSD deployment. Options:\n"
@@ -229,6 +250,10 @@ int main(int argc, char *argv[])
{
self.options["force"] = "1";
}
else if (!strcmp(argv[i], "--dry-run") || !strcmp(argv[i], "--dry_run"))
{
self.options["dry_run"] = "1";
}
else if (!strcmp(argv[i], "--allow-data-loss"))
{
self.options["allow_data_loss"] = "1";
@@ -236,7 +261,7 @@ int main(int argc, char *argv[])
else if (argv[i][0] == '-' && argv[i][1] == '-' && i < argc-1)
{
char *key = argv[i]+2;
self.options[key] = argv[++i];
self.options[str_replace(key, "-", "_")] = argv[++i];
}
else
{
@@ -249,29 +274,49 @@ int main(int argc, char *argv[])
}
if (!strcmp(cmd[0], "dump-journal"))
{
if (cmd.size() < 5)
if (cmd.size() != 2 && cmd.size() < 5)
{
print_help(help_text, aliased ? "vitastor-dump-journal" : "vitastor-disk", cmd[0], false);
return 1;
}
self.dsk.journal_device = cmd[1];
self.dsk.journal_block_size = strtoul(cmd[2], NULL, 10);
self.dsk.journal_offset = strtoull(cmd[3], NULL, 10);
self.dsk.journal_len = strtoull(cmd[4], NULL, 10);
if (cmd.size() > 2)
{
self.dsk.journal_block_size = strtoul(cmd[2], NULL, 10);
self.dsk.journal_offset = strtoull(cmd[3], NULL, 10);
self.dsk.journal_len = strtoull(cmd[4], NULL, 10);
}
else
{
// First argument is an OSD device - take metadata layout parameters from it
if (self.dump_load_check_superblock(self.dsk.journal_device))
return 1;
}
return self.dump_journal();
}
else if (!strcmp(cmd[0], "write-journal"))
{
if (cmd.size() < 6)
if (cmd.size() != 2 && cmd.size() < 6)
{
print_help(help_text, "vitastor-disk", cmd[0], false);
return 1;
}
self.new_journal_device = cmd[1];
self.dsk.journal_block_size = strtoul(cmd[2], NULL, 10);
self.dsk.clean_entry_bitmap_size = strtoul(cmd[3], NULL, 10);
self.new_journal_offset = strtoull(cmd[4], NULL, 10);
self.new_journal_len = strtoull(cmd[5], NULL, 10);
if (cmd.size() > 2)
{
self.dsk.journal_block_size = strtoul(cmd[2], NULL, 10);
self.dsk.clean_entry_bitmap_size = strtoul(cmd[3], NULL, 10);
self.new_journal_offset = strtoull(cmd[4], NULL, 10);
self.new_journal_len = strtoull(cmd[5], NULL, 10);
}
else
{
// First argument is an OSD device - take metadata layout parameters from it
if (self.dump_load_check_superblock(self.new_journal_device))
return 1;
self.new_journal_offset = self.dsk.journal_offset;
self.new_journal_len = self.dsk.journal_len;
}
std::string json_err;
json11::Json entries = json11::Json::parse(read_all_fd(0), json_err);
if (json_err != "")
@@ -296,27 +341,47 @@ int main(int argc, char *argv[])
}
else if (!strcmp(cmd[0], "dump-meta"))
{
if (cmd.size() < 5)
if (cmd.size() != 2 && cmd.size() < 5)
{
print_help(help_text, "vitastor-disk", cmd[0], false);
return 1;
}
self.dsk.meta_device = cmd[1];
self.dsk.meta_block_size = strtoul(cmd[2], NULL, 10);
self.dsk.meta_offset = strtoull(cmd[3], NULL, 10);
self.dsk.meta_len = strtoull(cmd[4], NULL, 10);
if (cmd.size() > 2)
{
self.dsk.meta_block_size = strtoul(cmd[2], NULL, 10);
self.dsk.meta_offset = strtoull(cmd[3], NULL, 10);
self.dsk.meta_len = strtoull(cmd[4], NULL, 10);
}
else
{
// First argument is an OSD device - take metadata layout parameters from it
if (self.dump_load_check_superblock(self.dsk.meta_device))
return 1;
}
return self.dump_meta();
}
else if (!strcmp(cmd[0], "write-meta"))
{
if (cmd.size() < 4)
if (cmd.size() != 2 && cmd.size() < 4)
{
print_help(help_text, "vitastor-disk", cmd[0], false);
return 1;
}
self.new_meta_device = cmd[1];
self.new_meta_offset = strtoull(cmd[2], NULL, 10);
self.new_meta_len = strtoull(cmd[3], NULL, 10);
if (cmd.size() > 2)
{
self.new_meta_offset = strtoull(cmd[2], NULL, 10);
self.new_meta_len = strtoull(cmd[3], NULL, 10);
}
else
{
// First argument is an OSD device - take metadata layout parameters from it
if (self.dump_load_check_superblock(self.new_meta_device))
return 1;
self.new_meta_offset = self.dsk.meta_offset;
self.new_meta_len = self.dsk.meta_len;
}
std::string json_err;
json11::Json meta = json11::Json::parse(read_all_fd(0), json_err);
if (json_err != "")
@@ -328,7 +393,16 @@ int main(int argc, char *argv[])
}
else if (!strcmp(cmd[0], "resize"))
{
return self.resize_data();
if (cmd.size() != 2)
{
fprintf(stderr, "Exactly 1 OSD number or OSD device path argument is required\n");
return 1;
}
return self.resize_data(cmd[1]);
}
else if (!strcmp(cmd[0], "raw-resize"))
{
return self.raw_resize();
}
else if (!strcmp(cmd[0], "simple-offsets"))
{
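
Reviewer note: with the `str_replace(key, "-", "_")` change above, `--journal-size` and `--journal_size` end up under the same key in `self.options`, which is what lets `resize` look up `journal_size`, `move_journal` and so on. A minimal sketch of that normalization, assuming only the long-option form is handled. `option_key` and `option_key_examples` are hypothetical names, not functions from vitastor-disk:

```cpp
#include <cassert>
#include <string>

// Turn a long option such as "--journal-size" into the key "journal_size".
// Returns "" when arg is not of the form "--name".
std::string option_key(const std::string & arg)
{
    if (arg.size() < 3 || arg.compare(0, 2, "--") != 0)
        return "";
    std::string key = arg.substr(2);
    for (auto & c: key)
        if (c == '-')
            c = '_';
    return key;
}

// Usage example
void option_key_examples()
{
    assert(option_key("--journal-size") == "journal_size");
    assert(option_key("--dry_run") == "dry_run");
    assert(option_key("resize") == "");
}
```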

View File

@@ -93,10 +93,16 @@ struct disk_tool_t
void dump_meta_header(blockstore_meta_header_v2_t *hdr);
void dump_meta_entry(uint64_t block_num, clean_disk_entry *entry, uint8_t *bitmap);
int dump_load_check_superblock(const std::string & device);
int write_json_journal(json11::Json entries);
int write_json_meta(json11::Json meta);
int resize_data();
int resize_data(std::string device);
int resize_parse_move_journal(std::map<std::string, std::string> & move_options, bool dry_run);
int resize_parse_move_meta(std::map<std::string, std::string> & move_options, bool dry_run);
int raw_resize();
int resize_parse_params();
void resize_init(blockstore_meta_header_v2_t *hdr);
int resize_remap_blocks();
@@ -114,11 +120,13 @@ struct disk_tool_t
int systemd_start_stop_osds(const std::vector<std::string> & cmd, const std::vector<std::string> & devices);
int pre_exec_osd(std::string device);
int purge_devices(const std::vector<std::string> & devices);
int clear_osd_superblock(const std::string & dev);
json11::Json read_osd_superblock(std::string device, bool expect_exist = true, bool ignore_nonref = false);
uint32_t write_osd_superblock(std::string device, json11::Json params);
int prepare_one(std::map<std::string, std::string> options, int is_hdd = -1);
int check_existing_partition(const std::string & dev);
int prepare(std::vector<std::string> devices);
std::vector<vitastor_dev_info_t> collect_devices(const std::vector<std::string> & devices);
json11::Json add_partitions(vitastor_dev_info_t & devinfo, std::vector<std::string> sizes);
@@ -133,8 +141,8 @@ void disk_tool_simple_offsets(json11::Json cfg, bool json_output);
uint64_t sscanf_json(const char *fmt, const json11::Json & str);
void fromhexstr(const std::string & from, int bytes, uint8_t *to);
int disable_cache(std::string dev);
uint64_t get_device_size(const std::string & dev, bool should_exist = false);
std::string get_parent_device(std::string dev);
bool json_is_true(const json11::Json & val);
int shell_exec(const std::vector<std::string> & cmd, const std::string & in, std::string *out, std::string *err);
int write_zero(int fd, uint64_t offset, uint64_t size);
json11::Json read_parttable(std::string dev);

View File

@@ -517,6 +517,12 @@ int disk_tool_t::write_json_journal(json11::Json entries)
uint32_t data_csum_size = !dsk.data_csum_type ? 0 : ne->small_write.len/dsk.csum_block_size*(dsk.data_csum_type & 0xFF);
fromhexstr(rec["bitmap"].string_value(), dsk.clean_entry_bitmap_size, ((uint8_t*)ne) + sizeof(journal_entry_small_write) + data_csum_size);
fromhexstr(rec["data"].string_value(), ne->small_write.len, new_journal_data);
if (ne->small_write.len > 0 && !rec["data"].is_string())
{
fprintf(stderr, "Error: entry data is missing, please generate the dump with --json --format data\n");
free(new_journal_buf);
return 1;
}
if (dsk.data_csum_type)
fromhexstr(rec["block_csums"].string_value(), data_csum_size, ((uint8_t*)ne) + sizeof(journal_entry_small_write));
if (rec["data"].is_string())

View File

@@ -4,6 +4,7 @@
#include "disk_tool.h"
#include "rw_blocking.h"
#include "osd_id.h"
#include "json_util.h"
int disk_tool_t::process_meta(std::function<void(blockstore_meta_header_v2_t *)> hdr_fn,
std::function<void(uint64_t, clean_disk_entry*, uint8_t*)> record_fn)
@@ -149,6 +150,31 @@ int disk_tool_t::process_meta(std::function<void(blockstore_meta_header_v2_t *)>
return 0;
}
int disk_tool_t::dump_load_check_superblock(const std::string & device)
{
json11::Json sb = read_osd_superblock(device, true, false);
if (sb.is_null())
return 1;
try
{
auto cfg = json_to_string_map(sb["params"].object_items());
dsk.parse_config(cfg);
dsk.data_io = dsk.meta_io = dsk.journal_io = "direct";
dsk.open_data();
dsk.open_meta();
dsk.open_journal();
dsk.calc_lengths(true);
}
catch (std::exception & e)
{
dsk.close_all();
fprintf(stderr, "%s\n", e.what());
return 1;
}
dsk.close_all();
return 0;
}
int disk_tool_t::dump_meta()
{
int r = process_meta(
@@ -176,7 +202,7 @@ void disk_tool_t::dump_meta_header(blockstore_meta_header_v2_t *hdr)
{
printf(
"{\"version\":\"0.9\",\"meta_block_size\":%u,\"data_block_size\":%u,\"bitmap_granularity\":%u,"
"\"data_csum_type\":%s,\"csum_block_size\":%u,\"entries\":[\n",
"\"data_csum_type\":\"%s\",\"csum_block_size\":%u,\"entries\":[\n",
hdr->meta_block_size, hdr->data_block_size, hdr->bitmap_granularity,
csum_type_str(hdr->data_csum_type).c_str(), hdr->csum_block_size
);
@@ -243,12 +269,16 @@ int disk_tool_t::write_json_meta(json11::Json meta)
? meta["data_block_size"].uint64_value() : 131072;
new_hdr->bitmap_granularity = meta["bitmap_granularity"].uint64_value()
? meta["bitmap_granularity"].uint64_value() : 4096;
new_hdr->data_csum_type = meta["data_csum_type"].is_number()
? meta["data_csum_type"].uint64_value()
: (meta["data_csum_type"].string_value() == "crc32c"
? BLOCKSTORE_CSUM_CRC32C
: BLOCKSTORE_CSUM_NONE);
new_hdr->csum_block_size = meta["csum_block_size"].uint64_value();
if (new_hdr->version >= BLOCKSTORE_META_FORMAT_V2)
{
new_hdr->data_csum_type = meta["data_csum_type"].is_number()
? meta["data_csum_type"].uint64_value()
: (meta["data_csum_type"].string_value() == "crc32c"
? BLOCKSTORE_CSUM_CRC32C
: BLOCKSTORE_CSUM_NONE);
new_hdr->csum_block_size = meta["csum_block_size"].uint64_value();
new_hdr->header_csum = crc32c(0, new_hdr, sizeof(*new_hdr));
}
uint32_t new_clean_entry_header_size = (new_hdr->version == BLOCKSTORE_META_FORMAT_V1
? sizeof(clean_disk_entry) : sizeof(clean_disk_entry) + 4 /*entry_csum*/);
new_clean_entry_bitmap_size = (new_hdr->data_block_size / new_hdr->bitmap_granularity + 7) / 8;

View File

@@ -3,6 +3,7 @@
#include "disk_tool.h"
#include "str_util.h"
#include "json_util.h"
#include "osd_id.h"
int disk_tool_t::prepare_one(std::map<std::string, std::string> options, int is_hdd)
@@ -52,24 +53,9 @@ int disk_tool_t::prepare_one(std::map<std::string, std::string> options, int is_
return 1;
}
if (i == 0 && is_hdd == -1)
is_hdd = trim(read_file("/sys/block/"+parent_dev+"/queue/rotational")) == "1";
std::string out;
if (shell_exec({ "wipefs", dev }, "", &out, NULL) != 0 || out != "")
{
fprintf(stderr, "%s contains data, not creating OSD without --force. wipefs shows:\n%s", dev.c_str(), out.c_str());
is_hdd = trim(read_file("/sys/block/"+parent_dev.substr(5)+"/queue/rotational")) == "1";
if (check_existing_partition(dev) != 0)
return 1;
}
json11::Json sb = read_osd_superblock(dev, false);
if (!sb.is_null())
{
fprintf(stderr, "%s already contains Vitastor OSD superblock, not creating OSD without --force\n", dev.c_str());
return 1;
}
if (fix_partition_type(dev) != 0)
{
fprintf(stderr, "%s has incorrect type and we failed to change it to Vitastor type\n", dev.c_str());
return 1;
}
}
}
for (auto dev: std::vector<std::string>{"data", "meta", "journal"})
@@ -221,6 +207,28 @@ int disk_tool_t::prepare_one(std::map<std::string, std::string> options, int is_
return 0;
}
int disk_tool_t::check_existing_partition(const std::string & dev)
{
std::string out;
if (shell_exec({ "wipefs", dev }, "", &out, NULL) != 0 || out != "")
{
fprintf(stderr, "%s contains data, not creating OSD without --force. wipefs shows:\n%s", dev.c_str(), out.c_str());
return 1;
}
json11::Json sb = read_osd_superblock(dev, false);
if (!sb.is_null())
{
fprintf(stderr, "%s already contains Vitastor OSD superblock, not creating OSD without --force\n", dev.c_str());
return 1;
}
if (fix_partition_type(dev) != 0)
{
fprintf(stderr, "%s has incorrect type and we failed to change it to Vitastor type\n", dev.c_str());
return 1;
}
return 0;
}
std::vector<vitastor_dev_info_t> disk_tool_t::collect_devices(const std::vector<std::string> & devices)
{
std::vector<vitastor_dev_info_t> devinfo;
@@ -232,33 +240,16 @@ std::vector<vitastor_dev_info_t> disk_tool_t::collect_devices(const std::vector<
fprintf(stderr, "%s does not start with /dev/, ignoring\n", dev.c_str());
continue;
}
struct stat dev_st, sys_st;
if (stat(dev.c_str(), &dev_st) < 0)
struct stat sys_st;
uint64_t dev_size = get_device_size(dev, false);
if (dev_size == UINT64_MAX)
{
if (errno == ENOENT)
{
fprintf(stderr, "%s does not exist, skipping\n", dev.c_str());
continue;
}
fprintf(stderr, "Error checking %s: %s\n", dev.c_str(), strerror(errno));
return {};
}
uint64_t dev_size = dev_st.st_size;
if (S_ISBLK(dev_st.st_mode))
else if (!dev_size)
{
int fd = open(dev.c_str(), O_DIRECT|O_RDWR);
if (fd < 0)
{
fprintf(stderr, "Failed to open %s: %s\n", dev.c_str(), strerror(errno));
return {};
}
if (ioctl(fd, BLKGETSIZE64, &dev_size) < 0)
{
fprintf(stderr, "Failed to get %s size: %s\n", dev.c_str(), strerror(errno));
close(fd);
return {};
}
close(fd);
fprintf(stderr, "%s does not exist, skipping\n", dev.c_str());
continue;
}
if (stat(("/sys/block/"+dev.substr(5)).c_str(), &sys_st) < 0)
{
@@ -337,7 +328,7 @@ json11::Json disk_tool_t::add_partitions(vitastor_dev_info_t & devinfo, std::vec
script += "+ "+size+" "+std::string(VITASTOR_PART_TYPE)+"\n";
}
std::string out;
if (shell_exec({ "sfdisk", "--no-reread", "--force", devinfo.path }, script, &out, NULL) != 0)
if (shell_exec({ "sfdisk", "--no-reread", "--no-tell-kernel", "--force", devinfo.path }, script, &out, NULL) != 0)
{
fprintf(stderr, "Failed to add %zu partition(s) with sfdisk\n", sizes.size());
return {};
@@ -364,8 +355,9 @@ json11::Json disk_tool_t::add_partitions(vitastor_dev_info_t & devinfo, std::vec
{
for (const auto & part: new_parts)
{
std::string link_path = "/dev/disk/by-partuuid/"+strtolower(part["uuid"].string_value());
struct stat st;
if (stat(part["node"].string_value().c_str(), &st) < 0)
if (lstat(link_path.c_str(), &st) < 0)
{
if (errno == ENOENT)
{
@@ -386,7 +378,7 @@ json11::Json disk_tool_t::add_partitions(vitastor_dev_info_t & devinfo, std::vec
}
else
{
fprintf(stderr, "Failed to lstat %s: %s\n", part["node"].string_value().c_str(), strerror(errno));
fprintf(stderr, "Failed to lstat %s: %s\n", link_path.c_str(), strerror(errno));
return {};
}
}
@@ -395,8 +387,9 @@ json11::Json disk_tool_t::add_partitions(vitastor_dev_info_t & devinfo, std::vec
}
// Wait until device symlinks in /dev/disk/by-partuuid/ appear
bool exists = false;
const int max_iter = 300; // max 30 sec
iter = 0;
while (!exists && iter < 300) // max 30 sec
while (!exists && iter < max_iter)
{
exists = true;
for (const auto & part: new_parts)
@@ -406,7 +399,13 @@ json11::Json disk_tool_t::add_partitions(vitastor_dev_info_t & devinfo, std::vec
if (lstat(link_path.c_str(), &st) < 0)
{
if (errno == ENOENT)
{
exists = false;
if (iter == 4)
{
fprintf(stderr, "Waiting for %s to appear for up to %d sec...\n", link_path.c_str(), max_iter/10);
}
}
else
{
fprintf(stderr, "Failed to lstat %s: %s\n", link_path.c_str(), strerror(errno));

View File

@@ -18,7 +18,7 @@ struct resizer_data_moving_t
uint64_t old_loc, new_loc;
};
int disk_tool_t::resize_data()
int disk_tool_t::raw_resize()
{
int r;
// Parse parameters

View File

@@ -0,0 +1,296 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 (see README.md for details)
#include "disk_tool.h"
#include "rw_blocking.h"
#include "str_util.h"
#include "json_util.h"
int disk_tool_t::resize_data(std::string device)
{
if (options.find("move_journal") == options.end() &&
options.find("move_meta") == options.end() &&
options.find("journal_size") == options.end() &&
options.find("data_size") == options.end())
{
fprintf(stderr, "None of --move-journal, --move-meta, --journal-size, --data-size options are specified - nothing to do!\n");
return 1;
}
if (stoull_full(device))
device = "/dev/vitastor/osd"+device+"-data";
json11::Json sb = read_osd_superblock(device, true, false);
if (sb.is_null())
return 1;
auto sb_params = json_to_string_map(sb["params"].object_items());
try
{
dsk.parse_config(sb_params);
dsk.data_io = dsk.meta_io = dsk.journal_io = "cached";
dsk.open_data();
dsk.open_meta();
dsk.open_journal();
dsk.calc_lengths(true);
}
catch (std::exception & e)
{
dsk.close_all();
fprintf(stderr, "%s\n", e.what());
return 1;
}
dsk.close_all();
bool dry_run = options.find("dry_run") != options.end();
auto old_journal_device = dsk.journal_device;
auto old_meta_device = dsk.meta_device;
new_journal_len = dsk.journal_len;
if (options.find("journal_size") != options.end())
{
new_journal_len = parse_size(options["journal_size"]);
if (options.find("move_journal") == options.end())
options["move_journal"] = dsk.journal_device == dsk.data_device ? "" : dsk.journal_device;
}
std::map<std::string, std::string> move_options;
if (options.find("move_journal") != options.end())
{
if (resize_parse_move_journal(move_options, dry_run) != 0)
return 1;
}
if (options.find("move_meta") != options.end())
{
if (resize_parse_move_meta(move_options, dry_run) != 0)
return 1;
}
auto new_journal_device = move_options.find("new_journal_device") != move_options.end()
? move_options["new_journal_device"] : dsk.journal_device;
auto new_meta_device = move_options.find("new_meta_device") != move_options.end()
? move_options["new_meta_device"] : dsk.meta_device;
// Calculate new data & meta offsets
new_data_offset = 4096 + (new_journal_device == dsk.data_device ? new_journal_len : 0) +
(new_meta_device == dsk.data_device ? dsk.meta_len : 0);
new_data_offset += ((dsk.data_offset-new_data_offset) % dsk.data_block_size);
if (new_data_offset != dsk.data_offset)
move_options["new_data_offset"] = std::to_string(new_data_offset);
if (options.find("data_size") != options.end())
{
auto new_data_dev_size = parse_size(options["data_size"]);
new_data_dev_size = options["data_size"] == "max" || new_data_dev_size > dsk.data_device_size
? dsk.data_device_size : new_data_dev_size;
if (new_data_dev_size-dsk.data_offset != dsk.data_len)
move_options["new_data_len"] = std::to_string(new_data_dev_size-new_data_offset);
}
new_meta_offset = 4096 + (new_meta_device == new_journal_device ? new_journal_len : 0);
if (new_meta_offset != dsk.meta_offset)
move_options["new_meta_offset"] = std::to_string(new_meta_offset);
// Run resize
auto orig_options = std::move(options);
options = sb_params;
for (auto & kv: move_options)
options[kv.first] = kv.second;
if (!json)
{
std::string cmd;
for (auto & kv: move_options)
cmd += " "+kv.first+" = "+kv.second+"\n";
fprintf(stderr, "Running resize:\n%s", cmd.c_str());
}
if (!dry_run && raw_resize() != 0)
return 1;
// Write new superblocks
json11::Json::object new_sb_params = sb["params"].object_items();
if (move_options.find("new_journal_device") != move_options.end())
new_sb_params["journal_device"] = move_options["new_journal_device"];
if (move_options.find("new_meta_device") != move_options.end())
new_sb_params["meta_device"] = move_options["new_meta_device"];
new_sb_params["data_offset"] = new_data_offset;
new_sb_params["meta_offset"] = new_meta_offset;
if (move_options.find("new_data_len") != move_options.end())
new_sb_params["data_size"] = stoull_full(move_options["new_data_len"]);
std::set<std::string> clear_superblocks, write_superblocks;
write_superblocks.insert(dsk.data_device);
write_superblocks.insert(new_journal_device);
write_superblocks.insert(new_meta_device);
if (write_superblocks.find(old_journal_device) == write_superblocks.end())
clear_superblocks.insert(old_journal_device);
if (write_superblocks.find(old_meta_device) == write_superblocks.end())
clear_superblocks.insert(old_meta_device);
for (auto & dev: clear_superblocks)
{
if (!json)
fprintf(stderr, "Clearing OSD superblock on %s\n", dev.c_str());
if (!dry_run && clear_osd_superblock(dev) != 0)
return 1;
}
for (auto & dev: write_superblocks)
{
if (!json)
fprintf(stderr, "Writing new OSD superblock to %s\n", dev.c_str());
if (!dry_run && !write_osd_superblock(dev, new_sb_params))
return 1;
}
if (json)
{
printf("%s\n", json11::Json(json11::Json::object {
{ "new_sb_params", new_sb_params },
}).dump().c_str());
}
return 0;
}
int disk_tool_t::resize_parse_move_journal(std::map<std::string, std::string> & move_options, bool dry_run)
{
if (options["move_journal"] == "")
{
// move back to the data device
// but first check if not already there :)
if (dsk.journal_device == dsk.data_device && new_journal_len == dsk.journal_len)
{
// already there
fprintf(stderr, "journal is already on data device and has the same size\n");
return 0;
}
move_options["new_journal_device"] = dsk.data_device;
move_options["new_journal_offset"] = "4096";
move_options["new_journal_len"] = std::to_string(new_journal_len);
}
else
{
std::string real_dev = realpath_str(options["move_journal"], false);
if (real_dev == "")
return 1;
std::string parent_dev = get_parent_device(real_dev);
if (parent_dev == "")
return 1;
if (parent_dev == real_dev)
{
// whole disk - create partition
std::string old_real_dev = realpath_str(dsk.journal_device);
if (old_real_dev == "")
return 1;
if (options.find("force") == options.end() &&
get_parent_device(old_real_dev) == parent_dev)
{
// already there
fprintf(stderr, "journal is already on a partition of %s, add --force to create a new partition\n", options["move_journal"].c_str());
return 0;
}
new_journal_len = ((new_journal_len+1024*1024-1)/1024/1024)*1024*1024;
if (!dry_run)
{
auto devinfos = collect_devices({ real_dev });
if (devinfos.size() == 0)
return 1;
std::vector<std::string> sizes;
sizes.push_back(std::to_string(new_journal_len/1024/1024)+"MiB");
auto new_parts = add_partitions(devinfos[0], sizes);
if (!new_parts.array_items().size())
return 1;
options["move_journal"] = "/dev/disk/by-partuuid/"+strtolower(new_parts[0]["uuid"].string_value());
}
else
options["move_journal"] = "<new journal partition on "+parent_dev+">";
}
else if (options["move_journal"].substr(0, 22) != "/dev/disk/by-partuuid/")
{
// Partitions should be identified by GPT partition UUID
fprintf(stderr, "%s does not start with /dev/disk/by-partuuid/. Partitions should be identified by GPT partition UUIDs\n", options["move_journal"].c_str());
return 1;
}
else
{
// already a partition - check that it's a GPT partition with correct type
if (options.find("force") == options.end() &&
check_existing_partition(real_dev) != 0)
{
return 1;
}
new_journal_len = get_device_size(options["move_journal"], true);
if (new_journal_len == UINT64_MAX)
return 1;
}
new_journal_len -= 4096;
move_options["new_journal_device"] = options["move_journal"];
move_options["new_journal_offset"] = "4096";
move_options["new_journal_len"] = std::to_string(new_journal_len);
}
return 0;
}
int disk_tool_t::resize_parse_move_meta(std::map<std::string, std::string> & move_options, bool dry_run)
{
if (options["move_meta"] == "")
{
// move back to the data device
// but first check if not already there :)
if (dsk.meta_device == dsk.data_device)
{
// already there
fprintf(stderr, "metadata is already on data device\n");
return 0;
}
auto new_journal_device = move_options.find("new_journal_device") != move_options.end()
? move_options["new_journal_device"] : dsk.journal_device;
move_options["new_meta_device"] = dsk.data_device;
move_options["new_meta_len"] = std::to_string(dsk.meta_len);
}
else
{
std::string real_dev = realpath_str(options["move_meta"], false);
if (real_dev == "")
return 1;
std::string parent_dev = get_parent_device(real_dev);
if (parent_dev == "")
return 1;
uint64_t new_meta_len = 0;
if (parent_dev == real_dev)
{
// whole disk - create partition
std::string old_real_dev = realpath_str(dsk.meta_device);
if (old_real_dev == "")
return 1;
if (options.find("force") == options.end() &&
get_parent_device(old_real_dev) == parent_dev)
{
// already there
fprintf(stderr, "metadata is already on a partition of %s\n", options["move_meta"].c_str());
return 0;
}
new_meta_len = ((dsk.meta_len+1024*1024-1)/1024/1024)*1024*1024;
if (!dry_run)
{
auto devinfos = collect_devices({ real_dev });
if (devinfos.size() == 0)
return 1;
std::vector<std::string> sizes;
sizes.push_back(std::to_string(new_meta_len/1024/1024)+"MiB");
auto new_parts = add_partitions(devinfos[0], sizes);
if (!new_parts.array_items().size())
return 1;
options["move_meta"] = "/dev/disk/by-partuuid/"+strtolower(new_parts[0]["uuid"].string_value());
}
else
options["move_meta"] = "<new metadata partition on "+parent_dev+">";
}
else if (options["move_meta"].substr(0, 22) != "/dev/disk/by-partuuid/")
{
// Partitions should be identified by GPT partition UUID
fprintf(stderr, "%s does not start with /dev/disk/by-partuuid/. Partitions should be identified by GPT partition UUIDs\n", options["move_meta"].c_str());
return 1;
}
else
{
// already a partition - check that it's a GPT partition with correct type
if (options.find("force") == options.end() &&
check_existing_partition(real_dev) != 0)
{
return 1;
}
new_meta_len = get_device_size(options["move_meta"], true);
if (new_meta_len == UINT64_MAX)
return 1;
}
new_meta_len -= 4096;
move_options["new_meta_len"] = std::to_string(new_meta_len);
move_options["new_meta_device"] = options["move_meta"];
move_options["new_meta_offset"] = "4096";
}
return 0;
}
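
Both functions above size a newly created partition the same way: the requested length is rounded up to a whole number of MiB, and once the partition exists, 4096 bytes are subtracted and the offset is set to 4096. The following is a minimal sketch of that arithmetic only. `round_up` and `usable_len_on_new_partition` are hypothetical helper names that do not appear in the source, and treating the first 4096 bytes as reserved is an assumption based on the `new_journal_offset` / `new_meta_offset` values.

```cpp
#include <cassert>
#include <cstdint>

// Round len up to the nearest multiple of unit (hypothetical helper).
uint64_t round_up(uint64_t len, uint64_t unit)
{
    assert(unit != 0);
    return ((len + unit - 1) / unit) * unit;
}

// Length left for the journal or metadata on a freshly created partition:
// the partition size is a whole number of MiB and the area starts at
// offset 4096, so 4096 bytes are subtracted (assumed to be reserved).
uint64_t usable_len_on_new_partition(uint64_t requested_len)
{
    const uint64_t mib = 1024*1024;
    const uint64_t start_offset = 4096;
    return round_up(requested_len, mib) - start_offset;
}

// Usage sketch with a few hand-checked values.
void round_up_example()
{
    const uint64_t mib = 1024*1024;
    assert(round_up(1, mib) == mib);
    assert(round_up(mib, mib) == mib);
    assert(round_up(16*mib + 1, mib) == 17*mib);
    assert(usable_len_on_new_partition(16*mib + 1) == 17*mib - 4096);
}
```

The sketch covers only the "whole disk" branch; for an existing partition the code above takes the length from `get_device_size()` instead of rounding.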

@@ -6,6 +6,7 @@
#include "disk_tool.h"
#include "rw_blocking.h"
#include "str_util.h"
#include "json_util.h"
struct __attribute__((__packed__)) vitastor_disk_superblock_t
{
@@ -381,6 +382,34 @@ int disk_tool_t::pre_exec_osd(std::string device)
return 0;
}
int disk_tool_t::clear_osd_superblock(const std::string & dev)
{
uint8_t *buf = (uint8_t*)memalign_or_die(MEM_ALIGNMENT, 4096);
int fd = -1, r = open(dev.c_str(), O_DIRECT|O_RDWR);
if (r >= 0)
{
fd = r;
r = read_blocking(fd, buf, 4096);
if (r == 4096)
{
// Clear magic and CRC
memset(buf, 0, 12);
r = lseek64(fd, 0, 0);
if (r == 0)
{
r = write_blocking(fd, buf, 4096);
if (r == 4096)
r = 0;
}
}
}
if (fd >= 0)
close(fd);
free(buf);
buf = NULL;
return r;
}
int disk_tool_t::purge_devices(const std::vector<std::string> & devices)
{
std::set<uint64_t> osd_numbers;
@@ -439,7 +468,6 @@ int disk_tool_t::purge_devices(const std::vector<std::string> & devices)
return 1;
}
// Destroy OSD superblocks
-uint8_t *buf = (uint8_t*)memalign_or_die(MEM_ALIGNMENT, 4096);
for (auto & sb: superblocks)
{
for (auto dev_type: std::vector<std::string>{ "data", "meta", "journal" })
@@ -447,26 +475,7 @@ int disk_tool_t::purge_devices(const std::vector<std::string> & devices)
auto dev = sb["real_"+dev_type+"_device"].string_value();
if (dev != "")
{
-int fd = -1, r = open(dev.c_str(), O_DIRECT|O_RDWR);
-if (r >= 0)
-{
-fd = r;
-r = read_blocking(fd, buf, 4096);
-if (r == 4096)
-{
-// Clear magic and CRC
-memset(buf, 0, 12);
-r = lseek64(fd, 0, 0);
-if (r == 0)
-{
-r = write_blocking(fd, buf, 4096);
-if (r == 4096)
-r = 0;
-}
-}
-}
-if (fd >= 0)
-close(fd);
+int r = clear_osd_superblock(dev);
if (r != 0)
{
fprintf(stderr, "Failed to clear OSD %ju %s device %s superblock: %s\n",
@@ -487,7 +496,7 @@ int disk_tool_t::purge_devices(const std::vector<std::string> & devices)
fprintf(stderr, "Failed to delete partition %s: failed to find parent device\n", dev.c_str());
continue;
}
-auto pt = read_parttable("/dev/"+parent_dev);
+auto pt = read_parttable(parent_dev);
if (!pt.is_object())
continue;
json11::Json::array newpt = pt["partitions"].array_items();
@@ -498,7 +507,7 @@ int disk_tool_t::purge_devices(const std::vector<std::string> & devices)
auto old_part = newpt[i];
newpt.erase(newpt.begin()+i, newpt.begin()+i+1);
vitastor_dev_info_t devinfo = {
-.path = "/dev/"+parent_dev,
+.path = parent_dev,
.pt = json11::Json::object{ { "partitions", newpt } },
};
add_partitions(devinfo, {});
@@ -507,7 +516,7 @@ int disk_tool_t::purge_devices(const std::vector<std::string> & devices)
errno != ENOENT)
{
std::string out;
-shell_exec({ "partprobe", "/dev/"+parent_dev }, "", &out, NULL);
+shell_exec({ "partprobe", parent_dev }, "", &out, NULL);
}
break;
}
@@ -516,7 +525,5 @@ int disk_tool_t::purge_devices(const std::vector<std::string> & devices)
}
}
}
-free(buf);
-buf = NULL;
return 0;
}

@@ -101,7 +101,7 @@ int disk_tool_t::upgrade_simple_unit(std::string unit)
resizer.options = options;
for (auto & kv: resize)
resizer.options[kv.first] = std::to_string(kv.second);
-if (resizer.resize_data() != 0)
+if (resizer.raw_resize() != 0)
{
// FIXME: Resize with backup or journal
fprintf(

@@ -60,14 +60,14 @@ int disable_cache(std::string dev)
auto parent_dev = get_parent_device(dev);
if (parent_dev == "")
return 1;
-auto scsi_disk = "/sys/block/"+parent_dev+"/device/scsi_disk";
+auto scsi_disk = "/sys/block/"+parent_dev.substr(5)+"/device/scsi_disk";
DIR *dir = opendir(scsi_disk.c_str());
if (!dir)
{
if (errno == ENOENT)
{
// Not a SCSI/SATA device, just check /sys/block/.../queue/write_cache
-return check_queue_cache(dev.substr(5), parent_dev);
+return check_queue_cache(dev.substr(5), parent_dev.substr(5));
}
else
{
@@ -84,7 +84,7 @@ int disable_cache(std::string dev)
{
// Not a SCSI/SATA device, just check /sys/block/.../queue/write_cache
closedir(dir);
-return check_queue_cache(dev.substr(5), parent_dev);
+return check_queue_cache(dev.substr(5), parent_dev.substr(5));
}
scsi_disk += "/";
scsi_disk += de->d_name;
@@ -117,6 +117,38 @@ int disable_cache(std::string dev)
return 0;
}
uint64_t get_device_size(const std::string & dev, bool should_exist)
{
struct stat dev_st;
if (stat(dev.c_str(), &dev_st) < 0)
{
if (errno == ENOENT && !should_exist)
{
return 0;
}
fprintf(stderr, "Error checking %s: %s\n", dev.c_str(), strerror(errno));
return UINT64_MAX;
}
uint64_t dev_size = dev_st.st_size;
if (S_ISBLK(dev_st.st_mode))
{
int fd = open(dev.c_str(), O_DIRECT|O_RDWR);
if (fd < 0)
{
fprintf(stderr, "Failed to open %s: %s\n", dev.c_str(), strerror(errno));
return UINT64_MAX;
}
if (ioctl(fd, BLKGETSIZE64, &dev_size) < 0)
{
fprintf(stderr, "Failed to get %s size: %s\n", dev.c_str(), strerror(errno));
close(fd);
return UINT64_MAX;
}
close(fd);
}
return dev_size;
}
std::string get_parent_device(std::string dev)
{
if (dev.substr(0, 5) != "/dev/")
@@ -125,16 +157,26 @@ std::string get_parent_device(std::string dev)
return "";
}
dev = dev.substr(5);
// check if it's a partition - partitions aren't present in /sys/block/
struct stat st;
auto chk = "/sys/block/"+dev;
if (stat(chk.c_str(), &st) == 0)
{
// present in /sys/block/ - not a partition
return "/dev/"+dev;
}
else if (errno != ENOENT)
{
fprintf(stderr, "Failed to stat %s: %s\n", chk.c_str(), strerror(errno));
return "";
}
int i = dev.size();
while (i > 0 && isdigit(dev[i-1]))
i--;
-if (i >= 1 && dev[i-1] == '-') // dm-0, dm-1
-return dev;
-else if (i >= 2 && dev[i-1] == 'p' && isdigit(dev[i-2])) // nvme0n1p1
+if (i >= 2 && dev[i-1] == 'p' && isdigit(dev[i-2])) // nvme0n1p1
i--;
// Check that such block device exists
-struct stat st;
-auto chk = "/sys/block/"+dev.substr(0, i);
+chk = "/sys/block/"+dev.substr(0, i);
if (stat(chk.c_str(), &st) < 0)
{
if (errno != ENOENT)
@@ -142,16 +184,9 @@ std::string get_parent_device(std::string dev)
fprintf(stderr, "Failed to stat %s: %s\n", chk.c_str(), strerror(errno));
return "";
}
-return dev;
+return "/dev/"+dev;
}
-return dev.substr(0, i);
-}
-bool json_is_true(const json11::Json & val)
-{
-if (val.is_string())
-return val == "true" || val == "yes" || val == "1";
-return val.bool_value();
+return "/dev/"+dev.substr(0, i);
}
int shell_exec(const std::vector<std::string> & cmd, const std::string & in, std::string *out, std::string *err)
@@ -314,7 +349,7 @@ int fix_partition_type(std::string dev_by_uuid)
std::string parent_dev = get_parent_device(realpath_str(dev_by_uuid, false));
if (parent_dev == "")
return 1;
-auto pt = read_parttable("/dev/"+parent_dev);
+auto pt = read_parttable(parent_dev);
if (pt.is_null() || pt.is_bool())
return 1;
std::string script = "label: gpt\n\n";
@@ -342,7 +377,7 @@ int fix_partition_type(std::string dev_by_uuid)
script += "\n";
}
std::string out;
-return shell_exec({ "sfdisk", "--no-reread", "--force", "/dev/"+parent_dev }, script, &out, NULL);
+return shell_exec({ "sfdisk", "--no-reread", "--no-tell-kernel", "--force", parent_dev }, script, &out, NULL);
}
std::string csum_type_str(uint32_t data_csum_type)
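
The reworked `get_parent_device()` above first checks `/sys/block/` and only then falls back to stripping the partition suffix from the name. The name-only part of that logic can be sketched in isolation as follows. `guess_parent_by_name` is a hypothetical helper for illustration; it deliberately omits the `/sys/block/` lookups, so it is not a replacement for the real function.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Guess the parent disk of a partition from its name alone:
// strip trailing digits, then a 'p' separator if it follows a digit
// (nvme0n1p1 -> nvme0n1). Returns "" if the path is not under /dev/.
std::string guess_parent_by_name(const std::string & dev_path)
{
    const std::string prefix = "/dev/";
    if (dev_path.compare(0, prefix.size(), prefix) != 0)
        return "";
    std::string dev = dev_path.substr(prefix.size());
    size_t i = dev.size();
    while (i > 0 && isdigit((unsigned char)dev[i-1]))
        i--;
    if (i >= 2 && dev[i-1] == 'p' && isdigit((unsigned char)dev[i-2]))
        i--;
    return prefix + dev.substr(0, i);
}

// Usage sketch with hand-checked results.
void guess_parent_example()
{
    assert(guess_parent_by_name("/dev/sda1") == "/dev/sda");
    assert(guess_parent_by_name("/dev/nvme0n1p2") == "/dev/nvme0n1");
    assert(guess_parent_by_name("sda1") == "");
    // Name-only stripping mangles whole-disk names that end in digits,
    // which is presumably why the real code consults /sys/block/ first.
    assert(guess_parent_by_name("/dev/nvme0n1") == "/dev/nvme0n");
    assert(guess_parent_by_name("/dev/dm-0") == "/dev/dm-");
}
```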

@@ -19,6 +19,7 @@
#include "addr_util.h"
#include "str_util.h"
#include "json_util.h"
#include "nfs_proxy.h"
#include "nfs_kv.h"
#include "nfs_block.h"

@@ -14,19 +14,7 @@
#include "osd.h"
#include "http_client.h"
#include "str_util.h"
-static blockstore_config_t json_to_bs(const json11::Json::object & config)
-{
-blockstore_config_t bs;
-for (auto kv: config)
-{
-if (kv.second.is_string())
-bs[kv.first] = kv.second.string_value();
-else if (!kv.second.is_null())
-bs[kv.first] = kv.second.dump();
-}
-return bs;
-}
+#include "json_util.h"
osd_t::osd_t(const json11::Json & config, ring_loop_t *ringloop)
{
@@ -46,7 +34,7 @@ osd_t::osd_t(const json11::Json & config, ring_loop_t *ringloop)
if (!json_is_true(this->config["disable_blockstore"]))
{
-auto bs_cfg = json_to_bs(this->config);
+auto bs_cfg = json_to_string_map(this->config);
this->bs = new blockstore_t(bs_cfg, ringloop, tfd);
// Wait for blockstore initialisation before actually starting OSD logic
// to prevent peering timeouts during restart with filled databases
@@ -151,7 +139,7 @@ void osd_t::parse_config(bool init)
}
if (bs)
{
-auto bs_cfg = json_to_bs(config);
+auto bs_cfg = json_to_string_map(config);
bs->parse_config(bs_cfg);
}
st_cli.parse_config(config);

@@ -337,7 +337,7 @@ std::vector<osd_chain_read_t> osd_t::collect_chained_read_requests(osd_op_t *cur
{
uint8_t *part_bitmap = ((uint8_t*)op_data->snapshot_bitmaps) + chain_pos*stripe_count*clean_entry_bitmap_size;
int start = !cur_op->req.rw.len ? 0 : (cur_op->req.rw.offset - op_data->oid.stripe)/bs_bitmap_granularity;
-int end = !cur_op->req.rw.len ? op_data->pg_data_size*clean_entry_bitmap_size : start + cur_op->req.rw.len/bs_bitmap_granularity;
+int end = !cur_op->req.rw.len ? op_data->pg_data_size*clean_entry_bitmap_size*8 : start + cur_op->req.rw.len/bs_bitmap_granularity;
// Skip unneeded part in the beginning
while (start < end && (
((global_bitmap[start>>3] >> (start&7)) & 1) ||
@@ -369,12 +369,15 @@ std::vector<osd_chain_read_t> osd_t::collect_chained_read_requests(osd_op_t *cur
global_bitmap[cur>>3] = global_bitmap[cur>>3] | (part_bitmap[cur>>3] & (1 << (cur&7)));
}
// Add request
-chain_reads.push_back((osd_chain_read_t){
-.chain_pos = chain_pos,
-.inode = op_data->read_chain[chain_pos],
-.offset = start*bs_bitmap_granularity,
-.len = (end-start)*bs_bitmap_granularity,
-});
+if (cur_op->req.rw.len)
+{
+chain_reads.push_back((osd_chain_read_t){
+.chain_pos = chain_pos,
+.inode = op_data->read_chain[chain_pos],
+.offset = start*bs_bitmap_granularity,
+.len = (end-start)*bs_bitmap_granularity,
+});
+}
}
}
return chain_reads;
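
The first hunk above multiplies the zero-length upper bound by 8. Since `start` and `end` are used as bit positions (`start>>3` selects the byte, `start&7` the bit within it), a bound expressed in bytes would presumably cover only one eighth of the bitmap. The sketch below illustrates that difference on a plain byte vector. `bitmap_get` and `count_set_bits` are hypothetical helpers, not OSD code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Test one bit of a byte-packed bitmap, using the same >>3 and &7
// indexing as the hunk above.
bool bitmap_get(const std::vector<uint8_t> & bitmap, size_t bit)
{
    assert(bit < bitmap.size()*8);
    return (bitmap[bit >> 3] >> (bit & 7)) & 1;
}

// Count set bits in [0, end_bit). end_bit is a bit index, not a byte count.
size_t count_set_bits(const std::vector<uint8_t> & bitmap, size_t end_bit)
{
    assert(end_bit <= bitmap.size()*8);
    size_t n = 0;
    for (size_t bit = 0; bit < end_bit; bit++)
        n += bitmap_get(bitmap, bit) ? 1 : 0;
    return n;
}

// Usage sketch: a byte-sized bound misses bits that a bit-sized bound finds.
void bitmap_bound_example()
{
    std::vector<uint8_t> bitmap = { 0x00, 0x80 }; // only bit 15 is set
    assert(count_set_bits(bitmap, bitmap.size()) == 0);   // scans 2 bits
    assert(count_set_bits(bitmap, bitmap.size()*8) == 1); // scans 16 bits
}
```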

@@ -55,10 +55,3 @@ json11::Json::object osd_messenger_t::merge_configs(const json11::Json::object &
{
return cli_config;
}
-bool json_is_true(const json11::Json & val)
-{
-if (val.is_string())
-return val == "true" || val == "yes" || val == "1";
-return val.bool_value();
-}

src/util/json_util.cpp (new file, 35 lines)

@@ -0,0 +1,35 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
#include "json_util.h"
std::map<std::string, std::string> json_to_string_map(const json11::Json::object & config)
{
std::map<std::string, std::string> bs;
for (auto kv: config)
{
if (kv.second.is_string())
bs[kv.first] = kv.second.string_value();
else if (!kv.second.is_null())
bs[kv.first] = kv.second.dump();
}
return bs;
}
bool json_is_true(const json11::Json & val)
{
if (val.is_string())
return val == "true" || val == "yes" || val == "1";
return val.bool_value();
}
bool json_is_false(const json11::Json & val)
{
if (val.is_string())
return val.string_value() == "false" || val.string_value() == "no" || val.string_value() == "0";
if (val.is_number())
return val.number_value() == 0;
if (val.is_bool())
return !val.bool_value();
return false;
}
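
Note that `json_is_false()` above is not the negation of `json_is_true()`: a string that matches neither list is reported as neither true nor false. The sketch below reproduces only the string branch with `std::string` in place of `json11::Json`, so it compiles without the json11 library. `str_is_true` and `str_is_false` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <string>

// String branch of json_is_true, on a plain std::string.
bool str_is_true(const std::string & s)
{
    return s == "true" || s == "yes" || s == "1";
}

// String branch of json_is_false, on a plain std::string.
bool str_is_false(const std::string & s)
{
    return s == "false" || s == "no" || s == "0";
}

// Usage sketch: unrecognised strings are neither true nor false.
void str_bool_example()
{
    assert(str_is_true("yes") && !str_is_false("yes"));
    assert(str_is_false("0") && !str_is_true("0"));
    assert(!str_is_true("maybe") && !str_is_false("maybe"));
    assert(!str_is_true("") && !str_is_false(""));
}
```

Having both predicates lets a caller distinguish an explicit "false" from an unset or unrecognised value.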

src/util/json_util.h (new file, 13 lines)

@@ -0,0 +1,13 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
#pragma once
#include <map>
#include <string>
#include "json11/json11.hpp"
std::map<std::string, std::string> json_to_string_map(const json11::Json::object & config);
bool json_is_true(const json11::Json & val);
bool json_is_false(const json11::Json & val);

@@ -489,7 +489,7 @@ std::string format_datetime(uint64_t unixtime)
bool is_zero(void *buf, size_t size)
{
size_t i = 0;
-while (i <= size-8)
+while (i+8 <= size)
{
if (*(uint64_t*)((uint8_t*)buf + i))
return false;
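
The changed loop condition matters because `size` is a `size_t`: for `size < 8`, `size-8` wraps around to a very large value, so `i <= size-8` holds and the loop reads past the end of the buffer. A self-contained sketch of the corrected pattern follows. The byte-wise tail loop and the use of `memcpy` for the 8-byte loads are assumptions of this sketch, since the hunk shows only the first loop, and `is_zero_sketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Return true if all `size` bytes at `buf` are zero.
bool is_zero_sketch(const void *buf, size_t size)
{
    const uint8_t *p = (const uint8_t*)buf;
    size_t i = 0;
    // i+8 <= size stays correct for size < 8, unlike i <= size-8
    while (i+8 <= size)
    {
        uint64_t word;
        memcpy(&word, p+i, sizeof(word)); // avoids unaligned access
        if (word)
            return false;
        i += 8;
    }
    // Remaining 0..7 bytes
    while (i < size)
    {
        if (p[i])
            return false;
        i++;
    }
    return true;
}

// Usage sketch, including the wrap-around that the old bound suffered from.
void is_zero_example()
{
    uint8_t small[3] = { 0, 0, 0 };
    assert(is_zero_sketch(small, sizeof(small)));
    small[2] = 1;
    assert(!is_zero_sketch(small, sizeof(small)));
    size_t size = 3;
    assert(size - 8 > size); // unsigned subtraction wrapped around
}
```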

@@ -83,16 +83,19 @@ fi
POOLCFG='"name":"testpool","failure_domain":"osd",'$POOLCFG
$ETCDCTL put /vitastor/config/pools '{"1":{'$POOLCFG',"pg_size":'$PG_SIZE',"pg_minsize":'$PG_MINSIZE',"pg_count":'$PG_COUNT'}}'
-wait_up()
+wait_pool_up()
{
local sec=$1
local pool=$2
local pgsize=$3
local pgcount=$4
local i=0
local configured=0
while [[ $i -lt $sec ]]; do
-if $ETCDCTL get /vitastor/pg/config --print-value-only | jq -s -e '(. | length) != 0 and ([ .[0].items["1"][] |
-select(((.osd_set | select(. != 0) | sort | unique) | length) == '$PG_SIZE') ] | length) == '$PG_COUNT; then
+if $ETCDCTL get /vitastor/pg/config --print-value-only | jq -s -e '(. | length) != 0 and ([ .[0].items["'$pool'"][] |
+select(((.osd_set | select(. != 0) | sort | unique) | length) == '$pgsize') ] | length) == '$pgcount; then
configured=1
-if $ETCDCTL get /vitastor/pg/state/1/ --prefix --print-value-only | jq -s -e '[ .[] | select(.state == ["active"]) ] | length == '$PG_COUNT; then
+if $ETCDCTL get /vitastor/pg/state/$pool/ --prefix --print-value-only | jq -s -e '[ .[] | select(.state == ["active"]) ] | length == '$pgcount; then
break
fi
fi
@@ -107,6 +110,11 @@ wait_up()
done
}
wait_up()
{
wait_pool_up "$1" 1 $PG_SIZE $PG_COUNT
}
if [[ $OSD_COUNT -gt 0 ]]; then
wait_up 120
fi

@@ -68,6 +68,8 @@ TEST_NAME=csum_4k_dmj OSD_ARGS="--data_csum_type crc32c --inmemory_metadata fal
TEST_NAME=csum_4k_dj OSD_ARGS="--data_csum_type crc32c --inmemory_journal false" OFFSET_ARGS=$OSD_ARGS ./test_heal.sh
TEST_NAME=csum_4k OSD_ARGS="--data_csum_type crc32c" OFFSET_ARGS=$OSD_ARGS ./test_heal.sh
./test_snapshot_pool2.sh
./test_osd_tags.sh
./test_enospc.sh

tests/test_snapshot_pool2.sh (new executable file, 38 lines)

@@ -0,0 +1,38 @@
#!/bin/bash -ex
. `dirname $0`/run_3osds.sh
check_qemu
# snapshot in another pool
build/src/cmd/vitastor-cli --etcd_address $ETCD_URL create-pool testpool2 -s 3 -n 4 --failure_domain osd
wait_pool_up 30 2 3 4
build/src/cmd/vitastor-cli --etcd_address $ETCD_URL create -s 128M testchain -p testpool
LD_PRELOAD="build/src/client/libfio_vitastor.so" \
fio -thread -name=test -ioengine=build/src/client/libfio_vitastor.so -bs=1M -direct=1 -iodepth=4 -fsync=1 -rw=write \
-etcd=$ETCD_URL -image=testchain -mirror_file=./testdata/bin/mirror.bin -buffer_pattern=0xabcd
build/src/cmd/vitastor-cli --etcd_address $ETCD_URL snap-create testchain@snap1 -p testpool2
LD_PRELOAD="build/src/client/libfio_vitastor.so" \
fio -thread -name=test -ioengine=build/src/client/libfio_vitastor.so -bs=4k -direct=1 -iodepth=4 -end_fsync=1 -rw=randwrite -number_ios=32 \
-etcd=$ETCD_URL -image=testchain -mirror_file=./testdata/bin/mirror.bin -buffer_pattern=0xabcd
build/src/cmd/vitastor-cli --etcd_address $ETCD_URL dd iimg=testchain of=./testdata/bin/res.bin bs=128k iodepth=4
cmp ./testdata/bin/res.bin ./testdata/bin/mirror.bin
build/src/cmd/vitastor-cli --etcd_address $ETCD_URL dd iimg=testchain of=./testdata/bin/res.bin bs=32k iodepth=4 conv=nosparse
cmp ./testdata/bin/res.bin ./testdata/bin/mirror.bin
qemu-img convert -p \
-f raw "vitastor:etcd_host=127.0.0.1\:$ETCD_PORT/v3:image=testchain" \
-O raw ./testdata/bin/res.bin
cmp ./testdata/bin/res.bin ./testdata/bin/mirror.bin
format_green OK