Compare commits

3 Commits

Author SHA1 Message Date
Vitaliy Filippov 4eab26f968 Add documentation and a very basic test for pool management commands
2024-02-28 13:08:04 +03:00
Vitaliy Filippov 86243b7101 Rework & fix pool-create / pool-modify / pool-ls 2024-02-28 13:08:04 +03:00
idelson dc92851322 vitastor-cli: add commands to control pools: pool-create, pool-ls, pool-modify, pool-rm
PR #59 - https://github.com/vitalif/vitastor/pull/58/commits

By MIND Software LLC

By submitting this pull request, I accept Vitastor CLA
2024-02-28 13:08:04 +03:00
26 changed files with 2466 additions and 292 deletions

View File

@ -154,8 +154,25 @@ That is, if it becomes impossible to place PG data on at least (pg_minsize)
OSDs, PG is deactivated for both read and write. So you know that a fresh
write always goes to at least (pg_minsize) OSDs (disks).
That is, pg_size minus pg_minsize sets the number of disk failures to tolerate
without temporary downtime (for [osd_out_time](monitor.en.md#osd_out_time)).
For example, the difference between pg_minsize 2 and 1 in a 3-way replicated
pool (pg_size=3) is:
- If 2 hosts go down with pg_minsize=2, the pool becomes inactive and remains
inactive for [osd_out_time](monitor.en.md#osd_out_time) (10 minutes). After
this timeout, the monitor selects replacement hosts/OSDs and the pool comes
up and starts to heal. Therefore, if you don't have replacement OSDs, i.e.
if you only have 3 hosts with OSDs and 2 of them are down, the pool remains
inactive until you add or return at least 1 host (or change failure_domain
to "osd").
- If 2 hosts go down with pg_minsize=1, the pool only experiences a short
I/O pause until the monitor notices that OSDs are down (5-10 seconds with
the default [etcd_report_interval](osd.en.md#etcd_report_interval)). After
this pause, I/O resumes, but new data is temporarily written in only 1 copy.
Then, after osd_out_time, the monitor also selects replacement OSDs and the
pool starts to heal.
So, pg_minsize regulates the number of failures that a pool can tolerate
without temporary downtime for [osd_out_time](monitor.en.md#osd_out_time),
but at a cost of slightly reduced storage reliability.
FIXME: pg_minsize behaviour may be changed in the future to only make PGs
read-only instead of deactivating them.
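The pg_size / pg_minsize relationship described in this hunk can be sketched as a small check. This is only an illustrative model, not code from Vitastor: the function name `pg_stays_active` is hypothetical, and it deliberately ignores the later step where the monitor picks replacement OSDs after osd_out_time.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Vitastor): models the rule that a PG stays
// active only while at least pg_minsize of its pg_size OSDs are still alive.
// Replacement of failed OSDs after osd_out_time is intentionally not modelled.
bool pg_stays_active(uint64_t pg_size, uint64_t pg_minsize, uint64_t failed_osds)
{
    if (failed_osds > pg_size)
        failed_osds = pg_size;
    uint64_t alive = pg_size - failed_osds;
    return alive >= pg_minsize;
}
```

Under this model, `pg_stays_active(3, 2, 2)` is false (the pool pauses until replacements are selected), while `pg_stays_active(3, 1, 2)` is true (I/O continues with a single copy).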
@ -168,8 +185,8 @@ read-only instead of deactivating them.
Number of PGs for this pool. The value should be big enough for the monitor /
LP solver to be able to optimize data placement.
"Enough" is usually around 64-128 PGs per OSD, i.e. you set pg_count for pool
to (total OSD count * 100 / pg_size). You can round it to the closest power of 2,
"Enough" is usually around 10-100 PGs per OSD, i.e. you set pg_count for pool
to (total OSD count * 10 / pg_size). You can round it to the closest power of 2,
because it makes it easier to reduce or increase PG count later by dividing or
multiplying it by 2.
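As a worked illustration of the updated rule of thumb (total OSD count * 10 / pg_size, rounded to the closest power of 2), here is a minimal sketch. The helper name `suggest_pg_count` is an assumption made for this example and does not exist in Vitastor; ties between two powers of 2 are resolved downwards, which is also an arbitrary choice.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Vitastor: suggests a starting pg_count
// as (OSD count * 10 / pg_size) rounded to the closest power of 2.
uint64_t suggest_pg_count(uint64_t osd_count, uint64_t pg_size)
{
    if (!osd_count || !pg_size)
        return 0;
    uint64_t target = osd_count * 10 / pg_size;
    if (target < 1)
        target = 1;
    // Find the largest power of 2 that does not exceed the target
    uint64_t lower = 1;
    while (lower * 2 <= target)
        lower *= 2;
    uint64_t upper = lower * 2;
    // Pick whichever power of 2 is closer; ties go to the lower one
    return (target - lower <= upper - target) ? lower : upper;
}
```

For 12 OSDs and pg_size=3 the raw value is 40, so the sketch suggests 32. For 100 OSDs and pg_size=2 the raw value is 500, so it suggests 512.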

View File

@ -157,9 +157,25 @@
OSDs, the PG is deactivated for both reads and writes. In other words, it is always
known that new data blocks are written to at least pg_minsize disks.
In essence, the difference between pg_size and pg_minsize sets the number of disk failures that the pool
can survive without a temporary (for [osd_out_time](monitor.ru.md#osd_out_time))
service outage.
For example, the difference between pg_minsize 2 and 1 in a replicated pool with 3 copies
of data (pg_size=3) shows up as follows:
- If 2 servers go down with pg_minsize=2, the pool becomes inactive and
remains inactive for [osd_out_time](monitor.en.md#osd_out_time)
(10 minutes), after which the monitor assigns other OSDs/servers as replacements, the pool
comes up and starts to restore the missing copies of data. Accordingly,
if there are no replacement OSDs, that is, if you only have 3 servers with OSDs and 2 of them
are unavailable, the pool stays unavailable until you return
or add at least 1 server (or switch failure_domain to "osd").
- If 2 servers go down with pg_minsize=1, I/O is only paused
for a short time, until the monitor notices that the OSDs are down
(which takes 5-10 seconds with the default [etcd_report_interval](osd.en.md#etcd_report_interval)).
After that, I/O resumes, but new data is temporarily written
in only 1 copy. Then, once osd_out_time passes, the monitor likewise assigns
other OSDs to replace the failed ones and the pool starts to restore the copies of data.
That is, pg_minsize regulates the number of failures that a pool can survive without
a temporary service outage for [osd_out_time](monitor.ru.md#osd_out_time),
but at the cost of slightly reduced reliability guarantees.
FIXME: The behaviour of pg_minsize may be changed in the future from fully deactivating
PGs to switching them to read-only mode.
@ -172,8 +188,8 @@ PGs to switching them to read-only mode.
Number of PGs for this pool. The number should be large enough for the monitor
to be able to distribute data evenly across them.
Usually this means around 64-128 PGs per OSD, i.e. pg_count can be set
to (total OSD count * 100 / pg_size). The value can be rounded to the nearest
Usually this means around 10-100 PGs per OSD, i.e. pg_count can be set
to (total OSD count * 10 / pg_size). The value can be rounded to the nearest
power of 2, so that it is easier to decrease or increase the PG count later by multiplying
or dividing it by 2.

View File

@ -75,18 +75,16 @@ On the monitor hosts:
## Create a pool
Create pool configuration in etcd:
Create a pool using vitastor-cli:
```
etcdctl --endpoints=... put /vitastor/config/pools '{"1":{"name":"testpool",
"scheme":"replicated","pg_size":2,"pg_minsize":1,"pg_count":256,"failure_domain":"host"}}'
vitastor-cli create-pool testpool --pg_size 2 --pg_count 256
```
For EC pools the configuration should look like the following:
```
etcdctl --endpoints=... put /vitastor/config/pools '{"2":{"name":"ecpool",
"scheme":"ec","pg_size":4,"parity_chunks":2,"pg_minsize":2,"pg_count":256,"failure_domain":"host"}}'
vitastor-cli create-pool testpool --ec 2+2 --pg_count 256
```
After you do this, one of the monitors will configure PGs and OSDs will start them.

View File

@ -77,18 +77,16 @@
## Create a pool
Create the pool configuration using etcdctl:
Create a pool using vitastor-cli:
```
etcdctl --endpoints=... put /vitastor/config/pools '{"1":{"name":"testpool",
"scheme":"replicated","pg_size":2,"pg_minsize":1,"pg_count":256,"failure_domain":"host"}}'
vitastor-cli create-pool testpool --pg_size 2 --pg_count 256
```
For erasure-coded pools the configuration should look roughly like this:
```
etcdctl --endpoints=... put /vitastor/config/pools '{"2":{"name":"ecpool",
"scheme":"ec","pg_size":4,"parity_chunks":2,"pg_minsize":2,"pg_count":256,"failure_domain":"host"}}'
vitastor-cli create-pool testpool --ec 2+2 --pg_count 256
```
After this, one of the monitors should configure the PGs, and the OSDs should start them.

View File

@ -24,6 +24,10 @@ It supports the following commands:
- [fix](#fix)
- [alloc-osd](#alloc-osd)
- [rm-osd](#rm-osd)
- [create-pool](#create-pool)
- [modify-pool](#modify-pool)
- [ls-pools](#ls-pools)
- [rm-pool](#rm-pool)
Global options:
@ -238,3 +242,84 @@ Refuses to remove OSDs with data without `--force` and `--allow-data-loss`.
With `--dry-run` only checks if deletion is possible without data loss and
redundancy degradation.
## create-pool
`vitastor-cli create-pool|pool-create <name> (-s <pg_size>|--ec <N>+<K>) -n <pg_count> [OPTIONS]`
Create a pool. Required parameters:
| `-s|--pg_size R` | Number of replicas for replicated pools |
| `--ec N+K` | Number of data (N) and parity (K) chunks for erasure-coded pools |
| `-n|--pg_count N` | PG count for the new pool (start with 10*<OSD count>/pg_size rounded to a power of 2) |
Optional parameters:
| `--pg_minsize <number>` | R or N+K minus number of failures to tolerate without downtime ([details](../config/pool.en.md#pg_minsize)) |
| `--failure_domain host` | Failure domain: host, osd or a level from placement_levels. Default: host |
| `--root_node <node>` | Put pool only on child OSDs of this placement tree node |
| `--osd_tags <tag>[,<tag>]...` | Put pool only on OSDs tagged with all specified tags |
| `--block_size 128k` | Put pool only on OSDs with this data block size |
| `--bitmap_granularity 4k` | Put pool only on OSDs with this logical sector size |
| `--immediate_commit none` | Put pool only on OSDs with this or larger immediate_commit (none < small < all) |
| `--primary_affinity_tags tags` | Prefer to put primary copies on OSDs with all specified tags |
| `--scrub_interval <time>` | Enable regular scrubbing for this pool. Format: number + unit s/m/h/d/M/y |
| `--pg_stripe_size <number>` | Increase object grouping stripe |
| `--max_osd_combinations 10000` | Maximum number of random combinations for LP solver input |
| `--wait` | Wait for the new pool to come online |
| `-f|--force` | Do not check that cluster has enough OSDs to create the pool |
See also [Pool configuration](../config/pool.en.md) for detailed parameter descriptions.
Examples:
`vitastor-cli create-pool test_x4 -s 4 -n 32`
`vitastor-cli create-pool test_ec42 --ec 4+2 -n 32`
## modify-pool
`vitastor-cli modify-pool|pool-modify <id|name> [--name <new_name>] [PARAMETERS...]`
Modify an existing pool. Modifiable parameters:
```
[-s|--pg_size <number>] [--pg_minsize <number>] [-n|--pg_count <count>]
[--failure_domain <level>] [--root_node <node>] [--osd_tags <tags>]
[--max_osd_combinations <number>] [--primary_affinity_tags <tags>] [--scrub_interval <time>]
```
Non-modifiable parameters (changing them WILL lead to data loss):
```
[--block_size <size>] [--bitmap_granularity <size>]
[--immediate_commit <all|small|none>] [--pg_stripe_size <size>]
```
These, however, can still be modified with -f|--force.
See [create-pool](#create-pool) for parameter descriptions.
Examples:
`vitastor-cli modify-pool pool_A --name pool_B`
`vitastor-cli modify-pool 2 --pg_size 4 -n 128`
## rm-pool
`vitastor-cli rm-pool|pool-rm [--force] <id|name>`
Remove a pool. Refuses to remove pools with images without `--force`.
## ls-pools
`vitastor-cli ls-pools|pool-ls|ls-pool|pools [-l] [--detail] [--sort FIELD] [-r] [-n N] [--stats] [<glob> ...]`
List pools (only matching <glob> patterns if passed).
| `-l|--long` | Also report I/O statistics |
| `--detail` | Use list format (not table), show all details |
| `--sort FIELD` | Sort by specified field (see fields in --json output) |
| `-r|--reverse` | Sort in descending order |
| `-n|--count N` | Only list first N items |

View File

@ -23,6 +23,10 @@ vitastor-cli - интерфейс командной строки для адм
- [merge-data](#merge-data)
- [alloc-osd](#alloc-osd)
- [rm-osd](#rm-osd)
- [create-pool](#create-pool)
- [modify-pool](#modify-pool)
- [ls-pools](#ls-pools)
- [rm-pool](#rm-pool)
Global options:
@ -85,8 +89,8 @@ kaveri 2/1 32 0 B 10 G 0 B 100% 0%
`vitastor-cli ls [-l] [-p POOL] [--sort FIELD] [-r] [-n N] [<glob> ...]`
Show the list of images. If `<glob>` patterns are passed, show only images with names
matching these patterns (standard filesystem patterns with * and ?).
Show the list of images. If `<glob>` pattern(s) are passed, show only images with names
matching one of the patterns (standard filesystem patterns with * and ?).
Options:
@ -255,3 +259,85 @@ vitastor-cli snap-create [-p|--pool <id|name>] <image>@<snapshot>
With the `--dry-run` option, only checks whether removal is possible without data loss and
redundancy degradation.
## create-pool
`vitastor-cli create-pool|pool-create <name> (-s <pg_size>|--ec <N>+<K>) -n <pg_count> [OPTIONS]`
Create a pool. Required parameters:
| `-s|--pg_size R` | Number of data copies for replicated pools |
| `--ec N+K` | Number of data (N) and parity (K) chunks for erasure-coded pools |
| `-n|--pg_count N` | PG count for the new pool (start with 10*<OSD count>/pg_size rounded to a power of 2) |
Optional parameters:
| `--pg_minsize <number>` | (R or N+K) minus the number of failures allowed without stopping the pool ([details](../config/pool.ru.md#pg_minsize)) |
| `--failure_domain host` | Failure domain: host, osd or another level from placement_levels. Default: host |
| `--root_node <node>` | Use only child OSDs of this placement tree node for the pool |
| `--osd_tags <tag>[,<tag>]...` | ...only OSDs with all specified tags |
| `--block_size 128k` | ...only OSDs with this block size |
| `--bitmap_granularity 4k` | ...only OSDs with this logical sector size |
| `--immediate_commit none` | ...only OSDs with this or larger immediate_commit (none < small < all) |
| `--primary_affinity_tags tags` | Prefer OSDs with all specified tags for the primary role |
| `--scrub_interval <time>` | Enable scrubbing at the specified interval (number + unit s/m/h/d/M/y) |
| `--pg_stripe_size <number>` | Increase the block used to group objects into PGs |
| `--max_osd_combinations 10000` | Maximum number of random OSD combinations for the LP solver |
| `--wait` | Wait for the new pool to be activated |
| `-f|--force` | Do not check that the cluster has enough failure domains to create the pool |
For detailed parameter descriptions see [Pool configuration](../config/pool.ru.md).
Examples:
`vitastor-cli create-pool test_x4 -s 4 -n 32`
`vitastor-cli create-pool test_ec42 --ec 4+2 -n 32`
## modify-pool
`vitastor-cli modify-pool|pool-modify <id|name> [--name <new_name>] [PARAMETERS...]`
Modify the settings of an existing pool. Modifiable parameters:
```
[-s|--pg_size <number>] [--pg_minsize <number>] [-n|--pg_count <count>]
[--failure_domain <level>] [--root_node <node>] [--osd_tags <tags>]
[--max_osd_combinations <number>] [--primary_affinity_tags <tags>] [--scrub_interval <time>]
```
Non-modifiable parameters (changing them WILL lead to data loss):
```
[--block_size <size>] [--bitmap_granularity <size>]
[--immediate_commit <all|small|none>] [--pg_stripe_size <size>]
```
These parameters can only be changed if you explicitly pass the -f or --force option.
See [create-pool](#create-pool) for parameter descriptions.
Examples:
`vitastor-cli modify-pool pool_A --name pool_B`
`vitastor-cli modify-pool 2 --pg_size 4 -n 128`
## rm-pool
`vitastor-cli rm-pool|pool-rm [--force] <id|name>`
Remove a pool. Refuses to remove a pool that still contains images without `--force`.
## ls-pools
`vitastor-cli ls-pools|pool-ls|ls-pool|pools [-l] [--detail] [--sort FIELD] [-r] [-n N] [--stats] [<glob> ...]`
Show the list of pools. If `<glob>` pattern(s) are passed, show only pools with names
matching one of the patterns (standard filesystem patterns with * and ?).
| `-l|--long` | Also print I/O statistics |
| `--detail` | Most detailed output, as a list (not a table) |
| `--sort FIELD` | Sort by the specified field (see fields in the --json output) |
| `-r|--reverse` | Sort in reverse order |
| `-n|--count N` | Print only the first N entries |

View File

@ -145,7 +145,6 @@ add_library(vitastor_client SHARED
cli_status.cpp
cli_describe.cpp
cli_fix.cpp
cli_df.cpp
cli_ls.cpp
cli_create.cpp
cli_modify.cpp
@ -154,6 +153,11 @@ add_library(vitastor_client SHARED
cli_rm_data.cpp
cli_rm.cpp
cli_rm_osd.cpp
cli_pool_cfg.cpp
cli_pool_create.cpp
cli_pool_ls.cpp
cli_pool_modify.cpp
cli_pool_rm.cpp
)
set_target_properties(vitastor_client PROPERTIES PUBLIC_HEADER "vitastor_c.h")
target_link_libraries(vitastor_client

View File

@ -113,6 +113,54 @@ static const char* help_text =
" With --dry-run only checks if deletion is possible without data loss and\n"
" redundancy degradation.\n"
"\n"
"vitastor-cli create-pool|pool-create <name> (-s <pg_size>|--ec <N>+<K>) -n <pg_count> [OPTIONS]\n"
" Create a pool. Required parameters:\n"
" -s|--pg_size R Number of replicas for replicated pools\n"
" --ec N+K Number of data (N) and parity (K) chunks for erasure-coded pools\n"
" -n|--pg_count N PG count for the new pool (start with 10*<OSD count>/pg_size rounded to a power of 2)\n"
" Optional parameters:\n"
" --pg_minsize <number> R or N+K minus number of failures to tolerate without downtime\n"
" --failure_domain host Failure domain: host, osd or a level from placement_levels. Default: host\n"
" --root_node <node> Put pool only on child OSDs of this placement tree node\n"
" --osd_tags <tag>[,<tag>]... Put pool only on OSDs tagged with all specified tags\n"
" --block_size 128k Put pool only on OSDs with this data block size\n"
" --bitmap_granularity 4k Put pool only on OSDs with this logical sector size\n"
" --immediate_commit none Put pool only on OSDs with this or larger immediate_commit (none < small < all)\n"
" --primary_affinity_tags tags Prefer to put primary copies on OSDs with all specified tags\n"
" --scrub_interval <time> Enable regular scrubbing for this pool. Format: number + unit s/m/h/d/M/y\n"
" --pg_stripe_size <number> Increase object grouping stripe\n"
" --max_osd_combinations 10000 Maximum number of random combinations for LP solver input\n"
" --wait Wait for the new pool to come online\n"
" -f|--force Do not check that cluster has enough OSDs to create the pool\n"
" Examples:\n"
" vitastor-cli create-pool test_x4 -s 4 -n 32\n"
" vitastor-cli create-pool test_ec42 --ec 4+2 -n 32\n"
"\n"
"vitastor-cli modify-pool|pool-modify <id|name> [--name <new_name>] [PARAMETERS...]\n"
" Modify an existing pool. Modifiable parameters:\n"
" [-s|--pg_size <number>] [--pg_minsize <number>] [-n|--pg_count <count>]\n"
" [--failure_domain <level>] [--root_node <node>] [--osd_tags <tags>]\n"
" [--max_osd_combinations <number>] [--primary_affinity_tags <tags>] [--scrub_interval <time>]\n"
" Non-modifiable parameters (changing them WILL lead to data loss):\n"
" [--block_size <size>] [--bitmap_granularity <size>]\n"
" [--immediate_commit <all|small|none>] [--pg_stripe_size <size>]\n"
" These, however, can still be modified with -f|--force.\n"
" See create-pool for parameter descriptions.\n"
" Examples:\n"
" vitastor-cli modify-pool pool_A --name pool_B\n"
" vitastor-cli modify-pool 2 --pg_size 4 -n 128\n"
"\n"
"vitastor-cli rm-pool|pool-rm [--force] <id|name>\n"
" Remove a pool. Refuses to remove pools with images without --force.\n"
"\n"
"vitastor-cli ls-pools|pool-ls|ls-pool|pools [-l] [--detail] [--sort FIELD] [-r] [-n N] [--stats] [<glob> ...]\n"
" List pools (only matching <glob> patterns if passed).\n"
" -l|--long Also report I/O statistics\n"
" --detail Use list format (not table), show all details\n"
" --sort FIELD Sort by specified field (see fields in --json output)\n"
" -r|--reverse Sort in descending order\n"
" -n|--count N Only list first N items\n"
"\n"
"Use vitastor-cli --help <command> for command details or vitastor-cli --help --all for all details.\n"
"\n"
"GLOBAL OPTIONS:\n"
@ -133,6 +181,8 @@ static json11::Json::object parse_args(int narg, const char *args[])
cfg["progress"] = "1";
for (int i = 1; i < narg; i++)
{
bool argHasValue = (!(i == narg-1) && (args[i+1][0] != '-'));
if (args[i][0] == '-' && args[i][1] == 'h' && args[i][2] == 0)
{
cfg["help"] = "1";
@ -143,15 +193,15 @@ static json11::Json::object parse_args(int narg, const char *args[])
}
else if (args[i][0] == '-' && args[i][1] == 'n' && args[i][2] == 0)
{
cfg["count"] = args[++i];
cfg["count"] = argHasValue ? args[++i] : "";
}
else if (args[i][0] == '-' && args[i][1] == 'p' && args[i][2] == 0)
{
cfg["pool"] = args[++i];
cfg["pool"] = argHasValue ? args[++i] : "";
}
else if (args[i][0] == '-' && args[i][1] == 's' && args[i][2] == 0)
{
cfg["size"] = args[++i];
cfg["size"] = argHasValue ? args[++i] : "";
}
else if (args[i][0] == '-' && args[i][1] == 'r' && args[i][2] == 0)
{
@ -164,17 +214,23 @@ static json11::Json::object parse_args(int narg, const char *args[])
else if (args[i][0] == '-' && args[i][1] == '-')
{
const char *opt = args[i]+2;
cfg[opt] = i == narg-1 || !strcmp(opt, "json") ||
if (!strcmp(opt, "json") || !strcmp(opt, "wait") ||
!strcmp(opt, "wait-list") || !strcmp(opt, "wait_list") ||
!strcmp(opt, "long") || !strcmp(opt, "del") ||
!strcmp(opt, "long") || !strcmp(opt, "detail") || !strcmp(opt, "del") ||
!strcmp(opt, "no-color") || !strcmp(opt, "no_color") ||
!strcmp(opt, "readonly") || !strcmp(opt, "readwrite") ||
!strcmp(opt, "force") || !strcmp(opt, "reverse") ||
!strcmp(opt, "allow-data-loss") || !strcmp(opt, "allow_data_loss") ||
!strcmp(opt, "dry-run") || !strcmp(opt, "dry_run") ||
!strcmp(opt, "help") || !strcmp(opt, "all") ||
(!strcmp(opt, "writers-stopped") || !strcmp(opt, "writers_stopped")) && strcmp("1", args[i+1]) != 0
? "1" : args[++i];
!strcmp(opt, "writers-stopped") || !strcmp(opt, "writers_stopped"))
{
cfg[opt] = "1";
}
else
{
cfg[opt] = argHasValue ? args[++i] : "";
}
}
else
{
@ -217,7 +273,7 @@ static int run(cli_tool_t *p, json11::Json::object cfg)
else if (cmd[0] == "df")
{
// Show pool space stats
action_cb = p->start_df(cfg);
action_cb = p->start_pool_ls(cfg);
}
else if (cmd[0] == "ls")
{
@ -324,6 +380,44 @@ static int run(cli_tool_t *p, json11::Json::object cfg)
// Allocate a new OSD number
action_cb = p->start_alloc_osd(cfg);
}
else if (cmd[0] == "create-pool" || cmd[0] == "pool-create")
{
// Create a new pool
if (cmd.size() > 1 && cfg["name"].is_null())
{
cfg["name"] = cmd[1];
}
action_cb = p->start_pool_create(cfg);
}
else if (cmd[0] == "modify-pool" || cmd[0] == "pool-modify")
{
// Modify existing pool
if (cmd.size() > 1)
{
cfg["old_name"] = cmd[1];
}
action_cb = p->start_pool_modify(cfg);
}
else if (cmd[0] == "rm-pool" || cmd[0] == "pool-rm")
{
// Remove existing pool
if (cmd.size() > 1)
{
cfg["pool"] = cmd[1];
}
action_cb = p->start_pool_rm(cfg);
}
else if (cmd[0] == "ls-pool" || cmd[0] == "pool-ls" || cmd[0] == "ls-pools" || cmd[0] == "pools")
{
// Show pool list
cfg["show_recovery"] = 1;
if (cmd.size() > 1)
{
cmd.erase(cmd.begin(), cmd.begin()+1);
cfg["names"] = cmd;
}
action_cb = p->start_pool_ls(cfg);
}
else
{
result = { .err = EINVAL, .text = "unknown command: "+cmd[0].string_value() };
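
The `argHasValue` check added to `parse_args()` in this file can be illustrated in isolation. The following is a simplified stand-in, assuming `std::vector<std::string>` arguments instead of the raw `const char *args[]` array; the function name `option_value` is hypothetical and does not appear in the patch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the argHasValue logic: an option consumes the next
// argument as its value only if that argument exists and does not itself
// look like an option (i.e. does not start with '-').
// On success, i is advanced to the consumed value; otherwise "" is returned
// and i is left unchanged.
std::string option_value(const std::vector<std::string> & args, size_t & i)
{
    bool has_value = i+1 < args.size() &&
        (args[i+1].empty() || args[i+1][0] != '-');
    return has_value ? args[++i] : "";
}
```

One visible consequence of this approach is that a value beginning with `-` is never consumed as an option value, so the option ends up with an empty string instead of reading past the end of the argument list.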

View File

@ -46,6 +46,7 @@ public:
json11::Json etcd_result;
void parse_config(json11::Json::object & cfg);
json11::Json parse_tags(std::string tags);
void change_parent(inode_t cur, inode_t new_parent, cli_result_t *result);
inode_config_t* get_inode_cfg(const std::string & name);
@ -58,7 +59,6 @@ public:
std::function<bool(cli_result_t &)> start_status(json11::Json);
std::function<bool(cli_result_t &)> start_describe(json11::Json);
std::function<bool(cli_result_t &)> start_fix(json11::Json);
std::function<bool(cli_result_t &)> start_df(json11::Json);
std::function<bool(cli_result_t &)> start_ls(json11::Json);
std::function<bool(cli_result_t &)> start_create(json11::Json);
std::function<bool(cli_result_t &)> start_modify(json11::Json);
@ -68,6 +68,10 @@ public:
std::function<bool(cli_result_t &)> start_rm(json11::Json);
std::function<bool(cli_result_t &)> start_rm_osd(json11::Json cfg);
std::function<bool(cli_result_t &)> start_alloc_osd(json11::Json cfg);
std::function<bool(cli_result_t &)> start_pool_create(json11::Json);
std::function<bool(cli_result_t &)> start_pool_modify(json11::Json);
std::function<bool(cli_result_t &)> start_pool_rm(json11::Json);
std::function<bool(cli_result_t &)> start_pool_ls(json11::Json);
// Should be called like loop_and_wait(start_status(), <completion callback>)
void loop_and_wait(std::function<bool(cli_result_t &)> loop_cb, std::function<void(const cli_result_t &)> complete_cb);
@ -77,8 +81,13 @@ public:
std::string print_table(json11::Json items, json11::Json header, bool use_esc);
size_t print_detail_title_len(json11::Json item, std::vector<std::pair<std::string, std::string>> names, size_t prev_len);
std::string print_detail(json11::Json item, std::vector<std::pair<std::string, std::string>> names, size_t title_len, bool use_esc);
std::string format_lat(uint64_t lat);
std::string format_q(double depth);
bool stupid_glob(const std::string str, const std::string glob);
std::string implode(const std::string & sep, json11::Json array);

View File

@ -183,7 +183,16 @@ resume_3:
// Save into inode_config for library users to be able to take it from there immediately
new_cfg.mod_revision = parent->etcd_result["responses"][0]["response_put"]["header"]["revision"].uint64_value();
parent->cli->st_cli.insert_inode_config(new_cfg);
result = (cli_result_t){ .err = 0, .text = "Image "+image_name+" created" };
result = (cli_result_t){
.err = 0,
.text = "Image "+image_name+" created",
.data = json11::Json::object {
{ "name", image_name },
{ "pool", new_pool_name },
{ "parent", new_parent },
{ "size", size },
}
};
state = 100;
}
@ -251,7 +260,16 @@ resume_4:
// Save into inode_config for library users to be able to take it from there immediately
new_cfg.mod_revision = parent->etcd_result["responses"][0]["response_put"]["header"]["revision"].uint64_value();
parent->cli->st_cli.insert_inode_config(new_cfg);
result = (cli_result_t){ .err = 0, .text = "Snapshot "+image_name+"@"+new_snap+" created" };
result = (cli_result_t){
.err = 0,
.text = "Snapshot "+image_name+"@"+new_snap+" created",
.data = json11::Json::object {
{ "name", image_name+"@"+new_snap },
{ "pool", (uint64_t)new_pool_id },
{ "parent", new_parent },
{ "size", size },
}
};
state = 100;
}

View File

@ -1,243 +0,0 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 (see README.md for details)
#include "cli.h"
#include "cluster_client.h"
#include "str_util.h"
// List pools with space statistics
struct pool_lister_t
{
cli_tool_t *parent;
int state = 0;
json11::Json space_info;
cli_result_t result;
std::map<pool_id_t, json11::Json::object> pool_stats;
bool is_done()
{
return state == 100;
}
void get_stats()
{
if (state == 1)
goto resume_1;
// Space statistics - pool/stats/<pool>
parent->etcd_txn(json11::Json::object {
{ "success", json11::Json::array {
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(
parent->cli->st_cli.etcd_prefix+"/pool/stats/"
) },
{ "range_end", base64_encode(
parent->cli->st_cli.etcd_prefix+"/pool/stats0"
) },
} },
},
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(
parent->cli->st_cli.etcd_prefix+"/osd/stats/"
) },
{ "range_end", base64_encode(
parent->cli->st_cli.etcd_prefix+"/osd/stats0"
) },
} },
},
} },
});
state = 1;
resume_1:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
space_info = parent->etcd_result;
std::map<pool_id_t, uint64_t> osd_free;
for (auto & kv_item: space_info["responses"][0]["response_range"]["kvs"].array_items())
{
auto kv = parent->cli->st_cli.parse_etcd_kv(kv_item);
// pool ID
pool_id_t pool_id;
char null_byte = 0;
int scanned = sscanf(kv.key.substr(parent->cli->st_cli.etcd_prefix.length()).c_str(), "/pool/stats/%u%c", &pool_id, &null_byte);
if (scanned != 1 || !pool_id || pool_id >= POOL_ID_MAX)
{
fprintf(stderr, "Invalid key in etcd: %s\n", kv.key.c_str());
continue;
}
// pool/stats/<N>
pool_stats[pool_id] = kv.value.object_items();
}
for (auto & kv_item: space_info["responses"][1]["response_range"]["kvs"].array_items())
{
auto kv = parent->cli->st_cli.parse_etcd_kv(kv_item);
// osd ID
osd_num_t osd_num;
char null_byte = 0;
int scanned = sscanf(kv.key.substr(parent->cli->st_cli.etcd_prefix.length()).c_str(), "/osd/stats/%ju%c", &osd_num, &null_byte);
if (scanned != 1 || !osd_num || osd_num >= POOL_ID_MAX)
{
fprintf(stderr, "Invalid key in etcd: %s\n", kv.key.c_str());
continue;
}
// osd/stats/<N>::free
osd_free[osd_num] = kv.value["free"].uint64_value();
}
// Calculate max_avail for each pool
for (auto & pp: parent->cli->st_cli.pool_config)
{
auto & pool_cfg = pp.second;
uint64_t pool_avail = UINT64_MAX;
std::map<osd_num_t, uint64_t> pg_per_osd;
for (auto & pgp: pool_cfg.pg_config)
{
for (auto pg_osd: pgp.second.target_set)
{
if (pg_osd != 0)
{
pg_per_osd[pg_osd]++;
}
}
}
for (auto pg_per_pair: pg_per_osd)
{
uint64_t pg_free = osd_free[pg_per_pair.first] * pool_cfg.real_pg_count / pg_per_pair.second;
if (pool_avail > pg_free)
{
pool_avail = pg_free;
}
}
if (pool_avail == UINT64_MAX)
{
pool_avail = 0;
}
if (pool_cfg.scheme != POOL_SCHEME_REPLICATED)
{
pool_avail *= (pool_cfg.pg_size - pool_cfg.parity_chunks);
}
pool_stats[pool_cfg.id] = json11::Json::object {
{ "id", (uint64_t)pool_cfg.id },
{ "name", pool_cfg.name },
{ "pg_count", pool_cfg.pg_count },
{ "real_pg_count", pool_cfg.real_pg_count },
{ "scheme", pool_cfg.scheme == POOL_SCHEME_REPLICATED ? "replicated" : "ec" },
{ "scheme_name", pool_cfg.scheme == POOL_SCHEME_REPLICATED
? std::to_string(pool_cfg.pg_size)+"/"+std::to_string(pool_cfg.pg_minsize)
: "EC "+std::to_string(pool_cfg.pg_size-pool_cfg.parity_chunks)+"+"+std::to_string(pool_cfg.parity_chunks) },
{ "used_raw", (uint64_t)(pool_stats[pool_cfg.id]["used_raw_tb"].number_value() * ((uint64_t)1<<40)) },
{ "total_raw", (uint64_t)(pool_stats[pool_cfg.id]["total_raw_tb"].number_value() * ((uint64_t)1<<40)) },
{ "max_available", pool_avail },
{ "raw_to_usable", pool_stats[pool_cfg.id]["raw_to_usable"].number_value() },
{ "space_efficiency", pool_stats[pool_cfg.id]["space_efficiency"].number_value() },
{ "pg_real_size", pool_stats[pool_cfg.id]["pg_real_size"].uint64_value() },
{ "failure_domain", pool_cfg.failure_domain },
};
}
}
json11::Json::array to_list()
{
json11::Json::array list;
for (auto & kv: pool_stats)
{
list.push_back(kv.second);
}
return list;
}
void loop()
{
get_stats();
if (parent->waiting > 0)
return;
if (state == 100)
return;
if (parent->json_output)
{
// JSON output
result.data = to_list();
state = 100;
return;
}
// Table output: name, scheme_name, pg_count, total, used, max_avail, used%, efficiency
json11::Json::array cols;
cols.push_back(json11::Json::object{
{ "key", "name" },
{ "title", "NAME" },
});
cols.push_back(json11::Json::object{
{ "key", "scheme_name" },
{ "title", "SCHEME" },
});
cols.push_back(json11::Json::object{
{ "key", "pg_count_fmt" },
{ "title", "PGS" },
});
cols.push_back(json11::Json::object{
{ "key", "total_fmt" },
{ "title", "TOTAL" },
});
cols.push_back(json11::Json::object{
{ "key", "used_fmt" },
{ "title", "USED" },
});
cols.push_back(json11::Json::object{
{ "key", "max_avail_fmt" },
{ "title", "AVAILABLE" },
});
cols.push_back(json11::Json::object{
{ "key", "used_pct" },
{ "title", "USED%" },
});
cols.push_back(json11::Json::object{
{ "key", "eff_fmt" },
{ "title", "EFFICIENCY" },
});
json11::Json::array list;
for (auto & kv: pool_stats)
{
double raw_to = kv.second["raw_to_usable"].number_value();
if (raw_to < 0.000001 && raw_to > -0.000001)
raw_to = 1;
kv.second["pg_count_fmt"] = kv.second["real_pg_count"] == kv.second["pg_count"]
? kv.second["real_pg_count"].as_string()
: kv.second["real_pg_count"].as_string()+"->"+kv.second["pg_count"].as_string();
kv.second["total_fmt"] = format_size(kv.second["total_raw"].uint64_value() / raw_to);
kv.second["used_fmt"] = format_size(kv.second["used_raw"].uint64_value() / raw_to);
kv.second["max_avail_fmt"] = format_size(kv.second["max_available"].uint64_value());
kv.second["used_pct"] = format_q(kv.second["total_raw"].uint64_value()
? (100 - 100*kv.second["max_available"].uint64_value() *
kv.second["raw_to_usable"].number_value() / kv.second["total_raw"].uint64_value())
: 100)+"%";
kv.second["eff_fmt"] = format_q(kv.second["space_efficiency"].number_value()*100)+"%";
}
result.data = to_list();
result.text = print_table(result.data, cols, parent->color);
state = 100;
}
};
std::function<bool(cli_result_t &)> cli_tool_t::start_df(json11::Json cfg)
{
auto lister = new pool_lister_t();
lister->parent = this;
return [lister](cli_result_t & result)
{
lister->loop();
if (lister->is_done())
{
result = lister->result;
delete lister;
return true;
}
return false;
};
}
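
The arithmetic behind the TOTAL, USED and USED% columns can be sketched in isolation. This is only an illustration of the formulas in `loop()` above: the helper names `usable_bytes` and `used_percent` are hypothetical and do not exist in the source, and the inputs are plain numbers rather than `json11` values.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert a raw byte count to usable bytes.
// As in the table code, a raw_to_usable ratio close to zero is treated as 1.
uint64_t usable_bytes(uint64_t raw, double raw_to_usable)
{
    if (std::fabs(raw_to_usable) < 0.000001)
        raw_to_usable = 1;
    return (uint64_t)(raw / raw_to_usable);
}

// Hypothetical helper: used percentage derived from the space that is still
// available, scaled back to raw bytes. A pool without raw capacity counts as 100% used.
double used_percent(uint64_t total_raw, uint64_t max_available, double raw_to_usable)
{
    if (!total_raw)
        return 100;
    return 100 - 100.0 * max_available * raw_to_usable / total_raw;
}
```

For example, under these assumptions a 3x replicated pool with 300 raw bytes in total and 50 usable bytes still available is reported as 50% used.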


@ -342,7 +342,11 @@ struct snap_merger_t
printf("\rOverwriting blocks: %ju/%ju\n", to_process, to_process);
}
// Done
result = (cli_result_t){ .text = "Done, layers from "+from_name+" to "+to_name+" merged into "+target_name };
result = (cli_result_t){ .text = "Done, layers from "+from_name+" to "+to_name+" merged into "+target_name, .data = json11::Json::object {
{ "from", from_name },
{ "to", to_name },
{ "into", target_name },
}};
state = 100;
resume_100:
return;


@ -84,7 +84,10 @@ struct image_changer_t
(!new_size && !force_size || cfg.size == new_size || cfg.size >= new_size && inc_size) &&
(new_name == "" || new_name == image_name))
{
result = (cli_result_t){ .text = "No change" };
result = (cli_result_t){ .err = 0, .text = "No change", .data = json11::Json::object {
{ "error_code", 0 },
{ "error_text", "No change" },
}};
state = 100;
return;
}
@ -220,7 +223,16 @@ resume_2:
parent->cli->st_cli.inode_by_name.erase(image_name);
}
parent->cli->st_cli.insert_inode_config(cfg);
result = (cli_result_t){ .err = 0, .text = "Image "+image_name+" modified" };
result = (cli_result_t){
.err = 0,
.text = "Image "+image_name+" modified",
.data = json11::Json::object {
{ "name", image_name },
{ "inode", INODE_NO_POOL(inode_num) },
{ "pool", (uint64_t)INODE_POOL(inode_num) },
{ "size", new_size },
}
};
state = 100;
}
};

264
src/cli_pool_cfg.cpp Normal file

@ -0,0 +1,264 @@
// Copyright (c) Vitaliy Filippov, 2024
// License: VNPL-1.1 (see README.md for details)
#include "cli_pool_cfg.h"
#include "etcd_state_client.h"
#include "str_util.h"
std::string validate_pool_config(json11::Json::object & new_cfg, json11::Json old_cfg,
uint64_t global_block_size, uint64_t global_bitmap_granularity, bool force)
{
// short option names
if (new_cfg.find("count") != new_cfg.end())
{
new_cfg["pg_count"] = new_cfg["count"];
new_cfg.erase("count");
}
if (new_cfg.find("size") != new_cfg.end())
{
new_cfg["pg_size"] = new_cfg["size"];
new_cfg.erase("size");
}
// --ec shortcut
if (new_cfg.find("ec") != new_cfg.end())
{
if (new_cfg.find("scheme") != new_cfg.end() ||
new_cfg.find("pg_size") != new_cfg.end() ||
new_cfg.find("parity_chunks") != new_cfg.end())
{
return "--ec can't be used with --pg_size, --parity_chunks or --scheme";
}
// pg_size = N+K
// parity_chunks = K
uint64_t data_chunks = 0, parity_chunks = 0;
char null_byte = 0;
int ret = sscanf(new_cfg["ec"].string_value().c_str(), "%ju+%ju%c", &data_chunks, &parity_chunks, &null_byte);
if (ret != 2 || !data_chunks || !parity_chunks)
{
return "--ec should be <N>+<K> format (<N>, <K> - numbers)";
}
new_cfg.erase("ec");
new_cfg["scheme"] = "ec";
new_cfg["pg_size"] = data_chunks+parity_chunks;
new_cfg["parity_chunks"] = parity_chunks;
}
if (old_cfg.is_null() && new_cfg["scheme"].string_value() == "")
{
// Default scheme
new_cfg["scheme"] = "replicated";
}
if (new_cfg.find("pg_minsize") == new_cfg.end() && (old_cfg.is_null() || new_cfg.find("pg_size") != new_cfg.end()))
{
// Default pg_minsize
if (new_cfg["scheme"] == "replicated")
{
// pg_minsize = (N+K > 2) ? 2 : 1
new_cfg["pg_minsize"] = new_cfg["pg_size"].uint64_value() > 2 ? 2 : 1;
}
else // ec or xor
{
// pg_minsize = (K > 1) ? N + 1 : N
new_cfg["pg_minsize"] = new_cfg["pg_size"].uint64_value() - new_cfg["parity_chunks"].uint64_value() +
(new_cfg["parity_chunks"].uint64_value() > 1 ? 1 : 0);
}
}
if (new_cfg["scheme"] != "ec")
{
new_cfg.erase("parity_chunks");
}
// Check integer values and unknown keys
for (auto kv_it = new_cfg.begin(); kv_it != new_cfg.end(); )
{
auto & key = kv_it->first;
auto & value = kv_it->second;
if (key == "pg_size" || key == "parity_chunks" || key == "pg_minsize" ||
key == "pg_count" || key == "max_osd_combinations" || key == "block_size" ||
key == "bitmap_granularity" || key == "pg_stripe_size")
{
if ((value.is_number() && value.uint64_value() != value.number_value()) ||
(value.is_string() && !value.uint64_value() && value.string_value() != "0"))
{
return key+" must be a non-negative integer";
}
value = value.uint64_value();
}
else if (key == "name" || key == "scheme" || key == "immediate_commit" ||
key == "failure_domain" || key == "root_node" || key == "scrub_interval")
{
// OK
}
else if (key == "osd_tags" || key == "primary_affinity_tags")
{
if (value.is_string())
{
value = explode(",", value.string_value(), true);
}
}
else
{
// Unknown parameter
new_cfg.erase(kv_it++);
continue;
}
kv_it++;
}
// Merge with the old config
if (!old_cfg.is_null())
{
for (auto & kv: old_cfg.object_items())
{
if (new_cfg.find(kv.first) == new_cfg.end())
{
new_cfg[kv.first] = kv.second;
}
}
}
// Prevent autovivification of object keys. Now we don't modify the config, we just check it
json11::Json cfg = new_cfg;
// Validate changes
if (!old_cfg.is_null() && !force)
{
if (old_cfg["scheme"] != cfg["scheme"])
{
return "Changing scheme for an existing pool will lead to data loss. Use --force to proceed";
}
if (etcd_state_client_t::parse_scheme(old_cfg["scheme"].string_value()) == POOL_SCHEME_EC)
{
uint64_t old_data_chunks = old_cfg["pg_size"].uint64_value() - old_cfg["parity_chunks"].uint64_value();
uint64_t new_data_chunks = cfg["pg_size"].uint64_value() - cfg["parity_chunks"].uint64_value();
if (old_data_chunks != new_data_chunks)
{
return "Changing EC data chunk count for an existing pool will lead to data loss. Use --force to proceed";
}
}
if (old_cfg["block_size"] != cfg["block_size"] ||
old_cfg["bitmap_granularity"] != cfg["bitmap_granularity"] ||
old_cfg["immediate_commit"] != cfg["immediate_commit"])
{
return "Changing block_size, bitmap_granularity or immediate_commit"
" for an existing pool will lead to incomplete PGs. Use --force to proceed";
}
if (old_cfg["pg_stripe_size"] != cfg["pg_stripe_size"])
{
return "Changing pg_stripe_size for an existing pool will lead to data loss. Use --force to proceed";
}
}
// Validate values
if (cfg["name"].string_value() == "")
{
return "Non-empty pool name is required";
}
// scheme
auto scheme = etcd_state_client_t::parse_scheme(cfg["scheme"].string_value());
if (!scheme)
{
return "Scheme must be one of \"replicated\", \"ec\" or \"xor\"";
}
// pg_size
auto pg_size = cfg["pg_size"].uint64_value();
if (!pg_size)
{
return "Non-zero PG size is required";
}
if (scheme != POOL_SCHEME_REPLICATED && pg_size < 3)
{
return "PG size can't be smaller than 3 for EC/XOR pools";
}
if (pg_size > 256)
{
return "PG size can't be greater than 256";
}
// parity_chunks
uint64_t parity_chunks = 1;
if (scheme == POOL_SCHEME_EC)
{
parity_chunks = cfg["parity_chunks"].uint64_value();
if (!parity_chunks)
{
return "Non-zero parity_chunks is required";
}
if (parity_chunks > pg_size-2)
{
return "parity_chunks can't be greater than "+std::to_string(pg_size-2)+" (PG size - 2)";
}
}
// pg_minsize
auto pg_minsize = cfg["pg_minsize"].uint64_value();
if (!pg_minsize)
{
return "Non-zero pg_minsize is required";
}
else if (pg_minsize > pg_size)
{
return "pg_minsize can't be greater than "+std::to_string(pg_size)+" (PG size)";
}
else if (scheme != POOL_SCHEME_REPLICATED && pg_minsize < pg_size-parity_chunks)
{
return "pg_minsize can't be smaller than "+std::to_string(pg_size-parity_chunks)+
" (pg_size - parity_chunks) for XOR/EC pool";
}
// pg_count
if (!cfg["pg_count"].uint64_value())
{
return "Non-zero pg_count is required";
}
// max_osd_combinations
if (!cfg["max_osd_combinations"].is_null() && cfg["max_osd_combinations"].uint64_value() < 100)
{
return "max_osd_combinations must be at least 100, but it is "+cfg["max_osd_combinations"].as_string();
}
// block_size
auto block_size = cfg["block_size"].uint64_value();
if (!cfg["block_size"].is_null() && ((block_size & (block_size-1)) ||
block_size < MIN_DATA_BLOCK_SIZE || block_size > MAX_DATA_BLOCK_SIZE))
{
return "block_size must be a power of two between "+std::to_string(MIN_DATA_BLOCK_SIZE)+
" and "+std::to_string(MAX_DATA_BLOCK_SIZE)+", but it is "+std::to_string(block_size);
}
block_size = (block_size ? block_size : global_block_size);
// bitmap_granularity
auto bitmap_granularity = cfg["bitmap_granularity"].uint64_value();
if (!cfg["bitmap_granularity"].is_null() && (!bitmap_granularity || (bitmap_granularity % 512)))
{
return "bitmap_granularity must be a multiple of 512, but it is "+std::to_string(bitmap_granularity);
}
bitmap_granularity = (bitmap_granularity ? bitmap_granularity : global_bitmap_granularity);
if (block_size % bitmap_granularity)
{
return "bitmap_granularity must divide data block size ("+std::to_string(block_size)+"), but it is "+std::to_string(bitmap_granularity);
}
// immediate_commit
if (!cfg["immediate_commit"].is_null() && !etcd_state_client_t::parse_immediate_commit(cfg["immediate_commit"].string_value()))
{
return "immediate_commit must be one of \"all\", \"small\", or \"none\", but it is "+cfg["immediate_commit"].as_string();
}
// scrub_interval
if (!cfg["scrub_interval"].is_null())
{
bool ok;
parse_time(cfg["scrub_interval"].string_value(), &ok);
if (!ok)
{
return "scrub_interval must be a time interval (number + unit s/m/h/d/M/y), but it is "+cfg["scrub_interval"].as_string();
}
}
return "";
}
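
The `--ec <N>+<K>` shortcut and the default `pg_minsize` rules above can be sketched as two standalone functions. This is a minimal illustration, not the project's code: `parse_ec_spec` and `default_pg_minsize` are hypothetical names, and the sketch uses plain integers instead of `json11` values.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: parse "<N>+<K>" into data and parity chunk counts.
// The trailing %c only matches when extra characters follow the second number,
// so a return value of exactly 2 means the whole string was consumed.
bool parse_ec_spec(const std::string & spec, uint64_t & data_chunks, uint64_t & parity_chunks)
{
    char extra = 0;
    int ret = sscanf(spec.c_str(), "%" SCNu64 "+%" SCNu64 "%c", &data_chunks, &parity_chunks, &extra);
    return ret == 2 && data_chunks && parity_chunks;
}

// Hypothetical helper: default pg_minsize following the rules in the comments above.
// Replicated: 2 if pg_size > 2, else 1.
// EC/XOR: N (data chunks), plus 1 if there is more than one parity chunk.
uint64_t default_pg_minsize(bool replicated, uint64_t pg_size, uint64_t parity_chunks)
{
    if (replicated)
        return pg_size > 2 ? 2 : 1;
    return pg_size - parity_chunks + (parity_chunks > 1 ? 1 : 0);
}
```

With these definitions, `--ec 4+2` would yield `pg_size` 6, `parity_chunks` 2 and a default `pg_minsize` of 5.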

10
src/cli_pool_cfg.h Normal file

@ -0,0 +1,10 @@
// Copyright (c) Vitaliy Filippov, 2024
// License: VNPL-1.1 (see README.md for details)
#pragma once
#include "json11/json11.hpp"
#include <stdint.h>
std::string validate_pool_config(json11::Json::object & new_cfg, json11::Json old_cfg,
uint64_t global_block_size, uint64_t global_bitmap_granularity, bool force);
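
The `block_size` and `bitmap_granularity` rules enforced by `validate_pool_config` can be illustrated with a small sketch. The limits `kMinBlockSize` and `kMaxBlockSize` below are assumed placeholder values, since the real `MIN_DATA_BLOCK_SIZE` and `MAX_DATA_BLOCK_SIZE` are defined elsewhere in the project, and both function names are hypothetical.

```cpp
#include <cstdint>

// Assumed placeholder limits for illustration only
constexpr uint64_t kMinBlockSize = 4096;
constexpr uint64_t kMaxBlockSize = 128*1024*1024;

// Hypothetical helper: block size must be a power of two within the allowed range.
// (x & (x-1)) clears the lowest set bit, so it is zero only for powers of two.
bool is_valid_block_size(uint64_t block_size)
{
    return block_size && !(block_size & (block_size-1)) &&
        block_size >= kMinBlockSize && block_size <= kMaxBlockSize;
}

// Hypothetical helper: granularity must be a non-zero multiple of 512
// and must divide the data block size evenly.
bool is_valid_bitmap_granularity(uint64_t granularity, uint64_t block_size)
{
    return granularity && granularity % 512 == 0 && block_size % granularity == 0;
}
```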

622
src/cli_pool_create.cpp Normal file

@ -0,0 +1,622 @@
// Copyright (c) MIND Software LLC, 2023 (info@mindsw.io)
// I accept Vitastor CLA: see CLA-en.md for details
// Copyright (c) Vitaliy Filippov, 2024
// License: VNPL-1.1 (see README.md for details)
#include <ctype.h>
#include "cli.h"
#include "cli_pool_cfg.h"
#include "cluster_client.h"
#include "epoll_manager.h"
#include "pg_states.h"
#include "str_util.h"
struct pool_creator_t
{
cli_tool_t *parent;
json11::Json::object cfg;
bool force = false;
bool wait = false;
int state = 0;
cli_result_t result;
struct {
uint32_t retries = 5;
uint32_t interval = 0;
bool passed = false;
} create_check;
uint64_t new_id = 1;
uint64_t new_pools_mod_rev;
json11::Json state_node_tree;
json11::Json new_pools;
bool is_done() { return state == 100; }
void loop()
{
if (state == 1)
goto resume_1;
else if (state == 2)
goto resume_2;
else if (state == 3)
goto resume_3;
else if (state == 4)
goto resume_4;
else if (state == 5)
goto resume_5;
else if (state == 6)
goto resume_6;
else if (state == 7)
goto resume_7;
else if (state == 8)
goto resume_8;
// Validate pool parameters
result.text = validate_pool_config(cfg, json11::Json(), parent->cli->st_cli.global_block_size,
parent->cli->st_cli.global_bitmap_granularity, force);
if (result.text != "")
{
result.err = EINVAL;
state = 100;
return;
}
state = 1;
resume_1:
// If not forced, check that we have enough osds for pg_size
if (!force)
{
// Get node_placement configuration from etcd
parent->etcd_txn(json11::Json::object {
{ "success", json11::Json::array {
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/node_placement") },
} }
},
} },
});
state = 2;
resume_2:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
// Get state_node_tree based on node_placement and osd peer states
{
auto kv = parent->cli->st_cli.parse_etcd_kv(parent->etcd_result["responses"][0]["response_range"]["kvs"][0]);
state_node_tree = get_state_node_tree(kv.value.object_items());
}
// Skip tag checks, if pool has none
if (cfg["osd_tags"].array_items().size())
{
// Get osd configs (for tags) of osds in state_node_tree
{
json11::Json::array osd_configs;
for (auto osd_num: state_node_tree["osds"].array_items())
{
osd_configs.push_back(json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/osd/"+osd_num.as_string()) },
} }
});
}
parent->etcd_txn(json11::Json::object { { "success", osd_configs, }, });
}
state = 3;
resume_3:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
// Filter out osds from state_node_tree based on pool/osd tags
{
std::vector<json11::Json> osd_configs;
for (auto & ocr: parent->etcd_result["responses"].array_items())
{
auto kv = parent->cli->st_cli.parse_etcd_kv(ocr["response_range"]["kvs"][0]);
osd_configs.push_back(kv.value);
}
state_node_tree = filter_state_node_tree_by_tags(state_node_tree, osd_configs);
}
}
// Get stats (for block_size, bitmap_granularity, ...) of osds in state_node_tree
{
json11::Json::array osd_stats;
for (auto osd_num: state_node_tree["osds"].array_items())
{
osd_stats.push_back(json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/osd/stats/"+osd_num.as_string()) },
} }
});
}
parent->etcd_txn(json11::Json::object { { "success", osd_stats, }, });
}
state = 4;
resume_4:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
// Filter osds from state_node_tree based on pool parameters and osd stats
{
std::vector<json11::Json> osd_stats;
for (auto & ocr: parent->etcd_result["responses"].array_items())
{
auto kv = parent->cli->st_cli.parse_etcd_kv(ocr["response_range"]["kvs"][0]);
osd_stats.push_back(kv.value);
}
state_node_tree = filter_state_node_tree_by_stats(state_node_tree, osd_stats);
}
// Check that pg_size <= max_pg_size
{
auto failure_domain = cfg["failure_domain"].string_value() == ""
? "host" : cfg["failure_domain"].string_value();
uint64_t max_pg_size = get_max_pg_size(state_node_tree["nodes"].object_items(),
failure_domain, cfg["root_node"].string_value());
if (cfg["pg_size"].uint64_value() > max_pg_size)
{
result = (cli_result_t){
.err = EINVAL,
.text =
"There are "+std::to_string(max_pg_size)+" \""+failure_domain+"\" failure domains with OSDs matching tags and"
" block_size/bitmap_granularity/immediate_commit parameters, but you want to create a"
" pool with "+cfg["pg_size"].as_string()+" OSDs from different failure domains in a PG."
" Change parameters or add --force if you want to create a degraded pool and add OSDs later."
};
state = 100;
return;
}
}
}
// Create pool
state = 5;
resume_5:
// Get pools from etcd
parent->etcd_txn(json11::Json::object {
{ "success", json11::Json::array {
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
} }
},
} },
});
state = 6;
resume_6:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
{
// Add new pool
auto kv = parent->cli->st_cli.parse_etcd_kv(parent->etcd_result["responses"][0]["response_range"]["kvs"][0]);
new_pools = create_pool(kv);
if (new_pools.is_string())
{
result = (cli_result_t){ .err = EEXIST, .text = new_pools.string_value() };
state = 100;
return;
}
new_pools_mod_rev = kv.mod_revision;
}
// Update pools in etcd
parent->etcd_txn(json11::Json::object {
{ "compare", json11::Json::array {
json11::Json::object {
{ "target", "MOD" },
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
{ "result", "LESS" },
{ "mod_revision", new_pools_mod_rev+1 },
}
} },
{ "success", json11::Json::array {
json11::Json::object {
{ "request_put", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
{ "value", base64_encode(new_pools.dump()) },
} },
},
} },
});
state = 7;
resume_7:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
// Perform final create-check
create_check.interval = parent->cli->config["mon_change_timeout"].uint64_value();
if (!create_check.interval)
create_check.interval = 1000;
state = 8;
resume_8:
if (parent->waiting > 0)
return;
// If --wait is given, check that the pool was created and is active
if (!wait)
{
create_check.passed = true;
}
else if (create_check.retries)
{
create_check.retries--;
parent->waiting++;
parent->epmgr->tfd->set_timer(create_check.interval, false, [this](int timer_id)
{
if (parent->cli->st_cli.pool_config.find(new_id) != parent->cli->st_cli.pool_config.end())
{
auto & pool_cfg = parent->cli->st_cli.pool_config[new_id];
create_check.passed = pool_cfg.real_pg_count > 0;
for (auto pg_it = pool_cfg.pg_config.begin(); pg_it != pool_cfg.pg_config.end(); pg_it++)
{
if (!(pg_it->second.cur_state & PG_ACTIVE))
{
create_check.passed = false;
break;
}
}
if (create_check.passed)
create_check.retries = 0;
}
parent->waiting--;
parent->ringloop->wakeup();
});
return;
}
if (!create_check.passed)
{
result = (cli_result_t) {
.err = EAGAIN,
.text = "Pool "+cfg["name"].string_value()+" was created, but failed to become active."
" This may indicate that cluster state has changed while the pool was being created."
" Please check the current state and adjust the pool configuration if necessary.",
};
}
else
{
result = (cli_result_t){
.err = 0,
.text = "Pool "+cfg["name"].string_value()+" created",
.data = new_pools[std::to_string(new_id)],
};
}
state = 100;
}
// Returns a JSON object of form {"nodes": {...}, "osds": [...]} that
// contains: all nodes (osds, hosts, ...) based on node_placement config
// and current peer state, and a list of active peer osds.
json11::Json get_state_node_tree(json11::Json::object node_placement)
{
// Erase non-peer osd nodes from node_placement
for (auto np_it = node_placement.begin(); np_it != node_placement.end();)
{
// Numeric nodes are osds
osd_num_t osd_num = stoull_full(np_it->first);
// If node is osd and it is not in peer states, erase it
if (osd_num > 0 &&
parent->cli->st_cli.peer_states.find(osd_num) == parent->cli->st_cli.peer_states.end())
{
node_placement.erase(np_it++);
}
else
np_it++;
}
// List of peer osds
std::vector<std::string> peer_osds;
// Record peer osds and add missing osds/hosts to np
for (auto & ps: parent->cli->st_cli.peer_states)
{
std::string osd_num = std::to_string(ps.first);
// Record peer osd
peer_osds.push_back(osd_num);
// Add osd, if necessary
if (node_placement.find(osd_num) == node_placement.end())
{
std::string osd_host = ps.second["host"].as_string();
// Add host, if necessary
if (node_placement.find(osd_host) == node_placement.end())
{
node_placement[osd_host] = json11::Json::object {
{ "level", "host" }
};
}
node_placement[osd_num] = json11::Json::object {
{ "parent", osd_host }
};
}
}
return json11::Json::object { { "osds", peer_osds }, { "nodes", node_placement } };
}
// Returns new state_node_tree based on given state_node_tree with osds
// filtered out by tags in given osd_configs and current pool config.
// Requires: state_node_tree["osds"] must match osd_configs 1-1
json11::Json filter_state_node_tree_by_tags(const json11::Json & state_node_tree, std::vector<json11::Json> & osd_configs)
{
auto & osds = state_node_tree["osds"].array_items();
// Accepted state_node_tree nodes
auto accepted_nodes = state_node_tree["nodes"].object_items();
// List of accepted osds
std::vector<std::string> accepted_osds;
for (size_t i = 0; i < osd_configs.size(); i++)
{
auto & oc = osd_configs[i].object_items();
// Get osd number
auto osd_num = osds[i].as_string();
// We need tags in config to check against pool tags
if (oc.find("tags") == oc.end())
{
// Exclude osd from state_node_tree nodes
accepted_nodes.erase(osd_num);
continue;
}
else
{
// If all pool tags are in osd tags, accept osd
if (all_in_tags(osd_configs[i]["tags"], cfg["osd_tags"]))
{
accepted_osds.push_back(osd_num);
}
// Otherwise, exclude osd
else
{
// Exclude osd from state_node_tree nodes
accepted_nodes.erase(osd_num);
}
}
}
return json11::Json::object { { "osds", accepted_osds }, { "nodes", accepted_nodes } };
}
// Returns new state_node_tree based on given state_node_tree with osds
// filtered out by stats parameters (block_size, bitmap_granularity) in
// given osd_stats and current pool config.
// Requires: state_node_tree["osds"] must match osd_stats 1-1
json11::Json filter_state_node_tree_by_stats(const json11::Json & state_node_tree, std::vector<json11::Json> & osd_stats)
{
auto & osds = state_node_tree["osds"].array_items();
// Accepted state_node_tree nodes
auto accepted_nodes = state_node_tree["nodes"].object_items();
// List of accepted osds
std::vector<std::string> accepted_osds;
uint64_t p_block_size = cfg["block_size"].uint64_value()
? cfg["block_size"].uint64_value()
: parent->cli->st_cli.global_block_size;
uint64_t p_bitmap_granularity = cfg["bitmap_granularity"].uint64_value()
? cfg["bitmap_granularity"].uint64_value()
: parent->cli->st_cli.global_bitmap_granularity;
uint32_t p_immediate_commit = cfg["immediate_commit"].is_string()
? etcd_state_client_t::parse_immediate_commit(cfg["immediate_commit"].string_value())
: parent->cli->st_cli.global_immediate_commit;
for (size_t i = 0; i < osd_stats.size(); i++)
{
auto & os = osd_stats[i];
// Get osd number
auto osd_num = osds[i].as_string();
if (!os["data_block_size"].is_null() && os["data_block_size"] != p_block_size ||
!os["bitmap_granularity"].is_null() && os["bitmap_granularity"] != p_bitmap_granularity ||
!os["immediate_commit"].is_null() &&
etcd_state_client_t::parse_immediate_commit(os["immediate_commit"].string_value()) < p_immediate_commit)
{
accepted_nodes.erase(osd_num);
}
else
{
accepted_osds.push_back(osd_num);
}
}
return json11::Json::object { { "osds", accepted_osds }, { "nodes", accepted_nodes } };
}
// Returns maximum pg_size possible for given node_tree and failure_domain, starting at parent_node
uint64_t get_max_pg_size(json11::Json::object node_tree, const std::string & level, const std::string & parent_node)
{
uint64_t max_pg_sz = 0;
std::vector<std::string> nodes;
// Check if parent node is an osd (numeric)
if (parent_node != "" && stoull_full(parent_node))
{
// Add it to node list if osd is in node tree
if (node_tree.find(parent_node) != node_tree.end())
nodes.push_back(parent_node);
}
// If parent node given, ...
else if (parent_node != "")
{
// ... look for children nodes of this parent
for (auto & sn: node_tree)
{
auto & props = sn.second.object_items();
auto parent_prop = props.find("parent");
if (parent_prop != props.end() && (parent_prop->second.as_string() == parent_node))
{
nodes.push_back(sn.first);
// If we're not looking for all osds, we only need a single
// child osd node
if (level != "osd" && stoull_full(sn.first))
break;
}
}
}
// No parent node given, and we're not looking for all osds
else if (level != "osd")
{
// ... look for all level nodes
for (auto & sn: node_tree)
{
auto & props = sn.second.object_items();
auto level_prop = props.find("level");
if (level_prop != props.end() && (level_prop->second.as_string() == level))
{
nodes.push_back(sn.first);
}
}
}
// Otherwise, ...
else
{
// ... we're looking for osd nodes only
for (auto & sn: node_tree)
{
if (stoull_full(sn.first))
{
nodes.push_back(sn.first);
}
}
}
// Process gathered nodes
for (auto & node: nodes)
{
// Check for osd node, return constant max size
if (stoull_full(node))
{
max_pg_sz += 1;
}
// Otherwise, ...
else
{
// ... exclude parent node from tree, and ...
node_tree.erase(parent_node);
// ... descend onto the resulting tree
max_pg_sz += get_max_pg_size(node_tree, level, node);
}
}
return max_pg_sz;
}
json11::Json create_pool(const etcd_kv_t & kv)
{
for (auto & p: kv.value.object_items())
{
// ID
uint64_t pool_id = stoull_full(p.first);
new_id = std::max(pool_id+1, new_id);
// Name
if (p.second["name"].string_value() == cfg["name"].string_value())
{
return "Pool with name \""+cfg["name"].string_value()+"\" already exists (ID "+std::to_string(pool_id)+")";
}
}
auto res = kv.value.object_items();
res[std::to_string(new_id)] = cfg;
return res;
}
// Checks whether tags2 tags are all in tags1 tags
bool all_in_tags(json11::Json tags1, json11::Json tags2)
{
if (!tags2.is_array())
{
tags2 = json11::Json::array{ tags2.string_value() };
}
if (!tags1.is_array())
{
tags1 = json11::Json::array{ tags1.string_value() };
}
for (auto & tag2: tags2.array_items())
{
bool found = false;
for (auto & tag1: tags1.array_items())
{
if (tag1 == tag2)
{
found = true;
break;
}
}
if (!found)
{
return false;
}
}
return true;
}
};
std::function<bool(cli_result_t &)> cli_tool_t::start_pool_create(json11::Json cfg)
{
auto pool_creator = new pool_creator_t();
pool_creator->parent = this;
pool_creator->cfg = cfg.object_items();
pool_creator->force = cfg["force"].bool_value();
pool_creator->wait = cfg["wait"].bool_value();
return [pool_creator](cli_result_t & result)
{
pool_creator->loop();
if (pool_creator->is_done())
{
result = pool_creator->result;
delete pool_creator;
return true;
}
return false;
};
}
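
The idea behind the `pg_size <= max_pg_size` check is that a PG needs OSDs from that many distinct failure domains. A simplified way to count such domains is sketched below. It is not the algorithm of `get_max_pg_size` above, which descends from an optional root node; instead it walks upwards from every OSD. The `node_t` struct, the explicit `"osd"` level and the function name are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical simplified tree node: parent name and level ("osd", "host", ...)
struct node_t
{
    std::string parent;
    std::string level;
};

// Count distinct nodes of the given level that have at least one OSD beneath them.
// For level "osd" every OSD is its own failure domain.
size_t count_failure_domains(const std::map<std::string, node_t> & tree, const std::string & level)
{
    std::set<std::string> domains;
    for (auto & kv: tree)
    {
        if (kv.second.level != "osd")
            continue;
        // Walk up from the OSD until a node of the requested level is found.
        // The hop limit guards against cycles in a malformed tree.
        std::string cur = kv.first;
        size_t hops = 0;
        while (cur != "" && hops++ <= tree.size())
        {
            auto it = tree.find(cur);
            if (it == tree.end())
                break;
            if (it->second.level == level)
            {
                domains.insert(cur);
                break;
            }
            cur = it->second.parent;
        }
    }
    return domains.size();
}
```

In this sketch a host without any OSD is never counted, which mirrors the intent of filtering OSDs out of the tree before the check.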

721
src/cli_pool_ls.cpp Normal file

@ -0,0 +1,721 @@
// Copyright (c) Vitaliy Filippov, 2019+
// License: VNPL-1.1 (see README.md for details)
#include <algorithm>
#include "cli.h"
#include "cluster_client.h"
#include "str_util.h"
#include "pg_states.h"
// List pools with space statistics
// - df - minimal list with % used space
// - pool-ls - same but with PG state and recovery %
// - pool-ls -l - same but also include I/O statistics
// - pool-ls --detail - use list format, include PG states, I/O stats and all pool parameters
struct pool_lister_t
{
cli_tool_t *parent;
std::string sort_field;
std::set<std::string> only_names;
bool reverse = false;
int max_count = 0;
bool show_recovery = false;
bool show_stats = false;
bool detailed = false;
int state = 0;
cli_result_t result;
std::map<pool_id_t, json11::Json::object> pool_stats;
struct io_stats_t
{
uint64_t count = 0;
uint64_t read_iops = 0;
uint64_t read_bps = 0;
uint64_t read_lat = 0;
uint64_t write_iops = 0;
uint64_t write_bps = 0;
uint64_t write_lat = 0;
uint64_t delete_iops = 0;
uint64_t delete_bps = 0;
uint64_t delete_lat = 0;
};
struct object_counts_t
{
uint64_t object_count = 0;
uint64_t misplaced_count = 0;
uint64_t degraded_count = 0;
uint64_t incomplete_count = 0;
};
bool is_done()
{
return state == 100;
}
void get_pool_stats(int base_state)
{
if (state == base_state+1)
goto resume_1;
// Space statistics - pool/stats/<pool>
parent->etcd_txn(json11::Json::object {
{ "success", json11::Json::array {
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(
parent->cli->st_cli.etcd_prefix+"/pool/stats/"
) },
{ "range_end", base64_encode(
parent->cli->st_cli.etcd_prefix+"/pool/stats0"
) },
} },
},
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(
parent->cli->st_cli.etcd_prefix+"/osd/stats/"
) },
{ "range_end", base64_encode(
parent->cli->st_cli.etcd_prefix+"/osd/stats0"
) },
} },
},
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(
parent->cli->st_cli.etcd_prefix+"/config/pools"
) },
} },
},
} },
});
state = base_state+1;
resume_1:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
auto space_info = parent->etcd_result;
auto config_pools = space_info["responses"][2]["response_range"]["kvs"][0];
if (!config_pools.is_null())
{
config_pools = parent->cli->st_cli.parse_etcd_kv(config_pools).value;
}
for (auto & kv_item: space_info["responses"][0]["response_range"]["kvs"].array_items())
{
auto kv = parent->cli->st_cli.parse_etcd_kv(kv_item);
// pool ID
pool_id_t pool_id;
char null_byte = 0;
int scanned = sscanf(kv.key.substr(parent->cli->st_cli.etcd_prefix.length()).c_str(), "/pool/stats/%u%c", &pool_id, &null_byte);
if (scanned != 1 || !pool_id || pool_id >= POOL_ID_MAX)
{
fprintf(stderr, "Invalid key in etcd: %s\n", kv.key.c_str());
continue;
}
// pool/stats/<N>
pool_stats[pool_id] = kv.value.object_items();
}
std::map<osd_num_t, uint64_t> osd_free;
for (auto & kv_item: space_info["responses"][1]["response_range"]["kvs"].array_items())
{
auto kv = parent->cli->st_cli.parse_etcd_kv(kv_item);
// osd ID
osd_num_t osd_num;
char null_byte = 0;
int scanned = sscanf(kv.key.substr(parent->cli->st_cli.etcd_prefix.length()).c_str(), "/osd/stats/%ju%c", &osd_num, &null_byte);
if (scanned != 1 || !osd_num || osd_num >= POOL_ID_MAX)
{
fprintf(stderr, "Invalid key in etcd: %s\n", kv.key.c_str());
continue;
}
// osd/stats/<N>::free
osd_free[osd_num] = kv.value["free"].uint64_value();
}
// Calculate max_avail for each pool
for (auto & pp: parent->cli->st_cli.pool_config)
{
auto & pool_cfg = pp.second;
uint64_t pool_avail = UINT64_MAX;
std::map<osd_num_t, uint64_t> pg_per_osd;
bool active = pool_cfg.real_pg_count > 0;
uint64_t pg_states = 0;
for (auto & pgp: pool_cfg.pg_config)
{
if (!(pgp.second.cur_state & PG_ACTIVE))
{
active = false;
}
pg_states |= pgp.second.cur_state;
for (auto pg_osd: pgp.second.target_set)
{
if (pg_osd != 0)
{
pg_per_osd[pg_osd]++;
}
}
}
for (auto pg_per_pair: pg_per_osd)
{
uint64_t pg_free = osd_free[pg_per_pair.first] * pool_cfg.real_pg_count / pg_per_pair.second;
if (pool_avail > pg_free)
{
pool_avail = pg_free;
}
}
if (pool_avail == UINT64_MAX)
{
pool_avail = 0;
}
if (pool_cfg.scheme != POOL_SCHEME_REPLICATED)
{
pool_avail *= (pool_cfg.pg_size - pool_cfg.parity_chunks);
}
// incomplete > has_incomplete > degraded > has_degraded > has_misplaced
std::string status;
if (!active)
status = "inactive";
else if (pg_states & PG_INCOMPLETE)
status = "incomplete";
else if (pg_states & PG_HAS_INCOMPLETE)
status = "has_incomplete";
else if (pg_states & PG_DEGRADED)
status = "degraded";
else if (pg_states & PG_HAS_DEGRADED)
status = "has_degraded";
else if (pg_states & PG_HAS_MISPLACED)
status = "has_misplaced";
else
status = "active";
pool_stats[pool_cfg.id] = json11::Json::object {
{ "id", (uint64_t)pool_cfg.id },
{ "name", pool_cfg.name },
{ "status", status },
{ "pg_count", pool_cfg.pg_count },
{ "real_pg_count", pool_cfg.real_pg_count },
{ "scheme_name", pool_cfg.scheme == POOL_SCHEME_REPLICATED
? std::to_string(pool_cfg.pg_size)+"/"+std::to_string(pool_cfg.pg_minsize)
: "EC "+std::to_string(pool_cfg.pg_size-pool_cfg.parity_chunks)+"+"+std::to_string(pool_cfg.parity_chunks) },
{ "used_raw", (uint64_t)(pool_stats[pool_cfg.id]["used_raw_tb"].number_value() * ((uint64_t)1<<40)) },
{ "total_raw", (uint64_t)(pool_stats[pool_cfg.id]["total_raw_tb"].number_value() * ((uint64_t)1<<40)) },
{ "max_available", pool_avail },
{ "raw_to_usable", pool_stats[pool_cfg.id]["raw_to_usable"].number_value() },
{ "space_efficiency", pool_stats[pool_cfg.id]["space_efficiency"].number_value() },
{ "pg_real_size", pool_stats[pool_cfg.id]["pg_real_size"].uint64_value() },
{ "osd_count", pg_per_osd.size() },
};
}
// Include full pool config
for (auto & pp: config_pools.object_items())
{
if (!pp.second.is_object())
{
continue;
}
auto pool_id = stoull_full(pp.first);
auto & st = pool_stats[pool_id];
for (auto & kv: pp.second.object_items())
{
if (st.find(kv.first) == st.end())
st[kv.first] = kv.second;
}
}
}
void get_pg_stats(int base_state)
{
if (state == base_state+1)
goto resume_1;
// PG statistics - pg/stats/<pool>/<pg>
parent->etcd_txn(json11::Json::object {
{ "success", json11::Json::array {
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(
parent->cli->st_cli.etcd_prefix+"/pg/stats/"
) },
{ "range_end", base64_encode(
parent->cli->st_cli.etcd_prefix+"/pg/stats0"
) },
} },
},
} },
});
state = base_state+1;
resume_1:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
auto pg_stats = parent->etcd_result["responses"][0]["response_range"]["kvs"];
// Calculate recovery percent
std::map<pool_id_t, object_counts_t> counts;
for (auto & kv_item: pg_stats.array_items())
{
auto kv = parent->cli->st_cli.parse_etcd_kv(kv_item);
// pool ID & pg number
pool_id_t pool_id;
pg_num_t pg_num = 0;
char null_byte = 0;
int scanned = sscanf(kv.key.substr(parent->cli->st_cli.etcd_prefix.length()).c_str(),
"/pg/stats/%u/%u%c", &pool_id, &pg_num, &null_byte);
if (scanned != 2 || !pool_id || pool_id >= POOL_ID_MAX)
{
fprintf(stderr, "Invalid key in etcd: %s\n", kv.key.c_str());
continue;
}
auto & cnt = counts[pool_id];
cnt.object_count += kv.value["object_count"].uint64_value();
cnt.misplaced_count += kv.value["misplaced_count"].uint64_value();
cnt.degraded_count += kv.value["degraded_count"].uint64_value();
cnt.incomplete_count += kv.value["incomplete_count"].uint64_value();
}
for (auto & pp: pool_stats)
{
auto & cnt = counts[pp.first];
auto & st = pp.second;
st["object_count"] = cnt.object_count;
st["misplaced_count"] = cnt.misplaced_count;
st["degraded_count"] = cnt.degraded_count;
st["incomplete_count"] = cnt.incomplete_count;
}
}
void get_inode_stats(int base_state)
{
if (state == base_state+1)
goto resume_1;
// Inode statistics - inode/stats/<pool>/<inode>
parent->etcd_txn(json11::Json::object {
{ "success", json11::Json::array {
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(
parent->cli->st_cli.etcd_prefix+"/inode/stats/"
) },
{ "range_end", base64_encode(
parent->cli->st_cli.etcd_prefix+"/inode/stats0"
) },
} },
},
} },
});
state = base_state+1;
resume_1:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
auto inode_stats = parent->etcd_result["responses"][0]["response_range"]["kvs"];
// Performance statistics
std::map<pool_id_t, io_stats_t> pool_io;
for (auto & kv_item: inode_stats.array_items())
{
auto kv = parent->cli->st_cli.parse_etcd_kv(kv_item);
// pool ID & inode number
pool_id_t pool_id;
inode_t only_inode_num;
char null_byte = 0;
int scanned = sscanf(kv.key.substr(parent->cli->st_cli.etcd_prefix.length()).c_str(),
"/inode/stats/%u/%ju%c", &pool_id, &only_inode_num, &null_byte);
if (scanned != 2 || !pool_id || pool_id >= POOL_ID_MAX || INODE_POOL(only_inode_num) != 0)
{
fprintf(stderr, "Invalid key in etcd: %s\n", kv.key.c_str());
continue;
}
auto & io = pool_io[pool_id];
io.read_iops += kv.value["read"]["iops"].uint64_value();
io.read_bps += kv.value["read"]["bps"].uint64_value();
io.read_lat += kv.value["read"]["lat"].uint64_value();
io.write_iops += kv.value["write"]["iops"].uint64_value();
io.write_bps += kv.value["write"]["bps"].uint64_value();
io.write_lat += kv.value["write"]["lat"].uint64_value();
io.delete_iops += kv.value["delete"]["iops"].uint64_value();
io.delete_bps += kv.value["delete"]["bps"].uint64_value();
io.delete_lat += kv.value["delete"]["lat"].uint64_value();
io.count++;
}
for (auto & pp: pool_stats)
{
auto & io = pool_io[pp.first];
if (io.count > 0)
{
io.read_lat /= io.count;
io.write_lat /= io.count;
io.delete_lat /= io.count;
}
auto & st = pp.second;
st["read_iops"] = io.read_iops;
st["read_bps"] = io.read_bps;
st["read_lat"] = io.read_lat;
st["write_iops"] = io.write_iops;
st["write_bps"] = io.write_bps;
st["write_lat"] = io.write_lat;
st["delete_iops"] = io.delete_iops;
st["delete_bps"] = io.delete_bps;
st["delete_lat"] = io.delete_lat;
}
}
json11::Json::array to_list()
{
json11::Json::array list;
for (auto & kv: pool_stats)
{
if (!only_names.size())
{
list.push_back(kv.second);
}
else
{
for (auto glob: only_names)
{
if (stupid_glob(kv.second["name"].string_value(), glob))
{
list.push_back(kv.second);
break;
}
}
}
}
if (sort_field == "name" || sort_field == "scheme" ||
sort_field == "scheme_name" || sort_field == "status")
{
std::sort(list.begin(), list.end(), [this](json11::Json a, json11::Json b)
{
auto av = a[sort_field].as_string();
auto bv = b[sort_field].as_string();
return reverse ? av > bv : av < bv;
});
}
else
{
std::sort(list.begin(), list.end(), [this](json11::Json a, json11::Json b)
{
auto av = a[sort_field].number_value();
auto bv = b[sort_field].number_value();
return reverse ? av > bv : av < bv;
});
}
if (max_count > 0 && list.size() > max_count)
{
list.resize(max_count);
}
return list;
}
void loop()
{
if (state == 1)
goto resume_1;
if (state == 2)
goto resume_2;
if (state == 3)
goto resume_3;
if (state == 100)
return;
show_stats = show_stats || detailed;
show_recovery = show_recovery || detailed;
resume_1:
get_pool_stats(0);
if (parent->waiting > 0)
return;
if (show_stats)
{
resume_2:
get_inode_stats(1);
if (parent->waiting > 0)
return;
}
if (show_recovery)
{
resume_3:
get_pg_stats(2);
if (parent->waiting > 0)
return;
}
if (parent->json_output)
{
// JSON output
result.data = to_list();
state = 100;
return;
}
json11::Json::array list;
for (auto & kv: pool_stats)
{
auto & st = kv.second;
double raw_to = st["raw_to_usable"].number_value();
if (raw_to < 0.000001 && raw_to > -0.000001)
raw_to = 1;
st["pg_count_fmt"] = st["real_pg_count"] == st["pg_count"]
? st["real_pg_count"].as_string()
: st["real_pg_count"].as_string()+"->"+st["pg_count"].as_string();
st["total_fmt"] = format_size(st["total_raw"].uint64_value() / raw_to);
st["used_fmt"] = format_size(st["used_raw"].uint64_value() / raw_to);
st["max_avail_fmt"] = format_size(st["max_available"].uint64_value());
st["used_pct"] = format_q(st["total_raw"].uint64_value()
? (100 - 100*st["max_available"].uint64_value() *
st["raw_to_usable"].number_value() / st["total_raw"].uint64_value())
: 100)+"%";
st["eff_fmt"] = format_q(st["space_efficiency"].number_value()*100)+"%";
if (show_stats)
{
st["read_bw"] = format_size(st["read_bps"].uint64_value())+"/s";
st["write_bw"] = format_size(st["write_bps"].uint64_value())+"/s";
st["delete_bw"] = format_size(st["delete_bps"].uint64_value())+"/s";
st["read_iops"] = format_q(st["read_iops"].number_value());
st["write_iops"] = format_q(st["write_iops"].number_value());
st["delete_iops"] = format_q(st["delete_iops"].number_value());
st["read_lat_f"] = format_lat(st["read_lat"].uint64_value());
st["write_lat_f"] = format_lat(st["write_lat"].uint64_value());
st["delete_lat_f"] = format_lat(st["delete_lat"].uint64_value());
}
if (show_recovery)
{
auto object_count = st["object_count"].uint64_value();
auto recovery_pct = 100.0 * (object_count - (st["misplaced_count"].uint64_value() +
st["degraded_count"].uint64_value() + st["incomplete_count"].uint64_value())) /
(object_count ? object_count : 1);
st["recovery_fmt"] = format_q(recovery_pct)+"%";
}
}
if (detailed)
{
for (auto & kv: pool_stats)
{
auto & st = kv.second;
auto total = st["object_count"].uint64_value();
auto obj_size = st["block_size"].uint64_value();
if (!obj_size)
obj_size = parent->cli->st_cli.global_block_size;
if (st["scheme"] == "ec")
obj_size *= st["pg_size"].uint64_value() - st["parity_chunks"].uint64_value();
else if (st["scheme"] == "xor")
obj_size *= st["pg_size"].uint64_value() - 1;
auto n = st["misplaced_count"].uint64_value();
if (n > 0)
st["misplaced_fmt"] = format_size(n * obj_size) + " / " + format_q(100.0 * n / total);
n = st["degraded_count"].uint64_value();
if (n > 0)
st["degraded_fmt"] = format_size(n * obj_size) + " / " + format_q(100.0 * n / total);
n = st["incomplete_count"].uint64_value();
if (n > 0)
st["incomplete_fmt"] = format_size(n * obj_size) + " / " + format_q(100.0 * n / total);
st["read_fmt"] = st["read_bw"].string_value()+", "+st["read_iops"].string_value()+" op/s, "+
st["read_lat_f"].string_value()+" lat";
st["write_fmt"] = st["write_bw"].string_value()+", "+st["write_iops"].string_value()+" op/s, "+
st["write_lat_f"].string_value()+" lat";
st["delete_fmt"] = st["delete_bw"].string_value()+", "+st["delete_iops"].string_value()+" op/s, "+
st["delete_lat_f"].string_value()+" lat";
if (st["scheme"] == "replicated")
st["scheme_name"] = "x"+st["pg_size"].as_string();
if (st["failure_domain"].string_value() == "")
st["failure_domain"] = "host";
st["osd_tags_fmt"] = implode(", ", st["osd_tags"]);
st["primary_affinity_tags_fmt"] = implode(", ", st["primary_affinity_tags"]);
if (st["block_size"].uint64_value())
st["block_size_fmt"] = format_size(st["block_size"].uint64_value());
if (st["bitmap_granularity"].uint64_value())
st["bitmap_granularity_fmt"] = format_size(st["bitmap_granularity"].uint64_value());
}
// All pool parameters are only displayed in the "detailed" mode
// because there are too many of them to fit in a table
auto cols = std::vector<std::pair<std::string, std::string>>{
{ "name", "Name" },
{ "id", "ID" },
{ "scheme_name", "Scheme" },
{ "status", "Status" },
{ "pg_count_fmt", "PGs" },
{ "pg_minsize", "PG minsize" },
{ "failure_domain", "Failure domain" },
{ "root_node", "Root node" },
{ "osd_tags_fmt", "OSD tags" },
{ "primary_affinity_tags_fmt", "Primary affinity" },
{ "block_size_fmt", "Block size" },
{ "bitmap_granularity_fmt", "Bitmap granularity" },
{ "immediate_commit", "Immediate commit" },
{ "scrub_interval", "Scrub interval" },
{ "pg_stripe_size", "PG stripe size" },
{ "max_osd_combinations", "Max OSD combinations" },
{ "total_fmt", "Total" },
{ "used_fmt", "Used" },
{ "max_avail_fmt", "Available" },
{ "used_pct", "Used%" },
{ "eff_fmt", "Efficiency" },
{ "osd_count", "OSD count" },
{ "misplaced_fmt", "Misplaced" },
{ "degraded_fmt", "Degraded" },
{ "incomplete_fmt", "Incomplete" },
{ "read_fmt", "Read" },
{ "write_fmt", "Write" },
{ "delete_fmt", "Delete" },
};
auto list = to_list();
size_t title_len = 0;
for (auto & item: list)
{
title_len = print_detail_title_len(item, cols, title_len);
}
for (auto & item: list)
{
if (result.text != "")
result.text += "\n";
result.text += print_detail(item, cols, title_len, parent->color);
}
state = 100;
return;
}
// Table output: name, scheme_name, status, pg_count, total, used, max_avail, used%, efficiency
json11::Json::array cols;
cols.push_back(json11::Json::object{
{ "key", "name" },
{ "title", "NAME" },
});
cols.push_back(json11::Json::object{
{ "key", "scheme_name" },
{ "title", "SCHEME" },
});
cols.push_back(json11::Json::object{
{ "key", "status" },
{ "title", "STATUS" },
});
cols.push_back(json11::Json::object{
{ "key", "pg_count_fmt" },
{ "title", "PGS" },
});
cols.push_back(json11::Json::object{
{ "key", "total_fmt" },
{ "title", "TOTAL" },
});
cols.push_back(json11::Json::object{
{ "key", "used_fmt" },
{ "title", "USED" },
});
cols.push_back(json11::Json::object{
{ "key", "max_avail_fmt" },
{ "title", "AVAILABLE" },
});
cols.push_back(json11::Json::object{
{ "key", "used_pct" },
{ "title", "USED%" },
});
cols.push_back(json11::Json::object{
{ "key", "eff_fmt" },
{ "title", "EFFICIENCY" },
});
if (show_recovery)
{
cols.push_back(json11::Json::object{ { "key", "recovery_fmt" }, { "title", "RECOVERY" } });
}
if (show_stats)
{
cols.push_back(json11::Json::object{ { "key", "read_bw" }, { "title", "READ" } });
cols.push_back(json11::Json::object{ { "key", "read_iops" }, { "title", "IOPS" } });
cols.push_back(json11::Json::object{ { "key", "read_lat_f" }, { "title", "LAT" } });
cols.push_back(json11::Json::object{ { "key", "write_bw" }, { "title", "WRITE" } });
cols.push_back(json11::Json::object{ { "key", "write_iops" }, { "title", "IOPS" } });
cols.push_back(json11::Json::object{ { "key", "write_lat_f" }, { "title", "LAT" } });
cols.push_back(json11::Json::object{ { "key", "delete_bw" }, { "title", "DELETE" } });
cols.push_back(json11::Json::object{ { "key", "delete_iops" }, { "title", "IOPS" } });
cols.push_back(json11::Json::object{ { "key", "delete_lat_f" }, { "title", "LAT" } });
}
result.data = to_list();
result.text = print_table(result.data, cols, parent->color);
state = 100;
}
};
size_t print_detail_title_len(json11::Json item, std::vector<std::pair<std::string, std::string>> names, size_t prev_len)
{
size_t title_len = prev_len;
for (auto & kv: names)
{
if (!item[kv.first].is_null() && (!item[kv.first].is_string() || item[kv.first].string_value() != ""))
{
size_t len = utf8_length(kv.second);
title_len = title_len < len ? len : title_len;
}
}
return title_len;
}
std::string print_detail(json11::Json item, std::vector<std::pair<std::string, std::string>> names, size_t title_len, bool use_esc)
{
std::string str;
for (auto & kv: names)
{
if (!item[kv.first].is_null() && (!item[kv.first].is_string() || item[kv.first].string_value() != ""))
{
str += kv.second;
str += ": ";
size_t len = utf8_length(kv.second);
for (size_t j = len; j < title_len; j++)
str += ' ';
if (use_esc)
str += "\033[1m";
str += item[kv.first].as_string();
if (use_esc)
str += "\033[0m";
str += "\n";
}
}
return str;
}
std::function<bool(cli_result_t &)> cli_tool_t::start_pool_ls(json11::Json cfg)
{
auto lister = new pool_lister_t();
lister->parent = this;
lister->show_recovery = cfg["show_recovery"].bool_value();
lister->show_stats = cfg["long"].bool_value();
lister->detailed = cfg["detail"].bool_value();
lister->sort_field = cfg["sort"].string_value();
if ((lister->sort_field == "osd_tags") ||
(lister->sort_field == "primary_affinity_tags" ))
lister->sort_field = lister->sort_field + "_fmt";
lister->reverse = cfg["reverse"].bool_value();
lister->max_count = cfg["count"].uint64_value();
for (auto & item: cfg["names"].array_items())
{
lister->only_names.insert(item.string_value());
}
return [lister](cli_result_t & result)
{
lister->loop();
if (lister->is_done())
{
result = lister->result;
delete lister;
return true;
}
return false;
};
}
std::string implode(const std::string & sep, json11::Json array)
{
if (array.is_number() || array.is_bool() || array.is_string())
{
return array.as_string();
}
std::string res;
bool first = true;
for (auto & item: array.array_items())
{
res += (first ? item.as_string() : sep+item.as_string());
first = false;
}
return res;
}
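
A note on the `max_available` estimate computed in `get_pool_stats()` above: the pool is limited by the OSD that has the least free space per PG it hosts, scaled up to the whole pool. The following is a minimal standalone sketch of that arithmetic, assuming plain `std::map` inputs instead of the etcd-backed state. `estimate_pool_avail` and its parameter names are hypothetical and are not part of Vitastor.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical standalone version of the estimate. Each PG is assumed to get an
// equal share of the pool's data, so the OSD with the least free space per
// hosted PG limits the whole pool.
uint64_t estimate_pool_avail(
    const std::map<uint64_t, uint64_t> & osd_free,   // OSD number -> free bytes
    const std::map<uint64_t, uint64_t> & pg_per_osd, // OSD number -> PGs of this pool on it
    uint64_t real_pg_count,
    uint64_t data_chunks) // 1 for replicated pools, pg_size - parity_chunks otherwise
{
    assert(data_chunks > 0);
    uint64_t pool_avail = UINT64_MAX;
    for (const auto & kv: pg_per_osd)
    {
        if (kv.second == 0)
            continue; // assumption: OSDs without PGs do not limit the pool
        // An OSD with unknown free space counts as full (0 bytes free)
        uint64_t free_bytes = 0;
        auto free_it = osd_free.find(kv.first);
        if (free_it != osd_free.end())
            free_bytes = free_it->second;
        uint64_t pg_free = free_bytes * real_pg_count / kv.second;
        if (pool_avail > pg_free)
            pool_avail = pg_free;
    }
    if (pool_avail == UINT64_MAX)
        pool_avail = 0; // no OSDs at all
    return pool_avail * data_chunks;
}
```

For example, with two OSDs that each host 2 of the pool's 4 PGs and have 100 and 30 bytes free, the estimate is min(100*4/2, 30*4/2) = 60 bytes, or 120 bytes for a scheme with two data chunks.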

src/cli_pool_modify.cpp Normal file

@@ -0,0 +1,185 @@
// Copyright (c) MIND Software LLC, 2023 (info@mindsw.io)
// I accept Vitastor CLA: see CLA-en.md for details
// Copyright (c) Vitaliy Filippov, 2024
// License: VNPL-1.1 (see README.md for details)
#include <ctype.h>
#include "cli.h"
#include "cli_pool_cfg.h"
#include "cluster_client.h"
#include "str_util.h"
struct pool_changer_t
{
cli_tool_t *parent;
// Required parameters (id/name)
pool_id_t pool_id = 0;
std::string pool_name;
json11::Json::object cfg;
json11::Json::object new_cfg;
bool force = false;
json11::Json old_cfg;
int state = 0;
cli_result_t result;
// Updated pools
json11::Json new_pools;
// Expected pools mod revision
uint64_t pools_mod_rev;
bool is_done() { return state == 100; }
void loop()
{
if (state == 1)
goto resume_1;
else if (state == 2)
goto resume_2;
pool_id = stoull_full(cfg["old_name"].string_value());
if (!pool_id)
{
pool_name = cfg["old_name"].string_value();
if (pool_name == "")
{
result = (cli_result_t){ .err = ENOENT, .text = "Pool ID or name is required to modify it" };
state = 100;
return;
}
}
resume_0:
// Get pools from etcd
parent->etcd_txn(json11::Json::object {
{ "success", json11::Json::array {
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
} }
},
} },
});
state = 1;
resume_1:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
{
// Parse received pools from etcd
auto kv = parent->cli->st_cli.parse_etcd_kv(parent->etcd_result["responses"][0]["response_range"]["kvs"][0]);
// Get pool by name or ID
old_cfg = json11::Json();
if (pool_name != "")
{
for (auto & pce: kv.value.object_items())
{
if (pce.second["name"] == pool_name)
{
pool_id = stoull_full(pce.first);
old_cfg = pce.second;
break;
}
}
}
else
{
pool_name = std::to_string(pool_id);
old_cfg = kv.value[pool_name];
}
if (!old_cfg.is_object())
{
result = (cli_result_t){ .err = ENOENT, .text = "Pool "+pool_name+" does not exist" };
state = 100;
return;
}
// Validate new pool configuration
new_cfg = cfg;
result.text = validate_pool_config(new_cfg, old_cfg, parent->cli->st_cli.global_block_size,
parent->cli->st_cli.global_bitmap_granularity, force);
if (result.text != "")
{
result.err = EINVAL;
state = 100;
return;
}
// Update pool
auto pls = kv.value.object_items();
pls[std::to_string(pool_id)] = new_cfg;
new_pools = pls;
// Expected pools mod revision
pools_mod_rev = kv.mod_revision;
}
// Update pools in etcd
parent->etcd_txn(json11::Json::object {
{ "compare", json11::Json::array {
json11::Json::object {
{ "target", "MOD" },
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
{ "result", "LESS" },
{ "mod_revision", pools_mod_rev+1 },
}
} },
{ "success", json11::Json::array {
json11::Json::object {
{ "request_put", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
{ "value", base64_encode(new_pools.dump()) },
} },
},
} },
});
state = 2;
resume_2:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
if (!parent->etcd_result["succeeded"].bool_value())
{
// CAS failure - retry
fprintf(stderr, "Warning: pool configuration was modified in the meantime by someone else\n");
goto resume_0;
}
// Successfully updated pool
result = (cli_result_t){
.err = 0,
.text = "Pool "+pool_name+" updated",
.data = new_pools,
};
state = 100;
}
};
std::function<bool(cli_result_t &)> cli_tool_t::start_pool_modify(json11::Json cfg)
{
auto pool_changer = new pool_changer_t();
pool_changer->parent = this;
pool_changer->cfg = cfg.object_items();
pool_changer->force = cfg["force"].bool_value();
return [pool_changer](cli_result_t & result)
{
pool_changer->loop();
if (pool_changer->is_done())
{
result = pool_changer->result;
delete pool_changer;
return true;
}
return false;
};
}
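
Both `pool_changer_t::loop()` above and the pool removal code below use the same read-modify-write pattern: read `/config/pools`, remember its `mod_revision`, and write back only if the key was not modified in the meantime, retrying otherwise. The sketch below illustrates that pattern against an in-memory stand-in, under stated assumptions: `fake_kv_t`, `put_if_unmodified` and `update_with_retry` are hypothetical names, and the revision counter is a simplification of etcd's cluster-wide revision.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical in-memory stand-in for a single etcd key with a modification revision
struct fake_kv_t
{
    std::string value;
    uint64_t mod_revision = 0;

    // Write new_value only if the key was not modified after expected_rev,
    // mirroring the "MOD LESS expected_rev+1" comparison of the transaction
    bool put_if_unmodified(uint64_t expected_rev, const std::string & new_value)
    {
        if (!(mod_revision < expected_rev+1))
            return false;
        value = new_value;
        mod_revision++;
        return true;
    }
};

// Read-modify-write loop: re-read and retry whenever the compare step fails.
// Returns the number of attempts it took.
int update_with_retry(fake_kv_t & kv,
    const std::function<std::string(const std::string &)> & modify,
    const std::function<void(int)> & before_write = nullptr)
{
    assert(modify != nullptr);
    int attempts = 0;
    while (true)
    {
        attempts++;
        // Read the current value and remember its revision
        std::string old_value = kv.value;
        uint64_t rev = kv.mod_revision;
        std::string new_value = modify(old_value);
        if (before_write)
            before_write(attempts); // hook used to simulate a concurrent writer
        if (kv.put_if_unmodified(rev, new_value))
            return attempts;
    }
}
```

If another writer changes the key between the read and the write, the first attempt is rejected and the modification is recomputed from the fresh value, so concurrent changes are not silently overwritten.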

src/cli_pool_rm.cpp Normal file

@@ -0,0 +1,226 @@
// Copyright (c) MIND Software LLC, 2023 (info@mindsw.io)
// I accept Vitastor CLA: see CLA-en.md for details
// Copyright (c) Vitaliy Filippov, 2024
// License: VNPL-1.1 (see README.md for details)
#include <ctype.h>
#include "cli.h"
#include "cluster_client.h"
#include "str_util.h"
struct pool_remover_t
{
cli_tool_t *parent;
// Required parameters (id/name)
pool_id_t pool_id = 0;
std::string pool_name;
// Force removal
bool force = false;
int state = 0;
cli_result_t result;
// Is pool valid?
bool pool_valid = false;
// Updated pools
json11::Json new_pools;
// Expected pools mod revision
uint64_t pools_mod_rev;
bool is_done() { return state == 100; }
void loop()
{
if (state == 1)
goto resume_1;
else if (state == 2)
goto resume_2;
else if (state == 3)
goto resume_3;
// Pool name (or id) required
if (!pool_id && pool_name == "")
{
result = (cli_result_t){ .err = EINVAL, .text = "Pool name or id must be given" };
state = 100;
return;
}
// Validate pool name/id
// Get pool id by name (if name given)
if (pool_name != "")
{
for (auto & ic: parent->cli->st_cli.pool_config)
{
if (ic.second.name == pool_name)
{
pool_id = ic.first;
pool_valid = true;
break;
}
}
}
// Otherwise, check if given pool id is valid
else
{
// Set pool name from id (for easier logging)
pool_name = "id " + std::to_string(pool_id);
// Look-up pool id in pool_config
if (parent->cli->st_cli.pool_config.find(pool_id) != parent->cli->st_cli.pool_config.end())
{
pool_valid = true;
}
}
// Need a valid pool to proceed
if (!pool_valid)
{
result = (cli_result_t){ .err = ENOENT, .text = "Pool "+pool_name+" does not exist" };
state = 100;
return;
}
// Unless forced, check if pool has associated Images/Snapshots
if (!force)
{
std::string images;
for (auto & ic: parent->cli->st_cli.inode_config)
{
if (pool_id && INODE_POOL(ic.second.num) != pool_id)
{
continue;
}
images += ((images != "") ? ", " : "") + ic.second.name;
}
if (images != "")
{
result = (cli_result_t){
.err = ENOTEMPTY,
.text =
"Pool "+pool_name+" cannot be removed as it still has the following "
"images/snapshots associated with it: "+images
};
state = 100;
return;
}
}
// Proceed to deleting the pool
state = 1;
do
{
resume_1:
// Get pools from etcd
parent->etcd_txn(json11::Json::object {
{ "success", json11::Json::array {
json11::Json::object {
{ "request_range", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
} }
},
} },
});
state = 2;
resume_2:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
{
// Parse received pools from etcd
auto kv = parent->cli->st_cli.parse_etcd_kv(parent->etcd_result["responses"][0]["response_range"]["kvs"][0]);
// Remove pool
auto p = kv.value.object_items();
if (p.erase(std::to_string(pool_id)) != 1)
{
result = (cli_result_t){
.err = ENOENT,
.text = "Failed to erase pool "+pool_name+" from: "+kv.value.dump()
};
state = 100;
return;
}
// Record updated pools
new_pools = p;
// Expected pools mod revision
pools_mod_rev = kv.mod_revision;
}
// Update pools in etcd
parent->etcd_txn(json11::Json::object {
{ "compare", json11::Json::array {
json11::Json::object {
{ "target", "MOD" },
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
{ "result", "LESS" },
{ "mod_revision", pools_mod_rev+1 },
}
} },
{ "success", json11::Json::array {
json11::Json::object {
{ "request_put", json11::Json::object {
{ "key", base64_encode(parent->cli->st_cli.etcd_prefix+"/config/pools") },
{ "value", base64_encode(new_pools.dump()) },
} },
},
} },
});
state = 3;
resume_3:
if (parent->waiting > 0)
return;
if (parent->etcd_err.err)
{
result = parent->etcd_err;
state = 100;
return;
}
} while (!parent->etcd_result["succeeded"].bool_value());
// Successfully deleted pool
result = (cli_result_t){
.err = 0,
.text = "Pool "+pool_name+" deleted",
.data = new_pools
};
state = 100;
}
};
std::function<bool(cli_result_t &)> cli_tool_t::start_pool_rm(json11::Json cfg)
{
auto pool_remover = new pool_remover_t();
pool_remover->parent = this;
pool_remover->pool_id = cfg["pool"].uint64_value();
pool_remover->pool_name = pool_remover->pool_id ? "" : cfg["pool"].as_string();
pool_remover->force = !cfg["force"].is_null();
return [pool_remover](cli_result_t & result)
{
pool_remover->loop();
if (pool_remover->is_done())
{
result = pool_remover->result;
delete pool_remover;
return true;
}
return false;
};
}


@@ -245,6 +245,7 @@ resume_8:
}
state = 100;
result = (cli_result_t){
.err = 0,
.text = "",
.data = my_result(result.data),
};


@@ -573,8 +573,7 @@ void etcd_state_client_t::load_global_config()
{
global_bitmap_granularity = DEFAULT_BITMAP_GRANULARITY;
}
-global_immediate_commit = global_config["immediate_commit"].string_value() == "all"
-? IMMEDIATE_ALL : (global_config["immediate_commit"].string_value() == "small" ? IMMEDIATE_SMALL : IMMEDIATE_NONE);
+global_immediate_commit = parse_immediate_commit(global_config["immediate_commit"].string_value());
on_load_config_hook(global_config);
});
}
@@ -782,13 +781,8 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
// Failure Domain
pc.failure_domain = pool_item.second["failure_domain"].string_value();
// Coding Scheme
-if (pool_item.second["scheme"] == "replicated")
-pc.scheme = POOL_SCHEME_REPLICATED;
-else if (pool_item.second["scheme"] == "xor")
-pc.scheme = POOL_SCHEME_XOR;
-else if (pool_item.second["scheme"] == "ec" || pool_item.second["scheme"] == "jerasure")
-pc.scheme = POOL_SCHEME_EC;
-else
+pc.scheme = parse_scheme(pool_item.second["scheme"].string_value());
+if (!pc.scheme)
{
fprintf(stderr, "Pool %u has invalid coding scheme (one of \"xor\", \"replicated\", \"ec\" or \"jerasure\" required), skipping pool\n", pool_id);
continue;
@@ -871,9 +865,7 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
pc.scrub_interval = 0;
// Immediate Commit Mode
pc.immediate_commit = pool_item.second["immediate_commit"].is_string()
-? (pool_item.second["immediate_commit"].string_value() == "all"
-? IMMEDIATE_ALL : (pool_item.second["immediate_commit"].string_value() == "small"
-? IMMEDIATE_SMALL : IMMEDIATE_NONE))
+? parse_immediate_commit(pool_item.second["immediate_commit"].string_value())
: global_immediate_commit;
// PG Stripe Size
pc.pg_stripe_size = pool_item.second["pg_stripe_size"].uint64_value();
@@ -1167,6 +1159,23 @@ void etcd_state_client_t::parse_state(const etcd_kv_t & kv)
}
}
uint32_t etcd_state_client_t::parse_immediate_commit(const std::string & immediate_commit_str)
{
return immediate_commit_str == "all" ? IMMEDIATE_ALL :
(immediate_commit_str == "small" ? IMMEDIATE_SMALL : IMMEDIATE_NONE);
}
uint32_t etcd_state_client_t::parse_scheme(const std::string & scheme)
{
if (scheme == "replicated")
return POOL_SCHEME_REPLICATED;
else if (scheme == "xor")
return POOL_SCHEME_XOR;
else if (scheme == "ec" || scheme == "jerasure")
return POOL_SCHEME_EC;
return 0;
}
void etcd_state_client_t::insert_inode_config(const inode_config_t & cfg)
{
this->inode_config[cfg.num] = cfg;


@@ -151,4 +151,7 @@ public:
void close_watch(inode_watch_t* watch);
int address_count();
~etcd_state_client_t();
static uint32_t parse_immediate_commit(const std::string & immediate_commit_str);
static uint32_t parse_scheme(const std::string & scheme_str);
};


@@ -3,6 +3,8 @@
#pragma once
#include "object_id.h"
#define POOL_SCHEME_REPLICATED 1
#define POOL_SCHEME_XOR 2
#define POOL_SCHEME_EC 3


@@ -214,7 +214,10 @@ void print_help(const char *help_text, std::string exe_name, std::string cmd, bo
else if (*next_line && isspace(*next_line))
started = true;
else if (cmd_start && matched)
{
filtered_text += std::string(cmd_start, next_line-cmd_start);
matched = started = false;
}
}
while (filtered_text.size() > 1 &&
filtered_text[filtered_text.size()-1] == '\n' &&
@@ -324,3 +327,24 @@ size_t utf8_length(const char *s)
len += (*s & 0xC0) != 0x80;
return len;
}
std::vector<std::string> explode(const std::string & sep, const std::string & value, bool trim)
{
std::vector<std::string> res;
size_t prev = 0;
while (prev < value.size())
{
while (trim && prev < value.size() && isspace(value[prev]))
prev++;
size_t pos = value.find(sep, prev);
if (pos == std::string::npos)
pos = value.size();
size_t next = pos+sep.size();
while (trim && pos > prev && isspace(value[pos-1]))
pos--;
if (!trim || pos > prev)
res.push_back(value.substr(prev, pos-prev));
prev = next;
}
return res;
}
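
To illustrate how the `explode()` helper added above treats whitespace and empty fields, here is a small standalone sketch. It repeats the function from the hunk so that it compiles on its own. The `split_tags` wrapper is a hypothetical example of turning a comma-separated value into a list and is not part of this change.

```cpp
#include <ctype.h>
#include <string>
#include <vector>

// Copy of explode() from the hunk above, repeated to keep the sketch self-contained
std::vector<std::string> explode(const std::string & sep, const std::string & value, bool trim)
{
    std::vector<std::string> res;
    size_t prev = 0;
    while (prev < value.size())
    {
        // Skip leading whitespace of the current field
        while (trim && prev < value.size() && isspace(value[prev]))
            prev++;
        size_t pos = value.find(sep, prev);
        if (pos == std::string::npos)
            pos = value.size();
        size_t next = pos+sep.size();
        // Drop trailing whitespace of the current field
        while (trim && pos > prev && isspace(value[pos-1]))
            pos--;
        // With trim enabled, fields that end up empty are skipped
        if (!trim || pos > prev)
            res.push_back(value.substr(prev, pos-prev));
        prev = next;
    }
    return res;
}

// Hypothetical wrapper: parse a comma-separated tag list, ignoring spaces and empty items
std::vector<std::string> split_tags(const std::string & value)
{
    return explode(",", value, true);
}
```

With `trim` enabled, `"a, b ,,c"` yields `a`, `b`, `c`. With `trim` disabled, `"a,,b"` keeps the empty middle field and yields `a`, an empty string and `b`.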


@@ -4,6 +4,7 @@
#pragma once
#include <stdint.h>
#include <string>
#include <vector>
std::string base64_encode(const std::string &in);
std::string base64_decode(const std::string &in);
@@ -20,3 +21,4 @@ std::string read_all_fd(int fd);
std::string str_repeat(const std::string & str, int times);
size_t utf8_length(const std::string & s);
size_t utf8_length(const char *s);
std::vector<std::string> explode(const std::string & sep, const std::string & value, bool trim);


@@ -13,7 +13,14 @@ $ETCDCTL put /vitastor/osd/stats/5 '{"host":"host3","size":1073741824,"time":"'$
$ETCDCTL put /vitastor/osd/stats/6 '{"host":"host3","size":1073741824,"time":"'$TIME'"}'
$ETCDCTL put /vitastor/osd/stats/7 '{"host":"host4","size":1073741824,"time":"'$TIME'"}'
$ETCDCTL put /vitastor/osd/stats/8 '{"host":"host4","size":1073741824,"time":"'$TIME'"}'
$ETCDCTL put /vitastor/config/pools '{"1":{"name":"testpool","scheme":"replicated","pg_size":2,"pg_minsize":1,"pg_count":4,"failure_domain":"rack"}}'
build/src/vitastor-cli --etcd_address $ETCD_URL create-pool testpool --ec 3+2 -n 32 --failure_domain rack --force
$ETCDCTL get --print-value-only /vitastor/config/pools | jq -s -e '. == [{"1": {"failure_domain": "rack", "name": "testpool", "parity_chunks": 2, "pg_count": 32, "pg_minsize": 4, "pg_size": 5, "scheme": "ec"}}]'
build/src/vitastor-cli --etcd_address $ETCD_URL modify-pool testpool --ec 3+3 --failure_domain host
$ETCDCTL get --print-value-only /vitastor/config/pools | jq -s -e '. == [{"1": {"failure_domain": "host", "name": "testpool", "parity_chunks": 3, "pg_count": 32, "pg_minsize": 4, "pg_size": 6, "scheme": "ec"}}]'
build/src/vitastor-cli --etcd_address $ETCD_URL rm-pool testpool
$ETCDCTL get --print-value-only /vitastor/config/pools | jq -s -e '. == [{}]'
build/src/vitastor-cli --etcd_address $ETCD_URL create-pool testpool -s 2 -n 4 --failure_domain rack --force
$ETCDCTL get --print-value-only /vitastor/config/pools | jq -s -e '. == [{"1":{"name":"testpool","scheme":"replicated","pg_size":2,"pg_minsize":1,"pg_count":4,"failure_domain":"rack"}}]'
node mon/mon-main.js --etcd_address $ETCD_URL --etcd_prefix "/vitastor" >>./testdata/mon.log 2>&1 &
MON_PID=$!