pve-qemu/debian/patches/pve/0030-PVE-Backup-Proxmox-bac...

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dietmar Maurer <dietmar@proxmox.com>
Date: Mon, 6 Apr 2020 12:16:59 +0200
Subject: [PATCH] PVE-Backup: Proxmox backup patches for QEMU
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

For PBS, using dirty bitmaps is supported via QEMU's
MIRROR_SYNC_MODE_BITMAP. When the feature is used, the data-write
callback is only executed for changed chunks; the PBS Rust code
will reuse chunks from the previous index for everything it doesn't
receive if reuse_index is true. On error or cancellation, all dirty
bitmaps are removed to ensure consistency.
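
Condensed from create_backup_jobs_bh() further down in this patch, the
per-device decision looks like this (di->bitmap is only set when
use-dirty-bitmap was requested):

    MirrorSyncMode sync_mode = MIRROR_SYNC_MODE_FULL;
    BitmapSyncMode bitmap_mode = BITMAP_SYNC_MODE_NEVER;
    if (di->bitmap) {
        /* only chunks marked dirty in the bitmap reach the data-write callback */
        sync_mode = MIRROR_SYNC_MODE_BITMAP;
        /* the bitmap is only synced back if the job succeeds */
        bitmap_mode = BITMAP_SYNC_MODE_ON_SUCCESS;
    }
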
By using a JobTxn, we can sync dirty bitmaps only when *all* jobs were
successful - meaning we don't need to remove them when the backup
fails, since QEMU's BITMAP_SYNC_MODE_ON_SUCCESS will now handle that
for us. A sequential transaction is used, so drives will be backed up
one after the other.
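
In code, this boils down to three calls (see create_backup_jobs_bh() and
the end of qmp_backup() below):

    /* one sequential transaction for the whole backup */
    backup_state.txn = job_txn_new_seq();

    /* each drive's (paused) job is created as part of that transaction */
    BlockJob *job = backup_job_create(
        NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
        bitmap_mode, false, NULL, &backup_state.perf, BLOCKDEV_ON_ERROR_REPORT,
        BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di,
        backup_state.txn, &local_err);

    /* once everything is set up, kick off the first job */
    job_txn_start_seq(backup_state.txn);
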
The backup and backup-cancel QMP calls are coroutines. This has the
benefit that calls are asynchronous to the main loop, i.e. long
running operations like connecting to a PBS server will no longer hang
the VM.

backup_job_create() and job_cancel_sync() cannot be run from a
coroutine and require an acquired AioContext, so job creation and
canceling are extracted as bottom halves and called from the
respective QMP coroutines.
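
The handoff is the same in both directions: the QMP coroutine schedules a
bottom half and yields, and the bottom half re-enters the coroutine once it
is done (condensed from qmp_backup() and create_backup_jobs_bh() below):

    /* in the QMP coroutine */
    CoCtxData waker = {
        .co = qemu_coroutine_self(),
        .ctx = qemu_get_current_aio_context(),
        .data = &local_err,
    };
    aio_bh_schedule_oneshot(waker.ctx, create_backup_jobs_bh, &waker);
    qemu_coroutine_yield();

    /* ... and at the end of the bottom half, running in the main loop */
    aio_co_enter(data->ctx, data->co);
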
To communicate the finishing state, a dedicated property is used for
query-backup: 'finishing'. A dedicated state is explicitly not used,
since that would break compatibility with older qemu-server versions.

The first call to job_cancel_sync() will cancel and free all jobs in
the transaction, so it is necessary to pick a job that is:
1. still referenced. For this, there is a job_ref directly after job
creation, paired with a job_unref in the cleanup paths.
2. not yet finalized. In job_cancel_bh(), the first job that is not
yet completed is used. This is not necessarily the first job in the
list, because pvebackup_co_complete_stream() might not yet have
removed a completed job when job_cancel_bh() runs.

Why even bother with the bottom half at all and not use job_cancel()
in qmp_backup_cancel() directly? The reason is that qmp_backup_cancel()
is a coroutine, so it will hang when reaching AIO_WAIT_WHILE(), and
job_cancel() might end up calling that.
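
Condensed, this is the check job_cancel_bh() below performs for each job
before handing it to job_cancel_sync():

    BlockJob *bj = ((PVEBackupDevInfo *)bdi->data)->job; /* non-NULL: still referenced via job_ref */
    WITH_JOB_LOCK_GUARD() {
        if (!job_is_completed_locked(&bj->job)) {
            /* not yet finalized, so safe to cancel; canceling one job
             * cancels the rest of the (sequential) transaction */
            job_cancel_sync_locked(&bj->job, true);
        }
    }
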
Regarding the BackupPerf performance settings: for now, only the
max-workers setting is exposed, because:
1. use-copy-range would need to be implemented in backup-dump, and the
feature was actually turned off by default in QEMU itself because it
didn't provide the expected benefit, see commit 6a30f663d4 ("qapi:
backup: disable copy_range by default").
2. max-chunk is enforced to be at least the backup cluster size (4 MiB
for PBS) and otherwise the maximum of the source and target cluster
sizes, and block-copy has a maximum buffer size of 1 MiB, so setting a
larger max-chunk doesn't even have an effect. To make the setting
sensibly usable, the check would need to be removed and optionally the
block-copy max buffer size would need to be bumped. I tried doing just
that and tested different source/target combinations with different
max-chunk settings, but there were no noticeable improvements over the
default "unlimited" (resulting in 1 MiB for block-copy).
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
[SR: Add dirty-bitmap tracking for incremental backups
Add query_proxmox_support and query-pbs-bitmap-info QMP calls
Use a transaction to synchronize job states
Co-routine and async-related improvements
Improve finishing backups/cleanups
Various other improvements]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[FG: add master key support]
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[WB: add PBS namespace support]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls
adapt for new job lock mechanism replacing AioContext locks
adapt to QAPI changes
improve canceling
allow passing max-workers setting
use malloc_trim after backup
create jobs in a drained section]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/backup-dump.c | 10 +-
block/meson.build | 5 +
block/monitor/block-hmp-cmds.c | 39 ++
blockdev.c | 1 +
hmp-commands-info.hx | 14 +
hmp-commands.hx | 29 +
include/block/block_int-common.h | 2 +-
include/monitor/hmp.h | 3 +
meson.build | 1 +
monitor/hmp-cmds.c | 72 ++
proxmox-backup-client.c | 146 ++++
proxmox-backup-client.h | 60 ++
pve-backup.c | 1067 ++++++++++++++++++++++++++++++
qapi/block-core.json | 229 +++++++
qapi/common.json | 14 +
qapi/machine.json | 16 +-
16 files changed, 1690 insertions(+), 18 deletions(-)
create mode 100644 proxmox-backup-client.c
create mode 100644 proxmox-backup-client.h
create mode 100644 pve-backup.c
diff --git a/block/backup-dump.c b/block/backup-dump.c
index 232a094426..e46abf1070 100644
--- a/block/backup-dump.c
+++ b/block/backup-dump.c
@@ -9,6 +9,8 @@
*/
#include "qemu/osdep.h"
+
+#include "qapi/qmp/qdict.h"
#include "qom/object_interfaces.h"
#include "block/block_int.h"
@@ -141,7 +143,7 @@ static void bdrv_backup_dump_init(void)
block_init(bdrv_backup_dump_init);
-BlockDriverState *bdrv_backup_dump_create(
+BlockDriverState *coroutine_fn bdrv_co_backup_dump_create(
int dump_cb_block_size,
uint64_t byte_size,
BackupDumpFunc *dump_cb,
@@ -149,9 +151,11 @@ BlockDriverState *bdrv_backup_dump_create(
Error **errp)
{
BDRVBackupDumpState *state;
- BlockDriverState *bs = bdrv_new_open_driver(
- &bdrv_backup_dump_drive, NULL, BDRV_O_RDWR, errp);
+ QDict *options = qdict_new();
+ qdict_put_str(options, "driver", "backup-dump-drive");
+
+ BlockDriverState *bs = bdrv_co_open(NULL, NULL, options, BDRV_O_RDWR, errp);
if (!bs) {
return NULL;
}
diff --git a/block/meson.build b/block/meson.build
index 6fde9f7dcd..6d468f89e5 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -45,6 +45,11 @@ block_ss.add(files(
), zstd, zlib, gnutls)
block_ss.add(files('../vma-writer.c'), libuuid)
+block_ss.add(files(
+ '../proxmox-backup-client.c',
+ '../pve-backup.c',
+), libproxmox_backup_qemu)
+
system_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
system_ss.add(files('block-ram-registrar.c'))
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index ca2599de44..6efe28cef5 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1029,3 +1029,42 @@ void hmp_change_medium(Monitor *mon, const char *device, const char *target,
qmp_blockdev_change_medium(device, NULL, target, arg, true, force,
!!read_only, read_only_mode, errp);
}
+
+void coroutine_fn hmp_backup_cancel(Monitor *mon, const QDict *qdict)
+{
+ Error *error = NULL;
+
+ qmp_backup_cancel(&error);
+
+ hmp_handle_error(mon, error);
+}
+
+void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
+{
+ Error *error = NULL;
+
+ const char *backup_file = qdict_get_str(qdict, "backupfile");
+ const char *devlist = qdict_get_try_str(qdict, "devlist");
+ int64_t speed = qdict_get_try_int(qdict, "speed", 0);
+
+ qmp_backup(
+ backup_file,
+ NULL, // PBS password
+ NULL, // PBS keyfile
+ NULL, // PBS key_password
+ NULL, // PBS master_keyfile
+ NULL, // PBS fingerprint
+ NULL, // PBS backup-ns
+ NULL, // PBS backup-id
+ false, 0, // PBS backup-time
+ false, false, // PBS use-dirty-bitmap
+ false, false, // PBS compress
+ false, false, // PBS encrypt
+ true, BACKUP_FORMAT_VMA,
+ NULL, NULL,
+ devlist, qdict_haskey(qdict, "speed"), speed,
+ false, 0, // BackupPerf max-workers
+ &error);
+
+ hmp_handle_error(mon, error);
+}
diff --git a/blockdev.c b/blockdev.c
index cd5f205ad1..7793143d76 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -37,6 +37,7 @@
#include "block/blockjob.h"
#include "block/dirty-bitmap.h"
#include "block/qdict.h"
+#include "block/blockjob_int.h"
#include "block/throttle-groups.h"
#include "monitor/monitor.h"
#include "qemu/error-report.h"
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 10fdd822e0..15937793c1 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -471,6 +471,20 @@ SRST
Show the current VM UUID.
ERST
+
+ {
+ .name = "backup",
+ .args_type = "",
+ .params = "",
+ .help = "show backup status",
+ .cmd = hmp_info_backup,
+ },
+
+SRST
+ ``info backup``
+ Show backup status.
+ERST
+
#if defined(CONFIG_SLIRP)
{
.name = "usernet",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index e352f86872..0c8b6725fb 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -101,6 +101,35 @@ ERST
SRST
``block_stream``
Copy data from a backing file into a block device.
+ERST
+
+ {
+ .name = "backup",
+ .args_type = "backupfile:s,speed:o?,devlist:s?",
+ .params = "backupfile [speed [devlist]]",
+ .help = "create a VM backup (VMA format).",
+ .cmd = hmp_backup,
+ .coroutine = true,
+ },
+
+SRST
+``backup``
+ Create a VM backup.
+ERST
+
+ {
+ .name = "backup_cancel",
+ .args_type = "",
+ .params = "",
+ .help = "cancel the current VM backup",
+ .cmd = hmp_backup_cancel,
+ .coroutine = true,
+ },
+
+SRST
+``backup_cancel``
+ Cancel the current VM backup.
+
ERST
{
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 0f2e1817ad..0a0339eee4 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -63,7 +63,7 @@
typedef int BackupDumpFunc(void *opaque, uint64_t offset, uint64_t bytes, const void *buf);
-BlockDriverState *bdrv_backup_dump_create(
+BlockDriverState *coroutine_fn bdrv_co_backup_dump_create(
int dump_cb_block_size,
uint64_t byte_size,
BackupDumpFunc *dump_cb,
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 7a7def7530..cba7afe70c 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -32,6 +32,7 @@ void hmp_info_savevm(Monitor *mon, const QDict *qdict);
void hmp_info_migrate(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict);
+void hmp_info_backup(Monitor *mon, const QDict *qdict);
void hmp_info_cpus(Monitor *mon, const QDict *qdict);
void hmp_info_vnc(Monitor *mon, const QDict *qdict);
void hmp_info_spice(Monitor *mon, const QDict *qdict);
@@ -84,6 +85,8 @@ void hmp_change_vnc(Monitor *mon, const char *device, const char *target,
void hmp_change_medium(Monitor *mon, const char *device, const char *target,
const char *arg, const char *read_only, bool force,
Error **errp);
+void hmp_backup(Monitor *mon, const QDict *qdict);
+void hmp_backup_cancel(Monitor *mon, const QDict *qdict);
void hmp_migrate(Monitor *mon, const QDict *qdict);
void hmp_device_add(Monitor *mon, const QDict *qdict);
void hmp_device_del(Monitor *mon, const QDict *qdict);
diff --git a/meson.build b/meson.build
index cd95530d3b..d53976d621 100644
--- a/meson.build
+++ b/meson.build
@@ -1779,6 +1779,7 @@ endif
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
+libproxmox_backup_qemu = cc.find_library('proxmox_backup_qemu', required: true)
# libselinux
selinux = dependency('libselinux',
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 91be698308..5b9c231a4c 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -21,6 +21,7 @@
#include "qemu/help_option.h"
#include "monitor/monitor-internal.h"
#include "qapi/error.h"
+#include "qapi/qapi-commands-block-core.h"
#include "qapi/qapi-commands-control.h"
#include "qapi/qapi-commands-migration.h"
#include "qapi/qapi-commands-misc.h"
@@ -144,6 +145,77 @@ void hmp_sync_profile(Monitor *mon, const QDict *qdict)
}
}
+void hmp_info_backup(Monitor *mon, const QDict *qdict)
+{
+ BackupStatus *info;
+ PBSBitmapInfoList *bitmap_info;
+
+ info = qmp_query_backup(NULL);
+
+ if (!info) {
+ monitor_printf(mon, "Backup status: not initialized\n");
+ return;
+ }
+
+ if (info->status) {
+ if (info->errmsg) {
+ monitor_printf(mon, "Backup status: %s - %s\n",
+ info->status, info->errmsg);
+ } else {
+ monitor_printf(mon, "Backup status: %s\n", info->status);
+ }
+ }
+
+ if (info->backup_file) {
+ monitor_printf(mon, "Start time: %s", ctime(&info->start_time));
+ if (info->end_time) {
+ monitor_printf(mon, "End time: %s", ctime(&info->end_time));
+ }
+
+ monitor_printf(mon, "Backup file: %s\n", info->backup_file);
+ monitor_printf(mon, "Backup uuid: %s\n", info->uuid);
+
+ if (!(info->has_total && info->total)) {
+ // this should not happen normally
+ monitor_printf(mon, "Total size: %d\n", 0);
+ } else {
+ size_t total_or_dirty = info->total;
+ PBSBitmapInfoList *bitmap_list = qmp_query_pbs_bitmap_info(NULL);
+ bitmap_info = bitmap_list;
+
+ while (bitmap_info) {
+ monitor_printf(mon, "Drive %s:\n",
+ bitmap_info->value->drive);
+ monitor_printf(mon, " bitmap action: %s\n",
+ PBSBitmapAction_str(bitmap_info->value->action));
+ monitor_printf(mon, " size: %zd\n",
+ bitmap_info->value->size);
+ monitor_printf(mon, " dirty: %zd\n",
+ bitmap_info->value->dirty);
+ bitmap_info = bitmap_info->next;
+ }
+
+ /* bitmap_info was advanced to NULL by the loop, free via the saved head */
+ qapi_free_PBSBitmapInfoList(bitmap_list);
+
+ int zero_per = (info->has_zero_bytes && info->zero_bytes) ?
+ (info->zero_bytes * 100)/info->total : 0;
+ monitor_printf(mon, "Total size: %zd\n", info->total);
+ int trans_per = (info->transferred * 100)/total_or_dirty;
+ monitor_printf(mon, "Transferred bytes: %zd (%d%%)\n",
+ info->transferred, trans_per);
+ monitor_printf(mon, "Zero bytes: %zd (%d%%)\n",
+ info->zero_bytes, zero_per);
+
+ if (info->has_reused) {
+ int reused_per = (info->reused * 100)/total_or_dirty;
+ monitor_printf(mon, "Reused bytes: %zd (%d%%)\n",
+ info->reused, reused_per);
+ }
+ }
+ }
+
+ qapi_free_BackupStatus(info);
+}
+
void hmp_exit_preconfig(Monitor *mon, const QDict *qdict)
{
Error *err = NULL;
diff --git a/proxmox-backup-client.c b/proxmox-backup-client.c
new file mode 100644
index 0000000000..0923037dec
--- /dev/null
+++ b/proxmox-backup-client.c
@@ -0,0 +1,146 @@
+#include "proxmox-backup-client.h"
+#include "qemu/main-loop.h"
+#include "block/aio-wait.h"
+#include "qapi/error.h"
+
+/* Proxmox Backup Server client bindings using coroutines */
+
+// This is called from another thread, so we use aio_co_schedule()
+static void proxmox_backup_schedule_wake(void *data) {
+ CoCtxData *waker = (CoCtxData *)data;
+ aio_co_schedule(waker->ctx, waker->co);
+}
+
+int coroutine_fn
+proxmox_backup_co_connect(ProxmoxBackupHandle *pbs, Error **errp)
+{
+ Coroutine *co = qemu_coroutine_self();
+ AioContext *ctx = qemu_get_current_aio_context();
+ CoCtxData waker = { .co = co, .ctx = ctx };
+ char *pbs_err = NULL;
+ int pbs_res = -1;
+
+ proxmox_backup_connect_async(pbs, proxmox_backup_schedule_wake, &waker, &pbs_res, &pbs_err);
+ qemu_coroutine_yield();
+ if (pbs_res < 0) {
+ if (errp) error_setg(errp, "backup connect failed: %s", pbs_err ? pbs_err : "unknown error");
+ if (pbs_err) proxmox_backup_free_error(pbs_err);
+ }
+ return pbs_res;
+}
+
+int coroutine_fn
+proxmox_backup_co_add_config(
+ ProxmoxBackupHandle *pbs,
+ const char *name,
+ const uint8_t *data,
+ uint64_t size,
+ Error **errp)
+{
+ Coroutine *co = qemu_coroutine_self();
+ AioContext *ctx = qemu_get_current_aio_context();
+ CoCtxData waker = { .co = co, .ctx = ctx };
+ char *pbs_err = NULL;
+ int pbs_res = -1;
+
+ proxmox_backup_add_config_async(
+ pbs, name, data, size, proxmox_backup_schedule_wake, &waker, &pbs_res, &pbs_err);
+ qemu_coroutine_yield();
+ if (pbs_res < 0) {
+ if (errp) error_setg(errp, "backup add_config %s failed: %s", name, pbs_err ? pbs_err : "unknown error");
+ if (pbs_err) proxmox_backup_free_error(pbs_err);
+ }
+ return pbs_res;
+}
+
+int coroutine_fn
+proxmox_backup_co_register_image(
+ ProxmoxBackupHandle *pbs,
+ const char *device_name,
+ uint64_t size,
+ bool incremental,
+ Error **errp)
+{
+ Coroutine *co = qemu_coroutine_self();
+ AioContext *ctx = qemu_get_current_aio_context();
+ CoCtxData waker = { .co = co, .ctx = ctx };
+ char *pbs_err = NULL;
+ int pbs_res = -1;
+
+ proxmox_backup_register_image_async(
+ pbs, device_name, size, incremental, proxmox_backup_schedule_wake, &waker, &pbs_res, &pbs_err);
+ qemu_coroutine_yield();
+ if (pbs_res < 0) {
+ if (errp) error_setg(errp, "backup register image failed: %s", pbs_err ? pbs_err : "unknown error");
+ if (pbs_err) proxmox_backup_free_error(pbs_err);
+ }
+ return pbs_res;
+}
+
+int coroutine_fn
+proxmox_backup_co_finish(
+ ProxmoxBackupHandle *pbs,
+ Error **errp)
+{
+ Coroutine *co = qemu_coroutine_self();
+ AioContext *ctx = qemu_get_current_aio_context();
+ CoCtxData waker = { .co = co, .ctx = ctx };
+ char *pbs_err = NULL;
+ int pbs_res = -1;
+
+ proxmox_backup_finish_async(
+ pbs, proxmox_backup_schedule_wake, &waker, &pbs_res, &pbs_err);
+ qemu_coroutine_yield();
+ if (pbs_res < 0) {
+ if (errp) error_setg(errp, "backup finish failed: %s", pbs_err ? pbs_err : "unknown error");
+ if (pbs_err) proxmox_backup_free_error(pbs_err);
+ }
+ return pbs_res;
+}
+
+int coroutine_fn
+proxmox_backup_co_close_image(
+ ProxmoxBackupHandle *pbs,
+ uint8_t dev_id,
+ Error **errp)
+{
+ Coroutine *co = qemu_coroutine_self();
+ AioContext *ctx = qemu_get_current_aio_context();
+ CoCtxData waker = { .co = co, .ctx = ctx };
+ char *pbs_err = NULL;
+ int pbs_res = -1;
+
+ proxmox_backup_close_image_async(
+ pbs, dev_id, proxmox_backup_schedule_wake, &waker, &pbs_res, &pbs_err);
+ qemu_coroutine_yield();
+ if (pbs_res < 0) {
+ if (errp) error_setg(errp, "backup close image failed: %s", pbs_err ? pbs_err : "unknown error");
+ if (pbs_err) proxmox_backup_free_error(pbs_err);
+ }
+ return pbs_res;
+}
+
+int coroutine_fn
+proxmox_backup_co_write_data(
+ ProxmoxBackupHandle *pbs,
+ uint8_t dev_id,
+ const uint8_t *data,
+ uint64_t offset,
+ uint64_t size,
+ Error **errp)
+{
+ Coroutine *co = qemu_coroutine_self();
+ AioContext *ctx = qemu_get_current_aio_context();
+ CoCtxData waker = { .co = co, .ctx = ctx };
+ char *pbs_err = NULL;
+ int pbs_res = -1;
+
+ proxmox_backup_write_data_async(
+ pbs, dev_id, data, offset, size, proxmox_backup_schedule_wake, &waker, &pbs_res, &pbs_err);
+ qemu_coroutine_yield();
+ if (pbs_res < 0) {
+ if (errp) error_setg(errp, "backup write data failed: %s", pbs_err ? pbs_err : "unknown error");
+ if (pbs_err) proxmox_backup_free_error(pbs_err);
+ }
+ return pbs_res;
+}
diff --git a/proxmox-backup-client.h b/proxmox-backup-client.h
new file mode 100644
index 0000000000..8cbf645b2c
--- /dev/null
+++ b/proxmox-backup-client.h
@@ -0,0 +1,60 @@
+#ifndef PROXMOX_BACKUP_CLIENT_H
+#define PROXMOX_BACKUP_CLIENT_H
+
+#include "qemu/osdep.h"
+#include "qemu/coroutine.h"
+#include "proxmox-backup-qemu.h"
+
+typedef struct CoCtxData {
+ Coroutine *co;
+ AioContext *ctx;
+ void *data;
+} CoCtxData;
+
+// FIXME: Remove once coroutines are supported for QMP
+void block_on_coroutine_fn(CoroutineEntry *entry, void *entry_arg);
+
+int coroutine_fn
+proxmox_backup_co_connect(
+ ProxmoxBackupHandle *pbs,
+ Error **errp);
+
+int coroutine_fn
+proxmox_backup_co_add_config(
+ ProxmoxBackupHandle *pbs,
+ const char *name,
+ const uint8_t *data,
+ uint64_t size,
+ Error **errp);
+
+int coroutine_fn
+proxmox_backup_co_register_image(
+ ProxmoxBackupHandle *pbs,
+ const char *device_name,
+ uint64_t size,
+ bool incremental,
+ Error **errp);
+
+
+int coroutine_fn
+proxmox_backup_co_finish(
+ ProxmoxBackupHandle *pbs,
+ Error **errp);
+
+int coroutine_fn
+proxmox_backup_co_close_image(
+ ProxmoxBackupHandle *pbs,
+ uint8_t dev_id,
+ Error **errp);
+
+int coroutine_fn
+proxmox_backup_co_write_data(
+ ProxmoxBackupHandle *pbs,
+ uint8_t dev_id,
+ const uint8_t *data,
+ uint64_t offset,
+ uint64_t size,
+ Error **errp);
+
+
+#endif /* PROXMOX_BACKUP_CLIENT_H */
diff --git a/pve-backup.c b/pve-backup.c
new file mode 100644
index 0000000000..d84d807654
--- /dev/null
+++ b/pve-backup.c
@@ -0,0 +1,1067 @@
+#include "proxmox-backup-client.h"
+#include "vma.h"
+
+#include "qemu/osdep.h"
+#include "qemu/module.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/blockdev.h"
+#include "block/block_int-global-state.h"
+#include "block/blockjob.h"
+#include "block/dirty-bitmap.h"
+#include "qapi/qapi-commands-block.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/cutils.h"
+
+#if defined(CONFIG_MALLOC_TRIM)
+#include <malloc.h>
+#endif
+
+#include <proxmox-backup-qemu.h>
+
+/* PVE backup state and related functions */
+
+/*
+ * Note: A resume from a qemu_coroutine_yield can happen in a different thread,
+ * so you may not use normal mutexes within coroutines:
+ *
+ * ---bad-example---
+ * qemu_rec_mutex_lock(lock)
+ * ...
+ * qemu_coroutine_yield() // wait for something
+ * // we are now inside a different thread
+ * qemu_rec_mutex_unlock(lock) // Crash - wrong thread!!
+ * ---end-bad-example--
+ *
+ * ==> Always use CoMutex inside coroutines.
+ * ==> Never acquire/release AioContext within coroutines (because they use QemuRecMutex)
+ *
+ */
+
+const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
+
+static struct PVEBackupState {
+ struct {
+ // Everything accessed from qmp_backup_query command is protected using
+ // this lock. Do NOT hold this lock for long times, as it is sometimes
+ // acquired from coroutines, and thus any wait time may block the guest.
+ QemuMutex lock;
+ Error *error;
+ time_t start_time;
+ time_t end_time;
+ char *backup_file;
+ uuid_t uuid;
+ char uuid_str[37];
+ size_t total;
+ size_t dirty;
+ size_t transferred;
+ size_t reused;
+ size_t zero_bytes;
+ GList *bitmap_list;
+ bool finishing;
+ bool starting;
+ } stat;
+ int64_t speed;
+ BackupPerf perf;
+ VmaWriter *vmaw;
+ ProxmoxBackupHandle *pbs;
+ GList *di_list;
+ JobTxn *txn;
+ CoMutex backup_mutex;
+ CoMutex dump_callback_mutex;
+} backup_state;
+
+static void pvebackup_init(void)
+{
+ qemu_mutex_init(&backup_state.stat.lock);
+ qemu_co_mutex_init(&backup_state.backup_mutex);
+ qemu_co_mutex_init(&backup_state.dump_callback_mutex);
+}
+
+// initialize PVEBackupState at startup
+opts_init(pvebackup_init);
+
+typedef struct PVEBackupDevInfo {
+ BlockDriverState *bs;
+ size_t size;
+ uint64_t block_size;
+ uint8_t dev_id;
+ int completed_ret; // INT_MAX if not completed
+ char targetfile[PATH_MAX];
+ BdrvDirtyBitmap *bitmap;
+ BlockDriverState *target;
+ BlockJob *job;
+} PVEBackupDevInfo;
+
+static void pvebackup_propagate_error(Error *err)
+{
+ qemu_mutex_lock(&backup_state.stat.lock);
+ error_propagate(&backup_state.stat.error, err);
+ qemu_mutex_unlock(&backup_state.stat.lock);
+}
+
+static bool pvebackup_error_or_canceled(void)
+{
+ qemu_mutex_lock(&backup_state.stat.lock);
+ bool error_or_canceled = !!backup_state.stat.error;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ return error_or_canceled;
+}
+
+static void pvebackup_add_transferred_bytes(size_t transferred, size_t zero_bytes, size_t reused)
+{
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.zero_bytes += zero_bytes;
+ backup_state.stat.transferred += transferred;
+ backup_state.stat.reused += reused;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+}
+
+// This may get called from multiple coroutines in multiple io-threads
+// Note1: this may get called after job_cancel()
+static int coroutine_fn
+pvebackup_co_dump_pbs_cb(
+ void *opaque,
+ uint64_t start,
+ uint64_t bytes,
+ const void *pbuf)
+{
+ assert(qemu_in_coroutine());
+
+ const uint64_t size = bytes;
+ const unsigned char *buf = pbuf;
+ PVEBackupDevInfo *di = opaque;
+
+ assert(backup_state.pbs);
+ assert(buf);
+
+ Error *local_err = NULL;
+ int pbs_res = -1;
+
+ bool is_zero_block = size == di->block_size && buffer_is_zero(buf, size);
+
+ qemu_co_mutex_lock(&backup_state.dump_callback_mutex);
+
+ // avoid deadlock if job is cancelled
+ if (pvebackup_error_or_canceled()) {
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ return -1;
+ }
+
+ uint64_t transferred = 0;
+ uint64_t reused = 0;
+ while (transferred < size) {
+ uint64_t left = size - transferred;
+ uint64_t to_transfer = left < di->block_size ? left : di->block_size;
+
+ pbs_res = proxmox_backup_co_write_data(backup_state.pbs, di->dev_id,
+ is_zero_block ? NULL : buf + transferred, start + transferred,
+ to_transfer, &local_err);
+ transferred += to_transfer;
+
+ if (pbs_res < 0) {
+ pvebackup_propagate_error(local_err);
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ return pbs_res;
+ }
+
+ reused += pbs_res == 0 ? to_transfer : 0;
+ }
+
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ pvebackup_add_transferred_bytes(size, is_zero_block ? size : 0, reused);
+
+ return size;
+}
+
+// This may get called from multiple coroutines in multiple io-threads
+static int coroutine_fn
+pvebackup_co_dump_vma_cb(
+ void *opaque,
+ uint64_t start,
+ uint64_t bytes,
+ const void *pbuf)
+{
+ assert(qemu_in_coroutine());
+
+ const uint64_t size = bytes;
+ const unsigned char *buf = pbuf;
+ PVEBackupDevInfo *di = opaque;
+
+ int ret = -1;
+
+ assert(backup_state.vmaw);
+ assert(buf);
+
+ uint64_t remaining = size;
+
+ uint64_t cluster_num = start / VMA_CLUSTER_SIZE;
+ if ((cluster_num * VMA_CLUSTER_SIZE) != start) {
+ Error *local_err = NULL;
+ error_setg(&local_err,
+ "got unaligned write inside backup dump "
+ "callback (sector %ld)", start);
+ pvebackup_propagate_error(local_err);
+ return -1; // not aligned to cluster size
+ }
+
+ while (remaining > 0) {
+ qemu_co_mutex_lock(&backup_state.dump_callback_mutex);
+ // avoid deadlock if job is cancelled
+ if (pvebackup_error_or_canceled()) {
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ return -1;
+ }
+
+ size_t zero_bytes = 0;
+ ret = vma_writer_write(backup_state.vmaw, di->dev_id, cluster_num, buf, &zero_bytes);
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+
+ ++cluster_num;
+ buf += VMA_CLUSTER_SIZE;
+ if (ret < 0) {
+ Error *local_err = NULL;
+ vma_writer_error_propagate(backup_state.vmaw, &local_err);
+ pvebackup_propagate_error(local_err);
+ return ret;
+ } else {
+ if (remaining >= VMA_CLUSTER_SIZE) {
+ assert(ret == VMA_CLUSTER_SIZE);
+ pvebackup_add_transferred_bytes(VMA_CLUSTER_SIZE, zero_bytes, 0);
+ remaining -= VMA_CLUSTER_SIZE;
+ } else {
+ assert(ret == remaining);
+ pvebackup_add_transferred_bytes(remaining, zero_bytes, 0);
+ remaining = 0;
+ }
+ }
+ }
+
+ return size;
+}
+
+// assumes the caller holds backup_mutex
+static void coroutine_fn pvebackup_co_cleanup(void)
+{
+ assert(qemu_in_coroutine());
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.finishing = true;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ if (backup_state.vmaw) {
+ Error *local_err = NULL;
+ vma_writer_close(backup_state.vmaw, &local_err);
+
+ if (local_err != NULL) {
+ pvebackup_propagate_error(local_err);
+ }
+
+ backup_state.vmaw = NULL;
+ }
+
+ if (backup_state.pbs) {
+ if (!pvebackup_error_or_canceled()) {
+ Error *local_err = NULL;
+ proxmox_backup_co_finish(backup_state.pbs, &local_err);
+ if (local_err != NULL) {
+ pvebackup_propagate_error(local_err);
+ }
+ }
+
+ proxmox_backup_disconnect(backup_state.pbs);
+ backup_state.pbs = NULL;
+ }
+
+ g_list_free(backup_state.di_list);
+ backup_state.di_list = NULL;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.end_time = time(NULL);
+ backup_state.stat.finishing = false;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+#if defined(CONFIG_MALLOC_TRIM)
+ /*
+ * Try to reclaim memory for buffers (and, in case of PBS, Rust futures), etc.
+ * Won't happen by default if there is fragmentation.
+ */
+ malloc_trim(4 * 1024 * 1024);
+#endif
+}
+
+static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
+{
+ PVEBackupDevInfo *di = opaque;
+ int ret = di->completed_ret;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ bool starting = backup_state.stat.starting;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+ if (starting) {
+ /* in 'starting' state, no tasks have been run yet, meaning we can (and
+ * must) skip all cleanup, as we don't know what has and hasn't been
+ * initialized yet. */
+ return;
+ }
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ if (ret < 0) {
+ Error *local_err = NULL;
+ error_setg(&local_err, "job failed with err %d - %s", ret, strerror(-ret));
+ pvebackup_propagate_error(local_err);
+ }
+
+ di->bs = NULL;
+
+ assert(di->target == NULL);
+
+ bool error_or_canceled = pvebackup_error_or_canceled();
+
+ if (backup_state.vmaw) {
+ vma_writer_close_stream(backup_state.vmaw, di->dev_id);
+ }
+
+ if (backup_state.pbs && !error_or_canceled) {
+ Error *local_err = NULL;
+ proxmox_backup_co_close_image(backup_state.pbs, di->dev_id, &local_err);
+ if (local_err != NULL) {
+ pvebackup_propagate_error(local_err);
+ }
+ }
+
+ if (di->job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_unref_locked(&di->job->job);
+ di->job = NULL;
+ }
+ }
+
+ // remove self from job list
+ backup_state.di_list = g_list_remove(backup_state.di_list, di);
+
+ g_free(di);
+
+ /* call cleanup if we're the last job */
+ if (!g_list_first(backup_state.di_list)) {
+ pvebackup_co_cleanup();
+ }
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+}
+
+static void pvebackup_complete_cb(void *opaque, int ret)
+{
+ PVEBackupDevInfo *di = opaque;
+ di->completed_ret = ret;
+
+ /*
+ * Schedule stream cleanup in async coroutine. close_image and finish might
+ * take a while, so we can't block on them here. This way it also doesn't
+ * matter if we're already running in a coroutine or not.
+ * Note: di is a pointer to an entry in the global backup_state struct, so
+ * it stays valid.
+ */
+ Coroutine *co = qemu_coroutine_create(pvebackup_co_complete_stream, di);
+ aio_co_enter(qemu_get_aio_context(), co);
+}
+
+/*
+ * job_cancel(_sync) does not like to be called from coroutines, so defer to
+ * main loop processing via a bottom half. Assumes that caller holds
+ * backup_mutex.
+ */
+static void job_cancel_bh(void *opaque) {
+ CoCtxData *data = (CoCtxData*)opaque;
+
+ /*
+ * Be careful to pick a valid job to cancel:
+ * 1. job_cancel_sync() does not expect the job to be finalized already.
+ * 2. job_exit() might run between scheduling and running job_cancel_bh()
+ * and pvebackup_co_complete_stream() might not have removed the job from
+ * the list yet (in fact, cannot, because it waits for the backup_mutex).
+ * Requiring !job_is_completed() ensures that no finalized job is picked.
+ */
+ GList *bdi = g_list_first(backup_state.di_list);
+ while (bdi) {
+ if (bdi->data) {
+ BlockJob *bj = ((PVEBackupDevInfo *)bdi->data)->job;
+ if (bj) {
+ Job *job = &bj->job;
+ WITH_JOB_LOCK_GUARD() {
+ if (!job_is_completed_locked(job)) {
+ job_cancel_sync_locked(job, true);
+ /*
+ * It's enough to cancel one job in the transaction, the
+ * rest will follow automatically.
+ */
+ break;
+ }
+ }
+ }
+ }
+ bdi = g_list_next(bdi);
+ }
+
+ aio_co_enter(data->ctx, data->co);
+}
+
+void coroutine_fn qmp_backup_cancel(Error **errp)
+{
+ Error *cancel_err = NULL;
+ error_setg(&cancel_err, "backup canceled");
+ pvebackup_propagate_error(cancel_err);
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ if (backup_state.vmaw) {
+ /* make sure vma writer does not block anymore */
+ vma_writer_set_error(backup_state.vmaw, "backup canceled");
+ }
+
+ if (backup_state.pbs) {
+ proxmox_backup_abort(backup_state.pbs, "backup canceled");
+ }
+
+ CoCtxData data = {
+ .ctx = qemu_get_current_aio_context(),
+ .co = qemu_coroutine_self(),
+ };
+ aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
+ qemu_coroutine_yield();
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+}
+
+// assumes the caller holds backup_mutex
+static int coroutine_fn pvebackup_co_add_config(
+ const char *file,
+ const char *name,
+ BackupFormat format,
+ VmaWriter *vmaw,
+ ProxmoxBackupHandle *pbs,
+ Error **errp)
+{
+ int res = 0;
+
+ char *cdata = NULL;
+ gsize clen = 0;
+ GError *err = NULL;
+ if (!g_file_get_contents(file, &cdata, &clen, &err)) {
+ error_setg(errp, "unable to read file '%s'", file);
+ return 1;
+ }
+
+ char *basename = g_path_get_basename(file);
+ if (name == NULL) name = basename;
+
+ if (format == BACKUP_FORMAT_VMA) {
+ if (vma_writer_add_config(vmaw, name, cdata, clen) != 0) {
+ error_setg(errp, "unable to add %s config data to vma archive", file);
+ goto err;
+ }
+ } else if (format == BACKUP_FORMAT_PBS) {
+ if (proxmox_backup_co_add_config(pbs, name, (unsigned char *)cdata, clen, errp) < 0)
+ goto err;
+ }
+
+ out:
+ g_free(basename);
+ g_free(cdata);
+ return res;
+
+ err:
+ res = -1;
+ goto out;
+}
+
+/*
+ * backup_job_create can *not* be run from a coroutine (and requires an
+ * acquired AioContext), so this can't either.
+ * The caller is nonetheless responsible for ensuring that backup_mutex is held.
+ */
+static void create_backup_jobs_bh(void *opaque) {
+
+ assert(!qemu_in_coroutine());
+
+ CoCtxData *data = (CoCtxData*)opaque;
+ Error **errp = (Error**)data->data;
+
+ Error *local_err = NULL;
+
+ /* create job transaction to synchronize bitmap commit and cancel all
+ * jobs in case one errors */
+ if (backup_state.txn) {
+ job_txn_unref(backup_state.txn);
+ }
+ backup_state.txn = job_txn_new_seq();
+
+ /* create and start all jobs (paused state) */
+ GList *l = backup_state.di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ assert(di->target != NULL);
+
+ MirrorSyncMode sync_mode = MIRROR_SYNC_MODE_FULL;
+ BitmapSyncMode bitmap_mode = BITMAP_SYNC_MODE_NEVER;
+ if (di->bitmap) {
+ sync_mode = MIRROR_SYNC_MODE_BITMAP;
+ bitmap_mode = BITMAP_SYNC_MODE_ON_SUCCESS;
+ }
+ AioContext *aio_context = bdrv_get_aio_context(di->bs);
+ aio_context_acquire(aio_context);
+
+ bdrv_drained_begin(di->bs);
+
+ BlockJob *job = backup_job_create(
+ NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ bitmap_mode, false, NULL, &backup_state.perf, BLOCKDEV_ON_ERROR_REPORT,
+ BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn,
+ &local_err);
+
+ bdrv_drained_end(di->bs);
+
+ aio_context_release(aio_context);
+
+ di->job = job;
+ if (job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_ref_locked(&job->job);
+ }
+ }
+
+ if (!job || local_err) {
+ error_setg(errp, "backup_job_create failed: %s",
+ local_err ? error_get_pretty(local_err) : "null");
+ break;
+ }
+
+ bdrv_unref(di->target);
+ di->target = NULL;
+ }
+
+ if (*errp) {
+ /*
+ * It's enough to cancel one job in the transaction, the rest will
+ * follow automatically.
+ */
+ bool canceled = false;
+ l = backup_state.di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ if (di->target) {
+ bdrv_unref(di->target);
+ di->target = NULL;
+ }
+
+ if (di->job) {
+ WITH_JOB_LOCK_GUARD() {
+ if (!canceled) {
+ job_cancel_sync_locked(&di->job->job, true);
+ canceled = true;
+ }
+ job_unref_locked(&di->job->job);
+ di->job = NULL;
+ }
+ }
+ }
+ }
+
+ /* return */
+ aio_co_enter(data->ctx, data->co);
+}
+
+UuidInfo coroutine_fn *qmp_backup(
+ const char *backup_file,
+ const char *password,
+ const char *keyfile,
+ const char *key_password,
+ const char *master_keyfile,
+ const char *fingerprint,
+ const char *backup_ns,
+ const char *backup_id,
+ bool has_backup_time, int64_t backup_time,
+ bool has_use_dirty_bitmap, bool use_dirty_bitmap,
+ bool has_compress, bool compress,
+ bool has_encrypt, bool encrypt,
+ bool has_format, BackupFormat format,
+ const char *config_file,
+ const char *firewall_file,
+ const char *devlist,
+ bool has_speed, int64_t speed,
+ bool has_max_workers, int64_t max_workers,
+ Error **errp)
+{
+ assert(qemu_in_coroutine());
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ BlockBackend *blk;
+ BlockDriverState *bs = NULL;
+ Error *local_err = NULL;
+ uuid_t uuid;
+ VmaWriter *vmaw = NULL;
+ ProxmoxBackupHandle *pbs = NULL;
+ gchar **devs = NULL;
+ GList *di_list = NULL;
+ GList *l;
+ UuidInfo *uuid_info;
+
+ const char *config_name = "qemu-server.conf";
+ const char *firewall_name = "qemu-server.fw";
+
+ if (backup_state.di_list) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "previous backup not finished");
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return NULL;
+ }
+
+ /* Todo: try to auto-detect format based on file name */
+ format = has_format ? format : BACKUP_FORMAT_VMA;
+
+ if (devlist) {
+ devs = g_strsplit_set(devlist, ",;:", -1);
+
+ gchar **d = devs;
+ while (d && *d) {
+ blk = blk_by_name(*d);
+ if (blk) {
+ bs = blk_bs(blk);
+ if (!bdrv_co_is_inserted(bs)) {
+ error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, *d);
+ goto err;
+ }
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di_list = g_list_append(di_list, di);
+ } else {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", *d);
+ goto err;
+ }
+ d++;
+ }
+
+ } else {
+ BdrvNextIterator it;
+
+ bs = NULL;
+ for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+ if (!bdrv_co_is_inserted(bs) || bdrv_is_read_only(bs)) {
+ continue;
+ }
+
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di_list = g_list_append(di_list, di);
+ }
+ }
+
+ if (!di_list) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "empty device list");
+ goto err;
+ }
+
+ size_t total = 0;
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+ if (bdrv_op_is_blocked(di->bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
+ goto err;
+ }
+
+ ssize_t size = bdrv_getlength(di->bs);
+ if (size < 0) {
+ error_setg_errno(errp, -size, "bdrv_getlength failed");
+ goto err;
+ }
+ di->size = size;
+ total += size;
+
+ di->completed_ret = INT_MAX;
+ }
+
+ uuid_generate(uuid);
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.reused = 0;
+
+ /* clear previous backup's bitmap_list */
+ if (backup_state.stat.bitmap_list) {
+ GList *bl = backup_state.stat.bitmap_list;
+ while (bl) {
+ g_free(((PBSBitmapInfo *)bl->data)->drive);
+ g_free(bl->data);
+ bl = g_list_next(bl);
+ }
+ g_list_free(backup_state.stat.bitmap_list);
+ backup_state.stat.bitmap_list = NULL;
+ }
+
+ if (format == BACKUP_FORMAT_PBS) {
+ if (!password) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'password'");
+ goto err_mutex;
+ }
+ if (!backup_id) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'backup-id'");
+ goto err_mutex;
+ }
+ if (!has_backup_time) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'backup-time'");
+ goto err_mutex;
+ }
+
+ int dump_cb_block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE; // Hardcoded (4M)
+ firewall_name = "fw.conf";
+
+ char *pbs_err = NULL;
+ pbs = proxmox_backup_new_ns(
+ backup_file,
+ backup_ns,
+ backup_id,
+ backup_time,
+ dump_cb_block_size,
+ password,
+ keyfile,
+ key_password,
+ master_keyfile,
+ has_compress ? compress : true,
+ has_encrypt ? encrypt : !!keyfile,
+ fingerprint,
+ &pbs_err);
+
+ if (!pbs) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "proxmox_backup_new failed: %s", pbs_err);
+ proxmox_backup_free_error(pbs_err);
+ goto err_mutex;
+ }
+
+ int connect_result = proxmox_backup_co_connect(pbs, errp);
+ if (connect_result < 0)
+ goto err_mutex;
+
+ /* register all devices */
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ di->block_size = dump_cb_block_size;
+
+ const char *devname = bdrv_get_device_name(di->bs);
+ PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
+ size_t dirty = di->size;
+
+ BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
+ bool expect_only_dirty = false;
+
+ if (has_use_dirty_bitmap && use_dirty_bitmap) {
+ if (bitmap == NULL) {
+ bitmap = bdrv_create_dirty_bitmap(di->bs, dump_cb_block_size, PBS_BITMAP_NAME, errp);
+ if (!bitmap) {
+ goto err_mutex;
+ }
+ action = PBS_BITMAP_ACTION_NEW;
+ } else {
+ expect_only_dirty = proxmox_backup_check_incremental(pbs, devname, di->size) != 0;
+ }
+
+ if (expect_only_dirty) {
+ /* track clean chunks as reused */
+ dirty = MIN(bdrv_get_dirty_count(bitmap), di->size);
+ backup_state.stat.reused += di->size - dirty;
+ action = PBS_BITMAP_ACTION_USED;
+ } else {
+ /* mark entire bitmap as dirty to make full backup */
+ bdrv_set_dirty_bitmap(bitmap, 0, di->size);
+ if (action != PBS_BITMAP_ACTION_NEW) {
+ action = PBS_BITMAP_ACTION_INVALID;
+ }
+ }
+ di->bitmap = bitmap;
+ } else {
+ /* after a full backup the old dirty bitmap is invalid anyway */
+ if (bitmap != NULL) {
+ bdrv_release_dirty_bitmap(bitmap);
+ action = PBS_BITMAP_ACTION_NOT_USED_REMOVED;
+ }
+ }
+
+ int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, errp);
+ if (dev_id < 0) {
+ goto err_mutex;
+ }
+
+ if (!(di->target = bdrv_co_backup_dump_create(dump_cb_block_size, di->size, pvebackup_co_dump_pbs_cb, di, errp))) {
+ goto err_mutex;
+ }
+
+ di->dev_id = dev_id;
+
+ PBSBitmapInfo *info = g_malloc(sizeof(*info));
+ info->drive = g_strdup(devname);
+ info->action = action;
+ info->size = di->size;
+ info->dirty = dirty;
+ backup_state.stat.bitmap_list = g_list_append(backup_state.stat.bitmap_list, info);
+ }
+ } else if (format == BACKUP_FORMAT_VMA) {
+ vmaw = vma_writer_create(backup_file, uuid, &local_err);
+ if (!vmaw) {
+ if (local_err) {
+ error_propagate(errp, local_err);
+ }
+ goto err_mutex;
+ }
+
+ /* register all devices for vma writer */
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ if (!(di->target = bdrv_co_backup_dump_create(VMA_CLUSTER_SIZE, di->size, pvebackup_co_dump_vma_cb, di, errp))) {
+ goto err_mutex;
+ }
+
+ const char *devname = bdrv_get_device_name(di->bs);
+ di->dev_id = vma_writer_register_stream(vmaw, devname, di->size);
+ if (di->dev_id <= 0) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "register_stream failed");
+ goto err_mutex;
+ }
+ }
+ } else {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "unknown backup format");
+ goto err_mutex;
+ }
+
+ /* add configuration file to archive */
+ if (config_file) {
+ if (pvebackup_co_add_config(config_file, config_name, format, vmaw, pbs, errp) != 0) {
+ goto err_mutex;
+ }
+ }
+
+ /* add firewall file to archive */
+ if (firewall_file) {
+ if (pvebackup_co_add_config(firewall_file, firewall_name, format, vmaw, pbs, errp) != 0) {
+ goto err_mutex;
+ }
+ }
+ /* initialize global backup_state now */
+ /* note: 'reused' and 'bitmap_list' are initialized earlier */
+
+ if (backup_state.stat.error) {
+ error_free(backup_state.stat.error);
+ backup_state.stat.error = NULL;
+ }
+
+ backup_state.stat.start_time = time(NULL);
+ backup_state.stat.end_time = 0;
+
+ if (backup_state.stat.backup_file) {
+ g_free(backup_state.stat.backup_file);
+ }
+ backup_state.stat.backup_file = g_strdup(backup_file);
+
+ uuid_copy(backup_state.stat.uuid, uuid);
+ uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
+ char *uuid_str = g_strdup(backup_state.stat.uuid_str);
+
+ backup_state.stat.total = total;
+ backup_state.stat.dirty = total - backup_state.stat.reused;
+ backup_state.stat.transferred = 0;
+ backup_state.stat.zero_bytes = 0;
+ backup_state.stat.finishing = false;
+ backup_state.stat.starting = true;
+
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ backup_state.speed = (has_speed && speed > 0) ? speed : 0;
+
+ backup_state.perf = (BackupPerf){ .max_workers = 16 };
+ if (has_max_workers) {
+ backup_state.perf.max_workers = max_workers;
+ }
+
+ backup_state.vmaw = vmaw;
+ backup_state.pbs = pbs;
+
+ backup_state.di_list = di_list;
+
+ uuid_info = g_malloc0(sizeof(*uuid_info));
+ uuid_info->UUID = uuid_str;
+
+ /* Run create_backup_jobs_bh outside of coroutine (in BH) but keep
+ * backup_mutex locked. This is fine, a CoMutex can be held across yield
+ * points, and we'll release it as soon as the BH reschedules us.
+ */
+ CoCtxData waker = {
+ .co = qemu_coroutine_self(),
+ .ctx = qemu_get_current_aio_context(),
+ .data = &local_err,
+ };
+ aio_bh_schedule_oneshot(waker.ctx, create_backup_jobs_bh, &waker);
+ qemu_coroutine_yield();
+
+ if (local_err) {
+ error_propagate(errp, local_err);
+ goto err;
+ }
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.starting = false;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ /* start the first job in the transaction */
+ job_txn_start_seq(backup_state.txn);
+
+ return uuid_info;
+
+err_mutex:
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+err:
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ if (di->target) {
+ bdrv_co_unref(di->target);
+ }
+
+ if (di->targetfile[0]) {
+ unlink(di->targetfile);
+ }
+ g_free(di);
+ }
+ g_list_free(di_list);
+ backup_state.di_list = NULL;
+
+ if (devs) {
+ g_strfreev(devs);
+ }
+
+ if (vmaw) {
+ Error *err = NULL;
+ vma_writer_close(vmaw, &err);
+ unlink(backup_file);
+ }
+
+ if (pbs) {
+ proxmox_backup_disconnect(pbs);
+ backup_state.pbs = NULL;
+ }
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return NULL;
+}
+
+BackupStatus *qmp_query_backup(Error **errp)
+{
+ BackupStatus *info = g_malloc0(sizeof(*info));
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+
+ if (!backup_state.stat.start_time) {
+ /* not started, return {} */
+ qemu_mutex_unlock(&backup_state.stat.lock);
+ return info;
+ }
+
+ info->has_start_time = true;
+ info->start_time = backup_state.stat.start_time;
+
+ if (backup_state.stat.backup_file) {
+ info->backup_file = g_strdup(backup_state.stat.backup_file);
+ }
+
+ info->uuid = g_strdup(backup_state.stat.uuid_str);
+
+ if (backup_state.stat.end_time) {
+ if (backup_state.stat.error) {
+ info->status = g_strdup("error");
+ info->errmsg = g_strdup(error_get_pretty(backup_state.stat.error));
+ } else {
+ info->status = g_strdup("done");
+ }
+ info->has_end_time = true;
+ info->end_time = backup_state.stat.end_time;
+ } else {
+ info->status = g_strdup("active");
+ }
+
+ info->has_total = true;
+ info->total = backup_state.stat.total;
+ info->has_dirty = true;
+ info->dirty = backup_state.stat.dirty;
+ info->has_zero_bytes = true;
+ info->zero_bytes = backup_state.stat.zero_bytes;
+ info->has_transferred = true;
+ info->transferred = backup_state.stat.transferred;
+ info->has_reused = true;
+ info->reused = backup_state.stat.reused;
+ info->finishing = backup_state.stat.finishing;
+
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ return info;
+}
+
+PBSBitmapInfoList *qmp_query_pbs_bitmap_info(Error **errp)
+{
+ PBSBitmapInfoList *head = NULL, **p_next = &head;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+
+ GList *l = backup_state.stat.bitmap_list;
+ while (l) {
+ PBSBitmapInfo *info = (PBSBitmapInfo *)l->data;
+ l = g_list_next(l);
+
+ /* clone bitmap info to avoid auto free after QMP marshalling */
+ PBSBitmapInfo *info_ret = g_malloc0(sizeof(*info_ret));
+ info_ret->drive = g_strdup(info->drive);
+ info_ret->action = info->action;
+ info_ret->size = info->size;
+ info_ret->dirty = info->dirty;
+
+ PBSBitmapInfoList *info_list = g_malloc0(sizeof(*info_list));
+ info_list->value = info_ret;
+
+ *p_next = info_list;
+ p_next = &info_list->next;
+ }
+
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ return head;
+}
+
+ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
+{
+ ProxmoxSupportStatus *ret = g_malloc0(sizeof(*ret));
+ ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
+ ret->pbs_dirty_bitmap = true;
+ ret->pbs_dirty_bitmap_savevm = true;
+ ret->query_bitmap_info = true;
+ ret->pbs_masterkey = true;
+ ret->backup_max_workers = true;
+ return ret;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index bb471c078d..1b8462a51b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -839,6 +839,235 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'],
'allow-preconfig': true }
+##
+# @BackupStatus:
+#
+# Detailed backup status.
+#
+# @status: string describing the current backup status.
+# This can be 'active', 'done', 'error'. If this field is not
+# returned, no backup process has been initiated
+#
+# @errmsg: error message (only returned if status is 'error')
+#
+# @total: total amount of bytes involved in the backup process
+#
+# @dirty: with incremental mode (PBS) this is the amount of bytes involved
+# in the backup process which are marked dirty.
+#
+# @transferred: amount of bytes already backed up.
+#
+# @reused: amount of bytes reused due to deduplication.
+#
+# @zero-bytes: amount of 'zero' bytes detected.
+#
+# @start-time: time (epoch) when backup job started.
+#
+# @end-time: time (epoch) when backup job finished.
+#
+# @backup-file: backup file name
+#
+# @uuid: uuid for this backup job
+#
+# @finishing: if status='active' and finishing=true, then the backup process is
+# waiting for the target to finish.
+#
+##
+{ 'struct': 'BackupStatus',
+ 'data': {'*status': 'str', '*errmsg': 'str', '*total': 'int', '*dirty': 'int',
+ '*transferred': 'int', '*zero-bytes': 'int', '*reused': 'int',
+ '*start-time': 'int', '*end-time': 'int',
+ '*backup-file': 'str', '*uuid': 'str', 'finishing': 'bool' } }
+
+##
+# @BackupFormat:
+#
+# An enumeration of supported backup formats.
+#
+# @vma: Proxmox vma backup format
+#
+# @pbs: Proxmox backup server format
+#
+##
+{ 'enum': 'BackupFormat',
+ 'data': [ 'vma', 'pbs' ] }
+
+##
+# @backup:
+#
+# Starts a VM backup.
+#
+# @backup-file: the backup file name
+#
+# @format: format of the backup file
+#
+# @config-file: a configuration file to include into
+# the backup archive.
+#
+# @speed: the maximum speed, in bytes per second
+#
+# @devlist: list of block device names (separated by ',', ';'
+# or ':'). By default the backup includes all writable block devices.
+#
+# @password: backup server password (required for format 'pbs')
+#
+# @keyfile: keyfile used for encryption (optional for format 'pbs')
+#
+# @key-password: password for keyfile (optional for format 'pbs')
+#
+# @master-keyfile: PEM-formatted master public keyfile (optional for format 'pbs')
+#
+# @fingerprint: server cert fingerprint (optional for format 'pbs')
+#
+# @backup-ns: backup namespace (required for format 'pbs')
+#
+# @backup-id: backup ID (required for format 'pbs')
+#
+# @backup-time: backup timestamp (Unix epoch, required for format 'pbs')
+#
+# @use-dirty-bitmap: use dirty bitmap to detect incremental changes since last job (optional for format 'pbs')
+#
+# @compress: use compression (optional for format 'pbs', defaults to true)
+#
+# @encrypt: use encryption (optional for format 'pbs', defaults to true if there is a keyfile)
+#
+# @max-workers: see @BackupPerf for details. Default 16.
+#
+# Returns: the uuid of the backup job
+#
+##
+{ 'command': 'backup', 'data': { 'backup-file': 'str',
+ '*password': 'str',
+ '*keyfile': 'str',
+ '*key-password': 'str',
+ '*master-keyfile': 'str',
+ '*fingerprint': 'str',
+ '*backup-ns': 'str',
+ '*backup-id': 'str',
+ '*backup-time': 'int',
+ '*use-dirty-bitmap': 'bool',
+ '*compress': 'bool',
+ '*encrypt': 'bool',
+ '*format': 'BackupFormat',
+ '*config-file': 'str',
+ '*firewall-file': 'str',
+ '*devlist': 'str',
+ '*speed': 'int',
+ '*max-workers': 'int' },
+ 'returns': 'UuidInfo', 'coroutine': true }
+
+##
+# @query-backup:
+#
+# Returns information about current/last backup task.
+#
+# Returns: @BackupStatus
+#
+##
+{ 'command': 'query-backup', 'returns': 'BackupStatus' }
+
+##
+# @backup-cancel:
+#
+# Cancel the current executing backup process.
+#
+# Returns: nothing on success
+#
+# Notes: This command succeeds even if there is no backup process running.
+#
+##
+{ 'command': 'backup-cancel', 'coroutine': true }
+
+##
+# @ProxmoxSupportStatus:
+#
+# Contains info about supported features added by Proxmox.
+#
+# @pbs-dirty-bitmap: True if dirty-bitmap-incremental backups to PBS are
+# supported.
+#
+# @query-bitmap-info: True if the 'query-pbs-bitmap-info' QMP call is supported.
+#
+# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
+# safely be set for savevm-async.
+#
+# @pbs-masterkey: True if the QMP backup call supports the 'master_keyfile'
+# parameter.
+#
+# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
+#
+##
+{ 'struct': 'ProxmoxSupportStatus',
+ 'data': { 'pbs-dirty-bitmap': 'bool',
+ 'query-bitmap-info': 'bool',
+ 'pbs-dirty-bitmap-savevm': 'bool',
+ 'pbs-masterkey': 'bool',
+ 'pbs-library-version': 'str',
+ 'backup-max-workers': 'bool' } }
+
+##
+# @query-proxmox-support:
+#
+# Returns information about supported features added by Proxmox.
+#
+# Returns: @ProxmoxSupportStatus
+#
+##
+{ 'command': 'query-proxmox-support', 'returns': 'ProxmoxSupportStatus' }
+
+##
+# @PBSBitmapAction:
+#
+# An action taken on a dirty-bitmap when a backup job was started.
+#
+# @not-used: Bitmap mode was not enabled.
+#
+# @not-used-removed: Bitmap mode was not enabled, but a bitmap from a
+# previous backup still existed and was removed.
+#
+# @new: A new bitmap was attached to the drive for this backup.
+#
+# @used: An existing bitmap will be used to only backup changed data.
+#
+# @invalid: A bitmap existed, but had to be cleared since its associated
+# base snapshot did not match the base given for the current job or
+# the crypt mode has changed.
+#
+##
+{ 'enum': 'PBSBitmapAction',
+ 'data': ['not-used', 'not-used-removed', 'new', 'used', 'invalid'] }
+
+##
+# @PBSBitmapInfo:
+#
+# Contains information about dirty bitmaps used for each drive in a PBS backup.
+#
+# @drive: The underlying drive.
+#
+# @action: The action that was taken when the backup started.
+#
+# @size: The total size of the drive.
+#
+# @dirty: How much of the drive is considered dirty and will be backed up,
+# or 'size' if everything will be.
+#
+##
+{ 'struct': 'PBSBitmapInfo',
+ 'data': { 'drive': 'str', 'action': 'PBSBitmapAction', 'size': 'int',
+ 'dirty': 'int' } }
+
+##
+# @query-pbs-bitmap-info:
+#
+# Returns information about dirty bitmaps used on the most recently started
+# backup. Returns nothing when the last backup was not using PBS or if no
+# backup occurred in this session.
+#
+# Returns: @PBSBitmapInfo
+#
+##
+{ 'command': 'query-pbs-bitmap-info', 'returns': ['PBSBitmapInfo'] }
+
##
# @BlockDeviceTimedStats:
#
diff --git a/qapi/common.json b/qapi/common.json
index 6fed9cde1a..630a2a8f9a 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -207,3 +207,17 @@
##
{ 'struct': 'HumanReadableText',
'data': { 'human-readable-text': 'str' } }
+
+##
+# @UuidInfo:
+#
+# Guest UUID information (Universally Unique Identifier).
+#
+# @UUID: the UUID of the guest
+#
+# Since: 0.14.0
+#
+# Notes: If no UUID was specified for the guest, a null UUID is
+# returned.
+##
+{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} }
diff --git a/qapi/machine.json b/qapi/machine.json
index 7da3c519ba..888457f810 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -4,6 +4,8 @@
# This work is licensed under the terms of the GNU GPL, version 2 or later.
# See the COPYING file in the top-level directory.
+{ 'include': 'common.json' }
+
##
# = Machines
##
@@ -230,20 +232,6 @@
##
{ 'command': 'query-target', 'returns': 'TargetInfo' }
-##
-# @UuidInfo:
-#
-# Guest UUID information (Universally Unique Identifier).
-#
-# @UUID: the UUID of the guest
-#
-# Since: 0.14
-#
-# Notes: If no UUID was specified for the guest, a null UUID is
-# returned.
-##
-{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} }
-
##
# @query-uuid:
#