Compare commits

5 Commits

Author SHA1 Message Date
Thomas Lamprecht
e0969989ac bump version to 9.2.0-5
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-04-04 16:16:02 +02:00
Thomas Lamprecht
b30e898392 PVE backup: backup-access api: simplify bitmap logic
See inner patch for details.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-04-04 16:14:24 +02:00
Thomas Lamprecht
053f68a7ac bump version to 9.2.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-04-03 17:00:39 +02:00
Thomas Lamprecht
cb662eb2c1 pve backup: implement features for external backup providers
This series of patches extends our existing QEMU backup integration
for PVE such that it can be used for an in-development external-backup
plugin API.

This includes, among other things, an API for backup access and
teardown as well as dirty-bitmap management. See the separate patches
for more details and check [v7] and [v8] of this series on the list.
The former is the last revision by Fiona, which was already very
advanced; Wolfgang picked that up and extended/adapted it slightly to
match needs that he found during implementation of a test provider and
from feedback of (future) external backup providers like Bareos and
Veritas. This was mainly done to get the QEMU build out for broad QA;
we can still change things here as long as the rest of the series is
not yet applied.

[v7]: https://lore.proxmox.com/pve-devel/20250401173435.221892-1-f.ebner@proxmox.com/
[v8]: https://lore.proxmox.com/pve-devel/20250403123118.264974-1-w.bumiller@proxmox.com/

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-04-03 16:48:42 +02:00
Wolfgang Bumiller
a6ee093f1c merge async snapshot improvements
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2025-04-02 17:23:05 +02:00
21 changed files with 2273 additions and 1288 deletions

View File

@@ -58,7 +58,7 @@ $(BUILDDIR): submodule
deb kvm: $(DEBS)
$(DEB_DBG): $(DEB)
$(DEB): $(BUILDDIR)
-cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j32
+cd $(BUILDDIR); dpkg-buildpackage -b -us -uc
lintian $(DEBS)
sbuild: $(DSC)

debian/changelog vendored (20 changed lines)
View File

@@ -1,8 +1,22 @@
-pve-qemu-kvm (9.2.0-3+vitastor1) bookworm; urgency=medium
+pve-qemu-kvm (9.2.0-5) bookworm; urgency=medium
-  * Add Vitastor support
+  * pve backup: backup-access api: simplify bitmap logic
- -- Vitaliy Filippov <vitalif@yourcmc.ru>  Wed, 02 Apr 2025 17:39:11 +0300
+ -- Proxmox Support Team <support@proxmox.com>  Fri, 04 Apr 2025 16:15:58 +0200
+
+pve-qemu-kvm (9.2.0-4) bookworm; urgency=medium
+
+  * various async snapshot improvements, including using a dedicated IO thread
+    for the state file when doing a live snapshot. That should reduce load on
+    the main thread and avoid it getting stuck on IO, i.e. the same benefits
+    as using a dedicated IO thread for regular drives. This is particularly
+    interesting when the VM state storage is a network storage like NFS. It
+    should also address #6262.
+
+  * pve backup: implement basic features and API in preparation for external
+    backup provider storage plugins.
+
+ -- Proxmox Support Team <support@proxmox.com>  Thu, 03 Apr 2025 17:00:34 +0200
pve-qemu-kvm (9.2.0-3) bookworm; urgency=medium

debian/control vendored (1 changed line)
View File

@@ -59,7 +59,6 @@ Depends: ceph-common (>= 0.48),
libspice-server1 (>= 0.14.0~),
libusb-1.0-0 (>= 1.0.17-1),
libusbredirparser1 (>= 0.6-2),
-vitastor-client (>= 0.9.4),
libuuid1,
${misc:Depends},
${shlibs:Depends},

File diff suppressed because it is too large

View File

@@ -0,0 +1,81 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Mon, 31 Mar 2025 16:55:02 +0200
Subject: [PATCH] savevm-async: improve setting state of snapshot operation in
savevm-end handler
One of the callers of wait_for_close_co() already sets the state to
SAVE_STATE_DONE before, but that is not fully correct, because at that
moment, the operation is not fully done. In particular, if closing the
target later fails, the state would even be set to SAVE_STATE_ERROR
afterwards. DONE -> ERROR is not a valid state transition. That said,
it should not matter in practice as long as the relevant QMP commands
are sequential.
The other caller does not set the state and so there seems to be a
race that could lead to the state not getting set at all. This is
because before this commit, the wait_for_close_co() function could
return early when there is no target file anymore. This can only
happen when canceling and needs to happen right around the time when
the snapshot is already finishing and closing the target.
Simply avoid the early return and always set the state within the
wait_for_close_co() function rather than at the call site.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/savevm-async.c | 33 +++++++++++++++------------------
1 file changed, 15 insertions(+), 18 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 1e79fce9ba..e63dc6d8a3 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -450,23 +450,22 @@ static void coroutine_fn wait_for_close_co(void *opaque)
{
int64_t timeout;
- if (!snap_state.target) {
- DPRINTF("savevm-end: no target file open\n");
- return;
- }
-
- /* wait until cleanup is done before returning, this ensures that after this
- * call exits the statefile will be closed and can be removed immediately */
- DPRINTF("savevm-end: waiting for cleanup\n");
- timeout = 30L * 1000 * 1000 * 1000;
- qemu_co_sleep_ns_wakeable(&snap_state.target_close_wait,
- QEMU_CLOCK_REALTIME, timeout);
if (snap_state.target) {
- save_snapshot_error("timeout waiting for target file close in "
- "qmp_savevm_end");
- /* we cannot assume the snapshot finished in this case, so leave the
- * state alone - caller has to figure something out */
- return;
+ /* wait until cleanup is done before returning, this ensures that after this
+ * call exits the statefile will be closed and can be removed immediately */
+ DPRINTF("savevm-end: waiting for cleanup\n");
+ timeout = 30L * 1000 * 1000 * 1000;
+ qemu_co_sleep_ns_wakeable(&snap_state.target_close_wait,
+ QEMU_CLOCK_REALTIME, timeout);
+ if (snap_state.target) {
+ save_snapshot_error("timeout waiting for target file close in "
+ "qmp_savevm_end");
+ /* we cannot assume the snapshot finished in this case, so leave the
+ * state alone - caller has to figure something out */
+ return;
+ }
+ } else {
+ DPRINTF("savevm-end: no target file open\n");
}
// File closed and no other error, so ensure next snapshot can be started.
@@ -497,8 +496,6 @@ void qmp_savevm_end(Error **errp)
snap_state.saved_vm_running = false;
}
- snap_state.state = SAVE_STATE_DONE;
-
qemu_coroutine_enter(wait_for_close);
}
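
For illustration, a minimal standalone sketch (not part of the patch) of the
transition rule described above; the enum values mirror the relevant real
states, the helper function is hypothetical:

    /* DONE is terminal: DONE -> ERROR must never happen. The patch enforces
     * this by setting the final state only in wait_for_close_co(), once the
     * target file is actually closed. */
    enum save_state { SAVE_STATE_ACTIVE, SAVE_STATE_ERROR, SAVE_STATE_DONE };

    static int transition_is_valid(enum save_state from, enum save_state to)
    {
        return !(from == SAVE_STATE_DONE && to == SAVE_STATE_ERROR);
    }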

View File

@@ -0,0 +1,71 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Mon, 31 Mar 2025 16:55:03 +0200
Subject: [PATCH] savevm-async: rename saved_vm_running to vm_needs_start
This is what the variable actually expresses. Otherwise, setting it
to false after starting the VM doesn't make sense.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/savevm-async.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index e63dc6d8a3..1e34c31e8b 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -51,7 +51,7 @@ static struct SnapshotState {
int state;
Error *error;
Error *blocker;
- int saved_vm_running;
+ int vm_needs_start;
QEMUFile *file;
int64_t total_time;
QEMUBH *finalize_bh;
@@ -224,9 +224,9 @@ static void process_savevm_finalize(void *opaque)
save_snapshot_error("process_savevm_cleanup: invalid state: %d",
snap_state.state);
}
- if (snap_state.saved_vm_running) {
+ if (snap_state.vm_needs_start) {
vm_start();
- snap_state.saved_vm_running = false;
+ snap_state.vm_needs_start = false;
}
DPRINTF("timing: process_savevm_finalize (full) took %ld ms\n",
@@ -352,7 +352,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
}
/* initialize snapshot info */
- snap_state.saved_vm_running = runstate_is_running();
+ snap_state.vm_needs_start = runstate_is_running();
snap_state.bs_pos = 0;
snap_state.total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
snap_state.blocker = NULL;
@@ -440,9 +440,9 @@ restart:
save_snapshot_error("setup failed");
- if (snap_state.saved_vm_running) {
+ if (snap_state.vm_needs_start) {
vm_start();
- snap_state.saved_vm_running = false;
+ snap_state.vm_needs_start = false;
}
}
@@ -491,9 +491,9 @@ void qmp_savevm_end(Error **errp)
return;
}
- if (snap_state.saved_vm_running) {
+ if (snap_state.vm_needs_start) {
vm_start();
- snap_state.saved_vm_running = false;
+ snap_state.vm_needs_start = false;
}
qemu_coroutine_enter(wait_for_close);

View File

@@ -0,0 +1,120 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Mon, 31 Mar 2025 16:55:04 +0200
Subject: [PATCH] savevm-async: improve runstate preservation, cleanup error
handling
Determine whether the VM needs to be started after finishing right
before actually stopping the VM, instead of at the beginning.
In qmp_savevm_start(), the only path stopping the VM returns right
afterwards, so there is no need for the vm_start() handling after
errors.
Lastly, improve the code style for checking whether migrate_init()
failed by explicitly comparing against 0.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
[WB: squashed error handling commits, rename goto branch instead of
inlining it]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
migration/savevm-async.c | 25 ++++++++++---------------
1 file changed, 10 insertions(+), 15 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 1e34c31e8b..d8d2c80475 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -178,6 +178,7 @@ static void process_savevm_finalize(void *opaque)
*/
blk_set_aio_context(snap_state.target, qemu_get_aio_context(), NULL);
+ snap_state.vm_needs_start = runstate_is_running();
ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
if (ret < 0) {
save_snapshot_error("vm_stop_force_state error %d", ret);
@@ -352,7 +353,6 @@ void qmp_savevm_start(const char *statefile, Error **errp)
}
/* initialize snapshot info */
- snap_state.vm_needs_start = runstate_is_running();
snap_state.bs_pos = 0;
snap_state.total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
snap_state.blocker = NULL;
@@ -364,13 +364,14 @@ void qmp_savevm_start(const char *statefile, Error **errp)
}
if (!statefile) {
+ snap_state.vm_needs_start = runstate_is_running();
vm_stop(RUN_STATE_SAVE_VM);
snap_state.state = SAVE_STATE_COMPLETED;
return;
}
if (qemu_savevm_state_blocked(errp)) {
- return;
+ goto fail;
}
/* Open the image */
@@ -380,12 +381,12 @@ void qmp_savevm_start(const char *statefile, Error **errp)
snap_state.target = blk_new_open(statefile, NULL, options, bdrv_oflags, &local_err);
if (!snap_state.target) {
error_setg(errp, "failed to open '%s'", statefile);
- goto restart;
+ goto fail;
}
target_bs = blk_bs(snap_state.target);
if (!target_bs) {
error_setg(errp, "failed to open '%s' - no block driver state", statefile);
- goto restart;
+ goto fail;
}
QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
@@ -394,7 +395,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
if (!snap_state.file) {
error_setg(errp, "failed to open '%s'", statefile);
- goto restart;
+ goto fail;
}
/*
@@ -402,8 +403,8 @@ void qmp_savevm_start(const char *statefile, Error **errp)
* State is cleared in process_savevm_co, but has to be initialized
* here (blocking main thread, from QMP) to avoid race conditions.
*/
- if (migrate_init(ms, errp)) {
- return;
+ if (migrate_init(ms, errp) != 0) {
+ goto fail;
}
memset(&mig_stats, 0, sizeof(mig_stats));
ms->to_dst_file = snap_state.file;
@@ -418,7 +419,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
if (ret != 0) {
error_setg_errno(errp, -ret, "savevm state setup failed: %s",
local_err ? error_get_pretty(local_err) : "unknown error");
- return;
+ goto fail;
}
/* Async processing from here on out happens in iohandler context, so let
@@ -436,14 +437,8 @@ void qmp_savevm_start(const char *statefile, Error **errp)
return;
-restart:
-
+fail:
save_snapshot_error("setup failed");
-
- if (snap_state.vm_needs_start) {
- vm_start();
- snap_state.vm_needs_start = false;
- }
}
static void coroutine_fn wait_for_close_co(void *opaque)

View File

@@ -0,0 +1,185 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Mon, 31 Mar 2025 16:55:06 +0200
Subject: [PATCH] savevm-async: use dedicated iothread for state file
Having the state file be in the iohandler context means that a
blk_drain_all() call in the main thread or vCPU thread that happens
while the snapshot is running will result in a deadlock.
For example, the main thread might be stuck in:
> 0 0x00007300ac9552d6 in __ppoll (fds=0x64bd5a411a50, nfds=2, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:42
> 1 0x000064bd51af3cad in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/poll2.h:64
> 2 0x000064bd51ad8799 in fdmon_poll_wait (ctx=0x64bd58d968a0, ready_list=0x7ffcfcc15558, timeout=-1) at ../util/fdmon-poll.c:79
> 3 0x000064bd51ad7c3d in aio_poll (ctx=0x64bd58d968a0, blocking=blocking@entry=true) at ../util/aio-posix.c:671
> 4 0x000064bd519a0b5d in bdrv_drain_all_begin () at ../block/io.c:531
> 5 bdrv_drain_all_begin () at ../block/io.c:510
> 6 0x000064bd519943c4 in blk_drain_all () at ../block/block-backend.c:2085
> 7 0x000064bd5160fc5a in virtio_scsi_dataplane_stop (vdev=0x64bd5a215190) at ../hw/scsi/virtio-scsi-dataplane.c:213
> 8 0x000064bd51664e90 in virtio_bus_stop_ioeventfd (bus=0x64bd5a215110) at ../hw/virtio/virtio-bus.c:259
> 9 0x000064bd5166511b in virtio_bus_stop_ioeventfd (bus=<optimized out>) at ../hw/virtio/virtio-bus.c:251
> 10 virtio_bus_reset (bus=<optimized out>) at ../hw/virtio/virtio-bus.c:107
> 11 0x000064bd51667431 in virtio_pci_reset (qdev=<optimized out>) at ../hw/virtio/virtio-pci.c:2296
...
> 34 0x000064bd517aa951 in pc_machine_reset (machine=<optimized out>, type=<optimized out>) at ../hw/i386/pc.c:1722
> 35 0x000064bd516aa4c4 in qemu_system_reset (reason=reason@entry=SHUTDOWN_CAUSE_GUEST_RESET) at ../system/runstate.c:525
> 36 0x000064bd516aaeb9 in main_loop_should_exit (status=<synthetic pointer>) at ../system/runstate.c:801
> 37 qemu_main_loop () at ../system/runstate.c:834
which is in block/io.c:
> /* Now poll the in-flight requests */
> AIO_WAIT_WHILE_UNLOCKED(NULL, bdrv_drain_all_poll());
The working theory is: The deadlock happens because the IO is issued
from the process_savevm_co() coroutine, which doesn't get scheduled
again to complete in-flight requests when the main thread is stuck
there polling. The main thread itself is the one that would need to
schedule it. In case of a vCPU triggering the VirtIO SCSI dataplane
stop, which happens during (Linux) boot, the vCPU thread will hold the
big QEMU lock (BQL) blocking the main thread from making progress
scheduling the process_savevm_co() coroutine.
This change should also help in general to reduce load on the main
thread and avoid it getting stuck on IO, i.e. the same benefits as
using a dedicated IO thread for regular drives. This is particularly
interesting when the VM state storage is a network storage like NFS.
With some luck, it could also help with bug #6262 [0]. The failure
there happens while issuing/right after the savevm-start QMP command,
so the most likely coroutine is the process_savevm_co() that was
previously scheduled to the iohandler context. Likely someone polls
the iohandler context and wants to enter the already scheduled
coroutine leading to the abort():
> qemu_aio_coroutine_enter: Co-routine was already scheduled in 'aio_co_schedule'
With a dedicated iothread, there hopefully is no such race.
The comment above querying the pending bytes wrongly talked about the
"iothread lock", when it should've been "iohandler lock". This was even
renamed to BQL (big QEMU lock) a few releases ago. Even if that was
not a typo to begin with, there are no AioContext locks anymore.
[0]: https://bugzilla.proxmox.com/show_bug.cgi?id=6262
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
[WB: update to the changed error handling in the previous commit]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
migration/savevm-async.c | 42 ++++++++++++++++++++++++++++------------
1 file changed, 30 insertions(+), 12 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index d8d2c80475..11ea4c601d 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -25,6 +25,7 @@
#include "qemu/main-loop.h"
#include "qemu/rcu.h"
#include "qemu/yank.h"
+#include "sysemu/iothread.h"
/* #define DEBUG_SAVEVM_STATE */
@@ -57,6 +58,7 @@ static struct SnapshotState {
QEMUBH *finalize_bh;
Coroutine *co;
QemuCoSleep target_close_wait;
+ IOThread *iothread;
} snap_state;
static bool savevm_aborted(void)
@@ -256,16 +258,13 @@ static void coroutine_fn process_savevm_co(void *opaque)
uint64_t threshold = 400 * 1000;
/*
- * pending_{estimate,exact} are expected to be called without iothread
- * lock. Similar to what is done in migration.c, call the exact variant
- * only once pend_precopy in the estimate is below the threshold.
+ * Similar to what is done in migration.c, call the exact variant only
+ * once pend_precopy in the estimate is below the threshold.
*/
- bql_unlock();
qemu_savevm_state_pending_estimate(&pend_precopy, &pend_postcopy);
if (pend_precopy <= threshold) {
qemu_savevm_state_pending_exact(&pend_precopy, &pend_postcopy);
}
- bql_lock();
pending_size = pend_precopy + pend_postcopy;
/*
@@ -332,11 +331,17 @@ static void coroutine_fn process_savevm_co(void *opaque)
qemu_bh_schedule(snap_state.finalize_bh);
}
+static void savevm_cleanup_iothread(void) {
+ if (snap_state.iothread) {
+ iothread_destroy(snap_state.iothread);
+ snap_state.iothread = NULL;
+ }
+}
+
void qmp_savevm_start(const char *statefile, Error **errp)
{
Error *local_err = NULL;
MigrationState *ms = migrate_get_current();
- AioContext *iohandler_ctx = iohandler_get_aio_context();
BlockDriverState *target_bs = NULL;
int ret = 0;
@@ -374,6 +379,19 @@ void qmp_savevm_start(const char *statefile, Error **errp)
goto fail;
}
+ if (snap_state.iothread) {
+ /* This is not expected, so warn about it, but no point in re-creating a new iothread. */
+ warn_report("iothread for snapshot already exists - re-using");
+ } else {
+ snap_state.iothread =
+ iothread_create("__proxmox_savevm_async_iothread__", &local_err);
+ if (!snap_state.iothread) {
+ error_setg(errp, "creating iothread failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ goto fail;
+ }
+ }
+
/* Open the image */
QDict *options = NULL;
options = qdict_new();
@@ -422,22 +440,20 @@ void qmp_savevm_start(const char *statefile, Error **errp)
goto fail;
}
- /* Async processing from here on out happens in iohandler context, so let
- * the target bdrv have its home there.
- */
- ret = blk_set_aio_context(snap_state.target, iohandler_ctx, &local_err);
+ ret = blk_set_aio_context(snap_state.target, snap_state.iothread->ctx, &local_err);
if (ret != 0) {
- warn_report("failed to set iohandler context for VM state target: %s %s",
+ warn_report("failed to set iothread context for VM state target: %s %s",
local_err ? error_get_pretty(local_err) : "unknown error",
strerror(-ret));
}
snap_state.co = qemu_coroutine_create(&process_savevm_co, NULL);
- aio_co_schedule(iohandler_ctx, snap_state.co);
+ aio_co_schedule(snap_state.iothread->ctx, snap_state.co);
return;
fail:
+ savevm_cleanup_iothread();
save_snapshot_error("setup failed");
}
@@ -463,6 +479,8 @@ static void coroutine_fn wait_for_close_co(void *opaque)
DPRINTF("savevm-end: no target file open\n");
}
+ savevm_cleanup_iothread();
+
// File closed and no other error, so ensure next snapshot can be started.
if (snap_state.state != SAVE_STATE_ERROR) {
snap_state.state = SAVE_STATE_DONE;
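
Distilled from the diff above, a hedged sketch of the new lifecycle; error
handling and the snap_state bookkeeping are omitted, but every call is one
the patch itself adds:

    #include "sysemu/iothread.h"

    /* 1. dedicated iothread instead of the shared iohandler context */
    IOThread *iot = iothread_create("__proxmox_savevm_async_iothread__", &local_err);

    /* 2. move the VM state target into that thread's AioContext */
    blk_set_aio_context(snap_state.target, iot->ctx, &local_err);

    /* 3. run the save coroutine there, decoupled from main-loop polling */
    Coroutine *co = qemu_coroutine_create(&process_savevm_co, NULL);
    aio_co_schedule(iot->ctx, co);

    /* 4. tear the thread down again in wait_for_close_co() or on setup failure */
    iothread_destroy(iot);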

View File

@@ -0,0 +1,33 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Mon, 31 Mar 2025 16:55:07 +0200
Subject: [PATCH] savevm-async: treat failure to set iothread context as a hard
failure
This is not expected to ever fail and there might be assumptions about
having the expected context down the line.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
[WB: update to changed error handling]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
migration/savevm-async.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 11ea4c601d..f2b10b5519 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -442,9 +442,9 @@ void qmp_savevm_start(const char *statefile, Error **errp)
ret = blk_set_aio_context(snap_state.target, snap_state.iothread->ctx, &local_err);
if (ret != 0) {
- warn_report("failed to set iothread context for VM state target: %s %s",
- local_err ? error_get_pretty(local_err) : "unknown error",
- strerror(-ret));
+ error_setg_errno(errp, -ret, "failed to set iothread context for VM state target: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ goto fail;
}
snap_state.co = qemu_coroutine_create(&process_savevm_co, NULL);

View File

@@ -0,0 +1,41 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:41 +0200
Subject: [PATCH] PVE backup: clean up directly in setup_snapshot_access() when
it fails
The only thing that might need to be cleaned up after
setup_snapshot_access() failed is dropping the cbw filter. Do so in
the single branch it matters inside setup_snapshot_access() itself.
This avoids the need for callers of setup_snapshot_access() to call
cleanup_snapshot_access() when the call fails.
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/pve-backup.c b/pve-backup.c
index 32352fb5ec..2408f182bc 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -576,6 +576,9 @@ static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
di->fleecing.snapshot_access =
bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
if (!di->fleecing.snapshot_access) {
+ bdrv_cbw_drop(di->fleecing.cbw);
+ di->fleecing.cbw = NULL;
+
error_setg(errp, "setting up snapshot access for fleecing failed: %s",
local_err ? error_get_pretty(local_err) : "unknown error");
return -1;
@@ -629,7 +632,6 @@ static void create_backup_jobs_bh(void *opaque) {
error_setg(errp, "%s - setting up snapshot access for fleecing failed: %s",
di->device_name,
local_err ? error_get_pretty(local_err) : "unknown error");
- cleanup_snapshot_access(di);
bdrv_drained_end(di->bs);
break;
}
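
The ownership rule this establishes, as a hedged standalone sketch (all names
hypothetical): a setup function that fails must undo its own partial work, so
callers never pair a failed setup with a cleanup call:

    static int setup(struct ctx *c, Error **errp)
    {
        c->filter = create_filter(c);      /* earlier setup step succeeded */
        c->access = create_access(c);
        if (!c->access) {
            drop_filter(c->filter);        /* undo our own partial work... */
            c->filter = NULL;
            error_setg(errp, "setting up access failed");
            return -1;                     /* ...so the caller needs no cleanup */
        }
        return 0;
    }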

View File

@@ -0,0 +1,59 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:42 +0200
Subject: [PATCH] PVE backup: factor out helper to clear backup state's bitmap
list
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 2408f182bc..915649b5f9 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -811,6 +811,23 @@ err:
return di_list;
}
+/*
+ * To be called with the backup_state.stat mutex held.
+ */
+static void clear_backup_state_bitmap_list(void) {
+
+ if (backup_state.stat.bitmap_list) {
+ GList *bl = backup_state.stat.bitmap_list;
+ while (bl) {
+ g_free(((PBSBitmapInfo *)bl->data)->drive);
+ g_free(bl->data);
+ bl = g_list_next(bl);
+ }
+ g_list_free(backup_state.stat.bitmap_list);
+ backup_state.stat.bitmap_list = NULL;
+ }
+}
+
UuidInfo coroutine_fn *qmp_backup(
const char *backup_file,
const char *password,
@@ -898,16 +915,7 @@ UuidInfo coroutine_fn *qmp_backup(
backup_state.stat.reused = 0;
/* clear previous backup's bitmap_list */
- if (backup_state.stat.bitmap_list) {
- GList *bl = backup_state.stat.bitmap_list;
- while (bl) {
- g_free(((PBSBitmapInfo *)bl->data)->drive);
- g_free(bl->data);
- bl = g_list_next(bl);
- }
- g_list_free(backup_state.stat.bitmap_list);
- backup_state.stat.bitmap_list = NULL;
- }
+ clear_backup_state_bitmap_list();
if (format == BACKUP_FORMAT_PBS) {
if (!password) {

View File

@@ -0,0 +1,95 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:43 +0200
Subject: [PATCH] PVE backup: factor out helper to initialize backup state stat
struct
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 62 ++++++++++++++++++++++++++++++++--------------------
1 file changed, 38 insertions(+), 24 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 915649b5f9..88a981f81c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -828,6 +828,43 @@ static void clear_backup_state_bitmap_list(void) {
}
}
+/*
+ * Initializes most of the backup state 'stat' struct. Note that 'reused' and
+ * 'bitmap_list' are not changed by this function and need to be handled by
+ * the caller. In particular, 'reused' needs to be set before calling this
+ * function.
+ *
+ * To be called with the backup_state.stat mutex held.
+ */
+static void initialize_backup_state_stat(
+ const char *backup_file,
+ uuid_t uuid,
+ size_t total)
+{
+ if (backup_state.stat.error) {
+ error_free(backup_state.stat.error);
+ backup_state.stat.error = NULL;
+ }
+
+ backup_state.stat.start_time = time(NULL);
+ backup_state.stat.end_time = 0;
+
+ if (backup_state.stat.backup_file) {
+ g_free(backup_state.stat.backup_file);
+ }
+ backup_state.stat.backup_file = g_strdup(backup_file);
+
+ uuid_copy(backup_state.stat.uuid, uuid);
+ uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
+
+ backup_state.stat.total = total;
+ backup_state.stat.dirty = total - backup_state.stat.reused;
+ backup_state.stat.transferred = 0;
+ backup_state.stat.zero_bytes = 0;
+ backup_state.stat.finishing = false;
+ backup_state.stat.starting = true;
+}
+
UuidInfo coroutine_fn *qmp_backup(
const char *backup_file,
const char *password,
@@ -1070,32 +1107,9 @@ UuidInfo coroutine_fn *qmp_backup(
}
}
/* initialize global backup_state now */
- /* note: 'reused' and 'bitmap_list' are initialized earlier */
-
- if (backup_state.stat.error) {
- error_free(backup_state.stat.error);
- backup_state.stat.error = NULL;
- }
-
- backup_state.stat.start_time = time(NULL);
- backup_state.stat.end_time = 0;
-
- if (backup_state.stat.backup_file) {
- g_free(backup_state.stat.backup_file);
- }
- backup_state.stat.backup_file = g_strdup(backup_file);
-
- uuid_copy(backup_state.stat.uuid, uuid);
- uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
+ initialize_backup_state_stat(backup_file, uuid, total);
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
- backup_state.stat.total = total;
- backup_state.stat.dirty = total - backup_state.stat.reused;
- backup_state.stat.transferred = 0;
- backup_state.stat.zero_bytes = 0;
- backup_state.stat.finishing = false;
- backup_state.stat.starting = true;
-
qemu_mutex_unlock(&backup_state.stat.lock);
backup_state.speed = (has_speed && speed > 0) ? speed : 0;
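
As the new helper's comment notes, 'reused' and 'bitmap_list' stay the
caller's responsibility. A hedged sketch of the resulting call order,
following the qmp_backup() flow in the diff:

    /* sketch of the caller contract, derived from qmp_backup() above */
    qemu_mutex_lock(&backup_state.stat.lock);
    backup_state.stat.reused = 0;        /* must be set before the helper runs */
    clear_backup_state_bitmap_list();    /* drop the previous backup's list */
    initialize_backup_state_stat(backup_file, uuid, total);
    qemu_mutex_unlock(&backup_state.stat.lock);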

View File

@@ -0,0 +1,63 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:44 +0200
Subject: [PATCH] PVE backup: add target ID in backup state
In preparation for allowing multiple backup providers and potentially
multiple targets for a given provider. Each backup target can then
have its own dirty bitmap and there can be additional checks that the
current backup state is actually associated to the expected target.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/pve-backup.c b/pve-backup.c
index 88a981f81c..8789a0667a 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -70,6 +70,7 @@ static struct PVEBackupState {
JobTxn *txn;
CoMutex backup_mutex;
CoMutex dump_callback_mutex;
+ char *target_id;
} backup_state;
static void pvebackup_init(void)
@@ -865,6 +866,16 @@ static void initialize_backup_state_stat(
backup_state.stat.starting = true;
}
+/*
+ * To be called with the backup_state mutex held.
+ */
+static void backup_state_set_target_id(const char *target_id) {
+ if (backup_state.target_id) {
+ g_free(backup_state.target_id);
+ }
+ backup_state.target_id = g_strdup(target_id);
+}
+
UuidInfo coroutine_fn *qmp_backup(
const char *backup_file,
const char *password,
@@ -904,7 +915,7 @@ UuidInfo coroutine_fn *qmp_backup(
if (backup_state.di_list) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR,
- "previous backup not finished");
+ "previous backup for target '%s' not finished", backup_state.target_id);
qemu_co_mutex_unlock(&backup_state.backup_mutex);
return NULL;
}
@@ -1122,6 +1133,8 @@ UuidInfo coroutine_fn *qmp_backup(
backup_state.vmaw = vmaw;
backup_state.pbs = pbs;
+ backup_state_set_target_id("Proxmox");
+
backup_state.di_list = di_list;
uuid_info = g_malloc0(sizeof(*uuid_info));

View File

@@ -0,0 +1,57 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:45 +0200
Subject: [PATCH] PVE backup: get device info: allow caller to specify filter
for which devices use fleecing
For providing snapshot-access to external backup providers, EFI and
TPM also need an associated fleecing image. The new caller will thus
need a different filter.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 8789a0667a..755f1abcf1 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -719,7 +719,7 @@ static void create_backup_jobs_bh(void *opaque) {
/*
* EFI disk and TPM state are small and it's just not worth setting up fleecing for them.
*/
-static bool device_uses_fleecing(const char *device_id)
+static bool fleecing_no_efi_tpm(const char *device_id)
{
return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14);
}
@@ -731,7 +731,7 @@ static bool device_uses_fleecing(const char *device_id)
*/
static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
const char *devlist,
- bool fleecing,
+ bool (*device_uses_fleecing)(const char*),
Error **errp)
{
gchar **devs = NULL;
@@ -757,7 +757,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
di->bs = bs;
di->device_name = g_strdup(bdrv_get_device_name(bs));
- if (fleecing && device_uses_fleecing(*d)) {
+ if (device_uses_fleecing && device_uses_fleecing(*d)) {
g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
if (!fleecing_blk) {
@@ -924,7 +924,8 @@ UuidInfo coroutine_fn *qmp_backup(
format = has_format ? format : BACKUP_FORMAT_VMA;
bdrv_graph_co_rdlock();
- di_list = get_device_info(devlist, has_fleecing && fleecing, &local_err);
+ di_list = get_device_info(devlist, (has_fleecing && fleecing) ? fleecing_no_efi_tpm : NULL,
+ &local_err);
bdrv_graph_co_rdunlock();
if (local_err) {
error_propagate(errp, local_err);
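
Since the filter is now an arbitrary predicate, a caller that needs fleecing
for every device (as the backup-access patches below do) can simply pass its
own function. A hedged sketch with a hypothetical filter name:

    /* hypothetical filter: fleecing for all devices, EFI/TPM included */
    static bool fleecing_for_all(const char *device_id)
    {
        return true;
    }

    /* regular backup keeps the EFI/TPM exclusion ... */
    di_list = get_device_info(devlist, fleecing_no_efi_tpm, &local_err);
    /* ... while an external-provider caller opts everything in */
    di_list = get_device_info(devlist, fleecing_for_all, &local_err);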

View File

@@ -0,0 +1,495 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:46 +0200
Subject: [PATCH] PVE backup: implement backup access setup and teardown API
for external providers
For external backup providers, the state of the VM's disk images at
the time the backup is started is preserved via a snapshot-access
block node. Old data is moved to the fleecing image when new guest
writes come in. The snapshot-access block node, as well as the
associated bitmap in case of incremental backup, will be exported via
NBD to the external provider. The NBD export will be done by the
management layer; the missing functionality is setting up and tearing
down the snapshot-access block nodes, which this patch adds.
It is necessary to also set up fleecing for EFI and TPM disks, so that
old data can be moved out of the way when a new guest write comes in.
There can only be one regular backup or one active backup access at
a time, because both require replacing the original block node of the
drive. Thus the backup state is re-used, and checks are added to
prohibit regular backup while snapshot access is active and vice
versa.
The block nodes added by the backup-access-setup QMP call are not
tracked anywhere else (there is no job they are associated to like for
regular backup). This requires adding a callback for teardown when
QEMU exits, i.e. in qemu_cleanup(). Otherwise, there will be an
assertion failure that the block graph is not empty when QEMU exits
before the backup-access-teardown QMP command is called.
The code for qmp_backup_access_setup() was based on the existing
qmp_backup() routine.
The return value for the setup QMP command contains information about
the snapshot-access block nodes that can be used by the management
layer to set up the NBD exports.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 264 ++++++++++++++++++++++++++++++++++++++++++-
pve-backup.h | 16 +++
qapi/block-core.json | 49 ++++++++
system/runstate.c | 6 +
4 files changed, 329 insertions(+), 6 deletions(-)
create mode 100644 pve-backup.h
diff --git a/pve-backup.c b/pve-backup.c
index 755f1abcf1..091b5bd231 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1,4 +1,5 @@
#include "proxmox-backup-client.h"
+#include "pve-backup.h"
#include "vma.h"
#include "qemu/osdep.h"
@@ -588,6 +589,36 @@ static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
return 0;
}
+static void setup_all_snapshot_access_bh(void *opaque)
+{
+ assert(!qemu_in_coroutine());
+
+ CoCtxData *data = (CoCtxData*)opaque;
+ Error **errp = (Error**)data->data;
+
+ Error *local_err = NULL;
+
+ GList *l = backup_state.di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ bdrv_drained_begin(di->bs);
+
+ if (setup_snapshot_access(di, &local_err) < 0) {
+ bdrv_drained_end(di->bs);
+ error_setg(errp, "%s - setting up snapshot access failed: %s", di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ break;
+ }
+
+ bdrv_drained_end(di->bs);
+ }
+
+ /* return */
+ aio_co_enter(data->ctx, data->co);
+}
+
/*
* backup_job_create can *not* be run from a coroutine, so this can't either.
* The caller is responsible that backup_mutex is held nonetheless.
@@ -724,6 +755,11 @@ static bool fleecing_no_efi_tpm(const char *device_id)
return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14);
}
+static bool fleecing_all(const char *device_id)
+{
+ return true;
+}
+
/*
* Returns a list of device infos, which needs to be freed by the caller. In
* case of an error, errp will be set, but the returned value might still be a
@@ -839,8 +875,9 @@ static void clear_backup_state_bitmap_list(void) {
*/
static void initialize_backup_state_stat(
const char *backup_file,
- uuid_t uuid,
- size_t total)
+ uuid_t *uuid,
+ size_t total,
+ bool starting)
{
if (backup_state.stat.error) {
error_free(backup_state.stat.error);
@@ -855,15 +892,19 @@ static void initialize_backup_state_stat(
}
backup_state.stat.backup_file = g_strdup(backup_file);
- uuid_copy(backup_state.stat.uuid, uuid);
- uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
+ if (uuid) {
+ uuid_copy(backup_state.stat.uuid, *uuid);
+ uuid_unparse_lower(*uuid, backup_state.stat.uuid_str);
+ } else {
+ backup_state.stat.uuid_str[0] = '\0';
+ }
backup_state.stat.total = total;
backup_state.stat.dirty = total - backup_state.stat.reused;
backup_state.stat.transferred = 0;
backup_state.stat.zero_bytes = 0;
backup_state.stat.finishing = false;
- backup_state.stat.starting = true;
+ backup_state.stat.starting = starting;
}
/*
@@ -876,6 +917,216 @@ static void backup_state_set_target_id(const char *target_id) {
backup_state.target_id = g_strdup(target_id);
}
+BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
+ const char *target_id,
+ const char *devlist,
+ Error **errp)
+{
+ assert(qemu_in_coroutine());
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ Error *local_err = NULL;
+ GList *di_list = NULL;
+ GList *l;
+
+ if (backup_state.di_list) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "previous backup for target '%s' not finished", backup_state.target_id);
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return NULL;
+ }
+
+ bdrv_graph_co_rdlock();
+ di_list = get_device_info(devlist, fleecing_all, &local_err);
+ bdrv_graph_co_rdunlock();
+ if (local_err) {
+ error_propagate(errp, local_err);
+ goto err;
+ }
+ assert(di_list);
+
+ size_t total = 0;
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ ssize_t size = bdrv_getlength(di->bs);
+ if (size < 0) {
+ error_setg_errno(errp, -size, "bdrv_getlength failed");
+ goto err;
+ }
+ di->size = size;
+ total += size;
+
+ di->completed_ret = INT_MAX;
+ }
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.reused = 0;
+
+ /* clear previous backup's bitmap_list */
+ clear_backup_state_bitmap_list();
+
+ /* starting=false, because there is no associated QEMU job */
+ initialize_backup_state_stat(NULL, NULL, total, false);
+
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ backup_state_set_target_id(target_id);
+
+ backup_state.vmaw = NULL;
+ backup_state.pbs = NULL;
+
+ backup_state.di_list = di_list;
+
+ /* Run setup_all_snapshot_access_bh outside of coroutine (in BH) but keep
+ * backup_mutex locked. This is fine, a CoMutex can be held across yield
+ * points, and we'll release it as soon as the BH reschedules us.
+ */
+ CoCtxData waker = {
+ .co = qemu_coroutine_self(),
+ .ctx = qemu_get_current_aio_context(),
+ .data = &local_err,
+ };
+ aio_bh_schedule_oneshot(waker.ctx, setup_all_snapshot_access_bh, &waker);
+ qemu_coroutine_yield();
+
+ if (local_err) {
+ error_propagate(errp, local_err);
+ goto err;
+ }
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+
+ BackupAccessInfoList *bai_head = NULL, **p_bai_next = &bai_head;
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ BackupAccessInfoList *info = g_malloc0(sizeof(*info));
+ info->value = g_malloc0(sizeof(*info->value));
+ info->value->node_name = g_strdup(bdrv_get_node_name(di->fleecing.snapshot_access));
+ info->value->device = g_strdup(di->device_name);
+ info->value->size = di->size;
+
+ *p_bai_next = info;
+ p_bai_next = &info->next;
+ }
+
+ return bai_head;
+
+err:
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ g_free(di->device_name);
+ di->device_name = NULL;
+
+ g_free(di);
+ }
+ g_list_free(di_list);
+ backup_state.di_list = NULL;
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return NULL;
+}
+
+/*
+ * Caller needs to hold the backup mutex or the BQL.
+ */
+void backup_access_teardown(void)
+{
+ GList *l = backup_state.di_list;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.finishing = true;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ if (di->fleecing.snapshot_access) {
+ bdrv_unref(di->fleecing.snapshot_access);
+ di->fleecing.snapshot_access = NULL;
+ }
+ if (di->fleecing.cbw) {
+ bdrv_cbw_drop(di->fleecing.cbw);
+ di->fleecing.cbw = NULL;
+ }
+
+ g_free(di->device_name);
+ di->device_name = NULL;
+
+ g_free(di);
+ }
+ g_list_free(backup_state.di_list);
+ backup_state.di_list = NULL;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.end_time = time(NULL);
+ backup_state.stat.finishing = false;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+}
+
+// Not done in a coroutine, because bdrv_co_unref() and cbw_drop() would just spawn BHs anyways.
+// Caller needs to hold the backup_state.backup_mutex lock
+static void backup_access_teardown_bh(void *opaque)
+{
+ CoCtxData *data = (CoCtxData*)opaque;
+
+ backup_access_teardown();
+
+ /* return */
+ aio_co_enter(data->ctx, data->co);
+}
+
+void coroutine_fn qmp_backup_access_teardown(const char *target_id, Error **errp)
+{
+ assert(qemu_in_coroutine());
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ if (!backup_state.target_id) { // nothing to do
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return;
+ }
+
+ /*
+ * Continue with target_id == NULL, used by the callback registered for qemu_cleanup()
+ */
+ if (target_id && strcmp(backup_state.target_id, target_id)) {
+ error_setg(errp, "cannot teardown backup access - got target %s instead of %s",
+ target_id, backup_state.target_id);
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return;
+ }
+
+ if (!strcmp(backup_state.target_id, "Proxmox VE")) {
+ error_setg(errp, "cannot teardown backup access for PVE - use backup-cancel instead");
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return;
+ }
+
+ CoCtxData waker = {
+ .co = qemu_coroutine_self(),
+ .ctx = qemu_get_current_aio_context(),
+ };
+ aio_bh_schedule_oneshot(waker.ctx, backup_access_teardown_bh, &waker);
+ qemu_coroutine_yield();
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return;
+}
+
UuidInfo coroutine_fn *qmp_backup(
const char *backup_file,
const char *password,
@@ -1119,7 +1370,7 @@ UuidInfo coroutine_fn *qmp_backup(
}
}
/* initialize global backup_state now */
- initialize_backup_state_stat(backup_file, uuid, total);
+ initialize_backup_state_stat(backup_file, &uuid, total, true);
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -1298,5 +1549,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_masterkey = true;
ret->backup_max_workers = true;
ret->backup_fleecing = true;
+ ret->backup_access_api = true;
return ret;
}
diff --git a/pve-backup.h b/pve-backup.h
new file mode 100644
index 0000000000..4033bc848f
--- /dev/null
+++ b/pve-backup.h
@@ -0,0 +1,16 @@
+/*
+ * Backup code used by Proxmox VE
+ *
+ * Copyright (C) Proxmox Server Solutions
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef PVE_BACKUP_H
+#define PVE_BACKUP_H
+
+void backup_access_teardown(void);
+
+#endif /* PVE_BACKUP_H */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index c581f1f238..3f092221ce 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1019,6 +1019,9 @@
#
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
+# @backup-access-api: Whether backup access API for external providers is
+# supported or not.
+#
# @backup-fleecing: Whether backup fleecing is supported or not.
#
# @backup-max-workers: Whether the 'max-workers' @BackupPerf setting is
@@ -1032,6 +1035,7 @@
'pbs-dirty-bitmap-migration': 'bool',
'pbs-masterkey': 'bool',
'pbs-library-version': 'str',
+ 'backup-access-api': 'bool',
'backup-fleecing': 'bool',
'backup-max-workers': 'bool' } }
@@ -1098,6 +1102,51 @@
##
{ 'command': 'query-pbs-bitmap-info', 'returns': ['PBSBitmapInfo'] }
+##
+# @BackupAccessInfo:
+#
+# Info associated to a snapshot access for backup. For more information about
+# the bitmap see @BackupAccessBitmapMode.
+#
+# @node-name: the block node name of the snapshot-access node.
+#
+# @device: the device on top of which the snapshot access was created.
+#
+# @size: the size of the block device in bytes.
+#
+##
+{ 'struct': 'BackupAccessInfo',
+ 'data': { 'node-name': 'str', 'device': 'str', 'size': 'size' } }
+
+##
+# @backup-access-setup:
+#
+# Set up snapshot access to VM drives for an external backup provider. No other
+# backup or backup access can be done before tearing down the backup access.
+#
+# @target-id: the unique ID of the backup target.
+#
+# @devlist: list of block device names (separated by ',', ';' or ':'). By
+# default the backup includes all writable block devices.
+#
+# Returns: a list of @BackupAccessInfo, one for each device.
+#
+##
+{ 'command': 'backup-access-setup',
+ 'data': { 'target-id': 'str', '*devlist': 'str' },
+ 'returns': [ 'BackupAccessInfo' ], 'coroutine': true }
+
+##
+# @backup-access-teardown:
+#
+# Tear down previously set up snapshot access for the same target.
+#
+# @target-id: the ID of the backup target.
+#
+##
+{ 'command': 'backup-access-teardown', 'data': { 'target-id': 'str' },
+ 'coroutine': true }
+
##
# @BlockDeviceTimedStats:
#
diff --git a/system/runstate.c b/system/runstate.c
index c2c9afa905..6f93d7c2fb 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -60,6 +60,7 @@
#include "sysemu/sysemu.h"
#include "sysemu/tpm.h"
#include "trace.h"
+#include "pve-backup.h"
static NotifierList exit_notifiers =
NOTIFIER_LIST_INITIALIZER(exit_notifiers);
@@ -920,6 +921,11 @@ void qemu_cleanup(int status)
* requests happening from here on anyway.
*/
bdrv_drain_all_begin();
+ /*
+ * The backup access is set up by a QMP command, but is neither owned by a monitor nor
+ * associated to a BlockBackend. Need to tear it down manually here.
+ */
+ backup_access_teardown();
job_cancel_sync_all();
bdrv_close_all();

View File

@@ -0,0 +1,122 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:47 +0200
Subject: [PATCH] PVE backup: factor out get_single_device_info() helper
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
[WB: free di and di->device_name on error]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 90 +++++++++++++++++++++++++++++++---------------------
1 file changed, 53 insertions(+), 37 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 091b5bd231..8b7414f057 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -760,6 +760,57 @@ static bool fleecing_all(const char *device_id)
return true;
}
+static PVEBackupDevInfo coroutine_fn GRAPH_RDLOCK *get_single_device_info(
+ const char *device,
+ bool (*device_uses_fleecing)(const char*),
+ Error **errp)
+{
+ BlockBackend *blk = blk_by_name(device);
+ if (!blk) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", device);
+ return NULL;
+ }
+ BlockDriverState *bs = blk_bs(blk);
+ if (!bdrv_co_is_inserted(bs)) {
+ error_setg(errp, "Device '%s' has no medium", device);
+ return NULL;
+ }
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di->device_name = g_strdup(bdrv_get_device_name(bs));
+
+ if (device_uses_fleecing && device_uses_fleecing(device)) {
+ g_autofree gchar *fleecing_devid = g_strconcat(device, "-fleecing", NULL);
+ BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
+ if (!fleecing_blk) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", fleecing_devid);
+ goto fail;
+ }
+ BlockDriverState *fleecing_bs = blk_bs(fleecing_blk);
+ if (!bdrv_co_is_inserted(fleecing_bs)) {
+ error_setg(errp, "Device '%s' has no medium", fleecing_devid);
+ goto fail;
+ }
+ /*
+ * Fleecing image needs to be the same size to act as a cbw target.
+ */
+ if (bs->total_sectors != fleecing_bs->total_sectors) {
+ error_setg(errp, "Size mismatch for '%s' - sector count %ld != %ld",
+ fleecing_devid, fleecing_bs->total_sectors, bs->total_sectors);
+ goto fail;
+ }
+ di->fleecing.bs = fleecing_bs;
+ }
+
+ return di;
+fail:
+ g_free(di->device_name);
+ g_free(di);
+ return NULL;
+}
+
/*
* Returns a list of device infos, which needs to be freed by the caller. In
* case of an error, errp will be set, but the returned value might still be a
@@ -778,45 +829,10 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
gchar **d = devs;
while (d && *d) {
- BlockBackend *blk = blk_by_name(*d);
- if (!blk) {
- error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
- "Device '%s' not found", *d);
+ PVEBackupDevInfo *di = get_single_device_info(*d, device_uses_fleecing, errp);
+ if (!di) {
goto err;
}
- BlockDriverState *bs = blk_bs(blk);
- if (!bdrv_co_is_inserted(bs)) {
- error_setg(errp, "Device '%s' has no medium", *d);
- goto err;
- }
- PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
- di->bs = bs;
- di->device_name = g_strdup(bdrv_get_device_name(bs));
-
- if (device_uses_fleecing && device_uses_fleecing(*d)) {
- g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
- BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
- if (!fleecing_blk) {
- error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
- "Device '%s' not found", fleecing_devid);
- goto err;
- }
- BlockDriverState *fleecing_bs = blk_bs(fleecing_blk);
- if (!bdrv_co_is_inserted(fleecing_bs)) {
- error_setg(errp, "Device '%s' has no medium", fleecing_devid);
- goto err;
- }
- /*
- * Fleecing image needs to be the same size to act as a cbw target.
- */
- if (bs->total_sectors != fleecing_bs->total_sectors) {
- error_setg(errp, "Size mismatch for '%s' - sector count %ld != %ld",
- fleecing_devid, fleecing_bs->total_sectors, bs->total_sectors);
- goto err;
- }
- di->fleecing.bs = fleecing_bs;
- }
-
di_list = g_list_append(di_list, di);
d++;
}

View File

@@ -0,0 +1,470 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:48 +0200
Subject: [PATCH] PVE backup: implement bitmap support for external backup
access
There can be one dirty bitmap for each backup target ID for each
device (which are tracked in the backup_access_bitmaps hash table).
The QMP user can specify the ID of the bitmap it would like to use. This ID
is then compared to the current one for the given target and device.
If they match, the bitmap is re-used (should it still exist on the
drive, otherwise re-created). If there is a mismatch, the old bitmap
is removed and a new one is created.
The return value of the QMP command includes information about what
bitmap action was taken, similar to what the query-backup QMP command
returns for regular backup. It also includes the bitmap name and
associated block node, so the management layer can then set up an NBD
export with the bitmap.
While the backup access is active, a background bitmap is also
required. This is necessary to implement bitmap handling according to
the original reference [0]. In particular:
- in the error case, new writes since the backup access was set up are
in the background bitmap. Because of failure, the previously tracked
writes from the backup access bitmap are still required too. Thus,
the bitmap is merged with the background bitmap to get all new
writes since the last backup.
- in the success case, continue tracking for the next incremental
backup in the backup access bitmap. New writes since the backup
access was set up are in the background bitmap. Because the backup
was successful, clear the backup access bitmap and merge back the
background bitmap to get only the new writes.
Since QEMU cannot know if the backup was successful or not (except if
failure already happens during the setup QMP command), the management
layer needs to tell it via the teardown QMP command.
The bitmap action is also recorded in the device info now.
[0]: https://lore.kernel.org/qemu-devel/b68833dd-8864-4d72-7c61-c134a9835036@ya.ru/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 196 +++++++++++++++++++++++++++++++++++++++++--
pve-backup.h | 2 +-
qapi/block-core.json | 36 ++++++--
system/runstate.c | 2 +-
4 files changed, 220 insertions(+), 16 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 8b7414f057..0490d1f421 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -15,6 +15,7 @@
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qerror.h"
#include "qemu/cutils.h"
+#include "qemu/error-report.h"
#if defined(CONFIG_MALLOC_TRIM)
#include <malloc.h>
@@ -41,6 +42,7 @@
*/
const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
+const char *BACKGROUND_BITMAP_NAME = "backup-access-background-bitmap";
static struct PVEBackupState {
struct {
@@ -72,6 +74,7 @@ static struct PVEBackupState {
CoMutex backup_mutex;
CoMutex dump_callback_mutex;
char *target_id;
+ GHashTable *backup_access_bitmaps; // key=target_id, value=bitmap_name
} backup_state;
static void pvebackup_init(void)
@@ -99,8 +102,11 @@ typedef struct PVEBackupDevInfo {
char* device_name;
int completed_ret; // INT_MAX if not completed
BdrvDirtyBitmap *bitmap;
+ BdrvDirtyBitmap *background_bitmap; // used for external backup access
+ PBSBitmapAction bitmap_action;
BlockDriverState *target;
BlockJob *job;
+ char *requested_bitmap_name; // used by external backup access during initialization
} PVEBackupDevInfo;
static void pvebackup_propagate_error(Error *err)
@@ -362,6 +368,67 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
+/*
+ * New writes since the backup access was set up are in the background bitmap. Because of failure,
+ * the previously tracked writes in di->bitmap are still required too. Thus, merge with the
+ * background bitmap to get all new writes since the last backup.
+ */
+static void handle_backup_access_bitmaps_in_error_case(PVEBackupDevInfo *di)
+{
+ Error *local_err = NULL;
+
+ if (di->bs && di->background_bitmap) {
+ bdrv_drained_begin(di->bs);
+ if (di->bitmap) {
+ bdrv_enable_dirty_bitmap(di->bitmap);
+ if (!bdrv_merge_dirty_bitmap(di->bitmap, di->background_bitmap, NULL, &local_err)) {
+ warn_report("backup access: %s - could not merge bitmaps in error path - %s",
+ di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ /*
+ * Could not merge, drop original bitmap too.
+ */
+ bdrv_release_dirty_bitmap(di->bitmap);
+ }
+ } else {
+ warn_report("backup access: %s - expected bitmap not present", di->device_name);
+ }
+ bdrv_release_dirty_bitmap(di->background_bitmap);
+ bdrv_drained_end(di->bs);
+ }
+}
+
+/*
+ * Continue tracking for next incremental backup in di->bitmap. New writes since the backup access
+ * was set up are in the background bitmap. Because the backup was successful, clear di->bitmap and
+ * merge back the background bitmap to get only the new writes.
+ */
+static void handle_backup_access_bitmaps_after_success(PVEBackupDevInfo *di)
+{
+ Error *local_err = NULL;
+
+ if (di->bs && di->background_bitmap) {
+ bdrv_drained_begin(di->bs);
+ if (di->bitmap) {
+ bdrv_enable_dirty_bitmap(di->bitmap);
+ bdrv_clear_dirty_bitmap(di->bitmap, NULL);
+ if (!bdrv_merge_dirty_bitmap(di->bitmap, di->background_bitmap, NULL, &local_err)) {
+ warn_report("backup access: %s - could not merge bitmaps after backup - %s",
+ di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ /*
+ * Could not merge, drop original bitmap too.
+ */
+ bdrv_release_dirty_bitmap(di->bitmap);
+ }
+ } else {
+ warn_report("backup access: %s - expected bitmap not present", di->device_name);
+ }
+ bdrv_release_dirty_bitmap(di->background_bitmap);
+ bdrv_drained_end(di->bs);
+ }
+}
+
static void cleanup_snapshot_access(PVEBackupDevInfo *di)
{
if (di->fleecing.snapshot_access) {
@@ -605,6 +672,21 @@ static void setup_all_snapshot_access_bh(void *opaque)
bdrv_drained_begin(di->bs);
+ if (di->bitmap) {
+ BdrvDirtyBitmap *background_bitmap =
+ bdrv_create_dirty_bitmap(di->bs, PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE,
+ BACKGROUND_BITMAP_NAME, &local_err);
+ if (!background_bitmap) {
+ error_setg(errp, "%s - creating background bitmap for backup access failed: %s",
+ di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ bdrv_drained_end(di->bs);
+ break;
+ }
+ di->background_bitmap = background_bitmap;
+ bdrv_disable_dirty_bitmap(di->bitmap);
+ }
+
if (setup_snapshot_access(di, &local_err) < 0) {
bdrv_drained_end(di->bs);
error_setg(errp, "%s - setting up snapshot access failed: %s", di->device_name,
@@ -935,7 +1017,7 @@ static void backup_state_set_target_id(const char *target_id) {
BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
const char *target_id,
- const char *devlist,
+ BackupAccessSourceDeviceList *devices,
Error **errp)
{
assert(qemu_in_coroutine());
@@ -954,12 +1036,17 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
}
bdrv_graph_co_rdlock();
- di_list = get_device_info(devlist, fleecing_all, &local_err);
- bdrv_graph_co_rdunlock();
- if (local_err) {
- error_propagate(errp, local_err);
- goto err;
+ for (BackupAccessSourceDeviceList *it = devices; it; it = it->next) {
+ PVEBackupDevInfo *di = get_single_device_info(it->value->device, fleecing_all, &local_err);
+ if (!di) {
+ bdrv_graph_co_rdunlock();
+ error_propagate(errp, local_err);
+ goto err;
+ }
+ di->requested_bitmap_name = g_strdup(it->value->bitmap_name);
+ di_list = g_list_append(di_list, di);
}
+ bdrv_graph_co_rdunlock();
assert(di_list);
size_t total = 0;
@@ -986,6 +1073,78 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
/* clear previous backup's bitmap_list */
clear_backup_state_bitmap_list();
+ if (!backup_state.backup_access_bitmaps) {
+ backup_state.backup_access_bitmaps =
+ g_hash_table_new_full(g_str_hash, g_str_equal, free, free);
+ }
+
+ /* create bitmaps if requested */
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ di->block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE;
+
+ PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
+ size_t dirty = di->size;
+
+ const char *old_bitmap_name =
+ (const char*)g_hash_table_lookup(backup_state.backup_access_bitmaps, target_id);
+
+ bool same_bitmap_name = old_bitmap_name
+ && di->requested_bitmap_name
+ && strcmp(di->requested_bitmap_name, old_bitmap_name) == 0;
+
+ if (old_bitmap_name && !same_bitmap_name) {
+ BdrvDirtyBitmap *old_bitmap = bdrv_find_dirty_bitmap(di->bs, old_bitmap_name);
+ if (!old_bitmap) {
+ warn_report("setup backup access: expected old bitmap '%s' not found for drive "
+ "'%s'", old_bitmap_name, di->device_name);
+ } else {
+ g_hash_table_remove(backup_state.backup_access_bitmaps, target_id);
+ bdrv_release_dirty_bitmap(old_bitmap);
+ action = PBS_BITMAP_ACTION_NOT_USED_REMOVED;
+ }
+ }
+
+ BdrvDirtyBitmap *bitmap = NULL;
+ if (di->requested_bitmap_name) {
+ bitmap = bdrv_find_dirty_bitmap(di->bs, di->requested_bitmap_name);
+ if (!bitmap) {
+ bitmap = bdrv_create_dirty_bitmap(di->bs, PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE,
+ di->requested_bitmap_name, errp);
+ if (!bitmap) {
+ qemu_mutex_unlock(&backup_state.stat.lock);
+ goto err;
+ }
+ bdrv_set_dirty_bitmap(bitmap, 0, di->size);
+ action = PBS_BITMAP_ACTION_NEW;
+ } else {
+ /* track clean chunks as reused */
+ dirty = MIN(bdrv_get_dirty_count(bitmap), di->size);
+ backup_state.stat.reused += di->size - dirty;
+ action = PBS_BITMAP_ACTION_USED;
+ }
+
+ if (!same_bitmap_name) {
+ g_hash_table_insert(backup_state.backup_access_bitmaps,
+ strdup(target_id), strdup(di->requested_bitmap_name));
+ }
+
+ }
+
+ PBSBitmapInfo *info = g_malloc(sizeof(*info));
+ info->drive = g_strdup(di->device_name);
+ info->action = action;
+ info->size = di->size;
+ info->dirty = dirty;
+ backup_state.stat.bitmap_list = g_list_append(backup_state.stat.bitmap_list, info);
+
+ di->bitmap = bitmap;
+ di->bitmap_action = action;
+ }
+
/* starting=false, because there is no associated QEMU job */
initialize_backup_state_stat(NULL, NULL, total, false);
@@ -1029,6 +1188,12 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
info->value->node_name = g_strdup(bdrv_get_node_name(di->fleecing.snapshot_access));
info->value->device = g_strdup(di->device_name);
info->value->size = di->size;
+ if (di->requested_bitmap_name) {
+ info->value->bitmap_node_name = g_strdup(bdrv_get_node_name(di->bs));
+ info->value->bitmap_name = g_strdup(di->requested_bitmap_name);
+ info->value->bitmap_action = di->bitmap_action;
+ info->value->has_bitmap_action = true;
+ }
*p_bai_next = info;
p_bai_next = &info->next;
@@ -1043,6 +1208,8 @@ err:
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
+ handle_backup_access_bitmaps_in_error_case(di);
+
g_free(di->device_name);
di->device_name = NULL;
@@ -1058,7 +1225,7 @@ err:
/*
* Caller needs to hold the backup mutex or the BQL.
*/
-void backup_access_teardown(void)
+void backup_access_teardown(bool success)
{
GList *l = backup_state.di_list;
@@ -1079,9 +1246,18 @@ void backup_access_teardown(void)
di->fleecing.cbw = NULL;
}
+ if (success) {
+ handle_backup_access_bitmaps_after_success(di);
+ } else {
+ handle_backup_access_bitmaps_in_error_case(di);
+ }
+
g_free(di->device_name);
di->device_name = NULL;
+ g_free(di->requested_bitmap_name);
+ di->requested_bitmap_name = NULL;
+
g_free(di);
}
g_list_free(backup_state.di_list);
@@ -1099,13 +1275,13 @@ static void backup_access_teardown_bh(void *opaque)
{
CoCtxData *data = (CoCtxData*)opaque;
- backup_access_teardown();
+ backup_access_teardown(*((bool*)data->data));
/* return */
aio_co_enter(data->ctx, data->co);
}
-void coroutine_fn qmp_backup_access_teardown(const char *target_id, Error **errp)
+void coroutine_fn qmp_backup_access_teardown(const char *target_id, bool success, Error **errp)
{
assert(qemu_in_coroutine());
@@ -1135,6 +1311,7 @@ void coroutine_fn qmp_backup_access_teardown(const char *target_id, Error **errp
CoCtxData waker = {
.co = qemu_coroutine_self(),
.ctx = qemu_get_current_aio_context(),
+ .data = &success,
};
aio_bh_schedule_oneshot(waker.ctx, backup_access_teardown_bh, &waker);
qemu_coroutine_yield();
@@ -1335,6 +1512,7 @@ UuidInfo coroutine_fn *qmp_backup(
}
di->dev_id = dev_id;
+ di->bitmap_action = action;
PBSBitmapInfo *info = g_malloc(sizeof(*info));
info->drive = g_strdup(di->device_name);
diff --git a/pve-backup.h b/pve-backup.h
index 4033bc848f..9ebeef7c8f 100644
--- a/pve-backup.h
+++ b/pve-backup.h
@@ -11,6 +11,6 @@
#ifndef PVE_BACKUP_H
#define PVE_BACKUP_H
-void backup_access_teardown(void);
+void backup_access_teardown(bool success);
#endif /* PVE_BACKUP_H */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3f092221ce..873db3f276 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1114,9 +1114,33 @@
#
# @size: the size of the block device in bytes.
#
+# @bitmap-node-name: the block node name the dirty bitmap is associated to.
+#
+# @bitmap-name: the name of the dirty bitmap associated to the backup access.
+#
+# @bitmap-action: the action taken on the dirty bitmap.
+#
##
{ 'struct': 'BackupAccessInfo',
- 'data': { 'node-name': 'str', 'device': 'str', 'size': 'size' } }
+ 'data': { 'node-name': 'str', 'device': 'str', 'size': 'size',
+ '*bitmap-node-name': 'str', '*bitmap-name': 'str',
+ '*bitmap-action': 'PBSBitmapAction' } }
+
+##
+# @BackupAccessSourceDevice:
+#
+# Source block device information for creating a backup access.
+#
+# @device: the block device name.
+#
+# @bitmap-name: use/create a bitmap with this name for the device. Re-using the
+# same name allows for making incremental backups. Check the @bitmap-action
+# in the result to see if you can actually re-use the bitmap or if it had to
+# be newly created.
+#
+##
+{ 'struct': 'BackupAccessSourceDevice',
+ 'data': { 'device': 'str', '*bitmap-name': 'str' } }
##
# @backup-access-setup:
@@ -1126,14 +1150,13 @@
#
# @target-id: the unique ID of the backup target.
#
-# @devlist: list of block device names (separated by ',', ';' or ':'). By
-# default the backup includes all writable block devices.
+# @devices: list of devices for which to create the backup access.
#
# Returns: a list of @BackupAccessInfo, one for each device.
#
##
{ 'command': 'backup-access-setup',
- 'data': { 'target-id': 'str', '*devlist': 'str' },
+ 'data': { 'target-id': 'str', 'devices': [ 'BackupAccessSourceDevice' ] },
'returns': [ 'BackupAccessInfo' ], 'coroutine': true }
##
@@ -1143,8 +1166,11 @@
#
# @target-id: the ID of the backup target.
#
+# @success: whether the backup done by the external provider was successful.
+#
##
-{ 'command': 'backup-access-teardown', 'data': { 'target-id': 'str' },
+{ 'command': 'backup-access-teardown',
+ 'data': { 'target-id': 'str', 'success': 'bool' },
'coroutine': true }
##
diff --git a/system/runstate.c b/system/runstate.c
index 6f93d7c2fb..ef3277930f 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -925,7 +925,7 @@ void qemu_cleanup(int status)
* The backup access is set up by a QMP command, but is neither owned by a monitor nor
* associated to a BlockBackend. Need to tear it down manually here.
*/
- backup_access_teardown();
+ backup_access_teardown(false);
job_cancel_sync_all();
bdrv_close_all();

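For reference, the QMP flow enabled by the patch above might look like the following. This is a minimal sketch; the target ID "ext0", the drive name "drive-scsi0" and the returned node names are all illustrative, and later patches in this series rework the bitmap parameters:

-> { "execute": "backup-access-setup",
     "arguments": { "target-id": "ext0",
                    "devices": [ { "device": "drive-scsi0", "bitmap-name": "ext0" } ] } }
<- { "return": [ { "node-name": "snap-access-0", "device": "drive-scsi0",
                   "size": 34359738368, "bitmap-node-name": "drive-scsi0-node",
                   "bitmap-name": "ext0", "bitmap-action": "new" } ] }

After the external provider has read the data through the returned snapshot-access node, the access is torn down; the success flag decides whether the bitmap state is kept for the next incremental run:

-> { "execute": "backup-access-teardown",
     "arguments": { "target-id": "ext0", "success": true } }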

@@ -0,0 +1,57 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:49 +0200
Subject: [PATCH] PVE backup: backup-access api: indicate situation where a
bitmap was recreated
The backup-access API keeps track of which bitmap names were used for
which devices and thus knows when a bitmap went missing. Propagate
this information to the QMP user with a new 'missing-recreated'
variant for the bitmap action taken.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 6 +++++-
qapi/block-core.json | 9 ++++++++-
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 0490d1f421..8909842292 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1119,7 +1119,11 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
goto err;
}
bdrv_set_dirty_bitmap(bitmap, 0, di->size);
- action = PBS_BITMAP_ACTION_NEW;
+ if (same_bitmap_name) {
+ action = PBS_BITMAP_ACTION_MISSING_RECREATED;
+ } else {
+ action = PBS_BITMAP_ACTION_NEW;
+ }
} else {
/* track clean chunks as reused */
dirty = MIN(bdrv_get_dirty_count(bitmap), di->size);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 873db3f276..58586170d9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1067,9 +1067,16 @@
# base snapshot did not match the base given for the current job or
# the crypt mode has changed.
#
+# @missing-recreated: A bitmap for incremental backup was expected to be
+# present, but was missing and thus got recreated. For example, this can
+# happen if the drive was re-attached or if the bitmap was deleted for some
+# other reason. PBS does not currently keep track of this; the backup-access
+# mechanism does.
+#
##
{ 'enum': 'PBSBitmapAction',
- 'data': ['not-used', 'not-used-removed', 'new', 'used', 'invalid'] }
+ 'data': ['not-used', 'not-used-removed', 'new', 'used', 'invalid',
+ 'missing-recreated'] }
##
# @PBSBitmapInfo:

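Assuming the bitmap from a previous run was deleted (for example, after the drive was detached and re-attached), a subsequent setup result might now report the new action; all values here are illustrative:

<- { "return": [ { "node-name": "snap-access-0", "device": "drive-scsi0",
                   "size": 34359738368, "bitmap-node-name": "drive-scsi0-node",
                   "bitmap-name": "ext0", "bitmap-action": "missing-recreated" } ] }

A caller seeing "missing-recreated" knows it must do a full backup for this device, even though it asked for the previously used bitmap name.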

@@ -0,0 +1,84 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:50 +0200
Subject: [PATCH] PVE backup: backup-access-api: explicit bitmap-mode parameter
This allows explicitly requesting that a bitmap be re-created under
the same name.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[TL: fix trailing whitespace error]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
pve-backup.c | 17 ++++++++++++++++-
qapi/block-core.json | 20 +++++++++++++++++++-
2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 8909842292..18bcf29533 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1043,7 +1043,16 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
error_propagate(errp, local_err);
goto err;
}
- di->requested_bitmap_name = g_strdup(it->value->bitmap_name);
+ if (it->value->bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NONE) {
+ di->bitmap_action = PBS_BITMAP_ACTION_NOT_USED;
+ } else {
+ di->requested_bitmap_name = g_strdup(it->value->bitmap_name);
+ if (it->value->bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NEW) {
+ di->bitmap_action = PBS_BITMAP_ACTION_NEW;
+ } else if (it->value->bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_USE) {
+ di->bitmap_action = PBS_BITMAP_ACTION_USED;
+ }
+ }
di_list = g_list_append(di_list, di);
}
bdrv_graph_co_rdunlock();
@@ -1096,6 +1105,12 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
&& di->requested_bitmap_name
&& strcmp(di->requested_bitmap_name, old_bitmap_name) == 0;
+ /* special case: if we explicitly requested a *new* bitmap, treat an
+ * existing bitmap as having a different name */
+ if (di->bitmap_action == PBS_BITMAP_ACTION_NEW) {
+ same_bitmap_name = false;
+ }
+
if (old_bitmap_name && !same_bitmap_name) {
BdrvDirtyBitmap *old_bitmap = bdrv_find_dirty_bitmap(di->bs, old_bitmap_name);
if (!old_bitmap) {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 58586170d9..09beb3217c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1145,9 +1145,27 @@
# in the result to see if you can actually re-use the bitmap or if it had to
# be newly created.
#
+# @bitmap-mode: used to control whether the bitmap should be reused or
+# recreated.
+#
##
{ 'struct': 'BackupAccessSourceDevice',
- 'data': { 'device': 'str', '*bitmap-name': 'str' } }
+ 'data': { 'device': 'str', '*bitmap-name': 'str',
+ '*bitmap-mode': 'BackupAccessSetupBitmapMode' } }
+
+##
+# @BackupAccessSetupBitmapMode:
+#
+# How to setup a bitmap for a device for @backup-access-setup.
+#
+# @none: do not use a bitmap. Removes an existing bitmap if present.
+#
+# @new: create and use a new bitmap.
+#
+# @use: try to re-use an existing bitmap. Create a new one if it doesn't exist.
+##
+{ 'enum': 'BackupAccessSetupBitmapMode',
+ 'data': ['none', 'new', 'use' ] }
##
# @backup-access-setup:

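With the explicit bitmap-mode parameter, a single setup request can now state its intent per device; a sketch with illustrative target and drive names:

-> { "execute": "backup-access-setup",
     "arguments": { "target-id": "ext0",
                    "devices": [
                      { "device": "drive-scsi0", "bitmap-name": "ext0", "bitmap-mode": "use" },
                      { "device": "drive-scsi1", "bitmap-name": "ext0", "bitmap-mode": "new" },
                      { "device": "drive-scsi2", "bitmap-mode": "none" } ] } }

Here "use" attempts an incremental backup, "new" forces a fresh bitmap (and thus a full backup) even if one with the same name already exists, and "none" removes any existing bitmap and tracks nothing.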

@@ -0,0 +1,206 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 4 Apr 2025 15:31:36 +0200
Subject: [PATCH] PVE backup: backup-access api: simplify bitmap logic
Currently, only one bitmap name per target is planned to be used.
Simply use the target ID itself as the bitmap name. This allows
simplifying the logic quite a bit, and there is no longer any need for
the backup_access_bitmaps hash table.
For the return value, the bitmap names are still passed along for the
caller's convenience.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 72 ++++++++++++--------------------------------
qapi/block-core.json | 15 ++++-----
2 files changed, 26 insertions(+), 61 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 18bcf29533..0ea0343b22 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -74,7 +74,6 @@ static struct PVEBackupState {
CoMutex backup_mutex;
CoMutex dump_callback_mutex;
char *target_id;
- GHashTable *backup_access_bitmaps; // key=target_id, value=bitmap_name
} backup_state;
static void pvebackup_init(void)
@@ -106,7 +105,7 @@ typedef struct PVEBackupDevInfo {
PBSBitmapAction bitmap_action;
BlockDriverState *target;
BlockJob *job;
- char *requested_bitmap_name; // used by external backup access during initialization
+ BackupAccessSetupBitmapMode requested_bitmap_mode;
} PVEBackupDevInfo;
static void pvebackup_propagate_error(Error *err)
@@ -1043,16 +1042,7 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
error_propagate(errp, local_err);
goto err;
}
- if (it->value->bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NONE) {
- di->bitmap_action = PBS_BITMAP_ACTION_NOT_USED;
- } else {
- di->requested_bitmap_name = g_strdup(it->value->bitmap_name);
- if (it->value->bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NEW) {
- di->bitmap_action = PBS_BITMAP_ACTION_NEW;
- } else if (it->value->bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_USE) {
- di->bitmap_action = PBS_BITMAP_ACTION_USED;
- }
- }
+ di->requested_bitmap_mode = it->value->bitmap_mode;
di_list = g_list_append(di_list, di);
}
bdrv_graph_co_rdunlock();
@@ -1082,10 +1072,7 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
/* clear previous backup's bitmap_list */
clear_backup_state_bitmap_list();
- if (!backup_state.backup_access_bitmaps) {
- backup_state.backup_access_bitmaps =
- g_hash_table_new_full(g_str_hash, g_str_equal, free, free);
- }
+ const char *bitmap_name = target_id;
/* create bitmaps if requested */
l = di_list;
@@ -1098,59 +1085,43 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
size_t dirty = di->size;
- const char *old_bitmap_name =
- (const char*)g_hash_table_lookup(backup_state.backup_access_bitmaps, target_id);
-
- bool same_bitmap_name = old_bitmap_name
- && di->requested_bitmap_name
- && strcmp(di->requested_bitmap_name, old_bitmap_name) == 0;
-
- /* special case: if we explicitly requested a *new* bitmap, treat an
- * existing bitmap as having a different name */
- if (di->bitmap_action == PBS_BITMAP_ACTION_NEW) {
- same_bitmap_name = false;
- }
-
- if (old_bitmap_name && !same_bitmap_name) {
- BdrvDirtyBitmap *old_bitmap = bdrv_find_dirty_bitmap(di->bs, old_bitmap_name);
- if (!old_bitmap) {
- warn_report("setup backup access: expected old bitmap '%s' not found for drive "
- "'%s'", old_bitmap_name, di->device_name);
- } else {
- g_hash_table_remove(backup_state.backup_access_bitmaps, target_id);
+ if (di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NONE ||
+ di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NEW) {
+ BdrvDirtyBitmap *old_bitmap = bdrv_find_dirty_bitmap(di->bs, bitmap_name);
+ if (old_bitmap) {
bdrv_release_dirty_bitmap(old_bitmap);
- action = PBS_BITMAP_ACTION_NOT_USED_REMOVED;
+ action = PBS_BITMAP_ACTION_NOT_USED_REMOVED; // set below for new
}
}
BdrvDirtyBitmap *bitmap = NULL;
- if (di->requested_bitmap_name) {
- bitmap = bdrv_find_dirty_bitmap(di->bs, di->requested_bitmap_name);
+ if (di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NEW ||
+ di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_USE) {
+ bitmap = bdrv_find_dirty_bitmap(di->bs, bitmap_name);
if (!bitmap) {
bitmap = bdrv_create_dirty_bitmap(di->bs, PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE,
- di->requested_bitmap_name, errp);
+ bitmap_name, errp);
if (!bitmap) {
qemu_mutex_unlock(&backup_state.stat.lock);
goto err;
}
bdrv_set_dirty_bitmap(bitmap, 0, di->size);
- if (same_bitmap_name) {
+ if (di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_USE) {
action = PBS_BITMAP_ACTION_MISSING_RECREATED;
} else {
action = PBS_BITMAP_ACTION_NEW;
}
} else {
+ if (di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NEW) {
+ qemu_mutex_unlock(&backup_state.stat.lock);
+ error_setg(errp, "internal error - removed old bitmap still present");
+ goto err;
+ }
/* track clean chunks as reused */
dirty = MIN(bdrv_get_dirty_count(bitmap), di->size);
backup_state.stat.reused += di->size - dirty;
action = PBS_BITMAP_ACTION_USED;
}
-
- if (!same_bitmap_name) {
- g_hash_table_insert(backup_state.backup_access_bitmaps,
- strdup(target_id), strdup(di->requested_bitmap_name));
- }
-
}
PBSBitmapInfo *info = g_malloc(sizeof(*info));
@@ -1207,9 +1178,9 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
info->value->node_name = g_strdup(bdrv_get_node_name(di->fleecing.snapshot_access));
info->value->device = g_strdup(di->device_name);
info->value->size = di->size;
- if (di->requested_bitmap_name) {
+ if (di->bitmap) {
info->value->bitmap_node_name = g_strdup(bdrv_get_node_name(di->bs));
- info->value->bitmap_name = g_strdup(di->requested_bitmap_name);
+ info->value->bitmap_name = g_strdup(bitmap_name);
info->value->bitmap_action = di->bitmap_action;
info->value->has_bitmap_action = true;
}
@@ -1274,9 +1245,6 @@ void backup_access_teardown(bool success)
g_free(di->device_name);
di->device_name = NULL;
- g_free(di->requested_bitmap_name);
- di->requested_bitmap_name = NULL;
-
g_free(di);
}
g_list_free(backup_state.di_list);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 09beb3217c..02c043f0f7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1140,18 +1140,12 @@
#
# @device: the block device name.
#
-# @bitmap-name: use/create a bitmap with this name for the device. Re-using the
-# same name allows for making incremental backups. Check the @bitmap-action
-# in the result to see if you can actually re-use the bitmap or if it had to
-# be newly created.
-#
# @bitmap-mode: used to control whether the bitmap should be reused or
-# recreated.
+# recreated or not used. Default is not using a bitmap.
#
##
{ 'struct': 'BackupAccessSourceDevice',
- 'data': { 'device': 'str', '*bitmap-name': 'str',
- '*bitmap-mode': 'BackupAccessSetupBitmapMode' } }
+ 'data': { 'device': 'str', '*bitmap-mode': 'BackupAccessSetupBitmapMode' } }
##
# @BackupAccessSetupBitmapMode:
@@ -1175,7 +1169,10 @@
#
# @target-id: the unique ID of the backup target.
#
-# @devices: list of devices for which to create the backup access.
+# @devices: list of devices for which to create the backup access. Also
+# controls whether to use/create a bitmap for the device. Check the
+# @bitmap-action in the result to see what action was actually taken for the
+# bitmap. Each target controls its own bitmaps.
#
# Returns: a list of @BackupAccessInfo, one for each device.
#

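After this simplification the request no longer carries a bitmap name at all; the target ID doubles as the bitmap name. The final call shape thus reduces to the following sketch (names illustrative):

-> { "execute": "backup-access-setup",
     "arguments": { "target-id": "ext0",
                    "devices": [ { "device": "drive-scsi0", "bitmap-mode": "use" } ] } }

On a successful incremental run the result then reports "bitmap-name": "ext0" (i.e. the target ID) together with "bitmap-action": "used".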
debian/patches/series

@@ -72,4 +72,19 @@ pve/0054-Revert-hpet-place-read-only-bits-directly-in-new_val.patch
pve/0055-Revert-hpet-remove-unnecessary-variable-index.patch
pve/0056-Revert-hpet-ignore-high-bits-of-comparator-in-32-bit.patch
pve/0057-Revert-hpet-fix-and-cleanup-persistence-of-interrupt.patch
pve-qemu-9.2-vitastor.patch
pve/0058-savevm-async-improve-setting-state-of-snapshot-opera.patch
pve/0059-savevm-async-rename-saved_vm_running-to-vm_needs_sta.patch
pve/0060-savevm-async-improve-runstate-preservation-cleanup-e.patch
pve/0061-savevm-async-use-dedicated-iothread-for-state-file.patch
pve/0062-savevm-async-treat-failure-to-set-iothread-context-a.patch
pve/0063-PVE-backup-clean-up-directly-in-setup_snapshot_acces.patch
pve/0064-PVE-backup-factor-out-helper-to-clear-backup-state-s.patch
pve/0065-PVE-backup-factor-out-helper-to-initialize-backup-st.patch
pve/0066-PVE-backup-add-target-ID-in-backup-state.patch
pve/0067-PVE-backup-get-device-info-allow-caller-to-specify-f.patch
pve/0068-PVE-backup-implement-backup-access-setup-and-teardow.patch
pve/0069-PVE-backup-factor-out-get_single_device_info-helper.patch
pve/0070-PVE-backup-implement-bitmap-support-for-external-bac.patch
pve/0071-PVE-backup-backup-access-api-indicate-situation-wher.patch
pve/0072-PVE-backup-backup-access-api-explicit-bitmap-mode-pa.patch
pve/0073-PVE-backup-backup-access-api-simplify-bitmap-logic.patch