Compare commits

..

26 Commits

Author SHA1 Message Date
715a6d3a7d Add vitastor support 2025-08-25 19:53:32 +03:00
Thomas Lamprecht
839b53bab8 bump version to 10.0.2-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-07-17 23:22:41 +02:00
Thomas Lamprecht
c8dd2db54c savevm-async: reuse migration blocker check for snapshots/hibernation
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-07-17 23:21:51 +02:00
Fiona Ebner
5e0c823f82 d/changelog: drop faulty partial entry
The actual entry for dropping GlusterFS is right below.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-07-15 15:28:15 +02:00
Fabian Grünbichler
4f0e2dfce4 bump version to 10.0.2-3
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2025-07-03 12:04:25 +02:00
Fabian Grünbichler
8e6de66a76 add backup/zeroinit/track-alloc blockdev patches
as well as one for directly returning child nodes in the block graph

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2025-07-03 12:04:25 +02:00
Thomas Lamprecht
1e351bfe19 bump version to 10.0.2-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-06-25 19:23:35 +02:00
Thomas Lamprecht
4397bd351d drop support for native glusterfs
As the GlusterFS project is unmaintained since a while and other
projects like QEMU also drop support for using it natively.

One can still use the gluster tools to mount an instance manually and
then use it as directory storage; the better (long term) option will
be to replace the storage server with something maintained though, as
PVE 8 will be supported until the middle of 2026 users have some time
before they need to decide what way they will go.

Mirrors a commit from pve-storage with similar intentions [0].

0: https://git.proxmox.com/?p=pve-storage.git;a=commit;h=7669a99e97f3fd35cca95d1d1ab8a377f593dccb

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-06-25 19:19:43 +02:00
Wolfgang Bumiller
6e581a8033 buildsys: use grouping for .deb targets
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2025-06-17 10:22:06 +02:00
Thomas Lamprecht
7891f050c6 bump version to 10.0.2-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-06-17 08:38:54 +02:00
Fiona Ebner
bd931f0f04 squash some related patches
In particular for backup and savevm-async, a lot can be grouped
together, many patches were just later fixups/improvements. Backup
patches are still grouped in three: original, addition of fleecing
and addition of external backup API (preparatory patches there
could be split and squashed too, but they're relatively new, so that
is something for next time).

The code with all patches applied is still the same.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250602102227.120732-3-f.ebner@proxmox.com
2025-06-17 08:37:57 +02:00
Fiona Ebner
b73923535f update submodule and patches to QEMU 10.0.2
Only minor changes this time:

Adapt to changed include paths:
32cad1ffb8 ("include: Rename sysemu/ -> system/")
407bc4bf90 ("qapi: Move include/qapi/qmp/ to include/qobject/")

Adapt to a function signature change:
4822128693 ("migration: Drop inactivate_disk param in
qemu_savevm_state_complete*")

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250602102227.120732-2-f.ebner@proxmox.com
2025-06-17 08:37:57 +02:00
Fiona Ebner
50717bf6df d/rules: remove outdated workaround against historic changelog file
There is no top-level 'Changelog' file in the QEMU submodule
repository anymore since QEMU v5.2, to be precise commit e83029fa60
("CHANGELOG: remove disused file").

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-04-04 18:46:52 +02:00
Thomas Lamprecht
e0969989ac bump version to 9.2.0-5
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-04-04 16:16:02 +02:00
Thomas Lamprecht
b30e898392 PVE backup: backup-access api: simplify bitmap logic
See inner patch for details.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-04-04 16:14:24 +02:00
Thomas Lamprecht
053f68a7ac bump version to 9.2.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-04-03 17:00:39 +02:00
Thomas Lamprecht
cb662eb2c1 pve backup: implement features for external backup providers
This series of patches extends our existing QEMU backup integration
for PVE such that it can be used for a in-development external-backup
plugin API.

This includes among other things an API for backup access and teardown
as well as dirty-bitmap management. See the separate patches for more
details and check [v7] and [v8] of this series on the list. The former
is the last revision by Fiona which was already very advanced,
Wolfgang picked that up and extended/adapted it slightly to match
needs that he found during implementation of a test provider and from
feedback of (future) external backup provider like Bareos and Veritas.
This was mainly done to get the QEMU build out for broad QA, we can
still change things here as long as the rest of the series is not yet
applied.

[v7]: https://lore.proxmox.com/pve-devel/20250401173435.221892-1-f.ebner@proxmox.com/
[v8]: https://lore.proxmox.com/pve-devel/20250403123118.264974-1-w.bumiller@proxmox.com/

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-04-03 16:48:42 +02:00
Wolfgang Bumiller
a6ee093f1c merge async snapshot improvements
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2025-04-02 17:23:05 +02:00
Thomas Lamprecht
76dd2fb90b bump version to 9.2.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-03-26 09:57:04 +01:00
Fiona Ebner
3da02409af revert hpet changes to fix performance regression
As reported in the community forum [0][1], QEMU processes for Linux
guests would consume more CPU on the host after an update to QEMU 9.2.
The issue was reproduced and bisecting pointed to QEMU commit
f0ccf77078 ("hpet: fix and cleanup persistence of interrupt status").

Some quick experimentation suggests that in particular the last part
 is responsible for the issue:
> - the timer must be kept running even if not enabled, in
>   order to set the ISR flag, so writes to HPET_TN_CFG must
>   not call hpet_del_timer()

Users confirmed that setting the hpet=off machine flag works around
the issue[0]. For Windows (7 or later) guests, the flag is already
disabled, because of issues in the past [2].

Upstream suggested reverting the relevant patches for now [3], because
other issues were reported too. All except commit 5895879aca ("hpet:
remove unnecessary variable "index"") are actually dependent on each
other for cleanly reverting f0ccf77078, and while not strictly
required, that one was reverted too for completeness.

[0]: https://forum.proxmox.com/threads/163694/
[1]: https://forum.proxmox.com/threads/161849/post-756793
[2]: https://lists.proxmox.com/pipermail/pve-devel/2012-December/004958.html
[3]: https://lore.kernel.org/qemu-devel/CABgObfaKJ5NFVKmYLFmu4C0iZZLJJtcWksLCzyA0tBoz0koZ4A@mail.gmail.com/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-03-26 08:42:59 +01:00
Thomas Lamprecht
f7be446ebe bump version to 9.2.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-02-24 17:34:34 +01:00
Fiona Ebner
dc45786e06 code style: some more coccinelle fixes
Below are the commands that generated the changes along with the
rationale:

command: spatch --in-place scripts/coccinelle/error_propagate_null.cocci pve-backup.c
rationale: error_propagate() already checks for NULL in its second
           argument

command: spatch --in-place scripts/coccinelle/round.cocci vma-reader.c vma-writer.c
rationale: DIV_ROUND_UP() macro is more readable than the expanded
           calculation

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-02-24 17:21:14 +01:00
Fiona Ebner
f6dc6b54ba replicated zfs migration: fix assertion failure with multiple disks
It is necessary to reset the error pointer after error_report_err(),
because that function frees the error. Not doing so can lead to a
use-after-free and in particular error_setg() with the same error
pointer will run into assertion failure, because it asserts that no
previous error is set:

> #5  0x00007c1723674eb2 in __GI___assert_fail (assertion=assertion@entry=0x59132c9fc540 "*errp == NULL",
>     file=file@entry=0x59132c9fc530 "../util/error.c", line=line@entry=68,
>     function=function@entry=0x59132c9fc5f8 <__PRETTY_FUNCTION__.2> "error_setv")
> #6  0x000059132c7d250f in error_setv (errp=0x7c15839fafb8, src=0x59132c9af224 "../block/dirty-bitmap.c", line=182,
>     func=0x59132c9af9b0 <__func__.17> "bdrv_dirty_bitmap_check", err_class=err_class@entry=ERROR_CLASS_GENERIC_ERROR,
>     fmt=fmt@entry=0x59132c9af380 "Bitmap '%s' is currently in use by another operation and cannot be used", ap=0x7c15839fad60,
>     suffix=0x0)
> #7  0x000059132c7d265c in error_setg_internal (errp=errp@entry=0x7c15839fafb8,
>     src=src@entry=0x59132c9af224 "../block/dirty-bitmap.c", line=line@entry=182,
>     func=func@entry=0x59132c9af9b0 <__func__.17> "bdrv_dirty_bitmap_check",
>     fmt=fmt@entry=0x59132c9af380 "Bitmap '%s' is currently in use by another operation and cannot be used")
> #8  0x000059132c68fbc1 in bdrv_dirty_bitmap_check (bitmap=bitmap@entry=0x5913542d6190, flags=flags@entry=7,
>     errp=errp@entry=0x7c15839fafb8)
> #9  0x000059132c3b951d in add_bitmaps_to_list (s=s@entry=0x59132d87ee40 <dbm_state>, bs=bs@entry=0x591352d6b720,
>     bs_name=bs_name@entry=0x591352d69900 "drive-scsi1", alias_map=alias_map@entry=0x0, errp=errp@entry=0x7c15839fafb8)
> #10 0x000059132c3ba23d in init_dirty_bitmap_migration (errp=<optimized out>, s=0x59132d87ee40 <dbm_state>)
> #11 dirty_bitmap_save_setup (f=0x591352ebdd30, opaque=0x59132d87ee40 <dbm_state>, errp=0x7c15839fafb8)
> #12 0x000059132c3d81f0 in qemu_savevm_state_setup (f=0x591352ebdd30, errp=errp@entry=0x7c15839fafb8)

Fix created using the appropriate in-tree coccinelle script:
spatch --in-place scripts/coccinelle/error-use-after-free.cocci migration/block-dirty-bitmap.c

The problematic change exposing the issue was part of 7882afe ("update
submodule and patches to QEMU 9.1.2") adapting to QEMU 9.1, commit
dd03167725 ("migration: Add Error** argument to
add_bitmaps_to_list()"), where the add_bitmaps_to_list() function
gained an error pointer argument, replacing the local error variable
that was used before.

Fixes: 7882afe ("update submodule and patches to QEMU 9.1.2")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-02-24 17:21:14 +01:00
Thomas Lamprecht
4f4fca78f7 bump version to 9.2.0-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-02-04 08:49:24 +01:00
Fiona Ebner
e247b46563 stable fixes for QEMU 9.2.0
Most notabbly, there now is an upstream workaround for the "Windows
PCI Label bug" [0] and the revert of QEMU commit 44d975ef34 ("x86:
acpi: workaround Windows not handling name references in Package
properly") can be dropped.

Pick up some other fixes already merged in current master, for
emulation as well as x86(_64) KVM, some PCI/USB fixes and a pair of
regression fixes for the net subsystem.

[0]: https://gitlab.com/qemu-project/qemu/-/issues/774

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-02-04 08:37:47 +01:00
Fiona Ebner
670aa8ecdf update submodule and patches to QEMU 9.2.0
Notable changes:

* Commit 07bea2d35f ("block-backend: Remove deadcode") removed
  blk_op_{,un}block_all() which was used by PVE async savevm code.
  Fixed by switching to using bdrv_op_{,un}block_all().

* Drop patches that are already part of upstream.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-02-04 08:37:47 +01:00
87 changed files with 4659 additions and 1615 deletions

View File

@@ -56,9 +56,8 @@ $(BUILDDIR): submodule
.PHONY: deb kvm
deb kvm: $(DEBS)
$(DEB_DBG): $(DEB)
$(DEB): $(BUILDDIR)
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc
$(DEBS) &: $(BUILDDIR)
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j40
lintian $(DEBS)
sbuild: $(DSC)

82
debian/changelog vendored
View File

@@ -1,3 +1,85 @@
pve-qemu-kvm (10.0.2-4+vitastor1) trixie; urgency=medium
* Add Vitastor support
-- Vitaliy Filippov <vitalif@yourcmc.ru> Mon, 25 Aug 2025 19:39:12 +0300
pve-qemu-kvm (10.0.2-4) trixie; urgency=medium
* savevm-async: reuse migration blocker check for snapshots/hibernation to
avoid crashing a VM when on these actions if its configuration does not
support them.
-- Proxmox Support Team <support@proxmox.com> Thu, 17 Jul 2025 23:08:46 +0200
pve-qemu-kvm (10.0.2-3) trixie; urgency=medium
* add backup/zeroinit/track-alloc blockdev patches
-- Proxmox Support Team <support@proxmox.com> Thu, 03 Jul 2025 12:02:03 +0200
pve-qemu-kvm (10.0.2-2) trixie; urgency=medium
* drop support for accessing Gluster based storage directly due to its
effective end of support and maintenance. The last upstream release
happened over 2.5 years ago and there's currently no one providing
enterprise support or security updates. Further, upstream QEMU will remove
the integration in one of the next releases, so use the upcomming PVE 9
major release to provide a clean cut.
User can either stay on Proxmox VE 8 until its end-of-life (probably end
of June 2026), or mount GlusterFS "manually" (e.g., /etc/fstab) and add it
as directory storage to Proxmox VE.
We recommend moving to other actively maintained storage technology
altogether though.
-- Proxmox Support Team <support@proxmox.com> Wed, 25 Jun 2025 19:23:30 +0200
pve-qemu-kvm (10.0.2-1) trixie; urgency=medium
* update QEMU and downstream patches for 10.0.2 release.
-- Proxmox Support Team <support@proxmox.com> Tue, 17 Jun 2025 08:38:51 +0200
pve-qemu-kvm (9.2.0-5) bookworm; urgency=medium
* pve backup: backup-access api: simplify bitmap logic
-- Proxmox Support Team <support@proxmox.com> Fri, 04 Apr 2025 16:15:58 +0200
pve-qemu-kvm (9.2.0-4) bookworm; urgency=medium
* various async snapshot improvements, inclduing using a dedicated IO thread
for the state file when doing a live snapshot. That should reduce load on
the main thread and for it to get stuck on IO, i.e. same benefits as using
a dedicated IO thread for regular drives. This is particularly interesting
when the VM state storage is a network storage like NFS. It should also
address #6262.
* pve backup: implement basic features and API in preperation for external
backup provider storage plugins.
-- Proxmox Support Team <support@proxmox.com> Thu, 03 Apr 2025 17:00:34 +0200
pve-qemu-kvm (9.2.0-3) bookworm; urgency=medium
* revert changes to the High Precision Event Timer (HPET) to fix performance
regression
-- Proxmox Support Team <support@proxmox.com> Wed, 26 Mar 2025 09:56:01 +0100
pve-qemu-kvm (9.2.0-2) bookworm; urgency=medium
* fix assertion failure when migrating a VM with multiple disks on a
replicated ZFS.
-- Proxmox Support Team <support@proxmox.com> Mon, 24 Feb 2025 17:33:34 +0100
pve-qemu-kvm (9.2.0-1) bookworm; urgency=medium
* update submodule and patches to QEMU 9.2.0
-- Proxmox Support Team <support@proxmox.com> Tue, 04 Feb 2025 08:49:20 +0100
pve-qemu-kvm (9.1.2-3) bookworm; urgency=medium
* async snapshot: explicitly specify raw format when loading the VM state

9
debian/control vendored
View File

@@ -12,7 +12,6 @@ Build-Depends: debhelper-compat (= 13),
libepoxy-dev,
libfdt-dev,
libgbm-dev,
libglusterfs-dev (>= 5.2-2),
libgnutls28-dev,
libiscsi-dev (>= 1.12.0),
libjpeg-dev,
@@ -47,18 +46,12 @@ Package: pve-qemu-kvm
Architecture: any
Depends: ceph-common (>= 0.48),
iproute2,
libgfapi0 | glusterfs-common (>= 5.6),
libgfchangelog0 | glusterfs-common (>= 5.6),
libgfdb0 | glusterfs-common (>= 5.6),
libgfrpc0 | glusterfs-common (>= 5.6),
libgfxdr0 | glusterfs-common (>= 5.6),
libglusterfs-dev | glusterfs-common (>= 5.6),
libglusterfs0 | glusterfs-common (>= 5.6),
libiscsi4 (>= 1.12.0) | libiscsi7,
libjpeg62-turbo,
libspice-server1 (>= 0.14.0~),
libusb-1.0-0 (>= 1.0.17-1),
libusbredirparser1 (>= 0.6-2),
vitastor-client (>= 0.9.4),
libuuid1,
${misc:Depends},
${shlibs:Depends},

View File

@@ -38,7 +38,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
5 files changed, 142 insertions(+), 28 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 61f0a717b7..83a88562c5 100644
index a53582f17b..fafca1360e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -51,7 +51,7 @@ typedef struct MirrorBlockJob {
@@ -258,10 +258,10 @@ index 61f0a717b7..83a88562c5 100644
base_read_only, errp);
if (!job) {
diff --git a/blockdev.c b/blockdev.c
index 835064ed03..9b10e3917c 100644
index 1d1f27cfff..ec45bbaa52 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2778,6 +2778,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2797,6 +2797,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
BlockDriverState *target,
const char *replaces,
enum MirrorSyncMode sync,
@@ -271,7 +271,7 @@ index 835064ed03..9b10e3917c 100644
BlockMirrorBackingMode backing_mode,
bool zero_target,
bool has_speed, int64_t speed,
@@ -2796,6 +2799,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2815,6 +2818,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
{
BlockDriverState *unfiltered_bs;
int job_flags = JOB_DEFAULT;
@@ -279,7 +279,7 @@ index 835064ed03..9b10e3917c 100644
GLOBAL_STATE_CODE();
GRAPH_RDLOCK_GUARD_MAINLOOP();
@@ -2850,6 +2854,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2869,6 +2873,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}
@@ -309,7 +309,7 @@ index 835064ed03..9b10e3917c 100644
if (!replaces) {
/* We want to mirror from @bs, but keep implicit filters on top */
unfiltered_bs = bdrv_skip_implicit_filters(bs);
@@ -2891,8 +2918,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2910,8 +2937,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
* and will allow to check whether the node still exist at mirror completion
*/
mirror_start(job_id, bs, target,
@@ -320,7 +320,7 @@ index 835064ed03..9b10e3917c 100644
on_source_error, on_target_error, unmap, filter_node_name,
copy_mode, errp);
}
@@ -3036,6 +3063,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
@@ -3055,6 +3082,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
blockdev_mirror_common(arg->job_id, bs, target_bs,
arg->replaces, arg->sync,
@@ -329,7 +329,7 @@ index 835064ed03..9b10e3917c 100644
backing_mode, zero_target,
arg->has_speed, arg->speed,
arg->has_granularity, arg->granularity,
@@ -3055,6 +3084,8 @@ void qmp_blockdev_mirror(const char *job_id,
@@ -3074,6 +3103,8 @@ void qmp_blockdev_mirror(const char *job_id,
const char *device, const char *target,
const char *replaces,
MirrorSyncMode sync,
@@ -338,7 +338,7 @@ index 835064ed03..9b10e3917c 100644
bool has_speed, int64_t speed,
bool has_granularity, uint32_t granularity,
bool has_buf_size, int64_t buf_size,
@@ -3095,7 +3126,8 @@ void qmp_blockdev_mirror(const char *job_id,
@@ -3114,7 +3145,8 @@ void qmp_blockdev_mirror(const char *job_id,
}
blockdev_mirror_common(job_id, bs, target_bs,
@@ -364,10 +364,10 @@ index eb2d92a226..f0c642b194 100644
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index aa40d44f1d..c2a337cc04 100644
index b1937780e1..0e5f148d30 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2174,6 +2174,15 @@
@@ -2182,6 +2182,15 @@
# destination (all the disk, only the sectors allocated in the
# topmost image, or only new I/O).
#
@@ -383,7 +383,7 @@ index aa40d44f1d..c2a337cc04 100644
# @granularity: granularity of the dirty bitmap, default is 64K if the
# image format doesn't have clusters, 4K if the clusters are
# smaller than that, else the cluster size. Must be a power of 2
@@ -2216,7 +2225,9 @@
@@ -2224,7 +2233,9 @@
{ 'struct': 'DriveMirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*format': 'str', '*node-name': 'str', '*replaces': 'str',
@@ -394,7 +394,7 @@ index aa40d44f1d..c2a337cc04 100644
'*speed': 'int', '*granularity': 'uint32',
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
@@ -2496,6 +2507,15 @@
@@ -2503,6 +2514,15 @@
# destination (all the disk, only the sectors allocated in the
# topmost image, or only new I/O).
#
@@ -410,7 +410,7 @@ index aa40d44f1d..c2a337cc04 100644
# @granularity: granularity of the dirty bitmap, default is 64K if the
# image format doesn't have clusters, 4K if the clusters are
# smaller than that, else the cluster size. Must be a power of 2
@@ -2544,7 +2564,8 @@
@@ -2551,7 +2571,8 @@
{ 'command': 'blockdev-mirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*replaces': 'str',
@@ -421,7 +421,7 @@ index aa40d44f1d..c2a337cc04 100644
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index 3766d5de6b..afa44cbd34 100644
index 2b358eaaa8..2a149fe021 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -755,8 +755,8 @@ static void test_propagate_mirror(void)

View File

@@ -24,7 +24,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 83a88562c5..fc439ea936 100644
index fafca1360e..05e738bcce 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -694,8 +694,6 @@ static int mirror_exit_common(Job *job)

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 3 insertions(+)
diff --git a/blockdev.c b/blockdev.c
index 9b10e3917c..c3fa897289 100644
index ec45bbaa52..9fab7ec554 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2875,6 +2875,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2894,6 +2894,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_ALLOW_RO, errp)) {
return;
}

View File

@@ -16,7 +16,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index fc439ea936..cde5d710fd 100644
index 05e738bcce..2a2a227f3b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -809,8 +809,8 @@ static int mirror_exit_common(Job *job)

View File

@@ -21,7 +21,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
3 files changed, 70 insertions(+), 59 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index cde5d710fd..e20f50e5fb 100644
index 2a2a227f3b..87c0856979 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1763,31 +1763,13 @@ static BlockJob *mirror_start_job(
@@ -62,10 +62,10 @@ index cde5d710fd..e20f50e5fb 100644
if (bitmap_mode != BITMAP_SYNC_MODE_NEVER) {
diff --git a/blockdev.c b/blockdev.c
index c3fa897289..9cbd166674 100644
index 9fab7ec554..158ac9314b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2854,7 +2854,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2873,7 +2873,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}

View File

@@ -60,7 +60,7 @@ index c3740ec616..7f38ce6b8b 100644
void monitor_init_globals(void);
void monitor_init_globals_core(void);
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index cb628f681d..93dbd62fc2 100644
index 5676eb334e..4c452a6aeb 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -151,6 +151,13 @@ typedef struct {
@@ -78,7 +78,7 @@ index cb628f681d..93dbd62fc2 100644
/**
diff --git a/monitor/monitor.c b/monitor/monitor.c
index db52a9c7ef..2d63959351 100644
index c5a5d30877..07775784d4 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -116,6 +116,21 @@ bool monitor_cur_is_qmp(void)
@@ -104,7 +104,7 @@ index db52a9c7ef..2d63959351 100644
* Is @mon is using readline?
* Note: not all HMP monitors use readline, e.g., gdbserver has a
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 5e538f34c0..eb181d5979 100644
index 2f46cf9e49..f093e256e9 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -165,6 +165,8 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
@@ -144,7 +144,7 @@ index 5e538f34c0..eb181d5979 100644
monitor_qmp_caps_reset(mon);
data = qmp_greeting(mon);
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 176b549473..790bb7d1da 100644
index e569224eae..eb03782e91 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -117,16 +117,28 @@ typedef struct QmpDispatchBH {

View File

@@ -55,7 +55,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 08d9218455..20d8c0cf66 100644
index b14983ec54..41c543e627 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -456,7 +456,7 @@ static void ide_trim_bh_cb(void *opaque)

View File

@@ -1,69 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Guenter Roeck <linux@roeck-us.net>
Date: Tue, 28 Feb 2023 09:11:29 -0800
Subject: [PATCH] scsi: megasas: Internal cdbs have 16-byte length
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Host drivers do not necessarily set cdb_len in megasas io commands.
With commits 6d1511cea0 ("scsi: Reject commands if the CDB length
exceeds buf_len") and fe9d8927e2 ("scsi: Add buf_len parameter to
scsi_req_new()"), this results in failures to boot Linux from affected
SCSI drives because cdb_len is set to 0 by the host driver.
Set the cdb length to its actual size to solve the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(picked-up from https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg08653.html)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/scsi/megasas.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 2d0c607177..97e51733af 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -1781,7 +1781,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
uint8_t cdb[16];
int len;
struct SCSIDevice *sdev = NULL;
- int target_id, lun_id, cdb_len;
+ int target_id, lun_id;
lba_count = le32_to_cpu(cmd->frame->io.header.data_len);
lba_start_lo = le32_to_cpu(cmd->frame->io.lba_lo);
@@ -1790,7 +1790,6 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
target_id = cmd->frame->header.target_id;
lun_id = cmd->frame->header.lun_id;
- cdb_len = cmd->frame->header.cdb_len;
if (target_id < MFI_MAX_LD && lun_id == 0) {
sdev = scsi_device_find(&s->bus, 0, target_id, lun_id);
@@ -1805,15 +1804,6 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
return MFI_STAT_DEVICE_NOT_FOUND;
}
- if (cdb_len > 16) {
- trace_megasas_scsi_invalid_cdb_len(
- mfi_frame_desc(frame_cmd), 1, target_id, lun_id, cdb_len);
- megasas_write_sense(cmd, SENSE_CODE(INVALID_OPCODE));
- cmd->frame->header.scsi_status = CHECK_CONDITION;
- s->event_count++;
- return MFI_STAT_SCSI_DONE_WITH_ERROR;
- }
-
cmd->iov_size = lba_count * sdev->blocksize;
if (megasas_map_sgl(s, cmd, &cmd->frame->io.sgl)) {
megasas_write_sense(cmd, SENSE_CODE(TARGET_FAILURE));
@@ -1824,7 +1814,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
megasas_encode_lba(cdb, lba_start, lba_count, is_write);
cmd->req = scsi_req_new(sdev, cmd->index,
- lun_id, cdb, cdb_len, cmd);
+ lun_id, cdb, sizeof(cdb), cmd);
if (!cmd->req) {
trace_megasas_scsi_req_alloc_failed(
mfi_frame_desc(frame_cmd), target_id, lun_id);

View File

@@ -1,45 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 17 Nov 2023 11:18:06 +0100
Subject: [PATCH] Revert "x86: acpi: workaround Windows not handling name
references in Package properly"
This reverts commit 44d975ef340e2f21f236f9520c53e1b30d2213a4.
As reported in the community forum [0] and reproduced locally this
breaks VirtIO network adapters in (at least) the German ISO of Windows
Server 2022. The fix itself was for
> Issue is not fatal but as result acpi-index/"PCI Label ID" property
> is either not shown in device details page or shows incorrect value.
so revert and tolerate that as a stop-gap, rather than have the
devices not working at all.
[0]: https://forum.proxmox.com/threads/92094/post-605684
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/i386/acpi-build.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 5d4bd2b710..67194bb705 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -346,13 +346,9 @@ Aml *aml_pci_device_dsm(void)
{
Aml *params = aml_local(0);
Aml *pkg = aml_package(2);
- aml_append(pkg, aml_int(0));
- aml_append(pkg, aml_int(0));
+ aml_append(pkg, aml_name("BSEL"));
+ aml_append(pkg, aml_name("ASUN"));
aml_append(method, aml_store(pkg, params));
- aml_append(method,
- aml_store(aml_name("BSEL"), aml_index(params, aml_int(0))));
- aml_append(method,
- aml_store(aml_name("ASUN"), aml_index(params, aml_int(1))));
aml_append(method,
aml_return(aml_call5("PDSM", aml_arg(0), aml_arg(1),
aml_arg(2), aml_arg(3), params))

View File

@@ -1,81 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Akihiko Odaki <akihiko.odaki@daynix.com>
Date: Tue, 22 Oct 2024 15:49:01 +0900
Subject: [PATCH] virtio-net: Add queues before loading them
Call virtio_net_set_multiqueue() to add queues before loading their
states. Otherwise the loaded queues will not have handlers and elements
in them will not be processed.
Cc: qemu-stable@nongnu.org
Fixes: 8c49756825da ("virtio-net: Add only one queue pair when realizing")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
(picked from https://lore.kernel.org/qemu-devel/20241022-load-v1-1-99df0bff7939@daynix.com/)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/net/virtio-net.c | 10 ++++++++++
hw/virtio/virtio.c | 7 +++++++
include/hw/virtio/virtio.h | 2 ++
3 files changed, 19 insertions(+)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index ed33a32877..90d05f94d4 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3032,6 +3032,15 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
virtio_net_set_queue_pairs(n);
}
+static int virtio_net_pre_load_queues(VirtIODevice *vdev)
+{
+ virtio_net_set_multiqueue(VIRTIO_NET(vdev),
+ virtio_has_feature(vdev->guest_features, VIRTIO_NET_F_RSS) ||
+ virtio_has_feature(vdev->guest_features, VIRTIO_NET_F_MQ));
+
+ return 0;
+}
+
static int virtio_net_post_load_device(void *opaque, int version_id)
{
VirtIONet *n = opaque;
@@ -4010,6 +4019,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
vdc->guest_notifier_mask = virtio_net_guest_notifier_mask;
vdc->guest_notifier_pending = virtio_net_guest_notifier_pending;
vdc->legacy_features |= (0x1 << VIRTIO_NET_F_GSO);
+ vdc->pre_load_queues = virtio_net_pre_load_queues;
vdc->post_load = virtio_net_post_load_virtio;
vdc->vmsd = &vmstate_virtio_net_device;
vdc->primary_unplug_pending = primary_unplug_pending;
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 9e10cbc058..10f24a58dd 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3251,6 +3251,13 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
config_len--;
}
+ if (vdc->pre_load_queues) {
+ ret = vdc->pre_load_queues(vdev);
+ if (ret) {
+ return ret;
+ }
+ }
+
num = qemu_get_be32(f);
if (num > VIRTIO_QUEUE_MAX) {
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 0fcbc5c0c6..953dfca27c 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -210,6 +210,8 @@ struct VirtioDeviceClass {
void (*guest_notifier_mask)(VirtIODevice *vdev, int n, bool mask);
int (*start_ioeventfd)(VirtIODevice *vdev);
void (*stop_ioeventfd)(VirtIODevice *vdev);
+ /* Called before loading queues. Useful to add queues before loading. */
+ int (*pre_load_queues)(VirtIODevice *vdev);
/* Saving and loading of a device; trying to deprecate save/load
* use vmsd for new devices.
*/

View File

@@ -1,36 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Akihiko Odaki <akihiko.odaki@daynix.com>
Date: Fri, 22 Nov 2024 14:03:08 +0900
Subject: [PATCH] virtio-net: Fix size check in dhclient workaround
work_around_broken_dhclient() accesses IP and UDP headers to detect
relevant packets and to calculate checksums, but it didn't check if
the packet has size sufficient to accommodate them, causing out-of-bound
access hazards. Fix this by correcting the size requirement.
Fixes: 1d41b0c1ec66 ("Work around dhclient brokenness")
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
(picked from https://lore.kernel.org/qemu-devel/20241122-queue-v3-2-f2ff03b8dbfd@daynix.com/#t)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/net/virtio-net.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 90d05f94d4..c1fe457359 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1692,8 +1692,11 @@ static void virtio_net_hdr_swap(VirtIODevice *vdev, struct virtio_net_hdr *hdr)
static void work_around_broken_dhclient(struct virtio_net_hdr *hdr,
uint8_t *buf, size_t size)
{
+ size_t csum_size = ETH_HLEN + sizeof(struct ip_header) +
+ sizeof(struct udp_header);
+
if ((hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) && /* missing csum */
- (size > 27 && size < 1500) && /* normal sized MTU */
+ (size >= csum_size && size < 1500) && /* normal sized MTU */
(buf[12] == 0x08 && buf[13] == 0x00) && /* ethertype == IPv4 */
(buf[23] == 17) && /* ip.protocol == UDP */
(buf[34] == 0 && buf[35] == 67)) { /* udp.srcport == bootps */

File diff suppressed because it is too large Load Diff

View File

@@ -14,10 +14,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index ff928b5e85..99e5bea1cc 100644
index 56d1972d15..cfa0b832ba 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -564,7 +564,7 @@ static QemuOptsList raw_runtime_opts = {
@@ -565,7 +565,7 @@ static QemuOptsList raw_runtime_opts = {
{
.name = "locking",
.type = QEMU_OPT_STRING,
@@ -26,7 +26,7 @@ index ff928b5e85..99e5bea1cc 100644
},
{
.name = "pr-manager",
@@ -664,7 +664,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
@@ -665,7 +665,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
s->use_lock = false;
break;
case ON_OFF_AUTO_AUTO:

View File

@@ -9,12 +9,12 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/net.h b/include/net/net.h
index c8f679761b..35a1338e40 100644
index cdd5b109b0..653a37e9d1 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -309,8 +309,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
@@ -305,8 +305,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
int net_hub_id_for_client(NetClientState *nc, int *id);
NetClientState *net_hub_port_find(int hub_id);
-#define DEFAULT_NETWORK_SCRIPT CONFIG_SYSCONFDIR "/qemu-ifup"
-#define DEFAULT_NETWORK_DOWN_SCRIPT CONFIG_SYSCONFDIR "/qemu-ifdown"

View File

@@ -10,10 +10,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index fa027cc206..da7ef0cbe6 100644
index 76f24446a5..2a47d79b49 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2418,9 +2418,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
@@ -2556,9 +2556,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
#define CPU_RESOLVING_TYPE TYPE_X86_CPU
#ifdef TARGET_X86_64

View File

@@ -9,7 +9,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 15be640286..ea20e6153c 100644
index 0326c63bec..d523d00200 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -690,32 +690,35 @@ static void qemu_spice_init(void)

View File

@@ -9,7 +9,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/block/gluster.c b/block/gluster.c
index f8b415f381..02bde39d94 100644
index c6d25ae733..ccca125c3a 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -42,7 +42,7 @@

View File

@@ -18,7 +18,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+)
diff --git a/block/rbd.c b/block/rbd.c
index 9c0fd0cb3f..101ee59d6e 100644
index af984fb7db..bf143fac00 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -963,6 +963,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,

View File

@@ -16,7 +16,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/block/gluster.c b/block/gluster.c
index 02bde39d94..36c00088cc 100644
index ccca125c3a..301a653ea7 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -57,6 +57,7 @@ typedef struct GlusterAIOCB {
@@ -27,7 +27,7 @@ index 02bde39d94..36c00088cc 100644
} GlusterAIOCB;
typedef struct BDRVGlusterState {
@@ -749,8 +750,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
@@ -746,8 +747,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
acb->ret = 0; /* Success */
} else if (ret < 0) {
acb->ret = -errno; /* Read/Write failed */
@@ -39,7 +39,7 @@ index 02bde39d94..36c00088cc 100644
}
aio_co_schedule(acb->aio_context, acb->coroutine);
@@ -1019,6 +1022,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
@@ -1018,6 +1021,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
@@ -47,7 +47,7 @@ index 02bde39d94..36c00088cc 100644
ret = glfs_zerofill_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1199,9 +1203,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
@@ -1198,9 +1202,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
acb.aio_context = bdrv_get_aio_context(bs);
if (write) {
@@ -59,7 +59,7 @@ index 02bde39d94..36c00088cc 100644
ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
gluster_finish_aiocb, &acb);
}
@@ -1264,6 +1270,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
@@ -1263,6 +1269,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
@@ -67,7 +67,7 @@ index 02bde39d94..36c00088cc 100644
ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1312,6 +1319,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
@@ -1311,6 +1318,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);

View File

@@ -9,7 +9,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/qemu-img.c b/qemu-img.c
index 7668f86769..2575e97b43 100644
index 2044c22a4c..4c8b5412c6 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3075,7 +3075,8 @@ static int img_info(int argc, char **argv)

View File

@@ -54,10 +54,10 @@ index c9dd70a892..048788b23d 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 2575e97b43..8ec68b346f 100644
index 4c8b5412c6..d5fa89a204 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4993,10 +4993,12 @@ static int img_bitmap(int argc, char **argv)
@@ -4997,10 +4997,12 @@ static int img_bitmap(int argc, char **argv)
#define C_IF 04
#define C_OF 010
#define C_SKIP 020
@@ -70,7 +70,7 @@ index 2575e97b43..8ec68b346f 100644
};
struct DdIo {
@@ -5072,6 +5074,19 @@ static int img_dd_skip(const char *arg,
@@ -5076,6 +5078,19 @@ static int img_dd_skip(const char *arg,
return 0;
}
@@ -90,7 +90,7 @@ index 2575e97b43..8ec68b346f 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -5112,6 +5127,7 @@ static int img_dd(int argc, char **argv)
@@ -5116,6 +5131,7 @@ static int img_dd(int argc, char **argv)
{ "if", img_dd_if, C_IF },
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
@@ -98,7 +98,7 @@ index 2575e97b43..8ec68b346f 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5187,91 +5203,112 @@ static int img_dd(int argc, char **argv)
@@ -5191,91 +5207,112 @@ static int img_dd(int argc, char **argv)
arg = NULL;
}
@@ -275,7 +275,7 @@ index 2575e97b43..8ec68b346f 100644
}
if (dd.flags & C_SKIP && (in.offset > INT64_MAX / in.bsz ||
@@ -5288,20 +5325,43 @@ static int img_dd(int argc, char **argv)
@@ -5292,20 +5329,43 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
for (out_pos = 0; in_pos < size; ) {

View File

@@ -16,10 +16,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/qemu-img.c b/qemu-img.c
index 8ec68b346f..b98184bba1 100644
index d5fa89a204..d458e85af2 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4994,11 +4994,13 @@ static int img_bitmap(int argc, char **argv)
@@ -4998,11 +4998,13 @@ static int img_bitmap(int argc, char **argv)
#define C_OF 010
#define C_SKIP 020
#define C_OSIZE 040
@@ -33,7 +33,7 @@ index 8ec68b346f..b98184bba1 100644
};
struct DdIo {
@@ -5087,6 +5089,19 @@ static int img_dd_osize(const char *arg,
@@ -5091,6 +5093,19 @@ static int img_dd_osize(const char *arg,
return 0;
}
@@ -53,7 +53,7 @@ index 8ec68b346f..b98184bba1 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -5101,12 +5116,14 @@ static int img_dd(int argc, char **argv)
@@ -5105,12 +5120,14 @@ static int img_dd(int argc, char **argv)
int c, i;
const char *out_fmt = "raw";
const char *fmt = NULL;
@@ -69,7 +69,7 @@ index 8ec68b346f..b98184bba1 100644
};
struct DdIo in = {
.bsz = 512, /* Block size is by default 512 bytes */
@@ -5128,6 +5145,7 @@ static int img_dd(int argc, char **argv)
@@ -5132,6 +5149,7 @@ static int img_dd(int argc, char **argv)
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
{ "osize", img_dd_osize, C_OSIZE },
@@ -77,7 +77,7 @@ index 8ec68b346f..b98184bba1 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5324,9 +5342,10 @@ static int img_dd(int argc, char **argv)
@@ -5328,9 +5346,10 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
@@ -90,7 +90,7 @@ index 8ec68b346f..b98184bba1 100644
if (blk1) {
in_ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
if (in_ret == 0) {
@@ -5335,6 +5354,9 @@ static int img_dd(int argc, char **argv)
@@ -5339,6 +5358,9 @@ static int img_dd(int argc, char **argv)
} else {
in_ret = read(STDIN_FILENO, in.buf, bytes);
if (in_ret == 0) {

View File

@@ -65,10 +65,10 @@ index 048788b23d..0b29a67a06 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index b98184bba1..6fc8384f64 100644
index d458e85af2..dc13efba8b 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -5118,7 +5118,7 @@ static int img_dd(int argc, char **argv)
@@ -5122,7 +5122,7 @@ static int img_dd(int argc, char **argv)
const char *fmt = NULL;
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
@@ -77,7 +77,7 @@ index b98184bba1..6fc8384f64 100644
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5156,7 +5156,7 @@ static int img_dd(int argc, char **argv)
@@ -5160,7 +5160,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
@@ -86,7 +86,7 @@ index b98184bba1..6fc8384f64 100644
if (c == EOF) {
break;
}
@@ -5176,6 +5176,9 @@ static int img_dd(int argc, char **argv)
@@ -5180,6 +5180,9 @@ static int img_dd(int argc, char **argv)
case 'h':
help();
break;
@@ -96,7 +96,7 @@ index b98184bba1..6fc8384f64 100644
case 'U':
force_share = true;
break;
@@ -5306,13 +5309,15 @@ static int img_dd(int argc, char **argv)
@@ -5310,13 +5313,15 @@ static int img_dd(int argc, char **argv)
size - in.bsz * in.offset, &error_abort);
}

View File

@@ -46,10 +46,10 @@ index 0b29a67a06..758f397232 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 6fc8384f64..a6c88e0860 100644
index dc13efba8b..02f2e0aa45 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -5110,6 +5110,7 @@ static int img_dd(int argc, char **argv)
@@ -5114,6 +5114,7 @@ static int img_dd(int argc, char **argv)
BlockDriver *drv = NULL, *proto_drv = NULL;
BlockBackend *blk1 = NULL, *blk2 = NULL;
QemuOpts *opts = NULL;
@@ -57,7 +57,7 @@ index 6fc8384f64..a6c88e0860 100644
QemuOptsList *create_opts = NULL;
Error *local_err = NULL;
bool image_opts = false;
@@ -5119,6 +5120,7 @@ static int img_dd(int argc, char **argv)
@@ -5123,6 +5124,7 @@ static int img_dd(int argc, char **argv)
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
bool force_share = false, skip_create = false;
@@ -65,7 +65,7 @@ index 6fc8384f64..a6c88e0860 100644
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5156,7 +5158,7 @@ static int img_dd(int argc, char **argv)
@@ -5160,7 +5162,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
@@ -74,7 +74,7 @@ index 6fc8384f64..a6c88e0860 100644
if (c == EOF) {
break;
}
@@ -5179,6 +5181,19 @@ static int img_dd(int argc, char **argv)
@@ -5183,6 +5185,19 @@ static int img_dd(int argc, char **argv)
case 'n':
skip_create = true;
break;
@@ -94,7 +94,7 @@ index 6fc8384f64..a6c88e0860 100644
case 'U':
force_share = true;
break;
@@ -5238,11 +5253,24 @@ static int img_dd(int argc, char **argv)
@@ -5242,11 +5257,24 @@ static int img_dd(int argc, char **argv)
if (dd.flags & C_IF) {
blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
force_share);
@@ -120,7 +120,7 @@ index 6fc8384f64..a6c88e0860 100644
}
if (dd.flags & C_OSIZE) {
@@ -5397,6 +5425,7 @@ static int img_dd(int argc, char **argv)
@@ -5401,6 +5429,7 @@ static int img_dd(int argc, char **argv)
out:
g_free(arg);
qemu_opts_del(opts);

View File

@@ -18,7 +18,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
4 files changed, 82 insertions(+), 4 deletions(-)
diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
index 8701f00cc7..3b4c5ef403 100644
index c6325cdcaa..7f817d622d 100644
--- a/hw/core/machine-hmp-cmds.c
+++ b/hw/core/machine-hmp-cmds.c
@@ -179,7 +179,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
@@ -59,10 +59,10 @@ index 8701f00cc7..3b4c5ef403 100644
qapi_free_BalloonInfo(info);
}
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 609e39a821..8cb6dfcac3 100644
index 2eb5a14fa2..aa2fd6c32f 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -781,8 +781,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
@@ -795,8 +795,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
{
VirtIOBalloon *dev = opaque;
@@ -103,10 +103,10 @@ index 609e39a821..8cb6dfcac3 100644
static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
diff --git a/qapi/machine.json b/qapi/machine.json
index d4317435e7..db8ed2e357 100644
index a6b8795b09..9f7ed0eaa0 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1164,9 +1164,29 @@
@@ -1163,9 +1163,29 @@
# @actual: the logical size of the VM in bytes Formula used:
# logical_vm_size = vm_ram_size - balloon_size
#
@@ -138,10 +138,10 @@ index d4317435e7..db8ed2e357 100644
##
# @query-balloon:
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 59fbe74b8c..be8fa304c5 100644
index 023a2ef7bc..6aaa9cb975 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -90,6 +90,7 @@
@@ -81,6 +81,7 @@
'member-name-exceptions': [ # visible in:
'ACPISlotType', # query-acpi-ospm-status
'AcpiTableOptions', # -acpitable

View File

@@ -13,10 +13,10 @@ Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 130217da8f..52a6d74820 100644
index 1bc21b84a4..93fb4bc24a 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -90,6 +90,12 @@ MachineInfoList *qmp_query_machines(bool has_compat_props, bool compat_props,
@@ -91,6 +91,12 @@ MachineInfoList *qmp_query_machines(bool has_compat_props, bool compat_props,
info->numa_mem_supported = mc->numa_mem_supported;
info->deprecated = !!mc->deprecation_reason;
info->acpi = !!object_class_property_find(OBJECT_CLASS(mc), "acpi");
@@ -26,14 +26,14 @@ index 130217da8f..52a6d74820 100644
+ info->is_current = true;
+ }
+
if (mc->default_cpu_type) {
info->default_cpu_type = g_strdup(mc->default_cpu_type);
if (default_cpu_type) {
info->default_cpu_type = g_strdup(default_cpu_type);
}
diff --git a/qapi/machine.json b/qapi/machine.json
index db8ed2e357..0c703316f5 100644
index 9f7ed0eaa0..16366b774a 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -168,6 +168,8 @@
@@ -167,6 +167,8 @@
#
# @is-default: whether the machine is default
#
@@ -42,7 +42,7 @@ index db8ed2e357..0c703316f5 100644
# @cpu-max: maximum number of CPUs supported by the machine type
# (since 1.5)
#
@@ -200,7 +202,7 @@
@@ -199,7 +201,7 @@
##
{ 'struct': 'MachineInfo',
'data': { 'name': 'str', '*alias': 'str',

View File

@@ -14,7 +14,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2 files changed, 7 insertions(+)
diff --git a/qapi/ui.json b/qapi/ui.json
index 8c8464faac..cebda37f8f 100644
index c536d4e524..c2df48959b 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -312,11 +312,14 @@
@@ -33,7 +33,7 @@ index 8c8464faac..cebda37f8f 100644
'if': 'CONFIG_SPICE' }
diff --git a/ui/spice-core.c b/ui/spice-core.c
index ea20e6153c..55a15fba8b 100644
index d523d00200..c76c224706 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -548,6 +548,10 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)

View File

@@ -25,7 +25,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
diff --git a/migration/channel-savevm-async.c b/migration/channel-savevm-async.c
new file mode 100644
index 0000000000..081a192f49
index 0000000000..e57ab2ae40
--- /dev/null
+++ b/migration/channel-savevm-async.c
@@ -0,0 +1,184 @@
@@ -35,7 +35,7 @@ index 0000000000..081a192f49
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "qapi/error.h"
+#include "sysemu/block-backend.h"
+#include "system/block-backend.h"
+#include "trace.h"
+
+QIOChannelSavevmAsync *
@@ -271,14 +271,14 @@ index 0000000000..17ae2cb261
+
+#endif /* QIO_CHANNEL_SAVEVM_ASYNC_H */
diff --git a/migration/meson.build b/migration/meson.build
index 5ce2acb41e..020127d901 100644
index 9aa48b290e..cf66c78681 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -13,6 +13,7 @@ system_ss.add(files(
'block-dirty-bitmap.c',
@@ -14,6 +14,7 @@ system_ss.add(files(
'block-active.c',
'channel.c',
'channel-block.c',
+ 'channel-savevm-async.c',
'dirtyrate.c',
'exec.c',
'fd.c',
'cpr.c',
'cpr-transfer.c',
'cpu-throttle.c',

View File

@@ -30,21 +30,24 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
adapt to QAPI and other changes for 8.2
make sure to not call vm_start() from coroutine
stop CPU throttling after finishing
force raw format when loading state as suggested by Friedrich Weber]
force raw format when loading state as suggested by Friedrich Weber
improve setting state in savevm-end handler
improve runstate preservation
use dedicated iothread for state file to avoid deadlock, bug #6262]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hmp-commands-info.hx | 13 +
hmp-commands.hx | 17 ++
hmp-commands.hx | 17 +
include/migration/snapshot.h | 2 +
include/monitor/hmp.h | 3 +
migration/meson.build | 1 +
migration/savevm-async.c | 553 +++++++++++++++++++++++++++++++++++
migration/savevm-async.c | 581 +++++++++++++++++++++++++++++++++++
monitor/hmp-cmds.c | 38 +++
qapi/migration.json | 34 +++
qapi/migration.json | 34 ++
qapi/misc.json | 18 ++
qemu-options.hx | 12 +
system/vl.c | 10 +
11 files changed, 701 insertions(+)
11 files changed, 729 insertions(+)
create mode 100644 migration/savevm-async.c
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
@@ -129,10 +132,10 @@ index ae116d9804..2596cc2426 100644
void coroutine_fn hmp_screendump(Monitor *mon, const QDict *qdict);
void hmp_chardev_add(Monitor *mon, const QDict *qdict);
diff --git a/migration/meson.build b/migration/meson.build
index 020127d901..4b0c4f0f51 100644
index cf66c78681..46e92249a1 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -27,6 +27,7 @@ system_ss.add(files(
@@ -33,6 +33,7 @@ system_ss.add(files(
'options.c',
'postcopy-ram.c',
'savevm.c',
@@ -142,10 +145,10 @@ index 020127d901..4b0c4f0f51 100644
'threadinfo.c',
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
index 0000000000..276f8bfcbb
index 0000000000..56e0fa6c69
--- /dev/null
+++ b/migration/savevm-async.c
@@ -0,0 +1,553 @@
@@ -0,0 +1,581 @@
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "migration/migration.h"
@@ -156,14 +159,14 @@ index 0000000000..276f8bfcbb
+#include "migration/global_state.h"
+#include "migration/ram.h"
+#include "migration/qemu-file.h"
+#include "sysemu/cpu-throttle.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/runstate.h"
+#include "system/cpu-throttle.h"
+#include "system/system.h"
+#include "system/runstate.h"
+#include "block/block.h"
+#include "sysemu/block-backend.h"
+#include "system/block-backend.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
+#include "qapi/qmp/qdict.h"
+#include "qobject/qdict.h"
+#include "qapi/qapi-commands-migration.h"
+#include "qapi/qapi-commands-misc.h"
+#include "qapi/qapi-commands-block.h"
@@ -173,6 +176,7 @@ index 0000000000..276f8bfcbb
+#include "qemu/main-loop.h"
+#include "qemu/rcu.h"
+#include "qemu/yank.h"
+#include "system/iothread.h"
+
+/* #define DEBUG_SAVEVM_STATE */
+
@@ -199,12 +203,13 @@ index 0000000000..276f8bfcbb
+ int state;
+ Error *error;
+ Error *blocker;
+ int saved_vm_running;
+ int vm_needs_start;
+ QEMUFile *file;
+ int64_t total_time;
+ QEMUBH *finalize_bh;
+ Coroutine *co;
+ QemuCoSleep target_close_wait;
+ IOThread *iothread;
+} snap_state;
+
+static bool savevm_aborted(void)
@@ -262,6 +267,7 @@ index 0000000000..276f8bfcbb
+ }
+
+ if (snap_state.target) {
+ BlockDriverState *target_bs = blk_bs(snap_state.target);
+ if (!savevm_aborted()) {
+ /* try to truncate, but ignore errors (will fail on block devices).
+ * note1: bdrv_read() need whole blocks, so we need to round up
@@ -270,7 +276,9 @@ index 0000000000..276f8bfcbb
+ size_t size = QEMU_ALIGN_UP(snap_state.bs_pos, BDRV_SECTOR_SIZE*2);
+ blk_truncate(snap_state.target, size, false, PREALLOC_MODE_OFF, 0, NULL);
+ }
+ blk_op_unblock_all(snap_state.target, snap_state.blocker);
+ if (target_bs) {
+ bdrv_op_unblock_all(target_bs, snap_state.blocker);
+ }
+ error_free(snap_state.blocker);
+ snap_state.blocker = NULL;
+ blk_unref(snap_state.target);
@@ -323,6 +331,7 @@ index 0000000000..276f8bfcbb
+ */
+ blk_set_aio_context(snap_state.target, qemu_get_aio_context(), NULL);
+
+ snap_state.vm_needs_start = runstate_is_running();
+ ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
+ if (ret < 0) {
+ save_snapshot_error("vm_stop_force_state error %d", ret);
@@ -330,7 +339,7 @@ index 0000000000..276f8bfcbb
+
+ if (!aborted) {
+ /* skip state saving if we aborted, snapshot will be invalid anyway */
+ (void)qemu_savevm_state_complete_precopy(snap_state.file, false, false);
+ (void)qemu_savevm_state_complete_precopy(snap_state.file, false);
+ ret = qemu_file_get_error(snap_state.file);
+ if (ret < 0) {
+ save_snapshot_error("qemu_savevm_state_complete_precopy error %d", ret);
@@ -369,9 +378,9 @@ index 0000000000..276f8bfcbb
+ save_snapshot_error("process_savevm_cleanup: invalid state: %d",
+ snap_state.state);
+ }
+ if (snap_state.saved_vm_running) {
+ if (snap_state.vm_needs_start) {
+ vm_start();
+ snap_state.saved_vm_running = false;
+ snap_state.vm_needs_start = false;
+ }
+
+ DPRINTF("timing: process_savevm_finalize (full) took %ld ms\n",
@@ -400,16 +409,13 @@ index 0000000000..276f8bfcbb
+ uint64_t threshold = 400 * 1000;
+
+ /*
+ * pending_{estimate,exact} are expected to be called without iothread
+ * lock. Similar to what is done in migration.c, call the exact variant
+ * only once pend_precopy in the estimate is below the threshold.
+ * Similar to what is done in migration.c, call the exact variant only
+ * once pend_precopy in the estimate is below the threshold.
+ */
+ bql_unlock();
+ qemu_savevm_state_pending_estimate(&pend_precopy, &pend_postcopy);
+ if (pend_precopy <= threshold) {
+ qemu_savevm_state_pending_exact(&pend_precopy, &pend_postcopy);
+ }
+ bql_lock();
+ pending_size = pend_precopy + pend_postcopy;
+
+ /*
@@ -476,11 +482,18 @@ index 0000000000..276f8bfcbb
+ qemu_bh_schedule(snap_state.finalize_bh);
+}
+
+static void savevm_cleanup_iothread(void) {
+ if (snap_state.iothread) {
+ iothread_destroy(snap_state.iothread);
+ snap_state.iothread = NULL;
+ }
+}
+
+void qmp_savevm_start(const char *statefile, Error **errp)
+{
+ Error *local_err = NULL;
+ MigrationState *ms = migrate_get_current();
+ AioContext *iohandler_ctx = iohandler_get_aio_context();
+ BlockDriverState *target_bs = NULL;
+ int ret = 0;
+
+ int bdrv_oflags = BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_NO_FLUSH;
@@ -496,7 +509,6 @@ index 0000000000..276f8bfcbb
+ }
+
+ /* initialize snapshot info */
+ snap_state.saved_vm_running = runstate_is_running();
+ snap_state.bs_pos = 0;
+ snap_state.total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+ snap_state.blocker = NULL;
@@ -508,13 +520,27 @@ index 0000000000..276f8bfcbb
+ }
+
+ if (!statefile) {
+ snap_state.vm_needs_start = runstate_is_running();
+ vm_stop(RUN_STATE_SAVE_VM);
+ snap_state.state = SAVE_STATE_COMPLETED;
+ return;
+ }
+
+ if (qemu_savevm_state_blocked(errp)) {
+ return;
+ goto fail;
+ }
+
+ if (snap_state.iothread) {
+ /* This is not expected, so warn about it, but no point in re-creating a new iothread. */
+ warn_report("iothread for snapshot already exists - re-using");
+ } else {
+ snap_state.iothread =
+ iothread_create("__proxmox_savevm_async_iothread__", &local_err);
+ if (!snap_state.iothread) {
+ error_setg(errp, "creating iothread failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ goto fail;
+ }
+ }
+
+ /* Open the image */
@@ -524,7 +550,12 @@ index 0000000000..276f8bfcbb
+ snap_state.target = blk_new_open(statefile, NULL, options, bdrv_oflags, &local_err);
+ if (!snap_state.target) {
+ error_setg(errp, "failed to open '%s'", statefile);
+ goto restart;
+ goto fail;
+ }
+ target_bs = blk_bs(snap_state.target);
+ if (!target_bs) {
+ error_setg(errp, "failed to open '%s' - no block driver state", statefile);
+ goto fail;
+ }
+
+ QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
@@ -533,7 +564,7 @@ index 0000000000..276f8bfcbb
+
+ if (!snap_state.file) {
+ error_setg(errp, "failed to open '%s'", statefile);
+ goto restart;
+ goto fail;
+ }
+
+ /*
@@ -541,14 +572,14 @@ index 0000000000..276f8bfcbb
+ * State is cleared in process_savevm_co, but has to be initialized
+ * here (blocking main thread, from QMP) to avoid race conditions.
+ */
+ if (migrate_init(ms, errp)) {
+ return;
+ if (migrate_init(ms, errp) != 0) {
+ goto fail;
+ }
+ memset(&mig_stats, 0, sizeof(mig_stats));
+ ms->to_dst_file = snap_state.file;
+
+ error_setg(&snap_state.blocker, "block device is in use by savevm");
+ blk_op_block_all(snap_state.target, snap_state.blocker);
+ bdrv_op_block_all(target_bs, snap_state.blocker);
+
+ snap_state.state = SAVE_STATE_ACTIVE;
+ snap_state.finalize_bh = qemu_bh_new(process_savevm_finalize, &snap_state);
@@ -557,56 +588,49 @@ index 0000000000..276f8bfcbb
+ if (ret != 0) {
+ error_setg_errno(errp, -ret, "savevm state setup failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ return;
+ goto fail;
+ }
+
+ /* Async processing from here on out happens in iohandler context, so let
+ * the target bdrv have its home there.
+ */
+ ret = blk_set_aio_context(snap_state.target, iohandler_ctx, &local_err);
+ ret = blk_set_aio_context(snap_state.target, snap_state.iothread->ctx, &local_err);
+ if (ret != 0) {
+ warn_report("failed to set iohandler context for VM state target: %s %s",
+ local_err ? error_get_pretty(local_err) : "unknown error",
+ strerror(-ret));
+ error_setg_errno(errp, -ret, "failed to set iothread context for VM state target: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ goto fail;
+ }
+
+ snap_state.co = qemu_coroutine_create(&process_savevm_co, NULL);
+ aio_co_schedule(iohandler_ctx, snap_state.co);
+ aio_co_schedule(snap_state.iothread->ctx, snap_state.co);
+
+ return;
+
+restart:
+
+fail:
+ savevm_cleanup_iothread();
+ save_snapshot_error("setup failed");
+
+ if (snap_state.saved_vm_running) {
+ vm_start();
+ snap_state.saved_vm_running = false;
+ }
+}
+
+static void coroutine_fn wait_for_close_co(void *opaque)
+{
+ int64_t timeout;
+
+ if (!snap_state.target) {
+ if (snap_state.target) {
+ /* wait until cleanup is done before returning, this ensures that after this
+ * call exits the statefile will be closed and can be removed immediately */
+ DPRINTF("savevm-end: waiting for cleanup\n");
+ timeout = 30L * 1000 * 1000 * 1000;
+ qemu_co_sleep_ns_wakeable(&snap_state.target_close_wait,
+ QEMU_CLOCK_REALTIME, timeout);
+ if (snap_state.target) {
+ save_snapshot_error("timeout waiting for target file close in "
+ "qmp_savevm_end");
+ /* we cannot assume the snapshot finished in this case, so leave the
+ * state alone - caller has to figure something out */
+ return;
+ }
+ } else {
+ DPRINTF("savevm-end: no target file open\n");
+ return;
+ }
+
+ /* wait until cleanup is done before returning, this ensures that after this
+ * call exits the statefile will be closed and can be removed immediately */
+ DPRINTF("savevm-end: waiting for cleanup\n");
+ timeout = 30L * 1000 * 1000 * 1000;
+ qemu_co_sleep_ns_wakeable(&snap_state.target_close_wait,
+ QEMU_CLOCK_REALTIME, timeout);
+ if (snap_state.target) {
+ save_snapshot_error("timeout waiting for target file close in "
+ "qmp_savevm_end");
+ /* we cannot assume the snapshot finished in this case, so leave the
+ * state alone - caller has to figure something out */
+ return;
+ }
+ savevm_cleanup_iothread();
+
+ // File closed and no other error, so ensure next snapshot can be started.
+ if (snap_state.state != SAVE_STATE_ERROR) {
@@ -631,19 +655,18 @@ index 0000000000..276f8bfcbb
+ return;
+ }
+
+ if (snap_state.saved_vm_running) {
+ if (snap_state.vm_needs_start) {
+ vm_start();
+ snap_state.saved_vm_running = false;
+ snap_state.vm_needs_start = false;
+ }
+
+ snap_state.state = SAVE_STATE_DONE;
+
+ qemu_coroutine_enter(wait_for_close);
+}
+
+int load_snapshot_from_blockdev(const char *filename, Error **errp)
+{
+ BlockBackend *be;
+ BlockDriverState *bs = NULL;
+ Error *local_err = NULL;
+ Error *blocker = NULL;
+ QDict *options;
@@ -662,8 +685,14 @@ index 0000000000..276f8bfcbb
+ goto the_end;
+ }
+
+ bs = blk_bs(be);
+ if (!bs) {
+ error_setg(errp, "Could not open VM state file - missing block driver state");
+ goto the_end;
+ }
+
+ error_setg(&blocker, "block device is in use by load state");
+ blk_op_block_all(be, blocker);
+ bdrv_op_block_all(bs, blocker);
+
+ /* restore the VM state */
+ f = qemu_file_new_input(QIO_CHANNEL(qio_channel_savevm_async_new(be, &bs_pos)));
@@ -693,14 +722,16 @@ index 0000000000..276f8bfcbb
+
+ the_end:
+ if (be) {
+ blk_op_unblock_all(be, blocker);
+ if (bs) {
+ bdrv_op_unblock_all(bs, blocker);
+ }
+ error_free(blocker);
+ blk_unref(be);
+ }
+ return ret;
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index f601d06ab8..874084565f 100644
index 7ded3378cf..bade2a4b92 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -24,6 +24,7 @@
@@ -709,10 +740,10 @@ index f601d06ab8..874084565f 100644
#include "qapi/qapi-commands-machine.h"
+#include "qapi/qapi-commands-migration.h"
#include "qapi/qapi-commands-misc.h"
#include "qapi/qmp/qdict.h"
#include "qobject/qdict.h"
#include "qemu/cutils.h"
@@ -434,3 +435,40 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "dtb dumped to %s", filename);
monitor_printf(mon, "DTB dumped to '%s'\n", filename);
}
#endif
+
@@ -753,10 +784,10 @@ index f601d06ab8..874084565f 100644
+ }
+}
diff --git a/qapi/migration.json b/qapi/migration.json
index 7324571e92..d6e94a7c41 100644
index 8b9c53595c..ff3479da65 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -276,6 +276,40 @@
@@ -279,6 +279,40 @@
'*dirty-limit-throttle-time-per-round': 'uint64',
'*dirty-limit-ring-full-time': 'uint64'} }
@@ -827,10 +858,10 @@ index 559b66f201..7959e89c1e 100644
# @CommandLineParameterType:
#
diff --git a/qemu-options.hx b/qemu-options.hx
index d94e2cbbae..07730f9e65 100644
index dc694a99a3..defee0c06a 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4805,6 +4805,18 @@ SRST
@@ -4862,6 +4862,18 @@ SRST
Start right away with a saved state (``loadvm`` in monitor)
ERST
@@ -850,10 +881,10 @@ index d94e2cbbae..07730f9e65 100644
DEF("daemonize", 0, QEMU_OPTION_daemonize, \
"-daemonize daemonize QEMU after initializing\n", QEMU_ARCH_ALL)
diff --git a/system/vl.c b/system/vl.c
index 01b8b8e77a..d6bbdc906e 100644
index ec93988a03..9b36ace6b4 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -164,6 +164,7 @@ static const char *accelerators;
@@ -171,6 +171,7 @@ static const char *accelerators;
static bool have_custom_ram_size;
static const char *ram_memdev_id;
static QDict *machine_opts_dict;
@@ -861,7 +892,7 @@ index 01b8b8e77a..d6bbdc906e 100644
static QTAILQ_HEAD(, ObjectOption) object_opts = QTAILQ_HEAD_INITIALIZER(object_opts);
static QTAILQ_HEAD(, DeviceOption) device_opts = QTAILQ_HEAD_INITIALIZER(device_opts);
static int display_remote;
@@ -2727,6 +2728,12 @@ void qmp_x_exit_preconfig(Error **errp)
@@ -2814,6 +2815,12 @@ void qmp_x_exit_preconfig(Error **errp)
RunState state = autostart ? RUN_STATE_RUNNING : runstate_get();
load_snapshot(loadvm, NULL, false, NULL, &error_fatal);
load_snapshot_resume(state);
@@ -874,7 +905,7 @@ index 01b8b8e77a..d6bbdc906e 100644
}
if (replay_mode != REPLAY_MODE_NONE) {
replay_vmstate_init();
@@ -3275,6 +3282,9 @@ void qemu_init(int argc, char **argv)
@@ -3360,6 +3367,9 @@ void qemu_init(int argc, char **argv)
case QEMU_OPTION_loadvm:
loadvm = optarg;
break;

View File

@@ -19,7 +19,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
3 files changed, 38 insertions(+), 17 deletions(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index b6d2f588bd..754dc0b3f7 100644
index 1303a5bf58..6e2d58d5c0 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -34,8 +34,8 @@
@@ -31,9 +31,9 @@ index b6d2f588bd..754dc0b3f7 100644
+#define DEFAULT_IO_BUF_SIZE 32768
+#define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 256)
struct QEMUFile {
QIOChannel *ioc;
@@ -43,7 +43,8 @@ struct QEMUFile {
typedef struct FdEntry {
QTAILQ_ENTRY(FdEntry) entry;
@@ -48,7 +48,8 @@ struct QEMUFile {
int buf_index;
int buf_size; /* 0 when writing */
@@ -43,7 +43,7 @@ index b6d2f588bd..754dc0b3f7 100644
DECLARE_BITMAP(may_free, MAX_IOV_SIZE);
struct iovec iov[MAX_IOV_SIZE];
@@ -100,7 +101,9 @@ int qemu_file_shutdown(QEMUFile *f)
@@ -108,7 +109,9 @@ int qemu_file_shutdown(QEMUFile *f)
return 0;
}
@@ -54,16 +54,16 @@ index b6d2f588bd..754dc0b3f7 100644
{
QEMUFile *f;
@@ -109,6 +112,8 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
object_ref(ioc);
f->ioc = ioc;
@@ -119,6 +122,8 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
f->is_writable = is_writable;
f->can_pass_fd = qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS);
QTAILQ_INIT(&f->fds);
+ f->buf_allocated_size = buffer_size;
+ f->buf = malloc(buffer_size);
return f;
}
@@ -119,17 +124,27 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
@@ -129,17 +134,27 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
*/
QEMUFile *qemu_file_get_return_path(QEMUFile *f)
{
@@ -94,17 +94,17 @@ index b6d2f588bd..754dc0b3f7 100644
}
/*
@@ -327,7 +342,7 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f)
@@ -339,7 +354,7 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f)
}
do {
len = qio_channel_read(f->ioc,
(char *)f->buf + pending,
- IO_BUF_SIZE - pending,
+ f->buf_allocated_size - pending,
&local_error);
- struct iovec iov = { f->buf + pending, IO_BUF_SIZE - pending };
+ struct iovec iov = { f->buf + pending, f->buf_allocated_size - pending };
len = qio_channel_readv_full(f->ioc, &iov, 1, pfds, pnfd, 0,
&local_error);
if (len == QIO_CHANNEL_ERR_BLOCK) {
if (qemu_in_coroutine()) {
@@ -367,6 +382,9 @@ int qemu_fclose(QEMUFile *f)
ret = ret2;
@@ -443,6 +458,9 @@ int qemu_fclose(QEMUFile *f)
g_free(fde);
}
g_clear_pointer(&f->ioc, object_unref);
+
@@ -113,7 +113,7 @@ index b6d2f588bd..754dc0b3f7 100644
error_free(f->last_error_obj);
g_free(f);
trace_qemu_file_fclose();
@@ -415,7 +433,7 @@ static void add_buf_to_iovec(QEMUFile *f, size_t len)
@@ -491,7 +509,7 @@ static void add_buf_to_iovec(QEMUFile *f, size_t len)
{
if (!add_to_iovec(f, f->buf + f->buf_index, len, false)) {
f->buf_index += len;
@@ -122,7 +122,7 @@ index b6d2f588bd..754dc0b3f7 100644
qemu_fflush(f);
}
}
@@ -440,7 +458,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
@@ -516,7 +534,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
}
while (size > 0) {
@@ -131,7 +131,7 @@ index b6d2f588bd..754dc0b3f7 100644
if (l > size) {
l = size;
}
@@ -586,8 +604,8 @@ size_t coroutine_mixed_fn qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t si
@@ -662,8 +680,8 @@ size_t coroutine_mixed_fn qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t si
size_t index;
assert(!qemu_file_is_writable(f));
@@ -142,7 +142,7 @@ index b6d2f588bd..754dc0b3f7 100644
/* The 1st byte to read from */
index = f->buf_index + offset;
@@ -637,7 +655,7 @@ size_t coroutine_mixed_fn qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size
@@ -713,7 +731,7 @@ size_t coroutine_mixed_fn qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size
size_t res;
uint8_t *src;
@@ -151,7 +151,7 @@ index b6d2f588bd..754dc0b3f7 100644
if (res == 0) {
return done;
}
@@ -671,7 +689,7 @@ size_t coroutine_mixed_fn qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size
@@ -747,7 +765,7 @@ size_t coroutine_mixed_fn qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size
*/
size_t coroutine_mixed_fn qemu_get_buffer_in_place(QEMUFile *f, uint8_t **buf, size_t size)
{
@@ -160,7 +160,7 @@ index b6d2f588bd..754dc0b3f7 100644
size_t res;
uint8_t *src = NULL;
@@ -696,7 +714,7 @@ int coroutine_mixed_fn qemu_peek_byte(QEMUFile *f, int offset)
@@ -772,7 +790,7 @@ int coroutine_mixed_fn qemu_peek_byte(QEMUFile *f, int offset)
int index = f->buf_index + offset;
assert(!qemu_file_is_writable(f));
@@ -170,7 +170,7 @@ index b6d2f588bd..754dc0b3f7 100644
if (index >= f->buf_size) {
qemu_fill_buffer(f);
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 11c2120edd..edf3c5d147 100644
index f5b9f430e0..0179b90698 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -30,7 +30,9 @@
@@ -182,12 +182,12 @@ index 11c2120edd..edf3c5d147 100644
+QEMUFile *qemu_file_new_output_sized(QIOChannel *ioc, size_t buffer_size);
int qemu_fclose(QEMUFile *f);
/*
G_DEFINE_AUTOPTR_CLEANUP_FUNC(QEMUFile, qemu_fclose)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 276f8bfcbb..30921d7e9f 100644
index 56e0fa6c69..730b815494 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -381,7 +381,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
@@ -409,7 +409,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
&snap_state.bs_pos));
@@ -196,8 +196,8 @@ index 276f8bfcbb..30921d7e9f 100644
if (!snap_state.file) {
error_setg(errp, "failed to open '%s'", statefile);
@@ -518,7 +518,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
blk_op_block_all(be, blocker);
@@ -544,7 +544,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
bdrv_op_block_all(bs, blocker);
/* restore the VM state */
- f = qemu_file_new_input(QIO_CHANNEL(qio_channel_savevm_async_new(be, &bs_pos)));

View File

@@ -15,7 +15,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
create mode 100644 block/zeroinit.c
diff --git a/block/meson.build b/block/meson.build
index f1262ec2ba..6a60b5d6b9 100644
index 34b1b2a306..a21d9a5411 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -39,6 +39,7 @@ block_ss.add(files(
@@ -28,7 +28,7 @@ index f1262ec2ba..6a60b5d6b9 100644
system_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
diff --git a/block/zeroinit.c b/block/zeroinit.c
new file mode 100644
index 0000000000..2b2b194ccf
index 0000000000..f9d513db15
--- /dev/null
+++ b/block/zeroinit.c
@@ -0,0 +1,207 @@
@@ -47,8 +47,8 @@ index 0000000000..2b2b194ccf
+#include "block/block_int.h"
+#include "block/block-io.h"
+#include "block/graph-lock.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qobject/qdict.h"
+#include "qobject/qstring.h"
+#include "qemu/cutils.h"
+#include "qemu/option.h"
+#include "qemu/module.h"

View File

@@ -14,10 +14,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 11 insertions(+)
diff --git a/qemu-options.hx b/qemu-options.hx
index 07730f9e65..7fdc944965 100644
index defee0c06a..fb980a05cf 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1239,6 +1239,9 @@ legacy PC, they are not recommended for modern configurations.
@@ -1280,6 +1280,9 @@ legacy PC, they are not recommended for modern configurations.
ERST
@@ -28,10 +28,10 @@ index 07730f9e65..7fdc944965 100644
"-fda/-fdb file use 'file' as floppy disk 0/1 image\n", QEMU_ARCH_ALL)
DEF("fdb", HAS_ARG, QEMU_OPTION_fdb, "", QEMU_ARCH_ALL)
diff --git a/system/vl.c b/system/vl.c
index d6bbdc906e..200468a753 100644
index 9b36ace6b4..452742ab58 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -2764,6 +2764,7 @@ void qemu_init(int argc, char **argv)
@@ -2854,6 +2854,7 @@ void qemu_init(int argc, char **argv)
MachineClass *machine_class;
bool userconfig = true;
FILE *vmstate_dump_file = NULL;
@@ -39,7 +39,7 @@ index d6bbdc906e..200468a753 100644
qemu_add_opts(&qemu_drive_opts);
qemu_add_drive_opts(&qemu_legacy_drive_opts);
@@ -3387,6 +3388,13 @@ void qemu_init(int argc, char **argv)
@@ -3472,6 +3473,13 @@ void qemu_init(int argc, char **argv)
machine_parse_property_opt(qemu_find_opts("smp-opts"),
"smp", optarg);
break;

View File

@@ -11,7 +11,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index c13cdd7994..fd5808cdc0 100644
index 2a3e878c4d..efbed1aea3 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -263,6 +263,15 @@ static void apic_reset_common(DeviceState *dev)

View File

@@ -13,10 +13,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 46 insertions(+), 20 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 99e5bea1cc..6a4f6a25e6 100644
index cfa0b832ba..d5c28cccc9 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2884,6 +2884,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2897,6 +2897,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
int fd;
uint64_t perm, shared;
int result = 0;
@@ -24,7 +24,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
/* Validate options and set default values */
assert(options->driver == BLOCKDEV_DRIVER_FILE);
@@ -2924,19 +2925,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2937,19 +2938,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
@@ -59,7 +59,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
}
/* Clear the file by truncating it to 0 */
@@ -2990,13 +2994,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -3003,13 +3007,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
}
out_unlock:
@@ -82,7 +82,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
}
out_close:
@@ -3020,6 +3026,7 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -3033,6 +3039,7 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
@@ -90,7 +90,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
/* Skip file: protocol prefix */
strstart(filename, "file:", &filename);
@@ -3042,6 +3049,18 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -3055,6 +3062,18 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
return -EINVAL;
}
@@ -109,7 +109,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
options = (BlockdevCreateOptions) {
.driver = BLOCKDEV_DRIVER_FILE,
.u.file = {
@@ -3053,6 +3072,8 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -3066,6 +3085,8 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
.nocow = nocow,
.has_extent_size_hint = has_extent_size_hint,
.extent_size_hint = extent_size_hint,
@@ -119,10 +119,10 @@ index 99e5bea1cc..6a4f6a25e6 100644
};
return raw_co_create(&options, errp);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index c2a337cc04..1cb6f04db3 100644
index 0e5f148d30..1c05413916 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4959,6 +4959,10 @@
@@ -5016,6 +5016,10 @@
# @extent-size-hint: Extent size hint to add to the image file; 0 for
# not adding an extent size hint (default: 1 MB, since 5.1)
#
@@ -133,7 +133,7 @@ index c2a337cc04..1cb6f04db3 100644
# Since: 2.12
##
{ 'struct': 'BlockdevCreateOptionsFile',
@@ -4966,7 +4970,8 @@
@@ -5023,7 +5027,8 @@
'size': 'size',
'*preallocation': 'PreallocMode',
'*nocow': 'bool',

View File

@@ -18,7 +18,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/monitor/qmp.c b/monitor/qmp.c
index eb181d5979..20fc0d20a6 100644
index f093e256e9..78f1c8e3c8 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -534,8 +534,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)

View File

@@ -26,10 +26,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 27dcda0248..7a13e9f014 100644
index 63c6ef93d2..9a34017e5a 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -173,7 +173,8 @@ GlobalProperty hw_compat_4_0[] = {
@@ -193,7 +193,8 @@ GlobalProperty hw_compat_4_0[] = {
{ "virtio-vga", "edid", "false" },
{ "virtio-gpu-device", "edid", "false" },
{ "virtio-device", "use-started", "false" },

View File

@@ -21,10 +21,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
4 files changed, 34 insertions(+)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 52a6d74820..362128842d 100644
index 93fb4bc24a..b9999423b4 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -94,6 +94,11 @@ MachineInfoList *qmp_query_machines(bool has_compat_props, bool compat_props,
@@ -95,6 +95,11 @@ MachineInfoList *qmp_query_machines(bool has_compat_props, bool compat_props,
if (strcmp(mc->name, MACHINE_GET_CLASS(current_machine)->name) == 0) {
info->has_is_current = true;
info->is_current = true;
@@ -35,25 +35,25 @@ index 52a6d74820..362128842d 100644
+ }
}
if (mc->default_cpu_type) {
if (default_cpu_type) {
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 48ff6d8b93..5cddeb7fcb 100644
index f22b2e7fc7..8ada4d5832 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -252,6 +252,8 @@ struct MachineClass {
@@ -271,6 +271,8 @@ struct MachineClass {
const char *desc;
const char *deprecation_reason;
+ const char *pve_version;
+
void (*init)(MachineState *state);
void (*reset)(MachineState *state, ShutdownCause reason);
void (*reset)(MachineState *state, ResetType type);
void (*wakeup)(MachineState *state);
diff --git a/qapi/machine.json b/qapi/machine.json
index 0c703316f5..dc46a3e93f 100644
index 16366b774a..12cfd3f260 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -190,6 +190,8 @@
@@ -189,6 +189,8 @@
#
# @acpi: machine type supports ACPI (since 8.0)
#
@@ -62,7 +62,7 @@ index 0c703316f5..dc46a3e93f 100644
# @compat-props: The machine type's compatibility properties. Only
# present when query-machines argument @compat-props is true.
# (since 9.1)
@@ -206,6 +208,7 @@
@@ -205,6 +207,7 @@
'hotpluggable-cpus': 'bool', 'numa-mem-supported': 'bool',
'deprecated': 'bool', '*default-cpu-type': 'str',
'*default-ram-id': 'str', 'acpi': 'bool',
@@ -71,10 +71,10 @@ index 0c703316f5..dc46a3e93f 100644
'features': ['unstable'] } } }
diff --git a/system/vl.c b/system/vl.c
index 200468a753..0dbdba6421 100644
index 452742ab58..c3707b2412 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -1675,6 +1675,7 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
@@ -1674,6 +1674,7 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
{
ERRP_GUARD();
const char *machine_type = qdict_get_try_str(qdict, "type");
@@ -82,7 +82,7 @@ index 200468a753..0dbdba6421 100644
g_autoptr(GSList) machines = object_class_get_list(TYPE_MACHINE, false);
MachineClass *machine_class = NULL;
@@ -1694,7 +1695,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
@@ -1693,7 +1694,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
if (!machine_class) {
error_append_hint(errp,
"Use -machine help to list supported machines\n");
@@ -94,7 +94,7 @@ index 200468a753..0dbdba6421 100644
return machine_class;
}
@@ -3329,12 +3334,31 @@ void qemu_init(int argc, char **argv)
@@ -3414,12 +3419,31 @@ void qemu_init(int argc, char **argv)
case QEMU_OPTION_machine:
{
bool help;

View File

@@ -25,7 +25,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 3dd2e229d2..eba5b11493 100644
index 79652bf57b..cc747e9163 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -237,8 +237,8 @@ static void backup_init_bcs_bitmap(BackupBlockJob *job)

View File

@@ -16,18 +16,18 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 2 +
meson.build | 5 +
vma-reader.c | 868 ++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 817 ++++++++++++++++++++++++++++++++++++++++
vma-reader.c | 867 ++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 816 ++++++++++++++++++++++++++++++++++++++++
vma.c | 941 ++++++++++++++++++++++++++++++++++++++++++++++
vma.h | 150 ++++++++
6 files changed, 2783 insertions(+)
6 files changed, 2781 insertions(+)
create mode 100644 vma-reader.c
create mode 100644 vma-writer.c
create mode 100644 vma.c
create mode 100644 vma.h
diff --git a/block/meson.build b/block/meson.build
index 6a60b5d6b9..652c8cbdb7 100644
index a21d9a5411..1373612c10 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -42,6 +42,8 @@ block_ss.add(files(
@@ -40,10 +40,10 @@ index 6a60b5d6b9..652c8cbdb7 100644
system_ss.add(files('block-ram-registrar.c'))
diff --git a/meson.build b/meson.build
index aa7ea85d0b..7eee5b4249 100644
index 8ec796d835..680ab48b9b 100644
--- a/meson.build
+++ b/meson.build
@@ -2012,6 +2012,8 @@ endif
@@ -2161,6 +2161,8 @@ endif
has_gettid = cc.has_function('gettid')
@@ -52,7 +52,7 @@ index aa7ea85d0b..7eee5b4249 100644
# libselinux
selinux = dependency('libselinux',
required: get_option('selinux'),
@@ -4097,6 +4099,9 @@ if have_tools
@@ -4367,6 +4369,9 @@ if have_tools
dependencies: [blockdev, qemuutil, selinux],
install: true)
@@ -64,10 +64,10 @@ index aa7ea85d0b..7eee5b4249 100644
foreach exe: [ 'qemu-img', 'qemu-io', 'qemu-nbd', 'qemu-storage-daemon']
diff --git a/vma-reader.c b/vma-reader.c
new file mode 100644
index 0000000000..65015d2e1e
index 0000000000..1888b21851
--- /dev/null
+++ b/vma-reader.c
@@ -0,0 +1,868 @@
@@ -0,0 +1,867 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -90,7 +90,7 @@ index 0000000000..65015d2e1e
+#include "vma.h"
+#include "block/block.h"
+#include "block/graph-lock.h"
+#include "sysemu/block-backend.h"
+#include "system/block-backend.h"
+
+static unsigned char zero_vma_block[VMA_BLOCK_SIZE];
+
@@ -883,8 +883,7 @@ index 0000000000..65015d2e1e
+
+ int64_t cluster_num, end;
+
+ end = (vmar->devinfo[i].size + VMA_CLUSTER_SIZE - 1) /
+ VMA_CLUSTER_SIZE;
+ end = DIV_ROUND_UP(vmar->devinfo[i].size, VMA_CLUSTER_SIZE);
+
+ for (cluster_num = 0; cluster_num < end; cluster_num++) {
+ if (!vma_reader_get_bitmap(rstate, cluster_num)) {
@@ -938,10 +937,10 @@ index 0000000000..65015d2e1e
+
diff --git a/vma-writer.c b/vma-writer.c
new file mode 100644
index 0000000000..a466652a5d
index 0000000000..3f489092df
--- /dev/null
+++ b/vma-writer.c
@@ -0,0 +1,817 @@
@@ -0,0 +1,816 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -1135,8 +1134,7 @@ index 0000000000..a466652a5d
+ vmaw->stream_info[n].devname = g_strdup(devname);
+ vmaw->stream_info[n].size = size;
+
+ vmaw->stream_info[n].cluster_count = (size + VMA_CLUSTER_SIZE - 1) /
+ VMA_CLUSTER_SIZE;
+ vmaw->stream_info[n].cluster_count = DIV_ROUND_UP(size, VMA_CLUSTER_SIZE);
+
+ vmaw->stream_count = n;
+
@@ -1761,7 +1759,7 @@ index 0000000000..a466652a5d
+}
diff --git a/vma.c b/vma.c
new file mode 100644
index 0000000000..8d4b4be414
index 0000000000..0e990b5e30
--- /dev/null
+++ b/vma.c
@@ -0,0 +1,941 @@
@@ -1787,8 +1785,8 @@ index 0000000000..8d4b4be414
+#include "qemu/main-loop.h"
+#include "qemu/cutils.h"
+#include "qemu/memalign.h"
+#include "qapi/qmp/qdict.h"
+#include "sysemu/block-backend.h"
+#include "qobject/qdict.h"
+#include "system/block-backend.h"
+
+static void help(void)
+{

View File

@@ -22,7 +22,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
diff --git a/block/backup-dump.c b/block/backup-dump.c
new file mode 100644
index 0000000000..e46abf1070
index 0000000000..354593bc10
--- /dev/null
+++ b/block/backup-dump.c
@@ -0,0 +1,172 @@
@@ -38,7 +38,7 @@ index 0000000000..e46abf1070
+
+#include "qemu/osdep.h"
+
+#include "qapi/qmp/qdict.h"
+#include "qobject/qdict.h"
+#include "qom/object_interfaces.h"
+#include "block/block_int.h"
+
@@ -199,7 +199,7 @@ index 0000000000..e46abf1070
+ return bs;
+}
diff --git a/block/backup.c b/block/backup.c
index eba5b11493..1963e47ab9 100644
index cc747e9163..6f7c45f922 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -29,28 +29,6 @@
@@ -247,7 +247,7 @@ index eba5b11493..1963e47ab9 100644
if (perf->max_chunk && perf->max_chunk < cluster_size) {
error_setg(errp, "Required max-chunk (%" PRIi64 ") is less than backup "
diff --git a/block/meson.build b/block/meson.build
index 652c8cbdb7..e1cf5a2e65 100644
index 1373612c10..6278c4af0f 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -4,6 +4,7 @@ block_ss.add(files(
@@ -312,10 +312,10 @@ index ebb4e56a50..e717a74e5f 100644
BDRV_TRACKED_READ,
BDRV_TRACKED_WRITE,
diff --git a/job.c b/job.c
index 660ce22c56..baf54c8d60 100644
index 0653bc2ba6..b981070ee8 100644
--- a/job.c
+++ b/job.c
@@ -331,7 +331,8 @@ static bool job_started_locked(Job *job)
@@ -337,7 +337,8 @@ static bool job_started_locked(Job *job)
}
/* Called with job_mutex held. */

View File

@@ -11,7 +11,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 46 insertions(+)
diff --git a/include/qemu/job.h b/include/qemu/job.h
index 2b873f2576..528cd6acb9 100644
index a5a04155ea..562cc7eaec 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -362,6 +362,18 @@ void job_unlock(void);
@@ -34,7 +34,7 @@ index 2b873f2576..528cd6acb9 100644
* Release a reference that was previously acquired with job_txn_add_job or
* job_txn_new. If it's the last reference to the object, it will be freed.
diff --git a/job.c b/job.c
index baf54c8d60..3ac5e5cde2 100644
index b981070ee8..f4646866ec 100644
--- a/job.c
+++ b/job.c
@@ -94,6 +94,8 @@ struct JobTxn {
@@ -72,7 +72,7 @@ index baf54c8d60..3ac5e5cde2 100644
/* Called with job_mutex held. */
static void job_txn_ref_locked(JobTxn *txn)
{
@@ -1042,6 +1063,12 @@ static void job_completed_txn_success_locked(Job *job)
@@ -1048,6 +1069,12 @@ static void job_completed_txn_success_locked(Job *job)
*/
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
if (!job_is_completed_locked(other_job)) {
@@ -85,7 +85,7 @@ index baf54c8d60..3ac5e5cde2 100644
return;
}
assert(other_job->ret == 0);
@@ -1253,6 +1280,13 @@ int job_finish_sync_locked(Job *job,
@@ -1259,6 +1286,13 @@ int job_finish_sync_locked(Job *job,
return -EBUSY;
}

View File

@@ -94,17 +94,17 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
monitor/hmp-cmds.c | 72 +++
proxmox-backup-client.c | 146 +++++
proxmox-backup-client.h | 60 ++
pve-backup.c | 1092 ++++++++++++++++++++++++++++++++
pve-backup.c | 1096 ++++++++++++++++++++++++++++++++
qapi/block-core.json | 233 +++++++
qapi/common.json | 14 +
qapi/machine.json | 16 +-
14 files changed, 1711 insertions(+), 14 deletions(-)
14 files changed, 1715 insertions(+), 14 deletions(-)
create mode 100644 proxmox-backup-client.c
create mode 100644 proxmox-backup-client.h
create mode 100644 pve-backup.c
diff --git a/block/meson.build b/block/meson.build
index e1cf5a2e65..2367e1ac1b 100644
index 6278c4af0f..d1b16e40e9 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -44,6 +44,11 @@ block_ss.add(files(
@@ -120,10 +120,10 @@ index e1cf5a2e65..2367e1ac1b 100644
system_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
system_ss.add(files('block-ram-registrar.c'))
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index bdf2eb50b6..439a7a14c8 100644
index 6919a49bf5..4f30f99644 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1009,3 +1009,42 @@ void hmp_change_medium(Monitor *mon, const char *device, const char *target,
@@ -1010,3 +1010,42 @@ void hmp_change_medium(Monitor *mon, const char *device, const char *target,
qmp_blockdev_change_medium(device, NULL, target, arg, true, force,
!!read_only, read_only_mode, errp);
}
@@ -167,7 +167,7 @@ index bdf2eb50b6..439a7a14c8 100644
+ hmp_handle_error(mon, error);
+}
diff --git a/blockdev.c b/blockdev.c
index 9cbd166674..8080c47fa6 100644
index 158ac9314b..17de5d2ae4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -37,6 +37,7 @@
@@ -265,10 +265,10 @@ index 2596cc2426..9dda91d65a 100644
void hmp_device_add(Monitor *mon, const QDict *qdict);
void hmp_device_del(Monitor *mon, const QDict *qdict);
diff --git a/meson.build b/meson.build
index 7eee5b4249..979c452f74 100644
index 680ab48b9b..1f74de1d93 100644
--- a/meson.build
+++ b/meson.build
@@ -2013,6 +2013,7 @@ endif
@@ -2162,6 +2162,7 @@ endif
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
@@ -277,7 +277,7 @@ index 7eee5b4249..979c452f74 100644
# libselinux
selinux = dependency('libselinux',
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 874084565f..bedeb81f8c 100644
index bade2a4b92..77a2fa409a 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -22,6 +22,7 @@
@@ -586,22 +586,23 @@ index 0000000000..8cbf645b2c
+#endif /* PROXMOX_BACKUP_CLIENT_H */
diff --git a/pve-backup.c b/pve-backup.c
new file mode 100644
index 0000000000..9f83ecb310
index 0000000000..e931cb9203
--- /dev/null
+++ b/pve-backup.c
@@ -0,0 +1,1092 @@
@@ -0,0 +1,1096 @@
+#include "proxmox-backup-client.h"
+#include "vma.h"
+
+#include "qemu/osdep.h"
+#include "qemu/module.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/blockdev.h"
+#include "system/block-backend.h"
+#include "system/blockdev.h"
+#include "block/block_int-global-state.h"
+#include "block/blockjob.h"
+#include "block/dirty-bitmap.h"
+#include "block/graph-lock.h"
+#include "qapi/qapi-commands-block.h"
+#include "qobject/qdict.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/cutils.h"
+
@@ -677,6 +678,7 @@ index 0000000000..9f83ecb310
+ size_t size;
+ uint64_t block_size;
+ uint8_t dev_id;
+ char* device_name;
+ int completed_ret; // INT_MAX if not completed
+ BdrvDirtyBitmap *bitmap;
+ BlockDriverState *target;
@@ -910,6 +912,8 @@ index 0000000000..9f83ecb310
+ }
+
+ di->bs = NULL;
+ g_free(di->device_name);
+ di->device_name = NULL;
+
+ assert(di->target == NULL);
+
@@ -1199,6 +1203,8 @@ index 0000000000..9f83ecb310
+ }
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di->device_name = g_strdup(bdrv_get_device_name(bs));
+
+ di_list = g_list_append(di_list, di);
+ d++;
+ }
@@ -1212,6 +1218,7 @@ index 0000000000..9f83ecb310
+
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di->device_name = g_strdup(bdrv_get_device_name(bs));
+ di_list = g_list_append(di_list, di);
+ }
+ }
@@ -1378,9 +1385,6 @@ index 0000000000..9f83ecb310
+
+ di->block_size = dump_cb_block_size;
+
+ bdrv_graph_co_rdlock();
+ const char *devname = bdrv_get_device_name(di->bs);
+ bdrv_graph_co_rdunlock();
+ PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
+ size_t dirty = di->size;
+
@@ -1395,7 +1399,8 @@ index 0000000000..9f83ecb310
+ }
+ action = PBS_BITMAP_ACTION_NEW;
+ } else {
+ expect_only_dirty = proxmox_backup_check_incremental(pbs, devname, di->size) != 0;
+ expect_only_dirty =
+ proxmox_backup_check_incremental(pbs, di->device_name, di->size) != 0;
+ }
+
+ if (expect_only_dirty) {
@@ -1419,7 +1424,8 @@ index 0000000000..9f83ecb310
+ }
+ }
+
+ int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, errp);
+ int dev_id = proxmox_backup_co_register_image(pbs, di->device_name, di->size,
+ expect_only_dirty, errp);
+ if (dev_id < 0) {
+ goto err_mutex;
+ }
@@ -1431,7 +1437,7 @@ index 0000000000..9f83ecb310
+ di->dev_id = dev_id;
+
+ PBSBitmapInfo *info = g_malloc(sizeof(*info));
+ info->drive = g_strdup(devname);
+ info->drive = g_strdup(di->device_name);
+ info->action = action;
+ info->size = di->size;
+ info->dirty = dirty;
@@ -1440,9 +1446,7 @@ index 0000000000..9f83ecb310
+ } else if (format == BACKUP_FORMAT_VMA) {
+ vmaw = vma_writer_create(backup_file, uuid, &local_err);
+ if (!vmaw) {
+ if (local_err) {
+ error_propagate(errp, local_err);
+ }
+ error_propagate(errp, local_err);
+ goto err_mutex;
+ }
+
@@ -1456,10 +1460,7 @@ index 0000000000..9f83ecb310
+ goto err_mutex;
+ }
+
+ bdrv_graph_co_rdlock();
+ const char *devname = bdrv_get_device_name(di->bs);
+ bdrv_graph_co_rdunlock();
+ di->dev_id = vma_writer_register_stream(vmaw, devname, di->size);
+ di->dev_id = vma_writer_register_stream(vmaw, di->device_name, di->size);
+ if (di->dev_id <= 0) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "register_stream failed");
@@ -1570,6 +1571,9 @@ index 0000000000..9f83ecb310
+ bdrv_co_unref(di->target);
+ }
+
+ g_free(di->device_name);
+ di->device_name = NULL;
+
+ g_free(di);
+ }
+ g_list_free(di_list);
@@ -1683,10 +1687,10 @@ index 0000000000..9f83ecb310
+ return ret;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 1cb6f04db3..ac83c3495d 100644
index 1c05413916..dd98e03bf1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -851,6 +851,239 @@
@@ -855,6 +855,239 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'],
'allow-preconfig': true }
@@ -1927,13 +1931,13 @@ index 1cb6f04db3..ac83c3495d 100644
# @BlockDeviceTimedStats:
#
diff --git a/qapi/common.json b/qapi/common.json
index 7558ce5430..5c00bddeb7 100644
index 0e3a0bbbfb..554680c716 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -200,3 +200,17 @@
@@ -226,3 +226,17 @@
##
{ 'struct': 'HumanReadableText',
'data': { 'human-readable-text': 'str' } }
{ 'enum': 'EndianMode',
'data': [ 'unspecified', 'little', 'big' ] }
+
+##
+# @UuidInfo:
@@ -1949,7 +1953,7 @@ index 7558ce5430..5c00bddeb7 100644
+##
+{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} }
diff --git a/qapi/machine.json b/qapi/machine.json
index dc46a3e93f..bd58d58fc5 100644
index 12cfd3f260..a8abdb42a3 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -4,6 +4,8 @@
@@ -1961,7 +1965,7 @@ index dc46a3e93f..bd58d58fc5 100644
##
# = Machines
##
@@ -303,20 +305,6 @@
@@ -302,20 +304,6 @@
##
{ 'command': 'query-target', 'returns': 'TargetInfo' }

View File

@@ -14,10 +14,10 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
create mode 100644 pbs-restore.c
diff --git a/meson.build b/meson.build
index 979c452f74..426f382178 100644
index 1f74de1d93..8508aab9c9 100644
--- a/meson.build
+++ b/meson.build
@@ -4103,6 +4103,10 @@ if have_tools
@@ -4373,6 +4373,10 @@ if have_tools
vma = executable('vma', files('vma.c', 'vma-reader.c') + genh,
dependencies: [authz, block, crypto, io, qemuutil, qom], install: true)
@@ -30,7 +30,7 @@ index 979c452f74..426f382178 100644
foreach exe: [ 'qemu-img', 'qemu-io', 'qemu-nbd', 'qemu-storage-daemon']
diff --git a/pbs-restore.c b/pbs-restore.c
new file mode 100644
index 0000000000..f03d9bab8d
index 0000000000..f165f418af
--- /dev/null
+++ b/pbs-restore.c
@@ -0,0 +1,236 @@
@@ -57,8 +57,8 @@ index 0000000000..f03d9bab8d
+#include "qemu/main-loop.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "sysemu/block-backend.h"
+#include "qobject/qdict.h"
+#include "system/block-backend.h"
+
+#include <proxmox-backup-qemu.h>
+

View File

@@ -23,7 +23,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
create mode 100644 block/pbs.c
diff --git a/block/meson.build b/block/meson.build
index 2367e1ac1b..e178047ec9 100644
index d1b16e40e9..d243372c41 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -49,6 +49,8 @@ block_ss.add(files(
@@ -37,7 +37,7 @@ index 2367e1ac1b..e178047ec9 100644
system_ss.add(files('block-ram-registrar.c'))
diff --git a/block/pbs.c b/block/pbs.c
new file mode 100644
index 0000000000..2d5e28ce8f
index 0000000000..3e41421716
--- /dev/null
+++ b/block/pbs.c
@@ -0,0 +1,306 @@
@@ -47,8 +47,8 @@ index 0000000000..2d5e28ce8f
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qobject/qdict.h"
+#include "qobject/qstring.h"
+#include "qemu/module.h"
+#include "qemu/option.h"
+#include "qemu/cutils.h"
@@ -348,23 +348,23 @@ index 0000000000..2d5e28ce8f
+
+block_init(bdrv_pbs_init);
diff --git a/meson.build b/meson.build
index 426f382178..7e6130cfdf 100644
index 8508aab9c9..9c39f54f86 100644
--- a/meson.build
+++ b/meson.build
@@ -4559,7 +4559,7 @@ summary_info += {'zstd support': zstd}
summary_info += {'Query Processing Library support': qpl}
@@ -4838,7 +4838,7 @@ summary_info += {'Query Processing Library support': qpl}
summary_info += {'UADK Library support': uadk}
summary_info += {'qatzip support': qatzip}
summary_info += {'NUMA host support': numa}
-summary_info += {'capstone': capstone}
+summary_info += {'PBS bdrv support': config_host.has_key('CONFIG_PBS_BDRV')}
summary_info += {'libpmem support': libpmem}
summary_info += {'libdaxctl support': libdaxctl}
summary_info += {'libudev': libudev}
summary_info += {'libcbor support': libcbor}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ac83c3495d..fe0eefcea6 100644
index dd98e03bf1..0c3ebfa74e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3457,6 +3457,7 @@
@@ -3470,6 +3470,7 @@
'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
@@ -372,7 +372,7 @@ index ac83c3495d..fe0eefcea6 100644
'ssh', 'throttle', 'vdi', 'vhdx',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
@@ -3543,6 +3544,33 @@
@@ -3556,6 +3557,33 @@
{ 'struct': 'BlockdevOptionsNull',
'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } }
@@ -406,7 +406,7 @@ index ac83c3495d..fe0eefcea6 100644
##
# @BlockdevOptionsNVMe:
#
@@ -4978,6 +5006,7 @@
@@ -5003,6 +5031,7 @@
'nfs': 'BlockdevOptionsNfs',
'null-aio': 'BlockdevOptionsNull',
'null-co': 'BlockdevOptionsNull',
@@ -415,10 +415,10 @@ index ac83c3495d..fe0eefcea6 100644
'nvme-io_uring': { 'type': 'BlockdevOptionsNvmeIoUring',
'if': 'CONFIG_BLKIO' },
diff --git a/qapi/pragma.json b/qapi/pragma.json
index be8fa304c5..7ff46bd128 100644
index 6aaa9cb975..e9c595c4ba 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -100,6 +100,7 @@
@@ -91,6 +91,7 @@
'BlockInfo', # query-block
'BlockdevAioOptions', # blockdev-add, -blockdev
'BlockdevDriver', # blockdev-add, query-blockstats, ...

View File

@@ -14,10 +14,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/meson.build b/meson.build
index 7e6130cfdf..984f858bdc 100644
index 9c39f54f86..60af7fa723 100644
--- a/meson.build
+++ b/meson.build
@@ -2013,6 +2013,7 @@ endif
@@ -2162,6 +2162,7 @@ endif
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
@@ -25,7 +25,7 @@ index 7e6130cfdf..984f858bdc 100644
libproxmox_backup_qemu = cc.find_library('proxmox_backup_qemu', required: true)
# libselinux
@@ -3597,7 +3598,7 @@ if have_block
@@ -3766,7 +3767,7 @@ if have_block
if host_os == 'windows'
system_ss.add(files('os-win32.c'))
else
@@ -35,7 +35,7 @@ index 7e6130cfdf..984f858bdc 100644
endif
diff --git a/os-posix.c b/os-posix.c
index 43f9a43f3f..a47e46d1c2 100644
index 52925c23d3..84b96d3da9 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -29,6 +29,8 @@

View File

@@ -26,19 +26,19 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
create mode 100644 migration/pbs-state.c
diff --git a/include/migration/misc.h b/include/migration/misc.h
index bfadc5613b..e2e51fcf6b 100644
index 8fd36eba1d..e963e93e71 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -111,4 +111,7 @@ bool migration_in_bg_snapshot(void);
/* migration/block-dirty-bitmap.c */
void dirty_bitmap_mig_init(void);
@@ -140,4 +140,7 @@ bool multifd_device_state_save_thread_should_exit(void);
void multifd_abort_device_state_save_threads(void);
bool multifd_join_device_state_save_threads(void);
+/* migration/pbs-state.c */
+void pbs_state_mig_init(void);
+
#endif
diff --git a/migration/meson.build b/migration/meson.build
index 4b0c4f0f51..d039797132 100644
index 46e92249a1..fb3fd7d7d0 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -8,6 +8,7 @@ migration_files = files(
@@ -49,7 +49,7 @@ index 4b0c4f0f51..d039797132 100644
system_ss.add(files(
'block-dirty-bitmap.c',
@@ -25,6 +26,7 @@ system_ss.add(files(
@@ -31,6 +32,7 @@ system_ss.add(files(
'multifd-zlib.c',
'multifd-zero-page.c',
'options.c',
@@ -58,13 +58,13 @@ index 4b0c4f0f51..d039797132 100644
'savevm.c',
'savevm-async.c',
diff --git a/migration/migration.c b/migration/migration.c
index ae2be31557..fab4c20ee4 100644
index d46e776e24..2f3430f440 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -263,6 +263,7 @@ void migration_object_init(void)
@@ -319,6 +319,7 @@ void migration_object_init(void)
ram_mig_init();
dirty_bitmap_mig_init();
/* Initialize cpu throttle timers */
cpu_throttle_init();
+ pbs_state_mig_init();
}
@@ -180,10 +180,10 @@ index 0000000000..a97187e4d7
+ NULL);
+}
diff --git a/pve-backup.c b/pve-backup.c
index 9f83ecb310..57477f7f2a 100644
index e931cb9203..366b015589 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1085,6 +1085,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1089,6 +1089,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
ret->pbs_dirty_bitmap = true;
ret->pbs_dirty_bitmap_savevm = true;
@@ -192,10 +192,10 @@ index 9f83ecb310..57477f7f2a 100644
ret->pbs_masterkey = true;
ret->backup_max_workers = true;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index fe0eefcea6..521a1914e8 100644
index 0c3ebfa74e..6838187607 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1004,6 +1004,11 @@
@@ -1008,6 +1008,11 @@
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -207,7 +207,7 @@ index fe0eefcea6..521a1914e8 100644
# @pbs-masterkey: True if the QMP backup call supports the 'master_keyfile'
# parameter.
#
@@ -1017,6 +1022,7 @@
@@ -1021,6 +1026,7 @@
'data': { 'pbs-dirty-bitmap': 'bool',
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',

View File

@@ -15,20 +15,21 @@ transferred.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
migration/block-dirty-bitmap.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
migration/block-dirty-bitmap.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index a7d55048c2..77346a5fa2 100644
index f2c352d4a7..931a8481e9 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -539,7 +539,10 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
@@ -539,7 +539,11 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
}
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, errp)) {
- return -1;
+ if (errp != NULL) {
+ error_report_err(*errp);
+ *errp = NULL;
+ }
+ continue;
}

View File

@@ -21,7 +21,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 30 insertions(+)
diff --git a/block/iscsi.c b/block/iscsi.c
index 979bf90cb7..961714a4be 100644
index 2f0f4dac09..b523137cff 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1392,12 +1392,42 @@ static char *get_initiator_name(QemuOpts *opts)

View File

@@ -11,7 +11,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/stream.c b/block/stream.c
index 7031eef12b..d2da83ae7c 100644
index 999d9e56d4..e187cd1262 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -27,7 +27,7 @@ enum {

View File

@@ -31,21 +31,22 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures
make error return value consistent with QEMU
avoid premature break during read
adhere to block graph lock requirements]
adhere to block graph lock requirements
avoid superfluous child permission update]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 366 ++++++++++++++++++++++++++++++++++++++++++++
block/alloc-track.c | 343 ++++++++++++++++++++++++++++++++++++++++++++
block/meson.build | 1 +
block/stream.c | 34 ++++
3 files changed, 401 insertions(+)
block/stream.c | 34 +++++
3 files changed, 378 insertions(+)
create mode 100644 block/alloc-track.c
diff --git a/block/alloc-track.c b/block/alloc-track.c
new file mode 100644
index 0000000000..b4a9851144
index 0000000000..718aaabf2a
--- /dev/null
+++ b/block/alloc-track.c
@@ -0,0 +1,366 @@
@@ -0,0 +1,343 @@
+/*
+ * Node to allow backing images to be applied to any node. Assumes a blank
+ * image to begin with, only new writes are tracked as allocated, thus this
@@ -63,26 +64,19 @@ index 0000000000..b4a9851144
+#include "block/block_int.h"
+#include "block/dirty-bitmap.h"
+#include "block/graph-lock.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qobject/qdict.h"
+#include "qobject/qstring.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/option.h"
+#include "qemu/module.h"
+#include "sysemu/block-backend.h"
+#include "system/block-backend.h"
+
+#define TRACK_OPT_AUTO_REMOVE "auto-remove"
+
+typedef enum DropState {
+ DropNone,
+ DropInProgress,
+} DropState;
+
+typedef struct {
+ BdrvDirtyBitmap *bitmap;
+ uint64_t granularity;
+ DropState drop_state;
+ bool auto_remove;
+} BDRVAllocTrackState;
+
+static QemuOptsList runtime_opts = {
@@ -134,7 +128,11 @@ index 0000000000..b4a9851144
+ goto fail;
+ }
+
+ s->auto_remove = qemu_opt_get_bool(opts, TRACK_OPT_AUTO_REMOVE, false);
+ if (!qemu_opt_get_bool(opts, TRACK_OPT_AUTO_REMOVE, false)) {
+ error_setg(errp, "alloc-track: requires auto-remove option to be set to on");
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ /* open the target (write) node, backing will be attached by block layer */
+ file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
@@ -182,8 +180,6 @@ index 0000000000..b4a9851144
+ goto fail;
+ }
+
+ s->drop_state = DropNone;
+
+fail:
+ if (ret < 0) {
+ bdrv_graph_wrlock();
@@ -334,18 +330,8 @@ index 0000000000..b4a9851144
+ BlockReopenQueue *reopen_queue, uint64_t perm, uint64_t shared,
+ uint64_t *nperm, uint64_t *nshared)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+
+ *nshared = BLK_PERM_ALL;
+
+ /* in case we're currently dropping ourselves, claim to not use any
+ * permissions at all - which is fine, since from this point on we will
+ * never issue a read or write anymore */
+ if (s->drop_state == DropInProgress) {
+ *nperm = 0;
+ return;
+ }
+
+ if (role & BDRV_CHILD_DATA) {
+ *nperm = perm & DEFAULT_PERM_PASSTHROUGH;
+ } else {
@@ -371,14 +357,6 @@ index 0000000000..b4a9851144
+ * kinda fits better, but in the long-term, a special parameter would be
+ * nice (or done via qemu-server via upcoming blockdev-replace QMP command).
+ */
+ if (backing_file == NULL) {
+ BDRVAllocTrackState *s = bs->opaque;
+ bdrv_drained_begin(bs);
+ s->drop_state = DropInProgress;
+ bdrv_child_refresh_perms(bs, bs->file, &error_abort);
+ bdrv_drained_end(bs);
+ }
+
+ return 0;
+}
+
@@ -413,7 +391,7 @@ index 0000000000..b4a9851144
+
+block_init(bdrv_alloc_track_init);
diff --git a/block/meson.build b/block/meson.build
index e178047ec9..7ef7250d31 100644
index d243372c41..9b45b5256d 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -2,6 +2,7 @@ block_ss.add(genh)
@@ -425,7 +403,7 @@ index e178047ec9..7ef7250d31 100644
'backup.c',
'backup-dump.c',
diff --git a/block/stream.c b/block/stream.c
index d2da83ae7c..f941cba14e 100644
index e187cd1262..0b61029399 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -120,6 +120,40 @@ static int stream_prepare(Job *job)

View File

@@ -13,7 +13,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 40 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 101ee59d6e..4ad3b1a7b1 100644
index bf143fac00..70d92966f7 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1515,7 +1515,6 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,

View File

@@ -14,7 +14,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 4ad3b1a7b1..e341745255 100644
index 70d92966f7..931b513828 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1474,11 +1474,11 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,

View File

@@ -24,7 +24,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 112 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index e341745255..436d3d7811 100644
index 931b513828..4ab9bb5e02 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -108,12 +108,6 @@ typedef struct RBDTask {

View File

@@ -61,17 +61,96 @@ it has no parent, so just pass the one from the original bs.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: improve error when cbw fails as reported by Friedrich Weber]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/copy-before-write.c | 18 ++--
block/copy-before-write.h | 1 +
block/monitor/block-hmp-cmds.c | 1 +
pve-backup.c | 135 ++++++++++++++++++++++++++++++++-
qapi/block-core.json | 10 ++-
3 files changed, 142 insertions(+), 4 deletions(-)
pve-backup.c | 175 ++++++++++++++++++++++++++++++++-
qapi/block-core.json | 10 +-
5 files changed, 195 insertions(+), 10 deletions(-)
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index fd470f5f92..5c23b578ef 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -27,6 +27,7 @@
#include "qobject/qjson.h"
#include "system/block-backend.h"
+#include "qemu/atomic.h"
#include "qemu/cutils.h"
#include "qapi/error.h"
#include "block/block_int.h"
@@ -75,7 +76,8 @@ typedef struct BDRVCopyBeforeWriteState {
* @snapshot_error is normally zero. But on first copy-before-write failure
* when @on_cbw_error == ON_CBW_ERROR_BREAK_SNAPSHOT, @snapshot_error takes
* value of this error (<0). After that all in-flight and further
- * snapshot-API requests will fail with that error.
+ * snapshot-API requests will fail with that error. To be accessed with
+ * atomics.
*/
int snapshot_error;
} BDRVCopyBeforeWriteState;
@@ -115,7 +117,7 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
return 0;
}
- if (s->snapshot_error) {
+ if (qatomic_read(&s->snapshot_error)) {
return 0;
}
@@ -139,9 +141,7 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
WITH_QEMU_LOCK_GUARD(&s->lock) {
if (ret < 0) {
assert(s->on_cbw_error == ON_CBW_ERROR_BREAK_SNAPSHOT);
- if (!s->snapshot_error) {
- s->snapshot_error = ret;
- }
+ qatomic_cmpxchg(&s->snapshot_error, 0, ret);
} else {
bdrv_set_dirty_bitmap(s->done_bitmap, off, end - off);
}
@@ -215,7 +215,7 @@ cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset, int64_t bytes,
QEMU_LOCK_GUARD(&s->lock);
- if (s->snapshot_error) {
+ if (qatomic_read(&s->snapshot_error)) {
g_free(req);
return NULL;
}
@@ -595,6 +595,12 @@ void bdrv_cbw_drop(BlockDriverState *bs)
bdrv_unref(bs);
}
+int bdrv_cbw_snapshot_error(BlockDriverState *bs)
+{
+ BDRVCopyBeforeWriteState *s = bs->opaque;
+ return qatomic_read(&s->snapshot_error);
+}
+
static void cbw_init(void)
{
bdrv_register(&bdrv_cbw_filter);
diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index 2a5d4ba693..969da3620f 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -44,5 +44,6 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockCopyState **bcs,
Error **errp);
void bdrv_cbw_drop(BlockDriverState *bs);
+int bdrv_cbw_snapshot_error(BlockDriverState *bs);
#endif /* COPY_BEFORE_WRITE_H */
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 439a7a14c8..d0e7771dcc 100644
index 4f30f99644..66d16d342f 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1044,6 +1044,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
@@ -1045,6 +1045,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
NULL, NULL,
devlist, qdict_haskey(qdict, "speed"), speed,
false, 0, // BackupPerf max-workers
@@ -80,22 +159,18 @@ index 439a7a14c8..d0e7771dcc 100644
hmp_handle_error(mon, error);
diff --git a/pve-backup.c b/pve-backup.c
index 57477f7f2a..0f098000dd 100644
index 366b015589..9b66788ab5 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -7,9 +7,11 @@
#include "sysemu/blockdev.h"
@@ -7,6 +7,7 @@
#include "system/blockdev.h"
#include "block/block_int-global-state.h"
#include "block/blockjob.h"
+#include "block/copy-before-write.h"
#include "block/dirty-bitmap.h"
#include "block/graph-lock.h"
#include "qapi/qapi-commands-block.h"
+#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qerror.h"
#include "qemu/cutils.h"
@@ -80,8 +82,15 @@ static void pvebackup_init(void)
@@ -81,8 +82,15 @@ static void pvebackup_init(void)
// initialize PVEBackupState at startup
opts_init(pvebackup_init);
@@ -111,17 +186,12 @@ index 57477f7f2a..0f098000dd 100644
size_t size;
uint64_t block_size;
uint8_t dev_id;
@@ -353,6 +362,22 @@ static void pvebackup_complete_cb(void *opaque, int ret)
PVEBackupDevInfo *di = opaque;
di->completed_ret = ret;
@@ -352,11 +360,44 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
+ /*
+ * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
+ * won't be done as a coroutine anyways:
+ * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
+ * just spawn a BH calling bdrv_unref().
+ * - For cbw, draining would need to spawn a BH.
+ */
+static void cleanup_snapshot_access(PVEBackupDevInfo *di)
+{
+ if (di->fleecing.snapshot_access) {
+ bdrv_unref(di->fleecing.snapshot_access);
+ di->fleecing.snapshot_access = NULL;
@@ -130,11 +200,104 @@ index 57477f7f2a..0f098000dd 100644
+ bdrv_cbw_drop(di->fleecing.cbw);
+ di->fleecing.cbw = NULL;
+ }
+}
+
static void pvebackup_complete_cb(void *opaque, int ret)
{
PVEBackupDevInfo *di = opaque;
di->completed_ret = ret;
+ if (di->fleecing.cbw) {
+ /*
+ * With fleecing, failure for cbw does not fail the guest write, but only sets the snapshot
+ * error, making further requests to the snapshot fail with EACCES, which then also fail the
+ * job. But that code is not the root cause and just confusing, so update it.
+ */
+ int snapshot_error = bdrv_cbw_snapshot_error(di->fleecing.cbw);
+ if (di->completed_ret == -EACCES && snapshot_error) {
+ di->completed_ret = snapshot_error;
+ }
+ }
+
+ /*
+ * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
+ * won't be done as a coroutine anyways:
+ * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
+ * just spawn a BH calling bdrv_unref().
+ * - For cbw, draining would need to spawn a BH.
+ */
+ cleanup_snapshot_access(di);
+
/*
* Needs to happen outside of coroutine, because it takes the graph write lock.
*/
@@ -519,9 +544,77 @@ static void create_backup_jobs_bh(void *opaque) {
@@ -487,6 +528,65 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
+/*
+ * Setup a snapshot-access block node for a device with associated fleecing image.
+ */
+static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
+{
+ Error *local_err = NULL;
+
+ if (!di->fleecing.bs) {
+ error_setg(errp, "no associated fleecing image");
+ return -1;
+ }
+
+ QDict *cbw_opts = qdict_new();
+ qdict_put_str(cbw_opts, "driver", "copy-before-write");
+ qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
+
+ if (di->bitmap) {
+ /*
+ * Only guest writes to parts relevant for the backup need to be intercepted with
+ * old data being copied to the fleecing image.
+ */
+ qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
+ }
+ /*
+ * Fleecing storage is supposed to be fast and it's better to break backup than guest
+ * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
+ * abort a bit before that.
+ */
+ qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
+ qdict_put_int(cbw_opts, "cbw-timeout", 45);
+
+ di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
+
+ if (!di->fleecing.cbw) {
+ error_setg(errp, "appending cbw node for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ return -1;
+ }
+
+ QDict *snapshot_access_opts = qdict_new();
+ qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
+ qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
+
+ di->fleecing.snapshot_access =
+ bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
+ if (!di->fleecing.snapshot_access) {
+ bdrv_cbw_drop(di->fleecing.cbw);
+ di->fleecing.cbw = NULL;
+
+ error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ return -1;
+ }
+
+ return 0;
+}
+
/*
* backup_job_create can *not* be run from a coroutine, so this can't either.
* The caller is responsible that backup_mutex is held nonetheless.
@@ -523,9 +623,42 @@ static void create_backup_jobs_bh(void *opaque) {
}
bdrv_drained_begin(di->bs);
@@ -142,50 +305,15 @@ index 57477f7f2a..0f098000dd 100644
+
+ BlockDriverState *source_bs = di->bs;
+ bool discard_source = false;
+ bdrv_graph_co_rdlock();
+ const char *job_id = bdrv_get_device_name(di->bs);
+ bdrv_graph_co_rdunlock();
+ if (di->fleecing.bs) {
+ QDict *cbw_opts = qdict_new();
+ qdict_put_str(cbw_opts, "driver", "copy-before-write");
+ qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
+
+ if (di->bitmap) {
+ /*
+ * Only guest writes to parts relevant for the backup need to be intercepted with
+ * old data being copied to the fleecing image.
+ */
+ qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
+ }
+ /*
+ * Fleecing storage is supposed to be fast and it's better to break backup than guest
+ * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
+ * abort a bit before that.
+ */
+ qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
+ qdict_put_int(cbw_opts, "cbw-timeout", 45);
+
+ di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
+
+ if (!di->fleecing.cbw) {
+ error_setg(errp, "appending cbw node for fleecing failed: %s",
+ if (setup_snapshot_access(di, &local_err) < 0) {
+ error_setg(errp, "%s - setting up snapshot access for fleecing failed: %s",
+ di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ bdrv_drained_end(di->bs);
+ break;
+ }
+
+ QDict *snapshot_access_opts = qdict_new();
+ qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
+ qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
+
+ di->fleecing.snapshot_access =
+ bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
+ if (!di->fleecing.snapshot_access) {
+ error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ break;
+ }
+ source_bs = di->fleecing.snapshot_access;
+ discard_source = true;
+
@@ -209,12 +337,20 @@ index 57477f7f2a..0f098000dd 100644
BlockJob *job = backup_job_create(
- NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
- bitmap_mode, false, NULL, &backup_state.perf, BLOCKDEV_ON_ERROR_REPORT,
+ job_id, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ di->device_name, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ bitmap_mode, false, discard_source, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT,
BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn,
&local_err);
@@ -577,6 +670,14 @@ static void create_backup_jobs_bh(void *opaque) {
@@ -539,6 +672,7 @@ static void create_backup_jobs_bh(void *opaque) {
}
if (!job || local_err) {
+ cleanup_snapshot_access(di);
error_setg(errp, "backup_job_create failed: %s",
local_err ? error_get_pretty(local_err) : "null");
break;
@@ -581,6 +715,14 @@ static void create_backup_jobs_bh(void *opaque) {
aio_co_enter(data->ctx, data->co);
}
@@ -229,7 +365,7 @@ index 57477f7f2a..0f098000dd 100644
/*
* Returns a list of device infos, which needs to be freed by the caller. In
* case of an error, errp will be set, but the returned value might still be a
@@ -584,6 +685,7 @@ static void create_backup_jobs_bh(void *opaque) {
@@ -588,6 +730,7 @@ static void create_backup_jobs_bh(void *opaque) {
*/
static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
const char *devlist,
@@ -237,11 +373,10 @@ index 57477f7f2a..0f098000dd 100644
Error **errp)
{
gchar **devs = NULL;
@@ -607,6 +709,31 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
}
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
@@ -613,6 +756,30 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
di->bs = bs;
+
di->device_name = g_strdup(bdrv_get_device_name(bs));
+ if (fleecing && device_uses_fleecing(*d)) {
+ g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
+ BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
@@ -269,7 +404,7 @@ index 57477f7f2a..0f098000dd 100644
di_list = g_list_append(di_list, di);
d++;
}
@@ -656,6 +783,7 @@ UuidInfo coroutine_fn *qmp_backup(
@@ -663,6 +830,7 @@ UuidInfo coroutine_fn *qmp_backup(
const char *devlist,
bool has_speed, int64_t speed,
bool has_max_workers, int64_t max_workers,
@@ -277,7 +412,7 @@ index 57477f7f2a..0f098000dd 100644
Error **errp)
{
assert(qemu_in_coroutine());
@@ -684,7 +812,7 @@ UuidInfo coroutine_fn *qmp_backup(
@@ -691,7 +859,7 @@ UuidInfo coroutine_fn *qmp_backup(
format = has_format ? format : BACKUP_FORMAT_VMA;
bdrv_graph_co_rdlock();
@@ -286,7 +421,7 @@ index 57477f7f2a..0f098000dd 100644
bdrv_graph_co_rdunlock();
if (local_err) {
error_propagate(errp, local_err);
@@ -1089,5 +1217,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1093,5 +1261,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->query_bitmap_info = true;
ret->pbs_masterkey = true;
ret->backup_max_workers = true;
@@ -294,10 +429,10 @@ index 57477f7f2a..0f098000dd 100644
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 653df22046..9f25c398ec 100644
index 6838187607..9bdcfa31ea 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -948,6 +948,10 @@
@@ -952,6 +952,10 @@
#
# @max-workers: see @BackupPerf for details. Default 16.
#
@@ -308,7 +443,7 @@ index 653df22046..9f25c398ec 100644
# Returns: the uuid of the backup job
#
##
@@ -968,7 +972,8 @@
@@ -972,7 +976,8 @@
'*firewall-file': 'str',
'*devlist': 'str',
'*speed': 'int',
@@ -318,7 +453,7 @@ index 653df22046..9f25c398ec 100644
'returns': 'UuidInfo', 'coroutine': true }
##
@@ -1014,6 +1019,8 @@
@@ -1018,6 +1023,8 @@
#
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
@@ -327,7 +462,7 @@ index 653df22046..9f25c398ec 100644
# @backup-max-workers: Whether the 'max-workers' @BackupPerf setting is
# supported or not.
#
@@ -1025,6 +1032,7 @@
@@ -1029,6 +1036,7 @@
'pbs-dirty-bitmap-migration': 'bool',
'pbs-masterkey': 'bool',
'pbs-library-version': 'str',

View File

@@ -1,43 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Tue, 26 Mar 2024 14:57:51 +0100
Subject: [PATCH] alloc-track: error out when auto-remove is not set
Since replacing the node now happens in the stream job, where the
option cannot be read from (it's internal to the driver), it will
always be treated as on.
qemu-server will always set it, make sure to have other users notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/block/alloc-track.c b/block/alloc-track.c
index b4a9851144..fc7d58a5d0 100644
--- a/block/alloc-track.c
+++ b/block/alloc-track.c
@@ -34,7 +34,6 @@ typedef struct {
BdrvDirtyBitmap *bitmap;
uint64_t granularity;
DropState drop_state;
- bool auto_remove;
} BDRVAllocTrackState;
static QemuOptsList runtime_opts = {
@@ -86,7 +85,11 @@ static int track_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
- s->auto_remove = qemu_opt_get_bool(opts, TRACK_OPT_AUTO_REMOVE, false);
+ if (!qemu_opt_get_bool(opts, TRACK_OPT_AUTO_REMOVE, false)) {
+ error_setg(errp, "alloc-track: requires auto-remove option to be set to on");
+ ret = -EINVAL;
+ goto fail;
+ }
/* open the target (write) node, backing will be attached by block layer */
file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,

View File

@@ -22,10 +22,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 51 insertions(+), 27 deletions(-)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 5cddeb7fcb..b1e7787499 100644
index 8ada4d5832..f9f3b75284 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -607,42 +607,66 @@ struct MachineState {
@@ -636,42 +636,66 @@ struct MachineState {
/*
@@ -117,7 +117,7 @@ index 5cddeb7fcb..b1e7787499 100644
/*
* Evaluates true when a machine type with (major, minor)
@@ -651,7 +675,7 @@ struct MachineState {
@@ -680,7 +704,7 @@ struct MachineState {
* lifecycle rules
*/
#define MACHINE_VER_IS_DEPRECATED(...) \
@@ -126,7 +126,7 @@ index 5cddeb7fcb..b1e7787499 100644
/*
* Evaluates true when a machine type with (major, minor)
@@ -660,7 +684,7 @@ struct MachineState {
@@ -689,7 +713,7 @@ struct MachineState {
* lifecycle rules
*/
#define MACHINE_VER_SHOULD_DELETE(...) \

View File

@@ -1,84 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 27 Mar 2024 11:15:39 +0100
Subject: [PATCH] alloc-track: avoid seemingly superfluous child permission
update
Doesn't seem necessary nowadays (maybe after commit "alloc-track: fix
deadlock during drop" where the dropping is not rescheduled and delayed
anymore or some upstream change). Should there really be some issue,
instead of having a drop state, this could also be just based off the
fact whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 26 --------------------------
1 file changed, 26 deletions(-)
diff --git a/block/alloc-track.c b/block/alloc-track.c
index fc7d58a5d0..b56425b7f0 100644
--- a/block/alloc-track.c
+++ b/block/alloc-track.c
@@ -25,15 +25,9 @@
#define TRACK_OPT_AUTO_REMOVE "auto-remove"
-typedef enum DropState {
- DropNone,
- DropInProgress,
-} DropState;
-
typedef struct {
BdrvDirtyBitmap *bitmap;
uint64_t granularity;
- DropState drop_state;
} BDRVAllocTrackState;
static QemuOptsList runtime_opts = {
@@ -137,8 +131,6 @@ static int track_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
- s->drop_state = DropNone;
-
fail:
if (ret < 0) {
bdrv_graph_wrlock();
@@ -289,18 +281,8 @@ track_child_perm(BlockDriverState *bs, BdrvChild *c, BdrvChildRole role,
BlockReopenQueue *reopen_queue, uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
{
- BDRVAllocTrackState *s = bs->opaque;
-
*nshared = BLK_PERM_ALL;
- /* in case we're currently dropping ourselves, claim to not use any
- * permissions at all - which is fine, since from this point on we will
- * never issue a read or write anymore */
- if (s->drop_state == DropInProgress) {
- *nperm = 0;
- return;
- }
-
if (role & BDRV_CHILD_DATA) {
*nperm = perm & DEFAULT_PERM_PASSTHROUGH;
} else {
@@ -326,14 +308,6 @@ track_co_change_backing_file(BlockDriverState *bs, const char *backing_file,
* kinda fits better, but in the long-term, a special parameter would be
* nice (or done via qemu-server via upcoming blockdev-replace QMP command).
*/
- if (backing_file == NULL) {
- BDRVAllocTrackState *s = bs->opaque;
- bdrv_drained_begin(bs);
- s->drop_state = DropInProgress;
- bdrv_child_refresh_perms(bs, bs->file, &error_abort);
- bdrv_drained_end(bs);
- }
-
return 0;
}

View File

@@ -0,0 +1,50 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 19 Mar 2025 17:31:05 +0100
Subject: [PATCH] Revert "hpet: avoid timer storms on periodic timers"
This reverts commit 7c912ffb59e8137091894d767433e65c3df8b0bf.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/timer/hpet.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index ccb97b6806..0f45af8bbe 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -61,7 +61,6 @@ typedef struct HPETTimer { /* timers */
uint8_t wrap_flag; /* timer pop will indicate wrap for one-shot 32-bit
* mode. Next pop will be actual timer expiration.
*/
- uint64_t last; /* last value armed, to avoid timer storms */
} HPETTimer;
struct HPETState {
@@ -262,7 +261,6 @@ static int hpet_post_load(void *opaque, int version_id)
for (i = 0; i < s->num_timers; i++) {
HPETTimer *t = &s->timer[i];
t->cmp64 = hpet_calculate_cmp64(t, s->hpet_counter, t->cmp);
- t->last = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - NANOSECONDS_PER_SECOND;
}
/* Recalculate the offset between the main counter and guest time */
if (!s->hpet_offset_saved) {
@@ -350,15 +348,8 @@ static const VMStateDescription vmstate_hpet = {
static void hpet_arm(HPETTimer *t, uint64_t tick)
{
- uint64_t ns = hpet_get_ns(t->state, tick);
-
- /* Clamp period to reasonable min value (1 us) */
- if (timer_is_periodic(t) && ns - t->last < 1000) {
- ns = t->last + 1000;
- }
-
- t->last = ns;
- timer_mod(t->qemu_timer, ns);
+ /* FIXME: Clamp period to reasonable min value? */
+ timer_mod(t->qemu_timer, hpet_get_ns(t->state, tick));
}
/*

View File

@@ -1,133 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 11 Apr 2024 11:29:26 +0200
Subject: [PATCH] copy-before-write: allow specifying minimum cluster size
Useful to make discard-source work in the context of backup fleecing
when the fleecing image has a larger granularity than the backup
target.
Copy-before-write operations will use at least this granularity and in
particular, discard requests to the source node will too. If the
granularity is too small, they will just be aligned down in
cbw_co_pdiscard_snapshot() and thus effectively ignored.
The QAPI uses uint32 so the value will be non-negative, but still fit
into a uint64_t.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/block-copy.c | 17 +++++++++++++----
block/copy-before-write.c | 3 ++-
include/block/block-copy.h | 1 +
qapi/block-core.json | 8 +++++++-
4 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/block/block-copy.c b/block/block-copy.c
index cc618e4561..12d662e9d4 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -310,6 +310,7 @@ void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
}
static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
+ int64_t min_cluster_size,
Error **errp)
{
int ret;
@@ -335,7 +336,7 @@ static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
"used. If the actual block size of the target exceeds "
"this default, the backup may be unusable",
BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
- return BLOCK_COPY_CLUSTER_SIZE_DEFAULT;
+ return MAX(min_cluster_size, BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
} else if (ret < 0 && !target_does_cow) {
error_setg_errno(errp, -ret,
"Couldn't determine the cluster size of the target image, "
@@ -345,16 +346,18 @@ static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
return ret;
} else if (ret < 0 && target_does_cow) {
/* Not fatal; just trudge on ahead. */
- return BLOCK_COPY_CLUSTER_SIZE_DEFAULT;
+ return MAX(min_cluster_size, BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
}
- return MAX(BLOCK_COPY_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
+ return MAX(min_cluster_size,
+ MAX(BLOCK_COPY_CLUSTER_SIZE_DEFAULT, bdi.cluster_size));
}
BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
BlockDriverState *copy_bitmap_bs,
const BdrvDirtyBitmap *bitmap,
bool discard_source,
+ int64_t min_cluster_size,
Error **errp)
{
ERRP_GUARD();
@@ -365,7 +368,13 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
GLOBAL_STATE_CODE();
- cluster_size = block_copy_calculate_cluster_size(target->bs, errp);
+ if (min_cluster_size && !is_power_of_2(min_cluster_size)) {
+ error_setg(errp, "min-cluster-size needs to be a power of 2");
+ return NULL;
+ }
+
+ cluster_size = block_copy_calculate_cluster_size(target->bs,
+ min_cluster_size, errp);
if (cluster_size < 0) {
return NULL;
}
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 28f6a096cd..ef4e666303 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -478,7 +478,8 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
s->discard_source = flags & BDRV_O_CBW_DISCARD_SOURCE;
s->bcs = block_copy_state_new(bs->file, s->target, bs, bitmap,
- flags & BDRV_O_CBW_DISCARD_SOURCE, errp);
+ flags & BDRV_O_CBW_DISCARD_SOURCE,
+ opts->min_cluster_size, errp);
if (!s->bcs) {
error_prepend(errp, "Cannot create block-copy-state: ");
return -EINVAL;
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index bdc703bacd..77857c6c68 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -28,6 +28,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
BlockDriverState *copy_bitmap_bs,
const BdrvDirtyBitmap *bitmap,
bool discard_source,
+ int64_t min_cluster_size,
Error **errp);
/* Function should be called prior any actual copy request */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 521a1914e8..171846deb1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4927,12 +4927,18 @@
# @on-cbw-error parameter will decide how this failure is handled.
# Default 0. (Since 7.1)
#
+# @min-cluster-size: Minimum size of blocks used by copy-before-write
+# operations. Has to be a power of 2. No effect if smaller than
+# the maximum of the target's cluster size and 64 KiB. Default 0.
+# (Since 8.1)
+#
# Since: 6.2
##
{ 'struct': 'BlockdevOptionsCbw',
'base': 'BlockdevOptionsGenericFormat',
'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap',
- '*on-cbw-error': 'OnCbwError', '*cbw-timeout': 'uint32' } }
+ '*on-cbw-error': 'OnCbwError', '*cbw-timeout': 'uint32',
+ '*min-cluster-size': 'uint32' } }
##
# @BlockdevOptions:

View File

@@ -0,0 +1,202 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 19 Mar 2025 17:31:08 +0100
Subject: [PATCH] Revert "hpet: store full 64-bit target value of the counter"
This reverts commit 242d665396407f83a6acbffc804882eeb21cfdad.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/timer/hpet.c | 109 +++++++++++++++++++++++++++---------------------
1 file changed, 61 insertions(+), 48 deletions(-)
diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index 0f45af8bbe..635a060d38 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -56,7 +56,6 @@ typedef struct HPETTimer { /* timers */
uint64_t cmp; /* comparator */
uint64_t fsb; /* FSB route */
/* Hidden register state */
- uint64_t cmp64; /* comparator (extended to counter width) */
uint64_t period; /* Last value written to comparator */
uint8_t wrap_flag; /* timer pop will indicate wrap for one-shot 32-bit
* mode. Next pop will be actual timer expiration.
@@ -119,6 +118,11 @@ static uint32_t timer_enabled(HPETTimer *t)
}
static uint32_t hpet_time_after(uint64_t a, uint64_t b)
+{
+ return ((int32_t)(b - a) < 0);
+}
+
+static uint32_t hpet_time_after64(uint64_t a, uint64_t b)
{
return ((int64_t)(b - a) < 0);
}
@@ -155,32 +159,27 @@ static uint64_t hpet_get_ticks(HPETState *s)
return ns_to_ticks(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + s->hpet_offset);
}
-static uint64_t hpet_get_ns(HPETState *s, uint64_t tick)
-{
- return ticks_to_ns(tick) - s->hpet_offset;
-}
-
/*
- * calculate next value of the general counter that matches the
- * target (either entirely, or the low 32-bit only depending on
- * the timer mode).
+ * calculate diff between comparator value and current ticks
*/
-static uint64_t hpet_calculate_cmp64(HPETTimer *t, uint64_t cur_tick, uint64_t target)
+static inline uint64_t hpet_calculate_diff(HPETTimer *t, uint64_t current)
{
+
if (t->config & HPET_TN_32BIT) {
- uint64_t result = deposit64(cur_tick, 0, 32, target);
- if (result < cur_tick) {
- result += 0x100000000ULL;
- }
- return result;
+ uint32_t diff, cmp;
+
+ cmp = (uint32_t)t->cmp;
+ diff = cmp - (uint32_t)current;
+ diff = (int32_t)diff > 0 ? diff : (uint32_t)1;
+ return (uint64_t)diff;
} else {
- return target;
- }
-}
+ uint64_t diff, cmp;
-static uint64_t hpet_next_wrap(uint64_t cur_tick)
-{
- return (cur_tick | 0xffffffffU) + 1;
+ cmp = t->cmp;
+ diff = cmp - current;
+ diff = (int64_t)diff > 0 ? diff : (uint64_t)1;
+ return diff;
+ }
}
static void update_irq(struct HPETTimer *timer, int set)
@@ -256,12 +255,7 @@ static bool hpet_validate_num_timers(void *opaque, int version_id)
static int hpet_post_load(void *opaque, int version_id)
{
HPETState *s = opaque;
- int i;
- for (i = 0; i < s->num_timers; i++) {
- HPETTimer *t = &s->timer[i];
- t->cmp64 = hpet_calculate_cmp64(t, s->hpet_counter, t->cmp);
- }
/* Recalculate the offset between the main counter and guest time */
if (!s->hpet_offset_saved) {
s->hpet_offset = ticks_to_ns(s->hpet_counter)
@@ -346,10 +340,14 @@ static const VMStateDescription vmstate_hpet = {
}
};
-static void hpet_arm(HPETTimer *t, uint64_t tick)
+static void hpet_arm(HPETTimer *t, uint64_t ticks)
{
- /* FIXME: Clamp period to reasonable min value? */
- timer_mod(t->qemu_timer, hpet_get_ns(t->state, tick));
+ if (ticks < ns_to_ticks(INT64_MAX / 2)) {
+ timer_mod(t->qemu_timer,
+ qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + ticks_to_ns(ticks));
+ } else {
+ timer_del(t->qemu_timer);
+ }
}
/*
@@ -358,44 +356,54 @@ static void hpet_arm(HPETTimer *t, uint64_t tick)
static void hpet_timer(void *opaque)
{
HPETTimer *t = opaque;
+ uint64_t diff;
+
uint64_t period = t->period;
uint64_t cur_tick = hpet_get_ticks(t->state);
if (timer_is_periodic(t) && period != 0) {
- while (hpet_time_after(cur_tick, t->cmp64)) {
- t->cmp64 += period;
- }
if (t->config & HPET_TN_32BIT) {
- t->cmp = (uint32_t)t->cmp64;
+ while (hpet_time_after(cur_tick, t->cmp)) {
+ t->cmp = (uint32_t)(t->cmp + t->period);
+ }
} else {
- t->cmp = t->cmp64;
+ while (hpet_time_after64(cur_tick, t->cmp)) {
+ t->cmp += period;
+ }
+ }
+ diff = hpet_calculate_diff(t, cur_tick);
+ hpet_arm(t, diff);
+ } else if (t->config & HPET_TN_32BIT && !timer_is_periodic(t)) {
+ if (t->wrap_flag) {
+ diff = hpet_calculate_diff(t, cur_tick);
+ hpet_arm(t, diff);
+ t->wrap_flag = 0;
}
- hpet_arm(t, t->cmp64);
- } else if (t->wrap_flag) {
- t->wrap_flag = 0;
- hpet_arm(t, t->cmp64);
}
update_irq(t, 1);
}
static void hpet_set_timer(HPETTimer *t)
{
+ uint64_t diff;
+ uint32_t wrap_diff; /* how many ticks until we wrap? */
uint64_t cur_tick = hpet_get_ticks(t->state);
+ /* whenever new timer is being set up, make sure wrap_flag is 0 */
t->wrap_flag = 0;
- t->cmp64 = hpet_calculate_cmp64(t, cur_tick, t->cmp);
- if (t->config & HPET_TN_32BIT) {
+ diff = hpet_calculate_diff(t, cur_tick);
- /* hpet spec says in one-shot 32-bit mode, generate an interrupt when
- * counter wraps in addition to an interrupt with comparator match.
- */
- if (!timer_is_periodic(t) && t->cmp64 > hpet_next_wrap(cur_tick)) {
+ /* hpet spec says in one-shot 32-bit mode, generate an interrupt when
+ * counter wraps in addition to an interrupt with comparator match.
+ */
+ if (t->config & HPET_TN_32BIT && !timer_is_periodic(t)) {
+ wrap_diff = 0xffffffff - (uint32_t)cur_tick;
+ if (wrap_diff < (uint32_t)diff) {
+ diff = wrap_diff;
t->wrap_flag = 1;
- hpet_arm(t, hpet_next_wrap(cur_tick));
- return;
}
}
- hpet_arm(t, t->cmp64);
+ hpet_arm(t, diff);
}
static void hpet_del_timer(HPETTimer *t)
@@ -526,7 +534,12 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
timer->cmp = deposit64(timer->cmp, shift, len, value);
}
if (timer_is_periodic(timer)) {
- timer->period = deposit64(timer->period, shift, len, value);
+ /*
+ * FIXME: Clamp period to reasonable min value?
+ * Clamp period to reasonable max value
+ */
+ new_val = deposit64(timer->period, shift, len, value);
+ timer->period = MIN(new_val, (timer->config & HPET_TN_32BIT ? ~0u : ~0ull) >> 1);
}
timer->config &= ~HPET_TN_SETVAL;
if (hpet_enabled(s)) {

View File

@@ -1,106 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 11 Apr 2024 11:29:27 +0200
Subject: [PATCH] backup: add minimum cluster size to performance options
Useful to make discard-source work in the context of backup fleecing
when the fleecing image has a larger granularity than the backup
target.
Backup/block-copy will use at least this granularity for copy operations
and in particular, discard requests to the backup source will too. If
the granularity is too small, they will just be aligned down in
cbw_co_pdiscard_snapshot() and thus effectively ignored.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/backup.c | 2 +-
block/copy-before-write.c | 2 ++
block/copy-before-write.h | 1 +
blockdev.c | 3 +++
qapi/block-core.json | 9 +++++++--
5 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 1963e47ab9..fe69723ada 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -434,7 +434,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
cbw = bdrv_cbw_append(bs, target, filter_node_name, discard_source,
- &bcs, errp);
+ perf->min_cluster_size, &bcs, errp);
if (!cbw) {
goto error;
}
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index ef4e666303..adb27649a8 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -547,6 +547,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockDriverState *target,
const char *filter_node_name,
bool discard_source,
+ int64_t min_cluster_size,
BlockCopyState **bcs,
Error **errp)
{
@@ -565,6 +566,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
}
qdict_put_str(opts, "file", bdrv_get_node_name(source));
qdict_put_str(opts, "target", bdrv_get_node_name(target));
+ qdict_put_int(opts, "min-cluster-size", min_cluster_size);
top = bdrv_insert_node(source, opts, flags, errp);
if (!top) {
diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index 01af0cd3c4..dc6cafe7fa 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -40,6 +40,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockDriverState *target,
const char *filter_node_name,
bool discard_source,
+ int64_t min_cluster_size,
BlockCopyState **bcs,
Error **errp);
void bdrv_cbw_drop(BlockDriverState *bs);
diff --git a/blockdev.c b/blockdev.c
index 8080c47fa6..3f67eb413d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2656,6 +2656,9 @@ static BlockJob *do_backup_common(BackupCommon *backup,
if (backup->x_perf->has_max_chunk) {
perf.max_chunk = backup->x_perf->max_chunk;
}
+ if (backup->x_perf->has_min_cluster_size) {
+ perf.min_cluster_size = backup->x_perf->min_cluster_size;
+ }
}
if ((backup->sync == MIRROR_SYNC_MODE_BITMAP) ||
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 171846deb1..653df22046 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1790,11 +1790,16 @@
# it should not be less than job cluster size which is calculated
# as maximum of target image cluster size and 64k. Default 0.
#
+# @min-cluster-size: Minimum size of blocks used by copy-before-write
+# and background copy operations. Has to be a power of 2. No
+# effect if smaller than the maximum of the target's cluster size
+# and 64 KiB. Default 0. (Since 8.1)
+#
# Since: 6.0
##
{ 'struct': 'BackupPerf',
- 'data': { '*use-copy-range': 'bool',
- '*max-workers': 'int', '*max-chunk': 'int64' } }
+ 'data': { '*use-copy-range': 'bool', '*max-workers': 'int',
+ '*max-chunk': 'int64', '*min-cluster-size': 'uint32' } }
##
# @BackupCommon:

View File

@@ -0,0 +1,281 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 19 Mar 2025 17:31:09 +0100
Subject: [PATCH] Revert "hpet: accept 64-bit reads and writes"
This reverts commit c2366567378dd8fb89329816003801f54e30e6f3.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/timer/hpet.c | 137 +++++++++++++++++++++++++++++-------------
hw/timer/trace-events | 3 +-
2 files changed, 96 insertions(+), 44 deletions(-)
diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index 635a060d38..5f4bb5667d 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -421,7 +421,6 @@ static uint64_t hpet_ram_read(void *opaque, hwaddr addr,
unsigned size)
{
HPETState *s = opaque;
- int shift = (addr & 4) * 8;
uint64_t cur_tick;
trace_hpet_ram_read(addr);
@@ -436,33 +435,52 @@ static uint64_t hpet_ram_read(void *opaque, hwaddr addr,
return 0;
}
- switch (addr & 0x18) {
- case HPET_TN_CFG: // including interrupt capabilities
- return timer->config >> shift;
+ switch ((addr - 0x100) % 0x20) {
+ case HPET_TN_CFG:
+ return timer->config;
+ case HPET_TN_CFG + 4: // Interrupt capabilities
+ return timer->config >> 32;
case HPET_TN_CMP: // comparator register
- return timer->cmp >> shift;
+ return timer->cmp;
+ case HPET_TN_CMP + 4:
+ return timer->cmp >> 32;
case HPET_TN_ROUTE:
- return timer->fsb >> shift;
+ return timer->fsb;
+ case HPET_TN_ROUTE + 4:
+ return timer->fsb >> 32;
default:
trace_hpet_ram_read_invalid();
break;
}
} else {
- switch (addr & ~4) {
- case HPET_ID: // including HPET_PERIOD
- return s->capability >> shift;
+ switch (addr) {
+ case HPET_ID:
+ return s->capability;
+ case HPET_PERIOD:
+ return s->capability >> 32;
case HPET_CFG:
- return s->config >> shift;
+ return s->config;
+ case HPET_CFG + 4:
+ trace_hpet_invalid_hpet_cfg(4);
+ return 0;
case HPET_COUNTER:
if (hpet_enabled(s)) {
cur_tick = hpet_get_ticks(s);
} else {
cur_tick = s->hpet_counter;
}
- trace_hpet_ram_read_reading_counter(addr & 4, cur_tick);
- return cur_tick >> shift;
+ trace_hpet_ram_read_reading_counter(0, cur_tick);
+ return cur_tick;
+ case HPET_COUNTER + 4:
+ if (hpet_enabled(s)) {
+ cur_tick = hpet_get_ticks(s);
+ } else {
+ cur_tick = s->hpet_counter;
+ }
+ trace_hpet_ram_read_reading_counter(4, cur_tick);
+ return cur_tick >> 32;
case HPET_STATUS:
- return s->isr >> shift;
+ return s->isr;
default:
trace_hpet_ram_read_invalid();
break;
@@ -476,11 +494,11 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
{
int i;
HPETState *s = opaque;
- int shift = (addr & 4) * 8;
- int len = MIN(size * 8, 64 - shift);
uint64_t old_val, new_val, cleared;
trace_hpet_ram_write(addr, value);
+ old_val = hpet_ram_read(opaque, addr, 4);
+ new_val = value;
/*address range of all TN regs*/
if (addr >= 0x100 && addr <= 0x3ff) {
@@ -492,12 +510,9 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
trace_hpet_timer_id_out_of_range(timer_id);
return;
}
- switch (addr & 0x18) {
+ switch ((addr - 0x100) % 0x20) {
case HPET_TN_CFG:
- trace_hpet_ram_write_tn_cfg(addr & 4);
- old_val = timer->config;
- new_val = deposit64(old_val, shift, len, value);
- new_val = hpet_fixup_reg(new_val, old_val, HPET_TN_CFG_WRITE_MASK);
+ trace_hpet_ram_write_tn_cfg();
if (deactivating_bit(old_val, new_val, HPET_TN_TYPE_LEVEL)) {
/*
* Do this before changing timer->config; otherwise, if
@@ -505,7 +520,8 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
*/
update_irq(timer, 0);
}
- timer->config = new_val;
+ new_val = hpet_fixup_reg(new_val, old_val, HPET_TN_CFG_WRITE_MASK);
+ timer->config = (timer->config & 0xffffffff00000000ULL) | new_val;
if (activating_bit(old_val, new_val, HPET_TN_ENABLE)
&& (s->isr & (1 << timer_id))) {
update_irq(timer, 1);
@@ -518,28 +534,56 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
hpet_set_timer(timer);
}
break;
+ case HPET_TN_CFG + 4: // Interrupt capabilities
+ trace_hpet_ram_write_invalid_tn_cfg(4);
+ break;
case HPET_TN_CMP: // comparator register
+ trace_hpet_ram_write_tn_cmp(0);
if (timer->config & HPET_TN_32BIT) {
- /* High 32-bits are zero, leave them untouched. */
- if (shift) {
- trace_hpet_ram_write_invalid_tn_cmp();
- break;
+ new_val = (uint32_t)new_val;
+ }
+ if (!timer_is_periodic(timer)
+ || (timer->config & HPET_TN_SETVAL)) {
+ timer->cmp = (timer->cmp & 0xffffffff00000000ULL) | new_val;
+ }
+ if (timer_is_periodic(timer)) {
+ /*
+ * FIXME: Clamp period to reasonable min value?
+ * Clamp period to reasonable max value
+ */
+ if (timer->config & HPET_TN_32BIT) {
+ new_val = MIN(new_val, ~0u >> 1);
}
- len = 64;
- value = (uint32_t) value;
+ timer->period =
+ (timer->period & 0xffffffff00000000ULL) | new_val;
+ }
+ /*
+ * FIXME: on a 64-bit write, HPET_TN_SETVAL should apply to the
+ * high bits part as well.
+ */
+ timer->config &= ~HPET_TN_SETVAL;
+ if (hpet_enabled(s)) {
+ hpet_set_timer(timer);
}
- trace_hpet_ram_write_tn_cmp(addr & 4);
+ break;
+ case HPET_TN_CMP + 4: // comparator register high order
+ if (timer->config & HPET_TN_32BIT) {
+ trace_hpet_ram_write_invalid_tn_cmp();
+ break;
+ }
+ trace_hpet_ram_write_tn_cmp(4);
if (!timer_is_periodic(timer)
|| (timer->config & HPET_TN_SETVAL)) {
- timer->cmp = deposit64(timer->cmp, shift, len, value);
+ timer->cmp = (timer->cmp & 0xffffffffULL) | new_val << 32;
}
if (timer_is_periodic(timer)) {
/*
* FIXME: Clamp period to reasonable min value?
* Clamp period to reasonable max value
*/
- new_val = deposit64(timer->period, shift, len, value);
- timer->period = MIN(new_val, (timer->config & HPET_TN_32BIT ? ~0u : ~0ull) >> 1);
+ new_val = MIN(new_val, ~0u >> 1);
+ timer->period =
+ (timer->period & 0xffffffffULL) | new_val << 32;
}
timer->config &= ~HPET_TN_SETVAL;
if (hpet_enabled(s)) {
@@ -547,7 +591,10 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
}
break;
case HPET_TN_ROUTE:
- timer->fsb = deposit64(timer->fsb, shift, len, value);
+ timer->fsb = (timer->fsb & 0xffffffff00000000ULL) | new_val;
+ break;
+ case HPET_TN_ROUTE + 4:
+ timer->fsb = (new_val << 32) | (timer->fsb & 0xffffffff);
break;
default:
trace_hpet_ram_write_invalid();
@@ -555,14 +602,12 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
}
return;
} else {
- switch (addr & ~4) {
+ switch (addr) {
case HPET_ID:
return;
case HPET_CFG:
- old_val = s->config;
- new_val = deposit64(old_val, shift, len, value);
new_val = hpet_fixup_reg(new_val, old_val, HPET_CFG_WRITE_MASK);
- s->config = new_val;
+ s->config = (s->config & 0xffffffff00000000ULL) | new_val;
if (activating_bit(old_val, new_val, HPET_CFG_ENABLE)) {
/* Enable main counter and interrupt generation. */
s->hpet_offset =
@@ -592,8 +637,10 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
qemu_set_irq(s->irqs[RTC_ISA_IRQ], s->rtc_irq_level);
}
break;
+ case HPET_CFG + 4:
+ trace_hpet_invalid_hpet_cfg(4);
+ break;
case HPET_STATUS:
- new_val = value << shift;
cleared = new_val & s->isr;
for (i = 0; i < s->num_timers; i++) {
if (cleared & (1 << i)) {
@@ -605,7 +652,15 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
if (hpet_enabled(s)) {
trace_hpet_ram_write_counter_write_while_enabled();
}
- s->hpet_counter = deposit64(s->hpet_counter, shift, len, value);
+ s->hpet_counter =
+ (s->hpet_counter & 0xffffffff00000000ULL) | value;
+ trace_hpet_ram_write_counter_written(0, value, s->hpet_counter);
+ break;
+ case HPET_COUNTER + 4:
+ trace_hpet_ram_write_counter_write_while_enabled();
+ s->hpet_counter =
+ (s->hpet_counter & 0xffffffffULL) | (((uint64_t)value) << 32);
+ trace_hpet_ram_write_counter_written(4, value, s->hpet_counter);
break;
default:
trace_hpet_ram_write_invalid();
@@ -619,11 +674,7 @@ static const MemoryRegionOps hpet_ram_ops = {
.write = hpet_ram_write,
.valid = {
.min_access_size = 4,
- .max_access_size = 8,
- },
- .impl = {
- .min_access_size = 4,
- .max_access_size = 8,
+ .max_access_size = 4,
},
.endianness = DEVICE_NATIVE_ENDIAN,
};
diff --git a/hw/timer/trace-events b/hw/timer/trace-events
index c5b6db49f5..dd8a53c690 100644
--- a/hw/timer/trace-events
+++ b/hw/timer/trace-events
@@ -114,7 +114,8 @@ hpet_ram_read_reading_counter(uint8_t reg_off, uint64_t cur_tick) "reading count
hpet_ram_read_invalid(void) "invalid hpet_ram_readl"
hpet_ram_write(uint64_t addr, uint64_t value) "enter hpet_ram_writel at 0x%" PRIx64 " = 0x%" PRIx64
hpet_ram_write_timer_id(uint64_t timer_id) "hpet_ram_writel timer_id = 0x%" PRIx64
-hpet_ram_write_tn_cfg(uint8_t reg_off) "hpet_ram_writel HPET_TN_CFG + %" PRIu8
+hpet_ram_write_tn_cfg(void) "hpet_ram_writel HPET_TN_CFG"
+hpet_ram_write_invalid_tn_cfg(uint8_t reg_off) "invalid HPET_TN_CFG + %" PRIu8 " write"
hpet_ram_write_tn_cmp(uint8_t reg_off) "hpet_ram_writel HPET_TN_CMP + %" PRIu8
hpet_ram_write_invalid_tn_cmp(void) "invalid HPET_TN_CMP + 4 write"
hpet_ram_write_invalid(void) "invalid hpet_ram_writel"

View File

@@ -1,117 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Mon, 29 Apr 2024 14:43:58 +0200
Subject: [PATCH] PVE backup: improve error when copy-before-write fails for
fleecing
With fleecing, failure for copy-before-write does not fail the guest
write, but only sets the snapshot error that is associated to the
copy-before-write filter, making further requests to the snapshot
access fail with EACCES, which then also fails the job. But that error
code is not the root cause of why the backup failed, so bubble up the
original snapshot error instead.
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Friedrich Weber <f.weber@proxmox.com>
---
block/copy-before-write.c | 18 ++++++++++++------
block/copy-before-write.h | 1 +
pve-backup.c | 9 +++++++++
3 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index adb27649a8..a5bb4d14f6 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -27,6 +27,7 @@
#include "qapi/qmp/qjson.h"
#include "sysemu/block-backend.h"
+#include "qemu/atomic.h"
#include "qemu/cutils.h"
#include "qapi/error.h"
#include "block/block_int.h"
@@ -75,7 +76,8 @@ typedef struct BDRVCopyBeforeWriteState {
* @snapshot_error is normally zero. But on first copy-before-write failure
* when @on_cbw_error == ON_CBW_ERROR_BREAK_SNAPSHOT, @snapshot_error takes
* value of this error (<0). After that all in-flight and further
- * snapshot-API requests will fail with that error.
+ * snapshot-API requests will fail with that error. To be accessed with
+ * atomics.
*/
int snapshot_error;
} BDRVCopyBeforeWriteState;
@@ -115,7 +117,7 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
return 0;
}
- if (s->snapshot_error) {
+ if (qatomic_read(&s->snapshot_error)) {
return 0;
}
@@ -139,9 +141,7 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
WITH_QEMU_LOCK_GUARD(&s->lock) {
if (ret < 0) {
assert(s->on_cbw_error == ON_CBW_ERROR_BREAK_SNAPSHOT);
- if (!s->snapshot_error) {
- s->snapshot_error = ret;
- }
+ qatomic_cmpxchg(&s->snapshot_error, 0, ret);
} else {
bdrv_set_dirty_bitmap(s->done_bitmap, off, end - off);
}
@@ -215,7 +215,7 @@ cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset, int64_t bytes,
QEMU_LOCK_GUARD(&s->lock);
- if (s->snapshot_error) {
+ if (qatomic_read(&s->snapshot_error)) {
g_free(req);
return NULL;
}
@@ -586,6 +586,12 @@ void bdrv_cbw_drop(BlockDriverState *bs)
bdrv_unref(bs);
}
+int bdrv_cbw_snapshot_error(BlockDriverState *bs)
+{
+ BDRVCopyBeforeWriteState *s = bs->opaque;
+ return qatomic_read(&s->snapshot_error);
+}
+
static void cbw_init(void)
{
bdrv_register(&bdrv_cbw_filter);
diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index dc6cafe7fa..a27d2d7d9f 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -44,5 +44,6 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockCopyState **bcs,
Error **errp);
void bdrv_cbw_drop(BlockDriverState *bs);
+int bdrv_cbw_snapshot_error(BlockDriverState *bs);
#endif /* COPY_BEFORE_WRITE_H */
diff --git a/pve-backup.c b/pve-backup.c
index 0f098000dd..75da1dc051 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -374,6 +374,15 @@ static void pvebackup_complete_cb(void *opaque, int ret)
di->fleecing.snapshot_access = NULL;
}
if (di->fleecing.cbw) {
+ /*
+ * With fleecing, failure for cbw does not fail the guest write, but only sets the snapshot
+ * error, making further requests to the snapshot fail with EACCES, which then also fail the
+ * job. But that code is not the root cause and just confusing, so update it.
+ */
+ int snapshot_error = bdrv_cbw_snapshot_error(di->fleecing.cbw);
+ if (di->completed_ret == -EACCES && snapshot_error) {
+ di->completed_ret = snapshot_error;
+ }
bdrv_cbw_drop(di->fleecing.cbw);
di->fleecing.cbw = NULL;
}

View File

@@ -0,0 +1,64 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 19 Mar 2025 17:31:10 +0100
Subject: [PATCH] Revert "hpet: place read-only bits directly in "new_val""
This reverts commit ba88935b0fac2588b0a739f810b58dfabf7f92c8.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/timer/hpet.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index 5f4bb5667d..5e3bf1f153 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -494,7 +494,7 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
{
int i;
HPETState *s = opaque;
- uint64_t old_val, new_val, cleared;
+ uint64_t old_val, new_val, val;
trace_hpet_ram_write(addr, value);
old_val = hpet_ram_read(opaque, addr, 4);
@@ -520,12 +520,13 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
*/
update_irq(timer, 0);
}
- new_val = hpet_fixup_reg(new_val, old_val, HPET_TN_CFG_WRITE_MASK);
- timer->config = (timer->config & 0xffffffff00000000ULL) | new_val;
+ val = hpet_fixup_reg(new_val, old_val, HPET_TN_CFG_WRITE_MASK);
+ timer->config = (timer->config & 0xffffffff00000000ULL) | val;
if (activating_bit(old_val, new_val, HPET_TN_ENABLE)
&& (s->isr & (1 << timer_id))) {
update_irq(timer, 1);
}
+
if (new_val & HPET_TN_32BIT) {
timer->cmp = (uint32_t)timer->cmp;
timer->period = (uint32_t)timer->period;
@@ -606,8 +607,8 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
case HPET_ID:
return;
case HPET_CFG:
- new_val = hpet_fixup_reg(new_val, old_val, HPET_CFG_WRITE_MASK);
- s->config = (s->config & 0xffffffff00000000ULL) | new_val;
+ val = hpet_fixup_reg(new_val, old_val, HPET_CFG_WRITE_MASK);
+ s->config = (s->config & 0xffffffff00000000ULL) | val;
if (activating_bit(old_val, new_val, HPET_CFG_ENABLE)) {
/* Enable main counter and interrupt generation. */
s->hpet_offset =
@@ -641,9 +642,9 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
trace_hpet_invalid_hpet_cfg(4);
break;
case HPET_STATUS:
- cleared = new_val & s->isr;
+ val = new_val & s->isr;
for (i = 0; i < s->num_timers; i++) {
- if (cleared & (1 << i)) {
+ if (val & (1 << i)) {
update_irq(&s->timer[i], 0);
}
}

View File

@@ -1,103 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 7 Nov 2024 17:51:14 +0100
Subject: [PATCH] PVE backup: fixup error handling for fleecing
The drained section needs to be terminated before breaking out of the
loop in the error scenarios. Otherwise, guest IO on the drive would
become stuck.
If the job is created successfully, then the job completion callback
will clean up the snapshot access block nodes. In case failure
happened before the job is created, there was no cleanup for the
snapshot access block nodes yet. Add it.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 38 +++++++++++++++++++++++++-------------
1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 75da1dc051..167f0b5c3f 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -357,22 +357,23 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
-static void pvebackup_complete_cb(void *opaque, int ret)
+static void cleanup_snapshot_access(PVEBackupDevInfo *di)
{
- PVEBackupDevInfo *di = opaque;
- di->completed_ret = ret;
-
- /*
- * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
- * won't be done as a coroutine anyways:
- * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
- * just spawn a BH calling bdrv_unref().
- * - For cbw, draining would need to spawn a BH.
- */
if (di->fleecing.snapshot_access) {
bdrv_unref(di->fleecing.snapshot_access);
di->fleecing.snapshot_access = NULL;
}
+ if (di->fleecing.cbw) {
+ bdrv_cbw_drop(di->fleecing.cbw);
+ di->fleecing.cbw = NULL;
+ }
+}
+
+static void pvebackup_complete_cb(void *opaque, int ret)
+{
+ PVEBackupDevInfo *di = opaque;
+ di->completed_ret = ret;
+
if (di->fleecing.cbw) {
/*
* With fleecing, failure for cbw does not fail the guest write, but only sets the snapshot
@@ -383,10 +384,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
if (di->completed_ret == -EACCES && snapshot_error) {
di->completed_ret = snapshot_error;
}
- bdrv_cbw_drop(di->fleecing.cbw);
- di->fleecing.cbw = NULL;
}
+ /*
+ * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
+ * won't be done as a coroutine anyways:
+ * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
+ * just spawn a BH calling bdrv_unref().
+ * - For cbw, draining would need to spawn a BH.
+ */
+ cleanup_snapshot_access(di);
+
/*
* Needs to happen outside of coroutine, because it takes the graph write lock.
*/
@@ -587,6 +595,7 @@ static void create_backup_jobs_bh(void *opaque) {
if (!di->fleecing.cbw) {
error_setg(errp, "appending cbw node for fleecing failed: %s",
local_err ? error_get_pretty(local_err) : "unknown error");
+ bdrv_drained_end(di->bs);
break;
}
@@ -599,6 +608,8 @@ static void create_backup_jobs_bh(void *opaque) {
if (!di->fleecing.snapshot_access) {
error_setg(errp, "setting up snapshot access for fleecing failed: %s",
local_err ? error_get_pretty(local_err) : "unknown error");
+ cleanup_snapshot_access(di);
+ bdrv_drained_end(di->bs);
break;
}
source_bs = di->fleecing.snapshot_access;
@@ -637,6 +648,7 @@ static void create_backup_jobs_bh(void *opaque) {
}
if (!job || local_err) {
+ cleanup_snapshot_access(di);
error_setg(errp, "backup_job_create failed: %s",
local_err ? error_get_pretty(local_err) : "null");
break;

View File

@@ -0,0 +1,68 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 19 Mar 2025 17:31:11 +0100
Subject: [PATCH] Revert "hpet: remove unnecessary variable "index""
This reverts commit 5895879aca252f4ebb2d1078eaf836c61ec54e9b.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/timer/hpet.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index 5e3bf1f153..daef12c8cf 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -421,12 +421,12 @@ static uint64_t hpet_ram_read(void *opaque, hwaddr addr,
unsigned size)
{
HPETState *s = opaque;
- uint64_t cur_tick;
+ uint64_t cur_tick, index;
trace_hpet_ram_read(addr);
-
+ index = addr;
/*address range of all TN regs*/
- if (addr >= 0x100 && addr <= 0x3ff) {
+ if (index >= 0x100 && index <= 0x3ff) {
uint8_t timer_id = (addr - 0x100) / 0x20;
HPETTimer *timer = &s->timer[timer_id];
@@ -453,7 +453,7 @@ static uint64_t hpet_ram_read(void *opaque, hwaddr addr,
break;
}
} else {
- switch (addr) {
+ switch (index) {
case HPET_ID:
return s->capability;
case HPET_PERIOD:
@@ -494,14 +494,15 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
{
int i;
HPETState *s = opaque;
- uint64_t old_val, new_val, val;
+ uint64_t old_val, new_val, val, index;
trace_hpet_ram_write(addr, value);
+ index = addr;
old_val = hpet_ram_read(opaque, addr, 4);
new_val = value;
/*address range of all TN regs*/
- if (addr >= 0x100 && addr <= 0x3ff) {
+ if (index >= 0x100 && index <= 0x3ff) {
uint8_t timer_id = (addr - 0x100) / 0x20;
HPETTimer *timer = &s->timer[timer_id];
@@ -603,7 +604,7 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
}
return;
} else {
- switch (addr) {
+ switch (index) {
case HPET_ID:
return;
case HPET_CFG:

View File

@@ -1,135 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 7 Nov 2024 17:51:15 +0100
Subject: [PATCH] PVE backup: factor out setting up snapshot access for
fleecing
Avoids some line bloat in the create_backup_jobs_bh() function and is
in preparation for setting up the snapshot access independently of
fleecing, in particular that will be useful for providing access to
the snapshot via NBD.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 95 ++++++++++++++++++++++++++++++++--------------------
1 file changed, 58 insertions(+), 37 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 167f0b5c3f..f136d004c4 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -525,6 +525,62 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
+/*
+ * Setup a snapshot-access block node for a device with associated fleecing image.
+ */
+static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
+{
+ Error *local_err = NULL;
+
+ if (!di->fleecing.bs) {
+ error_setg(errp, "no associated fleecing image");
+ return -1;
+ }
+
+ QDict *cbw_opts = qdict_new();
+ qdict_put_str(cbw_opts, "driver", "copy-before-write");
+ qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
+
+ if (di->bitmap) {
+ /*
+ * Only guest writes to parts relevant for the backup need to be intercepted with
+ * old data being copied to the fleecing image.
+ */
+ qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
+ }
+ /*
+ * Fleecing storage is supposed to be fast and it's better to break backup than guest
+ * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
+ * abort a bit before that.
+ */
+ qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
+ qdict_put_int(cbw_opts, "cbw-timeout", 45);
+
+ di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
+
+ if (!di->fleecing.cbw) {
+ error_setg(errp, "appending cbw node for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ return -1;
+ }
+
+ QDict *snapshot_access_opts = qdict_new();
+ qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
+ qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
+
+ di->fleecing.snapshot_access =
+ bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
+ if (!di->fleecing.snapshot_access) {
+ error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ return -1;
+ }
+
+ return 0;
+}
+
/*
* backup_job_create can *not* be run from a coroutine, so this can't either.
* The caller is responsible that backup_mutex is held nonetheless.
@@ -569,49 +625,14 @@ static void create_backup_jobs_bh(void *opaque) {
const char *job_id = bdrv_get_device_name(di->bs);
bdrv_graph_co_rdunlock();
if (di->fleecing.bs) {
- QDict *cbw_opts = qdict_new();
- qdict_put_str(cbw_opts, "driver", "copy-before-write");
- qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
- qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
-
- if (di->bitmap) {
- /*
- * Only guest writes to parts relevant for the backup need to be intercepted with
- * old data being copied to the fleecing image.
- */
- qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
- qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
- }
- /*
- * Fleecing storage is supposed to be fast and it's better to break backup than guest
- * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
- * abort a bit before that.
- */
- qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
- qdict_put_int(cbw_opts, "cbw-timeout", 45);
-
- di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
-
- if (!di->fleecing.cbw) {
- error_setg(errp, "appending cbw node for fleecing failed: %s",
- local_err ? error_get_pretty(local_err) : "unknown error");
- bdrv_drained_end(di->bs);
- break;
- }
-
- QDict *snapshot_access_opts = qdict_new();
- qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
- qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
-
- di->fleecing.snapshot_access =
- bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
- if (!di->fleecing.snapshot_access) {
+ if (setup_snapshot_access(di, &local_err) < 0) {
error_setg(errp, "setting up snapshot access for fleecing failed: %s",
local_err ? error_get_pretty(local_err) : "unknown error");
cleanup_snapshot_access(di);
bdrv_drained_end(di->bs);
break;
}
+
source_bs = di->fleecing.snapshot_access;
discard_source = true;

View File

@@ -0,0 +1,40 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 19 Mar 2025 17:31:12 +0100
Subject: [PATCH] Revert "hpet: ignore high bits of comparator in 32-bit mode"
This reverts commit 9eb7fad3546a89ee7cf0e90f5b1daccf89725cea.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/timer/hpet.c | 4 ----
hw/timer/trace-events | 1 -
2 files changed, 5 deletions(-)
diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index daef12c8cf..927263e2ff 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -569,10 +569,6 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
}
break;
case HPET_TN_CMP + 4: // comparator register high order
- if (timer->config & HPET_TN_32BIT) {
- trace_hpet_ram_write_invalid_tn_cmp();
- break;
- }
trace_hpet_ram_write_tn_cmp(4);
if (!timer_is_periodic(timer)
|| (timer->config & HPET_TN_SETVAL)) {
diff --git a/hw/timer/trace-events b/hw/timer/trace-events
index dd8a53c690..2b81ee0812 100644
--- a/hw/timer/trace-events
+++ b/hw/timer/trace-events
@@ -117,7 +117,6 @@ hpet_ram_write_timer_id(uint64_t timer_id) "hpet_ram_writel timer_id = 0x%" PRIx
hpet_ram_write_tn_cfg(void) "hpet_ram_writel HPET_TN_CFG"
hpet_ram_write_invalid_tn_cfg(uint8_t reg_off) "invalid HPET_TN_CFG + %" PRIu8 " write"
hpet_ram_write_tn_cmp(uint8_t reg_off) "hpet_ram_writel HPET_TN_CMP + %" PRIu8
-hpet_ram_write_invalid_tn_cmp(void) "invalid HPET_TN_CMP + 4 write"
hpet_ram_write_invalid(void) "invalid hpet_ram_writel"
hpet_ram_write_counter_write_while_enabled(void) "Writing counter while HPET enabled!"
hpet_ram_write_counter_written(uint8_t reg_off, uint64_t value, uint64_t counter) "HPET counter + %" PRIu8 "written. crt = 0x%" PRIx64 " -> 0x%" PRIx64

View File

@@ -1,135 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 7 Nov 2024 17:51:16 +0100
Subject: [PATCH] PVE backup: save device name in device info structure
The device name needs to be queried while holding the graph read lock
and since it doesn't change during the whole operation, just get it
once during setup and avoid the need to query it again in different
places.
Also in preparation to use it more often in error messages and for the
upcoming external backup access API.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 29 +++++++++++++++--------------
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index f136d004c4..8ccb281c8c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -94,6 +94,7 @@ typedef struct PVEBackupDevInfo {
size_t size;
uint64_t block_size;
uint8_t dev_id;
+ char* device_name;
int completed_ret; // INT_MAX if not completed
BdrvDirtyBitmap *bitmap;
BlockDriverState *target;
@@ -327,6 +328,8 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
}
di->bs = NULL;
+ g_free(di->device_name);
+ di->device_name = NULL;
assert(di->target == NULL);
@@ -621,9 +624,6 @@ static void create_backup_jobs_bh(void *opaque) {
BlockDriverState *source_bs = di->bs;
bool discard_source = false;
- bdrv_graph_co_rdlock();
- const char *job_id = bdrv_get_device_name(di->bs);
- bdrv_graph_co_rdunlock();
if (di->fleecing.bs) {
if (setup_snapshot_access(di, &local_err) < 0) {
error_setg(errp, "setting up snapshot access for fleecing failed: %s",
@@ -654,7 +654,7 @@ static void create_backup_jobs_bh(void *opaque) {
}
BlockJob *job = backup_job_create(
- job_id, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ di->device_name, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
bitmap_mode, false, discard_source, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT,
BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn,
&local_err);
@@ -751,6 +751,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
}
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
di->bs = bs;
+ di->device_name = g_strdup(bdrv_get_device_name(bs));
if (fleecing && device_uses_fleecing(*d)) {
g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
@@ -789,6 +790,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
di->bs = bs;
+ di->device_name = g_strdup(bdrv_get_device_name(bs));
di_list = g_list_append(di_list, di);
}
}
@@ -956,9 +958,6 @@ UuidInfo coroutine_fn *qmp_backup(
di->block_size = dump_cb_block_size;
- bdrv_graph_co_rdlock();
- const char *devname = bdrv_get_device_name(di->bs);
- bdrv_graph_co_rdunlock();
PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
size_t dirty = di->size;
@@ -973,7 +972,8 @@ UuidInfo coroutine_fn *qmp_backup(
}
action = PBS_BITMAP_ACTION_NEW;
} else {
- expect_only_dirty = proxmox_backup_check_incremental(pbs, devname, di->size) != 0;
+ expect_only_dirty =
+ proxmox_backup_check_incremental(pbs, di->device_name, di->size) != 0;
}
if (expect_only_dirty) {
@@ -997,7 +997,8 @@ UuidInfo coroutine_fn *qmp_backup(
}
}
- int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, errp);
+ int dev_id = proxmox_backup_co_register_image(pbs, di->device_name, di->size,
+ expect_only_dirty, errp);
if (dev_id < 0) {
goto err_mutex;
}
@@ -1009,7 +1010,7 @@ UuidInfo coroutine_fn *qmp_backup(
di->dev_id = dev_id;
PBSBitmapInfo *info = g_malloc(sizeof(*info));
- info->drive = g_strdup(devname);
+ info->drive = g_strdup(di->device_name);
info->action = action;
info->size = di->size;
info->dirty = dirty;
@@ -1034,10 +1035,7 @@ UuidInfo coroutine_fn *qmp_backup(
goto err_mutex;
}
- bdrv_graph_co_rdlock();
- const char *devname = bdrv_get_device_name(di->bs);
- bdrv_graph_co_rdunlock();
- di->dev_id = vma_writer_register_stream(vmaw, devname, di->size);
+ di->dev_id = vma_writer_register_stream(vmaw, di->device_name, di->size);
if (di->dev_id <= 0) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR,
"register_stream failed");
@@ -1148,6 +1146,9 @@ err:
bdrv_co_unref(di->target);
}
+ g_free(di->device_name);
+ di->device_name = NULL;
+
g_free(di);
}
g_list_free(di_list);

View File

@@ -0,0 +1,120 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 19 Mar 2025 17:31:13 +0100
Subject: [PATCH] Revert "hpet: fix and cleanup persistence of interrupt
status"
This reverts commit f0ccf770789e48b7a73497b465fdc892d28c1339.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/timer/hpet.c | 60 ++++++++++++++++---------------------------------
1 file changed, 19 insertions(+), 41 deletions(-)
diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index 927263e2ff..5aae09f166 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -199,31 +199,21 @@ static void update_irq(struct HPETTimer *timer, int set)
}
s = timer->state;
mask = 1 << timer->tn;
-
- if (set && (timer->config & HPET_TN_TYPE_LEVEL)) {
- /*
- * If HPET_TN_ENABLE bit is 0, "the timer will still operate and
- * generate appropriate status bits, but will not cause an interrupt"
- */
- s->isr |= mask;
- } else {
+ if (!set || !timer_enabled(timer) || !hpet_enabled(timer->state)) {
s->isr &= ~mask;
- }
-
- if (set && timer_enabled(timer) && hpet_enabled(s)) {
- if (timer_fsb_route(timer)) {
- address_space_stl_le(&address_space_memory, timer->fsb >> 32,
- timer->fsb & 0xffffffff, MEMTXATTRS_UNSPECIFIED,
- NULL);
- } else if (timer->config & HPET_TN_TYPE_LEVEL) {
- qemu_irq_raise(s->irqs[route]);
- } else {
- qemu_irq_pulse(s->irqs[route]);
- }
- } else {
if (!timer_fsb_route(timer)) {
qemu_irq_lower(s->irqs[route]);
}
+ } else if (timer_fsb_route(timer)) {
+ address_space_stl_le(&address_space_memory, timer->fsb >> 32,
+ timer->fsb & 0xffffffff, MEMTXATTRS_UNSPECIFIED,
+ NULL);
+ } else if (timer->config & HPET_TN_TYPE_LEVEL) {
+ s->isr |= mask;
+ qemu_irq_raise(s->irqs[route]);
+ } else {
+ s->isr &= ~mask;
+ qemu_irq_pulse(s->irqs[route]);
}
}
@@ -408,13 +398,8 @@ static void hpet_set_timer(HPETTimer *t)
static void hpet_del_timer(HPETTimer *t)
{
- HPETState *s = t->state;
timer_del(t->qemu_timer);
-
- if (s->isr & (1 << t->tn)) {
- /* For level-triggered interrupt, this leaves ISR set but lowers irq. */
- update_irq(t, 1);
- }
+ update_irq(t, 0);
}
static uint64_t hpet_ram_read(void *opaque, hwaddr addr,
@@ -514,26 +499,20 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
switch ((addr - 0x100) % 0x20) {
case HPET_TN_CFG:
trace_hpet_ram_write_tn_cfg();
- if (deactivating_bit(old_val, new_val, HPET_TN_TYPE_LEVEL)) {
- /*
- * Do this before changing timer->config; otherwise, if
- * HPET_TN_FSB is set, update_irq will not lower the qemu_irq.
- */
+ if (activating_bit(old_val, new_val, HPET_TN_FSB_ENABLE)) {
update_irq(timer, 0);
}
val = hpet_fixup_reg(new_val, old_val, HPET_TN_CFG_WRITE_MASK);
timer->config = (timer->config & 0xffffffff00000000ULL) | val;
- if (activating_bit(old_val, new_val, HPET_TN_ENABLE)
- && (s->isr & (1 << timer_id))) {
- update_irq(timer, 1);
- }
-
if (new_val & HPET_TN_32BIT) {
timer->cmp = (uint32_t)timer->cmp;
timer->period = (uint32_t)timer->period;
}
- if (hpet_enabled(s)) {
+ if (activating_bit(old_val, new_val, HPET_TN_ENABLE) &&
+ hpet_enabled(s)) {
hpet_set_timer(timer);
+ } else if (deactivating_bit(old_val, new_val, HPET_TN_ENABLE)) {
+ hpet_del_timer(timer);
}
break;
case HPET_TN_CFG + 4: // Interrupt capabilities
@@ -611,10 +590,9 @@ static void hpet_ram_write(void *opaque, hwaddr addr,
s->hpet_offset =
ticks_to_ns(s->hpet_counter) - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
for (i = 0; i < s->num_timers; i++) {
- if (timer_enabled(&s->timer[i]) && (s->isr & (1 << i))) {
- update_irq(&s->timer[i], 1);
+ if ((&s->timer[i])->cmp != ~0ULL) {
+ hpet_set_timer(&s->timer[i]);
}
- hpet_set_timer(&s->timer[i]);
}
} else if (deactivating_bit(old_val, new_val, HPET_CFG_ENABLE)) {
/* Halt main counter and disable interrupt generation. */

View File

@@ -0,0 +1,59 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:42 +0200
Subject: [PATCH] PVE backup: factor out helper to clear backup state's bitmap
list
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 28 ++++++++++++++++++----------
1 file changed, 18 insertions(+), 10 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 9b66788ab5..588ee98ffc 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -811,6 +811,23 @@ err:
return di_list;
}
+/*
+ * To be called with the backup_state.stat mutex held.
+ */
+static void clear_backup_state_bitmap_list(void) {
+
+ if (backup_state.stat.bitmap_list) {
+ GList *bl = backup_state.stat.bitmap_list;
+ while (bl) {
+ g_free(((PBSBitmapInfo *)bl->data)->drive);
+ g_free(bl->data);
+ bl = g_list_next(bl);
+ }
+ g_list_free(backup_state.stat.bitmap_list);
+ backup_state.stat.bitmap_list = NULL;
+ }
+}
+
UuidInfo coroutine_fn *qmp_backup(
const char *backup_file,
const char *password,
@@ -898,16 +915,7 @@ UuidInfo coroutine_fn *qmp_backup(
backup_state.stat.reused = 0;
/* clear previous backup's bitmap_list */
- if (backup_state.stat.bitmap_list) {
- GList *bl = backup_state.stat.bitmap_list;
- while (bl) {
- g_free(((PBSBitmapInfo *)bl->data)->drive);
- g_free(bl->data);
- bl = g_list_next(bl);
- }
- g_list_free(backup_state.stat.bitmap_list);
- backup_state.stat.bitmap_list = NULL;
- }
+ clear_backup_state_bitmap_list();
if (format == BACKUP_FORMAT_PBS) {
if (!password) {

View File

@@ -1,25 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 7 Nov 2024 17:51:17 +0100
Subject: [PATCH] PVE backup: include device name in error when setting up
snapshot access fails
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/pve-backup.c b/pve-backup.c
index 8ccb281c8c..255465676c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -626,7 +626,8 @@ static void create_backup_jobs_bh(void *opaque) {
bool discard_source = false;
if (di->fleecing.bs) {
if (setup_snapshot_access(di, &local_err) < 0) {
- error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+ error_setg(errp, "%s - setting up snapshot access for fleecing failed: %s",
+ di->device_name,
local_err ? error_get_pretty(local_err) : "unknown error");
cleanup_snapshot_access(di);
bdrv_drained_end(di->bs);

View File

@@ -0,0 +1,95 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:43 +0200
Subject: [PATCH] PVE backup: factor out helper to initialize backup state stat
struct
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 62 ++++++++++++++++++++++++++++++++--------------------
1 file changed, 38 insertions(+), 24 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 588ee98ffc..3be9930ad3 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -828,6 +828,43 @@ static void clear_backup_state_bitmap_list(void) {
}
}
+/*
+ * Initializes most of the backup state 'stat' struct. Note that 'reused' and
+ * 'bitmap_list' are not changed by this function and need to be handled by
+ * the caller. In particular, 'reused' needs to be set before calling this
+ * function.
+ *
+ * To be called with the backup_state.stat mutex held.
+ */
+static void initialize_backup_state_stat(
+ const char *backup_file,
+ uuid_t uuid,
+ size_t total)
+{
+ if (backup_state.stat.error) {
+ error_free(backup_state.stat.error);
+ backup_state.stat.error = NULL;
+ }
+
+ backup_state.stat.start_time = time(NULL);
+ backup_state.stat.end_time = 0;
+
+ if (backup_state.stat.backup_file) {
+ g_free(backup_state.stat.backup_file);
+ }
+ backup_state.stat.backup_file = g_strdup(backup_file);
+
+ uuid_copy(backup_state.stat.uuid, uuid);
+ uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
+
+ backup_state.stat.total = total;
+ backup_state.stat.dirty = total - backup_state.stat.reused;
+ backup_state.stat.transferred = 0;
+ backup_state.stat.zero_bytes = 0;
+ backup_state.stat.finishing = false;
+ backup_state.stat.starting = true;
+}
+
UuidInfo coroutine_fn *qmp_backup(
const char *backup_file,
const char *password,
@@ -1070,32 +1107,9 @@ UuidInfo coroutine_fn *qmp_backup(
}
}
/* initialize global backup_state now */
- /* note: 'reused' and 'bitmap_list' are initialized earlier */
-
- if (backup_state.stat.error) {
- error_free(backup_state.stat.error);
- backup_state.stat.error = NULL;
- }
-
- backup_state.stat.start_time = time(NULL);
- backup_state.stat.end_time = 0;
-
- if (backup_state.stat.backup_file) {
- g_free(backup_state.stat.backup_file);
- }
- backup_state.stat.backup_file = g_strdup(backup_file);
-
- uuid_copy(backup_state.stat.uuid, uuid);
- uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
+ initialize_backup_state_stat(backup_file, uuid, total);
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
- backup_state.stat.total = total;
- backup_state.stat.dirty = total - backup_state.stat.reused;
- backup_state.stat.transferred = 0;
- backup_state.stat.zero_bytes = 0;
- backup_state.stat.finishing = false;
- backup_state.stat.starting = true;
-
qemu_mutex_unlock(&backup_state.stat.lock);
backup_state.speed = (has_speed && speed > 0) ? speed : 0;

View File

@@ -0,0 +1,63 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:44 +0200
Subject: [PATCH] PVE backup: add target ID in backup state
In preparation for allowing multiple backup providers and potentially
multiple targets for a given provider. Each backup target can then
have its own dirty bitmap and there can be additional checks that the
current backup state is actually associated to the expected target.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/pve-backup.c b/pve-backup.c
index 3be9930ad3..87778f7e76 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -70,6 +70,7 @@ static struct PVEBackupState {
JobTxn *txn;
CoMutex backup_mutex;
CoMutex dump_callback_mutex;
+ char *target_id;
} backup_state;
static void pvebackup_init(void)
@@ -865,6 +866,16 @@ static void initialize_backup_state_stat(
backup_state.stat.starting = true;
}
+/*
+ * To be called with the backup_state mutex held.
+ */
+static void backup_state_set_target_id(const char *target_id) {
+ if (backup_state.target_id) {
+ g_free(backup_state.target_id);
+ }
+ backup_state.target_id = g_strdup(target_id);
+}
+
UuidInfo coroutine_fn *qmp_backup(
const char *backup_file,
const char *password,
@@ -904,7 +915,7 @@ UuidInfo coroutine_fn *qmp_backup(
if (backup_state.di_list) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR,
- "previous backup not finished");
+ "previous backup for target '%s' not finished", backup_state.target_id);
qemu_co_mutex_unlock(&backup_state.backup_mutex);
return NULL;
}
@@ -1122,6 +1133,8 @@ UuidInfo coroutine_fn *qmp_backup(
backup_state.vmaw = vmaw;
backup_state.pbs = pbs;
+ backup_state_set_target_id("Proxmox");
+
backup_state.di_list = di_list;
uuid_info = g_malloc0(sizeof(*uuid_info));

View File

@@ -0,0 +1,57 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:45 +0200
Subject: [PATCH] PVE backup: get device info: allow caller to specify filter
for which devices use fleecing
For providing snapshot-access to external backup providers, EFI and
TPM also need an associated fleecing image. The new caller will thus
need a different filter.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 87778f7e76..bd81621d51 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -719,7 +719,7 @@ static void create_backup_jobs_bh(void *opaque) {
/*
* EFI disk and TPM state are small and it's just not worth setting up fleecing for them.
*/
-static bool device_uses_fleecing(const char *device_id)
+static bool fleecing_no_efi_tpm(const char *device_id)
{
return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14);
}
@@ -731,7 +731,7 @@ static bool device_uses_fleecing(const char *device_id)
*/
static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
const char *devlist,
- bool fleecing,
+ bool (*device_uses_fleecing)(const char*),
Error **errp)
{
gchar **devs = NULL;
@@ -757,7 +757,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
di->bs = bs;
di->device_name = g_strdup(bdrv_get_device_name(bs));
- if (fleecing && device_uses_fleecing(*d)) {
+ if (device_uses_fleecing && device_uses_fleecing(*d)) {
g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
if (!fleecing_blk) {
@@ -924,7 +924,8 @@ UuidInfo coroutine_fn *qmp_backup(
format = has_format ? format : BACKUP_FORMAT_VMA;
bdrv_graph_co_rdlock();
- di_list = get_device_info(devlist, has_fleecing && fleecing, &local_err);
+ di_list = get_device_info(devlist, (has_fleecing && fleecing) ? fleecing_no_efi_tpm : NULL,
+ &local_err);
bdrv_graph_co_rdunlock();
if (local_err) {
error_propagate(errp, local_err);

View File

@@ -0,0 +1,898 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 3 Apr 2025 14:30:46 +0200
Subject: [PATCH] PVE backup: implement backup access setup and teardown API
for external providers
For external backup providers, the state of the VM's disk images at
the time the backup is started is preserved via a snapshot-access
block node. Old data is moved to the fleecing image when new guest
writes come in. The snapshot-access block node, as well as the
associated bitmap in case of incremental backup, will be exported via
NBD to the external provider. The NBD export will be done by the
management layer, the missing functionality is setting up and tearing
down the snapshot-access block nodes, which this patch adds.
It is necessary to also set up fleecing for EFI and TPM disks, so that
old data can be moved out of the way when a new guest write comes in.
There can only be one regular backup or one active backup access at
a time, because both require replacing the original block node of the
drive. Thus the backup state is re-used, and checks are added to
prohibit regular backup while snapshot access is active and vice
versa.
The block nodes added by the backup-access-setup QMP call are not
tracked anywhere else (there is no job they are associated to like for
regular backup). This requires adding a callback for teardown when
QEMU exits, i.e. in qemu_cleanup(). Otherwise, there will be an
assertion failure that the block graph is not empty when QEMU exits
before the backup-access-teardown QMP command is called.
The code for the qmp_backup_access_setup() was based on the existing
qmp_backup() routine.
The return value for the setup QMP command contains information about
the snapshot-access block nodes that can be used by the management
layer to set up the NBD exports.
There can be one dirty bitmap for each backup target ID for each
device (which are tracked in the backup_access_bitmaps hash table).
The QMP user can specify the ID of the bitmap it likes to use. This ID
is then compared to the current one for the given target and device.
If they match, the bitmap is re-used (should it still exist on the
drive, otherwise re-created). If there is a mismatch, the old bitmap
is removed and a new one is created.
The return value of the QMP command includes information about what
bitmap action was taken. Similar to what the query-backup QMP command
returns for regular backup. It also includes the bitmap name and
associated block node, so the management layer can then set up an NBD
export with the bitmap.
While the backup access is active, a background bitmap is also
required. This is necessary to implement bitmap handling according to
the original reference [0]. In particular:
- in the error case, new writes since the backup access was set up are
in the background bitmap. Because of failure, the previously tracked
writes from the backup access bitmap are still required too. Thus,
the bitmap is merged with the background bitmap to get all new
writes since the last backup.
- in the success case, continue tracking for the next incremental
backup in the backup access bitmap. New writes since the backup
access was set up are in the background bitmap. Because the backup
was successfully, clear the backup access bitmap and merge back the
background bitmap to get only the new writes.
Since QEMU cannot know if the backup was successful or not (except if
failure already happens during the setup QMP command), the management
layer needs to tell it via the teardown QMP command.
The bitmap action is also recorded in the device info now.
The backup-access api keeps track of what bitmap names got used for
which devices and thus knows when a bitmap went missing. Propagate
this information to the QMP user with a new 'missing-recreated'
variant for the taken bitmap action.
[0]: https://lore.kernel.org/qemu-devel/b68833dd-8864-4d72-7c61-c134a9835036@ya.ru/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
pve-backup.c | 519 +++++++++++++++++++++++++++++++++++++++----
pve-backup.h | 16 ++
qapi/block-core.json | 99 ++++++++-
system/runstate.c | 6 +
4 files changed, 596 insertions(+), 44 deletions(-)
create mode 100644 pve-backup.h
diff --git a/pve-backup.c b/pve-backup.c
index bd81621d51..0450303017 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1,4 +1,5 @@
#include "proxmox-backup-client.h"
+#include "pve-backup.h"
#include "vma.h"
#include "qemu/osdep.h"
@@ -14,6 +15,7 @@
#include "qobject/qdict.h"
#include "qapi/qmp/qerror.h"
#include "qemu/cutils.h"
+#include "qemu/error-report.h"
#if defined(CONFIG_MALLOC_TRIM)
#include <malloc.h>
@@ -40,6 +42,7 @@
*/
const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
+const char *BACKGROUND_BITMAP_NAME = "backup-access-background-bitmap";
static struct PVEBackupState {
struct {
@@ -98,8 +101,11 @@ typedef struct PVEBackupDevInfo {
char* device_name;
int completed_ret; // INT_MAX if not completed
BdrvDirtyBitmap *bitmap;
+ BdrvDirtyBitmap *background_bitmap; // used for external backup access
+ PBSBitmapAction bitmap_action;
BlockDriverState *target;
BlockJob *job;
+ BackupAccessSetupBitmapMode requested_bitmap_mode;
} PVEBackupDevInfo;
static void pvebackup_propagate_error(Error *err)
@@ -361,6 +367,67 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
+/*
+ * New writes since the backup access was set up are in the background bitmap. Because of failure,
+ * the previously tracked writes in di->bitmap are still required too. Thus, merge with the
+ * background bitmap to get all new writes since the last backup.
+ */
+static void handle_backup_access_bitmaps_in_error_case(PVEBackupDevInfo *di)
+{
+ Error *local_err = NULL;
+
+ if (di->bs && di->background_bitmap) {
+ bdrv_drained_begin(di->bs);
+ if (di->bitmap) {
+ bdrv_enable_dirty_bitmap(di->bitmap);
+ if (!bdrv_merge_dirty_bitmap(di->bitmap, di->background_bitmap, NULL, &local_err)) {
+ warn_report("backup access: %s - could not merge bitmaps in error path - %s",
+ di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ /*
+ * Could not merge, drop original bitmap too.
+ */
+ bdrv_release_dirty_bitmap(di->bitmap);
+ }
+ } else {
+ warn_report("backup access: %s - expected bitmap not present", di->device_name);
+ }
+ bdrv_release_dirty_bitmap(di->background_bitmap);
+ bdrv_drained_end(di->bs);
+ }
+}
+
+/*
+ * Continue tracking for next incremental backup in di->bitmap. New writes since the backup access
+ * was set up are in the background bitmap. Because the backup was successful, clear di->bitmap and
+ * merge back the background bitmap to get only the new writes.
+ */
+static void handle_backup_access_bitmaps_after_success(PVEBackupDevInfo *di)
+{
+ Error *local_err = NULL;
+
+ if (di->bs && di->background_bitmap) {
+ bdrv_drained_begin(di->bs);
+ if (di->bitmap) {
+ bdrv_enable_dirty_bitmap(di->bitmap);
+ bdrv_clear_dirty_bitmap(di->bitmap, NULL);
+ if (!bdrv_merge_dirty_bitmap(di->bitmap, di->background_bitmap, NULL, &local_err)) {
+ warn_report("backup access: %s - could not merge bitmaps after backup - %s",
+ di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ /*
+ * Could not merge, drop original bitmap too.
+ */
+ bdrv_release_dirty_bitmap(di->bitmap);
+ }
+ } else {
+ warn_report("backup access: %s - expected bitmap not present", di->device_name);
+ }
+ bdrv_release_dirty_bitmap(di->background_bitmap);
+ bdrv_drained_end(di->bs);
+ }
+}
+
static void cleanup_snapshot_access(PVEBackupDevInfo *di)
{
if (di->fleecing.snapshot_access) {
@@ -588,6 +655,51 @@ static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
return 0;
}
+static void setup_all_snapshot_access_bh(void *opaque)
+{
+ assert(!qemu_in_coroutine());
+
+ CoCtxData *data = (CoCtxData*)opaque;
+ Error **errp = (Error**)data->data;
+
+ Error *local_err = NULL;
+
+ GList *l = backup_state.di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ bdrv_drained_begin(di->bs);
+
+ if (di->bitmap) {
+ BdrvDirtyBitmap *background_bitmap =
+ bdrv_create_dirty_bitmap(di->bs, PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE,
+ BACKGROUND_BITMAP_NAME, &local_err);
+ if (!background_bitmap) {
+ error_setg(errp, "%s - creating background bitmap for backup access failed: %s",
+ di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ bdrv_drained_end(di->bs);
+ break;
+ }
+ di->background_bitmap = background_bitmap;
+ bdrv_disable_dirty_bitmap(di->bitmap);
+ }
+
+ if (setup_snapshot_access(di, &local_err) < 0) {
+ bdrv_drained_end(di->bs);
+ error_setg(errp, "%s - setting up snapshot access failed: %s", di->device_name,
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ break;
+ }
+
+ bdrv_drained_end(di->bs);
+ }
+
+ /* return */
+ aio_co_enter(data->ctx, data->co);
+}
+
/*
* backup_job_create can *not* be run from a coroutine, so this can't either.
* The caller is responsible that backup_mutex is held nonetheless.
@@ -724,6 +836,62 @@ static bool fleecing_no_efi_tpm(const char *device_id)
return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14);
}
+static bool fleecing_all(const char *device_id)
+{
+ return true;
+}
+
+static PVEBackupDevInfo coroutine_fn GRAPH_RDLOCK *get_single_device_info(
+ const char *device,
+ bool (*device_uses_fleecing)(const char*),
+ Error **errp)
+{
+ BlockBackend *blk = blk_by_name(device);
+ if (!blk) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", device);
+ return NULL;
+ }
+ BlockDriverState *bs = blk_bs(blk);
+ if (!bdrv_co_is_inserted(bs)) {
+ error_setg(errp, "Device '%s' has no medium", device);
+ return NULL;
+ }
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di->device_name = g_strdup(bdrv_get_device_name(bs));
+
+ if (device_uses_fleecing && device_uses_fleecing(device)) {
+ g_autofree gchar *fleecing_devid = g_strconcat(device, "-fleecing", NULL);
+ BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
+ if (!fleecing_blk) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", fleecing_devid);
+ goto fail;
+ }
+ BlockDriverState *fleecing_bs = blk_bs(fleecing_blk);
+ if (!bdrv_co_is_inserted(fleecing_bs)) {
+ error_setg(errp, "Device '%s' has no medium", fleecing_devid);
+ goto fail;
+ }
+ /*
+ * Fleecing image needs to be the same size to act as a cbw target.
+ */
+ if (bs->total_sectors != fleecing_bs->total_sectors) {
+ error_setg(errp, "Size mismatch for '%s' - sector count %ld != %ld",
+ fleecing_devid, fleecing_bs->total_sectors, bs->total_sectors);
+ goto fail;
+ }
+ di->fleecing.bs = fleecing_bs;
+ }
+
+ return di;
+fail:
+ g_free(di->device_name);
+ g_free(di);
+ return NULL;
+}
+
/*
* Returns a list of device infos, which needs to be freed by the caller. In
* case of an error, errp will be set, but the returned value might still be a
@@ -742,45 +910,10 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
gchar **d = devs;
while (d && *d) {
- BlockBackend *blk = blk_by_name(*d);
- if (!blk) {
- error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
- "Device '%s' not found", *d);
- goto err;
- }
- BlockDriverState *bs = blk_bs(blk);
- if (!bdrv_co_is_inserted(bs)) {
- error_setg(errp, "Device '%s' has no medium", *d);
+ PVEBackupDevInfo *di = get_single_device_info(*d, device_uses_fleecing, errp);
+ if (!di) {
goto err;
}
- PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
- di->bs = bs;
- di->device_name = g_strdup(bdrv_get_device_name(bs));
-
- if (device_uses_fleecing && device_uses_fleecing(*d)) {
- g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
- BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
- if (!fleecing_blk) {
- error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
- "Device '%s' not found", fleecing_devid);
- goto err;
- }
- BlockDriverState *fleecing_bs = blk_bs(fleecing_blk);
- if (!bdrv_co_is_inserted(fleecing_bs)) {
- error_setg(errp, "Device '%s' has no medium", fleecing_devid);
- goto err;
- }
- /*
- * Fleecing image needs to be the same size to act as a cbw target.
- */
- if (bs->total_sectors != fleecing_bs->total_sectors) {
- error_setg(errp, "Size mismatch for '%s' - sector count %ld != %ld",
- fleecing_devid, fleecing_bs->total_sectors, bs->total_sectors);
- goto err;
- }
- di->fleecing.bs = fleecing_bs;
- }
-
di_list = g_list_append(di_list, di);
d++;
}
@@ -839,8 +972,9 @@ static void clear_backup_state_bitmap_list(void) {
*/
static void initialize_backup_state_stat(
const char *backup_file,
- uuid_t uuid,
- size_t total)
+ uuid_t *uuid,
+ size_t total,
+ bool starting)
{
if (backup_state.stat.error) {
error_free(backup_state.stat.error);
@@ -855,15 +989,19 @@ static void initialize_backup_state_stat(
}
backup_state.stat.backup_file = g_strdup(backup_file);
- uuid_copy(backup_state.stat.uuid, uuid);
- uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
+ if (uuid) {
+ uuid_copy(backup_state.stat.uuid, *uuid);
+ uuid_unparse_lower(*uuid, backup_state.stat.uuid_str);
+ } else {
+ backup_state.stat.uuid_str[0] = '\0';
+ }
backup_state.stat.total = total;
backup_state.stat.dirty = total - backup_state.stat.reused;
backup_state.stat.transferred = 0;
backup_state.stat.zero_bytes = 0;
backup_state.stat.finishing = false;
- backup_state.stat.starting = true;
+ backup_state.stat.starting = starting;
}
/*
@@ -876,6 +1014,299 @@ static void backup_state_set_target_id(const char *target_id) {
backup_state.target_id = g_strdup(target_id);
}
+BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
+ const char *target_id,
+ BackupAccessSourceDeviceList *devices,
+ Error **errp)
+{
+ assert(qemu_in_coroutine());
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ Error *local_err = NULL;
+ GList *di_list = NULL;
+ GList *l;
+
+ if (backup_state.di_list) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "previous backup for target '%s' not finished", backup_state.target_id);
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return NULL;
+ }
+
+ bdrv_graph_co_rdlock();
+ for (BackupAccessSourceDeviceList *it = devices; it; it = it->next) {
+ PVEBackupDevInfo *di = get_single_device_info(it->value->device, fleecing_all, &local_err);
+ if (!di) {
+ bdrv_graph_co_rdunlock();
+ error_propagate(errp, local_err);
+ goto err;
+ }
+ di->requested_bitmap_mode = it->value->bitmap_mode;
+ di_list = g_list_append(di_list, di);
+ }
+ bdrv_graph_co_rdunlock();
+ assert(di_list);
+
+ size_t total = 0;
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ ssize_t size = bdrv_getlength(di->bs);
+ if (size < 0) {
+ error_setg_errno(errp, -size, "bdrv_getlength failed");
+ goto err;
+ }
+ di->size = size;
+ total += size;
+
+ di->completed_ret = INT_MAX;
+ }
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.reused = 0;
+
+ /* clear previous backup's bitmap_list */
+ clear_backup_state_bitmap_list();
+
+ const char *bitmap_name = target_id;
+
+ /* create bitmaps if requested */
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ di->block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE;
+
+ PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
+ size_t dirty = di->size;
+
+ if (di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NONE ||
+ di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NEW) {
+ BdrvDirtyBitmap *old_bitmap = bdrv_find_dirty_bitmap(di->bs, bitmap_name);
+ if (old_bitmap) {
+ bdrv_release_dirty_bitmap(old_bitmap);
+ action = PBS_BITMAP_ACTION_NOT_USED_REMOVED; // set below for new
+ }
+ }
+
+ BdrvDirtyBitmap *bitmap = NULL;
+ if (di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NEW ||
+ di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_USE) {
+ bitmap = bdrv_find_dirty_bitmap(di->bs, bitmap_name);
+ if (!bitmap) {
+ bitmap = bdrv_create_dirty_bitmap(di->bs, PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE,
+ bitmap_name, errp);
+ if (!bitmap) {
+ qemu_mutex_unlock(&backup_state.stat.lock);
+ goto err;
+ }
+ bdrv_set_dirty_bitmap(bitmap, 0, di->size);
+ if (di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_USE) {
+ action = PBS_BITMAP_ACTION_MISSING_RECREATED;
+ } else {
+ action = PBS_BITMAP_ACTION_NEW;
+ }
+ } else {
+ if (di->requested_bitmap_mode == BACKUP_ACCESS_SETUP_BITMAP_MODE_NEW) {
+ qemu_mutex_unlock(&backup_state.stat.lock);
+ error_setg(errp, "internal error - removed old bitmap still present");
+ goto err;
+ }
+ /* track clean chunks as reused */
+ dirty = MIN(bdrv_get_dirty_count(bitmap), di->size);
+ backup_state.stat.reused += di->size - dirty;
+ action = PBS_BITMAP_ACTION_USED;
+ }
+ }
+
+ PBSBitmapInfo *info = g_malloc(sizeof(*info));
+ info->drive = g_strdup(di->device_name);
+ info->action = action;
+ info->size = di->size;
+ info->dirty = dirty;
+ backup_state.stat.bitmap_list = g_list_append(backup_state.stat.bitmap_list, info);
+
+ di->bitmap = bitmap;
+ di->bitmap_action = action;
+ }
+
+ /* starting=false, because there is no associated QEMU job */
+ initialize_backup_state_stat(NULL, NULL, total, false);
+
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ backup_state_set_target_id(target_id);
+
+ backup_state.vmaw = NULL;
+ backup_state.pbs = NULL;
+
+ backup_state.di_list = di_list;
+
+ /* Run setup_all_snapshot_access_bh outside of coroutine (in BH) but keep
+ * backup_mutex locked. This is fine, a CoMutex can be held across yield
+ * points, and we'll release it as soon as the BH reschedules us.
+ */
+ CoCtxData waker = {
+ .co = qemu_coroutine_self(),
+ .ctx = qemu_get_current_aio_context(),
+ .data = &local_err,
+ };
+ aio_bh_schedule_oneshot(waker.ctx, setup_all_snapshot_access_bh, &waker);
+ qemu_coroutine_yield();
+
+ if (local_err) {
+ error_propagate(errp, local_err);
+ goto err;
+ }
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+
+ BackupAccessInfoList *bai_head = NULL, **p_bai_next = &bai_head;
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ BackupAccessInfoList *info = g_malloc0(sizeof(*info));
+ info->value = g_malloc0(sizeof(*info->value));
+ info->value->node_name = g_strdup(bdrv_get_node_name(di->fleecing.snapshot_access));
+ info->value->device = g_strdup(di->device_name);
+ info->value->size = di->size;
+ if (di->bitmap) {
+ info->value->bitmap_node_name = g_strdup(bdrv_get_node_name(di->bs));
+ info->value->bitmap_name = g_strdup(bitmap_name);
+ info->value->bitmap_action = di->bitmap_action;
+ info->value->has_bitmap_action = true;
+ }
+
+ *p_bai_next = info;
+ p_bai_next = &info->next;
+ }
+
+ return bai_head;
+
+err:
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ handle_backup_access_bitmaps_in_error_case(di);
+
+ g_free(di->device_name);
+ di->device_name = NULL;
+
+ g_free(di);
+ }
+ g_list_free(di_list);
+ backup_state.di_list = NULL;
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return NULL;
+}
+
+/*
+ * Caller needs to hold the backup mutex or the BQL.
+ */
+void backup_access_teardown(bool success)
+{
+ GList *l = backup_state.di_list;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.finishing = true;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ if (di->fleecing.snapshot_access) {
+ bdrv_unref(di->fleecing.snapshot_access);
+ di->fleecing.snapshot_access = NULL;
+ }
+ if (di->fleecing.cbw) {
+ bdrv_cbw_drop(di->fleecing.cbw);
+ di->fleecing.cbw = NULL;
+ }
+
+ if (success) {
+ handle_backup_access_bitmaps_after_success(di);
+ } else {
+ handle_backup_access_bitmaps_in_error_case(di);
+ }
+
+ g_free(di->device_name);
+ di->device_name = NULL;
+
+ g_free(di);
+ }
+ g_list_free(backup_state.di_list);
+ backup_state.di_list = NULL;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.end_time = time(NULL);
+ backup_state.stat.finishing = false;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+}
+
+// Not done in a coroutine, because bdrv_co_unref() and cbw_drop() would just spawn BHs anyways.
+// Caller needs to hold the backup_state.backup_mutex lock
+static void backup_access_teardown_bh(void *opaque)
+{
+ CoCtxData *data = (CoCtxData*)opaque;
+
+ backup_access_teardown(*((bool*)data->data));
+
+ /* return */
+ aio_co_enter(data->ctx, data->co);
+}
+
+void coroutine_fn qmp_backup_access_teardown(const char *target_id, bool success, Error **errp)
+{
+ assert(qemu_in_coroutine());
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ if (!backup_state.target_id) { // nothing to do
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return;
+ }
+
+ /*
+ * Continue with target_id == NULL, used by the callback registered for qemu_cleanup()
+ */
+ if (target_id && strcmp(backup_state.target_id, target_id)) {
+ error_setg(errp, "cannot teardown backup access - got target %s instead of %s",
+ target_id, backup_state.target_id);
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return;
+ }
+
+ if (!strcmp(backup_state.target_id, "Proxmox VE")) {
+ error_setg(errp, "cannot teardown backup access for PVE - use backup-cancel instead");
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return;
+ }
+
+ CoCtxData waker = {
+ .co = qemu_coroutine_self(),
+ .ctx = qemu_get_current_aio_context(),
+ .data = &success,
+ };
+ aio_bh_schedule_oneshot(waker.ctx, backup_access_teardown_bh, &waker);
+ qemu_coroutine_yield();
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return;
+}
+
UuidInfo coroutine_fn *qmp_backup(
const char *backup_file,
const char *password,
@@ -1068,6 +1499,7 @@ UuidInfo coroutine_fn *qmp_backup(
}
di->dev_id = dev_id;
+ di->bitmap_action = action;
PBSBitmapInfo *info = g_malloc(sizeof(*info));
info->drive = g_strdup(di->device_name);
@@ -1119,7 +1551,7 @@ UuidInfo coroutine_fn *qmp_backup(
}
}
/* initialize global backup_state now */
- initialize_backup_state_stat(backup_file, uuid, total);
+ initialize_backup_state_stat(backup_file, &uuid, total, true);
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -1298,5 +1730,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_masterkey = true;
ret->backup_max_workers = true;
ret->backup_fleecing = true;
+ ret->backup_access_api = true;
return ret;
}
diff --git a/pve-backup.h b/pve-backup.h
new file mode 100644
index 0000000000..9ebeef7c8f
--- /dev/null
+++ b/pve-backup.h
@@ -0,0 +1,16 @@
+/*
+ * Bacup code used by Proxmox VE
+ *
+ * Copyright (C) Proxmox Server Solutions
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef PVE_BACKUP_H
+#define PVE_BACKUP_H
+
+void backup_access_teardown(bool success);
+
+#endif /* PVE_BACKUP_H */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9bdcfa31ea..2fb51215f2 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1023,6 +1023,9 @@
#
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
+# @backup-access-api: Whether backup access API for external providers is
+# supported or not.
+#
# @backup-fleecing: Whether backup fleecing is supported or not.
#
# @backup-max-workers: Whether the 'max-workers' @BackupPerf setting is
@@ -1036,6 +1039,7 @@
'pbs-dirty-bitmap-migration': 'bool',
'pbs-masterkey': 'bool',
'pbs-library-version': 'str',
+ 'backup-access-api': 'bool',
'backup-fleecing': 'bool',
'backup-max-workers': 'bool' } }
@@ -1067,9 +1071,16 @@
# base snapshot did not match the base given for the current job or
# the crypt mode has changed.
#
+# @missing-recreated: A bitmap for incremental backup was expected to be
+# present, but was missing and thus got recreated. For example, this can
+# happen if the drive was re-attached or if the bitmap was deleted for some
+# other reason. PBS does not currently keep track of this; the backup-access
+# mechanism does.
+#
##
{ 'enum': 'PBSBitmapAction',
- 'data': ['not-used', 'not-used-removed', 'new', 'used', 'invalid'] }
+ 'data': ['not-used', 'not-used-removed', 'new', 'used', 'invalid',
+ 'missing-recreated'] }
##
# @PBSBitmapInfo:
@@ -1102,6 +1113,92 @@
##
{ 'command': 'query-pbs-bitmap-info', 'returns': ['PBSBitmapInfo'] }
+##
+# @BackupAccessInfo:
+#
+# Info associated to a snapshot access for backup. For more information about
+# the bitmap see @BackupAccessBitmapMode.
+#
+# @node-name: the block node name of the snapshot-access node.
+#
+# @device: the device on top of which the snapshot access was created.
+#
+# @size: the size of the block device in bytes.
+#
+# @bitmap-node-name: the block node name the dirty bitmap is associated to.
+#
+# @bitmap-name: the name of the dirty bitmap associated to the backup access.
+#
+# @bitmap-action: the action taken on the dirty bitmap.
+#
+##
+{ 'struct': 'BackupAccessInfo',
+ 'data': { 'node-name': 'str', 'device': 'str', 'size': 'size',
+ '*bitmap-node-name': 'str', '*bitmap-name': 'str',
+ '*bitmap-action': 'PBSBitmapAction' } }
+
+##
+# @BackupAccessSourceDevice:
+#
+# Source block device information for creating a backup access.
+#
+# @device: the block device name.
+#
+# @bitmap-mode: used to control whether the bitmap should be reused or
+# recreated or not used. Default is not using a bitmap.
+#
+##
+{ 'struct': 'BackupAccessSourceDevice',
+ 'data': { 'device': 'str', '*bitmap-mode': 'BackupAccessSetupBitmapMode' } }
+
+##
+# @BackupAccessSetupBitmapMode:
+#
+# How to setup a bitmap for a device for @backup-access-setup.
+#
+# @none: do not use a bitmap. Removes an existing bitmap if present.
+#
+# @new: create and use a new bitmap.
+#
+# @use: try to re-use an existing bitmap. Create a new one if it doesn't exist.
+##
+{ 'enum': 'BackupAccessSetupBitmapMode',
+ 'data': ['none', 'new', 'use' ] }
+
+##
+# @backup-access-setup:
+#
+# Set up snapshot access to VM drives for an external backup provider. No other
+# backup or backup access can be done before tearing down the backup access.
+#
+# @target-id: the unique ID of the backup target.
+#
+# @devices: list of devices for which to create the backup access. Also
+# controls whether to use/create a bitmap for the device. Check the
+# @bitmap-action in the result to see what action was actually taken for the
+# bitmap. Each target controls its own bitmaps.
+#
+# Returns: a list of @BackupAccessInfo, one for each device.
+#
+##
+{ 'command': 'backup-access-setup',
+ 'data': { 'target-id': 'str', 'devices': [ 'BackupAccessSourceDevice' ] },
+ 'returns': [ 'BackupAccessInfo' ], 'coroutine': true }
+
+##
+# @backup-access-teardown:
+#
+# Tear down previously setup snapshot access for the same target.
+#
+# @target-id: the ID of the backup target.
+#
+# @success: whether the backup done by the external provider was successful.
+#
+##
+{ 'command': 'backup-access-teardown',
+ 'data': { 'target-id': 'str', 'success': 'bool' },
+ 'coroutine': true }
+
##
# @BlockDeviceTimedStats:
#
diff --git a/system/runstate.c b/system/runstate.c
index 272801d307..cf775213bd 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -60,6 +60,7 @@
#include "system/system.h"
#include "system/tpm.h"
#include "trace.h"
+#include "pve-backup.h"
static NotifierList exit_notifiers =
NOTIFIER_LIST_INITIALIZER(exit_notifiers);
@@ -921,6 +922,11 @@ void qemu_cleanup(int status)
* requests happening from here on anyway.
*/
bdrv_drain_all_begin();
+ /*
+ * The backup access is set up by a QMP command, but is neither owned by a monitor nor
+ * associated to a BlockBackend. Need to tear it down manually here.
+ */
+ backup_access_teardown(false);
job_cancel_sync_all();
bdrv_close_all();

View File

@@ -0,0 +1,106 @@
From 5a8cf9e98ba1668a6a20c2fcda1704de4103ff58 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 2 Jul 2025 18:27:34 +0200
Subject: [PATCH 56/59] PVE backup: prepare for the switch to using blockdev
rather than drive
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Also allow finding block nodes by their node name rather than just via
an associated block backend, which might not exist for block nodes.
For regular drives, it is essential to not use the throttle group,
because otherwise the limits intended only for the guest would also
apply to the backup job.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
pve-backup.c | 51 +++++++++++++++++++++++++++++++++++++++------------
1 file changed, 39 insertions(+), 12 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 0450303017..457fcb7e5c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -847,29 +847,56 @@ static PVEBackupDevInfo coroutine_fn GRAPH_RDLOCK *get_single_device_info(
Error **errp)
{
BlockBackend *blk = blk_by_name(device);
- if (!blk) {
- error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
- "Device '%s' not found", device);
- return NULL;
+ BlockDriverState *root_bs, *bs;
+
+ if (blk) {
+ root_bs = bs = blk_bs(blk);
+ } else {
+ /* TODO PVE 10 - fleecing will always be attached without blk */
+ root_bs = bs = bdrv_find_node(device);
+ if (!bs) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", device);
+ return NULL;
+ }
+ /* For TPM, bs is already correct, otherwise need the file child. */
+ if (!strncmp(bs->drv->format_name, "throttle", 8)) {
+ if (!bs->file || !bs->file->bs) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found (no file child)", device);
+ return NULL;
+ }
+ bs = bs->file->bs;
+ }
}
- BlockDriverState *bs = blk_bs(blk);
+
if (!bdrv_co_is_inserted(bs)) {
error_setg(errp, "Device '%s' has no medium", device);
return NULL;
}
+
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
di->bs = bs;
- di->device_name = g_strdup(bdrv_get_device_name(bs));
+ /* Need the name of the root node, e.g. drive-scsi0 */
+ di->device_name = g_strdup(bdrv_get_device_or_node_name(root_bs));
if (device_uses_fleecing && device_uses_fleecing(device)) {
g_autofree gchar *fleecing_devid = g_strconcat(device, "-fleecing", NULL);
+ BlockDriverState *fleecing_bs;
+
BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
- if (!fleecing_blk) {
- error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
- "Device '%s' not found", fleecing_devid);
- goto fail;
+ if (fleecing_blk) {
+ fleecing_bs = blk_bs(fleecing_blk);
+ } else {
+ /* TODO PVE 10 - fleecing will always be attached without blk */
+ fleecing_bs = bdrv_find_node(fleecing_devid);
+ if (!fleecing_bs) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", fleecing_devid);
+ goto fail;
+ }
}
- BlockDriverState *fleecing_bs = blk_bs(fleecing_blk);
+
if (!bdrv_co_is_inserted(fleecing_bs)) {
error_setg(errp, "Device '%s' has no medium", fleecing_devid);
goto fail;
@@ -927,7 +954,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
di->bs = bs;
- di->device_name = g_strdup(bdrv_get_device_name(bs));
+ di->device_name = g_strdup(bdrv_get_device_or_node_name(bs));
di_list = g_list_append(di_list, di);
}
}
--
2.39.5

View File

@@ -0,0 +1,71 @@
From 5beb1f48555d74f468b6c0ca657d3be44c8ea8e3 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 2 Jul 2025 18:27:35 +0200
Subject: [PATCH 57/59] block/zeroinit: support using as blockdev driver
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
block/zeroinit.c | 12 +++++++++---
qapi/block-core.json | 5 +++--
2 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/block/zeroinit.c b/block/zeroinit.c
index f9d513db15..036edb17f5 100644
--- a/block/zeroinit.c
+++ b/block/zeroinit.c
@@ -66,6 +66,7 @@ static int zeroinit_open(BlockDriverState *bs, QDict *options, int flags,
QemuOpts *opts;
Error *local_err = NULL;
int ret;
+ const char *next = NULL;
s->extents = 0;
@@ -77,9 +78,14 @@ static int zeroinit_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
- /* Open the raw file */
- ret = bdrv_open_file_child(qemu_opt_get(opts, "x-next"), options, "next",
- bs, &local_err);
+
+ next = qemu_opt_get(opts, "x-next");
+
+ if (next) {
+ ret = bdrv_open_file_child(next, options, "next", bs, &local_err);
+ } else { /* when opened as a blockdev, there is no 'next' option */
+ ret = bdrv_open_file_child(NULL, options, "file", bs, &local_err);
+ }
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2fb51215f2..f8ed564cf0 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3586,7 +3586,7 @@
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
- 'vmdk', 'vpc', 'vvfat' ] }
+ 'vmdk', 'vpc', 'vvfat', 'zeroinit' ] }
##
# @BlockdevOptionsFile:
@@ -5172,7 +5172,8 @@
'if': 'CONFIG_BLKIO' },
'vmdk': 'BlockdevOptionsGenericCOWFormat',
'vpc': 'BlockdevOptionsGenericFormat',
- 'vvfat': 'BlockdevOptionsVVFAT'
+ 'vvfat': 'BlockdevOptionsVVFAT',
+ 'zeroinit': 'BlockdevOptionsGenericFormat'
} }
##
--
2.39.5

View File

@@ -0,0 +1,61 @@
From d180b059731818ae34e43e11495c8ac081ab89b9 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 2 Jul 2025 18:27:36 +0200
Subject: [PATCH 58/59] block/alloc-track: support using as blockdev driver
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
qapi/block-core.json | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index f8ed564cf0..07c5773717 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3567,7 +3567,8 @@
# Since: 2.9
##
{ 'enum': 'BlockdevDriver',
- 'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
+ 'data': [ 'alloc-track',
+ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
'cloop', 'compress', 'copy-before-write', 'copy-on-read', 'dmg',
'file', 'snapshot-access', 'ftp', 'ftps',
{'name': 'gluster', 'features': [ 'deprecated' ] },
@@ -3668,6 +3669,21 @@
{ 'struct': 'BlockdevOptionsNull',
'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } }
+##
+# @BlockdevOptionsAllocTrack:
+#
+# Driver specific block device options for the alloc-track backend.
+#
+# @backing: backing file with the data.
+#
+# @auto-remove: whether the alloc-track driver should drop itself
+# after completing the stream.
+#
+##
+{ 'struct': 'BlockdevOptionsAllocTrack',
+ 'base': 'BlockdevOptionsGenericFormat',
+ 'data': { 'auto-remove': 'bool', 'backing': 'BlockdevRefOrNull' } }
+
##
# @BlockdevOptionsPbs:
#
@@ -5114,6 +5130,7 @@
'*detect-zeroes': 'BlockdevDetectZeroesOptions' },
'discriminator': 'driver',
'data': {
+ 'alloc-track':'BlockdevOptionsAllocTrack',
'blkdebug': 'BlockdevOptionsBlkdebug',
'blklogwrites':'BlockdevOptionsBlklogwrites',
'blkverify': 'BlockdevOptionsBlkverify',
--
2.39.5

View File

@@ -0,0 +1,137 @@
From 76442f3eafa8cbe647fe2d39e78e817ec681143c Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 2 Jul 2025 18:27:37 +0200
Subject: [PATCH 59/59] block/qapi: include child references in block device
info
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
In combination with using a throttle filter to enforce IO limits for
a guest device, knowing the 'file' child of a block device can be
useful. If the throttle filter is only intended for guest IO, block
jobs should not also be limited by the throttle filter, so the
block operations need to be done with the 'file' child of the top
throttle node as the target. In combination with mirroring, the name
of that child is not fixed.
Another scenario is when unplugging a guest device after mirroring
below a top throttle node, where the mirror target is added explicitly
via blockdev-add. After mirroring, the target becomes the new 'file'
child of the throttle node. For unplugging, both the top throttle node
and the mirror target need to be deleted, because only implicitly
added child nodes are deleted automatically, and the current 'file'
child of the throttle node was explicitly added (as the mirror
target).
In other scenarios, it could be useful to follow the backing chain.
Note that iotests 191 and 273 use _filter_img_info, so the 'children'
information is filtered out there.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
block/qapi.c | 10 ++++++++++
qapi/block-core.json | 16 ++++++++++++++++
tests/qemu-iotests/184.out | 8 ++++++++
3 files changed, 34 insertions(+)
diff --git a/block/qapi.c b/block/qapi.c
index 2c50a6bf3b..e08a1e970f 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -51,6 +51,8 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
ImageInfo *backing_info;
BlockDriverState *backing;
BlockDeviceInfo *info;
+ BlockdevChildList **children_list_tail;
+ BdrvChild *child;
if (!bs->drv) {
error_setg(errp, "Block device %s is ejected", bs->node_name);
@@ -77,6 +79,14 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
info->node_name = g_strdup(bs->node_name);
}
+ children_list_tail = &info->children;
+ QLIST_FOREACH(child, &bs->children, next) {
+ BlockdevChild *child_ref = g_new0(BlockdevChild, 1);
+ child_ref->child = g_strdup(child->name);
+ child_ref->node_name = g_strdup(child->bs->node_name);
+ QAPI_LIST_APPEND(children_list_tail, child_ref);
+ }
+
backing = bdrv_cow_bs(bs);
if (backing) {
info->backing_file = g_strdup(backing->filename);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 07c5773717..4db27f5819 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -461,6 +461,19 @@
'direct': 'bool',
'no-flush': 'bool' } }
+##
+# @BlockdevChild:
+#
+# @child: The name of the child, for example 'file' or 'backing'.
+#
+# @node-name: The name of the child's block driver node.
+#
+# Since: 10.1
+##
+{ 'struct': 'BlockdevChild',
+ 'data': { 'child': 'str',
+ 'node-name': 'str' } }
+
##
# @BlockDeviceInfo:
#
@@ -486,6 +499,8 @@
# @backing_file_depth: number of files in the backing file chain
# (since: 1.2)
#
+# @children: Information about child block nodes. (since: 10.1)
+#
# @active: true if the backend is active; typical cases for inactive backends
# are on the migration source instance after migration completes and on the
# destination before it completes. (since: 10.0)
@@ -560,6 +575,7 @@
{ 'struct': 'BlockDeviceInfo',
'data': { 'file': 'str', '*node-name': 'str', 'ro': 'bool', 'drv': 'str',
'*backing_file': 'str', 'backing_file_depth': 'int',
+ 'children': ['BlockdevChild'],
'active': 'bool', 'encrypted': 'bool',
'detect_zeroes': 'BlockdevDetectZeroesOptions',
'bps': 'int', 'bps_rd': 'int', 'bps_wr': 'int',
diff --git a/tests/qemu-iotests/184.out b/tests/qemu-iotests/184.out
index 52692b6b3b..ef99bb2e9a 100644
--- a/tests/qemu-iotests/184.out
+++ b/tests/qemu-iotests/184.out
@@ -41,6 +41,12 @@ Testing:
},
"iops_wr": 0,
"ro": false,
+ "children": [
+ {
+ "node-name": "disk0",
+ "child": "file"
+ }
+ ],
"node-name": "throttle0",
"backing_file_depth": 1,
"drv": "throttle",
@@ -69,6 +75,8 @@ Testing:
},
"iops_wr": 0,
"ro": false,
+ "children": [
+ ],
"node-name": "disk0",
"backing_file_depth": 0,
"drv": "null-co",
--
2.39.5

View File

@@ -0,0 +1,150 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 18 Jun 2025 12:25:31 +0200
Subject: [PATCH] savevm-async: reuse migration blocker check for
snapshots/hibernation
Same rationale as with upstream QEMU commit 5aaac46793 ("migration:
savevm: consult migration blockers"), migration and (async) snapshot
are essentially the same operation and thus snapshot also needs to
check for migration blockers. For example, this catches passed-through
PCI devices, where the driver does not support migration and VirtIO-GL
display, which also does not support migration yet.
In the case of VirtIO-GL, there were crashes [0].
However, the commit notes:
> There is really no difference between live migration and savevm, except
> that savevm does not require bdrv_invalidate_cache to be implemented
> by all disks. However, it is unlikely that savevm is used with anything
> except qcow2 disks, so the penalty is small and worth the improvement
> in catching bad usage of savevm.
and for Proxmox VE, suspend-to-disk with VMDK does use savevm-async
and would be broken by simply using migration_is_blocked(). To keep
this working, introduce a new helper that filters blockers with the
prefix used by the VMDK migration blocker.
The function qemu_savevm_state_blocked() is called as part of
savevm_async_is_blocked() so no check is lost with this
patch. The helper is declared in migration/migration.c to be able to
access the 'migration_blockers'.
The VMDK blocker message is declared via a '#define', because using a
'const char*' led to the linker to complain about multiple
declarations. The message does not include the reference to the block
node anymore, but users can still easily find a VMDK disk in the VM
configuration.
Note, this also "breaks" snapshot and hibernate with VNC clipboard by
preventing it. Previously, this would "work", because the Proxmox VE
API has no check yet, but the clipboard will be broken after rollback,
in the sense that it cannot be used anymore, not just lost contents.
So some users might consider adding the check here a breaking change
even if it's technically correct to prevent snapshot and hibernate
with VNC clipboard. But other users might rightfully complain about
broken clipboard. And again, the check also prevents blockers from
passed-through PCI devices, etc. so it seems worth tolerating that
breakage.
[0]: https://forum.proxmox.com/threads/136976/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20250618102531.57444-1-f.ebner@proxmox.com>
---
block/vmdk.c | 4 +---
include/migration/blocker.h | 2 ++
migration/migration.c | 24 ++++++++++++++++++++++++
migration/migration.h | 1 +
migration/savevm-async.c | 2 +-
5 files changed, 29 insertions(+), 4 deletions(-)
diff --git a/block/vmdk.c b/block/vmdk.c
index 2adec49912..80696a8d27 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1402,9 +1402,7 @@ static int vmdk_open(BlockDriverState *bs, QDict *options, int flags,
qemu_co_mutex_init(&s->lock);
/* Disable migration when VMDK images are used */
- error_setg(&s->migration_blocker, "The vmdk format used by node '%s' "
- "does not support live migration",
- bdrv_get_device_or_node_name(bs));
+ error_setg(&s->migration_blocker, "%s", MIGRATION_BLOCKER_VMDK);
ret = migrate_add_blocker_normal(&s->migration_blocker, errp);
if (ret < 0) {
goto fail;
diff --git a/include/migration/blocker.h b/include/migration/blocker.h
index a687ac0efe..f36bfb2df1 100644
--- a/include/migration/blocker.h
+++ b/include/migration/blocker.h
@@ -18,6 +18,8 @@
#define MIG_MODE_ALL MIG_MODE__MAX
+#define MIGRATION_BLOCKER_VMDK "The vmdk format used by a disk does not support live migration"
+
/**
* @migrate_add_blocker - prevent all modes of migration from proceeding
*
diff --git a/migration/migration.c b/migration/migration.c
index 2f3430f440..ecad1aca32 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2030,6 +2030,30 @@ bool migration_is_blocked(Error **errp)
return false;
}
+bool savevm_async_is_blocked(Error **errp)
+{
+ GSList *blockers = migration_blockers[migrate_mode()];
+
+ if (qemu_savevm_state_blocked(errp)) {
+ return true;
+ }
+
+ /*
+ * The limitation for VMDK images only applies to live-migration, not
+ * snapshots, see commit 5aaac46793 ("migration: savevm: consult migration
+ * blockers").
+ */
+ while (blockers) {
+ if (strcmp(error_get_pretty(blockers->data), MIGRATION_BLOCKER_VMDK)) {
+ error_propagate(errp, error_copy(blockers->data));
+ return true;
+ }
+ blockers = g_slist_next(blockers);
+ }
+
+ return false;
+}
+
/* Returns true if continue to migrate, or false if error detected */
static bool migrate_prepare(MigrationState *s, bool resume, Error **errp)
{
diff --git a/migration/migration.h b/migration/migration.h
index d53f7cad84..b772073572 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -531,6 +531,7 @@ int migration_call_notifiers(MigrationState *s, MigrationEventType type,
int migrate_init(MigrationState *s, Error **errp);
bool migration_is_blocked(Error **errp);
+bool savevm_async_is_blocked(Error **errp);
/* True if outgoing migration has entered postcopy phase */
bool migration_in_postcopy(void);
bool migration_postcopy_is_alive(MigrationStatus state);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 730b815494..6cb91dca27 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -375,7 +375,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
return;
}
- if (qemu_savevm_state_blocked(errp)) {
+ if (savevm_async_is_blocked(errp)) {
goto fail;
}

37
debian/patches/series vendored
View File

@@ -1,9 +1,5 @@
extra/0001-monitor-qmp-fix-race-with-clients-disconnecting-earl.patch
extra/0002-scsi-megasas-Internal-cdbs-have-16-byte-length.patch
extra/0003-ide-avoid-potential-deadlock-when-draining-during-tr.patch
extra/0004-Revert-x86-acpi-workaround-Windows-not-handling-name.patch
extra/0005-virtio-net-Add-queues-before-loading-them.patch
extra/0006-virtio-net-Fix-size-check-in-dhclient-workaround.patch
extra/0002-ide-avoid-potential-deadlock-when-draining-during-tr.patch
bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch
@@ -51,14 +47,23 @@ pve/0038-block-add-alloc-track-driver.patch
pve/0039-Revert-block-rbd-workaround-for-ceph-issue-53784.patch
pve/0040-Revert-block-rbd-fix-handling-of-holes-in-.bdrv_co_b.patch
pve/0041-Revert-block-rbd-implement-bdrv_co_block_status.patch
pve/0042-alloc-track-error-out-when-auto-remove-is-not-set.patch
pve/0043-alloc-track-avoid-seemingly-superfluous-child-permis.patch
pve/0044-copy-before-write-allow-specifying-minimum-cluster-s.patch
pve/0045-backup-add-minimum-cluster-size-to-performance-optio.patch
pve/0046-PVE-backup-add-fleecing-option.patch
pve/0047-PVE-backup-improve-error-when-copy-before-write-fail.patch
pve/0048-PVE-backup-fixup-error-handling-for-fleecing.patch
pve/0049-PVE-backup-factor-out-setting-up-snapshot-access-for.patch
pve/0050-PVE-backup-save-device-name-in-device-info-structure.patch
pve/0051-PVE-backup-include-device-name-in-error-when-setting.patch
pve/0052-adapt-machine-version-deprecation-for-Proxmox-VE.patch
pve/0042-PVE-backup-add-fleecing-option.patch
pve/0043-adapt-machine-version-deprecation-for-Proxmox-VE.patch
pve/0044-Revert-hpet-avoid-timer-storms-on-periodic-timers.patch
pve/0045-Revert-hpet-store-full-64-bit-target-value-of-the-co.patch
pve/0046-Revert-hpet-accept-64-bit-reads-and-writes.patch
pve/0047-Revert-hpet-place-read-only-bits-directly-in-new_val.patch
pve/0048-Revert-hpet-remove-unnecessary-variable-index.patch
pve/0049-Revert-hpet-ignore-high-bits-of-comparator-in-32-bit.patch
pve/0050-Revert-hpet-fix-and-cleanup-persistence-of-interrupt.patch
pve/0051-PVE-backup-factor-out-helper-to-clear-backup-state-s.patch
pve/0052-PVE-backup-factor-out-helper-to-initialize-backup-st.patch
pve/0053-PVE-backup-add-target-ID-in-backup-state.patch
pve/0054-PVE-backup-get-device-info-allow-caller-to-specify-f.patch
pve/0055-PVE-backup-implement-backup-access-setup-and-teardow.patch
pve/0056-PVE-backup-prepare-for-the-switch-to-using-blockdev-.patch
pve/0057-block-zeroinit-support-using-as-blockdev-driver.patch
pve/0058-block-alloc-track-support-using-as-blockdev-driver.patch
pve/0059-block-qapi-include-child-references-in-block-device-.patch
pve/0060-savevm-async-reuse-migration-blocker-check-for-snaps.patch
pve-qemu-10.0-vitastor.patch

4
debian/rules vendored
View File

@@ -61,7 +61,6 @@ endif
--disable-xen \
--enable-curl \
--enable-docs \
--enable-glusterfs \
--enable-gnutls \
--enable-libiscsi \
--enable-libusb \
@@ -131,8 +130,7 @@ binary-indep: build install
binary-arch: build install
dh_testdir
dh_testroot
# exclude historic Changelog file, which stops at release 0.14
dh_installchangelogs --exclude=Changelog
dh_installchangelogs
dh_installdocs
dh_installexamples
dh_install

2
qemu

Submodule qemu updated: 508081a49b...ff3419cbac