Compare commits

..

1 Commits

Author SHA1 Message Date
cafc3c74a0 Add Vitastor support 2024-02-07 01:04:42 +03:00
79 changed files with 1825 additions and 2508 deletions

View File

@@ -24,7 +24,6 @@ endif
PC_BIOS_FW_PURGE_LIST_IN = \
hppa-firmware.img \
hppa-firmware64.img \
openbios-ppc \
openbios-sparc32 \
openbios-sparc64 \
@@ -32,8 +31,7 @@ PC_BIOS_FW_PURGE_LIST_IN = \
s390-ccw.img \
s390-netboot.img \
u-boot.e500 \
.*[a-zA-Z0-9]\.dtb \
.*[a-zA-Z0-9]\.dts \
.*\.dtb \
qemu_vga.ndrv \
slof.bin \
opensbi-riscv.*-generic-fw_dynamic.bin \
@@ -58,7 +56,7 @@ $(BUILDDIR): submodule
deb kvm: $(DEBS)
$(DEB_DBG): $(DEB)
$(DEB): $(BUILDDIR)
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j32
lintian $(DEBS)
sbuild: $(DSC)

180
debian/changelog vendored
View File

@@ -1,182 +1,8 @@
pve-qemu-kvm (9.1.2-1) bookworm; urgency=medium
pve-qemu-kvm (8.1.2-4+vitastor1) bookworm; urgency=medium
* update submodule and patches to QEMU 9.1.2
* Add Vitastor support
* improve error handling and edge cases with fleecing backups.
-- Proxmox Support Team <support@proxmox.com> Wed, 11 Dec 2024 16:47:21 +0100
pve-qemu-kvm (9.0.2-4) bookworm; urgency=medium
* async snapshot: ensure any dynamic vCPU-throttling applied for
auto-converge gets always disabled again after finishing the snapshot.
-- Proxmox Support Team <support@proxmox.com> Sun, 10 Nov 2024 11:23:09 +0100
pve-qemu-kvm (9.0.2-3) bookworm; urgency=medium
* pick up fix for VirtIO PCI regressions
* pick up stable fixes for 9.0, including fixes for VirtIO-net, ARM and
x86(_64) emulation, CVEs to harden NBD server against malicious clients,
as well as a few others (VNC, physmem, Intel IOMMU, ...).
-- Proxmox Support Team <support@proxmox.com> Fri, 06 Sep 2024 16:21:42 +0200
pve-qemu-kvm (9.0.2-2) bookworm; urgency=medium
* actually update submodule to QEMU 9.0.2. The previous release was still
based on 9.0.0 by mistake.
-- Proxmox Support Team <support@proxmox.com> Wed, 07 Aug 2024 10:16:01 +0200
pve-qemu-kvm (9.0.2-1) bookworm; urgency=medium
* update submodule and patches to QEMU 9.0.2. While our version had most
stable fixes included already, there are new fixes for VirtIO and VGA
display screen blanking (#4786)
* backport fix for a regression with the LSI-53c895a controller and one for
the boot order getting ignored for USB storage
-- Proxmox Support Team <support@proxmox.com> Mon, 29 Jul 2024 18:59:40 +0200
pve-qemu-kvm (9.0.0-6) bookworm; urgency=medium
* fix a regression in the zeroinit block driver that prevented importing and
cloning disks to RBD storages which are not using the krbd setting
-- Proxmox Support Team <support@proxmox.com> Mon, 08 Jul 2024 16:11:15 +0200
pve-qemu-kvm (9.0.0-5) bookworm; urgency=medium
* backport fix for CVE-2024-4467 to prevent malicious qcow2 image files from
already causing bad effects if being queried via 'qemu-img info'. For
Proxmox VE, this is an additional safe guard, as currently it directly
creates and manages the qcow2 images used by VMs and does not allow
unprivileged users to import them
* fix #4726: code cleanup: avoid superfluous check in vma backup code
-- Proxmox Support Team <support@proxmox.com> Wed, 03 Jul 2024 13:13:35 +0200
pve-qemu-kvm (9.0.0-4) bookworm; urgency=medium
* fix crash after saving a snapshot without including VM state when a VirtIO
block device with iothread is configured.
* fix edge case in error handling when opening a block device from PBS fails
* minor code cleanup in backup code
-- Proxmox Support Team <support@proxmox.com> Mon, 01 Jul 2024 11:26:11 +0200
pve-qemu-kvm (9.0.0-3) bookworm; urgency=medium
* fix crash when doing resize after hotplugging a disk using io_uring
* fix some minor issues in software CPU emulation (i.e. non-KVM) for ARM and
x86(_64)
-- Proxmox Support Team <support@proxmox.com> Wed, 29 May 2024 15:55:44 +0200
pve-qemu-kvm (9.0.0-2) bookworm; urgency=medium
* fix #5409: backup: fix copy-before-write timeout
* backup: improve error when copy-before-write fails for fleecing
* fix forwards and backwards migration with VirtIO-GPU display
* fix a regression in pflash device introduced in 8.2
* revert a commit for VirtIO PCI devices that turned out to cause more
potential security issues than what it fixed
* move compatibility flags for a new VirtIO-net feature to the correct
machine type. The feature was introduced in QEMU 8.2, but the
compatibility flags got added to machine version 8.0 instead of 8.1. This
breaks backwards migration with machine version 8.1 from a 8.2/9.0 binary
to an 8.1 binary, in cases where the guest kernel enables the feature
(e.g. Ubuntu 23.10).
While that breaks migration with machine version 8.1 from an unpatched to
a patched binary, Proxmox VE only ever had 8.2 on the test repository and
9.0 not yet in any public repository.
-- Proxmox Support Team <support@proxmox.com> Fri, 17 May 2024 17:04:52 +0200
pve-qemu-kvm (9.0.0-1) bookworm; urgency=medium
* update submodule and patches to QEMU 9.0.0
-- Proxmox Support Team <support@proxmox.com> Mon, 29 Apr 2024 10:51:37 +0200
pve-qemu-kvm (8.2.2-1) bookworm; urgency=medium
* update submodule and patches to QEMU 8.2.2
-- Proxmox Support Team <support@proxmox.com> Sat, 27 Apr 2024 12:44:30 +0200
pve-qemu-kvm (8.1.5-5) bookworm; urgency=medium
* implement support for backup fleecing
-- Proxmox Support Team <support@proxmox.com> Thu, 11 Apr 2024 17:46:48 +0200
pve-qemu-kvm (8.1.5-4) bookworm; urgency=medium
* fix live-import for certain kinds of VMDK images that rely on padding
* backup: avoid bubbling up first error if it's an ECANCELED one, as those
are often a result of cancling the job due to running into an actual
issue.
* backup: factor out & clean up gathering device info into helper
-- Proxmox Support Team <support@proxmox.com> Tue, 12 Mar 2024 14:08:40 +0100
pve-qemu-kvm (8.1.5-3) bookworm; urgency=medium
* backport fix for potential deadlock during QMP stop command if the VM has
disks attached through VirtIO-Block and IO-Thread enabled
* fix #4507: add patch to automatically increase NOFILE soft limit
-- Proxmox Support Team <support@proxmox.com> Wed, 21 Feb 2024 20:11:23 +0100
pve-qemu-kvm (8.1.5-2) bookworm; urgency=medium
* work around for a situation where guest IO might get stuck, if the VM is
configure with iothread and VirtIO block/SCSI
-- Proxmox Support Team <support@proxmox.com> Fri, 02 Feb 2024 19:41:27 +0100
pve-qemu-kvm (8.1.5-1) bookworm; urgency=medium
* update to 8.1.5 stable release, including more relevant fixes like:
- virtio-net: correctly copy vnet header when flushing TX
- hw/pflash: implement update buffer for block writes
- Fixes to i386 emulation and ARM emulation.
-- Proxmox Support Team <support@proxmox.com> Fri, 02 Feb 2024 19:08:13 +0100
pve-qemu-kvm (8.1.2-6) bookworm; urgency=medium
* revert attempted fix to avoid rare issue with stuck guest IO when using
iothread, because it caused a much more common issue with iothreads
consuming too much CPU
-- Proxmox Support Team <support@proxmox.com> Fri, 15 Dec 2023 14:22:06 +0100
pve-qemu-kvm (8.1.2-5) bookworm; urgency=medium
* backport workaround for stuck guest IO with iothread and VirtIO block/SCSI
in some rare edge cases
* backport fix for potential deadlock when issuing the "resize" QMP command
for a disk that is using iothread
-- Proxmox Support Team <support@proxmox.com> Mon, 11 Dec 2023 16:58:27 +0100
-- Vitaliy Filippov <vitalif@yourcmc.ru> Mon, 04 Dec 2023 01:40:09 +0300
pve-qemu-kvm (8.1.2-4) bookworm; urgency=medium

1
debian/control vendored
View File

@@ -59,6 +59,7 @@ Depends: ceph-common (>= 0.48),
libspice-server1 (>= 0.14.0~),
libusb-1.0-0 (>= 1.0.17-1),
libusbredirparser1 (>= 0.6-2),
vitastor-client (>= 0.9.4),
libuuid1,
${misc:Depends},
${shlibs:Depends},

View File

@@ -27,18 +27,18 @@ Signed-off-by: Ma Haocong <mahaocong@didichuxing.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: rebased for 9.1.2]
[FE: rebased for 8.1.1]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/mirror.c | 99 ++++++++++++++++++++------
block/mirror.c | 98 +++++++++++++++++++++-----
blockdev.c | 38 +++++++++-
include/block/block_int-global-state.h | 4 +-
qapi/block-core.json | 25 ++++++-
tests/unit/test-block-iothread.c | 4 +-
5 files changed, 142 insertions(+), 28 deletions(-)
5 files changed, 142 insertions(+), 27 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 61f0a717b7..83a88562c5 100644
index d3cacd1708..1ff42c8af1 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -51,7 +51,7 @@ typedef struct MirrorBlockJob {
@@ -50,7 +50,7 @@ index 61f0a717b7..83a88562c5 100644
BlockMirrorBackingMode backing_mode;
/* Whether the target image requires explicit zero-initialization */
bool zero_target;
@@ -73,6 +73,8 @@ typedef struct MirrorBlockJob {
@@ -65,6 +65,8 @@ typedef struct MirrorBlockJob {
size_t buf_size;
int64_t bdev_length;
unsigned long *cow_bitmap;
@@ -59,9 +59,9 @@ index 61f0a717b7..83a88562c5 100644
BdrvDirtyBitmap *dirty_bitmap;
BdrvDirtyBitmapIter *dbi;
uint8_t *buf;
@@ -723,7 +725,8 @@ static int mirror_exit_common(Job *job)
@@ -705,7 +707,8 @@ static int mirror_exit_common(Job *job)
bdrv_child_refresh_perms(mirror_top_bs, mirror_top_bs->backing,
&error_abort);
if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
- BlockDriverState *backing = s->is_none_mode ? src : s->base;
+ BlockDriverState *backing;
@@ -69,7 +69,7 @@ index 61f0a717b7..83a88562c5 100644
BlockDriverState *unfiltered_target = bdrv_skip_filters(target_bs);
if (bdrv_cow_bs(unfiltered_target) != backing) {
@@ -824,6 +827,16 @@ static void mirror_abort(Job *job)
@@ -809,6 +812,16 @@ static void mirror_abort(Job *job)
assert(ret == 0);
}
@@ -86,7 +86,7 @@ index 61f0a717b7..83a88562c5 100644
static void coroutine_fn mirror_throttle(MirrorBlockJob *s)
{
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -1020,7 +1033,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
@@ -997,7 +1010,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
mirror_free_init(s);
s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -96,7 +96,7 @@ index 61f0a717b7..83a88562c5 100644
ret = mirror_dirty_init(s);
if (ret < 0 || job_is_cancelled(&s->common.job)) {
goto immediate_exit;
@@ -1309,6 +1323,7 @@ static const BlockJobDriver mirror_job_driver = {
@@ -1251,6 +1265,7 @@ static const BlockJobDriver mirror_job_driver = {
.run = mirror_run,
.prepare = mirror_prepare,
.abort = mirror_abort,
@@ -104,7 +104,7 @@ index 61f0a717b7..83a88562c5 100644
.pause = mirror_pause,
.complete = mirror_complete,
.cancel = mirror_cancel,
@@ -1327,6 +1342,7 @@ static const BlockJobDriver commit_active_job_driver = {
@@ -1267,6 +1282,7 @@ static const BlockJobDriver commit_active_job_driver = {
.run = mirror_run,
.prepare = mirror_prepare,
.abort = mirror_abort,
@@ -112,7 +112,7 @@ index 61f0a717b7..83a88562c5 100644
.pause = mirror_pause,
.complete = mirror_complete,
.cancel = commit_active_cancel,
@@ -1719,7 +1735,10 @@ static BlockJob *mirror_start_job(
@@ -1658,7 +1674,10 @@ static BlockJob *mirror_start_job(
BlockCompletionFunc *cb,
void *opaque,
const BlockJobDriver *driver,
@@ -123,10 +123,10 @@ index 61f0a717b7..83a88562c5 100644
+ BlockDriverState *base,
bool auto_complete, const char *filter_node_name,
bool is_mirror, MirrorCopyMode copy_mode,
bool base_ro,
@@ -1734,10 +1753,39 @@ static BlockJob *mirror_start_job(
GLOBAL_STATE_CODE();
Error **errp)
@@ -1670,10 +1689,39 @@ static BlockJob *mirror_start_job(
uint64_t target_perms, target_shared_perms;
int ret;
- if (granularity == 0) {
- granularity = bdrv_get_default_bitmap_granularity(target);
@@ -166,7 +166,7 @@ index 61f0a717b7..83a88562c5 100644
assert(is_power_of_2(granularity));
if (buf_size < 0) {
@@ -1878,7 +1926,9 @@ static BlockJob *mirror_start_job(
@@ -1804,7 +1852,9 @@ static BlockJob *mirror_start_job(
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
@@ -176,10 +176,10 @@ index 61f0a717b7..83a88562c5 100644
+ s->bitmap_mode = bitmap_mode;
s->backing_mode = backing_mode;
s->zero_target = zero_target;
qatomic_set(&s->copy_mode, copy_mode);
@@ -1904,6 +1954,18 @@ static BlockJob *mirror_start_job(
*/
bdrv_disable_dirty_bitmap(s->dirty_bitmap);
s->copy_mode = copy_mode;
@@ -1825,6 +1875,18 @@ static BlockJob *mirror_start_job(
bdrv_disable_dirty_bitmap(s->dirty_bitmap);
}
+ if (s->sync_bitmap) {
+ bdrv_dirty_bitmap_set_busy(s->sync_bitmap, true);
@@ -193,10 +193,10 @@ index 61f0a717b7..83a88562c5 100644
+ }
+ }
+
bdrv_graph_wrlock();
ret = block_job_add_bdrv(&s->common, "source", bs, 0,
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
@@ -1986,6 +2048,9 @@ fail:
BLK_PERM_CONSISTENT_READ,
@@ -1902,6 +1964,9 @@ fail:
if (s->dirty_bitmap) {
bdrv_release_dirty_bitmap(s->dirty_bitmap);
}
@@ -206,7 +206,7 @@ index 61f0a717b7..83a88562c5 100644
job_early_fail(&s->common.job);
}
@@ -2008,35 +2073,28 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
@@ -1919,31 +1984,25 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, const char *replaces,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
@@ -231,23 +231,19 @@ index 61f0a717b7..83a88562c5 100644
- MirrorSyncMode_str(mode));
- return;
- }
-
bdrv_graph_rdlock_main_loop();
- is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
base = mode == MIRROR_SYNC_MODE_TOP ? bdrv_backing_chain_next(bs) : NULL;
bdrv_graph_rdunlock_main_loop();
mirror_start_job(job_id, bs, creation_flags, target, replaces,
speed, granularity, buf_size, backing_mode, zero_target,
on_source_error, on_target_error, unmap, NULL, NULL,
- &mirror_job_driver, is_none_mode, base, false,
- filter_node_name, true, copy_mode, false, errp);
- filter_node_name, true, copy_mode, errp);
+ &mirror_job_driver, mode, bitmap, bitmap_mode, base,
+ false, filter_node_name, true, copy_mode, false, errp);
+ false, filter_node_name, true, copy_mode, errp);
}
BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -2063,7 +2121,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -1970,7 +2029,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_LEAVE_BACKING_CHAIN, false,
on_error, on_error, true, cb, opaque,
@@ -255,13 +251,13 @@ index 61f0a717b7..83a88562c5 100644
+ &commit_active_job_driver, MIRROR_SYNC_MODE_FULL,
+ NULL, 0, base, auto_complete,
filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
base_read_only, errp);
errp);
if (!job) {
diff --git a/blockdev.c b/blockdev.c
index 835064ed03..9b10e3917c 100644
index e6eba61484..a8b1fd2a73 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2778,6 +2778,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2848,6 +2848,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
BlockDriverState *target,
const char *replaces,
enum MirrorSyncMode sync,
@@ -271,7 +267,7 @@ index 835064ed03..9b10e3917c 100644
BlockMirrorBackingMode backing_mode,
bool zero_target,
bool has_speed, int64_t speed,
@@ -2796,6 +2799,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2866,6 +2869,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
{
BlockDriverState *unfiltered_bs;
int job_flags = JOB_DEFAULT;
@@ -279,7 +275,7 @@ index 835064ed03..9b10e3917c 100644
GLOBAL_STATE_CODE();
GRAPH_RDLOCK_GUARD_MAINLOOP();
@@ -2850,6 +2854,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2920,6 +2924,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}
@@ -309,7 +305,7 @@ index 835064ed03..9b10e3917c 100644
if (!replaces) {
/* We want to mirror from @bs, but keep implicit filters on top */
unfiltered_bs = bdrv_skip_implicit_filters(bs);
@@ -2891,8 +2918,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2965,8 +2992,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
* and will allow to check whether the node still exist at mirror completion
*/
mirror_start(job_id, bs, target,
@@ -320,7 +316,7 @@ index 835064ed03..9b10e3917c 100644
on_source_error, on_target_error, unmap, filter_node_name,
copy_mode, errp);
}
@@ -3036,6 +3063,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
@@ -3114,6 +3141,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
blockdev_mirror_common(arg->job_id, bs, target_bs,
arg->replaces, arg->sync,
@@ -329,7 +325,7 @@ index 835064ed03..9b10e3917c 100644
backing_mode, zero_target,
arg->has_speed, arg->speed,
arg->has_granularity, arg->granularity,
@@ -3055,6 +3084,8 @@ void qmp_blockdev_mirror(const char *job_id,
@@ -3135,6 +3164,8 @@ void qmp_blockdev_mirror(const char *job_id,
const char *device, const char *target,
const char *replaces,
MirrorSyncMode sync,
@@ -338,7 +334,7 @@ index 835064ed03..9b10e3917c 100644
bool has_speed, int64_t speed,
bool has_granularity, uint32_t granularity,
bool has_buf_size, int64_t buf_size,
@@ -3095,7 +3126,8 @@ void qmp_blockdev_mirror(const char *job_id,
@@ -3183,7 +3214,8 @@ void qmp_blockdev_mirror(const char *job_id,
}
blockdev_mirror_common(job_id, bs, target_bs,
@@ -349,10 +345,10 @@ index 835064ed03..9b10e3917c 100644
has_granularity, granularity,
has_buf_size, buf_size,
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index eb2d92a226..f0c642b194 100644
index da5fb31089..32f0f9858a 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -158,7 +158,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
@@ -152,7 +152,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, const char *replaces,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
@@ -364,10 +360,10 @@ index eb2d92a226..f0c642b194 100644
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index aa40d44f1d..c2a337cc04 100644
index 2b1d493d6e..903392cb8f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2174,6 +2174,15 @@
@@ -2145,6 +2145,15 @@
# destination (all the disk, only the sectors allocated in the
# topmost image, or only new I/O).
#
@@ -383,7 +379,7 @@ index aa40d44f1d..c2a337cc04 100644
# @granularity: granularity of the dirty bitmap, default is 64K if the
# image format doesn't have clusters, 4K if the clusters are
# smaller than that, else the cluster size. Must be a power of 2
@@ -2216,7 +2225,9 @@
@@ -2187,7 +2196,9 @@
{ 'struct': 'DriveMirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*format': 'str', '*node-name': 'str', '*replaces': 'str',
@@ -394,7 +390,7 @@ index aa40d44f1d..c2a337cc04 100644
'*speed': 'int', '*granularity': 'uint32',
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
@@ -2496,6 +2507,15 @@
@@ -2471,6 +2482,15 @@
# destination (all the disk, only the sectors allocated in the
# topmost image, or only new I/O).
#
@@ -410,7 +406,7 @@ index aa40d44f1d..c2a337cc04 100644
# @granularity: granularity of the dirty bitmap, default is 64K if the
# image format doesn't have clusters, 4K if the clusters are
# smaller than that, else the cluster size. Must be a power of 2
@@ -2544,7 +2564,8 @@
@@ -2521,7 +2541,8 @@
{ 'command': 'blockdev-mirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*replaces': 'str',
@@ -421,10 +417,10 @@ index aa40d44f1d..c2a337cc04 100644
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index 3766d5de6b..afa44cbd34 100644
index d727a5fee8..8a34aa2328 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -755,8 +755,8 @@ static void test_propagate_mirror(void)
@@ -757,8 +757,8 @@ static void test_propagate_mirror(void)
/* Start a mirror job */
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,
@@ -434,4 +430,4 @@ index 3766d5de6b..afa44cbd34 100644
+ false, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
false, "filter_node", MIRROR_COPY_MODE_BACKGROUND,
&error_abort);
WITH_JOB_LOCK_GUARD() {

View File

@@ -24,10 +24,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 83a88562c5..fc439ea936 100644
index 1ff42c8af1..11b8a8e959 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -694,8 +694,6 @@ static int mirror_exit_common(Job *job)
@@ -682,8 +682,6 @@ static int mirror_exit_common(Job *job)
bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
}
@@ -36,9 +36,9 @@ index 83a88562c5..fc439ea936 100644
/* Make sure that the source BDS doesn't go away during bdrv_replace_node,
* before we can call bdrv_drained_end */
bdrv_ref(src);
@@ -805,6 +803,18 @@ static int mirror_exit_common(Job *job)
bdrv_drained_end(target_bs);
bdrv_unref(target_bs);
@@ -788,6 +786,18 @@ static int mirror_exit_common(Job *job)
block_job_remove_all_bdrv(bjob);
bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
+ if (s->sync_bitmap) {
+ if (s->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS ||
@@ -55,7 +55,7 @@ index 83a88562c5..fc439ea936 100644
bs_opaque->job = NULL;
bdrv_drained_end(src);
@@ -1763,10 +1773,6 @@ static BlockJob *mirror_start_job(
@@ -1699,10 +1709,6 @@ static BlockJob *mirror_start_job(
" sync mode",
MirrorSyncMode_str(sync_mode));
return NULL;
@@ -66,7 +66,7 @@ index 83a88562c5..fc439ea936 100644
}
} else if (bitmap) {
error_setg(errp,
@@ -1783,6 +1789,12 @@ static BlockJob *mirror_start_job(
@@ -1719,6 +1725,12 @@ static BlockJob *mirror_start_job(
return NULL;
}
granularity = bdrv_dirty_bitmap_granularity(bitmap);

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 3 insertions(+)
diff --git a/blockdev.c b/blockdev.c
index 9b10e3917c..c3fa897289 100644
index a8b1fd2a73..83d5cc1e49 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2875,6 +2875,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2945,6 +2945,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_ALLOW_RO, errp)) {
return;
}

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index fc439ea936..cde5d710fd 100644
index 11b8a8e959..00f2665ca4 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -809,8 +809,8 @@ static int mirror_exit_common(Job *job)
@@ -792,8 +792,8 @@ static int mirror_exit_common(Job *job)
job->ret == 0 && ret == 0)) {
/* Success; synchronize copy back to sync. */
bdrv_clear_dirty_bitmap(s->sync_bitmap, NULL);
@@ -30,7 +30,7 @@ index fc439ea936..cde5d710fd 100644
}
}
bdrv_release_dirty_bitmap(s->dirty_bitmap);
@@ -1971,11 +1971,8 @@ static BlockJob *mirror_start_job(
@@ -1892,11 +1892,8 @@ static BlockJob *mirror_start_job(
}
if (s->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
@@ -43,4 +43,4 @@ index fc439ea936..cde5d710fd 100644
+ NULL, true);
}
bdrv_graph_wrlock();
ret = block_job_add_bdrv(&s->common, "source", bs, 0,

View File

@@ -12,7 +12,7 @@ uniform w.r.t. backup block jobs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: rebase for 8.2.2]
[FE: rebase for 8.0]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/mirror.c | 28 +++------------
@@ -21,12 +21,12 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
3 files changed, 70 insertions(+), 59 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index cde5d710fd..e20f50e5fb 100644
index 00f2665ca4..60cf574de5 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1763,31 +1763,13 @@ static BlockJob *mirror_start_job(
GLOBAL_STATE_CODE();
@@ -1699,31 +1699,13 @@ static BlockJob *mirror_start_job(
uint64_t target_perms, target_shared_perms;
int ret;
- if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
- error_setg(errp, "Sync mode '%s' not supported",
@@ -62,10 +62,10 @@ index cde5d710fd..e20f50e5fb 100644
if (bitmap_mode != BITMAP_SYNC_MODE_NEVER) {
diff --git a/blockdev.c b/blockdev.c
index c3fa897289..9cbd166674 100644
index 83d5cc1e49..060d86a65f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2854,7 +2854,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2924,7 +2924,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}

View File

@@ -48,7 +48,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
6 files changed, 59 insertions(+), 5 deletions(-)
diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index c3740ec616..7f38ce6b8b 100644
index 965f5d5450..e04bd059b6 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -16,6 +16,7 @@ extern QemuOptsList qemu_mon_opts;
@@ -60,7 +60,7 @@ index c3740ec616..7f38ce6b8b 100644
void monitor_init_globals(void);
void monitor_init_globals_core(void);
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index cb628f681d..93dbd62fc2 100644
index 252de85681..8db28f9272 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -151,6 +151,13 @@ typedef struct {
@@ -78,10 +78,10 @@ index cb628f681d..93dbd62fc2 100644
/**
diff --git a/monitor/monitor.c b/monitor/monitor.c
index db52a9c7ef..2d63959351 100644
index dc352f9e9d..56e1307014 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -116,6 +116,21 @@ bool monitor_cur_is_qmp(void)
@@ -117,6 +117,21 @@ bool monitor_cur_is_qmp(void)
return cur_mon && monitor_is_qmp(cur_mon);
}
@@ -104,7 +104,7 @@ index db52a9c7ef..2d63959351 100644
* Is @mon is using readline?
* Note: not all HMP monitors use readline, e.g., gdbserver has a
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 5e538f34c0..eb181d5979 100644
index 6eee450fe4..c15bf1e1fc 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -165,6 +165,8 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
@@ -135,7 +135,7 @@ index 5e538f34c0..eb181d5979 100644
qobject_unref(rsp);
}
@@ -461,6 +473,7 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
@@ -478,6 +490,7 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
switch (event) {
case CHR_EVENT_OPENED:
@@ -144,7 +144,7 @@ index 5e538f34c0..eb181d5979 100644
monitor_qmp_caps_reset(mon);
data = qmp_greeting(mon);
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 176b549473..790bb7d1da 100644
index 555528b6bb..3baa508b4b 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -117,16 +117,28 @@ typedef struct QmpDispatchBH {
@@ -180,16 +180,16 @@ index 176b549473..790bb7d1da 100644
aio_co_wake(data->co);
}
@@ -253,6 +265,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ
@@ -231,6 +243,7 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ
.ret = &ret,
.errp = &err,
.co = qemu_coroutine_self(),
+ .conn_nr = monitor_get_connection_nr(cur_mon),
};
aio_bh_schedule_oneshot(iohandler_get_aio_context(), do_qmp_dispatch_bh,
aio_bh_schedule_oneshot(qemu_get_aio_context(), do_qmp_dispatch_bh,
&data);
diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
index 1894cdfe1f..d74d0459f0 100644
index afa477aae6..d3ff124bf3 100644
--- a/stubs/monitor-core.c
+++ b/stubs/monitor-core.c
@@ -12,6 +12,11 @@ Monitor *monitor_set_cur(Coroutine *co, Monitor *mon)
@@ -201,6 +201,6 @@ index 1894cdfe1f..d74d0459f0 100644
+ return -1;
+}
+
void qapi_event_emit(QAPIEvent event, QDict *qdict)
void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
{
}

View File

@@ -22,7 +22,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 2d0c607177..97e51733af 100644
index 32c70c9e99..984b6a3145 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -1781,7 +1781,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)

View File

@@ -55,10 +55,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 08d9218455..20d8c0cf66 100644
index 07971c0218..6a74afe564 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -456,7 +456,7 @@ static void ide_trim_bh_cb(void *opaque)
@@ -444,7 +444,7 @@ static void ide_trim_bh_cb(void *opaque)
iocb->bh = NULL;
qemu_aio_unref(iocb);
@@ -67,7 +67,7 @@ index 08d9218455..20d8c0cf66 100644
blk_dec_in_flight(blk);
}
@@ -516,6 +516,8 @@ static void ide_issue_trim_cb(void *opaque, int ret)
@@ -504,6 +504,8 @@ static void ide_issue_trim_cb(void *opaque, int ret)
done:
iocb->aiocb = NULL;
if (iocb->bh) {
@@ -76,7 +76,7 @@ index 08d9218455..20d8c0cf66 100644
replay_bh_schedule_event(iocb->bh);
}
}
@@ -528,9 +530,6 @@ BlockAIOCB *ide_issue_trim(
@@ -516,9 +518,6 @@ BlockAIOCB *ide_issue_trim(
IDEDevice *dev = s->unit ? s->bus->slave : s->bus->master;
TrimAIOCB *iocb;
@@ -86,7 +86,7 @@ index 08d9218455..20d8c0cf66 100644
iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
iocb->s = s;
iocb->bh = qemu_bh_new_guarded(ide_trim_bh_cb, iocb,
@@ -754,8 +753,9 @@ void ide_cancel_dma_sync(IDEState *s)
@@ -742,8 +741,9 @@ void ide_cancel_dma_sync(IDEState *s)
*/
if (s->bus->dma->aiocb) {
trace_ide_cancel_dma_sync_remaining();

View File

@@ -0,0 +1,48 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 28 Jul 2023 10:47:48 +0200
Subject: [PATCH] migration/block-dirty-bitmap: fix loading bitmap when there
is an iothread
The bdrv_create_dirty_bitmap() function (which is also called by
bdrv_dirty_bitmap_create_successor()) uses bdrv_getlength(bs). This is
a wrapper around a coroutine, and thus uses bdrv_poll_co(). Polling
tries to release the AioContext which will trigger an assert() if it
hasn't been acquired before.
The issue does not happen for migration, because there we are in a
coroutine already, so the wrapper will just call bdrv_co_getlength()
directly without polling.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/block-dirty-bitmap.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 032fc5f405..e1ae3b7316 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -805,8 +805,11 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
"destination", bdrv_dirty_bitmap_name(s->bitmap));
return -EINVAL;
} else {
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
s->bitmap = bdrv_create_dirty_bitmap(s->bs, granularity,
s->bitmap_name, &local_err);
+ aio_context_release(ctx);
if (!s->bitmap) {
error_report_err(local_err);
return -EINVAL;
@@ -833,7 +836,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
bdrv_disable_dirty_bitmap(s->bitmap);
if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
+ aio_context_release(ctx);
if (local_err) {
error_report_err(local_err);
return -EINVAL;

View File

@@ -0,0 +1,100 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 24 Aug 2023 11:22:21 +0200
Subject: [PATCH] hw/ide: reset: cancel async DMA operation before reseting
state
If there is a pending DMA operation during ide_bus_reset(), the fact
that the IDEstate is already reset before the operation is canceled
can be problematic. In particular, ide_dma_cb() might be called and
then use the reset IDEstate which contains the signature after the
reset. When used to construct the IO operation this leads to
ide_get_sector() returning 0 and nsector being 1. This is particularly
bad, because a write command will thus destroy the first sector which
often contains a partition table or similar.
Traces showing the unsolicited write happening with IDEstate
0x5595af6949d0 being used after reset:
> ahci_port_write ahci(0x5595af6923f0)[0]: port write [reg:PxSCTL] @ 0x2c: 0x00000300
> ahci_reset_port ahci(0x5595af6923f0)[0]: reset port
> ide_reset IDEstate 0x5595af6949d0
> ide_reset IDEstate 0x5595af694da8
> ide_bus_reset_aio aio_cancel
> dma_aio_cancel dbs=0x7f64600089a0
> dma_blk_cb dbs=0x7f64600089a0 ret=0
> dma_complete dbs=0x7f64600089a0 ret=0 cb=0x5595acd40b30
> ahci_populate_sglist ahci(0x5595af6923f0)[0]
> ahci_dma_prepare_buf ahci(0x5595af6923f0)[0]: prepare buf limit=512 prepared=512
> ide_dma_cb IDEState 0x5595af6949d0; sector_num=0 n=1 cmd=DMA WRITE
> dma_blk_io dbs=0x7f6420802010 bs=0x5595ae2c6c30 offset=0 to_dev=1
> dma_blk_cb dbs=0x7f6420802010 ret=0
> (gdb) p *qiov
> $11 = {iov = 0x7f647c76d840, niov = 1, {{nalloc = 1, local_iov = {iov_base = 0x0,
> iov_len = 512}}, {__pad = "\001\000\000\000\000\000\000\000\000\000\000",
> size = 512}}}
> (gdb) bt
> #0 blk_aio_pwritev (blk=0x5595ae2c6c30, offset=0, qiov=0x7f6420802070, flags=0,
> cb=0x5595ace6f0b0 <dma_blk_cb>, opaque=0x7f6420802010)
> at ../block/block-backend.c:1682
> #1 0x00005595ace6f185 in dma_blk_cb (opaque=0x7f6420802010, ret=<optimized out>)
> at ../softmmu/dma-helpers.c:179
> #2 0x00005595ace6f778 in dma_blk_io (ctx=0x5595ae0609f0,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> io_func=io_func@entry=0x5595ace6ee30 <dma_blk_write_io_func>,
> io_func_opaque=io_func_opaque@entry=0x5595ae2c6c30,
> cb=0x5595acd40b30 <ide_dma_cb>, opaque=0x5595af6949d0,
> dir=DMA_DIRECTION_TO_DEVICE) at ../softmmu/dma-helpers.c:244
> #3 0x00005595ace6f90a in dma_blk_write (blk=0x5595ae2c6c30,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> cb=cb@entry=0x5595acd40b30 <ide_dma_cb>, opaque=opaque@entry=0x5595af6949d0)
> at ../softmmu/dma-helpers.c:280
> #4 0x00005595acd40e18 in ide_dma_cb (opaque=0x5595af6949d0, ret=<optimized out>)
> at ../hw/ide/core.c:953
> #5 0x00005595ace6f319 in dma_complete (ret=0, dbs=0x7f64600089a0)
> at ../softmmu/dma-helpers.c:107
> #6 dma_blk_cb (opaque=0x7f64600089a0, ret=0) at ../softmmu/dma-helpers.c:127
> #7 0x00005595ad12227d in blk_aio_complete (acb=0x7f6460005b10)
> at ../block/block-backend.c:1527
> #8 blk_aio_complete (acb=0x7f6460005b10) at ../block/block-backend.c:1524
> #9 blk_aio_write_entry (opaque=0x7f6460005b10) at ../block/block-backend.c:1594
> #10 0x00005595ad258cfb in coroutine_trampoline (i0=<optimized out>,
> i1=<optimized out>) at ../util/coroutine-ucontext.c:177
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/ide/core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 6a74afe564..289347af58 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2515,19 +2515,19 @@ static void ide_dummy_transfer_stop(IDEState *s)
void ide_bus_reset(IDEBus *bus)
{
- bus->unit = 0;
- bus->cmd = 0;
- ide_reset(&bus->ifs[0]);
- ide_reset(&bus->ifs[1]);
- ide_clear_hob(bus);
-
- /* pending async DMA */
+ /* pending async DMA - needs the IDEState before it is reset */
if (bus->dma->aiocb) {
trace_ide_bus_reset_aio();
blk_aio_cancel(bus->dma->aiocb);
bus->dma->aiocb = NULL;
}
+ bus->unit = 0;
+ bus->cmd = 0;
+ ide_reset(&bus->ifs[0]);
+ ide_reset(&bus->ifs[1]);
+ ide_clear_hob(bus);
+
/* reset dma provider too */
if (bus->dma->ops->reset) {
bus->dma->ops->reset(bus->dma);

View File

@@ -1,81 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Akihiko Odaki <akihiko.odaki@daynix.com>
Date: Tue, 22 Oct 2024 15:49:01 +0900
Subject: [PATCH] virtio-net: Add queues before loading them
Call virtio_net_set_multiqueue() to add queues before loading their
states. Otherwise the loaded queues will not have handlers and elements
in them will not be processed.
Cc: qemu-stable@nongnu.org
Fixes: 8c49756825da ("virtio-net: Add only one queue pair when realizing")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
(picked from https://lore.kernel.org/qemu-devel/20241022-load-v1-1-99df0bff7939@daynix.com/)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/net/virtio-net.c | 10 ++++++++++
hw/virtio/virtio.c | 7 +++++++
include/hw/virtio/virtio.h | 2 ++
3 files changed, 19 insertions(+)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index ed33a32877..90d05f94d4 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3032,6 +3032,15 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
virtio_net_set_queue_pairs(n);
}
+static int virtio_net_pre_load_queues(VirtIODevice *vdev)
+{
+ virtio_net_set_multiqueue(VIRTIO_NET(vdev),
+ virtio_has_feature(vdev->guest_features, VIRTIO_NET_F_RSS) ||
+ virtio_has_feature(vdev->guest_features, VIRTIO_NET_F_MQ));
+
+ return 0;
+}
+
static int virtio_net_post_load_device(void *opaque, int version_id)
{
VirtIONet *n = opaque;
@@ -4010,6 +4019,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
vdc->guest_notifier_mask = virtio_net_guest_notifier_mask;
vdc->guest_notifier_pending = virtio_net_guest_notifier_pending;
vdc->legacy_features |= (0x1 << VIRTIO_NET_F_GSO);
+ vdc->pre_load_queues = virtio_net_pre_load_queues;
vdc->post_load = virtio_net_post_load_virtio;
vdc->vmsd = &vmstate_virtio_net_device;
vdc->primary_unplug_pending = primary_unplug_pending;
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 9e10cbc058..10f24a58dd 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3251,6 +3251,13 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id)
config_len--;
}
+ if (vdc->pre_load_queues) {
+ ret = vdc->pre_load_queues(vdev);
+ if (ret) {
+ return ret;
+ }
+ }
+
num = qemu_get_be32(f);
if (num > VIRTIO_QUEUE_MAX) {
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 0fcbc5c0c6..953dfca27c 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -210,6 +210,8 @@ struct VirtioDeviceClass {
void (*guest_notifier_mask)(VirtIODevice *vdev, int n, bool mask);
int (*start_ioeventfd)(VirtIODevice *vdev);
void (*stop_ioeventfd)(VirtIODevice *vdev);
+ /* Called before loading queues. Useful to add queues before loading. */
+ int (*pre_load_queues)(VirtIODevice *vdev);
/* Saving and loading of a device; trying to deprecate save/load
* use vmsd for new devices.
*/

View File

@@ -0,0 +1,140 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 28 Sep 2023 10:07:03 +0200
Subject: [PATCH] Revert "Revert "graph-lock: Disable locking for now""
This reverts commit 3cce22defb4b0e47cf135444e30cc673cff5ebad.
There are still some issues with graph locking, e.g. deadlocks during
backup canceling [0]. Because the AioContext locks still exist, it
should be safe to disable locking again.
From the original 80fc5d2600 ("graph-lock: Disable locking for now"):
> We don't currently rely on graph locking yet. It is supposed to replace
> the AioContext lock eventually to enable multiqueue support, but as long
> as we still have the AioContext lock, it is sufficient without the graph
> lock. Once the AioContext lock goes away, the deadlock doesn't exist any
> more either and this commit can be reverted. (Of course, it can also be
> reverted while the AioContext lock still exists if the callers have been
> fixed.)
[0]: https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg00729.html
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/graph-lock.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/block/graph-lock.c b/block/graph-lock.c
index 5e66f01ae8..5c2873262a 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -30,8 +30,10 @@ BdrvGraphLock graph_lock;
/* Protects the list of aiocontext and orphaned_reader_count */
static QemuMutex aio_context_list_lock;
+#if 0
/* Written and read with atomic operations. */
static int has_writer;
+#endif
/*
* A reader coroutine could move from an AioContext to another.
@@ -88,6 +90,7 @@ void unregister_aiocontext(AioContext *ctx)
g_free(ctx->bdrv_graph);
}
+#if 0
static uint32_t reader_count(void)
{
BdrvGraphRWlock *brdv_graph;
@@ -105,12 +108,19 @@ static uint32_t reader_count(void)
assert((int32_t)rd >= 0);
return rd;
}
+#endif
void bdrv_graph_wrlock(BlockDriverState *bs)
{
+#if 0
AioContext *ctx = NULL;
GLOBAL_STATE_CODE();
+ /*
+ * TODO Some callers hold an AioContext lock when this is called, which
+ * causes deadlocks. Reenable once the AioContext locking is cleaned up (or
+ * AioContext locks are gone).
+ */
assert(!qatomic_read(&has_writer));
/*
@@ -158,11 +168,13 @@ void bdrv_graph_wrlock(BlockDriverState *bs)
if (ctx) {
aio_context_acquire(bdrv_get_aio_context(bs));
}
+#endif
}
void bdrv_graph_wrunlock(void)
{
GLOBAL_STATE_CODE();
+#if 0
QEMU_LOCK_GUARD(&aio_context_list_lock);
assert(qatomic_read(&has_writer));
@@ -174,10 +186,13 @@ void bdrv_graph_wrunlock(void)
/* Wake up all coroutine that are waiting to read the graph */
qemu_co_enter_all(&reader_queue, &aio_context_list_lock);
+#endif
}
void coroutine_fn bdrv_graph_co_rdlock(void)
{
+ /* TODO Reenable when wrlock is reenabled */
+#if 0
BdrvGraphRWlock *bdrv_graph;
bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
@@ -237,10 +252,12 @@ void coroutine_fn bdrv_graph_co_rdlock(void)
qemu_co_queue_wait(&reader_queue, &aio_context_list_lock);
}
}
+#endif
}
void coroutine_fn bdrv_graph_co_rdunlock(void)
{
+#if 0
BdrvGraphRWlock *bdrv_graph;
bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
@@ -258,6 +275,7 @@ void coroutine_fn bdrv_graph_co_rdunlock(void)
if (qatomic_read(&has_writer)) {
aio_wait_kick();
}
+#endif
}
void bdrv_graph_rdlock_main_loop(void)
@@ -275,13 +293,19 @@ void bdrv_graph_rdunlock_main_loop(void)
void assert_bdrv_graph_readable(void)
{
/* reader_count() is slow due to aio_context_list_lock lock contention */
+ /* TODO Reenable when wrlock is reenabled */
+#if 0
#ifdef CONFIG_DEBUG_GRAPH_LOCK
assert(qemu_in_main_thread() || reader_count());
#endif
+#endif
}
void assert_bdrv_graph_writable(void)
{
assert(qemu_in_main_thread());
+ /* TODO Reenable when wrlock is reenabled */
+#if 0
assert(qatomic_read(&has_writer));
+#endif
}

View File

@@ -1,36 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Akihiko Odaki <akihiko.odaki@daynix.com>
Date: Fri, 22 Nov 2024 14:03:08 +0900
Subject: [PATCH] virtio-net: Fix size check in dhclient workaround
work_around_broken_dhclient() accesses IP and UDP headers to detect
relevant packets and to calculate checksums, but it didn't check if
the packet has size sufficient to accommodate them, causing out-of-bound
access hazards. Fix this by correcting the size requirement.
Fixes: 1d41b0c1ec66 ("Work around dhclient brokenness")
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
(picked from https://lore.kernel.org/qemu-devel/20241122-queue-v3-2-f2ff03b8dbfd@daynix.com/#t)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/net/virtio-net.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 90d05f94d4..c1fe457359 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1692,8 +1692,11 @@ static void virtio_net_hdr_swap(VirtIODevice *vdev, struct virtio_net_hdr *hdr)
static void work_around_broken_dhclient(struct virtio_net_hdr *hdr,
uint8_t *buf, size_t size)
{
+ size_t csum_size = ETH_HLEN + sizeof(struct ip_header) +
+ sizeof(struct udp_header);
+
if ((hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) && /* missing csum */
- (size > 27 && size < 1500) && /* normal sized MTU */
+ (size >= csum_size && size < 1500) && /* normal sized MTU */
(buf[12] == 0x08 && buf[13] == 0x00) && /* ethertype == IPv4 */
(buf[23] == 17) && /* ip.protocol == UDP */
(buf[34] == 0 && buf[35] == 67)) { /* udp.srcport == bootps */

View File

@@ -0,0 +1,57 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 28 Sep 2023 11:19:14 +0200
Subject: [PATCH] migration states: workaround snapshot performance regression
Commit 813cd616 ("migration: Use migration_transferred_bytes() to
calculate rate_limit") introduced a prohibitive performance regression
when taking a snapshot [0]. The reason turns out to be the flushing
done by migration_transferred_bytes()
Just use a _noflush version of the relevant function as a workaround
until upstream fixes the issue. This is inspired by a not-applied
upstream series [1], but doing the very minimum to avoid the
regression.
[0]: https://gitlab.com/qemu-project/qemu/-/issues/1821
[1]: https://lists.nongnu.org/archive/html/qemu-devel/2023-05/msg07708.html
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/migration-stats.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/migration/migration-stats.c b/migration/migration-stats.c
index 095d6d75bb..8073c8ebaa 100644
--- a/migration/migration-stats.c
+++ b/migration/migration-stats.c
@@ -18,6 +18,20 @@
MigrationAtomicStats mig_stats;
+/*
+ * Same as migration_transferred_bytes below, but using the _noflush
+ * variant of qemu_file_transferred() to avoid a performance
+ * regression in migration_rate_exceeded().
+ */
+static uint64_t migration_transferred_bytes_noflush(QEMUFile *f)
+{
+ uint64_t multifd = stat64_get(&mig_stats.multifd_bytes);
+ uint64_t qemu_file = qemu_file_transferred_noflush(f);
+
+ trace_migration_transferred_bytes(qemu_file, multifd);
+ return qemu_file + multifd;
+}
+
bool migration_rate_exceeded(QEMUFile *f)
{
if (qemu_file_get_error(f)) {
@@ -25,7 +39,7 @@ bool migration_rate_exceeded(QEMUFile *f)
}
uint64_t rate_limit_start = stat64_get(&mig_stats.rate_limit_start);
- uint64_t rate_limit_current = migration_transferred_bytes(f);
+ uint64_t rate_limit_current = migration_transferred_bytes_noflush(f);
uint64_t rate_limit_used = rate_limit_current - rate_limit_start;
uint64_t rate_limit_max = stat64_get(&mig_stats.rate_limit_max);

View File

@@ -24,10 +24,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 5d4bd2b710..67194bb705 100644
index bb12b0ad43..de14d3c3da 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -346,13 +346,9 @@ Aml *aml_pci_device_dsm(void)
@@ -362,13 +362,9 @@ Aml *aml_pci_device_dsm(void)
{
Aml *params = aml_local(0);
Aml *pkg = aml_package(2);

View File

@@ -0,0 +1,107 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Niklas Cassel <niklas.cassel@wdc.com>
Date: Wed, 8 Nov 2023 23:26:57 +0100
Subject: [PATCH] hw/ide/ahci: fix legacy software reset
Legacy software contains a standard mechanism for generating a reset to a
Serial ATA device - setting the SRST (software reset) bit in the Device
Control register.
Serial ATA has a more robust mechanism called COMRESET, also referred to
as port reset. A port reset is the preferred mechanism for error
recovery and should be used in place of software reset.
Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
improved the handling of PxCI, such that PxCI gets cleared after handling
a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
receiving anything - even a FIS that failed to parse, which should NOT
clear PxCI, so that you can see which command slot that caused an error).
However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
enough, we also need to clear PxCI when receiving a SRST in the Device
Control register.
A legacy software reset is performed by the host sending two H2D FISes,
the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.
The first H2D FIS will not get a D2H reply, and requires the FIS to have
the C bit set to one, such that the HBA itself will clear the bit in PxCI.
The second H2D FIS will get a D2H reply once the diagnostic is completed.
The clearing of the bit in PxCI for this command should ideally be done
in ahci_init_d2h() (if it was a legacy software reset that caused the
reset (a COMRESET does not use a command slot)). However, since the reset
value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
that way we can avoid complex logic in ahci_init_d2h().
This fixes an issue for FreeBSD where the device would fail to reset.
The problem was not noticed in Linux, because Linux uses a COMRESET
instead of a legacy software reset by default.
Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
(picked from https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02277.html)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/ide/ahci.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index d0a774bc17..1718b7e902 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -623,9 +623,13 @@ static void ahci_init_d2h(AHCIDevice *ad)
return;
}
+ /*
+ * For simplicity, do not call ahci_clear_cmd_issue() for this
+ * ahci_write_fis_d2h(). (The reset value for PxCI is 0.)
+ */
if (ahci_write_fis_d2h(ad, true)) {
ad->init_d2h_sent = true;
- /* We're emulating receiving the first Reg H2D Fis from the device;
+ /* We're emulating receiving the first Reg D2H FIS from the device;
* Update the SIG register, but otherwise proceed as normal. */
pr->sig = ((uint32_t)ide_state->hcyl << 24) |
(ide_state->lcyl << 16) |
@@ -663,6 +667,7 @@ static void ahci_reset_port(AHCIState *s, int port)
pr->scr_act = 0;
pr->tfdata = 0x7F;
pr->sig = 0xFFFFFFFF;
+ pr->cmd_issue = 0;
d->busy_slot = -1;
d->init_d2h_sent = false;
@@ -1243,10 +1248,30 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
case STATE_RUN:
if (cmd_fis[15] & ATA_SRST) {
s->dev[port].port_state = STATE_RESET;
+ /*
+ * When setting SRST in the first H2D FIS in the reset sequence,
+ * the device does not send a D2H FIS. Host software thus has to
+ * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY)
+ * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset.
+ */
+ if (opts & AHCI_CMD_CLR_BUSY) {
+ ahci_clear_cmd_issue(ad, slot);
+ }
}
break;
case STATE_RESET:
if (!(cmd_fis[15] & ATA_SRST)) {
+ /*
+ * When clearing SRST in the second H2D FIS in the reset
+ * sequence, the device will execute diagnostics. When this is
+ * done, the device will send a D2H FIS with the good status.
+ * See SATA 3.5a Gold, section 11.4 Software reset protocol.
+ *
+ * This D2H FIS is the first D2H FIS received from the device,
+ * and is received regardless if the reset was performed by a
+ * COMRESET or by setting and clearing the SRST bit. Therefore,
+ * the logic for this is found in ahci_init_d2h() and not here.
+ */
ahci_reset_port(s, port);
}
break;

View File

@@ -0,0 +1,34 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 22 Nov 2023 13:17:25 +0100
Subject: [PATCH] ui/vnc-clipboard: fix inflate_buffer
Commit d921fea338 ("ui/vnc-clipboard: fix infinite loop in
inflate_buffer (CVE-2023-3255)") removed this hunk, but it is still
required, because it can happen that stream.avail_in becomes zero
before coming across a return value of Z_STREAM_END.
This fixes the host->guest direction with noNVC.
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
ui/vnc-clipboard.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/ui/vnc-clipboard.c b/ui/vnc-clipboard.c
index c759be3438..124b6fbd9c 100644
--- a/ui/vnc-clipboard.c
+++ b/ui/vnc-clipboard.c
@@ -69,6 +69,11 @@ static uint8_t *inflate_buffer(uint8_t *in, uint32_t in_len, uint32_t *size)
}
}
+ *size = stream.total_out;
+ inflateEnd(&stream);
+
+ return out;
+
err_end:
inflateEnd(&stream);
err:

View File

@@ -1,8 +1,8 @@
Index: pve-qemu-kvm-9.1.2/block/meson.build
Index: pve-qemu-kvm-8.1.2/block/meson.build
===================================================================
--- pve-qemu-kvm-9.1.2.orig/block/meson.build
+++ pve-qemu-kvm-9.1.2/block/meson.build
@@ -126,6 +126,7 @@ foreach m : [
--- pve-qemu-kvm-8.1.2.orig/block/meson.build
+++ pve-qemu-kvm-8.1.2/block/meson.build
@@ -123,6 +123,7 @@ foreach m : [
[libnfs, 'nfs', files('nfs.c')],
[libssh, 'ssh', files('ssh.c')],
[rbd, 'rbd', files('rbd.c')],
@@ -10,11 +10,11 @@ Index: pve-qemu-kvm-9.1.2/block/meson.build
]
if m[0].found()
module_ss = ss.source_set()
Index: pve-qemu-kvm-9.1.2/meson.build
Index: pve-qemu-kvm-8.1.2/meson.build
===================================================================
--- pve-qemu-kvm-9.1.2.orig/meson.build
+++ pve-qemu-kvm-9.1.2/meson.build
@@ -1516,6 +1516,26 @@ if not get_option('rbd').auto() or have_
--- pve-qemu-kvm-8.1.2.orig/meson.build
+++ pve-qemu-kvm-8.1.2/meson.build
@@ -1303,6 +1303,26 @@ if not get_option('rbd').auto() or have_
endif
endif
@@ -41,15 +41,15 @@ Index: pve-qemu-kvm-9.1.2/meson.build
glusterfs = not_found
glusterfs_ftruncate_has_stat = false
glusterfs_iocb_has_stat = false
@@ -2367,6 +2387,7 @@ endif
@@ -2123,6 +2143,7 @@ if numa.found()
endif
config_host_data.set('CONFIG_OPENGL', opengl.found())
config_host_data.set('CONFIG_PLUGIN', get_option('plugins'))
config_host_data.set('CONFIG_RBD', rbd.found())
+config_host_data.set('CONFIG_VITASTOR', vitastor.found())
config_host_data.set('CONFIG_RDMA', rdma.found())
config_host_data.set('CONFIG_RELOCATABLE', get_option('relocatable'))
config_host_data.set('CONFIG_SAFESTACK', get_option('safe_stack'))
@@ -4534,6 +4555,7 @@ summary_info += {'fdt support': fd
config_host_data.set('CONFIG_SDL', sdl.found())
@@ -4298,6 +4319,7 @@ summary_info += {'fdt support': fd
summary_info += {'libcap-ng support': libcap_ng}
summary_info += {'bpf support': libbpf}
summary_info += {'rbd support': rbd}
@@ -57,11 +57,11 @@ Index: pve-qemu-kvm-9.1.2/meson.build
summary_info += {'smartcard support': cacard}
summary_info += {'U2F support': u2f}
summary_info += {'libusb': libusb}
Index: pve-qemu-kvm-9.1.2/meson_options.txt
Index: pve-qemu-kvm-8.1.2/meson_options.txt
===================================================================
--- pve-qemu-kvm-9.1.2.orig/meson_options.txt
+++ pve-qemu-kvm-9.1.2/meson_options.txt
@@ -194,6 +194,8 @@ option('lzo', type : 'feature', value :
--- pve-qemu-kvm-8.1.2.orig/meson_options.txt
+++ pve-qemu-kvm-8.1.2/meson_options.txt
@@ -186,6 +186,8 @@ option('lzo', type : 'feature', value :
description: 'lzo compression support')
option('rbd', type : 'feature', value : 'auto',
description: 'Ceph block device driver')
@@ -70,11 +70,11 @@ Index: pve-qemu-kvm-9.1.2/meson_options.txt
option('opengl', type : 'feature', value : 'auto',
description: 'OpenGL support')
option('rdma', type : 'feature', value : 'auto',
Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
Index: pve-qemu-kvm-8.1.2/qapi/block-core.json
===================================================================
--- pve-qemu-kvm-9.1.2.orig/qapi/block-core.json
+++ pve-qemu-kvm-9.1.2/qapi/block-core.json
@@ -3477,7 +3477,7 @@
--- pve-qemu-kvm-8.1.2.orig/qapi/block-core.json
+++ pve-qemu-kvm-8.1.2/qapi/block-core.json
@@ -3403,7 +3403,7 @@
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
'pbs',
@@ -83,7 +83,7 @@ Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
@@ -4588,6 +4588,28 @@
@@ -4465,6 +4465,28 @@
'*server': ['InetSocketAddressBase'] } }
##
@@ -112,7 +112,7 @@ Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
# @ReplicationMode:
#
# An enumeration of replication modes.
@@ -5050,6 +5072,7 @@
@@ -4923,6 +4945,7 @@
'throttle': 'BlockdevOptionsThrottle',
'vdi': 'BlockdevOptionsGenericFormat',
'vhdx': 'BlockdevOptionsGenericFormat',
@@ -120,7 +120,7 @@ Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
'virtio-blk-vfio-pci':
{ 'type': 'BlockdevOptionsVirtioBlkVfioPci',
'if': 'CONFIG_BLKIO' },
@@ -5497,6 +5520,20 @@
@@ -5360,6 +5383,17 @@
'*encrypt' : 'RbdEncryptionCreateOptions' } }
##
@@ -128,9 +128,6 @@ Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
+#
+# Driver specific image creation options for Vitastor.
+#
+# @location: Where to store the new image file. This location cannot
+# point to a snapshot.
+#
+# @size: Size of the virtual disk in bytes
+##
+{ 'struct': 'BlockdevCreateOptionsVitastor',
@@ -141,7 +138,7 @@ Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
# @BlockdevVmdkSubformat:
#
# Subformat options for VMDK images
@@ -5718,6 +5755,7 @@
@@ -5581,6 +5615,7 @@
'ssh': 'BlockdevCreateOptionsSsh',
'vdi': 'BlockdevCreateOptionsVdi',
'vhdx': 'BlockdevCreateOptionsVhdx',
@@ -149,32 +146,53 @@ Index: pve-qemu-kvm-9.1.2/qapi/block-core.json
'vmdk': 'BlockdevCreateOptionsVmdk',
'vpc': 'BlockdevCreateOptionsVpc'
} }
Index: pve-qemu-kvm-9.1.2/scripts/meson-buildoptions.sh
Index: pve-qemu-kvm-8.1.2/scripts/ci/org.centos/stream/8/x86_64/configure
===================================================================
--- pve-qemu-kvm-9.1.2.orig/scripts/meson-buildoptions.sh
+++ pve-qemu-kvm-9.1.2/scripts/meson-buildoptions.sh
@@ -168,6 +168,7 @@ meson_options_help() {
--- pve-qemu-kvm-8.1.2.orig/scripts/ci/org.centos/stream/8/x86_64/configure
+++ pve-qemu-kvm-8.1.2/scripts/ci/org.centos/stream/8/x86_64/configure
@@ -30,7 +30,7 @@
--with-suffix="qemu-kvm" \
--firmwarepath=/usr/share/qemu-firmware \
--target-list="x86_64-softmmu" \
---block-drv-rw-whitelist="qcow2,raw,file,host_device,nbd,iscsi,rbd,blkdebug,luks,null-co,nvme,copy-on-read,throttle,gluster" \
+--block-drv-rw-whitelist="qcow2,raw,file,host_device,nbd,iscsi,rbd,vitastor,blkdebug,luks,null-co,nvme,copy-on-read,throttle,gluster" \
--audio-drv-list="" \
--block-drv-ro-whitelist="vmdk,vhdx,vpc,https,ssh" \
--with-coroutine=ucontext \
@@ -176,6 +176,7 @@
--enable-opengl \
--enable-pie \
--enable-rbd \
+--enable-vitastor \
--enable-rdma \
--enable-seccomp \
--enable-snappy \
Index: pve-qemu-kvm-8.1.2/scripts/meson-buildoptions.sh
===================================================================
--- pve-qemu-kvm-8.1.2.orig/scripts/meson-buildoptions.sh
+++ pve-qemu-kvm-8.1.2/scripts/meson-buildoptions.sh
@@ -153,6 +153,7 @@ meson_options_help() {
printf "%s\n" ' qed qed image format support'
printf "%s\n" ' qga-vss build QGA VSS support (broken with MinGW)'
printf "%s\n" ' qpl Query Processing Library support'
printf "%s\n" ' rbd Ceph block device driver'
+ printf "%s\n" ' vitastor Vitastor block device driver'
printf "%s\n" ' rdma Enable RDMA-based migration'
printf "%s\n" ' replication replication support'
printf "%s\n" ' rutabaga-gfx rutabaga_gfx support'
@@ -444,6 +445,8 @@ _meson_option_parse() {
--disable-qpl) printf "%s" -Dqpl=disabled ;;
printf "%s\n" ' sdl SDL user interface'
@@ -416,6 +417,8 @@ _meson_option_parse() {
--disable-qom-cast-debug) printf "%s" -Dqom_cast_debug=false ;;
--enable-rbd) printf "%s" -Drbd=enabled ;;
--disable-rbd) printf "%s" -Drbd=disabled ;;
+ --enable-vitastor) printf "%s" -Dvitastor=enabled ;;
+ --disable-vitastor) printf "%s" -Dvitastor=disabled ;;
--enable-rdma) printf "%s" -Drdma=enabled ;;
--disable-rdma) printf "%s" -Drdma=disabled ;;
--enable-relocatable) printf "%s" -Drelocatable=true ;;
--enable-replication) printf "%s" -Dreplication=enabled ;;
Index: a/block/vitastor.c
===================================================================
--- /dev/null
+++ a/block/vitastor.c
@@ -0,0 +1,1079 @@
@@ -0,0 +1,1076 @@
+// Copyright (c) Vitaliy Filippov, 2019+
+// License: VNPL-1.1 or GNU GPL-2.0+ (see README.md for details)
+
@@ -193,6 +211,7 @@ Index: a/block/vitastor.c
+#include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/uri.h"
+#include "qemu/error-report.h"
+#include "qemu/module.h"
+#include "qemu/option.h"
@@ -1197,11 +1216,7 @@ Index: a/block/vitastor.c
+ // FIXME: Implement it along with per-inode statistics
+ //.bdrv_get_allocated_file_size = vitastor_get_allocated_file_size,
+
+#if QEMU_VERSION_MAJOR > 9 || QEMU_VERSION_MAJOR == 9 && QEMU_VERSION_MINOR > 0
+ .bdrv_open = vitastor_file_open,
+#else
+ .bdrv_file_open = vitastor_file_open,
+#endif
+ .bdrv_close = vitastor_close,
+
+ // Option list for the create operation

View File

@@ -14,7 +14,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index ff928b5e85..99e5bea1cc 100644
index aa89789737..0db366a851 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -564,7 +564,7 @@ static QemuOptsList raw_runtime_opts = {

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/net.h b/include/net/net.h
index c8f679761b..35a1338e40 100644
index 1448d00afb..d1601d32c1 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -309,8 +309,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
@@ -258,8 +258,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
int net_hub_id_for_client(NetClientState *nc, int *id);
NetClientState *net_hub_port_find(int hub_id);

View File

@@ -10,10 +10,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index fa027cc206..da7ef0cbe6 100644
index e0771a1043..1018ccc0b8 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2418,9 +2418,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
@@ -2243,9 +2243,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
#define CPU_RESOLVING_TYPE TYPE_X86_CPU
#ifdef TARGET_X86_64

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 15be640286..ea20e6153c 100644
index 52a59386d7..b20c25aee0 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -690,32 +690,35 @@ static void qemu_spice_init(void)
@@ -691,32 +691,35 @@ static void qemu_spice_init(void)
if (tls_port) {
x509_dir = qemu_opt_get(opts, "x509-dir");

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/block/gluster.c b/block/gluster.c
index f8b415f381..02bde39d94 100644
index ad5fadbe79..d0011085c4 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -42,7 +42,7 @@
@@ -43,7 +43,7 @@
#define GLUSTER_DEBUG_DEFAULT 4
#define GLUSTER_DEBUG_MAX 9
#define GLUSTER_OPT_LOGFILE "logfile"
@@ -21,7 +21,7 @@ index f8b415f381..02bde39d94 100644
/*
* Several versions of GlusterFS (3.12? -> 6.0.1) fail when the transfer size
* is greater or equal to 1024 MiB, so we are limiting the transfer size to 512
@@ -421,6 +421,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
@@ -425,6 +425,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
int old_errno;
SocketAddressList *server;
uint64_t port;
@@ -29,7 +29,7 @@ index f8b415f381..02bde39d94 100644
glfs = glfs_find_preopened(gconf->volume);
if (glfs) {
@@ -463,9 +464,15 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
@@ -467,9 +468,15 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
}
}

View File

@@ -18,7 +18,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+)
diff --git a/block/rbd.c b/block/rbd.c
index 9c0fd0cb3f..101ee59d6e 100644
index 978671411e..a4749f3b1b 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -963,6 +963,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/block/gluster.c b/block/gluster.c
index 02bde39d94..36c00088cc 100644
index d0011085c4..2df3d6e35d 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -57,6 +57,7 @@ typedef struct GlusterAIOCB {
@@ -58,6 +58,7 @@ typedef struct GlusterAIOCB {
int ret;
Coroutine *coroutine;
AioContext *aio_context;
@@ -27,7 +27,7 @@ index 02bde39d94..36c00088cc 100644
} GlusterAIOCB;
typedef struct BDRVGlusterState {
@@ -749,8 +750,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
@@ -753,8 +754,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
acb->ret = 0; /* Success */
} else if (ret < 0) {
acb->ret = -errno; /* Read/Write failed */
@@ -39,7 +39,7 @@ index 02bde39d94..36c00088cc 100644
}
aio_co_schedule(acb->aio_context, acb->coroutine);
@@ -1019,6 +1022,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
@@ -1021,6 +1024,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
@@ -47,7 +47,7 @@ index 02bde39d94..36c00088cc 100644
ret = glfs_zerofill_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1199,9 +1203,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
@@ -1201,9 +1205,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
acb.aio_context = bdrv_get_aio_context(bs);
if (write) {
@@ -59,7 +59,7 @@ index 02bde39d94..36c00088cc 100644
ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
gluster_finish_aiocb, &acb);
}
@@ -1264,6 +1270,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
@@ -1266,6 +1272,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
@@ -67,7 +67,7 @@ index 02bde39d94..36c00088cc 100644
ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1312,6 +1319,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
@@ -1314,6 +1321,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/qemu-img.c b/qemu-img.c
index 7668f86769..2575e97b43 100644
index 27f48051b0..bb287d8538 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3075,7 +3075,8 @@ static int img_info(int argc, char **argv)
@@ -3062,7 +3062,8 @@ static int img_info(int argc, char **argv)
list = collect_image_info_list(image_opts, filename, fmt, chain,
force_share);
if (!list) {

View File

@@ -38,10 +38,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2 files changed, 133 insertions(+), 73 deletions(-)
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index c9dd70a892..048788b23d 100644
index 1b1dab5b17..d1616c045a 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -60,9 +60,9 @@ SRST
@@ -58,9 +58,9 @@ SRST
ERST
DEF("dd", img_dd,
@@ -54,10 +54,10 @@ index c9dd70a892..048788b23d 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 2575e97b43..8ec68b346f 100644
index bb287d8538..09c0340d16 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4993,10 +4993,12 @@ static int img_bitmap(int argc, char **argv)
@@ -4888,10 +4888,12 @@ static int img_bitmap(int argc, char **argv)
#define C_IF 04
#define C_OF 010
#define C_SKIP 020
@@ -70,7 +70,7 @@ index 2575e97b43..8ec68b346f 100644
};
struct DdIo {
@@ -5072,6 +5074,19 @@ static int img_dd_skip(const char *arg,
@@ -4967,6 +4969,19 @@ static int img_dd_skip(const char *arg,
return 0;
}
@@ -90,7 +90,7 @@ index 2575e97b43..8ec68b346f 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -5112,6 +5127,7 @@ static int img_dd(int argc, char **argv)
@@ -5007,6 +5022,7 @@ static int img_dd(int argc, char **argv)
{ "if", img_dd_if, C_IF },
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
@@ -98,7 +98,7 @@ index 2575e97b43..8ec68b346f 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5187,91 +5203,112 @@ static int img_dd(int argc, char **argv)
@@ -5082,91 +5098,112 @@ static int img_dd(int argc, char **argv)
arg = NULL;
}
@@ -275,7 +275,7 @@ index 2575e97b43..8ec68b346f 100644
}
if (dd.flags & C_SKIP && (in.offset > INT64_MAX / in.bsz ||
@@ -5288,20 +5325,43 @@ static int img_dd(int argc, char **argv)
@@ -5183,20 +5220,43 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
for (out_pos = 0; in_pos < size; ) {

View File

@@ -16,10 +16,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/qemu-img.c b/qemu-img.c
index 8ec68b346f..b98184bba1 100644
index 09c0340d16..556535d9d5 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4994,11 +4994,13 @@ static int img_bitmap(int argc, char **argv)
@@ -4889,11 +4889,13 @@ static int img_bitmap(int argc, char **argv)
#define C_OF 010
#define C_SKIP 020
#define C_OSIZE 040
@@ -33,7 +33,7 @@ index 8ec68b346f..b98184bba1 100644
};
struct DdIo {
@@ -5087,6 +5089,19 @@ static int img_dd_osize(const char *arg,
@@ -4982,6 +4984,19 @@ static int img_dd_osize(const char *arg,
return 0;
}
@@ -53,7 +53,7 @@ index 8ec68b346f..b98184bba1 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -5101,12 +5116,14 @@ static int img_dd(int argc, char **argv)
@@ -4996,12 +5011,14 @@ static int img_dd(int argc, char **argv)
int c, i;
const char *out_fmt = "raw";
const char *fmt = NULL;
@@ -69,7 +69,7 @@ index 8ec68b346f..b98184bba1 100644
};
struct DdIo in = {
.bsz = 512, /* Block size is by default 512 bytes */
@@ -5128,6 +5145,7 @@ static int img_dd(int argc, char **argv)
@@ -5023,6 +5040,7 @@ static int img_dd(int argc, char **argv)
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
{ "osize", img_dd_osize, C_OSIZE },
@@ -77,7 +77,7 @@ index 8ec68b346f..b98184bba1 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5324,9 +5342,10 @@ static int img_dd(int argc, char **argv)
@@ -5219,9 +5237,10 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
@@ -90,7 +90,7 @@ index 8ec68b346f..b98184bba1 100644
if (blk1) {
in_ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
if (in_ret == 0) {
@@ -5335,6 +5354,9 @@ static int img_dd(int argc, char **argv)
@@ -5230,6 +5249,9 @@ static int img_dd(int argc, char **argv)
} else {
in_ret = read(STDIN_FILENO, in.buf, bytes);
if (in_ret == 0) {

View File

@@ -13,10 +13,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
3 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 3653adb963..d83e8fb3c0 100644
index 15aeddc6d8..5e713e231d 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -212,6 +212,10 @@ Parameters to convert subcommand:
@@ -208,6 +208,10 @@ Parameters to convert subcommand:
Parameters to dd subcommand:
@@ -27,7 +27,7 @@ index 3653adb963..d83e8fb3c0 100644
.. program:: qemu-img-dd
.. option:: bs=BLOCK_SIZE
@@ -492,7 +496,7 @@ Command description:
@@ -488,7 +492,7 @@ Command description:
it doesn't need to be specified separately in this case.
@@ -36,7 +36,7 @@ index 3653adb963..d83e8fb3c0 100644
dd copies from *INPUT* file to *OUTPUT* file converting it from
*FMT* format to *OUTPUT_FMT* format.
@@ -503,6 +507,11 @@ Command description:
@@ -499,6 +503,11 @@ Command description:
The size syntax is similar to :manpage:`dd(1)`'s size syntax.
@@ -49,10 +49,10 @@ index 3653adb963..d83e8fb3c0 100644
Give information about the disk image *FILENAME*. Use it in
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 048788b23d..0b29a67a06 100644
index d1616c045a..b5b0bb4467 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -60,9 +60,9 @@ SRST
@@ -58,9 +58,9 @@ SRST
ERST
DEF("dd", img_dd,
@@ -65,10 +65,10 @@ index 048788b23d..0b29a67a06 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index b98184bba1..6fc8384f64 100644
index 556535d9d5..289c78febb 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -5118,7 +5118,7 @@ static int img_dd(int argc, char **argv)
@@ -5013,7 +5013,7 @@ static int img_dd(int argc, char **argv)
const char *fmt = NULL;
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
@@ -77,7 +77,7 @@ index b98184bba1..6fc8384f64 100644
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5156,7 +5156,7 @@ static int img_dd(int argc, char **argv)
@@ -5051,7 +5051,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
@@ -86,7 +86,7 @@ index b98184bba1..6fc8384f64 100644
if (c == EOF) {
break;
}
@@ -5176,6 +5176,9 @@ static int img_dd(int argc, char **argv)
@@ -5071,6 +5071,9 @@ static int img_dd(int argc, char **argv)
case 'h':
help();
break;
@@ -96,7 +96,7 @@ index b98184bba1..6fc8384f64 100644
case 'U':
force_share = true;
break;
@@ -5306,13 +5309,15 @@ static int img_dd(int argc, char **argv)
@@ -5201,13 +5204,15 @@ static int img_dd(int argc, char **argv)
size - in.bsz * in.offset, &error_abort);
}

View File

@@ -12,10 +12,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
3 files changed, 36 insertions(+), 7 deletions(-)
diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index d83e8fb3c0..61c6b21859 100644
index 5e713e231d..9390d5e5cf 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -496,10 +496,10 @@ Command description:
@@ -492,10 +492,10 @@ Command description:
it doesn't need to be specified separately in this case.
@@ -30,10 +30,10 @@ index d83e8fb3c0..61c6b21859 100644
The data is by default read and written using blocks of 512 bytes but can be
modified by specifying *BLOCK_SIZE*. If count=\ *BLOCKS* is specified
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 0b29a67a06..758f397232 100644
index b5b0bb4467..36f97e1f19 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -60,9 +60,9 @@ SRST
@@ -58,9 +58,9 @@ SRST
ERST
DEF("dd", img_dd,
@@ -46,10 +46,10 @@ index 0b29a67a06..758f397232 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 6fc8384f64..a6c88e0860 100644
index 289c78febb..da543d05cb 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -5110,6 +5110,7 @@ static int img_dd(int argc, char **argv)
@@ -5005,6 +5005,7 @@ static int img_dd(int argc, char **argv)
BlockDriver *drv = NULL, *proto_drv = NULL;
BlockBackend *blk1 = NULL, *blk2 = NULL;
QemuOpts *opts = NULL;
@@ -57,7 +57,7 @@ index 6fc8384f64..a6c88e0860 100644
QemuOptsList *create_opts = NULL;
Error *local_err = NULL;
bool image_opts = false;
@@ -5119,6 +5120,7 @@ static int img_dd(int argc, char **argv)
@@ -5014,6 +5015,7 @@ static int img_dd(int argc, char **argv)
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
bool force_share = false, skip_create = false;
@@ -65,7 +65,7 @@ index 6fc8384f64..a6c88e0860 100644
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5156,7 +5158,7 @@ static int img_dd(int argc, char **argv)
@@ -5051,7 +5053,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
@@ -74,7 +74,7 @@ index 6fc8384f64..a6c88e0860 100644
if (c == EOF) {
break;
}
@@ -5179,6 +5181,19 @@ static int img_dd(int argc, char **argv)
@@ -5074,6 +5076,19 @@ static int img_dd(int argc, char **argv)
case 'n':
skip_create = true;
break;
@@ -94,7 +94,7 @@ index 6fc8384f64..a6c88e0860 100644
case 'U':
force_share = true;
break;
@@ -5238,11 +5253,24 @@ static int img_dd(int argc, char **argv)
@@ -5133,11 +5148,24 @@ static int img_dd(int argc, char **argv)
if (dd.flags & C_IF) {
blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
force_share);
@@ -120,7 +120,7 @@ index 6fc8384f64..a6c88e0860 100644
}
if (dd.flags & C_OSIZE) {
@@ -5397,6 +5425,7 @@ static int img_dd(int argc, char **argv)
@@ -5292,6 +5320,7 @@ static int img_dd(int argc, char **argv)
out:
g_free(arg);
qemu_opts_del(opts);

View File

@@ -18,10 +18,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
4 files changed, 82 insertions(+), 4 deletions(-)
diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
index 8701f00cc7..3b4c5ef403 100644
index c3e55ef9e9..0e32e6201f 100644
--- a/hw/core/machine-hmp-cmds.c
+++ b/hw/core/machine-hmp-cmds.c
@@ -179,7 +179,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
@@ -169,7 +169,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
return;
}
@@ -59,10 +59,10 @@ index 8701f00cc7..3b4c5ef403 100644
qapi_free_BalloonInfo(info);
}
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 609e39a821..8cb6dfcac3 100644
index d004cf29d2..2660ed520b 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -781,8 +781,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
@@ -782,8 +782,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
{
VirtIOBalloon *dev = opaque;
@@ -103,10 +103,10 @@ index 609e39a821..8cb6dfcac3 100644
static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
diff --git a/qapi/machine.json b/qapi/machine.json
index d4317435e7..db8ed2e357 100644
index a08b6576ca..5c9a4d55f4 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1164,9 +1164,29 @@
@@ -1063,9 +1063,29 @@
# @actual: the logical size of the VM in bytes Formula used:
# logical_vm_size = vm_ram_size - balloon_size
#
@@ -138,10 +138,10 @@ index d4317435e7..db8ed2e357 100644
##
# @query-balloon:
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 59fbe74b8c..be8fa304c5 100644
index 7f810b0e97..325e684411 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -90,6 +90,7 @@
@@ -35,6 +35,7 @@
'member-name-exceptions': [ # visible in:
'ACPISlotType', # query-acpi-ospm-status
'AcpiTableOptions', # -acpitable

View File

@@ -13,10 +13,10 @@ Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 130217da8f..52a6d74820 100644
index 3860a50c3b..40821e2317 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -90,6 +90,12 @@ MachineInfoList *qmp_query_machines(bool has_compat_props, bool compat_props,
@@ -91,6 +91,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
info->numa_mem_supported = mc->numa_mem_supported;
info->deprecated = !!mc->deprecation_reason;
info->acpi = !!object_class_property_find(OBJECT_CLASS(mc), "acpi");
@@ -30,10 +30,10 @@ index 130217da8f..52a6d74820 100644
info->default_cpu_type = g_strdup(mc->default_cpu_type);
}
diff --git a/qapi/machine.json b/qapi/machine.json
index db8ed2e357..0c703316f5 100644
index 5c9a4d55f4..fbb61f18e4 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -168,6 +168,8 @@
@@ -139,6 +139,8 @@
#
# @is-default: whether the machine is default
#
@@ -42,7 +42,7 @@ index db8ed2e357..0c703316f5 100644
# @cpu-max: maximum number of CPUs supported by the machine type
# (since 1.5)
#
@@ -200,7 +202,7 @@
@@ -163,7 +165,7 @@
##
{ 'struct': 'MachineInfo',
'data': { 'name': 'str', '*alias': 'str',
@@ -50,4 +50,4 @@ index db8ed2e357..0c703316f5 100644
+ '*is-default': 'bool', '*is-current': 'bool', 'cpu-max': 'int',
'hotpluggable-cpus': 'bool', 'numa-mem-supported': 'bool',
'deprecated': 'bool', '*default-cpu-type': 'str',
'*default-ram-id': 'str', 'acpi': 'bool',
'*default-ram-id': 'str', 'acpi': 'bool' } }

View File

@@ -14,10 +14,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2 files changed, 7 insertions(+)
diff --git a/qapi/ui.json b/qapi/ui.json
index 8c8464faac..cebda37f8f 100644
index 006616aa77..dfd1d3e36b 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -312,11 +312,14 @@
@@ -317,11 +317,14 @@
#
# @channels: a list of @SpiceChannel for each active spice channel
#
@@ -33,7 +33,7 @@ index 8c8464faac..cebda37f8f 100644
'if': 'CONFIG_SPICE' }
diff --git a/ui/spice-core.c b/ui/spice-core.c
index ea20e6153c..55a15fba8b 100644
index b20c25aee0..26baeb7846 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -548,6 +548,10 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)

View File

@@ -14,21 +14,20 @@ Additionally, allows tracking the current position from the outside
(intended to be used for progress tracking).
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
migration/channel-savevm-async.c | 184 +++++++++++++++++++++++++++++++
migration/channel-savevm-async.c | 183 +++++++++++++++++++++++++++++++
migration/channel-savevm-async.h | 51 +++++++++
migration/meson.build | 1 +
3 files changed, 236 insertions(+)
3 files changed, 235 insertions(+)
create mode 100644 migration/channel-savevm-async.c
create mode 100644 migration/channel-savevm-async.h
diff --git a/migration/channel-savevm-async.c b/migration/channel-savevm-async.c
new file mode 100644
index 0000000000..081a192f49
index 0000000000..aab081ce07
--- /dev/null
+++ b/migration/channel-savevm-async.c
@@ -0,0 +1,184 @@
@@ -0,0 +1,183 @@
+/*
+ * QIO Channel implementation to be used by savevm-async QMP calls
+ */
@@ -175,9 +174,8 @@ index 0000000000..081a192f49
+
+static void
+qio_channel_savevm_async_set_aio_fd_handler(QIOChannel *ioc,
+ AioContext *read_ctx,
+ AioContext *ctx,
+ IOHandler *io_read,
+ AioContext *write_ctx,
+ IOHandler *io_write,
+ void *opaque)
+{
@@ -271,7 +269,7 @@ index 0000000000..17ae2cb261
+
+#endif /* QIO_CHANNEL_SAVEVM_ASYNC_H */
diff --git a/migration/meson.build b/migration/meson.build
index 5ce2acb41e..020127d901 100644
index 1ae28523a1..37ddcb5d60 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -13,6 +13,7 @@ system_ss.add(files(

View File

@@ -27,9 +27,7 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[FE: further improve aborting
adapt to removal of QEMUFileOps
improve condition for entering final stage
adapt to QAPI and other changes for 8.2
make sure to not call vm_start() from coroutine
stop CPU throttling after finishing]
adapt to QAPI and other changes for 8.0]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hmp-commands-info.hx | 13 +
@@ -37,20 +35,20 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
include/migration/snapshot.h | 2 +
include/monitor/hmp.h | 3 +
migration/meson.build | 1 +
migration/savevm-async.c | 549 +++++++++++++++++++++++++++++++++++
migration/savevm-async.c | 531 +++++++++++++++++++++++++++++++++++
monitor/hmp-cmds.c | 38 +++
qapi/migration.json | 34 +++
qapi/misc.json | 18 ++
qapi/misc.json | 16 ++
qemu-options.hx | 12 +
system/vl.c | 10 +
11 files changed, 697 insertions(+)
softmmu/vl.c | 10 +
11 files changed, 677 insertions(+)
create mode 100644 migration/savevm-async.c
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index c59cd6637b..d1a7b99add 100644
index f5b37eb74a..10fdd822e0 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -512,6 +512,19 @@ SRST
@@ -525,6 +525,19 @@ SRST
Show current migration parameters.
ERST
@@ -71,10 +69,10 @@ index c59cd6637b..d1a7b99add 100644
.name = "balloon",
.args_type = "",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 06746f0afc..0c7c6f2c16 100644
index 2cbd0f77a0..e352f86872 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1859,3 +1859,20 @@ SRST
@@ -1865,3 +1865,20 @@ SRST
List event channels in the guest
ERST
#endif
@@ -96,18 +94,18 @@ index 06746f0afc..0c7c6f2c16 100644
+ .coroutine = true,
+ },
diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
index 9e4dcaaa75..2581730d74 100644
index e72083b117..c846d37806 100644
--- a/include/migration/snapshot.h
+++ b/include/migration/snapshot.h
@@ -68,4 +68,6 @@ bool delete_snapshot(const char *name,
*/
void load_snapshot_resume(RunState state);
@@ -61,4 +61,6 @@ bool delete_snapshot(const char *name,
bool has_devices, strList *devices,
Error **errp);
+int load_snapshot_from_blockdev(const char *filename, Error **errp);
+
#endif
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index ae116d9804..2596cc2426 100644
index 13f9a2dedb..7a7def7530 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -28,6 +28,7 @@ void hmp_info_status(Monitor *mon, const QDict *qdict);
@@ -118,7 +116,7 @@ index ae116d9804..2596cc2426 100644
void hmp_info_migrate(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict);
@@ -92,6 +93,8 @@ void hmp_closefd(Monitor *mon, const QDict *qdict);
@@ -94,6 +95,8 @@ void hmp_closefd(Monitor *mon, const QDict *qdict);
void hmp_mouse_move(Monitor *mon, const QDict *qdict);
void hmp_mouse_button(Monitor *mon, const QDict *qdict);
void hmp_mouse_set(Monitor *mon, const QDict *qdict);
@@ -128,10 +126,10 @@ index ae116d9804..2596cc2426 100644
void coroutine_fn hmp_screendump(Monitor *mon, const QDict *qdict);
void hmp_chardev_add(Monitor *mon, const QDict *qdict);
diff --git a/migration/meson.build b/migration/meson.build
index 020127d901..4b0c4f0f51 100644
index 37ddcb5d60..07f6057acc 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -27,6 +27,7 @@ system_ss.add(files(
@@ -26,6 +26,7 @@ system_ss.add(files(
'options.c',
'postcopy-ram.c',
'savevm.c',
@@ -141,10 +139,10 @@ index 020127d901..4b0c4f0f51 100644
'threadinfo.c',
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
index 0000000000..4c90209188
index 0000000000..e9fc18fb10
--- /dev/null
+++ b/migration/savevm-async.c
@@ -0,0 +1,549 @@
@@ -0,0 +1,531 @@
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "migration/migration.h"
@@ -155,7 +153,6 @@ index 0000000000..4c90209188
+#include "migration/global_state.h"
+#include "migration/ram.h"
+#include "migration/qemu-file.h"
+#include "sysemu/cpu-throttle.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/runstate.h"
+#include "block/block.h"
@@ -167,7 +164,6 @@ index 0000000000..4c90209188
+#include "qapi/qapi-commands-misc.h"
+#include "qapi/qapi-commands-block.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/timer.h"
+#include "qemu/main-loop.h"
+#include "qemu/rcu.h"
@@ -293,7 +289,7 @@ index 0000000000..4c90209188
+ DPRINTF("save_snapshot_error: %s\n", msg);
+
+ if (!snap_state.error) {
+ error_setg(&snap_state.error, "%s", msg);
+ error_set(&snap_state.error, ERROR_CLASS_GENERIC_ERROR, "%s", msg);
+ }
+
+ g_free (msg);
@@ -304,6 +300,7 @@ index 0000000000..4c90209188
+static void process_savevm_finalize(void *opaque)
+{
+ int ret;
+ AioContext *iohandler_ctx = iohandler_get_aio_context();
+ MigrationState *ms = migrate_get_current();
+
+ bool aborted = savevm_aborted();
@@ -320,7 +317,9 @@ index 0000000000..4c90209188
+ * so move it back. It can stay in the main context and live out its live
+ * there, since we're done with it after this method ends anyway.
+ */
+ aio_context_acquire(iohandler_ctx);
+ blk_set_aio_context(snap_state.target, qemu_get_aio_context(), NULL);
+ aio_context_release(iohandler_ctx);
+
+ ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
+ if (ret < 0) {
@@ -345,12 +344,6 @@ index 0000000000..4c90209188
+ ret || aborted ? MIGRATION_STATUS_FAILED : MIGRATION_STATUS_COMPLETED);
+ ms->to_dst_file = NULL;
+
+ /*
+ * Same as in migration_iteration_finish(): saving RAM might've turned on CPU throttling for
+ * auto-converge, make sure to disable it.
+ */
+ cpu_throttle_stop();
+
+ qemu_savevm_state_cleanup();
+
+ ret = save_snapshot_cleanup();
@@ -403,12 +396,12 @@ index 0000000000..4c90209188
+ * lock. Similar to what is done in migration.c, call the exact variant
+ * only once pend_precopy in the estimate is below the threshold.
+ */
+ bql_unlock();
+ qemu_mutex_unlock_iothread();
+ qemu_savevm_state_pending_estimate(&pend_precopy, &pend_postcopy);
+ if (pend_precopy <= threshold) {
+ qemu_savevm_state_pending_exact(&pend_precopy, &pend_postcopy);
+ }
+ bql_lock();
+ qemu_mutex_lock_iothread();
+ pending_size = pend_precopy + pend_postcopy;
+
+ /*
@@ -448,25 +441,21 @@ index 0000000000..4c90209188
+ * so move there now and after every flush.
+ */
+ aio_co_reschedule_self(qemu_get_aio_context());
+ bdrv_graph_co_rdlock();
+ bs = bdrv_first(&it);
+ bdrv_graph_co_rdunlock();
+ while (bs) {
+ for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+ /* target has BDRV_O_NO_FLUSH, no sense calling bdrv_flush on it */
+ if (bs != blk_bs(snap_state.target)) {
+ AioContext *bs_ctx = bdrv_get_aio_context(bs);
+ if (bs_ctx != qemu_get_aio_context()) {
+ DPRINTF("savevm: async flushing drive %s\n", bs->filename);
+ aio_co_reschedule_self(bs_ctx);
+ bdrv_graph_co_rdlock();
+ bdrv_flush(bs);
+ bdrv_graph_co_rdunlock();
+ aio_co_reschedule_self(qemu_get_aio_context());
+ }
+ if (bs == blk_bs(snap_state.target)) {
+ continue;
+ }
+
+ AioContext *bs_ctx = bdrv_get_aio_context(bs);
+ if (bs_ctx != qemu_get_aio_context()) {
+ DPRINTF("savevm: async flushing drive %s\n", bs->filename);
+ aio_co_reschedule_self(bs_ctx);
+ bdrv_graph_co_rdlock();
+ bdrv_flush(bs);
+ bdrv_graph_co_rdunlock();
+ aio_co_reschedule_self(qemu_get_aio_context());
+ }
+ bdrv_graph_co_rdlock();
+ bs = bdrv_next(&it);
+ bdrv_graph_co_rdunlock();
+ }
+
+ DPRINTF("timing: async flushing took %ld ms\n",
@@ -480,17 +469,23 @@ index 0000000000..4c90209188
+ Error *local_err = NULL;
+ MigrationState *ms = migrate_get_current();
+ AioContext *iohandler_ctx = iohandler_get_aio_context();
+ int ret = 0;
+
+ int bdrv_oflags = BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_NO_FLUSH;
+
+ if (snap_state.state != SAVE_STATE_DONE) {
+ error_setg(errp, "VM snapshot already started\n");
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "VM snapshot already started\n");
+ return;
+ }
+
+ if (migration_is_running()) {
+ error_setg(errp, "There's a migration process in progress");
+ if (migration_is_running(ms->state)) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, QERR_MIGRATION_ACTIVE);
+ return;
+ }
+
+ if (migrate_block()) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "Block migration and snapshots are incompatible");
+ return;
+ }
+
@@ -522,7 +517,7 @@ index 0000000000..4c90209188
+ qdict_put_str(options, "driver", "raw");
+ snap_state.target = blk_new_open(statefile, NULL, options, bdrv_oflags, &local_err);
+ if (!snap_state.target) {
+ error_setg(errp, "failed to open '%s'", statefile);
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
+ goto restart;
+ }
+
@@ -531,7 +526,7 @@ index 0000000000..4c90209188
+ snap_state.file = qemu_file_new_output(ioc);
+
+ if (!snap_state.file) {
+ error_setg(errp, "failed to open '%s'", statefile);
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
+ goto restart;
+ }
+
@@ -540,10 +535,9 @@ index 0000000000..4c90209188
+ * State is cleared in process_savevm_co, but has to be initialized
+ * here (blocking main thread, from QMP) to avoid race conditions.
+ */
+ if (migrate_init(ms, errp)) {
+ return;
+ }
+ migrate_init(ms);
+ memset(&mig_stats, 0, sizeof(mig_stats));
+ memset(&compression_counters, 0, sizeof(compression_counters));
+ ms->to_dst_file = snap_state.file;
+
+ error_setg(&snap_state.blocker, "block device is in use by savevm");
@@ -551,25 +545,17 @@ index 0000000000..4c90209188
+
+ snap_state.state = SAVE_STATE_ACTIVE;
+ snap_state.finalize_bh = qemu_bh_new(process_savevm_finalize, &snap_state);
+ snap_state.co = qemu_coroutine_create(&process_savevm_co, NULL);
+ qemu_mutex_unlock_iothread();
+ qemu_savevm_state_header(snap_state.file);
+ ret = qemu_savevm_state_setup(snap_state.file, &local_err);
+ if (ret != 0) {
+ error_setg_errno(errp, -ret, "savevm state setup failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ return;
+ }
+ qemu_savevm_state_setup(snap_state.file);
+ qemu_mutex_lock_iothread();
+
+ /* Async processing from here on out happens in iohandler context, so let
+ * the target bdrv have its home there.
+ */
+ ret = blk_set_aio_context(snap_state.target, iohandler_ctx, &local_err);
+ if (ret != 0) {
+ warn_report("failed to set iohandler context for VM state target: %s %s",
+ local_err ? error_get_pretty(local_err) : "unknown error",
+ strerror(-ret));
+ }
+ blk_set_aio_context(snap_state.target, iohandler_ctx, &local_err);
+
+ snap_state.co = qemu_coroutine_create(&process_savevm_co, NULL);
+ aio_co_schedule(iohandler_ctx, snap_state.co);
+
+ return;
@@ -584,10 +570,29 @@ index 0000000000..4c90209188
+ }
+}
+
+static void coroutine_fn wait_for_close_co(void *opaque)
+void coroutine_fn qmp_savevm_end(Error **errp)
+{
+ int64_t timeout;
+
+ if (snap_state.state == SAVE_STATE_DONE) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+ "VM snapshot not started\n");
+ return;
+ }
+
+ if (snap_state.state == SAVE_STATE_ACTIVE) {
+ snap_state.state = SAVE_STATE_CANCELLED;
+ goto wait_for_close;
+ }
+
+ if (snap_state.saved_vm_running) {
+ vm_start();
+ snap_state.saved_vm_running = false;
+ }
+
+ snap_state.state = SAVE_STATE_DONE;
+
+wait_for_close:
+ if (!snap_state.target) {
+ DPRINTF("savevm-end: no target file open\n");
+ return;
@@ -615,31 +620,6 @@ index 0000000000..4c90209188
+ DPRINTF("savevm-end: cleanup done\n");
+}
+
+void qmp_savevm_end(Error **errp)
+{
+ if (snap_state.state == SAVE_STATE_DONE) {
+ error_setg(errp, "VM snapshot not started\n");
+ return;
+ }
+
+ Coroutine *wait_for_close = qemu_coroutine_create(wait_for_close_co, NULL);
+
+ if (snap_state.state == SAVE_STATE_ACTIVE) {
+ snap_state.state = SAVE_STATE_CANCELLED;
+ qemu_coroutine_enter(wait_for_close);
+ return;
+ }
+
+ if (snap_state.saved_vm_running) {
+ vm_start();
+ snap_state.saved_vm_running = false;
+ }
+
+ snap_state.state = SAVE_STATE_DONE;
+
+ qemu_coroutine_enter(wait_for_close);
+}
+
+int load_snapshot_from_blockdev(const char *filename, Error **errp)
+{
+ BlockBackend *be;
@@ -695,21 +675,21 @@ index 0000000000..4c90209188
+ return ret;
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index f601d06ab8..874084565f 100644
index 6c559b48c8..91be698308 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -24,6 +24,7 @@
@@ -22,6 +22,7 @@
#include "monitor/monitor-internal.h"
#include "qapi/error.h"
#include "qapi/qapi-commands-control.h"
#include "qapi/qapi-commands-machine.h"
+#include "qapi/qapi-commands-migration.h"
#include "qapi/qapi-commands-misc.h"
#include "qapi/qmp/qdict.h"
#include "qemu/cutils.h"
@@ -434,3 +435,40 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "dtb dumped to %s", filename);
#include "qapi/qmp/qerror.h"
@@ -443,3 +444,40 @@ void hmp_info_mtree(Monitor *mon, const QDict *qdict)
mtree_info(flatview, dispatch_tree, owner, disabled);
}
#endif
+
+void hmp_savevm_start(Monitor *mon, const QDict *qdict)
+{
@@ -748,10 +728,10 @@ index f601d06ab8..874084565f 100644
+ }
+}
diff --git a/qapi/migration.json b/qapi/migration.json
index 7324571e92..d6e94a7c41 100644
index 8843e74b59..aca0ca1ac1 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -276,6 +276,40 @@
@@ -291,6 +291,40 @@
'*dirty-limit-throttle-time-per-round': 'uint64',
'*dirty-limit-ring-full-time': 'uint64'} }
@@ -793,10 +773,10 @@ index 7324571e92..d6e94a7c41 100644
# @query-migrate:
#
diff --git a/qapi/misc.json b/qapi/misc.json
index 559b66f201..7959e89c1e 100644
index cda2effa81..94a58bb0bf 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -454,6 +454,24 @@
@@ -456,6 +456,22 @@
##
{ 'command': 'query-fdsets', 'returns': ['FdsetInfo'] }
@@ -805,8 +785,6 @@ index 559b66f201..7959e89c1e 100644
+#
+# Prepare for snapshot and halt VM. Save VM state to statefile.
+#
+# @statefile: target file that state should be written to.
+#
+##
+{ 'command': 'savevm-start', 'data': { '*statefile': 'str' } }
+
@@ -816,16 +794,16 @@ index 559b66f201..7959e89c1e 100644
+# Resume VM after a snapshot.
+#
+##
+{ 'command': 'savevm-end' }
+{ 'command': 'savevm-end', 'coroutine': true }
+
##
# @CommandLineParameterType:
#
diff --git a/qemu-options.hx b/qemu-options.hx
index d94e2cbbae..07730f9e65 100644
index b56f6b2fb2..c8c78c92d4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4805,6 +4805,18 @@ SRST
@@ -4479,6 +4479,18 @@ SRST
Start right away with a saved state (``loadvm`` in monitor)
ERST
@@ -844,10 +822,10 @@ index d94e2cbbae..07730f9e65 100644
#ifndef _WIN32
DEF("daemonize", 0, QEMU_OPTION_daemonize, \
"-daemonize daemonize QEMU after initializing\n", QEMU_ARCH_ALL)
diff --git a/system/vl.c b/system/vl.c
index 01b8b8e77a..d6bbdc906e 100644
--- a/system/vl.c
+++ b/system/vl.c
diff --git a/softmmu/vl.c b/softmmu/vl.c
index b0b96f67fa..f3251de3e7 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -164,6 +164,7 @@ static const char *accelerators;
static bool have_custom_ram_size;
static const char *ram_memdev_id;
@@ -856,10 +834,10 @@ index 01b8b8e77a..d6bbdc906e 100644
static QTAILQ_HEAD(, ObjectOption) object_opts = QTAILQ_HEAD_INITIALIZER(object_opts);
static QTAILQ_HEAD(, DeviceOption) device_opts = QTAILQ_HEAD_INITIALIZER(device_opts);
static int display_remote;
@@ -2727,6 +2728,12 @@ void qmp_x_exit_preconfig(Error **errp)
RunState state = autostart ? RUN_STATE_RUNNING : runstate_get();
@@ -2643,6 +2644,12 @@ void qmp_x_exit_preconfig(Error **errp)
if (loadvm) {
load_snapshot(loadvm, NULL, false, NULL, &error_fatal);
load_snapshot_resume(state);
+ } else if (loadstate) {
+ Error *local_err = NULL;
+ if (load_snapshot_from_blockdev(loadstate, &local_err) < 0) {
@@ -869,7 +847,7 @@ index 01b8b8e77a..d6bbdc906e 100644
}
if (replay_mode != REPLAY_MODE_NONE) {
replay_vmstate_init();
@@ -3275,6 +3282,9 @@ void qemu_init(int argc, char **argv)
@@ -3190,6 +3197,9 @@ void qemu_init(int argc, char **argv)
case QEMU_OPTION_loadvm:
loadvm = optarg;
break;

View File

@@ -13,18 +13,18 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to removal of QEMUFileOps]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/qemu-file.c | 48 +++++++++++++++++++++++++++-------------
migration/qemu-file.c | 49 +++++++++++++++++++++++++++-------------
migration/qemu-file.h | 2 ++
migration/savevm-async.c | 5 +++--
3 files changed, 38 insertions(+), 17 deletions(-)
migration/savevm-async.c | 5 ++--
3 files changed, 38 insertions(+), 18 deletions(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index b6d2f588bd..754dc0b3f7 100644
index 19c33c9985..e9ffff0f0a 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -34,8 +34,8 @@
#include "rdma.h"
#include "io/channel-file.h"
@@ -33,8 +33,8 @@
#include "options.h"
#include "qapi/error.h"
-#define IO_BUF_SIZE 32768
-#define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
@@ -32,8 +32,8 @@ index b6d2f588bd..754dc0b3f7 100644
+#define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 256)
struct QEMUFile {
QIOChannel *ioc;
@@ -43,7 +43,8 @@ struct QEMUFile {
const QEMUFileHooks *hooks;
@@ -46,7 +46,8 @@ struct QEMUFile {
int buf_index;
int buf_size; /* 0 when writing */
@@ -93,8 +93,8 @@ index b6d2f588bd..754dc0b3f7 100644
+ return qemu_file_new_impl(ioc, false, buffer_size);
}
/*
@@ -327,7 +342,7 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f)
void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
@@ -375,7 +390,7 @@ static ssize_t coroutine_mixed_fn qemu_fill_buffer(QEMUFile *f)
do {
len = qio_channel_read(f->ioc,
(char *)f->buf + pending,
@@ -103,17 +103,16 @@ index b6d2f588bd..754dc0b3f7 100644
&local_error);
if (len == QIO_CHANNEL_ERR_BLOCK) {
if (qemu_in_coroutine()) {
@@ -367,6 +382,9 @@ int qemu_fclose(QEMUFile *f)
ret = ret2;
@@ -425,6 +440,8 @@ int qemu_fclose(QEMUFile *f)
}
g_clear_pointer(&f->ioc, object_unref);
+
+ free(f->buf);
+
error_free(f->last_error_obj);
g_free(f);
trace_qemu_file_fclose();
@@ -415,7 +433,7 @@ static void add_buf_to_iovec(QEMUFile *f, size_t len)
/* If any error was spotted before closing, we should report it
* instead of the close() return value.
*/
@@ -479,7 +496,7 @@ static void add_buf_to_iovec(QEMUFile *f, size_t len)
{
if (!add_to_iovec(f, f->buf + f->buf_index, len, false)) {
f->buf_index += len;
@@ -122,7 +121,7 @@ index b6d2f588bd..754dc0b3f7 100644
qemu_fflush(f);
}
}
@@ -440,7 +458,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
@@ -504,7 +521,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
}
while (size > 0) {
@@ -131,7 +130,7 @@ index b6d2f588bd..754dc0b3f7 100644
if (l > size) {
l = size;
}
@@ -586,8 +604,8 @@ size_t coroutine_mixed_fn qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t si
@@ -549,8 +566,8 @@ size_t coroutine_mixed_fn qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t si
size_t index;
assert(!qemu_file_is_writable(f));
@@ -142,7 +141,7 @@ index b6d2f588bd..754dc0b3f7 100644
/* The 1st byte to read from */
index = f->buf_index + offset;
@@ -637,7 +655,7 @@ size_t coroutine_mixed_fn qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size
@@ -600,7 +617,7 @@ size_t coroutine_mixed_fn qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size
size_t res;
uint8_t *src;
@@ -151,7 +150,7 @@ index b6d2f588bd..754dc0b3f7 100644
if (res == 0) {
return done;
}
@@ -671,7 +689,7 @@ size_t coroutine_mixed_fn qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size
@@ -634,7 +651,7 @@ size_t coroutine_mixed_fn qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size
*/
size_t coroutine_mixed_fn qemu_get_buffer_in_place(QEMUFile *f, uint8_t **buf, size_t size)
{
@@ -160,7 +159,7 @@ index b6d2f588bd..754dc0b3f7 100644
size_t res;
uint8_t *src = NULL;
@@ -696,7 +714,7 @@ int coroutine_mixed_fn qemu_peek_byte(QEMUFile *f, int offset)
@@ -659,7 +676,7 @@ int coroutine_mixed_fn qemu_peek_byte(QEMUFile *f, int offset)
int index = f->buf_index + offset;
assert(!qemu_file_is_writable(f));
@@ -169,25 +168,34 @@ index b6d2f588bd..754dc0b3f7 100644
if (index >= f->buf_size) {
qemu_fill_buffer(f);
@@ -777,7 +794,7 @@ static int qemu_compress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
const uint8_t *p, size_t size)
{
- ssize_t blen = IO_BUF_SIZE - f->buf_index - sizeof(int32_t);
+ ssize_t blen = f->buf_allocated_size - f->buf_index - sizeof(int32_t);
if (blen < compressBound(size)) {
return -1;
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 11c2120edd..edf3c5d147 100644
index 47015f5201..1312b7c903 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -30,7 +30,9 @@
#include "io/channel.h"
@@ -63,7 +63,9 @@ typedef struct QEMUFileHooks {
} QEMUFileHooks;
QEMUFile *qemu_file_new_input(QIOChannel *ioc);
+QEMUFile *qemu_file_new_input_sized(QIOChannel *ioc, size_t buffer_size);
QEMUFile *qemu_file_new_output(QIOChannel *ioc);
+QEMUFile *qemu_file_new_output_sized(QIOChannel *ioc, size_t buffer_size);
void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks);
int qemu_fclose(QEMUFile *f);
/*
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 4c90209188..eb562d3dcf 100644
index e9fc18fb10..80624fada8 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -381,7 +381,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
@@ -378,7 +378,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
&snap_state.bs_pos));
@@ -195,8 +203,8 @@ index 4c90209188..eb562d3dcf 100644
+ snap_state.file = qemu_file_new_output_sized(ioc, 4 * 1024 * 1024);
if (!snap_state.file) {
error_setg(errp, "failed to open '%s'", statefile);
@@ -514,7 +514,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
@@ -496,7 +496,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
blk_op_block_all(be, blocker);
/* restore the VM state */

View File

@@ -4,34 +4,32 @@ Date: Mon, 6 Apr 2020 12:16:47 +0200
Subject: [PATCH] PVE: block: add the zeroinit block driver filter
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures
adhere to block graph lock requirements
use dedicated function to open file child]
[FE: adapt to changed function signatures]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 1 +
block/zeroinit.c | 207 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 208 insertions(+)
block/zeroinit.c | 200 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 201 insertions(+)
create mode 100644 block/zeroinit.c
diff --git a/block/meson.build b/block/meson.build
index f1262ec2ba..6a60b5d6b9 100644
index 529fc172c6..1833c71ce9 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -39,6 +39,7 @@ block_ss.add(files(
'throttle.c',
@@ -40,6 +40,7 @@ block_ss.add(files(
'throttle-groups.c',
'throttle.c',
'write-threshold.c',
+ 'zeroinit.c',
), zstd, zlib)
), zstd, zlib, gnutls)
system_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
diff --git a/block/zeroinit.c b/block/zeroinit.c
new file mode 100644
index 0000000000..2b2b194ccf
index 0000000000..1257342724
--- /dev/null
+++ b/block/zeroinit.c
@@ -0,0 +1,207 @@
@@ -0,0 +1,200 @@
+/*
+ * Filter to fake a zero-initialized block device.
+ *
@@ -46,7 +44,6 @@ index 0000000000..2b2b194ccf
+#include "qapi/error.h"
+#include "block/block_int.h"
+#include "block/block-io.h"
+#include "block/graph-lock.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/cutils.h"
@@ -112,9 +109,12 @@ index 0000000000..2b2b194ccf
+ }
+
+ /* Open the raw file */
+ ret = bdrv_open_file_child(qemu_opt_get(opts, "x-next"), options, "next",
+ bs, &local_err);
+ if (ret < 0) {
+ bs->file = bdrv_open_child(qemu_opt_get(opts, "x-next"), options, "next",
+ bs, &child_of_bds,
+ BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
+ false, &local_err);
+ if (local_err) {
+ ret = -EINVAL;
+ error_propagate(errp, local_err);
+ goto fail;
+ }
@@ -125,9 +125,7 @@ index 0000000000..2b2b194ccf
+ ret = 0;
+fail:
+ if (ret < 0) {
+ bdrv_graph_wrlock();
+ bdrv_unref_child(bs, bs->file);
+ bdrv_graph_wrunlock();
+ }
+ qemu_opts_del(opts);
+ return ret;
@@ -139,22 +137,19 @@ index 0000000000..2b2b194ccf
+ (void)s;
+}
+
+static coroutine_fn int64_t GRAPH_RDLOCK
+zeroinit_co_getlength(BlockDriverState *bs)
+static coroutine_fn int64_t zeroinit_co_getlength(BlockDriverState *bs)
+{
+ return bdrv_co_getlength(bs->file->bs);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+zeroinit_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+static int coroutine_fn zeroinit_co_preadv(BlockDriverState *bs,
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+zeroinit_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes,
+ BdrvRequestFlags flags)
+static int coroutine_fn zeroinit_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+ int64_t bytes, BdrvRequestFlags flags)
+{
+ BDRVZeroinitState *s = bs->opaque;
+ if (offset >= s->extents)
@@ -162,9 +157,8 @@ index 0000000000..2b2b194ccf
+ return bdrv_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+zeroinit_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+static int coroutine_fn zeroinit_co_pwritev(BlockDriverState *bs,
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVZeroinitState *s = bs->opaque;
+ int64_t extents = offset + bytes;
@@ -173,35 +167,32 @@ index 0000000000..2b2b194ccf
+ return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+}
+
+static coroutine_fn int GRAPH_RDLOCK
+zeroinit_co_flush(BlockDriverState *bs)
+static coroutine_fn int zeroinit_co_flush(BlockDriverState *bs)
+{
+ return bdrv_co_flush(bs->file->bs);
+}
+
+static int GRAPH_RDLOCK
+zeroinit_has_zero_init(BlockDriverState *bs)
+static int zeroinit_has_zero_init(BlockDriverState *bs)
+{
+ BDRVZeroinitState *s = bs->opaque;
+ return s->has_zero_init;
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+zeroinit_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes)
+static int coroutine_fn zeroinit_co_pdiscard(BlockDriverState *bs,
+ int64_t offset, int64_t bytes)
+{
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
+}
+
+static int GRAPH_RDLOCK
+zeroinit_co_truncate(BlockDriverState *bs, int64_t offset, _Bool exact,
+ PreallocMode prealloc, BdrvRequestFlags req_flags,
+ Error **errp)
+static int zeroinit_co_truncate(BlockDriverState *bs, int64_t offset,
+ _Bool exact, PreallocMode prealloc,
+ BdrvRequestFlags req_flags, Error **errp)
+{
+ return bdrv_co_truncate(bs->file, offset, exact, prealloc, req_flags, errp);
+}
+
+static coroutine_fn int GRAPH_RDLOCK
+zeroinit_co_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+static coroutine_fn int zeroinit_co_get_info(BlockDriverState *bs,
+ BlockDriverInfo *bdi)
+{
+ return bdrv_co_get_info(bs->file->bs, bdi);
+}
@@ -212,7 +203,7 @@ index 0000000000..2b2b194ccf
+ .instance_size = sizeof(BDRVZeroinitState),
+
+ .bdrv_parse_filename = zeroinit_parse_filename,
+ .bdrv_open = zeroinit_open,
+ .bdrv_file_open = zeroinit_open,
+ .bdrv_close = zeroinit_close,
+ .bdrv_co_getlength = zeroinit_co_getlength,
+ .bdrv_child_perm = bdrv_default_perms,

View File

@@ -10,14 +10,14 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
qemu-options.hx | 3 +++
system/vl.c | 8 ++++++++
softmmu/vl.c | 8 ++++++++
2 files changed, 11 insertions(+)
diff --git a/qemu-options.hx b/qemu-options.hx
index 07730f9e65..7fdc944965 100644
index c8c78c92d4..20ca2cdba7 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1239,6 +1239,9 @@ legacy PC, they are not recommended for modern configurations.
@@ -1197,6 +1197,9 @@ legacy PC, they are not recommended for modern configurations.
ERST
@@ -27,11 +27,11 @@ index 07730f9e65..7fdc944965 100644
DEF("fda", HAS_ARG, QEMU_OPTION_fda,
"-fda/-fdb file use 'file' as floppy disk 0/1 image\n", QEMU_ARCH_ALL)
DEF("fdb", HAS_ARG, QEMU_OPTION_fdb, "", QEMU_ARCH_ALL)
diff --git a/system/vl.c b/system/vl.c
index d6bbdc906e..200468a753 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -2764,6 +2764,7 @@ void qemu_init(int argc, char **argv)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index f3251de3e7..1b63ffd33d 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2679,6 +2679,7 @@ void qemu_init(int argc, char **argv)
MachineClass *machine_class;
bool userconfig = true;
FILE *vmstate_dump_file = NULL;
@@ -39,7 +39,7 @@ index d6bbdc906e..200468a753 100644
qemu_add_opts(&qemu_drive_opts);
qemu_add_drive_opts(&qemu_legacy_drive_opts);
@@ -3387,6 +3388,13 @@ void qemu_init(int argc, char **argv)
@@ -3302,6 +3303,13 @@ void qemu_init(int argc, char **argv)
machine_parse_property_opt(qemu_find_opts("smp-opts"),
"smp", optarg);
break;
@@ -50,6 +50,6 @@ index d6bbdc906e..200468a753 100644
+ exit(1);
+ }
+ break;
#ifdef CONFIG_VNC
case QEMU_OPTION_vnc:
vnc_parse(optarg);
break;

View File

@@ -11,10 +11,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index c13cdd7994..fd5808cdc0 100644
index 4a34f03047..59b917e50c 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -263,6 +263,15 @@ static void apic_reset_common(DeviceState *dev)
@@ -252,6 +252,15 @@ static void apic_reset_common(DeviceState *dev)
info->vapic_base_update(s);
apic_init_reset(dev);

View File

@@ -9,14 +9,14 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/file-posix.c | 59 ++++++++++++++++++++++++++++++--------------
qapi/block-core.json | 7 +++++-
2 files changed, 46 insertions(+), 20 deletions(-)
qapi/block-core.json | 3 ++-
2 files changed, 42 insertions(+), 20 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 99e5bea1cc..6a4f6a25e6 100644
index 0db366a851..46f1ee38ae 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2884,6 +2884,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2870,6 +2870,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
int fd;
uint64_t perm, shared;
int result = 0;
@@ -24,7 +24,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
/* Validate options and set default values */
assert(options->driver == BLOCKDEV_DRIVER_FILE);
@@ -2924,19 +2925,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2910,19 +2911,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
@@ -59,7 +59,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
}
/* Clear the file by truncating it to 0 */
@@ -2990,13 +2994,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2976,13 +2980,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
}
out_unlock:
@@ -82,7 +82,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
}
out_close:
@@ -3020,6 +3026,7 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -3006,6 +3012,7 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
@@ -90,7 +90,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
/* Skip file: protocol prefix */
strstart(filename, "file:", &filename);
@@ -3042,6 +3049,18 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -3028,6 +3035,18 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
return -EINVAL;
}
@@ -109,7 +109,7 @@ index 99e5bea1cc..6a4f6a25e6 100644
options = (BlockdevCreateOptions) {
.driver = BLOCKDEV_DRIVER_FILE,
.u.file = {
@@ -3053,6 +3072,8 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -3039,6 +3058,8 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
.nocow = nocow,
.has_extent_size_hint = has_extent_size_hint,
.extent_size_hint = extent_size_hint,
@@ -119,21 +119,10 @@ index 99e5bea1cc..6a4f6a25e6 100644
};
return raw_co_create(&options, errp);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index c2a337cc04..1cb6f04db3 100644
index 903392cb8f..125aa89858 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4959,6 +4959,10 @@
# @extent-size-hint: Extent size hint to add to the image file; 0 for
# not adding an extent size hint (default: 1 MB, since 5.1)
#
+# @locking: whether to enable file locking. If set to 'auto', only
+# enable when Open File Descriptor (OFD) locking API is available
+# (default: auto).
+#
# Since: 2.12
##
{ 'struct': 'BlockdevCreateOptionsFile',
@@ -4966,7 +4970,8 @@
@@ -4876,7 +4876,8 @@
'size': 'size',
'*preallocation': 'PreallocMode',
'*nocow': 'bool',

View File

@@ -18,10 +18,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/monitor/qmp.c b/monitor/qmp.c
index eb181d5979..20fc0d20a6 100644
index c15bf1e1fc..04fe25c62c 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -534,8 +534,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
@@ -553,8 +553,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
qemu_chr_fe_set_echo(&mon->common.chr, true);
/* Note: we run QMP monitor in I/O thread when @chr supports that */

View File

@@ -26,10 +26,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 27dcda0248..7a13e9f014 100644
index f0d35c6401..1427983543 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -173,7 +173,8 @@ GlobalProperty hw_compat_4_0[] = {
@@ -148,7 +148,8 @@ GlobalProperty hw_compat_4_0[] = {
{ "virtio-vga", "edid", "false" },
{ "virtio-gpu-device", "edid", "false" },
{ "virtio-device", "use-started", "false" },

View File

@@ -16,15 +16,15 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/core/machine-qmp-cmds.c | 5 +++++
include/hw/boards.h | 2 ++
qapi/machine.json | 3 +++
system/vl.c | 24 ++++++++++++++++++++++++
4 files changed, 34 insertions(+)
qapi/machine.json | 4 +++-
softmmu/vl.c | 25 +++++++++++++++++++++++++
4 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 52a6d74820..362128842d 100644
index 40821e2317..ee93ddd69a 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -94,6 +94,11 @@ MachineInfoList *qmp_query_machines(bool has_compat_props, bool compat_props,
@@ -95,6 +95,11 @@ MachineInfoList *qmp_query_machines(Error **errp)
if (strcmp(mc->name, MACHINE_GET_CLASS(current_machine)->name) == 0) {
info->has_is_current = true;
info->is_current = true;
@@ -37,10 +37,10 @@ index 52a6d74820..362128842d 100644
if (mc->default_cpu_type) {
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 48ff6d8b93..5cddeb7fcb 100644
index ed83360198..f8b88cd86a 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -252,6 +252,8 @@ struct MachineClass {
@@ -235,6 +235,8 @@ struct MachineClass {
const char *desc;
const char *deprecation_reason;
@@ -50,51 +50,52 @@ index 48ff6d8b93..5cddeb7fcb 100644
void (*reset)(MachineState *state, ShutdownCause reason);
void (*wakeup)(MachineState *state);
diff --git a/qapi/machine.json b/qapi/machine.json
index 0c703316f5..dc46a3e93f 100644
index fbb61f18e4..7da3c519ba 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -190,6 +190,8 @@
@@ -161,6 +161,8 @@
#
# @acpi: machine type supports ACPI (since 8.0)
#
+# @pve-version: custom PVE version suffix specified as 'machine+pveN'
+#
# @compat-props: The machine type's compatibility properties. Only
# present when query-machines argument @compat-props is true.
# (since 9.1)
@@ -206,6 +208,7 @@
# Since: 1.2
##
{ 'struct': 'MachineInfo',
@@ -168,7 +170,7 @@
'*is-default': 'bool', '*is-current': 'bool', 'cpu-max': 'int',
'hotpluggable-cpus': 'bool', 'numa-mem-supported': 'bool',
'deprecated': 'bool', '*default-cpu-type': 'str',
'*default-ram-id': 'str', 'acpi': 'bool',
+ '*pve-version': 'str',
'*compat-props': { 'type': ['CompatProperty'],
'features': ['unstable'] } } }
- '*default-ram-id': 'str', 'acpi': 'bool' } }
+ '*default-ram-id': 'str', 'acpi': 'bool', '*pve-version': 'str' } }
diff --git a/system/vl.c b/system/vl.c
index 200468a753..0dbdba6421 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -1675,6 +1675,7 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
##
# @query-machines:
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 1b63ffd33d..20ba2c5c87 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -1597,6 +1597,7 @@ static const QEMUOption *lookup_opt(int argc, char **argv,
static MachineClass *select_machine(QDict *qdict, Error **errp)
{
ERRP_GUARD();
const char *machine_type = qdict_get_try_str(qdict, "type");
const char *optarg = qdict_get_try_str(qdict, "type");
+ const char *pvever = qdict_get_try_str(qdict, "pvever");
g_autoptr(GSList) machines = object_class_get_list(TYPE_MACHINE, false);
MachineClass *machine_class = NULL;
GSList *machines = object_class_get_list(TYPE_MACHINE, false);
MachineClass *machine_class;
Error *local_err = NULL;
@@ -1614,6 +1615,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
}
}
@@ -1694,7 +1695,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
if (!machine_class) {
error_append_hint(errp,
"Use -machine help to list supported machines\n");
+ } else {
+ if (machine_class) {
+ machine_class->pve_version = g_strdup(pvever);
+ qdict_del(qdict, "pvever");
}
+ }
+
return machine_class;
}
@@ -3329,12 +3334,31 @@ void qemu_init(int argc, char **argv)
g_slist_free(machines);
if (local_err) {
error_append_hint(&local_err, "Use -machine help to list supported machines\n");
@@ -3244,12 +3250,31 @@ void qemu_init(int argc, char **argv)
case QEMU_OPTION_machine:
{
bool help;

View File

@@ -25,7 +25,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 3dd2e229d2..eba5b11493 100644
index db3791f4d1..39410dcf8d 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -237,8 +237,8 @@ static void backup_init_bcs_bitmap(BackupBlockJob *job)
@@ -48,9 +48,9 @@ index 3dd2e229d2..eba5b11493 100644
if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
int64_t offset = 0;
int64_t count;
@@ -502,6 +500,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
@@ -495,6 +493,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
&error_abort);
bdrv_graph_wrunlock();
+ backup_init_bcs_bitmap(job);
+

View File

@@ -15,23 +15,23 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 2 +
meson.build | 5 +
vma-reader.c | 870 ++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 817 +++++++++++++++++++++++++++++++++++++++++
vma.c | 901 ++++++++++++++++++++++++++++++++++++++++++++++
vma-reader.c | 867 ++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 818 +++++++++++++++++++++++++++++++++++++++++
vma.c | 900 ++++++++++++++++++++++++++++++++++++++++++++++
vma.h | 150 ++++++++
6 files changed, 2745 insertions(+)
6 files changed, 2742 insertions(+)
create mode 100644 vma-reader.c
create mode 100644 vma-writer.c
create mode 100644 vma.c
create mode 100644 vma.h
diff --git a/block/meson.build b/block/meson.build
index 6a60b5d6b9..652c8cbdb7 100644
index 1833c71ce9..59b71ba9f3 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -42,6 +42,8 @@ block_ss.add(files(
@@ -43,6 +43,8 @@ block_ss.add(files(
'zeroinit.c',
), zstd, zlib)
), zstd, zlib, gnutls)
+block_ss.add(files('../vma-writer.c'), libuuid)
+
@@ -39,10 +39,10 @@ index 6a60b5d6b9..652c8cbdb7 100644
system_ss.add(files('block-ram-registrar.c'))
diff --git a/meson.build b/meson.build
index aa7ea85d0b..7eee5b4249 100644
index a9c4f28247..cd95530d3b 100644
--- a/meson.build
+++ b/meson.build
@@ -2012,6 +2012,8 @@ endif
@@ -1778,6 +1778,8 @@ endif
has_gettid = cc.has_function('gettid')
@@ -51,22 +51,22 @@ index aa7ea85d0b..7eee5b4249 100644
# libselinux
selinux = dependency('libselinux',
required: get_option('selinux'),
@@ -4097,6 +4099,9 @@ if have_tools
dependencies: [blockdev, qemuutil, selinux],
@@ -3908,6 +3910,9 @@ if have_tools
dependencies: [blockdev, qemuutil, gnutls, selinux],
install: true)
+ vma = executable('vma', files('vma.c', 'vma-reader.c') + genh,
+ dependencies: [authz, block, crypto, io, qemuutil, qom], install: true)
+ dependencies: [authz, block, crypto, io, qom], install: true)
+
subdir('storage-daemon')
foreach exe: [ 'qemu-img', 'qemu-io', 'qemu-nbd', 'qemu-storage-daemon']
subdir('contrib/rdmacm-mux')
subdir('contrib/elf2dmp')
diff --git a/vma-reader.c b/vma-reader.c
new file mode 100644
index 0000000000..d0b6721812
index 0000000000..81a891c6b1
--- /dev/null
+++ b/vma-reader.c
@@ -0,0 +1,870 @@
@@ -0,0 +1,867 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -88,7 +88,6 @@ index 0000000000..d0b6721812
+#include "qemu/ratelimit.h"
+#include "vma.h"
+#include "block/block.h"
+#include "block/graph-lock.h"
+#include "sysemu/block-backend.h"
+
+static unsigned char zero_vma_block[VMA_BLOCK_SIZE];
@@ -601,10 +600,8 @@ index 0000000000..d0b6721812
+ } else {
+ int res = blk_pwrite(target, sector_num * BDRV_SECTOR_SIZE, nb_sectors * BDRV_SECTOR_SIZE, buf, 0);
+ if (res < 0) {
+ bdrv_graph_rdlock_main_loop();
+ error_setg(errp, "blk_pwrite to %s failed (%d)",
+ bdrv_get_device_name(blk_bs(target)), res);
+ bdrv_graph_rdunlock_main_loop();
+ return -1;
+ }
+ }
@@ -939,10 +936,10 @@ index 0000000000..d0b6721812
+
diff --git a/vma-writer.c b/vma-writer.c
new file mode 100644
index 0000000000..a466652a5d
index 0000000000..126b296647
--- /dev/null
+++ b/vma-writer.c
@@ -0,0 +1,817 @@
@@ -0,0 +1,818 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -1517,16 +1514,17 @@ index 0000000000..a466652a5d
+ int i;
+
+ g_assert(vmaw != NULL);
+ g_assert(status != NULL);
+
+ status->status = vmaw->status;
+ g_strlcpy(status->errmsg, vmaw->errmsg, sizeof(status->errmsg));
+ for (i = 0; i <= 255; i++) {
+ status->stream_info[i] = vmaw->stream_info[i];
+ if (status) {
+ status->status = vmaw->status;
+ g_strlcpy(status->errmsg, vmaw->errmsg, sizeof(status->errmsg));
+ for (i = 0; i <= 255; i++) {
+ status->stream_info[i] = vmaw->stream_info[i];
+ }
+
+ uuid_unparse_lower(vmaw->uuid, status->uuid_str);
+ }
+
+ uuid_unparse_lower(vmaw->uuid, status->uuid_str);
+
+ status->closed = vmaw->closed;
+
+ return vmaw->status;
@@ -1762,10 +1760,10 @@ index 0000000000..a466652a5d
+}
diff --git a/vma.c b/vma.c
new file mode 100644
index 0000000000..bb715e9061
index 0000000000..347f6283ca
--- /dev/null
+++ b/vma.c
@@ -0,0 +1,901 @@
@@ -0,0 +1,900 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -2078,17 +2076,17 @@ index 0000000000..bb715e9061
+ inbuf);
+ }
+
+ RestoreMap *restore_map = g_new0(RestoreMap, 1);
+ restore_map->devname = g_strdup(devname);
+ restore_map->path = g_strdup(path);
+ restore_map->format = format;
+ restore_map->throttling_bps = bps_value;
+ restore_map->throttling_group = group;
+ restore_map->cache = cache;
+ restore_map->write_zero = write_zero;
+ restore_map->skip = skip;
+ RestoreMap *map = g_new0(RestoreMap, 1);
+ map->devname = g_strdup(devname);
+ map->path = g_strdup(path);
+ map->format = format;
+ map->throttling_bps = bps_value;
+ map->throttling_group = group;
+ map->cache = cache;
+ map->write_zero = write_zero;
+ map->skip = skip;
+
+ g_hash_table_insert(devmap, restore_map->devname, restore_map);
+ g_hash_table_insert(devmap, map->devname, map);
+
+ };
+ }
@@ -2387,7 +2385,7 @@ index 0000000000..bb715e9061
+
+static int create_archive(int argc, char **argv)
+{
+ int c;
+ int i, c;
+ int verbose = 0;
+ const char *archivename;
+ GList *backup_coroutines = NULL;
@@ -2545,7 +2543,6 @@ index 0000000000..bb715e9061
+ vma_writer_get_status(vmaw, &vmastat);
+
+ if (verbose) {
+ int i;
+ for (i = 0; i < 256; i++) {
+ VmaStreamInfo *si = &vmastat.stream_info[i];
+ if (si->size) {

View File

@@ -12,20 +12,20 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to coroutine changes]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/backup-dump.c | 172 +++++++++++++++++++++++++++++++
block/backup-dump.c | 168 +++++++++++++++++++++++++++++++
block/backup.c | 30 ++----
block/meson.build | 1 +
include/block/block_int-common.h | 35 +++++++
job.c | 3 +-
5 files changed, 218 insertions(+), 23 deletions(-)
5 files changed, 214 insertions(+), 23 deletions(-)
create mode 100644 block/backup-dump.c
diff --git a/block/backup-dump.c b/block/backup-dump.c
new file mode 100644
index 0000000000..e46abf1070
index 0000000000..232a094426
--- /dev/null
+++ b/block/backup-dump.c
@@ -0,0 +1,172 @@
@@ -0,0 +1,168 @@
+/*
+ * BlockDriver to send backup data stream to a callback function
+ *
@@ -37,8 +37,6 @@ index 0000000000..e46abf1070
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/qmp/qdict.h"
+#include "qom/object_interfaces.h"
+#include "block/block_int.h"
+
@@ -171,7 +169,7 @@ index 0000000000..e46abf1070
+block_init(bdrv_backup_dump_init);
+
+
+BlockDriverState *coroutine_fn bdrv_co_backup_dump_create(
+BlockDriverState *bdrv_backup_dump_create(
+ int dump_cb_block_size,
+ uint64_t byte_size,
+ BackupDumpFunc *dump_cb,
@@ -179,11 +177,9 @@ index 0000000000..e46abf1070
+ Error **errp)
+{
+ BDRVBackupDumpState *state;
+ BlockDriverState *bs = bdrv_new_open_driver(
+ &bdrv_backup_dump_drive, NULL, BDRV_O_RDWR, errp);
+
+ QDict *options = qdict_new();
+ qdict_put_str(options, "driver", "backup-dump-drive");
+
+ BlockDriverState *bs = bdrv_co_open(NULL, NULL, options, BDRV_O_RDWR, errp);
+ if (!bs) {
+ return NULL;
+ }
@@ -199,7 +195,7 @@ index 0000000000..e46abf1070
+ return bs;
+}
diff --git a/block/backup.c b/block/backup.c
index eba5b11493..1963e47ab9 100644
index 39410dcf8d..af87fa6aa9 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -29,28 +29,6 @@
@@ -231,7 +227,7 @@ index eba5b11493..1963e47ab9 100644
static const BlockJobDriver backup_job_driver;
static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
@@ -462,6 +440,14 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
@@ -457,6 +435,14 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
cluster_size = block_copy_cluster_size(bcs);
@@ -247,7 +243,7 @@ index eba5b11493..1963e47ab9 100644
if (perf->max_chunk && perf->max_chunk < cluster_size) {
error_setg(errp, "Required max-chunk (%" PRIi64 ") is less than backup "
diff --git a/block/meson.build b/block/meson.build
index 652c8cbdb7..e1cf5a2e65 100644
index 59b71ba9f3..6fde9f7dcd 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -4,6 +4,7 @@ block_ss.add(files(
@@ -255,11 +251,11 @@ index 652c8cbdb7..e1cf5a2e65 100644
'amend.c',
'backup.c',
+ 'backup-dump.c',
'copy-before-write.c',
'blkdebug.c',
'blklogwrites.c',
'blkverify.c',
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index ebb4e56a50..e717a74e5f 100644
index 74195c3004..0f2e1817ad 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -26,6 +26,7 @@
@@ -276,7 +272,7 @@ index ebb4e56a50..e717a74e5f 100644
+typedef int BackupDumpFunc(void *opaque, uint64_t offset, uint64_t bytes, const void *buf);
+
+BlockDriverState *coroutine_fn bdrv_co_backup_dump_create(
+BlockDriverState *bdrv_backup_dump_create(
+ int dump_cb_block_size,
+ uint64_t byte_size,
+ BackupDumpFunc *dump_cb,
@@ -312,10 +308,10 @@ index ebb4e56a50..e717a74e5f 100644
BDRV_TRACKED_READ,
BDRV_TRACKED_WRITE,
diff --git a/job.c b/job.c
index 660ce22c56..baf54c8d60 100644
index 72d57f0934..93e22d180b 100644
--- a/job.c
+++ b/job.c
@@ -331,7 +331,8 @@ static bool job_started_locked(Job *job)
@@ -330,7 +330,8 @@ static bool job_started_locked(Job *job)
}
/* Called with job_mutex held. */

View File

@@ -11,10 +11,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 46 insertions(+)
diff --git a/include/qemu/job.h b/include/qemu/job.h
index 2b873f2576..528cd6acb9 100644
index e502787dd8..963cf2bef5 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -362,6 +362,18 @@ void job_unlock(void);
@@ -381,6 +381,18 @@ void job_unlock(void);
*/
JobTxn *job_txn_new(void);
@@ -34,10 +34,10 @@ index 2b873f2576..528cd6acb9 100644
* Release a reference that was previously acquired with job_txn_add_job or
* job_txn_new. If it's the last reference to the object, it will be freed.
diff --git a/job.c b/job.c
index baf54c8d60..3ac5e5cde2 100644
index 93e22d180b..2b31f1e14f 100644
--- a/job.c
+++ b/job.c
@@ -94,6 +94,8 @@ struct JobTxn {
@@ -93,6 +93,8 @@ struct JobTxn {
/* Reference count */
int refcnt;
@@ -46,7 +46,7 @@ index baf54c8d60..3ac5e5cde2 100644
};
void job_lock(void)
@@ -119,6 +121,25 @@ JobTxn *job_txn_new(void)
@@ -118,6 +120,25 @@ JobTxn *job_txn_new(void)
return txn;
}
@@ -72,7 +72,7 @@ index baf54c8d60..3ac5e5cde2 100644
/* Called with job_mutex held. */
static void job_txn_ref_locked(JobTxn *txn)
{
@@ -1042,6 +1063,12 @@ static void job_completed_txn_success_locked(Job *job)
@@ -1057,6 +1078,12 @@ static void job_completed_txn_success_locked(Job *job)
*/
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
if (!job_is_completed_locked(other_job)) {
@@ -85,7 +85,7 @@ index baf54c8d60..3ac5e5cde2 100644
return;
}
assert(other_job->ret == 0);
@@ -1253,6 +1280,13 @@ int job_finish_sync_locked(Job *job,
@@ -1268,6 +1295,13 @@ int job_finish_sync_locked(Job *job,
return -EBUSY;
}

View File

@@ -84,31 +84,69 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
create jobs in a drained section]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 5 +
block/monitor/block-hmp-cmds.c | 39 ++
blockdev.c | 1 +
hmp-commands-info.hx | 14 +
hmp-commands.hx | 29 +
include/monitor/hmp.h | 3 +
meson.build | 1 +
monitor/hmp-cmds.c | 72 +++
proxmox-backup-client.c | 146 +++++
proxmox-backup-client.h | 60 ++
pve-backup.c | 1092 ++++++++++++++++++++++++++++++++
qapi/block-core.json | 233 +++++++
qapi/common.json | 14 +
qapi/machine.json | 16 +-
14 files changed, 1711 insertions(+), 14 deletions(-)
block/backup-dump.c | 10 +-
block/meson.build | 5 +
block/monitor/block-hmp-cmds.c | 39 ++
blockdev.c | 1 +
hmp-commands-info.hx | 14 +
hmp-commands.hx | 29 +
include/block/block_int-common.h | 2 +-
include/monitor/hmp.h | 3 +
meson.build | 1 +
monitor/hmp-cmds.c | 72 ++
proxmox-backup-client.c | 146 ++++
proxmox-backup-client.h | 60 ++
pve-backup.c | 1067 ++++++++++++++++++++++++++++++
qapi/block-core.json | 229 +++++++
qapi/common.json | 14 +
qapi/machine.json | 16 +-
16 files changed, 1690 insertions(+), 18 deletions(-)
create mode 100644 proxmox-backup-client.c
create mode 100644 proxmox-backup-client.h
create mode 100644 pve-backup.c
diff --git a/block/backup-dump.c b/block/backup-dump.c
index 232a094426..e46abf1070 100644
--- a/block/backup-dump.c
+++ b/block/backup-dump.c
@@ -9,6 +9,8 @@
*/
#include "qemu/osdep.h"
+
+#include "qapi/qmp/qdict.h"
#include "qom/object_interfaces.h"
#include "block/block_int.h"
@@ -141,7 +143,7 @@ static void bdrv_backup_dump_init(void)
block_init(bdrv_backup_dump_init);
-BlockDriverState *bdrv_backup_dump_create(
+BlockDriverState *coroutine_fn bdrv_co_backup_dump_create(
int dump_cb_block_size,
uint64_t byte_size,
BackupDumpFunc *dump_cb,
@@ -149,9 +151,11 @@ BlockDriverState *bdrv_backup_dump_create(
Error **errp)
{
BDRVBackupDumpState *state;
- BlockDriverState *bs = bdrv_new_open_driver(
- &bdrv_backup_dump_drive, NULL, BDRV_O_RDWR, errp);
+ QDict *options = qdict_new();
+ qdict_put_str(options, "driver", "backup-dump-drive");
+
+ BlockDriverState *bs = bdrv_co_open(NULL, NULL, options, BDRV_O_RDWR, errp);
if (!bs) {
return NULL;
}
diff --git a/block/meson.build b/block/meson.build
index e1cf5a2e65..2367e1ac1b 100644
index 6fde9f7dcd..6d468f89e5 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -44,6 +44,11 @@ block_ss.add(files(
), zstd, zlib)
@@ -45,6 +45,11 @@ block_ss.add(files(
), zstd, zlib, gnutls)
block_ss.add(files('../vma-writer.c'), libuuid)
+block_ss.add(files(
@@ -120,10 +158,10 @@ index e1cf5a2e65..2367e1ac1b 100644
system_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
system_ss.add(files('block-ram-registrar.c'))
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index bdf2eb50b6..439a7a14c8 100644
index ca2599de44..6efe28cef5 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1009,3 +1009,42 @@ void hmp_change_medium(Monitor *mon, const char *device, const char *target,
@@ -1029,3 +1029,42 @@ void hmp_change_medium(Monitor *mon, const char *device, const char *target,
qmp_blockdev_change_medium(device, NULL, target, arg, true, force,
!!read_only, read_only_mode, errp);
}
@@ -167,7 +205,7 @@ index bdf2eb50b6..439a7a14c8 100644
+ hmp_handle_error(mon, error);
+}
diff --git a/blockdev.c b/blockdev.c
index 9cbd166674..8080c47fa6 100644
index 060d86a65f..79c3575612 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -37,6 +37,7 @@
@@ -179,10 +217,10 @@ index 9cbd166674..8080c47fa6 100644
#include "monitor/monitor.h"
#include "qemu/error-report.h"
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index d1a7b99add..af588145ff 100644
index 10fdd822e0..15937793c1 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -458,6 +458,20 @@ SRST
@@ -471,6 +471,20 @@ SRST
Show the current VM UUID.
ERST
@@ -204,7 +242,7 @@ index d1a7b99add..af588145ff 100644
{
.name = "usernet",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 0c7c6f2c16..bf8315f226 100644
index e352f86872..0c8b6725fb 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -101,6 +101,35 @@ ERST
@@ -243,8 +281,21 @@ index 0c7c6f2c16..bf8315f226 100644
ERST
{
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 0f2e1817ad..0a0339eee4 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -63,7 +63,7 @@
typedef int BackupDumpFunc(void *opaque, uint64_t offset, uint64_t bytes, const void *buf);
-BlockDriverState *bdrv_backup_dump_create(
+BlockDriverState *coroutine_fn bdrv_co_backup_dump_create(
int dump_cb_block_size,
uint64_t byte_size,
BackupDumpFunc *dump_cb,
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 2596cc2426..9dda91d65a 100644
index 7a7def7530..cba7afe70c 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -32,6 +32,7 @@ void hmp_info_savevm(Monitor *mon, const QDict *qdict);
@@ -255,7 +306,7 @@ index 2596cc2426..9dda91d65a 100644
void hmp_info_cpus(Monitor *mon, const QDict *qdict);
void hmp_info_vnc(Monitor *mon, const QDict *qdict);
void hmp_info_spice(Monitor *mon, const QDict *qdict);
@@ -82,6 +83,8 @@ void hmp_change_vnc(Monitor *mon, const char *device, const char *target,
@@ -84,6 +85,8 @@ void hmp_change_vnc(Monitor *mon, const char *device, const char *target,
void hmp_change_medium(Monitor *mon, const char *device, const char *target,
const char *arg, const char *read_only, bool force,
Error **errp);
@@ -265,10 +316,10 @@ index 2596cc2426..9dda91d65a 100644
void hmp_device_add(Monitor *mon, const QDict *qdict);
void hmp_device_del(Monitor *mon, const QDict *qdict);
diff --git a/meson.build b/meson.build
index 7eee5b4249..979c452f74 100644
index cd95530d3b..d53976d621 100644
--- a/meson.build
+++ b/meson.build
@@ -2013,6 +2013,7 @@ endif
@@ -1779,6 +1779,7 @@ endif
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
@@ -277,18 +328,18 @@ index 7eee5b4249..979c452f74 100644
# libselinux
selinux = dependency('libselinux',
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 874084565f..bedeb81f8c 100644
index 91be698308..5b9c231a4c 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -22,6 +22,7 @@
@@ -21,6 +21,7 @@
#include "qemu/help_option.h"
#include "monitor/monitor-internal.h"
#include "qapi/error.h"
+#include "qapi/qapi-commands-block-core.h"
#include "qapi/qapi-commands-control.h"
#include "qapi/qapi-commands-machine.h"
#include "qapi/qapi-commands-migration.h"
@@ -119,6 +120,77 @@ void hmp_sync_profile(Monitor *mon, const QDict *qdict)
#include "qapi/qapi-commands-misc.h"
@@ -144,6 +145,77 @@ void hmp_sync_profile(Monitor *mon, const QDict *qdict)
}
}
@@ -586,10 +637,10 @@ index 0000000000..8cbf645b2c
+#endif /* PROXMOX_BACKUP_CLIENT_H */
diff --git a/pve-backup.c b/pve-backup.c
new file mode 100644
index 0000000000..9f83ecb310
index 0000000000..d84d807654
--- /dev/null
+++ b/pve-backup.c
@@ -0,0 +1,1092 @@
@@ -0,0 +1,1067 @@
+#include "proxmox-backup-client.h"
+#include "vma.h"
+
@@ -600,7 +651,6 @@ index 0000000000..9f83ecb310
+#include "block/block_int-global-state.h"
+#include "block/blockjob.h"
+#include "block/dirty-bitmap.h"
+#include "block/graph-lock.h"
+#include "qapi/qapi-commands-block.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/cutils.h"
@@ -626,6 +676,7 @@ index 0000000000..9f83ecb310
+ * ---end-bad-example--
+ *
+ * ==> Always use CoMutext inside coroutines.
+ * ==> Never acquire/release AioContext withing coroutines (because that use QemuRecMutex)
+ *
+ */
+
@@ -678,6 +729,7 @@ index 0000000000..9f83ecb310
+ uint64_t block_size;
+ uint8_t dev_id;
+ int completed_ret; // INT_MAX if not completed
+ char targetfile[PATH_MAX];
+ BdrvDirtyBitmap *bitmap;
+ BlockDriverState *target;
+ BlockJob *job;
@@ -898,12 +950,7 @@ index 0000000000..9f83ecb310
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ /*
+ * All jobs in the transaction will be canceled when one receives an error.
+ * The first error wins, so only set it for ECANCELED if it was the last
+ * job. This allows more interesting errors from other jobs to win.
+ */
+ if (ret < 0 && (ret != -ECANCELED || !g_list_nth(backup_state.di_list, 1))) {
+ if (ret < 0) {
+ Error *local_err = NULL;
+ error_setg(&local_err, "job failed with err %d - %s", ret, strerror(-ret));
+ pvebackup_propagate_error(local_err);
@@ -927,6 +974,13 @@ index 0000000000..9f83ecb310
+ }
+ }
+
+ if (di->job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_unref_locked(&di->job->job);
+ di->job = NULL;
+ }
+ }
+
+ // remove self from job list
+ backup_state.di_list = g_list_remove(backup_state.di_list, di);
+
@@ -946,16 +1000,6 @@ index 0000000000..9f83ecb310
+ di->completed_ret = ret;
+
+ /*
+ * Needs to happen outside of coroutine, because it takes the graph write lock.
+ */
+ if (di->job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_unref_locked(&di->job->job);
+ di->job = NULL;
+ }
+ }
+
+ /*
+ * Schedule stream cleanup in async coroutine. close_image and finish might
+ * take a while, so we can't block on them here. This way it also doesn't
+ * matter if we're already running in a coroutine or not.
@@ -1076,7 +1120,8 @@ index 0000000000..9f83ecb310
+}
+
+/*
+ * backup_job_create can *not* be run from a coroutine, so this can't either.
+ * backup_job_create can *not* be run from a coroutine (and requires an
+ * acquired AioContext), so this can't either.
+ * The caller is responsible that backup_mutex is held nonetheless.
+ */
+static void create_backup_jobs_bh(void *opaque) {
@@ -1109,6 +1154,9 @@ index 0000000000..9f83ecb310
+ sync_mode = MIRROR_SYNC_MODE_BITMAP;
+ bitmap_mode = BITMAP_SYNC_MODE_ON_SUCCESS;
+ }
+ AioContext *aio_context = bdrv_get_aio_context(di->bs);
+ aio_context_acquire(aio_context);
+
+ bdrv_drained_begin(di->bs);
+
+ BlockJob *job = backup_job_create(
@@ -1119,6 +1167,8 @@ index 0000000000..9f83ecb310
+
+ bdrv_drained_end(di->bs);
+
+ aio_context_release(aio_context);
+
+ di->job = job;
+ if (job) {
+ WITH_JOB_LOCK_GUARD() {
@@ -1169,66 +1219,6 @@ index 0000000000..9f83ecb310
+ aio_co_enter(data->ctx, data->co);
+}
+
+/*
+ * Returns a list of device infos, which needs to be freed by the caller. In
+ * case of an error, errp will be set, but the returned value might still be a
+ * list.
+ */
+static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
+ const char *devlist,
+ Error **errp)
+{
+ gchar **devs = NULL;
+ GList *di_list = NULL;
+
+ if (devlist) {
+ devs = g_strsplit_set(devlist, ",;:", -1);
+
+ gchar **d = devs;
+ while (d && *d) {
+ BlockBackend *blk = blk_by_name(*d);
+ if (!blk) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", *d);
+ goto err;
+ }
+ BlockDriverState *bs = blk_bs(blk);
+ if (!bdrv_co_is_inserted(bs)) {
+ error_setg(errp, "Device '%s' has no medium", *d);
+ goto err;
+ }
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di_list = g_list_append(di_list, di);
+ d++;
+ }
+ } else {
+ BdrvNextIterator it;
+
+ for (BlockDriverState *bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+ if (!bdrv_co_is_inserted(bs) || bdrv_is_read_only(bs)) {
+ continue;
+ }
+
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di_list = g_list_append(di_list, di);
+ }
+ }
+
+ if (!di_list) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "empty device list");
+ goto err;
+ }
+
+err:
+ if (devs) {
+ g_strfreev(devs);
+ }
+
+ return di_list;
+}
+
+UuidInfo coroutine_fn *qmp_backup(
+ const char *backup_file,
+ const char *password,
@@ -1254,10 +1244,13 @@ index 0000000000..9f83ecb310
+
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+ BlockBackend *blk;
+ BlockDriverState *bs = NULL;
+ Error *local_err = NULL;
+ uuid_t uuid;
+ VmaWriter *vmaw = NULL;
+ ProxmoxBackupHandle *pbs = NULL;
+ gchar **devs = NULL;
+ GList *di_list = NULL;
+ GList *l;
+ UuidInfo *uuid_info;
@@ -1275,14 +1268,48 @@ index 0000000000..9f83ecb310
+ /* Todo: try to auto-detect format based on file name */
+ format = has_format ? format : BACKUP_FORMAT_VMA;
+
+ bdrv_graph_co_rdlock();
+ di_list = get_device_info(devlist, &local_err);
+ bdrv_graph_co_rdunlock();
+ if (local_err) {
+ error_propagate(errp, local_err);
+ if (devlist) {
+ devs = g_strsplit_set(devlist, ",;:", -1);
+
+ gchar **d = devs;
+ while (d && *d) {
+ blk = blk_by_name(*d);
+ if (blk) {
+ bs = blk_bs(blk);
+ if (!bdrv_co_is_inserted(bs)) {
+ error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, *d);
+ goto err;
+ }
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di_list = g_list_append(di_list, di);
+ } else {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", *d);
+ goto err;
+ }
+ d++;
+ }
+
+ } else {
+ BdrvNextIterator it;
+
+ bs = NULL;
+ for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+ if (!bdrv_co_is_inserted(bs) || bdrv_is_read_only(bs)) {
+ continue;
+ }
+
+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
+ di->bs = bs;
+ di_list = g_list_append(di_list, di);
+ }
+ }
+
+ if (!di_list) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "empty device list");
+ goto err;
+ }
+ assert(di_list);
+
+ size_t total = 0;
+
@@ -1290,11 +1317,7 @@ index 0000000000..9f83ecb310
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ bdrv_graph_co_rdlock();
+ bool blocked = bdrv_op_is_blocked(di->bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp);
+ bdrv_graph_co_rdunlock();
+ if (blocked) {
+ if (bdrv_op_is_blocked(di->bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
+ goto err;
+ }
+
@@ -1378,9 +1401,7 @@ index 0000000000..9f83ecb310
+
+ di->block_size = dump_cb_block_size;
+
+ bdrv_graph_co_rdlock();
+ const char *devname = bdrv_get_device_name(di->bs);
+ bdrv_graph_co_rdunlock();
+ PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
+ size_t dirty = di->size;
+
@@ -1456,9 +1477,7 @@ index 0000000000..9f83ecb310
+ goto err_mutex;
+ }
+
+ bdrv_graph_co_rdlock();
+ const char *devname = bdrv_get_device_name(di->bs);
+ bdrv_graph_co_rdunlock();
+ di->dev_id = vma_writer_register_stream(vmaw, devname, di->size);
+ if (di->dev_id <= 0) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
@@ -1570,11 +1589,18 @@ index 0000000000..9f83ecb310
+ bdrv_co_unref(di->target);
+ }
+
+ if (di->targetfile[0]) {
+ unlink(di->targetfile);
+ }
+ g_free(di);
+ }
+ g_list_free(di_list);
+ backup_state.di_list = NULL;
+
+ if (devs) {
+ g_strfreev(devs);
+ }
+
+ if (vmaw) {
+ Error *err = NULL;
+ vma_writer_close(vmaw, &err);
@@ -1683,10 +1709,10 @@ index 0000000000..9f83ecb310
+ return ret;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 1cb6f04db3..ac83c3495d 100644
index 125aa89858..331c8336d1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -851,6 +851,239 @@
@@ -839,6 +839,235 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'],
'allow-preconfig': true }
@@ -1755,9 +1781,6 @@ index 1cb6f04db3..ac83c3495d 100644
+# @config-file: a configuration file to include into
+# the backup archive.
+#
+# @firewall-file: a firewall configuration file to include into the backup
+# archive.
+#
+# @speed: the maximum speed, in bytes per second
+#
+# @devlist: list of block device names (separated by ',', ';'
@@ -1825,7 +1848,9 @@ index 1cb6f04db3..ac83c3495d 100644
+#
+# Cancel the current executing backup process.
+#
+# .. note:: This command succeeds even if there is no backup process running.
+# Returns: nothing on success
+#
+# Notes: This command succeeds even if there is no backup process running.
+#
+##
+{ 'command': 'backup-cancel', 'coroutine': true }
@@ -1848,9 +1873,6 @@ index 1cb6f04db3..ac83c3495d 100644
+#
+# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
+#
+# @backup-max-workers: Whether the 'max-workers' @BackupPerf setting is
+# supported or not.
+#
+##
+{ 'struct': 'ProxmoxSupportStatus',
+ 'data': { 'pbs-dirty-bitmap': 'bool',
@@ -1927,10 +1949,10 @@ index 1cb6f04db3..ac83c3495d 100644
# @BlockDeviceTimedStats:
#
diff --git a/qapi/common.json b/qapi/common.json
index 7558ce5430..5c00bddeb7 100644
index 6fed9cde1a..630a2a8f9a 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -200,3 +200,17 @@
@@ -207,3 +207,17 @@
##
{ 'struct': 'HumanReadableText',
'data': { 'human-readable-text': 'str' } }
@@ -1944,12 +1966,12 @@ index 7558ce5430..5c00bddeb7 100644
+#
+# Since: 0.14.0
+#
+# .. note:: If no UUID was specified for the guest, a null UUID is
+# Notes: If no UUID was specified for the guest, a null UUID is
+# returned.
+##
+{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} }
diff --git a/qapi/machine.json b/qapi/machine.json
index dc46a3e93f..bd58d58fc5 100644
index 7da3c519ba..888457f810 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -4,6 +4,8 @@
@@ -1961,7 +1983,7 @@ index dc46a3e93f..bd58d58fc5 100644
##
# = Machines
##
@@ -303,20 +305,6 @@
@@ -230,20 +232,6 @@
##
{ 'command': 'query-target', 'returns': 'TargetInfo' }
@@ -1974,8 +1996,8 @@ index dc46a3e93f..bd58d58fc5 100644
-#
-# Since: 0.14
-#
-# .. note:: If no UUID was specified for the guest, the nil UUID (all
-# zeroes) is returned.
-# Notes: If no UUID was specified for the guest, a null UUID is
-# returned.
-##
-{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} }
-

View File

@@ -14,20 +14,20 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
create mode 100644 pbs-restore.c
diff --git a/meson.build b/meson.build
index 979c452f74..426f382178 100644
index d53976d621..c3330310d9 100644
--- a/meson.build
+++ b/meson.build
@@ -4103,6 +4103,10 @@ if have_tools
@@ -3914,6 +3914,10 @@ if have_tools
vma = executable('vma', files('vma.c', 'vma-reader.c') + genh,
dependencies: [authz, block, crypto, io, qemuutil, qom], install: true)
dependencies: [authz, block, crypto, io, qom], install: true)
+ pbs_restore = executable('pbs-restore', files('pbs-restore.c') + genh,
+ dependencies: [authz, block, crypto, io, qemuutil, qom,
+ dependencies: [authz, block, crypto, io, qom,
+ libproxmox_backup_qemu], install: true)
+
subdir('storage-daemon')
foreach exe: [ 'qemu-img', 'qemu-io', 'qemu-nbd', 'qemu-storage-daemon']
subdir('contrib/rdmacm-mux')
subdir('contrib/elf2dmp')
diff --git a/pbs-restore.c b/pbs-restore.c
new file mode 100644
index 0000000000..f03d9bab8d

View File

@@ -14,33 +14,35 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
getlength is now a coroutine function]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 2 +
block/pbs.c | 306 +++++++++++++++++++++++++++++++++++++++++++
block/meson.build | 3 +
block/pbs.c | 305 +++++++++++++++++++++++++++++++++++++++++++
configure | 9 ++
meson.build | 2 +-
qapi/block-core.json | 29 ++++
qapi/block-core.json | 13 ++
qapi/pragma.json | 1 +
5 files changed, 339 insertions(+), 1 deletion(-)
6 files changed, 332 insertions(+), 1 deletion(-)
create mode 100644 block/pbs.c
diff --git a/block/meson.build b/block/meson.build
index 2367e1ac1b..e178047ec9 100644
index 6d468f89e5..becc99ac4e 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -49,6 +49,8 @@ block_ss.add(files(
@@ -50,6 +50,9 @@ block_ss.add(files(
'../pve-backup.c',
), libproxmox_backup_qemu)
+block_ss.add(files('pbs.c'), libproxmox_backup_qemu)
+block_ss.add(when: 'CONFIG_PBS_BDRV', if_true: files('pbs.c'))
+block_ss.add(when: 'CONFIG_PBS_BDRV', if_true: libproxmox_backup_qemu)
+
system_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
system_ss.add(files('block-ram-registrar.c'))
diff --git a/block/pbs.c b/block/pbs.c
new file mode 100644
index 0000000000..2d5e28ce8f
index 0000000000..a2211e0f3b
--- /dev/null
+++ b/block/pbs.c
@@ -0,0 +1,306 @@
@@ -0,0 +1,305 @@
+/*
+ * Proxmox Backup Server read-only block driver
+ */
@@ -68,7 +70,7 @@ index 0000000000..2d5e28ce8f
+
+typedef struct {
+ ProxmoxRestoreHandle *conn;
+ uint8_t aid;
+ char aid;
+ int64_t length;
+
+ char *repository;
@@ -201,18 +203,12 @@ index 0000000000..2d5e28ce8f
+ }
+
+ /* acquire handle and length */
+ ret = proxmox_restore_open_image(s->conn, s->archive, &pbs_error);
+ if (ret < 0) {
+ s->aid = proxmox_restore_open_image(s->conn, s->archive, &pbs_error);
+ if (s->aid < 0) {
+ if (pbs_error && errp) error_setg(errp, "PBS open_image failed: %s", pbs_error);
+ if (pbs_error) proxmox_backup_free_error(pbs_error);
+ return -ENODEV;
+ }
+ if (ret > UINT8_MAX) {
+ error_setg(errp, "PBS open_image returned an ID larger than %u", UINT8_MAX);
+ return -ENODEV;
+ }
+ s->aid = ret;
+
+ s->length = proxmox_restore_get_image_length(s->conn, s->aid, &pbs_error);
+ if (s->length < 0) {
+ if (pbs_error && errp) error_setg(errp, "PBS get_image_length failed: %s", pbs_error);
@@ -223,6 +219,12 @@ index 0000000000..2d5e28ce8f
+ return 0;
+}
+
+static int pbs_file_open(BlockDriverState *bs, QDict *options, int flags,
+ Error **errp)
+{
+ return pbs_open(bs, options, flags, errp);
+}
+
+static void pbs_close(BlockDriverState *bs) {
+ BDRVPBSState *s = bs->opaque;
+ g_free(s->repository);
@@ -232,8 +234,7 @@ index 0000000000..2d5e28ce8f
+ proxmox_restore_disconnect(s->conn);
+}
+
+static coroutine_fn int64_t GRAPH_RDLOCK
+pbs_co_getlength(BlockDriverState *bs)
+static coroutine_fn int64_t pbs_co_getlength(BlockDriverState *bs)
+{
+ BDRVPBSState *s = bs->opaque;
+ return s->length;
@@ -250,9 +251,9 @@ index 0000000000..2d5e28ce8f
+ aio_co_schedule(rcb->ctx, rcb->co);
+}
+
+static coroutine_fn int GRAPH_RDLOCK
+pbs_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
+ int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVPBSState *s = bs->opaque;
+ int ret;
@@ -297,17 +298,16 @@ index 0000000000..2d5e28ce8f
+ return 0;
+}
+
+static coroutine_fn int GRAPH_RDLOCK
+pbs_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+static coroutine_fn int pbs_co_pwritev(BlockDriverState *bs,
+ int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ fprintf(stderr, "pbs-bdrv: cannot write to backup file, make sure "
+ "any attached disk devices are set to read-only!\n");
+ return -EPERM;
+}
+
+static void GRAPH_RDLOCK
+pbs_refresh_filename(BlockDriverState *bs)
+static void pbs_refresh_filename(BlockDriverState *bs)
+{
+ BDRVPBSState *s = bs->opaque;
+ if (s->namespace) {
@@ -330,6 +330,7 @@ index 0000000000..2d5e28ce8f
+
+ .bdrv_parse_filename = pbs_parse_filename,
+
+ .bdrv_file_open = pbs_file_open,
+ .bdrv_open = pbs_open,
+ .bdrv_close = pbs_close,
+ .bdrv_co_getlength = pbs_co_getlength,
@@ -347,13 +348,54 @@ index 0000000000..2d5e28ce8f
+}
+
+block_init(bdrv_pbs_init);
diff --git a/configure b/configure
index 133f4e3235..f5a830c1f3 100755
--- a/configure
+++ b/configure
@@ -256,6 +256,7 @@ qemu_suffix="qemu"
softmmu="yes"
linux_user=""
bsd_user=""
+pbs_bdrv="yes"
plugins="$default_feature"
ninja=""
python=
@@ -809,6 +810,10 @@ for opt do
;;
--enable-download) download="enabled"; git_submodules_action=update;
;;
+ --disable-pbs-bdrv) pbs_bdrv="no"
+ ;;
+ --enable-pbs-bdrv) pbs_bdrv="yes"
+ ;;
--enable-plugins) if test "$mingw32" = "yes"; then
error_exit "TCG plugins not currently supported on Windows platforms"
else
@@ -959,6 +964,7 @@ cat << EOF
bsd-user all BSD usermode emulation targets
pie Position Independent Executables
debug-tcg TCG debugging (default is disabled)
+ pbs-bdrv Proxmox backup server read-only block driver support
NOTE: The object files are built at the place where configure is launched
EOF
@@ -1744,6 +1750,9 @@ if test "$solaris" = "yes" ; then
fi
echo "SRC_PATH=$source_path" >> $config_host_mak
echo "TARGET_DIRS=$target_list" >> $config_host_mak
+if test "$pbs_bdrv" = "yes" ; then
+ echo "CONFIG_PBS_BDRV=y" >> $config_host_mak
+fi
# XXX: suppress that
if [ "$bsd" = "yes" ] ; then
diff --git a/meson.build b/meson.build
index 426f382178..7e6130cfdf 100644
index c3330310d9..cbfc9a43fb 100644
--- a/meson.build
+++ b/meson.build
@@ -4559,7 +4559,7 @@ summary_info += {'zstd support': zstd}
summary_info += {'Query Processing Library support': qpl}
summary_info += {'UADK Library support': uadk}
@@ -4319,7 +4319,7 @@ summary_info += {'bzip2 support': libbzip2}
summary_info += {'lzfse support': liblzfse}
summary_info += {'zstd support': zstd}
summary_info += {'NUMA host support': numa}
-summary_info += {'capstone': capstone}
+summary_info += {'PBS bdrv support': config_host.has_key('CONFIG_PBS_BDRV')}
@@ -361,10 +403,10 @@ index 426f382178..7e6130cfdf 100644
summary_info += {'libdaxctl support': libdaxctl}
summary_info += {'libudev': libudev}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ac83c3495d..fe0eefcea6 100644
index 331c8336d1..a818d5f90f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3457,6 +3457,7 @@
@@ -3396,6 +3396,7 @@
'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
@@ -372,7 +414,7 @@ index ac83c3495d..fe0eefcea6 100644
'ssh', 'throttle', 'vdi', 'vhdx',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
@@ -3543,6 +3544,33 @@
@@ -3482,6 +3483,17 @@
{ 'struct': 'BlockdevOptionsNull',
'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } }
@@ -381,22 +423,6 @@ index ac83c3495d..fe0eefcea6 100644
+#
+# Driver specific block device options for the PBS backend.
+#
+# @repository: Proxmox Backup Server repository.
+#
+# @snapshot: backup snapshots ID.
+#
+# @archive: archive name.
+#
+# @keyfile: keyfile to use for encryption.
+#
+# @password: password to use for connection.
+#
+# @fingerprint: backup server fingerprint.
+#
+# @key_password: password to unlock key.
+#
+# @namespace: namespace where backup snapshot lives.
+#
+##
+{ 'struct': 'BlockdevOptionsPbs',
+ 'data': { 'repository': 'str', 'snapshot': 'str', 'archive': 'str',
@@ -406,7 +432,7 @@ index ac83c3495d..fe0eefcea6 100644
##
# @BlockdevOptionsNVMe:
#
@@ -4978,6 +5006,7 @@
@@ -4886,6 +4898,7 @@
'nfs': 'BlockdevOptionsNfs',
'null-aio': 'BlockdevOptionsNull',
'null-co': 'BlockdevOptionsNull',
@@ -415,10 +441,10 @@ index ac83c3495d..fe0eefcea6 100644
'nvme-io_uring': { 'type': 'BlockdevOptionsNvmeIoUring',
'if': 'CONFIG_BLKIO' },
diff --git a/qapi/pragma.json b/qapi/pragma.json
index be8fa304c5..7ff46bd128 100644
index 325e684411..b6079f6a0e 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -100,6 +100,7 @@
@@ -45,6 +45,7 @@
'BlockInfo', # query-block
'BlockdevAioOptions', # blockdev-add, -blockdev
'BlockdevDriver', # blockdev-add, query-blockstats, ...

View File

@@ -9,15 +9,15 @@ fitting.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
meson.build | 3 ++-
meson.build | 2 ++
os-posix.c | 7 +++++--
2 files changed, 7 insertions(+), 3 deletions(-)
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/meson.build b/meson.build
index 7e6130cfdf..984f858bdc 100644
index cbfc9a43fb..8206270272 100644
--- a/meson.build
+++ b/meson.build
@@ -2013,6 +2013,7 @@ endif
@@ -1779,6 +1779,7 @@ endif
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
@@ -25,29 +25,28 @@ index 7e6130cfdf..984f858bdc 100644
libproxmox_backup_qemu = cc.find_library('proxmox_backup_qemu', required: true)
# libselinux
@@ -3597,7 +3598,7 @@ if have_block
if host_os == 'windows'
system_ss.add(files('os-win32.c'))
else
- blockdev_ss.add(files('os-posix.c'))
+ blockdev_ss.add(files('os-posix.c'), libsystemd)
endif
@@ -3406,6 +3407,7 @@ if have_block
# os-posix.c contains POSIX-specific functions used by qemu-storage-daemon,
# os-win32.c does not
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
+ blockdev_ss.add(when: 'CONFIG_POSIX', if_true: libsystemd)
system_ss.add(when: 'CONFIG_WIN32', if_true: [files('os-win32.c')])
endif
diff --git a/os-posix.c b/os-posix.c
index 43f9a43f3f..a47e46d1c2 100644
index cfcb96533c..fb2ad87009 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -29,6 +29,8 @@
@@ -28,6 +28,8 @@
#include <pwd.h>
#include <grp.h>
#include <libgen.h>
+#include <systemd/sd-journal.h>
+#include <syslog.h>
#include "qemu/error-report.h"
#include "qemu/log.h"
@@ -306,9 +308,10 @@ void os_setup_post(void)
/* Needed early for CONFIG_BSD etc. */
#include "net/slirp.h"
@@ -310,9 +312,10 @@ void os_setup_post(void)
dup2(fd, 0);
dup2(fd, 1);

View File

@@ -26,10 +26,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
create mode 100644 migration/pbs-state.c
diff --git a/include/migration/misc.h b/include/migration/misc.h
index bfadc5613b..e2e51fcf6b 100644
index 7dcc0b5c2c..4c940b2475 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -111,4 +111,7 @@ bool migration_in_bg_snapshot(void);
@@ -77,4 +77,7 @@ bool migration_in_bg_snapshot(void);
/* migration/block-dirty-bitmap.c */
void dirty_bitmap_mig_init(void);
@@ -38,40 +38,34 @@ index bfadc5613b..e2e51fcf6b 100644
+
#endif
diff --git a/migration/meson.build b/migration/meson.build
index 4b0c4f0f51..d039797132 100644
index 07f6057acc..343994d891 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -8,6 +8,7 @@ migration_files = files(
@@ -7,7 +7,9 @@ migration_files = files(
'vmstate.c',
'qemu-file.c',
'yank_functions.c',
+ 'pbs-state.c',
)
+system_ss.add(libproxmox_backup_qemu)
system_ss.add(files(
'block-dirty-bitmap.c',
@@ -25,6 +26,7 @@ system_ss.add(files(
'multifd-zlib.c',
'multifd-zero-page.c',
'options.c',
+ 'pbs-state.c',
'postcopy-ram.c',
'savevm.c',
'savevm-async.c',
diff --git a/migration/migration.c b/migration/migration.c
index ae2be31557..fab4c20ee4 100644
index 7a4c8beb5d..0a955a2a18 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -263,6 +263,7 @@ void migration_object_init(void)
@@ -162,6 +162,7 @@ void migration_object_init(void)
blk_mig_init();
ram_mig_init();
dirty_bitmap_mig_init();
+ pbs_state_mig_init();
}
typedef struct {
void migration_cancel(const Error *error)
diff --git a/migration/pbs-state.c b/migration/pbs-state.c
new file mode 100644
index 0000000000..a97187e4d7
index 0000000000..887e998b9e
--- /dev/null
+++ b/migration/pbs-state.c
@@ -0,0 +1,104 @@
@@ -120,7 +114,7 @@ index 0000000000..a97187e4d7
+}
+
+/* serialize PBS state and send to target via f, called on source */
+static int pbs_state_save_setup(QEMUFile *f, void *opaque, Error **errp)
+static int pbs_state_save_setup(QEMUFile *f, void *opaque)
+{
+ size_t buf_size;
+ uint8_t *buf = proxmox_export_state(&buf_size);
@@ -180,10 +174,10 @@ index 0000000000..a97187e4d7
+ NULL);
+}
diff --git a/pve-backup.c b/pve-backup.c
index 9f83ecb310..57477f7f2a 100644
index d84d807654..9c8b88d075 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1085,6 +1085,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1060,6 +1060,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
ret->pbs_dirty_bitmap = true;
ret->pbs_dirty_bitmap_savevm = true;
@@ -192,10 +186,10 @@ index 9f83ecb310..57477f7f2a 100644
ret->pbs_masterkey = true;
ret->backup_max_workers = true;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index fe0eefcea6..521a1914e8 100644
index a818d5f90f..48eb47c6ea 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1004,6 +1004,11 @@
@@ -991,6 +991,11 @@
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -207,7 +201,7 @@ index fe0eefcea6..521a1914e8 100644
# @pbs-masterkey: True if the QMP backup call supports the 'master_keyfile'
# parameter.
#
@@ -1017,6 +1022,7 @@
@@ -1001,6 +1006,7 @@
'data': { 'pbs-dirty-bitmap': 'bool',
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',

View File

@@ -15,21 +15,18 @@ transferred.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
migration/block-dirty-bitmap.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
migration/block-dirty-bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index a7d55048c2..77346a5fa2 100644
index e1ae3b7316..285dd1d148 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -539,7 +539,10 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
}
@@ -540,7 +540,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, errp)) {
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, &local_err)) {
error_report_err(local_err);
- return -1;
+ if (errp != NULL) {
+ error_report_err(*errp);
+ }
+ continue;
}

View File

@@ -21,10 +21,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 30 insertions(+)
diff --git a/block/iscsi.c b/block/iscsi.c
index 979bf90cb7..961714a4be 100644
index 34f97ab646..398782963d 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1392,12 +1392,42 @@ static char *get_initiator_name(QemuOpts *opts)
@@ -1391,12 +1391,42 @@ static char *get_initiator_name(QemuOpts *opts)
const char *name;
char *iscsi_name;
UuidInfo *uuid_info;

View File

@@ -11,7 +11,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/stream.c b/block/stream.c
index 7031eef12b..d2da83ae7c 100644
index e522bbdec5..afed72db55 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -27,7 +27,7 @@ enum {

View File

@@ -0,0 +1,33 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Tue, 2 Mar 2021 16:11:54 +0100
Subject: [PATCH] block/io: accept NULL qiov in bdrv_pad_request
Some operations, e.g. block-stream, perform reads while discarding the
results (only copy-on-read matters). In this case they will pass NULL as
the target QEMUIOVector, which will however trip bdrv_pad_request, since
it wants to extend its passed vector.
Simply check for NULL and do nothing, there's no reason to pad the
target if it will be discarded anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/io.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/block/io.c b/block/io.c
index 055fcf7438..63f7b3ad3e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1710,6 +1710,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
int sliced_niov;
size_t sliced_head, sliced_tail;
+ if (!qiov) {
+ return 0;
+ }
+
/* Should have been checked by the caller already */
ret = bdrv_check_request32(*offset, *bytes, *qiov, *qiov_offset);
if (ret < 0) {

View File

@@ -19,33 +19,27 @@ well.
This only worked if the target supports backing images, so up until now
only for qcow2, with alloc-track any driver for the target can be used.
Replacing the node cannot be done in the
track_co_change_backing_file() callback, because replacing a node
cannot happen in a coroutine and requires the block graph lock
exclusively. Could either become a special option for the stream job,
or maybe the upcoming blockdev-replace QMP command can be used in the
future.
If 'auto-remove' is set, alloc-track will automatically detach itself
once the backing image is removed. It will be replaced by 'file'.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures
make error return value consistent with QEMU
avoid premature break during read
adhere to block graph lock requirements]
avoid premature break during read]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 366 ++++++++++++++++++++++++++++++++++++++++++++
block/alloc-track.c | 352 ++++++++++++++++++++++++++++++++++++++++++++
block/meson.build | 1 +
block/stream.c | 34 ++++
3 files changed, 401 insertions(+)
2 files changed, 353 insertions(+)
create mode 100644 block/alloc-track.c
diff --git a/block/alloc-track.c b/block/alloc-track.c
new file mode 100644
index 0000000000..b4a9851144
index 0000000000..b75d7c6460
--- /dev/null
+++ b/block/alloc-track.c
@@ -0,0 +1,366 @@
@@ -0,0 +1,352 @@
+/*
+ * Node to allow backing images to be applied to any node. Assumes a blank
+ * image to begin with, only new writes are tracked as allocated, thus this
@@ -62,11 +56,9 @@ index 0000000000..b4a9851144
+#include "qapi/error.h"
+#include "block/block_int.h"
+#include "block/dirty-bitmap.h"
+#include "block/graph-lock.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/option.h"
+#include "qemu/module.h"
+#include "sysemu/block-backend.h"
@@ -75,12 +67,12 @@ index 0000000000..b4a9851144
+
+typedef enum DropState {
+ DropNone,
+ DropRequested,
+ DropInProgress,
+} DropState;
+
+typedef struct {
+ BdrvDirtyBitmap *bitmap;
+ uint64_t granularity;
+ DropState drop_state;
+ bool auto_remove;
+} BDRVAllocTrackState;
@@ -99,29 +91,26 @@ index 0000000000..b4a9851144
+ },
+};
+
+static void GRAPH_RDLOCK
+track_refresh_limits(BlockDriverState *bs, Error **errp)
+static void track_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+ BlockDriverInfo bdi;
+
+ if (!bs->file) {
+ return;
+ }
+
+ /*
+ * Always use alignment from underlying write device so RMW cycle for
+ * bdrv_pwritev reads data from our backing via track_co_preadv. Also use at
+ * least the bitmap granularity.
+ */
+ /* always use alignment from underlying write device so RMW cycle for
+ * bdrv_pwritev reads data from our backing via track_co_preadv (no partial
+ * cluster allocation in 'file') */
+ bdrv_get_info(bs->file->bs, &bdi);
+ bs->bl.request_alignment = MAX(bs->file->bs->bl.request_alignment,
+ s->granularity);
+ MAX(bdi.cluster_size, BDRV_SECTOR_SIZE));
+}
+
+static int track_open(BlockDriverState *bs, QDict *options, int flags,
+ Error **errp)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+ BdrvChild *file = NULL;
+ QemuOpts *opts;
+ Error *local_err = NULL;
+ int ret = 0;
@@ -137,45 +126,18 @@ index 0000000000..b4a9851144
+ s->auto_remove = qemu_opt_get_bool(opts, TRACK_OPT_AUTO_REMOVE, false);
+
+ /* open the target (write) node, backing will be attached by block layer */
+ file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
+ BDRV_CHILD_DATA | BDRV_CHILD_METADATA, false,
+ &local_err);
+ bdrv_graph_wrlock();
+ bs->file = file;
+ bdrv_graph_wrunlock();
+ bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
+ BDRV_CHILD_DATA | BDRV_CHILD_METADATA, false,
+ &local_err);
+ if (local_err) {
+ ret = -EINVAL;
+ error_propagate(errp, local_err);
+ goto fail;
+ }
+
+ bdrv_graph_rdlock_main_loop();
+ BlockDriverInfo bdi = {0};
+ ret = bdrv_get_info(bs->file->bs, &bdi);
+ if (ret < 0) {
+ /*
+ * Not a hard failure. Worst that can happen is partial cluster
+ * allocation in the write target. However, the driver here returns its
+ * allocation status based on the dirty bitmap, so any other data that
+ * maps to such a cluster will still be copied later by a stream job (or
+ * during writes to that cluster).
+ */
+ warn_report("alloc-track: unable to query cluster size for write target: %s",
+ strerror(ret));
+ }
+ ret = 0;
+ /*
+ * Always consider alignment from underlying write device so RMW cycle for
+ * bdrv_pwritev reads data from our backing via track_co_preadv. Also try to
+ * avoid partial cluster allocation in the write target by considering the
+ * cluster size.
+ */
+ s->granularity = MAX(bs->file->bs->bl.request_alignment,
+ MAX(bdi.cluster_size, BDRV_SECTOR_SIZE));
+ track_refresh_limits(bs, errp);
+ s->bitmap = bdrv_create_dirty_bitmap(bs->file->bs, s->granularity, NULL,
+ &local_err);
+ bdrv_graph_rdunlock_main_loop();
+ uint64_t gran = bs->bl.request_alignment;
+ s->bitmap = bdrv_create_dirty_bitmap(bs->file->bs, gran, NULL, &local_err);
+ if (local_err) {
+ ret = -EIO;
+ error_propagate(errp, local_err);
@@ -186,9 +148,7 @@ index 0000000000..b4a9851144
+
+fail:
+ if (ret < 0) {
+ bdrv_graph_wrlock();
+ bdrv_unref_child(bs, bs->file);
+ bdrv_graph_wrunlock();
+ if (s->bitmap) {
+ bdrv_release_dirty_bitmap(s->bitmap);
+ }
@@ -205,15 +165,13 @@ index 0000000000..b4a9851144
+ }
+}
+
+static coroutine_fn int64_t GRAPH_RDLOCK
+track_co_getlength(BlockDriverState *bs)
+static coroutine_fn int64_t track_co_getlength(BlockDriverState *bs)
+{
+ return bdrv_co_getlength(bs->file->bs);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+track_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+static int coroutine_fn track_co_preadv(BlockDriverState *bs,
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+ QEMUIOVector local_qiov;
@@ -271,39 +229,36 @@ index 0000000000..b4a9851144
+ return ret;
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+track_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+static int coroutine_fn track_co_pwritev(BlockDriverState *bs,
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+track_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int64_t bytes,
+ BdrvRequestFlags flags)
+static int coroutine_fn track_co_pwrite_zeroes(BlockDriverState *bs,
+ int64_t offset, int64_t bytes, BdrvRequestFlags flags)
+{
+ return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+track_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes)
+static int coroutine_fn track_co_pdiscard(BlockDriverState *bs,
+ int64_t offset, int64_t bytes)
+{
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
+}
+
+static coroutine_fn int GRAPH_RDLOCK
+track_co_flush(BlockDriverState *bs)
+static coroutine_fn int track_co_flush(BlockDriverState *bs)
+{
+ return bdrv_co_flush(bs->file->bs);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+track_co_block_status(BlockDriverState *bs, bool want_zero,
+ int64_t offset,
+ int64_t bytes,
+ int64_t *pnum,
+ int64_t *map,
+ BlockDriverState **file)
+static int coroutine_fn track_co_block_status(BlockDriverState *bs,
+ bool want_zero,
+ int64_t offset,
+ int64_t bytes,
+ int64_t *pnum,
+ int64_t *map,
+ BlockDriverState **file)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+
@@ -329,10 +284,10 @@ index 0000000000..b4a9851144
+ return 0;
+}
+
+static void GRAPH_RDLOCK
+track_child_perm(BlockDriverState *bs, BdrvChild *c, BdrvChildRole role,
+ BlockReopenQueue *reopen_queue, uint64_t perm, uint64_t shared,
+ uint64_t *nperm, uint64_t *nshared)
+static void track_child_perm(BlockDriverState *bs, BdrvChild *c,
+ BdrvChildRole role, BlockReopenQueue *reopen_queue,
+ uint64_t perm, uint64_t shared,
+ uint64_t *nperm, uint64_t *nshared)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+
@@ -355,28 +310,53 @@ index 0000000000..b4a9851144
+ }
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+track_co_change_backing_file(BlockDriverState *bs, const char *backing_file,
+ const char *backing_fmt)
+static void track_drop(void *opaque)
+{
+ /*
+ * Note that the actual backing file graph change is already done in the
+ * stream job itself with bdrv_set_backing_hd_drained(), so no need to
+ * actually do anything here. But still needs to be implemented, to make
+ * our caller (i.e. bdrv_co_change_backing_file() do the right thing).
+ *
+ * FIXME
+ * We'd like to auto-remove ourselves from the block graph, but it cannot
+ * be done from a coroutine. Currently done in the stream job, where it
+ * kinda fits better, but in the long-term, a special parameter would be
+ * nice (or done via qemu-server via upcoming blockdev-replace QMP command).
+ */
+ if (backing_file == NULL) {
+ BDRVAllocTrackState *s = bs->opaque;
+ bdrv_drained_begin(bs);
+ s->drop_state = DropInProgress;
+ bdrv_child_refresh_perms(bs, bs->file, &error_abort);
+ bdrv_drained_end(bs);
+ BlockDriverState *bs = (BlockDriverState*)opaque;
+ BlockDriverState *file = bs->file->bs;
+ BDRVAllocTrackState *s = bs->opaque;
+
+ assert(file);
+
+ /* we rely on the fact that we're not used anywhere else, so let's wait
+ * until we're only used once - in the drive connected to the guest (and one
+ * ref is held by bdrv_ref in track_change_backing_file) */
+ if (bs->refcnt > 2) {
+ aio_bh_schedule_oneshot(qemu_get_aio_context(), track_drop, opaque);
+ return;
+ }
+ AioContext *aio_context = bdrv_get_aio_context(bs);
+ aio_context_acquire(aio_context);
+
+ bdrv_drained_begin(bs);
+
+ /* now that we're drained, we can safely set 'DropInProgress' */
+ s->drop_state = DropInProgress;
+ bdrv_child_refresh_perms(bs, bs->file, &error_abort);
+
+ bdrv_replace_node(bs, file, &error_abort);
+ bdrv_set_backing_hd(bs, NULL, &error_abort);
+ bdrv_drained_end(bs);
+ bdrv_unref(bs);
+ aio_context_release(aio_context);
+}
+
+static int track_change_backing_file(BlockDriverState *bs,
+ const char *backing_file,
+ const char *backing_fmt)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+ if (s->auto_remove && s->drop_state == DropNone &&
+ backing_file == NULL && backing_fmt == NULL)
+ {
+ /* backing file has been disconnected, there's no longer any use for
+ * this node, so let's remove ourselves from the block graph - we need
+ * to schedule this for later however, since when this function is
+ * called, the blockjob modifying us is probably not done yet and has a
+ * blocker on 'bs' */
+ s->drop_state = DropRequested;
+ bdrv_ref(bs);
+ aio_bh_schedule_oneshot(qemu_get_aio_context(), track_drop, (void*)bs);
+ }
+
+ return 0;
@@ -386,7 +366,7 @@ index 0000000000..b4a9851144
+ .format_name = "alloc-track",
+ .instance_size = sizeof(BDRVAllocTrackState),
+
+ .bdrv_open = track_open,
+ .bdrv_file_open = track_open,
+ .bdrv_close = track_close,
+ .bdrv_co_getlength = track_co_getlength,
+ .bdrv_child_perm = track_child_perm,
@@ -403,7 +383,7 @@ index 0000000000..b4a9851144
+ .supports_backing = true,
+
+ .bdrv_co_block_status = track_co_block_status,
+ .bdrv_co_change_backing_file = track_co_change_backing_file,
+ .bdrv_change_backing_file = track_change_backing_file,
+};
+
+static void bdrv_alloc_track_init(void)
@@ -413,7 +393,7 @@ index 0000000000..b4a9851144
+
+block_init(bdrv_alloc_track_init);
diff --git a/block/meson.build b/block/meson.build
index e178047ec9..7ef7250d31 100644
index becc99ac4e..0a69836593 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -2,6 +2,7 @@ block_ss.add(genh)
@@ -424,48 +404,3 @@ index e178047ec9..7ef7250d31 100644
'amend.c',
'backup.c',
'backup-dump.c',
diff --git a/block/stream.c b/block/stream.c
index d2da83ae7c..f941cba14e 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -120,6 +120,40 @@ static int stream_prepare(Job *job)
ret = -EPERM;
goto out;
}
+
+ /*
+ * This cannot be done in the co_change_backing_file callback, because
+ * bdrv_replace_node() cannot be done in a coroutine. The latter also
+ * requires the graph lock exclusively. Only required for the
+ * alloc-track driver.
+ *
+ * The long-term plan is to either have an explicit parameter for the
+ * stream job or use the upcoming blockdev-replace QMP command.
+ */
+ if (base_id == NULL && strcmp(unfiltered_bs->drv->format_name, "alloc-track") == 0) {
+ BlockDriverState *file_bs;
+
+ bdrv_graph_rdlock_main_loop();
+ file_bs = unfiltered_bs->file->bs;
+ bdrv_graph_rdunlock_main_loop();
+
+ bdrv_ref(unfiltered_bs); // unrefed by bdrv_replace_node()
+ bdrv_drained_begin(file_bs);
+ bdrv_graph_wrlock();
+
+ bdrv_replace_node(unfiltered_bs, file_bs, &local_err);
+
+ bdrv_graph_wrunlock();
+ bdrv_drained_end(file_bs);
+ bdrv_unref(unfiltered_bs);
+
+ if (local_err) {
+ error_prepend(&local_err, "failed to replace alloc-track node: ");
+ error_report_err(local_err);
+ ret = -EPERM;
+ goto out;
+ }
+ }
}
out:

View File

@@ -7,16 +7,15 @@ This reverts commit fc176116cdea816ceb8dd969080b2b95f58edbc0 in
preparation to revert 0347a8fd4c3faaedf119be04c197804be40a384b.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/rbd.c | 42 ++----------------------------------------
1 file changed, 2 insertions(+), 40 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 101ee59d6e..4ad3b1a7b1 100644
index a4749f3b1b..53e0396b51 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1515,7 +1515,6 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
@@ -1511,7 +1511,6 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
int status, r;
RBDDiffIterateReq req = { .offs = offset };
uint64_t features, flags;
@@ -24,7 +23,7 @@ index 101ee59d6e..4ad3b1a7b1 100644
assert(offset + bytes <= s->image_size);
@@ -1543,43 +1542,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
@@ -1539,43 +1538,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
return status;
}
@@ -69,7 +68,7 @@ index 101ee59d6e..4ad3b1a7b1 100644
qemu_rbd_diff_iterate_cb, &req);
if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
return status;
@@ -1598,8 +1561,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
@@ -1594,8 +1557,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
}

View File

@@ -8,16 +8,15 @@ This reverts commit 9e302f64bb407a9bb097b626da97228c2654cfee in
preparation to revert 0347a8fd4c3faaedf119be04c197804be40a384b.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/rbd.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 4ad3b1a7b1..e341745255 100644
index 53e0396b51..0913a0af39 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1474,11 +1474,11 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,
@@ -1470,11 +1470,11 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,
RBDDiffIterateReq *req = opaque;
assert(req->offs + req->bytes <= offs);

View File

@@ -18,13 +18,12 @@ Upstream bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1026
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/rbd.c | 112 ----------------------------------------------------
1 file changed, 112 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index e341745255..436d3d7811 100644
index 0913a0af39..1dab254517 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -108,12 +108,6 @@ typedef struct RBDTask {
@@ -40,7 +39,7 @@ index e341745255..436d3d7811 100644
static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
BlockdevOptionsRbd *opts, bool cache,
const char *keypairs, const char *secretid,
@@ -1460,111 +1454,6 @@ static ImageInfoSpecific *qemu_rbd_get_specific_info(BlockDriverState *bs,
@@ -1456,111 +1450,6 @@ static ImageInfoSpecific *qemu_rbd_get_specific_info(BlockDriverState *bs,
return spec_info;
}
@@ -152,7 +151,7 @@ index e341745255..436d3d7811 100644
static int64_t coroutine_fn qemu_rbd_co_getlength(BlockDriverState *bs)
{
BDRVRBDState *s = bs->opaque;
@@ -1801,7 +1690,6 @@ static BlockDriver bdrv_rbd = {
@@ -1796,7 +1685,6 @@ static BlockDriver bdrv_rbd = {
#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
.bdrv_co_pwrite_zeroes = qemu_rbd_co_pwrite_zeroes,
#endif

View File

@@ -1,43 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Tue, 26 Mar 2024 14:57:51 +0100
Subject: [PATCH] alloc-track: error out when auto-remove is not set
Since replacing the node now happens in the stream job, where the
option cannot be read from (it's internal to the driver), it will
always be treated as on.
qemu-server will always set it, make sure to have other users notice
the change (should they even exist). The option can be fully dropped
in the future while adding a version guard in qemu-server.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/block/alloc-track.c b/block/alloc-track.c
index b4a9851144..fc7d58a5d0 100644
--- a/block/alloc-track.c
+++ b/block/alloc-track.c
@@ -34,7 +34,6 @@ typedef struct {
BdrvDirtyBitmap *bitmap;
uint64_t granularity;
DropState drop_state;
- bool auto_remove;
} BDRVAllocTrackState;
static QemuOptsList runtime_opts = {
@@ -86,7 +85,11 @@ static int track_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
- s->auto_remove = qemu_opt_get_bool(opts, TRACK_OPT_AUTO_REMOVE, false);
+ if (!qemu_opt_get_bool(opts, TRACK_OPT_AUTO_REMOVE, false)) {
+ error_setg(errp, "alloc-track: requires auto-remove option to be set to on");
+ ret = -EINVAL;
+ goto fail;
+ }
/* open the target (write) node, backing will be attached by block layer */
file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,

View File

@@ -1,84 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wed, 27 Mar 2024 11:15:39 +0100
Subject: [PATCH] alloc-track: avoid seemingly superfluous child permission
update
Doesn't seem necessary nowadays (maybe after commit "alloc-track: fix
deadlock during drop" where the dropping is not rescheduled and delayed
anymore or some upstream change). Should there really be some issue,
instead of having a drop state, this could also be just based off the
fact whether there is still a backing child.
Dumping the cumulative (shared) permissions for the BDS with a debug
print yields the same values after this patch and with QEMU 8.1,
namely 3 and 5.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 26 --------------------------
1 file changed, 26 deletions(-)
diff --git a/block/alloc-track.c b/block/alloc-track.c
index fc7d58a5d0..b56425b7f0 100644
--- a/block/alloc-track.c
+++ b/block/alloc-track.c
@@ -25,15 +25,9 @@
#define TRACK_OPT_AUTO_REMOVE "auto-remove"
-typedef enum DropState {
- DropNone,
- DropInProgress,
-} DropState;
-
typedef struct {
BdrvDirtyBitmap *bitmap;
uint64_t granularity;
- DropState drop_state;
} BDRVAllocTrackState;
static QemuOptsList runtime_opts = {
@@ -137,8 +131,6 @@ static int track_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
- s->drop_state = DropNone;
-
fail:
if (ret < 0) {
bdrv_graph_wrlock();
@@ -289,18 +281,8 @@ track_child_perm(BlockDriverState *bs, BdrvChild *c, BdrvChildRole role,
BlockReopenQueue *reopen_queue, uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
{
- BDRVAllocTrackState *s = bs->opaque;
-
*nshared = BLK_PERM_ALL;
- /* in case we're currently dropping ourselves, claim to not use any
- * permissions at all - which is fine, since from this point on we will
- * never issue a read or write anymore */
- if (s->drop_state == DropInProgress) {
- *nperm = 0;
- return;
- }
-
if (role & BDRV_CHILD_DATA) {
*nperm = perm & DEFAULT_PERM_PASSTHROUGH;
} else {
@@ -326,14 +308,6 @@ track_co_change_backing_file(BlockDriverState *bs, const char *backing_file,
* kinda fits better, but in the long-term, a special parameter would be
* nice (or done via qemu-server via upcoming blockdev-replace QMP command).
*/
- if (backing_file == NULL) {
- BDRVAllocTrackState *s = bs->opaque;
- bdrv_drained_begin(bs);
- s->drop_state = DropInProgress;
- bdrv_child_refresh_perms(bs, bs->file, &error_abort);
- bdrv_drained_end(bs);
- }
-
return 0;
}

View File

@@ -0,0 +1,153 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 6 Apr 2023 14:59:31 +0200
Subject: [PATCH] alloc-track: fix deadlock during drop
by replacing the block node directly after changing the backing file
instead of rescheduling it.
With changes in QEMU 8.0, calling bdrv_get_info (and bdrv_unref)
during drop can lead to a deadlock when using iothread (only triggered
with multiple disks, except during debugging where it also triggered
with one disk sometimes):
1. job_unref_locked acquires the AioContext and calls job->driver->free
2. track_drop gets scheduled
3. bdrv_graph_wrlock is called and polls which leads to track_drop being
called
4. track_drop acquires the AioContext recursively
5. bdrv_get_info is a wrapped coroutine (since 8.0) and thus polls for
bdrv_co_get_info. This releases the AioContext, but only once! The
documentation for the AIO_WAIT_WHILE macro states that the
AioContext lock needs to be acquired exactly once, but there does
not seem to be a way for track_drop to know if it acquired the lock
recursively or not (without adding further hacks).
6. Because the AioContext is still held by the main thread once, it can't
be acquired before entering bdrv_co_get_info in co_schedule_bh_cb
which happens in the iothread
When doing the operation in change_backing_file, the AioContext has
already been acquired by the caller, so the issue with the recursive
lock goes away.
The comment explaining why delaying the replace is necessary is
> we need to schedule this for later however, since when this function
> is called, the blockjob modifying us is probably not done yet and
> has a blocker on 'bs'
However, there is no check for blockers in bdrv_replace_node. It would
need to be done by us, the caller, with check_to_replace_node.
Furthermore, the mirror job also does its call to bdrv_replace_node
while there is an active blocker (inserted by mirror itself) and they
use a specialized version to check for blockers instead of
check_to_replace_node there. Alloc-track could also do something
similar to check for other blockers, but it should be fine to rely on
Proxmox VE that no other operation with the blockdev is going on.
Mirror also drains the target before replacing the node, but the
target can have other users. In case of alloc-track the file child
should not be accessible by anybody else and so there can't be an
in-flight operation for the file child when alloc-track is drained.
The rescheduling based on refcounting is a hack and it doesn't seem to
be necessary anymore. It's not clear what the original issue from the
comment was. Testing with older builds with track_drop done directly
without rescheduling also didn't lead to any noticable issue for me.
One issue it might have been is the one fixed by b1e1af394d
("block/stream: Drain subtree around graph change"), where
block-stream had a use-after-free if the base node changed at an
inconvenient time (which alloc-track's auto-drop does).
It's also not possible to just not auto-replace the alloc-track. Not
replacing it at all leads to other operations like block resize
hanging, and there is no good way to replace it manually via QMP
(there is x-blockdev-change, but it is experimental and doesn't
implement the required operation yet). Also, it's just cleaner in
general to not leave unnecessary block nodes lying around.
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 54 ++++++++++++++-------------------------------
1 file changed, 16 insertions(+), 38 deletions(-)
diff --git a/block/alloc-track.c b/block/alloc-track.c
index b75d7c6460..76da140a68 100644
--- a/block/alloc-track.c
+++ b/block/alloc-track.c
@@ -25,7 +25,6 @@
typedef enum DropState {
DropNone,
- DropRequested,
DropInProgress,
} DropState;
@@ -268,37 +267,6 @@ static void track_child_perm(BlockDriverState *bs, BdrvChild *c,
}
}
-static void track_drop(void *opaque)
-{
- BlockDriverState *bs = (BlockDriverState*)opaque;
- BlockDriverState *file = bs->file->bs;
- BDRVAllocTrackState *s = bs->opaque;
-
- assert(file);
-
- /* we rely on the fact that we're not used anywhere else, so let's wait
- * until we're only used once - in the drive connected to the guest (and one
- * ref is held by bdrv_ref in track_change_backing_file) */
- if (bs->refcnt > 2) {
- aio_bh_schedule_oneshot(qemu_get_aio_context(), track_drop, opaque);
- return;
- }
- AioContext *aio_context = bdrv_get_aio_context(bs);
- aio_context_acquire(aio_context);
-
- bdrv_drained_begin(bs);
-
- /* now that we're drained, we can safely set 'DropInProgress' */
- s->drop_state = DropInProgress;
- bdrv_child_refresh_perms(bs, bs->file, &error_abort);
-
- bdrv_replace_node(bs, file, &error_abort);
- bdrv_set_backing_hd(bs, NULL, &error_abort);
- bdrv_drained_end(bs);
- bdrv_unref(bs);
- aio_context_release(aio_context);
-}
-
static int track_change_backing_file(BlockDriverState *bs,
const char *backing_file,
const char *backing_fmt)
@@ -308,13 +276,23 @@ static int track_change_backing_file(BlockDriverState *bs,
backing_file == NULL && backing_fmt == NULL)
{
/* backing file has been disconnected, there's no longer any use for
- * this node, so let's remove ourselves from the block graph - we need
- * to schedule this for later however, since when this function is
- * called, the blockjob modifying us is probably not done yet and has a
- * blocker on 'bs' */
- s->drop_state = DropRequested;
+ * this node, so let's remove ourselves from the block graph */
+ BlockDriverState *file = bs->file->bs;
+
+ /* Just to be sure, because bdrv_replace_node unrefs it */
bdrv_ref(bs);
- aio_bh_schedule_oneshot(qemu_get_aio_context(), track_drop, (void*)bs);
+ bdrv_drained_begin(bs);
+
+ /* now that we're drained, we can safely set 'DropInProgress' */
+ s->drop_state = DropInProgress;
+
+ bdrv_child_refresh_perms(bs, bs->file, &error_abort);
+
+ bdrv_replace_node(bs, file, &error_abort);
+ bdrv_set_backing_hd(bs, NULL, &error_abort);
+
+ bdrv_drained_end(bs);
+ bdrv_unref(bs);
}
return 0;

View File

@@ -1,133 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 11 Apr 2024 11:29:26 +0200
Subject: [PATCH] copy-before-write: allow specifying minimum cluster size
Useful to make discard-source work in the context of backup fleecing
when the fleecing image has a larger granularity than the backup
target.
Copy-before-write operations will use at least this granularity and in
particular, discard requests to the source node will too. If the
granularity is too small, they will just be aligned down in
cbw_co_pdiscard_snapshot() and thus effectively ignored.
The QAPI uses uint32 so the value will be non-negative, but still fit
into a uint64_t.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/block-copy.c | 17 +++++++++++++----
block/copy-before-write.c | 3 ++-
include/block/block-copy.h | 1 +
qapi/block-core.json | 8 +++++++-
4 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/block/block-copy.c b/block/block-copy.c
index cc618e4561..12d662e9d4 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -310,6 +310,7 @@ void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
}
static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
+ int64_t min_cluster_size,
Error **errp)
{
int ret;
@@ -335,7 +336,7 @@ static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
"used. If the actual block size of the target exceeds "
"this default, the backup may be unusable",
BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
- return BLOCK_COPY_CLUSTER_SIZE_DEFAULT;
+ return MAX(min_cluster_size, BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
} else if (ret < 0 && !target_does_cow) {
error_setg_errno(errp, -ret,
"Couldn't determine the cluster size of the target image, "
@@ -345,16 +346,18 @@ static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
return ret;
} else if (ret < 0 && target_does_cow) {
/* Not fatal; just trudge on ahead. */
- return BLOCK_COPY_CLUSTER_SIZE_DEFAULT;
+ return MAX(min_cluster_size, BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
}
- return MAX(BLOCK_COPY_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
+ return MAX(min_cluster_size,
+ MAX(BLOCK_COPY_CLUSTER_SIZE_DEFAULT, bdi.cluster_size));
}
BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
BlockDriverState *copy_bitmap_bs,
const BdrvDirtyBitmap *bitmap,
bool discard_source,
+ int64_t min_cluster_size,
Error **errp)
{
ERRP_GUARD();
@@ -365,7 +368,13 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
GLOBAL_STATE_CODE();
- cluster_size = block_copy_calculate_cluster_size(target->bs, errp);
+ if (min_cluster_size && !is_power_of_2(min_cluster_size)) {
+ error_setg(errp, "min-cluster-size needs to be a power of 2");
+ return NULL;
+ }
+
+ cluster_size = block_copy_calculate_cluster_size(target->bs,
+ min_cluster_size, errp);
if (cluster_size < 0) {
return NULL;
}
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 28f6a096cd..ef4e666303 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -478,7 +478,8 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
s->discard_source = flags & BDRV_O_CBW_DISCARD_SOURCE;
s->bcs = block_copy_state_new(bs->file, s->target, bs, bitmap,
- flags & BDRV_O_CBW_DISCARD_SOURCE, errp);
+ flags & BDRV_O_CBW_DISCARD_SOURCE,
+ opts->min_cluster_size, errp);
if (!s->bcs) {
error_prepend(errp, "Cannot create block-copy-state: ");
return -EINVAL;
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index bdc703bacd..77857c6c68 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -28,6 +28,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
BlockDriverState *copy_bitmap_bs,
const BdrvDirtyBitmap *bitmap,
bool discard_source,
+ int64_t min_cluster_size,
Error **errp);
/* Function should be called prior any actual copy request */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 521a1914e8..171846deb1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4927,12 +4927,18 @@
# @on-cbw-error parameter will decide how this failure is handled.
# Default 0. (Since 7.1)
#
+# @min-cluster-size: Minimum size of blocks used by copy-before-write
+# operations. Has to be a power of 2. No effect if smaller than
+# the maximum of the target's cluster size and 64 KiB. Default 0.
+# (Since 8.1)
+#
# Since: 6.2
##
{ 'struct': 'BlockdevOptionsCbw',
'base': 'BlockdevOptionsGenericFormat',
'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap',
- '*on-cbw-error': 'OnCbwError', '*cbw-timeout': 'uint32' } }
+ '*on-cbw-error': 'OnCbwError', '*cbw-timeout': 'uint32',
+ '*min-cluster-size': 'uint32' } }
##
# @BlockdevOptions:

View File

@@ -0,0 +1,190 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 5 May 2023 13:39:53 +0200
Subject: [PATCH] migration: for snapshots, hold the BQL during setup callbacks
In spirit, this is a partial revert of commit 9b09503752 ("migration:
run setup callbacks out of big lock"), but only for the snapshot case.
For snapshots, the bdrv_writev_vmstate() function is used during setup
(in QIOChannelBlock backing the QEMUFile), but not holding the BQL
while calling it could lead to an assertion failure. To understand
how, first note the following:
1. Generated coroutine wrappers for block layer functions spawn the
coroutine and use AIO_WAIT_WHILE()/aio_poll() to wait for it.
2. If the host OS switches threads at an inconvenient time, it can
happen that a bottom half scheduled for the main thread's AioContext
is executed as part of a vCPU thread's aio_poll().
An example leading to the assertion failure is as follows:
main thread:
1. A snapshot-save QMP command gets issued.
2. snapshot_save_job_bh() is scheduled.
vCPU thread:
3. aio_poll() for the main thread's AioContext is called (e.g. when
the guest writes to a pflash device, as part of blk_pwrite which is a
generated coroutine wrapper).
4. snapshot_save_job_bh() is executed as part of aio_poll().
3. qemu_savevm_state() is called.
4. qemu_mutex_unlock_iothread() is called. Now
qemu_get_current_aio_context() returns 0x0.
5. bdrv_writev_vmstate() is executed during the usual savevm setup.
But this function is a generated coroutine wrapper, so it uses
AIO_WAIT_WHILE. There, the assertion
assert(qemu_get_current_aio_context() == qemu_get_aio_context());
will fail.
To fix it, ensure that the BQL is held during setup. To avoid changing
the behavior for migration too, introduce conditionals for the setup
callbacks that need the BQL and only take the lock if it's not already
held.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
include/migration/register.h | 2 +-
migration/block-dirty-bitmap.c | 15 ++++++++++++---
migration/block.c | 15 ++++++++++++---
migration/ram.c | 16 +++++++++++++---
migration/savevm.c | 2 --
5 files changed, 38 insertions(+), 12 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index 90914f32f5..c728fd9120 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -43,9 +43,9 @@ typedef struct SaveVMHandlers {
* by other locks.
*/
int (*save_live_iterate)(QEMUFile *f, void *opaque);
+ int (*save_setup)(QEMUFile *f, void *opaque);
/* This runs outside the iothread lock! */
- int (*save_setup)(QEMUFile *f, void *opaque);
/* Note for save_live_pending:
* must_precopy:
* - must be migrated in precopy or in stopped state
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 285dd1d148..f7ee5a74d9 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1219,10 +1219,17 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
{
DBMSaveState *s = &((DBMState *)opaque)->save;
SaveBitmapState *dbms = NULL;
+ bool release_lock = false;
- qemu_mutex_lock_iothread();
+ /* For snapshots, the BQL is held during setup. */
+ if (!qemu_mutex_iothread_locked()) {
+ qemu_mutex_lock_iothread();
+ release_lock = true;
+ }
if (init_dirty_bitmap_migration(s) < 0) {
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
return -1;
}
@@ -1230,7 +1237,9 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
send_bitmap_start(f, s, dbms);
}
qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
return 0;
}
diff --git a/migration/block.c b/migration/block.c
index 86c2256a2b..8423e0c9f9 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -725,21 +725,30 @@ static void block_migration_cleanup(void *opaque)
static int block_save_setup(QEMUFile *f, void *opaque)
{
int ret;
+ bool release_lock = false;
trace_migration_block_save("setup", block_mig_state.submitted,
block_mig_state.transferred);
- qemu_mutex_lock_iothread();
+ /* For snapshots, the BQL is held during setup. */
+ if (!qemu_mutex_iothread_locked()) {
+ qemu_mutex_lock_iothread();
+ release_lock = true;
+ }
ret = init_blk_migration(f);
if (ret < 0) {
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
return ret;
}
/* start track dirty blocks */
ret = set_dirty_tracking();
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
if (ret) {
return ret;
diff --git a/migration/ram.c b/migration/ram.c
index 9040d66e61..01532c9fc9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2895,8 +2895,16 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
static void ram_init_bitmaps(RAMState *rs)
{
- /* For memory_global_dirty_log_start below. */
- qemu_mutex_lock_iothread();
+ bool release_lock = false;
+
+ /*
+ * For memory_global_dirty_log_start below.
+ * For snapshots, the BQL is held during setup.
+ */
+ if (!qemu_mutex_iothread_locked()) {
+ qemu_mutex_lock_iothread();
+ release_lock = true;
+ }
qemu_mutex_lock_ramlist();
WITH_RCU_READ_LOCK_GUARD() {
@@ -2908,7 +2916,9 @@ static void ram_init_bitmaps(RAMState *rs)
}
}
qemu_mutex_unlock_ramlist();
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
/*
* After an eventual first bitmap sync, fixup the initial bitmap
diff --git a/migration/savevm.c b/migration/savevm.c
index a2cb8855e2..ea8b30a630 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1625,10 +1625,8 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
reset_vfio_bytes_transferred();
ms->to_dst_file = f;
- qemu_mutex_unlock_iothread();
qemu_savevm_state_header(f);
qemu_savevm_state_setup(f);
- qemu_mutex_lock_iothread();
while (qemu_file_get_error(f) == 0) {
if (qemu_savevm_state_iterate(f, false) > 0) {

View File

@@ -1,106 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 11 Apr 2024 11:29:27 +0200
Subject: [PATCH] backup: add minimum cluster size to performance options
Useful to make discard-source work in the context of backup fleecing
when the fleecing image has a larger granularity than the backup
target.
Backup/block-copy will use at least this granularity for copy operations
and in particular, discard requests to the backup source will too. If
the granularity is too small, they will just be aligned down in
cbw_co_pdiscard_snapshot() and thus effectively ignored.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/backup.c | 2 +-
block/copy-before-write.c | 2 ++
block/copy-before-write.h | 1 +
blockdev.c | 3 +++
qapi/block-core.json | 9 +++++++--
5 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 1963e47ab9..fe69723ada 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -434,7 +434,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
cbw = bdrv_cbw_append(bs, target, filter_node_name, discard_source,
- &bcs, errp);
+ perf->min_cluster_size, &bcs, errp);
if (!cbw) {
goto error;
}
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index ef4e666303..adb27649a8 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -547,6 +547,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockDriverState *target,
const char *filter_node_name,
bool discard_source,
+ int64_t min_cluster_size,
BlockCopyState **bcs,
Error **errp)
{
@@ -565,6 +566,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
}
qdict_put_str(opts, "file", bdrv_get_node_name(source));
qdict_put_str(opts, "target", bdrv_get_node_name(target));
+ qdict_put_int(opts, "min-cluster-size", min_cluster_size);
top = bdrv_insert_node(source, opts, flags, errp);
if (!top) {
diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index 01af0cd3c4..dc6cafe7fa 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -40,6 +40,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockDriverState *target,
const char *filter_node_name,
bool discard_source,
+ int64_t min_cluster_size,
BlockCopyState **bcs,
Error **errp);
void bdrv_cbw_drop(BlockDriverState *bs);
diff --git a/blockdev.c b/blockdev.c
index 8080c47fa6..3f67eb413d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2656,6 +2656,9 @@ static BlockJob *do_backup_common(BackupCommon *backup,
if (backup->x_perf->has_max_chunk) {
perf.max_chunk = backup->x_perf->max_chunk;
}
+ if (backup->x_perf->has_min_cluster_size) {
+ perf.min_cluster_size = backup->x_perf->min_cluster_size;
+ }
}
if ((backup->sync == MIRROR_SYNC_MODE_BITMAP) ||
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 171846deb1..653df22046 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1790,11 +1790,16 @@
# it should not be less than job cluster size which is calculated
# as maximum of target image cluster size and 64k. Default 0.
#
+# @min-cluster-size: Minimum size of blocks used by copy-before-write
+# and background copy operations. Has to be a power of 2. No
+# effect if smaller than the maximum of the target's cluster size
+# and 64 KiB. Default 0. (Since 8.1)
+#
# Since: 6.0
##
{ 'struct': 'BackupPerf',
- 'data': { '*use-copy-range': 'bool',
- '*max-workers': 'int', '*max-chunk': 'int64' } }
+ 'data': { '*use-copy-range': 'bool', '*max-workers': 'int',
+ '*max-chunk': 'int64', '*min-cluster-size': 'uint32' } }
##
# @BackupCommon:

View File

@@ -0,0 +1,29 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 5 May 2023 15:30:16 +0200
Subject: [PATCH] savevm-async: don't hold BQL during setup
See commit "migration: for snapshots, hold the BQL during setup
callbacks" for why. This is separate, because a version of that one
will hopefully land upstream.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/savevm-async.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 80624fada8..b1d85a4b41 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -401,10 +401,8 @@ void qmp_savevm_start(const char *statefile, Error **errp)
snap_state.state = SAVE_STATE_ACTIVE;
snap_state.finalize_bh = qemu_bh_new(process_savevm_finalize, &snap_state);
snap_state.co = qemu_coroutine_create(&process_savevm_co, NULL);
- qemu_mutex_unlock_iothread();
qemu_savevm_state_header(snap_state.file);
qemu_savevm_state_setup(snap_state.file);
- qemu_mutex_lock_iothread();
/* Async processing from here on out happens in iohandler context, so let
* the target bdrv have its home there.

View File

@@ -1,337 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 11 Apr 2024 11:29:28 +0200
Subject: [PATCH] PVE backup: add fleecing option
When a fleecing option is given, it is expected that each device has
a corresponding "-fleecing" block device already attached, except for
EFI disk and TPM state, where fleecing is never used.
The following graph was adapted from [0] which also contains more
details about fleecing.
[guest]
|
| root
v file
[copy-before-write]<------[snapshot-access]
| |
| file | target
v v
[source] [fleecing]
For fleecing, a copy-before-write filter is inserted on top of the
source node, as well as a snapshot-access node pointing to the filter
node which allows to read the consistent state of the image at the
time it was inserted. New guest writes are passed through the
copy-before-write filter which will first copy over old data to the
fleecing image in case that old data is still needed by the
snapshot-access node.
The backup process will sequentially read from the snapshot access,
which has a bitmap and knows whether to read from the original image
or the fleecing image to get the "snapshot" state, i.e. data from the
source image at the time when the copy-before-write filter was
inserted. After reading, the copied sections are discarded from the
fleecing image to reduce space usage.
All of this can be restricted by an initial dirty bitmap to parts of
the source image that are required for an incremental backup.
For discard to work, it is necessary that the fleecing image does not
have a larger cluster size than the backup job granularity. Since
querying that size does not always work, e.g. for RBD with krbd, the
cluster size will not be reported, a minimum of 4 MiB is used. A job
with PBS target already has at least this granularity, so it's just
relevant for other targets. I.e. edge cases where this minimum is not
enough should be very rare in practice. If ever necessary in the
future, can still add a passed-in value for the backup QMP command to
override.
Additionally, the cbw-timeout and on-cbw-error=break-snapshot options
are set when installing the copy-before-write filter and
snapshot-access. When an error or timeout occurs, the problematic (and
each further) snapshot operation will fail and thus cancel the backup
instead of breaking the guest write.
Note that job_id cannot be inferred from the snapshot-access bs because
it has no parent, so just pass the one from the original bs.
[0]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg876056.html
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 1 +
pve-backup.c | 135 ++++++++++++++++++++++++++++++++-
qapi/block-core.json | 10 ++-
3 files changed, 142 insertions(+), 4 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 439a7a14c8..d0e7771dcc 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1044,6 +1044,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
NULL, NULL,
devlist, qdict_haskey(qdict, "speed"), speed,
false, 0, // BackupPerf max-workers
+ false, false, // fleecing
&error);
hmp_handle_error(mon, error);
diff --git a/pve-backup.c b/pve-backup.c
index 57477f7f2a..0f098000dd 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -7,9 +7,11 @@
#include "sysemu/blockdev.h"
#include "block/block_int-global-state.h"
#include "block/blockjob.h"
+#include "block/copy-before-write.h"
#include "block/dirty-bitmap.h"
#include "block/graph-lock.h"
#include "qapi/qapi-commands-block.h"
+#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qerror.h"
#include "qemu/cutils.h"
@@ -80,8 +82,15 @@ static void pvebackup_init(void)
// initialize PVEBackupState at startup
opts_init(pvebackup_init);
+typedef struct PVEBackupFleecingInfo {
+ BlockDriverState *bs;
+ BlockDriverState *cbw;
+ BlockDriverState *snapshot_access;
+} PVEBackupFleecingInfo;
+
typedef struct PVEBackupDevInfo {
BlockDriverState *bs;
+ PVEBackupFleecingInfo fleecing;
size_t size;
uint64_t block_size;
uint8_t dev_id;
@@ -353,6 +362,22 @@ static void pvebackup_complete_cb(void *opaque, int ret)
PVEBackupDevInfo *di = opaque;
di->completed_ret = ret;
+ /*
+ * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
+ * won't be done as a coroutine anyways:
+ * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
+ * just spawn a BH calling bdrv_unref().
+ * - For cbw, draining would need to spawn a BH.
+ */
+ if (di->fleecing.snapshot_access) {
+ bdrv_unref(di->fleecing.snapshot_access);
+ di->fleecing.snapshot_access = NULL;
+ }
+ if (di->fleecing.cbw) {
+ bdrv_cbw_drop(di->fleecing.cbw);
+ di->fleecing.cbw = NULL;
+ }
+
/*
* Needs to happen outside of coroutine, because it takes the graph write lock.
*/
@@ -519,9 +544,77 @@ static void create_backup_jobs_bh(void *opaque) {
}
bdrv_drained_begin(di->bs);
+ BackupPerf perf = (BackupPerf){ .max_workers = backup_state.perf.max_workers };
+
+ BlockDriverState *source_bs = di->bs;
+ bool discard_source = false;
+ bdrv_graph_co_rdlock();
+ const char *job_id = bdrv_get_device_name(di->bs);
+ bdrv_graph_co_rdunlock();
+ if (di->fleecing.bs) {
+ QDict *cbw_opts = qdict_new();
+ qdict_put_str(cbw_opts, "driver", "copy-before-write");
+ qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
+
+ if (di->bitmap) {
+ /*
+ * Only guest writes to parts relevant for the backup need to be intercepted with
+ * old data being copied to the fleecing image.
+ */
+ qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
+ }
+ /*
+ * Fleecing storage is supposed to be fast and it's better to break backup than guest
+ * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
+ * abort a bit before that.
+ */
+ qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
+ qdict_put_int(cbw_opts, "cbw-timeout", 45);
+
+ di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
+
+ if (!di->fleecing.cbw) {
+ error_setg(errp, "appending cbw node for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ break;
+ }
+
+ QDict *snapshot_access_opts = qdict_new();
+ qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
+ qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
+
+ di->fleecing.snapshot_access =
+ bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
+ if (!di->fleecing.snapshot_access) {
+ error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ break;
+ }
+ source_bs = di->fleecing.snapshot_access;
+ discard_source = true;
+
+ /*
+ * bdrv_get_info() just retuns 0 (= doesn't matter) for RBD when using krbd. But discard
+ * on the fleecing image won't work if the backup job's granularity is less than the RBD
+ * object size (default 4 MiB), so it does matter. Always use at least 4 MiB. With a PBS
+ * target, the backup job granularity would already be at least this much.
+ */
+ perf.min_cluster_size = 4 * 1024 * 1024;
+ /*
+ * For discard to work, cluster size for the backup job must be at least the same as for
+ * the fleecing image.
+ */
+ BlockDriverInfo bdi;
+ if (bdrv_get_info(di->fleecing.bs, &bdi) >= 0) {
+ perf.min_cluster_size = MAX(perf.min_cluster_size, bdi.cluster_size);
+ }
+ }
+
BlockJob *job = backup_job_create(
- NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
- bitmap_mode, false, NULL, &backup_state.perf, BLOCKDEV_ON_ERROR_REPORT,
+ job_id, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ bitmap_mode, false, discard_source, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT,
BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn,
&local_err);
@@ -577,6 +670,14 @@ static void create_backup_jobs_bh(void *opaque) {
aio_co_enter(data->ctx, data->co);
}
+/*
+ * EFI disk and TPM state are small and it's just not worth setting up fleecing for them.
+ */
+static bool device_uses_fleecing(const char *device_id)
+{
+ return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14);
+}
+
/*
* Returns a list of device infos, which needs to be freed by the caller. In
* case of an error, errp will be set, but the returned value might still be a
@@ -584,6 +685,7 @@ static void create_backup_jobs_bh(void *opaque) {
*/
static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
const char *devlist,
+ bool fleecing,
Error **errp)
{
gchar **devs = NULL;
@@ -607,6 +709,31 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
}
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
di->bs = bs;
+
+ if (fleecing && device_uses_fleecing(*d)) {
+ g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
+ BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
+ if (!fleecing_blk) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", fleecing_devid);
+ goto err;
+ }
+ BlockDriverState *fleecing_bs = blk_bs(fleecing_blk);
+ if (!bdrv_co_is_inserted(fleecing_bs)) {
+ error_setg(errp, "Device '%s' has no medium", fleecing_devid);
+ goto err;
+ }
+ /*
+ * Fleecing image needs to be the same size to act as a cbw target.
+ */
+ if (bs->total_sectors != fleecing_bs->total_sectors) {
+ error_setg(errp, "Size mismatch for '%s' - sector count %ld != %ld",
+ fleecing_devid, fleecing_bs->total_sectors, bs->total_sectors);
+ goto err;
+ }
+ di->fleecing.bs = fleecing_bs;
+ }
+
di_list = g_list_append(di_list, di);
d++;
}
@@ -656,6 +783,7 @@ UuidInfo coroutine_fn *qmp_backup(
const char *devlist,
bool has_speed, int64_t speed,
bool has_max_workers, int64_t max_workers,
+ bool has_fleecing, bool fleecing,
Error **errp)
{
assert(qemu_in_coroutine());
@@ -684,7 +812,7 @@ UuidInfo coroutine_fn *qmp_backup(
format = has_format ? format : BACKUP_FORMAT_VMA;
bdrv_graph_co_rdlock();
- di_list = get_device_info(devlist, &local_err);
+ di_list = get_device_info(devlist, has_fleecing && fleecing, &local_err);
bdrv_graph_co_rdunlock();
if (local_err) {
error_propagate(errp, local_err);
@@ -1089,5 +1217,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->query_bitmap_info = true;
ret->pbs_masterkey = true;
ret->backup_max_workers = true;
+ ret->backup_fleecing = true;
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 653df22046..9f25c398ec 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -948,6 +948,10 @@
#
# @max-workers: see @BackupPerf for details. Default 16.
#
+# @fleecing: perform a backup with fleecing. For each device in @devlist, a
+# corresponing '-fleecing' device with the same size already needs to
+# be present.
+#
# Returns: the uuid of the backup job
#
##
@@ -968,7 +972,8 @@
'*firewall-file': 'str',
'*devlist': 'str',
'*speed': 'int',
- '*max-workers': 'int' },
+ '*max-workers': 'int',
+ '*fleecing': 'bool' },
'returns': 'UuidInfo', 'coroutine': true }
##
@@ -1014,6 +1019,8 @@
#
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
+# @backup-fleecing: Whether backup fleecing is supported or not.
+#
# @backup-max-workers: Whether the 'max-workers' @BackupPerf setting is
# supported or not.
#
@@ -1025,6 +1032,7 @@
'pbs-dirty-bitmap-migration': 'bool',
'pbs-masterkey': 'bool',
'pbs-library-version': 'str',
+ 'backup-fleecing': 'bool',
'backup-max-workers': 'bool' } }
##

View File

@@ -1,117 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Mon, 29 Apr 2024 14:43:58 +0200
Subject: [PATCH] PVE backup: improve error when copy-before-write fails for
fleecing
With fleecing, failure for copy-before-write does not fail the guest
write, but only sets the snapshot error that is associated to the
copy-before-write filter, making further requests to the snapshot
access fail with EACCES, which then also fails the job. But that error
code is not the root cause of why the backup failed, so bubble up the
original snapshot error instead.
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Friedrich Weber <f.weber@proxmox.com>
---
block/copy-before-write.c | 18 ++++++++++++------
block/copy-before-write.h | 1 +
pve-backup.c | 9 +++++++++
3 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index adb27649a8..a5bb4d14f6 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -27,6 +27,7 @@
#include "qapi/qmp/qjson.h"
#include "sysemu/block-backend.h"
+#include "qemu/atomic.h"
#include "qemu/cutils.h"
#include "qapi/error.h"
#include "block/block_int.h"
@@ -75,7 +76,8 @@ typedef struct BDRVCopyBeforeWriteState {
* @snapshot_error is normally zero. But on first copy-before-write failure
* when @on_cbw_error == ON_CBW_ERROR_BREAK_SNAPSHOT, @snapshot_error takes
* value of this error (<0). After that all in-flight and further
- * snapshot-API requests will fail with that error.
+ * snapshot-API requests will fail with that error. To be accessed with
+ * atomics.
*/
int snapshot_error;
} BDRVCopyBeforeWriteState;
@@ -115,7 +117,7 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
return 0;
}
- if (s->snapshot_error) {
+ if (qatomic_read(&s->snapshot_error)) {
return 0;
}
@@ -139,9 +141,7 @@ static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
WITH_QEMU_LOCK_GUARD(&s->lock) {
if (ret < 0) {
assert(s->on_cbw_error == ON_CBW_ERROR_BREAK_SNAPSHOT);
- if (!s->snapshot_error) {
- s->snapshot_error = ret;
- }
+ qatomic_cmpxchg(&s->snapshot_error, 0, ret);
} else {
bdrv_set_dirty_bitmap(s->done_bitmap, off, end - off);
}
@@ -215,7 +215,7 @@ cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset, int64_t bytes,
QEMU_LOCK_GUARD(&s->lock);
- if (s->snapshot_error) {
+ if (qatomic_read(&s->snapshot_error)) {
g_free(req);
return NULL;
}
@@ -586,6 +586,12 @@ void bdrv_cbw_drop(BlockDriverState *bs)
bdrv_unref(bs);
}
+int bdrv_cbw_snapshot_error(BlockDriverState *bs)
+{
+ BDRVCopyBeforeWriteState *s = bs->opaque;
+ return qatomic_read(&s->snapshot_error);
+}
+
static void cbw_init(void)
{
bdrv_register(&bdrv_cbw_filter);
diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index dc6cafe7fa..a27d2d7d9f 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -44,5 +44,6 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
BlockCopyState **bcs,
Error **errp);
void bdrv_cbw_drop(BlockDriverState *bs);
+int bdrv_cbw_snapshot_error(BlockDriverState *bs);
#endif /* COPY_BEFORE_WRITE_H */
diff --git a/pve-backup.c b/pve-backup.c
index 0f098000dd..75da1dc051 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -374,6 +374,15 @@ static void pvebackup_complete_cb(void *opaque, int ret)
di->fleecing.snapshot_access = NULL;
}
if (di->fleecing.cbw) {
+ /*
+ * With fleecing, failure for cbw does not fail the guest write, but only sets the snapshot
+ * error, making further requests to the snapshot fail with EACCES, which then also fail the
+ * job. But that code is not the root cause and just confusing, so update it.
+ */
+ int snapshot_error = bdrv_cbw_snapshot_error(di->fleecing.cbw);
+ if (di->completed_ret == -EACCES && snapshot_error) {
+ di->completed_ret = snapshot_error;
+ }
bdrv_cbw_drop(di->fleecing.cbw);
di->fleecing.cbw = NULL;
}

View File

@@ -1,103 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 7 Nov 2024 17:51:14 +0100
Subject: [PATCH] PVE backup: fixup error handling for fleecing
The drained section needs to be terminated before breaking out of the
loop in the error scenarios. Otherwise, guest IO on the drive would
become stuck.
If the job is created successfully, then the job completion callback
will clean up the snapshot access block nodes. In case failure
happened before the job is created, there was no cleanup for the
snapshot access block nodes yet. Add it.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 38 +++++++++++++++++++++++++-------------
1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 75da1dc051..167f0b5c3f 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -357,22 +357,23 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
-static void pvebackup_complete_cb(void *opaque, int ret)
+static void cleanup_snapshot_access(PVEBackupDevInfo *di)
{
- PVEBackupDevInfo *di = opaque;
- di->completed_ret = ret;
-
- /*
- * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
- * won't be done as a coroutine anyways:
- * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
- * just spawn a BH calling bdrv_unref().
- * - For cbw, draining would need to spawn a BH.
- */
if (di->fleecing.snapshot_access) {
bdrv_unref(di->fleecing.snapshot_access);
di->fleecing.snapshot_access = NULL;
}
+ if (di->fleecing.cbw) {
+ bdrv_cbw_drop(di->fleecing.cbw);
+ di->fleecing.cbw = NULL;
+ }
+}
+
+static void pvebackup_complete_cb(void *opaque, int ret)
+{
+ PVEBackupDevInfo *di = opaque;
+ di->completed_ret = ret;
+
if (di->fleecing.cbw) {
/*
* With fleecing, failure for cbw does not fail the guest write, but only sets the snapshot
@@ -383,10 +384,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
if (di->completed_ret == -EACCES && snapshot_error) {
di->completed_ret = snapshot_error;
}
- bdrv_cbw_drop(di->fleecing.cbw);
- di->fleecing.cbw = NULL;
}
+ /*
+ * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
+ * won't be done as a coroutine anyways:
+ * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
+ * just spawn a BH calling bdrv_unref().
+ * - For cbw, draining would need to spawn a BH.
+ */
+ cleanup_snapshot_access(di);
+
/*
* Needs to happen outside of coroutine, because it takes the graph write lock.
*/
@@ -587,6 +595,7 @@ static void create_backup_jobs_bh(void *opaque) {
if (!di->fleecing.cbw) {
error_setg(errp, "appending cbw node for fleecing failed: %s",
local_err ? error_get_pretty(local_err) : "unknown error");
+ bdrv_drained_end(di->bs);
break;
}
@@ -599,6 +608,8 @@ static void create_backup_jobs_bh(void *opaque) {
if (!di->fleecing.snapshot_access) {
error_setg(errp, "setting up snapshot access for fleecing failed: %s",
local_err ? error_get_pretty(local_err) : "unknown error");
+ cleanup_snapshot_access(di);
+ bdrv_drained_end(di->bs);
break;
}
source_bs = di->fleecing.snapshot_access;
@@ -637,6 +648,7 @@ static void create_backup_jobs_bh(void *opaque) {
}
if (!job || local_err) {
+ cleanup_snapshot_access(di);
error_setg(errp, "backup_job_create failed: %s",
local_err ? error_get_pretty(local_err) : "null");
break;

View File

@@ -1,135 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 7 Nov 2024 17:51:15 +0100
Subject: [PATCH] PVE backup: factor out setting up snapshot access for
fleecing
Avoids some line bloat in the create_backup_jobs_bh() function and is
in preparation for setting up the snapshot access independently of
fleecing, in particular that will be useful for providing access to
the snapshot via NBD.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 95 ++++++++++++++++++++++++++++++++--------------------
1 file changed, 58 insertions(+), 37 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 167f0b5c3f..f136d004c4 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -525,6 +525,62 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
+/*
+ * Setup a snapshot-access block node for a device with associated fleecing image.
+ */
+static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
+{
+ Error *local_err = NULL;
+
+ if (!di->fleecing.bs) {
+ error_setg(errp, "no associated fleecing image");
+ return -1;
+ }
+
+ QDict *cbw_opts = qdict_new();
+ qdict_put_str(cbw_opts, "driver", "copy-before-write");
+ qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
+
+ if (di->bitmap) {
+ /*
+ * Only guest writes to parts relevant for the backup need to be intercepted with
+ * old data being copied to the fleecing image.
+ */
+ qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
+ qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
+ }
+ /*
+ * Fleecing storage is supposed to be fast and it's better to break backup than guest
+ * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
+ * abort a bit before that.
+ */
+ qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
+ qdict_put_int(cbw_opts, "cbw-timeout", 45);
+
+ di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
+
+ if (!di->fleecing.cbw) {
+ error_setg(errp, "appending cbw node for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ return -1;
+ }
+
+ QDict *snapshot_access_opts = qdict_new();
+ qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
+ qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
+
+ di->fleecing.snapshot_access =
+ bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
+ if (!di->fleecing.snapshot_access) {
+ error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+ local_err ? error_get_pretty(local_err) : "unknown error");
+ return -1;
+ }
+
+ return 0;
+}
+
/*
* backup_job_create can *not* be run from a coroutine, so this can't either.
* The caller is responsible that backup_mutex is held nonetheless.
@@ -569,49 +625,14 @@ static void create_backup_jobs_bh(void *opaque) {
const char *job_id = bdrv_get_device_name(di->bs);
bdrv_graph_co_rdunlock();
if (di->fleecing.bs) {
- QDict *cbw_opts = qdict_new();
- qdict_put_str(cbw_opts, "driver", "copy-before-write");
- qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
- qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
-
- if (di->bitmap) {
- /*
- * Only guest writes to parts relevant for the backup need to be intercepted with
- * old data being copied to the fleecing image.
- */
- qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
- qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
- }
- /*
- * Fleecing storage is supposed to be fast and it's better to break backup than guest
- * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
- * abort a bit before that.
- */
- qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
- qdict_put_int(cbw_opts, "cbw-timeout", 45);
-
- di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
-
- if (!di->fleecing.cbw) {
- error_setg(errp, "appending cbw node for fleecing failed: %s",
- local_err ? error_get_pretty(local_err) : "unknown error");
- bdrv_drained_end(di->bs);
- break;
- }
-
- QDict *snapshot_access_opts = qdict_new();
- qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
- qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
-
- di->fleecing.snapshot_access =
- bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
- if (!di->fleecing.snapshot_access) {
+ if (setup_snapshot_access(di, &local_err) < 0) {
error_setg(errp, "setting up snapshot access for fleecing failed: %s",
local_err ? error_get_pretty(local_err) : "unknown error");
cleanup_snapshot_access(di);
bdrv_drained_end(di->bs);
break;
}
+
source_bs = di->fleecing.snapshot_access;
discard_source = true;

View File

@@ -1,135 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 7 Nov 2024 17:51:16 +0100
Subject: [PATCH] PVE backup: save device name in device info structure
The device name needs to be queried while holding the graph read lock
and since it doesn't change during the whole operation, just get it
once during setup and avoid the need to query it again in different
places.
Also in preparation to use it more often in error messages and for the
upcoming external backup access API.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 29 +++++++++++++++--------------
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index f136d004c4..8ccb281c8c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -94,6 +94,7 @@ typedef struct PVEBackupDevInfo {
size_t size;
uint64_t block_size;
uint8_t dev_id;
+ char* device_name;
int completed_ret; // INT_MAX if not completed
BdrvDirtyBitmap *bitmap;
BlockDriverState *target;
@@ -327,6 +328,8 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
}
di->bs = NULL;
+ g_free(di->device_name);
+ di->device_name = NULL;
assert(di->target == NULL);
@@ -621,9 +624,6 @@ static void create_backup_jobs_bh(void *opaque) {
BlockDriverState *source_bs = di->bs;
bool discard_source = false;
- bdrv_graph_co_rdlock();
- const char *job_id = bdrv_get_device_name(di->bs);
- bdrv_graph_co_rdunlock();
if (di->fleecing.bs) {
if (setup_snapshot_access(di, &local_err) < 0) {
error_setg(errp, "setting up snapshot access for fleecing failed: %s",
@@ -654,7 +654,7 @@ static void create_backup_jobs_bh(void *opaque) {
}
BlockJob *job = backup_job_create(
- job_id, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ di->device_name, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
bitmap_mode, false, discard_source, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT,
BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn,
&local_err);
@@ -751,6 +751,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
}
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
di->bs = bs;
+ di->device_name = g_strdup(bdrv_get_device_name(bs));
if (fleecing && device_uses_fleecing(*d)) {
g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
@@ -789,6 +790,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
di->bs = bs;
+ di->device_name = g_strdup(bdrv_get_device_name(bs));
di_list = g_list_append(di_list, di);
}
}
@@ -956,9 +958,6 @@ UuidInfo coroutine_fn *qmp_backup(
di->block_size = dump_cb_block_size;
- bdrv_graph_co_rdlock();
- const char *devname = bdrv_get_device_name(di->bs);
- bdrv_graph_co_rdunlock();
PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
size_t dirty = di->size;
@@ -973,7 +972,8 @@ UuidInfo coroutine_fn *qmp_backup(
}
action = PBS_BITMAP_ACTION_NEW;
} else {
- expect_only_dirty = proxmox_backup_check_incremental(pbs, devname, di->size) != 0;
+ expect_only_dirty =
+ proxmox_backup_check_incremental(pbs, di->device_name, di->size) != 0;
}
if (expect_only_dirty) {
@@ -997,7 +997,8 @@ UuidInfo coroutine_fn *qmp_backup(
}
}
- int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, errp);
+ int dev_id = proxmox_backup_co_register_image(pbs, di->device_name, di->size,
+ expect_only_dirty, errp);
if (dev_id < 0) {
goto err_mutex;
}
@@ -1009,7 +1010,7 @@ UuidInfo coroutine_fn *qmp_backup(
di->dev_id = dev_id;
PBSBitmapInfo *info = g_malloc(sizeof(*info));
- info->drive = g_strdup(devname);
+ info->drive = g_strdup(di->device_name);
info->action = action;
info->size = di->size;
info->dirty = dirty;
@@ -1034,10 +1035,7 @@ UuidInfo coroutine_fn *qmp_backup(
goto err_mutex;
}
- bdrv_graph_co_rdlock();
- const char *devname = bdrv_get_device_name(di->bs);
- bdrv_graph_co_rdunlock();
- di->dev_id = vma_writer_register_stream(vmaw, devname, di->size);
+ di->dev_id = vma_writer_register_stream(vmaw, di->device_name, di->size);
if (di->dev_id <= 0) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR,
"register_stream failed");
@@ -1148,6 +1146,9 @@ err:
bdrv_co_unref(di->target);
}
+ g_free(di->device_name);
+ di->device_name = NULL;
+
g_free(di);
}
g_list_free(di_list);

View File

@@ -1,25 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 7 Nov 2024 17:51:17 +0100
Subject: [PATCH] PVE backup: include device name in error when setting up
snapshot access fails
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/pve-backup.c b/pve-backup.c
index 8ccb281c8c..255465676c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -626,7 +626,8 @@ static void create_backup_jobs_bh(void *opaque) {
bool discard_source = false;
if (di->fleecing.bs) {
if (setup_snapshot_access(di, &local_err) < 0) {
- error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+ error_setg(errp, "%s - setting up snapshot access for fleecing failed: %s",
+ di->device_name,
local_err ? error_get_pretty(local_err) : "unknown error");
cleanup_snapshot_access(di);
bdrv_drained_end(di->bs);

34
debian/patches/series vendored
View File

@@ -1,9 +1,13 @@
extra/0001-monitor-qmp-fix-race-with-clients-disconnecting-earl.patch
extra/0002-scsi-megasas-Internal-cdbs-have-16-byte-length.patch
extra/0003-ide-avoid-potential-deadlock-when-draining-during-tr.patch
extra/0004-Revert-x86-acpi-workaround-Windows-not-handling-name.patch
extra/0005-virtio-net-Add-queues-before-loading-them.patch
extra/0006-virtio-net-Fix-size-check-in-dhclient-workaround.patch
extra/0004-migration-block-dirty-bitmap-fix-loading-bitmap-when.patch
extra/0005-hw-ide-reset-cancel-async-DMA-operation-before-reset.patch
extra/0006-Revert-Revert-graph-lock-Disable-locking-for-now.patch
extra/0007-migration-states-workaround-snapshot-performance-reg.patch
extra/0008-Revert-x86-acpi-workaround-Windows-not-handling-name.patch
extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
extra/0010-ui-vnc-clipboard-fix-inflate_buffer.patch
bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch
@@ -47,18 +51,12 @@ pve/0034-PVE-Migrate-dirty-bitmap-state-via-savevm.patch
pve/0035-migration-block-dirty-bitmap-migrate-other-bitmaps-e.patch
pve/0036-PVE-fall-back-to-open-iscsi-initiatorname.patch
pve/0037-PVE-block-stream-increase-chunk-size.patch
pve/0038-block-add-alloc-track-driver.patch
pve/0039-Revert-block-rbd-workaround-for-ceph-issue-53784.patch
pve/0040-Revert-block-rbd-fix-handling-of-holes-in-.bdrv_co_b.patch
pve/0041-Revert-block-rbd-implement-bdrv_co_block_status.patch
pve/0042-alloc-track-error-out-when-auto-remove-is-not-set.patch
pve/0043-alloc-track-avoid-seemingly-superfluous-child-permis.patch
pve/0044-copy-before-write-allow-specifying-minimum-cluster-s.patch
pve/0045-backup-add-minimum-cluster-size-to-performance-optio.patch
pve/0046-PVE-backup-add-fleecing-option.patch
pve/0047-PVE-backup-improve-error-when-copy-before-write-fail.patch
pve/0048-PVE-backup-fixup-error-handling-for-fleecing.patch
pve/0049-PVE-backup-factor-out-setting-up-snapshot-access-for.patch
pve/0050-PVE-backup-save-device-name-in-device-info-structure.patch
pve/0051-PVE-backup-include-device-name-in-error-when-setting.patch
pve-qemu-9.1-vitastor.patch
pve/0038-block-io-accept-NULL-qiov-in-bdrv_pad_request.patch
pve/0039-block-add-alloc-track-driver.patch
pve/0040-Revert-block-rbd-workaround-for-ceph-issue-53784.patch
pve/0041-Revert-block-rbd-fix-handling-of-holes-in-.bdrv_co_b.patch
pve/0042-Revert-block-rbd-implement-bdrv_co_block_status.patch
pve/0043-alloc-track-fix-deadlock-during-drop.patch
pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
pve/0045-savevm-async-don-t-hold-BQL-during-setup.patch
pve-qemu-8.1-vitastor.patch

View File

@@ -1,2 +1 @@
source-is-missing [roms/SLOF/*.oco]
source-is-missing [linux-user/*/vdso-*.so]

2
qemu

Submodule qemu updated: 508081a49b...78385bc738