Compare commits

..

8 Commits

Author SHA1 Message Date
95769aedb6 Add Vitastor support 2025-03-22 11:44:34 +03:00
Thomas Lamprecht
fa8486e62f bump version to 7.2.10-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-04-10 15:47:31 +02:00
Fiona Ebner
5d3fc48e11 pick up some extra fixes from upcoming 7.2.11
In particular, the i386 patches fix an issue that was newly introduced
in 7.2.10 and the LSI patches improve the reentrancy fix. The others
also sounded relevant and nice to have.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-10 15:43:57 +02:00
Fiona Ebner
a573598321 update patches and submodule to QEMU stable 7.2.10
Many stable fixes came in since the last bump, a few of which were
actually already present. Notable ones not yet present include a few
guest-triggerable assert fixes, some AHCI/IDE fixes (including the fix
for bug #2784), TGC fixes for i386 and ARM, VirtIO fixes, fix to avoid
VNC clipboard denial-of-service.

The reentrancy patches that landed upstream/stable were a newer
version than the ones backported initially here, so it was necessary
to explicitly drop them before rebase (which then picked up the
upstream version).

There were no other conflicts.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-10 15:43:57 +02:00
Thomas Lamprecht
8673d5e40a makefile: convert to use simple parenthesis
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 3c995a426d)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-29 13:58:31 +02:00
Thomas Lamprecht
cfe581f192 d/lintian-overrides: ignore groff line breakage/adjustment warnings
not much we can do here anyway..

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit be7ce325c7)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-29 13:58:31 +02:00
Thomas Lamprecht
555c2585ca d/lintian-overrides: sort
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 19b4b4c50f)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-29 13:58:31 +02:00
Thomas Lamprecht
899a49e30b d/parse-machines: produce stable json output
Enabling the "canonical" option the keys will be sorted, improving
build reproducibility.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 590adba81a)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-29 13:58:31 +02:00
106 changed files with 8079 additions and 2506 deletions

View File

@@ -10,7 +10,7 @@ GITVERSION := $(shell git rev-parse HEAD)
DSC=$(PACKAGE)_$(DEB_VERSION_UPSTREAM_REVISION).dsc
DEB = $(PACKAGE)_$(DEB_VERSION_UPSTREAM_REVISION)_$(DEB_BUILD_ARCH).deb
DEB_DBG = $(PACKAGE)-dbgsym_$(DEB_VERSION_UPSTREAM_REVISION)_$(DEB_BUILD_ARCH).deb
DEB_DBG = $(PACKAGE)-dbg_$(DEB_VERSION_UPSTREAM_REVISION)_$(DEB_BUILD_ARCH).deb
DEBS = $(DEB) $(DEB_DBG)
all: $(DEBS)
@@ -44,7 +44,6 @@ $(BUILDDIR): keycodemapdb | submodule
cp -a $(SRCDIR) $@.tmp
cp -a debian $@.tmp/debian
rm -rf $@.tmp/ui/keycodemapdb
rm -rf $@.tmp/roms/edk2 # packaged separately
cp -a keycodemapdb $@.tmp/ui/
find $@.tmp/pc-bios -type f | grep $(BLOB_PURGE_FILTER) | xargs rm -f
sed -i $(BLOB_PURGE_SED_CMDS) $@.tmp/pc-bios/meson.build
@@ -55,7 +54,7 @@ $(BUILDDIR): keycodemapdb | submodule
deb kvm: $(DEBS)
$(DEB_DBG): $(DEB)
$(DEB): $(BUILDDIR)
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j32
lintian $(DEBS)
sbuild: $(DSC)

74
debian/changelog vendored
View File

@@ -1,75 +1,17 @@
pve-qemu-kvm (8.0.2-7) bookworm; urgency=medium
pve-qemu-kvm (7.2.10-1+vitastor1) bullseye; urgency=medium
* fix #2874: SATA: avoid unsolicited write to sector 0 during reset
* Add Vitastor support
-- Proxmox Support Team <support@proxmox.com> Wed, 04 Oct 2023 08:33:35 +0200
-- Vitaliy Filippov <vitalif@yourcmc.ru> Sat, 22 Mar 2025 11:41:12 +0300
pve-qemu-kvm (8.0.2-6) bookworm; urgency=medium
pve-qemu-kvm (7.2.10-1) bullseye; urgency=medium
* fix #1534: vma: add extract-filter for disk images allowing users to pass
a comma separated list of the disks they want to extract from an archive.
* update patches and submodule to QEMU stable 7.2.10
* backup: create jobs in a drained section to avoid subtle bugs where
something interferes with the block-copy-state bitmap on initialization
* pick up some extra fixes from upcoming 7.2.11 to, e.g., avoid releasing a
regression for i386 that happened in the 7.2.10 stable release.
* backup: drop experimental, and since a while also fully broken, directory
backup format (BACKUP_FORMAT_DIR). This format was never exposed via the
Proxmox VE API, but only available via QMP, as its broken since QEMU 8 and
we got zero reports about that, it's safe to assume that there are no
public users, so just remove it completely.
-- Proxmox Support Team <support@proxmox.com> Wed, 06 Sep 2023 17:03:59 +0200
pve-qemu-kvm (8.0.2-5) bookworm; urgency=medium
* improve memory footprint after backup by not keeping as much memory
resident.
* fix file descriptor leak for vhost (used by default by vNICs).
-- Proxmox Support Team <support@proxmox.com> Wed, 16 Aug 2023 11:52:24 +0200
pve-qemu-kvm (8.0.2-4) bookworm; urgency=medium
* fix resume for snapshot and hibernate in combination with iothread and
dirty bitmap
-- Proxmox Support Team <support@proxmox.com> Fri, 28 Jul 2023 12:58:22 +0200
pve-qemu-kvm (8.0.2-3) bookworm; urgency=medium
* fix regression in QEMU 8.0 for drive mirror with bitmap
-- Proxmox Support Team <support@proxmox.com> Thu, 15 Jun 2023 13:57:46 +0200
pve-qemu-kvm (8.0.2-2) bookworm; urgency=medium
* drop custom get_link_status QMP command, was never really used.
* drop custom & deprecated drive snapshot QMP commands, we use a better
alternative since a while.
-- Proxmox Support Team <support@proxmox.com> Fri, 09 Jun 2023 07:57:56 +0200
pve-qemu-kvm (8.0.2-1) bookworm; urgency=medium
* update to QEMU stable release 8.0.2
* update patches for avoiding issues with DMA reentrancy to current,
slightly optimized version.
-- Proxmox Support Team <support@proxmox.com> Tue, 06 Jun 2023 16:34:50 +0200
pve-qemu-kvm (8.0.0-1) bookworm; urgency=medium
* update to QEMU stable release 8.0.0
* re-build for Proxmox VE 8 / Debian 12 Bookworm
* adapt to the local virtiofsd C variant being dropped, it has been
rewritten in Rust and is now hosted in a separate source repository.
-- Proxmox Support Team <support@proxmox.com> Mon, 22 May 2023 13:45:49 +0200
-- Proxmox Support Team <support@proxmox.com> Wed, 10 Apr 2024 15:47:27 +0200
pve-qemu-kvm (7.2.0-8) bullseye; urgency=medium

1
debian/compat vendored Normal file
View File

@@ -0,0 +1 @@
10

12
debian/control vendored
View File

@@ -2,8 +2,9 @@ Source: pve-qemu-kvm
Section: admin
Priority: optional
Maintainer: Proxmox Support Team <support@proxmox.com>
Build-Depends: debhelper-compat (= 13),
Build-Depends: autotools-dev,
check,
debhelper (>= 9),
libacl1-dev,
libaio-dev,
libattr1-dev,
@@ -38,6 +39,8 @@ Build-Depends: debhelper-compat (= 13),
python3-sphinx,
python3-sphinx-rtd-theme,
quilt,
texi2html,
texinfo,
uuid-dev,
xfslibs-dev,
Standards-Version: 3.7.2
@@ -83,3 +86,10 @@ Description: Full virtualization on x86 hardware
Using KVM, one can run multiple virtual PCs, each running unmodified Linux or
Windows images. Each virtual machine has private virtualized hardware: a
network card, disk, graphics adapter, etc.
Package: pve-qemu-kvm-dbg
Architecture: any
Section: debug
Depends: pve-qemu-kvm (= ${binary:Version}),
Description: pve qemu debugging symbols
This package contains the debugging symbols for pve-qemu-kvm.

View File

@@ -27,18 +27,16 @@ Signed-off-by: Ma Haocong <mahaocong@didichuxing.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: rebased for 8.0]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/mirror.c | 98 +++++++++++++++++++++-----
blockdev.c | 38 +++++++++-
blockdev.c | 39 +++++++++-
include/block/block_int-global-state.h | 4 +-
qapi/block-core.json | 29 ++++++--
tests/unit/test-block-iothread.c | 4 +-
5 files changed, 144 insertions(+), 29 deletions(-)
5 files changed, 145 insertions(+), 29 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 663e2b7002..9099c75992 100644
index 251adc5ae0..8ead5f77a0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -51,7 +51,7 @@ typedef struct MirrorBlockJob {
@@ -59,7 +57,7 @@ index 663e2b7002..9099c75992 100644
BdrvDirtyBitmap *dirty_bitmap;
BdrvDirtyBitmapIter *dbi;
uint8_t *buf;
@@ -703,7 +705,8 @@ static int mirror_exit_common(Job *job)
@@ -699,7 +701,8 @@ static int mirror_exit_common(Job *job)
bdrv_child_refresh_perms(mirror_top_bs, mirror_top_bs->backing,
&error_abort);
if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
@@ -69,7 +67,7 @@ index 663e2b7002..9099c75992 100644
BlockDriverState *unfiltered_target = bdrv_skip_filters(target_bs);
if (bdrv_cow_bs(unfiltered_target) != backing) {
@@ -801,6 +804,16 @@ static void mirror_abort(Job *job)
@@ -797,6 +800,16 @@ static void mirror_abort(Job *job)
assert(ret == 0);
}
@@ -86,7 +84,7 @@ index 663e2b7002..9099c75992 100644
static void coroutine_fn mirror_throttle(MirrorBlockJob *s)
{
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -987,7 +1000,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
@@ -977,7 +990,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
mirror_free_init(s);
s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -96,7 +94,7 @@ index 663e2b7002..9099c75992 100644
ret = mirror_dirty_init(s);
if (ret < 0 || job_is_cancelled(&s->common.job)) {
goto immediate_exit;
@@ -1240,6 +1254,7 @@ static const BlockJobDriver mirror_job_driver = {
@@ -1224,6 +1238,7 @@ static const BlockJobDriver mirror_job_driver = {
.run = mirror_run,
.prepare = mirror_prepare,
.abort = mirror_abort,
@@ -104,7 +102,7 @@ index 663e2b7002..9099c75992 100644
.pause = mirror_pause,
.complete = mirror_complete,
.cancel = mirror_cancel,
@@ -1256,6 +1271,7 @@ static const BlockJobDriver commit_active_job_driver = {
@@ -1240,6 +1255,7 @@ static const BlockJobDriver commit_active_job_driver = {
.run = mirror_run,
.prepare = mirror_prepare,
.abort = mirror_abort,
@@ -112,7 +110,7 @@ index 663e2b7002..9099c75992 100644
.pause = mirror_pause,
.complete = mirror_complete,
.cancel = commit_active_cancel,
@@ -1647,7 +1663,10 @@ static BlockJob *mirror_start_job(
@@ -1627,7 +1643,10 @@ static BlockJob *mirror_start_job(
BlockCompletionFunc *cb,
void *opaque,
const BlockJobDriver *driver,
@@ -124,7 +122,7 @@ index 663e2b7002..9099c75992 100644
bool auto_complete, const char *filter_node_name,
bool is_mirror, MirrorCopyMode copy_mode,
Error **errp)
@@ -1659,10 +1678,39 @@ static BlockJob *mirror_start_job(
@@ -1639,10 +1658,39 @@ static BlockJob *mirror_start_job(
uint64_t target_perms, target_shared_perms;
int ret;
@@ -166,7 +164,7 @@ index 663e2b7002..9099c75992 100644
assert(is_power_of_2(granularity));
if (buf_size < 0) {
@@ -1793,7 +1841,9 @@ static BlockJob *mirror_start_job(
@@ -1774,7 +1822,9 @@ static BlockJob *mirror_start_job(
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
@@ -177,7 +175,7 @@ index 663e2b7002..9099c75992 100644
s->backing_mode = backing_mode;
s->zero_target = zero_target;
s->copy_mode = copy_mode;
@@ -1814,6 +1864,18 @@ static BlockJob *mirror_start_job(
@@ -1795,6 +1845,18 @@ static BlockJob *mirror_start_job(
bdrv_disable_dirty_bitmap(s->dirty_bitmap);
}
@@ -196,7 +194,7 @@ index 663e2b7002..9099c75992 100644
ret = block_job_add_bdrv(&s->common, "source", bs, 0,
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
BLK_PERM_CONSISTENT_READ,
@@ -1891,6 +1953,9 @@ fail:
@@ -1872,6 +1934,9 @@ fail:
if (s->dirty_bitmap) {
bdrv_release_dirty_bitmap(s->dirty_bitmap);
}
@@ -206,7 +204,7 @@ index 663e2b7002..9099c75992 100644
job_early_fail(&s->common.job);
}
@@ -1908,31 +1973,25 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
@@ -1889,31 +1954,25 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, const char *replaces,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
@@ -243,7 +241,7 @@ index 663e2b7002..9099c75992 100644
}
BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -1959,7 +2018,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -1940,7 +1999,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_LEAVE_BACKING_CHAIN, false,
on_error, on_error, true, cb, opaque,
@@ -254,20 +252,21 @@ index 663e2b7002..9099c75992 100644
errp);
if (!job) {
diff --git a/blockdev.c b/blockdev.c
index e464daea58..50e4a9c682 100644
index ae27a41efa..a0c7e0c13b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2942,6 +2942,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2956,6 +2956,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
BlockDriverState *target,
const char *replaces,
bool has_replaces, const char *replaces,
enum MirrorSyncMode sync,
+ bool has_bitmap,
+ const char *bitmap_name,
+ bool has_bitmap_mode,
+ BitmapSyncMode bitmap_mode,
BlockMirrorBackingMode backing_mode,
bool zero_target,
bool has_speed, int64_t speed,
@@ -2960,6 +2963,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2975,6 +2979,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
{
BlockDriverState *unfiltered_bs;
int job_flags = JOB_DEFAULT;
@@ -275,11 +274,11 @@ index e464daea58..50e4a9c682 100644
if (!has_speed) {
speed = 0;
@@ -3011,6 +3015,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3029,6 +3034,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}
+ if (bitmap_name) {
+ if (has_bitmap) {
+ if (granularity) {
+ error_setg(errp, "Granularity and bitmap cannot both be set");
+ return;
@@ -302,53 +301,53 @@ index e464daea58..50e4a9c682 100644
+ }
+ }
+
if (!replaces) {
if (!has_replaces) {
/* We want to mirror from @bs, but keep implicit filters on top */
unfiltered_bs = bdrv_skip_implicit_filters(bs);
@@ -3056,8 +3083,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3075,8 +3103,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
* and will allow to check whether the node still exist at mirror completion
*/
mirror_start(job_id, bs, target,
- replaces, job_flags,
- has_replaces ? replaces : NULL, job_flags,
- speed, granularity, buf_size, sync, backing_mode, zero_target,
+ replaces, job_flags, speed, granularity, buf_size, sync,
+ bitmap, bitmap_mode, backing_mode, zero_target,
+ has_replaces ? replaces : NULL, job_flags, speed, granularity,
+ buf_size, sync, bitmap, bitmap_mode, backing_mode, zero_target,
on_source_error, on_target_error, unmap, filter_node_name,
copy_mode, errp);
}
@@ -3202,6 +3229,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
@@ -3221,6 +3249,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
blockdev_mirror_common(arg->job_id, bs, target_bs,
arg->replaces, arg->sync,
+ arg->bitmap,
blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
arg->has_replaces, arg->replaces, arg->sync,
+ arg->has_bitmap, arg->bitmap,
+ arg->has_bitmap_mode, arg->bitmap_mode,
backing_mode, zero_target,
arg->has_speed, arg->speed,
arg->has_granularity, arg->granularity,
@@ -3223,6 +3252,8 @@ void qmp_blockdev_mirror(const char *job_id,
@@ -3242,6 +3272,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
const char *device, const char *target,
const char *replaces,
bool has_replaces, const char *replaces,
MirrorSyncMode sync,
+ const char *bitmap,
+ bool has_bitmap, const char *bitmap,
+ bool has_bitmap_mode, BitmapSyncMode bitmap_mode,
bool has_speed, int64_t speed,
bool has_granularity, uint32_t granularity,
bool has_buf_size, int64_t buf_size,
@@ -3271,7 +3302,8 @@ void qmp_blockdev_mirror(const char *job_id,
@@ -3291,7 +3323,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
}
blockdev_mirror_common(job_id, bs, target_bs,
- replaces, sync, backing_mode,
+ replaces, sync,
blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs,
- has_replaces, replaces, sync, backing_mode,
+ has_replaces, replaces, sync, has_bitmap,
+ bitmap, has_bitmap_mode, bitmap_mode, backing_mode,
zero_target, has_speed, speed,
has_granularity, granularity,
has_buf_size, buf_size,
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index 902406eb99..d559be928c 100644
index b49f4eb35b..9d744db618 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -152,7 +152,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
@@ -149,7 +149,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, const char *replaces,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
@@ -360,10 +359,10 @@ index 902406eb99..d559be928c 100644
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index c05ad0c07e..3c945c1f93 100644
index 95ac4fa634..7daaf545be 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2095,10 +2095,19 @@
@@ -2000,10 +2000,19 @@
# (all the disk, only the sectors allocated in the topmost image, or
# only new I/O).
#
@@ -384,7 +383,7 @@ index c05ad0c07e..3c945c1f93 100644
#
# @buf-size: maximum amount of data in flight from source to
# target (since 1.4).
@@ -2138,7 +2147,9 @@
@@ -2043,7 +2052,9 @@
{ 'struct': 'DriveMirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*format': 'str', '*node-name': 'str', '*replaces': 'str',
@@ -395,7 +394,7 @@ index c05ad0c07e..3c945c1f93 100644
'*speed': 'int', '*granularity': 'uint32',
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
@@ -2417,10 +2428,19 @@
@@ -2322,10 +2333,19 @@
# (all the disk, only the sectors allocated in the topmost image, or
# only new I/O).
#
@@ -416,7 +415,7 @@ index c05ad0c07e..3c945c1f93 100644
#
# @buf-size: maximum amount of data in flight from source to
# target
@@ -2470,7 +2490,8 @@
@@ -2375,7 +2395,8 @@
{ 'command': 'blockdev-mirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*replaces': 'str',
@@ -427,10 +426,10 @@ index c05ad0c07e..3c945c1f93 100644
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index 3a5e1eb2c4..c1ecc49073 100644
index 8ca5adec5e..dae80e5a5f 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -757,8 +757,8 @@ static void test_propagate_mirror(void)
@@ -755,8 +755,8 @@ static void test_propagate_mirror(void)
/* Start a mirror job */
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,

View File

@@ -24,10 +24,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 9099c75992..e2ff42067b 100644
index 8ead5f77a0..35c1b8f25d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -680,8 +680,6 @@ static int mirror_exit_common(Job *job)
@@ -676,8 +676,6 @@ static int mirror_exit_common(Job *job)
bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
}
@@ -36,7 +36,7 @@ index 9099c75992..e2ff42067b 100644
/* Make sure that the source BDS doesn't go away during bdrv_replace_node,
* before we can call bdrv_drained_end */
bdrv_ref(src);
@@ -782,6 +780,18 @@ static int mirror_exit_common(Job *job)
@@ -778,6 +776,18 @@ static int mirror_exit_common(Job *job)
block_job_remove_all_bdrv(bjob);
bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
@@ -55,7 +55,7 @@ index 9099c75992..e2ff42067b 100644
bs_opaque->job = NULL;
bdrv_drained_end(src);
@@ -1688,10 +1698,6 @@ static BlockJob *mirror_start_job(
@@ -1668,10 +1678,6 @@ static BlockJob *mirror_start_job(
" sync mode",
MirrorSyncMode_str(sync_mode));
return NULL;
@@ -66,7 +66,7 @@ index 9099c75992..e2ff42067b 100644
}
} else if (bitmap) {
error_setg(errp,
@@ -1708,6 +1714,12 @@ static BlockJob *mirror_start_job(
@@ -1688,6 +1694,12 @@ static BlockJob *mirror_start_job(
return NULL;
}
granularity = bdrv_dirty_bitmap_granularity(bitmap);

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 3 insertions(+)
diff --git a/blockdev.c b/blockdev.c
index 50e4a9c682..e6b2b1e338 100644
index a0c7e0c13b..98b9dff154 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3036,6 +3036,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3055,6 +3055,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_ALLOW_RO, errp)) {
return;
}
@@ -28,4 +28,4 @@ index 50e4a9c682..e6b2b1e338 100644
+ return;
}
if (!replaces) {
if (!has_replaces) {

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index e2ff42067b..f42953837b 100644
index 35c1b8f25d..4969c6833c 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -786,8 +786,8 @@ static int mirror_exit_common(Job *job)
@@ -782,8 +782,8 @@ static int mirror_exit_common(Job *job)
job->ret == 0 && ret == 0)) {
/* Success; synchronize copy back to sync. */
bdrv_clear_dirty_bitmap(s->sync_bitmap, NULL);
@@ -30,7 +30,7 @@ index e2ff42067b..f42953837b 100644
}
}
bdrv_release_dirty_bitmap(s->dirty_bitmap);
@@ -1881,11 +1881,8 @@ static BlockJob *mirror_start_job(
@@ -1862,11 +1862,8 @@ static BlockJob *mirror_start_job(
}
if (s->sync_mode == MIRROR_SYNC_MODE_BITMAP) {

View File

@@ -12,8 +12,6 @@ uniform w.r.t. backup block jobs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: rebase for 8.0]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/mirror.c | 28 +++------------
blockdev.c | 29 +++++++++++++++
@@ -21,10 +19,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
3 files changed, 70 insertions(+), 59 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index f42953837b..8f79efaa87 100644
index 4969c6833c..cf85ae1074 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1688,31 +1688,13 @@ static BlockJob *mirror_start_job(
@@ -1668,31 +1668,13 @@ static BlockJob *mirror_start_job(
uint64_t target_perms, target_shared_perms;
int ret;
@@ -62,17 +60,17 @@ index f42953837b..8f79efaa87 100644
if (bitmap_mode != BITMAP_SYNC_MODE_NEVER) {
diff --git a/blockdev.c b/blockdev.c
index e6b2b1e338..bdae211a54 100644
index 98b9dff154..5b15a86bfa 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3015,7 +3015,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3034,7 +3034,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}
+ if ((sync == MIRROR_SYNC_MODE_BITMAP) ||
+ (sync == MIRROR_SYNC_MODE_INCREMENTAL)) {
+ /* done before desugaring 'incremental' to print the right message */
+ if (!bitmap_name) {
+ if (!has_bitmap) {
+ error_setg(errp, "Must provide a valid bitmap name for "
+ "'%s' sync mode", MirrorSyncMode_str(sync));
+ return;
@@ -93,7 +91,7 @@ index e6b2b1e338..bdae211a54 100644
+ bitmap_mode = BITMAP_SYNC_MODE_ON_SUCCESS;
+ }
+
if (bitmap_name) {
if (has_bitmap) {
+ if (sync != MIRROR_SYNC_MODE_BITMAP) {
+ error_setg(errp, "Sync mode '%s' not supported with bitmap.",
+ MirrorSyncMode_str(sync));

View File

@@ -48,7 +48,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
6 files changed, 59 insertions(+), 5 deletions(-)
diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 033390f699..ad35d4fea8 100644
index 737e750670..38804b8595 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -16,6 +16,7 @@ extern QemuOptsList qemu_mon_opts;
@@ -60,7 +60,7 @@ index 033390f699..ad35d4fea8 100644
void monitor_init_globals(void);
void monitor_init_globals_core(void);
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index 53e3808054..a19f8cbc2b 100644
index a2cdbbf646..b531bd50e7 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -152,6 +152,13 @@ typedef struct {
@@ -78,7 +78,7 @@ index 53e3808054..a19f8cbc2b 100644
/**
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 8dc96f6af9..f3c38cb714 100644
index 86949024f6..c306cadcf4 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -135,6 +135,21 @@ bool monitor_cur_is_qmp(void)
@@ -104,7 +104,7 @@ index 8dc96f6af9..f3c38cb714 100644
* Is @mon is using readline?
* Note: not all HMP monitors use readline, e.g., gdbserver has a
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 092c527b6f..6b8cfcf6d8 100644
index acd0a350c2..cc1407e4ac 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -141,6 +141,8 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
@@ -135,7 +135,7 @@ index 092c527b6f..6b8cfcf6d8 100644
qobject_unref(rsp);
}
@@ -444,6 +456,7 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
@@ -427,6 +439,7 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
switch (event) {
case CHR_EVENT_OPENED:
@@ -144,7 +144,7 @@ index 092c527b6f..6b8cfcf6d8 100644
monitor_qmp_caps_reset(mon);
data = qmp_greeting(mon);
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 0990873ec8..e605003771 100644
index 5d000fae87..404d428824 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -117,16 +117,28 @@ typedef struct QmpDispatchBH {
@@ -180,13 +180,13 @@ index 0990873ec8..e605003771 100644
aio_co_wake(data->co);
}
@@ -231,6 +243,7 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
@@ -253,6 +265,7 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
.ret = &ret,
.errp = &err,
.co = qemu_coroutine_self(),
+ .conn_nr = monitor_get_connection_nr(cur_mon),
};
aio_bh_schedule_oneshot(qemu_get_aio_context(), do_qmp_dispatch_bh,
aio_bh_schedule_oneshot(iohandler_get_aio_context(), do_qmp_dispatch_bh,
&data);
diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
index afa477aae6..d3ff124bf3 100644

View File

@@ -0,0 +1,42 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 28 Oct 2022 10:09:46 +0200
Subject: [PATCH] init: daemonize: defuse PID file resolve error
When proxmox-file-restore invokes QEMU, the PID file is a (temporary)
file that's already unlinked, so resolving the absolute path here
failed.
It should not be a critical error when the PID file unlink handler
can't be registered, because the path can't be resolved for whatever
reason. If the file is already gone from QEMU's perspective (i.e.
errno is ENOENT), silently ignore the error. Otherwise, print a
warning.
Reported-by: Dominik Csapak <d.csapak@proxmox.com>
Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
softmmu/vl.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 38d76d6e51..7aa3eb5cf9 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2468,10 +2468,11 @@ static void qemu_maybe_daemonize(const char *pid_file)
pid_file_realpath = g_malloc0(PATH_MAX);
if (!realpath(pid_file, pid_file_realpath)) {
- error_report("cannot resolve PID file path: %s: %s",
- pid_file, strerror(errno));
- unlink(pid_file);
- exit(1);
+ if (errno != ENOENT) {
+ warn_report("not removing PID file on exit: cannot resolve PID "
+ "file path: %s: %s", pid_file, strerror(errno));
+ }
+ return;
}
qemu_unlink_pidfile_notifier = (struct UnlinkPidfileNotifier) {

View File

@@ -55,10 +55,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 45d14a25e9..08e1f0c3d7 100644
index 3e97d665d9..a0f6801bce 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -444,7 +444,7 @@ static void ide_trim_bh_cb(void *opaque)
@@ -443,7 +443,7 @@ static void ide_trim_bh_cb(void *opaque)
iocb->bh = NULL;
qemu_aio_unref(iocb);
@@ -67,7 +67,7 @@ index 45d14a25e9..08e1f0c3d7 100644
blk_dec_in_flight(blk);
}
@@ -504,6 +504,8 @@ static void ide_issue_trim_cb(void *opaque, int ret)
@@ -503,6 +503,8 @@ static void ide_issue_trim_cb(void *opaque, int ret)
done:
iocb->aiocb = NULL;
if (iocb->bh) {
@@ -77,7 +77,7 @@ index 45d14a25e9..08e1f0c3d7 100644
}
}
@@ -515,9 +517,6 @@ BlockAIOCB *ide_issue_trim(
IDEState *s = opaque;
IDEDevice *dev = s->unit ? s->bus->slave : s->bus->master;
TrimAIOCB *iocb;
- /* Paired with a decrement in ide_trim_bh_cb() */
@@ -85,8 +85,8 @@ index 45d14a25e9..08e1f0c3d7 100644
-
iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
iocb->s = s;
iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb);
@@ -740,8 +739,9 @@ void ide_cancel_dma_sync(IDEState *s)
iocb->bh = qemu_bh_new_guarded(ide_trim_bh_cb, iocb,
@@ -741,8 +740,9 @@ void ide_cancel_dma_sync(IDEState *s)
*/
if (s->bus->dma->aiocb) {
trace_ide_cancel_dma_sync_remaining();

View File

@@ -1,36 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marc-Andr=C3=A9=20Lureau?= <marcandre.lureau@redhat.com>
Date: Fri, 28 Apr 2023 19:48:06 +0400
Subject: [PATCH] ui: return NULL when getting cursor without a console
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
VNC may try to get the current cursor even when there are no consoles
and crashes. Simple reproducer is qemu with -nodefaults.
Fixes: (again)
https://gitlab.com/qemu-project/qemu/-/issues/1548
Fixes: commit 385ac97f8 ("ui: keep current cursor with QemuConsole")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(picked up from https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg05598.html)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
ui/console.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/ui/console.c b/ui/console.c
index e173731e20..7461446e71 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -2306,7 +2306,7 @@ QEMUCursor *qemu_console_get_cursor(QemuConsole *con)
if (con == NULL) {
con = active_console;
}
- return con->cursor;
+ return con ? con->cursor : NULL;
}
bool qemu_console_is_visible(QemuConsole *con)

View File

@@ -1,130 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Bulekov <alxndr@bu.edu>
Date: Thu, 27 Apr 2023 17:10:06 -0400
Subject: [PATCH] memory: prevent dma-reentracy issues
Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
This flag is set/checked prior to calling a device's MemoryRegion
handlers, and set when device code initiates DMA. The purpose of this
flag is to prevent two types of DMA-based reentrancy issues:
1.) mmio -> dma -> mmio case
2.) bh -> dma write -> mmio case
These issues have led to problems such as stack-exhaustion and
use-after-frees.
Summary of the problem from Peter Maydell:
https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
Resolves: CVE-2023-0330
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-2-alxndr@bu.edu>
[thuth: Replace warn_report() with warn_report_once()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry-picked from commit a2e1753b8054344f32cf94f31c6399a58794a380)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
include/exec/memory.h | 5 +++++
include/hw/qdev-core.h | 7 +++++++
softmmu/memory.c | 16 ++++++++++++++++
3 files changed, 28 insertions(+)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 15ade918ba..e45ce6061f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -767,6 +767,8 @@ struct MemoryRegion {
bool is_iommu;
RAMBlock *ram_block;
Object *owner;
+ /* owner as TYPE_DEVICE. Used for re-entrancy checks in MR access hotpath */
+ DeviceState *dev;
const MemoryRegionOps *ops;
void *opaque;
@@ -791,6 +793,9 @@ struct MemoryRegion {
unsigned ioeventfd_nb;
MemoryRegionIoeventfd *ioeventfds;
RamDiscardManager *rdm; /* Only for RAM */
+
+ /* For devices designed to perform re-entrant IO into their own IO MRs */
+ bool disable_reentrancy_guard;
};
struct IOMMUMemoryRegion {
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index bd50ad5ee1..7623703943 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -162,6 +162,10 @@ struct NamedClockList {
QLIST_ENTRY(NamedClockList) node;
};
+typedef struct {
+ bool engaged_in_io;
+} MemReentrancyGuard;
+
/**
* DeviceState:
* @realized: Indicates whether the device has been fully constructed.
@@ -194,6 +198,9 @@ struct DeviceState {
int alias_required_for_version;
ResettableState reset;
GSList *unplug_blockers;
+
+ /* Is the device currently in mmio/pio/dma? Used to prevent re-entrancy */
+ MemReentrancyGuard mem_reentrancy_guard;
};
struct DeviceListener {
diff --git a/softmmu/memory.c b/softmmu/memory.c
index b1a6cae6f5..b7b3386e9d 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -542,6 +542,18 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
access_size_max = 4;
}
+ /* Do not allow more than one simultaneous access to a device's IO Regions */
+ if (mr->dev && !mr->disable_reentrancy_guard &&
+ !mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) {
+ if (mr->dev->mem_reentrancy_guard.engaged_in_io) {
+ warn_report_once("Blocked re-entrant IO on MemoryRegion: "
+ "%s at addr: 0x%" HWADDR_PRIX,
+ memory_region_name(mr), addr);
+ return MEMTX_ACCESS_ERROR;
+ }
+ mr->dev->mem_reentrancy_guard.engaged_in_io = true;
+ }
+
/* FIXME: support unaligned access? */
access_size = MAX(MIN(size, access_size_max), access_size_min);
access_mask = MAKE_64BIT_MASK(0, access_size * 8);
@@ -556,6 +568,9 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
access_mask, attrs);
}
}
+ if (mr->dev) {
+ mr->dev->mem_reentrancy_guard.engaged_in_io = false;
+ }
return r;
}
@@ -1170,6 +1185,7 @@ static void memory_region_do_init(MemoryRegion *mr,
}
mr->name = g_strdup(name);
mr->owner = owner;
+ mr->dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
mr->ram_block = NULL;
if (name) {

View File

@@ -0,0 +1,273 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Zhuojia Shen <chaosdefinition@hotmail.com>
Date: Wed, 10 Apr 2024 08:43:24 +0300
Subject: [PATCH] target/arm: align exposed ID registers with Linux
In CPUID registers exposed to userspace, some registers were missing
and some fields were not exposed. This patch aligns exposed ID
registers and their fields with what the upstream kernel currently
exposes.
Specifically, the following new ID registers/fields are exposed to
userspace:
ID_AA64PFR1_EL1.BT: bits 3-0
ID_AA64PFR1_EL1.MTE: bits 11-8
ID_AA64PFR1_EL1.SME: bits 27-24
ID_AA64ZFR0_EL1.SVEver: bits 3-0
ID_AA64ZFR0_EL1.AES: bits 7-4
ID_AA64ZFR0_EL1.BitPerm: bits 19-16
ID_AA64ZFR0_EL1.BF16: bits 23-20
ID_AA64ZFR0_EL1.SHA3: bits 35-32
ID_AA64ZFR0_EL1.SM4: bits 43-40
ID_AA64ZFR0_EL1.I8MM: bits 47-44
ID_AA64ZFR0_EL1.F32MM: bits 55-52
ID_AA64ZFR0_EL1.F64MM: bits 59-56
ID_AA64SMFR0_EL1.F32F32: bit 32
ID_AA64SMFR0_EL1.B16F32: bit 34
ID_AA64SMFR0_EL1.F16F32: bit 35
ID_AA64SMFR0_EL1.I8I32: bits 39-36
ID_AA64SMFR0_EL1.F64F64: bit 48
ID_AA64SMFR0_EL1.I16I64: bits 55-52
ID_AA64SMFR0_EL1.FA64: bit 63
ID_AA64MMFR0_EL1.ECV: bits 63-60
ID_AA64MMFR1_EL1.AFP: bits 47-44
ID_AA64MMFR2_EL1.AT: bits 35-32
ID_AA64ISAR0_EL1.RNDR: bits 63-60
ID_AA64ISAR1_EL1.FRINTTS: bits 35-32
ID_AA64ISAR1_EL1.BF16: bits 47-44
ID_AA64ISAR1_EL1.DGH: bits 51-48
ID_AA64ISAR1_EL1.I8MM: bits 55-52
ID_AA64ISAR2_EL1.WFxT: bits 3-0
ID_AA64ISAR2_EL1.RPRES: bits 7-4
ID_AA64ISAR2_EL1.GPA3: bits 11-8
ID_AA64ISAR2_EL1.APA3: bits 15-12
The code is also refactored to use symbolic names for ID register fields
for better readability and maintainability.
The test case in tests/tcg/aarch64/sysregs.c is also updated to match
the intended behavior.
Signed-off-by: Zhuojia Shen <chaosdefinition@hotmail.com>
Message-id: DS7PR12MB6309FB585E10772928F14271ACE79@DS7PR12MB6309.namprd12.prod.outlook.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: use Sn_n_Cn_Cn_n syntax to work with older assemblers
that don't recognize id_aa64isar2_el1 and id_aa64mmfr2_el1]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit bc6bd20ee3538347afb750c4bd06edca4a922897)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this for v8.0.0-2361-g1f51573f79
"target/arm: Fix SME full tile indexing")
---
target/arm/helper.c | 96 +++++++++++++++++++++++++------
tests/tcg/aarch64/Makefile.target | 7 ++-
tests/tcg/aarch64/sysregs.c | 24 ++++++--
3 files changed, 103 insertions(+), 24 deletions(-)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 2e284e048c..acc0470e86 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7852,31 +7852,89 @@ void register_cp_regs_for_features(ARMCPU *cpu)
#ifdef CONFIG_USER_ONLY
static const ARMCPRegUserSpaceInfo v8_user_idregs[] = {
{ .name = "ID_AA64PFR0_EL1",
- .exported_bits = 0x000f000f00ff0000,
- .fixed_bits = 0x0000000000000011 },
+ .exported_bits = R_ID_AA64PFR0_FP_MASK |
+ R_ID_AA64PFR0_ADVSIMD_MASK |
+ R_ID_AA64PFR0_SVE_MASK |
+ R_ID_AA64PFR0_DIT_MASK,
+ .fixed_bits = (0x1u << R_ID_AA64PFR0_EL0_SHIFT) |
+ (0x1u << R_ID_AA64PFR0_EL1_SHIFT) },
{ .name = "ID_AA64PFR1_EL1",
- .exported_bits = 0x00000000000000f0 },
+ .exported_bits = R_ID_AA64PFR1_BT_MASK |
+ R_ID_AA64PFR1_SSBS_MASK |
+ R_ID_AA64PFR1_MTE_MASK |
+ R_ID_AA64PFR1_SME_MASK },
{ .name = "ID_AA64PFR*_EL1_RESERVED",
- .is_glob = true },
- { .name = "ID_AA64ZFR0_EL1" },
+ .is_glob = true },
+ { .name = "ID_AA64ZFR0_EL1",
+ .exported_bits = R_ID_AA64ZFR0_SVEVER_MASK |
+ R_ID_AA64ZFR0_AES_MASK |
+ R_ID_AA64ZFR0_BITPERM_MASK |
+ R_ID_AA64ZFR0_BFLOAT16_MASK |
+ R_ID_AA64ZFR0_SHA3_MASK |
+ R_ID_AA64ZFR0_SM4_MASK |
+ R_ID_AA64ZFR0_I8MM_MASK |
+ R_ID_AA64ZFR0_F32MM_MASK |
+ R_ID_AA64ZFR0_F64MM_MASK },
+ { .name = "ID_AA64SMFR0_EL1",
+ .exported_bits = R_ID_AA64SMFR0_F32F32_MASK |
+ R_ID_AA64SMFR0_B16F32_MASK |
+ R_ID_AA64SMFR0_F16F32_MASK |
+ R_ID_AA64SMFR0_I8I32_MASK |
+ R_ID_AA64SMFR0_F64F64_MASK |
+ R_ID_AA64SMFR0_I16I64_MASK |
+ R_ID_AA64SMFR0_FA64_MASK },
{ .name = "ID_AA64MMFR0_EL1",
- .fixed_bits = 0x00000000ff000000 },
- { .name = "ID_AA64MMFR1_EL1" },
+ .exported_bits = R_ID_AA64MMFR0_ECV_MASK,
+ .fixed_bits = (0xfu << R_ID_AA64MMFR0_TGRAN64_SHIFT) |
+ (0xfu << R_ID_AA64MMFR0_TGRAN4_SHIFT) },
+ { .name = "ID_AA64MMFR1_EL1",
+ .exported_bits = R_ID_AA64MMFR1_AFP_MASK },
+ { .name = "ID_AA64MMFR2_EL1",
+ .exported_bits = R_ID_AA64MMFR2_AT_MASK },
{ .name = "ID_AA64MMFR*_EL1_RESERVED",
- .is_glob = true },
+ .is_glob = true },
{ .name = "ID_AA64DFR0_EL1",
- .fixed_bits = 0x0000000000000006 },
- { .name = "ID_AA64DFR1_EL1" },
+ .fixed_bits = (0x6u << R_ID_AA64DFR0_DEBUGVER_SHIFT) },
+ { .name = "ID_AA64DFR1_EL1" },
{ .name = "ID_AA64DFR*_EL1_RESERVED",
- .is_glob = true },
+ .is_glob = true },
{ .name = "ID_AA64AFR*",
- .is_glob = true },
+ .is_glob = true },
{ .name = "ID_AA64ISAR0_EL1",
- .exported_bits = 0x00fffffff0fffff0 },
+ .exported_bits = R_ID_AA64ISAR0_AES_MASK |
+ R_ID_AA64ISAR0_SHA1_MASK |
+ R_ID_AA64ISAR0_SHA2_MASK |
+ R_ID_AA64ISAR0_CRC32_MASK |
+ R_ID_AA64ISAR0_ATOMIC_MASK |
+ R_ID_AA64ISAR0_RDM_MASK |
+ R_ID_AA64ISAR0_SHA3_MASK |
+ R_ID_AA64ISAR0_SM3_MASK |
+ R_ID_AA64ISAR0_SM4_MASK |
+ R_ID_AA64ISAR0_DP_MASK |
+ R_ID_AA64ISAR0_FHM_MASK |
+ R_ID_AA64ISAR0_TS_MASK |
+ R_ID_AA64ISAR0_RNDR_MASK },
{ .name = "ID_AA64ISAR1_EL1",
- .exported_bits = 0x000000f0ffffffff },
+ .exported_bits = R_ID_AA64ISAR1_DPB_MASK |
+ R_ID_AA64ISAR1_APA_MASK |
+ R_ID_AA64ISAR1_API_MASK |
+ R_ID_AA64ISAR1_JSCVT_MASK |
+ R_ID_AA64ISAR1_FCMA_MASK |
+ R_ID_AA64ISAR1_LRCPC_MASK |
+ R_ID_AA64ISAR1_GPA_MASK |
+ R_ID_AA64ISAR1_GPI_MASK |
+ R_ID_AA64ISAR1_FRINTTS_MASK |
+ R_ID_AA64ISAR1_SB_MASK |
+ R_ID_AA64ISAR1_BF16_MASK |
+ R_ID_AA64ISAR1_DGH_MASK |
+ R_ID_AA64ISAR1_I8MM_MASK },
+ { .name = "ID_AA64ISAR2_EL1",
+ .exported_bits = R_ID_AA64ISAR2_WFXT_MASK |
+ R_ID_AA64ISAR2_RPRES_MASK |
+ R_ID_AA64ISAR2_GPA3_MASK |
+ R_ID_AA64ISAR2_APA3_MASK },
{ .name = "ID_AA64ISAR*_EL1_RESERVED",
- .is_glob = true },
+ .is_glob = true },
};
modify_arm_cp_regs(v8_idregs, v8_user_idregs);
#endif
@@ -8194,8 +8252,12 @@ void register_cp_regs_for_features(ARMCPU *cpu)
#ifdef CONFIG_USER_ONLY
static const ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
{ .name = "MIDR_EL1",
- .exported_bits = 0x00000000ffffffff },
- { .name = "REVIDR_EL1" },
+ .exported_bits = R_MIDR_EL1_REVISION_MASK |
+ R_MIDR_EL1_PARTNUM_MASK |
+ R_MIDR_EL1_ARCHITECTURE_MASK |
+ R_MIDR_EL1_VARIANT_MASK |
+ R_MIDR_EL1_IMPLEMENTER_MASK },
+ { .name = "REVIDR_EL1" },
};
modify_arm_cp_regs(id_v8_midr_cp_reginfo, id_v8_user_midr_cp_reginfo);
#endif
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index a72578fccb..fc6d5d824d 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -23,7 +23,8 @@ config-cc.mak: Makefile
$(call cc-option,-march=armv8.1-a+sve2, CROSS_CC_HAS_SVE2); \
$(call cc-option,-march=armv8.3-a, CROSS_CC_HAS_ARMV8_3); \
$(call cc-option,-mbranch-protection=standard, CROSS_CC_HAS_ARMV8_BTI); \
- $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE)) 3> config-cc.mak
+ $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE); \
+ $(call cc-option,-march=armv9-a+sme, CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak
-include config-cc.mak
# Pauth Tests
@@ -53,7 +54,11 @@ endif
ifneq ($(CROSS_CC_HAS_SVE),)
# System Registers Tests
AARCH64_TESTS += sysregs
+ifneq ($(CROSS_CC_HAS_ARMV9_SME),)
+sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME
+else
sysregs: CFLAGS+=-march=armv8.1-a+sve
+endif
# SVE ioctl test
AARCH64_TESTS += sve-ioctls
diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
index 40cf8d2877..46b931f781 100644
--- a/tests/tcg/aarch64/sysregs.c
+++ b/tests/tcg/aarch64/sysregs.c
@@ -22,6 +22,13 @@
#define HWCAP_CPUID (1 << 11)
#endif
+/*
+ * Older assemblers don't recognize newer system register names,
+ * but we can still access them by the Sn_n_Cn_Cn_n syntax.
+ */
+#define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2
+#define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2
+
int failed_bit_count;
/* Read and print system register `id' value */
@@ -112,18 +119,23 @@ int main(void)
* minimum valid fields - for the purposes of this check allowed
* to have non-zero values.
*/
- get_cpu_reg_check_mask(id_aa64isar0_el1, _m(00ff,ffff,f0ff,fff0));
- get_cpu_reg_check_mask(id_aa64isar1_el1, _m(0000,00f0,ffff,ffff));
+ get_cpu_reg_check_mask(id_aa64isar0_el1, _m(f0ff,ffff,f0ff,fff0));
+ get_cpu_reg_check_mask(id_aa64isar1_el1, _m(00ff,f0ff,ffff,ffff));
+ get_cpu_reg_check_mask(SYS_ID_AA64ISAR2_EL1, _m(0000,0000,0000,ffff));
/* TGran4 & TGran64 as pegged to -1 */
- get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(0000,0000,ff00,0000));
- get_cpu_reg_check_zero(id_aa64mmfr1_el1);
+ get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(f000,0000,ff00,0000));
+ get_cpu_reg_check_mask(id_aa64mmfr1_el1, _m(0000,f000,0000,0000));
+ get_cpu_reg_check_mask(SYS_ID_AA64MMFR2_EL1, _m(0000,000f,0000,0000));
/* EL1/EL0 reported as AA64 only */
get_cpu_reg_check_mask(id_aa64pfr0_el1, _m(000f,000f,00ff,0011));
- get_cpu_reg_check_mask(id_aa64pfr1_el1, _m(0000,0000,0000,00f0));
+ get_cpu_reg_check_mask(id_aa64pfr1_el1, _m(0000,0000,0f00,0fff));
/* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */
get_cpu_reg_check_mask(id_aa64dfr0_el1, _m(0000,0000,0000,0006));
get_cpu_reg_check_zero(id_aa64dfr1_el1);
- get_cpu_reg_check_zero(id_aa64zfr0_el1);
+ get_cpu_reg_check_mask(id_aa64zfr0_el1, _m(0ff0,ff0f,00ff,00ff));
+#ifdef HAS_ARMV9_SME
+ get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000));
+#endif
get_cpu_reg_check_zero(id_aa64afr0_el1);
get_cpu_reg_check_zero(id_aa64afr1_el1);

View File

@@ -1,39 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Bulekov <alxndr@bu.edu>
Date: Thu, 27 Apr 2023 17:10:10 -0400
Subject: [PATCH] lsi53c895a: disable reentrancy detection for script RAM
As the code is designed to use the memory APIs to access the script ram,
disable reentrancy checks for the pseudo-RAM ram_io MemoryRegion.
In the future, ram_io may be converted from an IO to a proper RAM MemoryRegion.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-6-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry-picked from commit bfd6e7ae6a72b84e2eb9574f56e6ec037f05182c)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/scsi/lsi53c895a.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index bbf32d3f73..17af67935f 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -2313,6 +2313,12 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s,
"lsi-io", 256);
+ /*
+ * Since we use the address-space API to interact with ram_io, disable the
+ * re-entrancy guard.
+ */
+ s->ram_io.disable_reentrancy_guard = true;
+
address_space_init(&s->pci_io_as, pci_address_space_io(dev), "lsi-pci-io");
qdev_init_gpio_out(d, &s->ext_irq, 1);

View File

@@ -0,0 +1,91 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Peter Maydell <peter.maydell@linaro.org>
Date: Wed, 10 Apr 2024 08:43:25 +0300
Subject: [PATCH] tests/tcg/aarch64/sysregs.c: Use S syntax for id_aa64zfr0_el1
and id_aa64smfr0_el1
Some assemblers will complain about attempts to access
id_aa64zfr0_el1 and id_aa64smfr0_el1 by name if the test
binary isn't built for the right processor type:
/tmp/ccASXpLo.s:782: Error: selected processor does not support system register name 'id_aa64zfr0_el1'
/tmp/ccASXpLo.s:829: Error: selected processor does not support system register name 'id_aa64smfr0_el1'
However, these registers are in the ID space and are guaranteed to
read-as-zero on older CPUs, so the access is both safe and sensible.
Switch to using the S syntax, as we already do for ID_AA64ISAR2_EL1
and ID_AA64MMFR2_EL1. This allows us to drop the HAS_ARMV9_SME check
and the makefile machinery to adjust the CFLAGS for this test, so we
don't rely on having a sufficiently new compiler to be able to check
these registers.
This means we're actually testing the SME ID register: no released
GCC yet recognizes -march=armv9-a+sme, so that was always skipped.
It also avoids a future problem if we try to switch the "do we have
SME support in the toolchain" check from "in the compiler" to "in the
assembler" (at which point we would otherwise run into the above
errors).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 3dc2afeab2964b54848715b913b6c605f36be3e1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this for v8.0.0-2361-g1f51573f79
"target/arm: Fix SME full tile indexing")
---
tests/tcg/aarch64/Makefile.target | 7 +------
tests/tcg/aarch64/sysregs.c | 11 +++++++----
2 files changed, 8 insertions(+), 10 deletions(-)
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index fc6d5d824d..118d069073 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -51,15 +51,10 @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7
mte-%: CFLAGS += -march=armv8.5-a+memtag
endif
-ifneq ($(CROSS_CC_HAS_SVE),)
# System Registers Tests
AARCH64_TESTS += sysregs
-ifneq ($(CROSS_CC_HAS_ARMV9_SME),)
-sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME
-else
-sysregs: CFLAGS+=-march=armv8.1-a+sve
-endif
+ifneq ($(CROSS_CC_HAS_SVE),)
# SVE ioctl test
AARCH64_TESTS += sve-ioctls
sve-ioctls: CFLAGS+=-march=armv8.1-a+sve
diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
index 46b931f781..d8eb06abcf 100644
--- a/tests/tcg/aarch64/sysregs.c
+++ b/tests/tcg/aarch64/sysregs.c
@@ -25,9 +25,14 @@
/*
* Older assemblers don't recognize newer system register names,
* but we can still access them by the Sn_n_Cn_Cn_n syntax.
+ * This also means we don't need to specifically request that the
+ * assembler enables whatever architectural features the ID registers
+ * syntax might be gated behind.
*/
#define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2
#define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2
+#define SYS_ID_AA64ZFR0_EL1 S3_0_C0_C4_4
+#define SYS_ID_AA64SMFR0_EL1 S3_0_C0_C4_5
int failed_bit_count;
@@ -132,10 +137,8 @@ int main(void)
/* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */
get_cpu_reg_check_mask(id_aa64dfr0_el1, _m(0000,0000,0000,0006));
get_cpu_reg_check_zero(id_aa64dfr1_el1);
- get_cpu_reg_check_mask(id_aa64zfr0_el1, _m(0ff0,ff0f,00ff,00ff));
-#ifdef HAS_ARMV9_SME
- get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000));
-#endif
+ get_cpu_reg_check_mask(SYS_ID_AA64ZFR0_EL1, _m(0ff0,ff0f,00ff,00ff));
+ get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(80f1,00fd,0000,0000));
get_cpu_reg_check_zero(id_aa64afr0_el1);
get_cpu_reg_check_zero(id_aa64afr1_el1);

View File

@@ -1,37 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Bulekov <alxndr@bu.edu>
Date: Thu, 27 Apr 2023 17:10:11 -0400
Subject: [PATCH] bcm2835_property: disable reentrancy detection for iomem
As the code is designed for re-entrant calls from bcm2835_property to
bcm2835_mbox and back into bcm2835_property, mark iomem as
reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-7-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry-picked from commit 985c4a4e547afb9573b6bd6843d20eb2c3d1d1cd)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/misc/bcm2835_property.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/misc/bcm2835_property.c b/hw/misc/bcm2835_property.c
index 890ae7bae5..de056ea2df 100644
--- a/hw/misc/bcm2835_property.c
+++ b/hw/misc/bcm2835_property.c
@@ -382,6 +382,13 @@ static void bcm2835_property_init(Object *obj)
memory_region_init_io(&s->iomem, OBJECT(s), &bcm2835_property_ops, s,
TYPE_BCM2835_PROPERTY, 0x10);
+
+ /*
+ * bcm2835_property_ops call into bcm2835_mbox, which in-turn reads from
+ * iomem. As such, mark iomem as re-entracy safe.
+ */
+ s->iomem.disable_reentrancy_guard = true;
+
sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->iomem);
sysbus_init_irq(SYS_BUS_DEVICE(s), &s->mbox_irq);
}

View File

@@ -0,0 +1,199 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 10 Apr 2024 08:43:26 +0300
Subject: [PATCH] target/arm: Fix SME full tile indexing
For the outer product set of insns, which take an entire matrix
tile as output, the argument is not a combined tile+column.
Therefore using get_tile_rowcol was incorrect, as we extracted
the tile number from itself.
The test case relies only on assembler support for SME, since
no release of GCC recognizes -march=armv9-a+sme yet.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1620
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230622151201.1578522-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: dropped now-unneeded changes to sysregs CFLAGS]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 1f51573f7925b80e79a29f87c7d9d6ead60960c0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
target/arm/translate-sme.c | 24 ++++++---
tests/tcg/aarch64/Makefile.target | 7 ++-
tests/tcg/aarch64/sme-outprod1.c | 83 +++++++++++++++++++++++++++++++
3 files changed, 107 insertions(+), 7 deletions(-)
create mode 100644 tests/tcg/aarch64/sme-outprod1.c
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
index 7b87a9df63..65f8495bdd 100644
--- a/target/arm/translate-sme.c
+++ b/target/arm/translate-sme.c
@@ -103,6 +103,21 @@ static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs,
return addr;
}
+/*
+ * Resolve tile.size[0] to a host pointer.
+ * Used by e.g. outer product insns where we require the entire tile.
+ */
+static TCGv_ptr get_tile(DisasContext *s, int esz, int tile)
+{
+ TCGv_ptr addr = tcg_temp_new_ptr();
+ int offset;
+
+ offset = tile * sizeof(ARMVectorReg) + offsetof(CPUARMState, zarray);
+
+ tcg_gen_addi_ptr(addr, cpu_env, offset);
+ return addr;
+}
+
static bool trans_ZERO(DisasContext *s, arg_ZERO *a)
{
if (!dc_isar_feature(aa64_sme, s)) {
@@ -279,8 +294,7 @@ static bool do_adda(DisasContext *s, arg_adda *a, MemOp esz,
return true;
}
- /* Sum XZR+zad to find ZAd. */
- za = get_tile_rowcol(s, esz, 31, a->zad, false);
+ za = get_tile(s, esz, a->zad);
zn = vec_full_reg_ptr(s, a->zn);
pn = pred_full_reg_ptr(s, a->pn);
pm = pred_full_reg_ptr(s, a->pm);
@@ -310,8 +324,7 @@ static bool do_outprod(DisasContext *s, arg_op *a, MemOp esz,
return true;
}
- /* Sum XZR+zad to find ZAd. */
- za = get_tile_rowcol(s, esz, 31, a->zad, false);
+ za = get_tile(s, esz, a->zad);
zn = vec_full_reg_ptr(s, a->zn);
zm = vec_full_reg_ptr(s, a->zm);
pn = pred_full_reg_ptr(s, a->pn);
@@ -337,8 +350,7 @@ static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz,
return true;
}
- /* Sum XZR+zad to find ZAd. */
- za = get_tile_rowcol(s, esz, 31, a->zad, false);
+ za = get_tile(s, esz, a->zad);
zn = vec_full_reg_ptr(s, a->zn);
zm = vec_full_reg_ptr(s, a->zm);
pn = pred_full_reg_ptr(s, a->pn);
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index 118d069073..5e4ea7c998 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -24,7 +24,7 @@ config-cc.mak: Makefile
$(call cc-option,-march=armv8.3-a, CROSS_CC_HAS_ARMV8_3); \
$(call cc-option,-mbranch-protection=standard, CROSS_CC_HAS_ARMV8_BTI); \
$(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE); \
- $(call cc-option,-march=armv9-a+sme, CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak
+ $(call cc-option,-Wa$(COMMA)-march=armv9-a+sme, CROSS_AS_HAS_ARMV9_SME)) 3> config-cc.mak
-include config-cc.mak
# Pauth Tests
@@ -51,6 +51,11 @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7
mte-%: CFLAGS += -march=armv8.5-a+memtag
endif
+# SME Tests
+ifneq ($(CROSS_AS_HAS_ARMV9_SME),)
+AARCH64_TESTS += sme-outprod1
+endif
+
# System Registers Tests
AARCH64_TESTS += sysregs
diff --git a/tests/tcg/aarch64/sme-outprod1.c b/tests/tcg/aarch64/sme-outprod1.c
new file mode 100644
index 0000000000..6e5972d75e
--- /dev/null
+++ b/tests/tcg/aarch64/sme-outprod1.c
@@ -0,0 +1,83 @@
+/*
+ * SME outer product, 1 x 1.
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include <stdio.h>
+
+extern void foo(float *dst);
+
+asm(
+" .arch_extension sme\n"
+" .type foo, @function\n"
+"foo:\n"
+" stp x29, x30, [sp, -80]!\n"
+" mov x29, sp\n"
+" stp d8, d9, [sp, 16]\n"
+" stp d10, d11, [sp, 32]\n"
+" stp d12, d13, [sp, 48]\n"
+" stp d14, d15, [sp, 64]\n"
+" smstart\n"
+" ptrue p0.s, vl4\n"
+" fmov z0.s, #1.0\n"
+/*
+ * An outer product of a vector of 1.0 by itself should be a matrix of 1.0.
+ * Note that we are using tile 1 here (za1.s) rather than tile 0.
+ */
+" zero {za}\n"
+" fmopa za1.s, p0/m, p0/m, z0.s, z0.s\n"
+/*
+ * Read the first 4x4 sub-matrix of elements from tile 1:
+ * Note that za1h should be interchangable here.
+ */
+" mov w12, #0\n"
+" mova z0.s, p0/m, za1v.s[w12, #0]\n"
+" mova z1.s, p0/m, za1v.s[w12, #1]\n"
+" mova z2.s, p0/m, za1v.s[w12, #2]\n"
+" mova z3.s, p0/m, za1v.s[w12, #3]\n"
+/*
+ * And store them to the input pointer (dst in the C code):
+ */
+" st1w {z0.s}, p0, [x0]\n"
+" add x0, x0, #16\n"
+" st1w {z1.s}, p0, [x0]\n"
+" add x0, x0, #16\n"
+" st1w {z2.s}, p0, [x0]\n"
+" add x0, x0, #16\n"
+" st1w {z3.s}, p0, [x0]\n"
+" smstop\n"
+" ldp d8, d9, [sp, 16]\n"
+" ldp d10, d11, [sp, 32]\n"
+" ldp d12, d13, [sp, 48]\n"
+" ldp d14, d15, [sp, 64]\n"
+" ldp x29, x30, [sp], 80\n"
+" ret\n"
+" .size foo, . - foo"
+);
+
+int main()
+{
+ float dst[16];
+ int i, j;
+
+ foo(dst);
+
+ for (i = 0; i < 16; i++) {
+ if (dst[i] != 1.0f) {
+ break;
+ }
+ }
+
+ if (i == 16) {
+ return 0; /* success */
+ }
+
+ /* failure */
+ for (i = 0; i < 4; ++i) {
+ for (j = 0; j < 4; ++j) {
+ printf("%f ", (double)dst[i * 4 + j]);
+ }
+ printf("\n");
+ }
+ return 1;
+}

View File

@@ -1,35 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Bulekov <alxndr@bu.edu>
Date: Thu, 27 Apr 2023 17:10:12 -0400
Subject: [PATCH] raven: disable reentrancy detection for iomem
As the code is designed for re-entrant calls from raven_io_ops to
pci-conf, mark raven_io_ops as reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230427211013.2994127-8-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry-picked from commit 6dad5a6810d9c60ca320d01276f6133bbcfa1fc7)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/pci-host/raven.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c
index 072ffe3c5e..9a11ac4b2b 100644
--- a/hw/pci-host/raven.c
+++ b/hw/pci-host/raven.c
@@ -294,6 +294,13 @@ static void raven_pcihost_initfn(Object *obj)
memory_region_init(&s->pci_memory, obj, "pci-memory", 0x3f000000);
address_space_init(&s->pci_io_as, &s->pci_io, "raven-io");
+ /*
+ * Raven's raven_io_ops use the address-space API to access pci-conf-idx
+ * (which is also owned by the raven device). As such, mark the
+ * pci_io_non_contiguous as re-entrancy safe.
+ */
+ s->pci_io_non_contiguous.disable_reentrancy_guard = true;
+
/* CPU address space */
memory_region_add_subregion(address_space_mem, PCI_IO_BASE_ADDR,
&s->pci_io);

View File

@@ -0,0 +1,61 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dmitrii Gavrilov <ds-gavr@yandex-team.ru>
Date: Wed, 10 Apr 2024 08:43:28 +0300
Subject: [PATCH] system/qdev-monitor: move drain_call_rcu call under if (!dev)
in qmp_device_add()
Original goal of addition of drain_call_rcu to qmp_device_add was to cover
the failure case of qdev_device_add. It seems call of drain_call_rcu was
misplaced in 7bed89958bfbf40df what led to waiting for pending RCU callbacks
under happy path too. What led to overall performance degradation of
qmp_device_add.
In this patch call of drain_call_rcu moved under handling of failure of
qdev_device_add.
Signed-off-by: Dmitrii Gavrilov <ds-gavr@yandex-team.ru>
Message-ID: <20231103105602.90475-1-ds-gavr@yandex-team.ru>
Fixes: 7bed89958bf ("device_core: use drain_call_rcu in in qmp_device_add", 2020-10-12)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 012b170173bcaa14b9bc26209e0813311ac78489)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
softmmu/qdev-monitor.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 4b0ef65780..f4348443b0 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -853,19 +853,18 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
return;
}
dev = qdev_device_add(opts, errp);
-
- /*
- * Drain all pending RCU callbacks. This is done because
- * some bus related operations can delay a device removal
- * (in this case this can happen if device is added and then
- * removed due to a configuration error)
- * to a RCU callback, but user might expect that this interface
- * will finish its job completely once qmp command returns result
- * to the user
- */
- drain_call_rcu();
-
if (!dev) {
+ /*
+ * Drain all pending RCU callbacks. This is done because
+ * some bus related operations can delay a device removal
+ * (in this case this can happen if device is added and then
+ * removed due to a configuration error)
+ * to a RCU callback, but user might expect that this interface
+ * will finish its job completely once qmp command returns result
+ * to the user
+ */
+ drain_call_rcu();
+
qemu_opts_del(opts);
return;
}

View File

@@ -1,36 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Bulekov <alxndr@bu.edu>
Date: Thu, 27 Apr 2023 17:10:13 -0400
Subject: [PATCH] apic: disable reentrancy detection for apic-msi
As the code is designed for re-entrant calls to apic-msi, mark apic-msi
as reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-9-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry-picked from commit 50795ee051a342c681a9b45671c552fbd6274db8)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/intc/apic.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 20b5a94073..ac3d47d231 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -885,6 +885,13 @@ static void apic_realize(DeviceState *dev, Error **errp)
memory_region_init_io(&s->io_memory, OBJECT(s), &apic_io_ops, s, "apic-msi",
APIC_SPACE_SIZE);
+ /*
+ * apic-msi's apic_mem_write can call into ioapic_eoi_broadcast, which can
+ * write back to apic-msi. As such mark the apic-msi region re-entrancy
+ * safe.
+ */
+ s->io_memory.disable_reentrancy_guard = true;
+
s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, apic_timer, s);
local_apics[s->id] = s;

View File

@@ -0,0 +1,85 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Sven Schnelle <svens@stackframe.org>
Date: Wed, 10 Apr 2024 08:43:29 +0300
Subject: [PATCH] hw/scsi/lsi53c895a: stop script on phase mismatch
Netbsd isn't happy with qemu lsi53c895a emulation:
cd0(esiop0:0:2:0): command with tag id 0 reset
esiop0: autoconfiguration error: phase mismatch without command
esiop0: autoconfiguration error: unhandled scsi interrupt, sist=0x80 sstat1=0x0 DSA=0x23a64b1 DSP=0x50
This is because lsi_bad_phase() triggers a phase mismatch, which
stops SCRIPT processing. However, after returning to
lsi_command_complete(), SCRIPT is restarted with lsi_resume_script().
Fix this by adding a return value to lsi_bad_phase(), and only resume
script processing when lsi_bad_phase() didn't trigger a host interrupt.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Helge Deller <deller@gmx.de>
Message-ID: <20240302214453.2071388-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a9198b3132d81a6bfc9fdbf6f3d3a514c2864674)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/scsi/lsi53c895a.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index ca619ed564..905f5ef237 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -570,8 +570,9 @@ static inline void lsi_set_phase(LSIState *s, int phase)
s->sstat1 = (s->sstat1 & ~PHASE_MASK) | phase;
}
-static void lsi_bad_phase(LSIState *s, int out, int new_phase)
+static int lsi_bad_phase(LSIState *s, int out, int new_phase)
{
+ int ret = 0;
/* Trigger a phase mismatch. */
if (s->ccntl0 & LSI_CCNTL0_ENPMJ) {
if ((s->ccntl0 & LSI_CCNTL0_PMJCTL)) {
@@ -584,8 +585,10 @@ static void lsi_bad_phase(LSIState *s, int out, int new_phase)
trace_lsi_bad_phase_interrupt();
lsi_script_scsi_interrupt(s, LSI_SIST0_MA, 0);
lsi_stop_script(s);
+ ret = 1;
}
lsi_set_phase(s, new_phase);
+ return ret;
}
@@ -789,7 +792,7 @@ static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
static void lsi_command_complete(SCSIRequest *req, size_t resid)
{
LSIState *s = LSI53C895A(req->bus->qbus.parent);
- int out;
+ int out, stop = 0;
out = (s->sstat1 & PHASE_MASK) == PHASE_DO;
trace_lsi_command_complete(req->status);
@@ -797,7 +800,10 @@ static void lsi_command_complete(SCSIRequest *req, size_t resid)
s->command_complete = 2;
if (s->waiting && s->dbc != 0) {
/* Raise phase mismatch for short transfers. */
- lsi_bad_phase(s, out, PHASE_ST);
+ stop = lsi_bad_phase(s, out, PHASE_ST);
+ if (stop) {
+ s->waiting = 0;
+ }
} else {
lsi_set_phase(s, PHASE_ST);
}
@@ -807,7 +813,9 @@ static void lsi_command_complete(SCSIRequest *req, size_t resid)
lsi_request_free(s, s->current);
scsi_req_unref(req);
}
- lsi_resume_script(s);
+ if (!stop) {
+ lsi_resume_script(s);
+ }
}
/* Callback to indicate that the SCSI layer has completed a transfer. */

View File

@@ -0,0 +1,38 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Sven Schnelle <svens@stackframe.org>
Date: Wed, 10 Apr 2024 08:43:30 +0300
Subject: [PATCH] hw/scsi/lsi53c895a: add missing decrement of reentrancy
counter
When the maximum count of SCRIPTS instructions is reached, the code
stops execution and returns, but fails to decrement the reentrancy
counter. This effectively renders the SCSI controller unusable
because on next entry the reentrancy counter is still above the limit.
This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS
loops.
Fixes: b987718bbb ("hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)")
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240128202214.2644768-1-svens@stackframe.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 8b09b7fe47082c69295a0fc0cc01b041b6385025)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/scsi/lsi53c895a.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 905f5ef237..c7a3964b5f 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -1167,6 +1167,7 @@ again:
lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
lsi_disconnect(s);
trace_lsi_execute_script_stop();
+ reentrancy_level--;
return;
}
insn = read_dword(s, s->dsp);

View File

@@ -1,48 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 28 Jul 2023 10:47:48 +0200
Subject: [PATCH] migration/block-dirty-bitmap: fix loading bitmap when there
is an iothread
The bdrv_create_dirty_bitmap() function (which is also called by
bdrv_dirty_bitmap_create_successor()) uses bdrv_getlength(bs). This is
a wrapper around a coroutine, and thus uses bdrv_poll_co(). Polling
tries to release the AioContext which will trigger an assert() if it
hasn't been acquired before.
The issue does not happen for migration, because there we are in a
coroutine already, so the wrapper will just call bdrv_co_getlength()
directly without polling.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/block-dirty-bitmap.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index fe73aa94b1..7eaf498439 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -805,8 +805,11 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
"destination", bdrv_dirty_bitmap_name(s->bitmap));
return -EINVAL;
} else {
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
s->bitmap = bdrv_create_dirty_bitmap(s->bs, granularity,
s->bitmap_name, &local_err);
+ aio_context_release(ctx);
if (!s->bitmap) {
error_report_err(local_err);
return -EINVAL;
@@ -833,7 +836,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
bdrv_disable_dirty_bitmap(s->bitmap);
if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
+ aio_context_release(ctx);
if (local_err) {
error_report_err(local_err);
return -EINVAL;

View File

@@ -0,0 +1,173 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Sven Schnelle <svens@stackframe.org>
Date: Wed, 10 Apr 2024 08:43:31 +0300
Subject: [PATCH] hw/scsi/lsi53c895a: add timer to scripts processing
HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
under certain circumstances. As the SCSI controller and CPU are not
running at the same time this loop will never finish. After some
time, the check loop interrupts with a unexpected device disconnect.
This works, but is slow because the kernel resets the scsi controller.
Instead of signaling UDC, start a timer and exit the loop. Until the
timer fires, the CPU can process instructions which might changes the
memory location.
The limit of instructions is also reduced because scripts running on
the SCSI processor are usually very short. This keeps the time until
the loop is exit short.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240229204407.1699260-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9876359990dd4c8a48de65cf5e1c3d13e96a7f4e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/scsi/lsi53c895a.c | 43 +++++++++++++++++++++++++++++++++----------
hw/scsi/trace-events | 2 ++
2 files changed, 35 insertions(+), 10 deletions(-)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index c7a3964b5f..48c85d479c 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -188,7 +188,7 @@ static const char *names[] = {
#define LSI_TAG_VALID (1 << 16)
/* Maximum instructions to process. */
-#define LSI_MAX_INSN 10000
+#define LSI_MAX_INSN 100
typedef struct lsi_request {
SCSIRequest *req;
@@ -205,6 +205,7 @@ enum {
LSI_WAIT_RESELECT, /* Wait Reselect instruction has been issued */
LSI_DMA_SCRIPTS, /* processing DMA from lsi_execute_script */
LSI_DMA_IN_PROGRESS, /* DMA operation is in progress */
+ LSI_WAIT_SCRIPTS, /* SCRIPTS stopped because of instruction count limit */
};
enum {
@@ -224,6 +225,7 @@ struct LSIState {
MemoryRegion ram_io;
MemoryRegion io_io;
AddressSpace pci_io_as;
+ QEMUTimer *scripts_timer;
int carry; /* ??? Should this be an a visible register somewhere? */
int status;
@@ -415,6 +417,7 @@ static void lsi_soft_reset(LSIState *s)
s->sbr = 0;
assert(QTAILQ_EMPTY(&s->queue));
assert(!s->current);
+ timer_del(s->scripts_timer);
}
static int lsi_dma_40bit(LSIState *s)
@@ -1135,6 +1138,12 @@ static void lsi_wait_reselect(LSIState *s)
}
}
+static void lsi_scripts_timer_start(LSIState *s)
+{
+ trace_lsi_scripts_timer_start();
+ timer_mod(s->scripts_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 500);
+}
+
static void lsi_execute_script(LSIState *s)
{
PCIDevice *pci_dev = PCI_DEVICE(s);
@@ -1144,6 +1153,11 @@ static void lsi_execute_script(LSIState *s)
int insn_processed = 0;
static int reentrancy_level;
+ if (s->waiting == LSI_WAIT_SCRIPTS) {
+ timer_del(s->scripts_timer);
+ s->waiting = LSI_NOWAIT;
+ }
+
reentrancy_level++;
s->istat1 |= LSI_ISTAT1_SRUN;
@@ -1151,8 +1165,8 @@ again:
/*
* Some windows drivers make the device spin waiting for a memory location
* to change. If we have executed more than LSI_MAX_INSN instructions then
- * assume this is the case and force an unexpected device disconnect. This
- * is apparently sufficient to beat the drivers into submission.
+ * assume this is the case and start a timer. Until the timer fires, the
+ * host CPU has a chance to run and change the memory location.
*
* Another issue (CVE-2023-0330) can occur if the script is programmed to
* trigger itself again and again. Avoid this problem by stopping after
@@ -1160,13 +1174,8 @@ again:
* which should be enough for all valid use cases).
*/
if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
- if (!(s->sien0 & LSI_SIST0_UDC)) {
- qemu_log_mask(LOG_GUEST_ERROR,
- "lsi_scsi: inf. loop with UDC masked");
- }
- lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
- lsi_disconnect(s);
- trace_lsi_execute_script_stop();
+ s->waiting = LSI_WAIT_SCRIPTS;
+ lsi_scripts_timer_start(s);
reentrancy_level--;
return;
}
@@ -2205,6 +2214,9 @@ static int lsi_post_load(void *opaque, int version_id)
return -EINVAL;
}
+ if (s->waiting == LSI_WAIT_SCRIPTS) {
+ lsi_scripts_timer_start(s);
+ }
return 0;
}
@@ -2302,6 +2314,15 @@ static const struct SCSIBusInfo lsi_scsi_info = {
.cancel = lsi_request_cancelled
};
+static void scripts_timer_cb(void *opaque)
+{
+ LSIState *s = opaque;
+
+ trace_lsi_scripts_timer_triggered();
+ s->waiting = LSI_NOWAIT;
+ lsi_execute_script(s);
+}
+
static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
{
LSIState *s = LSI53C895A(dev);
@@ -2321,6 +2342,7 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
"lsi-ram", 0x2000);
memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s,
"lsi-io", 256);
+ s->scripts_timer = timer_new_us(QEMU_CLOCK_VIRTUAL, scripts_timer_cb, s);
/*
* Since we use the address-space API to interact with ram_io, disable the
@@ -2345,6 +2367,7 @@ static void lsi_scsi_exit(PCIDevice *dev)
LSIState *s = LSI53C895A(dev);
address_space_destroy(&s->pci_io_as);
+ timer_del(s->scripts_timer);
}
static void lsi_class_init(ObjectClass *klass, void *data)
diff --git a/hw/scsi/trace-events b/hw/scsi/trace-events
index ab238293f0..131af99d91 100644
--- a/hw/scsi/trace-events
+++ b/hw/scsi/trace-events
@@ -299,6 +299,8 @@ lsi_execute_script_stop(void) "SCRIPTS execution stopped"
lsi_awoken(void) "Woken by SIGP"
lsi_reg_read(const char *name, int offset, uint8_t ret) "Read reg %s 0x%x = 0x%02x"
lsi_reg_write(const char *name, int offset, uint8_t val) "Write reg %s 0x%x = 0x%02x"
+lsi_scripts_timer_triggered(void) "SCRIPTS timer triggered"
+lsi_scripts_timer_start(void) "SCRIPTS timer started"
# virtio-scsi.c
virtio_scsi_cmd_req(int lun, uint32_t tag, uint8_t cmd) "virtio_scsi_cmd_req lun=%u tag=0x%x cmd=0x%x"

View File

@@ -1,29 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Li Feng <fengli@smartx.com>
Date: Mon, 31 Jul 2023 20:10:06 +0800
Subject: [PATCH] vhost: fix the fd leak
When the vhost-user reconnect to the backend, the notifer should be
cleanup. Otherwise, the fd resource will be exhausted.
Fixes: f9a09ca3ea ("vhost: add support for configure interrupt")
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
---
hw/virtio/vhost.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index a266396576..8e3311781f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2034,6 +2034,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
event_notifier_test_and_clear(
&hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
event_notifier_test_and_clear(&vdev->config_notifier);
+ event_notifier_cleanup(
+ &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
trace_vhost_dev_stop(hdev, vdev->name, vrings);

View File

@@ -0,0 +1,161 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Laurent Vivier <lvivier@redhat.com>
Date: Wed, 10 Apr 2024 08:43:33 +0300
Subject: [PATCH] e1000e: fix link state on resume
On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
that sets link_down to false, and thus activates the link even
if we have disabled it.
The problem can be reproduced starting qemu in paused state (-S) and
then set the link to down. When we resume the machine the link appears
to be up.
Reproducer:
# qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
{"execute": "qmp_capabilities" }
{"execute": "set_link", "arguments": {"name": "net0", "up": false}}
{"execute": "cont" }
To fix the problem, merge the content of e1000e_vm_state_change()
into e1000e_core_post_load() as e1000 does.
Buglink: https://issues.redhat.com/browse/RHEL-21867
Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 4cadf10234989861398e19f3bb441d3861f3bb7c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/net/e1000e_core.c | 60 ++++++--------------------------------------
hw/net/e1000e_core.h | 2 --
2 files changed, 7 insertions(+), 55 deletions(-)
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index c71d82ce1d..742f5ec800 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -108,14 +108,6 @@ e1000e_intmgr_timer_resume(E1000IntrDelayTimer *timer)
}
}
-static void
-e1000e_intmgr_timer_pause(E1000IntrDelayTimer *timer)
-{
- if (timer->running) {
- timer_del(timer->timer);
- }
-}
-
static inline void
e1000e_intrmgr_stop_timer(E1000IntrDelayTimer *timer)
{
@@ -397,24 +389,6 @@ e1000e_intrmgr_resume(E1000ECore *core)
}
}
-static void
-e1000e_intrmgr_pause(E1000ECore *core)
-{
- int i;
-
- e1000e_intmgr_timer_pause(&core->radv);
- e1000e_intmgr_timer_pause(&core->rdtr);
- e1000e_intmgr_timer_pause(&core->raid);
- e1000e_intmgr_timer_pause(&core->tidv);
- e1000e_intmgr_timer_pause(&core->tadv);
-
- e1000e_intmgr_timer_pause(&core->itr);
-
- for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
- e1000e_intmgr_timer_pause(&core->eitr[i]);
- }
-}
-
static void
e1000e_intrmgr_reset(E1000ECore *core)
{
@@ -3336,12 +3310,6 @@ e1000e_core_read(E1000ECore *core, hwaddr addr, unsigned size)
return 0;
}
-static inline void
-e1000e_autoneg_pause(E1000ECore *core)
-{
- timer_del(core->autoneg_timer);
-}
-
static void
e1000e_autoneg_resume(E1000ECore *core)
{
@@ -3353,22 +3321,6 @@ e1000e_autoneg_resume(E1000ECore *core)
}
}
-static void
-e1000e_vm_state_change(void *opaque, bool running, RunState state)
-{
- E1000ECore *core = opaque;
-
- if (running) {
- trace_e1000e_vm_state_running();
- e1000e_intrmgr_resume(core);
- e1000e_autoneg_resume(core);
- } else {
- trace_e1000e_vm_state_stopped();
- e1000e_autoneg_pause(core);
- e1000e_intrmgr_pause(core);
- }
-}
-
void
e1000e_core_pci_realize(E1000ECore *core,
const uint16_t *eeprom_templ,
@@ -3381,9 +3333,6 @@ e1000e_core_pci_realize(E1000ECore *core,
e1000e_autoneg_timer, core);
e1000e_intrmgr_pci_realize(core);
- core->vmstate =
- qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
-
for (i = 0; i < E1000E_NUM_QUEUES; i++) {
net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
E1000E_MAX_TX_FRAGS, core->has_vnet);
@@ -3408,8 +3357,6 @@ e1000e_core_pci_uninit(E1000ECore *core)
e1000e_intrmgr_pci_unint(core);
- qemu_del_vm_change_state_handler(core->vmstate);
-
for (i = 0; i < E1000E_NUM_QUEUES; i++) {
net_tx_pkt_reset(core->tx[i].tx_pkt);
net_tx_pkt_uninit(core->tx[i].tx_pkt);
@@ -3561,5 +3508,12 @@ e1000e_core_post_load(E1000ECore *core)
*/
nc->link_down = (core->mac[STATUS] & E1000_STATUS_LU) == 0;
+ /*
+ * we need to restart intrmgr timers, as an older version of
+ * QEMU can have stopped them before migration
+ */
+ e1000e_intrmgr_resume(core);
+ e1000e_autoneg_resume(core);
+
return 0;
}
diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
index 4ddb4d2c39..f2a8ff4a33 100644
--- a/hw/net/e1000e_core.h
+++ b/hw/net/e1000e_core.h
@@ -100,8 +100,6 @@ struct E1000Core {
E1000IntrDelayTimer eitr[E1000E_MSIX_VEC_NUM];
bool eitr_intr_pending[E1000E_MSIX_VEC_NUM];
- VMChangeStateEntry *vmstate;
-
uint32_t itr_guest_value;
uint32_t eitr_guest_value[E1000E_MSIX_VEC_NUM];

View File

@@ -1,100 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 24 Aug 2023 11:22:21 +0200
Subject: [PATCH] hw/ide: reset: cancel async DMA operation before reseting
state
If there is a pending DMA operation during ide_bus_reset(), the fact
that the IDEstate is already reset before the operation is canceled
can be problematic. In particular, ide_dma_cb() might be called and
then use the reset IDEstate which contains the signature after the
reset. When used to construct the IO operation this leads to
ide_get_sector() returning 0 and nsector being 1. This is particularly
bad, because a write command will thus destroy the first sector which
often contains a partition table or similar.
Traces showing the unsolicited write happening with IDEstate
0x5595af6949d0 being used after reset:
> ahci_port_write ahci(0x5595af6923f0)[0]: port write [reg:PxSCTL] @ 0x2c: 0x00000300
> ahci_reset_port ahci(0x5595af6923f0)[0]: reset port
> ide_reset IDEstate 0x5595af6949d0
> ide_reset IDEstate 0x5595af694da8
> ide_bus_reset_aio aio_cancel
> dma_aio_cancel dbs=0x7f64600089a0
> dma_blk_cb dbs=0x7f64600089a0 ret=0
> dma_complete dbs=0x7f64600089a0 ret=0 cb=0x5595acd40b30
> ahci_populate_sglist ahci(0x5595af6923f0)[0]
> ahci_dma_prepare_buf ahci(0x5595af6923f0)[0]: prepare buf limit=512 prepared=512
> ide_dma_cb IDEState 0x5595af6949d0; sector_num=0 n=1 cmd=DMA WRITE
> dma_blk_io dbs=0x7f6420802010 bs=0x5595ae2c6c30 offset=0 to_dev=1
> dma_blk_cb dbs=0x7f6420802010 ret=0
> (gdb) p *qiov
> $11 = {iov = 0x7f647c76d840, niov = 1, {{nalloc = 1, local_iov = {iov_base = 0x0,
> iov_len = 512}}, {__pad = "\001\000\000\000\000\000\000\000\000\000\000",
> size = 512}}}
> (gdb) bt
> #0 blk_aio_pwritev (blk=0x5595ae2c6c30, offset=0, qiov=0x7f6420802070, flags=0,
> cb=0x5595ace6f0b0 <dma_blk_cb>, opaque=0x7f6420802010)
> at ../block/block-backend.c:1682
> #1 0x00005595ace6f185 in dma_blk_cb (opaque=0x7f6420802010, ret=<optimized out>)
> at ../softmmu/dma-helpers.c:179
> #2 0x00005595ace6f778 in dma_blk_io (ctx=0x5595ae0609f0,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> io_func=io_func@entry=0x5595ace6ee30 <dma_blk_write_io_func>,
> io_func_opaque=io_func_opaque@entry=0x5595ae2c6c30,
> cb=0x5595acd40b30 <ide_dma_cb>, opaque=0x5595af6949d0,
> dir=DMA_DIRECTION_TO_DEVICE) at ../softmmu/dma-helpers.c:244
> #3 0x00005595ace6f90a in dma_blk_write (blk=0x5595ae2c6c30,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> cb=cb@entry=0x5595acd40b30 <ide_dma_cb>, opaque=opaque@entry=0x5595af6949d0)
> at ../softmmu/dma-helpers.c:280
> #4 0x00005595acd40e18 in ide_dma_cb (opaque=0x5595af6949d0, ret=<optimized out>)
> at ../hw/ide/core.c:953
> #5 0x00005595ace6f319 in dma_complete (ret=0, dbs=0x7f64600089a0)
> at ../softmmu/dma-helpers.c:107
> #6 dma_blk_cb (opaque=0x7f64600089a0, ret=0) at ../softmmu/dma-helpers.c:127
> #7 0x00005595ad12227d in blk_aio_complete (acb=0x7f6460005b10)
> at ../block/block-backend.c:1527
> #8 blk_aio_complete (acb=0x7f6460005b10) at ../block/block-backend.c:1524
> #9 blk_aio_write_entry (opaque=0x7f6460005b10) at ../block/block-backend.c:1594
> #10 0x00005595ad258cfb in coroutine_trampoline (i0=<optimized out>,
> i1=<optimized out>) at ../util/coroutine-ucontext.c:177
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/ide/core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 08e1f0c3d7..148fccdef2 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2513,19 +2513,19 @@ static void ide_dummy_transfer_stop(IDEState *s)
void ide_bus_reset(IDEBus *bus)
{
- bus->unit = 0;
- bus->cmd = 0;
- ide_reset(&bus->ifs[0]);
- ide_reset(&bus->ifs[1]);
- ide_clear_hob(bus);
-
- /* pending async DMA */
+ /* pending async DMA - needs the IDEState before it is reset */
if (bus->dma->aiocb) {
trace_ide_bus_reset_aio();
blk_aio_cancel(bus->dma->aiocb);
bus->dma->aiocb = NULL;
}
+ bus->unit = 0;
+ bus->cmd = 0;
+ ide_reset(&bus->ifs[0]);
+ ide_reset(&bus->ifs[1]);
+ ide_clear_hob(bus);
+
/* reset dma provider too */
if (bus->dma->ops->reset) {
bus->dma->ops->reset(bus->dma);

View File

@@ -0,0 +1,61 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 10 Apr 2024 08:43:49 +0300
Subject: [PATCH] target/i386: introduce function to query MMU indices
Remove knowledge of specific MMU indexes (other than MMU_NESTED_IDX and
MMU_PHYS_IDX) from mmu_translate(). This will make it possible to split
32-bit and 64-bit MMU indexes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5f97afe2543f09160a8d123ab6e2e8c6d98fa9ce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup in target/i386/cpu.h due to other changes in that area)
---
target/i386/cpu.h | 10 ++++++++++
target/i386/tcg/sysemu/excp_helper.c | 4 ++--
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7be047ce33..f175e18768 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2195,6 +2195,16 @@ static inline int cpu_mmu_index(CPUX86State *env, bool ifetch)
? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX;
}
+static inline bool is_mmu_index_smap(int mmu_index)
+{
+ return mmu_index == MMU_KSMAP_IDX;
+}
+
+static inline bool is_mmu_index_user(int mmu_index)
+{
+ return mmu_index == MMU_USER_IDX;
+}
+
static inline bool is_mmu_index_32(int mmu_index)
{
assert(mmu_index < MMU_PHYS_IDX);
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 5999cdedf5..553a60d976 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -135,7 +135,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
{
const target_ulong addr = in->addr;
const int pg_mode = in->pg_mode;
- const bool is_user = (in->mmu_idx == MMU_USER_IDX);
+ const bool is_user = is_mmu_index_user(in->mmu_idx);
const MMUAccessType access_type = in->access_type;
uint64_t ptep, pte, rsvd_mask;
PTETranslate pte_trans = {
@@ -355,7 +355,7 @@ do_check_protect_pse36:
}
int prot = 0;
- if (in->mmu_idx != MMU_KSMAP_IDX || !(ptep & PG_USER_MASK)) {
+ if (!is_mmu_index_smap(in->mmu_idx) || !(ptep & PG_USER_MASK)) {
prot |= PAGE_READ;
if ((ptep & PG_RW_MASK) || !(is_user || (pg_mode & PG_MODE_WP))) {
prot |= PAGE_WRITE;

View File

@@ -0,0 +1,130 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 10 Apr 2024 08:43:50 +0300
Subject: [PATCH] target/i386: use separate MMU indexes for 32-bit accesses
Accesses from a 32-bit environment (32-bit code segment for instruction
accesses, EFER.LMA==0 for processor accesses) have to mask away the
upper 32 bits of the address. While a bit wasteful, the easiest way
to do so is to use separate MMU indexes. These days, QEMU anyway is
compiled with a fixed value for NB_MMU_MODES. Split MMU_USER_IDX,
MMU_KSMAP_IDX and MMU_KNOSMAP_IDX in two.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 90f641531c782c873a05895f411c05fbbbef3c49)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: move changes for x86_cpu_mmu_index() to cpu_mmu_index() due to missing
v8.2.0-1030-gace0c5fe5950 "target/i386: Populate CPUClass.mmu_index"
Increase NB_MMU_MODES from 5 to 8 in target/i386/cpu-param.h due to missing
v7.2.0-2640-gffd824f3f32d "include/exec: Set default NB_MMU_MODES to 16"
v7.2.0-2647-g6787318a5d86 "target/i386: Remove NB_MMU_MODES define"
which relaxed upper limit of MMU index for i386, since this commit starts
using MMU_NESTED_IDX=7.
Thanks Zhao Liu and Paolo Bonzini for the analisys and suggestions.
)
---
target/i386/cpu-param.h | 2 +-
target/i386/cpu.h | 44 ++++++++++++++++++++--------
target/i386/tcg/sysemu/excp_helper.c | 3 +-
3 files changed, 34 insertions(+), 15 deletions(-)
diff --git a/target/i386/cpu-param.h b/target/i386/cpu-param.h
index f579b16bd2..e21e472e1e 100644
--- a/target/i386/cpu-param.h
+++ b/target/i386/cpu-param.h
@@ -23,7 +23,7 @@
# define TARGET_VIRT_ADDR_SPACE_BITS 32
#endif
#define TARGET_PAGE_BITS 12
-#define NB_MMU_MODES 5
+#define NB_MMU_MODES 8
#ifndef CONFIG_USER_ONLY
# define TARGET_TB_PCREL 1
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index f175e18768..73eee08f3f 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2182,27 +2182,42 @@ uint64_t cpu_get_tsc(CPUX86State *env);
#define cpu_list x86_cpu_list
/* MMU modes definitions */
-#define MMU_KSMAP_IDX 0
-#define MMU_USER_IDX 1
-#define MMU_KNOSMAP_IDX 2
-#define MMU_NESTED_IDX 3
-#define MMU_PHYS_IDX 4
+#define MMU_KSMAP64_IDX 0
+#define MMU_KSMAP32_IDX 1
+#define MMU_USER64_IDX 2
+#define MMU_USER32_IDX 3
+#define MMU_KNOSMAP64_IDX 4
+#define MMU_KNOSMAP32_IDX 5
+#define MMU_PHYS_IDX 6
+#define MMU_NESTED_IDX 7
+
+#ifdef CONFIG_USER_ONLY
+#ifdef TARGET_X86_64
+#define MMU_USER_IDX MMU_USER64_IDX
+#else
+#define MMU_USER_IDX MMU_USER32_IDX
+#endif
+#endif
static inline int cpu_mmu_index(CPUX86State *env, bool ifetch)
{
- return (env->hflags & HF_CPL_MASK) == 3 ? MMU_USER_IDX :
- (!(env->hflags & HF_SMAP_MASK) || (env->eflags & AC_MASK))
- ? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX;
+ int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 1 : 0;
+ int mmu_index_base =
+ (env->hflags & HF_CPL_MASK) == 3 ? MMU_USER64_IDX :
+ !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
+ (env->eflags & AC_MASK) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX;
+
+ return mmu_index_base + mmu_index_32;
}
static inline bool is_mmu_index_smap(int mmu_index)
{
- return mmu_index == MMU_KSMAP_IDX;
+ return (mmu_index & ~1) == MMU_KSMAP64_IDX;
}
static inline bool is_mmu_index_user(int mmu_index)
{
- return mmu_index == MMU_USER_IDX;
+ return (mmu_index & ~1) == MMU_USER64_IDX;
}
static inline bool is_mmu_index_32(int mmu_index)
@@ -2213,9 +2228,12 @@ static inline bool is_mmu_index_32(int mmu_index)
static inline int cpu_mmu_index_kernel(CPUX86State *env)
{
- return !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP_IDX :
- ((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK))
- ? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX;
+ int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 1 : 0;
+ int mmu_index_base =
+ !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
+ ((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK)) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX;
+
+ return mmu_index_base + mmu_index_32;
}
#define CC_DST (env->cc_dst)
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 553a60d976..5f13252d68 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -541,7 +541,8 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
if (likely(use_stage2)) {
in.cr3 = env->nested_cr3;
in.pg_mode = env->nested_pg_mode;
- in.mmu_idx = MMU_USER_IDX;
+ in.mmu_idx =
+ env->nested_pg_mode & PG_MODE_LMA ? MMU_USER64_IDX : MMU_USER32_IDX;
in.ptw_idx = MMU_PHYS_IDX;
if (!mmu_translate(env, &in, out, err)) {

View File

@@ -0,0 +1,46 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 10 Apr 2024 08:43:51 +0300
Subject: [PATCH] target/i386: fix direction of "32-bit MMU" test
The low bit of MMU indices for x86 TCG indicates whether the processor is
in 32-bit mode and therefore linear addresses have to be masked to 32 bits.
However, the index was computed incorrectly, leading to possible conflicts
in the TLB for any address above 4G.
Analyzed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: b1661801c18 ("target/i386: Fix physical address truncation", 2024-02-28)
Fixes: 1c15f97b4f1 ("target/i386: Fix physical address truncation" in stable-7.2)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2206
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2cc68629a6fc198f4a972698bdd6477f883aedfb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: move changes for x86_cpu_mmu_index() to cpu_mmu_index() due to missing
v8.2.0-1030-gace0c5fe59 "target/i386: Populate CPUClass.mmu_index")
---
target/i386/cpu.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 73eee08f3f..326649ca99 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2201,7 +2201,7 @@ uint64_t cpu_get_tsc(CPUX86State *env);
static inline int cpu_mmu_index(CPUX86State *env, bool ifetch)
{
- int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 1 : 0;
+ int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 0 : 1;
int mmu_index_base =
(env->hflags & HF_CPL_MASK) == 3 ? MMU_USER64_IDX :
!(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
@@ -2228,7 +2228,7 @@ static inline bool is_mmu_index_32(int mmu_index)
static inline int cpu_mmu_index_kernel(CPUX86State *env)
{
- int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 1 : 0;
+ int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 0 : 1;
int mmu_index_base =
!(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK)) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX;

View File

@@ -0,0 +1,35 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tao Su <tao1.su@linux.intel.com>
Date: Wed, 10 Apr 2024 08:43:52 +0300
Subject: [PATCH] target/i386: Revert monitor_puts() in do_inject_x86_mce()
monitor_puts() doesn't check the monitor pointer, but do_inject_x86_mce()
may have a parameter with NULL monitor pointer. Revert monitor_puts() in
do_inject_x86_mce() to fix, then the fact that we send the same message to
monitor and log is again more obvious.
Fixes: bf0c50d4aa85 (monitor: expose monitor_puts to rest of code)
Reviwed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20240320083640.523287-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7fd226b04746f0be0b636de5097f1b42338951a0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
target/i386/helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/i386/helper.c b/target/i386/helper.c
index 0ac2da066d..290d9d309c 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -427,7 +427,7 @@ static void do_inject_x86_mce(CPUState *cs, run_on_cpu_data data)
if (need_reset) {
emit_guest_memory_failure(MEMORY_FAILURE_ACTION_RESET, ar,
recursive);
- monitor_puts(params->mon, msg);
+ monitor_printf(params->mon, "%s", msg);
qemu_log_mask(CPU_LOG_RESET, "%s\n", msg);
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
return;

View File

@@ -0,0 +1,86 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 10 Apr 2024 08:43:57 +0300
Subject: [PATCH] tcg/optimize: Fix sign_mask for logical right-shift
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The 'sign' computation is attempting to locate the sign bit that has
been repeated, so that we can test if that bit is known zero. That
computation can be zero if there are no known sign repetitions.
Cc: qemu-stable@nongnu.org
Fixes: 93a967fbb57 ("tcg/optimize: Propagate sign info for shifting")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2248
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 2911e9b95f3bb03783ae5ca3e2494dc3b44a9161)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: trivial context fixup in tests/tcg/aarch64/Makefile.target)
---
tcg/optimize.c | 2 +-
tests/tcg/aarch64/Makefile.target | 1 +
tests/tcg/aarch64/test-2248.c | 28 ++++++++++++++++++++++++++++
3 files changed, 30 insertions(+), 1 deletion(-)
create mode 100644 tests/tcg/aarch64/test-2248.c
diff --git a/tcg/optimize.c b/tcg/optimize.c
index ae081ab29c..b6f6436c74 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1907,7 +1907,7 @@ static bool fold_shift(OptContext *ctx, TCGOp *op)
* will not reduced the number of input sign repetitions.
*/
sign = (s_mask & -s_mask) >> 1;
- if (!(z_mask & sign)) {
+ if (sign && !(z_mask & sign)) {
ctx->s_mask = s_mask;
}
break;
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index 5e4ea7c998..474f61bc30 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -10,6 +10,7 @@ VPATH += $(AARCH64_SRC)
# Base architecture tests
AARCH64_TESTS=fcvt pcalign-a64
+AARCH64_TESTS += test-2248
fcvt: LDFLAGS+=-lm
diff --git a/tests/tcg/aarch64/test-2248.c b/tests/tcg/aarch64/test-2248.c
new file mode 100644
index 0000000000..aac2e17836
--- /dev/null
+++ b/tests/tcg/aarch64/test-2248.c
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* See https://gitlab.com/qemu-project/qemu/-/issues/2248 */
+
+#include <assert.h>
+
+__attribute__((noinline))
+long test(long x, long y, long sh)
+{
+ long r;
+ asm("cmp %1, %2\n\t"
+ "cset x12, lt\n\t"
+ "and w11, w12, #0xff\n\t"
+ "cmp w11, #0\n\t"
+ "csetm x14, ne\n\t"
+ "lsr x13, x14, %3\n\t"
+ "sxtb %0, w13"
+ : "=r"(r)
+ : "r"(x), "r"(y), "r"(sh)
+ : "x11", "x12", "x13", "x14");
+ return r;
+}
+
+int main()
+{
+ long r = test(0, 1, 2);
+ assert(r == -1);
+ return 0;
+}

View File

@@ -0,0 +1,66 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wafer <wafer@jaguarmicro.com>
Date: Wed, 10 Apr 2024 08:44:02 +0300
Subject: [PATCH] hw/virtio: Fix packed virtqueue flush used_idx
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
In the event of writing many chains of descriptors, the device must
write just the id of the last buffer in the descriptor chain, skip
forward the number of descriptors in the chain, and then repeat the
operations for the rest of chains.
Current QEMU code writes all the buffer ids consecutively, and then
skips all the buffers altogether. This is a bug, and can be reproduced
with a VirtIONet device with _F_MRG_RXBUB and without
_F_INDIRECT_DESC:
If a virtio-net device has the VIRTIO_NET_F_MRG_RXBUF feature
but not the VIRTIO_RING_F_INDIRECT_DESC feature,
'VirtIONetQueue->rx_vq' will use the merge feature
to store data in multiple 'elems'.
The 'num_buffers' in the virtio header indicates how many elements are merged.
If the value of 'num_buffers' is greater than 1,
all the merged elements will be filled into the descriptor ring.
The 'idx' of the elements should be the value of 'vq->used_idx' plus 'ndescs'.
Fixes: 86044b24e8 ("virtio: basic packed virtqueue support")
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Wafer <wafer@jaguarmicro.com>
Message-Id: <20240407015451.5228-2-wafer@jaguarmicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2d9a31b3c27311eca1682cb2c076d7a300441960)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/virtio/virtio.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index b7da7f074d..e4f8ed1e63 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1367,12 +1367,20 @@ static void virtqueue_packed_flush(VirtQueue *vq, unsigned int count)
return;
}
+ /*
+ * For indirect element's 'ndescs' is 1.
+ * For all other elemment's 'ndescs' is the
+ * number of descriptors chained by NEXT (as set in virtqueue_packed_pop).
+ * So When the 'elem' be filled into the descriptor ring,
+ * The 'idx' of this 'elem' shall be
+ * the value of 'vq->used_idx' plus the 'ndescs'.
+ */
+ ndescs += vq->used_elems[0].ndescs;
for (i = 1; i < count; i++) {
- virtqueue_packed_fill_desc(vq, &vq->used_elems[i], i, false);
+ virtqueue_packed_fill_desc(vq, &vq->used_elems[i], ndescs, false);
ndescs += vq->used_elems[i].ndescs;
}
virtqueue_packed_fill_desc(vq, &vq->used_elems[0], 0, true);
- ndescs += vq->used_elems[0].ndescs;
vq->inuse -= ndescs;
vq->used_idx += ndescs;

1279
debian/patches/pve-qemu-7.2-vitastor.patch vendored Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -14,10 +14,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index c2dee3f056..9681bd0434 100644
index b9647c5ffc..9a16d86344 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -553,7 +553,7 @@ static QemuOptsList raw_runtime_opts = {
@@ -552,7 +552,7 @@ static QemuOptsList raw_runtime_opts = {
{
.name = "locking",
.type = QEMU_OPT_STRING,
@@ -26,7 +26,7 @@ index c2dee3f056..9681bd0434 100644
},
{
.name = "pr-manager",
@@ -653,7 +653,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
@@ -652,7 +652,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
s->use_lock = false;
break;
case ON_OFF_AUTO_AUTO:

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/net.h b/include/net/net.h
index 1448d00afb..d1601d32c1 100644
index 5a7c0e9ebf..59dde996f9 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -258,8 +258,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
@@ -238,8 +238,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
int net_hub_id_for_client(NetClientState *nc, int *id);
NetClientState *net_hub_port_find(int hub_id);

View File

@@ -10,10 +10,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d243e290d3..3489b05ec4 100644
index 326649ca99..24d21486bc 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2203,9 +2203,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
@@ -2174,9 +2174,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
#define CPU_RESOLVING_TYPE TYPE_X86_CPU
#ifdef TARGET_X86_64

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 52a59386d7..b20c25aee0 100644
index c3ac20ad43..37774f1c0a 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -691,32 +691,35 @@ static void qemu_spice_init(void)
@@ -689,32 +689,35 @@ static void qemu_spice_init(void)
if (tls_port) {
x509_dir = qemu_opt_get(opts, "x509-dir");

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/block/gluster.c b/block/gluster.c
index 185a83e5e5..f11a40aa9e 100644
index 7c90f7ba4b..2e03102f00 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -43,7 +43,7 @@
@@ -42,7 +42,7 @@
#define GLUSTER_DEBUG_DEFAULT 4
#define GLUSTER_DEBUG_MAX 9
#define GLUSTER_OPT_LOGFILE "logfile"
@@ -21,7 +21,7 @@ index 185a83e5e5..f11a40aa9e 100644
/*
* Several versions of GlusterFS (3.12? -> 6.0.1) fail when the transfer size
* is greater or equal to 1024 MiB, so we are limiting the transfer size to 512
@@ -425,6 +425,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
@@ -424,6 +424,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
int old_errno;
SocketAddressList *server;
unsigned long long port;
@@ -29,7 +29,7 @@ index 185a83e5e5..f11a40aa9e 100644
glfs = glfs_find_preopened(gconf->volume);
if (glfs) {
@@ -467,9 +468,15 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
@@ -466,9 +467,15 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
}
}

View File

@@ -18,10 +18,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+)
diff --git a/block/rbd.c b/block/rbd.c
index 978671411e..a4749f3b1b 100644
index f826410f40..64a8d7d48b 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -963,6 +963,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
@@ -820,6 +820,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
rados_conf_set(*cluster, "rbd_cache", "false");
}

View File

@@ -0,0 +1,98 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Mon, 6 Apr 2020 12:16:37 +0200
Subject: [PATCH] PVE: [Up] qmp: add get_link_status
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add get_link_status to command name exceptions]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
net/net.c | 27 +++++++++++++++++++++++++++
qapi/net.json | 15 +++++++++++++++
qapi/pragma.json | 2 ++
3 files changed, 44 insertions(+)
diff --git a/net/net.c b/net/net.c
index c3391168f6..f7d984f6f5 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1387,6 +1387,33 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
}
}
+int64_t qmp_get_link_status(const char *name, Error **errp)
+{
+ NetClientState *ncs[MAX_QUEUE_NUM];
+ NetClientState *nc;
+ int queues;
+ bool ret;
+
+ queues = qemu_find_net_clients_except(name, ncs,
+ NET_CLIENT_DRIVER__MAX,
+ MAX_QUEUE_NUM);
+
+ if (queues == 0) {
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ "Device '%s' not found", name);
+ return (int64_t) -1;
+ }
+
+ nc = ncs[0];
+ ret = ncs[0]->link_down;
+
+ if (nc->peer->info->type == NET_CLIENT_DRIVER_NIC) {
+ ret = ncs[0]->peer->link_down;
+ }
+
+ return (int64_t) ret ? 0 : 1;
+}
+
void colo_notify_filters_event(int event, Error **errp)
{
NetClientState *nc;
diff --git a/qapi/net.json b/qapi/net.json
index 522ac582ed..327d7c5a37 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -36,6 +36,21 @@
##
{ 'command': 'set_link', 'data': {'name': 'str', 'up': 'bool'} }
+##
+# @get_link_status:
+#
+# Get the current link state of the nics or nic.
+#
+# @name: name of the nic you get the state of
+#
+# Return: If link is up 1
+# If link is down 0
+# If an error occure an empty string.
+#
+# Notes: this is an Proxmox VE extension and not offical part of Qemu.
+##
+{ 'command': 'get_link_status', 'data': {'name': 'str'} , 'returns': 'int' }
+
##
# @netdev_add:
#
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 7f810b0e97..29233db825 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -15,6 +15,7 @@
'device_add',
'device_del',
'expire_password',
+ 'get_link_status',
'migrate_cancel',
'netdev_add',
'netdev_del',
@@ -26,6 +27,7 @@
'system_wakeup' ],
# Commands allowed to return a non-dictionary
'command-returns-exceptions': [
+ 'get_link_status',
'human-monitor-command',
'qom-get',
'query-tpm-models',

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/block/gluster.c b/block/gluster.c
index f11a40aa9e..6756e6b886 100644
index 2e03102f00..7886c5fe8c 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -58,6 +58,7 @@ typedef struct GlusterAIOCB {
@@ -57,6 +57,7 @@ typedef struct GlusterAIOCB {
int ret;
Coroutine *coroutine;
AioContext *aio_context;
@@ -27,7 +27,7 @@ index f11a40aa9e..6756e6b886 100644
} GlusterAIOCB;
typedef struct BDRVGlusterState {
@@ -753,8 +754,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
@@ -752,8 +753,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
acb->ret = 0; /* Success */
} else if (ret < 0) {
acb->ret = -errno; /* Read/Write failed */
@@ -39,7 +39,7 @@ index f11a40aa9e..6756e6b886 100644
}
aio_co_schedule(acb->aio_context, acb->coroutine);
@@ -1021,6 +1024,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
@@ -1022,6 +1025,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
@@ -47,7 +47,7 @@ index f11a40aa9e..6756e6b886 100644
ret = glfs_zerofill_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1201,9 +1205,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
@@ -1203,9 +1207,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
acb.aio_context = bdrv_get_aio_context(bs);
if (write) {
@@ -59,7 +59,7 @@ index f11a40aa9e..6756e6b886 100644
ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
gluster_finish_aiocb, &acb);
}
@@ -1266,6 +1272,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
@@ -1268,6 +1274,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
@@ -67,7 +67,7 @@ index f11a40aa9e..6756e6b886 100644
ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1314,6 +1321,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
@@ -1316,6 +1323,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/qemu-img.c b/qemu-img.c
index 9aeac69fa6..0919fac1f1 100644
index 2c32d9da4e..b9636714f6 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3059,7 +3059,8 @@ static int img_info(int argc, char **argv)
@@ -3013,7 +3013,8 @@ static int img_info(int argc, char **argv)
list = collect_image_info_list(image_opts, filename, fmt, chain,
force_share);
if (!list) {

View File

@@ -54,10 +54,10 @@ index 1b1dab5b17..d1616c045a 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 0919fac1f1..c584de648c 100644
index b9636714f6..0f6a4e4e57 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4885,10 +4885,12 @@ static int img_bitmap(int argc, char **argv)
@@ -4840,10 +4840,12 @@ static int img_bitmap(int argc, char **argv)
#define C_IF 04
#define C_OF 010
#define C_SKIP 020
@@ -70,7 +70,7 @@ index 0919fac1f1..c584de648c 100644
};
struct DdIo {
@@ -4964,6 +4966,19 @@ static int img_dd_skip(const char *arg,
@@ -4919,6 +4921,19 @@ static int img_dd_skip(const char *arg,
return 0;
}
@@ -90,7 +90,7 @@ index 0919fac1f1..c584de648c 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -5004,6 +5019,7 @@ static int img_dd(int argc, char **argv)
@@ -4959,6 +4974,7 @@ static int img_dd(int argc, char **argv)
{ "if", img_dd_if, C_IF },
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
@@ -98,7 +98,7 @@ index 0919fac1f1..c584de648c 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5079,91 +5095,112 @@ static int img_dd(int argc, char **argv)
@@ -5034,91 +5050,112 @@ static int img_dd(int argc, char **argv)
arg = NULL;
}
@@ -275,7 +275,7 @@ index 0919fac1f1..c584de648c 100644
}
if (dd.flags & C_SKIP && (in.offset > INT64_MAX / in.bsz ||
@@ -5180,20 +5217,43 @@ static int img_dd(int argc, char **argv)
@@ -5135,20 +5172,43 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
for (out_pos = 0; in_pos < size; ) {

View File

@@ -16,10 +16,10 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/qemu-img.c b/qemu-img.c
index c584de648c..a57ceeddfe 100644
index 0f6a4e4e57..a3cd66e56c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4886,11 +4886,13 @@ static int img_bitmap(int argc, char **argv)
@@ -4841,11 +4841,13 @@ static int img_bitmap(int argc, char **argv)
#define C_OF 010
#define C_SKIP 020
#define C_OSIZE 040
@@ -33,7 +33,7 @@ index c584de648c..a57ceeddfe 100644
};
struct DdIo {
@@ -4979,6 +4981,19 @@ static int img_dd_osize(const char *arg,
@@ -4934,6 +4936,19 @@ static int img_dd_osize(const char *arg,
return 0;
}
@@ -53,7 +53,7 @@ index c584de648c..a57ceeddfe 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -4993,12 +5008,14 @@ static int img_dd(int argc, char **argv)
@@ -4948,12 +4963,14 @@ static int img_dd(int argc, char **argv)
int c, i;
const char *out_fmt = "raw";
const char *fmt = NULL;
@@ -69,7 +69,7 @@ index c584de648c..a57ceeddfe 100644
};
struct DdIo in = {
.bsz = 512, /* Block size is by default 512 bytes */
@@ -5020,6 +5037,7 @@ static int img_dd(int argc, char **argv)
@@ -4975,6 +4992,7 @@ static int img_dd(int argc, char **argv)
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
{ "osize", img_dd_osize, C_OSIZE },
@@ -77,7 +77,7 @@ index c584de648c..a57ceeddfe 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5216,9 +5234,10 @@ static int img_dd(int argc, char **argv)
@@ -5171,9 +5189,10 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
@@ -90,7 +90,7 @@ index c584de648c..a57ceeddfe 100644
if (blk1) {
in_ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
if (in_ret == 0) {
@@ -5227,6 +5246,9 @@ static int img_dd(int argc, char **argv)
@@ -5182,6 +5201,9 @@ static int img_dd(int argc, char **argv)
} else {
in_ret = read(STDIN_FILENO, in.buf, bytes);
if (in_ret == 0) {

View File

@@ -65,10 +65,10 @@ index d1616c045a..b5b0bb4467 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index a57ceeddfe..06d814e39c 100644
index a3cd66e56c..4f5ef5b887 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -5010,7 +5010,7 @@ static int img_dd(int argc, char **argv)
@@ -4965,7 +4965,7 @@ static int img_dd(int argc, char **argv)
const char *fmt = NULL;
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
@@ -77,7 +77,7 @@ index a57ceeddfe..06d814e39c 100644
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5048,7 +5048,7 @@ static int img_dd(int argc, char **argv)
@@ -5003,7 +5003,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
@@ -86,7 +86,7 @@ index a57ceeddfe..06d814e39c 100644
if (c == EOF) {
break;
}
@@ -5068,6 +5068,9 @@ static int img_dd(int argc, char **argv)
@@ -5023,6 +5023,9 @@ static int img_dd(int argc, char **argv)
case 'h':
help();
break;
@@ -96,7 +96,7 @@ index a57ceeddfe..06d814e39c 100644
case 'U':
force_share = true;
break;
@@ -5198,13 +5201,15 @@ static int img_dd(int argc, char **argv)
@@ -5153,13 +5156,15 @@ static int img_dd(int argc, char **argv)
size - in.bsz * in.offset, &error_abort);
}

View File

@@ -7,62 +7,20 @@ Actually provide memory information via the query-balloon
command.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add BalloonInfo to member name exceptions list
rebase for 8.0 - moved to hw/core/machine-hmp-cmds.c]
[FE: add BalloonInfo to member name exceptions list]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/core/machine-hmp-cmds.c | 30 +++++++++++++++++++++++++++++-
hw/virtio/virtio-balloon.c | 33 +++++++++++++++++++++++++++++++--
monitor/hmp-cmds.c | 30 +++++++++++++++++++++++++++++-
qapi/machine.json | 22 +++++++++++++++++++++-
qapi/pragma.json | 1 +
4 files changed, 82 insertions(+), 4 deletions(-)
diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
index c3e55ef9e9..0e32e6201f 100644
--- a/hw/core/machine-hmp-cmds.c
+++ b/hw/core/machine-hmp-cmds.c
@@ -169,7 +169,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
return;
}
- monitor_printf(mon, "balloon: actual=%" PRId64 "\n", info->actual >> 20);
+ monitor_printf(mon, "balloon: actual=%" PRId64, info->actual >> 20);
+ monitor_printf(mon, " max_mem=%" PRId64, info->max_mem >> 20);
+ if (info->has_total_mem) {
+ monitor_printf(mon, " total_mem=%" PRId64, info->total_mem >> 20);
+ }
+ if (info->has_free_mem) {
+ monitor_printf(mon, " free_mem=%" PRId64, info->free_mem >> 20);
+ }
+
+ if (info->has_mem_swapped_in) {
+ monitor_printf(mon, " mem_swapped_in=%" PRId64, info->mem_swapped_in);
+ }
+ if (info->has_mem_swapped_out) {
+ monitor_printf(mon, " mem_swapped_out=%" PRId64, info->mem_swapped_out);
+ }
+ if (info->has_major_page_faults) {
+ monitor_printf(mon, " major_page_faults=%" PRId64,
+ info->major_page_faults);
+ }
+ if (info->has_minor_page_faults) {
+ monitor_printf(mon, " minor_page_faults=%" PRId64,
+ info->minor_page_faults);
+ }
+ if (info->has_last_update) {
+ monitor_printf(mon, " last_update=%" PRId64,
+ info->last_update);
+ }
+
+ monitor_printf(mon, "\n");
qapi_free_BalloonInfo(info);
}
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 746f07c4d2..a41854b902 100644
index e4c4c2d3c8..49874569b1 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -804,8 +804,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
@@ -806,8 +806,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
{
VirtIOBalloon *dev = opaque;
@@ -102,11 +60,52 @@ index 746f07c4d2..a41854b902 100644
}
static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 01b789a79e..480b798963 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -696,7 +696,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
return;
}
- monitor_printf(mon, "balloon: actual=%" PRId64 "\n", info->actual >> 20);
+ monitor_printf(mon, "balloon: actual=%" PRId64, info->actual >> 20);
+ monitor_printf(mon, " max_mem=%" PRId64, info->max_mem >> 20);
+ if (info->has_total_mem) {
+ monitor_printf(mon, " total_mem=%" PRId64, info->total_mem >> 20);
+ }
+ if (info->has_free_mem) {
+ monitor_printf(mon, " free_mem=%" PRId64, info->free_mem >> 20);
+ }
+
+ if (info->has_mem_swapped_in) {
+ monitor_printf(mon, " mem_swapped_in=%" PRId64, info->mem_swapped_in);
+ }
+ if (info->has_mem_swapped_out) {
+ monitor_printf(mon, " mem_swapped_out=%" PRId64, info->mem_swapped_out);
+ }
+ if (info->has_major_page_faults) {
+ monitor_printf(mon, " major_page_faults=%" PRId64,
+ info->major_page_faults);
+ }
+ if (info->has_minor_page_faults) {
+ monitor_printf(mon, " minor_page_faults=%" PRId64,
+ info->minor_page_faults);
+ }
+ if (info->has_last_update) {
+ monitor_printf(mon, " last_update=%" PRId64,
+ info->last_update);
+ }
+
+ monitor_printf(mon, "\n");
qapi_free_BalloonInfo(info);
}
diff --git a/qapi/machine.json b/qapi/machine.json
index 604b686e59..15f5f86683 100644
index b9228a5e46..10e77a9af3 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1056,9 +1056,29 @@
@@ -1054,9 +1054,29 @@
# @actual: the logical size of the VM in bytes
# Formula used: logical_vm_size = vm_ram_size - balloon_size
#
@@ -138,10 +137,10 @@ index 604b686e59..15f5f86683 100644
##
# @query-balloon:
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 7f810b0e97..325e684411 100644
index 29233db825..f2097b9020 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -35,6 +35,7 @@
@@ -37,6 +37,7 @@
'member-name-exceptions': [ # visible in:
'ACPISlotType', # query-acpi-ospm-status
'AcpiTableOptions', # -acpitable

View File

@@ -13,13 +13,13 @@ Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index b98ff15089..24595f618c 100644
index 4f4ab30f8c..76fff60a6b 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -103,6 +103,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
@@ -99,6 +99,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
info->hotpluggable_cpus = mc->has_hotpluggable_cpus;
info->numa_mem_supported = mc->numa_mem_supported;
info->deprecated = !!mc->deprecation_reason;
info->acpi = !!object_class_property_find(OBJECT_CLASS(mc), "acpi");
+
+ if (strcmp(mc->name, MACHINE_GET_CLASS(current_machine)->name) == 0) {
+ info->has_is_current = true;
@@ -28,9 +28,9 @@ index b98ff15089..24595f618c 100644
+
if (mc->default_cpu_type) {
info->default_cpu_type = g_strdup(mc->default_cpu_type);
}
info->has_default_cpu_type = true;
diff --git a/qapi/machine.json b/qapi/machine.json
index 15f5f86683..c904280085 100644
index 10e77a9af3..9156103c8f 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -138,6 +138,8 @@
@@ -42,7 +42,7 @@ index 15f5f86683..c904280085 100644
# @cpu-max: maximum number of CPUs supported by the machine type
# (since 1.5)
#
@@ -161,7 +163,7 @@
@@ -159,7 +161,7 @@
##
{ 'struct': 'MachineInfo',
'data': { 'name': 'str', '*alias': 'str',
@@ -50,4 +50,4 @@ index 15f5f86683..c904280085 100644
+ '*is-default': 'bool', '*is-current': 'bool', 'cpu-max': 'int',
'hotpluggable-cpus': 'bool', 'numa-mem-supported': 'bool',
'deprecated': 'bool', '*default-cpu-type': 'str',
'*default-ram-id': 'str', 'acpi': 'bool' } }
'*default-ram-id': 'str' } }

View File

@@ -6,15 +6,13 @@ Subject: [PATCH] PVE: qapi: modify spice query
Provide the last ticket in the SpiceInfo struct optionally.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to QAPI change]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
qapi/ui.json | 3 +++
ui/spice-core.c | 4 ++++
2 files changed, 7 insertions(+)
ui/spice-core.c | 5 +++++
2 files changed, 8 insertions(+)
diff --git a/qapi/ui.json b/qapi/ui.json
index 98322342f7..316d4dc933 100644
index 0abba3e930..bf8f441227 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -310,11 +310,14 @@
@@ -33,14 +31,15 @@ index 98322342f7..316d4dc933 100644
'if': 'CONFIG_SPICE' }
diff --git a/ui/spice-core.c b/ui/spice-core.c
index b20c25aee0..26baeb7846 100644
index 37774f1c0a..367f77f2b4 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -548,6 +548,10 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
@@ -534,6 +534,11 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
micro = SPICE_SERVER_VERSION & 0xff;
info->compiled_version = g_strdup_printf("%d.%d.%d", major, minor, micro);
+ if (auth_passwd) {
+ info->has_ticket = true;
+ info->ticket = g_strdup(auth_passwd);
+ }
+

View File

@@ -15,19 +15,19 @@ Additionally, allows tracking the current position from the outside
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/channel-savevm-async.c | 183 +++++++++++++++++++++++++++++++
migration/channel-savevm-async.c | 182 +++++++++++++++++++++++++++++++
migration/channel-savevm-async.h | 51 +++++++++
migration/meson.build | 1 +
3 files changed, 235 insertions(+)
3 files changed, 234 insertions(+)
create mode 100644 migration/channel-savevm-async.c
create mode 100644 migration/channel-savevm-async.h
diff --git a/migration/channel-savevm-async.c b/migration/channel-savevm-async.c
new file mode 100644
index 0000000000..aab081ce07
index 0000000000..06d5484778
--- /dev/null
+++ b/migration/channel-savevm-async.c
@@ -0,0 +1,183 @@
@@ -0,0 +1,182 @@
+/*
+ * QIO Channel implementation to be used by savevm-async QMP calls
+ */
@@ -71,7 +71,6 @@ index 0000000000..aab081ce07
+ size_t niov,
+ int **fds,
+ size_t *nfds,
+ int flags,
+ Error **errp)
+{
+ QIOChannelSavevmAsync *saioc = QIO_CHANNEL_SAVEVM_ASYNC(ioc);
@@ -269,7 +268,7 @@ index 0000000000..17ae2cb261
+
+#endif /* QIO_CHANNEL_SAVEVM_ASYNC_H */
diff --git a/migration/meson.build b/migration/meson.build
index 0d1bb9f96e..8a142fc7a9 100644
index 690487cf1a..8cac83c06c 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -13,6 +13,7 @@ softmmu_ss.add(files(

View File

@@ -21,31 +21,29 @@ still opened by QEMU.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[SR: improve aborting
register yank before migration_incoming_state_destroy]
[improve aborting]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[FE: further improve aborting
adapt to removal of QEMUFileOps
improve condition for entering final stage
adapt to QAPI and other changes for 8.0]
improve condition for entering final stage]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hmp-commands-info.hx | 13 +
hmp-commands.hx | 17 ++
hmp-commands.hx | 33 +++
include/migration/snapshot.h | 2 +
include/monitor/hmp.h | 3 +
include/monitor/hmp.h | 5 +
migration/meson.build | 1 +
migration/savevm-async.c | 533 +++++++++++++++++++++++++++++++++++
monitor/hmp-cmds.c | 38 +++
migration/savevm-async.c | 538 +++++++++++++++++++++++++++++++++++
monitor/hmp-cmds.c | 57 ++++
qapi/migration.json | 34 +++
qapi/misc.json | 16 ++
qapi/misc.json | 32 +++
qemu-options.hx | 12 +
softmmu/vl.c | 10 +
11 files changed, 679 insertions(+)
11 files changed, 737 insertions(+)
create mode 100644 migration/savevm-async.c
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 47d63d26db..a166bff3d5 100644
index 754b1e8408..489c524e9e 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -540,6 +540,19 @@ SRST
@@ -69,11 +67,11 @@ index 47d63d26db..a166bff3d5 100644
.name = "balloon",
.args_type = "",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index bb85ee1d26..d9f9f42d11 100644
index 673e39a697..039be0033d 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1846,3 +1846,20 @@ SRST
List event channels in the guest
@@ -1815,3 +1815,36 @@ SRST
Dump the FDT in dtb format to *filename*.
ERST
#endif
+
@@ -86,6 +84,22 @@ index bb85ee1d26..d9f9f42d11 100644
+ },
+
+ {
+ .name = "snapshot-drive",
+ .args_type = "device:s,name:s",
+ .params = "device name",
+ .help = "Create internal snapshot.",
+ .cmd = hmp_snapshot_drive,
+ },
+
+ {
+ .name = "delete-drive-snapshot",
+ .args_type = "device:s,name:s",
+ .params = "device name",
+ .help = "Delete internal snapshot.",
+ .cmd = hmp_delete_drive_snapshot,
+ },
+
+ {
+ .name = "savevm-end",
+ .args_type = "",
+ .params = "",
@@ -105,10 +119,10 @@ index e72083b117..c846d37806 100644
+
#endif
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index fdb69b7f9c..fdf6b45fb8 100644
index dfbc0c9a2f..440f86aba8 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -28,6 +28,7 @@ void hmp_info_status(Monitor *mon, const QDict *qdict);
@@ -27,6 +27,7 @@ void hmp_info_status(Monitor *mon, const QDict *qdict);
void hmp_info_uuid(Monitor *mon, const QDict *qdict);
void hmp_info_chardev(Monitor *mon, const QDict *qdict);
void hmp_info_mice(Monitor *mon, const QDict *qdict);
@@ -116,33 +130,35 @@ index fdb69b7f9c..fdf6b45fb8 100644
void hmp_info_migrate(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict);
@@ -94,6 +95,8 @@ void hmp_closefd(Monitor *mon, const QDict *qdict);
void hmp_mouse_move(Monitor *mon, const QDict *qdict);
void hmp_mouse_button(Monitor *mon, const QDict *qdict);
void hmp_mouse_set(Monitor *mon, const QDict *qdict);
@@ -81,6 +82,10 @@ void hmp_netdev_add(Monitor *mon, const QDict *qdict);
void hmp_netdev_del(Monitor *mon, const QDict *qdict);
void hmp_getfd(Monitor *mon, const QDict *qdict);
void hmp_closefd(Monitor *mon, const QDict *qdict);
+void hmp_savevm_start(Monitor *mon, const QDict *qdict);
+void hmp_snapshot_drive(Monitor *mon, const QDict *qdict);
+void hmp_delete_drive_snapshot(Monitor *mon, const QDict *qdict);
+void hmp_savevm_end(Monitor *mon, const QDict *qdict);
void hmp_sendkey(Monitor *mon, const QDict *qdict);
void coroutine_fn hmp_screendump(Monitor *mon, const QDict *qdict);
void hmp_chardev_add(Monitor *mon, const QDict *qdict);
diff --git a/migration/meson.build b/migration/meson.build
index 8a142fc7a9..a7824b5266 100644
index 8cac83c06c..0842d00cd2 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -25,6 +25,7 @@ softmmu_ss.add(files(
@@ -24,6 +24,7 @@ softmmu_ss.add(files(
'multifd-zlib.c',
'postcopy-ram.c',
'savevm.c',
+ 'savevm-async.c',
'socket.c',
'tls.c',
'threadinfo.c',
), gnutls)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
index 0000000000..aa2017d496
index 0000000000..dc30558713
--- /dev/null
+++ b/migration/savevm-async.c
@@ -0,0 +1,533 @@
@@ -0,0 +1,538 @@
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "migration/migration.h"
@@ -165,7 +181,6 @@ index 0000000000..aa2017d496
+#include "qemu/timer.h"
+#include "qemu/main-loop.h"
+#include "qemu/rcu.h"
+#include "qemu/yank.h"
+
+/* #define DEBUG_SAVEVM_STATE */
+
@@ -216,20 +231,24 @@ index 0000000000..aa2017d496
+ info->bytes = s->bs_pos;
+ switch (s->state) {
+ case SAVE_STATE_ERROR:
+ info->has_status = true;
+ info->status = g_strdup("failed");
+ info->has_total_time = true;
+ info->total_time = s->total_time;
+ if (s->error) {
+ info->has_error = true;
+ info->error = g_strdup(error_get_pretty(s->error));
+ }
+ break;
+ case SAVE_STATE_ACTIVE:
+ info->has_status = true;
+ info->status = g_strdup("active");
+ info->has_total_time = true;
+ info->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME)
+ - s->total_time;
+ break;
+ case SAVE_STATE_COMPLETED:
+ info->has_status = true;
+ info->status = g_strdup("completed");
+ info->has_total_time = true;
+ info->total_time = s->total_time;
@@ -275,7 +294,7 @@ index 0000000000..aa2017d496
+ return ret;
+}
+
+static void G_GNUC_PRINTF(1, 2) save_snapshot_error(const char *fmt, ...)
+static void save_snapshot_error(const char *fmt, ...)
+{
+ va_list ap;
+ char *msg;
@@ -386,21 +405,14 @@ index 0000000000..aa2017d496
+ }
+
+ while (snap_state.state == SAVE_STATE_ACTIVE) {
+ uint64_t pending_size, pend_precopy, pend_postcopy;
+ uint64_t threshold = 400 * 1000;
+ uint64_t pending_size, pend_precopy, pend_compatible, pend_postcopy;
+
+ /*
+ * pending_{estimate,exact} are expected to be called without iothread
+ * lock. Similar to what is done in migration.c, call the exact variant
+ * only once pend_precopy in the estimate is below the threshold.
+ */
+ /* pending is expected to be called without iothread lock */
+ qemu_mutex_unlock_iothread();
+ qemu_savevm_state_pending_estimate(&pend_precopy, &pend_postcopy);
+ if (pend_precopy <= threshold) {
+ qemu_savevm_state_pending_exact(&pend_precopy, &pend_postcopy);
+ }
+ qemu_savevm_state_pending(snap_state.file, 0, &pend_precopy, &pend_compatible, &pend_postcopy);
+ qemu_mutex_lock_iothread();
+ pending_size = pend_precopy + pend_postcopy;
+
+ pending_size = pend_precopy + pend_compatible + pend_postcopy;
+
+ /*
+ * A guest reaching this cutoff is dirtying lots of RAM. It should be
@@ -411,7 +423,7 @@ index 0000000000..aa2017d496
+ maxlen = blk_getlength(snap_state.target) - 100*1024*1024;
+
+ /* Note that there is no progress for pend_postcopy when iterating */
+ if (pend_precopy > threshold && snap_state.bs_pos + pending_size < maxlen) {
+ if (pending_size - pend_postcopy > 400000 && snap_state.bs_pos + pending_size < maxlen) {
+ ret = qemu_savevm_state_iterate(snap_state.file, false);
+ if (ret < 0) {
+ save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
@@ -453,9 +465,7 @@ index 0000000000..aa2017d496
+ if (bs_ctx != qemu_get_aio_context()) {
+ DPRINTF("savevm: async flushing drive %s\n", bs->filename);
+ aio_co_reschedule_self(bs_ctx);
+ bdrv_graph_co_rdlock();
+ bdrv_flush(bs);
+ bdrv_graph_co_rdunlock();
+ aio_co_reschedule_self(qemu_get_aio_context());
+ }
+ }
@@ -466,7 +476,7 @@ index 0000000000..aa2017d496
+ qemu_bh_schedule(snap_state.finalize_bh);
+}
+
+void qmp_savevm_start(const char *statefile, Error **errp)
+void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
+{
+ Error *local_err = NULL;
+ MigrationState *ms = migrate_get_current();
@@ -503,7 +513,7 @@ index 0000000000..aa2017d496
+ snap_state.error = NULL;
+ }
+
+ if (!statefile) {
+ if (!has_statefile) {
+ vm_stop(RUN_STATE_SAVE_VM);
+ snap_state.state = SAVE_STATE_COMPLETED;
+ return;
@@ -539,7 +549,6 @@ index 0000000000..aa2017d496
+ */
+ migrate_init(ms);
+ memset(&ram_counters, 0, sizeof(ram_counters));
+ memset(&compression_counters, 0, sizeof(compression_counters));
+ ms->to_dst_file = snap_state.file;
+
+ error_setg(&snap_state.blocker, "block device is in use by savevm");
@@ -622,6 +631,22 @@ index 0000000000..aa2017d496
+ DPRINTF("savevm-end: cleanup done\n");
+}
+
+// FIXME: Deprecated
+void qmp_snapshot_drive(const char *device, const char *name, Error **errp)
+{
+ // Compatibility to older qemu-server.
+ qmp_blockdev_snapshot_internal_sync(device, name, errp);
+}
+
+// FIXME: Deprecated
+void qmp_delete_drive_snapshot(const char *device, const char *name,
+ Error **errp)
+{
+ // Compatibility to older qemu-server.
+ (void)qmp_blockdev_snapshot_delete_internal_sync(device, false, NULL,
+ true, name, errp);
+}
+
+int load_snapshot_from_blockdev(const char *filename, Error **errp)
+{
+ BlockBackend *be;
@@ -656,10 +681,6 @@ index 0000000000..aa2017d496
+ dirty_bitmap_mig_before_vm_start();
+
+ qemu_fclose(f);
+
+ /* state_destroy assumes a real migration which would have added a yank */
+ yank_register_instance(MIGRATION_YANK_INSTANCE, &error_abort);
+
+ migration_incoming_state_destroy();
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "Error while loading VM state");
@@ -677,28 +698,39 @@ index 0000000000..aa2017d496
+ return ret;
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 6c559b48c8..91be698308 100644
index 480b798963..cfebfd1db5 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -22,6 +22,7 @@
#include "monitor/monitor-internal.h"
#include "qapi/error.h"
#include "qapi/qapi-commands-control.h"
+#include "qapi/qapi-commands-migration.h"
#include "qapi/qapi-commands-misc.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qerror.h"
@@ -443,3 +444,40 @@ void hmp_info_mtree(Monitor *mon, const QDict *qdict)
mtree_info(flatview, dispatch_tree, owner, disabled);
@@ -1906,6 +1906,63 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
hmp_handle_error(mon, err);
}
+
+void hmp_savevm_start(Monitor *mon, const QDict *qdict)
+{
+ Error *errp = NULL;
+ const char *statefile = qdict_get_try_str(qdict, "statefile");
+
+ qmp_savevm_start(statefile, &errp);
+ qmp_savevm_start(statefile != NULL, statefile, &errp);
+ hmp_handle_error(mon, errp);
+}
+
+void hmp_snapshot_drive(Monitor *mon, const QDict *qdict)
+{
+ Error *errp = NULL;
+ const char *name = qdict_get_str(qdict, "name");
+ const char *device = qdict_get_str(qdict, "device");
+
+ qmp_snapshot_drive(device, name, &errp);
+ hmp_handle_error(mon, errp);
+}
+
+void hmp_delete_drive_snapshot(Monitor *mon, const QDict *qdict)
+{
+ Error *errp = NULL;
+ const char *name = qdict_get_str(qdict, "name");
+ const char *device = qdict_get_str(qdict, "device");
+
+ qmp_delete_drive_snapshot(device, name, &errp);
+ hmp_handle_error(mon, errp);
+}
+
@@ -715,7 +747,7 @@ index 6c559b48c8..91be698308 100644
+ SaveVMInfo *info;
+ info = qmp_query_savevm(NULL);
+
+ if (info->status) {
+ if (info->has_status) {
+ monitor_printf(mon, "savevm status: %s\n", info->status);
+ monitor_printf(mon, "total time: %" PRIu64 " milliseconds\n",
+ info->total_time);
@@ -725,12 +757,16 @@ index 6c559b48c8..91be698308 100644
+ if (info->has_bytes) {
+ monitor_printf(mon, "Bytes saved: %"PRIu64"\n", info->bytes);
+ }
+ if (info->error) {
+ if (info->has_error) {
+ monitor_printf(mon, "Error: %s\n", info->error);
+ }
+}
+
void hmp_info_iothreads(Monitor *mon, const QDict *qdict)
{
IOThreadInfoList *info_list = qmp_query_iothreads(NULL);
diff --git a/qapi/migration.json b/qapi/migration.json
index c84fa10e86..1702b92553 100644
index 88ecf86ac8..4435866379 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -261,6 +261,40 @@
@@ -775,10 +811,10 @@ index c84fa10e86..1702b92553 100644
# @query-migrate:
#
diff --git a/qapi/misc.json b/qapi/misc.json
index 6ddd16ea28..e5681ae8a2 100644
index 27ef5a2b20..b3ce75dcae 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -469,6 +469,22 @@
@@ -435,6 +435,38 @@
##
{ 'command': 'query-fdsets', 'returns': ['FdsetInfo'] }
@@ -791,6 +827,22 @@ index 6ddd16ea28..e5681ae8a2 100644
+{ 'command': 'savevm-start', 'data': { '*statefile': 'str' } }
+
+##
+# @snapshot-drive:
+#
+# Create an internal drive snapshot.
+#
+##
+{ 'command': 'snapshot-drive', 'data': { 'device': 'str', 'name': 'str' } }
+
+##
+# @delete-drive-snapshot:
+#
+# Delete a drive snapshot.
+#
+##
+{ 'command': 'delete-drive-snapshot', 'data': { 'device': 'str', 'name': 'str' } }
+
+##
+# @savevm-end:
+#
+# Resume VM after a snapshot.
@@ -802,10 +854,10 @@ index 6ddd16ea28..e5681ae8a2 100644
# @CommandLineParameterType:
#
diff --git a/qemu-options.hx b/qemu-options.hx
index fdddfab6ff..fdd551c2bb 100644
index 7f798ce47e..9e3de34143 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4398,6 +4398,18 @@ SRST
@@ -4423,6 +4423,18 @@ SRST
Start right away with a saved state (``loadvm`` in monitor)
ERST
@@ -825,7 +877,7 @@ index fdddfab6ff..fdd551c2bb 100644
DEF("daemonize", 0, QEMU_OPTION_daemonize, \
"-daemonize daemonize QEMU after initializing\n", QEMU_ARCH_ALL)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index ea20b23e4c..0eabc71b68 100644
index 7aa3eb5cf9..c94fe3d778 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -164,6 +164,7 @@ static const char *accelerators;
@@ -836,7 +888,7 @@ index ea20b23e4c..0eabc71b68 100644
static QTAILQ_HEAD(, ObjectOption) object_opts = QTAILQ_HEAD_INITIALIZER(object_opts);
static QTAILQ_HEAD(, DeviceOption) device_opts = QTAILQ_HEAD_INITIALIZER(device_opts);
static int display_remote;
@@ -2612,6 +2613,12 @@ void qmp_x_exit_preconfig(Error **errp)
@@ -2615,6 +2616,12 @@ void qmp_x_exit_preconfig(Error **errp)
if (loadvm) {
load_snapshot(loadvm, NULL, false, NULL, &error_fatal);

View File

@@ -19,7 +19,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
3 files changed, 38 insertions(+), 18 deletions(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 102ab3b439..5ced17aba4 100644
index 2d5f74ffc2..9fd97e6fe1 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -31,8 +31,8 @@
@@ -178,7 +178,7 @@ index 102ab3b439..5ced17aba4 100644
if (blen < compressBound(size)) {
return -1;
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 9d0155a2a1..cc06240e8d 100644
index fa13d04d78..914f1a63a8 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -63,7 +63,9 @@ typedef struct QEMUFileHooks {
@@ -192,10 +192,10 @@ index 9d0155a2a1..cc06240e8d 100644
int qemu_fclose(QEMUFile *f);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index aa2017d496..b97f2c4f14 100644
index dc30558713..a38e7351c1 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -380,7 +380,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)
@@ -374,7 +374,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
&snap_state.bs_pos));
@@ -204,7 +204,7 @@ index aa2017d496..b97f2c4f14 100644
if (!snap_state.file) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
@@ -498,7 +498,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
@@ -507,7 +507,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
blk_op_block_all(be, blocker);
/* restore the VM state */

View File

@@ -4,19 +4,19 @@ Date: Mon, 6 Apr 2020 12:16:47 +0200
Subject: [PATCH] PVE: block: add the zeroinit block driver filter
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
[adapt to changed function signatures]
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/meson.build | 1 +
block/zeroinit.c | 200 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 201 insertions(+)
block/zeroinit.c | 198 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 199 insertions(+)
create mode 100644 block/zeroinit.c
diff --git a/block/meson.build b/block/meson.build
index 382bec0e7d..253fe49fa2 100644
index b7c68b83a3..020a89ae07 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -44,6 +44,7 @@ block_ss.add(files(
@@ -43,6 +43,7 @@ block_ss.add(files(
'vmdk.c',
'vpc.c',
'write-threshold.c',
@@ -26,10 +26,10 @@ index 382bec0e7d..253fe49fa2 100644
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
diff --git a/block/zeroinit.c b/block/zeroinit.c
new file mode 100644
index 0000000000..1257342724
index 0000000000..b60e1b84dc
--- /dev/null
+++ b/block/zeroinit.c
@@ -0,0 +1,200 @@
@@ -0,0 +1,198 @@
+/*
+ * Filter to fake a zero-initialized block device.
+ *
@@ -43,7 +43,6 @@ index 0000000000..1257342724
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "block/block_int.h"
+#include "block/block-io.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/cutils.h"
@@ -137,9 +136,9 @@ index 0000000000..1257342724
+ (void)s;
+}
+
+static coroutine_fn int64_t zeroinit_co_getlength(BlockDriverState *bs)
+static int64_t zeroinit_getlength(BlockDriverState *bs)
+{
+ return bdrv_co_getlength(bs->file->bs);
+ return bdrv_getlength(bs->file->bs);
+}
+
+static int coroutine_fn zeroinit_co_preadv(BlockDriverState *bs,
@@ -191,10 +190,9 @@ index 0000000000..1257342724
+ return bdrv_co_truncate(bs->file, offset, exact, prealloc, req_flags, errp);
+}
+
+static coroutine_fn int zeroinit_co_get_info(BlockDriverState *bs,
+ BlockDriverInfo *bdi)
+static int zeroinit_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+ return bdrv_co_get_info(bs->file->bs, bdi);
+ return bdrv_get_info(bs->file->bs, bdi);
+}
+
+static BlockDriver bdrv_zeroinit = {
@@ -205,7 +203,7 @@ index 0000000000..1257342724
+ .bdrv_parse_filename = zeroinit_parse_filename,
+ .bdrv_file_open = zeroinit_open,
+ .bdrv_close = zeroinit_close,
+ .bdrv_co_getlength = zeroinit_co_getlength,
+ .bdrv_getlength = zeroinit_getlength,
+ .bdrv_child_perm = bdrv_default_perms,
+ .bdrv_co_flush_to_disk = zeroinit_co_flush,
+
@@ -221,7 +219,7 @@ index 0000000000..1257342724
+ .bdrv_co_pdiscard = zeroinit_co_pdiscard,
+
+ .bdrv_co_truncate = zeroinit_co_truncate,
+ .bdrv_co_get_info = zeroinit_co_get_info,
+ .bdrv_get_info = zeroinit_get_info,
+};
+
+static void bdrv_zeroinit_init(void)

View File

@@ -14,10 +14,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 11 insertions(+)
diff --git a/qemu-options.hx b/qemu-options.hx
index fdd551c2bb..4eb43b7bc5 100644
index 9e3de34143..1ff8905127 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1162,6 +1162,9 @@ legacy PC, they are not recommended for modern configurations.
@@ -1159,6 +1159,9 @@ legacy PC, they are not recommended for modern configurations.
ERST
@@ -28,10 +28,10 @@ index fdd551c2bb..4eb43b7bc5 100644
"-fda/-fdb file use 'file' as floppy disk 0/1 image\n", QEMU_ARCH_ALL)
DEF("fdb", HAS_ARG, QEMU_OPTION_fdb, "", QEMU_ARCH_ALL)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 0eabc71b68..323f6a23d4 100644
index c94fe3d778..a6f7a422ec 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2648,6 +2648,7 @@ void qemu_init(int argc, char **argv)
@@ -2651,6 +2651,7 @@ void qemu_init(int argc, char **argv)
MachineClass *machine_class;
bool userconfig = true;
FILE *vmstate_dump_file = NULL;

View File

@@ -11,10 +11,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 4a34f03047..59b917e50c 100644
index 2a20982066..7968ad5a93 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -252,6 +252,15 @@ static void apic_reset_common(DeviceState *dev)
@@ -278,6 +278,15 @@ static void apic_reset_common(DeviceState *dev)
info->vapic_base_update(s);
apic_init_reset(dev);

View File

@@ -13,10 +13,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 42 insertions(+), 20 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 9681bd0434..044890822d 100644
index 9a16d86344..bd68df57ad 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2483,6 +2483,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2487,6 +2487,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
int fd;
uint64_t perm, shared;
int result = 0;
@@ -24,7 +24,7 @@ index 9681bd0434..044890822d 100644
/* Validate options and set default values */
assert(options->driver == BLOCKDEV_DRIVER_FILE);
@@ -2523,19 +2524,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2527,19 +2528,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
@@ -59,7 +59,7 @@ index 9681bd0434..044890822d 100644
}
/* Clear the file by truncating it to 0 */
@@ -2589,13 +2593,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2593,13 +2597,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
}
out_unlock:
@@ -82,7 +82,7 @@ index 9681bd0434..044890822d 100644
}
out_close:
@@ -2619,6 +2625,7 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -2624,6 +2630,7 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
@@ -90,7 +90,7 @@ index 9681bd0434..044890822d 100644
/* Skip file: protocol prefix */
strstart(filename, "file:", &filename);
@@ -2641,6 +2648,18 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -2646,6 +2653,18 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
return -EINVAL;
}
@@ -109,7 +109,7 @@ index 9681bd0434..044890822d 100644
options = (BlockdevCreateOptions) {
.driver = BLOCKDEV_DRIVER_FILE,
.u.file = {
@@ -2652,6 +2671,8 @@ raw_co_create_opts(BlockDriver *drv, const char *filename,
@@ -2657,6 +2676,8 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
.nocow = nocow,
.has_extent_size_hint = has_extent_size_hint,
.extent_size_hint = extent_size_hint,
@@ -119,10 +119,10 @@ index 9681bd0434..044890822d 100644
};
return raw_co_create(&options, errp);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3c945c1f93..542add004b 100644
index 7daaf545be..9e902b96bb 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4740,7 +4740,8 @@
@@ -4624,7 +4624,8 @@
'size': 'size',
'*preallocation': 'PreallocMode',
'*nocow': 'bool',

View File

@@ -18,10 +18,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 6b8cfcf6d8..3ec67e32d3 100644
index cc1407e4ac..c34fa2e0e3 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -519,8 +519,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
@@ -502,8 +502,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
qemu_chr_fe_set_echo(&mon->common.chr, true);
/* Note: we run QMP monitor in I/O thread when @chr supports that */

View File

@@ -26,10 +26,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 2f6ccf5623..a5927e92f1 100644
index 19f42450f5..39ef2b6fe6 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -142,7 +142,8 @@ GlobalProperty hw_compat_4_0[] = {
@@ -135,7 +135,8 @@ GlobalProperty hw_compat_4_0[] = {
{ "virtio-vga", "edid", "false" },
{ "virtio-gpu-device", "edid", "false" },
{ "virtio-device", "use-started", "false" },

View File

@@ -11,33 +11,32 @@ and only if 'is-current').
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to QAPI changes]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/core/machine-qmp-cmds.c | 5 +++++
hw/core/machine-qmp-cmds.c | 6 ++++++
include/hw/boards.h | 2 ++
qapi/machine.json | 4 +++-
softmmu/vl.c | 25 +++++++++++++++++++++++++
4 files changed, 35 insertions(+), 1 deletion(-)
4 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 24595f618c..ee9cb0cd04 100644
index 76fff60a6b..ec9201fb9a 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -107,6 +107,11 @@ MachineInfoList *qmp_query_machines(Error **errp)
@@ -103,6 +103,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
if (strcmp(mc->name, MACHINE_GET_CLASS(current_machine)->name) == 0) {
info->has_is_current = true;
info->is_current = true;
+
+ // PVE version string only exists for current machine
+ if (mc->pve_version) {
+ info->has_pve_version = true;
+ info->pve_version = g_strdup(mc->pve_version);
+ }
}
if (mc->default_cpu_type) {
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 6fbbfd56c8..61a526e97d 100644
index ca2f0d3592..acc3b62b6e 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -232,6 +232,8 @@ struct MachineClass {
@@ -50,32 +49,32 @@ index 6fbbfd56c8..61a526e97d 100644
void (*reset)(MachineState *state, ShutdownCause reason);
void (*wakeup)(MachineState *state);
diff --git a/qapi/machine.json b/qapi/machine.json
index c904280085..47f3facdb2 100644
index 9156103c8f..f4fb1b2c9c 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -159,6 +159,8 @@
@@ -157,6 +157,8 @@
#
# @acpi: machine type supports ACPI (since 8.0)
# @default-ram-id: the default ID of initial RAM memory backend (since 5.2)
#
+# @pve-version: custom PVE version suffix specified as 'machine+pveN'
+#
# Since: 1.2
##
{ 'struct': 'MachineInfo',
@@ -166,7 +168,7 @@
@@ -164,7 +166,7 @@
'*is-default': 'bool', '*is-current': 'bool', 'cpu-max': 'int',
'hotpluggable-cpus': 'bool', 'numa-mem-supported': 'bool',
'deprecated': 'bool', '*default-cpu-type': 'str',
- '*default-ram-id': 'str', 'acpi': 'bool' } }
+ '*default-ram-id': 'str', 'acpi': 'bool', '*pve-version': 'str' } }
- '*default-ram-id': 'str' } }
+ '*default-ram-id': 'str', '*pve-version': 'str' } }
##
# @query-machines:
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 323f6a23d4..25abdc9da7 100644
index a6f7a422ec..8b0b35b6b4 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -1578,6 +1578,7 @@ static const QEMUOption *lookup_opt(int argc, char **argv,
@@ -1582,6 +1582,7 @@ static const QEMUOption *lookup_opt(int argc, char **argv,
static MachineClass *select_machine(QDict *qdict, Error **errp)
{
const char *optarg = qdict_get_try_str(qdict, "type");
@@ -83,7 +82,7 @@ index 323f6a23d4..25abdc9da7 100644
GSList *machines = object_class_get_list(TYPE_MACHINE, false);
MachineClass *machine_class;
Error *local_err = NULL;
@@ -1595,6 +1596,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
@@ -1599,6 +1600,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
}
}

View File

@@ -25,7 +25,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index db3791f4d1..39410dcf8d 100644
index 6a9ad97a53..9b0151c5be 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -237,8 +237,8 @@ static void backup_init_bcs_bitmap(BackupBlockJob *job)
@@ -48,7 +48,7 @@ index db3791f4d1..39410dcf8d 100644
if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
int64_t offset = 0;
int64_t count;
@@ -495,6 +493,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
@@ -492,6 +490,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
&error_abort);

View File

@@ -3,33 +3,27 @@ From: Dietmar Maurer <dietmar@proxmox.com>
Date: Mon, 6 Apr 2020 12:16:57 +0200
Subject: [PATCH] PVE-Backup: add vma backup format code
Notes about partial restoring: skipping a certain drive is done via a
map line of the form skip=drive-scsi0. Since in PVE, most archives are
compressed and piped to vma for restore, it's not easily possible to
skip reads.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: improvements during create
allow partial restore]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
[FE: create: register all streams before entering coroutines]
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/meson.build | 2 +
meson.build | 5 +
vma-reader.c | 867 ++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 793 ++++++++++++++++++++++++++++++++++++++++
vma.c | 900 ++++++++++++++++++++++++++++++++++++++++++++++
vma-reader.c | 859 ++++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 791 ++++++++++++++++++++++++++++++++++++++++++
vma.c | 849 +++++++++++++++++++++++++++++++++++++++++++++
vma.h | 150 ++++++++
6 files changed, 2717 insertions(+)
6 files changed, 2656 insertions(+)
create mode 100644 vma-reader.c
create mode 100644 vma-writer.c
create mode 100644 vma.c
create mode 100644 vma.h
diff --git a/block/meson.build b/block/meson.build
index 253fe49fa2..744b698a82 100644
index 020a89ae07..4feae20e37 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -47,6 +47,8 @@ block_ss.add(files(
@@ -46,6 +46,8 @@ block_ss.add(files(
'zeroinit.c',
), zstd, zlib, gnutls)
@@ -39,10 +33,10 @@ index 253fe49fa2..744b698a82 100644
softmmu_ss.add(files('block-ram-registrar.c'))
diff --git a/meson.build b/meson.build
index 30447cfaef..38a4e2bcef 100644
index 787f91855e..a496f6fe10 100644
--- a/meson.build
+++ b/meson.build
@@ -1527,6 +1527,8 @@ keyutils = dependency('libkeyutils', required: false,
@@ -1525,6 +1525,8 @@ keyutils = dependency('libkeyutils', required: false,
has_gettid = cc.has_function('gettid')
@@ -51,7 +45,7 @@ index 30447cfaef..38a4e2bcef 100644
# libselinux
selinux = dependency('libselinux',
required: get_option('selinux'),
@@ -3650,6 +3652,9 @@ if have_tools
@@ -3598,6 +3600,9 @@ if have_tools
dependencies: [blockdev, qemuutil, gnutls, selinux],
install: true)
@@ -63,10 +57,10 @@ index 30447cfaef..38a4e2bcef 100644
subdir('contrib/elf2dmp')
diff --git a/vma-reader.c b/vma-reader.c
new file mode 100644
index 0000000000..81a891c6b1
index 0000000000..e65f1e8415
--- /dev/null
+++ b/vma-reader.c
@@ -0,0 +1,867 @@
@@ -0,0 +1,859 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -97,7 +91,6 @@ index 0000000000..81a891c6b1
+ bool write_zeroes;
+ unsigned long *bitmap;
+ int bitmap_size;
+ bool skip;
+} VmaRestoreState;
+
+struct VmaReader {
@@ -495,14 +488,13 @@ index 0000000000..81a891c6b1
+}
+
+static void allocate_rstate(VmaReader *vmar, guint8 dev_id,
+ BlockBackend *target, bool write_zeroes, bool skip)
+ BlockBackend *target, bool write_zeroes)
+{
+ assert(vmar);
+ assert(dev_id);
+
+ vmar->rstate[dev_id].target = target;
+ vmar->rstate[dev_id].write_zeroes = write_zeroes;
+ vmar->rstate[dev_id].skip = skip;
+
+ int64_t size = vmar->devinfo[dev_id].size;
+
@@ -517,30 +509,28 @@ index 0000000000..81a891c6b1
+}
+
+int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id, BlockBackend *target,
+ bool write_zeroes, bool skip, Error **errp)
+ bool write_zeroes, Error **errp)
+{
+ assert(vmar);
+ assert(target != NULL || skip);
+ assert(target != NULL);
+ assert(dev_id);
+ assert(vmar->rstate[dev_id].target == NULL && !vmar->rstate[dev_id].skip);
+ assert(vmar->rstate[dev_id].target == NULL);
+
+ if (target != NULL) {
+ int64_t size = blk_getlength(target);
+ int64_t size_diff = size - vmar->devinfo[dev_id].size;
+ int64_t size = blk_getlength(target);
+ int64_t size_diff = size - vmar->devinfo[dev_id].size;
+
+ /* storage types can have different size restrictions, so it
+ * is not always possible to create an image with exact size.
+ * So we tolerate a size difference up to 4MB.
+ */
+ if ((size_diff < 0) || (size_diff > 4*1024*1024)) {
+ error_setg(errp, "vma_reader_register_bs for stream %s failed - "
+ "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname,
+ size, vmar->devinfo[dev_id].size);
+ return -1;
+ }
+ /* storage types can have different size restrictions, so it
+ * is not always possible to create an image with exact size.
+ * So we tolerate a size difference up to 4MB.
+ */
+ if ((size_diff < 0) || (size_diff > 4*1024*1024)) {
+ error_setg(errp, "vma_reader_register_bs for stream %s failed - "
+ "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname,
+ size, vmar->devinfo[dev_id].size);
+ return -1;
+ }
+
+ allocate_rstate(vmar, dev_id, target, write_zeroes, skip);
+ allocate_rstate(vmar, dev_id, target, write_zeroes);
+
+ return 0;
+}
@@ -633,23 +623,19 @@ index 0000000000..81a891c6b1
+ VmaRestoreState *rstate = &vmar->rstate[dev_id];
+ BlockBackend *target = NULL;
+
+ bool skip = rstate->skip;
+
+ if (dev_id != vmar->vmstate_stream) {
+ target = rstate->target;
+ if (!verify && !target && !skip) {
+ if (!verify && !target) {
+ error_setg(errp, "got wrong dev id %d", dev_id);
+ return -1;
+ }
+
+ if (!skip) {
+ if (vma_reader_get_bitmap(rstate, cluster_num)) {
+ error_setg(errp, "found duplicated cluster %zd for stream %s",
+ cluster_num, vmar->devinfo[dev_id].devname);
+ return -1;
+ }
+ vma_reader_set_bitmap(rstate, cluster_num, 1);
+ if (vma_reader_get_bitmap(rstate, cluster_num)) {
+ error_setg(errp, "found duplicated cluster %zd for stream %s",
+ cluster_num, vmar->devinfo[dev_id].devname);
+ return -1;
+ }
+ vma_reader_set_bitmap(rstate, cluster_num, 1);
+
+ max_sector = vmar->devinfo[dev_id].size/BDRV_SECTOR_SIZE;
+ } else {
@@ -695,7 +681,7 @@ index 0000000000..81a891c6b1
+ return -1;
+ }
+
+ if (!verify && !skip) {
+ if (!verify) {
+ int nb_sectors = end_sector - sector_num;
+ if (restore_write_data(vmar, dev_id, target, vmstate_fd,
+ buf + start, sector_num, nb_sectors,
@@ -731,7 +717,7 @@ index 0000000000..81a891c6b1
+ return -1;
+ }
+
+ if (!verify && !skip) {
+ if (!verify) {
+ int nb_sectors = end_sector - sector_num;
+ if (restore_write_data(vmar, dev_id, target, vmstate_fd,
+ buf + start, sector_num,
@@ -756,7 +742,7 @@ index 0000000000..81a891c6b1
+ vmar->partial_zero_cluster_data += zero_size;
+ }
+
+ if (rstate->write_zeroes && !verify && !skip) {
+ if (rstate->write_zeroes && !verify) {
+ if (restore_write_data(vmar, dev_id, target, vmstate_fd,
+ zero_vma_block, sector_num,
+ nb_sectors, errp) < 0) {
@@ -927,7 +913,7 @@ index 0000000000..81a891c6b1
+
+ for (dev_id = 1; dev_id < 255; dev_id++) {
+ if (vma_reader_get_device_info(vmar, dev_id)) {
+ allocate_rstate(vmar, dev_id, NULL, false, false);
+ allocate_rstate(vmar, dev_id, NULL, false);
+ }
+ }
+
@@ -936,10 +922,10 @@ index 0000000000..81a891c6b1
+
diff --git a/vma-writer.c b/vma-writer.c
new file mode 100644
index 0000000000..ac7da237d0
index 0000000000..df4b20793d
--- /dev/null
+++ b/vma-writer.c
@@ -0,0 +1,793 @@
@@ -0,0 +1,791 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -1253,8 +1239,6 @@ index 0000000000..ac7da237d0
+ }
+
+ if (vmaw->fd < 0) {
+ error_free(*errp);
+ *errp = NULL;
+ error_setg(errp, "can't open file %s - %s\n", filename,
+ g_strerror(errno));
+ goto err;
@@ -1735,10 +1719,10 @@ index 0000000000..ac7da237d0
+}
diff --git a/vma.c b/vma.c
new file mode 100644
index 0000000000..347f6283ca
index 0000000000..e8dffb43e0
--- /dev/null
+++ b/vma.c
@@ -0,0 +1,900 @@
@@ -0,0 +1,849 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -1772,7 +1756,7 @@ index 0000000000..347f6283ca
+ "vma list <filename>\n"
+ "vma config <filename> [-c config]\n"
+ "vma create <filename> [-c config] pathname ...\n"
+ "vma extract <filename> [-d <drive-list>] [-r <fifo>] <targetdir>\n"
+ "vma extract <filename> [-r <fifo>] <targetdir>\n"
+ "vma verify <filename> [-v]\n"
+ ;
+
@@ -1879,7 +1863,6 @@ index 0000000000..347f6283ca
+ char *throttling_group;
+ char *cache;
+ bool write_zero;
+ bool skip;
+} RestoreMap;
+
+static bool try_parse_option(char **line, const char *optname, char **out, const char *inbuf) {
@@ -1917,10 +1900,9 @@ index 0000000000..347f6283ca
+ const char *filename;
+ const char *dirname;
+ const char *readmap = NULL;
+ gchar **drive_list = NULL;
+
+ for (;;) {
+ c = getopt(argc, argv, "hvd:r:");
+ c = getopt(argc, argv, "hvr:");
+ if (c == -1) {
+ break;
+ }
@@ -1929,9 +1911,6 @@ index 0000000000..347f6283ca
+ case 'h':
+ help();
+ break;
+ case 'd':
+ drive_list = g_strsplit(optarg, ",", 254);
+ break;
+ case 'r':
+ readmap = optarg;
+ break;
@@ -1991,61 +1970,47 @@ index 0000000000..347f6283ca
+ char *bps = NULL;
+ char *group = NULL;
+ char *cache = NULL;
+ char *devname = NULL;
+ bool skip = false;
+ uint64_t bps_value = 0;
+ const char *path = NULL;
+ bool write_zero = true;
+
+ if (!line || line[0] == '\0' || !strcmp(line, "done\n")) {
+ break;
+ }
+ int len = strlen(line);
+ if (line[len - 1] == '\n') {
+ line[len - 1] = '\0';
+ len = len - 1;
+ if (len == 0) {
+ if (len == 1) {
+ break;
+ }
+ }
+
+ if (strncmp(line, "skip", 4) == 0) {
+ if (len < 6 || line[4] != '=') {
+ g_error("read map failed - option 'skip' has no value ('%s')",
+ inbuf);
+ } else {
+ devname = line + 5;
+ skip = true;
+ while (1) {
+ if (!try_parse_option(&line, "format", &format, inbuf) &&
+ !try_parse_option(&line, "throttling.bps", &bps, inbuf) &&
+ !try_parse_option(&line, "throttling.group", &group, inbuf) &&
+ !try_parse_option(&line, "cache", &cache, inbuf))
+ {
+ break;
+ }
+ } else {
+ while (1) {
+ if (!try_parse_option(&line, "format", &format, inbuf) &&
+ !try_parse_option(&line, "throttling.bps", &bps, inbuf) &&
+ !try_parse_option(&line, "throttling.group", &group, inbuf) &&
+ !try_parse_option(&line, "cache", &cache, inbuf))
+ {
+ break;
+ }
+ }
+
+ if (bps) {
+ bps_value = verify_u64(bps);
+ g_free(bps);
+ }
+
+ if (line[0] == '0' && line[1] == ':') {
+ path = line + 2;
+ write_zero = false;
+ } else if (line[0] == '1' && line[1] == ':') {
+ path = line + 2;
+ write_zero = true;
+ } else {
+ g_error("read map failed - parse error ('%s')", inbuf);
+ }
+
+ path = extract_devname(path, &devname, -1);
+ }
+
+ uint64_t bps_value = 0;
+ if (bps) {
+ bps_value = verify_u64(bps);
+ g_free(bps);
+ }
+
+ const char *path;
+ bool write_zero;
+ if (line[0] == '0' && line[1] == ':') {
+ path = line + 2;
+ write_zero = false;
+ } else if (line[0] == '1' && line[1] == ':') {
+ path = line + 2;
+ write_zero = true;
+ } else {
+ g_error("read map failed - parse error ('%s')", inbuf);
+ }
+
+ char *devname = NULL;
+ path = extract_devname(path, &devname, -1);
+ if (!devname) {
+ g_error("read map failed - no dev name specified ('%s')",
+ inbuf);
@@ -2059,7 +2024,6 @@ index 0000000000..347f6283ca
+ map->throttling_group = group;
+ map->cache = cache;
+ map->write_zero = write_zero;
+ map->skip = skip;
+
+ g_hash_table_insert(devmap, map->devname, map);
+
@@ -2068,12 +2032,12 @@ index 0000000000..347f6283ca
+
+ int i;
+ int vmstate_fd = -1;
+ bool drive_rename_bitmap[255];
+ memset(drive_rename_bitmap, 0, sizeof(drive_rename_bitmap));
+ guint8 vmstate_stream = 0;
+
+ for (i = 1; i < 255; i++) {
+ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i);
+ if (di && (strcmp(di->devname, "vmstate") == 0)) {
+ vmstate_stream = i;
+ char *statefn = g_strdup_printf("%s/vmstate.bin", dirname);
+ vmstate_fd = open(statefn, O_WRONLY|O_CREAT|O_EXCL, 0644);
+ if (vmstate_fd < 0) {
@@ -2089,25 +2053,10 @@ index 0000000000..347f6283ca
+ const char *cache = NULL;
+ int flags = BDRV_O_RDWR;
+ bool write_zero = true;
+ bool skip = false;
+
+ BlockBackend *blk = NULL;
+
+ if (drive_list) {
+ skip = true;
+ int j;
+ for (j = 0; drive_list[j]; j++) {
+ if (strcmp(drive_list[j], di->devname) == 0) {
+ skip = false;
+ drive_rename_bitmap[i] = true;
+ break;
+ }
+ }
+ } else {
+ drive_rename_bitmap[i] = true;
+ }
+
+ if (!skip && readmap) {
+ if (readmap) {
+ RestoreMap *map;
+ map = (RestoreMap *)g_hash_table_lookup(devmap, di->devname);
+ if (map == NULL) {
@@ -2119,8 +2068,7 @@ index 0000000000..347f6283ca
+ throttling_group = map->throttling_group;
+ cache = map->cache;
+ write_zero = map->write_zero;
+ skip = map->skip;
+ } else if (!skip) {
+ } else {
+ devfn = g_strdup_printf("%s/tmp-disk-%s.raw",
+ dirname, di->devname);
+ printf("DEVINFO %s %zd\n", devfn, di->size);
@@ -2138,60 +2086,57 @@ index 0000000000..347f6283ca
+ write_zero = false;
+ }
+
+ if (!skip) {
+ size_t devlen = strlen(devfn);
+ QDict *options = NULL;
+ bool writethrough;
+ if (format) {
+ /* explicit format from commandline */
+ options = qdict_new();
+ qdict_put_str(options, "driver", format);
+ } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) ||
+ strncmp(devfn, "/dev/", 5) == 0)
+ {
+ /* This part is now deprecated for PVE as well (just as qemu
+ * deprecated not specifying an explicit raw format, too.
+ */
+ /* explicit raw format */
+ options = qdict_new();
+ qdict_put_str(options, "driver", "raw");
+ }
+
+ if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) {
+ g_error("invalid cache option: %s\n", cache);
+ }
+
+ if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) {
+ g_error("can't open file %s - %s", devfn,
+ error_get_pretty(errp));
+ }
+
+ if (cache) {
+ blk_set_enable_write_cache(blk, !writethrough);
+ }
+
+ if (throttling_group) {
+ blk_io_limits_enable(blk, throttling_group);
+ }
+
+ if (throttling_bps) {
+ if (!throttling_group) {
+ blk_io_limits_enable(blk, devfn);
+ }
+
+ ThrottleConfig cfg;
+ throttle_config_init(&cfg);
+ cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps;
+ Error *err = NULL;
+ if (!throttle_is_valid(&cfg, &err)) {
+ error_report_err(err);
+ g_error("failed to apply throttling");
+ }
+ blk_set_io_limits(blk, &cfg);
+ }
+ size_t devlen = strlen(devfn);
+ QDict *options = NULL;
+ bool writethrough;
+ if (format) {
+ /* explicit format from commandline */
+ options = qdict_new();
+ qdict_put_str(options, "driver", format);
+ } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) ||
+ strncmp(devfn, "/dev/", 5) == 0)
+ {
+ /* This part is now deprecated for PVE as well (just as qemu
+ * deprecated not specifying an explicit raw format, too.
+ */
+ /* explicit raw format */
+ options = qdict_new();
+ qdict_put_str(options, "driver", "raw");
+ }
+ if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) {
+ g_error("invalid cache option: %s\n", cache);
+ }
+
+ if (vma_reader_register_bs(vmar, i, blk, write_zero, skip, &errp) < 0) {
+ if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) {
+ g_error("can't open file %s - %s", devfn,
+ error_get_pretty(errp));
+ }
+
+ if (cache) {
+ blk_set_enable_write_cache(blk, !writethrough);
+ }
+
+ if (throttling_group) {
+ blk_io_limits_enable(blk, throttling_group);
+ }
+
+ if (throttling_bps) {
+ if (!throttling_group) {
+ blk_io_limits_enable(blk, devfn);
+ }
+
+ ThrottleConfig cfg;
+ throttle_config_init(&cfg);
+ cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps;
+ Error *err = NULL;
+ if (!throttle_is_valid(&cfg, &err)) {
+ error_report_err(err);
+ g_error("failed to apply throttling");
+ }
+ blk_set_io_limits(blk, &cfg);
+ }
+
+ if (vma_reader_register_bs(vmar, i, blk, write_zero, &errp) < 0) {
+ g_error("%s", error_get_pretty(errp));
+ }
+
@@ -2201,10 +2146,6 @@ index 0000000000..347f6283ca
+ }
+ }
+
+ if (drive_list) {
+ g_strfreev(drive_list);
+ }
+
+ if (vma_reader_restore(vmar, vmstate_fd, verbose, &errp) < 0) {
+ g_error("restore failed - %s", error_get_pretty(errp));
+ }
@@ -2212,7 +2153,7 @@ index 0000000000..347f6283ca
+ if (!readmap) {
+ for (i = 1; i < 255; i++) {
+ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i);
+ if (di && drive_rename_bitmap[i]) {
+ if (di && (i != vmstate_stream)) {
+ char *tmpfn = g_strdup_printf("%s/tmp-disk-%s.raw",
+ dirname, di->devname);
+ char *fn = g_strdup_printf("%s/disk-%s.raw",
@@ -2311,7 +2252,7 @@ index 0000000000..347f6283ca
+ struct iovec iov;
+ QEMUIOVector qiov;
+
+ int64_t start, end, readlen;
+ int64_t start, end;
+ int ret = 0;
+
+ unsigned char *buf = blk_blockalign(job->target, VMA_CLUSTER_SIZE);
@@ -2325,24 +2266,16 @@ index 0000000000..347f6283ca
+ iov.iov_len = VMA_CLUSTER_SIZE;
+ qemu_iovec_init_external(&qiov, &iov, 1);
+
+ if (start + 1 == end) {
+ memset(buf, 0, VMA_CLUSTER_SIZE);
+ readlen = job->len - start * VMA_CLUSTER_SIZE;
+ assert(readlen > 0 && readlen <= VMA_CLUSTER_SIZE);
+ } else {
+ readlen = VMA_CLUSTER_SIZE;
+ }
+
+ ret = blk_co_preadv(job->target, start * VMA_CLUSTER_SIZE,
+ readlen, &qiov, 0);
+ VMA_CLUSTER_SIZE, &qiov, 0);
+ if (ret < 0) {
+ vma_writer_set_error(job->vmaw, "read error");
+ vma_writer_set_error(job->vmaw, "read error", -1);
+ goto out;
+ }
+
+ size_t zb = 0;
+ if (vma_writer_write(job->vmaw, job->dev_id, start, buf, &zb) < 0) {
+ vma_writer_set_error(job->vmaw, "backup_dump_cb vma_writer_write failed");
+ vma_writer_set_error(job->vmaw, "backup_dump_cb vma_writer_write failed", -1);
+ goto out;
+ }
+ }
@@ -2641,7 +2574,7 @@ index 0000000000..347f6283ca
+}
diff --git a/vma.h b/vma.h
new file mode 100644
index 0000000000..86d2873aa5
index 0000000000..c895c97f6d
--- /dev/null
+++ b/vma.h
@@ -0,0 +1,150 @@
@@ -2779,7 +2712,7 @@ index 0000000000..86d2873aa5
+int coroutine_fn vma_writer_flush_output(VmaWriter *vmaw);
+
+int vma_writer_get_status(VmaWriter *vmaw, VmaStatus *status);
+void vma_writer_set_error(VmaWriter *vmaw, const char *fmt, ...) G_GNUC_PRINTF(2, 3);
+void vma_writer_set_error(VmaWriter *vmaw, const char *fmt, ...);
+
+
+VmaReader *vma_reader_create(const char *filename, Error **errp);
@@ -2789,7 +2722,7 @@ index 0000000000..86d2873aa5
+VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id);
+int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id,
+ BlockBackend *target, bool write_zeroes,
+ bool skip, Error **errp);
+ Error **errp);
+int vma_reader_restore(VmaReader *vmar, int vmstate_fd, bool verbose,
+ Error **errp);
+int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp);

View File

@@ -9,23 +9,21 @@ Subject: [PATCH] PVE-Backup: add backup-dump block driver
- job.c: make job_should_pause non-static
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to coroutine changes]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/backup-dump.c | 168 +++++++++++++++++++++++++++++++
block/backup-dump.c | 167 +++++++++++++++++++++++++++++++
block/backup.c | 30 ++----
block/meson.build | 1 +
include/block/block_int-common.h | 35 +++++++
job.c | 3 +-
5 files changed, 214 insertions(+), 23 deletions(-)
5 files changed, 213 insertions(+), 23 deletions(-)
create mode 100644 block/backup-dump.c
diff --git a/block/backup-dump.c b/block/backup-dump.c
new file mode 100644
index 0000000000..232a094426
index 0000000000..04718a94e2
--- /dev/null
+++ b/block/backup-dump.c
@@ -0,0 +1,168 @@
@@ -0,0 +1,167 @@
+/*
+ * BlockDriver to send backup data stream to a callback function
+ *
@@ -47,8 +45,7 @@ index 0000000000..232a094426
+ void *dump_cb_data;
+} BDRVBackupDumpState;
+
+static coroutine_fn int qemu_backup_dump_co_get_info(BlockDriverState *bs,
+ BlockDriverInfo *bdi)
+static int qemu_backup_dump_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+ BDRVBackupDumpState *s = bs->opaque;
+
@@ -89,7 +86,7 @@ index 0000000000..232a094426
+ /* Nothing to do. */
+}
+
+static coroutine_fn int64_t qemu_backup_dump_co_getlength(BlockDriverState *bs)
+static int64_t qemu_backup_dump_getlength(BlockDriverState *bs)
+{
+ BDRVBackupDumpState *s = bs->opaque;
+
@@ -149,8 +146,8 @@ index 0000000000..232a094426
+
+ .bdrv_close = qemu_backup_dump_close,
+ .bdrv_has_zero_init = bdrv_has_zero_init_1,
+ .bdrv_co_getlength = qemu_backup_dump_co_getlength,
+ .bdrv_co_get_info = qemu_backup_dump_co_get_info,
+ .bdrv_getlength = qemu_backup_dump_getlength,
+ .bdrv_get_info = qemu_backup_dump_get_info,
+
+ .bdrv_co_writev = qemu_backup_dump_co_writev,
+
@@ -195,7 +192,7 @@ index 0000000000..232a094426
+ return bs;
+}
diff --git a/block/backup.c b/block/backup.c
index 39410dcf8d..af87fa6aa9 100644
index 9b0151c5be..6e8f6e67b3 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -29,28 +29,6 @@
@@ -227,7 +224,7 @@ index 39410dcf8d..af87fa6aa9 100644
static const BlockJobDriver backup_job_driver;
static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
@@ -457,6 +435,14 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
@@ -454,6 +432,14 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
cluster_size = block_copy_cluster_size(bcs);
@@ -243,7 +240,7 @@ index 39410dcf8d..af87fa6aa9 100644
if (perf->max_chunk && perf->max_chunk < cluster_size) {
error_setg(errp, "Required max-chunk (%" PRIi64 ") is less than backup "
diff --git a/block/meson.build b/block/meson.build
index 744b698a82..f580f95395 100644
index 4feae20e37..0d7023fc82 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -4,6 +4,7 @@ block_ss.add(files(
@@ -255,18 +252,18 @@ index 744b698a82..f580f95395 100644
'blkdebug.c',
'blklogwrites.c',
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index f01bb8b617..d7ffd1826e 100644
index 31ae91e56e..37b64bcd93 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -26,6 +26,7 @@
#include "block/aio.h"
#include "block/block-common.h"
#include "block/accounting.h"
#include "block/block.h"
+#include "block/block-copy.h"
#include "block/block-global-state.h"
#include "block/snapshot.h"
#include "qemu/iov.h"
@@ -60,6 +61,40 @@
#include "block/aio-wait.h"
#include "qemu/queue.h"
#include "qemu/coroutine.h"
@@ -64,6 +65,40 @@
#define BLOCK_PROBE_BUF_SIZE 512

View File

@@ -5,19 +5,17 @@ Subject: [PATCH] PVE-Backup: pbs-restore - new command to restore from proxmox
backup server
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[WB: add namespace support]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
meson.build | 4 +
pbs-restore.c | 236 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 240 insertions(+)
pbs-restore.c | 223 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 227 insertions(+)
create mode 100644 pbs-restore.c
diff --git a/meson.build b/meson.build
index 443b3238f9..32ab849ce6 100644
index 406112d96f..9c46881eb7 100644
--- a/meson.build
+++ b/meson.build
@@ -3656,6 +3656,10 @@ if have_tools
@@ -3604,6 +3604,10 @@ if have_tools
vma = executable('vma', files('vma.c', 'vma-reader.c') + genh,
dependencies: [authz, block, crypto, io, qom], install: true)
@@ -30,10 +28,10 @@ index 443b3238f9..32ab849ce6 100644
subdir('contrib/elf2dmp')
diff --git a/pbs-restore.c b/pbs-restore.c
new file mode 100644
index 0000000000..f03d9bab8d
index 0000000000..2f834cf42e
--- /dev/null
+++ b/pbs-restore.c
@@ -0,0 +1,236 @@
@@ -0,0 +1,223 @@
+/*
+ * Qemu image restore helper for Proxmox Backup
+ *
@@ -65,7 +63,7 @@ index 0000000000..f03d9bab8d
+static void help(void)
+{
+ const char *help_msg =
+ "usage: pbs-restore [--repository <repo>] [--ns namespace] snapshot archive-name target [command options]\n"
+ "usage: pbs-restore [--repository <repo>] snapshot archive-name target [command options]\n"
+ ;
+
+ printf("%s", help_msg);
@@ -113,7 +111,6 @@ index 0000000000..f03d9bab8d
+ Error *main_loop_err = NULL;
+ const char *format = "raw";
+ const char *repository = NULL;
+ const char *backup_ns = NULL;
+ const char *keyfile = NULL;
+ int verbose = false;
+ bool skip_zero = false;
@@ -127,7 +124,6 @@ index 0000000000..f03d9bab8d
+ {"verbose", no_argument, 0, 'v'},
+ {"format", required_argument, 0, 'f'},
+ {"repository", required_argument, 0, 'r'},
+ {"ns", required_argument, 0, 'n'},
+ {"keyfile", required_argument, 0, 'k'},
+ {0, 0, 0, 0}
+ };
@@ -148,9 +144,6 @@ index 0000000000..f03d9bab8d
+ case 'r':
+ repository = g_strdup(argv[optind - 1]);
+ break;
+ case 'n':
+ backup_ns = g_strdup(argv[optind - 1]);
+ break;
+ case 'k':
+ keyfile = g_strdup(argv[optind - 1]);
+ break;
@@ -201,16 +194,8 @@ index 0000000000..f03d9bab8d
+ fprintf(stderr, "connecting to repository '%s'\n", repository);
+ }
+ char *pbs_error = NULL;
+ ProxmoxRestoreHandle *conn = proxmox_restore_new_ns(
+ repository,
+ snapshot,
+ backup_ns,
+ password,
+ keyfile,
+ key_password,
+ fingerprint,
+ &pbs_error
+ );
+ ProxmoxRestoreHandle *conn = proxmox_restore_new(
+ repository, snapshot, password, keyfile, key_password, fingerprint, &pbs_error);
+ if (conn == NULL) {
+ fprintf(stderr, "restore failed: %s\n", pbs_error);
+ return -1;

View File

@@ -0,0 +1,452 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Mon, 29 Jun 2020 11:06:03 +0200
Subject: [PATCH] PVE-Backup: Add dirty-bitmap tracking for incremental backups
Uses QEMU's existing MIRROR_SYNC_MODE_BITMAP and a dirty-bitmap on top
of all backed-up drives. This will only execute the data-write callback
for any changed chunks, the PBS rust code will reuse chunks from the
previous index for everything it doesn't receive if reuse_index is true.
On error or cancellation, remove all dirty bitmaps to ensure
consistency.
Add PBS/incremental specific information to query backup info QMP and
HMP commands.
Only supported for PBS backups.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 1 +
monitor/hmp-cmds.c | 45 ++++++++++----
proxmox-backup-client.c | 3 +-
proxmox-backup-client.h | 1 +
pve-backup.c | 103 ++++++++++++++++++++++++++++++---
qapi/block-core.json | 12 +++-
6 files changed, 142 insertions(+), 23 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 60fa93c85e..8b23bdedac 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1044,6 +1044,7 @@ void hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS fingerprint
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
+ false, false, // PBS incremental
true, dir ? BACKUP_FORMAT_DIR : BACKUP_FORMAT_VMA,
false, NULL, false, NULL, !!devlist,
devlist, qdict_haskey(qdict, "speed"), speed, &error);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index a40b25e906..670f783515 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -225,19 +225,42 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "End time: %s", ctime(&info->end_time));
}
- int per = (info->has_total && info->total &&
- info->has_transferred && info->transferred) ?
- (info->transferred * 100)/info->total : 0;
- int zero_per = (info->has_total && info->total &&
- info->has_zero_bytes && info->zero_bytes) ?
- (info->zero_bytes * 100)/info->total : 0;
monitor_printf(mon, "Backup file: %s\n", info->backup_file);
monitor_printf(mon, "Backup uuid: %s\n", info->uuid);
- monitor_printf(mon, "Total size: %zd\n", info->total);
- monitor_printf(mon, "Transferred bytes: %zd (%d%%)\n",
- info->transferred, per);
- monitor_printf(mon, "Zero bytes: %zd (%d%%)\n",
- info->zero_bytes, zero_per);
+
+ if (!(info->has_total && info->total)) {
+ // this should not happen normally
+ monitor_printf(mon, "Total size: %d\n", 0);
+ } else {
+ bool incremental = false;
+ size_t total_or_dirty = info->total;
+ if (info->has_transferred) {
+ if (info->has_dirty && info->dirty) {
+ if (info->dirty < info->total) {
+ total_or_dirty = info->dirty;
+ incremental = true;
+ }
+ }
+ }
+
+ int per = (info->transferred * 100)/total_or_dirty;
+
+ monitor_printf(mon, "Backup mode: %s\n", incremental ? "incremental" : "full");
+
+ int zero_per = (info->has_zero_bytes && info->zero_bytes) ?
+ (info->zero_bytes * 100)/info->total : 0;
+ monitor_printf(mon, "Total size: %zd\n", info->total);
+ monitor_printf(mon, "Transferred bytes: %zd (%d%%)\n",
+ info->transferred, per);
+ monitor_printf(mon, "Zero bytes: %zd (%d%%)\n",
+ info->zero_bytes, zero_per);
+
+ if (info->has_reused) {
+ int reused_per = (info->reused * 100)/total_or_dirty;
+ monitor_printf(mon, "Reused bytes: %zd (%d%%)\n",
+ info->reused, reused_per);
+ }
+ }
}
qapi_free_BackupStatus(info);
diff --git a/proxmox-backup-client.c b/proxmox-backup-client.c
index a8f6653a81..4ce7bc0b5e 100644
--- a/proxmox-backup-client.c
+++ b/proxmox-backup-client.c
@@ -89,6 +89,7 @@ proxmox_backup_co_register_image(
ProxmoxBackupHandle *pbs,
const char *device_name,
uint64_t size,
+ bool incremental,
Error **errp)
{
Coroutine *co = qemu_coroutine_self();
@@ -98,7 +99,7 @@ proxmox_backup_co_register_image(
int pbs_res = -1;
proxmox_backup_register_image_async(
- pbs, device_name, size ,proxmox_backup_schedule_wake, &waker, &pbs_res, &pbs_err);
+ pbs, device_name, size, incremental, proxmox_backup_schedule_wake, &waker, &pbs_res, &pbs_err);
qemu_coroutine_yield();
if (pbs_res < 0) {
if (errp) error_setg(errp, "backup register image failed: %s", pbs_err ? pbs_err : "unknown error");
diff --git a/proxmox-backup-client.h b/proxmox-backup-client.h
index 1dda8b7d8f..8cbf645b2c 100644
--- a/proxmox-backup-client.h
+++ b/proxmox-backup-client.h
@@ -32,6 +32,7 @@ proxmox_backup_co_register_image(
ProxmoxBackupHandle *pbs,
const char *device_name,
uint64_t size,
+ bool incremental,
Error **errp);
diff --git a/pve-backup.c b/pve-backup.c
index 6af212b9b4..3f97cf6532 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -28,6 +28,8 @@
*
*/
+const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
+
static struct PVEBackupState {
struct {
// Everithing accessed from qmp_backup_query command is protected using lock
@@ -39,7 +41,9 @@ static struct PVEBackupState {
uuid_t uuid;
char uuid_str[37];
size_t total;
+ size_t dirty;
size_t transferred;
+ size_t reused;
size_t zero_bytes;
} stat;
int64_t speed;
@@ -66,6 +70,7 @@ typedef struct PVEBackupDevInfo {
uint8_t dev_id;
bool completed;
char targetfile[PATH_MAX];
+ BdrvDirtyBitmap *bitmap;
BlockDriverState *target;
} PVEBackupDevInfo;
@@ -107,11 +112,12 @@ static bool pvebackup_error_or_canceled(void)
return error_or_canceled;
}
-static void pvebackup_add_transfered_bytes(size_t transferred, size_t zero_bytes)
+static void pvebackup_add_transfered_bytes(size_t transferred, size_t zero_bytes, size_t reused)
{
qemu_mutex_lock(&backup_state.stat.lock);
backup_state.stat.zero_bytes += zero_bytes;
backup_state.stat.transferred += transferred;
+ backup_state.stat.reused += reused;
qemu_mutex_unlock(&backup_state.stat.lock);
}
@@ -150,7 +156,8 @@ pvebackup_co_dump_pbs_cb(
pvebackup_propagate_error(local_err);
return pbs_res;
} else {
- pvebackup_add_transfered_bytes(size, !buf ? size : 0);
+ size_t reused = (pbs_res == 0) ? size : 0;
+ pvebackup_add_transfered_bytes(size, !buf ? size : 0, reused);
}
return size;
@@ -210,11 +217,11 @@ pvebackup_co_dump_vma_cb(
} else {
if (remaining >= VMA_CLUSTER_SIZE) {
assert(ret == VMA_CLUSTER_SIZE);
- pvebackup_add_transfered_bytes(VMA_CLUSTER_SIZE, zero_bytes);
+ pvebackup_add_transfered_bytes(VMA_CLUSTER_SIZE, zero_bytes, 0);
remaining -= VMA_CLUSTER_SIZE;
} else {
assert(ret == remaining);
- pvebackup_add_transfered_bytes(remaining, zero_bytes);
+ pvebackup_add_transfered_bytes(remaining, zero_bytes, 0);
remaining = 0;
}
}
@@ -250,6 +257,18 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
if (local_err != NULL) {
pvebackup_propagate_error(local_err);
}
+ } else {
+ // on error or cancel we cannot ensure synchronization of dirty
+ // bitmaps with backup server, so remove all and do full backup next
+ GList *l = backup_state.di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ if (di->bitmap) {
+ bdrv_release_dirty_bitmap(di->bitmap);
+ }
+ }
}
proxmox_backup_disconnect(backup_state.pbs);
@@ -305,6 +324,12 @@ static void pvebackup_complete_cb(void *opaque, int ret)
// remove self from job queue
backup_state.di_list = g_list_remove(backup_state.di_list, di);
+ if (di->bitmap && ret < 0) {
+ // on error or cancel we cannot ensure synchronization of dirty
+ // bitmaps with backup server, so remove all and do full backup next
+ bdrv_release_dirty_bitmap(di->bitmap);
+ }
+
g_free(di);
qemu_mutex_unlock(&backup_state.backup_mutex);
@@ -469,12 +494,18 @@ static bool create_backup_jobs(void) {
assert(di->target != NULL);
+ MirrorSyncMode sync_mode = MIRROR_SYNC_MODE_FULL;
+ BitmapSyncMode bitmap_mode = BITMAP_SYNC_MODE_NEVER;
+ if (di->bitmap) {
+ sync_mode = MIRROR_SYNC_MODE_BITMAP;
+ bitmap_mode = BITMAP_SYNC_MODE_ON_SUCCESS;
+ }
AioContext *aio_context = bdrv_get_aio_context(di->bs);
aio_context_acquire(aio_context);
BlockJob *job = backup_job_create(
- NULL, di->bs, di->target, backup_state.speed, MIRROR_SYNC_MODE_FULL, NULL,
- BITMAP_SYNC_MODE_NEVER, false, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
+ NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ bitmap_mode, false, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
JOB_DEFAULT, pvebackup_complete_cb, di, NULL, &local_err);
aio_context_release(aio_context);
@@ -525,6 +556,8 @@ typedef struct QmpBackupTask {
const char *fingerprint;
bool has_fingerprint;
int64_t backup_time;
+ bool has_use_dirty_bitmap;
+ bool use_dirty_bitmap;
bool has_format;
BackupFormat format;
bool has_config_file;
@@ -616,6 +649,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
size_t total = 0;
+ size_t dirty = 0;
l = di_list;
while (l) {
@@ -653,6 +687,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
int dump_cb_block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE; // Hardcoded (4M)
firewall_name = "fw.conf";
+ bool use_dirty_bitmap = task->has_use_dirty_bitmap && task->use_dirty_bitmap;
+
char *pbs_err = NULL;
pbs = proxmox_backup_new(
task->backup_file,
@@ -672,7 +708,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
goto err;
}
- if (proxmox_backup_co_connect(pbs, task->errp) < 0)
+ int connect_result = proxmox_backup_co_connect(pbs, task->errp);
+ if (connect_result < 0)
goto err;
/* register all devices */
@@ -683,9 +720,40 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *devname = bdrv_get_device_name(di->bs);
- int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, task->errp);
- if (dev_id < 0)
+ BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
+ bool expect_only_dirty = false;
+
+ if (use_dirty_bitmap) {
+ if (bitmap == NULL) {
+ bitmap = bdrv_create_dirty_bitmap(di->bs, dump_cb_block_size, PBS_BITMAP_NAME, task->errp);
+ if (!bitmap) {
+ goto err;
+ }
+ } else {
+ expect_only_dirty = proxmox_backup_check_incremental(pbs, devname, di->size) != 0;
+ }
+
+ if (expect_only_dirty) {
+ dirty += bdrv_get_dirty_count(bitmap);
+ } else {
+ /* mark entire bitmap as dirty to make full backup */
+ bdrv_set_dirty_bitmap(bitmap, 0, di->size);
+ dirty += di->size;
+ }
+ di->bitmap = bitmap;
+ } else {
+ dirty += di->size;
+
+ /* after a full backup the old dirty bitmap is invalid anyway */
+ if (bitmap != NULL) {
+ bdrv_release_dirty_bitmap(bitmap);
+ }
+ }
+
+ int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, task->errp);
+ if (dev_id < 0) {
goto err;
+ }
if (!(di->target = bdrv_backup_dump_create(dump_cb_block_size, di->size, pvebackup_co_dump_pbs_cb, di, task->errp))) {
goto err;
@@ -694,6 +762,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->dev_id = dev_id;
}
} else if (format == BACKUP_FORMAT_VMA) {
+ dirty = total;
+
vmaw = vma_writer_create(task->backup_file, uuid, &local_err);
if (!vmaw) {
if (local_err) {
@@ -721,6 +791,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
}
} else if (format == BACKUP_FORMAT_DIR) {
+ dirty = total;
+
if (mkdir(task->backup_file, 0640) != 0) {
error_setg_errno(task->errp, errno, "can't create directory '%s'\n",
task->backup_file);
@@ -793,8 +865,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
backup_state.stat.total = total;
+ backup_state.stat.dirty = dirty;
backup_state.stat.transferred = 0;
backup_state.stat.zero_bytes = 0;
+ backup_state.stat.reused = format == BACKUP_FORMAT_PBS && dirty >= total ? 0 : total - dirty;
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -818,6 +892,10 @@ err:
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
+ if (di->bitmap) {
+ bdrv_release_dirty_bitmap(di->bitmap);
+ }
+
if (di->target) {
bdrv_unref(di->target);
}
@@ -859,6 +937,7 @@ UuidInfo *qmp_backup(
bool has_fingerprint, const char *fingerprint,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
+ bool has_use_dirty_bitmap, bool use_dirty_bitmap,
bool has_format, BackupFormat format,
bool has_config_file, const char *config_file,
bool has_firewall_file, const char *firewall_file,
@@ -877,6 +956,8 @@ UuidInfo *qmp_backup(
.backup_id = backup_id,
.has_backup_time = has_backup_time,
.backup_time = backup_time,
+ .has_use_dirty_bitmap = has_use_dirty_bitmap,
+ .use_dirty_bitmap = use_dirty_bitmap,
.has_format = has_format,
.format = format,
.has_config_file = has_config_file,
@@ -945,10 +1026,14 @@ BackupStatus *qmp_query_backup(Error **errp)
info->has_total = true;
info->total = backup_state.stat.total;
+ info->has_dirty = true;
+ info->dirty = backup_state.stat.dirty;
info->has_zero_bytes = true;
info->zero_bytes = backup_state.stat.zero_bytes;
info->has_transferred = true;
info->transferred = backup_state.stat.transferred;
+ info->has_reused = true;
+ info->reused = backup_state.stat.reused;
qemu_mutex_unlock(&backup_state.stat.lock);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index c3b6b93472..992e6c1e3f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -753,8 +753,13 @@
#
# @total: total amount of bytes involved in the backup process
#
+# @dirty: with incremental mode (PBS) this is the amount of bytes involved
+# in the backup process which are marked dirty.
+#
# @transferred: amount of bytes already backed up.
#
+# @reused: amount of bytes reused due to deduplication.
+#
# @zero-bytes: amount of 'zero' bytes detected.
#
# @start-time: time (epoch) when backup job started.
@@ -767,8 +772,8 @@
#
##
{ 'struct': 'BackupStatus',
- 'data': {'*status': 'str', '*errmsg': 'str', '*total': 'int',
- '*transferred': 'int', '*zero-bytes': 'int',
+ 'data': {'*status': 'str', '*errmsg': 'str', '*total': 'int', '*dirty': 'int',
+ '*transferred': 'int', '*zero-bytes': 'int', '*reused': 'int',
'*start-time': 'int', '*end-time': 'int',
'*backup-file': 'str', '*uuid': 'str' } }
@@ -811,6 +816,8 @@
#
# @backup-time: backup timestamp (Unix epoch, required for format 'pbs')
#
+# @use-dirty-bitmap: use dirty bitmap to detect incremental changes since last job (optional for format 'pbs')
+#
# Returns: the uuid of the backup job
#
##
@@ -821,6 +828,7 @@
'*fingerprint': 'str',
'*backup-id': 'str',
'*backup-time': 'int',
+ '*use-dirty-bitmap': 'bool',
'*format': 'BackupFormat',
'*config-file': 'str',
'*firewall-file': 'str',

View File

@@ -0,0 +1,219 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dietmar Maurer <dietmar@proxmox.com>
Date: Thu, 9 Jul 2020 12:53:08 +0200
Subject: [PATCH] PVE: various PBS fixes
pbs: fix crypt and compress parameters
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
PVE: handle PBS write callback with big blocks correctly
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
PVE: add zero block handling to PBS dump callback
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 4 ++-
pve-backup.c | 57 +++++++++++++++++++++++++++-------
qapi/block-core.json | 6 ++++
3 files changed, 54 insertions(+), 13 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 8b23bdedac..f59b02592e 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1044,7 +1044,9 @@ void hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS fingerprint
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
- false, false, // PBS incremental
+ false, false, // PBS use-dirty-bitmap
+ false, false, // PBS compress
+ false, false, // PBS encrypt
true, dir ? BACKUP_FORMAT_DIR : BACKUP_FORMAT_VMA,
false, NULL, false, NULL, !!devlist,
devlist, qdict_haskey(qdict, "speed"), speed, &error);
diff --git a/pve-backup.c b/pve-backup.c
index 3f97cf6532..a275a1d4e1 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -8,6 +8,7 @@
#include "block/blockjob.h"
#include "qapi/qapi-commands-block.h"
#include "qapi/qmp/qerror.h"
+#include "qemu/cutils.h"
/* PVE backup state and related function */
@@ -67,6 +68,7 @@ opts_init(pvebackup_init);
typedef struct PVEBackupDevInfo {
BlockDriverState *bs;
size_t size;
+ uint64_t block_size;
uint8_t dev_id;
bool completed;
char targetfile[PATH_MAX];
@@ -137,10 +139,13 @@ pvebackup_co_dump_pbs_cb(
PVEBackupDevInfo *di = opaque;
assert(backup_state.pbs);
+ assert(buf);
Error *local_err = NULL;
int pbs_res = -1;
+ bool is_zero_block = size == di->block_size && buffer_is_zero(buf, size);
+
qemu_co_mutex_lock(&backup_state.dump_callback_mutex);
// avoid deadlock if job is cancelled
@@ -149,17 +154,29 @@ pvebackup_co_dump_pbs_cb(
return -1;
}
- pbs_res = proxmox_backup_co_write_data(backup_state.pbs, di->dev_id, buf, start, size, &local_err);
- qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ uint64_t transferred = 0;
+ uint64_t reused = 0;
+ while (transferred < size) {
+ uint64_t left = size - transferred;
+ uint64_t to_transfer = left < di->block_size ? left : di->block_size;
- if (pbs_res < 0) {
- pvebackup_propagate_error(local_err);
- return pbs_res;
- } else {
- size_t reused = (pbs_res == 0) ? size : 0;
- pvebackup_add_transfered_bytes(size, !buf ? size : 0, reused);
+ pbs_res = proxmox_backup_co_write_data(backup_state.pbs, di->dev_id,
+ is_zero_block ? NULL : buf + transferred, start + transferred,
+ to_transfer, &local_err);
+ transferred += to_transfer;
+
+ if (pbs_res < 0) {
+ pvebackup_propagate_error(local_err);
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ return pbs_res;
+ }
+
+ reused += pbs_res == 0 ? to_transfer : 0;
}
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ pvebackup_add_transfered_bytes(size, is_zero_block ? size : 0, reused);
+
return size;
}
@@ -180,6 +197,7 @@ pvebackup_co_dump_vma_cb(
int ret = -1;
assert(backup_state.vmaw);
+ assert(buf);
uint64_t remaining = size;
@@ -206,9 +224,7 @@ pvebackup_co_dump_vma_cb(
qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
++cluster_num;
- if (buf) {
- buf += VMA_CLUSTER_SIZE;
- }
+ buf += VMA_CLUSTER_SIZE;
if (ret < 0) {
Error *local_err = NULL;
vma_writer_error_propagate(backup_state.vmaw, &local_err);
@@ -566,6 +582,10 @@ typedef struct QmpBackupTask {
const char *firewall_file;
bool has_devlist;
const char *devlist;
+ bool has_compress;
+ bool compress;
+ bool has_encrypt;
+ bool encrypt;
bool has_speed;
int64_t speed;
Error **errp;
@@ -689,6 +709,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
bool use_dirty_bitmap = task->has_use_dirty_bitmap && task->use_dirty_bitmap;
+
char *pbs_err = NULL;
pbs = proxmox_backup_new(
task->backup_file,
@@ -698,8 +719,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
task->has_password ? task->password : NULL,
task->has_keyfile ? task->keyfile : NULL,
task->has_key_password ? task->key_password : NULL,
+ task->has_compress ? task->compress : true,
+ task->has_encrypt ? task->encrypt : task->has_keyfile,
task->has_fingerprint ? task->fingerprint : NULL,
- &pbs_err);
+ &pbs_err);
if (!pbs) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
@@ -718,6 +741,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
+ di->block_size = dump_cb_block_size;
+
const char *devname = bdrv_get_device_name(di->bs);
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
@@ -938,6 +963,8 @@ UuidInfo *qmp_backup(
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
bool has_use_dirty_bitmap, bool use_dirty_bitmap,
+ bool has_compress, bool compress,
+ bool has_encrypt, bool encrypt,
bool has_format, BackupFormat format,
bool has_config_file, const char *config_file,
bool has_firewall_file, const char *firewall_file,
@@ -948,6 +975,8 @@ UuidInfo *qmp_backup(
.backup_file = backup_file,
.has_password = has_password,
.password = password,
+ .has_keyfile = has_keyfile,
+ .keyfile = keyfile,
.has_key_password = has_key_password,
.key_password = key_password,
.has_fingerprint = has_fingerprint,
@@ -958,6 +987,10 @@ UuidInfo *qmp_backup(
.backup_time = backup_time,
.has_use_dirty_bitmap = has_use_dirty_bitmap,
.use_dirty_bitmap = use_dirty_bitmap,
+ .has_compress = has_compress,
+ .compress = compress,
+ .has_encrypt = has_encrypt,
+ .encrypt = encrypt,
.has_format = has_format,
.format = format,
.has_config_file = has_config_file,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 992e6c1e3f..5ac6276dc1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -818,6 +818,10 @@
#
# @use-dirty-bitmap: use dirty bitmap to detect incremental changes since last job (optional for format 'pbs')
#
+# @compress: use compression (optional for format 'pbs', defaults to true)
+#
+# @encrypt: use encryption ((optional for format 'pbs', defaults to true if there is a keyfile)
+#
# Returns: the uuid of the backup job
#
##
@@ -829,6 +833,8 @@
'*backup-id': 'str',
'*backup-time': 'int',
'*use-dirty-bitmap': 'bool',
+ '*compress': 'bool',
+ '*encrypt': 'bool',
'*format': 'BackupFormat',
'*config-file': 'str',
'*firewall-file': 'str',

View File

@@ -7,27 +7,24 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[error cleanups, file_open implementation]
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[WB: add namespace support]
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: adapt to changed function signatures
make pbs_co_preadv return values consistent with QEMU
getlength is now a coroutine function]
make pbs_co_preadv return values consistent with QEMU]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 3 +
block/pbs.c | 305 +++++++++++++++++++++++++++++++++++++++++++
block/pbs.c | 276 +++++++++++++++++++++++++++++++++++++++++++
configure | 9 ++
meson.build | 2 +-
qapi/block-core.json | 13 ++
qapi/pragma.json | 1 +
6 files changed, 332 insertions(+), 1 deletion(-)
6 files changed, 303 insertions(+), 1 deletion(-)
create mode 100644 block/pbs.c
diff --git a/block/meson.build b/block/meson.build
index 5bcebb934b..eece0d5743 100644
index e995ae72b9..7ef2fa72d5 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -54,6 +54,9 @@ block_ss.add(files(
@@ -53,6 +53,9 @@ block_ss.add(files(
'../pve-backup.c',
), libproxmox_backup_qemu)
@@ -39,10 +36,10 @@ index 5bcebb934b..eece0d5743 100644
softmmu_ss.add(files('block-ram-registrar.c'))
diff --git a/block/pbs.c b/block/pbs.c
new file mode 100644
index 0000000000..a2211e0f3b
index 0000000000..9d1f1f39d4
--- /dev/null
+++ b/block/pbs.c
@@ -0,0 +1,305 @@
@@ -0,0 +1,276 @@
+/*
+ * Proxmox Backup Server read-only block driver
+ */
@@ -55,12 +52,10 @@ index 0000000000..a2211e0f3b
+#include "qemu/option.h"
+#include "qemu/cutils.h"
+#include "block/block_int.h"
+#include "block/block-io.h"
+
+#include <proxmox-backup-qemu.h>
+
+#define PBS_OPT_REPOSITORY "repository"
+#define PBS_OPT_NAMESPACE "namespace"
+#define PBS_OPT_SNAPSHOT "snapshot"
+#define PBS_OPT_ARCHIVE "archive"
+#define PBS_OPT_KEYFILE "keyfile"
@@ -74,7 +69,6 @@ index 0000000000..a2211e0f3b
+ int64_t length;
+
+ char *repository;
+ char *namespace;
+ char *snapshot;
+ char *archive;
+} BDRVPBSState;
@@ -89,11 +83,6 @@ index 0000000000..a2211e0f3b
+ .help = "The server address and repository to connect to.",
+ },
+ {
+ .name = PBS_OPT_NAMESPACE,
+ .type = QEMU_OPT_STRING,
+ .help = "Optional: The snapshot's namespace.",
+ },
+ {
+ .name = PBS_OPT_SNAPSHOT,
+ .type = QEMU_OPT_STRING,
+ .help = "The snapshot to read.",
@@ -129,7 +118,7 @@ index 0000000000..a2211e0f3b
+
+
+// filename format:
+// pbs:repository=<repo>,namespace=<ns>,snapshot=<snap>,password=<pw>,key_password=<kpw>,fingerprint=<fp>,archive=<archive>
+// pbs:repository=<repo>,snapshot=<snap>,password=<pw>,key_password=<kpw>,fingerprint=<fp>,archive=<archive>
+static void pbs_parse_filename(const char *filename, QDict *options,
+ Error **errp)
+{
@@ -165,7 +154,6 @@ index 0000000000..a2211e0f3b
+ s->archive = g_strdup(qemu_opt_get(opts, PBS_OPT_ARCHIVE));
+ const char *keyfile = qemu_opt_get(opts, PBS_OPT_KEYFILE);
+ const char *password = qemu_opt_get(opts, PBS_OPT_PASSWORD);
+ const char *namespace = qemu_opt_get(opts, PBS_OPT_NAMESPACE);
+ const char *fingerprint = qemu_opt_get(opts, PBS_OPT_FINGERPRINT);
+ const char *key_password = qemu_opt_get(opts, PBS_OPT_ENCRYPTION_PASSWORD);
+
@@ -178,12 +166,9 @@ index 0000000000..a2211e0f3b
+ if (!key_password) {
+ key_password = getenv("PBS_ENCRYPTION_PASSWORD");
+ }
+ if (namespace) {
+ s->namespace = g_strdup(namespace);
+ }
+
+ /* connect to PBS server in read mode */
+ s->conn = proxmox_restore_new_ns(s->repository, s->snapshot, s->namespace, password,
+ s->conn = proxmox_restore_new(s->repository, s->snapshot, password,
+ keyfile, key_password, fingerprint, &pbs_error);
+
+ /* invalidates qemu_opt_get char pointers from above */
@@ -228,13 +213,12 @@ index 0000000000..a2211e0f3b
+static void pbs_close(BlockDriverState *bs) {
+ BDRVPBSState *s = bs->opaque;
+ g_free(s->repository);
+ g_free(s->namespace);
+ g_free(s->snapshot);
+ g_free(s->archive);
+ proxmox_restore_disconnect(s->conn);
+}
+
+static coroutine_fn int64_t pbs_co_getlength(BlockDriverState *bs)
+static int64_t pbs_getlength(BlockDriverState *bs)
+{
+ BDRVPBSState *s = bs->opaque;
+ return s->length;
@@ -258,16 +242,7 @@ index 0000000000..a2211e0f3b
+ BDRVPBSState *s = bs->opaque;
+ int ret;
+ char *pbs_error = NULL;
+ uint8_t *buf;
+ bool inline_buf = true;
+
+ /* for single-buffer IO vectors we can fast-path the write directly to it */
+ if (qiov->niov == 1 && qiov->iov->iov_len >= bytes) {
+ buf = qiov->iov->iov_base;
+ } else {
+ inline_buf = false;
+ buf = g_malloc(bytes);
+ }
+ uint8_t *buf = malloc(bytes);
+
+ if (offset < 0 || bytes < 0) {
+ fprintf(stderr, "unexpected negative 'offset' or 'bytes' value!\n");
@@ -290,10 +265,8 @@ index 0000000000..a2211e0f3b
+ return -EIO;
+ }
+
+ if (!inline_buf) {
+ qemu_iovec_from_buf(qiov, 0, buf, bytes);
+ g_free(buf);
+ }
+ qemu_iovec_from_buf(qiov, 0, buf, bytes);
+ free(buf);
+
+ return 0;
+}
@@ -310,13 +283,8 @@ index 0000000000..a2211e0f3b
+static void pbs_refresh_filename(BlockDriverState *bs)
+{
+ BDRVPBSState *s = bs->opaque;
+ if (s->namespace) {
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s:%s(%s)",
+ s->repository, s->namespace, s->snapshot, s->archive);
+ } else {
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s(%s)",
+ s->repository, s->snapshot, s->archive);
+ }
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s(%s)",
+ s->repository, s->snapshot, s->archive);
+}
+
+static const char *const pbs_strong_runtime_opts[] = {
@@ -333,7 +301,7 @@ index 0000000000..a2211e0f3b
+ .bdrv_file_open = pbs_file_open,
+ .bdrv_open = pbs_open,
+ .bdrv_close = pbs_close,
+ .bdrv_co_getlength = pbs_co_getlength,
+ .bdrv_getlength = pbs_getlength,
+
+ .bdrv_co_preadv = pbs_co_preadv,
+ .bdrv_co_pwritev = pbs_co_pwritev,
@@ -349,10 +317,10 @@ index 0000000000..a2211e0f3b
+
+block_init(bdrv_pbs_init);
diff --git a/configure b/configure
index a62a3e6be9..1ac0feb46b 100755
index 5f1828f1ec..c9b70b5e9a 100755
--- a/configure
+++ b/configure
@@ -288,6 +288,7 @@ linux_user=""
@@ -285,6 +285,7 @@ linux_user=""
bsd_user=""
pie=""
coroutine=""
@@ -360,9 +328,9 @@ index a62a3e6be9..1ac0feb46b 100755
plugins="$default_feature"
meson=""
ninja=""
@@ -873,6 +874,10 @@ for opt do
;;
--with-coroutine=*) coroutine="$optarg"
@@ -864,6 +865,10 @@ for opt do
--enable-uuid|--disable-uuid)
echo "$0: $opt is obsolete, UUID support is always built" >&2
;;
+ --disable-pbs-bdrv) pbs_bdrv="no"
+ ;;
@@ -374,12 +342,12 @@ index a62a3e6be9..1ac0feb46b 100755
@@ -1049,6 +1054,7 @@ cat << EOF
debug-info debugging information
safe-stack SafeStack Stack Smash Protection. Depends on
clang/llvm and requires coroutine backend ucontext.
clang/llvm >= 3.7 and requires coroutine backend ucontext.
+ pbs-bdrv Proxmox backup server read-only block driver support
NOTE: The object files are built at the place where configure is launched
EOF
@@ -2386,6 +2392,9 @@ echo "TARGET_DIRS=$target_list" >> $config_host_mak
@@ -2372,6 +2378,9 @@ echo "TARGET_DIRS=$target_list" >> $config_host_mak
if test "$modules" = "yes"; then
echo "CONFIG_MODULES=y" >> $config_host_mak
fi
@@ -390,10 +358,10 @@ index a62a3e6be9..1ac0feb46b 100755
# XXX: suppress that
if [ "$bsd" = "yes" ] ; then
diff --git a/meson.build b/meson.build
index 32ab849ce6..69afe3441b 100644
index 9c46881eb7..93ebda47af 100644
--- a/meson.build
+++ b/meson.build
@@ -4041,7 +4041,7 @@ summary_info += {'bzip2 support': libbzip2}
@@ -3980,7 +3980,7 @@ summary_info += {'bzip2 support': libbzip2}
summary_info += {'lzfse support': liblzfse}
summary_info += {'zstd support': zstd}
summary_info += {'NUMA host support': numa}
@@ -403,10 +371,10 @@ index 32ab849ce6..69afe3441b 100644
summary_info += {'libdaxctl support': libdaxctl}
summary_info += {'libudev': libudev}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 985859ddee..d601fb4ab2 100644
index 5ac6276dc1..45b63dfe26 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3304,6 +3304,7 @@
@@ -3103,6 +3103,7 @@
'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
@@ -414,7 +382,7 @@ index 985859ddee..d601fb4ab2 100644
'ssh', 'throttle', 'vdi', 'vhdx',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
@@ -3380,6 +3381,17 @@
@@ -3179,6 +3180,17 @@
{ 'struct': 'BlockdevOptionsNull',
'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } }
@@ -427,12 +395,12 @@ index 985859ddee..d601fb4ab2 100644
+{ 'struct': 'BlockdevOptionsPbs',
+ 'data': { 'repository': 'str', 'snapshot': 'str', 'archive': 'str',
+ '*keyfile': 'str', '*password': 'str', '*fingerprint': 'str',
+ '*key_password': 'str', '*namespace': 'str' } }
+ '*key_password': 'str' } }
+
##
# @BlockdevOptionsNVMe:
#
@@ -4753,6 +4765,7 @@
@@ -4531,6 +4543,7 @@
'nfs': 'BlockdevOptionsNfs',
'null-aio': 'BlockdevOptionsNull',
'null-co': 'BlockdevOptionsNull',
@@ -441,10 +409,10 @@ index 985859ddee..d601fb4ab2 100644
'nvme-io_uring': { 'type': 'BlockdevOptionsNvmeIoUring',
'if': 'CONFIG_BLKIO' },
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 325e684411..b6079f6a0e 100644
index f2097b9020..5ab1890519 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -45,6 +45,7 @@
@@ -47,6 +47,7 @@
'BlockInfo', # query-block
'BlockdevAioOptions', # blockdev-add, -blockdev
'BlockdevDriver', # blockdev-add, query-blockstats, ...

View File

@@ -0,0 +1,74 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 8 Jul 2020 11:57:53 +0200
Subject: [PATCH] PVE: add query_proxmox_support QMP command
Generic interface for future use, currently used for PBS dirty-bitmap
backup support.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[PVE: query-proxmox-support: include library version]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
pve-backup.c | 9 +++++++++
qapi/block-core.json | 29 +++++++++++++++++++++++++++++
2 files changed, 38 insertions(+)
diff --git a/pve-backup.c b/pve-backup.c
index a275a1d4e1..b10373cd8a 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1072,3 +1072,12 @@ BackupStatus *qmp_query_backup(Error **errp)
return info;
}
+
+ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
+{
+ ProxmoxSupportStatus *ret = g_malloc0(sizeof(*ret));
+ ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
+ ret->pbs_dirty_bitmap = true;
+ ret->pbs_dirty_bitmap_savevm = true;
+ return ret;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 45b63dfe26..8b0e0d92de 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -863,6 +863,35 @@
##
{ 'command': 'backup-cancel' }
+##
+# @ProxmoxSupportStatus:
+#
+# Contains info about supported features added by Proxmox.
+#
+# @pbs-dirty-bitmap: True if dirty-bitmap-incremental backups to PBS are
+# supported.
+#
+# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
+# safely be set for savevm-async.
+#
+# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
+#
+##
+{ 'struct': 'ProxmoxSupportStatus',
+ 'data': { 'pbs-dirty-bitmap': 'bool',
+ 'pbs-dirty-bitmap-savevm': 'bool',
+ 'pbs-library-version': 'str' } }
+
+##
+# @query-proxmox-support:
+#
+# Returns information about supported features added by Proxmox.
+#
+# Returns: @ProxmoxSupportStatus
+#
+##
+{ 'command': 'query-proxmox-support', 'returns': 'ProxmoxSupportStatus' }
+
##
# @BlockDeviceTimedStats:
#

View File

@@ -0,0 +1,441 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 19 Aug 2020 17:02:00 +0200
Subject: [PATCH] PVE: add query-pbs-bitmap-info QMP call
Returns advanced information about dirty bitmaps used (or not used) for
the latest PBS backup.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
monitor/hmp-cmds.c | 28 ++++++-----
pve-backup.c | 117 ++++++++++++++++++++++++++++++++-----------
qapi/block-core.json | 56 +++++++++++++++++++++
3 files changed, 159 insertions(+), 42 deletions(-)
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 670f783515..d819e5fc36 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -202,6 +202,7 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
void hmp_info_backup(Monitor *mon, const QDict *qdict)
{
BackupStatus *info;
+ PBSBitmapInfoList *bitmap_info;
info = qmp_query_backup(NULL);
@@ -232,26 +233,29 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
// this should not happen normally
monitor_printf(mon, "Total size: %d\n", 0);
} else {
- bool incremental = false;
size_t total_or_dirty = info->total;
- if (info->has_transferred) {
- if (info->has_dirty && info->dirty) {
- if (info->dirty < info->total) {
- total_or_dirty = info->dirty;
- incremental = true;
- }
- }
+ bitmap_info = qmp_query_pbs_bitmap_info(NULL);
+
+ while (bitmap_info) {
+ monitor_printf(mon, "Drive %s:\n",
+ bitmap_info->value->drive);
+ monitor_printf(mon, " bitmap action: %s\n",
+ PBSBitmapAction_str(bitmap_info->value->action));
+ monitor_printf(mon, " size: %zd\n",
+ bitmap_info->value->size);
+ monitor_printf(mon, " dirty: %zd\n",
+ bitmap_info->value->dirty);
+ bitmap_info = bitmap_info->next;
}
- int per = (info->transferred * 100)/total_or_dirty;
-
- monitor_printf(mon, "Backup mode: %s\n", incremental ? "incremental" : "full");
+ qapi_free_PBSBitmapInfoList(bitmap_info);
int zero_per = (info->has_zero_bytes && info->zero_bytes) ?
(info->zero_bytes * 100)/info->total : 0;
monitor_printf(mon, "Total size: %zd\n", info->total);
+ int trans_per = (info->transferred * 100)/total_or_dirty;
monitor_printf(mon, "Transferred bytes: %zd (%d%%)\n",
- info->transferred, per);
+ info->transferred, trans_per);
monitor_printf(mon, "Zero bytes: %zd (%d%%)\n",
info->zero_bytes, zero_per);
diff --git a/pve-backup.c b/pve-backup.c
index b10373cd8a..8ae50e06c3 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -46,6 +46,7 @@ static struct PVEBackupState {
size_t transferred;
size_t reused;
size_t zero_bytes;
+ GList *bitmap_list;
} stat;
int64_t speed;
VmaWriter *vmaw;
@@ -669,7 +670,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
size_t total = 0;
- size_t dirty = 0;
l = di_list;
while (l) {
@@ -690,18 +690,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_generate(uuid);
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.reused = 0;
+
+ /* clear previous backup's bitmap_list */
+ if (backup_state.stat.bitmap_list) {
+ GList *bl = backup_state.stat.bitmap_list;
+ while (bl) {
+ g_free(((PBSBitmapInfo *)bl->data)->drive);
+ g_free(bl->data);
+ bl = g_list_next(bl);
+ }
+ g_list_free(backup_state.stat.bitmap_list);
+ backup_state.stat.bitmap_list = NULL;
+ }
+
if (format == BACKUP_FORMAT_PBS) {
if (!task->has_password) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'password'");
- goto err;
+ goto err_mutex;
}
if (!task->has_backup_id) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'backup-id'");
- goto err;
+ goto err_mutex;
}
if (!task->has_backup_time) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'backup-time'");
- goto err;
+ goto err_mutex;
}
int dump_cb_block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE; // Hardcoded (4M)
@@ -728,12 +743,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
"proxmox_backup_new failed: %s", pbs_err);
proxmox_backup_free_error(pbs_err);
- goto err;
+ goto err_mutex;
}
int connect_result = proxmox_backup_co_connect(pbs, task->errp);
if (connect_result < 0)
- goto err;
+ goto err_mutex;
/* register all devices */
l = di_list;
@@ -744,6 +759,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->block_size = dump_cb_block_size;
const char *devname = bdrv_get_device_name(di->bs);
+ PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
+ size_t dirty = di->size;
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
bool expect_only_dirty = false;
@@ -752,49 +769,59 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (bitmap == NULL) {
bitmap = bdrv_create_dirty_bitmap(di->bs, dump_cb_block_size, PBS_BITMAP_NAME, task->errp);
if (!bitmap) {
- goto err;
+ goto err_mutex;
}
+ action = PBS_BITMAP_ACTION_NEW;
} else {
expect_only_dirty = proxmox_backup_check_incremental(pbs, devname, di->size) != 0;
}
if (expect_only_dirty) {
- dirty += bdrv_get_dirty_count(bitmap);
+ /* track clean chunks as reused */
+ dirty = MIN(bdrv_get_dirty_count(bitmap), di->size);
+ backup_state.stat.reused += di->size - dirty;
+ action = PBS_BITMAP_ACTION_USED;
} else {
/* mark entire bitmap as dirty to make full backup */
bdrv_set_dirty_bitmap(bitmap, 0, di->size);
- dirty += di->size;
+ if (action != PBS_BITMAP_ACTION_NEW) {
+ action = PBS_BITMAP_ACTION_INVALID;
+ }
}
di->bitmap = bitmap;
} else {
- dirty += di->size;
-
/* after a full backup the old dirty bitmap is invalid anyway */
if (bitmap != NULL) {
bdrv_release_dirty_bitmap(bitmap);
+ action = PBS_BITMAP_ACTION_NOT_USED_REMOVED;
}
}
int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, task->errp);
if (dev_id < 0) {
- goto err;
+ goto err_mutex;
}
if (!(di->target = bdrv_backup_dump_create(dump_cb_block_size, di->size, pvebackup_co_dump_pbs_cb, di, task->errp))) {
- goto err;
+ goto err_mutex;
}
di->dev_id = dev_id;
+
+ PBSBitmapInfo *info = g_malloc(sizeof(*info));
+ info->drive = g_strdup(devname);
+ info->action = action;
+ info->size = di->size;
+ info->dirty = dirty;
+ backup_state.stat.bitmap_list = g_list_append(backup_state.stat.bitmap_list, info);
}
} else if (format == BACKUP_FORMAT_VMA) {
- dirty = total;
-
vmaw = vma_writer_create(task->backup_file, uuid, &local_err);
if (!vmaw) {
if (local_err) {
error_propagate(task->errp, local_err);
}
- goto err;
+ goto err_mutex;
}
/* register all devices for vma writer */
@@ -804,7 +831,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
l = g_list_next(l);
if (!(di->target = bdrv_backup_dump_create(VMA_CLUSTER_SIZE, di->size, pvebackup_co_dump_vma_cb, di, task->errp))) {
- goto err;
+ goto err_mutex;
}
const char *devname = bdrv_get_device_name(di->bs);
@@ -812,16 +839,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (di->dev_id <= 0) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
"register_stream failed");
- goto err;
+ goto err_mutex;
}
}
} else if (format == BACKUP_FORMAT_DIR) {
- dirty = total;
-
if (mkdir(task->backup_file, 0640) != 0) {
error_setg_errno(task->errp, errno, "can't create directory '%s'\n",
task->backup_file);
- goto err;
+ goto err_mutex;
}
backup_dir = task->backup_file;
@@ -838,18 +863,18 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->size, flags, false, &local_err);
if (local_err) {
error_propagate(task->errp, local_err);
- goto err;
+ goto err_mutex;
}
di->target = bdrv_open(di->targetfile, NULL, NULL, flags, &local_err);
if (!di->target) {
error_propagate(task->errp, local_err);
- goto err;
+ goto err_mutex;
}
}
} else {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "unknown backup format");
- goto err;
+ goto err_mutex;
}
@@ -857,7 +882,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (task->has_config_file) {
if (pvebackup_co_add_config(task->config_file, config_name, format, backup_dir,
vmaw, pbs, task->errp) != 0) {
- goto err;
+ goto err_mutex;
}
}
@@ -865,12 +890,11 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (task->has_firewall_file) {
if (pvebackup_co_add_config(task->firewall_file, firewall_name, format, backup_dir,
vmaw, pbs, task->errp) != 0) {
- goto err;
+ goto err_mutex;
}
}
/* initialize global backup_state now */
-
- qemu_mutex_lock(&backup_state.stat.lock);
+ /* note: 'reused' and 'bitmap_list' are initialized earlier */
if (backup_state.stat.error) {
error_free(backup_state.stat.error);
@@ -890,10 +914,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
backup_state.stat.total = total;
- backup_state.stat.dirty = dirty;
+ backup_state.stat.dirty = total - backup_state.stat.reused;
backup_state.stat.transferred = 0;
backup_state.stat.zero_bytes = 0;
- backup_state.stat.reused = format == BACKUP_FORMAT_PBS && dirty >= total ? 0 : total - dirty;
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -910,6 +933,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
task->result = uuid_info;
return;
+err_mutex:
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
err:
l = di_list;
@@ -1073,11 +1099,42 @@ BackupStatus *qmp_query_backup(Error **errp)
return info;
}
+PBSBitmapInfoList *qmp_query_pbs_bitmap_info(Error **errp)
+{
+ PBSBitmapInfoList *head = NULL, **p_next = &head;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+
+ GList *l = backup_state.stat.bitmap_list;
+ while (l) {
+ PBSBitmapInfo *info = (PBSBitmapInfo *)l->data;
+ l = g_list_next(l);
+
+ /* clone bitmap info to avoid auto free after QMP marshalling */
+ PBSBitmapInfo *info_ret = g_malloc0(sizeof(*info_ret));
+ info_ret->drive = g_strdup(info->drive);
+ info_ret->action = info->action;
+ info_ret->size = info->size;
+ info_ret->dirty = info->dirty;
+
+ PBSBitmapInfoList *info_list = g_malloc0(sizeof(*info_list));
+ info_list->value = info_ret;
+
+ *p_next = info_list;
+ p_next = &info_list->next;
+ }
+
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ return head;
+}
+
ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
{
ProxmoxSupportStatus *ret = g_malloc0(sizeof(*ret));
ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
ret->pbs_dirty_bitmap = true;
ret->pbs_dirty_bitmap_savevm = true;
+ ret->query_bitmap_info = true;
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8b0e0d92de..7fde927621 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -871,6 +871,8 @@
# @pbs-dirty-bitmap: True if dirty-bitmap-incremental backups to PBS are
# supported.
#
+# @query-bitmap-info: True if the 'query-pbs-bitmap-info' QMP call is supported.
+#
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -879,6 +881,7 @@
##
{ 'struct': 'ProxmoxSupportStatus',
'data': { 'pbs-dirty-bitmap': 'bool',
+ 'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',
'pbs-library-version': 'str' } }
@@ -892,6 +895,59 @@
##
{ 'command': 'query-proxmox-support', 'returns': 'ProxmoxSupportStatus' }
+##
+# @PBSBitmapAction:
+#
+# An action taken on a dirty-bitmap when a backup job was started.
+#
+# @not-used: Bitmap mode was not enabled.
+#
+# @not-used-removed: Bitmap mode was not enabled, but a bitmap from a
+# previous backup still existed and was removed.
+#
+# @new: A new bitmap was attached to the drive for this backup.
+#
+# @used: An existing bitmap will be used to only backup changed data.
+#
+# @invalid: A bitmap existed, but had to be cleared since it's associated
+# base snapshot did not match the base given for the current job or
+# the crypt mode has changed.
+#
+##
+{ 'enum': 'PBSBitmapAction',
+ 'data': ['not-used', 'not-used-removed', 'new', 'used', 'invalid'] }
+
+##
+# @PBSBitmapInfo:
+#
+# Contains information about dirty bitmaps used for each drive in a PBS backup.
+#
+# @drive: The underlying drive.
+#
+# @action: The action that was taken when the backup started.
+#
+# @size: The total size of the drive.
+#
+# @dirty: How much of the drive is considered dirty and will be backed up,
+# or 'size' if everything will be.
+#
+##
+{ 'struct': 'PBSBitmapInfo',
+ 'data': { 'drive': 'str', 'action': 'PBSBitmapAction', 'size': 'int',
+ 'dirty': 'int' } }
+
+##
+# @query-pbs-bitmap-info:
+#
+# Returns information about dirty bitmaps used on the most recently started
+# backup. Returns nothing when the last backup was not using PBS or if no
+# backup occured in this session.
+#
+# Returns: @PBSBitmapInfo
+#
+##
+{ 'command': 'query-pbs-bitmap-info', 'returns': ['PBSBitmapInfo'] }
+
##
# @BlockDeviceTimedStats:
#

View File

@@ -14,10 +14,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/meson.build b/meson.build
index 69afe3441b..b2e9b2aec7 100644
index 93ebda47af..d47ce5fe64 100644
--- a/meson.build
+++ b/meson.build
@@ -1528,6 +1528,7 @@ keyutils = dependency('libkeyutils', required: false,
@@ -1526,6 +1526,7 @@ keyutils = dependency('libkeyutils', required: false,
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
@@ -25,7 +25,7 @@ index 69afe3441b..b2e9b2aec7 100644
libproxmox_backup_qemu = cc.find_library('proxmox_backup_qemu', required: true)
# libselinux
@@ -3144,6 +3145,7 @@ if have_block
@@ -3094,6 +3095,7 @@ if have_block
# os-posix.c contains POSIX-specific functions used by qemu-storage-daemon,
# os-win32.c does not
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
@@ -34,7 +34,7 @@ index 69afe3441b..b2e9b2aec7 100644
endif
diff --git a/os-posix.c b/os-posix.c
index 90ea71725f..33745a8c22 100644
index 4858650c3e..c5cb12226a 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -28,6 +28,8 @@
@@ -46,7 +46,7 @@ index 90ea71725f..33745a8c22 100644
/* Needed early for CONFIG_BSD etc. */
#include "net/slirp.h"
@@ -301,9 +303,10 @@ void os_setup_post(void)
@@ -287,9 +289,10 @@ void os_setup_post(void)
dup2(fd, 0);
dup2(fd, 1);

View File

@@ -0,0 +1,293 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Thu, 20 Aug 2020 14:25:00 +0200
Subject: [PATCH] PVE-Backup: Use a transaction to synchronize job states
By using a JobTxn, we can sync dirty bitmaps only when *all* jobs were
successful - meaning we don't need to remove them when the backup fails,
since QEMU's BITMAP_SYNC_MODE_ON_SUCCESS will now handle that for us.
To keep the rate-limiting and IO impact from before, we use a sequential
transaction, so drives will still be backed up one after the other.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls
adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 163 ++++++++++++++++-----------------------------------
1 file changed, 50 insertions(+), 113 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 8ae50e06c3..eedac335ec 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -52,6 +52,7 @@ static struct PVEBackupState {
VmaWriter *vmaw;
ProxmoxBackupHandle *pbs;
GList *di_list;
+ JobTxn *txn;
QemuMutex backup_mutex;
CoMutex dump_callback_mutex;
} backup_state;
@@ -71,34 +72,12 @@ typedef struct PVEBackupDevInfo {
size_t size;
uint64_t block_size;
uint8_t dev_id;
- bool completed;
char targetfile[PATH_MAX];
BdrvDirtyBitmap *bitmap;
BlockDriverState *target;
+ BlockJob *job;
} PVEBackupDevInfo;
-static void pvebackup_run_next_job(void);
-
-static BlockJob *
-lookup_active_block_job(PVEBackupDevInfo *di)
-{
- if (!di->completed && di->bs) {
- WITH_JOB_LOCK_GUARD() {
- for (BlockJob *job = block_job_next_locked(NULL); job; job = block_job_next_locked(job)) {
- if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
- continue;
- }
-
- BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
- if (bjob && bjob->source_bs == di->bs) {
- return job;
- }
- }
- }
- }
- return NULL;
-}
-
static void pvebackup_propagate_error(Error *err)
{
qemu_mutex_lock(&backup_state.stat.lock);
@@ -274,18 +253,6 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
if (local_err != NULL) {
pvebackup_propagate_error(local_err);
}
- } else {
- // on error or cancel we cannot ensure synchronization of dirty
- // bitmaps with backup server, so remove all and do full backup next
- GList *l = backup_state.di_list;
- while (l) {
- PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
- l = g_list_next(l);
-
- if (di->bitmap) {
- bdrv_release_dirty_bitmap(di->bitmap);
- }
- }
}
proxmox_backup_disconnect(backup_state.pbs);
@@ -324,8 +291,6 @@ static void pvebackup_complete_cb(void *opaque, int ret)
qemu_mutex_lock(&backup_state.backup_mutex);
- di->completed = true;
-
if (ret < 0) {
Error *local_err = NULL;
error_setg(&local_err, "job failed with err %d - %s", ret, strerror(-ret));
@@ -338,20 +303,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
block_on_coroutine_fn(pvebackup_complete_stream, di);
- // remove self from job queue
+ // remove self from job list
backup_state.di_list = g_list_remove(backup_state.di_list, di);
- if (di->bitmap && ret < 0) {
- // on error or cancel we cannot ensure synchronization of dirty
- // bitmaps with backup server, so remove all and do full backup next
- bdrv_release_dirty_bitmap(di->bitmap);
- }
-
g_free(di);
- qemu_mutex_unlock(&backup_state.backup_mutex);
+ /* call cleanup if we're the last job */
+ if (!g_list_first(backup_state.di_list)) {
+ block_on_coroutine_fn(pvebackup_co_cleanup, NULL);
+ }
- pvebackup_run_next_job();
+ qemu_mutex_unlock(&backup_state.backup_mutex);
}
static void pvebackup_cancel(void)
@@ -373,32 +335,28 @@ static void pvebackup_cancel(void)
proxmox_backup_abort(backup_state.pbs, "backup canceled");
}
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- for(;;) {
-
- BlockJob *next_job = NULL;
+ /* it's enough to cancel one job in the transaction, the rest will follow
+ * automatically */
+ GList *bdi = g_list_first(backup_state.di_list);
+ BlockJob *cancel_job = bdi && bdi->data ?
+ ((PVEBackupDevInfo *)bdi->data)->job :
+ NULL;
- qemu_mutex_lock(&backup_state.backup_mutex);
-
- GList *l = backup_state.di_list;
- while (l) {
- PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
- l = g_list_next(l);
-
- BlockJob *job = lookup_active_block_job(di);
- if (job != NULL) {
- next_job = job;
- break;
- }
+ /* ref the job before releasing the mutex, just to be safe */
+ if (cancel_job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_ref_locked(&cancel_job->job);
}
+ }
- qemu_mutex_unlock(&backup_state.backup_mutex);
+ /* job_cancel_sync may enter the job, so we need to release the
+ * backup_mutex to avoid deadlock */
+ qemu_mutex_unlock(&backup_state.backup_mutex);
- if (next_job) {
- job_cancel_sync(&next_job->job, true);
- } else {
- break;
+ if (cancel_job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_cancel_sync_locked(&cancel_job->job, true);
+ job_unref_locked(&cancel_job->job);
}
}
}
@@ -458,49 +416,19 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
-bool job_should_pause_locked(Job *job);
-
-static void pvebackup_run_next_job(void)
-{
- assert(!qemu_in_coroutine());
-
- qemu_mutex_lock(&backup_state.backup_mutex);
-
- GList *l = backup_state.di_list;
- while (l) {
- PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
- l = g_list_next(l);
-
- BlockJob *job = lookup_active_block_job(di);
-
- if (job) {
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- WITH_JOB_LOCK_GUARD() {
- if (job_should_pause_locked(&job->job)) {
- bool error_or_canceled = pvebackup_error_or_canceled();
- if (error_or_canceled) {
- job_cancel_sync_locked(&job->job, true);
- } else {
- job_resume_locked(&job->job);
- }
- }
- }
- return;
- }
- }
-
- block_on_coroutine_fn(pvebackup_co_cleanup, NULL); // no more jobs, run cleanup
-
- qemu_mutex_unlock(&backup_state.backup_mutex);
-}
-
static bool create_backup_jobs(void) {
assert(!qemu_in_coroutine());
Error *local_err = NULL;
+ /* create job transaction to synchronize bitmap commit and cancel all
+ * jobs in case one errors */
+ if (backup_state.txn) {
+ job_txn_unref(backup_state.txn);
+ }
+ backup_state.txn = job_txn_new_seq();
+
BackupPerf perf = { .max_workers = 16 };
/* create and start all jobs (paused state) */
@@ -523,7 +451,7 @@ static bool create_backup_jobs(void) {
BlockJob *job = backup_job_create(
NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
bitmap_mode, false, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
- JOB_DEFAULT, pvebackup_complete_cb, di, NULL, &local_err);
+ JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn, &local_err);
aio_context_release(aio_context);
@@ -535,7 +463,8 @@ static bool create_backup_jobs(void) {
pvebackup_propagate_error(create_job_err);
break;
}
- job_start(&job->job);
+
+ di->job = job;
bdrv_unref(di->target);
di->target = NULL;
@@ -553,6 +482,12 @@ static bool create_backup_jobs(void) {
bdrv_unref(di->target);
di->target = NULL;
}
+
+ if (di->job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_unref_locked(&di->job->job);
+ }
+ }
}
}
@@ -943,10 +878,6 @@ err:
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
- if (di->bitmap) {
- bdrv_release_dirty_bitmap(di->bitmap);
- }
-
if (di->target) {
bdrv_unref(di->target);
}
@@ -1035,9 +966,15 @@ UuidInfo *qmp_backup(
block_on_coroutine_fn(pvebackup_co_prepare, &task);
if (*errp == NULL) {
- create_backup_jobs();
+ bool errors = create_backup_jobs();
qemu_mutex_unlock(&backup_state.backup_mutex);
- pvebackup_run_next_job();
+
+ if (!errors) {
+ /* start the first job in the transaction
+ * note: this might directly enter the job, so we need to do this
+ * after unlocking the backup_mutex */
+ job_txn_start_seq(backup_state.txn);
+ }
} else {
qemu_mutex_unlock(&backup_state.backup_mutex);
}

View File

@@ -0,0 +1,499 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Mon, 28 Sep 2020 13:40:51 +0200
Subject: [PATCH] PVE-Backup: Don't block on finishing and cleanup
create_backup_jobs
proxmox_backup_co_finish is already async, but previously we would wait
for the coroutine using block_on_coroutine_fn(). Avoid this by
scheduling pvebackup_co_complete_stream (and thus pvebackup_co_cleanup)
as a real coroutine when calling from pvebackup_complete_cb. This is ok,
since complete_stream uses the backup_mutex internally to synchronize,
and other streams can happily continue writing in the meantime anyway.
To accomodate, backup_mutex is converted to a CoMutex. This means
converting every user to a coroutine. This is not just useful here, but
will come in handy once this series[0] is merged, and QMP calls can be
yield-able coroutines too. Then we can also finally get rid of
block_on_coroutine_fn.
Cases of aio_context_acquire/release from within what is now a coroutine
are changed to aio_co_reschedule_self, which works since a running
coroutine always holds the aio lock for the context it is running in.
job_cancel_sync is called from a BH since it can't be run from a
coroutine (uses AIO_WAIT_WHILE internally).
Same thing for create_backup_jobs, which is converted to a BH too.
To communicate the finishing state, a new property is introduced to
query-backup: 'finishing'. A new state is explicitly not used, since
that would break compatibility with older qemu-server versions.
Also fix create_backup_jobs:
No more weird bool returns, just the standard "errp" format used
everywhere else too. With this, if backup_job_create fails, the error
message is actually returned over QMP and can be shown to the user.
To facilitate correct cleanup on such an error, we call
create_backup_jobs as a bottom half directly from pvebackup_co_prepare.
This additionally allows us to actually hold the backup_mutex during
operation.
Also add a job_cancel_sync before job_unref, since a job must be in
STATUS_NULL to be deleted by unref, which could trigger an assert
before.
[0] https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg03515.html
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 212 +++++++++++++++++++++++++++----------------
qapi/block-core.json | 5 +-
2 files changed, 138 insertions(+), 79 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index eedac335ec..7bd9d06346 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -33,7 +33,9 @@ const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
static struct PVEBackupState {
struct {
- // Everithing accessed from qmp_backup_query command is protected using lock
+ // Everything accessed from qmp_backup_query command is protected using
+ // this lock. Do NOT hold this lock for long times, as it is sometimes
+ // acquired from coroutines, and thus any wait time may block the guest.
QemuMutex lock;
Error *error;
time_t start_time;
@@ -47,20 +49,22 @@ static struct PVEBackupState {
size_t reused;
size_t zero_bytes;
GList *bitmap_list;
+ bool finishing;
+ bool starting;
} stat;
int64_t speed;
VmaWriter *vmaw;
ProxmoxBackupHandle *pbs;
GList *di_list;
JobTxn *txn;
- QemuMutex backup_mutex;
+ CoMutex backup_mutex;
CoMutex dump_callback_mutex;
} backup_state;
static void pvebackup_init(void)
{
qemu_mutex_init(&backup_state.stat.lock);
- qemu_mutex_init(&backup_state.backup_mutex);
+ qemu_co_mutex_init(&backup_state.backup_mutex);
qemu_co_mutex_init(&backup_state.dump_callback_mutex);
}
@@ -72,6 +76,7 @@ typedef struct PVEBackupDevInfo {
size_t size;
uint64_t block_size;
uint8_t dev_id;
+ int completed_ret; // INT_MAX if not completed
char targetfile[PATH_MAX];
BdrvDirtyBitmap *bitmap;
BlockDriverState *target;
@@ -227,12 +232,12 @@ pvebackup_co_dump_vma_cb(
}
// assumes the caller holds backup_mutex
-static void coroutine_fn pvebackup_co_cleanup(void *unused)
+static void coroutine_fn pvebackup_co_cleanup(void)
{
assert(qemu_in_coroutine());
qemu_mutex_lock(&backup_state.stat.lock);
- backup_state.stat.end_time = time(NULL);
+ backup_state.stat.finishing = true;
qemu_mutex_unlock(&backup_state.stat.lock);
if (backup_state.vmaw) {
@@ -261,35 +266,29 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
g_list_free(backup_state.di_list);
backup_state.di_list = NULL;
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.end_time = time(NULL);
+ backup_state.stat.finishing = false;
+ qemu_mutex_unlock(&backup_state.stat.lock);
}
-// assumes the caller holds backup_mutex
-static void coroutine_fn pvebackup_complete_stream(void *opaque)
+static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
{
PVEBackupDevInfo *di = opaque;
+ int ret = di->completed_ret;
- bool error_or_canceled = pvebackup_error_or_canceled();
-
- if (backup_state.vmaw) {
- vma_writer_close_stream(backup_state.vmaw, di->dev_id);
+ qemu_mutex_lock(&backup_state.stat.lock);
+ bool starting = backup_state.stat.starting;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+ if (starting) {
+ /* in 'starting' state, no tasks have been run yet, meaning we can (and
+ * must) skip all cleanup, as we don't know what has and hasn't been
+ * initialized yet. */
+ return;
}
- if (backup_state.pbs && !error_or_canceled) {
- Error *local_err = NULL;
- proxmox_backup_co_close_image(backup_state.pbs, di->dev_id, &local_err);
- if (local_err != NULL) {
- pvebackup_propagate_error(local_err);
- }
- }
-}
-
-static void pvebackup_complete_cb(void *opaque, int ret)
-{
- assert(!qemu_in_coroutine());
-
- PVEBackupDevInfo *di = opaque;
-
- qemu_mutex_lock(&backup_state.backup_mutex);
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
if (ret < 0) {
Error *local_err = NULL;
@@ -301,7 +300,19 @@ static void pvebackup_complete_cb(void *opaque, int ret)
assert(di->target == NULL);
- block_on_coroutine_fn(pvebackup_complete_stream, di);
+ bool error_or_canceled = pvebackup_error_or_canceled();
+
+ if (backup_state.vmaw) {
+ vma_writer_close_stream(backup_state.vmaw, di->dev_id);
+ }
+
+ if (backup_state.pbs && !error_or_canceled) {
+ Error *local_err = NULL;
+ proxmox_backup_co_close_image(backup_state.pbs, di->dev_id, &local_err);
+ if (local_err != NULL) {
+ pvebackup_propagate_error(local_err);
+ }
+ }
// remove self from job list
backup_state.di_list = g_list_remove(backup_state.di_list, di);
@@ -310,21 +321,46 @@ static void pvebackup_complete_cb(void *opaque, int ret)
/* call cleanup if we're the last job */
if (!g_list_first(backup_state.di_list)) {
- block_on_coroutine_fn(pvebackup_co_cleanup, NULL);
+ pvebackup_co_cleanup();
}
- qemu_mutex_unlock(&backup_state.backup_mutex);
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
-static void pvebackup_cancel(void)
+static void pvebackup_complete_cb(void *opaque, int ret)
{
- assert(!qemu_in_coroutine());
+ PVEBackupDevInfo *di = opaque;
+ di->completed_ret = ret;
+
+ /*
+ * Schedule stream cleanup in async coroutine. close_image and finish might
+ * take a while, so we can't block on them here. This way it also doesn't
+ * matter if we're already running in a coroutine or not.
+ * Note: di is a pointer to an entry in the global backup_state struct, so
+ * it stays valid.
+ */
+ Coroutine *co = qemu_coroutine_create(pvebackup_co_complete_stream, di);
+ aio_co_enter(qemu_get_aio_context(), co);
+}
+
+/*
+ * job_cancel(_sync) does not like to be called from coroutines, so defer to
+ * main loop processing via a bottom half.
+ */
+static void job_cancel_bh(void *opaque) {
+ CoCtxData *data = (CoCtxData*)opaque;
+ Job *job = (Job*)data->data;
+ job_cancel_sync(job, true);
+ aio_co_enter(data->ctx, data->co);
+}
+static void coroutine_fn pvebackup_co_cancel(void *opaque)
+{
Error *cancel_err = NULL;
error_setg(&cancel_err, "backup canceled");
pvebackup_propagate_error(cancel_err);
- qemu_mutex_lock(&backup_state.backup_mutex);
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
if (backup_state.vmaw) {
/* make sure vma writer does not block anymore */
@@ -342,28 +378,22 @@ static void pvebackup_cancel(void)
((PVEBackupDevInfo *)bdi->data)->job :
NULL;
- /* ref the job before releasing the mutex, just to be safe */
if (cancel_job) {
- WITH_JOB_LOCK_GUARD() {
- job_ref_locked(&cancel_job->job);
- }
+ CoCtxData data = {
+ .ctx = qemu_get_current_aio_context(),
+ .co = qemu_coroutine_self(),
+ .data = &cancel_job->job,
+ };
+ aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
+ qemu_coroutine_yield();
}
- /* job_cancel_sync may enter the job, so we need to release the
- * backup_mutex to avoid deadlock */
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- if (cancel_job) {
- WITH_JOB_LOCK_GUARD() {
- job_cancel_sync_locked(&cancel_job->job, true);
- job_unref_locked(&cancel_job->job);
- }
- }
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
void qmp_backup_cancel(Error **errp)
{
- pvebackup_cancel();
+ block_on_coroutine_fn(pvebackup_co_cancel, NULL);
}
// assumes the caller holds backup_mutex
@@ -416,10 +446,18 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
-static bool create_backup_jobs(void) {
+/*
+ * backup_job_create can *not* be run from a coroutine (and requires an
+ * acquired AioContext), so this can't either.
+ * The caller is responsible that backup_mutex is held nonetheless.
+ */
+static void create_backup_jobs_bh(void *opaque) {
assert(!qemu_in_coroutine());
+ CoCtxData *data = (CoCtxData*)opaque;
+ Error **errp = (Error**)data->data;
+
Error *local_err = NULL;
/* create job transaction to synchronize bitmap commit and cancel all
@@ -455,24 +493,19 @@ static bool create_backup_jobs(void) {
aio_context_release(aio_context);
- if (!job || local_err != NULL) {
- Error *create_job_err = NULL;
- error_setg(&create_job_err, "backup_job_create failed: %s",
- local_err ? error_get_pretty(local_err) : "null");
+ di->job = job;
- pvebackup_propagate_error(create_job_err);
+ if (!job || local_err) {
+ error_setg(errp, "backup_job_create failed: %s",
+ local_err ? error_get_pretty(local_err) : "null");
break;
}
- di->job = job;
-
bdrv_unref(di->target);
di->target = NULL;
}
- bool errors = pvebackup_error_or_canceled();
-
- if (errors) {
+ if (*errp) {
l = backup_state.di_list;
while (l) {
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
@@ -485,13 +518,15 @@ static bool create_backup_jobs(void) {
if (di->job) {
WITH_JOB_LOCK_GUARD() {
+ job_cancel_sync_locked(&di->job->job, true);
job_unref_locked(&di->job->job);
}
}
}
}
- return errors;
+ /* return */
+ aio_co_enter(data->ctx, data->co);
}
typedef struct QmpBackupTask {
@@ -528,11 +563,12 @@ typedef struct QmpBackupTask {
UuidInfo *result;
} QmpBackupTask;
-// assumes the caller holds backup_mutex
static void coroutine_fn pvebackup_co_prepare(void *opaque)
{
assert(qemu_in_coroutine());
+ qemu_co_mutex_lock(&backup_state.backup_mutex);
+
QmpBackupTask *task = opaque;
task->result = NULL; // just to be sure
@@ -553,8 +589,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *firewall_name = "qemu-server.fw";
if (backup_state.di_list) {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
+ error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
"previous backup not finished");
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
return;
}
@@ -621,6 +658,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
di->size = size;
total += size;
+
+ di->completed_ret = INT_MAX;
}
uuid_generate(uuid);
@@ -852,6 +891,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
backup_state.stat.dirty = total - backup_state.stat.reused;
backup_state.stat.transferred = 0;
backup_state.stat.zero_bytes = 0;
+ backup_state.stat.finishing = false;
+ backup_state.stat.starting = true;
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -866,6 +907,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_info->UUID = uuid_str;
task->result = uuid_info;
+
+ /* Run create_backup_jobs_bh outside of coroutine (in BH) but keep
+ * backup_mutex locked. This is fine, a CoMutex can be held across yield
+ * points, and we'll release it as soon as the BH reschedules us.
+ */
+ CoCtxData waker = {
+ .co = qemu_coroutine_self(),
+ .ctx = qemu_get_current_aio_context(),
+ .data = &local_err,
+ };
+ aio_bh_schedule_oneshot(waker.ctx, create_backup_jobs_bh, &waker);
+ qemu_coroutine_yield();
+
+ if (local_err) {
+ error_propagate(task->errp, local_err);
+ goto err;
+ }
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.starting = false;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+ /* start the first job in the transaction */
+ job_txn_start_seq(backup_state.txn);
+
return;
err_mutex:
@@ -888,6 +956,7 @@ err:
g_free(di);
}
g_list_free(di_list);
+ backup_state.di_list = NULL;
if (devs) {
g_strfreev(devs);
@@ -908,6 +977,8 @@ err:
}
task->result = NULL;
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
return;
}
@@ -961,24 +1032,8 @@ UuidInfo *qmp_backup(
.errp = errp,
};
- qemu_mutex_lock(&backup_state.backup_mutex);
-
block_on_coroutine_fn(pvebackup_co_prepare, &task);
- if (*errp == NULL) {
- bool errors = create_backup_jobs();
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- if (!errors) {
- /* start the first job in the transaction
- * note: this might directly enter the job, so we need to do this
- * after unlocking the backup_mutex */
- job_txn_start_seq(backup_state.txn);
- }
- } else {
- qemu_mutex_unlock(&backup_state.backup_mutex);
- }
-
return task.result;
}
@@ -1030,6 +1085,7 @@ BackupStatus *qmp_query_backup(Error **errp)
info->transferred = backup_state.stat.transferred;
info->has_reused = true;
info->reused = backup_state.stat.reused;
+ info->finishing = backup_state.stat.finishing;
qemu_mutex_unlock(&backup_state.stat.lock);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7fde927621..bf559c6d52 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -770,12 +770,15 @@
#
# @uuid: uuid for this backup job
#
+# @finishing: if status='active' and finishing=true, then the backup process is
+# waiting for the target to finish.
+#
##
{ 'struct': 'BackupStatus',
'data': {'*status': 'str', '*errmsg': 'str', '*total': 'int', '*dirty': 'int',
'*transferred': 'int', '*zero-bytes': 'int', '*reused': 'int',
'*start-time': 'int', '*end-time': 'int',
- '*backup-file': 'str', '*uuid': 'str' } }
+ '*backup-file': 'str', '*uuid': 'str', 'finishing': 'bool' } }
##
# @BackupFormat:

View File

@@ -13,23 +13,21 @@ safe migration is possible and makes sense.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: split up state_pending for 8.0]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
include/migration/misc.h | 3 ++
migration/meson.build | 2 +
migration/migration.c | 1 +
migration/pbs-state.c | 104 +++++++++++++++++++++++++++++++++++++++
migration/pbs-state.c | 106 +++++++++++++++++++++++++++++++++++++++
pve-backup.c | 1 +
qapi/block-core.json | 6 +++
6 files changed, 117 insertions(+)
6 files changed, 119 insertions(+)
create mode 100644 migration/pbs-state.c
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 8b49841016..78f63ca400 100644
index 465906710d..4f0aeceb6f 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -77,4 +77,7 @@ bool migration_in_bg_snapshot(void);
@@ -75,4 +75,7 @@ bool migration_in_bg_snapshot(void);
/* migration/block-dirty-bitmap.c */
void dirty_bitmap_mig_init(void);
@@ -38,7 +36,7 @@ index 8b49841016..78f63ca400 100644
+
#endif
diff --git a/migration/meson.build b/migration/meson.build
index a7824b5266..de6a271b58 100644
index 0842d00cd2..d012f4d8d3 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -6,8 +6,10 @@ migration_files = files(
@@ -53,10 +51,10 @@ index a7824b5266..de6a271b58 100644
softmmu_ss.add(files(
'block-dirty-bitmap.c',
diff --git a/migration/migration.c b/migration/migration.c
index 99f86bd6c2..db229e72c9 100644
index 9b496cce1d..421b4ee225 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -245,6 +245,7 @@ void migration_object_init(void)
@@ -229,6 +229,7 @@ void migration_object_init(void)
blk_mig_init();
ram_mig_init();
dirty_bitmap_mig_init();
@@ -66,10 +64,10 @@ index 99f86bd6c2..db229e72c9 100644
void migration_cancel(const Error *error)
diff --git a/migration/pbs-state.c b/migration/pbs-state.c
new file mode 100644
index 0000000000..887e998b9e
index 0000000000..29f2b3860d
--- /dev/null
+++ b/migration/pbs-state.c
@@ -0,0 +1,104 @@
@@ -0,0 +1,106 @@
+/*
+ * PBS (dirty-bitmap) state migration
+ */
@@ -88,8 +86,11 @@ index 0000000000..887e998b9e
+/* state is accessed via this static variable directly, 'opaque' is NULL */
+static PBSState pbs_state;
+
+static void pbs_state_pending(void *opaque, uint64_t *must_precopy,
+ uint64_t *can_postcopy)
+static void pbs_state_save_pending(QEMUFile *f, void *opaque,
+ uint64_t max_size,
+ uint64_t *res_precopy_only,
+ uint64_t *res_compatible,
+ uint64_t *res_postcopy_only)
+{
+ /* we send everything in save_setup, so nothing is ever pending */
+}
@@ -159,8 +160,7 @@ index 0000000000..887e998b9e
+static SaveVMHandlers savevm_pbs_state_handlers = {
+ .save_setup = pbs_state_save_setup,
+ .has_postcopy = pbs_state_has_postcopy,
+ .state_pending_exact = pbs_state_pending,
+ .state_pending_estimate = pbs_state_pending,
+ .save_live_pending = pbs_state_save_pending,
+ .is_active_iterate = pbs_state_is_active_iterate,
+ .load_state = pbs_state_load,
+ .is_active = pbs_state_is_active,
@@ -175,22 +175,22 @@ index 0000000000..887e998b9e
+ NULL);
+}
diff --git a/pve-backup.c b/pve-backup.c
index 21441b2f97..5e5c37e06d 100644
index 7bd9d06346..5662f48b72 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1060,6 +1060,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1128,6 +1128,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
ret->pbs_dirty_bitmap = true;
ret->pbs_dirty_bitmap_savevm = true;
+ ret->pbs_dirty_bitmap_migration = true;
ret->query_bitmap_info = true;
ret->pbs_masterkey = true;
ret->backup_max_workers = true;
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d601fb4ab2..16be1e02ec 100644
index bf559c6d52..24f30260c8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -987,6 +987,11 @@
@@ -879,6 +879,11 @@
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -199,14 +199,14 @@ index d601fb4ab2..16be1e02ec 100644
+# migration cap if this is false/unset may lead
+# to crashes on migration!
+#
# @pbs-masterkey: True if the QMP backup call supports the 'master_keyfile'
# parameter.
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
@@ -997,6 +1002,7 @@
##
@@ -886,6 +891,7 @@
'data': { 'pbs-dirty-bitmap': 'bool',
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',
+ 'pbs-dirty-bitmap-migration': 'bool',
'pbs-masterkey': 'bool',
'pbs-library-version': 'str',
'backup-max-workers': 'bool' } }
'pbs-library-version': 'str' } }
##

View File

@@ -19,10 +19,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 7eaf498439..509f3df0a6 100644
index 9aba7d9c22..f4ecf9c9f9 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -539,7 +539,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
@@ -538,7 +538,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, &local_err)) {
error_report_err(local_err);

View File

@@ -21,10 +21,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 30 insertions(+)
diff --git a/block/iscsi.c b/block/iscsi.c
index 9fc0bed90b..1d40933165 100644
index 1bba42a71b..89cd032c3a 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1392,12 +1392,42 @@ static char *get_initiator_name(QemuOpts *opts)
@@ -1388,12 +1388,42 @@ static char *get_initiator_name(QemuOpts *opts)
const char *name;
char *iscsi_name;
UuidInfo *uuid_info;

View File

@@ -0,0 +1,598 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Tue, 26 Jan 2021 15:45:30 +0100
Subject: [PATCH] PVE: Use coroutine QMP for backup/cancel_backup
Finally turn backup QMP calls into coroutines, now that it's possible.
This has the benefit that calls are asynchronous to the main loop, i.e.
long running operations like connecting to a PBS server will no longer
hang the VM.
Additionally, it allows us to get rid of block_on_coroutine_fn, which
was always a hacky workaround.
While we're already spring cleaning, also remove the QmpBackupTask
struct, since we can now put the 'prepare' function directly into
qmp_backup and thus no longer need those giant walls of text.
(Note that for our patches to work with 5.2.0 this change is actually
required, otherwise monitor_get_fd() fails as we're not in a QMP
coroutine, but one we start ourselves - we could of course set the
monitor for that coroutine ourselves, but let's just fix it the right
way instead)
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 4 +-
hmp-commands.hx | 2 +
proxmox-backup-client.c | 31 -----
pve-backup.c | 232 ++++++++++-----------------------
qapi/block-core.json | 4 +-
5 files changed, 77 insertions(+), 196 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index f59b02592e..2e53cb65df 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1018,7 +1018,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
g_free(global_snapshots);
}
-void hmp_backup_cancel(Monitor *mon, const QDict *qdict)
+void coroutine_fn hmp_backup_cancel(Monitor *mon, const QDict *qdict)
{
Error *error = NULL;
@@ -1027,7 +1027,7 @@ void hmp_backup_cancel(Monitor *mon, const QDict *qdict)
hmp_handle_error(mon, error);
}
-void hmp_backup(Monitor *mon, const QDict *qdict)
+void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
{
Error *error = NULL;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index fcf9461295..5fdb198ca4 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -111,6 +111,7 @@ ERST
"\n\t\t\t Use -d to dump data into a directory instead"
"\n\t\t\t of using VMA format.",
.cmd = hmp_backup,
+ .coroutine = true,
},
SRST
@@ -124,6 +125,7 @@ ERST
.params = "",
.help = "cancel the current VM backup",
.cmd = hmp_backup_cancel,
+ .coroutine = true,
},
SRST
diff --git a/proxmox-backup-client.c b/proxmox-backup-client.c
index 4ce7bc0b5e..0923037dec 100644
--- a/proxmox-backup-client.c
+++ b/proxmox-backup-client.c
@@ -5,37 +5,6 @@
/* Proxmox Backup Server client bindings using coroutines */
-typedef struct BlockOnCoroutineWrapper {
- AioContext *ctx;
- CoroutineEntry *entry;
- void *entry_arg;
- bool finished;
-} BlockOnCoroutineWrapper;
-
-static void coroutine_fn block_on_coroutine_wrapper(void *opaque)
-{
- BlockOnCoroutineWrapper *wrapper = opaque;
- wrapper->entry(wrapper->entry_arg);
- wrapper->finished = true;
- aio_wait_kick();
-}
-
-void block_on_coroutine_fn(CoroutineEntry *entry, void *entry_arg)
-{
- assert(!qemu_in_coroutine());
-
- AioContext *ctx = qemu_get_current_aio_context();
- BlockOnCoroutineWrapper wrapper = {
- .finished = false,
- .entry = entry,
- .entry_arg = entry_arg,
- .ctx = ctx,
- };
- Coroutine *wrapper_co = qemu_coroutine_create(block_on_coroutine_wrapper, &wrapper);
- aio_co_enter(ctx, wrapper_co);
- AIO_WAIT_WHILE(ctx, !wrapper.finished);
-}
-
// This is called from another thread, so we use aio_co_schedule()
static void proxmox_backup_schedule_wake(void *data) {
CoCtxData *waker = (CoCtxData *)data;
diff --git a/pve-backup.c b/pve-backup.c
index 5662f48b72..e4fe1b601d 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -354,7 +354,7 @@ static void job_cancel_bh(void *opaque) {
aio_co_enter(data->ctx, data->co);
}
-static void coroutine_fn pvebackup_co_cancel(void *opaque)
+void coroutine_fn qmp_backup_cancel(Error **errp)
{
Error *cancel_err = NULL;
error_setg(&cancel_err, "backup canceled");
@@ -391,11 +391,6 @@ static void coroutine_fn pvebackup_co_cancel(void *opaque)
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
-void qmp_backup_cancel(Error **errp)
-{
- block_on_coroutine_fn(pvebackup_co_cancel, NULL);
-}
-
// assumes the caller holds backup_mutex
static int coroutine_fn pvebackup_co_add_config(
const char *file,
@@ -529,50 +524,27 @@ static void create_backup_jobs_bh(void *opaque) {
aio_co_enter(data->ctx, data->co);
}
-typedef struct QmpBackupTask {
- const char *backup_file;
- bool has_password;
- const char *password;
- bool has_keyfile;
- const char *keyfile;
- bool has_key_password;
- const char *key_password;
- bool has_backup_id;
- const char *backup_id;
- bool has_backup_time;
- const char *fingerprint;
- bool has_fingerprint;
- int64_t backup_time;
- bool has_use_dirty_bitmap;
- bool use_dirty_bitmap;
- bool has_format;
- BackupFormat format;
- bool has_config_file;
- const char *config_file;
- bool has_firewall_file;
- const char *firewall_file;
- bool has_devlist;
- const char *devlist;
- bool has_compress;
- bool compress;
- bool has_encrypt;
- bool encrypt;
- bool has_speed;
- int64_t speed;
- Error **errp;
- UuidInfo *result;
-} QmpBackupTask;
-
-static void coroutine_fn pvebackup_co_prepare(void *opaque)
+UuidInfo coroutine_fn *qmp_backup(
+ const char *backup_file,
+ bool has_password, const char *password,
+ bool has_keyfile, const char *keyfile,
+ bool has_key_password, const char *key_password,
+ bool has_fingerprint, const char *fingerprint,
+ bool has_backup_id, const char *backup_id,
+ bool has_backup_time, int64_t backup_time,
+ bool has_use_dirty_bitmap, bool use_dirty_bitmap,
+ bool has_compress, bool compress,
+ bool has_encrypt, bool encrypt,
+ bool has_format, BackupFormat format,
+ bool has_config_file, const char *config_file,
+ bool has_firewall_file, const char *firewall_file,
+ bool has_devlist, const char *devlist,
+ bool has_speed, int64_t speed, Error **errp)
{
assert(qemu_in_coroutine());
qemu_co_mutex_lock(&backup_state.backup_mutex);
- QmpBackupTask *task = opaque;
-
- task->result = NULL; // just to be sure
-
BlockBackend *blk;
BlockDriverState *bs = NULL;
const char *backup_dir = NULL;
@@ -589,17 +561,17 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *firewall_name = "qemu-server.fw";
if (backup_state.di_list) {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
"previous backup not finished");
qemu_co_mutex_unlock(&backup_state.backup_mutex);
- return;
+ return NULL;
}
/* Todo: try to auto-detect format based on file name */
- BackupFormat format = task->has_format ? task->format : BACKUP_FORMAT_VMA;
+ format = has_format ? format : BACKUP_FORMAT_VMA;
- if (task->has_devlist) {
- devs = g_strsplit_set(task->devlist, ",;:", -1);
+ if (has_devlist) {
+ devs = g_strsplit_set(devlist, ",;:", -1);
gchar **d = devs;
while (d && *d) {
@@ -607,14 +579,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (blk) {
bs = blk_bs(blk);
if (!bdrv_is_inserted(bs)) {
- error_setg(task->errp, QERR_DEVICE_HAS_NO_MEDIUM, *d);
+ error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, *d);
goto err;
}
PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
di->bs = bs;
di_list = g_list_append(di_list, di);
} else {
- error_set(task->errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
"Device '%s' not found", *d);
goto err;
}
@@ -637,7 +609,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
if (!di_list) {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "empty device list");
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "empty device list");
goto err;
}
@@ -647,13 +619,13 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
while (l) {
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
- if (bdrv_op_is_blocked(di->bs, BLOCK_OP_TYPE_BACKUP_SOURCE, task->errp)) {
+ if (bdrv_op_is_blocked(di->bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
goto err;
}
ssize_t size = bdrv_getlength(di->bs);
if (size < 0) {
- error_setg_errno(task->errp, -size, "bdrv_getlength failed");
+ error_setg_errno(errp, -size, "bdrv_getlength failed");
goto err;
}
di->size = size;
@@ -680,47 +652,44 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
if (format == BACKUP_FORMAT_PBS) {
- if (!task->has_password) {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'password'");
+ if (!has_password) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'password'");
goto err_mutex;
}
- if (!task->has_backup_id) {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'backup-id'");
+ if (!has_backup_id) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'backup-id'");
goto err_mutex;
}
- if (!task->has_backup_time) {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'backup-time'");
+ if (!has_backup_time) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "missing parameter 'backup-time'");
goto err_mutex;
}
int dump_cb_block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE; // Hardcoded (4M)
firewall_name = "fw.conf";
- bool use_dirty_bitmap = task->has_use_dirty_bitmap && task->use_dirty_bitmap;
-
-
char *pbs_err = NULL;
pbs = proxmox_backup_new(
- task->backup_file,
- task->backup_id,
- task->backup_time,
+ backup_file,
+ backup_id,
+ backup_time,
dump_cb_block_size,
- task->has_password ? task->password : NULL,
- task->has_keyfile ? task->keyfile : NULL,
- task->has_key_password ? task->key_password : NULL,
- task->has_compress ? task->compress : true,
- task->has_encrypt ? task->encrypt : task->has_keyfile,
- task->has_fingerprint ? task->fingerprint : NULL,
+ has_password ? password : NULL,
+ has_keyfile ? keyfile : NULL,
+ has_key_password ? key_password : NULL,
+ has_compress ? compress : true,
+ has_encrypt ? encrypt : has_keyfile,
+ has_fingerprint ? fingerprint : NULL,
&pbs_err);
if (!pbs) {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
"proxmox_backup_new failed: %s", pbs_err);
proxmox_backup_free_error(pbs_err);
goto err_mutex;
}
- int connect_result = proxmox_backup_co_connect(pbs, task->errp);
+ int connect_result = proxmox_backup_co_connect(pbs, errp);
if (connect_result < 0)
goto err_mutex;
@@ -739,9 +708,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
bool expect_only_dirty = false;
- if (use_dirty_bitmap) {
+ if (has_use_dirty_bitmap && use_dirty_bitmap) {
if (bitmap == NULL) {
- bitmap = bdrv_create_dirty_bitmap(di->bs, dump_cb_block_size, PBS_BITMAP_NAME, task->errp);
+ bitmap = bdrv_create_dirty_bitmap(di->bs, dump_cb_block_size, PBS_BITMAP_NAME, errp);
if (!bitmap) {
goto err_mutex;
}
@@ -771,12 +740,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
}
- int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, task->errp);
+ int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, errp);
if (dev_id < 0) {
goto err_mutex;
}
- if (!(di->target = bdrv_backup_dump_create(dump_cb_block_size, di->size, pvebackup_co_dump_pbs_cb, di, task->errp))) {
+ if (!(di->target = bdrv_backup_dump_create(dump_cb_block_size, di->size, pvebackup_co_dump_pbs_cb, di, errp))) {
goto err_mutex;
}
@@ -790,10 +759,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
backup_state.stat.bitmap_list = g_list_append(backup_state.stat.bitmap_list, info);
}
} else if (format == BACKUP_FORMAT_VMA) {
- vmaw = vma_writer_create(task->backup_file, uuid, &local_err);
+ vmaw = vma_writer_create(backup_file, uuid, &local_err);
if (!vmaw) {
if (local_err) {
- error_propagate(task->errp, local_err);
+ error_propagate(errp, local_err);
}
goto err_mutex;
}
@@ -804,25 +773,25 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
- if (!(di->target = bdrv_backup_dump_create(VMA_CLUSTER_SIZE, di->size, pvebackup_co_dump_vma_cb, di, task->errp))) {
+ if (!(di->target = bdrv_backup_dump_create(VMA_CLUSTER_SIZE, di->size, pvebackup_co_dump_vma_cb, di, errp))) {
goto err_mutex;
}
const char *devname = bdrv_get_device_name(di->bs);
di->dev_id = vma_writer_register_stream(vmaw, devname, di->size);
if (di->dev_id <= 0) {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR,
"register_stream failed");
goto err_mutex;
}
}
} else if (format == BACKUP_FORMAT_DIR) {
- if (mkdir(task->backup_file, 0640) != 0) {
- error_setg_errno(task->errp, errno, "can't create directory '%s'\n",
- task->backup_file);
+ if (mkdir(backup_file, 0640) != 0) {
+ error_setg_errno(errp, errno, "can't create directory '%s'\n",
+ backup_file);
goto err_mutex;
}
- backup_dir = task->backup_file;
+ backup_dir = backup_file;
l = di_list;
while (l) {
@@ -836,34 +805,34 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
bdrv_img_create(di->targetfile, "raw", NULL, NULL, NULL,
di->size, flags, false, &local_err);
if (local_err) {
- error_propagate(task->errp, local_err);
+ error_propagate(errp, local_err);
goto err_mutex;
}
di->target = bdrv_open(di->targetfile, NULL, NULL, flags, &local_err);
if (!di->target) {
- error_propagate(task->errp, local_err);
+ error_propagate(errp, local_err);
goto err_mutex;
}
}
} else {
- error_set(task->errp, ERROR_CLASS_GENERIC_ERROR, "unknown backup format");
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "unknown backup format");
goto err_mutex;
}
/* add configuration file to archive */
- if (task->has_config_file) {
- if (pvebackup_co_add_config(task->config_file, config_name, format, backup_dir,
- vmaw, pbs, task->errp) != 0) {
+ if (has_config_file) {
+ if (pvebackup_co_add_config(config_file, config_name, format, backup_dir,
+ vmaw, pbs, errp) != 0) {
goto err_mutex;
}
}
/* add firewall file to archive */
- if (task->has_firewall_file) {
- if (pvebackup_co_add_config(task->firewall_file, firewall_name, format, backup_dir,
- vmaw, pbs, task->errp) != 0) {
+ if (has_firewall_file) {
+ if (pvebackup_co_add_config(firewall_file, firewall_name, format, backup_dir,
+ vmaw, pbs, errp) != 0) {
goto err_mutex;
}
}
@@ -881,7 +850,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (backup_state.stat.backup_file) {
g_free(backup_state.stat.backup_file);
}
- backup_state.stat.backup_file = g_strdup(task->backup_file);
+ backup_state.stat.backup_file = g_strdup(backup_file);
uuid_copy(backup_state.stat.uuid, uuid);
uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
@@ -896,7 +865,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
qemu_mutex_unlock(&backup_state.stat.lock);
- backup_state.speed = (task->has_speed && task->speed > 0) ? task->speed : 0;
+ backup_state.speed = (has_speed && speed > 0) ? speed : 0;
backup_state.vmaw = vmaw;
backup_state.pbs = pbs;
@@ -906,8 +875,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_info = g_malloc0(sizeof(*uuid_info));
uuid_info->UUID = uuid_str;
- task->result = uuid_info;
-
/* Run create_backup_jobs_bh outside of coroutine (in BH) but keep
* backup_mutex locked. This is fine, a CoMutex can be held across yield
* points, and we'll release it as soon as the BH reschedules us.
@@ -921,7 +888,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
qemu_coroutine_yield();
if (local_err) {
- error_propagate(task->errp, local_err);
+ error_propagate(errp, local_err);
goto err;
}
@@ -934,7 +901,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
/* start the first job in the transaction */
job_txn_start_seq(backup_state.txn);
- return;
+ return uuid_info;
err_mutex:
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -965,7 +932,7 @@ err:
if (vmaw) {
Error *err = NULL;
vma_writer_close(vmaw, &err);
- unlink(task->backup_file);
+ unlink(backup_file);
}
if (pbs) {
@@ -976,65 +943,8 @@ err:
rmdir(backup_dir);
}
- task->result = NULL;
-
qemu_co_mutex_unlock(&backup_state.backup_mutex);
- return;
-}
-
-UuidInfo *qmp_backup(
- const char *backup_file,
- bool has_password, const char *password,
- bool has_keyfile, const char *keyfile,
- bool has_key_password, const char *key_password,
- bool has_fingerprint, const char *fingerprint,
- bool has_backup_id, const char *backup_id,
- bool has_backup_time, int64_t backup_time,
- bool has_use_dirty_bitmap, bool use_dirty_bitmap,
- bool has_compress, bool compress,
- bool has_encrypt, bool encrypt,
- bool has_format, BackupFormat format,
- bool has_config_file, const char *config_file,
- bool has_firewall_file, const char *firewall_file,
- bool has_devlist, const char *devlist,
- bool has_speed, int64_t speed, Error **errp)
-{
- QmpBackupTask task = {
- .backup_file = backup_file,
- .has_password = has_password,
- .password = password,
- .has_keyfile = has_keyfile,
- .keyfile = keyfile,
- .has_key_password = has_key_password,
- .key_password = key_password,
- .has_fingerprint = has_fingerprint,
- .fingerprint = fingerprint,
- .has_backup_id = has_backup_id,
- .backup_id = backup_id,
- .has_backup_time = has_backup_time,
- .backup_time = backup_time,
- .has_use_dirty_bitmap = has_use_dirty_bitmap,
- .use_dirty_bitmap = use_dirty_bitmap,
- .has_compress = has_compress,
- .compress = compress,
- .has_encrypt = has_encrypt,
- .encrypt = encrypt,
- .has_format = has_format,
- .format = format,
- .has_config_file = has_config_file,
- .config_file = config_file,
- .has_firewall_file = has_firewall_file,
- .firewall_file = firewall_file,
- .has_devlist = has_devlist,
- .devlist = devlist,
- .has_speed = has_speed,
- .speed = speed,
- .errp = errp,
- };
-
- block_on_coroutine_fn(pvebackup_co_prepare, &task);
-
- return task.result;
+ return NULL;
}
BackupStatus *qmp_query_backup(Error **errp)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 24f30260c8..4e8c35a3a2 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -842,7 +842,7 @@
'*config-file': 'str',
'*firewall-file': 'str',
'*devlist': 'str', '*speed': 'int' },
- 'returns': 'UuidInfo' }
+ 'returns': 'UuidInfo', 'coroutine': true }
##
# @query-backup:
@@ -864,7 +864,7 @@
# Notes: This command succeeds even if there is no backup process running.
#
##
-{ 'command': 'backup-cancel' }
+{ 'command': 'backup-cancel', 'coroutine': true }
##
# @ProxmoxSupportStatus:

View File

@@ -1,153 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 6 Apr 2023 14:59:31 +0200
Subject: [PATCH] alloc-track: fix deadlock during drop
by replacing the block node directly after changing the backing file
instead of rescheduling it.
With changes in QEMU 8.0, calling bdrv_get_info (and bdrv_unref)
during drop can lead to a deadlock when using iothread (only triggered
with multiple disks, except during debugging where it also triggered
with one disk sometimes):
1. job_unref_locked acquires the AioContext and calls job->driver->free
2. track_drop gets scheduled
3. bdrv_graph_wrlock is called and polls which leads to track_drop being
called
4. track_drop acquires the AioContext recursively
5. bdrv_get_info is a wrapped coroutine (since 8.0) and thus polls for
bdrv_co_get_info. This releases the AioContext, but only once! The
documentation for the AIO_WAIT_WHILE macro states that the
AioContext lock needs to be acquired exactly once, but there does
not seem to be a way for track_drop to know if it acquired the lock
recursively or not (without adding further hacks).
6. Because the AioContext is still held by the main thread once, it can't
be acquired before entering bdrv_co_get_info in co_schedule_bh_cb
which happens in the iothread
When doing the operation in change_backing_file, the AioContext has
already been acquired by the caller, so the issue with the recursive
lock goes away.
The comment explaining why delaying the replace is necessary is
> we need to schedule this for later however, since when this function
> is called, the blockjob modifying us is probably not done yet and
> has a blocker on 'bs'
However, there is no check for blockers in bdrv_replace_node. It would
need to be done by us, the caller, with check_to_replace_node.
Furthermore, the mirror job also does its call to bdrv_replace_node
while there is an active blocker (inserted by mirror itself) and they
use a specialized version to check for blockers instead of
check_to_replace_node there. Alloc-track could also do something
similar to check for other blockers, but it should be fine to rely on
Proxmox VE that no other operation with the blockdev is going on.
Mirror also drains the target before replacing the node, but the
target can have other users. In case of alloc-track the file child
should not be accessible by anybody else and so there can't be an
in-flight operation for the file child when alloc-track is drained.
The rescheduling based on refcounting is a hack and it doesn't seem to
be necessary anymore. It's not clear what the original issue from the
comment was. Testing with older builds with track_drop done directly
without rescheduling also didn't lead to any noticable issue for me.
One issue it might have been is the one fixed by b1e1af394d
("block/stream: Drain subtree around graph change"), where
block-stream had a use-after-free if the base node changed at an
inconvenient time (which alloc-track's auto-drop does).
It's also not possible to just not auto-replace the alloc-track. Not
replacing it at all leads to other operations like block resize
hanging, and there is no good way to replace it manually via QMP
(there is x-blockdev-change, but it is experimental and doesn't
implement the required operation yet). Also, it's just cleaner in
general to not leave unnecessary block nodes lying around.
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 54 ++++++++++++++-------------------------------
1 file changed, 16 insertions(+), 38 deletions(-)
diff --git a/block/alloc-track.c b/block/alloc-track.c
index b75d7c6460..76da140a68 100644
--- a/block/alloc-track.c
+++ b/block/alloc-track.c
@@ -25,7 +25,6 @@
typedef enum DropState {
DropNone,
- DropRequested,
DropInProgress,
} DropState;
@@ -268,37 +267,6 @@ static void track_child_perm(BlockDriverState *bs, BdrvChild *c,
}
}
-static void track_drop(void *opaque)
-{
- BlockDriverState *bs = (BlockDriverState*)opaque;
- BlockDriverState *file = bs->file->bs;
- BDRVAllocTrackState *s = bs->opaque;
-
- assert(file);
-
- /* we rely on the fact that we're not used anywhere else, so let's wait
- * until we're only used once - in the drive connected to the guest (and one
- * ref is held by bdrv_ref in track_change_backing_file) */
- if (bs->refcnt > 2) {
- aio_bh_schedule_oneshot(qemu_get_aio_context(), track_drop, opaque);
- return;
- }
- AioContext *aio_context = bdrv_get_aio_context(bs);
- aio_context_acquire(aio_context);
-
- bdrv_drained_begin(bs);
-
- /* now that we're drained, we can safely set 'DropInProgress' */
- s->drop_state = DropInProgress;
- bdrv_child_refresh_perms(bs, bs->file, &error_abort);
-
- bdrv_replace_node(bs, file, &error_abort);
- bdrv_set_backing_hd(bs, NULL, &error_abort);
- bdrv_drained_end(bs);
- bdrv_unref(bs);
- aio_context_release(aio_context);
-}
-
static int track_change_backing_file(BlockDriverState *bs,
const char *backing_file,
const char *backing_fmt)
@@ -308,13 +276,23 @@ static int track_change_backing_file(BlockDriverState *bs,
backing_file == NULL && backing_fmt == NULL)
{
/* backing file has been disconnected, there's no longer any use for
- * this node, so let's remove ourselves from the block graph - we need
- * to schedule this for later however, since when this function is
- * called, the blockjob modifying us is probably not done yet and has a
- * blocker on 'bs' */
- s->drop_state = DropRequested;
+ * this node, so let's remove ourselves from the block graph */
+ BlockDriverState *file = bs->file->bs;
+
+ /* Just to be sure, because bdrv_replace_node unrefs it */
bdrv_ref(bs);
- aio_bh_schedule_oneshot(qemu_get_aio_context(), track_drop, (void*)bs);
+ bdrv_drained_begin(bs);
+
+ /* now that we're drained, we can safely set 'DropInProgress' */
+ s->drop_state = DropInProgress;
+
+ bdrv_child_refresh_perms(bs, bs->file, &error_abort);
+
+ bdrv_replace_node(bs, file, &error_abort);
+ bdrv_set_backing_hd(bs, NULL, &error_abort);
+
+ bdrv_drained_end(bs);
+ bdrv_unref(bs);
}
return 0;

View File

@@ -0,0 +1,98 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 10 Feb 2021 11:07:06 +0100
Subject: [PATCH] PBS: add master key support
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
this requires a new enough libproxmox-backup-qemu0, and allows querying
from the PVE side to avoid QMP calls with unsupported parameters.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 1 +
pve-backup.c | 3 +++
qapi/block-core.json | 7 +++++++
3 files changed, 11 insertions(+)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 2e53cb65df..f98f4cf7e6 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1041,6 +1041,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS password
false, NULL, // PBS keyfile
false, NULL, // PBS key_password
+ false, NULL, // PBS master_keyfile
false, NULL, // PBS fingerprint
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
diff --git a/pve-backup.c b/pve-backup.c
index e4fe1b601d..41e8effa01 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -529,6 +529,7 @@ UuidInfo coroutine_fn *qmp_backup(
bool has_password, const char *password,
bool has_keyfile, const char *keyfile,
bool has_key_password, const char *key_password,
+ bool has_master_keyfile, const char *master_keyfile,
bool has_fingerprint, const char *fingerprint,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
@@ -677,6 +678,7 @@ UuidInfo coroutine_fn *qmp_backup(
has_password ? password : NULL,
has_keyfile ? keyfile : NULL,
has_key_password ? key_password : NULL,
+ has_master_keyfile ? master_keyfile : NULL,
has_compress ? compress : true,
has_encrypt ? encrypt : has_keyfile,
has_fingerprint ? fingerprint : NULL,
@@ -1040,5 +1042,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_dirty_bitmap_savevm = true;
ret->pbs_dirty_bitmap_migration = true;
ret->query_bitmap_info = true;
+ ret->pbs_masterkey = true;
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4e8c35a3a2..d8c7331090 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -813,6 +813,8 @@
#
# @key-password: password for keyfile (optional for format 'pbs')
#
+# @master-keyfile: PEM-formatted master public keyfile (optional for format 'pbs')
+#
# @fingerprint: server cert fingerprint (optional for format 'pbs')
#
# @backup-id: backup ID (required for format 'pbs')
@@ -832,6 +834,7 @@
'*password': 'str',
'*keyfile': 'str',
'*key-password': 'str',
+ '*master-keyfile': 'str',
'*fingerprint': 'str',
'*backup-id': 'str',
'*backup-time': 'int',
@@ -884,6 +887,9 @@
# migration cap if this is false/unset may lead
# to crashes on migration!
#
+# @pbs-masterkey: True if the QMP backup call supports the 'master_keyfile'
+# parameter.
+#
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
##
@@ -892,6 +898,7 @@
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',
'pbs-dirty-bitmap-migration': 'bool',
+ 'pbs-masterkey': 'bool',
'pbs-library-version': 'str' } }
##

View File

@@ -1,190 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 5 May 2023 13:39:53 +0200
Subject: [PATCH] migration: for snapshots, hold the BQL during setup callbacks
In spirit, this is a partial revert of commit 9b09503752 ("migration:
run setup callbacks out of big lock"), but only for the snapshot case.
For snapshots, the bdrv_writev_vmstate() function is used during setup
(in QIOChannelBlock backing the QEMUFile), but not holding the BQL
while calling it could lead to an assertion failure. To understand
how, first note the following:
1. Generated coroutine wrappers for block layer functions spawn the
coroutine and use AIO_WAIT_WHILE()/aio_poll() to wait for it.
2. If the host OS switches threads at an inconvenient time, it can
happen that a bottom half scheduled for the main thread's AioContext
is executed as part of a vCPU thread's aio_poll().
An example leading to the assertion failure is as follows:
main thread:
1. A snapshot-save QMP command gets issued.
2. snapshot_save_job_bh() is scheduled.
vCPU thread:
3. aio_poll() for the main thread's AioContext is called (e.g. when
the guest writes to a pflash device, as part of blk_pwrite which is a
generated coroutine wrapper).
4. snapshot_save_job_bh() is executed as part of aio_poll().
3. qemu_savevm_state() is called.
4. qemu_mutex_unlock_iothread() is called. Now
qemu_get_current_aio_context() returns 0x0.
5. bdrv_writev_vmstate() is executed during the usual savevm setup.
But this function is a generated coroutine wrapper, so it uses
AIO_WAIT_WHILE. There, the assertion
assert(qemu_get_current_aio_context() == qemu_get_aio_context());
will fail.
To fix it, ensure that the BQL is held during setup. To avoid changing
the behavior for migration too, introduce conditionals for the setup
callbacks that need the BQL and only take the lock if it's not already
held.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
include/migration/register.h | 2 +-
migration/block-dirty-bitmap.c | 15 ++++++++++++---
migration/block.c | 15 ++++++++++++---
migration/ram.c | 16 +++++++++++++---
migration/savevm.c | 2 --
5 files changed, 38 insertions(+), 12 deletions(-)
diff --git a/include/migration/register.h b/include/migration/register.h
index a8dfd8fefd..fa9b0b0f10 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -43,9 +43,9 @@ typedef struct SaveVMHandlers {
* by other locks.
*/
int (*save_live_iterate)(QEMUFile *f, void *opaque);
+ int (*save_setup)(QEMUFile *f, void *opaque);
/* This runs outside the iothread lock! */
- int (*save_setup)(QEMUFile *f, void *opaque);
/* Note for save_live_pending:
* must_precopy:
* - must be migrated in precopy or in stopped state
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 509f3df0a6..42dc4a8d61 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1220,10 +1220,17 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
{
DBMSaveState *s = &((DBMState *)opaque)->save;
SaveBitmapState *dbms = NULL;
+ bool release_lock = false;
- qemu_mutex_lock_iothread();
+ /* For snapshots, the BQL is held during setup. */
+ if (!qemu_mutex_iothread_locked()) {
+ qemu_mutex_lock_iothread();
+ release_lock = true;
+ }
if (init_dirty_bitmap_migration(s) < 0) {
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
return -1;
}
@@ -1231,7 +1238,9 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
send_bitmap_start(f, s, dbms);
}
qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
return 0;
}
diff --git a/migration/block.c b/migration/block.c
index b2497bbd32..c9d55be642 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -716,21 +716,30 @@ static void block_migration_cleanup(void *opaque)
static int block_save_setup(QEMUFile *f, void *opaque)
{
int ret;
+ bool release_lock = false;
trace_migration_block_save("setup", block_mig_state.submitted,
block_mig_state.transferred);
- qemu_mutex_lock_iothread();
+ /* For snapshots, the BQL is held during setup. */
+ if (!qemu_mutex_iothread_locked()) {
+ qemu_mutex_lock_iothread();
+ release_lock = true;
+ }
ret = init_blk_migration(f);
if (ret < 0) {
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
return ret;
}
/* start track dirty blocks */
ret = set_dirty_tracking();
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
if (ret) {
return ret;
diff --git a/migration/ram.c b/migration/ram.c
index 79d881f735..0ecbbc3202 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3117,8 +3117,16 @@ static void migration_bitmap_clear_discarded_pages(RAMState *rs)
static void ram_init_bitmaps(RAMState *rs)
{
- /* For memory_global_dirty_log_start below. */
- qemu_mutex_lock_iothread();
+ bool release_lock = false;
+
+ /*
+ * For memory_global_dirty_log_start below.
+ * For snapshots, the BQL is held during setup.
+ */
+ if (!qemu_mutex_iothread_locked()) {
+ qemu_mutex_lock_iothread();
+ release_lock = true;
+ }
qemu_mutex_lock_ramlist();
WITH_RCU_READ_LOCK_GUARD() {
@@ -3130,7 +3138,9 @@ static void ram_init_bitmaps(RAMState *rs)
}
}
qemu_mutex_unlock_ramlist();
- qemu_mutex_unlock_iothread();
+ if (release_lock) {
+ qemu_mutex_unlock_iothread();
+ }
/*
* After an eventual first bitmap sync, fixup the initial bitmap
diff --git a/migration/savevm.c b/migration/savevm.c
index aa54a67fda..fc6a82a555 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1621,10 +1621,8 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
memset(&compression_counters, 0, sizeof(compression_counters));
ms->to_dst_file = f;
- qemu_mutex_unlock_iothread();
qemu_savevm_state_header(f);
qemu_savevm_state_setup(f);
- qemu_mutex_lock_iothread();
while (qemu_file_get_error(f) == 0) {
if (qemu_savevm_state_iterate(f, false) > 0) {

View File

@@ -0,0 +1,53 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 9 Dec 2020 11:46:57 +0100
Subject: [PATCH] PVE: block/pbs: fast-path reads without allocation if
possible
...and switch over to g_malloc/g_free while at it to align with other
QEMU code.
Tracing shows the fast-path is taken almost all the time, though not
100% so the slow one is still necessary.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/pbs.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/block/pbs.c b/block/pbs.c
index 9d1f1f39d4..ce9a870885 100644
--- a/block/pbs.c
+++ b/block/pbs.c
@@ -200,7 +200,16 @@ static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
BDRVPBSState *s = bs->opaque;
int ret;
char *pbs_error = NULL;
- uint8_t *buf = malloc(bytes);
+ uint8_t *buf;
+ bool inline_buf = true;
+
+ /* for single-buffer IO vectors we can fast-path the write directly to it */
+ if (qiov->niov == 1 && qiov->iov->iov_len >= bytes) {
+ buf = qiov->iov->iov_base;
+ } else {
+ inline_buf = false;
+ buf = g_malloc(bytes);
+ }
if (offset < 0 || bytes < 0) {
fprintf(stderr, "unexpected negative 'offset' or 'bytes' value!\n");
@@ -223,8 +232,10 @@ static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
return -EIO;
}
- qemu_iovec_from_buf(qiov, 0, buf, bytes);
- free(buf);
+ if (!inline_buf) {
+ qemu_iovec_from_buf(qiov, 0, buf, bytes);
+ g_free(buf);
+ }
return 0;
}

View File

@@ -1,29 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 5 May 2023 15:30:16 +0200
Subject: [PATCH] savevm-async: don't hold BQL during setup
See commit "migration: for snapshots, hold the BQL during setup
callbacks" for why. This is separate, because a version of that one
will hopefully land upstream.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/savevm-async.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index b97f2c4f14..87ea0573d3 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -403,10 +403,8 @@ void qmp_savevm_start(const char *statefile, Error **errp)
snap_state.state = SAVE_STATE_ACTIVE;
snap_state.finalize_bh = qemu_bh_new(process_savevm_finalize, &snap_state);
snap_state.co = qemu_coroutine_create(&process_savevm_co, NULL);
- qemu_mutex_unlock_iothread();
qemu_savevm_state_header(snap_state.file);
qemu_savevm_state_setup(snap_state.file);
- qemu_mutex_lock_iothread();
/* Async processing from here on out happens in iohandler context, so let
* the target bdrv have its home there.

View File

@@ -11,10 +11,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/stream.c b/block/stream.c
index 7f9e1ecdbb..6a29d80398 100644
index 694709bd25..e09bd5c4ef 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -27,7 +27,7 @@ enum {
@@ -28,7 +28,7 @@ enum {
* large enough to process multiple clusters in a single call, so
* that populating contiguous regions of the image is efficient.
*/

View File

@@ -17,10 +17,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+)
diff --git a/block/io.c b/block/io.c
index 2e267a85ab..449a44bf20 100644
index 4589a58917..20167d61b7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1576,6 +1576,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
@@ -1730,6 +1730,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
{
int ret;

View File

@@ -25,21 +25,20 @@ once the backing image is removed. It will be replaced by 'file'.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures
make error return value consistent with QEMU
avoid premature break during read]
make error return value consistent with QEMU]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 352 ++++++++++++++++++++++++++++++++++++++++++++
block/alloc-track.c | 350 ++++++++++++++++++++++++++++++++++++++++++++
block/meson.build | 1 +
2 files changed, 353 insertions(+)
2 files changed, 351 insertions(+)
create mode 100644 block/alloc-track.c
diff --git a/block/alloc-track.c b/block/alloc-track.c
new file mode 100644
index 0000000000..b75d7c6460
index 0000000000..43d40d11af
--- /dev/null
+++ b/block/alloc-track.c
@@ -0,0 +1,352 @@
@@ -0,0 +1,350 @@
+/*
+ * Node to allow backing images to be applied to any node. Assumes a blank
+ * image to begin with, only new writes are tracked as allocated, thus this
@@ -55,7 +54,6 @@ index 0000000000..b75d7c6460
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "block/block_int.h"
+#include "block/dirty-bitmap.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/cutils.h"
@@ -165,9 +163,9 @@ index 0000000000..b75d7c6460
+ }
+}
+
+static coroutine_fn int64_t track_co_getlength(BlockDriverState *bs)
+static int64_t track_getlength(BlockDriverState *bs)
+{
+ return bdrv_co_getlength(bs->file->bs);
+ return bdrv_getlength(bs->file->bs);
+}
+
+static int coroutine_fn track_co_preadv(BlockDriverState *bs,
@@ -217,8 +215,7 @@ index 0000000000..b75d7c6460
+ ret = bdrv_co_preadv(bs->backing, local_offset, local_bytes,
+ &local_qiov, flags);
+ } else {
+ qemu_iovec_memset(&local_qiov, cur_offset, 0, local_bytes);
+ ret = 0;
+ ret = qemu_iovec_memset(&local_qiov, cur_offset, 0, local_bytes);
+ }
+
+ if (ret != 0) {
@@ -368,7 +365,7 @@ index 0000000000..b75d7c6460
+
+ .bdrv_file_open = track_open,
+ .bdrv_close = track_close,
+ .bdrv_co_getlength = track_co_getlength,
+ .bdrv_getlength = track_getlength,
+ .bdrv_child_perm = track_child_perm,
+ .bdrv_refresh_limits = track_refresh_limits,
+
@@ -393,7 +390,7 @@ index 0000000000..b75d7c6460
+
+block_init(bdrv_alloc_track_init);
diff --git a/block/meson.build b/block/meson.build
index eece0d5743..8a68162cc0 100644
index 7ef2fa72d5..15352f579f 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -2,6 +2,7 @@ block_ss.add(genh)

View File

@@ -0,0 +1,35 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 26 May 2021 17:36:55 +0200
Subject: [PATCH] PVE: savevm-async: register yank before
migration_incoming_state_destroy
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
migration/savevm-async.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index a38e7351c1..0b1b60c6ae 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -20,6 +20,7 @@
#include "qemu/timer.h"
#include "qemu/main-loop.h"
#include "qemu/rcu.h"
+#include "qemu/yank.h"
/* #define DEBUG_SAVEVM_STATE */
@@ -521,6 +522,10 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
dirty_bitmap_mig_before_vm_start();
qemu_fclose(f);
+
+ /* state_destroy assumes a real migration which would have added a yank */
+ yank_register_instance(MIGRATION_YANK_INSTANCE, &error_abort);
+
migration_incoming_state_destroy();
if (ret < 0) {
error_setg_errno(errp, -ret, "Error while loading VM state");

View File

@@ -1,9 +1,9 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Mon, 7 Feb 2022 14:21:01 +0100
Subject: [PATCH] qemu-img dd: add -l option for loading a snapshot
Subject: [PATCH] qemu-img: dd: add -l option for loading a snapshot
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
docs/tools/qemu-img.rst | 6 +++---
@@ -46,10 +46,10 @@ index b5b0bb4467..36f97e1f19 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 06d814e39c..e2c06c496d 100644
index 4f5ef5b887..4894016ad2 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -5002,6 +5002,7 @@ static int img_dd(int argc, char **argv)
@@ -4957,6 +4957,7 @@ static int img_dd(int argc, char **argv)
BlockDriver *drv = NULL, *proto_drv = NULL;
BlockBackend *blk1 = NULL, *blk2 = NULL;
QemuOpts *opts = NULL;
@@ -57,7 +57,7 @@ index 06d814e39c..e2c06c496d 100644
QemuOptsList *create_opts = NULL;
Error *local_err = NULL;
bool image_opts = false;
@@ -5011,6 +5012,7 @@ static int img_dd(int argc, char **argv)
@@ -4966,6 +4967,7 @@ static int img_dd(int argc, char **argv)
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
bool force_share = false, skip_create = false;
@@ -65,7 +65,7 @@ index 06d814e39c..e2c06c496d 100644
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5048,7 +5050,7 @@ static int img_dd(int argc, char **argv)
@@ -5003,7 +5005,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
@@ -74,7 +74,7 @@ index 06d814e39c..e2c06c496d 100644
if (c == EOF) {
break;
}
@@ -5071,6 +5073,19 @@ static int img_dd(int argc, char **argv)
@@ -5026,6 +5028,19 @@ static int img_dd(int argc, char **argv)
case 'n':
skip_create = true;
break;
@@ -94,7 +94,7 @@ index 06d814e39c..e2c06c496d 100644
case 'U':
force_share = true;
break;
@@ -5130,11 +5145,24 @@ static int img_dd(int argc, char **argv)
@@ -5085,11 +5100,24 @@ static int img_dd(int argc, char **argv)
if (dd.flags & C_IF) {
blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
force_share);
@@ -120,7 +120,7 @@ index 06d814e39c..e2c06c496d 100644
}
if (dd.flags & C_OSIZE) {
@@ -5289,6 +5317,7 @@ static int img_dd(int argc, char **argv)
@@ -5244,6 +5272,7 @@ static int img_dd(int argc, char **argv)
out:
g_free(arg);
qemu_opts_del(opts);

View File

@@ -0,0 +1,407 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Thu, 21 Apr 2022 13:26:48 +0200
Subject: [PATCH] vma: allow partial restore
Introduce a new map line for skipping a certain drive, of the form
skip=drive-scsi0
Since in PVE, most archives are compressed and piped to vma for
restore, it's not easily possible to skip reads.
For the reader, a new skip flag for VmaRestoreState is added and the
target is allowed to be NULL if skip is specified when registering. If
the skip flag is set, no writes will be made as well as no check for
duplicate clusters. Therefore, the flag is not set for verify.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
vma-reader.c | 64 ++++++++++++---------
vma.c | 157 +++++++++++++++++++++++++++++----------------------
vma.h | 2 +-
3 files changed, 126 insertions(+), 97 deletions(-)
diff --git a/vma-reader.c b/vma-reader.c
index e65f1e8415..81a891c6b1 100644
--- a/vma-reader.c
+++ b/vma-reader.c
@@ -28,6 +28,7 @@ typedef struct VmaRestoreState {
bool write_zeroes;
unsigned long *bitmap;
int bitmap_size;
+ bool skip;
} VmaRestoreState;
struct VmaReader {
@@ -425,13 +426,14 @@ VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id)
}
static void allocate_rstate(VmaReader *vmar, guint8 dev_id,
- BlockBackend *target, bool write_zeroes)
+ BlockBackend *target, bool write_zeroes, bool skip)
{
assert(vmar);
assert(dev_id);
vmar->rstate[dev_id].target = target;
vmar->rstate[dev_id].write_zeroes = write_zeroes;
+ vmar->rstate[dev_id].skip = skip;
int64_t size = vmar->devinfo[dev_id].size;
@@ -446,28 +448,30 @@ static void allocate_rstate(VmaReader *vmar, guint8 dev_id,
}
int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id, BlockBackend *target,
- bool write_zeroes, Error **errp)
+ bool write_zeroes, bool skip, Error **errp)
{
assert(vmar);
- assert(target != NULL);
+ assert(target != NULL || skip);
assert(dev_id);
- assert(vmar->rstate[dev_id].target == NULL);
-
- int64_t size = blk_getlength(target);
- int64_t size_diff = size - vmar->devinfo[dev_id].size;
-
- /* storage types can have different size restrictions, so it
- * is not always possible to create an image with exact size.
- * So we tolerate a size difference up to 4MB.
- */
- if ((size_diff < 0) || (size_diff > 4*1024*1024)) {
- error_setg(errp, "vma_reader_register_bs for stream %s failed - "
- "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname,
- size, vmar->devinfo[dev_id].size);
- return -1;
+ assert(vmar->rstate[dev_id].target == NULL && !vmar->rstate[dev_id].skip);
+
+ if (target != NULL) {
+ int64_t size = blk_getlength(target);
+ int64_t size_diff = size - vmar->devinfo[dev_id].size;
+
+ /* storage types can have different size restrictions, so it
+ * is not always possible to create an image with exact size.
+ * So we tolerate a size difference up to 4MB.
+ */
+ if ((size_diff < 0) || (size_diff > 4*1024*1024)) {
+ error_setg(errp, "vma_reader_register_bs for stream %s failed - "
+ "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname,
+ size, vmar->devinfo[dev_id].size);
+ return -1;
+ }
}
- allocate_rstate(vmar, dev_id, target, write_zeroes);
+ allocate_rstate(vmar, dev_id, target, write_zeroes, skip);
return 0;
}
@@ -560,19 +564,23 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
VmaRestoreState *rstate = &vmar->rstate[dev_id];
BlockBackend *target = NULL;
+ bool skip = rstate->skip;
+
if (dev_id != vmar->vmstate_stream) {
target = rstate->target;
- if (!verify && !target) {
+ if (!verify && !target && !skip) {
error_setg(errp, "got wrong dev id %d", dev_id);
return -1;
}
- if (vma_reader_get_bitmap(rstate, cluster_num)) {
- error_setg(errp, "found duplicated cluster %zd for stream %s",
- cluster_num, vmar->devinfo[dev_id].devname);
- return -1;
+ if (!skip) {
+ if (vma_reader_get_bitmap(rstate, cluster_num)) {
+ error_setg(errp, "found duplicated cluster %zd for stream %s",
+ cluster_num, vmar->devinfo[dev_id].devname);
+ return -1;
+ }
+ vma_reader_set_bitmap(rstate, cluster_num, 1);
}
- vma_reader_set_bitmap(rstate, cluster_num, 1);
max_sector = vmar->devinfo[dev_id].size/BDRV_SECTOR_SIZE;
} else {
@@ -618,7 +626,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
return -1;
}
- if (!verify) {
+ if (!verify && !skip) {
int nb_sectors = end_sector - sector_num;
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
buf + start, sector_num, nb_sectors,
@@ -654,7 +662,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
return -1;
}
- if (!verify) {
+ if (!verify && !skip) {
int nb_sectors = end_sector - sector_num;
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
buf + start, sector_num,
@@ -679,7 +687,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
vmar->partial_zero_cluster_data += zero_size;
}
- if (rstate->write_zeroes && !verify) {
+ if (rstate->write_zeroes && !verify && !skip) {
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
zero_vma_block, sector_num,
nb_sectors, errp) < 0) {
@@ -850,7 +858,7 @@ int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp)
for (dev_id = 1; dev_id < 255; dev_id++) {
if (vma_reader_get_device_info(vmar, dev_id)) {
- allocate_rstate(vmar, dev_id, NULL, false);
+ allocate_rstate(vmar, dev_id, NULL, false, false);
}
}
diff --git a/vma.c b/vma.c
index e8dffb43e0..e6e9ffc7fe 100644
--- a/vma.c
+++ b/vma.c
@@ -138,6 +138,7 @@ typedef struct RestoreMap {
char *throttling_group;
char *cache;
bool write_zero;
+ bool skip;
} RestoreMap;
static bool try_parse_option(char **line, const char *optname, char **out, const char *inbuf) {
@@ -245,47 +246,61 @@ static int extract_content(int argc, char **argv)
char *bps = NULL;
char *group = NULL;
char *cache = NULL;
+ char *devname = NULL;
+ bool skip = false;
+ uint64_t bps_value = 0;
+ const char *path = NULL;
+ bool write_zero = true;
+
if (!line || line[0] == '\0' || !strcmp(line, "done\n")) {
break;
}
int len = strlen(line);
if (line[len - 1] == '\n') {
line[len - 1] = '\0';
- if (len == 1) {
+ len = len - 1;
+ if (len == 0) {
break;
}
}
- while (1) {
- if (!try_parse_option(&line, "format", &format, inbuf) &&
- !try_parse_option(&line, "throttling.bps", &bps, inbuf) &&
- !try_parse_option(&line, "throttling.group", &group, inbuf) &&
- !try_parse_option(&line, "cache", &cache, inbuf))
- {
- break;
+ if (strncmp(line, "skip", 4) == 0) {
+ if (len < 6 || line[4] != '=') {
+ g_error("read map failed - option 'skip' has no value ('%s')",
+ inbuf);
+ } else {
+ devname = line + 5;
+ skip = true;
+ }
+ } else {
+ while (1) {
+ if (!try_parse_option(&line, "format", &format, inbuf) &&
+ !try_parse_option(&line, "throttling.bps", &bps, inbuf) &&
+ !try_parse_option(&line, "throttling.group", &group, inbuf) &&
+ !try_parse_option(&line, "cache", &cache, inbuf))
+ {
+ break;
+ }
}
- }
- uint64_t bps_value = 0;
- if (bps) {
- bps_value = verify_u64(bps);
- g_free(bps);
- }
+ if (bps) {
+ bps_value = verify_u64(bps);
+ g_free(bps);
+ }
- const char *path;
- bool write_zero;
- if (line[0] == '0' && line[1] == ':') {
- path = line + 2;
- write_zero = false;
- } else if (line[0] == '1' && line[1] == ':') {
- path = line + 2;
- write_zero = true;
- } else {
- g_error("read map failed - parse error ('%s')", inbuf);
+ if (line[0] == '0' && line[1] == ':') {
+ path = line + 2;
+ write_zero = false;
+ } else if (line[0] == '1' && line[1] == ':') {
+ path = line + 2;
+ write_zero = true;
+ } else {
+ g_error("read map failed - parse error ('%s')", inbuf);
+ }
+
+ path = extract_devname(path, &devname, -1);
}
- char *devname = NULL;
- path = extract_devname(path, &devname, -1);
if (!devname) {
g_error("read map failed - no dev name specified ('%s')",
inbuf);
@@ -299,6 +314,7 @@ static int extract_content(int argc, char **argv)
map->throttling_group = group;
map->cache = cache;
map->write_zero = write_zero;
+ map->skip = skip;
g_hash_table_insert(devmap, map->devname, map);
@@ -328,6 +344,7 @@ static int extract_content(int argc, char **argv)
const char *cache = NULL;
int flags = BDRV_O_RDWR;
bool write_zero = true;
+ bool skip = false;
BlockBackend *blk = NULL;
@@ -343,6 +360,7 @@ static int extract_content(int argc, char **argv)
throttling_group = map->throttling_group;
cache = map->cache;
write_zero = map->write_zero;
+ skip = map->skip;
} else {
devfn = g_strdup_printf("%s/tmp-disk-%s.raw",
dirname, di->devname);
@@ -361,57 +379,60 @@ static int extract_content(int argc, char **argv)
write_zero = false;
}
- size_t devlen = strlen(devfn);
- QDict *options = NULL;
- bool writethrough;
- if (format) {
- /* explicit format from commandline */
- options = qdict_new();
- qdict_put_str(options, "driver", format);
- } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) ||
- strncmp(devfn, "/dev/", 5) == 0)
- {
- /* This part is now deprecated for PVE as well (just as qemu
- * deprecated not specifying an explicit raw format, too.
- */
- /* explicit raw format */
- options = qdict_new();
- qdict_put_str(options, "driver", "raw");
- }
- if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) {
- g_error("invalid cache option: %s\n", cache);
- }
+ if (!skip) {
+ size_t devlen = strlen(devfn);
+ QDict *options = NULL;
+ bool writethrough;
+ if (format) {
+ /* explicit format from commandline */
+ options = qdict_new();
+ qdict_put_str(options, "driver", format);
+ } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) ||
+ strncmp(devfn, "/dev/", 5) == 0)
+ {
+ /* This part is now deprecated for PVE as well (just as qemu
+ * deprecated not specifying an explicit raw format, too.
+ */
+ /* explicit raw format */
+ options = qdict_new();
+ qdict_put_str(options, "driver", "raw");
+ }
- if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) {
- g_error("can't open file %s - %s", devfn,
- error_get_pretty(errp));
- }
+ if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) {
+ g_error("invalid cache option: %s\n", cache);
+ }
- if (cache) {
- blk_set_enable_write_cache(blk, !writethrough);
- }
+ if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) {
+ g_error("can't open file %s - %s", devfn,
+ error_get_pretty(errp));
+ }
- if (throttling_group) {
- blk_io_limits_enable(blk, throttling_group);
- }
+ if (cache) {
+ blk_set_enable_write_cache(blk, !writethrough);
+ }
- if (throttling_bps) {
- if (!throttling_group) {
- blk_io_limits_enable(blk, devfn);
+ if (throttling_group) {
+ blk_io_limits_enable(blk, throttling_group);
}
- ThrottleConfig cfg;
- throttle_config_init(&cfg);
- cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps;
- Error *err = NULL;
- if (!throttle_is_valid(&cfg, &err)) {
- error_report_err(err);
- g_error("failed to apply throttling");
+ if (throttling_bps) {
+ if (!throttling_group) {
+ blk_io_limits_enable(blk, devfn);
+ }
+
+ ThrottleConfig cfg;
+ throttle_config_init(&cfg);
+ cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps;
+ Error *err = NULL;
+ if (!throttle_is_valid(&cfg, &err)) {
+ error_report_err(err);
+ g_error("failed to apply throttling");
+ }
+ blk_set_io_limits(blk, &cfg);
}
- blk_set_io_limits(blk, &cfg);
}
- if (vma_reader_register_bs(vmar, i, blk, write_zero, &errp) < 0) {
+ if (vma_reader_register_bs(vmar, i, blk, write_zero, skip, &errp) < 0) {
g_error("%s", error_get_pretty(errp));
}
diff --git a/vma.h b/vma.h
index c895c97f6d..1b62859165 100644
--- a/vma.h
+++ b/vma.h
@@ -142,7 +142,7 @@ GList *vma_reader_get_config_data(VmaReader *vmar);
VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id);
int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id,
BlockBackend *target, bool write_zeroes,
- Error **errp);
+ bool skip, Error **errp);
int vma_reader_restore(VmaReader *vmar, int vmstate_fd, bool verbose,
Error **errp);
int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp);

View File

@@ -0,0 +1,233 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Tue, 26 Apr 2022 16:06:28 +0200
Subject: [PATCH] pbs: namespace support
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 1 +
block/pbs.c | 25 +++++++++++++++++++++----
pbs-restore.c | 19 ++++++++++++++++---
pve-backup.c | 6 +++++-
qapi/block-core.json | 5 ++++-
5 files changed, 47 insertions(+), 9 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index f98f4cf7e6..55ef4f5965 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1043,6 +1043,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS key_password
false, NULL, // PBS master_keyfile
false, NULL, // PBS fingerprint
+ false, NULL, // PBS backup-ns
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
false, false, // PBS use-dirty-bitmap
diff --git a/block/pbs.c b/block/pbs.c
index ce9a870885..9192f3e41b 100644
--- a/block/pbs.c
+++ b/block/pbs.c
@@ -14,6 +14,7 @@
#include <proxmox-backup-qemu.h>
#define PBS_OPT_REPOSITORY "repository"
+#define PBS_OPT_NAMESPACE "namespace"
#define PBS_OPT_SNAPSHOT "snapshot"
#define PBS_OPT_ARCHIVE "archive"
#define PBS_OPT_KEYFILE "keyfile"
@@ -27,6 +28,7 @@ typedef struct {
int64_t length;
char *repository;
+ char *namespace;
char *snapshot;
char *archive;
} BDRVPBSState;
@@ -40,6 +42,11 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "The server address and repository to connect to.",
},
+ {
+ .name = PBS_OPT_NAMESPACE,
+ .type = QEMU_OPT_STRING,
+ .help = "Optional: The snapshot's namespace.",
+ },
{
.name = PBS_OPT_SNAPSHOT,
.type = QEMU_OPT_STRING,
@@ -76,7 +83,7 @@ static QemuOptsList runtime_opts = {
// filename format:
-// pbs:repository=<repo>,snapshot=<snap>,password=<pw>,key_password=<kpw>,fingerprint=<fp>,archive=<archive>
+// pbs:repository=<repo>,namespace=<ns>,snapshot=<snap>,password=<pw>,key_password=<kpw>,fingerprint=<fp>,archive=<archive>
static void pbs_parse_filename(const char *filename, QDict *options,
Error **errp)
{
@@ -112,6 +119,7 @@ static int pbs_open(BlockDriverState *bs, QDict *options, int flags,
s->archive = g_strdup(qemu_opt_get(opts, PBS_OPT_ARCHIVE));
const char *keyfile = qemu_opt_get(opts, PBS_OPT_KEYFILE);
const char *password = qemu_opt_get(opts, PBS_OPT_PASSWORD);
+ const char *namespace = qemu_opt_get(opts, PBS_OPT_NAMESPACE);
const char *fingerprint = qemu_opt_get(opts, PBS_OPT_FINGERPRINT);
const char *key_password = qemu_opt_get(opts, PBS_OPT_ENCRYPTION_PASSWORD);
@@ -124,9 +132,12 @@ static int pbs_open(BlockDriverState *bs, QDict *options, int flags,
if (!key_password) {
key_password = getenv("PBS_ENCRYPTION_PASSWORD");
}
+ if (namespace) {
+ s->namespace = g_strdup(namespace);
+ }
/* connect to PBS server in read mode */
- s->conn = proxmox_restore_new(s->repository, s->snapshot, password,
+ s->conn = proxmox_restore_new_ns(s->repository, s->snapshot, s->namespace, password,
keyfile, key_password, fingerprint, &pbs_error);
/* invalidates qemu_opt_get char pointers from above */
@@ -171,6 +182,7 @@ static int pbs_file_open(BlockDriverState *bs, QDict *options, int flags,
static void pbs_close(BlockDriverState *bs) {
BDRVPBSState *s = bs->opaque;
g_free(s->repository);
+ g_free(s->namespace);
g_free(s->snapshot);
g_free(s->archive);
proxmox_restore_disconnect(s->conn);
@@ -252,8 +264,13 @@ static coroutine_fn int pbs_co_pwritev(BlockDriverState *bs,
static void pbs_refresh_filename(BlockDriverState *bs)
{
BDRVPBSState *s = bs->opaque;
- snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s(%s)",
- s->repository, s->snapshot, s->archive);
+ if (s->namespace) {
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s:%s(%s)",
+ s->repository, s->namespace, s->snapshot, s->archive);
+ } else {
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s(%s)",
+ s->repository, s->snapshot, s->archive);
+ }
}
static const char *const pbs_strong_runtime_opts[] = {
diff --git a/pbs-restore.c b/pbs-restore.c
index 2f834cf42e..f03d9bab8d 100644
--- a/pbs-restore.c
+++ b/pbs-restore.c
@@ -29,7 +29,7 @@
static void help(void)
{
const char *help_msg =
- "usage: pbs-restore [--repository <repo>] snapshot archive-name target [command options]\n"
+ "usage: pbs-restore [--repository <repo>] [--ns namespace] snapshot archive-name target [command options]\n"
;
printf("%s", help_msg);
@@ -77,6 +77,7 @@ int main(int argc, char **argv)
Error *main_loop_err = NULL;
const char *format = "raw";
const char *repository = NULL;
+ const char *backup_ns = NULL;
const char *keyfile = NULL;
int verbose = false;
bool skip_zero = false;
@@ -90,6 +91,7 @@ int main(int argc, char **argv)
{"verbose", no_argument, 0, 'v'},
{"format", required_argument, 0, 'f'},
{"repository", required_argument, 0, 'r'},
+ {"ns", required_argument, 0, 'n'},
{"keyfile", required_argument, 0, 'k'},
{0, 0, 0, 0}
};
@@ -110,6 +112,9 @@ int main(int argc, char **argv)
case 'r':
repository = g_strdup(argv[optind - 1]);
break;
+ case 'n':
+ backup_ns = g_strdup(argv[optind - 1]);
+ break;
case 'k':
keyfile = g_strdup(argv[optind - 1]);
break;
@@ -160,8 +165,16 @@ int main(int argc, char **argv)
fprintf(stderr, "connecting to repository '%s'\n", repository);
}
char *pbs_error = NULL;
- ProxmoxRestoreHandle *conn = proxmox_restore_new(
- repository, snapshot, password, keyfile, key_password, fingerprint, &pbs_error);
+ ProxmoxRestoreHandle *conn = proxmox_restore_new_ns(
+ repository,
+ snapshot,
+ backup_ns,
+ password,
+ keyfile,
+ key_password,
+ fingerprint,
+ &pbs_error
+ );
if (conn == NULL) {
fprintf(stderr, "restore failed: %s\n", pbs_error);
return -1;
diff --git a/pve-backup.c b/pve-backup.c
index 41e8effa01..1c25ae98bd 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -10,6 +10,8 @@
#include "qapi/qmp/qerror.h"
#include "qemu/cutils.h"
+#include <proxmox-backup-qemu.h>
+
/* PVE backup state and related function */
/*
@@ -531,6 +533,7 @@ UuidInfo coroutine_fn *qmp_backup(
bool has_key_password, const char *key_password,
bool has_master_keyfile, const char *master_keyfile,
bool has_fingerprint, const char *fingerprint,
+ bool has_backup_ns, const char *backup_ns,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
bool has_use_dirty_bitmap, bool use_dirty_bitmap,
@@ -670,8 +673,9 @@ UuidInfo coroutine_fn *qmp_backup(
firewall_name = "fw.conf";
char *pbs_err = NULL;
- pbs = proxmox_backup_new(
+ pbs = proxmox_backup_new_ns(
backup_file,
+ has_backup_ns ? backup_ns : NULL,
backup_id,
backup_time,
dump_cb_block_size,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d8c7331090..889726fc26 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -817,6 +817,8 @@
#
# @fingerprint: server cert fingerprint (optional for format 'pbs')
#
+# @backup-ns: backup namespace (required for format 'pbs')
+#
# @backup-id: backup ID (required for format 'pbs')
#
# @backup-time: backup timestamp (Unix epoch, required for format 'pbs')
@@ -836,6 +838,7 @@
'*key-password': 'str',
'*master-keyfile': 'str',
'*fingerprint': 'str',
+ '*backup-ns': 'str',
'*backup-id': 'str',
'*backup-time': 'int',
'*use-dirty-bitmap': 'bool',
@@ -3290,7 +3293,7 @@
{ 'struct': 'BlockdevOptionsPbs',
'data': { 'repository': 'str', 'snapshot': 'str', 'archive': 'str',
'*keyfile': 'str', '*password': 'str', '*fingerprint': 'str',
- '*key_password': 'str' } }
+ '*key_password': 'str', '*namespace': 'str' } }
##
# @BlockdevOptionsNVMe:

View File

@@ -12,10 +12,10 @@ Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
1 file changed, 2 insertions(+), 40 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index a4749f3b1b..53e0396b51 100644
index 64a8d7d48b..9fc6dcb957 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1511,7 +1511,6 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
@@ -1348,7 +1348,6 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
int status, r;
RBDDiffIterateReq req = { .offs = offset };
uint64_t features, flags;
@@ -23,7 +23,7 @@ index a4749f3b1b..53e0396b51 100644
assert(offset + bytes <= s->image_size);
@@ -1539,43 +1538,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
@@ -1376,43 +1375,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
return status;
}
@@ -68,7 +68,7 @@ index a4749f3b1b..53e0396b51 100644
qemu_rbd_diff_iterate_cb, &req);
if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
return status;
@@ -1594,8 +1557,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
@@ -1431,8 +1394,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
}

View File

@@ -13,10 +13,10 @@ Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 53e0396b51..0913a0af39 100644
index 9fc6dcb957..98f4ba2620 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1470,11 +1470,11 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,
@@ -1307,11 +1307,11 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,
RBDDiffIterateReq *req = opaque;
assert(req->offs + req->bytes <= offs);

View File

@@ -23,10 +23,10 @@ Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
1 file changed, 112 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 0913a0af39..1dab254517 100644
index 98f4ba2620..efcbbe5949 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -108,12 +108,6 @@ typedef struct RBDTask {
@@ -97,12 +97,6 @@ typedef struct RBDTask {
int64_t ret;
} RBDTask;
@@ -39,7 +39,7 @@ index 0913a0af39..1dab254517 100644
static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
BlockdevOptionsRbd *opts, bool cache,
const char *keypairs, const char *secretid,
@@ -1456,111 +1450,6 @@ static ImageInfoSpecific *qemu_rbd_get_specific_info(BlockDriverState *bs,
@@ -1293,111 +1287,6 @@ static ImageInfoSpecific *qemu_rbd_get_specific_info(BlockDriverState *bs,
return spec_info;
}
@@ -148,10 +148,10 @@ index 0913a0af39..1dab254517 100644
- return status;
-}
-
static int64_t coroutine_fn qemu_rbd_co_getlength(BlockDriverState *bs)
static int64_t qemu_rbd_getlength(BlockDriverState *bs)
{
BDRVRBDState *s = bs->opaque;
@@ -1796,7 +1685,6 @@ static BlockDriver bdrv_rbd = {
@@ -1633,7 +1522,6 @@ static BlockDriver bdrv_rbd = {
#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
.bdrv_co_pwrite_zeroes = qemu_rbd_co_pwrite_zeroes,
#endif

View File

@@ -0,0 +1,60 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 25 May 2022 13:59:37 +0200
Subject: [PATCH] PVE-Backup: create jobs: correctly cancel in error scenario
The first call to job_cancel_sync() will cancel and free all jobs in
the transaction, so ensure that it's called only once and get rid of
the job_unref() that would operate on freed memory.
It's also necessary to NULL backup_state.pbs in the error scenario,
because a subsequent backup_cancel QMP call (as happens in PVE when
the backup QMP command fails) would try to call proxmox_backup_abort()
and run into a segfault.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 1c25ae98bd..1b466eee3a 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -503,6 +503,11 @@ static void create_backup_jobs_bh(void *opaque) {
}
if (*errp) {
+ /*
+ * It's enough to cancel one job in the transaction, the rest will
+ * follow automatically.
+ */
+ bool canceled = false;
l = backup_state.di_list;
while (l) {
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
@@ -513,11 +518,11 @@ static void create_backup_jobs_bh(void *opaque) {
di->target = NULL;
}
- if (di->job) {
+ if (!canceled && di->job) {
WITH_JOB_LOCK_GUARD() {
job_cancel_sync_locked(&di->job->job, true);
- job_unref_locked(&di->job->job);
}
+ canceled = true;
}
}
}
@@ -943,6 +948,7 @@ err:
if (pbs) {
proxmox_backup_disconnect(pbs);
+ backup_state.pbs = NULL;
}
if (backup_dir) {

View File

@@ -0,0 +1,73 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 25 May 2022 13:59:38 +0200
Subject: [PATCH] PVE-Backup: ensure jobs in di_list are referenced
Ensures that qmp_backup_cancel doesn't pick a job that's already been
freed. With unlucky timings it seems possible that:
1. job_exit -> job_completed -> job_finalize_single starts
2. pvebackup_co_complete_stream gets spawned in completion callback
3. job finalize_single finishes -> job's refcount hits zero -> job is
freed
4. qmp_backup_cancel comes in and locks backup_state.backup_mutex
before pvebackup_co_complete_stream can remove the job from the
di_list
5. qmp_backup_cancel will pick a job that's already been freed
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 1b466eee3a..5aecf06af7 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -316,6 +316,13 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
}
}
+ if (di->job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_unref_locked(&di->job->job);
+ di->job = NULL;
+ }
+ }
+
// remove self from job list
backup_state.di_list = g_list_remove(backup_state.di_list, di);
@@ -491,6 +498,11 @@ static void create_backup_jobs_bh(void *opaque) {
aio_context_release(aio_context);
di->job = job;
+ if (job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_ref_locked(&job->job);
+ }
+ }
if (!job || local_err) {
error_setg(errp, "backup_job_create failed: %s",
@@ -518,11 +530,15 @@ static void create_backup_jobs_bh(void *opaque) {
di->target = NULL;
}
- if (!canceled && di->job) {
+ if (di->job) {
WITH_JOB_LOCK_GUARD() {
- job_cancel_sync_locked(&di->job->job, true);
+ if (!canceled) {
+ job_cancel_sync_locked(&di->job->job, true);
+ canceled = true;
+ }
+ job_unref_locked(&di->job->job);
+ di->job = NULL;
}
- canceled = true;
}
}
}

View File

@@ -0,0 +1,118 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 25 May 2022 13:59:39 +0200
Subject: [PATCH] PVE-Backup: avoid segfault issues upon backup-cancel
When canceling a backup in PVE via a signal it's easy to run into a
situation where the job is already failing when the backup_cancel QMP
command comes in. With a bit of unlucky timing on top, it can happen
that job_exit() runs between schedulung of job_cancel_bh() and
execution of job_cancel_bh(). But job_cancel_sync() does not expect
that the job is already finalized (in fact, the job might've been
freed already, but even if it isn't, job_cancel_sync() would try to
deref job->txn which would be NULL at that point).
It is not possible to simply use the job_cancel() (which is advertised
as being async but isn't in all cases) in qmp_backup_cancel() for the
same reason job_cancel_sync() cannot be used. Namely, because it can
invoke job_finish_sync() (which uses AIO_WAIT_WHILE and thus hangs if
called from a coroutine). This happens when there's multiple jobs in
the transaction and job->deferred_to_main_loop is true (is set before
scheduling job_exit()) or if the job was not started yet.
Fix the issue by selecting the job to cancel in job_cancel_bh() itself
using the first job that's not completed yet. This is not necessarily
the first job in the list, because pvebackup_co_complete_stream()
might not yet have removed a completed job when job_cancel_bh() runs.
An alternative would be to continue using only the first job and
checking against JOB_STATUS_CONCLUDED or JOB_STATUS_NULL to decide if
it's still necessary and possible to cancel, but the approach with
using the first non-completed job seemed more robust.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 57 ++++++++++++++++++++++++++++++++++------------------
1 file changed, 38 insertions(+), 19 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 5aecf06af7..a921cbcb2d 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -354,12 +354,41 @@ static void pvebackup_complete_cb(void *opaque, int ret)
/*
* job_cancel(_sync) does not like to be called from coroutines, so defer to
- * main loop processing via a bottom half.
+ * main loop processing via a bottom half. Assumes that caller holds
+ * backup_mutex.
*/
static void job_cancel_bh(void *opaque) {
CoCtxData *data = (CoCtxData*)opaque;
- Job *job = (Job*)data->data;
- job_cancel_sync(job, true);
+
+ /*
+ * Be careful to pick a valid job to cancel:
+ * 1. job_cancel_sync() does not expect the job to be finalized already.
+ * 2. job_exit() might run between scheduling and running job_cancel_bh()
+ * and pvebackup_co_complete_stream() might not have removed the job from
+ * the list yet (in fact, cannot, because it waits for the backup_mutex).
+ * Requiring !job_is_completed() ensures that no finalized job is picked.
+ */
+ GList *bdi = g_list_first(backup_state.di_list);
+ while (bdi) {
+ if (bdi->data) {
+ BlockJob *bj = ((PVEBackupDevInfo *)bdi->data)->job;
+ if (bj) {
+ Job *job = &bj->job;
+ WITH_JOB_LOCK_GUARD() {
+ if (!job_is_completed_locked(job)) {
+ job_cancel_sync_locked(job, true);
+ /*
+ * It's enough to cancel one job in the transaction, the
+ * rest will follow automatically.
+ */
+ break;
+ }
+ }
+ }
+ }
+ bdi = g_list_next(bdi);
+ }
+
aio_co_enter(data->ctx, data->co);
}
@@ -380,22 +409,12 @@ void coroutine_fn qmp_backup_cancel(Error **errp)
proxmox_backup_abort(backup_state.pbs, "backup canceled");
}
- /* it's enough to cancel one job in the transaction, the rest will follow
- * automatically */
- GList *bdi = g_list_first(backup_state.di_list);
- BlockJob *cancel_job = bdi && bdi->data ?
- ((PVEBackupDevInfo *)bdi->data)->job :
- NULL;
-
- if (cancel_job) {
- CoCtxData data = {
- .ctx = qemu_get_current_aio_context(),
- .co = qemu_coroutine_self(),
- .data = &cancel_job->job,
- };
- aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
- qemu_coroutine_yield();
- }
+ CoCtxData data = {
+ .ctx = qemu_get_current_aio_context(),
+ .co = qemu_coroutine_self(),
+ };
+ aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
+ qemu_coroutine_yield();
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}

View File

@@ -0,0 +1,57 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 22 Jun 2022 10:45:11 +0200
Subject: [PATCH] vma: create: support 64KiB-unaligned input images
which fixes backing up templates with such disks in PVE, for example
efitype=4m EFI disks on a file-based storage (size = 540672).
If there is not enough left to read, blk_co_preadv will return -EIO,
so limit the size in the last iteration.
For writing, an unaligned end is already handled correctly.
The call to memset is not strictly necessary, because writing also
checks that it doesn't write data beyond the end of the image. But
there are two reasons to do it:
1. It's cleaner that way.
2. It allows detecting when the final piece is all zeroes, which might
not happen if the buffer still contains data from the previous
iteration.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
vma.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/vma.c b/vma.c
index e6e9ffc7fe..304f02bc84 100644
--- a/vma.c
+++ b/vma.c
@@ -548,7 +548,7 @@ static void coroutine_fn backup_run(void *opaque)
struct iovec iov;
QEMUIOVector qiov;
- int64_t start, end;
+ int64_t start, end, readlen;
int ret = 0;
unsigned char *buf = blk_blockalign(job->target, VMA_CLUSTER_SIZE);
@@ -562,8 +562,16 @@ static void coroutine_fn backup_run(void *opaque)
iov.iov_len = VMA_CLUSTER_SIZE;
qemu_iovec_init_external(&qiov, &iov, 1);
+ if (start + 1 == end) {
+ memset(buf, 0, VMA_CLUSTER_SIZE);
+ readlen = job->len - start * VMA_CLUSTER_SIZE;
+ assert(readlen > 0 && readlen <= VMA_CLUSTER_SIZE);
+ } else {
+ readlen = VMA_CLUSTER_SIZE;
+ }
+
ret = blk_co_preadv(job->target, start * VMA_CLUSTER_SIZE,
- VMA_CLUSTER_SIZE, &qiov, 0);
+ readlen, &qiov, 0);
if (ret < 0) {
vma_writer_set_error(job->vmaw, "read error", -1);
goto out;

View File

@@ -0,0 +1,25 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 22 Jun 2022 10:45:12 +0200
Subject: [PATCH] vma: create: avoid triggering assertion in error case
error_setg expects its argument to not be initialized yet.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
vma-writer.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/vma-writer.c b/vma-writer.c
index df4b20793d..ac7da237d0 100644
--- a/vma-writer.c
+++ b/vma-writer.c
@@ -311,6 +311,8 @@ VmaWriter *vma_writer_create(const char *filename, uuid_t uuid, Error **errp)
}
if (vmaw->fd < 0) {
+ error_free(*errp);
+ *errp = NULL;
error_setg(errp, "can't open file %s - %s\n", filename,
g_strerror(errno));
goto err;

Some files were not shown because too many files have changed in this diff Show More