Compare commits

...

101 Commits

Author SHA1 Message Date
db46e47c85 Fix truncation, add write-back cache support 2023-10-27 21:07:45 +03:00
016b2de920 Improve performance by adding io_uring support, fix iothread compat 2023-07-19 02:20:35 +03:00
4657dc72ae Add Vitastor support 2023-04-25 11:20:49 +03:00
Thomas Lamprecht
93d558c1ee bump version to 7.2.0-8
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-17 15:48:12 +01:00
Fiona Ebner
e752bbe5e2 cherry-pick TCG-related stable fixes for 7.2
When turning off the "KVM hardware virtualization" checkbox in Proxmox
VE, the TCG accelerator is used, so these fixes are relevant then.

The first patch is included to allow cherry-picking the others without
changes.

Reported-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-17 15:46:20 +01:00
Thomas Lamprecht
018ef788b3 bump version to 7.2.0-8
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-17 12:12:02 +01:00
Fiona Ebner
72fc94c0c6 add patch fixing ACPI CPU hotplug issue with TCG
Required for the debian/edk2-vars-generator.py script in the
pve-edk2-firmware repository when building the edk2-stable202302
release. Without this patch, the QEMU process spawned by the script
would hang indefinietly.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-17 12:06:22 +01:00
Thomas Lamprecht
09186f4b6e bump version to 7.2.0-7
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-13 17:42:52 +01:00
Fiona Ebner
ffda59f626 add patches to fix regression with LSI SCSI controller
The patch 0008-memory-prevent-dma-reentracy-issues.patch introduced a
regression for the LSI SCSI controller leading to boot failures [0],
because, in its current form, it relies on reentrancy for a particular
ram_io region.

[0]: https://forum.proxmox.com/threads/123843

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-13 17:36:22 +01:00
Fiona Ebner
3c4f941ac7 add more stable fixes
The patches were selected from the recent "Patch Round-up for stable
7.2.1" [0]. Those that should be relevant for our supported use-cases
(and the upcoming nvme use-case) were picked. Most of the patches
added now have not been submitted to qemu-stable before.

The follow-up for the virtio-rng-pci migration fix will break
migration between versions with the fix and without the fix when a
virtio-pci-rng(-non)-transitional device is used. Luckily Proxmox VE
only uses the virtio-pci-rng device, and this was fixed by
0006-virtio-rng-pci-fix-migration-compat-for-vectors.patch which was
applied before any public version of Proxmox VE's QEMU 7.2 package was
released.

[0]: https://lists.nongnu.org/archive/html/qemu-stable/2023-03/msg00010.html
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2162569

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-13 17:36:19 +01:00
Fiona Ebner
3a94e1a186 fixup patch "ide: avoid potential deadlock when draining during trim"
The patch was incomplete and (re-)introduced an issue with a potential
failing assertion upon cancelation of the DMA request.

There is a patch on qemu-devel now[0], and it's the same as this one
code-wise (except for comments). But the discussion is still ongoing.
While there shouldn't be a real issue with the patch, there might be
better approaches. The plan is to use this as a stop-gap for now and
pick up the proper solution once it's ready.

[0]: https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg03325.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-13 17:36:19 +01:00
Thomas Lamprecht
67cae45f41 bump version to 7.2.0-6
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-08 14:32:22 +01:00
Fiona Ebner
58659169de add patch to avoid potential deadlock with trim for IDE/SATA and draining
In particular, the deadlock can occur, together with unlucky timing
between the QEMU threads, when the guest is issuing trim requests
during the start of a backup operation.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
 [ T: resolve trivial merge conflict in series file ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-08 14:22:36 +01:00
Fiona Ebner
10691e04e9 add patch fixing Linux boot failures with megasas SCSI
A regression in 7.2 and easily reproduced.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-07 19:50:12 +01:00
Thomas Lamprecht
09723b9298 bump version to 7.2.0-5
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-02-21 13:50:08 +01:00
Fiona Ebner
00e2507aac add fix for iscsi double free issue leading to crashes
Reported here[0] and here[1].

[0]: https://gitlab.com/qemu-project/qemu/-/issues/1378
[1]: https://forum.proxmox.com/threads/122776/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 13:49:19 +01:00
Fiona Ebner
e7e5f63573 add patch fixing DMA reentrancy issues
that could lead to use-after-frees and stack overflows with a
malicious (or buggy) guest. See [0] for a good summary:

[0]: https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 10:18:35 +01:00
Fiona Ebner
1688b43738 QMP backup: use correct errno when getting blockdrive length fails
di->size would only be set later. The errno is minus the return value
from the function.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 09:19:16 +01:00
Fiona Ebner
eee064d954 savevm-async: keep more free space when entering final stage
In qemu-server, we already allocate 2 * $mem_size + 500 MiB for driver
state (which was 32 MiB long ago according to git history). It seems
likely that the 30 MiB cutoff in the savevm-async implementation was
chosen based on that.

In bug #4476 [0], another issue caused the iteration to not make any
progress and the state file filled up all the way to the 30 MiB +
pending_size cutoff. Since the guest is not stopped immediately after
the check, it can still dirty some RAM and the current cutoff is not
enough for a reproducer VM (was done while bug #4476 still was not
fixed), dirtying memory with
> stress-ng -B 2 --bigheap-growth 64.0M'
After entering the final stage, savevm actually filled up the state
file completely, leading to an I/O error. It's probably the same
scenario as reported in the bug report, the error message was fixed in
commit a020815 ("savevm-async: fix function name in error message")
after the bug report.

If not for the bug, the cutoff will only be reached by a VM that's
dirtying RAM faster than can be written to the storage, so increase
the cutoff to 100 MiB to have a bigger chance to finish successfully,
while still trying to not increase downtime too much for
non-hibernation snapshots.

[0]: https://bugzilla.proxmox.com/show_bug.cgi?id=4476

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 08:39:08 +01:00
Fiona Ebner
8051a24b5f fix #4476: savevm-async: avoid looping without progress
when pend_postcopy is large. By definition, pend_postcopy won't
decrease when iterating, so a value larger than the cutoff of 400000
would lead to essentially empty iterations, filling up the state file
until only 30 MiB + pending_size remain and the second half of the
check would trigger.

Avoid this, by not considering pend_postcopy for the cutoff to enter
the final phase.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 08:39:08 +01:00
Fiona Ebner
ade9f50160 d/rules: add note explaining why using noopt doesn't currenlty work
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-14 10:04:21 +01:00
Fiona Ebner
0fde60fd10 d/rules: add missing export for CFLAGS
Otherwise, they don't affect the build of QEMU at all.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-14 10:04:21 +01:00
Thomas Lamprecht
d82c5eb632 bump version to 7.2.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-27 09:37:53 +01:00
Fiona Ebner
d5f6ef56f0 add patch to fix issue with VirtIO disk using detect-zeroes=unmap
Affects Proxmox VE, when the discard disk setting is used for a
VirtIO disk.

Upstream bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1404

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-27 09:36:41 +01:00
Fabian Grünbichler
658cba46ee d/control: also conflict with "qemu-system-data"
it ships files also shipped by our qemu package, switching from Debian qemu to
ours doesn't work without manual intervention otherwise..

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2023-01-26 10:55:37 +01:00
Fiona Ebner
a02081501a savevm-async: fix function name in error message
which also makes it distinguishable from the other
"qemu_savevm_state_iterate error" message.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-24 17:08:54 +01:00
Thomas Lamprecht
baf4e3132d bump version to 7.2.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-12 13:13:23 +01:00
Fiona Ebner
48c307550a add regression fix for migration with virtio-rng device
between QEMU less than 7.2 and QEMU 7.2 without the fix (both
directions are affected).

As mentioned in the patch message, this fix itself will break
migration between QEMU 7.2 and QEMU 7.2 with the fix (in both
directions, if a virtio-rng device is attached), but this is fine,
because no pve-qemu-kvm package with QEMU 7.2 has been publicly
released yet.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-12 13:10:19 +01:00
Thomas Lamprecht
89fdfe8975 bump version to 7.2.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-10 15:47:52 +01:00
Fiona Ebner
f64132208a cherry-pick stable fixes for 7.2
Two for virtio-mem and one for vIOMMU. Both features are not yet
exposed in PVE's qemu-server, but planned to be added.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-10 15:42:28 +01:00
Fiona Ebner
271ac0a8a7 add QAPI naming exceptions in patches introducing them
Avoids a patch and is required to compile when not all patches are
applied. No functional change is intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-10 15:42:16 +01:00
Fiona Ebner
f4ed54ec37 d/control: drop outdated jemalloc dependencies
Commit 3d785ea ("disable jemalloc") disabled jemalloc support, so
these are not needed anymore.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Fiona Ebner
2277182712 d/control: add libslirp-dev as a build dependency
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Fiona Ebner
0906461df0 d/rules: enable slirp again
Commit d03e1b3 ("update submodule and patches to 7.2.0") argued that
slirp is not explicitly supported in PVE, but that is not true. In
qemu-server, user networking is supported (via CLI/API) when no bridge
is set on a virtual NIC. So slirp needs to stay to keep such NICs
working.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Wolfgang Bumiller
29bee92c59 bump version to 7.2.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-12-16 13:23:29 +01:00
Fiona Ebner
82640bb859 d/rules: explicitly disable building slirp
Otherwise, it depends on whether libslirp-devel is installed or not.
See the previous commit message for more context.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-16 11:47:25 +01:00
Fiona Ebner
d03e1b3ce3 update submodule and patches to 7.2.0
User-facing breaking change:

The slirp submodule for user networking got removed. It would be
necessary to add the --enable-slirp option to the build and/or install
the appropriate library to continue building it. Since PVE is not
explicitly supporting it, it would require additionally installing the
libslirp0 package on all installations and there is *very* little
mention on the community forum when searching for "slirp" or
"netdev user", the plan is to only enable it again if there is some
real demand for it.

Notable changes:

* The big change for this release is the rework of job locking, using
  a job mutex and introducing _locked() variants of job API functions
  moving away from call-side AioContext locking. See (in the qemu
  submodule) commit 6f592e5aca ("job.c: enable job lock/unlock and
  remove Aiocontext locks") and previous commits for context.

  Changes required for the backup patches:
  * Use WITH_JOB_LOCK_GUARD() and call the _locked() variant of job
    API functions where appropriate (many are only availalbe as
    a _locked() variant).
  * Remove acquiring/releasing AioContext around functions taking the
    job mutex lock internally.

  The patch introducing sequential transaction support for jobs needs
  to temporarily unlock the job mutex to call job_start() when
  starting the next job in the transaction.

* The zeroinit block driver now marks its child as primary.

  The documentation in include/block/block-common.h states:
  > Filter node has exactly one FILTERED|PRIMARY child, and may have
  > other children which must not have these bits

  Without this, an assert will trigger when copying to a zeroinit target
  with qemu-img convert, because bdrv_child_cb_attach() expects any
  non-PRIMARY child to be not FILTERED:
  > qemu-img convert -n -p -f raw -O raw input.raw zeroinit:output.raw
  > qemu-img: ../block.c:1476: bdrv_child_cb_attach: Assertion
  > `!(child->role & BDRV_CHILD_FILTERED)' failed.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-16 11:47:20 +01:00
Thomas Lamprecht
55e33a045e bump version to 7.1.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-22 09:21:10 +01:00
Thomas Lamprecht
8a38e1da9e cherry-pick "block/block-backend: blk_set_enable_write_cache is IO_CODE"
albeit I was short from disarming that GLOBAL_STATE_CODE assert
completely, as its just bogus to assert that on runtime for a lot of
call sites, rather it should be verified on compilation (function
coloring with attributes and maybe a compiler plugin).

But, as this is already solved upstream lets take in that patch.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-22 09:19:00 +01:00
Thomas Lamprecht
3b3d5516ee bump version to 7.1.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-10-28 10:27:54 +02:00
Thomas Lamprecht
509409fb64 init: daemonize: defuse PID file resolve error to warning
fixes file restore, where we actively unlink the PID file of the
transient VM ourself after opening it - while we use it only for
tracking when the QEMU process itself has finished start up, it's
easier and cleaner to fix this regression now, than to rework that to
something that doesn't depends on the PID file at all.

Applying Fiona's patch as patch-patch tracked under extra, as I
expect that something similar to this gets accepted upstreamed.

Link: https://lists.proxmox.com/pipermail/pve-devel/2022-October/054448.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-10-28 10:22:26 +02:00
Wolfgang Bumiller
bf03cd367f bump version to 7.1.0-2
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-18 15:35:09 +02:00
Fiona Ebner
0af826b448 savevm async IO channel: channel writev: fix return value in error case
The documentation in include/io/channel.h states that -1 or
QIO_CHANNEL_ERR_BLOCK should be returned upon error. Simply passing
along the return value from the blk-functions has the potential to
confuse the call sides. Non-blocking mode is not implemented
currently, so -1 it is.

The "return ret" was mistakenly left over from the previous
QEMUFileOps based implementation. Also, use error_setg_errno(), since
the blk(_co)_p{readv,writev} functions return errno codes.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-18 15:32:13 +02:00
Wolfgang Bumiller
ed23707ed7 bump version to 7.1.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-14 14:55:53 +02:00
Fiona Ebner
4e1935c2c9 {alloc track, pbs} block driver: bdrv_co_preadv: adapt return values
to be in-line with what other implementations in QEMU do. Commit
1d39c7098bbfa6862cb96066c4f8f6735ea397c5 mentions the EIO bit and
the function is expected to return 0 upon success (see other
implementations).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:36 +02:00
Fiona Ebner
a262e9642b savevm async: cleaner initialization of target_close_wait member
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:34 +02:00
Fiona Ebner
73912aee39 cherry-pick upstream fixes for 7.1.0
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:32 +02:00
Fiona Ebner
5b15e2ecaf update submodule and patches to 7.1.0
Notable changes:
* The only big change is the switch to using a custom QIOChannel for
  savevm-async, because the previously used QEMUFileOps was dropped.

  Changes to the current implementation:

  * Switch to vector based methods as required for an IO channel. For
    short reads the passed-in IO vector is stuffed with zeroes at the
    end, just to be sure.

  * For reading: The documentation in include/io/channel.h states that
    at least one byte should be read, so also error out when whe are
    at the very end instead of returning 0.

  * For reading: Fix off-by-one error when request goes beyond end.

    The wrong code piece was:
    if ((pos + size) > maxlen) {
        size = maxlen - pos - 1;
    }

    Previously, the last byte would not be read. It's actually
    possible to get a snapshot .raw file that has content all the way
    up the final 512 byte (= BDRV_SECTOR_SIZE) boundary without any
    trailing zero bytes (I wrote a script to do it).

    Luckily, it didn't cause a real issue, because qemu_loadvm_state()
    is not interested in the final (i.e. QEMU_VM_VMDESCRIPTION)
    section. The buffer for reading it is simply freed up afterwards
    and the function will assume that it read the whole section, even
    if that's not the case.

  * For writing: Make use of the generated blk_pwritev() wrapper
    instead of manually wrapping the coroutine to simplify and save a
    few lines.

* Adapt to changed interfaces for blk_{pread,pwrite}:
  * a9262f551e ("block: Change blk_{pread,pwrite}() param order")
  * 3b35d4542c ("block: Add a 'flags' param to blk_pread()")
  * bf5b16fa40 ("block: Make blk_{pread,pwrite}() return 0 on success")
  Those changes especially affected the qemu-img dd patches, because
  the context also changed, but also some of our block drivers used
  the functions.

* Drop qemu-common.h include: it got renamed after essentially
  everything was moved to other headers. The only remaining user I
  could find for things dropped from the header between 7.0 and 7.1
  was qemu_get_vm_name() in the iscsi-initiatorname patch, but it
  already includes the header to which the function was moved.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:29 +02:00
Wolfgang Bumiller
2775b2e378 bump version to 7.0.0-4
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-10 11:56:27 +02:00
Wolfgang Bumiller
ed01236593 add patch: PVE Backup: allow passing max-workers performance setting
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-10 11:55:15 +02:00
Fiona Ebner
2b259b70ec d/rules: add revision to package version
This version string can be queried with $BINARY --version as well as
the query-version QMP command.

Useful for qemu-server to be able to report the running QEMU version
exactly. Could also be used to version guard against features as an
alternative to the query-proxmox-support QMP command.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-10 11:26:47 +02:00
Thomas Lamprecht
a186335be5 bump version to 7.0.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-08-30 12:54:12 +02:00
Fiona Ebner
1976ca4607 savevm-async: set SAVE_STATE_DONE when closing state file was successful
Without this change, it's necessary to send a second savevm-end QMP
command after aborting a snaphsot, before a new savevm-start QMP
command can succeed.

In process_savevm_finalize(), no longer set an error in the abort
scenario. If there already is another error, there's no need to
override it. If canceling was done intentionally, qmp_savevm_end()
is responsible for setting the state now.

Reported-by: Mira Limbeck <m.limbeck@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-08-19 09:44:16 +02:00
Fiona Ebner
563c592898 savevm-async: avoid segfault when aborting snapshot
Reported in the community forum[0].

For 6.1.0, there were a few changes to the coroutine-sleep API, but
the adaptations in f376b2b ("update and rebase to QEMU v6.1.0") made
a mistake.

Currently, target_close_wait is NULL when passed to
qemu_co_sleep_ns_wakeable(), which further passes it to
qemu_co_sleep(), but there, it is dereferenced when trying to access
the 'to_wake' member:

> Thread 1 "kvm" received signal SIGSEGV, Segmentation fault.
> qemu_co_sleep (w=0x0) at ../util/qemu-coroutine-sleep.c:57

To fix it, create a proper struct and pass its address instead. Also
call qemu_co_sleep_wake unconditionally, because the NULL check (for
the 'to_wake' member) is done inside the function itself.

This patch is based on what the QEMU commits introducing the changes
to the coroutine-sleep API did to the callers in QEMU:
eaee072085 ("coroutine-sleep: allow qemu_co_sleep_wake that wakes nothing")
29a6ea24eb ("coroutine-sleep: replace QemuCoSleepState pointer with struct in the API")

[0]: https://forum.proxmox.com/threads/112130/

Tested-by: Mira Limbeck <m.limbeck@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-08-19 09:44:14 +02:00
Thomas Lamprecht
1de53d8a45 bump version to 7.0.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-20 09:17:13 +02:00
Fabian Ebner
0e88ec19db add two more stable patches
For the io_uring patch, it's not very clear which configurations can
trigger it, but it should be rather uncommon. See qemu commit
be6a166fde652589761cf70471bcde623e9bd72a for a bit more information.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-07-19 17:22:10 +02:00
Wolfgang Bumiller
9ee866b2e9 bump version to 7.0.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-30 11:08:36 +02:00
Fabian Ebner
14ed554660 cherry-pick upstream fixes for 7.0.0
coming in via qemu-stable (except for the vdmk fix, which was tagged
for-7.0 on the qemu-devel list, but didn't make it into the release).

Also took the chance to switch the gluster fix to the version that
made it into upstream.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:30 +02:00
Fabian Ebner
eba403aafc d/rules: adapt to changed opensbi riscv filenames in 7.0.0
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:28 +02:00
Fabian Ebner
b2685aee04 d/rules: drop outdated configure flags
See QEMU commits 9e8be4c546ce8469ca9702715bf8f198d604b685 and
a5730b8bd3675f484ed0eacea052452048eeb35d for more information.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:25 +02:00
Fabian Ebner
dc9827a6a4 update submodule and patches to 7.0.0
Only very minor changes needed:
* Most patches in extra (or some version of them) are part of 7.0.0.
* aio_set_fd_handler got an extra parameter, but can just pass NULL
  like we did for the related 'poll' parameter. See QEMU commit
  826cc32423db2a99d184dbf4f507c737d7e7a4ae for more.
* Add include for qemu/memalign.h in vma.c and vma-writer.c.
* Add reverts for fixups of already reverted 0347a8fd4c ("block/rbd:
  implement bdrv_co_block_status") that came in with 7.0.0. Those
  fixups are not enough, see Proxmox bugzilla #4047.
* Two trivial context changes for bitmap-mirror patches.
* block_int.h got split up into multiple headers.
* Some context changes in configure and meson.build.
* Used the oppurtunity to squash fixup of bdrv_backuo_dump_create typo
  in a later patch into the patch introducing the function (had to
  move code to new header during rebase).

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:21 +02:00
Thomas Lamprecht
4e4b9ab13f bump version to 6.2.0-11
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-06-22 15:54:58 +02:00
Thomas Lamprecht
39e84ba82d vma/alloc-track improvements
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-06-22 15:52:16 +02:00
Thomas Lamprecht
4fd0fa7fb3 re-export patches in normalized form
iow. using:

git format-patch --zero-commit --no-signature --no-numbered --diff-algorithm=myers ...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-06-22 15:49:53 +02:00
Dominik Csapak
539e333eaa add 'namespace' to BlockdevOptionsPbs
so that we can use it for the -blockdev options (used for live-restore)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-06-22 15:10:49 +02:00
Fabian Grünbichler
68569ea2ff bump version to 6.2.0-10
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-06-09 16:35:57 +02:00
Fabian Grünbichler
41aedfb6db add d/source/include-binaries
to shutup dpkg-source when building a source package

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-06-09 16:35:01 +02:00
Fabian Ebner
7bd4d8645a fix #4101: acquire job's aio context before calling job_unref
Otherwise, we might run into an abort via bdrv_co_yield_to_drain()
(can at least happen when a disk with iothread is used):
> #0  0x00007fef4f5dece1 __GI_raise (libc.so.6 + 0x3bce1)
> #1  0x00007fef4f5c8537 __GI_abort (libc.so.6 + 0x25537)
> #2  0x00005641bce3c71f error_exit (qemu-system-x86_64 + 0x80371f)
> #3  0x00005641bce3d02b qemu_mutex_unlock_impl (qemu-system-x86_64 + 0x80402b)
> #4  0x00005641bcd51655 bdrv_co_yield_to_drain (qemu-system-x86_64 + 0x718655)
> #5  0x00005641bcd52de8 bdrv_do_drained_begin (qemu-system-x86_64 + 0x719de8)
> #6  0x00005641bcd47e07 blk_drain (qemu-system-x86_64 + 0x70ee07)
> #7  0x00005641bcd498cd blk_unref (qemu-system-x86_64 + 0x7108cd)
> #8  0x00005641bcd31e6f block_job_free (qemu-system-x86_64 + 0x6f8e6f)
> #9  0x00005641bcd32d65 job_unref (qemu-system-x86_64 + 0x6f9d65)
> #10 0x00005641bcd93b3d pvebackup_co_complete_stream (qemu-system-x86_64 + 0x75ab3d)
> #11 0x00005641bce4e353 coroutine_trampoline (qemu-system-x86_64 + 0x815353)

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-09 14:57:28 +02:00
Wolfgang Bumiller
ed3b5b8ab8 bump version to 6.2.0-9
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-08 14:04:09 +02:00
Wolfgang Bumiller
7f4326d1dc pbs cleanup fixes
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-08 13:10:51 +02:00
Wolfgang Bumiller
53bff441c5 delete patches which were dropped from the series file
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-08 13:07:04 +02:00
Thomas Lamprecht
f9a4b1cea7 bump version to 6.2.0-8
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-19 09:25:11 +02:00
Fabian Ebner
dc265df350 add revert to work around performance regression when backing up large RBD disk
resulting in QMP timeouts and very slow backups. The plan is to figure
out (ideally together with upstream) a way to make the implementation
of bdrv_co_block_status for RBD more efficient. But for now, revert
the problematic change as a stop-gap measure.

Upstream bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1026

Forum threads:
https://forum.proxmox.com/threads/109272/
https://forum.proxmox.com/threads/109448/
https://forum.proxmox.com/threads/101334/ (partially)

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-05-19 09:23:38 +02:00
Thomas Lamprecht
e0076597c6 bump version to 6.2.0-7
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-12 16:06:00 +02:00
Thomas Lamprecht
ee99c1f813 d/control: bump build-depenceny of proxmox-backup-qemu to 1.3.0-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-12 16:05:30 +02:00
Wolfgang Bumiller
58a5492e9c namespace support
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-05-12 13:49:35 +02:00
Thomas Lamprecht
e67b8b5bd9 bump version to 6.2.0-6
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-11 10:42:57 +02:00
Thomas Lamprecht
309b5c1694 backport various fixes for gluster, qxl and vnc
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-11 10:40:14 +02:00
Thomas Lamprecht
4ce5937f89 bump version to 6.2.0-5
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-25 10:13:50 +02:00
Thomas Lamprecht
f87d0523df vma: allow partial restore
Introduce a new map line for skipping a certain drive, of the form
skip=drive-scsi0

Since in PVE, most archives are compressed and piped to vma for
restore, it's not easily possible to skip reads.

For the reader, a new skip flag for VmaRestoreState is added and the
target is allowed to be NULL if skip is specified when registering.
If
the skip flag is set, no writes will be made as well as no check for
duplicate clusters. Therefore, the flag is not set for verify.

Originally-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-25 10:07:37 +02:00
Thomas Lamprecht
2fd4ea2813 patches: update context
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-25 10:07:01 +02:00
Thomas Lamprecht
2653a5f029 vma: restore: call blk_unref for all opened block devices
Originally-by: Fabian Ebner <f.ebner@proxmox.com>
Link: https://lists.proxmox.com/pipermail/pve-devel/2022-April/052642.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-25 10:05:29 +02:00
Thomas Lamprecht
664ecf59a9 bump version to 6.2.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 11:52:34 +02:00
Thomas Lamprecht
4de9440f87 various stable backports
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 10:22:39 +02:00
Thomas Lamprecht
9b348f8c6d d/copyright: drop trailing whitespace
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 09:16:23 +02:00
Thomas Lamprecht
799cf8c5a3 d/control: add suggest dependency-hint for libgl1
It pulls in a lot of stuff via the libglx0 -> libglx-mesa0 dependency
chain, so only suggest it for now to avoid installing it in the
installer or via common "PVE on-top Debian" installations, VirGL
integration is experimental after all and we may drop/replace it with
the vulkan based venus one, once available (Debian 12?).

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 09:14:39 +02:00
Thomas Lamprecht
b02e62dba0 d/control: add libgbm to build dependencies
required for good virgl support

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 08:50:36 +02:00
Thomas Lamprecht
fc03e1b6bf bump version to 6.2.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-15 09:09:43 +02:00
Thomas Lamprecht
c8ba14bed0 cherry-pick fix for passing some acpi slic tables
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-15 08:07:34 +02:00
Wolfgang Bumiller
daea534caa bump version to 6.2.0-2
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-03-03 12:05:37 +01:00
Fabian Ebner
27199bd753 backup: add patch to initialize bcs bitmap early enough for PBS
This is necessary for multi-disk backups where not all jobs are
immediately started after they are created. QEMU commit
06e0a9c16405c0a4c1eca33cf286cc04c42066a2 did already part of the work,
ensuring that new writes after job creation don't pass through to the
backup, but not yet for the MIRROR_SYNC_MODE_BITMAP case which is used
for PBS.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-03-03 11:37:17 +01:00
Thomas Lamprecht
e050683663 d/control: mark numactl a recommended package
we do not call in anywhere unconditionally

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
13184117e4 d/control: drop sdl dependency, we disable it on compile tinme
disabled via d/rules since a while...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
aa4b14ea10 d/control: libaio1 is added by dh shlibs
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
3aa5b7598d enable zstd support
plan to use that for multifd migration, among other things

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
13d3e10aa6 compile in virgl support
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
58c3533a58 bump version to 6.2.0-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-18 14:23:41 +01:00
Thomas Lamprecht
fe0542bed9 d/rules: disable libssh by default
was always disabled in our clean builds, this now also avoids
auto-enabling it on "dirty" build hosts

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-18 14:21:53 +01:00
Fabian Ebner
f6d40bfdf4 add patch for loading a snapshot with qemu-img dd
Will be used when cloning from a qcow2 efidisk.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-02-15 14:03:07 +01:00
Fabian Ebner
107132becc fix getopt-string when introducing -n option for qemu-img dd
The colon after U is wrong, because it doesn't take an argument.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-02-15 14:03:07 +01:00
Fabian Ebner
4567474e95 update submodule and patches to 6.2.0
Notable changes:
* bdrv_co_p{discard,readv,writev,write_zeroes} function signatures
  changed, to using int64_t for offsets/bytes and some still had int
  rather than BrdvRequestFlags for the flags.
* job_cancel_sync now has a force parameter. Commit messages in
  73895f3838cd7fdaf185cf1dbc47be58844a966f
  4cfb3f05627ad82af473e7f7ae113c3884cd04e3
  sound like using force=true makes more sense.
* Added 3 patches coming in via qemu-stable tag, most important one is
  to work around a librbd issue.
* Added another 3 patches from qemu-devel to fix issue leading to
  crash when live migrating with iothread.
* cluster_size calculation helper changed (see patch pve/0026).
* QAPI's if conditionals now use 'CONFIG_FOO' rather than
  'defined(CONFIG_FOO)'

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-02-15 14:03:07 +01:00
108 changed files with 6679 additions and 1593 deletions

View File

@@ -33,7 +33,7 @@ $(BUILDDIR): keycodemapdb | submodule
deb kvm: $(DEBS)
$(DEB_DBG): $(DEB)
$(DEB): $(BUILDDIR)
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j32
lintian $(DEBS)
.PHONY: update

267
debian/changelog vendored
View File

@@ -1,3 +1,270 @@
pve-qemu-kvm (7.2.0-8+vitastor3) bullseye; urgency=medium
* Fix truncation
* Add write-back cache support
-- Vitaliy Filippov <vitalif@yourcmc.ru> Fri, 27 Oct 2023 21:04:05 +0300
pve-qemu-kvm (7.2.0-8+vitastor2) bullseye; urgency=medium
* Improve performance by adding io_uring support
* Fix compatibility with iothread
-- Vitaliy Filippov <vitalif@yourcmc.ru> Tue, 18 Jul 2023 02:17:06 +0300
pve-qemu-kvm (7.2.0-8+vitastor1) bullseye; urgency=medium
* Add Vitastor support
-- Vitaliy Filippov <vitalif@yourcmc.ru> Tue, 25 Apr 2023 10:13:42 +0300
pve-qemu-kvm (7.2.0-8) bullseye; urgency=medium
* backport fix for ACPI CPU hotplug issue with TCG
* cherry-pick TCG-related stable fixes for 7.2 for users that turned off KVM
HW acceleration
-- Proxmox Support Team <support@proxmox.com> Fri, 17 Mar 2023 15:47:08 +0100
pve-qemu-kvm (7.2.0-7) bullseye; urgency=medium
* improve fix for potential deadlock with trim for IDE/SATA and draining
* backport stable fixes:
- hw/nvme: fix missing endian conversions for doorbell buffers
- hw/smbios: fix field corruption in type 4 table
- virtio-rng-pci: fix transitional migration compat for vectors
- hw/timer/hpet: Fix expiration time overflow
- vhost/vdpa: stop all svq on device deletion
- vhost: avoid a potential use of an uninitialized variable in the call to
vhost_svq_poll
- chardev/char-socket: set s->listener = NULL in char_socket_finalize to
fix a potential crash after live-migration
- intel-iommu: fail MAP notifier without caching mode
- intel-iommu: fail DEVIOTLB_UNMAP without dt mode
* fix a regression for when the LSI SCSI controller is used
-- Proxmox Support Team <support@proxmox.com> Mon, 13 Mar 2023 17:42:49 +0100
pve-qemu-kvm (7.2.0-6) bullseye; urgency=medium
* fix 7.2 regression for Linux boot failures with megasas SCSI
* fix 7.0 regression for a potential deadlock with trim for IDE/SATA and
draining
-- Proxmox Support Team <support@proxmox.com> Wed, 08 Mar 2023 14:32:17 +0100
pve-qemu-kvm (7.2.0-5) bullseye; urgency=medium
* fix #4476: savevm-async: avoid looping without progress
* savevm-async: decrease the boundary for free space for (memory) state left
on target from 30 MiB to 100 MiB, improving the heuristic for when to
enter the final "pause and sync" stage.
* QMP backup: use correct error number when getting blockdrive length fails
* backport fix for some DMA reentrancy issues, better protecting against
malicious guests
* backport fix for iSCSI double free issue leading to crashes
-- Proxmox Support Team <support@proxmox.com> Tue, 21 Feb 2023 13:49:43 +0100
pve-qemu-kvm (7.2.0-4) bullseye; urgency=medium
* backport fix for a 7.2 regression when using VirtIO disk with
detect-zeroes=unmap
-- Proxmox Support Team <support@proxmox.com> Fri, 27 Jan 2023 09:37:49 +0100
pve-qemu-kvm (7.2.0-3) bullseye; urgency=medium
* add fix for live-migration with virtio-rng devices, which regressed in
QEMU 7.2.0.
-- Proxmox Support Team <support@proxmox.com> Thu, 12 Jan 2023 13:13:14 +0100
pve-qemu-kvm (7.2.0-2) bullseye; urgency=medium
* enable slirp again for now, as in qemu-server, user networking is
supported (via CLI/API) when no bridge is set on a virtual NIC
* cherry-pick stable fixes for 7.2. Two for virtio-mem and one for vIOMMU.
Both features are not yet exposed in PVE's qemu-server, but there's work
going on to change that.
-- Proxmox Support Team <support@proxmox.com> Tue, 10 Jan 2023 15:47:48 +0100
pve-qemu-kvm (7.2.0-1) bullseye; urgency=medium
* update to QEMU stable release 7.2.0
* drop 'slirp' networking
-- Proxmox Support Team <support@proxmox.com> Fri, 16 Dec 2022 13:18:21 +0100
pve-qemu-kvm (7.1.0-4) bullseye; urgency=medium
* cherry-pick "block/block-backend: blk_set_enable_write_cache is IO_CODE"
-- Proxmox Support Team <support@proxmox.com> Tue, 22 Nov 2022 09:21:06 +0100
pve-qemu-kvm (7.1.0-3) bullseye; urgency=medium
* init: daemonize: defuse PID file resolve error to a warning at max, fixing
some usecases that regressed with 7.1, like tracking start up in our
file-restore VM.
-- Proxmox Support Team <support@proxmox.com> Fri, 28 Oct 2022 10:27:49 +0200
pve-qemu-kvm (7.1.0-2) bullseye; urgency=medium
* fix an issue with error handling in async backup code
-- Proxmox Support Team <support@proxmox.com> Tue, 18 Oct 2022 15:33:44 +0200
pve-qemu-kvm (7.1.0-1) bullseye; urgency=medium
* update to QEMU stable release 7.1.0
* add fix for io_uring_register_ring_fd from upstream
-- Proxmox Support Team <support@proxmox.com> Fri, 14 Oct 2022 14:54:09 +0200
pve-qemu-kvm (7.0.0-4) bullseye; urgency=medium
* add revision to version output
* PVE Backup: allow passing max-workers performance setting
-- Proxmox Support Team <support@proxmox.com> Mon, 10 Oct 2022 11:55:37 +0200
pve-qemu-kvm (7.0.0-3) bullseye; urgency=medium
* savevm-async: avoid segfault when aborting snapshot creation task
* savevm-async: set SAVE_STATE_DONE when closing state file was successful
allowing one to start a new snapshot task after aborting one.
-- Proxmox Support Team <support@proxmox.com> Tue, 30 Aug 2022 12:54:03 +0200
pve-qemu-kvm (7.0.0-2) bullseye; urgency=medium
* backport "io_uring: fix short read slow path"
* backport "e1000: set RX descriptor status in a separate operation"
-- Proxmox Support Team <support@proxmox.com> Wed, 20 Jul 2022 09:17:07 +0200
pve-qemu-kvm (7.0.0-1) bullseye; urgency=medium
* update to QEMU stable release 7.0.0
-- Proxmox Support Team <support@proxmox.com> Thu, 30 Jun 2022 11:07:37 +0200
pve-qemu-kvm (6.2.0-11) bullseye; urgency=medium
* add 'namespace' to BlockdevOptionsPbs for live-restore support
* vma: create: support 64KiB-unaligned input images like to improve backing
up some VM templates
* block: alloc-track: avoid unlikely, but possible premature break
-- Proxmox Support Team <support@proxmox.com> Wed, 22 Jun 2022 15:54:54 +0200
pve-qemu-kvm (6.2.0-10) bullseye; urgency=medium
* fix #4101: fix backup cancellation bug with iothreads
-- Proxmox Support Team <support@proxmox.com> Thu, 9 Jun 2022 16:35:51 +0200
pve-qemu-kvm (6.2.0-9) bullseye; urgency=medium
* fix possible race conditions during cancellation of a PBS backup
-- Proxmox Support Team <support@proxmox.com> Wed, 08 Jun 2022 14:03:22 +0200
pve-qemu-kvm (6.2.0-8) bullseye; urgency=medium
* revert "block/rbd: implement bdrv_co_block_status" to work around
performance regression when backing up large RBD disk
-- Proxmox Support Team <support@proxmox.com> Thu, 19 May 2022 09:24:45 +0200
pve-qemu-kvm (6.2.0-7) bullseye; urgency=medium
* Proxmox Backup Server namespace support
-- Proxmox Support Team <support@proxmox.com> Thu, 12 May 2022 16:05:56 +0200
pve-qemu-kvm (6.2.0-6) bullseye; urgency=medium
* block/gluster: correctly set max_pdiscard which is int64_t to avoid
triggering assertion
* ui/vnc.c: Fixed a deadlock bug
* display/qxl-render: fix race condition in qxl_cursor (CVE-2021-4207) and
integer overflow in cursor_alloc (CVE-2021-4206)
-- Proxmox Support Team <support@proxmox.com> Wed, 11 May 2022 10:42:53 +0200
pve-qemu-kvm (6.2.0-5) bullseye; urgency=medium
* vma: allow partial restore by skipping some disk
-- Proxmox Support Team <support@proxmox.com> Mon, 25 Apr 2022 10:13:46 +0200
pve-qemu-kvm (6.2.0-4) bullseye; urgency=medium
* d/control: add libgbm to build dependencies
* d/control: add suggest dependency-hint for libgl1
* various stable backports:
+ virtio-net: fix map leaking on error during receive
+ memory: Fix incorrect calls of log_global_start/stop
+ acpi: fix OEM ID/OEM Table ID padding
+ vhost-vsock: detach the virqueue element in case of error
+ vhost-user: remove VirtQ notifier restore
+ vhost-user: fix VirtQ notifier cleanup
+ virtio: fix the condition for iommu_platform not supported
-- Proxmox Support Team <support@proxmox.com> Fri, 22 Apr 2022 11:52:30 +0200
pve-qemu-kvm (6.2.0-3) bullseye; urgency=medium
* cherry-pick fix for some manually added ACPI table SLIC entries via the
custom args flag.
-- Proxmox Support Team <support@proxmox.com> Fri, 15 Apr 2022 09:09:37 +0200
pve-qemu-kvm (6.2.0-2) bullseye; urgency=medium
* compile in virgl support
* enable zstd support
* drop sdl dependency (it was disabled at compile time already)
* recommend 'numactl'
* fix an issue with multi-disk backups where chunks would be written
multiple times
-- Proxmox Support Team <support@proxmox.com> Thu, 03 Mar 2022 12:03:44 +0100
pve-qemu-kvm (6.2.0-1) bullseye; urgency=medium
* update to QEMU stable release 6.2.0
-- Proxmox Support Team <support@proxmox.com> Thu, 17 Feb 2022 06:23:14 +0100
pve-qemu-kvm (6.1.1-2) bullseye; urgency=medium
* vma: create: register all streams before entering coroutines to avoid that

16
debian/control vendored
View File

@@ -10,26 +10,30 @@ Build-Depends: autotools-dev,
libattr1-dev,
libcap-ng-dev,
libcurl4-gnutls-dev,
libepoxy-dev,
libfdt-dev,
libgbm-dev,
libglusterfs-dev (>= 5.2-2),
libgnutls28-dev,
libiscsi-dev (>= 1.12.0),
libjemalloc-dev,
libjpeg-dev,
libjson-perl,
libnuma-dev,
libpci-dev,
libpixman-1-dev,
libproxmox-backup-qemu0-dev (>= 1.0.3-1),
libproxmox-backup-qemu0-dev (>= 1.3.0-1),
librbd-dev (>= 0.48),
libsdl1.2-dev,
libseccomp-dev,
libslirp-dev,
libspice-protocol-dev (>= 0.12.14~),
libspice-server-dev (>= 0.14.0~),
libsystemd-dev,
liburing-dev,
libusb-1.0-0-dev (>= 1.0.17-1),
libusbredirparser-dev (>= 0.6-2),
libvirglrenderer-dev,
libzstd-dev,
meson,
python3-minimal,
python3-sphinx,
@@ -45,7 +49,6 @@ Package: pve-qemu-kvm
Architecture: any
Depends: ceph-common (>= 0.48),
iproute2,
libaio1,
libgfapi0 | glusterfs-common (>= 5.6),
libgfchangelog0 | glusterfs-common (>= 5.6),
libgfdb0 | glusterfs-common (>= 5.6),
@@ -54,16 +57,16 @@ Depends: ceph-common (>= 0.48),
libglusterfs-dev | glusterfs-common (>= 5.6),
libglusterfs0 | glusterfs-common (>= 5.6),
libiscsi4 (>= 1.12.0) | libiscsi7,
libjemalloc2,
libjpeg62-turbo,
libsdl1.2debian,
libspice-server1 (>= 0.14.0~),
libusb-1.0-0 (>= 1.0.17-1),
libusbredirparser1 (>= 0.6-2),
vitastor-client (>= 0.9.4),
libuuid1,
numactl,
${misc:Depends},
${shlibs:Depends},
Recommends: numactl
Suggests: libgl1
Conflicts: kvm,
pve-kvm,
pve-qemu-kvm-2.6.18,
@@ -72,6 +75,7 @@ Conflicts: kvm,
qemu-system-arm,
qemu-system-common,
qemu-system-x86,
qemu-system-data,
qemu-utils,
Provides: qemu-system-arm, qemu-system-x86, qemu-utils
Replaces: pve-kvm,

2
debian/copyright vendored
View File

@@ -25,7 +25,7 @@ License:
In particular, the QEMU virtual CPU core library (libqemu.a) is
released under the GNU Lesser General Public License version 2 or later.
On Debian systems, the complete text of the GNU Lesser General Public
On Debian systems, the complete text of the GNU Lesser General Public
License can be found in the file /usr/share/common-licenses/LGPL.
Some hardware device emulation sources and other QEMU functionality are

View File

@@ -28,18 +28,18 @@ Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/mirror.c | 98 +++++++++++++++++++++++++-------
blockdev.c | 39 ++++++++++++-
include/block/block_int.h | 4 +-
qapi/block-core.json | 29 ++++++++--
tests/unit/test-block-iothread.c | 4 +-
block/mirror.c | 98 +++++++++++++++++++++-----
blockdev.c | 39 +++++++++-
include/block/block_int-global-state.h | 4 +-
qapi/block-core.json | 29 ++++++--
tests/unit/test-block-iothread.c | 4 +-
5 files changed, 145 insertions(+), 29 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 85b781bc21..0821214138 100644
index 251adc5ae0..8ead5f77a0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -50,7 +50,7 @@ typedef struct MirrorBlockJob {
@@ -51,7 +51,7 @@ typedef struct MirrorBlockJob {
BlockDriverState *to_replace;
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
@@ -57,7 +57,7 @@ index 85b781bc21..0821214138 100644
BdrvDirtyBitmap *dirty_bitmap;
BdrvDirtyBitmapIter *dbi;
uint8_t *buf;
@@ -697,7 +699,8 @@ static int mirror_exit_common(Job *job)
@@ -699,7 +701,8 @@ static int mirror_exit_common(Job *job)
bdrv_child_refresh_perms(mirror_top_bs, mirror_top_bs->backing,
&error_abort);
if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
@@ -67,7 +67,7 @@ index 85b781bc21..0821214138 100644
BlockDriverState *unfiltered_target = bdrv_skip_filters(target_bs);
if (bdrv_cow_bs(unfiltered_target) != backing) {
@@ -802,6 +805,16 @@ static void mirror_abort(Job *job)
@@ -797,6 +800,16 @@ static void mirror_abort(Job *job)
assert(ret == 0);
}
@@ -84,7 +84,7 @@ index 85b781bc21..0821214138 100644
static void coroutine_fn mirror_throttle(MirrorBlockJob *s)
{
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -983,7 +996,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
@@ -977,7 +990,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
mirror_free_init(s);
s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -94,7 +94,7 @@ index 85b781bc21..0821214138 100644
ret = mirror_dirty_init(s);
if (ret < 0 || job_is_cancelled(&s->common.job)) {
goto immediate_exit;
@@ -1216,6 +1230,7 @@ static const BlockJobDriver mirror_job_driver = {
@@ -1224,6 +1238,7 @@ static const BlockJobDriver mirror_job_driver = {
.run = mirror_run,
.prepare = mirror_prepare,
.abort = mirror_abort,
@@ -102,15 +102,15 @@ index 85b781bc21..0821214138 100644
.pause = mirror_pause,
.complete = mirror_complete,
.cancel = mirror_cancel,
@@ -1232,6 +1247,7 @@ static const BlockJobDriver commit_active_job_driver = {
@@ -1240,6 +1255,7 @@ static const BlockJobDriver commit_active_job_driver = {
.run = mirror_run,
.prepare = mirror_prepare,
.abort = mirror_abort,
+ .clean = mirror_clean,
.pause = mirror_pause,
.complete = mirror_complete,
},
@@ -1594,7 +1610,10 @@ static BlockJob *mirror_start_job(
.cancel = commit_active_cancel,
@@ -1627,7 +1643,10 @@ static BlockJob *mirror_start_job(
BlockCompletionFunc *cb,
void *opaque,
const BlockJobDriver *driver,
@@ -122,11 +122,12 @@ index 85b781bc21..0821214138 100644
bool auto_complete, const char *filter_node_name,
bool is_mirror, MirrorCopyMode copy_mode,
Error **errp)
@@ -1606,10 +1625,39 @@ static BlockJob *mirror_start_job(
@@ -1639,10 +1658,39 @@ static BlockJob *mirror_start_job(
uint64_t target_perms, target_shared_perms;
int ret;
- if (granularity == 0) {
- granularity = bdrv_get_default_bitmap_granularity(target);
+ if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+ error_setg(errp, "Sync mode '%s' not supported",
+ MirrorSyncMode_str(sync_mode));
@@ -147,8 +148,8 @@ index 85b781bc21..0821214138 100644
+ "sync mode '%s' is not compatible with bitmaps",
+ MirrorSyncMode_str(sync_mode));
+ return NULL;
+ }
+
}
+ if (bitmap) {
+ if (granularity) {
+ error_setg(errp, "granularity (%d)"
@@ -158,13 +159,12 @@ index 85b781bc21..0821214138 100644
+ }
+ granularity = bdrv_dirty_bitmap_granularity(bitmap);
+ } else if (granularity == 0) {
granularity = bdrv_get_default_bitmap_granularity(target);
}
-
+ granularity = bdrv_get_default_bitmap_granularity(target);
+ }
assert(is_power_of_2(granularity));
if (buf_size < 0) {
@@ -1747,7 +1795,9 @@ static BlockJob *mirror_start_job(
@@ -1774,7 +1822,9 @@ static BlockJob *mirror_start_job(
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
@@ -175,7 +175,7 @@ index 85b781bc21..0821214138 100644
s->backing_mode = backing_mode;
s->zero_target = zero_target;
s->copy_mode = copy_mode;
@@ -1768,6 +1818,18 @@ static BlockJob *mirror_start_job(
@@ -1795,6 +1845,18 @@ static BlockJob *mirror_start_job(
bdrv_disable_dirty_bitmap(s->dirty_bitmap);
}
@@ -194,7 +194,7 @@ index 85b781bc21..0821214138 100644
ret = block_job_add_bdrv(&s->common, "source", bs, 0,
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
BLK_PERM_CONSISTENT_READ,
@@ -1845,6 +1907,9 @@ fail:
@@ -1872,6 +1934,9 @@ fail:
if (s->dirty_bitmap) {
bdrv_release_dirty_bitmap(s->dirty_bitmap);
}
@@ -204,7 +204,7 @@ index 85b781bc21..0821214138 100644
job_early_fail(&s->common.job);
}
@@ -1862,29 +1927,23 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
@@ -1889,31 +1954,25 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, const char *replaces,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
@@ -221,6 +221,8 @@ index 85b781bc21..0821214138 100644
- bool is_none_mode;
BlockDriverState *base;
GLOBAL_STATE_CODE();
- if ((mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
- (mode == MIRROR_SYNC_MODE_BITMAP)) {
- error_setg(errp, "Sync mode '%s' not supported",
@@ -239,7 +241,7 @@ index 85b781bc21..0821214138 100644
}
BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -1909,7 +1968,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -1940,7 +1999,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_LEAVE_BACKING_CHAIN, false,
on_error, on_error, true, cb, opaque,
@@ -250,10 +252,10 @@ index 85b781bc21..0821214138 100644
errp);
if (!job) {
diff --git a/blockdev.c b/blockdev.c
index 3d8ac368a1..03e99264dc 100644
index 3f1dec6242..2ee30323cb 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2957,6 +2957,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2946,6 +2946,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
BlockDriverState *target,
bool has_replaces, const char *replaces,
enum MirrorSyncMode sync,
@@ -264,7 +266,7 @@ index 3d8ac368a1..03e99264dc 100644
BlockMirrorBackingMode backing_mode,
bool zero_target,
bool has_speed, int64_t speed,
@@ -2976,6 +2980,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2965,6 +2969,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
{
BlockDriverState *unfiltered_bs;
int job_flags = JOB_DEFAULT;
@@ -272,7 +274,7 @@ index 3d8ac368a1..03e99264dc 100644
if (!has_speed) {
speed = 0;
@@ -3030,6 +3035,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3019,6 +3024,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}
@@ -302,7 +304,7 @@ index 3d8ac368a1..03e99264dc 100644
if (!has_replaces) {
/* We want to mirror from @bs, but keep implicit filters on top */
unfiltered_bs = bdrv_skip_implicit_filters(bs);
@@ -3076,8 +3104,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3065,8 +3093,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
* and will allow to check whether the node still exist at mirror completion
*/
mirror_start(job_id, bs, target,
@@ -313,7 +315,7 @@ index 3d8ac368a1..03e99264dc 100644
on_source_error, on_target_error, unmap, filter_node_name,
copy_mode, errp);
}
@@ -3222,6 +3250,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
@@ -3211,6 +3239,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
arg->has_replaces, arg->replaces, arg->sync,
@@ -322,7 +324,7 @@ index 3d8ac368a1..03e99264dc 100644
backing_mode, zero_target,
arg->has_speed, arg->speed,
arg->has_granularity, arg->granularity,
@@ -3243,6 +3273,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
@@ -3232,6 +3262,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
const char *device, const char *target,
bool has_replaces, const char *replaces,
MirrorSyncMode sync,
@@ -331,7 +333,7 @@ index 3d8ac368a1..03e99264dc 100644
bool has_speed, int64_t speed,
bool has_granularity, uint32_t granularity,
bool has_buf_size, int64_t buf_size,
@@ -3292,7 +3324,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
@@ -3281,7 +3313,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
}
blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs,
@@ -341,11 +343,11 @@ index 3d8ac368a1..03e99264dc 100644
zero_target, has_speed, speed,
has_granularity, granularity,
has_buf_size, buf_size,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index c31cbd034a..11442893d0 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1254,7 +1254,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index b49f4eb35b..9d744db618 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -149,7 +149,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, const char *replaces,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
@@ -357,10 +359,10 @@ index c31cbd034a..11442893d0 100644
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 675d8265eb..6356a63695 100644
index 95ac4fa634..7daaf545be 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1938,10 +1938,19 @@
@@ -2000,10 +2000,19 @@
# (all the disk, only the sectors allocated in the topmost image, or
# only new I/O).
#
@@ -381,7 +383,7 @@ index 675d8265eb..6356a63695 100644
#
# @buf-size: maximum amount of data in flight from source to
# target (since 1.4).
@@ -1979,7 +1988,9 @@
@@ -2043,7 +2052,9 @@
{ 'struct': 'DriveMirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*format': 'str', '*node-name': 'str', '*replaces': 'str',
@@ -392,7 +394,7 @@ index 675d8265eb..6356a63695 100644
'*speed': 'int', '*granularity': 'uint32',
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
@@ -2247,10 +2258,19 @@
@@ -2322,10 +2333,19 @@
# (all the disk, only the sectors allocated in the topmost image, or
# only new I/O).
#
@@ -413,7 +415,7 @@ index 675d8265eb..6356a63695 100644
#
# @buf-size: maximum amount of data in flight from source to
# target
@@ -2299,7 +2319,8 @@
@@ -2375,7 +2395,8 @@
{ 'command': 'blockdev-mirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*replaces': 'str',
@@ -424,10 +426,10 @@ index 675d8265eb..6356a63695 100644
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index c39e70b2f5..470ef79ae0 100644
index 8ca5adec5e..dae80e5a5f 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -617,8 +617,8 @@ static void test_propagate_mirror(void)
@@ -755,8 +755,8 @@ static void test_propagate_mirror(void)
/* Start a mirror job */
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,
@@ -437,4 +439,4 @@ index c39e70b2f5..470ef79ae0 100644
+ false, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
false, "filter_node", MIRROR_COPY_MODE_BACKGROUND,
&error_abort);
job = job_get("job0");
WITH_JOB_LOCK_GUARD() {

View File

@@ -24,10 +24,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 0821214138..c688726fae 100644
index 8ead5f77a0..35c1b8f25d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -674,8 +674,6 @@ static int mirror_exit_common(Job *job)
@@ -676,8 +676,6 @@ static int mirror_exit_common(Job *job)
bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
}
@@ -36,9 +36,9 @@ index 0821214138..c688726fae 100644
/* Make sure that the source BDS doesn't go away during bdrv_replace_node,
* before we can call bdrv_drained_end */
bdrv_ref(src);
@@ -783,6 +781,18 @@ static int mirror_exit_common(Job *job)
blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, &error_abort);
blk_insert_bs(bjob->blk, mirror_top_bs, &error_abort);
@@ -778,6 +776,18 @@ static int mirror_exit_common(Job *job)
block_job_remove_all_bdrv(bjob);
bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
+ if (s->sync_bitmap) {
+ if (s->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS ||
@@ -55,7 +55,7 @@ index 0821214138..c688726fae 100644
bs_opaque->job = NULL;
bdrv_drained_end(src);
@@ -1635,10 +1645,6 @@ static BlockJob *mirror_start_job(
@@ -1668,10 +1678,6 @@ static BlockJob *mirror_start_job(
" sync mode",
MirrorSyncMode_str(sync_mode));
return NULL;
@@ -66,7 +66,7 @@ index 0821214138..c688726fae 100644
}
} else if (bitmap) {
error_setg(errp,
@@ -1655,6 +1661,12 @@ static BlockJob *mirror_start_job(
@@ -1688,6 +1694,12 @@ static BlockJob *mirror_start_job(
return NULL;
}
granularity = bdrv_dirty_bitmap_granularity(bitmap);

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 3 insertions(+)
diff --git a/blockdev.c b/blockdev.c
index 03e99264dc..9e14feec87 100644
index 2ee30323cb..dd1c2cdef7 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3056,6 +3056,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3045,6 +3045,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_ALLOW_RO, errp)) {
return;
}

View File

@@ -16,10 +16,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index c688726fae..a7f829f766 100644
index 35c1b8f25d..4969c6833c 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -787,8 +787,8 @@ static int mirror_exit_common(Job *job)
@@ -782,8 +782,8 @@ static int mirror_exit_common(Job *job)
job->ret == 0 && ret == 0)) {
/* Success; synchronize copy back to sync. */
bdrv_clear_dirty_bitmap(s->sync_bitmap, NULL);
@@ -30,7 +30,7 @@ index c688726fae..a7f829f766 100644
}
}
bdrv_release_dirty_bitmap(s->dirty_bitmap);
@@ -1835,11 +1835,8 @@ static BlockJob *mirror_start_job(
@@ -1862,11 +1862,8 @@ static BlockJob *mirror_start_job(
}
if (s->sync_mode == MIRROR_SYNC_MODE_BITMAP) {

View File

@@ -19,10 +19,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
3 files changed, 70 insertions(+), 59 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index a7f829f766..6a126d18c8 100644
index 4969c6833c..cf85ae1074 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1635,31 +1635,13 @@ static BlockJob *mirror_start_job(
@@ -1668,31 +1668,13 @@ static BlockJob *mirror_start_job(
uint64_t target_perms, target_shared_perms;
int ret;
@@ -60,10 +60,10 @@ index a7f829f766..6a126d18c8 100644
if (bitmap_mode != BITMAP_SYNC_MODE_NEVER) {
diff --git a/blockdev.c b/blockdev.c
index 9e14feec87..b6f797b41f 100644
index dd1c2cdef7..756e980889 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3035,7 +3035,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3024,7 +3024,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}

View File

@@ -48,7 +48,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
6 files changed, 59 insertions(+), 5 deletions(-)
diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 1a8a369b50..2c8a558c67 100644
index 737e750670..38804b8595 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -16,6 +16,7 @@ extern QemuOptsList qemu_mon_opts;
@@ -60,10 +60,10 @@ index 1a8a369b50..2c8a558c67 100644
void monitor_init_globals(void);
void monitor_init_globals_core(void);
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index 9c3a09cb01..a92be8c3f7 100644
index a2cdbbf646..b531bd50e7 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -144,6 +144,13 @@ typedef struct {
@@ -152,6 +152,13 @@ typedef struct {
QemuMutex qmp_queue_lock;
/* Input queue that holds all the parsed QMP requests */
GQueue *qmp_requests;
@@ -78,7 +78,7 @@ index 9c3a09cb01..a92be8c3f7 100644
/**
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 46a171bca6..5ccdd2424b 100644
index 86949024f6..c306cadcf4 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -135,6 +135,21 @@ bool monitor_cur_is_qmp(void)
@@ -144,10 +144,10 @@ index 092c527b6f..6b8cfcf6d8 100644
monitor_qmp_caps_reset(mon);
data = qmp_greeting(mon);
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 59600210ce..95602446eb 100644
index 0990873ec8..e605003771 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -120,16 +120,28 @@ typedef struct QmpDispatchBH {
@@ -117,16 +117,28 @@ typedef struct QmpDispatchBH {
QObject **ret;
Error **errp;
Coroutine *co;
@@ -180,7 +180,7 @@ index 59600210ce..95602446eb 100644
aio_co_wake(data->co);
}
@@ -243,6 +255,7 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
@@ -231,6 +243,7 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
.ret = &ret,
.errp = &err,
.co = qemu_coroutine_self(),
@@ -189,10 +189,10 @@ index 59600210ce..95602446eb 100644
aio_bh_schedule_oneshot(qemu_get_aio_context(), do_qmp_dispatch_bh,
&data);
diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
index d058a2a00d..3290b58120 100644
index afa477aae6..d3ff124bf3 100644
--- a/stubs/monitor-core.c
+++ b/stubs/monitor-core.c
@@ -13,6 +13,11 @@ Monitor *monitor_set_cur(Coroutine *co, Monitor *mon)
@@ -12,6 +12,11 @@ Monitor *monitor_set_cur(Coroutine *co, Monitor *mon)
return NULL;
}

View File

@@ -0,0 +1,42 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 28 Oct 2022 10:09:46 +0200
Subject: [PATCH] init: daemonize: defuse PID file resolve error
When proxmox-file-restore invokes QEMU, the PID file is a (temporary)
file that's already unlinked, so resolving the absolute path here
failed.
It should not be a critical error when the PID file unlink handler
can't be registered, because the path can't be resolved for whatever
reason. If the file is already gone from QEMU's perspective (i.e.
errno is ENOENT), silently ignore the error. Otherwise, print a
warning.
Reported-by: Dominik Csapak <d.csapak@proxmox.com>
Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
softmmu/vl.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 5115221efe..5f7f6ca981 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2460,10 +2460,11 @@ static void qemu_maybe_daemonize(const char *pid_file)
pid_file_realpath = g_malloc0(PATH_MAX);
if (!realpath(pid_file, pid_file_realpath)) {
- error_report("cannot resolve PID file path: %s: %s",
- pid_file, strerror(errno));
- unlink(pid_file);
- exit(1);
+ if (errno != ENOENT) {
+ warn_report("not removing PID file on exit: cannot resolve PID "
+ "file path: %s: %s", pid_file, strerror(errno));
+ }
+ return;
}
qemu_unlink_pidfile_notifier = (struct UnlinkPidfileNotifier) {

View File

@@ -1,55 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 1 Sep 2021 16:51:04 +0200
Subject: [PATCH] monitor/hmp: add support for flag argument with value
Adds support for the "-xS" parameter type, where "-x" denotes a flag
name and the "S" suffix indicates that this flag is supposed to take an
arbitrary string parameter.
These parameters are always optional, the entry in the qdict will be
omitted if the flag is not given.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
monitor/hmp.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/monitor/hmp.c b/monitor/hmp.c
index d50c3124e1..a32dce7a35 100644
--- a/monitor/hmp.c
+++ b/monitor/hmp.c
@@ -980,6 +980,7 @@ static QDict *monitor_parse_arguments(Monitor *mon,
{
const char *tmp = p;
int skip_key = 0;
+ int ret;
/* option */
c = *typestr++;
@@ -1002,8 +1003,22 @@ static QDict *monitor_parse_arguments(Monitor *mon,
}
if (skip_key) {
p = tmp;
+ } else if (*typestr == 'S') {
+ /* has option with string value */
+ typestr++;
+ tmp = p++;
+ while (qemu_isspace(*p)) {
+ p++;
+ }
+ ret = get_str(buf, sizeof(buf), &p);
+ if (ret < 0) {
+ monitor_printf(mon, "%s: value expected for -%c\n",
+ cmd->name, *tmp);
+ goto fail;
+ }
+ qdict_put_str(qdict, key, buf);
} else {
- /* has option */
+ /* has boolean option */
p++;
qdict_put_bool(qdict, key, true);
}

View File

@@ -1,479 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 25 Aug 2021 11:14:13 +0200
Subject: [PATCH] monitor: refactor set/expire_password and allow VNC display
id
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It is possible to specify more than one VNC server on the command line,
either with an explicit ID or the auto-generated ones à la "default",
"vnc2", "vnc3", ...
It is not possible to change the password on one of these extra VNC
displays though. Fix this by adding a "display" parameter to the
"set_password" and "expire_password" QMP and HMP commands.
For HMP, the display is specified using the "-d" value flag.
For QMP, the schema is updated to explicitly express the supported
variants of the commands with protocol-discriminated unions.
Suggested-by: Eric Blake <eblake@redhat.com>
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
hmp-commands.hx | 29 ++++----
monitor/hmp-cmds.c | 57 +++++++++++++++-
monitor/qmp-cmds.c | 62 ++++++-----------
qapi/ui.json | 165 ++++++++++++++++++++++++++++++++++++++-------
4 files changed, 233 insertions(+), 80 deletions(-)
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 8e45bce2cd..d78e4cfc47 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1514,34 +1514,35 @@ ERST
{
.name = "set_password",
- .args_type = "protocol:s,password:s,connected:s?",
- .params = "protocol password action-if-connected",
+ .args_type = "protocol:s,password:s,display:-dS,connected:s?",
+ .params = "protocol password [-d display] [action-if-connected]",
.help = "set spice/vnc password",
.cmd = hmp_set_password,
},
SRST
-``set_password [ vnc | spice ] password [ action-if-connected ]``
- Change spice/vnc password. Use zero to make the password stay valid
- forever. *action-if-connected* specifies what should happen in
- case a connection is established: *fail* makes the password change
- fail. *disconnect* changes the password and disconnects the
- client. *keep* changes the password and keeps the connection up.
- *keep* is the default.
+``set_password [ vnc | spice ] password [ -d display ] [ action-if-connected ]``
+ Change spice/vnc password. *display* can be used with 'vnc' to specify
+ which display to set the password on. *action-if-connected* specifies
+ what should happen in case a connection is established: *fail* makes
+ the password change fail. *disconnect* changes the password and
+ disconnects the client. *keep* changes the password and keeps the
+ connection up. *keep* is the default.
ERST
{
.name = "expire_password",
- .args_type = "protocol:s,time:s",
- .params = "protocol time",
+ .args_type = "protocol:s,time:s,display:-dS",
+ .params = "protocol time [-d display]",
.help = "set spice/vnc password expire-time",
.cmd = hmp_expire_password,
},
SRST
-``expire_password [ vnc | spice ]`` *expire-time*
- Specify when a password for spice/vnc becomes
- invalid. *expire-time* accepts:
+``expire_password [ vnc | spice ] expire-time [ -d display ]``
+ Specify when a password for spice/vnc becomes invalid.
+ *display* behaves the same as in ``set_password``.
+ *expire-time* accepts:
``now``
Invalidate password instantly.
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index a7e197a90b..f4ef58d257 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1451,10 +1451,41 @@ void hmp_set_password(Monitor *mon, const QDict *qdict)
{
const char *protocol = qdict_get_str(qdict, "protocol");
const char *password = qdict_get_str(qdict, "password");
+ const char *display = qdict_get_try_str(qdict, "display");
const char *connected = qdict_get_try_str(qdict, "connected");
Error *err = NULL;
+ DisplayProtocol proto;
- qmp_set_password(protocol, password, !!connected, connected, &err);
+ SetPasswordOptions opts = {
+ .password = g_strdup(password),
+ .u.vnc.display = NULL,
+ };
+
+ proto = qapi_enum_parse(&DisplayProtocol_lookup, protocol,
+ DISPLAY_PROTOCOL_VNC, &err);
+ if (err) {
+ hmp_handle_error(mon, err);
+ return;
+ }
+ opts.protocol = proto;
+
+ if (proto == DISPLAY_PROTOCOL_VNC) {
+ opts.u.vnc.has_display = !!display;
+ opts.u.vnc.display = g_strdup(display);
+ } else if (proto == DISPLAY_PROTOCOL_SPICE) {
+ opts.u.spice.has_connected = !!connected;
+ opts.u.spice.connected =
+ qapi_enum_parse(&SetPasswordAction_lookup, connected,
+ SET_PASSWORD_ACTION_KEEP, &err);
+ if (err) {
+ hmp_handle_error(mon, err);
+ return;
+ }
+ }
+
+ qmp_set_password(&opts, &err);
+ g_free(opts.password);
+ g_free(opts.u.vnc.display);
hmp_handle_error(mon, err);
}
@@ -1462,9 +1493,31 @@ void hmp_expire_password(Monitor *mon, const QDict *qdict)
{
const char *protocol = qdict_get_str(qdict, "protocol");
const char *whenstr = qdict_get_str(qdict, "time");
+ const char *display = qdict_get_try_str(qdict, "display");
Error *err = NULL;
+ DisplayProtocol proto;
- qmp_expire_password(protocol, whenstr, &err);
+ ExpirePasswordOptions opts = {
+ .time = g_strdup(whenstr),
+ .u.vnc.display = NULL,
+ };
+
+ proto = qapi_enum_parse(&DisplayProtocol_lookup, protocol,
+ DISPLAY_PROTOCOL_VNC, &err);
+ if (err) {
+ hmp_handle_error(mon, err);
+ return;
+ }
+ opts.protocol = proto;
+
+ if (proto == DISPLAY_PROTOCOL_VNC) {
+ opts.u.vnc.has_display = !!display;
+ opts.u.vnc.display = g_strdup(display);
+ }
+
+ qmp_expire_password(&opts, &err);
+ g_free(opts.time);
+ g_free(opts.u.vnc.display);
hmp_handle_error(mon, err);
}
diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index f7d64a6457..65882b5997 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -164,45 +164,30 @@ void qmp_system_wakeup(Error **errp)
qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, errp);
}
-void qmp_set_password(const char *protocol, const char *password,
- bool has_connected, const char *connected, Error **errp)
+void qmp_set_password(SetPasswordOptions *opts, Error **errp)
{
- int disconnect_if_connected = 0;
- int fail_if_connected = 0;
- int rc;
+ bool disconnect_if_connected = false;
+ bool fail_if_connected = false;
+ int rc = 0;
- if (has_connected) {
- if (strcmp(connected, "fail") == 0) {
- fail_if_connected = 1;
- } else if (strcmp(connected, "disconnect") == 0) {
- disconnect_if_connected = 1;
- } else if (strcmp(connected, "keep") == 0) {
- /* nothing */
- } else {
- error_setg(errp, QERR_INVALID_PARAMETER, "connected");
- return;
- }
- }
-
- if (strcmp(protocol, "spice") == 0) {
+ if (opts->protocol == DISPLAY_PROTOCOL_SPICE) {
if (!qemu_using_spice(errp)) {
return;
}
- rc = qemu_spice.set_passwd(password, fail_if_connected,
+ if (opts->u.spice.has_connected) {
+ fail_if_connected =
+ opts->u.spice.connected == SET_PASSWORD_ACTION_FAIL;
+ disconnect_if_connected =
+ opts->u.spice.connected == SET_PASSWORD_ACTION_DISCONNECT;
+ }
+ rc = qemu_spice.set_passwd(opts->password, fail_if_connected,
disconnect_if_connected);
- } else if (strcmp(protocol, "vnc") == 0) {
- if (fail_if_connected || disconnect_if_connected) {
- /* vnc supports "connected=keep" only */
- error_setg(errp, QERR_INVALID_PARAMETER, "connected");
- return;
- }
+ } else if (opts->protocol == DISPLAY_PROTOCOL_VNC) {
/* Note that setting an empty password will not disable login through
* this interface. */
- rc = vnc_display_password(NULL, password);
- } else {
- error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "protocol",
- "'vnc' or 'spice'");
- return;
+ rc = vnc_display_password(
+ opts->u.vnc.has_display ? opts->u.vnc.display : NULL,
+ opts->password);
}
if (rc != 0) {
@@ -210,11 +195,11 @@ void qmp_set_password(const char *protocol, const char *password,
}
}
-void qmp_expire_password(const char *protocol, const char *whenstr,
- Error **errp)
+void qmp_expire_password(ExpirePasswordOptions *opts, Error **errp)
{
time_t when;
int rc;
+ const char* whenstr = opts->time;
if (strcmp(whenstr, "now") == 0) {
when = 0;
@@ -226,17 +211,14 @@ void qmp_expire_password(const char *protocol, const char *whenstr,
when = strtoull(whenstr, NULL, 10);
}
- if (strcmp(protocol, "spice") == 0) {
+ if (opts->protocol == DISPLAY_PROTOCOL_SPICE) {
if (!qemu_using_spice(errp)) {
return;
}
rc = qemu_spice.set_pw_expire(when);
- } else if (strcmp(protocol, "vnc") == 0) {
- rc = vnc_display_pw_expire(NULL, when);
- } else {
- error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "protocol",
- "'vnc' or 'spice'");
- return;
+ } else if (opts->protocol == DISPLAY_PROTOCOL_VNC) {
+ rc = vnc_display_pw_expire(
+ opts->u.vnc.has_display ? opts->u.vnc.display : NULL, when);
}
if (rc != 0) {
diff --git a/qapi/ui.json b/qapi/ui.json
index fd9677d48e..cba8665b73 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -9,22 +9,23 @@
{ 'include': 'common.json' }
{ 'include': 'sockets.json' }
+##
+# @DisplayProtocol:
+#
+# Display protocols which support changing password options.
+#
+# Since: 6.2
+#
+##
+{ 'enum': 'DisplayProtocol',
+ 'data': [ { 'name': 'vnc', 'if': 'defined(CONFIG_VNC)' },
+ { 'name': 'spice', 'if': 'defined(CONFIG_SPICE)' } ] }
+
##
# @set_password:
#
# Sets the password of a remote display session.
#
-# @protocol: - 'vnc' to modify the VNC server password
-# - 'spice' to modify the Spice server password
-#
-# @password: the new password
-#
-# @connected: how to handle existing clients when changing the
-# password. If nothing is specified, defaults to 'keep'
-# 'fail' to fail the command if clients are connected
-# 'disconnect' to disconnect existing clients
-# 'keep' to maintain existing clients
-#
# Returns: - Nothing on success
# - If Spice is not enabled, DeviceNotFound
#
@@ -37,16 +38,123 @@
# <- { "return": {} }
#
##
-{ 'command': 'set_password',
- 'data': {'protocol': 'str', 'password': 'str', '*connected': 'str'} }
+{ 'command': 'set_password', 'boxed': true, 'data': 'SetPasswordOptions' }
+
+##
+# @SetPasswordOptions:
+#
+# Data required to set a new password on a display server protocol.
+#
+# @protocol: - 'vnc' to modify the VNC server password
+# - 'spice' to modify the Spice server password
+#
+# @password: the new password
+#
+# Since: 6.2
+#
+##
+{ 'union': 'SetPasswordOptions',
+ 'base': { 'protocol': 'DisplayProtocol',
+ 'password': 'str' },
+ 'discriminator': 'protocol',
+ 'data': { 'vnc': 'SetPasswordOptionsVnc',
+ 'spice': 'SetPasswordOptionsSpice' } }
+
+##
+# @SetPasswordAction:
+#
+# An action to take on changing a password on a connection with active clients.
+#
+# @fail: fail the command if clients are connected
+#
+# @disconnect: disconnect existing clients
+#
+# @keep: maintain existing clients
+#
+# Since: 6.2
+#
+##
+{ 'enum': 'SetPasswordAction',
+ 'data': [ 'fail', 'disconnect', 'keep' ] }
+
+##
+# @SetPasswordActionVnc:
+#
+# See @SetPasswordAction. VNC only supports the keep action. 'connection'
+# should just be omitted for VNC, this is kept for backwards compatibility.
+#
+# @keep: maintain existing clients
+#
+# Since: 6.2
+#
+##
+{ 'enum': 'SetPasswordActionVnc',
+ 'data': [ 'keep' ] }
+
+##
+# @SetPasswordOptionsSpice:
+#
+# Options for set_password specific to the VNC procotol.
+#
+# @connected: How to handle existing clients when changing the
+# password. If nothing is specified, defaults to 'keep'.
+#
+# Since: 6.2
+#
+##
+{ 'struct': 'SetPasswordOptionsSpice',
+ 'data': { '*connected': 'SetPasswordAction' } }
+
+##
+# @SetPasswordOptionsVnc:
+#
+# Options for set_password specific to the VNC procotol.
+#
+# @display: The id of the display where the password should be changed.
+# Defaults to the first.
+#
+# @connected: How to handle existing clients when changing the
+# password.
+#
+# Features:
+# @deprecated: For VNC, @connected will always be 'keep', parameter should be
+# omitted.
+#
+# Since: 6.2
+#
+##
+{ 'struct': 'SetPasswordOptionsVnc',
+ 'data': { '*display': 'str',
+ '*connected': { 'type': 'SetPasswordActionVnc',
+ 'features': ['deprecated'] } } }
##
# @expire_password:
#
# Expire the password of a remote display server.
#
-# @protocol: the name of the remote display protocol 'vnc' or 'spice'
+# Returns: - Nothing on success
+# - If @protocol is 'spice' and Spice is not active, DeviceNotFound
#
+# Since: 0.14
+#
+# Example:
+#
+# -> { "execute": "expire_password", "arguments": { "protocol": "vnc",
+# "time": "+60" } }
+# <- { "return": {} }
+#
+##
+{ 'command': 'expire_password', 'boxed': true, 'data': 'ExpirePasswordOptions' }
+
+##
+# @ExpirePasswordOptions:
+#
+# Data required to set password expiration on a display server protocol.
+#
+# @protocol: - 'vnc' to modify the VNC server expiration
+# - 'spice' to modify the Spice server expiration
+
# @time: when to expire the password.
#
# - 'now' to expire the password immediately
@@ -54,24 +162,33 @@
# - '+INT' where INT is the number of seconds from now (integer)
# - 'INT' where INT is the absolute time in seconds
#
-# Returns: - Nothing on success
-# - If @protocol is 'spice' and Spice is not active, DeviceNotFound
-#
-# Since: 0.14
-#
# Notes: Time is relative to the server and currently there is no way to
# coordinate server time with client time. It is not recommended to
# use the absolute time version of the @time parameter unless you're
# sure you are on the same machine as the QEMU instance.
#
-# Example:
+# Since: 6.2
#
-# -> { "execute": "expire_password", "arguments": { "protocol": "vnc",
-# "time": "+60" } }
-# <- { "return": {} }
+##
+{ 'union': 'ExpirePasswordOptions',
+ 'base': { 'protocol': 'DisplayProtocol',
+ 'time': 'str' },
+ 'discriminator': 'protocol',
+ 'data': { 'vnc': 'ExpirePasswordOptionsVnc' } }
+
+##
+# @ExpirePasswordOptionsVnc:
+#
+# Options for expire_password specific to the VNC procotol.
+#
+# @display: The id of the display where the expiration should be changed.
+# Defaults to the first.
+#
+# Since: 6.2
#
##
-{ 'command': 'expire_password', 'data': {'protocol': 'str', 'time': 'str'} }
+{ 'struct': 'ExpirePasswordOptionsVnc',
+ 'data': { '*display': 'str' } }
##
# @screendump:

View File

@@ -0,0 +1,44 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Chenyi Qiang <chenyi.qiang@intel.com>
Date: Fri, 16 Dec 2022 14:22:31 +0800
Subject: [PATCH] virtio-mem: Fix the bitmap index of the section offset
vmem->bitmap indexes the memory region of the virtio-mem backend at a
granularity of block_size. To calculate the index of target section offset,
the block_size should be divided instead of the bitmap_size.
Fixes: 2044969f0b ("virtio-mem: Implement RamDiscardManager interface")
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20221216062231.11181-1-chenyi.qiang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: David Hildenbrand <david@redhat.com>
(cherry-picked from commit b11cf32e07a2f7ff0d171b89497381a04c9d07e0)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/virtio/virtio-mem.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index ed170def48..e19ee817fe 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -235,7 +235,7 @@ static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem,
uint64_t offset, size;
int ret = 0;
- first_bit = s->offset_within_region / vmem->bitmap_size;
+ first_bit = s->offset_within_region / vmem->block_size;
first_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size, first_bit);
while (first_bit < vmem->bitmap_size) {
MemoryRegionSection tmp = *s;
@@ -267,7 +267,7 @@ static int virtio_mem_for_each_unplugged_section(const VirtIOMEM *vmem,
uint64_t offset, size;
int ret = 0;
- first_bit = s->offset_within_region / vmem->bitmap_size;
+ first_bit = s->offset_within_region / vmem->block_size;
first_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size, first_bit);
while (first_bit < vmem->bitmap_size) {
MemoryRegionSection tmp = *s;

View File

@@ -1,83 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Fri, 10 Sep 2021 14:45:33 +0200
Subject: [PATCH] block/mirror: fix NULL pointer dereference in
mirror_wait_on_conflicts()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
In mirror_iteration() we call mirror_wait_on_conflicts() with
`self` parameter set to NULL.
Starting from commit d44dae1a7c we dereference `self` pointer in
mirror_wait_on_conflicts() without checks if it is not NULL.
Backtrace:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 mirror_wait_on_conflicts (self=0x0, s=<optimized out>, offset=<optimized out>, bytes=<optimized out>)
at ../block/mirror.c:172
172 self->waiting_for_op = op;
[Current thread is 1 (Thread 0x7f0908931ec0 (LWP 380249))]
(gdb) bt
#0 mirror_wait_on_conflicts (self=0x0, s=<optimized out>, offset=<optimized out>, bytes=<optimized out>)
at ../block/mirror.c:172
#1 0x00005610c5d9d631 in mirror_run (job=0x5610c76a2c00, errp=<optimized out>) at ../block/mirror.c:491
#2 0x00005610c5d58726 in job_co_entry (opaque=0x5610c76a2c00) at ../job.c:917
#3 0x00005610c5f046c6 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>)
at ../util/coroutine-ucontext.c:173
#4 0x00007f0909975820 in ?? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
from /usr/lib64/libc.so.6
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2001404
Fixes: d44dae1a7c ("block/mirror: fix active mirror dead-lock in mirror_wait_on_conflicts")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210910124533.288318-1-sgarzare@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
(cherry picked from commit 66fed30c9cd11854fc878a4eceb507e915d7c9cd)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/mirror.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 98fc66eabf..85b781bc21 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -160,18 +160,25 @@ static void coroutine_fn mirror_wait_on_conflicts(MirrorOp *self,
if (ranges_overlap(self_start_chunk, self_nb_chunks,
op_start_chunk, op_nb_chunks))
{
- /*
- * If the operation is already (indirectly) waiting for us, or
- * will wait for us as soon as it wakes up, then just go on
- * (instead of producing a deadlock in the former case).
- */
- if (op->waiting_for_op) {
- continue;
+ if (self) {
+ /*
+ * If the operation is already (indirectly) waiting for us,
+ * or will wait for us as soon as it wakes up, then just go
+ * on (instead of producing a deadlock in the former case).
+ */
+ if (op->waiting_for_op) {
+ continue;
+ }
+
+ self->waiting_for_op = op;
}
- self->waiting_for_op = op;
qemu_co_queue_wait(&op->waiting_requests, NULL);
- self->waiting_for_op = NULL;
+
+ if (self) {
+ self->waiting_for_op = NULL;
+ }
+
break;
}
}

View File

@@ -0,0 +1,36 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Chenyi Qiang <chenyi.qiang@intel.com>
Date: Wed, 28 Dec 2022 17:03:12 +0800
Subject: [PATCH] virtio-mem: Fix the iterator variable in a vmem->rdl_list
loop
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It should be the variable rdl2 to revert the already-notified listeners.
Fixes: 2044969f0b ("virtio-mem: Implement RamDiscardManager interface")
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20221228090312.17276-1-chenyi.qiang@intel.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
(cherry-picked from commit 29f1b328e3b767cba2661920a8470738469b9e36)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/virtio/virtio-mem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index e19ee817fe..56db586c89 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -341,7 +341,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
if (ret) {
/* Notify all already-notified listeners. */
QLIST_FOREACH(rdl2, &vmem->rdl_list, next) {
- MemoryRegionSection tmp = *rdl->section;
+ MemoryRegionSection tmp = *rdl2->section;
if (rdl2 == rdl) {
break;

View File

@@ -0,0 +1,141 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Fri, 16 Dec 2022 11:35:52 +0800
Subject: [PATCH] vhost: fix vq dirty bitmap syncing when vIOMMU is enabled
When vIOMMU is enabled, the vq->used_phys is actually the IOVA not
GPA. So we need to translate it to GPA before the syncing otherwise we
may hit the following crash since IOVA could be out of the scope of
the GPA log size. This could be noted when using virtio-IOMMU with
vhost using 1G memory.
Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
Cc: qemu-stable@nongnu.org
Tested-by: Lei Yang <leiyang@redhat.com>
Reported-by: Yalan Zhang <yalzhang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221216033552.77087-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit 345cc1cbcbce2bab00abc2b88338d7d89c702d6b)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/virtio/vhost.c | 84 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 64 insertions(+), 20 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 7fb008bc9e..fdcd1a8fdf 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -20,6 +20,7 @@
#include "qemu/range.h"
#include "qemu/error-report.h"
#include "qemu/memfd.h"
+#include "qemu/log.h"
#include "standard-headers/linux/vhost_types.h"
#include "hw/virtio/virtio-bus.h"
#include "hw/virtio/virtio-access.h"
@@ -106,6 +107,24 @@ static void vhost_dev_sync_region(struct vhost_dev *dev,
}
}
+static bool vhost_dev_has_iommu(struct vhost_dev *dev)
+{
+ VirtIODevice *vdev = dev->vdev;
+
+ /*
+ * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
+ * incremental memory mapping API via IOTLB API. For platform that
+ * does not have IOMMU, there's no need to enable this feature
+ * which may cause unnecessary IOTLB miss/update transactions.
+ */
+ if (vdev) {
+ return virtio_bus_device_iommu_enabled(vdev) &&
+ virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
+ } else {
+ return false;
+ }
+}
+
static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
MemoryRegionSection *section,
hwaddr first,
@@ -137,8 +156,51 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
continue;
}
- vhost_dev_sync_region(dev, section, start_addr, end_addr, vq->used_phys,
- range_get_last(vq->used_phys, vq->used_size));
+ if (vhost_dev_has_iommu(dev)) {
+ IOMMUTLBEntry iotlb;
+ hwaddr used_phys = vq->used_phys, used_size = vq->used_size;
+ hwaddr phys, s, offset;
+
+ while (used_size) {
+ rcu_read_lock();
+ iotlb = address_space_get_iotlb_entry(dev->vdev->dma_as,
+ used_phys,
+ true,
+ MEMTXATTRS_UNSPECIFIED);
+ rcu_read_unlock();
+
+ if (!iotlb.target_as) {
+ qemu_log_mask(LOG_GUEST_ERROR, "translation "
+ "failure for used_iova %"PRIx64"\n",
+ used_phys);
+ return -EINVAL;
+ }
+
+ offset = used_phys & iotlb.addr_mask;
+ phys = iotlb.translated_addr + offset;
+
+ /*
+ * Distance from start of used ring until last byte of
+ * IOMMU page.
+ */
+ s = iotlb.addr_mask - offset;
+ /*
+ * Size of used ring, or of the part of it until end
+ * of IOMMU page. To avoid zero result, do the adding
+ * outside of MIN().
+ */
+ s = MIN(s, used_size - 1) + 1;
+
+ vhost_dev_sync_region(dev, section, start_addr, end_addr, phys,
+ range_get_last(phys, s));
+ used_size -= s;
+ used_phys += s;
+ }
+ } else {
+ vhost_dev_sync_region(dev, section, start_addr,
+ end_addr, vq->used_phys,
+ range_get_last(vq->used_phys, vq->used_size));
+ }
}
return 0;
}
@@ -306,24 +368,6 @@ static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
dev->log_size = size;
}
-static bool vhost_dev_has_iommu(struct vhost_dev *dev)
-{
- VirtIODevice *vdev = dev->vdev;
-
- /*
- * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
- * incremental memory mapping API via IOTLB API. For platform that
- * does not have IOMMU, there's no need to enable this feature
- * which may cause unnecessary IOTLB miss/update transactions.
- */
- if (vdev) {
- return virtio_bus_device_iommu_enabled(vdev) &&
- virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
- } else {
- return false;
- }
-}
-
static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
hwaddr *plen, bool is_write)
{

View File

@@ -0,0 +1,42 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Date: Mon, 9 Jan 2023 10:58:09 +0000
Subject: [PATCH] virtio-rng-pci: fix migration compat for vectors
Fixup the migration compatibility for existing machine types
so that they do not enable msi-x.
Symptom:
(qemu) qemu: get_pci_config_device: Bad config data: i=0x34 read: 84 device: 98 cmask: ff wmask: 0 w1cmask:0
qemu: Failed to load PCIDevice:config
qemu: Failed to load virtio-rng:virtio
qemu: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-rng'
qemu: load of migration failed: Invalid argument
Note: This fix will break migration from 7.2->7.2-fixed with this patch
bz: https://bugzilla.redhat.com/show_bug.cgi?id=2155749
Fixes: 9ea02e8f1 ("virtio-rng-pci: Allow setting nvectors, so we can use MSI-X")
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: David Daney <david.daney@fungible.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(picked-up from https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg01319.html)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/core/machine.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 8d34caa31d..77a0a131d1 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -42,6 +42,7 @@
GlobalProperty hw_compat_7_1[] = {
{ "virtio-device", "queue_reset", "false" },
+ { "virtio-rng-pci", "vectors", "0" },
};
const size_t hw_compat_7_1_len = G_N_ELEMENTS(hw_compat_7_1);

View File

@@ -0,0 +1,36 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Thu, 26 Jan 2023 15:13:58 -0500
Subject: [PATCH] block: fix detect-zeroes= with BDRV_REQ_REGISTERED_BUF
When a write request is converted into a write zeroes request by the
detect-zeroes= feature, it is no longer associated with an I/O buffer.
The BDRV_REQ_REGISTERED_BUF flag doesn't make sense without an I/O
buffer and must be cleared because bdrv_co_do_pwrite_zeroes() fails with
-EINVAL when it's set.
Fiona Ebner <f.ebner@proxmox.com> bisected and diagnosed this QEMU 7.2
regression where writes containing zeroes to a blockdev with
discard=unmap,detect-zeroes=unmap fail.
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1404
Fixes: e8b6535533be ("block: add BDRV_REQ_REGISTERED_BUF request flag")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/io.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/block/io.c b/block/io.c
index b9424024f9..bbaa0d1b2d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2087,6 +2087,9 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
if (bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP) {
flags |= BDRV_REQ_MAY_UNMAP;
}
+
+ /* Can't use optimization hint with bufferless zero write */
+ flags &= ~BDRV_REQ_REGISTERED_BUF;
}
if (ret < 0) {

View File

@@ -0,0 +1,118 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Bulekov <alxndr@bu.edu>
Date: Sat, 4 Feb 2023 23:07:34 -0500
Subject: [PATCH] memory: prevent dma-reentracy issues
Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
This flag is set/checked prior to calling a device's MemoryRegion
handlers, and set when device code initiates DMA. The purpose of this
flag is to prevent two types of DMA-based reentrancy issues:
1.) mmio -> dma -> mmio case
2.) bh -> dma write -> mmio case
These issues have led to problems such as stack-exhaustion and
use-after-frees.
Summary of the problem from Peter Maydell:
https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Acked-by: Peter Xu <peterx@redhat.com>
(picked-up from https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg01142.html)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
include/hw/qdev-core.h | 7 +++++++
softmmu/memory.c | 17 +++++++++++++++++
softmmu/trace-events | 1 +
3 files changed, 25 insertions(+)
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 785dd5a56e..886f6bb79e 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -162,6 +162,10 @@ struct NamedClockList {
QLIST_ENTRY(NamedClockList) node;
};
+typedef struct {
+ bool engaged_in_io;
+} MemReentrancyGuard;
+
/**
* DeviceState:
* @realized: Indicates whether the device has been fully constructed.
@@ -194,6 +198,9 @@ struct DeviceState {
int alias_required_for_version;
ResettableState reset;
GSList *unplug_blockers;
+
+ /* Is the device currently in mmio/pio/dma? Used to prevent re-entrancy */
+ MemReentrancyGuard mem_reentrancy_guard;
};
struct DeviceListener {
diff --git a/softmmu/memory.c b/softmmu/memory.c
index bc0be3f62c..7dcb3347aa 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -533,6 +533,7 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
uint64_t access_mask;
unsigned access_size;
unsigned i;
+ DeviceState *dev = NULL;
MemTxResult r = MEMTX_OK;
if (!access_size_min) {
@@ -542,6 +543,19 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
access_size_max = 4;
}
+ /* Do not allow more than one simultanous access to a device's IO Regions */
+ if (mr->owner &&
+ !mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) {
+ dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
+ if (dev) {
+ if (dev->mem_reentrancy_guard.engaged_in_io) {
+ trace_memory_region_reentrant_io(get_cpu_index(), mr, addr, size);
+ return MEMTX_ERROR;
+ }
+ dev->mem_reentrancy_guard.engaged_in_io = true;
+ }
+ }
+
/* FIXME: support unaligned access? */
access_size = MAX(MIN(size, access_size_max), access_size_min);
access_mask = MAKE_64BIT_MASK(0, access_size * 8);
@@ -556,6 +570,9 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
access_mask, attrs);
}
}
+ if (dev) {
+ dev->mem_reentrancy_guard.engaged_in_io = false;
+ }
return r;
}
diff --git a/softmmu/trace-events b/softmmu/trace-events
index 22606dc27b..62d04ea9a7 100644
--- a/softmmu/trace-events
+++ b/softmmu/trace-events
@@ -13,6 +13,7 @@ memory_region_ops_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, u
memory_region_ops_write(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size, const char *name) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u name '%s'"
memory_region_subpage_read(int cpu_index, void *mr, uint64_t offset, uint64_t value, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" value 0x%"PRIx64" size %u"
memory_region_subpage_write(int cpu_index, void *mr, uint64_t offset, uint64_t value, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" value 0x%"PRIx64" size %u"
+memory_region_reentrant_io(int cpu_index, void *mr, uint64_t offset, unsigned size) "cpu %d mr %p offset 0x%"PRIx64" size %u"
memory_region_ram_device_read(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u"
memory_region_ram_device_write(int cpu_index, void *mr, uint64_t addr, uint64_t value, unsigned size) "cpu %d mr %p addr 0x%"PRIx64" value 0x%"PRIx64" size %u"
memory_region_sync_dirty(const char *mr, const char *listener, int global) "mr '%s' listener '%s' synced (global=%d)"

View File

@@ -0,0 +1,32 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Tue, 10 Jan 2023 17:36:33 +0100
Subject: [PATCH] block/iscsi: fix double-free on BUSY or similar statuses
Commit 8c460269aa77 ("iscsi: base all handling of check condition on
scsi_sense_to_errno", 2019-07-15) removed a "goto out" so that the
same coroutine is re-entered twice; once from iscsi_co_generic_cb,
once from the timer callback iscsi_retry_timer_expired. This can
cause a crash.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1378
Reported-by: Grzegorz Zdanowski <https://gitlab.com/kiler129>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry-picked from commit 5080152e2ef6cde7aa692e29880c62bd54acb750)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/iscsi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/iscsi.c b/block/iscsi.c
index 3ed4a50c0d..89cd032c3a 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -268,6 +268,7 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
timer_mod(&iTask->retry_timer,
qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + retry_time);
iTask->do_retry = 1;
+ return;
} else if (status == SCSI_STATUS_CHECK_CONDITION) {
int error = iscsi_translate_sense(&task->sense);
if (error == EAGAIN) {

View File

@@ -0,0 +1,69 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Guenter Roeck <linux@roeck-us.net>
Date: Tue, 28 Feb 2023 09:11:29 -0800
Subject: [PATCH] scsi: megasas: Internal cdbs have 16-byte length
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Host drivers do not necessarily set cdb_len in megasas io commands.
With commits 6d1511cea0 ("scsi: Reject commands if the CDB length
exceeds buf_len") and fe9d8927e2 ("scsi: Add buf_len parameter to
scsi_req_new()"), this results in failures to boot Linux from affected
SCSI drives because cdb_len is set to 0 by the host driver.
Set the cdb length to its actual size to solve the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(picked-up from https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg08653.html)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/scsi/megasas.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 9cbbb16121..d624866bb6 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -1780,7 +1780,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
uint8_t cdb[16];
int len;
struct SCSIDevice *sdev = NULL;
- int target_id, lun_id, cdb_len;
+ int target_id, lun_id;
lba_count = le32_to_cpu(cmd->frame->io.header.data_len);
lba_start_lo = le32_to_cpu(cmd->frame->io.lba_lo);
@@ -1789,7 +1789,6 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
target_id = cmd->frame->header.target_id;
lun_id = cmd->frame->header.lun_id;
- cdb_len = cmd->frame->header.cdb_len;
if (target_id < MFI_MAX_LD && lun_id == 0) {
sdev = scsi_device_find(&s->bus, 0, target_id, lun_id);
@@ -1804,15 +1803,6 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
return MFI_STAT_DEVICE_NOT_FOUND;
}
- if (cdb_len > 16) {
- trace_megasas_scsi_invalid_cdb_len(
- mfi_frame_desc(frame_cmd), 1, target_id, lun_id, cdb_len);
- megasas_write_sense(cmd, SENSE_CODE(INVALID_OPCODE));
- cmd->frame->header.scsi_status = CHECK_CONDITION;
- s->event_count++;
- return MFI_STAT_SCSI_DONE_WITH_ERROR;
- }
-
cmd->iov_size = lba_count * sdev->blocksize;
if (megasas_map_sgl(s, cmd, &cmd->frame->io.sgl)) {
megasas_write_sense(cmd, SENSE_CODE(TARGET_FAILURE));
@@ -1823,7 +1813,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
megasas_encode_lba(cdb, lba_start, lba_count, is_write);
cmd->req = scsi_req_new(sdev, cmd->index,
- lun_id, cdb, cdb_len, cmd);
+ lun_id, cdb, sizeof(cdb), cmd);
if (!cmd->req) {
trace_megasas_scsi_req_alloc_failed(
mfi_frame_desc(frame_cmd), target_id, lun_id);

View File

@@ -0,0 +1,100 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Tue, 7 Mar 2023 15:03:02 +0100
Subject: [PATCH] ide: avoid potential deadlock when draining during trim
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The deadlock can happen as follows:
1. ide_issue_trim is called, and increments the in_flight counter.
2. ide_issue_trim_cb calls blk_aio_pdiscard.
3. Somebody else starts draining (e.g. backup to insert the cbw node).
4. ide_issue_trim_cb is called as the completion callback for
blk_aio_pdiscard.
5. ide_issue_trim_cb issues yet another blk_aio_pdiscard request.
6. The request is added to the wait queue via blk_wait_while_drained,
because draining has been started.
7. Nobody ever decrements the in_flight counter and draining can't
finish. This would be done by ide_trim_bh_cb, which is called after
ide_issue_trim_cb has issued its last request, but
ide_issue_trim_cb is not called anymore, because it's the
completion callback of blk_aio_pdiscard, which waits on draining.
Quoting Hanna Czenczek:
> The point of 7e5cdb345f was that we need any in-flight count to
> accompany a set s->bus->dma->aiocb. While blk_aio_pdiscard() is
> happening, we dont necessarily need another count. But we do need
> it while there is no blk_aio_pdiscard().
> ide_issue_trim_cb() returns in two cases (and, recursively through
> its callers, leaves s->bus->dma->aiocb set):
> 1. After calling blk_aio_pdiscard(), which will keep an in-flight
> count,
> 2. After calling replay_bh_schedule_event() (i.e.
> qemu_bh_schedule()), which does not keep an in-flight count.
Thus, even after moving the blk_inc_in_flight to above the
replay_bh_schedule_event call, the invariant "ide_issue_trim_cb
returns with an accompanying in-flight count" is still satisfied.
However, the issue 7e5cdb345f fixed for canceling resurfaces, because
ide_cancel_dma_sync assumes that it just needs to drain once. But now
the in_flight count is not consistently > 0 during the trim operation.
So, change it to drain until !s->bus->dma->aiocb, which means that the
operation finished (s->bus->dma->aiocb is cleared by ide_set_inactive
via the ide_dma_cb when the end of the transfer is reached).
Discussion here:
https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg02506.html
Fixes: 7e5cdb345f ("ide: Increment BB in-flight counter for TRIM BH")
Suggested-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/ide/core.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 39afdc0006..b67c1885a8 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -443,7 +443,7 @@ static void ide_trim_bh_cb(void *opaque)
iocb->bh = NULL;
qemu_aio_unref(iocb);
- /* Paired with an increment in ide_issue_trim() */
+ /* Paired with an increment in ide_issue_trim_cb() */
blk_dec_in_flight(blk);
}
@@ -503,6 +503,8 @@ static void ide_issue_trim_cb(void *opaque, int ret)
done:
iocb->aiocb = NULL;
if (iocb->bh) {
+ /* Paired with a decrement in ide_trim_bh_cb() */
+ blk_inc_in_flight(s->blk);
replay_bh_schedule_event(iocb->bh);
}
}
@@ -514,9 +516,6 @@ BlockAIOCB *ide_issue_trim(
IDEState *s = opaque;
TrimAIOCB *iocb;
- /* Paired with a decrement in ide_trim_bh_cb() */
- blk_inc_in_flight(s->blk);
-
iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
iocb->s = s;
iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb);
@@ -739,8 +738,9 @@ void ide_cancel_dma_sync(IDEState *s)
*/
if (s->bus->dma->aiocb) {
trace_ide_cancel_dma_sync_remaining();
- blk_drain(s->blk);
- assert(s->bus->dma->aiocb == NULL);
+ while (s->bus->dma->aiocb) {
+ blk_drain(s->blk);
+ }
}
}

View File

@@ -0,0 +1,67 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Klaus Jensen <k.jensen@samsung.com>
Date: Wed, 8 Mar 2023 19:57:12 +0300
Subject: [PATCH] hw/nvme: fix missing endian conversions for doorbell buffers
The eventidx and doorbell value are not handling endianness correctly.
Fix this.
Fixes: 3f7fe8de3d49 ("hw/nvme: Implement shadow doorbell buffer support")
Cc: qemu-stable@nongnu.org
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 2fda0726e5149e032acfa5fe442db56cd6433c4c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Conflicts: hw/nvme/ctrl.c
(picked up from qemu-stable mailing list)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/nvme/ctrl.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index e54276dc1d..98d8e34109 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1333,8 +1333,12 @@ static inline void nvme_blk_write(BlockBackend *blk, int64_t offset,
static void nvme_update_cq_head(NvmeCQueue *cq)
{
- pci_dma_read(&cq->ctrl->parent_obj, cq->db_addr, &cq->head,
- sizeof(cq->head));
+ uint32_t v;
+
+ pci_dma_read(&cq->ctrl->parent_obj, cq->db_addr, &v, sizeof(v));
+
+ cq->head = le32_to_cpu(v);
+
trace_pci_nvme_shadow_doorbell_cq(cq->cqid, cq->head);
}
@@ -6141,15 +6145,21 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req)
static void nvme_update_sq_eventidx(const NvmeSQueue *sq)
{
- pci_dma_write(&sq->ctrl->parent_obj, sq->ei_addr, &sq->tail,
- sizeof(sq->tail));
+ uint32_t v = cpu_to_le32(sq->tail);
+
+ pci_dma_write(&sq->ctrl->parent_obj, sq->ei_addr, &v, sizeof(v));
+
trace_pci_nvme_eventidx_sq(sq->sqid, sq->tail);
}
static void nvme_update_sq_tail(NvmeSQueue *sq)
{
- pci_dma_read(&sq->ctrl->parent_obj, sq->db_addr, &sq->tail,
- sizeof(sq->tail));
+ uint32_t v;
+
+ pci_dma_read(&sq->ctrl->parent_obj, sq->db_addr, &v, sizeof(v));
+
+ sq->tail = le32_to_cpu(v);
+
trace_pci_nvme_shadow_doorbell_sq(sq->sqid, sq->tail);
}

View File

@@ -0,0 +1,50 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Julia Suvorova <jusual@redhat.com>
Date: Thu, 23 Feb 2023 13:57:47 +0100
Subject: [PATCH] hw/smbios: fix field corruption in type 4 table
Since table type 4 of SMBIOS version 2.6 is shorter than 3.0, the
strings which follow immediately after the struct fields have been
overwritten by unconditional filling of later fields such as core_count2.
Make these fields dependent on the SMBIOS version.
Fixes: 05e27d74c7 ("hw/smbios: add core_count2 to smbios table type 4")
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2169904
Signed-off-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20230223125747.254914-1-jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit 60d09b8dc7dd4256d664ad680795cb1327805b2b)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/smbios/smbios.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index b4243de735..66a020999b 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -749,14 +749,16 @@ static void smbios_build_type_4_table(MachineState *ms, unsigned instance)
t->core_count = (ms->smp.cores > 255) ? 0xFF : ms->smp.cores;
t->core_enabled = t->core_count;
- t->core_count2 = t->core_enabled2 = cpu_to_le16(ms->smp.cores);
-
t->thread_count = (ms->smp.threads > 255) ? 0xFF : ms->smp.threads;
- t->thread_count2 = cpu_to_le16(ms->smp.threads);
t->processor_characteristics = cpu_to_le16(0x02); /* Unknown */
t->processor_family2 = cpu_to_le16(0x01); /* Other */
+ if (tbl_len == SMBIOS_TYPE_4_LEN_V30) {
+ t->core_count2 = t->core_enabled2 = cpu_to_le16(ms->smp.cores);
+ t->thread_count2 = cpu_to_le16(ms->smp.threads);
+ }
+
SMBIOS_BUILD_TABLE_POST;
smbios_type4_count++;
}

View File

@@ -0,0 +1,35 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Date: Tue, 7 Feb 2023 17:49:44 +0000
Subject: [PATCH] virtio-rng-pci: fix transitional migration compat for vectors
In bad9c5a516 ("virtio-rng-pci: fix migration compat for vectors") I
fixed the virtio-rng-pci migration compatibility, but it was discovered
that we also need to fix the other aliases of the device for the
transitional cases.
Fixes: 9ea02e8f1 ('virtio-rng-pci: Allow setting nvectors, so we can use MSI-X')
bz: https://bugzilla.redhat.com/show_bug.cgi?id=2162569
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20230207174944.138255-1-dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit 62bdb8871512076841f4464f7e26efdc7783f78d)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/core/machine.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index cd84579591..4297315984 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -43,6 +43,8 @@
GlobalProperty hw_compat_7_1[] = {
{ "virtio-device", "queue_reset", "false" },
{ "virtio-rng-pci", "vectors", "0" },
+ { "virtio-rng-pci-transitional", "vectors", "0" },
+ { "virtio-rng-pci-non-transitional", "vectors", "0" },
};
const size_t hw_compat_7_1_len = G_N_ELEMENTS(hw_compat_7_1);

View File

@@ -0,0 +1,80 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Akihiko Odaki <akihiko.odaki@daynix.com>
Date: Tue, 31 Jan 2023 12:00:37 +0900
Subject: [PATCH] hw/timer/hpet: Fix expiration time overflow
The expiration time provided for timer_mod() can overflow if a
ridiculously large value is set to the comparator register. The
resulting value can represent a past time after rounded, forcing the
timer to fire immediately. If the timer is configured as periodic, it
will rearm the timer again, and form an endless loop.
Check if the expiration value will overflow, and if it will, stop the
timer instead of rearming the timer with the overflowed time.
This bug was found by Alexander Bulekov when fuzzing igb, a new
network device emulation:
https://patchew.org/QEMU/20230129053316.1071513-1-alxndr@bu.edu/
The fixed test case is:
fuzz/crash_2d7036941dcda1ad4380bb8a9174ed0c949bcefd
Fixes: 16b29ae180 ("Add HPET emulation to qemu (Beth Kon)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230131030037.18856-1-akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit 37d2bcbc2a4e9c2e9061bec72a32c7e49b9f81ec)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/timer/hpet.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index 9520471be2..5f88ffdef8 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -352,6 +352,16 @@ static const VMStateDescription vmstate_hpet = {
}
};
+static void hpet_arm(HPETTimer *t, uint64_t ticks)
+{
+ if (ticks < ns_to_ticks(INT64_MAX / 2)) {
+ timer_mod(t->qemu_timer,
+ qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + ticks_to_ns(ticks));
+ } else {
+ timer_del(t->qemu_timer);
+ }
+}
+
/*
* timer expiration callback
*/
@@ -374,13 +384,11 @@ static void hpet_timer(void *opaque)
}
}
diff = hpet_calculate_diff(t, cur_tick);
- timer_mod(t->qemu_timer,
- qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (int64_t)ticks_to_ns(diff));
+ hpet_arm(t, diff);
} else if (t->config & HPET_TN_32BIT && !timer_is_periodic(t)) {
if (t->wrap_flag) {
diff = hpet_calculate_diff(t, cur_tick);
- timer_mod(t->qemu_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
- (int64_t)ticks_to_ns(diff));
+ hpet_arm(t, diff);
t->wrap_flag = 0;
}
}
@@ -407,8 +415,7 @@ static void hpet_set_timer(HPETTimer *t)
t->wrap_flag = 1;
}
}
- timer_mod(t->qemu_timer,
- qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (int64_t)ticks_to_ns(diff));
+ hpet_arm(t, diff);
}
static void hpet_del_timer(HPETTimer *t)

View File

@@ -0,0 +1,71 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <eperezma@redhat.com>
Date: Thu, 9 Feb 2023 18:00:04 +0100
Subject: [PATCH] vdpa: stop all svq on device deletion
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Not stopping them leave the device in a bad state when virtio-net
fronted device is unplugged with device_del monitor command.
This is not triggable in regular poweroff or qemu forces shutdown
because cleanup is called right after vhost_vdpa_dev_start(false). But
devices hot unplug does not call vdpa device cleanups. This lead to all
the vhost_vdpa devices without stop the SVQ but the last.
Fix it and clean the code, making it symmetric with
vhost_vdpa_svqs_start.
Fixes: dff4426fa656 ("vhost: Add Shadow VirtQueue kick forwarding capabilities")
Reported-by: Lei Yang <leiyang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230209170004.899472-1-eperezma@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
(cherry-picked from commit 2e1a9de96b487cf818a22d681cad8d3f5d18dcca)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/virtio/vhost-vdpa.c | 17 ++---------------
1 file changed, 2 insertions(+), 15 deletions(-)
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7468e44b87..03c78d25d8 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -707,26 +707,11 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
return ret;
}
-static void vhost_vdpa_reset_svq(struct vhost_vdpa *v)
-{
- if (!v->shadow_vqs_enabled) {
- return;
- }
-
- for (unsigned i = 0; i < v->shadow_vqs->len; ++i) {
- VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
- vhost_svq_stop(svq);
- }
-}
-
static int vhost_vdpa_reset_device(struct vhost_dev *dev)
{
- struct vhost_vdpa *v = dev->opaque;
int ret;
uint8_t status = 0;
- vhost_vdpa_reset_svq(v);
-
ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
trace_vhost_vdpa_reset_device(dev, status);
return ret;
@@ -1088,6 +1073,8 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
for (unsigned i = 0; i < v->shadow_vqs->len; ++i) {
VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
+
+ vhost_svq_stop(svq);
vhost_vdpa_svq_unmap_rings(dev, svq);
}
}

View File

@@ -0,0 +1,132 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Carlos=20L=C3=B3pez?= <clopez@suse.de>
Date: Mon, 13 Feb 2023 09:57:47 +0100
Subject: [PATCH] vhost: avoid a potential use of an uninitialized variable in
vhost_svq_poll()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
In vhost_svq_poll(), if vhost_svq_get_buf() fails due to a device
providing invalid descriptors, len is left uninitialized and returned
to the caller, potentally leaking stack data or causing undefined
behavior.
Fix this by initializing len to 0.
Found with GCC 13 and -fanalyzer (abridged):
../hw/virtio/vhost-shadow-virtqueue.c: In function vhost_svq_poll:
../hw/virtio/vhost-shadow-virtqueue.c:538:12: warning: use of uninitialized value len [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
538 | return len;
| ^~~
vhost_svq_poll: events 1-4
|
| 522 | size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
| | ^~~~~~~~~~~~~~
| | |
| | (1) entry to vhost_svq_poll
|......
| 525 | uint32_t len;
| | ~~~
| | |
| | (2) region created on stack here
| | (3) capacity: 4 bytes
|......
| 528 | if (vhost_svq_more_used(svq)) {
| | ~
| | |
| | (4) inlined call to vhost_svq_more_used from vhost_svq_poll
(...)
| 528 | if (vhost_svq_more_used(svq)) {
| | ^~~~~~~~~~~~~~~~~~~~~~~~~
| | ||
| | |(8) ...to here
| | (7) following true branch...
|......
| 537 | vhost_svq_get_buf(svq, &len);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (9) calling vhost_svq_get_buf from vhost_svq_poll
|
+--> vhost_svq_get_buf: events 10-11
|
| 416 | static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
| | ^~~~~~~~~~~~~~~~~
| | |
| | (10) entry to vhost_svq_get_buf
|......
| 423 | if (!vhost_svq_more_used(svq)) {
| | ~
| | |
| | (11) inlined call to vhost_svq_more_used from vhost_svq_get_buf
|
(...)
|
vhost_svq_get_buf: event 14
|
| 423 | if (!vhost_svq_more_used(svq)) {
| | ^
| | |
| | (14) following false branch...
|
vhost_svq_get_buf: event 15
|
|cc1:
| (15): ...to here
|
<------+
|
vhost_svq_poll: events 16-17
|
| 537 | vhost_svq_get_buf(svq, &len);
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (16) returning to vhost_svq_poll from vhost_svq_get_buf
| 538 | return len;
| | ~~~
| | |
| | (17) use of uninitialized value len here
Note by Laurent Vivier <lvivier@redhat.com>:
The return value is only used to detect an error:
vhost_svq_poll
vhost_vdpa_net_cvq_add
vhost_vdpa_net_load_cmd
vhost_vdpa_net_load_mac
-> a negative return is only used to detect error
vhost_vdpa_net_load_mq
-> a negative return is only used to detect error
vhost_vdpa_net_handle_ctrl_avail
-> a negative return is only used to detect error
Fixes: d368c0b052ad ("vhost: Do not depend on !NULL VirtQueueElement on vhost_svq_flush")
Signed-off-by: Carlos López <clopez@suse.de>
Message-Id: <20230213085747.19956-1-clopez@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit e4dd39c699b7d63a06f686ec06ded8adbee989c1)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/virtio/vhost-shadow-virtqueue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 5bd14cad96..a723073747 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -522,7 +522,7 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
{
int64_t start_us = g_get_monotonic_time();
- uint32_t len;
+ uint32_t len = 0;
do {
if (vhost_svq_more_used(svq)) {

View File

@@ -0,0 +1,70 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Yajun Wu <yajunw@nvidia.com>
Date: Tue, 14 Feb 2023 10:14:30 +0800
Subject: [PATCH] chardev/char-socket: set s->listener = NULL in
char_socket_finalize
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
After live migration with virtio block device, qemu crash at:
#0 0x000055914f46f795 in object_dynamic_cast_assert (obj=0x559151b7b090, typename=0x55914f80fbc4 "qio-channel", file=0x55914f80fb90 "/images/testvfe/sw/qemu.gerrit/include/io/channel.h", line=30, func=0x55914f80fcb8 <__func__.17257> "QIO_CHANNEL") at ../qom/object.c:872
#1 0x000055914f480d68 in QIO_CHANNEL (obj=0x559151b7b090) at /images/testvfe/sw/qemu.gerrit/include/io/channel.h:29
#2 0x000055914f4812f8 in qio_net_listener_set_client_func_full (listener=0x559151b7a720, func=0x55914f580b97 <tcp_chr_accept>, data=0x5591519f4ea0, notify=0x0, context=0x0) at ../io/net-listener.c:166
#3 0x000055914f580059 in tcp_chr_update_read_handler (chr=0x5591519f4ea0) at ../chardev/char-socket.c:637
#4 0x000055914f583dca in qemu_chr_be_update_read_handlers (s=0x5591519f4ea0, context=0x0) at ../chardev/char.c:226
#5 0x000055914f57b7c9 in qemu_chr_fe_set_handlers_full (b=0x559152bf23a0, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=false, sync_state=true) at ../chardev/char-fe.c:279
#6 0x000055914f57b86d in qemu_chr_fe_set_handlers (b=0x559152bf23a0, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=false) at ../chardev/char-fe.c:304
#7 0x000055914f378caf in vhost_user_async_close (d=0x559152bf21a0, chardev=0x559152bf23a0, vhost=0x559152bf2420, cb=0x55914f2fb8c1 <vhost_user_blk_disconnect>) at ../hw/virtio/vhost-user.c:2725
#8 0x000055914f2fba40 in vhost_user_blk_event (opaque=0x559152bf21a0, event=CHR_EVENT_CLOSED) at ../hw/block/vhost-user-blk.c:395
#9 0x000055914f58388c in chr_be_event (s=0x5591519f4ea0, event=CHR_EVENT_CLOSED) at ../chardev/char.c:61
#10 0x000055914f583905 in qemu_chr_be_event (s=0x5591519f4ea0, event=CHR_EVENT_CLOSED) at ../chardev/char.c:81
#11 0x000055914f581275 in char_socket_finalize (obj=0x5591519f4ea0) at ../chardev/char-socket.c:1083
#12 0x000055914f46f073 in object_deinit (obj=0x5591519f4ea0, type=0x5591519055c0) at ../qom/object.c:680
#13 0x000055914f46f0e5 in object_finalize (data=0x5591519f4ea0) at ../qom/object.c:694
#14 0x000055914f46ff06 in object_unref (objptr=0x5591519f4ea0) at ../qom/object.c:1202
#15 0x000055914f4715a4 in object_finalize_child_property (obj=0x559151b76c50, name=0x559151b7b250 "char3", opaque=0x5591519f4ea0) at ../qom/object.c:1747
#16 0x000055914f46ee86 in object_property_del_all (obj=0x559151b76c50) at ../qom/object.c:632
#17 0x000055914f46f0d2 in object_finalize (data=0x559151b76c50) at ../qom/object.c:693
#18 0x000055914f46ff06 in object_unref (objptr=0x559151b76c50) at ../qom/object.c:1202
#19 0x000055914f4715a4 in object_finalize_child_property (obj=0x559151b6b560, name=0x559151b76630 "chardevs", opaque=0x559151b76c50) at ../qom/object.c:1747
#20 0x000055914f46ef67 in object_property_del_child (obj=0x559151b6b560, child=0x559151b76c50) at ../qom/object.c:654
#21 0x000055914f46f042 in object_unparent (obj=0x559151b76c50) at ../qom/object.c:673
#22 0x000055914f58632a in qemu_chr_cleanup () at ../chardev/char.c:1189
#23 0x000055914f16c66c in qemu_cleanup () at ../softmmu/runstate.c:830
#24 0x000055914eee7b9e in qemu_default_main () at ../softmmu/main.c:38
#25 0x000055914eee7bcc in main (argc=86, argv=0x7ffc97cb8d88) at ../softmmu/main.c:48
In char_socket_finalize after s->listener freed, event callback function
vhost_user_blk_event will be called to handle CHR_EVENT_CLOSED.
vhost_user_blk_event is calling qio_net_listener_set_client_func_full which
is still using s->listener.
Setting s->listener = NULL after object_unref(OBJECT(s->listener)) can
solve this issue.
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Message-Id: <20230214021430.3638579-1-yajunw@nvidia.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit b8a7f51f59e28d5a8e0c07ed3919cc9695560ed2)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
chardev/char-socket.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 879564aa8a..b00efb1482 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -1065,6 +1065,7 @@ static void char_socket_finalize(Object *obj)
qio_net_listener_set_client_func_full(s->listener, NULL, NULL,
NULL, chr->gcontext);
object_unref(OBJECT(s->listener));
+ s->listener = NULL;
}
if (s->tls_creds) {
object_unref(OBJECT(s->tls_creds));

View File

@@ -0,0 +1,41 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Thu, 23 Feb 2023 14:59:20 +0800
Subject: [PATCH] intel-iommu: fail MAP notifier without caching mode
Without caching mode, MAP notifier won't work correctly since guest
won't send IOTLB update event when it establishes new mappings in the
I/O page tables. Let's fail the IOMMU notifiers early instead of
misbehaving silently.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Viktor Prutyanov <viktor@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230223065924.42503-2-jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit b8d78277c091f26fdd64f239bc8bb7e55d74cecf)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/i386/intel_iommu.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index a08ee85edf..9143376677 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3186,6 +3186,13 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
"Snoop Control with vhost or VFIO is not supported");
return -ENOTSUP;
}
+ if (!s->caching_mode && (new & IOMMU_NOTIFIER_MAP)) {
+ error_setg_errno(errp, ENOTSUP,
+ "device %02x.%02x.%x requires caching mode",
+ pci_bus_num(vtd_as->bus), PCI_SLOT(vtd_as->devfn),
+ PCI_FUNC(vtd_as->devfn));
+ return -ENOTSUP;
+ }
/* Update per-address-space notifier flags */
vtd_as->notifier_flags = new;

View File

@@ -0,0 +1,50 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Thu, 23 Feb 2023 14:59:21 +0800
Subject: [PATCH] intel-iommu: fail DEVIOTLB_UNMAP without dt mode
Without dt mode, device IOTLB notifier won't work since guest won't
send device IOTLB invalidation descriptor in this case. Let's fail
early instead of misbehaving silently.
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Viktor Prutyanov <viktor@daynix.com>
Buglink: https://bugzilla.redhat.com/2156876
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230223065924.42503-3-jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit 09adb0e021207b60a0c51a68939b4539d98d3ef3)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/i386/intel_iommu.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 9143376677..d025ef2873 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3179,6 +3179,7 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
{
VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
IntelIOMMUState *s = vtd_as->iommu_state;
+ X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
/* TODO: add support for VFIO and vhost users */
if (s->snoop_control) {
@@ -3193,6 +3194,13 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
PCI_FUNC(vtd_as->devfn));
return -ENOTSUP;
}
+ if (!x86_iommu->dt_supported && (new & IOMMU_NOTIFIER_DEVIOTLB_UNMAP)) {
+ error_setg_errno(errp, ENOTSUP,
+ "device %02x.%02x.%x requires device IOTLB mode",
+ pci_bus_num(vtd_as->bus), PCI_SLOT(vtd_as->devfn),
+ PCI_FUNC(vtd_as->devfn));
+ return -ENOTSUP;
+ }
/* Update per-address-space notifier flags */
vtd_as->notifier_flags = new;

View File

@@ -0,0 +1,38 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Bulekov <alxndr@bu.edu>
Date: Mon, 13 Mar 2023 04:24:16 -0400
Subject: [PATCH] memory: Allow disabling re-entrancy checking per-MR
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
---
include/exec/memory.h | 3 +++
softmmu/memory.c | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 91f8a2395a..d7268d9f39 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -765,6 +765,9 @@ struct MemoryRegion {
unsigned ioeventfd_nb;
MemoryRegionIoeventfd *ioeventfds;
RamDiscardManager *rdm; /* Only for RAM */
+
+ /* For devices designed to perform re-entrant IO into their own IO MRs */
+ bool disable_reentrancy_guard;
};
struct IOMMUMemoryRegion {
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 7dcb3347aa..2b46714191 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -544,7 +544,7 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
}
/* Do not allow more than one simultanous access to a device's IO Regions */
- if (mr->owner &&
+ if (mr->owner && !mr->disable_reentrancy_guard &&
!mr->ram_device && !mr->ram && !mr->rom_device && !mr->readonly) {
dev = (DeviceState *) object_dynamic_cast(mr->owner, TYPE_DEVICE);
if (dev) {

View File

@@ -0,0 +1,33 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexander Bulekov <alxndr@bu.edu>
Date: Mon, 13 Mar 2023 04:24:17 -0400
Subject: [PATCH] lsi53c895a: disable reentrancy detection for script RAM
As the code is designed to use the memory APIs to access the script ram,
disable reentrancy checks for the pseudo-RAM ram_io MemoryRegion.
In the future, ram_io may be converted from an IO to a proper RAM MemoryRegion.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
---
hw/scsi/lsi53c895a.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 50979640c3..894b9311ac 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -2302,6 +2302,12 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s,
"lsi-io", 256);
+ /*
+ * Since we use the address-space API to interact with ram_io, disable the
+ * re-entrancy guard.
+ */
+ s->ram_io.disable_reentrancy_guard = true;
+
address_space_init(&s->pci_io_as, pci_address_space_io(dev), "lsi-pci-io");
qdev_init_gpio_out(d, &s->ext_irq, 1);

View File

@@ -0,0 +1,166 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Laszlo Ersek <lersek@redhat.com>
Date: Thu, 5 Jan 2023 17:18:04 +0100
Subject: [PATCH] acpi: cpuhp: fix guest-visible maximum access size to the
legacy reg block
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The modern ACPI CPU hotplug interface was introduced in the following
series (aa1dd39ca307..679dd1a957df), released in v2.7.0:
1 abd49bc2ed2f docs: update ACPI CPU hotplug spec with new protocol
2 16bcab97eb9f pc: piix4/ich9: add 'cpu-hotplug-legacy' property
3 5e1b5d93887b acpi: cpuhp: add CPU devices AML with _STA method
4 ac35f13ba8f8 pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
5 d2238cb6781d acpi: cpuhp: implement hot-add parts of CPU hotplug
interface
6 8872c25a26cc acpi: cpuhp: implement hot-remove parts of CPU hotplug
interface
7 76623d00ae57 acpi: cpuhp: add cpu._OST handling
8 679dd1a957df pc: use new CPU hotplug interface since 2.7 machine type
Before patch#1, "docs/specs/acpi_cpu_hotplug.txt" only specified 1-byte
accesses for the hotplug register block. Patch#1 preserved the same
restriction for the legacy register block, but:
- it specified DWORD accesses for some of the modern registers,
- in particular, the switch from the legacy block to the modern block
would require a DWORD write to the *legacy* block.
The latter functionality was then implemented in cpu_status_write()
[hw/acpi/cpu_hotplug.c], in patch#8.
Unfortunately, all DWORD accesses depended on a dormant bug: the one
introduced in earlier commit a014ed07bd5a ("memory: accept mismatching
sizes in memory_region_access_valid", 2013-05-29); first released in
v1.6.0. Due to commit a014ed07bd5a, the DWORD accesses to the *legacy*
CPU hotplug register block would work in spite of the above series *not*
relaxing "valid.max_access_size = 1" in "hw/acpi/cpu_hotplug.c":
> static const MemoryRegionOps AcpiCpuHotplug_ops = {
> .read = cpu_status_read,
> .write = cpu_status_write,
> .endianness = DEVICE_LITTLE_ENDIAN,
> .valid = {
> .min_access_size = 1,
> .max_access_size = 1,
> },
> };
Later, in commits e6d0c3ce6895 ("acpi: cpuhp: introduce 'Command data 2'
field", 2020-01-22) and ae340aa3d256 ("acpi: cpuhp: spec: add typical
usecases", 2020-01-22), first released in v5.0.0, the modern CPU hotplug
interface (including the documentation) was extended with another DWORD
*read* access, namely to the "Command data 2" register, which would be
important for the guest to confirm whether it managed to switch the
register block from legacy to modern.
This functionality too silently depended on the bug from commit
a014ed07bd5a.
In commit 5d971f9e6725 ('memory: Revert "memory: accept mismatching sizes
in memory_region_access_valid"', 2020-06-26), first released in v5.1.0,
the bug from commit a014ed07bd5a was fixed (the commit was reverted).
That swiftly exposed the bug in "AcpiCpuHotplug_ops", still present from
the v2.7.0 series quoted at the top -- namely the fact that
"valid.max_access_size = 1" didn't match what the guest was supposed to
do, according to the spec ("docs/specs/acpi_cpu_hotplug.txt").
The symptom is that the "modern interface negotiation protocol"
described in commit ae340aa3d256:
> + Use following steps to detect and enable modern CPU hotplug interface:
> + 1. Store 0x0 to the 'CPU selector' register,
> + attempting to switch to modern mode
> + 2. Store 0x0 to the 'CPU selector' register,
> + to ensure valid selector value
> + 3. Store 0x0 to the 'Command field' register,
> + 4. Read the 'Command data 2' register.
> + If read value is 0x0, the modern interface is enabled.
> + Otherwise legacy or no CPU hotplug interface available
falls apart for the guest: steps 1 and 2 are lost, because they are DWORD
writes; so no switching happens. Step 3 (a single-byte write) is not
lost, but it has no effect; see the condition in cpu_status_write() in
patch#8. And step 4 *misleads* the guest into thinking that the switch
worked: the DWORD read is lost again -- it returns zero to the guest
without ever reaching the device model, so the guest never learns the
switch didn't work.
This means that guest behavior centered on the "Command data 2" register
worked *only* in the v5.0.0 release; it got effectively regressed in
v5.1.0.
To make things *even more* complicated, the breakage was (and remains, as
of today) visible with TCG acceleration only. Commit 5d971f9e6725 makes
no difference with KVM acceleration -- the DWORD accesses still work,
despite "valid.max_access_size = 1".
As commit 5d971f9e6725 suggests, fix the problem by raising
"valid.max_access_size" to 4 -- the spec now clearly instructs the guest
to perform DWORD accesses to the legacy register block too, for enabling
(and verifying!) the modern block. In order to keep compatibility for the
device model implementation though, set "impl.max_access_size = 1", so
that wide accesses be split before they reach the legacy read/write
handlers, like they always have been on KVM, and like they were on TCG
before 5d971f9e6725 (v5.1.0).
Tested with:
- OVMF IA32 + qemu-system-i386, CPU hotplug/hot-unplug with SMM,
intermixed with ACPI S3 suspend/resume, using KVM accel
(regression-test);
- OVMF IA32X64 + qemu-system-x86_64, CPU hotplug/hot-unplug with SMM,
intermixed with ACPI S3 suspend/resume, using KVM accel
(regression-test);
- OVMF IA32 + qemu-system-i386, SMM enabled, using TCG accel; verified the
register block switch and the present/possible CPU counting through the
modern hotplug interface, during OVMF boot (bugfix test);
- I do not have any testcase (guest payload) for regression-testing CPU
hotplug through the *legacy* CPU hotplug register block.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ani Sinha <ani@anisinha.ca>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: qemu-stable@nongnu.org
Ref: "IO port write width clamping differs between TCG and KVM"
Link: http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com
Link: https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230105161804.82486-1-lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry-picked from commit dab30fbef3896bb652a09d46c37d3f55657cbcbb)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/acpi/cpu_hotplug.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
index 53654f8638..ff14c3f410 100644
--- a/hw/acpi/cpu_hotplug.c
+++ b/hw/acpi/cpu_hotplug.c
@@ -52,6 +52,9 @@ static const MemoryRegionOps AcpiCpuHotplug_ops = {
.endianness = DEVICE_LITTLE_ENDIAN,
.valid = {
.min_access_size = 1,
+ .max_access_size = 4,
+ },
+ .impl = {
.max_access_size = 1,
},
};

View File

@@ -0,0 +1,286 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Sat, 14 Jan 2023 13:05:41 -1000
Subject: [PATCH] tests/tcg/i386: Introduce and use reg_t consistently
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Define reg_t based on the actual register width.
Define the inlines using that type. This will allow
input registers to 32-bit insns to be set to 64-bit
values on x86-64, which allows testing various edge cases.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230114230542.3116013-2-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry-picked from commit 5d62d6649cd367b5b4a3676e7514d2f9ca86cb03)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
tests/tcg/i386/test-i386-bmi2.c | 182 ++++++++++++++++----------------
1 file changed, 93 insertions(+), 89 deletions(-)
diff --git a/tests/tcg/i386/test-i386-bmi2.c b/tests/tcg/i386/test-i386-bmi2.c
index 5fadf47510..3c3ef85513 100644
--- a/tests/tcg/i386/test-i386-bmi2.c
+++ b/tests/tcg/i386/test-i386-bmi2.c
@@ -3,34 +3,40 @@
#include <stdint.h>
#include <stdio.h>
+#ifdef __x86_64
+typedef uint64_t reg_t;
+#else
+typedef uint32_t reg_t;
+#endif
+
#define insn1q(name, arg0) \
-static inline uint64_t name##q(uint64_t arg0) \
+static inline reg_t name##q(reg_t arg0) \
{ \
- uint64_t result64; \
+ reg_t result64; \
asm volatile (#name "q %1, %0" : "=r"(result64) : "rm"(arg0)); \
return result64; \
}
#define insn1l(name, arg0) \
-static inline uint32_t name##l(uint32_t arg0) \
+static inline reg_t name##l(reg_t arg0) \
{ \
- uint32_t result32; \
+ reg_t result32; \
asm volatile (#name "l %k1, %k0" : "=r"(result32) : "rm"(arg0)); \
return result32; \
}
#define insn2q(name, arg0, c0, arg1, c1) \
-static inline uint64_t name##q(uint64_t arg0, uint64_t arg1) \
+static inline reg_t name##q(reg_t arg0, reg_t arg1) \
{ \
- uint64_t result64; \
+ reg_t result64; \
asm volatile (#name "q %2, %1, %0" : "=r"(result64) : c0(arg0), c1(arg1)); \
return result64; \
}
#define insn2l(name, arg0, c0, arg1, c1) \
-static inline uint32_t name##l(uint32_t arg0, uint32_t arg1) \
+static inline reg_t name##l(reg_t arg0, reg_t arg1) \
{ \
- uint32_t result32; \
+ reg_t result32; \
asm volatile (#name "l %k2, %k1, %k0" : "=r"(result32) : c0(arg0), c1(arg1)); \
return result32; \
}
@@ -65,130 +71,128 @@ insn1l(blsr, src)
int main(int argc, char *argv[]) {
uint64_t ehlo = 0x202020204f4c4845ull;
uint64_t mask = 0xa080800302020001ull;
- uint32_t result32;
+ reg_t result;
#ifdef __x86_64
- uint64_t result64;
-
/* 64 bits */
- result64 = andnq(mask, ehlo);
- assert(result64 == 0x002020204d4c4844);
+ result = andnq(mask, ehlo);
+ assert(result == 0x002020204d4c4844);
- result64 = pextq(ehlo, mask);
- assert(result64 == 133);
+ result = pextq(ehlo, mask);
+ assert(result == 133);
- result64 = pdepq(result64, mask);
- assert(result64 == (ehlo & mask));
+ result = pdepq(result, mask);
+ assert(result == (ehlo & mask));
- result64 = pextq(-1ull, mask);
- assert(result64 == 511); /* mask has 9 bits set */
+ result = pextq(-1ull, mask);
+ assert(result == 511); /* mask has 9 bits set */
- result64 = pdepq(-1ull, mask);
- assert(result64 == mask);
+ result = pdepq(-1ull, mask);
+ assert(result == mask);
- result64 = bextrq(mask, 0x3f00);
- assert(result64 == (mask & ~INT64_MIN));
+ result = bextrq(mask, 0x3f00);
+ assert(result == (mask & ~INT64_MIN));
- result64 = bextrq(mask, 0x1038);
- assert(result64 == 0xa0);
+ result = bextrq(mask, 0x1038);
+ assert(result == 0xa0);
- result64 = bextrq(mask, 0x10f8);
- assert(result64 == 0);
+ result = bextrq(mask, 0x10f8);
+ assert(result == 0);
- result64 = blsiq(0x30);
- assert(result64 == 0x10);
+ result = blsiq(0x30);
+ assert(result == 0x10);
- result64 = blsiq(0x30ull << 32);
- assert(result64 == 0x10ull << 32);
+ result = blsiq(0x30ull << 32);
+ assert(result == 0x10ull << 32);
- result64 = blsmskq(0x30);
- assert(result64 == 0x1f);
+ result = blsmskq(0x30);
+ assert(result == 0x1f);
- result64 = blsrq(0x30);
- assert(result64 == 0x20);
+ result = blsrq(0x30);
+ assert(result == 0x20);
- result64 = blsrq(0x30ull << 32);
- assert(result64 == 0x20ull << 32);
+ result = blsrq(0x30ull << 32);
+ assert(result == 0x20ull << 32);
- result64 = bzhiq(mask, 0x3f);
- assert(result64 == (mask & ~INT64_MIN));
+ result = bzhiq(mask, 0x3f);
+ assert(result == (mask & ~INT64_MIN));
- result64 = bzhiq(mask, 0x1f);
- assert(result64 == (mask & ~(-1 << 30)));
+ result = bzhiq(mask, 0x1f);
+ assert(result == (mask & ~(-1 << 30)));
- result64 = rorxq(0x2132435465768798, 8);
- assert(result64 == 0x9821324354657687);
+ result = rorxq(0x2132435465768798, 8);
+ assert(result == 0x9821324354657687);
- result64 = sarxq(0xffeeddccbbaa9988, 8);
- assert(result64 == 0xffffeeddccbbaa99);
+ result = sarxq(0xffeeddccbbaa9988, 8);
+ assert(result == 0xffffeeddccbbaa99);
- result64 = sarxq(0x77eeddccbbaa9988, 8 | 64);
- assert(result64 == 0x0077eeddccbbaa99);
+ result = sarxq(0x77eeddccbbaa9988, 8 | 64);
+ assert(result == 0x0077eeddccbbaa99);
- result64 = shrxq(0xffeeddccbbaa9988, 8);
- assert(result64 == 0x00ffeeddccbbaa99);
+ result = shrxq(0xffeeddccbbaa9988, 8);
+ assert(result == 0x00ffeeddccbbaa99);
- result64 = shrxq(0x77eeddccbbaa9988, 8 | 192);
- assert(result64 == 0x0077eeddccbbaa99);
+ result = shrxq(0x77eeddccbbaa9988, 8 | 192);
+ assert(result == 0x0077eeddccbbaa99);
- result64 = shlxq(0xffeeddccbbaa9988, 8);
- assert(result64 == 0xeeddccbbaa998800);
+ result = shlxq(0xffeeddccbbaa9988, 8);
+ assert(result == 0xeeddccbbaa998800);
#endif
/* 32 bits */
- result32 = andnl(mask, ehlo);
- assert(result32 == 0x04d4c4844);
+ result = andnl(mask, ehlo);
+ assert(result == 0x04d4c4844);
- result32 = pextl((uint32_t) ehlo, mask);
- assert(result32 == 5);
+ result = pextl((uint32_t) ehlo, mask);
+ assert(result == 5);
- result32 = pdepl(result32, mask);
- assert(result32 == (uint32_t)(ehlo & mask));
+ result = pdepl(result, mask);
+ assert(result == (uint32_t)(ehlo & mask));
- result32 = pextl(-1u, mask);
- assert(result32 == 7); /* mask has 3 bits set */
+ result = pextl(-1u, mask);
+ assert(result == 7); /* mask has 3 bits set */
- result32 = pdepl(-1u, mask);
- assert(result32 == (uint32_t)mask);
+ result = pdepl(-1u, mask);
+ assert(result == (uint32_t)mask);
- result32 = bextrl(mask, 0x1f00);
- assert(result32 == (mask & ~INT32_MIN));
+ result = bextrl(mask, 0x1f00);
+ assert(result == (mask & ~INT32_MIN));
- result32 = bextrl(ehlo, 0x1018);
- assert(result32 == 0x4f);
+ result = bextrl(ehlo, 0x1018);
+ assert(result == 0x4f);
- result32 = bextrl(mask, 0x1038);
- assert(result32 == 0);
+ result = bextrl(mask, 0x1038);
+ assert(result == 0);
- result32 = blsil(0xffff);
- assert(result32 == 1);
+ result = blsil(0xffff);
+ assert(result == 1);
- result32 = blsmskl(0x300);
- assert(result32 == 0x1ff);
+ result = blsmskl(0x300);
+ assert(result == 0x1ff);
- result32 = blsrl(0xffc);
- assert(result32 == 0xff8);
+ result = blsrl(0xffc);
+ assert(result == 0xff8);
- result32 = bzhil(mask, 0xf);
- assert(result32 == 1);
+ result = bzhil(mask, 0xf);
+ assert(result == 1);
- result32 = rorxl(0x65768798, 8);
- assert(result32 == 0x98657687);
+ result = rorxl(0x65768798, 8);
+ assert(result == 0x98657687);
- result32 = sarxl(0xffeeddcc, 8);
- assert(result32 == 0xffffeedd);
+ result = sarxl(0xffeeddcc, 8);
+ assert(result == 0xffffeedd);
- result32 = sarxl(0x77eeddcc, 8 | 32);
- assert(result32 == 0x0077eedd);
+ result = sarxl(0x77eeddcc, 8 | 32);
+ assert(result == 0x0077eedd);
- result32 = shrxl(0xffeeddcc, 8);
- assert(result32 == 0x00ffeedd);
+ result = shrxl(0xffeeddcc, 8);
+ assert(result == 0x00ffeedd);
- result32 = shrxl(0x77eeddcc, 8 | 128);
- assert(result32 == 0x0077eedd);
+ result = shrxl(0x77eeddcc, 8 | 128);
+ assert(result == 0x0077eedd);
- result32 = shlxl(0xffeeddcc, 8);
- assert(result32 == 0xeeddcc00);
+ result = shlxl(0xffeeddcc, 8);
+ assert(result == 0xeeddcc00);
return 0;
}

View File

@@ -0,0 +1,97 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Sat, 14 Jan 2023 13:05:42 -1000
Subject: [PATCH] target/i386: Fix BEXTR instruction
There were two problems here: not limiting the input to operand bits,
and not correctly handling large extraction length.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1372
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114230542.3116013-3-richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Fixes: 1d0b926150e5 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry-picked from commit b14c0098975264ed03144f145bca0179a6763a07)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
target/i386/tcg/emit.c.inc | 22 +++++++++++-----------
tests/tcg/i386/test-i386-bmi2.c | 12 ++++++++++++
2 files changed, 23 insertions(+), 11 deletions(-)
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 7037ff91c6..99f6ba6e19 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1078,30 +1078,30 @@ static void gen_ANDN(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
static void gen_BEXTR(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
{
MemOp ot = decode->op[0].ot;
- TCGv bound, zero;
+ TCGv bound = tcg_constant_tl(ot == MO_64 ? 63 : 31);
+ TCGv zero = tcg_constant_tl(0);
+ TCGv mone = tcg_constant_tl(-1);
/*
* Extract START, and shift the operand.
* Shifts larger than operand size get zeros.
*/
tcg_gen_ext8u_tl(s->A0, s->T1);
+ if (TARGET_LONG_BITS == 64 && ot == MO_32) {
+ tcg_gen_ext32u_tl(s->T0, s->T0);
+ }
tcg_gen_shr_tl(s->T0, s->T0, s->A0);
- bound = tcg_constant_tl(ot == MO_64 ? 63 : 31);
- zero = tcg_constant_tl(0);
tcg_gen_movcond_tl(TCG_COND_LEU, s->T0, s->A0, bound, s->T0, zero);
/*
- * Extract the LEN into a mask. Lengths larger than
- * operand size get all ones.
+ * Extract the LEN into an inverse mask. Lengths larger than
+ * operand size get all zeros, length 0 gets all ones.
*/
tcg_gen_extract_tl(s->A0, s->T1, 8, 8);
- tcg_gen_movcond_tl(TCG_COND_LEU, s->A0, s->A0, bound, s->A0, bound);
-
- tcg_gen_movi_tl(s->T1, 1);
- tcg_gen_shl_tl(s->T1, s->T1, s->A0);
- tcg_gen_subi_tl(s->T1, s->T1, 1);
- tcg_gen_and_tl(s->T0, s->T0, s->T1);
+ tcg_gen_shl_tl(s->T1, mone, s->A0);
+ tcg_gen_movcond_tl(TCG_COND_LEU, s->T1, s->A0, bound, s->T1, zero);
+ tcg_gen_andc_tl(s->T0, s->T0, s->T1);
gen_op_update1_cc(s);
set_cc_op(s, CC_OP_LOGICB + ot);
diff --git a/tests/tcg/i386/test-i386-bmi2.c b/tests/tcg/i386/test-i386-bmi2.c
index 3c3ef85513..982d4abda4 100644
--- a/tests/tcg/i386/test-i386-bmi2.c
+++ b/tests/tcg/i386/test-i386-bmi2.c
@@ -99,6 +99,9 @@ int main(int argc, char *argv[]) {
result = bextrq(mask, 0x10f8);
assert(result == 0);
+ result = bextrq(0xfedcba9876543210ull, 0x7f00);
+ assert(result == 0xfedcba9876543210ull);
+
result = blsiq(0x30);
assert(result == 0x10);
@@ -164,6 +167,15 @@ int main(int argc, char *argv[]) {
result = bextrl(mask, 0x1038);
assert(result == 0);
+ result = bextrl((reg_t)0x8f635a775ad3b9b4ull, 0x3018);
+ assert(result == 0x5a);
+
+ result = bextrl((reg_t)0xfedcba9876543210ull, 0x7f00);
+ assert(result == 0x76543210u);
+
+ result = bextrl(-1, 0);
+ assert(result == 0);
+
result = blsil(0xffff);
assert(result == 1);

View File

@@ -0,0 +1,47 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Sat, 14 Jan 2023 08:06:01 -1000
Subject: [PATCH] target/i386: Fix C flag for BLSI, BLSMSK, BLSR
We forgot to set cc_src, which is used for computing C.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1370
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114180601.2993644-1-richard.henderson@linaro.org>
Cc: qemu-stable@nongnu.org
Fixes: 1d0b926150e5 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry-picked from commit 99282098dc74c2055bde5652bde6cf0067d0c370)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
target/i386/tcg/emit.c.inc | 3 +++
1 file changed, 3 insertions(+)
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 99f6ba6e19..4d7702c106 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1111,6 +1111,7 @@ static void gen_BLSI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
{
MemOp ot = decode->op[0].ot;
+ tcg_gen_mov_tl(cpu_cc_src, s->T0);
tcg_gen_neg_tl(s->T1, s->T0);
tcg_gen_and_tl(s->T0, s->T0, s->T1);
tcg_gen_mov_tl(cpu_cc_dst, s->T0);
@@ -1121,6 +1122,7 @@ static void gen_BLSMSK(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode
{
MemOp ot = decode->op[0].ot;
+ tcg_gen_mov_tl(cpu_cc_src, s->T0);
tcg_gen_subi_tl(s->T1, s->T0, 1);
tcg_gen_xor_tl(s->T0, s->T0, s->T1);
tcg_gen_mov_tl(cpu_cc_dst, s->T0);
@@ -1131,6 +1133,7 @@ static void gen_BLSR(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
{
MemOp ot = decode->op[0].ot;
+ tcg_gen_mov_tl(cpu_cc_src, s->T0);
tcg_gen_subi_tl(s->T1, s->T0, 1);
tcg_gen_and_tl(s->T0, s->T0, s->T1);
tcg_gen_mov_tl(cpu_cc_dst, s->T0);

View File

@@ -0,0 +1,192 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Tue, 31 Jan 2023 09:48:03 +0100
Subject: [PATCH] target/i386: fix ADOX followed by ADCX
When ADCX is followed by ADOX or vice versa, the second instruction's
carry comes from EFLAGS and the condition codes use the CC_OP_ADCOX
operation. Retrieving the carry from EFLAGS is handled by this bit
of gen_ADCOX:
tcg_gen_extract_tl(carry_in, cpu_cc_src,
ctz32(cc_op == CC_OP_ADCX ? CC_C : CC_O), 1);
Unfortunately, in this case cc_op has been overwritten by the previous
"if" statement to CC_OP_ADCOX. This works by chance when the first
instruction is ADCX; however, if the first instruction is ADOX,
ADCX will incorrectly take its carry from OF instead of CF.
Fix by moving the computation of the new cc_op at the end of the function.
The included exhaustive test case fails without this patch and passes
afterwards.
Because ADCX/ADOX need not be invoked through the VEX prefix, this
regression bisects to commit 16fc5726a6e2 ("target/i386: reimplement
0x0f 0x38, add AVX", 2022-10-18). However, the mistake happened a
little earlier, when BMI instructions were rewritten using the new
decoder framework.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1471
Reported-by: Paul Jolly <https://gitlab.com/myitcv>
Fixes: 1d0b926150e5 ("target/i386: move scalar 0F 38 and 0F 3A instruction to new decoder", 2022-10-18)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry-picked from commit 60c7dd22e1383754d5f150bc9f7c2785c662a7b6)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
target/i386/tcg/emit.c.inc | 20 +++++----
tests/tcg/i386/Makefile.target | 6 ++-
tests/tcg/i386/test-i386-adcox.c | 75 ++++++++++++++++++++++++++++++++
3 files changed, 91 insertions(+), 10 deletions(-)
create mode 100644 tests/tcg/i386/test-i386-adcox.c
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 4d7702c106..0d7c6e80ae 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1015,6 +1015,7 @@ VSIB_AVX(VPGATHERQ, vpgatherq)
static void gen_ADCOX(DisasContext *s, CPUX86State *env, MemOp ot, int cc_op)
{
+ int opposite_cc_op;
TCGv carry_in = NULL;
TCGv carry_out = (cc_op == CC_OP_ADCX ? cpu_cc_dst : cpu_cc_src2);
TCGv zero;
@@ -1022,14 +1023,8 @@ static void gen_ADCOX(DisasContext *s, CPUX86State *env, MemOp ot, int cc_op)
if (cc_op == s->cc_op || s->cc_op == CC_OP_ADCOX) {
/* Re-use the carry-out from a previous round. */
carry_in = carry_out;
- cc_op = s->cc_op;
- } else if (s->cc_op == CC_OP_ADCX || s->cc_op == CC_OP_ADOX) {
- /* Merge with the carry-out from the opposite instruction. */
- cc_op = CC_OP_ADCOX;
- }
-
- /* If we don't have a carry-in, get it out of EFLAGS. */
- if (!carry_in) {
+ } else {
+ /* We don't have a carry-in, get it out of EFLAGS. */
if (s->cc_op != CC_OP_ADCX && s->cc_op != CC_OP_ADOX) {
gen_compute_eflags(s);
}
@@ -1053,7 +1048,14 @@ static void gen_ADCOX(DisasContext *s, CPUX86State *env, MemOp ot, int cc_op)
tcg_gen_add2_tl(s->T0, carry_out, s->T0, carry_out, s->T1, zero);
break;
}
- set_cc_op(s, cc_op);
+
+ opposite_cc_op = cc_op == CC_OP_ADCX ? CC_OP_ADOX : CC_OP_ADCX;
+ if (s->cc_op == CC_OP_ADCOX || s->cc_op == opposite_cc_op) {
+ /* Merge with the carry-out from the opposite instruction. */
+ set_cc_op(s, CC_OP_ADCOX);
+ } else {
+ set_cc_op(s, cc_op);
+ }
}
static void gen_ADCX(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index 81831cafbc..bafd8c2180 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -14,7 +14,7 @@ config-cc.mak: Makefile
I386_SRCS=$(notdir $(wildcard $(I386_SRC)/*.c))
ALL_X86_TESTS=$(I386_SRCS:.c=)
SKIP_I386_TESTS=test-i386-ssse3 test-avx test-3dnow test-mmx
-X86_64_TESTS:=$(filter test-i386-bmi2 $(SKIP_I386_TESTS), $(ALL_X86_TESTS))
+X86_64_TESTS:=$(filter test-i386-adcox test-i386-bmi2 $(SKIP_I386_TESTS), $(ALL_X86_TESTS))
test-i386-sse-exceptions: CFLAGS += -msse4.1 -mfpmath=sse
run-test-i386-sse-exceptions: QEMU_OPTS += -cpu max
@@ -28,6 +28,10 @@ test-i386-bmi2: CFLAGS=-O2
run-test-i386-bmi2: QEMU_OPTS += -cpu max
run-plugin-test-i386-bmi2-%: QEMU_OPTS += -cpu max
+test-i386-adcox: CFLAGS=-O2
+run-test-i386-adcox: QEMU_OPTS += -cpu max
+run-plugin-test-i386-adcox-%: QEMU_OPTS += -cpu max
+
#
# hello-i386 is a barebones app
#
diff --git a/tests/tcg/i386/test-i386-adcox.c b/tests/tcg/i386/test-i386-adcox.c
new file mode 100644
index 0000000000..16169efff8
--- /dev/null
+++ b/tests/tcg/i386/test-i386-adcox.c
@@ -0,0 +1,75 @@
+/* See if various BMI2 instructions give expected results */
+#include <assert.h>
+#include <stdint.h>
+#include <stdio.h>
+
+#define CC_C 1
+#define CC_O (1 << 11)
+
+#ifdef __x86_64__
+#define REG uint64_t
+#else
+#define REG uint32_t
+#endif
+
+void test_adox_adcx(uint32_t in_c, uint32_t in_o, REG adcx_operand, REG adox_operand)
+{
+ REG flags;
+ REG out_adcx, out_adox;
+
+ asm("pushf; pop %0" : "=r"(flags));
+ flags &= ~(CC_C | CC_O);
+ flags |= (in_c ? CC_C : 0);
+ flags |= (in_o ? CC_O : 0);
+
+ out_adcx = adcx_operand;
+ out_adox = adox_operand;
+ asm("push %0; popf;"
+ "adox %3, %2;"
+ "adcx %3, %1;"
+ "pushf; pop %0"
+ : "+r" (flags), "+r" (out_adcx), "+r" (out_adox)
+ : "r" ((REG)-1), "0" (flags), "1" (out_adcx), "2" (out_adox));
+
+ assert(out_adcx == in_c + adcx_operand - 1);
+ assert(out_adox == in_o + adox_operand - 1);
+ assert(!!(flags & CC_C) == (in_c || adcx_operand));
+ assert(!!(flags & CC_O) == (in_o || adox_operand));
+}
+
+void test_adcx_adox(uint32_t in_c, uint32_t in_o, REG adcx_operand, REG adox_operand)
+{
+ REG flags;
+ REG out_adcx, out_adox;
+
+ asm("pushf; pop %0" : "=r"(flags));
+ flags &= ~(CC_C | CC_O);
+ flags |= (in_c ? CC_C : 0);
+ flags |= (in_o ? CC_O : 0);
+
+ out_adcx = adcx_operand;
+ out_adox = adox_operand;
+ asm("push %0; popf;"
+ "adcx %3, %1;"
+ "adox %3, %2;"
+ "pushf; pop %0"
+ : "+r" (flags), "+r" (out_adcx), "+r" (out_adox)
+ : "r" ((REG)-1), "0" (flags), "1" (out_adcx), "2" (out_adox));
+
+ assert(out_adcx == in_c + adcx_operand - 1);
+ assert(out_adox == in_o + adox_operand - 1);
+ assert(!!(flags & CC_C) == (in_c || adcx_operand));
+ assert(!!(flags & CC_O) == (in_o || adox_operand));
+}
+
+int main(int argc, char *argv[]) {
+ /* try all combinations of input CF, input OF, CF from op1+op2, OF from op2+op1 */
+ int i;
+ for (i = 0; i <= 15; i++) {
+ printf("%d\n", i);
+ test_adcx_adox(!!(i & 1), !!(i & 2), !!(i & 4), !!(i & 8));
+ test_adox_adcx(!!(i & 1), !!(i & 2), !!(i & 4), !!(i & 8));
+ }
+ return 0;
+}
+

View File

@@ -0,0 +1,64 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Sat, 14 Jan 2023 13:32:06 -1000
Subject: [PATCH] target/i386: Fix BZHI instruction
We did not correctly handle N >= operand size.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1374
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230114233206.3118472-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry-picked from commit 9ad2ba6e8e7fc195d0dd0b76ab38bd2fceb1bdd4)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
target/i386/tcg/emit.c.inc | 14 +++++++-------
tests/tcg/i386/test-i386-bmi2.c | 3 +++
2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 0d7c6e80ae..7296f3952c 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1145,20 +1145,20 @@ static void gen_BLSR(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
static void gen_BZHI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
{
MemOp ot = decode->op[0].ot;
- TCGv bound;
+ TCGv bound = tcg_constant_tl(ot == MO_64 ? 63 : 31);
+ TCGv zero = tcg_constant_tl(0);
+ TCGv mone = tcg_constant_tl(-1);
- tcg_gen_ext8u_tl(s->T1, cpu_regs[s->vex_v]);
- bound = tcg_constant_tl(ot == MO_64 ? 63 : 31);
+ tcg_gen_ext8u_tl(s->T1, s->T1);
/*
* Note that since we're using BMILG (in order to get O
* cleared) we need to store the inverse into C.
*/
- tcg_gen_setcond_tl(TCG_COND_LT, cpu_cc_src, s->T1, bound);
- tcg_gen_movcond_tl(TCG_COND_GT, s->T1, s->T1, bound, bound, s->T1);
+ tcg_gen_setcond_tl(TCG_COND_LEU, cpu_cc_src, s->T1, bound);
- tcg_gen_movi_tl(s->A0, -1);
- tcg_gen_shl_tl(s->A0, s->A0, s->T1);
+ tcg_gen_shl_tl(s->A0, mone, s->T1);
+ tcg_gen_movcond_tl(TCG_COND_LEU, s->A0, s->T1, bound, s->A0, zero);
tcg_gen_andc_tl(s->T0, s->T0, s->A0);
gen_op_update1_cc(s);
diff --git a/tests/tcg/i386/test-i386-bmi2.c b/tests/tcg/i386/test-i386-bmi2.c
index 982d4abda4..0244df7987 100644
--- a/tests/tcg/i386/test-i386-bmi2.c
+++ b/tests/tcg/i386/test-i386-bmi2.c
@@ -123,6 +123,9 @@ int main(int argc, char *argv[]) {
result = bzhiq(mask, 0x1f);
assert(result == (mask & ~(-1 << 30)));
+ result = bzhiq(mask, 0x40);
+ assert(result == mask);
+
result = rorxq(0x2132435465768798, 8);
assert(result == 0x9821324354657687);

1249
debian/patches/pve-qemu-7.2-vitastor.patch vendored Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -14,10 +14,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index dd295cfc6d..3ac5177cbb 100644
index b9647c5ffc..9a16d86344 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -533,7 +533,7 @@ static QemuOptsList raw_runtime_opts = {
@@ -552,7 +552,7 @@ static QemuOptsList raw_runtime_opts = {
{
.name = "locking",
.type = QEMU_OPT_STRING,
@@ -26,7 +26,7 @@ index dd295cfc6d..3ac5177cbb 100644
},
{
.name = "pr-manager",
@@ -631,7 +631,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
@@ -652,7 +652,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
s->use_lock = false;
break;
case ON_OFF_AUTO_AUTO:

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/net.h b/include/net/net.h
index 5d1508081f..f665924193 100644
index dc20b31e9f..5ae04a8693 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -219,8 +219,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
@@ -236,8 +236,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
int net_hub_id_for_client(NetClientState *nc, int *id);
NetClientState *net_hub_port_find(int hub_id);

View File

@@ -10,10 +10,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 21b33fbe2e..32514193a9 100644
index d4bc19577a..be7da64f38 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2007,9 +2007,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
@@ -2174,9 +2174,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
#define CPU_RESOLVING_TYPE TYPE_X86_CPU
#ifdef TARGET_X86_64
@@ -24,4 +24,4 @@ index 21b33fbe2e..32514193a9 100644
+#define TARGET_DEFAULT_CPU_TYPE X86_CPU_TYPE_NAME("kvm32")
#endif
#define cpu_signal_handler cpu_x86_signal_handler
#define cpu_list x86_cpu_list

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 0371055e6c..840cf56923 100644
index c3ac20ad43..37774f1c0a 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -694,32 +694,35 @@ static void qemu_spice_init(void)
@@ -689,32 +689,35 @@ static void qemu_spice_init(void)
if (tls_port) {
x509_dir = qemu_opt_get(opts, "x509-dir");

View File

@@ -9,7 +9,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/block/gluster.c b/block/gluster.c
index e8ee14c8e9..3eb6a05500 100644
index 7c90f7ba4b..2e03102f00 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -42,7 +42,7 @@

View File

@@ -18,10 +18,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+)
diff --git a/block/rbd.c b/block/rbd.c
index dcf82b15b8..feeec452f0 100644
index f826410f40..64a8d7d48b 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -814,6 +814,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
@@ -820,6 +820,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
rados_conf_set(*cluster, "rbd_cache", "false");
}

View File

@@ -4,17 +4,19 @@ Date: Mon, 6 Apr 2020 12:16:37 +0200
Subject: [PATCH] PVE: [Up] qmp: add get_link_status
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add get_link_status to command name exceptions]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
net/net.c | 27 +++++++++++++++++++++++++++
qapi/net.json | 15 +++++++++++++++
qapi/pragma.json | 1 +
3 files changed, 43 insertions(+)
qapi/pragma.json | 2 ++
3 files changed, 44 insertions(+)
diff --git a/net/net.c b/net/net.c
index 76bbb7c31b..82e0a768b4 100644
index 840ad9dca5..28e97c5d85 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1314,6 +1314,33 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
@@ -1372,6 +1372,33 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
}
}
@@ -49,10 +51,10 @@ index 76bbb7c31b..82e0a768b4 100644
{
NetClientState *nc;
diff --git a/qapi/net.json b/qapi/net.json
index 7fab2e7cd8..74c9a6109e 100644
index 522ac582ed..327d7c5a37 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -35,6 +35,21 @@
@@ -36,6 +36,21 @@
##
{ 'command': 'set_link', 'data': {'name': 'str', 'up': 'bool'} }
@@ -75,12 +77,20 @@ index 7fab2e7cd8..74c9a6109e 100644
# @netdev_add:
#
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 3bc0335d1f..7c91ea3685 100644
index 7f810b0e97..29233db825 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -22,6 +22,7 @@
'system_reset',
@@ -15,6 +15,7 @@
'device_add',
'device_del',
'expire_password',
+ 'get_link_status',
'migrate_cancel',
'netdev_add',
'netdev_del',
@@ -26,6 +27,7 @@
'system_wakeup' ],
# Commands allowed to return a non-dictionary
'command-returns-exceptions': [
+ 'get_link_status',
'human-monitor-command',

View File

@@ -16,7 +16,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/block/gluster.c b/block/gluster.c
index 3eb6a05500..b612918ee8 100644
index 2e03102f00..7886c5fe8c 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -57,6 +57,7 @@ typedef struct GlusterAIOCB {
@@ -39,15 +39,15 @@ index 3eb6a05500..b612918ee8 100644
}
aio_co_schedule(acb->aio_context, acb->coroutine);
@@ -1021,6 +1024,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
@@ -1022,6 +1025,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
+ acb.is_write = true;
ret = glfs_zerofill_async(s->fd, offset, size, gluster_finish_aiocb, &acb);
ret = glfs_zerofill_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1202,9 +1206,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
@@ -1203,9 +1207,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
acb.aio_context = bdrv_get_aio_context(bs);
if (write) {
@@ -67,11 +67,11 @@ index 3eb6a05500..b612918ee8 100644
ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1314,6 +1321,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
@@ -1316,6 +1323,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
+ acb.is_write = true;
ret = glfs_discard_async(s->fd, offset, size, gluster_finish_aiocb, &acb);
ret = glfs_discard_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
if (ret < 0) {

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/qemu-img.c b/qemu-img.c
index 908fd0cce5..5dc1d0a2ca 100644
index a9b3a8103c..0bc9f1af59 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2977,7 +2977,8 @@ static int img_info(int argc, char **argv)
@@ -3013,7 +3013,8 @@ static int img_info(int argc, char **argv)
list = collect_image_info_list(image_opts, filename, fmt, chain,
force_share);
if (!list) {

View File

@@ -31,13 +31,14 @@ override the output file's size.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
qemu-img-cmds.hx | 4 +-
qemu-img.c | 187 +++++++++++++++++++++++++++++------------------
2 files changed, 119 insertions(+), 72 deletions(-)
qemu-img.c | 202 ++++++++++++++++++++++++++++++-----------------
2 files changed, 133 insertions(+), 73 deletions(-)
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index b3620f29e5..e70ef3dc91 100644
index 1b1dab5b17..d1616c045a 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
@@ -53,10 +54,10 @@ index b3620f29e5..e70ef3dc91 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 5dc1d0a2ca..f773182bd0 100644
index 0bc9f1af59..221b9d6a16 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4793,10 +4793,12 @@ static int img_bitmap(int argc, char **argv)
@@ -4829,10 +4829,12 @@ static int img_bitmap(int argc, char **argv)
#define C_IF 04
#define C_OF 010
#define C_SKIP 020
@@ -69,7 +70,7 @@ index 5dc1d0a2ca..f773182bd0 100644
};
struct DdIo {
@@ -4872,6 +4874,19 @@ static int img_dd_skip(const char *arg,
@@ -4908,6 +4910,19 @@ static int img_dd_skip(const char *arg,
return 0;
}
@@ -89,7 +90,7 @@ index 5dc1d0a2ca..f773182bd0 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -4912,6 +4927,7 @@ static int img_dd(int argc, char **argv)
@@ -4948,6 +4963,7 @@ static int img_dd(int argc, char **argv)
{ "if", img_dd_if, C_IF },
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
@@ -97,7 +98,7 @@ index 5dc1d0a2ca..f773182bd0 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -4987,91 +5003,112 @@ static int img_dd(int argc, char **argv)
@@ -5023,91 +5039,112 @@ static int img_dd(int argc, char **argv)
arg = NULL;
}
@@ -134,11 +135,7 @@ index 5dc1d0a2ca..f773182bd0 100644
- error_report_err(local_err);
- ret = -1;
- goto out;
+ if (!blk1) {
+ ret = -1;
+ goto out;
+ }
}
- }
- if (!drv->create_opts) {
- error_report("Format driver '%s' does not support image creation",
- drv->format_name);
@@ -150,7 +147,11 @@ index 5dc1d0a2ca..f773182bd0 100644
- proto_drv->format_name);
- ret = -1;
- goto out;
- }
+ if (!blk1) {
+ ret = -1;
+ goto out;
+ }
}
- create_opts = qemu_opts_append(create_opts, drv->create_opts);
- create_opts = qemu_opts_append(create_opts, proto_drv->create_opts);
-
@@ -274,41 +275,54 @@ index 5dc1d0a2ca..f773182bd0 100644
}
if (dd.flags & C_SKIP && (in.offset > INT64_MAX / in.bsz ||
@@ -5089,11 +5126,17 @@ static int img_dd(int argc, char **argv)
@@ -5124,20 +5161,43 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
for (out_pos = 0; in_pos < size; block_count++) {
int in_ret, out_ret;
for (out_pos = 0; in_pos < size; ) {
+ int in_ret, out_ret;
int bytes = (in_pos + in.bsz > size) ? size - in_pos : in.bsz;
-
- if (in_pos + in.bsz > size) {
- in_ret = blk_pread(blk1, in_pos, in.buf, size - in_pos);
+ size_t in_bsz = in_pos + in.bsz > size ? size - in_pos : in.bsz;
- ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
- if (ret < 0) {
+ if (blk1) {
+ in_ret = blk_pread(blk1, in_pos, in.buf, in_bsz);
} else {
- in_ret = blk_pread(blk1, in_pos, in.buf, in.bsz);
+ in_ret = read(STDIN_FILENO, in.buf, in_bsz);
+ in_ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
+ if (in_ret == 0) {
+ in_ret = bytes;
+ }
+ } else {
+ in_ret = read(STDIN_FILENO, in.buf, bytes);
+ if (in_ret == 0) {
+ /* early EOF is considered an error */
+ error_report("Input ended unexpectedly");
+ ret = -1;
+ goto out;
+ }
}
if (in_ret < 0) {
+ }
+ if (in_ret < 0) {
error_report("error while reading from input image file: %s",
@@ -5103,9 +5146,13 @@ static int img_dd(int argc, char **argv)
- strerror(-ret));
+ strerror(-in_ret));
+ ret = -1;
goto out;
}
in_pos += in_ret;
in_pos += bytes;
- out_ret = blk_pwrite(blk2, out_pos, in.buf, in_ret, 0);
- ret = blk_pwrite(blk2, out_pos, bytes, in.buf, 0);
- if (ret < 0) {
+ if (blk2) {
+ out_ret = blk_pwrite(blk2, out_pos, in.buf, in_ret, 0);
+ out_ret = blk_pwrite(blk2, out_pos, in_ret, in.buf, 0);
+ if (out_ret == 0) {
+ out_ret = in_ret;
+ }
+ } else {
+ out_ret = write(STDOUT_FILENO, in.buf, in_ret);
+ }
- if (out_ret < 0) {
+
+ if (out_ret != in_ret) {
error_report("error while writing to output image file: %s",
strerror(-out_ret));
ret = -1;
- strerror(-ret));
+ strerror(-out_ret));
+ ret = -1;
goto out;
}
out_pos += bytes;

View File

@@ -10,15 +10,16 @@ an expected end of input.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
qemu-img.c | 28 +++++++++++++++++++++++++---
1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/qemu-img.c b/qemu-img.c
index f773182bd0..98a6562364 100644
index 221b9d6a16..c1306385a8 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4794,11 +4794,13 @@ static int img_bitmap(int argc, char **argv)
@@ -4830,11 +4830,13 @@ static int img_bitmap(int argc, char **argv)
#define C_OF 010
#define C_SKIP 020
#define C_OSIZE 040
@@ -32,7 +33,7 @@ index f773182bd0..98a6562364 100644
};
struct DdIo {
@@ -4887,6 +4889,19 @@ static int img_dd_osize(const char *arg,
@@ -4923,6 +4925,19 @@ static int img_dd_osize(const char *arg,
return 0;
}
@@ -52,13 +53,13 @@ index f773182bd0..98a6562364 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -4901,12 +4916,14 @@ static int img_dd(int argc, char **argv)
@@ -4937,12 +4952,14 @@ static int img_dd(int argc, char **argv)
int c, i;
const char *out_fmt = "raw";
const char *fmt = NULL;
- int64_t size = 0;
+ int64_t size = 0, readsize = 0;
int64_t block_count = 0, out_pos, in_pos;
int64_t out_pos, in_pos;
bool force_share = false;
struct DdInfo dd = {
.flags = 0,
@@ -68,7 +69,7 @@ index f773182bd0..98a6562364 100644
};
struct DdIo in = {
.bsz = 512, /* Block size is by default 512 bytes */
@@ -4928,6 +4945,7 @@ static int img_dd(int argc, char **argv)
@@ -4964,6 +4981,7 @@ static int img_dd(int argc, char **argv)
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
{ "osize", img_dd_osize, C_OSIZE },
@@ -76,20 +77,22 @@ index f773182bd0..98a6562364 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5124,14 +5142,18 @@ static int img_dd(int argc, char **argv)
@@ -5160,9 +5178,10 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
- for (out_pos = 0; in_pos < size; block_count++) {
- for (out_pos = 0; in_pos < size; ) {
+ readsize = (dd.isize > 0) ? dd.isize : size;
+ for (out_pos = 0; in_pos < readsize; block_count++) {
+ for (out_pos = 0; in_pos < readsize; ) {
int in_ret, out_ret;
- size_t in_bsz = in_pos + in.bsz > size ? size - in_pos : in.bsz;
+ size_t in_bsz = in_pos + in.bsz > readsize ? readsize - in_pos : in.bsz;
- int bytes = (in_pos + in.bsz > size) ? size - in_pos : in.bsz;
+ int bytes = (in_pos + in.bsz > readsize) ? readsize - in_pos : in.bsz;
if (blk1) {
in_ret = blk_pread(blk1, in_pos, in.buf, in_bsz);
in_ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
if (in_ret == 0) {
@@ -5171,6 +5190,9 @@ static int img_dd(int argc, char **argv)
} else {
in_ret = read(STDIN_FILENO, in.buf, in_bsz);
in_ret = read(STDIN_FILENO, in.buf, bytes);
if (in_ret == 0) {
+ if (dd.isize == 0) {
+ goto out;

View File

@@ -4,33 +4,89 @@ Date: Mon, 6 Apr 2020 12:16:42 +0200
Subject: [PATCH] PVE: [Up] qemu-img dd: add -n skip_create
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: fix getopt-string + add documentation]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
qemu-img.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
docs/tools/qemu-img.rst | 11 ++++++++++-
qemu-img-cmds.hx | 4 ++--
qemu-img.c | 23 ++++++++++++++---------
3 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 15aeddc6d8..5e713e231d 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -208,6 +208,10 @@ Parameters to convert subcommand:
Parameters to dd subcommand:
+.. option:: -n
+
+ Skip the creation of the target volume
+
.. program:: qemu-img-dd
.. option:: bs=BLOCK_SIZE
@@ -488,7 +492,7 @@ Command description:
it doesn't need to be specified separately in this case.
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
dd copies from *INPUT* file to *OUTPUT* file converting it from
*FMT* format to *OUTPUT_FMT* format.
@@ -499,6 +503,11 @@ Command description:
The size syntax is similar to :manpage:`dd(1)`'s size syntax.
+ If the ``-n`` option is specified, the target volume creation will be
+ skipped. This is useful for formats such as ``rbd`` if the target
+ volume has already been created with site specific options that cannot
+ be supplied through ``qemu-img``.
+
.. option:: info [--object OBJECTDEF] [--image-opts] [-f FMT] [--output=OFMT] [--backing-chain] [-U] FILENAME
Give information about the disk image *FILENAME*. Use it in
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index d1616c045a..b5b0bb4467 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
ERST
DEF("dd", img_dd,
- "dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [bs=block_size] [count=blocks] [skip=blocks] [osize=output_size] if=input of=output")
+ "dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [bs=block_size] [count=blocks] [skip=blocks] [osize=output_size] if=input of=output")
SRST
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] if=INPUT of=OUTPUT
ERST
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 98a6562364..355b3b82f4 100644
index c1306385a8..59c403373b 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4918,7 +4918,7 @@ static int img_dd(int argc, char **argv)
@@ -4954,7 +4954,7 @@ static int img_dd(int argc, char **argv)
const char *fmt = NULL;
int64_t size = 0, readsize = 0;
int64_t block_count = 0, out_pos, in_pos;
int64_t out_pos, in_pos;
- bool force_share = false;
+ bool force_share = false, skip_create = false;
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -4956,7 +4956,7 @@ static int img_dd(int argc, char **argv)
@@ -4992,7 +4992,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
- while ((c = getopt_long(argc, argv, ":hf:O:U", long_options, NULL))) {
+ while ((c = getopt_long(argc, argv, ":hf:O:U:n", long_options, NULL))) {
+ while ((c = getopt_long(argc, argv, ":hf:O:Un", long_options, NULL))) {
if (c == EOF) {
break;
}
@@ -4976,6 +4976,9 @@ static int img_dd(int argc, char **argv)
@@ -5012,6 +5012,9 @@ static int img_dd(int argc, char **argv)
case 'h':
help();
break;
@@ -40,7 +96,7 @@ index 98a6562364..355b3b82f4 100644
case 'U':
force_share = true;
break;
@@ -5106,13 +5109,15 @@ static int img_dd(int argc, char **argv)
@@ -5142,13 +5145,15 @@ static int img_dd(int argc, char **argv)
size - in.bsz * in.offset, &error_abort);
}

View File

@@ -7,17 +7,20 @@ Actually provide memory information via the query-balloon
command.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add BalloonInfo to member name exceptions list]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/virtio/virtio-balloon.c | 33 +++++++++++++++++++++++++++++++--
monitor/hmp-cmds.c | 30 +++++++++++++++++++++++++++++-
qapi/machine.json | 22 +++++++++++++++++++++-
3 files changed, 81 insertions(+), 4 deletions(-)
qapi/pragma.json | 1 +
4 files changed, 82 insertions(+), 4 deletions(-)
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index ae7867a8db..956e3f4e46 100644
index 73ac5eb675..bbfe7eca62 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -820,8 +820,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
@@ -806,8 +806,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
{
VirtIOBalloon *dev = opaque;
@@ -58,10 +61,10 @@ index ae7867a8db..956e3f4e46 100644
static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index f4ef58d257..c8b97909e7 100644
index 01b789a79e..480b798963 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -698,7 +698,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
@@ -696,7 +696,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
return;
}
@@ -99,10 +102,10 @@ index f4ef58d257..c8b97909e7 100644
qapi_free_BalloonInfo(info);
}
diff --git a/qapi/machine.json b/qapi/machine.json
index 157712f006..34035c25d1 100644
index b9228a5e46..10e77a9af3 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1018,10 +1018,30 @@
@@ -1054,9 +1054,29 @@
# @actual: the logical size of the VM in bytes
# Formula used: logical_vm_size = vm_ram_size - balloon_size
#
@@ -123,7 +126,6 @@ index 157712f006..34035c25d1 100644
+# @max_mem: amount of memory (in bytes) assigned to the guest
+#
# Since: 0.14
#
##
-{ 'struct': 'BalloonInfo', 'data': {'actual': 'int' } }
+{ 'struct': 'BalloonInfo',
@@ -134,3 +136,15 @@ index 157712f006..34035c25d1 100644
##
# @query-balloon:
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 29233db825..f2097b9020 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -37,6 +37,7 @@
'member-name-exceptions': [ # visible in:
'ACPISlotType', # query-acpi-ospm-status
'AcpiTableOptions', # -acpitable
+ 'BalloonInfo', # query-balloon
'BlkdebugEvent', # blockdev-add, -blockdev
'BlkdebugSetStateOptions', # blockdev-add, -blockdev
'BlockDeviceInfo', # query-block

View File

@@ -13,10 +13,10 @@ Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 216fdfaf3a..8f8d5d5276 100644
index 4f4ab30f8c..76fff60a6b 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -98,6 +98,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
@@ -99,6 +99,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
info->hotpluggable_cpus = mc->has_hotpluggable_cpus;
info->numa_mem_supported = mc->numa_mem_supported;
info->deprecated = !!mc->deprecation_reason;
@@ -30,10 +30,10 @@ index 216fdfaf3a..8f8d5d5276 100644
info->default_cpu_type = g_strdup(mc->default_cpu_type);
info->has_default_cpu_type = true;
diff --git a/qapi/machine.json b/qapi/machine.json
index 34035c25d1..cf120ac343 100644
index 10e77a9af3..9156103c8f 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -141,6 +141,8 @@
@@ -138,6 +138,8 @@
#
# @is-default: whether the machine is default
#
@@ -42,7 +42,7 @@ index 34035c25d1..cf120ac343 100644
# @cpu-max: maximum number of CPUs supported by the machine type
# (since 1.5)
#
@@ -162,7 +164,7 @@
@@ -159,7 +161,7 @@
##
{ 'struct': 'MachineInfo',
'data': { 'name': 'str', '*alias': 'str',

View File

@@ -12,10 +12,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 8 insertions(+)
diff --git a/qapi/ui.json b/qapi/ui.json
index cba8665b73..081115ea8a 100644
index 0abba3e930..bf8f441227 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -333,11 +333,14 @@
@@ -310,11 +310,14 @@
#
# @channels: a list of @SpiceChannel for each active spice channel
#
@@ -28,10 +28,10 @@ index cba8665b73..081115ea8a 100644
'*tls-port': 'int', '*auth': 'str', '*compiled-version': 'str',
+ '*ticket': 'str',
'mouse-mode': 'SpiceQueryMouseMode', '*channels': ['SpiceChannel']},
'if': 'defined(CONFIG_SPICE)' }
'if': 'CONFIG_SPICE' }
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 840cf56923..96be349635 100644
index 37774f1c0a..367f77f2b4 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -534,6 +534,11 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)

View File

@@ -0,0 +1,281 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 13 Oct 2022 11:33:50 +0200
Subject: [PATCH] PVE: add IOChannel implementation for savevm-async
based on migration/channel-block.c and the implementation that was
present in migration/savevm-async.c before QEMU 7.1.
Passes along read/write requests to the given BlockBackend, while
ensuring that a read request going beyond the end results in a
graceful short read.
Additionally, allows tracking the current position from the outside
(intended to be used for progress tracking).
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/channel-savevm-async.c | 182 +++++++++++++++++++++++++++++++
migration/channel-savevm-async.h | 51 +++++++++
migration/meson.build | 1 +
3 files changed, 234 insertions(+)
create mode 100644 migration/channel-savevm-async.c
create mode 100644 migration/channel-savevm-async.h
diff --git a/migration/channel-savevm-async.c b/migration/channel-savevm-async.c
new file mode 100644
index 0000000000..06d5484778
--- /dev/null
+++ b/migration/channel-savevm-async.c
@@ -0,0 +1,182 @@
+/*
+ * QIO Channel implementation to be used by savevm-async QMP calls
+ */
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "qapi/error.h"
+#include "sysemu/block-backend.h"
+#include "trace.h"
+
+QIOChannelSavevmAsync *
+qio_channel_savevm_async_new(BlockBackend *be, size_t *bs_pos)
+{
+ QIOChannelSavevmAsync *ioc;
+
+ ioc = QIO_CHANNEL_SAVEVM_ASYNC(object_new(TYPE_QIO_CHANNEL_SAVEVM_ASYNC));
+
+ bdrv_ref(blk_bs(be));
+ ioc->be = be;
+ ioc->bs_pos = bs_pos;
+
+ return ioc;
+}
+
+
+static void
+qio_channel_savevm_async_finalize(Object *obj)
+{
+ QIOChannelSavevmAsync *ioc = QIO_CHANNEL_SAVEVM_ASYNC(obj);
+
+ if (ioc->be) {
+ bdrv_unref(blk_bs(ioc->be));
+ ioc->be = NULL;
+ }
+ ioc->bs_pos = NULL;
+}
+
+
+static ssize_t
+qio_channel_savevm_async_readv(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ int **fds,
+ size_t *nfds,
+ Error **errp)
+{
+ QIOChannelSavevmAsync *saioc = QIO_CHANNEL_SAVEVM_ASYNC(ioc);
+ BlockBackend *be = saioc->be;
+ int64_t maxlen = blk_getlength(be);
+ QEMUIOVector qiov;
+ size_t size;
+ int ret;
+
+ qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
+
+ if (*saioc->bs_pos >= maxlen) {
+ error_setg(errp, "cannot read beyond maxlen");
+ return -1;
+ }
+
+ if (maxlen - *saioc->bs_pos < qiov.size) {
+ size = maxlen - *saioc->bs_pos;
+ } else {
+ size = qiov.size;
+ }
+
+ // returns 0 on success
+ ret = blk_preadv(be, *saioc->bs_pos, size, &qiov, 0);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "blk_preadv failed");
+ return -1;
+ }
+
+ *saioc->bs_pos += size;
+ return size;
+}
+
+
+static ssize_t
+qio_channel_savevm_async_writev(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ int *fds,
+ size_t nfds,
+ int flags,
+ Error **errp)
+{
+ QIOChannelSavevmAsync *saioc = QIO_CHANNEL_SAVEVM_ASYNC(ioc);
+ BlockBackend *be = saioc->be;
+ QEMUIOVector qiov;
+ int ret;
+
+ qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
+
+ if (qemu_in_coroutine()) {
+ ret = blk_co_pwritev(be, *saioc->bs_pos, qiov.size, &qiov, 0);
+ aio_wait_kick();
+ } else {
+ ret = blk_pwritev(be, *saioc->bs_pos, qiov.size, &qiov, 0);
+ }
+
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "blk(_co)_pwritev failed");
+ return -1;
+ }
+
+ *saioc->bs_pos += qiov.size;
+ return qiov.size;
+}
+
+
+static int
+qio_channel_savevm_async_set_blocking(QIOChannel *ioc,
+ bool enabled,
+ Error **errp)
+{
+ if (!enabled) {
+ error_setg(errp, "Non-blocking mode not supported for savevm-async");
+ return -1;
+ }
+ return 0;
+}
+
+
+static int
+qio_channel_savevm_async_close(QIOChannel *ioc,
+ Error **errp)
+{
+ QIOChannelSavevmAsync *saioc = QIO_CHANNEL_SAVEVM_ASYNC(ioc);
+ int rv = bdrv_flush(blk_bs(saioc->be));
+
+ if (rv < 0) {
+ error_setg_errno(errp, -rv, "Unable to flush VMState");
+ return -1;
+ }
+
+ bdrv_unref(blk_bs(saioc->be));
+ saioc->be = NULL;
+ saioc->bs_pos = NULL;
+
+ return 0;
+}
+
+
+static void
+qio_channel_savevm_async_set_aio_fd_handler(QIOChannel *ioc,
+ AioContext *ctx,
+ IOHandler *io_read,
+ IOHandler *io_write,
+ void *opaque)
+{
+ // if channel-block starts doing something, check if this needs adaptation
+}
+
+
+static void
+qio_channel_savevm_async_class_init(ObjectClass *klass,
+ void *class_data G_GNUC_UNUSED)
+{
+ QIOChannelClass *ioc_klass = QIO_CHANNEL_CLASS(klass);
+
+ ioc_klass->io_writev = qio_channel_savevm_async_writev;
+ ioc_klass->io_readv = qio_channel_savevm_async_readv;
+ ioc_klass->io_set_blocking = qio_channel_savevm_async_set_blocking;
+ ioc_klass->io_close = qio_channel_savevm_async_close;
+ ioc_klass->io_set_aio_fd_handler = qio_channel_savevm_async_set_aio_fd_handler;
+}
+
+static const TypeInfo qio_channel_savevm_async_info = {
+ .parent = TYPE_QIO_CHANNEL,
+ .name = TYPE_QIO_CHANNEL_SAVEVM_ASYNC,
+ .instance_size = sizeof(QIOChannelSavevmAsync),
+ .instance_finalize = qio_channel_savevm_async_finalize,
+ .class_init = qio_channel_savevm_async_class_init,
+};
+
+static void
+qio_channel_savevm_async_register_types(void)
+{
+ type_register_static(&qio_channel_savevm_async_info);
+}
+
+type_init(qio_channel_savevm_async_register_types);
diff --git a/migration/channel-savevm-async.h b/migration/channel-savevm-async.h
new file mode 100644
index 0000000000..17ae2cb261
--- /dev/null
+++ b/migration/channel-savevm-async.h
@@ -0,0 +1,51 @@
+/*
+ * QEMU I/O channels driver for savevm-async.c
+ *
+ * Copyright (c) 2022 Proxmox Server Solutions
+ *
+ * Authors:
+ * Fiona Ebner (f.ebner@proxmox.com)
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QIO_CHANNEL_SAVEVM_ASYNC_H
+#define QIO_CHANNEL_SAVEVM_ASYNC_H
+
+#include "io/channel.h"
+#include "qom/object.h"
+
+#define TYPE_QIO_CHANNEL_SAVEVM_ASYNC "qio-channel-savevm-async"
+OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelSavevmAsync, QIO_CHANNEL_SAVEVM_ASYNC)
+
+
+/**
+ * QIOChannelSavevmAsync:
+ *
+ * The QIOChannelBlock object provides a channel implementation that is able to
+ * perform I/O on any BlockBackend whose BlockDriverState directly contains a
+ * VMState (as opposed to indirectly, like qcow2). It allows tracking the
+ * current position from the outside.
+ */
+struct QIOChannelSavevmAsync {
+ QIOChannel parent;
+ BlockBackend *be;
+ size_t *bs_pos;
+};
+
+
+/**
+ * qio_channel_savevm_async_new:
+ * @be: the block backend
+ * @bs_pos: used to keep track of the IOChannels current position
+ *
+ * Create a new IO channel object that can perform I/O on a BlockBackend object
+ * whose BlockDriverState directly contains a VMState.
+ *
+ * Returns: the new channel object
+ */
+QIOChannelSavevmAsync *
+qio_channel_savevm_async_new(BlockBackend *be, size_t *bs_pos);
+
+#endif /* QIO_CHANNEL_SAVEVM_ASYNC_H */
diff --git a/migration/meson.build b/migration/meson.build
index 690487cf1a..8cac83c06c 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -13,6 +13,7 @@ softmmu_ss.add(files(
'block-dirty-bitmap.c',
'channel.c',
'channel-block.c',
+ 'channel-savevm-async.c',
'colo-failover.c',
'colo.c',
'exec.c',

View File

@@ -23,26 +23,30 @@ Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[improve aborting]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[FE: further improve aborting
adapt to removal of QEMUFileOps
improve condition for entering final stage]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hmp-commands-info.hx | 13 +
hmp-commands.hx | 33 ++
hmp-commands.hx | 33 +++
include/migration/snapshot.h | 2 +
include/monitor/hmp.h | 5 +
migration/meson.build | 1 +
migration/savevm-async.c | 598 +++++++++++++++++++++++++++++++++++
migration/savevm-async.c | 538 +++++++++++++++++++++++++++++++++++
monitor/hmp-cmds.c | 57 ++++
qapi/migration.json | 34 ++
qapi/misc.json | 32 ++
qapi/migration.json | 34 +++
qapi/misc.json | 32 +++
qemu-options.hx | 12 +
softmmu/vl.c | 10 +
11 files changed, 797 insertions(+)
11 files changed, 737 insertions(+)
create mode 100644 migration/savevm-async.c
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 27206ac049..e6dd3be07a 100644
index 754b1e8408..489c524e9e 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -551,6 +551,19 @@ SRST
@@ -540,6 +540,19 @@ SRST
Show current migration parameters.
ERST
@@ -63,13 +67,13 @@ index 27206ac049..e6dd3be07a 100644
.name = "balloon",
.args_type = "",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index d78e4cfc47..42203dbe92 100644
index 673e39a697..039be0033d 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1744,3 +1744,36 @@ ERST
.help = "start a round of guest dirty rate measurement",
.cmd = hmp_calc_dirty_rate,
},
@@ -1815,3 +1815,36 @@ SRST
Dump the FDT in dtb format to *filename*.
ERST
#endif
+
+ {
+ .name = "savevm-start",
@@ -115,10 +119,10 @@ index e72083b117..c846d37806 100644
+
#endif
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 3baa1058e2..1247d7362a 100644
index dfbc0c9a2f..440f86aba8 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -25,6 +25,7 @@ void hmp_info_status(Monitor *mon, const QDict *qdict);
@@ -27,6 +27,7 @@ void hmp_info_status(Monitor *mon, const QDict *qdict);
void hmp_info_uuid(Monitor *mon, const QDict *qdict);
void hmp_info_chardev(Monitor *mon, const QDict *qdict);
void hmp_info_mice(Monitor *mon, const QDict *qdict);
@@ -126,7 +130,7 @@ index 3baa1058e2..1247d7362a 100644
void hmp_info_migrate(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict);
@@ -79,6 +80,10 @@ void hmp_netdev_add(Monitor *mon, const QDict *qdict);
@@ -81,6 +82,10 @@ void hmp_netdev_add(Monitor *mon, const QDict *qdict);
void hmp_netdev_del(Monitor *mon, const QDict *qdict);
void hmp_getfd(Monitor *mon, const QDict *qdict);
void hmp_closefd(Monitor *mon, const QDict *qdict);
@@ -135,13 +139,13 @@ index 3baa1058e2..1247d7362a 100644
+void hmp_delete_drive_snapshot(Monitor *mon, const QDict *qdict);
+void hmp_savevm_end(Monitor *mon, const QDict *qdict);
void hmp_sendkey(Monitor *mon, const QDict *qdict);
void hmp_screendump(Monitor *mon, const QDict *qdict);
void coroutine_fn hmp_screendump(Monitor *mon, const QDict *qdict);
void hmp_chardev_add(Monitor *mon, const QDict *qdict);
diff --git a/migration/meson.build b/migration/meson.build
index f8714dcb15..ea9aedeefc 100644
index 8cac83c06c..0842d00cd2 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -23,6 +23,7 @@ softmmu_ss.add(files(
@@ -24,6 +24,7 @@ softmmu_ss.add(files(
'multifd-zlib.c',
'postcopy-ram.c',
'savevm.c',
@@ -151,11 +155,12 @@ index f8714dcb15..ea9aedeefc 100644
), gnutls)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
index 0000000000..79a0cda906
index 0000000000..dc30558713
--- /dev/null
+++ b/migration/savevm-async.c
@@ -0,0 +1,598 @@
@@ -0,0 +1,538 @@
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "migration/migration.h"
+#include "migration/savevm.h"
+#include "migration/snapshot.h"
@@ -179,9 +184,6 @@ index 0000000000..79a0cda906
+
+/* #define DEBUG_SAVEVM_STATE */
+
+/* used while emulated sync operation in progress */
+#define NOT_DONE -EINPROGRESS
+
+#ifdef DEBUG_SAVEVM_STATE
+#define DPRINTF(fmt, ...) \
+ do { printf("savevm-async: " fmt, ## __VA_ARGS__); } while (0)
@@ -210,7 +212,7 @@ index 0000000000..79a0cda906
+ int64_t total_time;
+ QEMUBH *finalize_bh;
+ Coroutine *co;
+ QemuCoSleep *target_close_wait;
+ QemuCoSleep target_close_wait;
+} snap_state;
+
+static bool savevm_aborted(void)
@@ -268,6 +270,7 @@ index 0000000000..79a0cda906
+
+ if (snap_state.file) {
+ ret = qemu_fclose(snap_state.file);
+ snap_state.file = NULL;
+ }
+
+ if (snap_state.target) {
@@ -285,9 +288,7 @@ index 0000000000..79a0cda906
+ blk_unref(snap_state.target);
+ snap_state.target = NULL;
+
+ if (snap_state.target_close_wait) {
+ qemu_co_sleep_wake(snap_state.target_close_wait);
+ }
+ qemu_co_sleep_wake(&snap_state.target_close_wait);
+ }
+
+ return ret;
@@ -313,60 +314,6 @@ index 0000000000..79a0cda906
+ snap_state.state = SAVE_STATE_ERROR;
+}
+
+static int block_state_close(void *opaque, Error **errp)
+{
+ snap_state.file = NULL;
+ return blk_flush(snap_state.target);
+}
+
+typedef struct BlkRwCo {
+ int64_t offset;
+ QEMUIOVector *qiov;
+ ssize_t ret;
+} BlkRwCo;
+
+static void coroutine_fn block_state_write_entry(void *opaque) {
+ BlkRwCo *rwco = opaque;
+ rwco->ret = blk_co_pwritev(snap_state.target, rwco->offset, rwco->qiov->size,
+ rwco->qiov, 0);
+ aio_wait_kick();
+}
+
+static ssize_t block_state_writev_buffer(void *opaque, struct iovec *iov,
+ int iovcnt, int64_t pos, Error **errp)
+{
+ QEMUIOVector qiov;
+ BlkRwCo rwco;
+
+ assert(pos == snap_state.bs_pos);
+ rwco = (BlkRwCo) {
+ .offset = pos,
+ .qiov = &qiov,
+ .ret = NOT_DONE,
+ };
+
+ qemu_iovec_init_external(&qiov, iov, iovcnt);
+
+ if (qemu_in_coroutine()) {
+ block_state_write_entry(&rwco);
+ } else {
+ Coroutine *co = qemu_coroutine_create(&block_state_write_entry, &rwco);
+ bdrv_coroutine_enter(blk_bs(snap_state.target), co);
+ BDRV_POLL_WHILE(blk_bs(snap_state.target), rwco.ret == NOT_DONE);
+ }
+ if (rwco.ret < 0) {
+ return rwco.ret;
+ }
+
+ snap_state.bs_pos += qiov.size;
+ return qiov.size;
+}
+
+static const QEMUFileOps block_file_ops = {
+ .writev_buffer = block_state_writev_buffer,
+ .close = block_state_close,
+};
+
+static void process_savevm_finalize(void *opaque)
+{
+ int ret;
@@ -401,7 +348,7 @@ index 0000000000..79a0cda906
+ (void)qemu_savevm_state_complete_precopy(snap_state.file, false, false);
+ ret = qemu_file_get_error(snap_state.file);
+ if (ret < 0) {
+ save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
+ save_snapshot_error("qemu_savevm_state_complete_precopy error %d", ret);
+ }
+ }
+
@@ -422,8 +369,11 @@ index 0000000000..79a0cda906
+ } else if (snap_state.state == SAVE_STATE_ACTIVE) {
+ snap_state.state = SAVE_STATE_COMPLETED;
+ } else if (aborted) {
+ save_snapshot_error("process_savevm_cleanup: found aborted state: %d",
+ snap_state.state);
+ /*
+ * If there was an error, there's no need to set a new one here.
+ * If the snapshot was canceled, leave setting the state to
+ * qmp_savevm_end(), which is waked by save_snapshot_cleanup().
+ */
+ } else {
+ save_snapshot_error("process_savevm_cleanup: invalid state: %d",
+ snap_state.state);
@@ -464,9 +414,16 @@ index 0000000000..79a0cda906
+
+ pending_size = pend_precopy + pend_compatible + pend_postcopy;
+
+ maxlen = blk_getlength(snap_state.target) - 30*1024*1024;
+ /*
+ * A guest reaching this cutoff is dirtying lots of RAM. It should be
+ * large enough so that the guest can't dirty this much between the
+ * check and the guest actually being stopped, but it should be small
+ * enough to avoid long downtimes for non-hibernation snapshots.
+ */
+ maxlen = blk_getlength(snap_state.target) - 100*1024*1024;
+
+ if (pending_size > 400000 && snap_state.bs_pos + pending_size < maxlen) {
+ /* Note that there is no progress for pend_postcopy when iterating */
+ if (pending_size - pend_postcopy > 400000 && snap_state.bs_pos + pending_size < maxlen) {
+ ret = qemu_savevm_state_iterate(snap_state.file, false);
+ if (ret < 0) {
+ save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
@@ -549,6 +506,7 @@ index 0000000000..79a0cda906
+ snap_state.bs_pos = 0;
+ snap_state.total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+ snap_state.blocker = NULL;
+ snap_state.target_close_wait = (QemuCoSleep){ .to_wake = NULL };
+
+ if (snap_state.error) {
+ error_free(snap_state.error);
@@ -575,7 +533,9 @@ index 0000000000..79a0cda906
+ goto restart;
+ }
+
+ snap_state.file = qemu_fopen_ops(&snap_state, &block_file_ops);
+ QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
+ &snap_state.bs_pos));
+ snap_state.file = qemu_file_new_output(ioc);
+
+ if (!snap_state.file) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
@@ -653,9 +613,8 @@ index 0000000000..79a0cda906
+ * call exits the statefile will be closed and can be removed immediately */
+ DPRINTF("savevm-end: waiting for cleanup\n");
+ timeout = 30L * 1000 * 1000 * 1000;
+ qemu_co_sleep_ns_wakeable(snap_state.target_close_wait,
+ qemu_co_sleep_ns_wakeable(&snap_state.target_close_wait,
+ QEMU_CLOCK_REALTIME, timeout);
+ snap_state.target_close_wait = NULL;
+ if (snap_state.target) {
+ save_snapshot_error("timeout waiting for target file close in "
+ "qmp_savevm_end");
@@ -664,6 +623,11 @@ index 0000000000..79a0cda906
+ return;
+ }
+
+ // File closed and no other error, so ensure next snapshot can be started.
+ if (snap_state.state != SAVE_STATE_ERROR) {
+ snap_state.state = SAVE_STATE_DONE;
+ }
+
+ DPRINTF("savevm-end: cleanup done\n");
+}
+
@@ -683,27 +647,6 @@ index 0000000000..79a0cda906
+ true, name, errp);
+}
+
+static ssize_t loadstate_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
+ size_t size, Error **errp)
+{
+ BlockBackend *be = opaque;
+ int64_t maxlen = blk_getlength(be);
+ if (pos > maxlen) {
+ return -EIO;
+ }
+ if ((pos + size) > maxlen) {
+ size = maxlen - pos - 1;
+ }
+ if (size == 0) {
+ return 0;
+ }
+ return blk_pread(be, pos, buf, size);
+}
+
+static const QEMUFileOps loadstate_file_ops = {
+ .get_buffer = loadstate_get_buffer,
+};
+
+int load_snapshot_from_blockdev(const char *filename, Error **errp)
+{
+ BlockBackend *be;
@@ -711,6 +654,7 @@ index 0000000000..79a0cda906
+ Error *blocker = NULL;
+
+ QEMUFile *f;
+ size_t bs_pos = 0;
+ int ret = -EINVAL;
+
+ be = blk_new_open(filename, NULL, NULL, 0, &local_err);
@@ -724,7 +668,7 @@ index 0000000000..79a0cda906
+ blk_op_block_all(be, blocker);
+
+ /* restore the VM state */
+ f = qemu_fopen_ops(be, &loadstate_file_ops);
+ f = qemu_file_new_input(QIO_CHANNEL(qio_channel_savevm_async_new(be, &bs_pos)));
+ if (!f) {
+ error_setg(errp, "Could not open VM state file");
+ goto the_end;
@@ -754,10 +698,10 @@ index 0000000000..79a0cda906
+ return ret;
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index c8b97909e7..64a84cf4ee 100644
index 480b798963..cfebfd1db5 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1961,6 +1961,63 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
@@ -1906,6 +1906,63 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
hmp_handle_error(mon, err);
}
@@ -822,10 +766,10 @@ index c8b97909e7..64a84cf4ee 100644
{
IOThreadInfoList *info_list = qmp_query_iothreads(NULL);
diff --git a/qapi/migration.json b/qapi/migration.json
index 1124a2dda8..3d72b3e3f3 100644
index 88ecf86ac8..4435866379 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -247,6 +247,40 @@
@@ -261,6 +261,40 @@
'*compression': 'CompressionStats',
'*socket-address': ['SocketAddress'] } }
@@ -867,10 +811,10 @@ index 1124a2dda8..3d72b3e3f3 100644
# @query-migrate:
#
diff --git a/qapi/misc.json b/qapi/misc.json
index 5c2ca3b556..9bc14e1032 100644
index 27ef5a2b20..b3ce75dcae 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -431,6 +431,38 @@
@@ -435,6 +435,38 @@
##
{ 'command': 'query-fdsets', 'returns': ['FdsetInfo'] }
@@ -910,10 +854,10 @@ index 5c2ca3b556..9bc14e1032 100644
# @CommandLineParameterType:
#
diff --git a/qemu-options.hx b/qemu-options.hx
index 83aa59a920..002ba697e9 100644
index 7f99d15b23..54efb127c4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4131,6 +4131,18 @@ SRST
@@ -4391,6 +4391,18 @@ SRST
Start right away with a saved state (``loadvm`` in monitor)
ERST
@@ -933,21 +877,21 @@ index 83aa59a920..002ba697e9 100644
DEF("daemonize", 0, QEMU_OPTION_daemonize, \
"-daemonize daemonize QEMU after initializing\n", QEMU_ARCH_ALL)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 5ca11e7469..220c67cd32 100644
index 5f7f6ca981..21f067d115 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -150,6 +150,7 @@ static const char *incoming;
static const char *loadvm;
static const char *accelerators;
@@ -164,6 +164,7 @@ static const char *accelerators;
static bool have_custom_ram_size;
static const char *ram_memdev_id;
static QDict *machine_opts_dict;
+static const char *loadstate;
static QTAILQ_HEAD(, ObjectOption) object_opts = QTAILQ_HEAD_INITIALIZER(object_opts);
static ram_addr_t maxram_size;
static uint64_t ram_slots;
@@ -2700,6 +2701,12 @@ void qmp_x_exit_preconfig(Error **errp)
autostart = 0;
exit(1);
}
static QTAILQ_HEAD(, DeviceOption) device_opts = QTAILQ_HEAD_INITIALIZER(device_opts);
static int display_remote;
@@ -2607,6 +2608,12 @@ void qmp_x_exit_preconfig(Error **errp)
if (loadvm) {
load_snapshot(loadvm, NULL, false, NULL, &error_fatal);
+ } else if (loadstate) {
+ Error *local_err = NULL;
+ if (load_snapshot_from_blockdev(loadstate, &local_err) < 0) {
@@ -957,7 +901,7 @@ index 5ca11e7469..220c67cd32 100644
}
if (replay_mode != REPLAY_MODE_NONE) {
replay_vmstate_init();
@@ -3238,6 +3245,9 @@ void qemu_init(int argc, char **argv, char **envp)
@@ -3151,6 +3158,9 @@ void qemu_init(int argc, char **argv)
case QEMU_OPTION_loadvm:
loadvm = optarg;
break;

View File

@@ -10,17 +10,19 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[increase max IOV count in QEMUFile to actually write more data]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to removal of QEMUFileOps]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/qemu-file.c | 38 +++++++++++++++++++++++++-------------
migration/qemu-file.h | 1 +
migration/savevm-async.c | 4 ++--
3 files changed, 28 insertions(+), 15 deletions(-)
migration/qemu-file.c | 49 +++++++++++++++++++++++++++-------------
migration/qemu-file.h | 2 ++
migration/savevm-async.c | 5 ++--
3 files changed, 38 insertions(+), 18 deletions(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 6338d8e2ff..6697a93a7e 100644
index 2d5f74ffc2..9fd97e6fe1 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -30,8 +30,8 @@
@@ -31,8 +31,8 @@
#include "trace.h"
#include "qapi/error.h"
@@ -30,9 +32,9 @@ index 6338d8e2ff..6697a93a7e 100644
+#define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 256)
struct QEMUFile {
const QEMUFileOps *ops;
@@ -45,7 +45,8 @@ struct QEMUFile {
when reading */
const QEMUFileHooks *hooks;
@@ -55,7 +55,8 @@ struct QEMUFile {
int buf_index;
int buf_size; /* 0 when writing */
- uint8_t buf[IO_BUF_SIZE];
@@ -41,53 +43,76 @@ index 6338d8e2ff..6697a93a7e 100644
DECLARE_BITMAP(may_free, MAX_IOV_SIZE);
struct iovec iov[MAX_IOV_SIZE];
@@ -103,7 +104,7 @@ bool qemu_file_mode_is_not_valid(const char *mode)
@@ -127,7 +128,9 @@ bool qemu_file_mode_is_not_valid(const char *mode)
return false;
}
-QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops, bool has_ioc)
+QEMUFile *qemu_fopen_ops_sized(void *opaque, const QEMUFileOps *ops, bool has_ioc, size_t buffer_size)
-static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
+static QEMUFile *qemu_file_new_impl(QIOChannel *ioc,
+ bool is_writable,
+ size_t buffer_size)
{
QEMUFile *f;
@@ -112,9 +113,17 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops, bool has_ioc)
f->opaque = opaque;
f->ops = ops;
f->has_ioc = has_ioc;
@@ -136,6 +139,8 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
object_ref(ioc);
f->ioc = ioc;
f->is_writable = is_writable;
+ f->buf_allocated_size = buffer_size;
+ f->buf = malloc(buffer_size);
+
return f;
}
@@ -146,17 +151,27 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
*/
QEMUFile *qemu_file_get_return_path(QEMUFile *f)
{
- return qemu_file_new_impl(f->ioc, !f->is_writable);
+ return qemu_file_new_impl(f->ioc, !f->is_writable, DEFAULT_IO_BUF_SIZE);
}
+QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops, bool has_ioc)
+{
+ return qemu_fopen_ops_sized(opaque, ops, has_ioc, DEFAULT_IO_BUF_SIZE);
QEMUFile *qemu_file_new_output(QIOChannel *ioc)
{
- return qemu_file_new_impl(ioc, true);
+ return qemu_file_new_impl(ioc, true, DEFAULT_IO_BUF_SIZE);
+}
+
+QEMUFile *qemu_file_new_output_sized(QIOChannel *ioc, size_t buffer_size)
+{
+ return qemu_file_new_impl(ioc, true, buffer_size);
}
QEMUFile *qemu_file_new_input(QIOChannel *ioc)
{
- return qemu_file_new_impl(ioc, false);
+ return qemu_file_new_impl(ioc, false, DEFAULT_IO_BUF_SIZE);
+}
+
+QEMUFile *qemu_file_new_input_sized(QIOChannel *ioc, size_t buffer_size)
+{
+ return qemu_file_new_impl(ioc, false, buffer_size);
}
void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
{
@@ -349,7 +358,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
@@ -414,7 +429,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
do {
len = qio_channel_read(f->ioc,
(char *)f->buf + pending,
- IO_BUF_SIZE - pending,
+ f->buf_allocated_size - pending,
&local_error);
if (len == QIO_CHANNEL_ERR_BLOCK) {
if (qemu_in_coroutine()) {
@@ -464,6 +479,8 @@ int qemu_fclose(QEMUFile *f)
}
g_clear_pointer(&f->ioc, object_unref);
len = f->ops->get_buffer(f->opaque, f->buf + pending, f->pos,
- IO_BUF_SIZE - pending, &local_error);
+ f->buf_allocated_size - pending, &local_error);
if (len > 0) {
f->buf_size += len;
f->pos += len;
@@ -389,6 +398,9 @@ int qemu_fclose(QEMUFile *f)
ret = ret2;
}
}
+
+ free(f->buf);
+
/* If any error was spotted before closing, we should report it
* instead of the close() return value.
*/
@@ -443,7 +455,7 @@ static void add_buf_to_iovec(QEMUFile *f, size_t len)
@@ -518,7 +535,7 @@ static void add_buf_to_iovec(QEMUFile *f, size_t len)
{
if (!add_to_iovec(f, f->buf + f->buf_index, len, false)) {
f->buf_index += len;
@@ -96,7 +121,7 @@ index 6338d8e2ff..6697a93a7e 100644
qemu_fflush(f);
}
}
@@ -469,7 +481,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
@@ -544,7 +561,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
}
while (size > 0) {
@@ -105,7 +130,7 @@ index 6338d8e2ff..6697a93a7e 100644
if (l > size) {
l = size;
}
@@ -516,8 +528,8 @@ size_t qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t size, size_t offset)
@@ -591,8 +608,8 @@ size_t qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t size, size_t offset)
size_t index;
assert(!qemu_file_is_writable(f));
@@ -116,7 +141,7 @@ index 6338d8e2ff..6697a93a7e 100644
/* The 1st byte to read from */
index = f->buf_index + offset;
@@ -567,7 +579,7 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
@@ -642,7 +659,7 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
size_t res;
uint8_t *src;
@@ -125,7 +150,7 @@ index 6338d8e2ff..6697a93a7e 100644
if (res == 0) {
return done;
}
@@ -601,7 +613,7 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
@@ -676,7 +693,7 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
*/
size_t qemu_get_buffer_in_place(QEMUFile *f, uint8_t **buf, size_t size)
{
@@ -134,7 +159,7 @@ index 6338d8e2ff..6697a93a7e 100644
size_t res;
uint8_t *src = NULL;
@@ -626,7 +638,7 @@ int qemu_peek_byte(QEMUFile *f, int offset)
@@ -701,7 +718,7 @@ int qemu_peek_byte(QEMUFile *f, int offset)
int index = f->buf_index + offset;
assert(!qemu_file_is_writable(f));
@@ -143,7 +168,7 @@ index 6338d8e2ff..6697a93a7e 100644
if (index >= f->buf_size) {
qemu_fill_buffer(f);
@@ -778,7 +790,7 @@ static int qemu_compress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
@@ -853,7 +870,7 @@ static int qemu_compress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
const uint8_t *p, size_t size)
{
@@ -153,36 +178,39 @@ index 6338d8e2ff..6697a93a7e 100644
if (blen < compressBound(size)) {
return -1;
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 3f36d4dc8c..67501fd9cf 100644
index fa13d04d78..914f1a63a8 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -121,6 +121,7 @@ typedef struct QEMUFileHooks {
@@ -63,7 +63,9 @@ typedef struct QEMUFileHooks {
} QEMUFileHooks;
QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops, bool has_ioc);
+QEMUFile *qemu_fopen_ops_sized(void *opaque, const QEMUFileOps *ops, bool has_ioc, size_t buffer_size);
QEMUFile *qemu_file_new_input(QIOChannel *ioc);
+QEMUFile *qemu_file_new_input_sized(QIOChannel *ioc, size_t buffer_size);
QEMUFile *qemu_file_new_output(QIOChannel *ioc);
+QEMUFile *qemu_file_new_output_sized(QIOChannel *ioc, size_t buffer_size);
void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks);
int qemu_get_fd(QEMUFile *f);
int qemu_fclose(QEMUFile *f);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 79a0cda906..970ee3b3fc 100644
index dc30558713..a38e7351c1 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -418,7 +418,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
goto restart;
}
@@ -374,7 +374,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
- snap_state.file = qemu_fopen_ops(&snap_state, &block_file_ops);
+ snap_state.file = qemu_fopen_ops_sized(&snap_state, &block_file_ops, false, 4 * 1024 * 1024);
QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
&snap_state.bs_pos));
- snap_state.file = qemu_file_new_output(ioc);
+ snap_state.file = qemu_file_new_output_sized(ioc, 4 * 1024 * 1024);
if (!snap_state.file) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
@@ -567,7 +567,7 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
@@ -507,7 +507,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
blk_op_block_all(be, blocker);
/* restore the VM state */
- f = qemu_fopen_ops(be, &loadstate_file_ops);
+ f = qemu_fopen_ops_sized(be, &loadstate_file_ops, false, 4 * 1024 * 1024);
- f = qemu_file_new_input(QIO_CHANNEL(qio_channel_savevm_async_new(be, &bs_pos)));
+ f = qemu_file_new_input_sized(QIO_CHANNEL(qio_channel_savevm_async_new(be, &bs_pos)),
+ 4 * 1024 * 1024);
if (!f) {
error_setg(errp, "Could not open VM state file");
goto the_end;

View File

@@ -4,17 +4,19 @@ Date: Mon, 6 Apr 2020 12:16:47 +0200
Subject: [PATCH] PVE: block: add the zeroinit block driver filter
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[adapt to changed function signatures]
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/meson.build | 1 +
block/zeroinit.c | 196 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 197 insertions(+)
block/zeroinit.c | 198 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 199 insertions(+)
create mode 100644 block/zeroinit.c
diff --git a/block/meson.build b/block/meson.build
index 0450914c7a..7a0bc3df09 100644
index b7c68b83a3..020a89ae07 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -41,6 +41,7 @@ block_ss.add(files(
@@ -43,6 +43,7 @@ block_ss.add(files(
'vmdk.c',
'vpc.c',
'write-threshold.c',
@@ -24,10 +26,10 @@ index 0450914c7a..7a0bc3df09 100644
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
diff --git a/block/zeroinit.c b/block/zeroinit.c
new file mode 100644
index 0000000000..5529627f7e
index 0000000000..b60e1b84dc
--- /dev/null
+++ b/block/zeroinit.c
@@ -0,0 +1,196 @@
@@ -0,0 +1,198 @@
+/*
+ * Filter to fake a zero-initialized block device.
+ *
@@ -107,7 +109,9 @@ index 0000000000..5529627f7e
+
+ /* Open the raw file */
+ bs->file = bdrv_open_child(qemu_opt_get(opts, "x-next"), options, "next",
+ bs, &child_of_bds, BDRV_CHILD_FILTERED, false, &local_err);
+ bs, &child_of_bds,
+ BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
+ false, &local_err);
+ if (local_err) {
+ ret = -EINVAL;
+ error_propagate(errp, local_err);
@@ -138,22 +142,22 @@ index 0000000000..5529627f7e
+}
+
+static int coroutine_fn zeroinit_co_preadv(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
+}
+
+static int coroutine_fn zeroinit_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+ int count, BdrvRequestFlags flags)
+ int64_t bytes, BdrvRequestFlags flags)
+{
+ BDRVZeroinitState *s = bs->opaque;
+ if (offset >= s->extents)
+ return 0;
+ return bdrv_pwrite_zeroes(bs->file, offset, count, flags);
+ return bdrv_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+static int coroutine_fn zeroinit_co_pwritev(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVZeroinitState *s = bs->opaque;
+ int64_t extents = offset + bytes;
@@ -174,9 +178,9 @@ index 0000000000..5529627f7e
+}
+
+static int coroutine_fn zeroinit_co_pdiscard(BlockDriverState *bs,
+ int64_t offset, int count)
+ int64_t offset, int64_t bytes)
+{
+ return bdrv_co_pdiscard(bs->file, offset, count);
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
+}
+
+static int zeroinit_co_truncate(BlockDriverState *bs, int64_t offset,

View File

@@ -14,12 +14,12 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 11 insertions(+)
diff --git a/qemu-options.hx b/qemu-options.hx
index 002ba697e9..a05959b9f1 100644
index 54efb127c4..ef456d03ec 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1005,6 +1005,9 @@ DEFHEADING()
@@ -1147,6 +1147,9 @@ backend describes how QEMU handles the data.
DEFHEADING(Block device options:)
ERST
+DEF("id", HAS_ARG, QEMU_OPTION_id,
+ "-id n set the VMID", QEMU_ARCH_ALL)
@@ -28,10 +28,10 @@ index 002ba697e9..a05959b9f1 100644
"-fda/-fdb file use 'file' as floppy disk 0/1 image\n", QEMU_ARCH_ALL)
DEF("fdb", HAS_ARG, QEMU_OPTION_fdb, "", QEMU_ARCH_ALL)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 220c67cd32..d87cf6e103 100644
index 21f067d115..9d737e7914 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2736,6 +2736,7 @@ void qemu_init(int argc, char **argv, char **envp)
@@ -2643,6 +2643,7 @@ void qemu_init(int argc, char **argv)
MachineClass *machine_class;
bool userconfig = true;
FILE *vmstate_dump_file = NULL;
@@ -39,9 +39,9 @@ index 220c67cd32..d87cf6e103 100644
qemu_add_opts(&qemu_drive_opts);
qemu_add_drive_opts(&qemu_legacy_drive_opts);
@@ -3360,6 +3361,13 @@ void qemu_init(int argc, char **argv, char **envp)
case QEMU_OPTION_smp:
machine_parse_property_opt(qemu_find_opts("smp-opts"), "smp", optarg, &error_fatal);
@@ -3263,6 +3264,13 @@ void qemu_init(int argc, char **argv)
machine_parse_property_opt(qemu_find_opts("smp-opts"),
"smp", optarg);
break;
+ case QEMU_OPTION_id:
+ vm_id = strtol(optarg, (char **)&optarg, 10);

View File

@@ -13,10 +13,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 42 insertions(+), 20 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 3ac5177cbb..907aa3f22e 100644
index 9a16d86344..bd68df57ad 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2443,6 +2443,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2487,6 +2487,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
int fd;
uint64_t perm, shared;
int result = 0;
@@ -24,7 +24,7 @@ index 3ac5177cbb..907aa3f22e 100644
/* Validate options and set default values */
assert(options->driver == BLOCKDEV_DRIVER_FILE);
@@ -2483,19 +2484,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2527,19 +2528,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
@@ -59,7 +59,7 @@ index 3ac5177cbb..907aa3f22e 100644
}
/* Clear the file by truncating it to 0 */
@@ -2549,13 +2553,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2593,13 +2597,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
}
out_unlock:
@@ -82,7 +82,7 @@ index 3ac5177cbb..907aa3f22e 100644
}
out_close:
@@ -2580,6 +2586,7 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
@@ -2624,6 +2630,7 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
@@ -90,7 +90,7 @@ index 3ac5177cbb..907aa3f22e 100644
/* Skip file: protocol prefix */
strstart(filename, "file:", &filename);
@@ -2602,6 +2609,18 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
@@ -2646,6 +2653,18 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
return -EINVAL;
}
@@ -109,7 +109,7 @@ index 3ac5177cbb..907aa3f22e 100644
options = (BlockdevCreateOptions) {
.driver = BLOCKDEV_DRIVER_FILE,
.u.file = {
@@ -2613,6 +2632,8 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
@@ -2657,6 +2676,8 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
.nocow = nocow,
.has_extent_size_hint = has_extent_size_hint,
.extent_size_hint = extent_size_hint,
@@ -119,10 +119,10 @@ index 3ac5177cbb..907aa3f22e 100644
};
return raw_co_create(&options, errp);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6356a63695..fdfa579d00 100644
index 7daaf545be..9e902b96bb 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4341,7 +4341,8 @@
@@ -4624,7 +4624,8 @@
'size': 'size',
'*preallocation': 'PreallocMode',
'*nocow': 'bool',

View File

@@ -26,10 +26,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 2cf2f321f9..e0f857820d 100644
index 8d34caa31d..2df9037c4e 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -107,7 +107,8 @@ GlobalProperty hw_compat_4_0[] = {
@@ -132,7 +132,8 @@ GlobalProperty hw_compat_4_0[] = {
{ "virtio-vga", "edid", "false" },
{ "virtio-gpu-device", "edid", "false" },
{ "virtio-device", "use-started", "false" },

View File

@@ -19,10 +19,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
4 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 8f8d5d5276..370e66d9cc 100644
index 76fff60a6b..ec9201fb9a 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -102,6 +102,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
@@ -103,6 +103,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
if (strcmp(mc->name, MACHINE_GET_CLASS(current_machine)->name) == 0) {
info->has_is_current = true;
info->is_current = true;
@@ -36,23 +36,23 @@ index 8f8d5d5276..370e66d9cc 100644
if (mc->default_cpu_type) {
diff --git a/include/hw/boards.h b/include/hw/boards.h
index accd6eff35..1b16728389 100644
index 90f1dd3aeb..14d60520d9 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -205,6 +205,8 @@ struct MachineClass {
@@ -230,6 +230,8 @@ struct MachineClass {
const char *desc;
const char *deprecation_reason;
+ const char *pve_version;
+
void (*init)(MachineState *state);
void (*reset)(MachineState *state);
void (*reset)(MachineState *state, ShutdownCause reason);
void (*wakeup)(MachineState *state);
diff --git a/qapi/machine.json b/qapi/machine.json
index cf120ac343..a6f483af4f 100644
index 9156103c8f..f4fb1b2c9c 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -160,6 +160,8 @@
@@ -157,6 +157,8 @@
#
# @default-ram-id: the default ID of initial RAM memory backend (since 5.2)
#
@@ -61,7 +61,7 @@ index cf120ac343..a6f483af4f 100644
# Since: 1.2
##
{ 'struct': 'MachineInfo',
@@ -167,7 +169,7 @@
@@ -164,7 +166,7 @@
'*is-default': 'bool', '*is-current': 'bool', 'cpu-max': 'int',
'hotpluggable-cpus': 'bool', 'numa-mem-supported': 'bool',
'deprecated': 'bool', '*default-cpu-type': 'str',
@@ -71,10 +71,10 @@ index cf120ac343..a6f483af4f 100644
##
# @query-machines:
diff --git a/softmmu/vl.c b/softmmu/vl.c
index d87cf6e103..e9d40065bc 100644
index 9d737e7914..a64eee2fad 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -1621,6 +1621,7 @@ static const QEMUOption *lookup_opt(int argc, char **argv,
@@ -1578,6 +1578,7 @@ static const QEMUOption *lookup_opt(int argc, char **argv,
static MachineClass *select_machine(QDict *qdict, Error **errp)
{
const char *optarg = qdict_get_try_str(qdict, "type");
@@ -82,7 +82,7 @@ index d87cf6e103..e9d40065bc 100644
GSList *machines = object_class_get_list(TYPE_MACHINE, false);
MachineClass *machine_class;
Error *local_err = NULL;
@@ -1638,6 +1639,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
@@ -1595,6 +1596,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
}
}
@@ -94,7 +94,7 @@ index d87cf6e103..e9d40065bc 100644
g_slist_free(machines);
if (local_err) {
error_append_hint(&local_err, "Use -machine help to list supported machines\n");
@@ -3312,12 +3318,31 @@ void qemu_init(int argc, char **argv, char **envp)
@@ -3205,12 +3211,31 @@ void qemu_init(int argc, char **argv)
case QEMU_OPTION_machine:
{
bool help;

View File

@@ -0,0 +1,59 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 2 Mar 2022 08:35:05 +0100
Subject: [PATCH] block/backup: move bcs bitmap initialization to job creation
For backing up the state of multiple disks from the same time, a job
for each disk has to be created. It's convenient if the jobs don't
have to be started at the same time and if operation of the VM can be
resumed after job creation. This would lead to a window between job
creation and running the job, where writes can happen. But no writes
should happen between setting up the copy-before-write filter and
setting up the block copy state bitmap, because then new writes would
just pass through.
Commit 06e0a9c16405c0a4c1eca33cf286cc04c42066a2 moved initalization of
the bitmap to setting up the copy-before-write filter when sync_mode
is not MIRROR_SYNC_MODE_BITMAP. Ensure that the bitmap is initialized
upon job creation for the remaining case too, by moving the
backup_init_bcs_bitmap call to backup_job_create.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/backup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 6a9ad97a53..9b0151c5be 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -237,8 +237,8 @@ static void backup_init_bcs_bitmap(BackupBlockJob *job)
true);
} else if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
/*
- * We can't hog the coroutine to initialize this thoroughly.
- * Set a flag and resume work when we are able to yield safely.
+ * Initialization is costly here. Simply set a flag and let the
+ * backup_run coroutine resume work once it can yield safely.
*/
block_copy_set_skip_unallocated(job->bcs, true);
}
@@ -252,8 +252,6 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
int ret;
- backup_init_bcs_bitmap(s);
-
if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
int64_t offset = 0;
int64_t count;
@@ -492,6 +490,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
&error_abort);
+ backup_init_bcs_bitmap(job);
+
return &job->common;
error:

View File

@@ -9,45 +9,45 @@ Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/meson.build | 2 +
meson.build | 5 +
vma-reader.c | 857 ++++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 790 ++++++++++++++++++++++++++++++++++++++++++
vma.c | 851 +++++++++++++++++++++++++++++++++++++++++++++
vma-reader.c | 859 ++++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 791 ++++++++++++++++++++++++++++++++++++++++++
vma.c | 849 +++++++++++++++++++++++++++++++++++++++++++++
vma.h | 150 ++++++++
6 files changed, 2655 insertions(+)
6 files changed, 2656 insertions(+)
create mode 100644 vma-reader.c
create mode 100644 vma-writer.c
create mode 100644 vma.c
create mode 100644 vma.h
diff --git a/block/meson.build b/block/meson.build
index 7a0bc3df09..9ce9246194 100644
index 020a89ae07..4feae20e37 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -44,6 +44,8 @@ block_ss.add(files(
@@ -46,6 +46,8 @@ block_ss.add(files(
'zeroinit.c',
), zstd, zlib, gnutls)
+block_ss.add(files('../vma-writer.c'), libuuid)
+
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
softmmu_ss.add(files('block-ram-registrar.c'))
block_ss.add(when: 'CONFIG_QCOW1', if_true: files('qcow.c'))
diff --git a/meson.build b/meson.build
index b3e7ec0e92..cc46eabb42 100644
index 5c6b5a1c75..e8cf7e3d78 100644
--- a/meson.build
+++ b/meson.build
@@ -1064,6 +1064,8 @@ keyutils = dependency('libkeyutils', required: false,
@@ -1525,6 +1525,8 @@ keyutils = dependency('libkeyutils', required: false,
has_gettid = cc.has_function('gettid')
+libuuid = cc.find_library('uuid', required: true)
+
# Malloc tests
malloc = []
@@ -2743,6 +2745,9 @@ if have_tools
qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
dependencies: [blockdev, qemuutil, gnutls], install: true)
# libselinux
selinux = dependency('libselinux',
required: get_option('selinux'),
@@ -3596,6 +3598,9 @@ if have_tools
dependencies: [blockdev, qemuutil, gnutls, selinux],
install: true)
+ vma = executable('vma', files('vma.c', 'vma-reader.c') + genh,
+ dependencies: [authz, block, crypto, io, qom], install: true)
@@ -57,10 +57,10 @@ index b3e7ec0e92..cc46eabb42 100644
subdir('contrib/elf2dmp')
diff --git a/vma-reader.c b/vma-reader.c
new file mode 100644
index 0000000000..2b1d1cdab3
index 0000000000..e65f1e8415
--- /dev/null
+++ b/vma-reader.c
@@ -0,0 +1,857 @@
@@ -0,0 +1,859 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -78,7 +78,6 @@ index 0000000000..2b1d1cdab3
+#include <glib.h>
+#include <uuid/uuid.h>
+
+#include "qemu-common.h"
+#include "qemu/timer.h"
+#include "qemu/ratelimit.h"
+#include "vma.h"
@@ -255,6 +254,9 @@ index 0000000000..2b1d1cdab3
+ if (vmar->rstate[i].bitmap) {
+ g_free(vmar->rstate[i].bitmap);
+ }
+ if (vmar->rstate[i].target) {
+ blk_unref(vmar->rstate[i].target);
+ }
+ }
+
+ if (vmar->md5csum) {
@@ -586,7 +588,7 @@ index 0000000000..2b1d1cdab3
+ }
+ }
+ } else {
+ int res = blk_pwrite(target, sector_num * BDRV_SECTOR_SIZE, buf, nb_sectors * BDRV_SECTOR_SIZE, 0);
+ int res = blk_pwrite(target, sector_num * BDRV_SECTOR_SIZE, nb_sectors * BDRV_SECTOR_SIZE, buf, 0);
+ if (res < 0) {
+ error_setg(errp, "blk_pwrite to %s failed (%d)",
+ bdrv_get_device_name(blk_bs(target)), res);
@@ -920,10 +922,10 @@ index 0000000000..2b1d1cdab3
+
diff --git a/vma-writer.c b/vma-writer.c
new file mode 100644
index 0000000000..11d8321ffd
index 0000000000..df4b20793d
--- /dev/null
+++ b/vma-writer.c
@@ -0,0 +1,790 @@
@@ -0,0 +1,791 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -947,6 +949,7 @@ index 0000000000..11d8321ffd
+#include "qemu/main-loop.h"
+#include "qemu/coroutine.h"
+#include "qemu/cutils.h"
+#include "qemu/memalign.h"
+
+#define DEBUG_VMA 0
+
@@ -1130,9 +1133,9 @@ index 0000000000..11d8321ffd
+ assert(qemu_in_coroutine());
+ AioContext *ctx = qemu_get_current_aio_context();
+ aio_set_fd_handler(ctx, fd, false, NULL, (IOHandler *)qemu_coroutine_enter,
+ NULL, qemu_coroutine_self());
+ NULL, NULL, qemu_coroutine_self());
+ qemu_coroutine_yield();
+ aio_set_fd_handler(ctx, fd, false, NULL, NULL, NULL, NULL);
+ aio_set_fd_handler(ctx, fd, false, NULL, NULL, NULL, NULL, NULL);
+}
+
+static ssize_t coroutine_fn
@@ -1716,10 +1719,10 @@ index 0000000000..11d8321ffd
+}
diff --git a/vma.c b/vma.c
new file mode 100644
index 0000000000..df542b7732
index 0000000000..e8dffb43e0
--- /dev/null
+++ b/vma.c
@@ -0,0 +1,851 @@
@@ -0,0 +1,849 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -1737,11 +1740,11 @@ index 0000000000..df542b7732
+#include <glib.h>
+
+#include "vma.h"
+#include "qemu-common.h"
+#include "qemu/module.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
+#include "qemu/cutils.h"
+#include "qemu/memalign.h"
+#include "qapi/qmp/qdict.h"
+#include "sysemu/block-backend.h"
+
@@ -2031,8 +2034,6 @@ index 0000000000..df542b7732
+ int vmstate_fd = -1;
+ guint8 vmstate_stream = 0;
+
+ BlockBackend *blk = NULL;
+
+ for (i = 1; i < 255; i++) {
+ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i);
+ if (di && (strcmp(di->devname, "vmstate") == 0)) {
@@ -2053,6 +2054,8 @@ index 0000000000..df542b7732
+ int flags = BDRV_O_RDWR;
+ bool write_zero = true;
+
+ BlockBackend *blk = NULL;
+
+ if (readmap) {
+ RestoreMap *map;
+ map = (RestoreMap *)g_hash_table_lookup(devmap, di->devname);
@@ -2165,8 +2168,6 @@ index 0000000000..df542b7732
+
+ vma_reader_destroy(vmar);
+
+ blk_unref(blk);
+
+ bdrv_close_all();
+
+ return ret;

View File

@@ -10,20 +10,20 @@ Subject: [PATCH] PVE-Backup: add backup-dump block driver
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/backup-dump.c | 168 ++++++++++++++++++++++++++++++++++++++
block/backup.c | 32 +++-----
block/meson.build | 1 +
include/block/block_int.h | 35 ++++++++
job.c | 3 +-
5 files changed, 216 insertions(+), 23 deletions(-)
block/backup-dump.c | 167 +++++++++++++++++++++++++++++++
block/backup.c | 30 ++----
block/meson.build | 1 +
include/block/block_int-common.h | 35 +++++++
job.c | 3 +-
5 files changed, 213 insertions(+), 23 deletions(-)
create mode 100644 block/backup-dump.c
diff --git a/block/backup-dump.c b/block/backup-dump.c
new file mode 100644
index 0000000000..93d7f46950
index 0000000000..04718a94e2
--- /dev/null
+++ b/block/backup-dump.c
@@ -0,0 +1,168 @@
@@ -0,0 +1,167 @@
+/*
+ * BlockDriver to send backup data stream to a callback function
+ *
@@ -35,7 +35,6 @@ index 0000000000..93d7f46950
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qom/object_interfaces.h"
+#include "block/block_int.h"
+
@@ -193,16 +192,16 @@ index 0000000000..93d7f46950
+ return bs;
+}
diff --git a/block/backup.c b/block/backup.c
index bd3614ce70..8bae9b060e 100644
index 9b0151c5be..6e8f6e67b3 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -31,28 +31,6 @@
@@ -29,28 +29,6 @@
#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
#include "block/copy-before-write.h"
-typedef struct BackupBlockJob {
- BlockJob common;
- BlockDriverState *backup_top;
- BlockDriverState *cbw;
- BlockDriverState *source_bs;
- BlockDriverState *target_bs;
-
@@ -225,11 +224,10 @@ index bd3614ce70..8bae9b060e 100644
static const BlockJobDriver backup_job_driver;
static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
@@ -504,6 +482,16 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
goto error;
@@ -454,6 +432,14 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
+ cluster_size = backup_calculate_cluster_size(target, errp);
cluster_size = block_copy_cluster_size(bcs);
+ if (cluster_size < 0) {
+ goto error;
+ }
@@ -238,12 +236,11 @@ index bd3614ce70..8bae9b060e 100644
+ if (bdrv_get_info(bs, &bdi) == 0) {
+ cluster_size = MAX(cluster_size, bdi.cluster_size);
+ }
+
/*
* If source is in backing chain of target assume that target is going to be
* used for "image fleecing", i.e. it should represent a kind of snapshot of
if (perf->max_chunk && perf->max_chunk < cluster_size) {
error_setg(errp, "Required max-chunk (%" PRIi64 ") is less than backup "
diff --git a/block/meson.build b/block/meson.build
index 9ce9246194..19bc2b7cbb 100644
index 4feae20e37..0d7023fc82 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -4,6 +4,7 @@ block_ss.add(files(
@@ -251,13 +248,13 @@ index 9ce9246194..19bc2b7cbb 100644
'amend.c',
'backup.c',
+ 'backup-dump.c',
'backup-top.c',
'copy-before-write.c',
'blkdebug.c',
'blklogwrites.c',
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 11442893d0..8f6135e6a5 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 31ae91e56e..37b64bcd93 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -26,6 +26,7 @@
#include "block/accounting.h"
@@ -266,13 +263,13 @@ index 11442893d0..8f6135e6a5 100644
#include "block/aio-wait.h"
#include "qemu/queue.h"
#include "qemu/coroutine.h"
@@ -63,6 +64,40 @@
@@ -64,6 +65,40 @@
#define BLOCK_PROBE_BUF_SIZE 512
+typedef int BackupDumpFunc(void *opaque, uint64_t offset, uint64_t bytes, const void *buf);
+
+BlockDriverState *bdrv_backuo_dump_create(
+BlockDriverState *bdrv_backup_dump_create(
+ int dump_cb_block_size,
+ uint64_t byte_size,
+ BackupDumpFunc *dump_cb,
@@ -284,7 +281,7 @@ index 11442893d0..8f6135e6a5 100644
+typedef struct BlockCopyState BlockCopyState;
+typedef struct BackupBlockJob {
+ BlockJob common;
+ BlockDriverState *backup_top;
+ BlockDriverState *cbw;
+ BlockDriverState *source_bs;
+ BlockDriverState *target_bs;
+
@@ -308,16 +305,16 @@ index 11442893d0..8f6135e6a5 100644
BDRV_TRACKED_READ,
BDRV_TRACKED_WRITE,
diff --git a/job.c b/job.c
index e7a5d28854..44eec9a441 100644
index 72d57f0934..93e22d180b 100644
--- a/job.c
+++ b/job.c
@@ -269,7 +269,8 @@ static bool job_started(Job *job)
return job->co;
@@ -330,7 +330,8 @@ static bool job_started_locked(Job *job)
}
-static bool job_should_pause(Job *job)
+bool job_should_pause(Job *job);
+bool job_should_pause(Job *job)
/* Called with job_mutex held. */
-static bool job_should_pause_locked(Job *job)
+bool job_should_pause_locked(Job *job);
+bool job_should_pause_locked(Job *job)
{
return job->pause_count > 0;
}

View File

@@ -7,32 +7,34 @@ Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
[PVE-Backup: avoid coroutines to fix AIO freeze, cleanups]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls
adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 5 +
block/monitor/block-hmp-cmds.c | 33 ++
blockdev.c | 1 +
hmp-commands-info.hx | 14 +
hmp-commands.hx | 29 +
include/block/block_int.h | 2 +-
include/monitor/hmp.h | 3 +
meson.build | 1 +
monitor/hmp-cmds.c | 44 ++
proxmox-backup-client.c | 176 ++++++
proxmox-backup-client.h | 59 ++
pve-backup.c | 959 +++++++++++++++++++++++++++++++++
pve-backup.c | 956 +++++++++++++++++++++++++++++++++
qapi/block-core.json | 109 ++++
qapi/common.json | 13 +
qapi/machine.json | 15 +-
15 files changed, 1449 insertions(+), 14 deletions(-)
14 files changed, 1445 insertions(+), 13 deletions(-)
create mode 100644 proxmox-backup-client.c
create mode 100644 proxmox-backup-client.h
create mode 100644 pve-backup.c
diff --git a/block/meson.build b/block/meson.build
index 19bc2b7cbb..9e433daf2e 100644
index 0d7023fc82..e995ae72b9 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -46,6 +46,11 @@ block_ss.add(files(
@@ -48,6 +48,11 @@ block_ss.add(files(
), zstd, zlib, gnutls)
block_ss.add(files('../vma-writer.c'), libuuid)
@@ -43,9 +45,9 @@ index 19bc2b7cbb..9e433daf2e 100644
+
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
softmmu_ss.add(files('block-ram-registrar.c'))
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 3e6670c963..1e29681d30 100644
index b6135e9bfe..477044c54a 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1015,3 +1015,36 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
@@ -86,7 +88,7 @@ index 3e6670c963..1e29681d30 100644
+ hmp_handle_error(mon, error);
+}
diff --git a/blockdev.c b/blockdev.c
index b6f797b41f..84e9b898be 100644
index 756e980889..bc8d67b290 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -36,6 +36,7 @@
@@ -98,10 +100,10 @@ index b6f797b41f..84e9b898be 100644
#include "monitor/monitor.h"
#include "qemu/error-report.h"
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index e6dd3be07a..15ddecada1 100644
index 489c524e9e..bc1d46d845 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -497,6 +497,20 @@ SRST
@@ -486,6 +486,20 @@ SRST
Show the current VM UUID.
ERST
@@ -123,10 +125,10 @@ index e6dd3be07a..15ddecada1 100644
{
.name = "usernet",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 42203dbe92..7faba36b39 100644
index 039be0033d..fcf9461295 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -99,6 +99,35 @@ ERST
@@ -101,6 +101,35 @@ ERST
SRST
``block_stream``
Copy data from a backing file into a block device.
@@ -162,24 +164,11 @@ index 42203dbe92..7faba36b39 100644
ERST
{
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8f6135e6a5..4a572a2e34 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -66,7 +66,7 @@
typedef int BackupDumpFunc(void *opaque, uint64_t offset, uint64_t bytes, const void *buf);
-BlockDriverState *bdrv_backuo_dump_create(
+BlockDriverState *bdrv_backup_dump_create(
int dump_cb_block_size,
uint64_t byte_size,
BackupDumpFunc *dump_cb,
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 1247d7362a..8d3df46c93 100644
index 440f86aba8..350527e599 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -29,6 +29,7 @@ void hmp_info_savevm(Monitor *mon, const QDict *qdict);
@@ -31,6 +31,7 @@ void hmp_info_savevm(Monitor *mon, const QDict *qdict);
void hmp_info_migrate(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict);
@@ -187,7 +176,7 @@ index 1247d7362a..8d3df46c93 100644
void hmp_info_cpus(Monitor *mon, const QDict *qdict);
void hmp_info_vnc(Monitor *mon, const QDict *qdict);
void hmp_info_spice(Monitor *mon, const QDict *qdict);
@@ -72,6 +73,8 @@ void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
@@ -74,6 +75,8 @@ void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
void hmp_set_password(Monitor *mon, const QDict *qdict);
void hmp_expire_password(Monitor *mon, const QDict *qdict);
void hmp_change(Monitor *mon, const QDict *qdict);
@@ -197,22 +186,22 @@ index 1247d7362a..8d3df46c93 100644
void hmp_device_add(Monitor *mon, const QDict *qdict);
void hmp_device_del(Monitor *mon, const QDict *qdict);
diff --git a/meson.build b/meson.build
index cc46eabb42..7d7e474313 100644
index e8cf7e3d78..782756162c 100644
--- a/meson.build
+++ b/meson.build
@@ -1065,6 +1065,7 @@ keyutils = dependency('libkeyutils', required: false,
@@ -1526,6 +1526,7 @@ keyutils = dependency('libkeyutils', required: false,
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
+libproxmox_backup_qemu = cc.find_library('proxmox_backup_qemu', required: true)
# Malloc tests
# libselinux
selinux = dependency('libselinux',
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 64a84cf4ee..7efcd2d641 100644
index cfebfd1db5..a40b25e906 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -195,6 +195,50 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
@@ -199,6 +199,50 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
qapi_free_MouseInfoList(mice_list);
}
@@ -260,9 +249,9 @@ index 64a84cf4ee..7efcd2d641 100644
+ qapi_free_BackupStatus(info);
+}
+
static char *SocketAddress_to_str(SocketAddress *addr)
void hmp_info_migrate(Monitor *mon, const QDict *qdict)
{
switch (addr->type) {
MigrationInfo *info;
diff --git a/proxmox-backup-client.c b/proxmox-backup-client.c
new file mode 100644
index 0000000000..a8f6653a81
@@ -512,10 +501,10 @@ index 0000000000..1dda8b7d8f
+#endif /* PROXMOX_BACKUP_CLIENT_H */
diff --git a/pve-backup.c b/pve-backup.c
new file mode 100644
index 0000000000..66868dec14
index 0000000000..6af212b9b4
--- /dev/null
+++ b/pve-backup.c
@@ -0,0 +1,959 @@
@@ -0,0 +1,956 @@
+#include "proxmox-backup-client.h"
+#include "vma.h"
+
@@ -593,14 +582,16 @@ index 0000000000..66868dec14
+lookup_active_block_job(PVEBackupDevInfo *di)
+{
+ if (!di->completed && di->bs) {
+ for (BlockJob *job = block_job_next(NULL); job; job = block_job_next(job)) {
+ if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
+ continue;
+ }
+ WITH_JOB_LOCK_GUARD() {
+ for (BlockJob *job = block_job_next_locked(NULL); job; job = block_job_next_locked(job)) {
+ if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
+ continue;
+ }
+
+ BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
+ if (bjob && bjob->source_bs == di->bs) {
+ return job;
+ BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
+ if (bjob && bjob->source_bs == di->bs) {
+ return job;
+ }
+ }
+ }
+ }
@@ -870,10 +861,7 @@ index 0000000000..66868dec14
+ qemu_mutex_unlock(&backup_state.backup_mutex);
+
+ if (next_job) {
+ AioContext *aio_context = next_job->job.aio_context;
+ aio_context_acquire(aio_context);
+ job_cancel_sync(&next_job->job);
+ aio_context_release(aio_context);
+ job_cancel_sync(&next_job->job, true);
+ } else {
+ break;
+ }
@@ -935,7 +923,7 @@ index 0000000000..66868dec14
+ goto out;
+}
+
+bool job_should_pause(Job *job);
+bool job_should_pause_locked(Job *job);
+
+static void pvebackup_run_next_job(void)
+{
@@ -953,18 +941,16 @@ index 0000000000..66868dec14
+ if (job) {
+ qemu_mutex_unlock(&backup_state.backup_mutex);
+
+ AioContext *aio_context = job->job.aio_context;
+ aio_context_acquire(aio_context);
+
+ if (job_should_pause(&job->job)) {
+ bool error_or_canceled = pvebackup_error_or_canceled();
+ if (error_or_canceled) {
+ job_cancel_sync(&job->job);
+ } else {
+ job_resume(&job->job);
+ WITH_JOB_LOCK_GUARD() {
+ if (job_should_pause_locked(&job->job)) {
+ bool error_or_canceled = pvebackup_error_or_canceled();
+ if (error_or_canceled) {
+ job_cancel_sync_locked(&job->job, true);
+ } else {
+ job_resume_locked(&job->job);
+ }
+ }
+ }
+ aio_context_release(aio_context);
+ return;
+ }
+ }
@@ -1148,7 +1134,7 @@ index 0000000000..66868dec14
+
+ ssize_t size = bdrv_getlength(di->bs);
+ if (size < 0) {
+ error_setg_errno(task->errp, -di->size, "bdrv_getlength failed");
+ error_setg_errno(task->errp, -size, "bdrv_getlength failed");
+ goto err;
+ }
+ di->size = size;
@@ -1476,12 +1462,12 @@ index 0000000000..66868dec14
+ return info;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index fdfa579d00..c5d604693f 100644
index 9e902b96bb..c3b6b93472 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -699,6 +699,115 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'] }
@@ -740,6 +740,115 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'],
'allow-preconfig': true }
+##
+# @BackupStatus:
@@ -1596,13 +1582,13 @@ index fdfa579d00..c5d604693f 100644
# @BlockDeviceTimedStats:
#
diff --git a/qapi/common.json b/qapi/common.json
index 7c976296f0..0690b07903 100644
index 356db3f670..aae8a3b682 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -197,3 +197,16 @@
{ 'enum': 'GrabToggleKeys',
'data': [ 'ctrl-ctrl', 'alt-alt', 'shift-shift','meta-meta', 'scrolllock',
'ctrl-scrolllock' ] }
@@ -206,3 +206,16 @@
##
{ 'struct': 'HumanReadableText',
'data': { 'human-readable-text': 'str' } }
+
+##
+# @UuidInfo:
@@ -1617,7 +1603,7 @@ index 7c976296f0..0690b07903 100644
+##
+{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} }
diff --git a/qapi/machine.json b/qapi/machine.json
index a6f483af4f..6effa7ad30 100644
index f4fb1b2c9c..0d6ee836ed 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -4,6 +4,8 @@
@@ -1629,7 +1615,7 @@ index a6f483af4f..6effa7ad30 100644
##
# = Machines
##
@@ -229,19 +231,6 @@
@@ -226,19 +228,6 @@
##
{ 'command': 'query-target', 'returns': 'TargetInfo' }

View File

@@ -7,15 +7,15 @@ Subject: [PATCH] PVE-Backup: pbs-restore - new command to restore from proxmox
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
meson.build | 4 +
pbs-restore.c | 224 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 228 insertions(+)
pbs-restore.c | 223 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 227 insertions(+)
create mode 100644 pbs-restore.c
diff --git a/meson.build b/meson.build
index 7d7e474313..dd1c5bdb4e 100644
index 782756162c..63ea813a9a 100644
--- a/meson.build
+++ b/meson.build
@@ -2749,6 +2749,10 @@ if have_tools
@@ -3602,6 +3602,10 @@ if have_tools
vma = executable('vma', files('vma.c', 'vma-reader.c') + genh,
dependencies: [authz, block, crypto, io, qom], install: true)
@@ -28,10 +28,10 @@ index 7d7e474313..dd1c5bdb4e 100644
subdir('contrib/elf2dmp')
diff --git a/pbs-restore.c b/pbs-restore.c
new file mode 100644
index 0000000000..4d3f925a1b
index 0000000000..2f834cf42e
--- /dev/null
+++ b/pbs-restore.c
@@ -0,0 +1,224 @@
@@ -0,0 +1,223 @@
+/*
+ * Qemu image restore helper for Proxmox Backup
+ *
@@ -50,7 +50,6 @@ index 0000000000..4d3f925a1b
+#include <getopt.h>
+#include <string.h>
+
+#include "qemu-common.h"
+#include "qemu/module.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
@@ -96,7 +95,7 @@ index 0000000000..4d3f925a1b
+ }
+ res = blk_pwrite_zeroes(callback_data->target, offset, data_len, 0);
+ } else {
+ res = blk_pwrite(callback_data->target, offset, data, data_len, 0);
+ res = blk_pwrite(callback_data->target, offset, data_len, data, 0);
+ }
+
+ if (res < 0) {

View File

@@ -29,7 +29,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
6 files changed, 142 insertions(+), 23 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 1e29681d30..3fca3ce3e9 100644
index 477044c54a..556af25861 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1042,6 +1042,7 @@ void hmp_backup(Monitor *mon, const QDict *qdict)
@@ -41,10 +41,10 @@ index 1e29681d30..3fca3ce3e9 100644
false, NULL, false, NULL, !!devlist,
devlist, qdict_haskey(qdict, "speed"), speed, &error);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 7efcd2d641..b2b5f1298b 100644
index a40b25e906..670f783515 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -221,19 +221,42 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
@@ -225,19 +225,42 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "End time: %s", ctime(&info->end_time));
}
@@ -132,7 +132,7 @@ index 1dda8b7d8f..8cbf645b2c 100644
diff --git a/pve-backup.c b/pve-backup.c
index 66868dec14..6cdbd40529 100644
index 3d28975eaa..abd7062afe 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -28,6 +28,8 @@
@@ -162,7 +162,7 @@ index 66868dec14..6cdbd40529 100644
BlockDriverState *target;
} PVEBackupDevInfo;
@@ -105,11 +110,12 @@ static bool pvebackup_error_or_canceled(void)
@@ -107,11 +112,12 @@ static bool pvebackup_error_or_canceled(void)
return error_or_canceled;
}
@@ -176,7 +176,7 @@ index 66868dec14..6cdbd40529 100644
qemu_mutex_unlock(&backup_state.stat.lock);
}
@@ -148,7 +154,8 @@ pvebackup_co_dump_pbs_cb(
@@ -150,7 +156,8 @@ pvebackup_co_dump_pbs_cb(
pvebackup_propagate_error(local_err);
return pbs_res;
} else {
@@ -186,7 +186,7 @@ index 66868dec14..6cdbd40529 100644
}
return size;
@@ -208,11 +215,11 @@ pvebackup_co_dump_vma_cb(
@@ -210,11 +217,11 @@ pvebackup_co_dump_vma_cb(
} else {
if (remaining >= VMA_CLUSTER_SIZE) {
assert(ret == VMA_CLUSTER_SIZE);
@@ -200,7 +200,7 @@ index 66868dec14..6cdbd40529 100644
remaining = 0;
}
}
@@ -248,6 +255,18 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
@@ -250,6 +257,18 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
if (local_err != NULL) {
pvebackup_propagate_error(local_err);
}
@@ -219,7 +219,7 @@ index 66868dec14..6cdbd40529 100644
}
proxmox_backup_disconnect(backup_state.pbs);
@@ -303,6 +322,12 @@ static void pvebackup_complete_cb(void *opaque, int ret)
@@ -305,6 +324,12 @@ static void pvebackup_complete_cb(void *opaque, int ret)
// remove self from job queue
backup_state.di_list = g_list_remove(backup_state.di_list, di);
@@ -232,7 +232,7 @@ index 66868dec14..6cdbd40529 100644
g_free(di);
qemu_mutex_unlock(&backup_state.backup_mutex);
@@ -472,12 +497,18 @@ static bool create_backup_jobs(void) {
@@ -469,12 +494,18 @@ static bool create_backup_jobs(void) {
assert(di->target != NULL);
@@ -253,7 +253,7 @@ index 66868dec14..6cdbd40529 100644
JOB_DEFAULT, pvebackup_complete_cb, di, NULL, &local_err);
aio_context_release(aio_context);
@@ -528,6 +559,8 @@ typedef struct QmpBackupTask {
@@ -525,6 +556,8 @@ typedef struct QmpBackupTask {
const char *fingerprint;
bool has_fingerprint;
int64_t backup_time;
@@ -262,7 +262,7 @@ index 66868dec14..6cdbd40529 100644
bool has_format;
BackupFormat format;
bool has_config_file;
@@ -619,6 +652,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -616,6 +649,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
size_t total = 0;
@@ -270,7 +270,7 @@ index 66868dec14..6cdbd40529 100644
l = di_list;
while (l) {
@@ -656,6 +690,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -653,6 +687,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
int dump_cb_block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE; // Hardcoded (4M)
firewall_name = "fw.conf";
@@ -279,7 +279,7 @@ index 66868dec14..6cdbd40529 100644
char *pbs_err = NULL;
pbs = proxmox_backup_new(
task->backup_file,
@@ -675,7 +711,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -672,7 +708,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
goto err;
}
@@ -289,7 +289,7 @@ index 66868dec14..6cdbd40529 100644
goto err;
/* register all devices */
@@ -686,9 +723,40 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -683,9 +720,40 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *devname = bdrv_get_device_name(di->bs);
@@ -332,7 +332,7 @@ index 66868dec14..6cdbd40529 100644
if (!(di->target = bdrv_backup_dump_create(dump_cb_block_size, di->size, pvebackup_co_dump_pbs_cb, di, task->errp))) {
goto err;
@@ -697,6 +765,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -694,6 +762,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->dev_id = dev_id;
}
} else if (format == BACKUP_FORMAT_VMA) {
@@ -341,7 +341,7 @@ index 66868dec14..6cdbd40529 100644
vmaw = vma_writer_create(task->backup_file, uuid, &local_err);
if (!vmaw) {
if (local_err) {
@@ -724,6 +794,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -721,6 +791,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
}
} else if (format == BACKUP_FORMAT_DIR) {
@@ -350,7 +350,7 @@ index 66868dec14..6cdbd40529 100644
if (mkdir(task->backup_file, 0640) != 0) {
error_setg_errno(task->errp, errno, "can't create directory '%s'\n",
task->backup_file);
@@ -796,8 +868,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -793,8 +865,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
backup_state.stat.total = total;
@@ -361,7 +361,7 @@ index 66868dec14..6cdbd40529 100644
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -821,6 +895,10 @@ err:
@@ -818,6 +892,10 @@ err:
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -372,7 +372,7 @@ index 66868dec14..6cdbd40529 100644
if (di->target) {
bdrv_unref(di->target);
}
@@ -862,6 +940,7 @@ UuidInfo *qmp_backup(
@@ -859,6 +937,7 @@ UuidInfo *qmp_backup(
bool has_fingerprint, const char *fingerprint,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
@@ -380,7 +380,7 @@ index 66868dec14..6cdbd40529 100644
bool has_format, BackupFormat format,
bool has_config_file, const char *config_file,
bool has_firewall_file, const char *firewall_file,
@@ -880,6 +959,8 @@ UuidInfo *qmp_backup(
@@ -877,6 +956,8 @@ UuidInfo *qmp_backup(
.backup_id = backup_id,
.has_backup_time = has_backup_time,
.backup_time = backup_time,
@@ -389,7 +389,7 @@ index 66868dec14..6cdbd40529 100644
.has_format = has_format,
.format = format,
.has_config_file = has_config_file,
@@ -948,10 +1029,14 @@ BackupStatus *qmp_query_backup(Error **errp)
@@ -945,10 +1026,14 @@ BackupStatus *qmp_query_backup(Error **errp)
info->has_total = true;
info->total = backup_state.stat.total;
@@ -405,10 +405,10 @@ index 66868dec14..6cdbd40529 100644
qemu_mutex_unlock(&backup_state.stat.lock);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index c5d604693f..a138ad08d4 100644
index c3b6b93472..992e6c1e3f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -712,8 +712,13 @@
@@ -753,8 +753,13 @@
#
# @total: total amount of bytes involved in the backup process
#
@@ -422,7 +422,7 @@ index c5d604693f..a138ad08d4 100644
# @zero-bytes: amount of 'zero' bytes detected.
#
# @start-time: time (epoch) when backup job started.
@@ -726,8 +731,8 @@
@@ -767,8 +772,8 @@
#
##
{ 'struct': 'BackupStatus',
@@ -433,7 +433,7 @@ index c5d604693f..a138ad08d4 100644
'*start-time': 'int', '*end-time': 'int',
'*backup-file': 'str', '*uuid': 'str' } }
@@ -770,6 +775,8 @@
@@ -811,6 +816,8 @@
#
# @backup-time: backup timestamp (Unix epoch, required for format 'pbs')
#
@@ -442,7 +442,7 @@ index c5d604693f..a138ad08d4 100644
# Returns: the uuid of the backup job
#
##
@@ -780,6 +787,7 @@
@@ -821,6 +828,7 @@
'*fingerprint': 'str',
'*backup-id': 'str',
'*backup-time': 'int',

View File

@@ -14,12 +14,12 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 4 ++-
pve-backup.c | 59 ++++++++++++++++++++++++++--------
pve-backup.c | 57 +++++++++++++++++++++++++++-------
qapi/block-core.json | 6 ++++
3 files changed, 55 insertions(+), 14 deletions(-)
3 files changed, 54 insertions(+), 13 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 3fca3ce3e9..69254396d5 100644
index 556af25861..a09f722fea 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1042,7 +1042,9 @@ void hmp_backup(Monitor *mon, const QDict *qdict)
@@ -34,7 +34,7 @@ index 3fca3ce3e9..69254396d5 100644
false, NULL, false, NULL, !!devlist,
devlist, qdict_haskey(qdict, "speed"), speed, &error);
diff --git a/pve-backup.c b/pve-backup.c
index 6cdbd40529..7527885251 100644
index abd7062afe..e113ab61b9 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -8,6 +8,7 @@
@@ -53,7 +53,7 @@ index 6cdbd40529..7527885251 100644
uint8_t dev_id;
bool completed;
char targetfile[PATH_MAX];
@@ -135,10 +137,13 @@ pvebackup_co_dump_pbs_cb(
@@ -137,10 +139,13 @@ pvebackup_co_dump_pbs_cb(
PVEBackupDevInfo *di = opaque;
assert(backup_state.pbs);
@@ -67,17 +67,24 @@ index 6cdbd40529..7527885251 100644
qemu_co_mutex_lock(&backup_state.dump_callback_mutex);
// avoid deadlock if job is cancelled
@@ -147,16 +152,28 @@ pvebackup_co_dump_pbs_cb(
@@ -149,17 +154,29 @@ pvebackup_co_dump_pbs_cb(
return -1;
}
- pbs_res = proxmox_backup_co_write_data(backup_state.pbs, di->dev_id, buf, start, size, &local_err);
- qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ uint64_t transferred = 0;
+ uint64_t reused = 0;
+ while (transferred < size) {
+ uint64_t left = size - transferred;
+ uint64_t to_transfer = left < di->block_size ? left : di->block_size;
+
- if (pbs_res < 0) {
- pvebackup_propagate_error(local_err);
- return pbs_res;
- } else {
- size_t reused = (pbs_res == 0) ? size : 0;
- pvebackup_add_transfered_bytes(size, !buf ? size : 0, reused);
+ pbs_res = proxmox_backup_co_write_data(backup_state.pbs, di->dev_id,
+ is_zero_block ? NULL : buf + transferred, start + transferred,
+ to_transfer, &local_err);
@@ -90,22 +97,15 @@ index 6cdbd40529..7527885251 100644
+ }
+
+ reused += pbs_res == 0 ? to_transfer : 0;
+ }
+
qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
-
- if (pbs_res < 0) {
- pvebackup_propagate_error(local_err);
- return pbs_res;
- } else {
- size_t reused = (pbs_res == 0) ? size : 0;
- pvebackup_add_transfered_bytes(size, !buf ? size : 0, reused);
- }
+ pvebackup_add_transfered_bytes(size, is_zero_block ? size : 0, reused);
}
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ pvebackup_add_transfered_bytes(size, is_zero_block ? size : 0, reused);
+
return size;
}
@@ -178,6 +195,7 @@ pvebackup_co_dump_vma_cb(
@@ -180,6 +197,7 @@ pvebackup_co_dump_vma_cb(
int ret = -1;
assert(backup_state.vmaw);
@@ -113,7 +113,7 @@ index 6cdbd40529..7527885251 100644
uint64_t remaining = size;
@@ -204,9 +222,7 @@ pvebackup_co_dump_vma_cb(
@@ -206,9 +224,7 @@ pvebackup_co_dump_vma_cb(
qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
++cluster_num;
@@ -124,7 +124,7 @@ index 6cdbd40529..7527885251 100644
if (ret < 0) {
Error *local_err = NULL;
vma_writer_error_propagate(backup_state.vmaw, &local_err);
@@ -569,6 +585,10 @@ typedef struct QmpBackupTask {
@@ -566,6 +582,10 @@ typedef struct QmpBackupTask {
const char *firewall_file;
bool has_devlist;
const char *devlist;
@@ -135,7 +135,7 @@ index 6cdbd40529..7527885251 100644
bool has_speed;
int64_t speed;
Error **errp;
@@ -692,6 +712,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -689,6 +709,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
bool use_dirty_bitmap = task->has_use_dirty_bitmap && task->use_dirty_bitmap;
@@ -143,7 +143,7 @@ index 6cdbd40529..7527885251 100644
char *pbs_err = NULL;
pbs = proxmox_backup_new(
task->backup_file,
@@ -701,8 +722,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -698,8 +719,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
task->has_password ? task->password : NULL,
task->has_keyfile ? task->keyfile : NULL,
task->has_key_password ? task->key_password : NULL,
@@ -155,7 +155,7 @@ index 6cdbd40529..7527885251 100644
if (!pbs) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
@@ -721,6 +744,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -718,6 +741,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -164,7 +164,7 @@ index 6cdbd40529..7527885251 100644
const char *devname = bdrv_get_device_name(di->bs);
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
@@ -941,6 +966,8 @@ UuidInfo *qmp_backup(
@@ -938,6 +963,8 @@ UuidInfo *qmp_backup(
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
bool has_use_dirty_bitmap, bool use_dirty_bitmap,
@@ -173,7 +173,7 @@ index 6cdbd40529..7527885251 100644
bool has_format, BackupFormat format,
bool has_config_file, const char *config_file,
bool has_firewall_file, const char *firewall_file,
@@ -951,6 +978,8 @@ UuidInfo *qmp_backup(
@@ -948,6 +975,8 @@ UuidInfo *qmp_backup(
.backup_file = backup_file,
.has_password = has_password,
.password = password,
@@ -182,7 +182,7 @@ index 6cdbd40529..7527885251 100644
.has_key_password = has_key_password,
.key_password = key_password,
.has_fingerprint = has_fingerprint,
@@ -961,6 +990,10 @@ UuidInfo *qmp_backup(
@@ -958,6 +987,10 @@ UuidInfo *qmp_backup(
.backup_time = backup_time,
.has_use_dirty_bitmap = has_use_dirty_bitmap,
.use_dirty_bitmap = use_dirty_bitmap,
@@ -194,10 +194,10 @@ index 6cdbd40529..7527885251 100644
.format = format,
.has_config_file = has_config_file,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a138ad08d4..a75f1b4687 100644
index 992e6c1e3f..5ac6276dc1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -777,6 +777,10 @@
@@ -818,6 +818,10 @@
#
# @use-dirty-bitmap: use dirty bitmap to detect incremental changes since last job (optional for format 'pbs')
#
@@ -208,7 +208,7 @@ index a138ad08d4..a75f1b4687 100644
# Returns: the uuid of the backup job
#
##
@@ -788,6 +792,8 @@
@@ -829,6 +833,8 @@
'*backup-id': 'str',
'*backup-time': 'int',
'*use-dirty-bitmap': 'bool',

View File

@@ -7,20 +7,24 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[error cleanups, file_open implementation]
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures
make pbs_co_preadv return values consistent with QEMU]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 3 +
block/pbs.c | 271 +++++++++++++++++++++++++++++++++++++++++++
block/pbs.c | 276 +++++++++++++++++++++++++++++++++++++++++++
configure | 9 ++
meson.build | 1 +
qapi/block-core.json | 13 +++
5 files changed, 297 insertions(+)
meson.build | 2 +-
qapi/block-core.json | 13 ++
qapi/pragma.json | 1 +
6 files changed, 303 insertions(+), 1 deletion(-)
create mode 100644 block/pbs.c
diff --git a/block/meson.build b/block/meson.build
index 9e433daf2e..e3ed5ac97c 100644
index e995ae72b9..7ef2fa72d5 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -51,6 +51,9 @@ block_ss.add(files(
@@ -53,6 +53,9 @@ block_ss.add(files(
'../pve-backup.c',
), libproxmox_backup_qemu)
@@ -29,13 +33,13 @@ index 9e433daf2e..e3ed5ac97c 100644
+
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
softmmu_ss.add(files('block-ram-registrar.c'))
diff --git a/block/pbs.c b/block/pbs.c
new file mode 100644
index 0000000000..78dad0dcc4
index 0000000000..9d1f1f39d4
--- /dev/null
+++ b/block/pbs.c
@@ -0,0 +1,271 @@
@@ -0,0 +1,276 @@
+/*
+ * Proxmox Backup Server read-only block driver
+ */
@@ -232,20 +236,25 @@ index 0000000000..78dad0dcc4
+}
+
+static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVPBSState *s = bs->opaque;
+ int ret;
+ char *pbs_error = NULL;
+ uint8_t *buf = malloc(bytes);
+
+ if (offset < 0 || bytes < 0) {
+ fprintf(stderr, "unexpected negative 'offset' or 'bytes' value!\n");
+ return -EIO;
+ }
+
+ ReadCallbackData rcb = {
+ .co = qemu_coroutine_self(),
+ .ctx = bdrv_get_aio_context(bs),
+ };
+
+ proxmox_restore_read_image_at_async(s->conn, s->aid, buf, offset, bytes,
+ proxmox_restore_read_image_at_async(s->conn, s->aid, buf, (uint64_t)offset, (uint64_t)bytes,
+ read_callback, (void *) &rcb, &ret, &pbs_error);
+
+ qemu_coroutine_yield();
@@ -259,12 +268,12 @@ index 0000000000..78dad0dcc4
+ qemu_iovec_from_buf(qiov, 0, buf, bytes);
+ free(buf);
+
+ return ret;
+ return 0;
+}
+
+static coroutine_fn int pbs_co_pwritev(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ fprintf(stderr, "pbs-bdrv: cannot write to backup file, make sure "
+ "any attached disk devices are set to read-only!\n");
@@ -308,71 +317,72 @@ index 0000000000..78dad0dcc4
+
+block_init(bdrv_pbs_init);
diff --git a/configure b/configure
index 6e308ed77f..869e97c72f 100755
index 26c7bc5154..c587e986c7 100755
--- a/configure
+++ b/configure
@@ -428,6 +428,7 @@ vdi=${default_feature:-yes}
vvfat=${default_feature:-yes}
qed=${default_feature:-yes}
parallels=${default_feature:-yes}
@@ -285,6 +285,7 @@ linux_user=""
bsd_user=""
pie=""
coroutine=""
+pbs_bdrv="yes"
libxml2="auto"
debug_mutex="no"
libpmem="auto"
@@ -1486,6 +1487,10 @@ for opt do
;;
--enable-parallels) parallels="yes"
plugins="$default_feature"
meson=""
ninja=""
@@ -864,6 +865,10 @@ for opt do
--enable-uuid|--disable-uuid)
echo "$0: $opt is obsolete, UUID support is always built" >&2
;;
+ --disable-pbs-bdrv) pbs_bdrv="no"
+ ;;
+ --enable-pbs-bdrv) pbs_bdrv="yes"
+ ;;
--disable-vhost-user) vhost_user="no"
--with-git=*) git="$optarg"
;;
--enable-vhost-user) vhost_user="yes"
@@ -1956,6 +1961,7 @@ disabled with --disable-FEATURE, default is enabled if available
vvfat vvfat image format support
qed qed image format support
parallels parallels image format support
--with-git-submodules=*)
@@ -1049,6 +1054,7 @@ cat << EOF
debug-info debugging information
safe-stack SafeStack Stack Smash Protection. Depends on
clang/llvm >= 3.7 and requires coroutine backend ucontext.
+ pbs-bdrv Proxmox backup server read-only block driver support
crypto-afalg Linux AF_ALG crypto backend driver
capstone capstone disassembler support
debug-mutex mutex debugging support
@@ -4624,6 +4630,9 @@ fi
if test "$linux_aio" = "yes" ; then
echo "CONFIG_LINUX_AIO=y" >> $config_host_mak
NOTE: The object files are built at the place where configure is launched
EOF
@@ -2372,6 +2378,9 @@ echo "TARGET_DIRS=$target_list" >> $config_host_mak
if test "$modules" = "yes"; then
echo "CONFIG_MODULES=y" >> $config_host_mak
fi
+if test "$pbs_bdrv" = "yes" ; then
+ echo "CONFIG_PBS_BDRV=y" >> $config_host_mak
+fi
if test "$vhost_scsi" = "yes" ; then
echo "CONFIG_VHOST_SCSI=y" >> $config_host_mak
fi
# XXX: suppress that
if [ "$bsd" = "yes" ] ; then
diff --git a/meson.build b/meson.build
index dd1c5bdb4e..45c1f2de73 100644
index 63ea813a9a..f7f5b3f253 100644
--- a/meson.build
+++ b/meson.build
@@ -3111,6 +3111,7 @@ summary_info += {'lzfse support': liblzfse.found()}
summary_info += {'zstd support': zstd.found()}
summary_info += {'NUMA host support': config_host.has_key('CONFIG_NUMA')}
summary_info += {'libxml2': libxml2.found()}
@@ -3978,7 +3978,7 @@ summary_info += {'bzip2 support': libbzip2}
summary_info += {'lzfse support': liblzfse}
summary_info += {'zstd support': zstd}
summary_info += {'NUMA host support': numa}
-summary_info += {'capstone': capstone}
+summary_info += {'PBS bdrv support': config_host.has_key('CONFIG_PBS_BDRV')}
summary_info += {'capstone': capstone_opt == 'disabled' ? false : capstone_opt}
summary_info += {'libpmem support': libpmem.found()}
summary_info += {'libdaxctl support': libdaxctl.found()}
summary_info += {'libpmem support': libpmem}
summary_info += {'libdaxctl support': libdaxctl}
summary_info += {'libudev': libudev}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a75f1b4687..e4d0c923a4 100644
index 5ac6276dc1..45b63dfe26 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2982,6 +2982,7 @@
'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
{ 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' },
@@ -3103,6 +3103,7 @@
'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
+ 'pbs',
'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
##
@@ -3045,6 +3046,17 @@
'ssh', 'throttle', 'vdi', 'vhdx',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
@@ -3179,6 +3180,17 @@
{ 'struct': 'BlockdevOptionsNull',
'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } }
@@ -390,11 +400,23 @@ index a75f1b4687..e4d0c923a4 100644
##
# @BlockdevOptionsNVMe:
#
@@ -4263,6 +4275,7 @@
@@ -4531,6 +4543,7 @@
'nfs': 'BlockdevOptionsNfs',
'null-aio': 'BlockdevOptionsNull',
'null-co': 'BlockdevOptionsNull',
+ 'pbs': 'BlockdevOptionsPbs',
'nvme': 'BlockdevOptionsNVMe',
'parallels': 'BlockdevOptionsGenericFormat',
'preallocate':'BlockdevOptionsPreallocate',
'nvme-io_uring': { 'type': 'BlockdevOptionsNvmeIoUring',
'if': 'CONFIG_BLKIO' },
diff --git a/qapi/pragma.json b/qapi/pragma.json
index f2097b9020..5ab1890519 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -47,6 +47,7 @@
'BlockInfo', # query-block
'BlockdevAioOptions', # blockdev-add, -blockdev
'BlockdevDriver', # blockdev-add, query-blockstats, ...
+ 'BlockdevOptionsPbs', # for PBS backwards compat
'BlockdevVmdkAdapterType', # blockdev-create (to match VMDK spec)
'BlockdevVmdkSubformat', # blockdev-create (to match VMDK spec)
'ColoCompareProperties', # object_add, -object

View File

@@ -16,10 +16,10 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2 files changed, 38 insertions(+)
diff --git a/pve-backup.c b/pve-backup.c
index 7527885251..8cba8e97d3 100644
index e113ab61b9..9318ca4f0c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1075,3 +1075,12 @@ BackupStatus *qmp_query_backup(Error **errp)
@@ -1072,3 +1072,12 @@ BackupStatus *qmp_query_backup(Error **errp)
return info;
}
@@ -33,10 +33,10 @@ index 7527885251..8cba8e97d3 100644
+ return ret;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index e4d0c923a4..3eebe7ff71 100644
index 45b63dfe26..8b0e0d92de 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -822,6 +822,35 @@
@@ -863,6 +863,35 @@
##
{ 'command': 'backup-cancel' }

View File

@@ -15,10 +15,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
3 files changed, 159 insertions(+), 42 deletions(-)
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index b2b5f1298b..7a449edafa 100644
index 670f783515..d819e5fc36 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -198,6 +198,7 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
@@ -202,6 +202,7 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
void hmp_info_backup(Monitor *mon, const QDict *qdict)
{
BackupStatus *info;
@@ -26,7 +26,7 @@ index b2b5f1298b..7a449edafa 100644
info = qmp_query_backup(NULL);
@@ -228,26 +229,29 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
@@ -232,26 +233,29 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
// this should not happen normally
monitor_printf(mon, "Total size: %d\n", 0);
} else {
@@ -69,7 +69,7 @@ index b2b5f1298b..7a449edafa 100644
info->zero_bytes, zero_per);
diff --git a/pve-backup.c b/pve-backup.c
index 8cba8e97d3..22420db26a 100644
index 9318ca4f0c..c85b2ecd83 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -46,6 +46,7 @@ static struct PVEBackupState {
@@ -80,7 +80,7 @@ index 8cba8e97d3..22420db26a 100644
} stat;
int64_t speed;
VmaWriter *vmaw;
@@ -672,7 +673,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -669,7 +670,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
size_t total = 0;
@@ -88,7 +88,7 @@ index 8cba8e97d3..22420db26a 100644
l = di_list;
while (l) {
@@ -693,18 +693,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -690,18 +690,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_generate(uuid);
@@ -125,7 +125,7 @@ index 8cba8e97d3..22420db26a 100644
}
int dump_cb_block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE; // Hardcoded (4M)
@@ -731,12 +746,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -728,12 +743,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
"proxmox_backup_new failed: %s", pbs_err);
proxmox_backup_free_error(pbs_err);
@@ -140,7 +140,7 @@ index 8cba8e97d3..22420db26a 100644
/* register all devices */
l = di_list;
@@ -747,6 +762,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -744,6 +759,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->block_size = dump_cb_block_size;
const char *devname = bdrv_get_device_name(di->bs);
@@ -149,7 +149,7 @@ index 8cba8e97d3..22420db26a 100644
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
bool expect_only_dirty = false;
@@ -755,49 +772,59 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -752,49 +769,59 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (bitmap == NULL) {
bitmap = bdrv_create_dirty_bitmap(di->bs, dump_cb_block_size, PBS_BITMAP_NAME, task->errp);
if (!bitmap) {
@@ -219,7 +219,7 @@ index 8cba8e97d3..22420db26a 100644
}
/* register all devices for vma writer */
@@ -807,7 +834,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -804,7 +831,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
l = g_list_next(l);
if (!(di->target = bdrv_backup_dump_create(VMA_CLUSTER_SIZE, di->size, pvebackup_co_dump_vma_cb, di, task->errp))) {
@@ -228,7 +228,7 @@ index 8cba8e97d3..22420db26a 100644
}
const char *devname = bdrv_get_device_name(di->bs);
@@ -815,16 +842,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -812,16 +839,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (di->dev_id <= 0) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
"register_stream failed");
@@ -247,7 +247,7 @@ index 8cba8e97d3..22420db26a 100644
}
backup_dir = task->backup_file;
@@ -841,18 +866,18 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -838,18 +863,18 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->size, flags, false, &local_err);
if (local_err) {
error_propagate(task->errp, local_err);
@@ -269,7 +269,7 @@ index 8cba8e97d3..22420db26a 100644
}
@@ -860,7 +885,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -857,7 +882,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (task->has_config_file) {
if (pvebackup_co_add_config(task->config_file, config_name, format, backup_dir,
vmaw, pbs, task->errp) != 0) {
@@ -278,7 +278,7 @@ index 8cba8e97d3..22420db26a 100644
}
}
@@ -868,12 +893,11 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -865,12 +890,11 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (task->has_firewall_file) {
if (pvebackup_co_add_config(task->firewall_file, firewall_name, format, backup_dir,
vmaw, pbs, task->errp) != 0) {
@@ -293,7 +293,7 @@ index 8cba8e97d3..22420db26a 100644
if (backup_state.stat.error) {
error_free(backup_state.stat.error);
@@ -893,10 +917,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -890,10 +914,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
backup_state.stat.total = total;
@@ -305,7 +305,7 @@ index 8cba8e97d3..22420db26a 100644
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -913,6 +936,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -910,6 +933,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
task->result = uuid_info;
return;
@@ -315,7 +315,7 @@ index 8cba8e97d3..22420db26a 100644
err:
l = di_list;
@@ -1076,11 +1102,42 @@ BackupStatus *qmp_query_backup(Error **errp)
@@ -1073,11 +1099,42 @@ BackupStatus *qmp_query_backup(Error **errp)
return info;
}
@@ -359,10 +359,10 @@ index 8cba8e97d3..22420db26a 100644
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3eebe7ff71..170c13984d 100644
index 8b0e0d92de..7fde927621 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -830,6 +830,8 @@
@@ -871,6 +871,8 @@
# @pbs-dirty-bitmap: True if dirty-bitmap-incremental backups to PBS are
# supported.
#
@@ -371,7 +371,7 @@ index 3eebe7ff71..170c13984d 100644
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -838,6 +840,7 @@
@@ -879,6 +881,7 @@
##
{ 'struct': 'ProxmoxSupportStatus',
'data': { 'pbs-dirty-bitmap': 'bool',
@@ -379,7 +379,7 @@ index 3eebe7ff71..170c13984d 100644
'pbs-dirty-bitmap-savevm': 'bool',
'pbs-library-version': 'str' } }
@@ -851,6 +854,59 @@
@@ -892,6 +895,59 @@
##
{ 'command': 'query-proxmox-support', 'returns': 'ProxmoxSupportStatus' }

View File

@@ -14,18 +14,18 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/meson.build b/meson.build
index 45c1f2de73..44071acbb7 100644
index f7f5b3f253..283b0e356e 100644
--- a/meson.build
+++ b/meson.build
@@ -1065,6 +1065,7 @@ keyutils = dependency('libkeyutils', required: false,
@@ -1526,6 +1526,7 @@ keyutils = dependency('libkeyutils', required: false,
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
+libsystemd = cc.find_library('systemd', required: true)
libproxmox_backup_qemu = cc.find_library('proxmox_backup_qemu', required: true)
# Malloc tests
@@ -2246,6 +2247,7 @@ if have_block
# libselinux
@@ -3096,6 +3097,7 @@ if have_block
# os-posix.c contains POSIX-specific functions used by qemu-storage-daemon,
# os-win32.c does not
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
@@ -34,7 +34,7 @@ index 45c1f2de73..44071acbb7 100644
endif
diff --git a/os-posix.c b/os-posix.c
index ae6c9f2a5e..36807806bf 100644
index 4858650c3e..c5cb12226a 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -28,6 +28,8 @@
@@ -44,15 +44,15 @@ index ae6c9f2a5e..36807806bf 100644
+#include <systemd/sd-journal.h>
+#include <syslog.h>
#include "qemu-common.h"
/* Needed early for CONFIG_BSD etc. */
@@ -291,9 +293,10 @@ void os_setup_post(void)
#include "net/slirp.h"
@@ -287,9 +289,10 @@ void os_setup_post(void)
dup2(fd, 0);
dup2(fd, 1);
- /* In case -D is given do not redirect stderr to /dev/null */
+ /* In case -D is given do not redirect stderr to journal */
if (!qemu_logfile) {
if (!qemu_log_enabled()) {
- dup2(fd, 2);
+ int journal_fd = sd_journal_stream_fd("QEMU", LOG_ERR, 0);
+ dup2(journal_fd, 2);

View File

@@ -7,14 +7,14 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
include/qemu/job.h | 12 ++++++++++++
job.c | 31 +++++++++++++++++++++++++++++++
2 files changed, 43 insertions(+)
job.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/include/qemu/job.h b/include/qemu/job.h
index 41162ed494..6662c63519 100644
index e502787dd8..963cf2bef5 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -285,6 +285,18 @@ typedef enum JobCreateFlags {
@@ -381,6 +381,18 @@ void job_unlock(void);
*/
JobTxn *job_txn_new(void);
@@ -34,10 +34,10 @@ index 41162ed494..6662c63519 100644
* Release a reference that was previously acquired with job_txn_add_job or
* job_txn_new. If it's the last reference to the object, it will be freed.
diff --git a/job.c b/job.c
index 44eec9a441..a0753ff2f1 100644
index 93e22d180b..2b31f1e14f 100644
--- a/job.c
+++ b/job.c
@@ -72,6 +72,8 @@ struct JobTxn {
@@ -93,6 +93,8 @@ struct JobTxn {
/* Reference count */
int refcnt;
@@ -45,8 +45,8 @@ index 44eec9a441..a0753ff2f1 100644
+ bool sequential;
};
/* Right now, this mutex is only needed to synchronize accesses to job->busy
@@ -102,6 +104,25 @@ JobTxn *job_txn_new(void)
void job_lock(void)
@@ -118,6 +120,25 @@ JobTxn *job_txn_new(void)
return txn;
}
@@ -69,20 +69,23 @@ index 44eec9a441..a0753ff2f1 100644
+ job_start(first);
+}
+
static void job_txn_ref(JobTxn *txn)
/* Called with job_mutex held. */
static void job_txn_ref_locked(JobTxn *txn)
{
txn->refcnt++;
@@ -850,6 +871,9 @@ static void job_completed_txn_success(Job *job)
@@ -1057,6 +1078,12 @@ static void job_completed_txn_success_locked(Job *job)
*/
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
if (!job_is_completed(other_job)) {
if (!job_is_completed_locked(other_job)) {
+ if (txn->sequential) {
+ job_unlock();
+ /* Needs to be called without holding the job lock */
+ job_start(other_job);
+ job_lock();
+ }
return;
}
assert(other_job->ret == 0);
@@ -1020,6 +1044,13 @@ int job_finish_sync(Job *job, void (*finish)(Job *, Error **errp), Error **errp)
@@ -1268,6 +1295,13 @@ int job_finish_sync_locked(Job *job,
return -EBUSY;
}
@@ -90,9 +93,9 @@ index 44eec9a441..a0753ff2f1 100644
+ * of cancelling, these have not begun work so job_enter won't do anything,
+ * let's ensure they are marked as ABORTING if required */
+ if (job->status == JOB_STATUS_CREATED && job->txn->sequential) {
+ job_update_rc(job);
+ job_update_rc_locked(job);
+ }
+
AIO_WAIT_WHILE(job->aio_context,
(job_enter(job), !job_is_completed(job)));
job_unlock();
AIO_WAIT_WHILE_UNLOCKED(job->aio_context,
(job_enter(job), !job_is_completed(job)));

View File

@@ -12,12 +12,15 @@ transaction, so drives will still be backed up one after the other.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls
adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 169 +++++++++++++++------------------------------------
1 file changed, 50 insertions(+), 119 deletions(-)
pve-backup.c | 163 ++++++++++++++++-----------------------------------
1 file changed, 50 insertions(+), 113 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 22420db26a..2e628d68e4 100644
index c85b2ecd83..b5fb844434 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -52,6 +52,7 @@ static struct PVEBackupState {
@@ -28,7 +31,7 @@ index 22420db26a..2e628d68e4 100644
QemuMutex backup_mutex;
CoMutex dump_callback_mutex;
} backup_state;
@@ -71,32 +72,12 @@ typedef struct PVEBackupDevInfo {
@@ -71,34 +72,12 @@ typedef struct PVEBackupDevInfo {
size_t size;
uint64_t block_size;
uint8_t dev_id;
@@ -45,14 +48,16 @@ index 22420db26a..2e628d68e4 100644
-lookup_active_block_job(PVEBackupDevInfo *di)
-{
- if (!di->completed && di->bs) {
- for (BlockJob *job = block_job_next(NULL); job; job = block_job_next(job)) {
- if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
- continue;
- }
- WITH_JOB_LOCK_GUARD() {
- for (BlockJob *job = block_job_next_locked(NULL); job; job = block_job_next_locked(job)) {
- if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
- continue;
- }
-
- BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
- if (bjob && bjob->source_bs == di->bs) {
- return job;
- BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
- if (bjob && bjob->source_bs == di->bs) {
- return job;
- }
- }
- }
- }
@@ -62,7 +67,7 @@ index 22420db26a..2e628d68e4 100644
static void pvebackup_propagate_error(Error *err)
{
qemu_mutex_lock(&backup_state.stat.lock);
@@ -272,18 +253,6 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
@@ -274,18 +253,6 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
if (local_err != NULL) {
pvebackup_propagate_error(local_err);
}
@@ -81,7 +86,7 @@ index 22420db26a..2e628d68e4 100644
}
proxmox_backup_disconnect(backup_state.pbs);
@@ -322,8 +291,6 @@ static void pvebackup_complete_cb(void *opaque, int ret)
@@ -324,8 +291,6 @@ static void pvebackup_complete_cb(void *opaque, int ret)
qemu_mutex_lock(&backup_state.backup_mutex);
@@ -90,7 +95,7 @@ index 22420db26a..2e628d68e4 100644
if (ret < 0) {
Error *local_err = NULL;
error_setg(&local_err, "job failed with err %d - %s", ret, strerror(-ret));
@@ -336,20 +303,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
@@ -338,20 +303,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
block_on_coroutine_fn(pvebackup_complete_stream, di);
@@ -117,30 +122,22 @@ index 22420db26a..2e628d68e4 100644
}
static void pvebackup_cancel(void)
@@ -371,36 +335,28 @@ static void pvebackup_cancel(void)
@@ -373,32 +335,28 @@ static void pvebackup_cancel(void)
proxmox_backup_abort(backup_state.pbs, "backup canceled");
}
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- for(;;) {
-
- BlockJob *next_job = NULL;
+ /* it's enough to cancel one job in the transaction, the rest will follow
+ * automatically */
+ GList *bdi = g_list_first(backup_state.di_list);
+ BlockJob *cancel_job = bdi && bdi->data ?
+ ((PVEBackupDevInfo *)bdi->data)->job :
+ NULL;
+
+ /* ref the job before releasing the mutex, just to be safe */
+ if (cancel_job) {
+ job_ref(&cancel_job->job);
+ }
+
+ /* job_cancel_sync may enter the job, so we need to release the
+ * backup_mutex to avoid deadlock */
qemu_mutex_unlock(&backup_state.backup_mutex);
- for(;;) {
-
- BlockJob *next_job = NULL;
-
- qemu_mutex_lock(&backup_state.backup_mutex);
-
- GList *l = backup_state.di_list;
@@ -153,32 +150,34 @@ index 22420db26a..2e628d68e4 100644
- next_job = job;
- break;
- }
- }
-
+ /* ref the job before releasing the mutex, just to be safe */
+ if (cancel_job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_ref_locked(&cancel_job->job);
}
+ }
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
+ /* job_cancel_sync may enter the job, so we need to release the
+ * backup_mutex to avoid deadlock */
+ qemu_mutex_unlock(&backup_state.backup_mutex);
- if (next_job) {
- AioContext *aio_context = next_job->job.aio_context;
- aio_context_acquire(aio_context);
- job_cancel_sync(&next_job->job);
- aio_context_release(aio_context);
- job_cancel_sync(&next_job->job, true);
- } else {
- break;
- }
+ if (cancel_job) {
+ AioContext *aio_context = cancel_job->job.aio_context;
+ aio_context_acquire(aio_context);
+ job_cancel_sync(&cancel_job->job);
+ job_unref(&cancel_job->job);
+ aio_context_release(aio_context);
+ WITH_JOB_LOCK_GUARD() {
+ job_cancel_sync_locked(&cancel_job->job, true);
+ job_unref_locked(&cancel_job->job);
}
}
}
@@ -459,51 +415,19 @@ static int coroutine_fn pvebackup_co_add_config(
@@ -458,49 +416,19 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
-bool job_should_pause(Job *job);
-bool job_should_pause_locked(Job *job);
-
-static void pvebackup_run_next_job(void)
-{
@@ -196,18 +195,16 @@ index 22420db26a..2e628d68e4 100644
- if (job) {
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- AioContext *aio_context = job->job.aio_context;
- aio_context_acquire(aio_context);
-
- if (job_should_pause(&job->job)) {
- bool error_or_canceled = pvebackup_error_or_canceled();
- if (error_or_canceled) {
- job_cancel_sync(&job->job);
- } else {
- job_resume(&job->job);
- WITH_JOB_LOCK_GUARD() {
- if (job_should_pause_locked(&job->job)) {
- bool error_or_canceled = pvebackup_error_or_canceled();
- if (error_or_canceled) {
- job_cancel_sync_locked(&job->job, true);
- } else {
- job_resume_locked(&job->job);
- }
- }
- }
- aio_context_release(aio_context);
- return;
- }
- }
@@ -233,7 +230,7 @@ index 22420db26a..2e628d68e4 100644
BackupPerf perf = { .max_workers = 16 };
/* create and start all jobs (paused state) */
@@ -526,7 +450,7 @@ static bool create_backup_jobs(void) {
@@ -523,7 +451,7 @@ static bool create_backup_jobs(void) {
BlockJob *job = backup_job_create(
NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
bitmap_mode, false, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
@@ -242,7 +239,7 @@ index 22420db26a..2e628d68e4 100644
aio_context_release(aio_context);
@@ -538,7 +462,8 @@ static bool create_backup_jobs(void) {
@@ -535,7 +463,8 @@ static bool create_backup_jobs(void) {
pvebackup_propagate_error(create_job_err);
break;
}
@@ -252,18 +249,20 @@ index 22420db26a..2e628d68e4 100644
bdrv_unref(di->target);
di->target = NULL;
@@ -556,6 +481,10 @@ static bool create_backup_jobs(void) {
@@ -553,6 +482,12 @@ static bool create_backup_jobs(void) {
bdrv_unref(di->target);
di->target = NULL;
}
+
+ if (di->job) {
+ job_unref(&di->job->job);
+ WITH_JOB_LOCK_GUARD() {
+ job_unref_locked(&di->job->job);
+ }
+ }
}
}
@@ -946,10 +875,6 @@ err:
@@ -943,10 +878,6 @@ err:
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -274,7 +273,7 @@ index 22420db26a..2e628d68e4 100644
if (di->target) {
bdrv_unref(di->target);
}
@@ -1038,9 +963,15 @@ UuidInfo *qmp_backup(
@@ -1035,9 +966,15 @@ UuidInfo *qmp_backup(
block_on_coroutine_fn(pvebackup_co_prepare, &task);
if (*errp == NULL) {

View File

@@ -49,13 +49,15 @@ before.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 217 ++++++++++++++++++++++++++++---------------
pve-backup.c | 212 +++++++++++++++++++++++++++----------------
qapi/block-core.json | 5 +-
2 files changed, 144 insertions(+), 78 deletions(-)
2 files changed, 138 insertions(+), 79 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 2e628d68e4..9c20ef3a5e 100644
index b5fb844434..88268bb586 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -33,7 +33,9 @@ const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
@@ -190,7 +192,7 @@ index 2e628d68e4..9c20ef3a5e 100644
// remove self from job list
backup_state.di_list = g_list_remove(backup_state.di_list, di);
@@ -310,21 +321,49 @@ static void pvebackup_complete_cb(void *opaque, int ret)
@@ -310,21 +321,46 @@ static void pvebackup_complete_cb(void *opaque, int ret)
/* call cleanup if we're the last job */
if (!g_list_first(backup_state.di_list)) {
@@ -208,7 +210,7 @@ index 2e628d68e4..9c20ef3a5e 100644
- assert(!qemu_in_coroutine());
+ PVEBackupDevInfo *di = opaque;
+ di->completed_ret = ret;
+
+ /*
+ * Schedule stream cleanup in async coroutine. close_image and finish might
+ * take a while, so we can't block on them here. This way it also doesn't
@@ -227,13 +229,10 @@ index 2e628d68e4..9c20ef3a5e 100644
+static void job_cancel_bh(void *opaque) {
+ CoCtxData *data = (CoCtxData*)opaque;
+ Job *job = (Job*)data->data;
+ AioContext *job_ctx = job->aio_context;
+ aio_context_acquire(job_ctx);
+ job_cancel_sync(job);
+ aio_context_release(job_ctx);
+ job_cancel_sync(job, true);
+ aio_co_enter(data->ctx, data->co);
+}
+
+static void coroutine_fn pvebackup_co_cancel(void *opaque)
+{
Error *cancel_err = NULL;
@@ -245,13 +244,15 @@ index 2e628d68e4..9c20ef3a5e 100644
if (backup_state.vmaw) {
/* make sure vma writer does not block anymore */
@@ -342,27 +381,22 @@ static void pvebackup_cancel(void)
@@ -342,28 +378,22 @@ static void pvebackup_cancel(void)
((PVEBackupDevInfo *)bdi->data)->job :
NULL;
- /* ref the job before releasing the mutex, just to be safe */
if (cancel_job) {
- job_ref(&cancel_job->job);
- WITH_JOB_LOCK_GUARD() {
- job_ref_locked(&cancel_job->job);
- }
+ CoCtxData data = {
+ .ctx = qemu_get_current_aio_context(),
+ .co = qemu_coroutine_self(),
@@ -266,11 +267,10 @@ index 2e628d68e4..9c20ef3a5e 100644
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- if (cancel_job) {
- AioContext *aio_context = cancel_job->job.aio_context;
- aio_context_acquire(aio_context);
- job_cancel_sync(&cancel_job->job);
- job_unref(&cancel_job->job);
- aio_context_release(aio_context);
- WITH_JOB_LOCK_GUARD() {
- job_cancel_sync_locked(&cancel_job->job, true);
- job_unref_locked(&cancel_job->job);
- }
- }
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
@@ -282,7 +282,7 @@ index 2e628d68e4..9c20ef3a5e 100644
}
// assumes the caller holds backup_mutex
@@ -415,10 +449,18 @@ static int coroutine_fn pvebackup_co_add_config(
@@ -416,10 +446,18 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
@@ -302,20 +302,20 @@ index 2e628d68e4..9c20ef3a5e 100644
Error *local_err = NULL;
/* create job transaction to synchronize bitmap commit and cancel all
@@ -454,24 +496,19 @@ static bool create_backup_jobs(void) {
@@ -455,24 +493,19 @@ static bool create_backup_jobs(void) {
aio_context_release(aio_context);
- if (!job || local_err != NULL) {
- Error *create_job_err = NULL;
- error_setg(&create_job_err, "backup_job_create failed: %s",
- local_err ? error_get_pretty(local_err) : "null");
+ di->job = job;
+
- pvebackup_propagate_error(create_job_err);
+ if (!job || local_err) {
+ error_setg(errp, "backup_job_create failed: %s",
local_err ? error_get_pretty(local_err) : "null");
-
- pvebackup_propagate_error(create_job_err);
+ local_err ? error_get_pretty(local_err) : "null");
break;
}
@@ -332,15 +332,13 @@ index 2e628d68e4..9c20ef3a5e 100644
l = backup_state.di_list;
while (l) {
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
@@ -483,12 +520,17 @@ static bool create_backup_jobs(void) {
}
@@ -485,13 +518,15 @@ static bool create_backup_jobs(void) {
if (di->job) {
+ AioContext *ctx = di->job->job.aio_context;
+ aio_context_acquire(ctx);
+ job_cancel_sync(&di->job->job);
job_unref(&di->job->job);
+ aio_context_release(ctx);
WITH_JOB_LOCK_GUARD() {
+ job_cancel_sync_locked(&di->job->job, true);
job_unref_locked(&di->job->job);
}
}
}
}
@@ -351,7 +349,7 @@ index 2e628d68e4..9c20ef3a5e 100644
}
typedef struct QmpBackupTask {
@@ -525,11 +567,12 @@ typedef struct QmpBackupTask {
@@ -528,11 +563,12 @@ typedef struct QmpBackupTask {
UuidInfo *result;
} QmpBackupTask;
@@ -365,7 +363,7 @@ index 2e628d68e4..9c20ef3a5e 100644
QmpBackupTask *task = opaque;
task->result = NULL; // just to be sure
@@ -550,8 +593,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -553,8 +589,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *firewall_name = "qemu-server.fw";
if (backup_state.di_list) {
@@ -376,7 +374,7 @@ index 2e628d68e4..9c20ef3a5e 100644
return;
}
@@ -618,6 +662,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -621,6 +658,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
di->size = size;
total += size;
@@ -385,7 +383,7 @@ index 2e628d68e4..9c20ef3a5e 100644
}
uuid_generate(uuid);
@@ -849,6 +895,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -852,6 +891,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
backup_state.stat.dirty = total - backup_state.stat.reused;
backup_state.stat.transferred = 0;
backup_state.stat.zero_bytes = 0;
@@ -394,7 +392,7 @@ index 2e628d68e4..9c20ef3a5e 100644
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -863,6 +911,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -866,6 +907,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_info->UUID = uuid_str;
task->result = uuid_info;
@@ -428,7 +426,7 @@ index 2e628d68e4..9c20ef3a5e 100644
return;
err_mutex:
@@ -885,6 +960,7 @@ err:
@@ -888,6 +956,7 @@ err:
g_free(di);
}
g_list_free(di_list);
@@ -436,7 +434,7 @@ index 2e628d68e4..9c20ef3a5e 100644
if (devs) {
g_strfreev(devs);
@@ -905,6 +981,8 @@ err:
@@ -908,6 +977,8 @@ err:
}
task->result = NULL;
@@ -445,7 +443,7 @@ index 2e628d68e4..9c20ef3a5e 100644
return;
}
@@ -958,24 +1036,8 @@ UuidInfo *qmp_backup(
@@ -961,24 +1032,8 @@ UuidInfo *qmp_backup(
.errp = errp,
};
@@ -470,7 +468,7 @@ index 2e628d68e4..9c20ef3a5e 100644
return task.result;
}
@@ -1027,6 +1089,7 @@ BackupStatus *qmp_query_backup(Error **errp)
@@ -1030,6 +1085,7 @@ BackupStatus *qmp_query_backup(Error **errp)
info->transferred = backup_state.stat.transferred;
info->has_reused = true;
info->reused = backup_state.stat.reused;
@@ -479,10 +477,10 @@ index 2e628d68e4..9c20ef3a5e 100644
qemu_mutex_unlock(&backup_state.stat.lock);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 170c13984d..a0d1d278e9 100644
index 7fde927621..bf559c6d52 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -729,12 +729,15 @@
@@ -770,12 +770,15 @@
#
# @uuid: uuid for this backup job
#

View File

@@ -36,11 +36,11 @@ index 465906710d..4f0aeceb6f 100644
+
#endif
diff --git a/migration/meson.build b/migration/meson.build
index ea9aedeefc..c27dc9bd97 100644
index 0842d00cd2..d012f4d8d3 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -7,8 +7,10 @@ migration_files = files(
'qemu-file-channel.c',
@@ -6,8 +6,10 @@ migration_files = files(
'vmstate.c',
'qemu-file.c',
'yank_functions.c',
+ 'pbs-state.c',
@@ -51,17 +51,17 @@ index ea9aedeefc..c27dc9bd97 100644
softmmu_ss.add(files(
'block-dirty-bitmap.c',
diff --git a/migration/migration.c b/migration/migration.c
index 041b8451a6..9df2eed75e 100644
index f485eea5fb..89b287180f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -218,6 +218,7 @@ void migration_object_init(void)
@@ -229,6 +229,7 @@ void migration_object_init(void)
blk_mig_init();
ram_mig_init();
dirty_bitmap_mig_init();
+ pbs_state_mig_init();
}
void migration_cancel(void)
void migration_cancel(const Error *error)
diff --git a/migration/pbs-state.c b/migration/pbs-state.c
new file mode 100644
index 0000000000..29f2b3860d
@@ -175,10 +175,10 @@ index 0000000000..29f2b3860d
+ NULL);
+}
diff --git a/pve-backup.c b/pve-backup.c
index 9c20ef3a5e..59ccb38ceb 100644
index 88268bb586..fa9c6c4493 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1132,6 +1132,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1128,6 +1128,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
ret->pbs_dirty_bitmap = true;
ret->pbs_dirty_bitmap_savevm = true;
@@ -187,10 +187,10 @@ index 9c20ef3a5e..59ccb38ceb 100644
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a0d1d278e9..e5de769dc1 100644
index bf559c6d52..24f30260c8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -838,6 +838,11 @@
@@ -879,6 +879,11 @@
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -202,7 +202,7 @@ index a0d1d278e9..e5de769dc1 100644
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
##
@@ -845,6 +850,7 @@
@@ -886,6 +891,7 @@
'data': { 'pbs-dirty-bitmap': 'bool',
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',

View File

@@ -19,7 +19,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 35f5ef688d..c4640925e7 100644
index 9aba7d9c22..f4ecf9c9f9 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -538,7 +538,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,

View File

@@ -21,10 +21,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 30 insertions(+)
diff --git a/block/iscsi.c b/block/iscsi.c
index 4d2a416ce7..c345d30812 100644
index a316d46d96..3ed4a50c0d 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1372,12 +1372,42 @@ static char *get_initiator_name(QemuOpts *opts)
@@ -1387,12 +1387,42 @@ static char *get_initiator_name(QemuOpts *opts)
const char *name;
char *iscsi_name;
UuidInfo *uuid_info;

View File

@@ -32,7 +32,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
5 files changed, 77 insertions(+), 196 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 69254396d5..b838586fc0 100644
index a09f722fea..71ed202491 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1016,7 +1016,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
@@ -54,10 +54,10 @@ index 69254396d5..b838586fc0 100644
Error *error = NULL;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 7faba36b39..dca4e58858 100644
index fcf9461295..5fdb198ca4 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -109,6 +109,7 @@ ERST
@@ -111,6 +111,7 @@ ERST
"\n\t\t\t Use -d to dump data into a directory instead"
"\n\t\t\t of using VMA format.",
.cmd = hmp_backup,
@@ -65,7 +65,7 @@ index 7faba36b39..dca4e58858 100644
},
SRST
@@ -122,6 +123,7 @@ ERST
@@ -124,6 +125,7 @@ ERST
.params = "",
.help = "cancel the current VM backup",
.cmd = hmp_backup_cancel,
@@ -116,10 +116,10 @@ index 4ce7bc0b5e..0923037dec 100644
static void proxmox_backup_schedule_wake(void *data) {
CoCtxData *waker = (CoCtxData *)data;
diff --git a/pve-backup.c b/pve-backup.c
index 59ccb38ceb..f858003a06 100644
index 5662f48b72..e4fe1b601d 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -357,7 +357,7 @@ static void job_cancel_bh(void *opaque) {
@@ -354,7 +354,7 @@ static void job_cancel_bh(void *opaque) {
aio_co_enter(data->ctx, data->co);
}
@@ -128,7 +128,7 @@ index 59ccb38ceb..f858003a06 100644
{
Error *cancel_err = NULL;
error_setg(&cancel_err, "backup canceled");
@@ -394,11 +394,6 @@ static void coroutine_fn pvebackup_co_cancel(void *opaque)
@@ -391,11 +391,6 @@ static void coroutine_fn pvebackup_co_cancel(void *opaque)
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
@@ -140,7 +140,7 @@ index 59ccb38ceb..f858003a06 100644
// assumes the caller holds backup_mutex
static int coroutine_fn pvebackup_co_add_config(
const char *file,
@@ -533,50 +528,27 @@ static void create_backup_jobs_bh(void *opaque) {
@@ -529,50 +524,27 @@ static void create_backup_jobs_bh(void *opaque) {
aio_co_enter(data->ctx, data->co);
}
@@ -207,7 +207,7 @@ index 59ccb38ceb..f858003a06 100644
BlockBackend *blk;
BlockDriverState *bs = NULL;
const char *backup_dir = NULL;
@@ -593,17 +565,17 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -589,17 +561,17 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *firewall_name = "qemu-server.fw";
if (backup_state.di_list) {
@@ -230,7 +230,7 @@ index 59ccb38ceb..f858003a06 100644
gchar **d = devs;
while (d && *d) {
@@ -611,14 +583,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -607,14 +579,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (blk) {
bs = blk_bs(blk);
if (!bdrv_is_inserted(bs)) {
@@ -247,7 +247,7 @@ index 59ccb38ceb..f858003a06 100644
"Device '%s' not found", *d);
goto err;
}
@@ -641,7 +613,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -637,7 +609,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
if (!di_list) {
@@ -256,7 +256,7 @@ index 59ccb38ceb..f858003a06 100644
goto err;
}
@@ -651,13 +623,13 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -647,13 +619,13 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
while (l) {
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -267,12 +267,12 @@ index 59ccb38ceb..f858003a06 100644
ssize_t size = bdrv_getlength(di->bs);
if (size < 0) {
- error_setg_errno(task->errp, -di->size, "bdrv_getlength failed");
+ error_setg_errno(errp, -di->size, "bdrv_getlength failed");
- error_setg_errno(task->errp, -size, "bdrv_getlength failed");
+ error_setg_errno(errp, -size, "bdrv_getlength failed");
goto err;
}
di->size = size;
@@ -684,47 +656,44 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -680,47 +652,44 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
if (format == BACKUP_FORMAT_PBS) {
@@ -337,7 +337,7 @@ index 59ccb38ceb..f858003a06 100644
if (connect_result < 0)
goto err_mutex;
@@ -743,9 +712,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -739,9 +708,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
bool expect_only_dirty = false;
@@ -349,7 +349,7 @@ index 59ccb38ceb..f858003a06 100644
if (!bitmap) {
goto err_mutex;
}
@@ -775,12 +744,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -771,12 +740,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
}
@@ -364,7 +364,7 @@ index 59ccb38ceb..f858003a06 100644
goto err_mutex;
}
@@ -794,10 +763,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -790,10 +759,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
backup_state.stat.bitmap_list = g_list_append(backup_state.stat.bitmap_list, info);
}
} else if (format == BACKUP_FORMAT_VMA) {
@@ -377,7 +377,7 @@ index 59ccb38ceb..f858003a06 100644
}
goto err_mutex;
}
@@ -808,25 +777,25 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -804,25 +773,25 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -409,7 +409,7 @@ index 59ccb38ceb..f858003a06 100644
l = di_list;
while (l) {
@@ -840,34 +809,34 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -836,34 +805,34 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
bdrv_img_create(di->targetfile, "raw", NULL, NULL, NULL,
di->size, flags, false, &local_err);
if (local_err) {
@@ -453,7 +453,7 @@ index 59ccb38ceb..f858003a06 100644
goto err_mutex;
}
}
@@ -885,7 +854,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -881,7 +850,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (backup_state.stat.backup_file) {
g_free(backup_state.stat.backup_file);
}
@@ -462,7 +462,7 @@ index 59ccb38ceb..f858003a06 100644
uuid_copy(backup_state.stat.uuid, uuid);
uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
@@ -900,7 +869,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -896,7 +865,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -471,7 +471,7 @@ index 59ccb38ceb..f858003a06 100644
backup_state.vmaw = vmaw;
backup_state.pbs = pbs;
@@ -910,8 +879,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -906,8 +875,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_info = g_malloc0(sizeof(*uuid_info));
uuid_info->UUID = uuid_str;
@@ -480,7 +480,7 @@ index 59ccb38ceb..f858003a06 100644
/* Run create_backup_jobs_bh outside of coroutine (in BH) but keep
* backup_mutex locked. This is fine, a CoMutex can be held across yield
* points, and we'll release it as soon as the BH reschedules us.
@@ -925,7 +892,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -921,7 +888,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
qemu_coroutine_yield();
if (local_err) {
@@ -489,7 +489,7 @@ index 59ccb38ceb..f858003a06 100644
goto err;
}
@@ -938,7 +905,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -934,7 +901,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
/* start the first job in the transaction */
job_txn_start_seq(backup_state.txn);
@@ -498,7 +498,7 @@ index 59ccb38ceb..f858003a06 100644
err_mutex:
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -969,7 +936,7 @@ err:
@@ -965,7 +932,7 @@ err:
if (vmaw) {
Error *err = NULL;
vma_writer_close(vmaw, &err);
@@ -507,7 +507,7 @@ index 59ccb38ceb..f858003a06 100644
}
if (pbs) {
@@ -980,65 +947,8 @@ err:
@@ -976,65 +943,8 @@ err:
rmdir(backup_dir);
}
@@ -575,10 +575,10 @@ index 59ccb38ceb..f858003a06 100644
BackupStatus *qmp_query_backup(Error **errp)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index e5de769dc1..afa67c28d2 100644
index 24f30260c8..4e8c35a3a2 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -801,7 +801,7 @@
@@ -842,7 +842,7 @@
'*config-file': 'str',
'*firewall-file': 'str',
'*devlist': 'str', '*speed': 'int' },
@@ -587,7 +587,7 @@ index e5de769dc1..afa67c28d2 100644
##
# @query-backup:
@@ -823,7 +823,7 @@
@@ -864,7 +864,7 @@
# Notes: This command succeeds even if there is no backup process running.
#
##

View File

@@ -19,7 +19,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
3 files changed, 11 insertions(+)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index b838586fc0..5b52b93232 100644
index 71ed202491..c7468e5d3b 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1039,6 +1039,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
@@ -31,10 +31,10 @@ index b838586fc0..5b52b93232 100644
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
diff --git a/pve-backup.c b/pve-backup.c
index f858003a06..04ebfc1e33 100644
index 109498eaf9..4b5134ed27 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -533,6 +533,7 @@ UuidInfo coroutine_fn *qmp_backup(
@@ -529,6 +529,7 @@ UuidInfo coroutine_fn *qmp_backup(
bool has_password, const char *password,
bool has_keyfile, const char *keyfile,
bool has_key_password, const char *key_password,
@@ -42,7 +42,7 @@ index f858003a06..04ebfc1e33 100644
bool has_fingerprint, const char *fingerprint,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
@@ -681,6 +682,7 @@ UuidInfo coroutine_fn *qmp_backup(
@@ -677,6 +678,7 @@ UuidInfo coroutine_fn *qmp_backup(
has_password ? password : NULL,
has_keyfile ? keyfile : NULL,
has_key_password ? key_password : NULL,
@@ -50,7 +50,7 @@ index f858003a06..04ebfc1e33 100644
has_compress ? compress : true,
has_encrypt ? encrypt : has_keyfile,
has_fingerprint ? fingerprint : NULL,
@@ -1044,5 +1046,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1040,5 +1042,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_dirty_bitmap_savevm = true;
ret->pbs_dirty_bitmap_migration = true;
ret->query_bitmap_info = true;
@@ -58,10 +58,10 @@ index f858003a06..04ebfc1e33 100644
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index afa67c28d2..84e4406d21 100644
index 4e8c35a3a2..d8c7331090 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -772,6 +772,8 @@
@@ -813,6 +813,8 @@
#
# @key-password: password for keyfile (optional for format 'pbs')
#
@@ -70,7 +70,7 @@ index afa67c28d2..84e4406d21 100644
# @fingerprint: server cert fingerprint (optional for format 'pbs')
#
# @backup-id: backup ID (required for format 'pbs')
@@ -791,6 +793,7 @@
@@ -832,6 +834,7 @@
'*password': 'str',
'*keyfile': 'str',
'*key-password': 'str',
@@ -78,7 +78,7 @@ index afa67c28d2..84e4406d21 100644
'*fingerprint': 'str',
'*backup-id': 'str',
'*backup-time': 'int',
@@ -843,6 +846,9 @@
@@ -884,6 +887,9 @@
# migration cap if this is false/unset may lead
# to crashes on migration!
#
@@ -88,7 +88,7 @@ index afa67c28d2..84e4406d21 100644
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
##
@@ -851,6 +857,7 @@
@@ -892,6 +898,7 @@
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',
'pbs-dirty-bitmap-migration': 'bool',

View File

@@ -17,7 +17,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/block/pbs.c b/block/pbs.c
index 78dad0dcc4..ac54e816c0 100644
index 9d1f1f39d4..ce9a870885 100644
--- a/block/pbs.c
+++ b/block/pbs.c
@@ -200,7 +200,16 @@ static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
@@ -36,9 +36,9 @@ index 78dad0dcc4..ac54e816c0 100644
+ buf = g_malloc(bytes);
+ }
ReadCallbackData rcb = {
.co = qemu_coroutine_self(),
@@ -218,8 +227,10 @@ static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
if (offset < 0 || bytes < 0) {
fprintf(stderr, "unexpected negative 'offset' or 'bytes' value!\n");
@@ -223,8 +232,10 @@ static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
return -EIO;
}
@@ -49,5 +49,5 @@ index 78dad0dcc4..ac54e816c0 100644
+ g_free(buf);
+ }
return ret;
return 0;
}

View File

@@ -11,7 +11,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/stream.c b/block/stream.c
index 97bee482dc..50093c9f57 100644
index 694709bd25..e09bd5c4ef 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -28,7 +28,7 @@ enum {

View File

@@ -1,33 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 26 May 2021 15:26:30 +0200
Subject: [PATCH] PVE: whitelist 'invalid' QAPI names for backwards compat
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
qapi/pragma.json | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 7c91ea3685..c3888d654c 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -12,6 +12,7 @@
'device_add',
'device_del',
'expire_password',
+ 'get_link_status',
'migrate_cancel',
'netdev_add',
'netdev_del',
@@ -60,6 +61,8 @@
'SysEmuTarget', # query-cpu-fast, query-target
'UuidInfo', # query-uuid
'VncClientInfo', # query-vnc, query-vnc-servers, ...
- 'X86CPURegister32' # qom-get of x86 CPU properties
+ 'X86CPURegister32', # qom-get of x86 CPU properties
# feature-words, filtered-features
+ 'BlockdevOptionsPbs', # for PBS backwards compat
+ 'BalloonInfo'
] } }

View File

@@ -17,10 +17,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 4 insertions(+)
diff --git a/block/io.c b/block/io.c
index f38e7f81d8..28c3a712b6 100644
index b9424024f9..01f50d28c8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1764,6 +1764,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
@@ -1730,6 +1730,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
{
int ret;

View File

@@ -24,18 +24,21 @@ once the backing image is removed. It will be replaced by 'file'.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures
make error return value consistent with QEMU]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 345 ++++++++++++++++++++++++++++++++++++++++++++
block/alloc-track.c | 350 ++++++++++++++++++++++++++++++++++++++++++++
block/meson.build | 1 +
2 files changed, 346 insertions(+)
2 files changed, 351 insertions(+)
create mode 100644 block/alloc-track.c
diff --git a/block/alloc-track.c b/block/alloc-track.c
new file mode 100644
index 0000000000..35f2737c89
index 0000000000..43d40d11af
--- /dev/null
+++ b/block/alloc-track.c
@@ -0,0 +1,345 @@
@@ -0,0 +1,350 @@
+/*
+ * Node to allow backing images to be applied to any node. Assumes a blank
+ * image to begin with, only new writes are tracked as allocated, thus this
@@ -166,7 +169,7 @@ index 0000000000..35f2737c89
+}
+
+static int coroutine_fn track_co_preadv(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+ QEMUIOVector local_qiov;
@@ -177,6 +180,11 @@ index 0000000000..35f2737c89
+ int64_t local_bytes;
+ bool alloc;
+
+ if (offset < 0 || bytes < 0) {
+ fprintf(stderr, "unexpected negative 'offset' or 'bytes' value!\n");
+ return -EIO;
+ }
+
+ /* a read request can span multiple granularity-sized chunks, and can thus
+ * contain blocks with different allocation status - we could just iterate
+ * granularity-wise, but for better performance use bdrv_dirty_bitmap_next_X
@@ -219,21 +227,21 @@ index 0000000000..35f2737c89
+}
+
+static int coroutine_fn track_co_pwritev(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+}
+
+static int coroutine_fn track_co_pwrite_zeroes(BlockDriverState *bs,
+ int64_t offset, int count, BdrvRequestFlags flags)
+ int64_t offset, int64_t bytes, BdrvRequestFlags flags)
+{
+ return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
+ return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+static int coroutine_fn track_co_pdiscard(BlockDriverState *bs,
+ int64_t offset, int count)
+ int64_t offset, int64_t bytes)
+{
+ return bdrv_co_pdiscard(bs->file, offset, count);
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
+}
+
+static coroutine_fn int track_co_flush(BlockDriverState *bs)
@@ -382,7 +390,7 @@ index 0000000000..35f2737c89
+
+block_init(bdrv_alloc_track_init);
diff --git a/block/meson.build b/block/meson.build
index e3ed5ac97c..d1ee260048 100644
index 7ef2fa72d5..15352f579f 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -2,6 +2,7 @@ block_ss.add(genh)

View File

@@ -11,10 +11,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 5 insertions(+)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 970ee3b3fc..b3ccc069f1 100644
index a38e7351c1..0b1b60c6ae 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -19,6 +19,7 @@
@@ -20,6 +20,7 @@
#include "qemu/timer.h"
#include "qemu/main-loop.h"
#include "qemu/rcu.h"
@@ -22,7 +22,7 @@ index 970ee3b3fc..b3ccc069f1 100644
/* #define DEBUG_SAVEVM_STATE */
@@ -580,6 +581,10 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
@@ -521,6 +522,10 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
dirty_bitmap_mig_before_vm_start();
qemu_fclose(f);

View File

@@ -0,0 +1,130 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Mon, 7 Feb 2022 14:21:01 +0100
Subject: [PATCH] qemu-img: dd: add -l option for loading a snapshot
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
docs/tools/qemu-img.rst | 6 +++---
qemu-img-cmds.hx | 4 ++--
qemu-img.c | 33 +++++++++++++++++++++++++++++++--
3 files changed, 36 insertions(+), 7 deletions(-)
diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 5e713e231d..9390d5e5cf 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -492,10 +492,10 @@ Command description:
it doesn't need to be specified separately in this case.
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [-l SNAPSHOT_PARAM] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
- dd copies from *INPUT* file to *OUTPUT* file converting it from
- *FMT* format to *OUTPUT_FMT* format.
+ dd copies from *INPUT* file or snapshot *SNAPSHOT_PARAM* to *OUTPUT* file
+ converting it from *FMT* format to *OUTPUT_FMT* format.
The data is by default read and written using blocks of 512 bytes but can be
modified by specifying *BLOCK_SIZE*. If count=\ *BLOCKS* is specified
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index b5b0bb4467..36f97e1f19 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
ERST
DEF("dd", img_dd,
- "dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [bs=block_size] [count=blocks] [skip=blocks] [osize=output_size] if=input of=output")
+ "dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [-l snapshot_param] [bs=block_size] [count=blocks] [skip=blocks] [osize=output_size] if=input of=output")
SRST
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [-l SNAPSHOT_PARAM] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] if=INPUT of=OUTPUT
ERST
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 59c403373b..065a54cc42 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4946,6 +4946,7 @@ static int img_dd(int argc, char **argv)
BlockDriver *drv = NULL, *proto_drv = NULL;
BlockBackend *blk1 = NULL, *blk2 = NULL;
QemuOpts *opts = NULL;
+ QemuOpts *sn_opts = NULL;
QemuOptsList *create_opts = NULL;
Error *local_err = NULL;
bool image_opts = false;
@@ -4955,6 +4956,7 @@ static int img_dd(int argc, char **argv)
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
bool force_share = false, skip_create = false;
+ const char *snapshot_name = NULL;
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -4992,7 +4994,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
- while ((c = getopt_long(argc, argv, ":hf:O:Un", long_options, NULL))) {
+ while ((c = getopt_long(argc, argv, ":hf:O:l:Un", long_options, NULL))) {
if (c == EOF) {
break;
}
@@ -5015,6 +5017,19 @@ static int img_dd(int argc, char **argv)
case 'n':
skip_create = true;
break;
+ case 'l':
+ if (strstart(optarg, SNAPSHOT_OPT_BASE, NULL)) {
+ sn_opts = qemu_opts_parse_noisily(&internal_snapshot_opts,
+ optarg, false);
+ if (!sn_opts) {
+ error_report("Failed in parsing snapshot param '%s'",
+ optarg);
+ goto out;
+ }
+ } else {
+ snapshot_name = optarg;
+ }
+ break;
case 'U':
force_share = true;
break;
@@ -5074,11 +5089,24 @@ static int img_dd(int argc, char **argv)
if (dd.flags & C_IF) {
blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
force_share);
-
if (!blk1) {
ret = -1;
goto out;
}
+ if (sn_opts) {
+ bdrv_snapshot_load_tmp(blk_bs(blk1),
+ qemu_opt_get(sn_opts, SNAPSHOT_OPT_ID),
+ qemu_opt_get(sn_opts, SNAPSHOT_OPT_NAME),
+ &local_err);
+ } else if (snapshot_name != NULL) {
+ bdrv_snapshot_load_tmp_by_id_or_name(blk_bs(blk1), snapshot_name,
+ &local_err);
+ }
+ if (local_err) {
+ error_reportf_err(local_err, "Failed to load snapshot: ");
+ ret = -1;
+ goto out;
+ }
}
if (dd.flags & C_OSIZE) {
@@ -5233,6 +5261,7 @@ static int img_dd(int argc, char **argv)
out:
g_free(arg);
qemu_opts_del(opts);
+ qemu_opts_del(sn_opts);
qemu_opts_free(create_opts);
blk_unref(blk1);
blk_unref(blk2);

View File

@@ -0,0 +1,407 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Thu, 21 Apr 2022 13:26:48 +0200
Subject: [PATCH] vma: allow partial restore
Introduce a new map line for skipping a certain drive, of the form
skip=drive-scsi0
Since in PVE, most archives are compressed and piped to vma for
restore, it's not easily possible to skip reads.
For the reader, a new skip flag for VmaRestoreState is added and the
target is allowed to be NULL if skip is specified when registering. If
the skip flag is set, no writes will be made as well as no check for
duplicate clusters. Therefore, the flag is not set for verify.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
vma-reader.c | 64 ++++++++++++---------
vma.c | 157 +++++++++++++++++++++++++++++----------------------
vma.h | 2 +-
3 files changed, 126 insertions(+), 97 deletions(-)
diff --git a/vma-reader.c b/vma-reader.c
index e65f1e8415..81a891c6b1 100644
--- a/vma-reader.c
+++ b/vma-reader.c
@@ -28,6 +28,7 @@ typedef struct VmaRestoreState {
bool write_zeroes;
unsigned long *bitmap;
int bitmap_size;
+ bool skip;
} VmaRestoreState;
struct VmaReader {
@@ -425,13 +426,14 @@ VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id)
}
static void allocate_rstate(VmaReader *vmar, guint8 dev_id,
- BlockBackend *target, bool write_zeroes)
+ BlockBackend *target, bool write_zeroes, bool skip)
{
assert(vmar);
assert(dev_id);
vmar->rstate[dev_id].target = target;
vmar->rstate[dev_id].write_zeroes = write_zeroes;
+ vmar->rstate[dev_id].skip = skip;
int64_t size = vmar->devinfo[dev_id].size;
@@ -446,28 +448,30 @@ static void allocate_rstate(VmaReader *vmar, guint8 dev_id,
}
int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id, BlockBackend *target,
- bool write_zeroes, Error **errp)
+ bool write_zeroes, bool skip, Error **errp)
{
assert(vmar);
- assert(target != NULL);
+ assert(target != NULL || skip);
assert(dev_id);
- assert(vmar->rstate[dev_id].target == NULL);
-
- int64_t size = blk_getlength(target);
- int64_t size_diff = size - vmar->devinfo[dev_id].size;
-
- /* storage types can have different size restrictions, so it
- * is not always possible to create an image with exact size.
- * So we tolerate a size difference up to 4MB.
- */
- if ((size_diff < 0) || (size_diff > 4*1024*1024)) {
- error_setg(errp, "vma_reader_register_bs for stream %s failed - "
- "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname,
- size, vmar->devinfo[dev_id].size);
- return -1;
+ assert(vmar->rstate[dev_id].target == NULL && !vmar->rstate[dev_id].skip);
+
+ if (target != NULL) {
+ int64_t size = blk_getlength(target);
+ int64_t size_diff = size - vmar->devinfo[dev_id].size;
+
+ /* storage types can have different size restrictions, so it
+ * is not always possible to create an image with exact size.
+ * So we tolerate a size difference up to 4MB.
+ */
+ if ((size_diff < 0) || (size_diff > 4*1024*1024)) {
+ error_setg(errp, "vma_reader_register_bs for stream %s failed - "
+ "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname,
+ size, vmar->devinfo[dev_id].size);
+ return -1;
+ }
}
- allocate_rstate(vmar, dev_id, target, write_zeroes);
+ allocate_rstate(vmar, dev_id, target, write_zeroes, skip);
return 0;
}
@@ -560,19 +564,23 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
VmaRestoreState *rstate = &vmar->rstate[dev_id];
BlockBackend *target = NULL;
+ bool skip = rstate->skip;
+
if (dev_id != vmar->vmstate_stream) {
target = rstate->target;
- if (!verify && !target) {
+ if (!verify && !target && !skip) {
error_setg(errp, "got wrong dev id %d", dev_id);
return -1;
}
- if (vma_reader_get_bitmap(rstate, cluster_num)) {
- error_setg(errp, "found duplicated cluster %zd for stream %s",
- cluster_num, vmar->devinfo[dev_id].devname);
- return -1;
+ if (!skip) {
+ if (vma_reader_get_bitmap(rstate, cluster_num)) {
+ error_setg(errp, "found duplicated cluster %zd for stream %s",
+ cluster_num, vmar->devinfo[dev_id].devname);
+ return -1;
+ }
+ vma_reader_set_bitmap(rstate, cluster_num, 1);
}
- vma_reader_set_bitmap(rstate, cluster_num, 1);
max_sector = vmar->devinfo[dev_id].size/BDRV_SECTOR_SIZE;
} else {
@@ -618,7 +626,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
return -1;
}
- if (!verify) {
+ if (!verify && !skip) {
int nb_sectors = end_sector - sector_num;
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
buf + start, sector_num, nb_sectors,
@@ -654,7 +662,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
return -1;
}
- if (!verify) {
+ if (!verify && !skip) {
int nb_sectors = end_sector - sector_num;
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
buf + start, sector_num,
@@ -679,7 +687,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
vmar->partial_zero_cluster_data += zero_size;
}
- if (rstate->write_zeroes && !verify) {
+ if (rstate->write_zeroes && !verify && !skip) {
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
zero_vma_block, sector_num,
nb_sectors, errp) < 0) {
@@ -850,7 +858,7 @@ int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp)
for (dev_id = 1; dev_id < 255; dev_id++) {
if (vma_reader_get_device_info(vmar, dev_id)) {
- allocate_rstate(vmar, dev_id, NULL, false);
+ allocate_rstate(vmar, dev_id, NULL, false, false);
}
}
diff --git a/vma.c b/vma.c
index e8dffb43e0..e6e9ffc7fe 100644
--- a/vma.c
+++ b/vma.c
@@ -138,6 +138,7 @@ typedef struct RestoreMap {
char *throttling_group;
char *cache;
bool write_zero;
+ bool skip;
} RestoreMap;
static bool try_parse_option(char **line, const char *optname, char **out, const char *inbuf) {
@@ -245,47 +246,61 @@ static int extract_content(int argc, char **argv)
char *bps = NULL;
char *group = NULL;
char *cache = NULL;
+ char *devname = NULL;
+ bool skip = false;
+ uint64_t bps_value = 0;
+ const char *path = NULL;
+ bool write_zero = true;
+
if (!line || line[0] == '\0' || !strcmp(line, "done\n")) {
break;
}
int len = strlen(line);
if (line[len - 1] == '\n') {
line[len - 1] = '\0';
- if (len == 1) {
+ len = len - 1;
+ if (len == 0) {
break;
}
}
- while (1) {
- if (!try_parse_option(&line, "format", &format, inbuf) &&
- !try_parse_option(&line, "throttling.bps", &bps, inbuf) &&
- !try_parse_option(&line, "throttling.group", &group, inbuf) &&
- !try_parse_option(&line, "cache", &cache, inbuf))
- {
- break;
+ if (strncmp(line, "skip", 4) == 0) {
+ if (len < 6 || line[4] != '=') {
+ g_error("read map failed - option 'skip' has no value ('%s')",
+ inbuf);
+ } else {
+ devname = line + 5;
+ skip = true;
+ }
+ } else {
+ while (1) {
+ if (!try_parse_option(&line, "format", &format, inbuf) &&
+ !try_parse_option(&line, "throttling.bps", &bps, inbuf) &&
+ !try_parse_option(&line, "throttling.group", &group, inbuf) &&
+ !try_parse_option(&line, "cache", &cache, inbuf))
+ {
+ break;
+ }
}
- }
- uint64_t bps_value = 0;
- if (bps) {
- bps_value = verify_u64(bps);
- g_free(bps);
- }
+ if (bps) {
+ bps_value = verify_u64(bps);
+ g_free(bps);
+ }
- const char *path;
- bool write_zero;
- if (line[0] == '0' && line[1] == ':') {
- path = line + 2;
- write_zero = false;
- } else if (line[0] == '1' && line[1] == ':') {
- path = line + 2;
- write_zero = true;
- } else {
- g_error("read map failed - parse error ('%s')", inbuf);
+ if (line[0] == '0' && line[1] == ':') {
+ path = line + 2;
+ write_zero = false;
+ } else if (line[0] == '1' && line[1] == ':') {
+ path = line + 2;
+ write_zero = true;
+ } else {
+ g_error("read map failed - parse error ('%s')", inbuf);
+ }
+
+ path = extract_devname(path, &devname, -1);
}
- char *devname = NULL;
- path = extract_devname(path, &devname, -1);
if (!devname) {
g_error("read map failed - no dev name specified ('%s')",
inbuf);
@@ -299,6 +314,7 @@ static int extract_content(int argc, char **argv)
map->throttling_group = group;
map->cache = cache;
map->write_zero = write_zero;
+ map->skip = skip;
g_hash_table_insert(devmap, map->devname, map);
@@ -328,6 +344,7 @@ static int extract_content(int argc, char **argv)
const char *cache = NULL;
int flags = BDRV_O_RDWR;
bool write_zero = true;
+ bool skip = false;
BlockBackend *blk = NULL;
@@ -343,6 +360,7 @@ static int extract_content(int argc, char **argv)
throttling_group = map->throttling_group;
cache = map->cache;
write_zero = map->write_zero;
+ skip = map->skip;
} else {
devfn = g_strdup_printf("%s/tmp-disk-%s.raw",
dirname, di->devname);
@@ -361,57 +379,60 @@ static int extract_content(int argc, char **argv)
write_zero = false;
}
- size_t devlen = strlen(devfn);
- QDict *options = NULL;
- bool writethrough;
- if (format) {
- /* explicit format from commandline */
- options = qdict_new();
- qdict_put_str(options, "driver", format);
- } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) ||
- strncmp(devfn, "/dev/", 5) == 0)
- {
- /* This part is now deprecated for PVE as well (just as qemu
- * deprecated not specifying an explicit raw format, too.
- */
- /* explicit raw format */
- options = qdict_new();
- qdict_put_str(options, "driver", "raw");
- }
- if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) {
- g_error("invalid cache option: %s\n", cache);
- }
+ if (!skip) {
+ size_t devlen = strlen(devfn);
+ QDict *options = NULL;
+ bool writethrough;
+ if (format) {
+ /* explicit format from commandline */
+ options = qdict_new();
+ qdict_put_str(options, "driver", format);
+ } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) ||
+ strncmp(devfn, "/dev/", 5) == 0)
+ {
+ /* This part is now deprecated for PVE as well (just as qemu
+ * deprecated not specifying an explicit raw format, too.
+ */
+ /* explicit raw format */
+ options = qdict_new();
+ qdict_put_str(options, "driver", "raw");
+ }
- if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) {
- g_error("can't open file %s - %s", devfn,
- error_get_pretty(errp));
- }
+ if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) {
+ g_error("invalid cache option: %s\n", cache);
+ }
- if (cache) {
- blk_set_enable_write_cache(blk, !writethrough);
- }
+ if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) {
+ g_error("can't open file %s - %s", devfn,
+ error_get_pretty(errp));
+ }
- if (throttling_group) {
- blk_io_limits_enable(blk, throttling_group);
- }
+ if (cache) {
+ blk_set_enable_write_cache(blk, !writethrough);
+ }
- if (throttling_bps) {
- if (!throttling_group) {
- blk_io_limits_enable(blk, devfn);
+ if (throttling_group) {
+ blk_io_limits_enable(blk, throttling_group);
}
- ThrottleConfig cfg;
- throttle_config_init(&cfg);
- cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps;
- Error *err = NULL;
- if (!throttle_is_valid(&cfg, &err)) {
- error_report_err(err);
- g_error("failed to apply throttling");
+ if (throttling_bps) {
+ if (!throttling_group) {
+ blk_io_limits_enable(blk, devfn);
+ }
+
+ ThrottleConfig cfg;
+ throttle_config_init(&cfg);
+ cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps;
+ Error *err = NULL;
+ if (!throttle_is_valid(&cfg, &err)) {
+ error_report_err(err);
+ g_error("failed to apply throttling");
+ }
+ blk_set_io_limits(blk, &cfg);
}
- blk_set_io_limits(blk, &cfg);
}
- if (vma_reader_register_bs(vmar, i, blk, write_zero, &errp) < 0) {
+ if (vma_reader_register_bs(vmar, i, blk, write_zero, skip, &errp) < 0) {
g_error("%s", error_get_pretty(errp));
}
diff --git a/vma.h b/vma.h
index c895c97f6d..1b62859165 100644
--- a/vma.h
+++ b/vma.h
@@ -142,7 +142,7 @@ GList *vma_reader_get_config_data(VmaReader *vmar);
VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id);
int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id,
BlockBackend *target, bool write_zeroes,
- Error **errp);
+ bool skip, Error **errp);
int vma_reader_restore(VmaReader *vmar, int vmstate_fd, bool verbose,
Error **errp);
int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp);

View File

@@ -0,0 +1,233 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Tue, 26 Apr 2022 16:06:28 +0200
Subject: [PATCH] pbs: namespace support
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 1 +
block/pbs.c | 25 +++++++++++++++++++++----
pbs-restore.c | 19 ++++++++++++++++---
pve-backup.c | 6 +++++-
qapi/block-core.json | 5 ++++-
5 files changed, 47 insertions(+), 9 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index c7468e5d3b..57b2457f1e 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1041,6 +1041,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS key_password
false, NULL, // PBS master_keyfile
false, NULL, // PBS fingerprint
+ false, NULL, // PBS backup-ns
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
false, false, // PBS use-dirty-bitmap
diff --git a/block/pbs.c b/block/pbs.c
index ce9a870885..9192f3e41b 100644
--- a/block/pbs.c
+++ b/block/pbs.c
@@ -14,6 +14,7 @@
#include <proxmox-backup-qemu.h>
#define PBS_OPT_REPOSITORY "repository"
+#define PBS_OPT_NAMESPACE "namespace"
#define PBS_OPT_SNAPSHOT "snapshot"
#define PBS_OPT_ARCHIVE "archive"
#define PBS_OPT_KEYFILE "keyfile"
@@ -27,6 +28,7 @@ typedef struct {
int64_t length;
char *repository;
+ char *namespace;
char *snapshot;
char *archive;
} BDRVPBSState;
@@ -40,6 +42,11 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "The server address and repository to connect to.",
},
+ {
+ .name = PBS_OPT_NAMESPACE,
+ .type = QEMU_OPT_STRING,
+ .help = "Optional: The snapshot's namespace.",
+ },
{
.name = PBS_OPT_SNAPSHOT,
.type = QEMU_OPT_STRING,
@@ -76,7 +83,7 @@ static QemuOptsList runtime_opts = {
// filename format:
-// pbs:repository=<repo>,snapshot=<snap>,password=<pw>,key_password=<kpw>,fingerprint=<fp>,archive=<archive>
+// pbs:repository=<repo>,namespace=<ns>,snapshot=<snap>,password=<pw>,key_password=<kpw>,fingerprint=<fp>,archive=<archive>
static void pbs_parse_filename(const char *filename, QDict *options,
Error **errp)
{
@@ -112,6 +119,7 @@ static int pbs_open(BlockDriverState *bs, QDict *options, int flags,
s->archive = g_strdup(qemu_opt_get(opts, PBS_OPT_ARCHIVE));
const char *keyfile = qemu_opt_get(opts, PBS_OPT_KEYFILE);
const char *password = qemu_opt_get(opts, PBS_OPT_PASSWORD);
+ const char *namespace = qemu_opt_get(opts, PBS_OPT_NAMESPACE);
const char *fingerprint = qemu_opt_get(opts, PBS_OPT_FINGERPRINT);
const char *key_password = qemu_opt_get(opts, PBS_OPT_ENCRYPTION_PASSWORD);
@@ -124,9 +132,12 @@ static int pbs_open(BlockDriverState *bs, QDict *options, int flags,
if (!key_password) {
key_password = getenv("PBS_ENCRYPTION_PASSWORD");
}
+ if (namespace) {
+ s->namespace = g_strdup(namespace);
+ }
/* connect to PBS server in read mode */
- s->conn = proxmox_restore_new(s->repository, s->snapshot, password,
+ s->conn = proxmox_restore_new_ns(s->repository, s->snapshot, s->namespace, password,
keyfile, key_password, fingerprint, &pbs_error);
/* invalidates qemu_opt_get char pointers from above */
@@ -171,6 +182,7 @@ static int pbs_file_open(BlockDriverState *bs, QDict *options, int flags,
static void pbs_close(BlockDriverState *bs) {
BDRVPBSState *s = bs->opaque;
g_free(s->repository);
+ g_free(s->namespace);
g_free(s->snapshot);
g_free(s->archive);
proxmox_restore_disconnect(s->conn);
@@ -252,8 +264,13 @@ static coroutine_fn int pbs_co_pwritev(BlockDriverState *bs,
static void pbs_refresh_filename(BlockDriverState *bs)
{
BDRVPBSState *s = bs->opaque;
- snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s(%s)",
- s->repository, s->snapshot, s->archive);
+ if (s->namespace) {
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s:%s(%s)",
+ s->repository, s->namespace, s->snapshot, s->archive);
+ } else {
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s(%s)",
+ s->repository, s->snapshot, s->archive);
+ }
}
static const char *const pbs_strong_runtime_opts[] = {
diff --git a/pbs-restore.c b/pbs-restore.c
index 2f834cf42e..f03d9bab8d 100644
--- a/pbs-restore.c
+++ b/pbs-restore.c
@@ -29,7 +29,7 @@
static void help(void)
{
const char *help_msg =
- "usage: pbs-restore [--repository <repo>] snapshot archive-name target [command options]\n"
+ "usage: pbs-restore [--repository <repo>] [--ns namespace] snapshot archive-name target [command options]\n"
;
printf("%s", help_msg);
@@ -77,6 +77,7 @@ int main(int argc, char **argv)
Error *main_loop_err = NULL;
const char *format = "raw";
const char *repository = NULL;
+ const char *backup_ns = NULL;
const char *keyfile = NULL;
int verbose = false;
bool skip_zero = false;
@@ -90,6 +91,7 @@ int main(int argc, char **argv)
{"verbose", no_argument, 0, 'v'},
{"format", required_argument, 0, 'f'},
{"repository", required_argument, 0, 'r'},
+ {"ns", required_argument, 0, 'n'},
{"keyfile", required_argument, 0, 'k'},
{0, 0, 0, 0}
};
@@ -110,6 +112,9 @@ int main(int argc, char **argv)
case 'r':
repository = g_strdup(argv[optind - 1]);
break;
+ case 'n':
+ backup_ns = g_strdup(argv[optind - 1]);
+ break;
case 'k':
keyfile = g_strdup(argv[optind - 1]);
break;
@@ -160,8 +165,16 @@ int main(int argc, char **argv)
fprintf(stderr, "connecting to repository '%s'\n", repository);
}
char *pbs_error = NULL;
- ProxmoxRestoreHandle *conn = proxmox_restore_new(
- repository, snapshot, password, keyfile, key_password, fingerprint, &pbs_error);
+ ProxmoxRestoreHandle *conn = proxmox_restore_new_ns(
+ repository,
+ snapshot,
+ backup_ns,
+ password,
+ keyfile,
+ key_password,
+ fingerprint,
+ &pbs_error
+ );
if (conn == NULL) {
fprintf(stderr, "restore failed: %s\n", pbs_error);
return -1;
diff --git a/pve-backup.c b/pve-backup.c
index 4b5134ed27..262e7d3894 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -10,6 +10,8 @@
#include "qapi/qmp/qerror.h"
#include "qemu/cutils.h"
+#include <proxmox-backup-qemu.h>
+
/* PVE backup state and related function */
/*
@@ -531,6 +533,7 @@ UuidInfo coroutine_fn *qmp_backup(
bool has_key_password, const char *key_password,
bool has_master_keyfile, const char *master_keyfile,
bool has_fingerprint, const char *fingerprint,
+ bool has_backup_ns, const char *backup_ns,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
bool has_use_dirty_bitmap, bool use_dirty_bitmap,
@@ -670,8 +673,9 @@ UuidInfo coroutine_fn *qmp_backup(
firewall_name = "fw.conf";
char *pbs_err = NULL;
- pbs = proxmox_backup_new(
+ pbs = proxmox_backup_new_ns(
backup_file,
+ has_backup_ns ? backup_ns : NULL,
backup_id,
backup_time,
dump_cb_block_size,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d8c7331090..889726fc26 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -817,6 +817,8 @@
#
# @fingerprint: server cert fingerprint (optional for format 'pbs')
#
+# @backup-ns: backup namespace (required for format 'pbs')
+#
# @backup-id: backup ID (required for format 'pbs')
#
# @backup-time: backup timestamp (Unix epoch, required for format 'pbs')
@@ -836,6 +838,7 @@
'*key-password': 'str',
'*master-keyfile': 'str',
'*fingerprint': 'str',
+ '*backup-ns': 'str',
'*backup-id': 'str',
'*backup-time': 'int',
'*use-dirty-bitmap': 'bool',
@@ -3290,7 +3293,7 @@
{ 'struct': 'BlockdevOptionsPbs',
'data': { 'repository': 'str', 'snapshot': 'str', 'archive': 'str',
'*keyfile': 'str', '*password': 'str', '*fingerprint': 'str',
- '*key_password': 'str' } }
+ '*key_password': 'str', '*namespace': 'str' } }
##
# @BlockdevOptionsNVMe:

View File

@@ -0,0 +1,80 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Thu, 23 Jun 2022 14:00:05 +0200
Subject: [PATCH] Revert "block/rbd: workaround for ceph issue #53784"
This reverts commit fc176116cdea816ceb8dd969080b2b95f58edbc0 in
preparation to revert 0347a8fd4c3faaedf119be04c197804be40a384b.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/rbd.c | 42 ++----------------------------------------
1 file changed, 2 insertions(+), 40 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 64a8d7d48b..9fc6dcb957 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1348,7 +1348,6 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
int status, r;
RBDDiffIterateReq req = { .offs = offset };
uint64_t features, flags;
- uint64_t head = 0;
assert(offset + bytes <= s->image_size);
@@ -1376,43 +1375,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
return status;
}
-#if LIBRBD_VERSION_CODE < LIBRBD_VERSION(1, 17, 0)
- /*
- * librbd had a bug until early 2022 that affected all versions of ceph that
- * supported fast-diff. This bug results in reporting of incorrect offsets
- * if the offset parameter to rbd_diff_iterate2 is not object aligned.
- * Work around this bug by rounding down the offset to object boundaries.
- * This is OK because we call rbd_diff_iterate2 with whole_object = true.
- * However, this workaround only works for non cloned images with default
- * striping.
- *
- * See: https://tracker.ceph.com/issues/53784
- */
-
- /* check if RBD image has non-default striping enabled */
- if (features & RBD_FEATURE_STRIPINGV2) {
- return status;
- }
-
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
- /*
- * check if RBD image is a clone (= has a parent).
- *
- * rbd_get_parent_info is deprecated from Nautilus onwards, but the
- * replacement rbd_get_parent is not present in Luminous and Mimic.
- */
- if (rbd_get_parent_info(s->image, NULL, 0, NULL, 0, NULL, 0) != -ENOENT) {
- return status;
- }
-#pragma GCC diagnostic pop
-
- head = req.offs & (s->object_size - 1);
- req.offs -= head;
- bytes += head;
-#endif
-
- r = rbd_diff_iterate2(s->image, NULL, req.offs, bytes, true, true,
+ r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
qemu_rbd_diff_iterate_cb, &req);
if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
return status;
@@ -1431,8 +1394,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
}
- assert(req.bytes > head);
- *pnum = req.bytes - head;
+ *pnum = req.bytes;
return status;
}

View File

@@ -0,0 +1,35 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Thu, 23 Jun 2022 14:00:07 +0200
Subject: [PATCH] Revert "block/rbd: fix handling of holes in
.bdrv_co_block_status"
This reverts commit 9e302f64bb407a9bb097b626da97228c2654cfee in
preparation to revert 0347a8fd4c3faaedf119be04c197804be40a384b.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/rbd.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 9fc6dcb957..98f4ba2620 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1307,11 +1307,11 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,
RBDDiffIterateReq *req = opaque;
assert(req->offs + req->bytes <= offs);
-
- /* treat a hole like an unallocated area and bail out */
- if (!exists) {
- return 0;
- }
+ /*
+ * we do not diff against a snapshot so we should never receive a callback
+ * for a hole.
+ */
+ assert(exists);
if (!req->exists && offs > req->offs) {
/*

View File

@@ -0,0 +1,161 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Tue, 17 May 2022 09:46:02 +0200
Subject: [PATCH] Revert "block/rbd: implement bdrv_co_block_status"
During backup, bdrv_co_block_status is called for each block copy
chunk. When RBD is used, the current implementation with
rbd_diff_iterate2() using whole_object=true takes about linearly more
time, depending on the image size. Since there are linearly more
chunks, the slowdown is quadratic, becoming unacceptable for large
images (starting somewhere between 500-1000 GiB in my testing).
This reverts commit 0347a8fd4c3faaedf119be04c197804be40a384b as a
stop-gap measure, until it's clear how to make the implemenation
more efficient.
Upstream bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1026
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/rbd.c | 112 ----------------------------------------------------
1 file changed, 112 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 98f4ba2620..efcbbe5949 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -97,12 +97,6 @@ typedef struct RBDTask {
int64_t ret;
} RBDTask;
-typedef struct RBDDiffIterateReq {
- uint64_t offs;
- uint64_t bytes;
- bool exists;
-} RBDDiffIterateReq;
-
static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
BlockdevOptionsRbd *opts, bool cache,
const char *keypairs, const char *secretid,
@@ -1293,111 +1287,6 @@ static ImageInfoSpecific *qemu_rbd_get_specific_info(BlockDriverState *bs,
return spec_info;
}
-/*
- * rbd_diff_iterate2 allows to interrupt the exection by returning a negative
- * value in the callback routine. Choose a value that does not conflict with
- * an existing exitcode and return it if we want to prematurely stop the
- * execution because we detected a change in the allocation status.
- */
-#define QEMU_RBD_EXIT_DIFF_ITERATE2 -9000
-
-static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,
- int exists, void *opaque)
-{
- RBDDiffIterateReq *req = opaque;
-
- assert(req->offs + req->bytes <= offs);
- /*
- * we do not diff against a snapshot so we should never receive a callback
- * for a hole.
- */
- assert(exists);
-
- if (!req->exists && offs > req->offs) {
- /*
- * we started in an unallocated area and hit the first allocated
- * block. req->bytes must be set to the length of the unallocated area
- * before the allocated area. stop further processing.
- */
- req->bytes = offs - req->offs;
- return QEMU_RBD_EXIT_DIFF_ITERATE2;
- }
-
- if (req->exists && offs > req->offs + req->bytes) {
- /*
- * we started in an allocated area and jumped over an unallocated area,
- * req->bytes contains the length of the allocated area before the
- * unallocated area. stop further processing.
- */
- return QEMU_RBD_EXIT_DIFF_ITERATE2;
- }
-
- req->bytes += len;
- req->exists = true;
-
- return 0;
-}
-
-static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
- bool want_zero, int64_t offset,
- int64_t bytes, int64_t *pnum,
- int64_t *map,
- BlockDriverState **file)
-{
- BDRVRBDState *s = bs->opaque;
- int status, r;
- RBDDiffIterateReq req = { .offs = offset };
- uint64_t features, flags;
-
- assert(offset + bytes <= s->image_size);
-
- /* default to all sectors allocated */
- status = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
- *map = offset;
- *file = bs;
- *pnum = bytes;
-
- /* check if RBD image supports fast-diff */
- r = rbd_get_features(s->image, &features);
- if (r < 0) {
- return status;
- }
- if (!(features & RBD_FEATURE_FAST_DIFF)) {
- return status;
- }
-
- /* check if RBD fast-diff result is valid */
- r = rbd_get_flags(s->image, &flags);
- if (r < 0) {
- return status;
- }
- if (flags & RBD_FLAG_FAST_DIFF_INVALID) {
- return status;
- }
-
- r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
- qemu_rbd_diff_iterate_cb, &req);
- if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
- return status;
- }
- assert(req.bytes <= bytes);
- if (!req.exists) {
- if (r == 0) {
- /*
- * rbd_diff_iterate2 does not invoke callbacks for unallocated
- * areas. This here catches the case where no callback was
- * invoked at all (req.bytes == 0).
- */
- assert(req.bytes == 0);
- req.bytes = bytes;
- }
- status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
- }
-
- *pnum = req.bytes;
- return status;
-}
-
static int64_t qemu_rbd_getlength(BlockDriverState *bs)
{
BDRVRBDState *s = bs->opaque;
@@ -1633,7 +1522,6 @@ static BlockDriver bdrv_rbd = {
#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
.bdrv_co_pwrite_zeroes = qemu_rbd_co_pwrite_zeroes,
#endif
- .bdrv_co_block_status = qemu_rbd_co_block_status,
.bdrv_snapshot_create = qemu_rbd_snap_create,
.bdrv_snapshot_delete = qemu_rbd_snap_remove,

View File

@@ -0,0 +1,60 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 25 May 2022 13:59:37 +0200
Subject: [PATCH] PVE-Backup: create jobs: correctly cancel in error scenario
The first call to job_cancel_sync() will cancel and free all jobs in
the transaction, so ensure that it's called only once and get rid of
the job_unref() that would operate on freed memory.
It's also necessary to NULL backup_state.pbs in the error scenario,
because a subsequent backup_cancel QMP call (as happens in PVE when
the backup QMP command fails) would try to call proxmox_backup_abort()
and run into a segfault.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 262e7d3894..fde3554133 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -503,6 +503,11 @@ static void create_backup_jobs_bh(void *opaque) {
}
if (*errp) {
+ /*
+ * It's enough to cancel one job in the transaction, the rest will
+ * follow automatically.
+ */
+ bool canceled = false;
l = backup_state.di_list;
while (l) {
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
@@ -513,11 +518,11 @@ static void create_backup_jobs_bh(void *opaque) {
di->target = NULL;
}
- if (di->job) {
+ if (!canceled && di->job) {
WITH_JOB_LOCK_GUARD() {
job_cancel_sync_locked(&di->job->job, true);
- job_unref_locked(&di->job->job);
}
+ canceled = true;
}
}
}
@@ -943,6 +948,7 @@ err:
if (pbs) {
proxmox_backup_disconnect(pbs);
+ backup_state.pbs = NULL;
}
if (backup_dir) {

View File

@@ -0,0 +1,73 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 25 May 2022 13:59:38 +0200
Subject: [PATCH] PVE-Backup: ensure jobs in di_list are referenced
Ensures that qmp_backup_cancel doesn't pick a job that's already been
freed. With unlucky timings it seems possible that:
1. job_exit -> job_completed -> job_finalize_single starts
2. pvebackup_co_complete_stream gets spawned in completion callback
3. job finalize_single finishes -> job's refcount hits zero -> job is
freed
4. qmp_backup_cancel comes in and locks backup_state.backup_mutex
before pvebackup_co_complete_stream can remove the job from the
di_list
5. qmp_backup_cancel will pick a job that's already been freed
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index fde3554133..0cf30e1ced 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -316,6 +316,13 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
}
}
+ if (di->job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_unref_locked(&di->job->job);
+ di->job = NULL;
+ }
+ }
+
// remove self from job list
backup_state.di_list = g_list_remove(backup_state.di_list, di);
@@ -491,6 +498,11 @@ static void create_backup_jobs_bh(void *opaque) {
aio_context_release(aio_context);
di->job = job;
+ if (job) {
+ WITH_JOB_LOCK_GUARD() {
+ job_ref_locked(&job->job);
+ }
+ }
if (!job || local_err) {
error_setg(errp, "backup_job_create failed: %s",
@@ -518,11 +530,15 @@ static void create_backup_jobs_bh(void *opaque) {
di->target = NULL;
}
- if (!canceled && di->job) {
+ if (di->job) {
WITH_JOB_LOCK_GUARD() {
- job_cancel_sync_locked(&di->job->job, true);
+ if (!canceled) {
+ job_cancel_sync_locked(&di->job->job, true);
+ canceled = true;
+ }
+ job_unref_locked(&di->job->job);
+ di->job = NULL;
}
- canceled = true;
}
}
}

View File

@@ -0,0 +1,118 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 25 May 2022 13:59:39 +0200
Subject: [PATCH] PVE-Backup: avoid segfault issues upon backup-cancel
When canceling a backup in PVE via a signal it's easy to run into a
situation where the job is already failing when the backup_cancel QMP
command comes in. With a bit of unlucky timing on top, it can happen
that job_exit() runs between schedulung of job_cancel_bh() and
execution of job_cancel_bh(). But job_cancel_sync() does not expect
that the job is already finalized (in fact, the job might've been
freed already, but even if it isn't, job_cancel_sync() would try to
deref job->txn which would be NULL at that point).
It is not possible to simply use the job_cancel() (which is advertised
as being async but isn't in all cases) in qmp_backup_cancel() for the
same reason job_cancel_sync() cannot be used. Namely, because it can
invoke job_finish_sync() (which uses AIO_WAIT_WHILE and thus hangs if
called from a coroutine). This happens when there's multiple jobs in
the transaction and job->deferred_to_main_loop is true (is set before
scheduling job_exit()) or if the job was not started yet.
Fix the issue by selecting the job to cancel in job_cancel_bh() itself
using the first job that's not completed yet. This is not necessarily
the first job in the list, because pvebackup_co_complete_stream()
might not yet have removed a completed job when job_cancel_bh() runs.
An alternative would be to continue using only the first job and
checking against JOB_STATUS_CONCLUDED or JOB_STATUS_NULL to decide if
it's still necessary and possible to cancel, but the approach with
using the first non-completed job seemed more robust.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 57 ++++++++++++++++++++++++++++++++++------------------
1 file changed, 38 insertions(+), 19 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 0cf30e1ced..4067018dbe 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -354,12 +354,41 @@ static void pvebackup_complete_cb(void *opaque, int ret)
/*
* job_cancel(_sync) does not like to be called from coroutines, so defer to
- * main loop processing via a bottom half.
+ * main loop processing via a bottom half. Assumes that caller holds
+ * backup_mutex.
*/
static void job_cancel_bh(void *opaque) {
CoCtxData *data = (CoCtxData*)opaque;
- Job *job = (Job*)data->data;
- job_cancel_sync(job, true);
+
+ /*
+ * Be careful to pick a valid job to cancel:
+ * 1. job_cancel_sync() does not expect the job to be finalized already.
+ * 2. job_exit() might run between scheduling and running job_cancel_bh()
+ * and pvebackup_co_complete_stream() might not have removed the job from
+ * the list yet (in fact, cannot, because it waits for the backup_mutex).
+ * Requiring !job_is_completed() ensures that no finalized job is picked.
+ */
+ GList *bdi = g_list_first(backup_state.di_list);
+ while (bdi) {
+ if (bdi->data) {
+ BlockJob *bj = ((PVEBackupDevInfo *)bdi->data)->job;
+ if (bj) {
+ Job *job = &bj->job;
+ WITH_JOB_LOCK_GUARD() {
+ if (!job_is_completed_locked(job)) {
+ job_cancel_sync_locked(job, true);
+ /*
+ * It's enough to cancel one job in the transaction, the
+ * rest will follow automatically.
+ */
+ break;
+ }
+ }
+ }
+ }
+ bdi = g_list_next(bdi);
+ }
+
aio_co_enter(data->ctx, data->co);
}
@@ -380,22 +409,12 @@ void coroutine_fn qmp_backup_cancel(Error **errp)
proxmox_backup_abort(backup_state.pbs, "backup canceled");
}
- /* it's enough to cancel one job in the transaction, the rest will follow
- * automatically */
- GList *bdi = g_list_first(backup_state.di_list);
- BlockJob *cancel_job = bdi && bdi->data ?
- ((PVEBackupDevInfo *)bdi->data)->job :
- NULL;
-
- if (cancel_job) {
- CoCtxData data = {
- .ctx = qemu_get_current_aio_context(),
- .co = qemu_coroutine_self(),
- .data = &cancel_job->job,
- };
- aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
- qemu_coroutine_yield();
- }
+ CoCtxData data = {
+ .ctx = qemu_get_current_aio_context(),
+ .co = qemu_coroutine_self(),
+ };
+ aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
+ qemu_coroutine_yield();
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}

Some files were not shown because too many files have changed in this diff Show More