Commit Graph

462 Commits (master)

Author SHA1 Message Date
Fiona Ebner 3c4f941ac7 add more stable fixes
The patches were selected from the recent "Patch Round-up for stable
7.2.1" [0]. Those that should be relevant for our supported use-cases
(and the upcoming nvme use-case) were picked. Most of the patches
added now have not been submitted to qemu-stable before.

The follow-up for the virtio-rng-pci migration fix will break
migration between versions with the fix and without the fix when a
virtio-rng-pci(-non)-transitional device is used. Luckily Proxmox VE
only uses the virtio-rng-pci device, and this was fixed by
0006-virtio-rng-pci-fix-migration-compat-for-vectors.patch which was
applied before any public version of Proxmox VE's QEMU 7.2 package was
released.

[0]: https://lists.nongnu.org/archive/html/qemu-stable/2023-03/msg00010.html
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2162569

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-13 17:36:19 +01:00
Fiona Ebner 3a94e1a186 fixup patch "ide: avoid potential deadlock when draining during trim"
The patch was incomplete and (re-)introduced an issue with a potential
failing assertion upon cancelation of the DMA request.

There is a patch on qemu-devel now[0], and it's the same as this one
code-wise (except for comments). But the discussion is still ongoing.
While there shouldn't be a real issue with the patch, there might be
better approaches. The plan is to use this as a stop-gap for now and
pick up the proper solution once it's ready.

[0]: https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg03325.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-13 17:36:19 +01:00
Thomas Lamprecht 67cae45f41 bump version to 7.2.0-6
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-08 14:32:22 +01:00
Fiona Ebner 58659169de add patch to avoid potential deadlock with trim for IDE/SATA and draining
In particular, the deadlock can occur, together with unlucky timing
between the QEMU threads, when the guest is issuing trim requests
during the start of a backup operation.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
 [ T: resolve trivial merge conflict in series file ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-08 14:22:36 +01:00
Fiona Ebner 10691e04e9 add patch fixing Linux boot failures with megasas SCSI
A regression in 7.2 that is easily reproduced.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-07 19:50:12 +01:00
Thomas Lamprecht 09723b9298 bump version to 7.2.0-5
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-02-21 13:50:08 +01:00
Fiona Ebner 00e2507aac add fix for iscsi double free issue leading to crashes
Reported here[0] and here[1].

[0]: https://gitlab.com/qemu-project/qemu/-/issues/1378
[1]: https://forum.proxmox.com/threads/122776/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 13:49:19 +01:00
Fiona Ebner e7e5f63573 add patch fixing DMA reentrancy issues
that could lead to use-after-frees and stack overflows with a
malicious (or buggy) guest. See [0] for a good summary:

[0]: https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 10:18:35 +01:00
Fiona Ebner 1688b43738 QMP backup: use correct errno when getting blockdrive length fails
di->size would only be set later. The errno is minus the return value
from the function.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 09:19:16 +01:00
Fiona Ebner eee064d954 savevm-async: keep more free space when entering final stage
In qemu-server, we already allocate 2 * $mem_size + 500 MiB for driver
state (which was 32 MiB long ago according to git history). It seems
likely that the 30 MiB cutoff in the savevm-async implementation was
chosen based on that.

In bug #4476 [0], another issue caused the iteration to not make any
progress and the state file filled up all the way to the 30 MiB +
pending_size cutoff. Since the guest is not stopped immediately after
the check, it can still dirty some RAM and the current cutoff is not
enough for a reproducer VM (was done while bug #4476 still was not
fixed), dirtying memory with
> stress-ng -B 2 --bigheap-growth 64.0M
After entering the final stage, savevm actually filled up the state
file completely, leading to an I/O error. It's probably the same
scenario as reported in the bug report, the error message was fixed in
commit a020815 ("savevm-async: fix function name in error message")
after the bug report.

If not for the bug, the cutoff will only be reached by a VM that's
dirtying RAM faster than it can be written to the storage, so increase
the cutoff to 100 MiB to have a bigger chance to finish successfully,
while still trying to not increase downtime too much for
non-hibernation snapshots.

[0]: https://bugzilla.proxmox.com/show_bug.cgi?id=4476

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 08:39:08 +01:00
Fiona Ebner 8051a24b5f fix #4476: savevm-async: avoid looping without progress
when pend_postcopy is large. By definition, pend_postcopy won't
decrease when iterating, so a value larger than the cutoff of 400000
would lead to essentially empty iterations, filling up the state file
until only 30 MiB + pending_size remain and the second half of the
check would trigger.

Avoid this, by not considering pend_postcopy for the cutoff to enter
the final phase.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 08:39:08 +01:00
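
Illustration for 8051a24b5f above: a minimal sketch of the loop decision
the commit message describes, not the actual savevm-async code. The
helper name and its parameters are hypothetical; only the cutoff of
400000 and the 30 MiB reserve are taken from the message. The point is
that the first half of the check looks at pend_precopy alone, because
iterating cannot reduce pend_postcopy.

```c
#include <stdbool.h>
#include <stdint.h>

#define PENDING_CUTOFF 400000ULL             /* cutoff named in the commit message */
#define RESERVED_SPACE (30ULL * 1024 * 1024) /* 30 MiB kept free in the state file */

/*
 * Hypothetical helper: returns true if another iteration should run,
 * false if the final stage should be entered.
 */
static bool should_keep_iterating(uint64_t pend_precopy,
                                  uint64_t pend_postcopy,
                                  uint64_t written,
                                  uint64_t file_size)
{
    uint64_t pending_size = pend_precopy + pend_postcopy;
    uint64_t maxlen = file_size > RESERVED_SPACE ? file_size - RESERVED_SPACE : 0;

    /* First half: only precopy data can shrink by iterating further. */
    bool progress_possible = pend_precopy > PENDING_CUTOFF;
    /* Second half: everything still pending must fit into the file. */
    bool space_left = written + pending_size < maxlen;

    return progress_possible && space_left;
}
```

With a large pend_postcopy and no pend_precopy, this sketch enters the
final stage right away instead of running empty iterations.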
Fiona Ebner ade9f50160 d/rules: add note explaining why using noopt doesn't currently work
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-14 10:04:21 +01:00
Fiona Ebner 0fde60fd10 d/rules: add missing export for CFLAGS
Otherwise, they don't affect the build of QEMU at all.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-14 10:04:21 +01:00
Thomas Lamprecht d82c5eb632 bump version to 7.2.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-27 09:37:53 +01:00
Fiona Ebner d5f6ef56f0 add patch to fix issue with VirtIO disk using detect-zeroes=unmap
Affects Proxmox VE, when the discard disk setting is used for a
VirtIO disk.

Upstream bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1404

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-27 09:36:41 +01:00
Fabian Grünbichler 658cba46ee d/control: also conflict with "qemu-system-data"
it ships files also shipped by our qemu package, switching from Debian qemu to
ours doesn't work without manual intervention otherwise.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2023-01-26 10:55:37 +01:00
Fiona Ebner a02081501a savevm-async: fix function name in error message
which also makes it distinguishable from the other
"qemu_savevm_state_iterate error" message.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-24 17:08:54 +01:00
Thomas Lamprecht baf4e3132d bump version to 7.2.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-12 13:13:23 +01:00
Fiona Ebner 48c307550a add regression fix for migration with virtio-rng device
between QEMU less than 7.2 and QEMU 7.2 without the fix (both
directions are affected).

As mentioned in the patch message, this fix itself will break
migration between QEMU 7.2 and QEMU 7.2 with the fix (in both
directions, if a virtio-rng device is attached), but this is fine,
because no pve-qemu-kvm package with QEMU 7.2 has been publicly
released yet.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-12 13:10:19 +01:00
Thomas Lamprecht 89fdfe8975 bump version to 7.2.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-10 15:47:52 +01:00
Fiona Ebner f64132208a cherry-pick stable fixes for 7.2
Two for virtio-mem and one for vIOMMU. Both features are not yet
exposed in PVE's qemu-server, but planned to be added.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-10 15:42:28 +01:00
Fiona Ebner 271ac0a8a7 add QAPI naming exceptions in patches introducing them
Avoids a patch and is required to compile when not all patches are
applied. No functional change is intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-10 15:42:16 +01:00
Fiona Ebner f4ed54ec37 d/control: drop outdated jemalloc dependencies
Commit 3d785ea ("disable jemalloc") disabled jemalloc support, so
these are not needed anymore.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Fiona Ebner 2277182712 d/control: add libslirp-dev as a build dependency
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Fiona Ebner 0906461df0 d/rules: enable slirp again
Commit d03e1b3 ("update submodule and patches to 7.2.0") argued that
slirp is not explicitly supported in PVE, but that is not true. In
qemu-server, user networking is supported (via CLI/API) when no bridge
is set on a virtual NIC. So slirp needs to stay to keep such NICs
working.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Wolfgang Bumiller 29bee92c59 bump version to 7.2.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-12-16 13:23:29 +01:00
Fiona Ebner 82640bb859 d/rules: explicitly disable building slirp
Otherwise, it depends on whether libslirp-devel is installed or not.
See the previous commit message for more context.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-16 11:47:25 +01:00
Fiona Ebner d03e1b3ce3 update submodule and patches to 7.2.0
User-facing breaking change:

The slirp submodule for user networking got removed. It would be
necessary to add the --enable-slirp option to the build and/or install
the appropriate library to continue building it. Since PVE is not
explicitly supporting it, it would require additionally installing the
libslirp0 package on all installations and there is *very* little
mention on the community forum when searching for "slirp" or
"netdev user", the plan is to only enable it again if there is some
real demand for it.

Notable changes:

* The big change for this release is the rework of job locking, using
  a job mutex and introducing _locked() variants of job API functions
  moving away from call-side AioContext locking. See (in the qemu
  submodule) commit 6f592e5aca ("job.c: enable job lock/unlock and
  remove Aiocontext locks") and previous commits for context.

  Changes required for the backup patches:
  * Use WITH_JOB_LOCK_GUARD() and call the _locked() variant of job
    API functions where appropriate (many are only available as
    a _locked() variant).
  * Remove acquiring/releasing AioContext around functions taking the
    job mutex lock internally.

  The patch introducing sequential transaction support for jobs needs
  to temporarily unlock the job mutex to call job_start() when
  starting the next job in the transaction.

* The zeroinit block driver now marks its child as primary.

  The documentation in include/block/block-common.h states:
  > Filter node has exactly one FILTERED|PRIMARY child, and may have
  > other children which must not have these bits

  Without this, an assert will trigger when copying to a zeroinit target
  with qemu-img convert, because bdrv_child_cb_attach() expects any
  non-PRIMARY child to be not FILTERED:
  > qemu-img convert -n -p -f raw -O raw input.raw zeroinit:output.raw
  > qemu-img: ../block.c:1476: bdrv_child_cb_attach: Assertion
  > `!(child->role & BDRV_CHILD_FILTERED)' failed.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-16 11:47:20 +01:00
Thomas Lamprecht 55e33a045e bump version to 7.1.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-22 09:21:10 +01:00
Thomas Lamprecht 8a38e1da9e cherry-pick "block/block-backend: blk_set_enable_write_cache is IO_CODE"
albeit I was close to disarming that GLOBAL_STATE_CODE assert
completely, as it's just bogus to assert that at runtime for a lot of
call sites, rather it should be verified at compile time (function
coloring with attributes and maybe a compiler plugin).

But, as this is already solved upstream, let's take in that patch.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-22 09:19:00 +01:00
Thomas Lamprecht 3b3d5516ee bump version to 7.1.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-10-28 10:27:54 +02:00
Thomas Lamprecht 509409fb64 init: daemonize: defuse PID file resolve error to warning
fixes file restore, where we actively unlink the PID file of the
transient VM ourselves after opening it - while we use it only for
tracking when the QEMU process itself has finished start up, it's
easier and cleaner to fix this regression now, than to rework that to
something that doesn't depend on the PID file at all.

Applying Fiona's patch as patch-patch tracked under extra, as I
expect that something similar to this gets accepted upstream.

Link: https://lists.proxmox.com/pipermail/pve-devel/2022-October/054448.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-10-28 10:22:26 +02:00
Wolfgang Bumiller bf03cd367f bump version to 7.1.0-2
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-18 15:35:09 +02:00
Fiona Ebner 0af826b448 savevm async IO channel: channel writev: fix return value in error case
The documentation in include/io/channel.h states that -1 or
QIO_CHANNEL_ERR_BLOCK should be returned upon error. Simply passing
along the return value from the blk-functions has the potential to
confuse the callers. Non-blocking mode is not implemented
currently, so -1 it is.

The "return ret" was mistakenly left over from the previous
QEMUFileOps based implementation. Also, use error_setg_errno(), since
the blk(_co)_p{readv,writev} functions return errno codes.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-18 15:32:13 +02:00
Wolfgang Bumiller ed23707ed7 bump version to 7.1.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-14 14:55:53 +02:00
Fiona Ebner 4e1935c2c9 {alloc track, pbs} block driver: bdrv_co_preadv: adapt return values
to be in-line with what other implementations in QEMU do. Commit
1d39c7098bbfa6862cb96066c4f8f6735ea397c5 mentions the EIO bit and
the function is expected to return 0 upon success (see other
implementations).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:36 +02:00
Fiona Ebner a262e9642b savevm async: cleaner initialization of target_close_wait member
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:34 +02:00
Fiona Ebner 73912aee39 cherry-pick upstream fixes for 7.1.0
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:32 +02:00
Fiona Ebner 5b15e2ecaf update submodule and patches to 7.1.0
Notable changes:
* The only big change is the switch to using a custom QIOChannel for
  savevm-async, because the previously used QEMUFileOps was dropped.

  Changes to the current implementation:

  * Switch to vector based methods as required for an IO channel. For
    short reads the passed-in IO vector is stuffed with zeroes at the
    end, just to be sure.

  * For reading: The documentation in include/io/channel.h states that
    at least one byte should be read, so also error out when we are
    at the very end instead of returning 0.

  * For reading: Fix off-by-one error when request goes beyond end.

    The wrong code piece was:
    if ((pos + size) > maxlen) {
        size = maxlen - pos - 1;
    }

    Previously, the last byte would not be read. It's actually
    possible to get a snapshot .raw file that has content all the way
    up to the final 512 byte (= BDRV_SECTOR_SIZE) boundary without any
    trailing zero bytes (I wrote a script to do it).

    Luckily, it didn't cause a real issue, because qemu_loadvm_state()
    is not interested in the final (i.e. QEMU_VM_VMDESCRIPTION)
    section. The buffer for reading it is simply freed up afterwards
    and the function will assume that it read the whole section, even
    if that's not the case.

  * For writing: Make use of the generated blk_pwritev() wrapper
    instead of manually wrapping the coroutine to simplify and save a
    few lines.

* Adapt to changed interfaces for blk_{pread,pwrite}:
  * a9262f551e ("block: Change blk_{pread,pwrite}() param order")
  * 3b35d4542c ("block: Add a 'flags' param to blk_pread()")
  * bf5b16fa40 ("block: Make blk_{pread,pwrite}() return 0 on success")
  Those changes especially affected the qemu-img dd patches, because
  the context also changed, but also some of our block drivers used
  the functions.

* Drop qemu-common.h include: it got renamed after essentially
  everything was moved to other headers. The only remaining user I
  could find for things dropped from the header between 7.0 and 7.1
  was qemu_get_vm_name() in the iscsi-initiatorname patch, but it
  already includes the header to which the function was moved.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:29 +02:00
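
Illustration for the off-by-one fix mentioned in 5b15e2ecaf above: a
minimal sketch of clamping a read request to the end of the state file.
The helper name and signature are hypothetical and this is not the code
from the patch; it only shows why the extra "- 1" dropped the last byte.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: clamp a read of `size` bytes at offset `pos` so
 * that it never extends past `maxlen`, the length of the state file.
 */
static size_t clamp_read_size(size_t pos, size_t size, size_t maxlen)
{
    if (pos >= maxlen) {
        /* At the very end: nothing left, the caller should error out. */
        return 0;
    }
    if (size > maxlen - pos) {
        /* The old code used maxlen - pos - 1 and lost the final byte. */
        size = maxlen - pos;
    }
    assert(size <= maxlen - pos);
    return size;
}
```

For a 512-byte file, a request of 100 bytes at offset 500 is clamped to
12 bytes, covering offsets 500 through 511 including the last byte.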
Wolfgang Bumiller 2775b2e378 bump version to 7.0.0-4
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-10 11:56:27 +02:00
Wolfgang Bumiller ed01236593 add patch: PVE Backup: allow passing max-workers performance setting
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-10 11:55:15 +02:00
Fiona Ebner 2b259b70ec d/rules: add revision to package version
This version string can be queried with $BINARY --version as well as
the query-version QMP command.

Useful for qemu-server to be able to report the running QEMU version
exactly. Could also be used to version guard against features as an
alternative to the query-proxmox-support QMP command.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-10 11:26:47 +02:00
Thomas Lamprecht a186335be5 bump version to 7.0.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-08-30 12:54:12 +02:00
Fiona Ebner 1976ca4607 savevm-async: set SAVE_STATE_DONE when closing state file was successful
Without this change, it's necessary to send a second savevm-end QMP
command after aborting a snapshot, before a new savevm-start QMP
command can succeed.

In process_savevm_finalize(), no longer set an error in the abort
scenario. If there already is another error, there's no need to
override it. If canceling was done intentionally, qmp_savevm_end()
is responsible for setting the state now.

Reported-by: Mira Limbeck <m.limbeck@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-08-19 09:44:16 +02:00
Fiona Ebner 563c592898 savevm-async: avoid segfault when aborting snapshot
Reported in the community forum[0].

For 6.1.0, there were a few changes to the coroutine-sleep API, but
the adaptations in f376b2b ("update and rebase to QEMU v6.1.0") made
a mistake.

Currently, target_close_wait is NULL when passed to
qemu_co_sleep_ns_wakeable(), which further passes it to
qemu_co_sleep(), but there, it is dereferenced when trying to access
the 'to_wake' member:

> Thread 1 "kvm" received signal SIGSEGV, Segmentation fault.
> qemu_co_sleep (w=0x0) at ../util/qemu-coroutine-sleep.c:57

To fix it, create a proper struct and pass its address instead. Also
call qemu_co_sleep_wake unconditionally, because the NULL check (for
the 'to_wake' member) is done inside the function itself.

This patch is based on what the QEMU commits introducing the changes
to the coroutine-sleep API did to the callers in QEMU:
eaee072085 ("coroutine-sleep: allow qemu_co_sleep_wake that wakes nothing")
29a6ea24eb ("coroutine-sleep: replace QemuCoSleepState pointer with struct in the API")

[0]: https://forum.proxmox.com/threads/112130/

Tested-by: Mira Limbeck <m.limbeck@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-08-19 09:44:14 +02:00
Thomas Lamprecht 1de53d8a45 bump version to 7.0.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-20 09:17:13 +02:00
Fabian Ebner 0e88ec19db add two more stable patches
For the io_uring patch, it's not very clear which configurations can
trigger it, but it should be rather uncommon. See qemu commit
be6a166fde652589761cf70471bcde623e9bd72a for a bit more information.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-07-19 17:22:10 +02:00
Wolfgang Bumiller 9ee866b2e9 bump version to 7.0.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-30 11:08:36 +02:00
Fabian Ebner 14ed554660 cherry-pick upstream fixes for 7.0.0
coming in via qemu-stable (except for the vmdk fix, which was tagged
for-7.0 on the qemu-devel list, but didn't make it into the release).

Also took the chance to switch the gluster fix to the version that
made it into upstream.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:30 +02:00
Fabian Ebner eba403aafc d/rules: adapt to changed opensbi riscv filenames in 7.0.0
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:28 +02:00