8 Commits (4b7975e75d4c067180ddc10f94018c569d581b01)

Author SHA1 Message Date
Fiona Ebner 4b7975e75d update submodule and patches to QEMU 8.1.5
Most notable fixes from a Proxmox VE perspective are:

* "virtio-net: correctly copy vnet header when flushing TX"
  To prevent a stack buffer overflow that could lead to leaking parts
  of the QEMU process's memory.
* "hw/pflash: implement update buffer for block writes"
  To prevent an edge case for half-completed writes. This potentially
  affected EFI disks.
* Fixes to i386 emulation and ARM emulation.

No manual changes to the patches were necessary (all changes are just
automatic context updates).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-02-02 19:06:29 +01:00
Fiona Ebner 10e1093325 update submodule and patches to QEMU 8.1.2
Bigger notable changes:

* Commit 1a30b0f5d7 ("block: .bdrv_open is non-coroutine and
  unlocked") broke the PVE backup patches, in particular setting up
  the backup dump block driver, because bdrv_new_open_driver() cannot
  be called from a coroutine. To fix it, bdrv_co_open() is used
  instead, and while it's a much more involved function, the result
  should be essentially the same. The only difference I noticed is
  that the BDRV_O_ALLOW_RDWR flag is also set in the resulting bds
  (block driver state), but that shouldn't hurt.

Smaller notable changes:

* aio_set_fd_handler() dropped its 'is_external' parameter in
  60f782b6b7 ("aio: remove aio_disable_external() API"), which states
  that all callers now pass false. The calls in the PVE patches also
  passed false, so the parameter is just dropped there too.

* global_state_store() does not have a return value anymore, so the
  user in the PVE savevm-async patch was adapted. For context, see
  c33f1829f8 ("migration: never fail in global_state_store()").

* Renames affecting the PVE savevm-async patch:
  migrate_use_block() -> migrate_block() and ram_counters -> mig_stats
  9d4b1e5f22 ("migration: Move migrate_use_block() to options.c")
  aff3f6606d ("migration: Rename ram_counters to mig_stats")

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-10-24 15:01:23 +02:00
Filip Schauer 0ff45eb23e backup: Fix spelling error in function name
Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
[FE: fixup patch context]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-08 11:13:04 +02:00
Fiona Ebner 9e0186f289 backup: drop broken BACKUP_FORMAT_DIR
Since upstream QEMU 8.0, it's no longer possible to call
bdrv_img_create() from a coroutine, meaning a backup with the
directory format would crash the QEMU instance.

The feature is only exposed via the monitor and was intended to be
experimental. There were no user reports about the breakage and it
was only noticed during the rebase for QEMU 8.1, because other parts
of the backup code needed adaptation and I decided to check the
BACKUP_FORMAT_DIR case too.

It should not stay in a broken state, of course, but rather than
fixing it, avoid the maintenance cost and retroactively make it a
removed feature for Proxmox VE 8.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-06 16:59:12 +02:00
Fiona Ebner 0cffb504e7 backup: create jobs in a drained section
With the drive-backup QMP command, upstream QEMU uses a drained
section for the source drive when creating the backup job. Do the same
here to avoid subtle bugs.

There, the drained section extends until after the job is started, but
this cannot be done here for multi-disk backups (could at most start
the first job). The important thing is that the cbw
(copy-before-write) node is in place and the bcs (block-copy-state)
bitmap is initialized, which both happen during job creation (ensured
by the "block/backup: move bcs bitmap initialization to job creation"
PVE patch).

One such bug was reported in the community forum [0], where using a
drive with iothread can lead to an overlapping block-copy request and
consequently an assertion failure. The block-copy code relies on the
bcs bitmap to determine if a request for a certain range can be
created. Each time a request is created, it resets the bcs bitmap at
that range to indicate that it's being handled.

The duplicate request can happen as follows:
1. Thread A attaches the cbw node.
2. Thread B creates a request and resets the bitmap at that range.
3. Thread A clears the bitmap and merges it with the PBS bitmap. The
   merging can lead to the bitmap being set again at the range of the
   previous request, so the block-copy code thinks it's fine to create
   a request there.
4. Thread B creates another request at an overlapping range before the
   first request is finished.
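
The sequence above can be replayed with a small toy model. This is only
a sketch under stated assumptions, not QEMU code: the class
BlockCopyState, its methods and the run() helper are hypothetical names,
the bitmap is modelled as a Python set of cluster indices, and the
assertion failure is represented by a RuntimeError.

```python
class BlockCopyState:
    """Toy stand-in for the bcs: a set bit means 'still needs copying'."""

    def __init__(self, nb_clusters):
        self.bitmap = set(range(nb_clusters))  # clusters marked in the bitmap
        self.in_flight = set()                 # clusters with a running request

    def create_request(self, cluster):
        """Start a request if the bitmap says the cluster is free."""
        if cluster not in self.bitmap:
            return False
        if cluster in self.in_flight:
            # Stand-in for the assertion failure on overlapping requests.
            raise RuntimeError("overlapping block-copy request")
        self.bitmap.discard(cluster)  # reset the bit: request is being handled
        self.in_flight.add(cluster)
        return True

    def init_bitmap(self, pbs_bitmap):
        """Clear the bitmap and merge in the (hypothetical) PBS bitmap."""
        self.bitmap = set(pbs_bitmap)


def run(drained):
    bcs = BlockCopyState(4)            # Thread A: cbw node is attached
    if not drained:
        bcs.create_request(1)          # Thread B: request slips in early
    bcs.init_bitmap({0, 1, 2, 3})      # Thread A: clear and merge
    try:
        bcs.create_request(1)          # Thread B: request at the same range
    except RuntimeError:
        return "assertion failure"
    return "ok"
```

Calling run(drained=False) replays the interleaving and hits the
overlap, while run(drained=True) models the drained section by not
letting the early request happen.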

The drained section ensures that nothing else can interfere with the
bcs bitmap between attaching the copy-before-write block node and
initialization of the bitmap.

[0]: https://forum.proxmox.com/threads/133149/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-06 16:59:12 +02:00
Fiona Ebner 5f9cb29c3a backup: trim heap after finishing
Reported in the community forum [0]. By default, there can be large
amounts of memory left assigned to the QEMU process after backup.
Likely because of fragmentation, it's necessary to explicitly call
malloc_trim() to tell glibc that it shouldn't keep all that memory
resident for the process.

QEMU itself already does a malloc_trim() in the RCU thread, but that
code path might not be reached (or not for a long time) under usual
operation. The value of 4 MiB for the argument was also copied from
there.
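
For illustration only, the glibc call can be tried from a Python
process via ctypes. This is a minimal sketch, not the PVE patch: the
helper name trim_heap() is hypothetical, and it assumes a glibc-based
system (it returns None where malloc_trim() is not available).

```python
import ctypes
import ctypes.util

TRIM_PAD = 4 * 1024 * 1024  # 4 MiB, the argument value mentioned above


def trim_heap(pad=TRIM_PAD):
    """Call glibc's malloc_trim(pad); return None if it is unavailable."""
    name = ctypes.util.find_library("c")
    if name is None:
        return None
    try:
        libc = ctypes.CDLL(name)
    except OSError:
        return None
    malloc_trim = getattr(libc, "malloc_trim", None)
    if malloc_trim is None:
        return None  # C library without malloc_trim()
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # glibc returns 1 if memory was released to the system, 0 otherwise.
    return malloc_trim(pad)
```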

Example with the following configuration:
> agent: 1
> boot: order=scsi0
> cores: 4
> cpu: x86-64-v2-AES
> ide2: none,media=cdrom
> memory: 1024
> name: backup-mem
> net0: virtio=DA:58:18:26:59:9F,bridge=vmbr0,firewall=1
> numa: 0
> ostype: l26
> scsi0: rbd:base-107-disk-0/vm-106-disk-1,size=4302M
> scsihw: virtio-scsi-pci
> smbios1: uuid=b2d4511e-8d01-44f1-afd6-9581b30c24a6
> sockets: 2
> startup: order=2
> virtio0: lvmthin:vm-106-disk-1,iothread=1,size=1G
> virtio1: lvmthin:vm-106-disk-2,iothread=1,size=1G
> virtio2: lvmthin:vm-106-disk-3,iothread=1,size=1G
> vmgenid: 0a1d8751-5e02-449d-977e-c0160e900231

Before the change:

> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  370948 kB
> root@pve8a1 ~ # vzdump 106 --storage pbs
> (...)
> INFO: Backup job finished successfully
> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	 2114964 kB

After the change:

> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  398788 kB
> root@pve8a1 ~ # vzdump 106 --storage pbs
> (...)
> INFO: Backup job finished successfully
> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  424356 kB
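
The growth in resident memory can be computed from the VmRSS lines
quoted above. The helper below is a small sketch; the function name
vmrss_kb() is an assumption, and it only parses the text of a
/proc/<pid>/status file that is passed in.

```python
import re


def vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status content."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


# Values copied from the two transcripts above.
before_patch = vmrss_kb("VmRSS:\t 2114964 kB") - vmrss_kb("VmRSS:\t  370948 kB")
after_patch = vmrss_kb("VmRSS:\t  424356 kB") - vmrss_kb("VmRSS:\t  398788 kB")
print(before_patch, after_patch)  # prints "1744016 25568"
```

So the backup left about 1.7 GiB of additional resident memory before
the change and about 25 MiB after it.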

[0]: https://forum.proxmox.com/threads/131339/

Co-diagnosed-by: Friedrich Weber <f.weber@proxmox.com>
Co-diagnosed-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2023-08-16 11:50:12 +02:00
Fiona Ebner d847446186 regenerate patches
There are still some context changes not covered by earlier series. No
functional change intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-06-15 13:55:22 +02:00
Fiona Ebner a816d2969e drop patch for custom get_link_status QMP command
There doesn't seem to be any Proxmox VE code using this.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-06-07 19:35:40 +02:00