Compare commits

...

150 Commits

Author SHA1 Message Date
95769aedb6 Add Vitastor support 2025-03-22 11:44:34 +03:00
Thomas Lamprecht
fa8486e62f bump version to 7.2.10-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-04-10 15:47:31 +02:00
Fiona Ebner
5d3fc48e11 pick up some extra fixes from upcoming 7.2.11
In particular, the i386 patches fix an issue that was newly introduced
in 7.2.10 and the LSI patches improve the reentrancy fix. The others
also sounded relevant and nice to have.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-10 15:43:57 +02:00
Fiona Ebner
a573598321 update patches and submodule to QEMU stable 7.2.10
Many stable fixes came in since the last bump, a few of which were
actually already present. Notable ones not yet present include a few
guest-triggerable assert fixes, some AHCI/IDE fixes (including the fix
for bug #2784), TGC fixes for i386 and ARM, VirtIO fixes, fix to avoid
VNC clipboard denial-of-service.

The reentrancy patches that landed upstream/stable were a newer
version than the ones backported initially here, so it was necessary
to explicitly drop them before rebase (which then picked up the
upstream version).

There were no other conflicts.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-10 15:43:57 +02:00
Thomas Lamprecht
8673d5e40a makefile: convert to use simple parenthesis
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 3c995a426d)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-29 13:58:31 +02:00
Thomas Lamprecht
cfe581f192 d/lintian-overrides: ignore groff line breakage/adjustment warnings
not much we can do here anyway..

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit be7ce325c7)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-29 13:58:31 +02:00
Thomas Lamprecht
555c2585ca d/lintian-overrides: sort
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 19b4b4c50f)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-29 13:58:31 +02:00
Thomas Lamprecht
899a49e30b d/parse-machines: produce stable json output
Enabling the "canonical" option the keys will be sorted, improving
build reproducibility.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(cherry picked from commit 590adba81a)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-29 13:58:31 +02:00
Thomas Lamprecht
6facdf3a08 also exclude hppa-firmware.img ROM from build
We don't use it and with debhelper compat level >= 11, the switch
from detecting files for strip through patters to checking for an ELF
header caused a build failure with the hppa-firmware.img ROM, as some
tools cannot cope with HP PARISC files.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 13:29:59 +02:00
Thomas Lamprecht
cb2b3190a4 move cleanup of unused ROMs from d/rules to build-dir generation
this way we save a bit of space and should make build also slightly
faster, otherwise nothing should change.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 13:29:59 +02:00
Thomas Lamprecht
2e416ad9d5 d/rules: fix debian-rules-missing-required-target
until we switch fully over to the dh sequencer

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 13:29:59 +02:00
Thomas Lamprecht
d80ca49db8 d/rules: cleanup cruft and use dpkg makefile fragements
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 13:29:59 +02:00
Thomas Lamprecht
d65b507d3f buildsys: update lintian overrides
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 13:29:59 +02:00
Thomas Lamprecht
98fd8612cb add .gitignore file
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 12:05:14 +02:00
Thomas Lamprecht
4f56d29218 buildsys: use shorter variable name $@ in $(BUILDIR) target
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 12:05:14 +02:00
Thomas Lamprecht
cd148033f3 buildsys: only run lintian for phony dsc target
This allows the sbuild to start much faster (lintian takes ~ minutes
for such big packages), and that without loss as sbuild will run
lintian on both binary and source package anyway.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 12:05:14 +02:00
Thomas Lamprecht
92c6d84f6a d/control: avoid versioned build-dependcies with a -1 revision
no effect besides making it harder to build this for an eventual
backport.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 12:05:14 +02:00
Thomas Lamprecht
b8af8dd4fa debian: normalize packaging files with wrap-and-sort -tkn
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-22 12:05:13 +02:00
Fiona Ebner
6eb3e31968 d/rules: fix comment about when clean target is executed
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:51:16 +02:00
Fiona Ebner
c913853be7 d/rules: move copying config.guess and config.sub to config.status target
It causes problems when done as part of the clean target when building
the dsc with the following error due to the additional files:
dpkg-source: error: aborting due to unexpected upstream changes

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:51:16 +02:00
Fiona Ebner
4fc4b533b5 buildsys: fix lintian overrides
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1007002 for more
information.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:51:16 +02:00
Fiona Ebner
023b916380 d/rules: set job flag for make based on DEB_BUILD_OPTIONS
Copied from Debian's QEMU package's d/rules. Otherwise, ninja will end
up using only a single job (in Debian Bookworm/Proxmox VE 8).

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:51:16 +02:00
Fiona Ebner
19a11f24a5 buildsys: expand clean target
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
 [ T: remove all tarballs for a package and any .deb ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:50:56 +02:00
Fiona Ebner
030fa1db4b buildsys: create build directory atomically
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:50:56 +02:00
Fiona Ebner
2d17b4b4d9 buildsys: add sbuild convenience target
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:50:56 +02:00
Fiona Ebner
280d157f1c buildsys: add dsc target
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:50:56 +02:00
Fiona Ebner
f6be0ca51a buildsys: derive upload dist automatically
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-21 15:50:56 +02:00
Thomas Lamprecht
93d558c1ee bump version to 7.2.0-8
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-17 15:48:12 +01:00
Fiona Ebner
e752bbe5e2 cherry-pick TCG-related stable fixes for 7.2
When turning off the "KVM hardware virtualization" checkbox in Proxmox
VE, the TCG accelerator is used, so these fixes are relevant then.

The first patch is included to allow cherry-picking the others without
changes.

Reported-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-17 15:46:20 +01:00
Thomas Lamprecht
018ef788b3 bump version to 7.2.0-8
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-17 12:12:02 +01:00
Fiona Ebner
72fc94c0c6 add patch fixing ACPI CPU hotplug issue with TCG
Required for the debian/edk2-vars-generator.py script in the
pve-edk2-firmware repository when building the edk2-stable202302
release. Without this patch, the QEMU process spawned by the script
would hang indefinietly.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-17 12:06:22 +01:00
Thomas Lamprecht
09186f4b6e bump version to 7.2.0-7
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-13 17:42:52 +01:00
Fiona Ebner
ffda59f626 add patches to fix regression with LSI SCSI controller
The patch 0008-memory-prevent-dma-reentracy-issues.patch introduced a
regression for the LSI SCSI controller leading to boot failures [0],
because, in its current form, it relies on reentrancy for a particular
ram_io region.

[0]: https://forum.proxmox.com/threads/123843

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-13 17:36:22 +01:00
Fiona Ebner
3c4f941ac7 add more stable fixes
The patches were selected from the recent "Patch Round-up for stable
7.2.1" [0]. Those that should be relevant for our supported use-cases
(and the upcoming nvme use-case) were picked. Most of the patches
added now have not been submitted to qemu-stable before.

The follow-up for the virtio-rng-pci migration fix will break
migration between versions with the fix and without the fix when a
virtio-pci-rng(-non)-transitional device is used. Luckily Proxmox VE
only uses the virtio-pci-rng device, and this was fixed by
0006-virtio-rng-pci-fix-migration-compat-for-vectors.patch which was
applied before any public version of Proxmox VE's QEMU 7.2 package was
released.

[0]: https://lists.nongnu.org/archive/html/qemu-stable/2023-03/msg00010.html
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2162569

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-13 17:36:19 +01:00
Fiona Ebner
3a94e1a186 fixup patch "ide: avoid potential deadlock when draining during trim"
The patch was incomplete and (re-)introduced an issue with a potential
failing assertion upon cancelation of the DMA request.

There is a patch on qemu-devel now[0], and it's the same as this one
code-wise (except for comments). But the discussion is still ongoing.
While there shouldn't be a real issue with the patch, there might be
better approaches. The plan is to use this as a stop-gap for now and
pick up the proper solution once it's ready.

[0]: https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg03325.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-13 17:36:19 +01:00
Thomas Lamprecht
67cae45f41 bump version to 7.2.0-6
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-08 14:32:22 +01:00
Fiona Ebner
58659169de add patch to avoid potential deadlock with trim for IDE/SATA and draining
In particular, the deadlock can occur, together with unlucky timing
between the QEMU threads, when the guest is issuing trim requests
during the start of a backup operation.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
 [ T: resolve trivial merge conflict in series file ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-03-08 14:22:36 +01:00
Fiona Ebner
10691e04e9 add patch fixing Linux boot failures with megasas SCSI
A regression in 7.2 and easily reproduced.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-07 19:50:12 +01:00
Thomas Lamprecht
09723b9298 bump version to 7.2.0-5
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-02-21 13:50:08 +01:00
Fiona Ebner
00e2507aac add fix for iscsi double free issue leading to crashes
Reported here[0] and here[1].

[0]: https://gitlab.com/qemu-project/qemu/-/issues/1378
[1]: https://forum.proxmox.com/threads/122776/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 13:49:19 +01:00
Fiona Ebner
e7e5f63573 add patch fixing DMA reentrancy issues
that could lead to use-after-frees and stack overflows with a
malicious (or buggy) guest. See [0] for a good summary:

[0]: https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 10:18:35 +01:00
Fiona Ebner
1688b43738 QMP backup: use correct errno when getting blockdrive length fails
di->size would only be set later. The errno is minus the return value
from the function.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 09:19:16 +01:00
Fiona Ebner
eee064d954 savevm-async: keep more free space when entering final stage
In qemu-server, we already allocate 2 * $mem_size + 500 MiB for driver
state (which was 32 MiB long ago according to git history). It seems
likely that the 30 MiB cutoff in the savevm-async implementation was
chosen based on that.

In bug #4476 [0], another issue caused the iteration to not make any
progress and the state file filled up all the way to the 30 MiB +
pending_size cutoff. Since the guest is not stopped immediately after
the check, it can still dirty some RAM and the current cutoff is not
enough for a reproducer VM (was done while bug #4476 still was not
fixed), dirtying memory with
> stress-ng -B 2 --bigheap-growth 64.0M'
After entering the final stage, savevm actually filled up the state
file completely, leading to an I/O error. It's probably the same
scenario as reported in the bug report, the error message was fixed in
commit a020815 ("savevm-async: fix function name in error message")
after the bug report.

If not for the bug, the cutoff will only be reached by a VM that's
dirtying RAM faster than can be written to the storage, so increase
the cutoff to 100 MiB to have a bigger chance to finish successfully,
while still trying to not increase downtime too much for
non-hibernation snapshots.

[0]: https://bugzilla.proxmox.com/show_bug.cgi?id=4476

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 08:39:08 +01:00
Fiona Ebner
8051a24b5f fix #4476: savevm-async: avoid looping without progress
when pend_postcopy is large. By definition, pend_postcopy won't
decrease when iterating, so a value larger than the cutoff of 400000
would lead to essentially empty iterations, filling up the state file
until only 30 MiB + pending_size remain and the second half of the
check would trigger.

Avoid this, by not considering pend_postcopy for the cutoff to enter
the final phase.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-21 08:39:08 +01:00
Fiona Ebner
ade9f50160 d/rules: add note explaining why using noopt doesn't currenlty work
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-14 10:04:21 +01:00
Fiona Ebner
0fde60fd10 d/rules: add missing export for CFLAGS
Otherwise, they don't affect the build of QEMU at all.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-14 10:04:21 +01:00
Thomas Lamprecht
d82c5eb632 bump version to 7.2.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-27 09:37:53 +01:00
Fiona Ebner
d5f6ef56f0 add patch to fix issue with VirtIO disk using detect-zeroes=unmap
Affects Proxmox VE, when the discard disk setting is used for a
VirtIO disk.

Upstream bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1404

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-27 09:36:41 +01:00
Fabian Grünbichler
658cba46ee d/control: also conflict with "qemu-system-data"
it ships files also shipped by our qemu package, switching from Debian qemu to
ours doesn't work without manual intervention otherwise..

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2023-01-26 10:55:37 +01:00
Fiona Ebner
a02081501a savevm-async: fix function name in error message
which also makes it distinguishable from the other
"qemu_savevm_state_iterate error" message.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-24 17:08:54 +01:00
Thomas Lamprecht
baf4e3132d bump version to 7.2.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-12 13:13:23 +01:00
Fiona Ebner
48c307550a add regression fix for migration with virtio-rng device
between QEMU less than 7.2 and QEMU 7.2 without the fix (both
directions are affected).

As mentioned in the patch message, this fix itself will break
migration between QEMU 7.2 and QEMU 7.2 with the fix (in both
directions, if a virtio-rng device is attached), but this is fine,
because no pve-qemu-kvm package with QEMU 7.2 has been publicly
released yet.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-12 13:10:19 +01:00
Thomas Lamprecht
89fdfe8975 bump version to 7.2.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-10 15:47:52 +01:00
Fiona Ebner
f64132208a cherry-pick stable fixes for 7.2
Two for virtio-mem and one for vIOMMU. Both features are not yet
exposed in PVE's qemu-server, but planned to be added.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-10 15:42:28 +01:00
Fiona Ebner
271ac0a8a7 add QAPI naming exceptions in patches introducing them
Avoids a patch and is required to compile when not all patches are
applied. No functional change is intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-10 15:42:16 +01:00
Fiona Ebner
f4ed54ec37 d/control: drop outdated jemalloc dependencies
Commit 3d785ea ("disable jemalloc") disabled jemalloc support, so
these are not needed anymore.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Fiona Ebner
2277182712 d/control: add libslirp-dev as a build dependency
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Fiona Ebner
0906461df0 d/rules: enable slirp again
Commit d03e1b3 ("update submodule and patches to 7.2.0") argued that
slirp is not explicitly supported in PVE, but that is not true. In
qemu-server, user networking is supported (via CLI/API) when no bridge
is set on a virtual NIC. So slirp needs to stay to keep such NICs
working.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-21 13:52:16 +01:00
Wolfgang Bumiller
29bee92c59 bump version to 7.2.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-12-16 13:23:29 +01:00
Fiona Ebner
82640bb859 d/rules: explicitly disable building slirp
Otherwise, it depends on whether libslirp-devel is installed or not.
See the previous commit message for more context.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-16 11:47:25 +01:00
Fiona Ebner
d03e1b3ce3 update submodule and patches to 7.2.0
User-facing breaking change:

The slirp submodule for user networking got removed. It would be
necessary to add the --enable-slirp option to the build and/or install
the appropriate library to continue building it. Since PVE is not
explicitly supporting it, it would require additionally installing the
libslirp0 package on all installations and there is *very* little
mention on the community forum when searching for "slirp" or
"netdev user", the plan is to only enable it again if there is some
real demand for it.

Notable changes:

* The big change for this release is the rework of job locking, using
  a job mutex and introducing _locked() variants of job API functions
  moving away from call-side AioContext locking. See (in the qemu
  submodule) commit 6f592e5aca ("job.c: enable job lock/unlock and
  remove Aiocontext locks") and previous commits for context.

  Changes required for the backup patches:
  * Use WITH_JOB_LOCK_GUARD() and call the _locked() variant of job
    API functions where appropriate (many are only availalbe as
    a _locked() variant).
  * Remove acquiring/releasing AioContext around functions taking the
    job mutex lock internally.

  The patch introducing sequential transaction support for jobs needs
  to temporarily unlock the job mutex to call job_start() when
  starting the next job in the transaction.

* The zeroinit block driver now marks its child as primary.

  The documentation in include/block/block-common.h states:
  > Filter node has exactly one FILTERED|PRIMARY child, and may have
  > other children which must not have these bits

  Without this, an assert will trigger when copying to a zeroinit target
  with qemu-img convert, because bdrv_child_cb_attach() expects any
  non-PRIMARY child to be not FILTERED:
  > qemu-img convert -n -p -f raw -O raw input.raw zeroinit:output.raw
  > qemu-img: ../block.c:1476: bdrv_child_cb_attach: Assertion
  > `!(child->role & BDRV_CHILD_FILTERED)' failed.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-16 11:47:20 +01:00
Thomas Lamprecht
55e33a045e bump version to 7.1.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-22 09:21:10 +01:00
Thomas Lamprecht
8a38e1da9e cherry-pick "block/block-backend: blk_set_enable_write_cache is IO_CODE"
albeit I was short from disarming that GLOBAL_STATE_CODE assert
completely, as its just bogus to assert that on runtime for a lot of
call sites, rather it should be verified on compilation (function
coloring with attributes and maybe a compiler plugin).

But, as this is already solved upstream lets take in that patch.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-22 09:19:00 +01:00
Thomas Lamprecht
3b3d5516ee bump version to 7.1.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-10-28 10:27:54 +02:00
Thomas Lamprecht
509409fb64 init: daemonize: defuse PID file resolve error to warning
fixes file restore, where we actively unlink the PID file of the
transient VM ourself after opening it - while we use it only for
tracking when the QEMU process itself has finished start up, it's
easier and cleaner to fix this regression now, than to rework that to
something that doesn't depends on the PID file at all.

Applying Fiona's patch as patch-patch tracked under extra, as I
expect that something similar to this gets accepted upstreamed.

Link: https://lists.proxmox.com/pipermail/pve-devel/2022-October/054448.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-10-28 10:22:26 +02:00
Wolfgang Bumiller
bf03cd367f bump version to 7.1.0-2
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-18 15:35:09 +02:00
Fiona Ebner
0af826b448 savevm async IO channel: channel writev: fix return value in error case
The documentation in include/io/channel.h states that -1 or
QIO_CHANNEL_ERR_BLOCK should be returned upon error. Simply passing
along the return value from the blk-functions has the potential to
confuse the call sides. Non-blocking mode is not implemented
currently, so -1 it is.

The "return ret" was mistakenly left over from the previous
QEMUFileOps based implementation. Also, use error_setg_errno(), since
the blk(_co)_p{readv,writev} functions return errno codes.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-18 15:32:13 +02:00
Wolfgang Bumiller
ed23707ed7 bump version to 7.1.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-14 14:55:53 +02:00
Fiona Ebner
4e1935c2c9 {alloc track, pbs} block driver: bdrv_co_preadv: adapt return values
to be in-line with what other implementations in QEMU do. Commit
1d39c7098bbfa6862cb96066c4f8f6735ea397c5 mentions the EIO bit and
the function is expected to return 0 upon success (see other
implementations).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:36 +02:00
Fiona Ebner
a262e9642b savevm async: cleaner initialization of target_close_wait member
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:34 +02:00
Fiona Ebner
73912aee39 cherry-pick upstream fixes for 7.1.0
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:32 +02:00
Fiona Ebner
5b15e2ecaf update submodule and patches to 7.1.0
Notable changes:
* The only big change is the switch to using a custom QIOChannel for
  savevm-async, because the previously used QEMUFileOps was dropped.

  Changes to the current implementation:

  * Switch to vector based methods as required for an IO channel. For
    short reads the passed-in IO vector is stuffed with zeroes at the
    end, just to be sure.

  * For reading: The documentation in include/io/channel.h states that
    at least one byte should be read, so also error out when whe are
    at the very end instead of returning 0.

  * For reading: Fix off-by-one error when request goes beyond end.

    The wrong code piece was:
    if ((pos + size) > maxlen) {
        size = maxlen - pos - 1;
    }

    Previously, the last byte would not be read. It's actually
    possible to get a snapshot .raw file that has content all the way
    up the final 512 byte (= BDRV_SECTOR_SIZE) boundary without any
    trailing zero bytes (I wrote a script to do it).

    Luckily, it didn't cause a real issue, because qemu_loadvm_state()
    is not interested in the final (i.e. QEMU_VM_VMDESCRIPTION)
    section. The buffer for reading it is simply freed up afterwards
    and the function will assume that it read the whole section, even
    if that's not the case.

  * For writing: Make use of the generated blk_pwritev() wrapper
    instead of manually wrapping the coroutine to simplify and save a
    few lines.

* Adapt to changed interfaces for blk_{pread,pwrite}:
  * a9262f551e ("block: Change blk_{pread,pwrite}() param order")
  * 3b35d4542c ("block: Add a 'flags' param to blk_pread()")
  * bf5b16fa40 ("block: Make blk_{pread,pwrite}() return 0 on success")
  Those changes especially affected the qemu-img dd patches, because
  the context also changed, but also some of our block drivers used
  the functions.

* Drop qemu-common.h include: it got renamed after essentially
  everything was moved to other headers. The only remaining user I
  could find for things dropped from the header between 7.0 and 7.1
  was qemu_get_vm_name() in the iscsi-initiatorname patch, but it
  already includes the header to which the function was moved.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-14 14:52:29 +02:00
Wolfgang Bumiller
2775b2e378 bump version to 7.0.0-4
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-10 11:56:27 +02:00
Wolfgang Bumiller
ed01236593 add patch: PVE Backup: allow passing max-workers performance setting
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-10-10 11:55:15 +02:00
Fiona Ebner
2b259b70ec d/rules: add revision to package version
This version string can be queried with $BINARY --version as well as
the query-version QMP command.

Useful for qemu-server to be able to report the running QEMU version
exactly. Could also be used to version guard against features as an
alternative to the query-proxmox-support QMP command.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-10-10 11:26:47 +02:00
Thomas Lamprecht
a186335be5 bump version to 7.0.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-08-30 12:54:12 +02:00
Fiona Ebner
1976ca4607 savevm-async: set SAVE_STATE_DONE when closing state file was successful
Without this change, it's necessary to send a second savevm-end QMP
command after aborting a snaphsot, before a new savevm-start QMP
command can succeed.

In process_savevm_finalize(), no longer set an error in the abort
scenario. If there already is another error, there's no need to
override it. If canceling was done intentionally, qmp_savevm_end()
is responsible for setting the state now.

Reported-by: Mira Limbeck <m.limbeck@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-08-19 09:44:16 +02:00
Fiona Ebner
563c592898 savevm-async: avoid segfault when aborting snapshot
Reported in the community forum[0].

For 6.1.0, there were a few changes to the coroutine-sleep API, but
the adaptations in f376b2b ("update and rebase to QEMU v6.1.0") made
a mistake.

Currently, target_close_wait is NULL when passed to
qemu_co_sleep_ns_wakeable(), which further passes it to
qemu_co_sleep(), but there, it is dereferenced when trying to access
the 'to_wake' member:

> Thread 1 "kvm" received signal SIGSEGV, Segmentation fault.
> qemu_co_sleep (w=0x0) at ../util/qemu-coroutine-sleep.c:57

To fix it, create a proper struct and pass its address instead. Also
call qemu_co_sleep_wake unconditionally, because the NULL check (for
the 'to_wake' member) is done inside the function itself.

This patch is based on what the QEMU commits introducing the changes
to the coroutine-sleep API did to the callers in QEMU:
eaee072085 ("coroutine-sleep: allow qemu_co_sleep_wake that wakes nothing")
29a6ea24eb ("coroutine-sleep: replace QemuCoSleepState pointer with struct in the API")

[0]: https://forum.proxmox.com/threads/112130/

Tested-by: Mira Limbeck <m.limbeck@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-08-19 09:44:14 +02:00
Thomas Lamprecht
1de53d8a45 bump version to 7.0.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-20 09:17:13 +02:00
Fabian Ebner
0e88ec19db add two more stable patches
For the io_uring patch, it's not very clear which configurations can
trigger it, but it should be rather uncommon. See qemu commit
be6a166fde652589761cf70471bcde623e9bd72a for a bit more information.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-07-19 17:22:10 +02:00
Wolfgang Bumiller
9ee866b2e9 bump version to 7.0.0-1
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-30 11:08:36 +02:00
Fabian Ebner
14ed554660 cherry-pick upstream fixes for 7.0.0
coming in via qemu-stable (except for the vdmk fix, which was tagged
for-7.0 on the qemu-devel list, but didn't make it into the release).

Also took the chance to switch the gluster fix to the version that
made it into upstream.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:30 +02:00
Fabian Ebner
eba403aafc d/rules: adapt to changed opensbi riscv filenames in 7.0.0
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:28 +02:00
Fabian Ebner
b2685aee04 d/rules: drop outdated configure flags
See QEMU commits 9e8be4c546ce8469ca9702715bf8f198d604b685 and
a5730b8bd3675f484ed0eacea052452048eeb35d for more information.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:25 +02:00
Fabian Ebner
dc9827a6a4 update submodule and patches to 7.0.0
Only very minor changes needed:
* Most patches in extra (or some version of them) are part of 7.0.0.
* aio_set_fd_handler got an extra parameter, but can just pass NULL
  like we did for the related 'poll' parameter. See QEMU commit
  826cc32423db2a99d184dbf4f507c737d7e7a4ae for more.
* Add include for qemu/memalign.h in vma.c and vma-writer.c.
* Add reverts for fixups of already reverted 0347a8fd4c ("block/rbd:
  implement bdrv_co_block_status") that came in with 7.0.0. Those
  fixups are not enough, see Proxmox bugzilla #4047.
* Two trivial context changes for bitmap-mirror patches.
* block_int.h got split up into multiple headers.
* Some context changes in configure and meson.build.
* Used the oppurtunity to squash fixup of bdrv_backuo_dump_create typo
  in a later patch into the patch introducing the function (had to
  move code to new header during rebase).

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-29 12:29:21 +02:00
Thomas Lamprecht
4e4b9ab13f bump version to 6.2.0-11
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-06-22 15:54:58 +02:00
Thomas Lamprecht
39e84ba82d vma/alloc-track improvements
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-06-22 15:52:16 +02:00
Thomas Lamprecht
4fd0fa7fb3 re-export patches in normalized form
iow. using:

git format-patch --zero-commit --no-signature --no-numbered --diff-algorithm=myers ...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-06-22 15:49:53 +02:00
Dominik Csapak
539e333eaa add 'namespace' to BlockdevOptionsPbs
so that we can use it for the -blockdev options (used for live-restore)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-06-22 15:10:49 +02:00
Fabian Grünbichler
68569ea2ff bump version to 6.2.0-10
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-06-09 16:35:57 +02:00
Fabian Grünbichler
41aedfb6db add d/source/include-binaries
to shutup dpkg-source when building a source package

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-06-09 16:35:01 +02:00
Fabian Ebner
7bd4d8645a fix #4101: acquire job's aio context before calling job_unref
Otherwise, we might run into an abort via bdrv_co_yield_to_drain()
(can at least happen when a disk with iothread is used):
> #0  0x00007fef4f5dece1 __GI_raise (libc.so.6 + 0x3bce1)
> #1  0x00007fef4f5c8537 __GI_abort (libc.so.6 + 0x25537)
> #2  0x00005641bce3c71f error_exit (qemu-system-x86_64 + 0x80371f)
> #3  0x00005641bce3d02b qemu_mutex_unlock_impl (qemu-system-x86_64 + 0x80402b)
> #4  0x00005641bcd51655 bdrv_co_yield_to_drain (qemu-system-x86_64 + 0x718655)
> #5  0x00005641bcd52de8 bdrv_do_drained_begin (qemu-system-x86_64 + 0x719de8)
> #6  0x00005641bcd47e07 blk_drain (qemu-system-x86_64 + 0x70ee07)
> #7  0x00005641bcd498cd blk_unref (qemu-system-x86_64 + 0x7108cd)
> #8  0x00005641bcd31e6f block_job_free (qemu-system-x86_64 + 0x6f8e6f)
> #9  0x00005641bcd32d65 job_unref (qemu-system-x86_64 + 0x6f9d65)
> #10 0x00005641bcd93b3d pvebackup_co_complete_stream (qemu-system-x86_64 + 0x75ab3d)
> #11 0x00005641bce4e353 coroutine_trampoline (qemu-system-x86_64 + 0x815353)

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-09 14:57:28 +02:00
Wolfgang Bumiller
ed3b5b8ab8 bump version to 6.2.0-9
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-08 14:04:09 +02:00
Wolfgang Bumiller
7f4326d1dc pbs cleanup fixes
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-08 13:10:51 +02:00
Wolfgang Bumiller
53bff441c5 delete patches which were dropped from the series file
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-06-08 13:07:04 +02:00
Thomas Lamprecht
f9a4b1cea7 bump version to 6.2.0-8
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-19 09:25:11 +02:00
Fabian Ebner
dc265df350 add revert to work around performance regression when backing up large RBD disk
resulting in QMP timeouts and very slow backups. The plan is to figure
out (ideally together with upstream) a way to make the implementation
of bdrv_co_block_status for RBD more efficient. But for now, revert
the problematic change as a stop-gap measure.

Upstream bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1026

Forum threads:
https://forum.proxmox.com/threads/109272/
https://forum.proxmox.com/threads/109448/
https://forum.proxmox.com/threads/101334/ (partially)

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-05-19 09:23:38 +02:00
Thomas Lamprecht
e0076597c6 bump version to 6.2.0-7
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-12 16:06:00 +02:00
Thomas Lamprecht
ee99c1f813 d/control: bump build-depenceny of proxmox-backup-qemu to 1.3.0-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-12 16:05:30 +02:00
Wolfgang Bumiller
58a5492e9c namespace support
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-05-12 13:49:35 +02:00
Thomas Lamprecht
e67b8b5bd9 bump version to 6.2.0-6
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-11 10:42:57 +02:00
Thomas Lamprecht
309b5c1694 backport various fixes for gluster, qxl and vnc
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-05-11 10:40:14 +02:00
Thomas Lamprecht
4ce5937f89 bump version to 6.2.0-5
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-25 10:13:50 +02:00
Thomas Lamprecht
f87d0523df vma: allow partial restore
Introduce a new map line for skipping a certain drive, of the form
skip=drive-scsi0

Since in PVE, most archives are compressed and piped to vma for
restore, it's not easily possible to skip reads.

For the reader, a new skip flag for VmaRestoreState is added and the
target is allowed to be NULL if skip is specified when registering.
If
the skip flag is set, no writes will be made as well as no check for
duplicate clusters. Therefore, the flag is not set for verify.

Originally-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-25 10:07:37 +02:00
Thomas Lamprecht
2fd4ea2813 patches: update context
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-25 10:07:01 +02:00
Thomas Lamprecht
2653a5f029 vma: restore: call blk_unref for all opened block devices
Originally-by: Fabian Ebner <f.ebner@proxmox.com>
Link: https://lists.proxmox.com/pipermail/pve-devel/2022-April/052642.html
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-25 10:05:29 +02:00
Thomas Lamprecht
664ecf59a9 bump version to 6.2.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 11:52:34 +02:00
Thomas Lamprecht
4de9440f87 various stable backports
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 10:22:39 +02:00
Thomas Lamprecht
9b348f8c6d d/copyright: drop trailing whitespace
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 09:16:23 +02:00
Thomas Lamprecht
799cf8c5a3 d/control: add suggest dependency-hint for libgl1
It pulls in a lot of stuff via the libglx0 -> libglx-mesa0 dependency
chain, so only suggest it for now to avoid installing it in the
installer or via common "PVE on-top Debian" installations, VirGL
integration is experimental after all and we may drop/replace it with
the vulkan based venus one, once available (Debian 12?).

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 09:14:39 +02:00
Thomas Lamprecht
b02e62dba0 d/control: add libgbm to build dependencies
required for good virgl support

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-22 08:50:36 +02:00
Thomas Lamprecht
fc03e1b6bf bump version to 6.2.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-15 09:09:43 +02:00
Thomas Lamprecht
c8ba14bed0 cherry-pick fix for passing some acpi slic tables
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-15 08:07:34 +02:00
Wolfgang Bumiller
daea534caa bump version to 6.2.0-2
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-03-03 12:05:37 +01:00
Fabian Ebner
27199bd753 backup: add patch to initialize bcs bitmap early enough for PBS
This is necessary for multi-disk backups where not all jobs are
immediately started after they are created. QEMU commit
06e0a9c16405c0a4c1eca33cf286cc04c42066a2 did already part of the work,
ensuring that new writes after job creation don't pass through to the
backup, but not yet for the MIRROR_SYNC_MODE_BITMAP case which is used
for PBS.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-03-03 11:37:17 +01:00
Thomas Lamprecht
e050683663 d/control: mark numactl a recommended package
we do not call in anywhere unconditionally

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
13184117e4 d/control: drop sdl dependency, we disable it on compile tinme
disabled via d/rules since a while...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
aa4b14ea10 d/control: libaio1 is added by dh shlibs
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
3aa5b7598d enable zstd support
plan to use that for multifd migration, among other things

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
13d3e10aa6 compile in virgl support
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-24 07:51:19 +01:00
Thomas Lamprecht
58c3533a58 bump version to 6.2.0-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-18 14:23:41 +01:00
Thomas Lamprecht
fe0542bed9 d/rules: disable libssh by default
was always disabled in our clean builds, this now also avoids
auto-enabling it on "dirty" build hosts

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-18 14:21:53 +01:00
Fabian Ebner
f6d40bfdf4 add patch for loading a snapshot with qemu-img dd
Will be used when cloning from a qcow2 efidisk.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-02-15 14:03:07 +01:00
Fabian Ebner
107132becc fix getopt-string when introducing -n option for qemu-img dd
The colon after U is wrong, because it doesn't take an argument.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-02-15 14:03:07 +01:00
Fabian Ebner
4567474e95 update submodule and patches to 6.2.0
Notable changes:
* bdrv_co_p{discard,readv,writev,write_zeroes} function signatures
  changed, to using int64_t for offsets/bytes and some still had int
  rather than BrdvRequestFlags for the flags.
* job_cancel_sync now has a force parameter. Commit messages in
  73895f3838cd7fdaf185cf1dbc47be58844a966f
  4cfb3f05627ad82af473e7f7ae113c3884cd04e3
  sound like using force=true makes more sense.
* Added 3 patches coming in via qemu-stable tag, most important one is
  to work around a librbd issue.
* Added another 3 patches from qemu-devel to fix issue leading to
  crash when live migrating with iothread.
* cluster_size calculation helper changed (see patch pve/0026).
* QAPI's if conditionals now use 'CONFIG_FOO' rather than
  'defined(CONFIG_FOO)'

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-02-15 14:03:07 +01:00
Thomas Lamprecht
33a2d3a826 bump version to 6.1.1-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-02-14 15:53:18 +01:00
Fabian Ebner
2bf61c3eb6 vma: create: register all streams before entering coroutines
Otherwise, the header might already get written by a coroutine and
registering further streams will fail after that.

Also adds a missing g_list_free call for the other GList that's used.

Reported in the community forum:
https://forum.proxmox.com/threads/104744/

Reproducer script (increase beyond 30 if the issue isn't triggered yet):
> #!/usr/bin/perl
>
> my $dir = "./vma-create-bug";
> mkdir $dir;
>
> my $archive_path = "$dir/vzdump-qemu-104-2202_02_02-00_00_00.vma";
> unlink $archive_path;
>
> my $cmd = "vma create $archive_path -v";
> for (my $i = 0; $i < 30; $i++) {
>   system("truncate -s 1M $dir/drive-virtio$i.img");
>   $cmd .= " drive-virtio$i=$dir/drive-virtio$i.img";
> }
> system($cmd);

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-02-14 15:38:58 +01:00
Thomas Lamprecht
c07b2203b3 bump version to 6.1.1-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-13 10:57:48 +01:00
Thomas Lamprecht
ddbf7a872d update submodule and patches to 6.1.1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-13 10:56:39 +01:00
Thomas Lamprecht
95c7156d1e bump version to 6.1.0-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-12-01 15:35:49 +01:00
Fabian Ebner
570d4ad51d fix #3738: cherry-pick "block: introduce max_hw_iov for use in scsi-generic"
which fixes the bad commit 18473467d55a20d643b6c9b3a52de42f705b4d35
that was tracked down via bisecting, and has a Cc for qemu-stable as
well.

Issue was easy enough to reproduce with a single virtio-block disk
using a few runs of dd if=/dev/urandom of=file bs=1M count=1000

Commit cc071629539dc1f303175a7e2d4ab854c0a8b20f upstream.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-12-01 15:34:27 +01:00
Dominik Csapak
c5e8e7c998 buildsys: fix build-dependencies on headers for 'vma' and 'pbs_restore'
both of them depend on generated header files, so we have to specify
them as sources. Otherwise, it happens (at least on some machines)
that they will be compiled before the headers are generated, aborting
the build.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-11-18 08:11:57 +01:00
Fabian Grünbichler
7cf6b60926 fix #3728: handle machine without type
libguestfs starts their helper VMs with `-machine accel=..` without a
machine type, and our pve version suffix handling would segfault in that
case. there might be other scripted use cases that are affected as well.

this regression was introduced with the rebase of our patch set on top
of 6.1.0

Fixes: f376b2b9e2

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-11-17 17:20:26 +01:00
Thomas Lamprecht
50999525c6 bump version to 6.1.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-11-16 09:38:18 +01:00
Fabian Grünbichler
edbcc10a69 cherry-pick segfault fix
this was reported multiple times in our forums[1 with backtraces, 2 & 3
with same log messages], fix is taken from upstream master.

1: https://forum.proxmox.com/threads/pve-7-0-14-1-vm-not-running-live-migration-kills-vm-post-ssd-move-pre-ram-move.99704/
2: https://forum.proxmox.com/threads/proxmox-7-0-14-1-crashes-vm-during-migrate-to-other-host.99678
3: https://forum.proxmox.com/threads/cannot-migrate-between-zfs-and-ceph.99685/#post-430152

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-11-16 09:23:43 +01:00
Thomas Lamprecht
cc707c362e bump version to 6.1.0-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-10-13 17:58:18 +02:00
Stefan Reiter
af64ed13eb add fixup patch for qxl migration logic
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-10-13 17:58:18 +02:00
Stefan Reiter
f376b2b9e2 update and rebase to QEMU v6.1.0
Very clean rebase, only the +pve version handling needed manual fixing.
Drops two applied patches from extra/ and adds one new from upstream
(extra/0001*, fixes VNC over unix sockets) as well as 3 of my own for
allowing password changes on custom VNC displays again (as seen and
reviewed upstream, but not yet applied).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-10-11 15:13:26 +02:00
Thomas Lamprecht
89fa943ef9 bump version to 6.0.0-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-09-06 07:30:05 +02:00
Stefan Reiter
26eee146bc add temporary QMP race fix
same as the initial version sent to qemu-devel, it won't be the final
fix we plan to upstream but it should be enough band-aid to
workaround how PVE uses the QMP.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
 [ Thomas: add a bit reasoning to commit message body ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-09-06 07:28:07 +02:00
Wolfgang Bumiller
277d33454f drop patch force-disabling smm
This drops debian/patches/pve/0005-PVE-Config-smm_available-false.patch
(and renumbers the remaining patches)

From what I could gather, this patch was originally added
due to issues with old kernels. Now we have users which
seem to run into issues *with* the patch.

All this does is toggle an option, and it's available via a
qemu CLI option anyway, so if dropping this patch causes
issues for some people we can just add an option to
qemu-server & UI control smm explicitly.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: Alexandre Derumier <aderumier@odiso.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
2021-08-24 11:19:05 +02:00
Fabian Grünbichler
611b692181 bump version to 6.0.0-3
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-08-03 15:02:40 +02:00
Fabian Ebner
0114d3cd02 io_uring: resubmit when result is -EAGAIN
Linux SCSI can throw spurious -EAGAIN in some corner cases in its
completion path, which will end up being the result in the completed
io_uring request.

Resubmitting such requests should allow block jobs to complete, even
if such spurious errors are encountered.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-07-29 11:51:57 +02:00
Thomas Lamprecht
bb3e494bdc bump version to 6.0.0-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-06-23 11:04:47 +02:00
Stefan Reiter
403f23a0c3 enable io-uring support in QEMU builds
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-21 09:56:06 +02:00
Thomas Lamprecht
db687e3cac buildsys: change upload dist to bullseye
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-06-08 11:18:10 +02:00
Thomas Lamprecht
8893def37c bump version to 6.0.0-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-28 11:30:55 +02:00
Stefan Reiter
263fef5b4c update keycodemapdb for 6.0
QEMU 6.0 requires the updated version to build correctly, as the
keymap-gen tool gained some new parameters.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-28 11:29:44 +02:00
Stefan Reiter
eb96e850ac debian: ignore submodule checks in QEMU build
...we do those manually, and the build dir is not a git repo.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-28 11:29:44 +02:00
Stefan Reiter
8dca018b68 udpate and rebase to QEMU v6.0.0
Mostly minor changes, bigger ones summarized:
* QEMU's internal backup code now uses a new async system, which allows
  parallel requests - the default max_workers settings is 64, I chose
  less, since 64 put enough stress on QEMU that the guest became
  practically unusable during the backup, and 16 still shows quite a
  nice measureable performance improvement. Little code changes for us
  though.
* 'malformed' QAPI parameters/functions are now a build error (i.e.
  using '_' vs '-'), I chose to just whitelist our calls in the name of
  backwards compatibility.
* monitor OOB race fix now uses the upstream variant, cherry-picked from
  origin/master since it's not in 6.0 by default
* last patch fixes a bug with snapshot rollback related to the new yank
  system

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-28 11:29:44 +02:00
120 changed files with 7255 additions and 2429 deletions

7
.gitignore vendored Normal file
View File

@@ -0,0 +1,7 @@
/*.build
/*.buildinfo
/*.changes
/*.deb
/*.dsc
/*.tar*
/pve-qemu-kvm-*.*/

View File

@@ -1,41 +1,77 @@
include /usr/share/dpkg/pkg-info.mk
include /usr/share/dpkg/architecture.mk
include /usr/share/dpkg/default.mk
PACKAGE = pve-qemu-kvm
SRCDIR := qemu
BUILDDIR ?= ${PACKAGE}-${DEB_VERSION_UPSTREAM}
BUILDDIR ?= $(PACKAGE)-$(DEB_VERSION_UPSTREAM)
ORIG_SRC_TAR=$(PACKAGE)_$(DEB_VERSION_UPSTREAM).orig.tar.gz
GITVERSION := $(shell git rev-parse HEAD)
DEB = ${PACKAGE}_${DEB_VERSION_UPSTREAM_REVISION}_${DEB_BUILD_ARCH}.deb
DEB_DBG = ${PACKAGE}-dbg_${DEB_VERSION_UPSTREAM_REVISION}_${DEB_BUILD_ARCH}.deb
DSC=$(PACKAGE)_$(DEB_VERSION_UPSTREAM_REVISION).dsc
DEB = $(PACKAGE)_$(DEB_VERSION_UPSTREAM_REVISION)_$(DEB_BUILD_ARCH).deb
DEB_DBG = $(PACKAGE)-dbg_$(DEB_VERSION_UPSTREAM_REVISION)_$(DEB_BUILD_ARCH).deb
DEBS = $(DEB) $(DEB_DBG)
all: $(DEBS)
.PHONY: submodule
submodule:
test -f "${SRCDIR}/configure" || git submodule update --init --recursive
test -f "$(SRCDIR)/configure" || git submodule update --init --recursive
PC_BIOS_FW_PURGE_LIST_IN = \
hppa-firmware.img \
openbios-ppc \
openbios-sparc32 \
openbios-sparc64 \
palcode-clipper \
s390-ccw.img \
s390-netboot.img \
u-boot.e500 \
.*\.dtb \
qemu_vga.ndrv \
slof.bin \
opensbi-riscv.*-generic-fw_dynamic.bin \
BLOB_PURGE_SED_CMDS = $(foreach FILE,$(PC_BIOS_FW_PURGE_LIST_IN),-e "/$(FILE)/d")
BLOB_PURGE_FILTER = $(foreach FILE,$(PC_BIOS_FW_PURGE_LIST_IN),-e "$(FILE)")
$(BUILDDIR): keycodemapdb | submodule
# check if qemu/ was used for a build
# if so, please run 'make distclean' in the submodule and try again
test ! -f $(SRCDIR)/build/config.status
rm -rf $(BUILDDIR)
cp -a $(SRCDIR) $(BUILDDIR)
cp -a debian $(BUILDDIR)/debian
rm -rf $(BUILDDIR)/ui/keycodemapdb
cp -a keycodemapdb $(BUILDDIR)/ui/
echo "git clone git://git.proxmox.com/git/pve-qemu.git\\ngit checkout $(GITVERSION)" > $(BUILDDIR)/debian/SOURCE
rm -rf $@.tmp $@
cp -a $(SRCDIR) $@.tmp
cp -a debian $@.tmp/debian
rm -rf $@.tmp/ui/keycodemapdb
cp -a keycodemapdb $@.tmp/ui/
find $@.tmp/pc-bios -type f | grep $(BLOB_PURGE_FILTER) | xargs rm -f
sed -i $(BLOB_PURGE_SED_CMDS) $@.tmp/pc-bios/meson.build
echo "git clone git://git.proxmox.com/git/pve-qemu.git\\ngit checkout $(GITVERSION)" > $@.tmp/debian/SOURCE
mv $@.tmp $@
.PHONY: deb kvm
deb kvm: $(DEBS)
$(DEB_DBG): $(DEB)
$(DEB): $(BUILDDIR)
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j
cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j32
lintian $(DEBS)
sbuild: $(DSC)
sbuild $(DSC)
$(ORIG_SRC_TAR): $(BUILDDIR)
tar czf $(ORIG_SRC_TAR) --exclude="$(BUILDDIR)/debian" $(BUILDDIR)
.PHONY: dsc
dsc:
rm -rf *.dsc $(BUILDDIR)
$(MAKE) $(DSC)
lintian $(DSC)
$(DSC): $(ORIG_SRC_TAR) $(BUILDDIR)
cd $(BUILDDIR); dpkg-buildpackage -S -us -uc -d
.PHONY: update
update:
cd $(SRCDIR) && git submodule deinit ui/keycodemapdb || true
@@ -48,13 +84,14 @@ update:
git add keycodemapdb
.PHONY: upload
upload: UPLOAD_DIST ?= $(DEB_DISTRIBUTION)
upload: $(DEBS)
tar cf - ${DEBS} | ssh repoman@repo.proxmox.com upload --product pve --dist buster
tar cf - $(DEBS) | ssh repoman@repo.proxmox.com upload --product pve --dist $(UPLOAD_DIST)
.PHONY: distclean clean
distclean: clean
clean:
rm -rf $(BUILDDIR) $(PACKAGE)*.deb *.buildinfo *.changes
rm -rf $(PACKAGE)-[0-9]*/ $(PACKAGE)*.tar* *.deb *.dsc *.build *.buildinfo *.changes
.PHONY: dinstall
dinstall: $(DEBS)

322
debian/changelog vendored
View File

@@ -1,3 +1,325 @@
pve-qemu-kvm (7.2.10-1+vitastor1) bullseye; urgency=medium
* Add Vitastor support
-- Vitaliy Filippov <vitalif@yourcmc.ru> Sat, 22 Mar 2025 11:41:12 +0300
pve-qemu-kvm (7.2.10-1) bullseye; urgency=medium
* update patches and submodule to QEMU stable 7.2.10
* pick up some extra fixes from upcoming 7.2.11 to, e.g., avoid releasing a
regression for i386 that happened in the 7.2.10 stable release.
-- Proxmox Support Team <support@proxmox.com> Wed, 10 Apr 2024 15:47:27 +0200
pve-qemu-kvm (7.2.0-8) bullseye; urgency=medium
* backport fix for ACPI CPU hotplug issue with TCG
* cherry-pick TCG-related stable fixes for 7.2 for users that turned off KVM
HW acceleration
-- Proxmox Support Team <support@proxmox.com> Fri, 17 Mar 2023 15:47:08 +0100
pve-qemu-kvm (7.2.0-7) bullseye; urgency=medium
* improve fix for potential deadlock with trim for IDE/SATA and draining
* backport stable fixes:
- hw/nvme: fix missing endian conversions for doorbell buffers
- hw/smbios: fix field corruption in type 4 table
- virtio-rng-pci: fix transitional migration compat for vectors
- hw/timer/hpet: Fix expiration time overflow
- vhost/vdpa: stop all svq on device deletion
- vhost: avoid a potential use of an uninitialized variable in the call to
vhost_svq_poll
- chardev/char-socket: set s->listener = NULL in char_socket_finalize to
fix a potential crash after live-migration
- intel-iommu: fail MAP notifier without caching mode
- intel-iommu: fail DEVIOTLB_UNMAP without dt mode
* fix a regression for when the LSI SCSI controller is used
-- Proxmox Support Team <support@proxmox.com> Mon, 13 Mar 2023 17:42:49 +0100
pve-qemu-kvm (7.2.0-6) bullseye; urgency=medium
* fix 7.2 regression for Linux boot failures with megasas SCSI
* fix 7.0 regression for a potential deadlock with trim for IDE/SATA and
draining
-- Proxmox Support Team <support@proxmox.com> Wed, 08 Mar 2023 14:32:17 +0100
pve-qemu-kvm (7.2.0-5) bullseye; urgency=medium
* fix #4476: savevm-async: avoid looping without progress
* savevm-async: decrease the boundary for free space for (memory) state left
on target from 30 MiB to 100 MiB, improving the heuristic for when to
enter the final "pause and sync" stage.
* QMP backup: use correct error number when getting blockdrive length fails
* backport fix for some DMA reentrancy issues, better protecting against
malicious guests
* backport fix for iSCSI double free issue leading to crashes
-- Proxmox Support Team <support@proxmox.com> Tue, 21 Feb 2023 13:49:43 +0100
pve-qemu-kvm (7.2.0-4) bullseye; urgency=medium
* backport fix for a 7.2 regression when using VirtIO disk with
detect-zeroes=unmap
-- Proxmox Support Team <support@proxmox.com> Fri, 27 Jan 2023 09:37:49 +0100
pve-qemu-kvm (7.2.0-3) bullseye; urgency=medium
* add fix for live-migration with virtio-rng devices, which regressed in
QEMU 7.2.0.
-- Proxmox Support Team <support@proxmox.com> Thu, 12 Jan 2023 13:13:14 +0100
pve-qemu-kvm (7.2.0-2) bullseye; urgency=medium
* enable slirp again for now, as in qemu-server, user networking is
supported (via CLI/API) when no bridge is set on a virtual NIC
* cherry-pick stable fixes for 7.2. Two for virtio-mem and one for vIOMMU.
Both features are not yet exposed in PVE's qemu-server, but there's work
going on to change that.
-- Proxmox Support Team <support@proxmox.com> Tue, 10 Jan 2023 15:47:48 +0100
pve-qemu-kvm (7.2.0-1) bullseye; urgency=medium
* update to QEMU stable release 7.2.0
* drop 'slirp' networking
-- Proxmox Support Team <support@proxmox.com> Fri, 16 Dec 2022 13:18:21 +0100
pve-qemu-kvm (7.1.0-4) bullseye; urgency=medium
* cherry-pick "block/block-backend: blk_set_enable_write_cache is IO_CODE"
-- Proxmox Support Team <support@proxmox.com> Tue, 22 Nov 2022 09:21:06 +0100
pve-qemu-kvm (7.1.0-3) bullseye; urgency=medium
* init: daemonize: defuse PID file resolve error to a warning at max, fixing
some usecases that regressed with 7.1, like tracking start up in our
file-restore VM.
-- Proxmox Support Team <support@proxmox.com> Fri, 28 Oct 2022 10:27:49 +0200
pve-qemu-kvm (7.1.0-2) bullseye; urgency=medium
* fix an issue with error handling in async backup code
-- Proxmox Support Team <support@proxmox.com> Tue, 18 Oct 2022 15:33:44 +0200
pve-qemu-kvm (7.1.0-1) bullseye; urgency=medium
* update to QEMU stable release 7.1.0
* add fix for io_uring_register_ring_fd from upstream
-- Proxmox Support Team <support@proxmox.com> Fri, 14 Oct 2022 14:54:09 +0200
pve-qemu-kvm (7.0.0-4) bullseye; urgency=medium
* add revision to version output
* PVE Backup: allow passing max-workers performance setting
-- Proxmox Support Team <support@proxmox.com> Mon, 10 Oct 2022 11:55:37 +0200
pve-qemu-kvm (7.0.0-3) bullseye; urgency=medium
* savevm-async: avoid segfault when aborting snapshot creation task
* savevm-async: set SAVE_STATE_DONE when closing state file was successful
allowing one to start a new snapshot task after aborting one.
-- Proxmox Support Team <support@proxmox.com> Tue, 30 Aug 2022 12:54:03 +0200
pve-qemu-kvm (7.0.0-2) bullseye; urgency=medium
* backport "io_uring: fix short read slow path"
* backport "e1000: set RX descriptor status in a separate operation"
-- Proxmox Support Team <support@proxmox.com> Wed, 20 Jul 2022 09:17:07 +0200
pve-qemu-kvm (7.0.0-1) bullseye; urgency=medium
* update to QEMU stable release 7.0.0
-- Proxmox Support Team <support@proxmox.com> Thu, 30 Jun 2022 11:07:37 +0200
pve-qemu-kvm (6.2.0-11) bullseye; urgency=medium
* add 'namespace' to BlockdevOptionsPbs for live-restore support
* vma: create: support 64KiB-unaligned input images like to improve backing
up some VM templates
* block: alloc-track: avoid unlikely, but possible premature break
-- Proxmox Support Team <support@proxmox.com> Wed, 22 Jun 2022 15:54:54 +0200
pve-qemu-kvm (6.2.0-10) bullseye; urgency=medium
* fix #4101: fix backup cancellation bug with iothreads
-- Proxmox Support Team <support@proxmox.com> Thu, 9 Jun 2022 16:35:51 +0200
pve-qemu-kvm (6.2.0-9) bullseye; urgency=medium
* fix possible race conditions during cancellation of a PBS backup
-- Proxmox Support Team <support@proxmox.com> Wed, 08 Jun 2022 14:03:22 +0200
pve-qemu-kvm (6.2.0-8) bullseye; urgency=medium
* revert "block/rbd: implement bdrv_co_block_status" to work around
performance regression when backing up large RBD disk
-- Proxmox Support Team <support@proxmox.com> Thu, 19 May 2022 09:24:45 +0200
pve-qemu-kvm (6.2.0-7) bullseye; urgency=medium
* Proxmox Backup Server namespace support
-- Proxmox Support Team <support@proxmox.com> Thu, 12 May 2022 16:05:56 +0200
pve-qemu-kvm (6.2.0-6) bullseye; urgency=medium
* block/gluster: correctly set max_pdiscard which is int64_t to avoid
triggering assertion
* ui/vnc.c: Fixed a deadlock bug
* display/qxl-render: fix race condition in qxl_cursor (CVE-2021-4207) and
integer overflow in cursor_alloc (CVE-2021-4206)
-- Proxmox Support Team <support@proxmox.com> Wed, 11 May 2022 10:42:53 +0200
pve-qemu-kvm (6.2.0-5) bullseye; urgency=medium
* vma: allow partial restore by skipping some disk
-- Proxmox Support Team <support@proxmox.com> Mon, 25 Apr 2022 10:13:46 +0200
pve-qemu-kvm (6.2.0-4) bullseye; urgency=medium
* d/control: add libgbm to build dependencies
* d/control: add suggest dependency-hint for libgl1
* various stable backports:
+ virtio-net: fix map leaking on error during receive
+ memory: Fix incorrect calls of log_global_start/stop
+ acpi: fix OEM ID/OEM Table ID padding
+ vhost-vsock: detach the virqueue element in case of error
+ vhost-user: remove VirtQ notifier restore
+ vhost-user: fix VirtQ notifier cleanup
+ virtio: fix the condition for iommu_platform not supported
-- Proxmox Support Team <support@proxmox.com> Fri, 22 Apr 2022 11:52:30 +0200
pve-qemu-kvm (6.2.0-3) bullseye; urgency=medium
* cherry-pick fix for some manually added ACPI table SLIC entries via the
custom args flag.
-- Proxmox Support Team <support@proxmox.com> Fri, 15 Apr 2022 09:09:37 +0200
pve-qemu-kvm (6.2.0-2) bullseye; urgency=medium
* compile in virgl support
* enable zstd support
* drop sdl dependency (it was disabled at compile time already)
* recommend 'numactl'
* fix an issue with multi-disk backups where chunks would be written
multiple times
-- Proxmox Support Team <support@proxmox.com> Thu, 03 Mar 2022 12:03:44 +0100
pve-qemu-kvm (6.2.0-1) bullseye; urgency=medium
* update to QEMU stable release 6.2.0
-- Proxmox Support Team <support@proxmox.com> Thu, 17 Feb 2022 06:23:14 +0100
pve-qemu-kvm (6.1.1-2) bullseye; urgency=medium
* vma: create: register all streams before entering coroutines to avoid that
an early stream starts to write already before all got registered.
-- Proxmox Support Team <support@proxmox.com> Mon, 14 Feb 2022 15:53:15 +0100
pve-qemu-kvm (6.1.1-1) bullseye; urgency=medium
* update to 6.1.1 stable release
-- Proxmox Support Team <support@proxmox.com> Thu, 13 Jan 2022 10:57:43 +0100
pve-qemu-kvm (6.1.0-3) bullseye; urgency=medium
* fix #3738: cherry-pick "block: introduce max_hw_iov for use in scsi-
generic
-- Proxmox Support Team <support@proxmox.com> Wed, 01 Dec 2021 15:35:43 +0100
pve-qemu-kvm (6.1.0-2) bullseye; urgency=medium
* avoid a possible segmentation fault during block (disk) mirror
-- Proxmox Support Team <support@proxmox.com> Tue, 16 Nov 2021 09:38:10 +0100
pve-qemu-kvm (6.1.0-1) bullseye; urgency=medium
* update to QEMU stable release 6.1.0
-- Proxmox Support Team <support@proxmox.com> Mon, 11 Oct 2021 15:15:19 +0200
pve-qemu-kvm (6.0.0-4) bullseye; urgency=medium
* drop the ancient workaround that force disabled SMM due to observing VM
hangs on old kernel versions.
* monitor/qmp: fix race with clients disconnecting early resulting in other
clients receiving a message with the (now wrong) ID of the former
-- Proxmox Support Team <support@proxmox.com> Mon, 06 Sep 2021 07:30:00 +0200
pve-qemu-kvm (6.0.0-3) bullseye; urgency=medium
* io_uring: resubmit when result is -EAGAIN
-- Proxmox Support Team <support@proxmox.com> Tue, 3 Aug 2021 15:01:31 +0200
pve-qemu-kvm (6.0.0-2) bullseye; urgency=medium
* enable io-uring support in QEMU builds
-- Proxmox Support Team <support@proxmox.com> Wed, 23 Jun 2021 11:03:54 +0200
pve-qemu-kvm (6.0.0-1) bullseye; urgency=medium
* update to QEMU stable release 6.0.0
-- Proxmox Support Team <support@proxmox.com> Fri, 28 May 2021 11:30:50 +0200
pve-qemu-kvm (5.2.0-11) bullseye; urgency=medium
* re-build for Proxmox VE 7 / Debian Bullseye

23
debian/control vendored
View File

@@ -10,28 +10,34 @@ Build-Depends: autotools-dev,
libattr1-dev,
libcap-ng-dev,
libcurl4-gnutls-dev,
libepoxy-dev,
libfdt-dev,
libgbm-dev,
libglusterfs-dev (>= 5.2-2),
libgnutls28-dev,
libiscsi-dev (>= 1.12.0),
libjemalloc-dev,
libjpeg-dev,
libjson-perl,
libnuma-dev,
libpci-dev,
libpixman-1-dev,
libproxmox-backup-qemu0-dev (>= 1.0.3-1),
libproxmox-backup-qemu0-dev (>= 1.3.0),
librbd-dev (>= 0.48),
libsdl1.2-dev,
libseccomp-dev,
libslirp-dev,
libspice-protocol-dev (>= 0.12.14~),
libspice-server-dev (>= 0.14.0~),
libsystemd-dev,
libusb-1.0-0-dev (>= 1.0.17-1),
liburing-dev,
libusb-1.0-0-dev (>= 1.0.17),
libusbredirparser-dev (>= 0.6-2),
libvirglrenderer-dev,
libzstd-dev,
meson,
python3-minimal,
python3-sphinx,
python3-sphinx-rtd-theme,
quilt,
texi2html,
texinfo,
@@ -43,7 +49,6 @@ Package: pve-qemu-kvm
Architecture: any
Depends: ceph-common (>= 0.48),
iproute2,
libaio1,
libgfapi0 | glusterfs-common (>= 5.6),
libgfchangelog0 | glusterfs-common (>= 5.6),
libgfdb0 | glusterfs-common (>= 5.6),
@@ -52,16 +57,15 @@ Depends: ceph-common (>= 0.48),
libglusterfs-dev | glusterfs-common (>= 5.6),
libglusterfs0 | glusterfs-common (>= 5.6),
libiscsi4 (>= 1.12.0) | libiscsi7,
libjemalloc2,
libjpeg62-turbo,
libsdl1.2debian,
libspice-server1 (>= 0.14.0~),
libusb-1.0-0 (>= 1.0.17-1),
libusbredirparser1 (>= 0.6-2),
libuuid1,
numactl,
${misc:Depends},
${shlibs:Depends},
Recommends: numactl,
Suggests: libgl1,
Conflicts: kvm,
pve-kvm,
pve-qemu-kvm-2.6.18,
@@ -69,9 +73,10 @@ Conflicts: kvm,
qemu-kvm,
qemu-system-arm,
qemu-system-common,
qemu-system-data,
qemu-system-x86,
qemu-utils,
Provides: qemu-system-arm, qemu-system-x86, qemu-utils
Provides: qemu-system-arm, qemu-system-x86, qemu-utils,
Replaces: pve-kvm,
pve-qemu-kvm-2.6.18,
qemu-system-arm,
@@ -85,6 +90,6 @@ Description: Full virtualization on x86 hardware
Package: pve-qemu-kvm-dbg
Architecture: any
Section: debug
Depends: pve-qemu-kvm (= ${binary:Version})
Depends: pve-qemu-kvm (= ${binary:Version}),
Description: pve qemu debugging symbols
This package contains the debugging symbols for pve-qemu-kvm.

2
debian/copyright vendored
View File

@@ -25,7 +25,7 @@ License:
In particular, the QEMU virtual CPU core library (libqemu.a) is
released under the GNU Lesser General Public License version 2 or later.
On Debian systems, the complete text of the GNU Lesser General Public
On Debian systems, the complete text of the GNU Lesser General Public
License can be found in the file /usr/share/common-licenses/LGPL.
Some hardware device emulation sources and other QEMU functionality are

View File

@@ -24,4 +24,5 @@ while (<STDIN>) {
die "no QEMU machine types detected from STDIN input" if scalar (@$machines) <= 0;
print to_json($machines, { utf8 => 1 }) or die "$!\n";
print to_json($machines, { utf8 => 1, canonical => 1 })
or die "failed to encode detected machines as JSON - $!\n";

View File

@@ -26,19 +26,20 @@ Suggested-by: Ma Haocong <mahaocong@didichuxing.com>
Signed-off-by: Ma Haocong <mahaocong@didichuxing.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/mirror.c | 98 ++++++++++++++++++++++++++++++-------
blockdev.c | 39 +++++++++++++--
include/block/block_int.h | 4 +-
qapi/block-core.json | 29 +++++++++--
tests/test-block-iothread.c | 4 +-
block/mirror.c | 98 +++++++++++++++++++++-----
blockdev.c | 39 +++++++++-
include/block/block_int-global-state.h | 4 +-
qapi/block-core.json | 29 ++++++--
tests/unit/test-block-iothread.c | 4 +-
5 files changed, 145 insertions(+), 29 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 8e1ad6eceb..97843992c2 100644
index 251adc5ae0..8ead5f77a0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -50,7 +50,7 @@ typedef struct MirrorBlockJob {
@@ -51,7 +51,7 @@ typedef struct MirrorBlockJob {
BlockDriverState *to_replace;
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
@@ -56,7 +57,7 @@ index 8e1ad6eceb..97843992c2 100644
BdrvDirtyBitmap *dirty_bitmap;
BdrvDirtyBitmapIter *dbi;
uint8_t *buf;
@@ -677,7 +679,8 @@ static int mirror_exit_common(Job *job)
@@ -699,7 +701,8 @@ static int mirror_exit_common(Job *job)
bdrv_child_refresh_perms(mirror_top_bs, mirror_top_bs->backing,
&error_abort);
if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
@@ -66,7 +67,7 @@ index 8e1ad6eceb..97843992c2 100644
BlockDriverState *unfiltered_target = bdrv_skip_filters(target_bs);
if (bdrv_cow_bs(unfiltered_target) != backing) {
@@ -774,6 +777,16 @@ static void mirror_abort(Job *job)
@@ -797,6 +800,16 @@ static void mirror_abort(Job *job)
assert(ret == 0);
}
@@ -83,7 +84,7 @@ index 8e1ad6eceb..97843992c2 100644
static void coroutine_fn mirror_throttle(MirrorBlockJob *s)
{
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -955,7 +968,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
@@ -977,7 +990,8 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
mirror_free_init(s);
s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -93,23 +94,23 @@ index 8e1ad6eceb..97843992c2 100644
ret = mirror_dirty_init(s);
if (ret < 0 || job_is_cancelled(&s->common.job)) {
goto immediate_exit;
@@ -1188,6 +1202,7 @@ static const BlockJobDriver mirror_job_driver = {
@@ -1224,6 +1238,7 @@ static const BlockJobDriver mirror_job_driver = {
.run = mirror_run,
.prepare = mirror_prepare,
.abort = mirror_abort,
+ .clean = mirror_clean,
.pause = mirror_pause,
.complete = mirror_complete,
},
@@ -1203,6 +1218,7 @@ static const BlockJobDriver commit_active_job_driver = {
.cancel = mirror_cancel,
@@ -1240,6 +1255,7 @@ static const BlockJobDriver commit_active_job_driver = {
.run = mirror_run,
.prepare = mirror_prepare,
.abort = mirror_abort,
+ .clean = mirror_clean,
.pause = mirror_pause,
.complete = mirror_complete,
},
@@ -1550,7 +1566,10 @@ static BlockJob *mirror_start_job(
.cancel = commit_active_cancel,
@@ -1627,7 +1643,10 @@ static BlockJob *mirror_start_job(
BlockCompletionFunc *cb,
void *opaque,
const BlockJobDriver *driver,
@@ -121,8 +122,8 @@ index 8e1ad6eceb..97843992c2 100644
bool auto_complete, const char *filter_node_name,
bool is_mirror, MirrorCopyMode copy_mode,
Error **errp)
@@ -1563,10 +1582,39 @@ static BlockJob *mirror_start_job(
Error *local_err = NULL;
@@ -1639,10 +1658,39 @@ static BlockJob *mirror_start_job(
uint64_t target_perms, target_shared_perms;
int ret;
- if (granularity == 0) {
@@ -163,7 +164,7 @@ index 8e1ad6eceb..97843992c2 100644
assert(is_power_of_2(granularity));
if (buf_size < 0) {
@@ -1705,7 +1753,9 @@ static BlockJob *mirror_start_job(
@@ -1774,7 +1822,9 @@ static BlockJob *mirror_start_job(
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
@@ -174,7 +175,7 @@ index 8e1ad6eceb..97843992c2 100644
s->backing_mode = backing_mode;
s->zero_target = zero_target;
s->copy_mode = copy_mode;
@@ -1726,6 +1776,18 @@ static BlockJob *mirror_start_job(
@@ -1795,6 +1845,18 @@ static BlockJob *mirror_start_job(
bdrv_disable_dirty_bitmap(s->dirty_bitmap);
}
@@ -193,7 +194,7 @@ index 8e1ad6eceb..97843992c2 100644
ret = block_job_add_bdrv(&s->common, "source", bs, 0,
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
BLK_PERM_CONSISTENT_READ,
@@ -1803,6 +1865,9 @@ fail:
@@ -1872,6 +1934,9 @@ fail:
if (s->dirty_bitmap) {
bdrv_release_dirty_bitmap(s->dirty_bitmap);
}
@@ -203,7 +204,7 @@ index 8e1ad6eceb..97843992c2 100644
job_early_fail(&s->common.job);
}
@@ -1820,29 +1885,23 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
@@ -1889,31 +1954,25 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, const char *replaces,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
@@ -220,6 +221,8 @@ index 8e1ad6eceb..97843992c2 100644
- bool is_none_mode;
BlockDriverState *base;
GLOBAL_STATE_CODE();
- if ((mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
- (mode == MIRROR_SYNC_MODE_BITMAP)) {
- error_setg(errp, "Sync mode '%s' not supported",
@@ -238,7 +241,7 @@ index 8e1ad6eceb..97843992c2 100644
}
BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -1868,7 +1927,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -1940,7 +1999,8 @@ BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_LEAVE_BACKING_CHAIN, false,
on_error, on_error, true, cb, opaque,
@@ -246,13 +249,13 @@ index 8e1ad6eceb..97843992c2 100644
+ &commit_active_job_driver, MIRROR_SYNC_MODE_FULL,
+ NULL, 0, base, auto_complete,
filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
&local_err);
if (local_err) {
errp);
if (!job) {
diff --git a/blockdev.c b/blockdev.c
index fe6fb5dc1d..394920613d 100644
index ae27a41efa..a0c7e0c13b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2930,6 +2930,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2956,6 +2956,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
BlockDriverState *target,
bool has_replaces, const char *replaces,
enum MirrorSyncMode sync,
@@ -263,7 +266,7 @@ index fe6fb5dc1d..394920613d 100644
BlockMirrorBackingMode backing_mode,
bool zero_target,
bool has_speed, int64_t speed,
@@ -2949,6 +2953,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -2975,6 +2979,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
{
BlockDriverState *unfiltered_bs;
int job_flags = JOB_DEFAULT;
@@ -271,7 +274,7 @@ index fe6fb5dc1d..394920613d 100644
if (!has_speed) {
speed = 0;
@@ -3003,6 +3008,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3029,6 +3034,29 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}
@@ -301,7 +304,7 @@ index fe6fb5dc1d..394920613d 100644
if (!has_replaces) {
/* We want to mirror from @bs, but keep implicit filters on top */
unfiltered_bs = bdrv_skip_implicit_filters(bs);
@@ -3049,8 +3077,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3075,8 +3103,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
* and will allow to check whether the node still exist at mirror completion
*/
mirror_start(job_id, bs, target,
@@ -312,7 +315,7 @@ index fe6fb5dc1d..394920613d 100644
on_source_error, on_target_error, unmap, filter_node_name,
copy_mode, errp);
}
@@ -3195,6 +3223,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
@@ -3221,6 +3249,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
arg->has_replaces, arg->replaces, arg->sync,
@@ -321,7 +324,7 @@ index fe6fb5dc1d..394920613d 100644
backing_mode, zero_target,
arg->has_speed, arg->speed,
arg->has_granularity, arg->granularity,
@@ -3216,6 +3246,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
@@ -3242,6 +3272,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
const char *device, const char *target,
bool has_replaces, const char *replaces,
MirrorSyncMode sync,
@@ -330,7 +333,7 @@ index fe6fb5dc1d..394920613d 100644
bool has_speed, int64_t speed,
bool has_granularity, uint32_t granularity,
bool has_buf_size, int64_t buf_size,
@@ -3265,7 +3297,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
@@ -3291,7 +3323,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
}
blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs,
@@ -340,11 +343,11 @@ index fe6fb5dc1d..394920613d 100644
zero_target, has_speed, speed,
has_granularity, granularity,
has_buf_size, buf_size,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 95d9333be1..6f8eda629a 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1230,7 +1230,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index b49f4eb35b..9d744db618 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -149,7 +149,9 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, const char *replaces,
int creation_flags, int64_t speed,
uint32_t granularity, int64_t buf_size,
@@ -356,10 +359,10 @@ index 95d9333be1..6f8eda629a 100644
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 04ad80bc1e..9db3120716 100644
index 95ac4fa634..7daaf545be 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1971,10 +1971,19 @@
@@ -2000,10 +2000,19 @@
# (all the disk, only the sectors allocated in the topmost image, or
# only new I/O).
#
@@ -380,7 +383,7 @@ index 04ad80bc1e..9db3120716 100644
#
# @buf-size: maximum amount of data in flight from source to
# target (since 1.4).
@@ -2012,7 +2021,9 @@
@@ -2043,7 +2052,9 @@
{ 'struct': 'DriveMirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*format': 'str', '*node-name': 'str', '*replaces': 'str',
@@ -391,7 +394,7 @@ index 04ad80bc1e..9db3120716 100644
'*speed': 'int', '*granularity': 'uint32',
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
@@ -2280,10 +2291,19 @@
@@ -2322,10 +2333,19 @@
# (all the disk, only the sectors allocated in the topmost image, or
# only new I/O).
#
@@ -412,7 +415,7 @@ index 04ad80bc1e..9db3120716 100644
#
# @buf-size: maximum amount of data in flight from source to
# target
@@ -2332,7 +2352,8 @@
@@ -2375,7 +2395,8 @@
{ 'command': 'blockdev-mirror',
'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
'*replaces': 'str',
@@ -422,11 +425,11 @@ index 04ad80bc1e..9db3120716 100644
'*speed': 'int', '*granularity': 'uint32',
'*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
'*on-target-error': 'BlockdevOnError',
diff --git a/tests/test-block-iothread.c b/tests/test-block-iothread.c
index 3f866a35c6..500ede71c8 100644
--- a/tests/test-block-iothread.c
+++ b/tests/test-block-iothread.c
@@ -623,8 +623,8 @@ static void test_propagate_mirror(void)
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index 8ca5adec5e..dae80e5a5f 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -755,8 +755,8 @@ static void test_propagate_mirror(void)
/* Start a mirror job */
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,
@@ -436,4 +439,4 @@ index 3f866a35c6..500ede71c8 100644
+ false, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
false, "filter_node", MIRROR_COPY_MODE_BACKGROUND,
&error_abort);
job = job_get("job0");
WITH_JOB_LOCK_GUARD() {

View File

@@ -18,15 +18,16 @@ incremental backup modes; we can use this bitmap to later refresh a
successfully created mirror.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/mirror.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 97843992c2..d1cce079da 100644
index 8ead5f77a0..35c1b8f25d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -654,8 +654,6 @@ static int mirror_exit_common(Job *job)
@@ -676,8 +676,6 @@ static int mirror_exit_common(Job *job)
bdrv_unfreeze_backing_chain(mirror_top_bs, target_bs);
}
@@ -35,9 +36,9 @@ index 97843992c2..d1cce079da 100644
/* Make sure that the source BDS doesn't go away during bdrv_replace_node,
* before we can call bdrv_drained_end */
bdrv_ref(src);
@@ -755,6 +753,18 @@ static int mirror_exit_common(Job *job)
blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, &error_abort);
blk_insert_bs(bjob->blk, mirror_top_bs, &error_abort);
@@ -778,6 +776,18 @@ static int mirror_exit_common(Job *job)
block_job_remove_all_bdrv(bjob);
bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
+ if (s->sync_bitmap) {
+ if (s->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS ||
@@ -54,7 +55,7 @@ index 97843992c2..d1cce079da 100644
bs_opaque->job = NULL;
bdrv_drained_end(src);
@@ -1592,10 +1602,6 @@ static BlockJob *mirror_start_job(
@@ -1668,10 +1678,6 @@ static BlockJob *mirror_start_job(
" sync mode",
MirrorSyncMode_str(sync_mode));
return NULL;
@@ -65,7 +66,7 @@ index 97843992c2..d1cce079da 100644
}
} else if (bitmap) {
error_setg(errp,
@@ -1612,6 +1618,12 @@ static BlockJob *mirror_start_job(
@@ -1688,6 +1694,12 @@ static BlockJob *mirror_start_job(
return NULL;
}
granularity = bdrv_dirty_bitmap_granularity(bitmap);

View File

@@ -10,15 +10,16 @@ as one without the other does not make much sense with the current set
of modes.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
blockdev.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/blockdev.c b/blockdev.c
index 394920613d..4f8bd38b58 100644
index a0c7e0c13b..98b9dff154 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3029,6 +3029,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3055,6 +3055,9 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_ALLOW_RO, errp)) {
return;
}

View File

@@ -10,15 +10,16 @@ since sync_bitmap is busy at the point of merging, and we checked access
beforehand.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/mirror.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
block/mirror.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index d1cce079da..e6140cf018 100644
index 35c1b8f25d..4969c6833c 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -759,8 +759,8 @@ static int mirror_exit_common(Job *job)
@@ -782,8 +782,8 @@ static int mirror_exit_common(Job *job)
job->ret == 0 && ret == 0)) {
/* Success; synchronize copy back to sync. */
bdrv_clear_dirty_bitmap(s->sync_bitmap, NULL);
@@ -29,14 +30,17 @@ index d1cce079da..e6140cf018 100644
}
}
bdrv_release_dirty_bitmap(s->dirty_bitmap);
@@ -1793,8 +1793,8 @@ static BlockJob *mirror_start_job(
@@ -1862,11 +1862,8 @@ static BlockJob *mirror_start_job(
}
if (s->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
- bdrv_merge_dirty_bitmap(s->dirty_bitmap, s->sync_bitmap,
- NULL, &local_err);
- if (local_err) {
- goto fail;
- }
+ bdrv_dirty_bitmap_merge_internal(s->dirty_bitmap, s->sync_bitmap,
+ NULL, true);
if (local_err) {
goto fail;
}
}
ret = block_job_add_bdrv(&s->common, "source", bs, 0,

View File

@@ -20,11 +20,11 @@ intentionally keeping copyright and ownership of original test case to
honor provenance.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
tests/qemu-iotests/384 | 547 +++++++
tests/qemu-iotests/384.out | 2846 ++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/group | 1 +
3 files changed, 3394 insertions(+)
2 files changed, 3393 insertions(+)
create mode 100755 tests/qemu-iotests/384
create mode 100644 tests/qemu-iotests/384.out
@@ -3433,15 +3433,3 @@ index 0000000000..9b7408b6d6
+{"execute": "blockdev-mirror", "arguments": {"bitmap": "bitmap0", "device": "drive0", "filter-node-name": "mirror-top", "job-id": "api_job", "sync": "none", "target": "mirror_target"}}
+{"error": {"class": "GenericError", "desc": "bitmap-mode must be specified if a bitmap is provided"}}
+
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 2960dff728..952dceba1f 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -270,6 +270,7 @@
253 rw quick
254 rw backing quick
255 rw quick
+384 rw
256 rw auto quick
257 rw
258 rw quick

View File

@@ -11,6 +11,7 @@ mode was never available for drive-mirror, it makes the interface more
uniform w.r.t. backup block jobs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/mirror.c | 28 +++------------
blockdev.c | 29 +++++++++++++++
@@ -18,11 +19,11 @@ Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
3 files changed, 70 insertions(+), 59 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index e6140cf018..3a08239a78 100644
index 4969c6833c..cf85ae1074 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1592,31 +1592,13 @@ static BlockJob *mirror_start_job(
Error *local_err = NULL;
@@ -1668,31 +1668,13 @@ static BlockJob *mirror_start_job(
uint64_t target_perms, target_shared_perms;
int ret;
- if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
@@ -59,10 +60,10 @@ index e6140cf018..3a08239a78 100644
if (bitmap_mode != BITMAP_SYNC_MODE_NEVER) {
diff --git a/blockdev.c b/blockdev.c
index 4f8bd38b58..a40c6fd0f6 100644
index 98b9dff154..5b15a86bfa 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3008,7 +3008,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
@@ -3034,7 +3034,36 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
sync = MIRROR_SYNC_MODE_FULL;
}

View File

@@ -1,33 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
Date: Mon, 14 Sep 2020 19:32:21 +0200
Subject: [PATCH] Revert "qemu-img convert: Don't pre-zero images"
This reverts commit edafc70c0c8510862f2f213a3acf7067113bcd08.
As it correlates with causing issues on LVM allocation
https://bugzilla.proxmox.com/show_bug.cgi?id=3002
---
qemu-img.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/qemu-img.c b/qemu-img.c
index 8bdea40b58..f9050bfaad 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2104,6 +2104,15 @@ static int convert_do_copy(ImgConvertState *s)
s->has_zero_init = bdrv_has_zero_init(blk_bs(s->target));
}
+ if (!s->has_zero_init && !s->target_has_backing &&
+ bdrv_can_write_zeroes_with_unmap(blk_bs(s->target)))
+ {
+ ret = blk_make_zero(s->target, BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK);
+ if (ret == 0) {
+ s->has_zero_init = true;
+ }
+ }
+
/* Allocate buffer for copied data. For compressed images, only one cluster
* can be copied at a time. */
if (s->compressed) {

View File

@@ -0,0 +1,206 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Mon, 23 Aug 2021 11:28:32 +0200
Subject: [PATCH] monitor/qmp: fix race with clients disconnecting early
The following sequence can produce a race condition that results in
responses meant for different clients being sent to the wrong one:
(QMP, no OOB)
1) client A connects
2) client A sends 'qmp_capabilities'
3) 'qmp_dispatch' runs in coroutine, schedules out to
'do_qmp_dispatch_bh' and yields
4) client A disconnects (i.e. aborts, crashes, etc...)
5) client B connects
6) 'do_qmp_dispatch_bh' runs 'qmp_capabilities' and wakes calling coroutine
7) capabilities are now set and 'mon->commands' is set to '&qmp_commands'
8) 'qmp_dispatch' returns to 'monitor_qmp_dispatch'
9) success message is sent to client B *without it ever having sent
'qmp_capabilities' itself*
9a) even if client B ignores it, it will now presumably send it's own
greeting, which will error because caps are already set
The fix proposed here uses an atomic, sequential connection number
stored in the MonitorQMP struct, which is incremented everytime a new
client connects. Since it is not changed on CHR_EVENT_CLOSED, the
behaviour of allowing a client to disconnect only one side of the
connection is retained.
The connection_nr needs to be exposed outside of the monitor subsystem,
since qmp_dispatch lives in qapi code. It needs to be checked twice,
once for actually running the command in the BH (fixes 7), and once for
sending back a response (fixes 9).
This satisfies my local reproducer - using multiple clients constantly
looping to open a connection, send the greeting, then exiting no longer
crashes other, normally behaving clients with unrelated responses.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
include/monitor/monitor.h | 1 +
monitor/monitor-internal.h | 7 +++++++
monitor/monitor.c | 15 +++++++++++++++
monitor/qmp.c | 15 ++++++++++++++-
qapi/qmp-dispatch.c | 21 +++++++++++++++++----
stubs/monitor-core.c | 5 +++++
6 files changed, 59 insertions(+), 5 deletions(-)
diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 737e750670..38804b8595 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -16,6 +16,7 @@ extern QemuOptsList qemu_mon_opts;
Monitor *monitor_cur(void);
Monitor *monitor_set_cur(Coroutine *co, Monitor *mon);
bool monitor_cur_is_qmp(void);
+int monitor_get_connection_nr(const Monitor *mon);
void monitor_init_globals(void);
void monitor_init_globals_core(void);
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index a2cdbbf646..b531bd50e7 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -152,6 +152,13 @@ typedef struct {
QemuMutex qmp_queue_lock;
/* Input queue that holds all the parsed QMP requests */
GQueue *qmp_requests;
+
+ /*
+ * A sequential number that gets incremented on every new CHR_EVENT_OPENED.
+ * Used to avoid leftover responses in BHs from being sent to the wrong
+ * client. Access with atomics.
+ */
+ int connection_nr;
} MonitorQMP;
/**
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 86949024f6..c306cadcf4 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -135,6 +135,21 @@ bool monitor_cur_is_qmp(void)
return cur_mon && monitor_is_qmp(cur_mon);
}
+/**
+ * If @mon is a QMP monitor, return the connection_nr, otherwise -1.
+ */
+int monitor_get_connection_nr(const Monitor *mon)
+{
+ MonitorQMP *qmp_mon;
+
+ if (!monitor_is_qmp(mon)) {
+ return -1;
+ }
+
+ qmp_mon = container_of(mon, MonitorQMP, common);
+ return qatomic_read(&qmp_mon->connection_nr);
+}
+
/**
* Is @mon is using readline?
* Note: not all HMP monitors use readline, e.g., gdbserver has a
diff --git a/monitor/qmp.c b/monitor/qmp.c
index acd0a350c2..cc1407e4ac 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -141,6 +141,8 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
QDict *rsp;
QDict *error;
+ int conn_nr_before = qatomic_read(&mon->connection_nr);
+
rsp = qmp_dispatch(mon->commands, req, qmp_oob_enabled(mon),
&mon->common);
@@ -156,7 +158,17 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
}
}
- monitor_qmp_respond(mon, rsp);
+ /*
+ * qmp_dispatch might have yielded and waited for a BH, in which case there
+ * is a chance a new client connected in the meantime - if this happened,
+ * the command will not have been executed, but we also need to ensure that
+ * we don't send back a corresponding response on a line that no longer
+ * belongs to this request.
+ */
+ if (conn_nr_before == qatomic_read(&mon->connection_nr)) {
+ monitor_qmp_respond(mon, rsp);
+ }
+
qobject_unref(rsp);
}
@@ -427,6 +439,7 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
switch (event) {
case CHR_EVENT_OPENED:
+ qatomic_inc_fetch(&mon->connection_nr);
mon->commands = &qmp_cap_negotiation_commands;
monitor_qmp_caps_reset(mon);
data = qmp_greeting(mon);
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 5d000fae87..404d428824 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -117,16 +117,28 @@ typedef struct QmpDispatchBH {
QObject **ret;
Error **errp;
Coroutine *co;
+ int conn_nr;
} QmpDispatchBH;
static void do_qmp_dispatch_bh(void *opaque)
{
QmpDispatchBH *data = opaque;
- assert(monitor_cur() == NULL);
- monitor_set_cur(qemu_coroutine_self(), data->cur_mon);
- data->cmd->fn(data->args, data->ret, data->errp);
- monitor_set_cur(qemu_coroutine_self(), NULL);
+ /*
+ * A QMP monitor tracks it's client with a connection number, if this
+ * changes during the scheduling delay of this BH, we must not execute the
+ * command. Otherwise a badly placed 'qmp_capabilities' might affect the
+ * connection state of a client it was never meant for.
+ */
+ if (data->conn_nr == monitor_get_connection_nr(data->cur_mon)) {
+ assert(monitor_cur() == NULL);
+ monitor_set_cur(qemu_coroutine_self(), data->cur_mon);
+ data->cmd->fn(data->args, data->ret, data->errp);
+ monitor_set_cur(qemu_coroutine_self(), NULL);
+ } else {
+ error_setg(data->errp, "active monitor connection changed");
+ }
+
aio_co_wake(data->co);
}
@@ -253,6 +265,7 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
.ret = &ret,
.errp = &err,
.co = qemu_coroutine_self(),
+ .conn_nr = monitor_get_connection_nr(cur_mon),
};
aio_bh_schedule_oneshot(iohandler_get_aio_context(), do_qmp_dispatch_bh,
&data);
diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
index afa477aae6..d3ff124bf3 100644
--- a/stubs/monitor-core.c
+++ b/stubs/monitor-core.c
@@ -12,6 +12,11 @@ Monitor *monitor_set_cur(Coroutine *co, Monitor *mon)
return NULL;
}
+int monitor_get_connection_nr(const Monitor *mon)
+{
+ return -1;
+}
+
void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
{
}

View File

@@ -1,38 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Thu, 28 Jan 2021 15:19:51 +0100
Subject: [PATCH] docs: don't install man page if guest agent is disabled
No sense outputting the qemu-ga and qemu-ga-ref man pages when the guest
agent binary itself is disabled. This mirrors behaviour from before the
meson switch.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
docs/meson.build | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/docs/meson.build b/docs/meson.build
index ebd85d59f9..cc6f5007f8 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -46,6 +46,8 @@ if build_docs
meson.source_root() / 'docs/sphinx/qmp_lexer.py',
qapi_gen_depends ]
+ have_ga = have_tools and config_host.has_key('CONFIG_GUEST_AGENT')
+
configure_file(output: 'index.html',
input: files('index.html.in'),
configuration: {'VERSION': meson.project_version()},
@@ -53,8 +55,8 @@ if build_docs
manuals = [ 'devel', 'interop', 'tools', 'specs', 'system', 'user' ]
man_pages = {
'interop' : {
- 'qemu-ga.8': (have_tools ? 'man8' : ''),
- 'qemu-ga-ref.7': 'man7',
+ 'qemu-ga.8': (have_ga ? 'man8' : ''),
+ 'qemu-ga-ref.7': (have_ga ? 'man7' : ''),
'qemu-qmp-ref.7': 'man7',
},
'tools': {

View File

@@ -0,0 +1,42 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 28 Oct 2022 10:09:46 +0200
Subject: [PATCH] init: daemonize: defuse PID file resolve error
When proxmox-file-restore invokes QEMU, the PID file is a (temporary)
file that's already unlinked, so resolving the absolute path here
failed.
It should not be a critical error when the PID file unlink handler
can't be registered, because the path can't be resolved for whatever
reason. If the file is already gone from QEMU's perspective (i.e.
errno is ENOENT), silently ignore the error. Otherwise, print a
warning.
Reported-by: Dominik Csapak <d.csapak@proxmox.com>
Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
softmmu/vl.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 38d76d6e51..7aa3eb5cf9 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2468,10 +2468,11 @@ static void qemu_maybe_daemonize(const char *pid_file)
pid_file_realpath = g_malloc0(PATH_MAX);
if (!realpath(pid_file, pid_file_realpath)) {
- error_report("cannot resolve PID file path: %s: %s",
- pid_file, strerror(errno));
- unlink(pid_file);
- exit(1);
+ if (errno != ENOENT) {
+ warn_report("not removing PID file on exit: cannot resolve PID "
+ "file path: %s: %s", pid_file, strerror(errno));
+ }
+ return;
}
qemu_unlink_pidfile_notifier = (struct UnlinkPidfileNotifier) {

View File

@@ -1,31 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Thu, 4 Feb 2021 17:06:19 +0100
Subject: [PATCH] migration: only check page size match if RAM postcopy is
enabled
Postcopy may also be advised for dirty-bitmap migration only, in which
case the remote page size will not be available and we'll instead read
bogus data, blocking migration with a mismatch error if the VM uses
hugepages.
Fixes: 58110f0acb ("migration: split common postcopy out of ram postcopy")
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/ram.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/ram.c b/migration/ram.c
index 7811cde643..6ace15261c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3521,7 +3521,7 @@ static int ram_load_precopy(QEMUFile *f)
}
}
/* For postcopy we need to check hugepage sizes match */
- if (postcopy_advised &&
+ if (postcopy_advised && migrate_postcopy_ram() &&
block->page_size != qemu_host_page_size) {
uint64_t remote_page_size = qemu_get_be64(f);
if (remote_page_size != block->page_size) {

View File

@@ -0,0 +1,69 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Guenter Roeck <linux@roeck-us.net>
Date: Tue, 28 Feb 2023 09:11:29 -0800
Subject: [PATCH] scsi: megasas: Internal cdbs have 16-byte length
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Host drivers do not necessarily set cdb_len in megasas io commands.
With commits 6d1511cea0 ("scsi: Reject commands if the CDB length
exceeds buf_len") and fe9d8927e2 ("scsi: Add buf_len parameter to
scsi_req_new()"), this results in failures to boot Linux from affected
SCSI drives because cdb_len is set to 0 by the host driver.
Set the cdb length to its actual size to solve the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(picked-up from https://lists.nongnu.org/archive/html/qemu-devel/2023-02/msg08653.html)
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/scsi/megasas.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 9cbbb16121..d624866bb6 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -1780,7 +1780,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
uint8_t cdb[16];
int len;
struct SCSIDevice *sdev = NULL;
- int target_id, lun_id, cdb_len;
+ int target_id, lun_id;
lba_count = le32_to_cpu(cmd->frame->io.header.data_len);
lba_start_lo = le32_to_cpu(cmd->frame->io.lba_lo);
@@ -1789,7 +1789,6 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
target_id = cmd->frame->header.target_id;
lun_id = cmd->frame->header.lun_id;
- cdb_len = cmd->frame->header.cdb_len;
if (target_id < MFI_MAX_LD && lun_id == 0) {
sdev = scsi_device_find(&s->bus, 0, target_id, lun_id);
@@ -1804,15 +1803,6 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
return MFI_STAT_DEVICE_NOT_FOUND;
}
- if (cdb_len > 16) {
- trace_megasas_scsi_invalid_cdb_len(
- mfi_frame_desc(frame_cmd), 1, target_id, lun_id, cdb_len);
- megasas_write_sense(cmd, SENSE_CODE(INVALID_OPCODE));
- cmd->frame->header.scsi_status = CHECK_CONDITION;
- s->event_count++;
- return MFI_STAT_SCSI_DONE_WITH_ERROR;
- }
-
cmd->iov_size = lba_count * sdev->blocksize;
if (megasas_map_sgl(s, cmd, &cmd->frame->io.sgl)) {
megasas_write_sense(cmd, SENSE_CODE(TARGET_FAILURE));
@@ -1823,7 +1813,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
megasas_encode_lba(cdb, lba_start, lba_count, is_write);
cmd->req = scsi_req_new(sdev, cmd->index,
- lun_id, cdb, cdb_len, cmd);
+ lun_id, cdb, sizeof(cdb), cmd);
if (!cmd->req) {
trace_megasas_scsi_req_alloc_failed(
mfi_frame_desc(frame_cmd), target_id, lun_id);

View File

@@ -0,0 +1,100 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Tue, 7 Mar 2023 15:03:02 +0100
Subject: [PATCH] ide: avoid potential deadlock when draining during trim
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The deadlock can happen as follows:
1. ide_issue_trim is called, and increments the in_flight counter.
2. ide_issue_trim_cb calls blk_aio_pdiscard.
3. Somebody else starts draining (e.g. backup to insert the cbw node).
4. ide_issue_trim_cb is called as the completion callback for
blk_aio_pdiscard.
5. ide_issue_trim_cb issues yet another blk_aio_pdiscard request.
6. The request is added to the wait queue via blk_wait_while_drained,
because draining has been started.
7. Nobody ever decrements the in_flight counter and draining can't
finish. This would be done by ide_trim_bh_cb, which is called after
ide_issue_trim_cb has issued its last request, but
ide_issue_trim_cb is not called anymore, because it's the
completion callback of blk_aio_pdiscard, which waits on draining.
Quoting Hanna Czenczek:
> The point of 7e5cdb345f was that we need any in-flight count to
> accompany a set s->bus->dma->aiocb. While blk_aio_pdiscard() is
> happening, we dont necessarily need another count. But we do need
> it while there is no blk_aio_pdiscard().
> ide_issue_trim_cb() returns in two cases (and, recursively through
> its callers, leaves s->bus->dma->aiocb set):
> 1. After calling blk_aio_pdiscard(), which will keep an in-flight
> count,
> 2. After calling replay_bh_schedule_event() (i.e.
> qemu_bh_schedule()), which does not keep an in-flight count.
Thus, even after moving the blk_inc_in_flight to above the
replay_bh_schedule_event call, the invariant "ide_issue_trim_cb
returns with an accompanying in-flight count" is still satisfied.
However, the issue 7e5cdb345f fixed for canceling resurfaces, because
ide_cancel_dma_sync assumes that it just needs to drain once. But now
the in_flight count is not consistently > 0 during the trim operation.
So, change it to drain until !s->bus->dma->aiocb, which means that the
operation finished (s->bus->dma->aiocb is cleared by ide_set_inactive
via the ide_dma_cb when the end of the transfer is reached).
Discussion here:
https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg02506.html
Fixes: 7e5cdb345f ("ide: Increment BB in-flight counter for TRIM BH")
Suggested-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/ide/core.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 3e97d665d9..a0f6801bce 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -443,7 +443,7 @@ static void ide_trim_bh_cb(void *opaque)
iocb->bh = NULL;
qemu_aio_unref(iocb);
- /* Paired with an increment in ide_issue_trim() */
+ /* Paired with an increment in ide_issue_trim_cb() */
blk_dec_in_flight(blk);
}
@@ -503,6 +503,8 @@ static void ide_issue_trim_cb(void *opaque, int ret)
done:
iocb->aiocb = NULL;
if (iocb->bh) {
+ /* Paired with a decrement in ide_trim_bh_cb() */
+ blk_inc_in_flight(s->blk);
replay_bh_schedule_event(iocb->bh);
}
}
@@ -515,9 +517,6 @@ BlockAIOCB *ide_issue_trim(
IDEDevice *dev = s->unit ? s->bus->slave : s->bus->master;
TrimAIOCB *iocb;
- /* Paired with a decrement in ide_trim_bh_cb() */
- blk_inc_in_flight(s->blk);
-
iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
iocb->s = s;
iocb->bh = qemu_bh_new_guarded(ide_trim_bh_cb, iocb,
@@ -741,8 +740,9 @@ void ide_cancel_dma_sync(IDEState *s)
*/
if (s->bus->dma->aiocb) {
trace_ide_cancel_dma_sync_remaining();
- blk_drain(s->blk);
- assert(s->bus->dma->aiocb == NULL);
+ while (s->bus->dma->aiocb) {
+ blk_drain(s->blk);
+ }
}
}

View File

@@ -1,143 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Thu, 4 Feb 2021 18:34:35 +0000
Subject: [PATCH] virtiofsd: extract lo_do_open() from lo_open()
Both lo_open() and lo_create() have similar code to open a file. Extract
a common lo_do_open() function from lo_open() that will be used by
lo_create() in a later commit.
Since lo_do_open() does not otherwise need fuse_req_t req, convert
lo_add_fd_mapping() to use struct lo_data *lo instead.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210204150208.367837-2-stefanha@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
tools/virtiofsd/passthrough_ll.c | 73 ++++++++++++++++++++------------
1 file changed, 46 insertions(+), 27 deletions(-)
diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 97485b22b4..218e20e9d7 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -471,17 +471,17 @@ static void lo_map_remove(struct lo_map *map, size_t key)
}
/* Assumes lo->mutex is held */
-static ssize_t lo_add_fd_mapping(fuse_req_t req, int fd)
+static ssize_t lo_add_fd_mapping(struct lo_data *lo, int fd)
{
struct lo_map_elem *elem;
- elem = lo_map_alloc_elem(&lo_data(req)->fd_map);
+ elem = lo_map_alloc_elem(&lo->fd_map);
if (!elem) {
return -1;
}
elem->fd = fd;
- return elem - lo_data(req)->fd_map.elems;
+ return elem - lo->fd_map.elems;
}
/* Assumes lo->mutex is held */
@@ -1661,6 +1661,38 @@ static void update_open_flags(int writeback, int allow_direct_io,
}
}
+static int lo_do_open(struct lo_data *lo, struct lo_inode *inode,
+ struct fuse_file_info *fi)
+{
+ char buf[64];
+ ssize_t fh;
+ int fd;
+
+ update_open_flags(lo->writeback, lo->allow_direct_io, fi);
+
+ sprintf(buf, "%i", inode->fd);
+ fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW);
+ if (fd == -1) {
+ return errno;
+ }
+
+ pthread_mutex_lock(&lo->mutex);
+ fh = lo_add_fd_mapping(lo, fd);
+ pthread_mutex_unlock(&lo->mutex);
+ if (fh == -1) {
+ close(fd);
+ return ENOMEM;
+ }
+
+ fi->fh = fh;
+ if (lo->cache == CACHE_NONE) {
+ fi->direct_io = 1;
+ } else if (lo->cache == CACHE_ALWAYS) {
+ fi->keep_cache = 1;
+ }
+ return 0;
+}
+
static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
mode_t mode, struct fuse_file_info *fi)
{
@@ -1701,7 +1733,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
ssize_t fh;
pthread_mutex_lock(&lo->mutex);
- fh = lo_add_fd_mapping(req, fd);
+ fh = lo_add_fd_mapping(lo, fd);
pthread_mutex_unlock(&lo->mutex);
if (fh == -1) {
close(fd);
@@ -1892,38 +1924,25 @@ static void lo_fsyncdir(fuse_req_t req, fuse_ino_t ino, int datasync,
static void lo_open(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi)
{
- int fd;
- ssize_t fh;
- char buf[64];
struct lo_data *lo = lo_data(req);
+ struct lo_inode *inode = lo_inode(req, ino);
+ int err;
fuse_log(FUSE_LOG_DEBUG, "lo_open(ino=%" PRIu64 ", flags=%d)\n", ino,
fi->flags);
- update_open_flags(lo->writeback, lo->allow_direct_io, fi);
-
- sprintf(buf, "%i", lo_fd(req, ino));
- fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW);
- if (fd == -1) {
- return (void)fuse_reply_err(req, errno);
- }
-
- pthread_mutex_lock(&lo->mutex);
- fh = lo_add_fd_mapping(req, fd);
- pthread_mutex_unlock(&lo->mutex);
- if (fh == -1) {
- close(fd);
- fuse_reply_err(req, ENOMEM);
+ if (!inode) {
+ fuse_reply_err(req, EBADF);
return;
}
- fi->fh = fh;
- if (lo->cache == CACHE_NONE) {
- fi->direct_io = 1;
- } else if (lo->cache == CACHE_ALWAYS) {
- fi->keep_cache = 1;
+ err = lo_do_open(lo, inode, fi);
+ lo_inode_put(lo, &inode);
+ if (err) {
+ fuse_reply_err(req, err);
+ } else {
+ fuse_reply_open(req, fi);
}
- fuse_reply_open(req, fi);
}
static void lo_release(fuse_req_t req, fuse_ino_t ino,

View File

@@ -0,0 +1,273 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Zhuojia Shen <chaosdefinition@hotmail.com>
Date: Wed, 10 Apr 2024 08:43:24 +0300
Subject: [PATCH] target/arm: align exposed ID registers with Linux
In CPUID registers exposed to userspace, some registers were missing
and some fields were not exposed. This patch aligns exposed ID
registers and their fields with what the upstream kernel currently
exposes.
Specifically, the following new ID registers/fields are exposed to
userspace:
ID_AA64PFR1_EL1.BT: bits 3-0
ID_AA64PFR1_EL1.MTE: bits 11-8
ID_AA64PFR1_EL1.SME: bits 27-24
ID_AA64ZFR0_EL1.SVEver: bits 3-0
ID_AA64ZFR0_EL1.AES: bits 7-4
ID_AA64ZFR0_EL1.BitPerm: bits 19-16
ID_AA64ZFR0_EL1.BF16: bits 23-20
ID_AA64ZFR0_EL1.SHA3: bits 35-32
ID_AA64ZFR0_EL1.SM4: bits 43-40
ID_AA64ZFR0_EL1.I8MM: bits 47-44
ID_AA64ZFR0_EL1.F32MM: bits 55-52
ID_AA64ZFR0_EL1.F64MM: bits 59-56
ID_AA64SMFR0_EL1.F32F32: bit 32
ID_AA64SMFR0_EL1.B16F32: bit 34
ID_AA64SMFR0_EL1.F16F32: bit 35
ID_AA64SMFR0_EL1.I8I32: bits 39-36
ID_AA64SMFR0_EL1.F64F64: bit 48
ID_AA64SMFR0_EL1.I16I64: bits 55-52
ID_AA64SMFR0_EL1.FA64: bit 63
ID_AA64MMFR0_EL1.ECV: bits 63-60
ID_AA64MMFR1_EL1.AFP: bits 47-44
ID_AA64MMFR2_EL1.AT: bits 35-32
ID_AA64ISAR0_EL1.RNDR: bits 63-60
ID_AA64ISAR1_EL1.FRINTTS: bits 35-32
ID_AA64ISAR1_EL1.BF16: bits 47-44
ID_AA64ISAR1_EL1.DGH: bits 51-48
ID_AA64ISAR1_EL1.I8MM: bits 55-52
ID_AA64ISAR2_EL1.WFxT: bits 3-0
ID_AA64ISAR2_EL1.RPRES: bits 7-4
ID_AA64ISAR2_EL1.GPA3: bits 11-8
ID_AA64ISAR2_EL1.APA3: bits 15-12
The code is also refactored to use symbolic names for ID register fields
for better readability and maintainability.
The test case in tests/tcg/aarch64/sysregs.c is also updated to match
the intended behavior.
Signed-off-by: Zhuojia Shen <chaosdefinition@hotmail.com>
Message-id: DS7PR12MB6309FB585E10772928F14271ACE79@DS7PR12MB6309.namprd12.prod.outlook.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: use Sn_n_Cn_Cn_n syntax to work with older assemblers
that don't recognize id_aa64isar2_el1 and id_aa64mmfr2_el1]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit bc6bd20ee3538347afb750c4bd06edca4a922897)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this for v8.0.0-2361-g1f51573f79
"target/arm: Fix SME full tile indexing")
---
target/arm/helper.c | 96 +++++++++++++++++++++++++------
tests/tcg/aarch64/Makefile.target | 7 ++-
tests/tcg/aarch64/sysregs.c | 24 ++++++--
3 files changed, 103 insertions(+), 24 deletions(-)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 2e284e048c..acc0470e86 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7852,31 +7852,89 @@ void register_cp_regs_for_features(ARMCPU *cpu)
#ifdef CONFIG_USER_ONLY
static const ARMCPRegUserSpaceInfo v8_user_idregs[] = {
{ .name = "ID_AA64PFR0_EL1",
- .exported_bits = 0x000f000f00ff0000,
- .fixed_bits = 0x0000000000000011 },
+ .exported_bits = R_ID_AA64PFR0_FP_MASK |
+ R_ID_AA64PFR0_ADVSIMD_MASK |
+ R_ID_AA64PFR0_SVE_MASK |
+ R_ID_AA64PFR0_DIT_MASK,
+ .fixed_bits = (0x1u << R_ID_AA64PFR0_EL0_SHIFT) |
+ (0x1u << R_ID_AA64PFR0_EL1_SHIFT) },
{ .name = "ID_AA64PFR1_EL1",
- .exported_bits = 0x00000000000000f0 },
+ .exported_bits = R_ID_AA64PFR1_BT_MASK |
+ R_ID_AA64PFR1_SSBS_MASK |
+ R_ID_AA64PFR1_MTE_MASK |
+ R_ID_AA64PFR1_SME_MASK },
{ .name = "ID_AA64PFR*_EL1_RESERVED",
- .is_glob = true },
- { .name = "ID_AA64ZFR0_EL1" },
+ .is_glob = true },
+ { .name = "ID_AA64ZFR0_EL1",
+ .exported_bits = R_ID_AA64ZFR0_SVEVER_MASK |
+ R_ID_AA64ZFR0_AES_MASK |
+ R_ID_AA64ZFR0_BITPERM_MASK |
+ R_ID_AA64ZFR0_BFLOAT16_MASK |
+ R_ID_AA64ZFR0_SHA3_MASK |
+ R_ID_AA64ZFR0_SM4_MASK |
+ R_ID_AA64ZFR0_I8MM_MASK |
+ R_ID_AA64ZFR0_F32MM_MASK |
+ R_ID_AA64ZFR0_F64MM_MASK },
+ { .name = "ID_AA64SMFR0_EL1",
+ .exported_bits = R_ID_AA64SMFR0_F32F32_MASK |
+ R_ID_AA64SMFR0_B16F32_MASK |
+ R_ID_AA64SMFR0_F16F32_MASK |
+ R_ID_AA64SMFR0_I8I32_MASK |
+ R_ID_AA64SMFR0_F64F64_MASK |
+ R_ID_AA64SMFR0_I16I64_MASK |
+ R_ID_AA64SMFR0_FA64_MASK },
{ .name = "ID_AA64MMFR0_EL1",
- .fixed_bits = 0x00000000ff000000 },
- { .name = "ID_AA64MMFR1_EL1" },
+ .exported_bits = R_ID_AA64MMFR0_ECV_MASK,
+ .fixed_bits = (0xfu << R_ID_AA64MMFR0_TGRAN64_SHIFT) |
+ (0xfu << R_ID_AA64MMFR0_TGRAN4_SHIFT) },
+ { .name = "ID_AA64MMFR1_EL1",
+ .exported_bits = R_ID_AA64MMFR1_AFP_MASK },
+ { .name = "ID_AA64MMFR2_EL1",
+ .exported_bits = R_ID_AA64MMFR2_AT_MASK },
{ .name = "ID_AA64MMFR*_EL1_RESERVED",
- .is_glob = true },
+ .is_glob = true },
{ .name = "ID_AA64DFR0_EL1",
- .fixed_bits = 0x0000000000000006 },
- { .name = "ID_AA64DFR1_EL1" },
+ .fixed_bits = (0x6u << R_ID_AA64DFR0_DEBUGVER_SHIFT) },
+ { .name = "ID_AA64DFR1_EL1" },
{ .name = "ID_AA64DFR*_EL1_RESERVED",
- .is_glob = true },
+ .is_glob = true },
{ .name = "ID_AA64AFR*",
- .is_glob = true },
+ .is_glob = true },
{ .name = "ID_AA64ISAR0_EL1",
- .exported_bits = 0x00fffffff0fffff0 },
+ .exported_bits = R_ID_AA64ISAR0_AES_MASK |
+ R_ID_AA64ISAR0_SHA1_MASK |
+ R_ID_AA64ISAR0_SHA2_MASK |
+ R_ID_AA64ISAR0_CRC32_MASK |
+ R_ID_AA64ISAR0_ATOMIC_MASK |
+ R_ID_AA64ISAR0_RDM_MASK |
+ R_ID_AA64ISAR0_SHA3_MASK |
+ R_ID_AA64ISAR0_SM3_MASK |
+ R_ID_AA64ISAR0_SM4_MASK |
+ R_ID_AA64ISAR0_DP_MASK |
+ R_ID_AA64ISAR0_FHM_MASK |
+ R_ID_AA64ISAR0_TS_MASK |
+ R_ID_AA64ISAR0_RNDR_MASK },
{ .name = "ID_AA64ISAR1_EL1",
- .exported_bits = 0x000000f0ffffffff },
+ .exported_bits = R_ID_AA64ISAR1_DPB_MASK |
+ R_ID_AA64ISAR1_APA_MASK |
+ R_ID_AA64ISAR1_API_MASK |
+ R_ID_AA64ISAR1_JSCVT_MASK |
+ R_ID_AA64ISAR1_FCMA_MASK |
+ R_ID_AA64ISAR1_LRCPC_MASK |
+ R_ID_AA64ISAR1_GPA_MASK |
+ R_ID_AA64ISAR1_GPI_MASK |
+ R_ID_AA64ISAR1_FRINTTS_MASK |
+ R_ID_AA64ISAR1_SB_MASK |
+ R_ID_AA64ISAR1_BF16_MASK |
+ R_ID_AA64ISAR1_DGH_MASK |
+ R_ID_AA64ISAR1_I8MM_MASK },
+ { .name = "ID_AA64ISAR2_EL1",
+ .exported_bits = R_ID_AA64ISAR2_WFXT_MASK |
+ R_ID_AA64ISAR2_RPRES_MASK |
+ R_ID_AA64ISAR2_GPA3_MASK |
+ R_ID_AA64ISAR2_APA3_MASK },
{ .name = "ID_AA64ISAR*_EL1_RESERVED",
- .is_glob = true },
+ .is_glob = true },
};
modify_arm_cp_regs(v8_idregs, v8_user_idregs);
#endif
@@ -8194,8 +8252,12 @@ void register_cp_regs_for_features(ARMCPU *cpu)
#ifdef CONFIG_USER_ONLY
static const ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = {
{ .name = "MIDR_EL1",
- .exported_bits = 0x00000000ffffffff },
- { .name = "REVIDR_EL1" },
+ .exported_bits = R_MIDR_EL1_REVISION_MASK |
+ R_MIDR_EL1_PARTNUM_MASK |
+ R_MIDR_EL1_ARCHITECTURE_MASK |
+ R_MIDR_EL1_VARIANT_MASK |
+ R_MIDR_EL1_IMPLEMENTER_MASK },
+ { .name = "REVIDR_EL1" },
};
modify_arm_cp_regs(id_v8_midr_cp_reginfo, id_v8_user_midr_cp_reginfo);
#endif
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index a72578fccb..fc6d5d824d 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -23,7 +23,8 @@ config-cc.mak: Makefile
$(call cc-option,-march=armv8.1-a+sve2, CROSS_CC_HAS_SVE2); \
$(call cc-option,-march=armv8.3-a, CROSS_CC_HAS_ARMV8_3); \
$(call cc-option,-mbranch-protection=standard, CROSS_CC_HAS_ARMV8_BTI); \
- $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE)) 3> config-cc.mak
+ $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE); \
+ $(call cc-option,-march=armv9-a+sme, CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak
-include config-cc.mak
# Pauth Tests
@@ -53,7 +54,11 @@ endif
ifneq ($(CROSS_CC_HAS_SVE),)
# System Registers Tests
AARCH64_TESTS += sysregs
+ifneq ($(CROSS_CC_HAS_ARMV9_SME),)
+sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME
+else
sysregs: CFLAGS+=-march=armv8.1-a+sve
+endif
# SVE ioctl test
AARCH64_TESTS += sve-ioctls
diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
index 40cf8d2877..46b931f781 100644
--- a/tests/tcg/aarch64/sysregs.c
+++ b/tests/tcg/aarch64/sysregs.c
@@ -22,6 +22,13 @@
#define HWCAP_CPUID (1 << 11)
#endif
+/*
+ * Older assemblers don't recognize newer system register names,
+ * but we can still access them by the Sn_n_Cn_Cn_n syntax.
+ */
+#define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2
+#define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2
+
int failed_bit_count;
/* Read and print system register `id' value */
@@ -112,18 +119,23 @@ int main(void)
* minimum valid fields - for the purposes of this check allowed
* to have non-zero values.
*/
- get_cpu_reg_check_mask(id_aa64isar0_el1, _m(00ff,ffff,f0ff,fff0));
- get_cpu_reg_check_mask(id_aa64isar1_el1, _m(0000,00f0,ffff,ffff));
+ get_cpu_reg_check_mask(id_aa64isar0_el1, _m(f0ff,ffff,f0ff,fff0));
+ get_cpu_reg_check_mask(id_aa64isar1_el1, _m(00ff,f0ff,ffff,ffff));
+ get_cpu_reg_check_mask(SYS_ID_AA64ISAR2_EL1, _m(0000,0000,0000,ffff));
/* TGran4 & TGran64 as pegged to -1 */
- get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(0000,0000,ff00,0000));
- get_cpu_reg_check_zero(id_aa64mmfr1_el1);
+ get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(f000,0000,ff00,0000));
+ get_cpu_reg_check_mask(id_aa64mmfr1_el1, _m(0000,f000,0000,0000));
+ get_cpu_reg_check_mask(SYS_ID_AA64MMFR2_EL1, _m(0000,000f,0000,0000));
/* EL1/EL0 reported as AA64 only */
get_cpu_reg_check_mask(id_aa64pfr0_el1, _m(000f,000f,00ff,0011));
- get_cpu_reg_check_mask(id_aa64pfr1_el1, _m(0000,0000,0000,00f0));
+ get_cpu_reg_check_mask(id_aa64pfr1_el1, _m(0000,0000,0f00,0fff));
/* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */
get_cpu_reg_check_mask(id_aa64dfr0_el1, _m(0000,0000,0000,0006));
get_cpu_reg_check_zero(id_aa64dfr1_el1);
- get_cpu_reg_check_zero(id_aa64zfr0_el1);
+ get_cpu_reg_check_mask(id_aa64zfr0_el1, _m(0ff0,ff0f,00ff,00ff));
+#ifdef HAS_ARMV9_SME
+ get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000));
+#endif
get_cpu_reg_check_zero(id_aa64afr0_el1);
get_cpu_reg_check_zero(id_aa64afr1_el1);

View File

@@ -1,107 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Thu, 4 Feb 2021 18:34:36 +0000
Subject: [PATCH] virtiofsd: optionally return inode pointer from
lo_do_lookup()
lo_do_lookup() finds an existing inode or allocates a new one. It
increments nlookup so that the inode stays alive until the client
releases it.
Existing callers don't need the struct lo_inode so the function doesn't
return it. Extend the function to optionally return the inode. The next
commit will need it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210204150208.367837-3-stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
tools/virtiofsd/passthrough_ll.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 218e20e9d7..2bd050b620 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -843,11 +843,13 @@ static int do_statx(struct lo_data *lo, int dirfd, const char *pathname,
}
/*
- * Increments nlookup and caller must release refcount using
- * lo_inode_put(&parent).
+ * Increments nlookup on the inode on success. unref_inode_lolocked() must be
+ * called eventually to decrement nlookup again. If inodep is non-NULL, the
+ * inode pointer is stored and the caller must call lo_inode_put().
*/
static int lo_do_lookup(fuse_req_t req, fuse_ino_t parent, const char *name,
- struct fuse_entry_param *e)
+ struct fuse_entry_param *e,
+ struct lo_inode **inodep)
{
int newfd;
int res;
@@ -857,6 +859,10 @@ static int lo_do_lookup(fuse_req_t req, fuse_ino_t parent, const char *name,
struct lo_inode *inode = NULL;
struct lo_inode *dir = lo_inode(req, parent);
+ if (inodep) {
+ *inodep = NULL;
+ }
+
/*
* name_to_handle_at() and open_by_handle_at() can reach here with fuse
* mount point in guest, but we don't have its inode info in the
@@ -924,7 +930,14 @@ static int lo_do_lookup(fuse_req_t req, fuse_ino_t parent, const char *name,
pthread_mutex_unlock(&lo->mutex);
}
e->ino = inode->fuse_ino;
- lo_inode_put(lo, &inode);
+
+ /* Transfer ownership of inode pointer to caller or drop it */
+ if (inodep) {
+ *inodep = inode;
+ } else {
+ lo_inode_put(lo, &inode);
+ }
+
lo_inode_put(lo, &dir);
fuse_log(FUSE_LOG_DEBUG, " %lli/%s -> %lli\n", (unsigned long long)parent,
@@ -959,7 +972,7 @@ static void lo_lookup(fuse_req_t req, fuse_ino_t parent, const char *name)
return;
}
- err = lo_do_lookup(req, parent, name, &e);
+ err = lo_do_lookup(req, parent, name, &e, NULL);
if (err) {
fuse_reply_err(req, err);
} else {
@@ -1067,7 +1080,7 @@ static void lo_mknod_symlink(fuse_req_t req, fuse_ino_t parent,
goto out;
}
- saverr = lo_do_lookup(req, parent, name, &e);
+ saverr = lo_do_lookup(req, parent, name, &e, NULL);
if (saverr) {
goto out;
}
@@ -1544,7 +1557,7 @@ static void lo_do_readdir(fuse_req_t req, fuse_ino_t ino, size_t size,
if (plus) {
if (!is_dot_or_dotdot(name)) {
- err = lo_do_lookup(req, ino, name, &e);
+ err = lo_do_lookup(req, ino, name, &e, NULL);
if (err) {
goto error;
}
@@ -1742,7 +1755,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
}
fi->fh = fh;
- err = lo_do_lookup(req, parent, name, &e);
+ err = lo_do_lookup(req, parent, name, &e, NULL);
}
if (lo->cache == CACHE_NONE) {
fi->direct_io = 1;

View File

@@ -0,0 +1,91 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Peter Maydell <peter.maydell@linaro.org>
Date: Wed, 10 Apr 2024 08:43:25 +0300
Subject: [PATCH] tests/tcg/aarch64/sysregs.c: Use S syntax for id_aa64zfr0_el1
and id_aa64smfr0_el1
Some assemblers will complain about attempts to access
id_aa64zfr0_el1 and id_aa64smfr0_el1 by name if the test
binary isn't built for the right processor type:
/tmp/ccASXpLo.s:782: Error: selected processor does not support system register name 'id_aa64zfr0_el1'
/tmp/ccASXpLo.s:829: Error: selected processor does not support system register name 'id_aa64smfr0_el1'
However, these registers are in the ID space and are guaranteed to
read-as-zero on older CPUs, so the access is both safe and sensible.
Switch to using the S syntax, as we already do for ID_AA64ISAR2_EL1
and ID_AA64MMFR2_EL1. This allows us to drop the HAS_ARMV9_SME check
and the makefile machinery to adjust the CFLAGS for this test, so we
don't rely on having a sufficiently new compiler to be able to check
these registers.
This means we're actually testing the SME ID register: no released
GCC yet recognizes -march=armv9-a+sme, so that was always skipped.
It also avoids a future problem if we try to switch the "do we have
SME support in the toolchain" check from "in the compiler" to "in the
assembler" (at which point we would otherwise run into the above
errors).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 3dc2afeab2964b54848715b913b6c605f36be3e1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this for v8.0.0-2361-g1f51573f79
"target/arm: Fix SME full tile indexing")
---
tests/tcg/aarch64/Makefile.target | 7 +------
tests/tcg/aarch64/sysregs.c | 11 +++++++----
2 files changed, 8 insertions(+), 10 deletions(-)
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index fc6d5d824d..118d069073 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -51,15 +51,10 @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7
mte-%: CFLAGS += -march=armv8.5-a+memtag
endif
-ifneq ($(CROSS_CC_HAS_SVE),)
# System Registers Tests
AARCH64_TESTS += sysregs
-ifneq ($(CROSS_CC_HAS_ARMV9_SME),)
-sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME
-else
-sysregs: CFLAGS+=-march=armv8.1-a+sve
-endif
+ifneq ($(CROSS_CC_HAS_SVE),)
# SVE ioctl test
AARCH64_TESTS += sve-ioctls
sve-ioctls: CFLAGS+=-march=armv8.1-a+sve
diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
index 46b931f781..d8eb06abcf 100644
--- a/tests/tcg/aarch64/sysregs.c
+++ b/tests/tcg/aarch64/sysregs.c
@@ -25,9 +25,14 @@
/*
* Older assemblers don't recognize newer system register names,
* but we can still access them by the Sn_n_Cn_Cn_n syntax.
+ * This also means we don't need to specifically request that the
+ * assembler enables whatever architectural features the ID registers
+ * syntax might be gated behind.
*/
#define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2
#define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2
+#define SYS_ID_AA64ZFR0_EL1 S3_0_C0_C4_4
+#define SYS_ID_AA64SMFR0_EL1 S3_0_C0_C4_5
int failed_bit_count;
@@ -132,10 +137,8 @@ int main(void)
/* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */
get_cpu_reg_check_mask(id_aa64dfr0_el1, _m(0000,0000,0000,0006));
get_cpu_reg_check_zero(id_aa64dfr1_el1);
- get_cpu_reg_check_mask(id_aa64zfr0_el1, _m(0ff0,ff0f,00ff,00ff));
-#ifdef HAS_ARMV9_SME
- get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000));
-#endif
+ get_cpu_reg_check_mask(SYS_ID_AA64ZFR0_EL1, _m(0ff0,ff0f,00ff,00ff));
+ get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(80f1,00fd,0000,0000));
get_cpu_reg_check_zero(id_aa64afr0_el1);
get_cpu_reg_check_zero(id_aa64afr1_el1);

View File

@@ -1,296 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Thu, 4 Feb 2021 18:34:37 +0000
Subject: [PATCH] virtiofsd: prevent opening of special files (CVE-2020-35517)
A well-behaved FUSE client does not attempt to open special files with
FUSE_OPEN because they are handled on the client side (e.g. device nodes
are handled by client-side device drivers).
The check to prevent virtiofsd from opening special files is missing in
a few cases, most notably FUSE_OPEN. A malicious client can cause
virtiofsd to open a device node, potentially allowing the guest to
escape. This can be exploited by a modified guest device driver. It is
not exploitable from guest userspace since the guest kernel will handle
special files inside the guest instead of sending FUSE requests.
This patch fixes this issue by introducing the lo_inode_open() function
to check the file type before opening it. This is a short-term solution
because it does not prevent a compromised virtiofsd process from opening
device nodes on the host.
Restructure lo_create() to try O_CREAT | O_EXCL first. Note that O_CREAT
| O_EXCL does not follow symlinks, so O_NOFOLLOW masking is not
necessary here. If the file exists and the user did not specify O_EXCL,
open it via lo_do_open().
Reported-by: Alex Xu <alex@alxu.ca>
Fixes: CVE-2020-35517
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210204150208.367837-4-stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
tools/virtiofsd/passthrough_ll.c | 144 ++++++++++++++++++++-----------
1 file changed, 92 insertions(+), 52 deletions(-)
diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 2bd050b620..03c5e0d13c 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -567,6 +567,38 @@ static int lo_fd(fuse_req_t req, fuse_ino_t ino)
return fd;
}
+/*
+ * Open a file descriptor for an inode. Returns -EBADF if the inode is not a
+ * regular file or a directory.
+ *
+ * Use this helper function instead of raw openat(2) to prevent security issues
+ * when a malicious client opens special files such as block device nodes.
+ * Symlink inodes are also rejected since symlinks must already have been
+ * traversed on the client side.
+ */
+static int lo_inode_open(struct lo_data *lo, struct lo_inode *inode,
+ int open_flags)
+{
+ g_autofree char *fd_str = g_strdup_printf("%d", inode->fd);
+ int fd;
+
+ if (!S_ISREG(inode->filetype) && !S_ISDIR(inode->filetype)) {
+ return -EBADF;
+ }
+
+ /*
+ * The file is a symlink so O_NOFOLLOW must be ignored. We checked earlier
+ * that the inode is not a special file but if an external process races
+ * with us then symlinks are traversed here. It is not possible to escape
+ * the shared directory since it is mounted as "/" though.
+ */
+ fd = openat(lo->proc_self_fd, fd_str, open_flags & ~O_NOFOLLOW);
+ if (fd < 0) {
+ return -errno;
+ }
+ return fd;
+}
+
static void lo_init(void *userdata, struct fuse_conn_info *conn)
{
struct lo_data *lo = (struct lo_data *)userdata;
@@ -696,9 +728,9 @@ static void lo_setattr(fuse_req_t req, fuse_ino_t ino, struct stat *attr,
if (fi) {
truncfd = fd;
} else {
- sprintf(procname, "%i", ifd);
- truncfd = openat(lo->proc_self_fd, procname, O_RDWR);
+ truncfd = lo_inode_open(lo, inode, O_RDWR);
if (truncfd < 0) {
+ errno = -truncfd;
goto out_err;
}
}
@@ -860,7 +892,7 @@ static int lo_do_lookup(fuse_req_t req, fuse_ino_t parent, const char *name,
struct lo_inode *dir = lo_inode(req, parent);
if (inodep) {
- *inodep = NULL;
+ *inodep = NULL; /* in case there is an error */
}
/*
@@ -1674,19 +1706,26 @@ static void update_open_flags(int writeback, int allow_direct_io,
}
}
+/*
+ * Open a regular file, set up an fd mapping, and fill out the struct
+ * fuse_file_info for it. If existing_fd is not negative, use that fd instead
+ * opening a new one. Takes ownership of existing_fd.
+ *
+ * Returns 0 on success or a positive errno.
+ */
static int lo_do_open(struct lo_data *lo, struct lo_inode *inode,
- struct fuse_file_info *fi)
+ int existing_fd, struct fuse_file_info *fi)
{
- char buf[64];
ssize_t fh;
- int fd;
+ int fd = existing_fd;
update_open_flags(lo->writeback, lo->allow_direct_io, fi);
- sprintf(buf, "%i", inode->fd);
- fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW);
- if (fd == -1) {
- return errno;
+ if (fd < 0) {
+ fd = lo_inode_open(lo, inode, fi->flags);
+ if (fd < 0) {
+ return -fd;
+ }
}
pthread_mutex_lock(&lo->mutex);
@@ -1709,9 +1748,10 @@ static int lo_do_open(struct lo_data *lo, struct lo_inode *inode,
static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
mode_t mode, struct fuse_file_info *fi)
{
- int fd;
+ int fd = -1;
struct lo_data *lo = lo_data(req);
struct lo_inode *parent_inode;
+ struct lo_inode *inode = NULL;
struct fuse_entry_param e;
int err;
struct lo_cred old = {};
@@ -1737,36 +1777,38 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
update_open_flags(lo->writeback, lo->allow_direct_io, fi);
- fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW,
- mode);
+ /* Try to create a new file but don't open existing files */
+ fd = openat(parent_inode->fd, name, fi->flags | O_CREAT | O_EXCL, mode);
err = fd == -1 ? errno : 0;
- lo_restore_cred(&old);
- if (!err) {
- ssize_t fh;
+ lo_restore_cred(&old);
- pthread_mutex_lock(&lo->mutex);
- fh = lo_add_fd_mapping(lo, fd);
- pthread_mutex_unlock(&lo->mutex);
- if (fh == -1) {
- close(fd);
- err = ENOMEM;
- goto out;
- }
+ /* Ignore the error if file exists and O_EXCL was not given */
+ if (err && (err != EEXIST || (fi->flags & O_EXCL))) {
+ goto out;
+ }
- fi->fh = fh;
- err = lo_do_lookup(req, parent, name, &e, NULL);
+ err = lo_do_lookup(req, parent, name, &e, &inode);
+ if (err) {
+ goto out;
}
- if (lo->cache == CACHE_NONE) {
- fi->direct_io = 1;
- } else if (lo->cache == CACHE_ALWAYS) {
- fi->keep_cache = 1;
+
+ err = lo_do_open(lo, inode, fd, fi);
+ fd = -1; /* lo_do_open() takes ownership of fd */
+ if (err) {
+ /* Undo lo_do_lookup() nlookup ref */
+ unref_inode_lolocked(lo, inode, 1);
}
out:
+ lo_inode_put(lo, &inode);
lo_inode_put(lo, &parent_inode);
if (err) {
+ if (fd >= 0) {
+ close(fd);
+ }
+
fuse_reply_err(req, err);
} else {
fuse_reply_create(req, &e, fi);
@@ -1780,7 +1822,6 @@ static struct lo_inode_plock *lookup_create_plock_ctx(struct lo_data *lo,
pid_t pid, int *err)
{
struct lo_inode_plock *plock;
- char procname[64];
int fd;
plock =
@@ -1797,12 +1838,10 @@ static struct lo_inode_plock *lookup_create_plock_ctx(struct lo_data *lo,
}
/* Open another instance of file which can be used for ofd locks. */
- sprintf(procname, "%i", inode->fd);
-
/* TODO: What if file is not writable? */
- fd = openat(lo->proc_self_fd, procname, O_RDWR);
- if (fd == -1) {
- *err = errno;
+ fd = lo_inode_open(lo, inode, O_RDWR);
+ if (fd < 0) {
+ *err = -fd;
free(plock);
return NULL;
}
@@ -1949,7 +1988,7 @@ static void lo_open(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi)
return;
}
- err = lo_do_open(lo, inode, fi);
+ err = lo_do_open(lo, inode, -1, fi);
lo_inode_put(lo, &inode);
if (err) {
fuse_reply_err(req, err);
@@ -2005,39 +2044,40 @@ static void lo_flush(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi)
static void lo_fsync(fuse_req_t req, fuse_ino_t ino, int datasync,
struct fuse_file_info *fi)
{
+ struct lo_inode *inode = lo_inode(req, ino);
+ struct lo_data *lo = lo_data(req);
int res;
int fd;
- char *buf;
fuse_log(FUSE_LOG_DEBUG, "lo_fsync(ino=%" PRIu64 ", fi=0x%p)\n", ino,
(void *)fi);
- if (!fi) {
- struct lo_data *lo = lo_data(req);
-
- res = asprintf(&buf, "%i", lo_fd(req, ino));
- if (res == -1) {
- return (void)fuse_reply_err(req, errno);
- }
+ if (!inode) {
+ fuse_reply_err(req, EBADF);
+ return;
+ }
- fd = openat(lo->proc_self_fd, buf, O_RDWR);
- free(buf);
- if (fd == -1) {
- return (void)fuse_reply_err(req, errno);
+ if (!fi) {
+ fd = lo_inode_open(lo, inode, O_RDWR);
+ if (fd < 0) {
+ res = -fd;
+ goto out;
}
} else {
fd = lo_fi_fd(req, fi);
}
if (datasync) {
- res = fdatasync(fd);
+ res = fdatasync(fd) == -1 ? errno : 0;
} else {
- res = fsync(fd);
+ res = fsync(fd) == -1 ? errno : 0;
}
if (!fi) {
close(fd);
}
- fuse_reply_err(req, res == -1 ? errno : 0);
+out:
+ lo_inode_put(lo, &inode);
+ fuse_reply_err(req, res);
}
static void lo_read(fuse_req_t req, fuse_ino_t ino, size_t size, off_t offset,

View File

@@ -0,0 +1,199 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 10 Apr 2024 08:43:26 +0300
Subject: [PATCH] target/arm: Fix SME full tile indexing
For the outer product set of insns, which take an entire matrix
tile as output, the argument is not a combined tile+column.
Therefore using get_tile_rowcol was incorrect, as we extracted
the tile number from itself.
The test case relies only on assembler support for SME, since
no release of GCC recognizes -march=armv9-a+sme yet.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1620
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230622151201.1578522-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: dropped now-unneeded changes to sysregs CFLAGS]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 1f51573f7925b80e79a29f87c7d9d6ead60960c0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
target/arm/translate-sme.c | 24 ++++++---
tests/tcg/aarch64/Makefile.target | 7 ++-
tests/tcg/aarch64/sme-outprod1.c | 83 +++++++++++++++++++++++++++++++
3 files changed, 107 insertions(+), 7 deletions(-)
create mode 100644 tests/tcg/aarch64/sme-outprod1.c
diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c
index 7b87a9df63..65f8495bdd 100644
--- a/target/arm/translate-sme.c
+++ b/target/arm/translate-sme.c
@@ -103,6 +103,21 @@ static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs,
return addr;
}
+/*
+ * Resolve tile.size[0] to a host pointer.
+ * Used by e.g. outer product insns where we require the entire tile.
+ */
+static TCGv_ptr get_tile(DisasContext *s, int esz, int tile)
+{
+ TCGv_ptr addr = tcg_temp_new_ptr();
+ int offset;
+
+ offset = tile * sizeof(ARMVectorReg) + offsetof(CPUARMState, zarray);
+
+ tcg_gen_addi_ptr(addr, cpu_env, offset);
+ return addr;
+}
+
static bool trans_ZERO(DisasContext *s, arg_ZERO *a)
{
if (!dc_isar_feature(aa64_sme, s)) {
@@ -279,8 +294,7 @@ static bool do_adda(DisasContext *s, arg_adda *a, MemOp esz,
return true;
}
- /* Sum XZR+zad to find ZAd. */
- za = get_tile_rowcol(s, esz, 31, a->zad, false);
+ za = get_tile(s, esz, a->zad);
zn = vec_full_reg_ptr(s, a->zn);
pn = pred_full_reg_ptr(s, a->pn);
pm = pred_full_reg_ptr(s, a->pm);
@@ -310,8 +324,7 @@ static bool do_outprod(DisasContext *s, arg_op *a, MemOp esz,
return true;
}
- /* Sum XZR+zad to find ZAd. */
- za = get_tile_rowcol(s, esz, 31, a->zad, false);
+ za = get_tile(s, esz, a->zad);
zn = vec_full_reg_ptr(s, a->zn);
zm = vec_full_reg_ptr(s, a->zm);
pn = pred_full_reg_ptr(s, a->pn);
@@ -337,8 +350,7 @@ static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz,
return true;
}
- /* Sum XZR+zad to find ZAd. */
- za = get_tile_rowcol(s, esz, 31, a->zad, false);
+ za = get_tile(s, esz, a->zad);
zn = vec_full_reg_ptr(s, a->zn);
zm = vec_full_reg_ptr(s, a->zm);
pn = pred_full_reg_ptr(s, a->pn);
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index 118d069073..5e4ea7c998 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -24,7 +24,7 @@ config-cc.mak: Makefile
$(call cc-option,-march=armv8.3-a, CROSS_CC_HAS_ARMV8_3); \
$(call cc-option,-mbranch-protection=standard, CROSS_CC_HAS_ARMV8_BTI); \
$(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE); \
- $(call cc-option,-march=armv9-a+sme, CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak
+ $(call cc-option,-Wa$(COMMA)-march=armv9-a+sme, CROSS_AS_HAS_ARMV9_SME)) 3> config-cc.mak
-include config-cc.mak
# Pauth Tests
@@ -51,6 +51,11 @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7
mte-%: CFLAGS += -march=armv8.5-a+memtag
endif
+# SME Tests
+ifneq ($(CROSS_AS_HAS_ARMV9_SME),)
+AARCH64_TESTS += sme-outprod1
+endif
+
# System Registers Tests
AARCH64_TESTS += sysregs
diff --git a/tests/tcg/aarch64/sme-outprod1.c b/tests/tcg/aarch64/sme-outprod1.c
new file mode 100644
index 0000000000..6e5972d75e
--- /dev/null
+++ b/tests/tcg/aarch64/sme-outprod1.c
@@ -0,0 +1,83 @@
+/*
+ * SME outer product, 1 x 1.
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include <stdio.h>
+
+extern void foo(float *dst);
+
+asm(
+" .arch_extension sme\n"
+" .type foo, @function\n"
+"foo:\n"
+" stp x29, x30, [sp, -80]!\n"
+" mov x29, sp\n"
+" stp d8, d9, [sp, 16]\n"
+" stp d10, d11, [sp, 32]\n"
+" stp d12, d13, [sp, 48]\n"
+" stp d14, d15, [sp, 64]\n"
+" smstart\n"
+" ptrue p0.s, vl4\n"
+" fmov z0.s, #1.0\n"
+/*
+ * An outer product of a vector of 1.0 by itself should be a matrix of 1.0.
+ * Note that we are using tile 1 here (za1.s) rather than tile 0.
+ */
+" zero {za}\n"
+" fmopa za1.s, p0/m, p0/m, z0.s, z0.s\n"
+/*
+ * Read the first 4x4 sub-matrix of elements from tile 1:
+ * Note that za1h should be interchangable here.
+ */
+" mov w12, #0\n"
+" mova z0.s, p0/m, za1v.s[w12, #0]\n"
+" mova z1.s, p0/m, za1v.s[w12, #1]\n"
+" mova z2.s, p0/m, za1v.s[w12, #2]\n"
+" mova z3.s, p0/m, za1v.s[w12, #3]\n"
+/*
+ * And store them to the input pointer (dst in the C code):
+ */
+" st1w {z0.s}, p0, [x0]\n"
+" add x0, x0, #16\n"
+" st1w {z1.s}, p0, [x0]\n"
+" add x0, x0, #16\n"
+" st1w {z2.s}, p0, [x0]\n"
+" add x0, x0, #16\n"
+" st1w {z3.s}, p0, [x0]\n"
+" smstop\n"
+" ldp d8, d9, [sp, 16]\n"
+" ldp d10, d11, [sp, 32]\n"
+" ldp d12, d13, [sp, 48]\n"
+" ldp d14, d15, [sp, 64]\n"
+" ldp x29, x30, [sp], 80\n"
+" ret\n"
+" .size foo, . - foo"
+);
+
+int main()
+{
+ float dst[16];
+ int i, j;
+
+ foo(dst);
+
+ for (i = 0; i < 16; i++) {
+ if (dst[i] != 1.0f) {
+ break;
+ }
+ }
+
+ if (i == 16) {
+ return 0; /* success */
+ }
+
+ /* failure */
+ for (i = 0; i < 4; ++i) {
+ for (j = 0; j < 4; ++j) {
+ printf("%f ", (double)dst[i * 4 + j]);
+ }
+ printf("\n");
+ }
+ return 1;
+}

View File

@@ -1,29 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Greg Kurz <groug@kaod.org>
Date: Thu, 4 Feb 2021 18:34:38 +0000
Subject: [PATCH] virtiofsd: Add _llseek to the seccomp whitelist
This is how glibc implements lseek(2) on POWER.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1917692
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210121171540.1449777-1-groug@kaod.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
tools/virtiofsd/passthrough_seccomp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/virtiofsd/passthrough_seccomp.c b/tools/virtiofsd/passthrough_seccomp.c
index 11623f56f2..bb8ef5b17f 100644
--- a/tools/virtiofsd/passthrough_seccomp.c
+++ b/tools/virtiofsd/passthrough_seccomp.c
@@ -68,6 +68,7 @@ static const int syscall_whitelist[] = {
SCMP_SYS(linkat),
SCMP_SYS(listxattr),
SCMP_SYS(lseek),
+ SCMP_SYS(_llseek), /* For POWER */
SCMP_SYS(madvise),
SCMP_SYS(mkdirat),
SCMP_SYS(mknodat),

View File

@@ -0,0 +1,61 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dmitrii Gavrilov <ds-gavr@yandex-team.ru>
Date: Wed, 10 Apr 2024 08:43:28 +0300
Subject: [PATCH] system/qdev-monitor: move drain_call_rcu call under if (!dev)
in qmp_device_add()
Original goal of addition of drain_call_rcu to qmp_device_add was to cover
the failure case of qdev_device_add. It seems call of drain_call_rcu was
misplaced in 7bed89958bfbf40df what led to waiting for pending RCU callbacks
under happy path too. What led to overall performance degradation of
qmp_device_add.
In this patch call of drain_call_rcu moved under handling of failure of
qdev_device_add.
Signed-off-by: Dmitrii Gavrilov <ds-gavr@yandex-team.ru>
Message-ID: <20231103105602.90475-1-ds-gavr@yandex-team.ru>
Fixes: 7bed89958bf ("device_core: use drain_call_rcu in in qmp_device_add", 2020-10-12)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 012b170173bcaa14b9bc26209e0813311ac78489)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
softmmu/qdev-monitor.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 4b0ef65780..f4348443b0 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -853,19 +853,18 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
return;
}
dev = qdev_device_add(opts, errp);
-
- /*
- * Drain all pending RCU callbacks. This is done because
- * some bus related operations can delay a device removal
- * (in this case this can happen if device is added and then
- * removed due to a configuration error)
- * to a RCU callback, but user might expect that this interface
- * will finish its job completely once qmp command returns result
- * to the user
- */
- drain_call_rcu();
-
if (!dev) {
+ /*
+ * Drain all pending RCU callbacks. This is done because
+ * some bus related operations can delay a device removal
+ * (in this case this can happen if device is added and then
+ * removed due to a configuration error)
+ * to a RCU callback, but user might expect that this interface
+ * will finish its job completely once qmp command returns result
+ * to the user
+ */
+ drain_call_rcu();
+
qemu_opts_del(opts);
return;
}

View File

@@ -1,31 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Greg Kurz <groug@kaod.org>
Date: Thu, 4 Feb 2021 18:34:39 +0000
Subject: [PATCH] virtiofsd: Add restart_syscall to the seccomp whitelist
This is how linux restarts some system calls after SIGSTOP/SIGCONT.
This is needed to avoid virtiofsd termination when resuming execution
under GDB for example.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210201193305.136390-1-groug@kaod.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
tools/virtiofsd/passthrough_seccomp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/virtiofsd/passthrough_seccomp.c b/tools/virtiofsd/passthrough_seccomp.c
index bb8ef5b17f..44d75e0e36 100644
--- a/tools/virtiofsd/passthrough_seccomp.c
+++ b/tools/virtiofsd/passthrough_seccomp.c
@@ -92,6 +92,7 @@ static const int syscall_whitelist[] = {
SCMP_SYS(renameat),
SCMP_SYS(renameat2),
SCMP_SYS(removexattr),
+ SCMP_SYS(restart_syscall),
SCMP_SYS(rt_sigaction),
SCMP_SYS(rt_sigprocmask),
SCMP_SYS(rt_sigreturn),

View File

@@ -0,0 +1,85 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Sven Schnelle <svens@stackframe.org>
Date: Wed, 10 Apr 2024 08:43:29 +0300
Subject: [PATCH] hw/scsi/lsi53c895a: stop script on phase mismatch
Netbsd isn't happy with qemu lsi53c895a emulation:
cd0(esiop0:0:2:0): command with tag id 0 reset
esiop0: autoconfiguration error: phase mismatch without command
esiop0: autoconfiguration error: unhandled scsi interrupt, sist=0x80 sstat1=0x0 DSA=0x23a64b1 DSP=0x50
This is because lsi_bad_phase() triggers a phase mismatch, which
stops SCRIPT processing. However, after returning to
lsi_command_complete(), SCRIPT is restarted with lsi_resume_script().
Fix this by adding a return value to lsi_bad_phase(), and only resume
script processing when lsi_bad_phase() didn't trigger a host interrupt.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Helge Deller <deller@gmx.de>
Message-ID: <20240302214453.2071388-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a9198b3132d81a6bfc9fdbf6f3d3a514c2864674)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/scsi/lsi53c895a.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index ca619ed564..905f5ef237 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -570,8 +570,9 @@ static inline void lsi_set_phase(LSIState *s, int phase)
s->sstat1 = (s->sstat1 & ~PHASE_MASK) | phase;
}
-static void lsi_bad_phase(LSIState *s, int out, int new_phase)
+static int lsi_bad_phase(LSIState *s, int out, int new_phase)
{
+ int ret = 0;
/* Trigger a phase mismatch. */
if (s->ccntl0 & LSI_CCNTL0_ENPMJ) {
if ((s->ccntl0 & LSI_CCNTL0_PMJCTL)) {
@@ -584,8 +585,10 @@ static void lsi_bad_phase(LSIState *s, int out, int new_phase)
trace_lsi_bad_phase_interrupt();
lsi_script_scsi_interrupt(s, LSI_SIST0_MA, 0);
lsi_stop_script(s);
+ ret = 1;
}
lsi_set_phase(s, new_phase);
+ return ret;
}
@@ -789,7 +792,7 @@ static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
static void lsi_command_complete(SCSIRequest *req, size_t resid)
{
LSIState *s = LSI53C895A(req->bus->qbus.parent);
- int out;
+ int out, stop = 0;
out = (s->sstat1 & PHASE_MASK) == PHASE_DO;
trace_lsi_command_complete(req->status);
@@ -797,7 +800,10 @@ static void lsi_command_complete(SCSIRequest *req, size_t resid)
s->command_complete = 2;
if (s->waiting && s->dbc != 0) {
/* Raise phase mismatch for short transfers. */
- lsi_bad_phase(s, out, PHASE_ST);
+ stop = lsi_bad_phase(s, out, PHASE_ST);
+ if (stop) {
+ s->waiting = 0;
+ }
} else {
lsi_set_phase(s, PHASE_ST);
}
@@ -807,7 +813,9 @@ static void lsi_command_complete(SCSIRequest *req, size_t resid)
lsi_request_free(s, s->current);
scsi_req_unref(req);
}
- lsi_resume_script(s);
+ if (!stop) {
+ lsi_resume_script(s);
+ }
}
/* Callback to indicate that the SCSI layer has completed a transfer. */

View File

@@ -1,108 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Vitaly Cheptsov <cheptsov@ispras.ru>
Date: Tue, 2 Mar 2021 09:21:10 -0500
Subject: [PATCH] i386/acpi: restore device paths for pre-5.1 vms
After fixing the _UID value for the primary PCI root bridge in
af1b80ae it was discovered that this change updates Windows
configuration in an incompatible way causing network configuration
failure unless DHCP is used. More details provided on the list:
https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg08484.html
This change reverts the _UID update from 1 to 0 for q35 and i440fx
VMs before version 5.2 to maintain the original behaviour when
upgrading.
Cc: qemu-stable@nongnu.org
Cc: qemu-devel@nongnu.org
Reported-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vitaly Cheptsov <cheptsov@ispras.ru>
Message-Id: <20210301195919.9333-1-cheptsov@ispras.ru>
Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: af1b80ae56c9 ("i386/acpi: fix inconsistent QEMU/OVMF device paths")
---
hw/i386/acpi-build.c | 4 ++--
hw/i386/pc_piix.c | 2 ++
hw/i386/pc_q35.c | 2 ++
include/hw/i386/pc.h | 1 +
4 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 1f5c211245..b5616582a5 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1513,7 +1513,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
dev = aml_device("PCI0");
aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
- aml_append(dev, aml_name_decl("_UID", aml_int(0)));
+ aml_append(dev, aml_name_decl("_UID", aml_int(pcmc->pci_root_uid)));
aml_append(sb_scope, dev);
aml_append(dsdt, sb_scope);
@@ -1530,7 +1530,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
- aml_append(dev, aml_name_decl("_UID", aml_int(0)));
+ aml_append(dev, aml_name_decl("_UID", aml_int(pcmc->pci_root_uid)));
aml_append(dev, build_q35_osc_method());
aml_append(sb_scope, dev);
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 13d1628f13..2524c96216 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -417,6 +417,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
{
PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
pcmc->default_nic_model = "e1000";
+ pcmc->pci_root_uid = 0;
m->family = "pc_piix";
m->desc = "Standard PC (i440FX + PIIX, 1996)";
@@ -448,6 +449,7 @@ static void pc_i440fx_5_1_machine_options(MachineClass *m)
compat_props_add(m->compat_props, hw_compat_5_1, hw_compat_5_1_len);
compat_props_add(m->compat_props, pc_compat_5_1, pc_compat_5_1_len);
pcmc->kvmclock_create_always = false;
+ pcmc->pci_root_uid = 1;
}
DEFINE_I440FX_MACHINE(v5_1, "pc-i440fx-5.1", NULL,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index a3f4959c43..c58dad5ae3 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -329,6 +329,7 @@ static void pc_q35_machine_options(MachineClass *m)
{
PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
pcmc->default_nic_model = "e1000e";
+ pcmc->pci_root_uid = 0;
m->family = "pc_q35";
m->desc = "Standard PC (Q35 + ICH9, 2009)";
@@ -364,6 +365,7 @@ static void pc_q35_5_1_machine_options(MachineClass *m)
compat_props_add(m->compat_props, hw_compat_5_1, hw_compat_5_1_len);
compat_props_add(m->compat_props, pc_compat_5_1, pc_compat_5_1_len);
pcmc->kvmclock_create_always = false;
+ pcmc->pci_root_uid = 1;
}
DEFINE_Q35_MACHINE(v5_1, "pc-q35-5.1", NULL,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 911e460097..7f8e1a791f 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -99,6 +99,7 @@ struct PCMachineClass {
int legacy_acpi_table_size;
unsigned acpi_data_size;
bool do_not_add_smb_acpi;
+ int pci_root_uid;
/* SMBIOS compat: */
bool smbios_defaults;

View File

@@ -0,0 +1,38 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Sven Schnelle <svens@stackframe.org>
Date: Wed, 10 Apr 2024 08:43:30 +0300
Subject: [PATCH] hw/scsi/lsi53c895a: add missing decrement of reentrancy
counter
When the maximum count of SCRIPTS instructions is reached, the code
stops execution and returns, but fails to decrement the reentrancy
counter. This effectively renders the SCSI controller unusable
because on next entry the reentrancy counter is still above the limit.
This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS
loops.
Fixes: b987718bbb ("hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)")
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240128202214.2644768-1-svens@stackframe.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 8b09b7fe47082c69295a0fc0cc01b041b6385025)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/scsi/lsi53c895a.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 905f5ef237..c7a3964b5f 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -1167,6 +1167,7 @@ again:
lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
lsi_disconnect(s);
trace_lsi_execute_script_stop();
+ reentrancy_level--;
return;
}
insn = read_dword(s, s->dsp);

View File

@@ -1,48 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Mon, 22 Mar 2021 15:20:04 +0100
Subject: [PATCH] monitor/qmp: fix race on CHR_EVENT_CLOSED without OOB
The QMP dispatcher coroutine holds the qmp_queue_lock over a yield
point, where it expects to be rescheduled from the main context. If a
CHR_EVENT_CLOSED event is received just then, it can race and block the
main thread on the mutex in monitor_qmp_cleanup_queue_and_resume.
Calculate need_resume immediately after we pop a request from the queue,
so that we can release the mutex before yielding.
Suggested-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
monitor/qmp.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 2e37d11bd3..2aff833f7a 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -252,6 +252,12 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
}
}
+ mon = req_obj->mon;
+ /* qmp_oob_enabled() might change after "qmp_capabilities" */
+ need_resume = !qmp_oob_enabled(mon) ||
+ mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
+ qemu_mutex_unlock(&mon->qmp_queue_lock);
+
if (qatomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
/*
* Someone rescheduled us (probably because a new requests
@@ -270,11 +276,6 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
qemu_coroutine_yield();
- mon = req_obj->mon;
- /* qmp_oob_enabled() might change after "qmp_capabilities" */
- need_resume = !qmp_oob_enabled(mon) ||
- mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
- qemu_mutex_unlock(&mon->qmp_queue_lock);
if (req_obj->req) {
QDict *qdict = qobject_to(QDict, req_obj->req);
QObject *id = qdict ? qdict_get(qdict, "id") : NULL;

View File

@@ -1,42 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 3 Dec 2020 18:23:10 +0100
Subject: [PATCH] block: Fix locking in qmp_block_resize()
The drain functions assume that we hold the AioContext lock of the
drained block node. Make sure to actually take the lock.
Cc: qemu-stable@nongnu.org
Fixes: eb94b81a94bce112e6b206df846c1551aaf6cab6
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201203172311.68232-3-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
blockdev.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/blockdev.c b/blockdev.c
index fe6fb5dc1d..9a86e9fb4b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2481,14 +2481,17 @@ void coroutine_fn qmp_block_resize(bool has_device, const char *device,
goto out;
}
+ bdrv_co_lock(bs);
bdrv_drained_begin(bs);
+ bdrv_co_unlock(bs);
+
old_ctx = bdrv_co_enter(bs);
blk_truncate(blk, size, false, PREALLOC_MODE_OFF, 0, errp);
bdrv_co_leave(bs, old_ctx);
- bdrv_drained_end(bs);
out:
bdrv_co_lock(bs);
+ bdrv_drained_end(bs);
blk_unref(blk);
bdrv_co_unlock(bs);
}

View File

@@ -0,0 +1,173 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Sven Schnelle <svens@stackframe.org>
Date: Wed, 10 Apr 2024 08:43:31 +0300
Subject: [PATCH] hw/scsi/lsi53c895a: add timer to scripts processing
HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location
under certain circumstances. As the SCSI controller and CPU are not
running at the same time this loop will never finish. After some
time, the check loop interrupts with a unexpected device disconnect.
This works, but is slow because the kernel resets the scsi controller.
Instead of signaling UDC, start a timer and exit the loop. Until the
timer fires, the CPU can process instructions which might changes the
memory location.
The limit of instructions is also reduced because scripts running on
the SCSI processor are usually very short. This keeps the time until
the loop is exit short.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240229204407.1699260-1-svens@stackframe.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9876359990dd4c8a48de65cf5e1c3d13e96a7f4e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/scsi/lsi53c895a.c | 43 +++++++++++++++++++++++++++++++++----------
hw/scsi/trace-events | 2 ++
2 files changed, 35 insertions(+), 10 deletions(-)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index c7a3964b5f..48c85d479c 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -188,7 +188,7 @@ static const char *names[] = {
#define LSI_TAG_VALID (1 << 16)
/* Maximum instructions to process. */
-#define LSI_MAX_INSN 10000
+#define LSI_MAX_INSN 100
typedef struct lsi_request {
SCSIRequest *req;
@@ -205,6 +205,7 @@ enum {
LSI_WAIT_RESELECT, /* Wait Reselect instruction has been issued */
LSI_DMA_SCRIPTS, /* processing DMA from lsi_execute_script */
LSI_DMA_IN_PROGRESS, /* DMA operation is in progress */
+ LSI_WAIT_SCRIPTS, /* SCRIPTS stopped because of instruction count limit */
};
enum {
@@ -224,6 +225,7 @@ struct LSIState {
MemoryRegion ram_io;
MemoryRegion io_io;
AddressSpace pci_io_as;
+ QEMUTimer *scripts_timer;
int carry; /* ??? Should this be an a visible register somewhere? */
int status;
@@ -415,6 +417,7 @@ static void lsi_soft_reset(LSIState *s)
s->sbr = 0;
assert(QTAILQ_EMPTY(&s->queue));
assert(!s->current);
+ timer_del(s->scripts_timer);
}
static int lsi_dma_40bit(LSIState *s)
@@ -1135,6 +1138,12 @@ static void lsi_wait_reselect(LSIState *s)
}
}
+static void lsi_scripts_timer_start(LSIState *s)
+{
+ trace_lsi_scripts_timer_start();
+ timer_mod(s->scripts_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 500);
+}
+
static void lsi_execute_script(LSIState *s)
{
PCIDevice *pci_dev = PCI_DEVICE(s);
@@ -1144,6 +1153,11 @@ static void lsi_execute_script(LSIState *s)
int insn_processed = 0;
static int reentrancy_level;
+ if (s->waiting == LSI_WAIT_SCRIPTS) {
+ timer_del(s->scripts_timer);
+ s->waiting = LSI_NOWAIT;
+ }
+
reentrancy_level++;
s->istat1 |= LSI_ISTAT1_SRUN;
@@ -1151,8 +1165,8 @@ again:
/*
* Some windows drivers make the device spin waiting for a memory location
* to change. If we have executed more than LSI_MAX_INSN instructions then
- * assume this is the case and force an unexpected device disconnect. This
- * is apparently sufficient to beat the drivers into submission.
+ * assume this is the case and start a timer. Until the timer fires, the
+ * host CPU has a chance to run and change the memory location.
*
* Another issue (CVE-2023-0330) can occur if the script is programmed to
* trigger itself again and again. Avoid this problem by stopping after
@@ -1160,13 +1174,8 @@ again:
* which should be enough for all valid use cases).
*/
if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
- if (!(s->sien0 & LSI_SIST0_UDC)) {
- qemu_log_mask(LOG_GUEST_ERROR,
- "lsi_scsi: inf. loop with UDC masked");
- }
- lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0);
- lsi_disconnect(s);
- trace_lsi_execute_script_stop();
+ s->waiting = LSI_WAIT_SCRIPTS;
+ lsi_scripts_timer_start(s);
reentrancy_level--;
return;
}
@@ -2205,6 +2214,9 @@ static int lsi_post_load(void *opaque, int version_id)
return -EINVAL;
}
+ if (s->waiting == LSI_WAIT_SCRIPTS) {
+ lsi_scripts_timer_start(s);
+ }
return 0;
}
@@ -2302,6 +2314,15 @@ static const struct SCSIBusInfo lsi_scsi_info = {
.cancel = lsi_request_cancelled
};
+static void scripts_timer_cb(void *opaque)
+{
+ LSIState *s = opaque;
+
+ trace_lsi_scripts_timer_triggered();
+ s->waiting = LSI_NOWAIT;
+ lsi_execute_script(s);
+}
+
static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
{
LSIState *s = LSI53C895A(dev);
@@ -2321,6 +2342,7 @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp)
"lsi-ram", 0x2000);
memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s,
"lsi-io", 256);
+ s->scripts_timer = timer_new_us(QEMU_CLOCK_VIRTUAL, scripts_timer_cb, s);
/*
* Since we use the address-space API to interact with ram_io, disable the
@@ -2345,6 +2367,7 @@ static void lsi_scsi_exit(PCIDevice *dev)
LSIState *s = LSI53C895A(dev);
address_space_destroy(&s->pci_io_as);
+ timer_del(s->scripts_timer);
}
static void lsi_class_init(ObjectClass *klass, void *data)
diff --git a/hw/scsi/trace-events b/hw/scsi/trace-events
index ab238293f0..131af99d91 100644
--- a/hw/scsi/trace-events
+++ b/hw/scsi/trace-events
@@ -299,6 +299,8 @@ lsi_execute_script_stop(void) "SCRIPTS execution stopped"
lsi_awoken(void) "Woken by SIGP"
lsi_reg_read(const char *name, int offset, uint8_t ret) "Read reg %s 0x%x = 0x%02x"
lsi_reg_write(const char *name, int offset, uint8_t val) "Write reg %s 0x%x = 0x%02x"
+lsi_scripts_timer_triggered(void) "SCRIPTS timer triggered"
+lsi_scripts_timer_start(void) "SCRIPTS timer started"
# virtio-scsi.c
virtio_scsi_cmd_req(int lun, uint32_t tag, uint8_t cmd) "virtio_scsi_cmd_req lun=%u tag=0x%x cmd=0x%x"

View File

@@ -1,118 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 3 Dec 2020 18:23:11 +0100
Subject: [PATCH] block: Fix deadlock in bdrv_co_yield_to_drain()
If bdrv_co_yield_to_drain() is called for draining a block node that
runs in a different AioContext, it keeps that AioContext locked while it
yields and schedules a BH in the AioContext to do the actual drain.
As long as executing the BH is the very next thing that the event loop
of the node's AioContext does, this actually happens to work, but when
it tries to execute something else that wants to take the AioContext
lock, it will deadlock. (In the bug report, this other thing is a
virtio-scsi device running virtio_scsi_data_plane_handle_cmd().)
Instead, always drop the AioContext lock across the yield and reacquire
it only when the coroutine is reentered. The BH needs to unconditionally
take the lock for itself now.
This fixes the 'block_resize' QMP command on a block node that runs in
an iothread.
Cc: qemu-stable@nongnu.org
Fixes: eb94b81a94bce112e6b206df846c1551aaf6cab6
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1903511
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20201203172311.68232-4-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
block/io.c | 41 ++++++++++++++++++++++++-----------------
1 file changed, 24 insertions(+), 17 deletions(-)
diff --git a/block/io.c b/block/io.c
index ec5e152bb7..a9f56a9ab1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -306,17 +306,7 @@ static void bdrv_co_drain_bh_cb(void *opaque)
if (bs) {
AioContext *ctx = bdrv_get_aio_context(bs);
- AioContext *co_ctx = qemu_coroutine_get_aio_context(co);
-
- /*
- * When the coroutine yielded, the lock for its home context was
- * released, so we need to re-acquire it here. If it explicitly
- * acquired a different context, the lock is still held and we don't
- * want to lock it a second time (or AIO_WAIT_WHILE() would hang).
- */
- if (ctx == co_ctx) {
- aio_context_acquire(ctx);
- }
+ aio_context_acquire(ctx);
bdrv_dec_in_flight(bs);
if (data->begin) {
assert(!data->drained_end_counter);
@@ -328,9 +318,7 @@ static void bdrv_co_drain_bh_cb(void *opaque)
data->ignore_bds_parents,
data->drained_end_counter);
}
- if (ctx == co_ctx) {
- aio_context_release(ctx);
- }
+ aio_context_release(ctx);
} else {
assert(data->begin);
bdrv_drain_all_begin();
@@ -348,13 +336,16 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
int *drained_end_counter)
{
BdrvCoDrainData data;
+ Coroutine *self = qemu_coroutine_self();
+ AioContext *ctx = bdrv_get_aio_context(bs);
+ AioContext *co_ctx = qemu_coroutine_get_aio_context(self);
/* Calling bdrv_drain() from a BH ensures the current coroutine yields and
* other coroutines run if they were queued by aio_co_enter(). */
assert(qemu_in_coroutine());
data = (BdrvCoDrainData) {
- .co = qemu_coroutine_self(),
+ .co = self,
.bs = bs,
.done = false,
.begin = begin,
@@ -368,13 +359,29 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
if (bs) {
bdrv_inc_in_flight(bs);
}
- replay_bh_schedule_oneshot_event(bdrv_get_aio_context(bs),
- bdrv_co_drain_bh_cb, &data);
+
+ /*
+ * Temporarily drop the lock across yield or we would get deadlocks.
+ * bdrv_co_drain_bh_cb() reaquires the lock as needed.
+ *
+ * When we yield below, the lock for the current context will be
+ * released, so if this is actually the lock that protects bs, don't drop
+ * it a second time.
+ */
+ if (ctx != co_ctx) {
+ aio_context_release(ctx);
+ }
+ replay_bh_schedule_oneshot_event(ctx, bdrv_co_drain_bh_cb, &data);
qemu_coroutine_yield();
/* If we are resumed from some other event (such as an aio completion or a
* timer callback), it is a bug in the caller that should be fixed. */
assert(data.done);
+
+ /* Reaquire the AioContext of bs if we dropped it */
+ if (ctx != co_ctx) {
+ aio_context_acquire(ctx);
+ }
}
void bdrv_do_drained_begin_quiesce(BlockDriverState *bs,

View File

@@ -0,0 +1,161 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Laurent Vivier <lvivier@redhat.com>
Date: Wed, 10 Apr 2024 08:43:33 +0300
Subject: [PATCH] e1000e: fix link state on resume
On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
that sets link_down to false, and thus activates the link even
if we have disabled it.
The problem can be reproduced starting qemu in paused state (-S) and
then set the link to down. When we resume the machine the link appears
to be up.
Reproducer:
# qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
{"execute": "qmp_capabilities" }
{"execute": "set_link", "arguments": {"name": "net0", "up": false}}
{"execute": "cont" }
To fix the problem, merge the content of e1000e_vm_state_change()
into e1000e_core_post_load() as e1000 does.
Buglink: https://issues.redhat.com/browse/RHEL-21867
Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 4cadf10234989861398e19f3bb441d3861f3bb7c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/net/e1000e_core.c | 60 ++++++--------------------------------------
hw/net/e1000e_core.h | 2 --
2 files changed, 7 insertions(+), 55 deletions(-)
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index c71d82ce1d..742f5ec800 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -108,14 +108,6 @@ e1000e_intmgr_timer_resume(E1000IntrDelayTimer *timer)
}
}
-static void
-e1000e_intmgr_timer_pause(E1000IntrDelayTimer *timer)
-{
- if (timer->running) {
- timer_del(timer->timer);
- }
-}
-
static inline void
e1000e_intrmgr_stop_timer(E1000IntrDelayTimer *timer)
{
@@ -397,24 +389,6 @@ e1000e_intrmgr_resume(E1000ECore *core)
}
}
-static void
-e1000e_intrmgr_pause(E1000ECore *core)
-{
- int i;
-
- e1000e_intmgr_timer_pause(&core->radv);
- e1000e_intmgr_timer_pause(&core->rdtr);
- e1000e_intmgr_timer_pause(&core->raid);
- e1000e_intmgr_timer_pause(&core->tidv);
- e1000e_intmgr_timer_pause(&core->tadv);
-
- e1000e_intmgr_timer_pause(&core->itr);
-
- for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
- e1000e_intmgr_timer_pause(&core->eitr[i]);
- }
-}
-
static void
e1000e_intrmgr_reset(E1000ECore *core)
{
@@ -3336,12 +3310,6 @@ e1000e_core_read(E1000ECore *core, hwaddr addr, unsigned size)
return 0;
}
-static inline void
-e1000e_autoneg_pause(E1000ECore *core)
-{
- timer_del(core->autoneg_timer);
-}
-
static void
e1000e_autoneg_resume(E1000ECore *core)
{
@@ -3353,22 +3321,6 @@ e1000e_autoneg_resume(E1000ECore *core)
}
}
-static void
-e1000e_vm_state_change(void *opaque, bool running, RunState state)
-{
- E1000ECore *core = opaque;
-
- if (running) {
- trace_e1000e_vm_state_running();
- e1000e_intrmgr_resume(core);
- e1000e_autoneg_resume(core);
- } else {
- trace_e1000e_vm_state_stopped();
- e1000e_autoneg_pause(core);
- e1000e_intrmgr_pause(core);
- }
-}
-
void
e1000e_core_pci_realize(E1000ECore *core,
const uint16_t *eeprom_templ,
@@ -3381,9 +3333,6 @@ e1000e_core_pci_realize(E1000ECore *core,
e1000e_autoneg_timer, core);
e1000e_intrmgr_pci_realize(core);
- core->vmstate =
- qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
-
for (i = 0; i < E1000E_NUM_QUEUES; i++) {
net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
E1000E_MAX_TX_FRAGS, core->has_vnet);
@@ -3408,8 +3357,6 @@ e1000e_core_pci_uninit(E1000ECore *core)
e1000e_intrmgr_pci_unint(core);
- qemu_del_vm_change_state_handler(core->vmstate);
-
for (i = 0; i < E1000E_NUM_QUEUES; i++) {
net_tx_pkt_reset(core->tx[i].tx_pkt);
net_tx_pkt_uninit(core->tx[i].tx_pkt);
@@ -3561,5 +3508,12 @@ e1000e_core_post_load(E1000ECore *core)
*/
nc->link_down = (core->mac[STATUS] & E1000_STATUS_LU) == 0;
+ /*
+ * we need to restart intrmgr timers, as an older version of
+ * QEMU can have stopped them before migration
+ */
+ e1000e_intrmgr_resume(core);
+ e1000e_autoneg_resume(core);
+
return 0;
}
diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
index 4ddb4d2c39..f2a8ff4a33 100644
--- a/hw/net/e1000e_core.h
+++ b/hw/net/e1000e_core.h
@@ -100,8 +100,6 @@ struct E1000Core {
E1000IntrDelayTimer eitr[E1000E_MSIX_VEC_NUM];
bool eitr_intr_pending[E1000E_MSIX_VEC_NUM];
- VMChangeStateEntry *vmstate;
-
uint32_t itr_guest_value;
uint32_t eitr_guest_value[E1000E_MSIX_VEC_NUM];

View File

@@ -0,0 +1,61 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 10 Apr 2024 08:43:49 +0300
Subject: [PATCH] target/i386: introduce function to query MMU indices
Remove knowledge of specific MMU indexes (other than MMU_NESTED_IDX and
MMU_PHYS_IDX) from mmu_translate(). This will make it possible to split
32-bit and 64-bit MMU indexes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5f97afe2543f09160a8d123ab6e2e8c6d98fa9ce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fixup in target/i386/cpu.h due to other changes in that area)
---
target/i386/cpu.h | 10 ++++++++++
target/i386/tcg/sysemu/excp_helper.c | 4 ++--
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7be047ce33..f175e18768 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2195,6 +2195,16 @@ static inline int cpu_mmu_index(CPUX86State *env, bool ifetch)
? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX;
}
+static inline bool is_mmu_index_smap(int mmu_index)
+{
+ return mmu_index == MMU_KSMAP_IDX;
+}
+
+static inline bool is_mmu_index_user(int mmu_index)
+{
+ return mmu_index == MMU_USER_IDX;
+}
+
static inline bool is_mmu_index_32(int mmu_index)
{
assert(mmu_index < MMU_PHYS_IDX);
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 5999cdedf5..553a60d976 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -135,7 +135,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in,
{
const target_ulong addr = in->addr;
const int pg_mode = in->pg_mode;
- const bool is_user = (in->mmu_idx == MMU_USER_IDX);
+ const bool is_user = is_mmu_index_user(in->mmu_idx);
const MMUAccessType access_type = in->access_type;
uint64_t ptep, pte, rsvd_mask;
PTETranslate pte_trans = {
@@ -355,7 +355,7 @@ do_check_protect_pse36:
}
int prot = 0;
- if (in->mmu_idx != MMU_KSMAP_IDX || !(ptep & PG_USER_MASK)) {
+ if (!is_mmu_index_smap(in->mmu_idx) || !(ptep & PG_USER_MASK)) {
prot |= PAGE_READ;
if ((ptep & PG_RW_MASK) || !(is_user || (pg_mode & PG_MODE_WP))) {
prot |= PAGE_WRITE;

View File

@@ -0,0 +1,130 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 10 Apr 2024 08:43:50 +0300
Subject: [PATCH] target/i386: use separate MMU indexes for 32-bit accesses
Accesses from a 32-bit environment (32-bit code segment for instruction
accesses, EFER.LMA==0 for processor accesses) have to mask away the
upper 32 bits of the address. While a bit wasteful, the easiest way
to do so is to use separate MMU indexes. These days, QEMU anyway is
compiled with a fixed value for NB_MMU_MODES. Split MMU_USER_IDX,
MMU_KSMAP_IDX and MMU_KNOSMAP_IDX in two.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 90f641531c782c873a05895f411c05fbbbef3c49)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: move changes for x86_cpu_mmu_index() to cpu_mmu_index() due to missing
v8.2.0-1030-gace0c5fe5950 "target/i386: Populate CPUClass.mmu_index"
Increase NB_MMU_MODES from 5 to 8 in target/i386/cpu-param.h due to missing
v7.2.0-2640-gffd824f3f32d "include/exec: Set default NB_MMU_MODES to 16"
v7.2.0-2647-g6787318a5d86 "target/i386: Remove NB_MMU_MODES define"
which relaxed upper limit of MMU index for i386, since this commit starts
using MMU_NESTED_IDX=7.
Thanks Zhao Liu and Paolo Bonzini for the analisys and suggestions.
)
---
target/i386/cpu-param.h | 2 +-
target/i386/cpu.h | 44 ++++++++++++++++++++--------
target/i386/tcg/sysemu/excp_helper.c | 3 +-
3 files changed, 34 insertions(+), 15 deletions(-)
diff --git a/target/i386/cpu-param.h b/target/i386/cpu-param.h
index f579b16bd2..e21e472e1e 100644
--- a/target/i386/cpu-param.h
+++ b/target/i386/cpu-param.h
@@ -23,7 +23,7 @@
# define TARGET_VIRT_ADDR_SPACE_BITS 32
#endif
#define TARGET_PAGE_BITS 12
-#define NB_MMU_MODES 5
+#define NB_MMU_MODES 8
#ifndef CONFIG_USER_ONLY
# define TARGET_TB_PCREL 1
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index f175e18768..73eee08f3f 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2182,27 +2182,42 @@ uint64_t cpu_get_tsc(CPUX86State *env);
#define cpu_list x86_cpu_list
/* MMU modes definitions */
-#define MMU_KSMAP_IDX 0
-#define MMU_USER_IDX 1
-#define MMU_KNOSMAP_IDX 2
-#define MMU_NESTED_IDX 3
-#define MMU_PHYS_IDX 4
+#define MMU_KSMAP64_IDX 0
+#define MMU_KSMAP32_IDX 1
+#define MMU_USER64_IDX 2
+#define MMU_USER32_IDX 3
+#define MMU_KNOSMAP64_IDX 4
+#define MMU_KNOSMAP32_IDX 5
+#define MMU_PHYS_IDX 6
+#define MMU_NESTED_IDX 7
+
+#ifdef CONFIG_USER_ONLY
+#ifdef TARGET_X86_64
+#define MMU_USER_IDX MMU_USER64_IDX
+#else
+#define MMU_USER_IDX MMU_USER32_IDX
+#endif
+#endif
static inline int cpu_mmu_index(CPUX86State *env, bool ifetch)
{
- return (env->hflags & HF_CPL_MASK) == 3 ? MMU_USER_IDX :
- (!(env->hflags & HF_SMAP_MASK) || (env->eflags & AC_MASK))
- ? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX;
+ int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 1 : 0;
+ int mmu_index_base =
+ (env->hflags & HF_CPL_MASK) == 3 ? MMU_USER64_IDX :
+ !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
+ (env->eflags & AC_MASK) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX;
+
+ return mmu_index_base + mmu_index_32;
}
static inline bool is_mmu_index_smap(int mmu_index)
{
- return mmu_index == MMU_KSMAP_IDX;
+ return (mmu_index & ~1) == MMU_KSMAP64_IDX;
}
static inline bool is_mmu_index_user(int mmu_index)
{
- return mmu_index == MMU_USER_IDX;
+ return (mmu_index & ~1) == MMU_USER64_IDX;
}
static inline bool is_mmu_index_32(int mmu_index)
@@ -2213,9 +2228,12 @@ static inline bool is_mmu_index_32(int mmu_index)
static inline int cpu_mmu_index_kernel(CPUX86State *env)
{
- return !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP_IDX :
- ((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK))
- ? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX;
+ int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 1 : 0;
+ int mmu_index_base =
+ !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
+ ((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK)) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX;
+
+ return mmu_index_base + mmu_index_32;
}
#define CC_DST (env->cc_dst)
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 553a60d976..5f13252d68 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -541,7 +541,8 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
if (likely(use_stage2)) {
in.cr3 = env->nested_cr3;
in.pg_mode = env->nested_pg_mode;
- in.mmu_idx = MMU_USER_IDX;
+ in.mmu_idx =
+ env->nested_pg_mode & PG_MODE_LMA ? MMU_USER64_IDX : MMU_USER32_IDX;
in.ptw_idx = MMU_PHYS_IDX;
if (!mmu_translate(env, &in, out, err)) {

View File

@@ -0,0 +1,46 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 10 Apr 2024 08:43:51 +0300
Subject: [PATCH] target/i386: fix direction of "32-bit MMU" test
The low bit of MMU indices for x86 TCG indicates whether the processor is
in 32-bit mode and therefore linear addresses have to be masked to 32 bits.
However, the index was computed incorrectly, leading to possible conflicts
in the TLB for any address above 4G.
Analyzed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: b1661801c18 ("target/i386: Fix physical address truncation", 2024-02-28)
Fixes: 1c15f97b4f1 ("target/i386: Fix physical address truncation" in stable-7.2)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2206
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2cc68629a6fc198f4a972698bdd6477f883aedfb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: move changes for x86_cpu_mmu_index() to cpu_mmu_index() due to missing
v8.2.0-1030-gace0c5fe59 "target/i386: Populate CPUClass.mmu_index")
---
target/i386/cpu.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 73eee08f3f..326649ca99 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2201,7 +2201,7 @@ uint64_t cpu_get_tsc(CPUX86State *env);
static inline int cpu_mmu_index(CPUX86State *env, bool ifetch)
{
- int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 1 : 0;
+ int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 0 : 1;
int mmu_index_base =
(env->hflags & HF_CPL_MASK) == 3 ? MMU_USER64_IDX :
!(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
@@ -2228,7 +2228,7 @@ static inline bool is_mmu_index_32(int mmu_index)
static inline int cpu_mmu_index_kernel(CPUX86State *env)
{
- int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 1 : 0;
+ int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 0 : 1;
int mmu_index_base =
!(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX :
((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK)) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX;

View File

@@ -0,0 +1,35 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tao Su <tao1.su@linux.intel.com>
Date: Wed, 10 Apr 2024 08:43:52 +0300
Subject: [PATCH] target/i386: Revert monitor_puts() in do_inject_x86_mce()
monitor_puts() doesn't check the monitor pointer, but do_inject_x86_mce()
may have a parameter with NULL monitor pointer. Revert monitor_puts() in
do_inject_x86_mce() to fix, then the fact that we send the same message to
monitor and log is again more obvious.
Fixes: bf0c50d4aa85 (monitor: expose monitor_puts to rest of code)
Reviwed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Message-ID: <20240320083640.523287-1-tao1.su@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7fd226b04746f0be0b636de5097f1b42338951a0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
target/i386/helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/i386/helper.c b/target/i386/helper.c
index 0ac2da066d..290d9d309c 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -427,7 +427,7 @@ static void do_inject_x86_mce(CPUState *cs, run_on_cpu_data data)
if (need_reset) {
emit_guest_memory_failure(MEMORY_FAILURE_ACTION_RESET, ar,
recursive);
- monitor_puts(params->mon, msg);
+ monitor_printf(params->mon, "%s", msg);
qemu_log_mask(CPU_LOG_RESET, "%s\n", msg);
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
return;

View File

@@ -0,0 +1,86 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 10 Apr 2024 08:43:57 +0300
Subject: [PATCH] tcg/optimize: Fix sign_mask for logical right-shift
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The 'sign' computation is attempting to locate the sign bit that has
been repeated, so that we can test if that bit is known zero. That
computation can be zero if there are no known sign repetitions.
Cc: qemu-stable@nongnu.org
Fixes: 93a967fbb57 ("tcg/optimize: Propagate sign info for shifting")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2248
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 2911e9b95f3bb03783ae5ca3e2494dc3b44a9161)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: trivial context fixup in tests/tcg/aarch64/Makefile.target)
---
tcg/optimize.c | 2 +-
tests/tcg/aarch64/Makefile.target | 1 +
tests/tcg/aarch64/test-2248.c | 28 ++++++++++++++++++++++++++++
3 files changed, 30 insertions(+), 1 deletion(-)
create mode 100644 tests/tcg/aarch64/test-2248.c
diff --git a/tcg/optimize.c b/tcg/optimize.c
index ae081ab29c..b6f6436c74 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1907,7 +1907,7 @@ static bool fold_shift(OptContext *ctx, TCGOp *op)
* will not reduced the number of input sign repetitions.
*/
sign = (s_mask & -s_mask) >> 1;
- if (!(z_mask & sign)) {
+ if (sign && !(z_mask & sign)) {
ctx->s_mask = s_mask;
}
break;
diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target
index 5e4ea7c998..474f61bc30 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -10,6 +10,7 @@ VPATH += $(AARCH64_SRC)
# Base architecture tests
AARCH64_TESTS=fcvt pcalign-a64
+AARCH64_TESTS += test-2248
fcvt: LDFLAGS+=-lm
diff --git a/tests/tcg/aarch64/test-2248.c b/tests/tcg/aarch64/test-2248.c
new file mode 100644
index 0000000000..aac2e17836
--- /dev/null
+++ b/tests/tcg/aarch64/test-2248.c
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* See https://gitlab.com/qemu-project/qemu/-/issues/2248 */
+
+#include <assert.h>
+
+__attribute__((noinline))
+long test(long x, long y, long sh)
+{
+ long r;
+ asm("cmp %1, %2\n\t"
+ "cset x12, lt\n\t"
+ "and w11, w12, #0xff\n\t"
+ "cmp w11, #0\n\t"
+ "csetm x14, ne\n\t"
+ "lsr x13, x14, %3\n\t"
+ "sxtb %0, w13"
+ : "=r"(r)
+ : "r"(x), "r"(y), "r"(sh)
+ : "x11", "x12", "x13", "x14");
+ return r;
+}
+
+int main()
+{
+ long r = test(0, 1, 2);
+ assert(r == -1);
+ return 0;
+}

View File

@@ -0,0 +1,66 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wafer <wafer@jaguarmicro.com>
Date: Wed, 10 Apr 2024 08:44:02 +0300
Subject: [PATCH] hw/virtio: Fix packed virtqueue flush used_idx
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
In the event of writing many chains of descriptors, the device must
write just the id of the last buffer in the descriptor chain, skip
forward the number of descriptors in the chain, and then repeat the
operations for the rest of chains.
Current QEMU code writes all the buffer ids consecutively, and then
skips all the buffers altogether. This is a bug, and can be reproduced
with a VirtIONet device with _F_MRG_RXBUB and without
_F_INDIRECT_DESC:
If a virtio-net device has the VIRTIO_NET_F_MRG_RXBUF feature
but not the VIRTIO_RING_F_INDIRECT_DESC feature,
'VirtIONetQueue->rx_vq' will use the merge feature
to store data in multiple 'elems'.
The 'num_buffers' in the virtio header indicates how many elements are merged.
If the value of 'num_buffers' is greater than 1,
all the merged elements will be filled into the descriptor ring.
The 'idx' of the elements should be the value of 'vq->used_idx' plus 'ndescs'.
Fixes: 86044b24e8 ("virtio: basic packed virtqueue support")
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Wafer <wafer@jaguarmicro.com>
Message-Id: <20240407015451.5228-2-wafer@jaguarmicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2d9a31b3c27311eca1682cb2c076d7a300441960)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
hw/virtio/virtio.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index b7da7f074d..e4f8ed1e63 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1367,12 +1367,20 @@ static void virtqueue_packed_flush(VirtQueue *vq, unsigned int count)
return;
}
+ /*
+ * For indirect element's 'ndescs' is 1.
+ * For all other elemment's 'ndescs' is the
+ * number of descriptors chained by NEXT (as set in virtqueue_packed_pop).
+ * So When the 'elem' be filled into the descriptor ring,
+ * The 'idx' of this 'elem' shall be
+ * the value of 'vq->used_idx' plus the 'ndescs'.
+ */
+ ndescs += vq->used_elems[0].ndescs;
for (i = 1; i < count; i++) {
- virtqueue_packed_fill_desc(vq, &vq->used_elems[i], i, false);
+ virtqueue_packed_fill_desc(vq, &vq->used_elems[i], ndescs, false);
ndescs += vq->used_elems[i].ndescs;
}
virtqueue_packed_fill_desc(vq, &vq->used_elems[0], 0, true);
- ndescs += vq->used_elems[0].ndescs;
vq->inuse -= ndescs;
vq->used_idx += ndescs;

1279
debian/patches/pve-qemu-7.2-vitastor.patch vendored Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -14,10 +14,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index d5fd1dbcd2..bda3e606dc 100644
index b9647c5ffc..9a16d86344 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -508,7 +508,7 @@ static QemuOptsList raw_runtime_opts = {
@@ -552,7 +552,7 @@ static QemuOptsList raw_runtime_opts = {
{
.name = "locking",
.type = QEMU_OPT_STRING,
@@ -26,7 +26,7 @@ index d5fd1dbcd2..bda3e606dc 100644
},
{
.name = "pr-manager",
@@ -606,7 +606,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
@@ -652,7 +652,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
s->use_lock = false;
break;
case ON_OFF_AUTO_AUTO:

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/net/net.h b/include/net/net.h
index 778fc787ca..fb2db6bb75 100644
index 5a7c0e9ebf..59dde996f9 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -210,8 +210,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
@@ -238,8 +238,8 @@ void netdev_add(QemuOpts *opts, Error **errp);
int net_hub_id_for_client(NetClientState *nc, int *id);
NetClientState *net_hub_port_find(int hub_id);

View File

@@ -10,10 +10,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 88e8586f8f..93563ee0c2 100644
index 326649ca99..24d21486bc 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1973,9 +1973,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
@@ -2174,9 +2174,9 @@ uint64_t cpu_get_tsc(CPUX86State *env);
#define CPU_RESOLVING_TYPE TYPE_X86_CPU
#ifdef TARGET_X86_64
@@ -24,4 +24,4 @@ index 88e8586f8f..93563ee0c2 100644
+#define TARGET_DEFAULT_CPU_TYPE X86_CPU_TYPE_NAME("kvm32")
#endif
#define cpu_signal_handler cpu_x86_signal_handler
#define cpu_list x86_cpu_list

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/ui/spice-core.c b/ui/spice-core.c
index eea52f5389..d09ee7f09e 100644
index c3ac20ad43..37774f1c0a 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -667,32 +667,35 @@ static void qemu_spice_init(void)
@@ -689,32 +689,35 @@ static void qemu_spice_init(void)
if (tls_port) {
x509_dir = qemu_opt_get(opts, "x509-dir");

View File

@@ -9,7 +9,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/block/gluster.c b/block/gluster.c
index 4f1448e2bc..715f06394c 100644
index 7c90f7ba4b..2e03102f00 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -42,7 +42,7 @@

View File

@@ -1,24 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexandre Derumier <aderumier@odiso.com>
Date: Mon, 6 Apr 2020 12:16:34 +0200
Subject: [PATCH] PVE: [Config] smm_available = false
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
hw/i386/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 5944fc44ed..31b481b4e9 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1115,7 +1115,7 @@ bool x86_machine_is_smm_enabled(const X86MachineState *x86ms)
if (tcg_enabled() || qtest_enabled()) {
smm_available = true;
} else if (kvm_enabled()) {
- smm_available = kvm_has_smm();
+ smm_available = false;
}
if (smm_available) {

View File

@@ -18,10 +18,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+)
diff --git a/block/rbd.c b/block/rbd.c
index 9bd2bce716..c7195a2342 100644
index f826410f40..64a8d7d48b 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -609,6 +609,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
@@ -820,6 +820,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
rados_conf_set(*cluster, "rbd_cache", "false");
}

View File

@@ -4,17 +4,19 @@ Date: Mon, 6 Apr 2020 12:16:37 +0200
Subject: [PATCH] PVE: [Up] qmp: add get_link_status
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add get_link_status to command name exceptions]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
net/net.c | 27 +++++++++++++++++++++++++++
qapi/net.json | 15 +++++++++++++++
qapi/pragma.json | 1 +
3 files changed, 43 insertions(+)
qapi/pragma.json | 2 ++
3 files changed, 44 insertions(+)
diff --git a/net/net.c b/net/net.c
index 6a2c3d9567..a1e9514fb8 100644
index c3391168f6..f7d984f6f5 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1277,6 +1277,33 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
@@ -1387,6 +1387,33 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
}
}
@@ -49,10 +51,10 @@ index 6a2c3d9567..a1e9514fb8 100644
{
NetClientState *nc;
diff --git a/qapi/net.json b/qapi/net.json
index a3a1336001..b8092c4e20 100644
index 522ac582ed..327d7c5a37 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -35,6 +35,21 @@
@@ -36,6 +36,21 @@
##
{ 'command': 'set_link', 'data': {'name': 'str', 'up': 'bool'} }
@@ -75,14 +77,22 @@ index a3a1336001..b8092c4e20 100644
# @netdev_add:
#
diff --git a/qapi/pragma.json b/qapi/pragma.json
index cffae27666..5a3e3de95f 100644
index 7f810b0e97..29233db825 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -5,6 +5,7 @@
{ 'pragma': {
# Commands allowed to return a non-dictionary:
'returns-whitelist': [
+ 'get_link_status',
@@ -15,6 +15,7 @@
'device_add',
'device_del',
'expire_password',
+ 'get_link_status',
'migrate_cancel',
'netdev_add',
'netdev_del',
@@ -26,6 +27,7 @@
'system_wakeup' ],
# Commands allowed to return a non-dictionary
'command-returns-exceptions': [
+ 'get_link_status',
'human-monitor-command',
'qom-get',
'query-migrate-cache-size',
'query-tpm-models',

View File

@@ -16,7 +16,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/block/gluster.c b/block/gluster.c
index 715f06394c..6d02170d9b 100644
index 2e03102f00..7886c5fe8c 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -57,6 +57,7 @@ typedef struct GlusterAIOCB {
@@ -27,7 +27,7 @@ index 715f06394c..6d02170d9b 100644
} GlusterAIOCB;
typedef struct BDRVGlusterState {
@@ -759,8 +760,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
@@ -752,8 +753,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
acb->ret = 0; /* Success */
} else if (ret < 0) {
acb->ret = -errno; /* Read/Write failed */
@@ -39,15 +39,15 @@ index 715f06394c..6d02170d9b 100644
}
aio_co_schedule(acb->aio_context, acb->coroutine);
@@ -1028,6 +1031,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
@@ -1022,6 +1025,7 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
+ acb.is_write = true;
ret = glfs_zerofill_async(s->fd, offset, size, gluster_finish_aiocb, &acb);
ret = glfs_zerofill_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1209,9 +1213,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
@@ -1203,9 +1207,11 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
acb.aio_context = bdrv_get_aio_context(bs);
if (write) {
@@ -59,7 +59,7 @@ index 715f06394c..6d02170d9b 100644
ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
gluster_finish_aiocb, &acb);
}
@@ -1275,6 +1281,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
@@ -1268,6 +1274,7 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
@@ -67,11 +67,11 @@ index 715f06394c..6d02170d9b 100644
ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
if (ret < 0) {
@@ -1321,6 +1328,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
@@ -1316,6 +1323,7 @@ static coroutine_fn int qemu_gluster_co_pdiscard(BlockDriverState *bs,
acb.ret = 0;
acb.coroutine = qemu_coroutine_self();
acb.aio_context = bdrv_get_aio_context(bs);
+ acb.is_write = true;
ret = glfs_discard_async(s->fd, offset, size, gluster_finish_aiocb, &acb);
ret = glfs_discard_async(s->fd, offset, bytes, gluster_finish_aiocb, &acb);
if (ret < 0) {

View File

@@ -9,10 +9,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/qemu-img.c b/qemu-img.c
index f9050bfaad..7e6666b5f7 100644
index 2c32d9da4e..b9636714f6 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3022,7 +3022,8 @@ static int img_info(int argc, char **argv)
@@ -3013,7 +3013,8 @@ static int img_info(int argc, char **argv)
list = collect_image_info_list(image_opts, filename, fmt, chain,
force_share);
if (!list) {

View File

@@ -31,13 +31,14 @@ override the output file's size.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
qemu-img-cmds.hx | 4 +-
qemu-img.c | 191 +++++++++++++++++++++++++++++------------------
2 files changed, 121 insertions(+), 74 deletions(-)
qemu-img.c | 202 ++++++++++++++++++++++++++++++-----------------
2 files changed, 133 insertions(+), 73 deletions(-)
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index b3620f29e5..e70ef3dc91 100644
index 1b1dab5b17..d1616c045a 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
@@ -53,10 +54,10 @@ index b3620f29e5..e70ef3dc91 100644
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 7e6666b5f7..44cf942bd2 100644
index b9636714f6..0f6a4e4e57 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4897,10 +4897,12 @@ static int img_bitmap(int argc, char **argv)
@@ -4840,10 +4840,12 @@ static int img_bitmap(int argc, char **argv)
#define C_IF 04
#define C_OF 010
#define C_SKIP 020
@@ -69,7 +70,7 @@ index 7e6666b5f7..44cf942bd2 100644
};
struct DdIo {
@@ -4976,6 +4978,19 @@ static int img_dd_skip(const char *arg,
@@ -4919,6 +4921,19 @@ static int img_dd_skip(const char *arg,
return 0;
}
@@ -89,7 +90,7 @@ index 7e6666b5f7..44cf942bd2 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -5016,6 +5031,7 @@ static int img_dd(int argc, char **argv)
@@ -4959,6 +4974,7 @@ static int img_dd(int argc, char **argv)
{ "if", img_dd_if, C_IF },
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
@@ -97,7 +98,7 @@ index 7e6666b5f7..44cf942bd2 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5094,8 +5110,13 @@ static int img_dd(int argc, char **argv)
@@ -5034,91 +5050,112 @@ static int img_dd(int argc, char **argv)
arg = NULL;
}
@@ -105,53 +106,30 @@ index 7e6666b5f7..44cf942bd2 100644
- error_report("Must specify both input and output files");
+ if (!(dd.flags & C_IF) && (!fmt || strcmp(fmt, "raw") != 0)) {
+ error_report("Input format must be raw when readin from stdin");
+ ret = -1;
+ goto out;
+ }
ret = -1;
goto out;
}
-
- blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
- force_share);
-
- if (!blk1) {
+ if (!(dd.flags & C_OF) && strcmp(out_fmt, "raw") != 0) {
+ error_report("Output format must be raw when writing to stdout");
ret = -1;
goto out;
}
@@ -5107,85 +5128,101 @@ static int img_dd(int argc, char **argv)
goto out;
}
- blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
- force_share);
+ if (dd.flags & C_IF) {
+ blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
+ force_share);
- if (!blk1) {
- ret = -1;
- goto out;
+ if (!blk1) {
+ ret = -1;
+ goto out;
+ }
}
- drv = bdrv_find_format(out_fmt);
- if (!drv) {
- error_report("Unknown file format");
+ if (dd.flags & C_OSIZE) {
+ size = dd.osize;
+ } else if (dd.flags & C_IF) {
+ size = blk_getlength(blk1);
+ if (size < 0) {
+ error_report("Failed to get size for '%s'", in.filename);
+ ret = -1;
+ goto out;
+ }
+ } else if (dd.flags & C_COUNT) {
+ size = dd.count * in.bsz;
+ } else {
+ error_report("Output size must be known when reading from stdin");
ret = -1;
goto out;
}
- ret = -1;
- goto out;
- }
- proto_drv = bdrv_find_protocol(out.filename, true, &local_err);
+ if (dd.flags & C_IF) {
+ blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
+ force_share);
- if (!proto_drv) {
- error_report_err(local_err);
@@ -169,14 +147,50 @@ index 7e6666b5f7..44cf942bd2 100644
- proto_drv->format_name);
- ret = -1;
- goto out;
+ if (!(dd.flags & C_OSIZE) && dd.flags & C_COUNT && dd.count <= INT64_MAX / in.bsz &&
+ dd.count * in.bsz < size) {
+ size = dd.count * in.bsz;
+ if (!blk1) {
+ ret = -1;
+ goto out;
+ }
}
- create_opts = qemu_opts_append(create_opts, drv->create_opts);
- create_opts = qemu_opts_append(create_opts, proto_drv->create_opts);
-
- opts = qemu_opts_create(create_opts, NULL, 0, &error_abort);
- size = blk_getlength(blk1);
- if (size < 0) {
- error_report("Failed to get size for '%s'", in.filename);
+ if (dd.flags & C_OSIZE) {
+ size = dd.osize;
+ } else if (dd.flags & C_IF) {
+ size = blk_getlength(blk1);
+ if (size < 0) {
+ error_report("Failed to get size for '%s'", in.filename);
+ ret = -1;
+ goto out;
+ }
+ } else if (dd.flags & C_COUNT) {
+ size = dd.count * in.bsz;
+ } else {
+ error_report("Output size must be known when reading from stdin");
ret = -1;
goto out;
}
- if (dd.flags & C_COUNT && dd.count <= INT64_MAX / in.bsz &&
+ if (!(dd.flags & C_OSIZE) && dd.flags & C_COUNT && dd.count <= INT64_MAX / in.bsz &&
dd.count * in.bsz < size) {
size = dd.count * in.bsz;
}
- /* Overflow means the specified offset is beyond input image's size */
- if (dd.flags & C_SKIP && (in.offset > INT64_MAX / in.bsz ||
- size < in.bsz * in.offset)) {
- qemu_opt_set_number(opts, BLOCK_OPT_SIZE, 0, &error_abort);
- } else {
- qemu_opt_set_number(opts, BLOCK_OPT_SIZE,
- size - in.bsz * in.offset, &error_abort);
- }
+ if (dd.flags & C_OF) {
+ drv = bdrv_find_format(out_fmt);
+ if (!drv) {
@@ -186,9 +200,11 @@ index 7e6666b5f7..44cf942bd2 100644
+ }
+ proto_drv = bdrv_find_protocol(out.filename, true, &local_err);
- size = blk_getlength(blk1);
- if (size < 0) {
- error_report("Failed to get size for '%s'", in.filename);
- ret = bdrv_create(drv, out.filename, opts, &local_err);
- if (ret < 0) {
- error_reportf_err(local_err,
- "%s: error while creating output image: ",
- out.filename);
- ret = -1;
- goto out;
- }
@@ -212,20 +228,18 @@ index 7e6666b5f7..44cf942bd2 100644
+ create_opts = qemu_opts_append(create_opts, drv->create_opts);
+ create_opts = qemu_opts_append(create_opts, proto_drv->create_opts);
- if (dd.flags & C_COUNT && dd.count <= INT64_MAX / in.bsz &&
- dd.count * in.bsz < size) {
- size = dd.count * in.bsz;
- }
- /* TODO, we can't honour --image-opts for the target,
- * since it needs to be given in a format compatible
- * with the bdrv_create() call above which does not
- * support image-opts style.
- */
- blk2 = img_open_file(out.filename, NULL, out_fmt, BDRV_O_RDWR,
- false, false, false);
+ opts = qemu_opts_create(create_opts, NULL, 0, &error_abort);
- /* Overflow means the specified offset is beyond input image's size */
- if (dd.flags & C_SKIP && (in.offset > INT64_MAX / in.bsz ||
- size < in.bsz * in.offset)) {
- qemu_opt_set_number(opts, BLOCK_OPT_SIZE, 0, &error_abort);
- } else {
- qemu_opt_set_number(opts, BLOCK_OPT_SIZE,
- size - in.bsz * in.offset, &error_abort);
- }
- if (!blk2) {
- ret = -1;
- goto out;
+ /* Overflow means the specified offset is beyond input image's size */
+ if (dd.flags & C_OSIZE) {
+ qemu_opt_set_number(opts, BLOCK_OPT_SIZE, size, &error_abort);
@@ -236,15 +250,7 @@ index 7e6666b5f7..44cf942bd2 100644
+ qemu_opt_set_number(opts, BLOCK_OPT_SIZE,
+ size - in.bsz * in.offset, &error_abort);
+ }
- ret = bdrv_create(drv, out.filename, opts, &local_err);
- if (ret < 0) {
- error_reportf_err(local_err,
- "%s: error while creating output image: ",
- out.filename);
- ret = -1;
- goto out;
- }
+
+ ret = bdrv_create(drv, out.filename, opts, &local_err);
+ if (ret < 0) {
+ error_reportf_err(local_err,
@@ -253,14 +259,7 @@ index 7e6666b5f7..44cf942bd2 100644
+ ret = -1;
+ goto out;
+ }
- /* TODO, we can't honour --image-opts for the target,
- * since it needs to be given in a format compatible
- * with the bdrv_create() call above which does not
- * support image-opts style.
- */
- blk2 = img_open_file(out.filename, NULL, out_fmt, BDRV_O_RDWR,
- false, false, false);
+
+ /* TODO, we can't honour --image-opts for the target,
+ * since it needs to be given in a format compatible
+ * with the bdrv_create() call above which does not
@@ -268,10 +267,7 @@ index 7e6666b5f7..44cf942bd2 100644
+ */
+ blk2 = img_open_file(out.filename, NULL, out_fmt, BDRV_O_RDWR,
+ false, false, false);
- if (!blk2) {
- ret = -1;
- goto out;
+
+ if (!blk2) {
+ ret = -1;
+ goto out;
@@ -279,41 +275,54 @@ index 7e6666b5f7..44cf942bd2 100644
}
if (dd.flags & C_SKIP && (in.offset > INT64_MAX / in.bsz ||
@@ -5203,11 +5240,17 @@ static int img_dd(int argc, char **argv)
@@ -5135,20 +5172,43 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
for (out_pos = 0; in_pos < size; block_count++) {
int in_ret, out_ret;
for (out_pos = 0; in_pos < size; ) {
+ int in_ret, out_ret;
int bytes = (in_pos + in.bsz > size) ? size - in_pos : in.bsz;
-
- if (in_pos + in.bsz > size) {
- in_ret = blk_pread(blk1, in_pos, in.buf, size - in_pos);
+ size_t in_bsz = in_pos + in.bsz > size ? size - in_pos : in.bsz;
- ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
- if (ret < 0) {
+ if (blk1) {
+ in_ret = blk_pread(blk1, in_pos, in.buf, in_bsz);
} else {
- in_ret = blk_pread(blk1, in_pos, in.buf, in.bsz);
+ in_ret = read(STDIN_FILENO, in.buf, in_bsz);
+ in_ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
+ if (in_ret == 0) {
+ in_ret = bytes;
+ }
+ } else {
+ in_ret = read(STDIN_FILENO, in.buf, bytes);
+ if (in_ret == 0) {
+ /* early EOF is considered an error */
+ error_report("Input ended unexpectedly");
+ ret = -1;
+ goto out;
+ }
}
if (in_ret < 0) {
+ }
+ if (in_ret < 0) {
error_report("error while reading from input image file: %s",
@@ -5217,9 +5260,13 @@ static int img_dd(int argc, char **argv)
- strerror(-ret));
+ strerror(-in_ret));
+ ret = -1;
goto out;
}
in_pos += in_ret;
in_pos += bytes;
- out_ret = blk_pwrite(blk2, out_pos, in.buf, in_ret, 0);
- ret = blk_pwrite(blk2, out_pos, bytes, in.buf, 0);
- if (ret < 0) {
+ if (blk2) {
+ out_ret = blk_pwrite(blk2, out_pos, in.buf, in_ret, 0);
+ out_ret = blk_pwrite(blk2, out_pos, in_ret, in.buf, 0);
+ if (out_ret == 0) {
+ out_ret = in_ret;
+ }
+ } else {
+ out_ret = write(STDOUT_FILENO, in.buf, in_ret);
+ }
- if (out_ret < 0) {
+
+ if (out_ret != in_ret) {
error_report("error while writing to output image file: %s",
strerror(-out_ret));
ret = -1;
- strerror(-ret));
+ strerror(-out_ret));
+ ret = -1;
goto out;
}
out_pos += bytes;

View File

@@ -10,15 +10,16 @@ an expected end of input.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
qemu-img.c | 28 +++++++++++++++++++++++++---
1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/qemu-img.c b/qemu-img.c
index 44cf942bd2..5ce60e8a45 100644
index 0f6a4e4e57..a3cd66e56c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4898,11 +4898,13 @@ static int img_bitmap(int argc, char **argv)
@@ -4841,11 +4841,13 @@ static int img_bitmap(int argc, char **argv)
#define C_OF 010
#define C_SKIP 020
#define C_OSIZE 040
@@ -32,7 +33,7 @@ index 44cf942bd2..5ce60e8a45 100644
};
struct DdIo {
@@ -4991,6 +4993,19 @@ static int img_dd_osize(const char *arg,
@@ -4934,6 +4936,19 @@ static int img_dd_osize(const char *arg,
return 0;
}
@@ -52,13 +53,13 @@ index 44cf942bd2..5ce60e8a45 100644
static int img_dd(int argc, char **argv)
{
int ret = 0;
@@ -5005,12 +5020,14 @@ static int img_dd(int argc, char **argv)
@@ -4948,12 +4963,14 @@ static int img_dd(int argc, char **argv)
int c, i;
const char *out_fmt = "raw";
const char *fmt = NULL;
- int64_t size = 0;
+ int64_t size = 0, readsize = 0;
int64_t block_count = 0, out_pos, in_pos;
int64_t out_pos, in_pos;
bool force_share = false;
struct DdInfo dd = {
.flags = 0,
@@ -68,7 +69,7 @@ index 44cf942bd2..5ce60e8a45 100644
};
struct DdIo in = {
.bsz = 512, /* Block size is by default 512 bytes */
@@ -5032,6 +5049,7 @@ static int img_dd(int argc, char **argv)
@@ -4975,6 +4992,7 @@ static int img_dd(int argc, char **argv)
{ "of", img_dd_of, C_OF },
{ "skip", img_dd_skip, C_SKIP },
{ "osize", img_dd_osize, C_OSIZE },
@@ -76,20 +77,22 @@ index 44cf942bd2..5ce60e8a45 100644
{ NULL, NULL, 0 }
};
const struct option long_options[] = {
@@ -5238,14 +5256,18 @@ static int img_dd(int argc, char **argv)
@@ -5171,9 +5189,10 @@ static int img_dd(int argc, char **argv)
in.buf = g_new(uint8_t, in.bsz);
- for (out_pos = 0; in_pos < size; block_count++) {
- for (out_pos = 0; in_pos < size; ) {
+ readsize = (dd.isize > 0) ? dd.isize : size;
+ for (out_pos = 0; in_pos < readsize; block_count++) {
+ for (out_pos = 0; in_pos < readsize; ) {
int in_ret, out_ret;
- size_t in_bsz = in_pos + in.bsz > size ? size - in_pos : in.bsz;
+ size_t in_bsz = in_pos + in.bsz > readsize ? readsize - in_pos : in.bsz;
- int bytes = (in_pos + in.bsz > size) ? size - in_pos : in.bsz;
+ int bytes = (in_pos + in.bsz > readsize) ? readsize - in_pos : in.bsz;
if (blk1) {
in_ret = blk_pread(blk1, in_pos, in.buf, in_bsz);
in_ret = blk_pread(blk1, in_pos, bytes, in.buf, 0);
if (in_ret == 0) {
@@ -5182,6 +5201,9 @@ static int img_dd(int argc, char **argv)
} else {
in_ret = read(STDIN_FILENO, in.buf, in_bsz);
in_ret = read(STDIN_FILENO, in.buf, bytes);
if (in_ret == 0) {
+ if (dd.isize == 0) {
+ goto out;

View File

@@ -0,0 +1,121 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexandre Derumier <aderumier@odiso.com>
Date: Mon, 6 Apr 2020 12:16:42 +0200
Subject: [PATCH] PVE: [Up] qemu-img dd: add -n skip_create
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: fix getopt-string + add documentation]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
docs/tools/qemu-img.rst | 11 ++++++++++-
qemu-img-cmds.hx | 4 ++--
qemu-img.c | 23 ++++++++++++++---------
3 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 15aeddc6d8..5e713e231d 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -208,6 +208,10 @@ Parameters to convert subcommand:
Parameters to dd subcommand:
+.. option:: -n
+
+ Skip the creation of the target volume
+
.. program:: qemu-img-dd
.. option:: bs=BLOCK_SIZE
@@ -488,7 +492,7 @@ Command description:
it doesn't need to be specified separately in this case.
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
dd copies from *INPUT* file to *OUTPUT* file converting it from
*FMT* format to *OUTPUT_FMT* format.
@@ -499,6 +503,11 @@ Command description:
The size syntax is similar to :manpage:`dd(1)`'s size syntax.
+ If the ``-n`` option is specified, the target volume creation will be
+ skipped. This is useful for formats such as ``rbd`` if the target
+ volume has already been created with site specific options that cannot
+ be supplied through ``qemu-img``.
+
.. option:: info [--object OBJECTDEF] [--image-opts] [-f FMT] [--output=OFMT] [--backing-chain] [-U] FILENAME
Give information about the disk image *FILENAME*. Use it in
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index d1616c045a..b5b0bb4467 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
ERST
DEF("dd", img_dd,
- "dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [bs=block_size] [count=blocks] [skip=blocks] [osize=output_size] if=input of=output")
+ "dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [bs=block_size] [count=blocks] [skip=blocks] [osize=output_size] if=input of=output")
SRST
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] if=INPUT of=OUTPUT
ERST
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index a3cd66e56c..4f5ef5b887 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4965,7 +4965,7 @@ static int img_dd(int argc, char **argv)
const char *fmt = NULL;
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
- bool force_share = false;
+ bool force_share = false, skip_create = false;
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5003,7 +5003,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
- while ((c = getopt_long(argc, argv, ":hf:O:U", long_options, NULL))) {
+ while ((c = getopt_long(argc, argv, ":hf:O:Un", long_options, NULL))) {
if (c == EOF) {
break;
}
@@ -5023,6 +5023,9 @@ static int img_dd(int argc, char **argv)
case 'h':
help();
break;
+ case 'n':
+ skip_create = true;
+ break;
case 'U':
force_share = true;
break;
@@ -5153,13 +5156,15 @@ static int img_dd(int argc, char **argv)
size - in.bsz * in.offset, &error_abort);
}
- ret = bdrv_create(drv, out.filename, opts, &local_err);
- if (ret < 0) {
- error_reportf_err(local_err,
- "%s: error while creating output image: ",
- out.filename);
- ret = -1;
- goto out;
+ if (!skip_create) {
+ ret = bdrv_create(drv, out.filename, opts, &local_err);
+ if (ret < 0) {
+ error_reportf_err(local_err,
+ "%s: error while creating output image: ",
+ out.filename);
+ ret = -1;
+ goto out;
+ }
}
/* TODO, we can't honour --image-opts for the target,

View File

@@ -1,65 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Alexandre Derumier <aderumier@odiso.com>
Date: Mon, 6 Apr 2020 12:16:42 +0200
Subject: [PATCH] PVE: [Up] qemu-img dd: add -n skip_create
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
qemu-img.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/qemu-img.c b/qemu-img.c
index 5ce60e8a45..86bfd0288b 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -5022,7 +5022,7 @@ static int img_dd(int argc, char **argv)
const char *fmt = NULL;
int64_t size = 0, readsize = 0;
int64_t block_count = 0, out_pos, in_pos;
- bool force_share = false;
+ bool force_share = false, skip_create = false;
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5060,7 +5060,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
- while ((c = getopt_long(argc, argv, ":hf:O:U", long_options, NULL))) {
+ while ((c = getopt_long(argc, argv, ":hf:O:U:n", long_options, NULL))) {
if (c == EOF) {
break;
}
@@ -5080,6 +5080,9 @@ static int img_dd(int argc, char **argv)
case 'h':
help();
break;
+ case 'n':
+ skip_create = true;
+ break;
case 'U':
force_share = true;
break;
@@ -5220,13 +5223,15 @@ static int img_dd(int argc, char **argv)
size - in.bsz * in.offset, &error_abort);
}
- ret = bdrv_create(drv, out.filename, opts, &local_err);
- if (ret < 0) {
- error_reportf_err(local_err,
- "%s: error while creating output image: ",
- out.filename);
- ret = -1;
- goto out;
+ if (!skip_create) {
+ ret = bdrv_create(drv, out.filename, opts, &local_err);
+ if (ret < 0) {
+ error_reportf_err(local_err,
+ "%s: error while creating output image: ",
+ out.filename);
+ ret = -1;
+ goto out;
+ }
}
/* TODO, we can't honour --image-opts for the target,

View File

@@ -7,17 +7,20 @@ Actually provide memory information via the query-balloon
command.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add BalloonInfo to member name exceptions list]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/virtio/virtio-balloon.c | 33 +++++++++++++++++++++++++++++++--
monitor/hmp-cmds.c | 30 +++++++++++++++++++++++++++++-
qapi/machine.json | 22 +++++++++++++++++++++-
3 files changed, 81 insertions(+), 4 deletions(-)
qapi/pragma.json | 1 +
4 files changed, 82 insertions(+), 4 deletions(-)
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index b22b5beda3..6e581439bf 100644
index e4c4c2d3c8..49874569b1 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -805,8 +805,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
@@ -806,8 +806,37 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
{
VirtIOBalloon *dev = opaque;
@@ -58,10 +61,10 @@ index b22b5beda3..6e581439bf 100644
static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 65d8ff4849..705f08a8f1 100644
index 01b789a79e..480b798963 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -695,7 +695,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
@@ -696,7 +696,35 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict)
return;
}
@@ -99,10 +102,10 @@ index 65d8ff4849..705f08a8f1 100644
qapi_free_BalloonInfo(info);
}
diff --git a/qapi/machine.json b/qapi/machine.json
index 7c9a263778..3e59199280 100644
index b9228a5e46..10e77a9af3 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1205,10 +1205,30 @@
@@ -1054,9 +1054,29 @@
# @actual: the logical size of the VM in bytes
# Formula used: logical_vm_size = vm_ram_size - balloon_size
#
@@ -122,8 +125,7 @@ index 7c9a263778..3e59199280 100644
+#
+# @max_mem: amount of memory (in bytes) assigned to the guest
+#
# Since: 0.14.0
#
# Since: 0.14
##
-{ 'struct': 'BalloonInfo', 'data': {'actual': 'int' } }
+{ 'struct': 'BalloonInfo',
@@ -134,3 +136,15 @@ index 7c9a263778..3e59199280 100644
##
# @query-balloon:
diff --git a/qapi/pragma.json b/qapi/pragma.json
index 29233db825..f2097b9020 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -37,6 +37,7 @@
'member-name-exceptions': [ # visible in:
'ACPISlotType', # query-acpi-ospm-status
'AcpiTableOptions', # -acpitable
+ 'BalloonInfo', # query-balloon
'BlkdebugEvent', # blockdev-add, -blockdev
'BlkdebugSetStateOptions', # blockdev-add, -blockdev
'BlockDeviceInfo', # query-block

View File

@@ -13,10 +13,10 @@ Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 5362c80a18..3fcb82ce2f 100644
index 4f4ab30f8c..76fff60a6b 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -234,6 +234,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
@@ -99,6 +99,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
info->hotpluggable_cpus = mc->has_hotpluggable_cpus;
info->numa_mem_supported = mc->numa_mem_supported;
info->deprecated = !!mc->deprecation_reason;
@@ -30,19 +30,19 @@ index 5362c80a18..3fcb82ce2f 100644
info->default_cpu_type = g_strdup(mc->default_cpu_type);
info->has_default_cpu_type = true;
diff --git a/qapi/machine.json b/qapi/machine.json
index 3e59199280..dfc1a49d3c 100644
index 10e77a9af3..9156103c8f 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -318,6 +318,8 @@
@@ -138,6 +138,8 @@
#
# @is-default: whether the machine is default
#
+# @is-current: whether this machine is currently used
+#
# @cpu-max: maximum number of CPUs supported by the machine type
# (since 1.5.0)
# (since 1.5)
#
@@ -339,7 +341,7 @@
@@ -159,7 +161,7 @@
##
{ 'struct': 'MachineInfo',
'data': { 'name': 'str', '*alias': 'str',

View File

@@ -12,29 +12,29 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 8 insertions(+)
diff --git a/qapi/ui.json b/qapi/ui.json
index 6c7b33cb72..39ff301d1e 100644
index 0abba3e930..bf8f441227 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -215,11 +215,14 @@
@@ -310,11 +310,14 @@
#
# @channels: a list of @SpiceChannel for each active spice channel
#
+# @ticket: The last ticket set with set_password
+#
# Since: 0.14.0
# Since: 0.14
##
{ 'struct': 'SpiceInfo',
'data': {'enabled': 'bool', 'migrated': 'bool', '*host': 'str', '*port': 'int',
'*tls-port': 'int', '*auth': 'str', '*compiled-version': 'str',
+ '*ticket': 'str',
'mouse-mode': 'SpiceQueryMouseMode', '*channels': ['SpiceChannel']},
'if': 'defined(CONFIG_SPICE)' }
'if': 'CONFIG_SPICE' }
diff --git a/ui/spice-core.c b/ui/spice-core.c
index d09ee7f09e..da3d2644d1 100644
index 37774f1c0a..367f77f2b4 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -538,6 +538,11 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
@@ -534,6 +534,11 @@ static SpiceInfo *qmp_query_spice_real(Error **errp)
micro = SPICE_SERVER_VERSION & 0xff;
info->compiled_version = g_strdup_printf("%d.%d.%d", major, minor, micro);

View File

@@ -0,0 +1,281 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 13 Oct 2022 11:33:50 +0200
Subject: [PATCH] PVE: add IOChannel implementation for savevm-async
based on migration/channel-block.c and the implementation that was
present in migration/savevm-async.c before QEMU 7.1.
Passes along read/write requests to the given BlockBackend, while
ensuring that a read request going beyond the end results in a
graceful short read.
Additionally, allows tracking the current position from the outside
(intended to be used for progress tracking).
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/channel-savevm-async.c | 182 +++++++++++++++++++++++++++++++
migration/channel-savevm-async.h | 51 +++++++++
migration/meson.build | 1 +
3 files changed, 234 insertions(+)
create mode 100644 migration/channel-savevm-async.c
create mode 100644 migration/channel-savevm-async.h
diff --git a/migration/channel-savevm-async.c b/migration/channel-savevm-async.c
new file mode 100644
index 0000000000..06d5484778
--- /dev/null
+++ b/migration/channel-savevm-async.c
@@ -0,0 +1,182 @@
+/*
+ * QIO Channel implementation to be used by savevm-async QMP calls
+ */
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "qapi/error.h"
+#include "sysemu/block-backend.h"
+#include "trace.h"
+
+QIOChannelSavevmAsync *
+qio_channel_savevm_async_new(BlockBackend *be, size_t *bs_pos)
+{
+ QIOChannelSavevmAsync *ioc;
+
+ ioc = QIO_CHANNEL_SAVEVM_ASYNC(object_new(TYPE_QIO_CHANNEL_SAVEVM_ASYNC));
+
+ bdrv_ref(blk_bs(be));
+ ioc->be = be;
+ ioc->bs_pos = bs_pos;
+
+ return ioc;
+}
+
+
+static void
+qio_channel_savevm_async_finalize(Object *obj)
+{
+ QIOChannelSavevmAsync *ioc = QIO_CHANNEL_SAVEVM_ASYNC(obj);
+
+ if (ioc->be) {
+ bdrv_unref(blk_bs(ioc->be));
+ ioc->be = NULL;
+ }
+ ioc->bs_pos = NULL;
+}
+
+
+static ssize_t
+qio_channel_savevm_async_readv(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ int **fds,
+ size_t *nfds,
+ Error **errp)
+{
+ QIOChannelSavevmAsync *saioc = QIO_CHANNEL_SAVEVM_ASYNC(ioc);
+ BlockBackend *be = saioc->be;
+ int64_t maxlen = blk_getlength(be);
+ QEMUIOVector qiov;
+ size_t size;
+ int ret;
+
+ qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
+
+ if (*saioc->bs_pos >= maxlen) {
+ error_setg(errp, "cannot read beyond maxlen");
+ return -1;
+ }
+
+ if (maxlen - *saioc->bs_pos < qiov.size) {
+ size = maxlen - *saioc->bs_pos;
+ } else {
+ size = qiov.size;
+ }
+
+ // returns 0 on success
+ ret = blk_preadv(be, *saioc->bs_pos, size, &qiov, 0);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "blk_preadv failed");
+ return -1;
+ }
+
+ *saioc->bs_pos += size;
+ return size;
+}
+
+
+static ssize_t
+qio_channel_savevm_async_writev(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ int *fds,
+ size_t nfds,
+ int flags,
+ Error **errp)
+{
+ QIOChannelSavevmAsync *saioc = QIO_CHANNEL_SAVEVM_ASYNC(ioc);
+ BlockBackend *be = saioc->be;
+ QEMUIOVector qiov;
+ int ret;
+
+ qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
+
+ if (qemu_in_coroutine()) {
+ ret = blk_co_pwritev(be, *saioc->bs_pos, qiov.size, &qiov, 0);
+ aio_wait_kick();
+ } else {
+ ret = blk_pwritev(be, *saioc->bs_pos, qiov.size, &qiov, 0);
+ }
+
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "blk(_co)_pwritev failed");
+ return -1;
+ }
+
+ *saioc->bs_pos += qiov.size;
+ return qiov.size;
+}
+
+
+static int
+qio_channel_savevm_async_set_blocking(QIOChannel *ioc,
+ bool enabled,
+ Error **errp)
+{
+ if (!enabled) {
+ error_setg(errp, "Non-blocking mode not supported for savevm-async");
+ return -1;
+ }
+ return 0;
+}
+
+
+static int
+qio_channel_savevm_async_close(QIOChannel *ioc,
+ Error **errp)
+{
+ QIOChannelSavevmAsync *saioc = QIO_CHANNEL_SAVEVM_ASYNC(ioc);
+ int rv = bdrv_flush(blk_bs(saioc->be));
+
+ if (rv < 0) {
+ error_setg_errno(errp, -rv, "Unable to flush VMState");
+ return -1;
+ }
+
+ bdrv_unref(blk_bs(saioc->be));
+ saioc->be = NULL;
+ saioc->bs_pos = NULL;
+
+ return 0;
+}
+
+
+static void
+qio_channel_savevm_async_set_aio_fd_handler(QIOChannel *ioc,
+ AioContext *ctx,
+ IOHandler *io_read,
+ IOHandler *io_write,
+ void *opaque)
+{
+ // if channel-block starts doing something, check if this needs adaptation
+}
+
+
+static void
+qio_channel_savevm_async_class_init(ObjectClass *klass,
+ void *class_data G_GNUC_UNUSED)
+{
+ QIOChannelClass *ioc_klass = QIO_CHANNEL_CLASS(klass);
+
+ ioc_klass->io_writev = qio_channel_savevm_async_writev;
+ ioc_klass->io_readv = qio_channel_savevm_async_readv;
+ ioc_klass->io_set_blocking = qio_channel_savevm_async_set_blocking;
+ ioc_klass->io_close = qio_channel_savevm_async_close;
+ ioc_klass->io_set_aio_fd_handler = qio_channel_savevm_async_set_aio_fd_handler;
+}
+
+static const TypeInfo qio_channel_savevm_async_info = {
+ .parent = TYPE_QIO_CHANNEL,
+ .name = TYPE_QIO_CHANNEL_SAVEVM_ASYNC,
+ .instance_size = sizeof(QIOChannelSavevmAsync),
+ .instance_finalize = qio_channel_savevm_async_finalize,
+ .class_init = qio_channel_savevm_async_class_init,
+};
+
+static void
+qio_channel_savevm_async_register_types(void)
+{
+ type_register_static(&qio_channel_savevm_async_info);
+}
+
+type_init(qio_channel_savevm_async_register_types);
diff --git a/migration/channel-savevm-async.h b/migration/channel-savevm-async.h
new file mode 100644
index 0000000000..17ae2cb261
--- /dev/null
+++ b/migration/channel-savevm-async.h
@@ -0,0 +1,51 @@
+/*
+ * QEMU I/O channels driver for savevm-async.c
+ *
+ * Copyright (c) 2022 Proxmox Server Solutions
+ *
+ * Authors:
+ * Fiona Ebner (f.ebner@proxmox.com)
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QIO_CHANNEL_SAVEVM_ASYNC_H
+#define QIO_CHANNEL_SAVEVM_ASYNC_H
+
+#include "io/channel.h"
+#include "qom/object.h"
+
+#define TYPE_QIO_CHANNEL_SAVEVM_ASYNC "qio-channel-savevm-async"
+OBJECT_DECLARE_SIMPLE_TYPE(QIOChannelSavevmAsync, QIO_CHANNEL_SAVEVM_ASYNC)
+
+
+/**
+ * QIOChannelSavevmAsync:
+ *
+ * The QIOChannelBlock object provides a channel implementation that is able to
+ * perform I/O on any BlockBackend whose BlockDriverState directly contains a
+ * VMState (as opposed to indirectly, like qcow2). It allows tracking the
+ * current position from the outside.
+ */
+struct QIOChannelSavevmAsync {
+ QIOChannel parent;
+ BlockBackend *be;
+ size_t *bs_pos;
+};
+
+
+/**
+ * qio_channel_savevm_async_new:
+ * @be: the block backend
+ * @bs_pos: used to keep track of the IOChannels current position
+ *
+ * Create a new IO channel object that can perform I/O on a BlockBackend object
+ * whose BlockDriverState directly contains a VMState.
+ *
+ * Returns: the new channel object
+ */
+QIOChannelSavevmAsync *
+qio_channel_savevm_async_new(BlockBackend *be, size_t *bs_pos);
+
+#endif /* QIO_CHANNEL_SAVEVM_ASYNC_H */
diff --git a/migration/meson.build b/migration/meson.build
index 690487cf1a..8cac83c06c 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -13,6 +13,7 @@ softmmu_ss.add(files(
'block-dirty-bitmap.c',
'channel.c',
'channel-block.c',
+ 'channel-savevm-async.c',
'colo-failover.c',
'colo.c',
'exec.c',

View File

@@ -23,27 +23,31 @@ Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[improve aborting]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[FE: further improve aborting
adapt to removal of QEMUFileOps
improve condition for entering final stage]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hmp-commands-info.hx | 13 +
hmp-commands.hx | 33 ++
include/migration/snapshot.h | 1 +
hmp-commands.hx | 33 +++
include/migration/snapshot.h | 2 +
include/monitor/hmp.h | 5 +
migration/meson.build | 1 +
migration/savevm-async.c | 598 +++++++++++++++++++++++++++++++++++
migration/savevm-async.c | 538 +++++++++++++++++++++++++++++++++++
monitor/hmp-cmds.c | 57 ++++
qapi/migration.json | 34 ++
qapi/misc.json | 32 ++
qapi/migration.json | 34 +++
qapi/misc.json | 32 +++
qemu-options.hx | 12 +
softmmu/vl.c | 10 +
11 files changed, 796 insertions(+)
11 files changed, 737 insertions(+)
create mode 100644 migration/savevm-async.c
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 117ba25f91..b3b797ca28 100644
index 754b1e8408..489c524e9e 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -580,6 +580,19 @@ SRST
Show current migration xbzrle cache size.
@@ -540,6 +540,19 @@ SRST
Show current migration parameters.
ERST
+ {
@@ -63,13 +67,13 @@ index 117ba25f91..b3b797ca28 100644
.name = "balloon",
.args_type = "",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index ff2d7aa8f3..d294c234a5 100644
index 673e39a697..039be0033d 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1866,3 +1866,36 @@ ERST
.flags = "p",
},
@@ -1815,3 +1815,36 @@ SRST
Dump the FDT in dtb format to *filename*.
ERST
#endif
+
+ {
+ .name = "savevm-start",
@@ -104,21 +108,21 @@ index ff2d7aa8f3..d294c234a5 100644
+ .coroutine = true,
+ },
diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
index c85b6ec75b..4411b7121d 100644
index e72083b117..c846d37806 100644
--- a/include/migration/snapshot.h
+++ b/include/migration/snapshot.h
@@ -17,5 +17,6 @@
@@ -61,4 +61,6 @@ bool delete_snapshot(const char *name,
bool has_devices, strList *devices,
Error **errp);
int save_snapshot(const char *name, Error **errp);
int load_snapshot(const char *name, Error **errp);
+int load_snapshot_from_blockdev(const char *filename, Error **errp);
+
#endif
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index ed2913fd18..4e06f89e8e 100644
index dfbc0c9a2f..440f86aba8 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -25,6 +25,7 @@ void hmp_info_status(Monitor *mon, const QDict *qdict);
@@ -27,6 +27,7 @@ void hmp_info_status(Monitor *mon, const QDict *qdict);
void hmp_info_uuid(Monitor *mon, const QDict *qdict);
void hmp_info_chardev(Monitor *mon, const QDict *qdict);
void hmp_info_mice(Monitor *mon, const QDict *qdict);
@@ -126,7 +130,7 @@ index ed2913fd18..4e06f89e8e 100644
void hmp_info_migrate(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict);
@@ -83,6 +84,10 @@ void hmp_netdev_add(Monitor *mon, const QDict *qdict);
@@ -81,6 +82,10 @@ void hmp_netdev_add(Monitor *mon, const QDict *qdict);
void hmp_netdev_del(Monitor *mon, const QDict *qdict);
void hmp_getfd(Monitor *mon, const QDict *qdict);
void hmp_closefd(Monitor *mon, const QDict *qdict);
@@ -135,27 +139,28 @@ index ed2913fd18..4e06f89e8e 100644
+void hmp_delete_drive_snapshot(Monitor *mon, const QDict *qdict);
+void hmp_savevm_end(Monitor *mon, const QDict *qdict);
void hmp_sendkey(Monitor *mon, const QDict *qdict);
void hmp_screendump(Monitor *mon, const QDict *qdict);
void coroutine_fn hmp_screendump(Monitor *mon, const QDict *qdict);
void hmp_chardev_add(Monitor *mon, const QDict *qdict);
diff --git a/migration/meson.build b/migration/meson.build
index 980e37865c..e62b79b60f 100644
index 8cac83c06c..0842d00cd2 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -23,6 +23,7 @@ softmmu_ss.add(files(
@@ -24,6 +24,7 @@ softmmu_ss.add(files(
'multifd-zlib.c',
'postcopy-ram.c',
'savevm.c',
+ 'savevm-async.c',
'socket.c',
'tls.c',
))
), gnutls)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
index 0000000000..593a619088
index 0000000000..dc30558713
--- /dev/null
+++ b/migration/savevm-async.c
@@ -0,0 +1,598 @@
@@ -0,0 +1,538 @@
+#include "qemu/osdep.h"
+#include "migration/channel-savevm-async.h"
+#include "migration/migration.h"
+#include "migration/savevm.h"
+#include "migration/snapshot.h"
@@ -179,9 +184,6 @@ index 0000000000..593a619088
+
+/* #define DEBUG_SAVEVM_STATE */
+
+/* used while emulated sync operation in progress */
+#define NOT_DONE -EINPROGRESS
+
+#ifdef DEBUG_SAVEVM_STATE
+#define DPRINTF(fmt, ...) \
+ do { printf("savevm-async: " fmt, ## __VA_ARGS__); } while (0)
@@ -210,7 +212,7 @@ index 0000000000..593a619088
+ int64_t total_time;
+ QEMUBH *finalize_bh;
+ Coroutine *co;
+ QemuCoSleepState *target_close_wait;
+ QemuCoSleep target_close_wait;
+} snap_state;
+
+static bool savevm_aborted(void)
@@ -268,6 +270,7 @@ index 0000000000..593a619088
+
+ if (snap_state.file) {
+ ret = qemu_fclose(snap_state.file);
+ snap_state.file = NULL;
+ }
+
+ if (snap_state.target) {
@@ -285,9 +288,7 @@ index 0000000000..593a619088
+ blk_unref(snap_state.target);
+ snap_state.target = NULL;
+
+ if (snap_state.target_close_wait) {
+ qemu_co_sleep_wake(snap_state.target_close_wait);
+ }
+ qemu_co_sleep_wake(&snap_state.target_close_wait);
+ }
+
+ return ret;
@@ -313,60 +314,6 @@ index 0000000000..593a619088
+ snap_state.state = SAVE_STATE_ERROR;
+}
+
+static int block_state_close(void *opaque, Error **errp)
+{
+ snap_state.file = NULL;
+ return blk_flush(snap_state.target);
+}
+
+typedef struct BlkRwCo {
+ int64_t offset;
+ QEMUIOVector *qiov;
+ ssize_t ret;
+} BlkRwCo;
+
+static void coroutine_fn block_state_write_entry(void *opaque) {
+ BlkRwCo *rwco = opaque;
+ rwco->ret = blk_co_pwritev(snap_state.target, rwco->offset, rwco->qiov->size,
+ rwco->qiov, 0);
+ aio_wait_kick();
+}
+
+static ssize_t block_state_writev_buffer(void *opaque, struct iovec *iov,
+ int iovcnt, int64_t pos, Error **errp)
+{
+ QEMUIOVector qiov;
+ BlkRwCo rwco;
+
+ assert(pos == snap_state.bs_pos);
+ rwco = (BlkRwCo) {
+ .offset = pos,
+ .qiov = &qiov,
+ .ret = NOT_DONE,
+ };
+
+ qemu_iovec_init_external(&qiov, iov, iovcnt);
+
+ if (qemu_in_coroutine()) {
+ block_state_write_entry(&rwco);
+ } else {
+ Coroutine *co = qemu_coroutine_create(&block_state_write_entry, &rwco);
+ bdrv_coroutine_enter(blk_bs(snap_state.target), co);
+ BDRV_POLL_WHILE(blk_bs(snap_state.target), rwco.ret == NOT_DONE);
+ }
+ if (rwco.ret < 0) {
+ return rwco.ret;
+ }
+
+ snap_state.bs_pos += qiov.size;
+ return qiov.size;
+}
+
+static const QEMUFileOps block_file_ops = {
+ .writev_buffer = block_state_writev_buffer,
+ .close = block_state_close,
+};
+
+static void process_savevm_finalize(void *opaque)
+{
+ int ret;
@@ -401,7 +348,7 @@ index 0000000000..593a619088
+ (void)qemu_savevm_state_complete_precopy(snap_state.file, false, false);
+ ret = qemu_file_get_error(snap_state.file);
+ if (ret < 0) {
+ save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
+ save_snapshot_error("qemu_savevm_state_complete_precopy error %d", ret);
+ }
+ }
+
@@ -422,8 +369,11 @@ index 0000000000..593a619088
+ } else if (snap_state.state == SAVE_STATE_ACTIVE) {
+ snap_state.state = SAVE_STATE_COMPLETED;
+ } else if (aborted) {
+ save_snapshot_error("process_savevm_cleanup: found aborted state: %d",
+ snap_state.state);
+ /*
+ * If there was an error, there's no need to set a new one here.
+ * If the snapshot was canceled, leave setting the state to
+ * qmp_savevm_end(), which is waked by save_snapshot_cleanup().
+ */
+ } else {
+ save_snapshot_error("process_savevm_cleanup: invalid state: %d",
+ snap_state.state);
@@ -464,9 +414,16 @@ index 0000000000..593a619088
+
+ pending_size = pend_precopy + pend_compatible + pend_postcopy;
+
+ maxlen = blk_getlength(snap_state.target) - 30*1024*1024;
+ /*
+ * A guest reaching this cutoff is dirtying lots of RAM. It should be
+ * large enough so that the guest can't dirty this much between the
+ * check and the guest actually being stopped, but it should be small
+ * enough to avoid long downtimes for non-hibernation snapshots.
+ */
+ maxlen = blk_getlength(snap_state.target) - 100*1024*1024;
+
+ if (pending_size > 400000 && snap_state.bs_pos + pending_size < maxlen) {
+ /* Note that there is no progress for pend_postcopy when iterating */
+ if (pending_size - pend_postcopy > 400000 && snap_state.bs_pos + pending_size < maxlen) {
+ ret = qemu_savevm_state_iterate(snap_state.file, false);
+ if (ret < 0) {
+ save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
@@ -549,6 +506,7 @@ index 0000000000..593a619088
+ snap_state.bs_pos = 0;
+ snap_state.total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+ snap_state.blocker = NULL;
+ snap_state.target_close_wait = (QemuCoSleep){ .to_wake = NULL };
+
+ if (snap_state.error) {
+ error_free(snap_state.error);
@@ -575,7 +533,9 @@ index 0000000000..593a619088
+ goto restart;
+ }
+
+ snap_state.file = qemu_fopen_ops(&snap_state, &block_file_ops);
+ QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
+ &snap_state.bs_pos));
+ snap_state.file = qemu_file_new_output(ioc);
+
+ if (!snap_state.file) {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
@@ -653,9 +613,8 @@ index 0000000000..593a619088
+ * call exits the statefile will be closed and can be removed immediately */
+ DPRINTF("savevm-end: waiting for cleanup\n");
+ timeout = 30L * 1000 * 1000 * 1000;
+ qemu_co_sleep_ns_wakeable(QEMU_CLOCK_REALTIME, timeout,
+ &snap_state.target_close_wait);
+ snap_state.target_close_wait = NULL;
+ qemu_co_sleep_ns_wakeable(&snap_state.target_close_wait,
+ QEMU_CLOCK_REALTIME, timeout);
+ if (snap_state.target) {
+ save_snapshot_error("timeout waiting for target file close in "
+ "qmp_savevm_end");
@@ -664,6 +623,11 @@ index 0000000000..593a619088
+ return;
+ }
+
+ // File closed and no other error, so ensure next snapshot can be started.
+ if (snap_state.state != SAVE_STATE_ERROR) {
+ snap_state.state = SAVE_STATE_DONE;
+ }
+
+ DPRINTF("savevm-end: cleanup done\n");
+}
+
@@ -683,27 +647,6 @@ index 0000000000..593a619088
+ true, name, errp);
+}
+
+static ssize_t loadstate_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
+ size_t size, Error **errp)
+{
+ BlockBackend *be = opaque;
+ int64_t maxlen = blk_getlength(be);
+ if (pos > maxlen) {
+ return -EIO;
+ }
+ if ((pos + size) > maxlen) {
+ size = maxlen - pos - 1;
+ }
+ if (size == 0) {
+ return 0;
+ }
+ return blk_pread(be, pos, buf, size);
+}
+
+static const QEMUFileOps loadstate_file_ops = {
+ .get_buffer = loadstate_get_buffer,
+};
+
+int load_snapshot_from_blockdev(const char *filename, Error **errp)
+{
+ BlockBackend *be;
@@ -711,6 +654,7 @@ index 0000000000..593a619088
+ Error *blocker = NULL;
+
+ QEMUFile *f;
+ size_t bs_pos = 0;
+ int ret = -EINVAL;
+
+ be = blk_new_open(filename, NULL, NULL, 0, &local_err);
@@ -724,7 +668,7 @@ index 0000000000..593a619088
+ blk_op_block_all(be, blocker);
+
+ /* restore the VM state */
+ f = qemu_fopen_ops(be, &loadstate_file_ops);
+ f = qemu_file_new_input(QIO_CHANNEL(qio_channel_savevm_async_new(be, &bs_pos)));
+ if (!f) {
+ error_setg(errp, "Could not open VM state file");
+ goto the_end;
@@ -754,10 +698,10 @@ index 0000000000..593a619088
+ return ret;
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 705f08a8f1..77ab152aab 100644
index 480b798963..cfebfd1db5 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1949,6 +1949,63 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
@@ -1906,6 +1906,63 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
hmp_handle_error(mon, err);
}
@@ -822,10 +766,10 @@ index 705f08a8f1..77ab152aab 100644
{
IOThreadInfoList *info_list = qmp_query_iothreads(NULL);
diff --git a/qapi/migration.json b/qapi/migration.json
index 3c75820527..cb3627884c 100644
index 88ecf86ac8..4435866379 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -242,6 +242,40 @@
@@ -261,6 +261,40 @@
'*compression': 'CompressionStats',
'*socket-address': ['SocketAddress'] } }
@@ -867,10 +811,10 @@ index 3c75820527..cb3627884c 100644
# @query-migrate:
#
diff --git a/qapi/misc.json b/qapi/misc.json
index 40df513856..4f5333d960 100644
index 27ef5a2b20..b3ce75dcae 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -476,6 +476,38 @@
@@ -435,6 +435,38 @@
##
{ 'command': 'query-fdsets', 'returns': ['FdsetInfo'] }
@@ -910,10 +854,10 @@ index 40df513856..4f5333d960 100644
# @CommandLineParameterType:
#
diff --git a/qemu-options.hx b/qemu-options.hx
index 104632ea34..c1352312c2 100644
index 7f798ce47e..9e3de34143 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3903,6 +3903,18 @@ SRST
@@ -4423,6 +4423,18 @@ SRST
Start right away with a saved state (``loadvm`` in monitor)
ERST
@@ -933,31 +877,21 @@ index 104632ea34..c1352312c2 100644
DEF("daemonize", 0, QEMU_OPTION_daemonize, \
"-daemonize daemonize QEMU after initializing\n", QEMU_ARCH_ALL)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index e6e0ad5a92..03152c816c 100644
index 7aa3eb5cf9..c94fe3d778 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2878,6 +2878,7 @@ void qemu_init(int argc, char **argv, char **envp)
int optind;
const char *optarg;
const char *loadvm = NULL;
+ const char *loadstate = NULL;
MachineClass *machine_class;
const char *cpu_option;
const char *vga_model = NULL;
@@ -3439,6 +3440,9 @@ void qemu_init(int argc, char **argv, char **envp)
case QEMU_OPTION_loadvm:
loadvm = optarg;
break;
+ case QEMU_OPTION_loadstate:
+ loadstate = optarg;
+ break;
case QEMU_OPTION_full_screen:
dpy.has_full_screen = true;
dpy.full_screen = true;
@@ -4478,6 +4482,12 @@ void qemu_init(int argc, char **argv, char **envp)
autostart = 0;
exit(1);
}
@@ -164,6 +164,7 @@ static const char *accelerators;
static bool have_custom_ram_size;
static const char *ram_memdev_id;
static QDict *machine_opts_dict;
+static const char *loadstate;
static QTAILQ_HEAD(, ObjectOption) object_opts = QTAILQ_HEAD_INITIALIZER(object_opts);
static QTAILQ_HEAD(, DeviceOption) device_opts = QTAILQ_HEAD_INITIALIZER(device_opts);
static int display_remote;
@@ -2615,6 +2616,12 @@ void qmp_x_exit_preconfig(Error **errp)
if (loadvm) {
load_snapshot(loadvm, NULL, false, NULL, &error_fatal);
+ } else if (loadstate) {
+ Error *local_err = NULL;
+ if (load_snapshot_from_blockdev(loadstate, &local_err) < 0) {
@@ -967,3 +901,13 @@ index e6e0ad5a92..03152c816c 100644
}
if (replay_mode != REPLAY_MODE_NONE) {
replay_vmstate_init();
@@ -3159,6 +3166,9 @@ void qemu_init(int argc, char **argv)
case QEMU_OPTION_loadvm:
loadvm = optarg;
break;
+ case QEMU_OPTION_loadstate:
+ loadstate = optarg;
+ break;
case QEMU_OPTION_full_screen:
dpy.has_full_screen = true;
dpy.full_screen = true;

View File

@@ -9,17 +9,20 @@ increase performance storing the state onto ceph.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[increase max IOV count in QEMUFile to actually write more data]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to removal of QEMUFileOps]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
migration/qemu-file.c | 38 +++++++++++++++++++++++++-------------
migration/qemu-file.h | 1 +
migration/savevm-async.c | 4 ++--
3 files changed, 28 insertions(+), 15 deletions(-)
migration/qemu-file.c | 49 +++++++++++++++++++++++++++-------------
migration/qemu-file.h | 2 ++
migration/savevm-async.c | 5 ++--
3 files changed, 38 insertions(+), 18 deletions(-)
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index be21518c57..1926b5202c 100644
index 2d5f74ffc2..9fd97e6fe1 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -30,8 +30,8 @@
@@ -31,8 +31,8 @@
#include "trace.h"
#include "qapi/error.h"
@@ -29,9 +32,9 @@ index be21518c57..1926b5202c 100644
+#define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 256)
struct QEMUFile {
const QEMUFileOps *ops;
@@ -45,7 +45,8 @@ struct QEMUFile {
when reading */
const QEMUFileHooks *hooks;
@@ -55,7 +55,8 @@ struct QEMUFile {
int buf_index;
int buf_size; /* 0 when writing */
- uint8_t buf[IO_BUF_SIZE];
@@ -40,53 +43,76 @@ index be21518c57..1926b5202c 100644
DECLARE_BITMAP(may_free, MAX_IOV_SIZE);
struct iovec iov[MAX_IOV_SIZE];
@@ -101,7 +102,7 @@ bool qemu_file_mode_is_not_valid(const char *mode)
@@ -127,7 +128,9 @@ bool qemu_file_mode_is_not_valid(const char *mode)
return false;
}
-QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
+QEMUFile *qemu_fopen_ops_sized(void *opaque, const QEMUFileOps *ops, size_t buffer_size)
-static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
+static QEMUFile *qemu_file_new_impl(QIOChannel *ioc,
+ bool is_writable,
+ size_t buffer_size)
{
QEMUFile *f;
@@ -109,9 +110,17 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
f->opaque = opaque;
f->ops = ops;
@@ -136,6 +139,8 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
object_ref(ioc);
f->ioc = ioc;
f->is_writable = is_writable;
+ f->buf_allocated_size = buffer_size;
+ f->buf = malloc(buffer_size);
+
return f;
}
@@ -146,17 +151,27 @@ static QEMUFile *qemu_file_new_impl(QIOChannel *ioc, bool is_writable)
*/
QEMUFile *qemu_file_get_return_path(QEMUFile *f)
{
- return qemu_file_new_impl(f->ioc, !f->is_writable);
+ return qemu_file_new_impl(f->ioc, !f->is_writable, DEFAULT_IO_BUF_SIZE);
}
+QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
+{
+ return qemu_fopen_ops_sized(opaque, ops, DEFAULT_IO_BUF_SIZE);
QEMUFile *qemu_file_new_output(QIOChannel *ioc)
{
- return qemu_file_new_impl(ioc, true);
+ return qemu_file_new_impl(ioc, true, DEFAULT_IO_BUF_SIZE);
+}
+
+QEMUFile *qemu_file_new_output_sized(QIOChannel *ioc, size_t buffer_size)
+{
+ return qemu_file_new_impl(ioc, true, buffer_size);
}
QEMUFile *qemu_file_new_input(QIOChannel *ioc)
{
- return qemu_file_new_impl(ioc, false);
+ return qemu_file_new_impl(ioc, false, DEFAULT_IO_BUF_SIZE);
+}
+
+QEMUFile *qemu_file_new_input_sized(QIOChannel *ioc, size_t buffer_size)
+{
+ return qemu_file_new_impl(ioc, false, buffer_size);
}
void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
{
@@ -346,7 +355,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
@@ -414,7 +429,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
do {
len = qio_channel_read(f->ioc,
(char *)f->buf + pending,
- IO_BUF_SIZE - pending,
+ f->buf_allocated_size - pending,
&local_error);
if (len == QIO_CHANNEL_ERR_BLOCK) {
if (qemu_in_coroutine()) {
@@ -464,6 +479,8 @@ int qemu_fclose(QEMUFile *f)
}
g_clear_pointer(&f->ioc, object_unref);
len = f->ops->get_buffer(f->opaque, f->buf + pending, f->pos,
- IO_BUF_SIZE - pending, &local_error);
+ f->buf_allocated_size - pending, &local_error);
if (len > 0) {
f->buf_size += len;
f->pos += len;
@@ -386,6 +395,9 @@ int qemu_fclose(QEMUFile *f)
ret = ret2;
}
}
+
+ free(f->buf);
+
/* If any error was spotted before closing, we should report it
* instead of the close() return value.
*/
@@ -435,7 +447,7 @@ static void add_buf_to_iovec(QEMUFile *f, size_t len)
@@ -518,7 +535,7 @@ static void add_buf_to_iovec(QEMUFile *f, size_t len)
{
if (!add_to_iovec(f, f->buf + f->buf_index, len, false)) {
f->buf_index += len;
@@ -95,7 +121,7 @@ index be21518c57..1926b5202c 100644
qemu_fflush(f);
}
}
@@ -461,7 +473,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
@@ -544,7 +561,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
}
while (size > 0) {
@@ -104,7 +130,7 @@ index be21518c57..1926b5202c 100644
if (l > size) {
l = size;
}
@@ -508,8 +520,8 @@ size_t qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t size, size_t offset)
@@ -591,8 +608,8 @@ size_t qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t size, size_t offset)
size_t index;
assert(!qemu_file_is_writable(f));
@@ -115,7 +141,7 @@ index be21518c57..1926b5202c 100644
/* The 1st byte to read from */
index = f->buf_index + offset;
@@ -559,7 +571,7 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
@@ -642,7 +659,7 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
size_t res;
uint8_t *src;
@@ -124,16 +150,16 @@ index be21518c57..1926b5202c 100644
if (res == 0) {
return done;
}
@@ -593,7 +605,7 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
@@ -676,7 +693,7 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
*/
size_t qemu_get_buffer_in_place(QEMUFile *f, uint8_t **buf, size_t size)
{
- if (size < IO_BUF_SIZE) {
+ if (size < f->buf_allocated_size) {
size_t res;
uint8_t *src;
uint8_t *src = NULL;
@@ -618,7 +630,7 @@ int qemu_peek_byte(QEMUFile *f, int offset)
@@ -701,7 +718,7 @@ int qemu_peek_byte(QEMUFile *f, int offset)
int index = f->buf_index + offset;
assert(!qemu_file_is_writable(f));
@@ -142,7 +168,7 @@ index be21518c57..1926b5202c 100644
if (index >= f->buf_size) {
qemu_fill_buffer(f);
@@ -770,7 +782,7 @@ static int qemu_compress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
@@ -853,7 +870,7 @@ static int qemu_compress_data(z_stream *stream, uint8_t *dest, size_t dest_len,
ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream,
const uint8_t *p, size_t size)
{
@@ -152,36 +178,39 @@ index be21518c57..1926b5202c 100644
if (blen < compressBound(size)) {
return -1;
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index a9b6d6ccb7..8752d27c74 100644
index fa13d04d78..914f1a63a8 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -120,6 +120,7 @@ typedef struct QEMUFileHooks {
@@ -63,7 +63,9 @@ typedef struct QEMUFileHooks {
} QEMUFileHooks;
QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops);
+QEMUFile *qemu_fopen_ops_sized(void *opaque, const QEMUFileOps *ops, size_t buffer_size);
QEMUFile *qemu_file_new_input(QIOChannel *ioc);
+QEMUFile *qemu_file_new_input_sized(QIOChannel *ioc, size_t buffer_size);
QEMUFile *qemu_file_new_output(QIOChannel *ioc);
+QEMUFile *qemu_file_new_output_sized(QIOChannel *ioc, size_t buffer_size);
void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks);
int qemu_get_fd(QEMUFile *f);
int qemu_fclose(QEMUFile *f);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index 593a619088..cc2552d977 100644
index dc30558713..a38e7351c1 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -418,7 +418,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
goto restart;
}
@@ -374,7 +374,7 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp)
- snap_state.file = qemu_fopen_ops(&snap_state, &block_file_ops);
+ snap_state.file = qemu_fopen_ops_sized(&snap_state, &block_file_ops, 4 * 1024 * 1024);
QIOChannel *ioc = QIO_CHANNEL(qio_channel_savevm_async_new(snap_state.target,
&snap_state.bs_pos));
- snap_state.file = qemu_file_new_output(ioc);
+ snap_state.file = qemu_file_new_output_sized(ioc, 4 * 1024 * 1024);
if (!snap_state.file) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR, "failed to open '%s'", statefile);
@@ -567,7 +567,7 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
@@ -507,7 +507,8 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
blk_op_block_all(be, blocker);
/* restore the VM state */
- f = qemu_fopen_ops(be, &loadstate_file_ops);
+ f = qemu_fopen_ops_sized(be, &loadstate_file_ops, 4 * 1024 * 1024);
- f = qemu_file_new_input(QIO_CHANNEL(qio_channel_savevm_async_new(be, &bs_pos)));
+ f = qemu_file_new_input_sized(QIO_CHANNEL(qio_channel_savevm_async_new(be, &bs_pos)),
+ 4 * 1024 * 1024);
if (!f) {
error_setg(errp, "Could not open VM state file");
goto the_end;

View File

@@ -4,30 +4,32 @@ Date: Mon, 6 Apr 2020 12:16:47 +0200
Subject: [PATCH] PVE: block: add the zeroinit block driver filter
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[adapt to changed function signatures]
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/meson.build | 1 +
block/zeroinit.c | 196 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 197 insertions(+)
block/zeroinit.c | 198 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 199 insertions(+)
create mode 100644 block/zeroinit.c
diff --git a/block/meson.build b/block/meson.build
index 5dcc1e5cce..c10d544864 100644
index b7c68b83a3..020a89ae07 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -39,6 +39,7 @@ block_ss.add(files(
@@ -43,6 +43,7 @@ block_ss.add(files(
'vmdk.c',
'vpc.c',
'write-threshold.c',
+ 'zeroinit.c',
), zstd, zlib)
), zstd, zlib, gnutls)
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
diff --git a/block/zeroinit.c b/block/zeroinit.c
new file mode 100644
index 0000000000..5529627f7e
index 0000000000..b60e1b84dc
--- /dev/null
+++ b/block/zeroinit.c
@@ -0,0 +1,196 @@
@@ -0,0 +1,198 @@
+/*
+ * Filter to fake a zero-initialized block device.
+ *
@@ -107,7 +109,9 @@ index 0000000000..5529627f7e
+
+ /* Open the raw file */
+ bs->file = bdrv_open_child(qemu_opt_get(opts, "x-next"), options, "next",
+ bs, &child_of_bds, BDRV_CHILD_FILTERED, false, &local_err);
+ bs, &child_of_bds,
+ BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
+ false, &local_err);
+ if (local_err) {
+ ret = -EINVAL;
+ error_propagate(errp, local_err);
@@ -138,22 +142,22 @@ index 0000000000..5529627f7e
+}
+
+static int coroutine_fn zeroinit_co_preadv(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
+}
+
+static int coroutine_fn zeroinit_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+ int count, BdrvRequestFlags flags)
+ int64_t bytes, BdrvRequestFlags flags)
+{
+ BDRVZeroinitState *s = bs->opaque;
+ if (offset >= s->extents)
+ return 0;
+ return bdrv_pwrite_zeroes(bs->file, offset, count, flags);
+ return bdrv_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+static int coroutine_fn zeroinit_co_pwritev(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVZeroinitState *s = bs->opaque;
+ int64_t extents = offset + bytes;
@@ -174,9 +178,9 @@ index 0000000000..5529627f7e
+}
+
+static int coroutine_fn zeroinit_co_pdiscard(BlockDriverState *bs,
+ int64_t offset, int count)
+ int64_t offset, int64_t bytes)
+{
+ return bdrv_co_pdiscard(bs->file, offset, count);
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
+}
+
+static int zeroinit_co_truncate(BlockDriverState *bs, int64_t offset,

View File

@@ -14,12 +14,12 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2 files changed, 11 insertions(+)
diff --git a/qemu-options.hx b/qemu-options.hx
index c1352312c2..9a0cb6780e 100644
index 9e3de34143..1ff8905127 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -906,6 +906,9 @@ DEFHEADING()
@@ -1159,6 +1159,9 @@ legacy PC, they are not recommended for modern configurations.
DEFHEADING(Block device options:)
ERST
+DEF("id", HAS_ARG, QEMU_OPTION_id,
+ "-id n set the VMID", QEMU_ARCH_ALL)
@@ -28,20 +28,20 @@ index c1352312c2..9a0cb6780e 100644
"-fda/-fdb file use 'file' as floppy disk 0/1 image\n", QEMU_ARCH_ALL)
DEF("fdb", HAS_ARG, QEMU_OPTION_fdb, "", QEMU_ARCH_ALL)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 03152c816c..da204d24f0 100644
index c94fe3d778..a6f7a422ec 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2866,6 +2866,7 @@ static char *find_datadir(void)
void qemu_init(int argc, char **argv, char **envp)
{
int i;
@@ -2651,6 +2651,7 @@ void qemu_init(int argc, char **argv)
MachineClass *machine_class;
bool userconfig = true;
FILE *vmstate_dump_file = NULL;
+ long vm_id;
int snapshot, linux_boot;
const char *initrd_filename;
const char *kernel_filename, *kernel_cmdline;
@@ -3557,6 +3558,13 @@ void qemu_init(int argc, char **argv, char **envp)
exit(1);
}
qemu_add_opts(&qemu_drive_opts);
qemu_add_drive_opts(&qemu_legacy_drive_opts);
@@ -3271,6 +3272,13 @@ void qemu_init(int argc, char **argv)
machine_parse_property_opt(qemu_find_opts("smp-opts"),
"smp", optarg);
break;
+ case QEMU_OPTION_id:
+ vm_id = strtol(optarg, (char **)&optarg, 10);
@@ -51,5 +51,5 @@ index 03152c816c..da204d24f0 100644
+ }
+ break;
case QEMU_OPTION_vnc:
vnc_parse(optarg, &error_fatal);
vnc_parse(optarg);
break;

View File

@@ -11,7 +11,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 9 insertions(+)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 502e94effc..590ef6ec8e 100644
index 2a20982066..7968ad5a93 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -278,6 +278,15 @@ static void apic_reset_common(DeviceState *dev)

View File

@@ -8,15 +8,15 @@ Otherwise creating images on nfs/cifs can be problematic.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/file-posix.c | 61 +++++++++++++++++++++++++++++---------------
block/file-posix.c | 59 ++++++++++++++++++++++++++++++--------------
qapi/block-core.json | 3 ++-
2 files changed, 43 insertions(+), 21 deletions(-)
2 files changed, 42 insertions(+), 20 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index bda3e606dc..037839622e 100644
index 9a16d86344..bd68df57ad 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2388,6 +2388,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2487,6 +2487,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
int fd;
uint64_t perm, shared;
int result = 0;
@@ -24,7 +24,7 @@ index bda3e606dc..037839622e 100644
/* Validate options and set default values */
assert(options->driver == BLOCKDEV_DRIVER_FILE);
@@ -2428,19 +2429,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2527,19 +2528,22 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
@@ -59,7 +59,7 @@ index bda3e606dc..037839622e 100644
}
/* Clear the file by truncating it to 0 */
@@ -2494,13 +2498,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
@@ -2593,13 +2597,15 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp)
}
out_unlock:
@@ -82,7 +82,7 @@ index bda3e606dc..037839622e 100644
}
out_close:
@@ -2525,6 +2531,7 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
@@ -2624,6 +2630,7 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
@@ -90,7 +90,7 @@ index bda3e606dc..037839622e 100644
/* Skip file: protocol prefix */
strstart(filename, "file:", &filename);
@@ -2547,6 +2554,18 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
@@ -2646,6 +2653,18 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
return -EINVAL;
}
@@ -109,7 +109,7 @@ index bda3e606dc..037839622e 100644
options = (BlockdevCreateOptions) {
.driver = BLOCKDEV_DRIVER_FILE,
.u.file = {
@@ -2558,6 +2577,8 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
@@ -2657,6 +2676,8 @@ static int coroutine_fn raw_co_create_opts(BlockDriver *drv,
.nocow = nocow,
.has_extent_size_hint = has_extent_size_hint,
.extent_size_hint = extent_size_hint,
@@ -118,20 +118,11 @@ index bda3e606dc..037839622e 100644
},
};
return raw_co_create(&options, errp);
@@ -3104,7 +3125,7 @@ static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
}
/* Copy locks to the new fd */
- if (s->perm_change_fd) {
+ if (s->use_lock && s->perm_change_fd) {
ret = raw_apply_lock_bytes(NULL, s->perm_change_fd, perm, ~shared,
false, errp);
if (ret < 0) {
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9db3120716..d285622589 100644
index 7daaf545be..9e902b96bb 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4224,7 +4224,8 @@
@@ -4624,7 +4624,8 @@
'size': 'size',
'*preallocation': 'PreallocMode',
'*nocow': 'bool',

View File

@@ -18,10 +18,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/monitor/qmp.c b/monitor/qmp.c
index b42f8c6af3..2e37d11bd3 100644
index cc1407e4ac..c34fa2e0e3 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -466,8 +466,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
@@ -502,8 +502,7 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
qemu_chr_fe_set_echo(&mon->common.chr, true);
/* Note: we run QMP monitor in I/O thread when @chr supports that */

View File

@@ -26,10 +26,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index d0408049b5..5b38cf9356 100644
index 19f42450f5..39ef2b6fe6 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -78,7 +78,8 @@ GlobalProperty hw_compat_4_0[] = {
@@ -135,7 +135,8 @@ GlobalProperty hw_compat_4_0[] = {
{ "virtio-vga", "edid", "false" },
{ "virtio-gpu-device", "edid", "false" },
{ "virtio-device", "use-started", "false" },

View File

@@ -10,18 +10,19 @@ Version is made available as 'pve-version' in query-machines (same as,
and only if 'is-current').
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
hw/core/machine-qmp-cmds.c | 6 ++++++
include/hw/boards.h | 2 ++
qapi/machine.json | 4 +++-
softmmu/vl.c | 15 ++++++++++++++-
4 files changed, 25 insertions(+), 2 deletions(-)
softmmu/vl.c | 25 +++++++++++++++++++++++++
4 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 3fcb82ce2f..7868241bd5 100644
index 76fff60a6b..ec9201fb9a 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -238,6 +238,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
@@ -103,6 +103,12 @@ MachineInfoList *qmp_query_machines(Error **errp)
if (strcmp(mc->name, MACHINE_GET_CLASS(current_machine)->name) == 0) {
info->has_is_current = true;
info->is_current = true;
@@ -35,32 +36,32 @@ index 3fcb82ce2f..7868241bd5 100644
if (mc->default_cpu_type) {
diff --git a/include/hw/boards.h b/include/hw/boards.h
index a49e3a6b44..8e0a8c5571 100644
index ca2f0d3592..acc3b62b6e 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -165,6 +165,8 @@ struct MachineClass {
@@ -232,6 +232,8 @@ struct MachineClass {
const char *desc;
const char *deprecation_reason;
+ const char *pve_version;
+
void (*init)(MachineState *state);
void (*reset)(MachineState *state);
void (*reset)(MachineState *state, ShutdownCause reason);
void (*wakeup)(MachineState *state);
diff --git a/qapi/machine.json b/qapi/machine.json
index dfc1a49d3c..32fc674042 100644
index 9156103c8f..f4fb1b2c9c 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -337,6 +337,8 @@
@@ -157,6 +157,8 @@
#
# @default-ram-id: the default ID of initial RAM memory backend (since 5.2)
#
+# @pve-version: custom PVE version suffix specified as 'machine+pveN'
+#
# Since: 1.2.0
# Since: 1.2
##
{ 'struct': 'MachineInfo',
@@ -344,7 +346,7 @@
@@ -164,7 +166,7 @@
'*is-default': 'bool', '*is-current': 'bool', 'cpu-max': 'int',
'hotpluggable-cpus': 'bool', 'numa-mem-supported': 'bool',
'deprecated': 'bool', '*default-cpu-type': 'str',
@@ -70,40 +71,58 @@ index dfc1a49d3c..32fc674042 100644
##
# @query-machines:
diff --git a/softmmu/vl.c b/softmmu/vl.c
index da204d24f0..5b5512128e 100644
index a6f7a422ec..8b0b35b6b4 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2325,6 +2325,8 @@ static MachineClass *machine_parse(const char *name, GSList *machines)
@@ -1582,6 +1582,7 @@ static const QEMUOption *lookup_opt(int argc, char **argv,
static MachineClass *select_machine(QDict *qdict, Error **errp)
{
MachineClass *mc;
GSList *el;
+ size_t pvever_index = 0;
+ gchar *name_clean;
if (is_help_option(name)) {
printf("Supported machines are:\n");
@@ -2341,12 +2343,23 @@ static MachineClass *machine_parse(const char *name, GSList *machines)
exit(0);
const char *optarg = qdict_get_try_str(qdict, "type");
+ const char *pvever = qdict_get_try_str(qdict, "pvever");
GSList *machines = object_class_get_list(TYPE_MACHINE, false);
MachineClass *machine_class;
Error *local_err = NULL;
@@ -1599,6 +1600,11 @@ static MachineClass *select_machine(QDict *qdict, Error **errp)
}
}
- mc = find_machine(name, machines);
+ // PVE version is specified with '+' as seperator, e.g. pc-i440fx+pvever
+ pvever_index = strcspn(name, "+");
+
+ name_clean = g_strndup(name, pvever_index);
+ mc = find_machine(name_clean, machines);
+ g_free(name_clean);
+
if (!mc) {
error_report("unsupported machine type");
error_printf("Use -machine help to list supported machines\n");
exit(1);
}
+
+ if (pvever_index < strlen(name)) {
+ mc->pve_version = &name[pvever_index+1];
+ if (machine_class) {
+ machine_class->pve_version = g_strdup(pvever);
+ qdict_del(qdict, "pvever");
+ }
+
return mc;
}
g_slist_free(machines);
if (local_err) {
error_append_hint(&local_err, "Use -machine help to list supported machines\n");
@@ -3213,12 +3219,31 @@ void qemu_init(int argc, char **argv)
case QEMU_OPTION_machine:
{
bool help;
+ size_t pvever_index, name_len;
+ const gchar *name;
+ gchar *name_clean, *pvever;
keyval_parse_into(machine_opts_dict, optarg, "type", &help, &error_fatal);
if (help) {
machine_help_func(machine_opts_dict);
exit(EXIT_SUCCESS);
}
+
+ // PVE version is specified with '+' as seperator, e.g. pc-i440fx+pvever
+ name = qdict_get_try_str(machine_opts_dict, "type");
+ if (name != NULL) {
+ name_len = strlen(name);
+ pvever_index = strcspn(name, "+");
+ if (pvever_index < name_len) {
+ name_clean = g_strndup(name, pvever_index);
+ pvever = g_strndup(name + pvever_index + 1, name_len - pvever_index - 1);
+ qdict_put_str(machine_opts_dict, "pvever", pvever);
+ qdict_put_str(machine_opts_dict, "type", name_clean);
+ g_free(name_clean);
+ g_free(pvever);
+ }
+ }
+
break;
}
case QEMU_OPTION_accel:

View File

@@ -0,0 +1,59 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Wed, 2 Mar 2022 08:35:05 +0100
Subject: [PATCH] block/backup: move bcs bitmap initialization to job creation
For backing up the state of multiple disks from the same time, a job
for each disk has to be created. It's convenient if the jobs don't
have to be started at the same time and if operation of the VM can be
resumed after job creation. This would lead to a window between job
creation and running the job, where writes can happen. But no writes
should happen between setting up the copy-before-write filter and
setting up the block copy state bitmap, because then new writes would
just pass through.
Commit 06e0a9c16405c0a4c1eca33cf286cc04c42066a2 moved initalization of
the bitmap to setting up the copy-before-write filter when sync_mode
is not MIRROR_SYNC_MODE_BITMAP. Ensure that the bitmap is initialized
upon job creation for the remaining case too, by moving the
backup_init_bcs_bitmap call to backup_job_create.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/backup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 6a9ad97a53..9b0151c5be 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -237,8 +237,8 @@ static void backup_init_bcs_bitmap(BackupBlockJob *job)
true);
} else if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
/*
- * We can't hog the coroutine to initialize this thoroughly.
- * Set a flag and resume work when we are able to yield safely.
+ * Initialization is costly here. Simply set a flag and let the
+ * backup_run coroutine resume work once it can yield safely.
*/
block_copy_set_skip_unallocated(job->bcs, true);
}
@@ -252,8 +252,6 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
int ret;
- backup_init_bcs_bitmap(s);
-
if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
int64_t offset = 0;
int64_t count;
@@ -492,6 +490,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
&error_abort);
+ backup_init_bcs_bitmap(job);
+
return &job->common;
error:

View File

@@ -3,50 +3,53 @@ From: Dietmar Maurer <dietmar@proxmox.com>
Date: Mon, 6 Apr 2020 12:16:57 +0200
Subject: [PATCH] PVE-Backup: add vma backup format code
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: create: register all streams before entering coroutines]
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/meson.build | 2 +
meson.build | 5 +
vma-reader.c | 857 ++++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 790 ++++++++++++++++++++++++++++++++++++++++++
vma.c | 839 +++++++++++++++++++++++++++++++++++++++++++++
vma-reader.c | 859 ++++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 791 ++++++++++++++++++++++++++++++++++++++++++
vma.c | 849 +++++++++++++++++++++++++++++++++++++++++++++
vma.h | 150 ++++++++
6 files changed, 2643 insertions(+)
6 files changed, 2656 insertions(+)
create mode 100644 vma-reader.c
create mode 100644 vma-writer.c
create mode 100644 vma.c
create mode 100644 vma.h
diff --git a/block/meson.build b/block/meson.build
index c10d544864..feffbc8623 100644
index 020a89ae07..4feae20e37 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -42,6 +42,8 @@ block_ss.add(files(
@@ -46,6 +46,8 @@ block_ss.add(files(
'zeroinit.c',
), zstd, zlib)
), zstd, zlib, gnutls)
+block_ss.add(files('../vma-writer.c'), libuuid)
+
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
softmmu_ss.add(files('block-ram-registrar.c'))
block_ss.add(when: 'CONFIG_QCOW1', if_true: files('qcow.c'))
diff --git a/meson.build b/meson.build
index e3386196ba..d5b660516b 100644
index 787f91855e..a496f6fe10 100644
--- a/meson.build
+++ b/meson.build
@@ -725,6 +725,8 @@ keyutils = dependency('libkeyutils', required: false,
@@ -1525,6 +1525,8 @@ keyutils = dependency('libkeyutils', required: false,
has_gettid = cc.has_function('gettid')
+libuuid = cc.find_library('uuid', required: true)
+
# Malloc tests
# libselinux
selinux = dependency('libselinux',
required: get_option('selinux'),
@@ -3598,6 +3600,9 @@ if have_tools
dependencies: [blockdev, qemuutil, gnutls, selinux],
install: true)
malloc = []
@@ -1907,6 +1909,9 @@ if have_tools
qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
dependencies: [blockdev, qemuutil], install: true)
+ vma = executable('vma', files('vma.c', 'vma-reader.c'),
+ vma = executable('vma', files('vma.c', 'vma-reader.c') + genh,
+ dependencies: [authz, block, crypto, io, qom], install: true)
+
subdir('storage-daemon')
@@ -54,10 +57,10 @@ index e3386196ba..d5b660516b 100644
subdir('contrib/elf2dmp')
diff --git a/vma-reader.c b/vma-reader.c
new file mode 100644
index 0000000000..2b1d1cdab3
index 0000000000..e65f1e8415
--- /dev/null
+++ b/vma-reader.c
@@ -0,0 +1,857 @@
@@ -0,0 +1,859 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -75,7 +78,6 @@ index 0000000000..2b1d1cdab3
+#include <glib.h>
+#include <uuid/uuid.h>
+
+#include "qemu-common.h"
+#include "qemu/timer.h"
+#include "qemu/ratelimit.h"
+#include "vma.h"
@@ -252,6 +254,9 @@ index 0000000000..2b1d1cdab3
+ if (vmar->rstate[i].bitmap) {
+ g_free(vmar->rstate[i].bitmap);
+ }
+ if (vmar->rstate[i].target) {
+ blk_unref(vmar->rstate[i].target);
+ }
+ }
+
+ if (vmar->md5csum) {
@@ -583,7 +588,7 @@ index 0000000000..2b1d1cdab3
+ }
+ }
+ } else {
+ int res = blk_pwrite(target, sector_num * BDRV_SECTOR_SIZE, buf, nb_sectors * BDRV_SECTOR_SIZE, 0);
+ int res = blk_pwrite(target, sector_num * BDRV_SECTOR_SIZE, nb_sectors * BDRV_SECTOR_SIZE, buf, 0);
+ if (res < 0) {
+ error_setg(errp, "blk_pwrite to %s failed (%d)",
+ bdrv_get_device_name(blk_bs(target)), res);
@@ -917,10 +922,10 @@ index 0000000000..2b1d1cdab3
+
diff --git a/vma-writer.c b/vma-writer.c
new file mode 100644
index 0000000000..11d8321ffd
index 0000000000..df4b20793d
--- /dev/null
+++ b/vma-writer.c
@@ -0,0 +1,790 @@
@@ -0,0 +1,791 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -944,6 +949,7 @@ index 0000000000..11d8321ffd
+#include "qemu/main-loop.h"
+#include "qemu/coroutine.h"
+#include "qemu/cutils.h"
+#include "qemu/memalign.h"
+
+#define DEBUG_VMA 0
+
@@ -1127,9 +1133,9 @@ index 0000000000..11d8321ffd
+ assert(qemu_in_coroutine());
+ AioContext *ctx = qemu_get_current_aio_context();
+ aio_set_fd_handler(ctx, fd, false, NULL, (IOHandler *)qemu_coroutine_enter,
+ NULL, qemu_coroutine_self());
+ NULL, NULL, qemu_coroutine_self());
+ qemu_coroutine_yield();
+ aio_set_fd_handler(ctx, fd, false, NULL, NULL, NULL, NULL);
+ aio_set_fd_handler(ctx, fd, false, NULL, NULL, NULL, NULL, NULL);
+}
+
+static ssize_t coroutine_fn
@@ -1713,10 +1719,10 @@ index 0000000000..11d8321ffd
+}
diff --git a/vma.c b/vma.c
new file mode 100644
index 0000000000..2eea2fc281
index 0000000000..e8dffb43e0
--- /dev/null
+++ b/vma.c
@@ -0,0 +1,839 @@
@@ -0,0 +1,849 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -1734,11 +1740,11 @@ index 0000000000..2eea2fc281
+#include <glib.h>
+
+#include "vma.h"
+#include "qemu-common.h"
+#include "qemu/module.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
+#include "qemu/cutils.h"
+#include "qemu/memalign.h"
+#include "qapi/qmp/qdict.h"
+#include "sysemu/block-backend.h"
+
@@ -2028,8 +2034,6 @@ index 0000000000..2eea2fc281
+ int vmstate_fd = -1;
+ guint8 vmstate_stream = 0;
+
+ BlockBackend *blk = NULL;
+
+ for (i = 1; i < 255; i++) {
+ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i);
+ if (di && (strcmp(di->devname, "vmstate") == 0)) {
@@ -2050,6 +2054,8 @@ index 0000000000..2eea2fc281
+ int flags = BDRV_O_RDWR;
+ bool write_zero = true;
+
+ BlockBackend *blk = NULL;
+
+ if (readmap) {
+ RestoreMap *map;
+ map = (RestoreMap *)g_hash_table_lookup(devmap, di->devname);
@@ -2162,8 +2168,6 @@ index 0000000000..2eea2fc281
+
+ vma_reader_destroy(vmar);
+
+ blk_unref(blk);
+
+ bdrv_close_all();
+
+ return ret;
@@ -2292,6 +2296,7 @@ index 0000000000..2eea2fc281
+ int i, c;
+ int verbose = 0;
+ const char *archivename;
+ GList *backup_coroutines = NULL;
+ GList *config_files = NULL;
+
+ for (;;) {
@@ -2380,7 +2385,9 @@ index 0000000000..2eea2fc281
+ job->dev_id = dev_id;
+
+ Coroutine *co = qemu_coroutine_create(backup_run, job);
+ qemu_coroutine_enter(co);
+ // Don't enter coroutine yet, because it might write the header before
+ // all streams can be registered.
+ backup_coroutines = g_list_append(backup_coroutines, co);
+ }
+
+ VmaStatus vmastat;
@@ -2388,6 +2395,13 @@ index 0000000000..2eea2fc281
+ int last_percent = -1;
+
+ if (devcount) {
+ GList *entry = backup_coroutines;
+ while (entry && entry->data) {
+ Coroutine *co = entry->data;
+ qemu_coroutine_enter(co);
+ entry = g_list_next(entry);
+ }
+
+ while (1) {
+ main_loop_wait(false);
+ vma_writer_get_status(vmaw, &vmastat);
@@ -2452,6 +2466,8 @@ index 0000000000..2eea2fc281
+ g_error("creating vma archive failed");
+ }
+
+ g_list_free(backup_coroutines);
+ g_list_free(config_files);
+ vma_writer_destroy(vmaw);
+ return 0;
+}

View File

@@ -7,21 +7,23 @@ Subject: [PATCH] PVE-Backup: add backup-dump block driver
- move BackupBlockJob declaration from block/backup.c to include/block/block_int.h
- block/backup.c - backup-job-create: also consider source cluster size
- job.c: make job_should_pause non-static
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/backup-dump.c | 168 ++++++++++++++++++++++++++++++++++++++
block/backup.c | 23 ++----
block/meson.build | 1 +
include/block/block_int.h | 30 +++++++
job.c | 3 +-
5 files changed, 206 insertions(+), 19 deletions(-)
block/backup-dump.c | 167 +++++++++++++++++++++++++++++++
block/backup.c | 30 ++----
block/meson.build | 1 +
include/block/block_int-common.h | 35 +++++++
job.c | 3 +-
5 files changed, 213 insertions(+), 23 deletions(-)
create mode 100644 block/backup-dump.c
diff --git a/block/backup-dump.c b/block/backup-dump.c
new file mode 100644
index 0000000000..93d7f46950
index 0000000000..04718a94e2
--- /dev/null
+++ b/block/backup-dump.c
@@ -0,0 +1,168 @@
@@ -0,0 +1,167 @@
+/*
+ * BlockDriver to send backup data stream to a callback function
+ *
@@ -33,7 +35,6 @@ index 0000000000..93d7f46950
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qom/object_interfaces.h"
+#include "block/block_int.h"
+
@@ -191,17 +192,18 @@ index 0000000000..93d7f46950
+ return bs;
+}
diff --git a/block/backup.c b/block/backup.c
index 9afa0bf3b4..3df3d532d5 100644
index 9b0151c5be..6e8f6e67b3 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -32,24 +32,6 @@
@@ -29,28 +29,6 @@
#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
#include "block/copy-before-write.h"
-typedef struct BackupBlockJob {
- BlockJob common;
- BlockDriverState *backup_top;
- BlockDriverState *cbw;
- BlockDriverState *source_bs;
- BlockDriverState *target_bs;
-
- BdrvDirtyBitmap *sync_bitmap;
-
@@ -210,29 +212,35 @@ index 9afa0bf3b4..3df3d532d5 100644
- BlockdevOnError on_source_error;
- BlockdevOnError on_target_error;
- uint64_t len;
- uint64_t bytes_read;
- int64_t cluster_size;
- BackupPerf perf;
-
- BlockCopyState *bcs;
-
- bool wait;
- BlockCopyCallState *bg_bcs_call;
-} BackupBlockJob;
-
static const BlockJobDriver backup_job_driver;
static void backup_progress_bytes_callback(int64_t bytes, void *opaque)
@@ -423,6 +405,11 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
goto error;
static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
@@ -454,6 +432,14 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
cluster_size = block_copy_cluster_size(bcs);
+ if (cluster_size < 0) {
+ goto error;
+ }
+
+ BlockDriverInfo bdi;
+ if (bdrv_get_info(bs, &bdi) == 0) {
+ cluster_size = MAX(cluster_size, bdi.cluster_size);
+ }
+
/*
* If source is in backing chain of target assume that target is going to be
* used for "image fleecing", i.e. it should represent a kind of snapshot of
if (perf->max_chunk && perf->max_chunk < cluster_size) {
error_setg(errp, "Required max-chunk (%" PRIi64 ") is less than backup "
diff --git a/block/meson.build b/block/meson.build
index feffbc8623..2507af1168 100644
index 4feae20e37..0d7023fc82 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -4,6 +4,7 @@ block_ss.add(files(
@@ -240,20 +248,28 @@ index feffbc8623..2507af1168 100644
'amend.c',
'backup.c',
+ 'backup-dump.c',
'backup-top.c',
'copy-before-write.c',
'blkdebug.c',
'blklogwrites.c',
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 6f8eda629a..5455102da8 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -63,6 +63,36 @@
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 31ae91e56e..37b64bcd93 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -26,6 +26,7 @@
#include "block/accounting.h"
#include "block/block.h"
+#include "block/block-copy.h"
#include "block/aio-wait.h"
#include "qemu/queue.h"
#include "qemu/coroutine.h"
@@ -64,6 +65,40 @@
#define BLOCK_PROBE_BUF_SIZE 512
+typedef int BackupDumpFunc(void *opaque, uint64_t offset, uint64_t bytes, const void *buf);
+
+BlockDriverState *bdrv_backuo_dump_create(
+BlockDriverState *bdrv_backup_dump_create(
+ int dump_cb_block_size,
+ uint64_t byte_size,
+ BackupDumpFunc *dump_cb,
@@ -265,8 +281,9 @@ index 6f8eda629a..5455102da8 100644
+typedef struct BlockCopyState BlockCopyState;
+typedef struct BackupBlockJob {
+ BlockJob common;
+ BlockDriverState *backup_top;
+ BlockDriverState *cbw;
+ BlockDriverState *source_bs;
+ BlockDriverState *target_bs;
+
+ BdrvDirtyBitmap *sync_bitmap;
+
@@ -275,26 +292,29 @@ index 6f8eda629a..5455102da8 100644
+ BlockdevOnError on_source_error;
+ BlockdevOnError on_target_error;
+ uint64_t len;
+ uint64_t bytes_read;
+ int64_t cluster_size;
+ BackupPerf perf;
+
+ BlockCopyState *bcs;
+
+ bool wait;
+ BlockCopyCallState *bg_bcs_call;
+} BackupBlockJob;
+
enum BdrvTrackedRequestType {
BDRV_TRACKED_READ,
BDRV_TRACKED_WRITE,
diff --git a/job.c b/job.c
index 8fecf38960..f9884e7d9d 100644
index 72d57f0934..93e22d180b 100644
--- a/job.c
+++ b/job.c
@@ -269,7 +269,8 @@ static bool job_started(Job *job)
return job->co;
@@ -330,7 +330,8 @@ static bool job_started_locked(Job *job)
}
-static bool job_should_pause(Job *job)
+bool job_should_pause(Job *job);
+bool job_should_pause(Job *job)
/* Called with job_mutex held. */
-static bool job_should_pause_locked(Job *job)
+bool job_should_pause_locked(Job *job);
+bool job_should_pause_locked(Job *job)
{
return job->pause_count > 0;
}

View File

@@ -6,33 +6,36 @@ Subject: [PATCH] PVE-Backup: proxmox backup patches for qemu
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
[PVE-Backup: avoid coroutines to fix AIO freeze, cleanups]
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls
adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 5 +
block/monitor/block-hmp-cmds.c | 33 ++
blockdev.c | 1 +
hmp-commands-info.hx | 13 +
hmp-commands-info.hx | 14 +
hmp-commands.hx | 29 +
include/block/block_int.h | 2 +-
include/monitor/hmp.h | 3 +
meson.build | 1 +
monitor/hmp-cmds.c | 44 ++
proxmox-backup-client.c | 176 ++++++
proxmox-backup-client.h | 59 ++
pve-backup.c | 957 +++++++++++++++++++++++++++++++++
pve-backup.c | 956 +++++++++++++++++++++++++++++++++
qapi/block-core.json | 109 ++++
qapi/common.json | 13 +
qapi/machine.json | 15 +-
15 files changed, 1446 insertions(+), 14 deletions(-)
14 files changed, 1445 insertions(+), 13 deletions(-)
create mode 100644 proxmox-backup-client.c
create mode 100644 proxmox-backup-client.h
create mode 100644 pve-backup.c
diff --git a/block/meson.build b/block/meson.build
index 2507af1168..dfae565db3 100644
index 0d7023fc82..e995ae72b9 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -44,6 +44,11 @@ block_ss.add(files(
), zstd, zlib)
@@ -48,6 +48,11 @@ block_ss.add(files(
), zstd, zlib, gnutls)
block_ss.add(files('../vma-writer.c'), libuuid)
+block_ss.add(files(
@@ -42,12 +45,12 @@ index 2507af1168..dfae565db3 100644
+
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
softmmu_ss.add(files('block-ram-registrar.c'))
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index d15a2be827..9ba7c774a2 100644
index cf21b5e40a..60fa93c85e 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1012,3 +1012,36 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
@@ -1017,3 +1017,36 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
g_free(sn_tab);
g_free(global_snapshots);
}
@@ -85,7 +88,7 @@ index d15a2be827..9ba7c774a2 100644
+ hmp_handle_error(mon, error);
+}
diff --git a/blockdev.c b/blockdev.c
index a40c6fd0f6..e2f826ca62 100644
index 5b15a86bfa..cba1078815 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -36,6 +36,7 @@
@@ -97,13 +100,14 @@ index a40c6fd0f6..e2f826ca62 100644
#include "monitor/monitor.h"
#include "qemu/error-report.h"
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index b3b797ca28..295e14e64f 100644
index 489c524e9e..bc1d46d845 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -513,6 +513,19 @@ SRST
Show CPU statistics.
@@ -486,6 +486,20 @@ SRST
Show the current VM UUID.
ERST
+
+ {
+ .name = "backup",
+ .args_type = "",
@@ -121,10 +125,10 @@ index b3b797ca28..295e14e64f 100644
{
.name = "usernet",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index d294c234a5..0c6b944850 100644
index 039be0033d..fcf9461295 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -98,6 +98,35 @@ ERST
@@ -101,6 +101,35 @@ ERST
SRST
``block_stream``
Copy data from a backing file into a block device.
@@ -160,32 +164,19 @@ index d294c234a5..0c6b944850 100644
ERST
{
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 5455102da8..1bd4b64522 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -65,7 +65,7 @@
typedef int BackupDumpFunc(void *opaque, uint64_t offset, uint64_t bytes, const void *buf);
-BlockDriverState *bdrv_backuo_dump_create(
+BlockDriverState *bdrv_backup_dump_create(
int dump_cb_block_size,
uint64_t byte_size,
BackupDumpFunc *dump_cb,
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 4e06f89e8e..10f52bd92a 100644
index 440f86aba8..350527e599 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -30,6 +30,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict);
@@ -31,6 +31,7 @@ void hmp_info_savevm(Monitor *mon, const QDict *qdict);
void hmp_info_migrate(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict);
void hmp_info_migrate_cache_size(Monitor *mon, const QDict *qdict);
+void hmp_info_backup(Monitor *mon, const QDict *qdict);
void hmp_info_cpus(Monitor *mon, const QDict *qdict);
void hmp_info_vnc(Monitor *mon, const QDict *qdict);
void hmp_info_spice(Monitor *mon, const QDict *qdict);
@@ -76,6 +77,8 @@ void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
@@ -74,6 +75,8 @@ void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
void hmp_set_password(Monitor *mon, const QDict *qdict);
void hmp_expire_password(Monitor *mon, const QDict *qdict);
void hmp_change(Monitor *mon, const QDict *qdict);
@@ -195,22 +186,22 @@ index 4e06f89e8e..10f52bd92a 100644
void hmp_device_add(Monitor *mon, const QDict *qdict);
void hmp_device_del(Monitor *mon, const QDict *qdict);
diff --git a/meson.build b/meson.build
index d5b660516b..3094f98c47 100644
index a496f6fe10..406112d96f 100644
--- a/meson.build
+++ b/meson.build
@@ -726,6 +726,7 @@ keyutils = dependency('libkeyutils', required: false,
@@ -1526,6 +1526,7 @@ keyutils = dependency('libkeyutils', required: false,
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
+libproxmox_backup_qemu = cc.find_library('proxmox_backup_qemu', required: true)
# Malloc tests
# libselinux
selinux = dependency('libselinux',
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 77ab152aab..182e79c943 100644
index cfebfd1db5..a40b25e906 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -195,6 +195,50 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
@@ -199,6 +199,50 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
qapi_free_MouseInfoList(mice_list);
}
@@ -258,9 +249,9 @@ index 77ab152aab..182e79c943 100644
+ qapi_free_BackupStatus(info);
+}
+
static char *SocketAddress_to_str(SocketAddress *addr)
void hmp_info_migrate(Monitor *mon, const QDict *qdict)
{
switch (addr->type) {
MigrationInfo *info;
diff --git a/proxmox-backup-client.c b/proxmox-backup-client.c
new file mode 100644
index 0000000000..a8f6653a81
@@ -510,10 +501,10 @@ index 0000000000..1dda8b7d8f
+#endif /* PROXMOX_BACKUP_CLIENT_H */
diff --git a/pve-backup.c b/pve-backup.c
new file mode 100644
index 0000000000..d40f3f2fd6
index 0000000000..6af212b9b4
--- /dev/null
+++ b/pve-backup.c
@@ -0,0 +1,957 @@
@@ -0,0 +1,956 @@
+#include "proxmox-backup-client.h"
+#include "vma.h"
+
@@ -591,14 +582,16 @@ index 0000000000..d40f3f2fd6
+lookup_active_block_job(PVEBackupDevInfo *di)
+{
+ if (!di->completed && di->bs) {
+ for (BlockJob *job = block_job_next(NULL); job; job = block_job_next(job)) {
+ if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
+ continue;
+ }
+ WITH_JOB_LOCK_GUARD() {
+ for (BlockJob *job = block_job_next_locked(NULL); job; job = block_job_next_locked(job)) {
+ if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
+ continue;
+ }
+
+ BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
+ if (bjob && bjob->source_bs == di->bs) {
+ return job;
+ BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
+ if (bjob && bjob->source_bs == di->bs) {
+ return job;
+ }
+ }
+ }
+ }
@@ -868,10 +861,7 @@ index 0000000000..d40f3f2fd6
+ qemu_mutex_unlock(&backup_state.backup_mutex);
+
+ if (next_job) {
+ AioContext *aio_context = next_job->job.aio_context;
+ aio_context_acquire(aio_context);
+ job_cancel_sync(&next_job->job);
+ aio_context_release(aio_context);
+ job_cancel_sync(&next_job->job, true);
+ } else {
+ break;
+ }
@@ -933,7 +923,7 @@ index 0000000000..d40f3f2fd6
+ goto out;
+}
+
+bool job_should_pause(Job *job);
+bool job_should_pause_locked(Job *job);
+
+static void pvebackup_run_next_job(void)
+{
@@ -951,18 +941,16 @@ index 0000000000..d40f3f2fd6
+ if (job) {
+ qemu_mutex_unlock(&backup_state.backup_mutex);
+
+ AioContext *aio_context = job->job.aio_context;
+ aio_context_acquire(aio_context);
+
+ if (job_should_pause(&job->job)) {
+ bool error_or_canceled = pvebackup_error_or_canceled();
+ if (error_or_canceled) {
+ job_cancel_sync(&job->job);
+ } else {
+ job_resume(&job->job);
+ WITH_JOB_LOCK_GUARD() {
+ if (job_should_pause_locked(&job->job)) {
+ bool error_or_canceled = pvebackup_error_or_canceled();
+ if (error_or_canceled) {
+ job_cancel_sync_locked(&job->job, true);
+ } else {
+ job_resume_locked(&job->job);
+ }
+ }
+ }
+ aio_context_release(aio_context);
+ return;
+ }
+ }
@@ -978,6 +966,8 @@ index 0000000000..d40f3f2fd6
+
+ Error *local_err = NULL;
+
+ BackupPerf perf = { .max_workers = 16 };
+
+ /* create and start all jobs (paused state) */
+ GList *l = backup_state.di_list;
+ while (l) {
@@ -991,8 +981,8 @@ index 0000000000..d40f3f2fd6
+
+ BlockJob *job = backup_job_create(
+ NULL, di->bs, di->target, backup_state.speed, MIRROR_SYNC_MODE_FULL, NULL,
+ BITMAP_SYNC_MODE_NEVER, false, NULL, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
+ JOB_DEFAULT, pvebackup_complete_cb, di, 1, NULL, &local_err);
+ BITMAP_SYNC_MODE_NEVER, false, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
+ JOB_DEFAULT, pvebackup_complete_cb, di, NULL, &local_err);
+
+ aio_context_release(aio_context);
+
@@ -1144,7 +1134,7 @@ index 0000000000..d40f3f2fd6
+
+ ssize_t size = bdrv_getlength(di->bs);
+ if (size < 0) {
+ error_setg_errno(task->errp, -di->size, "bdrv_getlength failed");
+ error_setg_errno(task->errp, -size, "bdrv_getlength failed");
+ goto err;
+ }
+ di->size = size;
@@ -1472,12 +1462,12 @@ index 0000000000..d40f3f2fd6
+ return info;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d285622589..9054db608c 100644
index 9e902b96bb..c3b6b93472 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -745,6 +745,115 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'] }
@@ -740,6 +740,115 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'],
'allow-preconfig': true }
+##
+# @BackupStatus:
@@ -1592,13 +1582,13 @@ index d285622589..9054db608c 100644
# @BlockDeviceTimedStats:
#
diff --git a/qapi/common.json b/qapi/common.json
index 716712d4b3..556dab79e1 100644
index 356db3f670..aae8a3b682 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -145,3 +145,16 @@
@@ -206,3 +206,16 @@
##
{ 'enum': 'PCIELinkWidth',
'data': [ '1', '2', '4', '8', '12', '16', '32' ] }
{ 'struct': 'HumanReadableText',
'data': { 'human-readable-text': 'str' } }
+
+##
+# @UuidInfo:
@@ -1613,7 +1603,7 @@ index 716712d4b3..556dab79e1 100644
+##
+{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} }
diff --git a/qapi/machine.json b/qapi/machine.json
index 32fc674042..145f1a4fa2 100644
index f4fb1b2c9c..0d6ee836ed 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -4,6 +4,8 @@
@@ -1625,7 +1615,7 @@ index 32fc674042..145f1a4fa2 100644
##
# = Machines
##
@@ -406,19 +408,6 @@
@@ -226,19 +228,6 @@
##
{ 'command': 'query-target', 'returns': 'TargetInfo' }
@@ -1636,7 +1626,7 @@ index 32fc674042..145f1a4fa2 100644
-#
-# @UUID: the UUID of the guest
-#
-# Since: 0.14.0
-# Since: 0.14
-#
-# Notes: If no UUID was specified for the guest, a null UUID is returned.
-##

View File

@@ -7,19 +7,19 @@ Subject: [PATCH] PVE-Backup: pbs-restore - new command to restore from proxmox
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
meson.build | 4 +
pbs-restore.c | 224 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 228 insertions(+)
pbs-restore.c | 223 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 227 insertions(+)
create mode 100644 pbs-restore.c
diff --git a/meson.build b/meson.build
index 3094f98c47..6f1fafee14 100644
index 406112d96f..9c46881eb7 100644
--- a/meson.build
+++ b/meson.build
@@ -1913,6 +1913,10 @@ if have_tools
vma = executable('vma', files('vma.c', 'vma-reader.c'),
@@ -3604,6 +3604,10 @@ if have_tools
vma = executable('vma', files('vma.c', 'vma-reader.c') + genh,
dependencies: [authz, block, crypto, io, qom], install: true)
+ pbs_restore = executable('pbs-restore', files('pbs-restore.c'),
+ pbs_restore = executable('pbs-restore', files('pbs-restore.c') + genh,
+ dependencies: [authz, block, crypto, io, qom,
+ libproxmox_backup_qemu], install: true)
+
@@ -28,10 +28,10 @@ index 3094f98c47..6f1fafee14 100644
subdir('contrib/elf2dmp')
diff --git a/pbs-restore.c b/pbs-restore.c
new file mode 100644
index 0000000000..4d3f925a1b
index 0000000000..2f834cf42e
--- /dev/null
+++ b/pbs-restore.c
@@ -0,0 +1,224 @@
@@ -0,0 +1,223 @@
+/*
+ * Qemu image restore helper for Proxmox Backup
+ *
@@ -50,7 +50,6 @@ index 0000000000..4d3f925a1b
+#include <getopt.h>
+#include <string.h>
+
+#include "qemu-common.h"
+#include "qemu/module.h"
+#include "qemu/error-report.h"
+#include "qemu/main-loop.h"
@@ -96,7 +95,7 @@ index 0000000000..4d3f925a1b
+ }
+ res = blk_pwrite_zeroes(callback_data->target, offset, data_len, 0);
+ } else {
+ res = blk_pwrite(callback_data->target, offset, data, data_len, 0);
+ res = blk_pwrite(callback_data->target, offset, data_len, data, 0);
+ }
+
+ if (res < 0) {

View File

@@ -29,10 +29,10 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
6 files changed, 142 insertions(+), 23 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 9ba7c774a2..056d14deee 100644
index 60fa93c85e..8b23bdedac 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1039,6 +1039,7 @@ void hmp_backup(Monitor *mon, const QDict *qdict)
@@ -1044,6 +1044,7 @@ void hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS fingerprint
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
@@ -41,10 +41,10 @@ index 9ba7c774a2..056d14deee 100644
false, NULL, false, NULL, !!devlist,
devlist, qdict_haskey(qdict, "speed"), speed, &error);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 182e79c943..604026bb37 100644
index a40b25e906..670f783515 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -221,19 +221,42 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
@@ -225,19 +225,42 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "End time: %s", ctime(&info->end_time));
}
@@ -132,7 +132,7 @@ index 1dda8b7d8f..8cbf645b2c 100644
diff --git a/pve-backup.c b/pve-backup.c
index d40f3f2fd6..1cd9d31d7c 100644
index 6af212b9b4..3f97cf6532 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -28,6 +28,8 @@
@@ -162,7 +162,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
BlockDriverState *target;
} PVEBackupDevInfo;
@@ -105,11 +110,12 @@ static bool pvebackup_error_or_canceled(void)
@@ -107,11 +112,12 @@ static bool pvebackup_error_or_canceled(void)
return error_or_canceled;
}
@@ -176,7 +176,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
qemu_mutex_unlock(&backup_state.stat.lock);
}
@@ -148,7 +154,8 @@ pvebackup_co_dump_pbs_cb(
@@ -150,7 +156,8 @@ pvebackup_co_dump_pbs_cb(
pvebackup_propagate_error(local_err);
return pbs_res;
} else {
@@ -186,7 +186,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
}
return size;
@@ -208,11 +215,11 @@ pvebackup_co_dump_vma_cb(
@@ -210,11 +217,11 @@ pvebackup_co_dump_vma_cb(
} else {
if (remaining >= VMA_CLUSTER_SIZE) {
assert(ret == VMA_CLUSTER_SIZE);
@@ -200,7 +200,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
remaining = 0;
}
}
@@ -248,6 +255,18 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
@@ -250,6 +257,18 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
if (local_err != NULL) {
pvebackup_propagate_error(local_err);
}
@@ -219,7 +219,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
}
proxmox_backup_disconnect(backup_state.pbs);
@@ -303,6 +322,12 @@ static void pvebackup_complete_cb(void *opaque, int ret)
@@ -305,6 +324,12 @@ static void pvebackup_complete_cb(void *opaque, int ret)
// remove self from job queue
backup_state.di_list = g_list_remove(backup_state.di_list, di);
@@ -232,7 +232,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
g_free(di);
qemu_mutex_unlock(&backup_state.backup_mutex);
@@ -470,12 +495,18 @@ static bool create_backup_jobs(void) {
@@ -469,12 +494,18 @@ static bool create_backup_jobs(void) {
assert(di->target != NULL);
@@ -247,13 +247,13 @@ index d40f3f2fd6..1cd9d31d7c 100644
BlockJob *job = backup_job_create(
- NULL, di->bs, di->target, backup_state.speed, MIRROR_SYNC_MODE_FULL, NULL,
- BITMAP_SYNC_MODE_NEVER, false, NULL, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
- BITMAP_SYNC_MODE_NEVER, false, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
+ NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ bitmap_mode, false, NULL, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
JOB_DEFAULT, pvebackup_complete_cb, di, 1, NULL, &local_err);
+ bitmap_mode, false, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
JOB_DEFAULT, pvebackup_complete_cb, di, NULL, &local_err);
aio_context_release(aio_context);
@@ -526,6 +557,8 @@ typedef struct QmpBackupTask {
@@ -525,6 +556,8 @@ typedef struct QmpBackupTask {
const char *fingerprint;
bool has_fingerprint;
int64_t backup_time;
@@ -262,7 +262,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
bool has_format;
BackupFormat format;
bool has_config_file;
@@ -617,6 +650,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -616,6 +649,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
size_t total = 0;
@@ -270,7 +270,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
l = di_list;
while (l) {
@@ -654,6 +688,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -653,6 +687,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
int dump_cb_block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE; // Hardcoded (4M)
firewall_name = "fw.conf";
@@ -279,7 +279,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
char *pbs_err = NULL;
pbs = proxmox_backup_new(
task->backup_file,
@@ -673,7 +709,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -672,7 +708,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
goto err;
}
@@ -289,7 +289,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
goto err;
/* register all devices */
@@ -684,9 +721,40 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -683,9 +720,40 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *devname = bdrv_get_device_name(di->bs);
@@ -332,7 +332,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
if (!(di->target = bdrv_backup_dump_create(dump_cb_block_size, di->size, pvebackup_co_dump_pbs_cb, di, task->errp))) {
goto err;
@@ -695,6 +763,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -694,6 +762,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->dev_id = dev_id;
}
} else if (format == BACKUP_FORMAT_VMA) {
@@ -341,7 +341,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
vmaw = vma_writer_create(task->backup_file, uuid, &local_err);
if (!vmaw) {
if (local_err) {
@@ -722,6 +792,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -721,6 +791,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
}
} else if (format == BACKUP_FORMAT_DIR) {
@@ -350,7 +350,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
if (mkdir(task->backup_file, 0640) != 0) {
error_setg_errno(task->errp, errno, "can't create directory '%s'\n",
task->backup_file);
@@ -794,8 +866,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -793,8 +865,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
backup_state.stat.total = total;
@@ -361,7 +361,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -819,6 +893,10 @@ err:
@@ -818,6 +892,10 @@ err:
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -372,7 +372,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
if (di->target) {
bdrv_unref(di->target);
}
@@ -860,6 +938,7 @@ UuidInfo *qmp_backup(
@@ -859,6 +937,7 @@ UuidInfo *qmp_backup(
bool has_fingerprint, const char *fingerprint,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
@@ -380,7 +380,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
bool has_format, BackupFormat format,
bool has_config_file, const char *config_file,
bool has_firewall_file, const char *firewall_file,
@@ -878,6 +957,8 @@ UuidInfo *qmp_backup(
@@ -877,6 +956,8 @@ UuidInfo *qmp_backup(
.backup_id = backup_id,
.has_backup_time = has_backup_time,
.backup_time = backup_time,
@@ -389,7 +389,7 @@ index d40f3f2fd6..1cd9d31d7c 100644
.has_format = has_format,
.format = format,
.has_config_file = has_config_file,
@@ -946,10 +1027,14 @@ BackupStatus *qmp_query_backup(Error **errp)
@@ -945,10 +1026,14 @@ BackupStatus *qmp_query_backup(Error **errp)
info->has_total = true;
info->total = backup_state.stat.total;
@@ -405,10 +405,10 @@ index d40f3f2fd6..1cd9d31d7c 100644
qemu_mutex_unlock(&backup_state.stat.lock);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9054db608c..d4e1c98c50 100644
index c3b6b93472..992e6c1e3f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -758,8 +758,13 @@
@@ -753,8 +753,13 @@
#
# @total: total amount of bytes involved in the backup process
#
@@ -422,7 +422,7 @@ index 9054db608c..d4e1c98c50 100644
# @zero-bytes: amount of 'zero' bytes detected.
#
# @start-time: time (epoch) when backup job started.
@@ -772,8 +777,8 @@
@@ -767,8 +772,8 @@
#
##
{ 'struct': 'BackupStatus',
@@ -433,7 +433,7 @@ index 9054db608c..d4e1c98c50 100644
'*start-time': 'int', '*end-time': 'int',
'*backup-file': 'str', '*uuid': 'str' } }
@@ -816,6 +821,8 @@
@@ -811,6 +816,8 @@
#
# @backup-time: backup timestamp (Unix epoch, required for format 'pbs')
#
@@ -442,7 +442,7 @@ index 9054db608c..d4e1c98c50 100644
# Returns: the uuid of the backup job
#
##
@@ -826,6 +833,7 @@
@@ -821,6 +828,7 @@
'*fingerprint': 'str',
'*backup-id': 'str',
'*backup-time': 'int',

View File

@@ -11,6 +11,7 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
PVE: add zero block handling to PBS dump callback
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 4 ++-
pve-backup.c | 57 +++++++++++++++++++++++++++-------
@@ -18,10 +19,10 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
3 files changed, 54 insertions(+), 13 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 056d14deee..46c63b1cf9 100644
index 8b23bdedac..f59b02592e 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1039,7 +1039,9 @@ void hmp_backup(Monitor *mon, const QDict *qdict)
@@ -1044,7 +1044,9 @@ void hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS fingerprint
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
@@ -33,7 +34,7 @@ index 056d14deee..46c63b1cf9 100644
false, NULL, false, NULL, !!devlist,
devlist, qdict_haskey(qdict, "speed"), speed, &error);
diff --git a/pve-backup.c b/pve-backup.c
index 1cd9d31d7c..b8182aaf89 100644
index 3f97cf6532..a275a1d4e1 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -8,6 +8,7 @@
@@ -52,7 +53,7 @@ index 1cd9d31d7c..b8182aaf89 100644
uint8_t dev_id;
bool completed;
char targetfile[PATH_MAX];
@@ -135,10 +137,13 @@ pvebackup_co_dump_pbs_cb(
@@ -137,10 +139,13 @@ pvebackup_co_dump_pbs_cb(
PVEBackupDevInfo *di = opaque;
assert(backup_state.pbs);
@@ -66,7 +67,7 @@ index 1cd9d31d7c..b8182aaf89 100644
qemu_co_mutex_lock(&backup_state.dump_callback_mutex);
// avoid deadlock if job is cancelled
@@ -147,17 +152,29 @@ pvebackup_co_dump_pbs_cb(
@@ -149,17 +154,29 @@ pvebackup_co_dump_pbs_cb(
return -1;
}
@@ -104,7 +105,7 @@ index 1cd9d31d7c..b8182aaf89 100644
return size;
}
@@ -178,6 +195,7 @@ pvebackup_co_dump_vma_cb(
@@ -180,6 +197,7 @@ pvebackup_co_dump_vma_cb(
int ret = -1;
assert(backup_state.vmaw);
@@ -112,7 +113,7 @@ index 1cd9d31d7c..b8182aaf89 100644
uint64_t remaining = size;
@@ -204,9 +222,7 @@ pvebackup_co_dump_vma_cb(
@@ -206,9 +224,7 @@ pvebackup_co_dump_vma_cb(
qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
++cluster_num;
@@ -123,7 +124,7 @@ index 1cd9d31d7c..b8182aaf89 100644
if (ret < 0) {
Error *local_err = NULL;
vma_writer_error_propagate(backup_state.vmaw, &local_err);
@@ -567,6 +583,10 @@ typedef struct QmpBackupTask {
@@ -566,6 +582,10 @@ typedef struct QmpBackupTask {
const char *firewall_file;
bool has_devlist;
const char *devlist;
@@ -134,7 +135,7 @@ index 1cd9d31d7c..b8182aaf89 100644
bool has_speed;
int64_t speed;
Error **errp;
@@ -690,6 +710,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -689,6 +709,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
bool use_dirty_bitmap = task->has_use_dirty_bitmap && task->use_dirty_bitmap;
@@ -142,7 +143,7 @@ index 1cd9d31d7c..b8182aaf89 100644
char *pbs_err = NULL;
pbs = proxmox_backup_new(
task->backup_file,
@@ -699,8 +720,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -698,8 +719,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
task->has_password ? task->password : NULL,
task->has_keyfile ? task->keyfile : NULL,
task->has_key_password ? task->key_password : NULL,
@@ -154,7 +155,7 @@ index 1cd9d31d7c..b8182aaf89 100644
if (!pbs) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
@@ -719,6 +742,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -718,6 +741,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -163,7 +164,7 @@ index 1cd9d31d7c..b8182aaf89 100644
const char *devname = bdrv_get_device_name(di->bs);
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
@@ -939,6 +964,8 @@ UuidInfo *qmp_backup(
@@ -938,6 +963,8 @@ UuidInfo *qmp_backup(
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
bool has_use_dirty_bitmap, bool use_dirty_bitmap,
@@ -172,7 +173,7 @@ index 1cd9d31d7c..b8182aaf89 100644
bool has_format, BackupFormat format,
bool has_config_file, const char *config_file,
bool has_firewall_file, const char *firewall_file,
@@ -949,6 +976,8 @@ UuidInfo *qmp_backup(
@@ -948,6 +975,8 @@ UuidInfo *qmp_backup(
.backup_file = backup_file,
.has_password = has_password,
.password = password,
@@ -181,7 +182,7 @@ index 1cd9d31d7c..b8182aaf89 100644
.has_key_password = has_key_password,
.key_password = key_password,
.has_fingerprint = has_fingerprint,
@@ -959,6 +988,10 @@ UuidInfo *qmp_backup(
@@ -958,6 +987,10 @@ UuidInfo *qmp_backup(
.backup_time = backup_time,
.has_use_dirty_bitmap = has_use_dirty_bitmap,
.use_dirty_bitmap = use_dirty_bitmap,
@@ -193,10 +194,10 @@ index 1cd9d31d7c..b8182aaf89 100644
.format = format,
.has_config_file = has_config_file,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d4e1c98c50..0fda1e3fd3 100644
index 992e6c1e3f..5ac6276dc1 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -823,6 +823,10 @@
@@ -818,6 +818,10 @@
#
# @use-dirty-bitmap: use dirty bitmap to detect incremental changes since last job (optional for format 'pbs')
#
@@ -207,7 +208,7 @@ index d4e1c98c50..0fda1e3fd3 100644
# Returns: the uuid of the backup job
#
##
@@ -834,6 +838,8 @@
@@ -829,6 +833,8 @@
'*backup-id': 'str',
'*backup-time': 'int',
'*use-dirty-bitmap': 'bool',

View File

@@ -6,20 +6,25 @@ Subject: [PATCH] PVE: Add PBS block driver to map backup archives into VMs
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[error cleanups, file_open implementation]
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures
make pbs_co_preadv return values consistent with QEMU]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 3 +
block/pbs.c | 271 +++++++++++++++++++++++++++++++++++++++++++
block/pbs.c | 276 +++++++++++++++++++++++++++++++++++++++++++
configure | 9 ++
meson.build | 1 +
qapi/block-core.json | 14 ++-
5 files changed, 297 insertions(+), 1 deletion(-)
meson.build | 2 +-
qapi/block-core.json | 13 ++
qapi/pragma.json | 1 +
6 files changed, 303 insertions(+), 1 deletion(-)
create mode 100644 block/pbs.c
diff --git a/block/meson.build b/block/meson.build
index dfae565db3..a070060e53 100644
index e995ae72b9..7ef2fa72d5 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -49,6 +49,9 @@ block_ss.add(files(
@@ -53,6 +53,9 @@ block_ss.add(files(
'../pve-backup.c',
), libproxmox_backup_qemu)
@@ -28,13 +33,13 @@ index dfae565db3..a070060e53 100644
+
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
softmmu_ss.add(files('block-ram-registrar.c'))
diff --git a/block/pbs.c b/block/pbs.c
new file mode 100644
index 0000000000..1481a2bfd1
index 0000000000..9d1f1f39d4
--- /dev/null
+++ b/block/pbs.c
@@ -0,0 +1,271 @@
@@ -0,0 +1,276 @@
+/*
+ * Proxmox Backup Server read-only block driver
+ */
@@ -231,20 +236,25 @@ index 0000000000..1481a2bfd1
+}
+
+static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVPBSState *s = bs->opaque;
+ int ret;
+ char *pbs_error = NULL;
+ uint8_t *buf = malloc(bytes);
+
+ if (offset < 0 || bytes < 0) {
+ fprintf(stderr, "unexpected negative 'offset' or 'bytes' value!\n");
+ return -EIO;
+ }
+
+ ReadCallbackData rcb = {
+ .co = qemu_coroutine_self(),
+ .ctx = bdrv_get_aio_context(bs),
+ };
+
+ proxmox_restore_read_image_at_async(s->conn, s->aid, buf, offset, bytes,
+ proxmox_restore_read_image_at_async(s->conn, s->aid, buf, (uint64_t)offset, (uint64_t)bytes,
+ read_callback, (void *) &rcb, &ret, &pbs_error);
+
+ qemu_coroutine_yield();
@@ -258,12 +268,12 @@ index 0000000000..1481a2bfd1
+ qemu_iovec_from_buf(qiov, 0, buf, bytes);
+ free(buf);
+
+ return ret;
+ return 0;
+}
+
+static coroutine_fn int pbs_co_pwritev(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes,
+ QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ fprintf(stderr, "pbs-bdrv: cannot write to backup file, make sure "
+ "any attached disk devices are set to read-only!\n");
@@ -307,72 +317,72 @@ index 0000000000..1481a2bfd1
+
+block_init(bdrv_pbs_init);
diff --git a/configure b/configure
index 18c26e0389..33d9933871 100755
index 5f1828f1ec..c9b70b5e9a 100755
--- a/configure
+++ b/configure
@@ -436,6 +436,7 @@ vvfat="yes"
qed="yes"
parallels="yes"
sheepdog="no"
@@ -285,6 +285,7 @@ linux_user=""
bsd_user=""
pie=""
coroutine=""
+pbs_bdrv="yes"
libxml2=""
debug_mutex="no"
libpmem=""
@@ -1461,6 +1462,10 @@ for opt do
;;
--enable-sheepdog) sheepdog="yes"
plugins="$default_feature"
meson=""
ninja=""
@@ -864,6 +865,10 @@ for opt do
--enable-uuid|--disable-uuid)
echo "$0: $opt is obsolete, UUID support is always built" >&2
;;
+ --disable-pbs-bdrv) pbs_bdrv="no"
+ ;;
+ --enable-pbs-bdrv) pbs_bdrv="yes"
+ ;;
--disable-vhost-user) vhost_user="no"
--with-git=*) git="$optarg"
;;
--enable-vhost-user) vhost_user="yes"
@@ -1843,6 +1848,7 @@ disabled with --disable-FEATURE, default is enabled if available:
qed qed image format support
parallels parallels image format support
sheepdog sheepdog block driver support (deprecated)
--with-git-submodules=*)
@@ -1049,6 +1054,7 @@ cat << EOF
debug-info debugging information
safe-stack SafeStack Stack Smash Protection. Depends on
clang/llvm >= 3.7 and requires coroutine backend ucontext.
+ pbs-bdrv Proxmox backup server read-only block driver support
crypto-afalg Linux AF_ALG crypto backend driver
capstone capstone disassembler support
debug-mutex mutex debugging support
@@ -6682,6 +6688,9 @@ if test "$sheepdog" = "yes" ; then
add_to deprecated_features "sheepdog"
echo "CONFIG_SHEEPDOG=y" >> $config_host_mak
NOTE: The object files are built at the place where configure is launched
EOF
@@ -2372,6 +2378,9 @@ echo "TARGET_DIRS=$target_list" >> $config_host_mak
if test "$modules" = "yes"; then
echo "CONFIG_MODULES=y" >> $config_host_mak
fi
+if test "$pbs_bdrv" = "yes" ; then
+ echo "CONFIG_PBS_BDRV=y" >> $config_host_mak
+fi
if test "$pty_h" = "yes" ; then
echo "HAVE_PTY_H=y" >> $config_host_mak
fi
# XXX: suppress that
if [ "$bsd" = "yes" ] ; then
diff --git a/meson.build b/meson.build
index 6f1fafee14..4d156d35ce 100644
index 9c46881eb7..93ebda47af 100644
--- a/meson.build
+++ b/meson.build
@@ -2199,6 +2199,7 @@ summary_info += {'vvfat support': config_host.has_key('CONFIG_VVFAT')}
summary_info += {'qed support': config_host.has_key('CONFIG_QED')}
summary_info += {'parallels support': config_host.has_key('CONFIG_PARALLELS')}
summary_info += {'sheepdog support': config_host.has_key('CONFIG_SHEEPDOG')}
@@ -3980,7 +3980,7 @@ summary_info += {'bzip2 support': libbzip2}
summary_info += {'lzfse support': liblzfse}
summary_info += {'zstd support': zstd}
summary_info += {'NUMA host support': numa}
-summary_info += {'capstone': capstone}
+summary_info += {'PBS bdrv support': config_host.has_key('CONFIG_PBS_BDRV')}
summary_info += {'capstone': capstone_opt == 'disabled' ? false : capstone_opt}
summary_info += {'libpmem support': config_host.has_key('CONFIG_LIBPMEM')}
summary_info += {'libdaxctl support': config_host.has_key('CONFIG_LIBDAXCTL')}
summary_info += {'libpmem support': libpmem}
summary_info += {'libdaxctl support': libdaxctl}
summary_info += {'libudev': libudev}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0fda1e3fd3..553112d998 100644
index 5ac6276dc1..45b63dfe26 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2975,7 +2975,7 @@
'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
{ 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' },
- 'sheepdog',
+ 'sheepdog', 'pbs',
'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
##
@@ -3039,6 +3039,17 @@
@@ -3103,6 +3103,7 @@
'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
+ 'pbs',
'ssh', 'throttle', 'vdi', 'vhdx',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
@@ -3179,6 +3180,17 @@
{ 'struct': 'BlockdevOptionsNull',
'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } }
@@ -390,11 +400,23 @@ index 0fda1e3fd3..553112d998 100644
##
# @BlockdevOptionsNVMe:
#
@@ -4148,6 +4159,7 @@
@@ -4531,6 +4543,7 @@
'nfs': 'BlockdevOptionsNfs',
'null-aio': 'BlockdevOptionsNull',
'null-co': 'BlockdevOptionsNull',
+ 'pbs': 'BlockdevOptionsPbs',
'nvme': 'BlockdevOptionsNVMe',
'parallels': 'BlockdevOptionsGenericFormat',
'qcow2': 'BlockdevOptionsQcow2',
'nvme-io_uring': { 'type': 'BlockdevOptionsNvmeIoUring',
'if': 'CONFIG_BLKIO' },
diff --git a/qapi/pragma.json b/qapi/pragma.json
index f2097b9020..5ab1890519 100644
--- a/qapi/pragma.json
+++ b/qapi/pragma.json
@@ -47,6 +47,7 @@
'BlockInfo', # query-block
'BlockdevAioOptions', # blockdev-add, -blockdev
'BlockdevDriver', # blockdev-add, query-blockstats, ...
+ 'BlockdevOptionsPbs', # for PBS backwards compat
'BlockdevVmdkAdapterType', # blockdev-create (to match VMDK spec)
'BlockdevVmdkSubformat', # blockdev-create (to match VMDK spec)
'ColoCompareProperties', # object_add, -object

View File

@@ -16,10 +16,10 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2 files changed, 38 insertions(+)
diff --git a/pve-backup.c b/pve-backup.c
index b8182aaf89..98e79552ef 100644
index a275a1d4e1..b10373cd8a 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1073,3 +1073,12 @@ BackupStatus *qmp_query_backup(Error **errp)
@@ -1072,3 +1072,12 @@ BackupStatus *qmp_query_backup(Error **errp)
return info;
}
@@ -33,10 +33,10 @@ index b8182aaf89..98e79552ef 100644
+ return ret;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 553112d998..f3608390c4 100644
index 45b63dfe26..8b0e0d92de 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -868,6 +868,35 @@
@@ -863,6 +863,35 @@
##
{ 'command': 'backup-cancel' }

View File

@@ -7,6 +7,7 @@ Returns advanced information about dirty bitmaps used (or not used) for
the latest PBS backup.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
monitor/hmp-cmds.c | 28 ++++++-----
pve-backup.c | 117 ++++++++++++++++++++++++++++++++-----------
@@ -14,10 +15,10 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
3 files changed, 159 insertions(+), 42 deletions(-)
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 604026bb37..95f4e7f5c1 100644
index 670f783515..d819e5fc36 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -198,6 +198,7 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
@@ -202,6 +202,7 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict)
void hmp_info_backup(Monitor *mon, const QDict *qdict)
{
BackupStatus *info;
@@ -25,7 +26,7 @@ index 604026bb37..95f4e7f5c1 100644
info = qmp_query_backup(NULL);
@@ -228,26 +229,29 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
@@ -232,26 +233,29 @@ void hmp_info_backup(Monitor *mon, const QDict *qdict)
// this should not happen normally
monitor_printf(mon, "Total size: %d\n", 0);
} else {
@@ -68,7 +69,7 @@ index 604026bb37..95f4e7f5c1 100644
info->zero_bytes, zero_per);
diff --git a/pve-backup.c b/pve-backup.c
index 98e79552ef..8305105fd5 100644
index b10373cd8a..8ae50e06c3 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -46,6 +46,7 @@ static struct PVEBackupState {
@@ -79,7 +80,7 @@ index 98e79552ef..8305105fd5 100644
} stat;
int64_t speed;
VmaWriter *vmaw;
@@ -670,7 +671,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -669,7 +670,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
size_t total = 0;
@@ -87,7 +88,7 @@ index 98e79552ef..8305105fd5 100644
l = di_list;
while (l) {
@@ -691,18 +691,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -690,18 +690,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_generate(uuid);
@@ -124,7 +125,7 @@ index 98e79552ef..8305105fd5 100644
}
int dump_cb_block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE; // Hardcoded (4M)
@@ -729,12 +744,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -728,12 +743,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
"proxmox_backup_new failed: %s", pbs_err);
proxmox_backup_free_error(pbs_err);
@@ -139,7 +140,7 @@ index 98e79552ef..8305105fd5 100644
/* register all devices */
l = di_list;
@@ -745,6 +760,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -744,6 +759,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->block_size = dump_cb_block_size;
const char *devname = bdrv_get_device_name(di->bs);
@@ -148,7 +149,7 @@ index 98e79552ef..8305105fd5 100644
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
bool expect_only_dirty = false;
@@ -753,49 +770,59 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -752,49 +769,59 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (bitmap == NULL) {
bitmap = bdrv_create_dirty_bitmap(di->bs, dump_cb_block_size, PBS_BITMAP_NAME, task->errp);
if (!bitmap) {
@@ -218,7 +219,7 @@ index 98e79552ef..8305105fd5 100644
}
/* register all devices for vma writer */
@@ -805,7 +832,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -804,7 +831,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
l = g_list_next(l);
if (!(di->target = bdrv_backup_dump_create(VMA_CLUSTER_SIZE, di->size, pvebackup_co_dump_vma_cb, di, task->errp))) {
@@ -227,7 +228,7 @@ index 98e79552ef..8305105fd5 100644
}
const char *devname = bdrv_get_device_name(di->bs);
@@ -813,16 +840,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -812,16 +839,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (di->dev_id <= 0) {
error_set(task->errp, ERROR_CLASS_GENERIC_ERROR,
"register_stream failed");
@@ -246,7 +247,7 @@ index 98e79552ef..8305105fd5 100644
}
backup_dir = task->backup_file;
@@ -839,18 +864,18 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -838,18 +863,18 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
di->size, flags, false, &local_err);
if (local_err) {
error_propagate(task->errp, local_err);
@@ -268,7 +269,7 @@ index 98e79552ef..8305105fd5 100644
}
@@ -858,7 +883,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -857,7 +882,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (task->has_config_file) {
if (pvebackup_co_add_config(task->config_file, config_name, format, backup_dir,
vmaw, pbs, task->errp) != 0) {
@@ -277,7 +278,7 @@ index 98e79552ef..8305105fd5 100644
}
}
@@ -866,12 +891,11 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -865,12 +890,11 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (task->has_firewall_file) {
if (pvebackup_co_add_config(task->firewall_file, firewall_name, format, backup_dir,
vmaw, pbs, task->errp) != 0) {
@@ -292,7 +293,7 @@ index 98e79552ef..8305105fd5 100644
if (backup_state.stat.error) {
error_free(backup_state.stat.error);
@@ -891,10 +915,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -890,10 +914,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
char *uuid_str = g_strdup(backup_state.stat.uuid_str);
backup_state.stat.total = total;
@@ -304,7 +305,7 @@ index 98e79552ef..8305105fd5 100644
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -911,6 +934,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -910,6 +933,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
task->result = uuid_info;
return;
@@ -314,7 +315,7 @@ index 98e79552ef..8305105fd5 100644
err:
l = di_list;
@@ -1074,11 +1100,42 @@ BackupStatus *qmp_query_backup(Error **errp)
@@ -1073,11 +1099,42 @@ BackupStatus *qmp_query_backup(Error **errp)
return info;
}
@@ -358,10 +359,10 @@ index 98e79552ef..8305105fd5 100644
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index f3608390c4..f57fda122c 100644
index 8b0e0d92de..7fde927621 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -876,6 +876,8 @@
@@ -871,6 +871,8 @@
# @pbs-dirty-bitmap: True if dirty-bitmap-incremental backups to PBS are
# supported.
#
@@ -370,7 +371,7 @@ index f3608390c4..f57fda122c 100644
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -884,6 +886,7 @@
@@ -879,6 +881,7 @@
##
{ 'struct': 'ProxmoxSupportStatus',
'data': { 'pbs-dirty-bitmap': 'bool',
@@ -378,7 +379,7 @@ index f3608390c4..f57fda122c 100644
'pbs-dirty-bitmap-savevm': 'bool',
'pbs-library-version': 'str' } }
@@ -897,6 +900,59 @@
@@ -892,6 +895,59 @@
##
{ 'command': 'query-proxmox-support', 'returns': 'ProxmoxSupportStatus' }

View File

@@ -7,33 +7,34 @@ QEMU uses the logging for error messages usually, so LOG_ERR is most
fitting.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
meson.build | 2 ++
os-posix.c | 7 +++++--
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/meson.build b/meson.build
index 4d156d35ce..737ea9e5d7 100644
index 93ebda47af..d47ce5fe64 100644
--- a/meson.build
+++ b/meson.build
@@ -726,6 +726,7 @@ keyutils = dependency('libkeyutils', required: false,
@@ -1526,6 +1526,7 @@ keyutils = dependency('libkeyutils', required: false,
has_gettid = cc.has_function('gettid')
libuuid = cc.find_library('uuid', required: true)
+libsystemd = cc.find_library('systemd', required: true)
libproxmox_backup_qemu = cc.find_library('proxmox_backup_qemu', required: true)
# Malloc tests
@@ -1539,6 +1540,7 @@ blockdev_ss.add(files(
# os-posix.c contains POSIX-specific functions used by qemu-storage-daemon,
# os-win32.c does not
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
+blockdev_ss.add(when: 'CONFIG_POSIX', if_true: libsystemd)
softmmu_ss.add(when: 'CONFIG_WIN32', if_true: [files('os-win32.c')])
# libselinux
@@ -3094,6 +3095,7 @@ if have_block
# os-posix.c contains POSIX-specific functions used by qemu-storage-daemon,
# os-win32.c does not
blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
+ blockdev_ss.add(when: 'CONFIG_POSIX', if_true: libsystemd)
softmmu_ss.add(when: 'CONFIG_WIN32', if_true: [files('os-win32.c')])
endif
common_ss.add(files('cpus-common.c'))
diff --git a/os-posix.c b/os-posix.c
index 1de2839554..ac4f652923 100644
index 4858650c3e..c5cb12226a 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -28,6 +28,8 @@
@@ -43,15 +44,15 @@ index 1de2839554..ac4f652923 100644
+#include <systemd/sd-journal.h>
+#include <syslog.h>
#include "qemu-common.h"
/* Needed early for CONFIG_BSD etc. */
@@ -288,9 +290,10 @@ void os_setup_post(void)
#include "net/slirp.h"
@@ -287,9 +289,10 @@ void os_setup_post(void)
dup2(fd, 0);
dup2(fd, 1);
- /* In case -D is given do not redirect stderr to /dev/null */
+ /* In case -D is given do not redirect stderr to journal */
if (!qemu_logfile) {
if (!qemu_log_enabled()) {
- dup2(fd, 2);
+ int journal_fd = sd_journal_stream_fd("QEMU", LOG_ERR, 0);
+ dup2(journal_fd, 2);

View File

@@ -4,16 +4,17 @@ Date: Thu, 20 Aug 2020 14:31:59 +0200
Subject: [PATCH] PVE: Add sequential job transaction support
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
include/qemu/job.h | 12 ++++++++++++
job.c | 31 +++++++++++++++++++++++++++++++
2 files changed, 43 insertions(+)
job.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/include/qemu/job.h b/include/qemu/job.h
index 32aabb1c60..f7a6a0926a 100644
index e502787dd8..963cf2bef5 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -280,6 +280,18 @@ typedef enum JobCreateFlags {
@@ -381,6 +381,18 @@ void job_unlock(void);
*/
JobTxn *job_txn_new(void);
@@ -33,10 +34,10 @@ index 32aabb1c60..f7a6a0926a 100644
* Release a reference that was previously acquired with job_txn_add_job or
* job_txn_new. If it's the last reference to the object, it will be freed.
diff --git a/job.c b/job.c
index f9884e7d9d..05b7797e82 100644
index 93e22d180b..2b31f1e14f 100644
--- a/job.c
+++ b/job.c
@@ -72,6 +72,8 @@ struct JobTxn {
@@ -93,6 +93,8 @@ struct JobTxn {
/* Reference count */
int refcnt;
@@ -44,8 +45,8 @@ index f9884e7d9d..05b7797e82 100644
+ bool sequential;
};
/* Right now, this mutex is only needed to synchronize accesses to job->busy
@@ -102,6 +104,25 @@ JobTxn *job_txn_new(void)
void job_lock(void)
@@ -118,6 +120,25 @@ JobTxn *job_txn_new(void)
return txn;
}
@@ -68,20 +69,23 @@ index f9884e7d9d..05b7797e82 100644
+ job_start(first);
+}
+
static void job_txn_ref(JobTxn *txn)
/* Called with job_mutex held. */
static void job_txn_ref_locked(JobTxn *txn)
{
txn->refcnt++;
@@ -841,6 +862,9 @@ static void job_completed_txn_success(Job *job)
@@ -1057,6 +1078,12 @@ static void job_completed_txn_success_locked(Job *job)
*/
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
if (!job_is_completed(other_job)) {
if (!job_is_completed_locked(other_job)) {
+ if (txn->sequential) {
+ job_unlock();
+ /* Needs to be called without holding the job lock */
+ job_start(other_job);
+ job_lock();
+ }
return;
}
assert(other_job->ret == 0);
@@ -1011,6 +1035,13 @@ int job_finish_sync(Job *job, void (*finish)(Job *, Error **errp), Error **errp)
@@ -1268,6 +1295,13 @@ int job_finish_sync_locked(Job *job,
return -EBUSY;
}
@@ -89,9 +93,9 @@ index f9884e7d9d..05b7797e82 100644
+ * of cancelling, these have not begun work so job_enter won't do anything,
+ * let's ensure they are marked as ABORTING if required */
+ if (job->status == JOB_STATUS_CREATED && job->txn->sequential) {
+ job_update_rc(job);
+ job_update_rc_locked(job);
+ }
+
AIO_WAIT_WHILE(job->aio_context,
(job_enter(job), !job_is_completed(job)));
job_unlock();
AIO_WAIT_WHILE_UNLOCKED(job->aio_context,
(job_enter(job), !job_is_completed(job)));

View File

@@ -11,12 +11,16 @@ To keep the rate-limiting and IO impact from before, we use a sequential
transaction, so drives will still be backed up one after the other.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls
adapt for new job lock mechanism replacing AioContext locks]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 167 +++++++++++++++------------------------------------
1 file changed, 49 insertions(+), 118 deletions(-)
pve-backup.c | 163 ++++++++++++++++-----------------------------------
1 file changed, 50 insertions(+), 113 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index 8305105fd5..d7f2b2206f 100644
index 8ae50e06c3..eedac335ec 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -52,6 +52,7 @@ static struct PVEBackupState {
@@ -27,7 +31,7 @@ index 8305105fd5..d7f2b2206f 100644
QemuMutex backup_mutex;
CoMutex dump_callback_mutex;
} backup_state;
@@ -71,32 +72,12 @@ typedef struct PVEBackupDevInfo {
@@ -71,34 +72,12 @@ typedef struct PVEBackupDevInfo {
size_t size;
uint64_t block_size;
uint8_t dev_id;
@@ -44,14 +48,16 @@ index 8305105fd5..d7f2b2206f 100644
-lookup_active_block_job(PVEBackupDevInfo *di)
-{
- if (!di->completed && di->bs) {
- for (BlockJob *job = block_job_next(NULL); job; job = block_job_next(job)) {
- if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
- continue;
- }
- WITH_JOB_LOCK_GUARD() {
- for (BlockJob *job = block_job_next_locked(NULL); job; job = block_job_next_locked(job)) {
- if (job->job.driver->job_type != JOB_TYPE_BACKUP) {
- continue;
- }
-
- BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
- if (bjob && bjob->source_bs == di->bs) {
- return job;
- BackupBlockJob *bjob = container_of(job, BackupBlockJob, common);
- if (bjob && bjob->source_bs == di->bs) {
- return job;
- }
- }
- }
- }
@@ -61,7 +67,7 @@ index 8305105fd5..d7f2b2206f 100644
static void pvebackup_propagate_error(Error *err)
{
qemu_mutex_lock(&backup_state.stat.lock);
@@ -272,18 +253,6 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
@@ -274,18 +253,6 @@ static void coroutine_fn pvebackup_co_cleanup(void *unused)
if (local_err != NULL) {
pvebackup_propagate_error(local_err);
}
@@ -80,7 +86,7 @@ index 8305105fd5..d7f2b2206f 100644
}
proxmox_backup_disconnect(backup_state.pbs);
@@ -322,8 +291,6 @@ static void pvebackup_complete_cb(void *opaque, int ret)
@@ -324,8 +291,6 @@ static void pvebackup_complete_cb(void *opaque, int ret)
qemu_mutex_lock(&backup_state.backup_mutex);
@@ -89,7 +95,7 @@ index 8305105fd5..d7f2b2206f 100644
if (ret < 0) {
Error *local_err = NULL;
error_setg(&local_err, "job failed with err %d - %s", ret, strerror(-ret));
@@ -336,20 +303,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
@@ -338,20 +303,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
block_on_coroutine_fn(pvebackup_complete_stream, di);
@@ -116,7 +122,7 @@ index 8305105fd5..d7f2b2206f 100644
}
static void pvebackup_cancel(void)
@@ -371,36 +335,28 @@ static void pvebackup_cancel(void)
@@ -373,32 +335,28 @@ static void pvebackup_cancel(void)
proxmox_backup_abort(backup_state.pbs, "backup canceled");
}
@@ -125,13 +131,6 @@ index 8305105fd5..d7f2b2206f 100644
- for(;;) {
-
- BlockJob *next_job = NULL;
-
- qemu_mutex_lock(&backup_state.backup_mutex);
-
- GList *l = backup_state.di_list;
- while (l) {
- PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
- l = g_list_next(l);
+ /* it's enough to cancel one job in the transaction, the rest will follow
+ * automatically */
+ GList *bdi = g_list_first(backup_state.di_list);
@@ -139,15 +138,23 @@ index 8305105fd5..d7f2b2206f 100644
+ ((PVEBackupDevInfo *)bdi->data)->job :
+ NULL;
- qemu_mutex_lock(&backup_state.backup_mutex);
-
- GList *l = backup_state.di_list;
- while (l) {
- PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
- l = g_list_next(l);
-
- BlockJob *job = lookup_active_block_job(di);
- if (job != NULL) {
- next_job = job;
- break;
- }
- }
+ /* ref the job before releasing the mutex, just to be safe */
+ if (cancel_job) {
+ job_ref(&cancel_job->job);
+ WITH_JOB_LOCK_GUARD() {
+ job_ref_locked(&cancel_job->job);
}
+ }
- qemu_mutex_unlock(&backup_state.backup_mutex);
@@ -156,27 +163,21 @@ index 8305105fd5..d7f2b2206f 100644
+ qemu_mutex_unlock(&backup_state.backup_mutex);
- if (next_job) {
- AioContext *aio_context = next_job->job.aio_context;
- aio_context_acquire(aio_context);
- job_cancel_sync(&next_job->job);
- aio_context_release(aio_context);
- job_cancel_sync(&next_job->job, true);
- } else {
- break;
- }
+ if (cancel_job) {
+ AioContext *aio_context = cancel_job->job.aio_context;
+ aio_context_acquire(aio_context);
+ job_cancel_sync(&cancel_job->job);
+ job_unref(&cancel_job->job);
+ aio_context_release(aio_context);
+ WITH_JOB_LOCK_GUARD() {
+ job_cancel_sync_locked(&cancel_job->job, true);
+ job_unref_locked(&cancel_job->job);
}
}
}
@@ -459,51 +415,19 @@ static int coroutine_fn pvebackup_co_add_config(
@@ -458,49 +416,19 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
-bool job_should_pause(Job *job);
-bool job_should_pause_locked(Job *job);
-
-static void pvebackup_run_next_job(void)
-{
@@ -194,18 +195,16 @@ index 8305105fd5..d7f2b2206f 100644
- if (job) {
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- AioContext *aio_context = job->job.aio_context;
- aio_context_acquire(aio_context);
-
- if (job_should_pause(&job->job)) {
- bool error_or_canceled = pvebackup_error_or_canceled();
- if (error_or_canceled) {
- job_cancel_sync(&job->job);
- } else {
- job_resume(&job->job);
- WITH_JOB_LOCK_GUARD() {
- if (job_should_pause_locked(&job->job)) {
- bool error_or_canceled = pvebackup_error_or_canceled();
- if (error_or_canceled) {
- job_cancel_sync_locked(&job->job, true);
- } else {
- job_resume_locked(&job->job);
- }
- }
- }
- aio_context_release(aio_context);
- return;
- }
- }
@@ -228,19 +227,19 @@ index 8305105fd5..d7f2b2206f 100644
+ }
+ backup_state.txn = job_txn_new_seq();
+
BackupPerf perf = { .max_workers = 16 };
/* create and start all jobs (paused state) */
GList *l = backup_state.di_list;
while (l) {
@@ -524,7 +448,7 @@ static bool create_backup_jobs(void) {
@@ -523,7 +451,7 @@ static bool create_backup_jobs(void) {
BlockJob *job = backup_job_create(
NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
bitmap_mode, false, NULL, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
- JOB_DEFAULT, pvebackup_complete_cb, di, 1, NULL, &local_err);
bitmap_mode, false, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
- JOB_DEFAULT, pvebackup_complete_cb, di, NULL, &local_err);
+ JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn, &local_err);
aio_context_release(aio_context);
@@ -536,7 +460,8 @@ static bool create_backup_jobs(void) {
@@ -535,7 +463,8 @@ static bool create_backup_jobs(void) {
pvebackup_propagate_error(create_job_err);
break;
}
@@ -250,18 +249,20 @@ index 8305105fd5..d7f2b2206f 100644
bdrv_unref(di->target);
di->target = NULL;
@@ -554,6 +479,10 @@ static bool create_backup_jobs(void) {
@@ -553,6 +482,12 @@ static bool create_backup_jobs(void) {
bdrv_unref(di->target);
di->target = NULL;
}
+
+ if (di->job) {
+ job_unref(&di->job->job);
+ WITH_JOB_LOCK_GUARD() {
+ job_unref_locked(&di->job->job);
+ }
+ }
}
}
@@ -944,10 +873,6 @@ err:
@@ -943,10 +878,6 @@ err:
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -272,7 +273,7 @@ index 8305105fd5..d7f2b2206f 100644
if (di->target) {
bdrv_unref(di->target);
}
@@ -1036,9 +961,15 @@ UuidInfo *qmp_backup(
@@ -1035,9 +966,15 @@ UuidInfo *qmp_backup(
block_on_coroutine_fn(pvebackup_co_prepare, &task);
if (*errp == NULL) {

View File

@@ -48,13 +48,16 @@ before.
[0] https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg03515.html
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: add new force parameter to job_cancel_sync calls]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
pve-backup.c | 217 ++++++++++++++++++++++++++++---------------
pve-backup.c | 212 +++++++++++++++++++++++++++----------------
qapi/block-core.json | 5 +-
2 files changed, 144 insertions(+), 78 deletions(-)
2 files changed, 138 insertions(+), 79 deletions(-)
diff --git a/pve-backup.c b/pve-backup.c
index d7f2b2206f..e671ed8d48 100644
index eedac335ec..7bd9d06346 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -33,7 +33,9 @@ const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
@@ -189,7 +192,7 @@ index d7f2b2206f..e671ed8d48 100644
// remove self from job list
backup_state.di_list = g_list_remove(backup_state.di_list, di);
@@ -310,21 +321,49 @@ static void pvebackup_complete_cb(void *opaque, int ret)
@@ -310,21 +321,46 @@ static void pvebackup_complete_cb(void *opaque, int ret)
/* call cleanup if we're the last job */
if (!g_list_first(backup_state.di_list)) {
@@ -226,10 +229,7 @@ index d7f2b2206f..e671ed8d48 100644
+static void job_cancel_bh(void *opaque) {
+ CoCtxData *data = (CoCtxData*)opaque;
+ Job *job = (Job*)data->data;
+ AioContext *job_ctx = job->aio_context;
+ aio_context_acquire(job_ctx);
+ job_cancel_sync(job);
+ aio_context_release(job_ctx);
+ job_cancel_sync(job, true);
+ aio_co_enter(data->ctx, data->co);
+}
@@ -244,13 +244,15 @@ index d7f2b2206f..e671ed8d48 100644
if (backup_state.vmaw) {
/* make sure vma writer does not block anymore */
@@ -342,27 +381,22 @@ static void pvebackup_cancel(void)
@@ -342,28 +378,22 @@ static void pvebackup_cancel(void)
((PVEBackupDevInfo *)bdi->data)->job :
NULL;
- /* ref the job before releasing the mutex, just to be safe */
if (cancel_job) {
- job_ref(&cancel_job->job);
- WITH_JOB_LOCK_GUARD() {
- job_ref_locked(&cancel_job->job);
- }
+ CoCtxData data = {
+ .ctx = qemu_get_current_aio_context(),
+ .co = qemu_coroutine_self(),
@@ -265,11 +267,10 @@ index d7f2b2206f..e671ed8d48 100644
- qemu_mutex_unlock(&backup_state.backup_mutex);
-
- if (cancel_job) {
- AioContext *aio_context = cancel_job->job.aio_context;
- aio_context_acquire(aio_context);
- job_cancel_sync(&cancel_job->job);
- job_unref(&cancel_job->job);
- aio_context_release(aio_context);
- WITH_JOB_LOCK_GUARD() {
- job_cancel_sync_locked(&cancel_job->job, true);
- job_unref_locked(&cancel_job->job);
- }
- }
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
@@ -281,7 +282,7 @@ index d7f2b2206f..e671ed8d48 100644
}
// assumes the caller holds backup_mutex
@@ -415,10 +449,18 @@ static int coroutine_fn pvebackup_co_add_config(
@@ -416,10 +446,18 @@ static int coroutine_fn pvebackup_co_add_config(
goto out;
}
@@ -301,7 +302,7 @@ index d7f2b2206f..e671ed8d48 100644
Error *local_err = NULL;
/* create job transaction to synchronize bitmap commit and cancel all
@@ -452,24 +494,19 @@ static bool create_backup_jobs(void) {
@@ -455,24 +493,19 @@ static bool create_backup_jobs(void) {
aio_context_release(aio_context);
@@ -331,15 +332,13 @@ index d7f2b2206f..e671ed8d48 100644
l = backup_state.di_list;
while (l) {
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
@@ -481,12 +518,17 @@ static bool create_backup_jobs(void) {
}
@@ -485,13 +518,15 @@ static bool create_backup_jobs(void) {
if (di->job) {
+ AioContext *ctx = di->job->job.aio_context;
+ aio_context_acquire(ctx);
+ job_cancel_sync(&di->job->job);
job_unref(&di->job->job);
+ aio_context_release(ctx);
WITH_JOB_LOCK_GUARD() {
+ job_cancel_sync_locked(&di->job->job, true);
job_unref_locked(&di->job->job);
}
}
}
}
@@ -350,7 +349,7 @@ index d7f2b2206f..e671ed8d48 100644
}
typedef struct QmpBackupTask {
@@ -523,11 +565,12 @@ typedef struct QmpBackupTask {
@@ -528,11 +563,12 @@ typedef struct QmpBackupTask {
UuidInfo *result;
} QmpBackupTask;
@@ -364,7 +363,7 @@ index d7f2b2206f..e671ed8d48 100644
QmpBackupTask *task = opaque;
task->result = NULL; // just to be sure
@@ -548,8 +591,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -553,8 +589,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *firewall_name = "qemu-server.fw";
if (backup_state.di_list) {
@@ -375,7 +374,7 @@ index d7f2b2206f..e671ed8d48 100644
return;
}
@@ -616,6 +660,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -621,6 +658,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
di->size = size;
total += size;
@@ -384,7 +383,7 @@ index d7f2b2206f..e671ed8d48 100644
}
uuid_generate(uuid);
@@ -847,6 +893,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -852,6 +891,8 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
backup_state.stat.dirty = total - backup_state.stat.reused;
backup_state.stat.transferred = 0;
backup_state.stat.zero_bytes = 0;
@@ -393,7 +392,7 @@ index d7f2b2206f..e671ed8d48 100644
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -861,6 +909,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -866,6 +907,33 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_info->UUID = uuid_str;
task->result = uuid_info;
@@ -427,7 +426,7 @@ index d7f2b2206f..e671ed8d48 100644
return;
err_mutex:
@@ -883,6 +958,7 @@ err:
@@ -888,6 +956,7 @@ err:
g_free(di);
}
g_list_free(di_list);
@@ -435,7 +434,7 @@ index d7f2b2206f..e671ed8d48 100644
if (devs) {
g_strfreev(devs);
@@ -903,6 +979,8 @@ err:
@@ -908,6 +977,8 @@ err:
}
task->result = NULL;
@@ -444,7 +443,7 @@ index d7f2b2206f..e671ed8d48 100644
return;
}
@@ -956,24 +1034,8 @@ UuidInfo *qmp_backup(
@@ -961,24 +1032,8 @@ UuidInfo *qmp_backup(
.errp = errp,
};
@@ -469,7 +468,7 @@ index d7f2b2206f..e671ed8d48 100644
return task.result;
}
@@ -1025,6 +1087,7 @@ BackupStatus *qmp_query_backup(Error **errp)
@@ -1030,6 +1085,7 @@ BackupStatus *qmp_query_backup(Error **errp)
info->transferred = backup_state.stat.transferred;
info->has_reused = true;
info->reused = backup_state.stat.reused;
@@ -478,10 +477,10 @@ index d7f2b2206f..e671ed8d48 100644
qemu_mutex_unlock(&backup_state.stat.lock);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index f57fda122c..9b827cbe43 100644
index 7fde927621..bf559c6d52 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -775,12 +775,15 @@
@@ -770,12 +770,15 @@
#
# @uuid: uuid for this backup job
#

View File

@@ -12,21 +12,22 @@ Also add a flag to query-proxmox-support so qemu-server can determine if
safe migration is possible and makes sense.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
include/migration/misc.h | 3 ++
migration/meson.build | 2 +
migration/migration.c | 1 +
migration/pbs-state.c | 106 +++++++++++++++++++++++++++++++++++++++
pve-backup.c | 1 +
qapi/block-core.json | 6 +++
softmmu/vl.c | 1 +
6 files changed, 119 insertions(+)
create mode 100644 migration/pbs-state.c
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 34e7d75713..f83816dd3c 100644
index 465906710d..4f0aeceb6f 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -75,4 +75,7 @@ bool migration_in_incoming_postcopy(void);
@@ -75,4 +75,7 @@ bool migration_in_bg_snapshot(void);
/* migration/block-dirty-bitmap.c */
void dirty_bitmap_mig_init(void);
@@ -35,13 +36,13 @@ index 34e7d75713..f83816dd3c 100644
+
#endif
diff --git a/migration/meson.build b/migration/meson.build
index e62b79b60f..b90a04aa75 100644
index 0842d00cd2..d012f4d8d3 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -7,8 +7,10 @@ migration_files = files(
'qemu-file-channel.c',
@@ -6,8 +6,10 @@ migration_files = files(
'vmstate.c',
'qemu-file.c',
'qjson.c',
'yank_functions.c',
+ 'pbs-state.c',
)
softmmu_ss.add(migration_files)
@@ -49,6 +50,18 @@ index e62b79b60f..b90a04aa75 100644
softmmu_ss.add(files(
'block-dirty-bitmap.c',
diff --git a/migration/migration.c b/migration/migration.c
index 9b496cce1d..421b4ee225 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -229,6 +229,7 @@ void migration_object_init(void)
blk_mig_init();
ram_mig_init();
dirty_bitmap_mig_init();
+ pbs_state_mig_init();
}
void migration_cancel(const Error *error)
diff --git a/migration/pbs-state.c b/migration/pbs-state.c
new file mode 100644
index 0000000000..29f2b3860d
@@ -162,10 +175,10 @@ index 0000000000..29f2b3860d
+ NULL);
+}
diff --git a/pve-backup.c b/pve-backup.c
index e671ed8d48..bd2647e5f3 100644
index 7bd9d06346..5662f48b72 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1130,6 +1130,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1128,6 +1128,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
ret->pbs_dirty_bitmap = true;
ret->pbs_dirty_bitmap_savevm = true;
@@ -174,10 +187,10 @@ index e671ed8d48..bd2647e5f3 100644
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9b827cbe43..30eb1262ff 100644
index bf559c6d52..24f30260c8 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -884,6 +884,11 @@
@@ -879,6 +879,11 @@
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -189,7 +202,7 @@ index 9b827cbe43..30eb1262ff 100644
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
##
@@ -891,6 +896,7 @@
@@ -886,6 +891,7 @@
'data': { 'pbs-dirty-bitmap': 'bool',
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',
@@ -197,15 +210,3 @@ index 9b827cbe43..30eb1262ff 100644
'pbs-library-version': 'str' } }
##
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 5b5512128e..6721889fee 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -4304,6 +4304,7 @@ void qemu_init(int argc, char **argv, char **envp)
blk_mig_init();
ram_mig_init();
dirty_bitmap_mig_init();
+ pbs_state_mig_init();
qemu_opts_foreach(qemu_find_opts("mon"),
mon_init_func, NULL, &error_fatal);

View File

@@ -13,15 +13,16 @@ that are obviously marked as "busy", which would cause none at all to be
transferred.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
migration/block-dirty-bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index c61d382be8..26e4e5c99c 100644
index 9aba7d9c22..f4ecf9c9f9 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -534,7 +534,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
@@ -538,7 +538,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, &local_err)) {
error_report_err(local_err);

View File

@@ -15,15 +15,16 @@ According to RFC 3720, an initiator name is at most 223 bytes long, so the
4 KiB buffer is big enough, even if many whitespaces are used.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/iscsi.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/block/iscsi.c b/block/iscsi.c
index e30a7e3606..6c70bbe351 100644
index 1bba42a71b..89cd032c3a 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1374,12 +1374,42 @@ static char *get_initiator_name(QemuOpts *opts)
@@ -1388,12 +1388,42 @@ static char *get_initiator_name(QemuOpts *opts)
const char *name;
char *iscsi_name;
UuidInfo *uuid_info;

View File

@@ -22,6 +22,7 @@ monitor for that coroutine ourselves, but let's just fix it the right
way instead)
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 4 +-
hmp-commands.hx | 2 +
@@ -31,10 +32,10 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
5 files changed, 77 insertions(+), 196 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 46c63b1cf9..11c84d5508 100644
index f59b02592e..2e53cb65df 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1013,7 +1013,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
@@ -1018,7 +1018,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
g_free(global_snapshots);
}
@@ -43,7 +44,7 @@ index 46c63b1cf9..11c84d5508 100644
{
Error *error = NULL;
@@ -1022,7 +1022,7 @@ void hmp_backup_cancel(Monitor *mon, const QDict *qdict)
@@ -1027,7 +1027,7 @@ void hmp_backup_cancel(Monitor *mon, const QDict *qdict)
hmp_handle_error(mon, error);
}
@@ -53,10 +54,10 @@ index 46c63b1cf9..11c84d5508 100644
Error *error = NULL;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 0c6b944850..54de3f80e6 100644
index fcf9461295..5fdb198ca4 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -108,6 +108,7 @@ ERST
@@ -111,6 +111,7 @@ ERST
"\n\t\t\t Use -d to dump data into a directory instead"
"\n\t\t\t of using VMA format.",
.cmd = hmp_backup,
@@ -64,7 +65,7 @@ index 0c6b944850..54de3f80e6 100644
},
SRST
@@ -121,6 +122,7 @@ ERST
@@ -124,6 +125,7 @@ ERST
.params = "",
.help = "cancel the current VM backup",
.cmd = hmp_backup_cancel,
@@ -115,10 +116,10 @@ index 4ce7bc0b5e..0923037dec 100644
static void proxmox_backup_schedule_wake(void *data) {
CoCtxData *waker = (CoCtxData *)data;
diff --git a/pve-backup.c b/pve-backup.c
index bd2647e5f3..dec9c0d188 100644
index 5662f48b72..e4fe1b601d 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -357,7 +357,7 @@ static void job_cancel_bh(void *opaque) {
@@ -354,7 +354,7 @@ static void job_cancel_bh(void *opaque) {
aio_co_enter(data->ctx, data->co);
}
@@ -127,7 +128,7 @@ index bd2647e5f3..dec9c0d188 100644
{
Error *cancel_err = NULL;
error_setg(&cancel_err, "backup canceled");
@@ -394,11 +394,6 @@ static void coroutine_fn pvebackup_co_cancel(void *opaque)
@@ -391,11 +391,6 @@ static void coroutine_fn pvebackup_co_cancel(void *opaque)
qemu_co_mutex_unlock(&backup_state.backup_mutex);
}
@@ -139,7 +140,7 @@ index bd2647e5f3..dec9c0d188 100644
// assumes the caller holds backup_mutex
static int coroutine_fn pvebackup_co_add_config(
const char *file,
@@ -531,50 +526,27 @@ static void create_backup_jobs_bh(void *opaque) {
@@ -529,50 +524,27 @@ static void create_backup_jobs_bh(void *opaque) {
aio_co_enter(data->ctx, data->co);
}
@@ -206,7 +207,7 @@ index bd2647e5f3..dec9c0d188 100644
BlockBackend *blk;
BlockDriverState *bs = NULL;
const char *backup_dir = NULL;
@@ -591,17 +563,17 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -589,17 +561,17 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
const char *firewall_name = "qemu-server.fw";
if (backup_state.di_list) {
@@ -229,7 +230,7 @@ index bd2647e5f3..dec9c0d188 100644
gchar **d = devs;
while (d && *d) {
@@ -609,14 +581,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -607,14 +579,14 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (blk) {
bs = blk_bs(blk);
if (!bdrv_is_inserted(bs)) {
@@ -246,7 +247,7 @@ index bd2647e5f3..dec9c0d188 100644
"Device '%s' not found", *d);
goto err;
}
@@ -639,7 +611,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -637,7 +609,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
if (!di_list) {
@@ -255,7 +256,7 @@ index bd2647e5f3..dec9c0d188 100644
goto err;
}
@@ -649,13 +621,13 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -647,13 +619,13 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
while (l) {
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -266,12 +267,12 @@ index bd2647e5f3..dec9c0d188 100644
ssize_t size = bdrv_getlength(di->bs);
if (size < 0) {
- error_setg_errno(task->errp, -di->size, "bdrv_getlength failed");
+ error_setg_errno(errp, -di->size, "bdrv_getlength failed");
- error_setg_errno(task->errp, -size, "bdrv_getlength failed");
+ error_setg_errno(errp, -size, "bdrv_getlength failed");
goto err;
}
di->size = size;
@@ -682,47 +654,44 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -680,47 +652,44 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
if (format == BACKUP_FORMAT_PBS) {
@@ -336,7 +337,7 @@ index bd2647e5f3..dec9c0d188 100644
if (connect_result < 0)
goto err_mutex;
@@ -741,9 +710,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -739,9 +708,9 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
BdrvDirtyBitmap *bitmap = bdrv_find_dirty_bitmap(di->bs, PBS_BITMAP_NAME);
bool expect_only_dirty = false;
@@ -348,7 +349,7 @@ index bd2647e5f3..dec9c0d188 100644
if (!bitmap) {
goto err_mutex;
}
@@ -773,12 +742,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -771,12 +740,12 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
}
}
@@ -363,7 +364,7 @@ index bd2647e5f3..dec9c0d188 100644
goto err_mutex;
}
@@ -792,10 +761,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -790,10 +759,10 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
backup_state.stat.bitmap_list = g_list_append(backup_state.stat.bitmap_list, info);
}
} else if (format == BACKUP_FORMAT_VMA) {
@@ -376,7 +377,7 @@ index bd2647e5f3..dec9c0d188 100644
}
goto err_mutex;
}
@@ -806,25 +775,25 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -804,25 +773,25 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
l = g_list_next(l);
@@ -408,7 +409,7 @@ index bd2647e5f3..dec9c0d188 100644
l = di_list;
while (l) {
@@ -838,34 +807,34 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -836,34 +805,34 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
bdrv_img_create(di->targetfile, "raw", NULL, NULL, NULL,
di->size, flags, false, &local_err);
if (local_err) {
@@ -452,7 +453,7 @@ index bd2647e5f3..dec9c0d188 100644
goto err_mutex;
}
}
@@ -883,7 +852,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -881,7 +850,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
if (backup_state.stat.backup_file) {
g_free(backup_state.stat.backup_file);
}
@@ -461,7 +462,7 @@ index bd2647e5f3..dec9c0d188 100644
uuid_copy(backup_state.stat.uuid, uuid);
uuid_unparse_lower(uuid, backup_state.stat.uuid_str);
@@ -898,7 +867,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -896,7 +865,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -470,7 +471,7 @@ index bd2647e5f3..dec9c0d188 100644
backup_state.vmaw = vmaw;
backup_state.pbs = pbs;
@@ -908,8 +877,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -906,8 +875,6 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
uuid_info = g_malloc0(sizeof(*uuid_info));
uuid_info->UUID = uuid_str;
@@ -479,7 +480,7 @@ index bd2647e5f3..dec9c0d188 100644
/* Run create_backup_jobs_bh outside of coroutine (in BH) but keep
* backup_mutex locked. This is fine, a CoMutex can be held across yield
* points, and we'll release it as soon as the BH reschedules us.
@@ -923,7 +890,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -921,7 +888,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
qemu_coroutine_yield();
if (local_err) {
@@ -488,7 +489,7 @@ index bd2647e5f3..dec9c0d188 100644
goto err;
}
@@ -936,7 +903,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
@@ -934,7 +901,7 @@ static void coroutine_fn pvebackup_co_prepare(void *opaque)
/* start the first job in the transaction */
job_txn_start_seq(backup_state.txn);
@@ -497,7 +498,7 @@ index bd2647e5f3..dec9c0d188 100644
err_mutex:
qemu_mutex_unlock(&backup_state.stat.lock);
@@ -967,7 +934,7 @@ err:
@@ -965,7 +932,7 @@ err:
if (vmaw) {
Error *err = NULL;
vma_writer_close(vmaw, &err);
@@ -506,7 +507,7 @@ index bd2647e5f3..dec9c0d188 100644
}
if (pbs) {
@@ -978,65 +945,8 @@ err:
@@ -976,65 +943,8 @@ err:
rmdir(backup_dir);
}
@@ -574,10 +575,10 @@ index bd2647e5f3..dec9c0d188 100644
BackupStatus *qmp_query_backup(Error **errp)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 30eb1262ff..6ff5367383 100644
index 24f30260c8..4e8c35a3a2 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -847,7 +847,7 @@
@@ -842,7 +842,7 @@
'*config-file': 'str',
'*firewall-file': 'str',
'*devlist': 'str', '*speed': 'int' },
@@ -586,7 +587,7 @@ index 30eb1262ff..6ff5367383 100644
##
# @query-backup:
@@ -869,7 +869,7 @@
@@ -864,7 +864,7 @@
# Notes: This command succeeds even if there is no backup process running.
#
##

View File

@@ -11,6 +11,7 @@ from the PVE side to avoid QMP calls with unsupported parameters.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 1 +
pve-backup.c | 3 +++
@@ -18,10 +19,10 @@ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
3 files changed, 11 insertions(+)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 11c84d5508..0932deb28c 100644
index 2e53cb65df..f98f4cf7e6 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1036,6 +1036,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
@@ -1041,6 +1041,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS password
false, NULL, // PBS keyfile
false, NULL, // PBS key_password
@@ -30,10 +31,10 @@ index 11c84d5508..0932deb28c 100644
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
diff --git a/pve-backup.c b/pve-backup.c
index dec9c0d188..076146cc1e 100644
index e4fe1b601d..41e8effa01 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -531,6 +531,7 @@ UuidInfo coroutine_fn *qmp_backup(
@@ -529,6 +529,7 @@ UuidInfo coroutine_fn *qmp_backup(
bool has_password, const char *password,
bool has_keyfile, const char *keyfile,
bool has_key_password, const char *key_password,
@@ -41,7 +42,7 @@ index dec9c0d188..076146cc1e 100644
bool has_fingerprint, const char *fingerprint,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
@@ -679,6 +680,7 @@ UuidInfo coroutine_fn *qmp_backup(
@@ -677,6 +678,7 @@ UuidInfo coroutine_fn *qmp_backup(
has_password ? password : NULL,
has_keyfile ? keyfile : NULL,
has_key_password ? key_password : NULL,
@@ -49,7 +50,7 @@ index dec9c0d188..076146cc1e 100644
has_compress ? compress : true,
has_encrypt ? encrypt : has_keyfile,
has_fingerprint ? fingerprint : NULL,
@@ -1042,5 +1044,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1040,5 +1042,6 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_dirty_bitmap_savevm = true;
ret->pbs_dirty_bitmap_migration = true;
ret->query_bitmap_info = true;
@@ -57,10 +58,10 @@ index dec9c0d188..076146cc1e 100644
return ret;
}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 6ff5367383..bef9b65fec 100644
index 4e8c35a3a2..d8c7331090 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -818,6 +818,8 @@
@@ -813,6 +813,8 @@
#
# @key-password: password for keyfile (optional for format 'pbs')
#
@@ -69,7 +70,7 @@ index 6ff5367383..bef9b65fec 100644
# @fingerprint: server cert fingerprint (optional for format 'pbs')
#
# @backup-id: backup ID (required for format 'pbs')
@@ -837,6 +839,7 @@
@@ -832,6 +834,7 @@
'*password': 'str',
'*keyfile': 'str',
'*key-password': 'str',
@@ -77,7 +78,7 @@ index 6ff5367383..bef9b65fec 100644
'*fingerprint': 'str',
'*backup-id': 'str',
'*backup-time': 'int',
@@ -889,6 +892,9 @@
@@ -884,6 +887,9 @@
# migration cap if this is false/unset may lead
# to crashes on migration!
#
@@ -87,7 +88,7 @@ index 6ff5367383..bef9b65fec 100644
# @pbs-library-version: Running version of libproxmox-backup-qemu0 library.
#
##
@@ -897,6 +903,7 @@
@@ -892,6 +898,7 @@
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',
'pbs-dirty-bitmap-migration': 'bool',

View File

@@ -11,12 +11,13 @@ Tracing shows the fast-path is taken almost all the time, though not
100% so the slow one is still necessary.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/pbs.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/block/pbs.c b/block/pbs.c
index 1481a2bfd1..fbf0d8d845 100644
index 9d1f1f39d4..ce9a870885 100644
--- a/block/pbs.c
+++ b/block/pbs.c
@@ -200,7 +200,16 @@ static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
@@ -35,9 +36,9 @@ index 1481a2bfd1..fbf0d8d845 100644
+ buf = g_malloc(bytes);
+ }
ReadCallbackData rcb = {
.co = qemu_coroutine_self(),
@@ -218,8 +227,10 @@ static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
if (offset < 0 || bytes < 0) {
fprintf(stderr, "unexpected negative 'offset' or 'bytes' value!\n");
@@ -223,8 +232,10 @@ static coroutine_fn int pbs_co_preadv(BlockDriverState *bs,
return -EIO;
}
@@ -48,5 +49,5 @@ index 1481a2bfd1..fbf0d8d845 100644
+ g_free(buf);
+ }
return ret;
return 0;
}

View File

@@ -4,15 +4,17 @@ Date: Tue, 2 Mar 2021 16:34:28 +0100
Subject: [PATCH] PVE: block/stream: increase chunk size
Ceph favors bigger chunks, so increase to 4M.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/stream.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/stream.c b/block/stream.c
index 236384f2f7..a5371420e3 100644
index 694709bd25..e09bd5c4ef 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -26,7 +26,7 @@ enum {
@@ -28,7 +28,7 @@ enum {
* large enough to process multiple clusters in a single call, so
* that populating contiguous regions of the image is efficient.
*/

View File

@@ -1,42 +0,0 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Tue, 2 Mar 2021 16:11:54 +0100
Subject: [PATCH] block/io: accept NULL qiov in bdrv_pad_request
Some operations, e.g. block-stream, perform reads while discarding the
results (only copy-on-read matters). In this case they will pass NULL as
the target QEMUIOVector, which will however trip bdrv_pad_request, since
it wants to extend its passed vector.
Simply check for NULL and do nothing, there's no reason to pad the
target if it will be discarded anyway.
---
block/io.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/block/io.c b/block/io.c
index ec5e152bb7..08dee005ec 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1613,13 +1613,16 @@ static bool bdrv_pad_request(BlockDriverState *bs,
return false;
}
- qemu_iovec_init_extended(&pad->local_qiov, pad->buf, pad->head,
- *qiov, *qiov_offset, *bytes,
- pad->buf + pad->buf_len - pad->tail, pad->tail);
+ if (*qiov) {
+ qemu_iovec_init_extended(&pad->local_qiov, pad->buf, pad->head,
+ *qiov, *qiov_offset, *bytes,
+ pad->buf + pad->buf_len - pad->tail, pad->tail);
+ *qiov = &pad->local_qiov;
+ *qiov_offset = 0;
+ }
+
*bytes += pad->head + pad->tail;
*offset -= pad->head;
- *qiov = &pad->local_qiov;
- *qiov_offset = 0;
return true;
}

View File

@@ -0,0 +1,33 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Tue, 2 Mar 2021 16:11:54 +0100
Subject: [PATCH] block/io: accept NULL qiov in bdrv_pad_request
Some operations, e.g. block-stream, perform reads while discarding the
results (only copy-on-read matters). In this case they will pass NULL as
the target QEMUIOVector, which will however trip bdrv_pad_request, since
it wants to extend its passed vector.
Simply check for NULL and do nothing, there's no reason to pad the
target if it will be discarded anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
block/io.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/block/io.c b/block/io.c
index 4589a58917..20167d61b7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1730,6 +1730,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
{
int ret;
+ if (!qiov) {
+ return 0;
+ }
+
bdrv_check_qiov_request(*offset, *bytes, *qiov, *qiov_offset, &error_abort);
if (!bdrv_init_padding(bs, *offset, *bytes, pad)) {

View File

@@ -23,18 +23,22 @@ If 'auto-remove' is set, alloc-track will automatically detach itself
once the backing image is removed. It will be replaced by 'file'.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
[FE: adapt to changed function signatures
make error return value consistent with QEMU]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/alloc-track.c | 342 ++++++++++++++++++++++++++++++++++++++++++++
block/alloc-track.c | 350 ++++++++++++++++++++++++++++++++++++++++++++
block/meson.build | 1 +
2 files changed, 343 insertions(+)
2 files changed, 351 insertions(+)
create mode 100644 block/alloc-track.c
diff --git a/block/alloc-track.c b/block/alloc-track.c
new file mode 100644
index 0000000000..b579380279
index 0000000000..43d40d11af
--- /dev/null
+++ b/block/alloc-track.c
@@ -0,0 +1,345 @@
@@ -0,0 +1,350 @@
+/*
+ * Node to allow backing images to be applied to any node. Assumes a blank
+ * image to begin with, only new writes are tracked as allocated, thus this
@@ -165,7 +169,7 @@ index 0000000000..b579380279
+}
+
+static int coroutine_fn track_co_preadv(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ BDRVAllocTrackState *s = bs->opaque;
+ QEMUIOVector local_qiov;
@@ -176,6 +180,11 @@ index 0000000000..b579380279
+ int64_t local_bytes;
+ bool alloc;
+
+ if (offset < 0 || bytes < 0) {
+ fprintf(stderr, "unexpected negative 'offset' or 'bytes' value!\n");
+ return -EIO;
+ }
+
+ /* a read request can span multiple granularity-sized chunks, and can thus
+ * contain blocks with different allocation status - we could just iterate
+ * granularity-wise, but for better performance use bdrv_dirty_bitmap_next_X
@@ -218,21 +227,21 @@ index 0000000000..b579380279
+}
+
+static int coroutine_fn track_co_pwritev(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
+ int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags)
+{
+ return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+}
+
+static int coroutine_fn track_co_pwrite_zeroes(BlockDriverState *bs,
+ int64_t offset, int count, BdrvRequestFlags flags)
+ int64_t offset, int64_t bytes, BdrvRequestFlags flags)
+{
+ return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
+ return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
+}
+
+static int coroutine_fn track_co_pdiscard(BlockDriverState *bs,
+ int64_t offset, int count)
+ int64_t offset, int64_t bytes)
+{
+ return bdrv_co_pdiscard(bs->file, offset, count);
+ return bdrv_co_pdiscard(bs->file, offset, bytes);
+}
+
+static coroutine_fn int track_co_flush(BlockDriverState *bs)
@@ -381,7 +390,7 @@ index 0000000000..b579380279
+
+block_init(bdrv_alloc_track_init);
diff --git a/block/meson.build b/block/meson.build
index a070060e53..e387990764 100644
index 7ef2fa72d5..15352f579f 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -2,6 +2,7 @@ block_ss.add(genh)

View File

@@ -0,0 +1,35 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Stefan Reiter <s.reiter@proxmox.com>
Date: Wed, 26 May 2021 17:36:55 +0200
Subject: [PATCH] PVE: savevm-async: register yank before
migration_incoming_state_destroy
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
migration/savevm-async.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index a38e7351c1..0b1b60c6ae 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -20,6 +20,7 @@
#include "qemu/timer.h"
#include "qemu/main-loop.h"
#include "qemu/rcu.h"
+#include "qemu/yank.h"
/* #define DEBUG_SAVEVM_STATE */
@@ -521,6 +522,10 @@ int load_snapshot_from_blockdev(const char *filename, Error **errp)
dirty_bitmap_mig_before_vm_start();
qemu_fclose(f);
+
+ /* state_destroy assumes a real migration which would have added a yank */
+ yank_register_instance(MIGRATION_YANK_INSTANCE, &error_abort);
+
migration_incoming_state_destroy();
if (ret < 0) {
error_setg_errno(errp, -ret, "Error while loading VM state");

View File

@@ -0,0 +1,130 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Mon, 7 Feb 2022 14:21:01 +0100
Subject: [PATCH] qemu-img: dd: add -l option for loading a snapshot
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
docs/tools/qemu-img.rst | 6 +++---
qemu-img-cmds.hx | 4 ++--
qemu-img.c | 33 +++++++++++++++++++++++++++++++--
3 files changed, 36 insertions(+), 7 deletions(-)
diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 5e713e231d..9390d5e5cf 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -492,10 +492,10 @@ Command description:
it doesn't need to be specified separately in this case.
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [-l SNAPSHOT_PARAM] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] if=INPUT of=OUTPUT
- dd copies from *INPUT* file to *OUTPUT* file converting it from
- *FMT* format to *OUTPUT_FMT* format.
+ dd copies from *INPUT* file or snapshot *SNAPSHOT_PARAM* to *OUTPUT* file
+ converting it from *FMT* format to *OUTPUT_FMT* format.
The data is by default read and written using blocks of 512 bytes but can be
modified by specifying *BLOCK_SIZE*. If count=\ *BLOCKS* is specified
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index b5b0bb4467..36f97e1f19 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -58,9 +58,9 @@ SRST
ERST
DEF("dd", img_dd,
- "dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [bs=block_size] [count=blocks] [skip=blocks] [osize=output_size] if=input of=output")
+ "dd [--image-opts] [-U] [-f fmt] [-O output_fmt] [-n] [-l snapshot_param] [bs=block_size] [count=blocks] [skip=blocks] [osize=output_size] if=input of=output")
SRST
-.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] if=INPUT of=OUTPUT
+.. option:: dd [--image-opts] [-U] [-f FMT] [-O OUTPUT_FMT] [-n] [-l SNAPSHOT_PARAM] [bs=BLOCK_SIZE] [count=BLOCKS] [skip=BLOCKS] [osize=OUTPUT_SIZE] if=INPUT of=OUTPUT
ERST
DEF("info", img_info,
diff --git a/qemu-img.c b/qemu-img.c
index 4f5ef5b887..4894016ad2 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4957,6 +4957,7 @@ static int img_dd(int argc, char **argv)
BlockDriver *drv = NULL, *proto_drv = NULL;
BlockBackend *blk1 = NULL, *blk2 = NULL;
QemuOpts *opts = NULL;
+ QemuOpts *sn_opts = NULL;
QemuOptsList *create_opts = NULL;
Error *local_err = NULL;
bool image_opts = false;
@@ -4966,6 +4967,7 @@ static int img_dd(int argc, char **argv)
int64_t size = 0, readsize = 0;
int64_t out_pos, in_pos;
bool force_share = false, skip_create = false;
+ const char *snapshot_name = NULL;
struct DdInfo dd = {
.flags = 0,
.count = 0,
@@ -5003,7 +5005,7 @@ static int img_dd(int argc, char **argv)
{ 0, 0, 0, 0 }
};
- while ((c = getopt_long(argc, argv, ":hf:O:Un", long_options, NULL))) {
+ while ((c = getopt_long(argc, argv, ":hf:O:l:Un", long_options, NULL))) {
if (c == EOF) {
break;
}
@@ -5026,6 +5028,19 @@ static int img_dd(int argc, char **argv)
case 'n':
skip_create = true;
break;
+ case 'l':
+ if (strstart(optarg, SNAPSHOT_OPT_BASE, NULL)) {
+ sn_opts = qemu_opts_parse_noisily(&internal_snapshot_opts,
+ optarg, false);
+ if (!sn_opts) {
+ error_report("Failed in parsing snapshot param '%s'",
+ optarg);
+ goto out;
+ }
+ } else {
+ snapshot_name = optarg;
+ }
+ break;
case 'U':
force_share = true;
break;
@@ -5085,11 +5100,24 @@ static int img_dd(int argc, char **argv)
if (dd.flags & C_IF) {
blk1 = img_open(image_opts, in.filename, fmt, 0, false, false,
force_share);
-
if (!blk1) {
ret = -1;
goto out;
}
+ if (sn_opts) {
+ bdrv_snapshot_load_tmp(blk_bs(blk1),
+ qemu_opt_get(sn_opts, SNAPSHOT_OPT_ID),
+ qemu_opt_get(sn_opts, SNAPSHOT_OPT_NAME),
+ &local_err);
+ } else if (snapshot_name != NULL) {
+ bdrv_snapshot_load_tmp_by_id_or_name(blk_bs(blk1), snapshot_name,
+ &local_err);
+ }
+ if (local_err) {
+ error_reportf_err(local_err, "Failed to load snapshot: ");
+ ret = -1;
+ goto out;
+ }
}
if (dd.flags & C_OSIZE) {
@@ -5244,6 +5272,7 @@ static int img_dd(int argc, char **argv)
out:
g_free(arg);
qemu_opts_del(opts);
+ qemu_opts_del(sn_opts);
qemu_opts_free(create_opts);
blk_unref(blk1);
blk_unref(blk2);

View File

@@ -0,0 +1,407 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Thu, 21 Apr 2022 13:26:48 +0200
Subject: [PATCH] vma: allow partial restore
Introduce a new map line for skipping a certain drive, of the form
skip=drive-scsi0
Since in PVE, most archives are compressed and piped to vma for
restore, it's not easily possible to skip reads.
For the reader, a new skip flag for VmaRestoreState is added and the
target is allowed to be NULL if skip is specified when registering. If
the skip flag is set, no writes will be made as well as no check for
duplicate clusters. Therefore, the flag is not set for verify.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
vma-reader.c | 64 ++++++++++++---------
vma.c | 157 +++++++++++++++++++++++++++++----------------------
vma.h | 2 +-
3 files changed, 126 insertions(+), 97 deletions(-)
diff --git a/vma-reader.c b/vma-reader.c
index e65f1e8415..81a891c6b1 100644
--- a/vma-reader.c
+++ b/vma-reader.c
@@ -28,6 +28,7 @@ typedef struct VmaRestoreState {
bool write_zeroes;
unsigned long *bitmap;
int bitmap_size;
+ bool skip;
} VmaRestoreState;
struct VmaReader {
@@ -425,13 +426,14 @@ VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id)
}
static void allocate_rstate(VmaReader *vmar, guint8 dev_id,
- BlockBackend *target, bool write_zeroes)
+ BlockBackend *target, bool write_zeroes, bool skip)
{
assert(vmar);
assert(dev_id);
vmar->rstate[dev_id].target = target;
vmar->rstate[dev_id].write_zeroes = write_zeroes;
+ vmar->rstate[dev_id].skip = skip;
int64_t size = vmar->devinfo[dev_id].size;
@@ -446,28 +448,30 @@ static void allocate_rstate(VmaReader *vmar, guint8 dev_id,
}
int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id, BlockBackend *target,
- bool write_zeroes, Error **errp)
+ bool write_zeroes, bool skip, Error **errp)
{
assert(vmar);
- assert(target != NULL);
+ assert(target != NULL || skip);
assert(dev_id);
- assert(vmar->rstate[dev_id].target == NULL);
-
- int64_t size = blk_getlength(target);
- int64_t size_diff = size - vmar->devinfo[dev_id].size;
-
- /* storage types can have different size restrictions, so it
- * is not always possible to create an image with exact size.
- * So we tolerate a size difference up to 4MB.
- */
- if ((size_diff < 0) || (size_diff > 4*1024*1024)) {
- error_setg(errp, "vma_reader_register_bs for stream %s failed - "
- "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname,
- size, vmar->devinfo[dev_id].size);
- return -1;
+ assert(vmar->rstate[dev_id].target == NULL && !vmar->rstate[dev_id].skip);
+
+ if (target != NULL) {
+ int64_t size = blk_getlength(target);
+ int64_t size_diff = size - vmar->devinfo[dev_id].size;
+
+ /* storage types can have different size restrictions, so it
+ * is not always possible to create an image with exact size.
+ * So we tolerate a size difference up to 4MB.
+ */
+ if ((size_diff < 0) || (size_diff > 4*1024*1024)) {
+ error_setg(errp, "vma_reader_register_bs for stream %s failed - "
+ "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname,
+ size, vmar->devinfo[dev_id].size);
+ return -1;
+ }
}
- allocate_rstate(vmar, dev_id, target, write_zeroes);
+ allocate_rstate(vmar, dev_id, target, write_zeroes, skip);
return 0;
}
@@ -560,19 +564,23 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
VmaRestoreState *rstate = &vmar->rstate[dev_id];
BlockBackend *target = NULL;
+ bool skip = rstate->skip;
+
if (dev_id != vmar->vmstate_stream) {
target = rstate->target;
- if (!verify && !target) {
+ if (!verify && !target && !skip) {
error_setg(errp, "got wrong dev id %d", dev_id);
return -1;
}
- if (vma_reader_get_bitmap(rstate, cluster_num)) {
- error_setg(errp, "found duplicated cluster %zd for stream %s",
- cluster_num, vmar->devinfo[dev_id].devname);
- return -1;
+ if (!skip) {
+ if (vma_reader_get_bitmap(rstate, cluster_num)) {
+ error_setg(errp, "found duplicated cluster %zd for stream %s",
+ cluster_num, vmar->devinfo[dev_id].devname);
+ return -1;
+ }
+ vma_reader_set_bitmap(rstate, cluster_num, 1);
}
- vma_reader_set_bitmap(rstate, cluster_num, 1);
max_sector = vmar->devinfo[dev_id].size/BDRV_SECTOR_SIZE;
} else {
@@ -618,7 +626,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
return -1;
}
- if (!verify) {
+ if (!verify && !skip) {
int nb_sectors = end_sector - sector_num;
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
buf + start, sector_num, nb_sectors,
@@ -654,7 +662,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
return -1;
}
- if (!verify) {
+ if (!verify && !skip) {
int nb_sectors = end_sector - sector_num;
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
buf + start, sector_num,
@@ -679,7 +687,7 @@ static int restore_extent(VmaReader *vmar, unsigned char *buf,
vmar->partial_zero_cluster_data += zero_size;
}
- if (rstate->write_zeroes && !verify) {
+ if (rstate->write_zeroes && !verify && !skip) {
if (restore_write_data(vmar, dev_id, target, vmstate_fd,
zero_vma_block, sector_num,
nb_sectors, errp) < 0) {
@@ -850,7 +858,7 @@ int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp)
for (dev_id = 1; dev_id < 255; dev_id++) {
if (vma_reader_get_device_info(vmar, dev_id)) {
- allocate_rstate(vmar, dev_id, NULL, false);
+ allocate_rstate(vmar, dev_id, NULL, false, false);
}
}
diff --git a/vma.c b/vma.c
index e8dffb43e0..e6e9ffc7fe 100644
--- a/vma.c
+++ b/vma.c
@@ -138,6 +138,7 @@ typedef struct RestoreMap {
char *throttling_group;
char *cache;
bool write_zero;
+ bool skip;
} RestoreMap;
static bool try_parse_option(char **line, const char *optname, char **out, const char *inbuf) {
@@ -245,47 +246,61 @@ static int extract_content(int argc, char **argv)
char *bps = NULL;
char *group = NULL;
char *cache = NULL;
+ char *devname = NULL;
+ bool skip = false;
+ uint64_t bps_value = 0;
+ const char *path = NULL;
+ bool write_zero = true;
+
if (!line || line[0] == '\0' || !strcmp(line, "done\n")) {
break;
}
int len = strlen(line);
if (line[len - 1] == '\n') {
line[len - 1] = '\0';
- if (len == 1) {
+ len = len - 1;
+ if (len == 0) {
break;
}
}
- while (1) {
- if (!try_parse_option(&line, "format", &format, inbuf) &&
- !try_parse_option(&line, "throttling.bps", &bps, inbuf) &&
- !try_parse_option(&line, "throttling.group", &group, inbuf) &&
- !try_parse_option(&line, "cache", &cache, inbuf))
- {
- break;
+ if (strncmp(line, "skip", 4) == 0) {
+ if (len < 6 || line[4] != '=') {
+ g_error("read map failed - option 'skip' has no value ('%s')",
+ inbuf);
+ } else {
+ devname = line + 5;
+ skip = true;
+ }
+ } else {
+ while (1) {
+ if (!try_parse_option(&line, "format", &format, inbuf) &&
+ !try_parse_option(&line, "throttling.bps", &bps, inbuf) &&
+ !try_parse_option(&line, "throttling.group", &group, inbuf) &&
+ !try_parse_option(&line, "cache", &cache, inbuf))
+ {
+ break;
+ }
}
- }
- uint64_t bps_value = 0;
- if (bps) {
- bps_value = verify_u64(bps);
- g_free(bps);
- }
+ if (bps) {
+ bps_value = verify_u64(bps);
+ g_free(bps);
+ }
- const char *path;
- bool write_zero;
- if (line[0] == '0' && line[1] == ':') {
- path = line + 2;
- write_zero = false;
- } else if (line[0] == '1' && line[1] == ':') {
- path = line + 2;
- write_zero = true;
- } else {
- g_error("read map failed - parse error ('%s')", inbuf);
+ if (line[0] == '0' && line[1] == ':') {
+ path = line + 2;
+ write_zero = false;
+ } else if (line[0] == '1' && line[1] == ':') {
+ path = line + 2;
+ write_zero = true;
+ } else {
+ g_error("read map failed - parse error ('%s')", inbuf);
+ }
+
+ path = extract_devname(path, &devname, -1);
}
- char *devname = NULL;
- path = extract_devname(path, &devname, -1);
if (!devname) {
g_error("read map failed - no dev name specified ('%s')",
inbuf);
@@ -299,6 +314,7 @@ static int extract_content(int argc, char **argv)
map->throttling_group = group;
map->cache = cache;
map->write_zero = write_zero;
+ map->skip = skip;
g_hash_table_insert(devmap, map->devname, map);
@@ -328,6 +344,7 @@ static int extract_content(int argc, char **argv)
const char *cache = NULL;
int flags = BDRV_O_RDWR;
bool write_zero = true;
+ bool skip = false;
BlockBackend *blk = NULL;
@@ -343,6 +360,7 @@ static int extract_content(int argc, char **argv)
throttling_group = map->throttling_group;
cache = map->cache;
write_zero = map->write_zero;
+ skip = map->skip;
} else {
devfn = g_strdup_printf("%s/tmp-disk-%s.raw",
dirname, di->devname);
@@ -361,57 +379,60 @@ static int extract_content(int argc, char **argv)
write_zero = false;
}
- size_t devlen = strlen(devfn);
- QDict *options = NULL;
- bool writethrough;
- if (format) {
- /* explicit format from commandline */
- options = qdict_new();
- qdict_put_str(options, "driver", format);
- } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) ||
- strncmp(devfn, "/dev/", 5) == 0)
- {
- /* This part is now deprecated for PVE as well (just as qemu
- * deprecated not specifying an explicit raw format, too.
- */
- /* explicit raw format */
- options = qdict_new();
- qdict_put_str(options, "driver", "raw");
- }
- if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) {
- g_error("invalid cache option: %s\n", cache);
- }
+ if (!skip) {
+ size_t devlen = strlen(devfn);
+ QDict *options = NULL;
+ bool writethrough;
+ if (format) {
+ /* explicit format from commandline */
+ options = qdict_new();
+ qdict_put_str(options, "driver", format);
+ } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) ||
+ strncmp(devfn, "/dev/", 5) == 0)
+ {
+ /* This part is now deprecated for PVE as well (just as qemu
+ * deprecated not specifying an explicit raw format, too.
+ */
+ /* explicit raw format */
+ options = qdict_new();
+ qdict_put_str(options, "driver", "raw");
+ }
- if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) {
- g_error("can't open file %s - %s", devfn,
- error_get_pretty(errp));
- }
+ if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) {
+ g_error("invalid cache option: %s\n", cache);
+ }
- if (cache) {
- blk_set_enable_write_cache(blk, !writethrough);
- }
+ if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) {
+ g_error("can't open file %s - %s", devfn,
+ error_get_pretty(errp));
+ }
- if (throttling_group) {
- blk_io_limits_enable(blk, throttling_group);
- }
+ if (cache) {
+ blk_set_enable_write_cache(blk, !writethrough);
+ }
- if (throttling_bps) {
- if (!throttling_group) {
- blk_io_limits_enable(blk, devfn);
+ if (throttling_group) {
+ blk_io_limits_enable(blk, throttling_group);
}
- ThrottleConfig cfg;
- throttle_config_init(&cfg);
- cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps;
- Error *err = NULL;
- if (!throttle_is_valid(&cfg, &err)) {
- error_report_err(err);
- g_error("failed to apply throttling");
+ if (throttling_bps) {
+ if (!throttling_group) {
+ blk_io_limits_enable(blk, devfn);
+ }
+
+ ThrottleConfig cfg;
+ throttle_config_init(&cfg);
+ cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps;
+ Error *err = NULL;
+ if (!throttle_is_valid(&cfg, &err)) {
+ error_report_err(err);
+ g_error("failed to apply throttling");
+ }
+ blk_set_io_limits(blk, &cfg);
}
- blk_set_io_limits(blk, &cfg);
}
- if (vma_reader_register_bs(vmar, i, blk, write_zero, &errp) < 0) {
+ if (vma_reader_register_bs(vmar, i, blk, write_zero, skip, &errp) < 0) {
g_error("%s", error_get_pretty(errp));
}
diff --git a/vma.h b/vma.h
index c895c97f6d..1b62859165 100644
--- a/vma.h
+++ b/vma.h
@@ -142,7 +142,7 @@ GList *vma_reader_get_config_data(VmaReader *vmar);
VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id);
int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id,
BlockBackend *target, bool write_zeroes,
- Error **errp);
+ bool skip, Error **errp);
int vma_reader_restore(VmaReader *vmar, int vmstate_fd, bool verbose,
Error **errp);
int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp);

View File

@@ -0,0 +1,233 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
Date: Tue, 26 Apr 2022 16:06:28 +0200
Subject: [PATCH] pbs: namespace support
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
---
block/monitor/block-hmp-cmds.c | 1 +
block/pbs.c | 25 +++++++++++++++++++++----
pbs-restore.c | 19 ++++++++++++++++---
pve-backup.c | 6 +++++-
qapi/block-core.json | 5 ++++-
5 files changed, 47 insertions(+), 9 deletions(-)
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index f98f4cf7e6..55ef4f5965 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1043,6 +1043,7 @@ void coroutine_fn hmp_backup(Monitor *mon, const QDict *qdict)
false, NULL, // PBS key_password
false, NULL, // PBS master_keyfile
false, NULL, // PBS fingerprint
+ false, NULL, // PBS backup-ns
false, NULL, // PBS backup-id
false, 0, // PBS backup-time
false, false, // PBS use-dirty-bitmap
diff --git a/block/pbs.c b/block/pbs.c
index ce9a870885..9192f3e41b 100644
--- a/block/pbs.c
+++ b/block/pbs.c
@@ -14,6 +14,7 @@
#include <proxmox-backup-qemu.h>
#define PBS_OPT_REPOSITORY "repository"
+#define PBS_OPT_NAMESPACE "namespace"
#define PBS_OPT_SNAPSHOT "snapshot"
#define PBS_OPT_ARCHIVE "archive"
#define PBS_OPT_KEYFILE "keyfile"
@@ -27,6 +28,7 @@ typedef struct {
int64_t length;
char *repository;
+ char *namespace;
char *snapshot;
char *archive;
} BDRVPBSState;
@@ -40,6 +42,11 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "The server address and repository to connect to.",
},
+ {
+ .name = PBS_OPT_NAMESPACE,
+ .type = QEMU_OPT_STRING,
+ .help = "Optional: The snapshot's namespace.",
+ },
{
.name = PBS_OPT_SNAPSHOT,
.type = QEMU_OPT_STRING,
@@ -76,7 +83,7 @@ static QemuOptsList runtime_opts = {
// filename format:
-// pbs:repository=<repo>,snapshot=<snap>,password=<pw>,key_password=<kpw>,fingerprint=<fp>,archive=<archive>
+// pbs:repository=<repo>,namespace=<ns>,snapshot=<snap>,password=<pw>,key_password=<kpw>,fingerprint=<fp>,archive=<archive>
static void pbs_parse_filename(const char *filename, QDict *options,
Error **errp)
{
@@ -112,6 +119,7 @@ static int pbs_open(BlockDriverState *bs, QDict *options, int flags,
s->archive = g_strdup(qemu_opt_get(opts, PBS_OPT_ARCHIVE));
const char *keyfile = qemu_opt_get(opts, PBS_OPT_KEYFILE);
const char *password = qemu_opt_get(opts, PBS_OPT_PASSWORD);
+ const char *namespace = qemu_opt_get(opts, PBS_OPT_NAMESPACE);
const char *fingerprint = qemu_opt_get(opts, PBS_OPT_FINGERPRINT);
const char *key_password = qemu_opt_get(opts, PBS_OPT_ENCRYPTION_PASSWORD);
@@ -124,9 +132,12 @@ static int pbs_open(BlockDriverState *bs, QDict *options, int flags,
if (!key_password) {
key_password = getenv("PBS_ENCRYPTION_PASSWORD");
}
+ if (namespace) {
+ s->namespace = g_strdup(namespace);
+ }
/* connect to PBS server in read mode */
- s->conn = proxmox_restore_new(s->repository, s->snapshot, password,
+ s->conn = proxmox_restore_new_ns(s->repository, s->snapshot, s->namespace, password,
keyfile, key_password, fingerprint, &pbs_error);
/* invalidates qemu_opt_get char pointers from above */
@@ -171,6 +182,7 @@ static int pbs_file_open(BlockDriverState *bs, QDict *options, int flags,
static void pbs_close(BlockDriverState *bs) {
BDRVPBSState *s = bs->opaque;
g_free(s->repository);
+ g_free(s->namespace);
g_free(s->snapshot);
g_free(s->archive);
proxmox_restore_disconnect(s->conn);
@@ -252,8 +264,13 @@ static coroutine_fn int pbs_co_pwritev(BlockDriverState *bs,
static void pbs_refresh_filename(BlockDriverState *bs)
{
BDRVPBSState *s = bs->opaque;
- snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s(%s)",
- s->repository, s->snapshot, s->archive);
+ if (s->namespace) {
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s:%s(%s)",
+ s->repository, s->namespace, s->snapshot, s->archive);
+ } else {
+ snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s/%s(%s)",
+ s->repository, s->snapshot, s->archive);
+ }
}
static const char *const pbs_strong_runtime_opts[] = {
diff --git a/pbs-restore.c b/pbs-restore.c
index 2f834cf42e..f03d9bab8d 100644
--- a/pbs-restore.c
+++ b/pbs-restore.c
@@ -29,7 +29,7 @@
static void help(void)
{
const char *help_msg =
- "usage: pbs-restore [--repository <repo>] snapshot archive-name target [command options]\n"
+ "usage: pbs-restore [--repository <repo>] [--ns namespace] snapshot archive-name target [command options]\n"
;
printf("%s", help_msg);
@@ -77,6 +77,7 @@ int main(int argc, char **argv)
Error *main_loop_err = NULL;
const char *format = "raw";
const char *repository = NULL;
+ const char *backup_ns = NULL;
const char *keyfile = NULL;
int verbose = false;
bool skip_zero = false;
@@ -90,6 +91,7 @@ int main(int argc, char **argv)
{"verbose", no_argument, 0, 'v'},
{"format", required_argument, 0, 'f'},
{"repository", required_argument, 0, 'r'},
+ {"ns", required_argument, 0, 'n'},
{"keyfile", required_argument, 0, 'k'},
{0, 0, 0, 0}
};
@@ -110,6 +112,9 @@ int main(int argc, char **argv)
case 'r':
repository = g_strdup(argv[optind - 1]);
break;
+ case 'n':
+ backup_ns = g_strdup(argv[optind - 1]);
+ break;
case 'k':
keyfile = g_strdup(argv[optind - 1]);
break;
@@ -160,8 +165,16 @@ int main(int argc, char **argv)
fprintf(stderr, "connecting to repository '%s'\n", repository);
}
char *pbs_error = NULL;
- ProxmoxRestoreHandle *conn = proxmox_restore_new(
- repository, snapshot, password, keyfile, key_password, fingerprint, &pbs_error);
+ ProxmoxRestoreHandle *conn = proxmox_restore_new_ns(
+ repository,
+ snapshot,
+ backup_ns,
+ password,
+ keyfile,
+ key_password,
+ fingerprint,
+ &pbs_error
+ );
if (conn == NULL) {
fprintf(stderr, "restore failed: %s\n", pbs_error);
return -1;
diff --git a/pve-backup.c b/pve-backup.c
index 41e8effa01..1c25ae98bd 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -10,6 +10,8 @@
#include "qapi/qmp/qerror.h"
#include "qemu/cutils.h"
+#include <proxmox-backup-qemu.h>
+
/* PVE backup state and related function */
/*
@@ -531,6 +533,7 @@ UuidInfo coroutine_fn *qmp_backup(
bool has_key_password, const char *key_password,
bool has_master_keyfile, const char *master_keyfile,
bool has_fingerprint, const char *fingerprint,
+ bool has_backup_ns, const char *backup_ns,
bool has_backup_id, const char *backup_id,
bool has_backup_time, int64_t backup_time,
bool has_use_dirty_bitmap, bool use_dirty_bitmap,
@@ -670,8 +673,9 @@ UuidInfo coroutine_fn *qmp_backup(
firewall_name = "fw.conf";
char *pbs_err = NULL;
- pbs = proxmox_backup_new(
+ pbs = proxmox_backup_new_ns(
backup_file,
+ has_backup_ns ? backup_ns : NULL,
backup_id,
backup_time,
dump_cb_block_size,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index d8c7331090..889726fc26 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -817,6 +817,8 @@
#
# @fingerprint: server cert fingerprint (optional for format 'pbs')
#
+# @backup-ns: backup namespace (required for format 'pbs')
+#
# @backup-id: backup ID (required for format 'pbs')
#
# @backup-time: backup timestamp (Unix epoch, required for format 'pbs')
@@ -836,6 +838,7 @@
'*key-password': 'str',
'*master-keyfile': 'str',
'*fingerprint': 'str',
+ '*backup-ns': 'str',
'*backup-id': 'str',
'*backup-time': 'int',
'*use-dirty-bitmap': 'bool',
@@ -3290,7 +3293,7 @@
{ 'struct': 'BlockdevOptionsPbs',
'data': { 'repository': 'str', 'snapshot': 'str', 'archive': 'str',
'*keyfile': 'str', '*password': 'str', '*fingerprint': 'str',
- '*key_password': 'str' } }
+ '*key_password': 'str', '*namespace': 'str' } }
##
# @BlockdevOptionsNVMe:

View File

@@ -0,0 +1,80 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Thu, 23 Jun 2022 14:00:05 +0200
Subject: [PATCH] Revert "block/rbd: workaround for ceph issue #53784"
This reverts commit fc176116cdea816ceb8dd969080b2b95f58edbc0 in
preparation to revert 0347a8fd4c3faaedf119be04c197804be40a384b.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/rbd.c | 42 ++----------------------------------------
1 file changed, 2 insertions(+), 40 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 64a8d7d48b..9fc6dcb957 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1348,7 +1348,6 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
int status, r;
RBDDiffIterateReq req = { .offs = offset };
uint64_t features, flags;
- uint64_t head = 0;
assert(offset + bytes <= s->image_size);
@@ -1376,43 +1375,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
return status;
}
-#if LIBRBD_VERSION_CODE < LIBRBD_VERSION(1, 17, 0)
- /*
- * librbd had a bug until early 2022 that affected all versions of ceph that
- * supported fast-diff. This bug results in reporting of incorrect offsets
- * if the offset parameter to rbd_diff_iterate2 is not object aligned.
- * Work around this bug by rounding down the offset to object boundaries.
- * This is OK because we call rbd_diff_iterate2 with whole_object = true.
- * However, this workaround only works for non cloned images with default
- * striping.
- *
- * See: https://tracker.ceph.com/issues/53784
- */
-
- /* check if RBD image has non-default striping enabled */
- if (features & RBD_FEATURE_STRIPINGV2) {
- return status;
- }
-
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
- /*
- * check if RBD image is a clone (= has a parent).
- *
- * rbd_get_parent_info is deprecated from Nautilus onwards, but the
- * replacement rbd_get_parent is not present in Luminous and Mimic.
- */
- if (rbd_get_parent_info(s->image, NULL, 0, NULL, 0, NULL, 0) != -ENOENT) {
- return status;
- }
-#pragma GCC diagnostic pop
-
- head = req.offs & (s->object_size - 1);
- req.offs -= head;
- bytes += head;
-#endif
-
- r = rbd_diff_iterate2(s->image, NULL, req.offs, bytes, true, true,
+ r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
qemu_rbd_diff_iterate_cb, &req);
if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
return status;
@@ -1431,8 +1394,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
}
- assert(req.bytes > head);
- *pnum = req.bytes - head;
+ *pnum = req.bytes;
return status;
}

View File

@@ -0,0 +1,35 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fabian Ebner <f.ebner@proxmox.com>
Date: Thu, 23 Jun 2022 14:00:07 +0200
Subject: [PATCH] Revert "block/rbd: fix handling of holes in
.bdrv_co_block_status"
This reverts commit 9e302f64bb407a9bb097b626da97228c2654cfee in
preparation to revert 0347a8fd4c3faaedf119be04c197804be40a384b.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
block/rbd.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/block/rbd.c b/block/rbd.c
index 9fc6dcb957..98f4ba2620 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1307,11 +1307,11 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, size_t len,
RBDDiffIterateReq *req = opaque;
assert(req->offs + req->bytes <= offs);
-
- /* treat a hole like an unallocated area and bail out */
- if (!exists) {
- return 0;
- }
+ /*
+ * we do not diff against a snapshot so we should never receive a callback
+ * for a hole.
+ */
+ assert(exists);
if (!req->exists && offs > req->offs) {
/*

Some files were not shown because too many files have changed in this diff Show More