Compare commits

...

14 Commits

Author SHA1 Message Date
Thomas Lamprecht
eca4daeeed bump version to 8.0.2-7
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-10-04 08:33:39 +02:00
Fiona Ebner
816077299c fix #2874: SATA: avoid unsolicited write to sector 0 during reset
If there is a pending DMA operation during ide_bus_reset(), the fact
that the IDEstate is already reset before the operation is canceled
can be problematic. In particular, ide_dma_cb() might be called and
then use the reset IDEstate which contains the signature after the
reset. When used to construct the IO operation this leads to
ide_get_sector() returning 0 and nsector being 1. This is particularly
bad, because a write command will thus destroy the first sector which
often contains a partition table or similar.

Upstream discussion:
https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg04239.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-26 11:30:22 +02:00
Fiona Ebner
ef3308db71 vma: avoid compiler warning about incompatible pointer type
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-08 11:18:30 +02:00
Filip Schauer
0ff45eb23e backup: Fix spelling error in function name
Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
[FE: fixup patch context]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-08 11:13:04 +02:00
Thomas Lamprecht
6c5563e30b bump version to 8.0.2-6
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-09-06 17:04:04 +02:00
Fiona Ebner
9e0186f289 backup: drop broken BACKUP_FORMAT_DIR
Since upstream QEMU 8.0, it's no longer possible to call
bdrv_img_create() from a coroutine anymore, meaning a backup with the
directory format would crash the QEMU instance.

The feature is only exposed via the monitor and was intended to be
experimental. There were no user reports about the breakage and it
only was noticed during the rebase for QEMU 8.1, because other parts
of the backup code needed adaptation and I decided to check the
BACKUP_FORMAT_DIR case too.

It should not stay in a broken state of course, but avoid the
maintenance cost and just make it a removed feature for Proxmox VE 8
retroactively.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-06 16:59:12 +02:00
Fiona Ebner
0cffb504e7 backup: create jobs in a drained section
With the drive-backup QMP command, upstream QEMU uses a drained
section for the source drive when creating the backup job. Do the same
here to avoid subtle bugs.

There, the drained section extends until after the job is started, but
this cannot be done here for multi-disk backups (could at most start
the first job). The important thing is that the cbw
(copy-before-write) node is in place and the bcs (block-copy-state)
bitmap is initialized, which both happen during job creation (ensured
by the "block/backup: move bcs bitmap initialization to job creation"
PVE patch).

One such bug is one reported in the community forum [0], where using a
drive with iothread can lead to an overlapping block-copy request and
consequently an assertion failure. The block-copy code relies on the
bcs bitmap to determine if a request for a certain range can be
created. Each time a request is created, it resets the bcs bitmap at
that range to indicate that it's being handled.

The duplicate request can happen as follows:
Thread A attaches the cbw node
Thread B creates a request and resets the bitmap at that range
Thread A clears the bitmap and merges it with the PBS bitmap
The merging can lead to the bitmap being set again at the range of
the previous request, so the block-copy code thinks it's fine to
create a request there.
Thread B creates another requests at an overlapping range before the
other request is finished.

The drained section ensures that nothing else can interfere with the
bcs bitmap between attaching the copy-before-write block node and
initialization of the bitmap.

[0]: https://forum.proxmox.com/threads/133149/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-06 16:59:12 +02:00
Fiona Ebner
f7eed6caa1 regenerate patch stats
Apparently wasn't correct in 0cff91a ("fix #1534: vma: Add extract
filter for disk images").

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-06 16:59:12 +02:00
Filip Schauer
0cff91a000 fix #1534: vma: Add extract filter for disk images
Add a filter to the "vma extract" command. A comma seperated list of
disk images that should be extracted can be passed with the "-d" option.

Example to extract an IDE drive and an SCSI drive from vzdump.vma:

vma extract vzdump.vma -d "drive-ide0,drive-scsi0" extractdir

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
2023-08-30 10:40:51 +02:00
Fiona Ebner
6cadf3677d bump version to 8.0.2-5
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-08-16 11:56:49 +02:00
Fiona Ebner
5f9cb29c3a backup: trim heap after finishing
Reported in the community forum [0]. By default, there can be large
amounts of memory left assigned to the QEMU process after backup.
Likely because of fragmentation, it's necessary to explicitly call
malloc_trim() to tell glibc that it shouldn't keep all that memory
resident for the process.

QEMU itself already does a malloc_trim() in the RCU thread, but that
code path might not be reached (or not for a long time) under usual
operation. The value of 4 MiB for the argument was also copied from
there.

Example with the following configuration:
> agent: 1
> boot: order=scsi0
> cores: 4
> cpu: x86-64-v2-AES
> ide2: none,media=cdrom
> memory: 1024
> name: backup-mem
> net0: virtio=DA:58:18:26:59:9F,bridge=vmbr0,firewall=1
> numa: 0
> ostype: l26
> scsi0: rbd:base-107-disk-0/vm-106-disk-1,size=4302M
> scsihw: virtio-scsi-pci
> smbios1: uuid=b2d4511e-8d01-44f1-afd6-9581b30c24a6
> sockets: 2
> startup: order=2
> virtio0: lvmthin:vm-106-disk-1,iothread=1,size=1G
> virtio1: lvmthin:vm-106-disk-2,iothread=1,size=1G
> virtio2: lvmthin:vm-106-disk-3,iothread=1,size=1G
> vmgenid: 0a1d8751-5e02-449d-977e-c0160e900231

Before the change:

> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  370948 kB
> root@pve8a1 ~ # vzdump 106 --storage pbs
> (...)
> INFO: Backup job finished successfully
> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	 2114964 kB

After the change:

> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  398788 kB
> root@pve8a1 ~ # vzdump 106 --storage pbs
> (...)
> INFO: Backup job finished successfully
> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  424356 kB

[0]: https://forum.proxmox.com/threads/131339/

Co-diagnosed-by: Friedrich Weber <f.weber@proxmox.com>
Co-diagnosed-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2023-08-16 11:50:12 +02:00
Fiona Ebner
c36e3f9d17 refresh patch context
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2023-08-16 11:50:08 +02:00
Filip Schauer
b8b4ce0480 Add format attributes to function candidates
Add format attributes to functions that take printf-like arguments. This
provides additional compile-time checking that the correct parameters
are passed to the functions.

This fixes compiler warnings generated by the -Wsuggest-attribute=format
flag.

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
2023-08-08 09:08:48 +02:00
Fiona Ebner
df47146afe add patch fixing fd leak for vhost
Each pause+resume operation (which is also done as part of taking a VM
snapshot) would increase the number of open file descriptors by the
number of vhost devices (e.g. network devices by default). This could
lead to crashes during backup and surely other issues once the system
limit (default 1024) was reached [0].

[0]: https://forum.proxmox.com/threads/131603/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-08-03 17:40:13 +02:00
11 changed files with 260 additions and 104 deletions

31
debian/changelog vendored
View File

@@ -1,3 +1,34 @@
pve-qemu-kvm (8.0.2-7) bookworm; urgency=medium
* fix #2874: SATA: avoid unsolicited write to sector 0 during reset
-- Proxmox Support Team <support@proxmox.com> Wed, 04 Oct 2023 08:33:35 +0200
pve-qemu-kvm (8.0.2-6) bookworm; urgency=medium
* fix #1534: vma: add extract-filter for disk images allowing users to pass
a comma separated list of the disks they want to extract from an archive.
* backup: create jobs in a drained section to avoid subtle bugs where
something interferes with the block-copy-state bitmap on initialization
* backup: drop experimental, and since a while also fully broken, directory
backup format (BACKUP_FORMAT_DIR). This format was never exposed via the
Proxmox VE API, but only available via QMP, as its broken since QEMU 8 and
we got zero reports about that, it's safe to assume that there are no
public users, so just remove it completely.
-- Proxmox Support Team <support@proxmox.com> Wed, 06 Sep 2023 17:03:59 +0200
pve-qemu-kvm (8.0.2-5) bookworm; urgency=medium
* improve memory footprint after backup by not keeping as much memory
resident.
* fix file descriptor leak for vhost (used by default by vNICs).
-- Proxmox Support Team <support@proxmox.com> Wed, 16 Aug 2023 11:52:24 +0200
pve-qemu-kvm (8.0.2-4) bookworm; urgency=medium
* fix resume for snapshot and hibernate in combination with iothread and

View File

@@ -0,0 +1,29 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Li Feng <fengli@smartx.com>
Date: Mon, 31 Jul 2023 20:10:06 +0800
Subject: [PATCH] vhost: fix the fd leak
When the vhost-user reconnect to the backend, the notifer should be
cleanup. Otherwise, the fd resource will be exhausted.
Fixes: f9a09ca3ea ("vhost: add support for configure interrupt")
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
---
hw/virtio/vhost.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index a266396576..8e3311781f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2034,6 +2034,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
event_notifier_test_and_clear(
&hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
event_notifier_test_and_clear(&vdev->config_notifier);
+ event_notifier_cleanup(
+ &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
trace_vhost_dev_stop(hdev, vdev->name, vrings);

View File

@@ -0,0 +1,100 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Thu, 24 Aug 2023 11:22:21 +0200
Subject: [PATCH] hw/ide: reset: cancel async DMA operation before reseting
state
If there is a pending DMA operation during ide_bus_reset(), the fact
that the IDEstate is already reset before the operation is canceled
can be problematic. In particular, ide_dma_cb() might be called and
then use the reset IDEstate which contains the signature after the
reset. When used to construct the IO operation this leads to
ide_get_sector() returning 0 and nsector being 1. This is particularly
bad, because a write command will thus destroy the first sector which
often contains a partition table or similar.
Traces showing the unsolicited write happening with IDEstate
0x5595af6949d0 being used after reset:
> ahci_port_write ahci(0x5595af6923f0)[0]: port write [reg:PxSCTL] @ 0x2c: 0x00000300
> ahci_reset_port ahci(0x5595af6923f0)[0]: reset port
> ide_reset IDEstate 0x5595af6949d0
> ide_reset IDEstate 0x5595af694da8
> ide_bus_reset_aio aio_cancel
> dma_aio_cancel dbs=0x7f64600089a0
> dma_blk_cb dbs=0x7f64600089a0 ret=0
> dma_complete dbs=0x7f64600089a0 ret=0 cb=0x5595acd40b30
> ahci_populate_sglist ahci(0x5595af6923f0)[0]
> ahci_dma_prepare_buf ahci(0x5595af6923f0)[0]: prepare buf limit=512 prepared=512
> ide_dma_cb IDEState 0x5595af6949d0; sector_num=0 n=1 cmd=DMA WRITE
> dma_blk_io dbs=0x7f6420802010 bs=0x5595ae2c6c30 offset=0 to_dev=1
> dma_blk_cb dbs=0x7f6420802010 ret=0
> (gdb) p *qiov
> $11 = {iov = 0x7f647c76d840, niov = 1, {{nalloc = 1, local_iov = {iov_base = 0x0,
> iov_len = 512}}, {__pad = "\001\000\000\000\000\000\000\000\000\000\000",
> size = 512}}}
> (gdb) bt
> #0 blk_aio_pwritev (blk=0x5595ae2c6c30, offset=0, qiov=0x7f6420802070, flags=0,
> cb=0x5595ace6f0b0 <dma_blk_cb>, opaque=0x7f6420802010)
> at ../block/block-backend.c:1682
> #1 0x00005595ace6f185 in dma_blk_cb (opaque=0x7f6420802010, ret=<optimized out>)
> at ../softmmu/dma-helpers.c:179
> #2 0x00005595ace6f778 in dma_blk_io (ctx=0x5595ae0609f0,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> io_func=io_func@entry=0x5595ace6ee30 <dma_blk_write_io_func>,
> io_func_opaque=io_func_opaque@entry=0x5595ae2c6c30,
> cb=0x5595acd40b30 <ide_dma_cb>, opaque=0x5595af6949d0,
> dir=DMA_DIRECTION_TO_DEVICE) at ../softmmu/dma-helpers.c:244
> #3 0x00005595ace6f90a in dma_blk_write (blk=0x5595ae2c6c30,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> cb=cb@entry=0x5595acd40b30 <ide_dma_cb>, opaque=opaque@entry=0x5595af6949d0)
> at ../softmmu/dma-helpers.c:280
> #4 0x00005595acd40e18 in ide_dma_cb (opaque=0x5595af6949d0, ret=<optimized out>)
> at ../hw/ide/core.c:953
> #5 0x00005595ace6f319 in dma_complete (ret=0, dbs=0x7f64600089a0)
> at ../softmmu/dma-helpers.c:107
> #6 dma_blk_cb (opaque=0x7f64600089a0, ret=0) at ../softmmu/dma-helpers.c:127
> #7 0x00005595ad12227d in blk_aio_complete (acb=0x7f6460005b10)
> at ../block/block-backend.c:1527
> #8 blk_aio_complete (acb=0x7f6460005b10) at ../block/block-backend.c:1524
> #9 blk_aio_write_entry (opaque=0x7f6460005b10) at ../block/block-backend.c:1594
> #10 0x00005595ad258cfb in coroutine_trampoline (i0=<optimized out>,
> i1=<optimized out>) at ../util/coroutine-ucontext.c:177
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
hw/ide/core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 08e1f0c3d7..148fccdef2 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2513,19 +2513,19 @@ static void ide_dummy_transfer_stop(IDEState *s)
void ide_bus_reset(IDEBus *bus)
{
- bus->unit = 0;
- bus->cmd = 0;
- ide_reset(&bus->ifs[0]);
- ide_reset(&bus->ifs[1]);
- ide_clear_hob(bus);
-
- /* pending async DMA */
+ /* pending async DMA - needs the IDEState before it is reset */
if (bus->dma->aiocb) {
trace_ide_bus_reset_aio();
blk_aio_cancel(bus->dma->aiocb);
bus->dma->aiocb = NULL;
}
+ bus->unit = 0;
+ bus->cmd = 0;
+ ide_reset(&bus->ifs[0]);
+ ide_reset(&bus->ifs[1]);
+ ide_clear_hob(bus);
+
/* reset dma provider too */
if (bus->dma->ops->reset) {
bus->dma->ops->reset(bus->dma);

View File

@@ -139,7 +139,7 @@ index 8a142fc7a9..a7824b5266 100644
'threadinfo.c',
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
index 0000000000..ac1fac6378
index 0000000000..aa2017d496
--- /dev/null
+++ b/migration/savevm-async.c
@@ -0,0 +1,533 @@
@@ -275,7 +275,7 @@ index 0000000000..ac1fac6378
+ return ret;
+}
+
+static void save_snapshot_error(const char *fmt, ...)
+static void G_GNUC_PRINTF(1, 2) save_snapshot_error(const char *fmt, ...)
+{
+ va_list ap;
+ char *msg;

View File

@@ -192,7 +192,7 @@ index 9d0155a2a1..cc06240e8d 100644
int qemu_fclose(QEMUFile *f);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index ac1fac6378..ea3b2f36a6 100644
index aa2017d496..b97f2c4f14 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -380,7 +380,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)

View File

@@ -15,11 +15,11 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 2 +
meson.build | 5 +
vma-reader.c | 867 +++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 793 +++++++++++++++++++++++++++++++++++++++++
vma.c | 878 ++++++++++++++++++++++++++++++++++++++++++++++
vma-reader.c | 867 ++++++++++++++++++++++++++++++++++++++++++++
vma-writer.c | 793 ++++++++++++++++++++++++++++++++++++++++
vma.c | 900 ++++++++++++++++++++++++++++++++++++++++++++++
vma.h | 150 ++++++++
6 files changed, 2695 insertions(+)
6 files changed, 2717 insertions(+)
create mode 100644 vma-reader.c
create mode 100644 vma-writer.c
create mode 100644 vma.c
@@ -1735,10 +1735,10 @@ index 0000000000..ac7da237d0
+}
diff --git a/vma.c b/vma.c
new file mode 100644
index 0000000000..304f02bc84
index 0000000000..347f6283ca
--- /dev/null
+++ b/vma.c
@@ -0,0 +1,878 @@
@@ -0,0 +1,900 @@
+/*
+ * VMA: Virtual Machine Archive
+ *
@@ -1772,7 +1772,7 @@ index 0000000000..304f02bc84
+ "vma list <filename>\n"
+ "vma config <filename> [-c config]\n"
+ "vma create <filename> [-c config] pathname ...\n"
+ "vma extract <filename> [-r <fifo>] <targetdir>\n"
+ "vma extract <filename> [-d <drive-list>] [-r <fifo>] <targetdir>\n"
+ "vma verify <filename> [-v]\n"
+ ;
+
@@ -1917,9 +1917,10 @@ index 0000000000..304f02bc84
+ const char *filename;
+ const char *dirname;
+ const char *readmap = NULL;
+ gchar **drive_list = NULL;
+
+ for (;;) {
+ c = getopt(argc, argv, "hvr:");
+ c = getopt(argc, argv, "hvd:r:");
+ if (c == -1) {
+ break;
+ }
@@ -1928,6 +1929,9 @@ index 0000000000..304f02bc84
+ case 'h':
+ help();
+ break;
+ case 'd':
+ drive_list = g_strsplit(optarg, ",", 254);
+ break;
+ case 'r':
+ readmap = optarg;
+ break;
@@ -2064,12 +2068,12 @@ index 0000000000..304f02bc84
+
+ int i;
+ int vmstate_fd = -1;
+ guint8 vmstate_stream = 0;
+ bool drive_rename_bitmap[255];
+ memset(drive_rename_bitmap, 0, sizeof(drive_rename_bitmap));
+
+ for (i = 1; i < 255; i++) {
+ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i);
+ if (di && (strcmp(di->devname, "vmstate") == 0)) {
+ vmstate_stream = i;
+ char *statefn = g_strdup_printf("%s/vmstate.bin", dirname);
+ vmstate_fd = open(statefn, O_WRONLY|O_CREAT|O_EXCL, 0644);
+ if (vmstate_fd < 0) {
@@ -2089,7 +2093,21 @@ index 0000000000..304f02bc84
+
+ BlockBackend *blk = NULL;
+
+ if (readmap) {
+ if (drive_list) {
+ skip = true;
+ int j;
+ for (j = 0; drive_list[j]; j++) {
+ if (strcmp(drive_list[j], di->devname) == 0) {
+ skip = false;
+ drive_rename_bitmap[i] = true;
+ break;
+ }
+ }
+ } else {
+ drive_rename_bitmap[i] = true;
+ }
+
+ if (!skip && readmap) {
+ RestoreMap *map;
+ map = (RestoreMap *)g_hash_table_lookup(devmap, di->devname);
+ if (map == NULL) {
@@ -2102,7 +2120,7 @@ index 0000000000..304f02bc84
+ cache = map->cache;
+ write_zero = map->write_zero;
+ skip = map->skip;
+ } else {
+ } else if (!skip) {
+ devfn = g_strdup_printf("%s/tmp-disk-%s.raw",
+ dirname, di->devname);
+ printf("DEVINFO %s %zd\n", devfn, di->size);
@@ -2183,6 +2201,10 @@ index 0000000000..304f02bc84
+ }
+ }
+
+ if (drive_list) {
+ g_strfreev(drive_list);
+ }
+
+ if (vma_reader_restore(vmar, vmstate_fd, verbose, &errp) < 0) {
+ g_error("restore failed - %s", error_get_pretty(errp));
+ }
@@ -2190,7 +2212,7 @@ index 0000000000..304f02bc84
+ if (!readmap) {
+ for (i = 1; i < 255; i++) {
+ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i);
+ if (di && (i != vmstate_stream)) {
+ if (di && drive_rename_bitmap[i]) {
+ char *tmpfn = g_strdup_printf("%s/tmp-disk-%s.raw",
+ dirname, di->devname);
+ char *fn = g_strdup_printf("%s/disk-%s.raw",
@@ -2314,13 +2336,13 @@ index 0000000000..304f02bc84
+ ret = blk_co_preadv(job->target, start * VMA_CLUSTER_SIZE,
+ readlen, &qiov, 0);
+ if (ret < 0) {
+ vma_writer_set_error(job->vmaw, "read error", -1);
+ vma_writer_set_error(job->vmaw, "read error");
+ goto out;
+ }
+
+ size_t zb = 0;
+ if (vma_writer_write(job->vmaw, job->dev_id, start, buf, &zb) < 0) {
+ vma_writer_set_error(job->vmaw, "backup_dump_cb vma_writer_write failed", -1);
+ vma_writer_set_error(job->vmaw, "backup_dump_cb vma_writer_write failed");
+ goto out;
+ }
+ }
@@ -2619,7 +2641,7 @@ index 0000000000..304f02bc84
+}
diff --git a/vma.h b/vma.h
new file mode 100644
index 0000000000..1b62859165
index 0000000000..86d2873aa5
--- /dev/null
+++ b/vma.h
@@ -0,0 +1,150 @@
@@ -2757,7 +2779,7 @@ index 0000000000..1b62859165
+int coroutine_fn vma_writer_flush_output(VmaWriter *vmaw);
+
+int vma_writer_get_status(VmaWriter *vmaw, VmaStatus *status);
+void vma_writer_set_error(VmaWriter *vmaw, const char *fmt, ...);
+void vma_writer_set_error(VmaWriter *vmaw, const char *fmt, ...) G_GNUC_PRINTF(2, 3);
+
+
+VmaReader *vma_reader_create(const char *filename, Error **errp);

View File

@@ -79,24 +79,26 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
adapt for new job lock mechanism replacing AioContext locks
adapt to QAPI changes
improve canceling
allow passing max-workers setting]
allow passing max-workers setting
use malloc_trim after backup
create jobs in a drained section]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 5 +
block/monitor/block-hmp-cmds.c | 40 ++
block/monitor/block-hmp-cmds.c | 39 ++
blockdev.c | 1 +
hmp-commands-info.hx | 14 +
hmp-commands.hx | 31 +
hmp-commands.hx | 29 +
include/monitor/hmp.h | 3 +
meson.build | 1 +
monitor/hmp-cmds.c | 72 +++
proxmox-backup-client.c | 146 +++++
proxmox-backup-client.h | 60 ++
pve-backup.c | 1097 ++++++++++++++++++++++++++++++++
qapi/block-core.json | 226 +++++++
pve-backup.c | 1067 ++++++++++++++++++++++++++++++++
qapi/block-core.json | 229 +++++++
qapi/common.json | 13 +
qapi/machine.json | 15 +-
14 files changed, 1711 insertions(+), 13 deletions(-)
14 files changed, 1681 insertions(+), 13 deletions(-)
create mode 100644 proxmox-backup-client.c
create mode 100644 proxmox-backup-client.h
create mode 100644 pve-backup.c
@@ -118,10 +120,10 @@ index f580f95395..5bcebb934b 100644
softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
softmmu_ss.add(files('block-ram-registrar.c'))
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index ca2599de44..636509b83e 100644
index ca2599de44..6efe28cef5 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -1029,3 +1029,43 @@ void hmp_change_medium(Monitor *mon, const char *device, const char *target,
@@ -1029,3 +1029,42 @@ void hmp_change_medium(Monitor *mon, const char *device, const char *target,
qmp_blockdev_change_medium(device, NULL, target, arg, true, force,
!!read_only, read_only_mode, errp);
}
@@ -139,7 +141,6 @@ index ca2599de44..636509b83e 100644
+{
+ Error *error = NULL;
+
+ int dir = qdict_get_try_bool(qdict, "directory", 0);
+ const char *backup_file = qdict_get_str(qdict, "backupfile");
+ const char *devlist = qdict_get_try_str(qdict, "devlist");
+ int64_t speed = qdict_get_try_int(qdict, "speed", 0);
@@ -157,7 +158,7 @@ index ca2599de44..636509b83e 100644
+ false, false, // PBS use-dirty-bitmap
+ false, false, // PBS compress
+ false, false, // PBS encrypt
+ true, dir ? BACKUP_FORMAT_DIR : BACKUP_FORMAT_VMA,
+ true, BACKUP_FORMAT_VMA,
+ NULL, NULL,
+ devlist, qdict_haskey(qdict, "speed"), speed,
+ false, 0, // BackupPerf max-workers
@@ -203,10 +204,10 @@ index a166bff3d5..4b75966c2e 100644
{
.name = "usernet",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index d9f9f42d11..775518fb09 100644
index d9f9f42d11..ddb9678dc3 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -101,6 +101,37 @@ ERST
@@ -101,6 +101,35 @@ ERST
SRST
``block_stream``
Copy data from a backing file into a block device.
@@ -214,11 +215,9 @@ index d9f9f42d11..775518fb09 100644
+
+ {
+ .name = "backup",
+ .args_type = "directory:-d,backupfile:s,speed:o?,devlist:s?",
+ .params = "[-d] backupfile [speed [devlist]]",
+ .help = "create a VM Backup."
+ "\n\t\t\t Use -d to dump data into a directory instead"
+ "\n\t\t\t of using VMA format.",
+ .args_type = "backupfile:s,speed:o?,devlist:s?",
+ .params = "backupfile [speed [devlist]]",
+ .help = "create a VM backup (VMA format).",
+ .cmd = hmp_backup,
+ .coroutine = true,
+ },
@@ -587,10 +586,10 @@ index 0000000000..8cbf645b2c
+#endif /* PROXMOX_BACKUP_CLIENT_H */
diff --git a/pve-backup.c b/pve-backup.c
new file mode 100644
index 0000000000..dd72ee0ed6
index 0000000000..21441b2f97
--- /dev/null
+++ b/pve-backup.c
@@ -0,0 +1,1097 @@
@@ -0,0 +1,1067 @@
+#include "proxmox-backup-client.h"
+#include "vma.h"
+
@@ -605,6 +604,10 @@ index 0000000000..dd72ee0ed6
+#include "qapi/qmp/qerror.h"
+#include "qemu/cutils.h"
+
+#if defined(CONFIG_MALLOC_TRIM)
+#include <malloc.h>
+#endif
+
+#include <proxmox-backup-qemu.h>
+
+/* PVE backup state and related function */
@@ -697,7 +700,7 @@ index 0000000000..dd72ee0ed6
+ return error_or_canceled;
+}
+
+static void pvebackup_add_transfered_bytes(size_t transferred, size_t zero_bytes, size_t reused)
+static void pvebackup_add_transferred_bytes(size_t transferred, size_t zero_bytes, size_t reused)
+{
+ qemu_mutex_lock(&backup_state.stat.lock);
+ backup_state.stat.zero_bytes += zero_bytes;
@@ -758,7 +761,7 @@ index 0000000000..dd72ee0ed6
+ }
+
+ qemu_co_mutex_unlock(&backup_state.dump_callback_mutex);
+ pvebackup_add_transfered_bytes(size, is_zero_block ? size : 0, reused);
+ pvebackup_add_transferred_bytes(size, is_zero_block ? size : 0, reused);
+
+ return size;
+}
@@ -816,11 +819,11 @@ index 0000000000..dd72ee0ed6
+ } else {
+ if (remaining >= VMA_CLUSTER_SIZE) {
+ assert(ret == VMA_CLUSTER_SIZE);
+ pvebackup_add_transfered_bytes(VMA_CLUSTER_SIZE, zero_bytes, 0);
+ pvebackup_add_transferred_bytes(VMA_CLUSTER_SIZE, zero_bytes, 0);
+ remaining -= VMA_CLUSTER_SIZE;
+ } else {
+ assert(ret == remaining);
+ pvebackup_add_transfered_bytes(remaining, zero_bytes, 0);
+ pvebackup_add_transferred_bytes(remaining, zero_bytes, 0);
+ remaining = 0;
+ }
+ }
@@ -869,6 +872,14 @@ index 0000000000..dd72ee0ed6
+ backup_state.stat.end_time = time(NULL);
+ backup_state.stat.finishing = false;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+#if defined(CONFIG_MALLOC_TRIM)
+ /*
+ * Try to reclaim memory for buffers (and, in case of PBS, Rust futures), etc.
+ * Won't happen by default if there is fragmentation.
+ */
+ malloc_trim(4 * 1024 * 1024);
+#endif
+}
+
+static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
@@ -1020,7 +1031,6 @@ index 0000000000..dd72ee0ed6
+ const char *file,
+ const char *name,
+ BackupFormat format,
+ const char *backup_dir,
+ VmaWriter *vmaw,
+ ProxmoxBackupHandle *pbs,
+ Error **errp)
@@ -1046,13 +1056,6 @@ index 0000000000..dd72ee0ed6
+ } else if (format == BACKUP_FORMAT_PBS) {
+ if (proxmox_backup_co_add_config(pbs, name, (unsigned char *)cdata, clen, errp) < 0)
+ goto err;
+ } else if (format == BACKUP_FORMAT_DIR) {
+ char config_path[PATH_MAX];
+ snprintf(config_path, PATH_MAX, "%s/%s", backup_dir, name);
+ if (!g_file_set_contents(config_path, cdata, clen, &err)) {
+ error_setg(errp, "unable to write config file '%s'", config_path);
+ goto err;
+ }
+ }
+
+ out:
@@ -1103,12 +1106,16 @@ index 0000000000..dd72ee0ed6
+ AioContext *aio_context = bdrv_get_aio_context(di->bs);
+ aio_context_acquire(aio_context);
+
+ bdrv_drained_begin(di->bs);
+
+ BlockJob *job = backup_job_create(
+ NULL, di->bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+ bitmap_mode, false, NULL, &backup_state.perf, BLOCKDEV_ON_ERROR_REPORT,
+ BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn,
+ &local_err);
+
+ bdrv_drained_end(di->bs);
+
+ aio_context_release(aio_context);
+
+ di->job = job;
@@ -1188,7 +1195,6 @@ index 0000000000..dd72ee0ed6
+
+ BlockBackend *blk;
+ BlockDriverState *bs = NULL;
+ const char *backup_dir = NULL;
+ Error *local_err = NULL;
+ uuid_t uuid;
+ VmaWriter *vmaw = NULL;
@@ -1428,54 +1434,21 @@ index 0000000000..dd72ee0ed6
+ goto err_mutex;
+ }
+ }
+ } else if (format == BACKUP_FORMAT_DIR) {
+ if (mkdir(backup_file, 0640) != 0) {
+ error_setg_errno(errp, errno, "can't create directory '%s'\n",
+ backup_file);
+ goto err_mutex;
+ }
+ backup_dir = backup_file;
+
+ l = di_list;
+ while (l) {
+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+ l = g_list_next(l);
+
+ const char *devname = bdrv_get_device_name(di->bs);
+ snprintf(di->targetfile, PATH_MAX, "%s/%s.raw", backup_dir, devname);
+
+ int flags = BDRV_O_RDWR;
+ bdrv_img_create(di->targetfile, "raw", NULL, NULL, NULL,
+ di->size, flags, false, &local_err);
+ if (local_err) {
+ error_propagate(errp, local_err);
+ goto err_mutex;
+ }
+
+ di->target = bdrv_co_open(di->targetfile, NULL, NULL, flags, &local_err);
+ if (!di->target) {
+ error_propagate(errp, local_err);
+ goto err_mutex;
+ }
+ }
+ } else {
+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "unknown backup format");
+ goto err_mutex;
+ }
+
+
+ /* add configuration file to archive */
+ if (config_file) {
+ if (pvebackup_co_add_config(config_file, config_name, format, backup_dir,
+ vmaw, pbs, errp) != 0) {
+ if (pvebackup_co_add_config(config_file, config_name, format, vmaw, pbs, errp) != 0) {
+ goto err_mutex;
+ }
+ }
+
+ /* add firewall file to archive */
+ if (firewall_file) {
+ if (pvebackup_co_add_config(firewall_file, firewall_name, format, backup_dir,
+ vmaw, pbs, errp) != 0) {
+ if (pvebackup_co_add_config(firewall_file, firewall_name, format, vmaw, pbs, errp) != 0) {
+ goto err_mutex;
+ }
+ }
@@ -1588,10 +1561,6 @@ index 0000000000..dd72ee0ed6
+ backup_state.pbs = NULL;
+ }
+
+ if (backup_dir) {
+ rmdir(backup_dir);
+ }
+
+ qemu_co_mutex_unlock(&backup_state.backup_mutex);
+ return NULL;
+}
@@ -1689,10 +1658,10 @@ index 0000000000..dd72ee0ed6
+ return ret;
+}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 542add004b..4ec70acf95 100644
index 542add004b..985859ddee 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -835,6 +835,232 @@
@@ -835,6 +835,235 @@
{ 'command': 'query-block', 'returns': ['BlockInfo'],
'allow-preconfig': true }
@@ -1742,9 +1711,12 @@ index 542add004b..4ec70acf95 100644
+# An enumeration of supported backup formats.
+#
+# @vma: Proxmox vma backup format
+#
+# @pbs: Proxmox backup server format
+#
+##
+{ 'enum': 'BackupFormat',
+ 'data': [ 'vma', 'dir', 'pbs' ] }
+ 'data': [ 'vma', 'pbs' ] }
+
+##
+# @backup:

View File

@@ -403,10 +403,10 @@ index 32ab849ce6..69afe3441b 100644
summary_info += {'libdaxctl support': libdaxctl}
summary_info += {'libudev': libudev}
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4ec70acf95..47118bf83e 100644
index 985859ddee..d601fb4ab2 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3301,6 +3301,7 @@
@@ -3304,6 +3304,7 @@
'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
'raw', 'rbd',
{ 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
@@ -414,7 +414,7 @@ index 4ec70acf95..47118bf83e 100644
'ssh', 'throttle', 'vdi', 'vhdx',
{ 'name': 'virtio-blk-vfio-pci', 'if': 'CONFIG_BLKIO' },
{ 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
@@ -3377,6 +3378,17 @@
@@ -3380,6 +3381,17 @@
{ 'struct': 'BlockdevOptionsNull',
'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' } }
@@ -432,7 +432,7 @@ index 4ec70acf95..47118bf83e 100644
##
# @BlockdevOptionsNVMe:
#
@@ -4750,6 +4762,7 @@
@@ -4753,6 +4765,7 @@
'nfs': 'BlockdevOptionsNfs',
'null-aio': 'BlockdevOptionsNull',
'null-co': 'BlockdevOptionsNull',

View File

@@ -175,10 +175,10 @@ index 0000000000..887e998b9e
+ NULL);
+}
diff --git a/pve-backup.c b/pve-backup.c
index dd72ee0ed6..cb5312fff3 100644
index 21441b2f97..5e5c37e06d 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1090,6 +1090,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1060,6 +1060,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
ret->pbs_dirty_bitmap = true;
ret->pbs_dirty_bitmap_savevm = true;
@@ -187,10 +187,10 @@ index dd72ee0ed6..cb5312fff3 100644
ret->pbs_masterkey = true;
ret->backup_max_workers = true;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 47118bf83e..809f3c61bc 100644
index d601fb4ab2..16be1e02ec 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -984,6 +984,11 @@
@@ -987,6 +987,11 @@
# @pbs-dirty-bitmap-savevm: True if 'dirty-bitmaps' migration capability can
# safely be set for savevm-async.
#
@@ -202,7 +202,7 @@ index 47118bf83e..809f3c61bc 100644
# @pbs-masterkey: True if the QMP backup call supports the 'master_keyfile'
# parameter.
#
@@ -994,6 +999,7 @@
@@ -997,6 +1002,7 @@
'data': { 'pbs-dirty-bitmap': 'bool',
'query-bitmap-info': 'bool',
'pbs-dirty-bitmap-savevm': 'bool',

View File

@@ -13,7 +13,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 2 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index ea3b2f36a6..dd7744ab66 100644
index b97f2c4f14..87ea0573d3 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -403,10 +403,8 @@ void qmp_savevm_start(const char *statefile, Error **errp)

View File

@@ -8,6 +8,8 @@ extra/0007-bcm2835_property-disable-reentrancy-detection-for-io.patch
extra/0008-raven-disable-reentrancy-detection-for-iomem.patch
extra/0009-apic-disable-reentrancy-detection-for-apic-msi.patch
extra/0010-migration-block-dirty-bitmap-fix-loading-bitmap-when.patch
extra/0011-vhost-fix-the-fd-leak.patch
extra/0012-hw-ide-reset-cancel-async-DMA-operation-before-reset.patch
bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch