Compare commits


7 Commits

Author SHA1 Message Date
Fiona Ebner
6cadf3677d bump version to 8.0.2-5
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-08-16 11:56:49 +02:00
Fiona Ebner
5f9cb29c3a backup: trim heap after finishing
Reported in the community forum [0]. By default, there can be large
amounts of memory left assigned to the QEMU process after backup.
Likely because of fragmentation, it's necessary to explicitly call
malloc_trim() to tell glibc that it shouldn't keep all that memory
resident for the process.

QEMU itself already does a malloc_trim() in the RCU thread, but that
code path might not be reached (or not for a long time) under usual
operation. The value of 4 MiB for the argument was also copied from
there.

Example with the following configuration:
> agent: 1
> boot: order=scsi0
> cores: 4
> cpu: x86-64-v2-AES
> ide2: none,media=cdrom
> memory: 1024
> name: backup-mem
> net0: virtio=DA:58:18:26:59:9F,bridge=vmbr0,firewall=1
> numa: 0
> ostype: l26
> scsi0: rbd:base-107-disk-0/vm-106-disk-1,size=4302M
> scsihw: virtio-scsi-pci
> smbios1: uuid=b2d4511e-8d01-44f1-afd6-9581b30c24a6
> sockets: 2
> startup: order=2
> virtio0: lvmthin:vm-106-disk-1,iothread=1,size=1G
> virtio1: lvmthin:vm-106-disk-2,iothread=1,size=1G
> virtio2: lvmthin:vm-106-disk-3,iothread=1,size=1G
> vmgenid: 0a1d8751-5e02-449d-977e-c0160e900231

Before the change:

> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  370948 kB
> root@pve8a1 ~ # vzdump 106 --storage pbs
> (...)
> INFO: Backup job finished successfully
> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	 2114964 kB

After the change:

> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  398788 kB
> root@pve8a1 ~ # vzdump 106 --storage pbs
> (...)
> INFO: Backup job finished successfully
> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:	  424356 kB

[0]: https://forum.proxmox.com/threads/131339/
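The glibc call the fix adds boils down to the following minimal standalone sketch (plain C, not the QEMU code itself; the 4 MiB pad matches the value copied from the RCU thread as mentioned above):

#include <malloc.h>   /* glibc-specific: malloc_trim() */
#include <stdlib.h>
#include <string.h>

int main(void)
{
    enum { CHUNKS = 4096, CHUNK_SIZE = 64 * 1024 };
    static void *bufs[CHUNKS];

    /* Simulate a burst of allocations, standing in for backup buffers. */
    for (int i = 0; i < CHUNKS; i++) {
        bufs[i] = malloc(CHUNK_SIZE);
        if (bufs[i] != NULL) {
            memset(bufs[i], 0xab, CHUNK_SIZE); /* touch the pages so they become resident */
        }
    }
    for (int i = 0; i < CHUNKS; i++) {
        free(bufs[i]);
    }

    /*
     * free() alone can leave the freed pages assigned to the process (high
     * VmRSS) when the heap is fragmented; malloc_trim() asks glibc to return
     * free heap memory to the kernel, keeping at most the given pad.
     */
    malloc_trim(4 * 1024 * 1024);
    return 0;
}

The actual change, visible in the pve-backup.c hunk further down, makes this call once after the backup job finishes.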

Co-diagnosed-by: Friedrich Weber <f.weber@proxmox.com>
Co-diagnosed-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2023-08-16 11:50:12 +02:00
Fiona Ebner
c36e3f9d17 refresh patch context
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2023-08-16 11:50:08 +02:00
Filip Schauer
b8b4ce0480 Add format attributes to function candidates
Add format attributes to functions that take printf-like arguments. This
provides additional compile-time checking that the correct parameters
are passed to the functions.

This fixes compiler warnings generated by the -Wsuggest-attribute=format
flag.
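
A minimal sketch of such an attribute (plain GCC/Clang syntax; the patches below use glib's G_GNUC_PRINTF macro, which expands to the same attribute, and the function name here is made up for illustration):

#include <stdarg.h>
#include <stdio.h>

/*
 * format(printf, 1, 2): argument 1 is the format string and the variadic
 * arguments start at position 2, so a mismatched format specifier becomes a
 * compile-time warning instead of undefined behaviour at runtime.
 */
static void log_error(const char *fmt, ...) __attribute__((format(printf, 1, 2)));

static void log_error(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
}

int main(void)
{
    log_error("failed after %d retries\n", 3);
    /* log_error("failed after %d retries\n", "three");  // would now warn at compile time */
    return 0;
}

-Wsuggest-attribute=format is the warning that points out functions which could carry such an attribute but do not.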

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
2023-08-08 09:08:48 +02:00
Fiona Ebner
df47146afe add patch fixing fd leak for vhost
Each pause+resume operation (which is also done as part of taking a VM
snapshot) would increase the number of open file descriptors by the
number of vhost devices (e.g. network devices by default). This could
lead to crashes during backup and surely other issues once the system
limit (default 1024) was reached [0].

[0]: https://forum.proxmox.com/threads/131603/
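
A standalone sketch of the failure mode (plain C, not the QEMU code; on Linux an event notifier is backed by an eventfd): creating such descriptors without ever closing them eventually hits the per-process limit.

#include <stdio.h>
#include <sys/eventfd.h>

int main(void)
{
    int count = 0;

    for (;;) {
        int fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
        if (fd < 0) {
            /* Typically fails with EMFILE once RLIMIT_NOFILE (default 1024) is hit. */
            perror("eventfd");
            fprintf(stderr, "created %d descriptors before failing\n", count);
            return 1;
        }
        count++;
        /*
         * The missing close(fd) here is the analogue of the missing cleanup
         * fixed by the vhost patch: every pause+resume leaked one descriptor
         * per vhost device until the limit was reached.
         */
    }
}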

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-08-03 17:40:13 +02:00
Fabian Grünbichler
d9cbfafeeb bump version to 8.0.2-4
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2023-07-28 12:59:10 +02:00
Fiona Ebner
5919ec1446 add patch fixing resume for snapshot and hibernate with drive with iothread and a dirty bitmap
Not difficult to run into: just have a drive with iothread, take a PBS
backup and then take a snapshot or hibernate. Resuming will fail with
> qemu: qemu_mutex_unlock_impl: Operation not permitted
because of not acquiring the correct AioContext first.

Migration is not affected, because it runs in coroutine context.

Reported in the community forum:
https://forum.proxmox.com/threads/129899/
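
A standalone sketch of why the message reads "Operation not permitted" (plain pthreads, not QEMU's AioContext API): releasing an error-checking mutex that was never acquired fails with EPERM, matching the errno behind the qemu_mutex_unlock_impl message above.

#include <pthread.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int ret;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&m, &attr);

    /* Unlocking without having locked first: EPERM ("Operation not permitted"). */
    ret = pthread_mutex_unlock(&m);
    printf("release without acquire: %s\n", strerror(ret));

    /* The fix below is the equivalent of this: acquire the context before
     * the code path that may drop and re-take it while polling. */
    pthread_mutex_lock(&m);
    ret = pthread_mutex_unlock(&m);
    printf("acquire then release:    %s\n", strerror(ret));

    pthread_mutex_destroy(&m);
    pthread_mutexattr_destroy(&attr);
    return 0;
}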

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-07-28 12:00:50 +02:00
14 changed files with 126 additions and 1134 deletions


@@ -55,7 +55,7 @@ $(BUILDDIR): keycodemapdb | submodule
 deb kvm: $(DEBS)
 $(DEB_DBG): $(DEB)
 $(DEB): $(BUILDDIR)
-	cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j32
+	cd $(BUILDDIR); dpkg-buildpackage -b -us -uc -j
 	lintian $(DEBS)
 
 sbuild: $(DSC)

debian/changelog

@@ -1,8 +1,18 @@
-pve-qemu-kvm (8.0.2-3+vitastor1) bookworm; urgency=medium
-
-  * Add Vitastor support
-
- -- Vitaliy Filippov <vitalif@yourcmc.ru> Sat, 24 Jun 2023 00:50:32 +0300
-
+pve-qemu-kvm (8.0.2-5) bookworm; urgency=medium
+
+  * improve memory footprint after backup by not keeping as much memory
+    resident.
+
+  * fix file descriptor leak for vhost (used by default by vNICs).
+
+ -- Proxmox Support Team <support@proxmox.com> Wed, 16 Aug 2023 11:52:24 +0200
+
+pve-qemu-kvm (8.0.2-4) bookworm; urgency=medium
+
+  * fix resume for snapshot and hibernate in combination with iothread and
+    dirty bitmap
+
+ -- Proxmox Support Team <support@proxmox.com> Fri, 28 Jul 2023 12:58:22 +0200
+
 pve-qemu-kvm (8.0.2-3) bookworm; urgency=medium


@@ -0,0 +1,48 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Fri, 28 Jul 2023 10:47:48 +0200
Subject: [PATCH] migration/block-dirty-bitmap: fix loading bitmap when there
 is an iothread

The bdrv_create_dirty_bitmap() function (which is also called by
bdrv_dirty_bitmap_create_successor()) uses bdrv_getlength(bs). This is
a wrapper around a coroutine, and thus uses bdrv_poll_co(). Polling
tries to release the AioContext which will trigger an assert() if it
hasn't been acquired before.

The issue does not happen for migration, because there we are in a
coroutine already, so the wrapper will just call bdrv_co_getlength()
directly without polling.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 migration/block-dirty-bitmap.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index fe73aa94b1..7eaf498439 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -805,8 +805,11 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
"destination", bdrv_dirty_bitmap_name(s->bitmap));
return -EINVAL;
} else {
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
s->bitmap = bdrv_create_dirty_bitmap(s->bs, granularity,
s->bitmap_name, &local_err);
+ aio_context_release(ctx);
if (!s->bitmap) {
error_report_err(local_err);
return -EINVAL;
@@ -833,7 +836,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
bdrv_disable_dirty_bitmap(s->bitmap);
if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
+ aio_context_release(ctx);
if (local_err) {
error_report_err(local_err);
return -EINVAL;


@@ -0,0 +1,29 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Li Feng <fengli@smartx.com>
Date: Mon, 31 Jul 2023 20:10:06 +0800
Subject: [PATCH] vhost: fix the fd leak

When vhost-user reconnects to the backend, the notifier should be
cleaned up. Otherwise, the fd resources will be exhausted.

Fixes: f9a09ca3ea ("vhost: add support for configure interrupt")
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
---
 hw/virtio/vhost.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index a266396576..8e3311781f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2034,6 +2034,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
event_notifier_test_and_clear(
&hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
event_notifier_test_and_clear(&vdev->config_notifier);
+ event_notifier_cleanup(
+ &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
trace_vhost_dev_stop(hdev, vdev->name, vrings);

File diff suppressed because it is too large


@@ -139,7 +139,7 @@ index 8a142fc7a9..a7824b5266 100644
'threadinfo.c',
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
new file mode 100644
index 0000000000..ac1fac6378
index 0000000000..aa2017d496
--- /dev/null
+++ b/migration/savevm-async.c
@@ -0,0 +1,533 @@
@@ -275,7 +275,7 @@ index 0000000000..ac1fac6378
+ return ret;
+}
+
+static void save_snapshot_error(const char *fmt, ...)
+static void G_GNUC_PRINTF(1, 2) save_snapshot_error(const char *fmt, ...)
+{
+ va_list ap;
+ char *msg;


@@ -192,7 +192,7 @@ index 9d0155a2a1..cc06240e8d 100644
int qemu_fclose(QEMUFile *f);
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index ac1fac6378..ea3b2f36a6 100644
index aa2017d496..b97f2c4f14 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -380,7 +380,7 @@ void qmp_savevm_start(const char *statefile, Error **errp)


@@ -1735,7 +1735,7 @@ index 0000000000..ac7da237d0
+}
diff --git a/vma.c b/vma.c
new file mode 100644
index 0000000000..304f02bc84
index 0000000000..c76ecefa0f
--- /dev/null
+++ b/vma.c
@@ -0,0 +1,878 @@
@@ -2314,13 +2314,13 @@ index 0000000000..304f02bc84
+ ret = blk_co_preadv(job->target, start * VMA_CLUSTER_SIZE,
+ readlen, &qiov, 0);
+ if (ret < 0) {
+ vma_writer_set_error(job->vmaw, "read error", -1);
+ vma_writer_set_error(job->vmaw, "read error");
+ goto out;
+ }
+
+ size_t zb = 0;
+ if (vma_writer_write(job->vmaw, job->dev_id, start, buf, &zb) < 0) {
+ vma_writer_set_error(job->vmaw, "backup_dump_cb vma_writer_write failed", -1);
+ vma_writer_set_error(job->vmaw, "backup_dump_cb vma_writer_write failed");
+ goto out;
+ }
+ }
@@ -2619,7 +2619,7 @@ index 0000000000..304f02bc84
+}
diff --git a/vma.h b/vma.h
new file mode 100644
index 0000000000..1b62859165
index 0000000000..86d2873aa5
--- /dev/null
+++ b/vma.h
@@ -0,0 +1,150 @@
@@ -2757,7 +2757,7 @@ index 0000000000..1b62859165
+int coroutine_fn vma_writer_flush_output(VmaWriter *vmaw);
+
+int vma_writer_get_status(VmaWriter *vmaw, VmaStatus *status);
+void vma_writer_set_error(VmaWriter *vmaw, const char *fmt, ...);
+void vma_writer_set_error(VmaWriter *vmaw, const char *fmt, ...) G_GNUC_PRINTF(2, 3);
+
+
+VmaReader *vma_reader_create(const char *filename, Error **errp);


@@ -79,7 +79,8 @@ Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
adapt for new job lock mechanism replacing AioContext locks
adapt to QAPI changes
improve canceling
allow passing max-workers setting]
allow passing max-workers setting
use malloc_trim after backup]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/meson.build | 5 +
@@ -92,11 +93,11 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
monitor/hmp-cmds.c | 72 +++
proxmox-backup-client.c | 146 +++++
proxmox-backup-client.h | 60 ++
pve-backup.c | 1097 ++++++++++++++++++++++++++++++++
pve-backup.c | 1109 ++++++++++++++++++++++++++++++++
qapi/block-core.json | 226 +++++++
qapi/common.json | 13 +
qapi/machine.json | 15 +-
14 files changed, 1711 insertions(+), 13 deletions(-)
14 files changed, 1723 insertions(+), 13 deletions(-)
create mode 100644 proxmox-backup-client.c
create mode 100644 proxmox-backup-client.h
create mode 100644 pve-backup.c
@@ -587,10 +588,10 @@ index 0000000000..8cbf645b2c
+#endif /* PROXMOX_BACKUP_CLIENT_H */
diff --git a/pve-backup.c b/pve-backup.c
new file mode 100644
index 0000000000..dd72ee0ed6
index 0000000000..10ca8a0b1d
--- /dev/null
+++ b/pve-backup.c
@@ -0,0 +1,1097 @@
@@ -0,0 +1,1109 @@
+#include "proxmox-backup-client.h"
+#include "vma.h"
+
@@ -605,6 +606,10 @@ index 0000000000..dd72ee0ed6
+#include "qapi/qmp/qerror.h"
+#include "qemu/cutils.h"
+
+#if defined(CONFIG_MALLOC_TRIM)
+#include <malloc.h>
+#endif
+
+#include <proxmox-backup-qemu.h>
+
+/* PVE backup state and related function */
@@ -869,6 +874,14 @@ index 0000000000..dd72ee0ed6
+ backup_state.stat.end_time = time(NULL);
+ backup_state.stat.finishing = false;
+ qemu_mutex_unlock(&backup_state.stat.lock);
+
+#if defined(CONFIG_MALLOC_TRIM)
+ /*
+ * Try to reclaim memory for buffers (and, in case of PBS, Rust futures), etc.
+ * Won't happen by default if there is fragmentation.
+ */
+ malloc_trim(4 * 1024 * 1024);
+#endif
+}
+
+static void coroutine_fn pvebackup_co_complete_stream(void *opaque)


@@ -175,10 +175,10 @@ index 0000000000..887e998b9e
+ NULL);
+}
diff --git a/pve-backup.c b/pve-backup.c
index dd72ee0ed6..cb5312fff3 100644
index 10ca8a0b1d..0a5ce2cab8 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1090,6 +1090,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
@@ -1102,6 +1102,7 @@ ProxmoxSupportStatus *qmp_query_proxmox_support(Error **errp)
ret->pbs_library_version = g_strdup(proxmox_backup_qemu_version());
ret->pbs_dirty_bitmap = true;
ret->pbs_dirty_bitmap_savevm = true;


@@ -19,7 +19,7 @@ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index fe73aa94b1..a6440929fa 100644
index 7eaf498439..509f3df0a6 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -539,7 +539,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,


@@ -67,10 +67,10 @@ index a8dfd8fefd..fa9b0b0f10 100644
* must_precopy:
* - must be migrated in precopy or in stopped state
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index a6440929fa..69fab3275c 100644
index 509f3df0a6..42dc4a8d61 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -1214,10 +1214,17 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
@@ -1220,10 +1220,17 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
{
DBMSaveState *s = &((DBMState *)opaque)->save;
SaveBitmapState *dbms = NULL;
@@ -90,7 +90,7 @@ index a6440929fa..69fab3275c 100644
return -1;
}
@@ -1225,7 +1232,9 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
@@ -1231,7 +1238,9 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
send_bitmap_start(f, s, dbms);
}
qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);


@@ -13,7 +13,7 @@ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
1 file changed, 2 deletions(-)
diff --git a/migration/savevm-async.c b/migration/savevm-async.c
index ea3b2f36a6..dd7744ab66 100644
index b97f2c4f14..87ea0573d3 100644
--- a/migration/savevm-async.c
+++ b/migration/savevm-async.c
@@ -403,10 +403,8 @@ void qmp_savevm_start(const char *statefile, Error **errp)


@@ -7,6 +7,8 @@ extra/0006-lsi53c895a-disable-reentrancy-detection-for-script-R.patch
 extra/0007-bcm2835_property-disable-reentrancy-detection-for-io.patch
 extra/0008-raven-disable-reentrancy-detection-for-iomem.patch
 extra/0009-apic-disable-reentrancy-detection-for-apic-msi.patch
+extra/0010-migration-block-dirty-bitmap-fix-loading-bitmap-when.patch
+extra/0011-vhost-fix-the-fd-leak.patch
 bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
 bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
 bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch
@@ -58,4 +60,3 @@ pve/0042-Revert-block-rbd-implement-bdrv_co_block_status.patch
 pve/0043-alloc-track-fix-deadlock-during-drop.patch
 pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
 pve/0045-savevm-async-don-t-hold-BQL-during-setup.patch
-pve-qemu-8.0-vitastor.patch