mirror_qemu

Commit Graph

Author	SHA1	Message	Date
Dr. David Alan Gilbert	d3dff7a5a1	vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE message on an incoming advise. Later patches will fill in the behaviour/contents of the message. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert	1693c64c27	postcopy: Add notifier chain Add a notifier chain for postcopy with a 'reason' flag and an opportunity for a notifier member to return an error. Call it when enabling postcopy. This will initially used to enable devices to declare they're unable to postcopy and later to notify of devices of stages within postcopy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert	2ce16640b4	postcopy: use UFFDIO_ZEROPAGE only when available Use a flag on the RAMBlock to state whether it has the UFFDIO_ZEROPAGE capability, use it when it's available. This allows the use of postcopy on tmpfs as well as hugepage backed files. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-03-20 05:03:27 +02:00
Peter Xu	dd0ee30cae	migration: fix applying wrong capabilities When setting migration capabilities via QMP/HMP, we'll apply them even if the capability check failed. Fix it. Fixes: `4a84214ebe` ("migration: provide migrate_caps_check()", 2017-07-18) Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180305094938.31374-1-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-03-09 17:39:25 +00:00
Peter Lieven	ef9c5160a1	migration/block: rename MAX_INFLIGHT_IO to MAX_IO_BUFFERS this actually limits (as the original commit mesage suggests) the number of I/O buffers that can be allocated and not the number of parallel (inflight) I/O requests. Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1520507908-16743-4-git-send-email-pl@kamp.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-03-09 17:39:25 +00:00
Peter Lieven	86b124bc76	migration/block: reset dirty bitmap before read in bulk phase Reset the dirty bitmap before reading to make sure we don't miss any new data. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1520507908-16743-3-git-send-email-pl@kamp.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-03-09 17:39:25 +00:00
Peter Lieven	b255734531	migration: do not transfer ram during bulk storage migration this patch makes the bulk phase of a block migration to take place before we start transferring ram. As the bulk block migration can take a long time its pointless to transfer ram during that phase. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <1520507908-16743-2-git-send-email-pl@kamp.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-03-09 17:39:25 +00:00
Marc-André Lureau	ab105cc138	migration: fix minor finalize leak Spotted thanks to ASAN: QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/migration-test -p /x86_64/migration/bad_dest ==30302==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f60efba1a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f60eef3cf75 in g_malloc0 ../glib/gmem.c:124 #2 0x55ca9094702c in error_copy /home/elmarco/src/qemu/util/error.c:203 #3 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139 #4 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150 #5 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411 #6 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81 #7 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85 #8 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142 #9 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88 #10 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552 #11 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182 #12 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847 #13 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #14 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #15 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #16 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942 #17 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724 #18 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009) Indirect leak of 45 byte(s) in 1 object(s) allocated from: #0 0x7f60efba1850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f60eef3cf0c in g_malloc ../glib/gmem.c:94 #2 0x7f60eef3d1cf in g_malloc_n ../glib/gmem.c:331 #3 0x7f60eef596eb in g_strdup ../glib/gstrfuncs.c:363 #4 0x55ca90947085 in error_copy /home/elmarco/src/qemu/util/error.c:204 #5 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139 #6 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150 #7 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411 #8 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81 #9 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85 #10 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142 #11 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88 #12 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552 #13 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182 #14 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847 #15 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #16 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #17 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #18 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942 #19 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724 #20 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180306170959.3921-1-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-03-09 17:39:25 +00:00
Peter Xu	1939ccdaa6	qio: non-default context for TLS handshake A new parameter "context" is added to qio_channel_tls_handshake() is to allow the TLS to be run on a non-default context. Still, no functional change. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2018-03-06 10:19:07 +00:00
Peter Xu	8005fdd8fa	qio: non-default context for async conn We have worked on qio_task_run_in_thread() already. Further, let all the qio channel APIs use that context. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2018-03-06 10:19:06 +00:00
Markus Armbruster	112ed241f5	qapi: Empty out qapi-schema.json The previous commit improved compile time by including less of the generated QAPI headers. This is impossible for stuff defined directly in qapi-schema.json, because that ends up in headers that that pull in everything. Move everything but include directives from qapi-schema.json to new sub-module qapi/misc.json, then include just the "misc" shard where possible. It's possible everywhere, except: * monitor.c needs qmp-command.h to get qmp_init_marshal() * monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need qapi-event.h to get enum QAPIEvent Perhaps we'll get rid of those some other day. Adding a type to qapi/migration.json now recompiles some 120 instead of 2300 out of 5100 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-25-armbru@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:45:50 -06:00
Markus Armbruster	9af2398977	Include less of the generated modular QAPI headers In my "build everything" tree, a change to the types in qapi-schema.json triggers a recompile of about 4800 out of 5100 objects. The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h, qapi-types.h. Each of these headers still includes all its shards. Reduce compile time by including just the shards we actually need. To illustrate the benefits: adding a type to qapi/migration.json now recompiles some 2300 instead of 4800 objects. The next commit will improve it further. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180211093607.27351-24-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> [eblake: rebase to master] Signed-off-by: Eric Blake <eblake@redhat.com>	2018-03-02 13:45:50 -06:00
Peter Xu	3e0c8050eb	migration: pass MigrationState to migrate_init() Let the callers take the object, then pass it to migrate_init(). Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180208103132.28452-12-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-14 10:37:09 +00:00
Peter Xu	d6208e35e4	migration: allow send_rq to fail We will not allow failures to happen when sending data from destination to source via the return path. However it is possible that there can be errors along the way. This patch allows the migrate_send_rp_message() to return error when it happens, and further extended it to migrate_send_rp_req_pages(). Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180208103132.28452-9-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-14 10:36:52 +00:00
Peter Xu	9ab7ef9b66	migration: provide postcopy_fault_thread_notify() A general helper to notify the fault thread. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180208103132.28452-4-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-14 10:36:02 +00:00
Peter Xu	64f615fe34	migration: reuse mis->userfault_quit_fd It was only used for quitting the page fault thread before. Let it be something more useful - now we can use it to notify a "wake" for the page fault thread (for any reason), and it only means "quit" if the fault_thread_quit is set. Since we changed what it does, renaming it to userfault_event_fd. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180208103132.28452-3-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-14 10:35:45 +00:00
Peter Xu	7a9ddfbfae	migration: better error handling with QEMUFile If the postcopy down due to some reason, we can always see this on dst: qemu-system-x86_64: RP: Received invalid message 0x0000 length 0x0000 However in most cases that's not the real issue. The problem is that qemu_get_be16() has no way to show whether the returned data is valid or not, and we are _always_ assuming it is valid. That's possibly not wise. The best approach to solve this would be: refactoring QEMUFile interface to allow the APIs to return error if there is. However it needs quite a bit of work and testing. For now, let's explicitly check the validity first before using the data in all places for qemu_get_*(). This patch tries to fix most of the cases I can see. Only if we are with this, can we make sure we are processing the valid data, and also can we make sure we can capture the channel down events correctly. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180208103132.28452-2-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-14 10:34:56 +00:00
Dr. David Alan Gilbert	b9ccaf6d74	migration: Fix early failure cleanup Avoid crash in cleanup after a very early migration failure (possibly due to my `688a3dcba9` 'Route errors down ...') Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20180212160340.15333-2-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2018-02-14 10:31:01 +00:00
Ross Lagerwall	96994fd1e4	migration/xen: Check return value of qemu_fclose QEMUFile uses buffered IO so when writing small amounts (such as the Xen device state file), the actual write call and any errors that may occur only happen as part of qemu_fclose(). Therefore, report IO errors when saving the device state under Xen by checking the return value of qemu_fclose(). Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Message-Id: <20180206163039.23661-1-ross.lagerwall@citrix.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-14 10:18:42 +00:00
Markus Armbruster	452fcdbc49	Include qapi/qmp/qdict.h exactly where needed This cleanup makes the number of objects depending on qapi/qmp/qdict.h drop from 4550 (out of 4743) to 368 in my "build everything" tree. For qapi/qmp/qobject.h, the number drops from 4552 to 390. While there, separate #include from file comment with a blank line. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-13-armbru@redhat.com>	2018-02-09 13:52:15 +01:00
Markus Armbruster	15280c360e	qdict qlist: Make most helper macros functions The macro expansions of qdict_put_TYPE() and qlist_append_TYPE() need qbool.h, qnull.h, qnum.h and qstring.h to compile. We include qnull.h and qnum.h in the headers, but not qbool.h and qstring.h. Works, because we include those wherever the macros get used. Open-coding these helpers is of dubious value. Turn them into functions and drop the includes from the headers. This cleanup makes the number of objects depending on qapi/qmp/qnum.h from 4551 (out of 4743) to 46 in my "build everything" tree. For qapi/qmp/qnull.h, the number drops from 4552 to 21. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-10-armbru@redhat.com>	2018-02-09 13:52:15 +01:00
Markus Armbruster	e688df6bc4	Include qapi/error.h exactly where needed This cleanup makes the number of objects depending on qapi/error.h drop from 1910 (out of 4743) to 1612 in my "build everything" tree. While there, separate #include from file comment with a blank line, and drop a useless comment on why qemu/osdep.h is included first. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-5-armbru@redhat.com> [Semantic conflict with commit `34e304e975` resolved, OSX breakage fixed]	2018-02-09 13:50:17 +01:00
Markus Armbruster	522ece32d2	Drop superfluous includes of qapi-types.h and test-qapi-types.h Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20180201111846.21846-4-armbru@redhat.com>	2018-02-09 05:05:11 +01:00
Greg Kurz	875fcd013a	migration: incoming postcopy advise sanity checks If postcopy-ram was set on the source but not on the destination, migration doesn't occur, the destination prints an error and boots the guest: qemu-system-ppc64: Expected vmdescription section, but got 0 We end up with two running instances. This behaviour was introduced in 2.11 by commit `58110f0acb` "migration: split common postcopy out of ram postcopy" to prepare ground for the upcoming dirty bitmap postcopy support. It adds a new case where the source may send an empty postcopy advise because dirty bitmap doesn't need to check page sizes like RAM postcopy does. If the source has enabled postcopy-ram, then it sends an advise with the page size values. If the destination hasn't enabled postcopy-ram, then loadvm_postcopy_handle_advise() leaves the page size values on the stream and returns. This confuses qemu_loadvm_state() later on and causes the destination to start execution. As discussed several times, postcopy-ram should be enabled both sides to be functional. This patch changes the destination to perform some extra checks on the advise length to ensure this is the case. Otherwise an error is returned and migration is aborted. Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <151791621042.19120.3103118434734245776.stgit@bahia> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-06 14:53:02 +00:00
Ross Lagerwall	032b79f717	migration: Don't leak IO channels Since qemu_fopen_channel_{in,out}put take references on the underlying IO channels, make sure to release our references to them. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Message-Id: <20171101142526.1006-2-ross.lagerwall@citrix.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-06 14:53:02 +00:00
Dr. David Alan Gilbert	6039dd5b1c	migration: Recover block devices if failure in device state In `e91d895` I added the new pause-before-switchover mechanism to allow migration completion to be delayed; this changes the last state prior to completion to MIGRATE_STATUS_DEVICE rather than MIGRATE_STATUS_ACTIVE. Fix the failure path in migration_completion to recover the block devices if it fails in MIGRATE_STATUS_DEVICE, not just the MIGRATE_STATUS_ACTIVE that it previously had. This corresponds to rh bz: https://bugzilla.redhat.com/show_bug.cgi?id=1538494 whose symptom is an occasional source crash on a failed migration. Fixes: `e91d8951d5` Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-06 14:53:02 +00:00
Juan Quintela	7faccdc3e7	migration: Drop current address parameter from save_zero_page() It already has RAMBlock and offset, it can calculate it itself. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-06 10:55:13 +00:00
Wei Wang	0781c1ed1c	migration: use s->threshold_size inside migration_update_counters Fixes: `b15df1ae50` ("migration: cleanup stats update into function") The threshold size is changed to be recorded in s->threshold_size. Signed-off-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-06 10:55:13 +00:00
Daniel Henrique Barboza	ee555cdf4d	migration/savevm.c: set MAX_VM_CMD_PACKAGED_SIZE to 1ul << 32 MAX_VM_CMD_PACKAGED_SIZE is a constant used in qemu_savevm_send_packaged and loadvm_handle_cmd_packaged to determine whether a package is too big to be sent or received. qemu_savevm_send_packaged is called inside postcopy_start (migration/migration.c) to send the MigrationState in a single blob to the destination, using the MIG_CMD_PACKAGED subcommand, which will read it up using loadvm_handle_cmd_packaged. If the blob is larger than MAX_VM_CMD_PACKAGED_SIZE, an error is thrown and the postcopy migration is aborted. Both MAX_VM_CMD_PACKAGED_SIZE and MIG_CMD_PACKAGED were introduced by commit `11cf1d984b` ("MIG_CMD_PACKAGED: Send a packaged chunk ..."). The constant has its original value of 1ul << 24 (16MB). The current MAX_VM_CMD_PACKAGED_SIZE value is not enough to support postcopy migration of bigger pseries guests. The blob size for a postcopy migration of a pseries guest with the following setup: qemu-system-ppc64 --nographic -vga none -machine pseries,accel=kvm -m 64G \ -smp 1,maxcpus=32 -device virtio-blk-pci,drive=rootdisk \ -drive file=f27.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \ -netdev user,id=u1 -net nic,netdev=u1 Goes around 12MB. Bumping the RAM to 128G makes the blob sizes goes to 20MB. With 256G the blob goes to 37MB - more than twice the current maximum size. At this moment the pseries machine can handle guests with up to 1TB of RAM, making this postcopy blob goes to 128MB of size approximately. Following the discussions made in [1], there is a need to understand what devices are aggressively consuming the blob in that manner and see if that can be mitigated. Until then, we can set MAX_VM_CMD_PACKAGED_SIZE to the maximum value allowed. Since the size is a 32 bit int variable, we can set it as 1ul << 32, giving a maximum blob size of 4G that is enough to support postcopy migration of 32TB RAM guests given the above constraints. [1] https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg06313.html Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-06 10:55:13 +00:00
Dr. David Alan Gilbert	688a3dcba9	migration: Route errors down through migration_channel_connect Route async errors (especially from sockets) down through migration_channel_connect and on to migrate_fd_connect where they can be cleaned up. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-06 10:55:12 +00:00
Dr. David Alan Gilbert	cce8040bb0	migration: Allow migrate_fd_connect to take an Error * Allow whatever is performing the connection to pass migrate_fd_connect an error to indicate there was a problem during connection, an allow us to clean up. The caller must free the error. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2018-02-06 10:55:12 +00:00
Peter Maydell	52483b067c	Pull request for various patches that have been reviewed and laying on the mailing list for a while, but apparently no maintainer feels really responsible for picking up. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJaZcaYAAoJEC7Z13T+cC21ab0P/3fE52pp0BEWRfM3MkTyJCgs c3ZDR2raLsrwl5MMoeV6TCZ8ILp3RR5ipdnpsxAlUKi0e953hduFXJ/A9meElogu i1BdCl7SBYUWXg9WKqH5cX9LGiGiRLQ53KehCB6wa4nXBjkL1bGtbprbp5kCb+Sn tavBoIqkwxC2VvqJHL23uS7n/3bPkr4XA1bA/VWezm7be6f5bEqBzORdxPabRC3f M7b1ntl2Xj9PXpwKZkHgET8Wg1Ne5kCUvvx9o22iMuHhBHsxAmMc06Q96wihDUI3 /CwzaErGrykGRX95y++yaBMUEYMSk90dv9cXHTMryDw/0id0OMnpcGm6SeQHlcQT ATrhnH1VzEcgGJPYpqKxNvb0pJZ7t7gYUSi0HMC83PG2S9wD/kBvtB8rqumTolKB cvI6l7PFfCZIr3FsTyGHX1KVRHX8PWljnKIvAbyUEuK4XDnSX7hT+7jQvuxj4HSZ /ZlksnSIHcUf5gx6zG2StVKo5TEnY6JUhf8CuQeILW0ZGj6V/aUFbR5aMmgVinyj p/2OOUsze6rYCcpVE2kI7hSMrSXk4QfvPyIvjGf86EVxTM4SK03bP4QVFwtZtFzA HXXFIgKNcE7I7BhghC6PLcmHaYBiK8A5EIvXpgi4tD7L3PFLos6wtR6T2ze3mBn2 JyKxGw9rExlAJMi5/WE9 =JCAM -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/huth/tags/pull-request-2018-01-22' into staging Pull request for various patches that have been reviewed and laying on the mailing list for a while, but apparently no maintainer feels really responsible for picking up. # gpg: Signature made Mon 22 Jan 2018 11:10:16 GMT # gpg: using RSA key 0x2ED9D774FE702DB5 # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" # gpg: aka "Thomas Huth <thuth@redhat.com>" # gpg: aka "Thomas Huth <huth@tuxfamily.org>" # gpg: aka "Thomas Huth <th.huth@posteo.de>" # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * remotes/huth/tags/pull-request-2018-01-22: hw/isa: Replace fprintf(stderr, "\n" with error_report() hw/ipmi: Replace fprintf(stderr, "\n" with error_report() hw/bt: Replace fprintf(stderr, "*\n" with error_report() Fixes after renaming __FUNCTION__ to __func__ Replace all occurances of __FUNCTION__ with __func__ tests/cpu-plug-test: Test CPU hot-plugging on s390x tests/cpu-plug-test: Check CPU hot-plugging on ppc64, too tests/cpu-plug-test: Check the CPU hot-plugging with device_add, too tests: Rename pc-cpu-test.c to cpu-plug-test.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-01-23 10:15:09 +00:00
Peter Maydell	ee86981bda	migration: Revert postcopy-blocktime commit set This reverts commits `ca6011c` migration: add postcopy total blocktime into query-migrate `5f32dc8` migration: add blocktime calculation into migration-test `2f7dae9` migration: postcopy_blocktime documentation `3be98be` migration: calculate vCPU blocktime on dst side `01a87f0` migration: add postcopy blocktime ctx into MigrationIncomingState `31bf06a` migration: introduce postcopy-blocktime capability as they don't build on ppc32 due to trying to do atomic accesses on types that are larger than the host pointer type. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-01-23 10:08:05 +00:00
Alistair Francis	a89f364ae8	Replace all occurances of __FUNCTION__ with __func__ Replace all occurs of __FUNCTION__ except for the check in checkpatch with the non GCC specific __func__. One line in hcd-musb.c was manually tweaked to pass checkpatch. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> [THH: Removed hunks related to pxa2xx_mmci.c (fixed already)] Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-01-22 09:46:18 +01:00
Peter Maydell	c1d5b9add7	* QemuMutex tracing improvements (Alex) * ram_addr_t optimization (David) * SCSI fixes (Fam, Stefan, me) * do {} while (0) fixes (Eric) * KVM fix for PMU (Jan) * memory leak fixes from ASAN (Marc-André) * migration fix for HPET, icount, loadvm (Maria, Pavel) * hflags fixes (me, Tao) * block/iscsi uninitialized variable (Peter L.) * full support for GMainContexts in character devices (Peter Xu) * more boot-serial-test (Thomas) * Memory leak fix (Zhecheng) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJaXgkRAAoJEL/70l94x66DA3EIAI8z8Y+1NAmbLqiHhrrN9Ji/ b8EHQ8wf0pwwrHuRVKYZvKUU8yvp/CRIoVWZwfeGjRbZC+l7l+BAwdOx42Bj/dUW VopNzcJMu3s5SNwoYLvs01OjhciBYNXWTXBkIiErwurF0Ow7oYR7trkLwOw0veSO L4qFAGoIBI/7b6BZ3YRQXshhzdSQ6dvHrDness2V1c0crLG+yhvjKJ8PJ2tJyNZO DbsrCd7hS6e6liSUqdLj9XgRySFj9R5kgjaLjckjg1SC6kmhLN9hyke8iXgH7uvz WGnRPmKjKexFHVYgR0rRFlazcQclAczHuIi/OZe0HLi6trg2YKBkolMaQLQdgfk= =HTyS -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * QemuMutex tracing improvements (Alex) * ram_addr_t optimization (David) * SCSI fixes (Fam, Stefan, me) * do {} while (0) fixes (Eric) * KVM fix for PMU (Jan) * memory leak fixes from ASAN (Marc-André) * migration fix for HPET, icount, loadvm (Maria, Pavel) * hflags fixes (me, Tao) * block/iscsi uninitialized variable (Peter L.) * full support for GMainContexts in character devices (Peter Xu) * more boot-serial-test (Thomas) * Memory leak fix (Zhecheng) # gpg: Signature made Tue 16 Jan 2018 14:15:45 GMT # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (51 commits) scripts/analyse-locks-simpletrace.py: script to analyse lock times util/qemu-thread-: add qemu_lock, locked and unlock trace events cpu: flush TB cache when loading VMState block/iscsi: fix initialization of iTask in iscsi_co_get_block_status find_ram_offset: Align ram_addr_t allocation on long boundaries find_ram_offset: Add comments and tracing cpu_physical_memory_sync_dirty_bitmap: Another alignment fix checkpatch: Enforce proper do/while (0) style maint: Fix macros with broken 'do/while(0); ' usage tests: Avoid 'do/while(false); ' in vhost-user-bridge chardev: Clean up previous patch indentation chardev: Use goto/label instead of do/break/while(0) mips: Tweak location of ';' in macros net: Drop unusual use of do { } while (0); irq: fix memory leak cpus: unify qemu__wait_io_event icount: fixed saving/restoring of icount warp timers scripts/qemu-gdb/timers.py: new helper to dump timer state scripts/qemu-gdb: add simple tcg lock status helper target-i386: update hflags on Hypervisor.framework ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-01-16 15:45:15 +00:00
Eric Blake	2562755ee7	maint: Fix macros with broken 'do/while(0); ' usage The point of writing a macro embedded in a 'do { ... } while (0)' loop (particularly if the macro has multiple statements or would otherwise end with an 'if' statement) is so that the macro can be used as a drop-in statement with the caller supplying the trailing ';'. Although our coding style frowns on brace-less 'if': if (cond) statement; else something else; that is the classic case where failure to use do/while(0) wrapping would cause the 'else' to pair with any embedded 'if' in the macro rather than the intended outer 'if'. But conversely, if the macro includes an embedded ';', then the same brace-less coding style would now have two statements, making the 'else' a syntax error rather than pairing with the outer 'if'. Thus, even though our coding style with required braces is not impacted, ending a macro with ';' makes our code harder to port to projects that use brace-less styles. The change should have no semantic impact. I was not able to fully compile-test all of the changes (as some of them are examples of the ugly bit-rotting debug print statements that are completely elided by default, and I didn't want to recompile with the necessary -D witnesses - cleaning those up is left as a bite-sized task for another day); I did, however, audit that for all files touched, all callers of the changed macros DID supply a trailing ';' at the callsite, and did not appear to be used as part of a brace-less conditional. Found mechanically via: $ git grep -B1 'while (0);' \| grep -A1 \\\\ Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20171201232433.25193-7-eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-01-16 14:54:52 +01:00
Peter Xu	816306826a	migration: remove notify in fd_error It is already called in migrate_fd_cleanup. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:13 +01:00
Peter Xu	26978faf2f	migration: remove some block_cleanup_parameters() Keep the one in migrate_fd_cleanup() would be enough. Removing the other two. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:12 +01:00
Peter Xu	199aa6d4e4	migration: put the finish part into a new function This patch only moved the last part of migration_thread() into a new function migration_iteration_finish() to make it much shorter. With previous works to remove some local variables, now it's fairly easy to do that. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:11 +01:00
Peter Xu	2ad873057e	migration: major cleanup for migrate iterations The major work for migration iterations are to move RAM/block/... data via qemu_savevm_state_iterate(). Generalize those part into a single function. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:11 +01:00
Peter Xu	b15df1ae50	migration: cleanup stats update into function We have quite a few lines in migration_thread() that calculates some statistics for the migration interations. Isolate it into a single function to improve readability. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:10 +01:00
Peter Xu	39b9e17905	migration: use switch at the end of migration It converts the old if clauses into switch, explicitly mentions the possible migration states. The old nested "if"s are not clear on what we do on different states. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:10 +01:00
Peter Xu	cf011f082d	migration: introduce migrate_calculate_complete Generalize the calculation part when migration complete into a function to simplify migration_thread(). Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:09 +01:00
Peter Xu	64909f9740	migration: introduce downtime_start Introduce MigrationState.downtime_start to replace the local variable "start_time" in migration_thread to avoid passing things around. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:09 +01:00
Peter Xu	7287cbd46e	migration: move vm_old_running into global state Firstly, it was passed around. Let's just move it into MigrationState just like many other variables as state of migration, renaming it to vm_was_running. One thing to mention is that for postcopy, we actually don't need this knowledge at all since postcopy can't resume a VM even if it fails (we can see that from the old code too: when we try to resume we also check against "entered_postcopy" variable). So further we do this: - in postcopy_start(), we don't update vm_old_running since useless - in migration_thread(), we don't need to check entered_postcopy when resume, since it's only used for precopy. Comment this out too for that variable definition. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:08 +01:00
Peter Xu	4af246a34e	migration: split use of MigrationState.total_time It was used either to: 1. store initial timestamp of migration start, and 2. store total time used by last migration Let's provide two parameters for each of them. Mix use of the two is slightly misleading. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:08 +01:00
Peter Xu	deb74fb670	migration: remove "enable_colo" var It's only used once, clean it up a bit. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:07 +01:00
Peter Xu	0ceccd858a	migration: qemu_savevm_state_cleanup() in cleanup Moving existing callers all into migrate_fd_cleanup(). It simplifies migration_thread() a bit. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:06 +01:00
Peter Xu	0d649a0e95	migration: assert colo instead of check When reaching here if we are still "active" it means we must be in colo state. After a quick discussion offlist, we decided to use the safer error_report(). Finally I want to use "switch" here rather than lots of complicated if clauses. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:06 +01:00
Vladimir Sementsov-Ogievskiy	1f8956041a	migration: finalize current_migration object current_migration has .instance_finalize callback, but it is not called, because nobody unrefs current_migration. Fix that. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2018-01-15 12:48:05 +01:00

1 2 3 4 5 ...

741 Commits (2a84ffc0be7e54f2f4b76cb3ebfdde0b37bba9f0)