migration: Postpone the kick of the fault thread after recover

The new migrate_send_rp_req_pages_pending() call should greatly improve
destination responsiveness because it will resync faulted address after
postcopy recovery.  However it is also the 1st place to initiate the page
request from the main thread.

One thing is overlooked on that migrate_send_rp_message_req_pages() is not
designed to be thread-safe.  So if we wake the fault thread before syncing all
the faulted pages in the main thread, it means they can race.

Postpone the wake up operation after the sync of faulted addresses.

Fixes: 0c26781c09 ("migration: Sync requested pages after postcopy recovery")
Tested-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20201102153010.11979-3-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
master
Peter Xu 2020-11-02 10:30:10 -05:00 committed by Dr. David Alan Gilbert
parent cc5ab87200
commit 5e77343113
1 changed files with 8 additions and 3 deletions

View File

@ -2069,12 +2069,9 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
/*
* This means source VM is ready to resume the postcopy migration.
* It's time to switch state and release the fault thread to
* continue service page faults.
*/
migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_RECOVER,
MIGRATION_STATUS_POSTCOPY_ACTIVE);
qemu_sem_post(&mis->postcopy_pause_sem_fault);
trace_loadvm_postcopy_handle_resume();
@ -2095,6 +2092,14 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
*/
migrate_send_rp_req_pages_pending(mis);
/*
* It's time to switch state and release the fault thread to continue
* service page faults. Note that this should be explicitly after the
* above call to migrate_send_rp_req_pages_pending(). In short:
* migrate_send_rp_message_req_pages() is not thread safe, yet.
*/
qemu_sem_post(&mis->postcopy_pause_sem_fault);
return 0;
}