Commit Graph

836 Commits (2a28720d773df2193c9fb633c02092cca107a9e5)

Author SHA1 Message Date
Markus Armbruster ca77ee28e0 Include migration/qemu-file-types.h a lot less
In my "build everything" tree, changing migration/qemu-file-types.h
triggers a recompile of some 2600 out of 6600 objects (not counting
tests and objects that don't depend on qemu/osdep.h).

The culprit is again hw/hw.h, which supposedly includes it for
convenience.

Include migration/qemu-file-types.h only where it's needed.  Touching
it now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster 71e8a91585 Include sysemu/reset.h a lot less
In my "build everything" tree, changing sysemu/reset.h triggers a
recompile of some 2600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

The main culprit is hw/hw.h, which supposedly includes it for
convenience.

Include sysemu/reset.h only where it's needed.  Touching it now
recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-9-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Cédric Le Goater 310cda5b5e spapr/xive: Fix migration of hot-plugged CPUs
The migration sequence of a guest using the XIVE exploitation mode
relies on the fact that the states of all devices are restored before
the machine is. This is not true for hot-plug devices such as CPUs
which state come after the machine. This breaks migration because the
thread interrupt context registers are not correctly set.

Fix migration of hotplugged CPUs by restoring their context in the
'post_load' handler of the XiveTCTX model.

Fixes: 277dd3d771 ("spapr/xive: add migration support for KVM")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190813064853.29310-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-08-13 16:50:30 +10:00
Greg Kurz 8d216d8c53 xics/kvm: Fix fallback to emulated XICS
Commit 4812f26152 tried to fix rollback path of xics_kvm_connect() but
it isn't enough. If we fail to create the KVM device, the guest fails
to boot later on with:

[    0.010817] pci 0000:00:00.0: Adding to iommu group 0
[    0.010863] irq: unknown-1 didn't like hwirq-0x1200 to VIRQ17 mapping (rc=-22)
[    0.010923] pci 0000:00:01.0: Adding to iommu group 0
[    0.010968] irq: unknown-1 didn't like hwirq-0x1201 to VIRQ17 mapping (rc=-22)
[    0.011543] EEH: No capable adapters found
[    0.011597] irq: unknown-1 didn't like hwirq-0x1000 to VIRQ17 mapping (rc=-22)
[    0.011651] audit: type=2000 audit(1563977526.000:1): state=initialized audit_enabled=0 res=1
[    0.011703] ------------[ cut here ]------------
[    0.011729] event-sources: Unable to allocate interrupt number for /event-sources/epow-events
[    0.011776] WARNING: CPU: 0 PID: 1 at arch/powerpc/platforms/pseries/event_sources.c:34 request_event_sources_irqs+0xbc/0x150
[    0.011828] Modules linked in:
[    0.011850] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.1.17-300.fc30.ppc64le #1
[    0.011886] NIP:  c0000000000d4fac LR: c0000000000d4fa8 CTR: c0000000018f0000
[    0.011923] REGS: c00000001e4c38d0 TRAP: 0700   Not tainted  (5.1.17-300.fc30.ppc64le)
[    0.011966] MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 28000284  XER: 20040000
[    0.012012] CFAR: c00000000011b42c IRQMASK: 0
[    0.012012] GPR00: c0000000000d4fa8 c00000001e4c3b60 c0000000015fc400 0000000000000051
[    0.012012] GPR04: 0000000000000001 0000000000000000 0000000000000081 772d6576656e7473
[    0.012012] GPR08: 000000001edf0000 c0000000014d4830 c0000000014d4830 6e6576652f20726f
[    0.012012] GPR12: 0000000000000000 c0000000018f0000 c000000000010bf0 0000000000000000
[    0.012012] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    0.012012] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    0.012012] GPR24: 0000000000000000 0000000000000000 c000000000ebbf00 c0000000000d5570
[    0.012012] GPR28: c000000000ebc008 c00000001fff8248 0000000000000000 0000000000000000
[    0.012372] NIP [c0000000000d4fac] request_event_sources_irqs+0xbc/0x150
[    0.012409] LR [c0000000000d4fa8] request_event_sources_irqs+0xb8/0x150
[    0.012445] Call Trace:
[    0.012462] [c00000001e4c3b60] [c0000000000d4fa8] request_event_sources_irqs+0xb8/0x150 (unreliable)
[    0.012513] [c00000001e4c3bf0] [c000000001042848] __machine_initcall_pseries_init_ras_IRQ+0xc8/0xf8
[    0.012563] [c00000001e4c3c20] [c000000000010810] do_one_initcall+0x60/0x254
[    0.012611] [c00000001e4c3cf0] [c000000001024538] kernel_init_freeable+0x35c/0x444
[    0.012655] [c00000001e4c3db0] [c000000000010c14] kernel_init+0x2c/0x148
[    0.012693] [c00000001e4c3e20] [c00000000000bdc4] ret_from_kernel_thread+0x5c/0x78
[    0.012736] Instruction dump:
[    0.012759] 38a00000 7c7f1b78 7f64db78 2c1f0000 2fbf0000 78630020 4180002c 409effa8
[    0.012805] 7fa4eb78 7f43d378 48046421 60000000 <0fe00000> 3bde0001 2c1e0010 7fde07b4
[    0.012851] ---[ end trace aa5785707323fad3 ]---

This happens because QEMU fell back on XICS emulation but didn't unregister
the RTAS calls from KVM. The emulated RTAS calls are hence never called and
the KVM ones return an error to the guest since the KVM device is absent.

The sanity checks in xics_kvm_disconnect() are abusive since we're freeing
the KVM device. Simply drop them.

Fixes: 4812f26152 "xics/kvm: Add proper rollback to xics_kvm_init()"
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156398744035.546975.7029414194633598474.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-28 11:50:26 +10:00
Jan Kiszka be1927c97e ioapic: kvm: Skip route updates for masked pins
Masked entries will not generate interrupt messages, thus do no need to
be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
the kind

qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)

if the masked entry happens to reference a non-present IRTE.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <a84b7e03-f9a8-b577-be27-4d93d1caa1c9@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2019-07-25 04:17:35 -04:00
Greg Kurz 38298611d5 xics/kvm: Always set the MASKED bit if interrupt is masked
The ics_set_kvm_state_one() function is called either to restore the
state of an interrupt source during migration or to set the interrupt
source to a default state during reset.

Since always, ie. 2013, the code only sets the MASKED bit if the 'current
priority' and the 'saved priority' are different. This is likely true
when restoring an interrupt that had been previously masked with the
ibm,int-off RTAS call. However this is always false in the case of
reset since both 'current priority' and 'saved priority' are equal to
0xff, and the MASKED bit is never set.

The legacy KVM XICS device gets away with that because it ends updating
its internal structure the same way, whether the MASKED bit is set or
the priority is 0xff.

The XICS-on-XIVE device for POWER9 is different. It sticks to the KVM
documentation [1] and _really_ relies on the MASKED bit to correctly
set. If not, it will configure the interrupt source in the XIVE HW, even
though the guest hasn't configured the interrupt yet. This disturbs the
complex logic implemented in XICS-on-XIVE and may result in the loss of
subsequent queued events.

Always set the MASKED bit if interrupt is masked as expected by the KVM
XICS-on-XIVE device. This has no impact on the legacy KVM XICS.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/virtual/kvm/devices/xics.txt

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156217454083.559957.7359208229523652842.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-12 15:50:00 +10:00
Peter Maydell c4107e8208 Bugfixes.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJdH7FgAAoJEL/70l94x66DdUUH+gLr/ZdjLIdfYy9cjcnevf4E
 cJlxdaW9KvUsK2uVgqQ/3b1yF+GCGk10n6n8ZTIbClhs+6NpqEMz5O3FA/Na6FGA
 48M2DwJaJ2H9AG/lQBlSBNUZfLsEJ9rWy7DHvNut5XMJFuWGwdtF/jRhUm3KqaRq
 vAaOgcQHbzHU9W8r1NJ7l6pnPebeO7S0JQV+82T/ITTz2gEBDUkJ36boO6fedkVQ
 jLb9nZyG3CJXHm2WlxGO4hkqbLFzURnCi6imOh2rMdD8BCu1eIVl59tD1lC/A0xv
 Pp3xXnv9SgJXsV4/I/N3/nU85ZhGVMPQZXkxaajHPtJJ0rQq7FAG8PJMEj9yPe8=
 =poke
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

Bugfixes.

# gpg: Signature made Fri 05 Jul 2019 21:21:52 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  ioapic: use irq number instead of vector in ioapic_eoi_broadcast
  hw/i386: Fix linker error when ISAPC is disabled
  Makefile: generate header file with the list of devices enabled
  target/i386: kvm: Fix when nested state is needed for migration
  minikconf: do not include variables from MINIKCONF_ARGS in config-all-devices.mak
  target/i386: fix feature check in hyperv-stub.c
  ioapic: clear irq_eoi when updating the ioapic redirect table entry
  intel_iommu: Fix unexpected unmaps during global unmap
  intel_iommu: Fix incorrect "end" for vtd_address_space_unmap
  i386/kvm: Fix build with -m32
  checkpatch: do not warn for multiline parenthesized returned value
  pc: fix possible NULL pointer dereference in pc_machine_get_device_memory_region_size()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-08 10:26:18 +01:00
Li Qiang 03f990a5e3 ioapic: use irq number instead of vector in ioapic_eoi_broadcast
When emulating irqchip in qemu, such as following command:

x86_64-softmmu/qemu-system-x86_64 -m 1024 -smp 4 -hda /home/test/test.img
-machine kernel-irqchip=off --enable-kvm -vnc :0 -device edu -monitor stdio

We will get a crash with following asan output:

(qemu) /home/test/qemu5/qemu/hw/intc/ioapic.c:266:27: runtime error: index 35 out of bounds for type 'int [24]'
=================================================================
==113504==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61b000003114 at pc 0x5579e3c7a80f bp 0x7fd004bf8c10 sp 0x7fd004bf8c00
WRITE of size 4 at 0x61b000003114 thread T4
    #0 0x5579e3c7a80e in ioapic_eoi_broadcast /home/test/qemu5/qemu/hw/intc/ioapic.c:266
    #1 0x5579e3c6f480 in apic_eoi /home/test/qemu5/qemu/hw/intc/apic.c:428
    #2 0x5579e3c720a7 in apic_mem_write /home/test/qemu5/qemu/hw/intc/apic.c:802
    #3 0x5579e3b1e31a in memory_region_write_accessor /home/test/qemu5/qemu/memory.c:503
    #4 0x5579e3b1e6a2 in access_with_adjusted_size /home/test/qemu5/qemu/memory.c:569
    #5 0x5579e3b28d77 in memory_region_dispatch_write /home/test/qemu5/qemu/memory.c:1497
    #6 0x5579e3a1b36b in flatview_write_continue /home/test/qemu5/qemu/exec.c:3323
    #7 0x5579e3a1b633 in flatview_write /home/test/qemu5/qemu/exec.c:3362
    #8 0x5579e3a1bcb1 in address_space_write /home/test/qemu5/qemu/exec.c:3452
    #9 0x5579e3a1bd03 in address_space_rw /home/test/qemu5/qemu/exec.c:3463
    #10 0x5579e3b8b979 in kvm_cpu_exec /home/test/qemu5/qemu/accel/kvm/kvm-all.c:2045
    #11 0x5579e3ae4499 in qemu_kvm_cpu_thread_fn /home/test/qemu5/qemu/cpus.c:1287
    #12 0x5579e4cbdb9f in qemu_thread_start util/qemu-thread-posix.c:502
    #13 0x7fd0146376da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
    #14 0x7fd01436088e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e

This is because in ioapic_eoi_broadcast function, we uses 'vector' to
index the 's->irq_eoi'. To fix this, we should uses the irq number.

Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190622002119.126834-1-liq3ea@163.com>
2019-07-05 22:19:59 +02:00
Li Qiang d15d3d573a ioapic: clear irq_eoi when updating the ioapic redirect table entry
irq_eoi is used to count the number of irq injected during eoi
broadcast. It should be set to 0 when updating the ioapic's redirect
table entry.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190624151635.22494-1-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05 22:16:46 +02:00
Peter Maydell be32116e32 target/arm: v8M: Check state of exception being returned from
In v8M, an attempt to return from an exception which is not
active is an illegal exception return. For this purpose,
exceptions which can configurably target either Secure or
NonSecure are not considered to be active if they are
configured for the opposite security state for the one
we're trying to return from (eg attempt to return from
an NS NMI but NMI targets Secure). In the pseudocode this
is handled by IsActiveForState().

Detect this case rather than counting an active exception
possibly of the wrong security state as being sufficient.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190617175317.27557-4-peter.maydell@linaro.org
2019-07-04 17:25:30 +01:00
Peter Maydell 077d744910 arm v8M: Forcibly clear negative-priority exceptions on deactivate
To prevent execution priority remaining negative if the guest
returns from an NMI or HardFault with a corrupted IPSR, the
v8M interrupt deactivation process forces the HardFault and NMI
to inactive based on the current raw execution priority,
even if the interrupt the guest is trying to deactivate
is something else. In the pseudocode this is done in the
Deactivate() function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190617175317.27557-3-peter.maydell@linaro.org
2019-07-04 17:25:30 +01:00
Peter Maydell 506179e421 ppc patch queue 2019-07-2
Here's my next pull request for qemu-4.1.  I'm not sure if this will
 squeak in just before the soft freeze, or just after.  I don't think
 it really matters - most of this is bugfixes anyway.  There's some
 cleanups which aren't stictly bugfixes, but which I think are safe
 enough improvements to go in the soft freeze.  There's no true feature
 work.
 
 Unfortunately, I wasn't able to complete a few of my standard battery
 of pre-pull tests, due to some failures that appear to also be in
 master.  I'm hoping that hasn't missed anything important in here.
 
 Highlights are:
   * A number of fixe and cleanups for the XIVE implementation
   * Cleanups to the XICS interrupt controller to fit better with the new
     XIVE code
   * Numerous fixes and improvements to TCG handling of ppc vector
     instructions
   * Remove a number of unnnecessary #ifdef CONFIG_KVM guards
   * Fix some errors in the PCI hotplug paths
   * Assorted other fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl0a9JMACgkQbDjKyiDZ
 s5ItkQ//bpkDkztJfRbOB7cgFVQCbXIJ5mpG7PBnBJDohXRtEsjCunNwL+GelRMl
 FizPJO3sGpR2f+MgH+7MJ+Y6ESSwDhI6u8TbH4MjGTc9kWsqV1YUy6nB3grxwqG7
 k9AXN0z6e1MZLaZuseGBrZmPzZcvNwnPKFqEU06ZXqIWscNgXWXteyO5JTZW4O9M
 +Ttiser/f6dRCHKrKnlJp3D1blBaJVUXzZTJVqmH6AiJy/xfHq7Ak6LQKrVrt8Vc
 I2hGMEqyDE+ppr8cuGku4KR8GWUen9m0F0bTVGjPsG1io+spAznxNZL/Z+KJPzrI
 cCFaKoyNknIicx/0/iil5TEuu4rz985erNZBcglarK/w9w0RyW2LlcDbvzV+gO6c
 Ln/1WLZZh4WufR4s4195zUJwZPwGp0E4xFdfk20ulzVzV4wVCMbNJHZpchHYFMi3
 fW4Yzhpq5zaOTIaew5+tWST+8RuduacZ/Rm+f9LNui42uA52/EMoD8Vo34n8CIro
 9DPOS64Jk9BjIr9bMstFOBCyTVt64IFzskDOMCSCznUl51Hm0ytfAJH3Gty7YazQ
 ZxncazzlC9E6OzCTYRDNSPnTKGFvccGmuir/SXPWf3bn8oBC9p3P1mPK3cgk//as
 CvWW8Y/QAJOrxEls5QZzpIBjxqAcMoMVjir6l1OT2/gvBTJto1Q=
 =QAyU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190702' into staging

ppc patch queue 2019-07-2

Here's my next pull request for qemu-4.1.  I'm not sure if this will
squeak in just before the soft freeze, or just after.  I don't think
it really matters - most of this is bugfixes anyway.  There's some
cleanups which aren't stictly bugfixes, but which I think are safe
enough improvements to go in the soft freeze.  There's no true feature
work.

Unfortunately, I wasn't able to complete a few of my standard battery
of pre-pull tests, due to some failures that appear to also be in
master.  I'm hoping that hasn't missed anything important in here.

Highlights are:
  * A number of fixe and cleanups for the XIVE implementation
  * Cleanups to the XICS interrupt controller to fit better with the new
    XIVE code
  * Numerous fixes and improvements to TCG handling of ppc vector
    instructions
  * Remove a number of unnnecessary #ifdef CONFIG_KVM guards
  * Fix some errors in the PCI hotplug paths
  * Assorted other fixes

# gpg: Signature made Tue 02 Jul 2019 07:07:15 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.1-20190702: (49 commits)
  spapr/xive: Add proper rollback to kvmppc_xive_connect()
  ppc/xive: Fix TM_PULL_POOL_CTX special operation
  ppc/pnv: Rework cache watch model of PnvXIVE
  ppc/xive: Make the PIPR register readonly
  ppc/xive: Force the Physical CAM line value to group mode
  spapr/xive: simplify spapr_irq_init_device() to remove the emulated init
  spapr/xive: rework the mapping the KVM memory regions
  spapr_pci: Unregister listeners before destroying the IOMMU address space
  target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro
  target/ppc: decode target register in VSX_EXTRACT_INSERT at translation time
  target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time
  target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_R2 macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c
  target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c
  target/ppc: introduce separate generator and helper for xscvqpdp
  target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c
  target/ppc: introduce separate VSX_CMP macro for xvcmp* instructions
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-02 18:56:44 +01:00
Greg Kurz 1c3d4a8f4b spapr/xive: Add proper rollback to kvmppc_xive_connect()
Make kvmppc_xive_disconnect() able to undo the changes of a partial
execution of kvmppc_xive_connect() and use it to perform rollback.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <156198735673.293938.7313195993600841641.stgit@bahia>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 10:11:44 +10:00
Cédric Le Goater aaa450300e ppc/xive: Fix TM_PULL_POOL_CTX special operation
When a CPU is reseted, the hypervisor (Linux or OPAL) invalidates the
POOL interrupt context of a CPU with this special command. It returns
the POOL CAM line value and resets the VP bit.

Fixes: 4836b45510 ("ppc/xive: activate HV support")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater 0df68c7ed6 ppc/pnv: Rework cache watch model of PnvXIVE
When the software modifies the XIVE internal structures, ESB, EAS,
END, NVT, it also must update the caches of the different XIVE
sub-engines. HW offers a set of common interface for such purpose.

The CWATCH_SPEC register defines the block/index of the target and a
set of flags to perform a full update and to watch for update
conflicts.

The cache watch CWATCH_DATAX registers are then loaded with the target
data with a first read on CWATCH_DATA0. Writing back is done in the
opposit order, CWATCH_DATA0 triggering the update.

The SCRUB_TRIG registers are used to flush the cache in RAM, and to
possibly invalidate it. Cache disablement is also an option but as we
do not model the cache, these registers are no-ops

Today, the modeling of these registers is incorrect but it did not
impact the set up of a baremetal system. However, running KVM requires
a rework.

Fixes: 2dfa91a2aa ("ppc/pnv: add a XIVE interrupt controller model for POWER9")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater 8256870ada ppc/xive: Make the PIPR register readonly
When the hypervisor (KVM) dispatches a vCPU on a HW thread, it restores
its thread interrupt context. The Pending Interrupt Priority Register
(PIPR) is computed from the Interrupt Pending Buffer (IPB) and stores
should not be allowed to change its value.

Fixes: 207d9fe985 ("ppc/xive: introduce the XIVE interrupt thread context")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater fe9a9d527d ppc/xive: Force the Physical CAM line value to group mode
When an interrupt needs to be delivered, the XIVE interrupt controller
presenter scans the CAM lines of the thread interrupt contexts of the
HW threads of the chip to find a matching vCPU. The interrupt context
is composed of 4 different sets of registers: Physical, HV, OS and
User.

The encoding of the Physical CAM line depends on the mode in which the
interrupt controller is operating: CAM mode or block group mode.
Block group mode being the default configuration today on POWER9 and
the only one available on the next POWER10 generation, enforce this
encoding in the Physical CAM line :

    chip << 19 | 0000000 0 0001 thread (7Bit)

It fits the overall encoding of the NVT ids and simplifies the matching
algorithm in the presenter.

Fixes: d514c48d41 ("ppc/xive: hardwire the Physical CAM line of the thread context")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater 981b1c6266 spapr/xive: rework the mapping the KVM memory regions
Today, the interrupt device is fully initialized at reset when the CAS
negotiation process has completed. Depending on the KVM capabilities,
the SpaprXive memory regions (ESB, TIMA) are initialized with a host
MMIO backend or a QEMU emulated backend. This results in a complex
initialization sequence partially done at realize and later at reset,
and some memory region leaks.

To simplify this sequence and to remove of the late initialization of
the emulated device which is required to be done only once, we
introduce new memory regions specific for KVM. These regions are
mapped as overlaps on top of the emulated device to make use of the
host MMIOs. Also provide proper cleanups of these regions when the
XIVE KVM device is destroyed to fix the leaks.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190614165920.12670-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz 4812f26152 xics/kvm: Add proper rollback to xics_kvm_init()
Make xics_kvm_disconnect() able to undo the changes of a partial execution
of xics_kvm_connect() and use it to perform rollback.

Note that kvmppc_define_rtas_kernel_token(0) never fails, no matter the
RTAS call has been defined or not.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077922319.433243.609897156640506891.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz 330a21e3c4 xics/kvm: Add error propagation to ic*_set_kvm_state() functions
This allows errors happening there to be propagated up to spapr_irq,
just like XIVE already does.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077921763.433243.4614327010172954196.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz ab3d15fa84 xics/kvm: Always use local_err in xics_kvm_init()
Passing both errp and &local_err to functions is a recipe for messing
things up.

Since we must use &local_err for icp_kvm_realize(), use &local_err
everywhere where rollback must happen and have a single call to
error_propagate() them all. While here, add errno to the error
message.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077921212.433243.11716701611944816815.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz 64fb96214c xics/kvm: Skip rollback when KVM XICS is absent
There is no need to rollback anything at this point, so just return an
error.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077920657.433243.13541093940589972734.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz eab9f191a0 xics/spapr: Rename xics_kvm_init()
Switch to using the connect/disconnect terminology like we already do for
XIVE.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077920102.433243.6605099291134598170.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz 25c79a3089 xics/spapr: Only emulated XICS should use RTAS/hypercalls emulation
Checking that we're not using the in-kernel XICS is ok with the "xics"
interrupt controller mode, but it is definitely not enough with the
other modes since the guest could be using XIVE.

Ensure XIVE is not in use when emulated XICS RTAS/hypercalls are
called.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156077253666.424706.6104557911104491047.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz d9b9e6f6b9 xics: Add comment about CPU hotplug
So that no one is tempted to drop that code, which is never called
for cold plugged CPUs.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156078063349.435533.12283208810037409702.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz 7abc0c6d35 xics/spapr: Detect old KVM XICS on POWER9 hosts
Older KVMs on POWER9 don't support destroying/recreating a KVM XICS
device, which is required by 'dual' interrupt controller mode. This
causes QEMU to emit a warning when the guest is rebooted and to fall
back on XICS emulation:

qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable:
 Error on KVM_CREATE_DEVICE for XICS: File exists

If kernel irqchip is required, QEMU will thus exit when the guest is
first rebooted. Failing QEMU this late may be a painful experience
for the user.

Detect that and exit at machine init instead.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044430517.125694.6207865998817342638.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz d9293c4843 xics/spapr: Register RTAS/hypercalls once at machine init
QEMU may crash when running a spapr machine in 'dual' interrupt controller
mode on some older (but not that old, eg. ubuntu 18.04.2) KVMs with partial
XIVE support:

qemu-system-ppc64: hw/ppc/spapr_rtas.c:411: spapr_rtas_register:
 Assertion `!name || !rtas_table[token].name' failed.

XICS is controlled by the guest thanks to a set of RTAS calls. Depending
on whether KVM XICS is used or not, the RTAS calls are handled by KVM or
QEMU. In both cases, QEMU needs to expose the RTAS calls to the guest
through the "rtas" node of the device tree.

The spapr_rtas_register() helper takes care of all of that: it adds the
RTAS call token to the "rtas" node and registers a QEMU callback to be
invoked when the guest issues the RTAS call. In the KVM XICS case, QEMU
registers a dummy callback that just prints an error since it isn't
supposed to be invoked, ever.

Historically, the XICS controller was setup during machine init and
released during final teardown. This changed when the 'dual' interrupt
controller mode was added to the spapr machine: in this case we need
to tear the XICS down and set it up again during machine reset. The
crash happens because we indeed have an incompatibility with older
KVMs that forces QEMU to fallback on emulated XICS, which tries to
re-registers the same RTAS calls.

This could be fixed by adding proper rollback that would unregister
RTAS calls on error. But since the emulated RTAS calls in QEMU can
now detect when they are mistakenly called while KVM XICS is in
use, it seems simpler to register them once and for all at machine
init. This fixes the crash and allows to remove some now useless
lines of code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044429963.125694.13710679451927268758.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Greg Kurz d9715d6772 xics/spapr: Prevent RTAS/hypercalls emulation to be used by in-kernel XICS
The XICS-related RTAS calls and hypercalls in QEMU are not supposed to
be called when the KVM in-kernel XICS is in use.

Add some explicit checks to detect that, print an error message and report
an hardware error to the guest.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <156044429419.125694.507569071972451514.stgit@bahia.lab.toulouse-stg.fr.ibm.com>
[dwg: Correction to commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Cédric Le Goater e1a9b7d1fc ppc/pnv: fix StoreEOI activation
The firmware (skiboot) of the PowerNV machines can configure the XIVE
interrupt controller to activate StoreEOI on the ESB pages of the
interrupts. This feature lets software do an EOI with a store instead
of a load. It is not activated today on P9 for rare race condition
issues but it should be on future processors.

Nevertheless, QEMU has a model for StoreEOI which can be used today by
experimental firmwares. But, the use of object_property_set_int() in
the PnvXive model is incorrect and crashes QEMU. Replace it with a
direct access to the ESB flags of the XiveSource object modeling the
internal sources of the interrupt controller.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190612162357.29566-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-07-02 09:43:58 +10:00
Andrew Jeffery ebd205c080 aspeed: vic: Add support for legacy register interface
The legacy interface only supported up to 32 IRQs, which became
restrictive around the AST2400 generation. QEMU support for the SoCs
started with the AST2400 along with an effort to reimplement and
upstream drivers for Linux, so up until this point the consumers of the
QEMU ASPEED support only required the 64 IRQ register interface.

In an effort to support older BMC firmware, add support for the 32 IRQ
interface.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20190618165311.27066-22-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-01 17:29:00 +01:00
Peter Maydell 0edfcc9ec0 hw/intc/arm_gicv3: GICD_TYPER.SecurityExtn is RAZ if GICD_CTLR.DS == 1
The GICv3 specification says that the GICD_TYPER.SecurityExtn bit
is RAZ if GICD_CTLR.DS is 1. We were incorrectly making it RAZ
if the security extension is unsupported. "Security extension
unsupported" always implies GICD_CTLR.DS == 1, but the guest can
also set DS on a GIC which does support the security extension.
Fix the condition to correctly check the GICD_CTLR.DS bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20190524124248.28394-3-peter.maydell@linaro.org
2019-06-17 15:13:19 +01:00
Peter Maydell e40f60730a hw/intc/arm_gicv3: Fix decoding of ID register range
The GIC ID registers cover an area 0x30 bytes in size
(12 registers, 4 bytes each). We were incorrectly decoding
only the first 0x20 bytes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190524124248.28394-2-peter.maydell@linaro.org
2019-06-17 15:13:19 +01:00
Peter Maydell a050901d4b ppc patch queue 2019-06-12
Next pull request against qemu-4.1.  The big thing here is adding
 support for hot plug of P2P bridges, and PCI devices under P2P bridges
 on the "pseries" machine (which doesn't use SHPC).  Other than that
 there's just a handful of fixes and small enhancements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl0AkgwACgkQbDjKyiDZ
 s5Jyug//cwxP+t1t2CNHtffKwiXFzuEKx9YSNE1V0wog6aB40EbPKU72FzCq6FfA
 lev+pZWV9AwVMzFYe4VM/7Lqh7WFMYDT3DOXaZwfANs4471vYtgvPi21L2TBj80d
 hMszlyLWMLY9ByOzCxIq3xnbivGpA94G2q9rKbwXdK4T/5i62Pe3SIfgG+gXiiwW
 +YlHWCPX0I1cJz2bBs9ElXdl7ONWnn+7uDf7gNfWkTKuiUq6Ps7mxzy3GhJ1T7nz
 OFKmQ5dKzLJsgOULSSun8kWpXBmnPffkM3+fCE07edrWZVor09fMCk4HvtfaRy2K
 FFa2Kvzn/V/70TL+44dsSX4QcwdcHQztiaMO7UGPq9CMswx5L7gsNmfX6zvK1Nrb
 1t7ORZKNJ72hMyvDPSMiGU2DpVjO3ZbBlSL4/xG8Qeal4An0kgkN5NcFlB/XEfnz
 dsKu9XzuGSeD1bWz1Mgcf1x7lPDBoHIKLcX6notZ8epP/otu4ywNFvAkPu4fk8s0
 4jQGajIT7328SmzpjXClsmiEskpKsEr7hQjPRhu0hFGrhVc+i9PjkmbDl0TYRAf6
 N6k6gJQAi+StJde2rcua1iS7Ra+Tka6QRKy+EctLqfqOKPb2VmkZ6fswQ3nfRRlT
 LgcTHt2iJcLeud2klVXs1e4pKXzXchkVyFL4ucvmyYG5VeimMzU=
 =ERgu
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190612' into staging

ppc patch queue 2019-06-12

Next pull request against qemu-4.1.  The big thing here is adding
support for hot plug of P2P bridges, and PCI devices under P2P bridges
on the "pseries" machine (which doesn't use SHPC).  Other than that
there's just a handful of fixes and small enhancements.

# gpg: Signature made Wed 12 Jun 2019 06:47:56 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.1-20190612:
  ppc/xive: Make XIVE generate the proper interrupt types
  ppc/pnv: activate the "dumpdtb" option on the powernv machine
  target/ppc: Use tcg_gen_gvec_bitsel
  spapr: Allow hot plug/unplug of PCI bridges and devices under PCI bridges
  spapr: Direct all PCI hotplug to host bridge, rather than P2P bridge
  spapr: Don't use bus number for building DRC ids
  spapr: Clean up DRC index construction
  spapr: Clean up spapr_drc_populate_dt()
  spapr: Clean up dt creation for PCI buses
  spapr: Clean up device tree construction for PCI devices
  spapr: Clean up device node name generation for PCI devices
  target/ppc: Fix lxvw4x, lxvh8x and lxvb16x
  spapr_pci: Improve error message

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-12 14:43:47 +01:00
Markus Armbruster a8d2532645 Include qemu-common.h exactly where needed
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
2019-06-12 13:20:20 +02:00
Markus Armbruster 0b8fa32f55 Include qemu/module.h where needed, drop it from qemu-common.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-4-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c
hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c;
ui/cocoa.m fixed up]
2019-06-12 13:18:33 +02:00
Benjamin Herrenschmidt 4aca978654 ppc/xive: Make XIVE generate the proper interrupt types
It should be generic Hypervisor Virtualization interrupts for HV
directed rings and traditional External Interrupts for the OS directed
ring.

Don't generate anything for the user ring as it isn't actually
supported.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190606174409.12502-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-06-12 10:41:50 +10:00
Richard Henderson 5a7330b35c target/mips: Use env_cpu, env_archcpu
Cleanup in the boilerplate that each target must define.
Replace mips_env_get_cpu with env_archcpu.  The combination
CPU(mips_env_get_cpu) should have used ENV_GET_CPU to begin;
use env_cpu now.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-06-10 07:03:42 -07:00
Cédric Le Goater cdd71c8e9d spapr/xive: fix multiple resets when using the 'dual' interrupt mode
Today, when a reset occurs on a pseries machine using the 'dual'
interrupt mode, the KVM devices are released and recreated depending
on the interrupt mode selected by CAS. If XIVE is selected, the SysBus
memory regions of the SpaprXive model are initialized by the KVM
backend initialization routine each time a reset occurs. This leads to
a crash after a couple of resets because the machine reaches the
QDEV_MAX_MMIO limit of SysBusDevice :

qemu-system-ppc64: hw/core/sysbus.c:193: sysbus_init_mmio: Assertion `dev->num_mmio < QDEV_MAX_MMIO' failed.

To fix, initialize the SysBus memory regions in spapr_xive_realize()
called only once and remove the same inits from the QEMU and KVM
backend initialization routines which are called at each reset.

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190522074016.10521-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:47 +10:00
Cédric Le Goater 3f777abc71 spapr/irq: add KVM support to the 'dual' machine
The interrupt mode is chosen by the CAS negotiation process and
activated after a reset to take into account the required changes in
the machine. This brings new constraints on how the associated KVM IRQ
device is initialized.

Currently, each model takes care of the initialization of the KVM
device in their realize method but this is not possible anymore as the
initialization needs to be done globaly when the interrupt mode is
known, i.e. when machine is reseted. It also means that we need a way
to delete a KVM device when another mode is chosen.

Also, to support migration, the QEMU objects holding the state to
transfer should always be available but not necessarily activated.

The overall approach of this proposal is to initialize both interrupt
mode at the QEMU level to keep the IRQ number space in sync and to
allow switching from one mode to another. For the KVM side of things,
the whole initialization of the KVM device, sources and presenters, is
grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
handlers are modified accordingly to handle the init and the delete
sequences of the KVM device.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-15-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater 83629419a5 ppc/xics: fix irq priority in ics_set_irq_type()
Recent commits changed the behavior of ics_set_irq_type() to
initialize correctly LSIs at the KVM level. ics_set_irq_type() is also
called by the realize routine of the different devices of the machine
when initial interrupts are claimed, before the ICSState device is
reseted.

In the case, the ICSIRQState priority is 0x0 and the call to
ics_set_irq_type() results in configuring the target of the
interrupt. On P9, when using the KVM XICS-on-XIVE device, the target
is configured to be server 0, priority 0 and the event queue 0 is
created automatically by KVM.

With the dual interrupt mode creating the KVM device at reset, it
leads to unexpected effects on the guest, mostly blocking IPIs. This
is wrong, fix it by reseting the ICSIRQState structure when
ics_set_irq_type() is called.

Fixes: commit 6cead90c5c ("xics: Write source state to KVM at claim time")
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-14-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater cf435df697 spapr/irq: initialize the IRQ device only once
Add a check to make sure that the routine initializing the emulated
IRQ device is called once. We don't have much to test on the XICS
side, so we introduce a 'init' boolean under ICSState.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater ae805ea907 spapr/irq: introduce a spapr_irq_init_device() helper
The way the XICS and the XIVE devices are initialized follows the same
pattern. First, try to connect to the KVM device and if not possible
fallback on the emulated device, unless a kernel_irqchip is required.
The spapr_irq_init_device() routine implements this sequence in
generic way using new sPAPR IRQ handlers ->init_emu() and ->init_kvm().

The XIVE init sequence is moved under the associated sPAPR IRQ
->init() handler. This will change again when KVM support is added for
the dual interrupt mode.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-12-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater 3bf84e99c8 spapr: check for the activation of the KVM IRQ device
The activation of the KVM IRQ device depends on the interrupt mode
chosen at CAS time by the machine and some methods used at reset or by
the migration need to be protected.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater 56b11587df spapr: introduce routines to delete the KVM IRQ device
If a new interrupt mode is chosen by CAS, the machine generates a
reset to reconfigure. At this point, the connection with the previous
KVM device needs to be closed and a new connection needs to opened
with the KVM device operating the chosen interrupt mode.

New routines are introduced to destroy the XICS and the XIVE KVM
devices. They make use of a new KVM device ioctl which destroys the
device and also disconnects the IRQ presenters from the vCPUs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater 277dd3d771 spapr/xive: add migration support for KVM
When the VM is stopped, the VM state handler stabilizes the XIVE IC
and marks the EQ pages dirty. These are then transferred to destination
before the transfer of the device vmstates starts.

The SpaprXive interrupt controller model captures the XIVE internal
tables, EAT and ENDT and the XiveTCTX model does the same for the
thread interrupt context registers.

At restart, the SpaprXive 'post_load' method restores all the XIVE
states. It is called by the sPAPR machine 'post_load' method, when all
XIVE states have been transferred and loaded.

Finally, the source states are restored in the VM change state handler
when the machine reaches the running state.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater 9b88cd7673 spapr/xive: introduce a VM state change handler
This handler is in charge of stabilizing the flow of event notifications
in the XIVE controller before migrating a guest. This is a requirement
before transferring the guest EQ pages to a destination.

When the VM is stopped, the handler sets the source PQs to PENDING to
stop the flow of events and to possibly catch a triggered interrupt
occuring while the VM is stopped. Their previous state is saved. The
XIVE controller is then synced through KVM to flush any in-flight
event notification and to stabilize the EQs. At this stage, the EQ
pages are marked dirty to make sure the EQ pages are transferred if a
migration sequence is in progress.

The previous configuration of the sources is restored when the VM
resumes, after a migration or a stop. If an interrupt was queued while
the VM was stopped, the handler simply generates the missing trigger.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater 7bfc759c02 spapr/xive: add state synchronization with KVM
This extends the KVM XIVE device backend with 'synchronize_state'
methods used to retrieve the state from KVM. The HW state of the
sources, the KVM device and the thread interrupt contexts are
collected for the monitor usage and also migration.

These get operations rely on their KVM counterpart in the host kernel
which acts as a proxy for OPAL, the host firmware. The set operations
will be added for migration support later.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:46 +10:00
Cédric Le Goater 0c575703e4 spapr/xive: add hcall support when under KVM
XIVE hcalls are all redirected to QEMU as none are on a fast path.
When necessary, QEMU invokes KVM through specific ioctls to perform
host operations. QEMU should have done the necessary checks before
calling KVM and, in case of failure, H_HARDWARE is simply returned.

H_INT_ESB is a special case that could have been handled under KVM
but the impact on performance was low when under QEMU. Here are some
figures :

    kernel irqchip      OFF          ON
    H_INT_ESB                    KVM   QEMU

    rtl8139 (LSI )      1.19     1.24  1.23  Gbits/sec
    virtio             31.80    42.30   --   Gbits/sec

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
Cédric Le Goater 38afd772f8 spapr/xive: add KVM support
This introduces a set of helpers when KVM is in use, which create the
KVM XIVE device, initialize the interrupt sources at a KVM level and
connect the interrupt presenters to the vCPU.

They also handle the initialization of the TIMA and the source ESB
memory regions of the controller. These have a different type under
KVM. They are 'ram device' memory mappings, similarly to VFIO, exposed
to the guest and the associated VMAs on the host are populated
dynamically with the appropriate pages using a fault handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00
Satheesh Rajendran f81d69fcea Fix typo on "info pic" monitor cmd output for xive
Instead of LISN i.e "Logical Interrupt Source Number" as per
Xive PAPR document "info pic" prints as LSIN, let's fix it.

Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Message-Id: <20190509080750.21999-1-sathnaga@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-05-29 11:39:45 +10:00