summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
* | | | | | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2015-09-1162-700/+2653
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull more kvm updates from Paolo Bonzini: "ARM: - Full debug support for arm64 - Active state switching for timer interrupts - Lazy FP/SIMD save/restore for arm64 - Generic ARMv8 target PPC: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8 x86: - Compiler warnings Generic: - Adaptive polling for guest halt" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (49 commits) kvm: irqchip: fix memory leak kvm: move new trace event outside #ifdef CONFIG_KVM_ASYNC_PF KVM: trace kvm_halt_poll_ns grow/shrink KVM: dynamic halt-polling KVM: make halt_poll_ns per-vCPU Silence compiler warning in arch/x86/kvm/emulate.c kvm: compile process_smi_save_seg_64() only for x86_64 KVM: x86: avoid uninitialized variable warning KVM: PPC: Book3S: Fix typo in top comment about locking KVM: PPC: Book3S: Fix size of the PSPB register KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is set KVM: PPC: Book3S HV: Fix race in starting secondary threads KVM: PPC: Book3S: correct width in XER handling KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation KVM: PPC: Book3S HV: Fix preempted vcore list locking KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD KVM: PPC: Book3S HV: Fix bug in dirty page tracking KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8 KVM: PPC: Book3S HV: Make use of unused threads when running guests ...
| * | | | | | | | | kvm: irqchip: fix memory leakSudip Mukherjee2015-09-081-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were taking the exit path after checking ue->flags and return value of setup_routing_entry(), but 'e' was not freed incase of a failure. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | | | kvm: move new trace event outside #ifdef CONFIG_KVM_ASYNC_PFWanpeng Li2015-09-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes compilation with ppc64_defconfig. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | | | KVM: trace kvm_halt_poll_ns grow/shrinkWanpeng Li2015-09-062-2/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tracepoint for dynamic halt_pool_ns, fired on every potential change. Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | | | KVM: dynamic halt-pollingWanpeng Li2015-09-061-2/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a downside of always-poll since poll is still happened for idle vCPUs which can waste cpu usage. This patchset add the ability to adjust halt_poll_ns dynamically, to grow halt_poll_ns when shot halt is detected, and to shrink halt_poll_ns when long halt is detected. There are two new kernel parameters for changing the halt_poll_ns: halt_poll_ns_grow and halt_poll_ns_shrink. no-poll always-poll dynamic-poll ----------------------------------------------------------------------- Idle (nohz) vCPU %c0 0.15% 0.3% 0.2% Idle (250HZ) vCPU %c0 1.1% 4.6%~14% 1.2% TCP_RR latency 34us 27us 26.7us "Idle (X) vCPU %c0" is the percent of time the physical cpu spent in c0 over 60 seconds (each vCPU is pinned to a pCPU). (nohz) means the guest was tickless. (250HZ) means the guest was ticking at 250HZ. The big win is with ticking operating systems. Running the linux guest with nohz=off (and HZ=250), we save 3.4%~12.8% CPUs/second and get close to no-polling overhead levels by using the dynamic-poll. The savings should be even higher for higher frequency ticks. Suggested-by: David Matlack <dmatlack@google.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> [Simplify the patch. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | | | KVM: make halt_poll_ns per-vCPUWanpeng Li2015-09-062-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change halt_poll_ns into per-VCPU variable, seeded from module parameter, to allow greater flexibility. Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | | | Silence compiler warning in arch/x86/kvm/emulate.cValdis Kletnieks2015-09-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compiler warning: CC [M] arch/x86/kvm/emulate.o arch/x86/kvm/emulate.c: In function "__do_insn_fetch_bytes": arch/x86/kvm/emulate.c:814:9: warning: "linear" may be used uninitialized in this function [-Wmaybe-uninitialized] GCC is smart enough to realize that the inlined __linearize may return before setting the value of linear, but not smart enough to realize the same X86EMU_CONTINUE blocks actual use of the value. However, the value of 'linear' can only be set to one value, so hoisting the one line of code upwards makes GCC happy with the code. Reported-by: Aruna Hewapathirane <aruna.hewapathirane@gmail.com> Tested-by: Aruna Hewapathirane <aruna.hewapathirane@gmail.com> Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | | | kvm: compile process_smi_save_seg_64() only for x86_64Alexander Kuleshov2015-09-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The process_smi_save_seg_64() function called only in the process_smi_save_state_64() if the CONFIG_X86_64 is set. This patch adds #ifdef CONFIG_X86_64 around process_smi_save_seg_64() to prevent following warning message: arch/x86/kvm/x86.c:5946:13: warning: ‘process_smi_save_seg_64’ defined but not used [-Wunused-function] static void process_smi_save_seg_64(struct kvm_vcpu *vcpu, char *buf, int n) ^ Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | | | KVM: x86: avoid uninitialized variable warningPaolo Bonzini2015-09-061-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This does not show up on all compiler versions, so it sneaked into the first 4.3 pull request. The fix is to mimic the logic of the "print sptes" loop in the "fill array" loop. Then leaf and root can be both initialized unconditionally. Note that "leaf" now points to the first unused element of the array, not the last filled element. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | | | | | | | KVM: PPC: Book3S: Fix typo in top comment about lockingGreg Kurz2015-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | | | | | KVM: PPC: Book3S: Fix size of the PSPB registerThomas Huth2015-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The size of the Problem State Priority Boost Register is only 32 bits, but the kvm_vcpu_arch->pspb variable is declared as "ulong", ie. 64-bit. However, the assembler code accesses this variable with 32-bit accesses, and the KVM_REG_PPC_PSPB macro is defined with SIZE_U32, too, so that the current code is broken on big endian hosts: kvmppc_get_one_reg_hv() will only return zero for this register since it is using the wrong half of the pspb variable. Let's fix this problem by adjusting the size of the pspb field in the kvm_vcpu_arch structure. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | | | | | KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is setGautham R. Shenoy2015-09-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code that handles the case when we receive a H_DOORBELL interrupt has a comment which says "Hypervisor doorbell - exit only if host IPI flag set". However, the current code does not actually check if the host IPI flag is set. This is due to a comparison instruction that got missed. As a result, the current code performs the exit to host only if some sibling thread or a sibling sub-core is exiting to the host. This implies that, an IPI sent to a sibling core in (subcores-per-core != 1) mode will be missed by the host unless the sibling core is on the exit path to the host. This patch adds the missing comparison operation which will ensure that when HOST_IPI flag is set, we unconditionally exit to the host. Fixes: 66feed61cdf6 Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | | | | | KVM: PPC: Book3S HV: Fix race in starting secondary threadsGautham R. Shenoy2015-09-032-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current dynamic micro-threading code has a race due to which a secondary thread naps when it is supposed to be running a vcpu. As a side effect of this, on a guest exit, the primary thread in kvmppc_wait_for_nap() finds that this secondary thread hasn't cleared its vcore pointer. This results in "CPU X seems to be stuck!" warnings. The race is possible since the primary thread on exiting the guests only waits for all the secondaries to clear its vcore pointer. It subsequently expects the secondary threads to enter nap while it unsplits the core. A secondary thread which hasn't yet entered the nap will loop in kvm_no_guest until its vcore pointer and the do_nap flag are unset. Once the core has been unsplit, a new vcpu thread can grab the core and set the do_nap flag *before* setting the vcore pointers of the secondary. As a result, the secondary thread will now enter nap via kvm_unsplit_nap instead of running the guest vcpu. Fix this by setting the do_nap flag after setting the vcore pointer in the PACA of the secondary in kvmppc_run_core. Also, ensure that a secondary thread doesn't nap in kvm_unsplit_nap when the vcore pointer in its PACA struct is set. Fixes: b4deba5c41e9 Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | | | | | | Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into ↵Paolo Bonzini2015-08-22663-4720/+6955
| |\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kvm-queue Patch queue for ppc - 2015-08-22 Highlights for KVM PPC this time around: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8
| | * | | | | | | | | KVM: PPC: Book3S: correct width in XER handlingSam bobroff2015-08-225-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 64 bit kernels, the Fixed Point Exception Register (XER) is a 64 bit field (e.g. in kvm_regs and kvm_vcpu_arch) and in most places it is accessed as such. This patch corrects places where it is accessed as a 32 bit field by a 64 bit kernel. In some cases this is via a 32 bit load or store instruction which, depending on endianness, will cause either the lower or upper 32 bits to be missed. In another case it is cast as a u32, causing the upper 32 bits to be cleared. This patch corrects those places by extending the access methods to 64 bits. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculationPaul Mackerras2015-08-221-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever a vcore state is VCORE_PREEMPT we need to be counting stolen time for it. This currently isn't the case when we have a vcore that no longer has any runnable threads in it but still has a runner task, so we do an explicit call to kvmppc_core_start_stolen() in that case. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Book3S HV: Fix preempted vcore list lockingPaul Mackerras2015-08-221-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a vcore gets preempted, we put it on the preempted vcore list for the current CPU. The runner task then calls schedule() and comes back some time later and takes itself off the list. We need to be careful to lock the list that it was put onto, which may not be the list for the current CPU since the runner task may have moved to another CPU. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MODPaul Mackerras2015-08-222-9/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds implementations for the H_CLEAR_REF (test and clear reference bit) and H_CLEAR_MOD (test and clear changed bit) hypercalls. When clearing the reference or change bit in the guest view of the HPTE, we also have to clear it in the real HPTE so that we can detect future references or changes. When we do so, we transfer the R or C bit value to the rmap entry for the underlying host page so that kvm_age_hva_hv(), kvm_test_age_hva_hv() and kvmppc_hv_get_dirty_log() know that the page has been referenced and/or changed. These hypercalls are not used by Linux guests. These implementations have been tested using a FreeBSD guest. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Book3S HV: Fix bug in dirty page trackingPaul Mackerras2015-08-224-1/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a bug in the tracking of pages that get modified by the guest. If the guest creates a large-page HPTE, writes to memory somewhere within the large page, and then removes the HPTE, we only record the modified state for the first normal page within the large page, when in fact the guest might have modified some other normal page within the large page. To fix this we use some unused bits in the rmap entry to record the order (log base 2) of the size of the page that was modified, when removing an HPTE. Then in kvm_test_clear_dirty_npages() we use that order to return the correct number of modified pages. The same thing could in principle happen when removing a HPTE at the host's request, i.e. when paging out a page, except that we never page out large pages, and the guest can only create large-page HPTEs if the guest RAM is backed by large pages. However, we also fix this case for the sake of future-proofing. The reference bit is also subject to the same loss of information. We don't make the same fix here for the reference bit because there isn't an interface for userspace to find out which pages the guest has referenced, whereas there is one for userspace to find out which pages the guest has modified. Because of this loss of information, the kvm_age_hva_hv() and kvm_test_age_hva_hv() functions might incorrectly say that a page has not been referenced when it has, but that doesn't matter greatly because we never page or swap out large pages. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTEPaul Mackerras2015-08-221-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reference (R) and change (C) bits in a HPT entry can be set by hardware at any time up until the HPTE is invalidated and the TLB invalidation sequence has completed. This means that when removing a HPTE, we need to read the HPTE after the invalidation sequence has completed in order to obtain reliable values of R and C. The code in kvmppc_do_h_remove() used to do this. However, commit 6f22bd3265fb ("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the read after invalidation as a side effect of other changes. This restores the read of the HPTE after invalidation. The user-visible effect of this bug would be that when migrating a guest, there is a small probability that a page modified by the guest and then unmapped by the guest might not get re-transmitted and thus the destination might end up with a stale copy of the page. Fixes: 6f22bd3265fb Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8Paul Mackerras2015-08-226-62/+473
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This builds on the ability to run more than one vcore on a physical core by using the micro-threading (split-core) modes of the POWER8 chip. Previously, only vcores from the same VM could be run together, and (on POWER8) only if they had just one thread per core. With the ability to split the core on guest entry and unsplit it on guest exit, we can run up to 8 vcpu threads from up to 4 different VMs, and we can run multiple vcores with 2 or 4 vcpus per vcore. Dynamic micro-threading is only available if the static configuration of the cores is whole-core mode (unsplit), and only on POWER8. To manage this, we introduce a new kvm_split_mode struct which is shared across all of the subcores in the core, with a pointer in the paca on each thread. In addition we extend the core_info struct to have information on each subcore. When deciding whether to add a vcore to the set already on the core, we now have two possibilities: (a) piggyback the vcore onto an existing subcore, or (b) start a new subcore. Currently, when any vcpu needs to exit the guest and switch to host virtual mode, we interrupt all the threads in all subcores and switch the core back to whole-core mode. It may be possible in future to allow some of the subcores to keep executing in the guest while subcore 0 switches to the host, but that is not implemented in this patch. This adds a module parameter called dynamic_mt_modes which controls which micro-threading (split-core) modes the code will consider, as a bitmap. In other words, if it is 0, no micro-threading mode is considered; if it is 2, only 2-way micro-threading is considered; if it is 4, only 4-way, and if it is 6, both 2-way and 4-way micro-threading mode will be considered. The default is 6. With this, we now have secondary threads which are the primary thread for their subcore and therefore need to do the MMU switch. These threads will need to be started even if they have no vcpu to run, so we use the vcore pointer in the PACA rather than the vcpu pointer to trigger them. It is now possible for thread 0 to find that an exit has been requested before it gets to switch the subcore state to the guest. In that case we haven't added the guest's timebase offset to the timebase, so we need to be careful not to subtract the offset in the guest exit path. In fact we just skip the whole path that switches back to host context, since we haven't switched to the guest context. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Book3S HV: Make use of unused threads when running guestsPaul Mackerras2015-08-226-72/+298
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running a virtual core of a guest that is configured with fewer threads per core than the physical cores have, the extra physical threads are currently unused. This makes it possible to use them to run one or more other virtual cores from the same guest when certain conditions are met. This applies on POWER7, and on POWER8 to guests with one thread per virtual core. (It doesn't apply to POWER8 guests with multiple threads per vcore because they require a 1-1 virtual to physical thread mapping in order to be able to use msgsndp and the TIR.) The idea is that we maintain a list of preempted vcores for each physical cpu (i.e. each core, since the host runs single-threaded). Then, when a vcore is about to run, it checks to see if there are any vcores on the list for its physical cpu that could be piggybacked onto this vcore's execution. If so, those additional vcores are put into state VCORE_PIGGYBACK and their runnable VCPU threads are started as well as the original vcore, which is called the master vcore. After the vcores have exited the guest, the extra ones are put back onto the preempted list if any of their VCPUs are still runnable and not idle. This means that vcpu->arch.ptid is no longer necessarily the same as the physical thread that the vcpu runs on. In order to make it easier for code that wants to send an IPI to know which CPU to target, we now store that in a new field in struct vcpu_arch, called thread_cpu. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: add missing pt_regs initializationTudor Laurentiu2015-08-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On this switch branch the regs initialization doesn't happen so add it. This was found with the help of a static code analysis tool. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Fix warnings from sparseThomas Huth2015-08-228-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When compiling the KVM code for POWER with "make C=1", sparse complains about functions missing proper prototypes and a 64-bit constant missing the ULL prefix. Let's fix this by making the functions static or by including the proper header with the prototypes, and by appending a ULL prefix to the constant PPC_MPPE_ADDRESS_MASK. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in KconfigThomas Huth2015-08-221-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the PPC970 support has been removed from the kvm-hv kernel module recently, we should also reflect this change in the help text of the corresponding Kconfig option. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | | | | | | | | KVM: PPC: fix suspicious use of conditional operatorTudor Laurentiu2015-08-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was signaled by a static code analysis tool. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
| * | | | | | | | | | Merge tag 'kvm-arm-for-4.3' of ↵Paolo Bonzini2015-08-2232-544/+1596
| |\ \ \ \ \ \ \ \ \ \ | | |_|/ / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-queue KVM/ARM changes for 4.3 - Full debug support for arm64 - Active state switching for timer interrupts - Lazy FP/SIMD save/restore for arm64 - Generic ARMv8 target
| | * | | | | | | | | arm: KVM: keep arm vfp/simd exit handling consistent with arm64Mario Smarduch2015-08-191-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After enhancing arm64 FP/SIMD exit handling, ARMv7 VFP exit branch is moved to guest trap handling. This allows us to keep exit handling flow between both architectures consistent. Signed-off-by: Mario Smarduch <m.smarduch@samsung.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | arm64: KVM: Optimize arm64 skip 30-50% vfp/simd save/restore on exitsMario Smarduch2015-08-192-10/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch only saves and restores FP/SIMD registers on Guest access. To do this cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit. lmbench, hackbench show significant improvements, for 30-50% exits FP/SIMD context is not saved/restored [chazy/maz: fixed save/restore logic for 32bit guests] Signed-off-by: Mario Smarduch <m.smarduch@samsung.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm/arm64: timer: Allow the timer to control the active stateMarc Zyngier2015-08-124-15/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to remove the crude hack where we sneak the masked bit into the timer's control register, make use of the phys_irq_map API control the active state of the interrupt. This causes some limited changes to allow for potential error propagation. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm/arm64: vgic: Prevent userspace injection of a mapped interruptMarc Zyngier2015-08-122-33/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Virtual interrupts mapped to a HW interrupt should only be triggered from inside the kernel. Otherwise, you could end up confusing the kernel (and the GIC's) state machine. Rearrange the injection path so that kvm_vgic_inject_irq is used for non-mapped interrupts, and kvm_vgic_inject_mapped_irq is used for mapped interrupts. The latter should only be called from inside the kernel (timer, irqfd). Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_activeMarc Zyngier2015-08-122-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to control the active state of an interrupt, introduce a pair of accessors allowing the state to be set/queried. This only affects the logical state, and the HW state will only be applied at world-switch time. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guestMarc Zyngier2015-08-121-3/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To allow a HW interrupt to be injected into a guest, we lookup the guest virtual interrupt in the irq_phys_map list, and if we have a match, encode both interrupts in the LR. We also mark the interrupt as "active" at the host distributor level. On guest EOI on the virtual interrupt, the host interrupt will be deactivated. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interruptsMarc Zyngier2015-08-123-1/+235
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to be able to feed physical interrupts to a guest, we need to be able to establish the virtual-physical mapping between the two worlds. The mappings are kept in a set of RCU lists, indexed by virtual interrupts. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQsMarc Zyngier2015-08-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We only set the irq_queued flag for level interrupts, meaning that "!vgic_irq_is_queued(vcpu, irq)" is a good enough predicate for all interrupts. This will allow us to inject edge HW interrupts, for which the state ACTIVE+PENDING is not allowed. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm/arm64: vgic: Allow HW irq to be encoded in LRMarc Zyngier2015-08-124-5/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that struct vgic_lr supports the LR_HW bit and carries a hwirq field, we can encode that information into the list registers. This patch provides implementations for both GICv2 and GICv3. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfieldsMarc Zyngier2015-08-121-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we're about to cram more information in the vgic_lr structure (HW interrupt number and additional state information), we switch to a layout similar to the HW's: - use bitfields to save space (we don't need more than 10 bits to represent the irq numbers) - source CPU and HW interrupt can share the same field, as a SGI doesn't have a physical line. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | arm/arm64: KVM: Move vgic handling to a non-preemptible sectionMarc Zyngier2015-08-121-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we're about to introduce some serious GIC-poking to the vgic code, it is important to make sure that we're going to poke the part of the GIC that belongs to the CPU we're about to run on (otherwise, we'd end up with some unexpected interrupts firing)... Introducing a non-preemptible section in kvm_arch_vcpu_ioctl_run prevents the problem from occuring. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | arm/arm64: KVM: Fix ordering of timer/GIC on guest entryMarc Zyngier2015-08-121-4/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we now inject the timer interrupt when we're about to enter the guest, it makes a lot more sense to make sure this happens before the vgic code queues the pending interrupts. Otherwise, we get the interrupt on the following exit, which is not great for latency (and leads to all kind of bizarre issues when using with active interrupts at the HW level). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
| | * | | | | | | | | arm64: KVM: remove remaining reference to vgic_sr_vectorsVladimir Murzin2015-08-122-7/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 8a14849 (arm64: KVM: Switch vgic save/restore to alternative_insn) vgic_sr_vectors is not used anymore, so remove remaining leftovers and kill the structure. Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | arm64/kvm: Add generic v8 KVM targetSuzuki K. Poulose2015-08-123-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a generic ARM v8 KVM target cpu type for use by the new CPUs which eventualy ends up using the common sys_reg table. For backward compatibility the existing targets have been preserved. Any new target CPU that can be covered by generic v8 sys_reg tables should make use of the new generic target. Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <Marc.Zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm64: add trace points for guest_debug debugAlex Bennée2015-07-214-1/+179
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This includes trace points for: kvm_arch_setup_guest_debug kvm_arch_clear_guest_debug I've also added some generic register setting trace events and also a trace point to dump the array of hardware registers. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm64: enable KVM_CAP_SET_GUEST_DEBUGAlex Bennée2015-07-218-19/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Finally advertise the KVM capability for SET_GUEST_DEBUG. Once arm support is added this check can be moved to the common kvm_vm_ioctl_check_extension() code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm64: guest debug, HW assisted debug supportAlex Bennée2015-07-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for userspace to control the HW debug registers for guest debug. In the debug ioctl we copy an IMPDEF registers into a new register set called host_debug_state. We use the recently introduced vcpu parameter debug_ptr to select which register set is copied into the real registers when world switch occurs. I've made some helper functions from hw_breakpoint.c more widely available for re-use. As with single step we need to tweak the guest registers to enable the exceptions so we need to save and restore those bits. Two new capabilities have been added to the KVM_EXTENSION ioctl to allow userspace to query the number of hardware break and watch points available on the host hardware. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm64: introduce vcpu->arch.debug_ptrAlex Bennée2015-07-219-47/+316
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces a level of indirection for the debug registers. Instead of using the sys_regs[] directly we store registers in a structure in the vcpu. The new kvm_arm_reset_debug_ptr() sets the debug ptr to the guest context. Because we no longer give the sys_regs offset for the sys_reg_desc->reg field, but instead the index into a debug-specific struct we need to add a number of additional trap functions for each register. Also as the generic generic user-space access code no longer works we have introduced a new pair of function pointers to the sys_reg_desc structure to override the generic code when needed. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm64: re-factor hyp.S debug register codeAlex Bennée2015-07-211-379/+138Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a pre-cursor to sharing the code with the guest debug support. This replaces the big macro that fishes data out of a fixed location with a more general helper macro to restore a set of debug registers. It uses macro substitution so it can be re-used for debug control and value registers. It does however rely on the debug registers being 64 bit aligned (as they happen to be in the hyp ABI). Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm64: guest debug, add support for single-stepAlex Bennée2015-07-214-5/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for single-stepping the guest. To do this we need to manipulate the guests PSTATE.SS and MDSCR_EL1.SS bits to trigger stepping. We take care to preserve MDSCR_EL1 and trap access to it to ensure we don't affect the apparent state of the guest. As we have to enable trapping of all software debug exceptions we suppress the ability of the guest to single-step itself. If we didn't we would have to deal with the exception arriving while the guest was in kernelspace when the guest is expecting to single-step userspace. This is something we don't want to unwind in the kernel. Once the host is no longer debugging the guest its ability to single-step userspace is restored. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm64: guest debug, add SW break point supportAlex Bennée2015-07-214-2/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for SW breakpoints inserted by userspace. We do this by trapping all guest software debug exceptions to the hypervisor (MDCR_EL2.TDE). The exit handler sets an exit reason of KVM_EXIT_DEBUG with the kvm_debug_exit_arch structure holding the exception syndrome information. It will be up to userspace to extract the PC (via GET_ONE_REG) and determine if the debug event was for a breakpoint it inserted. If not userspace will need to re-inject the correct exception restart the hypervisor to deliver the debug exception to the guest. Any other guest software debug exception (e.g. single step or HW assisted breakpoints) will cause an error and the VM to be killed. This is addressed by later patches which add support for the other debug types. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm: introduce kvm_arm_init/setup/clear_debugAlex Bennée2015-07-218-12/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a precursor for later patches which will need to do more to setup debug state before entering the hyp.S switch code. The existing functionality for setting mdcr_el2 has been moved out of hyp.S and now uses the value kept in vcpu->arch.mdcr_el2. As the assembler used to previously mask and preserve MDCR_EL2.HPMN I've had to add a mechanism to save the value of mdcr_el2 as a per-cpu variable during the initialisation code. The kernel never sets this number so we are assuming the bootcode has set up the correct value here. This also moves the conditional setting of the TDA bit from the hyp code into the C code which is currently used for the lazy debug register context switch code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | | | | | KVM: arm: guest debug, add stub KVM_SET_GUEST_DEBUG ioctlAlex Bennée2015-07-214-8/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a stub function to support the KVM_SET_GUEST_DEBUG ioctl. Any unsupported flag will return -EINVAL. For now, only KVM_GUESTDBG_ENABLE is supported, although it won't have any effects. Signed-off-by: Alex Bennée <alex.bennee@linaro.org>. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>