summaryrefslogtreecommitdiffstats
path: root/arch/x86/include/asm/kvm_host.h
Commit message (Collapse)AuthorAgeFilesLines
* KVM: VMX: Cache vmcs segment fieldsAvi Kivity2011-05-221-0/+1
| | | | | | | | | Since the emulator now checks segment limits and access rights, it generates a lot more accesses to the vmcs segment fields. Undo some of the performance hit by cacheing those fields in a read-only cache (the entire cache is invalidated on any write, or on guest exit). Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: mmio_fault_cr2 is not usedGleb Natapov2011-05-221-1/+0Star
| | | | | | | Remove unused variable mmio_fault_cr2. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: add ->fix_hypercall() callbackAvi Kivity2011-05-221-2/+0Star
| | | | | | Artificial, but needed to remove direct calls to KVM. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: make emulate_invlpg() an emulator callbackAvi Kivity2011-05-221-1/+0Star
| | | | | | Removing direct calls to KVM. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: emulate CLTS internallyAvi Kivity2011-05-221-1/+0Star
| | | | | | | | | Avoid using ctxt->vcpu; we can do everything with ->get_cr() and ->set_cr(). A side effect is that we no longer activate the fpu on emulated CLTS; but that should be very rare. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: add and use new callbacks set_idt(), set_gdt()Avi Kivity2011-05-221-3/+0Star
| | | | | | Replacing direct calls to realmode_lgdt(), realmode_lidt(). Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: X86: Update last_guest_tsc in vcpu_putJoerg Roedel2011-05-111-1/+0Star
| | | | | | | | | | | | The last_guest_tsc is used in vcpu_load to adjust the tsc_offset since tsc-scaling is merged. So the last_guest_tsc needs to be updated in vcpu_put instead of the the last_host_tsc. This is fixed with this patch. Reported-by: Jan Kiszka <jan.kiszka@web.de> Tested-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: emulator: do not needlesly sync registers from emulator ctxt to vcpuGleb Natapov2011-05-111-0/+2
| | | | | | | | | | | | | Currently we sync registers back and forth before/after exiting to userspace for IO, but during IO device model shouldn't need to read/write the registers, so we can as well skip those sync points. The only exaception is broken vmware backdor interface. The new code sync registers content during IO only if registers are read from/written to by userspace in the middle of the IO operation and this almost never happens in practise. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: X86: Implement userspace interface to set virtual_tsc_khzJoerg Roedel2011-05-111-0/+7
| | | | | | | | | | This patch implements two new vm-ioctls to get and set the virtual_tsc_khz if the machine supports tsc-scaling. Setting the tsc-frequency is only possible before userspace creates any vcpu. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: X86: Delegate tsc-offset calculation to architecture codeJoerg Roedel2011-05-111-0/+2
| | | | | | | | | | With TSC scaling in SVM the tsc-offset needs to be calculated differently. This patch propagates this calculation into the architecture specific modules so that this complexity can be handled there. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: X86: Implement call-back to propagate virtual_tsc_khzJoerg Roedel2011-05-111-0/+1
| | | | | | | | | | | This patch implements a call-back into the architecture code to allow the propagation of changes to the virtual tsc_khz of the vcpu. On SVM it updates the tsc_ratio variable, on VMX it does nothing. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: X86: Let kvm-clock report the right tsc frequencyJoerg Roedel2011-05-111-3/+3
| | | | | | | | This patch changes the kvm_guest_time_update function to use TSC frequency the guest actually has for updating its clock. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: remove mmu_seq verification on pte update pathXiao Guangrong2011-05-111-1/+1
| | | | | | | | The mmu_seq verification can be removed since we get the pfn in the protection of mmu_lock. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: Add intercept check for emulated cr accessesJoerg Roedel2011-05-111-0/+15
| | | | | | | | This patch adds all necessary intercept checks for instructions that access the crX registers. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: Add x86 callback for intercept checkJoerg Roedel2011-05-111-0/+7
| | | | | | | | This patch adds a callback into kvm_x86_ops so that svm and vmx code can do intercept checks on emulated instructions. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: 16-byte mmio supportAvi Kivity2011-05-111-0/+1
| | | | | | | | | Since sse instructions can issue 16-byte mmios, we need to support them. We can't increase the kvm_run mmio buffer size to 16 bytes without breaking compatibility, so instead we break the large mmios into two smaller 8-byte ones. Since the bus is 64-bit we aren't breaking any atomicity guarantees. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Cache cplAvi Kivity2011-05-111-0/+1
| | | | | | | | We may read the cpl quite often in the same vmexit (instruction privilege check, memory access checks for instruction and operands), so we gain a bit if we cache the value. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Optimize vmx_get_rflags()Avi Kivity2011-05-111-0/+1
| | | | | | If called several times within the same exit, return cached results. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: cleanup pte write pathXiao Guangrong2011-03-171-5/+2Star
| | | | | | | | | | | This patch does: - call vcpu->arch.mmu.update_pte directly - use gfn_to_pfn_atomic in update_pte path The suggestion is from Avi. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: do not record gfn in kvm_mmu_pte_writeXiao Guangrong2011-03-171-1/+0Star
| | | | | | | | | No need to record the gfn to verifier the pte has the same mode as current vcpu, it's because we only speculatively update the pte only if the pte and vcpu have the same mode Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: Convert tsc_write_lock to raw_spinlockJan Kiszka2011-03-171-1/+1
| | | | | | | | Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Convert kvm_lock to raw_spinlockJan Kiszka2011-03-171-1/+1
| | | | | | | | Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* thp: mmu_notifier_test_youngAndrea Arcangeli2011-01-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For GRU and EPT, we need gup-fast to set referenced bit too (this is why it's correct to return 0 when shadow_access_mask is zero, it requires gup-fast to set the referenced bit). qemu-kvm access already sets the young bit in the pte if it isn't zero-copy, if it's zero copy or a shadow paging EPT minor fault we relay on gup-fast to signal the page is in use... We also need to check the young bits on the secondary pagetables for NPT and not nested shadow mmu as the data may never get accessed again by the primary pte. Without this closer accuracy, we'd have to remove the heuristic that avoids collapsing hugepages in hugepage virtual regions that have not even a single subpage in use. ->test_young is full backwards compatible with GRU and other usages that don't have young bits in pagetables set by the hardware and that should nuke the secondary mmu mappings when ->clear_flush_young runs just like EPT does. Removing the heuristic that checks the young bit in khugepaged/collapse_huge_page completely isn't so bad either probably but I thought it was worth it and this makes it reliable. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* KVM: MMU: audit: allow audit more guests at the same timeXiao Guangrong2011-01-121-0/+4
| | | | | | | | | | | | It only allows to audit one guest in the system since: - 'audit_point' is a glob variable - mmu_audit_disable() is called in kvm_mmu_destroy(), so audit is disabled after a guest exited this patch fix those issues then allow to audit more guests at the same time Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Fetch guest cr3 from hardware on demandAvi Kivity2011-01-121-0/+2
| | | | | | | | | | Instead of syncing the guest cr3 every exit, which is expensince on vmx with ept enabled, sync it only on demand. [sheng: fix incorrect cr3 seen by Windows XP] Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: copy instruction bytes from VMCBAndre Przywara2011-01-121-4/+5
| | | | | | | | | | | In case of a nested page fault or an intercepted #PF newer SVM implementations provide a copy of the faulting instruction bytes in the VMCB. Use these bytes to feed the instruction emulator and avoid the costly guest instruction fetch in this case. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: cleanup emulate_instructionAndre Przywara2011-01-121-2/+9
| | | | | | | | | | emulate_instruction had many callers, but only one used all parameters. One parameter was unused, another one is now hidden by a wrapper function (required for a future addition anyway), so most callers use now a shorter parameter list. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: move complete_insn_gp() into x86.cAndre Przywara2011-01-121-0/+2
| | | | | | | | move the complete_insn_gp() helper function out of the VMX part into the generic x86 part to make it usable by SVM. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: x86: fix CR8 handlingAndre Przywara2011-01-121-1/+1
| | | | | | | | | | The handling of CR8 writes in KVM is currently somewhat cumbersome. This patch makes it look like the other CR register handlers and fixes a possible issue in VMX, where the RIP would be incremented despite an injected #GP. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: MMU: retry #PF for softmmuXiao Guangrong2011-01-121-0/+1
| | | | | | | | Retry #PF for softmmu only when the current vcpu has the same cr3 as the time when #PF occurs Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: rename 'no_apf' to 'prefault'Xiao Guangrong2011-01-121-1/+2
| | | | | | | | It's the speculative path if 'no_apf = 1' and we will specially handle this speculative path in the later patch, so 'prefault' is better to fit the sense. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Don't spin on virt instruction faults during rebootAvi Kivity2011-01-121-2/+6
| | | | | | | | | | | | | Since vmx blocks INIT signals, we disable virtualization extensions during reboot. This leads to virtualization instructions faulting; we trap these faults and spin while the reboot continues. Unfortunately spinning on a non-preemptible kernel may block a task that reboot depends on; this causes the reboot to hang. Fix by skipping over the instruction and hoping for the best. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: X86: Introduce generic guest-mode representationJoerg Roedel2011-01-121-0/+1
| | | | | | | | | | This patch introduces a generic representation of guest-mode fpr a vcpu. This currently only exists in the SVM code. Having this representation generic will help making the non-svm code aware of nesting when this is necessary. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Pull extra page fault information into struct x86_exceptionAvi Kivity2011-01-121-13/+4Star
| | | | | | | | | | | | | | | Currently page fault cr2 and nesting infomation are carried outside the fault data structure. Instead they are placed in the vcpu struct, which results in confusion as global variables are manipulated instead of passing parameters. Fix this issue by adding address and nested fields to struct x86_exception, so this struct can carry all information associated with a fault. Signed-off-by: Avi Kivity <avi@redhat.com> Tested-by: Joerg Roedel <joerg.roedel@amd.com> Tested-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Push struct x86_exception info the various gva_to_gpa variantsAvi Kivity2011-01-121-5/+9
| | | | | Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: MMU: remove 'clear_unsync' parameterXiao Guangrong2011-01-121-1/+1
| | | | | | | Remove it since we can judge it by using sp->unsync Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Add instruction-set-specific exit qualifications to kvm_exit traceAvi Kivity2011-01-121-0/+1
| | | | | | | | | | | The exit reason alone is insufficient to understand exactly why an exit occured; add ISA-specific trace parameters for additional information. Because fetching these parameters is expensive on vmx, and because these parameters are fetched even if tracing is disabled, we fetch the parameters via a callback instead of as traditional trace arguments. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: fix apf prefault if nested guest is enabledXiao Guangrong2011-01-121-0/+1
| | | | | | | | If apf is generated in L2 guest and is completed in L1 guest, it will prefault this apf in L1 guest's mmu context. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: remove unused function declarationXiao Guangrong2011-01-121-1/+0Star
| | | | | | | Remove the declaration of kvm_mmu_set_base_ptes() Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Let host know whether the guest can handle async PF in non-userspace ↵Gleb Natapov2011-01-121-0/+1
| | | | | | | | | | | | context. If guest can detect that it runs in non-preemptable context it can handle async PFs at any time, so let host know that it can send async PF even if guest cpu is not in userspace. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Inject asynchronous page fault into a PV guest if page is swapped out.Gleb Natapov2011-01-121-0/+3
| | | | | | | | | | | | | | | | Send async page fault to a PV guest if it accesses swapped out memory. Guest will choose another task to run upon receiving the fault. Allow async page fault injection only when guest is in user mode since otherwise guest may be in non-sleepable context and will not be able to reschedule. Vcpu will be halted if guest will fault on the same page again or if vcpu executes kernel code. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Add PV MSR to enable asynchronous page faults delivery.Gleb Natapov2011-01-121-0/+2
| | | | | | | | Guest enables async PF vcpu functionality using this MSR. Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Retry fault before vmentryGleb Natapov2011-01-121-1/+3
| | | | | | | | | | | | | When page is swapped in it is mapped into guest memory only after guest tries to access it again and generate another fault. To save this fault we can map it immediately since we know that guest is going to access the page. Do it only when tdp is enabled for now. Shadow paging case is more complicated. CR[034] and EFER registers should be switched before doing mapping and then switched back. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: Halt vcpu if page it tries to access is swapped outGleb Natapov2011-01-121-0/+18
| | | | | | | | | | | | | | | | If a guest accesses swapped out memory do not swap it in from vcpu thread context. Schedule work to do swapping and put vcpu into halted state instead. Interrupts will still be delivered to the guest and if interrupt will cause reschedule guest will continue to run another task. [avi: remove call to get_user_pages_noio(), nacked by Linus; this makes everything synchrnous again] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: enlarge number of possible CPUID leavesAndre Przywara2010-12-081-1/+1
| | | | | | | | | | | Currently the number of CPUID leaves KVM handles is limited to 40. My desktop machine (AthlonII) already has 35 and future CPUs will expand this well beyond the limit. Extend the limit to 80 to make room for future processors. KVM-Stable-Tag. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: TSC catchup modeZachary Amsden2010-10-241-0/+6
| | | | | | | | | | | | | | | Negate the effects of AN TYM spell while kvm thread is preempted by tracking conversion factor to the highest TSC rate and catching the TSC up when it has fallen behind the kernel view of time. Note that once triggered, we don't turn off catchup mode. A slightly more clever version of this is possible, which only does catchup when TSC rate drops, and which specifically targets only CPUs with broken TSC, but since these all are considered unstable_tsc(), this patch covers all necessary cases. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* KVM: MMU: Don't track nested fault info in error-codeJoerg Roedel2010-10-241-0/+1
| | | | | | | | | This patch moves the detection whether a page-fault was nested or not out of the error code and moves it into a separate variable in the fault struct. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Non-atomic interrupt injectionAvi Kivity2010-10-241-0/+1
| | | | | | | | | Change the interrupt injection code to work from preemptible, interrupts enabled context. This works by adding a ->cancel_injection() operation that undoes an injection in case we were not able to actually enter the guest (this condition could never happen with atomic injection). Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: Track NX state in struct kvm_mmuJoerg Roedel2010-10-241-0/+2
| | | | | | | | | | | With Nested Paging emulation the NX state between the two MMU contexts may differ. To make sure that always the right fault error code is recorded this patch moves the NX state into struct kvm_mmu so that the code can distinguish between L1 and L2 NX state. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: Allow long mode shadows for legacy page tablesJoerg Roedel2010-10-241-0/+1
| | | | | | | | | | | | Currently the KVM softmmu implementation can not shadow a 32 bit legacy or PAE page table with a long mode page table. This is a required feature for nested paging emulation because the nested page table must alway be in host format. So this patch implements the missing pieces to allow long mode page tables for page table types. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>