summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* KVM: MMU: large page supportMarcelo Tosatti2008-04-276-32/+262
| | | | | | | | | | | | | | Create large pages mappings if the guest PTE's are marked as such and the underlying memory is hugetlbfs backed. If the largepage contains write-protected pages, a large pte is not used. Gives a consistent 2% improvement for data copies on ram mounted filesystem, without NPT/EPT. Anthony measures a 4% improvement on 4-way kernbench, with NPT. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: ignore zapped root pagetablesMarcelo Tosatti2008-04-275-2/+48
| | | | | | | | | | | | Mark zapped root pagetables as invalid and ignore such pages during lookup. This is a problem with the cr3-target feature, where a zapped root table fools the faulting code into creating a read-only mapping. The result is a lockup if the instruction can't be emulated. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Implement dummy values for MSR_PERF_STATUSAlexander Graf2008-04-271-1/+7
| | | | | | | Darwin relies on this and ceases to work without. Signed-off-by: Alexander Graf <alex@csgraf.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: sparse fixes for kvm/x86.cHarvey Harrison2008-04-271-13/+13
| | | | | | | | | | | | | | | | | | In two case statements, use the ever popular 'i' instead of index: arch/x86/kvm/x86.c:1063:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here arch/x86/kvm/x86.c:1079:7: warning: symbol 'index' shadows an earlier one arch/x86/kvm/x86.c:1000:9: originally declared here Make it static. arch/x86/kvm/x86.c:1945:24: warning: symbol 'emulate_ops' was not declared. Should it be static? Drop the return statements. arch/x86/kvm/x86.c:2878:2: warning: returning void-valued expression arch/x86/kvm/x86.c:2944:2: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: make iopm_base staticHarvey Harrison2008-04-271-1/+1
| | | | | | | | Fixes sparse warning as well. arch/x86/kvm/svm.c:69:15: warning: symbol 'iopm_base' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: fix sparse warnings in x86_emulate.cHarvey Harrison2008-04-271-2/+2
| | | | | | | | | | | | | | | Nesting __emulate_2op_nobyte inside__emulate_2op produces many shadowed variable warnings on the internal variable _tmp used by both macros. Change the outer macro to use __tmp. Avoids a sparse warning like the following at every call site of __emulate_2op arch/x86/kvm/x86_emulate.c:1091:3: warning: symbol '_tmp' shadows an earlier one arch/x86/kvm/x86_emulate.c:1091:3: originally declared here [18 more warnings suppressed] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add stat counter for hypercallsAmit Shah2008-04-272-0/+3
| | | | | Signed-off-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use x86's segment descriptor struct instead of private definitionAvi Kivity2008-04-273-39/+8Star
| | | | | | The x86 desc_struct unification allows us to remove segment_descriptor.h. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Increase the number of user memory slots per vmAvi Kivity2008-04-271-1/+1
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add API for determining the number of supported memory slotsAvi Kivity2008-04-272-0/+4
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Increase vcpu count to 16Avi Kivity2008-04-271-1/+1
| | | | | | With NPT support, scalability is much improved. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Add API to retrieve the number of supported vcpus per vmAvi Kivity2008-04-272-0/+4
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: make register_address_increment and JMP_REL static inlinesHarvey Harrison2008-04-271-30/+26Star
| | | | | | | Change jmp_rel() to a function as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: make register_address, address_mask static inlinesHarvey Harrison2008-04-271-19/+29
| | | | | Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: add ad_mask static inlineHarvey Harrison2008-04-271-3/+8
| | | | | | | Replaces open-coded mask calculation in macros. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* x86: KVM guest: paravirtualized clocksourceGlauber de Oliveira Costa2008-04-275-0/+182
| | | | | | | | | | | | | | | | | | This is the guest part of kvm clock implementation It does not do tsc-only timing, as tsc can have deltas between cpus, and it did not seem worthy to me to keep adjusting them. We do use it, however, for fine-grained adjustment. Other than that, time comes from the host. [randy dunlap: add missing include] [randy dunlap: disallow on Voyager or Visual WS] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: paravirtualized clocksource: host partGlauber de Oliveira Costa2008-04-274-1/+145
| | | | | | | | | | | | | | | | | This is the host part of kvm clocksource implementation. As it does not include clockevents, it is a fairly simple implementation. We only have to register a per-vcpu area, and start writing to it periodically. The area is binary compatible with xen, as we use the same shadow_info structure. [marcelo: fix bad_page on MSR_KVM_SYSTEM_TIME] [avi: save full value of the msr, even if enable bit is clear] [avi: clear previous value of time_page] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: enable LBR virtualizationJoerg Roedel2008-04-271-2/+37
| | | | | | | | | | | | This patch implements the Last Branch Record Virtualization (LBRV) feature of the AMD Barcelona and Phenom processors into the kvm-amd module. It will only be enabled if the guest enables last branch recording in the DEBUG_CTL MSR. So there is no increased world switch overhead when the guest doesn't use these MSRs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: allocate the MSR permission map per VCPUJoerg Roedel2008-04-272-35/+34Star
| | | | | | | | | | | This patch changes the kvm-amd module to allocate the SVM MSR permission map per VCPU instead of a global map for all VCPUs. With this we have more flexibility allowing specific guests to access virtualized MSRs. This is required for LBR virtualization. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: let init_vmcb() take struct vcpu_svm as parameterJoerg Roedel2008-04-271-6/+6
| | | | | | | | | Change the parameter of the init_vmcb() function in the kvm-amd module from struct vmcb to struct vcpu_svm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: fix typo in VMX header defineRyan Harper2008-04-272-5/+5
| | | | | | | | | Looking at Intel Volume 3b, page 148, table 20-11 and noticed that the field name is 'Deliver' not 'Deliever'. Attached patch changes the define name and its user in vmx.c Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: add support for Nested PagingJoerg Roedel2008-04-271-5/+67
| | | | | | | | This patch contains the SVM architecture dependent changes for KVM to enable support for the Nested Paging feature of AMD Barcelona and Phenom processors. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: add TDP support to the KVM MMUJoerg Roedel2008-04-272-3/+82
| | | | | | | | This patch contains the changes to the KVM MMU necessary for support of the Nested Paging feature in AMD Barcelona and Phenom Processors. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: export the load_pdptrs() function to modulesJoerg Roedel2008-04-272-0/+3
| | | | | | | The load_pdptrs() function is required in the SVM module for NPT support. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: make the __nonpaging_map function genericJoerg Roedel2008-04-271-4/+3Star
| | | | | | | | | The mapping function for the nonpaging case in the softmmu does basically the same as required for Nested Paging. Make this function generic so it can be used for both. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: export information about NPT to generic x86 codeJoerg Roedel2008-04-273-1/+20
| | | | | | | | | | The generic x86 code has to know if the specific implementation uses Nested Paging. In the generic code Nested Paging is called Two Dimensional Paging (TDP) to avoid confusion with (future) TDP implementations of other vendors. This patch exports the availability of TDP to the generic x86 code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: add module parameter to disable Nested PagingJoerg Roedel2008-04-271-0/+8
| | | | | | | | | To disable the use of the Nested Paging feature even if it is available in hardware this patch adds a module parameter. Nested Paging can be disabled by passing npt=0 to the kvm_amd module. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: add detection of Nested Paging featureJoerg Roedel2008-04-271-0/+8
| | | | | | | | Let SVM detect if the Nested Paging feature is available on the hardware. Disable it to keep this patch series bisectable. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: SVM: move feature detection to hardware setup codeJoerg Roedel2008-04-271-1/+3
| | | | | | | | | By moving the SVM feature detection from the each_cpu code to the hardware setup code it runs only once. As an additional advance the feature check is now available earlier in the module setup process. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: allow access to EFER in 32bit KVMJoerg Roedel2008-04-271-10/+0Star
| | | | | | | | This patch makes the EFER register accessible on a 32bit KVM host. This is necessary to boot 32 bit PAE guests under SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: unifdef the EFER specific codeJoerg Roedel2008-04-271-8/+2Star
| | | | | | | | | | | To allow access to the EFER register in 32bit KVM the EFER specific code has to be exported to the x86 generic code. This patch does this in a backwards compatible manner. [avi: add check for EFER-less hosts] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: align valid EFER bits with the features of the host systemJoerg Roedel2008-04-273-1/+16
| | | | | | | | | | This patch aligns the bits the guest can set in the EFER register with the features in the host processor. Currently it lets EFER.NX disabled if the processor does not support it and enables EFER.LME and EFER.LMA only for KVM on 64 bit hosts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: make EFER_RESERVED_BITS configurable for architecture codeJoerg Roedel2008-04-272-2/+10
| | | | | | | | This patch give the SVM and VMX implementations the ability to add some bits the guest can set in its EFER register. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Disable pagefaults during copy_from_user_inatomic()Andrea Arcangeli2008-04-271-0/+2
| | | | | | | | With CONFIG_PREEMPT=n, this is needed in order to disable the fault-in code from sleeping. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Use CONFIG_PREEMPT_NOTIFIERS around struct preempt_notifierHollis Blanchard2008-04-271-0/+2
| | | | | | | | This allows kvm_host.h to be #included even when struct preempt_notifier is undefined. This is needed to build ppc asm-offsets.h. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: VMX: Enable Virtual Processor Identification (VPID)Sheng Yang2008-04-273-5/+82
| | | | | | | | | | | | To allow TLB entries to be retained across VM entry and VM exit, the VMM can now identify distinct address spaces through a new virtual-processor ID (VPID) field of the VMCS. [avi: drop vpid_sync_all()] [avi: add "cc" to asm constraints] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Limit vcpu mmap size to one page on non-x86Avi Kivity2008-04-271-1/+4
| | | | | | | | The second page is only needed on archs that support pio. Noted by Carsten Otte. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Decouple mmio from shadow page tablesAvi Kivity2008-04-272-28/+23Star
| | | | | | | | | | | | | Currently an mmio guest pte is encoded in the shadow pagetable as a not-present trapping pte, with the SHADOW_IO_MARK bit set. However nothing is ever done with this information, so maintaining it is a useless complication. This patch moves the check for mmio to before shadow ptes are instantiated, so the shadow code is never invoked for ptes that reference mmio. The code is simpler, and with future work, can be made to handle mmio concurrently. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: group decoding for group 1 instructionsAvi Kivity2008-04-271-2/+23
| | | | | | Opcodes 0x80-0x83 Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: Only x86 has pioAvi Kivity2008-04-271-0/+2
| | | | Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: constify function pointer tablesJan Engelhardt2008-04-271-2/+2
| | | | | Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: add group 7 decodingAvi Kivity2008-04-271-2/+7
| | | | | | This adds group decoding for opcode 0x0f 0x01 (group 7). Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: Group decoding for groups 4 and 5Avi Kivity2008-04-271-30/+10Star
| | | | | | Add group decoding support for opcode 0xfe (group 4) and 0xff (group 5). Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: Group decoding for group 3Avi Kivity2008-04-271-24/+10Star
| | | | | | This adds group decoding support for opcodes 0xf6, 0xf7 (group 3). Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: group decoding for group 1AAvi Kivity2008-04-271-1/+7
| | | | | | This adds group decode support for opcode 0x8f. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: x86 emulator: add support for group decodingAvi Kivity2008-04-271-6/+27
| | | | | | | | | Certain x86 instructions use bits 3:5 of the byte following the opcode as an opcode extension, with the decode sometimes depending on bits 6:7 as well. Add support for this in the main decoding table rather than an ad-hock adaptation per opcode. Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Simplify hash table indexingDong, Eddie2008-04-272-6/+7
| | | | | Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* KVM: MMU: Update shadow ptes on partial guest pte writesDong, Eddie2008-04-272-12/+18
| | | | | | | | | | | | | | | | A guest partial guest pte write will leave shadow_trap_nonpresent_pte in spte, which generates a vmexit at the next guest access through that pte. This patch improves this by reading the full guest pte in advance and thus being able to update the spte and eliminate the vmexit. This helps pae guests which use two 32-bit writes to set a single 64-bit pte. [truncation fix by Eric] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2008-04-268-72/+228
|\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-bigbox-bootmem-v3 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-bigbox-bootmem-v3: x86_64/mm: check and print vmemmap allocation continuous x86_64: fix setup_node_bootmem to support big mem excluding with memmap x86_64: make reserve_bootmem_generic() use new reserve_bootmem() mm: allow reserve_bootmem() cross nodes mm: offset align in alloc_bootmem() mm: fix alloc_bootmem_core to use fast searching for all nodes mm: make mem_map allocation continuous
| * x86_64/mm: check and print vmemmap allocation continuousYinghai Lu2008-04-263-2/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On big systems with lots of memory, don't print out too much during bootup, and make it easy to find if it is continuous. on 256G 8 sockets system will get [ffffe20000000000-ffffe20002bfffff] PMD -> [ffff810001400000-ffff810003ffffff] on node 0 [ffffe2001c700000-ffffe2001c7fffff] potential offnode page_structs [ffffe20002c00000-ffffe2001c7fffff] PMD -> [ffff81000c000000-ffff8100255fffff] on node 0 [ffffe20038700000-ffffe200387fffff] potential offnode page_structs [ffffe2001c800000-ffffe200387fffff] PMD -> [ffff810820200000-ffff81083c1fffff] on node 1 [ffffe20040000000-ffffe2007fffffff] PUD ->ffff811027a00000 on node 2 [ffffe20038800000-ffffe2003fffffff] PMD -> [ffff811020200000-ffff8110279fffff] on node 2 [ffffe20054700000-ffffe200547fffff] potential offnode page_structs [ffffe20040000000-ffffe200547fffff] PMD -> [ffff811027c00000-ffff81103c3fffff] on node 2 [ffffe20070700000-ffffe200707fffff] potential offnode page_structs [ffffe20054800000-ffffe200707fffff] PMD -> [ffff811820200000-ffff81183c1fffff] on node 3 [ffffe20080000000-ffffe200bfffffff] PUD ->ffff81202fa00000 on node 4 [ffffe20070800000-ffffe2007fffffff] PMD -> [ffff812020200000-ffff81202f9fffff] on node 4 [ffffe2008c700000-ffffe2008c7fffff] potential offnode page_structs [ffffe20080000000-ffffe2008c7fffff] PMD -> [ffff81202fc00000-ffff81203c3fffff] on node 4 [ffffe200a8700000-ffffe200a87fffff] potential offnode page_structs [ffffe2008c800000-ffffe200a87fffff] PMD -> [ffff812820200000-ffff81283c1fffff] on node 5 [ffffe200c0000000-ffffe200ffffffff] PUD ->ffff813037a00000 on node 6 [ffffe200a8800000-ffffe200bfffffff] PMD -> [ffff813020200000-ffff8130379fffff] on node 6 [ffffe200c4700000-ffffe200c47fffff] potential offnode page_structs [ffffe200c0000000-ffffe200c47fffff] PMD -> [ffff813037c00000-ffff81303c3fffff] on node 6 [ffffe200c4800000-ffffe200e07fffff] PMD -> [ffff813820200000-ffff81383c1fffff] on node 7 instead of a very long print out... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>