openslx/kernel-qcow2-linux.git - In-kernel qcow2 (Kernel part)

	Commit message (Collapse)	Author	Age	Files	Lines
*	[PATCH] x86_64: Fix gcc 4 warning in aperture.c	Andi Kleen	2005-11-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Fix arch/x86_64/kernel/aperture.c: In function #iommu_hole_init#: arch/x86_64/kernel/aperture.c:199: warning: #aper_order# may be used uninitialized in this function Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86-64/i386: Fix CPU model for family 6	Suresh Siddha	2005-11-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	According to cpuid instruction in IA32 SDM-Vol2, when computing cpu model, we need to consider extended model ID for family 0x6 also. AK: Also added fixes/simplifcation from Petr Vandrovec Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86_64: Remove duplicate __cpuinit define	Ashok Raj	2005-11-15	1	-2/+0
\| \| \| \| \| \| \| \| \|	Remove duplicate __cpuinit in smp.c. Already defined in init.h which is already included. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86_64: Use the DMA32 zone for dma_alloc_coherent()/pci_alloc_consistent	Andi Kleen	2005-11-15	1	-1/+7
\| \| \| \| \|	Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] i386/x86-64: Share interrupt vectors when there is a large number of ↵	James Cleverdon	2005-11-15	2	-8/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	interrupt sources Here's a patch that builds on Natalie Protasevich's IRQ compression patch and tries to work for MPS boots as well as ACPI. It is meant for a 4-node IBM x460 NUMA box, which was dying because it had interrupt pins with GSI numbers > NR_IRQS and thus overflowed irq_desc. The problem is that this system has 270 GSIs (which are 1:1 mapped with I/O APIC RTEs) and an 8-node box would have 540. This is much bigger than NR_IRQS (224 for both i386 and x86_64). Also, there aren't enough vectors to go around. There are about 190 usable vectors, not counting the reserved ones and the unused vectors at 0x20 to 0x2F. So, my patch attempts to compress the GSI range and share vectors by sharing IRQs. Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86_64: Support for AMD specific MCE Threshold.	Jacob Shin	2005-11-15	8	-0/+569
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MC4_MISC - DRAM Errors Threshold Register realized under AMD K8 Rev F. This register is used to count correctable and uncorrectable ECC errors that occur during DRAM read operations. The user may interface through sysfs files in order to change the threshold configuration. bank%d/error_count - reads current error count, write to clear. bank%d/interrupt_enable - set/clear interrupt enable. bank%d/threshold_limit - read/write the threshold limit. APIC vector 0xF9 in hw_irq.h. 5 software defined bank ids in mce.h. new apic.c function to setup threshold apic lvt. defaults to interrupt off, count enabled, and threshold limit max. sysfs interface created on /sys/devices/system/threshold. AK: added some ifdefs to make it compile on UP Signed-off-by: Jacob Shin <jacob.shin@amd.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86_64: Account mem_map in VM holes accounting	Andi Kleen	2005-11-15	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \|	The VM needs to know about lost memory in zones to accurately balance dirty pages. This patch accounts mem_map in there too, which fixes a constant errror of a few percent. Also some other misc mappings and the kernel text itself are accounted too. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86_64: Add 4GB DMA32 zone	Andi Kleen	2005-11-15	2	-43/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones. As a bit of historical background: when the x86-64 port was originally designed we had some discussion if we should use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or both. Both was ruled out at this point because it was in early 2.4 when VM is still quite shakey and had bad troubles even dealing with one DMA zone. We settled on the 16MB DMA zone mainly because we worried about older soundcards and the floppy. But this has always caused problems since then because device drivers had trouble getting enough DMA able memory. These days the VM works much better and the wide use of NUMA has proven it can deal with many zones successfully. So this patch adds both zones. This helps drivers who need a lot of memory below 4GB because their hardware is not accessing more (graphic drivers - proprietary and free ones, video frame buffer drivers, sound drivers etc.). Previously they could only use IOMMU+16MB GFP_DMA, which was not enough memory. Another common problem is that hardware who has full memory addressing for >4GB misses it for some control structures in memory (like transmit rings or other metadata). They tended to allocate memory in the 16MB GFP_DMA or the IOMMU/swiotlb then using pci_alloc_consistent, but that can tie up a lot of precious 16MB GFPDMA/IOMMU/swiotlb memory (even on AMD systems the IOMMU tends to be quite small) especially if you have many devices. With the new zone pci_alloc_consistent can just put this stuff into memory below 4GB which works better. One argument was still if the zone should be 4GB or 2GB. The main motivation for 2GB would be an unnamed not so unpopular hardware raid controller (mostly found in older machines from a particular four letter company) who has a strange 2GB restriction in firmware. But that one works ok with swiotlb/IOMMU anyways, so it doesn't really need GFP_DMA32. I chose 4GB to be compatible with IA64 and because it seems to be the most common restriction. The new zone is so far added only for x86-64. For other architectures who don't set up this new zone nothing changes. Architectures can set a compatibility define in Kconfig CONFIG_DMA_IS_DMA32 that will define GFP_DMA32 as GFP_DMA. Otherwise it's a nop because on 32bit architectures it's normally not needed because GFP_NORMAL (=0) is DMA able enough. One problem is still that GFP_DMA means different things on different architectures. e.g. some drivers used to have #ifdef ia64 use GFP_DMA (trusting it to be 4GB) #elif __x86_64__ (use other hacks like the swiotlb because 16MB is not enough) ... . This was quite ugly and is now obsolete. These should be now converted to use GFP_DMA32 unconditionally. I haven't done this yet. Or best only use pci_alloc_consistent/dma_alloc_coherent which will use GFP_DMA32 transparently. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86_64: Update defconfig	Andi Kleen	2005-11-15	1	-15/+83
\| \| \| \| \| \| \|	Rerun and enable autofs 4, relayfs and softdog Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86-64: bitops fix for -Os	Alexandre Oliva	2005-11-03	1	-16/+50
\| \| \| \| \| \| \| \| \|	This fixes the x86-64 find_[first\|next]_zero_bit() function for the end-of-range case. It didn't test for a zero size, and the "rep scas" would do entirely the wrong thing. Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	manual update from upstream:	Tony Luck	2005-10-31	9	-225/+48
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	Applied Al's change 06a544971fad0992fe8b92c5647538d573089dd4 to new location of swiotlb.c Signed-off-by: Tony Luck <tony.luck@intel.com>
\| *	[PATCH] hpet-RTC: cache the comparator register	Clemens Ladisch	2005-10-31	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reads from an HPET register require a round trip to the south bridge and are almost as slow as PCI reads. By caching the last value we've written to the comparator register, we can eliminate all HPET reads from the fast path in the emulated RTC interrupt handler. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] hpet-RTC: fix timer config register accesses	Clemens Ladisch	2005-10-31	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure that the RTC timer is in non-periodic mode; some stupid BIOS might have initialized it to periodic mode. Furthermore, don't set the SETVAL bit in the config register. This wouldn't have any effect unless the timer was in period mode (which it isn't), and then the actual timer frequency would be half that of the desired one because incrementing the comparator in the interrupt handler would be done after the hardware has already incremented it itself. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] hpet-RTC: disable interrupt when no longer needed	Clemens Ladisch	2005-10-31	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the emulated RTC interrupt is no longer needed, we better disable it; otherwise, we get a spurious interrupt whenever the timer has rolled over and reaches the same comparator value. Having a superfluous interrupt every five minutes doesn't hurt much, but it's bad style anyway. ;-) Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Acked-by: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] jiffies_64 cleanup	Thomas Gleixner	2005-10-31	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Define jiffies_64 in kernel/timer.c rather than having 24 duplicated defines in each architecture. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] Remove orphaned TIOCGDEV compat ioctl	Brian Gerst	2005-10-31	1	-29/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This ioctl doesn't exist for native i386. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] introduce setup_timer() helper	Oleg Nesterov	2005-10-31	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Every user of init_timer() also needs to initialize ->function and ->data fields. This patch adds a simple setup_timer() helper for that. The schedule_timeout() is patched as an example of usage. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] swsusp: rework memory freeing on resume	Rafael J. Wysocki	2005-10-31	1	-65/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following patch makes swsusp use the PG_nosave and PG_nosave_free flags to mark pages that should be freed in case of an error during resume. This allows us to simplify the code and to use swsusp_free() in all of the swsusp's resume error paths, which makes them actually work. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] Clean up mtrr compat ioctl code	Brian Gerst	2005-10-31	1	-96/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Handle 32-bit mtrr ioctls in the mtrr driver instead of the ia32 compatability layer. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86: vmx cpu feature detection	Kamble, Nitin A	2005-10-31	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If VMX feature is available in the CPU, this patch will make it visible in the /proc/cpuinfo with the cpuid detection. Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] FPU context corrupted after resume	Shaohua Li	2005-10-31	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mxcsr_feature_mask_init isn't needed in suspend/resume time (we can use boot time mask). And actually it's harmful, as it clear task's saved fxsave in resume. This bug is widely seen by users using zsh. (akpm: my eyes. Fixed some surrounding whitespace mess) Signed-off-by: Shaohua Li<shaohua.li@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] i386 and x86_64 TSC set_cyc2ns_scale imprecision	Mathieu Desnoyers	2005-10-31	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I just found out that some precision is unnecessarily lost in the arch/i386/kernel/timers/timer_tsc.c:set_cyc2ns_scale function. It uses a cpu_mhz parameter when it could use a cpu_khz. In the specific case of an Intel P4 running at 3001.171 Mhz, the truncation to 3001 Mhz leads to an imprecision of 19 microseconds per second : this is very sad for a timer with nearly nanosecond accuracy. Fix the x86_64 architecture too. Cc: george anzinger <george@mvista.com> Cc: john stultz <johnstul@us.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] mm: init_mm without ptlock	Hugh Dickins	2005-10-30	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	First step in pushing down the page_table_lock. init_mm.page_table_lock has been used throughout the architectures (usually for ioremap): not to serialize kernel address space allocation (that's usually vmlist_lock), but because pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it. Reverse that: don't lock or unlock init_mm.page_table_lock in any of the architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take and drop it when allocating a new one, to check lest a racing task already did. Similarly no page_table_lock in vmalloc's map_vm_area. Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle user mms, which are converted only by a later patch, for now they have to lock differently according to whether or not it's init_mm. If sources get muddled, there's a danger that an arch source taking init_mm.page_table_lock will be mixed with common source also taking it (or neither take it). So break the rules and make another change, which should break the build for such a mismatch: remove the redundant mm arg from pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13). Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64 used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64 map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free took page_table_lock for no good reason. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] mm: mm_init set_mm_counters	Hugh Dickins	2005-10-30	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	How is anon_rss initialized? In dup_mmap, and by mm_alloc's memset; but that's not so good if an mm_counter_t is a special type. And how is rss initialized? By set_mm_counter, all over the place. Come on, we just need to initialize them both at once by set_mm_counter in mm_init (which follows the memcpy when forking). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] gfp_t: dma-mapping (amd64)	Al Viro	2005-10-28	2	-3/+3
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* \|	Update from upstream with manual merge of Yasunori Goto's	Tony Luck	2005-10-20	17	-106/+204
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	changes to swiotlb.c made in commit 281dd25cdc0d6903929b79183816d151ea626341 since this file has been moved from arch/ia64/lib/swiotlb.c to lib/swiotlb.c Signed-off-by: Tony Luck <tony.luck@intel.com>
\| *	[PATCH] x86_64: Allocate cpu local data for all possible CPUs	Andi Kleen	2005-10-11	2	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CPU hotplug fills up the possible map to NR_CPUs, but it did that after setting up per CPU data. This lead to CPU data not getting allocated for all possible CPUs, which lead to various side effects. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86_64: Fix change_page_attr cache flushing	Andi Kleen	2005-10-11	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Noticed by Terence Ripperda Undo wrong change in global_flush_tlb. We need to flush the caches in all cases, not just when pages were reverted. This was a bogus optimization added earlier, but it was wrong. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] i386: fix stack alignment for signal handlers	Markus F.X.J. Oberhumer	2005-10-10	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes the setup of the alignment of the signal frame, so that all signal handlers are run with a properly aligned stack frame. The current code "over-aligns" the stack pointer so that the stack frame is effectively always mis-aligned by 4 bytes. But what we really want is that on function entry ((sp + 4) & 15) == 0, which matches what would happen if the stack were aligned before a "call" instruction. Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86_64: Set up safe page tables during resume	Rafael J. Wysocki	2005-10-10	2	-6/+138
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following patch makes swsusp avoid the possible temporary corruption of page translation tables during resume on x86-64. This is achieved by creating a copy of the relevant page tables that will not be modified by swsusp and can be safely used by it on resume. The problem is that during resume on x86-64 swsusp may temporarily corrupt the page tables used for the direct mapping of RAM. If that happens, a page fault occurs and cannot be handled properly, which leads to the solid hang of the affected system. This leads to the loss of the system's state from before suspend and may result in the loss of data or the corruption of filesystems, so it is a serious issue. Also, it appears to happen quite often (for me, as often as 50% of the time). The problem is related to the fact that (at least) one of the PMD entries used in the direct memory mapping (starting at PAGE_OFFSET) points to a page table the physical address of which is much greater than the physical address of the PMD entry itself. Moreover, unfortunately, the physical address of the page table before suspend (i.e. the one stored in the suspend image) happens to be different to the physical address of the corresponding page table used during resume (i.e. the one that is valid right before swsusp_arch_resume() in arch/x86_64/kernel/suspend_asm.S is executed). Thus while the image is restored, the "offending" PMD entry gets overwritten, so it does not point to the right physical address any more (i.e. there's no page table at the address pointed to by it, because it points to the address the page table has been at during suspend). Consequently, if the PMD entry is used later on, and it _is_ used in the process of copying the image pages, a page fault occurs, but it cannot be handled in the normal way and the system hangs. In principle we can call create_resume_mapping() from swsusp_arch_resume() (ie. from suspend_asm.S), but then the memory allocations in create_resume_mapping(), resume_pud_mapping(), and resume_pmd_mapping() must be made carefully so that we use _only_ NosaveFree pages in them (the other pages are overwritten by the loop in swsusp_arch_resume()). Additionally, we are in atomic context at that time, so we cannot use GFP_KERNEL. Moreover, if one of the allocations fails, we should free all of the allocated pages, so we need to trace them somehow. All of this is done in the appended patch, except that the functions populating the page tables are located in arch/x86_64/kernel/suspend.c rather than in init.c. It may be done in a more elegan way in the future, with the help of some swsusp patches that are in the works now. [AK: move some externs into headers, renamed a function] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86_64: Drop global bit from early low mappings	Andi Kleen	2005-10-05	1	-20/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop global bit from early low mappings Suggested by Linus, originally also proposed by Suresh. This fixes a race condition with early start of udev, originally tracked down by Suresh B. Siddha. The problem was that switching to the user space VM would not clear the global low mappings for the beginning of memory, which lead to memory corruption. Drop the global bits. The kernel mapping stays global because it should stay constant. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86_64: Fix numa node topology detection for srat based x86_64 boxes	Ravikiran G Thirumalai	2005-10-03	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	2.6.14-rc2 does not assign cpus to proper nodeids on our em64t numa boxen. Our boxes use acpi srat for parsing the numa information. srat_detect_node() used phys_proc_id[] to get to the cpu's local apic id, but phys_proc_id[] represents the cpu<->initial_apic_id mapping. The following patch fixes this problem. Now apicid_to_node[] is properly indexed with the local apic id. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86_64 early numa init fix	Ravikiran G Thirumalai	2005-09-30	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is not setup early enough by numa_init_array due to the x86_64 changes in 2.6.14-rc*, and unfortunately set wrongly by the work around code in numa_init_array(). cpu_to_node[0] gets set with 1 early and later gets set properly to 0 during identify_cpu() when all cpus are brought up, but confusing the numa slab in the process. Here is a quick fix for this. The right fix obviously is to have cpu_to_node[bsp] setup early for numa_init_array(). The following patch will fix the problem now, and the code can stay on even when cpu_to_node{BP] gets fixed early correctly. Thanks to Petr for access to his box. Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86_64: fix the BP node_to_cpumask	Ravikiran G Thirumalai	2005-09-30	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix the BP node_to_cpumask. 2.6.14-rc* broke the boot cpu bit as the cpu_to_node(0) is now not setup early enough for numa_init_array. cpu_to_node[] is setup much later at srat_detect_node on acpi srat based em64t machines. This seems like a problem on amd machines too, Tested on em64t though. /sys/devices/system/node/node0/cpumap shows up sanely after this patch. Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] utilization of kprobe_mutex is incorrect on x86_64	Zhang, Yanmin	2005-09-30	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The up()/down() orders are incorrect in arch/x86_64/kprobes.c file. kprobe_mutext is used to protect the free kprobe instruction slot list. arch_prepare_kprobe applies for a slot from the free list, and arch_remove_kprobe returns a slot to the free list. The incorrect up()/down() orders to operate on kprobe_mutex fail to protect the free list. If 2 threads try to get/return kprobe instruction slot at the same time, the free slot list might be broken, or a free slot might be applied by 2 threads. Signed-off-by: Zhang Yanmin <Yanmin.zhang@intel.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86_64: Fix mce_log	Mike Waychison	2005-09-30	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The attempt to fixup the lockless mce log buffer introduced an infinite loop when trying to find a free entry. And: Using rcu_dereference() to load mcelog.next doesn't seem to be sufficient enough to ensure that mcelog.next is loaded each time around the loop in mce_log(). Instead, use an explicit rmb() to ensure that the compiler gets it right. AK: turned the smp_wmbs into true wmbs to make sure they are not reordered by the compiler on UP. Signed-off-by: Mike Waychison <mikew@google.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] Fix up TLB flush filter disabling	Andi Kleen	2005-09-30	1	-10/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I checked with AMD and they requested to only disable it for family 15. Also disable it for i386 too. And some style fixes. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs	john stultz	2005-09-28	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This should resolve the issue seen in bugme bug #5105, where it is assumed that dualcore x86_64 systems have synced TSCs. This is not the case, and alternate timesources should be used instead. For more details, see: http://bugzilla.kernel.org/show_bug.cgi?id=5105 Andi's earlier concerns that the TSCs should be synced on dualcore systems have been resolved by confirmation from AMD folks that they can be unsynced. Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] update URL for HPET spec.	Randy Dunlap	2005-09-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Correct URL for HPET spec. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	x86-64/smp: fix random SIGSEGV issues	Linus Torvalds	2005-09-18	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	They seem to have been due to AMD errata 63/122; the fix is to disable TLB flush filtering in SMP configurations. Confirmed to fix the problem by Andrew Walrond <andrew@walrond.org> [ Let's see if we'll have a better fix eventually, this is the Q&D "let's get this fixed and out there" version ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] x86_64: e820.c needs module.h	Andrew Morton	2005-09-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For EXPORT_SYMBOL. Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[LIB]: Consolidate _atomic_dec_and_lock()	David S. Miller	2005-09-15	4	-51/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Several implementations were essentialy a common piece of C code using the cmpxchg() macro. Put the implementation in one spot that everyone can share, and convert sparc64 over to using this. Alpha is the lone arch-specific implementation, which codes up a special fast path for the common case in order to avoid GP reloading which a pure C version would require. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[PATCH] swiotlb: move from arch/ia64/lib/ to lib/	John W. Linville	2005-09-29	1	-2/+0
\|/ \| \| \| \| \| \| \| \| \|	The swiotlb implementation is shared by both IA-64 and EM64T. However, the source itself lives under arch/ia64. This patch moves swiotlb.c from arch/ia64/lib to lib/ and fixes-up the appropriate Makefile and Kconfig files. No actual changes are made to swiotlb.c. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	Partially revert "Fix time going twice as fast problem on ATI Xpress chipsets"	Linus Torvalds	2005-09-15	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 66759a01adbfe8828dd063e32cf5ed3f46696181 introduced the fix for time ticking too fast on some boards by disabling one of the doubly connected timer pins on ATI boards. However, it ends up being _much_ too broad a brush, and that just makes some other ATI boards not work at all since they now have no timer source. So disable the automatic ATI southbridge detection, and just rely on people who see this problem disabling it by hand with the option "disable_timer_pin_1" on the kernel command line. Maybe somebody can figure out the proper tests at a later date. Acked-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] error path in setup_arg_pages() misses vm_unacct_memory()	Hugh Dickins	2005-09-14	2	-10/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pavel Emelianov and Kirill Korotaev observe that fs and arch users of security_vm_enough_memory tend to forget to vm_unacct_memory when a failure occurs further down (typically in setup_arg_pages variants). These are all users of insert_vm_struct, and that reservation will only be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't. So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into Committed_AS each time they're run. But don't add VM_ACCOUNT to them, it's inappropriate to reserve against the very unlikely case that gdb be used to COW a vDSO page - we ought to do something about that in do_wp_page, but there are yet other inconsistencies to be resolved. The safe and economical way to fix this is to let insert_vm_struct do the security_vm_enough_memory check when it finds VM_ACCOUNT is set. And the MIPS irix_brk has been calling security_vm_enough_memory before calling do_brk which repeats it, doubly accounting and so also leaking. Remove that, and all the fs and arch calls to security_vm_enough_memory: give it a less misleading name later on. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86_64: Export end_pfn	Andi Kleen	2005-09-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Fixes > if [ -r System.map -a -x /sbin/depmod ]; then /sbin/depmod -ae -F > System.map 2. 6.14-rc1; fi > WARNING: /lib/modules/2.6.14-rc1/kernel/drivers/char/agp/amd64-agp.ko > needs unknown symbol end_pfn Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86_64: NMI watchdog frequency calculation adjustments	Jan Beulich	2005-09-13	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	Like previously done for i386, get the x86_64 watchdog tick calculation into a state where it can also be used on CPUs with frequencies beyond 4GHz. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] use add_taint() for setting tainted bit flags	Randy Dunlap	2005-09-13	1	-1/+1
\| \| \| \| \| \| \| \| \|	Use the add_taint() interface for setting tainted bit flags instead of doing it manually. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86-64: Use correct mask to compute conflicting nodes in SRAT	Andi Kleen	2005-09-12	1	-1/+1
\| \| \| \| \| \| \|	The nodes are not set online yet at this point. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] x86-64: reset apicid<->node tables when SRAT cannot be parsed	Andi Kleen	2005-09-12	1	-0/+3
\| \| \| \|	Signed-off-by: Linus Torvalds <torvalds@osdl.org>