path: root/arch/powerpc/mm
Commits, newest first (subject, author, date, files changed, lines removed/added):
* powerpc/mm: Improve readability of update_mmu_cache() (Gavin Shan, 2016-05-11, 1 file, -7/+14)
  The function is used to update the MMU with the software PTE. It can be called by the data access exception handler (0x300) or the instruction access exception handler (0x400). If the function is called by the 0x400 handler, the local variable @access is set to _PAGE_EXEC to indicate the software PTE should have that flag set. When the function is called by the 0x300 handler, @access is set to zero. This improves the readability of the function by replacing if statements with a switch. No logical changes introduced.
  Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
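  A minimal sketch of the switch-based form described above, written here for illustration from the commit message rather than copied from the diff; the trap values and the @access handling follow the description:

```c
static void update_mmu_cache_sketch(struct vm_area_struct *vma,
				    unsigned long address, pte_t *ptep)
{
	unsigned long access = 0, trap;

	/* Which exception brought us here decides the access flag. */
	trap = current->thread.regs ? TRAP(current->thread.regs) : 0UL;
	switch (trap) {
	case 0x300:		/* data access exception */
		access = 0;
		break;
	case 0x400:		/* instruction access exception */
		access = _PAGE_EXEC;
		break;
	default:
		return;
	}

	hash_preload(vma->vm_mm, address, access, trap);
}
```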
* powerpc/mm: define TOP_ZONE as a constant (Oliver O'Halloran, 2016-05-11, 1 file, -12/+5)
  The zone that contains the top of memory will be either ZONE_NORMAL or ZONE_HIGHMEM depending on the kernel config. There are two functions that require this information and both of them use an #ifdef to set a local variable (top_zone). This is a little silly so let's just make it a constant.
  Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
  Cc: linux-mm@kvack.org
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
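  The change amounts to something like the sketch below; the exact config symbol is assumed from context, not copied from the patch:

```c
/* Highest populated zone: depends only on the kernel configuration. */
#ifdef CONFIG_HIGHMEM
#define TOP_ZONE ZONE_HIGHMEM
#else
#define TOP_ZONE ZONE_NORMAL
#endif
```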
* powerpc/mm/hash64: Fix subpage protection with 4K HPTE config (Michael Ellerman, 2016-05-11, 1 file, -1/+9)
  With a Linux page size of 64K and hardware only supporting 4K HPTEs, if we use subpage protection we always fail for subpage 0, as shown below (using the selftest subpage_prot test):
    520175565: (4520111850): Failed at 0x3fffad4b0000 (p=13,sp=0,w=0), want=fault, got=pass !
    4520890210: (4520826495): Failed at 0x3fffad5b0000 (p=29,sp=0,w=0), want=fault, got=pass !
    4521574251: (4521510536): Failed at 0x3fffad6b0000 (p=45,sp=0,w=0), want=fault, got=pass !
    4522258324: (4522194609): Failed at 0x3fffad7b0000 (p=61,sp=0,w=0), want=fault, got=pass !
  This is because hash preload wrongly inserts the HPTE entry for subpage 0 without looking at the subpage protection information. Fix it by teaching should_hash_preload() not to preload if we have subpage protection configured for that range. It appears this has been broken since it was introduced in 2008.
  Fixes: fa28237cfcc5 ("[POWERPC] Provide a way to protect 4k subpages when using 64k pages")
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  [mpe: Rework into should_hash_preload() to avoid build fails w/SLICES=n]
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
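  A rough sketch of the check described here and in the factoring-out patch below; it assumes the existing get_slice_psize() and subpage_protection() helpers, and the exact guards in the real diff may differ:

```c
#ifdef CONFIG_PPC_MM_SLICES
static bool should_hash_preload(struct mm_struct *mm, unsigned long ea)
{
	int psize = get_slice_psize(mm, ea);

	/* We only prefault standard pages for now. */
	if (unlikely(psize != mm->context.user_psize))
		return false;

#ifdef CONFIG_PPC_64K_PAGES
	/* Don't prefault if subpage protection is active for this EA. */
	if (psize == MMU_PAGE_4K && subpage_protection(mm, ea))
		return false;
#endif
	return true;
}
#else
static bool should_hash_preload(struct mm_struct *mm, unsigned long ea)
{
	return true;
}
#endif
```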
* powerpc/mm/hash64: Factor out hash preload psize check (Michael Ellerman, 2016-05-11, 1 file, -4/+17)
  Currently we have a check in hash_preload() against the psize, which is only included when CONFIG_PPC_MM_SLICES is enabled. We want to expand this check in a subsequent patch, so factor it out to allow that. As a bonus it removes the #ifdef in the C code. Unfortunately we can't put this in the existing CONFIG_PPC_MM_SLICES block because it would require a forward declaration.
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/slice: Remove slice_mm_new_context() (Aneesh Kumar K.V, 2016-05-11, 1 file, -2/+1)
  The usage in mmu_context_nohash.c is bogus, because we set the context.id value to MMU_NO_CONTEXT 4 lines previously in the same function, meaning slice_mm_new_context() will always be true. The book3s 64 usage was removed in the previous commit. So remove it as unused.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/subpage: Initialise user psize correctly (Aneesh Kumar K.V, 2016-05-11, 1 file, -1/+10)
  As part of the radix support we switched Book3s64 to use a value of ~0 for MMU_NO_CONTEXT. That is because id 0 is special on radix. However that broke the logic in init_new_context(). The code there needs to differentiate between a newly allocated context and one inherited via fork.
  Previously it worked because a newly allocated context has an id of zero (because it was just memset() to zero), which used to match MMU_NO_CONTEXT, and therefore slice_mm_new_context() did the right thing. Instead, check against a context.id value of zero rather than using slice_mm_new_context().
  Without this patch we never call slice_set_user_psize(), we end up with a slice psize value of zero, and we always end up using 4K HPTEs.
  Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines")
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
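  In init_new_context() the fix boils down to something like this sketch; slice_set_user_psize() and mmu_virtual_psize are existing names, while the surrounding code is paraphrased from the message rather than taken from the diff:

```c
#ifdef CONFIG_PPC_MM_SLICES
	/*
	 * A freshly allocated mm has context.id == 0 (it was memset to
	 * zero), while an mm inherited via fork carries a real id.
	 * Only initialise the slice psize for the former.
	 */
	if (mm->context.id == 0)
		slice_set_user_psize(mm, mmu_virtual_psize);
#endif
```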
* powerpc/mm/radix: Add radix THP callbacks (Aneesh Kumar K.V, 2016-05-11, 2 files, -1/+118)
  The deposited pgtable_t is a pte fragment, hence we cannot use page->lru for linking them together. We use the first two 64-bit words of the pte fragment as a list_head to link all deposited fragments together. On withdraw we properly zero them out.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/thp: Abstraction for THP functions (Aneesh Kumar K.V, 2016-05-11, 3 files, -125/+132)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: THP is only available on hash64 as of now (Aneesh Kumar K.V, 2016-05-11, 2 files, -359/+358)
  Only code movement in this patch. No functionality change.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Add radix support for hugetlb (Aneesh Kumar K.V, 2016-05-11, 4 files, -1/+104)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Fix vma_mmu_pagesize() for radix (Aneesh Kumar K.V, 2016-05-11, 1 file, -4/+4)
  Radix doesn't use the slice framework to find the page size. Hence use the VMA to find the page size.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: pte_frag abstraction (Aneesh Kumar K.V, 2016-05-11, 3 files, -0/+12)
  In this patch we make the number of pte fragments per level 4 page table page a variable. The radix level 4 table size is 256 bytes, hence we can have 256 fragments per level 4 page. We don't update the fragment count in this patch; we need to do performance measurements to find the right value for the fragment count.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/radix: Update MMU cache (Aneesh Kumar K.V, 2016-05-11, 1 file, -0/+2)
  With radix there is no MMU cache. Hence we don't need to do anything in update_mmu_cache().
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Acked-by: Balbir Singh <bsingharora@gmail.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: vmalloc abstraction in preparation for radix (Aneesh Kumar K.V, 2016-05-11, 4 files, -3/+29)
  The vmalloc range differs between the hash and radix configs. Hence make VMALLOC_START and related constants variables that are initialized at runtime depending on whether hash or radix mode is active.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e]
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
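  Roughly, the constants become globals that early MMU init fills in. This sketch assumes double-underscore variable names and the H_ prefixed hash values used elsewhere in the series, and the helper name is made up for illustration:

```c
/* Sketch: the layout constants become variables picked at MMU init time. */
unsigned long __vmalloc_start;
unsigned long __vmalloc_end;

#define VMALLOC_START	__vmalloc_start
#define VMALLOC_END	__vmalloc_end

/* Hypothetical helper showing where the hash values would be chosen;
 * a radix init path would assign the radix layout instead. */
static void __init pick_hash_vmalloc_layout(void)
{
	__vmalloc_start = H_VMALLOC_START;
	__vmalloc_end = H_VMALLOC_END;
}
```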
* powerpc/mm: Update pte filter for radix (Aneesh Kumar K.V, 2016-05-11, 1 file, -0/+3)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Acked-by: Balbir Singh <bsingharora@gmail.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Add radix pgalloc details (Aneesh Kumar K.V, 2016-05-11, 3 files, -1/+17)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Make 4K and 64K use pte_t for pgtable_t (Aneesh Kumar K.V, 2016-05-11, 1 file, -1/+1)
  This patch switches the 4K Linux page size config to use the pte_t * type instead of struct page * for pgtable_t. This simplifies the code a lot and helps in consolidating both the 64K and 4K page allocator routines. The changes should not have any impact, because we already store the physical address in the upper levels of the page table tree, which implies we already do the struct page * to physical address conversion.
  One change to note here is that we move the pgtable_page_dtor() call for nohash to pte_fragment_free_mm(). The nohash-related change is due to the related changes in pgtable_64.c.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Rename function to indicate we are allocating fragments (Aneesh Kumar K.V, 2016-05-11, 1 file, -17/+4)
  Only code cleanup. No functionality change.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/radix: Update PTCR on secondary CPUs (Aneesh Kumar K.V, 2016-05-11, 1 file, -3/+7)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/radix: Pick the address layout for radix config (Aneesh Kumar K.V, 2016-05-11, 1 file, -0/+109)
  Hash needs special get_unmapped_area() handling because of limitations around the base page size, so we have to set HAVE_ARCH_UNMAPPED_AREA. With radix we don't have such restrictions, so we could use the generic code. But because we've set HAVE_ARCH_UNMAPPED_AREA (for hash), we have to re-implement the same logic as the generic code.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/radix: Limit paca allocation in radix (Aneesh Kumar K.V, 2016-05-11, 1 file, -1/+19)
  On return from RTAS we access the paca variables while 64-bit mode is still disabled, which requires the paca to be within the 32-bit addressable range. Fix this by setting ppc64_rma_size to the first_memblock_size / 1G range (whichever is smaller).
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/radix: Add checks in slice code to catch radix usage (Aneesh Kumar K.V, 2016-05-11, 1 file, -0/+16)
  Radix doesn't need slice support. Catch incorrect usage of slice code when radix is enabled.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/radix: Add tlbflush routines (Aneesh Kumar K.V, 2016-05-01, 2 files, -1/+243)
  The core kernel doesn't track the page size of the VA range that we are invalidating, hence we end up flushing the TLB for the entire mm here. Later patches will improve this. We also don't flush the page walk cache separately; instead we use RIC=2 when flushing the TLB, because we do an MMU gather flush after freeing page tables. MMU_NO_CONTEXT is updated for hash.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Hash abstraction for tlbflush routines (Aneesh Kumar K.V, 2016-05-01, 1 file, -1/+1)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Rename mmu_context_hash64.c to mmu_context_book3s64.c (Aneesh Kumar K.V, 2016-05-01, 2 files, -4/+3)
  This file now contains both hash and radix specific code. Rename it to indicate this better.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/radix: Add mmu context handling callback for radix (Aneesh Kumar K.V, 2016-05-01, 1 file, -8/+35)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Abstraction for switch_mmu_context() (Aneesh Kumar K.V, 2016-05-01, 1 file, -1/+2)
  How we switch MMU context differs between hash and radix. For hash we need to switch the SLB details and for radix we need to switch the PID SPR.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/radix: Add radix callbacks for vmemmap and map_kernel_page() (Aneesh Kumar K.V, 2016-05-01, 1 file, -0/+20)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Abstraction for vmemmap and map_kernel_page() (Aneesh Kumar K.V, 2016-05-01, 3 files, -16/+6)
  For hash we create the vmemmap mapping using bolted hash page table entries. For radix we fill the radix page table. The next patch will add the radix details for creating vmemmap mappings.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/radix: Add radix callbacks for early init routines (Aneesh Kumar K.V, 2016-05-01, 2 files, -0/+357)
  This adds routines for early radix setup. We use the device tree property "ibm,processor-radix-AP-encodings" to find the supported page sizes. If we don't find it, we consider 64K and 4K as the supported page sizes.
  We map the vmemmap using a 2M page size if we can. The linear mapping is done such that we use the required page size for each range. For example, 3.5G of memory is mapped using 1G pages up to the 3G boundary and 2M pages for the rest.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Abstract early MMU init in preparation for radix (Aneesh Kumar K.V, 2016-05-01, 1 file, -3/+3)
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Make page table size a variable (Aneesh Kumar K.V, 2016-05-01, 7 files, -16/+41)
  Radix and hash MMU models support different page table sizes. Make the #defines variables so that existing code can work with variable sizes. Slice-related code is only used by hash, so use hash constants there. We will replicate some of the boundary conditions with respect to TASK_SIZE using radix values too; right now we do the boundary condition checks using hash constants.
  The swapper pgdir size is initialized in asm code. We select the max pgd size to keep it simple. For now we select the hash pgdir; when adding radix we will switch that to the radix pgdir, which is 64K.
  The BUILD_BUG_ON check which is removed is already done in hugepage_init() using MAYBE_BUILD_BUG_ON().
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
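  The shape of the change, sketched with double-underscore variable names and H_ prefixed hash constants that are assumed from the rest of the series rather than copied from this diff:

```c
/* Page table geometry becomes runtime data rather than #defines (sketch). */
unsigned long __pte_index_size;
unsigned long __pmd_index_size;
unsigned long __pud_index_size;
unsigned long __pgd_index_size;

#define PTE_INDEX_SIZE	__pte_index_size
#define PMD_INDEX_SIZE	__pmd_index_size
#define PUD_INDEX_SIZE	__pud_index_size
#define PGD_INDEX_SIZE	__pgd_index_size

/* Early hash init fills in the hash geometry; radix will do likewise. */
void __init hash__early_init_mmu(void)
{
	__pte_index_size = H_PTE_INDEX_SIZE;
	__pmd_index_size = H_PMD_INDEX_SIZE;
	__pud_index_size = H_PUD_INDEX_SIZE;
	__pgd_index_size = H_PGD_INDEX_SIZE;
	/* ... */
}
```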
* powerpc/mm/book3s: Rename hash specific PTE bits to carry H_ prefix (Aneesh Kumar K.V, 2016-05-01, 7 files, -63/+65)
  This makes it easier to follow which pte bits are hash-only. We have kept _PAGE_CHG_MASK, _HPAGE_CHG_MASK and _PAGE_PROT_BITS as-is in this patch even though they use hash-specific bits. Using them as-is with radix should be OK, because with radix we expect those bit positions to be zero. Only renames in this patch, no change in functionality.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Move hash and no hash code to separate files (Aneesh Kumar K.V, 2016-05-01, 5 files, -153/+222)
  This patch reduces the number of #ifdefs in C code and will also help in adding radix changes later. Only code movement in this patch.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  [mpe: Propagate copyrights and update GPL text]
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm/hash: Add support for Power9 Hash (Aneesh Kumar K.V, 2016-05-01, 3 files, -3/+57)
  PowerISA 3.0 adds a partition table indexed by LPID. The partition table allows us to specify the MMU model that will be used for guest and host translation. This patch adds support for the SLB-based hash model (UPRT = 0). What is required with this model is to support the new hash page table entry format and to set up the partition table such that we use the hash table for address translation. We don't have segment table support yet.
  In order to make sure we don't load the KVM module on Power9 (since we don't have KVM support yet), this patch also disables KVM on Power9.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Use generic version of pmdp_clear_flush_young() (Aneesh Kumar K.V, 2016-05-01, 1 file, -10/+3)
  The radix variant is going to require a flush_pmd_tlb_range(). With flush_pmd_tlb_range() added, pmdp_clear_flush_young() is the same as the generic version. So drop the powerpc specific variant.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Drop WIMG in favour of new constants (Aneesh Kumar K.V, 2016-05-01, 4 files, -18/+14)
  PowerISA 3.0 introduces two pte bits with the below meaning for radix:
    00 -> Normal Memory
    01 -> Strong Access Order (SAO)
    10 -> Non idempotent I/O (Cache inhibited and guarded)
    11 -> Tolerant I/O (Cache inhibited)
  We drop the existing WIMG bits in the Linux page table in favour of the above constants. We lose _PAGE_WRITETHRU with this conversion; we only use write-through via pgprot_cached_wthru(), which is used by fbdev/controlfb.c (the Apple control display) and also on PPC32. With respect to _PAGE_COHERENCE, we have been marking hptes always coherent for some time now; htab_convert_pte_flags() always added HPTE_R_M.
  NOTE: KVM changes need closer review.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
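  The constants that replace WIMG look something like the sketch below; the names follow the description, but the bit positions shown are illustrative rather than taken from the patch:

```c
/* Two-bit cache/attribute encoding from ISA 3.0 (bit values illustrative). */
#define _PAGE_SAO		0x00010	/* 01: strong access order */
#define _PAGE_NON_IDEMPOTENT	0x00020	/* 10: non-idempotent I/O, CI + guarded */
#define _PAGE_TOLERANT		0x00030	/* 11: tolerant I/O, cache inhibited */
					/* 00 (neither bit): normal, cacheable memory */
```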
* powerpc/mm: Use a helper for finding pte bits mapping I/O area (Aneesh Kumar K.V, 2016-05-01, 1 file, -2/+2)
  Use a helper instead of open coding with constants. A later patch will drop the WIMG bits and use PowerISA 3.0 defines.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Update _PAGE_KERNEL_RO (Aneesh Kumar K.V, 2016-05-01, 1 file, -6/+11)
  PS3 used a PPP bit hack to implement a read-only mapping in the kernel area. Since we are bolting the ioremap area, it used the pte flags _PAGE_PRESENT | _PAGE_USER to get a PPP value of 0x3, thereby resulting in a read-only mapping. This means the area can be accessed by user space, but the kernel will never return such an address to user space.
  But we can do better by implementing a read-only kernel mapping using PPP bits 0b110. This also allows us to do read-only kernel mappings for radix in later patches.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Remove RPN_SHIFT and RPN_SIZE (Aneesh Kumar K.V, 2016-05-01, 1 file, -1/+1)
  PTE_RPN_SHIFT is actually page size dependent. Even though PowerISA 3.0 expects only the lower 12 bits to be zero, we will always find the pages to be PAGE_SHIFT aligned. In the case of the hash config, this also allows us to use the additional 3 bits to track pte-specific information; we need to make sure we use these bits only for hash-specific pte flags.
  For both the 4K and 64K configs, the pte can now hold a 57-bit address. In order to keep things simpler, drop PTE_RPN_SHIFT and PTE_RPN_SIZE and specify the 57-bit detail explicitly.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED (Aneesh Kumar K.V, 2016-05-01, 8 files, -17/+42)
  _PAGE_PRIVILEGED means the page can be accessed only by the kernel. This is done to keep pte bits similar to the PowerISA 3.0 radix PTE format. User pages are now marked by clearing the _PAGE_PRIVILEGED bit.
  Previously we allowed the kernel to have a privileged page in the lower address range (USER_REGION). With this patch such access is denied. We also prevent kernel access to a non-privileged page in the higher address range (ie, REGION_ID != 0). Both of the above access scenarios should never happen.
  Cc: Arnd Bergmann <arnd@arndb.de>
  Cc: Jeremy Kerr <jk@ozlabs.org>
  Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Acked-by: Ian Munsie <imunsie@au1.ibm.com>
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Convert pte_user() to static inline (Michael Ellerman, 2016-05-01, 1 file, -1/+1)
  In a subsequent patch we want to add a second definition of pte_user(). Before we do that, make the signature clear, ie. it takes a pte_t and returns bool. We move it up inside the existing #ifndef __ASSEMBLY__ block, but otherwise it's a straight conversion.
  Convert the call in settlbcam(), which passes an unsigned long, to pass a pte_t.
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
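  The converted helper presumably looks roughly like this; the signature matches the description, while the exact bit test is whatever the old macro did:

```c
static inline bool pte_user(pte_t pte)
{
	return (pte_val(pte) & _PAGE_USER) == _PAGE_USER;
}
```

  The second, book3s64 definition mentioned above would presumably test that _PAGE_PRIVILEGED is clear instead, following the _PAGE_PRIVILEGED entry earlier in this log.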
* powerpc/mm/subpage: Clear RWX bit to indicate no access (Aneesh Kumar K.V, 2016-05-01, 1 file, -3/+8)
  Subpage protection used to depend on the _PAGE_USER bit to implement no-access mode. This patch switches that to use _PAGE_RWX. We clear Read, Write and Execute access from the pte instead of clearing _PAGE_USER now. This was done so that we can switch to _PAGE_PRIVILEGED in a later patch.
  subpage_protection() returns pte bits that need to be cleared. Instead of updating the interface to handle no-access in a separate way, it appears simpler to clear RWX access to indicate no access.
  We still don't insert hash ptes for no access implied by !_PAGE_RWX. Hence we should not get PROT_FAULT with this change.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Use _PAGE_READ to indicate Read access (Aneesh Kumar K.V, 2016-05-01, 8 files, -16/+16)
  This splits the _PAGE_RW bit into _PAGE_READ and _PAGE_WRITE. It also removes the dependency on _PAGE_USER for implying read only.
  A few things to note here: read is implied by write and execute permission, hence we should always find _PAGE_READ set on a hash pte fault. We still can't switch PROT_NONE to !(_PAGE_RWX); auto NUMA depends on marking a prot-none pte _PAGE_WRITE (for more details look at b191f9b106ea "mm: numa: preserve PTE write permissions across a NUMA hinting fault").
  Cc: Arnd Bergmann <arnd@arndb.de>
  Cc: Jeremy Kerr <jk@ozlabs.org>
  Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
  Acked-by: Ian Munsie <imunsie@au1.ibm.com>
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
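  A sketch of the split and the accessors that naturally follow from it; the bit values are illustrative, not the patch's:

```c
#define _PAGE_EXEC	0x00001	/* execute permission */
#define _PAGE_WRITE	0x00002	/* write access allowed */
#define _PAGE_READ	0x00004	/* read access allowed */
#define _PAGE_RW	(_PAGE_READ | _PAGE_WRITE)
#define _PAGE_RWX	(_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)

static inline bool pte_read(pte_t pte)
{
	return !!(pte_val(pte) & _PAGE_READ);
}

static inline bool pte_write(pte_t pte)
{
	return !!(pte_val(pte) & _PAGE_WRITE);
}
```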
* powerpc/mm: Use big endian Linux page tables for book3s 64 (Aneesh Kumar K.V, 2016-05-01, 3 files, -7/+10)
  Traditionally Power server machines have used the Hashed Page Table MMU mode. In this mode Linux manages its own tree of nested page tables, aka. "the Linux page tables", which are not used by the hardware directly, and software loads translations into the hash page table for use by the hardware.
  Power ISA 3.0 defines a new MMU mode, known as Radix Tree Translation, where the hardware can directly operate on the Linux page tables. However the hardware requires that the page tables be in big endian format. To accommodate this, switch the pgtable types to __be64 and add appropriate endian conversions.
  Because we will be supporting a single kernel binary that boots using either radix or hash mode, we always store the Linux page tables big endian, even in hash mode where they are not actually used by the hardware.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  [mpe: Fix sparse errors, flesh out change log]
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Add pte_xchg() helper (Michael Ellerman, 2016-05-01, 3 files, -8/+7)
  We have five locations in 64-bit hash MMU code that do a cmpxchg() of a PTE. Currently doing it inline is OK, but in a future patch we will be converting the PTEs to __be64 in some configs. In that case we will need casts at every cmpxchg() site in order to keep sparse happy. So move the logic into a helper; this is a reasonably nice cleanup on its own.
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
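  A sketch of such a helper; once the page tables become __be64 (see the big endian entry above), the cast and endian handling would be centralised here rather than repeated at each call site:

```c
static inline bool pte_xchg(pte_t *ptep, pte_t old, pte_t new)
{
	unsigned long *p = (unsigned long *)ptep;

	/* Succeeds only if *ptep still held the expected old value. */
	return pte_val(old) == cmpxchg(p, pte_val(old), pte_val(new));
}
```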
* powerpc/mm: Drop PTE_ATOMIC_UPDATES from pmd_hugepage_update() (Aneesh Kumar K.V, 2016-05-01, 1 file, -5/+1)
  pmd_hugepage_update() is inside #ifdef CONFIG_TRANSPARENT_HUGEPAGE. THP can only be enabled if PPC_BOOK3S_64=y && PPC_64K_PAGES=y, aka. hash64. On hash64 we always define PTE_ATOMIC_UPDATES to 1, meaning the #ifdef in pmd_hugepage_update() is unnecessary, so drop it.
  That is also the only use of PTE_ATOMIC_UPDATES in any of the hash code, meaning we no longer need to #define it at all in the hash headers. Note it's still #defined and used in the nohash code.
  Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: sparse: Include headers for __weak symbols (Daniel Axtens, 2016-04-12, 1 file, -0/+1)
  Sometimes when sparse warns about undefined symbols, it isn't because they should have 'static' added, it's because they're overriding __weak symbols defined elsewhere, and the header has been missed. Fix a few of them by adding appropriate headers.
  Signed-off-by: Daniel Axtens <dja@axtens.net>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/mm: Remove long disabled SLB code (Michael Ellerman, 2016-04-11, 2 files, -51/+0)
  We have a bunch of SLB related code in the tree which is there to handle dynamic VSIDs - but currently it's all disabled at compile time. The comments say "Keep that around for when we re-implement dynamic VSIDs". But that was over 10 years ago (commit 3c726f8dee6f ("[PATCH] ppc64: support 64k pages")).
  The chance that it would still work unchanged is minimal, and in the meantime it's confusing to folks browsing/grepping the code. If we ever want to re-instate it, it's in the git history.
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  Acked-by: Balbir Singh <bsingharora@gmail.com>
* powerpc/mm: Fixup preempt underflow with huge pages (Sebastian Siewior, 2016-03-29, 1 file, -2/+2)
  hugepd_free() used __get_cpu_var() once. Nothing ensured that the code accessing the variable did not migrate from one CPU to another, and soon this was noticed by Tiejun Chen in 94b09d755462 ("powerpc/hugetlb: Replace __get_cpu_var with get_cpu_var"). So we had it fixed.
  Christoph Lameter was doing his __get_cpu_var() replacements and forgot PowerPC. Then he noticed this and sent his fixed-up batch again, which got applied as 69111bac42f5 ("powerpc: Replace __get_cpu_var uses").
  The careful reader will notice one little detail: get_cpu_var() got replaced with this_cpu_ptr(). So now we have a put_cpu_var(), which does a preempt_enable(), and nothing that does preempt_disable(), so we underflow the preempt counter.
  Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  Cc: Christoph Lameter <cl@linux.com>
  Cc: stable@vger.kernel.org
  Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
  Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
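  The unbalanced pattern described above, sketched with a hypothetical per-cpu variable (freelist_batch and example_batch are made up for illustration; the real code lives in hugepd_free()). The second function shows one way to restore the balance:

```c
#include <linux/percpu.h>
#include <linux/preempt.h>

struct freelist_batch { int count; };			/* hypothetical */
static DEFINE_PER_CPU(struct freelist_batch, example_batch);

static void broken(void)
{
	/* this_cpu_ptr() does NOT disable preemption ... */
	struct freelist_batch *batch = this_cpu_ptr(&example_batch);

	batch->count++;
	/* ... but put_cpu_var() still does preempt_enable(): underflow. */
	put_cpu_var(example_batch);
}

static void fixed(void)
{
	/* get_cpu_var() disables preemption ... */
	struct freelist_batch *batch = &get_cpu_var(example_batch);

	batch->count++;
	/* ... so the preempt_enable() in put_cpu_var() is balanced. */
	put_cpu_var(example_batch);
}
```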