summaryrefslogtreecommitdiffstats
path: root/kernel/locking/qspinlock_paravirt.h
Commit message (Collapse)AuthorAgeFilesLines
* locking/pvqspinlock: Only kick CPU at unlock timeWaiman Long2015-08-031-18/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For an over-committed guest with more vCPUs than physical CPUs available, it is possible that a vCPU may be kicked twice before getting the lock - once before it becomes queue head and once again before it gets the lock. All these CPU kicking and halting (VMEXIT) can be expensive and slow down system performance. This patch adds a new vCPU state (vcpu_hashed) which enables the code to delay CPU kicking until at unlock time. Once this state is set, the new lock holder will set _Q_SLOW_VAL and fill in the hash table on behalf of the halted queue head vCPU. The original vcpu_halted state will be used by pv_wait_node() only to differentiate other queue nodes from the qeue head. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Douglas Hatch <doug.hatch@hp.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott J Norton <scott.norton@hp.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1436647018-49734-2-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/pvqspinlock: Order pv_unhash() after cmpxchg() on unlock slowpathWill Deacon2015-08-031-5/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When we unlock in __pv_queued_spin_unlock(), a failed cmpxchg() on the lock value indicates that we need to take the slow-path and unhash the corresponding node blocked on the lock. Since a failed cmpxchg() does not provide any memory-ordering guarantees, it is possible that the node data could be read before the cmpxchg() on weakly-ordered architectures and therefore return a stale value, leading to hash corruption and/or a BUG(). This patch adds an smb_rmb() following the failed cmpxchg operation, so that the unhashing is ordered after the lock has been checked. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> [ Added more comments] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <Waiman.Long@hp.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Steve Capper <Steve.Capper@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150713155830.GL2632@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking: Clean up pvqspinlock warningPeter Zijlstra2015-08-031-6/+7
| | | | | | | | | | | | | | | | | | | - Rename the on-stack variable to match the datastructure variable, - place the cmpxchg back under the comment that explains it, - clean up the WARN() statement to avoid superfluous conditionals and line-breaks. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <Waiman.Long@hp.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/pvqspinlock: Fix kernel panic in locking-selftestWaiman Long2015-07-211-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | Enabling locking-selftest in a VM guest may cause the following kernel panic: kernel BUG at .../kernel/locking/qspinlock_paravirt.h:137! This is due to the fact that the pvqspinlock unlock function is expecting either a _Q_LOCKED_VAL or _Q_SLOW_VAL in the lock byte. This patch prevents that bug report by ignoring it when debug_locks_silent is set. Otherwise, a warning will be printed if it contains an unexpected value. With this patch applied, the kernel locking-selftest completed without any noise. Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Waiman Long <Waiman.Long@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1436663959-53092-1-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/arch: Rename set_mb() to smp_store_mb()Peter Zijlstra2015-05-191-1/+1
| | | | | | | | | | | | | Since set_mb() is really about an smp_mb() -- not a IO/DMA barrier like mb() rename it to match the recent smp_load_acquire() and smp_store_release(). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()Waiman Long2015-05-111-1/+1
| | | | | | | | | | | | | | | | The xchg() function was used in pv_wait_node() to set a certain value and provide a memory barrier which is what the set_mb() function is for. This patch replaces the xchg() call by set_mb(). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Waiman Long <Waiman.Long@hp.com> Cc: Douglas Hatch <doug.hatch@hp.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott J Norton <scott.norton@hp.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/pvqspinlock: Implement simple paravirt support for the qspinlockWaiman Long2015-05-081-0/+325
Provide a separate (second) version of the spin_lock_slowpath for paravirt along with a special unlock path. The second slowpath is generated by adding a few pv hooks to the normal slowpath, but where those will compile away for the native case, they expand into special wait/wake code for the pv version. The actual MCS queue can use extra storage in the mcs_nodes[] array to keep track of state and therefore uses directed wakeups. The head contender has no such storage directly visible to the unlocker. So the unlocker searches a hash table with open addressing using a simple binary Galois linear feedback shift register. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Daniel J Blueman <daniel@numascale.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Douglas Hatch <doug.hatch@hp.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paolo Bonzini <paolo.bonzini@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Rik van Riel <riel@redhat.com> Cc: Scott J Norton <scott.norton@hp.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1429901803-29771-9-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>