summaryrefslogtreecommitdiffstats
path: root/kernel/rcu
Commit message (Collapse)AuthorAgeFilesLines
* rcu: Move common code out of if-else blockAkira Yokosawa2019-03-261-3/+1Star
| | | | | | | | | | | | | As the result of recent addition of "rdp->core_needs_qs = false;" in the "if" block, now both branches of the if-else have the same assignment. Factor it out and reduce line count. Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
* rcu: Set rcutree.kthread_prio sysfs access to read-onlyLiu Song2019-03-261-1/+1
| | | | | | | | | | | | | | | | | | | | The rcutree.kthread_prio kernel-boot parameter is used to set the priority for boost (rcub), per-CPU (rcuc), and grace-period (rcu_preempt or rcu_sched) kthreads. It is also used by rcutorture to check whether it is possible to meaningfully test RCU priority boosting. However, all of these cases will either ignore or be confused by any post-boot changes to rcutree.kthread_prio. Note that the user really can change the priorities of all of these kthreads using chrt, given sufficient privileges. Therefore, the read-write nature of sysfs access to rcutree.kthread_prio is thus at best an attractive nuisance. This commit therefore changes sysfs access to rcutree.kthread_prio to be read-only. Signed-off-by: Liu Song <liu.song11@zte.com.cn> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
* rcu: Make exit_rcu() handle non-preempted RCU readersPaul E. McKenney2019-03-261-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The purpose of exit_rcu() is to handle cases where buggy code causes a task to exit within an RCU read-side critical section. It currently does that in the case where said RCU read-side critical section was preempted at least once, but fails to handle cases where preemption did not occur. This case needs to be handled because otherwise the final context switch away from the exiting task will incorrectly behave as if task exit were instead a preemption of an RCU read-side critical section, and will therefore queue the exiting task. The exiting task will have exited, and thus won't ever execute rcu_read_unlock(), which means that it will remain queued forever, blocking all subsequent grace periods, and eventually resulting in OOM. Although this is arguably better than letting grace periods proceed and having a later rcu_read_unlock() access the now-freed task structure that once belonged to the exiting tasks, it would obviously be better to correctly handle this case. This commit therefore sets ->rcu_read_lock_nesting to 1 in that case, so that the subsequence call to __rcu_read_unlock() causes the exiting task to exit its dangling RCU read-side critical section. Note that deferred quiescent states need not be considered. The reason is that removing the task from the ->blkd_tasks[] list in the call to rcu_preempt_deferred_qs() handles the per-task component of any deferred quiescent state, and all other components of any deferred quiescent state are associated with the CPU, which isn't going anywhere until some later CPU-hotplug operation, which will report any remaining deferred quiescent states from within the rcu_report_dead() function. Note also that negative values of ->rcu_read_lock_nesting need not be considered. First, these won't show up in exit_rcu() unless there is a serious bug in RCU, and second, setting ->rcu_read_lock_nesting sets the state so that the RCU read-side critical section will be exited normally. Again, this code has no effect unless there has been some prior bug that prevents a task from leaving an RCU read-side critical section before exiting. Furthermore, there have been no reports of the bug fixed by this commit appearing in production. This commit is therefore absolutely -not- recommended for backporting to -stable. Reported-by: ABHISHEK DUBEY <dabhishek@iisc.ac.in> Reported-by: BHARATH Y MOURYA <bharathm@iisc.ac.in> Reported-by: Aravinda Prasad <aravinda@iisc.ac.in> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Tested-by: ABHISHEK DUBEY <dabhishek@iisc.ac.in>
* rcu: rcu_qs -- Use raise_softirq_irqoff to not save irqs twiceCyrill Gorcunov2019-03-261-1/+1
| | | | | | | | | The rcu_qs is disabling IRQs by self so no need to do the same in raise_softirq but instead we can save some cycles using raise_softirq_irqoff directly. CC: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
* rcu: Avoid unnecessary softirq when system is idleJoel Fernandes (Google)2019-03-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | When there are no callbacks pending on an idle system, I noticed that RCU softirq is continuously firing. During this the cpu_no_qs is set to false, and core_needs_qs is set to true indefinitely. This causes rcu_process_callbacks to be repeatedly called, even though the node corresponding to the CPU has that CPU's mask bit cleared and the system is idle. I believe the race is when such mask clearing is done during idle CPU scan of the quiescent state forcing stage in the kthread instead of the softirq. Since the rnp mask is cleared, but the flags on the CPU's rdp are not cleared, the CPU thinks it still needs to report to core RCU. Cure this by clearing the core_needs_qs flag when the CPU detects that its node is already updated which will avoid the unwanted softirq raises to the benefit of real-time systems. Test: Ran rcutorture for various tree RCU configs. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
* rcu: Unconditionally expedite during suspend/hibernatePaul E. McKenney2019-03-261-4/+2Star
| | | | | | | | | | | | | | | The rcu_pm_notify() function refuses to switch to/from expedited grace periods on systems with more than 256 CPUs due to the serialized initialization of expedited grace periods. However, expedited grace periods are now initialized in parallel, removing this concern. This commit therefore removes the checks from rcu_pm_notify(), so that expedited grace periods are used unconditionally during suspend/resume and hibernate/wake operations. As always, real-time workloads wishing to completely avoid expedited grace periods can use the rcupdate.rcu_normal= kernel parameter. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
* Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2019-03-062-0/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Lots of tooling updates - too many to list, here's a few highlights: - Various subcommand updates to 'perf trace', 'perf report', 'perf record', 'perf annotate', 'perf script', 'perf test', etc. - CPU and NUMA topology and affinity handling improvements, - HW tracing and HW support updates: - Intel PT updates - ARM CoreSight updates - vendor HW event updates - BPF updates - Tons of infrastructure updates, both on the build system and the library support side - Documentation updates. - ... and lots of other changes, see the changelog for details. Kernel side updates: - Tighten up kprobes blacklist handling, reduce the number of places where developers can install a kprobe and hang/crash the system. - Fix/enhance vma address filter handling. - Various PMU driver updates, small fixes and additions. - refcount_t conversions - BPF updates - error code propagation enhancements - misc other changes" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits) perf script python: Add Python3 support to syscall-counts-by-pid.py perf script python: Add Python3 support to syscall-counts.py perf script python: Add Python3 support to stat-cpi.py perf script python: Add Python3 support to stackcollapse.py perf script python: Add Python3 support to sctop.py perf script python: Add Python3 support to powerpc-hcalls.py perf script python: Add Python3 support to net_dropmonitor.py perf script python: Add Python3 support to mem-phys-addr.py perf script python: Add Python3 support to failed-syscalls-by-pid.py perf script python: Add Python3 support to netdev-times.py perf tools: Add perf_exe() helper to find perf binary perf script: Handle missing fields with -F +.. perf data: Add perf_data__open_dir_data function perf data: Add perf_data__(create_dir|close_dir) functions perf data: Fail check_backup in case of error perf data: Make check_backup work over directories perf tools: Add rm_rf_perf_data function perf tools: Add pattern name checking to rm_rf perf tools: Add depth checking to rm_rf perf data: Add global path holder ...
| * kprobes: Prohibit probing on RCU debug routineMasami Hiramatsu2019-02-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since kprobe itself depends on RCU, probing on RCU debug routine can cause recursive breakpoint bugs. Prohibit probing on RCU debug routines. int3 ->do_int3() ->ist_enter() ->RCU_LOCKDEP_WARN() ->debug_lockdep_rcu_enabled() -> int3 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/154998807741.31052.11229157537816341591.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/kprobes: Prohibit probing on functions before kprobe_int3_handler()Masami Hiramatsu2019-02-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prohibit probing on the functions called before kprobe_int3_handler() in do_int3(). More specifically, ftrace_int3_handler(), poke_int3_handler(), and ist_enter(). And since rcu_nmi_enter() is called by ist_enter(), it also should be marked as NOKPROBE_SYMBOL. Since those are handled before kprobe_int3_handler(), probing those functions can cause a breakpoint recursion and crash the kernel. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrea Righi <righi.andrea@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/154998793571.31052.11301258949601150994.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2019-03-0514-650/+390Star
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU related changes in this cycle were: - Additional cleanups after RCU flavor consolidation - Grace-period forward-progress cleanups and improvements - Documentation updates - Miscellaneous fixes - spin_is_locked() conversions to lockdep - SPDX changes to RCU source and header files - SRCU updates - Torture-test updates, including nolibc updates and moving nolibc to tools/include" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) locking/locktorture: Convert to SPDX license identifier linux/torture: Convert to SPDX license identifier torture: Convert to SPDX license identifier linux/srcu: Convert to SPDX license identifier linux/rcutree: Convert to SPDX license identifier linux/rcutiny: Convert to SPDX license identifier linux/rcu_sync: Convert to SPDX license identifier linux/rcu_segcblist: Convert to SPDX license identifier linux/rcupdate: Convert to SPDX license identifier linux/rcu_node_tree: Convert to SPDX license identifier rcu/update: Convert to SPDX license identifier rcu/tree: Convert to SPDX license identifier rcu/tiny: Convert to SPDX license identifier rcu/sync: Convert to SPDX license identifier rcu/srcu: Convert to SPDX license identifier rcu/rcutorture: Convert to SPDX license identifier rcu/rcu_segcblist: Convert to SPDX license identifier rcu/rcuperf: Convert to SPDX license identifier rcu/rcu.h: Convert to SPDX license identifier RCU/torture.txt: Remove section MODULE PARAMETERS ...
| | \
| | \
| | \
| | \
| | \
| | \
| *-----. \ Merge branches 'doc.2019.01.26a', 'fixes.2019.01.26a', 'sil.2019.01.26a', ↵Paul E. McKenney2019-02-0914-300/+118Star
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'spdx.2019.02.09a', 'srcu.2019.01.26a' and 'torture.2019.01.26a' into HEAD doc.2019.01.26a: Documentation updates. fixes.2019.01.26a: Miscellaneous fixes. sil.2019.01.26a: Removal of a few more spin_is_locked() instances. spdx.2019.02.09a: Add SPDX identifiers to RCU files srcu.2019.01.26a: SRCU updates. torture.2019.01.26a: Torture-test updates.
| | | | | * | rcuperf: Stop abusing IS_ENABLED()Paul E. McKenney2019-01-261-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ever-evolving IS_ENABLED() macro is intended for CONFIG_* Kconfig options, but rcuperf currently uses it for the decidedly non-CONFIG_* MODULE macro. In the spirit of not inviting trouble, this commit substitutes tried-and-true #ifdef. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org>
| | | | | * | rcutorture: Add grace period after CPU offlinePaul E. McKenney2019-01-261-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Beyond a certain point in the CPU-hotplug offline process, timers get stranded on the outgoing CPU, and won't fire until that CPU comes back online, which might well be never. This commit therefore adds a hook in torture_onoff_init() that is invoked from torture_offline(), which rcutorture uses to occasionally wait for a grace period. This should result in failures for RCU implementations that rely on stranded timers eventually firing in the absence of the CPU coming back online. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | | | | * | rcutorture: Record grace periods in forward-progress histogramPaul E. McKenney2019-01-261-7/+22
| | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit records grace periods in rcutorture's n_launders_hist[] histogram, thus allowing rcu_torture_fwd_cb_hist() to print out the elapsed number of grace periods between buckets. This information helps to determine whether a lack of forward progress is due to stalled grace periods on the one hand or due to sluggish callback invocation on the other. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | | | * | srcu: Remove srcu_queue_delayed_work_on()Sebastian Andrzej Siewior2019-01-263-43/+24Star
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | srcu_queue_delayed_work_on() disables preemption (and therefore CPU hotplug in RCU's case) and then checks based on its own accounting if a CPU is online. If the CPU is online it uses queue_delayed_work_on() otherwise it fallbacks to queue_delayed_work(). The problem here is that queue_work() on -RT does not work with disabled preemption. queue_work_on() works also on an offlined CPU. queue_delayed_work_on() has the problem that it is possible to program a timer on an offlined CPU. This timer will fire once the CPU is online again. But until then, the timer remains programmed and nothing will happen. Add a local timer which will fire (as requested per delay) on the local CPU and then enqueue the work on the specific CPU. RCUtorture testing with SRCU-P for 24h showed no problems. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | | * | rcu/update: Convert to SPDX license identifierPaul E. McKenney2019-02-091-15/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | rcu/tree: Convert to SPDX license identifierPaul E. McKenney2019-02-094-61/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> [ paulmck: Update .h file SPDX comment format per Joe Perches. ] Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | rcu/tiny: Convert to SPDX license identifierPaul E. McKenney2019-02-091-15/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | rcu/sync: Convert to SPDX license identifierPaul E. McKenney2019-02-091-14/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | rcu/srcu: Convert to SPDX license identifierPaul E. McKenney2019-02-092-30/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | rcu/rcutorture: Convert to SPDX license identifierPaul E. McKenney2019-02-091-16/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | rcu/rcu_segcblist: Convert to SPDX license identifierPaul E. McKenney2019-02-092-30/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | rcu/rcuperf: Convert to SPDX license identifierPaul E. McKenney2019-02-091-16/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | rcu/rcu.h: Convert to SPDX license identifierPaul E. McKenney2019-02-091-15/+2Star
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
| | * | rcu: Fix obsolete DYNTICK_IRQ_NONIDLE commentPaul E. McKenney2019-01-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit updates the DYNTICK_IRQ_NONIDLE header comment to remove the obsolete commentary about unmatched rcu_irq_{enter,exit}(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Repair rcu_nmi_exit() docbook headerPaul E. McKenney2019-01-261-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit removes the "@irq" argument from the rcu_nmi_exit() docbook header, given that this function now has no arguments. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Remove preemption disabling from expedited CPU selectionPaul E. McKenney2019-01-261-2/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It turns out that it is queue_delayed_work_on() rather than queue_work_on() that has difficulties when used concurrently with CPU-hotplug removal operations. It is therefore unnecessary to protect CPU identification and queue_work_on() with preempt_disable(). This commit therefore removes the preempt_disable() and preempt_enable() from sync_rcu_exp_select_cpus(), which has the further benefit of reducing the number of changes that must be maintained in the -rt patchset. Reported-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Sebastian Siewior <bigeasy@linutronix.de> Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Rename rcu_process_callbacks() to rcu_core() for Tree RCUPaul E. McKenney2019-01-261-7/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although the name rcu_process_callbacks() still makes sense for Tiny RCU, where most of what it does is invoke callbacks, it no longer makes much sense for Tree RCU, especially given that the actually callback invocation is relegated to rcu_do_batch(), or, for no-CBs CPUs, to the rcuo kthreads. Especially in the latter case, rcu_process_callbacks() has very little to do with actual callbacks. A better description of this function is that it performs RCU's core processing. This commit therefore changes the name of Tree RCU's rcu_process_callbacks() function to rcu_core(), which also has the virtue of being consistent with the existing invoke_rcu_core() function. While in the area, the header comment is reworked. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Rename rcu_check_callbacks() to rcu_sched_clock_irq()Paul E. McKenney2019-01-264-25/+21Star
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The name rcu_check_callbacks() arguably made sense back in the early 2000s when RCU was quite a bit simpler than it is today, but it has become quite misleading, especially with the advent of dyntick-idle and NO_HZ_FULL. The rcu_check_callbacks() function is RCU's hook into the scheduling-clock interrupt, and is now but one of many ways that callbacks get promoted to invocable state. This commit therefore changes the name to rcu_sched_clock_irq(), which is the same number of characters and clearly indicates this function's relation to the rest of the Linux kernel. In addition, for the sake of consistency, rcu_flavor_check_callbacks() is also renamed to rcu_flavor_sched_clock_irq(). While in the area, the header comments for both functions are reworked. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | Merge branches 'consolidate.2019.01.26a' and 'fwd.2019.01.26a' into HEADPaul E. McKenney2019-01-263-78/+109
| |\ \ | | | | | | | | | | | | | | | | consolidate.2019.01.26a: RCU flavor consolidation cleanups. fwd.2019.01.26a: RCU grace-period forward-progress fixes.
| | * | rcu: Prevent needless ->gp_seq_needed update in __note_gp_changes()Zhang, Jun2019-01-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, __note_gp_changes() checks to see if the rcu_node structure's ->gp_seq_needed is greater than or equal to that of the rcu_data structure, and if so, updates the rcu_data structure's ->gp_seq_needed field. This results in a useless store in the case where the two fields are equal. This commit therefore carries out this store only in the case where the rcu_node structure's ->gp_seq_needed is strictly greater than that of the rcu_data structure. Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Link: https://lkml.kernel.org/r/88DC34334CA3444C85D647DBFA962C2735AD5F77@SHSMSX104.ccr.corp.intel.com
| | * | rcu: Do RCU GP kthread self-wakeup from softirq and interruptZhang, Jun2019-01-261-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu_gp_kthread_wake() function is invoked when it might be necessary to wake the RCU grace-period kthread. Because self-wakeups are normally a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from this kthread, it naturally refuses to do the wakeup. Unfortunately, natural though it might be, this heuristic fails when rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler that interrupted the grace-period kthread just after the final check of the wait-event condition but just before the schedule() call. In this case, a wakeup is required, even though the call to rcu_gp_kthread_wake() is within the RCU grace-period kthread's context. Failing to provide this wakeup can result in grace periods failing to start, which in turn results in out-of-memory conditions. This race window is quite narrow, but it actually did happen during real testing. It would of course need to be fixed even if it was strictly theoretical in nature. This patch does not Cc stable because it does not apply cleanly to earlier kernel versions. Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread") Reported-by: "He, Bo" <bo.he@intel.com> Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com> Co-developed-by: "He, Bo" <bo.he@intel.com> Co-developed-by: "xiao, jin" <jin.xiao@intel.com> Co-developed-by: Bai, Jie A <jie.a.bai@intel.com> Signed-off: "Zhang, Jun" <jun.zhang@intel.com> Signed-off: "He, Bo" <bo.he@intel.com> Signed-off: "xiao, jin" <jin.xiao@intel.com> Signed-off: Bai, Jie A <jie.a.bai@intel.com> Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com> [ paulmck: Switch from !in_softirq() to "!in_interrupt() && !in_serving_softirq() to avoid redundant wakeups and to also handle the interrupt-handler scenario as well as the softirq-handler scenario that actually occurred in testing. ] Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
| | * | rcu: Add sysrq rcu_node-dump capabilityPaul E. McKenney2019-01-261-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Life is hard if RCU manages to get stuck without triggering RCU CPU stall warnings or triggering the rcu_check_gp_start_stall() checks for failing to start a grace period. This commit therefore adds a boot-time-selectable sysrq key (commandeering "y") that allows manually dumping Tree RCU state. The new rcutree.sysrq_rcu kernel boot parameter must be set for this sysrq to be available. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Protect rcu_check_gp_kthread_starvation() access to ->gp_flagsPaul E. McKenney2019-01-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu_check_gp_kthread_starvation() function can be invoked without holding locks, so the access to the rcu_state structure's ->gp_flags field must be protected with READ_ONCE(). This commit therefore adds this protection. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Improve diagnostics for failed RCU grace-period startPaul E. McKenney2019-01-262-23/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a grace period fails to start (for example, because you commented out the last two lines of rcu_accelerate_cbs_unlocked()), rcu_core() will invoke rcu_check_gp_start_stall(), which will notice and complain. However, this complaint is lacking crucial debugging information such as when the last wakeup executed and what the value of ->gp_seq was at that time. This commit therefore removes the current pr_alert() from rcu_check_gp_start_stall(), instead invoking show_rcu_gp_kthreads(), which has been updated to print the needed information, which is collected by rcu_gp_kthread_wake(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Update NOCB commentsPaul E. McKenney2019-01-261-17/+16Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit updates a few obsolete comments in the RCU callback-offload code. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Remove unused rcu_cpu_kthread_cpu per-CPU variablePaul E. McKenney2019-01-261-4/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu_cpu_kthread_cpu used to provide debugfs information, but is no longer used. This commit therefore removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Move rcu_cpu_has_work to rcu_data structurePaul E. McKenney2019-01-262-13/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that RCU has a perfectly good per-CPU rcu_data structure, most per-CPU quantities should be stored there. This commit therefore moves the rcu_cpu_has_work per-CPU variable to the rcu_data structure. This also makes this variable unconditionally present, which should be acceptable given the memory reduction due to the RCU flavor consolidation and also due to simplifications this will enable. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Remove unused rcu_cpu_kthread_loops per-CPU variablePaul E. McKenney2019-01-262-3/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu_cpu_kthread_loops variable used to provide debugfs information, but is no longer used. This commit therefore removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Move rcu_cpu_kthread_status to rcu_data structurePaul E. McKenney2019-01-262-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that RCU has a perfectly good per-CPU rcu_data structure, most per-CPU quantities should be stored there. This commit therefore moves the rcu_cpu_kthread_status per-CPU variable to the rcu_data structure. This also makes this variable unconditionally present, which should be acceptable given the memory reduction due to the RCU flavor consolidation and also due to simplifications this will enable. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Move rcu_cpu_kthread_task to rcu_data structurePaul E. McKenney2019-01-262-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that RCU has a perfectly good per-CPU rcu_data structure, most per-CPU quantities should be stored there. This commit therefore moves the rcu_cpu_kthread_task per-CPU variable to the rcu_data structure. This also makes this variable unconditionally present, which should be acceptable given the memory reduction due to the RCU flavor consolidation and also due to simplifications this will enable. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| | * | rcu: Accommodate zero jiffies_till_first_fqs and kthread kickingPaul E. McKenney2019-01-261-1/+1
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is perfectly fine to set the rcutree.jiffies_till_first_fqs boot parameter to zero, in fact, this can be useful on specialty systems that usually have at least one idle CPU and that need fast grace periods. This is because this setting causes the RCU grace-period kthread to scan for idle threads immediately after grace-period initialization, as opposed to waiting several jiffies to do so. It is also perfectly fine to set the rcutree.rcu_kick_kthreads kernel parameter, which gives the RCU grace-period kthread an extra wakeup if it doesn't make progress for a period of three times the setting of the rcutree.jiffies_till_first_fqs boot parameter. This is of course problematic when the value of this parameter is zero, as it can result in unnecessary wakeup IPIs along with unnecessary WARN_ONCE() invocations. This commit therefore defers kthread kicking for at least two jiffies, regardless of the setting of rcutree.jiffies_till_first_fqs. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | rcu: Discard separate per-CPU callback countsPaul E. McKenney2019-01-263-68/+13Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Back when there were multiple flavors of RCU, it was necessary to separately count lazy and non-lazy callbacks for each CPU. These counts were used in CONFIG_RCU_FAST_NO_HZ kernels to determine how long a newly idle CPU should be allowed to sleep before handling its RCU callbacks. But now that there is only one flavor, the callback counts for a given CPU's sole rcu_data structure are the counts for that CPU. This commit therefore removes the rcu_data structure's ->nonlazy_posted and ->nonlazy_posted_snap fields, the rcu_idle_count_callbacks_posted() and rcu_cpu_has_callbacks() functions, repurposes the rcu_data structure's ->all_lazy field to record the laziness state at the beginning of the latest idle sojourn, and modifies CONFIG_RCU_FAST_NO_HZ RCU CPU stall warnings accordingly. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | rcu: Inline _synchronize_rcu_expedited() into synchronize_rcu_expedited()Paul E. McKenney2019-01-261-45/+36Star
| | | | | | | | | | | | | | | | | | | | | | | | Now that _synchronize_rcu_expedited() has only one caller, and given that this is a tail call, this commit inlines _synchronize_rcu_expedited() into synchronize_rcu_expedited(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | rcu: Consolidate PREEMPT and !PREEMPT synchronize_rcu()Paul E. McKenney2019-01-263-91/+73Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that rcu_blocking_is_gp() makes the correct immediate-return decision for both PREEMPT and !PREEMPT, a single implementation of synchronize_rcu() will work correctly under both configurations. This commit therefore eliminates a few lines of code by consolidating the two implementations of synchronize_rcu(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | rcu: Consolidate PREEMPT and !PREEMPT synchronize_rcu_expedited()Paul E. McKenney2019-01-261-56/+49Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CONFIG_PREEMPT=n and CONFIG_PREEMPT=y implementations of synchronize_rcu_expedited() are quite similar, and with small modifications to rcu_blocking_is_gp() can be made identical. This commit therefore makes this change in order to save a few lines of code and to reduce the amount of duplicate code. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | rcu: Determine expedited-GP IPI handler at build timePaul E. McKenney2019-01-262-17/+14Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Back when there could be multiple RCU flavors running in the same kernel at the same time, it was necessary to specify the expedited grace-period IPI handler at runtime. Now that there is only one RCU flavor, the IPI handler can be determined at build time. There is therefore no longer any reason for the RCU-preempt and RCU-sched IPI handlers to have different names, nor is there any reason to pass these handlers in function arguments and in the data structures enclosing workqueues. This commit therefore makes all these changes, pushing the specification of the expedited grace-period IPI handler down to the point of use. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | rcu: Inline rcu_kthread_do_work() into its sole remaining callerPaul E. McKenney2019-01-261-6/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | The rcu_kthread_do_work() function has a single-line body and only one remaining caller. This commit therefore saves a few lines of code by inlining rcu_kthread_do_work() into its sole remaining caller. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | rcu: Eliminate RCU_BH_FLAVOR and RCU_SCHED_FLAVORPaul E. McKenney2019-01-262-4/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | Now that the RCU flavors have been consolidated, RCU_BH_FLAVOR and RCU_SCHED_FLAVOR are no longer used. This commit therefore saves a few lines by removing them. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
| * | rcu: Inline force_quiescent_state() into rcu_force_quiescent_state()Paul E. McKenney2019-01-261-15/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that rcu_force_quiescent_state() is a simple wrapper around force_quiescent_state(), this commit saves a few lines of code by inlining force_quiescent_state() into rcu_force_quiescent_state(), and changing all references to force_quiescent_state() to instead invoke rcu_force_quiescent_state(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>