summaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
...
| | * | | | | | | | | | | | | tracing/kprobes: Merge trace probe enable/disable functionsMasami Hiramatsu2011-07-151-56/+36Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge redundant enable/disable functions into enable_trace_probe() and disable_trace_probe(). Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: yrl.pp-manager.tt@hitachi.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Link: http://lkml.kernel.org/r/20110627072644.6528.26910.stgit@fedora15 [ converted kprobe selftest to use enable_trace_probe ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | tracing/kprobes: Rename probe_* to trace_probe_*Masami Hiramatsu2011-07-141-21/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename probe_* to trace_probe_* for avoiding namespace confliction. This also fixes improper names of find_probe_event() and cleanup_all_probes() to find_trace_probe() and release_all_trace_probes() respectively. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20110627072636.6528.60374.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | perf, x86: P4 PMU - Introduce event alias featureCyrill Gorcunov2011-07-141-2/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of hw_nmi_watchdog_set_attr() weak function and appropriate x86_pmu::hw_watchdog_set_attr() call we introduce even alias mechanism which allow us to drop this routines completely and isolate quirks of Netburst architecture inside P4 PMU code only. The main idea remains the same though -- to allow nmi-watchdog and perf top run simultaneously. Note the aliasing mechanism applies to generic PERF_COUNT_HW_CPU_CYCLES event only because arbitrary event (say passed as RAW initially) might have some additional bits set inside ESCR register changing the behaviour of event and we can't guarantee anymore that alias event will give the same result. P.S. Thanks a huge to Don and Steven for for testing and early review. Acked-by: Don Zickus <dzickus@redhat.com> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Ingo Molnar <mingo@elte.hu> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Stephane Eranian <eranian@google.com> CC: Lin Ming <ming.m.lin@intel.com> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20110708201712.GS23657@sun Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | tracing: Have dynamic size event stack tracesSteven Rostedt2011-07-143-19/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the stack trace per event in ftace is only 8 frames. This can be quite limiting and sometimes useless. Especially when the "ignore frames" is wrong and we also use up stack frames for the event processing itself. Change this to be dynamic by adding a percpu buffer that we can write a large stack frame into and then copy into the ring buffer. For interrupts and NMIs that come in while another event is being process, will only get to use the 8 frame stack. That should be enough as the task that it interrupted will have the full stack frame anyway. Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | ftrace: Fix dynamic selftest failure on some archsSteven Rostedt2011-07-141-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Archs that do not implement CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST, will fail the dynamic ftrace selftest. The function tracer has a quick 'off' variable that will prevent the call back functions from being called. This variable is called function_trace_stop. In x86, this is implemented directly in the mcount assembly, but for other archs, an intermediate function is used called ftrace_test_stop_func(). In dynamic ftrace, the function pointer variable ftrace_trace_function is used to update the caller code in the mcount caller. But for archs that do not have CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST set, it only calls ftrace_test_stop_func() instead, which in turn calls __ftrace_trace_function. When more than one ftrace_ops is registered, the function it calls is ftrace_ops_list_func(), which will iterate over all registered ftrace_ops and call the callbacks that have their hash matching. The issue happens when two ftrace_ops are registered for different functions and one is then unregistered. The __ftrace_trace_function is then pointed to the remaining ftrace_ops callback function directly. This mean it will be called for all functions that were registered to trace by both ftrace_ops that were registered. This is not an issue for archs with CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST, because the update of ftrace_trace_function doesn't happen until after all functions have been updated, and then the mcount caller is updated. But for those archs that do use the ftrace_test_stop_func(), the update is immediate. The dynamic selftest fails because it hits this situation, and the ftrace_ops that it registers fails to only trace what it was suppose to and instead traces all other functions. The solution is to delay the setting of __ftrace_trace_function until after all the functions have been updated according to the registered ftrace_ops. Also, function_trace_stop is set during the update to prevent function tracing from calling code that is caused by the function tracer itself. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | ftrace: Update filter when tracing enabled in set_ftrace_filter()Steven Rostedt2011-07-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, if set_ftrace_filter() is called when the ftrace_ops is active, the function filters will not be updated. They will only be updated when tracing is disabled and re-enabled. Update the functions immediately during set_ftrace_filter(). Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | ftrace: Balance records when updating the hashSteven Rostedt2011-07-141-16/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever the hash of the ftrace_ops is updated, the record counts must be balance. This requires disabling the records that are set in the original hash, and then enabling the records that are set in the updated hash. Moving the update into ftrace_hash_move() removes the bug where the hash was updated but the records were not, which results in ftrace triggering a warning and disabling itself because the ftrace_ops filter is updated while the ftrace_ops was registered, and then the failure happens when the ftrace_ops is unregistered. The current code will not trigger this bug, but new code will. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | ftrace: Do not disable interrupts for modules in mcount updateSteven Rostedt2011-07-081-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I mounted an NFS directory, it caused several modules to be loaded. At the time I was running the preemptirqsoff tracer, and it showed the following output: # tracer: preemptirqsoff # # preemptirqsoff latency trace v1.1.5 on 2.6.33.9-rt30-mrg-test # -------------------------------------------------------------------- # latency: 1177 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) # ----------------- # | task: modprobe-19370 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # => started at: ftrace_module_notify # => ended at: ftrace_module_notify # # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| /_--=> lock-depth # |||||/ delay # cmd pid |||||| time | caller # \ / |||||| \ | / modprobe-19370 3d.... 0us!: ftrace_process_locs <-ftrace_module_notify modprobe-19370 3d.... 1176us : ftrace_process_locs <-ftrace_module_notify modprobe-19370 3d.... 1178us : trace_hardirqs_on <-ftrace_module_notify modprobe-19370 3d.... 1178us : <stack trace> => ftrace_process_locs => ftrace_module_notify => notifier_call_chain => __blocking_notifier_call_chain => blocking_notifier_call_chain => sys_init_module => system_call_fastpath That's over 1ms that interrupts are disabled on a Real-Time kernel! Looking at the cause (being the ftrace author helped), I found that the interrupts are disabled before the code modification of mcounts into nops. The interrupts only need to be disabled on start up around this code, not when modules are being loaded. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | tracing: Still trace filtered irq functions when irq trace is disabledSteven Rostedt2011-07-082-17/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a function is set to be traced by the set_graph_function, but the option funcgraph-irqs is zero, and the traced function happens to be called from a interrupt, it will not be traced. The point of funcgraph-irqs is to not trace interrupts when we are preempted by an irq, not to not trace functions we want to trace that happen to be *in* a irq. Luckily the current->trace_recursion element is perfect to add a flag to help us be able to trace functions within an interrupt even when we are not tracing interrupts that preempt the trace. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | | | | | | | | | Merge branch 'perf/urgent' into perf/coreIngo Molnar2011-07-216-35/+144
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ | | |/ / / / / / / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: pick up the latest fixes - they won't make v3.0. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | ftrace: Fix regression where ftrace breaks when modules are loadedSteven Rostedt2011-07-151-2/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enabling function tracer to trace all functions, then load a module and then disable function tracing will cause ftrace to fail. This can also happen by enabling function tracing on the command line: ftrace=function and during boot up, modules are loaded, then you disable function tracing with 'echo nop > current_tracer' you will trigger a bug in ftrace that will shut itself down. The reason is, the new ftrace code keeps ref counts of all ftrace_ops that are registered for tracing. When one or more ftrace_ops are registered, all the records that represent the functions that the ftrace_ops will trace have a ref count incremented. If this ref count is not zero, when the code modification runs, that function will be enabled for tracing. If the ref count is zero, that function will be disabled from tracing. To make sure the accounting was working, FTRACE_WARN_ON()s were added to updating of the ref counts. If the ref count hits its max (> 2^30 ftrace_ops added), or if the ref count goes below zero, a FTRACE_WARN_ON() is triggered which disables all modification of code. Since it is common for ftrace_ops to trace all functions in the kernel, instead of creating > 20,000 hash items for the ftrace_ops, the hash count is just set to zero, and it represents that the ftrace_ops is to trace all functions. This is where the issues arrise. If you enable function tracing to trace all functions, and then add a module, the modules function records do not get the ref count updated. When the function tracer is disabled, all function records ref counts are subtracted. Since the modules never had their ref counts incremented, they go below zero and the FTRACE_WARN_ON() is triggered. The solution to this is rather simple. When modules are loaded, and their functions are added to the the ftrace pool, look to see if any ftrace_ops are registered that trace all functions. And for those, update the ref count for the module function records. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | ftrace: Fix regression of :mod:module function enablingSteven Rostedt2011-07-072-10/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new code that allows different utilities to pick and choose what functions they trace broke the :mod: hook that allows users to trace only functions of a particular module. The reason is that the :mod: hook bypasses the hash that is setup to allow individual users to trace their own functions and uses the global hash directly. But if the global hash has not been set up, it will cause a bug: echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter produces: [drm:drm_mode_getfb] *ERROR* invalid framebuffer id [drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip BUG: unable to handle kernel paging request at ffffffff8160ec90 IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0 PGD 1a05067 PUD 1a09063 PMD 80000000016001e1 Oops: 0003 [#1] SMP Jul 7 04:02:28 phyllis kernel: [55303.858604] CPU 1 Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt Pid: 10344, comm: bash Tainted: G WC 3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ RIP: 0010:[<ffffffff810d9136>] [<ffffffff810d9136>] add_hash_entry+0x66/0xd0 RSP: 0018:ffff88003a96bda8 EFLAGS: 00010246 RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940 RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78 R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470) Stack: 0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5 ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80 ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00 Call Trace: [<ffffffff810d92f5>] match_records+0x155/0x1b0 [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100 [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210 [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20 [<ffffffff81166e48>] vfs_write+0xc8/0x190 [<ffffffff81167001>] sys_write+0x51/0x90 [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2 RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0 RSP <ffff88003a96bda8> CR2: ffffffff8160ec90 ---[ end trace a5d031828efdd88e ]--- Reported-by: Brian Marete <marete@toshnix.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | tracing: Have "enable" file use refcounts like the "filter" fileSteven Rostedt2011-07-071-9/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "enable" file for the event system can be removed when a module is unloaded and the event system only has events from that module. As the event system nr_events count goes to zero, it may be freed if its ref_count is also set to zero. Like the "filter" file, the "enable" file may be opened by a task and referenced later, after a module has been unloaded and the events for that event system have been removed. Although the "filter" file referenced the event system structure, the "enable" file only references a pointer to the event system name. Since the name is freed when the event system is removed, it is possible that an access to the "enable" file may reference a freed pointer. Update the "enable" file to use the subsystem_open() routine that the "filter" file uses, to keep a reference to the event system structure while the "enable" file is opened. Cc: <stable@kernel.org> Reported-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | tracing: Fix bug when reading system filters on module removalSteven Rostedt2011-07-073-11/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The event system is freed when its nr_events is set to zero. This happens when a module created an event system and then later the module is removed. Modules may share systems, so the system is allocated when it is created and freed when the modules are unloaded and all the events under the system are removed (nr_events set to zero). The problem arises when a task opened the "filter" file for the system. If the module is unloaded and it removed the last event for that system, the system structure is freed. If the task that opened the filter file accesses the "filter" file after the system has been freed, the system will access an invalid pointer. By adding a ref_count, and using it to keep track of what is using the event system, we can free it after all users are finished with the event system. Cc: <stable@kernel.org> Reported-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | | | | | | | | | Merge branch 'tip/perf/core-2' of ↵Ingo Molnar2011-07-0513-316/+324
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
| | * | | | | | | | | | | | | | tracing/kprobes: Fix kprobe-tracer to support stack traceMasami Hiramatsu2011-06-153-7/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix to support kernel stack trace correctly on kprobe-tracer. Since the execution path of kprobe-based dynamic events is different from other tracepoint-based events, normal ftrace_trace_stack() doesn't work correctly. To fix that, this introduces ftrace_trace_stack_regs() which traces stack via pt_regs instead of current stack register. e.g. # echo p schedule+4 > /sys/kernel/debug/tracing/kprobe_events # echo 1 > /sys/kernel/debug/tracing/options/stacktrace # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable # head -n 20 /sys/kernel/debug/tracing/trace bash-2968 [000] 10297.050245: p_schedule_4: (schedule+0x4/0x4ca) bash-2968 [000] 10297.050247: <stack trace> => schedule_timeout => n_tty_read => tty_read => vfs_read => sys_read => system_call_fastpath kworker/0:1-2940 [000] 10297.050265: p_schedule_4: (schedule+0x4/0x4ca) kworker/0:1-2940 [000] 10297.050266: <stack trace> => worker_thread => kthread => kernel_thread_helper sshd-1132 [000] 10297.050365: p_schedule_4: (schedule+0x4/0x4ca) sshd-1132 [000] 10297.050365: <stack trace> => sysret_careful Note: Even with this fix, the first entry will be skipped if the probe is put on the function entry area before the frame pointer is set up (usually, that is 4 bytes (push %bp; mov %sp %bp) on x86), because stack unwinder depends on the frame pointer. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Namhyung Kim <namhyung@gmail.com> Link: http://lkml.kernel.org/r/20110608070934.17777.17116.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | stack_trace: Add weak save_stack_trace_regs()Masami Hiramatsu2011-06-151-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add weak symbol of save_stack_trace_regs() as same as save_stack_trace_tsk() since that is not implemented except x86 yet. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Namhyung Kim <namhyung@gmail.com> Link: http://lkml.kernel.org/r/20110608070927.17777.37895.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | ring-buffer: Set __GFP_NORETRY flag for ring buffer allocating processVaibhav Nagarnaik2011-06-151-6/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tracing ring buffer is allocated from kernel memory. While allocating a large chunk of memory, OOM might happen which destabilizes the system. Thus random processes might get killed during the allocation. This patch adds __GFP_NORETRY flag to the ring buffer allocation calls to make it fail more gracefully if the system will not be able to complete the allocation request. Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1307491302-9236-1-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Convert to kstrtoul_from_userPeter Huewe2011-06-155-110/+20Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces the code for getting an unsigned long from a userspace buffer by a simple call to kstroul_from_user. This makes it easier to read and less error prone. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Link: http://lkml.kernel.org/r/1307476707-14762-1-git-send-email-peterhuewe@gmx.de Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing, function_graph: Add context-info support for function_graph tracerJiri Olsa2011-06-151-22/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function_graph tracer does not follow global context-info option. Adding TRACE_ITER_CONTEXT_INFO trace_flags check to enable it. With following commands: # echo function_graph > ./current_tracer # echo 0 > options/context-info # cat trace This is what it looked like before: # tracer: function_graph # # TIME CPU DURATION FUNCTION CALLS # | | | | | | | | 1) 0.079 us | } /* __vma_link_rb */ 1) 0.056 us | copy_page_range(); 1) | security_vm_enough_memory() { ... This is what it looks like now: # tracer: function_graph # } /* update_ts_time_stats */ timekeeping_max_deferment(); ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-6-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing, function_graph: Remove lock-depth from latency traceJiri Olsa2011-06-151-4/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lock_depth was removed in commit e6e1e25 tracing: Remove lock_depth from event entry Removing the lock_depth info from function_graph latency header. With following commands: # echo function_graph > ./current_tracer # echo 1 > options/latency-format # cat trace This is what it looked like before: # tracer: function_graph # # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+ # -------------------------------------------------------------------- # latency: 0 us, #59756/311298, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> lock-depth # |||| / # CPU||||| DURATION FUNCTION CALLS # | ||||| | | | | | | 0) .... 0.068 us | } /* __rcu_read_unlock */ ... This is what it looks like now: # tracer: function_graph # # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+ # -------------------------------------------------------------------- # latency: 0 us, #59747/1744610, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / # CPU|||| DURATION FUNCTION CALLS # | |||| | | | | | | 0) ..s. 1.641 us | } /* __rcu_process_callbacks */ ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-5-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing, function: Fix trace header to follow context-info optionJiri Olsa2011-06-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The header display of function tracer does not follow the context-info option, so field names are displayed even if this option is off. Added check for TRACE_ITER_CONTEXT_INFO trace_flags. With following commands: # echo function > ./current_tracer # echo 0 > options/context-info # cat trace This is what it looked like before: # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | add_preempt_count <-schedule rcu_note_context_switch <-schedule ... This is what it looks like now: # tracer: function # _raw_spin_unlock_irqrestore <-hrtimer_try_to_cancel ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-4-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing, function_graph: Merge overhead and duration display functionsJiri Olsa2011-06-151-74/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Functions print_graph_overhead() and print_graph_duration() displays data for one field - DURATION. I merged them into single function print_graph_duration(), and added a way to display the empty parts of the field. This way the print_graph_irq() function can use this column to display the IRQ signs if needed and the DURATION field details stays inside the print_graph_duration() function. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-3-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing, function_graph: Remove dependency of abstime and duration fields on ↵Jiri Olsa2011-06-153-18/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | latency The display of absolute time and duration fields is based on the latency field. This was added during the irqsoff/wakeup tracers graph support changes. It's causing confusion in what fields will be displayed for the function_graph tracer itself. So I'm removing this depency, and adding absolute time and duration fields to the preemptirqsoff preemptoff irqsoff wakeup tracers. With following commands: # echo function_graph > ./current_tracer # cat trace This is what it looked like before: # tracer: function_graph # # TIME CPU DURATION FUNCTION CALLS # | | | | | | | | 0) 0.068 us | } /* page_add_file_rmap */ 0) | _raw_spin_unlock() { ... This is what it looks like now: # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 0) 0.068 us | } /* add_preempt_count */ 0) 0.993 us | } /* vfsmount_lock_local_lock */ ... For preemptirqsoff preemptoff irqsoff wakeup tracers, this is what it looked like before: SNIP # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / _-=> lock-depth # |||| / # CPU TASK/PID ||||| DURATION FUNCTION CALLS # | | | ||||| | | | | | | 1) <idle>-0 | d..1 0.000 us | acpi_idle_enter_simple(); ... This is what it looks like now: SNIP # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / # TIME CPU TASK/PID |||| DURATION FUNCTION CALLS # | | | | |||| | | | | | | 19.847735 | 1) <idle>-0 | d..1 0.000 us | acpi_idle_enter_simple(); ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-2-git-send-email-jolsa@redhat.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | async: Fixed an include coding style issuePaul McQuade2011-06-151-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added <linux/atomic.h>,<linux/ktime.h> and Removed <asm/atomic.h>. Added KERN_DEBUG to printk() functions. Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Paul McQuade <tungstentide@gmail.com> Link: http://lkml.kernel.org/r/4DE596B4.7030904@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | ftrace: Fixed an include coding style issuePaul McQuade2011-06-151-5/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removed <asm/ftrace.h> because <linux/ftrace.h> was already declared. Braces of struct's coding style fixed. Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul McQuade <tungstentide@gmail.com> Link: http://lkml.kernel.org/r/4DE59711.3090900@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Add disable_on_free optionSteven Rostedt2011-06-152-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a trace option to disable tracing on free. When this option is set, a write into the free_buffer file will not only shrink the ring buffer down to zero, but it will also disable tracing. Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Add a proc file to stop tracing and free bufferVaibhav Nagarnaik2011-06-151-35/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The proc file entry buffer_size_kb is used to set the size of tracing buffer. The memory to expand the buffer size is kernel memory. Consider a use case where tracing is handled by a user space utility, which acts as a gate keeper for tracing requests. In an OOM condition, tracing is considered a low priority task and if the utility gets killed the ring buffer memory cannot be released back to the kernel. This patch adds a proc file called "free_buffer" whose purpose is to stop tracing and free up the ring buffer when it is closed. The user space process can then set the desired size in buffer_size_kb file and open the fd to the "free_buffer" file. Under OOM condition, if the process gets killed, the kernel closes the file descriptor. The release handler stops the tracing and releases the kernel memory automatically. Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Link: http://lkml.kernel.org/r/1308012717-11148-1-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Use NUMA allocation for per-cpu ring buffer pagesVaibhav Nagarnaik2011-06-153-22/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tracing ring buffer is a group of per-cpu ring buffers where allocation and logging is done on a per-cpu basis. The events that are generated on a particular CPU are logged in the corresponding buffer. This is to provide wait-free writes between CPUs and good NUMA node locality while accessing the ring buffer. However, the allocation routines consider NUMA locality only for buffer page metadata and not for the actual buffer page. This causes the pages to be allocated on the NUMA node local to the CPU where the allocation routine is running at the time. This patch fixes the problem by using a NUMA node specific allocation routine so that the pages are allocated from a NUMA node local to the logging CPU. I tested with the getuid_microbench from autotest. It is a simple binary that calls getuid() in a loop and measures the average time for the syscall to complete. The following command was used to test: $ getuid_microbench 1000000 Compared the numbers found on kernel with and without this patch and found that logging latency decreases by 30-50 ns/call. tracing with non-NUMA allocation - 569 ns/call tracing with NUMA allocation - 512 ns/call Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1304470602-20366-1-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * | | | | | | | | | | | | | tracing: Schedule a delayed work to call wakeup()Vaibhav Nagarnaik2011-06-151-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In using syscall tracing by concurrent processes, the wakeup() that is called in the event commit function causes contention on the spin lock of the waitqueue. I enabled sys_enter_getuid and sys_exit_getuid tracepoints, and by running getuid_microbench from autotest in parallel I found that the contention causes exponential latency increase in the tracing path. The autotest binary getuid_microbench calls getuid() in a tight loop for the given number of iterations and measures the average time required to complete a single invocation of syscall. The patch schedules a delayed work after 2 ms once an event commit calls to wake up the trace wait_queue. This removes the delay caused by contention on spin lock in wakeup() and amortizes the wakeup() calls scheduled over the 2 ms period. In the following example, the script enables the sys_enter_getuid and sys_exit_getuid tracepoints and runs the getuid_microbench in parallel with the given number of processes. The output clearly shows the latency increase caused by contentions. $ ~/getuid.sh 1 1000000 calls in 0.720974253 s (720.974253 ns/call) $ ~/getuid.sh 2 1000000 calls in 1.166457554 s (1166.457554 ns/call) 1000000 calls in 1.168933765 s (1168.933765 ns/call) $ ~/getuid.sh 3 1000000 calls in 1.783827516 s (1783.827516 ns/call) 1000000 calls in 1.795553270 s (1795.553270 ns/call) 1000000 calls in 1.796493376 s (1796.493376 ns/call) $ ~/getuid.sh 4 1000000 calls in 4.483041796 s (4483.041796 ns/call) 1000000 calls in 4.484165388 s (4484.165388 ns/call) 1000000 calls in 4.484850762 s (4484.850762 ns/call) 1000000 calls in 4.485643576 s (4485.643576 ns/call) $ ~/getuid.sh 5 1000000 calls in 6.497521653 s (6497.521653 ns/call) 1000000 calls in 6.502000236 s (6502.000236 ns/call) 1000000 calls in 6.501709115 s (6501.709115 ns/call) 1000000 calls in 6.502124100 s (6502.124100 ns/call) 1000000 calls in 6.502936358 s (6502.936358 ns/call) After the patch, the latencies scale better. 1000000 calls in 0.728720455 s (728.720455 ns/call) 1000000 calls in 0.842782857 s (842.782857 ns/call) 1000000 calls in 0.883803135 s (883.803135 ns/call) 1000000 calls in 0.902077764 s (902.077764 ns/call) 1000000 calls in 0.902838202 s (902.838202 ns/call) 1000000 calls in 0.908896885 s (908.896885 ns/call) 1000000 calls in 0.932523515 s (932.523515 ns/call) 1000000 calls in 0.958009672 s (958.009672 ns/call) 1000000 calls in 0.986188020 s (986.188020 ns/call) 1000000 calls in 0.989771102 s (989.771102 ns/call) 1000000 calls in 0.933518391 s (933.518391 ns/call) 1000000 calls in 0.958897947 s (958.897947 ns/call) 1000000 calls in 1.031038897 s (1031.038897 ns/call) 1000000 calls in 1.089516025 s (1089.516025 ns/call) 1000000 calls in 1.141998347 s (1141.998347 ns/call) Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1305059241-7629-1-git-send-email-vnagarnaik@google.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | | | | | | | | | | | | | perf: export perf_event_refresh() to modulesAvi Kivity2011-07-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM needs one-shot samples, since a PMC programmed to -X will fire after X events and then again after 2^40 events (i.e. variable period). Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1309362157-6596-4-git-send-email-avi@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | perf: Add context field to perf_eventAvi Kivity2011-07-013-10/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The perf_event overflow handler does not receive any caller-derived argument, so many callers need to resort to looking up the perf_event in their local data structure. This is ugly and doesn't scale if a single callback services many perf_events. Fix by adding a context parameter to perf_event_create_kernel_counter() (and derived hardware breakpoints APIs) and storing it in the perf_event. The field can be accessed from the callback as event->overflow_handler_context. All callers are updated. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | perf: Remove the perf_output_begin(.sample) argumentPeter Zijlstra2011-07-012-24/+21Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since only samples call perf_output_sample() its much saner (and more correct) to put the sample logic in there than in the perf_output_begin()/perf_output_end() pair. Saves a useless argument, reduces conditionals and shrinks struct perf_output_handle, win! Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | perf: Remove the nmi parameter from the swevent and overflow interfacePeter Zijlstra2011-07-015-45/+33Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | perf, x86: Add hw_watchdog_set_attr() in a sake of nmi-watchdog on P4Cyrill Gorcunov2011-07-011-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to restriction and specifics of Netburst PMU we need a separated event for NMI watchdog. In particular every Netburst event consumes not just a counter and a config register, but also an additional ESCR register. Since ESCR registers are grouped upon counters (i.e. if ESCR is occupied for some event there is no room for another event to enter until its released) we need to pick up the "least" used ESCR (or the most available one) for nmi-watchdog purposes -- so MSR_P4_CRU_ESCR2/3 was chosen. With this patch nmi-watchdog and perf top should be able to run simultaneously. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Lin Ming <ming.m.lin@intel.com> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: Frederic Weisbecker <fweisbec@gmail.com> Tested-and-reviewed-by: Don Zickus <dzickus@redhat.com> Tested-and-reviewed-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110623124918.GC13050@sun Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | events: Ensure that timers are updated without requiring read() callEric B Munson2011-07-011-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The event tracing infrastructure exposes two timers which should be updated each time the value of the counter is updated. Currently, these counters are only updated when userspace calls read() on the fd associated with an event. This means that counters which are read via the mmap'd page exclusively never have their timers updated. This patch adds ensures that the timers are updated each time the values in the mmap'd page are updated. Signed-off-by: Eric B Munson <emunson@mgebm.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1308932786-5111-1-git-send-email-emunson@mgebm.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | events: Move lockless timer calculation into helper functionEric B Munson2011-07-011-7/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Take the timer calculation from perf_output_read and move it to a helper function for any place that needs timer values but cannot take the ctx->lock. Signed-off-by: Eric B Munson <emunson@mgebm.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1308861279-15216-2-git-send-email-emunson@mgebm.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | events: Add note to update_event_times comment about holding ctx->lockEric B Munson2011-07-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Eric B Munson <emunson@mgebm.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1308861279-15216-1-git-send-email-emunson@mgebm.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | perf_events: Fix perf buffer watermark settingVince Weaver2011-07-012-9/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 2.6.36 (specifically commit d57e34fdd60b ("perf: Simplify the ring-buffer logic: make perf_buffer_alloc() do everything needed"), the perf_buffer_init_code() has been mis-setting the buffer watermark if perf_event_attr.wakeup_events has a non-zero value. This is because perf_event_attr.wakeup_events is a union with perf_event_attr.wakeup_watermark. This commit re-enables the check for perf_event_attr.watermark being set before continuing with setting a non-default watermark. This bug is most noticable when you are trying to use PERF_IOC_REFRESH with a value larger than one and perf_event_attr.wakeup_events is set to one. In this case the buffer watermark will be set to 1 and you will get extraneous POLL_IN overflows rather than POLL_HUP as expected. [ avoid using attr.wakeup_events when attr.watermark is set ] Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1106011506390.5384@cl320.eecs.utk.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | Merge commit 'v3.0-rc5' into perf/coreIngo Molnar2011-07-0116-521/+611
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | |_|_|_|/ / / / / / / / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: Pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | Merge commit 'v3.0-rc3' into perf/coreIngo Molnar2011-06-1610-47/+103
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: add the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | | perf: Split up buffer handling from core codeFrederic Weisbecker2011-06-094-498/+568
| | |_|/ / / / / / / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And create the internal perf events header. v2: Keep an internal inlined perf_output_copy() Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@alien8.de> Cc: Stephane Eranian <eranian@google.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1305827704-5607-1-git-send-email-fweisbec@gmail.com [ v3: use clearer 'ring_buffer' and 'rb' naming ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | perf, core: Fix initial task_ctx/event installationPeter Zijlstra2011-06-071-11/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A lost Quilt refresh of 2c29ef0fef8 (perf: Simplify and fix __perf_install_in_context()) is causing grief and lockups, reported by Jiri Olsa. When installing an event in a task context, there's a number of issues: - there might not be an existing task context, in which case we should install the now current context; - there might already be a context, not the current one, in which case we should de-schedule the old and install the new; these cases were dealt with in the lost refresh, however there is one further case that was found in testing: - there might already be a context, the current one, in which case we should still de-schedule, and should take care to re-install it (note that task_ctx_sched_out() clears cpuctx->task_ctx). Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1307399008.2497.971.camel@laptop Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | Merge branch 'perf/urgent' into perf/coreIngo Molnar2011-06-044-27/+66
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: tools/perf/util/python.c Merge reason: resolve the conflict with perf/urgent. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Merge commit 'v3.0-rc1' into perf/coreIngo Molnar2011-06-0321-733/+876
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | | |_|_|_|_|_|_|_|_|_|_|_|_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: merge in the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | | perf: De-schedule a task context when removing the last eventPeter Zijlstra2011-05-281-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since perf_install_in_context() will now install a context when we add the first event, we can de-schedule the context when the last event is removed. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110409192142.090431763@chello.nl Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | | perf: Change close() semantics for group eventsPeter Zijlstra2011-05-281-7/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to always call list_del_event() on the correct cpu if the event is part of an active context and avoid having to do two IPIs, change the close() semantics slightly. The current perf_event_disable() call would disable a whole group if the event that's being closed is the group leader, whereas the new code keeps the group siblings enabled. People should not rely on this behaviour and I don't think they do, but in case we find they do, the fix is easy and we have to take the double IPI cost. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Vince Weaver <vweaver1@eecs.utk.edu> Link: http://lkml.kernel.org/r/20110409192142.038377551@chello.nl Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | | perf: Collect the schedule-in rules in one functionPeter Zijlstra2011-05-281-12/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was scattered out - refactor it into a single function. No change in functionality. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110409192141.979862055@chello.nl Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | | perf: Change and simplify ctx::is_active semanticsPeter Zijlstra2011-05-281-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of tracking if a context is active or not, track which events of the context are active. By making it a bitmask of EVENT_PINNED|EVENT_FLEXIBLE we can simplify some of the scheduling routines since it can avoid adding events that are already active. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110409192141.930282378@chello.nl Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | | | | | perf: Simplify and fix __perf_install_in_context()Peter Zijlstra2011-05-281-45/+35Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently __perf_install_in_context() will try and schedule in the event irrespective of our event scheduling rules, that is, we try to schedule CPU-pinned, TASK-pinned, CPU-flexible, TASK-flexible, but when creating a new event we simply try and schedule it on top of whatever is already on the PMU, this can lead to errors for pinned events. Therefore, simplify things and simply schedule everything out, add the event to the corresponding context and schedule everything back in. This also nicely handles the case where with __ARCH_WANT_INTERRUPTS_ON_CTXSW the IPI can come right in the middle of schedule, before we managed to call perf_event_task_sched_in(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110409192141.870894224@chello.nl Signed-off-by: Ingo Molnar <mingo@elte.hu>