summaryrefslogtreecommitdiffstats
path: root/tools/perf/ui
Commit message (Collapse)AuthorAgeFilesLines
* perf annotate browser: Circulate percent, total-period and nr-samples viewTaeung Song2017-08-181-3/+8
| | | | | | | | | | | | | | | | | | | | | Using the existing 't' hotkey, support the three views: percent, total period and number of samples on the annotate TUI browser, circulating them like below: Percent -> Total Period -> Nr Samples -> Percent ... Committer notes: Removed new 'e' hotkey, should be resubmitted as a separate patch, with proper justification for its inclusion. Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <milian.wolff@kdab.com> Link: http://lkml.kernel.org/r/1503046028-5691-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate browser: Support --show-nr-samples optionTaeung Song2017-08-181-3/+11
| | | | | | | | | | | | | | | | | | | Support the --show-nr-samples in the TUI browser. Committer notes: Lift the restriction about --tui but leave it for --gtk: $ export LD_LIBRARY_PATH=~/lib64 $ perf annotate --gtk --show-nr-samples --show-nr-samples is not available in --gtk mode at this time $ Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1503046023-5646-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate TUI: Set appropriate column width for period/percentArnaldo Carvalho de Melo2017-07-281-4/+6
| | | | | | | | | | | | | | | | Either when we start 'perf annotate' or 'perf report' with --show-total-period or when we, in the annotate browser, press 't' to toggle period/percent for the first column, we need to adjust the width for the 'period' case. Based-on-a-patch-by: Taeung Song <treeze.taeung@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-n2np5qcs20u6qjdr9orygne6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate TUI: Fix column header when toggling period/percentTaeung Song2017-07-281-1/+1
| | | | | | | | | | | | | | | | | | We have the 't' hotkey to toggle showing either the total period or the percentage of samples for a given line, but we forgot to toggle as well the column header, always showing "Percent", even when showing the period, fix it. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1501172169-6761-1-git-send-email-treeze.taeung@gmail.com [ Extracted from a larger patch, s/Event count/Period/g ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate TUI: Clarify calculation of column header widthsArnaldo Carvalho de Melo2017-07-281-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | In commit f8f4aaead579 ("perf annotate: Finally display IPC and cycle accounting") the 'pcnt_width' variable was abused in a few places to also include the optional width of the "IPC" and "cycles" columns, while in other places we stopped using 'pcnt_width' and instead its previous equation... Now that we need to tap into annotate_browser__pcnt_width() to consider if --show-total-period is being used and instead of that hardcoded 7 (strlen("Percent")) we need to use it or strlen("Event count") we need this properly clarified to avoid having to touch all the (7 * nr_events) places. Clarify this by introducing a separate annotate_browser__cycles_width() to leave the pcnt_width calculate just what its name implies. Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/n/tip-szgb07t4k5wtvks8nzwkg710@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate TUI: Fix --show-total-periodTaeung Song2017-07-281-1/+1
| | | | | | | | | | | | | | | We were showing the number of samples, not the total period, fix it. Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Martin Liška <mliska@suse.cz> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period") Link: http://lkml.kernel.org/r/1500500223-16753-1-git-send-email-treeze.taeung@gmail.com [ extracted from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate TUI: Use sym_hist_entry in disasm_line_samplesArnaldo Carvalho de Melo2017-07-281-4/+4
| | | | | | | | | | | | | Just paving the way to fix --show-total-period in the TUI, i.e. now we save in struct disasm_line_samples not just the number of samples, but also the total period. Based-on-a-patch-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-1sup5hkwrxocjvrmrmhs732o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Rename 'sum' to 'nr_samples' in struct sym_histTaeung Song2017-07-211-1/+1
| | | | | | | | | | | | | | | | | | | To make it more clear that it is the sum of all the nr_samples fields in the addr[] entries, i.e.: sym_hist->nr_samples = sum(sym_hist->addr[0 .. symbol__size(sym)]->nr_samples) Committer notes: Taeung had renamed it to total_samples, but using nr_samples, as in the added explanation above, looks clearer and establishes the direct connection, making clear it is about the _number_ of samples. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1500500211-16599-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Introduce struct sym_hist_entryTaeung Song2017-07-212-5/+5
| | | | | | | | | | | | | | | | | | | struct sym_hist has addr[] but it should have not only number of samples but also the sample period. So use new struct symhist_entry to pave the way to have that. Committer notes: This initial patch will only introduce the struct sym_hist_entry and use only the nr_samples member, which makes the code clearer and paves the way to save the period as well. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1500500205-16553-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* tools include: Adopt strstarts() from the kernelArnaldo Carvalho de Melo2017-07-203-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Replacing prefixcmp(), same purpose, inverted result, so standardize on the kernel variant, to reduce silly differences among tools/ and the kernel sources, making it easier for people to work in both codebases. And then doing: if (strstarts(option, "no-")) Looks clearer than doing: if (!prefixcmp(option, "no-")) To figure out if option starts witn "no-". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kaei42gi7lpa8subwtv7eug8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Enable finding kernel inline functionsJin Yao2017-07-192-6/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently perf supports a mode to query inline stack. It works well for finding user space inline functions but it doesn't work for kernel ones, due to some unnecessary check. This patch removes these unnecessary checks. Now kernel inline functions can be reported. For example: perf report --inline -g func --stdio |--46.19%--do_huge_pmd_anonymous_page | do_huge_pmd_anonymous_page (inline) | __do_huge_pmd_anonymous_page (inline) | __SetPageUptodate (inline) | __set_bit (inline) The result is compared with the output of addr2line. They match. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1500409892-15904-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Implement visual marker for macro fusionJin Yao2017-07-193-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For marking fused instructions clearly this patch adds a line before the first instruction of pair and joins it with the arrow of the jump to its target. For example, when "je" is selected in annotate view, the line before cmpl is displayed and joins the arrow of "je". │ ┌──cmpl $0x0,argp_program_version_hook 81.93 │ ├──je 20 │ │ lock cmpxchg %esi,0x38a9a4(%rip) │ │↓ jne 29 │ │↓ jmp 43 11.47 │20:└─→cmpxch %esi,0x38a999(%rip) That means the cmpl+je is a fused instruction pair and they should be considered together. Changelog: v3: Use Arnaldo's fix to improve the arrow origin rendering. To get the evsel->evlist->env->cpuid, save the evsel in annotate_browser. v2: new function "ins__is_fused" to check if the instructions are fused. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1499403995-19857-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Check for fused instructionsJin Yao2017-07-192-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Macro fusion merges two instructions to a single micro-op. Intel core platform performs this hardware optimization under limited circumstances. For example, CMP + JCC can be "fused" and executed /retired together. While with sampling this can result in the sample sometimes being on the JCC and sometimes on the CMP. So for the fused instruction pair, they could be considered together. On Nehalem, fused instruction pairs: cmp/test + jcc. On other new CPU: cmp/test/add/sub/and/inc/dec + jcc. This patch adds an x86-specific function which checks if 2 instructions are in a "fused" pair. For non-x86 arch, the function is just NULL. Changelog: v4: Move the CPU model checking to symbol__disassemble and save the CPU family/model in arch structure. It avoids checking every time when jump arrow printed. v3: Add checking for Nehalem (CMP, TEST). For other newer Intel CPUs just check it by default (CMP, TEST, ADD, SUB, AND, INC, DEC). v2: Remove the original weak function. Arnaldo points out that doing it as a weak function that will be overridden by the host arch doesn't work. So now it's implemented as an arch-specific function. Committer fix: Do not access evsel->evlist->env->cpuid, ->env can be null, introduce perf_evsel__env_cpuid(), just like perf_evsel__env_arch(), also used in this function call. The original patch was segfaulting 'perf top' + annotation. But this essentially disables this fused instructions augmentation in 'perf top', the right thing is to get the cpuid from the running kernel, left for a later patch tho. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1499403995-19857-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its ↵Jin Yao2017-07-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | target When the jump instruction is displayed at the row 0 in annotate view, the arrow is broken. An example: 16.86 │ ┌──je 82 0.01 │ movsd (%rsp),%xmm0 │ movsd 0x8(%rsp),%xmm4 │ movsd 0x8(%rsp),%xmm1 │ movsd (%rsp),%xmm3 │ divsd %xmm4,%xmm0 │ divsd %xmm3,%xmm1 │ movsd (%rsp),%xmm2 │ addsd %xmm1,%xmm0 │ addsd %xmm2,%xmm0 │ movsd %xmm0,(%rsp) │82: sub $0x1,%ebx 83.03 │ ↑ jne 38 │ add $0x10,%rsp │ xor %eax,%eax │ pop %rbx │ ← retq The patch increments the row number before checking with 0. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Fixes: 944e1abed9e1 ("perf ui browser: Add method to draw up/down arrow line") Link: http://lkml.kernel.org/r/1496901704-30275-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Return arch from symbol__disassemble() and save it in browserJin Yao2017-06-192-2/+7
| | | | | | | | | | | | | | | | | | | In annotate browser, we will add support to check fused instructions. While this is x86-specific feature so we need the annotate browser to know what the arch it runs on. symbol__disassemble() has figured out the arch. This patch just lets the arch return from symbol__disassemble and save the arch in annotate browser. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1497840958-4759-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate browser: Display titles in left frameJin Yao2017-06-191-3/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The annotate browser is divided into 2 frames. Left frame contains 3 columns (some platforms only have one column). For example: │26 int compute_flag() │27 { 22.80 1.20 │ sub $0x8,%rsp │25 int i; │ │27 i = rand() % 2; 22.78 1.20 1 │ → callq rand@plt While it's hard for user to understand what the data is. This patch adds the titles "Percent", "IPC" and "Cycle" on columns. Percent IPC Cycle │ │25 __attribute__((noinline)) │26 int compute_flag() │27 { 22.80 1.20 │ sub $0x8,%rsp │25 int i; │ │27 i = rand() % 2; 22.78 1.20 1 │ → callq rand@plt The titles are displayed at row 0 of annotate browser if row 0 doesn't have values of percent, ipc and cycle. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Milian Wolff <milian.wolff@kdab.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/1493909895-9668-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Remove unnecessary check in annotate_browser_write()Jin Yao2017-06-191-14/+10Star
| | | | | | | | | | | | | | | | | | | | | | | | In annotate_browser_write(), if (dl->offset != -1 && percent_max != 0.0) { if (percent_max != 0.0) { ... } ... } The second check of (percent_max != 0.0) is not necessary, remove it. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Milian Wolff <milian.wolff@kdab.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/1493909895-9668-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Put caller above callee in --children modeNamhyung Kim2017-05-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __hpp__sort_acc() sorts entries using callchain depth in order to put callers above in children mode. But it assumed the callchain order was callee-first. Now default (for children) is caller-first so the order of entries is reverted. For example, consider following case: $ perf report --no-children ..l # Overhead Command Shared Object Symbol # ........ ....... ................... .......................... # 99.44% a.out a.out [.] main | ---main __libc_start_main _start Then children mode should show 'start' above '__libc_start_main' since it's the caller (parent) of the __libc_start_main. But it's reversed: # Children Self Command Shared Object Symbol # ........ ........ ....... ............... ..................... # 99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main 99.61% 0.00% a.out a.out [.] _start 99.54% 99.44% a.out a.out [.] main This patch fixes it. # Children Self Command Shared Object Symbol # ........ ........ ....... ............... ..................... # 99.61% 0.00% a.out a.out [.] _start 99.61% 0.00% a.out libc-2.25.so [.] __libc_start_main 99.54% 99.44% a.out a.out [.] main Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yao Jin <yao.jin@linux.intel.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf ui gtk: Move gtk .so name to the only place where it is usedArnaldo Carvalho de Melo2017-04-261-0/+3
| | | | | | | | | No need to pollute util.h with this. Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/n/tip-kec0chbdtgrd71o3oi2kz2zt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Use just forward declarations for struct thread where possibleArnaldo Carvalho de Melo2017-04-242-0/+2
| | | | | | | | Removing various instances of unnecessary includes, reducing the maze of header dependencies. Link: http://lkml.kernel.org/n/tip-hwu6eyuok9pc57alookyzmsf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Move units conversion/formatting routines to separate objectArnaldo Carvalho de Melo2017-04-201-0/+1
| | | | | | | | | | | | Out of util.h, to disentangle it a bit more. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vpksyj3w5fk9t8s6mxmkajyr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add signal.h to places using its definitionsArnaldo Carvalho de Melo2017-04-202-0/+2
| | | | | | | And remove it from util.h, disentangling it a bit more. Link: http://lkml.kernel.org/n/tip-2zg9s5nx90yde64j3g4z2uhk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Remove include dirent.h from util.hArnaldo Carvalho de Melo2017-04-191-0/+1
| | | | | | | | | | | | | The files using the dirent.h routines should instead include it, reducing the includes hell that lead to longer build times. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-42g2f4z6nfg7mdb2ae97n7tj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Move path related functions to util/path.hArnaldo Carvalho de Melo2017-04-191-0/+1
| | | | | | | | | | | | Disentangling util.h header mess a bit more. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-aj6je8ly377i4upedmjzdsq6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Don't include terminal handling headers in util.hArnaldo Carvalho de Melo2017-04-193-0/+4
| | | | | | | | | | | | | | | Continuing the disentanglement, mostly the TUI needs CTRL(c), that is in sys/ttydefaults.h and term.c needs the termios headers. And term.h needs to be added to a few places too. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-il19zna7qj9ytavdbwlipc7t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Include errno.h where neededArnaldo Carvalho de Melo2017-04-191-0/+1
| | | | | | | | | | | | | | Removing it from util.h, part of an effort to disentangle the includes hell, that makes changes to util.h or something included by it to cause a complete rebuild of the tools. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ztrjy52q1rqcchuy3rubfgt2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Move extra string util functions to util/string2.hArnaldo Carvalho de Melo2017-04-194-0/+4
| | | | | | | | | | | | | Moving them from util.h, where they don't belong. Since libc already have string.h, name it slightly differently, as string2.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-eh3vz5sqxsrdd8lodoro4jrw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Move srcline definitions to separate headerArnaldo Carvalho de Melo2017-04-192-0/+2
| | | | | | | | | | | | Out of util.h into a new file, srcline.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ludnlm4djqcdjziekzr4s3u9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Move sane ctype stuff from util.h to sane_ctype.hArnaldo Carvalho de Melo2017-04-194-1/+6
| | | | | | | | | | | | More stuff that came from git, out of the hodge-podge that is util.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-e3lana4gctz3ub4hn4y29hkw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Including missing inttypes.h headerArnaldo Carvalho de Melo2017-04-194-1/+4
| | | | | | | | | | | | Needed to use the PRI[xu](32,64) formatting macros. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wkbho8kaw24q67dd11q0j39f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add include <linux/kernel.h> where ARRAY_SIZE() is usedArnaldo Carvalho de Melo2017-04-192-0/+2
| | | | | | | | | | | | | To pave the way for further cleanups where linux/kernel.h may stop being included in some header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qqxan6tfsl6qx3l0v3nwgjvk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf ui browser: Refactor the code to parse color configs with ltrim()Taeung Song2017-04-111-1/+1
| | | | | | | | | | | | | | When parsing {fore, back} ground color configs, use ltrim() instead of just while loop and isspace(). Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1491575061-704-4-git-send-email-treeze.taeung@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Enable sorting by srcline as keyMilian Wolff2017-03-272-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Often it is interesting to know how costly a given source line is in total. Previously, one had to build these sums manually based on all addresses that pointed to the same source line. This patch introduces srcline as a sort key, which will do the aggregation for us. Paired with the recent addition of showing inline frames, this makes perf report much more useful for many C++ work loads. The following shows the new feature in action. First, let's show the status quo output when we sort by address. The result contains many hist entries that generate the same output: ~~~~~~~~~~~~~~~~ $ perf report --stdio --inline -g address # Children Self Command Shared Object Symbol # ........ ........ ............ ................... ......................................... # 99.89% 35.34% cpp-inlining cpp-inlining [.] main | |--64.55%--main complex:655 | /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) | /usr/include/c++/6.3.1/complex:664 (inline) | | | |--60.31%--hypot +20 | | | | | |--8.52%--__hypot_finite +273 | | | | | |--7.32%--__hypot_finite +411 ... --35.34%--_start +4194346 __libc_start_main +241 | |--6.65%--main random.tcc:3326 | /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) | /usr/include/c++/6.3.1/bits/random.h:1809 (inline) | /usr/include/c++/6.3.1/bits/random.h:1818 (inline) | /usr/include/c++/6.3.1/bits/random.h:185 (inline) | |--2.70%--main random.tcc:3326 | /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) | /usr/include/c++/6.3.1/bits/random.h:1809 (inline) | /usr/include/c++/6.3.1/bits/random.h:1818 (inline) | /usr/include/c++/6.3.1/bits/random.h:185 (inline) | |--1.69%--main random.tcc:3326 | /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) | /usr/include/c++/6.3.1/bits/random.h:1809 (inline) | /usr/include/c++/6.3.1/bits/random.h:1818 (inline) | /usr/include/c++/6.3.1/bits/random.h:185 (inline) ... ~~~~~~~~~~~~~~~~ With this patch and `-g srcline` we instead get the following output: ~~~~~~~~~~~~~~~~ $ perf report --stdio --inline -g srcline # Children Self Command Shared Object Symbol # ........ ........ ............ ................... ......................................... # 99.89% 35.34% cpp-inlining cpp-inlining [.] main | |--64.55%--main complex:655 | /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) | /usr/include/c++/6.3.1/complex:664 (inline) | | | |--64.02%--hypot | | | | | --59.81%--__hypot_finite | | | --0.53%--cabs | --35.34%--_start __libc_start_main | |--12.48%--main random.tcc:3326 | /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline) | /usr/include/c++/6.3.1/bits/random.h:1809 (inline) | /usr/include/c++/6.3.1/bits/random.h:1818 (inline) | /usr/include/c++/6.3.1/bits/random.h:185 (inline) ... ~~~~~~~~~~~~~~~~ Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Show inline stack for browser modeJin Yao2017-03-271-8/+172
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the address belongs to an inlined function, the source information back to the first non-inlined function will be printed. For example: 1. Show inlined function name perf report -g function --inline - 0.69% 0.00% inline ld-2.23.so [.] dl_main - dl_main 0.56% _dl_relocate_object _dl_relocate_object (inline) elf_dynamic_do_Rela (inline) 2. Show the file/line information perf report -g address --inline - 0.69% 0.00% inline ld-2.23.so [.] _dl_start _dl_start rtld.c:307 /build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline) + _dl_sysdep_start dl-sysdep.c:250 Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/1490474069-15823-6-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Show inline stack for stdio modeJin Yao2017-03-271-1/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the address belongs to an inlined function, the source information back to the first non-inlined function will be printed. For example: 1. Show inlined function name perf report --stdio -g function --inline 0.69% 0.00% inline ld-2.23.so [.] dl_main | ---dl_main | --0.56%--_dl_relocate_object _dl_relocate_object (inline) elf_dynamic_do_Rela (inline) 2. Show the file/line information perf report --stdio -g address --inline 0.69% 0.00% inline ld-2.23.so [.] _dl_start_user | ---_dl_start_user .:0 _dl_start rtld.c:307 /build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline) _dl_sysdep_start dl-sysdep.c:250 | --0.56%--dl_main rtld.c:2076 Committer tests: # perf record --call-graph dwarf ~/bin/perf stat usleep 1 Performance counter stats for 'usleep 1': 0.443020 task-clock (msec) # 0.449 CPUs utilized 1 context-switches # 0.002 M/sec 0 cpu-migrations # 0.000 K/sec 52 page-faults # 0.117 M/sec 1,049,423 cycles # 2.369 GHz 801,456 instructions # 0.76 insn per cycle 155,609 branches # 351.246 M/sec 7,026 branch-misses # 4.52% of all branches 0.000987570 seconds time elapsed [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.553 MB perf.data (66 samples) ] # perf report --stdio --inline fs__get_mountpoint <SNIP> 1.73% 0.00% perf perf [.] fs__get_mountpoint | ---fs__get_mountpoint fs__get_mountpoint (inline) fs__check_mounts (inline) __statfs entry_SYSCALL_64 sys_statfs SYSC_statfs user_statfs user_path_at_empty filename_lookup path_lookupat link_path_walk inode_permission __inode_permission kernfs_iop_permission kernfs_refresh_inode security_inode_notifysecctx selinux_inode_notifysecctx selinux_inode_setsecurity security_context_to_sid security_context_to_sid_core string_to_context_struct symcmp Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/1490474069-15823-5-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf hists browser: Fix typo in function switch_data_fileChangbin Du2017-03-131-1/+1
| | | | | | | | | | | Should clear buf 'abs_path', not 'options'. Signed-off-by: Changbin Du <changbin.du@intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 341487ab561f ("perf hists browser: Add option for runtime switching perf data file") Link: http://lkml.kernel.org/r/20170313114652.9207-1-changbin.du@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf utils: Check verbose flag properlyNamhyung Kim2017-02-202-4/+4
| | | | | | | | | | | | | It now can have negative value to suppress the message entirely. So it needs to check it being positive. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170217081742.17417-3-namhyung@kernel.org [ Adjust fuzz on tools/perf/util/pmu.c, add > 0 checks in many other places ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar2017-02-061-0/+10
|\ | | | | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf diff: Fix -o/--order option behavior (again)Namhyung Kim2017-02-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface") changed list_add() to perf_hpp__register_sort_field(). This resulted in a behavior change since the field was added to the tail instead of the head. So the -o option is mostly ignored due to its order in the list. This patch fixes it by adding perf_hpp__prepend_sort_field(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Fixes: 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface") Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf diff: Fix segfault on 'perf diff -o N' optionNamhyung Kim2017-02-021-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The -o/--order option is to select column number to sort a diff result. It does the job by adding a hpp field at the beginning of the sort list. But it should not be added to the output field list as it has no callbacks required by a output field. During the setup_sorting(), the perf_hpp__setup_output_field() appends the given sort keys to the output field if it's not there already. Originally it was checked by fmt->list being non-empty. But commit 3f931f2c4274 ("perf hists: Make hpp setup function generic") changed it to check the ->equal callback. Anyways, we don't need to add the pseudo hpp field to the output field list since it won't be used for output. So just skip fields if they have no ->color or ->entry callbacks. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Fixes: 3f931f2c4274 ("perf hists: Make hpp setup function generic") Link: http://lkml.kernel.org/r/20170118051457.30946-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf hists browser: Add e/c hotkeys to expand/collapse callchain for current ↵Jiri Olsa2017-01-201-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | entry Currently we allow only to expand or collapse all entries in the browser with 'E' or 'C' keys. Allow user to expand or collapse only current entry in the browser with e or c key. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1484904032-11040-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf hists browser: Put hist_entry folding logic into single functionJiri Olsa2017-01-201-18/+25
| | | | | | | | | | | | | | | | | | | | | | | | It will be used in following patch to expand or collapse only the current browser entry. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1484904032-11040-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | perf tools: Move two variables usied in libperf from perf.cSoramichi AKIYAMA2017-01-171-0/+1
|/ | | | | | | | | | | | | | | | | | The use_browser and perf_version_string variables are both declared in perf.c but they are also referenced by other functions of libperf.a. Therefore a user linking an own main() with libperf.a must declare those two variables in their files even if the files never use the browser or the version information. This patch fixes this issue by moving use_browser and perf_version_string out of perf.c to some other files. Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170117002237.c1aec0ce3b4d675dca018deb@m.soramichi.jp Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Fix jump target outside of function address rangeRavi Bangoria2016-12-151-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If jump target is outside of function range, perf is not handling it correctly. Especially when target address is lesser than function start address, target offset will be negative. But, target address declared to be unsigned, converts negative number into 2's complement. See below example. Here target of 'jumpq' instruction at 34cf8 is 34ac0 which is lesser than function start address(34cf0). 34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0 Objdump output: 0000000000034cf0 <__sigaction>: __GI___sigaction(): 34cf0: lea -0x20(%rdi),%eax 34cf3: cmp -bashx1,%eax 34cf6: jbe 34d00 <__sigaction+0x10> 34cf8: jmpq 34ac0 <__GI___libc_sigaction> 34cfd: nopl (%rax) 34d00: mov 0x386161(%rip),%rax # 3bae68 <_DYNAMIC+0x2e8> 34d07: movl -bashx16,%fs:(%rax) 34d0e: mov -bashxffffffff,%eax 34d13: retq perf annotate before applying patch: __GI___sigaction /usr/lib64/libc-2.22.so lea -0x20(%rdi),%eax cmp -bashx1,%eax v jbe 10 v jmpq fffffffffffffdd0 nop 10: mov _DYNAMIC+0x2e8,%rax movl -bashx16,%fs:(%rax) mov -bashxffffffff,%eax retq perf annotate after applying patch: __GI___sigaction /usr/lib64/libc-2.22.so lea -0x20(%rdi),%eax cmp -bashx1,%eax v jbe 10 ^ jmpq 34ac0 <__GI___libc_sigaction> nop 10: mov _DYNAMIC+0x2e8,%rax movl -bashx16,%fs:(%rax) mov -bashxffffffff,%eax retq Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1480953407-7605-3-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Show invalid jump offset in error messageArnaldo Carvalho de Melo2016-11-251-2/+4
| | | | | | | | | | | | | | | | | To help in debugging when the wrong offset is being used, like in: │13d98: ↓ jne 13dd1 <lzma_lzma_preset@@XZ_5.0+0x28e1> That is the full line from objdump, and it seems what should be used is 13dd1, not 28e1. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-4nc0marsgst1ft6inmvqber7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf ui helpline: Provide a printf variantArnaldo Carvalho de Melo2016-11-252-0/+11
| | | | | | | | | | | | | To print some values, like in the annotation code with invalid jump offsets. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-1vk0g5twas2ioswn1mmvnvwq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf annotate: Remove duplicate 'name' field from disasm_lineArnaldo Carvalho de Melo2016-11-251-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The disasm_line::name field is always equal to ins::name, being used just to locate the instruction's ins_ops from the per-arch instructions table. Eliminate this duplication, nuking that field and instead make ins__find() return an ins_ops, store it in disasm_line::ins.ops, and keep just in disasm_line::ins.name what was in disasm_line::name, this way we end up not keeping a reference to entries in the per-arch instructions table. This in turn will help supporting multiple ways to manage the per-arch instructions table, allowing resorting that array, for instance, when the entries will move after references to its addresses were made. The same problem is avoided when one grows the array with realloc. So architectures simply keeping a constant array will work as well as architectures building the table using regular expressions or other logic that involves resorting the table. Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vr899azvabnw9gtuepuqfd9t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-core-for-mingo-20161123' of ↵Ingo Molnar2016-11-242-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: New tool: - 'perf sched timehist' provides an analysis of scheduling events. Example usage: perf sched record -- sleep 1 perf sched timehist By default it shows the individual schedule events, including the wait time (time between sched-out and next sched-in events for the task), the task scheduling delay (time between wakeup and actually running) and run time for the task: time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) -------- ------ ---------------- --------- --------- -------- 1.874569 [0011] gcc[31949] 0.014 0.000 1.148 1.874591 [0010] gcc[31951] 0.000 0.000 0.024 1.874603 [0010] migration/10[59] 3.350 0.004 0.011 1.874604 [0011] <idle> 1.148 0.000 0.035 1.874723 [0005] <idle> 0.016 0.000 1.383 1.874746 [0005] gcc[31949] 0.153 0.078 0.022 ... Times are in msec.usec. (David Ahern, Namhyung Kim) Improvements: - Make 'perf c2c report' support -f/--force, to allow skipping the ownership check for root users, for instance, just like the other tools (Jiri Olsa) - Allow sorting cachelines by total number of HITMs, in addition to local and remote numbers (Jiri Olsa) Fixes: - Make sure errors aren't suppressed by the TUI reset at the end of a 'perf c2c report' session (Jiri Olsa) Infrastructure changes: - Initial work on having the annotate code better support multiple architectures, including the ability to cross-annotate, i.e. to annotate perf.data files collected on an ARM system on a x86_64 workstation (Arnaldo Carvalho de Melo, Ravi Bangoria, Kim Phillips) - Use USECS_PER_SEC instead of hard coded number in libtraceevent (Steven Rostedt) - Add retrieval of preempt count and latency flags in libtraceevent (Steven Rostedt) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf annotate: Start supporting cross arch annotationArnaldo Carvalho de Melo2016-11-172-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a 'struct arch', where arch specific stuff will live, starting with objdump's choice of comment delimitation character, that is '#' in x86 while a ';' in arm. This has some bits and pieces from a patch submitted by Ravi. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-f337tzjjcl8vtapgvjxmhrbx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | Merge branch 'linus' into perf/core, to pick up fixesIngo Molnar2016-11-241-10/+23
|\ \ | |/ |/| | | Signed-off-by: Ingo Molnar <mingo@kernel.org>