diff options
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r-- | tools/perf/Documentation/Build.txt | 24 | ||||
-rw-r--r-- | tools/perf/Documentation/Makefile | 1 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-config.txt | 16 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-list.txt | 12 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-record.txt | 31 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-report.txt | 13 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-script.txt | 3 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-stat.txt | 9 | ||||
-rw-r--r-- | tools/perf/Documentation/perf.data-file-format.txt | 24 | ||||
-rw-r--r-- | tools/perf/Documentation/perf.txt | 2 | ||||
-rw-r--r-- | tools/perf/Documentation/tips.txt | 7 |
11 files changed, 137 insertions, 5 deletions
diff --git a/tools/perf/Documentation/Build.txt b/tools/perf/Documentation/Build.txt index f6fc6507ba55..3766886c4bca 100644 --- a/tools/perf/Documentation/Build.txt +++ b/tools/perf/Documentation/Build.txt @@ -47,3 +47,27 @@ Those objects are then used in final linking: NOTE this description is omitting other libraries involved, only focusing on build framework outcomes + +3) Build with ASan or UBSan +========================== + $ cd tools/perf + $ make DESTDIR=/usr + $ make DESTDIR=/usr install + +AddressSanitizer (or ASan) is a GCC feature that detects memory corruption bugs +such as buffer overflows and memory leaks. + + $ cd tools/perf + $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' + $ ASAN_OPTIONS=log_path=asan.log ./perf record -a + +ASan outputs all detected issues into a log file named 'asan.log.<pid>'. + +UndefinedBehaviorSanitizer (or UBSan) is a fast undefined behavior detector +supported by GCC. UBSan detects undefined behaviors of programs at runtime. + + $ cd tools/perf + $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=undefined' + $ UBSAN_OPTIONS=print_stacktrace=1 ./perf record -a + +If UBSan detects any problem at runtime, it outputs a “runtime error:” message. diff --git a/tools/perf/Documentation/Makefile b/tools/perf/Documentation/Makefile index ac841bc5c35b..6d148a40551c 100644 --- a/tools/perf/Documentation/Makefile +++ b/tools/perf/Documentation/Makefile @@ -1,3 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0-only include ../../scripts/Makefile.include include ../../scripts/utilities.mak diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt index 86f3dcc15f83..462b3cde0675 100644 --- a/tools/perf/Documentation/perf-config.txt +++ b/tools/perf/Documentation/perf-config.txt @@ -114,7 +114,7 @@ Given a $HOME/.perfconfig like this: [report] # Defaults - sort-order = comm,dso,symbol + sort_order = comm,dso,symbol percent-limit = 0 queue-size = 0 children = true @@ -584,6 +584,20 @@ llvm.*:: llvm.opts:: Options passed to llc. +samples.*:: + + samples.context:: + Define how many ns worth of time to show + around samples in perf report sample context browser. + +scripts.*:: + + Any option defines a script that is added to the scripts menu + in the interactive perf browser and whose output is displayed. + The name of the option is the name, the value is a script command line. + The script gets the same options passed as a full perf script, + in particular -i perfdata file, --cpu, --tid + SEE ALSO -------- linkperf:perf[1] diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt index 138fb6e94b3c..18ed1b0fceb3 100644 --- a/tools/perf/Documentation/perf-list.txt +++ b/tools/perf/Documentation/perf-list.txt @@ -199,6 +199,18 @@ also be supplied. For example: perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ... +EVENT QUALIFIERS: + +It is also possible to add extra qualifiers to an event: + +percore: + +Sums up the event counts for all hardware threads in a core, e.g.: + + + perf stat -e cpu/event=0,umask=0x3,percore=1/ + + EVENT GROUPS ------------ diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 8f0c2be34848..de269430720a 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -406,7 +406,8 @@ symbolic names, e.g. on x86, ax, si. To list the available registers use --intr-regs=ax,bx. The list of register is architecture dependent. --user-regs:: -Capture user registers at sample time. Same arguments as -I. +Similar to -I, but capture user registers at sample time. To list the available +user registers use --user-regs=\?. --running-time:: Record running and enabled time for read events (:S) @@ -459,6 +460,30 @@ Set affinity mask of trace reading thread according to the policy defined by 'mo node - thread affinity mask is set to NUMA node cpu mask of the processed mmap buffer cpu - thread affinity mask is set to cpu of the processed mmap buffer +--mmap-flush=number:: + +Specify minimal number of bytes that is extracted from mmap data pages and +processed for output. One can specify the number using B/K/M/G suffixes. + +The maximal allowed value is a quarter of the size of mmaped data pages. + +The default option value is 1 byte which means that every time that the output +writing thread finds some new data in the mmaped buffer the data is extracted, +possibly compressed (-z) and written to the output, perf.data or pipe. + +Larger data chunks are compressed more effectively in comparison to smaller +chunks so extraction of larger chunks from the mmap data pages is preferable +from the perspective of output size reduction. + +Also at some cases executing less output write syscalls with bigger data size +can take less time than executing more output write syscalls with smaller data +size thus lowering runtime profiling overhead. + +-z:: +--compression-level[=n]:: +Produce compressed trace using specified level n (default: 1 - fastest compression, +22 - smallest trace) + --all-kernel:: Configure all used events to run in kernel space. @@ -495,6 +520,10 @@ overhead. You can still switch them on with: --switch-output --no-no-buildid --no-no-buildid-cache +--switch-max-files=N:: + +When rotating perf.data with --switch-output, only keep N files. + --dry-run:: Parse options then exit. --dry-run can be used to detect errors in cmdline options. diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index 1a27bfe05039..f441baa794ce 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -105,6 +105,8 @@ OPTIONS guest machine - sample: Number of sample - period: Raw number of event count of sample + - time: Separate the samples by time stamp with the resolution specified by + --time-quantum (default 100ms). Specify with overhead and before it. By default, comm, dso and symbol keys are used. (i.e. --sort comm,dso,symbol) @@ -459,6 +461,10 @@ include::itrace.txt[] --socket-filter:: Only report the samples on the processor socket that match with this filter +--samples=N:: + Save N individual samples for each histogram entry to show context in perf + report tui browser. + --raw-trace:: When displaying traceevent output, do not use print fmt or plugins. @@ -477,6 +483,9 @@ include::itrace.txt[] Please note that not all mmaps are stored, options affecting which ones are include 'perf record --data', for instance. +--ns:: + Show time stamps in nanoseconds. + --stats:: Display overall events statistics without any further processing. (like the one at the end of the perf report -D command) @@ -494,6 +503,10 @@ include::itrace.txt[] The period/hits keywords set the base the percentage is computed on - the samples period or the number of samples (hits). +--time-quantum:: + Configure time quantum for time sort key. Default 100ms. + Accepts s, us, ms, ns units. + include::callchain-overhead-calculation.txt[] SEE ALSO diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt index 2e19fd7ffe35..9b0d04dd2a61 100644 --- a/tools/perf/Documentation/perf-script.txt +++ b/tools/perf/Documentation/perf-script.txt @@ -380,6 +380,9 @@ include::itrace.txt[] Set the maximum number of program blocks to print with brstackasm for each sample. +--reltime:: + Print time stamps relative to trace start. + --per-event-dump:: Create per event files with a "perf.data.EVENT.dump" name instead of printing to stdout, useful, for instance, for generating flamegraphs. diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 4bc2085e5197..1e312c2672e4 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -43,6 +43,10 @@ report:: param1 and param2 are defined as formats for the PMU in /sys/bus/event_source/devices/<pmu>/format/* + 'percore' is a event qualifier that sums up the event counts for both + hardware threads in a core. For example: + perf stat -A -a -e cpu/event,percore=1/,otherevent ... + - a symbolically formed event like 'pmu/config=M,config1=N,config2=K/' where M, N, K are numbers (in decimal, hex, octal format). Acceptable values for each of 'config', 'config1' and 'config2' @@ -72,9 +76,8 @@ report:: --all-cpus:: system-wide collection from all CPUs (default if no target is specified) --c:: ---scale:: - scale/normalize counter values +--no-scale:: + Don't scale/normalize counter values -d:: --detailed:: diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt index 593ef49b273c..6967e9b02be5 100644 --- a/tools/perf/Documentation/perf.data-file-format.txt +++ b/tools/perf/Documentation/perf.data-file-format.txt @@ -272,6 +272,19 @@ struct { Two uint64_t for the time of first sample and the time of last sample. + HEADER_COMPRESSED = 27, + +struct { + u32 version; + u32 type; + u32 level; + u32 ratio; + u32 mmap_len; +}; + +Indicates that trace contains records of PERF_RECORD_COMPRESSED type +that have perf_events records in compressed form. + other bits are reserved and should ignored for now HEADER_FEAT_BITS = 256, @@ -437,6 +450,17 @@ struct auxtrace_error_event { Describes a header feature. These are records used in pipe-mode that contain information that otherwise would be in perf.data file's header. + PERF_RECORD_COMPRESSED = 81, + +struct compressed_event { + struct perf_event_header header; + char data[]; +}; + +The header is followed by compressed data frame that can be decompressed +into array of perf trace records. The size of the entire compressed event +record including the header is limited by the max value of header.size. + Event types Define the event attributes with their IDs. diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt index 864e37597252..401f0ed67439 100644 --- a/tools/perf/Documentation/perf.txt +++ b/tools/perf/Documentation/perf.txt @@ -22,6 +22,8 @@ OPTIONS verbose - general debug messages ordered-events - ordered events object debug messages data-convert - data convert command debug messages + stderr - write debug output (option -v) to stderr + in browser mode --buildid-dir:: Setup buildid cache directory. It has higher priority than diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt index 849599f39c5e..869965d629ce 100644 --- a/tools/perf/Documentation/tips.txt +++ b/tools/perf/Documentation/tips.txt @@ -15,6 +15,7 @@ To see callchains in a more compact form: perf report -g folded Show individual samples with: perf script Limit to show entries above 5% only: perf report --percent-limit 5 Profiling branch (mis)predictions with: perf record -b / perf report +To show assembler sample contexts use perf record -b / perf script -F +brstackinsn --xed Treat branches as callchains: perf report --branch-history To count events in every 1000 msec: perf stat -I 1000 Print event counts in CSV format with: perf stat -x, @@ -34,3 +35,9 @@ Show current config key-value pairs: perf config --list Show user configuration overrides: perf config --user --list To add Node.js USDT(User-Level Statically Defined Tracing): perf buildid-cache --add `which node` To report cacheline events from previous recording: perf c2c report +To browse sample contexts use perf report --sample 10 and select in context menu +To separate samples by time use perf report --sort time,overhead,sym +To set sample time separation other than 100ms with --sort time use --time-quantum +Add -I to perf report to sample register values visible in perf report context. +To show IPC for sampling periods use perf record -e '{cycles,instructions}:S' and then browse context +To show context switches in perf report sample context add --switch-events to perf record. |