summaryrefslogtreecommitdiffstats
path: root/tools/perf/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r--tools/perf/Documentation/Build.txt24
-rw-r--r--tools/perf/Documentation/Makefile1
-rw-r--r--tools/perf/Documentation/perf-config.txt16
-rw-r--r--tools/perf/Documentation/perf-list.txt12
-rw-r--r--tools/perf/Documentation/perf-record.txt31
-rw-r--r--tools/perf/Documentation/perf-report.txt13
-rw-r--r--tools/perf/Documentation/perf-script.txt3
-rw-r--r--tools/perf/Documentation/perf-stat.txt9
-rw-r--r--tools/perf/Documentation/perf.data-file-format.txt24
-rw-r--r--tools/perf/Documentation/perf.txt2
-rw-r--r--tools/perf/Documentation/tips.txt7
11 files changed, 137 insertions, 5 deletions
diff --git a/tools/perf/Documentation/Build.txt b/tools/perf/Documentation/Build.txt
index f6fc6507ba55..3766886c4bca 100644
--- a/tools/perf/Documentation/Build.txt
+++ b/tools/perf/Documentation/Build.txt
@@ -47,3 +47,27 @@ Those objects are then used in final linking:
NOTE this description is omitting other libraries involved, only
focusing on build framework outcomes
+
+3) Build with ASan or UBSan
+==========================
+ $ cd tools/perf
+ $ make DESTDIR=/usr
+ $ make DESTDIR=/usr install
+
+AddressSanitizer (or ASan) is a GCC feature that detects memory corruption bugs
+such as buffer overflows and memory leaks.
+
+ $ cd tools/perf
+ $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address'
+ $ ASAN_OPTIONS=log_path=asan.log ./perf record -a
+
+ASan outputs all detected issues into a log file named 'asan.log.<pid>'.
+
+UndefinedBehaviorSanitizer (or UBSan) is a fast undefined behavior detector
+supported by GCC. UBSan detects undefined behaviors of programs at runtime.
+
+ $ cd tools/perf
+ $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=undefined'
+ $ UBSAN_OPTIONS=print_stacktrace=1 ./perf record -a
+
+If UBSan detects any problem at runtime, it outputs a “runtime error:” message.
diff --git a/tools/perf/Documentation/Makefile b/tools/perf/Documentation/Makefile
index ac841bc5c35b..6d148a40551c 100644
--- a/tools/perf/Documentation/Makefile
+++ b/tools/perf/Documentation/Makefile
@@ -1,3 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0-only
include ../../scripts/Makefile.include
include ../../scripts/utilities.mak
diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index 86f3dcc15f83..462b3cde0675 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -114,7 +114,7 @@ Given a $HOME/.perfconfig like this:
[report]
# Defaults
- sort-order = comm,dso,symbol
+ sort_order = comm,dso,symbol
percent-limit = 0
queue-size = 0
children = true
@@ -584,6 +584,20 @@ llvm.*::
llvm.opts::
Options passed to llc.
+samples.*::
+
+ samples.context::
+ Define how many ns worth of time to show
+ around samples in perf report sample context browser.
+
+scripts.*::
+
+ Any option defines a script that is added to the scripts menu
+ in the interactive perf browser and whose output is displayed.
+ The name of the option is the name, the value is a script command line.
+ The script gets the same options passed as a full perf script,
+ in particular -i perfdata file, --cpu, --tid
+
SEE ALSO
--------
linkperf:perf[1]
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 138fb6e94b3c..18ed1b0fceb3 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -199,6 +199,18 @@ also be supplied. For example:
perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
+EVENT QUALIFIERS:
+
+It is also possible to add extra qualifiers to an event:
+
+percore:
+
+Sums up the event counts for all hardware threads in a core, e.g.:
+
+
+ perf stat -e cpu/event=0,umask=0x3,percore=1/
+
+
EVENT GROUPS
------------
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 8f0c2be34848..de269430720a 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -406,7 +406,8 @@ symbolic names, e.g. on x86, ax, si. To list the available registers use
--intr-regs=ax,bx. The list of register is architecture dependent.
--user-regs::
-Capture user registers at sample time. Same arguments as -I.
+Similar to -I, but capture user registers at sample time. To list the available
+user registers use --user-regs=\?.
--running-time::
Record running and enabled time for read events (:S)
@@ -459,6 +460,30 @@ Set affinity mask of trace reading thread according to the policy defined by 'mo
node - thread affinity mask is set to NUMA node cpu mask of the processed mmap buffer
cpu - thread affinity mask is set to cpu of the processed mmap buffer
+--mmap-flush=number::
+
+Specify minimal number of bytes that is extracted from mmap data pages and
+processed for output. One can specify the number using B/K/M/G suffixes.
+
+The maximal allowed value is a quarter of the size of mmaped data pages.
+
+The default option value is 1 byte which means that every time that the output
+writing thread finds some new data in the mmaped buffer the data is extracted,
+possibly compressed (-z) and written to the output, perf.data or pipe.
+
+Larger data chunks are compressed more effectively in comparison to smaller
+chunks so extraction of larger chunks from the mmap data pages is preferable
+from the perspective of output size reduction.
+
+Also at some cases executing less output write syscalls with bigger data size
+can take less time than executing more output write syscalls with smaller data
+size thus lowering runtime profiling overhead.
+
+-z::
+--compression-level[=n]::
+Produce compressed trace using specified level n (default: 1 - fastest compression,
+22 - smallest trace)
+
--all-kernel::
Configure all used events to run in kernel space.
@@ -495,6 +520,10 @@ overhead. You can still switch them on with:
--switch-output --no-no-buildid --no-no-buildid-cache
+--switch-max-files=N::
+
+When rotating perf.data with --switch-output, only keep N files.
+
--dry-run::
Parse options then exit. --dry-run can be used to detect errors in cmdline
options.
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 1a27bfe05039..f441baa794ce 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -105,6 +105,8 @@ OPTIONS
guest machine
- sample: Number of sample
- period: Raw number of event count of sample
+ - time: Separate the samples by time stamp with the resolution specified by
+ --time-quantum (default 100ms). Specify with overhead and before it.
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
@@ -459,6 +461,10 @@ include::itrace.txt[]
--socket-filter::
Only report the samples on the processor socket that match with this filter
+--samples=N::
+ Save N individual samples for each histogram entry to show context in perf
+ report tui browser.
+
--raw-trace::
When displaying traceevent output, do not use print fmt or plugins.
@@ -477,6 +483,9 @@ include::itrace.txt[]
Please note that not all mmaps are stored, options affecting which ones
are include 'perf record --data', for instance.
+--ns::
+ Show time stamps in nanoseconds.
+
--stats::
Display overall events statistics without any further processing.
(like the one at the end of the perf report -D command)
@@ -494,6 +503,10 @@ include::itrace.txt[]
The period/hits keywords set the base the percentage is computed
on - the samples period or the number of samples (hits).
+--time-quantum::
+ Configure time quantum for time sort key. Default 100ms.
+ Accepts s, us, ms, ns units.
+
include::callchain-overhead-calculation.txt[]
SEE ALSO
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 2e19fd7ffe35..9b0d04dd2a61 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -380,6 +380,9 @@ include::itrace.txt[]
Set the maximum number of program blocks to print with brstackasm for
each sample.
+--reltime::
+ Print time stamps relative to trace start.
+
--per-event-dump::
Create per event files with a "perf.data.EVENT.dump" name instead of
printing to stdout, useful, for instance, for generating flamegraphs.
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4bc2085e5197..1e312c2672e4 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -43,6 +43,10 @@ report::
param1 and param2 are defined as formats for the PMU in
/sys/bus/event_source/devices/<pmu>/format/*
+ 'percore' is a event qualifier that sums up the event counts for both
+ hardware threads in a core. For example:
+ perf stat -A -a -e cpu/event,percore=1/,otherevent ...
+
- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
where M, N, K are numbers (in decimal, hex, octal format).
Acceptable values for each of 'config', 'config1' and 'config2'
@@ -72,9 +76,8 @@ report::
--all-cpus::
system-wide collection from all CPUs (default if no target is specified)
--c::
---scale::
- scale/normalize counter values
+--no-scale::
+ Don't scale/normalize counter values
-d::
--detailed::
diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index 593ef49b273c..6967e9b02be5 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -272,6 +272,19 @@ struct {
Two uint64_t for the time of first sample and the time of last sample.
+ HEADER_COMPRESSED = 27,
+
+struct {
+ u32 version;
+ u32 type;
+ u32 level;
+ u32 ratio;
+ u32 mmap_len;
+};
+
+Indicates that trace contains records of PERF_RECORD_COMPRESSED type
+that have perf_events records in compressed form.
+
other bits are reserved and should ignored for now
HEADER_FEAT_BITS = 256,
@@ -437,6 +450,17 @@ struct auxtrace_error_event {
Describes a header feature. These are records used in pipe-mode that
contain information that otherwise would be in perf.data file's header.
+ PERF_RECORD_COMPRESSED = 81,
+
+struct compressed_event {
+ struct perf_event_header header;
+ char data[];
+};
+
+The header is followed by compressed data frame that can be decompressed
+into array of perf trace records. The size of the entire compressed event
+record including the header is limited by the max value of header.size.
+
Event types
Define the event attributes with their IDs.
diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
index 864e37597252..401f0ed67439 100644
--- a/tools/perf/Documentation/perf.txt
+++ b/tools/perf/Documentation/perf.txt
@@ -22,6 +22,8 @@ OPTIONS
verbose - general debug messages
ordered-events - ordered events object debug messages
data-convert - data convert command debug messages
+ stderr - write debug output (option -v) to stderr
+ in browser mode
--buildid-dir::
Setup buildid cache directory. It has higher priority than
diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt
index 849599f39c5e..869965d629ce 100644
--- a/tools/perf/Documentation/tips.txt
+++ b/tools/perf/Documentation/tips.txt
@@ -15,6 +15,7 @@ To see callchains in a more compact form: perf report -g folded
Show individual samples with: perf script
Limit to show entries above 5% only: perf report --percent-limit 5
Profiling branch (mis)predictions with: perf record -b / perf report
+To show assembler sample contexts use perf record -b / perf script -F +brstackinsn --xed
Treat branches as callchains: perf report --branch-history
To count events in every 1000 msec: perf stat -I 1000
Print event counts in CSV format with: perf stat -x,
@@ -34,3 +35,9 @@ Show current config key-value pairs: perf config --list
Show user configuration overrides: perf config --user --list
To add Node.js USDT(User-Level Statically Defined Tracing): perf buildid-cache --add `which node`
To report cacheline events from previous recording: perf c2c report
+To browse sample contexts use perf report --sample 10 and select in context menu
+To separate samples by time use perf report --sort time,overhead,sym
+To set sample time separation other than 100ms with --sort time use --time-quantum
+Add -I to perf report to sample register values visible in perf report context.
+To show IPC for sampling periods use perf record -e '{cycles,instructions}:S' and then browse context
+To show context switches in perf report sample context add --switch-events to perf record.