summaryrefslogtreecommitdiffstats
path: root/tools
Commit message (Collapse)AuthorAgeFilesLines
* selftests/bpf: Test indirect var_off stack access in raw modeAndrey Ignatov2019-04-051-0/+27
| | | | | | | | | | | | | Test that verifier rejects indirect access to uninitialized stack with variable offset. Example of output: # ./test_verifier ... #859/p indirect variable-offset stack access, uninitialized OK Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples, selftests/bpf: add NULL check for ksym_searchDaniel T. Lee2019-04-041-2/+2
| | | | | | | | | | Since, ksym_search added with verification logic for symbols existence, it could return NULL when the kernel symbols are not loaded. This commit will add NULL check logic after ksym_search. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* selftests/bpf: ksym_search won't check symbols existsDaniel T. Lee2019-04-041-0/+4
| | | | | | | | | | | | | | | Currently, ksym_search located at trace_helpers won't check symbols are existing or not. In ksym_search, when symbol is not found, it will return &syms[0](_stext). But when the kernel symbols are not loaded, it will return NULL, which is not a desired action. This commit will add verification logic whether symbols are loaded prior to the symbol search. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* selftests/bpf: synthetic tests to push verifier limitsAlexei Starovoitov2019-04-042-9/+35
| | | | | | | | | | | | Add a test to generate 1m ld_imm64 insns to stress the verifier. Bump the size of fill_ld_abs_vlan_push_pop test from 4k to 29k and jump_around_ld_abs from 4k to 5.5k. Larger sizes are not possible due to 16-bit offset encoding in jump instructions. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* selftests/bpf: add few verifier scale testsAlexei Starovoitov2019-04-047-1/+215
| | | | | | | | | | | | | | | | | | | | | | | Add 3 basic tests that stress verifier scalability. test_verif_scale1.c calls non-inlined jhash() function 90 times on different position in the packet. This test simulates network packet parsing. jhash function is ~140 instructions and main program is ~1200 insns. test_verif_scale2.c force inlines jhash() function 90 times. This program is ~15k instructions long. test_verif_scale3.c calls non-inlined jhash() function 90 times on But this time jhash has to process 32-bytes from the packet instead of 14-bytes in tests 1 and 2. jhash function is ~230 insns and main program is ~1200 insns. $ test_progs -s can be used to see verifier stats. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* libbpf: teach libbpf about log_level bit 2Alexei Starovoitov2019-04-044-4/+17
| | | | | | | | | | | | | | | | | Allow bpf_prog_load_xattr() to specify log_level for program loading. Teach libbpf to accept log_level with bit 2 set. Increase default BPF_LOG_BUF_SIZE from 256k to 16M. There is no downside to increase it to a maximum allowed by old kernels. Existing 256k limit caused ENOSPC errors and users were not able to see verifier error which is printed at the end of the verifier log. If ENOSPC is hit, double the verifier log and try again to capture the verifier error. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* selftests: bpf: remove duplicate .flags initialization in ctx_skb.cStanislav Fomichev2019-04-021-1/+0Star
| | | | | | | | | verifier/ctx_skb.c:708:11: warning: initializer overrides prior initialization of this subobject [-Winitializer-overrides] .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* selftests: bpf: fix -Wformat-invalid-specifier for bpf_obj_id.cStanislav Fomichev2019-04-021-4/+4
| | | | | | | | Use standard C99 %zu for sizeof, not GCC's custom %Zu: bpf_obj_id.c:76:48: warning: invalid conversion specifier 'Z' Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* selftests: bpf: fix -Wformat-security warning for flow_dissector_load.cStanislav Fomichev2019-04-021-1/+1
| | | | | | | | | | | | | | | flow_dissector_load.c:55:19: warning: format string is not a string literal (potentially insecure) [-Wformat-security] error(1, errno, command); ^~~~~~~ flow_dissector_load.c:55:19: note: treat the string as an argument to avoid this error(1, errno, command); ^ "%s", 1 warning generated. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* selftests: bpf: tests.h should depend on .c files, not the outputStanislav Fomichev2019-04-021-2/+2
| | | | | | | | | | | | | | | This makes sure we don't put headers as input files when doing compilation, because clang complains about the following: clang-9: error: cannot specify -o when generating multiple output files ../lib.mk:152: recipe for target 'xxx/tools/testing/selftests/bpf/test_verifier' failed make: *** [xxx/tools/testing/selftests/bpf/test_verifier] Error 1 make: *** Waiting for unfinished jobs.... clang-9: error: cannot specify -o when generating multiple output files ../lib.mk:152: recipe for target 'xxx/tools/testing/selftests/bpf/test_progs' failed Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf: add bpffs multi-dimensional array tests in test_btfYonghong Song2019-04-011-8/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For multiple dimensional arrays like below, int a[2][3] both llvm and pahole generated one BTF_KIND_ARRAY type like . element_type: int . index_type: unsigned int . number of elements: 6 Such a collapsed BTF_KIND_ARRAY type will cause the divergence in BTF vs. the user code. In the compile-once-run-everywhere project, the header file is generated from BTF and used for bpf program, and the definition in the header file will be different from what user expects. But the kernel actually supports chained multi-dimensional array types properly. The above "int a[2][3]" can be represented as Type #n: . element_type: int . index_type: unsigned int . number of elements: 3 Type #(n+1): . element_type: type #n . index_type: unsigned int . number of elements: 2 The following llvm commit https://reviews.llvm.org/rL357215 also enables llvm to generated proper chained multi-dimensional arrays. The test_btf already has a raw test ("struct test #1") for chained multi-dimensional arrays. This patch added amended bpffs test for chained multi-dimensional arrays. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* selftests/bpf: Test variable offset stack accessAndrey Ignatov2019-03-291-2/+77
| | | | | | | | | | | | | | | | | Test different scenarios of indirect variable-offset stack access: out of bound access (>0), min_off below initialized part of the stack, max_off+size above initialized part of the stack, initialized stack. Example of output: ... #856/p indirect variable-offset stack access, out of bound OK #857/p indirect variable-offset stack access, max_off+size > max_initialized OK #858/p indirect variable-offset stack access, min_off < min_initialized OK #859/p indirect variable-offset stack access, ok OK ... Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* tools/bpf: generate pkg-config file for libbpfLuca Boccassi2019-03-283-3/+28
| | | | | | | | | Generate a libbpf.pc file at build time so that users can rely on pkg-config to find the library, its CFLAGS and LDFLAGS. Signed-off-by: Luca Boccassi <bluca@debian.org> Acked-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2019-03-28137-1343/+4676
|\
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2019-03-2730-143/+1037
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "Fixes here and there, a couple new device IDs, as usual: 1) Fix BQL race in dpaa2-eth driver, from Ioana Ciornei. 2) Fix 64-bit division in iwlwifi, from Arnd Bergmann. 3) Fix documentation for some eBPF helpers, from Quentin Monnet. 4) Some UAPI bpf header sync with tools, also from Quentin Monnet. 5) Set descriptor ownership bit at the right time for jumbo frames in stmmac driver, from Aaro Koskinen. 6) Set IFF_UP properly in tun driver, from Eric Dumazet. 7) Fix load/store doubleword instruction generation in powerpc eBPF JIT, from Naveen N. Rao. 8) nla_nest_start() return value checks all over, from Kangjie Lu. 9) Fix asoc_id handling in SCTP after the SCTP_*_ASSOC changes this merge window. From Marcelo Ricardo Leitner and Xin Long. 10) Fix memory corruption with large MTUs in stmmac, from Aaro Koskinen. 11) Do not use ipv4 header for ipv6 flows in TCP and DCCP, from Eric Dumazet. 12) Fix topology subscription cancellation in tipc, from Erik Hugne. 13) Memory leak in genetlink error path, from Yue Haibing. 14) Valid control actions properly in packet scheduler, from Davide Caratti. 15) Even if we get EEXIST, we still need to rehash if a shrink was delayed. From Herbert Xu. 16) Fix interrupt mask handling in interrupt handler of r8169, from Heiner Kallweit. 17) Fix leak in ehea driver, from Wen Yang" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (168 commits) dpaa2-eth: fix race condition with bql frame accounting chelsio: use BUG() instead of BUG_ON(1) net: devlink: skip info_get op call if it is not defined in dumpit net: phy: bcm54xx: Encode link speed and activity into LEDs tipc: change to check tipc_own_id to return in tipc_net_stop net: usb: aqc111: Extend HWID table by QNAP device net: sched: Kconfig: update reference link for PIE net: dsa: qca8k: extend slave-bus implementations net: dsa: qca8k: remove leftover phy accessors dt-bindings: net: dsa: qca8k: support internal mdio-bus dt-bindings: net: dsa: qca8k: fix example net: phy: don't clear BMCR in genphy_soft_reset bpf, libbpf: clarify bump in libbpf version info bpf, libbpf: fix version info and add it to shared object rxrpc: avoid clang -Wuninitialized warning tipc: tipc clang warning net: sched: fix cleanup NULL pointer exception in act_mirr r8169: fix cable re-plugging issue net: ethernet: ti: fix possible object reference leak net: ibm: fix possible object reference leak ...
| | * Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2019-03-253-14/+54
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexei Starovoitov says: ==================== pull-request: bpf 2019-03-24 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) libbpf verision fix up from Daniel. 2) fix liveness propagation from Jakub. 3) fix verbose print of refcounted regs from Martin. 4) fix for large map allocations from Martynas. 5) fix use after free in sanitize_ptr_alu from Xu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * bpf, libbpf: clarify bump in libbpf version infoDaniel Borkmann2019-03-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current documentation suggests that we would need to bump the libbpf version on every change. Lets clarify this a bit more and reflect what we do today in practice, that is, bumping it once per development cycle. Fixes: 76d1b894c515 ("libbpf: Document API and ABI conventions") Reported-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | | * bpf, libbpf: fix version info and add it to shared objectDaniel Borkmann2019-03-251-14/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even though libbpf's versioning script for the linker (libbpf.map) is pointing to 0.0.2, the BPF_EXTRAVERSION in the Makefile has not been updated along with it and is therefore still on 0.0.1. While fixing up, I also noticed that the generated shared object versioning information is missing, typical convention is to have a linker name (libbpf.so), soname (libbpf.so.0) and real name (libbpf.so.0.0.2) for library management. This is based upon the LIBBPF_VERSION as well. The build will then produce the following bpf libraries: # ll libbpf* libbpf.a libbpf.so -> libbpf.so.0.0.2 libbpf.so.0 -> libbpf.so.0.0.2 libbpf.so.0.0.2 # readelf -d libbpf.so.0.0.2 | grep SONAME 0x000000000000000e (SONAME) Library soname: [libbpf.so.0] And install them accordingly: # rm -rf /tmp/bld; mkdir /tmp/bld; make -j$(nproc) O=/tmp/bld install Auto-detecting system features: ... libelf: [ on ] ... bpf: [ on ] CC /tmp/bld/libbpf.o CC /tmp/bld/bpf.o CC /tmp/bld/nlattr.o CC /tmp/bld/btf.o CC /tmp/bld/libbpf_errno.o CC /tmp/bld/str_error.o CC /tmp/bld/netlink.o CC /tmp/bld/bpf_prog_linfo.o CC /tmp/bld/libbpf_probes.o CC /tmp/bld/xsk.o LD /tmp/bld/libbpf-in.o LINK /tmp/bld/libbpf.a LINK /tmp/bld/libbpf.so.0.0.2 LINK /tmp/bld/test_libbpf INSTALL /tmp/bld/libbpf.a INSTALL /tmp/bld/libbpf.so.0.0.2 # ll /usr/local/lib64/libbpf.* /usr/local/lib64/libbpf.a /usr/local/lib64/libbpf.so -> libbpf.so.0.0.2 /usr/local/lib64/libbpf.so.0 -> libbpf.so.0.0.2 /usr/local/lib64/libbpf.so.0.0.2 Fixes: 1bf4b05810fe ("tools: bpftool: add probes for eBPF program types") Fixes: 1b76c13e4b36 ("bpf tools: Introduce 'bpf' library and add bpf feature check") Reported-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | | * bpf: verifier: propagate liveness on all framesJakub Kicinski2019-03-221-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 7640ead93924 ("bpf: verifier: make sure callees don't prune with caller differences") connected up parentage chains of all frames of the stack. It didn't, however, ensure propagate_liveness() propagates all liveness information along those chains. This means pruning happening in the callee may generate explored states with incomplete liveness for the chains in lower frames of the stack. The included selftest is similar to the prior one from commit 7640ead93924 ("bpf: verifier: make sure callees don't prune with caller differences"), where callee would prune regardless of the difference in r8 state. Now we also initialize r9 to 0 or 1 based on a result from get_random(). r9 is never read so the walk with r9 = 0 gets pruned (correctly) after the walk with r9 = 1 completes. The selftest is so arranged that the pruning will happen in the callee. Since callee does not propagate read marks of r8, the explored state at the pruning point prior to the callee will now ignore r8. Propagate liveness on all frames of the stack when pruning. Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * | net/sched: act_vlan: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action vlan pop pass index 90 # tc actions replace action vlan \ > pop goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action vlan had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: vlan pop goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000007974f067 P4D 800000007974f067 PUD 79638067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.0.0-rc4.gotochain_crash+ #536 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff982dfdb83be0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff982dfc55db00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff982df97099c0 RDI: ffff982dfc55db00 RBP: ffff982dfdb83c80 R08: ffff982df983fec8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff982df5aacd00 R13: ffff982df5aacd08 R14: 0000000000000001 R15: ffff982df97099c0 FS: 0000000000000000(0000) GS:ffff982dfdb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000796d0005 CR4: 00000000001606e0 Call Trace: <IRQ> tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip6_finish_output2+0x369/0x590 ip6_finish_output2+0x369/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.35+0x79/0xc0 mld_sendpack+0x16f/0x220 mld_ifc_timer_expire+0x195/0x2c0 ? igmp6_timer_handler+0x70/0x70 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? enqueue_hrtimer+0x39/0x90 __do_softirq+0xe3/0x2f5 irq_exit+0xf0/0x100 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 Code: 7b ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffffa4714038feb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffff840184f0 RBX: 0000000000000003 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000001e57d3f387 RBP: 0000000000000003 R08: 001125d9ca39e1eb R09: 0000000000000000 R10: 000000000000027d R11: 000000000009f400 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? __sched_text_end+0x1/0x1 default_idle+0x1c/0x140 do_idle+0x1c4/0x280 cpu_startup_entry+0x19/0x20 start_secondary+0x1a7/0x200 secondary_startup_64+0xa4/0xb0 Modules linked in: act_vlan veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic mbcache crct10dif_pclmul jbd2 snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper joydev snd_timer virtio_balloon snd pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt virtio_net fb_sys_fops virtio_blk ttm net_failover virtio_console failover ata_piix drm libata crc32c_intel virtio_pci serio_raw virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_vlan_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_tunnel_key: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action tunnel_key set src_ip 10.10.10.1 dst_ip 20.20.2 dst_port 3128 \ > nocsum id 1 pass index 90 # tc actions replace action tunnel_key \ > set src_ip 10.10.10.1 dst_ip 20.20.2 dst_port 3128 nocsum id 1 \ > goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action tunnel_key had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: tunnel_key set src_ip 10.10.10.1 dst_ip 20.20.2.0 key_id 1 dst_port 3128 nocsum goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002aba4067 P4D 800000002aba4067 PUD 795f9067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.0.0-rc4.gotochain_crash+ #536 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff9346bdb83be0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9346bb795c00 RCX: 0000000000000002 RDX: 0000000000000000 RSI: ffff93466c881700 RDI: 0000000000000246 RBP: ffff9346bdb83c80 R08: ffff9346b3e1e0c8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9346b978f000 R13: ffff9346b978f008 R14: 0000000000000001 R15: ffff93466dceeb40 FS: 0000000000000000(0000) GS:ffff9346bdb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007a6c2002 CR4: 00000000001606e0 Call Trace: <IRQ> tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip6_finish_output2+0x369/0x590 ip6_finish_output2+0x369/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.35+0x79/0xc0 mld_sendpack+0x16f/0x220 mld_ifc_timer_expire+0x195/0x2c0 ? igmp6_timer_handler+0x70/0x70 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? tick_sched_timer+0x37/0x70 __do_softirq+0xe3/0x2f5 irq_exit+0xf0/0x100 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 Code: 55 ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffffa48a8038feb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffffaa8184f0 RBX: 0000000000000003 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000087 RDI: 0000000000000003 RBP: 0000000000000003 R08: 0011251c6fcfac49 R09: ffff9346b995be00 R10: ffffa48a805e7ce8 R11: 00000000024c38dd R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? __sched_text_end+0x1/0x1 default_idle+0x1c/0x140 do_idle+0x1c4/0x280 cpu_startup_entry+0x19/0x20 start_secondary+0x1a7/0x200 secondary_startup_64+0xa4/0xb0 Modules linked in: act_tunnel_key veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel mbcache snd_hda_intel jbd2 snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper joydev snd_timer snd pcspkr virtio_balloon soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect virtio_net sysimgblt fb_sys_fops ttm net_failover virtio_console virtio_blk failover drm serio_raw crc32c_intel ata_piix virtio_pci floppy virtio_ring libata virtio dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_tunnel_key_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_skbmod: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action skbmod set smac 00:c1:a0:c1:a0:00 pass index 90 # tc actions replace action skbmod \ > set smac 00:c1:a0:c1:a0:00 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action skbmod had the following output: src MAC address <00:c1:a0:c1:a0:00> src MAC address <00:c1:a0:c1:a0:00> Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: skbmod goto chain 42 set smac 00:c1:a0:c1:a0:00 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002d5c7067 P4D 800000002d5c7067 PUD 77e16067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.0.0-rc4.gotochain_crash+ #536 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff8987ffd83be0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff8987aeb68800 RCX: ffff8987fa263640 RDX: 0000000000000000 RSI: ffff8987f51c8802 RDI: 00000000000000a0 RBP: ffff8987ffd83c80 R08: ffff8987f939bac8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8987f5c77d00 R13: ffff8987f5c77d08 R14: 0000000000000001 R15: ffff8987f0c29f00 FS: 0000000000000000(0000) GS:ffff8987ffd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007832c004 CR4: 00000000001606e0 Call Trace: <IRQ> tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip6_finish_output2+0x369/0x590 ip6_finish_output2+0x369/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.35+0x79/0xc0 mld_sendpack+0x16f/0x220 mld_ifc_timer_expire+0x195/0x2c0 ? igmp6_timer_handler+0x70/0x70 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? tick_sched_timer+0x37/0x70 __do_softirq+0xe3/0x2f5 irq_exit+0xf0/0x100 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 Code: 56 ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffffa2a1c038feb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffffa94184f0 RBX: 0000000000000003 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000087 RDI: 0000000000000003 RBP: 0000000000000003 R08: 001123cfc2ba71ac R09: 0000000000000000 R10: 0000000000000000 R11: 00000000000f4240 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? __sched_text_end+0x1/0x1 default_idle+0x1c/0x140 do_idle+0x1c4/0x280 cpu_startup_entry+0x19/0x20 start_secondary+0x1a7/0x200 secondary_startup_64+0xa4/0xb0 Modules linked in: act_skbmod veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device aesni_intel crypto_simd cryptd glue_helper snd_pcm joydev pcspkr virtio_balloon snd_timer snd i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect virtio_net sysimgblt fb_sys_fops net_failover virtio_console ttm virtio_blk failover drm crc32c_intel serio_raw ata_piix virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_skbmod_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_skbedit: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action skbedit ptype host pass index 90 # tc actions replace action skbedit \ > ptype host goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action skbedit had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: skbedit ptype host goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 3467 Comm: kworker/3:3 Not tainted 5.0.0-rc4.gotochain_crash+ #536 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffb50a81e1fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9aa47ba4ea00 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffff9aa469eeb3c0 RDI: ffff9aa47ba4ea00 RBP: ffffb50a81e1fb70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: ffff9aa47bce0638 R12: ffff9aa4793b0c00 R13: ffff9aa4793b0c08 R14: 0000000000000001 R15: ffff9aa469eeb3c0 FS: 0000000000000000(0000) GS:ffff9aa474780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007360e005 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_skbedit veth ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep mbcache snd_hda_core jbd2 snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd snd_timer glue_helper snd joydev soundcore pcspkr virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net net_failover drm failover virtio_blk virtio_console ata_piix virtio_pci crc32c_intel serio_raw libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_skbedit_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_simple: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action simple sdata hello pass index 90 # tc actions replace action simple \ > sdata world goto chain 42 index 90 cookie c1a0c1a0 # tc action show action simple had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: Simple <world> index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000006a6fb067 P4D 800000006a6fb067 PUD 6aed6067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 2 PID: 3241 Comm: kworker/2:0 Not tainted 5.0.0-rc4.gotochain_crash+ #536 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffbe6781763ad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9e59bdb80e00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9e59b4716738 RDI: ffff9e59ab12d140 RBP: ffffbe6781763b70 R08: 0000000000000234 R09: 0000000000aaaaaa R10: 0000000000000000 R11: ffff9e59b247cd50 R12: ffff9e59b112f100 R13: ffff9e59b112f108 R14: 0000000000000001 R15: ffff9e59ab12d0c0 FS: 0000000000000000(0000) GS:ffff9e59b4700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000006af92004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_simple veth ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep mbcache snd_hda_core jbd2 snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd snd_timer glue_helper snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_net ttm net_failover virtio_console virtio_blk failover drm crc32c_intel serio_raw floppy ata_piix libata virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_simple_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_sample: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action sample rate 1024 group 4 pass index 90 # tc actions replace action sample \ > rate 1024 group 4 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action sample had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: sample rate 1/1024 group 4 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 8000000079966067 P4D 8000000079966067 PUD 7987b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.0.0-rc4.gotochain_crash+ #536 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffbee60033fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff99d7ae6e3b00 RCX: 00000000e555df9b RDX: 0000000000000000 RSI: 00000000b0352718 RDI: ffff99d7fda1fcf0 RBP: ffffbee60033fb70 R08: 0000000070731ab1 R09: 0000000000000400 R10: 0000000000000000 R11: ffff99d7ac733838 R12: ffff99d7f3c2be00 R13: ffff99d7f3c2be08 R14: 0000000000000001 R15: ffff99d7f3c2b600 FS: 0000000000000000(0000) GS:ffff99d7fda00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000797de006 CR4: 00000000001606f0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_sample psample veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device aesni_intel crypto_simd snd_pcm cryptd glue_helper snd_timer joydev snd pcspkr virtio_balloon i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect virtio_net sysimgblt fb_sys_fops net_failover ttm failover virtio_blk virtio_console drm ata_piix serio_raw crc32c_intel libata virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_sample_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_police: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action police rate 3mbit burst 250k pass index 90 # tc actions replace action police \ > rate 3mbit burst 250k goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action police rate 3mbit burst had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: police 0x5a rate 3Mbit burst 250Kb mtu 2Kb action goto chain 42 overhead 0b ref 2 bind 1 cookie c1a0c1a0 Then, when crash0 starts transmitting more than 3Mbit/s, the following kernel crash is observed: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000007a779067 P4D 800000007a779067 PUD 2ad96067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 5032 Comm: netperf Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffb0e04064fa60 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff93bb3322cce0 RCX: 0000000000000005 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff93bb3322cce0 RBP: ffffb0e04064fb00 R08: 0000000000000022 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffff93bb3beed300 R13: ffff93bb3beed308 R14: 0000000000000001 R15: ffff93bb3b64d000 FS: 00007f0bc6be5740(0000) GS:ffff93bb3db80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000746a8001 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ipt_do_table+0x31c/0x420 [ip_tables] ? ip_finish_output2+0x16f/0x430 ip_finish_output2+0x16f/0x430 ? ip_output+0x69/0xe0 ip_output+0x69/0xe0 ? ip_forward_options+0x1a0/0x1a0 __tcp_transmit_skb+0x563/0xa40 tcp_write_xmit+0x243/0xfa0 __tcp_push_pending_frames+0x32/0xf0 tcp_sendmsg_locked+0x404/0xd30 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x36/0x40 __sys_sendto+0x10e/0x140 ? __sys_connect+0x87/0xf0 ? syscall_trace_enter+0x1df/0x2e0 ? __audit_syscall_exit+0x216/0x260 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f0bc5ffbafd Code: 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 8b 05 ae c4 2c 00 85 c0 75 2d 45 31 c9 45 31 c0 4c 63 d1 48 63 ff b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 63 63 2c 00 f7 d8 64 89 02 48 RSP: 002b:00007fffef94b7f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000004000 RCX: 00007f0bc5ffbafd RDX: 0000000000004000 RSI: 00000000017e5420 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004 R13: 00000000017e51d0 R14: 0000000000000010 R15: 0000000000000006 Modules linked in: act_police veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic mbcache crct10dif_pclmul jbd2 crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper snd_timer snd joydev pcspkr virtio_balloon soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_blk virtio_net virtio_console net_failover failover crc32c_intel ata_piix libata serio_raw virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_police_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_pedit: validate the control action inside init()Davide Caratti2019-03-211-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc filter add dev crash0 egress matchall \ > action pedit ex munge ip ttl set 10 pass index 90 # tc actions replace action pedit \ > ex munge ip ttl set 10 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action pedit had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: pedit action goto chain 42 keys 1 index 90 ref 2 bind 1 key #0 at ipv4+8: val 0a000000 mask 00ffffff cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff94a73db03be0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff94a6ee4c0700 RCX: 000000000000000a RDX: 0000000000000000 RSI: ffff94a6ed22c800 RDI: 0000000000000000 RBP: ffff94a73db03c80 R08: ffff94a7386fa4c8 R09: ffff94a73229ea20 R10: 0000000000000000 R11: 0000000000000000 R12: ffff94a6ed22cb00 R13: ffff94a6ed22cb08 R14: 0000000000000001 R15: ffff94a6ed22c800 FS: 0000000000000000(0000) GS:ffff94a73db00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007120e002 CR4: 00000000001606e0 Call Trace: <IRQ> tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip6_finish_output2+0x369/0x590 ip6_finish_output2+0x369/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.35+0x79/0xc0 mld_sendpack+0x16f/0x220 mld_ifc_timer_expire+0x195/0x2c0 ? igmp6_timer_handler+0x70/0x70 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? tick_sched_timer+0x37/0x70 __do_softirq+0xe3/0x2f5 irq_exit+0xf0/0x100 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 Code: 4e ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffffab1740387eb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffffb18184f0 RBX: 0000000000000002 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000087 RDI: 0000000000000002 RBP: 0000000000000002 R08: 000f168fa695f9a9 R09: 0000000000000020 R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? __sched_text_end+0x1/0x1 default_idle+0x1c/0x140 do_idle+0x1c4/0x280 cpu_startup_entry+0x19/0x20 start_secondary+0x1a7/0x200 secondary_startup_64+0xa4/0xb0 Modules linked in: act_pedit veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep aesni_intel snd_hda_core crypto_simd snd_seq cryptd glue_helper snd_seq_device snd_pcm joydev snd_timer pcspkr virtio_balloon snd soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs qxl ata_generic pata_acpi drm_kms_helper virtio_net net_failover syscopyarea sysfillrect sysimgblt failover virtio_blk fb_sys_fops virtio_console ttm drm crc32c_intel serio_raw ata_piix virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_pedit_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_nat: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action nat ingress 1.18.1.1 1.18.2.2 pass index 90 # tc actions replace action nat \ > ingress 1.18.1.1 1.18.2.2 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action nat had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: nat ingress 1.18.1.1/32 1.18.2.2 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002d180067 P4D 800000002d180067 PUD 7cb8b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 164 Comm: kworker/3:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffae4500e2fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9fa52e28c800 RCX: 0000000001011201 RDX: 0000000000000000 RSI: 0000000000000056 RDI: ffff9fa52ca12800 RBP: ffffae4500e2fb70 R08: 0000000000000022 R09: 000000000000000e R10: 00000000ffffffff R11: 0000000001011201 R12: ffff9fa52cbc9c00 R13: ffff9fa52cbc9c08 R14: 0000000000000001 R15: ffff9fa52ca12780 FS: 0000000000000000(0000) GS:ffff9fa57db80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000073f8c004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_nat veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper snd_timer snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs qxl ata_generic pata_acpi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk net_failover failover virtio_console drm crc32c_intel floppy ata_piix libata virtio_pci virtio_ring virtio serio_raw dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_nat_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_connmark: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action connmark pass index 90 # tc actions replace action connmark \ > goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action connmark had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: connmark zone 0 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 302 Comm: kworker/0:2 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff9bea406c3ad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff8c5dfc009f00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9bea406c3a80 RDI: ffff8c5dfb9d6ec0 RBP: ffff9bea406c3b70 R08: ffff8c5dfda222a0 R09: ffffffff90933c3c R10: 0000000000000000 R11: 0000000092793f7d R12: ffff8c5df48b3c00 R13: ffff8c5df48b3c08 R14: 0000000000000001 R15: ffff8c5dfb9d6e40 FS: 0000000000000000(0000) GS:ffff8c5dfda00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000062e0e006 CR4: 00000000001606f0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul mbcache crc32_pclmul jbd2 snd_hda_codec_generic ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel snd_timer crypto_simd cryptd snd glue_helper joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper virtio_net net_failover syscopyarea virtio_blk failover virtio_console sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix crc32c_intel serio_raw libata virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_connmark_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_mirred: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action mirred ingress mirror dev lo pass # tc actions replace action mirred \ > ingress mirror dev lo goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action mirred had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: mirred (Ingress Mirror to device lo) goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: Mirror/redirect action on BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 47 Comm: kworker/3:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffa772404b7ad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9c5afc3f4300 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9c5afdba9380 RDI: 0000000000029380 RBP: ffffa772404b7b70 R08: ffff9c5af7010028 R09: ffff9c5af7010029 R10: 0000000000000000 R11: ffff9c5af94c6a38 R12: ffff9c5af7953000 R13: ffff9c5af7953008 R14: 0000000000000001 R15: ffff9c5af7953d00 FS: 0000000000000000(0000) GS:ffff9c5afdb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007c514004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_mirred veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul snd_hda_codec_generic crc32_pclmul snd_hda_intel snd_hda_codec mbcache ghash_clmulni_intel jbd2 snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel snd_timer snd crypto_simd cryptd glue_helper soundcore virtio_balloon joydev pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_net ttm virtio_blk net_failover virtio_console failover drm ata_piix crc32c_intel virtio_pci serio_raw libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_mirred_init() proved to fix the above issue. For the same reason, postpone the assignment of tcfa_action and tcfm_eaction to avoid partial reconfiguration of a mirred rule when it's replaced by another one that mirrors to a device that does not exist. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_ife: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action ife encode allow mark pass index 90 # tc actions replace action ife \ > encode allow mark goto chain 42 index 90 cookie c1a0c1a0 # tc action show action ife had the following output: IFE type 0xED3E IFE type 0xED3E Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: ife encode action goto chain 42 type 0XED3E allow mark index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000007b4e7067 P4D 800000007b4e7067 PUD 7b4e6067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 2 PID: 164 Comm: kworker/2:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffa6a7c0553ad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9796ee1bbd00 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffa6a7c0553b70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: ffff9797385bb038 R12: ffff9796ead9d700 R13: ffff9796ead9d708 R14: 0000000000000001 R15: ffff9796ead9d800 FS: 0000000000000000(0000) GS:ffff97973db00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007c41e006 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_gact act_meta_mark act_ife dummy veth ip6table_filter ip6_tables iptable_filter binfmt_misc snd_hda_codec_generic ext4 snd_hda_intel snd_hda_codec crct10dif_pclmul mbcache crc32_pclmul jbd2 snd_hwdep snd_hda_core ghash_clmulni_intel snd_seq snd_seq_device snd_pcm snd_timer aesni_intel crypto_simd snd cryptd glue_helper virtio_balloon joydev pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl virtio_net drm_kms_helper virtio_blk net_failover syscopyarea failover sysfillrect virtio_console sysimgblt fb_sys_fops ttm drm crc32c_intel serio_raw ata_piix virtio_pci virtio_ring libata virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_ife] CR2: 0000000000000000 Validating the control action within tcf_ife_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_gact: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action gact pass index 90 # tc actions replace action gact \ > goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action gact had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: gact action goto chain 42 random type none pass val 0 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff8c2434703be0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff8c23ed6d7e00 RCX: 000000000000005a RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8c23ed6d7e00 RBP: ffff8c2434703c80 R08: ffff8c243b639ac8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c2429e68b00 R13: ffff8c2429e68b08 R14: 0000000000000001 R15: ffff8c2429c5a480 FS: 0000000000000000(0000) GS:ffff8c2434700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000002dc0e005 CR4: 00000000001606e0 Call Trace: <IRQ> tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip6_finish_output2+0x369/0x590 ip6_finish_output2+0x369/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.35+0x79/0xc0 mld_sendpack+0x16f/0x220 mld_ifc_timer_expire+0x195/0x2c0 ? igmp6_timer_handler+0x70/0x70 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? tick_sched_timer+0x37/0x70 __do_softirq+0xe3/0x2f5 irq_exit+0xf0/0x100 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 Code: 74 ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffff9c8640387eb8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffff8b2184f0 RBX: 0000000000000002 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000087 RDI: 0000000000000002 RBP: 0000000000000002 R08: 000eb57882b36cc3 R09: 0000000000000020 R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? __sched_text_end+0x1/0x1 default_idle+0x1c/0x140 do_idle+0x1c4/0x280 cpu_startup_entry+0x19/0x20 start_secondary+0x1a7/0x200 secondary_startup_64+0xa4/0xb0 Modules linked in: act_gact act_bpf veth ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic ext4 snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core mbcache jbd2 snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper virtio_balloon joydev pcspkr snd_timer snd i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea virtio_net sysfillrect net_failover virtio_blk sysimgblt fb_sys_fops virtio_console ttm failover drm crc32c_intel serio_raw ata_piix libata floppy virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_bpf] CR2: 0000000000000000 Validating the control action within tcf_gact_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_csum: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall action csum icmp pass index 90 # tc actions replace action csum icmp goto chain 42 index 90 \ > cookie c1a0c1a0 # tc actions show action csum had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: csum (icmp) action goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 8000000074692067 P4D 8000000074692067 PUD 2e210067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffff93153da03be0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9314ee40f700 RCX: 0000000000003a00 RDX: 0000000000000000 RSI: ffff931537c87828 RDI: ffff931537c87818 RBP: ffff93153da03c80 R08: 00000000527cffff R09: 0000000000000003 R10: 000000000000003f R11: 0000000000000028 R12: ffff9314edf68400 R13: ffff9314edf68408 R14: 0000000000000001 R15: ffff9314ed67b600 FS: 0000000000000000(0000) GS:ffff93153da00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000073e32003 CR4: 00000000001606f0 Call Trace: <IRQ> tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip6_finish_output2+0x369/0x590 ip6_finish_output2+0x369/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.35+0x79/0xc0 mld_sendpack+0x16f/0x220 mld_ifc_timer_expire+0x195/0x2c0 ? igmp6_timer_handler+0x70/0x70 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? tick_sched_timer+0x37/0x70 __do_softirq+0xe3/0x2f5 irq_exit+0xf0/0x100 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 Code: 66 ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffffffff9a803e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffffffff99e184f0 RBX: 0000000000000000 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000087 RDI: 0000000000000000 RBP: 0000000000000000 R08: 000eb5c4572376b3 R09: 0000000000000000 R10: ffffa53e806a3ca0 R11: 00000000000f4240 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? __sched_text_end+0x1/0x1 default_idle+0x1c/0x140 do_idle+0x1c4/0x280 cpu_startup_entry+0x19/0x20 start_kernel+0x49e/0x4be secondary_startup_64+0xa4/0xb0 Modules linked in: act_csum veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel snd_hda_intel mbcache snd_hda_codec jbd2 snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd snd_timer glue_helper snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect virtio_net sysimgblt net_failover fb_sys_fops virtio_console virtio_blk ttm failover drm ata_piix crc32c_intel floppy virtio_pci serio_raw libata virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_csum_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/sched: act_bpf: validate the control action inside init()Davide Caratti2019-03-211-0/+25
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the following script: # tc filter add dev crash0 egress matchall \ > action bpf bytecode '1,6 0 0 4294967295' pass index 90 # tc actions replace action bpf \ > bytecode '1,6 0 0 4294967295' goto chain 42 index 90 cookie c1a0c1a0 # tc action show action bpf had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: bpf bytecode '1,6 0 0 4294967295' default-action goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffb3a0803dfa90 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff942b347ada00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffb3a08034d038 RDI: ffff942b347ada00 RBP: ffffb3a0803dfb30 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: ffffb3a0803dfb0c R12: ffff942b3b682b00 R13: ffff942b3b682b08 R14: 0000000000000001 R15: ffff942b3b682f00 FS: 00007f6160a72740(0000) GS:ffff942b3da80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000795a4002 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ip_finish_output2+0x16f/0x430 ip_finish_output2+0x16f/0x430 ? ip_output+0x69/0xe0 ip_output+0x69/0xe0 ? ip_forward_options+0x1a0/0x1a0 ip_send_skb+0x15/0x40 raw_sendmsg+0x8e1/0xbd0 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0xc/0xa0 ? try_to_wake_up+0x54/0x480 ? ldsem_down_read+0x3f/0x280 ? _cond_resched+0x15/0x40 ? down_read+0xe/0x30 ? copy_termios+0x1e/0x70 ? tty_mode_ioctl+0x1b6/0x4c0 ? sock_sendmsg+0x36/0x40 sock_sendmsg+0x36/0x40 __sys_sendto+0x10e/0x140 ? do_vfs_ioctl+0xa4/0x640 ? handle_mm_fault+0xdc/0x210 ? syscall_trace_enter+0x1df/0x2e0 ? __audit_syscall_exit+0x216/0x260 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f615f7e3c03 Code: 48 8b 0d 90 62 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 9d c3 2c 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 4b cc 00 00 48 89 04 24 RSP: 002b:00007ffee5d8cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 000055a4f28f1700 RCX: 00007f615f7e3c03 RDX: 0000000000000040 RSI: 000055a4f28f1700 RDI: 0000000000000003 RBP: 00007ffee5d8e340 R08: 000055a4f28ee510 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 R13: 000055a4f28f16c0 R14: 000055a4f28ef69c R15: 0000000000000080 Modules linked in: act_bpf veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache crct10dif_pclmul jbd2 crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper pcspkr joydev virtio_balloon snd_timer snd i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper virtio_blk virtio_net virtio_console net_failover failover syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel ata_piix serio_raw libata virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_bpf_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2019-03-24109-1314/+3722
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Thomas Gleixner: "A larger set of perf updates. Not all of them are strictly fixes, but that's solely the tip maintainers fault as they let the timely -rc1 pull request fall through the cracks for various reasons including travel. So I'm sending this nevertheless because rebasing and distangling fixes and updates would be a mess and risky as well. As of tomorrow, a strict fixes separation is happening again. Sorry for the slip-up. Kernel: - Handle RECORD_MMAP vs. RECORD_MMAP2 correctly so different consumers of the mmap event get what they requested. Tools: - A larger set of updates to perf record/report/scripts vs. time stamp handling - More Python3 fixups - A pile of memory leak plumbing - perf BPF improvements and fixes - Finalize the perf.data directory storage" [ Note: the kernel part is strictly a fix, the updates are purely to tooling - Linus ] * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits) perf bpf: Show more BPF program info in print_bpf_prog_info() perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog() perf tools: Save bpf_prog_info and BTF of new BPF programs perf evlist: Introduce side band thread perf annotate: Enable annotation of BPF programs perf build: Check what binutils's 'disassembler()' signature to use perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO perf feature detection: Add -lopcodes to feature-libbfd perf top: Add option --no-bpf-event perf bpf: Save BTF information as headers to perf.data perf bpf: Save BTF in a rbtree in perf_env perf bpf: Save bpf_prog_info information as headers to perf.data perf bpf: Save bpf_prog_info in a rbtree in perf_env perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool perf bpf: Synthesize bpf events with bpf_program__get_prog_info_linear() bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump() tools lib bpf: Introduce bpf_program__get_prog_info_linear() perf record: Replace option --bpf-event with --no-bpf-event perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test() ...
| | * \ Merge tag 'perf-core-for-mingo-5.1-20190321' of ↵Thomas Gleixner2019-03-2278-1028/+1794
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo: BPF: Song Liu: - Add support for annotating BPF programs, using the PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL recently added to the kernel and plugging binutils's libopcodes disassembly of BPF programs with the existing annotation interfaces in 'perf annotate', 'perf report' and 'perf top' various output formats (--stdio, --stdio2, --tui). perf list: Andi Kleen: - Filter metrics when using substring search. perf record: Andi Kleen: - Allow to limit number of reported perf.data files - Clarify help for --switch-output. perf report: Andi Kleen - Indicate JITed code better. - Show all sort keys in help output. perf script: Andi Kleen: - Support relative time. perf stat: Andi Kleen: - Improve scaling. General: Changbin Du: - Fix some mostly error path memory and reference count leaks found using gcc's ASan and UBSan. Vendor events: Mamatha Inamdar: - Remove P8 HW events which are not supported. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | | * | perf bpf: Show more BPF program info in print_bpf_prog_info()Song Liu2019-03-213-3/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables showing bpf program name, address, and size in the header. Before the patch: perf report --header-only ... # bpf_prog_info of id 9 # bpf_prog_info of id 10 # bpf_prog_info of id 13 After the patch: # bpf_prog_info 9: bpf_prog_7be49e3934a125ba addr 0xffffffffa0024947 size 229 # bpf_prog_info 10: bpf_prog_2a142ef67aaad174 addr 0xffffffffa007c94d size 229 # bpf_prog_info 13: bpf_prog_47368425825d7384_task__task_newt addr 0xffffffffa0251137 size 369 Committer notes: Fix the fallback definition when HAVE_LIBBPF_SUPPORT is not defined, i.e. add the missing 'static inline' and add the __maybe_unused to the args. Also add stdio.h since we now use FILE * in bpf-event.h. Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190319165454.1298742-3-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf bpf: Extract logic to create program names from ↵Song Liu2019-03-211-27/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf_event__synthesize_one_bpf_prog() Extract logic to create program names to synthesize_bpf_prog_name(), so that it can be reused in header.c:print_bpf_prog_info(). This commit doesn't change the behavior. Signed-off-by: Song Liu <songliubraving@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190319165454.1298742-2-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf tools: Save bpf_prog_info and BTF of new BPF programsSong Liu2019-03-2128-24/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To fully annotate BPF programs with source code mapping, 4 different information are needed: 1) PERF_RECORD_KSYMBOL 2) PERF_RECORD_BPF_EVENT 3) bpf_prog_info 4) btf This patch handles 3) and 4) for BPF programs loaded after 'perf record|top'. For timely process of these information, a dedicated event is added to the side band evlist. When PERF_RECORD_BPF_EVENT is received via the side band event, the polling thread gathers 3) and 4) vis sys_bpf and store them in perf_env. This information is saved to perf.data at the end of 'perf record'. Committer testing: The 'wakeup_watermark' member in 'struct perf_event_attr' is inside a unnamed union, so can't be used in a struct designated initialization with older gccs, get it out of that, isolating as 'attr.wakeup_watermark = 1;' to work with all gcc versions. We also need to add '--no-bpf-event' to the 'perf record' perf_event_attr tests in 'perf test', as the way that that test goes is to intercept the events being setup and looking if they match the fields described in the control files, since now it finds first the side band event used to catch the PERF_RECORD_BPF_EVENT, they all fail. With these issues fixed: Same scenario as for testing BPF programs loaded before 'perf record' or 'perf top' starts, only start the BPF programs after 'perf record|top', so that its information get collected by the sideband threads, the rest works as for the programs loaded before start monitoring. Add missing 'inline' to the bpf_event__add_sb_event() when HAVE_LIBBPF_SUPPORT is not defined, fixing the build in systems without binutils devel files installed. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf evlist: Introduce side band threadSong Liu2019-03-215-0/+155
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces side band thread that captures extended information for events like PERF_RECORD_BPF_EVENT. This new thread uses its own evlist that uses ring buffer with very low watermark for lower latency. To use side band thread, we need to: 1. add side band event(s) by calling perf_evlist__add_sb_event(); 2. calls perf_evlist__start_sb_thread(); 3. at the end of perf run, perf_evlist__stop_sb_thread(). In the next patch, we use this thread to handle PERF_RECORD_BPF_EVENT. Committer notes: Add fix by Jiri Olsa for when te sb_tread can't get started and then at the end the stop_sb_thread() segfaults when joining the (non-existing) thread. That can happen when running 'perf top' or 'perf record' as a normal user, for instance. Further checks need to be done on top of this to more graciously handle these possible failure scenarios. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-15-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf annotate: Enable annotation of BPF programsSong Liu2019-03-203-2/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into a new function symbol__disassemble_bpf(), where annotation line information is filled based on the bpf_prog_info and btf data saved in given perf_env. symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf programs. Committer testing: After fixing this: - u64 *addrs = (u64 *)(info_linear->info.jited_ksyms); + u64 *addrs = (u64 *)(uintptr_t)(info_linear->info.jited_ksyms); Detected when crossbuilding to a 32-bit arch. And making all this dependent on HAVE_LIBBFD_SUPPORT and HAVE_LIBBPF_SUPPORT: 1) Have a BPF program running, one that has BTF info, etc, I used the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place by 'perf trace'. # grep -B1 augmented_raw ~/.perfconfig [trace] add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c # # perf trace -e *mmsg dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2 NetworkManager/10055 sendmmsg(22<socket:[1056822]>, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2 2) Then do a 'perf record' system wide for a while: # perf record -a ^C[ perf record: Woken up 68 times to write data ] [ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ] # 3) Check that we captured BPF and BTF info in the perf.data file: # perf report --header-only | grep 'b[pt]f' # event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1 # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 41 # bpf_prog_info of id 42 # btf info of id 2 # 4) Check which programs got recorded: # perf report | grep bpf_prog | head 0.16% exe bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter 0.14% exe bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit 0.08% fuse-overlayfs bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter 0.07% fuse-overlayfs bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit 0.01% clang-4.0 bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit 0.01% clang-4.0 bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter 0.00% clang bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit 0.00% runc bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter 0.00% clang bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter 0.00% sh bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit # This was with the default --sort order for 'perf report', which is: --sort comm,dso,symbol If we just look for the symbol, for instance: # perf report --sort symbol | grep bpf_prog | head 0.26% [k] bpf_prog_819967866022f1e1_sys_enter - - 0.24% [k] bpf_prog_c1bd85c092d6e4aa_sys_exit - - # or the DSO: # perf report --sort dso | grep bpf_prog | head 0.26% bpf_prog_819967866022f1e1_sys_enter 0.24% bpf_prog_c1bd85c092d6e4aa_sys_exit # We'll see the two BPF programs that augmented_raw_syscalls.o puts in place, one attached to the raw_syscalls:sys_enter and another to the raw_syscalls:sys_exit tracepoints, as expected. Now we can finally do, from the command line, annotation for one of those two symbols, with the original BPF program source coude intermixed with the disassembled JITed code: # perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter Samples: 950 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period] bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter Percent int sys_enter(struct syscall_enter_args *args) 53.41 push %rbp 0.63 mov %rsp,%rbp 0.31 sub $0x170,%rsp 1.93 sub $0x28,%rbp 7.02 mov %rbx,0x0(%rbp) 3.20 mov %r13,0x8(%rbp) 1.07 mov %r14,0x10(%rbp) 0.61 mov %r15,0x18(%rbp) 0.11 xor %eax,%eax 1.29 mov %rax,0x20(%rbp) 0.11 mov %rdi,%rbx return bpf_get_current_pid_tgid(); 2.02 → callq *ffffffffda6776d9 2.76 mov %eax,-0x148(%rbp) mov %rbp,%rsi int sys_enter(struct syscall_enter_args *args) add $0xfffffffffffffeb8,%rsi return bpf_map_lookup_elem(pids, &pid) != NULL; movabs $0xffff975ac2607800,%rdi 1.26 → callq *ffffffffda6789e9 cmp $0x0,%rax 2.43 → je 0 add $0x38,%rax 0.21 xor %r13d,%r13d if (pid_filter__has(&pids_filtered, getpid())) 0.81 cmp $0x0,%rax → jne 0 mov %rbp,%rdi probe_read(&augmented_args.args, sizeof(augmented_args.args), args); 2.22 add $0xfffffffffffffeb8,%rdi 0.11 mov $0x40,%esi 0.32 mov %rbx,%rdx 2.74 → callq *ffffffffda658409 syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr); 0.22 mov %rbp,%rsi 1.69 add $0xfffffffffffffec0,%rsi syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr); movabs $0xffff975bfcd36000,%rdi add $0xd0,%rdi 0.21 mov 0x0(%rsi),%eax 0.93 cmp $0x200,%rax → jae 0 0.10 shl $0x3,%rax 0.11 add %rdi,%rax 0.11 → jmp 0 xor %eax,%eax if (syscall == NULL || !syscall->enabled) 1.07 cmp $0x0,%rax → je 0 if (syscall == NULL || !syscall->enabled) 6.57 movzbq 0x0(%rax),%rdi if (syscall == NULL || !syscall->enabled) cmp $0x0,%rdi 0.95 → je 0 mov $0x40,%r8d switch (augmented_args.args.syscall_nr) { mov -0x140(%rbp),%rdi switch (augmented_args.args.syscall_nr) { cmp $0x2,%rdi → je 0 cmp $0x101,%rdi → je 0 cmp $0x15,%rdi → jne 0 case SYS_OPEN: filename_arg = (const void *)args->args[0]; mov 0x10(%rbx),%rdx → jmp 0 case SYS_OPENAT: filename_arg = (const void *)args->args[1]; mov 0x18(%rbx),%rdx if (filename_arg != NULL) { cmp $0x0,%rdx → je 0 xor %edi,%edi augmented_args.filename.reserved = 0; mov %edi,-0x104(%rbp) augmented_args.filename.size = probe_read_str(&augmented_args.filename.value, mov %rbp,%rdi add $0xffffffffffffff00,%rdi augmented_args.filename.size = probe_read_str(&augmented_args.filename.value, mov $0x100,%esi → callq *ffffffffda658499 mov $0x148,%r8d augmented_args.filename.size = probe_read_str(&augmented_args.filename.value, mov %eax,-0x108(%rbp) augmented_args.filename.size = probe_read_str(&augmented_args.filename.value, mov %rax,%rdi shl $0x20,%rdi shr $0x20,%rdi if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) { cmp $0xff,%rdi → ja 0 len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size; add $0x48,%rax len &= sizeof(augmented_args.filename.value) - 1; and $0xff,%rax mov %rax,%r8 mov %rbp,%rcx return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len); add $0xfffffffffffffeb8,%rcx mov %rbx,%rdi movabs $0xffff975fbd72d800,%rsi mov $0xffffffff,%edx → callq *ffffffffda658ad9 mov %rax,%r13 } mov %r13,%rax 0.72 mov 0x0(%rbp),%rbx mov 0x8(%rbp),%r13 1.16 mov 0x10(%rbp),%r14 0.10 mov 0x18(%rbp),%r15 0.42 add $0x28,%rbp 0.54 leaveq 0.54 ← retq # Please see 'man perf-config' to see how to control what should be seen, via ~/.perfconfig [annotate] section, for instance, one can suppress the source code and see just the disassembly, etc. Alternatively, use the TUI bu just using 'perf annotate', press '/bpf_prog' to see the bpf symbols, press enter and do the interactive annotation, which allows for dumping to a file after selecting the the various output tunables, for instance, the above without source code intermixed, plus showing all the instruction offsets: # perf annotate bpf_prog_819967866022f1e1_sys_enter Then press: 's' to hide the source code + 'O' twice to show all instruction offsets, then 'P' to print to the bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have: # cat bpf_prog_819967866022f1e1_sys_enter.annotation bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter Event: cycles:ppp 53.41 0: push %rbp 0.63 1: mov %rsp,%rbp 0.31 4: sub $0x170,%rsp 1.93 b: sub $0x28,%rbp 7.02 f: mov %rbx,0x0(%rbp) 3.20 13: mov %r13,0x8(%rbp) 1.07 17: mov %r14,0x10(%rbp) 0.61 1b: mov %r15,0x18(%rbp) 0.11 1f: xor %eax,%eax 1.29 21: mov %rax,0x20(%rbp) 0.11 25: mov %rdi,%rbx 2.02 28: → callq *ffffffffda6776d9 2.76 2d: mov %eax,-0x148(%rbp) 33: mov %rbp,%rsi 36: add $0xfffffffffffffeb8,%rsi 3d: movabs $0xffff975ac2607800,%rdi 1.26 47: → callq *ffffffffda6789e9 4c: cmp $0x0,%rax 2.43 50: → je 0 52: add $0x38,%rax 0.21 56: xor %r13d,%r13d 0.81 59: cmp $0x0,%rax 5d: → jne 0 63: mov %rbp,%rdi 2.22 66: add $0xfffffffffffffeb8,%rdi 0.11 6d: mov $0x40,%esi 0.32 72: mov %rbx,%rdx 2.74 75: → callq *ffffffffda658409 0.22 7a: mov %rbp,%rsi 1.69 7d: add $0xfffffffffffffec0,%rsi 84: movabs $0xffff975bfcd36000,%rdi 8e: add $0xd0,%rdi 0.21 95: mov 0x0(%rsi),%eax 0.93 98: cmp $0x200,%rax 9f: → jae 0 0.10 a1: shl $0x3,%rax 0.11 a5: add %rdi,%rax 0.11 a8: → jmp 0 aa: xor %eax,%eax 1.07 ac: cmp $0x0,%rax b0: → je 0 6.57 b6: movzbq 0x0(%rax),%rdi bb: cmp $0x0,%rdi 0.95 bf: → je 0 c5: mov $0x40,%r8d cb: mov -0x140(%rbp),%rdi d2: cmp $0x2,%rdi d6: → je 0 d8: cmp $0x101,%rdi df: → je 0 e1: cmp $0x15,%rdi e5: → jne 0 e7: mov 0x10(%rbx),%rdx eb: → jmp 0 ed: mov 0x18(%rbx),%rdx f1: cmp $0x0,%rdx f5: → je 0 f7: xor %edi,%edi f9: mov %edi,-0x104(%rbp) ff: mov %rbp,%rdi 102: add $0xffffffffffffff00,%rdi 109: mov $0x100,%esi 10e: → callq *ffffffffda658499 113: mov $0x148,%r8d 119: mov %eax,-0x108(%rbp) 11f: mov %rax,%rdi 122: shl $0x20,%rdi 126: shr $0x20,%rdi 12a: cmp $0xff,%rdi 131: → ja 0 133: add $0x48,%rax 137: and $0xff,%rax 13d: mov %rax,%r8 140: mov %rbp,%rcx 143: add $0xfffffffffffffeb8,%rcx 14a: mov %rbx,%rdi 14d: movabs $0xffff975fbd72d800,%rsi 157: mov $0xffffffff,%edx 15c: → callq *ffffffffda658ad9 161: mov %rax,%r13 164: mov %r13,%rax 0.72 167: mov 0x0(%rbp),%rbx 16b: mov 0x8(%rbp),%r13 1.16 16f: mov 0x10(%rbp),%r14 0.10 173: mov 0x18(%rbp),%r15 0.42 177: add $0x28,%rbp 0.54 17b: leaveq 0.54 17c: ← retq Another cool way to test all this is to symple use 'perf top' look for those symbols, go there and press enter, annotate it live :-) Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf build: Check what binutils's 'disassembler()' signature to useSong Liu2019-03-203-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 003ca0fd2286 ("Refactor disassembler selection") in the binutils repo, which changed the disassembler() function signature, so we must use the feature test introduced in fb982666e380 ("tools/bpftool: fix bpftool build with bintutils >= 2.9") to deal with that. Committer testing: After adding the missing function call to test-all.c, and: FEATURE_CHECK_LDFLAGS-disassembler-four-args = -bfd -lopcodes And the fallbacks for cases where we need -liberty and sometimes -lz to tools/perf/Makefile.config, we get: $ make -C tools/perf O=/tmp/build/perf install-bin make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j8' parallel build Auto-detecting system features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] ... gtk2: [ on ] ... libaudit: [ on ] ... libbfd: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libslang: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ... libdw-dwarf-unwind: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... disassembler-four-args: [ on ] CC /tmp/build/perf/jvmti/libjvmti.o CC /tmp/build/perf/builtin-bench.o <SNIP> $ $ The feature detection test-all.bin gets successfully built and linked: $ ls -la /tmp/build/perf/feature/test-all.bin -rwxrwxr-x. 1 acme acme 2680352 Mar 19 11:07 /tmp/build/perf/feature/test-all.bin $ nm /tmp/build/perf/feature/test-all.bin | grep -w disassembler 0000000000061f90 T disassembler $ Time to move on to the patches that make use of this disassembler() routine in binutils's libopcodes. Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roman Gushchin <guro@fb.com> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com [ split from a larger patch, added missing FEATURE_CHECK_LDFLAGS-disassembler-four-args ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotationSong Liu2019-03-191-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds processing of PERF_BPF_EVENT_PROG_LOAD, which sets proper DSO type/id/etc of memory regions mapped to BPF programs to DSO_BINARY_TYPE__BPF_PROG_INFO. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-14-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFOSong Liu2019-03-193-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new dso type DSO_BINARY_TYPE__BPF_PROG_INFO for BPF programs. In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso will call into a new function symbol__disassemble_bpf() in an upcoming patch, where annotation line information is filled based bpf_prog_info and btf saved in given perf_env. Committer notes: Removed the unnamed union with 'bpf_prog' and 'cache' in 'struct dso', to fix this bug when exiting 'perf top': # perf top perf: Segmentation fault -------- backtrace -------- perf[0x5a785a] /lib64/libc.so.6(+0x385bf)[0x7fd68443c5bf] perf(rb_first+0x2b)[0x4d6eeb] perf(dso__delete+0xb7)[0x4dffb7] perf[0x4f9e37] perf(perf_session__delete+0x64)[0x504df4] perf(cmd_top+0x1957)[0x454467] perf[0x4aad18] perf(main+0x61c)[0x42ec7c] /lib64/libc.so.6(__libc_start_main+0xf2)[0x7fd684428412] perf(_start+0x2d)[0x42eead] # # addr2line -fe ~/bin/perf 0x4dffb7 dso_cache__free /home/acme/git/perf/tools/perf/util/dso.c:713 That is trying to access the dso->data.cache, and that is not used with BPF programs, so we end up accessing what is in bpf_prog.first_member, b00m. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf feature detection: Add -lopcodes to feature-libbfdSong Liu2019-03-191-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both libbfd and libopcodes are distributed with binutil-dev/devel. When libbfd is present, it is OK to assume that libopcodes also present. This has been a safe assumption for bpftool. This patch adds -lopcodes to perf/Makefile.config. libopcodes will be used in the next commit for BPF annotation. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-12-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf top: Add option --no-bpf-eventSong Liu2019-03-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds option --no-bpf-event to 'perf top', which is the same as the option of 'perf record'. The following patches will use this option. Committer testing: # perf top -vv 2> /tmp/perf_event_attr.out # cat /tmp/perf_event_attr.out ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 ------------------------------------------------------------ # After this patch: # perf top --no-bpf-event -vv 2> /tmp/perf_event_attr.out # cat /tmp/perf_event_attr.out ------------------------------------------------------------ perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 task 1 precise_ip 3 sample_id_all 1 exclude_guest 1 mmap2 1 comm_exec 1 ksymbol 1 ------------------------------------------------------------ # Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-11-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf bpf: Save BTF information as headers to perf.dataSong Liu2019-03-192-1/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables 'perf record' to save BTF information as headers to perf.data. A new header type HEADER_BPF_BTF is introduced for this data. Committer testing: As root, being on the kernel sources top level directory, run: # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg Just to compile and load a BPF program that attaches to the raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc). Make sure you have a recent enough clang, say version 9, to get the BTF ELF sections needed for this testing: # clang --version | head -1 clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500) # readelf -SW tools/perf/examples/bpf/augmented_raw_syscalls.o | grep BTF [22] .BTF PROGBITS 0000000000000000 000ede 000b0e 00 0 0 1 [23] .BTF.ext PROGBITS 0000000000000000 0019ec 0002a0 00 0 0 1 [24] .rel.BTF.ext REL 0000000000000000 002fa8 000270 10 30 23 8 Then do a systemwide perf record session for a few seconds: # perf record -a sleep 2s Then look at: # perf report --header-only | grep b[pt]f # event : name = cycles:ppp, , id = { 1116204, 1116205, 1116206, 1116207, 1116208, 1116209, 1116210, 1116211 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1 # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 51 # bpf_prog_info of id 52 # btf info of id 8 # We need to show more info about these BPF and BTF entries , but that can be done later. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-10-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf bpf: Save BTF in a rbtree in perf_envSong Liu2019-03-194-0/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BTF contains information necessary to annotate BPF programs. This patch saves BTF for BPF programs loaded in the system. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-9-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf bpf: Save bpf_prog_info information as headers to perf.dataSong Liu2019-03-192-1/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables perf-record to save bpf_prog_info information as headers to perf.data. A new header type HEADER_BPF_PROG_INFO is introduced for this data. Committer testing: As root, being on the kernel sources top level directory, run: # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg Just to compile and load a BPF program that attaches to the raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc). Then do a systemwide perf record session for a few seconds: # perf record -a sleep 2s Then look at: # perf report --header-only | grep -i bpf # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 208 # bpf_prog_info of id 209 # We need to show more info about these programs, like bpftool does for the ones running on the system, i.e. 'perf record/perf report' become a way of saving the BPF state in a machine to then analyse on another, together with all the other information that is already saved in the perf.data header: # perf report --header-only # ======== # captured on : Tue Mar 12 11:42:13 2019 # header version : 1 # data offset : 296 # data size : 16294184 # feat offset : 16294480 # hostname : quaco # os release : 5.0.0+ # perf version : 5.0.gd783c8 # arch : x86_64 # nrcpus online : 8 # nrcpus avail : 8 # cpudesc : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz # cpuid : GenuineIntel,6,142,10 # total memory : 24555720 kB # cmdline : /home/acme/bin/perf (deleted) record -a # event : name = cycles:ppp, , id = { 3190123, 3190124, 3190125, 3190126, 3190127, 3190128, 3190129, 3190130 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1 # CPU_TOPOLOGY info available, use -I to display # NUMA_TOPOLOGY info available, use -I to display # pmu mappings: intel_pt = 8, software = 1, power = 11, uprobe = 7, uncore_imc = 12, cpu = 4, cstate_core = 18, uncore_cbox_2 = 15, breakpoint = 5, uncore_cbox_0 = 13, tracepoint = 2, cstate_pkg = 19, uncore_arb = 17, kprobe = 6, i915 = 10, msr = 9, uncore_cbox_3 = 16, uncore_cbox_1 = 14 # CACHE info available, use -I to display # time of first sample : 116392.441701 # time of last sample : 116400.932584 # sample duration : 8490.883 ms # MEM_TOPOLOGY info available, use -I to display # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 208 # bpf_prog_info of id 209 # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT # ======== # Committer notes: We can't use the libbpf unconditionally, as the build may have been with NO_LIBBPF, when we end up with linking errors, so provide dummy {process,write}_bpf_prog_info() wrapped by HAVE_LIBBPF_SUPPORT for that case. Printing are not affected by this, so can continue as is. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-8-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | perf bpf: Save bpf_prog_info in a rbtree in perf_envSong Liu2019-03-196-2/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bpf_prog_info contains information necessary to annotate bpf programs. This patch saves bpf_prog_info for bpf programs loaded in the system. Some big picture of the next few patches: To fully annotate BPF programs with source code mapping, 4 different informations are needed: 1) PERF_RECORD_KSYMBOL 2) PERF_RECORD_BPF_EVENT 3) bpf_prog_info 4) btf Before this set, 1) and 2) in the list are already saved to perf.data file. For BPF programs that are already loaded before perf run, 1) and 2) are synthesized by perf_event__synthesize_bpf_events(). For short living BPF programs, 1) and 2) are generated by kernel. This set handles 3) and 4) from the list. Again, it is necessary to handle existing BPF program and short living program separately. This patch handles 3) for exising BPF programs while synthesizing 1) and 2) in perf_event__synthesize_bpf_events(). These data are stored in perf_env. The next patch saves these data from perf_env to perf.data as headers. Similarly, the two patches after the next saves 4) of existing BPF programs to perf_env and perf.data. Another patch later will handle 3) and 4) for short living BPF programs by monitoring 1) and 2) in a dedicate thread. Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-7-songliubraving@fb.com [ set env->bpf_progs.infos_cnt to zero in perf_env__purge_bpf() as noted by jolsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>