summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
Commit message (Collapse)AuthorAgeFilesLines
* nfp: limit the number of TSO segmentsJakub Kicinski2018-02-081-0/+2
| | | | | | | | | | | Most FWs limit the number of TSO segments a frame can produce to 64. This is for fairness and efficiency (of FW datapath) reasons. If a frame with larger number of segments is submitted the FW will drop it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: forbid disabling hw-tc-offload on representors while offload activeJakub Kicinski2018-02-081-4/+3Star
| | | | | | | | | | | | | | | | | All netdevs which can accept TC offloads must implement .ndo_set_features(). nfp_reprs currently do not do that, which means hw-tc-offload can be turned on and off even when offloads are active. Whether the offloads are active is really a question to nfp_ports, so remove the per-app tc_busy callback indirection thing, and simply count the number of offloaded items in nfp_port structure. Fixes: 8a2768732a4d ("nfp: provide infrastructure for offloading flower based TC filters") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: don't advertise hw-tc-offload on non-port netdevsJakub Kicinski2018-02-081-1/+1
| | | | | | | | | | | | | | | | | | | nfp_port is a structure which represents an ASIC port, both PCIe vNIC (on a PF or a VF) or the external MAC port. vNIC netdev (struct nfp_net) and pure representor netdev (struct nfp_repr) both have a pointer to this structure. nfp_reprs always have a port associated. nfp_nets, however, only represent a device port in legacy mode, where they are considered the MAC port. In switchdev mode they are just the CPU's side of the PCIe link. By definition TC offloads only apply to device ports. Don't set the flag on vNICs without a port (i.e. in switchdev mode). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: bpf: plumb extack into functions related to XDP offloadQuentin Monnet2018-01-221-1/+1
| | | | | | | | | | | Pass a pointer to an extack object to nfp_app_xdp_offload() in order to prepare for extack usage in the nfp driver. Next step will be to forward this extack pointer to nfp_net_bpf_offload(), once this function is able to use it for printing error messages. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: allow apps to disable ctrl vNIC capabilitiesJakub Kicinski2018-01-191-0/+4
| | | | | | | | | | | | | | | Most vNIC capabilities are netdev related. It makes no sense to initialize them and waste FW resources. Some are even counter-productive, like IRQ moderation, which will slow down exchange of control messages. Add to nfp_app a mask of enabled control vNIC capabilities for apps to use. Make flower and BPF enable all capabilities for now. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: split reading capabilities out of nfp_net_init()Jakub Kicinski2018-01-191-11/+20
| | | | | | | | | | | | | nfp_net_init() is a little long and we are about to add more code to reading capabilties. Move the capability reading, parsing and validating out. Only actual initialization will stay in nfp_net_init(). No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: read mailbox address from TLV capsJakub Kicinski2018-01-191-6/+14
| | | | | | | | | | Allow specifying alternative vNIC mailbox location in TLV caps. This way we can size the mailbox to the needs and not necessarily waste 512B of ctrl memory space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: read ME frequency from vNIC ctrl memoryJakub Kicinski2018-01-191-1/+1
| | | | | | | | | | PCIe island clock frequency is used when converting coalescing parameters from usecs to NFP timestamps. Most chips don't run at 1200MHz, allow FW to provide us with the real frequency. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: add TLV capabilities to the BARJakub Kicinski2018-01-191-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFP is entirely programmable, including the PCI data interface. Using a fixed control BAR layout certainly makes implementations easier, but require careful considerations when space is allocated. Once BAR area is allocated to one feature nothing else can use it. Allocating space statically also requires it to be sized upfront, which leads to either unnecessary limitation or wastage. We currently have a 32bit capability word defined which tells drivers which application FW features are supported. Most of the bits are exhausted. The same bits are also reused for enabling specific features. Bulk of capabilities don't have a need for an enable bit, however, leading to confusion and wastage. TLVs seems like a better fit for expressing capabilities of applications running on programmable hardware. This patch leaves the front of the BAR as is, and declares a TLV capability start at offset 0x58. Most of the space up to 0x0d90 is already allocated, but the used space can be wrapped with RESERVED TLVs. E.g.: Address Type Length 0x0058 RESERVED 0xe00 /* Wrap basic structures */ 0x0e5c FEATURE_A 0x004 0x0e64 FEATURE_B 0x004 0x0e6c RESERVED 0x990 /* Wrap qeueue stats */ 0x1800 FEATURE_C 0x100 Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: bpf: add basic control channel communicationJakub Kicinski2018-01-141-0/+7
| | | | | | | | | | | | | | For map support we will need to send and receive control messages. Add basic support for sending a message to FW, and waiting for a reply. Control messages are tagged with a 16 bit ID. Add a simple ID allocator and make sure we don't allow too many messages in flight, to avoid request <> reply mismatches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-01-121-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | BPF alignment tests got a conflict because the registers are output as Rn_w instead of just Rn in net-next, and in net a fixup for a testcase prohibits logical operations on pointers before using them. Also, we should attempt to patch BPF call args if JIT always on is enabled. Instead, if we fail to JIT the subprogs we should pass an error back up and fail immediately. Signed-off-by: David S. Miller <davem@davemloft.net>
| * nfp: always unmask aux interrupts at initJakub Kicinski2018-01-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The link state and exception interrupts may be masked when we probe. The firmware should in theory prevent sending (and automasking) those interrupts if the device is disabled, but if my reading of the FW code is correct there are firmwares out there with race conditions in this area. The interrupt may also be masked if previous driver which used the device was malfunctioning and we didn't load the FW (there is no other good way to comprehensively reset the PF). Note that FW unmasks the data interrupts by itself when vNIC is enabled, such helpful operation is not performed for LSC/EXN interrupts. Always unmask the auxiliary interrupts after request_irq(). On the remove path add missing PCI write flush before free_irq(). Fixes: 4c3523623dc0 ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: hand over to BPF offload app at coarser granularityJakub Kicinski2018-01-101-9/+1Star
| | | | | | | | | | | | | | | | | | | | Instead of having an app callback per message type hand off all offload-related handling to apps with one "rest of ndo_bpf" callback. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | nfp: bpf: don't allow changing MTU above BPF offload limit when activeJakub Kicinski2018-01-101-0/+5
| | | | | | | | | | | | | | | | | | When BPF offload is active we need may need to restrict the MTU changes more than just to the limitation of the kernel XDP datapath. Allow the BPF code to veto a MTU change. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | nfp: don't try to register XDP rxq structures on control queuesJakub Kicinski2018-01-101-4/+8
| | | | | | | | | | | | | | | | | | | | Some RX rings are used for control messages, those will not have a netdev pointer in dp. Skip XDP rxq handling on those rings. Fixes: 7f1c684a8966 ("nfp: setup xdp_rxq_info") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-01-081-2/+8
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2018-01-07 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add a start of a framework for extending struct xdp_buff without having the overhead of populating every data at runtime. Idea is to have a new per-queue struct xdp_rxq_info that holds read mostly data (currently that is, queue number and a pointer to the corresponding netdev) which is set up during rxqueue config time. When a XDP program is invoked, struct xdp_buff holds a pointer to struct xdp_rxq_info that the BPF program can then walk. The user facing BPF program that uses struct xdp_md for context can use these members directly, and the verifier rewrites context access transparently by walking the xdp_rxq_info and net_device pointers to load the data, from Jesper. 2) Redo the reporting of offload device information to user space such that it works in combination with network namespaces. The latter is reported through a device/inode tuple as similarly done in other subsystems as well (e.g. perf) in order to identify the namespace. For this to work, ns_get_path() has been generalized such that the namespace can be retrieved not only from a specific task (perf case), but also from a callback where we deduce the netns (ns_common) from a netdevice. bpftool support using the new uapi info and extensive test cases for test_offload.py in BPF selftests have been added as well, from Jakub. 3) Add two bpftool improvements: i) properly report the bpftool version such that it corresponds to the version from the kernel source tree. So pick the right linux/version.h from the source tree instead of the installed one. ii) fix bpftool and also bpf_jit_disasm build with bintutils >= 2.9. The reason for the build breakage is that binutils library changed the function signature to select the disassembler. Given this is needed in multiple tools, add a proper feature detection to the tools/build/features infrastructure, from Roman. 4) Implement the BPF syscall command BPF_MAP_GET_NEXT_KEY for the stacktrace map. It is currently unimplemented, but there are use cases where user space needs to walk all stacktrace map entries e.g. for dumping or deleting map entries w/o having to close and recreate the map. Add BPF selftests along with it, from Yonghong. 5) Few follow-up cleanups for the bpftool cgroup code: i) rename the cgroup 'list' command into 'show' as we have it for other subcommands as well, ii) then alias the 'show' command such that 'list' is accepted which is also common practice in iproute2, and iii) remove couple of newlines from error messages using p_err(), from Jakub. 6) Two follow-up cleanups to sockmap code: i) remove the unused bpf_compute_data_end_sk_skb() function and ii) only build the sockmap infrastructure when CONFIG_INET is enabled since it's only aware of TCP sockets at this time, from John. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | nfp: setup xdp_rxq_infoJesper Dangaard Brouer2018-01-061-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Driver hook points for xdp_rxq_info: * reg : nfp_net_rx_ring_alloc * unreg: nfp_net_rx_ring_free In struct nfp_net_rx_ring moved member @size into a hole on 64-bit. Thus, the size remaines the same after adding member @xdp_rxq. Cc: oss-drivers@netronome.com Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* | | nfp: add basic multicast filteringJakub Kicinski2018-01-051-2/+5
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | We currently always pass all multicast traffic through. Only set L2MC when actually needed. Since the driver was not making use of the capability to filter out mcast frames, some FW projects don't implement it any more. Don't warn users if capability is not present (like we do for promisc flag). The lack of L2MC capability is assumed to mean all multicast traffic goes through. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: set flags in the correct member of netdev_bpfJakub Kicinski2017-12-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | netdev_bpf.flags is the input member for installing the program. netdev_bpf.prog_flags is the output member for querying. Set the correct one on query. Fixes: 92f0292b35a0 ("net: xdp: report flags program was installed with on query") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | net: xdp: make the stack take care of the tear downJakub Kicinski2017-12-031-3/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since day one of XDP drivers had to remember to free the program on the remove path. This leads to code duplication and is error prone. Make the stack query the installed programs on unregister and if something is installed, remove the program. Freeing of program attached to XDP generic is moved from free_netdev() as well. Because the remove will now be called before notifiers are invoked, BPF offload state of the program will not get destroyed before uninstall. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | net: xdp: report flags program was installed with on queryJakub Kicinski2017-12-031-0/+1
|/ | | | | | | | | | | | | | Some drivers enforce that flags on program replacement and removal must match the flags passed on install. This leaves the possibility open to enable simultaneous loading of XDP programs both to HW and DRV. Allow such drivers to report the flags back to the stack. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* Merge branch 'akpm' (patches from Andrew)Linus Torvalds2017-11-161-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge updates from Andrew Morton: - a few misc bits - ocfs2 updates - almost all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits) memory hotplug: fix comments when adding section mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP mm: simplify nodemask printing mm,oom_reaper: remove pointless kthread_run() error check mm/page_ext.c: check if page_ext is not prepared writeback: remove unused function parameter mm: do not rely on preempt_count in print_vma_addr mm, sparse: do not swamp log with huge vmemmap allocation failures mm/hmm: remove redundant variable align_end mm/list_lru.c: mark expected switch fall-through mm/shmem.c: mark expected switch fall-through mm/page_alloc.c: broken deferred calculation mm: don't warn about allocations which stall for too long fs: fuse: account fuse_inode slab memory as reclaimable mm, page_alloc: fix potential false positive in __zone_watermark_ok mm: mlock: remove lru_add_drain_all() mm, sysctl: make NUMA stats configurable shmem: convert shmem_init_inodecache() to void Unify migrate_pages and move_pages access checks mm, pagevec: rename pagevec drained field ...
| * mm: remove __GFP_COLDMel Gorman2017-11-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the page free path makes no distinction between cache hot and cold pages, there is no real useful ordering of pages in the free list that allocation requests can take advantage of. Juding from the users of __GFP_COLD, it is likely that a number of them are the result of copying other sites instead of actually measuring the impact. Remove the __GFP_COLD parameter which simplifies a number of paths in the page allocator. This is potentially controversial but bear in mind that the size of the per-cpu pagelists versus modern cache sizes means that the whole per-cpu list can often fit in the L3 cache. Hence, there is only a potential benefit for microbenchmarks that alloc/free pages in a tight loop. It's even worse when THP is taken into account which has little or no chance of getting a cache-hot page as the per-cpu list is bypassed and the zeroing of multiple pages will thrash the cache anyway. The truncate microbenchmarks are not shown as this patch affects the allocation path and not the free path. A page fault microbenchmark was tested but it showed no sigificant difference which is not surprising given that the __GFP_COLD branches are a miniscule percentage of the fault path. Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | nfp: bpf: move to new BPF program offload infrastructureJakub Kicinski2017-11-051-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following steps are taken in the driver to offload an XDP program: XDP_SETUP_PROG: * prepare: - allocate program state; - run verifier (bpf_analyzer()); - run translation; * load: - stop old program if needed; - load program; - enable BPF if not enabled; * clean up: - free program image. With new infrastructure the flow will look like this: BPF_OFFLOAD_VERIFIER_PREP: - allocate program state; BPF_OFFLOAD_TRANSLATE: - run translation; XDP_SETUP_PROG: - stop old program if needed; - load program; - enable BPF if not enabled; BPF_OFFLOAD_DESTROY: - free program image. Take advantage of the new infrastructure. Allocation of driver metadata has to be moved from jit.c to offload.c since it's now done at a different stage. Since there is no separate driver private data for verification step, move temporary nfp_meta pointer into nfp_prog. We will now use user space context offsets. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: bpf: rename ndo_xdp to ndo_bpfJakub Kicinski2017-11-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | ndo_xdp is a control path callback for setting up XDP in the driver. We can reuse it for other forms of communication between the eBPF stack and the drivers. Rename the callback and associated structures and definitions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: use a counter instead of log message for allocation failuresJakub Kicinski2017-11-021-5/+10
| | | | | | | | | | | | | | | | | | Add a counter incremented when allocation of replacement RX page fails. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: switch to dev_alloc_page()Jakub Kicinski2017-11-021-1/+1
| | | | | | | | | | | | | | | | | | Use the dev_alloc_page() networking helper to allocate pages for RX packets. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | drivers/net: netronome: Convert timers to use timer_setup()Kees Cook2017-10-271-4/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Simon Horman <simon.horman@netronome.com> Cc: oss-drivers@netronome.com Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-10-221-6/+14
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were quite a few overlapping sets of changes here. Daniel's bug fix for off-by-ones in the new BPF branch instructions, along with the added allowances for "data_end > ptr + x" forms collided with the metadata additions. Along with those three changes came veritifer test cases, which in their final form I tried to group together properly. If I had just trimmed GIT's conflict tags as-is, this would have split up the meta tests unnecessarily. In the socketmap code, a set of preemption disabling changes overlapped with the rename of bpf_compute_data_end() to bpf_compute_data_pointers(). Changes were made to the mv88e6060.c driver set addr method which got removed in net-next. The hyperv transport socket layer had a locking change in 'net' which overlapped with a change of socket state macro usage in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * nfp: handle page allocation failuresJakub Kicinski2017-10-101-6/+14
| | | | | | | | | | | | | | | | | | | | | | page_address() does not handle NULL argument gracefully, make sure we NULL-check the page pointer before passing it to page_address(). Fixes: ecd63a0217d5 ("nfp: add XDP support in the driver") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bpf, nfp: add meta data supportDaniel Borkmann2017-09-261-25/+15Star
| | | | | | | | | | | | | | | | | | | | | | | | | | Implement support for transferring XDP meta data into skb for nfp driver; before calling into the program, xdp.data_meta points to xdp.data, where on program return with pass verdict, we call into skb_metadata_set(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bpf: add meta pointer for direct accessDaniel Borkmann2017-09-261-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This work enables generic transfer of metadata from XDP into skb. The basic idea is that we can make use of the fact that the resulting skb must be linear and already comes with a larger headroom for supporting bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work on a similar principle and introduce a small helper bpf_xdp_adjust_meta() for adjusting a new pointer called xdp->data_meta. Thus, the packet has a flexible and programmable room for meta data, followed by the actual packet data. struct xdp_buff is therefore laid out that we first point to data_hard_start, then data_meta directly prepended to data followed by data_end marking the end of packet. bpf_xdp_adjust_head() takes into account whether we have meta data already prepended and if so, memmove()s this along with the given offset provided there's enough room. xdp->data_meta is optional and programs are not required to use it. The rationale is that when we process the packet in XDP (e.g. as DoS filter), we can push further meta data along with it for the XDP_PASS case, and give the guarantee that a clsact ingress BPF program on the same device can pick this up for further post-processing. Since we work with skb there, we can also set skb->mark, skb->priority or other skb meta data out of BPF, thus having this scratch space generic and programmable allows for more flexibility than defining a direct 1:1 transfer of potentially new XDP members into skb (it's also more efficient as we don't need to initialize/handle each of such new members). The facility also works together with GRO aggregation. The scratch space at the head of the packet can be multiple of 4 byte up to 32 byte large. Drivers not yet supporting xdp->data_meta can simply be set up with xdp->data_meta as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out, such that the subsequent match against xdp->data for later access is guaranteed to fail. The verifier treats xdp->data_meta/xdp->data the same way as we treat xdp->data/xdp->data_end pointer comparisons. The requirement for doing the compare against xdp->data is that it hasn't been modified from it's original address we got from ctx access. It may have a range marking already from prior successful xdp->data/xdp->data_end pointer comparisons though. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: be drop monitor friendlyJakub Kicinski2017-09-041-1/+1
| | | | | | | | | | Use dev_consume_skb_any() in place of dev_kfree_skb_any() when control frame has been successfully processed in flower and on the driver's main TX completion path. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-09-021-7/+7
|\ | | | | | | | | | | Three cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * nfp: TX time stamp packets before HW doorbell is rungJakub Kicinski2017-08-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | TX completion may happen any time after HW queue was kicked. We can't access the skb afterwards. Move the time stamping before ringing the doorbell. Fixes: 4c3523623dc0 ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * nfp: avoid buffer leak when representor is missingJakub Kicinski2017-08-241-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When driver receives a muxed frame, but it can't find the representor netdev it is destined to it will try to "drop" that frame, i.e. reuse the buffer. The issue is that the replacement buffer has already been allocated at this point, and reusing the buffer from received frame will leak it. Change the code to put the new buffer on the ring earlier and not reuse the old buffer (make the buffer parameter to nfp_net_rx_drop() a NULL). Fixes: 91bf82ca9eed ("nfp: add support for tx/rx with metadata portid") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: add basic SR-IOV ndo functionsPablo Cascón2017-08-261-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add basic ndo_set/get_vf to support SR-IOV. VF to egress phy static mapping by now. Use vfcfg ABI version 2 to write the info to the FW and collect the return value from the mailbox. Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com> Signed-off-by: Jimmy Kizito <jimmy.kizito@netronome.com> Signed-off-by: Rami Tomer <rami.tomer@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-08-221-2/+1Star
|\|
| * nfp: fix infinite loop on umapping cleanupColin Ian King2017-08-181-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | The while loop that performs the dma page unmapping never decrements index counter f and hence loops forever. Fix this with a pre-decrement on f. Detected by CoverityScan, CID#1357309 ("Infinite loop") Fixes: 4c3523623dc0 ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-08-101-0/+2
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | The UDP offload conflict is dealt with by simply taking what is in net-next where we have removed all of the UFO handling code entirely. The TCP conflict was a case of local variables in a function being removed from both net and net-next. In netvsc we had an assignment right next to where a missing set of u64 stats sync object inits were added. Signed-off-by: David S. Miller <davem@davemloft.net>
| * nfp: Initialize RX and TX ring 64-bit stats seqcountsFlorian Fainelli2017-08-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: 4c3523623dc0 ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | nfp: set config bit (ifup/ifdown) on netdev open/closeDirk van der Merwe2017-07-251-1/+9
|/ | | | | | | | | | | | | | | | | | | | | When a netdev (PF netdev or representor) is opened or closed, set the physical port config bit appropriately - which powers UP/DOWN the PHY module for the physical interface. The PHY is powered first in the HW/FW configuration step when opening the netdev and again last in the HW/FW configuration step when closing the netdev. This is only applicable when there is a physical port associated with the netdev and if the NSP support this. Otherwise we silently ignore this step. The 'nfp_eth_set_configured' can actually return positive values - updated the function documentation appropriately. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: default to chained metadata prepend formatJakub Kicinski2017-07-051-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | ABI 4.x introduced the chained metadata format and made it the only one possible. There are cases, however, where the old format is preferred - mostly to make interoperation with VFs using ABI 3.x easier for the datapath. In ABI 5.x we allowed for more flexibility by selecting the metadata format based on capabilities. The default was left to non-chained. In case of fallback traffic, there is no capability telling the driver there may be chained metadata. With a very stripped- -down FW the default old metadata format would be selected making the driver drop all fallback traffic. This patch changes the default selection in the driver. It should not hurt with old firmwares, because if they don't advertise RSS they will not produce metadata anyway. New firmwares advertising ABI 5.x, however, can depend on the driver defaulting to chained format. Fixes: f9380629fafc ("nfp: advertise support for NFD ABI 0.5") Suggested-by: Michael Rapson <michael.rapson@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: provide infrastructure for offloading flower based TC filtersPieter Jansen van Vuuren2017-07-011-13/+1Star
| | | | | | | | | | | | | Adds a flower based TC offload handler for representor devices, this is in addition to the bpf based offload handler. The changes in this patch will be used in a follow-up patch to add tc flower offload to the NFP. The flower app enables tc offloads on representors by default. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: add phys_switch_id supportSimon Horman2017-07-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Add phys_switch_id support by allowing lookup of SWITCHDEV_ATTR_ID_PORT_PARENT_ID via the nfp_repr_port_attr_get switchdev operation. This is visible to user-space in the phys_switch_id attribute of a netdev. e.g. cd /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 find . -name phys_switch_id | xargs grep . ./net/eth3/phys_switch_id:00154d1300bd ./net/eth4/phys_switch_id:00154d1300bd ./net/eth2/phys_switch_id:00154d1300bd grep: ./net/eth5/phys_switch_id: Operation not supported In the above eth2 and eth3 and representor netdevs for the first and second physical port. eth4 is the representor for the PF. And eth5 is the PF netdev. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: add support for tx/rx with metadata portidSimon Horman2017-06-251-6/+51
| | | | | | | | | | Allow tx/rx with metadata port id. This will be used for tx/rx of representor netdevs acting as upper-devices while a pf netdev acts as a lower-device. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: xdp: report if program is offloadedJakub Kicinski2017-06-231-0/+2
| | | | | | | Make use of just added XDP_ATTACHED_HW. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: bpf: add support for XDP_FLAGS_HW_MODEJakub Kicinski2017-06-231-3/+10
| | | | | | | | Respect the XDP_FLAGS_HW_MODE. When it's set install the program on the NIC and skip enabling XDP in the driver. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: bpf: release the reference on offloaded programsJakub Kicinski2017-06-231-21/+13Star
| | | | | | | | | | | | | | The xdp_prog member of the adapter's data path structure is used for XDP in driver mode. In case a XDP program is loaded with in HW-only mode, we need to store it somewhere else. Add a new XDP prog pointer in the main structure and use that when we need to know whether any XDP program is loaded, not only a driver mode one. Only release our reference on adapter free instead of immediately after netdev unregister to allow offload to be disabled first. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* nfp: bpf: don't offload XDP programs in DRV_MODEJakub Kicinski2017-06-231-3/+11
| | | | | | | | | | | | | | | DRV_MODE means that user space wants the program to be run in the driver. Do not try to offload. Only offload if no mode flags have been specified. Remember what the mode is when the program is installed and refuse new setup requests if there is already a program loaded in a different mode. This should leave it open for us to implement simultaneous loading of two programs - one in the drv path and another to the NIC later. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>